Computational Methods for Quantitative Finance: Finite Element Methods for Derivative Pricing (Springer Finance) 3642354009, 9783642354007

Many mathematical assumptions on which classical derivative pricing methods are based have come under scrutiny in recent

English Pages 312 [301] Year 2013

Table of contents :
Computational Methods for Quantitative Finance
Preface
Contents
Part I: Basic Techniques and Models
Chapter 1: Notions of Mathematical Finance
1.1 Financial Modelling
Stocks
Price Process
Derivative Securities
Options
Payoff
Modelling Assumptions
1.2 Stochastic Processes
1.3 Further Reading
Chapter 2: Elements of Numerical Methods for PDEs
2.1 Function Spaces
2.2 Partial Differential Equations
2.3 Numerical Methods for the Heat Equation
2.3.1 Finite Difference Method
2.3.2 Convergence of the Finite Difference Method
2.3.3 Finite Element Method
2.4 Further Reading
Chapter 3: Finite Element Methods for Parabolic Problems
3.1 Sobolev Spaces
3.2 Variational Parabolic Framework
3.3 Discretization
3.4 Implementation of the Matrix Form
3.4.1 Elemental Forms and Assembly
3.4.2 Initial Data
3.5 Stability of the theta-Scheme
3.6 Error Estimates
3.6.1 Finite Element Interpolation
3.6.2 Convergence of the Finite Element Method
3.7 Further Reading
Chapter 4: European Options in BS Markets
4.1 Black-Scholes Equation
4.2 Variational Formulation
4.3 Localization
4.4 Discretization
4.4.1 Finite Difference Discretization
4.4.2 Finite Element Discretization
4.4.3 Non-smooth Initial Data
4.5 Extensions of the Black-Scholes Model
4.5.1 CEV Model
4.5.2 Local Volatility Models
4.6 Further Reading
Chapter 5: American Options
5.1 Optimal Stopping Problem
5.2 Variational Formulation
5.3 Discretization
5.3.1 Finite Difference Discretization
5.3.2 Finite Element Discretization
5.4 Numerical Solution of Linear Complementarity Problems
5.4.1 Projected Successive Overrelaxation Method
5.4.2 Primal-Dual Active Set Algorithm
5.5 Further Reading
Chapter 6: Exotic Options
6.1 Barrier Options
6.2 Asian Options
6.3 Compound Options
6.4 Swing Options
6.5 Further Reading
Chapter 7: Interest Rate Models
7.1 Pricing Equation
7.2 Interest Rate Derivatives
7.3 Further Reading
Chapter 8: Multi-asset Options
8.1 Pricing Equation
8.2 Variational Formulation
8.3 Localization
8.4 Discretization
8.4.1 Finite Difference Discretization
8.4.2 Finite Element Discretization
8.5 Further Reading
Chapter 9: Stochastic Volatility Models
9.1 Market Models
9.1.1 Heston Model
9.1.2 Multi-scale Model
9.2 Pricing Equation
9.3 Variational Formulation
9.4 Localization
9.5 Discretization
9.5.1 Finite Difference Discretization
9.5.2 Finite Element Discretization
9.6 American Options
9.7 Further Reading
Chapter 10: Lévy Models
10.1 Lévy Processes
10.2 Lévy Models
10.2.1 Jump-Diffusion Models
10.2.2 Pure Jump Models
10.2.3 Admissible Market Models
10.3 Pricing Equation
10.4 Variational Formulation
10.5 Localization
10.6 Discretization
10.6.1 Finite Difference Discretization
10.6.2 Finite Element Discretization
10.7 American Options Under Exponential Lévy Models
10.8 Further Reading
Chapter 11: Sensitivities and Greeks
11.1 Option Pricing
11.2 Sensitivity Analysis
11.2.1 Sensitivity with Respect to Model Parameters
11.2.2 Sensitivity with Respect to Solution Arguments
11.3 Numerical Examples
11.3.1 One-Dimensional Models
11.3.2 Multivariate Models
11.4 Further Reading
Part II: Advanced Techniques and Models
Chapter 12: Wavelet Methods
12.1 Spline Wavelets
12.1.1 Wavelet Transformation
12.1.2 Norm Equivalences
12.2 Wavelet Discretization
12.2.1 Space Discretization
12.2.2 Matrix Compression
12.2.3 Multilevel Preconditioning
12.3 Discontinuous Galerkin Time Discretization
12.3.1 Derivation of the Linear Systems
12.3.2 Solution Algorithm
12.4 Further Reading
Chapter 13: Multidimensional Diffusion Models
13.1 Sparse Tensor Product Finite Element Spaces
13.2 Sparse Wavelet Discretization
13.3 Fully Discrete Scheme
13.4 Diffusion Models
13.4.1 Aggregated Black-Scholes Models
13.4.2 Stochastic Volatility Models
13.5 Numerical Examples
13.5.1 Full-Rank d-Dimensional Black-Scholes Model
13.5.2 Low-Rank d-Dimensional Black-Scholes
13.6 Further Reading
Chapter 14: Multidimensional Lévy Models
14.1 Lévy Processes
14.2 Lévy Copulas
14.3 Lévy Models
14.3.1 Subordinated Brownian Motion
14.3.2 Lévy Copula Models
14.3.3 Admissible Models
14.4 Pricing Equation
14.5 Variational Formulation
14.6 Wavelet Discretization
14.6.1 Wavelet Compression
14.6.2 Fully Discrete Scheme
14.7 Application: Impact of Approximations of Small Jumps
14.7.1 Gaussian Approximation
14.7.2 Basket Options
14.7.3 Barrier Options
14.8 Further Reading
Chapter 15: Stochastic Volatility Models with Jumps
15.1 Market Models
15.1.1 Bates Models
15.1.2 BNS Model
15.2 Pricing Equations
15.3 Variational Formulation
15.4 Wavelet Discretization
15.5 Further Reading
Chapter 16: Multidimensional Feller Processes
16.1 Pseudodifferential Operators
16.2 Variable Order Sobolev Spaces
16.3 Subordination
16.4 Admissible Market Models
16.5 Variational Formulation
16.5.1 Sector Condition
16.5.2 Well-Posedness
16.6 Numerical Examples
16.7 Further Reading
Appendix A: Elliptic Variational Inequalities
A.1 Hilbert Spaces
A.2 Dual of a Hilbert Space
A.3 Theorems of Stampacchia and Lax-Milgram
Appendix B: Parabolic Variational Inequalities
B.1 Weak Formulation of PVI's
B.2 Existence
B.3 Proof of the Existence Result
References
Index

Springer Finance

Editorial Board Marco Avellaneda Giovanni Barone-Adesi Mark Broadie Mark H.A. Davis Claudia Klüppelberg Walter Schachermayer Emanuel Derman

Springer Finance Springer Finance is a programme of books addressing students, academics and practitioners working on increasingly technical approaches to the analysis of financial markets. It aims to cover a variety of topics, not only mathematical finance but foreign exchanges, term structure, risk management, portfolio theory, equity derivatives, and financial economics.

For further volumes: http://www.springer.com/series/3674

Norbert Hilber · Oleg Reichmann · Christoph Schwab · Christoph Winter

Computational Methods for Quantitative Finance: Finite Element Methods for Derivative Pricing

Norbert Hilber Dept. for Banking, Finance, Insurance School of Management and Law Zurich University of Applied Sciences Winterthur, Switzerland

Christoph Schwab Seminar for Applied Mathematics Swiss Federal Institute of Technology (ETH) Zurich, Switzerland

Oleg Reichmann Seminar for Applied Mathematics Swiss Federal Institute of Technology (ETH) Zurich, Switzerland

Christoph Winter Allianz Deutschland AG Munich, Germany

ISSN 1616-0533 ISSN 2195-0687 (electronic) Springer Finance ISBN 978-3-642-35400-7 ISBN 978-3-642-35401-4 (eBook) DOI 10.1007/978-3-642-35401-4 Springer Heidelberg New York Dordrecht London Library of Congress Control Number: 2013932229 Mathematics Subject Classification: 60J75, 60J25, 60J35, 60J60, 65N06, 65K15, 65N12, 65N30 JEL Classification: C63, C16, G12, G13 © Springer-Verlag Berlin Heidelberg 2013 This work is subject to copyright. All rights are reserved by the Publisher, whether the whole or part of the material is concerned, specifically the rights of translation, reprinting, reuse of illustrations, recitation, broadcasting, reproduction on microfilms or in any other physical way, and transmission or information storage and retrieval, electronic adaptation, computer software, or by similar or dissimilar methodology now known or hereafter developed. Exempted from this legal reservation are brief excerpts in connection with reviews or scholarly analysis or material supplied specifically for the purpose of being entered and executed on a computer system, for exclusive use by the purchaser of the work. Duplication of this publication or parts thereof is permitted only under the provisions of the Copyright Law of the Publisher’s location, in its current version, and permission for use must always be obtained from Springer. Permissions for use may be obtained through RightsLink at the Copyright Clearance Center. Violations are liable to prosecution under the respective Copyright Law. The use of general descriptive names, registered names, trademarks, service marks, etc. in this publication does not imply, even in the absence of a specific statement, that such names are exempt from the relevant protective laws and regulations and therefore free for general use. 
While the advice and information in this book are believed to be true and accurate at the date of publication, neither the authors nor the editors nor the publisher can accept any legal responsibility for any errors or omissions that may be made. The publisher makes no warranty, express or implied, with respect to the material contained herein. Printed on acid-free paper Springer is part of Springer Science+Business Media (www.springer.com)

Preface

The subject of mathematical finance has undergone rapid development in recent years, with mathematical descriptions of financial markets evolving both in volume and technical sophistication. Pivotal in this development have been quantitative models and computational methods for calibrating mathematical models to market data, and for obtaining option prices of concrete products from the calibrated models. In this development, two broad classes of computational methods have emerged: statistical sampling approaches and grid-based methods. They correspond, roughly speaking, to the characterization of arbitrage-free prices as conditional expectations over all sample paths of a stochastic process model of the market behavior, or to the characterization of prices as solutions (in a suitable sense) of the corresponding Kolmogorov forward and/or backward partial differential equations, or PDEs for short, the canonical example being the Black–Scholes equation and its extensions. Sampling methods contain, for example, Monte-Carlo and Quasi-Monte-Carlo Methods, whereas grid-based methods contain, for example, Finite Difference, Finite Element, Spectral and Fourier transformation methods (which, by the use of the Fast Fourier Transform, require approximate evaluation of Fourier integrals on grids). The present text discusses the analysis and implementation of grid-based methods. The importance of numerical methods for the efficient valuation of derivative contracts cannot be overstated: often, the selection of mathematical models for the valuation of derivative contracts is determined by the ease and efficiency of their numerical evaluation to the extent that computational efficiency takes priority over mathematical sophistication and general applicability. 
Having said this, we hasten to add that the computational methods presented in these notes approximate the (forward and backward) pricing partial (integro) differential equations and inequalities by finite dimensional discretizations of these equations which are amenable to numerical solution on a computer. The methods incur, therefore, naturally an error due to this replacement of the forward pricing equation by a discretization, the so-called discretization error. One main message to be conveyed by these notes is that, using numerical analysis and advanced solution
methods, efficient discretizations of the pricing equations for a wide range of market models and term sheets are available, and there is no obvious necessity to confine financial modeling to processes which entail “exactly solvable” PIDEs. We caution the reader, however, that this reasoning implies that the error estimates presented in these notes are bounds on the discretization error, i.e. the error in the computed solution with respect to the exact solution of one particular market model under consideration. An equally important theme is the quantitative analysis of the error inherent in the financial models themselves, i.e. the so-called modeling errors. Such errors are due to assumptions on the markets which were (explicitly or implicitly) used in their derivations, and which may or may not be valid in the situations where the models are used. It is our view that a unified, numerical pricing methodology that accommodates a wide range of market models can facilitate quantitative verification of dependence of prices on various assumptions implicit in particular classes of market models. Thus, to give “non-experts” in computational methods and in numerical analysis an introduction to grid-based numerical solution methods for option pricing problems is one purpose of the present volume. Another purpose is to acquaint numerical analysts and computational mathematicians with formulation and numerical analysis of typical initial-boundary value problems for partial integro-differential equations (PIDEs) that arise in models of financial markets with jumps. Financial contracts with early exercise features lead to optimal stopping problems which, in turn, lead to unilateral boundary value problems for the corresponding PIDEs. Efficient numerical solution methods for such problems have been developed over many years in solvers for contact problems in mechanics. 
Contrary to the differential operators which arise with obstacle problems in mechanics, however, the PIDEs in financial models with jumps are, as a rule, nonsymmetric (due to the presence of a drift term which, in turn, is mandated by no-arbitrage conditions in the pricing of derivative contracts). The numerical analysis of the corresponding algorithms in financial applications cannot rely, therefore, on energy minimization arguments so that many well-established algorithms are ruled out. Rather than trying to cover all possible numerical approaches for the computational solution of pricing equations, we decided to focus on Finite Difference and on Finite Element Methods. Finite Element Methods (FEM for short) are based on particularly general, so-called weak, or variational formulations of the pricing equation. This is, on the one hand, the natural setting for FEM; on the other hand, as we will try to show in these notes, the variational formulation of the forward and backward equations (in price or in log-price space) on which the FEM is based has a very natural correspondence on the “stochastic side”, namely the so-called Dirichlet form of the stochastic process model for the dynamics of the risky asset(s) underlying the derivative contracts of interest. As we show here, FEM based numerical solution methods allow for a unified numerical treatment of rather general classes of market models, including local and stochastic volatility models, square root driving processes, jump processes which are either stationary (such as Lévy processes) or nonstationary (such as affine and polynomial processes or processes which are additive in the sense of Sato), for which transform based numerical schemes are not immediately applicable due to lack of stationarity.

In return for this restriction in the types of methods which are presented here, we tried to accommodate within a single mathematical solution framework a wide range of mathematical models, as well as a reasonably large number of term sheet features in the contracts to be valued. The presentation of the material is structured in two parts: Part I “Basic Methods”, and Part II “Advanced Methods”. The material in the first part of these notes has evolved over several years, in graduate courses which were taught to students in the joint ETH and Uni Zürich MSc programme in quantitative finance, whereas Part II is based on PhD research projects in computational finance. This distinction between Parts I and II is certainly subjective, and we have seen it evolve over time, in line with the development of the field. In the formulation of the methods and in their analysis, we have tried to maintain mathematical rigor whenever possible, without compromising ease of understanding of the computational methods per se. This has, in particular in Part I, led to an engineering style of method presentation and analysis in many places. In Part II, fewer such compromises have been made. The formulation of forward and backward equations for rather large classes of jump processes has entailed a somewhat heavy machinery of Sobolev spaces of fractional and variable, state dependent order, of Dirichlet forms, etc. There is a close correspondence of many notions to objects on the stochastic side where the stochastic processes in market models are studied through their Dirichlet forms. We are convinced that many of the numerical methods presented in these notes have applications beyond the immediate area of computational finance, as Kolmogorov forward and backward equations for stochastic models with jumps arise naturally in many contexts in engineering and in the sciences.
We hope that this broader scope will justify to the readers the analytical apparatus for numerical solution methods in particular in Part II. The present material owes much in style of presentation to discussions of the authors with students in the UZH and ETH MSc quantitative finance and in the ETH MSc Computational Science and Engineering programmes who, during the courses given by us during the past years, have shaped the notes through their questions, comments and feedback. We express our appreciation to them. Also, our thanks go to Springer Verlag for their swift and easy handling of all nonmathematical aspects at the various stages during the preparation of this manuscript.

Winterthur, Switzerland: Norbert Hilber
Zurich, Switzerland: Oleg Reichmann
Zurich, Switzerland: Christoph Schwab
Munich, Germany: Christoph Winter

Part I

Basic Techniques and Models

Chapter 1

Notions of Mathematical Finance

The present notes deal with topics of computational finance, with a focus on the analysis and implementation of numerical schemes for pricing derivative contracts. There are two broad groups of numerical schemes for pricing: stochastic (Monte Carlo) type methods and deterministic methods based on the numerical solution of the Fokker–Planck (or Kolmogorov) partial integro-differential equations for the price process. Here, we focus on the latter class of methods and address finite difference and finite element methods for the most basic types of contracts for a number of stochastic models for the log returns of risky assets. We cover both models with (almost surely) continuous sample paths and models which are based on price processes with jumps. Even though emphasis will be placed on the (partial integro-)differential equation approach, some background information on the market models and on the derivation of these models will be useful, particularly for readers with a background in numerical analysis. Accordingly, we collect synoptically terminology, definitions and facts about models in finance. We emphasise that this is a collection of terms, and it can, of course, in no sense claim to be even a short survey of mathematical modelling in finance. Readers who wish to obtain a perspective on mathematical modelling principles for finance are referred to the monographs of Mao [120], Øksendal [131], Gihman and Skorohod [71–73], Lamberton and Lapeyre [109], Shiryaev [152], as well as Jacod and Shiryaev [97].

1.1 Financial Modelling

Stocks

Stocks are shares in a company which provide partial ownership in the company, proportional to the investment in the company. They are issued by a company to raise funds. Their value reflects both the company's real assets and the company's estimated or imagined earning power. Stock is the generic term for assets held in the form of shares. For publicly quoted companies, stocks are quoted and traded on a stock exchange. An index tracks the value of a basket of stocks.

N. Hilber et al., Computational Methods for Quantitative Finance, Springer Finance, DOI 10.1007/978-3-642-35401-4_1, © Springer-Verlag Berlin Heidelberg 2013


Assets for which future prices are not known with certainty are called risky assets, while assets for which the future prices are known are called risk free.

Price Process

The price at which a stock can be bought or sold at any given time t on a stock exchange is called the spot price, and we shall denote it by $S_t$. All possible future prices $S_t$ as functions of t (together with probabilistic information on the likelihood of a particular price history) constitute the price process $S = \{S_t : t \ge 0\}$ of the asset. It is mathematically modelled by a stochastic process to be defined below.

Derivative Securities

A derivative security, derivative for short, is a security whose value depends on the value of one or several underlying assets and the decisions of the investor. It is also called a contingent claim. It is a financial contract whose value at expiration time (or time of maturity) T is determined by the price process of the underlying assets up to time T. After choosing a price process for the asset(s) under consideration, the task is to determine a price for the derivative security on the asset. There are several types of derivatives: options, forwards, futures and swaps. As an example, we focus on the pricing of options, since pricing other assets leads to closely related problems.

Options

An option is a derivative which gives its holder the right, but not the obligation, to make a specified transaction at or by a specified date at a specified price. Options are sold by one party, the writer of the option, to another, the holder of the option. If the holder chooses to make the transaction, he exercises the option. There are many conditions under which an option can be exercised, giving rise to different types of options. We list the main ones: Call options give the right (but not the obligation) to buy, put options give the right (but not the obligation) to sell the underlying at a specified price, the so-called strike price K.

The simplest options are the European call and put options. They give the holder the right to buy (resp., sell) exactly at maturity T. Since they are described by very simple rules, they are also called plain vanilla options. Options with more sophisticated rules than those for plain vanillas are called exotic options. A particular type of exotic options are American options, which give the holder the right (but not the obligation) to buy (resp., sell) the underlying at any time t at or before maturity T. For European options the price does not depend on the path of the underlying, but only on the realisation at maturity T. There are also so-called path dependent options, like Asian, lookback or barrier contracts. The value of Asian options depends on the average price of the option's underlying over a period, lookback options depend on the maximum or minimum asset price over a period, and barrier options depend on particular price level(s) being attained over a period.

Payoff

The payoff of an option is its value at the time of exercise T. For a European call with strike price K, the payoff g is
\[
g(S_T) = (S_T - K)^+ =
\begin{cases}
S_T - K & \text{if } S_T > K, \\
0 & \text{else.}
\end{cases}
\]


At time $t \le T$ the option is said to be in the money if $S_t > K$, out of the money if $S_t < K$, and at the money if $S_t \approx K$.

Modelling Assumptions

Most market models for stocks assume the existence of a riskless bank account with riskless interest rate $r \ge 0$. We will also consider stochastic interest rate models where this is not the case. However, unless explicitly stated otherwise, we assume that money can be deposited in and borrowed from this bank account with continuously compounded, known interest rate r. Therefore, 1 currency unit in this account at t = 0 will give $e^{rt}$ currency units at time t, and if 1 currency unit is borrowed at time t = 0, we will have to pay back $e^{rt}$ currency units at time t. We also assume a frictionless market, i.e. there are no transaction costs, and we assume further that there is no default risk, all market participants are rational, and the market is efficient, i.e. there is no arbitrage.
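The payoff, the moneyness classification of a call and the continuously compounded bank account of Sect. 1.1 can be illustrated by a minimal Python sketch. The function names and the relative tolerance used to decide when $S_t \approx K$ are assumptions made for this illustration only; the text does not fix a tolerance.

```python
import math


def call_payoff(s_T: float, strike: float) -> float:
    """European call payoff g(S_T) = (S_T - K)^+."""
    return max(s_T - strike, 0.0)


def put_payoff(s_T: float, strike: float) -> float:
    """European put payoff (K - S_T)^+."""
    return max(strike - s_T, 0.0)


def call_moneyness(s_t: float, strike: float, rel_tol: float = 1e-2) -> str:
    """Classify a call at time t <= T.

    'at the money' is taken to mean S_t = K up to the relative tolerance
    rel_tol, which is an assumption of this sketch.
    """
    if math.isclose(s_t, strike, rel_tol=rel_tol):
        return "at the money"
    return "in the money" if s_t > strike else "out of the money"


def bank_account(t: float, r: float, initial: float = 1.0) -> float:
    """Value at time t of `initial` currency units deposited at t = 0
    with continuously compounded rate r, i.e. initial * e^{rt}."""
    return initial * math.exp(r * t)
```

For instance, `call_payoff(120.0, 100.0)` evaluates the first branch of the payoff formula, while `call_payoff(80.0, 100.0)` falls into the zero branch.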

1.2 Stochastic Processes

We refer to the texts Mao [120] and Øksendal [131] for an introduction to stochastic processes and stochastic differential equations. Much more general stochastic processes in the Markovian and non-Markovian setup are treated in the monographs Gihman and Skorohod [71–73] as well as Jacod and Shiryaev [97].

Prices of the so-called risky assets can be modelled by stochastic processes in continuous time $t \in [0, T]$, where the maturity T > 0 is the time horizon. To describe stochastic price processes, we require a probability space $(\Omega, \mathcal{F}, \mathbb{P})$. Here, $\Omega$ is the set of elementary events, $\mathcal{F}$ is a σ-algebra which contains all events (i.e. subsets of $\Omega$) of interest and $\mathbb{P} : \mathcal{F} \to [0, 1]$ assigns a probability to any event $A \in \mathcal{F}$. We shall always assume the probability space to be complete, i.e. if $B \subset A$ with $A \in \mathcal{F}$ and $\mathbb{P}[A] = 0$, then $B \in \mathcal{F}$. We equip $(\Omega, \mathcal{F}, \mathbb{P})$ with a filtration, i.e. a family $\mathbb{F} = \{\mathcal{F}_t : 0 \le t \le T\}$ of σ-algebras which are monotonic with respect to t in the sense that for $0 \le s \le t \le T$ there holds $\mathcal{F}_s \subseteq \mathcal{F}_t \subseteq \mathcal{F}_T \subset \mathcal{F}$. In financial modelling, the σ-algebra $\mathcal{F}_t \in \mathbb{F}$ represents the information available in the model up to time t. We assume that the filtered probability space $(\Omega, \mathcal{F}, \mathbb{P}, \mathbb{F})$ satisfies the usual assumptions, i.e.
(i) $\mathcal{F}$ is $\mathbb{P}$-complete,
(ii) $\mathcal{F}_0$ contains all $\mathbb{P}$-null subsets of $\Omega$ and
(iii) the filtration $\mathbb{F}$ is right-continuous: $\mathcal{F}_t = \bigcap_{s > t} \mathcal{F}_s$.

Definition 1.2.1 (Stochastic processes) A stochastic process $X = \{X_t : 0 \le t \le T\}$ is a family of random variables defined on a probability space $(\Omega, \mathcal{F}, \mathbb{P}, \mathbb{F})$, parametrised by the time variable t. For $\omega \in \Omega$, the function $X_t(\omega)$ of t is called a sample path of X. The process is $\mathbb{F}$-adapted if $X_t$ is $\mathcal{F}_t$ measurable (denoted by $X_t \in \mathcal{F}_t$) for each t.


To model asset prices by stochastic processes, knowledge about past events up to time t should be incorporated into the model. This is done by the concept of filtration.

Definition 1.2.2 (Natural filtration) We call $\mathbb{F}^X = \{\mathcal{F}_t^X : 0 \le t \le T\}$ the natural filtration for X if it is the completion with respect to $\mathbb{P}$ of the filtration $\widetilde{\mathbb{F}}^X = \{\widetilde{\mathcal{F}}_t^X : 0 \le t \le T\}$, where for each $0 \le t \le T$, $\widetilde{\mathcal{F}}_t^X = \sigma(X_r : r \le t)$.

A stochastic process is called càdlàg (from French 'continue à droite avec des limites à gauche') if it has càdlàg sample paths, and a mapping $f : [0, T] \to \mathbb{R}$ is said to be càdlàg if for all $t \in [0, T]$ it has a left limit at t and is right-continuous at t. A stochastic process is called predictable if it is measurable with respect to the σ-algebra $\widetilde{\mathcal{F}}$, where $\widetilde{\mathcal{F}}$ is the smallest σ-algebra generated by all adapted left-continuous processes on $[0, T] \times \Omega$.

Asset prices are often modelled by Markov processes. In this class of stochastic processes, the stochastic behaviour of X after time t depends on the past only through the current state $X_t$.

Definition 1.2.3 (Markov property) A stochastic process $X = \{X_t : 0 \le t \le T\}$ is Markov with respect to $\mathbb{F}$ if
\[
\mathbb{E}[f(X_s) \mid \mathcal{F}_t] = \mathbb{E}[f(X_s) \mid X_t],
\]
for any bounded Borel function f and $s \ge t$.

No arbitrage considerations require discounted price processes to be martingales, i.e. the best prediction of $X_s$ based on the information at time t contained in $\mathcal{F}_t$ is the value $X_t$. In particular, the expected value of a martingale at any finite time T based on the information at time 0 equals the initial value $X_0$, $\mathbb{E}[X_T \mid \mathcal{F}_0] = X_0$.

Definition 1.2.4 (Martingale) A stochastic process $X = \{X_t : 0 \le t \le T\}$ is a martingale with respect to $(\mathbb{P}, \mathbb{F})$ if
(i) X is $\mathbb{F}$-adapted,
(ii) $\mathbb{E}[|X_t|] < \infty$ for all $t \ge 0$,
(iii) $\mathbb{E}[X_s \mid \mathcal{F}_t] = X_t$ $\mathbb{P}$-a.s. for $s \ge t \ge 0$.

There is a one-to-one correspondence between models that satisfy the no free lunch with vanishing risk condition and the existence of a so-called equivalent local martingale measure (ELMM). We refer to [54, 55] for details.

The most widely used price process is a Brownian motion or Wiener process. Its use in modelling log returns in prices of risky assets goes back to Bachelier [4]. Recall that the normal distribution $\mathcal{N}(\mu, \sigma^2)$ with mean $\mu \in \mathbb{R}$ and variance $\sigma^2$ with $\sigma > 0$ has the density
\[
f_{\mathcal{N}}(x; \mu, \sigma^2) = \frac{1}{\sqrt{2\pi\sigma^2}}\, e^{-(x-\mu)^2/(2\sigma^2)},
\]


and it is symmetric around $\mu$. Normality assumptions in models of log returns of risky assets' prices imply the assumption that upward and downward moves of prices occur symmetrically.

Definition 1.2.5 (Wiener process) A stochastic process $X = \{X_t : t \ge 0\}$ is a Wiener process on a probability space $(\Omega, \mathcal{F}, \mathbb{P})$ if
(i) $X_0 = 0$ $\mathbb{P}$-a.s.,
(ii) X has independent increments, i.e. for $s \le t$, $X_t - X_s$ is independent of $\mathcal{F}_s = \sigma(X_u, u \le s)$,
(iii) $X_{t+s} - X_t$ is normally distributed with mean 0 and variance s > 0, i.e. $X_{t+s} - X_t \sim \mathcal{N}(0, s)$, and
(iv) X has $\mathbb{P}$-a.s. continuous sample paths.
We shall denote this process by W for N. Wiener.

In the Black–Scholes stock price model, the price process S of the risky asset is modelled by assuming that the return due to price change in the time interval $\Delta t > 0$ is
\[
\frac{S_{t+\Delta t} - S_t}{S_t} = \frac{\Delta S_t}{S_t} = r\,\Delta t + \sigma\,\Delta W_t,
\]
i.e. that it consists of a deterministic part $r\,\Delta t$ and a random part $\sigma (W_{t+\Delta t} - W_t)$. In the limit $\Delta t \to 0$, we obtain the stochastic differential equation (SDE)
\[
dS_t = r S_t\, dt + \sigma S_t\, dW_t, \qquad S_0 > 0. \tag{1.1}
\]
The above SDE admits the unique solution
\[
S_t = S_0\, e^{(r - \sigma^2/2)t + \sigma W_t}.
\]

This exponential of a Brownian motion is called the geometric Brownian motion. The stochastic differential equation (1.1) for the geometric Brownian motion is a special case of the more general SDE
\[
dX_t = b(t, X_t)\, dt + \sigma(t, X_t)\, dW_t, \qquad X_0 = Z, \tag{1.2}
\]
for which we give an existence and uniqueness result.

Theorem 1.2.6 We consider a probability space $(\Omega, \mathcal{F}, \mathbb{P})$ with filtration $\mathbb{F}$ and a Brownian motion W on $(\Omega, \mathcal{F}, \mathbb{P})$ adapted to $\mathbb{F}$. Assume there exists C > 0 such that $b, \sigma : \mathbb{R}_+ \times \mathbb{R} \to \mathbb{R}$ in (1.2) satisfy
\[
|b(t, x) - b(t, y)| + |\sigma(t, x) - \sigma(t, y)| \le C |x - y|, \qquad x, y \in \mathbb{R},\ t \in \mathbb{R}_+, \tag{1.3}
\]
\[
|b(t, x)| + |\sigma(t, x)| \le C (1 + |x|), \qquad x \in \mathbb{R},\ t \in \mathbb{R}_+. \tag{1.4}
\]
Assume further $X_0 = Z$ for a random variable Z which is $\mathcal{F}_0$-measurable and satisfies $\mathbb{E}[|Z|^2] < \infty$. Then, for any $T \ge 0$, (1.2) admits a $\mathbb{P}$-a.s. unique solution in [0, T] satisfying
\[
\mathbb{E}\Big[\sup_{0 \le t \le T} |X_t|^2\Big] < \infty. \tag{1.5}
\]


We refer to [131, Theorem 5.2.1] or [120, Theorem 2.3.1] for a proof of this statement. Note that the Lipschitz continuity (1.3) implies the linear growth condition (1.4) for time-independent coefficients $\sigma(x)$ and $b(x)$. For any $t \ge 0$ one has $\int_0^t |b(s, X_s)|\, ds < \infty$, $\int_0^t |\sigma(s, X_s)|^2\, ds < \infty$, $\mathbb{P}$-a.s., i.e. the solution process X is a particular case of a so-called Itô process. Equation (1.2) is formally the differential form of the equation
\[
X_t = Z + \int_0^t b(s, X_s)\, ds + \int_0^t \sigma(s, X_s)\, dW_s,
\]
for $t \in [0, T]$. In the derivation of pricing equations, it will become important to check under which conditions the integrals with respect to W, i.e. $\int_0^t \phi_s\, dW_s$, are martingales. The notion of stochastic integrals is discussed in detail in [120, Sect. 1.5].

Proposition 1.2.7 Let the process $\phi$ be predictable and let $\phi$ satisfy, for $T \ge 0$,
\[
\mathbb{E}\Big[\int_0^T |\phi_t|^2\, dt\Big] < \infty. \tag{1.6}
\]
Then, the process $M = \{M_t : t \ge 0\}$, $M_t := \int_0^t \phi_s\, dW_s$, is a martingale.

For a proof of this statement, we refer to [131, Theorem 3.2.1]. In mathematical finance, we are interested in the dynamics of $f(t, X_t)$, e.g. where $f(t, X_t)$ denotes the option price process. Here, the Itô formula plays an important role.

Theorem 1.2.8 (Itô formula) Let X be given by the Itô process (1.2), and let $f(t, x) \in C^2([0, \infty) \times \mathbb{R})$, i.e. f is twice continuously differentiable on $[0, \infty) \times \mathbb{R}$. Then, for $Y_t = f(t, X_t)$ we obtain
\[
dY_t = \frac{\partial f}{\partial t}(t, X_t)\, dt + \frac{\partial f}{\partial x}(t, X_t)\, dX_t + \frac{1}{2} \frac{\partial^2 f}{\partial x^2}(t, X_t) \cdot (dX_t)^2, \tag{1.7}
\]
where $(dX_t)^2 = (dX_t) \cdot (dX_t)$ is computed according to the rules
\[
dt \cdot dt = dt \cdot dW_t = dW_t \cdot dt = 0, \qquad dW_t \cdot dW_t = dt.
\]
We refer to [120, Theorem 1.6.2] for a proof of the Itô formula. A sketch of the proof is given in [131, Theorem 4.1.2]. We note in passing that the smoothness requirements on the function f in Theorem 1.2.8 can be substantially weakened. We refer to [132, Sects. II.7 and II.8] and [40, Sect. 8.3] for general versions of the Itô formula for Lévy processes and semimartingales.
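As an illustration of the Itô formula, consider the geometric Brownian motion (1.1) and $f(t, x) = \log x$. Since $(dS_t)^2 = \sigma^2 S_t^2\, dt$, formula (1.7) gives $d \log S_t = (r - \sigma^2/2)\, dt + \sigma\, dW_t$, and integration yields the closed-form solution stated after (1.1). The following Python sketch checks this numerically by comparing the closed form with an Euler–Maruyama time stepping of (1.1) driven by the same Brownian increments. The Euler–Maruyama scheme, the function name and all parameter values are assumptions of this illustration and are not taken from the text.

```python
import math
import random


def simulate_gbm(s0: float, r: float, sigma: float, T: float,
                 n_steps: int, seed: int = 0) -> tuple[float, float]:
    """Return (exact, euler) values of S_T along one Brownian path.

    exact: S_0 * exp((r - sigma^2/2) T + sigma W_T)
    euler: Euler-Maruyama approximation of dS = r S dt + sigma S dW
    Both use the same simulated increments of W.
    """
    rng = random.Random(seed)
    dt = T / n_steps
    w = 0.0            # Brownian motion W_t, starts at 0
    s_euler = s0
    for _ in range(n_steps):
        dw = rng.gauss(0.0, math.sqrt(dt))   # increment ~ N(0, dt)
        s_euler += r * s_euler * dt + sigma * s_euler * dw
        w += dw
    s_exact = s0 * math.exp((r - 0.5 * sigma ** 2) * T + sigma * w)
    return s_exact, s_euler
```

With a fine time grid the two values are expected to agree closely along the path, which reflects the fact that the exponential formula solves (1.1).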

1.3 Further Reading

An introduction to financial modelling and option pricing can be found in Wilmott et al. [161] and the corresponding student version [162]. More details on risk-neutral pricing, absence of arbitrage and equivalent martingale measures are given in Delbaen and Schachermayer [53]. For a general introduction to stochastic differential equations, see Øksendal [131], Mao [120], Protter [132] and the references therein.

Chapter 2

Elements of Numerical Methods for PDEs

In this chapter, we present some elements of numerical methods for partial differential equations (PDEs). The PDEs are classified into elliptic, parabolic and hyperbolic equations, and we indicate the corresponding type of problems that they model. PDEs arising in option pricing problems in finance are mostly parabolic. Occasionally, however, elliptic PDEs arise in connection with so-called “infinite horizon problems”, and hyperbolic PDEs may appear in certain pure jump models with dominating drift. Therefore, we consider in particular the heat equation and show how to solve it numerically using finite differences or finite elements. Finite difference methods (FDM) consist of finding an approximate solution on a grid by replacing the derivatives in the differential equation by difference quotients. Finite element methods (FEM) are based instead on variational formulations of the differential equations and determine approximate solutions that are usually piecewise polynomials on some partition of the (log) price domain. We start with recapitulating some function spaces as well as the classification of PDEs.

N. Hilber et al., Computational Methods for Quantitative Finance, Springer Finance, DOI 10.1007/978-3-642-35401-4_2, © Springer-Verlag Berlin Heidelberg 2013

2.1 Function Spaces

The variational formulation and the analysis of the finite element method require tools from functional analysis, in particular Hilbert spaces (see Appendix A). Let G be a non-empty open subset of $\mathbb{R}^d$. If a function $u : G \to \mathbb{R}$ is sufficiently smooth, we denote the partial derivatives of u by
\[
D^{\mathbf{n}} u(x) := \frac{\partial^{|\mathbf{n}|} u(x)}{\partial x_1^{n_1} \cdots \partial x_d^{n_d}} = \partial_{x_1}^{n_1} \cdots \partial_{x_d}^{n_d} u(x), \qquad x = (x_1, \ldots, x_d) \in G, \tag{2.1}
\]
where $\mathbf{n} = (n_1, \ldots, n_d) \in \mathbb{N}_0^d$ is a multi-index. The order of the partial derivative is given by $|\mathbf{n}| = \sum_{i=1}^d n_i$. For any integer $n \in \mathbb{N}_0$, we define
\[
C^n(G) = \{u : D^{\mathbf{n}} u \text{ exists and is continuous on } G \text{ for } |\mathbf{n}| \le n\},
\]


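and set $C^\infty(G) = \bigcap_{n \ge 0} C^n(G)$. The support of u is denoted by supp u, and we define $C_0^n(G)$, $C_0^\infty(G)$ as consisting of all functions $u \in C^n(G)$, $C^\infty(G)$ with compact support $\operatorname{supp} u \subset\subset G$. We denote by $L^p(G)$, $1 \le p \le \infty$, the usual space which consists of all Lebesgue measurable functions $u : G \to \mathbb{R}$ with finite $L^p$-norm,
\[
\|u\|_{L^p(G)} :=
\begin{cases}
\big(\int_G |u(x)|^p\, dx\big)^{1/p} & \text{if } 1 \le p < \infty, \\
\operatorname{ess\,sup}_G |u(x)| & \text{if } p = \infty,
\end{cases}
\]
where ess sup means the essential supremum disregarding values on nullsets. The case p = 2 is of particular interest. The space $L^2(G)$ is a Hilbert space with respect to the inner product $(u, v) = \int_G u(x) v(x)\, dx$.

Let $\mathcal{H}$ be a Hilbert space with the inner product $(\cdot, \cdot)_{\mathcal{H}}$ and norm $\|u\|_{\mathcal{H}} := (u, u)_{\mathcal{H}}^{1/2}$. We denote by $\mathcal{H}^*$ the dual space of $\mathcal{H}$ which consists of all bounded linear functionals $u^* : \mathcal{H} \to \mathbb{R}$ on $\mathcal{H}$. $\mathcal{H}^*$ can be identified with $\mathcal{H}$ by the Riesz representation theorem.

Theorem 2.1.1 (Riesz representation theorem) For each $u^* \in \mathcal{H}^*$ there exists a unique element $u \in \mathcal{H}$ such that
\[
\langle u^*, v \rangle_{\mathcal{H}^*, \mathcal{H}} = (u, v)_{\mathcal{H}} \qquad \forall v \in \mathcal{H}.
\]
The mapping $u^* \mapsto u$ is a linear isomorphism of $\mathcal{H}^*$ onto $\mathcal{H}$.

The theory of parabolic partial differential equations requires the introduction of Hilbert space-valued $L^p$-functions. As above, let $\mathcal{H}$ be a Hilbert space with the norm $\|\cdot\|_{\mathcal{H}}$. Denote by J the interval J := (0, T) with T > 0, and let $1 \le p \le \infty$. The space $L^p(J; \mathcal{H})$ is defined by
\[
L^p(J; \mathcal{H}) := \{u : J \to \mathcal{H} \text{ measurable} : \|u\|_{L^p(J; \mathcal{H})} < \infty\},
\]
with the norm
\[
\|u\|_{L^p(J; \mathcal{H})} :=
\begin{cases}
\big(\int_J \|u(t)\|_{\mathcal{H}}^p\, dt\big)^{1/p} & \text{if } 1 \le p < \infty, \\
\operatorname{ess\,sup}_J \|u(t)\|_{\mathcal{H}} & \text{if } p = \infty.
\end{cases}
\]
Furthermore, for $n \in \mathbb{N}_0$ let $C^n(J; \mathcal{H})$ be the space of $\mathcal{H}$-valued functions that are of the class $C^n$ with respect to t.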

2.2 Partial Differential Equations

For $k \in \mathbb{N}$ we let
\[
D^k u(x) := \{D^{\mathbf{n}} u(x) : |\mathbf{n}| = k\}
\]
be the set of all partial derivatives of order k. If k = 1, we regard the elements of $D^1 u(x) =: Du(x)$ as being arranged in a row vector
\[
Du = (\partial_{x_1} u, \ldots, \partial_{x_d} u).
\]
If k = 2, we regard the elements of $D^2 u(x)$ as being arranged in a matrix
\[
D^2 u =
\begin{pmatrix}
\partial_{x_1} \partial_{x_1} u & \cdots & \partial_{x_1} \partial_{x_d} u \\
\vdots & \ddots & \vdots \\
\partial_{x_d} \partial_{x_1} u & \cdots & \partial_{x_d} \partial_{x_d} u
\end{pmatrix}.
\]
Hence, the Laplacian $\Delta u$ of u can be written as
\[
\Delta u := \sum_{i=1}^d \partial_{x_i} \partial_{x_i} u = \operatorname{tr}[D^2 u], \tag{2.2}
\]
where $\operatorname{tr} : \mathbb{R}^{d \times d} \to \mathbb{R}$, $B \mapsto \operatorname{tr}[B] = \sum_{i=1}^d B_{ii}$, is the trace of a $d \times d$-matrix B. In the following, we write $\partial_{x_i x_j}$ instead of $\partial_{x_i} \partial_{x_j}$ to simplify the notation.

A partial differential equation is an equation involving an unknown function of two or more variables and certain of its derivatives. Let $G \subset \mathbb{R}^d$ be open, $x = (x_1, \ldots, x_d) \in G$, and $k \in \mathbb{N}$.

Definition 2.2.1 An expression of the form
\[
F(D^k u(x), D^{k-1} u(x), \ldots, Du(x), u(x), x) = 0, \qquad x \in G,
\]
is called a kth order partial differential equation, where the function
\[
F : \mathbb{R}^{d^k} \times \mathbb{R}^{d^{k-1}} \times \cdots \times \mathbb{R}^d \times \mathbb{R} \times G \to \mathbb{R}
\]
is given and the function $u : G \to \mathbb{R}$ is the unknown.

Let $a_{ij}(x)$, $b_i(x)$, $c(x)$ and $f(x)$ be given functions. For a linear first order PDE in d + 1 variables, F has the form
\[
F(Du, u, x) = \sum_{i=0}^d b_i(x)\, \partial_{x_i} u + c(x) u - f(x).
\]
Setting $x_0 = t$, $x_1 = x$, $b(x) = (b_0(x), b_1(x))^\top = (1, b)^\top$, $b \in \mathbb{R}_+$, and c = 0, for example, we obtain the (hyperbolic) transport equation with constant speed b of propagation
\[
\partial_t u + b\, \partial_x u = f(t, x).
\]
For a linear second order PDE in d + 1 variables, F takes the form
\[
F(D^2 u, Du, u, x) = -\sum_{i,j=0}^d a_{ij}(x)\, \partial_{x_i x_j} u + \sum_{i=0}^d b_i(x)\, \partial_{x_i} u + c(x) u - f(x).
\]


Let $b(x) = (b_0(x), \ldots, b_d(x))^\top$ and assume that the matrix $A(x) = \{a_{ij}(x)\}_{i,j=0}^d$ is symmetric with real eigenvalues $\lambda_0(x) \le \lambda_1(x) \le \cdots \le \lambda_d(x)$. We can use the eigenvalues to distinguish three types of PDEs: elliptic, parabolic and hyperbolic.

Definition 2.2.2 Let $I = \{0, \ldots, d\}$. At $x \in \mathbb{R}^{d+1}$, a PDE is called
(i) Elliptic $\Leftrightarrow$ $\lambda_i(x) \ne 0\ \forall i$ $\wedge$ $\operatorname{sign}(\lambda_0(x)) = \cdots = \operatorname{sign}(\lambda_d(x))$,
(ii) Parabolic $\Leftrightarrow$ $\exists!\, j \in I : \lambda_j(x) = 0$ $\wedge$ $\operatorname{rank}(A(x), b(x)) = d + 1$,
(iii) Hyperbolic $\Leftrightarrow$ $(\lambda_i(x) \ne 0\ \forall i)$ $\wedge$ $\exists!\, j \in I : \operatorname{sign} \lambda_j(x) \ne \operatorname{sign} \lambda_k(x),\ k \in I \setminus \{j\}$.
The PDE is called elliptic, parabolic, hyperbolic on G, if it is elliptic, parabolic, hyperbolic at all $x \in G$.

We give a typical example for each type:
(i) The Poisson equation $\Delta u = f(x)$ is elliptic.
(ii) The heat equation $\partial_t u - \Delta u = f(t, x)$ is parabolic (set $x_0 = t$).
(iii) The wave equation $\partial_{tt} u - \Delta u = f(t, x)$ is hyperbolic (set $x_0 = t$).
(iv) The Black–Scholes equation for the value v(t, s) of a European option,
\[
\partial_t v - \frac{1}{2} \sigma^2 s^2\, \partial_{ss} v - r s\, \partial_s v + r v = 0, \tag{2.3}
\]
with volatility $\sigma > 0$ and interest rate $r \ge 0$ is parabolic at $(t, s) \in (0, T) \times (0, \infty)$ and degenerates to an ordinary differential equation as $s \to 0$.

Partial differential equations arising in finance, like the Black–Scholes equation (2.3), are mostly of parabolic type, i.e. they are of the form
\[
\partial_t u - \sum_{i,j=1}^d a_{ij}(x)\, \partial_{x_i x_j} u + \sum_{i=1}^d b_i(x)\, \partial_{x_i} u + c(x) u = f(x).
\]
Therefore, we introduce the basic concepts for solving parabolic equations in the next section. For illustration purposes, we consider the heat equation. Indeed, setting $s = e^x$, $t = 2\sigma^{-2} \tau$ and
\[
v(t, s) = e^{\alpha x + \beta \tau} u(\tau, x), \qquad \alpha = 1/2 - r\sigma^{-2}, \qquad \beta = -(1/2 + r\sigma^{-2})^2,
\]
the Black–Scholes equation (2.3) for v(t, s) can be transformed to the heat equation $\partial_\tau u - \partial_{xx} u = 0$ for $u(\tau, x)$.
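The transformation of (2.3) to the heat equation can be verified step by step. The following worked computation is a sketch; the shorthand $\kappa := 2r/\sigma^2$ is introduced here for convenience and does not appear in the text.

```latex
% Shorthand (assumption of this sketch): \kappa := 2r/\sigma^2.
\begin{align*}
&\text{Step 1 } (s = e^x):\quad s\,\partial_s = \partial_x,\quad
  s^2\partial_{ss} = \partial_{xx} - \partial_x, \text{ so (2.3) becomes}\\
&\qquad \partial_t v - \tfrac12\sigma^2(\partial_{xx} v - \partial_x v)
  - r\,\partial_x v + r v = 0.\\
&\text{Step 2 } (t = 2\sigma^{-2}\tau):\quad \partial_t = \tfrac12\sigma^2\,\partial_\tau;
  \text{ dividing by } \tfrac12\sigma^2 \text{ gives}\\
&\qquad \partial_\tau v - \partial_{xx} v + (1-\kappa)\,\partial_x v + \kappa v = 0.\\
&\text{Step 3 } (v = e^{\alpha x + \beta\tau}u):\quad
  \text{dividing by } e^{\alpha x + \beta\tau} \text{ gives}\\
&\qquad \partial_\tau u - \partial_{xx} u + (1 - \kappa - 2\alpha)\,\partial_x u
  + \bigl(\beta - \alpha^2 + (1-\kappa)\alpha + \kappa\bigr)u = 0.\\
&\text{Step 4: the first and zeroth order terms vanish for}\\
&\qquad \alpha = \tfrac12(1-\kappa) = \tfrac12 - r\sigma^{-2},\qquad
  \beta = -\alpha^2 - \kappa = -\tfrac14(1+\kappa)^2
        = -\bigl(\tfrac12 + r\sigma^{-2}\bigr)^2 .
\end{align*}
```

With these values of $\alpha$ and $\beta$, only $\partial_\tau u - \partial_{xx} u = 0$ remains, in agreement with the constants stated above.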


Fig. 2.1 Time-space grid

2.3 Numerical Methods for the Heat Equation

Let the space domain $G = (a, b) \subset \mathbb{R}$ be an open interval and let the time domain J := (0, T) for T > 0. Consider the initial–boundary value problem: Find $u : J \times G \to \mathbb{R}$ such that
\[
\begin{aligned}
\partial_t u - \partial_{xx} u &= f(t, x) && \text{in } J \times G, \\
u(t, x) &= 0 && \text{on } J \times \partial G, \\
u(0, x) &= u_0 && \text{in } G,
\end{aligned}
\tag{2.4}
\]
where $u(0, x) = u_0$ is the initial condition and $u(t, x) = 0$ on the boundary is called the homogeneous Dirichlet boundary condition. We explain two numerical methods to find approximations to the solution u(t, x) of the problem (2.4). We start with the finite difference method.

2.3.1 Finite Difference Method

In the finite difference discretization, the domain $J \times G$ is replaced by discrete grid points $(t_m, x_i)$ and the partial derivatives in (2.4) are approximated by difference quotients at the grid points. Let the space grid points be given by
\[
x_i = a + ih, \quad i = 0, 1, \ldots, N + 1, \quad h := (b - a)/(N + 1) = \Delta x, \tag{2.5}
\]
which are equidistant with mesh width h, and the time levels by
\[
t_m = mk, \quad m = 0, 1, \ldots, M, \quad k := T/M = \Delta t.
\]
The time-space grid is illustrated in Fig. 2.1. Assume that $f \in C^2(G)$. Then, using Taylor's formula, we have
\[
f'(x) = \frac{f(x + h) - f(x)}{h} - \frac{h}{2} f''(\xi), \qquad \xi \in (x, x + h). \tag{2.6}
\]
Setting $f_i := f(x_i)$, we obtain as $h \to 0$,
\[
f'(x_i) = \frac{f_{i+1} - f_i}{h} + O(h) =: (\delta_x^+ f)_i + O(h),
\]
where $(\delta_x^+ f)_i$ is called the one-sided difference quotient of f with respect to x at $x_i$. The difference quotient is said to be accurate of first order since the remainder term is O(h) as $h \to 0$. Analogous expressions hold for the time derivative $\partial_t$. Higher order finite differences allow obtaining approximations of order $O(h^p)$ with $p \ge 2$ rather than just O(h). If the function to be approximated has sufficient regularity, we have
\[
\begin{aligned}
f'(x_i) &= \frac{f_{i+1} - f_{i-1}}{2h} + O(h^2) =: (\delta_x f)_i + O(h^2), && \text{for } f \in C^3(G), \\
f''(x_i) &= \frac{f_{i+1} - 2 f_i + f_{i-1}}{h^2} + O(h^2) =: (\delta_{xx}^2 f)_i + O(h^2), && \text{for } f \in C^4(G).
\end{aligned}
\]
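The orders of accuracy of the three difference quotients can be observed numerically by halving the mesh width and comparing the errors. The following Python sketch does this; the function names and the helper `observed_order` are introduced for illustration only.

```python
import math
from typing import Callable


def forward_diff(f: Callable[[float], float], x: float, h: float) -> float:
    """One-sided difference quotient (f(x+h) - f(x)) / h, first order."""
    return (f(x + h) - f(x)) / h


def central_diff(f: Callable[[float], float], x: float, h: float) -> float:
    """Central difference quotient (f(x+h) - f(x-h)) / (2h), second order."""
    return (f(x + h) - f(x - h)) / (2.0 * h)


def second_diff(f: Callable[[float], float], x: float, h: float) -> float:
    """Second difference quotient (f(x+h) - 2f(x) + f(x-h)) / h^2,
    a second order approximation of f''(x)."""
    return (f(x + h) - 2.0 * f(x) + f(x - h)) / (h * h)


def observed_order(err_h: float, err_half_h: float) -> float:
    """Estimate p in error = O(h^p) from the errors at h and h/2."""
    return math.log2(err_h / err_half_h)
```

For a smooth function such as the sine, the observed order is expected to be close to 1 for the one-sided quotient and close to 2 for the other two, as long as h is not so small that rounding errors dominate.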

With the difference quotients we turn next to the finite difference discretization of the heat equation (2.4). Let $u_i^m$ denote the approximate value of the solution u at grid point $(t_m, x_i)$, i.e. $u_i^m \approx u(t_m, x_i)$. For a parameter $\theta \in [0, 1]$, we approximate the partial differential operator $\partial_t u - \partial_{xx} u$ at the grid point $(t_m, x_i)$ by the finite difference operator
\[
\begin{aligned}
E_i^m &:= \frac{u_i^{m+1} - u_i^m}{k} - \Big[(1 - \theta)(\delta_{xx}^2 u)_i^m + \theta (\delta_{xx}^2 u)_i^{m+1}\Big] \\
&= \frac{u_i^{m+1} - u_i^m}{k} - \theta\, \frac{u_{i+1}^{m+1} - 2 u_i^{m+1} + u_{i-1}^{m+1}}{h^2} - (1 - \theta)\, \frac{u_{i+1}^m - 2 u_i^m + u_{i-1}^m}{h^2},
\end{aligned}
\tag{2.7}
\]
and replace the partial differential equation (2.4) by the finite difference equations
\[
E_i^m = \theta f_i^{m+1} + (1 - \theta) f_i^m, \qquad i = 1, \ldots, N, \quad m = 0, \ldots, M - 1, \tag{2.8}
\]
with initial conditions $u_i^0 = u_0(x_i)$, $i = 1, \ldots, N$, and boundary conditions $u_k^m = 0$, $k \in \{0, N + 1\}$, $m = 0, \ldots, M$. We observe that for $\theta = 0$, the values $u_i^{m+1}$, $i = 1, \ldots, N$, are given by $E_i^m = f_i^m$ explicitly in terms of $u_i^m$, i.e. the scheme (2.8) is explicit. For $\theta = 1$ (more generally, for every $\theta > 0$), a linear system of equations must be solved for $u_i^{m+1}$ at each time step, i.e. the scheme is implicit. We write (2.8) in matrix form. To this end, we introduce the column vectors
\[
\mathbf{u}^m = (u_1^m, \ldots, u_N^m)^\top, \qquad \mathbf{f}^m = (f_1^m, \ldots, f_N^m)^\top, \qquad \mathbf{u}_0 = (u_0(x_1), \ldots, u_0(x_N))^\top,
\]
and the tridiagonal matrices
\[
\mathbf{G} = \frac{1}{h^2}
\begin{pmatrix}
2 & -1 & & \\
-1 & \ddots & \ddots & \\
 & \ddots & \ddots & -1 \\
 & & -1 & 2
\end{pmatrix}
\in \mathbb{R}^{N \times N}_{\mathrm{sym}}, \qquad
\mathbf{I} =
\begin{pmatrix}
1 & & & \\
 & \ddots & & \\
 & & \ddots & \\
 & & & 1
\end{pmatrix}
\in \mathbb{R}^{N \times N}_{\mathrm{sym}}. \tag{2.9}
\]
Then, after multiplication by k, the finite difference scheme (2.8) becomes: Find $\mathbf{u}^{m+1} \in \mathbb{R}^N$ such that for $m = 0, \ldots, M - 1$,
\[
\begin{aligned}
(\mathbf{I} + \theta k \mathbf{G})\, \mathbf{u}^{m+1} &= (\mathbf{I} - (1 - \theta) k \mathbf{G})\, \mathbf{u}^m + k\big(\theta \mathbf{f}^{m+1} + (1 - \theta) \mathbf{f}^m\big), \\
\mathbf{u}^0 &= \mathbf{u}_0.
\end{aligned}
\tag{2.10}
\]
We now show that the vectors $\mathbf{u}^m$ converge towards the exact solution as $k \to 0$ and $h \to 0$.
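A minimal Python sketch of the time stepping (2.10) is given below. It applies the matrix $\mathbf{G}$ without storing it and solves the tridiagonal system in each step with the Thomas algorithm; this solver choice, the function names and the argument order are assumptions of this illustration, not prescriptions of the text.

```python
import math
from typing import Callable


def solve_tridiagonal(lower: float, diag: float, upper: float,
                      rhs: list[float]) -> list[float]:
    """Thomas algorithm for a tridiagonal system with constant
    sub-diagonal `lower`, diagonal `diag` and super-diagonal `upper`."""
    n = len(rhs)
    c = [0.0] * n
    d = [0.0] * n
    c[0] = upper / diag
    d[0] = rhs[0] / diag
    for i in range(1, n):                      # forward elimination
        denom = diag - lower * c[i - 1]
        c[i] = upper / denom
        d[i] = (rhs[i] - lower * d[i - 1]) / denom
    x = [0.0] * n
    x[-1] = d[-1]
    for i in range(n - 2, -1, -1):             # back substitution
        x[i] = d[i] - c[i] * x[i + 1]
    return x


def theta_scheme_heat(u0: Callable[[float], float],
                      f: Callable[[float, float], float],
                      a: float, b: float, T: float,
                      N: int, M: int, theta: float):
    """Finite difference theta-scheme (2.10) for the heat equation with
    homogeneous Dirichlet data. Returns interior grid points and u^M."""
    h = (b - a) / (N + 1)
    k = T / M
    x = [a + i * h for i in range(1, N + 1)]
    u = [u0(xi) for xi in x]
    r = k / (h * h)
    for m in range(M):
        t_old, t_new = m * k, (m + 1) * k
        rhs = []
        for i in range(N):
            left = u[i - 1] if i > 0 else 0.0        # boundary value 0
            right = u[i + 1] if i < N - 1 else 0.0   # boundary value 0
            Gu = (2.0 * u[i] - left - right) / (h * h)   # (G u^m)_i
            F = theta * f(t_new, x[i]) + (1.0 - theta) * f(t_old, x[i])
            rhs.append(u[i] - (1.0 - theta) * k * Gu + k * F)
        # system matrix I + theta*k*G is tridiagonal with constant entries
        u = solve_tridiagonal(-theta * r, 1.0 + 2.0 * theta * r,
                              -theta * r, rhs)
    return x, u
```

For $\theta = 0$ the system matrix reduces to the identity, so the solve step simply returns the right-hand side, in line with the explicit character of the scheme.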

2.3.2 Convergence of the Finite Difference Method

Naturally, by the transition from the PDE (2.4) to the finite difference equations (2.8), which is called the discretization of the PDE, an error is introduced, the so-called discretization error, which we analyze next. We begin with the definition of a related consistency error.

Definition 2.3.1 The consistency error $E_i^m$ at $(t_m, x_i)$ is the residual obtained when the exact solution is inserted into the difference equations (2.8), i.e. the difference operator (2.7) with $u_i^m$ replaced by $u(t_m, x_i)$, minus the right-hand side of (2.8).

Using Taylor expansions of the exact solution at the grid point $(t_m, x_i)$, we can readily estimate the consistency errors $E_i^m$ in terms of powers of the mesh width h and the time step size k.

Proposition 2.3.2 If the exact solution u(t, x) of (2.4) is sufficiently smooth, then, as $h \to 0$, $k \to 0$, the following estimates hold for $m = 1, \ldots, M - 1$ and $i = 1, \ldots, N$:
\[
|E_i^m| \le C(u)(h^2 + k), \qquad 0 \le \theta \le 1, \tag{2.11}
\]
\[
|E_i^m| \le C(u)(h^2 + k^2), \qquad \theta = \tfrac{1}{2}, \tag{2.12}
\]
where the constant C(u) > 0 depends on the exact solution u and its derivatives.

For the convergence of the FDM, we are interested in estimating the error between the finite difference solution $u_i^m$ and the exact solution u(t, x) at the grid point $(t_m, x_i)$. We collect the discretization errors in the grid points at time $t_m$ in the error vector $\boldsymbol{\varepsilon}^m$, i.e.
\[
\varepsilon_i^m := u(t_m, x_i) - u_i^m, \qquad 0 \le i \le N + 1, \quad 0 \le m \le M. \tag{2.13}
\]
The error vectors $\{\boldsymbol{\varepsilon}^m\}_{m=0}^M$ satisfy the difference equation
\[
\big(k^{-1} \mathbf{I} + \theta \mathbf{G}\big)\, \boldsymbol{\varepsilon}^{m+1} + \big({-k^{-1}} \mathbf{I} + (1 - \theta) \mathbf{G}\big)\, \boldsymbol{\varepsilon}^m = \mathbf{E}^m \tag{2.14}
\]
or, in explicit form,
\[
\boldsymbol{\varepsilon}^{m+1} = -\mathbf{A}_\theta\, \boldsymbol{\varepsilon}^m + \boldsymbol{\eta}^m, \tag{2.15}
\]
where $\boldsymbol{\eta}^m := (k^{-1} \mathbf{I} + \theta \mathbf{G})^{-1} \mathbf{E}^m$, and where $\mathbf{A}_\theta := (k^{-1} \mathbf{I} + \theta \mathbf{G})^{-1} (-k^{-1} \mathbf{I} + (1 - \theta) \mathbf{G})$ is called an amplification matrix. The recursion (2.15) shows that the discretization error $\varepsilon_i^m$ is related to the consistency error $E_i^m$. Estimates on $\varepsilon_i^m$ can be obtained by taking norms in the recursion (2.15). Using induction on m, we have

Proposition 2.3.3 For all $M \in \mathbb{N}$, $1 \le m \le M$, one has
\[
\|\boldsymbol{\varepsilon}^m\|_2 \le \|\mathbf{A}_\theta\|_2^m\, \|\boldsymbol{\varepsilon}^0\|_2 + k \sum_{n=0}^{m-1} \|\mathbf{A}_\theta\|_2^{m-1-n}\, \|\mathbf{E}^n\|_2, \tag{2.16}
\]
where $\|\boldsymbol{\varepsilon}^m\|_2^2 = \sum_{i=0}^{N+1} |\varepsilon_i^m|^2$.

We see from (2.16) that the discretization errors $\varepsilon_i^m$ can be controlled in terms of the consistency errors $E_i^m$ provided the norm $\|\mathbf{A}_\theta\|_2$ is bounded by 1. The condition that the norm of the amplification matrix $\mathbf{A}_\theta$ is bounded by 1 is a stability condition for the FDM. We obtain immediately

Theorem 2.3.4 If the stability condition
\[
\|\mathbf{A}_\theta\|_2 \le 1 \tag{2.17}
\]
holds, then, as $M \to \infty$, $N \to \infty$, the FDM (2.10) converges and, if $\boldsymbol{\varepsilon}^0 = 0$,
\[
\sup_m \|\boldsymbol{\varepsilon}^m\|_2 \le T \sup_m \|\mathbf{E}^m\|_2. \tag{2.18}
\]

We want to discuss the validity of the stability condition (2.17) for $\mathbf{G}$ as in (2.9). Therefore, we need the following lemma to obtain the eigenvalues of a tridiagonal matrix. It follows immediately by elementary calculations.


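Lemma 2.3.5 Let $\mathbf{X} \in \mathbb{R}^{(N-1) \times (N-1)}$ be a tridiagonal matrix given by
\[
\mathbf{X} =
\begin{pmatrix}
\alpha & \beta & & \\
\gamma & \alpha & \ddots & \\
 & \ddots & \ddots & \beta \\
 & & \gamma & \alpha
\end{pmatrix}.
\]
Then, $\mathbf{X} v^{(\ell)} = \mu_\ell v^{(\ell)}$, $\ell = 1, \ldots, N - 1$, with the eigensystem
\[
\mu_\ell = \alpha + 2\beta \sqrt{\beta^{-1} \gamma}\, \cos\big(N^{-1} \ell \pi\big), \qquad
v^{(\ell)} = \Big((\beta^{-1} \gamma)^{j/2} \sin\big(N^{-1} j \ell \pi\big)\Big)_{j=1}^{N-1}.
\]

For $\mathbf{G}$ resulting from the finite differences discretization, we obtain

Corollary 2.3.6 The eigensystem of the matrix $\mathbf{G}$ in (2.9) is given by
\[
\mu_\ell = \frac{4}{h^2} \sin^2\Big(\frac{\ell \pi}{2(N + 1)}\Big), \qquad
v^{(\ell)} = \Big(\sin\Big(\frac{j \ell \pi}{N + 1}\Big)\Big)_{j=1}^{N}, \qquad \ell = 1, \ldots, N. \tag{2.19}
\]

Using Corollary 2.3.6, we find that
\[
\lambda_\ell(\mathbf{A}_\theta) = \frac{4(1 - \theta) h^{-2} k \sin^2\big(\ell \pi / (2(N + 1))\big) - 1}{1 + 4\theta h^{-2} k \sin^2\big(\ell \pi / (2(N + 1))\big)}, \qquad \ell = 1, \ldots, N.
\]
Since $\|\mathbf{A}_\theta\|_2 = \max_\ell |\lambda_\ell|$, we have $\|\mathbf{A}_\theta\|_2 \le 1$ if $2(1 - 2\theta) h^{-2} k \le 1$. For $0 \le \theta < \tfrac{1}{2}$, we obtain the so-called CFL-condition (after the seminal paper of Courant, Friedrichs and Lewy [43]),
\[
\frac{k}{h^2} \le \frac{1}{2(1 - 2\theta)}. \tag{2.20}
\]
Therefore, we obtain directly from Theorem 2.3.4:

Lemma 2.3.7 (Stability of the θ-scheme)
(i) If $\tfrac{1}{2} \le \theta \le 1$, the scheme (2.10) is stable for all k and h.
(ii) If $0 \le \theta < \tfrac{1}{2}$, the scheme (2.10) is stable if and only if the CFL-condition (2.20) holds.

We can now combine the results obtained so far using Lemma 2.3.7, Proposition 2.3.3 and Proposition 2.3.2 to obtain a convergence result for the θ-scheme. We measure the error using the quantity $\sup_m h^{1/2} \|\boldsymbol{\varepsilon}^m\|_2$ which is a discrete version of the $L^\infty(J; L^2(G))$-norm.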


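Theorem 2.3.8 If $u \in C^4(J \times G)$, we have
(i) For $\tfrac{1}{2} < \theta \le 1$, or for $0 \le \theta < \tfrac{1}{2}$ and (2.20),
\[
\sup_m h^{1/2} \|\boldsymbol{\varepsilon}^m\|_2 \le C(u)(h^2 + k),
\]
(ii) For $\theta = \tfrac{1}{2}$,
\[
\sup_m h^{1/2} \|\boldsymbol{\varepsilon}^m\|_2 \le C(u)(h^2 + k^2),
\]
where the constant C(u) > 0 depends on the exact solution u and its derivatives.

We next explain the finite element method which is based on variational formulations of the differential equations.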
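The stability condition can be explored numerically by evaluating the eigenvalues $\lambda_\ell(\mathbf{A}_\theta)$ given after Corollary 2.3.6 and comparing $\max_\ell |\lambda_\ell|$ with the CFL-condition (2.20). The following Python sketch does so; the function names are chosen for this illustration.

```python
import math


def amplification_eigenvalues(theta: float, k: float, h: float,
                              N: int) -> list[float]:
    """Eigenvalues of the amplification matrix A_theta, computed from the
    eigenvalues mu_l = 4 h^-2 sin^2(l*pi / (2(N+1))) of G in (2.19)."""
    eigenvalues = []
    for l in range(1, N + 1):
        mu = 4.0 / h ** 2 * math.sin(l * math.pi / (2 * (N + 1))) ** 2
        eigenvalues.append(((1.0 - theta) * k * mu - 1.0)
                           / (1.0 + theta * k * mu))
    return eigenvalues


def spectral_norm(theta: float, k: float, h: float, N: int) -> float:
    """||A_theta||_2 = max_l |lambda_l| (A_theta is symmetric)."""
    return max(abs(lam)
               for lam in amplification_eigenvalues(theta, k, h, N))


def satisfies_cfl(theta: float, k: float, h: float) -> bool:
    """CFL-condition (2.20); no restriction for theta >= 1/2."""
    if theta >= 0.5:
        return True
    return k / h ** 2 <= 1.0 / (2.0 * (1.0 - 2.0 * theta))
```

For the explicit scheme ($\theta = 0$), choosing k slightly above $h^2/2$ is expected to push the norm above 1, whereas for $\theta \ge 1/2$ the norm stays bounded by 1 for any ratio $k/h^2$.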

2.3.3 Finite Element Method

For the discretization with finite elements, we use the method of lines where we first only discretize in space to obtain a system of coupled ordinary differential equations (ODEs). In a second step, a time discretization scheme is applied to solve the ODEs. We do not require the PDE (2.4) to hold pointwise in space but only in the variational sense. Therefore, we fix $t \in J$ and let $v \in C_0^\infty(G)$ be a smooth test function satisfying v(a) = v(b) = 0. We multiply the PDE with v, integrate with respect to the space variable x and use integration by parts to obtain
\[
\begin{aligned}
\int_a^b \partial_t u\, v\, dx - \int_a^b \partial_{xx} u\, v\, dx &= \int_a^b f v\, dx, \\
\frac{d}{dt} \int_a^b u v\, dx - \underbrace{\Big[\partial_x u(t, x)\, v(x)\Big]_{x=a}^{x=b}}_{=0} + \int_a^b \partial_x u\, \partial_x v\, dx &= \int_a^b f v\, dx.
\end{aligned}
\]
Therefore, we have
\[
\frac{d}{dt} \int_a^b u v\, dx + \int_a^b \partial_x u\, \partial_x v\, dx = \int_a^b f v\, dx, \qquad \forall v \in C_0^\infty(G). \tag{2.21}
\]
Since $C_0^\infty(G)$ is not a closed subspace of $L^2(G)$, we will consider test functions in the Sobolev space $H_0^1(G)$ which is the closure of $C_0^\infty(G)$ in the $H^1$-norm,
\[
\|u\|_{H^1(G)}^2 = \|u\|_{L^2(G)}^2 + \|u'\|_{L^2(G)}^2.
\]
The Sobolev space $H_0^1(G)$ contains, in particular, all continuous, piecewise continuously differentiable functions which vanish at the boundary. Since (2.21) holds for all $v \in C_0^\infty(G)$,


(2.21) also holds for all $v \in H_0^1(G)$ because $C_0^\infty(G)$ is dense in $H_0^1(G)$. The weak or variational formulation of (2.4) reads: Find $u \in C(J, H_0^1(G)) \cap C^1(J, L^2(G))$ such that for $t \in J$
\[
\begin{aligned}
\frac{d}{dt} \int_a^b u v\, dx + \int_a^b \partial_x u\, \partial_x v\, dx &= \int_a^b f v\, dx, \qquad \forall v \in H_0^1(G), \\
u(0, \cdot) &= u_0,
\end{aligned}
\tag{2.22}
\]
where we assume that the initial condition $u_0 \in L^2(G)$. The finite element method is based on the Galerkin discretization of (2.22). The idea is to project (2.22) to a finite dimensional subspace $V_N \subset H_0^1(G)$ and to replace (2.22) by: Find $u_N \in C^1(J, V_N)$, such that for $t \in J$
\[
\begin{aligned}
\frac{d}{dt} \int_a^b u_N v_N\, dx + \int_a^b \partial_x u_N\, \partial_x v_N\, dx &= \int_a^b f v_N\, dx, \qquad \forall v_N \in V_N, \\
u_N(0) &= u_{N,0},
\end{aligned}
\tag{2.23}
\]
where $u_{N,0}$ is an approximation of $u_0$ in $V_N$. For example, $u_{N,0} = P_N u_0$, the $L^2$-projection of $u_0$ on $V_N$, satisfying $\int_G P_N u_0\, v_N\, dx = \int_G u_0 v_N\, dx$ for all $v_N \in V_N$. We show that (2.23) is equivalent to a linear system of ordinary differential equations (in time). Let $b_j$, $j = 1, \ldots, N$, be a basis of $V_N$. Since $u_N(t, \cdot) \in V_N$, we have
\[
u_N(t, x) = \sum_{j=1}^N u_{N,j}(t)\, b_j(x),
\]
where $u_{N,j}(t)$ denote the time dependent coefficients of $u_N$ with respect to the basis of $V_N$. Inserting this series representation into (2.23) yields for $v_N(x) = b_i(x)$, $i = 1, \ldots, N$,
\[
\begin{aligned}
&\frac{d}{dt} \int_a^b u_N(t, x) v_N(x)\, dx + \int_a^b \partial_x u_N(t, x)\, \partial_x v_N(x)\, dx = \int_a^b f(t, x) v_N(x)\, dx, \\
\Rightarrow\ &\frac{d}{dt} \int_a^b \sum_{j=1}^N u_{N,j}(t)\, b_j(x) b_i(x)\, dx + \int_a^b \sum_{j=1}^N u_{N,j}(t)\, b_j'(x) b_i'(x)\, dx = \int_a^b f(t, x) b_i(x)\, dx, \\
\Rightarrow\ &\sum_{j=1}^N \dot{u}_{N,j} \int_a^b b_j(x) b_i(x)\, dx + \sum_{j=1}^N u_{N,j} \int_a^b b_j'(x) b_i'(x)\, dx = \int_a^b f(t, x) b_i(x)\, dx.
\end{aligned}
\]


Fig. 2.2 Basis functions bi (x), i = 1, . . . , N

With matrices M, A ∈ RN ×N and vector f (t) ∈ RN given by



b

Mij :=

bj (x)bi (x) dx,

Aij :=

a



a

b

bj (x)bi (x) dx,

b

fi (t) :=

f (t, x)bi (x) dx, a

we obtain the weak semi-discretization (2.23) in matrix form:

Find $\mathbf{u}_N \in C^1(J; \mathbb{R}^N)$ such that for $t \in J$
$$\mathbf{M}\dot{\mathbf{u}}_N(t) + \mathbf{A}\mathbf{u}_N(t) = \mathbf{f}(t), \qquad \mathbf{u}_N(0) = \mathbf{u}_0, \tag{2.24}$$
where $\mathbf{u}_0$ denotes the coefficient vector of $u_{N,0} = \sum_{i=1}^N u_{0,i}\, b_i$.

For the basis functions $b_i$ of $V_N = \mathrm{span}\{b_i(x) : i = 1, \dots, N\}$, we take the so-called hat functions
$$b_i : [a,b] \to \mathbb{R}_{\ge 0}, \qquad b_i(x) = \max\{0,\, 1 - h^{-1}|x - x_i|\}, \qquad i = 1, \dots, N,$$
as illustrated in Fig. 2.2. For equidistant mesh points $x_i$, $i = 1, \dots, N$, with mesh width $h$ as in (2.5),
$$x_i = a + ih, \quad i = 0, 1, \dots, N+1, \qquad h := (b-a)/(N+1) = \Delta x,$$
we obtain the matrices $\mathbf{M}, \mathbf{A} \in \mathbb{R}^{N\times N}_{\mathrm{sym}}$ given by

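$$\mathbf{M} = \frac{h}{6}\begin{pmatrix} 4 & 1 & & \\ 1 & \ddots & \ddots & \\ & \ddots & \ddots & 1 \\ & & 1 & 4 \end{pmatrix}, \qquad
\mathbf{A} = \frac{1}{h}\begin{pmatrix} 2 & -1 & & \\ -1 & \ddots & \ddots & \\ & \ddots & \ddots & -1 \\ & & -1 & 2 \end{pmatrix}. \tag{2.25}$$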
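The matrices in (2.25) are simple enough to build directly. The following is a minimal sketch, assuming NumPy and dense storage; the function name `hat_matrices` is a hypothetical helper introduced here for illustration, not part of the text.

```python
import numpy as np

def hat_matrices(a, b, N):
    """Mass and stiffness matrices (2.25) for N interior hat functions
    on a uniform mesh of (a, b) with mesh width h = (b - a) / (N + 1)."""
    h = (b - a) / (N + 1)
    off = np.ones(N - 1)
    # M = h/6 * tridiag(1, 4, 1)
    M = (h / 6.0) * (4.0 * np.eye(N) + np.diag(off, 1) + np.diag(off, -1))
    # A = 1/h * tridiag(-1, 2, -1)
    A = (1.0 / h) * (2.0 * np.eye(N) - np.diag(off, 1) - np.diag(off, -1))
    return M, A, h

M, A, h = hat_matrices(0.0, 1.0, 4)
print(6.0 * M / h)  # entries of tridiag(1, 4, 1)
print(h * A)        # entries of tridiag(-1, 2, -1)
```

For large N a sparse (banded) format would be the natural choice; dense arrays are used here only to keep the sketch short.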

Table 2.1 Difference between finite differences and finite elements

          FDM                               FEM
  u^m     vector of u_i^m ≈ u(t_m, x_i)     coeff. vector of u_N(t_m, x)
  B       I + kθG                           M + kθA
  C       I − k(1 − θ)G                     M − k(1 − θ)A
          G = h^{-2} tridiag(−1, 2, −1)     A = h^{-1} tridiag(−1, 2, −1)
  f_i^m   f(t_m, x_i)                       ∫_a^b f(t_m, x) b_i(x) dx

It remains to discretize the ODE (2.24). Proceeding exactly as in the FDM, we choose time levels $t_m$, $m = 0, \dots, M$, as in (2.6),
$$t_m = mk, \quad m = 0, 1, \dots, M, \qquad k := T/M = \Delta t,$$
and denote $\mathbf{u}_N^m := \mathbf{u}_N(t_m)$ and $\mathbf{f}^m := \mathbf{f}(t_m)$. Then, the fully discrete scheme reads:

Find $\mathbf{u}_N^{m+1} \in \mathbb{R}^N$ such that for $m = 0, \dots, M-1$,
$$(\mathbf{M} + k\theta\mathbf{A})\mathbf{u}_N^{m+1} = (\mathbf{M} - k(1-\theta)\mathbf{A})\mathbf{u}_N^m + k\bigl(\theta\mathbf{f}^{m+1} + (1-\theta)\mathbf{f}^m\bigr), \qquad \mathbf{u}_N^0 = \mathbf{u}_0. \tag{2.26}$$
Thus, in both the finite difference and the finite element method we have to solve $M$ systems of $N$ linear equations of the form
$$\mathbf{B}\mathbf{u}^{m+1} = \mathbf{C}\mathbf{u}^m + k\mathbf{F}^m, \qquad m = 0, \dots, M-1,$$
where $\mathbf{F}^m = \theta\mathbf{f}^{m+1} + (1-\theta)\mathbf{f}^m$. The difference between FDM and FEM is shown in Table 2.1. From Table 2.1 we see that both discretization schemes for the heat equation lead to similar linear systems of equations to be solved in each time step. For all the partial differential equations which we will encounter in these notes, we will use both finite differences and finite elements for the discretization.

Example 2.3.9 Let $G = (0,1)$, $T = 1$, and $u(t,x) = e^{-t} x \sin(\pi x)$. We measure the discrete $L^\infty(0,T; L^2(G))$-error defined by $\sup_m h^{1/2}\|\varepsilon^m\|_{\ell_2}$, where
$$\|\varepsilon^m\|_{\ell_2}^2 := \sum_{i=1}^N |u(t_m, x_i) - u_i^m|^2.$$
For $\theta = 1$ (backward Euler), we let $h = O(\sqrt{k})$ and obtain first order convergence with respect to the time step $k$ both for FDM and FEM, i.e.
$$\sup_m h^{1/2}\|\varepsilon^m\|_{\ell_2} = O(k). \tag{2.27}$$


Fig. 2.3 $L^\infty(J; L^2(G))$ convergence rates for $\theta = 1$ (top) and $\theta = \tfrac12$ (bottom)

For $\theta = \tfrac12$ (Crank–Nicolson), we let $k = O(h)$ and obtain, in terms of the mesh width $h$, second order convergence for both FDM and FEM, i.e.
$$\sup_m h^{1/2}\|\varepsilon^m\|_{\ell_2} = O(h^2). \tag{2.28}$$
Both convergence rates are shown in Fig. 2.3. The convergence rates (2.27)–(2.28) have been shown for the finite difference method in Theorem 2.3.8. In the next chapter, we show that these also hold for the finite element method.
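The experiment of Example 2.3.9 can be reproduced in a few lines. The sketch below uses the FDM column of Table 2.1 with dense NumPy matrices; the forcing term is computed here from the stated exact solution under the assumption that the heat equation reads $\partial_t u - \partial_{xx} u = f$ with homogeneous Dirichlet data. The function names are hypothetical.

```python
import numpy as np

def u_exact(t, x):
    return np.exp(-t) * x * np.sin(np.pi * x)

def f_rhs(t, x):
    # f = u_t - u_xx for the exact solution above (assumed form of the heat equation)
    return np.exp(-t) * ((np.pi**2 - 1.0) * x * np.sin(np.pi * x)
                         - 2.0 * np.pi * np.cos(np.pi * x))

def fdm_theta_error(N, M_steps, theta, T=1.0):
    """Discrete error sup_m h^(1/2) * ||eps^m||_2 of the FDM theta-scheme."""
    h = 1.0 / (N + 1)
    k = T / M_steps
    x = np.linspace(0.0, 1.0, N + 2)[1:-1]          # interior mesh points
    off = np.ones(N - 1)
    G = (2.0 * np.eye(N) - np.diag(off, 1) - np.diag(off, -1)) / h**2
    B = np.eye(N) + k * theta * G
    C = np.eye(N) - k * (1.0 - theta) * G
    u = u_exact(0.0, x)
    err = 0.0
    for m in range(M_steps):
        F = theta * f_rhs((m + 1) * k, x) + (1.0 - theta) * f_rhs(m * k, x)
        u = np.linalg.solve(B, C @ u + k * F)        # B u^{m+1} = C u^m + k F^m
        eps = u_exact((m + 1) * k, x) - u
        err = max(err, np.sqrt(h) * np.linalg.norm(eps))
    return err

# Backward Euler with h = sqrt(k), Crank-Nicolson with k = h
print(fdm_theta_error(3, 16, 1.0), fdm_theta_error(7, 64, 1.0))
print(fdm_theta_error(7, 8, 0.5), fdm_theta_error(15, 16, 0.5))
```

Replacing `np.eye(N)` and `G` by the matrices M and A of (2.25), and the nodal values of f by the load integrals, gives the FEM column of Table 2.1.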

2.4 Further Reading

A nice introduction to the mathematical theory of partial differential equations is given in Evans [65]. The mathematical theory of the finite difference and finite element methods for elliptic problems is introduced in the text of Braess [24], and for parabolic and hyperbolic equations in Larsson and Thomée [112]. Finite difference methods for time-dependent problems are studied in more detail in Gustafsson et al. [76]. For an elementary introduction to finite element methods, with particular attention to stabilized finite element methods for partial differential equations of the type which arise in finance, we refer to Johnson [99].

Chapter 3

Finite Element Methods for Parabolic Problems

The finite element methods are an alternative to the finite difference discretization of partial differential equations. The advantage of finite elements is that they give convergent deterministic approximations of option prices under realistic, low smoothness assumptions on the payoff function as, e.g. for binary contracts. The basis for finite element discretization of the pricing PDE is a variational formulation of the equation. Therefore, we introduce the Sobolev spaces needed in the variational formulation and give an abstract setting for the parabolic PDEs.

All pricing equations satisfied by the price $u$ of plain vanilla contracts which we encounter in these notes have the general form
$$\partial_t u - \mathcal{A}u = f \quad \text{in } J \times G, \tag{3.1}$$
with the initial condition $u(0,x) = u_0(x)$, $x \in G$, a linear second order partial (integro-)differential operator $\mathcal{A}$, the forcing term $f$, the space domain $G \subseteq \mathbb{R}^d$ and the time interval $J = (0,T)$. The finite element approximation of the operators $\mathcal{A}$ fits into an abstract parabolic framework which we present next.

3.1 Sobolev Spaces

We now introduce some particular Hilbert spaces which are natural to use in the study of partial differential equations. These spaces consist of functions which are square integrable together with their partial derivatives up to a certain order. Let $G = (a,b) \subset \mathbb{R}$ be an open, possibly unbounded domain and let first $u \in C^1(G)$. Integration by parts yields
$$\int_G u'\varphi \,dx = -\int_G u\varphi' \,dx, \qquad \forall \varphi \in C_0^1(G).$$

N. Hilber et al., Computational Methods for Quantitative Finance, Springer Finance, DOI 10.1007/978-3-642-35401-4_3, © Springer-Verlag Berlin Heidelberg 2013


If $u \in L^2(G)$, then $u'$ does not necessarily exist in the classical sense, but we may define $u'$ to be the linear functional
$$u^*(\varphi) = -\int_G u\varphi' \,dx, \qquad \forall \varphi \in C_0^1(G).$$
This functional is said to be a generalized or weak derivative of $u$. When $u^*$ is bounded in $L^2(G)$, it follows from the Riesz representation theorem (see Theorem 2.1.1) that there exists a unique function $w \in L^2(G)$ such that $u^*(\varphi) = (w, \varphi)$ for all $\varphi \in L^2(G)$, in particular
$$-\int_G u\varphi' \,dx = \int_G w\varphi \,dx, \qquad \forall \varphi \in C_0^1(G).$$
We then say that the weak derivative belongs to $L^2(G)$ and write $u' = w$. In particular, if $u \in C^1(G)$, the generalized derivative $u'$ coincides with the classical derivative $u'$. In a similar way, we can define weak derivatives $D^n u$ of higher order $n \in \mathbb{N}$.

Definition 3.1.1 The linear functional $D^n u$, $n \in \mathbb{N}$, is a weak derivative of $u$ if
$$\int_G D^n u\, \varphi \,dx = (-1)^n \int_G u\, D^n\varphi \,dx, \qquad \forall \varphi \in C_0^n(G).$$

We can now define the spaces $H^m(G)$.

Definition 3.1.2 Let $m \in \mathbb{N}$. $H^m(G)$ is the space of all functions whose weak partial derivatives of order $\le m$ belong to $L^2(G)$, i.e.
$$H^m(G) = \{u \in L^2(G) : D^n u \in L^2(G) \text{ for } n \le m\}.$$
We equip $H^m(G)$ with the inner product
$$(u,v)_{H^m(G)} = \sum_{n=0}^m (D^n u, D^n v)_{L^2(G)},$$
and the corresponding norm
$$\|u\|_{H^m(G)}^2 = (u,u)_{H^m(G)} = \sum_{n=0}^m \|D^n u\|_{L^2(G)}^2.$$
We sometimes omit the $(G)$ if the domain is clear from the context. $H^m(G)$ is complete and thus a Hilbert space. The space $H^m(G)$ is an example of a more general class of function spaces, called Sobolev spaces.

Definition 3.1.3 Let $p \in \mathbb{N} \cup \{\infty\}$. $W^{m,p}(G)$ is the space of all functions whose weak partial derivatives of order $\le m$ belong to $L^p(G)$, i.e.
$$W^{m,p}(G) = \{u \in L^p(G) : D^n u \in L^p(G) \text{ for } n \le m\}.$$

We equip $W^{m,p}(G)$ with the norm
$$\|u\|_{W^{m,p}(G)}^p = \sum_{n=0}^m \|D^n u\|_{L^p(G)}^p.$$
The normed space $W^{m,p}(G)$ is complete and hence a Banach space for $1 \le p \le \infty$. Functions $u \in W^{1,p}(G)$ are “essentially” continuous.

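Theorem 3.1.4 Let $G$ be bounded and $u \in W^{1,p}(G)$. Then, there exists a continuous function $\tilde u \in C^0(\overline{G})$ such that $u = \tilde u$ a.e. on $G$ and for all $x_1, x_2 \in \overline{G}$ there holds
$$\tilde u(x_2) - \tilde u(x_1) = \int_{x_1}^{x_2} u'(\xi)\,d\xi. \tag{3.2}$$

Proof Fix $y_0 \in G$ and set for any $g \in L^p(G)$,
$$v(x) := \int_{y_0}^x g(t)\,dt, \qquad x \in G.$$
Then, $v \in C^0(\overline{G})$ and
$$\int_G v\varphi' \,dx = \int_G \Bigl(\int_{y_0}^x g(t)\,dt\Bigr)\varphi'(x)\,dx
= -\int_a^{y_0}\!\!\int_x^{y_0} g(t)\varphi'(x)\,dt\,dx + \int_{y_0}^b\!\!\int_{y_0}^x g(t)\varphi'(x)\,dt\,dx.$$
Fubini's theorem implies, $\forall \varphi \in C_0^1(G)$,
$$\int_G v\varphi' \,dx = -\int_a^{y_0} g(t)\int_a^t \varphi'(x)\,dx\,dt + \int_{y_0}^b g(t)\int_t^b \varphi'(x)\,dx\,dt = -\int_G g(t)\varphi(t)\,dt. \tag{3.3}$$
We set $\bar u(x) := \int_{y_0}^x u'(\xi)\,d\xi$. With (3.3) we obtain
$$\int_G \bar u\varphi' \,dx = -\int_G u'\varphi \,dx, \qquad \forall \varphi \in C_0^1(G),$$
and hence with the definition of the weak derivative,
$$\int_G (u - \bar u)\varphi' \,dx = 0, \qquad \forall \varphi \in C_0^1(G).$$
Therefore, it follows that for a.e. $x \in G$, we have $u(x) - \bar u(x) = C$. Putting $\tilde u := \bar u + C$, we obtain the result. □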


We will also need spaces with boundary conditions where we impose $u = 0$ on $\partial G$.

Definition 3.1.5 Let $1 \le p < \infty$. Then, $W_0^{1,p}(G)$ is the closure of $C_0^1(G)$ in the $W^{1,p}$-norm,
$$W_0^{1,p}(G) = \overline{C_0^1(G)}^{\,\|\cdot\|_{W^{1,p}(G)}}.$$

The space $W_0^{1,p}(G) \subset W^{1,p}(G)$ is a closed linear subspace. In particular, $H_0^1(G) := W_0^{1,2}(G)$ is again a Hilbert space with the norm $\|\cdot\|_{H^1(G)}$. We have the important Poincaré inequality.

Theorem 3.1.6 (Poincaré inequality) Assume that $G \subset \mathbb{R}$ is bounded. Then,
(i) There exists a constant $C(|G|, p) > 0$ such that
$$\|u\|_{L^p(G)} \le C\|u'\|_{L^p(G)}, \qquad \forall u \in W_0^{1,p}(G). \tag{3.4}$$
(ii) Define
$$W_*^{1,p}(G) := \Bigl\{u \in W^{1,p}(G) : \int_G u \,dx = 0\Bigr\}. \tag{3.5}$$
Then, (3.4) holds also for all $u \in W_*^{1,p}(G)$, with different $C$.

Proof
(i) Let $u \in W_0^{1,p}(G)$, $G = (x_1, x_2)$ be arbitrary, but fixed. By Theorem 3.1.4, there exists $\tilde u \in C^0(\overline{G})$ such that $u = \tilde u$ for a.e. $x \in G$ and such that $\tilde u(x_1) = 0$. Therefore, using Hölder's inequality,
$$|\tilde u(x)| = |\tilde u(x) - \tilde u(x_1)| = \Bigl|\int_{x_1}^x u'(\xi)\,d\xi\Bigr| \le |x - x_1|^{1/q}\,\|u'\|_{L^p(G)},$$
where $\frac1q + \frac1p = 1$. Hence, the result follows with $C = \bigl(\int_G |x - x_1|^{p/q}\,dx\bigr)^{1/p}$.
(ii) Let $u \in W_*^{1,p}(G)$. Then, there exists $\tilde u \in C^0(\overline{G})$ such that $u = \tilde u$ for a.e. $x \in G$ and such that $\int_G \tilde u \,dx = 0$. Therefore, there is $x^* \in G$ such that $\tilde u(x^*) = 0$. We may therefore repeat the proof of (i) with $\tilde u(x_1)$ replaced by $\tilde u(x^*)$. Taking the supremum over all possible values of $x^*$ gives the result. □

In Chap. 2, we have already introduced the Bochner spaces $L^p(J; \mathcal{H})$ which consist of functions $u : J \to \mathcal{H}$ such that the $L^p(J; \mathcal{H})$-norm is finite. For the theory of parabolic PDEs, it will prove essential to consider maps $u : J \to \mathcal{H}$ which are also differentiable (in time). We call $u'$ the weak derivative of $u$ if
$$\int_J u'(t)\varphi(t)\,dt = -\int_J u(t)\varphi'(t)\,dt, \qquad \forall \varphi \in C_0^1(J).$$

J


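Definition 3.1.7 Let $\mathcal{H}$ be a real Hilbert space with the norm $\|\cdot\|_{\mathcal{H}}$. For $J = (0,T)$ with $T > 0$, and $1 \le p \le \infty$, the space $W^{1,p}(J; \mathcal{H})$ is defined by
$$W^{1,p}(J; \mathcal{H}) := \{u \in L^p(J; \mathcal{H}) : u' \in L^p(J; \mathcal{H})\},$$
with the norm
$$\|u\|_{W^{1,p}(J;\mathcal{H})} := \begin{cases} \bigl(\int_J \|u(t)\|_{\mathcal{H}}^p + \|u'(t)\|_{\mathcal{H}}^p \,dt\bigr)^{1/p} & \text{if } 1 \le p < \infty, \\ \operatorname{ess\,sup}_J \bigl(\|u(t)\|_{\mathcal{H}} + \|u'(t)\|_{\mathcal{H}}\bigr) & \text{if } p = \infty. \end{cases}$$
We again denote by $H^1(J; \mathcal{H}) := W^{1,2}(J; \mathcal{H})$.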

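3.2 Variational Parabolic Framework

Let $\mathcal{V} \subset \mathcal{H}$ be Hilbert spaces with continuous, dense embedding. We identify $\mathcal{H}$ with its dual $\mathcal{H}^*$ and obtain the triplet
$$\mathcal{V} \subset \mathcal{H} \equiv \mathcal{H}^* \subset \mathcal{V}^*. \tag{3.6}$$
Denote by $(\cdot,\cdot)_{\mathcal{H}}$ the inner product on $\mathcal{H}$, and let $\|\cdot\|_{\mathcal{V}}$, $\|\cdot\|_{\mathcal{H}}$ be the norms on $\mathcal{V}$ and $\mathcal{H}$, respectively. Furthermore, let $J = (0,T)$ with $T > 0$, $f \in L^2(J; \mathcal{V}^*)$ and $u_0 \in \mathcal{H}$. Consider the variational setting of (3.1):

Find $u \in L^2(J; \mathcal{V}) \cap H^1(J; \mathcal{V}^*)$ such that
$$\Bigl\langle \frac{d}{dt}u, v\Bigr\rangle_{\mathcal{V}^*,\mathcal{V}} + a(u,v) = \langle f, v\rangle_{\mathcal{V}^*,\mathcal{V}}, \quad \forall v \in \mathcal{V}, \text{ a.e. in } J, \qquad u(0) = u_0, \tag{3.7}$$
where $\langle\cdot,\cdot\rangle_{\mathcal{V}^*,\mathcal{V}}$ denotes the extension of the $\mathcal{H}$-inner product as duality pairing in $\mathcal{V}^* \times \mathcal{V}$. In particular, by the Riesz representation theorem, we have $\langle u, v\rangle_{\mathcal{V}^*,\mathcal{V}} = (u,v)_{\mathcal{H}}$ for all $u \in \mathcal{H}$, $v \in \mathcal{V}$. The bilinear form $a(\cdot,\cdot) : \mathcal{V} \times \mathcal{V} \to \mathbb{R}$ is associated with the operator $\mathcal{A} \in L(\mathcal{V}, \mathcal{V}^*)$ in (3.1) via
$$a(u,v) := -\langle \mathcal{A}u, v\rangle_{\mathcal{V}^*,\mathcal{V}}, \qquad \forall u, v \in \mathcal{V},$$
where we denote by $L(\mathcal{V}, \mathcal{W})$ the vector space of linear and continuous operators $\mathcal{A} : \mathcal{V} \to \mathcal{W}$.

Remark 3.2.1
(i) $\frac{d}{dt}$ in (3.7) is understood as a weak derivative.
(ii) The choice of the space $\mathcal{V}$ depends on the operator $\mathcal{A}$. Often we can choose $\mathcal{H} = L^2$. The choice of $\mathcal{V}$ is usually the closure of a dense subspace of smooth functions, such as $C_0^\infty$, with respect to the ‘energy’ norm induced by $\mathcal{A}$.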


(iii) We only require $u(0) = u_0$ in $\mathcal{H}$. In particular, for well-posedness of the equation, it is only required that $u_0 \in L^2$, not that $u_0 \in \mathcal{V}$. This is important, e.g. for binary contracts, where the payoff $u_0$ is discontinuous (and, therefore, does not belong to $\mathcal{V}$ in general).
(iv) The bilinear form $a(\cdot,\cdot)$ is, in general, not symmetric due to the presence of a drift term in the operator $\mathcal{A}$.

We have the following general result for the existence of weak solutions of the abstract parabolic problem (3.7). A proof is given in Appendix B, Theorem B.2.2. See also [64, 65, 115].

Theorem 3.2.2 Assume that the bilinear form $a(\cdot,\cdot)$ in (3.7) is continuous, i.e. there is $C_1 > 0$ such that
$$|a(u,v)| \le C_1\|u\|_{\mathcal{V}}\|v\|_{\mathcal{V}}, \qquad \forall u, v \in \mathcal{V}, \tag{3.8}$$
and satisfies the “Gårding inequality”, i.e. there are $C_2 > 0$, $C_3 \ge 0$ such that
$$a(u,u) \ge C_2\|u\|_{\mathcal{V}}^2 - C_3\|u\|_{\mathcal{H}}^2, \qquad \forall u \in \mathcal{V}. \tag{3.9}$$
Then, problem (3.7) admits a unique solution $u \in L^2(J; \mathcal{V}) \cap H^1(J; \mathcal{V}^*)$. Moreover, $u \in C^0(\overline{J}; \mathcal{H})$, and there holds the a priori estimate
$$\|u\|_{C^0(\overline{J};\mathcal{H})} + \|u\|_{L^2(J;\mathcal{V})} + \|u\|_{H^1(J;\mathcal{V}^*)} \le C\bigl(\|u_0\|_{\mathcal{H}} + \|f\|_{L^2(J;\mathcal{V}^*)}\bigr). \tag{3.10}$$

We give a simple example for the weak formulation (3.7).

Example 3.2.3 Consider the heat equation as in (2.4). Then, the spaces $\mathcal{H}, \mathcal{V}, \mathcal{V}^*$ in (3.6) are $\mathcal{H} = L^2(G)$, $\mathcal{V} = H_0^1(G)$, $\mathcal{V}^* = H^{-1}(G)$, and the bilinear form $a(\cdot,\cdot)$ is given by
$$a(u,v) = \int_G u'(x)v'(x)\,dx, \qquad u, v \in H_0^1(G).$$

The bilinear form is continuous on $\mathcal{V}$, since by the Hölder inequality, for all $u, v \in H_0^1(G)$,
$$|a(u,v)| \le \int_G |u'v'|\,dx \le \|u'\|_{L^2(G)}\|v'\|_{L^2(G)} \le \|u\|_{H^1(G)}\|v\|_{H^1(G)}.$$
Furthermore, we have for all $u \in H_0^1(G)$, by the Poincaré inequality (3.4),
$$\begin{aligned}
a(u,u) = \int_G |u'(x)|^2\,dx &= \tfrac12\|u'\|_{L^2(G)}^2 + \tfrac12\|u'\|_{L^2(G)}^2 \\
&\ge \tfrac{1}{2C}\|u\|_{L^2(G)}^2 + \tfrac12\|u'\|_{L^2(G)}^2 \\
&\ge \tfrac12\min\{C^{-1}, 1\}\bigl(\|u\|_{L^2(G)}^2 + \|u'\|_{L^2(G)}^2\bigr) = C_2\|u\|_{H^1(G)}^2,
\end{aligned}$$
i.e. (3.9) holds with $C_3 = 0$. Hence, according to Theorem 3.2.2, the variational formulation of the heat equation admits, for $u_0 \in L^2(G)$, $f \in L^2(J; H^{-1}(G))$, a unique weak solution $u \in L^2(J; H_0^1(G)) \cap H^1(J; H^{-1}(G))$.

We can always achieve $C_3 = 0$ in (3.9). If we substitute in (3.1) $v = e^{-\lambda t}u$ with suitably chosen $\lambda$ and multiply (3.1) by $e^{-\lambda t}$, we find that $v$ satisfies the problem
$$\partial_t v - \mathcal{A}v + \lambda v = e^{-\lambda t}f \quad \text{in } J \times G,$$
with $v(0,x) = u_0(x)$ in $G$. Choosing $\lambda > 0$ large enough, the bilinear form $a(u,v) + \lambda(u,v)$ satisfies (3.9) with $C_3 = 0$.
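The exponential shift can be checked in two lines. The short derivation below is a sketch that assumes the sign convention of (3.1), i.e. $\partial_t u - \mathcal{A}u = f$ with $a(u,v) = -\langle \mathcal{A}u, v\rangle_{\mathcal{V}^*,\mathcal{V}}$, a time-independent operator $\mathcal{A}$, and the Gårding inequality (3.9); the name $a_\lambda$ for the shifted form is introduced here for illustration.

```latex
\begin{align*}
\partial_t v
  &= -\lambda e^{-\lambda t} u + e^{-\lambda t}\,\partial_t u
   = -\lambda v + e^{-\lambda t}\bigl(\mathcal{A}u + f\bigr)
   = -\lambda v + \mathcal{A}v + e^{-\lambda t} f, \\
a_\lambda(v,v)
  &:= a(v,v) + \lambda\,(v,v)_{\mathcal{H}}
   \ge C_2 \|v\|_{\mathcal{V}}^2 + (\lambda - C_3)\,\|v\|_{\mathcal{H}}^2
   \ge C_2 \|v\|_{\mathcal{V}}^2
   \qquad \text{if } \lambda \ge C_3 .
\end{align*}
```

The first line shows that $v$ solves the shifted equation with operator $\mathcal{A} - \lambda$; the second shows that any $\lambda \ge C_3$ is "large enough" for the shifted form to be coercive.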

3.3 Discretization

For the discretization we use the method of lines where first (3.7) is only discretized in space to obtain a system of coupled ODEs which are solved in a second step. Let $V_N$ be a one-parameter family of subspaces $V_N \subset \mathcal{V}$ with finite dimension $N = \dim V_N < \infty$. For each fixed $t \in J$ we approximate the solution $u(t,x)$ of (3.7) by a function $u_N(t) \in V_N$. Furthermore, let $u_{N,0} \in V_N$ be an approximation of $u_0$. Then, the semidiscrete form of (3.7) is the initial value problem:

Find $u_N \in C^1(J; V_N)$ such that for $t \in J$
$$(\partial_t u_N, v_N)_{\mathcal{H}} + a(u_N, v_N) = \langle f, v_N\rangle_{\mathcal{V}^*,\mathcal{V}}, \quad \forall v_N \in V_N, \qquad u_N(0) = u_{N,0}, \tag{3.11}$$
for the approximate solution function $u_N(t) : J \to V_N$. Let $V_N$ be generated by a finite element basis, $V_N = \mathrm{span}\{b_i(x) : 1 \le i \le N\}$. We write $u_N \in V_N$ in terms of the basis functions, $u_N(t,x) = \sum_{j=1}^N u_{N,j}(t)\,b_j(x)$, and obtain the matrix form of the semidiscretization (3.11):

Find $\mathbf{u}_N \in C^1(J; \mathbb{R}^N)$ such that for $t \in J$
$$\mathbf{M}\dot{\mathbf{u}}_N(t) + \mathbf{A}\mathbf{u}_N(t) = \mathbf{f}(t), \qquad \mathbf{u}_N(0) = \mathbf{u}_0, \tag{3.12}$$
where $\mathbf{u}_0$ denotes the coefficient vector of $u_{N,0}$. The mass and stiffness matrices and the load vector with respect to the basis of $V_N$ are given by
$$M_{ij} = (b_j, b_i)_{\mathcal{H}}, \qquad A_{ij} = a(b_j, b_i), \qquad f_i(t) = \langle f, b_i\rangle_{\mathcal{V}^*,\mathcal{V}}, \tag{3.13}$$


where $i, j = 1, \dots, N$. Let $k_m$, $m = 1, \dots, M$, be a sequence of (not necessarily equal sized) time steps and set $t_0 := 0$, $t_m := \sum_{i=1}^m k_i$ such that $t_M = T$. Applying the $\theta$-scheme, we obtain the fully discrete form:

Find $u_N^m \in V_N$ such that for $m = 1, \dots, M$,
$$k_m^{-1}(u_N^m - u_N^{m-1}, v_N)_{\mathcal{H}} + a(u_N^{m-1+\theta}, v_N) = \langle f^{m-1+\theta}, v_N\rangle_{\mathcal{V}^*,\mathcal{V}}, \quad \forall v_N \in V_N, \qquad u_N^0 = u_{N,0}, \tag{3.14}$$
where $u_N^{m+\theta} = \theta u_N(t_{m+1}) + (1-\theta)u_N(t_m)$ and $f^{m+\theta} = \theta f(t_{m+1}) + (1-\theta)f(t_m)$. We can again write (3.14) in matrix notation:

Find $\mathbf{u}_N^m \in \mathbb{R}^N$ such that for $m = 1, \dots, M$,
$$(\mathbf{M} + k_m\theta\mathbf{A})\mathbf{u}_N^m = (\mathbf{M} - k_m(1-\theta)\mathbf{A})\mathbf{u}_N^{m-1} + k_m\bigl(\theta\mathbf{f}^m + (1-\theta)\mathbf{f}^{m-1}\bigr), \qquad \mathbf{u}_N^0 = \mathbf{u}_0. \tag{3.15}$$
In the next section, we discuss the implementation of the matrix form (3.15).
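The time stepping in (3.15) amounts to one linear solve per step. The following minimal sketch assumes dense NumPy matrices and a user-supplied callable `load(t)` returning the load vector; the function name `theta_scheme` and its signature are hypothetical, chosen here for illustration.

```python
import numpy as np

def theta_scheme(M, A, u0, load, steps, theta):
    """Advance M u' + A u = f(t) through the (possibly non-uniform)
    time steps k_1, ..., k_M using the theta-scheme (3.15)."""
    u = np.asarray(u0, dtype=float).copy()
    t = 0.0
    for k in steps:
        f_old, f_new = load(t), load(t + k)
        lhs = M + k * theta * A
        rhs = (M - k * (1.0 - theta) * A) @ u \
              + k * (theta * f_new + (1.0 - theta) * f_old)
        u = np.linalg.solve(lhs, rhs)
        t += k
    return u

# Scalar test problem u' + u = 0, u(0) = 1, with two unequal backward Euler steps
M1, A1 = np.array([[1.0]]), np.array([[1.0]])
u_end = theta_scheme(M1, A1, [1.0], lambda t: np.zeros(1), [0.1, 0.2], theta=1.0)
print(u_end)  # equals 1 / (1.1 * 1.2)
```

With constant time steps, the matrix `lhs` does not change and could be factorized once outside the loop.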

3.4 Implementation of the Matrix Form

Let $G = (a,b)$. We describe a scheme to calculate the stiffness matrix $\mathbf{A}$ in case the corresponding bilinear form $a(\cdot,\cdot)$ has the form
$$a(\varphi, \phi) = \int_G \bigl(\alpha(x)\varphi'(x)\phi'(x) + \beta(x)\varphi'(x)\phi(x) + \gamma(x)\varphi(x)\phi(x)\bigr)\,dx, \tag{3.16}$$
and the finite element subspace $V_N$ consists of continuous, piecewise linear functions. Let $\mathcal{T} = \{a = x_0 < x_1 < x_2 < \dots < x_{N+1} = b\}$ be an arbitrary mesh on $G$. Setting $K_l := (x_{l-1}, x_l)$, $h_l := |K_l| = x_l - x_{l-1}$, $l = 1, \dots, N+1$, we can also write $\mathcal{T} = \{K_l\}_{l=1}^{N+1}$. Define
$$S_{\mathcal{T}}^1 := \bigl\{u(x) \in C^0(G) : u|_{K_l} \text{ is linear on } K_l \in \mathcal{T}\bigr\}. \tag{3.17}$$
A basis for $S_{\mathcal{T}}^1 = \mathrm{span}\{b_i(x) : i = 0, \dots, N+1\}$ is given by the so-called hat functions where, for $1 \le i \le N$,
$$b_i(x) := \begin{cases} (x - x_{i-1})/h_i & \text{if } x \in (x_{i-1}, x_i], \\ (x_{i+1} - x)/h_{i+1} & \text{if } x \in (x_i, x_{i+1}), \\ 0 & \text{else}, \end{cases} \tag{3.18}$$
and
$$b_0(x) := \begin{cases} (x_1 - x)/h_1 & \text{if } x \in [x_0, x_1), \\ 0 & \text{else}, \end{cases} \tag{3.19}$$
$$b_{N+1}(x) := \begin{cases} (x - x_N)/h_{N+1} & \text{if } x \in (x_N, x_{N+1}], \\ 0 & \text{else}. \end{cases} \tag{3.20}$$
If the mesh $\mathcal{T}$ is equidistant, i.e. $h_i = h = (b-a)/(N+1)$, we can write $b_i(x) = \max\{0, 1 - h^{-1}|x - x_i|\}$, $i = 0, \dots, N+1$. Note that for a given subspace $S_{\mathcal{T}}^1$, there are many different possible choices of basis functions. The functions (3.18)–(3.20) are the basis functions with smallest support. The FE subspace to approximate functions with homogeneous Dirichlet boundary conditions is
$$S_{\mathcal{T},0}^1 := S_{\mathcal{T}}^1 \cap H_0^1(G) = \mathrm{span}\{b_i(x) : i = 1, \dots, N\}, \tag{3.21}$$
with $\dim S_{\mathcal{T},0}^1 = N$.
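The hat functions (3.18)–(3.20) on an arbitrary mesh can be evaluated with a few comparisons. The sketch below assumes NumPy and a sorted node array `nodes` holding $x_0, \dots, x_{N+1}$; the helper name `hat` is hypothetical.

```python
import numpy as np

def hat(i, x, nodes):
    """Evaluate the hat function b_i at the points x on the mesh
    nodes = [x_0, ..., x_{N+1}] (boundary functions i = 0 and i = N+1 included)."""
    x = np.atleast_1d(np.asarray(x, dtype=float))
    values = np.zeros_like(x)
    if i > 0:                                   # rising part on (x_{i-1}, x_i]
        left = (x > nodes[i - 1]) & (x <= nodes[i])
        values[left] = (x[left] - nodes[i - 1]) / (nodes[i] - nodes[i - 1])
    if i < len(nodes) - 1:                      # falling part on [x_i, x_{i+1})
        right = (x >= nodes[i]) & (x < nodes[i + 1])
        values[right] = (nodes[i + 1] - x[right]) / (nodes[i + 1] - nodes[i])
    return values

nodes = np.array([0.0, 0.1, 0.4, 0.5, 1.0])     # a non-uniform example mesh
x = np.linspace(0.0, 1.0, 101)
total = sum(hat(i, x, nodes) for i in range(len(nodes)))
print(np.allclose(total, 1.0))                  # the full basis sums to one on [a, b]
```

The nodal property $b_i(x_j) = \delta_{ij}$ can be checked in the same way by evaluating `hat(i, nodes, nodes)`.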

3.4.1 Elemental Forms and Assembly

We decompose $a(\cdot,\cdot)$ in (3.16) into elemental bilinear forms $a_l(\cdot,\cdot)$, $l = 1, \dots, N+1$,
$$\begin{aligned}
a(b_j, b_i) &= \int_G \bigl(\alpha(x)b_j'(x)b_i'(x) + \beta(x)b_j'(x)b_i(x) + \gamma(x)b_j(x)b_i(x)\bigr)\,dx \\
&= \sum_{l=1}^{N+1}\int_{K_l} \bigl(\alpha(x)b_j'(x)b_i'(x) + \beta(x)b_j'(x)b_i(x) + \gamma(x)b_j(x)b_i(x)\bigr)\,dx \\
&=: \sum_{l=1}^{N+1} a_l(b_j, b_i).
\end{aligned}$$
The restrictions $b_i|_{K_l}$, $i = l-1, l$, are linear and given by the element shape functions $N_{K_l}^1 := b_{l-1}(x)|_{K_l}$, $N_{K_l}^2 := b_l(x)|_{K_l}$, $l = 1, \dots, N+1$. The element stiffness matrix $\mathbf{A}_l$ associated to $a_l(\cdot,\cdot)$ can be computed for each element independently. We transform each element $K_l \in \mathcal{T}$ to the so-called reference element $\hat K := (-1, 1)$ via an element mapping
$$K_l \ni x = F_{K_l}(\xi) := \tfrac12(x_{l-1} + x_l) + \tfrac12 h_l\,\xi, \qquad \xi \in \hat K,$$
with derivative $F_{K_l}'(\xi) = h_l/2$. Using the reference element shape functions
$$\hat N_1(\xi) = \tfrac12(1 - \xi), \qquad \hat N_2(\xi) = \tfrac12(1 + \xi),$$
which are independent of the element $K_l$, we have for all $K_l \in \mathcal{T}$,
$$N_{K_l}^1(F_{K_l}(\xi)) = \hat N_1(\xi), \qquad N_{K_l}^2(F_{K_l}(\xi)) = \hat N_2(\xi).$$
The reference shape functions are shown in Fig. 3.1.

Fig. 3.1 Reference element shape functions on the reference element $\hat K$

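Furthermore, for $i, j = 1, 2$ there holds
$$\begin{aligned}
(\mathbf{A}_l)_{ij} &= a_l(N_{K_l}^j, N_{K_l}^i) \\
&= \int_{K_l} \bigl(\alpha(x)\,\partial_x N_{K_l}^j\,\partial_x N_{K_l}^i + \beta(x)\,\partial_x N_{K_l}^j\,N_{K_l}^i + \gamma(x)\,N_{K_l}^j N_{K_l}^i\bigr)\,dx \\
&= \int_{\hat K} \Bigl(\hat\alpha_{K_l}(\xi)\,\frac{4}{h_l^2}\,\hat N_j'\hat N_i' + \hat\beta_{K_l}(\xi)\,\frac{2}{h_l}\,\hat N_j'\hat N_i + \hat\gamma_{K_l}(\xi)\,\hat N_j\hat N_i\Bigr)\frac{h_l}{2}\,d\xi,
\end{aligned}$$
where
$$\hat\alpha_{K_l}(\xi) := \alpha\bigl(F_{K_l}(\xi)\bigr), \qquad \hat\beta_{K_l}(\xi) := \beta\bigl(F_{K_l}(\xi)\bigr), \qquad \hat\gamma_{K_l}(\xi) := \gamma\bigl(F_{K_l}(\xi)\bigr).$$
For later purpose, it is useful to split the integral into three parts,
$$(\mathbf{A}_l)_{ij} = (\mathbf{S}_l)_{ij} + (\mathbf{B}_l)_{ij} + (\mathbf{M}_l)_{ij}, \tag{3.22}$$
where
$$(\mathbf{S}_l)_{ij} = \frac{2}{h_l}\int_{\hat K} \hat\alpha_{K_l}\,\hat N_j'\hat N_i'\,d\xi, \qquad (\mathbf{B}_l)_{ij} = \int_{\hat K} \hat\beta_{K_l}\,\hat N_j'\hat N_i\,d\xi, \qquad (\mathbf{M}_l)_{ij} = \frac{h_l}{2}\int_{\hat K} \hat\gamma_{K_l}\,\hat N_j\hat N_i\,d\xi.$$
For general coefficients $\alpha(x)$, $\beta(x)$ and $\gamma(x)$, these integrals cannot be computed exactly. Hence, they have to be approximated by a numerical quadrature
$$\int_{-1}^1 f(\xi)\,d\xi \approx \sum_{j=1}^p w_j^{(p)} f\bigl(\xi_j^{(p)}\bigr),$$
using a $p$-point quadrature rule with quadrature weights $w_j^{(p)}$ and quadrature points $\xi_j^{(p)} \in (-1, 1)$, $j = 1, \dots, p$.

It remains to construct the stiffness matrix $\mathbf{A}$. The idea is to express the global basis functions $b_i(x)$ in terms of the local shape functions. We have for $i = 1, \dots, N$,
$$b_i(x) = N_{K_i}^2 + N_{K_{i+1}}^1, \qquad b_0(x) = N_{K_1}^1, \qquad b_{N+1}(x) = N_{K_{N+1}}^2.$$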
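The three element matrices in (3.22) can be approximated on the reference element with Gauss–Legendre quadrature. The sketch below assumes NumPy's `numpy.polynomial.legendre.leggauss` for the quadrature rule and coefficient functions that accept arrays; the function name `element_matrices` and the default `p = 3` are choices made here for illustration.

```python
import numpy as np

def element_matrices(x_left, x_right, alpha, beta, gamma, p=3):
    """Approximate (S_l, B_l, M_l) of (3.22) on K_l = (x_left, x_right)
    with a p-point Gauss-Legendre rule on the reference element (-1, 1)."""
    h = x_right - x_left
    xi, w = np.polynomial.legendre.leggauss(p)            # quadrature points and weights
    x = 0.5 * (x_left + x_right) + 0.5 * h * xi           # element map F_{K_l}(xi)
    N = np.array([0.5 * (1.0 - xi), 0.5 * (1.0 + xi)])    # reference shape functions
    dN = np.array([-0.5 * np.ones(p), 0.5 * np.ones(p)])  # their derivatives in xi
    S, B, M = np.empty((2, 2)), np.empty((2, 2)), np.empty((2, 2))
    for i in range(2):
        for j in range(2):
            S[i, j] = (2.0 / h) * np.sum(w * alpha(x) * dN[j] * dN[i])
            B[i, j] = np.sum(w * beta(x) * dN[j] * N[i])
            M[i, j] = (h / 2.0) * np.sum(w * gamma(x) * N[j] * N[i])
    return S, B, M

one = lambda x: np.ones_like(x)
S, B, M = element_matrices(0.0, 0.5, one, one, one)
print(S)  # (1/h) * [[1, -1], [-1, 1]] for alpha = 1
print(M)  # (h/6) * [[2, 1], [1, 2]] for gamma = 1
```

For constant coefficients the quadrature is exact, which gives a convenient sanity check before variable coefficients are used.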


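Hence, we can assemble the matrix $\mathbf{A} \in \mathbb{R}^{(N+2)\times(N+2)}$ using the following summation
$$\mathbf{A} = \begin{pmatrix} \mathbf{A}_{K_1} & & & \\ & 0 & & \\ & & \ddots & \\ & & & 0 \end{pmatrix} + \begin{pmatrix} 0 & & & \\ & \mathbf{A}_{K_2} & & \\ & & \ddots & \\ & & & 0 \end{pmatrix} + \dots + \begin{pmatrix} 0 & & & \\ & \ddots & & \\ & & 0 & \\ & & & \mathbf{A}_{K_{N+1}} \end{pmatrix}. \tag{3.23}$$
Here the $2\times 2$ block $\mathbf{A}_{K_l}$ occupies the rows and columns belonging to $b_{l-1}$ and $b_l$, so that consecutive blocks overlap in one diagonal entry.

As for the stiffness matrix, we also decompose the load vector $\mathbf{f}$ into elemental loads. For $f(t) \in L^2(G)$, we have
$$(f(t), b_i) = \int_G f(t,x)b_i(x)\,dx = \sum_{l=1}^{N+1}\int_{K_l} f(t,x)b_i(x)\,dx =: \sum_{l=1}^{N+1} (f_l(t), b_i).$$
Using again the local shape functions, we obtain for $i = 1, 2$,
$$(f_l(t), N_{K_l}^i) = \int_{K_l} f(t,x)N_{K_l}^i\,dx = \int_{\hat K} \hat f_{K_l}(t,\xi)\,\hat N_i\,\frac{h_l}{2}\,d\xi,$$
where
$$\hat f_{K_l}(t,\xi) := f\bigl(t, F_{K_l}(\xi)\bigr).$$
For general functions $f$, these integrals cannot be computed exactly and have to be approximated by a numerical quadrature rule. To assemble the global load vector $\mathbf{f}$, we use
$$\mathbf{f} = \begin{pmatrix} \mathbf{f}_{K_1} \\ 0 \\ \vdots \\ 0 \end{pmatrix} + \begin{pmatrix} 0 \\ \mathbf{f}_{K_2} \\ \vdots \\ 0 \end{pmatrix} + \dots + \begin{pmatrix} 0 \\ \vdots \\ 0 \\ \mathbf{f}_{K_{N+1}} \end{pmatrix}.$$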
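The summations for the global matrix and load vector translate into a single loop over the elements. The sketch below is an illustration under stated assumptions: `element_matrix` and `element_load` are hypothetical callables returning the 2×2 element matrix and the length-2 element load for an element given by its end points, and dense NumPy storage is used for brevity.

```python
import numpy as np

def assemble(nodes, element_matrix, element_load):
    """Assemble the global (N+2) x (N+2) matrix and load vector by adding
    each element contribution into the rows/columns of b_{l-1} and b_l."""
    n = len(nodes)                       # N + 2 nodes x_0, ..., x_{N+1}
    A = np.zeros((n, n))
    f = np.zeros(n)
    for l in range(1, n):                # element K_l = (x_{l-1}, x_l)
        idx = [l - 1, l]                 # global indices of the two shape functions
        A[np.ix_(idx, idx)] += element_matrix(nodes[l - 1], nodes[l])
        f[idx] += element_load(nodes[l - 1], nodes[l])
    return A, f

# Example: alpha = 1, beta = gamma = 0 and f = 1, where both integrals are known exactly
stiff = lambda xl, xr: np.array([[1.0, -1.0], [-1.0, 1.0]]) / (xr - xl)
load = lambda xl, xr: 0.5 * (xr - xl) * np.ones(2)
A, f = assemble(np.linspace(0.0, 1.0, 5), stiff, load)
print(A[1:-1, 1:-1])   # reduced matrix for homogeneous Dirichlet conditions
print(f)
```

Dropping the first and last rows and columns, as in the last two lines, yields the reduced system for $V = H_0^1(G)$.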

Remark 3.4.1 For $\mathcal{V} = H_0^1(G)$ we have $V_N = \mathrm{span}\{b_i(x) : i = 1, \dots, N\}$, since $u(x_0) = u(x_{N+1}) = 0$. Hence, we get $N$ degrees of freedom and the reduced matrices $\widetilde{\mathbf{M}}, \widetilde{\mathbf{A}} \in \mathbb{R}^{N\times N}$ where the first and last rows and columns of $\mathbf{A}$ in (3.23) are omitted. Similar considerations can be made for the vector $\mathbf{f}$. If we have non-homogeneous Dirichlet boundary conditions, $u(t,a) = \phi_1(t)$, $u(t,b) = \phi_2(t)$ for given functions $\phi_1, \phi_2$, we write $u(t,x) = w(t,x) + \phi(t,x)$, where the function $\phi(t,x)$ satisfies the boundary conditions, $\phi(t,a) = \phi_1(t)$, $\phi(t,b) = \phi_2(t)$. Now, $w$ has again homogeneous Dirichlet boundary conditions and is the solution of
$$\Bigl\langle \frac{d}{dt}w, v\Bigr\rangle_{\mathcal{V}^*,\mathcal{V}} + a(w,v) = \langle f, v\rangle_{\mathcal{V}^*,\mathcal{V}} - \langle \partial_t\phi, v\rangle_{\mathcal{V}^*,\mathcal{V}} - a(\phi, v).$$
Choosing $\phi(t,x) = \phi_1(t)b_0(x) + \phi_2(t)b_{N+1}(x)$, we obtain the matrix form
$$\widetilde{\mathbf{M}}\dot{\mathbf{w}}_N(t) + \widetilde{\mathbf{A}}\mathbf{w}_N(t) = \widetilde{\mathbf{l}}(t),$$
where the load vector $\mathbf{l}$ is given by $\mathbf{l}(t) = \mathbf{f}(t) - \mathbf{M}\dot{\boldsymbol{\phi}} - \mathbf{A}\boldsymbol{\phi}$, with $\boldsymbol{\phi} = (\phi_1(t), 0, \dots, 0, \phi_2(t))^\top \in \mathbb{R}^{N+2}$ denoting the coefficient vector of $\phi$. For Neumann boundary conditions, i.e. $\partial_x u(t,a) = \phi_1(t)$, $\partial_x u(t,b) = \phi_2(t)$, the boundary degrees of freedom must be kept.

Fig. 3.2 Function $u(x)$ and its piecewise linear approximation $(I_N u)(x)$

3.4.2 Initial Data

In the discretization (3.14), we need an approximation of $u_0$ in $V_N$. If $u_0 \in H^1(G)$ (e.g. $u_0$ being the payoff of a put or call option), we can use nodal interpolation, i.e. $u_{N,0} = I_N u_0$. For $u \in H^1(G)$, define the nodal interpolant $I_N u \in S_{\mathcal{T}}^1$ by
$$(I_N u)(x) := \sum_{i=0}^{N+1} u(x_i)\,b_i(x), \tag{3.24}$$
as illustrated in Fig. 3.2. Since $b_i(x_j) = \delta_{ij}$, we get $(I_N u)(x_i) = u(x_i)$. For $u \in H^1(G)$, the nodal value $u(x_i)$ is well defined due to Theorem 3.1.4.

Lemma 3.4.2 The interpolation operator $I_N : H^1(G) \to S_{\mathcal{T}}^1$ defined in (3.24) is bounded, i.e. there exists a constant $C$ (which is independent of $N$) such that
$$\|I_N u\|_{H^1(G)} \le C\|u\|_{H^1(G)}, \qquad \forall u \in H^1(G). \tag{3.25}$$

If we only have $u_0 \in L^2(G)$ (e.g. $u_0$ being the payoff of a digital option), we cannot use interpolation, but need the $L^2$-projection of $u_0$ on $V_N$, i.e. $u_{N,0} = P_N u_0$. For $u \in L^2(G)$, the $L^2$-projection of $u$ on $V_N$ is defined as the solution of
$$\int_G P_N u\, v_N \,dx = \int_G u\, v_N \,dx, \qquad \forall v_N \in V_N. \tag{3.26}$$
Using the basis $b_i$, $i = 0, \dots, N+1$, of $V_N$ we can obtain the coefficient vector $\mathbf{u}_N$ of $P_N u_0$ as the solution of the linear system
$$\mathbf{M}\mathbf{u}_N = \mathbf{f}, \qquad \text{with } f_i = \int_G u_0\, b_i \,dx, \quad i = 0, \dots, N+1.$$
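The linear system for the $L^2$-projection can be set up element by element. The following sketch assumes NumPy, the full hat basis $b_0, \dots, b_{N+1}$ on a mesh `nodes`, and Gauss–Legendre quadrature for the right-hand side; the function name `l2_projection` and the digital payoff with strike at 0 are hypothetical choices for illustration. The quadrature is only exact for the digital payoff if the jump sits on a mesh node, as it does in the example.

```python
import numpy as np

def l2_projection(nodes, u, p=4):
    """Coefficients of the L2-projection P_N u in the hat basis b_0, ..., b_{N+1}."""
    n = len(nodes)
    M = np.zeros((n, n))
    f = np.zeros(n)
    xi, w = np.polynomial.legendre.leggauss(p)
    shape = np.array([0.5 * (1.0 - xi), 0.5 * (1.0 + xi)])   # reference shape functions
    for l in range(1, n):
        h = nodes[l] - nodes[l - 1]
        x = 0.5 * (nodes[l - 1] + nodes[l]) + 0.5 * h * xi   # quadrature points in K_l
        idx = [l - 1, l]
        M[np.ix_(idx, idx)] += (h / 6.0) * np.array([[2.0, 1.0], [1.0, 2.0]])
        f[idx] += 0.5 * h * (shape * w * u(x)).sum(axis=1)   # element load by quadrature
    return np.linalg.solve(M, f)

nodes = np.linspace(-1.0, 1.0, 9)
digital = lambda x: (x > 0.0).astype(float)                   # discontinuous payoff
print(l2_projection(nodes, digital))
```

Unlike nodal interpolation, the projected coefficients need not lie between 0 and 1 near the jump; this is expected behaviour of the $L^2$-projection rather than an error.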

We immediately have from (3.26), using Hölder's inequality,

Lemma 3.4.3 The $L^2$-projection $P_N : L^2(G) \to S_{\mathcal{T}}^1$ defined in (3.26) is bounded, i.e.
$$\|P_N u\|_{L^2(G)} \le \|u\|_{L^2(G)}, \qquad \forall u \in L^2(G). \tag{3.27}$$

3.5 Stability of the θ-Scheme

For the stability of (3.14), we prove that the finite element solutions satisfy an analog of the estimate (3.10). For this section, we assume the uniform mesh width $h$ in space and constant time steps $k = T/M$. We define
$$\|v\|_a := \bigl(a(v,v)\bigr)^{1/2}. \tag{3.28}$$
In the analysis, we will use for $f \in V_N^*$ the following notation:
$$\|f\|_* := \sup_{v_N \in V_N} \frac{(f, v_N)}{\|v_N\|_a}. \tag{3.29}$$
We will also need $\lambda_A$ defined by
$$\lambda_A := \sup_{v_N \in V_N} \frac{\|v_N\|_{L^2}^2}{\|v_N\|_*^2}.$$
In the case $\tfrac12 \le \theta \le 1$, the $\theta$-scheme is stable for any time step $k > 0$, whereas in the case $0 \le \theta < \tfrac12$ the time step $k$ must be sufficiently small.

Proposition 3.5.1 In the case $0 \le \theta < \tfrac12$, assume
$$\sigma := k(1 - 2\theta)\lambda_A < 2. \tag{3.30}$$
Then, there are constants $C_1$ and $C_2$ independent of $h$ and of $k$ such that the sequence $\{u_N^m\}_{m=0}^M$ of solutions of the $\theta$-scheme (3.14) satisfies the stability estimate
$$\|u_N^M\|_{L^2}^2 + C_1 k\sum_{m=0}^{M-1}\|u_N^{m+\theta}\|_a^2 \le \|u_N^0\|_{L^2}^2 + C_2 k\sum_{m=0}^{M-1}\|f^{m+\theta}\|_*^2, \tag{3.31}$$
where $C_1$, $C_2$ satisfy in the case of $\tfrac12 \le \theta \le 1$,
$$0 < C_1 < 2, \qquad C_2 \ge \frac{1}{2 - C_1}, \tag{3.32}$$
and in the case of $0 \le \theta < \tfrac12$,
$$0 < C_1 < 2 - \sigma, \qquad C_2 \ge \frac{1 + (4 - C_1)\sigma}{2 - \sigma - C_1}. \tag{3.33}$$

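Proof Define
$$X^m := \|u_N^m\|_{L^2}^2 - \|u_N^{m+1}\|_{L^2}^2 + C_2 k\|f^{m+\theta}\|_*^2 - C_1 k\|u_N^{m+\theta}\|_a^2.$$
The assertion follows if we show that $X^m \ge 0$. Then, adding these inequalities for $m = 0, \dots, M-1$ will obviously give (3.31).

Let $w := u_N^{m+1} - u_N^m$, then $u_N^{m+\theta} = (u_N^m + u_N^{m+1})/2 + (\theta - \tfrac12)w$ and
$$\|u_N^{m+1}\|_{L^2}^2 - \|u_N^m\|_{L^2}^2 = (u_N^{m+1} - u_N^m, u_N^{m+1} + u_N^m) = (w, 2u_N^{m+\theta} - (2\theta - 1)w).$$
By the definition of the $\theta$-scheme, we have
$$\begin{aligned}
(w, u_N^{m+\theta}) &= k\bigl(-\mathcal{A}u_N^{m+\theta} + f^{m+\theta}, u_N^{m+\theta}\bigr) = k\bigl(-\|u_N^{m+\theta}\|_a^2 + (f^{m+\theta}, u_N^{m+\theta})\bigr) \\
&\le k\bigl(-\|u_N^{m+\theta}\|_a^2 + \|f^{m+\theta}\|_*\|u_N^{m+\theta}\|_a\bigr).
\end{aligned}$$
This gives
$$X^m \ge (2\theta - 1)\|w\|_{L^2}^2 + k\bigl((2 - C_1)\|u_N^{m+\theta}\|_a^2 - 2\|f^{m+\theta}\|_*\|u_N^{m+\theta}\|_a + C_2\|f^{m+\theta}\|_*^2\bigr).$$
In the case of $\tfrac12 \le \theta \le 1$, we now obtain $X^m \ge 0$ if condition (3.32) is satisfied. In the case $0 \le \theta < \tfrac12$, we have by the definition of the $\theta$-scheme that $(w, v_N) = k(-\mathcal{A}u_N^{m+\theta} + f^{m+\theta}, v_N)$, yielding
$$\|w\|_{L^2} \le \lambda_A^{1/2}\|w\|_* \le \lambda_A^{1/2}k\bigl(\|\mathcal{A}u_N^{m+\theta}\|_* + \|f^{m+\theta}\|_*\bigr) = \lambda_A^{1/2}k\bigl(\|u_N^{m+\theta}\|_a + \|f^{m+\theta}\|_*\bigr),$$
since $(\mathcal{A}u_N^{m+\theta}, v_N) \le \|u_N^{m+\theta}\|_a\|v_N\|_a$ gives $\|\mathcal{A}u_N^{m+\theta}\|_* \le \|u_N^{m+\theta}\|_a$ and choosing $v_N := u_N^{m+\theta}$ gives $\|\mathcal{A}u_N^{m+\theta}\|_* \ge \|u_N^{m+\theta}\|_a$. Hence,
$$k^{-1}X^m \ge (2 - C_1 - \sigma)\|u_N^{m+\theta}\|_a^2 - 2(1 + \sigma)\|f^{m+\theta}\|_*\|u_N^{m+\theta}\|_a + (C_2 - \sigma)\|f^{m+\theta}\|_*^2.$$
Therefore, we have $X^m \ge 0$ if conditions (3.30), (3.33) hold. □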
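Condition (3.30) can be made concrete for hat functions on a uniform mesh. The sketch below rests on an observation that is not stated in the text and should be read as an assumption of this example: for a symmetric positive definite stiffness matrix, the supremum defining $\lambda_A$ equals the largest eigenvalue of the generalized problem $\mathbf{A}\mathbf{v} = \lambda\mathbf{M}\mathbf{v}$, because $\|v_N\|_{L^2}^2 = \mathbf{v}^\top\mathbf{M}\mathbf{v}$ and $\|v_N\|_*^2 = \mathbf{v}^\top\mathbf{M}\mathbf{A}^{-1}\mathbf{M}\mathbf{v}$. The function names are hypothetical.

```python
import numpy as np

def lambda_A(N, a=0.0, b=1.0):
    """Largest generalized eigenvalue of A v = lambda M v for the
    uniform-mesh hat-function matrices M = h/6 tridiag(1,4,1), A = 1/h tridiag(-1,2,-1)."""
    h = (b - a) / (N + 1)
    off = np.ones(N - 1)
    M = (h / 6.0) * (4.0 * np.eye(N) + np.diag(off, 1) + np.diag(off, -1))
    A = (1.0 / h) * (2.0 * np.eye(N) - np.diag(off, 1) - np.diag(off, -1))
    eigenvalues = np.linalg.eigvals(np.linalg.solve(M, A))
    return eigenvalues.real.max(), h

def step_bound(N, theta):
    """Supremum of time steps k with sigma = k (1 - 2 theta) lambda_A < 2;
    infinite for theta >= 1/2, where no restriction is needed."""
    lam, _ = lambda_A(N)
    return np.inf if theta >= 0.5 else 2.0 / ((1.0 - 2.0 * theta) * lam)

lam, h = lambda_A(31)
print(lam * h**2)            # stays below 12, so lambda_A scales like h^(-2)
print(step_bound(31, 0.0))   # explicit Euler: k must be of order h^2
```

The scaling $\lambda_A \sim h^{-2}$ visible in the output is what turns (3.30) into a restriction of the form $k \le C h^2/(1 - 2\theta)$.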



Remark 3.5.2 The conditions (3.30), (3.33) are time-step restrictions of CFL-type (CFL is an acronym for Courant, Friedrichs and Lewy, who identified an analogous condition as being necessary for the stability of explicit time-stepping schemes for first order, hyperbolic equations). Here, time-step restrictions are formulated in terms of the matrix property $\lambda_A$. If $\mathcal{V} = H_0^1(G)$, and if $V_N = S_{\mathcal{T}}^1$, we obtain $\lambda_A \sim Ch^{-2}$ as $h \downarrow 0$. Hence, we get stability provided the CFL-type stability condition $k \le Ch^2/(1 - 2\theta)$ holds. For

1 2

0≤θ
0, we get from (3.37), (3.38) by scaling to the interval $(0,h)$
$$\|u' - (I_N u)'\|_{L^2(0,h)} \le Ch\|u''\|_{L^2(0,h)}, \tag{3.39}$$
$$\|u - I_N u\|_{L^2(0,h)} \le C^2h^2\|u''\|_{L^2(0,h)}, \tag{3.40}$$

and $(I_N u)(0) = u(0)$, $(I_N u)(h) = u(h)$. Applying this to each interval $K_i$, squaring and summing yields (3.35). □

In Proposition 3.6.1, we estimated the mean square error of $u - I_N u$. We are often also interested in pointwise error estimates. Some (not optimal) bounds on the pointwise error can be deduced from Proposition 3.6.1 with

Proposition 3.6.2 Let $G = (0,1)$. For every $u \in H_0^1(G)$, the following holds:
$$\|u\|_{L^\infty(G)}^2 \le 2\|u\|_{L^2(G)}\|u'\|_{L^2(G)}. \tag{3.41}$$

Proof If $u \in H_0^1(G)$, $u = \tilde u \in C^0(\overline{G})$. Let $\xi \in \overline{G}$ be such that $\|u\|_{L^\infty(G)} = \max_x|\tilde u(x)| = |\tilde u(\xi)|$. Then
$$\|u\|_{L^\infty(G)}^2 = |\tilde u(\xi)|^2 = |(\tilde u(\xi))^2 - (\tilde u(0))^2| \le \Bigl|\int_0^\xi (\tilde u(\eta)^2)'\,d\eta\Bigr| = 2\Bigl|\int_0^\xi \tilde u(\eta)\tilde u'(\eta)\,d\eta\Bigr| \le 2\|\tilde u\|_{L^2(G)}\|\tilde u'\|_{L^2(G)}. \qquad □$$

Corollary 3.6.3 For $G = (0,1)$, $u \in H^2(G)$ and equidistant mesh width $h$, one has
$$\|u - I_N u\|_{L^\infty(G)} \le Ch^{3/2}\|u''\|_{L^2(G)}, \tag{3.42}$$
as $h \to 0$.

For a general interval $G = (a,b)$, (3.41), (3.42) also hold with constants that depend on $b - a$. If $u$ has better regularity than just $u \in H^2(G)$, better convergence rates are possible.

Corollary 3.6.4 For $G = (0,1)$, $u \in W^{2,\infty}(G)$ and equidistant mesh width $h$, one has
$$\|u - I_N u\|_{L^\infty(G)} \le Ch^2\|u\|_{W^{2,\infty}(G)}, \quad \text{as } h \to 0. \tag{3.43}$$
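The rate in (3.43) is easy to observe numerically. The sketch below assumes NumPy, uses `np.interp` as the nodal interpolant on a uniform mesh of $(0,1)$, and approximates the maximum norm by sampling each element; the smooth test function $\sin(\pi x)$ and the helper name `interpolation_error` are choices made here for illustration.

```python
import numpy as np

def interpolation_error(u, n_elements, samples=50):
    """Approximate max-norm error of the piecewise linear nodal interpolant
    of u on a uniform mesh of (0, 1) with n_elements elements."""
    nodes = np.linspace(0.0, 1.0, n_elements + 1)
    x = np.linspace(0.0, 1.0, n_elements * samples + 1)   # fine sampling grid
    return np.max(np.abs(u(x) - np.interp(x, nodes, u(nodes))))

u = lambda x: np.sin(np.pi * x)
errors = [interpolation_error(u, n) for n in (8, 16, 32, 64)]
rates = [np.log2(errors[i] / errors[i + 1]) for i in range(len(errors) - 1)]
print(errors)
print(rates)   # observed orders, close to 2 for this smooth function
```

Repeating the experiment with a function that is only in $H^2(G)$ is a simple way to see the reduced rate of (3.42).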


3.6.2 Convergence of the Finite Element Method

Assume uniform mesh width $h$ in space and constant time steps $k = T/M$ in time. We show now that the computed sequence $\{u_N^m\}$ converges, as $h \to 0$ and $k \to 0$, to the exact solution of (3.7). We have

Theorem 3.6.5 Assume $u \in C^1(\overline{J}; H^2(G)) \cap C^3(\overline{J}; H^{-1}(G))$. Let $u^m(x) = u(t_m, x)$ and $u_N^m$ be as in (3.14), with $V_N = S_{\mathcal{T}}^1$. Assume for $0 \le \theta < \tfrac12$ also (3.30). Then, the following error bound holds:
$$\begin{aligned}
\|u^M - u_N^M\|_{L^2(G)}^2 + k\sum_{m=0}^{M-1}\|u^{m+\theta} - u_N^{m+\theta}\|_a^2
&\le Ch^2\max_{0 \le t \le T}\|u(t)\|_{H^2(G)}^2 + Ch^2\int_0^T \|\partial_t u(s)\|_{H^1(G)}^2\,ds \\
&\quad + C\begin{cases} k^2\int_0^T \|\partial_{tt}u(s)\|_*^2\,ds & \text{if } 0 \le \theta \le 1, \\ k^4\int_0^T \|\partial_{ttt}u(s)\|_*^2\,ds & \text{if } \theta = \tfrac12. \end{cases}
\end{aligned} \tag{3.44}$$

Remark 3.6.6 By the properties (3.8)–(3.9) (with $C_3 = 0$), the norm $\|\cdot\|_a$ in (3.28) is equivalent to the energy norm $\|\cdot\|_{\mathcal{V}}$. Thus, we see from (3.44) that we have $\|u^M - u_N^M\|_{\mathcal{V}} = O(h + k)$, i.e. first order convergence in the energy norm, provided the solution $u(t,x)$ is sufficiently smooth. However, one can also prove second order convergence in the $L^2$-norm, i.e. $\|u^M - u_N^M\|_{L^2(G)} = O(h^2 + k)$ if $\theta \in [0,1] \setminus \{1/2\}$, and $\|u^M - u_N^M\|_{L^2(G)} = O(h^2 + k^2)$ if $\theta = 1/2$. Hence, for continuous, linear finite elements, we obtain the same convergence rates as for the finite difference discretization in Theorem 2.3.8.

The proof of Theorem 3.6.5 will be given in several steps. We define $e_N^m := u^m - u_N^m$ and consider the splitting (3.34), where now $I_N$ denotes the nodal interpolant defined in (3.24). Since we already estimated the consistency error $\eta^m = u^m - I_N u^m$, we focus on $\xi_N^m \in V_N$.

Lemma 3.6.7 If $u \in C^1(\overline{J}; \mathcal{H})$, the errors $\{\xi_N^m\}_m$ are solutions of the $\theta$-scheme: Given $\xi_N^0 := I_N u_0 - u_N^0$, for $m = 0, \dots, M-1$ find $\xi_N^{m+1} \in V_N$ such that $\forall v_N \in V_N$:
$$k^{-1}(\xi_N^{m+1} - \xi_N^m, v_N) + a\bigl(\theta\xi_N^{m+1} + (1-\theta)\xi_N^m, v_N\bigr) = (r^m, v_N), \tag{3.45}$$
where the residuals $r^m = r_1^m + r_2^m + r_3^m$ are given by
$$\begin{aligned}
(r_1^m, v_N) &= \bigl(k^{-1}(u^{m+1} - u^m) - \dot u^{m+\theta}, v_N\bigr), \\
(r_2^m, v_N) &= \bigl(k^{-1}(I_N u^{m+1} - I_N u^m) - k^{-1}(u^{m+1} - u^m), v_N\bigr), \\
(r_3^m, v_N) &= a(I_N u^{m+\theta} - u^{m+\theta}, v_N).
\end{aligned}$$


The stability of the $\theta$-scheme, Proposition 3.5.1, gives

Corollary 3.6.8 Under the assumptions of Proposition 3.5.1,
$$\|\xi_N^M\|_{L^2(G)}^2 + C_1 k\sum_{m=0}^{M-1}\|\xi_N^{m+\theta}\|_a^2 \le \|\xi_N^0\|_{L^2(G)}^2 + C_2 k\sum_{m=0}^{M-1}\|r^m\|_*^2. \tag{3.46}$$

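To prove Theorem 3.6.5, it is therefore sufficient to estimate the residual $\|r^m\|_*$.

Proof of Theorem 3.6.5
(i) Estimate of $r_1$: for any $v_N \in V_N$, we have
$$|(r_1^m, v_N)| \le \|k^{-1}(u^{m+1} - u^m) - \dot u^{m+\theta}\|_*\,\|v_N\|_a.$$
With
$$k^{-1}(u^{m+1} - u^m) - \dot u^{m+\theta} = k^{-1}\int_{t_m}^{t_{m+1}} \bigl(s - (1-\theta)t_{m+1} - \theta t_m\bigr)\,\ddot u\,ds,$$
we get
$$\|k^{-1}(u^{m+1} - u^m) - \dot u^{m+\theta}\|_* \le k^{-1}\int_{t_m}^{t_{m+1}} |s - (1-\theta)t_{m+1} - \theta t_m|\,\|\ddot u\|_*\,ds \le C_\theta k^{1/2}\Bigl(\int_{t_m}^{t_{m+1}} \|\ddot u(s)\|_*^2\,ds\Bigr)^{1/2}.$$
(ii) Estimate of $r_2$: for any $v_N \in V_N$,
$$\begin{aligned}
|(r_2^m, v_N)| &\le C\bigl\|k^{-1}\bigl((u^{m+1} - u^m) - I_N(u^{m+1} - u^m)\bigr)\bigr\|_*\,\|v_N\|_a \\
&= Ck^{-1}\Bigl\|\int_{t_m}^{t_{m+1}} (I - I_N)\dot u(s)\,ds\Bigr\|_*\,\|v_N\|_a \\
&\le Ck^{-1}\int_{t_m}^{t_{m+1}} \|\dot u - I_N\dot u\|_*\,ds\;\|v_N\|_a.
\end{aligned}$$
(iii) Estimate of $r_3$: using the continuity of $a(\cdot,\cdot)$,
$$|(r_3^m, v_N)| \le C\|u^{m+\theta} - I_N u^{m+\theta}\|_a\,\|v_N\|_a.$$
We have proved that for every $m = 0, 1, \dots, M-1$,
$$\|r^m\|_*^2 \le Ck\int_{t_m}^{t_{m+1}} \|\ddot u(s)\|_*^2\,ds + Ck^{-1}\int_{t_m}^{t_{m+1}} \|\dot u - I_N\dot u\|_*^2\,ds + C\|u^{m+\theta} - I_N u^{m+\theta}\|_a^2.$$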

Inserting into (3.46), we get
$$\begin{aligned}
\|e_N^M\|_{L^2(G)}^2 + C_1 k\sum_{m=0}^{M-1}\|e_N^{m+\theta}\|_a^2
&\le 2\Bigl(\|\eta^M\|_{L^2(G)}^2 + C_1 k\sum_{m=0}^{M-1}\|\eta^{m+\theta}\|_a^2 + \|\xi_N^M\|_{L^2(G)}^2 + C_1 k\sum_{m=0}^{M-1}\|\xi_N^{m+\theta}\|_a^2\Bigr) \\
&\le C\Bigl(\|\eta^M\|_{L^2(G)}^2 + C_1 k\sum_{m=0}^{M-1}\|\eta^{m+\theta}\|_a^2 + \|\xi_N^0\|_{L^2(G)}^2 + C_\theta k^2\int_0^T \|\ddot u(s)\|_*^2\,ds \\
&\qquad\quad + \int_0^T \|\dot u - I_N\dot u\|_*^2\,ds + Ck\sum_{m=0}^{M-1}\|u^{m+\theta} - I_N u^{m+\theta}\|_a^2\Bigr) \\
&\le C\Bigl(\|\xi_N^0\|_{L^2(G)}^2 + \|\eta^M\|_{L^2(G)}^2 + k\sum_{m=0}^{M-1}\|\eta^{m+\theta}\|_a^2 + \int_0^T \|\dot\eta\|_*^2\,ds + k^2\int_0^T \|\ddot u(s)\|_*^2\,ds\Bigr).
\end{aligned}$$
(iv) The terms involving $\eta = u - I_N u$ are estimated using the interpolation estimates of Proposition 3.6.1. Furthermore, if $u_0$ is approximated with the $L^2$-projection, we have
$$\|\xi_N^0\|_{L^2(G)} = \|I_N u_0 - u_{N,0}\|_{L^2(G)} \le \|I_N u_0 - u_0\|_{L^2(G)} + \|u_0 - u_{N,0}\|_{L^2(G)}.$$
Since $u \in C^1(\overline{J}; H^2(G))$ by assumption, we have $u_0 \in H^2(G)$. The $L^2$-projection $u_{N,0}$ is a quasi-optimal approximation of $u_0$, i.e. $\|u_0 - P_N u_0\|_{L^2(G)} \le Ch^2\|u_0\|_{H^2(G)}$. We obtain from Proposition 3.6.1 optimal convergence rates with respect to $h$, and Theorem 3.6.5 is proved. □

3.7 Further Reading

The basic finite element method is, for example, described in Braess [24] and for parabolic problems in detail by Thomée [154]. Error estimates in a very general framework are also given in Ern and Guermond [64]. In this section, we only considered the $\theta$-scheme for the time discretization. It is also possible to apply finite elements for the time discretization as in Schötzau and Schwab [146, 147] where an $hp$-discontinuous Galerkin method is used. It yields exponential convergence rates instead of only algebraic ones as in the $\theta$-scheme.

Chapter 4

European Options in BS Markets

In the last chapters, we explained various methods to solve partial differential equations. These methods are now applied to obtain the price of a European option. We assume that the stock price follows a geometric Brownian motion and show that the option price satisfies a parabolic PDE. The unbounded log-price domain is localized to a bounded domain and the error incurred by the truncation is estimated. It is shown that the variational formulation has a unique solution and the discretization schemes for finite element and finite differences are derived. Furthermore, we describe extensions of the Black–Scholes model, like the constant elasticity of variance (CEV) and the local volatility model.

N. Hilber et al., Computational Methods for Quantitative Finance, Springer Finance, DOI 10.1007/978-3-642-35401-4_4, © Springer-Verlag Berlin Heidelberg 2013

4.1 Black–Scholes Equation

Let $X$ be the solution of the SDE (1.2), where we assume that the coefficients $b$, $\sigma$ are independent of time $t$ and satisfy the assumptions of Theorem 1.2.6. Further assume $r(x)$ to be a bounded and continuous function modeling the riskless interest rate. We want to compute the value of the option with payoff $g$, which is the conditional expectation
\[
V(t, x) = \mathbb{E}\Big[ e^{-\int_t^T r(X_s)\, ds}\, g(X_T) \,\Big|\, X_t = x \Big]. \tag{4.1}
\]
We show that $V(t, x)$ is a solution of a deterministic partial differential equation. To this end, we first relate to the process $X$ a differential operator $A$, the so-called infinitesimal generator of the process $X$.

Proposition 4.1.1 Let $A$ denote the differential operator which is, for functions $f \in C^2(\mathbb{R})$ with bounded derivatives, given by
\[
(Af)(x) = \tfrac{1}{2} \sigma^2(x)\, \partial_{xx} f(x) + b(x)\, \partial_x f(x). \tag{4.2}
\]
Then, the process $M_t := f(X_t) - \int_0^t (Af)(X_s)\, ds$ is a martingale with respect to the filtration of $W$.

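Proof We apply the Itô formula (1.7) to $f(X_t)$ and obtain, in integral form,
\[
f(X_t) = f(X_0) + \int_0^t \partial_x f(X_s)\, dX_s + \frac{1}{2} \int_0^t \partial_{xx} f(X_s)\, \sigma^2(X_s)\, ds .
\]
Using $dX_t = b(X_t)\, dt + \sigma(X_t)\, dW_t$ and the definition of $A$, we have
\[
f(X_t) = f(X_0) + \int_0^t (Af)(X_s)\, ds + \int_0^t \sigma(X_s)\, \partial_x f(X_s)\, dW_s .
\]
The result follows if we can show that the stochastic integral $\int_0^t \sigma(X_s)\, \partial_x f(X_s)\, dW_s$ is a martingale (with respect to the filtration of $W$). According to Proposition 1.2.7, it is sufficient to show that $\mathbb{E}\big[\int_0^t |\sigma(X_s)\, \partial_x f(X_s)|^2\, ds\big] < \infty$. Since $f$ has bounded derivatives and $\sigma$ satisfies (1.4), we obtain
\[
\begin{aligned}
\mathbb{E}\Big[ \int_0^t |\sigma(X_s)|^2 |\partial_x f(X_s)|^2\, ds \Big]
&\le C^2 \sup_{x \in \mathbb{R}} |\partial_x f(x)|^2\; \mathbb{E}\Big[ \int_0^t (1 + |X_s|^2)\, ds \Big] \\
&\le T C^2 \sup_{x \in \mathbb{R}} |\partial_x f(x)|^2 \Big( 1 + \mathbb{E}\Big[ \sup_{0 \le s \le T} |X_s|^2 \Big] \Big) < \infty,
\end{aligned}
\]
where the last bound is finite by (1.5). $\square$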

Remark 4.1.2 For $t > 0$ denote by $X_t^x$ the solution of the SDE (1.2) starting from $x$ at time $0$. Then, since $M_t = f(X_t^x) - \int_0^t (Af)(X_s^x)\, ds$ is a martingale by Proposition 4.1.1, we know that $\mathbb{E}[M_0] = f(x) = \mathbb{E}[M_t]$. Therefore,
\[
\mathbb{E}[f(X_t^x)] = f(x) + \mathbb{E}\Big[ \int_0^t (Af)(X_s^x)\, ds \Big].
\]
Since by assumption $f$ has bounded derivatives and $b$, $\sigma$ satisfy the global Lipschitz and linear growth conditions (1.3)–(1.4),
\[
\mathbb{E}\Big[ \sup_{0 \le s \le T} |(Af)(X_s^x)| \Big]
\le C\, \mathbb{E}\Big[ \sup_{0 \le s \le T} \big( |\sigma^2(X_s^x)| + |b(X_s^x)| \big) \Big]
\le C' \Big( 1 + \mathbb{E}\Big[ \sup_{0 \le s \le T} |X_s^x|^2 \Big] \Big) < \infty .
\]
Thus, since $Af$ and $X_t^x$ are continuous, the dominated convergence theorem gives
\[
\frac{d}{dt} \mathbb{E}[f(X_t^x)] \Big|_{t=0} = \lim_{t \to 0} \mathbb{E}\Big[ \frac{1}{t} \int_0^t (Af)(X_s^x)\, ds \Big] = (Af)(x).
\]
Therefore, $A$ is called the infinitesimal generator of the process $X_t^x$.
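The limit in Remark 4.1.2 can be checked numerically in a simple special case. The sketch below assumes constant coefficients $b = \mu$ and $\sigma$ (so that $X_t^x = x + \mu t + \sigma W_t$) and the test function $f = \sin$, for which $\mathbb{E}[f(X_t^x)] = \sin(x + \mu t)\, e^{-\sigma^2 t/2}$ is available in closed form. The function names and parameter values are illustrative choices, not taken from the text.

```python
import math


def generator_value(df, d2f, x, mu, sigma):
    """(Af)(x) = 0.5 * sigma^2 * f''(x) + mu * f'(x) for constant coefficients."""
    return 0.5 * sigma**2 * d2f(x) + mu * df(x)


def expected_sin(x, t, mu, sigma):
    """Closed form of E[sin(x + mu*t + sigma*W_t)] for a Brownian motion W."""
    return math.sin(x + mu * t) * math.exp(-0.5 * sigma**2 * t)


# Illustrative parameters (assumptions): mu = r - sigma^2/2 with r = 0.01, sigma = 0.3.
x, mu, sigma, t = 0.4, -0.035, 0.3, 1e-6

# Difference quotient (E[f(X_t^x)] - f(x)) / t for a small t ...
difference_quotient = (expected_sin(x, t, mu, sigma) - math.sin(x)) / t
# ... compared with the generator applied to f = sin (f' = cos, f'' = -sin).
exact = generator_value(math.cos, lambda y: -math.sin(y), x, mu, sigma)
```

For small `t` the two numbers agree up to an error of order `t`, which is the statement $\frac{d}{dt}\mathbb{E}[f(X_t^x)]|_{t=0} = (Af)(x)$ in this special case.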


For the purpose of option pricing, we need a discounted version of Proposition 4.1.1.

Proposition 4.1.3 Let $f \in C^{1,2}(\mathbb{R} \times \mathbb{R})$ with bounded derivatives in $x$, let $A$ be as in (4.2) and assume that $r \in C^0(\mathbb{R})$ is bounded. Then, the process
\[
M_t := e^{-\int_0^t r(X_s)\, ds} f(t, X_t) - \int_0^t e^{-\int_0^s r(X_\tau)\, d\tau} (\partial_t f + Af - rf)(s, X_s)\, ds
\]
is a martingale with respect to the filtration of $W$.

Proof Denote by $Z$ the process $Z_t := e^{-\int_0^t r(X_s)\, ds}$. Then
\[
d(Z_t f(t, X_t)) = dZ_t\, f(t, X_t) + Z_t\, df(t, X_t),
\]
with $dZ_t = -r(X_t) Z_t\, dt$, and thus, by the Itô formula (1.7),
\[
\begin{aligned}
d(Z_t f(t, X_t)) &= -r(X_t) Z_t f(t, X_t)\, dt + Z_t \Big( \partial_t f(t, X_t)\, dt + \partial_x f(t, X_t)\, dX_t + \tfrac{1}{2} \sigma^2(X_t)\, \partial_{xx} f(t, X_t)\, dt \Big) \\
&= Z_t \Big( \big( -r f(t, X_t) + \partial_t f(t, X_t) + (Af)(t, X_t) \big)\, dt + \sigma(X_t)\, \partial_x f(t, X_t)\, dW_t \Big).
\end{aligned}
\]
Thus, we need to show that $\int_0^t Z_s\, \sigma(X_s)\, \partial_x f(s, X_s)\, dW_s$ is a martingale. But
\[
\mathbb{E}\Big[ \int_0^t |Z_s\, \sigma(X_s)\, \partial_x f(s, X_s)|^2\, ds \Big] < \infty
\]
by the boundedness of $r$ and by repeating the estimates in the proof of Proposition 4.1.1. $\square$

We are now able to link the stochastic representation of the option price (4.1) with a parabolic partial differential equation.

Theorem 4.1.4 Let $V \in C^{1,2}(J \times \mathbb{R}) \cap C^0(\overline{J} \times \mathbb{R})$ with bounded derivatives in $x$ be a solution of
\[
\partial_t V + AV - rV = 0 \quad \text{in } J \times \mathbb{R}, \qquad V(T, x) = g(x) \quad \text{in } \mathbb{R}, \tag{4.3}
\]
with $A$ as in (4.2). Then, $V(t, x)$ can also be represented as
\[
V(t, x) = \mathbb{E}\Big[ e^{-\int_t^T r(X_s)\, ds}\, g(X_T) \,\Big|\, X_t = x \Big].
\]


Proof We show the result only for $t = 0$. Since $\partial_t V + AV - rV = 0$, we have, by Proposition 4.1.3, that the process $M_t := e^{-\int_0^t r(X_s)\, ds}\, V(t, X_t)$ is a martingale. Thus,
\[
\begin{aligned}
V(0, x) = \mathbb{E}[M_0 \mid X_0 = x] &= \mathbb{E}[M_T \mid X_0 = x] \\
&= \mathbb{E}\Big[ e^{-\int_0^T r(X_s)\, ds}\, V(T, X_T) \,\Big|\, X_0 = x \Big] \\
&= \mathbb{E}\Big[ e^{-\int_0^T r(X_s)\, ds}\, g(X_T) \,\Big|\, X_0 = x \Big]. \qquad \square
\end{aligned}
\]

Remark 4.1.5 The converse of Theorem 4.1.4 is also true. Any $V(t, x)$ as in (4.1), which is in $C^{1,2}(J \times \mathbb{R}) \cap C^0(\overline{J} \times \mathbb{R})$ with bounded derivatives in $x$, solves the PDE (4.3).

We apply Theorem 4.1.4 to the Black–Scholes model [21]. In the Black–Scholes market, the risky asset's spot price is modeled by a geometric Brownian motion $S_t$, i.e. the SDE for this model is as in (1.2), with coefficients $b(t, s) = rs$, $\sigma(t, s) = \sigma s$, where $\sigma > 0$ and $r \ge 0$ denote the (constant) volatility and the (constant) interest rate, respectively. Therefore, the SDE is given by
\[
dS_t = r S_t\, dt + \sigma S_t\, dW_t .
\]
We assume for simplicity that no dividends are paid. Based on Theorem 4.1.4, we get that the discounted price of a European contract with payoff $g(s)$, i.e. $V(0, s) = \mathbb{E}[e^{-rT} g(S_T) \mid S_0 = s]$, is equal to a regular solution $V(0, s)$ of the Black–Scholes equation
\[
\partial_t V + \tfrac{1}{2} \sigma^2 s^2\, \partial_{ss} V + r s\, \partial_s V - rV = 0 \quad \text{in } J \times \mathbb{R}_+ . \tag{4.4}
\]
The BS equation (4.4) needs to be completed by the terminal condition $V(T, s) = g(s)$, depending on the type of option. Equation (4.4) is a parabolic PDE with the second order "spatial" differential operator
\[
(Af)(s) = \tfrac{1}{2} \sigma^2 s^2\, \partial_{ss} f(s) + r s\, \partial_s f(s), \tag{4.5}
\]
which degenerates at $s = 0$. To obtain a non-degenerate operator with constant coefficients, we switch to the log-price process $X_t = \log(S_t)$, which solves the SDE
\[
dX_t = \big( r - \tfrac{1}{2} \sigma^2 \big)\, dt + \sigma\, dW_t .
\]
The infinitesimal generator for this process has constant coefficients and is given by
\[
(A^{\mathrm{BS}} f)(x) = \tfrac{1}{2} \sigma^2\, \partial_{xx} f(x) + \big( r - \tfrac{1}{2} \sigma^2 \big)\, \partial_x f(x). \tag{4.6}
\]
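As a sanity check of the Black–Scholes equation (4.4), one can insert the well-known closed-form call price into the PDE and evaluate the residual with central difference quotients. The following is a minimal sketch; the function names, the evaluation points and the step sizes `dt`, `ds` are illustrative assumptions, and the parameter defaults are chosen only for the demonstration.

```python
import math


def norm_cdf(x):
    """Standard normal cumulative distribution function."""
    return 0.5 * (1.0 + math.erf(x / math.sqrt(2.0)))


def bs_call(t, s, K=100.0, r=0.01, sigma=0.3, T=1.0):
    """Closed-form Black-Scholes value V(t, s) of a European call, valid for t < T."""
    tau = T - t
    d1 = (math.log(s / K) + (r + 0.5 * sigma**2) * tau) / (sigma * math.sqrt(tau))
    d2 = d1 - sigma * math.sqrt(tau)
    return s * norm_cdf(d1) - K * math.exp(-r * tau) * norm_cdf(d2)


def bs_residual(t, s, r=0.01, sigma=0.3, dt=1e-4, ds=1e-2):
    """Residual of (4.4) at (t, s), derivatives replaced by central differences."""
    V = bs_call(t, s, r=r, sigma=sigma)
    V_t = (bs_call(t + dt, s, r=r, sigma=sigma) - bs_call(t - dt, s, r=r, sigma=sigma)) / (2.0 * dt)
    V_s = (bs_call(t, s + ds, r=r, sigma=sigma) - bs_call(t, s - ds, r=r, sigma=sigma)) / (2.0 * ds)
    V_ss = (bs_call(t, s + ds, r=r, sigma=sigma) - 2.0 * V
            + bs_call(t, s - ds, r=r, sigma=sigma)) / ds**2
    return V_t + 0.5 * sigma**2 * s**2 * V_ss + r * s * V_s - r * V


residual_itm = bs_residual(0.3, 110.0)   # in the money
residual_atm = bs_residual(0.5, 100.0)   # at the money
```

Both residuals vanish up to the discretization and rounding error of the difference quotients, as expected for a regular solution of (4.4).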


We furthermore change to time-to-maturity $t \to T - t$, to obtain a forward parabolic problem. Thus, by setting $V(t, s) =: v(T - t, \log s)$, the BS equation in real price (4.4) satisfied by $V(t, s)$ becomes the BS equation for $v(t, x)$ in log-price
\[
\partial_t v - A^{\mathrm{BS}} v + r v = 0 \quad \text{in } (0, T) \times \mathbb{R}, \tag{4.7}
\]
with the initial condition $v(0, x) = g(e^x)$ in $\mathbb{R}$.

Remark 4.1.6 For put and call contracts with strike $K > 0$, it is convenient to introduce the so-called log-moneyness variable $x = \log(s/K)$ and to set the option price $V(t, s) := K w(T - t, \log(s/K))$. Then, the function $w(t, x)$ again solves (4.7), with the initial condition $w(0, x) = g(K e^x)/K$. Thus, the initial condition becomes $w(0, x) = \max\{0, 1 - e^x\}$ for a put and $w(0, x) = \max\{0, e^x - 1\}$ for a call, where both payoffs no longer depend on $K$.
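The passage from (4.4) to (4.7) is a direct application of the chain rule. As a short worked step (no new assumptions beyond $V(t,s) = v(T-t, x)$ with $x = \log s$):
\[
\partial_t V = -\partial_t v, \qquad
\partial_s V = \frac{1}{s}\, \partial_x v, \qquad
\partial_{ss} V = \frac{1}{s^2} \big( \partial_{xx} v - \partial_x v \big).
\]
Inserting these expressions into (4.4) gives
\[
-\partial_t v + \tfrac{1}{2} \sigma^2 \big( \partial_{xx} v - \partial_x v \big) + r\, \partial_x v - r v = 0
\quad \Longleftrightarrow \quad
\partial_t v - \Big( \tfrac{1}{2} \sigma^2\, \partial_{xx} v + \big( r - \tfrac{1}{2} \sigma^2 \big)\, \partial_x v \Big) + r v = 0,
\]
where the bracket is exactly $A^{\mathrm{BS}} v$ from (4.6), so that (4.7) follows.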

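4.2 Variational Formulation

We give the variational formulation of the Black–Scholes equation (4.7), which reads: Find $u \in L^2(J; H^1(\mathbb{R})) \cap H^1(J; L^2(\mathbb{R}))$ such that
\[
\begin{aligned}
(\partial_t u, v) + a^{\mathrm{BS}}(u, v) &= 0, \quad \forall v \in H^1(\mathbb{R}), \text{ a.e. in } J, \\
u(0) &= u_0,
\end{aligned} \tag{4.8}
\]
where $u_0(x) := g(e^x)$ and the bilinear form $a^{\mathrm{BS}}(\cdot,\cdot) : H^1(\mathbb{R}) \times H^1(\mathbb{R}) \to \mathbb{R}$ is given by
\[
a^{\mathrm{BS}}(\varphi, \phi) := \tfrac{1}{2} \sigma^2 (\varphi', \phi') + (\sigma^2/2 - r)(\varphi', \phi) + r (\varphi, \phi). \tag{4.9}
\]
We show that $a^{\mathrm{BS}}(\cdot,\cdot)$ is continuous (3.8) and satisfies a Gårding inequality (3.9) on $V = H^1(\mathbb{R})$.

Proposition 4.2.1 There exist constants $C_i = C_i(\sigma, r) > 0$, $i = 1, 2, 3$, such that for all $\varphi, \phi \in H^1(\mathbb{R})$
\[
|a^{\mathrm{BS}}(\varphi, \phi)| \le C_1 \|\varphi\|_{H^1(\mathbb{R})} \|\phi\|_{H^1(\mathbb{R})}, \qquad
a^{\mathrm{BS}}(\varphi, \varphi) \ge C_2 \|\varphi\|_{H^1(\mathbb{R})}^2 - C_3 \|\varphi\|_{L^2(\mathbb{R})}^2 .
\]

Proof We first show continuity. By Hölder's inequality,
\[
\begin{aligned}
|a^{\mathrm{BS}}(\varphi, \phi)| &\le \tfrac{1}{2} \sigma^2 \|\varphi'\|_{L^2(\mathbb{R})} \|\phi'\|_{L^2(\mathbb{R})} + |\sigma^2/2 - r|\, \|\varphi'\|_{L^2(\mathbb{R})} \|\phi\|_{L^2(\mathbb{R})} + r \|\varphi\|_{L^2(\mathbb{R})} \|\phi\|_{L^2(\mathbb{R})} \\
&\le C_1(\sigma, r)\, \|\varphi\|_{H^1(\mathbb{R})} \|\phi\|_{H^1(\mathbb{R})} .
\end{aligned}
\]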


To show coercivity, note that with $\int_{\mathbb{R}} \varphi' \varphi\, dx = \tfrac{1}{2} \int_{\mathbb{R}} (\varphi^2)'\, dx = 0$ we have
\[
\begin{aligned}
a^{\mathrm{BS}}(\varphi, \varphi) &= \tfrac{1}{2} \sigma^2 \|\varphi'\|_{L^2(\mathbb{R})}^2 + r \|\varphi\|_{L^2(\mathbb{R})}^2
= \tfrac{1}{2} \sigma^2 \|\varphi\|_{H^1(\mathbb{R})}^2 + (r - \sigma^2/2) \|\varphi\|_{L^2(\mathbb{R})}^2 \\
&\ge \tfrac{1}{2} \sigma^2 \|\varphi\|_{H^1(\mathbb{R})}^2 - |r - \sigma^2/2|\, \|\varphi\|_{L^2(\mathbb{R})}^2 . \qquad \square
\end{aligned}
\]
Referring to the abstract existence result Theorem 3.2.2 in the spaces $V = H^1(\mathbb{R})$ and $H = L^2(\mathbb{R})$, we deduce that the variational problem (4.8) admits a unique weak solution $u \in L^2(J; H^1(\mathbb{R})) \cap H^1(J; L^2(\mathbb{R}))$ for every $u_0 \in L^2(\mathbb{R})$. Since $u_0(x) = g(e^x)$, the requirement $u_0 \in L^2(\mathbb{R})$ implies an unrealistic growth condition on the payoff $g$. In the next section, we reformulate the problem on a bounded domain where this condition can be weakened. In particular, we require the following polynomial growth condition on the payoff function: There exist $C > 0$, $q \ge 1$ such that
\[
g(s) \le C (s + 1)^q, \quad \text{for all } s \in \mathbb{R}_+ . \tag{4.10}
\]
This condition is satisfied by the payoff functions of all standard contracts such as plain vanilla European calls, puts or power options.

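4.3 Localization

The unbounded domain $\mathbb{R}$ of the log-price $x = \log s$ is truncated to a bounded domain $G$. In terms of financial modeling, this corresponds to approximating the option price by a knock-out barrier option. Let $G = (-R, R)$, $R > 0$, be an open subset and let $\tau_G := \inf\{t \ge 0 \mid X_t \in G^c\}$ be the first hitting time of the complement set $G^c = \mathbb{R} \setminus G$ by $X$. Then, the price of a knock-out barrier option in log-price with payoff $g(e^x)$ is given by
\[
v_R(t, x) = \mathbb{E}\big[ e^{-r(T-t)}\, g(e^{X_T})\, \mathbf{1}_{\{T < \tau_G\}} \,\big|\, X_t = x \big]. \tag{4.11}
\]

Theorem 4.3.1 Suppose the payoff function $g$ satisfies the growth condition (4.10). Then, there exist constants $C(T, \sigma), \gamma_1, \gamma_2 > 0$ such that
\[
|v(t, x) - v_R(t, x)| \le C(T, \sigma)\, e^{-\gamma_1 R + \gamma_2 |x|} .
\]

Proof Let $M_T = \sup_{\tau \in [t, T]} |X_\tau|$. Then, with (4.10),
\[
|v(t, x) - v_R(t, x)| \le \mathbb{E}\big[ g(e^{X_T})\, \mathbf{1}_{\{T \ge \tau_G\}} \,\big|\, X_t = x \big] \le C\, \mathbb{E}\big[ e^{q M_T}\, \mathbf{1}_{\{M_T > R\}} \,\big|\, X_t = x \big].
\]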


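Using [143, Theorem 25.18], it suffices to show that there exists a constant $C(T, \sigma) > 0$ such that
\[
\mathbb{E}\big[ e^{q |X_T|}\, \mathbf{1}_{\{|X_T| > R\}} \,\big|\, X_t = x \big] \le C(T, \sigma)\, e^{-\gamma_1 R + \gamma_2 |x|} .
\]
We have for $\mu = r - \sigma^2/2$, with the transition probability $p_{T-t}$,
\[
\begin{aligned}
\mathbb{E}\big[ e^{q |X_T|}\, \mathbf{1}_{\{|X_T| > R\}} \,\big|\, X_t = x \big]
&= \int_{\mathbb{R}} e^{q |z + x|}\, \mathbf{1}_{\{|z + x| > R\}}\, p_{T-t}(z)\, dz \\
&\le e^{q |x|} \int_{\mathbb{R}} e^{q |z|}\, \mathbf{1}_{\{|z + x| > R\}}\, \frac{1}{\sqrt{2 \pi \sigma^2 (T - t)}}\, e^{-(z - \mu (T - t))^2 / (2 \sigma^2 (T - t))}\, dz \\
&\le C_1(T, \sigma)\, e^{q |x|} \int_{\mathbb{R}} e^{(q + \mu/\sigma^2) |z|}\, \mathbf{1}_{\{|z + x| > R\}}\, e^{-z^2 / (2 \sigma^2 (T - t))}\, dz \\
&\le C_1(T, \sigma)\, e^{q |x|}\, e^{-(\eta - q - \mu/\sigma^2)(R - |x|)} \int_{\mathbb{R}} e^{\eta |z|}\, e^{-z^2 / (2 \sigma^2 (T - t))}\, dz \\
&\le C_1(T, \sigma)\, e^{-\gamma_1 R + \gamma_2 |x|} \int_{\mathbb{R}} e^{\eta |z|}\, e^{-z^2 / (2 \sigma^2 (T - t))}\, dz,
\end{aligned}
\]
with $\gamma_1 = \eta - q - \mu/\sigma^2$ and $\gamma_2 = \gamma_1 + q$. Since $\int_{\mathbb{R}} e^{\eta |z|}\, e^{-z^2 / (2 \sigma^2 (T - t))}\, dz < \infty$ for any $\eta > 0$, we obtain the required result by choosing $\eta > q + \mu/\sigma^2$. $\square$

Remark 4.3.2 We see from Theorem 4.3.1 that $v_R \to v$ exponentially for a fixed $x$ as $R \to \infty$. The artificial zero Dirichlet barrier-type conditions at $x = \pm R$ do not correctly describe the asymptotic behavior of the price $v(t, x)$ for large $|x|$. Since the barrier option price $v_R$ is a good approximation to $v$ only for $|x| \ll R$, $R$ should be selected substantially larger than the values of $x$ of interest.

The barrier option price $v_R$ can again be computed as the solution of a PDE, provided some smoothness assumptions hold.

Theorem 4.3.3 Let $v_R(t, x) \in C^{1,2}(J \times \mathbb{R}) \cap C^0(\overline{J} \times \mathbb{R})$ be a solution of
\[
\partial_t v_R + A^{\mathrm{BS}} v_R - r v_R = 0 \quad \text{on } (0, T) \times G, \tag{4.12}
\]
where the terminal and boundary conditions are given by
\[
v_R(T, x) = g(e^x), \quad \forall x \in G, \qquad v_R(t, x) = 0 \quad \text{on } (0, T) \times G^c .
\]
Then, $v_R(t, x)$ can also be represented as in (4.11).

Now we can restate the problem (4.8) on the bounded domain: Find $u_R \in L^2(J; H_0^1(G)) \cap H^1(J; L^2(G))$ such that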


\[
\begin{aligned}
(\partial_t u_R, v) + a^{\mathrm{BS}}(u_R, v) &= 0, \quad \forall v \in H_0^1(G), \text{ a.e. in } J, \\
u_R(0) &= u_0|_G .
\end{aligned} \tag{4.13}
\]
By Proposition 4.2.1 and Theorem 3.2.2, the problem (4.13) is well-posed, i.e. there exists a unique solution $u_R \in L^2(0, T; H_0^1(G)) \cap C^0([0, T]; L^2(G))$, which can be approximated by a finite element Galerkin scheme.
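The exponential decay in $R$ that drives the localization estimate can be observed numerically. The sketch below approximates the truncated moment $\mathbb{E}[e^{q|X_T|}\mathbf{1}_{\{|X_T|>R\}} \mid X_t = x]$ from the proof of Theorem 4.3.1 by a plain Riemann sum against the Gaussian density of $X_T$. The function name, the quadrature grid and the parameter values ($q = 1$, $\sigma = 0.3$, $r = 0.01$, $T - t = 1$) are illustrative assumptions.

```python
import numpy as np


def truncated_moment(R, x=0.0, q=1.0, r=0.01, sigma=0.3, tau=1.0,
                     n=200001, y_max=12.0):
    """Riemann-sum approximation of E[exp(q|X_T|) 1{|X_T| > R} | X_t = x].

    Under Black-Scholes dynamics, X_T is normal with mean x + mu*tau and
    variance sigma^2 * tau, where mu = r - sigma^2 / 2 and tau = T - t.
    """
    mu = r - 0.5 * sigma**2
    y = np.linspace(-y_max, y_max, n)
    dy = y[1] - y[0]
    mean, var = x + mu * tau, sigma**2 * tau
    density = np.exp(-(y - mean) ** 2 / (2.0 * var)) / np.sqrt(2.0 * np.pi * var)
    integrand = np.exp(q * np.abs(y)) * density * (np.abs(y) > R)
    return float(np.sum(integrand) * dy)


radii = [0.5, 1.0, 1.5, 2.0]
moments = [truncated_moment(R) for R in radii]
```

The values in `moments` drop by several orders of magnitude from one radius to the next, which illustrates why a moderate localization parameter such as $R = 3$ in log-moneyness is typically sufficient near the money.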

4.4 Discretization

We use the finite element and the finite difference method to discretize the Black–Scholes equation. As for the heat equation in Chap. 2, we use the variational formulation of the differential equations for FEM and determine approximate solutions that are piecewise linear. For FDM we replace the derivatives in the differential equation by difference quotients.

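4.4.1 Finite Difference Discretization

We discretize the PDE (4.12) directly using finite differences on a bounded domain with homogeneous Dirichlet boundary conditions. Proceeding as in Sect. 2.3.1, we obtain the matrix problem: Find $\underline{u}^{m+1} \in \mathbb{R}^N$ such that for $m = 0, \dots, M - 1$,
\[
\begin{aligned}
\big( \mathbf{I} + \theta k\, \mathbf{G}^{\mathrm{BS}} \big)\, \underline{u}^{m+1} &= \big( \mathbf{I} - (1 - \theta) k\, \mathbf{G}^{\mathrm{BS}} \big)\, \underline{u}^m, \\
\underline{u}^0 &= \underline{u}_0,
\end{aligned} \tag{4.14}
\]
where $\mathbf{G}^{\mathrm{BS}} = \sigma^2/2\, \mathbf{R} + (\sigma^2/2 - r)\, \mathbf{C} + r \mathbf{I}$ is given explicitly with
\[
\mathbf{R} = \frac{1}{h^2}
\begin{pmatrix}
2 & -1 & & \\
-1 & \ddots & \ddots & \\
& \ddots & \ddots & -1 \\
& & -1 & 2
\end{pmatrix},
\qquad
\mathbf{C} = \frac{1}{2h}
\begin{pmatrix}
0 & 1 & & \\
-1 & \ddots & \ddots & \\
& \ddots & \ddots & 1 \\
& & -1 & 0
\end{pmatrix}.
\]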
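The matrix problem (4.14) can be sketched in a few lines with NumPy. The code below is a minimal illustration, not the book's implementation: it uses dense matrices for brevity, works in log-moneyness ($K = 1$, see Remark 4.1.6), and the function name and default parameters are assumptions chosen for the demonstration.

```python
import numpy as np


def fd_theta_scheme(payoff, sigma=0.3, r=0.01, T=1.0, R=3.0, N=299, M=200, theta=1.0):
    """Theta-scheme (4.14) on G = (-R, R) with homogeneous Dirichlet conditions.

    Returns the interior grid points x and the approximation of v(T, x).
    """
    h = 2.0 * R / (N + 1)
    k = T / M
    x = -R + h * np.arange(1, N + 1)                 # interior grid points
    off = np.ones(N - 1)
    Rm = (2.0 * np.eye(N) - np.diag(off, 1) - np.diag(off, -1)) / h**2
    Cm = (np.diag(off, 1) - np.diag(off, -1)) / (2.0 * h)
    G = 0.5 * sigma**2 * Rm + (0.5 * sigma**2 - r) * Cm + r * np.eye(N)
    lhs = np.eye(N) + theta * k * G
    rhs = np.eye(N) - (1.0 - theta) * k * G
    step = np.linalg.solve(lhs, rhs)                 # one-step propagation matrix
    u = payoff(x)                                    # initial data u^0
    for _ in range(M):
        u = step @ u
    return x, u


# European put in log-moneyness: w(0, x) = max(0, 1 - e^x).
x, u = fd_theta_scheme(lambda y: np.maximum(0.0, 1.0 - np.exp(y)))
put_atm = float(u[np.argmin(np.abs(x))])             # value at x = 0, i.e. s = K
```

With these parameters the at-the-money value is close to the closed-form Black–Scholes put price of about 0.1137 (for $K = 1$). In practice one would store $\mathbf{G}^{\mathrm{BS}}$ as a sparse tridiagonal matrix and solve a tridiagonal system in each time step instead of forming a dense propagation matrix.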

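4.4.2 Finite Element Discretization

We discretize (4.13) using the θ-scheme and the finite element space $V_N = S_{\mathcal{T}}^1 \cap H_0^1(G)$ with $S_{\mathcal{T}}^1$ given as in (3.17). Setting $u_0(x) := g(e^x)$ and proceeding exactly as in Sect. 3.3 with uniform mesh width $h$ and uniform time steps $k$, we obtain the matrix problem: Find $\underline{u}^{m+1} \in \mathbb{R}^N$ such that for $m = 0, \dots, M - 1$,
\[
\begin{aligned}
(\mathbf{M} + k \theta\, \mathbf{A}^{\mathrm{BS}})\, \underline{u}^{m+1} &= (\mathbf{M} - k (1 - \theta)\, \mathbf{A}^{\mathrm{BS}})\, \underline{u}^m, \\
\underline{u}^0 &= \underline{u}_0,
\end{aligned} \tag{4.15}
\]
where $\mathbf{M}_{ij} = (b_j, b_i)_{L^2(G)}$ and $\mathbf{A}^{\mathrm{BS}}_{ij} = a^{\mathrm{BS}}(b_j, b_i)$. Let $\mathbf{M}$ be given as in (2.25). Using (3.22), we can compute $\mathbf{A}^{\mathrm{BS}} = \sigma^2/2\, \mathbf{S} + (\sigma^2/2 - r)\, \mathbf{B} + r \mathbf{M}$ explicitly with
\[
\mathbf{S} = \frac{1}{h}
\begin{pmatrix}
2 & -1 & & \\
-1 & \ddots & \ddots & \\
& \ddots & \ddots & -1 \\
& & -1 & 2
\end{pmatrix},
\qquad
\mathbf{B} = \frac{1}{2}
\begin{pmatrix}
0 & 1 & & \\
-1 & \ddots & \ddots & \\
& \ddots & \ddots & 1 \\
& & -1 & 0
\end{pmatrix}. \tag{4.16}
\]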
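The finite element system (4.15) with the matrices (4.16) can be sketched analogously. Two points in the code below are assumptions rather than statements from the text: the mass matrix is taken to be the standard hat-function mass matrix $\frac{h}{6}\,\mathrm{tridiag}(1, 4, 1)$ on a uniform mesh (presumably the form referred to in (2.25)), and the initial data is obtained by nodal interpolation instead of the $L^2$-projection, purely to keep the example short. All names are illustrative.

```python
import numpy as np


def fem_matrices(N, h, sigma, r):
    """Mass matrix M and stiffness matrix A^BS for P1 elements on a uniform mesh."""
    off = np.ones(N - 1)
    # Assumed standard P1 mass matrix: (h/6) * tridiag(1, 4, 1).
    Mm = h / 6.0 * (4.0 * np.eye(N) + np.diag(off, 1) + np.diag(off, -1))
    S = (2.0 * np.eye(N) - np.diag(off, 1) - np.diag(off, -1)) / h
    B = 0.5 * (np.diag(off, 1) - np.diag(off, -1))
    A = 0.5 * sigma**2 * S + (0.5 * sigma**2 - r) * B + r * Mm
    return Mm, A


def fem_theta_scheme(payoff, sigma=0.3, r=0.01, T=1.0, R=3.0, N=299, M=200, theta=0.5):
    """Theta-scheme (4.15) on G = (-R, R); returns interior nodes and u^M."""
    h = 2.0 * R / (N + 1)
    k = T / M
    x = -R + h * np.arange(1, N + 1)
    Mm, A = fem_matrices(N, h, sigma, r)
    step = np.linalg.solve(Mm + k * theta * A, Mm - k * (1.0 - theta) * A)
    u = payoff(x)          # nodal interpolation (simplification, see lead-in)
    for _ in range(M):
        u = step @ u
    return x, u


x, u = fem_theta_scheme(lambda y: np.maximum(0.0, 1.0 - np.exp(y)))
put_atm = float(u[np.argmin(np.abs(x))])
```

As in the finite difference case, the at-the-money value is close to the closed-form put price of about 0.1137 for $K = 1$.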

4.4.3 Non-smooth Initial Data

As already mentioned, the advantage of finite elements is that we have low smoothness assumptions on the initial data $u_0$, and therefore on the payoff function $g$. In particular, as shown in Theorem 3.2.2, we have a unique solution for every $u_0 \in L^2(G)$. However, according to Theorem 3.6.5, we need $u_0 \in H^2(G)$ to obtain the optimal convergence rate $\|u - u_N\|_{L^2(J; L^2(G))} = O(h^2 + k^r)$, where $r = 1$ for $\theta \in [0, 1] \setminus \{1/2\}$ and $r = 2$ for $\theta = 1/2$. This is due to the time discretization, since uniform time steps are used. To recover the optimal convergence rate for $u_0 \in H^s(G)$, $0 < s < 2$, we need to use graded meshes in time or space. We assume for simplicity $T = 1$. Let $\lambda : [0, 1] \to [0, 1]$ be a grading function which is strictly increasing and satisfies
\[
\lambda \in C^0([0, 1]) \cap C^1((0, 1)), \qquad \lambda(0) = 0, \qquad \lambda(1) = 1.
\]
We define for $M \in \mathbb{N}$ the algebraically graded mesh by the time points
\[
t_m = \lambda\Big( \frac{m}{M} \Big), \quad m = 0, 1, \dots, M.
\]
It can be shown [146, Remark 3.11] that we again obtain the optimal convergence rate if $\lambda(t) = O(t^\beta)$, where $\beta$ depends on $r$ and $s$, $\beta = \beta(r, s)$ (algebraic grading).

Example 4.4.1 Consider the payoff functions
\[
g_c(s) = \begin{cases} s - K & \text{if } s > K, \\ 0 & \text{else}, \end{cases}
\qquad
g_d(s) = \begin{cases} 1 & \text{if } s > K, \\ 0 & \text{else}. \end{cases}
\]


Fig. 4.1 Option price of European call (top) and digital (bottom) option

We set strike $K = 100$, volatility $\sigma = 0.3$, interest rate $r = 0.01$, maturity $T = 1$. For the discretization we use $N = 300$, $\theta = 1/2$, $R = 3$ and apply the $L^2$-projection for $u_0$. As in Remark 4.1.6, we use $K = 1$ in the calculations, for which $R = 3$ is a sufficiently large localization parameter. Using time steps $M = O(N)$, we obtain the option prices shown in Fig. 4.1. For $G_0 = (K/2, 3K/2)$ we measure the discrete $L^2(J; L^2(G_0))$-error defined by
\[
\Big( \sum_{m=1}^{M} k_m\, h\, \|\varepsilon^m\|_{\ell_2}^2 \Big)^{1/2},
\qquad \text{with} \quad
\|\varepsilon^m\|_{\ell_2}^2 := \sum_{i=1}^{N} |u(t_m, x_i) - u_N(t_m, x_i)|^2,
\]
both with uniform time steps and with graded time steps. The exact values are obtained using the analytic formulas [21]. We use the algebraic grading factor $\beta = 3$ for the call option and the extreme grading factor $\beta = 25$ for the digital option. It can be seen in Fig. 4.2 that for the call and the digital option we only obtain the optimal convergence rate using a graded time mesh.

Fig. 4.2 $L^2(J; L^2(G))$ convergence rate for $\theta = 1/2$ using uniform time steps (top) and graded time steps (bottom)

Remark 4.4.2 Note that the theory from [146, Chap. 3.3] suggests that choosing $\beta = 10$ is sufficient in the case of a digital option in the Black–Scholes market to obtain full convergence order of the given discretization. But the analysis in [146, Chap. 3.3] is performed in a semi-discrete setting, i.e. there is no discretization error in the space domain. This explains why in our situation, discretizing in space and time, a larger grading factor has to be chosen.
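The two ingredients of Example 4.4.1 that are specific to this section, the algebraically graded time mesh and the discrete space-time error, can be sketched as follows. The grading function $\lambda(t) = t^\beta$ is one admissible choice consistent with $\lambda(t) = O(t^\beta)$; the function names and the array layout (one row of nodal errors per time level $t_1, \dots, t_M$) are assumptions for illustration.

```python
import numpy as np


def graded_time_mesh(M, beta, T=1.0):
    """Time points t_m = T * (m / M)**beta, m = 0, ..., M (beta = 1 is uniform)."""
    return T * (np.arange(M + 1) / M) ** beta


def discrete_l2_error(err, t, h):
    """Discrete L2(J; L2)-error sqrt(sum_m k_m * h * ||err_m||_2^2).

    err has shape (M, N): row m-1 holds the nodal errors at time t_m,
    and k_m = t_m - t_{m-1} are the (possibly non-uniform) time steps.
    """
    k = np.diff(t)
    return float(np.sqrt(np.sum(k * h * np.sum(np.abs(err) ** 2, axis=1))))


t_uniform = graded_time_mesh(10, beta=1.0)
t_graded = graded_time_mesh(10, beta=3.0)
```

A graded mesh with $\beta > 1$ concentrates the time points near $t = 0$, where the solution with non-smooth initial data varies most rapidly; for instance, the first step of `t_graded` has length $10^{-3}$ compared with $10^{-1}$ for the uniform mesh.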


4.5 Extensions of the Black–Scholes Model

We end this chapter by considering two extensions of the Black–Scholes model
\[
dS_t = r S_t\, dt + \sigma S_t\, dW_t, \qquad S_0 = s \ge 0.
\]
In the constant elasticity of variance (CEV) model, $\sigma S_t$ is replaced by $\sigma S_t^\rho$ for some $0 < \rho < 1$. Another possible extension is to replace the constant volatility $\sigma$ by a deterministic function $\sigma(s)$, which leads to the so-called local volatility models.

4.5.1 CEV Model

Under a unique equivalent martingale measure, the stock price dynamics are given by
\[
dS_t = r S_t\, dt + \sigma S_t^\rho\, dW_t, \qquad S_0 = s \ge 0, \tag{4.17}
\]
where $\rho$ is the elasticity of variance. We assume $0 < \rho < 1$. Under this condition, the point $0$ is an attainable state. As soon as $S = 0$, we keep $S$ equal to zero, with the resulting process still satisfying (4.17). Note that (4.17) is of the form (1.2), but with $\sigma(t, s) = \sigma s^\rho$, which is non-Lipschitz. The transformation to log-price, $x = \log s$, will not allow removing the factor $s^\rho$. A formal application of Theorem 4.1.4 yields that the value $V(t, s)$ of a European vanilla with payoff $g$ is the solution of
\[
\partial_t V + A_\rho^{\mathrm{CEV}} V - rV = 0 \quad \text{in } J \times \mathbb{R}_{\ge 0}, \tag{4.18}
\]
with the terminal condition $V(T, s) = g(s)$ and generator
\[
(A_\rho^{\mathrm{CEV}} f)(s) = \tfrac{1}{2} \sigma^2 s^{2\rho}\, \partial_{ss} f(s) + r s\, \partial_s f(s).
\]
Note that for $\rho = 1$, the CEV generator is the same as the BS generator. In (4.18), we change to time-to-maturity $t \to T - t$ and localize to a bounded domain $G := (0, R)$, $R > 0$. Thus, we consider
\[
\begin{aligned}
\partial_t v - A_\rho^{\mathrm{CEV}} v + r v &= 0 && \text{in } J \times G, \\
v &= 0 && \text{on } J \times \{R\}, \\
v(0, s) &= g(s) && \text{in } G.
\end{aligned} \tag{4.19}
\]

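For the variational formulation of (4.19), we multiply the first equation in (4.19) by a test function $w$ and integrate from $s = 0$ to $s = R$. Using $s^{2\rho}\, \partial_{ss} v = \partial_s (s^{2\rho}\, \partial_s v) - 2\rho s^{2\rho - 1}\, \partial_s v$, we find, upon integration by parts,
\[
\int_0^R s^{2\rho}\, \partial_{ss} v\, w\, ds = -\int_0^R s^{2\rho}\, \partial_s v\, \partial_s w\, ds - 2\rho \int_0^R s^{2\rho - 1}\, \partial_s v\, w\, ds + s^{2\rho}\, \partial_s v\, w\, \Big|_{s=0}^{s=R} .
\]
The boundary terms vanish for $w \in C_0^\infty(G)$.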


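Define the weighted Sobolev space
\[
W_\rho = \overline{C_0^\infty(G)}^{\|\cdot\|_\rho}, \tag{4.20}
\]
where the weighted Sobolev norm $\|\cdot\|_\rho$ is defined by
\[
\|\varphi\|_\rho^2 := \int_0^R \big( s^{2\rho} |\partial_s \varphi|^2 + |\varphi|^2 \big)\, ds, \qquad 0 \le \rho \le 1. \tag{4.21}
\]
For $\varphi, \phi \in C_0^\infty(G)$, we define the bilinear form
\[
a_\rho^{\mathrm{CEV}}(\varphi, \phi) := \frac{1}{2} \sigma^2 \int_0^R s^{2\rho}\, \partial_s \varphi\, \partial_s \phi\, ds + \rho \sigma^2 \int_0^R s^{2\rho - 1}\, \partial_s \varphi\, \phi\, ds - r \int_0^R s\, \partial_s \varphi\, \phi\, ds + r \int_0^R \varphi \phi\, ds. \tag{4.22}
\]
The variational formulation of (4.19) is based on the triple of spaces $V = W_\rho \hookrightarrow H = L^2(G) \cong H^* \hookrightarrow W_\rho^* = V^*$, and reads: Find $v \in L^2(J; W_\rho) \cap H^1(J; L^2(G))$ such that
\[
\begin{aligned}
(\partial_t v, w) + a_\rho^{\mathrm{CEV}}(v, w) &= 0, \quad \forall w \in W_\rho, \text{ a.e. in } J, \\
v(0) &= g.
\end{aligned} \tag{4.23}
\]
To establish well-posedness of (4.23), we show continuity and coercivity of the bilinear form $a_\rho^{\mathrm{CEV}}(\cdot,\cdot)$ on $W_\rho$.

Proposition 4.5.1 Assume $r > 0$. There exist $C_1, C_2 > 0$ such that for $\varphi, \phi \in W_\rho$
\[
|a_\rho^{\mathrm{CEV}}(\varphi, \phi)| \le C_1 \|\varphi\|_\rho \|\phi\|_\rho, \qquad \rho \in [0, 1] \setminus \{1/2\}, \tag{4.24}
\]
\[
a_\rho^{\mathrm{CEV}}(\varphi, \varphi) \ge C_2 \|\varphi\|_\rho^2, \qquad 0 \le \rho \le \tfrac{1}{2}. \tag{4.25}
\]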

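Proof Let $\varphi \in C_0^\infty(G)$. By Hardy's inequality, for $\varepsilon \ne 1$, $\varepsilon > 0$, and any $R > 0$,
\[
\Big( \int_0^R s^{\varepsilon - 2} |\varphi|^2\, ds \Big)^{1/2} \le \frac{2}{|\varepsilon - 1|} \Big( \int_0^R s^{\varepsilon} |\partial_s \varphi|^2\, ds \Big)^{1/2}, \tag{4.26}
\]
we find with $\varepsilon = 2\rho \ne 1$ that
\[
\Big| \int_0^R s^{2\rho - 1}\, \partial_s \varphi\, \phi\, ds \Big|
\le \Big( \int_0^R s^{2\rho} (\partial_s \varphi)^2\, ds \Big)^{1/2} \Big( \int_0^R s^{2\rho - 2} \phi^2\, ds \Big)^{1/2}
\le \frac{2}{|2\rho - 1|} \|\varphi\|_\rho \|\phi\|_\rho
\]
and, by the Cauchy–Schwarz inequality,
\[
\Big| \int_0^R s\, \partial_s \varphi\, \phi\, ds \Big|
\le \|\varphi\|_\rho \Big( \int_0^R s^{2 - 2\rho} \phi^2\, ds \Big)^{1/2}
\le \|\varphi\|_\rho\, R^{1 - \rho}\, \|\phi\|_{L^2(G)} .
\]
Thus, for $\varphi, \phi \in C_0^\infty(G)$, $\rho \ne \tfrac{1}{2}$, one has $|a_\rho^{\mathrm{CEV}}(\varphi, \phi)| \le C(\rho, \sigma, r) \|\varphi\|_\rho \|\phi\|_\rho$. Hence, we may extend the bilinear form $a_\rho^{\mathrm{CEV}}(\cdot,\cdot)$ from $C_0^\infty(G)$ to $W_\rho$ by continuity for $\rho \in [0, 1] \setminus \{\tfrac{1}{2}\}$. Furthermore, we have
\[
a_\rho^{\mathrm{CEV}}(\varphi, \varphi) = \frac{1}{2} \sigma^2 \|s^\rho\, \partial_s \varphi\|_{L^2(G)}^2 + \frac{1}{2} \rho \sigma^2 \int_0^R s^{2\rho - 1}\, \partial_s (\varphi^2)\, ds - \frac{1}{2} r \int_0^R s\, \partial_s (\varphi^2)\, ds + r \int_0^R \varphi^2\, ds.
\]
Integrating by parts, we get, for $0 \le \rho \le \tfrac{1}{2}$,
\[
\int_0^R s^{2\rho - 1}\, \partial_s (\varphi^2)\, ds = -(2\rho - 1) \int_0^R s^{2\rho - 2} \varphi^2\, ds \ge 0.
\]
Analogously, $\tfrac{1}{2} \int_0^R s\, \partial_s (\varphi^2)\, ds = -\tfrac{1}{2} \int_0^R \varphi^2\, ds$, hence we get for $0 \le \rho \le \tfrac{1}{2}$
\[
a_\rho^{\mathrm{CEV}}(\varphi, \varphi) \ge \frac{1}{2} \sigma^2 \|s^\rho\, \partial_s \varphi\|_{L^2(G)}^2 + \frac{3}{2} r \|\varphi\|_{L^2(G)}^2 \ge \frac{1}{2} \min\{\sigma^2, 3r\}\, \|\varphi\|_\rho^2 . \qquad \square
\]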

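By Theorem 3.2.2, we deduce

Corollary 4.5.2 Problem (4.23) admits a unique solution $V \in L^2(J; W_\rho) \cap H^1(J; L^2(G))$ for $0 \le \rho < 1/2$.

The previous result addressed only the case $0 \le \rho < \tfrac{1}{2}$. The case $\tfrac{1}{2} \le \rho < 1$ (which includes, for $\rho = \tfrac{1}{2}$, the Heston model and the CIR process) requires a modified variational framework due to the failure of the Hardy inequality (4.26) for $\varepsilon = 1$. Let us develop this framework. We multiply the first equation in (4.19) by $s^{2\mu} w$, where $\mu$ is a parameter to be selected and $w \in C_0^\infty(G)$ is a test function, and integrate from $s = 0$ to $s = R$. We get from (4.19)
\[
(\partial_t v, s^{2\mu} w) + a_{\rho,\mu}^{\mathrm{CEV}}(v, w) = 0, \quad \forall w \in C_0^\infty(G), \tag{4.27}
\]
where the bilinear form $a_{\rho,\mu}^{\mathrm{CEV}}(\cdot,\cdot)$ is defined by
\[
\begin{aligned}
a_{\rho,\mu}^{\mathrm{CEV}}(\varphi, \phi) := {} & \frac{1}{2} \sigma^2 \int_0^R s^{2\rho + 2\mu}\, \partial_s \varphi\, \partial_s \phi\, ds + \sigma^2 (\rho + \mu) \int_0^R s^{2\rho + 2\mu - 1}\, \partial_s \varphi\, \phi\, ds \\
& - r \int_0^R s^{1 + 2\mu}\, \partial_s \varphi\, \phi\, ds + r \int_0^R s^{2\mu} \varphi \phi\, ds.
\end{aligned} \tag{4.28}
\]
Note that $a_{\rho,0}^{\mathrm{CEV}} = a_\rho^{\mathrm{CEV}}$ and $a_1^{\mathrm{CEV}} = a^{\mathrm{BS}}$. We introduce the spaces $W_{\rho,\mu}$ as closures of $C_0^\infty(G)$ with respect to the norm
\[
\|\varphi\|_{\rho,\mu}^2 := \int_0^R \big( s^{2\rho + 2\mu} |\partial_s \varphi|^2 + s^{2\mu} |\varphi|^2 \big)\, ds, \tag{4.29}
\]
compare with (4.20). Note that $\|\varphi\|_\rho = \|\varphi\|_{\rho,0}$. We now show the analog to Proposition 4.5.1 for $\rho \in [1/2, 1]$.

Proposition 4.5.3 Assume $0 \le \rho \le 1$ and select
\[
\begin{cases}
\mu = 0 & \text{if } 0 \le \rho < \tfrac{1}{2} \text{ or } \rho = 1, \\
-\tfrac{1}{2} < \mu < \tfrac{1}{2} - \rho & \text{if } \tfrac{1}{2} \le \rho < 1.
\end{cases} \tag{4.30}
\]
Assume also $r > 0$. Then there exist $C_1, C_2 > 0$ such that for all $\varphi, \phi \in W_{\rho,\mu}$ the following holds:
\[
|a_{\rho,\mu}^{\mathrm{CEV}}(\varphi, \phi)| \le C_1 \|\varphi\|_{\rho,\mu} \|\phi\|_{\rho,\mu}, \tag{4.31}
\]
\[
a_{\rho,\mu}^{\mathrm{CEV}}(\varphi, \varphi) \ge C_2 \|\varphi\|_{\rho,\mu}^2 . \tag{4.32}
\]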

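Proof The continuity (4.31) of $a_{\rho,\mu}^{\mathrm{CEV}}$ in $W_{\rho,\mu}(0, R) \times W_{\rho,\mu}(0, R)$ follows from the Cauchy–Schwarz inequality and from Hardy's inequality (4.26) with $\varepsilon = 2(\rho + \mu) \ne 1$:
\[
\begin{aligned}
|a_{\rho,\mu}^{\mathrm{CEV}}(\varphi, \phi)| &\le \frac{1}{2} \sigma^2 \|\varphi\|_{\rho,\mu} \|\phi\|_{\rho,\mu} + \sigma^2 (\rho + \mu) \|\varphi\|_{\rho,\mu} \Big( \int_0^R s^{2\rho + 2\mu - 2} \phi^2\, ds \Big)^{1/2} \\
&\quad + r \|\varphi\|_{\rho,\mu} \Big( \int_0^R s^{2 + 2\mu - 2\rho} \phi^2\, ds \Big)^{1/2} \\
&\le \Big( \frac{\sigma^2}{2} + \frac{2 \sigma^2 (\rho + \mu)}{|2\rho + 2\mu - 1|} + r R^{1 - \rho} \Big) \|\varphi\|_{\rho,\mu} \|\phi\|_{\rho,\mu} .
\end{aligned}
\]
Let $\varphi \in C_0^\infty(G)$. We calculate
\[
\begin{aligned}
a_{\rho,\mu}^{\mathrm{CEV}}(\varphi, \varphi) &= \frac{1}{2} \sigma^2 \|s^{\rho + \mu}\, \partial_s \varphi\|_{L^2(G)}^2 + \frac{1}{2} \sigma^2 (\rho + \mu) \int_0^R s^{2\rho + 2\mu - 1}\, \partial_s (\varphi^2)\, ds \\
&\quad - \frac{1}{2} r \int_0^R s^{1 + 2\mu}\, \partial_s (\varphi^2)\, ds + r \|s^\mu \varphi\|_{L^2(G)}^2 \\
&= \frac{1}{2} \sigma^2 \|s^{\rho + \mu}\, \partial_s \varphi\|_{L^2(G)}^2 - \frac{1}{2} \sigma^2 (\rho + \mu)(2\rho + 2\mu - 1) \int_0^R s^{2\rho + 2\mu - 2} \varphi^2\, ds \\
&\quad + \frac{1}{2} r (1 + 2\mu) \int_0^R s^{2\mu} \varphi^2\, ds + r \|s^\mu \varphi\|_{L^2(G)}^2 .
\end{aligned}
\]
Given $1/2 \le \rho < 1$, we now choose $\mu$ such that $-1/2 \le \mu < 1/2 - \rho$. Then, $2\rho + 2\mu - 1 < 0$, $1 + 2\mu \ge 0$, $\rho + \mu \ge 0$, and we get
\[
a_{\rho,\mu}^{\mathrm{CEV}}(\varphi, \varphi) \ge \frac{1}{2} \sigma^2 \|s^{\rho + \mu}\, \partial_s \varphi\|_{L^2(G)}^2 + r \|s^\mu \varphi\|_{L^2(G)}^2 \ge \frac{1}{2} \min\{\sigma^2, 2r\}\, \|\varphi\|_{\rho,\mu}^2 .
\]
By density of $C_0^\infty(0, R)$ in $W_{\rho,\mu}$, we have shown (4.32). $\square$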



We are now ready to cast Eq. (4.19) into the abstract parabolic framework. We choose $\mu$ as in (4.30) and observe that then $\mu \le 0$. Hence, if $H_\mu := L^2(0, R; s^{2\mu}\, ds)$ denotes the weighted $L^2$-space corresponding to $W_{\rho,\mu}$, we have the dense inclusions
\[
W_{\rho,\mu} \hookrightarrow H_\mu \cong (H_\mu)^* \hookrightarrow (W_{\rho,\mu})^* . \tag{4.33}
\]
Denote by $(\cdot,\cdot)_\mu$ the inner product in $H_\mu$. The weak formulation then reads: Find $v \in L^2(J; W_{\rho,\mu}) \cap H^1(J; H_\mu)$ such that
\[
\begin{aligned}
(\partial_t v, w)_\mu + a_{\rho,\mu}^{\mathrm{CEV}}(v, w) &= 0, \quad \forall w \in W_{\rho,\mu}, \text{ a.e. in } J, \\
v(0) &= g.
\end{aligned} \tag{4.34}
\]
Applying Theorem 3.2.2, we have shown

Theorem 4.5.4 Let $\rho \in [0, 1]$, and assume that $\mu$ satisfies (4.30). Then, the problem (4.34) admits a unique solution.
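The selection rule (4.30) leaves a whole interval of admissible weights $\mu$ when $1/2 \le \rho < 1$. The sketch below picks the midpoint $\mu = -\rho/2$ of the interval $(-1/2, 1/2 - \rho)$; this particular choice and the function name are assumptions for illustration, since the text only requires $\mu$ to lie in the interval.

```python
def select_mu(rho):
    """Return a weight exponent mu compatible with (4.30).

    For 1/2 <= rho < 1 the midpoint of (-1/2, 1/2 - rho) is used,
    which is an illustrative choice, not prescribed by the text.
    """
    if not 0.0 <= rho <= 1.0:
        raise ValueError("rho must lie in [0, 1]")
    if rho < 0.5 or rho == 1.0:
        return 0.0
    return -0.5 * rho


def satisfies_coercivity_conditions(rho, mu):
    """Sign conditions used in the proof of Proposition 4.5.3 for 1/2 <= rho < 1."""
    return (2.0 * rho + 2.0 * mu - 1.0 < 0.0) and (1.0 + 2.0 * mu >= 0.0) and (rho + mu >= 0.0)


mu_cir = select_mu(0.5)   # rho = 1/2, the square-root (CIR-type) case
```

For $\rho = 1/2$ this gives $\mu = -1/4$, so the weighted space $H_\mu = L^2(0, R; s^{-1/2}\, ds)$ is used in (4.33).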

4.5.2 Local Volatility Models

We replace the constant volatility $\sigma$ in the Black–Scholes model (1.1) by a deterministic function $\sigma(s)$, i.e. the Black–Scholes model is extended to
\[
dS_t = r S_t\, dt + \sigma(S_t) S_t\, dW_t, \tag{4.35}
\]
with $r \in \mathbb{R}_{\ge 0}$ and $\sigma : \mathbb{R}_+ \to \mathbb{R}_+$. We assume that the function $s \mapsto s \sigma(s)$ satisfies (1.3)–(1.4), such that the SDE (4.35) admits a unique solution. Thus, the infinitesimal generator $A$ of the process $S$ is
\[
(Af)(s) = \tfrac{1}{2} s^2 \sigma^2(s)\, \partial_{ss} f(s) + r s\, \partial_s f(s).
\]


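It follows from Theorem 4.1.4 that the option price $V$ in (4.1) solves $\partial_t V + AV - rV = 0$ in $J \times \mathbb{R}_+$, $V(T, s) = g(s)$ in $\mathbb{R}_+$. As in the Black–Scholes model, we switch to time-to-maturity $t \to T - t$ and to log-price $x = \ln(s)$, and localize the PDE to a bounded domain $G = (-R, R)$, $R > 0$. Thus, we consider the parabolic problem for $v(t, x) = V(T - t, e^x)$
\[
\begin{aligned}
\partial_t v - A^{\mathrm{LV}} v + r v &= 0 && \text{in } J \times G, \\
v &= 0 && \text{on } J \times \partial G, \\
v(0, x) &= g(e^x) && \text{in } G,
\end{aligned} \tag{4.36}
\]
where, for $\widetilde\sigma(x) := \sigma(e^x)$, we denote by $A^{\mathrm{LV}}$ the operator
\[
(A^{\mathrm{LV}} f)(x) = \tfrac{1}{2} \widetilde\sigma^2(x)\, \partial_{xx} f(x) + \big( r - \tfrac{1}{2} \widetilde\sigma^2(x) \big)\, \partial_x f(x).
\]
The weak formulation of (4.36) reads: Find $u \in L^2(J; H_0^1(G)) \cap H^1(J; L^2(G))$ such that
\[
\begin{aligned}
(\partial_t u, v) + a^{\mathrm{LV}}(u, v) &= 0, \quad \forall v \in H_0^1(G), \text{ a.e. in } J, \\
u(0) &= g,
\end{aligned} \tag{4.37}
\]
where the bilinear form $a^{\mathrm{LV}}(\cdot,\cdot) : H_0^1(G) \times H_0^1(G) \to \mathbb{R}$ is given by
\[
a^{\mathrm{LV}}(\varphi, \phi) := \frac{1}{2} \int_G \widetilde\sigma^2 \varphi' \phi'\, dx + \int_G \big( \widetilde\sigma \widetilde\sigma' + \widetilde\sigma^2/2 - r \big) \varphi' \phi\, dx + r \int_G \varphi \phi\, dx.
\]
Note that the bilinear form $a^{\mathrm{LV}}(\cdot,\cdot)$ is a particular case of the bilinear form $a(\cdot,\cdot)$ defined in (3.16) by identifying the coefficients $\alpha(x) = \widetilde\sigma^2(x)/2$, $\beta(x) = (\widetilde\sigma \widetilde\sigma')(x) + \widetilde\sigma^2(x)/2 - r$ and $\gamma(x) = r$. Hence, the stiffness matrix $\mathbf{A}^{\mathrm{LV}}$ corresponding to the bilinear form $a^{\mathrm{LV}}(\cdot,\cdot)$ can be implemented using the method described in Sect. 3.4.

Proposition 4.5.5 Assume that $\widetilde\sigma : \mathbb{R} \to \mathbb{R}_+$ satisfies $\widetilde\sigma \in W^{1,\infty}(G)$ and that there exists $\sigma_0 > 0$ such that $\widetilde\sigma(x) \ge \sigma_0 > 0$, $\forall x \in G$. Then, there exist constants $C_i > 0$, $i = 1, 2, 3$, such that for all $\varphi, \phi \in H_0^1(G)$ there holds
\[
|a^{\mathrm{LV}}(\varphi, \phi)| \le C_1 \|\varphi\|_{H^1(G)} \|\phi\|_{H^1(G)}, \qquad
|a^{\mathrm{LV}}(\varphi, \varphi)| \ge C_2 \|\varphi\|_{H^1(G)}^2 - C_3 \|\varphi\|_{L^2(G)}^2 .
\]

Proof We have
\[
\begin{aligned}
|a^{\mathrm{LV}}(\varphi, \phi)| &\le \tfrac{1}{2} \|\widetilde\sigma\|_{L^\infty(G)}^2 \|\varphi'\|_{L^2(G)} \|\phi'\|_{L^2(G)} + r \|\varphi\|_{L^2(G)} \|\phi\|_{L^2(G)} \\
&\quad + \big( \|\widetilde\sigma\|_{L^\infty(G)} \|\widetilde\sigma'\|_{L^\infty(G)} + \tfrac{1}{2} \|\widetilde\sigma\|_{L^\infty(G)}^2 + r \big) \|\varphi'\|_{L^2(G)} \|\phi\|_{L^2(G)} \\
&\le C_1 \|\varphi\|_{H^1(G)} \|\phi\|_{H^1(G)} .
\end{aligned}
\]
Denote by $\overline\sigma$ the constant $\overline\sigma := \|\widetilde\sigma\|_{L^\infty(G)} \|\widetilde\sigma'\|_{L^\infty(G)} + \tfrac{1}{2} \|\widetilde\sigma\|_{L^\infty(G)}^2$. Then, for an arbitrary $\varepsilon > 0$,
\[
\begin{aligned}
a^{\mathrm{LV}}(\varphi, \varphi) &\ge \tfrac{1}{2} \sigma_0^2 \|\varphi'\|_{L^2(G)}^2 + r \|\varphi\|_{L^2(G)}^2 - \overline\sigma\, \|\varphi'\|_{L^2(G)} \|\varphi\|_{L^2(G)} \\
&\ge \big( \sigma_0^2/2 - \varepsilon \overline\sigma \big) \|\varphi'\|_{L^2(G)}^2 + \big( r - \overline\sigma/(4\varepsilon) \big) \|\varphi\|_{L^2(G)}^2 .
\end{aligned}
\]
Choosing $\varepsilon = \sigma_0^2/(4\overline\sigma)$, we have $a^{\mathrm{LV}}(\varphi, \varphi) \ge \sigma_0^2/4\, \|\varphi'\|_{L^2(G)}^2 + (r - (\overline\sigma/\sigma_0)^2) \|\varphi\|_{L^2(G)}^2$, which gives, by the Poincaré inequality (3.4),
\[
\begin{aligned}
a^{\mathrm{LV}}(\varphi, \varphi) &\ge \sigma_0^2/8\, \|\varphi'\|_{L^2(G)}^2 + \sigma_0^2/(8C)\, \|\varphi\|_{L^2(G)}^2 - |r - (\overline\sigma/\sigma_0)^2|\, \|\varphi\|_{L^2(G)}^2 \\
&\ge \sigma_0^2/8\, \min\{1, C^{-1}\}\, \|\varphi\|_{H^1(G)}^2 - C_3 \|\varphi\|_{L^2(G)}^2 . \qquad \square
\end{aligned}
\]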



By Proposition 4.5.5, the bilinear form a LV (·,·) is continuous and satisfies a Gårding inequality. Hence, for g ∈ L2 (G), the weak formulation (4.37) admits a unique solution u ∈ L2 (J ; H01 (G)) ∩ H 1 (J ; L2 (G)).
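An element-by-element assembly of the matrix belonging to $a^{\mathrm{LV}}(\cdot,\cdot)$ can be sketched as follows. The sketch assumes piecewise linear elements on a uniform mesh and evaluates the coefficients $\alpha$, $\beta$, $\gamma$ at element midpoints; this quadrature is a simplification chosen for brevity and is not necessarily the rule of Sect. 3.4. The callables `sig` and `dsig` (for $\widetilde\sigma$ and $\widetilde\sigma'$) and the function name are hypothetical.

```python
import numpy as np


def assemble_lv(sig, dsig, r, R=3.0, N=299):
    """Assemble the P1 matrix of a^LV on G = (-R, R), interior nodes only.

    sig, dsig: callables for sigma~(x) and its derivative (user supplied).
    Coefficients are frozen at element midpoints (illustrative quadrature).
    """
    h = 2.0 * R / (N + 1)
    nodes = np.linspace(-R, R, N + 2)
    A = np.zeros((N + 2, N + 2))
    K_loc = np.array([[1.0, -1.0], [-1.0, 1.0]]) / h        # (b_j', b_i')
    B_loc = np.array([[-0.5, 0.5], [-0.5, 0.5]])            # (b_j', b_i)
    M_loc = h / 6.0 * np.array([[2.0, 1.0], [1.0, 2.0]])    # (b_j, b_i)
    for e in range(N + 1):
        xm = 0.5 * (nodes[e] + nodes[e + 1])
        alpha = 0.5 * sig(xm) ** 2
        beta = sig(xm) * dsig(xm) + 0.5 * sig(xm) ** 2 - r
        A[e:e + 2, e:e + 2] += alpha * K_loc + beta * B_loc + r * M_loc
    return A[1:-1, 1:-1], h                                  # drop Dirichlet nodes


# Constant volatility: the result should coincide with A^BS from (4.16).
A_lv, h = assemble_lv(lambda x: 0.3, lambda x: 0.0, r=0.01, N=9)
```

For constant $\widetilde\sigma$ the assembled matrix reduces to $\sigma^2/2\, \mathbf{S} + (\sigma^2/2 - r)\, \mathbf{B} + r \mathbf{M}$, which gives a simple consistency check against the Black–Scholes case.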

4.6 Further Reading

To derive the partial differential equations, we followed the line of Lamberton and Lapeyre [109]. Using finite differences to price options was first described in Brennan and Schwartz [26]. A rigorous treatment can be found in Achdou and Pironneau [1]. Finite elements were first applied to finance in Wilmott et al. [161]. Error estimates for non-smooth initial data are given in Thomée [154]. The CEV model was introduced by Cox and Ross [45, 46], and analytic formulas can be found, for example, in Hsu et al. [88]. See also [56]. The probabilistic argument to estimate the localization error is due to Cont and Voltchkova [41], who work in the more general setting of Lévy processes. The local volatility model is used to recover the volatility smile observed in the stock market, as shown in Derman and Kani [57] or Dupire [59].

Chapter 5

American Options

Pricing American contracts requires, due to the early exercise feature of such contracts, the solution of optimal stopping problems for the price process. Similar to the pricing of European contracts, the solutions of these problems have a deterministic characterization. Unlike in the European case, the pricing function of an American option does not satisfy a partial differential equation, but a partial differential inequality, or to be more precise, a system of inequalities. We consider the discretization of this inequality both by the finite difference and the finite element method where the latter is approximating the solutions of variational inequalities. The discretization in both cases leads to a sequence of linear complementarity problems (LCPs). These LCPs are then solved iteratively by the PSOR algorithm. Thus, from an algorithmic point of view, the pricing of an American option differs from the pricing of a European option only as in the latter we have to solve linear systems, whereas in the former we need to solve linear complementarity problems. The calculation of the stiffness matrix is the same for both options since the matrix depends on the model and not on the contract. We assume that the dynamics of the stock price is modeled by a geometric Brownian motion and that no dividends are paid. Under this assumption, the value of an American call contract is equal to the value of the corresponding European call option. Therefore, we focus on put options in the following.

5.1 Optimal Stopping Problem

Recall that a stopping time $\tau$ for a given filtration $\mathcal{F}_t$ is a random variable taking values in $(0, \infty)$ and satisfying
\[
\{\tau \le t\} \in \mathcal{F}_t, \quad \forall t \ge 0.
\]

N. Hilber et al., Computational Methods for Quantitative Finance, Springer Finance, DOI 10.1007/978-3-642-35401-4_5, © Springer-Verlag Berlin Heidelberg 2013


Fig. 5.1 American put option

Denoting by $\mathcal{T}_{t,T}$ the set of all stopping times for $S_t$ with values in the interval $(t, T)$, the value of an American option is given by
\[
V(t, s) := \sup_{\tau \in \mathcal{T}_{t,T}} \mathbb{E}\big[ e^{-r(\tau - t)}\, g(S_\tau) \,\big|\, S_t = s \big]. \tag{5.1}
\]
As for the European vanilla style contracts, there is a close connection between the probabilistic representation (5.1) of the price and a deterministic, PDE based representation of the price. We have

Theorem 5.1.1 Let $v(t, x)$ be a sufficiently smooth solution of the following system of inequalities
\[
\begin{aligned}
\partial_t v - A^{\mathrm{BS}} v + r v &\ge 0 && \text{in } J \times \mathbb{R}, \\
v(t, x) &\ge g(e^x) && \text{in } J \times \mathbb{R}, \\
(\partial_t v - A^{\mathrm{BS}} v + r v)(g - v) &= 0 && \text{in } J \times \mathbb{R}, \\
v(0, x) &= g(e^x) && \text{in } \mathbb{R}.
\end{aligned} \tag{5.2}
\]
Then, $V(T - t, e^x) = v(t, x)$.

A proof can be found in [15]; we also refer to [98] for further details. For each $t \in J$ there exists the so-called optimal exercise price $s^*(t) \in (0, K)$ such that for all $s \le s^*(t)$ the value of the American put option is the value of immediate exercise, i.e. $V(t, s) = g(s)$, while for $s > s^*(t)$ the value exceeds the immediate exercise value, see Fig. 5.1. The region $C := \{(t, s) \mid s > s^*(t)\}$ is called the continuation region and the complement $C^c$ of $C$ is the exercise region. Since the optimal exercise price is not known a priori, it is called a free boundary for the associated pricing PDE, and the problem of determining the option price is then a free boundary problem. Note that the inequalities (5.2) do not involve the free boundary $s^*(t)$.

Remark 5.1.2 In the Black–Scholes model, the derivative of $V$ at $s = s^*(t)$ is continuous, which is known as the smooth pasting condition. This does not hold for pure jump models.


5.2 Variational Formulation The set of admissible solutions for the variational form of (5.2) is the convex set Kg ⊂ H 1 (R) Kg := {v ∈ H 1 (R) : v ≥ g a.e. x}.

(5.3)

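The variational form of (5.2) reads:

Find $u \in L^2(J; H^1(\mathbb{R})) \cap H^1(J; L^2(\mathbb{R}))$ such that $u(t,\cdot) \in K_g$ and
\[
\begin{aligned}
(\partial_t u, v - u) + a^{BS}(u, v - u) &\ge 0, \quad \forall v \in K_g, \ \text{a.e. in } J, \\
u(0) &= g.
\end{aligned} \tag{5.4}
\]
Since the bilinear form $a^{BS}(\cdot,\cdot)$ is continuous and satisfies a Gårding inequality in $H^1(\mathbb{R})$ by Proposition 4.2.1, problem (5.4) admits a unique solution for every payoff $g \in L^\infty(\mathbb{R})$ by Theorem B.2.2 of Appendix B. As in the case of plain European vanilla contracts, we localize (5.4) to a bounded domain $G = (-R,R)$, $R > 0$, by approximating the value $v$ in (5.1),
\[
v(t,x) := \sup_{\tau \in \mathcal{T}_{t,T}} E\big[ e^{-r(\tau - t)} g(e^{X_\tau}) \mid X_t = x \big],
\]
by the value of a barrier option
\[
v_R(t,x) = \sup_{\tau \in \mathcal{T}_{t,T}} E\big[ e^{-r(\tau - t)} g(e^{X_\tau}) 1_{\{\tau < \tau_G\}} \mid X_t = x \big]. \tag{5.5}
\]
[…] As in the European case (Theorem 4.3.1), there exist constants $C(T,\sigma)$, $\gamma_1$, $\gamma_2 > 0$ such that
\[
|v(t,x) - v_R(t,x)| \le C(T,\sigma)\, e^{-\gamma_1 R + \gamma_2 |x|},
\]
valid for payoff functions $g : \mathbb{R} \to \mathbb{R}$ satisfying the growth condition (4.10). Thus, we consider problem (5.1) restricted on $G$:
\[
\begin{aligned}
\partial_t v_R - A^{BS} v_R + r v_R &\ge 0 && \text{in } J \times G, \\
v_R(t,x) &\ge g(e^x)|_G && \text{in } J \times G, \\
(\partial_t v_R - A^{BS} v_R + r v_R)(g|_G - v_R) &= 0 && \text{in } J \times G, \\
v_R(0,x) &= g(e^x)|_G && \text{in } G, \\
v_R(t,\pm R) &= g(e^{\pm R}) && \text{in } J.
\end{aligned} \tag{5.6}
\]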

We cast the truncated problem into variational form. To get a simple convex set for admissible functions and to facilitate the numerical solution, we introduce the time value of the option (also called excess to payoff) $w_R := v_R - g|_G$ and consider
\[
K_{0,R} := \{ v \in H_0^1(G) : v \ge 0 \ \text{a.e. } x \in G \}.
\]

The variational formulation of the truncated problem then reads:

Find $u_R \in L^2(J; H_0^1(G)) \cap H^1(J; L^2(G))$ such that $u_R(t,\cdot) \in K_{0,R}$ and
\[
\begin{aligned}
(\partial_t u_R, v - u_R) + a^{BS}(u_R, v - u_R) &\ge -a^{BS}(g, v - u_R), \quad \forall v \in K_{0,R}, \\
u_R(0) &= 0.
\end{aligned} \tag{5.7}
\]
We calculate the right hand side in (5.7) for a put option, i.e. $g(s) = \max\{0, K - s\}$. Let $\phi \in H_0^1(G)$. Then, by the definition of $a^{BS}(\cdot,\cdot)$ in (4.9) and integration by parts,
\[
\begin{aligned}
-a^{BS}(g,\phi) &= -\frac12\sigma^2 \int_{-R}^{\ln K} (K - e^x)'\,\phi'\,dx + \Big(r - \frac12\sigma^2\Big) \int_{-R}^{\ln K} (K - e^x)'\,\phi\,dx - r \int_{-R}^{\ln K} (K - e^x)\,\phi\,dx \\
&= \frac12\sigma^2 e^x \phi \Big|_{-R}^{\ln K} - \frac12\sigma^2 \int_{-R}^{\ln K} e^x \phi\,dx - \Big(r - \frac12\sigma^2\Big) \int_{-R}^{\ln K} e^x \phi\,dx - r \int_{-R}^{\ln K} (K - e^x)\,\phi\,dx \\
&= \frac12 K\sigma^2 \phi(\ln K) - rK \int_{-R}^{\ln K} \phi\,dx.
\end{aligned}
\]
Since this holds for all $\phi$, $-a^{BS}(g,\phi)$ defines the linear functional $f = \frac12 K\sigma^2 \delta_{\ln K} - rK\chi_{\{x \le \ln K\}} \in H^{-1}(G)$, where $\chi_B$ is the indicator function.
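The functional above can be evaluated on a finite element basis in closed form. The following is a minimal sketch, assuming continuous piecewise linear hat functions $b_i$ on a uniform mesh of $(-R,R)$ with $N$ interior nodes $x_i = -R + ih$; the function name `put_load_vector` and the mesh layout are illustrative choices, not notation from the text.

```python
import math

def put_load_vector(K, sigma, r, R, N):
    """Entries f_i = 0.5*K*sigma^2*b_i(ln K) - r*K*int_{-R}^{ln K} b_i dx
    for hat functions b_i on a uniform mesh (assumed layout, see lead-in)."""
    h = 2.0 * R / (N + 1)
    x_strike = math.log(K)

    def hat(i, x):
        # value of the hat function centred at the node x_i = -R + i*h
        return max(0.0, 1.0 - abs(x - (-R + i * h)) / h)

    def hat_integral(i, x):
        # closed form of int_{-R}^{x} b_i(y) dy
        s = (x - (-R + i * h)) / h      # signed distance to the node, in units of h
        if s <= -1.0:
            return 0.0
        if s >= 1.0:
            return h
        if s <= 0.0:
            return 0.5 * h * (1.0 + s) ** 2
        return h - 0.5 * h * (1.0 - s) ** 2

    return [0.5 * K * sigma**2 * hat(i, x_strike) - r * K * hat_integral(i, x_strike)
            for i in range(1, N + 1)]
```

Only the (at most two) hat functions whose support contains $\ln K$ see the Dirac part, while every hat function supported to the left of $\ln K$ contributes $-rKh$.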

5.3 Discretization

As in the European case, we approximate the option price both by the finite difference and the finite element method. The finite difference method discretizes the partial differential inequalities whereas the finite element method approximates the solution of variational inequalities. In both cases, the discretization leads to a sequence of linear complementarity problems. These LCPs are then solved iteratively by the PSOR algorithm.

5.3.1 Finite Difference Discretization

Discretizing (5.6) with finite differences and the backward Euler scheme, i.e. the $\theta$-scheme with $\theta = 1$, we obtain a sequence of matrix linear complementarity problems:

Find $u^{m+1} \in \mathbb{R}^N$ such that for $m = 0,\dots,M-1$,
\[
\begin{aligned}
(I + kG^{BS})\,u^{m+1} &\ge u^m + kf, \\
u^{m+1} &\ge u_0, \\
(u^{m+1} - u_0)^\top \big( (I + kG^{BS})\,u^{m+1} - u^m - kf \big) &= 0, \\
u^0 &= u_0,
\end{aligned} \tag{5.8}
\]
where $G^{BS}$ and $u_0$ are as in (4.14). The vector $f$ accounts for the (non-homogeneous) Dirichlet boundary conditions and is given by $f = k(f^-, 0, \dots, 0, f^+)^\top \in \mathbb{R}^N$, with $f^\pm := g(e^{\pm R})\big(\sigma^2/(2h^2) \mp (\sigma^2/2 - r)/(2h)\big)$. Note that we cannot approximate the time value of the option $w_R = v_R - g|_G$ by the FDM, since the corresponding inequality (5.6) satisfied by $w_R$ involves functionals $f \in H^{-1}(G)$ which cannot be approximated by finite difference quotients.

5.3.2 Finite Element Discretization

We discretize (5.7) using the backward Euler scheme and the finite element space $S^1_{\mathcal{T},0}$ given in (3.21). We obtain:

Find $u_N^{m+1} \in \mathbb{R}^N_{\ge 0}$ such that for $m = 0,\dots,M-1$,
\[
\begin{aligned}
(v - u_N^{m+1})^\top (M + kA^{BS})\,u_N^{m+1} &\ge (v - u_N^{m+1})^\top (kf + Mu_N^m), \quad \forall v \in \mathbb{R}^N_{\ge 0}, \\
u_N^0 &= 0,
\end{aligned} \tag{5.9}
\]
where $u_N^m$ is the coefficient vector of $u_N(t_m) \in S^1_{\mathcal{T},0} \cap K_{0,R}$, and $M$ and $A^{BS}$ are as in the European case, see (4.15). The vector $f$ is given by $f_i = -a^{BS}(g, b_i)$. The sequence of inequalities (5.9) can be rewritten as a sequence of LCPs.

Lemma 5.3.1 Denote by $B := M + kA^{BS}$, $F^m := kf + Mu_N^m$. Then, problem (5.9) is equivalent to: given $u_N^0 = 0$, find $u_N^{m+1} \in \mathbb{R}^N$ such that for $m = 0,\dots,M-1$,
\[
\begin{aligned}
Bu_N^{m+1} &\ge F^m, \\
u_N^{m+1} &\ge 0, \\
(u_N^{m+1})^\top (Bu_N^{m+1} - F^m) &= 0.
\end{aligned} \tag{5.10}
\]

Proof '$\Leftarrow$': Clearly, from $\mathbb{R}^N \ni u_N^{m+1} \ge 0$ follows $u_N^{m+1} \in \mathbb{R}^N_{\ge 0}$. From $Bu_N^{m+1} \ge F^m$, by the definition of $B$ and $F^m$, it follows
\[
w^{m+1} := M(u_N^{m+1} - u_N^m) + kA^{BS}u_N^{m+1} - kf \ge 0.
\]
Now, for $v \in \mathbb{R}^N_{\ge 0}$,
\[
(v - u_N^{m+1})^\top w^{m+1} = \underbrace{v^\top w^{m+1}}_{\ge 0} - \underbrace{(u_N^{m+1})^\top w^{m+1}}_{= 0},
\]
hence $(v - u_N^{m+1})^\top w^{m+1} = (v - u_N^{m+1})^\top \big( M(u_N^{m+1} - u_N^m) + kA^{BS}u_N^{m+1} - kf \big) \ge 0$ for all $v \in \mathbb{R}^N_{\ge 0}$, which is the second line of (5.9).

'$\Rightarrow$': From the second line of (5.9), we have for all $v \in \mathbb{R}^N_{\ge 0}$
\[
v^\top w^{m+1} \ge (u_N^{m+1})^\top w^{m+1}.
\]
Now suppose $(w^{m+1})_k < 0$ for some $k \in \{1,\dots,N\}$, and let $(v)_k \gg 1$. Then the left hand side becomes arbitrarily small, which is a contradiction. Hence $w^{m+1} \ge 0$, which is the inequality $Bu_N^{m+1} \ge F^m$. Now
\[
u_N^{m+1} \in \mathbb{R}^N_{\ge 0} \ \Rightarrow\ (u_N^{m+1})^\top w^{m+1} \ge 0.
\]
Set in the second line of (5.9) $v = 0$, i.e. $(-u_N^{m+1})^\top w^{m+1} \ge 0$, hence $(u_N^{m+1})^\top w^{m+1} \le 0$. We conclude $(u_N^{m+1})^\top w^{m+1} = 0$, which is $(u_N^{m+1})^\top Bu_N^{m+1} - (u_N^{m+1})^\top F^m = 0$. □

We give a convergence result for the price approximated by FEM. Let $\{u_N^m\}_{m=1}^M$ be the solution of (5.9) and let $u^m := u(t_m, x)$ be the solution of (5.4) at time level $t_m$. The proof of the next convergence result is shown in [121, Theorem 22].

Theorem 5.3.2 Assume $u \in C^0((0,T]; H^2(G)) \cap C^{1,1}(J; L^2(G))$. Then, there exists a constant $C = C(u) > 0$ such that the following error bound holds:
\[
\max_m \| u^m - u_N^m \|_{L^2(G)} + \Big( k \sum_{m=1}^M \| u^m - u_N^m \|_{H^1(G)}^2 \Big)^{1/2} \le C(k + h).
\]

Thus, as in the European case (compare with Theorem 3.6.5), we obtain first order convergence in the energy norm, $\| u^M - u_N^M \|_{H^1(G)} = O(k + h)$, provided that $u(t,x)$ is sufficiently smooth.

5.4 Numerical Solution of Linear Complementarity Problems

In the following, two methods for solving the derived LCPs are described. Both methods are iterative approaches.

Table 5.1 Description of the PSOR algorithm

  Choose an initial guess $x^0 \ge c$. Choose $\omega \in (0,1]$ and $\varepsilon > 0$.
  For $k = 0, 1, 2, \dots$,
    For $i = 1,\dots,N$,
      $\tilde{x}_i^{k+1} := \dfrac{1}{A_{ii}} \Big( b_i - \sum_{j=1}^{i-1} A_{ij} x_j^{k+1} - \sum_{j=i+1}^{N} A_{ij} x_j^k \Big)$
      $x_i^{k+1} := \max\{ c_i,\ x_i^k + \omega(\tilde{x}_i^{k+1} - x_i^k) \}$
    Next $i$
    If $\| x^{k+1} - x^k \|_2 < \varepsilon$ stop else Next $k$

5.4.1 Projected Successive Overrelaxation Method

As shown in Eqs. (5.8) and (5.10), the numerical pricing of an American option by both the FDM and the FEM reduces to solving (in each time step) an LCP. Both can be written in the abstract form: find $x \in \mathbb{R}^N$ such that
\[
\begin{aligned}
Ax &\ge b, \\
x &\ge c, \\
(x - c)^\top (Ax - b) &= 0.
\end{aligned} \tag{5.11}
\]
Problem (5.11) was solved by Cryer [47] using the projected successive overrelaxation (PSOR) method for symmetric positive definite matrices $A$. In applications in finance, however, $A$ is not symmetric due to the presence of a drift term in the infinitesimal generator of the price process. It can be shown that PSOR also works for matrices which are not symmetric, but diagonally dominant. The algorithm is described in Table 5.1, where we denote by $A_{ij}$ the entries of $A$, i.e. $A = (A_{ij})_{1 \le i,j \le N}$, and we assume that $A_{ii} \ne 0$, $\forall i$. If the step $x_i^{k+1} := \max\{c_i,\ x_i^k + \omega(\tilde{x}_i^{k+1} - x_i^k)\}$ is replaced by $x_i^{k+1} = x_i^k + \omega(\tilde{x}_i^{k+1} - x_i^k)$, then we get the classical SOR for the solution of $Ax = b$. The projection onto $\{x_i \ge c_i\}$ in this step ensures the inequality condition $x \ge c$. The parameter $\omega$ is a relaxation parameter. Since the PSOR is an iterative algorithm, we have to discuss its convergence. The following result is proven in [1, Chap. 6].

Proposition 5.4.1 Assume $A$ satisfies
(i) There exist constants $C_1, C_2 > 0$ such that $C_1 v^\top v \le v^\top A v \le C_2 v^\top v$, $\forall v \in \mathbb{R}^N$;
(ii) It is diagonally dominant, i.e. $|A_{ii}| > \sum_{j \ne i} |A_{ij}|$, $\forall i$.

Furthermore, assume $\omega \in (0,1]$. Then, the sequence $\{x^k\}$ generated by PSOR converges, as $k \to \infty$, to the unique solution $x$ of (5.11).
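The steps of Table 5.1 translate almost line by line into code. The following is a minimal sketch for dense matrices using NumPy; the function name `psor`, the default tolerances and the choice $x^0 = c$ as initial guess are assumptions made for illustration.

```python
import numpy as np

def psor(A, b, c, omega=1.0, eps=1e-10, max_iter=10_000):
    """Projected SOR for the LCP  Ax >= b, x >= c, (x - c)^T (Ax - b) = 0."""
    A = np.asarray(A, dtype=float)
    b = np.asarray(b, dtype=float)
    c = np.asarray(c, dtype=float)
    x = c.copy()                         # initial guess x^0 = c satisfies x^0 >= c
    for _ in range(max_iter):
        x_old = x.copy()
        for i in range(len(b)):
            # Gauss-Seidel value: updated entries for j < i, old entries for j > i
            s = A[i, :i] @ x[:i] + A[i, i + 1:] @ x_old[i + 1:]
            x_tilde = (b[i] - s) / A[i, i]
            # relaxation followed by projection onto {x_i >= c_i}
            x[i] = max(c[i], x_old[i] + omega * (x_tilde - x_old[i]))
        if np.linalg.norm(x - x_old) < eps:
            return x
    raise RuntimeError("PSOR did not converge within max_iter sweeps")
```

For the tridiagonal matrices arising in one space dimension, the two inner products reduce to at most two multiplications per row, so a production code would exploit the band structure rather than slice dense rows.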

Table 5.2 Description of the primal–dual active set algorithm

  Choose an initial guess $x^0 \ge c$, $\lambda^0 \ge 0$. Choose $\varepsilon > 0$.
  For $k = 0, 1, 2, \dots$,
    Set $I_k = \{ i : \lambda_i^k + k_1(c_i^k - x_i^k) \le 0 \}$, $A_k = \{ i : \lambda_i^k + k_1(c_i^k - x_i^k) > 0 \}$
    Solve $Ax^{k+1} + \lambda^{k+1} = b$, $x^{k+1} = c$ on $A_k$, $\lambda^{k+1} = 0$ on $I_k$.
    If $\| x^{k+1} - x^k \|_2 < \varepsilon$ stop else Next $k$

Note that assumption (i) of Proposition 5.4.1 ensures the existence of a unique solution of (5.11).

5.4.2 Primal–Dual Active Set Algorithm

Problem (5.11) can be formulated as follows: find $x, \lambda \in \mathbb{R}^N$ such that
\[
\begin{aligned}
Ax + \lambda &= b, \\
x &\ge c, \\
\lambda &\ge 0, \\
(x - c)^\top \lambda &= 0.
\end{aligned} \tag{5.12}
\]
Problem (5.12) was solved by Ito and Kunisch [84] using the primal–dual active set algorithm, which can also be formulated as a semi-smooth Newton method for P-matrices $A$. A matrix is called a P-matrix if all its principal minors are positive. This can be shown to hold in our setup for the FEM discretization as well as for the FDM discretization for sufficiently small time-step $k$. The complementarity system in (5.12) can equivalently be expressed as $C(x,\lambda) = 0$, where $C(x,\lambda) := \lambda - \max(0, \lambda + k_1(c - x))$, for each $k_1 > 0$. Therefore, (5.12) is equivalent to
\[
\begin{aligned}
Ax + \lambda &= b, \\
C(x,\lambda) &= 0.
\end{aligned} \tag{5.13}
\]
The primal–dual active set algorithm is based on using (5.13) for a prediction strategy, i.e. given a pair $(x,\lambda)$, the active and inactive sets are given as
\[
I = \{ i : \lambda_i + k_1(c_i - x_i) \le 0 \}, \qquad A = \{ i : \lambda_i + k_1(c_i - x_i) > 0 \}.
\]
This leads to the algorithm given in Table 5.2.
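The active set iteration can be sketched as follows. This is an illustration under stated assumptions rather than the implementation behind Table 5.2: the function name `primal_dual_active_set` is hypothetical, and the sketch takes the multiplier to be $\lambda := Ax - b$, so that $\lambda \ge 0$ expresses the inequality $Ax \ge b$ of (5.11) and the linear solve reads $Ax - \lambda = b$. The active and inactive sets are chosen exactly as above.

```python
import numpy as np

def primal_dual_active_set(A, b, c, k1=1.0, eps=1e-12, max_iter=100):
    """Active set sketch for  Ax >= b, x >= c, (x - c)^T (Ax - b) = 0,
    with the multiplier lam := Ax - b (sign convention assumed here)."""
    A = np.asarray(A, dtype=float)
    b = np.asarray(b, dtype=float)
    c = np.asarray(c, dtype=float)
    x = c.copy()
    lam = np.zeros_like(b)
    for _ in range(max_iter):
        active = lam + k1 * (c - x) > 0        # constraint predicted to be binding
        inactive = ~active
        x_new = c.copy()                       # x = c on the active set
        if inactive.any():
            # on the inactive set lam = 0, i.e. (A x - b)_i = 0
            rhs = b[inactive] - A[np.ix_(inactive, active)] @ c[active]
            x_new[inactive] = np.linalg.solve(A[np.ix_(inactive, inactive)], rhs)
        lam = np.where(active, A @ x_new - b, 0.0)
        converged = np.linalg.norm(x_new - x) < eps
        x = x_new
        if converged:
            return x, lam
    raise RuntimeError("active set iteration did not converge")
```

Each iteration costs one linear solve on the inactive set, in contrast to the cheap but more numerous sweeps of PSOR.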


Fig. 5.2 American option price (top) and corresponding exercise boundary (bottom)

Proposition 5.4.2 Let $A$ satisfy (i) and (ii) from Proposition 5.4.1 and let $(x^*, \lambda^*)$ be the unique solution of (5.13). Then the primal–dual active set method converges superlinearly to $(x^*, \lambda^*)$, provided that $\| x^0 - x^* \|_2^2 + \| \lambda^0 - \lambda^* \|_2^2$ is sufficiently small.

A proof of this result can be found in [84, Theorem 3.1].

Example 5.4.3 Consider an American put with strike $K = 100$ and maturity $T = 1$. For $\sigma = 0.3$, $r = 0.1$, we compute the price of the option (at $t = T$) as well as the free boundary $s^*(t)$. The results are shown in Fig. 5.2, where we also plot the option price of the corresponding European option, which is, due to the single exercise right, lower than the price of the American contract.
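A computation of this kind can be sketched with the finite difference LCP (5.8) and PSOR. The sketch below makes several assumptions that are not taken from the text: the truncation $R = 8$, the mesh sizes, and the tridiagonal form $G^{BS} = \frac12\sigma^2 \mathbf{R} + (\frac12\sigma^2 - r)\mathbf{C} + r\mathbf{I}$ with $\mathbf{R} = h^{-2}\,\mathrm{tridiag}(-1,2,-1)$ and $\mathbf{C} = (2h)^{-1}\,\mathrm{tridiag}(-1,0,1)$. The boundary values enter the right hand side as $k f^\pm$ in the first and last row. The names `american_put_fd` and `price_at` are hypothetical.

```python
import math

def american_put_fd(K=100.0, T=1.0, sigma=0.3, r=0.1, R=8.0, N=399, M=100,
                    omega=1.0, eps=1e-8):
    """Backward Euler + PSOR for the American put in log-price x on (-R, R)."""
    h, k = 2.0 * R / (N + 1), T / M
    x = [-R + i * h for i in range(1, N + 1)]
    payoff = [max(0.0, K - math.exp(xi)) for xi in x]
    diff = 0.5 * sigma**2 / h**2
    conv = (0.5 * sigma**2 - r) / (2.0 * h)
    lo = k * (-diff - conv)               # sub-diagonal of I + k*G
    di = 1.0 + k * (2.0 * diff + r)       # diagonal of I + k*G
    up = k * (-diff + conv)               # super-diagonal of I + k*G
    g_left = max(0.0, K - math.exp(-R))   # boundary value g(e^{-R})
    g_right = max(0.0, K - math.exp(R))   # boundary value g(e^{R})
    u = payoff[:]                         # initial condition in time-to-maturity
    for _ in range(M):
        rhs = u[:]
        rhs[0] += k * g_left * (diff + conv)     # k * f^-
        rhs[-1] += k * g_right * (diff - conv)   # k * f^+
        for _ in range(10_000):                  # PSOR sweeps, updating u in place
            change = 0.0
            for i in range(N):
                s = rhs[i]
                if i > 0:
                    s -= lo * u[i - 1]
                if i < N - 1:
                    s -= up * u[i + 1]
                new = max(payoff[i], u[i] + omega * (s / di - u[i]))
                change += (new - u[i]) ** 2
                u[i] = new
            if math.sqrt(change) < eps:
                break
    return x, u

def price_at(x, u, s):
    """Linear interpolation of the grid values in log-price at spot s."""
    h = x[1] - x[0]
    i = int((math.log(s) - x[0]) // h)
    w = (math.log(s) - x[i]) / h
    return (1.0 - w) * u[i] + w * u[i + 1]
```

Calling `x, u = american_put_fd()` followed by `price_at(x, u, 100.0)` returns the approximate at-the-money price; the exercise boundary can be read off as the largest grid point at which `u` still coincides with the payoff.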

5.5 Further Reading

Approximate, semi-analytic solutions of American option pricing problems can, for example, be found in Barone-Adesi and Whaley [12], Carr [33] or Geske and Johnson [70]. An overview of various methods for pricing an American put in a Black–Scholes setting is given in Barone-Adesi [11]. A rigorous treatment is provided by Jaillet et al. [98] where the Brennan and Schwartz algorithm [25] is used for the discretization. A semi-smooth Newton approach is analyzed in Ito et al. [84, 92] and Hager and Wohlmuth [77]. Holtz and Kunoth [87] provide high-order B-spline approximations for American puts in the Black–Scholes model and the corresponding Greeks. Computable a posteriori error bounds on the discretization error in the numerical solution of American put contracts in a Black–Scholes setting have been obtained in [126].

Chapter 6

Exotic Options

Options with more sophisticated rules than those for plain vanillas are called exotic options. There are different types. Path dependent options depend on the whole history of the underlying and not just on the realization at maturity. In particular, we consider barrier options which depend on price levels being attained over a period and Asian options which depend on the average price of the option’s underlying over a period. Furthermore, we look at options which have different exercise styles like compound options which are options on options and swing options which have multiple exercise rights. We assume that the dynamics of the stock price is modeled by a geometric Brownian motion.

6.1 Barrier Options

Barrier options differ from vanillas in the sense that the option contract is triggered if the price of the underlying hits some barrier $B > 0$. To be more specific, consider a contract which pays a specified amount at maturity $T$ provided during $0 \le t \le T$ the price $S$ does not cross a specified barrier $B$ either from above, the so-called down-and-out barrier option, or from below, called up-and-out barrier option. If the barrier is crossed before $T$, the option expires worthless. A knock-in barrier option becomes the corresponding European vanilla when the barrier $B$ is crossed at time $0 \le t \le T$, e.g. the up-and-in call becomes the European call when $B$ is crossed from below. We assume here for convenience that $B$ is constant. A European plain vanilla option pays the same at $T$ as a down-and-out plus a down-and-in with the same barrier $B$ and of the same type (call/put) as the plain vanilla. The standard no arbitrage consideration shows that pricing a knock-in barrier reduces to pricing the corresponding knock-out barrier contract and a plain European vanilla. If we denote the value of a down-and-out barrier option by $V_{do}$, the value of an up-and-out barrier option by $V_{uo}$ and correspondingly $V_{di}$, $V_{ui}$ for the knock-in barrier option, we have
\[
V_{di}(t,s) = V(t,s) - V_{do}(t,s), \qquad V_{ui}(t,s) = V(t,s) - V_{uo}(t,s). \tag{6.1}
\]

N. Hilber et al., Computational Methods for Quantitative Finance, Springer Finance, DOI 10.1007/978-3-642-35401-4_6, © Springer-Verlag Berlin Heidelberg 2013


Therefore, it is sufficient to consider the prices of knock-out barrier contracts. For notational simplicity, we omit the subscripts. Let $r \ge 0$ be the constant interest rate and let $\tau_B = \inf\{t \ge 0 \mid S_t = B\}$ be the first hitting time of $B$ by the process $S$. In real price, the value of the down-and-out option $V$ is then given by
\[
V_{do}(t,s) = E\big[ e^{r(t-T)} g(S_T) 1_{\{T < \tau_B\}} \mid S_t = s \big].
\]

[…]

The weak formulation on a bounded domain $G = [0,R]$, $R > 0$, and written in time-to-maturity, for the bond price in the CIR model reads:

Find $v \in L^2(J; W_{1/2,\mu}) \cap H^1(J; H_\mu)$ such that
\[
\begin{aligned}
(\partial_t v, w)_\mu + a^{CIR}_{1/2,\mu}(v, w) &= 0, \quad \forall w \in W_{1/2,\mu}, \ \text{a.e. in } J, \\
v(0) &= 1,
\end{aligned} \tag{7.6}
\]
where we denote by $(\cdot,\cdot)_\mu$ the inner product in $H_\mu(G)$ and
\[
\begin{aligned}
a^{CIR}_{1/2,\mu}(\varphi,\phi) := {} & \frac12\sigma^2 \int_0^R r^{1+2\mu}\,\partial_r\varphi\,\partial_r\phi\,dr + \Big(\frac12 + \mu\Big)\sigma^2 \int_0^R r^{2\mu}\,\partial_r\varphi\,\phi\,dr \\
& - \int_0^R (\alpha - \beta r)\,r^{2\mu}\,\partial_r\varphi\,\phi\,dr + \int_0^R r^{2\mu+1}\,\varphi\,\phi\,dr.
\end{aligned}
\]

The spaces $W_{1/2,\mu}(G)$ and $H_\mu(G)$ are defined in (4.29) and (4.33) with $\rho = 1/2$ and $\mu \in (-1/2, 0)$. Applying Theorem 3.2.2, one can show the well-posedness of the pricing problem.

Remark 7.1.4 The localization of the pricing problem (7.2) to a bounded domain cannot be justified as in Theorem 4.3.1 as [143, Theorem 25.18] is not applicable. Instead, we can use a different approach as in Theorem 9.4.1 to rigorously justify the use of homogeneous Dirichlet boundary conditions for large $r$. Alternatively, we could use the approach described in [136] using local times and leading to homogeneous Neumann conditions for large $r$.

A quantity that is often considered next to the bond price is the yield, which for a given maturity $T$ is given by $Y(t,r) := -\frac{1}{T-t} \log B(t,r)$. It is the constant rate of continuously compounded interest at which an investment of $B(t,r)$ units of currency at time $t$ has to be made to obtain one unit of currency at maturity. The corresponding graph for different maturities $T > t$ is called a yield curve and is often considered in the financial context. Typical shapes of yield curves are normal, i.e. monotonically increasing in $T$, inverse, i.e. monotonically decreasing in $T$, and humped, i.e. having exactly one local maximum and no local minimum in $T$.

Example 7.1.5 We consider the CIR model with parameters $\alpha = 0.03$, $\beta = 0.5$ and $\sigma = 0.5$. The bond prices are computed by solving PDE (7.6) and the yield is obtained by postprocessing the bond prices. We plot in Fig. 7.1 the yield curve for different initial values of $r_0$. It can be seen that the three types of yield curves mentioned above (normal, inverse and humped) can be reproduced. We refer to [102] for further details.

Remark 7.1.6 For other types of interest rate models, such as the Vasicek model, the well-posedness results can be obtained as in Sect. 4.5.2.
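The postprocessing step from bond prices to yields can be illustrated without a PDE solver. The sketch below swaps in the well-known closed-form CIR bond price in place of the finite element solution of (7.6); it assumes short rate dynamics $dr_t = (\alpha - \beta r_t)\,dt + \sigma\sqrt{r_t}\,dW_t$, which is the parametrization suggested by the drift and diffusion terms of the bilinear form. The function names are hypothetical.

```python
import math

def cir_bond_price(r, tau, alpha, beta, sigma):
    """Closed-form zero coupon bond price in the CIR model for short rate r
    and time-to-maturity tau (assumed dynamics: see lead-in)."""
    h = math.sqrt(beta**2 + 2.0 * sigma**2)
    denom = 2.0 * h + (beta + h) * math.expm1(h * tau)
    a = (2.0 * h * math.exp(0.5 * (beta + h) * tau) / denom) ** (2.0 * alpha / sigma**2)
    c = 2.0 * math.expm1(h * tau) / denom
    return a * math.exp(-c * r)

def cir_yield(r, tau, alpha, beta, sigma):
    """Yield Y = -log(B) / (T - t) for time-to-maturity tau = T - t > 0."""
    return -math.log(cir_bond_price(r, tau, alpha, beta, sigma)) / tau

# Yield curves for the parameters of Example 7.1.5 and a few initial rates r0
maturities = [0.25 * j for j in range(1, 41)]
curves = {r0: [cir_yield(r0, tau, 0.03, 0.5, 0.5) for tau in maturities]
          for r0 in (0.01, 0.05, 0.10)}
```

Such closed-form values also give a convenient reference against which a discretization of (7.6) can be checked.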

7.2 Interest Rate Derivatives

Derivatives in interest rate markets are usually not written on the bonds directly, but on simply compounded interest rates which are defined as follows. To emphasize the dependency on the maturity, we now denote the bond price by $B(t,T,r)$.

Definition 7.2.1 The simply compounded interest rate prevailing at time $t$ for the maturity $T$ is denoted by $L(t,T,r)$ and is the constant rate at which an investment has to be made to produce an amount of one unit of currency at maturity, starting from $B(t,T,r)$ units of currency at time $t$, when accruing occurs proportionally to the investment time, i.e.
\[
L(t,T,r) := \frac{1 - B(t,T,r)}{(T-t)\,B(t,T,r)}.
\]

Fig. 7.1 Yield curves in the CIR model with parameters α = 0.03, β = 0.5, σ = 0.5 and different initial conditions r0

We now turn to the definition of forward rates. Forward rates are characterized by three time instants, namely the time $t$ at which the rate is considered, its expiry $T$ and its maturity $T_1$, with $t \le T \le T_1$. Forward rates are interest rates that can be locked in today for an investment in a future time period, and are set consistently with the current structure of zero coupon bonds. We can define a forward rate through a forward rate agreement. This contract gives its holder an interest rate payment for the period between $T$ and $T_1$. At the maturity $T_1$ a fixed payment $K$ is exchanged for a variable payment $L(T,T_1,r)$. The value of the contract in $T_1$ is given as $(T_1 - T)(K - L(T,T_1,r_T))$. Recalling the definition of $L(T,T_1,r_T)$, the value $V(t,r)$ of the contract at $t$ is given as
\[
V(t,r) = B(t,T_1,r)(T_1 - T)K - B(t,T,r) + B(t,T_1,r).
\]
The value of $K$ that renders the contract fair at time $t$, i.e. $V(t,r) = 0$, is called the simply compounded forward interest rate.

Definition 7.2.2 The simply compounded forward interest rate at time $t$ for the expiry $T > t$ with maturity $T_1 > T$ at short rate $r$ is denoted by $F(t,T,T_1,r)$ and defined by
\[
F(t,T,T_1,r) := \frac{1}{T_1 - T} \Big( \frac{B(t,T,r)}{B(t,T_1,r)} - 1 \Big).
\]
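The relation between the two rate definitions and the fair strike of the forward rate agreement can be checked numerically. The sketch below takes the bond prices as given inputs; the function names and the sample bond prices in the usage lines are made up for illustration.

```python
def simple_rate(bond, t, T):
    """Simply compounded rate L(t, T) from the bond price B(t, T)."""
    return (1.0 - bond) / ((T - t) * bond)

def forward_rate(bond_T, bond_T1, T, T1):
    """Simply compounded forward rate F(t, T, T1) from B(t, T) and B(t, T1)."""
    return (bond_T / bond_T1 - 1.0) / (T1 - T)

def fra_value(bond_T, bond_T1, T, T1, K):
    """Value at time t of the forward rate agreement with fixed rate K."""
    return bond_T1 * (T1 - T) * K - bond_T + bond_T1

# Hypothetical bond prices B(t, T) = 0.97 and B(t, T1) = 0.94 with T = 1, T1 = 1.5
F = forward_rate(0.97, 0.94, 1.0, 1.5)
fair_value = fra_value(0.97, 0.94, 1.0, 1.5, F)   # vanishes up to rounding
```

Choosing $T = t$, so that $B(t,T,r) = 1$, reduces the forward rate to the simply compounded rate $L(t,T_1,r)$.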

Table 7.1 Description of the swaption pricing algorithm

  Set $v_0 = 1$.
  For $j = 2,\dots,n$,
    Solve (7.6) to compute bond price $B(T_1,T_j,r) = v(T_j)$.
  Next $j$
  Compute swap value $V^{PFS}(T_1,r)$ by (7.7).
  Set $v_0 = V^{PFS}(T_1,r)$.
  Solve (7.6) to compute swaption price.

The counterpart of a futures contract on stock markets are swap agreements on interest rates. A payer interest rate swap is a contract exchanging fixed and variable payments at certain time instances in the future, called the tenor structure. Let $t < T_1 < \dots < T_n < T$, then the value $V^{PFS}(t,r)$ is given as
\[
V^{PFS}(t,r) = E\Big[ \sum_{i=2}^n e^{-\int_t^{T_i} r(s)\,ds} (T_i - T_{i-1}) \big( L(T_{i-1},T_i,r_{T_{i-1}}) - K \big) \,\Big|\, r_t = r \Big],
\]
whereas the value of a receiver interest rate swap is given as $V^{RIS}(t,r) = -V^{PFS}(t,r)$. Using the definition of a simply compounded forward interest rate (Definition 7.2.2), we can view a payer interest rate swap as a portfolio of forward rate agreements and obtain the following representation for the value $V^{PFS}(t,r)$:
\[
\begin{aligned}
V^{PFS}(t,r) &= E\Big[ \sum_{i=2}^n e^{-\int_t^{T_i} r(s)\,ds} (T_i - T_{i-1}) \big( F(t,T_{i-1},T_i,r_t) - K \big) \,\Big|\, r_t = r \Big] \\
&= B(t,T_1,r) - B(t,T_n,r) - \sum_{i=2}^n (T_i - T_{i-1})\,K\,B(t,T_i,r).
\end{aligned} \tag{7.7}
\]

Another example of derivatives on interest rates are swaptions. A European payer swaption gives its holder the right but not the obligation to enter a payer interest rate swap at a future time, the swaption maturity. The value $V(t,r)$ of the payer swaption, for which the maturity coincides with the first reset date $T_1$, is given as
\[
V(t,r) = E\Big[ e^{-\int_t^{T_1} r(s)\,ds} \times \Big( \sum_{i=2}^n e^{-\int_{T_1}^{T_i} r(s)\,ds} (T_i - T_1) \big( F(T_1,T_{i-1},T_i,r_{T_1}) - K \big) \Big)_+ \,\Big|\, r_t = r \Big].
\]

In contrast to a swap, it cannot be decomposed into a portfolio of simpler products. However, we may proceed as in the preceding chapter: first, we compute the value of the swap contract underlying the swaption which can be priced using the PDE (7.2). Second, the swaption can then be priced as a compound option, where the underlying is the zero coupon bond. This is described schematically in Table 7.1.


Fig. 7.2 Swaption price in the CIR model

Example 7.2.3 Consider the CIR model with parameters α = 0.05, β = 0.2, σ = 0.3, K = 0.05 and tenor structure T1 = 0.25, T2 = 0.5, T3 = 0.75, T4 = 1, T5 = 1.5, . . ., T10 = 4. The swaption price is shown in Fig. 7.2. Remark 7.2.4 A widely used extension of the models described above is the consideration of a deterministic shift function, i.e. instead of considering the solution rt of (7.1), the short rate model rt + ϕt is considered. This extension can be fitted to any observed term structure. The pricing of derivatives in the extended model is analogous to the described method applying a change of variable.

7.3 Further Reading

For an introduction to interest rate models, we refer to Brigo and Mercurio [29]. Libor market models in the Lévy setting were considered by Eberlein and Özkan [60]. For theoretical results, we refer to Keller-Ressel et al. [101]. The equations arising in this context can be solved as described in Chap. 10. Recently, forward rate models have received significant attention; we refer to Carmona and Tehranchi [31] and Hepperger [78] for details. Keller-Ressel and Steiner [102] consider attainable yield curve shapes in an affine setting.

Chapter 8

Multi-asset Options

In Chap. 6, we considered exotic options written on a single underlying. Further examples of exotic options are given by the so-called multi-asset options. These are options derived from $d \ge 2$ underlying risky assets, whose price movement can be described by a system of SDEs. The pricing functions of multi-asset options are multivariate functions satisfying a parabolic partial differential equation in $d$ dimensions, together with an appropriate terminal value depending on the type of the option. We distinguish between different types of European multi-asset options. A basket option is an option whose payoff is linked to a portfolio or basket of underlier values. The basket can be any weighted sum of underlier values as long as the weights are all positive. A typical example of a basket option is the arithmetic mean put option with payoff $g(s) = \max\{0, \sum_{i=1}^d \alpha_i s_i - K\}$. Examples of rainbow options are better-of-options, where $g(s) = \max_{1 \le i \le d}\{\alpha_i s_i\}$, or maximum call options, with $g(s) = \max_{1 \le i \le d}\{\max\{0, s_i - K_i\}\}$. A quanto option (also called cross-currency option) is a vanilla option on a foreign underlying, but with a payout in domestic currency. The payout of the option is converted to the domestic currency at expiration, at a predefined exchange rate $s_2$. The payoff function of a call is $g(s_1,s_2) = s_2 \max\{0, s_1 - K\}$.

8.1 Pricing Equation

As in the case of a single underlying, the price of a multi-asset option on $d$ assets is given as the conditional expectation
\[
V(t,x) = E\big[ e^{-\int_t^T r(X_s)\,ds}\, g(X_T) \mid X_t = x \big],
\]
where $X_t = (X_t^1,\dots,X_t^d)^\top$ is an $\mathbb{R}^d$-valued stochastic process modeling the dynamics of the $d$ assets, $r \in C^0(\mathbb{R}^d; \mathbb{R}_{\ge 0})$ is the deterministic interest rate, $g : \mathbb{R}^d \to \mathbb{R}_{\ge 0}$ denotes the payoff of the option and $\mathbb{R}_{\ge 0}$ the non-negative real numbers.

N. Hilber et al., Computational Methods for Quantitative Finance, Springer Finance, DOI 10.1007/978-3-642-35401-4_8, © Springer-Verlag Berlin Heidelberg 2013

We describe the process $X$ in more detail. Let $W$ be an $\mathbb{R}^n$-valued Brownian motion. We assume that the $i$th component of the process $X$ evolves according to
\[
dX_t^i = b_i(X_t)\,dt + \sum_{j=1}^n \Sigma_{ij}(X_t)\,dW_t^j, \qquad X_0^i = Z^i, \qquad i = 1,\dots,d. \tag{8.1}
\]
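Sample paths of a system of the form (8.1) can be generated with the Euler–Maruyama scheme, a standard time stepping method that is not discussed in the text and is used here purely for illustration. The function name `euler_maruyama` and its arguments are assumptions: `b` and `Sigma` are callables returning the drift vector in $\mathbb{R}^d$ and the $d \times n$ diffusion matrix.

```python
import numpy as np

def euler_maruyama(b, Sigma, z, T, steps, rng):
    """Simulate one path of dX = b(X) dt + Sigma(X) dW with X_0 = z."""
    z = np.asarray(z, dtype=float)
    dt = T / steps
    path = np.empty((steps + 1, z.size))
    path[0] = z
    for m in range(steps):
        x = path[m]
        S = np.asarray(Sigma(x), dtype=float)               # d x n matrix
        dW = rng.normal(0.0, np.sqrt(dt), size=S.shape[1])  # Brownian increments
        path[m + 1] = x + np.asarray(b(x), dtype=float) * dt + S @ dW
    return path

# Usage: two assets driven by two Brownian motions with multiplicative coefficients
rng = np.random.default_rng(0)
vol = np.array([[0.3, 0.0], [0.1, 0.2]])
path = euler_maruyama(lambda s: 0.05 * s, lambda s: vol * s[:, None],
                      z=[100.0, 80.0], T=1.0, steps=250, rng=rng)
```

The usage lines correspond to multiplicative coefficients $b_i(s) = r s_i$ and $\Sigma_{ij}(s) = \Sigma_{ij} s_i$ with made-up parameter values.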

Herewith, we assume the coefficients $b : \mathbb{R}^d \to \mathbb{R}^d$, $\Sigma : \mathbb{R}^d \to \mathbb{R}^{d \times n}$ satisfy the usual Lipschitz continuity and linear growth condition, i.e. there exists a constant $C > 0$ such that for all $x, y \in \mathbb{R}^d$
\[
|b(x) - b(y)| + |\Sigma(x) - \Sigma(y)| \le C|x - y|, \tag{8.2}
\]
\[
|b(x)| + |\Sigma(x)| \le C(1 + |x|), \tag{8.3}
\]
where by $|\cdot| : \mathbb{R}^{m \times n} \to \mathbb{R}_{\ge 0}$, $A \mapsto |A| := |A|_F = \big( \sum_{1 \le i \le m,\, 1 \le j \le n} |A_{ij}|^2 \big)^{1/2}$ we denote the Frobenius norm (Euclidean norm) of an $m \times n$-matrix $A$. We further assume the existence of a random variable $Z = (Z^1,\dots,Z^d)^\top \in \mathbb{R}^d$ which is independent of the $\sigma$-algebra generated by $W$ and satisfies $E[|Z|^2] < \infty$. Under these assumptions, the existence and uniqueness result of the scalar case (Theorem 1.2.6) carries over to the SDE (8.1). To prove a multidimensional analog to Theorem 4.1.4, we need a generalization of Propositions 4.1.1, 6.2.1 to arbitrary dimensions. To this end, denote by $Q := \Sigma\Sigma^\top$ the covariance matrix of the process $X$.

Remark 8.1.1 We note that $\int_0^t Q_s\,ds = \langle X, X \rangle_t$, where $\langle X, X \rangle_t$ denotes the quadratic variation process.

Proposition 8.1.2 Denote by $A$ the infinitesimal generator of $X$ which is, for functions $f \in C^2(\mathbb{R}^d)$ with bounded derivatives, given by
\[
(Af)(x) = \frac12 \operatorname{tr}[Q(x) D^2 f(x)] + b(x)^\top \nabla f(x), \tag{8.4}
\]
using the notation as in (2.2). Then, the process $M_t := f(X_t) - \int_0^t (Af)(X_s)\,ds$ is a martingale with respect to the filtration of $W$.

Proof By the multidimensional Itô formula, we have with $d\langle X^i, X^j \rangle_t = Q_{ij}(X_t)\,dt$
\[
\begin{aligned}
df(X_t) &= \sum_{i=1}^d \partial_{x_i} f(X_t)\,dX_t^i + \frac12 \sum_{i,j=1}^d \partial_{x_i x_j} f(X_t)\,d\langle X^i, X^j \rangle_t \\
&= b(X_t)^\top \nabla f(X_t)\,dt + \sum_{i=1}^d \partial_{x_i} f(X_t) \sum_{j=1}^n \Sigma_{ij}(X_t)\,dW_t^j + \frac12 \sum_{i,j=1}^d (D^2 f)_{ij}(X_t)\,Q_{ij}(X_t)\,dt \\
&= \Big( b(X_t)^\top \nabla f(X_t) + \frac12 \operatorname{tr}[Q(X_t) D^2 f(X_t)] \Big)\,dt + \sum_{i=1}^d \partial_{x_i} f(X_t) \sum_{j=1}^n \Sigma_{ij}(X_t)\,dW_t^j.
\end{aligned}
\]

Using the boundedness of the derivatives of $f$, the linear growth of the coefficient $\Sigma$ (8.3) and proceeding as in the proof of Proposition 4.1.1, we find that the process $\int_0^t \sum_{i=1}^d \sum_{j=1}^n \partial_{x_i} f(X_\tau)\,\Sigma_{ij}(X_\tau)\,dW_\tau^j$ is a martingale with respect to the filtration of $W$. □

Repeating the arguments which lead to Theorem 4.1.4 yields

Theorem 8.1.3 Let $V \in C^{1,2}(J \times \mathbb{R}^d) \cap C^0(\overline{J} \times \mathbb{R}^d)$ with bounded derivatives in $x$ be a solution of
\[
\partial_t V + AV - rV = 0 \ \text{ in } J \times \mathbb{R}^d, \qquad V(T,x) = g(x) \ \text{ in } \mathbb{R}^d, \tag{8.5}
\]
with $A$ as in (8.4). Then, $V(t,x)$ can also be represented as
\[
V(t,x) = E\big[ e^{-\int_t^T r(X_s)\,ds}\, g(X_T) \mid X_t = x \big].
\]

We now consider the pricing of a multi-asset option in a Black–Scholes market model, i.e. we assume in (8.1) that $b_i(s) = r s_i$, $\Sigma_{ij}(s) = \Sigma_{ij} s_i$, $1 \le i,j \le d$, with $r \ge 0$ and $\Sigma_{ij} \ge 0$ constants. Furthermore, we denote by $\mu \in \mathbb{R}^d$ the column vector given by $\mu := (Q_{11}/2 - r, \dots, Q_{dd}/2 - r)^\top$. Switching to log-price $x_i = \log(s_i)$, $i = 1,\dots,d$, and to time-to-maturity $t \to T - t$, we obtain that $v(t,x_1,\dots,x_d) := V(T - t, e^{x_1},\dots,e^{x_d})$ solves
\[
\partial_t v - A^{BS} v + rv = 0 \ \text{ in } J \times \mathbb{R}^d, \qquad v(0,x) = g(e^x) \ \text{ in } \mathbb{R}^d. \tag{8.6}
\]
Herewith, by a slight abuse of notation, we let $g(e^x) := g(e^{x_1},\dots,e^{x_d})$, and $A^{BS}$ denotes the differential operator with constant coefficients
\[
(A^{BS} f)(x) := \frac12 \operatorname{tr}[Q D^2 f(x)] - \mu^\top \nabla f(x).
\]

8.2 Variational Formulation

The variational formulation of (8.6) will require Sobolev spaces for functions of several variables. For $m \in \mathbb{N}$ and a bounded Lipschitz and simply connected domain $G \subseteq \mathbb{R}^d$, it is sufficient to introduce $H^m(G)$ as
\[
H^m(G) := \{ u \in L^2(G) : D^n u \in L^2(G) \ \text{for } |n| \le m \}. \tag{8.7}
\]
Here, $D^n u$ has to be understood in the weak sense, i.e. $D^n u$ is the weak derivative of $u$ satisfying $\int_G D^n u\,\phi\,dx = (-1)^{|n|} \int_G u\,D^n \phi\,dx$, $\forall \phi \in C_0^\infty(G)$. The space $H^m(G)$ is equipped with the norm
\[
\| u \|_{H^m(G)}^2 := \sum_{|n| \le m} \| D^n u \|_{L^2(G)}^2,
\]
in particular, $\| u \|_{H^1(G)}^2 = \int_G |u|^2\,dx + \int_G |\nabla u|^2\,dx$. For $G \subset \mathbb{R}^d$ as above, we introduce spaces $H_0^m(G)$ consisting of trace-zero functions,
\[
H_0^m(G) := \overline{C_0^\infty(G)}^{\,\|\cdot\|_{H^m(G)}}.
\]

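We are now able to give the weak formulation of Eq. (8.6). It reads:

Find $u \in L^2(J; H^1(\mathbb{R}^d)) \cap H^1(J; L^2(\mathbb{R}^d))$ such that
\[
\begin{aligned}
(\partial_t u, v) + a^{BS}(u, v) &= 0, \quad \forall v \in H^1(\mathbb{R}^d), \ \text{a.e. in } J, \\
u(0) &= u_0,
\end{aligned} \tag{8.8}
\]
where $u_0(x) := g(e^x)$ and the bilinear form $a^{BS}(\cdot,\cdot) : H^1(\mathbb{R}^d) \times H^1(\mathbb{R}^d) \to \mathbb{R}$ is given by
\[
a^{BS}(\phi,\varphi) = \frac12 \int_{\mathbb{R}^d} (\nabla\phi)^\top Q \nabla\varphi\,dx + \int_{\mathbb{R}^d} \mu^\top \nabla\phi\,\varphi\,dx + r \int_{\mathbb{R}^d} \phi\,\varphi\,dx. \tag{8.9}
\]

Proposition 8.2.1 Assume the symmetric matrix $Q$ is positive definite, i.e. there exists a constant $\gamma > 0$ such that $x^\top Q x \ge \gamma\, x^\top x$, $\forall x \in \mathbb{R}^d$. Then there exist constants $C_i > 0$, $i = 1,2,3$, such that for all $\phi, \varphi \in H^1(\mathbb{R}^d)$ the following holds:
\[
|a^{BS}(\phi,\varphi)| \le C_1 \| \phi \|_{H^1(\mathbb{R}^d)} \| \varphi \|_{H^1(\mathbb{R}^d)}, \qquad a^{BS}(\phi,\phi) \ge C_2 \| \phi \|_{H^1(\mathbb{R}^d)}^2 - C_3 \| \phi \|_{L^2(\mathbb{R}^d)}^2.
\]

Proof By Hölder's inequality,
\[
|a^{BS}(\phi,\varphi)| \le \frac12 \sum_{i,j=1}^d |Q_{ij}| \int_{\mathbb{R}^d} |\nabla\phi||\nabla\varphi|\,dx + \sum_{i=1}^d |\mu_i| \int_{\mathbb{R}^d} |\nabla\phi||\varphi|\,dx + r \int_{\mathbb{R}^d} |\phi||\varphi|\,dx \le C_1 \| \phi \|_{H^1(\mathbb{R}^d)} \| \varphi \|_{H^1(\mathbb{R}^d)}.
\]
Furthermore, by the positive definiteness of $Q$ and $\int_{\mathbb{R}^d} \partial_{x_i}\phi\,\phi\,dx = \frac12 \int_{\mathbb{R}^d} \partial_{x_i}(\phi^2)\,dx = 0$,
\[
a^{BS}(\phi,\phi) \ge \frac12 \gamma \int_{\mathbb{R}^d} |\nabla\phi|^2\,dx + r \int_{\mathbb{R}^d} |\phi|^2\,dx \ge \frac12 \gamma \| \phi \|_{H^1(\mathbb{R}^d)}^2 - |r - \gamma/2|\, \| \phi \|_{L^2(\mathbb{R}^d)}^2. \qquad \square
\]

We apply the abstract existence result Theorem 3.2.2 in the spaces $V = H^1(\mathbb{R}^d)$, $H = L^2(\mathbb{R}^d)$ and obtain, for every $u_0 \in L^2(\mathbb{R}^d)$, a unique solution to the problem (8.8). As in the one-dimensional case described in Sect. 4.3, we need to reformulate the problem on a bounded domain where the condition $u_0(x) = g(e^x) \in L^2(\mathbb{R}^d)$ can be weakened. We require, similar to (4.10), the following polynomial growth condition on the multidimensional payoff function: There exist $C > 0$, $q \ge 1$ such that
\[
g(s_1,\dots,s_d) \le C \Big( \sum_{i=1}^d s_i + 1 \Big)^q \quad \text{for all } s \in \mathbb{R}^d_{\ge 0}. \tag{8.10}
\]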

This condition is satisfied by all standard multi-asset options like, e.g. basket, rainbow, spread or power options.


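8.3 Localization

The unbounded domain $\mathbb{R}^d$ of the log-price is truncated to a bounded domain $G = (-R,R)^d \subset \mathbb{R}^d$, $R > 0$. This again corresponds to approximating the option price by a knock-out barrier option
\[
v_R(t,x) = E\big[ e^{-r(T-t)} g(e^{X_T}) 1_{\{T < \tau_G\}} \mid X_t = x \big]. \tag{8.11}
\]
[…] there exist constants $C(d,T,Q)$, $\gamma_1$, $\gamma_2 > 0$, such that
\[
|v(t,x) - v_R(t,x)| \le C(d,T,Q)\, e^{-\gamma_1 R + \gamma_2 \| x \|_\infty}.
\]

Proof Let $M_T = \sup_{\tau \in [t,T]} \| X_\tau \|_\infty$. Then, with (8.10),
\[
|v(t,x) - v_R(t,x)| \le E\big[ g(e^{X_T}) 1_{\{T \ge \tau_G\}} \mid X_t = x \big] \le C(d)\, E\big[ e^{qM_T} 1_{\{M_T > R\}} \mid X_t = x \big].
\]
It suffices to show that there exists a constant $C(d,T,Q) > 0$ such that
\[
E\big[ e^{q\| X_T \|_\infty} 1_{\{\| X_T \|_\infty > R\}} \mid X_t = x \big] \le C(d,T,Q)\, e^{-\gamma_1 R + \gamma_2 \| x \|_\infty}.
\]
We have, following the proof of Theorem 4.3.1,
\[
\begin{aligned}
E\big[ e^{q\| X_T \|_\infty} 1_{\{\| X_T \|_\infty > R\}} \mid X_t = x \big] &= \int_{\mathbb{R}^d} e^{q\| z + x \|_\infty} 1_{\{\| z + x \|_\infty > R\}}\, p_{T-t}(z)\,dz \\
&\le C(d,T,Q)\, e^{q\| x \|_\infty} \sum_{i=1}^d \int_{\mathbb{R}^d} e^{(q + \| Q^{-1}\mu \|_\infty)|z_i|}\, 1_{\{\| z + x \|_\infty > R\}}\, e^{-z^\top Q^{-1} z/(2(T-t))}\,dz \\
&\le C(d,T,Q)\, e^{q\| x \|_\infty}\, e^{-(\eta - q - \| Q^{-1}\mu \|_\infty)(R - \| x \|_\infty)} \sum_{i=1}^d \int_{\mathbb{R}^d} e^{\eta|z_i|}\, e^{-z^\top Q^{-1} z/(2(T-t))}\,dz \\
&\le C(d,T,Q)\, e^{-\gamma_1 R + \gamma_2 \| x \|_\infty} \sum_{i=1}^d \int_{\mathbb{R}} e^{\eta|z_i|}\, e^{-z_i^2/(2\sigma_i^2(T-t))}\,dz_i,
\end{aligned}
\]
with $\gamma_1 = \eta - q - \| Q^{-1}\mu \|_\infty$ and $\gamma_2 = \gamma_1 + q$. Since $\int_{\mathbb{R}} e^{\eta|z_i|} e^{-z_i^2/(2\sigma_i^2(T-t))}\,dz_i < \infty$ for all $i = 1,\dots,d$ and any $\eta > 0$, we obtain the required result by choosing $\eta > q + \| Q^{-1}\mu \|_\infty$. □

The price $v_R$ of the barrier option is, under the smoothness assumption $v_R \in C^{1,2}(J \times \mathbb{R}^d) \cap C^0(\overline{J} \times \mathbb{R}^d)$, and after switching to time-to-maturity $t \to T - t$, a solution of the PDE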

\[
\begin{aligned}
\partial_t v_R - A^{BS} v_R + r v_R &= 0 && \text{in } J \times G, \\
v_R(t,x) &= 0 && \text{in } J \times G^c, \\
v_R(0,x) &= g(e^x) && \text{on } G.
\end{aligned} \tag{8.12}
\]

The weak formulation for the price $v_R$ of the barrier option on the bounded domain $G = (-R,R)^d$ reads:

Find $u_R \in L^2(J; H_0^1(G)) \cap H^1(J; L^2(G))$ such that
\[
\begin{aligned}
(\partial_t u_R, v) + a^{BS}(u_R, v) &= 0, \quad \forall v \in H_0^1(G), \ \text{a.e. in } J, \\
u_R(0) &= u_0|_G.
\end{aligned} \tag{8.13}
\]

8.4 Discretization

We derive the matrix problems analogous to the univariate case (4.14) and (4.15). Due to the product structure of the domain $G = (-R,R)^d$ and since the coefficients of the operator $A^{BS}$ are constant, the matrices $G^{BS}$ and $A^{BS}$ are Kronecker products of matrices corresponding to univariate problems. To describe the ideas and to simplify the notation, we focus in the derivation on the two-dimensional case.

Definition 8.4.1 The Kronecker product of the matrices $X \in \mathbb{R}^{m \times n}$ and $Y \in \mathbb{R}^{p \times q}$ is given by
\[
Z := X \otimes Y := \begin{pmatrix} X_{11}Y & X_{12}Y & \cdots & X_{1n}Y \\ X_{21}Y & X_{22}Y & \cdots & X_{2n}Y \\ \vdots & \vdots & \ddots & \vdots \\ X_{m1}Y & X_{m2}Y & \cdots & X_{mn}Y \end{pmatrix} \in \mathbb{R}^{mp \times nq}.
\]

For later purpose, it is sufficient to consider square matrices $X \in \mathbb{R}^{N_1 \times N_1}$, $Y \in \mathbb{R}^{N_2 \times N_2}$. Then, also $Z = X \otimes Y$ is square, $Z \in \mathbb{R}^{N \times N}$ with $N := N_1 N_2$. It follows by Definition 8.4.1 that an arbitrary entry $Z_{j,j'}$, $1 \le j, j' \le N$, is then given by
\[
X_{i_1,i_1'}\, Y_{i_2,i_2'} = Z_{N_2(i_1 - 1) + i_2,\ N_2(i_1' - 1) + i_2'} =: Z_{j,j'}, \tag{8.14}
\]
where $X_{i_1,i_1'}$, $1 \le i_1, i_1' \le N_1$, and $Y_{i_2,i_2'}$, $1 \le i_2, i_2' \le N_2$, are the entries of $X$ and $Y$, respectively.
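The index relation (8.14) can be verified directly with NumPy's `numpy.kron`, which implements the block layout of Definition 8.4.1. The helper name `kron_entry` and the random test matrices are illustrative; indices are 1-based in the helper to match the text and shifted by one when accessing arrays.

```python
import numpy as np

def kron_entry(X, Y, i1, i1p, i2, i2p):
    """Return (j, j', X[i1, i1'] * Y[i2, i2']) following (8.14), 1-based indices."""
    N2 = Y.shape[0]
    j = N2 * (i1 - 1) + i2
    jp = N2 * (i1p - 1) + i2p
    return j, jp, X[i1 - 1, i1p - 1] * Y[i2 - 1, i2p - 1]

rng = np.random.default_rng(0)
N1, N2 = 3, 4
X = rng.standard_normal((N1, N1))
Y = rng.standard_normal((N2, N2))
Z = np.kron(X, Y)

# Compare every entry of Z with the product predicted by (8.14)
index_formula_holds = all(
    np.isclose(Z[j - 1, jp - 1], value)
    for i1 in range(1, N1 + 1) for i1p in range(1, N1 + 1)
    for i2 in range(1, N2 + 1) for i2p in range(1, N2 + 1)
    for j, jp, value in [kron_entry(X, Y, i1, i1p, i2, i2p)]
)
```

The same ordering, with the first coordinate as the slow index, corresponds to flattening a two-dimensional array of grid values row by row.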

8.4.1 Finite Difference Discretization

We consider the truncated PDE (8.12) in two space variables, where the spatial operator $A^{BS}$ simplifies to
\[
A^{BS} = \frac{1}{2}Q_{11}\partial_{x_1x_1} + Q_{12}\partial_{x_1x_2} + \frac{1}{2}Q_{22}\partial_{x_2x_2} - \mu_1\partial_{x_1} - \mu_2\partial_{x_2}.
\]
To discretize $A^{BS}$ by finite difference quotients, we define, for $N_1, N_2 \in \mathbb{N}$, a grid $\mathcal{G}$ on $G = (-R, R)^2$ by
\[
\mathcal{G} := \{(x_{1,i_1}, x_{2,i_2}) \mid 1 \le i_1 \le N_1,\ 1 \le i_2 \le N_2\},
\]
where the grid points are given by
\[
(x_{1,i_1}, x_{2,i_2}) = (-R + i_1h_1,\ -R + i_2h_2), \quad h_k := 2R/(N_k+1), \quad 1 \le i_k \le N_k, \quad k = 1, 2.
\]
For $f \in C^4(G)$, we set $f_{i_1,i_2} := f(x_{1,i_1}, x_{2,i_2})$ for the function value of $f$ at the grid point $(x_{1,i_1}, x_{2,i_2}) \in \mathcal{G}$ and consider the difference quotients
\begin{align*}
\partial_{x_1x_1}f(x_{1,i_1}, x_{2,i_2}) &= h_1^{-2}(f_{i_1-1,i_2} - 2f_{i_1,i_2} + f_{i_1+1,i_2}) + O(h_1^2) =: (\delta^2_{x_1x_1}f)_{i_1,i_2} + O(h_1^2), \\
\partial_{x_2x_2}f(x_{1,i_1}, x_{2,i_2}) &= h_2^{-2}(f_{i_1,i_2-1} - 2f_{i_1,i_2} + f_{i_1,i_2+1}) + O(h_2^2) =: (\delta^2_{x_2x_2}f)_{i_1,i_2} + O(h_2^2).
\end{align*}
Furthermore,
\begin{align*}
\partial_{x_1x_2}f(x_{1,i_1}, x_{2,i_2}) &= \partial_{x_1}\big[(2h_2)^{-1}(f_{i_1,i_2+1} - f_{i_1,i_2-1}) + O(h_2^2)\big] \\
&= (4h_1h_2)^{-1}\big(f_{i_1+1,i_2+1} - f_{i_1-1,i_2+1} - f_{i_1+1,i_2-1} + f_{i_1-1,i_2-1}\big) + O(h_1^2) + O(h_2^2) \\
&=: (\delta^2_{x_1x_2}f)_{i_1,i_2} + O(h_1^2) + O(h_2^2),
\end{align*}
as well as
\begin{align*}
\partial_{x_1}f(x_{1,i_1}, x_{2,i_2}) &= (2h_1)^{-1}(f_{i_1+1,i_2} - f_{i_1-1,i_2}) + O(h_1^2) =: (\delta_{x_1}f)_{i_1,i_2} + O(h_1^2), \\
\partial_{x_2}f(x_{1,i_1}, x_{2,i_2}) &= (2h_2)^{-1}(f_{i_1,i_2+1} - f_{i_1,i_2-1}) + O(h_2^2) =: (\delta_{x_2}f)_{i_1,i_2} + O(h_2^2).
\end{align*}
Denote by $v^m_{i_1,i_2} \approx v(t_m, x_{1,i_1}, x_{2,i_2})$ an approximation of the option value $v$ on time level $t_m$ at the grid point $(x_{1,i_1}, x_{2,i_2}) \in \mathcal{G}$. We now replace the PDE $\partial_t v - A^{BS}v + rv = 0$ by the finite difference equations
\[
E^m_{i_1,i_2} = 0, \quad 1 \le i_k \le N_k, \quad k = 1, 2, \quad m = 0, \dots, M-1, \tag{8.15}
\]
with the initial condition $v^0_{i_1,i_2} = g(e^{x_{1,i_1}}, e^{x_{2,i_2}})$ and homogeneous boundary conditions $v^m_{0,i_2} = v^m_{i_1,0} = 0$, $i_k = 0, \dots, N_k+1$, $m = 0, \dots, M$. In (8.15), $E^m_{i_1,i_2}$ is the finite difference operator given by
\[
E^m_{i_1,i_2} := k^{-1}\big(v^{m+1}_{i_1,i_2} - v^m_{i_1,i_2}\big) - \big[\theta(Fv)^{m+1}_{i_1,i_2} + (1-\theta)(Fv)^m_{i_1,i_2}\big] + r\theta v^{m+1}_{i_1,i_2} + r(1-\theta)v^m_{i_1,i_2},
\]
where
\[
(Fv)^m_{i_1,i_2} := \frac{1}{2}\sum_{k=1}^{2} Q_{kk}(\delta^2_{x_kx_k}v)^m_{i_1,i_2} + Q_{12}(\delta^2_{x_1x_2}v)^m_{i_1,i_2} - \sum_{k=1}^{2}\mu_k(\delta_{x_k}v)^m_{i_1,i_2}.
\]
We write (8.15) in matrix form. Setting, for $N := N_1N_2$,
\[
u^m = (u^m_1, \dots, u^m_N)^\top, \qquad u^m_j = u^m_{N_2(i_1-1)+i_2} := v^m_{i_1,i_2},
\]
we find that (8.15) is equivalent to the following problem, in which $\Delta t = k$ denotes the time step:

Find $u^{m+1} \in \mathbb{R}^N$ such that for $m = 0, \dots, M-1$,
\[
\big(I + \theta\Delta t\,G^{BS}\big)u^{m+1} = \big(I - (1-\theta)\Delta t\,G^{BS}\big)u^m, \qquad u^0 = u_0, \tag{8.16}
\]
where the matrices $I$ and $G^{BS}$ are given by
\begin{align*}
I &:= I^1 \otimes I^2, \\
G^{BS} &:= \frac{1}{2}Q_{11}R^1 \otimes I^2 - Q_{12}C^1 \otimes C^2 + \frac{1}{2}Q_{22}I^1 \otimes R^2 + \mu_1C^1 \otimes I^2 + \mu_2I^1 \otimes C^2 + rI^1 \otimes I^2.
\end{align*}
Herewith, we denote by $R^k, C^k, I^k \in \mathbb{R}^{N_k\times N_k}$ the matrices given in (4.14) with respect to the coordinate direction $x_k$, $k = 1, 2$.
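The assembly of $G^{BS}$ and one step of (8.16) can be sketched as follows. This is an illustration under assumptions: the univariate matrices of (4.14) are not reproduced in this section, so the sketch assumes the standard forms $R = h^{-2}\,\mathrm{tridiag}(-1, 2, -1)$ and $C = (2h)^{-1}\,\mathrm{tridiag}(-1, 0, 1)$, which agree with the difference quotients above. Dense NumPy arrays are used for brevity, and the function names are hypothetical.

```python
import numpy as np

def fd_matrices(n, h):
    """Assumed univariate matrices: R ~ -d^2/dx^2, C ~ d/dx (central), I identity."""
    R = (2.0 * np.eye(n) - np.eye(n, k=1) - np.eye(n, k=-1)) / h**2
    C = (np.eye(n, k=1) - np.eye(n, k=-1)) / (2.0 * h)
    return R, C, np.eye(n)

def assemble_G_bs(Q, mu, r, R_dom, N1, N2):
    """Kronecker representation of G^BS on the grid with N1 x N2 interior points."""
    h1, h2 = 2 * R_dom / (N1 + 1), 2 * R_dom / (N2 + 1)
    R1, C1, I1 = fd_matrices(N1, h1)
    R2, C2, I2 = fd_matrices(N2, h2)
    return (0.5 * Q[0, 0] * np.kron(R1, I2) - Q[0, 1] * np.kron(C1, C2)
            + 0.5 * Q[1, 1] * np.kron(I1, R2)
            + mu[0] * np.kron(C1, I2) + mu[1] * np.kron(I1, C2)
            + r * np.kron(I1, I2))

def theta_step(G, u, dt, theta):
    """One step of (8.16): solve (I + theta*dt*G) u_new = (I - (1-theta)*dt*G) u."""
    I = np.eye(G.shape[0])
    return np.linalg.solve(I + theta * dt * G, (I - (1.0 - theta) * dt * G) @ u)

# Illustrative parameters (not taken from the text)
Q = np.array([[0.16, 0.008], [0.008, 0.01]])
mu = np.array([0.07, -0.005])
G = assemble_G_bs(Q, mu, r=0.01, R_dom=1.0, N1=7, N2=7)
u1 = theta_step(G, np.ones(G.shape[0]), dt=0.01, theta=0.5)
```

For realistic grid sizes one would store the factors as sparse matrices, since the dense Kronecker products grow like $N^2 = (N_1N_2)^2$ entries.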

8.4.2 Finite Element Discretization

In the two-dimensional case, the bilinear form $a^{BS}(\cdot,\cdot)$ in (8.9) simplifies to
\[
a^{BS}(\varphi, \phi) = \frac{1}{2}\sum_{k,\ell=1}^{2} Q_{k\ell}\int_G \partial_{x_\ell}\varphi\,\partial_{x_k}\phi\,dx + \sum_{k=1}^{2}\mu_k\int_G \partial_{x_k}\varphi\,\phi\,dx + r\int_G \varphi\phi\,dx. \tag{8.17}
\]
The discretization of the weak formulation (8.13) relies on finite element spaces $V_N \subset H_0^1(G)$ which are spanned by products of hat-functions. To be more precise, for $N_1, N_2 \in \mathbb{N}$ we consider
\begin{align*}
V_N &:= \mathrm{span}\{\varphi_j(x_1, x_2) \mid 1 \le j \le N_1N_2\} \\
&= \mathrm{span}\{\varphi_{N_2(i_1-1)+i_2}(x_1, x_2) \mid 1 \le i_k \le N_k,\ k = 1, 2\} \tag{8.18} \\
&= \mathrm{span}\{b_{i_1}(x_1)\,b_{i_2}(x_2) \mid 1 \le i_k \le N_k,\ k = 1, 2\},
\end{align*}
where $b_{i_k}(x_k)$ are the univariate hat-functions as in (3.18). Note that $\dim V_N = N := N_1N_2$. If the mesh sizes $h_k = 2R/(N_k+1)$ in both coordinate directions are constant, we have $b_{i_k}(x_k) = \max\{0, 1 - h_k^{-1}|x_k - x_{k,i_k}|\}$, $i_k = 1, \dots, N_k$, with mesh points $x_{k,i_k} := -R + h_ki_k$, $i_k = 0, \dots, N_k+1$.

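We now calculate the stiffness matrix $A^{BS} \in \mathbb{R}^{N\times N}$ using the finite element space $V_N$. We have, by (8.17),
\begin{align*}
A^{BS}_{j,j'} &= a^{BS}(\varphi_{j'}, \varphi_j) = a^{BS}(\varphi_{N_2(i_1'-1)+i_2'}, \varphi_{N_2(i_1-1)+i_2}) = a^{BS}(b_{i_1'}b_{i_2'}, b_{i_1}b_{i_2}) \\
&= \frac{Q_{11}}{2}\int_G \partial_{x_1}(b_{i_1'}b_{i_2'})\,\partial_{x_1}(b_{i_1}b_{i_2})\,dx + \frac{Q_{12}}{2}\int_G \partial_{x_1}(b_{i_1'}b_{i_2'})\,\partial_{x_2}(b_{i_1}b_{i_2})\,dx \\
&\quad + \frac{Q_{21}}{2}\int_G \partial_{x_2}(b_{i_1'}b_{i_2'})\,\partial_{x_1}(b_{i_1}b_{i_2})\,dx + \frac{Q_{22}}{2}\int_G \partial_{x_2}(b_{i_1'}b_{i_2'})\,\partial_{x_2}(b_{i_1}b_{i_2})\,dx \\
&\quad + \mu_1\int_G \partial_{x_1}(b_{i_1'}b_{i_2'})\,b_{i_1}b_{i_2}\,dx + \mu_2\int_G \partial_{x_2}(b_{i_1'}b_{i_2'})\,b_{i_1}b_{i_2}\,dx + r\int_G b_{i_1'}b_{i_2'}\,b_{i_1}b_{i_2}\,dx \\
&= \frac{Q_{11}}{2}\int_{-R}^{R} b'_{i_1'}b'_{i_1}\,dx_1\int_{-R}^{R} b_{i_2'}b_{i_2}\,dx_2 + \frac{Q_{12}}{2}\int_{-R}^{R} b'_{i_1'}b_{i_1}\,dx_1\int_{-R}^{R} b_{i_2'}b'_{i_2}\,dx_2 \\
&\quad + \frac{Q_{21}}{2}\int_{-R}^{R} b_{i_1'}b'_{i_1}\,dx_1\int_{-R}^{R} b'_{i_2'}b_{i_2}\,dx_2 + \frac{Q_{22}}{2}\int_{-R}^{R} b_{i_1'}b_{i_1}\,dx_1\int_{-R}^{R} b'_{i_2'}b'_{i_2}\,dx_2 \\
&\quad + \mu_1\int_{-R}^{R} b'_{i_1'}b_{i_1}\,dx_1\int_{-R}^{R} b_{i_2'}b_{i_2}\,dx_2 + \mu_2\int_{-R}^{R} b_{i_1'}b_{i_1}\,dx_1\int_{-R}^{R} b'_{i_2'}b_{i_2}\,dx_2 \\
&\quad + r\int_{-R}^{R} b_{i_1'}b_{i_1}\,dx_1\int_{-R}^{R} b_{i_2'}b_{i_2}\,dx_2.
\end{align*}
Using that the matrices $S$, $B$ and $M$ are given by $S_{i,i'} = \int b'_{i'}b'_i$, $B_{i,i'} = \int b'_{i'}b_i$ and $M_{i,i'} = \int b_{i'}b_i$, see, e.g. (3.22), and noting that $\int b_{i'}b'_i = -\int b'_{i'}b_i = -B_{i,i'}$, we obtain
\begin{align*}
A^{BS}_{j,j'} &= \frac{Q_{11}}{2}S_{i_1,i_1'}M_{i_2,i_2'} + \frac{Q_{12}}{2}B_{i_1,i_1'}(-B)_{i_2,i_2'} + \frac{Q_{21}}{2}(-B)_{i_1,i_1'}B_{i_2,i_2'} \\
&\quad + \frac{Q_{22}}{2}M_{i_1,i_1'}S_{i_2,i_2'} + \mu_1B_{i_1,i_1'}M_{i_2,i_2'} + \mu_2M_{i_1,i_1'}B_{i_2,i_2'} + rM_{i_1,i_1'}M_{i_2,i_2'}.
\end{align*}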

Denote by $S^k$ the matrix with entries $S_{i_k,i_k'}$, $k = 1, 2$, and analogously for $B^k$, $M^k$. Using that the covariance matrix $Q$ is symmetric and Eq. (8.14), we have shown

Proposition 8.4.2 Assume $d = 2$ in (8.9) and assume that the finite element space $V_N$ is as in (8.18). Then, the stiffness matrix $A^{BS}$ to the bilinear form $a^{BS}(\cdot,\cdot)$ is given by
\[
A^{BS} = \frac{Q_{11}}{2}S^1 \otimes M^2 - Q_{12}B^1 \otimes B^2 + \frac{Q_{22}}{2}M^1 \otimes S^2 + \mu_1B^1 \otimes M^2 + \mu_2M^1 \otimes B^2 + rM^1 \otimes M^2.
\]

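Since the Kronecker product is distributive, we can reduce the number of products by writing
\[
A^{BS} = \Big(\frac{Q_{11}}{2}S^1 + \mu_1B^1 + rM^1\Big) \otimes M^2 + \big(-Q_{12}B^1 + \mu_2M^1\big) \otimes B^2 + \frac{Q_{22}}{2}M^1 \otimes S^2.
\]
Proposition 8.4.2 can be extended to arbitrary dimensions $d > 2$. The corresponding finite element space $V_N$ is spanned by the $d$-fold tensor products of univariate hat-functions, i.e.
\begin{align*}
V_N &:= \mathrm{span}\{\varphi_j(x_1, \dots, x_d) \mid 1 \le j \le N\} \tag{8.19} \\
&= \mathrm{span}\{b_{i_1}(x_1)\cdots b_{i_d}(x_d) \mid 1 \le i_k \le N_k,\ k = 1, \dots, d\},
\end{align*}
with $\dim V_N = N := \prod_{k=1}^{d} N_k$. The index $j$ in (8.19) is defined via the indices $i_1, \dots, i_d$ by
\[
j := \sum_{k=1}^{d-1}\prod_{\ell=1}^{d-k} N_{\ell+k}\,(i_k - 1) + i_d;
\]
compare with (8.18). The stiffness matrix $A^{BS} \in \mathbb{R}^{N\times N}$ becomes
\begin{align*}
A^{BS} &= \frac{1}{2}\sum_{i=1}^{d} Q_{ii}\bigotimes_{k=1}^{i-1} M^k \otimes S^i \otimes \bigotimes_{k=i+1}^{d} M^k \\
&\quad - \sum_{i=1}^{d-1}\sum_{j=i+1}^{d} Q_{ij}\bigotimes_{k=1}^{i-1} M^k \otimes B^i \otimes \bigotimes_{k=i+1}^{j-1} M^k \otimes B^j \otimes \bigotimes_{k=j+1}^{d} M^k \tag{8.20} \\
&\quad + \sum_{i=1}^{d} \mu_i\bigotimes_{k=1}^{i-1} M^k \otimes B^i \otimes \bigotimes_{k=i+1}^{d} M^k + r\bigotimes_{k=1}^{d} M^k.
\end{align*}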
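The Kronecker representation (8.20) translates almost literally into code. The sketch below assumes a uniform mesh, for which the hat-function matrices take the familiar closed forms $S = h^{-1}\mathrm{tridiag}(-1, 2, -1)$, $M = \frac{h}{6}\mathrm{tridiag}(1, 4, 1)$ and $B_{i,i\pm1} = \pm\frac{1}{2}$ (with the convention $B_{i,i'} = \int b'_{i'}b_i$). The function names and dense storage are illustrative choices, not the book's implementation.

```python
import numpy as np
from functools import reduce

def fe_matrices(n, h):
    """Stiffness S, convection B and mass M for hat functions on a uniform mesh."""
    S = (2.0 * np.eye(n) - np.eye(n, k=1) - np.eye(n, k=-1)) / h
    B = 0.5 * (np.eye(n, k=1) - np.eye(n, k=-1))      # B[i, i'] = int b'_{i'} b_i
    M = h / 6.0 * (4.0 * np.eye(n) + np.eye(n, k=1) + np.eye(n, k=-1))
    return S, B, M

def kron_all(factors):
    """Kronecker product of a list of matrices, taken from left to right."""
    return reduce(np.kron, factors)

def stiffness_bs(Q, mu, r, mats):
    """Assemble A^BS as in (8.20); mats[k] = (S^k, B^k, M^k) for direction k."""
    d = len(mats)
    Ms = [m[2] for m in mats]
    A = r * kron_all(Ms)
    for i in range(d):
        f = list(Ms); f[i] = mats[i][0]               # S^i in slot i, M^k elsewhere
        A += 0.5 * Q[i, i] * kron_all(f)
        f = list(Ms); f[i] = mats[i][1]               # B^i in slot i
        A += mu[i] * kron_all(f)
        for j in range(i + 1, d):
            f = list(Ms); f[i] = mats[i][1]; f[j] = mats[j][1]
            A -= Q[i, j] * kron_all(f)                # mixed term: B^i and B^j
    return A

# Illustrative two-dimensional data (not from the text)
Q = np.array([[0.16, 0.008], [0.008, 0.01]])
mu = np.array([0.07, -0.005])
mats = [fe_matrices(4, 0.4), fe_matrices(3, 0.5)]
A = stiffness_bs(Q, mu, 0.01, mats)
```

For $d = 2$ the result coincides with the six-term formula of Proposition 8.4.2 and with its distributive rearrangement, which is a convenient unit test when implementing the general case.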

If we furthermore discretize (8.13) by the $\theta$-scheme in time, we obtain the matrix problem

Find $u^{m+1} \in \mathbb{R}^N$ such that for $m = 0, \dots, M-1$
\[
(M + \theta\Delta t\,A^{BS})u^{m+1} = (M - (1-\theta)\Delta t\,A^{BS})u^m, \qquad u^0_N = u_0. \tag{8.21}
\]

Note that if $N_1 = \cdots = N_d$, then the mesh width is the same in each coordinate direction, and the matrices in the representation of $A^{BS}$ satisfy $M := M^1 = \cdots = M^d$, $B := B^1 = \cdots = B^d$ and $S := S^1 = \cdots = S^d$. Hence, in this case, we have to calculate only the matrices $M$, $B$ and $S$, independent of the dimension $d$.

We give a convergence result that generalizes Theorem 3.6.5 to arbitrary dimensions. For simplicity, we assume that the mesh width is the same in each coordinate direction, $h := h_1 = \cdots = h_d$. Let $u := u_R$ be the unique solution of (8.13), and let $u_N(t_m, \cdot) \in V_N$ be its finite element approximation at time level $t_m$ with corresponding coefficient vector $u^m$ obtained from (8.21). As in the one-dimensional case, we split the error $e^m_N(x) := u(t_m, x) - u_N(t_m, x) =: u^m - u^m_N$ as
\[
e^m_N = \big(u^m - P_Nu^m\big) + \big(P_Nu^m - u^m_N\big) =: \eta^m + \xi^m_N, \tag{8.22}
\]
where $P_N : V \to V_N$ is a projector or a quasi-interpolant. The reason why we cannot rely on a multivariate version of the nodal interpolant $I_N$ as in (3.24) is that functions $u \in V = H^1(G)$ for $G \subset \mathbb{R}^d$ are not necessarily continuous. In fact, the compact Sobolev embedding $H^m(G) \subset C^0(G)$ only holds if $m > d/2$. We assume that the operator $P_N$ satisfies the following approximation property (compare with (3.36))
\[
\|u - P_Nu\|_{H^t(G)} \le Ch^{s-t}\|u\|_{H^s(G)}, \quad t = 0, 1, \quad 0 \le t \le s \le 2. \tag{8.23}
\]

The construction of $P_N$ having the property (8.23) for a large class of finite element spaces including the tensor product space $V_N$ in (8.19) can be found, e.g. in [27, Sect. 4.8] or [64, Sect. 1.6]. Thus, replacing in the proof of Theorem 3.6.5 the nodal interpolant $I_N$ by $P_N$ and using the approximation property (8.23), we find

Theorem 8.4.3 Assume $u_R \in C^1(J; H^2(G)) \cap C^3(J; H^{-1}(G))$. Let $u^m(x) := u_R(t_m, x)$ and $u^m_N := u_N(t_m, x) \in V_N$, defined via (8.21), with $V_N$ as in (8.19), where $N_1 = \cdots = N_d$. Assume for $0 \le \theta < \frac{1}{2}$ also (3.30). Then, the following error bound holds:
\begin{align*}
\|u^M - u^M_N\|^2_{L^2(G)} + k\sum_{m=0}^{M-1}\|u^{m+\theta} - u^{m+\theta}_N\|^2_{H^1(G)}
&\le Ch^2\max_{0\le t\le T}\|u(t)\|^2_{H^2(G)} + Ch^2\int_0^T\|\partial_tu(s)\|^2_{H^1(G)}\,ds \\
&\quad + C\begin{cases}
k^2\int_0^T\|\partial_{tt}u(s)\|^2_*\,ds & \text{if } 0 \le \theta \le 1, \\
k^4\int_0^T\|\partial_{ttt}u(s)\|^2_*\,ds & \text{if } \theta = \frac{1}{2}.
\end{cases}
\end{align*}

As in the univariate case, using $k = O(h)$ and $\theta = 1/2$, one can show that $\|u^M - u^M_N\|_{L^2(G)} = O(h^2)$. Since $n := N_1 = \cdots = N_d = O(h^{-1})$, the dimension of the finite element space $V_N$ is $N := \dim V_N = n^d$, and the rate of convergence with respect to the mesh width $h$ can be expressed in terms of the number of degrees of freedom $N$:
\[
\|u^M - u^M_N\|_{L^2(G)} = O(N^{-2/d}). \tag{8.24}
\]
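To get a feeling for what the rate (8.24) means in practice, one can count the degrees of freedom needed to reach a given tolerance. The sketch below assumes an error model $C\,n^{-2}$ with a hypothetical constant $C = 1$; only the scaling with $d$ is meaningful, not the absolute numbers.

```python
import math

def dofs_needed(tol, d, C=1.0):
    """Degrees of freedom N = n**d such that the model error C * n**-2 is <= tol."""
    n = math.ceil(math.sqrt(C / tol))   # points per coordinate direction
    return n ** d

# Same tolerance, increasing dimension: N grows like tol**(-d/2)
for d in (1, 2, 3, 5):
    print(d, dofs_needed(1e-4, d))
```

Since the number of points per direction is fixed by the tolerance, every additional dimension multiplies the total number of unknowns by that same factor.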

The exponential decay of the rate of convergence with respect to the dimension $d$ of the problem, or equivalently, the exponential growth of $N$ with respect to $d$, is called the curse of dimension. Therefore, full tensor product spaces $V_N$ lead to inefficient schemes for large $d$.

Example 8.4.4 For $d = 2$ we consider two different options: a basket option with payoff $g(s) = (\alpha_1s_1 + \alpha_2s_2 - K)_+$, and a better-of-option with payoff $g(s) = \max\{s_1, s_2\}$. We set strike $K = 100$ and maturity $T = 1$, covariance matrix $Q = (\sigma_i\sigma_j\rho_{ij})_{1\le i,j\le 2}$ with volatilities $\sigma = (0.4, 0.1)$ and correlation $\rho_{12} = 0.2$. Furthermore, we let the interest rate $r = 0.01$ and $\alpha_1 = 0.7$, $\alpha_2 = 0.3$ in the payoff of the basket option. The option values are shown in Fig. 8.1, where we used finite elements for the discretization. Using analytic formulas (see, e.g. [165]), we can compute the $L^\infty$-convergence rate on a subset $G_0 = (K/2, 3/2K)^2 \subset G$ at maturity $t = T$. The convergence rates are shown in Fig. 8.2. It can be seen that we obtain the optimal convergence rate $O(h^2)$ with respect to the mesh width $h$ for both options. But if we plot the convergence rate in terms of degrees of freedom $N$, we see the "curse of dimension" and only obtain $O(N^{-1})$ instead of the optimal convergence rate $O(N^{-2})$.

Fig. 8.1 Basket (top) and better-of-option price (bottom)

8.5 Further Reading

Analytic formulas for basket options on two underlyings were derived by Zhang [165, Chap. 27]. To avoid the "curse of dimension", several authors (see Leentvaar and Oosterlee [114] or Reisinger and Wittum [139] and the references therein) use finite differences on so-called sparse grids. For the finite element discretization, it is possible to use sparse tensor product spaces as in von Petersdorff and Schwab [159]. In both cases, we are able to reduce the complexity in the number of degrees of freedom from $O(h^{-d})$ to $O(h^{-1}|\log h|^{d-1})$. Therefore, even for large dimensions $d$, multivariate options can be priced. We explain this in more detail in Chap. 13.

Fig. 8.2 Convergence rates for a two-dimensional Black–Scholes model with respect to the mesh width (top) and degrees of freedom (bottom)

Chapter 9

Stochastic Volatility Models

In Sect. 4.5, we considered local volatility models as an extension of the Black–Scholes model. These models replace the constant volatility by a deterministic volatility function, i.e. the volatility is a deterministic function of $s$ and $t$. In stochastic volatility (SV) models, the volatility is modeled as a function of at least one additional stochastic process $Y^1, \dots, Y^{n_v}$, $n_v \ge 1$. Such models can explain some of the empirical properties of asset returns, such as volatility clustering and the leverage effect. These models can also account for long term smiles and skews. We focus on SV models which are pure diffusion models, i.e. the processes $Y^1, \dots, Y^{n_v}$ used to model the volatility are diffusions. We will also assume that there is only one risky underlying $S$.

The flexibility offered by SV models comes at the cost of an increase in dimension. In SV models, the price $S$ is no longer Markovian, since the price process is determined not only by its value but also by the level of volatility. To regain a Markov process, one must consider the $(n_v+1)$-dimensional process $(S, Y^1, \dots, Y^{n_v})^\top$. This results in pricing PDEs in $n_v + 1$ space dimensions: each additional source of randomness $Y^i$ gives an additional dimension in the pricing equation. Hence, we can use the discretization techniques developed for the pricing of multi-asset options also for the pricing of options in SV models.

N. Hilber et al., Computational Methods for Quantitative Finance, Springer Finance, DOI 10.1007/978-3-642-35401-4_9, © Springer-Verlag Berlin Heidelberg 2013

9.1 Market Models

A large class of diffusion SV models can be described as follows. Consider the asset price process $S$ following the SDE
\[
dS_t = \mu S_t\,dt + \sigma_tS_t\,dW^1_t, \tag{9.1}
\]
where $\sigma = \{\sigma_t : t \ge 0\}$ is called the volatility process. A widely used model for $\sigma$ is
\[
\sigma_t = \xi(Y^1_t, \dots, Y^{n_v}_t), \tag{9.2}
\]
where $\xi : \mathbb{R}^{n_v} \to \mathbb{R}_{\ge 0}$ is some function and $Y := (Y^1, \dots, Y^{n_v})^\top$, $n_v \ge 1$, is an $\mathbb{R}^{n_v}$-valued diffusion process. The dynamics of the vector process $Z := (S, Y^1, \dots, Y^{n_v})^\top \in \mathbb{R}^{n_v+1}$ can be described by the system of SDEs (compare with (8.1))
\[
dZ_t = b(Z_t)\,dt + \Sigma(Z_t)\,dW_t, \qquad Z_0 = z, \tag{9.3}
\]
where $W$ is an $\mathbb{R}^{n_v+1}$-valued Brownian motion, and the coefficients $b : \mathbb{R}^{n_v+1} \to \mathbb{R}^{n_v+1}$, $\Sigma : \mathbb{R}^{n_v+1} \to \mathbb{R}^{(n_v+1)\times(n_v+1)}$ satisfy the Lipschitz continuity (8.2) and the linear growth condition (8.3). We again denote $Q = \Sigma\Sigma^\top$.

9.1.1 Heston Model

The Heston model is given by $\xi(y) = \sqrt{y}$ with $n_v = 1$ in (9.2) and $Y$ following a CIR process, i.e. $dY_t = \alpha(m - Y_t)\,dt + \beta\sqrt{Y_t}\,d\widetilde{W}_t$. The Brownian motion $\widetilde{W}$ might be correlated with the Brownian motion $W^1$ in (9.1) that drives the asset price $S$. We thus introduce a Brownian motion $W^2$ independent of $W^1$ and write $\widetilde{W}_t = \rho W^1_t + \sqrt{1-\rho^2}\,W^2_t$, with $\rho \in [-1, 1]$ being the instantaneous correlation coefficient. Under a non-unique EMM, the coefficients $b$, $\Sigma$ in (9.3) for the Heston model are therefore given by
\[
b(z) = \begin{pmatrix} rs \\ \alpha(m - y) - \lambda(s, y) \end{pmatrix}, \tag{9.4}
\]
\[
\Sigma(z) = \begin{pmatrix} s\sqrt{y} & 0 \\ \beta\rho\sqrt{y} & \beta\sqrt{1-\rho^2}\sqrt{y} \end{pmatrix}, \tag{9.5}
\]
where $z = (s, y)$ and the function $\lambda$ appearing in (9.4) represents the price of volatility risk. Different choices of $\lambda$ are given in the literature, for example, $\lambda(s, y) = c\sqrt{y}$ (together with $\rho = 0$) in [6] and $\lambda(s, y) = cy$ in [79]. For simplicity, we will choose $\lambda \equiv 0$ in the following.
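As a quick sanity check of (9.4) and (9.5), the covariance $Q = \Sigma\Sigma^\top$ can be evaluated numerically. The sketch below is illustrative; the function name `heston_coefficients` and the parameter values are assumptions, and $\lambda$ is passed as a plain number.

```python
import numpy as np

def heston_coefficients(s, y, r, alpha, m, beta, rho, lam=0.0):
    """Drift b(z) and diffusion Sigma(z) of (9.4)-(9.5) at the point z = (s, y)."""
    b = np.array([r * s, alpha * (m - y) - lam])
    sq = np.sqrt(y)
    Sigma = np.array([[s * sq, 0.0],
                      [beta * rho * sq, beta * np.sqrt(1.0 - rho**2) * sq]])
    return b, Sigma

# Illustrative point and parameters (not from the text)
s, y, beta, rho = 100.0, 0.04, 0.3, -0.7
b, Sigma = heston_coefficients(s, y, r=0.01, alpha=2.0, m=0.04, beta=beta, rho=rho)
Q = Sigma @ Sigma.T
# Q equals [[s^2 y, beta rho s y], [beta rho s y, beta^2 y]]
```

The entries of $Q$ are linear in $y$, which is the source of the degeneracy at $y = 0$ that complicates the analysis of the Heston generator.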

9.1.2 Multi-scale Model

We assume that each component of $Y = (Y^1, \dots, Y^{n_v})^\top$ evolves according to
\[
dY^k_t = c_k(Y^k_t)\,dt + g_k(Y^k_t)\,d\widetilde{W}^k_t, \quad k = 1, \dots, n_v, \tag{9.6}
\]
with coefficients $c_k, g_k : \mathbb{R} \to \mathbb{R}$ smooth and at most linearly growing. As in the Heston model, the Brownian motions $W^1$ and $\widetilde{W}^k$, $k = 1, \dots, n_v$, are not independent of each other, but are allowed to have a correlation structure $L \in \mathbb{R}^{(n_v+1)\times(n_v+1)}$ of the form
\[
(W^1_t, \widetilde{W}^1_t, \dots, \widetilde{W}^{n_v}_t)^\top = LW_t
\]
with $W = (W^1, \dots, W^{n_v+1})^\top$ a standard $(n_v+1)$-dimensional Brownian motion and
\[
L = \begin{pmatrix}
1 & 0 & \cdots & 0 \\
\rho_1 & & & \\
\vdots & & \widetilde{L} & \\
\rho_{n_v} & & &
\end{pmatrix},
\]
where $\widetilde{L} \in \mathbb{R}^{n_v\times n_v}$ is defined as
\[
\widetilde{L}_{ij} := \begin{cases}
0 & \text{if } i < j, \\
\big(1 - \rho_i^2 - \sum_{k=1}^{j-1}\widetilde{\rho}_{ki}^{\,2}\big)^{1/2} & \text{if } i = j, \\
\widetilde{\rho}_{ji} & \text{if } i > j,
\end{cases}
\qquad i, j = 1, \dots, n_v.
\]
The constants $\rho_i$, $i = 1, \dots, n_v$, and $\widetilde{\rho}_{ij}$, $i, j = 1, \dots, n_v$, satisfy $|\rho_i| \le 1$, $\rho_i^2 + \sum_{k=1}^{j-1}\widetilde{\rho}_{ki}^{\,2} \le 1$.

Example 9.1.1 The case $n_v = 1$ is introduced in [68]. The process $Y^1_t = Y_t$ follows a mean-reverting Ornstein–Uhlenbeck (OU) process, i.e. the coefficients in (9.6) are given by
\[
c_1(y) = \alpha(m - y), \qquad g_1(y) = \beta,
\]
with $\alpha > 0$ the rate of mean-reversion, $m > 0$ the long-run mean level of volatility and $\beta \in \mathbb{R}$. Here, $\widetilde{L} = \widetilde{L}_{11} = \sqrt{1 - \rho^2}$, with $|\rho| \le 1$. Note that this model includes in particular the model of Stein–Stein [153] ($\xi(y) = |y|$, $\rho = 0$) and the model of Scott [150] ($\xi(y) = e^y$, $\rho = 0$).

Example 9.1.2 The case $n_v = 2$ is studied in [67]. The first component of $Y_t$ is assumed to follow a mean-reverting OU process modeling a fast scale volatility factor, while the second component is a diffusion process, modeling a slow scale volatility factor. In particular, the coefficients in (9.6) are
\begin{align*}
c_1(y_1) &= \alpha_1(m - y_1), & g_1(y_1) &= \alpha_2, \\
c_2(y_2) &= \alpha_3c(y_2), & g_2(y_2) &= \alpha_4g(y_2),
\end{align*}
with $\alpha_1 > 0$ "large", $\alpha_2 = O(\sqrt{\alpha_1})$ as well as with $\alpha_3 > 0$ "small", and $\alpha_4 = O(\sqrt{\alpha_3})$. The functions $c, g : \mathbb{R} \to \mathbb{R}$ are assumed to be smooth and at most linearly growing.

Introducing random volatility renders the market model incomplete so that, unlike in the constant volatility case, there is no unique measure equivalent to the physical measure under which the discounted stock price is a martingale. We consider the $(n_v+1)$-dimensional standard Brownian motion under the risk-neutral measure
\[
\widehat{W}_t = (\widehat{W}^1_t, \widehat{W}^2_t, \dots, \widehat{W}^{n_v+1}_t)^\top := (W^1_t, W^2_t, \dots, W^{n_v+1}_t)^\top + \Big(\int_0^t \frac{\mu - r}{\xi(Y_s)}\,ds,\ \int_0^t \gamma_1(Y_s)\,ds,\ \dots,\ \int_0^t \gamma_{n_v}(Y_s)\,ds\Big)^\top,
\]
where it is assumed that the market prices of volatility risk $\gamma_i(y)$, $i = 1, \dots, n_v$, are smooth bounded functions of $y = (y_1, \dots, y_{n_v})$ only (compare with [68, Chap. 2], [67]). We introduce the combined market prices of volatility risk $\Lambda_i$ defined by
\[
\Lambda_i(y) = \frac{\rho_i(\mu - r)}{\xi(y)} + \sum_{k=1}^{i}\gamma_k(y)L_{i+1,k+1}, \quad i = 1, \dots, n_v. \tag{9.7}
\]
Under a risk neutral probability measure, the dynamics of $Z := (S, Y^1, \dots, Y^{n_v})^\top$ is as in (9.3), with coefficients $b$, $\Sigma$ given by
\[
b(z) = \begin{pmatrix}
rs \\
c_1(y_1) - g_1(y_1)\Lambda_1(y) \\
\vdots \\
c_{n_v}(y_{n_v}) - g_{n_v}(y_{n_v})\Lambda_{n_v}(y)
\end{pmatrix}, \tag{9.8}
\]
\[
\Sigma(z) = \mathrm{diag}\big(s\xi(y), g_1(y_1), \dots, g_{n_v}(y_{n_v})\big)\,L, \tag{9.9}
\]
where $z := (s, y_1, \dots, y_{n_v}) \in \mathbb{R}^{n_v+1}$. In the next section, we derive the pricing equation for European options under these SV models.
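The correlation structure $L$ of the multi-scale model is a lower triangular factor whose rows have unit Euclidean norm, so that $LL^\top$ is a correlation matrix. The following sketch builds $L$ from the constants $\rho_i$ and $\widetilde{\rho}_{ji}$ of Sect. 9.1.2; the function name, the array layout of `rho_tilde` and the numbers are illustrative assumptions.

```python
import numpy as np

def correlation_factor(rho, rho_tilde):
    """Build L from rho (length n_v) and rho_tilde (n_v x n_v array).

    Only the entries rho_tilde[j, i] with j < i are used (0-based indices).
    """
    nv = len(rho)
    L = np.zeros((nv + 1, nv + 1))
    L[0, 0] = 1.0
    for i in range(nv):
        L[i + 1, 0] = rho[i]
        for j in range(i):                       # i > j: L~_{ij} = rho~_{ji}
            L[i + 1, j + 1] = rho_tilde[j, i]
        s = rho[i]**2 + sum(rho_tilde[k, i]**2 for k in range(i))
        L[i + 1, i + 1] = np.sqrt(1.0 - s)       # diagonal entry L~_{ii}
    return L

rho = np.array([-0.5, 0.3])                      # correlations with W^1
rho_tilde = np.array([[0.0, 0.2], [0.0, 0.0]])   # rho~_{12} = 0.2
L = correlation_factor(rho, rho_tilde)
corr = L @ L.T                                   # correlation matrix of (W^1, W~^1, W~^2)
```

The first column of `corr` returns the $\rho_i$, i.e. the correlations between the asset-driving Brownian motion and the volatility factors.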

9.2 Pricing Equation

As in the BS model, we switch in (9.1) to the log-price process $X = \ln(S)$. We therefore consider instead of the process $(S, Y^1, \dots, Y^{n_v})^\top$ the process $Z := (X, Y^1, \dots, Y^{n_v})^\top$. Assuming constant interest rate $r \ge 0$ and setting $z := (x, y_1, \dots, y_{n_v}) \in \mathbb{R}^{n_v+1}$, we are interested in calculating the option value
\[
V(t, z) := \mathbb{E}\big[e^{-r(T-t)}g(e^{X_T}) \mid Z_t = z\big]. \tag{9.10}
\]
Note that the option price is a function of $z = (x, y_1, \dots, y_{n_v})$ and not just $x$, since we have to condition on $Z_t = z$ and not just on $X_t = x$. The reason for this is that the process $Z$ is Markovian (the unique strong solution to the SDE (9.3) is a Markov process, see, e.g. [3, Theorem 6.4.5]), but the process $X$ is not. Proceeding as in Sect. 8.1, we find that the function $v(t, z) := V(T - t, z)$ is a solution of the PDE
\[
\partial_tv - Av + rv = 0 \ \text{ in } J \times \mathbb{R} \times G_Y, \qquad v(0, z) = v_0(z) := g(e^x) \ \text{ in } \mathbb{R} \times G_Y, \tag{9.11}
\]
where $(Af)(z) := \frac{1}{2}\mathrm{tr}[Q(z)D^2f(z)] + b(z)^\top\nabla f(z)$ is the infinitesimal generator of the process $Z$ and $G_Y \subseteq \mathbb{R}^{n_v}$ denotes the state space of the process $Y$.

Consider now the Heston model with coefficients (9.4)–(9.5) (recall that we set $\lambda = 0$). Since the coefficient $\Sigma$ in (9.5) is not Lipschitz continuous, we can only formally conclude that the infinitesimal generator $A =: A^H$ appearing in the pricing equation (9.11) is given in log-price by
\[
(A^Hf)(z) := \frac{1}{2}y\,\partial_{xx}f(x, y) + \beta\rho y\,\partial_{xy}f(x, y) + \frac{1}{2}\beta^2y\,\partial_{yy}f(x, y) + \Big(r - \frac{1}{2}y\Big)\partial_xf(x, y) + \alpha(m - y)\,\partial_yf(x, y). \tag{9.12}
\]

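Furthermore, we have $G_Y = \mathbb{R}_{\ge 0}$. To cast the pricing equation (9.11) corresponding to the Heston model in a variational formulation and to establish its well-posedness, we change variables $\widetilde{v}(t, x, \widetilde{y}) := v(t, x, \frac{1}{4}\widetilde{y}^2)$. Since $y = \frac{1}{4}\widetilde{y}^2$ implies $\partial_y = 2\widetilde{y}^{-1}\partial_{\widetilde{y}}$, the pricing equation for $\widetilde{v}$ becomes
\begin{align*}
\partial_t\widetilde{v} - \widetilde{A}^H\widetilde{v} + r\widetilde{v} &= 0 && \text{in } J \times \mathbb{R} \times \mathbb{R}_{\ge 0}, \tag{9.13} \\
\widetilde{v}_0 &= g(e^x) && \text{in } \mathbb{R} \times \mathbb{R}_{\ge 0}, \tag{9.14}
\end{align*}
with
\begin{align*}
(\widetilde{A}^Hf)(x, \widetilde{y}) &:= \frac{1}{8}\widetilde{y}^2\partial_{xx}f(x, \widetilde{y}) + \frac{1}{2}\rho\beta\widetilde{y}\,\partial_{x\widetilde{y}}f(x, \widetilde{y}) + \frac{1}{2}\beta^2\partial_{\widetilde{y}\widetilde{y}}f(x, \widetilde{y}) \\
&\quad + \Big(r - \frac{1}{8}\widetilde{y}^2\Big)\partial_xf(x, \widetilde{y}) + \frac{1}{2}\Big(-\alpha\widetilde{y} + \frac{4\alpha m - \beta^2}{\widetilde{y}}\Big)\partial_{\widetilde{y}}f(x, \widetilde{y}). \tag{9.15}
\end{align*}
In the multi-scale SV model, assume that the market price of volatility risk $\gamma_i = 0$, $i = 1, \dots, n_v$. Furthermore, assume that the functions $\xi$, $c_k$, $g_k$, $k = 1, \dots, n_v$, are such that the coefficients $b$, $\Sigma$ in (9.8)–(9.9) satisfy (8.2)–(8.3). Then, the generator $A =: A^{MS}$ of the multi-scale process $Z$ is given by
\begin{align*}
(A^{MS}f)(z) &= \frac{1}{2}\xi^2(y)\partial_{xx}f(z) + \frac{1}{2}\sum_{i=1}^{n_v}g_i^2(y_i)\partial_{y_iy_i}f(z) + \xi(y)\sum_{i=1}^{n_v}\rho_ig_i(y_i)\partial_{xy_i}f(z) + \sum_{i=1}^{n_v}\sum_{j>i}q_{i,j}(y)\partial_{y_iy_j}f(z) \\
&\quad + \Big(r - \frac{1}{2}\xi^2(y)\Big)\partial_xf(z) + \sum_{i=1}^{n_v}\Big(c_i(y_i) - g_i(y_i)\rho_i\frac{\mu - r}{\xi(y)}\Big)\partial_{y_i}f(z),
\end{align*}
where the coefficients $q_{i,j}(y)$ are given by $q_{i,j}(y) := g_i(y_i)g_j(y_j)\big(\rho_i\rho_j + \sum_{k=1}^{i}\widetilde{L}_{j,k}\widetilde{L}_{i,k}\big)$. Hence, for the mean reverting OU model considered in Example 9.1.1, we find
\begin{align*}
(A^{MS}f)(z) &= \frac{1}{2}\xi^2(y)\partial_{xx}f(x, y) + \frac{1}{2}\beta^2\partial_{yy}f(x, y) + \xi(y)\rho\beta\,\partial_{xy}f(x, y) \\
&\quad + \Big(r - \frac{1}{2}\xi^2(y)\Big)\partial_xf(x, y) + \Big(\alpha(m - y) - \beta\rho\frac{\mu - r}{\xi(y)}\Big)\partial_yf(x, y), \tag{9.16}
\end{align*}
where $z = (x, y_1) = (x, y)$.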


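9.3 Variational Formulation

We consider the weak formulation of the pricing equation (9.11). For the ensuing numerical analysis, it is convenient to multiply the value of the option $v$ in (9.11) with an exponentially decaying factor, i.e. we consider
\[
w := ve^{-\eta}, \tag{9.17}
\]
where $\eta \in C^2(\mathbb{R}^{n_v+1})$ is assumed to be at most polynomially growing at infinity. The function $w$ solves
\[
\partial_tw - (A + A_\eta)w + rw = 0 \ \text{ in } J \times \mathbb{R} \times G_Y, \qquad w(0, z) = w_0 := g(e^x)e^{-\eta} \ \text{ in } \mathbb{R} \times G_Y, \tag{9.18}
\]
where the operator $A_\eta$ is given by
\[
A_\eta := \Sigma_\eta^\top\nabla + \frac{1}{2}\mathrm{tr}\big[QD^2\eta\big] + \frac{1}{2}(\Sigma_\eta + 2b)^\top\nabla\eta. \tag{9.19}
\]
The coefficient $\Sigma_\eta \in \mathbb{R}^{n_v+1}$ appearing in (9.19) is defined by
\[
\Sigma_\eta = (\Sigma^1_\eta, \dots, \Sigma^{n_v+1}_\eta)^\top, \qquad \Sigma^k_\eta := \sum_{i=1}^{n_v+1}Q_{ik}\,\partial_{x_i}\eta.
\]
Consider now the pricing equation of the (transformed) Heston model (9.14). For notational simplicity, we drop the tilde in $\widetilde{v}$ and $\widetilde{A}$. Consider the change of variables (9.17) with $\eta = \eta(x, y) := \frac{1}{2}\kappa y^2$, $\kappa > 0$. By the definition of $A_\eta$ (9.19), it follows that the pricing equation for $w := (v - v_0)e^{-\eta}$ in the Heston model becomes
\[
\partial_tw - A^H_\kappa w + rw = f^H_\kappa \ \text{ in } J \times \mathbb{R} \times \mathbb{R}_{\ge 0}, \qquad w(0, x, y) = 0 \ \text{ in } \mathbb{R} \times \mathbb{R}_{\ge 0}, \tag{9.20}
\]
where $f^H_\kappa := e^{-\frac{\kappa}{2}y^2}(A^Hv_0 - rv_0)$ and
\begin{align*}
(A^H_\kappa f)(z) &:= \frac{1}{8}y^2\partial_{xx}f(x, y) + \frac{1}{2}\rho\beta y\,\partial_{xy}f(x, y) + \frac{1}{2}\beta^2\partial_{yy}f(x, y) \\
&\quad + \Big(r - \frac{1}{8}y^2 + \frac{1}{2}\beta\kappa\rho y^2\Big)\partial_xf(x, y) + \frac{1}{2}\Big((2\beta^2\kappa - \alpha)y + \frac{4\alpha m - \beta^2}{y}\Big)\partial_yf(x, y) \\
&\quad + \Big(\frac{1}{2}y^2\kappa(\beta^2\kappa - \alpha) + 2\alpha\kappa m\Big)f(x, y). \tag{9.21}
\end{align*}
Let $G := \mathbb{R} \times \mathbb{R}_{\ge 0}$ and denote by $(\cdot,\cdot)$ the $L^2(G)$-inner product, i.e. $(\varphi, \phi) = \int_G\varphi\phi\,dx\,dy$. We associate to $-A^H_\kappa + r$ the bilinear form $a^H_\kappa(\cdot,\cdot)$ via
\[
a^H_\kappa(\varphi, \phi) := \big((-A^H_\kappa + r)\varphi, \phi\big), \quad \varphi, \phi \in C_0^\infty(G).
\]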

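Integration by parts yields
\begin{align*}
a^H_\kappa(\varphi, \phi) &= \frac{1}{8}(y\partial_x\varphi, y\partial_x\phi) + \frac{1}{2}\beta^2(\partial_y\varphi, \partial_y\phi) + \frac{1}{2}\rho\beta(y\partial_x\varphi, \partial_y\phi) \\
&\quad + \frac{1}{2}[\rho\beta - 2r](\partial_x\varphi, \phi) + \frac{1}{8}[1 - 4\beta\kappa\rho](y\partial_x\varphi, y\phi) \\
&\quad - \frac{1}{2}[2\beta^2\kappa - \alpha](y\partial_y\varphi, \phi) - \frac{1}{2}[4\alpha m - \beta^2](y^{-1}\partial_y\varphi, \phi) \\
&\quad - \frac{1}{2}\kappa[\beta^2\kappa - \alpha](y\varphi, y\phi) - [2\kappa\alpha m - r](\varphi, \phi) \\
&=: \sum_{k=1}^{9}b_k(\varphi, \phi). \tag{9.22}
\end{align*}
Define the weighted Sobolev space
\[
V := \overline{C_0^\infty(G)}^{\|\cdot\|_V}, \tag{9.23}
\]
where the closure is taken with respect to the norm
\[
\|v\|^2_V := \|y\partial_xv\|^2_{L^2(G)} + \|\partial_yv\|^2_{L^2(G)} + \big\|\sqrt{1 + y^2}\,v\big\|^2_{L^2(G)}. \tag{9.24}
\]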

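Theorem 9.3.1 Assume that $0 < \kappa < \alpha/\beta^2$ and that $1 - 2|4\alpha m/\beta^2 - 1| > \rho^2$. Then, there exist constants $C_i > 0$, $i = 1, 2, 3$, such that for all $\varphi, \phi \in V$ one has
\[
|a^H_\kappa(\varphi, \phi)| \le C_1\|\varphi\|_V\|\phi\|_V, \qquad a^H_\kappa(\varphi, \varphi) \ge C_2\|\varphi\|^2_V - C_3\|\varphi\|^2_{L^2(G)}.
\]

Proof Let $\varphi, \phi \in C_0^\infty(G)$. We first show continuity of the bilinear form $a^H_\kappa(\cdot,\cdot)$. The only terms in (9.22) which cannot be estimated directly by an application of Cauchy–Schwarz are $(\partial_x\varphi, \phi)$ and $(y^{-1}\partial_y\varphi, \phi)$. For both terms, the continuity follows from the Hardy inequality (4.26)
\[
\|y^{-1}\phi\|_{L^2(G)} \le 2\|\partial_y\phi\|_{L^2(G)}. \tag{9.25}
\]
Indeed, using (9.25),
\[
\Big|\frac{1}{2}[4\alpha m - \beta^2]\big(y^{-1}\partial_y\varphi, \phi\big)\Big| \le C\|\partial_y\varphi\|_{L^2(G)}\|y^{-1}\phi\|_{L^2(G)} \le C\|\partial_y\varphi\|_{L^2(G)}\|\partial_y\phi\|_{L^2(G)},
\]
and similarly
\[
\Big|\frac{1}{2}[\rho\beta - 2r]\big(\partial_x\varphi, \phi\big)\Big| \le C\|y\partial_x\varphi\|_{L^2(G)}\|y^{-1}\phi\|_{L^2(G)} \le C\|y\partial_x\varphi\|_{L^2(G)}\|\partial_y\phi\|_{L^2(G)}.
\]
Hence, it follows from the triangle inequality that $|a^H_\kappa(\varphi, \phi)| \le C_1\|\varphi\|_V\|\phi\|_V$.

To prove the Gårding inequality, we consider each term in (9.22) separately. We have, since $(y\partial_y\varphi, \varphi) = \frac{1}{2}(\partial_y(\varphi^2), y) = -\frac{1}{2}\|\varphi\|^2_{L^2(G)}$,
\begin{align*}
b_1(\varphi, \varphi) &= \frac{1}{8}\|y\partial_x\varphi\|^2_{L^2(G)}, & b_2(\varphi, \varphi) &= \frac{1}{2}\beta^2\|\partial_y\varphi\|^2_{L^2(G)}, \\
b_6(\varphi, \varphi) &= \frac{1}{4}[2\beta^2\kappa - \alpha]\|\varphi\|^2_{L^2(G)}, & b_8(\varphi, \varphi) &= \frac{1}{2}\kappa[\alpha - \kappa\beta^2]\|y\varphi\|^2_{L^2(G)}, \\
b_9(\varphi, \varphi) &= [r - 2\kappa\alpha m]\|\varphi\|^2_{L^2(G)},
\end{align*}
and, using $(\partial_x\varphi, \varphi) = \frac{1}{2}(\partial_x(\varphi^2), 1) = 0$ and $(y\partial_x\varphi, y\varphi) = \frac{1}{2}(\partial_x(\varphi^2), y^2) = 0$, we find that $b_4(\varphi, \varphi) = b_5(\varphi, \varphi) = 0$. Furthermore, for $\varepsilon > 0$,
\begin{align*}
b_3(\varphi, \varphi) &\ge -\frac{1}{2}|\rho\beta|\,\|y\partial_x\varphi\|_{L^2(G)}\|\partial_y\varphi\|_{L^2(G)} \ge -\frac{1}{2}\varepsilon\|y\partial_x\varphi\|^2_{L^2(G)} - \frac{1}{2}\frac{\rho^2\beta^2}{4\varepsilon}\|\partial_y\varphi\|^2_{L^2(G)}, \\
b_7(\varphi, \varphi) &\ge -\frac{1}{2}|4\alpha m - \beta^2|\,\|\partial_y\varphi\|_{L^2(G)}\|y^{-1}\varphi\|_{L^2(G)} \ge -|4\alpha m - \beta^2|\,\|\partial_y\varphi\|^2_{L^2(G)},
\end{align*}
where the last inequality uses (9.25). Collecting these estimates yields
\[
a^H_\kappa(\varphi, \varphi) \ge c_1\|y\partial_x\varphi\|^2_{L^2(G)} + c_2\|\partial_y\varphi\|^2_{L^2(G)} + c_3\|y\varphi\|^2_{L^2(G)} + c_4\|\varphi\|^2_{L^2(G)},
\]
with $c_1 = \frac{1}{8} - \frac{1}{2}\varepsilon$, $c_2 = \frac{1}{2}\beta^2 - \frac{1}{2}\frac{\beta^2\rho^2}{4\varepsilon} - |4\alpha m - \beta^2|$, $c_3 = \frac{1}{2}\kappa[\alpha - \kappa\beta^2]$ and $c_4 = r - 2\kappa\alpha m + \frac{1}{4}[2\beta^2\kappa - \alpha]$. We choose $\varepsilon < \frac{1}{4}$ sufficiently close to $\frac{1}{4}$, so that $c_1 > 0$. Thus, in order for $c_2$ to be positive, we must have
\[
\frac{1}{2}\beta^2 > \frac{1}{2}\beta^2\rho^2 + |4\alpha m - \beta^2| \iff 1 > \rho^2 + 2|4\alpha m/\beta^2 - 1|.
\]
Furthermore, if $0 < \kappa < \alpha/\beta^2$, then we have $c_3 > 0$. Hence,
\begin{align*}
a^H_\kappa(\varphi, \varphi) &\ge c_1\|y\partial_x\varphi\|^2_{L^2(G)} + c_2\|\partial_y\varphi\|^2_{L^2(G)} + c_3\big\|\sqrt{1 + y^2}\,\varphi\big\|^2_{L^2(G)} + (c_4 - c_3)\|\varphi\|^2_{L^2(G)} \\
&\ge \min\{c_1, c_2, c_3\}\|\varphi\|^2_V - |c_4 - c_3|\,\|\varphi\|^2_{L^2(G)}.
\end{align*}
Since by definition $C_0^\infty(G)$ is dense in $V$, we may extend the bilinear form $a^H_\kappa(\cdot,\cdot)$ continuously to $V$. □

By the abstract well-posedness result Theorem 3.2.2 in the triple of spaces $\mathcal{V} = V \subset L^2(G) = \mathcal{H} \equiv \mathcal{H}^* \subset \mathcal{V}^*$, we conclude that the weak formulation to the (transformed) Heston model (9.20),

Find $w \in L^2(J; V) \cap H^1(J; L^2(G))$ such that
\[
(\partial_tw, v) + a^H_\kappa(w, v) = \langle f^H_\kappa, v\rangle_{V^*,V}, \quad \forall v \in V, \ \text{a.e. in } J, \qquad w(0) = 0, \tag{9.26}
\]
admits a unique solution for every $f^H_\kappa \in V^*$.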
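The constants appearing in the proof of Theorem 9.3.1 are explicit, so the assumptions can be checked for a given parameter set. The sketch below simply evaluates $c_1, \dots, c_4$ and the two hypotheses; the function names and the Heston parameters are illustrative assumptions, and $\varepsilon$ is a free parameter that has to be chosen below $1/4$.

```python
def garding_constants(alpha, beta, m, rho, kappa, r, eps):
    """Constants c1..c4 from the proof of Theorem 9.3.1 for a given eps > 0."""
    c1 = 1.0 / 8.0 - 0.5 * eps
    c2 = 0.5 * beta**2 - 0.5 * beta**2 * rho**2 / (4.0 * eps) - abs(4.0 * alpha * m - beta**2)
    c3 = 0.5 * kappa * (alpha - kappa * beta**2)
    c4 = r - 2.0 * kappa * alpha * m + 0.25 * (2.0 * beta**2 * kappa - alpha)
    return c1, c2, c3, c4

def assumptions_hold(alpha, beta, m, rho, kappa):
    """Hypotheses of Theorem 9.3.1 on kappa and on the model parameters."""
    return (0.0 < kappa < alpha / beta**2
            and 1.0 - 2.0 * abs(4.0 * alpha * m / beta**2 - 1.0) > rho**2)

# Illustrative parameters (not from the text)
params = dict(alpha=2.5, beta=0.75, m=0.06, rho=-0.5, kappa=1.0)
c1, c2, c3, c4 = garding_constants(r=0.01, eps=0.24, **params)
C2 = min(c1, c2, c3)          # coercivity constant in the Garding inequality
C3 = abs(c4 - c3)             # constant in front of the L2 norm
```

The second hypothesis restricts $4\alpha m/\beta^2$ to a neighbourhood of 1 whose size shrinks as $|\rho|$ grows, so strongly correlated parameter sets may fall outside the scope of the theorem.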


The weak formulation (9.26) addresses the Heston model. Since the generator $\widetilde{A}^H$ (9.15) has the same structure as the generator $A^{MS}$ (9.16) of the multi-scale model of Example 9.1.1, i.e. $n_v = 1$, $\xi(y) = |y|$, we also obtain the well-posedness of the weak formulation for this model.

We derive the bilinear form $a^{SV}(\cdot,\cdot)$ for the general diffusion SV model in (9.11), (9.18) and give its weak formulation. The operator $A + A_\eta - r$ in (9.18) can be written as (compare with (9.19))
\[
(A^{SV}f)(z) := (Af + A_\eta f - rf)(z) = \frac{1}{2}\mathrm{tr}[Q(z)D^2f(z)] + \mu(z)^\top\nabla f(z) + c(z)f(z), \tag{9.27}
\]
where $\mu := \Sigma_\eta + b$, and $c := \frac{1}{2}\mathrm{tr}[QD^2\eta] + \frac{1}{2}(\Sigma_\eta + 2b)^\top\nabla\eta - r$. Denote by $G := \mathbb{R} \times G_Y$ the state space of the process $Z = (\ln(S), Y^1, \dots, Y^{n_v})^\top$. Then, the bilinear form associated to the operator $A^{SV}$,
\[
a^{SV}(\varphi, \phi) := \big(-A^{SV}\varphi, \phi\big), \quad \varphi, \phi \in C_0^\infty(G),
\]
becomes, after integration by parts,
\[
a^{SV}(\varphi, \phi) := \frac{1}{2}\int_G(\nabla\varphi)^\top Q\nabla\phi\,dx + \frac{1}{2}\int_G(\nabla Q - 2\mu)^\top\nabla\varphi\,\phi\,dx - \int_Gc\,\varphi\phi\,dx. \tag{9.28}
\]
Herewith, we denote by $\nabla : [W^{1,\infty}(G)]^{(n_v+1)\times(n_v+1)} \to [L^\infty(G)]^{n_v+1}$, $A \mapsto \nabla A$, the operator which takes the divergence of the columns $A_k := (A_{1,k}, \dots, A_{n_v+1,k})^\top$, $k = 1, \dots, n_v+1$, of the matrix $A$, i.e.
\[
\nabla A := \big(\mathrm{div}(A_1), \dots, \mathrm{div}(A_{n_v+1})\big)^\top.
\]
Note that for constant coefficients $Q$, $\mu$ and $c$, the bilinear form $a^{SV}(\cdot,\cdot)$ reduces to the bilinear form $a^{BS}(\cdot,\cdot)$ in (8.9) of the multivariate Black–Scholes model. The variational formulation to (9.18) reads:

Find $w \in L^2(J; V) \cap H^1(J; L^2(G))$ such that
\[
(\partial_tw, v) + a^{SV}(w, v) = 0, \quad \forall v \in V, \ \text{a.e. in } J, \qquad w(0) = w_0. \tag{9.29}
\]
Here, we assume that the Hilbert space $V$ is given by $V := \overline{C_0^\infty(G)}^{\|\cdot\|_V}$, where the norm $\|\cdot\|_V$ depends on the model. We also assume that $a^{SV}(\cdot,\cdot) : V \times V \to \mathbb{R}$ is continuous and satisfies a Gårding inequality, and that $w_0 \in L^2(G)$.

9.4 Localization

We consider the pricing equation (9.18) truncated to a bounded domain $G_R := \prod_{k=1}^{n_v+1}(a_k, b_k)$, $b_k > a_k \in \mathbb{R}$,
\[
\partial_tw_R - A^{SV}w_R = 0 \ \text{ in } J \times G_R, \qquad w_R(0, z) = w_0 \ \text{ in } G_R, \tag{9.30}
\]
where $A^{SV}$ is the operator in (9.27), which already contains the term $-r$. Equation (9.30) needs to be complemented with appropriate boundary conditions which depend on the model under consideration. Localizing the pricing equation to a bounded domain induces an error which we now estimate using the example of the Heston model. To this end, consider the weak formulation of the Heston model, truncated to a bounded domain $G_R := (-R_1, R_1) \times (-R_2, R_2)$,

Find $w_R \in L^2(J; \widetilde{V}) \cap H^1(J; L^2(G_R))$ such that
\[
(\partial_tw_R, v) + a^H_\kappa(w_R, v) = \langle f^H_\kappa, v\rangle_{\widetilde{V}^*,\widetilde{V}}, \quad \forall v \in \widetilde{V}, \ \text{a.e. in } J, \qquad w_R(0) = 0. \tag{9.31}
\]
Herewith, we denote by $\widetilde{V}$ the space $\widetilde{V} := \overline{C_0^\infty(G_R)}^{\|\cdot\|_V}$, with norm $\|\cdot\|_V$ as in (9.24). Note that the truncated problem (9.31) admits a unique solution, since, by Theorem 9.3.1, the bilinear form $a^H_\kappa(\cdot,\cdot) : \widetilde{V} \times \widetilde{V} \to \mathbb{R}$ is continuous and satisfies a Gårding inequality also on $\widetilde{V} \times \widetilde{V}$.

Now, let $w$ denote the solution of (9.26) on $G = \mathbb{R}^2$, and let $w_R$ denote the solution of (9.31). Denote by $\widetilde{w}_R$ the zero extension of $w_R$ to $G := \mathbb{R} \times \mathbb{R}_+$ and let $e_R := \widetilde{w}_R - w$ be the localization error. Denoting by $G_{R/2} := (-R_1/2, R_1/2) \times (-R_2/2, R_2/2)$ and repeating the arguments in the proof of [81, Theorem 3.6], we obtain

Theorem 9.4.1 Let $\phi_R = \phi_R(x, y) \in C_0^\infty(G_R)$ denote a cut-off function with the following properties
\[
\phi_R \ge 0, \qquad \phi_R \equiv 1 \text{ on } G_{R/2} \quad \text{and} \quad \|\nabla\phi_R\|_{L^\infty(G_R)} \le C
\]
for some constant $C > 0$ independent of $R_1, R_2 \ge 1$. Then, there exist constants $c = c(T)$, $\varepsilon > 0$ which are independent of $R_1$, $R_2$ such that
\[
\|\phi_Re_R(t, \cdot)\|^2_{L^2(G_R)} + \int_0^t\|\phi_Re_R(s, \cdot)\|^2_V\,ds \le ce^{-\varepsilon(R_1 + R_2)}.
\]

9.5 Discretization

We discuss the implementation of the stiffness matrix $A^{SV}$ of the general diffusion SV model $A^{SV}$ in (9.27). Under the assumption that the coefficients $Q$, $\mu$ and $c$ of $A^{SV}$ can be written as (a sum of) products of univariate functions, we show that $A^{SV}$ can be represented as sums of Kronecker products of matrices corresponding to univariate problems, as in Sect. 8.4. We will write $x = (x_1, \dots, x_d)$ instead of $z = (\ln(s), y_1, \dots, y_{n_v})$, with $d = n_v + 1$, to unify the notation and assume the following product structure of the coefficients $Q$, $\mu$ and $c$.

Assumption 9.5.1 The coefficients $Q$, $\mu$ and $c$ are given by
\[
Q_{ij}(x) = \prod_{k=1}^{d}q^k_{ij}(x_k), \qquad \mu_i(x) = \prod_{k=1}^{d}\mu^k_i(x_k), \qquad c(x) = \prod_{k=1}^{d}c^k(x_k),
\]
for univariate functions $q^k_{ij}, \mu^k_i, c^k : \mathbb{R} \to \mathbb{R}$, $1 \le i, j, k \le d$.

9.5.1 Finite Difference Discretization

We define weighted versions of the matrices $R$, $C$ and $I$ given in (4.14). For $w : \mathbb{R} \to \mathbb{R}$ define matrices in $\mathbb{R}^{N_k\times N_k}$ with respect to the $k$th coordinate direction
\[
R^{w(x_k)} := \frac{1}{h_k^2}\begin{pmatrix}
2w(x_{k,1}) & -w(x_{k,1}) & & \\
-w(x_{k,2}) & \ddots & \ddots & \\
 & \ddots & \ddots & -w(x_{k,N_k-1}) \\
 & & -w(x_{k,N_k}) & 2w(x_{k,N_k})
\end{pmatrix},
\]
\[
C^{w(x_k)} := \frac{1}{2h_k}\begin{pmatrix}
0 & w(x_{k,1}) & & \\
-w(x_{k,2}) & \ddots & \ddots & \\
 & \ddots & \ddots & w(x_{k,N_k-1}) \\
 & & -w(x_{k,N_k}) & 0
\end{pmatrix},
\]
as well as
\[
I^{w(x_k)} := \mathrm{diag}\big(w(x_{k,1}), \dots, w(x_{k,N_k})\big).
\]
Here, we denote by $x_{k,i_k}$ a grid point of a grid in the $k$th coordinate direction, i.e. $x_{k,i_k} = a_k + h_ki_k$, $h_k = \frac{b_k - a_k}{N_k + 1}$. The following definition will help to simplify the notation.

Definition 9.5.2 For an arbitrary permutation $\sigma : \{1, \dots, d\} \to \{1, \dots, d\}$, $\{1, \dots, d\} \mapsto \{\sigma(1), \dots, \sigma(d)\}$ and matrices $X^{w(x_k)}$, $1 \le k \le d$, we denote by $\mathrm{s}(X^{w(x_{\sigma(1)})} \otimes \cdots \otimes X^{w(x_{\sigma(d)})})$ the sorted Kronecker product with factors sorted by increasing indices, i.e.
\[
\mathrm{s}\big(X^{w(x_{\sigma(1)})} \otimes \cdots \otimes X^{w(x_{\sigma(d)})}\big) := X^{w(x_1)} \otimes \cdots \otimes X^{w(x_d)}.
\]
Using the finite difference quotients $\delta^2_{x_ix_j}$, $\delta_{x_i}$ on the grid
\[
\mathcal{G} := \big\{(x_{1,i_1}, \dots, x_{d,i_d}) \mid 1 \le i_k \le N_k,\ 1 \le k \le d\big\} \subset G_R,
\]
and proceeding exactly as in Sect. 8.4.1, we find that the finite difference matrix $G^{SV}$ corresponding to (9.27) is given by
\begin{align*}
G^{SV} &:= \frac{1}{2}\sum_{i=1}^{d}\mathrm{s}\Big(R^{q^i_{ii}(x_i)} \otimes \bigotimes_{k\ne i}I^{q^k_{ii}(x_k)}\Big) - \frac{1}{2}\sum_{\substack{i,j=1 \\ j\ne i}}^{d}\mathrm{s}\Big(C^{q^i_{ij}(x_i)} \otimes C^{q^j_{ij}(x_j)} \otimes \bigotimes_{k\notin\{i,j\}}I^{q^k_{ij}(x_k)}\Big) \\
&\quad - \sum_{i=1}^{d}\mathrm{s}\Big(C^{\mu^i_i(x_i)} \otimes \bigotimes_{k\ne i}I^{\mu^k_i(x_k)}\Big) - \bigotimes_{k}I^{c^k(x_k)}.
\end{align*}

Assuming homogeneous Dirichlet boundary conditions on $\partial G_R$, discretizing in time with the $\theta$-scheme and denoting again by $N := \prod_{k=1}^{d}N_k = |\mathcal{G}|$ the number of points in the grid $\mathcal{G}$, we obtain the fully discrete finite difference scheme for the truncated pricing equation (9.30)

Find $w^{m+1} \in \mathbb{R}^N$ such that for $m = 0, \dots, M-1$
\[
(I + \theta\Delta t\,G^{SV})w^{m+1} = (I - (1-\theta)\Delta t\,G^{SV})w^m, \qquad w^0 = w_0.
\]

Example 9.5.3 Consider the multi-scale SV model with $n_v = 1$ factor as described in Example 9.1.1 with $\rho = 0$. By (9.16), the coefficients are $q^1_{11}(x_1) = 1$, $q^2_{11}(x_2) = \xi^2(x_2)$, $q^1_{22}(x_1) = 1$, $q^2_{22}(x_2) = \beta^2$, $\mu^1_1(x_1) = 1$, $\mu^2_1(x_2) = r - \frac{1}{2}\xi^2(x_2)$, $\mu^1_2(x_1) = 1$, $\mu^2_2(x_2) = \alpha(m - x_2)$ and $c^1(x_1) = 1$, $c^2(x_2) = -r$. The remaining coefficients are zero. Hence, the finite difference matrix becomes
\[
G^{MS} = \frac{1}{2}R^1 \otimes I^{\xi^2(x_2)} + \frac{1}{2}I^1 \otimes R^{\beta^2} - C^1 \otimes I^{r - \frac{1}{2}\xi^2(x_2)} - I^1 \otimes C^{\alpha(m - x_2)} - I^1 \otimes I^{-r}.
\]
Since for any constants $\gamma_i$ and any weights $w_i$, $i = 1, 2$, one has $X^{\gamma_1w_1 + \gamma_2w_2} = \gamma_1X^{w_1} + \gamma_2X^{w_2}$, and since the Kronecker product is distributive, we simplify the above expression to
\[
G^{MS} = \frac{1}{2}R^1 \otimes I^{\xi^2(x_2)} - C^1 \otimes Y_2 + I^1 \otimes Y_1,
\]
where
\[
Y_1 := \frac{1}{2}\beta^2R^1 - \alpha mC^1 + \alpha C^{x_2} + rI^1, \qquad Y_2 := rI^1 - \frac{1}{2}I^{\xi^2(x_2)}.
\]
We now approximate the value of a call option in the model of Stein–Stein, i.e. $\xi(x_2) = |x_2|$, whose exact price can be found in [145]. To this end, we set $K = 100$, $T = 1/2$ for the contract parameters and $\alpha = 1$, $\beta = 1/\sqrt{2}$, $\rho = 0$, $m = 0.2$, $r = 0$ for the model parameters. The computed option price is plotted in Fig. 9.1. We also give the rate of convergence $O(N^{-1})$ of the $L^\infty$-error with respect to the number of grid points $N = |\mathcal{G}|$. The error is measured on the domain $(K/2, 3/2K) \times (0, 1)$.
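The weighted matrices of Sect. 9.5.1 and the two representations of $G^{MS}$ in Example 9.5.3 can be compared numerically. In the sketch below, row $i$ of the unweighted matrix is scaled by $w(x_{k,i})$, as in the definitions of $R^{w}$, $C^{w}$ and $I^{w}$; the function names, the truncation intervals and the value $r = 0.03$ are illustrative assumptions. In the simplified form, the term $-I^1 \otimes I^{-r}$ is collected as $+rI^1$ inside $Y_1$, which is what distributivity gives.

```python
import numpy as np

def weighted_fd(w, a, b, n):
    """Weighted matrices R^w, C^w, I^w on the uniform grid a + h*i, i = 1..n."""
    h = (b - a) / (n + 1)
    x = a + h * np.arange(1, n + 1)
    W = np.diag(w(x))                                    # I^w
    T2 = 2.0 * np.eye(n) - np.eye(n, k=1) - np.eye(n, k=-1)
    T1 = np.eye(n, k=1) - np.eye(n, k=-1)
    return W @ T2 / h**2, W @ T1 / (2.0 * h), W          # row i scaled by w(x_i)

def const(c):
    """Constant weight function that accepts NumPy arrays."""
    return lambda x: np.full_like(x, c, dtype=float)

def g_ms(xi, alpha, m, beta, r, dom1, dom2, n1, n2):
    """Return G^MS of Example 9.5.3 (rho = 0) in its direct and simplified forms."""
    R1, C1, I1 = weighted_fd(const(1.0), *dom1, n1)      # direction x1, weight 1
    R2, C2, I2 = weighted_fd(const(1.0), *dom2, n2)      # direction x2, weight 1
    I_xi2 = weighted_fd(lambda x: xi(x)**2, *dom2, n2)[2]
    R_b2 = weighted_fd(const(beta**2), *dom2, n2)[0]
    I_mu1 = weighted_fd(lambda x: r - 0.5 * xi(x)**2, *dom2, n2)[2]
    C_mu2 = weighted_fd(lambda x: alpha * (m - x), *dom2, n2)[1]
    I_c = weighted_fd(const(-r), *dom2, n2)[2]
    C_x2 = weighted_fd(lambda x: x, *dom2, n2)[1]
    direct = (0.5 * np.kron(R1, I_xi2) + 0.5 * np.kron(I1, R_b2)
              - np.kron(C1, I_mu1) - np.kron(I1, C_mu2) - np.kron(I1, I_c))
    Y1 = 0.5 * beta**2 * R2 - alpha * m * C2 + alpha * C_x2 + r * I2
    Y2 = r * I2 - 0.5 * I_xi2
    simplified = 0.5 * np.kron(R1, I_xi2) - np.kron(C1, Y2) + np.kron(I1, Y1)
    return direct, simplified

# Stein-Stein volatility xi(y) = |y|; domains and grid sizes are hypothetical
direct, simplified = g_ms(np.abs, 1.0, 0.2, 2**-0.5, 0.03,
                          (-1.0, 1.0), (0.0, 1.0), 5, 6)
```

The simplified form needs three Kronecker products instead of five, which matters once the factors are stored as sparse matrices on fine grids.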

Fig. 9.1 Option price (top) and convergence rate (bottom) for the mean reverting OU model

9.5.2 Finite Element Discretization

As for the finite difference discretization, we have to introduce weighted matrices. For $w : \mathbb{R} \to \mathbb{R}$, define the matrices $S^{w(x_k)}, B^{w(x_k)}, M^{w(x_k)} \in \mathbb{R}^{N_k \times N_k}$ with entries given by
$$
\begin{aligned}
S^{w(x_k)}_{i_k', i_k} &:= \int_{a_k}^{b_k} b_{i_k'}'(x_k)\, b_{i_k}'(x_k)\, w(x_k)\, dx_k, \\
B^{w(x_k)}_{i_k', i_k} &:= \int_{a_k}^{b_k} b_{i_k'}(x_k)\, b_{i_k}'(x_k)\, w(x_k)\, dx_k, \\
M^{w(x_k)}_{i_k', i_k} &:= \int_{a_k}^{b_k} b_{i_k'}(x_k)\, b_{i_k}(x_k)\, w(x_k)\, dx_k,
\end{aligned}
$$
where by $b_{i_k} : (a_k, b_k) \to \mathbb{R}_{\ge 0}$ we denote the hat-function $b_{i_k}(x_k) := \max\{0, 1 - h_k^{-1} |x_k - x_{k,i_k}|\}$ with respect to the $k$th coordinate direction on a uniform mesh $x_{k,i_k} := a_k + i_k h_k$, $i_k = 0,\dots,N_k + 1$, with mesh width $h_k := (b_k - a_k)/(N_k + 1)$. We now prove a representation of the stiffness matrix $A^{SV}$ corresponding to the bilinear form $a^{SV}(\cdot,\cdot)$ in (9.28).
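As an illustration of the weighted matrices, the sketch below assembles $S^{w}$, $B^{w}$ and $M^{w}$ element by element with Gauss–Legendre quadrature. It is an assumption-laden example rather than the book's code: the function name `weighted_fe_matrices` is hypothetical, homogeneous Dirichlet conditions are assumed (only interior hat functions are kept), and $B^{w}$ is taken with the derivative on the column index, $B^{w}_{i',i} = \int b_{i'}\, b_i'\, w\, dx$.

```python
import numpy as np

def weighted_fe_matrices(a, b, n, w, n_quad=4):
    """Assemble S^w, B^w, M^w for the n interior hat functions on (a, b).

    w must accept a NumPy array of quadrature points and return an array.
    """
    h = (b - a) / (n + 1)
    nodes = a + h * np.arange(n + 2)
    xi, om = np.polynomial.legendre.leggauss(n_quad)   # rule on [-1, 1]
    S = np.zeros((n + 2, n + 2))
    B = np.zeros_like(S)
    M = np.zeros_like(S)
    dphi = np.array([-1.0 / h, 1.0 / h])                # local shape derivatives
    for e in range(n + 1):                              # element [x_e, x_{e+1}]
        x = nodes[e] + 0.5 * h * (xi + 1.0)             # mapped quadrature points
        wq = 0.5 * h * om * w(x)                        # quadrature weight times w
        phi = np.vstack([(nodes[e + 1] - x) / h, (x - nodes[e]) / h])
        idx = (e, e + 1)
        for r in range(2):
            for c in range(2):
                S[idx[r], idx[c]] += dphi[r] * dphi[c] * wq.sum()
                B[idx[r], idx[c]] += np.sum(phi[r] * dphi[c] * wq)
                M[idx[r], idx[c]] += np.sum(phi[r] * phi[c] * wq)
    inner = slice(1, n + 1)                             # drop boundary nodes
    return S[inner, inner], B[inner, inner], M[inner, inner], h
```

For the constant weight $w \equiv 1$ the routine should reproduce the familiar matrices $h^{-1}\,\mathrm{tridiag}(-1,2,-1)$, $\frac12\,\mathrm{tridiag}(-1,0,1)$ and $\frac{h}{6}\,\mathrm{tridiag}(1,4,1)$, which is a convenient sanity check before passing non-constant weights such as `lambda x: x**2`.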


Proposition 9.5.4 Let Assumption 9.5.1 hold, and assume the finite element space VN is given by (8.19). Then, the matrix ASV ∈ RN ×N is given by ! d 1  s q i (xi ) 7 qjjk (xk ) SV ii A := S ⊗ M 2 k=i

i=1

1 − 2

+

d 

qiji (xi )

s

B

d j 7 q k (x ) !   q j (xj ) dxj qij (xj ) ij ⊗ ⊗ B +M M ij k

k ∈{i,j / }

i,j =1 j =i

! d d 7 d i k 1 s 1  q (x ) Mqii (xk ) + B dxi ii i ⊗ 2 2 k=i

i=1



d 

s

μii (xi )

B



7

μki (xk )



7

k=i

i=1

with weights

7

qij (xk ) M k

!

k=j

Mck (xk ) ,

k

*  qijk (xk ) :=

j

Bqii (xj ) ⊗

i,j =1 j =i

!

M

s

qijk (xk )

if k = i,

d k dxk qij (xk )

if k = i.

Proof The proof follows the lines which led to Proposition 8.4.2. Since the co6 efficients qij (x) = k qijk (xk ) are not constant, we obtain additional terms due b k b (xk )bik ) bik dxk = to integration by parts, i.e. akk qijk (xk )bik bik dxk = − akk (qik d k b k b k  q k (xk ) dx qik (xk ) − akk qik (xk )bi  bik dxk − akk (qik ) (xk )bik bik dxk = −Bi ik,i  − Mi ,ik  .  k k

k

k k

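Note that the matrices appearing in the representation of $A^{SV}$ have to be implemented as discussed in Sect. 3.4, since the coefficients are not constant.

Example 9.5.5 Consider the transformed Heston model with operator $-A^H_\kappa + r$ in (9.21) and coefficients $q_{11}^2(x_2) = \frac14 x_2^2$, $q_{12}^2(x_2) = q_{21}^2(x_2) = \frac12 \beta\rho x_2$, $q_{22}^2(x_2) = \beta^2$, $\mu_1^2(x_2) = \frac18(-1 + 4\beta\kappa\rho)x_2^2 + r$, $\mu_2^2(x_2) = \frac12(2\beta^2\kappa - \alpha)x_2 + (4\alpha m - \beta^2)(2x_2)^{-1}$ and $c^2(x_2) = \frac12\kappa(\beta^2\kappa - \alpha)x_2^2 + 2\alpha\kappa m - r$ (the coefficients depending on $x_1$ are equal to 1). By Proposition 9.5.4, we find
$$
\begin{aligned}
A^H_\kappa ={}& \frac18 S^1 \otimes M^{x_2^2} + \frac12 M^1 \otimes S^{\beta^2} - \frac12 B^1 \otimes \big(B^{\frac12\beta\rho x_2} + M^{\frac12\beta\rho}\big) - \frac12 B^1 \otimes B^{\frac12\beta\rho x_2} \\
&+ \frac12 B^1 \otimes M^{\frac12\beta\rho} - B^1 \otimes M^{\frac18(-1+4\beta\kappa\rho)x_2^2 + r} - M^1 \otimes B^{\frac12(2\beta^2\kappa - \alpha)x_2 + \frac{4\alpha m - \beta^2}{2x_2}} \\
&- M^1 \otimes M^{\frac12\kappa(\beta^2\kappa - \alpha)x_2^2 + 2\alpha\kappa m - r}.
\end{aligned}
$$
The above expression can be simplified to
$$A^H_\kappa = \frac18 S^1 \otimes M^{x_2^2} + B^1 \otimes Y_1 + M^1 \otimes Y_2, \tag{9.32}$$
where the matrices $Y_1$, $Y_2$ are given by
$$
\begin{aligned}
Y_1 &:= -\frac12\beta\rho\, B^{x_2} + \frac18(1 - 4\beta\kappa\rho)\, M^{x_2^2} - r M^1, \\
Y_2 &:= \frac12\beta^2 S^1 + \frac12(\alpha - 2\beta^2\kappa)\, B^{x_2} - \frac12(4\alpha m - \beta^2)\, B^{x_2^{-1}} + \frac12\kappa(\alpha - \beta^2\kappa)\, M^{x_2^2} + (r - 2\alpha\kappa m)\, M^1.
\end{aligned}
$$

Applying the $\theta$-scheme to discretize in time, we finally obtain the fully discrete scheme to approximate the solution $w$ of the (truncated) weak formulation (9.29):

Find $w^{m+1} \in \mathbb{R}^N$ such that for $m = 0,\dots,M-1$
$$\big(M + \theta\,\Delta t\, A^{SV}\big)\, w^{m+1} = \big(M - (1-\theta)\,\Delta t\, A^{SV}\big)\, w^{m}, \qquad w^0_N = w_0. \tag{9.33}$$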

Note that (9.33) gives approximations $w_N(t_m, x, y_1,\dots,y_p) = w_N(t_m, z)$ to the function $w(t_m, z)$ of (9.18). To obtain approximations $v_N(t_m, z)$ to $v(t_m, z)$ of (9.11), we have, by (9.17), to set $v_N(t_m, z) = w_N(t_m, z)\, e^{\eta(z)}$.

Example 9.5.6 We consider a European call with strike $K = 100$ and maturity $T = 0.5$ within the Heston model, for which we set the parameters $\alpha = 2.5$, $\beta = 0.5$, $\rho = -0.5$, $m = 0.025$, $r = 0$. As shown in Fig. 9.2, the option price at maturity converges in the $L^\infty$-norm at the rate $O(N^{-1})$.
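The time stepping in (9.33) can be sketched in a few lines. The example below is only an illustration under simplifying assumptions: it uses dense matrices and `numpy.linalg.solve`, whereas a practical implementation would use sparse storage and a factorization computed once. The function name `theta_scheme` is hypothetical.

```python
import numpy as np

def theta_scheme(M, A, w0, T, n_steps, theta=0.5):
    """March (M + theta*dt*A) w^{m+1} = (M - (1 - theta)*dt*A) w^m from w^0 = w0."""
    dt = T / n_steps
    lhs = M + theta * dt * A
    rhs = M - (1.0 - theta) * dt * A
    w = np.array(w0, dtype=float)
    for _ in range(n_steps):
        w = np.linalg.solve(lhs, rhs @ w)
    return w
```

With `theta=1.0` this is the backward Euler scheme and with `theta=0.5` the Crank–Nicolson scheme. A simple way to validate the routine is the one-dimensional heat equation with mass matrix $\frac{h}{6}\,\mathrm{tridiag}(1,4,1)$, stiffness matrix $h^{-1}\,\mathrm{tridiag}(-1,2,-1)$ and initial data $\sin(\pi x)$, whose exact solution decays like $e^{-\pi^2 t}$.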

9.6 American Options

American options in stochastic volatility models can be treated similarly to the Black–Scholes case by replacing the Black–Scholes operator $A^{BS}$ by the corresponding stochastic volatility operator. The value of an American option is given as
$$V(t, z) := \sup_{\tau \in \mathcal{T}_{t,T}} E\big[e^{-r(\tau - t)} g(e^{X_\tau}) \mid Z_t = z\big],$$
where $\mathcal{T}_{t,T}$ denotes the set of all stopping times for $Z$. A similar result to Theorem 5.1.1 is not available for general stochastic volatility models due to the possible degeneracy of the coefficients in the SDE. We therefore make the following assumption.

Assumption 9.6.1 Let $v(t, z)$ be a sufficiently smooth solution of the following system of inequalities
$$
\begin{aligned}
\partial_t v - A v + r v &\ge 0 && \text{in } J \times \mathbb{R} \times G_Y, \\
v(t, x) &\ge g(e^x) && \text{in } J \times \mathbb{R} \times G_Y, \\
(\partial_t v - A v + r v)(g - v) &= 0 && \text{in } J \times \mathbb{R} \times G_Y, \\
v(0, z) &= g(e^x) && \text{in } \mathbb{R} \times G_Y,
\end{aligned}
\tag{9.34}
$$
where $A$ is the infinitesimal generator of the process $Z$ and $G_Y$ denotes the state space of $Y$. Then, $V(T - t, z) = v(t, z)$.

Fig. 9.2 Option price (top) and convergence rate (bottom) for the Heston model

We perform the same transformations as described in Sect. 9.3 and obtain the following system of inequalities for $w := (v - v_0)e^{-\eta}$ in the Heston model
$$
\begin{aligned}
\partial_t w - A^H_\kappa w + r w &\ge f^H_\kappa && \text{in } J \times \mathbb{R} \times \mathbb{R}_+, \\
w(t, x) &\ge 0 && \text{in } J \times \mathbb{R} \times \mathbb{R}_+, \\
(\partial_t w - A^H_\kappa w + r w - f^H_\kappa)\, w &= 0 && \text{in } J \times \mathbb{R} \times \mathbb{R}_+, \\
w(0, x, y) &= 0 && \text{in } \mathbb{R} \times \mathbb{R}_+.
\end{aligned}
\tag{9.35}
$$
The set of admissible solutions in the Heston model for the variational form of (9.35) is the convex set $K_0$ given as
$$K_0 := \{v \in V \mid v \ge 0 \text{ a.e. } z \in \mathbb{R} \times \mathbb{R}_+\},$$
where $V$ is given in (9.23). The variational formulation of (9.35) reads:

Find $w \in L^2(J; V) \cap H^1(J; L^2(G))$ such that $w(t, \cdot) \in K_0$ and
$$(\partial_t w, v - w) + a^H_\kappa(w, v - w) \ge \langle f^H_\kappa, v - w \rangle_{V^*, V}, \quad \forall v \in K_0, \text{ a.e. in } J, \qquad w(0) = 0. \tag{9.36}$$

Since the bilinear form $a^H_\kappa(\cdot,\cdot)$ is continuous and satisfies a Gårding inequality in $V$ by Theorem 9.3.1, problem (9.36) admits a unique solution for every payoff $g \in L^\infty(\mathbb{R})$ by Theorem B.2.2 of Appendix B. We localize the problem to a bounded domain as in Sect. 9.4 and obtain the following problem for
$$K_{0,R} := \{v \in \widetilde{V} \mid v \ge 0 \text{ a.e. } z \in G_R\},$$
namely:

Find $w \in L^2(J; \widetilde{V}) \cap H^1(J; L^2(G_R))$ such that $w(t, \cdot) \in K_{0,R}$ and
$$(\partial_t w, v - w) + a^H_\kappa(w, v - w) \ge \langle f^H_\kappa, v - w \rangle_{\widetilde{V}^*, \widetilde{V}}, \quad \forall v \in K_{0,R}, \text{ a.e. in } J, \qquad w(0) = 0. \tag{9.37}$$

For a general diffusion SV model, the weak localized formulation reads:

Find $w \in L^2(J; \widetilde{V}) \cap H^1(J; L^2(G_R))$ such that $w(t, \cdot) \in K_{0,R}$ and
$$(\partial_t w, v - w) + a^{SV}(w, v - w) \ge -a^{SV}(w_0, v - w), \quad \forall v \in K_{0,R}, \text{ a.e. in } J, \qquad w(0) = 0, \tag{9.38}$$
where $K_{0,R}$ and $\widetilde{V}$ depend on the model. Discretization using finite differences or finite elements in space and the backward Euler scheme in time as explained in Sect. 9.5 leads to the following sequence of linear complementarity problems for (9.37).

Given $w^0_N = 0$, find $w^{m+1}_N \in \mathbb{R}^N$ such that for $m = 0,\dots,M-1$,
$$
\begin{aligned}
B\, w^{m+1}_N &\ge F^m, \\
w^{m+1}_N &\ge 0, \\
(w^{m+1}_N)^\top \big(B\, w^{m+1}_N - F^m\big) &= 0,
\end{aligned}
\tag{9.39}
$$
where $B := M + k A^H_\kappa$, $F^m := k f + M w^m_N$ and $f_i = \langle f^H_\kappa, b_i \rangle_{\widetilde{V}^*, \widetilde{V}}$. For a general SV model, we obtain the following system discretizing (9.38).

Given $w^0_N = 0$, find $w^{m+1}_N \in \mathbb{R}^N$ such that for $m = 0,\dots,M-1$,
$$
\begin{aligned}
B\, w^{m+1}_N &\ge F^m, \\
w^{m+1}_N &\ge 0, \\
(w^{m+1}_N)^\top \big(B\, w^{m+1}_N - F^m\big) &= 0,
\end{aligned}
\tag{9.40}
$$
where $B := M + k A^{SV}$, $F^m := k f + M w^m_N$ and $f_i = -a^{SV}(w_0, b_i)$.
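Each time step above requires the solution of one linear complementarity problem. The text does not fix a solver at this point; as one possible choice, the sketch below uses the projected successive over-relaxation (PSOR) iteration. It is a minimal illustration with hypothetical names, and its convergence is only guaranteed under assumptions on the matrix (for instance symmetric positive definite with relaxation parameter in $(0, 2)$), which need not hold for a nonsymmetric stochastic volatility matrix without further care.

```python
import numpy as np

def psor(B, F, w0=None, omega=1.2, tol=1e-10, max_iter=10_000):
    """Approximate w >= 0 with B w >= F and w^T (B w - F) = 0 by projected SOR."""
    n = len(F)
    w = np.zeros(n) if w0 is None else np.array(w0, dtype=float)
    for _ in range(max_iter):
        w_old = w.copy()
        for i in range(n):
            # Gauss-Seidel residual: entries w[0..i-1] are already updated.
            residual = F[i] - B[i] @ w
            # Relaxed update followed by projection onto the constraint w_i >= 0.
            w[i] = max(0.0, w[i] + omega * residual / B[i, i])
        if np.max(np.abs(w - w_old)) < tol:
            break
    return w
```

In a time loop one would call `psor(B, F_m, w0=w_prev)` with the previous time level as the starting guess, which typically reduces the number of sweeps.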

Example 9.6.2 We consider an American put with strike K = 100 and maturity T = 0.5 within the Heston model, for which we set the parameters α = 2.5, β = 0.5, ρ = −0.5, m = 0.06, r = 0. Figure 9.3 depicts the option prices as well as the behavior of the free boundary at t = T .


Fig. 9.3 Option price (top) and free boundary (bottom) for the Heston model

9.7 Further Reading

Background information on stochastic volatility in general can be found in Shephard [151] and the references therein. Stochastic volatility (diffusion models) connected to derivative pricing is considered in Fouque et al. [68]. The diffusion SV models we have considered in this chapter can be extended by adding jumps. Bates [13, 14], for example, adds log-normal jumps to the Heston model. The model of Barndorff-Nielsen and Shephard (BNS) [10] describes the volatility as a non-Gaussian mean-reverting OU process. Introducing SV into an exponential Lévy model via time change was suggested by Carr et al. [37]. The pricing of European and American options via finite difference or finite element methods for diffusion SV models can be found in Achdou and Tchou [2], Ikonen and Toivanen [90] as well as Zvan et al. [166]. Benth and Groth [16] consider option pricing in the BNS model under the minimal entropy EMM. Stochastic volatility models with jumps are described in Chap. 15.

Chapter 10

Lévy Models

One problem with the Black–Scholes model is that empirically observed log-returns of risky assets are not normally distributed, but exhibit significant skewness and kurtosis. If large movements in the asset price occur more frequently than in the BS model of the same variance, the tails of the distribution of $X_t$, $t > 0$, should be “fatter” than in the Black–Scholes case. Another problem is that observed log-returns occasionally appear to change discontinuously. Empirically, certain price processes with no continuous component have been found to allow for a considerably better fit of observed log-returns than the classical BS model. Pricing derivative contracts on such underlyings becomes more involved mathematically and also numerically, since partial integro-differential equations must be solved. We consider a class of price processes which can be purely discontinuous and which contains the Wiener process as a special case, the class of Lévy processes. Lévy processes contain most processes proposed as realistic models for log-returns.

10.1 Lévy Processes

We start by recalling essential definitions and properties of Lévy processes and refer to [18] and [143] for a thorough introduction to Lévy processes.

Definition 10.1.1 An adapted, càdlàg stochastic process $X = \{X_t : t \ge 0\}$ on $(\Omega, \mathcal{F}, \mathbb{P}, \mathbb{F})$ with values in $\mathbb{R}$ such that $X_0 = 0$ is called a Lévy process if it has the following properties:
(i) Independent increments: $X_t - X_s$ is independent of $\mathcal{F}_s$, $0 \le s < t < \infty$;
(ii) Stationary increments: $X_t - X_s$ has the same distribution as $X_{t-s}$, $0 \le s < t < \infty$;
(iii) Stochastically continuous: $\lim_{t \to s} X_t = X_s$, where the limit is taken in probability.

N. Hilber et al., Computational Methods for Quantitative Finance, Springer Finance, DOI 10.1007/978-3-642-35401-4_10, © Springer-Verlag Berlin Heidelberg 2013


Note that for an adapted, càdlàg stochastic process $X = \{X_t : t \ge 0\}$ starting in $X_0 = 0$ on $(\Omega, \mathcal{F}, \mathbb{P}, \mathbb{F})$, (i) and (ii) imply (iii). We can associate to $X = \{X_t : t \in [0, T]\}$ a random measure $J_X$ on $[0, T] \times \mathbb{R}$,
$$J_X(\omega, \cdot) = \sum_{\substack{t \in [0,T] \\ \Delta X_t \ne 0}} 1_{(t, \Delta X_t)},$$
which is called the jump measure. For any measurable subset $B \subset \mathbb{R}$, $J_X([0, t] \times B)$ then counts the number of jumps of $X$ occurring between $0$ and $t$ whose amplitude belongs to $B$. The intensity of $J_X$ is given by the Lévy measure.

Definition 10.1.2 Let $X$ be a Lévy process. The measure $\nu$ on $\mathbb{R}$ defined by
$$\nu(B) = E\big[\#\{t \in [0, 1] : \Delta X_t \ne 0,\ \Delta X_t \in B\}\big], \qquad B \in \mathcal{B}(\mathbb{R}),$$
is called the Lévy measure of $X$. $\nu(B)$ is the expected number, per unit time, of jumps whose size belongs to $B$.

The Lévy measure satisfies $\int_{\mathbb{R}} 1 \wedge z^2\, \nu(dz) < \infty$. Using the Lévy–Itô decomposition, we see that every Lévy process is uniquely defined by the drift $\gamma$, the variance $\sigma^2$ and the Lévy measure $\nu$. The triplet $(\sigma^2, \nu, \gamma)$ is the characteristic triplet of the process $X$.

Theorem 10.1.3 (Lévy–Itô decomposition) Let $X$ be a Lévy process and $\nu$ its Lévy measure. Then, there exist a $\gamma$, $\sigma$ and a standard Brownian motion $W$ such that
$$
\begin{aligned}
X_t &= \gamma t + \sigma W_t + \int_0^t \int_{|x| \ge 1} x\, J_X(ds, dx) + \lim_{\varepsilon \downarrow 0} \int_0^t \int_{\varepsilon \le |x| \le 1} x\, \big(J_X(ds, dx) - \nu(dx)\, ds\big) \\
&= \gamma t + \sigma W_t + \sum_{0 \le s \le t} \Delta X_s 1_{\{|\Delta X_s| \ge 1\}} + \lim_{\varepsilon \downarrow 0} \int_0^t \int_{\varepsilon \le |x| \le 1} x\, \big(J_X(ds, dx) - \nu(dx)\, ds\big),
\end{aligned}
\tag{10.1}
$$
where $J_X$ is the jump measure of $X$.

Proof See [143, Chap. 4].



The characteristic triplet can also be derived from the Lévy–Khinchine representation.

Theorem 10.1.4 (Lévy–Khinchine representation) Let $X$ be a Lévy process with characteristic triplet $(\sigma^2, \nu, \gamma)$. Then, for $t \ge 0$,
$$E[e^{i\xi X_t}] = e^{-t\psi(\xi)}, \quad \xi \in \mathbb{R}, \qquad \text{with } \psi(\xi) = -i\gamma\xi + \frac12 \sigma^2\xi^2 + \int_{\mathbb{R}} \big(1 - e^{i\xi z} + i\xi z 1_{\{|z| \le 1\}}\big)\, \nu(dz). \tag{10.2}$$

Proof See [3, Chap. 1.2.4].

Note that in (10.2) the integral with respect to the Lévy measure exists since the integrand is bounded outside of any neighborhood of 0 and
$$1 - e^{i\xi z} + i\xi z 1_{\{|z| \le 1\}} = O(z^2) \quad \text{as } |z| \to 0.$$
But there are many other ways to obtain an integrable integrand. We could, for example, replace $1_{\{|z| \le 1\}}$ by any bounded measurable function $f : \mathbb{R} \to \mathbb{R}$ satisfying $f(z) = 1 + O(|z|)$ as $|z| \to 0$ and $f(z) = O(1/|z|)$ as $|z| \to \infty$. Different choices of $f$ do not affect $\sigma^2$ and $\nu$, but $\gamma$ depends on the choice of the truncation function. If the Lévy measure satisfies $\int_{|z| \le 1} |z|\, \nu(dz) < \infty$, we can use the zero function as $f$ and get
$$\psi(\xi) = -i\gamma_0\xi + \frac12 \sigma^2\xi^2 + \int_{\mathbb{R}} \big(1 - e^{i\xi z}\big)\, \nu(dz), \tag{10.3}$$
with $\gamma_0 \in \mathbb{R}$. We denote this representation by the triplet $(\sigma^2, \nu, \gamma_0)_0$. Furthermore, if $\int_{\mathbb{R}} \nu(dz) < \infty$, i.e. $X$ is a compound Poisson process, we can rewrite (10.3) as
$$\psi(\xi) = -i\gamma_0\xi + \frac12 \sigma^2\xi^2 + \lambda \int_{\mathbb{R}} \big(1 - e^{i\xi z}\big)\, \nu_0(dz), \tag{10.4}$$
with $\lambda = \int_{\mathbb{R}} \nu(dz)$ and $\nu_0 = \nu/\lambda$. We say that $X$ is of finite activity with jump intensity $\lambda$ and jump size distribution $\nu_0$. If the Lévy measure satisfies $\int_{|z| > 1} |z|\, \nu(dz) < \infty$, then, letting $f$ be the constant function 1, we obtain
$$\psi(\xi) = -i\gamma_c\xi + \frac12 \sigma^2\xi^2 + \int_{\mathbb{R}} \big(1 - e^{i\xi z} + i\xi z\big)\, \nu(dz), \tag{10.5}$$
with triplet $(\sigma^2, \nu, \gamma_c)_c$, where $\gamma_c$ is called the center of $X$ since $E[X_t] = \gamma_c t$. We use the representation (10.5) instead of (10.2) throughout this work but omit the subscript $c$ for simplicity.

No-arbitrage considerations require Lévy processes employed in mathematical finance to be martingales. The following result gives a condition on the characteristic triplet which ensures this.

Lemma 10.1.5 Let $X$ be a Lévy process with characteristic triplet $(\sigma^2, \nu, \gamma)$. Assume $\int_{|z| > 1} |z|\, \nu(dz) < \infty$ and $\int_{|z| > 1} e^z\, \nu(dz) < \infty$. Then, $e^X$ is a martingale with respect to the filtration $\mathbb{F}$ of $X$ if and only if
$$\frac{\sigma^2}{2} + \gamma + \int_{\mathbb{R}} (e^z - 1 - z)\, \nu(dz) = 0.$$

Proof We obtain for $0 \le t < s$ using the independent and stationary increments property
$$E[e^{X_s} \mid \mathcal{F}_t] = E[e^{X_t + X_s - X_t} \mid \mathcal{F}_t] = e^{X_t} E[e^{X_s - X_t}] = e^{X_t} E[e^{X_{s-t}}] = e^{X_t} e^{(t-s)\psi(-i)}.$$

Therefore, setting $\psi(-i) = 0$ and using the Lévy–Khinchine formula (10.5) yields
$$\frac{\sigma^2}{2} + \gamma + \int_{\mathbb{R}} (e^z - 1 - z)\, \nu(dz) = 0.$$
Note that $\psi(-i)$ is well-defined due to [143, Theorem 25.17].



10.2 Lévy Models

Financial models with jumps fall into two categories: Jump–diffusion models have a nonzero Gaussian component and a jump part which is a compound Poisson process with finitely many jumps in every time interval. On the other hand, infinite activity models have an infinite number of jumps in every interval of positive measure. A Brownian motion component is not necessary for infinite activity models since the dynamics of the jumps is already rich enough to generate nontrivial small-time behavior. If there is no Brownian motion component, the models are called pure jump models.

10.2.1 Jump–Diffusion Models

A Lévy process of jump–diffusion type has the following form
$$X_t = \gamma_0 t + \sigma W_t + \sum_{i=1}^{N_t} Y_i, \tag{10.6}$$
where $N$ is a Poisson process with intensity $\lambda$ counting the jumps of $X$, and $Y_i$ are the jump sizes which are modeled by i.i.d. random variables with distribution $\nu_0$. The Lévy measure is given by $\nu = \lambda\nu_0$. To define the parametric model completely, we must specify the distribution $\nu_0$ of the jump sizes.

In the Merton model [125], jumps in the log-price $X$ are assumed to have a Gaussian distribution, i.e. $Y_i \sim N(\mu, \delta^2)$, and therefore,
$$\nu_0(dz) = \frac{1}{\sqrt{2\pi\delta^2}}\, e^{-(z-\mu)^2/(2\delta^2)}\, dz. \tag{10.7}$$

In the Kou model [107], the distribution of jump sizes is an asymmetric exponential with a density of the form
$$\nu_0(dz) = \big(p\beta_+ e^{-\beta_+ z} 1_{\{z > 0\}} + (1 - p)\beta_- e^{-\beta_- |z|} 1_{\{z < 0\}}\big)\, dz, \tag{10.8}$$
with $\beta_+, \beta_- > 0$ governing the decay of the tails for the distribution of the positive and negative jump sizes and $p \in [0, 1]$ representing the probability of an upward jump. The probability distribution of returns of this model has semi-heavy tails.
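For the Merton model the characteristic exponent (10.4) is available in closed form, because the characteristic function of a Gaussian jump size is $E[e^{i\xi Y}] = e^{i\mu\xi - \delta^2\xi^2/2}$. The following sketch evaluates $\psi$ and chooses the drift $\gamma_0$ such that $\psi(-i) = 0$, i.e. such that $e^{X}$ is a martingale. The function names are hypothetical, and the drift formula is the $(\sigma^2, \nu, \gamma_0)_0$ form of the martingale condition, $\gamma_0 = -\sigma^2/2 - \int_{\mathbb{R}} (e^z - 1)\,\nu(dz)$.

```python
import numpy as np

def merton_exponent(xi, sigma, lam, mu, delta, gamma0):
    """psi(xi) from (10.4) with Gaussian jump sizes N(mu, delta^2)."""
    xi = np.asarray(xi, dtype=complex)
    jump_cf = np.exp(1j * mu * xi - 0.5 * delta**2 * xi**2)  # E[exp(i*xi*Y)]
    return -1j * gamma0 * xi + 0.5 * sigma**2 * xi**2 + lam * (1.0 - jump_cf)

def martingale_drift(sigma, lam, mu, delta):
    """gamma0 = -sigma^2/2 - lam*(E[e^Y] - 1), which gives psi(-i) = 0."""
    return -0.5 * sigma**2 - lam * (np.exp(mu + 0.5 * delta**2) - 1.0)
```

Evaluating `merton_exponent(-1j, ...)` with the drift returned by `martingale_drift` should give zero up to rounding, so that $E[e^{X_t}] = e^{-t\psi(-i)} = 1$ for all $t$.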


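10.2.2 Pure Jump Models

A popular class of processes is obtained by subordination of a Brownian motion with drift.

Definition 10.2.1 A Lévy process $X$ is called a subordinator if the sample paths of $X$ are a.s. nondecreasing, i.e. $t_1 \ge t_2 \Rightarrow X_{t_1} \ge X_{t_2}$ a.s.

Since $X_0 = 0$, it immediately follows by the definition that $X_t \ge 0$ a.s. $\forall t \ge 0$. The characteristic triplet of a subordinator has the following properties, see, e.g. [40, Proposition 3.10].

Lemma 10.2.2 Let $G$ be a subordinator. Then the characteristic triplet $(\sigma^2, \nu, \gamma)$ of $G$ satisfies $\sigma^2 = 0$, $\gamma \ge 0$ and $\nu((-\infty, 0]) = 0$, $\int_{\mathbb{R}_+} 1 \wedge z\, \nu(dz) < \infty$.

By Lemma 10.2.2 and (10.3), the characteristic exponent $\psi$ of a subordinator is given by $\psi(\xi) = -i\gamma\xi + \int_{\mathbb{R}_+} (1 - e^{i\xi z})\, \nu(dz)$. Now let $G$ be a subordinator and $W$ a standard Brownian motion. Then, we can construct a Lévy process $X$ by time changing $W$,
$$X_t = \sigma W_{G_t} + \theta G_t, \qquad \sigma > 0,\ \theta \in \mathbb{R},\ t \in [0, T].$$

As an example we use as the subordinator a gamma process to obtain a variance gamma process [118]. We consider a gamma process $G$ with Lévy density $k_G(s) = e^{-s/\vartheta} (\vartheta s)^{-1} 1_{\{s > 0\}}$. Then, using [143, Theorem 30.1], the Lévy measure of $X$ is given for $B \in \mathcal{B}(\mathbb{R})$ by
$$
\begin{aligned}
\nu(B) &= \int_B \int_0^\infty \frac{1}{\sqrt{2\pi s\sigma^2}}\, e^{-\frac{(z - \theta s)^2}{2s\sigma^2}}\, \frac{1}{\vartheta s}\, e^{-\frac{s}{\vartheta}}\, ds\, dz \\
&= \frac{1}{\vartheta\sqrt{2\pi\sigma^2}} \int_B e^{\theta z/\sigma^2} \int_0^\infty s^{-\frac12 - 1}\, e^{-\frac{z^2}{2\sigma^2 s}}\, e^{-\big(\frac{\theta^2}{2\sigma^2} + \frac{1}{\vartheta}\big)s}\, ds\, dz.
\end{aligned}
$$
Using [74, Formula 3.471 (9)] to integrate the second integral, we obtain the Lévy measure
$$\nu(dz) = c\, \Big(\frac{e^{-\beta_+ |z|}}{|z|} 1_{\{z > 0\}} + \frac{e^{-\beta_- |z|}}{|z|} 1_{\{z < 0\}}\Big)\, dz \tag{10.9}$$
[…]
$$\nu(dz) = \Big(c_+ \frac{e^{-\beta_+ |z|}}{|z|^{1+\alpha}} 1_{\{z > 0\}} + c_- \frac{e^{-\beta_- |z|}}{|z|^{1+\alpha}} 1_{\{z < 0\}}\Big)\, dz,$$
with $c_+, c_-, \beta_+, \beta_- > 0$ and $0 \le \alpha < 2$.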
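The time-change construction $X_t = \sigma W_{G_t} + \theta G_t$ translates directly into a sampling routine for the variance gamma process at a fixed time. The sketch below is illustrative only and uses hypothetical names. It relies on the fact that a gamma process with Lévy density $e^{-s/\vartheta}(\vartheta s)^{-1}$ has gamma distributed marginals $G_t$ with shape $t/\vartheta$ and scale $\vartheta$, so that $E[G_t] = t$.

```python
import numpy as np

def sample_variance_gamma(t, sigma, theta, vartheta, n_samples, rng):
    """Draw n_samples of X_t = sigma*W_{G_t} + theta*G_t for a gamma subordinator G."""
    # G_t ~ Gamma(shape=t/vartheta, scale=vartheta): mean t, variance vartheta*t.
    g = rng.gamma(shape=t / vartheta, scale=vartheta, size=n_samples)
    # Conditionally on G_t = g, X_t is normal with mean theta*g and variance sigma^2*g.
    return theta * g + sigma * np.sqrt(g) * rng.standard_normal(n_samples)
```

The sample mean and variance can be compared with $E[X_t] = \theta t$ and $\mathrm{Var}[X_t] = \sigma^2 t + \theta^2 \vartheta t$ as a plausibility check; note that this samples the marginal at a single time only, not a whole path.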


The normal inverse Gaussian process (NIG) was proposed in [8] and has the Lévy density
$$\nu(dz) = \frac{\delta\alpha}{\pi}\, \frac{e^{\beta z} K_1(\alpha |z|)}{|z|}\, dz,$$
where $K_1$ denotes the modified Bessel function of the third kind with index 1 and $\alpha > 0$, $-\alpha < \beta < \alpha$, $\delta > 0$. The NIG model is a special case of the generalized hyperbolic model [62].

10.2.3 Admissible Market Models

We make the following assumptions on our market models.

Assumption 10.2.3 Let $X$ be a Lévy process with characteristic triplet $(\sigma^2, \nu, \gamma)$ and Lévy density $k(z)$ where $\nu(dz) = k(z)\, dz$.
(i) There are constants $\beta_- > 0$, $\beta_+ > 1$ and $C > 0$ such that
$$k(z) \le C \begin{cases} e^{-\beta_- |z|}, & z < -1, \\ e^{-\beta_+ z}, & z > 1. \end{cases} \tag{10.11}$$
(ii) Furthermore, there exist constants $0 < \alpha < 2$ and $C_+ > 0$ such that
$$k(z) \le C_+ \frac{1}{|z|^{1+\alpha}}, \qquad 0 < |z| < 1. \tag{10.12}$$
(iii) If $\sigma = 0$, we assume additionally that there is a $C_- > 0$ such that
$$\frac12 \big(k(z) + k(-z)\big) \ge C_- \frac{1}{|z|^{1+\alpha}}, \qquad 0 < |z| < 1. \tag{10.13}$$

Note that due to the semi-heavy tails (10.11) the Lévy measure satisfies $\int_{|z| > 1} |z|\, \nu(dz) < \infty$ and $\int_{|z| > 1} e^z\, \nu(dz) < \infty$. All Lévy processes described before satisfy these assumptions except for the variance gamma model. Here, $\alpha = 0$ in (10.13), which is not allowed. Nevertheless, we will show in the numerical example that the finite element discretization still converges to the option value with optimal rate.

10.3 Pricing Equation

As in the Black–Scholes case, we assume the risk-neutral dynamics of the underlying asset price is given by
$$S_t = S_0 e^{rt + X_t},$$
where $X$ is a Lévy process with characteristic triplet $(\sigma^2, \nu, \gamma)$ under a non-unique EMM. As shown in Lemma 10.1.5, the martingale condition implies
$$\gamma = -\frac{\sigma^2}{2} - \int_{\mathbb{R}} (e^z - 1 - z)\, \nu(dz).$$
We again want to compute the value of the option in log-price with payoff $g$, which is the conditional expectation
$$v(t, x) = E\big[e^{-r(T-t)} g(e^{rT + X_T}) \mid X_t = x\big].$$
We show that $v(t, x)$ is a solution of a deterministic partial integro-differential equation (PIDE). To prove the Feynman–Kac theorem for Lévy processes, we need a generalization of Proposition 4.1.1.

Proposition 10.3.1 Let $X$ be a Lévy process with characteristic triplet $(\sigma^2, \nu, \gamma)$ where the Lévy measure satisfies (10.11). Denote by $A$ the integro-differential operator
$$(Af)(x) = \frac12 \sigma^2 \partial_{xx} f(x) + \gamma\, \partial_x f(x) + \int_{\mathbb{R}} \big(f(x + z) - f(x) - z\, \partial_x f(x)\big)\, \nu(dz), \tag{10.14}$$
for functions $f \in C^2(\mathbb{R})$ with bounded derivatives. Then, the process $M_t := f(X_t) - \int_0^t (Af)(X_s)\, ds$ is a martingale with respect to the filtration of $X$.

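Proof We need the Itô formula for Lévy processes, which is given in differential form by
$$df(X_t) = \partial_x f(X_{t-})\, dX_t + \frac{\sigma^2}{2} \partial_{xx} f(X_t)\, dt + f(X_t) - f(X_{t-}) - \Delta X_t\, \partial_x f(X_{t-}).$$
Using the Lévy–Itô decomposition
$$dX_t = \gamma\, dt + \sigma\, dW_t + \int_{\mathbb{R} \setminus \{0\}} z\, \widetilde{J}_X(dt, dz),$$
where $\widetilde{J}_X(dt, dz) := J_X(dt, dz) - \nu(dz)\, dt$ denotes the compensated jump measure, we obtain
$$
\begin{aligned}
df(X_t) ={}& \partial_x f(X_{t-})\gamma\, dt + \sigma\, \partial_x f(X_{t-})\, dW_t + \frac{\sigma^2}{2} \partial_{xx} f(X_t)\, dt + \partial_x f(X_{t-}) \int_{\mathbb{R} \setminus \{0\}} z\, \widetilde{J}_X(dt, dz) \\
&+ \int_{\mathbb{R} \setminus \{0\}} \big(f(X_{t-} + z) - f(X_{t-}) - z\, \partial_x f(X_{t-})\big)\, J_X(dt, dz) \\
={}& (Af)(X_{t-})\, dt + \sigma\, \partial_x f(X_{t-})\, dW_t + \int_{\mathbb{R} \setminus \{0\}} \big(f(X_{t-} + z) - f(X_{t-})\big)\, \widetilde{J}_X(dt, dz).
\end{aligned}
$$
As already shown in Proposition 4.1.1, the process $\int_0^t \sigma\, \partial_x f(X_s)\, dW_s$ is a martingale. Therefore, it remains to show that $\int_0^t \int_{\mathbb{R} \setminus \{0\}} \big(f(X_{s-} + z) - f(X_{s-})\big)\, \widetilde{J}_X(ds, dz)$ is a martingale. Using a generalization of Proposition 1.2.7 for Lévy processes (see, e.g. [40, Proposition 8.8]) it is sufficient to show that $E\big[\int_0^t \int_{\mathbb{R}} |f(X_{s-} + z) - f(X_{s-})|^2\, \nu(dz)\, ds\big] < \infty$. We have
$$E\Big[\int_0^t \int_{\mathbb{R}} |f(X_{s-} + z) - f(X_{s-})|^2\, \nu(dz)\, ds\Big] \le T \sup_{x \in \mathbb{R}} |\partial_x f(x)|^2 \int_{\mathbb{R}} |z|^2\, \nu(dz) < \infty,$$
since $f \in C^2(\mathbb{R})$ has bounded derivatives and $\nu$ satisfies (10.11).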

Remark 10.3.2 Using the Lévy–Khinchine representation (see Theorem 10.1.4), we derive from (10.14) the following formula for the action of the operator $A$ on oscillating exponentials
$$
\begin{aligned}
(-A)e^{i\xi x} &= \frac12 \sigma^2\xi^2 e^{i\xi x} - \gamma i\xi e^{i\xi x} - \int_{\mathbb{R}} \big(e^{i\xi(x+z)} - e^{i\xi x} - iz\xi e^{i\xi x}\big)\, \nu(dz) \\
&= \psi(\xi)\, e^{i\xi x},
\end{aligned}
\tag{10.15}
$$
with $\xi \in \mathbb{R}$. For $f \in \mathcal{S}(\mathbb{R})$, the space of functions in $C^\infty(\mathbb{R})$ vanishing at infinity faster than any negative power of $1 + |x|^2$, we can write
$$f(x) = \int_{\mathbb{R}} e^{i\xi x} \hat{f}(\xi)\, d\xi, \tag{10.16}$$
where $\hat{f}(\xi) := (2\pi)^{-1} \int_{\mathbb{R}} e^{-i\xi x} f(x)\, dx$ is the Fourier transform of $f$. By applying (10.15) to the representation (10.16), using Fubini's theorem and the convergence theorem of Lebesgue, we obtain
$$(-Af)(x) = \int_{\mathbb{R}} (-Ae^{i\xi x}) \hat{f}(\xi)\, d\xi = \int_{\mathbb{R}} \psi(\xi)\, e^{i\xi x} \hat{f}(\xi)\, d\xi. \tag{10.17}$$
Thus, $-A$ is a pseudo-differential operator (PDO) with symbol $\psi$.

Repeating the arguments which lead to Theorem 4.1.4 yields

Theorem 10.3.3 Let $v \in C^{1,2}(J \times \mathbb{R}) \cap C^0(\overline{J} \times \mathbb{R})$ with bounded derivatives in $x$ be a solution of
$$\partial_t v + Av - rv = 0 \quad \text{in } J \times \mathbb{R}, \qquad v(T, x) = g(e^x) \quad \text{in } \mathbb{R}, \tag{10.18}$$
where $A$ is as in (10.14) with drift $r + \gamma$. Then, $v(t, x)$ can also be represented as
$$v(t, x) = E\big[e^{-r(T-t)} g(e^{rT + X_T}) \mid X_t = x\big].$$

We again change to time-to-maturity $t \to T - t$ to obtain a forward parabolic problem. Thus, by setting $u(t, x) := e^{rt} v(T - t, x - (\gamma + r)t)$, we remove the drift $\gamma$ and the interest rate $r$. Then, $u$ satisfies
$$\partial_t u - A^J u = 0 \quad \text{in } (0, T) \times \mathbb{R}, \tag{10.19}$$
with the initial condition $u(0, x) = g(e^x)$ in $\mathbb{R}$ and
$$(A^J f)(x) = \frac12 \sigma^2 \partial_{xx} f(x) + \int_{\mathbb{R}} \big(f(x + z) - f(x) - z\, \partial_x f(x)\big)\, \nu(dz). \tag{10.20}$$

The removal of the drift is important for stability of the numerical algorithm. For notational simplicity, we also removed the interest rate $r$.

Remark 10.3.4 If the Lévy measure satisfies $\int_{|z| \le 1} |z|\, \nu(dz) < \infty$, we remove the drift $\gamma_0$ which is given, due to the martingale condition, by
$$\gamma_0 = -\frac{\sigma^2}{2} - \int_{\mathbb{R}} (e^z - 1)\, \nu(dz),$$
and obtain the generator
$$(A^J f)(x) = \frac12 \sigma^2 \partial_{xx} f(x) + \int_{\mathbb{R}} \big(f(x + z) - f(x)\big)\, \nu(dz).$$

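10.4 Variational Formulation

For the variational formulation, we need Sobolev spaces of fractional order, i.e. $H^s(\mathbb{R})$, $s > 0$. For any integer $m$, we defined the $H^m$-norm using the derivative $\partial^m$ (see Sect. 3.1). Equivalently, we can also define the norm using the Fourier transformation. In particular, for $\partial^m u \in L^2(\mathbb{R})$ we have, by using Plancherel's theorem,
$$\int_{\mathbb{R}} |\partial^m u(x)|^2\, dx = 2\pi \int_{\mathbb{R}} \big|\widehat{\partial^m u}(\xi)\big|^2\, d\xi = 2\pi \int_{\mathbb{R}} |\xi|^{2m} |\hat{u}(\xi)|^2\, d\xi,$$
where
$$\hat{u}(\xi) = (2\pi)^{-1} \int_{\mathbb{R}} e^{-i\xi z} u(z)\, dz$$
denotes the Fourier transform of $u$. Therefore, we can define an equivalent $H^s$-norm for $s > 0$ via
$$\|u\|^2_{H^s(\mathbb{R})} := \int_{\mathbb{R}} (1 + |\xi|)^{2s} |\hat{u}(\xi)|^2\, d\xi,$$
and set $H^s(\mathbb{R}) := \{u \in L^2(\mathbb{R}) : \|u\|_{H^s(\mathbb{R})} < \infty\}$. We also set
$$\rho = \begin{cases} 1 & \text{if } \sigma > 0, \\ \alpha/2 & \text{if } \sigma = 0, \end{cases}$$
with $\alpha$ given in Assumption 10.2.3. The variational formulation of the PIDE (10.19) reads:

Find $u \in L^2(J; H^\rho(\mathbb{R})) \cap H^1(J; L^2(\mathbb{R}))$ such that
$$(\partial_t u, v) + a^J(u, v) = 0, \quad \forall v \in H^\rho(\mathbb{R}), \text{ a.e. in } J, \qquad u(0) = u_0, \tag{10.21}$$
where $u_0(x) := g(e^x)$ and the bilinear form $a^J(\cdot,\cdot) : H^\rho(\mathbb{R}) \times H^\rho(\mathbb{R}) \to \mathbb{R}$ is given by
$$a^J(\varphi, \phi) := \frac12 \sigma^2 (\varphi', \phi') - \int_{\mathbb{R}} \int_{\mathbb{R}} \big(\varphi(x + z) - \varphi(x) - z\varphi'(x)\big)\phi(x)\, \nu(dz)\, dx. \tag{10.22}$$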


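Remark 10.4.1 Let $\varphi, \phi \in C_0^\infty(\mathbb{R})$. Then, by the definition of the bilinear form $a^J(\cdot,\cdot)$ and (10.17), we find
$$
\begin{aligned}
a^J(\varphi, \phi) &= \int_{\mathbb{R}} (-A^J\varphi)(x)\phi(x)\, dx = \int_{\mathbb{R}} \int_{\mathbb{R}} \psi(\xi)\, e^{ix\xi} \hat{\varphi}(\xi)\, d\xi\, \phi(x)\, dx \\
&= \int_{\mathbb{R}} \psi(\xi) \hat{\varphi}(\xi) \int_{\mathbb{R}} e^{ix\xi} \phi(x)\, dx\, d\xi = \int_{\mathbb{R}} \psi(\xi) \hat{\varphi}(\xi) \overline{\hat{\phi}(\xi)}\, d\xi.
\end{aligned}
$$

We want to show that $a^J(\cdot,\cdot)$ is continuous (3.8) and satisfies a Gårding inequality (3.9) on $V = H^\rho(\mathbb{R})$. By Remark 10.4.1, it is sufficient to study the characteristic exponent $\psi$. We do this in the following lemma.

Lemma 10.4.2 Let $X$ be a Lévy process with characteristic triplet $(\sigma^2, \nu, 0)$ where the Lévy measure satisfies (10.12), (10.13). Then, there exist constants $C_i > 0$, $i = 1, 2, 3$, such that for all $\xi \in \mathbb{R}$
$$\Re\psi(\xi) \ge C_1 |\xi|^{2\rho}, \qquad |\psi(\xi)| \le C_2 |\xi|^{2\rho} + C_3.$$

Proof Consider first $\sigma = 0$ and denote $k^{sym}(z) = \frac12 (k(z) + k(-z))$. Using (10.13) and $1 - \cos(z) = 2\sin(z/2)^2 \ge C_1 z^2$ for $|z| \le 1$, we obtain
$$
\begin{aligned}
\Re\psi(\xi) &= \int_{\mathbb{R}} (1 - \cos(\xi z))\, k^{sym}(z)\, dz \ge C_1 \int_0^{1/|\xi|} \xi^2 z^2 k^{sym}(z)\, dz \\
&\ge C_1 C_- \int_0^{1/|\xi|} \xi^2 z^{1-\alpha}\, dz = \frac{C_1 C_-}{2 - \alpha} |\xi|^\alpha.
\end{aligned}
$$
For an upper bound, we first consider $\alpha < 1$. Then, with $\int_{|z| \le 1} |z|\, \nu(dz) < \infty$ and (10.12),
$$
\begin{aligned}
|\psi(\xi)| &= \Big|\int_{\mathbb{R}} \big(1 - e^{i\xi z}\big)\, \nu(dz)\Big| \le \int_{-1/|\xi|}^{1/|\xi|} |e^{i\xi z} - 1|\, \nu(dz) + C_2 \\
&\le 2 \int_{-1/|\xi|}^{1/|\xi|} |\xi z|\, \nu(dz) + C_2 \le 4C_+ \int_0^{1/|\xi|} |\xi| |z|^{-\alpha}\, dz + C_2 \le \frac{4C_+}{1 - \alpha} |\xi|^\alpha + C_2.
\end{aligned}
$$
Similarly, we obtain for $1 \le \alpha < 2$ that
$$
\begin{aligned}
|\psi(\xi)| &= \Big|\int_{\mathbb{R}} \big(1 - e^{i\xi z} + i\xi z\big)\, \nu(dz)\Big| \le \int_{-1/|\xi|}^{1/|\xi|} |e^{i\xi z} - 1 - i\xi z|\, \nu(dz) + C_3 + C_4 |\xi| \\
&\le \int_{-1/|\xi|}^{1/|\xi|} |\xi z|^2\, \nu(dz) + C_3 + C_4 |\xi| \le 2C_+ \int_0^{1/|\xi|} |\xi|^2 |z|^{1-\alpha}\, dz + C_3 + C_4 |\xi| \\
&\le C_5 |\xi|^\alpha + C_6.
\end{aligned}
$$
For $\sigma > 0$, we immediately have
$$
\begin{aligned}
\Re\psi(\xi) &= \frac12 \sigma^2\xi^2 + \int_{\mathbb{R}} (1 - \cos(\xi z))\, \nu(dz) \ge \frac12 \sigma^2\xi^2, \\
|\psi(\xi)| &\le \frac12 \sigma^2\xi^2 + \int_{\mathbb{R}} |e^{i\xi z} - 1 - i\xi z|\, \nu(dz) \le C_7 \xi^2 + C_8.
\end{aligned}
$$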

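Lemma 10.4.2 shows that $\psi$ satisfies the so-called sector condition $|\Im\psi(\xi)| \le C\, \Re\psi(\xi)$, for all $\xi \in \mathbb{R}$. Using this, we can show

Theorem 10.4.3 Let $X$ be a Lévy process with characteristic triplet $(\sigma^2, \nu, 0)$ where the Lévy measure satisfies (10.12), (10.13). Then, there exist constants $C_i > 0$, $i = 1, 2, 3$, such that for all $\varphi, \phi \in H^\rho(\mathbb{R})$ one has
$$|a^J(\varphi, \phi)| \le C_1 \|\varphi\|_{H^\rho(\mathbb{R})} \|\phi\|_{H^\rho(\mathbb{R})}, \qquad a^J(\varphi, \varphi) \ge C_2 \|\varphi\|^2_{H^\rho(\mathbb{R})} - C_3 \|\varphi\|^2_{L^2(\mathbb{R})}.$$

Proof Using Lemma 10.4.2, there exist positive constants $C_1, C_2 > 0$ such that $\Re\psi(\xi) \ge C_1 |\xi|^{2\rho}$, $|\psi(\xi)| \le C_2 \big(|\xi|^{2\rho} + 1\big)$. Then, we have, using Hölder's inequality and Remark 10.4.1,
$$
\begin{aligned}
|a^J(\varphi, \phi)| &= \Big|\int_{\mathbb{R}} \psi(\xi) \hat{\varphi}(\xi) \overline{\hat{\phi}(\xi)}\, d\xi\Big| \le C_2 \int_{\mathbb{R}} (1 + |\xi|^{2\rho}) |\hat{\varphi}(\xi) \hat{\phi}(\xi)|\, d\xi \\
&\le \widetilde{C}_2 \int_{\mathbb{R}} (1 + |\xi|)^{2\rho} |\hat{\varphi}(\xi) \hat{\phi}(\xi)|\, d\xi \\
&\le \widetilde{C}_2 \Big(\int_{\mathbb{R}} (1 + |\xi|)^{2\rho} |\hat{\varphi}(\xi)|^2\, d\xi\Big)^{1/2} \Big(\int_{\mathbb{R}} (1 + |\xi|)^{2\rho} |\hat{\phi}(\xi)|^2\, d\xi\Big)^{1/2} \\
&= \widetilde{C}_2 \|\varphi\|_{H^\rho(\mathbb{R})} \|\phi\|_{H^\rho(\mathbb{R})},
\end{aligned}
$$
where we used the fact that there exists a $c > 0$ such that 0 […]

[…] 0. and can be computed with at most O(N […]

Recall that $\hat{u}_{L,0}$ on the left hand side of (13.15) is the $L^2$-projection of the payoff $g$ onto $\hat{V}_L$ (13.10). Thus, its corresponding coefficient vector $u_0$ appearing in (12.13) with respect to the basis of $\hat{V}_L$ is the unique solution of the system $M u_0 = g$ with the right hand side $g \in \mathbb{R}^{\hat{N}_L}$ given by
$$g_{(\ell, k)} = \int_G g(x_1,\dots,x_d)\, \psi_{\ell_1,k_1}(x_1) \cdots \psi_{\ell_d,k_d}(x_d)\, dx_1 \cdots dx_d, \qquad |\ell|_1 \le L,\ k \in \nabla_\ell.$$
If $g$ factorizes, i.e. $g(x_1,\dots,x_d) = \prod_{j=1}^{d} g_j(x_j)$ for some univariate $g_j : \mathbb{R} \to \mathbb{R}$, $j = 1,\dots,d$, then
$$g_{(\ell, k)} = \prod_{j=1}^{d} \int_{a_j}^{b_j} g_j(x_j)\, \psi_{\ell_j,k_j}(x_j)\, dx_j.$$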


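However, for most payoffs the factorizing property does not hold. To avoid numerical quadrature, we use integration by parts to find, in the sense of distributions,
$$g_{(\ell, k)} = \int_G g^{(-2)}(x_1,\dots,x_d)\, \psi''_{\ell_1,k_1}(x_1) \cdots \psi''_{\ell_d,k_d}(x_d)\, dx_1 \cdots dx_d, \tag{13.16}$$
where
$$g^{(-k)}(x) := \int_{[x_0, x]} g^{(-k+1)}(y)\, dy, \qquad x \in \mathbb{R}^d,\ k \ge 1,$$
for a suitable $x_0 \in \mathbb{R} \cup \{-\infty\}$. Let $\psi_{\ell_i,k_i} \in V_L$ be a continuous, piecewise linear spline wavelet and denote its singular support by
$$\operatorname{singsupp} \psi_{\ell_i,k_i} =: \{x^1_{\ell_i,k_i},\dots,x^{n_i}_{\ell_i,k_i}\}.$$
Then, the integral in (13.16) becomes
$$g_{(\ell, k)} = \sum_{\substack{1 \le j_i \le n_i \\ 1 \le i \le d}} g^{(-2)}\big(x^{j_1}_{\ell_1,k_1},\dots,x^{j_d}_{\ell_d,k_d}\big)\, \omega^{j_1}_{\ell_1,k_1} \cdots \omega^{j_d}_{\ell_d,k_d},$$
where the weights $\omega^1_{\ell_i,k_i},\dots,\omega^{n_i}_{\ell_i,k_i} \in \mathbb{R}$ depend only on the wavelet $\psi_{\ell_i,k_i}$. As an example, consider the $L^2$-normalized wavelets $\psi_{\ell_i,k_i} : (a_i, b_i) \to \mathbb{R}$ defined on an interval $(a_i, b_i)$ as described in Example 12.1.1. Then
$$(\omega^1_{\ell_i,k_i}, \omega^2_{\ell_i,k_i}, \omega^3_{\ell_i,k_i}, \omega^4_{\ell_i,k_i}, \omega^5_{\ell_i,k_i}) = \sqrt{3}\,(b_i - a_i)^{-\frac32}\, 2^{\frac32 \ell_i}\, (-1, 4, -6, 4, -1),$$
if $\ell_i \ge 1$ and $\psi_{\ell_i,k_i}$ is an interior wavelet, and
$$
\begin{aligned}
(\omega^1_{\ell_i,1}, \omega^2_{\ell_i,1}, \omega^3_{\ell_i,1}, \omega^4_{\ell_i,1}) &= \sqrt{3}\,(b_i - a_i)^{-\frac32}\, 2^{\frac32 \ell_i}\, (2, -5, 4, -1), \\
(\omega^1_{\ell_i,N_{\ell_i}}, \omega^2_{\ell_i,N_{\ell_i}}, \omega^3_{\ell_i,N_{\ell_i}}, \omega^4_{\ell_i,N_{\ell_i}}) &= \sqrt{3}\,(b_i - a_i)^{-\frac32}\, 2^{\frac32 \ell_i}\, (-1, 4, -5, 2),
\end{aligned}
$$
if $\ell_i \ge 1$ and $\psi_{\ell_i,1}$, $\psi_{\ell_i,N_{\ell_i}}$ is a left or a right boundary wavelet, respectively.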
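The idea of replacing quadrature by point evaluations of a second antiderivative can be illustrated in one dimension. The sketch below is a simplified analogue, not the wavelet formula itself: it uses a single hat function on a uniform mesh instead of a wavelet, whose second distributional derivative consists of point masses with weights $(1, -2, 1)/h$ at its three nodes, and the hypothetical payoff $g(x) = \max(x - K, 0)$ with $g^{(-2)}(x) = \max(x - K, 0)^3/6$. All function names are illustrative.

```python
import numpy as np

def second_antiderivative_call(x, K):
    """g^(-2)(x) for g(x) = max(x - K, 0), integrated twice from -infinity."""
    return np.maximum(x - K, 0.0) ** 3 / 6.0

def load_entry_by_parts(x_left, x_mid, x_right, K):
    """Integral of g times the hat function at x_mid via the weights (1, -2, 1)/h."""
    h = x_mid - x_left                      # uniform mesh: x_right - x_mid == h
    weights = np.array([1.0, -2.0, 1.0]) / h
    points = np.array([x_left, x_mid, x_right])
    return float(weights @ second_antiderivative_call(points, K))

def load_entry_by_quadrature(x_left, x_mid, x_right, K, n=200_000):
    """Reference value of the same integral by the composite midpoint rule."""
    edges = np.linspace(x_left, x_right, n + 1)
    x = 0.5 * (edges[:-1] + edges[1:])
    h = x_mid - x_left
    hat = np.maximum(0.0, 1.0 - np.abs(x - x_mid) / h)
    return float(np.sum(np.maximum(x - K, 0.0) * hat) * (x_right - x_left) / n)
```

Both routines should agree up to the quadrature error of the reference, while the first needs only three evaluations of $g^{(-2)}$ and is unaffected by the kink of the payoff at $K$.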

13.4 Diffusion Models

We consider a process $Z$ to model the dynamics of the underlying stock prices and of the background volatility drivers in case of stochastic volatility models. If $Z$ is Markovian, the fair price of a European style contingent claim with underlying $Z$ is given by
$$u(t, z) = E^{\mathbb{Q}}\big[e^{-r(T-t)} g(Z_T) \mid Z_t = z\big]. \tag{13.17}$$
We model the market $Z$ by the stochastic differential equation (SDE)
$$dZ_t = \mu(Z_t)\, dt + \Sigma(Z_t)\, dW_t, \qquad Z_0 = z. \tag{13.18}$$
Herewith, for $G \subseteq \mathbb{R}^d$, the coefficients $\mu : G \to \mathbb{R}^d$, $\Sigma : G \to \mathbb{R}^{d \times d}$ are assumed to satisfy (1.3). We consider next two kinds of market dynamics, namely the Black–Scholes model and stochastic volatility models.


13.4.1 Aggregated Black–Scholes Models

In the following, a full-rank Black–Scholes model and a low-rank Black–Scholes model are introduced. Option pricing under the low-rank Black–Scholes model leads to lower dimensional PDEs compared to the full-rank Black–Scholes model, introducing an additional error due to the rank reduction. Consider $d$ assets $S = (S^1,\dots,S^d)$ with spot price dynamics $Z^i = S^i$ given by
$$dS^i_t = \mu_i S^i_t\, dt + \sum_{j=1}^{d} \Sigma_{ij} S^i_t\, dW^j_t, \qquad i = 1,\dots,d, \tag{13.19}$$
where $W = \{W_t : t \ge 0\}$ is a standard Brownian motion in $\mathbb{R}^d$ and
$$\mu := (\mu_i)_{1 \le i \le d} \in \mathbb{R}^d, \tag{13.20}$$
$$\Sigma := (\Sigma_{ij})_{1 \le i,j \le d} \in \mathbb{R}^{d \times d} \tag{13.21}$$
are the constant drift vector and volatility matrix, respectively, with the assumption that $\operatorname{rank} \Sigma = d$. The state space domain is given by $G = \mathbb{R}^d$. Under the unique EMM, the log-price dynamics $X^i := \log S^i$ are given by
$$dX^i_t = \eta_i\, dt + \sum_{j=1}^{d} \Sigma_{ij}\, dW^j_t, \qquad i = 1,\dots,d, \tag{13.22}$$
where $\eta_i := r - \frac12 Q_{ii}$, $i = 1,\dots,d$, and $Q := \Sigma\Sigma^\top \in \mathbb{R}^{d \times d}_{sym}$ denotes the volatility covariance matrix. Since $Q$ is symmetric positive definite, there exists an orthogonal matrix $U \in \mathbb{R}^{d \times d}$ such that $UQU^\top = D$ with diagonal matrix $D := \operatorname{diag}(s_1^2,\dots,s_d^2)$, $s_1 \ge \cdots \ge s_d > 0$. Without loss of generality, we rescale time in (13.19) such that $t \to t^* = s_1^2 t$, yielding $D^* := \operatorname{diag}(s_1^{*2},\dots,s_d^{*2})$ with normalized $s_1^* = 1$, $s_i^* = s_i/s_1$, $i = 2,\dots,d$. In the remainder, we drop the $*$ and define a process $Y := \{UX_t : t \ge 0\}$ with dynamics
$$dY^i_t = \lambda_i\, dt + s_i\, dW^i_t, \qquad i = 1,\dots,d, \tag{13.23}$$
where $\lambda := U\eta$. The components $Y^1,\dots,Y^d$ now satisfy the system of $d$ decoupled SDEs (13.23).

Let $1 \le \hat{d} < d$ be a parameter and define $\hat{D} := \operatorname{diag}(\hat{s}_1^2,\dots,\hat{s}_d^2) \in \mathbb{R}^{d \times d}$ with
$$\hat{s}_i = \begin{cases} s_i, & 1 \le i \le \hat{d}, \\ 0, & \hat{d} + 1 \le i \le d, \end{cases} \tag{13.24}$$
and $\hat{\Sigma} := U^\top \hat{D}^{\frac12}$ with $U$ and $s_i$, $i = 1,\dots,d$, as before. Consider the log-price process $\hat{X} := \{\hat{X}_t : t \ge 0\}$ with dynamics
$$d\hat{X}^i_t = \hat{\eta}_i\, dt + \sum_{j=1}^{d} \hat{\Sigma}_{ij}\, dW^j_t, \qquad i = 1,\dots,d, \tag{13.25}$$
where $\hat{\eta}_i := r - \frac12 \hat{Q}_{ii}$, $i = 1,\dots,d$, and $\hat{Q} = \hat{\Sigma}\hat{\Sigma}^\top \in \mathbb{R}^{d \times d}_{sym}$ and $W^j_t$, $j = 1,\dots,d$, as in (13.19). We designate the process $\hat{X}$ as rank-reduced from $d$ to $\hat{d}$ since, under the change of basis induced by $U$, the process $\hat{Y} := \{U\hat{X}_t : t \ge 0\}$ has dynamics
$$d\hat{Y}^i_t = \hat{\lambda}_i\, dt + \hat{s}_i\, dW^i_t = \hat{\lambda}_i\, dt + 1_{\{1 \le i \le \hat{d}\}}\, s_i\, dW^i_t, \qquad i = 1,\dots,d, \tag{13.26}$$
where $\hat{\lambda} := U\hat{\eta}$. The components $\hat{Y}^{\hat{d}+1},\dots,\hat{Y}^d$ are therefore deterministic. Under such market dynamics, the price of a European contingent claim (13.17) becomes
$$
\begin{aligned}
u(t, \hat{x}) = v(t, \hat{y}) &= \hat{v}(t, \hat{y}_1,\dots,\hat{y}_{\hat{d}}; \hat{y}_{\hat{d}+1},\dots,\hat{y}_d) \\
&= E\big[e^{-r(T-t)} \hat{f}(e^{\hat{Y}^1_T},\dots,e^{\hat{Y}^{\hat{d}}_T}; e^{\hat{y}_{\hat{d}+1}},\dots,e^{\hat{y}_d}) \mid (\hat{Y}^1_t,\dots,\hat{Y}^{\hat{d}}_t) = (\hat{y}_1,\dots,\hat{y}_{\hat{d}})\big],
\end{aligned}
$$
with $\hat{y} := U\hat{x}$ and
$$
\begin{aligned}
\hat{f}(e^{\hat{y}}) &= \hat{f}(e^{\hat{y}_1},\dots,e^{\hat{y}_{\hat{d}}}; e^{\hat{y}_{\hat{d}+1}},\dots,e^{\hat{y}_d}) \\
&= f\big(e^{\hat{y}_1},\dots,e^{\hat{y}_{\hat{d}}}, e^{\hat{y}_{\hat{d}+1} + \hat{\lambda}_{\hat{d}+1}(T-t)},\dots,e^{\hat{y}_d + \hat{\lambda}_d(T-t)}\big),
\end{aligned}
\tag{13.27}
$$
herewith $f(e^y) := g(e^x)$. The price $u(t, \hat{x}) = \hat{v}(t, \hat{y}_1,\dots,\hat{y}_{\hat{d}}; \hat{y}_{\hat{d}+1},\dots,\hat{y}_d)$ becomes a solution of a $\hat{d}$-dimensional PDE with the initial condition depending on the parameters $\hat{y}_{\hat{d}+1},\dots,\hat{y}_d$.

Suppose a time-rescaled $d$-dimensional Black–Scholes market model with log-price process $X$ as in (13.19) has a volatility covariance matrix $Q$ of full rank. Given $0 \le \varepsilon \ll 1$, assume that a principal component analysis of $Q$, i.e. $UQU^\top = D = \operatorname{diag}(s_1^2,\dots,s_d^2)$ with $s_1 = 1 \ge \cdots \ge s_d > 0$, yields
$$s_i^2 \le \varepsilon, \qquad i = \hat{d} + 1,\dots,d,$$
for some $\hat{d} = \hat{d}(\varepsilon)$. This suggests the $d$-dimensional dynamics to be mainly driven by $\hat{d} < d$ $\varepsilon$-aggregated price processes (see, e.g. [139]). We denote by $\hat{X}^\varepsilon$ the $\varepsilon$-aggregated rank-$\hat{d}$ process of $X$ with $\hat{s}_{\hat{d}+1} = \cdots = \hat{s}_d = 0$ and define the $\varepsilon$-residual market process, i.e. the aggregation remainder, $R^\varepsilon := X - \hat{X}^\varepsilon$. From (13.22) and (13.25), we have that
$$dR^{\varepsilon,i}_t = d(X - \hat{X}^\varepsilon)^i_t = (\eta_i - \hat{\eta}_i)\, dt + \sum_{j=1}^{d} (\Sigma_{ij} - \hat{\Sigma}_{ij})\, dW^j_t. \tag{13.28}$$
Under the change of basis induced by $U$, the process $T^\varepsilon := \{UR^\varepsilon_t : t \ge 0\}$, i.e. the fluctuation of $X$ about the $\varepsilon$-aggregate $\hat{X}^\varepsilon$, has therefore dynamics
$$
dT^{\varepsilon,i}_t = (U(\eta - \hat{\eta}))_i\, dt + \sum_{j=1}^{d} \big((D - \hat{D})^{\frac12}\big)_{ij}\, dW^j_t
= \begin{cases} (\lambda_i - \hat{\lambda}_i)\, dt, & 1 \le i \le \hat{d}(\varepsilon), \\ (\lambda_i - \hat{\lambda}_i)\, dt + s_i\, dW^i_t, & \hat{d}(\varepsilon) + 1 \le i \le d. \end{cases}
\tag{13.29}
$$

Lemma 13.4.1 Given $\hat d = \hat d(\varepsilon)$, there holds

$$|\lambda_i - \hat\lambda_i| \le \frac d2 \sum_{j=\hat d+1}^{d} s_j^2, \qquad i = 1,\dots,d.$$

Proof From the definitions of $\eta$ and $\lambda$, we have that

$$\begin{aligned}
|\lambda_i - \hat\lambda_i| &= \Bigl|\sum_{k=1}^{d} U_{ik}(\eta_k - \hat\eta_k)\Bigr| \le \sum_{k=1}^{d} |\eta_k - \hat\eta_k| = \frac12 \sum_{k=1}^{d} |Q_{kk} - \hat Q_{kk}| \\
&= \frac12 \sum_{k=1}^{d} \Bigl|\sum_{j=1}^{d} U_{jk}^2 (D_{jj} - \hat D_{jj})\Bigr| \le \frac12 \sum_{k=1}^{d} \sum_{j=\hat d+1}^{d} s_j^2 = \frac d2 \sum_{j=\hat d+1}^{d} s_j^2, \qquad i = 1,\dots,d. \qquad\square
\end{aligned}$$

Remark 13.4.2 From (13.29) and Lemma 13.4.1, we conclude that the fluctuation components $T^{\varepsilon,i}$, $i = 1,\dots,\hat d(\varepsilon)$, are pure drifts of order $\varepsilon$. Furthermore, note that, upon setting

$$d \to \tilde d = d - \hat d(\varepsilon), \qquad t \to \tilde t = s_{\hat d(\varepsilon)+1}^2\, t,$$

the fluctuation components $T^{\varepsilon,i}$, $i = \hat d(\varepsilon)+1,\dots,d$, again define a $\tilde d$-dimensional full-rank market of type (13.19)–(13.21) with timescale $\tilde t$, allowing, in principle, for recursive $\varepsilon$-rank aggregation.

We again consider the $d$-dimensional Black–Scholes market model with log-price process $X$ and its $\varepsilon$-aggregate rank-$\hat d$ process $\hat X^\varepsilon$ of the previous section, and we estimate the error of approximating $u(t,x)$ by $\hat u(t,\hat x^\varepsilon)$ in (13.17).

Theorem 13.4.3 Assume that the payoff $g$ is Lipschitz. Then, there exists a constant $C(x)$ independent of $\varepsilon$ such that

$$|u(t,x) - \hat u(t,\hat x^\varepsilon)| \le C(x) \sum_{i=\hat d(\varepsilon)+1}^{d} s_i^2.$$

Proof We introduce the artificial process $\tilde Y^\varepsilon$ with dynamics

$$d\tilde Y_t^{\varepsilon,i} = \lambda_i\,dt + 1_{\{1\le i\le \hat d(\varepsilon)\}}\, s_i\,dW_t^i, \qquad i = 1,\dots,d.$$

Under the change of basis induced by $U$, we have

$$|u(t,x) - \hat u(t,\hat x^\varepsilon)| = |v(t,y) - \hat v(t,\hat y^\varepsilon)| \le |v(t,y) - \tilde v(t,\tilde y)| + |\tilde v(t,\tilde y) - \hat v(t,\hat y^\varepsilon)|. \tag{13.30}$$


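The two terms in (13.30) are estimated separately. Since $g$ is globally Lipschitz, we have for the first term, where $f(e^y) = g(e^x)$ and constants may change between lines,

$$\begin{aligned}
|v(t,y) - \tilde v(t,\tilde y)| &= \Bigl|\mathbb{E}\bigl[f(e^{y+Y_{T-t}}) - f(e^{y+\tilde Y^\varepsilon_{T-t}})\bigr]\Bigr| \le C \sum_{i=1}^{d} \mathbb{E}\Bigl|e^{y_i+Y^i_{T-t}} - e^{y_i+\tilde Y^{\varepsilon,i}_{T-t}}\Bigr| \\
&= C \sum_{i=\hat d(\varepsilon)+1}^{d} e^{y_i+\lambda_i(T-t)}\, \mathbb{E}\Bigl|e^{s_i W^i_{T-t}} - 1\Bigr| \\
&= C \sum_{i=\hat d(\varepsilon)+1}^{d} e^{y_i+\lambda_i(T-t)} \int_{\mathbb{R}} |e^z - 1|\, e^{-z^2/(2s_i^2)}\,dz \\
&\le C \sum_{i=\hat d(\varepsilon)+1}^{d} e^{y_i} s_i^2 \le C(y) \sum_{i=\hat d(\varepsilon)+1}^{d} s_i^2.
\end{aligned}$$

Similarly, using Lemma 13.4.1, we have for the second term

$$\begin{aligned}
|\tilde v(t,\tilde y) - \hat v(t,\hat y^\varepsilon)| &= \Bigl|\mathbb{E}\bigl[f(e^{y+\tilde Y^\varepsilon_{T-t}}) - f(e^{y+\hat Y^\varepsilon_{T-t}})\bigr]\Bigr| \le C \sum_{i=1}^{d} \mathbb{E}\Bigl|e^{y_i+\tilde Y^{\varepsilon,i}_{T-t}} - e^{y_i+\hat Y^{\varepsilon,i}_{T-t}}\Bigr| \\
&= C \sum_{i=1}^{d} e^{y_i}\, \mathbb{E}\Bigl[e^{1_{\{1\le i\le \hat d(\varepsilon)\}} s_i W^i_{T-t}}\Bigr] \Bigl|e^{\lambda_i(T-t)} - e^{\hat\lambda_i(T-t)}\Bigr| \\
&\le C(y) \sum_{i=1}^{d} e^{y_i+\frac12 s_i^2(T-t)}\, |\lambda_i - \hat\lambda_i| \le C(y) \sum_{i=1}^{d} e^{y_i}(1+s_i^2) \sum_{j=\hat d(\varepsilon)+1}^{d} s_j^2 \\
&\le C(y) \sum_{j=\hat d(\varepsilon)+1}^{d} s_j^2.
\end{aligned}$$

Since $y = Ux$, $C(y) = C'(x)$, which completes the proof. $\square$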
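The rank reduction and the drift bound of Lemma 13.4.1 can be checked numerically. The following is a minimal sketch, not the authors' implementation: it builds $Q = U^\top D U$ from a hand-picked orthogonal matrix and spectrum (both hypothetical), truncates the spectrum to rank $\hat d$, and compares $\max_i |\lambda_i - \hat\lambda_i|$ with $\tfrac d2 \sum_{j>\hat d} s_j^2$. The helper names `rotation` and `rank_reduce` are illustrative assumptions.

```python
import math

def matmul(A, B):
    return [[sum(a * b for a, b in zip(row, col)) for col in zip(*B)] for row in A]

def transpose(A):
    return [list(col) for col in zip(*A)]

def rotation(d, p, q, angle):
    """Plane rotation in coordinates (p, q); orthogonal by construction."""
    R = [[float(i == j) for j in range(d)] for i in range(d)]
    c, s = math.cos(angle), math.sin(angle)
    R[p][p], R[p][q], R[q][p], R[q][q] = c, -s, s, c
    return R

def rank_reduce(U, s2, d_hat, r):
    """Return (lam, lam_hat): drifts lambda = U eta for the full and rank-d_hat market."""
    d = len(s2)
    s2_hat = [s2[i] if i < d_hat else 0.0 for i in range(d)]  # (13.24)

    def drift(spec):
        D = [[spec[i] if i == j else 0.0 for j in range(d)] for i in range(d)]
        Q = matmul(matmul(transpose(U), D), U)        # Q = U^T D U, so U Q U^T = D
        eta = [r - 0.5 * Q[i][i] for i in range(d)]   # eta_i = r - Q_ii / 2
        return [sum(U[i][k] * eta[k] for k in range(d)) for i in range(d)]

    return drift(s2), drift(s2_hat)

# Hypothetical example data: d = 3, normalized spectrum with s_1 = 1.
d, d_hat, r = 3, 1, 0.01
U = matmul(rotation(d, 0, 1, 0.7), rotation(d, 1, 2, -0.4))
s2 = [1.0, 0.05, 0.01]
lam, lam_hat = rank_reduce(U, s2, d_hat, r)
bound = 0.5 * d * sum(s2[d_hat:])
max_gap = max(abs(a - b) for a, b in zip(lam, lam_hat))
print(max_gap, "<=", bound)
```

The drift gap stays below the bound for any orthogonal $U$; the bound is not sharp, since the proof estimates $|U_{ik}| \le 1$ termwise.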



13.4.2 Stochastic Volatility Models

Similarly to the one-dimensional case, multivariate stochastic volatility models replace the constant volatilities $\Sigma_{ij}$ in the Black–Scholes model (13.19) by stochastic processes $\Sigma_{ij} = f_{ij}(Y)$, where $f_{ij}$ are non-negative functions and $Y$ is an additional source of randomness, which is modeled by an Itô diffusion in $\mathbb{R}^d$.

We consider the stochastic volatility extension of the Black–Scholes model as described in [68, Chap. 10.6]. We set $Z := (X, Y)^\top$, where $X$ describes again the log-price dynamics of $n > 1$ assets and $Y$ is an $\mathbb{R}^n$-valued Itô diffusion describing the stochastic volatility $\Sigma_{ij} = f_{ij}(Y)$. In particular, we assume that each $Y^i$ evolves according to the SDE

$$dY_t^i = c_i(Y_t^i)\,dt + b_i(Y_t^i)\,d\tilde W_t^i, \qquad Y_0^i = y^i, \qquad i = 1,\dots,n.$$

We pose the following assumptions: the state space domain of $Y$ is $G_Y \subseteq \mathbb{R}^n$, and the coefficients $c_k, b_k : G_Y \to \mathbb{R}$ are globally Lipschitz-continuous and at most linearly growing. Furthermore, the $\mathbb{R}^n$-valued standard Brownian motion $(\tilde W_t)_{t\ge 0}$ is correlated to the $\mathbb{R}^n$-valued standard Brownian motion $(W_t)_{t\ge 0}$ that drives the process $X$ by $\tilde W^k = \sum_{j=1}^{n} \rho_{jk} W^j + \rho_k^* \hat W^k$, where $(W, \hat W)$ is a standard Brownian motion in $\mathbb{R}^d$ with $d = 2n$, and $\rho_k^* := (1 - \sum_{j=1}^{n} \rho_{jk}^2)^{1/2}$. Denoting by $z := (x, y) = (x_1,\dots,x_n,y_1,\dots,y_n)$, the coefficients $\mu$, $\Sigma$ in (13.18) under a non-unique EMM are given by

$$\mu(z) := \bigl(r - \tfrac12 f_{11}^2(y),\dots,r - \tfrac12 f_{nn}^2(y),\, c_1(y_1),\dots,c_n(y_n)\bigr)^\top \in \mathbb{R}^d, \tag{13.31}$$

$$\Sigma(z) := \begin{pmatrix} \Sigma^X(z) & 0\\ \Sigma^Y(z) & D(z) \end{pmatrix} \in \mathbb{R}^{d\times d}, \tag{13.32}$$

where the matrices $\Sigma^X, \Sigma^Y, D \in \mathbb{R}^{n\times n}$ are

$$\Sigma^X(z) := \bigl(f_{ij}(y)\bigr)_{1\le i,j\le n}, \qquad \Sigma^Y(z) := \bigl(\rho_{ji}\, b_i(y_i)\bigr)_{1\le i,j\le n}, \qquad D(z) := \operatorname{diag}\bigl(\rho_1^* b_1(y_1),\dots,\rho_n^* b_n(y_n)\bigr).$$

The smooth functions $f_{ij} : G_Y \to \mathbb{R}_+$ are assumed to be bounded from below and above. The state space domain of the pair process $Z = (X, Y)^\top$ is $G = \mathbb{R}^n \times G_Y$. The infinitesimal generator $A$ of the semigroup generated by the process $Z$ is given by

$$A := -\tfrac12 \operatorname{tr}\bigl[Q(z) D^2\bigr] - \mu(z)^\top \nabla, \tag{13.33}$$

with $Q = \Sigma\Sigma^\top$.

Example 13.4.4 (Volatility processes) In [68, Chap. 10.6], it is assumed that each volatility component $Y^k$ follows a mean-reverting Ornstein–Uhlenbeck process, i.e. $c_k(y_k) = \alpha_k(m_k - y_k)$, $b_k(y_k) = \beta_k$, $1 \le k \le n$. Here, $\alpha_k > 0$ is called the rate of mean reversion and $m_k \ge 0$ is the long-run mean level of $Y^k$. Under an EMM, the drift term $c_k$ becomes $c_k(y) = \alpha_k(m_k - y_k) - \beta_k \Lambda_k(y)$, for some volatility risk premium $\Lambda(y) = (\Lambda_1(y),\dots,\Lambda_n(y))^\top$. See [68, Chap. 2.5] for a representation of $\Lambda$ in the one-dimensional case $n = 1$.
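The block structure (13.32) can be made concrete with a small sketch. Assuming $n = 2$ and hypothetical values for $f_{ij}(y)$, $\rho_{jk}$ and $b_i(y_i)$ at a fixed point $y$, the code below assembles $\Sigma(z)$ and forms $Q = \Sigma\Sigma^\top$. The choice $\rho_k^* = (1 - \sum_j \rho_{jk}^2)^{1/2}$ makes each $\tilde W^k$ a standard Brownian motion, so the volatility block of $Q$ has diagonal entries $b_i^2$. The function names `sv_sigma` and `gram` are illustrative, not from the source.

```python
import math

def sv_sigma(f, rho, b):
    """Assemble Sigma(z) = [[Sigma^X, 0], [Sigma^Y, D]] at a fixed y.

    f   : n x n list with the values f_ij(y)
    rho : n x n list with the correlations rho_jk (row j, column k)
    b   : list with the values b_i(y_i)
    """
    n = len(b)
    rho_star = [math.sqrt(1.0 - sum(rho[j][k] ** 2 for j in range(n)))
                for k in range(n)]
    top = [list(f[i]) + [0.0] * n for i in range(n)]
    bottom = [[rho[j][i] * b[i] for j in range(n)] +                       # Sigma^Y_ij = rho_ji b_i
              [rho_star[i] * b[i] if i == k else 0.0 for k in range(n)]    # D = diag(rho*_i b_i)
              for i in range(n)]
    return top + bottom

def gram(S):
    """Q = S S^T."""
    return [[sum(a * c for a, c in zip(ri, rj)) for rj in S] for ri in S]

# Hypothetical data for n = 2 (d = 2n = 4).
f = [[0.30, 0.05], [0.05, 0.20]]
rho = [[-0.5, 0.1], [0.2, -0.4]]
b = [0.40, 0.25]

Sigma = sv_sigma(f, rho, b)
Q = gram(Sigma)
print(Q[2][2], Q[3][3])  # diagonal of the volatility block: b_1^2 and b_2^2
```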


13.5 Numerical Examples

We give numerical examples using the wavelet finite element discretization as described in Sect. 13.3.

13.5.1 Full-Rank d-Dimensional Black–Scholes Model

We consider the geometric call option with payoff

$$g(x) = \max\Bigl(0,\, e^{\sum_{i=1}^{d} \alpha_i x_i} - K\Bigr), \qquad x \in \mathbb{R}^d,$$

for strike $K = 1$. The antiderivatives of $g$ for $\alpha_i > 0$, $i = 1,\dots,d$, are given by

$$g^{(-2)}(x) = \Bigl(\prod_{i=1}^{d} \alpha_i^{-2}\Bigr) \Bigl(e^{\sum_{i=1}^{d} \alpha_i x_i} - \sum_{k=0}^{2d} \frac{1}{k!} \Bigl(\sum_{i=1}^{d} \alpha_i x_i - \log K\Bigr)^{k}\Bigr)\, 1_{\{\sum_{i=1}^{d} \alpha_i x_i \ge \log K\}}.$$

We first set $d = 2$ and solve problem (2.24) for various mesh widths $h = 2^{-L}$. Using interest rate $r = 0.01$, covariance $Q = (\sigma_i\sigma_j\rho_{ij})_{1\le i,j\le d}$, $\sigma_1 = 0.4$, $\sigma_2 = 0.1$, $\rho_{12} = 0.2$ and weights $\alpha_i = 1/d$, $i = 1,\dots,d$, we plot the convergence rate of the $L^2$-error

$$e_L := \|u(T,\cdot) - u_L(T,\cdot)\|_{L^2(G_0)}, \qquad G_0 = (K/2,\, 3K/2)^2,$$

at maturity $T = 1$ in Fig. 13.2. To compare the rates, we also solved the problem on the full grid. The top panel shows that the sparse grid attains (up to log terms) the same rate as the full grid. To better show the advantage of the sparse grid, we additionally plot the convergence rate in terms of degrees of freedom. For the full grid, we have $N_L = O(2^{2L})$ and for the sparse grid $\hat N_L = O(L\,2^L)$. The convergence rate on the full grid shows the "curse of dimension", whereas for the sparse grid we still obtain the optimal rate (up to log terms).

For $d > 2$ we set $\sigma_i = 0.3$, $i = 1,\dots,d$, and $\rho_{i,j} = 0$, $i = 1,\dots,d-2$, $j = i+2,\dots,d$, $\rho_{12} = \rho$, $\rho_{i,i+1} = \rho\sqrt{1-\rho^2}$, $i = 1,\dots,d-1$, with $\rho = 0.1$ and weights $\alpha_1 = 1$, $\alpha_i = 0$, $i = 2,\dots,d$. We plot the convergence rate of the absolute error at the point $S_0 = (K,\dots,K)$ in Fig. 13.3. For low dimensions $d \in \{2,3,4\}$, the second-order convergence rate on the sparse grid can be observed over all levels. For higher dimensions $d \in \{6,8\}$, the log terms prevail at low levels, hence the flattened behavior of the convergence curves, which then exhibit the expected second-order rates at finer discretizations. Despite the smoothness of the solution, we report a steeply increasing constant in the rates as $d$ is raised. This forces us to set $L = 11$ already for $d = 8$ in order to reach a relative $L^2$-error on the order of $10^{-2}$, and it currently prevents us from reasonably increasing $d$ beyond 8. The size of the constant can be traced back to the initial condition $u_{0h}$, the $H$-projection of the payoff $g$ onto $V_h$ (Sect. 13.3), which shows similar relative $L^2$-errors.
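The gap between full-grid and sparse-grid degrees of freedom can be illustrated by counting. The sketch below assumes a hypothetical convention in which the one-dimensional increment space at level $\ell = 0,\dots,L$ contributes $2^\ell$ basis functions; the full tensor space keeps all level multi-indices with $|\ell|_\infty \le L$, the sparse space only those with $|\ell|_1 \le L$. The exact counts depend on the basis used in Sect. 13.3, so only the growth rates are meant to match the text.

```python
from itertools import product

def full_grid_dofs(d, L):
    """Sum of prod_i 2**l_i over all level multi-indices with max(l) <= L."""
    return sum(2 ** sum(l) for l in product(range(L + 1), repeat=d))

def sparse_grid_dofs(d, L):
    """Same sum, restricted to multi-indices with l_1 + ... + l_d <= L."""
    return sum(2 ** sum(l) for l in product(range(L + 1), repeat=d) if sum(l) <= L)

for L in range(1, 7):
    print(L, full_grid_dofs(2, L), sparse_grid_dofs(2, L))
```

For $d = 2$ the sparse count equals $L\,2^{L+1} + 1 = O(L\,2^L)$ under this convention, while the full count is $(2^{L+1}-1)^2 = O(2^{2L})$.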


Fig. 13.2 Convergence rate of the wavelet discretization in terms of the mesh width h (top) and in terms of degrees of freedom (bottom)

Fig. 13.3 Convergence rate of the BS model in different dimensions d

13.5.2 Low-Rank d-Dimensional Black–Scholes

We consider the geometric call option as in Sect. 13.5.1 written on $d$ underlyings under the Black–Scholes model, and we are now interested in the case of larger dimensions, i.e. $d > 8$. This occurs when pricing contingent claims on stock indices by considering all $d$ price processes instead of handling the index as one single process. Straightforward computations in such high dimensions would currently require too high discretization levels, as previously noted. Instead, we rely on the dimensionality reduction by $\varepsilon$-aggregation to identify a rank-$\hat d$ $\varepsilon$-aggregated process driving a $d$-dimensional market. In particular, we focus on the Dow Jones industrial index, where $d = 30$. We compute the principal components of the volatility covariance matrix $Q := U^\top D U \in \mathbb{R}^{d\times d}$ of the realized daily log-returns over 252 periods, resulting in the spectrum $(s_1^2,\dots,s_d^2)$, normalized by $s_1^2$ as in Sect. 13.4.1 and shown in Fig. 13.4 (top). Similar eigenvalue decompositions for the DAX index are presented in [139]. We define the recovery ratio

$$\eta_{\hat d} := \Bigl(\sum_{i=1}^{\hat d} s_i^2\Bigr) \Bigl(\sum_{i=1}^{d} s_i^2\Bigr)^{-1}, \qquad \hat d = 1,\dots,d,$$

shown in Fig. 13.4 (bottom), whose values for $\hat d = 1,\dots,5$ are reported in Table 13.3. The eigenvalues are observed to decay exponentially and by virtue of Theorem 13.4.3, we may therefore expect sufficiently accurate results for $\hat d \le 5$.

Let $\hat D := \operatorname{diag}(\hat s_1^2,\dots,\hat s_d^2) \in \mathbb{R}^{d\times d}$ with $\hat s$ as in (13.24) for some $2 \le \hat d \le 5$, which defines the rank-reduced processes $\hat X$ and $\hat Y$ defined by (13.25) and (13.26), respectively. We approximate the exact solution $u(t,x) = v(t,y)$ of the $d$-dimensional problem (8.12) by a parametrized $\hat d$-dimensional option price $\hat v(t,\hat y) = \hat v(t,\hat y_1,\dots,\hat y_{\hat d};\hat y_{\hat d+1},\dots,\hat y_d)$ for $(\hat y_1,\dots,\hat y_{\hat d}) \in G_R^{\hat d} := (-R,R)^{\hat d}$. We therefore consider the option prices $\hat v_R$ satisfying

$$\begin{aligned}
\partial_t \hat v_R + \hat A \hat v_R + r \hat v_R &= 0 && \text{in } J \times G_R^{\hat d},\\
\hat v_R(t,\cdot) &= 0 && \text{on } J \times \partial G_R^{\hat d},\\
\hat v_R(0,\hat y) &= \hat f(e^{\hat y})\big|_{G_R^{\hat d}} && \text{in } G_R^{\hat d},
\end{aligned} \tag{13.34}$$

where $G_R^{\hat d} := (-R,R)^{\hat d}$. The rank-$\hat d$ operator $\hat A$ is now given by

$$\hat A := -\tfrac12 \operatorname{tr}\bigl[\hat D \hat D^2\bigr] - \hat\lambda^\top \hat\nabla,$$

with $\hat\lambda$ as in (13.26), rank-$\hat d$ differential operators

$$\hat\nabla := (\partial_{\hat y_1},\dots,\partial_{\hat y_{\hat d}},0,\dots,0)^\top, \qquad \hat D^2 := \begin{pmatrix} (\partial^2_{\hat y_i \hat y_j})_{1\le i,j\le \hat d} & 0\\ 0 & 0 \end{pmatrix},$$

respectively, and

$$\hat f(e^{\hat y}) = f\bigl(e^{\hat y_1},\dots,e^{\hat y_{\hat d}},e^{\hat y_{\hat d+1}+\hat\lambda_{\hat d+1}T},\dots,e^{\hat y_d+\hat\lambda_d T}\bigr) = \max\Bigl(0,\, e^{\sum_{i=1}^{\hat d} \alpha_i \hat y_i + \sum_{i=\hat d+1}^{d} (\hat y_i + \hat\lambda_i T)} - K\Bigr).$$

Fig. 13.4 Eigenvalue spectrum (normalized by $s_1^2$) of the realized volatility covariance matrix of the constituents of the Dow Jones index over a period of 252 days (top) and recovery ratio $\eta_{\hat d}$, $\hat d = 1,\dots,d$ (bottom)

Table 13.3 First eigenvalues $s_{\hat d}^2$, $\hat d = 1,\dots,5$, of the Dow Jones realized volatility covariance matrix $Q$ and recovery ratio $\eta_{\hat d}$

$\hat d$    $s_{\hat d}$    $s_{\hat d}^2$    $\eta_{\hat d}$
1          0.2052          0.0421            0.5307
2          0.0990          0.0098            0.6542
3          0.0940          0.0088            0.7655
4          0.0791          0.0062            0.8443
5          0.0459          0.0021            0.8708
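The recovery ratio and the choice of $\hat d(\varepsilon)$ are simple functions of the spectrum. The following sketch uses a made-up, decreasing spectrum (the full Dow Jones spectrum is not reproduced here) to show how $\eta_{\hat d}$ and the smallest admissible rank for a given $\varepsilon$ might be computed; the function names are illustrative assumptions.

```python
def recovery_ratios(spectrum):
    """eta_dhat = (sum_{i<=dhat} s_i^2) / (sum_{i<=d} s_i^2) for dhat = 1..d."""
    total = sum(spectrum)
    ratios, running = [], 0.0
    for s2 in spectrum:
        running += s2
        ratios.append(running / total)
    return ratios

def aggregation_rank(spectrum, eps):
    """Smallest dhat such that s_i^2 / s_1^2 <= eps for all i > dhat.

    Assumes the spectrum is sorted in decreasing order.
    """
    s1 = spectrum[0]
    d_hat = len(spectrum)
    while d_hat > 1 and spectrum[d_hat - 1] / s1 <= eps:
        d_hat -= 1
    return d_hat

# Hypothetical eigenvalues s_i^2, sorted in decreasing order.
spectrum = [0.04, 0.01, 0.008, 0.002, 0.0005, 0.0001]
ratios = recovery_ratios(spectrum)
rank = aggregation_rank(spectrum, eps=0.1)
print([round(x, 4) for x in ratios], rank)
```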


Fig. 13.5 Convergence rates of the approximation of a 30 dimensional option price on the Dow Jones by d = 2, . . . , 5 dimensional options in the Black–Scholes model

We numerically solve (13.34) for $\hat d = 2,\dots,5$ and various mesh widths $h = 2^{-L}$ with the Dow Jones realized volatility covariance matrix $Q$ whose principal components are plotted in Fig. 13.4 (top), weights $\alpha_i = 0.3$, $i = 1,\dots,d$, maturity $T = 1$, strike $K = 1$ and interest rate $r = 0.045$, and plot the convergence rate of the relative $L^2$-error

$$e_L := \frac{\|v(T,\cdot) - \hat v_L(T,\cdot)\|_{L^2(G_1)}}{\|v(T,\cdot)\|_{L^2(G_1)}}, \qquad G_1 = (3K/4,\, 5K/4) \times \{K\} \times \dots \times \{K\},$$

at maturity $T = 1$ in Fig. 13.5. The flattening behavior of the convergence rates is explained by further expanding the error into (a) an $\varepsilon$-aggregation error made by artificially setting volatilities to zero (Theorem 13.4.3) and (b) a discretization error [159] as

$$\|v(t,y) - \hat v_L(t,\hat y)\| = \|v(t,y) - \hat v(t,\hat y) + \hat v(t,\hat y) - \hat v_L(t,\hat y)\| \le \underbrace{\|v(t,y) - \hat v(t,\hat y)\|}_{(a)} + \underbrace{\|\hat v(t,\hat y) - \hat v_L(t,\hat y)\|}_{(b)}.$$

It follows that the lower bounds observed in Fig. 13.5 stem from the $\varepsilon$-aggregation error, which diminishes as $\hat d$ is increased.

13.6 Further Reading

A rigorous error analysis for the numerical solution of high-dimensional parabolic equations is given in von Petersdorff and Schwab [159], where a wavelet-based sparse grid discretization is used in conjunction with a discontinuous Galerkin time discretization. Reisinger and Wittum [139] use a finite difference based discretization on sparse grids employing the combination technique. Leentvaar and Oosterlee [113] use high-order finite difference discretizations on sparse grids for the pricing of basket options in up to 5 dimensions. The construction of the sparse tensor product space $\hat V_L$ can be further developed to general sparse approximation spaces

$$\hat V_{L,\beta} := \bigoplus_{\beta|\ell|_\infty - |\ell|_1 \,\ge\, \beta L - (L+d-1)} W^{\ell_1} \otimes \dots \otimes W^{\ell_d},$$

where $\beta \in \mathbb{R}$ and $\ell \in \mathbb{N}^d$. Griebel and Knapek show in [75] that using such approximation spaces one can remove the logarithmic term $L^{d-1}$. Indeed, for $0 < \beta < 1$, one has $\dim \hat V_{L,\beta} = O(2^L)$. Furthermore, these spaces have similar approximation properties as $\hat V_L$.

Chapter 14

Multidimensional Lévy Models

In this chapter, we extend the one-dimensional Lévy models described in Chap. 10 to multidimensional Lévy models. Since the law of a Lévy process $X$ is time-homogeneous, it is completely characterized by its characteristic triplet $(Q, \nu, \gamma)$. The drift $\gamma$ has no effect on the dependence structure between the components of $X$. The dependence structure of the Brownian motion part of $X$ is given by its covariance matrix $Q$. For purposes of financial modeling, it remains to specify a parametric dependence structure of the purely discontinuous part of $X$, which can be done by using Lévy copulas.

14.1 Lévy Processes

We associate to the multidimensional Lévy process $X = \{X_t : t \in [0,T]\}$ the Lévy measure

$$\nu(B) = \mathbb{E}\bigl[\#\{t \in [0,1] : \Delta X_t \ne 0,\ \Delta X_t \in B\}\bigr], \qquad B \in \mathcal{B}(\mathbb{R}^d).$$

The Lévy measure satisfies $\int_{\mathbb{R}^d} 1 \wedge |z|^2\,\nu(dz) < \infty$, where we denote $|z|^2 = \sum_{i=1}^{d} z_i^2$. Using the Lévy–Khinchine representation, we see that every Lévy process is uniquely defined by a drift vector $\gamma$, a symmetric positive semidefinite matrix $Q \in \mathbb{R}^{d\times d}_{\mathrm{sym}}$ and the Lévy measure $\nu$.

Theorem 14.1.1 (Lévy–Khinchine representation) Let $X$ be a Lévy process with state space $\mathbb{R}^d$ and characteristic triplet $(Q, \nu, \gamma)$. Assume $\int_{|z|>1} |z|\,\nu(dz) < \infty$. Then, for $t \ge 0$,

$$\mathbb{E}\bigl[e^{i\langle \xi, X_t\rangle}\bigr] = e^{-t\psi(\xi)}, \quad \xi \in \mathbb{R}^d, \qquad \text{with } \psi(\xi) = -i\langle \gamma, \xi\rangle + \tfrac12 \langle \xi, Q\xi\rangle + \int_{\mathbb{R}^d} \bigl(1 - e^{i\langle \xi, z\rangle} + i\langle \xi, z\rangle\bigr)\,\nu(dz). \tag{14.1}$$

N. Hilber et al., Computational Methods for Quantitative Finance, Springer Finance, DOI 10.1007/978-3-642-35401-4_14, © Springer-Verlag Berlin Heidelberg 2013

In what follows, we denote by $X^i$, $i = 1,\dots,d$, the coordinate projection of the multidimensional process $X = (X^1,\dots,X^d)^\top \in \mathbb{R}^d$. Coordinate projections of Lévy processes are again Lévy processes.

Proposition 14.1.2 Let $X = (X^1,\dots,X^d)^\top$ be a Lévy process with state space $\mathbb{R}^d$ and characteristic triplet $(Q, \nu, \gamma)$. Assume $\int_{|z|>1} |z|\,\nu(dz) < \infty$. Then, the marginal processes $X^j$, $j = 1,\dots,d$, are again Lévy processes with the characteristic triplet $(Q_{jj}, \nu_j, \gamma_j)$, where the marginal Lévy measures are defined as

$$\nu_j(B) := \nu(\{x \in \mathbb{R}^d : x_j \in B \setminus \{0\}\}), \qquad \forall B \in \mathcal{B}(\mathbb{R}), \quad j = 1,\dots,d.$$

If all the marginal Lévy processes $X^i$ satisfy the martingale condition as in Lemma 10.1.5, i.e. $e^{X^i}$ are martingales with respect to the filtration of $X^i$, $i = 1,\dots,d$, they are also martingales with respect to the filtration $\mathcal{F}$ of $X = (X^1,\dots,X^d)^\top$.

Lemma 14.1.3 Let $X$ be a Lévy process with state space $\mathbb{R}^d$ and characteristic triplet $(Q, \nu, \gamma)$. Assume $\int_{|z|>1} |z|\,\nu(dz) < \infty$ and $\int_{|z_j|>1} e^{z_j}\,\nu_j(dz_j) < \infty$, $j = 1,\dots,d$. Then, $e^{X^j}$, $j = 1,\dots,d$, are martingales with respect to the filtration $\mathcal{F}$ of $X$ if and only if

$$\frac{Q_{jj}}{2} + \gamma_j + \int_{\mathbb{R}} \bigl(e^{z_j} - 1 - z_j\bigr)\,\nu_j(dz_j) = 0, \qquad j = 1,\dots,d.$$

Proof As in the proof of Lemma 10.1.5, we have

$$\mathbb{E}\bigl[e^{X_s^j} \,\big|\, \mathcal{F}_t\bigr] = e^{X_t^j}\, e^{(t-s)\psi(-ie_j)}.$$

Therefore, setting $\psi(-ie_j) = 0$, $j = 1,\dots,d$, the Lévy–Khinchine formula (14.1) yields

$$\frac{Q_{jj}}{2} + \gamma_j + \int_{\mathbb{R}^d} \bigl(e^{z_j} - 1 - z_j\bigr)\,\nu(dz) = 0, \qquad j = 1,\dots,d.$$

The result follows with the definition of the marginal Lévy measure $\nu_j$ given in Proposition 14.1.2. $\square$
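The martingale condition of Lemma 14.1.3 fixes the drift once $Q_{jj}$ and $\nu_j$ are given. As a minimal sketch, assume a one-dimensional marginal whose Lévy measure consists of finitely many atoms (jump size $z$, intensity $w$); this finite-activity setting and the function names are assumptions for illustration only. The code computes $\gamma_j$ from the condition and verifies that the exponent (14.1) vanishes at $\xi = -i$.

```python
import cmath
import math

def psi(xi, gamma, Q, atoms):
    """1D characteristic exponent from (14.1) for a Levy measure made of atoms.

    atoms: list of (jump size z, intensity w) pairs.
    """
    jump = sum(w * (1 - cmath.exp(1j * xi * z) + 1j * xi * z) for z, w in atoms)
    return -1j * gamma * xi + 0.5 * Q * xi * xi + jump

def martingale_drift(Q, atoms):
    """gamma = -Q/2 - sum_k w_k (e^{z_k} - 1 - z_k), as in Lemma 14.1.3."""
    return -0.5 * Q - sum(w * (math.exp(z) - 1 - z) for z, w in atoms)

# Hypothetical marginal: variance 0.04, down-jumps of -0.2 and up-jumps of 0.1.
Q_jj = 0.04
atoms = [(-0.2, 3.0), (0.1, 5.0)]
gamma_j = martingale_drift(Q_jj, atoms)
residual = psi(-1j, gamma_j, Q_jj, atoms)
print(gamma_j, abs(residual))
```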

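14.2 Lévy Copulas

We first give some definitions. The $F$-volume of $(a,b]$, $a, b \in \overline{\mathbb{R}}^d$, for a function $F : S \to \overline{\mathbb{R}}$, $S \subset \overline{\mathbb{R}}^d$, is defined by

$$V_F((a,b]) := \sum_{u \in \{a_1,b_1\} \times \dots \times \{a_d,b_d\}} (-1)^{N(u)} F(u),$$

where $N(u) = |\{k : u_k = a_k\}|$.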
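The $F$-volume is an inclusion–exclusion sum over the $2^d$ corners of the box. A short sketch, assuming finite corner values and $a_k < b_k$ (so that $N(u)$ is unambiguous), could look as follows; `f_volume` is an illustrative name.

```python
from itertools import product

def f_volume(F, a, b):
    """V_F((a, b]) = sum over corners u of (-1)^{N(u)} F(u).

    N(u) counts the coordinates in which the corner u takes the lower value a_k.
    """
    total = 0.0
    for choice in product((0, 1), repeat=len(a)):
        u = [b[k] if c else a[k] for k, c in enumerate(choice)]
        n_lower = len(choice) - sum(choice)
        total += (-1) ** n_lower * F(u)
    return total

# For F(x, y) = x * y the F-volume is the area of the box.
area = f_volume(lambda u: u[0] * u[1], (1.0, 1.0), (2.0, 3.0))
# F(x, y) = min(x, y) is 2-increasing as well; its volume on (0,1] x (0,2] is 1.
vol_min = f_volume(lambda u: min(u), (0.0, 0.0), (1.0, 2.0))
print(area, vol_min)
```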


Definition 14.2.1 A function $F : S \to \overline{\mathbb{R}}$, $S \subset \overline{\mathbb{R}}^d$, is called $d$-increasing if $V_F((a,b]) \ge 0$ for all $a, b \in S$ with $a \le b$ and $(a,b] \subset S$.

Examples of $d$-increasing functions are distribution functions of random vectors $X \in \mathbb{R}^d$, $F(x_1,\dots,x_d) = \mathbb{P}[X_1 \le x_1,\dots,X_d \le x_d]$, or more generally, $F(x_1,\dots,x_d) = \mu((-\infty,x_1],\dots,(-\infty,x_d])$, where $\mu$ is a finite measure on $\mathcal{B}(\mathbb{R}^d)$. $F$ is clearly $d$-increasing since the $F$-volume is just $V_F((a,b]) = \mu((a,b])$ for every $a \le b$. For parametric modeling of dependence structures of multidimensional jump processes, margins play an important role.

Definition 14.2.2 Let $F : \overline{\mathbb{R}}^d \to \overline{\mathbb{R}}$ be a $d$-increasing function which satisfies $F(u) = 0$ if $u_i = 0$ for at least one $i \in \{1,\dots,d\}$. For any index set $I \subset \{1,\dots,d\}$ the $I$-margin of $F$ is the function $F^I : \overline{\mathbb{R}}^{|I|} \to \overline{\mathbb{R}}$,

$$F^I(u^I) := \lim_{a\to\infty} \sum_{(u_j)_{j\in I^c} \in \{-a,\infty\}^{|I^c|}} \Bigl(\prod_{j\in I^c} \operatorname{sgn} u_j\Bigr) F(u_1,\dots,u_d).$$

Since the Lévy measure is a measure on $\mathcal{B}(\mathbb{R}^d)$, it is possible to define a suitable notion of a copula. However, one has to take into account that the Lévy measure is possibly infinite at the origin.

Definition 14.2.3 A function $F : \overline{\mathbb{R}}^d \to \overline{\mathbb{R}}$ is called a Lévy copula if
(i) $F(u_1,\dots,u_d) \ne \infty$ for $(u_1,\dots,u_d) \ne (\infty,\dots,\infty)$,
(ii) $F(u_1,\dots,u_d) = 0$ if $u_i = 0$ for at least one $i \in \{1,\dots,d\}$,
(iii) $F$ is $d$-increasing,
(iv) $F^{\{i\}}(u) = u$ for any $i \in \{1,\dots,d\}$, $u \in \mathbb{R}$.

In the following, we prove some useful properties of Lévy copulas.

Lemma 14.2.4 Let $F$ be a Lévy copula. Then,

$$0 \le \prod_{j=1}^{d} \operatorname{sgn} u_j\, F(u_1,\dots,u_d) \le \min\{|u_1|,\dots,|u_d|\} \qquad \forall u \in \mathbb{R}^d,$$

and $\prod_{j=1}^{d} \operatorname{sgn} u_j\, F(u)$ is nondecreasing in the absolute value of each argument $|u_j|$. Furthermore, Lévy copulas are Lipschitz continuous, i.e.

$$|F(v_1,\dots,v_d) - F(u_1,\dots,u_d)| \le \sum_{i=1}^{d} |v_i - u_i| \qquad \forall u, v \in \mathbb{R}^d.$$

Proof Let $u \in \mathbb{R}^d$, $u_1 \ge 0$ and $0 \le a_1 \le u_1$. Set $b_1 = u_1$ and for $2 \le j \le d$ set $a_j = 0$, $b_j = u_j$ if $u_j \ge 0$, otherwise $a_j = u_j$, $b_j = 0$. Since $F$ is $d$-increasing and grounded,

$$V_F((a,b]) = \sum_{v \in \{a_1,b_1\} \times \dots \times \{a_d,b_d\}} (-1)^{N(v)} F(v) = \prod_{j=2}^{d} \operatorname{sgn} u_j\, F(u_1,\dots,u_d) - \prod_{j=2}^{d} \operatorname{sgn} u_j\, F(a_1,u_2,\dots,u_d) \ge 0.$$

Similarly for $u_1 < 0$. This gives the lower bound with $a_1 = 0$. Let $I = \{i\} \subset \{1,\dots,d\}$. Then,

$$\begin{aligned}
\prod_{j=1}^{d} \operatorname{sgn} u_j\, F(u_1,\dots,u_i,\dots,u_d) &\le \operatorname{sgn} u_i \lim_{n\to\infty} \Bigl(\prod_{j\in I^c} \operatorname{sgn} u_j\Bigr) F(n \operatorname{sgn} u_1,\dots,u_i,\dots,n \operatorname{sgn} u_d) \\
&\le \operatorname{sgn} u_i \lim_{n\to\infty} \sum_{(v_j)_{j\in I^c} \in \{-n,\infty\}^{|I^c|}} \Bigl(\prod_{j\in I^c} \operatorname{sgn} v_j\Bigr) F(v_1,\dots,u_i,\dots,v_d) \\
&= \operatorname{sgn} u_i\, F^{\{i\}}(u_i) = |u_i|.
\end{aligned}$$

Since $i \in \{1,\dots,d\}$ is arbitrary, we obtain the upper bound. To simplify notation, we suppose $0 \le u_i \le v_i$, $i = 1,\dots,d$. The general case can be treated similarly. We have

$$\begin{aligned}
|F(v_1,\dots,v_d) - F(u_1,\dots,u_d)| &= |V_F((0,v_1] \times \dots \times (0,v_d]) - V_F((0,u_1] \times \dots \times (0,u_d])| \\
&\le \sum_{i=1}^{d} \lim_{a\to\infty} V_F\bigl((-a,\infty]^{i-1} \times (u_i,v_i] \times (-a,\infty]^{d-i}\bigr) \\
&= \sum_{i=1}^{d} \bigl(F^{\{i\}}(v_i) - F^{\{i\}}(u_i)\bigr) = \sum_{i=1}^{d} |v_i - u_i|. \qquad\square
\end{aligned}$$

We also need tail integrals of Lévy processes.

Definition 14.2.5 Let $X$ be a Lévy process with state space $\mathbb{R}^d$ and Lévy measure $\nu$. The tail integral of $X$ is the function $U : \mathbb{R}^d \setminus \{0\} \to \mathbb{R}$ given by

$$U(x_1,\dots,x_d) = \prod_{j=1}^{d} \operatorname{sgn}(x_j)\; \nu\Bigl(\prod_{j=1}^{d} I(x_j)\Bigr),$$

where

$$I(x) = \begin{cases} (x,\infty) & \text{for } x \ge 0,\\ (-\infty,x] & \text{for } x < 0. \end{cases}$$

Furthermore, for $I \subset \{1,\dots,d\}$ nonempty, the $I$-marginal tail integral $U^I$ of $X$ is the tail integral of the process $X^I := (X^i)_{i\in I}$. The next result, [100, Theorem 3.6], shows that essentially any Lévy process $X = (X^1,\dots,X^d)^\top$ can be built from univariate marginal processes $X^j$, $j = 1,\dots,d$, and Lévy copulas. It can be viewed as a version of Sklar's theorem for Lévy copulas.

Theorem 14.2.6 (Sklar's theorem for Lévy copulas) For any Lévy process $X$ with state space $\mathbb{R}^d$, there exists a Lévy copula $F$ such that the tail integrals of $X$ satisfy

$$U^I(x^I) = F^I\bigl((U_i(x_i))_{i\in I}\bigr), \tag{14.2}$$

for any nonempty $I \subset \{1,\dots,d\}$ and any $x^I \in \mathbb{R}^{|I|} \setminus \{0\}$. The Lévy copula $F$ is unique on $\prod_{i=1}^{d} \operatorname{Ran} U_i$. Conversely, let $F$ be a $d$-dimensional Lévy copula and $U_i$, $i = 1,\dots,d$, tail integrals of univariate Lévy processes. Then, there exists a $d$-dimensional Lévy process $X$ such that its components have tail integrals $U_i$ and its marginal tail integrals satisfy (14.2). The Lévy measure $\nu$ of $X$ is uniquely determined by $F$ and $U_i$, $i = 1,\dots,d$.

Using partial integration, we can write the multidimensional Lévy measure in terms of the Lévy copula.

Lemma 14.2.7 Let $f(z) \in C^\infty(\mathbb{R}^d)$ be bounded and vanishing on a neighborhood of the origin. Furthermore, let $X$ be a $d$-dimensional Lévy process with Lévy measure $\nu$, Lévy copula $F$ and marginal Lévy measures $\nu_j$, $j = 1,\dots,d$. Then,

$$\int_{\mathbb{R}^d} f(z)\,\nu(dz) = \sum_{j=1}^{d} \int_{\mathbb{R}} f(0+z_j)\,\nu_j(dz_j) + \sum_{j=2}^{d} \sum_{|I|=j,\; I_1 \ldots} \cdots$$

[…] 0 there exists a $c \in \mathbb{R}^d$ such that the distribution $\mu$ of $X$ at $t = 1$ satisfies

$$\hat\mu(z)^r = \hat\mu(r^{Q} z)\, e^{i\langle c, z\rangle},$$

where $r^Q = \sum_{n=0}^{\infty} (n!)^{-1} (\log r)^n Q^n$.

For $Q = \operatorname{diag}(1/\alpha,\dots,1/\alpha)$, $0 < \alpha < 2$, we again obtain $\alpha$-stable processes. An extension of (isotropic) $\alpha$-stable processes are anisotropic $\alpha$-stable processes for an $\alpha = (\alpha_1,\dots,\alpha_d)$ with $0 < \alpha_i < 2$, $i = 1,\dots,d$.

Definition 14.3.3 Let $0 < \alpha_i < 2$, $i = 1,\dots,d$, and $Q = \operatorname{diag}\{\alpha_i^{-1} : i = 1,\dots,d\}$. A Lévy process $X = \{X_t : t \ge 0\}$ with state space $\mathbb{R}^d$ is called $\alpha$-stable if the distribution $\mu$ of $X$ at $t = 1$ is $Q$-stable, i.e. for any $r > 0$ there exists a $c \in \mathbb{R}^d$ such that

$$\hat\mu(z)^r = \hat\mu\bigl(r^{1/\alpha_1} z_1,\dots,r^{1/\alpha_d} z_d\bigr)\, e^{i\langle c, z\rangle}. \tag{14.8}$$

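Since we have for the characteristic function $\hat\mu(z) = e^{-\psi(z)}$, it follows from (14.8) that the characteristic exponent of an $\alpha$-stable process satisfies for any $r > 0$

$$\Re\psi\bigl(r^{1/\alpha_1}\xi_1,\dots,r^{1/\alpha_d}\xi_d\bigr) = r\,\Re\psi(\xi), \qquad \forall \xi \in \mathbb{R}^d. \tag{14.9}$$

We assume that the Lévy measure $\nu$ has a Lévy density $k$, i.e. $\nu(dz) = k(z)\,dz$, and obtain for $Q = 0$ that

$$\begin{aligned}
\Re\psi\bigl(r^{1/\alpha_1}\xi_1,\dots,r^{1/\alpha_d}\xi_d\bigr) &= \int_{\mathbb{R}^d} \Bigl(1 - \cos\Bigl(\sum_{i=1}^{d} r^{1/\alpha_i}\xi_i z_i\Bigr)\Bigr) k^{\mathrm{sym}}(z)\,dz \\
&= \int_{\mathbb{R}^d} \bigl(1 - \cos\langle \xi, z\rangle\bigr)\, k^{\mathrm{sym}}\bigl(r^{-1/\alpha_1} z_1,\dots,r^{-1/\alpha_d} z_d\bigr)\, r^{-1/\alpha_1 - \dots - 1/\alpha_d}\,dz.
\end{aligned}$$

Now using (14.9), the Lévy density has to satisfy

$$k^{\mathrm{sym}}\bigl(r^{-1/\alpha_1} z_1,\dots,r^{-1/\alpha_d} z_d\bigr) = r^{1 + 1/\alpha_1 + \dots + 1/\alpha_d}\, k^{\mathrm{sym}}(z_1,\dots,z_d). \tag{14.10}$$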

A simple example of an $\alpha$-stable Lévy process on $\mathbb{R}^d$ is given by the Lévy measure

$$\nu(dz) = \sum_{j=1}^{2^d} c_j \Bigl(\sum_{i=1}^{d} |z_i|^{\alpha_i}\Bigr)^{-1 - 1/\alpha_1 - \dots - 1/\alpha_d} 1_{Q_j}\,dz, \tag{14.11}$$

where $c_j \ge 0$, $\sum_{j=1}^{2^d} c_j > 0$. The corresponding marginal processes $X^i$, $i = 1,\dots,d$, of $X$ are again $\alpha$-stable processes in $\mathbb{R}$ with Lévy measure $\nu_i(dz) = \tilde c_i |z|^{-1-\alpha_i}\,dz$, where $\tilde c_i$ depend on $d$, $\alpha$ and $c_j$, $j = 1,\dots,2^d$. We plot the density (14.11) for $d = 2$, $\alpha = (0.5, 1.2)$ and $c_j = 1$, $j = 1,\dots,4$, in Fig. 14.2.
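That the density in (14.11) obeys the anisotropic scaling law (14.10) can be confirmed in a few lines. The sketch below assumes $c_j = 1$ on every orthant, so the density is symmetric and coincides with $k^{\mathrm{sym}}$; the function names are illustrative.

```python
def k_aniso(z, alpha):
    """Density from (14.11) with c_j = 1 on every orthant."""
    exponent = -1.0 - sum(1.0 / a for a in alpha)
    return sum(abs(zi) ** a for zi, a in zip(z, alpha)) ** exponent

def anisotropic_scale(z, alpha, r):
    """Map z to (r^{-1/alpha_1} z_1, ..., r^{-1/alpha_d} z_d)."""
    return [r ** (-1.0 / a) * zi for zi, a in zip(z, alpha)]

alpha = (0.5, 1.2)
z = (0.3, -0.7)
r = 2.5

lhs = k_aniso(anisotropic_scale(z, alpha, r), alpha)
rhs = r ** (1.0 + sum(1.0 / a for a in alpha)) * k_aniso(z, alpha)
print(lhs, rhs)
```

Both sides agree up to rounding because $|r^{-1/\alpha_i} z_i|^{\alpha_i} = r^{-1} |z_i|^{\alpha_i}$ for each coordinate.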

14.3.1 Subordinated Brownian Motion

As in the one-dimensional case, we can obtain Lévy processes by subordination. For $d > 1$ there are two possibilities. Using a one-dimensional increasing process or subordinator $G = \{G_t : t \ge 0\}$, the resulting process is given by

$$X_t^i = W^i_{G_t} + \theta_i G_t, \qquad \theta_i \in \mathbb{R}, \quad t \in [0,T],$$

for $i = 1,\dots,d$, where $W = (W^1,\dots,W^d)^\top$ is a vector of $d$ Brownian motions with covariance matrix $Q = (\sigma_i\sigma_j\rho_{ij})_{1\le i,j\le d}$. Here, $\sigma_i^2$, $i = 1,\dots,d$, is the variance of the one-dimensional Brownian motions $W^i$, and $\rho_{ij}$ the correlation of the Brownian motions $W^i$ and $W^j$. But we can also use a $d$-dimensional Lévy process $G = (G^1,\dots,G^d)^\top$ which is componentwise increasing in each coordinate to obtain

$$X_t^i = W^i_{G_t^i} + \theta_i G_t^i, \qquad \theta_i \in \mathbb{R}, \quad t \in [0,T],$$

for $i = 1,\dots,d$. This is called multivariate subordination.

Fig. 14.2 Anisotropic α-stable density in d = 2 for α = (0.5, 1.2) and corresponding contour plot

As an example, we use as the one-dimensional subordinator a gamma process to obtain a multidimensional variance gamma process [117]. As in the one-dimensional case (see Sect. 10.2.2), we consider a gamma process $G$ with Lévy density $k_G(s) = e^{-s/\vartheta} (\vartheta s)^{-1} 1_{\{s>0\}}$. Then, the Lévy measure of $X$ is given for $B \in \mathcal{B}(\mathbb{R}^d)$ by

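$$\begin{aligned}
\nu(B) &= \int_B \int_0^\infty (2\pi)^{-d/2} (\det Q)^{-1/2} s^{-d/2}\, e^{-\langle z - \theta s,\, Q^{-1}(z - \theta s)\rangle/(2s)}\, e^{-s/\vartheta} (\vartheta s)^{-1}\,ds\,dz \\
&= \int_B (2\pi)^{-d/2} \vartheta^{-1} (\det Q)^{-1/2}\, e^{\langle \theta,\, \frac{Q^{-1} + Q^{-\top}}{2} z\rangle} \int_0^\infty s^{-d/2-1}\, e^{-\frac{\langle z, Q^{-1} z\rangle}{2s} - \bigl(\frac{\langle \theta, Q^{-1}\theta\rangle}{2} + \frac{1}{\vartheta}\bigr)s}\,ds\,dz \\
&= \int_B (2\pi)^{-d/2} \vartheta^{-1} (\det Q)^{-1/2}\, e^{\langle \theta,\, \frac{Q^{-1} + Q^{-\top}}{2} z\rangle} \int_0^\infty s^{-d/2-1}\, e^{-\beta/s - \gamma s}\,ds\,dz,
\end{aligned}$$

where $\beta = \langle z, Q^{-1} z\rangle/2$ and $\gamma = \langle \theta, Q^{-1}\theta\rangle/2 + 1/\vartheta$. Integrating the second integral, we obtain the Lévy measure

$$\nu(dz) = 2\,(2\pi)^{-d/2} \vartheta^{-1} (\det Q)^{-1/2}\, e^{\langle \theta,\, \frac{Q^{-1} + Q^{-\top}}{2} z\rangle} \Bigl(\frac{\beta}{\gamma}\Bigr)^{-d/4} K_{-d/2}\bigl(2\sqrt{\beta\gamma}\bigr)\,dz, \tag{14.12}$$

where $K_{-d/2}(\xi)$ is the modified Bessel function of the second kind. For small $\xi$, we have $K_{-d/2}(\xi) \sim \xi^{-d/2}$, and therefore $\nu(dz) \sim \langle z, Q^{-1} z\rangle^{-d/2}\,dz \sim |z|^{-d}\,dz$ since $Q > 0$. The marginal processes $X^i$, $i = 1,\dots,d$, of $X$ are variance gamma processes on $\mathbb{R}$ with Lévy measure

$$\nu_i(dz) = \vartheta^{-1}\, e^{\theta_i z/\sigma_i^2}\, e^{-\sqrt{2/\vartheta + \theta_i^2/\sigma_i^2}\,|z|/\sigma_i}\, |z|^{-1}\,dz.$$

We plot the density (10.9) for $d = 2$, $\theta = (-0.1, -0.2)$, $\sigma = (0.3, 0.4)$, $\rho_{12} = 0.5$ and $\vartheta = 1$ in Fig. 14.3.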
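Subordination with a common gamma clock is straightforward to simulate. The sketch below draws $(X_t^1, X_t^2)$ for $d = 2$ with the parameters quoted for Fig. 14.3. It relies on the standard fact that a gamma process with Lévy density $e^{-s/\vartheta}(\vartheta s)^{-1}$ has $G_t$ gamma-distributed with shape $t/\vartheta$ and scale $\vartheta$, so that $\mathbb{E}[G_t] = t$. The function name, the sample size and the seed are arbitrary choices for illustration.

```python
import math
import random

def sample_vg(t, theta, sigma, rho12, vartheta, rng):
    """One draw of (X_t^1, X_t^2) with X^i = W^i_{G_t} + theta_i * G_t."""
    g = rng.gammavariate(t / vartheta, vartheta)   # G_t: shape t/vartheta, scale vartheta
    n1, n2 = rng.gauss(0.0, 1.0), rng.gauss(0.0, 1.0)
    # Correlated Brownian increments over the random time g (2x2 Cholesky factor).
    w1 = sigma[0] * math.sqrt(g) * n1
    w2 = sigma[1] * math.sqrt(g) * (rho12 * n1 + math.sqrt(1.0 - rho12 ** 2) * n2)
    return (w1 + theta[0] * g, w2 + theta[1] * g)

theta, sigma, rho12, vartheta = (-0.1, -0.2), (0.3, 0.4), 0.5, 1.0
rng = random.Random(0)
n = 20000
draws = [sample_vg(1.0, theta, sigma, rho12, vartheta, rng) for _ in range(n)]
mean1 = sum(x for x, _ in draws) / n
mean2 = sum(y for _, y in draws) / n
print(mean1, mean2)  # sample means; the exact means are theta_i * t
```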

14.3.2 Lévy Copula Models

Lévy copulas $F$ allow parametric constructions of multivariate jump densities from univariate ones. Let $U_1,\dots,U_d$ be one-dimensional tail integrals with Lévy densities $k_1,\dots,k_d$, and let $F$ be a Lévy copula such that $\partial_1 \cdots \partial_d F$ exists in the sense of distributions. Then,

$$k(x_1,\dots,x_d) = \partial_1 \cdots \partial_d F\big|_{\xi_1 = U_1(x_1),\dots,\xi_d = U_d(x_d)}\, k_1(x_1) \cdots k_d(x_d) \tag{14.13}$$

is the jump density of a $d$-variate Lévy measure with marginal Lévy densities $k_1,\dots,k_d$. For example, we can use the Clayton Lévy copula (see Definition 14.6)

$$F(u_1,\dots,u_d) = 2^{2-d} \Bigl(\sum_{i=1}^{d} |u_i|^{-\vartheta}\Bigr)^{-1/\vartheta} \bigl(\eta\, 1_{\{u_1 \cdots u_d \ge 0\}} - (1-\eta)\, 1_{\{u_1 \cdots u_d \le 0\}}\bigr),$$

where $\vartheta > 0$, $\eta \in [0,1]$, and consider $\alpha$-stable marginal Lévy densities, $k_i(z) = |z|^{-1-\alpha_i}$, $0 < \alpha_i < 2$, $i = 1,\dots,d$. This leads to the $d$-dimensional Lévy density

$$k(z) = 2^{2-d} \Bigl(\sum_{i=1}^{d} \alpha_i^{\vartheta} |z_i|^{\alpha_i\vartheta}\Bigr)^{-1/\vartheta - d} \prod_{i=1}^{d} \bigl(1 + (i-1)\vartheta\bigr)\, \alpha_i^{\vartheta+1} |z_i|^{\alpha_i\vartheta - 1} \cdot \bigl(\eta\, 1_{\{z_1 \cdots z_d \ge 0\}} + (1-\eta)\, 1_{\{z_1 \cdots z_d \le 0\}}\bigr). \tag{14.14}$$


Fig. 14.3 Variance gamma density in d = 2 and corresponding contour plot

Note that this is again an anisotropic α-stable process. We plot the density (14.14) for d = 2, ϑ = 0.5, η = 0.5 and α = (0.5, 1.2) in Fig. 14.4.
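The closed form (14.14) can be cross-checked against the construction (14.13). The sketch below is restricted to $d = 2$ and the positive quadrant $z_1, z_2 > 0$, where the tail integral of $k_i(z) = |z|^{-1-\alpha_i}$ is $U_i(x) = x^{-\alpha_i}/\alpha_i$ and the joint tail integral is $F(U_1(x_1), U_2(x_2))$. The jump density is then the mixed derivative $\partial_{x_1}\partial_{x_2}$ of the joint tail integral, approximated here by a central finite difference. The step size and evaluation point are arbitrary choices.

```python
def tail_integral(x, alpha):
    """Tail integral of k(z) = |z|^(-1-alpha) for x > 0."""
    return x ** (-alpha) / alpha

def clayton(u, v, vartheta, eta):
    """Clayton Levy copula for d = 2, restricted to u, v > 0 (2^(2-d) = 1)."""
    return eta * (u ** (-vartheta) + v ** (-vartheta)) ** (-1.0 / vartheta)

def density_closed_form(z, alpha, vartheta, eta):
    """Formula (14.14) on the positive orthant."""
    d = len(z)
    s = sum(a ** vartheta * zi ** (a * vartheta) for zi, a in zip(z, alpha))
    prod = 1.0
    for i, (zi, a) in enumerate(zip(z, alpha)):   # i = 0..d-1 plays the role of (i - 1)
        prod *= (1 + i * vartheta) * a ** (vartheta + 1) * zi ** (a * vartheta - 1)
    return 2.0 ** (2 - d) * eta * s ** (-1.0 / vartheta - d) * prod

def density_finite_diff(z, alpha, vartheta, eta, h=1e-3):
    """Mixed central difference of the joint tail integral F(U_1(x), U_2(y))."""
    def U(x, y):
        return clayton(tail_integral(x, alpha[0]), tail_integral(y, alpha[1]), vartheta, eta)
    x, y = z
    return (U(x + h, y + h) - U(x + h, y - h) - U(x - h, y + h) + U(x - h, y - h)) / (4 * h * h)

alpha, vartheta, eta = (0.5, 1.2), 0.5, 0.5
z = (0.5, 0.8)
exact = density_closed_form(z, alpha, vartheta, eta)
approx = density_finite_diff(z, alpha, vartheta, eta)
print(exact, approx)
```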

Fig. 14.4 Anisotropic α-stable Lévy copula density (14.14) in d = 2 for α = (0.5, 1.2) and corresponding contour plot

14.3.3 Admissible Models

We make the following assumptions on our models.

Assumption 14.3.4 Let $X$ be a $d$-dimensional Lévy process with characteristic triplet $(Q, \nu, \gamma)$, Lévy density $k$ and marginal Lévy densities $k_i$, $i = 1,\dots,d$.
(i) There are constants $\beta_i^- > 0$, $\beta_i^+ > 1$, $i = 1,\dots,d$, and $C > 0$ such that

$$k_i(z) \le C \begin{cases} e^{-\beta_i^- |z|}, & z < -1,\\ e^{-\beta_i^+ z}, & z > 1. \end{cases} \tag{14.15}$$

(ii) Furthermore, we assume there exist an $\alpha$-stable process $X^0$ with Lévy density $k^0$ and $C_+ > 0$ such that

$$k(z) \le C_+ k^0(z), \qquad 0 < |z| < 1. \tag{14.16}$$

(iii) If $Q$ is not positive definite, we assume additionally that there is a $C_- > 0$ such that

$$k^{\mathrm{sym}}(z) \ge C_- k^{0,\mathrm{sym}}(z), \qquad 0 < |z| < 1. \tag{14.17}$$

For $d = 1$, the assumptions (14.15)–(14.17) coincide with the assumptions (10.11)–(10.13). If the marginal processes $X^i$, $i = 1,\dots,d$, are independent, the assumptions are equivalent to requiring that the corresponding marginal one-dimensional densities $k_i$, $i = 1,\dots,d$, satisfy (10.11)–(10.13).


14.4 Pricing Equation

As before, we assume the risk-neutral dynamics of the underlying asset price is given by

$$S_t^i = S_0^i\, e^{rt + X_t^i}, \qquad i = 1,\dots,d,$$

where $X$ is a $d$-dimensional Lévy process with characteristic triplet $(Q, \nu, \gamma)$ under a non-unique EMM. As shown in Lemma 14.1.3, the martingale condition implies

$$\gamma_j = -\frac{Q_{jj}}{2} - \int_{\mathbb{R}} \bigl(e^{z_j} - 1 - z_j\bigr)\,\nu_j(dz_j), \qquad j = 1,\dots,d.$$

We again show that the value of the option in log-price $v(t,x)$ is a solution of a multidimensional PIDE.

Proposition 14.4.1 Let $X$ be a Lévy process with state space $\mathbb{R}^d$ and characteristic triplet $(Q, \nu, \gamma)$, where the Lévy measure satisfies (14.15). Denote by $A$ the integro-differential operator

$$(Af)(x) = \tfrac12 \operatorname{tr}[Q D^2 f(x)] + \gamma^\top \nabla f(x) + \int_{\mathbb{R}^d} \bigl(f(x+z) - f(x) - z^\top \nabla_x f(x)\bigr)\,\nu(dz), \tag{14.18}$$

for functions $f \in C^2(\mathbb{R}^d)$ with bounded derivatives. Then, the process $M_t := f(X_t) - \int_0^t (Af)(X_s)\,ds$ is a martingale with respect to the filtration of $X$.

Proof Let $\Sigma = (\Sigma_{ij})_{1\le i,j\le d}$ be given such that $\Sigma\Sigma^\top = Q$. Proceeding as in the proof of Proposition 10.3.1, we obtain using the Itô formula for multidimensional Lévy processes and the Lévy–Itô decomposition

$$\begin{aligned}
df(X_t) &= \sum_{i=1}^{d} \partial_{x_i} f(X_{t-})\,dX_t^i + \frac12 \sum_{i,j=1}^{d} Q_{ij}\, \partial_{x_i x_j} f(X_t)\,dt + f(X_t) - f(X_{t-}) - \sum_{i=1}^{d} \Delta X_t^i\, \partial_{x_i} f(X_{t-}) \\
&= \gamma^\top \nabla f(X_{t-})\,dt + \sum_{i=1}^{d} \partial_{x_i} f(X_{t-}) \sum_{j=1}^{d} \Sigma_{ij}\,dW_t^j + \sum_{i=1}^{d} \partial_{x_i} f(X_{t-}) \int_{\mathbb{R}^d \setminus \{0\}} z_i\, \tilde J_X(dt,dz) \\
&\quad + \frac12 \operatorname{tr}[Q D^2 f(X_t)]\,dt + \int_{\mathbb{R}^d \setminus \{0\}} \Bigl(f(X_{t-}+z) - f(X_{t-}) - \sum_{i=1}^{d} z_i\, \partial_{x_i} f(X_{t-})\Bigr) J_X(dt,dz) \\
&= (Af)(X_t)\,dt + \sum_{i=1}^{d} \partial_{x_i} f(X_{t-}) \sum_{j=1}^{d} \Sigma_{ij}\,dW_t^j + \int_{\mathbb{R}^d} \bigl(f(X_{t-}+z) - f(X_{t-})\bigr)\, \tilde J_X(dt,dz).
\end{aligned}$$

As already shown in Proposition 8.1.2, $\sum_{i=1}^{d} \partial_{x_i} f(X_{t-}) \sum_{j=1}^{d} \Sigma_{ij}\,dW_t^j$ is a martingale. Since $f \in C^2(\mathbb{R}^d)$ and the Lévy measure $\nu$ satisfies (14.15), we have similar to Proposition 10.3.1 that also $\int_{\mathbb{R}^d} (f(X_{t-}+z) - f(X_{t-}))\, \tilde J_X(dt,dz)$ is a martingale. $\square$

Repeating the arguments which lead to Theorem 4.1.4 yields

Theorem 14.4.2 Let $v \in C^{1,2}(J \times \mathbb{R}^d) \cap C^0(\overline{J} \times \mathbb{R}^d)$ with bounded derivatives in $x$ be a solution of

$$\partial_t v + Av - rv = 0 \quad \text{in } J \times \mathbb{R}^d, \qquad v(T,x) = g(e^{x_1},\dots,e^{x_d}) \quad \text{in } \mathbb{R}^d, \tag{14.19}$$

with $A$ as in (14.18) with drift $r + \gamma$. Then, $v(t,x)$ can also be represented as

$$v(t,x) = \mathbb{E}\bigl[e^{-r(T-t)} g(e^{rT + X_T}) \,\big|\, X_t = x\bigr].$$

We again change to time-to-maturity $t \to T - t$ and remove the drift $\gamma$ and the interest rate $r$ by setting

$$u(t,x) := e^{rt}\, v\bigl(T - t,\, x_1 - (\gamma_1 + r)t,\dots,x_d - (\gamma_d + r)t\bigr).$$

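14.5 Variational Formulation

For the variational formulation, we need anisotropic Sobolev spaces of fractional order as defined in (13.4). Let $Q = (\sigma_i\sigma_j\rho_{ij})_{1\le i,j\le d}$, where $\rho_{ij}$ is the correlation of the Brownian motions $W^i$ and $W^j$. We set $\rho = (\rho_1,\dots,\rho_d)$ with

$$\rho_i = \begin{cases} 1 & \text{if } \sigma_i > 0,\\ \alpha_i/2 & \text{if } \sigma_i = 0, \end{cases}$$

with $\alpha$ given in Assumption 14.3.4. The variational formulation of the PIDE reads:

Find $u \in L^2(J; H^\rho(\mathbb{R}^d)) \cap H^1(J; L^2(\mathbb{R}^d))$ such that

$$(\partial_t u, v) + a^J(u,v) = 0, \quad \forall v \in H^\rho(\mathbb{R}^d), \text{ a.e. in } J, \qquad u(0) = u_0, \tag{14.20}$$

where $u_0(x) := g(e^{x_1},\dots,e^{x_d})$ and the bilinear form $a^J(\cdot,\cdot) : H^\rho(\mathbb{R}^d) \times H^\rho(\mathbb{R}^d) \to \mathbb{R}$ is given by

$$a^J(\varphi,\phi) := \frac12 \int_{\mathbb{R}^d} (\nabla\varphi)^\top Q \nabla\phi\,dx - \int_{\mathbb{R}^d} \int_{\mathbb{R}^d} \Bigl(\varphi(x+z) - \varphi(x) - \sum_{i=1}^{d} z_i\, \varphi_{x_i}(x)\Bigr) \phi(x)\,\nu(dz)\,dx.$$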


For a multidimensional Lévy process satisfying Assumption 14.3.4, we again have a sector condition, as shown in [134].

Lemma 14.5.1 Let X be a Lévy process with characteristic triplet (Q, ν, 0) where the Lévy measure satisfies (14.16), (14.17). Then, there exist constants C_i > 0, i = 1, 2, 3, such that

\[
\Re\,\psi(\xi) \ge C_1 \sum_{i=1}^d |\xi_i|^{2\rho_i}, \qquad |\psi(\xi)| \le C_2 \sum_{i=1}^d |\xi_i|^{2\rho_i} + C_3 .
\]

Using this, we can show as in Theorem 10.4.3

Theorem 14.5.2 Let X be a Lévy process with characteristic triplet (Q, ν, 0) where the Lévy measure satisfies (14.16), (14.17). Then, there exist constants C_i > 0, i = 1, 2, 3, such that for all ϕ, φ ∈ H^ρ(R^d) the following holds:

\[
|a^J(\varphi,\phi)| \le C_1 \|\varphi\|_{H^\rho(\mathbb{R}^d)} \|\phi\|_{H^\rho(\mathbb{R}^d)}, \qquad a^J(\varphi,\varphi) \ge C_2 \|\varphi\|^2_{H^\rho(\mathbb{R}^d)} - C_3 \|\varphi\|^2_{L^2(\mathbb{R}^d)} .
\]

In particular, for every u_0 ∈ L²(R^d) there exists a unique solution to the problem (14.20).

For multidimensional payoffs g which satisfy the growth condition (8.10), we can localize to a bounded domain G = (−R, R)^d ⊂ R^d, R > 0, which again corresponds to approximating the option price by a knock-out barrier option. Similar to Theorem 8.3.1 and Theorem 10.5.1, we obtain that the barrier option price converges to the option price exponentially fast in R if the Lévy measure satisfies (14.15). Therefore, we have the localized problem:

Find u_R ∈ L²(J; \widetilde{H}^ρ(G)) ∩ H¹(J; L²(G)) such that

\[
(\partial_t u_R, v) + a^J(u_R, v) = 0 \quad \forall v \in \widetilde{H}^\rho(G), \text{ a.e. in } J, \qquad u_R(0) = u_0|_G, \tag{14.21}
\]

which has a unique solution for every payoff g satisfying the growth condition (8.10).

14.6 Wavelet Discretization

Since the diffusion part has already been discussed in Sect. 8.4, we set Q = 0 and consider only pure jump models. As in the one-dimensional case, the main problem is the singularity of the Lévy measure at the origin z = 0 and, additionally, on the axes z_i = 0, i = 1, ..., d. Therefore, we integrate the integrodifferential expression for the jump generator by parts twice, as in Lemma 14.2.7, to obtain for φ, ϕ ∈ \widetilde{H}^1(G)

d 

∂i ϕ(x + zi )∂i φ(x)ki−2 (zi ) dx dzi R G i=1 d  

a J (ϕ, φ) =



i=2

∂ I ϕ(x + zI )φ(x)U I (zI ) dx dzI .

Ri

|I|=i, I_1 < [...]

[...] > 0 the domain spaces H are equal to those in the Black–Scholes case, cf. [134, Theorem 4.8].

Several examples of Feller processes are provided in this section. The methods presented in this work are not applicable to all of them. However, admissible market models are derived which ensure the well-posedness of the corresponding pricing equations and the applicability of finite element methods. Subordination can be used to construct Feller processes. However, the structure of the symbol is more involved than in the Lévy setting. We also present a construction of Feller processes using Lévy copulas.

16.3 Subordination

Many Lévy models in the context of option pricing are constructed via subordination of a Brownian motion by a corresponding stochastic clock, e.g. an NIG process [8] or a VG process [119]. We describe a similar construction for Feller processes and point out similarities and differences to the Lévy case. Bernstein functions play a crucial role in the representation of subordinators.

Definition 16.3.1 A function f(x) ∈ C^∞(0, ∞) is called a Bernstein function if

\[
f \ge 0, \qquad (-1)^k \frac{\partial^k f(x)}{\partial x^k} \le 0 \quad \forall k \in \mathbb{N}.
\]

Example 16.3.2 The functions f_1(x) = c, f_2(x) = cx and f_3(x) = 1 − e^{−cx}, c ≥ 0, are Bernstein functions.

Bernstein functions admit the following representation.

Theorem 16.3.3 For any Bernstein function f(x) there exist constants a, b ≥ 0 and a measure μ on (0, ∞) with

\[
\int_{0+}^\infty \frac{s}{1+s}\,\mu(ds) < \infty
\]

such that for x > 0

\[
f(x) = a + bx + \int_{0+}^\infty (1 - e^{-xs})\,\mu(ds).
\]

Proof See [94, Theorem 3.9.4].

If we choose a subordinator as Lévy process, Bernstein functions allow a complete characterization of the process. It is based on the observation that subordinators can be described via their convolution semigroup.

Definition 16.3.4 A family (η_t)_{t≥0} of bounded Borel measures on R is called a convolution semigroup on R if the following conditions are fulfilled:

(i) η_t(R) ≤ 1 for all t ≥ 0,
(ii) η_s ∗ η_t = η_{s+t} for s, t ≥ 0, and η_0 = δ_0,
(iii) η_t → δ_0 vaguely as t → 0,

where δ_0 denotes the Dirac measure at 0. By vague convergence of a family (η_t)_{t>0} of measures to η_0 we mean that for all continuous functions with compact support u ∈ C_0(R), we have

\[
\lim_{t\to 0} \int_{\mathbb{R}} u(x)\,\eta_t(dx) = \int_{\mathbb{R}} u(x)\,\eta_0(dx).
\]

The relation between convolution semigroups and Bernstein functions is given in the following theorem.

Theorem 16.3.5 Let f : (0, ∞) → R be a Bernstein function. Then, there exists a unique convolution semigroup (η_t)_{t≥0} supported on [0, ∞) such that

\[
\mathcal{L}(\eta_t)(x) = e^{-t f(x)}, \qquad x > 0 \text{ and } t > 0, \tag{16.16}
\]

holds, where \mathcal{L} denotes the Laplace transform, i.e. \mathcal{L}(η_t)(x) := \int_0^\infty e^{−zx} η_t(dz), for appropriate η_t and x > 0. Conversely, for any convolution semigroup (η_t)_{t≥0} supported on [0, ∞) there exists a unique Bernstein function f such that (16.16) holds.

Proof See, e.g. [94, Theorem 3.9.7].
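As an illustration of (16.16), the following minimal sketch checks the relation numerically for a gamma subordinator. The choice of subordinator and all parameter names (`a`, `b`, the sample size) are assumptions made for this example and are not taken from the text: the gamma process with law η_t = Gamma(at, rate b) has the Bernstein function f(x) = a log(1 + x/b), and its Laplace transform is estimated by Monte Carlo.

```python
import numpy as np

def bernstein_gamma(x, a=2.0, b=3.0):
    """Bernstein function f(x) = a*log(1 + x/b) of a gamma subordinator (assumed example)."""
    return a * np.log1p(x / b)

def laplace_mc(x, t, a=2.0, b=3.0, n=400_000, seed=0):
    """Monte Carlo estimate of L(eta_t)(x) = E[exp(-x*G_t)] with G_t ~ Gamma(a*t, rate b)."""
    rng = np.random.default_rng(seed)
    g = rng.gamma(shape=a * t, scale=1.0 / b, size=n)
    return np.exp(-x * g).mean()

t = 0.7
xs = np.array([0.5, 1.0, 4.0])

# Right-hand side of (16.16): exp(-t f(x)).
exact = np.exp(-t * bernstein_gamma(xs))
# Left-hand side of (16.16), estimated by sampling from eta_t.
approx = np.array([laplace_mc(x, t) for x in xs])

max_err = np.max(np.abs(exact - approx))
print("exp(-t f(x)):", exact)
print("Monte Carlo :", approx)
print("max abs error:", max_err)
```

The two rows should agree up to the Monte Carlo error, which decays like n^{-1/2} in the number of samples.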



We recall the correspondence between convolution semigroups and Lévy processes.

Theorem 16.3.6 Let X be a Lévy process, where for each t ≥ 0, X(t) has law η_t. Then (η_t)_{t≥0} is a convolution semigroup.

Proof See [3, Proposition 1.4.4].

The semigroup of subordinated Feller processes can now be characterized.




Theorem 16.3.7 Let (T_t)_{t≥0} be a Feller semigroup on a Banach space H with generator A, let f : (0, ∞) → R be a Bernstein function and let (η_t)_{t≥0} denote the associated convolution semigroup on R supported on [0, ∞). Define T_t^f u for u ∈ H by the Bochner integral

\[
T_t^f u = \int_0^\infty T_s u\,\eta_t(ds). \tag{16.17}
\]

Then, the integral is well-defined and (T_t^f)_{t≥0} is a Feller semigroup on H.

Proof See [94, Theorem 4.3.1 and Corollary 4.3.4].

The representation (16.17) of T_t^f u can be used for numerical methods, but it involves the approximation of an integral over a possibly semi-infinite interval, which can be very costly if the integrand is not well-behaved. The generator A^f of the semigroup (T_t^f)_{t≥0} is a PDO and, in the case of a spatially homogeneous semigroup (T_t)_{t≥0}, its symbol corresponds to a Lévy process X given as follows.

Theorem 16.3.8 Let (T_t)_{t≥0} be a Feller semigroup with generator A with constant symbol ψ(ξ) and f(x) as in the previous theorem. Then the symbol ψ^f(ξ) of the generator A^f of the semigroup (T_t^f)_{t≥0} is given as

\[
\psi^f(\xi) = f(\psi(\xi)).
\]

Proof See [3, Proposition 1.3.27].
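The relation ψ^f(ξ) = f(ψ(ξ)) can be checked on a standard example. The sketch below is an illustration under assumed parameters (`sigma`, `kappa`, `t` are hypothetical values): a Brownian motion with symbol ψ(ξ) = σ²ξ²/2 is time-changed by a gamma clock with Bernstein function f(x) = log(1 + κx)/κ, which yields a symmetric variance gamma process. With the sign convention E[e^{iξX_t}] = e^{−tψ(ξ)}, the simulated characteristic function is compared with exp(−t f(ψ(ξ))).

```python
import numpy as np

sigma, kappa, t = 0.3, 0.5, 1.0  # assumed model parameters for the illustration

def psi_bm(xi):
    """Symbol of a Brownian motion with volatility sigma."""
    return 0.5 * sigma**2 * xi**2

def f_gamma(x):
    """Bernstein function of a gamma clock with unit mean rate and variance rate kappa."""
    return np.log1p(kappa * x) / kappa

def char_fn_subordinated(xi):
    """exp(-t f(psi(xi))): characteristic function predicted by Theorem 16.3.8."""
    return np.exp(-t * f_gamma(psi_bm(xi)))

rng = np.random.default_rng(1)
n = 400_000
clock = rng.gamma(shape=t / kappa, scale=kappa, size=n)  # subordinator at time t
x = sigma * np.sqrt(clock) * rng.standard_normal(n)      # Brownian motion run on the clock

xis = np.array([1.0, 3.0, 6.0])
# The law of x is symmetric, so E[exp(i*xi*x)] = E[cos(xi*x)].
mc = np.array([np.cos(xi * x).mean() for xi in xis])
exact = char_fn_subordinated(xis)
max_err = np.max(np.abs(mc - exact))
print("Monte Carlo   :", mc)
print("exp(-t f(psi)):", exact)
```

This only covers the spatially homogeneous case; for state-dependent symbols the composition formula fails in general, as discussed in the remainder of this section.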

The same characterization does not hold in the case of a more general subordinated process. We obtain the following representation for A^f.

Theorem 16.3.9 Let f and (T_t)_{t≥0} be as in Theorem 16.3.7. For all u ∈ D(A), we have u ∈ D(A^f) and

\[
A^f u = au + bAu + \int_0^\infty R_\lambda A u\,\mu(d\lambda),
\]

where R_λ denotes the resolvent of A, i.e. R_λ u = (λ − A)^{−1} u, and a, b and μ are the constants and the measure of the representation in Theorem 16.3.3.

Proof See [96, Theorem 2.15].

This resolvent representation is of limited use for numerical computation, and we therefore aim at a characterization of the symbol of the PDO A^f. We consider a certain symbol class as in [96]. Let L(x, D) be the differential operator given by

\[
L(x,D) = -\sum_{i,j=1}^d a_{i,j}(x)\,\frac{\partial^2}{\partial x_i\,\partial x_j} + c(x),
\]

where a_{i,j}(x), 1 ≤ i, j ≤ d, are continuously differentiable functions such that a_{i,j}(x) = a_{j,i}(x) and

\[
\kappa_1 |\xi|^2 \le \sum_{i,j=1}^d a_{i,j}(x)\,\xi_i \xi_j \le \kappa_2 |\xi|^2
\]

holds for all x ∈ R^d and ξ ∈ R^d for some constants 0 < κ_1 ≤ κ_2. Besides, we assume

\[
\sum_{i=1}^d \frac{\partial a_{i,j}}{\partial x_i} = 0
\]

for any j = 1, ..., d, and that c(x) is a continuous and bounded function satisfying 0 < \underline{c} ≤ c(x) ≤ \overline{c} < ∞. In this situation, we obtain the following representation of A^f = f(A) for u ∈ D(A), A = L(x, D):

\[
f(A)u = f(L(x,\xi))u + \int_0^\infty \lambda R_\lambda K_\lambda(x,D)u\,\mu(d\lambda), \tag{16.18}
\]

where

\[
K_\lambda(x,D) = (L(x,D) + \lambda\,\mathrm{id}) \circ q_\lambda(x,D) - \mathrm{id}, \qquad q_\lambda(x,\xi) = \frac{1}{L(x,\xi) + \lambda}, \qquad L(x,\xi) = \sum_{i,j=1}^d a_{i,j}\,\xi_i \xi_j + c(x),
\]

and R_λ denotes the resolvent of A at λ. We remark that K_λ ≡ 0 for constant symbols, which proves Theorem 16.3.8. In general, both terms in (16.18) have to be considered. We refer to Carr [34] for a generalization of the VG model.

Remark 16.3.10 The consideration of symbols of the type a(x, ξ), where a(x, ξ) is a Lévy symbol for all x ∈ R, is therefore in general not equivalent to a construction via subordination, i.e. for A = L(x, D) with symbol ψ(x, ξ) the symbol of A^f is not given as f(ψ(x, ξ)). It has a more involved structure, as described above. This observation was made in [7], where an asymptotic expansion of the difference in terms of the symbol was provided under certain assumptions on the structure of the process.

16.4 Admissible Market Models We now formulate the requirements for market models which are admissible for our pricing schemes in terms of the marginals and the copula function. These requirements not only ensure existence and uniqueness of a solution of the corresponding pricing problem, but also ensure that the presented FEM based algorithms are feasible.


Definition 16.4.1 We call a d-dimensional Feller process with characteristic triplet (γ(x), Q(x), ν(x, dz)) a time-homogeneous admissible market model if it satisfies the following properties.

(i) The vector function x → b(x) ∈ R^d is smooth and bounded.

(ii) The matrix function x → Q(x) ∈ R^{d×d}_{sym} is smooth and bounded and, for all x ∈ R^d, the matrix Q(x) is positive semidefinite.

(iii) The jump measure ν(x, dz) is constructed from d independent, univariate Feller–Lévy measures with a 1-homogeneous copula function F that fulfills the following estimate: there is a constant C > 0 such that for all u ∈ (R\{0})^d and all n ∈ N_0^d it holds

\[
|\partial^n F(u)| \le C^{|n|+1}\,|n|!\,\min\{|u_1|,\dots,|u_d|\} \prod_{i=1}^d |u_i|^{-n_i}.
\]

(iv) For the marginal densities ν_i(x_i, dz_i) = k_i(x, z) dz the mapping x_i → ν_i(x_i, B) is smooth for all B ∈ B(R).

(v) There exist univariate Lévy kernels \overline{k}_i(z), i = 1, ..., d, with semi-heavy tails, i.e. which satisfy

\[
\overline{k}_i(z) \le C \begin{cases} e^{-\beta^- |z|}, & z < -1, \\ e^{-\beta^+ z}, & z > 1, \end{cases} \tag{16.19}
\]

for some constants C > 0, β^− > 0 and β^+ > 1. These Lévy kernels satisfy the following estimates:

\[
0 \le \nu_i(x_i, B) \le \int_B \overline{k}_i(z)\,dz \quad \forall x_i \in \mathbb{R},\ B \in \mathcal{B}(\mathbb{R}),\ i = 1,\dots,d.
\]

(vi) Besides, we require the following estimates on the derivatives of k_i(x, z):

\[
|\partial_x^n k_i(x,z)| \le C^{n+1}\,n!\,|z|^{-Y_i(x)-\delta n-1}, \qquad |\partial_z^n k_i(x,z)| \le C^{n+1}\,n!\,|z|^{-Y_i(x)-n-1},
\]

for any δ ∈ (0, 1), for all 0 ≠ z, x ∈ R, and \overline{Y} < 2 as well as \underline{Y} > 0, where Y_i(x) = \overline{Y}_i + \widetilde{Y}_i(x), \overline{Y}_i ∈ R_+ and \widetilde{Y}_i(x) ∈ S(R), i = 1, ..., d.

(vii) Finally, we require F^0 to be a 1-homogeneous Lévy copula and k_i^0(x_i, z_i) to be Y_i(x_i)-stable densities with tail integrals U_i^0(x_i, z_i), i = 1, ..., d, such that

\[
k_i(x,z) \ge C\,k_i^0(x,z) \quad \forall\, 0 < |z| < 1,\ \forall x \in \mathbb{R},\ i = 1,\dots,d,
\]

\[
\partial_1 \cdots \partial_n F(U(x,z)) \ge C\,\partial_1 \cdots \partial_n F^0(U^0(x,z)) \quad \forall\, 0 < |z| < 1,
\]

for some constant C > 0.

Remark 16.4.2 Note that the smoothness assumptions on Y_i(x_i), i = 1, ..., d, are necessary in order to obtain symbols as given in Definition 16.2.2. We confine the discussion to such symbols in order to use the results of [137] that rely on pseudodifferential calculus for symbols of variable order, cf. [86, 104]. To our knowledge, the derivation of similar results for symbols with lower regularity is open.


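In the following lemma, we characterize the symbol classes of admissible market models. This is crucial for the well-posedness of the pricing equation as discussed in Sect. 16.5.

Lemma 16.4.3 The symbol ψ(x, ξ) of a time-homogeneous admissible market model given as

\[
\psi(x,\xi) = -i\langle b(x),\xi\rangle + \frac{1}{2}\langle \xi, Q(x)\xi\rangle + \int_{0\ne z\in\mathbb{R}^d} \big( 1 - e^{i\langle z,\xi\rangle} + i\langle z,\xi\rangle \big)\,\nu(x,dz)
\]

is contained in the following symbol classes:

\[
\begin{cases}
\psi(x,\xi) \in S^{2}_{1,\delta} & \text{for } Q(x) \ge Q_0 > 0, \\
\psi(x,\xi) \in S^{\mathbf{Y}(x)}_{1,\delta} & \text{for } Q = 0,\ \gamma = 0, \\
\psi(x,\xi) \in S^{2\widetilde{m}(x)}_{1,\delta} & \text{for } Q = 0,\ \gamma \ne 0,
\end{cases}
\]

where δ ∈ (0, 1) and \widetilde{m}_i(x_i) = max(Y_i(x_i), 1)/2, i = 1, ..., d.

Proof We have, analogous to [134, Proposition 3.5],

\[
\forall \xi \in \mathbb{R}^d: \quad \Big| \int_{\mathbb{R}^d} \big( 1 - e^{i\langle z,\xi\rangle} + i\langle z,\xi\rangle \big)\,\nu(x,dz) \Big| \le C_1 \sum_{i=1}^d |\xi_i|^{Y_i(x_i)},
\]

for some positive constants C_1, C_2, C_3 > 0. The following estimate holds for the diffusion and the drift component:

\[
\forall \xi \in \mathbb{R}^d: \quad \Big| \frac{1}{2}\langle \xi, Q(x)\xi\rangle \Big| \le C_2 \sum_{i=1}^d |\xi_i|^2 \quad \text{and} \quad |i\langle b(x),\xi\rangle| \le C_3 \sum_{i=1}^d |\xi_i| .
\]

Remark 16.4.4 The partially degenerate case Q ≥ 0, but Q ≯ 0, can be analyzed as in [134, Remark 4.9]. Note that in the case γ ≠ 0 and Q = 0 additional assumptions on the behavior of Y_i(x_i) at 1 are necessary in order to ensure the smoothness of \widetilde{m}_i(x_i), i = 1, ..., d.

The infinitesimal generator A of a time-homogeneous admissible market model X reads

\[
\begin{aligned}
A\varphi(x) &= A_{Tr}\varphi(x) + A_{BS}\varphi(x) + A_J\varphi(x), \\
A_{Tr}\varphi(x) &= b(x)^\top \nabla\varphi(x), \\
A_{BS}\varphi(x) &= \frac{1}{2}\operatorname{tr}\big[ Q(x) D^2\varphi(x) \big], \\
A_J\varphi(x) &= \int_{\mathbb{R}^d} \big( \varphi(x+z) - \varphi(x) - z^\top \nabla\varphi(x) \big)\,\nu(x,dz),
\end{aligned} \tag{16.20}
\]

for ϕ ∈ C^∞(R^d).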


Note that the (non-constant) symbol of the infinitesimal generator of X does generally not coincide with the characteristic exponent of the process X, due to the spatial inhomogeneity of X. We illustrate the preceding, abstract developments with an example related to the so-called tempered-stable class of Lévy processes, which has been advocated in recent years in the context of financial modeling.

Example 16.4.5 (Feller–CGMY) We consider a d-dimensional Feller process with Clayton Lévy copula

\[
F(u_1,\dots,u_d) = 2^{2-d} \Big( \sum_{i=1}^d |u_i|^{-\vartheta} \Big)^{-\frac{1}{\vartheta}} \big( \rho\,1_{\{u_1,\dots,u_d \ge 0\}} - (1-\rho)\,1_{\{u_1,\dots,u_d \le 0\}} \big),
\]

where ϑ > 0, ρ ∈ [0, 1], together with CGMY-type densities

\[
k_i(x,z) = C(x) \Big( \frac{e^{-\beta_i^-(x)|z|}}{|z|^{1+Y_i(x)}}\,1_{\{z<0\}} + \frac{e^{-\beta_i^+(x)|z|}}{|z|^{1+Y_i(x)}}\,1_{\{z>0\}} \Big),
\]

with smooth and bounded functions C(x) > 0, β_i^−(x) > 0, β_i^+(x) > 1 and 0 < \underline{Y}_i < Y_i(x) ≤ \overline{Y}_i < 2, Y_i(x) sufficiently smooth, for i = 1, ..., d. We assume the Gaussian component Q(x) to be positive semidefinite, smooth and bounded. The drift γ(x) is assumed to be smooth and bounded. It is easy to see that this market model satisfies properties (i), (ii), (iv)–(vi) of the above definition. Properties (iii) and (vii) follow analogously to the proof of [163, Proposition 2.3.7].

16.5 Variational Formulation

We first prove a sector condition for admissible market models, which is crucial for the proof of well-posedness of the pricing equation. Subsequently, well-posedness results for certain types of pricing equations are given. These equations are of parabolic type, with the highest order operator being the diffusion or jump part of the generator of the market model.

16.5.1 Sector Condition

The sector condition for the symbol ψ(x, ξ) of a Feller process X is one of the main ingredients for proving well-posedness of the initial boundary value problems for the PIDEs arising in option pricing problems. The sector condition reads:

\[
\exists\, C > 0 \ \text{s.t.}\ \forall x, \xi \in \mathbb{R}^d: \quad \Re\,\psi(x,\xi) + 1 \ge C \langle \xi \rangle^{2m(x)}. \tag{16.21}
\]

Theorem 16.5.1 Let X be an admissible time-homogeneous market model process taking values in R^d with characteristic triplet (b(x), Q(x), k(x, z) dz). Then, there exists a constant C > 0 such that for all x ∈ R^d and ‖ξ‖_∞ sufficiently large

\[
\Re\,\psi(x,\xi) \ge C \sum_{j=1}^d |\xi_j|^{Y_j(x_j)}, \tag{16.22}
\]

where Y_j(x_j) = 2 in the case Q ≥ Q_0 > 0.

Proof The proof mainly follows the arguments of [163, Proposition 2.4.3]. We refer to [135, Theorem 5.1.6].

16.5.2 Well-Posedness

For an admissible time-homogeneous market model X, we can derive a PDO and PIDE representation and prove well-posedness of the weak formulation of the problem on a bounded domain. Due to no-arbitrage considerations, we require the considered processes to be martingales under a pricing measure Q. This requirement can be expressed in terms of the characteristic triplet.

Lemma 16.5.2 Let X be an admissible time-homogeneous market model with characteristic triplet (b(x), Q(x), ν(x, dz)) and semigroup (T_t)_{t≥0}, and let T_t(e^{x_j}) < ∞ hold for t ≥ 0, j = 1, ..., d. Then, e^{X_j} is a Q-martingale with respect to the canonical filtration of X if and only if

\[
\frac{Q_{jj}(x)}{2} + b_j(x) + \int_{0 \ne z_j \in \mathbb{R}} \big( e^{z_j} - 1 - z_j \big)\,\nu_j(x, dz_j) = 0 \quad \forall x \in \mathbb{R}. \tag{16.23}
\]

Proof This is a direct consequence of [61, Sect. 3].
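For a pure jump model (Q = 0) with a CGMY-type kernel, the martingale condition determines the drift as minus the integral of (e^z − 1 − z) against the kernel, which can be evaluated in closed form through the gamma function. The sketch below is a one-dimensional illustration under stated assumptions: the function names are hypothetical, the kernel is C e^{−β^+ z} z^{−1−Y} for z > 0 and C e^{−β^−|z|}|z|^{−1−Y} for z < 0 with the state x frozen, the order satisfies 0 < Y < 2 and Y ≠ 1, and the variable order Y(x) = k e^{−x²} + 0.5 with k = 0.25 is an assumed choice that keeps Y(x) away from 1.

```python
import math

def cgmy_exp_compensator(C, beta_minus, beta_plus, Y):
    """Closed form of the integral of (e^z - 1 - z) k(z) dz for a CGMY-type kernel.

    Assumes 0 < Y < 2, Y != 1 and beta_plus > 1 (finite exponential moment).
    """
    if not (0.0 < Y < 2.0) or Y == 1.0:
        raise ValueError("closed form requires 0 < Y < 2 and Y != 1")
    if beta_plus <= 1.0 or beta_minus <= 0.0:
        raise ValueError("requires beta_plus > 1 and beta_minus > 0")
    g = math.gamma(-Y)
    # Contribution of positive jumps, z > 0.
    pos = (beta_plus - 1.0) ** Y - beta_plus ** Y + Y * beta_plus ** (Y - 1.0)
    # Contribution of negative jumps, z < 0.
    neg = (beta_minus + 1.0) ** Y - beta_minus ** Y - Y * beta_minus ** (Y - 1.0)
    return C * g * (pos + neg)

def drift(x, k=0.25, C=1.0, beta_minus=10.0, beta_plus=10.0):
    """Drift b(x) enforcing the martingale condition for Q = 0 and Y(x) = k*exp(-x^2) + 0.5."""
    Y = k * math.exp(-x * x) + 0.5
    return -cgmy_exp_compensator(C, beta_minus, beta_plus, Y)

for x in (0.0, 0.5, 2.0):
    print(f"x = {x:4.1f}  b(x) = {drift(x):.6f}")
```

Since e^z − 1 − z ≥ 0, the compensator is positive and the resulting drift is negative for every state x.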



Remark 16.5.3 Note that without the assumption of finiteness of exponential moments of the processes X_j, the processes e^{X_j}, j = 1, ..., d, would generally only be local martingales. For Lévy processes, exponential decay of the jump measure implies the existence of exponential moments, cf. [143, Theorem 25.3]. This is not obvious for general Feller processes. Recently, Knopova and Schilling have proved in [106] the finiteness of exponential moments for a certain class of Feller processes assuming exponential decay of the density of the jump measure.

We are now able to derive a PDO and PIDE for option prices. Let the stochastic process X be an admissible time-homogeneous market model with generator A, let g be sufficiently smooth, and let V = H¹(R^d) for diffusion market models, V = H^m(R^d), m = [Y_1/2, ..., Y_d/2] ∈ (0, 1)^d for general space- and time-homogeneous models and V = H^{m(x)}(R^d) as in Definition 16.2.2, with m(x) = [Y_1(x_1)/2, ..., Y_d(x_d)/2], for the time-homogeneous admissible market models considered here. Then, we obtain formally from semigroup theory and from the representation u(t, x) = T_t(g) = e^{−r(T−t)} E[g(X_t) | X_0 = x] by differentiation in t the following backward PIDE:

\[
\partial_t u + Au - ru = 0 \quad \text{in } (0,T) \times \mathbb{R}^d, \tag{16.24}
\]

\[
u(T) = g \quad \text{in } \mathbb{R}^d. \tag{16.25}
\]

We set t_0 = 0 for notational convenience. Testing with a function v ∈ V and transforming to time-to-maturity, we end up with the following parabolic evolution problem: find u ∈ L²((0, T); V) ∩ H¹((0, T); V*) s.t. for all v ∈ V and a.e. t ∈ [0, T] the following holds:

\[
\langle \partial_t u, v \rangle_{V^* \times V} + a(u,v) = 0, \qquad u(0) = g, \tag{16.26}
\]

where the bilinear form a(ϕ, φ) = −⟨Aϕ, φ⟩_{V*,V} + r(ϕ, φ)_{L²(R^d)} is closely related to the Dirichlet form of the stochastic process X. Although only the homogeneous parabolic problem (16.26) arises in option pricing, the inhomogeneous equation (16.27) is useful in many applications. We mention only the computation of the time-value of an option, or the computation of quadratic hedging strategies and the corresponding hedging error. Thus, we consider the nonhomogeneous analogue of the above equation. The general problem reads: Find u ∈ L²((0, T); V) ∩ H¹((0, T); V*) s.t.

\[
\langle \partial_t u, v \rangle_{V^* \times V} + a(u,v) = \langle f, v \rangle_{V^* \times V} \quad \text{in } (0,T),\ \forall v \in V, \qquad u(0) = g, \tag{16.27}
\]

for some f ∈ L²((0, T); V*).

Now we consider the localization of the unbounded problem to a bounded domain G. For any function u with support in a bounded domain G ⊂ R^d, we denote by \widetilde{u} the zero extension of u to G^c = R^d\G and define A_G(u) = A(\widetilde{u}). The variational formulation of the pricing equation on a bounded domain G ⊂ R^d reads: Find u ∈ L²((0, T); V_G) ∩ H¹((0, T); (V_G)*) s.t. for all v ∈ V_G and a.e. t ∈ [0, T] the following holds:

\[
\langle \partial_t u, v \rangle_{V_G^* \times V_G} + a_G(u,v) = \langle f, v \rangle_{V_G^* \times V_G}, \tag{16.28}
\]

\[
u(0) = g|_G, \tag{16.29}
\]

where a_G(u, v) := a(\widetilde{u}, \widetilde{v}). The spaces V_G := {v ∈ L²(G) : \widetilde{v} ∈ V} consist of functions which vanish in a weak sense on ∂G. A comparable result for general Feller processes does not appear to be available yet.

Remark 16.5.4 Formulation (16.28)–(16.29) naturally arises for payoffs with finite support such as digital or (double) barrier options. The truncation to a bounded domain can thus be interpreted economically as the approximation of a standard derivative contract by a corresponding barrier option on the same market model.

Existence and uniqueness of weak solutions of (16.28)–(16.29) follow from the continuity of the bilinear form a_G(·, ·) and a Gårding inequality, which follows from Theorem 16.5.5.

Theorem 16.5.5 Let the generator A(x, D) ∈ Ψ^{2m(x)}_{ρ,δ} be a pseudodifferential operator of variable order 2m(x), 0 < m_i(x_i) < 1, i = 1, ..., d, with m(x) = (m_1(x_1), ..., m_d(x_d)) and symbol ψ(x, ξ) ∈ S^{2m(x)}_{ρ,δ} for some 0 < δ < ρ ≤ 1, for which there exists C > 0 with

\[
\Re\,\psi(x,\xi) + 1 \ge C \langle \xi \rangle^{2m(x)} \quad \forall x, \xi \in \mathbb{R}^d. \tag{16.30}
\]

Then, A(x, D) ∈ Ψ^{2m(x)}_{ρ,δ} satisfies a Gårding inequality in the variable order space \widetilde{H}^{m(x)}(G): There are constants C_1 > 0 and C_2 ≥ 0 such that

\[
\forall u \in \widetilde{H}^{m(x)}(G): \quad \Re\,a(u,u) \ge C_1 \|u\|^2_{\widetilde{H}^{m(x)}(G)} - C_2 \|u\|^2_{L^2(G)}, \tag{16.31}
\]

and

\[
\exists\, \lambda > 0 \ \text{such that}\ A(x,D) + \lambda I : \widetilde{H}^{m(x)}(G) \to H^{-m(x)}(G) \tag{16.32}
\]

is boundedly invertible, for a(u, v) = ⟨Au, v⟩_{H^{−m(x)}(G), \widetilde{H}^{m(x)}(G)}, u, v ∈ \widetilde{H}^{m(x)}(G).

Proof The proof follows along the lines of the proof of [137, Theorem 5], where the case d = 1 was treated.

Note that in the case of an admissible time-homogeneous market model, ℜ a_G(u, u) = a_G(u, u) holds for u ∈ V_G and a_G(·, ·) as in (16.28).

Theorem 16.5.6 The problem (16.28)–(16.29) for an admissible time-homogeneous market model X with symbol ψ(x, ξ), with initial condition g ∈ H = L²(R^d) and Y ≥ 1, Q = 0 or Q ≥ Q_0 > 0, has a unique solution.

Proof We obtain from Lemma 16.4.3

\[
\psi(x,\xi) \in S^{\mathbf{Y}(x)}_{1,\delta} \quad \text{for } Y \ge 1,\ Q = 0, \qquad \psi(x,\xi) \in S^{2}_{1,\delta} \quad \text{for } Q \ge Q_0 > 0.
\]

Theorem 16.5.1 implies

\[
\Re\,\psi(x,\xi) + 1 \ge C \langle \xi \rangle^{\mathbf{Y}(x)} \quad \text{for } Y \ge 1,\ Q = 0, \tag{16.33}
\]

\[
\Re\,\psi(x,\xi) + 1 \ge C \langle \xi \rangle^{2} \quad \text{for } Q \ge Q_0 > 0, \tag{16.34}
\]

for all x, ξ ∈ R^d. An application of Theorem 16.5.5 implies the claimed result.



16.6 Numerical Examples

In this section the implementation of numerical solution methods for the Kolmogorov equations for admissible market models with inhomogeneous jump measures using the techniques described above is discussed. We assume the risk-neutral dynamics of the underlying asset to be given by

\[
S(t) = S(0)\,e^{rt + X(t)},
\]

where X is a Feller process with characteristic triplet (b(x), Q(x), k(x, z) dz) under a risk-neutral measure Q such that e^X is a martingale with respect to the canonical filtration of X. In the following we set r = 0 for notational convenience. We consider Feller processes X that are admissible time-homogeneous market models. In the following we consider a special family of Feller processes to confirm the theoretical results of the previous sections.

Example 16.6.1 We consider a CGMY-type Feller process with jump kernel

\[
k(x,z) = C \begin{cases} e^{-\beta^+ z}\, z^{-1-Y(x)}, & z > 0, \\ e^{-\beta^- |z|}\, |z|^{-1-Y(x)}, & z < 0, \end{cases} \qquad Y(x) = k e^{-x^2} + 0.5 .
\]

This process has no Gaussian component and the drift γ(x) is chosen according to (16.23).

We also consider the following family of processes that do not satisfy the conditions of the theory developed above, since the variable order is assumed to be Lipschitz continuous only.

Example 16.6.2 We consider again a CGMY-type Feller process with jump kernel

\[
k(x,z) = C \begin{cases} e^{-\beta^+ z}\, z^{-1-Y(x)}, & z > 0, \\ e^{-\beta^- |z|}\, |z|^{-1-Y(x)}, & z < 0, \end{cases} \qquad
Y(x) = 0.5 + k \begin{cases} 0.4x, & 0.25 > x > 0, \\ 0.8x - 0.1, & 0.5 > x \ge 0.25, \\ -0.4x + 0.5, & 0.75 > x \ge 0.5, \\ -0.8x + 0.8, & 1 > x \ge 0.75, \\ 0, & \text{else}. \end{cases}
\]

This process has no Gaussian component and the drift b(x) is chosen according to (16.23).

In Fig. 16.1 the stiffness matrix for the process in Example 16.6.1 is depicted. Note that the uncompressed stiffness matrix is densely populated, but structurally very similar to the matrix in the Black–Scholes model, compare with Fig. 16.2. The assembly of the stiffness matrix is carried out using numerical quadratures. Standard quadratures, such as the Gauss quadrature, yield unsatisfactory results due to the weak singularity of the integrand. We therefore employ tensorized composite Gauss quadratures for the computation of the matrix entries, see [138, Sect. 7.2], [163, Chap. 5] and [149].
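The effect of a composite rule on a weakly singular integrand can be seen in one dimension. The following sketch is only an illustration of the idea and not the tensorized rules of the cited references: it integrates z^{−α} over (0, 1] with a plain Gauss–Legendre rule and with Gauss–Legendre rules on a mesh graded geometrically towards the singularity. The grading factor `sigma = 0.2`, the number of levels and the test integrand are assumptions chosen for this example.

```python
import numpy as np

def gauss_on_interval(f, a, b, n):
    """n-point Gauss-Legendre approximation of the integral of f over [a, b]."""
    nodes, weights = np.polynomial.legendre.leggauss(n)
    half = 0.5 * (b - a)
    return half * np.sum(weights * f(half * nodes + 0.5 * (a + b)))

def composite_gauss(f, levels, n, sigma=0.2):
    """Composite Gauss rule on (0, 1] with breakpoints 0, sigma^levels, ..., sigma, 1."""
    breakpoints = np.concatenate(([0.0], sigma ** np.arange(levels, -1, -1)))
    return sum(gauss_on_interval(f, a, b, n)
               for a, b in zip(breakpoints[:-1], breakpoints[1:]))

alpha = 0.5                          # strength of the (integrable) singularity at z = 0
f = lambda z: z ** (-alpha)
exact = 1.0 / (1.0 - alpha)          # integral of z^(-alpha) over (0, 1]

err_plain = abs(gauss_on_interval(f, 0.0, 1.0, 8) - exact)
err_graded = abs(composite_gauss(f, levels=8, n=8) - exact)
print("plain Gauss, 8 points        :", err_plain)
print("graded composite, 9 x 8 points:", err_graded)
```

On each subinterval away from the origin the integrand is analytic, so the Gauss rule converges rapidly there, while the remaining error is confined to the smallest subinterval next to the singularity.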
Using such an approach we obtain quadrature rules that converge exponentially with respect to the number of quadrature points even for weakly singular kernels. The condition numbers of the preconditioned stiffness matrices have to be uniformly bounded in the number of levels due to arguments similar to Sect. 12.2.3; we refer to [135, Chap. 5] for details. A parameter study for various choices of k in Example 16.6.1 and Example 16.6.2 is shown in Fig. 16.3. The condition numbers are uniformly bounded and of order 10^1, although the theoretical results only apply to Example 16.6.1. For variable orders with 1.95 ≤ Y we obtain condition numbers of order 10^2. Note that the condition numbers are not only influenced by the order of the singularity of the jump kernel at z = 0, but also by the rates of exponential decay β^+ and β^−. Fast decaying tails, i.e., large β^+ and β^−, may lead to larger constants.

Fig. 16.1 Stiffness matrices for the pure jump case with CGMY-type Lévy kernel (Y(x) = 1.25e^{−x²} + 0.5)

Fig. 16.2 Stiffness and mass matrices for the Black–Scholes model with σ = 0.3 and r = 0

Fig. 16.3 Condition numbers for different levels and choices of k

Figure 16.4 shows the price of a European put option for several Lévy processes and for one Feller process. In the Feller case we choose the CGMY parametrization with state-dependent parameter Y given by Y(x) = 0.8e^{−x²} + 0.1 in Example 16.6.1. For the Lévy models we choose constant Y ∈ {0.1, 0.5, 0.7, 0.8, 0.9}. In all cases we set C = 1, β^+ = β^− = 10 and use truncation parameters a = −3, b = 3 in log-moneyness coordinates. The prices in the Feller model are significantly different from the prices in the different Lévy models. This can be explained by the ability of the Feller model to account for different tail behavior for different states of the process, which is not possible using Lévy processes. Figure 16.5 shows the prices of an American put option for a Feller process and for several Lévy models. To enforce the constraint, we use a Lagrangian multiplier approach as described in [92, 93], see Chap. 5. The parameters were chosen as above.
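The variable orders and jump kernels of Examples 16.6.1 and 16.6.2 are straightforward to evaluate. The sketch below is a hedged illustration: the function names are hypothetical, the offset of the smooth order is exposed as a parameter `y0` so that both Y(x) = k e^{−x²} + 0.5 and the choice Y(x) = 0.8e^{−x²} + 0.1 used for the option prices can be represented, and the defaults C = 1, β^+ = β^− = 10 follow the parameters quoted above.

```python
import numpy as np

def order_smooth(x, k, y0=0.5):
    """Smooth variable order Y(x) = k*exp(-x^2) + y0 (Example 16.6.1 for y0 = 0.5)."""
    x = np.asarray(x, dtype=float)
    return k * np.exp(-x ** 2) + y0

def order_lipschitz(x, k):
    """Piecewise linear, Lipschitz continuous variable order of Example 16.6.2."""
    x = np.asarray(x, dtype=float)
    hat = np.select(
        [(x > 0.0) & (x < 0.25), (x >= 0.25) & (x < 0.5),
         (x >= 0.5) & (x < 0.75), (x >= 0.75) & (x < 1.0)],
        [0.4 * x, 0.8 * x - 0.1, -0.4 * x + 0.5, -0.8 * x + 0.8],
        default=0.0,
    )
    return 0.5 + k * hat

def kernel(x, z, order, C=1.0, beta_minus=10.0, beta_plus=10.0):
    """CGMY-type jump kernel k(x, z) with state-dependent order Y(x) = order(x), z != 0."""
    z = np.asarray(z, dtype=float)
    beta = np.where(z > 0.0, beta_plus, beta_minus)
    return C * np.exp(-beta * np.abs(z)) * np.abs(z) ** (-1.0 - order(x))

x = np.array([0.0, 0.5, 2.0])
z = np.array([0.1, 0.1, 0.1])
print(kernel(x, z, lambda s: order_smooth(s, k=0.8, y0=0.1)))
print(kernel(x, z, lambda s: order_lipschitz(s, k=1.0)))
```

For a fixed jump size z the kernel varies with the state x only through the order Y(x), which is the mechanism behind the state-dependent tail behavior mentioned above.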


Fig. 16.4 Option prices for several models for a European put option with T = 1 and K = 100

Fig. 16.5 Option prices for several models for an American put option with T = 1 and K = 100

16.7 Further Reading

Schneider et al. [137] consider the well-posedness and discretization of pricing equations on variable order spaces. For the multidimensional setting we refer to Reichmann and Schwab [138] and Reichmann [135]. Extensions of Lévy type market models involving local speed functions have also been considered by Carr et al. [35]. A generalized NIG model was proposed by Barndorff-Nielsen and Levendorskiĭ [7].

Appendix A

Elliptic Variational Inequalities

This appendix gives more information on elliptic variational inequalities. Some definitions and results from functional analysis are summarized in Sects. A.1 and A.2. Existence and uniqueness results are discussed in Sect. A.3.

A.1 Hilbert Spaces

Let H be a vector space over R. An inner product (u, v) is a bilinear form from H × H → R which is

symmetric: ∀u, v ∈ H: (u, v) = (v, u),

positive definite: ∀u ∈ H: (u, u) ≥ 0, (u, u) = 0 ⟺ u = 0.

Each inner product (·, ·) on H induces a norm on H via

\[
\|v\| := (v,v)^{\frac{1}{2}} \quad \forall v \in H,
\]

and satisfies the Cauchy–Schwarz inequality:

\[
\forall u, v \in H: \quad |(u,v)| \le \|u\|\,\|v\|. \tag{A.1}
\]

Moreover, there holds the parallelogram law

\[
\Big\| \frac{u+v}{2} \Big\|^2 + \Big\| \frac{u-v}{2} \Big\|^2 = \frac{1}{2}\big( \|u\|^2 + \|v\|^2 \big) \quad \forall u, v \in H. \tag{A.2}
\]

Definition A.1.1 A vector space H is a Hilbert space if it is endowed with an inner product (·, ·) and if it is complete with respect to the norm ‖u‖ = (u, u)^{1/2}. A subset K ⊂ H is convex, if

\[
\forall u, v \in K: \quad \{\lambda u + (1-\lambda)v : 0 < \lambda < 1\} \subset K. \tag{A.3}
\]

N. Hilber et al., Computational Methods for Quantitative Finance, Springer Finance, DOI 10.1007/978-3-642-35401-4, © Springer-Verlag Berlin Heidelberg 2013


Theorem A.1.2 (Projection onto closed convex subsets) Let H be a Hilbert space and K ⊂ H be a closed and convex subset. Then, for every f ∈ H there exists a unique u ∈ K such that

\[
\|f - u\| = \min_{v \in K} \|f - v\|. \tag{A.4}
\]

Moreover, u is characterized by the variational inequality:

\[
\text{find } u \in K: \quad (f - u, v - u) \le 0 \quad \forall v \in K. \tag{A.5}
\]

We define u = P_K f as the projection of f onto K.

Proof The set {‖f − v‖ : v ∈ K} is bounded from below by 0, hence d := inf_{v∈K} ‖f − v‖ ≥ 0 exists. Let (v_n)_n ⊂ K denote a minimizing sequence, i.e. d_n := ‖f − v_n‖ → d as n → ∞. Then (v_n)_n is Cauchy, since by (A.2) with f − v_n in place of u and with f − v_m in place of v it holds

\[
\Big\| f - \frac{v_n + v_m}{2} \Big\|^2 + \Big\| \frac{v_n - v_m}{2} \Big\|^2 = \frac{1}{2}\big( d_n^2 + d_m^2 \big).
\]

Since K is convex, (v_n + v_m)/2 ∈ K and hence ‖f − (v_n + v_m)/2‖ ≥ d. Therefore,

\[
\Big\| \frac{v_n - v_m}{2} \Big\|^2 \le \frac{1}{2}\big( d_n^2 + d_m^2 \big) - d^2 \to 0 \quad \text{as } n, m \to \infty,
\]

and hence (v_n)_n is Cauchy. Since H is complete, there exists a unique u = lim_{n→∞} v_n, and since K is closed, u ∈ K. Since

\[
d = \inf\{\|f - v\| : v \in K\} \le \|f - u\| = \|f - v_n + v_n - u\| \le d_n + \|u - v_n\| \xrightarrow[n\to\infty]{} d,
\]

u satisfies (A.4). To show (A.5), let u ∈ K verify (A.4) and let w ∈ K be arbitrary. Then {(1 − λ)u + λw : 0 < λ < 1} ⊂ K and hence

\[
\|f - u\| \le \|f - [(1-\lambda)u + \lambda w]\| = \|(f - u) - \lambda(w - u)\|,
\]

so that for 0 ≤ λ ≤ 1 it holds that

\[
\|f - u\|^2 \le \|f - u\|^2 - 2\lambda (f - u, w - u) + \lambda^2 \|w - u\|^2,
\]

from which, after dividing by λ > 0, we find that

\[
\forall w \in K,\ \forall\, 0 < \lambda \le 1: \quad 2(f - u, w - u) \le \lambda \|w - u\|^2.
\]

Letting λ → 0 here implies (A.5). Conversely, let u satisfy (A.5). Then, for all v ∈ K,

\[
\|u - f\|^2 - \|v - f\|^2 = 2(f - u, v - u) - \|u - v\|^2 \le 0,
\]

which implies (A.4). To show that u in (A.4) is unique, we assume that there is \widetilde{u} ≠ u, \widetilde{u} ∈ K, such that d = ‖f − u‖ = ‖f − \widetilde{u}‖. Then

\[
\begin{aligned}
\|u - \widetilde{u}\|^2 &= \|(u - f) - (\widetilde{u} - f)\|^2 = \|u - f\|^2 - 2(u - f, \widetilde{u} - f) + \|\widetilde{u} - f\|^2 \\
&= 2d^2 + 2(f - u, \widetilde{u} - u + u - f) = 2d^2 + 2(f - u, \widetilde{u} - u) - 2(f - u, f - u) \\
&= 2d^2 + 2(f - u, \widetilde{u} - u) - 2\|f - u\|^2 = 2(f - u, \widetilde{u} - u) \le 0,
\end{aligned}
\]

hence \widetilde{u} = u, a contradiction.



We establish some properties of P_K.

Theorem A.1.3 For any K ⊂ H closed and convex, the following holds:

\[
\forall f_1, f_2 \in H: \quad \|P_K f_1 - P_K f_2\| \le \|f_1 - f_2\|. \tag{A.6}
\]

Proof Let u_1 = P_K f_1, u_2 = P_K f_2. By (A.5),

\[
\forall v \in K: \quad (f_1 - u_1, v - u_1) \le 0, \tag{A.7}
\]

\[
\forall v \in K: \quad (f_2 - u_2, v - u_2) \le 0. \tag{A.8}
\]

Insert v = u_2 into (A.7) and v = u_1 into (A.8). Then, adding, we get

\[
\|u_1 - u_2\|^2 \le (f_1 - f_2, u_1 - u_2) \le \|f_1 - f_2\|\,\|u_1 - u_2\|.
\]
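Both properties can be observed in a finite-dimensional example. The sketch below assumes H = R^5 with the Euclidean inner product and takes K to be the box [−1, 1]^5, for which the projection is the componentwise clipping; these choices and the function name `project_box` are illustrative assumptions. It checks the variational inequality (A.5) on random points of K and the non-expansiveness (A.6).

```python
import numpy as np

def project_box(f, lo, hi):
    """Projection onto the closed convex box K = [lo, hi]^n in the Euclidean norm."""
    return np.clip(f, lo, hi)

rng = np.random.default_rng(0)
lo, hi = -1.0, 1.0
f1 = 3.0 * rng.standard_normal(5)
f2 = 3.0 * rng.standard_normal(5)
u1, u2 = project_box(f1, lo, hi), project_box(f2, lo, hi)

# (A.5): (f1 - u1, v - u1) <= 0 for all v in K, tested on random v in K.
vs = rng.uniform(lo, hi, size=(1000, 5))
vi_max = np.max((vs - u1) @ (f1 - u1))

# (A.6): the projection does not increase distances.
nonexpansive = np.linalg.norm(u1 - u2) <= np.linalg.norm(f1 - f2) + 1e-12

print("max of (f1 - u1, v - u1) over samples:", vi_max)
print("non-expansive:", nonexpansive)
```

For the box, each component of f − u has the sign that pushes u against the corresponding face of K, which is why every term of the inner product in (A.5) is nonpositive.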

A.2 Dual of a Hilbert Space

A linear map u* : H → R is continuous if and only if

\[
\|u^*\|_{H^*} := \sup_{0 \ne u \in H} \frac{|u^*(u)|}{\|u\|} < \infty. \tag{A.9}
\]

Such u* are called continuous, linear functionals. The set of all continuous, linear functionals on H is denoted by H*. It is a normed, linear space with norm ‖u*‖_{H*} in (A.9), and called the dual of H.

Theorem A.2.1 (Riesz representation theorem) For every u* ∈ H* there exists a unique u ∈ H such that

\[
\forall v \in H: \quad u^*(v) = (u, v) \tag{A.10}
\]

and

\[
\|u^*\|_{H^*} = \|u\|. \tag{A.11}
\]

Proof Since u* ∈ H*, the set N := {v ∈ H : u*(v) = 0} is a closed linear subspace of H. If N = H, u* ≡ 0, and we take u = 0. Assume now N ≠ H. Then, we claim that there exists g ∈ H\N such that

\[
\|g\| = 1, \qquad \forall w \in N: \ (g, w) = 0. \tag{A.12}
\]

To this end, let g_0 ∈ H\N and let g_1 := P_N g_0 ≠ g_0. Then g := ‖g_0 − g_1‖^{−1}(g_0 − g_1) satisfies (A.12). Next, every v ∈ H can be written as v = λg + w with λ ∈ R and w ∈ N: put, for example,

\[
\lambda = \frac{u^*(v)}{u^*(g)}, \qquad w = v - \lambda g.
\]

Then, we get 0 = (g, w) = (g, v − λg), i.e. (g, v) = λ = u*(v)/u*(g). Hence u := u*(g) g satisfies

\[
\forall v \in H: \quad u^*(v) = u^*(\lambda g + w) = \lambda u^*(g) + u^*(w) = \lambda u^*(g) = (g, v)\,u^*(g) = (u, v),
\]

i.e. (A.10) and, by (A.1),

\[
\|u^*\|_{H^*} = \sup_{v \in H} \frac{|u^*(v)|}{\|v\|} = \sup_{v \in H} \frac{|(u,v)|}{\|v\|} \le \|u\|
\]

and

\[
\|u\| = \frac{(u,u)}{\|u\|} = \frac{|u^*(u)|}{\|u\|} \le \sup_{v \in H} \frac{|u^*(v)|}{\|v\|} = \|u^*\|_{H^*}.
\]
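In finite dimensions the Riesz representer can be computed explicitly. The following sketch assumes H = R^4 with the inner product (u, v) = u^⊤ M v for a symmetric positive definite matrix M, and the functional u*(v) = c^⊤ v; the matrix, the vector `c` and all names are illustrative assumptions. The representer then solves M u = c, and the dual norm equals ‖u‖.

```python
import numpy as np

rng = np.random.default_rng(0)
n = 4
B = rng.standard_normal((n, n))
M = B @ B.T + n * np.eye(n)          # symmetric positive definite Gram matrix
c = rng.standard_normal(n)           # coefficients of the functional u*(v) = c . v

inner = lambda u, v: u @ M @ v       # inner product (u, v) = u^T M v
norm = lambda v: np.sqrt(inner(v, v))
functional = lambda v: c @ v

# Riesz representer: (u, v) = u*(v) for all v  <=>  M u = c.
u = np.linalg.solve(M, c)

v = rng.standard_normal(n)
repr_err = abs(functional(v) - inner(u, v))

# The quotient |u*(v)| / ||v|| never exceeds ||u|| and attains it at v = u, cf. (A.11).
samples = rng.standard_normal((2000, n))
ratios = np.abs(samples @ c) / np.sqrt(np.einsum("ij,jk,ik->i", samples, M, samples))
print("representation error:", repr_err)
print("sampled sup:", ratios.max(), " ||u|| =", norm(u))
```

Sampling only gives a lower bound for the supremum in (A.9); the exact value is obtained at v = u.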



Remark A.2.2 The preceding result shows that H* is isomorphic to H, and the map u* → u is, by (A.11), an isometry. One therefore often, but not always, identifies H* with H.

In the parabolic setting, we often have the following. Let V ⊂ H be a subspace which is dense in H and equipped with a norm ‖·‖_V such that the canonical embedding i : V → H is continuous:

\[
\forall v \in V: \quad \|v\| \le C \|v\|_V. \tag{A.13}
\]

We identify H and H*. Then one can embed H into V* as follows: given u ∈ H, the map u*_u : V ∋ v → (u, v) is in H* and, due to

\[
|u^*_u(v)| = |(u,v)| \le \|u\|\,\|v\| \le C \|u\|\,\|v\|_V \quad \forall u \in H,\ \forall v \in V,
\]

also in V*. We denote by T : u → u*_u the mapping with

\[
\forall v \in V: \quad (Tu)(v) = (u, v).
\]

Then T : H → V* satisfies:

(i) ‖Tu‖_{V*} = sup_{w∈V} |(Tu)(w)| / ‖w‖_V = sup_{w∈V} |(u, w)| / ‖w‖_V ≤ C‖u‖,
(ii) T is injective,
(iii) T(H) is dense in V*.

With T we can embed H into V* and get the triple

\[
V \hookrightarrow H \cong H^* \hookrightarrow V^*, \tag{A.14}
\]

where the canonical injections are continuous and dense, and where H is called a 'pivot space'. Note that, in (A.14), one cannot identify V and V* any more.


A.3 Theorems of Stampacchia and Lax–Milgram

The theorems of Stampacchia and Lax–Milgram are useful existence results for stationary problems. Let $V$ be a Hilbert space with norm $\|\cdot\|_V$ and inner product $\langle \cdot, \cdot \rangle_V$.

Definition A.3.1 A bilinear form $b(\cdot, \cdot) : V \times V \to \mathbb{R}$ is

(i) continuous, if there exists $C_1 > 0$ such that
$$\forall u, v \in V : \ |b(u, v)| \le C_1 \|u\|_V \|v\|_V , \tag{A.15}$$
(ii) coercive, if there exists $C_2 > 0$ such that
$$\forall u \in V : \ b(u, u) \ge C_2 \|u\|_V^2 . \tag{A.16}$$

Theorem A.3.2 Let $b(\cdot, \cdot) : V \times V \to \mathbb{R}$ be a continuous and coercive bilinear form on $V$, and let $\emptyset \neq K \subset V$ be closed and convex. Then, for any $\ell \in V^*$ there exists a unique solution $u \in K$ of the variational inequality
$$\text{Find } u \in K \text{ such that } \ b(u, v - u) \ge \ell(v - u) \quad \forall v \in K . \tag{A.17}$$

The proof uses

Theorem A.3.3 (Banach’s fixed point theorem) Let $X$ be a complete metric space and $S : X \to X$ a mapping with
$$d(Sv_1, Sv_2) \le \kappa\, d(v_1, v_2) \quad \forall v_1, v_2 \in X \tag{A.18}$$
for some $\kappa < 1$. Then, the problem
$$\text{Find } u \in X \text{ such that } \ u = Su \tag{A.19}$$
admits a unique solution.

Proof of Theorem A.3.2 Given $\ell \in V^*$, there exists, by Theorem A.2.1, a unique $f \in V$ such that $\forall v \in V : \ell(v) = \langle f, v \rangle_V$. For every fixed $u \in V$, the map $v \mapsto b(u, v)$ is in $V^*$ and there is a unique representative in $V$, denoted by $Bu$, such that $b(u, v) = \langle Bu, v \rangle_V$ for all $v \in V$. The operator $B : V \to V$ is linear and, by (A.15), (A.16),
$$\|Bu\|_V \le C_1 \|u\|_V \quad \forall u \in V , \tag{A.20}$$
$$\langle Bu, u \rangle_V \ge C_2 \|u\|_V^2 \quad \forall u \in V . \tag{A.21}$$
Hence, (A.17) may be rewritten as
$$\text{Find } u \in K \text{ such that } \ \langle Bu, v - u \rangle_V \ge \langle f, v - u \rangle_V \quad \forall v \in K . \tag{A.22}$$


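Let $\rho > 0$ be a parameter. Then (A.22) is equivalent to
$$\text{Find } u \in K \text{ such that } \ \langle (\rho f - \rho Bu + u) - u, v - u \rangle_V \le 0 \quad \forall v \in K , \tag{A.23}$$
or to the fixed point problem
$$u = P_K(\rho f - \rho Bu + u) \ \text{ in } K . \tag{A.24}$$
Define, for $v \in K$, $Sv = P_K(v - \rho(Bv - f))$. Then, by (A.6), for every $v_1, v_2 \in K$,
$$\|Sv_1 - Sv_2\|_V \le \|(v_1 - v_2) - \rho B(v_1 - v_2)\|_V ,$$
and hence, using (A.21) and (A.20),
$$\|Sv_1 - Sv_2\|_V^2 \le \|v_1 - v_2\|_V^2 - 2\rho \langle B(v_1 - v_2), v_1 - v_2 \rangle_V + \rho^2 \|B(v_1 - v_2)\|_V^2 \le \|v_1 - v_2\|_V^2 \,(1 - 2\rho C_2 + \rho^2 C_1^2) .$$
We fix $\rho > 0$ such that $\kappa := (1 - 2\rho C_2 + \rho^2 C_1^2)^{1/2} < 1$, which holds for every $0 < \rho < 2C_2/C_1^2$. Then $S$ is a contraction on $K$ and (A.24) has a unique solution. $\square$

Remark A.3.4 The proof is constructive: the fixed point iteration with the contraction $S$ converges with rate $\kappa < 1$. The convergence is fastest if
$$\rho = \rho^* = \arg\min_{\rho} \{1 - 2\rho C_2 + \rho^2 C_1^2\} = C_2 / C_1^2 , \tag{A.25}$$
in which case
$$\kappa = \kappa^* = (1 - C_2^2 / C_1^2)^{1/2} < 1 . \tag{A.26}$$
In particular, if $v^{(0)} \in K$ is arbitrary, the sequence
$$v^{(i+1)} = S v^{(i)} = P_K\big(v^{(i)} - \rho (B v^{(i)} - f)\big) \tag{A.27}$$
converges to $u \in K$, the solution of (A.17), and
$$\|u - v^{(i)}\|_V \le C (\kappa^*)^i \|u - v^{(0)}\|_V , \quad i \ge 0 .$$

Corollary A.3.5 If $M \subseteq V$ is a closed, linear subspace, the problem
$$\text{Find } u \in M \text{ such that } \ b(u, v) \ge \ell(v) \quad \forall v \in M \tag{A.28}$$
admits a unique solution. Since $M$ is a linear subspace, $v$ may be replaced by $-v$, so (A.28) is equivalent to $b(u, v) = \ell(v)$ for all $v \in M$.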
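The iteration (A.27) translates directly into a short algorithm. The following is a minimal sketch, assuming $V = \mathbb{R}^n$ with the Euclidean inner product, $K = \{v : v \ge 0\}$ (so that $P_K$ is the componentwise positive part), and the step size $\rho^* = C_2/C_1^2$ from (A.25). The function name `projected_fixed_point` and the sample data `B`, `f` are illustrative assumptions, not taken from the text.

```python
import numpy as np


def projected_fixed_point(B, f, project, v0, tol=1e-12, max_iter=10_000):
    """Iterate v <- P_K(v - rho (B v - f)) with rho = C2 / C1**2, cf. (A.25), (A.27)."""
    C1 = np.linalg.norm(B, 2)                        # continuity constant: ||B v|| <= C1 ||v||
    C2 = np.linalg.eigvalsh(0.5 * (B + B.T)).min()   # coercivity constant: (B v, v) >= C2 ||v||^2
    if C2 <= 0.0:
        raise ValueError("B is not coercive")
    rho = C2 / C1**2
    v = project(np.asarray(v0, dtype=float))
    for i in range(max_iter):
        v_new = project(v - rho * (B @ v - f))
        if np.linalg.norm(v_new - v) <= tol:
            return v_new, i + 1
        v = v_new
    return v, max_iter


# Non-symmetric but coercive example; K is the non-negative orthant.
B = np.array([[2.0, -1.0],
              [0.0, 2.0]])
f = np.array([1.0, -1.0])
u, n_iter = projected_fixed_point(B, f, lambda v: np.maximum(v, 0.0), np.zeros(2))
residual = B @ u - f   # for this K: u >= 0, residual >= 0 and u @ residual = 0
```

For $K$ the non-negative orthant, (A.17) is a linear complementarity problem; in this example the solution is $u = (0.5, 0)$ with residual $(0, 1)$. The step size $\rho^*$ is the one analysed in Remark A.3.4; for symmetric $B$ other choices of $\rho$ may converge faster.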

Appendix B

Parabolic Variational Inequalities

To formulate the parabolic variational inequalities (PVIs) for pricing European and American contracts, we require suitable function spaces. Let $V$, $H$ be Hilbert spaces with norms $\|\cdot\|_V$ and $\|\cdot\|_H$, respectively. As in Appendix A, we assume that $V \hookrightarrow H$ with dense injection and identify $H$ with its dual $H^*$, so that we obtain the so-called evolution triple with dense injections,
$$V \hookrightarrow H \cong H^* \hookrightarrow V^* . \tag{B.1}$$
On the time interval $J := (a, b) \subset \mathbb{R}$, we introduce the spaces of “sum” and of “intersection” type by
$$S(a, b) := L^1(J; H) + L^2(J; V^*) , \tag{B.2}$$
$$I(a, b) := L^2(J; V) \cap L^\infty(J; H) . \tag{B.3}$$
The present results on existence and time discretization of abstract parabolic evolution variational inequalities are based on [5].

B.1 Weak Formulation of PVI’s

In the triple (B.1), we are given a linear, possibly non-selfadjoint operator $A : V \to V^*$ with the associated bilinear form $a(u, v) := \langle Au, v \rangle_{V^*,V} : V \times V \to \mathbb{R}$, where $\langle \cdot, \cdot \rangle_{V^*,V}$ denotes the extension of the $H$ inner product to $V^* \times V$ by continuity. We assume that there are $\alpha, \beta > 0$ such that
$$\forall u, v \in V : \ |a(u, v)| \le \beta \|u\|_V \|v\|_V , \tag{B.4a}$$
$$\forall u \in V : \ a(u, u) + \lambda \|u\|_H^2 \ge \alpha \|u\|_V^2 , \tag{B.4b}$$
for some $\lambda \ge 0$. A parabolic variational inequality (PVI) in strong form is given by a set
$$\emptyset \neq K \subset V, \ \text{closed and convex with } 0 \in K , \tag{B.5}$$
of admissible prices, a payoff $u_0 \in H$, a time horizon $T \le \infty$ and a source term $f : J \to V^*$, $J = (0, T)$. Then, the PVI reads: find $u : J \to V$ such that
$$\begin{aligned} & u(t) \in K \quad \text{a.e. } t \in J , \\ & \langle u' + Au(t) - f(t), u(t) - v \rangle_{V^*,V} \le 0 \quad \forall v \in K, \ \text{a.e. in } J , \\ & u(0) = u_0 . \end{aligned} \tag{B.6}$$

N. Hilber et al., Computational Methods for Quantitative Finance, Springer Finance, DOI 10.1007/978-3-642-35401-4, © Springer-Verlag Berlin Heidelberg 2013

Existence results for the PVI (B.6) require a weak formulation. To state it, we assume in (B.6)
$$u \in L^2(J; V), \quad u(t) \in K \ \text{a.e. } t \in J , \tag{B.7a}$$
$$v, v' \in L^2(J; V), \quad v(t) \in K \ \text{a.e. } t \in J . \tag{B.7b}$$
Then a solution $u$ in (B.6) is such that for all $v$ in (B.7a), (B.7b)
$$\frac{d}{dt} \frac{1}{2} \|u(t) - v(t)\|_H^2 + \langle v'(t) + Au(t) - f(t), u(t) - v(t) \rangle_{V^*,V} \le 0 . \tag{B.8}$$
This remains valid even for $u$ which do not satisfy (B.7a). The weak form of (B.6) is thus: given
$$u_0 \in \overline{K}^{\|\cdot\|_H}, \quad 0 < T \le \infty, \quad f \in L^2(0, T; V^*) , \tag{B.9a}$$
and $K$ satisfying (B.5), $A : V \to V^*$ satisfying (B.4a), (B.4b) for $\lambda = 0$, find
$$u \in L^2(0, T; V) \ \text{such that } u(t) \in K \ \text{a.e. } t \in J , \tag{B.9b}$$
and such that the function $\Theta(t)$, given by
$$\Theta(t) := \frac{1}{2} \|u(t) - v(t)\|_H^2 + \int_0^t \langle v'(\tau) + Au(\tau) - f(\tau), u(\tau) - v(\tau) \rangle_{V^*,V} \, d\tau \tag{B.9c}$$
satisfies
$$\Theta'(t) \le 0 \ \text{ in } J \tag{B.9d}$$
in the sense of distributions and
$$\Theta(t) \le \frac{1}{2} \|u_0 - v(0)\|_H^2 \ \text{ in } J \tag{B.9e}$$
for all $v$ satisfying (B.7b).

Remark B.1.1 If $u_0 \in H$ is given, then $u_0$ in (B.9a) and (B.9e) must be replaced by $Pu_0 = P_{\overline{K}^{\|\cdot\|_H}} u_0$, its projection (cf. Theorem A.3) onto the closure of $K$ in $H$ (otherwise, there may be multiple solutions of (B.9a)–(B.9e)), i.e. by
$$Pu_0 \in \overline{K}^{\|\cdot\|_H} : \quad (u_0 - Pu_0, v - Pu_0) \le 0 \quad \forall v \in \overline{K}^{\|\cdot\|_H} .$$
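The variational characterization of $Pu_0$ in Remark B.1.1 can be illustrated in finite dimensions. The following is a minimal sketch, assuming $H = \mathbb{R}^n$ with the Euclidean inner product and an obstacle-type set $K = \{v : v \ge \psi\}$, for which the projection is the componentwise maximum. The obstacle `psi` and the helper `project_onto_K` are assumptions made for illustration and do not appear in the text.

```python
import numpy as np


def project_onto_K(u0, psi):
    """Euclidean projection onto K = {v : v >= psi componentwise}."""
    return np.maximum(u0, psi)


rng = np.random.default_rng(1)
psi = rng.standard_normal(6)   # hypothetical obstacle (e.g. a discretized payoff)
u0 = rng.standard_normal(6)    # arbitrary initial datum, generally outside K
Pu0 = project_onto_K(u0, psi)

# Largest value of (u0 - Pu0, v - Pu0) over random samples v in K;
# the characterization in Remark B.1.1 requires it to be <= 0.
worst = max(
    float((u0 - Pu0) @ ((psi + rng.random(6)) - Pu0))
    for _ in range(1000)
)
```

Here `u0 - Pu0` is non-positive and vanishes wherever `u0 >= psi`, while `v - Pu0` is non-negative on the remaining components, which is why every sampled inner product is non-positive.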

Remark B.1.2 If $K$ is a closed, convex cone with vertex 0, (B.9b) splits into
$$\langle u'(t) + Au(t) - f(t), v \rangle_{V^*,V} \ge 0 \quad \forall v \in K$$
and
$$\langle u'(t) + Au(t) - f(t), u \rangle_{V^*,V} = 0 .$$
If $K \subset V$ is a subspace, (B.9b) becomes
$$\langle u'(t) + Au(t) - f(t), v \rangle_{V^*,V} = 0 \quad \forall v \in K .$$

Remark B.1.3 In the weak formulation (B.9a)–(B.9e), we assumed (B.4b) with $\alpha > 0$, $\lambda = 0$. If (B.4b) holds only with $\lambda > 0$, the function $\Theta(t)$ in the weak formulation (B.9a)–(B.9e) has to be replaced by
$$\Theta_\lambda(t) := \frac{1}{2} \|e^{-\lambda t}(u(t) - v(t))\|_H^2 + \int_0^t \big\langle e^{-\lambda \tau}\big(v'(\tau) + Au(\tau) + \lambda u(\tau) - \lambda v(\tau) - f(\tau)\big), \ e^{-\lambda \tau}(u(\tau) - v(\tau)) \big\rangle_{V^*,V} \, d\tau ;$$
the spaces $L^2(J; V)$ and $L^2(J; V^*)$ must be replaced by spaces with weight $\exp(-\lambda \tau)$ or, if $T = \infty$, by $L^2_{\mathrm{loc}}(0, \infty; V)$, etc. Then, (B.9a)–(B.9e) and all that follows will apply also to the case when (B.4b) holds only with $\lambda > 0$.

B.2 Existence

To show the existence of solutions to the PVI (B.9a)–(B.9e), we semidiscretize (B.6) in time as follows: given $k > 0$ with $k = T/M$ if $T < \infty$, we define
$$J_{k,m} := [mk, (m+1)k] , \tag{B.10}$$
$$u_{k,0} = Pu_0 , \qquad f_{k,m} := \frac{1}{k} \int_{J_{k,m}} f(\tau) \, d\tau \tag{B.11}$$
and replace (B.6), (B.9a)–(B.9e) by the sequence of elliptic variational inequalities: for $m = 0, 1, 2, \ldots$, find
$$u_{k,m+1} \in K \tag{B.12a}$$
such that
$$\langle u_{k,m+1} - u_{k,m} + kAu_{k,m+1} - f_{k,m}, \ u_{k,m+1} - v \rangle_{V^*,V} \le 0 \tag{B.12b}$$
for all $v \in K$. By (B.4a), (B.4b) with $\lambda = 0$ and by Theorem A.8, the EVI (B.12b) admits a unique solution $u_{k,m+1}$; hence $\{u_{k,m}\}_{m=0}^{M}$ ($M = \infty$ if $T = \infty$) is well-defined. With $\{u_{k,m}\}_{m=0}^{\infty}$ we associate a function $U_k(t) \in C^0(\overline{J}, V)$ by
$$U_k(t)\big|_{J_{k,m}} \ \text{is linear on } J_{k,m}, \quad m = 0, 1, 2, \ldots \tag{B.13a}$$
$$U_k(mk) = u_{k,m+1} . \tag{B.13b}$$

Remark B.2.1 In (B.13b), $u_{k,m+1}$ is needed, as by (B.11) $u_{k,0} = u_0 \notin V$ for general $u_0 \in \overline{K}^{\|\cdot\|_H}$; the choice $U_k(mk) = u_{k,m}$ in (B.13b) would then imply $U_k(t) \notin V$ for $0 < t < k$. For every $k > 0$, $U_k$ is well-defined.

Theorem B.2.2
(i) The mapping $T_k : \{u_0, f\} \to U_k(t)$ is bounded and Lipschitz-continuous from $H \times L^2(J; V^*) \to L^2(J; V)$, uniformly in $k$. For any $\{u_0, f\} \in H \times L^2(0, T; V^*)$, the family $\{U_k\}_{k>0}$ is Cauchy in $L^2(0, T; V)$ and its limit $u \in L^2(J; V)$ is the unique weak solution of the PVI (B.9a)–(B.9e), which satisfies (B.8) and, moreover,
$$u \in C^0(\overline{J}, H) . \tag{B.14}$$
(ii) If, moreover, $f \in S(0, T)$ (cf. (B.2)), then $T_k$ is bounded and Lipschitz-continuous from (cf. (B.3)) $T_k : H \times S(0, T) \to I(0, T)$ and $\{U_k\}_{k>0}$ is Cauchy in $I(0, T)$.
(iii) Finally, assuming $f = g + h$ with
$$g \in BV(J; H), \qquad h \in H^1(J; V^*) , \tag{B.15a}$$
$$Pu_0 \in K, \qquad APu_0 - h(0) \in H , \tag{B.15b}$$
also $\{U_k'\}_{k>0}$ is uniformly bounded in $I(0, T)$, and $u = \lim_{k \to 0} U_k$ satisfies $u' \in I(0, T)$ and the second line in (B.6).
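The time-stepping scheme (B.11), (B.12a), (B.12b) can be sketched in finite dimensions as follows. This is a minimal illustration under stated assumptions, not the method of the text: $V = H = \mathbb{R}^n$, $A$ is a finite difference matrix for $-\partial_{xx}$ on $(0,1)$ with zero boundary values, $K = \{v : v \ge \psi\}$ for a hypothetical obstacle `psi`, and each step solves $\langle u_{m+1} - u_m + k(Au_{m+1} - f_m), u_{m+1} - v \rangle \le 0$ with the source term multiplied by $k$ (with $f = 0$, as in the example, this scaling plays no role). Each elliptic inequality is solved with the projected fixed point iteration (A.27); the names `solve_evi`, `implicit_euler_pvi` and `f_avg` are illustrative.

```python
import numpy as np


def solve_evi(B, rhs, project, v0, tol=1e-12, max_iter=100_000):
    """Find u in K with (B u - rhs, v - u) >= 0 for all v in K,
    using the projected fixed point iteration v <- P_K(v - rho (B v - rhs))."""
    C1 = np.linalg.norm(B, 2)                        # continuity constant
    C2 = np.linalg.eigvalsh(0.5 * (B + B.T)).min()   # coercivity constant
    rho = C2 / C1**2
    v = project(v0)
    for _ in range(max_iter):
        v_new = project(v - rho * (B @ v - rhs))
        if np.linalg.norm(v_new - v) <= tol:
            return v_new
        v = v_new
    return v


def implicit_euler_pvi(A, f_avg, u0, psi, k, n_steps):
    """Implicit Euler steps for the PVI with K = {v >= psi}.
    f_avg(m) is assumed to return the time average of f over [m k, (m+1) k]."""
    def project(v):
        return np.maximum(v, psi)

    B = np.eye(len(u0)) + k * A
    u = project(np.asarray(u0, dtype=float))   # u_{k,0} = P u0
    steps = [u]
    for m in range(n_steps):
        u = solve_evi(B, u + k * f_avg(m), project, u)
        steps.append(u)
    return np.array(steps)


n = 20
h = 1.0 / (n + 1)
x = np.linspace(h, 1.0 - h, n)
A = (2.0 * np.eye(n) - np.eye(n, k=1) - np.eye(n, k=-1)) / h**2
psi = 0.2 - 4.0 * (x - 0.5) ** 2    # hypothetical obstacle
k = 1e-3
steps = implicit_euler_pvi(A, lambda m: np.zeros(n), psi, psi, k, n_steps=20)
```

The rows of `steps` play the role of $u_{k,0}, u_{k,1}, \ldots$; the piecewise linear interpolant $U_k$ of (B.13a), (B.13b) would connect `steps[1:]`. In practice a projected SOR or a semismooth Newton solver would typically replace the simple fixed point iteration.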

B.3 Proof of the Existence Result

We prove the existence result by establishing a priori estimates for time-semidiscrete approximate solutions in (B.12a), (B.12b). We start by analyzing (B.12a), (B.12b) for $m = 0$, $k > 0$ and write $u_1$ for $u_{k,1}$ and $f_0$ in place of $f_{k,0}$ in (B.11). We assume that $f \in S(0, T)$, i.e.
$$f = g + h, \qquad g \in L^1(J; H), \quad h \in L^2(J; V^*) , \tag{B.16}$$
and set
$$g_0 := \frac{1}{k} \int_0^k g(\tau) \, d\tau, \qquad h_0 := \frac{1}{k} \int_0^k h(\tau) \, d\tau .$$
Then, (B.12a), (B.12b) reads: find $u_1 \in K$ such that for all $v \in K$:
$$\langle u_1 + kAu_1, u_1 - v \rangle_{V^*,V} \le (G_0, u_1 - v) + \sqrt{k} \, \langle H_0, u_1 - v \rangle_{V^*,V} \tag{B.17}$$
where
$$G_0 := u_0 + k g_0 \in H, \qquad H_0 := \sqrt{k} \, h_0 \in V^* . \tag{B.18}$$


Note that, as $k \to 0$,
$$G_0 \to u_0 \in H, \qquad H_0 \to 0 \ \text{in } V^* \ \text{strongly} . \tag{B.19}$$
Since $0 \in K$, we may choose $v = 0$ in (B.17) and get
$$\|u_1\|_H^2 + k \|u_1\|_V^2 \le C \big( \|G_0\|_H^2 + \|H_0\|_*^2 \big) , \tag{B.20}$$
where $\|\cdot\|_*$ denotes the norm in $V^*$. We claim that
$$\|u_1 - Pu_0\|_H^2 + k \|u_1\|_V^2 \to 0 \quad \text{as } k \to 0 . \tag{B.21}$$
To prove (B.21), we note that $u_1 \in K$. Also, by the definition of $P$ in Remark B.1.1, we have $(u_0 - Pu_0, u_1 - Pu_0) \le 0$, hence
$$\|u_1 - Pu_0\|_H^2 = (u_1 - Pu_0, u_1 - Pu_0) \le (u_1 - u_0, u_1 - Pu_0) . \tag{B.22}$$
By (B.4b) with $\lambda = 0$, (B.9c) and (B.9e), (B.17), (B.22), it follows for every $v \in K$
$$\begin{aligned} \|u_1 - Pu_0\|_H^2 + \alpha k \|u_1 - v\|_V^2 &\le (u_1 - u_0, u_1 - Pu_0) + k \langle A(u_1 - v), u_1 - v \rangle_{V^*,V} \\ &\le (u_1 - u_0, v - Pu_0) + \langle u_1 - u_0 + kAu_1, u_1 - v \rangle_{V^*,V} - k \langle Av, u_1 - v \rangle_{V^*,V} \\ &\overset{\text{(B.17)}}{\le} \|u_1 - u_0\|_H \|v - Pu_0\|_H + (G_0 - u_0, u_1 - v) + \sqrt{k} \, \langle H_0 - \sqrt{k} Av, u_1 - v \rangle_{V^*,V} . \end{aligned}$$
Therefore, it holds for all $v \in K$ that
$$\|u_1 - Pu_0\|_H^2 + k \|u_1 - v\|_V^2 \le C \Big\{ \|u_1 - u_0\|_H \|v - Pu_0\|_H + \big| (G_0 - u_0, u_1 - v) + \sqrt{k} \, \langle H_0 - \sqrt{k} Av, u_1 - v \rangle_{V^*,V} \big| \Big\} . \tag{B.23}$$
Since $Pu_0 \in \overline{K}^{\|\cdot\|_H}$, there is a sequence $\{v_k\}_{k>0} \subset K$ such that, as $k \to 0$, $k \|v_k\|_V^2 \to 0$ and $\|v_k - Pu_0\|_H \to 0$. We choose $v = v_k$ in (B.23) and obtain, after passing to the limit and using (B.19), (B.20), the assertion (B.21). We have

Lemma B.3.1 For any fixed $n > 0$, there exists $C_n > 0$ such that, as $k \to 0$,
$$\|u_{k,n} - Pu_0\|_H^2 + k \|u_{k,n}\|_V^2 \le C_n \big( \|u_0\|_H^2 + \|f\|_{S(0,nk)}^2 \big) . \tag{B.24}$$
Moreover, as $k \to 0$,
$$\|u_{k,n} - Pu_0\|_H^2 + k \|u_{k,n}\|_V^2 \to 0 \quad \text{(non-uniformly in } n) . \tag{B.25}$$


Proof We proceed by induction on $n$. Inequality (B.24) for $n = 1$ follows from (B.20), (B.21), if we note that the right hand side of (B.20) is bounded by $C \{ \|u_0\|_H^2 + \|g\|_{L^1(0,k;H)}^2 + \|h\|_{L^2(0,k;V^*)}^2 \}$ for $f \in S(0, T)$. Assume now the assertion (B.24) is true for some $n$. Then, arguing as in the proof of (B.20), (B.21), we obtain (B.24) for $n + 1$. $\square$

Next, we address the case (iii) in Theorem B.2.2.

Lemma B.3.2 If (B.15a), (B.15b) holds, as $k \to 0$ we have
$$\|u_1 - Pu_0\|_H^2 + k \|u_1 - Pu_0\|_V^2 \le C k^2 . \tag{B.26}$$

Proof We use (B.18) and insert $v = Pu_0 \in \overline{K}^{\|\cdot\|_H}$ into (B.23) to obtain
$$\|u_1 - Pu_0\|_H^2 + k \|u_1 - Pu_0\|_V^2 \le C \Big\{ \|u_1 - Pu_0\|_H \int_0^k \|g(\tau)\|_H \, d\tau + k \big| (h(0) - APu_0, u_1 - Pu_0) \big| + \|u_1 - Pu_0\|_V \int_0^k \|h(\tau) - h(0)\|_* \, d\tau \Big\} . \tag{B.27}$$
The assertion (B.26) follows from (B.27) with the estimates
$$\int_0^k \|g(\tau)\|_H \, d\tau \le k \|g\|_{L^\infty(0,k;H)} , \qquad \int_0^k \|h(\tau) - h(0)\|_* \, d\tau \le C k^{3/2} \|h'\|_{L^2(0,k;V^*)} . \qquad \square$$

Remark B.3.3 Inserting $v = u_{k,1}$ into (B.12b), we obtain by similar arguments
$$\|u_{k,2} - u_{k,1}\|_H^2 + k \|u_{k,2} - u_{k,1}\|_V^2 \le (f_{k,1} - kAu_{k,1}, u_{k,2} - u_{k,1}) \le C k^2 \big\{ \sup \|g(\tau)\|_H^2 + \|h(0) - APu_0\|_H^2 + {}$$
[…]

[…] $> 0$, we replace (B.48) by
$$p_m = \frac{\int_{J_{k,m}} e^{-\lambda \tau} p(\tau) \, d\tau}{\int_{J_{k,m}} e^{-\lambda \tau} \, d\tau} , \qquad q_m = \frac{\int_{J_{k,m}} e^{-\lambda \tau} q(\tau) \, d\tau}{\int_{J_{k,m}} e^{-\lambda \tau} \, d\tau} . \tag{B.50}$$
We define for $k$ sufficiently small $\widetilde{w}_m = (1 - \lambda k)^m w_m$, and analogously $\widetilde{p}_m$, $\widetilde{q}_m$. Then the same reasoning as in Remark B.3.7 gives for $k$ sufficiently small
$$\max\{\widetilde{X}_M, \widetilde{Y}_M\} \le C \big( \|w_0\|_H + \|e^{-\lambda \tau} s(\tau)\|_{S(0,Mk)} \big) , \tag{B.51}$$
if we use $1/(1 - \lambda k)^M \le \exp(-\lambda k M/(1 - \lambda k))$.

We now apply (B.49), (B.51) to obtain a priori estimates for the approximate solution $U_k(t)$ obtained from (B.12a), (B.12b), (B.13a), (B.13b). To this end, we identify the sequence $\{w_m\}$ with the step function $w_k(t)$ such that
$$w_k(t)\big|_{J_{k,m}} = w_m , \quad m = 1, \ldots, M . \tag{B.52}$$

Lemma B.3.9 Assume (B.4b) with $\lambda = 0$. Then there exists $C > 0$ depending only on $\alpha$ such that
$$\|w_k(t + k)\|_{I(0,T)} \le C \big( \|w_0\|_H + \|s(t)\|_{S(0,T)} \big) \tag{B.53}$$
and
$$\|w_k(t + k)\|_{I(0,T)} \le C \big( \|w_1\|_H + \sqrt{k} \, \|w_1\|_V + \|s(t)\|_{S(0,T+k)} \big) . \tag{B.54}$$

Proof Since $T = Mk$, (B.49) and (B.52) give (B.53). To show (B.54), we may assume $T > k$. We apply (B.49) to the shifted sequence $\{w_{m+1}\}_{m=1}^{M}$. $\square$

We are now in position to give a priori estimates for the sequence of solutions to (B.12a), (B.12b).


Proposition B.3.10 Define the step-functions $\{u_k(t)\}_{k>0}$ by
$$u_k(t)\big|_{J_{k,m}} := u_{k,m} , \quad m = 0, \ldots, M - 1 , \tag{B.55}$$
and analogously $f_k(t)$. Assume $0 \in K$ and (B.4b) with $\lambda = 0$, $\alpha > 0$. Then,
$$\|u_k(t + k)\|_{I(0,T)} \le C(\alpha) \big( \|w_0\|_H + \|f_k(t)\|_{S(0,T+k)} \big) , \tag{B.56}$$
$$\|u_k(t + k)\|_{I(0,T)} \le C(\alpha) \big( \|u_{k,1}\|_H + \sqrt{k} \, \|u_{k,1}\|_V + \|f_k(t)\|_{S(0,T+k)} \big) . \tag{B.57}$$

Proof We choose $0 = v \in K$ in (B.12b). Since, for $m > 0$, $u_{k,m} \in K$ and we have (B.29), (B.30) with $w_m = u_{k,m}$, $f_{k,m} = p_m + q_m$, (B.56) and (B.57) follow then from (B.53), (B.54). $\square$

We also have a bound for $U_k(t)$ in (B.13a), (B.13b).

Lemma B.3.11 Assume (B.4b) with $\lambda = 0$ and $0 \in K$. The semi-discretization (B.11), (B.12a), (B.12b) of PVI (B.9a)–(B.9e) is stable in the sense that for $T = Mk$
$$\|U_k(t)\|_{I(0,T)} \le C \big( \|u_0\|_H + \|f(t)\|_{S(0,T+2k)} \big) . \tag{B.58}$$

Proof For $T = Mk$, we have
$$\|f_k(t)\|_{S(0,T)} \le \|f(t)\|_{S(0,T+k)} ; \qquad \|U_k(t)\|_{I(0,T)} \le \|u_k(t)\|_{I(0,T+k)} .$$
Inserting this into (B.56), we get (B.58). $\square$

Remark B.3.12 Note that we could also obtain an a priori bound for $U_k(t)$ from (B.57): here (B.25) with $n = 1$ would imply
$$\|U_k(t)\|_{I(0,T)} \le C \big( \|Pu_0\|_H + \|f(t)\|_{S(0,T+2k)} \big) . \tag{B.59}$$

We are now able to show our first main result. To state it, we define, for $f_k(t)$ as in (B.55),
$$\begin{aligned} E(k, T) := {} & \|f_k(t) - f(t)\|_{S(0,T+2k)} + \|f(t + k) - f(t)\|_{S(0,T+2k)} \\ & + \|u_{k,2} - u_{k,1}\|_H + \sqrt{k} \, \|u_{k,2} - u_{k,1}\|_V + \|u_{k,1} - Pu_{k,0}\|_H . \end{aligned} \tag{B.60}$$

Lemma B.3.13 We have for $T > 0$, as $k = T/M \to 0$,
$$\|k U_k'(t)\|_{I(0,T)} \le C \, E(k, T) . \tag{B.61}$$

Proof We note that due to its definition (B.13a), (B.13b), $k U_k'(t)|_{J_{k,m}} = u_{k,m+2} - u_{k,m+1}$. Selecting $v = u_{k,m+2}$ in (B.12b) and also $v = u_{k,m+1}$ in (B.12b) with $m + 1$ in place of $m$ gives the two inequalities:
$$\langle u_{k,m+1} - u_{k,m} + kAu_{k,m+1} - f_{k,m}, \ u_{k,m+1} - u_{k,m+2} \rangle_{V^*,V} \le 0 ,$$
$$-\langle u_{k,m+2} - u_{k,m+1} + kAu_{k,m+2} - f_{k,m+1}, \ u_{k,m+1} - u_{k,m+2} \rangle_{V^*,V} \le 0 .$$


Adding these inequalities together gives an inequality of the type (B.30), where $p_m + q_m = f_{k,m+2} - f_{k,m+1}$, $w_m = u_{k,m+1} - u_{k,m}$, $m \ge 1$. We apply Lemma B.3.9 to $w_k(t)$ defined in (B.52), with $s(t) = f_k(t + k)$ and $w_0 := u_{k,1} - Pu_0$. Observing that $w_k(t) = k U_k'(t)$, we get (B.61) with (B.60) from (B.53), (B.54). $\square$

From (B.61) the size of $E(k, T)$ as $k \to 0$ for fixed $T > 0$ is important. We have

Lemma B.3.14 Assume $0, u_0 \in \overline{K}^{\|\cdot\|_H}$ and (B.11). Then
$$E(k, T) = o(1) \quad \text{as } k \to 0 , \tag{B.62a}$$
and, for compatible data satisfying (B.15a), (B.15b), it holds
$$E(k, T) \le C k \quad \text{as } k \to 0 . \tag{B.62b}$$

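Proof We show (B.62b). The terms in the second row of the definition (B.60) of $E(k, T)$ can be bounded as in (B.62b), by (B.26), (B.28). The terms in the first row of (B.60) are bounded using (B.15a), (B.15b) as follows: we decompose $f = g + h \in S(0, T)$ and define $g_k(t)$, $h_k(t)$ as in (B.55). Assume $k = T/M$ for $M \in \mathbb{N}$. Then
$$\|h(t + k) - h(t)\|_{L^2(J;V^*)} \le k \|h'(t)\|_{L^2(0,T+k;V^*)} , \qquad \|h_k(t) - h(t)\|_{L^2(J;V^*)} \le k \|h'(t)\|_{L^2(0,T+k;V^*)} ,$$
as well as
$$\begin{aligned} \|g(t + k) - g(t)\|_{L^1(J;H)} &= \sum_{m=0}^{M-1} \int_{J_{k,m}} \|g(\tau + k) - g(\tau)\|_H \, d\tau \\ &= \int_0^k \sum_{m=0}^{M-1} \|g(\tau + (m+1)k) - g(\tau + mk)\|_H \, d\tau \\ &\le k \sum_{m=0}^{M-1} \mathrm{Var}(g, J_{k,m}; H) \\ &\le k \, \mathrm{Var}(g, J; H) . \end{aligned}$$
Furthermore,
$$\begin{aligned} \|g_k(t) - g(t)\|_{L^1(J;H)} &= \sum_{m=0}^{M-1} \int_{J_{k,m}} \Big\| \frac{1}{k} \int_{J_{k,m}} g(\tau) \, d\tau - g(t) \Big\|_H \, dt \\ &\le \sum_{m=0}^{M-1} \frac{1}{k} \int_{J_{k,m}} \int_{J_{k,m}} \|g(s) - g(t)\|_H \, dt \, ds \\ &\le k \sum_{m=0}^{M-1} \mathrm{Var}(g, J_{k,m}; H) \\ &\le k \, \mathrm{Var}(g, J; H) . \end{aligned}$$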


This gives (B.62b), if we take the infimum over all decompositions of $f \in S(0, T)$ of the form $f = g + h$ as in (B.15a), provided $u_0$ satisfies also (B.15b). To show (B.62a) for general $f \in S(a, b)$, $u_0 \in H$, we use (B.21), (B.25) to bound the second row of (B.60), and approximate a general $f \in S(0, T)$ from $BV(J; H) + H^1(J; V^*)$ to bound the first row of (B.60) by $o(1)$ as $k \to 0$. $\square$

We are now ready to prove the existence Theorem B.2.2. To this end, we show that the family $\{U_k(t)\}_{k>0}$ is Cauchy in $I(0, T)$. More precisely, there is $C > 0$ such that
$$\|U_k(t) - U_h(t)\|_{I(0,T)} \le C \Big( \sqrt{E(k, T)} + \sqrt{E(h, T)} \Big) . \tag{B.63}$$
This and (B.62a), (B.62b) imply that $\{U_k\}_{k>0}$ is Cauchy in $I(0, T)$ and that there is $U = \lim_{k \to 0} U_k(t) \in I(0, T)$ with
$$\|U(t) - U_k(t)\|_{I(0,T)} \le C \sqrt{E(k, T)} . \tag{B.64}$$
We note that (B.64) with (B.62a), (B.62b) gives an error estimate for (B.12a), (B.12b). To prove (B.63), we recall the definition (B.55) of $u_k(t)$ and we also define
$$\widetilde{u}_k(t) = \ell_k(t) \, u_k(t) + (1 - \ell_k(t)) \, u_k(t + k) , \tag{B.65}$$
where
$$\ell_k(t) := m + 1 - t/k \in [0, 1] , \quad t \in J_{k,m} = [mk, (m+1)k] . \tag{B.66}$$
Then it follows from (B.12a), (B.12b) that for all $t \in J$ holds
$$\langle \widetilde{u}_k' + A u_k(t + k) - f_k(t), \ u_k(t + k) - v \rangle_{V^*,V} \le 0 \quad \forall v \in K . \tag{B.67}$$
To prove (B.63), we proceed as follows: we rewrite (B.67) in terms of $U_k(t)$ in (B.13a), (B.13b) with small right hand side. Then (B.63) will be obtained by application of Lemma B.3.6 to $U_k(t) - U_h(t)$. We have from $0 \le \ell_k \le 1$ that in $H$
$$\|\widetilde{u}_k(t) - u_k(t + k)\|_H \le \|u_k(t) - u_k(t + k)\|_H = k \|\widetilde{u}_k'(t)\|_H , \tag{B.68}$$
and also in $V$, and for every $(a, b) \subseteq J$, that
$$\|\widetilde{u}_k(t)\|_{I(a,b)} \le \|u_k(t)\|_{I(a,b)} + \|u_k(t + k)\|_{I(a,b)} \le 2 \|u_k(t)\|_{I(a,b+k)} . \tag{B.69}$$

Lemma B.3.15 For any $v \in K$, the following holds:
$$\langle U_k'(t) + AU_k(t) - f_k(t + k), U_k(t) - v \rangle_{V^*,V} \le \beta \|k U_k'(t)\|_V \|U_k(t) - v\|_V - \ell_k(t) \langle U_k'(t) + A u_k(t + k) - f_k(t + k), \ k U_k'(t) \rangle_{V^*,V} . \tag{B.70}$$

Proof We have by (B.4a)
$$\begin{aligned} \langle AU_k(t), U_k(t) - v \rangle_{V^*,V} &\le \langle A u_k(t + k), U_k(t) - v \rangle_{V^*,V} + \langle A \widetilde{u}_k(t + k) - A u_k(t + k), U_k(t) - v \rangle_{V^*,V} \\ &\le \langle A u_k(t + k), U_k(t) - v \rangle_{V^*,V} + \beta \|\widetilde{u}_k(t + k) - u_k(t + k)\|_V \|U_k(t) - v\|_V =: I + II . \end{aligned}$$
We estimate $II \le \beta \|k U_k'(t)\|_V \|U_k(t) - v\|_V$, and combine $I$ with the left hand side of (B.70). It then remains to estimate
$$\langle U_k'(t) + A u_k(t + k) - f_k(t + k), U_k(t) - v \rangle_{V^*,V} .$$
To this end, we write
$$U_k(t) - v = U_k(t) - u_k(t + k) + u_k(t + k) - v = -\ell_k(t) \, k U_k'(t) + u_k(t + k) - v ,$$
and obtain
$$\begin{aligned} \langle U_k'(t) + A u_k(t + k) - f_k(t + k), U_k(t) - v \rangle_{V^*,V} = {} & -\langle U_k'(t) + A u_k(t + k) - f_k(t + k), \ \ell_k(t) \, k U_k'(t) \rangle_{V^*,V} \\ & + \langle U_k'(t) + A u_k(t + k) - f_k(t + k), \ u_k(t + k) - v \rangle_{V^*,V} =: III + IV . \end{aligned}$$
By (B.65), (B.55) and (B.13a), (B.13b), $U_k(t) = \widetilde{u}_k(t + k)$ and, by (B.67) evaluated at $t + k$, $IV \le 0$ and $III$ implies (B.70). $\square$

Inspecting the proof, we also have

Corollary B.3.16 For any $v \in K$, the following holds:
$$\langle U_k'(t) + AU_k(t) - f_k(t + k), U_k(t) - v \rangle_{V^*,V} \le \langle s(t), U_k(t) - v \rangle_{V^*,V} - \ell_k(t) \langle U_k'(t) + A u_k(t + k) - f_k(t + k), \ k U_k'(t) \rangle_{V^*,V} \tag{B.71}$$
where $s(t)$ satisfies
$$\|s(t)\|_{V^*} \le \beta \|k U_k'(t)\|_V \quad \text{a.e. } t \in J . \tag{B.72}$$

Next, we replace $f_k(t + k)$ in the bounds (B.70), (B.71).

Corollary B.3.17 For any $v(t) \in K$, a.e. $t \in J$, one has
$$\begin{aligned} \langle U_k'(t) + AU_k - f(t), U_k(t) - v \rangle_{V^*,V} \le {} & \langle s(t), U_k(t) - v \rangle_{V^*,V} \\ & + \ell_k(t) \langle U_k'(t) + A u_k(t + k) - f_k(t + k), \ -k U_k'(t) \rangle_{V^*,V} \\ & + \langle f_k(t + k) - f(t), U_k(t) - v \rangle_{V^*,V} , \end{aligned} \tag{B.73}$$
where $s(t)$ satisfies (B.72).

Proof (B.73) follows from (B.71) by adding and subtracting $f(t)$ on the left hand side of (B.71). $\square$


Lemma B.3.18 For $u_0 \in \overline{K}^{\|\cdot\|_H}$ and $f \in S(0, T)$, and any $h, k > 0$, one has
$$\|U_h(t) - U_k(t)\|_{I(0,T)} \le C \Big( \sqrt{E(h, T)} + \sqrt{E(k, T)} \Big) , \tag{B.74}$$
with $E(h, T)$ as in Lemma B.3.14. In particular, $\{U_h(t)\}_{h>0}$ is Cauchy in $I(0, T)$ and there exists
$$u = \lim_{h \to 0} U_h(t) \in I(0, T) .$$

Proof We choose in (B.73) $v = U_h(t)$ for some $h > 0$, and then exchange in the resulting inequality the roles of $k$ and $h$. Adding the resulting two inequalities for the difference $w(t) := U_k(t) - U_h(t)$, we get an inequality of the type considered in Lemma B.3.6, with $s(t)$ replaced by $s(t) + f_k(t + k) - f(t)$. To determine $r(t)$ in (B.40), we estimate the last term in the bound (B.73) as follows: using $0 \le \ell_k(t) \le 1$ and $\ell_k(t) \langle U_k'(t), -k U_k'(t) \rangle_{V^*,V} \le 0$, we have
$$0 \le r(t) := \big| \langle A u_k(t + k) - f_k(t + k), \ -k U_k'(t) \rangle_{V^*,V} \big| \le \big( \beta \|u_k(t + k)\|_V + \|f_k(t + k)\|_H \big) \|k U_k'(t)\|_V . \tag{B.75}$$
Hence, in (B.42),
$$R(T) = \int_0^T r(t) \, dt \le \mathrm{Const} \big( \|u_k(t + k)\|_{I(0,T)} + \|f_k(t)\|_{S(0,T+k)} \big) \cdot \|k U_k'(t)\|_{I(0,T)} .$$
From (B.56), we get
$$R(T) \le \mathrm{Const} \big( \|w_0\|_H + \|f_k(t)\|_{S(0,T+k)} \big) \|k U_k'(t)\|_{I(0,T)} ,$$
and (B.61) gives
$$R(T) \le \mathrm{Const} \big( \|w_0\|_H + \|f_k(t)\|_{S(0,T+k)} \big) E(k, T) . \tag{B.76}$$
To estimate the value of $\|w(0)\|_H$ in (B.76) and in (B.43), we use that, by Lemma B.3.2,
$$\|w(0)\|_H = \|(u_{k,1} - Pu_0) - (u_{h,1} - Pu_0)\|_H \le E(k, T) + E(h, T) ,$$
which, inserted into Lemma B.3.6, implies the assertion. $\square$

We can now give the proof of Theorem B.2.2(i). Lemma B.3.18 established that $\{U_h\}_{h>0}$ is Cauchy in $I(0, T)$, hence in particular in $L^2(J; V)$ and in $L^\infty(J; H)$. Therefore, $u(t) \in C^0(\overline{J}; H)$ and $u(0) = \lim_{k \to 0} U_k(0) = \lim_{k \to 0} u_{k,1} = Pu_0 \in \overline{K}^{\|\cdot\|_H}$, which is the third line in (B.6). To show that $u(t)$ is a solution of the PVI, pick in (B.73) $v(t)$ satisfying (B.7b), and pass in (B.73) to the limit $k \to 0$, implying the second line of (B.6); since $K$ is closed in $H$ and $U_h \to u$ in $L^\infty(J; H)$, we also have the first line of (B.6). The uniqueness and Theorem B.2.2(ii) will follow from


Lemma B.3.19 The map $T : \{u_0, f\} \to u(t)$ which is a solution of PVI (B.6) is Lipschitz from $H \times S(0, T) \to I(0, T)$.

Proof Observe that $\{u_0, f\} \to U_k$ is Lipschitz continuous uniformly in $k$: let $\{u_0^*, f^*\}$ be a second set of initial data. Then pick $v = u_{k,m+1}$ in (B.12b) and also $v = u_{k,m+1}^*$ in (B.12b). To the difference $w = u_{k,m} - u_{k,m}^*$ we may apply Proposition B.3.10 which gives for the extensions of $u_{k,m}$, $u_{k,m}^*$ as in (B.55)
$$\|u_k(t + k) - u_k^*(t + k)\|_{I(0,T+k)} \le C \big( \|u_{k,1} - u_{k,1}^*\|_H + \|f_k(t) - f_k^*(t)\|_{S(0,\infty)} \big) .$$
By (B.68), an analogous estimate uniform in $k$ holds also true for the linear extensions $U_k(t)$, $U_k^*(t)$. $\square$

References 1. Y. Achdou and O. Pironneau. Computational methods for option pricing, volume 30 of Frontiers in Applied Mathematics. Society for Industrial and Applied Mathematics (SIAM), Philadelphia, 2005. 2. Y. Achdou and N. Tchou. Variational analysis for the Black and Scholes equation with stochastic volatility. M2AN Math. Model. Numer. Anal., 36(3):373–395, 2002. 3. D. Applebaum. Lévy processes and stochastic calculus, volume 93 of Cambridge Studies in Advanced Mathematics. Cambridge University Press, Cambridge, 2004. 4. L. Bachelier. Théorie de la spéculation. Ann. Sci. Éc. Norm. Super., 17:21–86, 1900. 5. C. Baiocchi. Discretization of evolution variational inequalities. In Partial differential equations and the calculus of variations, Vol. I, volume 1 of Progr. Nonlinear Differential Equations Appl., pages 59–92. Birkhäuser, Boston, 1989. 6. C.A. Ball and A. Roma. Stochastic volatility option pricing. J. Financ. Quant. Anal., 29(4):589–607, 1994. 7. O. Barndorff-Nielsen and S.Z. Levendorskiˇi. Feller processes of normal inverse Gaussian type. Quant. Finance, 1:318–331, 2001. 8. O.E. Barndorff-Nielsen. Normal inverse Gaussian processes and the modelling of stock returns. Research report 300, Department of Theoretical Statistics, Aarhus University, 1995. 9. O.E. Barndorff-Nielsen, E. Nicolato, and N. Shephard. Some recent developments in stochastic volatility modelling. Quant. Finance, 2(1):11–23, 2002. Special issue on volatility modelling. 10. O.E. Barndorff-Nielsen and N. Shephard. Non-Gaussian Ornstein–Uhlenbeck-based models and some of their uses in financial economics. J. R. Stat. Soc., Ser. B, Stat. Methodol., 63(2):167–241, 2001. 11. G. Barone-Adesi. The saga of the American put. J. Bank. Finance, 29(11):2909–2918, 2005. 12. G. Barone-Adesi and R.E. Whaley. Efficient analytic approximation of American option values. J. Finance, 42(2):301–320, 1987. 13. D.S. Bates. Jumps stochastic volatility: the exchange rate process implicit in Deutsche Mark options. 
Rev. Finance, 9(1):69–107, 1996. 14. D.S. Bates. Post-’87 crash fears in the S&P 500 futures option market. J. Econ., 94(1–2):181– 238, 2000. 15. A. Bensoussan and J.-L. Lions. Applications of variational inequalities in stochastic control, volume 12 of Studies in Mathematics and Its Applications. North-Holland, Amsterdam, 1982. 16. F.E. Benth and M. Groth. The minimal entropy martingale measure and numerical option pricing for the Barndorff-Nielsen–Shephard stochastic volatility model. Stoch. Anal. Appl., 27(5):875–896, 2009.


17. F.E. Benth and T. Meyer-Brandis. The density process of the minimal entropy martingale measure in a stochastic volatility model with jumps. Finance Stoch., 9(4):563–575, 2005. 18. J. Bertoin. Lévy processes. Cambridge University Press, New York, 1996. 19. S. Beuchler, R. Schneider, and Ch. Schwab. Multiresolution weighted norm equivalences and applications. Numer. Math., 98(1):67–97, 2004. 20. I.H. Biswas. Viscosity solutions of integro-PDE: theory and numerical analysis with applications to controlled jump-diffusions. PhD thesis, University of Oslo, 2008. 21. F. Black and M. Scholes. The pricing of options and corporate liabilities. J. Polit. Econ., 81(3):637–659, 1973. 22. J.M. Bony, Ph. Courrège, and P. Priouret. Semi-groupes de Feller sur une variété à bord compacte et problèmes aux limites intégro-différentiels du second ordre donnant lieu au principe du maximum. Ann. Inst. Fourier (Grenoble), 18(2):369–521, 1968. 23. S.I. Boyarchenko and S.Z. Levendorski˘ı. Non-Gaussian Merton–Black–Scholes theory, volume 9 of Advanced Series on Statistical Science & Applied Probability. World Scientific, River Edge, 2002. 24. D. Braess. Finite elements, 3rd edition. Cambridge University Press, Cambridge, 2007. 25. M.J. Brennan and E.S. Schwartz. The valuation of American put options. J. Finance, 32(2):449–462, 1977. 26. M.J. Brennan and E.S. Schwartz. Finite difference methods and jump processes arising in the pricing of contingents claims: a synthesis. J. Financ. Quant. Anal., 13:462–474, 1978. 27. S.C. Brenner and L.R. Scott. The mathematical theory of finite element methods, volume 15 of Texts in Applied Mathematics, 2nd edition. Springer, New York, 2002. 28. M. Briani, R. Natalini, and G. Russo. Implicit-explicit numerical schemes for jump–diffusion processes. Calcolo, 44(1):33–57, 2007. 29. D. Brigo and F. Mercurio. Interest rate models—theory and practice. Springer Finance, 2nd edition. Springer, New York, 2006. 30. P. Brockwell, E. Chadraa, and A. Lindner. 
Continuous-time GARCH processes. Ann. Appl. Probab., 16(2):790–826, 2006. 31. R. Carmona and M. Tehranchi. Interest rate models: an infinite dimensional stochastic analysis perspective. Springer Finance, 1st edition. Springer, New York, 2006. 32. R. Carmona and N. Touzi. Optimal multiple stopping and valuation of swing options. Math. Finance, 18(2):239–268, 2008. 33. P. Carr. Randomization and the American put. Rev. Finance, 11(3):597–626, 1998. 34. P. Carr. Local variance gamma. Technical report, Private communication, 2009. 35. P. Carr, H. Geman, D. Madan, and M. Yor. From local volatility to local Lévy models. Quant. Finance, 4(5):581–588, 2004. 36. P. Carr, H. Geman, D.B. Madan, and M. Yor. The fine structure of assets returns: an empirical investigation. J. Bus., 75(2):305–332, 2002. 37. P. Carr, H. Geman, D.B. Madan, and M. Yor. Stochastic volatility for Lévy processes. Math. Finance, 13(3):345–382, 2003. 38. A. Cohen. Numerical analysis of wavelet methods, volume 32 of Studies in Mathematics and Its Applications. North-Holland, Amsterdam, 2003. 39. S. Cohen and J. Rosi´nski. Gaussian approximation of multivariate Lévy processes with applications to simulation of tempered stable processes. Bernoulli, 13(1):195–210, 2007. 40. R. Cont and P. Tankov. Financial modelling with jump processes. Financial Mathematics Series. Chapman & Hall/CRC, Boca Raton, 2004. 41. R. Cont and E. Voltchkova. A finite difference scheme for option pricing in jump diffusion and exponential Lévy models. SIAM J. Numer. Anal., 43(4):1596–1626, 2005. 42. R. Cont and E. Voltchkova. Integro-differential equations for option prices in exponential Lévy models. Finance Stoch., 9(3):299–325, 2005. 43. R. Courant, K. Friedrichs, and H. Lewy. Über die partiellen Differenzengleichungen der mathematischen Physik. Math. Ann., 100(1):32–74, 1928. 44. Ph. Courrège. Sur la forme intégro-differentielle des opérateurs de Ck∞ dans C satisfaisant du principe du maximum. Sém. 
Théorie du Potentiel, 38:38 pp., 1965/1966.


45. J.C. Cox. Notes on option pricing I: constant elasticity of diffusions. Technical report, Stanford University, Stanford, 1975. 46. J.C. Cox and S. Ross. The valuation of options for alternative stochastic processes. J. Financ. Econ., 3:145–166, 1976. 47. C.W. Cryer. The solution of a quadratic programming problem using systematic overrelaxation. SIAM J. Control, 9:385–392, 1971. 48. C. Cuchiero, D. Filipovic, E. Mayerhofer, and J. Teichmann. Affine processes on positive semidefinite matrices. Ann. Appl. Probab., 21(2):397–463, 2011. 49. M. Dahlgren. A continuous time model to price commodity-based swing options. Rev. Deriv. Res., 8(1):27–47, 2005. 50. W. Dahmen. Wavelet and multiscale methods for operator equations. Acta Numer., 6:55–228, 1997. 51. W. Dahmen, H. Harbrecht, and R. Schneider. Compression techniques for boundary integral equations—asymptotically optimal complexity estimates. SIAM J. Numer. Anal., 43(6):2251–2271, 2006. 52. I. Daubechies. Ten lectures on wavelets, volume 61 of CBMS-NSF Lecture Notes. SIAM, Providence, 1992. 53. F. Delbaen and W. Schachermayer. A general version of the fundamental theorem of asset pricing. Math. Ann., 300(3):463–520, 1994. 54. F. Delbaen and W. Schachermayer. The fundamental theorem of asset pricing for unbounded stochastic processes. Math. Ann., 312(2):215–250, 1998. 55. F. Delbaen and W. Schachermayer. The mathematics of arbitrage. Springer Finance, 2nd edition. Springer, Heidelberg, 2008. 56. F. Delbaen and H. Shirakawa. A note on option pricing for the constant elasticity of variance model. Asia-Pac. Financ. Mark., 9(1):85–99, 2002. 57. E. Derman and I. Kani. Riding on a smile. Risk, 7(2):32–39, 1994. 58. G. Dimitroff, S. Lorenz, and A. Szimayer. A parsimonious multi-asset Heston model: calibration and derivative pricing. Int. J. Theor. Appl. Finance, 14(8):1299–1333, 2011. 59. B. Dupire. Pricing with a smile. Risk, 7(1):18–20, 1994. 60. E. Eberlein and F. Özkan. The Lévy LIBOR model. 
Finance Stoch., 9(3):327–348, 2005. 61. E. Eberlein, A. Papapantoleon, and A.N. Shiryaev. On the duality principle in option pricing: semimartingale setting. Finance Stoch., 12(2):265–292, 2008. 62. E. Eberlein and K. Prause. The generalized hyperbolic model: financial derivatives and risk measures. In H. Geman, D. Madan, S.R. Pliska, and T. Vorst, editors, Mathematical finance— Bachelier Congress, 2000. Springer Finance, pages 245–267. Springer, Berlin, 2002. 63. E. Ekström and J. Tysk. Boundary conditions for the single-factor term structure equation. Ann. Appl. Probab., 21(1):332–350, 2011. 64. A. Ern and J.L. Guermond. Theory and practice of finite elements, volume 159 of Applied Mathematical Sciences. Springer, New York, 2004. 65. L.C. Evans. Partial differential equations, volume 19 of Graduate Studies in Mathematics. American Mathematical Society, Providence, 1998. 66. W. Farkas, N. Reich, and Ch. Schwab. Anisotropic stable Lévy copula processes—analytical and numerical aspects. M3AS Math. Model. Meth. Appl. Sci., 17(9):1405–1443, 2007. 67. J.-P. Fouque, G. Papanicolaou, R. Sircar, and K. Solna. Multiscale stochastic volatility asymptotics. Multiscale Model. Simul., 2(1):22–42, 2003. 68. J.P. Fouque, G. Papanicolaou, and K.R. Sircar. Derivatives in financial markets with stochastic volatility. Cambridge University Press, Cambridge, 2000. 69. R. Geske. The valuation of compound options. J. Financ. Econ., 7(1):63–81, 1979. 70. R. Geske and H.E. Johnson. The American put option valued analytically. J. Finance, 39(5):1511–1524, 1984. 71. I.I. Gihman and A.V. Skorohod. The theory of stochastic processes I. Grundlehren Math. Wiss. Springer, New York, 1974. 72. I.I. Gihman and A.V. Skorohod. The theory of stochastic processes II. Grundlehren Math. Wiss. Springer, New York, 1975.

73. I.I. Gihman and A.V. Skorohod. The theory of stochastic processes III. Grundlehren Math. Wiss. Springer, New York, 1979.
74. I.S. Gradshteyn and I.M. Ryzhik. Table of integrals, series, and products. Academic Press, New York, 1980.
75. M. Griebel and S. Knapek. Optimized general sparse grid approximation spaces for operator equations. Math. Comput., 78(268):2223–2257, 2009.
76. B. Gustafsson, H.-O. Kreiss, and J. Oliger. Time dependent problems and difference methods. Pure and Applied Mathematics (New York). Wiley, New York, 1995.
77. C. Hager and B. Wohlmuth. Semismooth Newton methods for variational problems with inequality constraints. GAMM-Mitt., 33:8–24, 2010.
78. P. Hepperger. Option pricing in Hilbert space valued jump–diffusion models using partial integro-differential equations. SIAM J. Financ. Math., 1:454–489, 2010.
79. S.L. Heston. A closed-form solution for options with stochastic volatility, with applications to bond and currency options. Rev. Finance, 6:327–343, 1993.
80. N. Hilber. Stabilized wavelet methods for option pricing in high dimensional stochastic volatility models. PhD thesis, ETH Zürich, Dissertation No. 18176, 2009. http://e-collection.ethbib.ethz.ch/view/eth:41687.
81. N. Hilber, A.M. Matache, and Ch. Schwab. Sparse wavelet methods for option pricing under stochastic volatility. J. Comput. Finance, 8(4):1–42, 2005.
82. N. Hilber, N. Reich, and Ch. Winter. Wavelet methods. Encyclopedia of Quantitative Finance. Wiley, Chichester, 2009.
83. N. Hilber, Ch. Schwab, and Ch. Winter. Variational sensitivity analysis of parametric Markovian market models. In Ł. Stettner, editor, Advances in mathematics of finance, volume 83 of Banach Center Publ., pages 85–106. 2008.
84. M. Hintermüller, K. Ito, and K. Kunisch. The primal–dual active set strategy as a semismooth Newton method. SIAM J. Optim., 13:865–888, 2003.
85. W. Hoh. Pseudodifferential operators generating Markov processes. Habilitationsschrift, University of Bielefeld, 1998.
86. W. Hoh. Pseudodifferential operators with negative definite symbols of variable order. Rev. Mat. Iberoam., 16:219–241, 2000.
87. M. Holtz and A. Kunoth. B-spline based monotone multigrid methods with an application to the pricing of American options. In P. Wesseling, C.W. Oosterlee, and P. Hemker, editors, Multigrid, multilevel and multiscale methods, Proc. EMG, 2005.
88. Y.L. Hsu, T.I. Lin, and C.F. Lee. Constant elasticity of variance (CEV) option pricing model: integration and detailed derivation. Math. Comput. Simul., 79(1):60–71, 2008.
89. N. Ikeda and Sh. Watanabe. Stochastic differential equations and diffusion processes. North-Holland, Amsterdam, 1981.
90. S. Ikonen and J. Toivanen. Efficient numerical methods for pricing American options under stochastic volatility. Numer. Methods Partial Differ. Equ., 24(1):104–126, 2008.
91. J. Ingersoll. Theory of financial decision making. Rowman & Littlefield Publishers, Inc., Oxford, 1987.
92. K. Ito and K. Kunisch. Semi-smooth Newton methods for variational inequalities of the first kind. M2AN Math. Model. Numer. Anal., 37:41–62, 2003.
93. K. Ito and K. Kunisch. Parabolic variational inequalities: the Lagrange multiplier approach. J. Math. Pures Appl., 85:415–449, 2005.
94. N. Jacob. Pseudo differential operators and Markov processes, Vol. I: Fourier analysis and semigroups. Imperial College Press, London, 2001.
95. N. Jacob. Pseudo differential operators and Markov processes, Vol. II: Generators and their potential theory. Imperial College Press, London, 2002.
96. N. Jacob and R. Schilling. Subordination in the sense of S. Bochner—an approach through pseudo differential operators. Math. Nachr., 178:199–231, 1996.
97. J. Jacod and A. Shiryaev. Limit theorems for stochastic processes, 2nd edition. Springer, Heidelberg, 2003.

98. P. Jaillet, D. Lamberton, and B. Lapeyre. Variational inequalities and the pricing of American options. Acta Appl. Math., 21(3):263–289, 1990.
99. C. Johnson. Numerical solution of partial differential equations by the finite element method. Cambridge University Press, Cambridge, 1987.
100. J. Kallsen and P. Tankov. Characterization of dependence of multidimensional Lévy processes using Lévy copulas. J. Multivar. Anal., 97(7):1551–1572, 2006.
101. M. Keller-Ressel, A. Papapantoleon, and J. Teichmann. The affine LIBOR models. Technical report, arXiv.org, 2011.
102. M. Keller-Ressel and T. Steiner. Yield curve shapes and the asymptotic short rate distribution in affine one-factor models. Finance Stoch., 12(2):149–172, 2008.
103. A.G.Z. Kemna and A.C.F. Vorst. A pricing method for options based on average asset values. J. Bank. Finance, 14(1):113–129, 2002.
104. K. Kikuchi and A. Negoro. On Markov process generated by pseudodifferential operator of variable order. Osaka J. Math., 34:319–335, 1997.
105. C. Klüppelberg, A. Lindner, and R. Maller. A continuous-time GARCH process driven by a Lévy process: stationarity and second-order behaviour. J. Appl. Probab., 41(3):601–622, 2004.
106. V. Knopova and R.L. Schilling. Transition density estimates for a class of Lévy and Lévy-type processes. J. Theor. Probab., 25(1):144–170, 2012.
107. G. Kou. A jump diffusion model for option pricing. Manag. Sci., 48(8):1086–1101, 2002.
108. O. Kudryavtsev and S.Z. Levendorskiĭ. Fast and accurate pricing of barrier options under Lévy processes. Finance Stoch., 13(4):531–562, 2009.
109. D. Lamberton and B. Lapeyre. Introduction to stochastic calculus applied to finance. Chapman & Hall/CRC Financial Mathematics Series, 2nd edition. Chapman & Hall/CRC, Boca Raton, 2008.
110. D. Lamberton and M. Mikou. The critical price for the American put in an exponential Lévy model. Finance Stoch., 12(4):561–581, 2008.
111. D. Lamberton and M. Mikou. The smooth-fit property in an exponential Lévy model. J. Appl. Probab., 49(1):137–149, 2012. doi:10.1239/jap/1331216838.
112. S. Larsson and V. Thomée. Partial differential equations with numerical methods, volume 45 of Texts in Applied Mathematics. Springer, Berlin, 2003.
113. C.C.W. Leentvaar and C.W. Oosterlee. Pricing multi-asset options with sparse grids and fourth order finite differences. In A.B. de Castro, D. Gómez, P. Quintela, and P. Salgado, editors, Numerical mathematics and advanced applications, pages 975–983, Springer, Berlin, 2006. doi:10.1007/978-3-540-34288-5_97.
114. C.C.W. Leentvaar and C.W. Oosterlee. On coordinate transformation and grid stretching for sparse grid pricing of basket options. J. Comput. Appl. Math., 222(1):193–209, 2008.
115. J.-L. Lions and E. Magenes. Problèmes aux limites non homogènes et applications, volume 1 of Travaux et Recherches Mathématiques. Dunod, Paris, 1968.
116. J.J. Lucia and E.J. Schwartz. Electricity prices and power derivatives: evidence from the Nordic Power Exchange. Review of Derivatives, 5(1):5–50, 2002.
117. E. Luciano and W. Schoutens. A multivariate jump-driven financial asset model. Quant. Finance, 6(5):385–402, 2006.
118. D.B. Madan, P. Carr, and E. Chang. The variance gamma process and option pricing. Eur. Finance Rev., 2(1):79–105, 1998.
119. D.B. Madan and E. Seneta. The variance gamma model for share market returns. J. Bus., 63:511–524, 1990.
120. X. Mao. Stochastic differential equations and applications, 2nd edition. Horwood Publishing, Chichester, 2007.
121. A.-M. Matache, P.-A. Nitsche, and Ch. Schwab. Wavelet Galerkin pricing of American options on Lévy driven assets. Quant. Finance, 5(4):403–424, 2005.
122. A.-M. Matache, T. Petersdorff, and Ch. Schwab. Fast deterministic pricing of options on Lévy driven assets. M2AN Math. Model. Numer. Anal., 38(1):37–71, 2004.

123. A.-M. Matache, Ch. Schwab, and T.P. Wihler. Linear complexity solution of parabolic integro-differential equations. Numer. Math., 104(1):69–102, 2006.
124. R.C. Merton. Theory of rational option pricing. Bell J. Econ. Manag. Sci., 4:141–183, 1973.
125. R.C. Merton. Option pricing when underlying stock returns are discontinuous. J. Financ. Econ., 3(1–2):125–144, 1976.
126. K.-S. Moon, R.H. Nochetto, T. von Petersdorff, and C.-S. Zhang. A posteriori error analysis for parabolic variational inequalities. M2AN Math. Model. Numer. Anal., 41(3):485–511, 2007.
127. J. Muhle-Karbe, O. Paffel, and R. Stelzer. Option pricing in multivariate stochastic volatility models of OU type. Technical report, arXiv.org, 2010. http://arxiv.org/abs/1001.3223v1.
128. E. Nicolato and E. Venardos. Option pricing in stochastic volatility models of the Ornstein–Uhlenbeck type. Math. Finance, 13(4):445–466, 2003.
129. S.M. Nikol’skiĭ. Approximation of functions of several variables and imbedding theorems, volume 205 of Grundlehren Math. Wiss. Springer, New York, 1975.
130. R.H. Nochetto, T. von Petersdorff, and C.-S. Zhang. A posteriori error analysis for a class of integral equations and variational inequalities. Numer. Math., 116(3):519–552, 2010.
131. B. Øksendal. Stochastic differential equations: an introduction with applications. Universitext, 6th edition. Springer, Berlin, 2003.
132. P.E. Protter. Stochastic integration and differential equations, volume 21 of Stochastic Modelling and Applied Probability, 2nd edition. Springer, Berlin, 2005.
133. N. Reich. Wavelet compression of anisotropic integrodifferential operators on sparse tensor product spaces. PhD thesis, ETH Zürich, Dissertation No. 17661, 2008. http://e-collection.ethbib.ethz.ch/view/eth:30174.
134. N. Reich, Ch. Schwab, and Ch. Winter. On Kolmogorov equations for anisotropic multivariate Lévy processes. Finance Stoch., 14(4):527–567, 2010.
135. O. Reichmann. Numerical option pricing beyond Lévy. PhD thesis, ETH Zürich, Dissertation No. 20202, 2012. http://e-collection.library.ethz.ch/view/eth:5357.
136. O. Reichmann. Optimal space-time adaptive wavelet methods for degenerate parabolic PDEs. Numer. Math., 121(2):337–365, 2012. doi:10.1007/s00211-011-0432-x.
137. O. Reichmann, R. Schneider, and Ch. Schwab. Wavelet solution of variable order pseudodifferential equations. Calcolo, 47(2):65–101, 2010.
138. O. Reichmann and Ch. Schwab. Numerical analysis of additive Lévy and Feller processes with applications to option pricing. In Lévy matters I, volume 2001 of Lecture Notes in Mathematics, pages 137–196, 2010.
139. C. Reisinger and G. Wittum. Efficient hierarchical approximation of high-dimensional option pricing problems. SIAM J. Sci. Comput., 29(1):440–458, 2007.
140. O. Reiss and U. Wystup. Efficient computation of option price sensitivities using homogeneity and other tricks. J. Deriv., 9:41–53, 2001.
141. D. Revuz and M. Yor. Continuous martingales and Brownian motion, 3rd edition. Springer, New York, 1990.
142. L. Rogers and Z. Shi. The value of an Asian option. J. Appl. Probab., 32(4):1077–1088, 1995.
143. K. Sato. Lévy processes and infinitely divisible distributions, volume 68 of Cambridge Studies in Advanced Mathematics. Cambridge University Press, Cambridge, 1999.
144. R. Schilling. On the existence of Feller processes with a given generator. Technical report, TU Dresden, 2004. http://www.math.tu-dresden.de/sto/schilling/papers/existenz.
145. R. Schöbel and J. Zhu. Stochastic volatility with an Ornstein–Uhlenbeck process: an extension. Eur. Finance Rev., 3:23–46, 1999.
146. D. Schötzau. hp-DGFEM for parabolic evolution problems. PhD thesis, ETH Zürich, 1999.
147. D. Schötzau and Ch. Schwab. hp-discontinuous Galerkin time-stepping for parabolic problems. C. R. Acad. Sci. Paris Sér. I Math., 333(12):1121–1126, 2001.
148. W. Schoutens. Lévy processes in finance. Wiley, Chichester, 2003.
149. Ch. Schwab. Variable order composite quadrature of singular and nearly singular integrals. Computing, 53(2):173–194, 1994.

150. L.O. Scott. Option pricing when the variance changes randomly: theory, estimation, and an application. J. Financ. Quant. Anal., 22:419–438, 1987.
151. N. Shephard, editor. Stochastic volatility: selected readings. Oxford University Press, London, 2005.
152. A.N. Shiryaev. Essentials of stochastic finance: facts, models, theory. World Scientific, Singapore, 2003.
153. E.M. Stein and J.C. Stein. Stock price distributions with stochastic volatility: an analytic approach. Rev. Finance, 4(4):727–752, 1991.
154. V. Thomée. Galerkin finite element methods for parabolic problems, volume 25 of Springer Series in Computational Mathematics, 2nd edition. Springer, Berlin, 2006.
155. K. Urban. Wavelet methods for elliptic partial differential equations. Oxford University Press, Oxford, 2009.
156. O. Vasicek. An equilibrium characterisation of the term structure. J. Financ. Econ., 5(2):177–188, 1977.
157. J. Vecer. Unified Asian pricing. Risk, 15(6):113–116, 2002.
158. T. von Petersdorff and Ch. Schwab. Wavelet discretization of parabolic integrodifferential equations. SIAM J. Numer. Anal., 41(1):159–180, 2003.
159. T. von Petersdorff and Ch. Schwab. Numerical solution of parabolic equations in high dimensions. M2AN Math. Model. Numer. Anal., 38(1):93–127, 2004.
160. M. Wilhelm and Ch. Winter. Finite element valuation of swing options. J. Comput. Finance, 11(3):107–132, 2008.
161. P. Wilmott, S. Howison, and J. Dewynne. The mathematics of financial derivatives: a student introduction. Cambridge University Press, Cambridge, 1993.
162. P. Wilmott, S. Howison, and J. Dewynne. Option pricing: mathematical models and computation. Oxford Financial Press, Oxford, 1995.
163. Ch. Winter. Wavelet Galerkin schemes for option pricing in multidimensional Lévy models. PhD thesis, ETH Zürich, Dissertation No. 18221, 2009. http://e-collection.ethbib.ethz.ch/view/eth:41555.
164. Ch. Winter. Wavelet Galerkin schemes for multidimensional anisotropic integrodifferential operators. SIAM J. Sci. Comput., 32(3):1545–1566, 2010.
165. P.G. Zhang. Exotic options, 2nd edition. World Scientific, Singapore, 1998.
166. R. Zvan, P.A. Forsyth, and K.R. Vetzal. Penalty methods for American options with stochastic volatility. J. Comput. Appl. Math., 91(2):199–218, 1998.
167. R. Zvan, P.A. Forsyth, and K.R. Vetzal. PDE methods for pricing barrier options. J. Econ. Dyn. Control, 24(11–12):1563–1590, 2000.

Index

0–9 1-homogeneous, 204 A α-stable, 205 A priori estimate, 32 Admissible market model, 128, 209, 256 American option, 65, 119, 140 Amplification matrix, 18 Anisotropic Sobolev space, 180 Antiderivative, 135 Asian option, 77 Assets, 3 B Banach’s fixed point theorem, 273 Barrier option, 75 Basket, see multi-asset option Bates model, 230 Bernstein function, 253 Better-of-option, 101 Bilinear form, 31 Black–Scholes equation, 50 Black–Scholes model, 7 BNS model, 231 Bond, 85 Brownian motion, see Wiener process C Càdlàg, 6 Call option, 4 CEV-model, 58 CFL-condition, 19, 41 CGMY process, see tempered stable process Characteristic triplet, 124 CIR model, 86 Clayton Lévy copula, 204

Complete dependence Lévy copula, 203 Compound option, 79 Compression scheme, 165, 215 Condition number, 167 Contingent claim, 4 Continuity, 32 Convergence, 20, 43 Convolution semigroup, 254 Curse of dimension, 101 D Derivative, 4 Difference quotient, 16 Differential operator, 47 Digital option, 57 Dirichlet boundary condition, 15 Discontinuous Galerkin scheme, 168 Discretization, 15, 33, 54, 68, 96, 114, 135 Discretization error, 17, 43 Dual space, 12, 271 E ε-aggregated price process, 187 Elliptic, 14 Equivalent local martingale measure, 6 Error estimate, 101, 164, 166, 173, 184, 217, 241 Excess to payoff, see time value Exercise boundary, 66, 142 Existence, 32 Exotic option, 75 F Feller semigroup, 249 Feynman–Kac formula, 49, 93, 130, 212, 233 Filtration, 6 Finite activity, 125

N. Hilber et al., Computational Methods for Quantitative Finance, Springer Finance, DOI 10.1007/978-3-642-35401-4, © Springer-Verlag Berlin Heidelberg 2013

Finite difference method, 15 Finite element method, 20, 27 First compression, 165 Full-rank Black–Scholes model, 186 Function spaces, 11 G Galerkin discretization, 21 Gårding inequality, 32 Gaussian approximation, 218 Geometric call, 191 Geometric partition, 170 Geometric payoff, 184 Graded mesh, 55 Greeks, 147 H Hardy’s inequality, 59, 235 Hat functions, 22, 34 Heat equation, 14 Heston model, 106, 154 Hierarchical basis, 160 Hilbert space, 269 Hyperbolic, 14 I Implementation, 34 Independence Lévy copula, 203 Infinitesimal generator, 48, 58, 62, 92, 108, 129, 211, 233, 249 Initial condition, 15, 38 Inner product, 269 Integro-differential operator, 129 Interest rate, 5 Interest rate derivative, 87 Interest rate model, 85 Itô formula, 8 Itô process, 8 J Jackson type estimate, 163 Jump measure, see Lévy measure Jump-diffusion model, 126 K KoBoL, see tempered stable process Kou model, 126 Kronecker product, 96, 115 L L^p-norm, 12 Laplacian, 13 Lévy copula, 199 Lévy measure, 124

Lévy process, 123, 200 Lévy–Itô decomposition, 124 Lévy–Khinchine representation, 124, 197 Linear complementarity problem, 70 Load vector, 33 Local volatility model, 62 Localization, 67, 80, 95, 113, 134 Low-rank Black–Scholes model, 187 M Marginal process, 198 Markov property, 6 Martingale, 6 Mass matrix, 33 Matrix–vector multiplication, 183 Merton model, 126 Method of lines, 20, 33 Minimal entropy martingale measure, 234 Multi-asset option, 91 Multi-index, 11 Multi-scale basis, see hierarchical basis Multi-scale model, 106 Multidimensional Lévy process, 197 Multidimensional variance gamma process, 207 Multivariate subordination, 207 N Non-homogeneous Dirichlet boundary condition, 37 Non-smooth initial data, 55 Norm equivalence, 162, 180 Normal inverse Gaussian process, 128 O Option, 4 Ornstein–Uhlenbeck process, 231 P Parabolic, 14 Parabolic variational inequalities, 275 Parametric Markovian market model, 146 Partial differential equation, 13 Partial integro-differential equation, 129 Payoff, 4 Poincaré inequality, 30 Positive maximum principle, 248 Preconditioning, 167, 239 Price process, 4 Primal–dual active set algorithm, 72 Pseudodifferential operator, 130, 247 PSOR method, 71 Pure jump model, 127

Q Quadratic variation, 92 R Random measure, see Lévy measure Removal of drift, 131 Riesz representation theorem, 12, 271 S Schur decomposition, 172 Second compression, 165 Sector condition, 133, 259 Semi-heavy tails, 126 Sensitivity, see Greeks Singular support, 165 Sklar’s theorem, 201 Smooth pasting, 66 Smooth pasting condition, 141 Sobolev space, 20, 27, 93, 134 Sobolev space of fractional order, 131 Sobolev space of variable order, 250 Sparse grid, 178 Stability condition, see CFL-condition Stiffness matrix, 33 Stochastic process, 5 Stochastic volatility model, 105, 189, 229 Stopping time, 65 Subordination, 127, 253 Swap, 89 Swaption, 89 Swing option, 82

Symbol, 130, 247 T θ-scheme, 16 Tempered stable process, 127 Theorem of Lax–Milgram, 273 Theorem of Stampacchia, 273 Time of maturity, 4 Time value, 67 Time-space grid, 15 Time-to-maturity, 51 Truncation function, 125 V Variance gamma process, 127 Variational formulation, 21, 31, 51, 67, 93, 110, 131, 148, 164, 181, 212, 234, 260 Vasicek model, 86 Volatility smile, 139 W Wavelet, 160 Wavelet discretization, 164, 181, 213, 238 Weak derivative, 28 Weak formulation, see variational formulation Weighted Sobolev space, 59, 111, 236 Wiener process, 7 Y Yield curve, 87