Differential Equations on Measures and Functional Spaces 978-3-030-03377-4


English Pages 536 Year 2019

Birkhäuser Advanced Texts Basler Lehrbücher

Vassili Kolokoltsov

Differential Equations on Measures and Functional Spaces

Series editors: Steven G. Krantz, Washington University, St. Louis, USA; Shrawan Kumar, University of North Carolina at Chapel Hill, Chapel Hill, USA; Jan Nekovář, Sorbonne Université, Paris, France

More information about this series at http://www.springer.com/series/4842

Vassili Kolokoltsov Department of Statistics University of Warwick Warwick, UK Higher School of Economics Moscow, Russia

ISSN 1019-6242
ISSN 2296-4894 (electronic)
Birkhäuser Advanced Texts Basler Lehrbücher
ISBN 978-3-030-03376-7
ISBN 978-3-030-03377-4 (eBook)
https://doi.org/10.1007/978-3-030-03377-4

Mathematics Subject Classification (2010): 34B15, 34G10, 34G20, 34H05, 35D30, 35F20, 35F21, 35K25, 35K55, 35K67, 35Q20, 35Q82, 35Q83, 35Q84, 35Q91, 35Q92, 35S05, 45N05, 46E10, 47D03, 47D06, 47D07, 47D08, 47G20, 49J50, 91A13, 91A22

© Springer Nature Switzerland AG 2019

This work is subject to copyright. All rights are reserved by the Publisher, whether the whole or part of the material is concerned, specifically the rights of translation, reprinting, reuse of illustrations, recitation, broadcasting, reproduction on microfilms or in any other physical way, and transmission or information storage and retrieval, electronic adaptation, computer software, or by similar or dissimilar methodology now known or hereafter developed.

The use of general descriptive names, registered names, trademarks, service marks, etc. in this publication does not imply, even in the absence of a specific statement, that such names are exempt from the relevant protective laws and regulations and therefore free for general use.

The publisher, the authors, and the editors are safe to assume that the advice and information in this book are believed to be true and accurate at the date of publication. Neither the publisher nor the authors or the editors give a warranty, express or implied, with respect to the material contained herein or for any errors or omissions that may have been made. The publisher remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

This book is published under the imprint Birkhäuser, www.birkhauser-science.com by the registered company Springer Nature Switzerland AG. The registered company address is: Gewerbestrasse 11, 6330 Cham, Switzerland

Contents

Preface
Standard notations

1 Analysis on Measures and Functional Spaces
1.1 Banach spaces: notations and examples
1.2 Smooth functions on Banach spaces
1.3 Additive and multiplicative integrals
1.4 Differentials of the norms
1.5 Smooth mappings between Banach spaces
1.6 Locally convex spaces and Fréchet spaces
1.7 Linear operators in spaces of measures and functions
1.8 Fractional calculus
1.9 Generalized functions: main operations
1.10 Generalized functions: regularization
1.11 Fourier transform, fundamental solutions and Green functions
1.12 Sobolev spaces
1.13 Variational derivatives
1.14 Derivatives compatible with duality, AM- and AL-spaces
1.15 Hints and answers to chosen exercises
1.16 Summary and comments

2 Basic ODEs in Complete Locally Convex Spaces
2.1 Fixed-point principles for curves in Banach spaces
2.2 ODEs in Banach spaces: well-posedness
2.3 Linear equations and chronological exponentials
2.4 Linear evolutions involving spatially homogeneous ΨDOs
2.5 Hamiltonian systems, boundary-value problems and the method of shooting
2.6 Hamilton–Jacobi equation, method of characteristics and calculus of variation
2.7 Hamilton–Jacobi–Bellman equation and optimal control
2.8 Sensitivity of integral equations
2.9 ODEs in Banach spaces: sensitivity
2.10 Linear first-order partial differential equations
2.11 Equations with memory: causality
2.12 Equations with memory: fractional derivatives
2.13 Linear fractional ODEs and related integral equations
2.14 Linear fractional evolutions involving spatially homogeneous ΨDOs
2.15 Sensitivity of integral and differential equations: advanced version
2.16 ODEs in locally convex spaces
2.17 Monotone and accretive operators
2.18 Hints and answers to chosen exercises
2.19 Summary and comments

3 Discrete Kinetic Systems: Equations in $l_p^+$
3.1 Equations in $\mathbb{R}^n_+$
3.2 Examples in $\mathbb{R}^n_+$: replicator dynamics and mass-action-law kinetics
3.3 Entropy and equilibria for linear evolutions in $\mathbb{R}^n_+$
3.4 Entropy and equilibria for nonlinear evolutions in $\mathbb{R}^n_+$
3.5 Kinetic equations for collisions, fragmentation, reproduction and preferential attachment
3.6 Simplest equations in $l_p^+$
3.7 Existence of solutions for equations in $l_p^+$
3.8 Additive bounds for rates
3.9 Evolution of moments under additive bounds
3.10 Accretive operators in $l_p$
3.11 Accretivity for evolutions with additive rates
3.12 The major well-posedness result in $l_p^+$
3.13 Sensitivity
3.14 Second-order sensitivity
3.15 Stability of solutions with respect to coefficients
3.16 Hints and answers to chosen exercises
3.17 Summary and comments

4 Linear Evolutionary Equations: Foundations
4.1 Semigroups and their generators
4.2 Semigroups of operators on Banach spaces
4.3 Simple diffusions and the Schrödinger equation
4.4 Evolutions generated by powers of the Laplacian
4.5 Evolutions generated by ΨDOs with homogeneous symbols and their mixtures
4.6 Perturbation theory and the interaction picture
4.7 Path integral representation
4.8 Diffusion with drifts and Schrödinger equations with singular potentials and magnetic fields
4.9 Propagators and their generators
4.10 Well-posedness of linear Cauchy problems
4.11 The operator-valued Riccati equation
4.12 An infinite-dimensional diffusion equation in variational derivatives
4.13 Perturbation theory for propagators
4.14 Diffusions and Schrödinger equations with nonlocal terms
4.15 ΨDOs with homogeneous symbols (time-dependent case)
4.16 Higher-order ΨDEs with nonlocal terms
4.17 Hints and answers to chosen exercises
4.18 Summary and comments

5 Linear Evolutionary Equations: Advanced Theory
5.1 T-products with three-level Banach towers
5.2 Adding generators with 4-level Banach towers
5.3 Mixing generators
5.4 The method of frozen coefficients: heuristics
5.5 The method of frozen coefficients: estimates for the Green function
5.6 The method of frozen coefficients: main examples
5.7 The method of frozen coefficients: regularity
5.8 The method of frozen coefficients: the Cauchy problem
5.9 Uniqueness via duality and accretivity; generalized solutions
5.10 Uniqueness via positivity and approximations; Feller semigroups
5.11 Lévy–Khintchin generators and convolution semigroups
5.12 Potential measures
5.13 Vector-valued convolution semigroups
5.14 Equations of order at most one
5.15 Smoothness and smoothing of propagators
5.16 Summary and comments

6 The Method of Propagators for Nonlinear Equations
6.1 Hamilton–Jacobi–Bellman (HJB) and Ginzburg–Landau equations
6.2 Higher-order PDEs and ΨDEs, and Cahn–Hilliard-type equations
6.3 Nonlinear evolutions and multiplicative-integral equations
6.4 Causal equations and general path-dependent equations
6.5 Simplest nonlinear diffusions: weak treatment
6.6 Simplest nonlinear diffusions: strong treatment
6.7 Simplest nonlinear diffusions: regularity and sensitivity
6.8 McKean–Vlasov equations
6.9 Landau–Fokker–Planck-type equations
6.10 Forward-backward systems
6.11 Linearized evolution around non-linear propagators
6.12 Sensitivity of nonlinear propagators
6.13 Summary and comments

7 Equations in Spaces of Weighted Measures
7.1 Conditional positivity
7.2 Simplest equations that preserve positivity
7.3 Path-dependent equations and forward-backward systems
7.4 Kinetic equations (Boltzmann, Smoluchowski, Vlasov, Landau) and replicator dynamics
7.5 Well-posedness for basic kinetic equations
7.6 Equations with additive bounds for rates
7.7 On the sensitivity of kinetic equations
7.8 On the derivation of kinetic equations: second quantization and beyond
7.9 Interacting particles and measure-valued diffusions
7.10 Summary and comments

8 Generalized Fractional Differential Equations
8.1 Green functions of fractional derivatives and the Mittag-Leffler function
8.2 Linear evolution
8.3 The fractional HJB equation and related equations with smoothing generators
8.4 Generalized fractional integration and differentiation
8.5 Generalized fractional linear equations, part I
8.6 Generalized fractional linear equations, part II
8.7 The time-dependent case; path integral representation
8.8 Chronological operator-valued Feynman–Kac formula
8.9 Summary and comments

9 Appendix
9.1 Fixed-point principles
9.2 Special functions
9.3 Asymptotics of the Fourier transform: power functions and their exponents
9.4 Asymptotics of the Fourier transform: functions of power growth
9.5 Argmax in convex Hamiltonians

Bibliography
Index

Preface

Objectives, scope and methodology

This is an advanced text on ordinary differential equations (ODEs) in Banach and more general locally convex spaces, most notably ODEs on measures and various function spaces. The methodology is carefully chosen in order to provide a very concise introduction to the fundamentals and then move quickly, but rigorously and systematically, to the forefront of modern research in linear and nonlinear PDEs, ΨDEs, general kinetic equations and fractional evolutions. More than half of the book content has not previously been included in any textbook. Other parts have been streamlined and given unified arguments. The level of generality was chosen such that the book content is suitable for the study of the most important nonlinear equations of mathematical physics, such as Boltzmann, Smoluchowski, Vlasov, Landau–Fokker–Planck, Cahn–Hilliard, Hamilton–Jacobi–Bellman, nonlinear Schrödinger or McKean–Vlasov diffusions and their nonlocal extensions, of mass-action-law kinetics from chemistry, as well as of nonlinear evolutions arising in evolutionary biology, mean-field games, optimization theory, epidemics and systems biology, in general models of interacting particles or agents that describe splitting and merging, collisions and breakage, mutations or the preferential-attachment growth on networks. With this objective in mind, the abstract vector spaces are introduced and studied mostly not for their own sake, but as a convenient tool for storing and summarizing the basic properties of the concrete infinite-dimensional spaces of smooth or integrable functions, measures and distributions (generalized functions), which are crucial for the above-mentioned equations. In other words, the general theory is developed as a tool for effectively solving concrete problems, and it aims at simplifying and not complicating the matter.

In accordance with this approach, we are not dealing much with 'pathologies' that may arise in abstract spaces, but rather focus on the regularity properties of the most important classes of equations. A large number of remarks and comments are scattered throughout the text which stress the interconnections between various parts of the book and aim at revealing where and how a particular result is used in other chapters, or may be used in other contexts. In order to make the text appealing and accessible to readers with different backgrounds, much attention is paid to the clarification of the links between the languages of pseudo-differential operators (ΨDOs), generalized functions, operator theory, abstract linear spaces, fractional calculus and path integrals. With the same objective in mind, much attention is paid to proper definitions of all the objects that are used. Some definitions are even repeated in different chapters. A detailed subject index refers to the pages where the corresponding notions are defined. Also, the book contains many exercises that deal with examples and further developments. Note that these exercises never substitute for the proofs of the main results. Solutions are provided for most exercises of the four initial chapters. Exercises in later chapters are more research-oriented.


General context and specific features

The basic classes of partial differential, integral and pseudo-differential equations usually lead to ODEs in infinite-dimensional spaces with unbounded, not Lipschitz-continuous and/or singular coefficients. Roughly speaking, our major tools in their study derive from the methods of semigroups and propagators, on the one hand, and from the exploitation of some kind of positivity preservation, on the other. The overall emphasis is on the well-posedness of the problems (existence, uniqueness and continuous dependence of solutions on the initial data), sensitivity (smooth dependence of solutions on initial data and/or parameters), regularity of the solutions in various classes of smooth functions equipped with pointwise or integral norms, with precise growth estimates, and either the integral representations of solutions (whenever possible) or the natural approximating schemes that allow for various kinds of numerical algorithms to be employed for finding solutions. Apart from being crucial for numerical computations, explicit estimates are important for studying equations with random coefficients, which requires precise control over all bounds. Together with the regular solutions, various concepts of generalized solutions are introduced, whereby two basic classes of such solutions are stressed: generalized solutions by approximations (that can be approximations of regular solutions or approximations by discrete times) and generalized solutions by duality (that can be a Banach-space duality or, more generally, duality of locally convex spaces, the most notable example of the latter being the method of generalized functions).

A unique feature of the exposition in this book is that it is strongly influenced by the links with probability theory and Markov processes (Feller semigroups, Lévy–Khintchin generators, path integrals are standard players in stochastic analysis, but not in standard texts on ODEs), though always remaining independent of these links. The links are crucial for modern developments in the field, since probability theory keeps steadily penetrating all areas of natural and even social sciences. They are made explicit in several side notes that are aimed at readers with some knowledge of stochastic analysis. The links are revealed in detail in [147], [148]. Other accessible books on the links between PDE and stochastics are [12], [68] and [118]. The exposition is also strongly influenced by the fractional calculus, which is rapidly developing as an appropriate tool to deal with various complex problems in natural and social sciences, see, e.g., [253, 255, 263]. Although results on fractional differential equations are analysed in special sections (that readers may choose to omit), it turns out that the types of singularities occurring in fractional equations are in fact quite common in other settings, like nonlinear diffusions or general perturbation theory estimates. Therefore, the growth estimates of the solutions and their sensitivity to parameters are naturally expressed in terms of the Mittag-Leffler functions in various, seemingly unrelated contexts, these functions being the main players in fractional calculus. A unified abstract framework for these contexts makes it possible to treat them in a very effective and concise way.


The source of many developments in the theory of differential equations (especially nonlinear differential equations) can be traced back to the analysis of systems of interacting particles (or agents in the social context) in the limit of large particle numbers (dynamic law of large numbers). Although this link has not been formally developed in the book, it was crucial for the selection of material and methodology. Another source of new methods and ideas in nonlinear differential equations is the theory of optimization and competitive control systems. This link is being developed here, including the analysis of various classes of the Hamilton–Jacobi–Bellman equation, forward-backward systems (occurring in mean-field games), the Riccati equation and the replicator dynamics of evolutionary game theory and controlled systems of interacting agents. Since differential equations are a key tool in almost all developments and applications of mathematics, many introductory textbooks on differential equations are available. Traditionally, the topic is included in the undergraduate curriculum in two separate parts: ordinary differential equations (ODEs) and partial differential equations (PDEs). Examples of classical texts on ODEs are [15, 104, 220, 224]. Meanwhile, the standard theory of PDEs is much more diversified. Starting from the classical second-order equations of mathematical physics (Laplace, heat and wave equations), it utilizes various methods for various types of equations. Therefore, the boundaries of even the core of the subject are difficult to survey, and the same applies to a comprehensive list of textbooks. However, a large portion of these methods can be unified by looking at partial differential operators as representatives of linear operators in certain abstract spaces, and then applying the tools of functional analysis.
In this framework, PDEs (and more general integrodifferential and pseudo-differential equations) are considered ODEs in abstract linear spaces, and the boundary between these two parts of the theory is melting away. The well-known book [219] was one of the first systematic developments of this framework. Following this general idea, the present book provides a unique concise and application-oriented exposition of ODEs on measures and functional spaces, starting from scratch and moving up to the level of modern research in many directions, including non-equilibrium statistical mechanics, nonlinear quantum mechanics, fractional evolutions, evolutionary biology, models of interacting agents, and others.

Readers and prerequisites

This textbook is designed to serve as a multi-purpose learning resource on differential equations. It is mostly aimed at postgraduate or final-year undergraduate courses for mathematics students. However, the final chapters can also be of interest to researchers in linear and nonlinear differential equations. On the other hand, the book can also be used for basic undergraduate courses and self-study. For instance, Chapter 2 can be considered an intensive undergraduate introductory course on ODEs, if the term 'Banach space' is substituted by 'Euclidean space $\mathbb{R}^d$' and if the integration methods of the simplest one-dimensional equations are added as exercises. A study in this framework would help the students to grasp the more abstract approaches from the beginning of their curriculum, which makes their further transition to graduate courses easier. Similarly, Chapter 3 on the equations in $\mathbb{R}^d_+$ and $l_1^+$ does not require any prerequisites apart from introductory calculus and linear algebra. The overall level of presentation is meant to be appropriate for readers who are familiar with basic calculus and linear algebra, with the principles of convergence in general compact, metric and Banach spaces, with the basic notions of linear spaces, linear operators, dual spaces and operators, with the theory of measure, integration and $L_p$-spaces and occasionally with some functions of complex variables. No prior knowledge of ODEs is assumed. Each of the first four chapters can serve as the basis for a crash course on its respective topic. Put together, they cover a range of topics that is appropriate for a full one-semester module. Elements of the other chapters can be used to enhance the course in various particular directions, or as a basis for more advanced courses. For each chapter, a specific abstract and summary are provided. The material of the last chapters has been adapted from research articles and specialized monographs. Their topical selection was of course influenced by the research interests of the author, but a strong attempt was made to choose only topics within the mainstream of research, including general methods which can be used in a variety of developments and which at the same time have grown mature enough to be presented in a more or less final form.
Though aimed at mathematicians and filled with abstract theory, the book is meant to be truly application-oriented: not in the sense that it is to be used for producing certain concrete industrial products, but in the sense that the abstract theory is developed in order to effectively solve basic concrete problems that arise in natural sciences or modelling of social processes, that is, as a tool to streamline and simplify the analysis of these problems, and not for the sake of generality in its own right.

Bibliographic comments

Due to the immense amount of literature on the topics that are touched upon in the book, it was unfortunately impossible to provide an exhaustive guide to all relevant contributions. Instead, the given bibliography essentially includes the sources that have been used by the author for the preparation of the manuscript, as well as some classical textbooks and key references for related further developments.

Acknowledgement

It is my pleasure to express my gratitude for fruitful discussions and joint research on the topics reflected in this book to my colleagues, collaborators and PhD students, especially to S. Assing, A. Bensoussan, A. Hilbert, M.E. Hernández-Hernández, A. Kulik, S. Katsikas, O.A. Malafeyev, L. Toniazzi, M. Troeva, M. Veretennikova and W. Yang. Also, I gratefully acknowledge the support by the Russian Academic Excellence project '5-100'.

Standard notations

$\mathbb{N}, \mathbb{Z}, \mathbb{R}, \mathbb{C}$: Sets of natural, integer, real and complex numbers
$\mathbb{Z}_+$: $\mathbb{N} \cup \{0\}$
$\mathbb{R}_+$: $\{x \in \mathbb{R} : x \ge 0\}$
$\mathbb{N}^\infty, \mathbb{Z}^\infty, \mathbb{R}^\infty, \mathbb{C}^\infty$: Sets of sequences from $\mathbb{N}, \mathbb{Z}, \mathbb{R}, \mathbb{C}$
$\mathbb{Z}^\infty_+, \mathbb{R}^\infty_+$: Subsets of $\mathbb{Z}^\infty, \mathbb{R}^\infty$ with non-negative elements
$\mathbb{C}^d, \mathbb{R}^d$: Complex and real $d$-dimensional spaces
$(x, y)$ or $xy$: Scalar product of the vectors $x, y \in \mathbb{R}^d$; also $x^2 = |x|^2 = (x, x)$
$|x|$ or $\|x\|$: Standard Euclidean norm $\sqrt{(x, x)}$ of $x \in \mathbb{R}^d$ or $x \in \mathbb{C}^d$
$\mathrm{Re}\, a, \mathrm{Im}\, a$: Real and imaginary part of a complex number $a$
$[x]$: Integer part of a real number $x$ (maximal integer not exceeding $x$)
$\mathrm{sgn}\, x = \mathrm{sgn}(x)$: The sign of $x$ (equals $1, 0, -1$ if $x > 0$, $x = 0$, $x < 0$ respectively)
$S^d$: Unit sphere in $\mathbb{R}^{d+1}$
$C(X)$: For a metric or topological space $X$, the Banach space of bounded continuous functions on $X$ equipped with the sup-norm $\|f\| = \|f\|_{C(X)} = \sup_{x \in X} |f(x)|$
$\mathcal{M}(X)$: Banach space of finite signed Borel measures on $X$
$\mathcal{M}^+(X)$ and $\mathcal{P}(X)$: The subsets of $\mathcal{M}(X)$ of positive and positive normalized (probability) measures
$C_\infty(X)$: For a locally compact $X$, the subspace of $C(X)$ consisting of functions that tend to zero at infinity
$C_f(X)$: For a positive function $f$ on $X$, the Banach space of continuous functions on $X$ with a finite norm $\|g\|_{C_f(X)} = \|g/f\|_{C(X)}$
$\mathcal{M}_f(X)$: For a positive continuous function $f$ on $X$, the space of Borel measures on $X$ with a finite norm $\|\mu\|_{\mathcal{M}_f(X)} = \sup\{(g, \mu) : \|g\|_{C_f(X)} \le 1\}$
$C_{f,\infty}(X)$: For a locally compact $X$ and a positive function $f$, the subspace of $C_f(X)$ consisting of functions $g$ such that $g(x)/f(x) \to 0$ as $x \to \infty$
$C^k(\mathbb{R}^d)$ or short $C^k$: Banach space of $k$ times continuously differentiable functions with bounded derivatives on $\mathbb{R}^d$, with the norm being the sum of the sup-norms of the function itself and all its partial derivatives up to and including the order $k$
$C^k_\infty(\mathbb{R}^d) \subset C^k(\mathbb{R}^d)$: Functions whose derivatives up to and including order $k$ are all in $C_\infty(\mathbb{R}^d)$
$\nabla f = (\nabla_1 f, \dots, \nabla_d f) = \left(\frac{\partial f}{\partial x_1}, \dots, \frac{\partial f}{\partial x_d}\right)$: The gradient of the function $f$
$\Delta = \nabla^2 = \sum_j \nabla_j^2$: The Laplacian operator
$\nabla^{\otimes 2} f = \frac{\partial^2 f}{\partial x^2}$: The matrix of the second-order derivatives of $f$, sometimes referred to as the Hessian
$L_p(X, \mu)$ or $L^p(X, \mu)$, $p \ge 1$: The Banach spaces of (the equivalence classes of) integrable functions on the metric or topological space $X$ with respect to the Borel measure $\mu$, equipped with the $p$-norm $\|f\|_p = \left(\int |f(x)|^p \mu(dx)\right)^{1/p}$
$L_p(\mathbb{R}^d)$: The space $L_p(\mathbb{R}^d, \mu)$ with Lebesgue measure $\mu$
$S(\mathbb{R}^d)$: Schwartz space of fast-decreasing functions: $\{f \in C^\infty(\mathbb{R}^d) : \forall k, l \in \mathbb{N},\ |x|^k \nabla^l f \in C_\infty(\mathbb{R}^d)\}$
$|\nu|$: The (positive) total variation measure for a signed measure $\nu$
$(f, g) = \int f(x) g(x)\, dx$: Scalar product for functions $f, g$ on $\mathbb{R}^d$. For $f \in C(X)$, $\mu \in \mathcal{M}(X)$, the following notation is used: $(f, \mu) = (\mu, f) = \int_X f(x) \mu(dx)$
$A^T$ or $A'$: Transpose of a matrix $A$
$A^*$ or $A'$: Dual or adjoint operator of $A$
$\ker A, \mathrm{tr}\, A$: Kernel and trace of the matrix $A$
$\mathbf{1}_M$: Indicator function of a set $M$ (equals one or zero according to whether its argument is in $M$ or not)
$\mathbf{1}$: Constant function equal to one, and also the identity operator
$f = O(g)$: For functions $f$ and $g$, this means that $|f| \le Cg$ for some constant $C$
$f = o(g)_{x \to a}$: For functions $f$ and $g$, this means that $\lim_{x \to a} \frac{f(x)}{g(x)} = 0$

Standard abbreviations

ODE: Ordinary differential equation
PDE: Partial differential equation
ΨDE: Pseudo-differential equation
ΨDO: Pseudo-differential operator
r.h.s., l.h.s.: Right-hand side and left-hand side, respectively

Chapter 1

Analysis on Measures and Functional Spaces

In this chapter, we shall review some key facts on the calculus of smooth mappings between Banach spaces, Fréchet spaces and general locally convex spaces, and their key representatives, including the spaces of generalized functions (or distributions). Sections 1.1 to 1.6 deal with abstract notions, and the remaining part provides more concrete information on spaces of measures and functions, their dual spaces and the structure of the basic classes of linear operators, including multidimensional mixed fractional derivatives and ΨDOs. In order to be reasonably self-contained, we supply most of the proofs, apart from some standard facts that are clearly formulated and provided with references where the proofs can be found.

1.1 Banach spaces: notations and examples

In this introductory section, we explain the notations for the basic Banach spaces of functions, operators and measures that are used throughout the book without further reminder. Apart from fixing notations, the objective is to outline the circle of notions and ideas that comprise the main building blocks for the development in this treatise and that the reader is supposed to be familiar with. Besides standard calculus and linear algebra (including the Fourier transform), this includes the theory of measure and integration, convergence in compact, metric (and rarely in general topological) spaces and basic theoretical facts on Banach spaces. Only on rare occasions do we use deeper facts of functional analysis like the Hahn–Banach theorem, Baire's theorem on categories or Schauder's fixed-point principle. Details on all these prerequisites can be found in many standard texts on functional analysis including [166, 231] and [115].

© Springer Nature Switzerland AG 2019. V. Kolokoltsov, Differential Equations on Measures and Functional Spaces, Birkhäuser Advanced Texts Basler Lehrbücher, https://doi.org/10.1007/978-3-030-03377-4_1


We shall mostly work with Banach spaces over the field of real numbers. Recall that for any Banach space $B$ with the norm $\|\cdot\| = \|\cdot\|_B$ (we write simply $\|\xi\|$ if it is clear which Banach space we are talking about), the dual Banach space, usually denoted by $B^*$ or $B'$, is defined as the space of continuous linear functionals $z$ on $B$, the value of $z$ at $y$ being usually denoted by $z(y) = (z, y)$. This space is Banach with respect to the norm $\|z\| = \sup\{|z(y)| : \|y\| \le 1\}$.

Remark 1. For a real Hilbert space $H$, for instance the usual Euclidean space $\mathbf{R}^n$, $H$ can be identified with its dual $H^*$, so that $(z, y)$ is the inner product of $z$ and $y$, which coincides with the value of $z$ on $y$.

Each Banach space $B$ is naturally embedded in its second dual space $B^{**}$, since any element $y \in B$ defines the linear functional on $B^*$ via the formula $y(z) = (z, y)$. The Banach space $B$ is called reflexive if this embedding is a bijection, that is, the second dual $B^{**}$ is isomorphic to $B$. Reflexive spaces share many features with Hilbert spaces and are usually more convenient for analysis. The main examples of Banach spaces in this book, however, are not reflexive.

The weakest topology of $B^*$ ensuring that all functionals from $B$ are continuous is called the $*$-weak topology of $B^*$. Thus $b^*_n \to b^*$ in the $*$-weak topology means that $(b^*_n, b) \to (b^*, b)$ for any $b \in B$. The weakest topology of $B$ ensuring that all functionals from $B^*$ are continuous is called the weak topology of $B$. If $B$ is reflexive, then the weak and the $*$-weak topology coincide on $B^*$, as well as on $B = (B^*)^*$. A subset $Z$ of $B^*$ is said to separate points of $B$ if for any distinct $b_1, b_2 \in B$ there exists $z \in Z$ such that $(z, b_1) \neq (z, b_2)$.

In many situations, it is handy to work with dual pairs of Banach spaces. In physics, these pairs usually represent observables and states. Therefore we shall sometimes use the notations $B_{st}$ and $B_{obs}$ for the dual pairs.
We say that a pair of Banach spaces $(B_{obs}, B_{st})$ is a dual pair if each of these spaces is a closed subspace of the dual of the respective other space that separates the points of the latter. Given a pair $(B_{obs}, B_{st})$, one defines the weak topology of $B_{obs}$ (respectively $B_{st}$) with respect to this dual pair as the weakest topology that makes all functionals from $B_{st}$ (respectively $B_{obs}$) continuous.

A linear operator $A$ between two Banach spaces $B_1$ and $B_2$ is a linear mapping $A : D \to B_2$, where $D$ is a subspace of $B_1$ called the domain of $A$. The operator $A$ is said to be densely defined if $D$ is dense in $B_1$. The operator $A$ is called bounded if the norm $\|A\| = \sup_{x \in D,\, x \neq 0} \|Ax\|/\|x\|$ is finite. If $A$ is bounded and $D$ is dense, then $A$ has a unique bounded extension (with the same norm) to an operator with the whole $B_1$ as domain. A linear operator on a Banach space is called a contraction if its norm does not exceed 1. For an operator $A : B \to B$, the dual operator $A^* : B^* \to B^*$ is defined by the equation $(A^* z, b) = (z, Ab)$.


It is known (and not difficult to show) that a linear operator $A : B_1 \to B_2$ is continuous if and only if it is bounded. For a continuous linear mapping $A : B_1 \to B_2$, its norm is defined as
\[
\|A\|_{B_1 \to B_2} = \|A\|_{L(B_1, B_2)} = \sup_{x \neq 0} \frac{\|Ax\|_{B_2}}{\|x\|_{B_1}}. \tag{1.1}
\]

The space of bounded linear operators $B_1 \to B_2$ equipped with this norm is a Banach space itself, often denoted by $L(B_1, B_2)$. For $B_1 = B_2 = B$ we shall also use the shorter notation $\|A\|_{L(B)}$ or even simpler $\|A\|_B$ instead of $\|A\|_{L(B,B)}$. A sequence of bounded operators $A_n$, $n = 1, 2, \dots$, from $B_1$ to $B_2$ is said to converge strongly to an operator $A$ if $A_n f \to Af$ for any $f \in B_1$. This defines the strong topology on $L(B_1, B_2)$, which is weaker than the norm topology. If $B_2 = \mathbf{R}$, then $L(B_1, \mathbf{R}) = B_1^*$ and the strong topology turns into the $*$-weak topology of $B_1^*$.

Exercise 1.1.1. Show that $\|A^*\| = \|A\|$ for any bounded operator $A : B \to B$.

A bilinear operator from $B_1$ to $B_2$ is a mapping $D : B_1 \times B_1 \to B_2$ which is linear with respect to each of its two variables. It is called bounded if there exists a constant $d$ such that $\|D(x, y)\| \le d\,\|x\|\,\|y\|$. The minimal such constant is called the norm of $D$. Clearly, if $D$ is bounded, then it is continuous as a function of two variables. Let us denote by $L^2(B_1, B_2)$ the space of bounded symmetric bilinear operators from $B_1$ to $B_2$, which is Banach with respect to the above defined norm. Similarly, one defines the spaces $L^n(B_1, B_2)$ of multi-linear symmetric bounded mappings $B_1 \times \cdots \times B_1 \to B_2$.

Remark 2. If a bilinear mapping between Banach spaces is continuous with respect to each of its variables separately, then it is bounded. This fact follows from the principle of uniform boundedness.

The following examples of Banach spaces are going to appear quite often in this book. For a topological (in particular metric) space $X$ and a Banach space $B$, we define the space $C(X, B)$ of continuous bounded functions $f : X \to B$. Note that this space is often denoted by $C_b(X, B)$ in other literature. It is a Banach space equipped with the sup-norm
\[
\|f\|_{C(X,B)} = \sup_{x \in X} \|f(x)\|_B .
\]

Sometimes we shall also use the space $C_{uc}(X, B)$ of bounded, uniformly continuous functions on $X$. Another established notation for this space is $BUC(X, B)$. For the special case $B = \mathbf{R}$, we write shortly $C(X)$ for $C(X, \mathbf{R})$ and $C_{uc}(X)$ for $C_{uc}(X, \mathbf{R})$. Both of them are Banach spaces with the norm $\|f\|_{\sup} = \sup_x |f(x)|$. The space $C_{uc}(X)$ is a closed subspace of $C(X)$. Particular cases are the space $(\mathbf{R}^d, \sup)$ of $d$-dimensional vectors with the norm $\|y\|_{\sup} = \max_j |y_j|$ and the space $l_\infty \subset \mathbf{R}^\infty$ of bounded sequences with the norm $\|y\|_{\sup} = \sup_j |y_j|$.


Integration represents a pairing of measures and functions. Whenever the integral exists, one therefore often uses the notation $(f, \mu) = (\mu, f) = \int f(x)\,\mu(dx)$ for a function $f$ and a measure $\mu$. The space $\mathcal{M}(X)$ for a topological (in particular metric) space $X$ is the space of bounded signed measures on the Borel $\sigma$-algebra of $X$ (i.e., the $\sigma$-algebra that is generated by all open subsets). It is a Banach space with respect to the norm $\|\mu\| = \sup\{|(f, \mu)| : f \in C(X), \|f\|_{\sup} \le 1\}$. This norm coincides with the full variation norm $\|\mu\| = (1, \mu_+ + \mu_-)$, where $\mu = \mu_+ - \mu_-$ is the Hahn–Jordan decomposition of $\mu$ into its positive and negative parts. The positive measure $|\mu| = \mu_+ + \mu_-$ is often referred to as the total variation measure of $\mu$. The subset of positive measures is denoted by $\mathcal{M}^+(X)$. The elements of $\mathcal{M}^+(X)$ that have a total measure of 1 are called probability measures, and the set of these measures is denoted by $\mathcal{P}(X)$.

An important example of measures are the Dirac measures or atoms $\delta_x$, which assign the measure 1 to the point $x$ and zero to the complement of $x$. A measure $\mu \in \mathcal{M}(X)$ is called discrete or atomic if $\mu = \sum_{j=1}^{\infty} a_j \delta_{x_j}$ with some summable sequence $\{a_j\}$ of numbers and some sequence $\{x_j\}$ of points in $X$. A measure is called continuous if it contains no atoms (i.e., any point is a set of zero measure). It is known (and easy to show) that the sets of discrete and continuous measures are closed Banach subspaces in $\mathcal{M}(X)$.

Occasionally, we shall use unbounded measures $\mu$, which will always be taken from the class of the so-called Radon measures $\mathcal{M}_R(X)$. These measures have the property that $|\mu|(A)$ is finite for any compact set $A$. For $\mu_1, \mu_2 \in \mathcal{M}_R(X)$, the complex-valued function $\mu_1 + i\mu_2$ on Borel subsets of $X$ is called a complex Radon measure. The space of such measures is denoted by $\mathcal{M}^{\mathbf{C}}_R(X)$. For $\mu = \mu_1 + i\mu_2$, the total variation measure of $\mu$ is defined as the positive measure $|\mu| = |\mu_1| + |\mu_2|$. The space of bounded complex measures is denoted by $\mathcal{M}^{\mathbf{C}}(X)$.
An important special case of unbounded measures and functions are weighted spaces. For a continuous non-negative function $L$ on $X$, let the weighted function space $C_L(X)$ be the space of continuous functions on $X$ with the finite norm
\[
\|f\|_{C_L(X)} = \inf\{K : |f(x)| \le K L(x) \text{ for all } x\}. \tag{1.2}
\]
Similarly, one defines the weighted measure space $\mathcal{M}(X, L)$ of Borel measures on $X$ with the finite norm
\[
\|\mu\|_{\mathcal{M}(X,L)} = \int_X L(x)\,|\mu|(dx) = \sup\{(g, \mu) : \|g\|_{C_L(X)} \le 1\}. \tag{1.3}
\]

If $L$ is strictly positive, then the above-defined weighted spaces are Banach spaces and all measures from $\mathcal{M}(X, L)$ are Radon measures.

If $X$ is a locally compact space, the closed subspace $C_\infty(X)$ (respectively $C_\infty(X, B)$ for a Banach space $B$) of $C(X)$ (respectively of $C(X, B)$) consists of functions that vanish at infinity, i.e., such functions $f$ that for any $\epsilon > 0$ there exists a compact $K$ such that $|f(x)| < \epsilon$ for $x \notin K$. Similarly, for a positive function $L$ on $X$ the space $C_{L,\infty}(X)$ is the closed subspace of $C_L(X)$ which contains functions $f$ such that $f(x)/L(x)$ vanishes at infinity. The fundamental Riesz–Markov theorem states that $\mathcal{M}(X)$ coincides with the dual space $[C_\infty(X)]^*$, which defines the $*$-weak topology on $\mathcal{M}(X)$. For the space $C_\infty(\mathbf{N})$, which is the closed subspace of $l_\infty$ containing sequences $\{y_1, y_2, \dots\}$ such that $y_n \to 0$ as $n \to \infty$, we use the short notation $c_\infty$. (Another established notation for this space is $c_0$.)

If a positive Borel measure $\mu$ is chosen on $X$, the spaces of (the equivalence classes of) integrable functions $L_p(X, \mu)$, $p \ge 1$, are defined in the usual way. They are equipped with the $p$-norm
\[
\|f\|_p = \left( \int |f(x)|^p \,\mu(dx) \right)^{1/p}.
\]
Special cases are the spaces $l_p \subset \mathbf{R}^\infty$ of sequences with the bounded norm
\[
\|y\|_p = \left( \sum_{j=1}^{\infty} |y_j|^p \right)^{1/p}
\]

and their finite-dimensional subsets $\mathbf{R}^d$. As usual, we write $L_p(\Omega)$ for $\Omega \subset \mathbf{R}^n$, if the measure is assumed to be Lebesgue. The space $L_\infty(X, \mu)$ consists of the equivalence classes of essentially bounded functions on $X$ with the ess-sup-norm (the supremum obtained by ignoring sets of measure zero), denoted by $\|f\|_\infty$. For $X = \mathbf{N}$, $L_\infty(X, \mu)$ turns into the space $l_\infty$ of bounded sequences, where the sup-norm and ess-sup-norm coincide.

It is a known fact that $[L_p(X, \mu)]^* = L_q(X, \mu)$ for $p \ge 1$ with $1/p + 1/q = 1$, including the case $p = 1$ and $q = \infty$, but excluding $p = \infty$. By this correspondence, an element $f \in L_q(X, \mu)$ defines a functional on $L_p(X, \mu)$ via integration: $f(g) = \int f(x) g(x)\,\mu(dx)$. In particular, $l_p^* = l_q$. Moreover, $l_1 = (c_\infty)^*$.

According to the tradition, the weak topology on $\mathcal{M}(X)$ (also called narrow topology by some authors) is meant to be the weak topology with respect to the pair $(C(X), \mathcal{M}(X))$. Note that this definition differs from the general definition of weak topology as given above. Only for $l_1 = \mathcal{M}(\mathbf{N}) = L_1(\mathbf{N})$ (where in the notation $L_1(\mathbf{N})$ the set $\mathbf{N}$ is supposed to be equipped with the counting measure, which assigns the unit measure to any point) the dual space is $l_\infty = C(\mathbf{N})$ and the weak topology for measures coincides with the weak topology in the general sense of functional analysis.

When working with the weak topology, Prokhorov's compactness criterion plays an important role: A bounded family $\mu_t$ of Borel measures on a complete metric space $X$ is relatively compact in the weak topology if it is tight, that is, for any $\epsilon > 0$ there exists a compact set $K \subset X$ such that $\mu_t(X \setminus K) < \epsilon$ for all $t$. For example, let $\mathcal{P}^p(\mathbf{R}^d)$ denote the subset of $\mathcal{P}(\mathbf{R}^d)$ that contains probability measures $\mu$ with a finite $p$th order moment, i.e., with $\int |x|^p \mu(dx) < \infty$. Then, for any $p > 0$ and $\lambda > 0$, the set
\[
M = \left\{ \mu \in \mathcal{P}^p(\mathbf{R}^d) : \int |x|^p \mu(dx) \le \lambda \right\}
\]
is compact in the weak topology of $\mathcal{P}(\mathbf{R}^d)$. More generally, for any non-negative continuous function $L$ on a locally compact space $X$ such that $L(x) \to \infty$ as $x \to \infty$, the set
\[
\mathcal{M}^+_{\le \lambda}(X, L) = \left\{ \mu \in \mathcal{M}^+(X) : \int_X L(x)\,\mu(dx) \le \lambda \right\} \tag{1.4}
\]

is compact in the weak topology of $\mathcal{M}(X)$ for any $\lambda > 0$.

A sequence of measures $\mu_n$ is said to converge vaguely to $\mu$, if $(\mu_n, \phi)$ converges to $(\mu, \phi)$ for any continuous $\phi$ with a compact support. For bounded sequences $\mu_n$ on a locally compact space, the vague and $*$-weak convergence coincide. However, unbounded sequences of measures can converge vaguely (and not $*$-weakly) to an unbounded measure.

Remark 3. A sequence of measures $\mu_n$ converges vaguely to a measure $\mu$ if and only if $\mu_n$ converges in the space $D'$ of generalized functions, see Section 1.9.

Let us point out the crucial difference between strong Banach topologies and weak topologies in $\mathcal{M}(X)$ for the example $X = \mathbf{R}^d$. Recall that a measure $\mu \in \mathcal{M}(\mathbf{R}^d)$ is called absolutely continuous (respectively singular) with respect to the Lebesgue measure if $\mu$ has a density with respect to the Lebesgue measure (respectively, if the whole measure is concentrated on a set of zero Lebesgue measure). The famous Lebesgue decomposition theorem states that any continuous measure in $\mathbf{R}^d$ can be uniquely represented as the sum of an absolutely continuous and a singular measure. Moreover, the sets of absolutely continuous and singular measures are closed Banach subspaces in $\mathcal{M}(\mathbf{R}^d)$. In the weak topology, the situation is different: Here, the space of absolutely continuous measures (which is naturally isomorphic to the space $L_1(\mathbf{R}^d)$) is weakly dense in $\mathcal{M}(X)$. In fact, let $\phi(x)$ be a mollifier, i.e., a continuous even function $\mathbf{R}^d \to [0, 1]$ that is supported on the unit ball and has the unit norm in $L_1(\mathbf{R}^d)$. For any $\mu \in \mathcal{M}(X)$, let us define
\[
f_n(x) = n^d \int \phi(n(x - y))\,\mu(dy). \tag{1.5}
\]
Then the sequence of measures with the continuous densities $f_n$ converges weakly to $\mu$, as $n \to \infty$.

Exercise 1.1.2. Prove this statement.

Exercise 1.1.3. Show that the set of discrete measures is weakly dense in $\mathcal{M}(X)$.
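The mollification (1.5) can be tried numerically in dimension $d = 1$. The following is only a sketch under assumptions that are not part of the text: the measure is a finite signed combination of atoms, the mollifier is the triangular function $\phi(u) = \max(0, 1 - |u|)$, the test function is $\cos$, and the names `phi` and `pair_with_smoothed` are hypothetical helpers. It illustrates, but of course does not prove, that $(g, f_n) \to (g, \mu)$.

```python
import math

def phi(u):
    """Triangular mollifier on [-1, 1]: even, continuous, values in [0, 1], integral 1."""
    return max(0.0, 1.0 - abs(u))

def pair_with_smoothed(g, atoms, n, steps=2000):
    """Approximate (g, f_n) for mu = sum_j a_j * delta_{x_j} in dimension d = 1.

    Here f_n(x) = n * sum_j a_j * phi(n (x - x_j)), as in (1.5). After the
    substitution u = n (x - x_j) the pairing equals
    sum_j a_j * int phi(u) g(x_j + u / n) du,
    which is evaluated by the midpoint rule on [-1, 1].
    """
    h = 2.0 / steps
    total = 0.0
    for a, x in atoms:
        for i in range(steps):
            u = -1.0 + (i + 0.5) * h
            total += a * phi(u) * g(x + u / n) * h
    return total

# Assumed example: a signed discrete measure with two atoms (weight, position).
atoms = [(0.5, -1.0), (-0.25, 2.0)]
exact = sum(a * math.cos(x) for a, x in atoms)  # (g, mu) for g = cos

errors = [abs(pair_with_smoothed(math.cos, atoms, n) - exact) for n in (1, 10, 100)]
for n, err in zip((1, 10, 100), errors):
    print(n, err)
```

For a smooth test function such as $\cos$ the discrepancy shrinks as $n$ grows, in line with the weak convergence claimed above; for merely continuous bounded $g$ no rate should be expected.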


For the analysis of ODEs, Lipschitz-continuous functions play a crucial role. We define $C_{bLip}(X)$ as the space of bounded Lipschitz functions with the norm
\[
\|f\|_{bLip} = \|f\| + \|f\|_{Lip}, \qquad \|f\|_{Lip} = \sup_{x \neq y} \frac{|f(x) - f(y)|}{\rho(x, y)}, \tag{1.6}
\]
where $\rho$ is the metric on $X$. The Lipschitz constant itself does not represent a norm, since $\|f\|_{Lip} = 0$ for all constant functions $f$. One says that $f$ is locally Lipschitz if every point $x$ has a neighbourhood on which the Lipschitz constant of $f$ is finite. If we want to stress which metric $\rho$ is used, we can use the notation $C_{bLip(\rho)}(X)$ both for the space itself and for the norms. For instance, using the $l_1$-norm for vectors $x \in \mathbf{R}^d$, one has
\[
\|f\|_{Lip(1)} = \sup_{x \neq y} \frac{|f(x) - f(y)|}{|x - y|_1} = \sup_j \sup \frac{|f(x) - f(y)|}{|x_j - y_j|}, \tag{1.7}
\]

where the last supremum is over the pairs $x, y$ that differ only in their $j$th coordinate.

Exercise 1.1.4. Prove the second equation in (1.7).

For $X$ an open subset of $\mathbf{R}^d$, let $C^k(X)$ denote the space of $k$ times continuously differentiable functions on $X$ with uniformly bounded derivatives, equipped with the norm
\[
\|f\|_{C^k(X)} = \|f\| + \sum_{j=1}^{k} \|f^{(j)}\|, \tag{1.8}
\]
where $\|f^{(j)}\|$ is the supremum of the absolute values of all partial derivatives of $f$ of order $j$. In particular, for a differentiable function, we find $\|f\|_{C^1} = \|f\|_{bLip(1)}$.

For $X$ a closed convex subset of $\mathbf{R}^d$ with a nonempty interior $\mathrm{Int}(X)$, let $C^k(X)$ denote the closed subspace of $C^k(\mathrm{Int}(X))$ that contains functions whose derivatives up to the $k$th order all have a continuous extension to $X$. For a convex subset $X \subset \mathbf{R}^d$, we denote by $C^k_\infty(X)$ (respectively $C^k_{bLip}(X)$) the closed subspace of functions $f$ from $C^k(X)$ such that $f$ and all its derivatives up to order $k$ belong to $C_\infty(X)$ (respectively with all derivatives of up to and including order $k$ belonging to $C_{bLip}(X)$). For a Banach space $B$, the $B$-valued spaces $C^k_\infty(X, B)$ and $C^k_{bLip}(X, B)$ are similarly defined.

When analysing measure-valued evolution, more exotic spaces are sometimes needed. For example, the spaces $C^{k \times k}(\mathbf{R}^{2d})$ are the subspaces of $C(\mathbf{R}^{2d})$ consisting of functions $f$ such that the partial derivatives $\partial^{\alpha+\beta} f / \partial x^\alpha \partial y^\beta$ with multi-indices $\alpha, \beta$, $|\alpha| \le k$, $|\beta| \le k$, are well defined and belong to $C(\mathbf{R}^{2d})$. The supremum of the norms of these derivatives provides a natural norm for these spaces.

Let us now recall the celebrated Monge–Kantorovich theorem. It states that the weak topology on the subset $\mathcal{P}^1(\mathbf{R}^d)$ of probability measures with a finite first moment can be metricized by the metric
\[
d_{MK}(\mu_1, \mu_2) = \sup\{|(f, \mu_1 - \mu_2)| : |f(x) - f(y)| \le |x - y|\}.
\]
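For intuition, the metric $d_{MK}$ can be computed by hand in a very special case. The sketch below relies on two standard facts that are not stated in the text and are used here as assumptions: on the real line, for two empirical measures with equally many, equally weighted atoms, the optimal transport cost is obtained by matching the sorted samples, and by the Kantorovich–Rubinstein duality this cost equals the supremum over 1-Lipschitz test functions in the definition above. The function names are hypothetical.

```python
def mk_distance_1d(xs, ys):
    """d_MK between two empirical measures on R with equally many, equally weighted atoms.

    Assumption: the monotone (sorted) matching is optimal in one dimension.
    """
    if len(xs) != len(ys):
        raise ValueError("this sketch assumes equally many atoms")
    return sum(abs(a - b) for a, b in zip(sorted(xs), sorted(ys))) / len(xs)

def pairing_gap(f, xs, ys):
    """|(f, mu1 - mu2)| for the empirical measures of the samples xs and ys."""
    return abs(sum(f(x) for x in xs) - sum(f(y) for y in ys)) / len(xs)

# Assumed example data: the second sample is the first one shifted by 0.5.
xs = [0.0, 1.0, 3.0]
ys = [3.5, 0.5, 1.5]

distance = mk_distance_1d(xs, ys)
gap_identity = pairing_gap(lambda x: x, xs, ys)              # f(x) = x is 1-Lipschitz
gap_truncated = pairing_gap(lambda x: min(x, 1.0), xs, ys)   # another 1-Lipschitz f
print(distance, gap_identity, gap_truncated)
```

In this example the test function $f(x) = x$ already attains the supremum, while the truncated function gives a strictly smaller value, as any 1-Lipschitz function must give a value not exceeding $d_{MK}$.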


Let us single out an important corollary of this fact (and of the above-mentioned facts on tightness), which is very useful for working with spaces of differentiable functions.

Proposition 1.1.1. For any $k \ge 1$ and $\lambda > 0$, the set
\[
\mathcal{P}^1_{\le \lambda}(\mathbf{R}^d) = \left\{ \mu \in \mathcal{P}(\mathbf{R}^d) : \int |x|\,\mu(dx) \le \lambda \right\}
\]
is a compact subset of the dual Banach space $(C^k(\mathbf{R}^d))^*$.

The existence of the metric $d_{MK}$ for metricizing the weak convergence allows for a better quantification of the classes of weakly continuous functions. For instance, a handy class are the weakly Lipschitz-continuous functions on measures that have a finite weak Lipschitz constant
\[
\|F\|_{weakLip} = \sup\{|F(\mu_1) - F(\mu_2)| : d_{MK}(\mu_1, \mu_2) \le 1\}. \tag{1.9}
\]

The metric $d_{MK}$ is not the only handy metric for metricizing the weak topology. Let us indicate another one. For that purpose, let $X$ be a locally compact space such that the space $C_\infty(X)$ is separable (which is the case, e.g., when $X$ is a locally compact metric space), so that there exists a countable set of functions $\phi_n \in C_\infty(X)$, $n \in \mathbf{N}$, of unit norm such that their finite linear combinations are dense in $C_\infty(X)$. Then the function
\[
d(\mu, \nu) = \sum_{n} 2^{-n} \frac{|(\phi_n, \mu - \nu)|}{1 + |(\phi_n, \mu - \nu)|} \tag{1.10}
\]

defines a distance that metricizes the $*$-weak topology on $\mathcal{M}(X)$. On weakly compact sets (1.4), weak and $*$-weak topologies coincide, which has the following implication.

Proposition 1.1.2. For any non-negative continuous function $L$ on a locally compact metric space $X$ such that $L(x) \to \infty$ as $x \to \infty$, the weak topology of compact sets (1.4) can be given by the metric (1.10).

Finally, let us suggest a couple of exercises on the properties of the spaces of sequences $l_p$.

Exercise 1.1.5. Let $A = (A_{jk})$, $j, k \in \mathbf{N}$, be an infinite matrix defining a linear operator $A$ in $l_\infty$ according to the usual rule: $(Ax)_j = \sum_k A_{jk} x_k$. Check the following statements:
\[
\|A\|_{l_\infty \to l_\infty} = \sup_j \sum_k |A_{jk}|, \qquad \|A\|_{l_1 \to l_\infty} = \sup_j \sup_k |A_{jk}|, \tag{1.11}
\]
\[
\|A\|_{l_1 \to l_1} = \sup_k \sum_j |A_{jk}|, \qquad \|A\|_{l_\infty \to l_1} \le \sum_j \sum_k |A_{jk}|. \tag{1.12}
\]
Find an example which shows that (1.12) can be a strict inequality.
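The formulas of Exercise 1.1.5 can be sanity-checked on a finite matrix, viewed as an infinite matrix padded with zeros. The following sketch is an illustration only, not a solution of the exercise; the function names and the $2 \times 2$ example are assumptions made here. The brute-force routine uses the fact that $x \mapsto \|Ax\|_1$ is convex, so its maximum over the unit ball of the finite-dimensional sup-norm is attained at a vertex, i.e., a vector of signs.

```python
from itertools import product

def norm_linf_to_linf(A):
    """First formula in (1.11): the largest absolute row sum."""
    return max(sum(abs(a) for a in row) for row in A)

def norm_l1_to_linf(A):
    """Second formula in (1.11): the largest absolute entry."""
    return max(abs(a) for row in A for a in row)

def norm_l1_to_l1(A):
    """First formula in (1.12): the largest absolute column sum."""
    return max(sum(abs(row[k]) for row in A) for k in range(len(A[0])))

def norm_linf_to_l1_bruteforce(A):
    """Exact l_inf -> l_1 norm of a small matrix by enumerating all sign vectors."""
    best = 0.0
    for signs in product((-1.0, 1.0), repeat=len(A[0])):
        image_l1 = sum(abs(sum(a * s for a, s in zip(row, signs))) for row in A)
        best = max(best, image_l1)
    return best

# Assumed example matrix, chosen so that the bound in (1.12) is not attained.
A = [[1.0, 1.0],
     [1.0, -1.0]]
entry_sum = sum(abs(a) for row in A for a in row)

print(norm_linf_to_linf(A), norm_l1_to_linf(A), norm_l1_to_l1(A))
print(norm_linf_to_l1_bruteforce(A), entry_sum)
```

For this matrix the exact norm from $l_\infty$ to $l_1$ is 2, whereas the sum of the absolute values of the entries is 4, so the inequality in (1.12) is strict here.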


Exercise 1.1.6. A symmetric matrix $A$ as defined in the previous exercise also defines a bilinear operator that is an element of $L^2(l_\infty, \mathbf{R})$. Show that
\[
\|A\|_{L^2(l_\infty, \mathbf{R})} \le \sum_j \sum_k |A_{jk}|, \qquad \|A\|_{L^2(l_1, \mathbf{R})} \le \sup_j \sup_k |A_{jk}|. \tag{1.13}
\]

Exercise 1.1.7. A sequence $y^{(m)}$ of elements of $l_\infty$ converges $*$-weakly to $y$, i.e., $(y^{(m)}, x) \to (y, x)$ for any $x \in l_1$, if and only if it is uniformly bounded and each coordinate converges.

Exercise 1.1.8. A sequence $y^{(m)}$ of elements of $l_1$ converges weakly to $y \in l_1$ if and only if it converges in the norm of $l_1$.

1.2 Smooth functions on Banach spaces

Let $M$ be a closed convex subset of a Banach space $B$ and $F$ a real function on $M$. We shall assume for convenience that the linear space that is generated by $M$ coincides with $B$. This can always be achieved by reducing $B$ appropriately. The directional derivative (less popular names are first variation or Gâteaux derivative, the latter term being usually linked to linearity, see below) of $F$ at $Y$ in the direction $\xi \in B$ (so that $h\xi \in M - Y$ for some $h > 0$) is defined as
\[
D_\xi F(Y) = DF(Y)[\xi] = \lim_{h \to 0+} \frac{F(Y + h\xi) - F(Y)}{h}. \tag{1.14}
\]

Hereby, the notation $h \to 0+$ means that $h \to 0$ through positive values only.

Remark 4. The different notations $D_\xi$ and $D[\xi]$ are introduced in order to use them in different contexts, where subscripts or brackets may be overloaded with other symbols.

One says that $F$ is directionally differentiable on $M$ if this derivative exists at all points of $M$ and in all eligible directions $\xi$ (so that $h\xi \in M - Y$ for some $h > 0$). We say that the directional derivative is bounded on $M$ if $|D_\xi F(Y)| \le C\|\xi\|$ for all $Y, \xi$ and a constant $C$. It is locally bounded if it is bounded for $Y$ from any bounded subset of $M$. From the definition, it follows that if $F$ is directionally differentiable on $M$, then
\[
D_{a\xi} F(Y) = a D_\xi F(Y) \tag{1.15}
\]
for all $a > 0$. If the directional derivative is locally bounded, then for any $Y \in M$ and $\xi \in M - Y$ the function $F(Y + s\xi)$ of a real variable $s \in [0, 1]$ has bounded right and left derivatives everywhere, and hence by the Lebesgue theorem it is differentiable almost everywhere and equals the integral over its derivative:
\[
F(Y + \xi) - F(Y) = \int_0^1 \frac{d}{ds} F(Y + s\xi)\,ds = \int_0^1 D_\xi F(Y + s\xi)\,ds. \tag{1.16}
\]
This is the first-order Taylor expansion for functions on Banach spaces.
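A finite-dimensional toy example may help to see what (1.14)–(1.16) do and do not assert. The sketch below takes $B = \mathbf{R}^2$ and the function $F(y) = |y_1| + y_2^2$; this choice, the step sizes and the helper names are assumptions made for illustration and do not come from the text. The function is directionally differentiable everywhere with a locally bounded derivative, so (1.16) applies, yet at the origin its directional derivative is positively homogeneous as in (1.15) without being additive in $\xi$.

```python
def F(y):
    """Assumed example on R^2: directionally differentiable, but not smooth at y[0] = 0."""
    return abs(y[0]) + y[1] ** 2

def directional_derivative(F, Y, xi, h=1e-7):
    """One-sided difference quotient approximating the limit (1.14)."""
    shifted = [a + h * b for a, b in zip(Y, xi)]
    return (F(shifted) - F(Y)) / h

def taylor_integral(F, Y, xi, steps=1000):
    """Midpoint-rule approximation of the right-hand side of (1.16)."""
    total = 0.0
    for i in range(steps):
        s = (i + 0.5) / steps
        point = [a + s * b for a, b in zip(Y, xi)]
        total += directional_derivative(F, point, xi) / steps
    return total

# The segment from Y to Y + xi crosses the kink {y[0] = 0} at s = 1/3.
Y, xi = [-1.0, 0.5], [3.0, 1.0]
increment = F([a + b for a, b in zip(Y, xi)]) - F(Y)
integral = taylor_integral(F, Y, xi)
print(increment, integral)

# At the origin: homogeneity (1.15) holds, additivity in the direction fails.
origin = [0.0, 0.0]
d_plus = directional_derivative(F, origin, [1.0, 0.0])
d_minus = directional_derivative(F, origin, [-1.0, 0.0])
d_double = directional_derivative(F, origin, [2.0, 0.0])
d_zero = directional_derivative(F, origin, [0.0, 0.0])
print(d_plus, d_minus, d_double, d_zero)
```

The increment and the integral agree up to quadrature error, while the derivatives in the directions $(1, 0)$ and $(-1, 0)$ are both close to 1 although the derivative in the direction of their sum is 0.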


The existence of a bounded directional derivative does not imply that the mapping $D_\xi F(Y)$ is linear in $\xi$. If the mapping $D_\xi F(Y)$ is linear in $\xi$, the linear operator $D_\xi F(Y)$ is usually called the Gâteaux derivative. If this is the case for all $Y$, then $F$ is called Gâteaux-differentiable on $M$. The standard condition that is sufficient for this linearity involves the continuity with respect to $Y$:

Proposition 1.2.1. If $F$ has a bounded (respectively locally bounded) directional derivative on $M$ such that $D_\xi F(Y)$ is continuous in $Y$ for any $\xi$, then $D_\xi F(Y)$ is linear in $\xi$ for any $Y$, $D_\xi F(Y)$ is a continuous function of two variables $\xi$ and $Y$, and $F(Y)$ is Lipschitz-continuous on $M$ (respectively locally).

Proof. By the first-order Taylor expansion and (1.15), we get
\[
\begin{aligned}
F(Y + s(\xi_1 + \xi_2)) &= F(Y + s\xi_1) + [F(Y + s\xi_1 + s\xi_2) - F(Y + s\xi_1)] \\
&= F(Y + s\xi_1) + s \int_0^1 D_{\xi_2} F(Y + s\xi_1 + hs\xi_2)\,dh \\
&= F(Y + s\xi_1) + \int_0^s D_{\xi_2} F(Y + s\xi_1 + h\xi_2)\,dh.
\end{aligned}
\]
Due to the continuity of $D_{\xi_2} F(Z)$ at the point $Z = Y$, it follows that
\[
\lim_{s \to 0+} \frac{1}{s} [F(Y + s(\xi_1 + \xi_2)) - F(Y)] = D_{\xi_1} F(Y) + D_{\xi_2} F(Y),
\]
that is, the additivity of $D_\xi F(Y)$. On the other hand, passing to the limit $s \to 0+$ in the equation
\[
\begin{aligned}
F(Y + s\xi) - F(Y) &= \int_0^s D_\xi F(Y + h\xi)\,dh \\
&= -[F(Y) - F(Y + s\xi)] = -\int_0^s D_{-\xi} F(Y + s\xi - h\xi)\,dh
\end{aligned} \tag{1.17}
\]

yields $D_{-\xi} F(Y) = -D_\xi F(Y)$, which extends (1.15) to negative $a$ and thus completes the proof of the linearity of $D_\xi F(Y)$. The continuity of $D_\xi F(Y)$ with respect to both variables follows from this linearity. The Lipschitz-continuity of $F$ follows from (1.16). □

Remark 5. For the first-order Taylor expansion to hold, only the local boundedness of $D_\xi F(Y)$ for any $\xi$ is needed. Similarly, for linearity in $\xi$ to hold, only the local boundedness of $D_\xi F(Y)$ and its continuity in $Y$ for any $\xi$ are needed. For the finite-dimensional case, this would imply the boundedness in $\xi$. But in the infinite-dimensional case, the boundedness in $\xi$ (or the continuity of $D_\xi F(Y)$ with respect to $\xi$) represents an additional assumption.


Defining the norm of the linear mapping $D_\xi F(Y) = DF(Y)[\xi]$ in the usual way, i.e., $\|DF(Y)\| = \sup\{|DF(Y)[\xi]| : \|\xi\| \le 1\}$, one obtains from (1.16) the formula for finite increments:
\[
|F(Y) - F(Z)| \le \|Y - Z\| \sup\{\|DF(Z + s(Y - Z))\| : s \in [0, 1]\}. \tag{1.18}
\]
In many cases, a version of differentiability that is stronger than that of Gâteaux becomes important. One says that a function $F$ on $M$ is Fréchet-differentiable at $Y$ if there exists an element $DF(Y) \in B^*$, called the derivative of $F$ at $Y$, such that for any $\epsilon > 0$ there exists $\delta > 0$ such that
\[
|F(Y + \xi) - F(Y) - DF(Y)[\xi]| \le \epsilon \|\xi\| \tag{1.19}
\]
for any $\xi$ with $Y + \xi \in M$ and $\|\xi\| \le \delta$. In other words, the condition reads $|F(Y + \xi) - F(Y) - DF(Y)[\xi]| = o(\|\xi\|)$, as $\xi \to 0$. In this context, the terms 'derivative', 'strong derivative' and 'Fréchet derivative' all mean the same thing. Notice that the existence of the Gâteaux derivative (1.14) means that for any $\xi$ and any $\epsilon > 0$ there exists $\delta > 0$ (depending on $\xi$) such that
\[
|F(Y + h\xi) - F(Y) - DF(Y)[h\xi]| \le \epsilon h \|\xi\| \tag{1.20}
\]

for $h \in (0, \delta)$. Therefore, the Fréchet derivative of $F$ is also necessarily its Gâteaux derivative. The difference is that (1.19) holds uniformly for all $\xi$ in some bounded domain.

At this point, an important peculiarity of infinite-dimensional settings has to be mentioned. The continuity of $DF(Y)[\xi]$ as a function of two variables clearly implies that the mapping $Y \mapsto DF(Y)$ is a continuous mapping $M \to B^*$ with $B^*$ considered in its $*$-weak topology, but not necessarily in the norm topology. For $B = \mathbf{R}^d$, these topologies coincide, which implies that under the assumptions of Proposition 1.2.1 the Gâteaux derivative coincides with the Fréchet derivative (according to the next result). In general Banach spaces, one has to impose the norm continuity as an additional assumption.

Proposition 1.2.2. Under the assumptions of Proposition 1.2.1, let us assume that the mapping $Y \mapsto DF(Y)$ is continuous as a mapping from $M$ to $B^*$ with the norm topology that is used in $B^*$. Then $F$ is Fréchet-differentiable at any point, with the derivative being given by the directional (or Gâteaux) derivative.

Proof. By the first-order Taylor expansion (1.16), we find
\[
\begin{aligned}
|F(Y + \xi) - F(Y) - DF(Y)[\xi]| &= \left| \int_0^1 (DF(Y + s\xi) - DF(Y))[\xi]\,ds \right| \\
&\le \|\xi\| \sup\{\|DF(Y + s\xi) - DF(Y)\| : s \in [0, 1]\},
\end{aligned}
\]
implying (1.19) by the continuity of $DF$. □




Let us denote by $C^1_{\text{Fréchet}}(M)$ or simply $C^1(M)$ the space of bounded continuous functions $F$ on $M$ that have bounded Fréchet derivatives $DF(Y)$ such that the mapping $Y \mapsto DF(Y)$ is continuous in the norm topologies of $B$ and $B^*$. The space $C^1(M)$ is equipped with the norm
\[
\|F\|_{C^1(M)} = \sup_{Y \in M} |F(Y)| + \sup_{Y \in M} \|DF(Y)\|_{B^*}. \tag{1.21}
\]

Let us denote by $C^1_{Gat}(M)$ the space of bounded continuous functions $F$ on $M$ that have bounded Gâteaux derivatives $DF(Y)$ such that $DF(Y)[\xi]$ depends continuously on $Y$ for any $\xi$ (or equivalently, by Proposition 1.2.1, so that $DF(Y)[\xi]$ is a continuous function of two variables), equipped with the same norm (1.21). It follows from Proposition 1.2.2 that $C^1(M)$ is a closed subspace of $C^1_{Gat}(M)$ and $F \in C^1_{Gat}(M)$ belongs to $C^1(M)$ whenever the mapping $Y \mapsto DF(Y)$ is continuous in the norm topologies of $B$ and $B^*$.

The higher-order derivatives are defined recursively. For instance,
\[
D^2 F(Y)[\xi, \eta] = D\left(DF(Y)[\xi]\right)[\eta].
\]

Proposition 1.2.3. If the derivatives $D^l F(Y)[\xi_1, \dots, \xi_l]$, $l = 1, \dots, k$, are well defined and depend continuously on $Y$, then the multi-linear forms $D^l F(Y)[\xi_1, \dots, \xi_l]$, $l = 1, \dots, k$, are invariant under any permutations of $\xi_1, \dots, \xi_l$.

Proof. Let us show this for $k = 2$, all other cases work accordingly. Applying twice the first-order Taylor expansion (1.16) yields
\[
\begin{aligned}
D\left(DF(Y)[\xi_2]\right)[\xi_1] &= \lim_{h_1 \to 0} \frac{1}{h_1} \lim_{h_2 \to 0} \frac{1}{h_2} \left[ (F(Y + h_1\xi_1 + h_2\xi_2) - F(Y + h_1\xi_1)) - (F(Y + h_2\xi_2) - F(Y)) \right] \\
&= \lim_{h_1 \to 0} \lim_{h_2 \to 0} \int_0^1 \int_0^1 ds_1\,ds_2\, D\left(DF(Y + s_1 h_1 \xi_1 + s_2 h_2 \xi_2)[\xi_2]\right)[\xi_1].
\end{aligned}
\]

Due to the assumptions of continuity, this repeated limit equals the joint limit $\lim_{h_1, h_2 \to 0}$. Hence it can be reversed and equals the repeated limit $\lim_{h_2 \to 0} \lim_{h_1 \to 0}$. □

The spaces $C^k(M)$ of $k$ times continuously differentiable functions on $M$ are defined recursively as the subsets of functions $F$ from $C^{k-1}(M)$ with the derivative $D^k F(Y)$ being uniformly bounded in $Y$ and such that the mapping $Y \mapsto D^k F(Y)$ is continuous in the norm topologies. The space $C^k(M)$ is equipped with the norm
\[
\|F\|_{C^k(M)} = \|F\|_{C^{k-1}(M)} + \sup_{Y \in M} \; \sup_{\|\xi_1\|, \dots, \|\xi_k\| \le 1} |D^k F(Y)[\xi_1, \dots, \xi_k]|. \tag{1.22}
\]


Similarly, one defines the space $C^k_{Gat}(M) \subset C^{k-1}_{Gat}(M)$ of $k$ times Gâteaux-differentiable functions with bounded derivatives such that $|D^k F(Y)[\xi_1, \dots, \xi_k]|$ is a continuous function of $(k + 1)$ variables, and equipped with the same norm (1.22). As above, $C^k(M)$ is a closed subspace of $C^k_{Gat}(M)$, and $F \in C^k_{Gat}(M)$ belongs to $C^k(M)$ whenever the mappings $Y \mapsto D^l F(Y)$ are continuous as the mappings between the Banach spaces $B$ and $L^l(B, \mathbf{R})$ for all $l = 1, \dots, k$.

Of course, these definitions depend on the norm that is used in $B$. For instance, the norm (1.8) is a special case of (1.22) if the norm $\|\cdot\|_1$ is chosen for $\mathbf{R}^d$. More generally, if $B = l_1$ and $M = l_1$ or $M = l_1^+$, then
\[
\|F\|_{C^1(M)} = \|F\|_{C(M)} + \sup_{x \in M} \sup_k \left| \frac{\partial F}{\partial x_k} \right|, \tag{1.23}
\]
\[
\|F\|_{C^2(M)} = \|F\|_{C^1(M)} + \sup_{x \in M} \sup_{k,l} \left| \frac{\partial^2 F}{\partial x_k \partial x_l} \right|. \tag{1.24}
\]

Moreover, if all $\partial F/\partial x_k$ are defined and the r.h.s. of (1.23) is bounded, then (i) $F \in C^1(M)$ if the mapping $x \mapsto \{\partial F/\partial x_k\}$ is continuous as a mapping from $l_1$ to $l_\infty$, and (ii) $F \in C^1_{Gat}(M)$ if the mapping $x \mapsto \partial F/\partial x_k$ is continuous for any $k$. From the first-order Taylor formula (1.16), it follows that for $F \in C^1_{Gat}(M)$,
\[
F(x) = F(0) + \sum_j x_j G_j(x), \tag{1.25}
\]
with continuous functions $G_j$. In fact, this holds with $G_j(x) = \int_0^1 (\partial F/\partial x_j)(sx)\,ds$.

By applying the first-order Taylor expansion to $D_\xi F(Y + s\xi)$ in (1.16) and then to the next derivative, one gets the Taylor expansion of any order. For instance, for $F \in C^2_{Gat}(M)$ the second-order Taylor expansion reads
\[
F(Y + \xi) - F(Y) = DF(Y)[\xi] + \int_0^1 (1 - s) D^2 F(Y + s\xi)[\xi, \xi]\,ds. \tag{1.26}
\]

As an insightful example, let us differentiate the determinant mapping $A \mapsto \det A$ for $A$ being a square matrix in $\mathbf{R}^d$. If $A$ is invertible, then
\[
D \det A[B] = \det A \,\mathrm{tr}(A^{-1} B). \tag{1.27}
\]
To obtain this formula, one can use the identity
\[
\det A = \exp\{\mathrm{tr} \ln A\}, \tag{1.28}
\]
which is valid whenever $\|A - 1\| < 1$. Recall that an analytic function $f(x) = \sum_{n=0}^{\infty} f_n x^n / n!$ with the radius of convergence $R$ can be defined as a mapping on the set of square matrices by the same series expansion, as long as their norm is less than $R$. This defines $\exp$ and $\ln$ in (1.28) whenever $\|A - 1\| < 1$. Formula (1.28) is proven by reducing it to the case of diagonal matrices, where it becomes straightforward. For sufficiently small $\epsilon$, one therefore has
\[
\begin{aligned}
\det(A + \epsilon B) &= \det A \det(1 + \epsilon A^{-1} B) = \det A \exp\{\mathrm{tr} \ln(1 + \epsilon A^{-1} B)\} \\
&= \det A \exp\{\epsilon \,\mathrm{tr}(A^{-1} B) + o(\epsilon)\} = \det A \,(1 + \epsilon \,\mathrm{tr}(A^{-1} B) + o(\epsilon)),
\end{aligned}
\]
which implies (1.27).
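Formula (1.27) is easy to test numerically. The sketch below compares it with a one-sided difference quotient for $2 \times 2$ matrices; the matrices $A$ and $B$, the step size and the helper names are assumptions chosen for illustration and are not taken from the text.

```python
def det2(A):
    """Determinant of a 2x2 matrix given as nested lists."""
    return A[0][0] * A[1][1] - A[0][1] * A[1][0]

def inv2(A):
    """Inverse of an invertible 2x2 matrix via the adjugate formula."""
    d = det2(A)
    return [[A[1][1] / d, -A[0][1] / d],
            [-A[1][0] / d, A[0][0] / d]]

def trace_of_product(A, B):
    """tr(AB) for 2x2 matrices, without forming the product."""
    return sum(A[i][k] * B[k][i] for i in range(2) for k in range(2))

def ddet_formula(A, B):
    """Right-hand side of (1.27): det A * tr(A^{-1} B)."""
    return det2(A) * trace_of_product(inv2(A), B)

def ddet_difference(A, B, eps=1e-6):
    """Difference quotient (det(A + eps B) - det A) / eps."""
    shifted = [[A[i][j] + eps * B[i][j] for j in range(2)] for i in range(2)]
    return (det2(shifted) - det2(A)) / eps

# Assumed example: A is invertible with det A = 5.
A = [[2.0, 1.0], [1.0, 3.0]]
B = [[0.5, -1.0], [2.0, 1.0]]

print(ddet_formula(A, B), ddet_difference(A, B))
```

Here $\det(A + \epsilon B) = 5 + 2.5\epsilon + 2.5\epsilon^2$ can also be expanded by hand, so both numbers should be close to 2.5, the difference quotient deviating by a term of order $\epsilon$.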

1.3 Additive and multiplicative integrals In this section, we briefly touch the theory of integration of vector-valued functions. For standard proofs, we shall refer to other sources, but we will clearly indicate the main peculiarities of the infinite-dimensional case. By the standard definition, if F is a given σ-algebra of subsets of a set Ω, and S a metric space, a mapping f : Ω → M is F -measurable if f −1 (M ) ∈ F for any Borel set M ⊂ S. In the theory of integration, it is convenient to modify the definition: If B is a Banach space, a mapping f : Ω → B is called F -measurable if f −1 (M ) ∈ F for any Borel set M ⊂ B and if the range of f is separable. These two definitions only coincide for separable Banach spaces B. The convenience of the second definition stems from the fact (see, e.g., Section 1 of [63] for a proof) that f : Ω → B is F - measurable in the second sense for any Banach space B if and only if there exists a sequence fn of B-valued F -step functions (i.e., finite linear combinations of functions taking some constant value on an element of F and vanishing otherwise) such that fn → f pointwise. This property can be used – and is indeed used – as an alternative definition. Obviously, it nicely matches the standard definition of the integral as the limit of the integrals over step functions. Remark 6. It can be also shown (see Section 1 of [63]) that for any F -measurable function f : Ω → B there exists a sequence fn of functions that are countable linear combinations of the functions taking some constant value on an element of F and vanishing otherwise, so that fn → f uniformly. For a measurable space (Ω, F , μ), where Ω is a set, F its σ-algebra and μ a σ-additive measure on F , a set N ⊂ Ω is called μ-negligible if N is a subset of a set of μ-measure zero. Recall that some property (of the points of Ω) is said to hold almost surely with respect to μ, if it holds everywhere apart from a negligible set. 
For Ω being a subset of Rd and F the Borel σ-algebra (the only case we are going to deal with), N is negligible if and only if for any  > 0 there exists a countable union U of open balls such that U ⊃ N and the total volume of U does not exceed . A mapping f : Ω → B is called μ-measurable if it is F -measurable apart from a negligible set. A function f : Ω → B is called Bochner-integrable with respect to μ if it is μ-measurable and f  is integrable with respect to μ. Bochner’s theorem states (see, e.g., [115] for a proof) that f is Bochner-integrable if and only if there


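exists a sequence $f_n$ of $B$-valued $\mathcal{F}$-step functions converging to $f$ almost surely with respect to $\mu$ (i.e., outside a $\mu$-negligible set) and such that
\[ \int \|f_n(\omega) - f(\omega)\| \, \mu(d\omega) \to 0, \quad n \to \infty. \]
The Bochner integral of $f$ over the set $M \in \mathcal{F}$ is then defined as
\[ \int_M f(\omega) \, \mu(d\omega) = \lim_{n \to \infty} \int_M f_n(\omega) \, \mu(d\omega), \]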

where the integrals of the step functions are defined in the usual way (as for real functions).

Remark 7. The extension of this integral to functions with values in general locally convex linear spaces is also well established, see, e.g., [209].

We are mostly interested in integrals of $B$-valued functions that are defined on the real line. For ODEs, the link between integration and differentiation is of key importance. We shall first prove two preliminary results on the unique identification of a function by its right or left derivative. Afterwards, we will discuss the link between integration and differentiation in more detail for the simpler Riemann integral.

Lemma 1.3.1. Let a continuous function $h : [0,1] \to B$, with a Banach space $B$, have the right derivative $h'_+(x)$ everywhere on $[0,1)$, which is a continuous function up to the boundary point. Then $h(x)$ is differentiable and $h'(x) = h'_+(x)$, so that
\[ h(x) - h(0) = \int_0^x h'(y)\,dy = \int_0^x h'_+(y)\,dy. \]

Proof. Let us introduce the integral $\psi(x) = \int_0^x h'_+(y)\,dy$. Then $\phi(x) = \psi(x) - h(x)$ is continuous and its right derivative vanishes everywhere. It remains to show that such $\phi$ must be a constant. Assuming that this is not the case, one can find two points $a < b$ in $[0,1]$ such that $\phi(b) \neq \phi(a)$. Set $\epsilon = \|\phi(b) - \phi(a)\| / (2(b - a))$. Since $\phi'_+(a) = 0$, the number
\[ c = \inf\Big\{ x \le b : \frac{\|\phi(x) - \phi(a)\|}{x - a} > \epsilon \Big\} \]
is well defined and $c \in (a, b)$. By continuity, we find $\|\phi(c) - \phi(a)\| = \epsilon (c - a)$. Since $\phi'_+(c) = 0$, there exists $d \in (c, b)$ such that $\|\phi(x) - \phi(c)\| \le \epsilon (x - c)$ for $x \in (c, d)$. For these $x$,
\[ \|\phi(x) - \phi(a)\| \le \epsilon (c - a) + \epsilon (x - c) = \epsilon (x - a), \]
which contradicts the definition of $c$.



We would like to extend this result to functions that have derivatives almost everywhere. Basic examples of real analysis (for instance, the famous Cantor staircase) show that a continuous function whose derivative vanishes almost everywhere


does not necessarily have to be a constant. For this to hold, a stronger continuity requirement is needed: a function $f : \mathbb{R} \to B$ is called absolutely continuous if for any $\epsilon > 0$ there exists $\delta > 0$ such that for any finite collection of pairwise disjoint intervals $(a_k, b_k)$ with $\sum_k (b_k - a_k) < \delta$ it follows that
\[ \sum_k \|f(b_k) - f(a_k)\| < \epsilon. \]

Lemma 1.3.2. Let an absolutely continuous function $h : [0,1] \to B$, with a Banach space $B$, have a vanishing right derivative $h'_+(x)$ almost everywhere on $[0,1)$. Then $h$ is a constant.

Proof. Since the set where $h'_+$ does not exist can be enclosed in a union of intervals with arbitrarily small total length, the total oscillation of $h$ on this union can be made arbitrarily small due to the absolute continuity of $h$. But then, the total oscillation on the remaining closed set must be zero by Lemma 1.3.1. Thus the total oscillation of $h$ vanishes.

Let us specifically mention the simple but important link between the Bochner integral and the usual integrals in the case of $B = L^1(X)$.

Lemma 1.3.3. Let $f_t$ be a bounded measurable curve $[0,T] \to L^1(X, \mu)$, where $X$ is a complete separable metric space and $\mu$ a finite Borel measure on $X$. Then $f_t$ is Bochner-integrable, and
\[ \Big( \int_0^T f_t \, dt \Big)(x) = \int_0^T f_t(x) \, dt \tag{1.29} \]

almost surely with respect to $\mu$, with the usual real-valued integral on the r.h.s.

Proof. Equation (1.29) holds for $L^1(X, \mu)$-valued step functions. Approximating $f$ by such functions $f^n$ so that
\[ \int_0^T \|f^n_t - f_t\|_{L^1(X,\mu)} \, dt \to 0, \]

as $n \to \infty$ (which is possible by the Bochner theorem), and passing to the limit yields (1.29) for a given $f_t$.

Let us now discuss Riemann integrals and their multiplicative extensions, which are crucial for the theory of linear ODEs. For that purpose, let $f$ be a function $f : [t, T] \to B$ with values in a Banach space $B$. For a partition $\Delta = \{t = t_0 < t_1 < \cdots < t_n = T\}$ of the interval $[t, T]$, let us define $|\Delta| = \max_j (t_j - t_{j-1})$. Let $s_j \in [t_{j-1}, t_j]$, $j = 1, \ldots, n$, be arbitrary points. The expression
\[ R(f, \Delta, \{s_j\}) = \sum_{j=1}^n f(s_j)(t_j - t_{j-1}) \]


is called the Riemann sum built on the triple $(f, \Delta, \{s_j\})$. The function $f$ is called Riemann-integrable if the limit of these sums exists as $|\Delta| \to 0$ and is independent of the choices of $\{s_j\}$. This limit is called the Riemann integral of $f$ on $[t, T]$:
\[ \int_t^T f(s) \, ds = \lim_{|\Delta| \to 0} R(f, \Delta, \{s_j\}). \tag{1.30} \]
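The definition of the Riemann sum can be tried out numerically. The following is a minimal sketch, not taken from the text: it assumes the finite-dimensional case $B = \mathbb{R}^2$, and the helper names (`riemann_sum`, `uniform_partition`, `random_tags`) and the test curve `f` are illustrative choices. The tags $s_j$ are drawn at random in each subinterval to mimic the arbitrariness of their choice in (1.30).

```python
import math
import random

def riemann_sum(f, partition, tags):
    """R(f, Delta, {s_j}) = sum_j f(s_j) (t_j - t_{j-1}) for f with values in R^m."""
    total = [0.0] * len(f(tags[0]))
    for j in range(1, len(partition)):
        width = partition[j] - partition[j - 1]
        value = f(tags[j - 1])            # tags[j-1] plays the role of s_j
        total = [acc + width * v for acc, v in zip(total, value)]
    return total

def uniform_partition(t, T, n):
    """Partition of [t, T] into n subintervals of equal length."""
    return [t + (T - t) * j / n for j in range(n + 1)]

def random_tags(partition, rng):
    """One arbitrary point s_j in each subinterval [t_{j-1}, t_j]."""
    return [rng.uniform(partition[j - 1], partition[j])
            for j in range(1, len(partition))]

def f(s):
    # Illustrative curve in R^2; its exact integral over [0, pi] is (0, 2).
    return [math.cos(s), math.sin(s)]

if __name__ == "__main__":
    rng = random.Random(0)
    for n in (10, 100, 1000):
        part = uniform_partition(0.0, math.pi, n)
        print(n, riemann_sum(f, part, random_tags(part, rng)))
```

As the mesh $|\Delta| = \pi/n$ shrinks, the printed sums should approach $(0, 2)$ whatever tags are drawn, in line with the independence of the limit from the choice of $\{s_j\}$.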

The main criterion of integrability is as follows.

Theorem 1.3.1. If $f$ is bounded on $[t, T]$ and continuous on $[t, T] \setminus N$, where $N$ is a subset of zero Lebesgue measure (or a Lebesgue negligible set), then $f$ is Riemann-integrable on any interval $[t_1, T_1]$, $t \le t_1 \le T_1 \le T$, and the integral $\int_t^s f(\tau)\,d\tau$ is a Lipschitz-continuous function of $s$:
\[ \Big\| \int_t^{s_1} f(\tau)\,d\tau - \int_t^{s_2} f(\tau)\,d\tau \Big\| \le |s_1 - s_2| \sup_{s \in [t,T]} \|f(s)\|. \]

Proof. A proof can be found in [117], where it is carried out even more generally for functions with values in Fréchet spaces. The proof is essentially the same as for real-valued functions.

As a direct consequence of Theorem 1.3.1 and Lemma 1.3.2, we obtain the following assertion.

Theorem 1.3.2. Let $f$ be bounded on $[t, T]$ and continuous on $[t, T] \setminus N$, where $N$ is a subset of zero Lebesgue measure. Then the function $F(s) = \int_t^s f(\tau)\,d\tau$, $s \in [t, T]$, the indefinite integral of $f$, is an absolutely continuous function that vanishes at $t$ and is differentiable at all points of continuity of $f$, where $F'(s) = f(s)$. Moreover, $F$ is the unique absolutely continuous function that vanishes at $t$ and is differentiable with $F'(s) = f(s)$ almost surely.

Next, let $A : [t, T] \to \mathcal{L}(B, B)$ be a curve in the space of bounded operators $\mathcal{L}(B, B)$. For a partition $\Delta = \{t = t_0 < t_1 < \cdots < t_n = T\}$ of the interval $[t, T]$ and a choice of points $s_j \in [t_{j-1}, t_j]$, $j = 1, \ldots, n$, let us define two types of multiplicative time-ordered Riemann approximations as
\[ R_M(A, \Delta, \{s_j\}) = e^{(t_n - t_{n-1})A(s_n)} e^{(t_{n-1} - t_{n-2})A(s_{n-1})} \cdots e^{(t_1 - t)A(s_1)}, \tag{1.31} \]
\[ \tilde{R}_M(A, \Delta, \{s_j\}) = [1 + (t_n - t_{n-1})A(s_n)][1 + (t_{n-1} - t_{n-2})A(s_{n-1})] \cdots [1 + (t_1 - t)A(s_1)]. \tag{1.32} \]
If there exists a limit of $R_M(A, \Delta, \{s_j\})$, as $|\Delta| \to 0$, independent of the choices of $s_j$, then $A$ is called multiplicatively time-ordered Riemann-integrable on $[t, T]$. The


respective limit is called the multiplicative time-ordered Riemann integral of $A(s)$, or the $T$-product, or the chronological exponential, or the time-ordered exponential:
\[ T\exp\Big\{ \int_t^T A(\tau)\,d\tau \Big\} = \lim_{|\Delta| \to 0} R_M(A, \Delta, \{s_j\}). \tag{1.33} \]
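The two multiplicative approximations (1.31) and (1.32) can be compared on a small example. The sketch below is illustrative only: it assumes $B = \mathbb{R}^2$ and the commuting family $A(s) = sJ$ with $J$ the rotation generator (an assumption made so that the $T$-product reduces to the ordinary exponential $\exp(J \int_0^1 s\,ds)$, which is a rotation by the angle $1/2$). All helper names are hypothetical, and the matrix exponential is a truncated power series rather than a library routine.

```python
import math

def identity(n):
    return [[1.0 if i == j else 0.0 for j in range(n)] for i in range(n)]

def mat_mul(a, b):
    n = len(a)
    return [[sum(a[i][k] * b[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]

def mat_add(a, b):
    return [[x + y for x, y in zip(ra, rb)] for ra, rb in zip(a, b)]

def mat_scale(c, a):
    return [[c * x for x in row] for row in a]

def mat_exp(a, terms=25):
    """Truncated power series of exp(a); adequate for the small-norm factors used here."""
    result = identity(len(a))
    term = identity(len(a))
    for k in range(1, terms):
        term = mat_scale(1.0 / k, mat_mul(term, a))
        result = mat_add(result, term)
    return result

def multiplicative_sums(A, partition, tags):
    """Return (R_M, R~_M) as in (1.31), (1.32); later factors multiply from the left."""
    n = len(A(tags[0]))
    r_exp, r_lin = identity(n), identity(n)
    for j in range(1, len(partition)):
        h = partition[j] - partition[j - 1]
        a = mat_scale(h, A(tags[j - 1]))
        r_exp = mat_mul(mat_exp(a), r_exp)
        r_lin = mat_mul(mat_add(identity(n), a), r_lin)
    return r_exp, r_lin

def max_diff(a, b):
    return max(abs(x - y) for ra, rb in zip(a, b) for x, y in zip(ra, rb))

def A(s):
    # Illustrative commuting family A(s) = s J with J = [[0, 1], [-1, 0]].
    return [[0.0, s], [-s, 0.0]]

if __name__ == "__main__":
    theta = 0.5  # integral of s over [0, 1]
    exact = [[math.cos(theta), math.sin(theta)],
             [-math.sin(theta), math.cos(theta)]]
    for n in (10, 100, 1000):
        part = [j / n for j in range(n + 1)]
        mids = [(part[j - 1] + part[j]) / 2 for j in range(1, n + 1)]
        r_exp, r_lin = multiplicative_sums(A, part, mids)
        print(n, max_diff(r_exp, exact), max_diff(r_lin, exact))
```

With midpoint tags the exponential approximation is exact up to rounding in this commuting example, while the linearized product differs from it by a quantity that shrinks with the mesh. For non-commuting $A(s)$ the order of the factors matters, which is why the loop multiplies later factors from the left.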

Remark 8. Reversing the order of the multipliers in (1.31) leads to the time-reversed multiplicative Riemann integral.

The relation between the approximations (1.31) and (1.32) is determined by the following elementary fact.

Proposition 1.3.1. If a function $A$ is bounded, then the limit of $R_M(A, \Delta, \{s_j\})$ exists if and only if the limit of $\tilde{R}_M(A, \Delta, \{s_j\})$ exists, in which case they coincide:
\[ \lim_{|\Delta| \to 0} R_M(A, \Delta, \{s_j\}) = \lim_{|\Delta| \to 0} \tilde{R}_M(A, \Delta, \{s_j\}). \]

Proof. We have
\[ \begin{aligned} R_M(A, \Delta, \{s_j\}) - \tilde{R}_M(A, \Delta, \{s_j\}) &= [e^{(t_n - t_{n-1})A(s_n)} - (1 + (t_n - t_{n-1})A(s_n))] e^{(t_{n-1} - t_{n-2})A(s_{n-1})} \cdots e^{(t_1 - t)A(s_1)} + \cdots \\ &\quad + (1 + (t_n - t_{n-1})A(s_n)) \cdots (1 + (t_2 - t_1)A(s_2)) [e^{(t_1 - t)A(s_1)} - (1 + (t_1 - t)A(s_1))]. \end{aligned} \]
Each of the remaining factors $e^{(t_i - t_{i-1})A(s_i)}$ and $1 + (t_i - t_{i-1})A(s_i)$ is bounded in norm by $\exp\{(t_i - t_{i-1}) \sup_s \|A(s)\|\}$, so that
\[ \|R_M(A, \Delta, \{s_j\}) - \tilde{R}_M(A, \Delta, \{s_j\})\| \le \exp\Big\{ (T - t) \sup_{s \in [t,T]} \|A(s)\| \Big\} \sum_{j=1}^n \big\| e^{(t_j - t_{j-1})A(s_j)} - (1 + (t_j - t_{j-1})A(s_j)) \big\|. \]
Each term of the last sum is of order $(t_j - t_{j-1})^2$, so that $\|R_M(A, \Delta, \{s_j\}) - \tilde{R}_M(A, \Delta, \{s_j\})\|$ is of order $\sum_j (t_j - t_{j-1})^2 \le (T - t)|\Delta| \to 0$, as $|\Delta| \to 0$.

The main result about the multiplicative integral for bounded $A$ is the following.

Theorem 1.3.3. The function $A$ is multiplicatively Riemann-integrable on $[t, T]$ if and only if it is Riemann-integrable on $[t, T]$. In particular, it is multiplicatively Riemann-integrable if it is bounded on $[t, T]$ and continuous on $[t, T] \setminus N$, where $N$ is a subset of zero Lebesgue measure.

Proof. A proof can be found in [198] (Theorem 16.3), where it is carried out in even greater generality for functions with values in Banach algebras.

The application of multiplicative integrals to basic linear ODEs will be presented in Section 2.3, where an alternative series representation for $T$-products is also developed.


Remark 9. Of course, the story is quite different for unbounded operators $A$, a case which is of great importance for applications. This case is not covered by the above results! In particular, in this case $\tilde{R}_M$ may not be defined, so that the definition via $R_M$ is more fundamental.

1.4 Differentials of the norms

As an example of differentiation, let us look at an important class of functions on a Banach space, the so-called convex functions $\phi$ defined by the property
\[ \phi(\alpha x + (1 - \alpha) y) \le \alpha \phi(x) + (1 - \alpha) \phi(y), \tag{1.34} \]
for any $\alpha \in (0, 1)$. It is equivalent to the requirement that the restriction of $\phi$ to any straight line is convex as a real function on the line, which can in turn be rewritten as
\[ \phi(x_2) \le \frac{x_2 - x_1}{x_3 - x_1} \phi(x_3) + \frac{x_3 - x_2}{x_3 - x_1} \phi(x_1), \tag{1.35} \]
for any points $x_1, x_2, x_3$ such that $x_2$ belongs to the interval $(x_1, x_3)$. Since (1.35) can be equivalently written as
\[ \frac{\phi(x_2) - \phi(x_1)}{x_2 - x_1} \le \frac{\phi(x_3) - \phi(x_1)}{x_3 - x_1}, \tag{1.36} \]
it follows that (1.35) is equivalent to the requirement that, for $x < y$, the increments $(\phi(y) - \phi(x))/(y - x)$ are increasing both in $y$ and in $x$. In particular, for any convex function $\phi(x)$ on the real line, the right and left derivatives $\phi'_+(x)$ and $\phi'_-(x)$ always exist and
\[ \frac{\phi(x) - \phi(x - h)}{h} \le \phi'_-(x) \le \phi'_+(x) \le \frac{\phi(x + h) - \phi(x)}{h}. \tag{1.37} \]

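For any Banach space, $\|x\|$ and $\|x\|^2$ are convex functions. It therefore follows that their directional derivatives
\[ \begin{aligned} [x, y]_+ &= D_y \|x\| = \frac{d}{dh_+}\Big|_{h=0} \|x + hy\|, \\ [x, y]_- &= -D_{-y} \|x\| = \frac{d}{dh_-}\Big|_{h=0} \|x + hy\| = -\frac{d}{dh_+}\Big|_{h=0} \|x - hy\| \end{aligned} \tag{1.38} \]
and
\[ \begin{aligned} (x, y)_+ &= \frac{1}{2} D_y \|x\|^2 = \frac{1}{2} \frac{d}{dh_+}\Big|_{h=0} \|x + hy\|^2, \\ (x, y)_- &= -\frac{1}{2} D_{-y} \|x\|^2 = \frac{1}{2} \frac{d}{dh_-}\Big|_{h=0} \|x + hy\|^2 \end{aligned} \tag{1.39} \]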


are always well defined and have the following properties:
\[ [x, y]_- \le [x, y]_+, \qquad (x, y)_- \le (x, y)_+, \qquad (x, y)_\pm = \|x\| \, [x, y]_\pm. \tag{1.40} \]
The functions $[x, y]_-$ and $[x, y]_+$ are called the normalized lower and upper semi-inner products on $B$. The functions $(x, y)_-$ and $(x, y)_+$ are called the lower and upper semi-inner products on $B$. Clearly,
\[ [0, y]_\pm = \pm \|y\|, \qquad (0, y)_\pm = 0. \]

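Proposition 1.4.1. Let $\mu_t$ be a curve in $B$, $t \in [0, T]$. Then at any point $t$ where the right (or left) derivative $\dot{\mu}_{t\pm}$ of $\mu_t$ exists, there exists the right (or left) derivative of $\|\mu_t\|$ and $\|\mu_t\|^2$, with
\[ \|\mu_t\|'_\pm = [\mu_t, \dot{\mu}_{t\pm}]_\pm, \qquad (\|\mu_t\|^2)'_\pm = 2 (\mu_t, \dot{\mu}_{t\pm})_\pm. \tag{1.41} \]

Proof. This follows from the definitions. For the first equation, e.g., at the points $t$ where $\dot{\mu}_{t+}$ exists, we know that
\[ \|\mu_t\|'_+ = \lim_{h \to 0+} \frac{1}{h} (\|\mu_{t+h}\| - \|\mu_t\|) = \lim_{h \to 0+} \frac{1}{h} (\|\mu_t + \dot{\mu}_{t+} h + o(h)\| - \|\mu_t\|) = [\mu_t, \dot{\mu}_{t+}]_+. \]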



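As a corollary, we get the following proposition.

Proposition 1.4.2. Let $\mu_t = \int_0^t \nu_s \, ds$ with a bounded curve $\nu_t$ in $B$, $t \in [0, T]$, which is almost surely continuous. Then
\[ \begin{aligned} \|\mu_t\| &= \|\mu_0\| + \int_0^t [\mu_s, \dot{\mu}_s]_+ \, ds = \|\mu_0\| + \int_0^t [\mu_s, \dot{\mu}_s]_- \, ds \\ &= \|\mu_0\| + \frac{1}{2} \int_0^t ([\mu_s, \dot{\mu}_s]_+ + [\mu_s, \dot{\mu}_s]_-) \, ds. \end{aligned} \tag{1.42} \]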

Proof. By Theorem 1.3.2, $\mu_t$ is an absolutely continuous function which is differentiable at all points of continuity of $\nu$, i.e., almost surely. Hence, by Proposition 1.4.1, $\|\mu_t\|$ is absolutely continuous and almost surely has bounded right and left derivatives as given by (1.41). For an absolutely continuous real function, the set of points where the right and the left derivative differ from each other has zero measure (Lebesgue theorem) and thus does not contribute to the integral (alternatively, see Lemma 1.3.2). Hence one can use either of them, or any convex combination, in the Newton–Leibniz integral representation.

As an important example, let us consider the space of measures.


Proposition 1.4.3. Let $B = \mathcal{M}(X)$ for a complete metric space $X$. If $\mu \in \mathcal{M}^+(X)$, then
\[ [\mu, \nu]_\pm = \pm \|\nu_{sing}\| + \int \nu_{abs}(dx), \tag{1.43} \]
where $\nu = \nu_{abs} + \nu_{sing}$ is the Lebesgue decomposition of $\nu$ into the absolutely continuous and the singular part with respect to $\mu$. For a general $\mu \in \mathcal{M}(X)$,
\[ [\mu, \nu]_\pm = \pm \|\nu_{sing}\| + \int_{X_+} \nu_{abs}(dx) - \int_{X_-} \nu_{abs}(dx), \tag{1.44} \]
where $X_+$ (respectively $X_-$) is the support of the positive (respectively negative) part of $\mu$.

Proof. Let us derive only $[\mu, \nu]_+$ for $\mu \in \mathcal{M}^+(X)$, leaving all other cases as an exercise. For $\mu \in \mathcal{M}^+(X)$,
\[ [\mu, \nu]_+ = \lim_{h \to 0+} \frac{1}{h} (\|\mu + h\nu\| - \|\mu\|) = \|\nu_{sing}\| + \lim_{h \to 0+} \frac{1}{h} (\|\mu + h\nu_{abs}\| - \|\mu\|). \]
Since $\nu_{abs} = g(x)\mu$ with some $g \in L^1(X, \mu)$, we have
\[ [\mu, \nu]_+ = \|\nu_{sing}\| + \lim_{h \to 0+} \frac{1}{h} \int (|1 + h g(x)| - 1) \, \mu(dx). \]
The integrand, taken together with the factor $1/h$, tends to $g(x)$ for all $x$ (where $g(x)$ is finite) and is bounded by $|g(x)|$. Hence, by the dominated convergence theorem, the limit in the last formula equals $\int g(x) \, \mu(dx) = \int \nu_{abs}(dx)$, as required.

Exercise 1.4.1. Let $K$ be a compact space, $B = C(K)$, and $f, g \in B$, $f \neq 0$. Then
\[ \begin{aligned} [f, g]_+ &= \max\{ g(x) \operatorname{sgn}(f(x)) : |f(x)| = \|f\| \}, \\ [f, g]_- &= \min\{ g(x) \operatorname{sgn}(f(x)) : |f(x)| = \|f\| \}. \end{aligned} \tag{1.45} \]
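Formula (1.45) is easy to test when $K$ is a finite set, so that $C(K)$ is $\mathbb{R}^n$ with the maximum norm. The sketch below makes that assumption; the function names and the sample vectors are illustrative. It compares (1.45) with one-sided difference quotients of $h \mapsto \|f + hg\|$.

```python
def sup_norm(f):
    return max(abs(v) for v in f)

def sgn(v):
    return (v > 0) - (v < 0)

def semi_inner_sup(f, g):
    """([f, g]_+, [f, g]_-) by (1.45), for f != 0 on a finite set K."""
    top = sup_norm(f)
    # Exact float comparison: the points where |f| attains its maximum.
    values = [gi * sgn(fi) for fi, gi in zip(f, g) if abs(fi) == top]
    return max(values), min(values)

def one_sided_quotients(f, g, h=1e-6):
    """Difference quotients for the right and left derivatives of h -> ||f + h g||."""
    right = (sup_norm([a + h * b for a, b in zip(f, g)]) - sup_norm(f)) / h
    left = (sup_norm(f) - sup_norm([a - h * b for a, b in zip(f, g)])) / h
    return right, left

if __name__ == "__main__":
    f = [2.0, -2.0, 1.0]     # |f| attains its maximum at the first two points
    g = [1.0, 3.0, 10.0]
    print(semi_inner_sup(f, g))
    print(one_sided_quotients(f, g))
```

The large value of `g` at the third point does not enter either result, since $|f|$ is not maximal there. The gap between the upper and lower values reflects the non-differentiability of the maximum norm at points where the maximum is attained more than once.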

Exercise 1.4.2. Let $B = L^1(X, \mu)$, with a Borel measure $\mu$ on a complete metric space $X$. Then
\[ [f, g]_\pm = \int_{X \setminus M_0} g(x) \operatorname{sgn}(f(x)) \, \mu(dx) \pm \int_{M_0} |g(x)| \, \mu(dx), \tag{1.46} \]
where $M_0 = \{x : f(x) = 0\}$. In particular, for $B = l^1$,
\[ [f, g]_\pm = \sum_{j : f_j \neq 0} g_j \operatorname{sgn} f_j \pm \sum_{j : f_j = 0} |g_j|. \tag{1.47} \]
Finally, for $B = \mathbb{R}$, $[f, g]_\pm = g \operatorname{sgn}(f)$ if $f \neq 0$ and $[f, g]_\pm = \pm |g|$ if $f = 0$.

The following results on the differentiation of the magnitudes of curves in the spaces $L^1(X)$ or $\mathcal{M}(X)$ are an important tool for proving the correctness of measure-valued nonlinear evolutions, e.g., in Sections 3.12, 3.13 and 7.6.
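The $l^1$ formula (1.47) can be checked in the same spirit on finite sequences. This is a sketch under the assumption that $f$ and $g$ have finitely many non-zero entries; the function names and sample data are illustrative.

```python
def l1_norm(f):
    return sum(abs(v) for v in f)

def sgn(v):
    return (v > 0) - (v < 0)

def semi_inner_l1(f, g):
    """([f, g]_+, [f, g]_-) in l^1 by (1.47)."""
    regular = sum(gj * sgn(fj) for fj, gj in zip(f, g) if fj != 0)
    singular = sum(abs(gj) for fj, gj in zip(f, g) if fj == 0)
    return regular + singular, regular - singular

def one_sided_quotients(f, g, h=1e-6):
    """Difference quotients for the right and left derivatives of h -> ||f + h g||_1."""
    right = (l1_norm([a + h * b for a, b in zip(f, g)]) - l1_norm(f)) / h
    left = (l1_norm(f) - l1_norm([a - h * b for a, b in zip(f, g)])) / h
    return right, left

if __name__ == "__main__":
    f = [1.0, -2.0, 0.0, 0.0]   # f vanishes at the last two indices
    g = [3.0, 1.0, -4.0, 2.0]
    print(semi_inner_l1(f, g))
    print(one_sided_quotients(f, g))
```

Only the indices where $f_j = 0$ contribute to the difference between the upper and lower values, which is the discrete counterpart of the integral over $M_0$ in (1.46).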


Proposition 1.4.4. Let $X$ be a complete separable metric space and $\mu \in \mathcal{M}^+(X)$. Let $y_t$ be a bounded almost surely continuous curve in $L^1(X, \mu)$ and
\[ x(t, z) = \int_0^t y(s, z) \, ds, \]
where, according to Lemma 1.3.3, the integral can be understood as a usual integral (pointwise, for any $z$), or as the Bochner integral for $L^1(X, \mu)$-valued functions, or as a Riemann integral. Then
\[ |x(t, z)| = \int_0^t \operatorname{sgn}[x(s, z)] \, y(s, z) \, ds. \tag{1.48} \]
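For a fixed $z$, identity (1.48) is a statement about real functions and can be illustrated numerically. The sketch below assumes the specific choice $y(s) = \cos s$, so that $x(t) = \sin t$ in closed form; the function name `sign_integral` and the midpoint quadrature are illustrative choices, not part of the text.

```python
import math

def sgn(v):
    return (v > 0) - (v < 0)

def sign_integral(x, y, t, n=20000):
    """Midpoint-rule approximation of the integral of sgn(x(s)) y(s) over [0, t]."""
    h = t / n
    return sum(sgn(x((j + 0.5) * h)) * y((j + 0.5) * h) * h for j in range(n))

if __name__ == "__main__":
    # y = cos, hence x(t) = sin t; compare |x(t)| with the right-hand side of (1.48).
    for t in (1.0, 3.0, 5.0):
        print(t, abs(math.sin(t)), sign_integral(math.sin, math.cos, t))
```

For $t = 5$ the curve $x$ has changed sign once (at $s = \pi$), so the agreement of the two printed columns depends on the sign factor in the integrand.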

Proof. Let us give two proofs, one as a corollary of Proposition 1.4.2 and an independent one.

(i) Let $M_t = \{z : x(t, z) = 0\}$. By Proposition 1.4.2 and Exercise 1.4.2 with $B = \mathbb{R}$,
\[ \begin{aligned} |x(t, z)| &= \int_0^t \operatorname{sgn}[x(s, z)] \, y(s, z) \, ds - \int_0^t ds \int_{M_s} |y(s, z)| \, dz \\ &= \int_0^t \operatorname{sgn}[x(s, z)] \, y(s, z) \, ds + \int_0^t ds \int_{M_s} |y(s, z)| \, dz. \end{aligned} \]

Hence, the last term in both expressions vanishes, which proves (1.48).

(ii) First note that if $y(t)$ is a real-valued bounded measurable function on $[0, T]$ and $x(t) = \int_0^t y(s) \, ds$, then $|x(t)| = \int_0^t \operatorname{sgn} x(s) \, y(s) \, ds$, which can be seen for instance by approximating $x(t)$ or $y(t)$ with polynomials and then passing to the limit. Applying this result to $y(t) = y(t, z)$ yields (1.48).

Proposition 1.4.5. Let $X$ be a complete separable metric space. Let $\nu_t$ be a bounded almost surely continuous curve in $\mathcal{M}(X)$ and $\mu(t) = \int_0^t \nu(s) \, ds$ (Riemann or Bochner integral). Then
\[ |\mu(t)| = \int_0^t \operatorname{sgn}[\mu(s)] \, \nu(s) \, ds, \tag{1.49} \]
where $\operatorname{sgn}[\mu(t)] = \operatorname{sgn}[\mu(t)](z)$ is a function on $X$ which equals $1$ or $-1$ on the positive or respectively negative parts of $\mu(t)$.

Proof. (i) One proof can be carried out exactly as the first proof of Proposition 1.4.4. (ii) The second proof works for the case when $\nu_t$ has only discontinuities of the first kind and at each point coincides either with its left or right limit. In this case, all measures $\nu(t)$ are absolutely continuous with respect to the measure $M = \int_0^T |\nu(t)| \, dt$, and thus the statement is reduced to the setting of Proposition 1.4.4, as all our curves become elements of $L^1(X, M)$.


1.5 Smooth mappings between Banach spaces

For mappings $F : M \to B_2$ from a closed convex subset $M$ of a Banach space $B_1$ to a Banach space $B_2$, the previous notions and results of the case $B_2 = \mathbb{R}$ can be extended almost automatically. Namely, the definitions of directional (or Gâteaux) derivatives (1.14) and of derivatives (1.19) (or Fréchet derivatives) remain the same, only convergence and norms are understood in the sense of the corresponding Banach spaces. Propositions 1.2.1, 1.2.2 and their proofs also remain valid, the only difference being the use of the Banach-space-valued integrals of functions $F(s)$ from $\mathbb{R}$ to $B_2$ (see Theorems 1.3.1 and 1.3.2 for these integrals). For continuous functions $F$ (and we are only using continuous functions), such integrals are defined exactly as for real functions, namely as limits of Riemann sums, with the Newton–Leibniz formula $\int_0^1 F'(s) \, ds = F(1) - F(0)$ following in the usual way.

Extending the notation $C^k(M)$, we define the spaces $C^1(M, B_2)$ and $C^2(M, B_2)$ of differentiable functions $F$ of order 1 or 2 with continuous bounded derivatives, the continuity being understood as the continuity of the mappings $Y \mapsto DF(Y)$ and $Y \mapsto D^2F(Y)$ with the norm topologies of $M$, $\mathcal{L}(B_1, B_2)$ and $\mathcal{L}^2(B_1, B_2)$. The norm on these spaces is defined analogously to (1.21) and (1.22), for instance
\[ \begin{aligned} \|F\|_{C^2(M, B_2)} &= \sup_Y \Big( \|F(Y)\|_{B_2} + \|DF(Y)\|_{\mathcal{L}(B_1, B_2)} + \|D^2F(Y)\|_{\mathcal{L}^2(B_1, B_2)} \Big) \\ &= \sup_Y \Big( \|F(Y)\|_{B_2} + \sup_{\|\xi\|_{B_1} \le 1} \|DF(Y)[\xi]\|_{B_2} + \sup_{\|\xi\|_{B_1}, \|\eta\|_{B_1} \le 1} \|D^2F(Y)[\xi, \eta]\|_{B_2} \Big). \end{aligned} \tag{1.50} \]

Similarly, the spaces of Gâteaux-differentiable functions $C^1_{Gat}(M, B_2)$, $C^2_{Gat}(M, B_2)$ are defined, which contain $C^1(M, B_2)$ and $C^2(M, B_2)$ as closed subspaces. Again, the difference is in the assumption that the derivatives are continuous functions of all their variables in the Gâteaux case and are continuous functions from $M$ to the corresponding spaces of bounded operators in the case of Fréchet-differentiable functions. The Taylor formulas (1.16) and (1.26) remain valid for $F \in C^1_{Gat}(M, B_2)$ and $F \in C^2_{Gat}(M, B_2)$, respectively.

The following chain rule is a key tool for analysis.

Proposition 1.5.1. Let $\Phi \in C^1(M, B_2)$ (respectively $C^1_{Gat}(M, B_2)$) with $M$ a closed convex subset of $B_1$, and $F \in C^1(B_2, B_3)$ (respectively $C^1_{Gat}(B_2, B_3)$) for some Banach spaces $B_1, B_2, B_3$. Then the composition $F \circ \Phi(Y) = F(\Phi(Y))$ belongs to $C^1(M, B_3)$ (respectively $C^1_{Gat}(M, B_3)$) and
\[ D_\xi (F \circ \Phi)(Y) = DF(\Phi(Y))[D_\xi \Phi(Y)], \tag{1.51} \]
for any $Y, \xi$ such that $Y + h\xi \in M$ for some $h > 0$.
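The chain rule (1.51) can be sanity-checked in finite dimensions. The sketch below assumes $B_1 = B_2 = \mathbb{R}^2$ and $B_3 = \mathbb{R}$; the maps `Phi` and `F`, their hand-computed derivatives and the sample point are all illustrative. It compares the right-hand side of (1.51) with a central difference quotient of $h \mapsto F(\Phi(Y + h\xi))$.

```python
import math

def Phi(Y):
    return [Y[0] ** 2, Y[0] * Y[1]]

def D_Phi(Y, xi):
    """Derivative of Phi at Y applied to the direction xi."""
    return [2 * Y[0] * xi[0], Y[1] * xi[0] + Y[0] * xi[1]]

def F(u):
    return math.sin(u[0]) + u[1] ** 2

def D_F(u, v):
    """Derivative of F at u applied to the direction v."""
    return math.cos(u[0]) * v[0] + 2 * u[1] * v[1]

def chain_rule(Y, xi):
    """Right-hand side of (1.51): DF(Phi(Y))[D_xi Phi(Y)]."""
    return D_F(Phi(Y), D_Phi(Y, xi))

def directional_quotient(Y, xi, h=1e-6):
    """Central difference quotient of h -> F(Phi(Y + h xi)) at h = 0."""
    plus = F(Phi([a + h * b for a, b in zip(Y, xi)]))
    minus = F(Phi([a - h * b for a, b in zip(Y, xi)]))
    return (plus - minus) / (2 * h)

if __name__ == "__main__":
    Y, xi = [0.7, -1.2], [0.3, 2.0]
    print(chain_rule(Y, xi), directional_quotient(Y, xi))
```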


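Proof. By the first-order Taylor expansion of the composition map, we get
\[ \begin{aligned} F(\Phi(Y + \xi)) - F(\Phi(Y)) &= \int_0^1 DF\big(\Phi(Y) + s(\Phi(Y + \xi) - \Phi(Y))\big)[\Phi(Y + \xi) - \Phi(Y)] \, ds \\ &= \int_0^1 DF\big(\Phi(Y) + s(\Phi(Y + \xi) - \Phi(Y))\big) \Big[ \int_0^1 D\Phi(Y + \theta\xi)[\xi] \, d\theta \Big] ds \\ &= \int_0^1 DF\big(\Phi(Y) + s(\Phi(Y + \xi) - \Phi(Y))\big) \Big[ D\Phi(Y)[\xi] + \int_0^1 (D\Phi(Y + \theta\xi) - D\Phi(Y))[\xi] \, d\theta \Big] ds. \end{aligned} \]
Writing $h\xi$ instead of $\xi$, dividing by $h$ and passing to the limit $h \to 0$ completes the proof.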



Another important result concerns partial derivatives of Gâteaux or Fréchet type.

Proposition 1.5.2. Let $B_1, B_2, B_3$ be three Banach spaces, and $M_1$ and $M_2$ convex closed subsets of $B_1$ and $B_2$ such that the linear span of each $M_i$ is $B_i$. Let $F$ be a mapping $B_1 \times B_2 \to B_3$ such that the Gâteaux derivatives $D_1F(b_1, b_2)[\xi_1]$ and $D_2F(b_1, b_2)[\xi_2]$ with respect to the first and the second variable exist for all $(b_1, b_2) \in M_1 \times M_2$ and all $\xi_1, \xi_2$ such that $b_i + h\xi_i \in M_i$ for sufficiently small $h$.

(i) If for any $\xi_1$ and $\xi_2$ the functions $D_1F(b_1, b_2)[\xi_1]$ and $D_2F(b_1, b_2)[\xi_2]$ are continuous as functions of $b_1, b_2$, then the mapping $F$ is Gâteaux-differentiable on $M_1 \times M_2$ and
\[ DF(b_1, b_2)[\xi_1, \xi_2] = D_1F(b_1, b_2)[\xi_1] + D_2F(b_1, b_2)[\xi_2]. \tag{1.52} \]

(ii) If the mappings $(b_1, b_2) \mapsto D_1F(b_1, b_2)$ and $(b_1, b_2) \mapsto D_2F(b_1, b_2)$ are continuous as mappings from $M_1 \times M_2$ to $\mathcal{L}(B_1, B_3)$ and $\mathcal{L}(B_2, B_3)$ respectively, then the mapping $F$ is Fréchet-differentiable on $M_1 \times M_2$ and this derivative is given by (1.52).

Proof. Let us prove only (i), since the second statement is fully analogous. By the first-order Taylor expansion,
\[ \begin{aligned} F(b_1 + h\xi_1, b_2 + h\xi_2) - F(b_1, b_2) &= [F(b_1 + h\xi_1, b_2 + h\xi_2) - F(b_1, b_2 + h\xi_2)] + [F(b_1, b_2 + h\xi_2) - F(b_1, b_2)] \\ &= \int_0^h D_1F(b_1 + s\xi_1, b_2 + h\xi_2)[\xi_1] \, ds + \int_0^h D_2F(b_1, b_2 + s\xi_2)[\xi_2] \, ds, \end{aligned} \]
which, after division by $h$, tends to $D_1F(b_1, b_2)[\xi_1] + D_2F(b_1, b_2)[\xi_2]$ as $h \to 0$ due to the assumed continuity.


The space $C_{bLip}(M, B_2)$ of Lipschitz-continuous mappings $M \to B_2$ is defined as the subspace of $C(M, B_2)$ with a finite norm
\[ \|F\|_{bLip} = \|F\|_{C(M, B_2)} + \|F\|_{Lip}, \qquad \|F\|_{Lip} = \sup_{Y \neq Z} \frac{\|F(Y) - F(Z)\|_{B_2}}{\|Y - Z\|_{B_1}}. \tag{1.53} \]
As follows from the first-order Taylor expansion (1.16), if $F \in C^1_{Gat}(M, B_2)$, then $F \in C_{bLip}(M, B_2)$ and
\[ \|F\|_{bLip} = \|F\|_{C^1(M, B_2)}, \qquad \|F\|_{Lip} = \sup_Y \|DF(Y)\|. \tag{1.54} \]

Similarly, one defines the space $C^1_{bLip}(M, B_2)$ of functions $F$ from $C^1(M, B_2)$ which have a Lipschitz-continuous derivative, that is,
\[ \|DF(Y_1) - DF(Y_2)\| \le C \|Y_1 - Y_2\|, \tag{1.55} \]
with a constant $C$. We say that $F$ has a locally Lipschitz-continuous derivative if this holds for $Y_1, Y_2$ from any bounded subset of $M$. From the first-order Taylor expansion (1.16), it follows that
\[ F(Y + \xi) - F(Y) = DF(Y)[\xi] + \int_0^1 (DF(Y + s\xi)[\xi] - DF(Y)[\xi]) \, ds. \tag{1.56} \]
Consequently, if (1.55) holds, then
\[ \|F(Y + \xi) - F(Y) - D_\xi F(Y)\|_{B_2} \le C \|\xi\|^2 / 2, \tag{1.57} \]

which is stronger than (1.19) and often a rather handy inequality. If the derivative is only locally Lipschitz-continuous, the estimate (1.57) holds for $Y, \xi$ from any bounded set.

Weakening the condition of Lipschitz continuity, it is handy to define the space $C^1_{luc}(M, B_2)$ (respectively $C^1_{uc}(M, B_2)$) of mappings with locally uniformly continuous (respectively uniformly continuous) derivatives. It is the closed subspace of $C^1(M, B_2)$ of functions $F$ such that the mapping $Y \mapsto DF(Y)$ is uniformly continuous on bounded subsets of $M$ (respectively on the whole $M$). If $F \in C^1_{luc}(M, B_2)$ (respectively $F \in C^1_{uc}(M, B_2)$), then with the help of (1.56) again, (1.19) improves as follows: for any $\epsilon > 0$ there exists $\delta > 0$ such that
\[ \|F(Y + \xi) - F(Y) - DF(Y)[\xi]\|_{B_2} \le \epsilon \|\xi\| \quad \text{for } \|\xi\| \le \delta, \tag{1.58} \]
uniformly for $Y$ and $Y + \xi$ from any bounded set (respectively for any $Y, Y + \xi$ from $M$).

Similarly, the space $C^2_{luc}(M, B_2)$ (respectively $C^2_{uc}(M, B_2)$) of the mappings with locally uniformly continuous (respectively uniformly continuous) second derivatives is the closed subspace of $C^2(M, B_2)$ of functions $F$ such that the mapping


$Y \mapsto D^2F(Y)$ is uniformly continuous on bounded subsets of $M$ (respectively on the whole $M$).

By introducing these spaces, the peculiarity of the infinite-dimensional case becomes obvious once again, since for finite-dimensional $B_1$ and $B_2$ the spaces $C^1_{luc}(M, B_2)$ and $C^1(M, B_2)$ coincide (though $C^1_{uc}(M, B_2)$ may still be different). The same applies to the spaces $C^2_{luc}(M, B_2)$ and $C^2(M, B_2)$.

The following example shows that derivatives in Banach spaces may look quite different from the usual derivatives.

Exercise 1.5.1. (i) If $A$ is an invertible element of $\mathcal{L}(B, B)$, then the mapping $F : A \mapsto A^{-1}$ has the following differential at $A$:
\[ DF(A)[\xi] = -A^{-1} \xi A^{-1}. \tag{1.59} \]
Hint: $(A + h\xi)^{-1} = (1 + hA^{-1}\xi)^{-1} A^{-1} = (1 - hA^{-1}\xi + o(h)) A^{-1}$.

(ii) If $C(t)$ is a smooth curve in $\mathcal{L}(B, B)$, then
\[ \frac{d}{dt} (1 + C(t))^{-1} = -(1 + C(t))^{-1} C'(t) (1 + C(t))^{-1}. \tag{1.60} \]
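Formula (1.59) can be verified numerically for matrices. The following sketch assumes $B = \mathbb{R}^2$, so that $\mathcal{L}(B, B)$ is the space of $2 \times 2$ matrices; the helper names and the sample matrices are illustrative. It compares $-A^{-1} \xi A^{-1}$ with a central difference quotient of $h \mapsto (A + h\xi)^{-1}$. Note that the order of the factors matters, since $A^{-1}$ and $\xi$ need not commute.

```python
def mat_mul(a, b):
    return [[sum(a[i][k] * b[k][j] for k in range(2)) for j in range(2)]
            for i in range(2)]

def inv2(a):
    """Inverse of an invertible 2x2 matrix by the adjugate formula."""
    det = a[0][0] * a[1][1] - a[0][1] * a[1][0]
    return [[a[1][1] / det, -a[0][1] / det],
            [-a[1][0] / det, a[0][0] / det]]

def derivative_formula(A, xi):
    """-A^{-1} xi A^{-1}, the right-hand side of (1.59)."""
    a_inv = inv2(A)
    return [[-v for v in row] for row in mat_mul(mat_mul(a_inv, xi), a_inv)]

def difference_quotient(A, xi, h=1e-6):
    """Central difference quotient of h -> (A + h xi)^{-1} at h = 0."""
    plus = inv2([[A[i][j] + h * xi[i][j] for j in range(2)] for i in range(2)])
    minus = inv2([[A[i][j] - h * xi[i][j] for j in range(2)] for i in range(2)])
    return [[(plus[i][j] - minus[i][j]) / (2 * h) for j in range(2)]
            for i in range(2)]

if __name__ == "__main__":
    A = [[2.0, 1.0], [0.0, 3.0]]
    xi = [[1.0, -1.0], [2.0, 0.5]]
    print(derivative_formula(A, xi))
    print(difference_quotient(A, xi))
```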

1.6 Locally convex spaces and Fréchet spaces

In this book, we work mostly with Banach spaces. However, some very useful spaces (in particular the spaces of generalized functions or distributions, and some classes of smooth functions, notably the Schwartz space) do not fit into the Banach setting. Since most of our methods for dealing with ODEs can be naturally extended to the general setting of locally convex spaces, we provide here the necessary background on these spaces in a concise form. More extensive expositions can be found, e.g., in [115, 212, 235].

Remark 10. Readers who do not want to deal with this level of generality (which is more 'topological' in nature than the rest of the book) can skip this section and always think of Banach spaces whenever we mention results or definitions for general locally convex spaces, in particular the Fréchet spaces. One only has to keep in mind that by the convergence of sequences in function spaces that are equipped with a countable set of norms or semi-norms (see key examples (1.70), (1.72), (1.73) below), one means their convergence in each of these semi-norms. Although the general point of view definitely enhances the understanding of generalized functions, it is possible to grasp the basic properties of these objects without recourse to abstract spaces.

A set $M$ in a linear vector space $V$ is called absorbing or absorbent if for any $x \in V$ there exists $t \in \mathbb{R}_+$ such that $tx \in M$, or in other words, if $V = \cup_{t \in \mathbb{R}_+} tM$. Clearly, $0 \in M$ for such a set $M$.


The key notion for the theory is the following link between geometric and analytic representations of convex sets. For any set $M \subset V$, its Minkowski functional $p_M : V \to \mathbb{R}_+$ is defined by the formula
\[ p_M(x) = \inf\{ t > 0 : x \in tM \}. \tag{1.61} \]
The requirement that $M$ is absorbing is then equivalent to the requirement that $p_M(x)$ has a finite non-negative value for any $x \in V$. For any convex absorbing set $M$, $p_M$ has the following properties:
(i) if $x \in M$, then $p_M(x) \le 1$;
(ii) if $p_M(x) < 1$, then $x \in M$;
(iii) $p_M(tx) = t \, p_M(x)$ for any $t \ge 0$ and $x \in V$;
(iv) $p_M(x + y) \le p_M(x) + p_M(y)$ for any $x, y \in V$.

Exercise 1.6.1. Prove these properties of $p_M$.

Conversely, any mapping $p : V \to \mathbb{R}_+$ satisfying the above conditions (i)–(iv) is a Minkowski functional for some convex absorbing set. In fact, $p(x) = p_M(x)$ for any set $M$ such that $\{x : p(x) < 1\} \subset M \subset \{x : p(x) \le 1\}$.

Exercise 1.6.2. Prove this assertion.

Recall that a semi-norm on $V$ is a functional $p : V \to \mathbb{R}_+$ such that $p(\alpha x) = |\alpha| p(x)$ for any $\alpha \in \mathbb{R}$ and $p(x + y) \le p(x) + p(y)$. A subset $M \subset V$ is called symmetric, balanced or centered if $x \in M$ implies $-x \in M$.

Remark 11. For complex spaces, the above notion of a subset $M$ being balanced requires that $x \in M$ implies $\lambda x \in M$ for any complex $\lambda$ of unit magnitude.

As a direct consequence of the properties of Minkowski functionals, it follows that if $M$ is a convex absorbing and symmetric set, then $p_M$ is a semi-norm. Conversely, for any semi-norm $p$, there exists a convex absorbing and symmetric set $M$ such that $p = p_M$, for instance $M = \{x : p(x) < 1\}$.

A linear topological space is a linear vector space equipped with a topology which renders the operations of addition and scalar multiplication continuous. Notice that any neighbourhood $U$ of zero in a linear topological space $V$ is absorbing. In fact, the mapping $t \mapsto tx$ is continuous for any $x$ and takes $0 \in \mathbb{R}$ to $0 \in V$. Hence $tx \in U$ for sufficiently small $t$.

By the base of open neighbourhoods of a point $x$ in a topological space, one means a family of neighbourhoods $U_\alpha$ of $x$ with the property that for any neighbourhood $U$ of $x$ there exists $\alpha$ such that $U_\alpha \subset U$.
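The Minkowski functional (1.61) can be computed by bisection whenever membership in $M$ can be tested. The sketch below assumes $V = \mathbb{R}^2$ and takes for $M$ an open ellipse, for which $p_M$ is known in closed form; the function names and the choice of set are illustrative. The bisection relies on the fact that, for a convex set containing zero, $x \in tM$ implies $x \in t'M$ for all $t' > t$.

```python
import math

def minkowski(contains, x, tol=1e-10):
    """Approximate p_M(x) = inf{t > 0 : x in tM} for a convex absorbing set M."""
    if all(c == 0 for c in x):
        return 0.0
    hi = 1.0
    while not contains([c / hi for c in x]):   # enlarge t until x lies in tM
        hi *= 2.0
    lo = 0.0
    while hi - lo > tol:
        mid = 0.5 * (lo + hi)
        if contains([c / mid for c in x]):
            hi = mid
        else:
            lo = mid
    return hi

def in_ellipse(x):
    # Illustrative convex, absorbing, symmetric set: x0^2 / 4 + x1^2 < 1.
    return x[0] ** 2 / 4 + x[1] ** 2 < 1

def ellipse_norm(x):
    """Closed form of the Minkowski functional of the ellipse above."""
    return math.sqrt(x[0] ** 2 / 4 + x[1] ** 2)

if __name__ == "__main__":
    for x in ([2.0, 0.0], [3.0, 4.0], [-1.0, 0.5]):
        print(x, minkowski(in_ellipse, x), ellipse_norm(x))
```

Since the ellipse is symmetric, its Minkowski functional is a semi-norm (here even a norm), and the homogeneity and subadditivity properties (iii) and (iv) can be observed on sample points.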
A linear topological space is called locally convex if zero has a base of open neighbourhoods that consists of convex centered (and hence absorbing) sets $M_\alpha$, with $\alpha$ from some set of indices. Notice that the Minkowski functional $p_\alpha$ of $M_\alpha$ is continuous for any $M_\alpha$. In fact, if $x - y \in \epsilon M_\alpha$, then $|p_\alpha(x) - p_\alpha(y)| \le p_\alpha(x - y) \le \epsilon$. Hence the open neighbourhoods of zero $M_\alpha$ can be described as the sets $\{x : p_\alpha(x) < 1\}$ for continuous semi-norms $p_\alpha$. Therefore, one can conclude that a linear topological space is locally convex


if and only if its topology can be defined by a family of semi-norms, that is, there exists a family of semi-norms $p_\alpha$, with $\alpha$ from some set of indices, so that the sets
\[ N_{\alpha_1, \ldots, \alpha_k; \epsilon_1, \ldots, \epsilon_k} = \{ x : p_{\alpha_j}(x) < \epsilon_j, \ j = 1, \ldots, k \} \tag{1.62} \]
define a base of open neighbourhoods of zero. Hence, the base of neighbourhoods of any point $x$ is given by the sets $x + N_{\alpha_1, \ldots, \alpha_k; \epsilon_1, \ldots, \epsilon_k}$. A sequence $x_n \in V$ converges to $x$ in this topology if and only if it converges in any $p_\alpha$, that is, for any $\epsilon > 0$ and $\alpha$ there exists $N$ such that $p_\alpha(x_n - x) < \epsilon$ for all $n > N$.

A topology on a set $V$ is said to be separated or Hausdorff if any two points $x \neq y$ from $V$ have nonintersecting neighbourhoods. Clearly, a topological linear space with the topology given by semi-norms $p_\alpha$ is Hausdorff if and only if the family $p_\alpha$ is separating, that is, if $p_\alpha(x) = 0$ for all $\alpha$ implies $x = 0$.

Exercise 1.6.3. Let $V$ be a linear vector space and $p_\alpha$ a separating family of semi-norms on $V$. Show that $V$ becomes a locally convex topological linear space if it is equipped with the topology that is generated by the base of open sets (1.62) and all its shifts. A proof can be found, e.g., in [212].

Proposition 1.6.1. A linear mapping $L : V_1 \to V_2$ between two locally convex spaces with the topologies generated by the semi-norms $\{p^1_\alpha\}$ and $\{p^2_\beta\}$ respectively is continuous if and only if for any $\beta$ there exists a constant $C$ and a finite number of semi-norms $\{p^1_{\alpha_1}, \ldots, p^1_{\alpha_k}\}$ such that
\[ p^2_\beta(Lx) \le C (p^1_{\alpha_1}(x) + \cdots + p^1_{\alpha_k}(x)). \tag{1.63} \]

Proof. The 'if' part follows from the definition of continuity: the pre-image of an open set (in our case $\{x : p^2_\beta(Lx) < \epsilon\}$) must be open. For proving the 'only if' part, notice that, if $L$ is continuous, then for any $\beta$ there exists a constant $\epsilon$ and a finite number of semi-norms $\{p^1_{\alpha_1}, \ldots, p^1_{\alpha_k}\}$ such that
\[ p^1_{\alpha_1}(x) < \epsilon, \ \ldots, \ p^1_{\alpha_k}(x) < \epsilon \ \Rightarrow \ p^2_\beta(Lx) < 1. \tag{1.64} \]
If (1.63) does not hold, we can choose a sequence $C_n \to \infty$, as $n \to \infty$, and a sequence of vectors $x_n \in V_1$ such that
\[ p^2_\beta(Lx_n) > C_n (p^1_{\alpha_1}(x_n) + \cdots + p^1_{\alpha_k}(x_n)). \tag{1.65} \]
However, by linearity, if (1.65) holds for $x_n$, it holds for all $\lambda x_n$ with $\lambda > 0$. We can therefore choose $x_n$ such that $\epsilon/2 < p^1_{\alpha_1}(x_n) + \cdots + p^1_{\alpha_k}(x_n) < \epsilon$. Then (1.65) will contradict (1.64) for large enough $C_n$.



This proposition implies that two families of semi-norms pα and dβ on a linear space V are equivalent – that is, they define the same topology on V – if


and only if for any $\beta$ and $\alpha$ there exists a constant $C$ and a finite number of semi-norms $\{p_{\alpha_1}, \ldots, p_{\alpha_k}\}$ and $\{d_{\beta_1}, \ldots, d_{\beta_m}\}$ such that
\[ d_\beta(x) \le C (p_{\alpha_1}(x) + \cdots + p_{\alpha_k}(x)), \qquad p_\alpha(x) \le C (d_{\beta_1}(x) + \cdots + d_{\beta_m}(x)). \tag{1.66} \]
A family of semi-norms $p_\alpha$ on $V$ is called directed if for any $\alpha_1, \alpha_2$ there exists $\alpha_3$ and a constant $C$ such that $p_{\alpha_1}(x) + p_{\alpha_2}(x) \le C p_{\alpha_3}(x)$. In any locally convex space $V$ with the topology defined by the semi-norms $p_\alpha$, there always exists an equivalent directed family of semi-norms. In fact, it is sufficient to choose all finite sums of the semi-norms of the initial family as a new family.

As follows from Proposition 1.6.1, two directed families of semi-norms $p_\alpha$ and $d_\beta$ on a linear space $V$ are equivalent if and only if for any $\beta$ and $\alpha$ there exists a constant $C$ and semi-norms $p_{\alpha_1}$ and $d_{\beta_1}$ such that
\[ d_\beta(x) \le C p_{\alpha_1}(x), \qquad p_\alpha(x) \le C d_{\beta_1}(x). \tag{1.67} \]

One often has to deal with families of linear mappings. A family of linear mappings $L_\beta : V \to W$ between two topological linear spaces $V, W$ is called equicontinuous if for any neighbourhood $N$ of zero in $W$ there exists a neighbourhood $U$ of zero in $V$ such that $L_\beta U \subset N$ for all $\beta$. Notice that for a single mapping $L$ this turns into the definition of continuity. The following proposition is a direct extension of Proposition 1.6.1 to families of mappings.

Proposition 1.6.2. A family of linear mappings $L_\nu : V_1 \to V_2$ between two locally convex spaces with the topologies generated by the semi-norms $p^1_\alpha$ and $p^2_\beta$ respectively is equicontinuous if and only if for any $\beta$ there exists a constant $C$ and a finite number of semi-norms $\{p^1_{\alpha_1}, \ldots, p^1_{\alpha_k}\}$ such that
\[ p^2_\beta(L_\nu x) \le C (p^1_{\alpha_1}(x) + \cdots + p^1_{\alpha_k}(x)) \tag{1.68} \]
for all $x \in V_1$ and all $\nu$.

The structure of a topological linear space makes it possible to define analogues of Cauchy sequences, which are essential for the study of metric spaces. A sequence $x_n$ in a topological linear space is called a Cauchy sequence if for any neighbourhood $U$ of zero there exists $N$ such that $x_n - x_m \in U$ for all $m, n > N$. In locally convex spaces with the topology defined by the semi-norms $p_\alpha$, the property of being Cauchy is equivalent to the requirement that $x_n$ is a Cauchy sequence in each semi-norm $p_\alpha$, that is, for any $\epsilon > 0$ and any $\alpha$ there exists $N$ such that $p_\alpha(x_n - x_m) < \epsilon$ for all $m, n > N$. A topological linear space $V$ is called sequentially complete if any Cauchy sequence in $V$ converges.

A topological space is called metricizable whenever its topology can be defined in terms of a certain metric. In general (non-metricizable) topological spaces, the convergence of sequences does not fully specify the topology. Instead, one must use converging directed sets (also called nets) $x_\mu$ that are indexed by a partially ordered set of indices $\mu$ such


that for any $\mu_1, \mu_2$ there exists $\mu$ such that $\mu > \mu_1$ and $\mu > \mu_2$. A directed set $x_\mu$ is called a Cauchy-directed set if for any neighbourhood $U$ of zero there exists $\mu$ such that $x_{\mu_1} - x_{\mu_2} \in U$ for all $\mu_1, \mu_2 > \mu$. A topological linear space $V$ is called complete if any Cauchy-directed set in $V$ converges. This completeness implies sequential completeness, and they are equivalent for metricizable spaces.

Proposition 1.6.3. A locally convex Hausdorff space $V$ is metricizable if and only if $V$ has a countable base of neighbourhoods of zero.

Proof. Any metric space has a countable base of neighbourhoods of any point, given by the balls centered at this point with rational radii. Therefore, the topology can be specified by a countable set of semi-norms. On the other hand, if there exists a countable base of neighbourhoods of zero, the topology of $V$ can be generated by a countable set $p_n$ of semi-norms. Then the formula

$$ d(x, y) = \sum_n 2^{-n} \frac{p_n(x - y)}{1 + p_n(x - y)} \tag{1.69} $$

specifies a metric on $V$ that defines the same topology as the topology given by the family of semi-norms $p_n$. $\square$

Exercise 1.6.4. Check the last assertion. Also check that $V$ is complete with respect to the family of semi-norms $p_n$ if and only if it is complete as the metric space with the metric (1.69).

A locally convex Hausdorff (topological linear) space $V$ is called a Fréchet space if it is complete and its topology can be specified by a countable set of semi-norms. (Thus, $V$ is metricizable.) The key examples of Fréchet spaces include various classes of smooth functions on $\mathbb{R}^d$. For instance, the Schwartz space $S(\mathbb{R}^d)$ is the space of infinitely differentiable functions on $\mathbb{R}^d$ that decrease at infinity faster than any power, together with all their derivatives. The topology in this space is defined by the countable set of norms (with $p, q$ non-negative integers)

$$ \|f\|_{p,q} = \sup_{k_1+\cdots+k_d \le p,\; m_1+\cdots+m_d \le q}\ \sup_x\ \prod_{j=1}^d |x_j|^{k_j} \left| \frac{\partial^{m_1+\cdots+m_d} f}{\partial x_1^{m_1} \cdots \partial x_d^{m_d}}(x) \right|, \tag{1.70} $$

or equivalently, by their integral versions

$$ \|f\|_{p,q,2} = \left( \sum_{k_1+\cdots+k_d \le p,\; m_1+\cdots+m_d \le q} \int \left| \prod_{j=1}^d x_j^{k_j}\, \frac{\partial^{m_1+\cdots+m_d} f}{\partial x_1^{m_1} \cdots \partial x_d^{m_d}}(x) \right|^2 dx \right)^{1/2}. \tag{1.71} $$

Exercise 1.6.5. (i) Check that the families of norms (1.70) and (1.71) are equivalent. (ii) Check that $S$ is a Fréchet space, that is, it is complete.


For a compact set $K$ in $\mathbb{R}^d$, the space $C_0^\infty(K)$ is the space of infinitely differentiable functions on $\mathbb{R}^d$ with support in $K$, i.e., they vanish outside $K$ together with all their derivatives. The topology in this space is defined by the following countable set of norms:

$$ \|f\|_q = \sum_{m_1+\cdots+m_d = q}\ \sup_{x \in K} \left| \frac{\partial^q f}{\partial x_1^{m_1} \cdots \partial x_d^{m_d}}(x) \right|. \tag{1.72} $$

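Let $\Omega$ be an open subset of $\mathbb{R}^d$ and $\Omega_1 \subset \Omega_2 \subset \cdots$ an increasing sequence of open subsets such that the closure $\bar\Omega_n$ of each $\Omega_n$ is a compact subset of $\Omega_{n+1}$ and $\Omega = \cup_n \Omega_n$. Let $C^\infty(\Omega)$ be the space of infinitely differentiable functions on $\Omega$. The topology in this space is defined by the following countable family of seminorms:

$$ \|f\|_{q,n} = \sum_{m_1+\cdots+m_d = q}\ \sup_{x \in \bar\Omega_n} \left| \frac{\partial^q f}{\partial x_1^{m_1} \cdots \partial x_d^{m_d}}(x) \right|. \tag{1.73} $$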

Exercise 1.6.6. Check that $C_0^\infty(K)$ and $C^\infty(\Omega)$ are Fréchet spaces.

The space $C_0^\infty(\Omega) = \cup_n C_0^\infty(\bar\Omega_n)$ of infinitely differentiable functions on $\Omega$ with a compact support is dense in $C^\infty(\Omega)$. In order to define its natural topology, we have to go beyond the class of Fréchet spaces (using inductive limits, see below).

A subset $M$ of a linear topological space $V$ is called bounded if it is absorbed by any neighbourhood of zero, that is, for any neighbourhood of zero $U$ there exists $\lambda > 0$ such that $\lambda M \subset U$. If $V$ is locally convex, with the topology generated by the semi-norms $p_\alpha$, then $M$ is bounded if and only if $\sup\{p_\alpha(x) : x \in M\} < \infty$ for any $\alpha$. The following elementary properties of bounded sets are worth noting.

Proposition 1.6.4. (i) Any compact set is bounded. (ii) The Minkowski functional $p_M$ of a bounded balanced convex absorbing set $M$ in a separated (or Hausdorff) locally convex space is a norm.

Proof. (i) For any seminorm $p$ and a point $x$ in a compact set $M$, there exists $\lambda_x > 0$ such that $x \in \{y : p(y) < \lambda_x\}$. Choosing a finite subset of this open cover of $M$ implies that $\sup\{p(y) : y \in M\} < \infty$. (ii) Since $p_M$ is a seminorm, one only has to check that $x \ne 0$ implies $p_M(x) \ne 0$. Due to the separation, for any $x \ne 0$ there exists a neighbourhood $U$ of zero such that $x \notin U$. Since $M$ is bounded, there exists $\lambda > 0$ such that $\lambda M \subset U$, and consequently $p_M(x) \ne 0$. $\square$

Remark 12. The norm $p_M(x)$ from Proposition 1.6.4 is generally not continuous and defines a stronger topology on the subspace of $V$ generated by $M$ than the topology that is induced by $V$.


Bounded sets are crucial ingredients for defining various useful topologies on the space of continuous linear operators $L(V_1, V_2)$ between two locally convex spaces $V_1, V_2$ with the topologies generated by the semi-norms $p^1_\alpha$ and $p^2_\beta$ respectively. The most important operator topologies are the topology of pointwise convergence defined by the family of seminorms $p^2_\beta(Lx)$ for $L \in L(V_1, V_2)$, with all possible $\beta$ and all points $x \in V_1$, and the topology of bounded convergence defined by the family of seminorms $\sup\{p^2_\beta(Lx) : x \in M\}$, with all possible $\beta$ and all bounded subsets $M \subset V_1$. If $V_1, V_2$ are Banach spaces, the topology of pointwise convergence coincides with the strong operator topology (as defined in Section 1.1), and the topology of bounded convergence is generated by the norm (1.1). The latter is sometimes referred to as the uniform topology.

Remark 13. An intermediate topology between the topologies of pointwise convergence and bounded convergence is the topology of compact convergence, which is defined by the family of seminorms $\sup\{p^2_\beta(Lx) : x \in M\}$, with all possible $\beta$ and all compact subsets $M \subset V_1$.

Important cases are the topologies on the dual space to $V$, denoted by $V'$ or $V^*$. The dual space is defined as the space of continuous linear functionals $V \to \mathbb{R}$, that is, $V' = L(V, \mathbb{R})$. (Of course, for complex spaces one uses $\mathbb{C}$ instead of $\mathbb{R}$ in this definition.) The topology of pointwise convergence in $L(V, \mathbb{R})$ is called the $*$-weak topology on $V'$, and the topology of bounded convergence is called the strong topology on $V'$.

Remark 14. Notice the mismatch in the standard nomenclature: for a Banach space $V$, the 'strong topology' on $L(V, \mathbb{R})$ is the '$*$-weak topology' on $V' = L(V, \mathbb{R})$.

A mapping between two locally convex spaces is called bounded if it takes bounded subsets to bounded subsets.

Remark 15. According to this definition, the linear mapping $f : x \mapsto x$ in $\mathbb{R}$ is bounded, although the function $f(x) = x$ is surely not bounded in the usual sense of calculus.

In Banach spaces, bounded sets are sets that are bounded in norm. Therefore, a linear mapping between Banach spaces is continuous if and only if it is bounded. In general locally convex spaces, continuity of a linear map $L$ implies that it is bounded, which follows directly from the definitions. But the converse implication does not hold in general, which gives rise to the following definition. A locally convex space $V$ is called bornological if any bounded linear mapping $L : V \to W$, with $W$ being any locally convex space, is continuous.

Proposition 1.6.5. Any metricizable locally convex space $V$, i.e., a locally convex space with a countable base of neighbourhoods of zero, is bornological.

Proof. If $V$ is metricizable, then there exists a countable base of open neighbourhoods $\{U_n\}$ of zero such that $U_{n+1} \subset U_n$ for all $n$. Let us assume that a linear mapping $L : V \to W$ is not continuous. Then there exists a neighbourhood $N$ of zero in $W$ such that $L^{-1}(N)$ is not a neighbourhood of zero, and hence $tL^{-1}(N)$ is not a neighbourhood either for any $t$. We can therefore choose a sequence $x_n \in U_n$ such that $x_n \notin nL^{-1}(N)$. Thus $\{x_n\}$ is a bounded set (because it converges to zero), but $\{L(x_n)\}$ is not (since it cannot be absorbed by $N$). Therefore, $L$ is not bounded. $\square$

A balanced closed convex absorbing set in a locally convex space $V$ is called a barrel. A locally convex space $V$ is called barrelled if any barrel in $V$ is a neighbourhood of zero.

Proposition 1.6.6. Any Fréchet space is barrelled.

Proof. This is a direct consequence of the fundamental Baire theorem, which states that a complete metric space cannot be represented as a countable union of nowhere-dense sets. An alternative direct proof can be found in [235]. $\square$

The importance of a space being barrelled lies primarily in the following principle of uniform boundedness, also called the Banach–Steinhaus theorem when applied to Banach spaces.

Theorem 1.6.1. Let $V$ be a barrelled space and $W$ any locally convex space. Let $L_\beta$ be a family of continuous linear mappings $V \to W$ such that the sets $\{L_\beta v\}$ are bounded in $W$ for any $v$. Then the family $L_\beta$ is equicontinuous.

Proof. For a closed balanced convex neighbourhood $N$ of zero in $W$, each set $L_\beta^{-1}(N)$ is closed, because $L_\beta$ is continuous. Hence the set $M = \cap_\beta L_\beta^{-1}(N)$ is closed, convex and balanced. By the last assumption on $L_\beta$, $M$ is also absorbing. Hence it is a barrel and therefore a neighbourhood of zero in $V$. The equicontinuity follows from $L_\beta(M) \subset N$. $\square$

A fundamental tool for constructing new classes of spaces is based on the idea of the inductive limit. Let $X_0 \subset X_1 \subset X_2 \subset \cdots$ be an increasing sequence of topological (e.g., metric) spaces such that for any $n$ the topology of $X_n$ coincides with the topology that is induced on it by $X_{n+1}$.
In other words, the open sets in Xn are sets of the form Xn ∩ U with U open subsets of Xn+1 . This implies that the inclusions Xn → Xn+1 are continuous. The topology of the inductive limit on X = ∪n Xn is the topology whose open sets U are subsets of X such that U ∩ Xn is open in Xn for any n. This implies that a mapping X → Y with any topological space Y is continuous if and only if its restriction to any of Xn is continuous. In fact, this property can be taken as an alternative definition of the inductive topology. By yet another equivalent characterization, the topology of the inductive limit is the strongest one on X for which all inclusions Xn → X are continuous. Exercise 1.6.7. Let V ⊂ W be two locally convex spaces such that the topology of V is induced by W . Show that for any balanced convex neighbourhood V0 of zero


in $V$, there exists a balanced convex neighbourhood $W_0$ of zero in $W$ such that $V_0 = W_0 \cap V$.

Applying this general notion to linear spaces, we can define the inductive limit of an increasing sequence of locally convex spaces $V_0 \subset V_1 \subset \cdots$ as the space $V = \cup_n V_n$ equipped with the inductive topology. This topology is also locally convex and separated if all $V_j$ are separated.

Exercise 1.6.8. Check the last claim.

Proposition 1.6.7. (i) The inductive limit of barrelled spaces is barrelled. (ii) The inductive limit of bornological spaces is bornological.

Proof. (i) If $U$ is a barrel in $V$, then $U \cap V_n$ is a barrel in $V_n$ and hence a neighbourhood there. Therefore, $U$ is a neighbourhood in $V$ by the definition of the inductive topology. (ii) If $L : V \to W$ is bounded, then the restriction of $L$ to $V_n$ is bounded and hence continuous. Therefore, $L : V \to W$ is continuous by the definition of the inductive topology. $\square$

Let us quote the following result that stresses the role of the inductive limit in the theory of locally convex spaces. Note, however, that we shall not use this fact. (Its proof is not very difficult and can be found, e.g., in [235].)

Proposition 1.6.8. (i) A Hausdorff locally convex space is bornological if and only if it is the inductive limit of normed spaces. (ii) A complete Hausdorff bornological space is the inductive limit of Banach spaces.

A prime example of the inductive limit is the space $D(\Omega) = C_c^\infty(\Omega)$ of infinitely differentiable functions on an open subset $\Omega$ in $\mathbb{R}^d$ with a compact support. It was represented earlier as $C_0^\infty(\Omega) = \cup_n C_0^\infty(\bar\Omega_n)$, where $\Omega_1 \subset \Omega_2 \subset \cdots$ is an increasing sequence of open subsets of $\Omega$ such that the closure $\bar\Omega_n$ of each $\Omega_n$ is a compact subset of $\Omega_{n+1}$ and $\Omega = \cup_n \Omega_n$. The natural topology on $D(\Omega)$ is the topology of the inductive limit of the Fréchet spaces $C_0^\infty(\bar\Omega_n)$. Equipped with this topology, it is usually referred to as the space of test functions on $\Omega$.
The dual spaces $D'(\Omega)$ to $D(\Omega)$ and $S'(\mathbb{R}^d)$ to the Schwartz space $S(\mathbb{R}^d)$ are called the spaces of generalized functions (or distributions) and tempered generalized functions (or distributions) respectively. They play an important role in the analysis of partial differential (and pseudo-differential) equations in $\mathbb{R}^d$, and will be discussed in more detail in the next sections (essentially independent from the general theory). At this stage, let us only note that the operation of differentiation extends to generalized functions by the usual duality, that is, for any $\xi \in D'$ and $\phi \in D$ one defines $(\xi', \phi) = -(\xi, \phi')$ (motivated by the integration by parts


formula). According to this definition, any generalized function is infinitely differentiable. This allows for an extension of the calculus beyond its usual boundaries. For instance, each locally integrable function $g(x)$ on $\mathbb{R}^n$ can be considered an element of $D'$ acting on $D$ as $\phi \mapsto \int g(x)\phi(x)\,dx$, and thus we have found a way to differentiate such functions.

The following result describes the convergence of sequences (not necessarily directed sets) in the inductive limit.

Proposition 1.6.9. Let $X = \cup X_n$ be the inductive limit of the sequence of locally convex spaces $X_n$ such that each $X_n$ is a proper closed subspace of $X_{n+1}$. A sequence $\{x_m\}$ converges in $X$ if and only if all $x_m$ belong to some $X_n$ and $\{x_m\}$ converges in $X_n$.

Remark 16. The proof is not very difficult. It is based on the Hahn–Banach theorem and can be found, e.g., in [231]. This general result is only quoted for the sake of completeness. We only need its application to the space of test functions $D(\Omega)$. But to work in this space, one can just define sequential convergence there with the help of this rule, instead of deriving it from the general construction of the inductive limit. (Note that this is the most common way to develop the theory, which is often found in the literature.)

Remark 17. From Proposition 1.6.9, it follows that the inductive limit $X$ is not metricizable. In particular, the space $D(\Omega)$ of test functions is not metricizable. In fact, assuming the existence of a countable base of open neighbourhoods $U_n$ of the origin in $X$, we can choose a sequence $x_n \in U_n$ such that $x_n \notin X_n$. Then $\{x_n\}$ has to converge to 0 by the definition of the base, but it does not do so by Proposition 1.6.9.

To complete this lengthy section, let us describe the smoothness in locally convex spaces. For a mapping $F : V \to W$ between two locally convex spaces $V, W$, the directional derivative at a point $Y$ in the direction $\xi \in V$ is defined as it is for Banach spaces:

$$ D_\xi F(Y) = DF(Y)[\xi] = \lim_{h \to 0+} \frac{1}{h}\bigl(F(Y + h\xi) - F(Y)\bigr). \tag{1.74} $$

If this derivative exists for some $\xi$ and all $Y$ from some convex subset $M$ of $V$, and if it is continuous in $Y$ (for this $\xi$), then the homogeneity property (1.15) holds for $Y \in M$, as well as the first-order Taylor expansion (1.16):

$$ F(Y + \xi) - F(Y) = \int_0^1 \frac{d}{ds} F(Y + s\xi)\,ds = \int_0^1 D_\xi F(Y + s\xi)\,ds. \tag{1.75} $$

The justification is the same as for the Banach case. If the mapping $D_\xi F(Y)$ is defined for all $Y$ in $M$ and all $\xi$, and if it is linear and continuous in $\xi$, then the linear operator $D_\xi F(Y)$ is usually called the Gâteaux derivative. If this is the case for all $Y$, then $F$ is called Gâteaux-differentiable on $M$.
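As a finite-dimensional illustration of the definition (1.74) and the expansion (1.75), the following sketch compares a one-sided difference quotient with a directional derivative computed by hand, and checks the integral identity by a midpoint rule. The map `F`, the point `Y` and the direction `XI` are hypothetical choices made only for this example; they do not come from the text.

```python
import math

def F(y):
    # Hypothetical smooth map R^3 -> R used only for illustration:
    # F(Y) = sum_i (sin(Y_i) + Y_i^2)
    return sum(math.sin(v) + v * v for v in y)

def exact_directional(y, xi):
    # Hand-computed D_xi F(Y) = sum_i (cos(Y_i) + 2 Y_i) xi_i, linear in xi
    return sum((math.cos(v) + 2.0 * v) * w for v, w in zip(y, xi))

def one_sided_quotient(y, xi, h):
    # Difference quotient from (1.74), with h > 0
    shifted = [v + h * w for v, w in zip(y, xi)]
    return (F(shifted) - F(y)) / h

def integral_of_derivative(y, xi, steps=2000):
    # Midpoint rule for the right-hand side of (1.75)
    total = 0.0
    for j in range(steps):
        s = (j + 0.5) / steps
        point = [v + s * w for v, w in zip(y, xi)]
        total += exact_directional(point, xi)
    return total / steps

Y = [0.3, -1.2, 2.0]
XI = [1.0, 0.5, -0.7]
quotient = one_sided_quotient(Y, XI, 1e-6)
taylor_lhs = F([v + w for v, w in zip(Y, XI)]) - F(Y)
taylor_rhs = integral_of_derivative(Y, XI)
```

In this example `quotient` should be close to `exact_directional(Y, XI)`, and `taylor_lhs` close to `taylor_rhs`, up to discretization error.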


The analogue to Proposition 1.2.1 (see also Remark 5) reads as follows:

Proposition 1.6.10. (i) If $D_\xi F(Y)$ exists for all $\xi$ and all $Y$ from $M$, and if it is continuous in $Y$ for any $\xi$, then $D_\xi F(Y)$ is linear in $\xi$. In particular, if $D_\xi F(Y)$ is also continuous in $\xi$ (the continuity at the point $\xi = 0$ is of course sufficient for the continuity of a linear map), then $F$ is Gâteaux-differentiable. (ii) If additionally the set $\{D_\xi F(Y) : Y \in M\}$ is bounded in $W$ for any $\xi$ and $V$ is a barrelled space, then the family of linear mappings $D_\xi F(Y)$ (parametrized by $Y \in M$) is equicontinuous, and therefore the mapping $(\xi, Y) \mapsto D_\xi F(Y)$ is continuous.

Proof. (i) This repeats exactly the proof of Proposition 1.2.1. (ii) This follows from the principle of uniform boundedness (Theorem 1.6.1). $\square$

One can imagine several reasonable extensions of the notion of Fréchet derivatives that turn into the usual derivative for Banach spaces. The most natural extensions are as follows. One says that a function $F$ on $M$ is Fréchet-differentiable at $Y$ (respectively strongly Fréchet-differentiable) if there exists an element $DF(Y) \in B^*$, called the derivative of $F$ at $Y$, such that

$$ \lim_{h \to 0} \frac{F(Y + h\xi) - F(Y) - hDF(Y)[\xi]}{h} = 0 \tag{1.76} $$

uniformly for $\xi$ from any bounded subset of $V$ (respectively from any neighbourhood of zero). In Banach spaces, neighbourhoods are bounded sets, and thus both notions yield the usual Fréchet derivative in the case of Banach spaces. Varying the class of subsets for which uniform convergence in (1.76) is required leads to other notions of differentiability. For instance, the Gâteaux derivative corresponds to the choice of singletons, so that Fréchet-differentiability implies Gâteaux-differentiability, as for Banach spaces. The so-called Hadamard derivative corresponds to the choice of compact sets, which makes it an intermediate notion between the Gâteaux and the Fréchet derivative.

Difficulties with all these derivatives in a general locally convex context mostly arise when it comes to the chain rule and thus a systematic development of the usual rules of calculus. This has to do with the fact that the mapping of the composition of continuous linear operators $L(V, B) \times L(B, W) \to L(V, W)$ (taken in the topology of bounded convergence) turns out to be continuous (as a function of two variables) for Banach intermediate spaces $B$ only (see, e.g., [264] for a proof). In concrete situations, however, problems with the chain rule can usually be overcome by introducing some sort of equicontinuity assumptions on the derivatives.


1.7 Linear operators in spaces of measures and functions

We shall now discuss the basic representations of linear operators on the spaces of measures in terms of transition kernels, and on the main functional spaces as pseudo-differential operators (ΨDO). In this context, we define the Fourier transform in order to clarify the related notation and highlight its basic properties. Their proofs are omitted, but can be found in numerous books such as [232].

A transition kernel (or just kernel) $\nu(x, A)$ or $\nu(x, dy)$ from a topological space $X$ to a topological space $Y$ is a function of two variables such that, for each $x \in X$, $\nu(x, .) \in M^+(Y)$, and for each Borel set $A$, $\nu(., A)$ is a Borel-measurable function on $X$. A signed transition kernel is defined in the same way, but with $\nu(x, .) \in M(Y)$. If $X = Y$, then $\nu$ is referred to as the kernel in $X$. Any signed transition kernel defines a linear operator from bounded measurable functions on $Y$ to measurable functions on $X$, where $\nu$ plays the role of the integral kernel via the formula

$$ T_\nu f(x) = \int f(y)\,\nu(x, dy). \tag{1.77} $$

If all $\nu(x, .)$ are positive measures, then this $T_\nu$ is positivity preserving: it takes non-negative functions to non-negative functions. A kernel $\nu$ is called bounded if $\sup_x \|\nu(x, .)\| < \infty$. In particular, $\nu$ is a probability kernel or stochastic kernel if all measures $\nu(x, .)$ are probability measures. One says that a signed transition kernel $\nu$ is weakly continuous if $T_\nu f(x)$ is a continuous function for any $f \in C(Y)$. For a weakly continuous bounded signed kernel $\nu$, the operator $T_\nu$ is a bounded linear operator $C(Y) \to C(X)$. For $X$ a locally compact space, the dual operator $T_\nu'$ acts in the space of measures $M(X)$ by the formula

$$ (T_\nu' \mu)(dy) = \int \mu(dx)\,\nu(x, dy). \tag{1.78} $$

If $X = \mathbb{R}^d$ and the kernel $\nu(x, dy)$ has a dual kernel $\nu'$ such that $\nu(x, dy)\,dx = \nu'(y, dx)\,dy$, then the dual operator can be reduced to the action on $C(\mathbb{R}^d)$ via the formula

$$ (T_\nu' g)(y) = \int g(x)\,\nu'(y, dx). \tag{1.79} $$
The famous Riesz–Markov theorem states that if $Y$ is a locally compact metric space, then any bounded linear functional on $C_\infty(Y)$ is given by integration with respect to a Borel measure. Consequently, in this case any bounded linear operator $T : C_\infty(Y) \to C(X)$ is given by (1.77) with some bounded weakly continuous signed transition kernel $\nu$. These operators are contractions if and only if $\|\nu(x, .)\| \le 1$ for any $x$. Of course, this holds for probability kernels $\nu$. Therefore, bounded linear operators $C_\infty(Y) \to C(X)$ can be naturally lifted to bounded linear operators $C(Y) \to C(X)$, given by the same formula (1.77). This lifting can be described in an alternative way without a reference to the structure given by $\nu$. Namely, a sequence of functions $f_n \in C(X)$ is said to converge to $f \in C(X)$ in the bounded-pointwise topology (or shortly, in bp-topology) if the family $\{f_n\}$ is uniformly bounded and $f_n(x) \to f(x)$ for any $x$, as $n \to \infty$. If $\phi$ is a bounded linear functional on $C_\infty(X)$, then its lifting to $C(X)$ can be performed by the bp-closure: for $f \in C(X)$, $\phi(f) = \lim_{n \to \infty} \phi(f_n)$ for any sequence $f_n \in C_\infty(X)$ that is bp-converging to $f$. (The existence of the limit follows from the dominated convergence theorem and the fact that $\phi$ can be represented via the measure $\nu$.) Similarly, the operators $C_\infty(Y) \to C(X)$ are lifted to $C(Y)$ by the bp-closure. It is worth noting that $C_\infty(X)$ is closed in $C(X)$ in its Banach topology, but it is dense in $C(X)$ in the bp-topology.

In principle, the operators on $C(X)$ (not $C_\infty(X)$) can be quite different from those lifted from $C_\infty(X)$. For instance, the formulas $f \mapsto \limsup_{x \to \infty} f(x)$ or $f \mapsto \limsup_{n \to \infty} f(x_n)$, with any sequence $x_n$ tending to infinity, define linear continuous functionals on $C(\mathbb{R})$. However, such functionals are rarely used.

Proposition 1.7.1. For a bounded linear functional $\phi : C(X) \to \mathbb{R}$ with a locally compact metric space $X$, the following three properties are equivalent:
(i) $\phi(f) = \int f(x)\,\mu(dx)$ with some $\mu \in M(X)$,
(ii) $\phi$ is bp-continuous,
(iii) $\phi$ is obtained by the bp-closure from a bounded functional on $C_\infty(X)$.

Proof. The implication (iii) to (i) follows from the structure of the functionals on $C_\infty(X)$, i.e., from the Riesz–Markov theorem. Implication (i) to (ii) follows from the dominated convergence theorem. Finally, any bounded linear functional $\phi : C(X) \to \mathbb{R}$ can be restricted to $C_\infty(X)$, where it has a structure as given in (i). If it is bp-continuous, its value on any element of $C(X)$ will be recovered by the bp-closure from this restriction. $\square$
Similarly, a bounded operator $T : C(Y) \to C(X)$ has a representation (1.77) with a bounded kernel $\nu$ if and only if it is bp-bp-continuous, i.e., it takes bp-converging sequences to bp-converging sequences. We shall only work with operators on $C(X)$ that are lifted from the operators on $C_\infty(X)$ by bp-continuity.

In practice, unbounded operators in $C(\mathbb{R}^d)$, or in $C(X)$ with $X$ a convex subset of $\mathbb{R}^d$, are often well defined on some classes of smooth functions. In order to distinguish this class of operators, let us say that, for a $k \in \mathbb{N}$, an unbounded operator $A$ in $C(\mathbb{R}^d)$ is of at most $k$th order or has order not exceeding $k$ if $A$ is a bounded operator $C^k(\mathbb{R}^d) \to C(\mathbb{R}^d)$. The structure of such operators, when reduced to $C^k_\infty(\mathbb{R}^d)$, is as follows.

Proposition 1.7.2. If $A$ is a bounded operator $C^k_\infty(X) \to C(X)$, then

$$ Af(x) = \sum_{m=0}^{k} \sum_{i_1, \ldots, i_m} \int \frac{\partial^m f(y)}{\partial y_{i_1} \cdots \partial y_{i_m}}\, \nu_{i_1 \cdots i_m}(x, dy), \tag{1.80} $$

with certain weakly continuous bounded signed transition kernels $\nu_{i_1 \cdots i_m}$.


Proof. The mapping from $f$ to the collection of all its partial derivatives up to order $k$ is a bounded injection of $C^k_\infty(X)$ to the product of a sufficient number of $C(X)$. Hence, by the Hahn–Banach theorem, any bounded linear functional on $C^k_\infty(X)$ can be extended to a bounded linear functional on this product of $C(X)$, which yields (1.80). $\square$

The simplest examples of such operators are of course the differential operators:

$$ Af(x) = \sum_{m=0}^{k} \sum_{i_1, \ldots, i_m} A_{i_1 \cdots i_m}(x)\, \frac{\partial^m f(x)}{\partial x_{i_1} \cdots \partial x_{i_m}}. \tag{1.81} $$

Another particular case are integral operators on the tails of the Taylor expansion:

$$ Af(x) = \int \left[ f(x + y) - \sum_{m=0}^{k} \frac{1}{m!} \sum_{i_1, \ldots, i_m} \frac{\partial^m f(x)}{\partial x_{i_1} \cdots \partial x_{i_m}}\, y_{i_1} \cdots y_{i_m} A_m(y) \right] \nu(x, dy), \tag{1.82} $$

where $\nu$ is not necessarily a bounded kernel, but only such that the integral is well defined on smooth functions, that is, such that $\min(|y|^{k+1}, 1)$ is integrable. The functions $A_m$ play the role of a mollifier. Sometimes, they are indeed needed, but often $A_m(y) = 1$ is enough. The main example of such operators is given by the powers of Laplacians, see (1.145) below. Another class of examples corresponds to the cases with $k = 1$ and $k = 2$:

$$ A_1^\nu f(x) = \int (f(x + y) - f(x))\,\nu(x, dy), \qquad A_2^\nu f(x) = \int \bigl(f(x + y) - f(x) - (\nabla f(x), y)\chi(y)\bigr)\,\nu(x, dy), \tag{1.83} $$

where $\int \min(|y|, 1)\,\nu(x, dy) < \infty$ in the first equation and $\int \min(|y|^2, 1)\,\nu(x, dy) < \infty$ in the second one. These operators arise in stochastic analysis, where they are often referred to as Lévy–Khintchin-type operators. The mollifier $\chi$ in (1.83) is conventionally chosen either as $\chi(y) = 1/(1 + y^2)$ or as $\chi(y) = \mathbf{1}_{|y| \le 1}$.

In order to represent operators (1.82) in the general form (1.80) with bounded kernels $\nu$, one has to expand $f(x + y)$ in a Taylor series. For instance, if $d = 1$, the first operator in (1.83) can be written equivalently as

$$ A_1^\nu f(x) = \int_{-\infty}^{\infty} f'(x + z)\Phi(x, z)\,dz, \qquad \Phi(x, z) = \begin{cases} \Phi_+(x, z) = \int_z^{\infty} \nu(x, dy), & z > 0, \\ \Phi_-(x, z) = \int_{-\infty}^{z} \nu(x, dy), & z < 0, \end{cases} \tag{1.84} $$


so that $\Phi_\pm(x, .)$ are positive functions, decreasing (respectively increasing) in $z$, and

$$ \int_0^{\infty} \Phi_+(x, z)\,dz = \int_0^{\infty} y\,\nu(x, dy), \qquad \int_{-\infty}^{0} \Phi_-(x, z)\,dz = -\int_{-\infty}^{0} y\,\nu(x, dy). $$

And the second operator in (1.83) with $\chi = 1$ (for simplicity) can be written as

$$ A_2^\nu f(x) = \int_{-\infty}^{\infty} f''(x + z)\Phi(x, z)\,dz, \qquad \Phi(x, z) = \begin{cases} \Phi_+(x, z) = \int_z^{\infty} (y - z)\,\nu(x, dy), & z > 0, \\ \Phi_-(x, z) = \int_{-\infty}^{z} (z - y)\,\nu(x, dy), & z < 0, \end{cases} \tag{1.85} $$

so that $\Phi_\pm(x, .)$ are positive convex functions, decreasing (respectively increasing) in $z$:

$$ \frac{\partial \Phi_+(x, z)}{\partial z} = -\int_z^{\infty} \nu(x, dy) < 0, \qquad \frac{\partial \Phi_-(x, z)}{\partial z} = \int_{-\infty}^{z} \nu(x, dy) > 0. $$

Exercise 1.7.1. Check the formulas (1.84) and (1.85).

The representation (1.80) is of course not unique. The situation becomes simpler in dimension $d = 1$. In this case, the Taylor expansion

$$ f(x) = f(a) + \sum_{m=1}^{k-1} f^{(m)}(a)\,\frac{(x - a)^m}{m!} + \int_a^x f^{(k)}(y)\,\frac{(x - y)^{k-1}}{(k - 1)!}\,dy $$

yields a bijective mapping from $C^k([a, b])$ to $\mathbb{R}^k \times C([a, b])$. Consequently, any continuous linear operator $A : C^k([a, b]) \to C([a, b])$ has a unique representation

$$ Af(x) = \alpha_0(x) f(a) + \sum_{m=1}^{k-1} \alpha_m(x) f^{(m)}(a) + \int_a^b f^{(k)}(y)\,\nu(x, dy), \tag{1.86} $$

with some signed weakly continuous bounded transition kernel $\nu$ in $[a, b]$ and continuous functions $\alpha_j(x)$.

The most appropriate language for describing the operators in the spaces of smooth functions is the language of pseudo-differential operators. In order to introduce this language properly, let us recall the basics of the Fourier transform (see any text on analysis for detail, e.g., [232], if needed) and also fix the notations. The classical Fourier theorem states that the Fourier transform

$$ F : \phi \mapsto (F\phi)(p) = \int e^{-i(p,x)} \phi(x)\,dx \tag{1.87} $$


is an isomorphism of the Schwartz space $S(\mathbb{R}^n)$, with the inversion formula

$$ F^{-1} : \psi \mapsto (F^{-1}\psi)(p) = (2\pi)^{-n} \int e^{i(p,x)} \psi(x)\,dx = (2\pi)^{-n} F\psi(-p). \tag{1.88} $$

Remark 18. An annoying inconsistency prevails in the definitions of the Fourier transform, since it can also be defined as $F : \phi \mapsto (F\phi)(p) = \int e^{i(p,x)} \phi(x)\,dx$, and also with various multipliers like $(2\pi)^{-n}$ or $(2\pi)^{-n/2}$. These differences affect the form of the inverse transform and the sign in the link between the differentiation of a function and the multiplication of its Fourier transform by $p$.

The most basic example of a Fourier transform is the transform of the exponential of a quadratic form:

$$ \int e^{-i(p,x)} \exp\left\{-\frac{1}{2}(Ax, x)\right\} dx = \frac{(2\pi)^{d/2}}{\sqrt{\det A}} \exp\left\{-\frac{1}{2}(p, A^{-1}p)\right\} \tag{1.89} $$

for a symmetric positive matrix $A$.

Exercise 1.7.2. Prove (1.89). Hint: Bring $A$ into a diagonal form, i.e., express it as $A = ODO^{-1}$ with a diagonal matrix $D$ and an orthogonal matrix $O$. Then, one can reduce the calculations to the one-dimensional case.

The Riemann–Lebesgue lemma states that $F$ extends to a bounded linear operator $L^1(\mathbb{R}^n) \to C_\infty(\mathbb{R}^n)$. It also extends to a bounded operator $M(\mathbb{R}^n) \to C(\mathbb{R}^n)$ and to an isomorphism of the space $L^2(\mathbb{R}^n)$, so that the following isometry relation holds:

$$ \int \overline{(Ff)(x)}\,(Fg)(x)\,dx = (2\pi)^n \int \bar f(x) g(x)\,dx. \tag{1.90} $$

Another fundamental fact on the range of the Fourier transform is given by the Paley–Wiener theorem, which states that the image under the Fourier transform of the space of functions from $S(\mathbb{R}^n)$ with a compact support in the ball $\{y : |y| \le R\}$ coincides with the space of entire analytic functions $g$ on $\mathbb{C}^n$ such that, for any $N > 0$,

$$ |g(\xi)| \le C_N e^{R|\mathrm{Im}\,\xi|} (1 + |\xi|)^{-N}, \tag{1.91} $$

with a constant $C_N$ depending on $N$ (see proofs in [232] or [90]).
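The Gaussian formula (1.89) can be checked numerically in dimension $d = 1$, where the matrix $A$ reduces to a positive number $a$. The sketch below is only an illustration under that assumption: the function names, the truncation of the integral to a finite interval and the midpoint rule are choices made for this example and are not taken from the text.

```python
import math

def gaussian_fourier_numeric(p, a, half_width=12.0, steps=20000):
    # Midpoint rule for the integral of exp(-ipx) * exp(-a x^2 / 2) over
    # [-half_width, half_width]. The integrand's imaginary part is odd in x,
    # so only the cosine part contributes.
    h = 2.0 * half_width / steps
    total = 0.0
    for j in range(steps):
        x = -half_width + (j + 0.5) * h
        total += math.cos(p * x) * math.exp(-0.5 * a * x * x)
    return total * h

def gaussian_fourier_closed(p, a):
    # Right-hand side of (1.89) for d = 1 and A = (a), a > 0
    return math.sqrt(2.0 * math.pi / a) * math.exp(-0.5 * p * p / a)

numeric_value = gaussian_fourier_numeric(1.5, 2.0)
closed_value = gaussian_fourier_closed(1.5, 2.0)
```

For the sample values used here, the Gaussian tail outside the integration interval is negligible, so `numeric_value` and `closed_value` should agree to high accuracy.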
The Fourier transform is used as the universal tool for diagonalizing translation invariant linear operators, because, as can be seen by a direct application of integration by parts, the operator (1.87) turns the operator of differentiation into an operator of multiplication:

$$ F\left(\frac{\partial \phi}{\partial x_j}\right)(p) = \int e^{-i(p,x)}\, \frac{\partial \phi}{\partial x_j}(x)\,dx = i p_j (F\phi)(p). \tag{1.92} $$


This implies that operator (1.87) turns differential operators with constant coefficients into operators of multiplication by a function. Moreover, the Fourier transform takes the operation of convolution of functions, $(\phi \star \psi)(x) = \int \phi(x - y)\psi(y)\,dy$, to a multiplication of functions:

$$ F(\phi \star \psi)(p) = (F\phi)(p)(F\psi)(p), \tag{1.93} $$

and vice versa:

$$ F(\phi\psi) = (2\pi)^{-d} (F\phi) \star (F\psi). \tag{1.94} $$

For an operator $A$ acting in a space of functions on $\mathbb{R}^n$, its symbol is defined as the following function of two variables:

$$ \psi(x, p) = \exp\{-ixp\}\,(A \exp\{ip\,\cdot\})(x), \tag{1.95} $$

whenever this expression is well defined. Representing a function $u$ via the Fourier inversion formula

$$ u(x) = \frac{1}{(2\pi)^n} \int \hat u(p) e^{ixp}\,dp $$

yields

$$ Au(x) = \frac{1}{(2\pi)^n} \int \hat u(p)\,(A e^{ip\,\cdot})(x)\,dp = \frac{1}{(2\pi)^n} \int \hat u(p) e^{ixp} \psi(x, p)\,dp, \tag{1.96} $$

which expresses the action of $A$ in terms of its symbol. For instance, the operator (1.80) of order at most $k$ has the symbol

$$ \psi(x, p) = e^{-ipx} \sum_{m=0}^{k} i^m \sum_{i_1, \ldots, i_m} p_{i_1} \cdots p_{i_m} \int e^{iyp}\, \nu_{i_1 \cdots i_m}(x, dy). $$

For a function $\psi(x, p)$ which is a polynomial in the variables $p = (p_1, \ldots, p_n)$, the differential operator $\psi(x, -i\nabla)$ acts as

$$ \psi(x, -i\nabla)u(x) = \frac{1}{(2\pi)^n} \int \hat u(p) e^{ixp} \psi(x, p)\,dp, $$

because the operator of differentiation turns into the operator of multiplication under the Fourier transform. Therefore, the general operator $A$ with the symbol $\psi(x, p)$ acting by (1.96) can be naturally denoted by $\psi(x, -i\nabla)$. Operators represented in this form are called pseudo-differential operators (ΨDOs) with symbols $\psi$. In cases where $\psi(x, p) = \psi(p)$ does not depend on $x$, $\psi(-i\nabla)$ is referred to as an operator with constant coefficients, or as a spatially homogeneous operator. By (1.96), the action of $\psi(-i\nabla)$ on a function is equivalent to the multiplication of its Fourier image by $\psi(p)$, so that

$$ F[\psi(-i\nabla)u](y) = \psi(y)(Fu)(y), \tag{1.97} $$

$$ F[V(.)u(.)](y) = V(i\nabla)(Fu)(y) \tag{1.98} $$

for bounded continuous functions $\psi$ and $V$. Moreover, as follows from (1.93), these operators commute with the convolution in the sense that

$$ \psi(-i\nabla)(\phi \star \psi) = (\psi(-i\nabla)\phi) \star \psi = \phi \star (\psi(-i\nabla)\psi), \tag{1.99} $$

whenever all these expressions are well defined.
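The definition (1.95) of the symbol can be made concrete on a very simple operator. As an illustrative assumption (not an example from the text), take $d = 1$ and the first operator in (1.83) with $\nu(x, dy)$ a unit mass at a fixed point $y = h$, so that $Af(x) = f(x + h) - f(x)$. Applying (1.95) to the plane wave $e^{ip\,\cdot}$ then gives a symbol that does not depend on $x$, namely $e^{iph} - 1$. The function names below are hypothetical.

```python
import cmath

def apply_difference_operator(f, x, h):
    # A f(x) = f(x + h) - f(x): the first operator in (1.83) with nu(x, dy)
    # assumed to be a unit mass at y = h (illustrative choice)
    return f(x + h) - f(x)

def symbol(x, p, h):
    # Definition (1.95): psi(x, p) = exp(-ixp) * (A exp(ip .))(x)
    def plane_wave(y):
        return cmath.exp(1j * p * y)
    return cmath.exp(-1j * x * p) * apply_difference_operator(plane_wave, x, h)

def expected_symbol(p, h):
    # Closed form obtained by hand for this particular operator
    return cmath.exp(1j * p * h) - 1.0

sample = symbol(0.7, 2.3, 0.4)
reference = expected_symbol(2.3, 0.4)
```

Since the operator commutes with translations, the computed value should be the same for every `x`, which is the spatially homogeneous case $\psi(x, p) = \psi(p)$ discussed above.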

1.8 Fractional calculus In this book, we shall demonstrate in many places that an extension of the theory of ODEs that is required for including fractional derivatives can be achieved more or less directly if one derives both of them from an appropriately formulated general theory of integral equations. In this section, we present the necessary background on fractional calculus in concise manner, namely the definitions of fractional integrals and derivatives in their Banach-space-valued extension, their most important alternative representations, the link between integrals and derivatives, the action of fractional derivatives on the exponents and how this leads to the symbols of ΨDOs representing these derivatives, and finally their finite-dimensional extensions. Remark 19. Readers who are not interested in the ‘fractional’ development may well skip this section and just note the definitions of fractional integrals that we occasionally use as a tool for streamlining the treatment of certain series expansions. Let Ia f be the integration operator defined on the set of continuous curves f ∈ C([a, b], B), with B a Banach space, as x Ia f (x) = f (t) dt. a

Integration by parts yields
\[
I_a^2 f(x) = \int_a^x (I_a f)(y)\,dy = \int_a^x (x-y) f(y)\,dy.
\]
Similarly, by induction one gets the following formula for the iterated Riemann integral:
\[
I_a^n f(x) = \frac{1}{(n-1)!}\int_a^x (x-t)^{n-1} f(t)\,dt. \tag{1.100}
\]


This formula suggests a natural analytical extension, if x > a, to complex n with positive real part, leading to the following definition of the (left) fractional or Riemann–Liouville (RL) integral of order β, for any β with positive real part:
\[
I_a^{\beta} f(x) = I_{a+}^{\beta} f(x) = \frac{1}{\Gamma(\beta)}\int_a^x (x-t)^{\beta-1} f(t)\,dt. \tag{1.101}
\]
As can be expected from this definition, one finds the following semigroup property of fractional integrals by direct integration:
\[
I_a^{\beta_1} I_a^{\beta_2} f(x) = I_a^{\beta_1+\beta_2} f(x), \tag{1.102}
\]
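The definitions (1.100) to (1.102) can be explored numerically. The following is only a sketch for scalar-valued f: the function name `rl_integral`, the substitution u = (x − t)^β used to tame the kernel singularity, and the midpoint rule are choices made for this illustration and are not taken from the text.

```python
import math

def rl_integral(f, a, x, beta, n=2000):
    """Approximate the left RL integral (1.101) of order beta > 0 at a point x > a.

    The substitution u = (x - t)**beta turns (1.101) into
        1 / (beta * Gamma(beta)) * integral_0^{(x-a)**beta} f(x - u**(1/beta)) du,
    which is evaluated here with the composite midpoint rule on n panels.
    """
    upper = (x - a) ** beta
    h = upper / n
    total = 0.0
    for i in range(n):
        u = (i + 0.5) * h
        total += f(x - u ** (1.0 / beta))
    return total * h / (beta * math.gamma(beta))

a, x = 0.0, 1.5

# (1.100) with n = 2: the twice iterated integral of cos from 0 equals 1 - cos(x).
iterated = rl_integral(math.cos, a, x, 2.0)
print(iterated, 1.0 - math.cos(x))

# Semigroup property (1.102) for f(t) = t, checked by nesting two quadratures.
b1, b2 = 0.7, 0.6
inner = lambda y: rl_integral(lambda t: t, a, y, b2, n=300) if y > a else 0.0
lhs = rl_integral(inner, a, x, b1, n=300)
rhs = rl_integral(lambda t: t, a, x, b1 + b2)
print(lhs, rhs)
```

The nested evaluation costs n² function calls, so the panel count is kept small there; the two printed pairs should agree to several digits.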

for positive $\beta_1, \beta_2$ and any continuous curve f in B. On the other hand, direct differentiation shows that for β > k (only β ∈ R are considered here) with k ∈ N,
\[
\frac{d^k}{dx^k} I_a^{\beta} f(x) = I_a^{\beta-k} f(x). \tag{1.103}
\]
Since
\[
(I_a^{\beta}\mathbf{1})(x) = \frac{(x-a)^{\beta}}{\Gamma(\beta+1)},
\]
where $\mathbf{1}$ is the constant function with value 1, it follows from the definition of the Mittag-Leffler function (see (9.13)) that
\[
E_{\beta}(\lambda (t-a)^{\beta}) = \Big(\sum_{j=0}^{\infty} \lambda^j I_a^{j\beta}\Big)\mathbf{1}(t), \tag{1.104}
\]
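As a quick illustration of (1.104) for a scalar λ, the series on the right can be summed term by term using $(I_a^{j\beta}\mathbf{1})(t) = (t-a)^{j\beta}/\Gamma(j\beta+1)$. The sketch below assumes the standard power-series definition of the Mittag-Leffler function, $E_\beta(z) = \sum_j z^j/\Gamma(j\beta+1)$ (formula (9.13) is not reproduced in this section), and compares the sum with three classical closed forms; the function name is hypothetical.

```python
import math

def rl_series(beta, lam, a, t, tol=1e-17, max_terms=1000):
    """Partial sum of sum_j lam**j * (t - a)**(j*beta) / Gamma(j*beta + 1).

    Suitable for moderate values of |lam| * (t - a)**beta only: math.gamma
    overflows for arguments above about 171, so the loop stops before that.
    """
    z = lam * (t - a) ** beta
    total = 0.0
    for j in range(max_terms):
        arg = j * beta + 1.0
        if arg > 170.0:
            break
        term = z ** j / math.gamma(arg)
        total += term
        if abs(term) < tol * max(1.0, abs(total)):
            break
    return total

# beta = 1: the series is the exponential exp(lam * (t - a)).
print(rl_series(1.0, 0.5, 0.0, 2.0), math.exp(1.0))
# beta = 2: E_2(z) = cosh(sqrt(z)).
print(rl_series(2.0, 1.0, 0.0, 1.5), math.cosh(1.5))
# beta = 1/2: E_{1/2}(z) = exp(z**2) * erfc(-z).
print(rl_series(0.5, 1.0, 0.0, 1.0), math.e * math.erfc(-1.0))
```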

for any number λ, or more generally for any bounded linear operator λ in a Banach space B. Noting that differentiation is the inverse operation to usual integration, the definition (1.101) of the fractional integral suggests two notions of fractional derivatives:

1) the so-called RL (left) derivative of order β ∈ (n, n + 1), with n a non-negative integer:
\[
D_{a+}^{\beta} f(x) = \frac{d^{n+1}}{dx^{n+1}} I_a^{n+1-\beta} f(x)
= \frac{1}{\Gamma(n+1-\beta)}\,\frac{d^{n+1}}{dx^{n+1}}\int_a^x (x-t)^{n-\beta} f(t)\,dt, \quad x > a, \tag{1.105}
\]
and 2) the so-called Caputo (left) derivative of order β ∈ (n, n + 1):
\[
D_{a+*}^{\beta} f(x) = I_a^{n+1-\beta}\Big[\frac{d^{n+1}}{dx^{n+1}} f\Big](x)
= \frac{1}{\Gamma(n+1-\beta)}\int_a^x (x-t)^{n-\beta}\,\frac{d^{n+1}}{dt^{n+1}} f(t)\,dt, \quad x > a. \tag{1.106}
\]


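Proposition 1.8.1. For $f \in C^1(\mathbb{R}, B)$ and β ∈ (0, 1), x > a,
\[
D_{a+}^{\beta} f(x) = \frac{1}{\Gamma(-\beta)}\int_0^{x-a}\frac{f(x-z)-f(x)}{z^{1+\beta}}\,dz + \frac{f(x)}{\Gamma(1-\beta)(x-a)^{\beta}}, \tag{1.107}
\]
\[
D_{a+*}^{\beta} f(x) = \frac{1}{\Gamma(-\beta)}\int_0^{x-a}\frac{f(x-z)-f(x)}{z^{1+\beta}}\,dz + \frac{f(x)-f(a)}{\Gamma(1-\beta)(x-a)^{\beta}}, \tag{1.108}
\]
implying
\[
D_{a+*}^{\beta} f(x) = D_{a+}^{\beta}[f-f(a)](x) = D_{a+}^{\beta} f(x) - \frac{f(a)}{\Gamma(1-\beta)|x-a|^{\beta}}. \tag{1.109}
\]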

Proof. Integrating by parts yields
\[
I_{a+}^{1-\beta} f(x) = \frac{1}{\Gamma(1-\beta)}\int_a^x (x-t)^{-\beta} f(t)\,dt
= -\frac{1}{\Gamma(1-\beta)}\int_a^x f(t)\,\frac{d}{dt}\frac{(x-t)^{1-\beta}}{1-\beta}\,dt
\]
\[
= \frac{1}{\Gamma(1-\beta)}\,\frac{(x-a)^{1-\beta}}{1-\beta}\,f(a) + \frac{1}{\Gamma(1-\beta)}\int_a^x \frac{(x-t)^{1-\beta}}{1-\beta}\,f'(t)\,dt,
\]
so that
\[
D_{a+}^{\beta} f(x) = \frac{d}{dx} I_{a+}^{1-\beta} f(x) = \frac{f(a)}{\Gamma(1-\beta)(x-a)^{\beta}} + \frac{1}{\Gamma(1-\beta)}\int_a^x (x-t)^{-\beta} f'(t)\,dt. \tag{1.110}
\]
Another integration by parts using $f'(t) = (f(t)-f(x))'$ yields
\[
D_{a+}^{\beta} f(x) = \frac{f(a)}{\Gamma(1-\beta)(x-a)^{\beta}} - \frac{f(a)-f(x)}{\Gamma(1-\beta)(x-a)^{\beta}} - \frac{\beta}{\Gamma(1-\beta)}\int_a^x \frac{f(t)-f(x)}{(x-t)^{1+\beta}}\,dt,
\]
which equals the r.h.s. of (1.107). On the other hand,
\[
D_{a+*}^{\beta} f(x) = I_{a+}^{1-\beta} f'(x) = \frac{1}{\Gamma(1-\beta)}\int_a^x (x-t)^{-\beta} f'(t)\,dt,
\]
which differs from (1.110) by $f(a)(x-a)^{-\beta}/\Gamma(1-\beta)$. This leads to (1.108) and (1.109). □

In particular, it follows that for smooth bounded integrable functions, the left RL and Caputo derivatives coincide for a = −∞, β ∈ (0, 1). Therefore, one defines the fractional derivative in generator form as their common value:
\[
\frac{d^{\beta}}{dx^{\beta}} f(x) = D_{-\infty+}^{\beta} f(x) = D_{-\infty+*}^{\beta} f(x) = \frac{1}{\Gamma(-\beta)}\int_0^{\infty}\frac{f(x-z)-f(x)}{z^{1+\beta}}\,dz. \tag{1.111}
\]
The following corollary is important for building the generalized fractional calculus, as performed in Chapter 8.


Proposition 1.8.2. The operators $D_{a+*}^{\beta}$ (respectively $D_{a+}^{\beta}$) are obtained from $D_{-\infty+}^{\beta}$ by the restriction of its action to the space $C^1([a, \infty), B)$, which can be considered as the subspace of functions from $C^1(\mathbb{R}, B)$ that are constant for x ≤ a (respectively to the subspace $C^1_{kill}([a, \infty), B)$ consisting of functions that vanish for x ≤ a).

Proof. One only has to observe that, since Γ(1 − β) = −βΓ(−β),
\[
\frac{1}{\Gamma(1-\beta)(x-a)^{\beta}} = -\frac{1}{\Gamma(-\beta)}\int_{x-a}^{\infty}\frac{dz}{z^{\beta+1}}.
\]
□

As can be expected from the definition, an important consequence of (1.109) is that the fractional derivatives form the left inverse operations to the RL integrals:
\[
D_{a+*}^{\beta} I_a^{\beta} g(x) = D_{a+}^{\beta} I_a^{\beta} g(x) = g(x), \tag{1.112}
\]
for any continuous curve g and x > a. In fact, the first equation follows from (1.109) and the second is obtained from the definition and (1.102):
\[
D_{a+}^{\beta} I_a^{\beta} g = \frac{d}{dx} I_a^{1-\beta} I_a^{\beta} g = g.
\]
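The relations (1.109) and (1.112) can be checked numerically for β ∈ (0, 1) on simple scalar examples. The sketch below is an illustration under stated assumptions: the quadrature helper and all names are hypothetical, and the closed form used for the RL derivative of a power, $D_{a+}^{\beta}(x-a)^{\gamma} = \frac{\Gamma(\gamma+1)}{\Gamma(\gamma+1-\beta)}(x-a)^{\gamma-\beta}$, is a standard fact that is not derived in this section.

```python
import math

def rl_integral(f, a, x, beta, n=4000):
    """Midpoint-rule approximation of (1.101) after the substitution u = (x - t)**beta."""
    upper = (x - a) ** beta
    h = upper / n
    total = 0.0
    for i in range(n):
        u = (i + 0.5) * h
        total += f(x - u ** (1.0 / beta))
    return total * h / (beta * math.gamma(beta))

def caputo_derivative(df, a, x, beta, n=4000):
    """Caputo derivative of order beta in (0, 1): (1.106) with n = 0, i.e. I^{1-beta} applied to f'."""
    return rl_integral(df, a, x, 1.0 - beta, n)

a, x, beta = 0.0, 2.0, 0.4
f = lambda t: 1.0 + t * t
df = lambda t: 2.0 * t

caputo = caputo_derivative(df, a, x, beta)
# RL derivative of 1 + (t - a)**2 from the closed form for powers (an assumption of this sketch).
rl = ((x - a) ** (-beta) / math.gamma(1.0 - beta)
      + 2.0 * (x - a) ** (2.0 - beta) / math.gamma(3.0 - beta))
correction = f(a) / (math.gamma(1.0 - beta) * (x - a) ** beta)
print(caputo, rl - correction)  # both sides of (1.109)

# Left-inverse property (1.112) for g(t) = t and a = 0:
# I^beta g = t**(1+beta)/Gamma(2+beta), whose ordinary derivative is t**beta/Gamma(1+beta).
d_of_integral = lambda t: t ** beta / math.gamma(1.0 + beta)
left_inverse = caputo_derivative(d_of_integral, a, x, beta)
print(left_inverse, x)
```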

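Proposition 1.8.3. For $f \in C^2(\mathbb{R}, B)$ and β ∈ (1, 2), x > a,
\[
D_{a+}^{\beta} f(x) = \frac{1}{\Gamma(-\beta)}\int_0^{x-a}\frac{f(x-z)-f(x)+f'(x)z}{z^{1+\beta}}\,dz
+ \frac{f(x)(x-a)^{-\beta}}{\Gamma(1-\beta)} + \frac{\beta f'(x)(x-a)^{1-\beta}}{\Gamma(2-\beta)}, \tag{1.113}
\]
\[
D_{a+*}^{\beta} f(x) = \frac{1}{\Gamma(-\beta)}\int_0^{x-a}\frac{f(x-z)-f(x)+f'(x)z}{z^{1+\beta}}\,dz
+ \frac{(f(x)-f(a))(x-a)^{-\beta}}{\Gamma(1-\beta)} + \frac{(\beta f'(x)-f'(a))(x-a)^{1-\beta}}{\Gamma(2-\beta)}, \tag{1.114}
\]
so that
\[
D_{a+*}^{\beta} f(x) = D_{a+}^{\beta}[f-f(a)-f'(a)(\cdot-a)](x)
= D_{a+}^{\beta} f(x) - \frac{f(a)(x-a)^{-\beta}}{\Gamma(1-\beta)} - \frac{f'(a)(x-a)^{1-\beta}}{\Gamma(2-\beta)}. \tag{1.115}
\]

Proof. For β ∈ (1, 2) and x > a,
\[
D_{a+*}^{\beta} f(x) = \frac{1}{\Gamma(2-\beta)}\int_a^x (x-t)^{1-\beta} f''(t)\,dt, \tag{1.116}
\]
which rewrites as
\[
\frac{1}{\Gamma(2-\beta)}(x-a)^{1-\beta}(f'(x)-f'(a)) + \frac{1-\beta}{\Gamma(2-\beta)}\int_a^x (x-t)^{-\beta}(f'(t)-f'(x))\,dt.
\]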


Using
\[
f'(t)-f'(x) = \frac{d}{dt}\big(f(t)-f(x)-(t-x)f'(x)\big),
\]
another integration by parts yields
\[
D_{a+*}^{\beta} f(x) = \frac{1}{\Gamma(2-\beta)}(x-a)^{1-\beta}(f'(x)-f'(a))
- \frac{1}{\Gamma(1-\beta)}(x-a)^{-\beta}\big(f(a)-f(x)-(a-x)f'(x)\big)
+ \frac{1}{\Gamma(-\beta)}\int_a^x \frac{f(t)-f(x)-(t-x)f'(x)}{(x-t)^{1+\beta}}\,dt,
\]
which equals the r.h.s. of (1.114). On the other hand, again for β ∈ (1, 2) and x > a,
\[
I_{a+}^{2-\beta} f(x) = \frac{1}{\Gamma(2-\beta)}\int_a^x (x-t)^{1-\beta} f(t)\,dt,
\]
which rewrites by integration by parts as
\[
I_{a+}^{2-\beta} f(x) = \frac{f(a)}{\Gamma(2-\beta)}\,\frac{(x-a)^{2-\beta}}{2-\beta} + \frac{1}{\Gamma(2-\beta)}\int_a^x \frac{(x-t)^{2-\beta}}{2-\beta}\,f'(t)\,dt,
\]
and by yet another integration by parts as
\[
I_{a+}^{2-\beta} f(x) = \frac{f(a)}{\Gamma(2-\beta)}\,\frac{(x-a)^{2-\beta}}{2-\beta} + \frac{f'(a)(x-a)^{3-\beta}}{\Gamma(2-\beta)(2-\beta)(3-\beta)}
+ \int_a^x \frac{(x-t)^{3-\beta} f''(t)}{\Gamma(2-\beta)(2-\beta)(3-\beta)}\,dt.
\]
Consequently,
\[
D_{a+}^{\beta} f(x) = \frac{d^2}{dx^2} I_{a+}^{2-\beta} f(x) = \frac{(1-\beta)(x-a)^{-\beta} f(a)}{\Gamma(2-\beta)} + \frac{f'(a)(x-a)^{1-\beta}}{\Gamma(2-\beta)} + \int_a^x \frac{(x-t)^{1-\beta} f''(t)}{\Gamma(2-\beta)}\,dt.
\]
Comparing this with (1.116) yields (1.113) and (1.115). □



For smooth bounded integrable functions, the left RL and Caputo derivatives again coincide for a = −∞, β ∈ (1, 2), and one defines the fractional derivative in generator form as their common value:
\[
\frac{d^{\beta}}{dx^{\beta}} f(x) = D_{-\infty+}^{\beta} f(x) = D_{-\infty+*}^{\beta} f(x) = \frac{1}{\Gamma(-\beta)}\int_0^{\infty}\frac{f(x-z)-f(x)+f'(x)z}{z^{1+\beta}}\,dz. \tag{1.117}
\]


Formula (1.115) implies that (1.112) holds for β ∈ (1, 2). Similar arguments justify it for all positive β. Moreover, for any β ∈ (n, n+1) one derives that the left RL and Caputo derivatives coincide for a = −∞, and one defines the fractional derivative in generator form as their common value:
\[
\begin{aligned}
\frac{d^{\beta}}{dx^{\beta}} f(x) &= D_{-\infty+}^{\beta} f(x) = D_{-\infty+*}^{\beta} f(x) \\
&= \frac{1}{\Gamma(n+1-\beta)}\int_0^{\infty}\frac{d^{n+1}}{dx^{n+1}} f(x-z)\,\frac{dz}{z^{\beta-n}} \\
&= \frac{1}{\Gamma(n+1-\beta)}\,\frac{d^{n+1}}{dx^{n+1}}\int_{-\infty}^{x} (x-t)^{n-\beta} f(t)\,dt \\
&= \frac{1}{\Gamma(-\beta)}\int_0^{\infty}\Big[f(x-z)-f(x)+f'(x)z-\cdots-\frac{1}{n!}f^{(n)}(x)(-z)^n\Big]\frac{dz}{z^{1+\beta}} \\
&= \frac{1}{\Gamma(-\beta)}\int_{-\infty}^{0}\Big[f(x+z)-f(x)-f'(x)z-\cdots-\frac{1}{n!}f^{(n)}(x)z^n\Big]\frac{dz}{|z|^{1+\beta}}.
\end{aligned} \tag{1.118}
\]
Note that we presented several equivalent forms that are used in various contexts. Let us specifically distinguish the following consequence of (1.112) that allows differential equations with fractional derivatives to be rewritten in an equivalent integral form, which is crucial for their analysis.

Proposition 1.8.4. Let β ∈ (n, n + 1) with a non-negative integer n.

(i) If $g \in C([a, b], B)$ and
\[
f(x) = \sum_{k=0}^{n}\frac{(x-a)^k}{k!} f^{(k)}(a) + \frac{1}{\Gamma(\beta)}\int_a^x (x-s)^{\beta-1} g(s)\,ds, \tag{1.119}
\]
then $g = D_{a+*}^{\beta} f$.

(ii) If $f \in C^{n}([a, b], B)$ and $g = D_{a+*}^{\beta} f$ is well defined as a continuous function, then f is given by (1.119).

Proof. Let us prove it for β ∈ (0, 1), the case β ∈ (n, n + 1) with arbitrary n ∈ N being analogous.

(i) We have
\[
D_{a+*}^{\beta} f = D_{a+}^{\beta}(f-f(a)) = D_{a+}^{\beta} I_a^{\beta} g = g.
\]
(ii) We have
\[
I_a^{\beta} g = I_a^{\beta}\frac{d}{dx} I_a^{1-\beta}(f-f(a)) = D_{a+*}^{1-\beta} I_a^{1-\beta}(f-f(a)) = D_{a+}^{1-\beta} I_a^{1-\beta}(f-f(a)) = f-f(a).
\]
□
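To see the equivalence between a fractional differential equation and its integral form at work, consider the scalar linear problem $D_{0+*}^{\beta} f = \lambda f$, f(0) = 1, with β ∈ (0, 1). Assuming the standard series $E_\beta(z) = \sum_j z^j/\Gamma(j\beta+1)$ for the Mittag-Leffler function, its solution is $f(x) = E_\beta(\lambda x^{\beta})$, and the integral form reads $f(x) = 1 + \lambda I_0^{\beta} f(x)$. The sketch below checks this identity numerically; the helper names and the quadrature are choices of this illustration only.

```python
import math

def mittag_leffler(beta, z, tol=1e-17):
    """Partial sum of E_beta(z) = sum_j z**j / Gamma(j*beta + 1), for moderate |z|."""
    total = 0.0
    j = 0
    while j * beta + 1.0 <= 170.0:  # math.gamma overflows slightly above this
        term = z ** j / math.gamma(j * beta + 1.0)
        total += term
        if abs(term) < tol * max(1.0, abs(total)):
            break
        j += 1
    return total

def rl_integral(f, a, x, beta, n=4000):
    """Midpoint-rule approximation of (1.101) after the substitution u = (x - t)**beta."""
    upper = (x - a) ** beta
    h = upper / n
    total = 0.0
    for i in range(n):
        u = (i + 0.5) * h
        total += f(x - u ** (1.0 / beta))
    return total * h / (beta * math.gamma(beta))

beta, lam, x = 0.5, -1.0, 1.0
f = lambda t: mittag_leffler(beta, lam * t ** beta)

lhs = f(x)
rhs = 1.0 + lam * rl_integral(f, 0.0, x, beta)
print(lhs, rhs)
```

For β = 1/2 and λ = −1 the value f(1) can also be compared with the closed form $E_{1/2}(-1) = e\,\mathrm{erfc}(1)$.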



Remark 20. An insightful framework for understanding fractional operations is given by the theory of generalized functions, which allows one to look at fractional integrals and derivatives in a unified way as convolutions with regularized power functions, see Section 1.10.


Turning to the right derivative, notice that for x < a formula (1.100) rewrites as
\[
I_a^n f(x) = \frac{(-1)^n}{(n-1)!}\int_x^a (t-x)^{n-1} f(t)\,dt. \tag{1.120}
\]
This suggests several possible normalizations for the analytic continuation in n and the corresponding inversions (fractional derivatives). The most common definition of the right fractional or Riemann–Liouville (RL) integral of order β (with positive real part) is
\[
I_{a-}^{\beta} f(x) = \frac{1}{\Gamma(\beta)}\int_x^a (t-x)^{\beta-1} f(t)\,dt, \tag{1.121}
\]
and the right versions of (1.105), (1.106) are chosen as follows:
\[
D_{a-}^{\beta} f(x) = \frac{(-1)^{n+1}}{\Gamma(n+1-\beta)}\,\frac{d^{n+1}}{dx^{n+1}}\int_x^a (t-x)^{n-\beta} f(t)\,dt, \quad x < a, \tag{1.122}
\]
\[
D_{a-*}^{\beta} f(x) = \frac{(-1)^{n+1}}{\Gamma(n+1-\beta)}\int_x^a (t-x)^{n-\beta}\,\frac{d^{n+1}}{dt^{n+1}} f(t)\,dt, \quad x < a. \tag{1.123}
\]

When β ∈ (0, 1) and x < a, similar calculations as for the left derivative (see (1.110)) lead to the following analogues of (1.107), (1.108):
\[
D_{a-}^{\beta} f(x) = \frac{1}{\Gamma(-\beta)}\int_0^{a-x}\frac{f(x+z)-f(x)}{z^{1+\beta}}\,dz + \frac{f(x)}{\Gamma(1-\beta)(a-x)^{\beta}}, \tag{1.124}
\]
\[
D_{a-*}^{\beta} f(x) = \frac{1}{\Gamma(-\beta)}\int_0^{a-x}\frac{f(x+z)-f(x)}{z^{1+\beta}}\,dz + \frac{f(x)-f(a)}{\Gamma(1-\beta)(a-x)^{\beta}}, \tag{1.125}
\]
implying
\[
D_{a-*}^{\beta} f(x) = D_{a-}^{\beta}[f-f(a)](x) = D_{a-}^{\beta} f(x) - \frac{f(a)}{\Gamma(1-\beta)(a-x)^{\beta}}. \tag{1.126}
\]

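When β ∈ (1, 2), x < a, one obtains
\[
D_{a-}^{\beta} f(x) = \frac{1}{\Gamma(-\beta)}\int_0^{a-x}\frac{f(x+z)-f(x)-f'(x)z}{z^{1+\beta}}\,dz
+ \frac{f(x)(a-x)^{-\beta}}{\Gamma(1-\beta)} - \frac{\beta f'(x)(a-x)^{1-\beta}}{\Gamma(2-\beta)}, \tag{1.127}
\]
\[
D_{a-*}^{\beta} f(x) = \frac{1}{\Gamma(-\beta)}\int_0^{a-x}\frac{f(x+z)-f(x)-f'(x)z}{z^{1+\beta}}\,dz
+ \frac{(f(x)-f(a))(a-x)^{-\beta}}{\Gamma(1-\beta)} - \frac{(\beta f'(x)-f'(a))(a-x)^{1-\beta}}{\Gamma(2-\beta)}, \tag{1.128}
\]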


so that
\[
D_{a-*}^{\beta} f(x) = D_{a-}^{\beta}[f-f(a)-f'(a)(\cdot-a)](x)
= D_{a-}^{\beta} f(x) - \frac{f(a)(a-x)^{-\beta}}{\Gamma(1-\beta)} + \frac{f'(a)(a-x)^{1-\beta}}{\Gamma(2-\beta)}. \tag{1.129}
\]

For smooth bounded integrable functions, the right fractional derivative in generator form is again the common value of $D_{a-}^{\beta} f(x)$ and $D_{a-*}^{\beta} f(x)$ for a = ∞, which for β ∈ (n, n + 1) is
\[
\frac{d^{\beta}}{d(-x)^{\beta}} f(x) = D_{\infty-}^{\beta} f(x) = D_{\infty-*}^{\beta} f(x)
= \frac{1}{\Gamma(-\beta)}\int_0^{\infty}\Big[f(x+z)-f(x)-f'(x)z-\cdots-\frac{1}{n!}f^{(n)}(x)z^n\Big]\frac{dz}{z^{1+\beta}}. \tag{1.130}
\]
It is straightforward to see that the pairs of operators (1.118), (1.130) are dual in the sense that
\[
\Big(\frac{d^{\beta}}{dx^{\beta}} f, g\Big) = \Big(f, \frac{d^{\beta}}{d(-x)^{\beta}} g\Big) \tag{1.131}
\]
for β ∈ (n, n + 1) and sufficiently regular functions f, g, where the pairing (f, g) denotes of course the usual $L^2$-product: $(f, g) = \int f(x)g(x)\,dx$. This fact also justifies the notation $d^{\beta}/d(-x)^{\beta}$, since for β = 1 the operators d/dx and −d/dx = d/d(−x) are dual.

Proposition 1.8.5. For any β > 0,
\[
\frac{d^{\beta}}{d(\pm x)^{\beta}} e^{-ipx} = \exp\{\mp i\pi\beta\,\mathrm{sgn}\,p/2\}\,|p|^{\beta} e^{-ipx}. \tag{1.132}
\]

Proof. For natural β, this follows from the usual differentiation. Let β ∈ (n, n + 1) with a non-negative integer n. Then by (1.118) and (1.130),
\[
\frac{d^{\beta}}{d(\pm x)^{\beta}} e^{-ipx} = \frac{1}{\Gamma(-\beta)}\,e^{-ipx}\int_0^{\infty}\Big[e^{\pm ipz}-1-(\pm ipz)-\cdots-\frac{1}{n!}(\pm ipz)^n\Big]\frac{dz}{z^{1+\beta}},
\]
and (1.132) follows by (9.24). □



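When applied to fractional derivatives, the correspondence (1.97) implies the following.

Proposition 1.8.6. For any β > 0,
\[
F\Big(\frac{d^{\beta}}{d(\pm x)^{\beta}} f\Big)(p) = \exp\Big\{\pm\frac{i\pi\beta\,\mathrm{sgn}\,p}{2}\Big\}\,|p|^{\beta} F(f)(p), \tag{1.133}
\]
\[
F\Big(\Big(\frac{d^{\beta}}{dx^{\beta}} + \frac{d^{\beta}}{d(-x)^{\beta}}\Big) f\Big)(p) = 2\cos(\pi\beta/2)\,|p|^{\beta} F(f)(p). \tag{1.134}
\]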


Proof. By (1.131) and the definition of F,
\[
F\Big(\frac{d^{\beta}}{dx^{\beta}} f\Big)(p) = \int \Big(\frac{d^{\beta}}{d(-x)^{\beta}} e^{-ipx}\Big) f(x)\,dx.
\]
Hence (1.132) proves (1.133). Equation (1.134) is a consequence of (1.133). □
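The multipliers in (1.133) and (1.134) are easy to tabulate. The following sketch (function name hypothetical) evaluates them for a few real p, confirms the cosine identity (1.134), and compares the one-sided multipliers with the principal-branch powers $(\pm ip)^{\beta}$ as computed by Python's complex exponentiation.

```python
import cmath
import math

def one_sided_symbol(p, beta, sign):
    """Multiplier in (1.133): exp{sign * i*pi*beta*sgn(p)/2} * |p|**beta.

    sign = +1 corresponds to d^beta/dx^beta, sign = -1 to d^beta/d(-x)^beta.
    """
    sgn_p = (p > 0) - (p < 0)
    return cmath.exp(sign * 1j * math.pi * beta * sgn_p / 2.0) * abs(p) ** beta

beta = 0.7
for p in (-2.0, 0.5, 3.0):
    left = one_sided_symbol(p, beta, +1)
    right = one_sided_symbol(p, beta, -1)
    total = 2.0 * math.cos(math.pi * beta / 2.0) * abs(p) ** beta
    # left + right should equal the real multiplier of (1.134);
    # left and right should match the principal values of (ip)**beta and (-ip)**beta.
    print(p, abs(left + right - total), abs(left - (1j * p) ** beta), abs(right - (-1j * p) ** beta))
```

All three printed differences should be at the level of rounding error.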



Remark 21. Note that $\exp\{i\pi\beta\,\mathrm{sgn}\,p/2\}|p|^{\beta}$ is the value of the main branch of the analytic function $(ip)^{\beta}$ (if p is real). Thus Proposition 1.8.6 states that F takes the fractional β-derivative to a multiplication by $(ip)^{\beta}$.

Keeping in mind that the Fourier transform F takes $-d^2/dx^2$ to the operator of multiplication by $p^2$, one defines the symmetric fractional derivative $|d^2/dx^2|^{\beta/2} = |d/dx|^{\beta}$ as the operator that the Fourier transform takes to $|p|^{\beta}$.

Proposition 1.8.7. For any β ∈ (n, n + 1) with a non-negative integer n,
\[
\Big|\frac{d}{dx}\Big|^{\beta} f(x) = \frac{1}{2\Gamma(-\beta)\cos(\pi\beta/2)}
\int_{-\infty}^{\infty}\Big[f(x+z)-f(x)-f'(x)z-\cdots-\frac{1}{n!}f^{(n)}(x)z^n\Big]\frac{dz}{|z|^{1+\beta}}. \tag{1.135}
\]
If β = 2k with k ∈ N, then $|d/dx|^{2k} = (-1)^k d^{2k}/dx^{2k}$. If β = 2k + 1 with k ∈ N, then
\[
\Big|\frac{d}{dx}\Big|^{\beta} f(x) = (-1)^{k+1}\frac{(2k+1)!}{\pi}
\lim_{\epsilon\to 0}\int_{\mathbb{R}\setminus(-\epsilon,\epsilon)}\Big[f(x+z)-f(x)-f'(x)z-\cdots-\frac{1}{(2k)!}f^{(2k)}(x)z^{2k}\Big]\frac{dz}{|z|^{2k+2}}. \tag{1.136}
\]

Proof. It follows from (1.134) that for positive β which is not an odd integer,
\[
\Big|\frac{d}{dx}\Big|^{\beta} f(x) = \frac{1}{2\cos(\pi\beta/2)}\Big(\frac{d^{\beta}}{dx^{\beta}} + \frac{d^{\beta}}{d(-x)^{\beta}}\Big) f(x).
\]
Therefore, the first two statements follow from the definitions of the one-sided fractional derivatives. If β = 2k + 1, we use the continuity to write
\[
\Big|\frac{d}{dx}\Big|^{2k+1} f(x) = \lim_{\epsilon\to 0}\,\lim_{\beta\to 2k+1}\frac{1}{2\Gamma(-\beta)\cos(\pi\beta/2)}
\int_{\mathbb{R}\setminus(-\epsilon,\epsilon)}\Big[f(x+z)-f(x)-f'(x)z-\cdots-\frac{1}{(2k)!}f^{(2k)}(x)z^{2k}\Big]\frac{dz}{|z|^{1+\beta}},
\]
where β → 2k + 1 from below. To take the limit, we note that
\[
\cos(\pi\beta/2) = (-1)^k \sin\Big(\pi\,\frac{2k+1-\beta}{2}\Big) \sim (-1)^k \pi\,\frac{2k+1-\beta}{2}
\]


for small (2k + 1 − β)/2 and that (the analytic continuation of) the function Γ(−β) has a pole at β = 2k + 1 with Γ(−β) ∼ −1/[(2k + 1)!(2k + 1 − β)] for β near this pole. This implies (1.136). □

Exercise 1.8.1. As an example, check that for β ∈ (0, 1), δ < β, x > 0,
\[
\frac{d^{\beta}}{d(-x)^{\beta}} x^{\delta} = \frac{x^{\delta-\beta}}{\Gamma(-\beta)}\int_0^{\infty}\frac{(1+z)^{\delta}-1}{z^{1+\beta}}\,dz. \tag{1.137}
\]
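Formula (1.135) can be tested against the Fourier definition on a Gaussian. For $f(x) = e^{-x^2/2}$ and the convention $F f(p) = \int e^{-ipx} f(x)\,dx$, one has $Ff(p) = \sqrt{2\pi}\,e^{-p^2/2}$, so the operator with multiplier $|p|^{\beta}$ gives $2^{\beta/2}\Gamma((\beta+1)/2)/\sqrt{\pi}$ at x = 0. The sketch below compares this value with a direct quadrature of (1.135) for β ∈ (0, 1); the cutoff, the panel count and the function names are assumptions of this illustration.

```python
import math

def symmetric_derivative_at_zero(beta, cutoff=12.0, n=200000):
    """(1.135) with n = 0 (so beta in (0, 1)) for f(x) = exp(-x**2/2) at x = 0.

    By symmetry the integral over R is twice the integral over (0, inf).
    On [0, cutoff] the midpoint rule is used; beyond the cutoff the Gaussian is
    negligible, the integrand is taken as -z**(-1-beta) and integrated exactly.
    """
    h = cutoff / n
    head = 0.0
    for i in range(n):
        z = (i + 0.5) * h
        head += math.expm1(-0.5 * z * z) / z ** (1.0 + beta)
    head *= h
    tail = -cutoff ** (-beta) / beta
    integral = 2.0 * (head + tail)
    return integral / (2.0 * math.gamma(-beta) * math.cos(math.pi * beta / 2.0))

def fourier_value_at_zero(beta):
    """(1/2pi) * integral of |p|**beta * sqrt(2pi) * exp(-p**2/2) dp, in closed form."""
    return 2.0 ** (beta / 2.0) * math.gamma((beta + 1.0) / 2.0) / math.sqrt(math.pi)

for beta in (0.5, 0.8):
    print(beta, symmetric_derivative_at_zero(beta), fourier_value_at_zero(beta))
```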

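The exact evaluation of fractional derivatives can rarely be achieved, and one usually has to confine oneself to their asymptotic or qualitative behaviour, as the following example shows.

Proposition 1.8.8. Let β ∈ (0, 1), and t, x, α > 0. Then
\[
\frac{d^{\beta}}{d(-x)^{\beta}}\exp\{-tx^{\alpha}\} \sim x^{-\beta}\exp\{-tx^{\alpha}\}\times
\begin{cases}
(tx^{\alpha})^{\beta}, & tx^{\alpha} > 1,\\
tx^{\alpha}, & tx^{\alpha} \le 1,\ \beta > \alpha,\\
-tx^{\alpha}\ln(tx^{\alpha}), & tx^{\alpha} \le 1,\ \beta = \alpha,\\
(tx^{\alpha})^{\beta/\alpha}, & tx^{\alpha} \le 1,\ \beta < \alpha,
\end{cases} \tag{1.138}
\]
where ∼ means that f ∼ g on a set A, if $C^{-1} < f/g < C$ on A for some constant C.

Proof. We have
\[
\frac{d^{\beta}}{d(-x)^{\beta}}\exp\{-tx^{\alpha}\} = \frac{1}{\Gamma(-\beta)}\int_0^{\infty}\big[\exp\{-t(x+y)^{\alpha}\}-\exp\{-tx^{\alpha}\}\big]\frac{dy}{y^{1+\beta}}.
\]
Changing the variable y into z = y/x yields
\[
\frac{d^{\beta}}{d(-x)^{\beta}}\exp\{-tx^{\alpha}\} = \frac{\exp\{-tx^{\alpha}\}}{\Gamma(-\beta)x^{\beta}}\int_0^{\infty}\big[\exp\{-tx^{\alpha}((1+z)^{\alpha}-1)\}-1\big]\frac{dz}{z^{1+\beta}}.
\]
Next, changing the variable z into $w = (1+z)^{\alpha}-1$ yields
\[
\frac{d^{\beta}}{d(-x)^{\beta}}\exp\{-tx^{\alpha}\} = \frac{\exp\{-tx^{\alpha}\}}{\Gamma(-\beta)x^{\beta}}\int_0^{\infty}\big[\exp\{-tx^{\alpha}w\}-1\big]\frac{dw}{f(w)},
\]
where
\[
f(w) = \big((1+w)^{1/\alpha}-1\big)^{1+\beta}\,\alpha\,(1+w)^{(\alpha-1)/\alpha}
\]
is a positive increasing function of w > 0 such that $f(w) \sim w^{1+\beta/\alpha}$ for w > 1 and $f(w) \sim w^{1+\beta}$ for w ≤ 1. Hence (1.138) follows from the following similarities, applied with γ = β/α:
\[
-\int_1^{\infty}\frac{(e^{-tw}-1)\,dw}{w^{1+\gamma}} \sim
\begin{cases}
1/\gamma, & t > 1,\\
t^{\min(1,\gamma)}, & t \le 1,\ \gamma \ne 1,\\
-t\ln t, & t \le 1,\ \gamma = 1,
\end{cases} \tag{1.139}
\]
\[
-\int_0^{1}\frac{(e^{-tw}-1)\,dw}{w^{1+\beta}} \sim
\begin{cases}
t^{\beta}, & t > 1,\\
t, & t \le 1.
\end{cases}
\]
□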


Exercise 1.8.2. Let β ∈ (0, 1), δ < β and t, x, α > 0. Show that
\[
\frac{d^{\beta}}{d(-x)^{\beta}}\big[x^{\delta}\exp\{-tx^{\alpha}\}\big] \sim x^{\delta-\beta}\exp\{-tx^{\alpha}\}\,g(tx^{\alpha}), \tag{1.140}
\]
where $g(y) \sim y^{\beta}$ for large y and g is uniformly bounded for y ∈ [0, a] with any a > 0.

So far we have worked in one dimension only. In finite dimensions, we shall limit the discussion to symmetric (mixed) fractional derivatives. Since one can weight the derivatives of different directions differently, it is natural to consider a symmetric (mixed) fractional operator in $\mathbb{R}^d$ of the form
\[
\int_{S^{d-1}} |(\nabla, s)|^{\beta}\,\mu(ds), \tag{1.141}
\]

where μ(ds) is an arbitrary centrally symmetric finite (non-negative) Borel measure on the sphere $S^{d-1}$, and β > 0. The most natural way to define this operator is via the Fourier transform, i.e., as an operator that multiplies the Fourier transform of a function by
\[
\int_{S^{d-1}} |(p, s)|^{\beta}\,\mu(ds),
\]
or, in other words, via the equation
\[
F\Big(\int_{S^{d-1}} |(\nabla, s)|^{\beta}\,\mu(ds)\,f\Big)(p) = \int_{S^{d-1}} |(p, s)|^{\beta}\,\mu(ds)\,Ff(p). \tag{1.142}
\]

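By analogy with the one-dimensional case, in order to get a more concrete expression for this operator, one can look at the operator
\[
L_{\mu}^{\beta} f = \int_0^{\infty}\int_{S^{d-1}}\Big[f(x+y)-f(x)-(\nabla f(x), y)-\cdots-\frac{1}{k!}\nabla^k f(x)\,y^{\otimes k}\Big]\frac{d|y|\,\mu(ds)}{|y|^{1+\beta}}, \tag{1.143}
\]
where β ∈ (k, k + 1), s = y/|y|, and
\[
\nabla^k f(x)\,y^{\otimes k} = \sum_{j_1,\dots,j_k}\frac{\partial^k f(x)}{\partial x_{j_1}\cdots\partial x_{j_k}}\,y_{j_1}\cdots y_{j_k}.
\]
By (9.28),
\[
(L_{\mu}^{\beta} e^{\pm i(p,\cdot)})(x) = e^{\pm i(p,x)}\,\Gamma(-\beta)\cos(\pi\beta/2)\int_{S^{d-1}} |(p, s)|^{\beta}\,\mu(ds).
\]
Consequently,
\[
F(L_{\mu}^{\beta} f)(p) = \int e^{-i(p,x)} L_{\mu}^{\beta} f(x)\,dx = \int f(x)\,L_{\mu}^{\beta} e^{-i(p,x)}\,dx
= (Ff)(p)\,\Gamma(-\beta)\cos(\pi\beta/2)\int_{S^{d-1}} |(p, s)|^{\beta}\,\mu(ds). \tag{1.144}
\]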


Therefore, according to the above definition of the operator (1.141) via its Fourier transform, we find
\[
\int_{S^{d-1}} |(\nabla, s)|^{\beta}\,\mu(ds) = \frac{1}{\Gamma(-\beta)\cos(\pi\beta/2)}\,L_{\mu}^{\beta}, \tag{1.145}
\]
whenever β is not an odd integer. This formula extends (1.135) to arbitrary dimensions. For β ∈ (1, 2), the r.h.s. of (1.145) can be written in equivalent forms as
\[
\frac{1}{2\Gamma(-\beta)\cos(\pi\beta/2)}\int_0^{\infty}\int_{S^{d-1}}\big(f(x+y)+f(x-y)-2f(x)\big)\frac{d|y|}{|y|^{1+\beta}}\,\mu(ds)
= \frac{1}{\Gamma(-\beta)\cos(\pi\beta/2)}\lim_{\epsilon\to 0}\int_{\epsilon}^{\infty}\int_{S^{d-1}}\big(f(x+y)-f(x)\big)\frac{d|y|}{|y|^{1+\beta}}\,\mu(ds). \tag{1.146}
\]
For β = 2k + 1 with a non-negative integer k, it follows by the same limiting procedure as in Proposition 1.8.7 that
\[
\int_{S^{d-1}} |(\nabla, s)|^{2k+1}\,\mu(ds)\,f(x) = (-1)^{k+1}(2k+1)!\,\frac{2}{\pi}
\lim_{\epsilon\to 0}\int_{\epsilon}^{\infty}\int_{S^{d-1}}\Big[f(x+y)-f(x)-(\nabla f(x), y)-\cdots-\frac{1}{(2k)!}\nabla^{2k} f(x)\,y^{\otimes 2k}\Big]\frac{d|y|}{|y|^{2k+2}}\,\mu(ds). \tag{1.147}
\]
One can see directly from both its Fourier representation (1.142) and the integro-differential representation (1.145) that the operators $\int_{S^{d-1}} |(\nabla, s)|^{\beta}\mu(ds)$ are self-dual, that is
\[
\Big(\int_{S^{d-1}} |(\nabla, s)|^{\beta}\,\mu(ds)\,f,\ g\Big) = \Big(f,\ \int_{S^{d-1}} |(\nabla, s)|^{\beta}\,\mu(ds)\,g\Big). \tag{1.148}
\]
The measure μ that mixes the fractional derivatives in various directions is often referred to as the spectral measure of the operator (1.145). The case of the uniform (or Lebesgue) measure μ(ds) = ds and of the corresponding operator $L_{ds}^{\beta}$ is particularly interesting. In this case, we can calculate
\[
\begin{aligned}
\int_{S^{d-1}} |(p, s)|^{\beta}\,ds
&= |p|^{\beta}|S^{d-2}|\int_0^{\pi} |\cos\theta|^{\beta}\sin^{d-2}\theta\,d\theta
= 2|p|^{\beta}|S^{d-2}|\int_0^{1} u^{\beta}(1-u^2)^{(d-3)/2}\,du \\
&= |p|^{\beta}|S^{d-2}|\int_0^{1} v^{(\beta-1)/2}(1-v)^{(d-3)/2}\,dv
= |p|^{\beta}|S^{d-2}|\,B\Big(\frac{\beta+1}{2}, \frac{d-1}{2}\Big).
\end{aligned}
\]


Using (9.15) and (9.12) yields
\[
\int_{S^{d-1}} |(p, s)|^{\beta}\,ds = 2|p|^{\beta}\pi^{(d-1)/2}\,\frac{\Gamma((\beta+1)/2)}{\Gamma((\beta+d)/2)}. \tag{1.149}
\]
Consequently, the operator $L_{ds}^{\beta}$ multiplies the Fourier transform of a function by
\[
2|p|^{\beta}\,\Gamma(-\beta)\cos(\pi\beta/2)\,\pi^{(d-1)/2}\,\frac{\Gamma((\beta+1)/2)}{\Gamma((\beta+d)/2)}.
\]
Therefore, for any positive non-integer β, the fractional Laplacian operator $|\nabla|^{\beta} = |\Delta|^{\beta/2}$, defined as the operator that multiplies the Fourier transform of a function by $|p|^{\beta}$, can be represented as
\[
|\nabla|^{\beta} f(x) = \frac{1}{2\pi^{(d-1)/2}\cos(\pi\beta/2)}\,\frac{\Gamma((\beta+d)/2)}{\Gamma((\beta+1)/2)\,\Gamma(-\beta)}\,L_{ds}^{\beta} f(x). \tag{1.150}
\]
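The closed form (1.149) and the normalizing constant in (1.150) can be checked numerically. The sketch below takes |p| = 1, evaluates the polar-angle integral by the midpoint rule, and uses the standard formula $|S^{m}| = 2\pi^{(m+1)/2}/\Gamma((m+1)/2)$ for the area of the unit sphere in $\mathbb{R}^{m+1}$; this formula and all function names are assumptions of the illustration rather than quotations from the text.

```python
import math

def sphere_area(m):
    """Surface area |S^m| of the unit sphere in R^(m+1)."""
    return 2.0 * math.pi ** ((m + 1) / 2.0) / math.gamma((m + 1) / 2.0)

def lhs_numeric(beta, d, n=100000):
    """|S^(d-2)| * integral_0^pi |cos t|**beta * sin(t)**(d-2) dt, for |p| = 1 and d >= 2."""
    h = math.pi / n
    total = 0.0
    for i in range(n):
        t = (i + 0.5) * h
        total += abs(math.cos(t)) ** beta * math.sin(t) ** (d - 2)
    return sphere_area(d - 2) * total * h

def rhs_closed_form(beta, d):
    """Right-hand side of (1.149) for |p| = 1."""
    return (2.0 * math.pi ** ((d - 1) / 2.0)
            * math.gamma((beta + 1.0) / 2.0) / math.gamma((beta + d) / 2.0))

def fractional_laplacian_constant(beta, d):
    """Scalar prefactor of L^beta_ds in (1.150), for non-integer beta > 0."""
    return math.gamma((beta + d) / 2.0) / (
        2.0 * math.pi ** ((d - 1) / 2.0) * math.cos(math.pi * beta / 2.0)
        * math.gamma((beta + 1.0) / 2.0) * math.gamma(-beta))

for beta, d in ((0.5, 2), (1.5, 3), (0.7, 4)):
    print(beta, d, lhs_numeric(beta, d), rhs_closed_form(beta, d))
```

For d = 3 the integral is elementary and equals 4π/(β + 1), which gives an independent check of the closed form.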

In the language of ΨDOs (see (1.95)), formula (1.132) means that the function
\[
\exp\{\pm i\pi\beta\,\mathrm{sgn}\,p/2\}\,|p|^{\beta} \tag{1.151}
\]
is the symbol of the operator $d^{\beta}/d(\pm x)^{\beta}$. Since the analytic properties of symbols are important for the analysis of ΨDOs, let us note that, in agreement with Remark 21, this function is the boundary value at real p of the analytic function $(ip)^{\beta}$ on the half-space {Im p ≤ 0}, or (for the negative sign in (1.151)) of the function $(-ip)^{\beta}$ on the half-space {Im p ≥ 0}, respectively. Similarly, formula (1.144) means that the function
\[
\psi(p) = \int_{S^{d-1}} |(p, s)|^{\beta}\,\mu(ds)
= \frac{1}{\Gamma(-\beta)\cos(\pi\beta/2)}\int_0^{\infty}\int_{S^{d-1}}\Big[e^{i(y,p)}-1-i(y,p)-\cdots-\frac{i^k (y,p)^k}{k!}\Big]\frac{d|y|\,\mu(d\bar y)}{|y|^{1+\beta}}, \tag{1.152}
\]
where $\bar y = y/|y|$, is the symbol of the operator $\int_{S^{d-1}} |(\nabla, s)|^{\beta}\mu(ds)$. The corresponding ΨDOs with variable coefficients have the symbols
\[
\psi(x, p) = \int_{S^{d-1}} |(p, s)|^{\beta}\,\mu(x, ds). \tag{1.153}
\]

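Another useful extension of operators $L_{\mu}^{\beta}$ is given by operators of the type
\[
L_{\mu,\Omega}^{\beta} f = \int_0^{\infty}\int_{S^{d-1}}\Big[f(x+y)-f(x)-(\nabla f(x), y)-\cdots-\frac{1}{k!}\nabla^k f(x)\,y^{\otimes k}\Big]\frac{\Omega(x, y)\,d|y|\,\mu(ds)}{|y|^{1+\beta}}, \tag{1.154}
\]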


with a function Ω(x, y). These operators are sometimes referred to as hyper-singular integrals, with the function Ω called their characteristic. If Ω(x, y) depends on y only through the ratio y/|y|, these hyper-singular integrals have symbols of the type (1.153). General hyper-singular integrals arise naturally as the dual operators to operators with symbols of the type (1.153).

Finally, let us note that the equivalence between the definitions of the fractional derivatives via their Fourier transform (1.142) and via an integral operator of the type (1.145) was obtained for functions f from the Schwartz space $S(\mathbb{R}^d)$. Once this is done, the formulas (1.142) and (1.145) can be treated in two different ways in order to extend the fractional derivatives to less regular functions. On the one hand, formula (1.142) allows for the definition of $\int_{S^{d-1}} |(\nabla, s)|^{\beta}\mu(ds)$ as a bounded operator from the space of functions representable as the Fourier transforms of functions $\phi \in L^1(\mathbb{R}^d)$ such that $|p|^{\beta}\phi(p) \in L^1(\mathbb{R}^d)$ to the space $C_{\infty}(\mathbb{R}^d)$. On the other hand, formula (1.145) allows for the definition of $\int_{S^{d-1}} |(\nabla, s)|^{\beta}\mu(ds)$ as a bounded operator from $C^{n+1}(\mathbb{R}^d)$ to $C(\mathbb{R}^d)$, or more generally from the subspace of $C^n(\mathbb{R}^d)$ with Hölder-continuous nth-order derivatives to $C(\mathbb{R}^d)$.

1.9 Generalized functions: main operations

In Remark 16, we introduced the space D = D(Ω) of infinitely differentiable functions (real- or sometimes complex-valued) on the open set $\Omega \subset \mathbb{R}^n$ with a bounded support, equipped with such a topology that a sequence $\phi_n \in D$ converges to φ ∈ D, as n → ∞, if there exists a compact $K \subset \mathbb{R}^n$ such that all $\phi_n$ and φ vanish outside K and $\phi_n \to \phi$ uniformly on K together with all its derivatives. A continuous linear functional ξ on D is called a generalized function on $\mathbb{R}^n$, or is alternatively referred to as a distribution. Its value on φ ∈ D is usually denoted by ξ(φ), or sometimes (ξ, φ), if no ambiguity arises (for instance, related to the complex scalar product, see Remark 22 below).

In this section, we shall summarize the basic definitions, notations and formulas with respect to the theory of generalized functions. A detailed exposition can be found in many places, e.g., [90], while we shall keep it short.

The space D′ of all generalized functions is equipped with the usual dual topology (which would formally be better called ∗-weak topology), that is, $\xi_n \to \xi$ means that $\xi_n(\phi) \to \xi(\phi)$ for all φ ∈ D. In this context, the space D is referred to as the space of test functions. An alternative convenient space of linear functionals is the space $S'(\mathbb{R}^n)$ of continuous linear functionals on the Schwartz space $S(\mathbb{R}^n)$, referred to as the space of tempered generalized functions. This space is again equipped with the corresponding weak topology: $\xi_n \to \xi$ means that $\xi_n(\phi) \to \xi(\phi)$ for all φ ∈ S.

Basic examples of generalized functions are supplied by ‘normal’ locally integrable functions f(x) on $\mathbb{R}^n$ and locally bounded measures μ acting on D by integration, that is $f(\phi) = \int f(x)\phi(x)\,dx$ or $\mu(\phi) = \int \phi(x)\,\mu(dx)$. A notable example of the latter case is given by the famous Dirac delta-functions $\delta_x$, which are point measures at x acting on D by the evaluation $\delta_x(\phi) = \phi(x)$. A sequence of regular functions $f_n$ is called a δ-convergent sequence, or shortly a δ-sequence, if $f_n \to \delta = \delta_0$, as n → ∞, in the sense of generalized functions. In order to stress the difference between D′ and S′, let us note that the function $e^x$ on $\mathbb{R}$ defines an element of $D'(\mathbb{R})$, which does not belong to $S'(\mathbb{R})$.

Remark 22. The use of the Fourier transform makes it indispensable to work with complex-valued functions. In this case, it is often convenient to define the correspondence between ordinary functions and generalized functions in a different way. In fact, some authors prefer to assign a generalized function to the ordinary function f according to the rule $(f, \phi) = \int \bar f(x)\phi(x)\,dx$, which aligns the notation with the usual scalar product in $L^2(\mathbb{R}^d)$, but makes the correspondence of functions and generalized functions anti-linear. We shall stick to our original definition given above. Of course, with such a modified correspondence, some other formulas also change, most notably those related to the Fourier transform.

The direct product f × g of two generalized functions on $\mathbb{R}^d$ and $\mathbb{R}^n$ is naturally defined as the generalized function on $\mathbb{R}^{d+n}$, so that it acts as (f × g, φψ) = (f, φ)(g, ψ) for the products of the test functions φ(x)ψ(y) from $S(\mathbb{R}^{d+n})$ or $D(\mathbb{R}^{d+n})$, and extends to arbitrary test functions by linearity and continuity.

The operation of differentiation extends to generalized functions by duality, that is, for any ξ ∈ D′ and φ ∈ D one defines (ξ′, φ) = −(ξ, φ′) (because (f′, φ) = −(f, φ′) due to integration by parts for f, φ ∈ D). According to this definition, any generalized function is infinitely differentiable. For instance, the derivative $\delta_x'$ acts on $D(\mathbb{R})$ as $\delta_x'(\phi) = -\phi'(x)$.
Moreover, if L = ψ(−i∇) is a pseudo-differential operator in $\mathbb{R}^d$ with symbol ψ(p), the formal dual is the operator L∗ = ψ(i∇) with the symbol ψ(−p), and therefore the action of L on generalized functions is defined as
\[
(L\xi, \phi) = (\psi(-i\nabla)\xi, \phi) = (\xi, L^{*}\phi) = (\xi, \psi(i\nabla)\phi). \tag{1.155}
\]

Accordingly, one says that a generalized function ξ is a generalized solution to the equation Lξ = f, if this equation holds in the sense of generalized functions.

Similarly, the Fourier transform of a generalized function ξ is defined as (Fξ, φ) = (ξ, Fφ). From the Fourier theorem on the isomorphism of $S(\mathbb{R}^d)$ under F, it follows that F is also an isomorphism of $S'(\mathbb{R}^d)$. In order to see what happens with D′, we observe that, according to the Paley–Wiener theorem (see (1.91)), the Fourier transform is an isomorphism of the space $D(\mathbb{R}^d)$ and the space of test functions $Z(\mathbb{C}^d)$ consisting of entire analytic functions g on $\mathbb{C}^d$ for which an R exists (depending on g) such that for any N > 0,
\[
|g(\xi)| \le C_N\,e^{R|\mathrm{Im}\,\xi|}(1+|\xi|)^{-N}, \tag{1.156}
\]


with a constant $C_N$ depending on N and g. Consequently, F and $F^{-1}$ yield isomorphisms between the dual spaces $D'(\mathbb{R}^d)$ and $Z'(\mathbb{C}^d)$, the dual space of $Z(\mathbb{C}^d)$.

The definition of the convolution requires much care. Observing that for ordinary functions (say, from the space S)
\[
\int (f \star g)(x)\phi(x)\,dx = \int\!\!\int f(x)g(y)\phi(x+y)\,dx\,dy,
\]
it is natural to define the convolution f ⋆ g of two generalized functions by setting
\[
\big((f \star g), \phi\big) = \big((f \times g)(x, y), \phi(x+y)\big). \tag{1.157}
\]
However, this term is not always well defined, because the function φ(x + y) does not usually belong to $S(\mathbb{R}^{2d})$ or $D(\mathbb{R}^{2d})$ (see Proposition 1.9.2 below).

Although one cannot speak about the value of a generalized function ξ at a specific point, it makes sense to say that ξ vanishes in an open set U, meaning that (ξ, φ) = 0 for all φ with a support in U. Accordingly, by the support of a generalized function ξ in $\mathbb{R}^d$, one means the intersection of all closed sets $K \subset \mathbb{R}^d$ such that ξ vanishes on $\mathbb{R}^d \setminus K$. A noteworthy example is as follows.

Proposition 1.9.1. If a generalized function ξ on $\mathbb{R}$ (from S′ or D′) is supported on the one-point set {0}, then there exists k = 0, 1, 2, . . . such that $\xi = \sum_{j=0}^{k} a_j \delta^{(j)}$ with some constants $a_j$.

Proof. If ξ ∈ D′ and it has a compact support, then ξ ∈ S′. Due to the continuity of ξ (see Proposition 1.6.1), there exists k = 0, 1, 2, . . . such that $|(\xi, \phi)| \le \|\phi\|_{C^k(\mathbb{R})}$. By Proposition 1.7.2, any such functional is given by the sum of the integrals of the derivatives of φ up to order k over some measures. Due to the condition of being supported at zero, these measures have to have support at zero. □

An important property of functions that are supported by proper cones is the fact that the convolution restricted to such functions is always well defined. For instance, the following holds.

Proposition 1.9.2. If f and g are generalized functions on $\mathbb{R}^d$ that are supported on the positive cone $\mathbb{R}^d_+$, the convolution f ⋆ g is well defined and also supported on $\mathbb{R}^d_+$.

Proof. It follows from (1.157), since for any $\phi \in D(\mathbb{R}^d)$, the function $\phi(x+y)\mathbf{1}_{x\ge 0}\mathbf{1}_{y\ge 0}$ belongs to $D(\mathbb{R}^{2d})$. □

Finally, let us comment on the operation of (pointwise) multiplication of generalized functions. If f is an infinitely smooth function with all derivatives bounded and ξ a generalized function, then the product fξ can be defined as the generalized function acting by the rule (fξ, φ) = (ξ, fφ). If f is not infinitely smooth, this product fξ may not be well defined. Thus by passing to generalized functions, the product structure of usual functions is effectively lost. Several approaches exist


(which we will not touch) that extend the notion of generalized functions in such a way that the multiplicative structure can be extended to these objects.

1.10 Generalized functions: regularization

If a function f on $\mathbb{R}$ is not locally integrable, it cannot be used directly to define the generalized function $\phi \mapsto \int f(x)\phi(x)\,dx$. However, if f is integrable around any point apart from a finite set M of singular points, then $\int f(x)\phi(x)\,dx$ is defined for all φ ∈ D that vanish in a neighbourhood of M, and a generalized function $\tilde f$ is called a regularization of f if $(\tilde f, \phi) = \int f(x)\phi(x)\,dx$ for all such φ. Note that a regularization is never uniquely defined. In fact, if $\tilde f$ is a regularization of f having a singularity at 0, then $\tilde f + \delta^{(k)}$ is also a regularization for any k ∈ N. Nevertheless, for a wide class of functions including various combinations of power functions, there exists in some sense a canonical way to choose a regularization. Let us demonstrate this choice on the basic example of one-sided power functions
\[
x_+^{\lambda} = \begin{cases} x^{\lambda}, & x > 0,\\ 0, & x \le 0,\end{cases}
\qquad
x_-^{\lambda} = \begin{cases} 0, & x \ge 0,\\ |x|^{\lambda}, & x < 0,\end{cases}
\]
and their even and odd combinations
\[
|x|^{\lambda} = x_+^{\lambda} + x_-^{\lambda}, \qquad |x|^{\lambda}\,\mathrm{sgn}\,x = x_+^{\lambda} - x_-^{\lambda}.
\]
These functions are locally integrable for λ > −1 and have a non-integrable singularity at x = 0 for λ ≤ −1. What is the natural regularization for λ ≤ −1? Two ideas can be exploited for answering this question.

Firstly, it would be nice to keep the usual rules of differentiation, say $(x_+^{\lambda})' = \lambda x_+^{\lambda-1}$. Let us calculate $(x_+^{\lambda})'$ for λ ∈ (−1, 0) in the sense of generalized functions:
\[
((x_+^{\lambda})', \phi) = -(x_+^{\lambda}, \phi') = -\int_0^{\infty} x^{\lambda}\phi'(x)\,dx = -\lim_{\epsilon\to 0}\int_{\epsilon}^{\infty} x^{\lambda}\phi'(x)\,dx.
\]
Integrating by parts with φ′(x)dx = d(φ(x) − φ(0)) yields
\[
((x_+^{\lambda})', \phi) = \lim_{\epsilon\to 0}\Big[\epsilon^{\lambda}(\phi(\epsilon)-\phi(0)) + \int_{\epsilon}^{\infty}(\phi(x)-\phi(0))\,\lambda x^{\lambda-1}\,dx\Big]
= \int_0^{\infty}(\phi(x)-\phi(0))\,\lambda x^{\lambda-1}\,dx.
\]
Therefore, the generalized function acting as $\phi \mapsto \int_0^{\infty}(\phi(x)-\phi(0))x^{\mu}\,dx$ is the natural regularization of $x_+^{\mu}$ for μ ∈ (−2, −1), which we shall denote by $x_+^{\mu}$ with a slight abuse of notation. This choice of regularization ensures that the rule $(x_+^{\lambda})' = \lambda x_+^{\lambda-1}$ holds for λ ∈ (−1, 0).


Iterating this procedure, i.e., differentiating the obtained function $x_+^{\lambda}$ for λ ∈ (−2, −1) and so on, one can show that if the regularized version of $x_+^{\lambda}$ for −n − 1 < Re λ < −n is defined by the equation
\[
(x_+^{\lambda}, \phi) = \int_0^{\infty} x^{\lambda}\Big[\phi(x)-\phi(0)-x\phi'(0)-\cdots-\frac{1}{(n-1)!}x^{n-1}\phi^{(n-1)}(0)\Big]dx, \tag{1.158}
\]
then the rule $(x_+^{\lambda})' = \lambda x_+^{\lambda-1}$ holds for all λ ≠ −1, −2, . . ..

The second approach to canonical regularization is based on the idea of analytic continuation. Namely, for Re λ > −1 we have
\[
(x_+^{\lambda}, \phi) = \int_0^{\infty} x^{\lambda}\phi(x)\,dx = \int_0^{1} x^{\lambda}[\phi(x)-\phi(0)]\,dx + \int_1^{\infty} x^{\lambda}\phi(x)\,dx + \frac{\phi(0)}{\lambda+1}.
\]
The right-hand side is analytic in λ for Re λ > −2 apart from a single pole at the point λ = −1. Hence the idea of the analytic continuation suggests taking it as the extension of the function $x_+^{\lambda}$ to the domain Re λ > −2. Notably, if λ ∈ (−2, −1), this formula rewrites as
\[
(x_+^{\lambda}, \phi) = \int_0^{\infty}(\phi(x)-\phi(0))x^{\lambda}\,dx,
\]
yielding the same result as obtained previously with respect to the consistency of the differentiation rules. Continuing in the same way, i.e., adding and subtracting further terms of the Taylor expansion of the test function φ, yields the analytic continuation of the integral $\int_0^{\infty} x^{\lambda}\phi(x)\,dx$ to the domain Re λ > −n − 1 in the form
\[
\int_0^{\infty} x^{\lambda}\phi(x)\,dx = \int_1^{\infty} x^{\lambda}\phi(x)\,dx + \sum_{k=1}^{n}\frac{\phi^{(k-1)}(0)}{(k-1)!\,(k+\lambda)}
+ \int_0^{1} x^{\lambda}\Big[\phi(x)-\phi(0)-x\phi'(0)-\cdots-\frac{1}{(n-1)!}x^{n-1}\phi^{(n-1)}(0)\Big]dx. \tag{1.159}
\]
This analytic function has simple poles at λ = −1, −2, . . . , −n. Remarkably, for −n − 1 < Re λ < −n it coincides with the expression (1.158) that was obtained from another point of view, because
\[
\int_1^{\infty} x^{k+\lambda}\,dx = -\frac{1}{k+\lambda+1}
\]
for k ∈ [0, n − 1] and −n − 1 < Re λ < −n.

Therefore, the formulas (1.158) and (1.159) specify an analytic continuation of the integral $\int_0^{\infty} x^{\lambda}\phi(x)\,dx$ to the whole complex plane of λ (with poles at λ = −1, −2, . . .) and define a family of generalized functions $x_+^{\lambda}$ that satisfy the usual differentiation rule $(x_+^{\lambda})' = \lambda x_+^{\lambda-1}$ (for λ outside the poles).
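The regularization (1.158) can be tried out numerically. The sketch below takes n = 1, i.e. λ ∈ (−2, −1), and the function $\phi(x) = e^{-x}$ on the half-line; this φ is not compactly supported, so using it as a test function is an assumption of the illustration, justified only by its fast decay. In this case the regularized pairing $\int_0^{\infty} x^{\lambda}(e^{-x}-1)\,dx$ should reproduce the analytically continued Gamma function Γ(λ + 1). The substitution, the cutoff and the function name are choices of the sketch.

```python
import math

def regularized_pairing(lam, cutoff=40.0, n=100000):
    """(x_+^lam, phi) for phi(x) = exp(-x) and -2 < lam < -1, via (1.158) with n = 1.

    The substitution x = u**m with m = 1/(lam + 2) makes the integrand bounded near 0.
    Beyond the cutoff exp(-x) is negligible, so the integrand is taken as -x**lam
    and integrated exactly.
    """
    m = 1.0 / (lam + 2.0)
    upper = cutoff ** (1.0 / m)
    h = upper / n
    head = 0.0
    for i in range(n):
        u = (i + 0.5) * h
        x = u ** m
        head += x ** lam * math.expm1(-x) * m * u ** (m - 1.0)
    head *= h
    tail = cutoff ** (lam + 1.0) / (lam + 1.0)  # equals -integral_cutoff^inf x**lam dx
    return head + tail

for lam in (-1.5, -1.2):
    print(lam, regularized_pairing(lam), math.gamma(lam + 1.0))
```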

1.10. Generalized functions: regularization


Similarly, one can define a family of generalized functions $x_-^\lambda$. However, it is easier to use the identity $(x_-^\lambda, \phi(x)) = (x_+^\lambda, \phi(-x))$ for writing down the expressions directly:
\[
(x_-^\lambda, \phi) = \int_0^\infty x^\lambda\Big[\phi(-x) - \phi(0) + x\phi'(0) - \cdots - \frac{(-x)^{n-1}}{(n-1)!}\,\phi^{(n-1)}(0)\Big]\,dx, \tag{1.160}
\]
\[
(x_-^\lambda, \phi) = \int_1^\infty x^\lambda\phi(-x)\,dx + \sum_{k=1}^n \frac{(-1)^{k-1}\phi^{(k-1)}(0)}{(k-1)!\,(k+\lambda)} + \int_0^1 x^\lambda\Big[\phi(-x) - \phi(0) + x\phi'(0) - \cdots - \frac{(-x)^{n-1}}{(n-1)!}\,\phi^{(n-1)}(0)\Big]\,dx, \tag{1.161}
\]
valid for $-n-1 < \operatorname{Re}\lambda < -n$ and for $-n-1 < \operatorname{Re}\lambda$, respectively.

It is insightful to observe from the formulas (1.158) and (1.160) on the one hand and the formulas (1.130) and (1.118) on the other hand that the values $(x_\pm^\lambda, \phi)$ with $\lambda < -1$ yield nothing but the fractional derivatives $\Gamma(-\beta)\,\frac{d^\beta}{d(\mp x)^\beta}\phi(0)$ with $\beta = -1-\lambda \in (n-1, n)$, see (1.130). We can generally conclude that both fractional integrals and derivatives (in the generator form) of a function are given by convolutions with the power functions $x_\pm^\lambda$ (defined as generalized functions that regularize the usual power functions).

Adding and subtracting the expressions for $(x_\pm^\lambda, \phi)$, we see that in both cases half of the poles cancel, so that $|x|^\lambda$ has poles only at $\lambda = -1, -3, -5, \ldots$ and $|x|^\lambda\,\mathrm{sgn}\,x$ has poles only at $\lambda = -2, -4, \ldots$. Therefore, for $-2k-1 < \operatorname{Re}\lambda < -2k+1$,
\[
(|x|^\lambda, \phi) = \int_0^\infty x^\lambda\Big\{\phi(x) + \phi(-x) - 2\Big[\phi(0) + \frac{1}{2}x^2\phi''(0) + \cdots + \frac{1}{(2k-2)!}\,x^{2k-2}\phi^{(2k-2)}(0)\Big]\Big\}\,dx. \tag{1.162}
\]
In particular, the generalized function $|x|^{-2k}$ with $k \in \mathbb{N}$, which can be naturally denoted by $x^{-2k}$, acts as
\[
(x^{-2k}, \phi) = \int_0^\infty x^{-2k}\Big\{\phi(x) + \phi(-x) - 2\Big[\phi(0) + \frac{1}{2}x^2\phi''(0) + \cdots + \frac{1}{(2k-2)!}\,x^{2k-2}\phi^{(2k-2)}(0)\Big]\Big\}\,dx, \tag{1.163}
\]
and the function $|x|^{-1}\,\mathrm{sgn}\,x$, which can be naturally denoted by $1/x$, acts as
\[
(x^{-1}, \phi) = \int_0^\infty [\phi(x) - \phi(-x)]\,\frac{dx}{x}, \tag{1.164}
\]
which coincides with the Cauchy principal value of the integral with the density $1/x$:
\[
(x^{-1}, \phi) = \lim_{\epsilon \to 0}\int_{\mathbb{R}\setminus(-\epsilon,\epsilon)} \frac{\phi(x)}{x}\,dx. \tag{1.165}
\]


The simplest application of the obtained formulas is the analytic continuation of the Euler Gamma-function
\[
\Gamma(\lambda) = \int_0^\infty x^{\lambda-1}e^{-x}\,dx = (x_+^{\lambda-1}, e^{-x}),
\]
where $e^{-x}$ can be considered continued as a smooth function to the left of the origin in such a way that it becomes a function from $S$. Namely, applying (1.158) and (1.159) yields the continuation of $\Gamma(\lambda+1)$ to an analytic function with poles at $\lambda = -1, -2, \ldots$, so that, for $\operatorname{Re}\lambda > -n-1$,
\[
\Gamma(\lambda+1) = \int_1^\infty x^\lambda e^{-x}\,dx + \sum_{k=1}^n \frac{(-1)^{k-1}}{(k-1)!\,(k+\lambda)} + \int_0^1 x^\lambda\Big[e^{-x} - \sum_{k=0}^{n-1}(-1)^k\frac{x^k}{k!}\Big]\,dx. \tag{1.166}
\]
For $-n-1 < \operatorname{Re}\lambda < -n$, this reduces to
\[
\Gamma(\lambda+1) = \int_0^\infty x^\lambda\Big[e^{-x} - \sum_{k=0}^{n-1}(-1)^k\frac{x^k}{k!}\Big]\,dx. \tag{1.167}
\]
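Formula (1.166) can be checked numerically against a library Gamma function. The following is a minimal sketch for real $\lambda$ only; the helper names `simpson` and `gamma_continued`, the truncation of the tail integral at a finite cutoff, and the term-by-term integration of the Taylor remainder on $[0,1]$ are choices made here for illustration and are not taken from the text.

```python
import math

def simpson(f, a, b, n=20000):
    """Composite Simpson rule on [a, b] with an even number n of subintervals."""
    if n % 2:
        n += 1
    h = (b - a) / n
    total = f(a) + f(b)
    for i in range(1, n):
        total += (4 if i % 2 else 2) * f(a + i * h)
    return total * h / 3

def gamma_continued(lam, n, cutoff=60.0, terms=40):
    """Right-hand side of (1.166) for real lam > -n-1, lam not a negative integer."""
    # Integral over [1, infinity), truncated at `cutoff` (assumption: the
    # neglected part is of order exp(-cutoff) and can be ignored).
    tail = simpson(lambda x: x ** lam * math.exp(-x), 1.0, cutoff)
    # Pole terms: sum_{k=1}^{n} (-1)^(k-1) / ((k-1)! (k + lam)).
    poles = sum((-1) ** (k - 1) / (math.factorial(k - 1) * (k + lam))
                for k in range(1, n + 1))
    # Integral over [0, 1] of x^lam (exp(-x) - Taylor polynomial of degree n-1),
    # evaluated by integrating the remaining Taylor series term by term.
    remainder = sum((-1) ** k / (math.factorial(k) * (k + lam + 1))
                    for k in range(n, terms))
    return tail + poles + remainder

for lam, n in [(-1.5, 1), (-2.5, 2), (-2.5, 4)]:
    # The value should not depend on n as long as lam > -n-1.
    print(lam, n, gamma_continued(lam, n), math.gamma(lam + 1))
```

The last two lines of the loop use the same $\lambda$ with different $n$, which illustrates that subtracting more Taylor terms than necessary does not change the continued value.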

Thus $\Gamma(\lambda+1)$ has simple poles at $\lambda = -k$, $k \in \mathbb{N}$, like the generalized functions $x_+^\lambda$. By dividing $(x_+^\lambda, \phi)$ by $\Gamma(\lambda+1)$, we therefore cancel the poles, so that the generalized function $x_+^\lambda/\Gamma(\lambda+1)$ is an entire analytic function in $\lambda$. Similarly, one shows the following assertion.

Proposition 1.10.1. The generalized functions
\[
\frac{x_+^\lambda}{\Gamma(\lambda+1)}, \qquad \frac{x_-^\lambda}{\Gamma(\lambda+1)}, \qquad \frac{|x|^\lambda}{\Gamma\big(\frac{\lambda+1}{2}\big)}, \qquad \frac{\mathrm{sgn}\,x\,|x|^\lambda}{\Gamma\big(\frac{\lambda+2}{2}\big)} \tag{1.168}
\]
are entire functions of the parameter $\lambda \in \mathbb{C}$.

Odd and even combinations of $x_\pm^\lambda$ make it possible to cancel some of the singularities, but not all of them. The natural question arises whether a linear combination (with regular coefficients, not like $1/\Gamma(\lambda+1)$) can be chosen in such a way that we get an analytic function of $\lambda$ without any singularities. A nice construction of such a combination goes as follows. For complex numbers $z \in \mathbb{C}\setminus\mathbb{R}_-$, one can fix the argument $\arg(z)$ by the requirement $-\pi < \arg(z) < \pi$. Then for any $\lambda \in \mathbb{C}$, the function
\[
z^\lambda = |z|^\lambda e^{i\lambda\arg(z)} = e^{\lambda\ln|z|}e^{i\lambda\arg(z)}
\]
is a well-defined analytic function of $z \in \mathbb{C}\setminus\mathbb{R}_-$. This allows for the definition of the functions
\[
(x \pm i0)^\lambda = \lim_{\tau\to 0+}(x \pm i\tau)^\lambda, \tag{1.169}
\]


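where $\tau \to 0+$ means the convergence from the right. Since
\[
\lim_{\tau\to 0+}\arg(x \pm i\tau) = \begin{cases} 0, & x > 0, \\ \pm\pi, & x < 0, \end{cases}
\]
it follows that
\[
(x \pm i0)^\lambda = \begin{cases} |x|^\lambda, & x > 0, \\ e^{\pm i\pi\lambda}|x|^\lambda, & x < 0. \end{cases} \tag{1.170}
\]
In other words,
\[
(x \pm i0)^\lambda = x_+^\lambda + e^{\pm i\pi\lambda}x_-^\lambda. \tag{1.171}
\]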
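The pointwise formula (1.170) is easy to observe numerically, because Python's complex power uses the same principal branch $-\pi < \arg z \le \pi$. The sketch below replaces the limit $\tau \to 0+$ by a small fixed $\tau$; the function names and the value of `tau` are illustrative assumptions.

```python
import cmath
import math

def boundary_value(x, lam, sign, tau=1e-12):
    """Approximate (x + sign*i0)^lam by (x + sign*i*tau)^lam on the principal branch."""
    return complex(x, sign * tau) ** lam

def formula_1_170(x, lam, sign):
    """Right-hand side of (1.170) for x != 0."""
    modulus = abs(x) ** lam
    if x > 0:
        return modulus
    return cmath.exp(sign * 1j * math.pi * lam) * modulus

for x in (2.0, -2.0):
    for sign in (+1, -1):
        lam = -1.5
        print(x, sign, boundary_value(x, lam, sign), formula_1_170(x, lam, sign))
```

This only illustrates the values away from the origin. The statement that the combination (1.171) stays meaningful as a generalized function at the poles of $x_\pm^\lambda$ is not something a pointwise check can show.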

Though $(x \pm i0)^\lambda$ are not locally integrable functions for all $\lambda$, the formula (1.171) makes it possible to give them a meaning as generalized functions for all $\lambda \neq -1, -2, \ldots$. Moreover, as can be seen from the above formulas for $x_\pm^\lambda$, their singularities cancel in the combinations (1.171), so that the following assertion holds.

Proposition 1.10.2. The generalized functions $(x \pm i0)^\lambda$ given by (1.171) or (1.169) are entire functions of the parameter $\lambda \in \mathbb{C}$.

Extending the above arguments to the finite-dimensional case, one can obtain the analytic continuation of generalized functions arising from homogeneous functions on $\mathbb{R}^d$ of order $\lambda$, that is, functions of the type $\Psi_\lambda(x) = |x|^\lambda\psi(x/|x|)$ with some integrable function $\psi$ on $S^{d-1}$. In fact, for any test function $\phi \in D(\mathbb{R}^d)$, we find
\[
\int |x|^\lambda\psi(x/|x|)\phi(x)\,dx = \int_0^\infty r^{\lambda+d-1}u(r)\,dr, \tag{1.172}
\]
with
\[
u(r) = \int_{S^{d-1}}\psi(s)\phi(rs)\,ds,
\]
where $ds$ denotes the Lebesgue measure on $S^{d-1}$. The function $u(r)$ is infinitely differentiable on $\mathbb{R}_+$ with a bounded support, with all derivatives having continuous limits as $r \to 0$ due to
\[
u^{(k)}(0) = \int_{S^{d-1}}\psi(s)\,\nabla^k\phi(0)\,s^{\otimes k}\,ds
\]
(concise tensor notations from (3.7) and (3.8) were used). Hence it can be continued to a function belonging to $D(\mathbb{R})$. Consequently, the integral (1.172) is well defined for $\lambda > -d$. Moreover, it equals $(x_+^{\lambda+d-1}, u)$. By the properties of $x_+^{\lambda+d-1}$, it therefore follows that the integral (1.172) has an analytic continuation to the plane of complex $\lambda$ with possible poles only at the points $\lambda = -d, -d-1, \ldots$. This extension defines the generalized function $\Psi_\lambda$ on $\mathbb{R}^d$. Moreover, if $\psi(s)$ is an even function, i.e., it satisfies $\psi(s) = \psi(-s)$, then the derivatives $u^{(k)}(0)$ of odd orders $k$ vanish, which implies that the poles of $\Psi_\lambda$ may occur only at $\lambda = -d, -d-2, -d-4, \ldots$. A basic example is obtained by choosing $\psi(s) = 1$, which leads to the generalized functions $|x|^\lambda$ on $\mathbb{R}^d$. The same results also hold if $\psi$ is not a function on $S^{d-1}$, but a measure, or even more generally a generalized function, that is, a continuous linear functional on the space of smooth functions on $S^{d-1}$.

1.11 Fourier transform, fundamental solutions and Green functions

The true power of generalized functions reveals itself in combination with the theory of Fourier transforms. By (9.20), for $\lambda > 0$ and $\tau > 0$,
\[
F(x_\pm^\lambda e^{-\tau|x|})(p) = \int_0^\infty r^\lambda e^{-r\tau \mp irp}\,dr = (\pm ip + \tau)^{-1-\lambda}\,\Gamma(1+\lambda),
\]
where $-\pi/2 < \arg(\pm ip + \tau) < \pi/2$. This can be rewritten as
\[
F(x_\pm^\lambda e^{-\tau|x|})(p) = e^{\mp i\pi(\lambda+1)/2}(p \mp i\tau)^{-1-\lambda}\,\Gamma(1+\lambda), \tag{1.173}
\]
where $-\pi < \arg(p - i\tau) < 0$ and $0 < \arg(p + i\tau) < \pi$, respectively. Passing to the limit $\tau \to 0+$ yields
\[
F(x_\pm^\lambda/\Gamma(1+\lambda))(p) = e^{\mp i\pi(\lambda+1)/2}(p \mp i0)^{-1-\lambda}. \tag{1.174}
\]
By analytic continuation, this equation holds in the sense of generalized functions for all $\lambda$, since both sides are entire functions of $\lambda$ by Propositions 1.10.1 and 1.10.2. Representing $(p \mp i0)$ by (1.171), we obtain
\[
F(x_\pm^\lambda)(p) = \Gamma(1+\lambda)\big[e^{\mp i\pi(\lambda+1)/2}\,p_+^{-(1+\lambda)} + e^{\pm i\pi(\lambda+1)/2}\,p_-^{-(1+\lambda)}\big], \tag{1.175}
\]
for $\lambda \neq -1, -2, \ldots$, and thus
\[
F^{-1}(x_\pm^\lambda)(p) = \frac{1}{2\pi}F(x_\pm^\lambda)(-p) = \frac{\Gamma(1+\lambda)}{2\pi}\big[e^{\mp i\pi(\lambda+1)/2}\,p_-^{-(1+\lambda)} + e^{\pm i\pi(\lambda+1)/2}\,p_+^{-(1+\lambda)}\big]. \tag{1.176}
\]

Adding and subtracting yields the Fourier transforms of the even and odd combinations. For instance,
\[
F(|x|^\lambda)(p) = -2\Gamma(1+\lambda)\sin\frac{\lambda\pi}{2}\,|p|^{-(1+\lambda)}. \tag{1.177}
\]
This result has a direct extension to arbitrary dimensions. Namely, the Fourier transform of the function $|x|^\lambda$ in $\mathbb{R}^d$ equals
\[
F(|x|^\lambda)(p) = \int_0^\infty\int_{S^{d-1}}|y|^{\lambda+d-1}e^{i(y,p)}\,d|y|\,d\bar y,
\]
with $\bar y = y/|y|$ and $ds$ the Lebesgue measure on $S^{d-1}$ applied to $s = \bar y$. Hence for $0 < \lambda + d < 1$, we get from (9.26) that
\[
F(|x|^\lambda)(p) = \Gamma(d+\lambda)\int_{S^{d-1}}|(p,s)|^{-d-\lambda}\cos(\pi(d+\lambda)/2)\,ds.
\]
Applying (1.149) finally yields
\[
F(|x|^\lambda)(p) = 2\cos(\pi(d+\lambda)/2)\,\pi^{(d-1)/2}\,\frac{\Gamma(\lambda+d)\,\Gamma((-\lambda-d+1)/2)}{\Gamma(-\lambda/2)}\,|p|^{-d-\lambda}. \tag{1.178}
\]
With the convergence (as an improper Riemann integral) proved for $0 < \lambda + d < 1$, this formula extends by analytic continuation to all $\lambda \neq 0, 2, 4, \ldots$ and $\lambda \neq -d, -d-2, \ldots$ as an equation for generalized functions.

More generally, formula (9.26) provides the Fourier transform of the homogeneous generalized function $|x|^\omega\mu(d\bar x)\,d|x|$ with a symmetric measure $\mu$ on $S^{d-1}$ for $-1 < \omega < 0$, which again can be extended to other $\lambda$ by analytic continuation of homogeneous functions as explained above (see the discussion after formula (1.172)).

In the analysis of partial differential and pseudo-differential equations, a key role is played by the so-called fundamental solutions. If $L$ is a linear differential or pseudo-differential operator with constant coefficients, its fundamental solution is defined as a generalized function $E_L$ such that $LE_L(x) = \delta(x)$.

Proposition 1.11.1. If $g(x)$ is a function such that the convolution
\[
u(x) = (E_L \star g)(x) = \int E_L(x-y)g(y)\,dy \tag{1.179}
\]
is well defined, then this convolution solves the equation $Lu(x) = g(x)$. Moreover, this solution is unique in the class of functions where the convolution $E_L \star u$ is well defined.

Proof. The first assertion holds because of
\[
Lu(x) = (LE_L \star g)(x) = (\delta \star g)(x) = g(x).
\]
If we assume that there are two solutions to the equation $Lu(x) = g(x)$, then their difference solves the equation $Lu = 0$. Hence
\[
u = u \star \delta = u \star LE_L = Lu \star E_L = 0,
\]
where (1.99) was used. □

Note, however, that a fundamental solution may not be unique (see examples below or Proposition 5.12.2).


A useful extension can be obtained for Banach-space-valued functions. Namely, for a Banach space $B$, we may define the $B$-valued generalized functions from $D'_B(\mathbb{R}^d)$ and $S'_B(\mathbb{R}^d)$ as the spaces of continuous linear mappings $D(\mathbb{R}^d) \to B$ and $S(\mathbb{R}^d) \to B$, respectively. Any locally integrable function $\xi : \mathbb{R}^d \to B$ defines a generalized function from $D'_B(\mathbb{R}^d)$ by the usual rule $(\xi, \phi) = \int\xi(x)\phi(x)\,dx$ for any $\phi \in D(\mathbb{R}^d)$. The differentiation operation extends to $D'_B(\mathbb{R}^d)$ and $S'_B(\mathbb{R}^d)$ like in the real-valued case. It is important to note that Proposition 1.11.1 extends to $B$-valued solutions of the equation $Lu(x) = g(x)$:

Proposition 1.11.2. If $g \in D'_B(\mathbb{R}^d)$ and the convolution $E_L \star g$ is well defined, then this convolution solves the equation $Lu(x) = g(x)$ in $D'_B(\mathbb{R}^d)$. Moreover, this solution is unique in the class of $u \in D'_B(\mathbb{R}^d)$ where $E_L \star u$ is well defined.

Let us see how the Fourier transform works when it comes to calculating fundamental solutions. Let $E_L$ be a fundamental solution for a pseudo-differential operator with constant coefficients and symbol $\psi$. Then passing to the Fourier transform in the equation $LE_L = \delta$ yields $\psi(p)FE_L(p) = 1$. Hence
\[
E_L = F^{-1}(1/\psi). \tag{1.180}
\]
For instance, if $L = |\nabla|^\beta = |\Delta|^{\beta/2}$, with $\beta > 0$, it follows that $E_L = F^{-1}(|p|^{-\beta})$. Consequently, by (1.178), if $\beta \neq d, d+2, d+4, \ldots$, the fundamental solution for $L$ is
\[
E_L(x) = c(\beta, d)|x|^{\beta-d}, \tag{1.181}
\]
with a constant $c(\beta, d)$. In particular, for the usual Laplacian (where $\beta = 2$) one gets
\[
E_L = -\frac{1}{(d-2)|S^{d-1}|}\,|x|^{2-d}, \tag{1.182}
\]
for all $d \neq 2$.

Exercise 1.11.1. (i) Check (1.182) by direct differentiation, as well as the formula
\[
E_\Delta = \frac{1}{2\pi}\ln|x| \quad\text{for } d = 2. \tag{1.183}
\]
(ii) Check that the functions
\[
\theta(t)e^{-at}, \qquad \theta(t)\,\frac{\sin(at)}{a}
\]
represent fundamental solutions to the one-dimensional operators $a + d/dt$ and $a^2 + d^2/dt^2$ respectively. Also, show that both fundamental solutions are unique under the additional assumption that they vanish for negative $t$.
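Part (ii) of the exercise can be sanity-checked numerically before attempting a proof. The sketch below pairs each candidate $E$ with the transposed operator applied to a test function and compares the result with $\phi(0)$. The Gaussian $\phi(t) = e^{-t^2}$ (a Schwartz function rather than a compactly supported one), the integration cutoff, and all function names are assumptions made for this illustration.

```python
import math

def simpson(f, a, b, n=20000):
    """Composite Simpson rule on [a, b] with an even number n of subintervals."""
    if n % 2:
        n += 1
    h = (b - a) / n
    total = f(a) + f(b)
    for i in range(1, n):
        total += (4 if i % 2 else 2) * f(a + i * h)
    return total * h / 3

# Test function phi(t) = exp(-t^2) and its first two derivatives; phi(0) = 1.
def phi(t):
    return math.exp(-t * t)

def dphi(t):
    return -2.0 * t * math.exp(-t * t)

def d2phi(t):
    return (4.0 * t * t - 2.0) * math.exp(-t * t)

def pairing_first_order(a, upper=12.0):
    """((a + d/dt) E, phi) = (E, (a - d/dt) phi) for E(t) = theta(t) exp(-a t)."""
    return simpson(lambda t: math.exp(-a * t) * (a * phi(t) - dphi(t)), 0.0, upper)

def pairing_second_order(a, upper=12.0):
    """((a^2 + d^2/dt^2) E, phi) = (E, (a^2 + d^2/dt^2) phi) for E(t) = theta(t) sin(a t)/a."""
    return simpson(lambda t: math.sin(a * t) / a * (a * a * phi(t) + d2phi(t)), 0.0, upper)

for a in (0.7, 2.0):
    # Both pairings should be close to phi(0) = 1.
    print(a, pairing_first_order(a), pairing_second_order(a))
```

A numerical agreement for one test function is of course not a proof; it only confirms that the signs and normalizations in the exercise are consistent.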


If a ΨDO $L$ has an infinitely differentiable symbol $\psi(p)$ of at most polynomial growth, then the action $L\xi$ is well defined for any generalized function $\xi$. In fact, $L\xi$ can be defined by its Fourier transform $\psi(p)F\xi$. If the symbol $\psi$ is not infinitely smooth, this product may not be defined, and therefore neither is the action $L\xi$. Thus one has to be cautious when working with ΨDOs with non-smooth symbols in the framework of generalized functions.

Let us find the fundamental solutions for the operators of fractional differentiation $d^\beta/d(\pm x)^\beta$. By (1.151), their symbols equal $\exp\{\pm i\pi\beta\,\mathrm{sgn}\,p/2\}|p|^\beta$ and are not infinitely smooth. Nevertheless, the fundamental solutions are well defined by the formula
\[
E_{\pm\beta} = F^{-1}\big(\exp\{\mp i\pi\beta\,\mathrm{sgn}\,p/2\}|p|^{-\beta}\big) = \exp\{\mp i\pi\beta/2\}\,F^{-1}\big(p_+^{-\beta}\big) + \exp\{\pm i\pi\beta/2\}\,F^{-1}\big(p_-^{-\beta}\big).
\]
Using formula (1.175), we find (after some cancellation and exploiting the fact $\Gamma(\beta)\Gamma(1-\beta) = \pi/\sin(\pi\beta)$) that
\[
E_{\pm\beta}(x) = x_\pm^{\beta-1}/\Gamma(\beta), \tag{1.184}
\]
for all positive $\beta \notin \mathbb{N}$. These fundamental solutions are unique under the additional condition that they vanish on a half-line (see Proposition 5.12.2 for more general cases).

Another type of fundamental solution arises for evolutionary problems. Namely, if $L$ is a linear differential or pseudo-differential operator on $\mathbb{R}^d$ with constant coefficients, its Green function, or the fundamental solution for its Cauchy problem (also referred to as the heat kernel), is defined as the generalized function $G_L(t, .)$ on $\mathbb{R}^d$ depending on the parameter $t > 0$ such that
\[
\frac{\partial G_L}{\partial t} - LG_L = 0, \quad t > 0; \qquad \lim_{t\to 0}G_L(t, x) = \delta(x). \tag{1.185}
\]
When such a $G_L$ is known, it follows from its definition that the formula
\[
f(t, x) = (G_L(t, .) \star f_0)(x) \tag{1.186}
\]
supplies a solution to the Cauchy problem
\[
\frac{\partial f(t, x)}{\partial t} - Lf(t, x) = 0, \quad t > 0; \qquad f(0, x) = f_0(x), \tag{1.187}
\]
with an arbitrary initial condition $f_0 \in D$. If the Green function is given by ordinary locally integrable functions of $x$ depending smoothly on $t$ (which is often the case in concrete examples), then the convolution in (1.186) turns into the usual integral,
\[
f(t, x) = \int G_L(t, x-y)f_0(y)\,dy, \tag{1.188}
\]
and is well defined for all bounded measurable functions $f_0$.

More generally, for a time-dependent family of operators $L_t$, the Green function of $L_t$ is defined as the generalized function $G_L(t, s, .)$ on $\mathbb{R}^d$ depending on the parameters $t > s$ such that
\[
\frac{\partial G_L(t, s, .)}{\partial t} - L_tG_L(t, s, .) = 0; \qquad \lim_{t\to s}G_L(t, s, x) = \delta(x). \tag{1.189}
\]

The most fundamental example of the Green function is supplied by the Cauchy problem for the diffusion or heat conductivity equation:
\[
\frac{\partial f(t, x)}{\partial t} = \frac{1}{2}\Delta f(t, x), \qquad f(0, x) = f_0(x). \tag{1.190}
\]
The corresponding Green function solves the problem
\[
\frac{\partial G(t, x)}{\partial t} - \frac{1}{2}\Delta G(t, x) = 0, \quad t > 0; \qquad \lim_{t\to 0}G(t, x) = \delta(x), \tag{1.191}
\]
and is given by the formula
\[
G(t, x) = \frac{1}{(2\pi t)^{d/2}}\exp\Big\{-\frac{x^2}{2t}\Big\}. \tag{1.192}
\]
Its derivation via the Fourier transform can be found, e.g., with (2.63).

We shall now give two simple but fundamental results that link fundamental solutions and Green functions.

Proposition 1.11.3. Let $L$ be a pseudo-differential operator $\psi(-i\nabla)$, and let its Green function $G_L$ be given by ordinary functions $G_L(t, x)$, $t > 0$, so that the integrals $\int G_L(t, x)\,dx$ are uniformly bounded for small $t$ and $G_L(t, x)$ satisfies the equation $\partial G_L/\partial t - LG_L = 0$ classically. Then the function
\[
E(t, x) = \begin{cases} G_L(t, x), & t > 0, \\ 0, & t \le 0, \end{cases} \tag{1.193}
\]
is the fundamental solution for the operator $\partial/\partial t - L$ in $\mathbb{R}^{d+1}$.


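Proof. By (1.155) and the integrability of $G_L(t, .)$, if $\phi \in D(\mathbb{R}^{d+1})$, then
\[
\Big(\Big(\frac{\partial}{\partial t} - L\Big)E, \phi\Big) = -\int_0^\infty\int_{\mathbb{R}^d}G_L(t, x)\Big(\frac{\partial\phi}{\partial t}(t, x) + L^*\phi(t, x)\Big)\,dx\,dt
\]
\[
= -\lim_{\epsilon\to 0}\int_\epsilon^\infty\int_{\mathbb{R}^d}G_L(t, x)\Big(\frac{\partial\phi}{\partial t}(t, x) + L^*\phi(t, x)\Big)\,dx\,dt
\]
\[
= \lim_{\epsilon\to 0}\int_\epsilon^\infty\int_{\mathbb{R}^d}\Big(\frac{\partial}{\partial t} - L\Big)G_L(t, x)\,\phi(t, x)\,dx\,dt + \lim_{\epsilon\to 0}\int_{\mathbb{R}^d}G_L(\epsilon, x)\phi(\epsilon, x)\,dx.
\]
The first term vanishes, since $G_L$ satisfies the equation $\partial G_L/\partial t - LG_L = 0$ classically. Moreover,
\[
\lim_{\epsilon\to 0}\int_{\mathbb{R}^d}G_L(\epsilon, x)(\phi(\epsilon, x) - \phi(0, x))\,dx = 0,
\]
due to the uniform integrability of $G_L(t, .)$. Hence
\[
\Big(\Big(\frac{\partial}{\partial t} - L\Big)E, \phi\Big) = \lim_{\epsilon\to 0}\int_{\mathbb{R}^d}G_L(\epsilon, x)\phi(0, x)\,dx = \phi(0, 0). \qquad\square
\]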
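For the heat kernel (1.192) with $d = 1$, the ingredients used in this proof can be observed numerically: the kernel satisfies the equation classically, its total mass stays equal to one, and pairing it with a fixed function at small times recovers the value at the origin. The following sketch uses finite differences and Simpson quadrature; the step sizes, the truncation of the integrals to twelve standard deviations, the choice of $\cos x$ as a (not compactly supported) probe function, and the function names are all assumptions of this illustration.

```python
import math

def heat_kernel(t, x):
    """One-dimensional case of (1.192): exp(-x^2 / (2t)) / sqrt(2 pi t)."""
    return math.exp(-x * x / (2.0 * t)) / math.sqrt(2.0 * math.pi * t)

def simpson(f, a, b, n=4000):
    """Composite Simpson rule on [a, b] with an even number n of subintervals."""
    if n % 2:
        n += 1
    h = (b - a) / n
    total = f(a) + f(b)
    for i in range(1, n):
        total += (4 if i % 2 else 2) * f(a + i * h)
    return total * h / 3

def pde_residual(t, x, h=1e-3):
    """Central-difference approximation of dG/dt - (1/2) d^2G/dx^2 at (t, x)."""
    dt = (heat_kernel(t + h, x) - heat_kernel(t - h, x)) / (2.0 * h)
    dxx = (heat_kernel(t, x + h) - 2.0 * heat_kernel(t, x)
           + heat_kernel(t, x - h)) / (h * h)
    return dt - 0.5 * dxx

def total_mass(t):
    """Integral of G(t, x) over x, truncated to 12 standard deviations."""
    w = 12.0 * math.sqrt(t)
    return simpson(lambda x: heat_kernel(t, x), -w, w)

def smoothed_cos(t):
    """Integral of G(t, x) cos(x) dx; the exact value is exp(-t/2)."""
    w = 12.0 * math.sqrt(t)
    return simpson(lambda x: heat_kernel(t, x) * math.cos(x), -w, w)

print(pde_residual(0.7, 0.4))           # close to 0
for t in (0.7, 1e-2, 1e-4):
    # mass stays 1, and the pairing tends to cos(0) = 1 as t -> 0
    print(t, total_mass(t), smoothed_cos(t))
```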



What can be said about the fundamental solution for $L$, once the Green function $G_L$ of its Cauchy problem is given? Let us say that a generalized function $\xi$ on $\mathbb{R}^{d+1} = \{t, x_1, \ldots, x_d\}$ can be reduced to the generalized function $\tilde\xi$ on $\mathbb{R}^d$, if
\[
(\tilde\xi, \phi) = (\xi, \phi(x)1(t))
\]
is well defined in the following sense: for any $\phi \in D(\mathbb{R}^d)$ and any sequence of functions $\eta_k : \mathbb{R} \to [0, 1]$ such that $\eta_k \to 1$ monotonically as $k \to \infty$, there exists a limit of the sequence $(\xi, \phi(x)\eta_k(t))$, denoted by $(\xi, \phi(x)1(t))$, such that this limit does not depend on the sequence $\eta_k$ and depends continuously on $\phi$ in the topology of $D(\mathbb{R}^d)$.

Remark 23. In fact, the last condition about the continuity is automatically fulfilled once the limit exists. However, this fact is not at all obvious and requires a nontrivial proof (see, e.g., [261]).

Proposition 1.11.4. Let $L$ be a pseudo-differential operator $\psi(-i\nabla)$, and let $E \in D'(\mathbb{R}^{d+1})$ be the fundamental solution for the operator $\partial/\partial t - L$. If $E$ can be reduced to the generalized function $\tilde E \in D'(\mathbb{R}^d)$, then $\tilde E$ is a fundamental solution for the operator $-L$.

Proof. We have
\[
-(L\tilde E, \phi) = -(\tilde E, L^*\phi) = -(E, L^*\phi(x)1(t)) = \Big(\Big(\frac{\partial}{\partial t} - L\Big)E, \phi(x)1(t)\Big) = (\delta(x)\delta(t), \phi(x)1(t)) = \phi(0),
\]
as claimed. □




As an exemplary application, let us derive again the fundamental solution (1.182) for the Laplacian operator $\Delta$ in $d > 2$, now using formula (1.192) for the Green function. By Proposition 1.11.4, the fundamental solution $E(x)$ for $-\Delta/2$ should be given by the integral
\[
E(x) = \int_0^\infty\frac{1}{(2\pi t)^{d/2}}\exp\Big\{-\frac{x^2}{2t}\Big\}\,dt
\]
whenever it is well defined, which is the case for $d > 2$. Therefore, using this formula for $d > 2$ and changing the variable $t$ into $u = x^2/2t$ yields $t = x^2/2u$, $dt = -(x^2/2u^2)\,du$ and thus
\[
E(x) = \frac{1}{2\pi^{d/2}|x|^{d-2}}\int_0^\infty u^{-2+d/2}e^{-u}\,du = \frac{1}{2\pi^{d/2}|x|^{d-2}}\,\Gamma(-1 + d/2).
\]
Evaluating the $\Gamma$-function yields the same result for the fundamental solution for the Laplacian as in (1.182).
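The final step, evaluating the $\Gamma$-function, can be made explicit with the standard formula $|S^{d-1}| = 2\pi^{d/2}/\Gamma(d/2)$, and the time integral itself can be checked by quadrature. In the sketch below, the comparison uses that a fundamental solution for $-\Delta/2$ is $-2$ times the one for $\Delta$ in (1.182). The substitution $t = e^s$, the integration limits, and the function names are choices made for this illustration only.

```python
import math

def sphere_area(d):
    """Surface area |S^{d-1}| = 2 pi^{d/2} / Gamma(d/2) of the unit sphere in R^d."""
    return 2.0 * math.pi ** (d / 2.0) / math.gamma(d / 2.0)

def from_heat_kernel(d, r):
    """Closed form Gamma(d/2 - 1) / (2 pi^{d/2} r^{d-2}) obtained from the time integral."""
    return math.gamma(d / 2.0 - 1.0) / (2.0 * math.pi ** (d / 2.0) * r ** (d - 2))

def from_1_182(d, r):
    """-2 times the right-hand side of (1.182), i.e. a fundamental solution for -Delta/2."""
    return 2.0 / ((d - 2) * sphere_area(d)) * r ** (2 - d)

def time_integral(d, r, lo=-8.0, hi=80.0, n=40000):
    """Integral over t in (0, inf) of the heat kernel at |x| = r, computed with t = exp(s)."""
    def integrand(s):
        t = math.exp(s)
        # The extra factor t is the Jacobian dt = t ds.
        return t * math.exp(-r * r / (2.0 * t)) / (2.0 * math.pi * t) ** (d / 2.0)
    h = (hi - lo) / n
    total = integrand(lo) + integrand(hi)
    for i in range(1, n):
        total += (4 if i % 2 else 2) * integrand(lo + i * h)
    return total * h / 3

for d in (3, 4, 5):
    print(d, time_integral(d, 1.0), from_heat_kernel(d, 1.0), from_1_182(d, 1.0))
```

For $d = 3$ and $|x| = 1$ all three numbers equal $1/(2\pi)$, which is the familiar Newtonian potential up to the factor coming from the normalization $-\Delta/2$.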

1.12 Sobolev spaces

The Fourier transform allows for a stratification of the spaces $S$ and $S'$ in a natural way. For an integer $k \ge 0$ and $p \ge 1$, the Sobolev space $H_p^k = H_p^k(\mathbb{R}^d)$ is defined as the subspace of functions $u$ from $L^p(\mathbb{R}^d)$ such that the partial derivatives up to and including order $k$ (always defined in the sense of generalized functions) turn out to be also functions from $L^p(\mathbb{R}^d)$. The norm on $H_p^k$ is defined as
\[
\|f\|_{k,p} = \sum_{q=0}^k\ \sum_{m_1+\cdots+m_d=q}\Big\|\frac{\partial^q f}{\partial x_1^{m_1}\cdots\partial x_d^{m_d}}\Big\|_{L^p(\mathbb{R}^d)}. \tag{1.194}
\]
According to the Fourier inversion theorem, the norm (1.194) for $p = 2$ is equivalent to the norm
\[
\|f\|_{H_2^k} = \Big(\int(1 + |p|^2)^k|Ff(p)|^2\,dp\Big)^{1/2}. \tag{1.195}
\]
This allows for an extension of the definition of $H_2^k$ to all real $k$. All these spaces $H_2^s = H_2^s(\mathbb{R}^d)$, $s \in \mathbb{R}$, are Hilbert spaces, so that the Fourier transform
\[
F : H_2^k(\mathbb{R}^d) \to L^2(\mathbb{R}^d, (1 + |p|^2)^{k/2}\,dp)
\]
is an isometry. Similar representations via the Fourier transform also exist for $H_p^k$ with $p \neq 2$, but we will not give details here.

With the help of the celebrated Sobolev embedding theorem, one can embed these spaces in some classes of regular functions. The most basic result is as follows.


Theorem 1.12.1. (i) If $s > d/2$, then $H_2^s(\mathbb{R}^d) \subset C_\infty(\mathbb{R}^d)$. (ii) If $s > d/2 + k$, then $H_2^s(\mathbb{R}^d) \subset C_\infty^k(\mathbb{R}^d)$.

Proof. (i) By the Riemann–Lebesgue lemma, it is sufficient to check that $u \in H_2^s$ implies $Fu \in L^1(\mathbb{R}^d)$, because in this case $u = F^{-1}\circ Fu \in C_\infty(\mathbb{R}^d)$. With the Cauchy inequality for scalar products in Hilbert spaces, we find
\[
\int|(Fu)(p)|\,dp \le \Big(\int|(Fu)(p)|^2(1 + |p|^2)^s\,dp\Big)^{1/2}\Big(\int(1 + |p|^2)^{-s}\,dp\Big)^{1/2} = \|u\|_{H_2^s}\Big(\int(1 + |p|^2)^{-s}\,dp\Big)^{1/2},
\]
and the last term here is finite because of $s > d/2$. (ii) The proof is similar, if we take into account that the differentiation of $u$ of order $k$ adds a multiplier of magnitude $|p|^k$ to its Fourier image. □

For an open $\Omega \subset \mathbb{R}^d$, one defines the local Sobolev space $H_2^s(\Omega)$ as the subspace of generalized functions $\xi$ of $D'(\mathbb{R}^d)$ such that $\phi\xi \in H_2^s(\mathbb{R}^d)$ for all $\phi \in D(\mathbb{R}^d)$ with a support in $\Omega$.

Remark 24. One can show (e.g., in [232], Section IX.6) that for any bounded open $\Omega$ and any $\xi \in D'(\mathbb{R}^d)$ there exists an $s$ such that $\xi \in H_2^s(\Omega)$. Therefore, the spaces $H_2^s(\Omega)$ cover all $D'$.
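The constant appearing in the proof of Theorem 1.12.1(i) can be made concrete in dimension $d = 1$, where the Beta-function identity $\int_{\mathbb{R}}(1+p^2)^{-s}\,dp = \sqrt{\pi}\,\Gamma(s - 1/2)/\Gamma(s)$ holds for $s > 1/2$. The sketch below compares this closed form with a quadrature after the substitution $p = \tan\theta$. The restriction to $d = 1$ and to a few values $s \ge 1$ (where the transformed integrand is smooth), as well as the function names, are assumptions of this illustration.

```python
import math

def weight_integral(s, n=2000):
    """Integral of (1 + p^2)^(-s) over the real line, for s >= 1.

    With p = tan(theta) the integral becomes the integral of cos(theta)^(2s - 2)
    over (-pi/2, pi/2), which is evaluated by the composite Simpson rule.
    """
    if n % 2:
        n += 1
    a, b = -math.pi / 2.0, math.pi / 2.0
    h = (b - a) / n

    def f(theta):
        # Clamp tiny negative rounding errors of cos near the endpoints.
        return max(math.cos(theta), 0.0) ** (2.0 * s - 2.0)

    total = f(a) + f(b)
    for i in range(1, n):
        total += (4 if i % 2 else 2) * f(a + i * h)
    return total * h / 3

def closed_form(s):
    """sqrt(pi) Gamma(s - 1/2) / Gamma(s), valid for s > 1/2 = d/2."""
    return math.sqrt(math.pi) * math.gamma(s - 0.5) / math.gamma(s)

for s in (1.0, 1.5, 2.0, 3.0):
    print(s, weight_integral(s), closed_form(s))
```

As $s$ decreases to $d/2 = 1/2$, the factor $\Gamma(s - 1/2)$ blows up, which reflects the sharpness of the condition $s > d/2$ in this argument.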

1.13 Variational derivatives

We shall now discuss the peculiarities that arise when analysing spaces of measures and the dual functional spaces, where the structure of differentials is revealed in terms of variational derivatives. For that purpose, recall that $\mathcal{M}(X)$ is the space of bounded Borel measures on a metric space $X$, and $\mathcal{M}^+(X)$ is its cone of positive measures. Let M 0 there exists a partition of $K$, i.e., its representation $K = \cup X_j$ as a union of a finite number of pairwise disjoint subsets, such that the oscillation of $f$ on each $X_j$ does not exceed $\epsilon$. Choosing arbitrary $x_j \in X_j$ and defining
\[
\mu_\epsilon = \sum_j\mu(X_j)\delta_{x_j},
\]
one gets $|(f, \mu) - (f, \mu_\epsilon)| \le \epsilon\|\mu\|$.

Exercise 1.1.5. The equations (1.11) are straightforward. The first equation in (1.12) follows from duality (see Exercise 1.1.1). For example, it is sufficient to use just $\mathbb{R}^2$.

Exercise 1.1.7.
\[
(y^{(m)}, x) = \sum_{j=1}^n y_j^{(m)}x_j + \sum_{j=n+1}^\infty y_j^{(m)}x_j,
\]
so that, if $y^{(m)}$ is bounded, the second term can be made arbitrarily small for any $x$ by choosing $n$ large enough. This reduces the problem to a finite-dimensional setting.

Exercise 1.1.8. Weak convergence implies tightness, that is, for any $\epsilon$ there exists $N$ such that $|y_n^{(m)}| < \epsilon$ for all $n > N$ and all $m$.

Exercise 1.4.1. Firstly,
\[
[f, g]_+ = \lim_{h\to 0}\frac{1}{h}\Big(\max_x|f(x) + hg(x)| - \|f\|\Big) \ge \lim_{h\to 0}\frac{1}{h}\max_{x\in M_f}\big(|f(x)| + hg(x)\,\mathrm{sgn}\,f(x) - \|f\|\big) = \max_{x\in M_f}\big(g(x)\,\mathrm{sgn}\,f(x)\big).
\]

In order to prove the equality, choose a converging subsequence from the sequence $x_h$ that realises a maximum for finite $h$.

Exercise 1.4.2. It is a consequence of Proposition 1.4.3.

Exercise 1.5.1. (i) $(A + h\xi)^{-1} = (1 + hA^{-1}\xi)^{-1}A^{-1} = (1 - hA^{-1}\xi + o(h))A^{-1}$.

Exercise 1.6.1. Claims (i) to (iii) are straightforward. To prove (iv), note that if $x = tm_1$, $y = sm_2$ with $m_1, m_2 \in M$, then
\[
\frac{x + y}{t + s} = \frac{tm_1 + sm_2}{t + s} \in M.
\]

Exercise 1.6.4.
\[
p_n(x) < \epsilon \implies d(x, 0) < \sum_{k\neq n}2^{-k} + \epsilon, \qquad d(x, 0) < \epsilon \implies p_n(x) < \epsilon/(1 - \epsilon).
\]


Exercise 1.6.5.

f 2p,q,2 f 0,0

(1 + x||2d )−1 dx,      ∂f   ∂f 2     (1 + x)2d dx (1 + x)−2d dx. dx ≤ ≤   ∂xj  ∂xj  ≤

f 2p+d,q

Exercise 1.6.7. For a balanced convex neighbourhood $V_0$ of zero in $V$, there exists a neighbourhood $W_1$ of zero in $W$ such that $V_0 = W_1 \cap V$. Since $W$ is locally convex, there exists a convex balanced neighbourhood $W_2$ of zero in $W$ such that $W_2 \subset W_1$. Take $W_0 = \{\alpha v + \beta w : v \in V_0, w \in W_2, |\alpha| + |\beta| = 1\}$.

Exercise 1.6.8. Since for any $x, y \in V$ there exists $n$ such that $x, y \in V_n$, the separation property follows. The local convexity follows essentially from Exercise 1.6.7.

Exercise 1.7.1. For (1.85), use the Taylor expansions in the form
\[
f(x + y) - f(x) - f'(x)y = \int_0^y f''(x + z)(y - z)\,dz, \quad y > 0,
\]
\[
f(x + y) - f(x) - f'(x)y = \int_y^0 f''(x + z)(z - y)\,dz, \quad y < 0.
\]

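Exercise 1.7.2. One can reduce the calculations to the one-dimensional case by bringing $A$ into diagonal form, i.e., expressing it as $A = ODO^{-1}$ with a diagonal matrix $D$ and an orthogonal matrix $O$.

Exercise 1.8.2. This is a corollary of (1.138) and (1.137).

Exercise 1.11.1(i).
\[
(\Delta\ln|x|, \psi) = (\ln|x|, \Delta\psi) = \lim_{\epsilon\to 0}\int_{|x|\ge\epsilon}\ln|x|\,\Delta\psi(x)\,dx.
\]
In polar coordinates $(r, \phi)$, we find
\[
\Delta = \frac{\partial^2}{\partial r^2} + \frac{1}{r}\frac{\partial}{\partial r} + \frac{1}{r^2}\frac{\partial^2}{\partial\phi^2} = \frac{1}{r}\frac{\partial}{\partial r}r\frac{\partial}{\partial r} + \frac{1}{r^2}\frac{\partial^2}{\partial\phi^2},
\]
implying
\[
(\Delta\ln|x|, \psi) = \lim_{\epsilon\to 0}\int_\epsilon^\infty dr\int_0^{2\pi}r\ln r\Big(\frac{1}{r}\frac{\partial}{\partial r}r\frac{\partial}{\partial r} + \frac{1}{r^2}\frac{\partial^2}{\partial\phi^2}\Big)\psi(r, \phi)\,d\phi.
\]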

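The term with $\partial^2/\partial\phi^2$ disappears due to periodicity, so that
\[
(\Delta\ln|x|, \psi) = \lim_{\epsilon\to 0}\int_\epsilon^\infty dr\int_0^{2\pi}\ln r\,\frac{\partial}{\partial r}\Big(r\frac{\partial}{\partial r}\psi(r, \phi)\Big)\,d\phi
\]
\[
= -\lim_{\epsilon\to 0}\int_\epsilon^\infty dr\int_0^{2\pi}\Big(\frac{\partial}{\partial r}\ln r\Big)\,r\,\frac{\partial}{\partial r}\psi(r, \phi)\,d\phi - \lim_{\epsilon\to 0}\epsilon\ln\epsilon\int_0^{2\pi}\frac{\partial}{\partial r}\psi(\epsilon, \phi)\,d\phi.
\]
The second term vanishes, and therefore
\[
(\Delta\ln|x|, \psi) = -\lim_{\epsilon\to 0}\int_\epsilon^\infty dr\int_0^{2\pi}\frac{\partial}{\partial r}\psi(r, \phi)\,d\phi = \lim_{\epsilon\to 0}\int_0^{2\pi}\psi(\epsilon, \phi)\,d\phi = 2\pi\psi(0).
\]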

1.16 Summary and comments

We developed the calculus on locally convex spaces, paying most attention to the case of Banach spaces and going into special details for the space of measures, where the variational derivatives play a crucial role. The analysis on general spaces was presented in a more sketchy way. By carefully selecting the material, we refined the general ideas to such a degree that they are appropriate for applications to a wide class of concrete partial differential and pseudo-differential equations, some of which will be dealt with further on. With the space of generalized functions, the key example of locally convex spaces was properly introduced together with its main properties, including fundamental solutions and the Fourier transform.

Propositions 1.4.4 and 1.4.5, which are major tools for proving uniqueness and stability of general kinetic equations, seem to have been found first in [141]. A probabilistic proof based on martingale theory is given in [215]. Here, we give the most elementary proof based on the theory of semi-inner products.

Many sources exist where the classical topics that are touched upon here are developed in various directions. The general references to topological linear spaces include [115, 212] and [235]. The discussion of various notions of differentiability in locally convex spaces can be found, e.g., in [20, 21, 264]. Fractional calculus is presented in many excellent books. Standard references include [123, 238] and [239]. Note that the last book specifically deals with hypersingular integrals that we only touched upon here. Of great importance is the probabilistic interpretation of fractional derivatives (see [152] and references therein), which plays the key role for establishing their links with various models in physics and economics. In fact, one of the greatest impetuses to the modern development of fractional calculus is due to its appearance in the theory of continuous-time random walks (CTRW), see, e.g., [144, 206, 207, 253, 255] and references therein.

A well-written exposition of generalized functions is the book [90]. The theory of generalized functions as a powerful tool for solving various classes of partial differential equations was first developed by S.L. Sobolev, following the ideas of N.M. Gunter, who initially introduced them under the name of ‘functions of domains’. The final, more abstract theory was shaped by L. Schwartz and I.M. Gelfand (more on this history, e.g., in [14]).

The modern development of the theory of pseudo-differential operators was essentially initiated by Hörmander [109] and Maslov [200]. Excellent expositions of pseudo-differential operators (ΨDOs) include [202, 245, 254]. Here, we only use the notations of the classical theory of ΨDOs, since this theory mostly deals with smooth symbols and the problems discussed here are often lacking such regularity. The theory of pseudo-differential operators with non-smooth symbols as they arise in the analysis of Markov processes is developed in [118].

Of course, many important tools of modern analysis on measures did not find their place in this exposition. The most notable omissions are metrics that metricise the weak convergence (Wasserstein–Kantorovich metrics) and the notions of derivatives that are motivated by the links with probability theory, like the derivatives of Malliavin or Itô's pathwise calculus in the sense of Föllmer and Cont, see, e.g., [10], [26] and [84]. The fundamental monograph on mean field games, [48], contains a lot of information on modern analysis of the mappings between measure spaces. Also, we did not touch the very interesting topic of the analysis on metric spaces without a linear structure, which can be found in [7].

Chapter 2

Basic ODEs in Complete Locally Convex Spaces

In this chapter, we present more or less standard material (though not easily found in a concise systematic form) on vector-valued ODEs with a Lipschitz r.h.s. We shall fix the notations and set the scene for further developments. The Lipschitz r.h.s. ensures that the solution is globally well defined in both forward and backward time. We shall prove the basic well-posedness and sensitivity for ODEs, explain their links with partial differential equations via the method of characteristics, present some extensions to equations with memory (including causal equations and fractional derivatives in time), and finally review the theory of accretivity. We shall mostly work with Banach spaces, but will indicate the necessary modifications for the analogous theory in general locally convex spaces at the end of the chapter.

Although used here for ODEs with Lipschitz r.h.s., the abstract well-posedness results are formulated in such a way that they are applicable to wide classes of more general equations, as will be seen later. Therefore, this chapter lays the foundation for all future developments, the main general tools being presented in Sections 2.1, 2.3, 2.8, 2.10 and 2.15.

© Springer Nature Switzerland AG 2019. V. Kolokoltsov, Differential Equations on Measures and Functional Spaces, Birkhäuser Advanced Texts Basler Lehrbücher, https://doi.org/10.1007/978-3-030-03377-4_2

2.1 Fixed-point principles for curves in Banach spaces

Instead of carrying out case-by-case modifications of the fixed-point arguments for proving the well-posedness of various ODEs, it is handy to have some abstract version that we derive here as a consequence of the general fixed-point principle in metric spaces, as recalled in the Appendix (Section 9.1).

Let $B$ be a Banach space and $M$ a closed convex subset therein. For any $\tau < t$, let $C([\tau, t], M)$ be a convex subset of the Banach space $C([\tau, t], B)$ of functions on $[\tau, t]$ with values in $M \subset B$. Thus $C([\tau, t], M)$ is a complete metric space, equipped with the distance that is induced from $C([\tau, t], B)$. There is a natural restriction mapping that projects $C([\tau, t], M) \to C([\tau, s], M)$ for $t \ge s \ge \tau$ and a natural inclusion $M \to C([\tau, t], M)$ that maps $Y \in M$ onto a constant function $\mu_s = Y$ on $[\tau, t]$. For any $Y \in M$, let $C_Y([\tau, t], M)$ denote the closed convex subset of $C([\tau, t], M)$ consisting of functions $\mu_t$ with the initial condition $Y$: $\mu_\tau = Y$. Let $B_1$ be an auxiliary Banach space of some parameters.

Theorem 2.1.1. Suppose that for any $Y \in M$, $\alpha \in B_1$, a mapping $\Phi_{Y,\alpha} : C([\tau, T], M) \to C_Y([\tau, T], M)$ is given with some $T > \tau$ such that for any $t$ the restriction of $\Phi_{Y,\alpha}(\mu_.)$ on $[\tau, t]$ depends only on the restriction of the function $\mu_s$ on $[\tau, t]$. Moreover,
\[
\|[\Phi_{Y,\alpha}(\mu^1_.)](t) - [\Phi_{Y,\alpha}(\mu^2_.)](t)\| \le L(Y)\int_\tau^t\|\mu^1_. - \mu^2_.\|_{C([\tau,s],B)}\,ds,
\]
\[
\|[\Phi_{Y_1,\alpha_1}(\mu_.)](t) - [\Phi_{Y_2,\alpha_2}(\mu_.)](t)\| \le \kappa\|Y_1 - Y_2\| + \kappa_1\|\alpha_1 - \alpha_2\|, \tag{2.1}
\]
for any $\mu^1, \mu^2 \in C([\tau, T], M)$, $\alpha_1, \alpha_2 \in B_1$, some constants $\kappa, \kappa_1$ and a continuous function $L$ on $M$. Then for any $Y \in M$, $\alpha \in B_1$ the mapping $\Phi_{Y,\alpha}$ has a unique fixed point $\mu_{t,\tau}(Y, \alpha)$ in $C_Y([\tau, T], M)$. Moreover, for all $t \in [\tau, T]$,
\[
\|\mu_{t,\tau}(Y, \alpha) - Y\| \le e^{(t-\tau)L(Y)}\|[\Phi_{Y,\alpha}(Y)](t) - Y\|, \tag{2.2}
\]
and the fixed points $\mu_{t,\tau}(Y_1, \alpha_1)$ and $\mu_{t,\tau}(Y_2, \alpha_2)$ with different initial data $Y_1, Y_2$ and parameters $\alpha_1, \alpha_2$ satisfy the estimate
\[
\|\mu_{t,\tau}(Y_1, \alpha_1) - \mu_{t,\tau}(Y_2, \alpha_2)\| \le (\kappa\|Y_1 - Y_2\| + \kappa_1\|\alpha_1 - \alpha_2\|)\exp\{(t - \tau)\min(L(Y_1), L(Y_2))\}. \tag{2.3}
\]
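The simplest instance of Theorem 2.1.1 is the Picard map $[\Phi_Y(\mu_.)](t) = Y + \int_\tau^t F(\mu_s)\,ds$ on $B = \mathbb{R}$ with a Lipschitz function $F$, which satisfies the first estimate of (2.1) with $L$ equal to the Lipschitz constant of $F$. The sketch below iterates a discretised version of this map starting from the constant curve $\mu_s = Y$. The trapezoidal discretisation, the grid size, the choice $F(y) = y$, and the function names are assumptions made for this illustration and are not part of the theorem.

```python
import math

def picard_map(y0, rhs, mu, h):
    """Discretised Phi_Y: mu -> Y + int_tau^t rhs(mu_s) ds (trapezoidal rule, uniform grid)."""
    out = [y0]
    for i in range(1, len(mu)):
        out.append(out[-1] + 0.5 * h * (rhs(mu[i - 1]) + rhs(mu[i])))
    return out

def fixed_point(y0, rhs, tau, T, steps=1000, iterations=40):
    """Iterate the discretised map starting from the constant curve mu_s = Y."""
    h = (T - tau) / steps
    mu = [y0] * (steps + 1)
    gaps = []  # sup-norm distance between successive iterates
    for _ in range(iterations):
        new = picard_map(y0, rhs, mu, h)
        gaps.append(max(abs(a - b) for a, b in zip(new, mu)))
        mu = new
    return mu, gaps

# Example: F(y) = y, Y = 1 on [0, 1]; the fixed point approximates exp(t).
mu, gaps = fixed_point(1.0, lambda y: y, 0.0, 1.0)
print(mu[-1], math.e)   # value at t = 1 versus exp(1)
print(gaps[:6])         # successive gaps shrink roughly like 1/n!
```

The rapid decay of the gaps between successive iterates is the discrete counterpart of the factorial factor that makes the iterated map a contraction, and the value at $t = 1$ is consistent with the bound (2.2), which here reads $e - 1 \le e$.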

Proof. It follows from (2.1) by direct induction that
\[
\|[\Phi^n_{Y,\alpha}(\mu^1_.)] - [\Phi^n_{Y,\alpha}(\mu^2_.)]\|_{C([\tau,t],B)} \le \frac{(t - \tau)^nL^n(Y)}{n!}\,\|\mu^1_. - \mu^2_.\|_{C([\tau,t],B)} \tag{2.4}
\]
holds for any $t \in [\tau, T]$. Hence, by Proposition 9.1.1 the mapping $\Phi$ has a unique fixed point $\mu_{t,\tau}(Y, \alpha)$ in $C_Y([\tau, T], B)$, the approximations $[\Phi^n_{Y,\alpha}(Y)](t)$ converge to this point in $C([\tau, T], B)$, and (2.2) holds. Finally, equation (2.3) follows from (2.1) and Proposition 9.1.3 (equation (9.2)). □

We shall see several applications of this result, often with slight extensions that are now pointed out in detail. For this, let $M(t)$, $t \in [\tau, T]$, be an increasing family of bounded convex closed subsets in $B$ such that $M = M(\tau)$, and let $C([\tau, T], M(.))$ denote the closed subset of curves $\mu_.$ of $C([\tau, T], M(T))$ such that $\mu_t \in M(t)$ for all $t \in [\tau, T]$. The following extension of Theorem 2.1.1 is straightforward.


Theorem 2.1.2. Let all assumptions of Theorem 2.1.1 hold, with the only difference that for any $Y \in M = M(\tau)$, $\Phi_{Y,\alpha}$ maps $C([\tau, T], M(.))$ to itself. Then the statements of Theorem 2.1.1 hold for all $Y \in M$ with the fixed point existing in $C([\tau, T], M(.))$.

For more advanced developments (e.g., for nonlinear parabolic equations and equations with fractional derivatives), the following extension turns out to be useful. Note that the Mittag-Leffler function used therein is defined in (9.13).

Theorem 2.1.3. (i) Suppose that the conditions of Theorem 2.1.1 hold, but with (2.1) replaced by
\[
\|[\Phi_{Y,\alpha}(\mu^1_.)](t) - [\Phi_{Y,\alpha}(\mu^2_.)](t)\| \le L(Y)\int_\tau^t(t - s)^{-\omega}\|\mu^1_. - \mu^2_.\|_{C([\tau,s],B)}\,ds,
\]
\[
\|[\Phi_{Y_1,\alpha_1}(\mu_.)](t) - [\Phi_{Y_2,\alpha_2}(\mu_.)](t)\| \le \kappa\|Y_1 - Y_2\| + \kappa_1\|\alpha_1 - \alpha_2\|, \tag{2.5}
\]
with some $\omega \in (0, 1)$. Then for any $Y \in M$ the mapping $\Phi_{Y,\alpha}$ has a unique fixed point $\mu_{t,\tau}(Y, \alpha)$ in $C_Y([\tau, T], M)$. Moreover, for all $t \in [\tau, T]$,
\[
\|\mu_{t,\tau}(Y, \alpha) - Y\| \le E_{1-\omega}\big(L(Y)\Gamma(1 - \omega)(t - \tau)^{1-\omega}\big)\,\|[\Phi_{Y,\alpha}(Y)](t) - Y\|, \tag{2.6}
\]
and the fixed points $\mu_{t,\tau}(Y_1, \alpha_1)$ and $\mu_{t,\tau}(Y_2, \alpha_2)$ with different initial data $Y_1, Y_2$ and parameters $\alpha_1, \alpha_2$ satisfy the estimate (for any $j = 1, 2$)
\[
\|\mu_{t,\tau}(Y_1, \alpha_1) - \mu_{t,\tau}(Y_2, \alpha_2)\| \le (\kappa\|Y_1 - Y_2\| + \kappa_1\|\alpha_1 - \alpha_2\|)\,E_{1-\omega}\big(L(Y_j)\Gamma(1 - \omega)(t - \tau)^{1-\omega}\big). \tag{2.7}
\]

(ii) Similar to Theorem 2.1.2, let all assumptions of part (i) hold, with the only difference that for any $Y \in M = M(\tau)$, $\Phi_{Y,\alpha}$ maps $C([\tau,T],M(\cdot))$ to itself (for an increasing family of bounded convex closed subsets $M(\cdot)$). Then the statements of (i) hold for all $Y \in M$ with the fixed point existing in $C([\tau,T],M(\cdot))$.

Proof. Let us prove only (i), since the modification (ii) is straightforward. The first inequality of (2.5) can be rewritten in terms of the fractional Riemann integral (see (1.100) for its definition) as
\[
\|[\Phi_{Y,\alpha}(\mu^1_\cdot)](t) - [\Phi_{Y,\alpha}(\mu^2_\cdot)](t)\| \le L(Y)\Gamma(1-\omega) \, I_\tau^{1-\omega}\big(\|\mu^1_\cdot - \mu^2_\cdot\|_{C([\tau,\cdot],B)}\big)(t).
\]
By iteration and the composition rule $I_\tau^k I_\tau^m = I_\tau^{k+m}$ for fractional integrals (see (1.102)), it follows that
\[
\begin{aligned}
\|[\Phi^n_{Y,\alpha}(\mu^1_\cdot)] - [\Phi^n_{Y,\alpha}(\mu^2_\cdot)]\|_{C([\tau,t],B)}
&\le (L(Y)\Gamma(1-\omega))^n \, I_\tau^{n(1-\omega)}\big(\|\mu^1_\cdot - \mu^2_\cdot\|_{C([\tau,\cdot],B)}\big)(t) \\
&\le \frac{(L(Y)\Gamma(1-\omega))^n}{\Gamma(n(1-\omega))} \|\mu^1_\cdot - \mu^2_\cdot\|_{C([\tau,t],B)} \int_\tau^t (t-s)^{n(1-\omega)-1} \, ds \\
&= \frac{(L(Y)\Gamma(1-\omega)(t-\tau)^{1-\omega})^n}{\Gamma(n(1-\omega)+1)} \|\mu^1_\cdot - \mu^2_\cdot\|_{C([\tau,t],B)},
\end{aligned}
\tag{2.8}
\]
where in the last equation the formula $\Gamma(x+1) = x\Gamma(x)$ was used. Of course, the estimate (2.8) can alternatively be obtained from (2.5) by direct induction. The fractional integrals are only used as an elegant tool in order to avoid such routine induction arguments. Consequently, by the definition of the Mittag-Leffler function (see (9.13)) it follows that Propositions 9.1.1 and 9.1.3 are applicable with
\[
A = A(t-\tau) = \sum_{n=0}^\infty \frac{(L(Y)\Gamma(1-\omega)(t-\tau)^{1-\omega})^n}{\Gamma(n(1-\omega)+1)} = E_{1-\omega}\big(L(Y)\Gamma(1-\omega)(t-\tau)^{1-\omega}\big),
\]
which yields the unique fixed point for $\Phi_{Y,\alpha}$ and the estimates (2.6) and (2.7).
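To get a feeling for the size of the constant $A(t-\tau)$ appearing in this proof, one can sum the Mittag-Leffler series numerically. The following is a minimal sketch, assuming the series definition $E_a(z) = \sum_{n \ge 0} z^n / \Gamma(an+1)$ and real arguments only; the function names `mittag_leffler` and `growth_constant` are hypothetical helpers introduced for this illustration and do not come from the text.

```python
import math


def mittag_leffler(a, z, tol=1e-15, max_terms=500):
    """Sum E_a(z) = sum_{n>=0} z^n / Gamma(a*n + 1) for real z and a > 0.

    Each term is computed through lgamma to avoid overflow of Gamma.
    """
    total = 0.0
    for n in range(max_terms):
        if z == 0.0:
            term = 1.0 if n == 0 else 0.0
        else:
            sign = -1.0 if (z < 0 and n % 2 == 1) else 1.0
            term = sign * math.exp(n * math.log(abs(z)) - math.lgamma(a * n + 1.0))
        total += term
        # Stop once the terms are negligible relative to the partial sum.
        if n > 0 and abs(term) < tol * max(1.0, abs(total)):
            break
    return total


def growth_constant(lip, omega, dt):
    """A(dt) = E_{1-omega}(lip * Gamma(1-omega) * dt^(1-omega)), as in the proof."""
    return mittag_leffler(1.0 - omega, lip * math.gamma(1.0 - omega) * dt ** (1.0 - omega))


if __name__ == "__main__":
    # For omega -> 0 the constant reduces to exp(lip * dt), the factor in (2.2).
    print(growth_constant(2.0, 0.0, 0.5), math.exp(1.0))
    print(growth_constant(2.0, 0.5, 0.5))
```

In the limiting case $\omega = 0$ the series is the usual exponential, so the constant in (2.6) reduces to the factor $e^{(t-\tau)L(Y)}$ of (2.2).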



2.2 ODEs in Banach spaces: well-posedness

As usual, let $B$ be a Banach space. Differential equations supplemented by some initial conditions (i.e., conditions at an initial time of the evolution) are usually referred to as Cauchy problems.

Theorem 2.2.1. Let $F$ be a Lipschitz-continuous (not necessarily bounded) mapping $B \to B$, with a Lipschitz constant $\|F\|_{\mathrm{Lip}} = L$ as defined by (1.53). Then for any $Y \in B$ there exists a unique global solution $\mu_t = \mu_t(Y)$ (defined for all $t \ge 0$) to the Cauchy problem
\[
\dot\mu_t = F(\mu_t),
\tag{2.9}
\]
with the initial condition $\mu_0 = Y$. Moreover,
\[
\|\mu_t(Y) - Y\| \le t e^{tL} \|F(Y)\| \le t e^{tL} (L\|Y\| + \|F(0)\|),
\tag{2.10}
\]
and the solutions $\mu_t(Y_1)$ and $\mu_t(Y_2)$ with different initial data $Y_1, Y_2$ satisfy the estimate
\[
\|\mu_t(Y_1) - \mu_t(Y_2)\| \le e^{tL} \|Y_1 - Y_2\|.
\tag{2.11}
\]

Proof. For $t > 0$ the Cauchy problem (2.9) is equivalent to the integral equation
\[
\mu_t = Y + \int_0^t F(\mu_s) \, ds.
\]
Let us fix an arbitrary $T > 0$, and let us define the mapping
\[
\Phi_Y : C([0,T],B) \to C_Y([0,T],B)
\tag{2.12}
\]


by the equation
\[
[\Phi_Y(\mu_\cdot)](t) = Y + \int_0^t F(\mu_s) \, ds, \quad t \in [0,T],
\tag{2.13}
\]
where $C_Y([0,T],B)$ is the subset of curves $\mu_t$ in $C([0,T],B)$ such that $\mu_0 = Y$. For any two curves $\mu^1_t, \mu^2_t$ from $C([0,T],B)$, it follows that
\[
\|[\Phi_{Y_1}(\mu^1_\cdot)](t) - [\Phi_{Y_2}(\mu^2_\cdot)](t)\| \le L \int_0^t \|\mu^1_\cdot - \mu^2_\cdot\|_{C([0,s],B)} \, ds + \|Y_1 - Y_2\|.
\tag{2.14}
\]
Consequently, Theorem 2.1.1 applies with the constant $L$, the empty space $B_1$, $\kappa = 1$ and $[\Phi_Y(Y)](t) - Y = tF(Y)$.

In the above Lipschitz setting, negative times can be treated in the same way as positive times. This yields the following result.

Proposition 2.2.1. Under the assumptions of Theorem 2.2.1, the solutions to equation (2.9) are also well defined for all $t < 0$, so that (2.10) and (2.11) extend to
\[
\|\mu_t(Y) - Y\| \le |t| e^{|t|L} \|F(Y)\|, \qquad \|\mu_t(Y_1) - \mu_t(Y_2)\| \le e^{|t|L} \|Y_1 - Y_2\|.
\tag{2.15}
\]

Finally, for any $t$, the mapping $Y \mapsto \mu_{-t}(Y)$ is the inverse of the mapping $Y \mapsto \mu_t(Y)$.

Proof. Global Lipschitz continuity ensures that the same arguments as in Theorem 2.2.1 work for negative times. (Inverting the time reduces the case with negative $t$ to the case with positive $t$.) Thus only the last statement needs a proof. For any $s < 0$ the functions $\mu_t(\mu_s(Y))$ and $\mu_{s+t}(Y)$ satisfy the same equation with the same initial condition $\mu_s(Y)$ at $t = 0$. Hence $\mu_t(\mu_s(Y)) = \mu_{s+t}(Y)$. In particular, $\mu_t(\mu_{-t}(Y)) = Y$.

Several extensions of the above results are straightforward. Namely, one often deals with equations whose r.h.s. depends explicitly on the time $t$ and/or on an additional parameter. In other words, one deals with equations of the form
\[
\dot\mu_t = F(t, \mu_t, \alpha),
\tag{2.16}
\]

with α from some other Banach space B1 , and one is interested in the continuous dependence of the solutions with respect to α. However, lifting equation (2.16) up to the respective equation in B × B1 by adding the equation α˙ = 0 reduces the problem of sensitivity with respect to a parameter (or the continuous dependence on it) to a problem of sensitivity with respect to the initial data. The dependence on time does not require any changes in the above results, as long as it is continuous. Therefore, the proof of the following result is a direct extension of the proof of Theorem 2.2.1.


Theorem 2.2.2. Let $F : \mathbb{R} \times B \times B_1 \to B$ be a continuous function such that $F(t, \cdot, \cdot)$ is Lipschitz-continuous as a mapping $B \times B_1 \to B$ with a Lipschitz constant that is independent of $t$, so that
\[
\|F(t, Y_1, \alpha_1) - F(t, Y_2, \alpha_2)\| \le L\|Y_1 - Y_2\| + L_1\|\alpha_1 - \alpha_2\|.
\tag{2.17}
\]
Then
(i) for any $Y \in B$, $s \in \mathbb{R}$, $\alpha \in B_1$ there exists a unique global solution $\mu_{t,s}(Y,\alpha)$ (defined for all $t \in \mathbb{R}$) to the Cauchy problem for equation (2.16) with the initial condition $\mu_{s,s} = Y$. Moreover,
\[
\|\mu_{t,s}(Y,\alpha) - Y\| \le |t-s| e^{|t-s|L} \Big(L\|Y\| + \sup_{\tau \in [s,t]} \|F(\tau, 0, \alpha)\|\Big);
\tag{2.18}
\]
(ii) the solutions $\mu_{t,s}(Y_1,\alpha_1)$ and $\mu_{t,s}(Y_2,\alpha_2)$ with different initial data $Y_1, Y_2$ and parameter values $\alpha_1, \alpha_2$ satisfy the estimate
\[
\|\mu_{t,s}(Y_1,\alpha_1) - \mu_{t,s}(Y_2,\alpha_2)\| \le e^{|t-s|L} \big(\|Y_1 - Y_2\| + L_1 |t-s| \, \|\alpha_1 - \alpha_2\|\big);
\tag{2.19}
\]
(iii) for any $s, t, \alpha$ the mapping $Y \mapsto \mu_{s,t}(Y,\alpha)$ is the inverse of the mapping $Y \mapsto \mu_{t,s}(Y,\alpha)$.

In many applications, $F$ is not everywhere continuous in $t$. In this case, instead of the Cauchy problem for (2.16) one can work with its integral version:
\[
\mu_t = \mu_s + \int_s^t F(\tau, \mu_\tau, \alpha) \, d\tau.
\tag{2.20}
\]
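The fixed-point construction behind these results can be followed by hand in the simplest scalar case. The sketch below assumes the linear r.h.s. $F(\mu) = L\mu$ on $B = \mathbb{R}$ (an illustrative choice, not one made in the text) and applies the mapping $\Phi_Y$ of (2.13) repeatedly to the constant curve $Y$; curves are stored as exact polynomial coefficients, and the helper names are hypothetical.

```python
import math
from fractions import Fraction


def picard_step(coeffs, y0, lip):
    """Apply Phi_Y to the polynomial curve sum_k coeffs[k] * t^k.

    For F(mu) = lip * mu we have [Phi_Y(mu)](t) = y0 + int_0^t lip * mu_s ds,
    and integrating c_k s^k over [0, t] gives c_k t^(k+1) / (k+1).
    """
    integrated = [lip * c / (k + 1) for k, c in enumerate(coeffs)]
    return [Fraction(y0)] + integrated


def picard_iterates(y0, lip, n):
    """Coefficients of Phi_Y^n applied to the constant curve Y = y0."""
    coeffs = [Fraction(y0)]
    for _ in range(n):
        coeffs = picard_step(coeffs, y0, Fraction(lip))
    return coeffs


def evaluate(coeffs, t):
    """Evaluate the polynomial curve at time t."""
    return float(sum(c * Fraction(t) ** k for k, c in enumerate(coeffs)))


def contraction_factor(n, lip, t):
    """The factor (t * lip)^n / n! from estimate (2.4), with tau = 0."""
    return (lip * t) ** n / math.factorial(n)


if __name__ == "__main__":
    # The n-th iterate is the Taylor polynomial of y0 * exp(lip * t).
    print(picard_iterates(1, 2, 4))
    print(evaluate(picard_iterates(1, 2, 30), 1.0), math.exp(2.0))
```

In this example the $n$-th iterate is the Taylor polynomial of $Y e^{Lt}$ of degree $n$, and the factorial decay of `contraction_factor` is what makes the iterations converge on every finite interval, however large $L$ is.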

Proposition 2.2.2. Let $F(\tau, \mu_\tau, \alpha)$ satisfy all the assumptions of Theorem 2.2.2 apart from being continuous in $t$. (i) Let $F$ be measurable and locally bounded as a function of $t$. Then the claims of Theorem 2.2.2 remain valid when applied to the problem (2.20). (ii) Let $F$ be continuous with respect to $t$ apart from some fixed set $N$ of measure zero (independent of the other arguments of $F$), and let $F$ also be locally bounded in $t$. Then a locally absolutely continuous function $\mu_t$ solves (2.20) if and only if it assumes the initial value $\mu_s$ and satisfies (2.16) almost surely.

Proof. Statement (i) is straightforward, since the proof of Theorem 2.2.2 actually deals with the integral problem (2.20). Statement (ii) follows from Theorem 1.3.2.

Remark 27. If $B = \mathbb{R}$, the integral equation (2.20) is equivalent to (2.16) being satisfied almost surely, whenever $F$ is locally bounded and measurable in $t$.

As another key extension, let us mention the equations of higher order, namely equations of the type
\[
\mu_t^{(k)} = F(t, \mu_t, \dot\mu_t, \ldots, \mu_t^{(k-1)}, \alpha),
\tag{2.21}
\]


where $k \in \mathbb{N}$. The Cauchy problem for this equation for times $t \ge s$ is posed by specifying the $k$ initial vectors:
\[
Y_0 = \mu_s, \quad Y_1 = \dot\mu_s, \quad \ldots, \quad Y_{k-1} = \mu_s^{(k-1)}.
\tag{2.22}
\]
The standard trick is to rewrite this equation as a first-order equation in $B^k$ (with the norm being defined as the sum of the norms of the components in $B$) by setting
\[
\nu_t = (\nu_t^0, \nu_t^1, \ldots, \nu_t^{k-1}) = (\mu_t, \dot\mu_t, \ldots, \mu_t^{(k-1)}) \in B^k.
\]
In terms of $\nu$, the problem (2.21), (2.22) rewrites as
\[
\frac{d}{dt}\nu_t = \frac{d}{dt}(\nu_t^0, \nu_t^1, \ldots, \nu_t^{k-1}) = \big(\nu_t^1, \ldots, \nu_t^{k-1}, F(t, \nu_t^0, \nu_t^1, \ldots, \nu_t^{k-1}, \alpha)\big)
\tag{2.23}
\]
with the initial condition
\[
\nu_s = (Y_0, \ldots, Y_{k-1}).
\tag{2.24}
\]

Theorem 2.2.3. Let $F : \mathbb{R} \times B^k \times B_1 \to B$ be a continuous function such that $F(t, \cdot, \cdot)$ is Lipschitz-continuous as a mapping $B^k \times B_1 \to B$ with a Lipschitz constant that is independent of $t$, so that
\[
\|F(t, Y_0, \ldots, Y_{k-1}, \alpha) - F(t, Z_0, \ldots, Z_{k-1}, \beta)\| \le L \sum_{j=0}^{k-1} \|Y_j - Z_j\| + L_1\|\alpha - \beta\|.
\tag{2.25}
\]
Then
(i) for any $Y = (Y_0, \ldots, Y_{k-1}) \in B^k$, $s \in \mathbb{R}$, $\alpha \in B_1$, there exists a unique global solution $\nu_{t,s}(Y,\alpha)$ (defined for all $t \in \mathbb{R}$) to the Cauchy problem (2.23), (2.24), and thus a unique global solution $\mu_{t,s}(Y,\alpha) = \nu^0_{t,s}(Y,\alpha)$ to the Cauchy problem (2.21), (2.22), and
\[
\|\nu_{t,s}(Y,\alpha) - Y\|_{B^k} \le |t-s| e^{|t-s|(1+L)} \Big((1+L)\|Y\|_{B^k} + \sup_{\tau \in [s,t]} \|F(\tau, 0, \alpha)\|\Big);
\tag{2.26}
\]
(ii) the solutions $\nu_{t,s}(Y,\alpha)$ and $\nu_{t,s}(Z,\beta)$ with different initial data $Y, Z$ and parameter values $\alpha, \beta$ satisfy the estimate
\[
\|\nu_{t,s}(Y,\alpha) - \nu_{t,s}(Z,\beta)\|_{B^k} \le e^{|t-s|(1+L)} \big(\|Y - Z\|_{B^k} + L_1 |t-s| \, \|\alpha - \beta\|\big).
\tag{2.27}
\]

Proof. The results follow from applying Theorem 2.2.2 in the Banach space $B^k$, taking into account that, for any $\nu_t, \tilde\nu_t \in B^k$,
\[
\big\|\big(\nu_t^1, \ldots, \nu_t^{k-1}, F(t, \nu_t^0, \ldots, \nu_t^{k-1}, \alpha)\big) - \big(\tilde\nu_t^1, \ldots, \tilde\nu_t^{k-1}, F(t, \tilde\nu_t^0, \ldots, \tilde\nu_t^{k-1}, \beta)\big)\big\|_{B^k} \le (1+L) \sum_{j=0}^{k-1} \|\nu_t^j - \tilde\nu_t^j\| + L_1\|\alpha - \beta\|.
\]
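The reduction (2.23) is also how higher-order equations are usually treated numerically. The following sketch works in the scalar case $B = \mathbb{R}$ without a parameter $\alpha$ and integrates the resulting first-order system with the classical Runge–Kutta scheme; the scheme, the step count and the helper names `as_first_order` and `rk4` are choices made for this illustration only.

```python
import math


def as_first_order(F, k):
    """Turn mu^(k) = F(t, mu, mu', ..., mu^(k-1)) into nu' = G(t, nu) on R^k."""
    def G(t, nu):
        # Components 0..k-2 are shifted down; the last one is F, as in (2.23).
        return list(nu[1:k]) + [F(t, *nu)]
    return G


def rk4(G, nu0, s, t, steps=1000):
    """Integrate nu' = G(tau, nu) from time s to time t, starting at nu0."""
    h = (t - s) / steps
    nu, tau = list(nu0), s
    for _ in range(steps):
        k1 = G(tau, nu)
        k2 = G(tau + h / 2, [x + h / 2 * d for x, d in zip(nu, k1)])
        k3 = G(tau + h / 2, [x + h / 2 * d for x, d in zip(nu, k2)])
        k4 = G(tau + h, [x + h * d for x, d in zip(nu, k3)])
        nu = [x + h / 6 * (a + 2 * b + 2 * c + d)
              for x, a, b, c, d in zip(nu, k1, k2, k3, k4)]
        tau += h
    return nu


if __name__ == "__main__":
    # mu'' = -mu with mu(0) = 1, mu'(0) = 0 has the solution cos t.
    G = as_first_order(lambda t, x, v: -x, 2)
    print(rk4(G, [1.0, 0.0], 0.0, 1.0), (math.cos(1.0), -math.sin(1.0)))
```

The first component of the returned vector approximates $\mu_{t,s}(Y) = \nu^0_{t,s}(Y)$, and the remaining components approximate its derivatives up to order $k-1$.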




In this book, we shall mostly be looking for the global solutions of ODEs (defined for all $t > 0$). However, local solutions are also often found in practice. Such solutions arise when the assumptions of Theorem 2.2.1 turn out to hold for some finite period of time only, say when the r.h.s. is only locally Lipschitz-continuous. Physically, local solutions usually describe the effect of explosion: the solutions diverge to infinity in finite times. We illustrate this possibility in the following exercise.

Exercise 2.2.1. Solve the Cauchy problem for the ODE $\dot x = x^3$. For each initial point $x_0$, find the 'explosion time' $t_0$ such that $\lim_{t \to t_0} x(t) = \infty$.

The next exercise provides the simplest example for the case of Lipschitz continuity not being met by the r.h.s., and the corresponding Cauchy problem having infinitely many solutions.

Exercise 2.2.2. Show that the Cauchy problem for the ODE $\dot x = \sqrt{x}$ with the initial condition $x_0 = 0$ has infinitely many solutions.

As already emphasized, the abstract view on ODEs as promoted in this text makes it possible to look at PDEs as ODEs in Banach spaces. Besides this abstract approach, there are other ways of reducing PDEs to ODEs, some of them more concrete and practically very useful. For instance, a systematic method for dealing with first-order PDEs will be discussed later in this chapter. Another such method (with some empirical flavor) consists of searching for solutions of a particular form for a given PDE. The following exercise demonstrates this approach on the example of one of the most famous nonlinear PDEs, the KdV-equation
\[
\frac{\partial u(t,x)}{\partial t} = \frac{3}{2} u \frac{\partial u(t,x)}{\partial x} + \frac{1}{4} \frac{\partial^3 u(t,x)}{\partial x^3},
\tag{2.28}
\]

where $t, x \in \mathbb{R}$.

Exercise 2.2.3. Show that if a solution to the KdV-equation has the form of a 'traveling wave' $u(t,x) = w(x + ct)$ with a function $w$ and a constant $c$, then $w$ satisfies the ODE
\[
(w')^2 = -2w^3 + 4cw^2 - 8c_1 w - 8c_2
\tag{2.29}
\]
with some constants $c_1, c_2$.

By an appropriate linear transformation of the unknown function, i.e., by changing $w$ into $W = aw + b$, equation (2.29) can be transformed to the canonical form
\[
(W')^2 = 4W^3 - k_1 W - k_2,
\tag{2.30}
\]
which is a famous classical ODE with solutions being given by the Weierstrass elliptic $\wp$-function. Therefore, the traveling-wave solution $u = w(x + ct)$ to the KdV-equation is explicitly expressed in terms of the classical elliptic functions. This includes the famous 'solitary waves' (or solitons) on shallow waters that can be described by solutions of the form
\[
u_{\mathrm{sol}}(t,x) = 8k^2 \big(e^{kx + k^3 t} + e^{-kx - k^3 t}\big)^{-2},
\]
which solves the KdV-equation for any value of $k$. J. Scott Russell is reported to have observed such a wave and singled it out for the first time by following it on horseback for several miles along a narrow channel.

After the well-posedness of ODEs, the next basic question concerns their sensitivity to parameters and initial data. Before diving into this topic, we shall first discuss the properties and concrete representations for the solutions of three classes of equations: general linear equations with bounded generators, linear ΨDEs with spatially homogeneous symbols and basic Hamiltonian evolutions, with the application of the latter to PDEs and optimal control.
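The solitary-wave formula given above can be checked against the KdV-equation (2.28) numerically. The sketch below assumes the form $u_{\mathrm{sol}}(t,x) = 8k^2(e^{\theta} + e^{-\theta})^{-2}$ with $\theta = kx + k^3 t$ and approximates the derivatives by central finite differences; the step size `h` and the function names are hypothetical choices for this illustration, and a small residual is only numerical evidence, not a proof.

```python
import math


def u_sol(t, x, k):
    """Solitary wave 8 k^2 (e^theta + e^-theta)^(-2) with theta = k x + k^3 t."""
    theta = k * x + k ** 3 * t
    return 8.0 * k ** 2 / (math.exp(theta) + math.exp(-theta)) ** 2


def kdv_residual(t, x, k, h=1e-3):
    """Finite-difference residual u_t - (3/2) u u_x - (1/4) u_xxx of (2.28)."""
    def u(tt, xx):
        return u_sol(tt, xx, k)
    u_t = (u(t + h, x) - u(t - h, x)) / (2 * h)
    u_x = (u(t, x + h) - u(t, x - h)) / (2 * h)
    # Second-order central stencil for the third derivative.
    u_xxx = (u(t, x + 2 * h) - 2 * u(t, x + h)
             + 2 * u(t, x - h) - u(t, x - 2 * h)) / (2 * h ** 3)
    return u_t - 1.5 * u(t, x) * u_x - 0.25 * u_xxx


if __name__ == "__main__":
    grid = [-3.0 + 0.25 * j for j in range(25)]
    print(max(abs(kdv_residual(0.3, x, 1.0)) for x in grid))
```

The wave travels with the speed $c = k^2$ in the notation of Exercise 2.2.3, so that taller solitons move faster.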

2.3 Linear equations and chronological exponentials

The simplest class of ODEs in a Banach space $B$ is given by the linear equations
\[
\dot\mu_t = A\mu_t + g_t,
\tag{2.31}
\]
where $A$ is a linear operator in $B$ and $g_t$ a given curve there. For instance, if $B = \mathbb{R}^d$ and $A$ is a square matrix, we talk about linear equations in finite dimensions.

Proposition 2.3.1. Let $A \in L(B,B)$ and let $g_t$ be a continuous curve. Then the unique solution to the Cauchy problem of equation (2.31) with the initial data $\mu_0 = Y$ is given by the following Duhamel formula:
\[
\mu_t = e^{At} Y + \int_0^t e^{(t-s)A} g_s \, ds.
\tag{2.32}
\]
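As a numerical illustration of the Duhamel formula (2.32), the following sketch takes $B = \mathbb{R}^2$ and the rotation generator $A = \begin{pmatrix} 0 & 1 \\ -1 & 0 \end{pmatrix}$, for which $e^{tA}$ is known in closed form, evaluates the integral in (2.32) by Simpson's rule and compares the result with a direct Runge–Kutta integration of (2.31). The choice of $A$, the quadrature rule and all function names are assumptions made for this example.

```python
import math


def exp_tA(t):
    """e^{tA} for A = [[0, 1], [-1, 0]]; since A^2 = -1, e^{tA} = cos t + A sin t."""
    c, s = math.cos(t), math.sin(t)
    return [[c, s], [-s, c]]


def apply(M, v):
    """Apply a 2x2 matrix to a vector."""
    return [M[0][0] * v[0] + M[0][1] * v[1], M[1][0] * v[0] + M[1][1] * v[1]]


def duhamel(Y, g, t, n=2000):
    """mu_t = e^{tA} Y + int_0^t e^{(t-s)A} g_s ds, by Simpson's rule (n even)."""
    h = t / n
    integral = [0.0, 0.0]
    for j in range(n + 1):
        s = j * h
        weight = 1 if j in (0, n) else (4 if j % 2 else 2)
        term = apply(exp_tA(t - s), g(s))
        integral = [integral[i] + weight * term[i] for i in range(2)]
    hom = apply(exp_tA(t), Y)
    return [hom[i] + h / 3 * integral[i] for i in range(2)]


def rk4_reference(Y, g, t, n=2000):
    """Integrate mu' = A mu + g_t directly, as an independent check."""
    A = [[0.0, 1.0], [-1.0, 0.0]]

    def f(s, m):
        return [a + b for a, b in zip(apply(A, m), g(s))]

    h, mu, s = t / n, list(Y), 0.0
    for _ in range(n):
        k1 = f(s, mu)
        k2 = f(s + h / 2, [m + h / 2 * k for m, k in zip(mu, k1)])
        k3 = f(s + h / 2, [m + h / 2 * k for m, k in zip(mu, k2)])
        k4 = f(s + h, [m + h * k for m, k in zip(mu, k3)])
        mu = [m + h / 6 * (a + 2 * b + 2 * c + d)
              for m, a, b, c, d in zip(mu, k1, k2, k3, k4)]
        s += h
    return mu


if __name__ == "__main__":
    g = lambda s: [1.0, s]
    print(duhamel([1.0, 2.0], g, 1.5), rk4_reference([1.0, 2.0], g, 1.5))
```

The two computations agree up to the discretization error, which is of course no substitute for the direct verification asked for in the next exercise.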

Exercise 2.3.1. Check (2.32) by direct differentiation, taking into account that the operator exponent $e^A = \sum_n \frac{A^n}{n!}$ satisfies the same rules of differentiation as the usual exponent.

We will now see how one can derive (2.32) in a more general setting of time-dependent evolutions, namely for the equation
\[
\dot\mu_t = A_t\mu_t + g_t,
\tag{2.33}
\]
where $A_t$ is a family of bounded linear operators in $B$ depending continuously on $t$, or more generally measurably (in the strong operator topology). That is, $t \mapsto A_t f$ is a continuous, or measurable, function for any $f \in B$.


If all the operators $A_t$ commute, the solution can be written as a straightforward extension of the homogeneous case:
\[
\mu_t = \exp\Big\{\int_0^t A_s \, ds\Big\} Y + \int_0^t \exp\Big\{\int_s^t A_\tau \, d\tau\Big\} g_s \, ds.
\tag{2.34}
\]
In order to find the general solution to the Cauchy problem of (2.33), let us first rewrite it in the integral form as
\[
\mu_t = Y + \int_0^t A_s\mu_s \, ds + \int_0^t g_s \, ds.
\tag{2.35}
\]
Let us iterate this equation by replacing $\mu_s$ under the integral by the whole expression given by the r.h.s. of the formula:
\[
\mu_t = Y + \int_0^t (A_s Y + g_s) \, ds + \int_0^t A_s \Big(\int_0^s (A_{s_1}\mu_{s_1} + g_{s_1}) \, ds_1\Big) ds.
\]
Denoting $\tilde g_t = \int_0^t g_s \, ds$ and repeating this procedure recursively leads, for any $n \ge 2$, to the formula
\[
\mu_t = Y + \tilde g_t + \sum_{k=1}^{n-1} \int_{0 \le s_1 \le \cdots \le s_k \le t} A_{s_k} \cdots A_{s_1} (Y + \tilde g_{s_1}) \, ds_1 \cdots ds_k + \int_{0 \le s_1 \le \cdots \le s_n \le t} A_{s_n} \cdots A_{s_1} \mu_{s_1} \, ds_1 \cdots ds_n.
\]

If all $A_s$ are uniformly bounded, it follows that the last term here tends to zero, as $n \to \infty$, leading to the following result.

Proposition 2.3.2. Let $A_t$ and $g_t$ be families of uniformly bounded operators and elements in $B$ that depend continuously, or more generally almost surely continuously (with the points of discontinuity forming a negligible set), on $t \in \mathbb{R}$. Then the unique solution to the Cauchy problem of equation (2.33) with the initial data $\mu_0 = Y$, given by Theorem 2.2.1 or Proposition 2.2.2, has the following convergent series representation:
\[
\mu_t = Y + \tilde g_t + \sum_{k=1}^\infty \int_{0 \le s_1 \le \cdots \le s_k \le t} A_{s_k} \cdots A_{s_1} (Y + \tilde g_{s_1}) \, ds_1 \cdots ds_k.
\tag{2.36}
\]
Formula (2.36) can be rewritten in various insightful ways. For instance, when $I \circ A$ denotes the operator in $C([0,T],B)$ (for any fixed $T$) acting as $g_\cdot \mapsto \int_0^t A_s g_s \, ds$, equation (2.36) rewrites as the geometric series
\[
\mu_t = \sum_{k=0}^\infty (I \circ A)^k (Y + \tilde g_\cdot)(t).
\tag{2.37}
\]


Remark 28. The sum $\sum_{k=0}^\infty (I \circ A)^k$ can be interpreted as the series expansion for the operator $(1 - I \circ A)^{-1}$, referred to as the resolvent of the operator $I \circ A$. In fact, it follows formally from (2.35) that $(1 - I \circ A)^{-1}(Y + \tilde g_\cdot)(t)$ should be the solution to (2.35), in accordance with (2.37).

Alternatively, setting $g = 0$ first, one can rewrite (2.36) as
\[
\mu_t = Y + \sum_{k=1}^\infty \frac{1}{k!} \int_0^t \cdots \int_0^t T(A_{s_k} \cdots A_{s_1}) Y \, ds_1 \cdots ds_k,
\tag{2.38}
\]
where $T$ is the ordering functional, which for any sequence $s_1, \ldots, s_k$ reorders the product $A_{s_k} \cdots A_{s_1}$ in such a way that the $s_j$ follow each other in decreasing order from left to right. A comparison with the usual exponential expansion suggests the definition of the fundamental concept of the chronological exponential or time-ordered exponential or $T$-product of the integral of time-dependent operators as
\[
T\exp\Big\{\int_s^t A_\tau \, d\tau\Big\} = 1 + \sum_{k=1}^\infty \frac{1}{k!} \int_s^t \cdots \int_s^t T(A_{s_k} \cdots A_{s_1}) \, ds_1 \cdots ds_k.
\tag{2.39}
\]
With this notation for the chronological exponentials, formula (2.36) yields the following fundamental representation of the solution:
\[
\mu_t = T\exp\Big\{\int_0^t A_s \, ds\Big\} Y + \int_0^t T\exp\Big\{\int_s^t A_\tau \, d\tau\Big\} g_s \, ds.
\tag{2.40}
\]
If the operators $A_t$ commute, this turns into (2.34), and if $A_t$ does not depend on $t$, it reduces to (2.32).

The chronological exponentials can be expressed in another insightful way, namely as multiplicative Riemann integrals (see Section 1.3). Approximating $A_t$ in the equation $\dot\mu_t = A_t\mu_t$ by piecewise-constant families suggests the formulation of the approximate solution to the Cauchy problem of this equation with the initial condition $\mu_s = Y$ as
\[
\mu_t^\Delta = \exp\{(t_n - t_{n-1})A_{t_{n-1}}\} \cdots \exp\{(t_1 - t_0)A_{t_0}\} Y,
\tag{2.41}
\]
where $\Delta = \{s = t_0 < t_1 < \cdots < t_n = t\}$ is any partition of the interval $[s,t]$. If $A_t$ is continuous apart from a set of zero measure and uniformly bounded, it follows from Theorem 1.3.3 that the approximations $\mu_t^\Delta$ do converge towards the solution to the equation $\dot\mu_t = A_t\mu_t$, as $|\Delta| = \max_j (t_{j+1} - t_j) \to 0$. This leads to an alternative representation for the chronological exponent (2.39):
\[
T\exp\Big\{\int_s^t A_\tau \, d\tau\Big\} = \lim_{|\Delta| \to 0} \exp\{(t_n - t_{n-1})A_{t_{n-1}}\} \cdots \exp\{(t_1 - t_0)A_{t_0}\}.
\tag{2.42}
\]

Remark 29. For unbounded At , the story is more complicated. See Theorem 5.1.3 for continuous families At and [149] (Theorem 2) for more general cases.


Note for future reference that if $A_t$ is piecewise-constant, i.e., $A_t = A_{t_j}$ for $t_j \le t < t_{j+1}$ for some partition $\Delta = \{s = t_0 < t_1 < \cdots < t_n = t\}$, then
\[
T\exp\Big\{\int_s^t A_\tau \, d\tau\Big\} = \exp\{(t_n - t_{n-1})A_{t_{n-1}}\} \cdots \exp\{(t_1 - t_0)A_{t_0}\}.
\tag{2.43}
\]
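The role of the ordering in (2.43) is easy to see on a small example. The sketch below takes two non-commuting nilpotent $2 \times 2$ matrices as a piecewise-constant generator (an illustrative choice) and multiplies the exponentials with the later factors on the left; the matrix exponential is summed by its Taylor series, and all helper names are hypothetical.

```python
def matmul(X, Y):
    """Product of two square matrices given as nested lists."""
    n = len(X)
    return [[sum(X[i][k] * Y[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]


def expm(M, terms=40):
    """Matrix exponential by its Taylor series (adequate for small norms)."""
    n = len(M)
    result = [[float(i == j) for j in range(n)] for i in range(n)]
    power = [row[:] for row in result]
    for k in range(1, terms):
        power = [[x / k for x in row] for row in matmul(power, M)]
        result = [[result[i][j] + power[i][j] for j in range(n)] for i in range(n)]
    return result


def chronological_exp(pieces):
    """T-exponential for a piecewise-constant generator, as in (2.43).

    `pieces` lists (duration, matrix) pairs in increasing time order.
    """
    n = len(pieces[0][1])
    U = [[float(i == j) for j in range(n)] for i in range(n)]
    for duration, A in pieces:
        step = expm([[duration * a for a in row] for row in A])
        U = matmul(step, U)  # later times act on the left
    return U


def max_abs_diff(X, Y):
    """Largest entrywise difference between two matrices."""
    return max(abs(x - y) for rx, ry in zip(X, Y) for x, y in zip(rx, ry))


if __name__ == "__main__":
    A0 = [[0.0, 1.0], [0.0, 0.0]]
    A1 = [[0.0, 0.0], [1.0, 0.0]]
    print(chronological_exp([(1.0, A0), (1.0, A1)]))  # A0 acts first
    print(chronological_exp([(1.0, A1), (1.0, A0)]))  # A1 acts first
    print(expm([[0.0, 1.0], [1.0, 0.0]]))             # naive exp of the integral
```

The three printed matrices differ: the naive exponential of $\int A_\tau \, d\tau$ coincides with the chronological exponential only when the operators commute, in accordance with (2.34). Refining the partition inside each piece does not change the product.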

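As follows from the definition of the chronological product, the solution to the linear equation (2.33) can be estimated by
\[
\|\mu_t\| \le \exp\Big\{t \sup_{s \in [0,t]} \|A_s\|_{B \to B}\Big\} \Big(\|Y\| + t \sup_{s \in [0,t]} \|g_s\|\Big).
\tag{2.44}
\]
Similarly, one solves the backward linear problem
\[
\dot\mu_t = -A_t\mu_t + g_t, \quad t \le r, \quad \mu_r = Y.
\tag{2.45}
\]
Its equivalent integral representation reads
\[
\mu_t = Y + \int_t^r (A_s\mu_s - g_s) \, ds, \quad t \le r.
\tag{2.46}
\]
If the operators $A_t$ commute, the unique solution to this problem equals
\[
\mu_t = \exp\Big\{\int_t^r A_s \, ds\Big\} Y - \int_t^r \exp\Big\{\int_t^s A_\tau \, d\tau\Big\} g_s \, ds,
\tag{2.47}
\]
where the sign of the second term is dictated by (2.46). In the general case, the exponents are changed into the backward chronological exponentials or backward $T$-product:
\[
\mu_t = \tilde T\exp\Big\{\int_t^r A_s \, ds\Big\} Y - \int_t^r \tilde T\exp\Big\{\int_t^s A_\tau \, d\tau\Big\} g_s \, ds,
\tag{2.48}
\]
where
\[
\tilde T\exp\Big\{\int_s^t A_\tau \, d\tau\Big\} = \lim_{|\Delta| \to 0} \exp\{(t_1 - t_0)A_{t_0}\} \cdots \exp\{(t_n - t_{n-1})A_{t_{n-1}}\}.
\tag{2.49}
\]
The following notation containing the inverse order of integration is also in use:
\[
\tilde T\exp\Big\{\int_s^t A_\tau \, d\tau\Big\} = T\exp\Big\{\int_t^s A_\tau \, d\tau\Big\}, \quad s \le t.
\tag{2.50}
\]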

Quite similarly, one can also conclude that the integral equation
\[
\mu_t = G_t Y + \int_0^t A_{t,s}\mu_s \, ds + \int_0^t g_s \, ds, \quad t > 0,
\tag{2.51}
\]
with uniformly bounded linear operators $G_t$, $A_{t,s}$ depending continuously on $t$ and measurably on $s$, has a unique solution that is given by the series representation
\[
\mu_t = G_t Y + \tilde g_t + \sum_{k=1}^\infty \int_{0 \le s_1 \le \cdots \le s_k \le t} A_{s_{k+1},s_k} \cdots A_{s_2,s_1} (G_{s_1} Y + \tilde g_{s_1}) \, ds_1 \cdots ds_k
\tag{2.52}
\]
(where $s_{k+1} = t$), with the estimate
\[
\|\mu_t\| \le \exp\Big\{t \sup_{s_1 \le s_2 \le t} \|A_{s_2,s_1}\|_{B \to B}\Big\} \Big(\sup_{s \in [0,t]} \|G_s\|_{B \to B} \|Y\| + t \sup_{s \in [0,t]} \|g_s\|\Big).
\tag{2.53}
\]

Finally, let us look at linear equations in $\mathbb{R}^d$. By (2.32) the problem is reduced to calculating the exponential $e^{tA}$ with an arbitrary matrix $A$. First, let us assume that $A$ is diagonalizable (which is, e.g., always the case when $A$ is symmetric). This means that there exists an invertible $C$ such that $A = CDC^{-1}$ with a diagonal matrix $D$ that has some numbers $\lambda_1, \ldots, \lambda_d$ on the diagonal. In this case, we find $e^{tA} = Ce^{tD}C^{-1}$, where $e^{tD}$ is a diagonal matrix with the numbers $e^{t\lambda_1}, \ldots, e^{t\lambda_d}$ on the diagonal. Moreover, the columns $v_j = C_{\cdot j}$ of the matrix $C$ are the eigenvectors of $A$, so that
\[
e^{tA} v_j = e^{t\lambda_j} v_j, \quad j = 1, \ldots, d.
\]
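For a symmetric $2 \times 2$ matrix the recipe $e^{tA} = Ce^{tD}C^{-1}$ can be carried out in closed form, since $C$ can then be chosen as a rotation, so that $C^{-1} = C^T$. The sketch below does this and compares the outcome with a direct summation of the exponential series; the function names and the sample matrices are hypothetical and serve this illustration only.

```python
import math


def exp_symmetric_2x2(a, b, d, t):
    """e^{tA} for A = [[a, b], [b, d]] via A = C D C^T with a rotation C."""
    mean = (a + d) / 2.0
    half_gap = math.hypot((a - d) / 2.0, b)
    lam = (mean + half_gap, mean - half_gap)   # eigenvalues of A
    theta = 0.5 * math.atan2(2.0 * b, a - d)   # rotation angle diagonalizing A
    c, s = math.cos(theta), math.sin(theta)
    C = [[c, -s], [s, c]]                      # columns are the eigenvectors
    e = (math.exp(t * lam[0]), math.exp(t * lam[1]))
    # (C e^{tD} C^T)_{ij} = sum_k C_ik e_k C_jk
    return [[sum(C[i][k] * e[k] * C[j][k] for k in range(2)) for j in range(2)]
            for i in range(2)]


def exp_taylor_2x2(A, t, terms=40):
    """e^{tA} by direct summation of the series sum_n (tA)^n / n!."""
    result = [[1.0, 0.0], [0.0, 1.0]]
    power = [[1.0, 0.0], [0.0, 1.0]]
    for n in range(1, terms):
        power = [[sum(power[i][k] * t * A[k][j] for k in range(2)) / n
                  for j in range(2)] for i in range(2)]
        result = [[result[i][j] + power[i][j] for j in range(2)] for i in range(2)]
    return result


if __name__ == "__main__":
    # A = [[2, 1], [1, 2]] has eigenvalues 3 and 1 with eigenvectors (1, 1), (1, -1).
    print(exp_symmetric_2x2(2.0, 1.0, 2.0, 0.5))
    print(exp_taylor_2x2([[2.0, 1.0], [1.0, 2.0]], 0.5))
```

For the sample matrix the entries are $\frac12(e^{3t} \pm e^{t})$, which is what both computations return.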

If $A$ is not diagonalizable, it can be brought to the Jordan normal form, for which the calculation of the exponent is lengthier, but still quite explicit (see any elementary introduction to ODEs for more details). Therefore, for the numerical solutions of linear problems in large dimensions, the main problem lies in the effective calculation of the Jordan normal form of $A$, which in turn can basically be reduced to finding eigenvectors and eigenvalues of $A$.

Exercise 2.3.2. Find the exponentials $\exp\{t\sigma_j\}$ for the Pauli matrices
\[
\sigma_1 = \begin{pmatrix} 1 & 0 \\ 0 & -1 \end{pmatrix}, \quad
\sigma_2 = \begin{pmatrix} 0 & 1 \\ 1 & 0 \end{pmatrix}, \quad
\sigma_3 = \begin{pmatrix} 0 & -i \\ i & 0 \end{pmatrix}.
\tag{2.54}
\]

Exercise 2.3.3. Find the exponentials of the Jordan blocks
\[
J_2 = \begin{pmatrix} 0 & 1 \\ 0 & 0 \end{pmatrix}, \quad
J_3 = \begin{pmatrix} 0 & 1 & 0 \\ 0 & 0 & 1 \\ 0 & 0 & 0 \end{pmatrix}
\]
in dimensions $d = 2$ and $d = 3$. Afterwards, extend this result to arbitrary dimensions.


Exercise 2.3.4. Find the solution to the second-order linear equation
\[
\ddot f(t) + b\dot f(t) = g(t), \quad t \ge 0,
\tag{2.55}
\]
with the initial data $f(0) = f_0$, $\dot f(0) = y_0$.

Exercise 2.3.5. Find the solution to equation (2.55) with the boundary conditions $f(0) = f_0$, $f(T) = f_T$.

2.4 Linear evolutions involving spatially homogeneous ΨDOs

In this section, we deal mostly with equations that have a Lipschitz-continuous r.h.s. First, we give a brief introduction to the important class of equations that can be brought into such a form by Fourier transform and then solved explicitly. Namely, let us consider the evolution equations
\[
\dot f_t = -\psi_t(-i\nabla) f_t, \quad f|_{t=s} = f_s, \quad t \ge s,
\tag{2.56}
\]
for the (possibly time-dependent family of) differential or pseudo-differential operators with constant coefficients (i.e., spatially homogeneous operators) with sufficiently regular symbols $\psi_t$. The most studied examples of this Cauchy problem are the simplest diffusion equations and, more generally, equations with (possibly fractional) powers of the Laplacian,
\[
\dot f_t = -|\Delta|^{\alpha/2} f_t, \quad f|_{t=s} = f_s,
\tag{2.57}
\]
as they arise from (2.56) with $\psi(p) = |p|^\alpha$. We shall discuss these problems and their extensions in more detail in Chapter 4.

Passing to the Fourier transform
\[
\hat f(p) = \int e^{-ipx} f(x) \, dx,
\]
the Cauchy problem (2.56) turns into the problem
\[
\frac{d}{dt} \hat f_t(p) = -\psi_t(p) \hat f_t(p), \quad \hat f|_{t=s} = \hat f_s.
\]
By (2.34) and Proposition 2.3.2, this problem has the unique solution
\[
\hat f_t(p) = \exp\Big\{-\int_s^t \psi_\tau(p) \, d\tau\Big\} \hat f_s(p),
\tag{2.58}
\]
whenever $\psi_t$ is almost surely continuous with respect to $t$. Returning to $f$ via the inverse Fourier transform yields
\[
f_t(x) = \int G^\psi_{t,s}(x - y) f_s(y) \, dy = \int G^\psi_{t,s}(z) f_s(x - z) \, dz,
\tag{2.59}
\]
with
\[
G^\psi_{t,s}(x) = \frac{1}{(2\pi)^d} \int e^{ipx} \exp\Big\{-\int_s^t \psi_\tau(p) \, d\tau\Big\} dp,
\tag{2.60}
\]
whenever this integral is well defined.


From (2.59), one derives the following characteristic feature of evolutions that are generated by ΨDOs with constant coefficients: the resolving operator $f_s \mapsto f_t$ commutes with the shifts, i.e., with the mappings $f(x) \mapsto f(x + a)$, for any $a \in \mathbb{R}^d$. If the family $\psi$ does not explicitly depend on $t$, the solution to the Cauchy problem (2.56) simplifies to
\[
f_t(x) = \int G^\psi_{t-s}(x - y) f_s(y) \, dy,
\tag{2.61}
\]
with
\[
G^\psi_t(x) = \frac{1}{(2\pi)^d} \int e^{ipx} \exp\{-t\psi(p)\} \, dp.
\tag{2.62}
\]
According to the general definitions (1.185) and (1.189), the functions $G^\psi_{t,s}(x)$ or $G^\psi_t(x)$ are referred to as the Green functions or the heat kernels for the corresponding Cauchy problem. By (2.59) and (2.61), these functions solve the Cauchy problem (2.56) with the Dirac initial condition $\delta(x)$. For instance, for the basic diffusion equation $\dot f_t = \frac12 \Delta f_t$ with $\psi(p) = p^2/2$, the Green function (2.62) has the form
\[
G^\psi_t(x) = \frac{1}{(2\pi)^d} \int e^{ipx} \exp\{-tp^2/2\} \, dp = \frac{1}{(2\pi t)^{d/2}} \exp\Big\{-\frac{x^2}{2t}\Big\},
\tag{2.63}
\]
where (1.89) was used. Similarly, the solution to the backward Cauchy problem
\[
\dot f_t = \psi_t(-i\nabla) f_t, \quad f|_{t=r} = f_r, \quad t \le r,
\tag{2.64}
\]
can be written as
\[
f_t(x) = \int G^\psi_{r,t}(x - y) f_r(y) \, dy = \int G^\psi_{r,t}(z) f_r(x - z) \, dz,
\tag{2.65}
\]
because, by (2.47), its Fourier transform equals
\[
\hat f_t(p) = \exp\Big\{-\int_t^r \psi_\tau(p) \, d\tau\Big\} \hat f_r(p).
\tag{2.66}
\]

Whenever $\psi_t$ is continuous with respect to $p$ (with a real part that is bounded from below) and almost surely continuous with respect to $t$ (or even just measurable, see Remark 27), the function $\exp\{-\int_s^t \psi_\tau(p) \, d\tau\}$ is well defined as an element of $S'(\mathbb{R}^d)$, that is, as a tempered generalized function. In this case, equation (2.59) can be written as a convolution:
\[
f_t(x) = (G^\psi_{t,s} \star f_s)(x).
\tag{2.67}
\]
Therefore, whenever the convolutions (2.59) or (2.61) are well defined, they yield unique solutions to the Cauchy problem (2.56), well defined in the sense of generalized functions.
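In dimension $d = 1$, the inverse Fourier integral (2.62) can be evaluated numerically and compared with known closed forms. The sketch below assumes a real even symbol $\psi$, so that only the cosine part of $e^{ipx}$ contributes, and uses a truncated trapezoid rule; the truncation `p_max`, the number of nodes and the function names are hypothetical choices. It is tested on $\psi(p) = p^2/2$, giving the Gaussian kernel (2.63), and on $\psi(p) = |p|$, which corresponds to (2.57) with $\alpha = 1$ and gives the Cauchy kernel $t / (\pi(t^2 + x^2))$.

```python
import math


def green_function_numeric(t, x, psi, p_max=40.0, n=8000):
    """Approximate (1/2pi) int e^{ipx} exp(-t psi(p)) dp for a real even psi.

    The sine part integrates to zero by symmetry, so only cos(px) is kept;
    the integral is truncated to [-p_max, p_max] and computed by the
    trapezoid rule with n subintervals.
    """
    h = 2.0 * p_max / n
    total = 0.0
    for j in range(n + 1):
        p = -p_max + j * h
        weight = 0.5 if j in (0, n) else 1.0
        total += weight * math.cos(p * x) * math.exp(-t * psi(p))
    return total * h / (2.0 * math.pi)


def gaussian_kernel(t, x):
    """Closed form (2.63) in dimension d = 1."""
    return math.exp(-x * x / (2.0 * t)) / math.sqrt(2.0 * math.pi * t)


def cauchy_kernel(t, x):
    """Closed form of the Green function for psi(p) = |p| in dimension d = 1."""
    return t / (math.pi * (t * t + x * x))


if __name__ == "__main__":
    for x in (0.0, 1.0, 2.5):
        print(x,
              green_function_numeric(0.5, x, lambda p: p * p / 2.0),
              gaussian_kernel(0.5, x))
```

The slower decay of the symbol $|p|$ and its kink at the origin make the quadrature less accurate in the second case, which already hints at the role played by the smoothness of $\psi$ in the decay of the kernel.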


Natural Banach spaces that provide the well-posedness of (2.56) are spaces with various integral norms, since those norms can be recast in terms of the Fourier transform. Basic examples are the Sobolev spaces (see (1.194) and (1.195)), the so-called Wiener space or Wiener ring $F(L^1(\mathbb{R}^d))$ of the Fourier transforms of $L^1(\mathbb{R}^d)$ (equipped with the norm inherited from $L^1(\mathbb{R}^d)$) and, more generally, the space $F(M(\mathbb{R}^d))$ of the Fourier transforms of bounded signed measures.

Theorem 2.4.1. Let $\psi_t$ be a locally bounded measurable function of two variables with a non-negative real part. Then the mapping $f_s \mapsto f_t$ that resolves the problem (2.56) and is given by (2.59) is a well-defined contraction in the space $L^2(\mathbb{R}^d)$, in the Sobolev spaces $H^s_2(\mathbb{R}^d)$ for all $s \in \mathbb{R}$, in the Wiener space $F(L^1(\mathbb{R}^d))$ and in the space $F(M(\mathbb{R}^d))$. Moreover, in all these spaces the initial values are attained in the usual sense: $\|f_t - f_s\| \to 0$ as $t - s \to 0$.

Proof. This follows from the observation that the mapping $\hat f_s \mapsto \hat f_t$ given by (2.58) is a well-defined contraction in the spaces $L^q(\mathbb{R}^d)$, $q \ge 1$, and their weighted versions, the spaces $L^q(\mathbb{R}^d, (1 + |p|^2)^s \, dp)$, and that $\|\hat f_t - \hat f_s\| \to 0$ as $t - s \to 0$ in all these spaces.

Remark 30. The last property that is stated in Theorem 2.4.1 is referred to as strong continuity of the mappings $f_s \mapsto f_t$. This property will be studied in more detail later.

Let us give a reasonably general criterion for the heat kernels to exist as usual smooth functions.

Theorem 2.4.2. Let $\psi_t(p)$ be a continuous function of two variables, with its real part growing as a power for large $p$:
\[
\operatorname{Re} \psi_t(p) \ge C_{low} |p|^\delta, \quad |p| > \tilde p,
\tag{2.68}
\]
with some positive constants $C_{low}, \delta, \tilde p$. Then, for any $t > s$, the heat kernel $G^\psi_{t,s}(\cdot)$ belongs to $C_\infty(\mathbb{R}^d)$ together with all its spatial derivatives (derivatives with respect to $x$ of any order), and
\[
\sup_x \Big| \frac{\partial^l}{\partial x_{i_1} \cdots \partial x_{i_l}} G^\psi_{t,s}(x) \Big| \le \frac{|S^{d-1}|}{(2\pi)^d} \Big( \tilde p^{\,d+l} + \frac{1}{\delta} \Gamma\Big(\frac{d+l}{\delta}\Big) (C_{low}(t-s))^{-(d+l)/\delta} \Big),
\tag{2.69}
\]
where $|S^{d-1}|$ is the surface area of the unit sphere in $\mathbb{R}^d$. If additionally $\psi_t(p)$ grows at most polynomially for large $p$, i.e.,
\[
|\psi_t(p)| \le C_{up} |p|^N, \quad |p| > \tilde p,
\tag{2.70}
\]
with some positive constants $C_{up}$ and $N$, then the derivatives $\frac{\partial G^\psi_{t,s}(\cdot)}{\partial t}$ and $\frac{\partial G^\psi_{t,s}(\cdot)}{\partial s}$ are well defined, belong to $C_\infty(\mathbb{R}^d)$ as well as all their spatial derivatives, and
\[
\sup_x \max\Big( \Big| \frac{\partial}{\partial t} G^\psi_{t,s}(x) \Big|, \Big| \frac{\partial}{\partial s} G^\psi_{t,s}(x) \Big| \Big) \le \frac{|S^{d-1}|}{(2\pi)^d} \Big( \tilde p^{\,d} \max_{|p| \le \tilde p, \, \tau \in [s,t]} |\psi_\tau(p)| + \frac{C_{up}}{\delta} \Gamma\Big(\frac{d+N}{\delta}\Big) (C_{low}(t-s))^{-(d+N)/\delta} \Big).
\tag{2.71}
\]

Proof. Since
\[
\begin{aligned}
\frac{\partial^m}{\partial x_{j_1} \cdots \partial x_{j_m}} G^\psi_{t,s}(x) &= (2\pi)^{-d} \int i^m e^{ipx} p_{j_1} \cdots p_{j_m} \exp\Big\{-\int_s^t \psi_\tau(p) \, d\tau\Big\} dp, \\
\frac{\partial}{\partial t} G^\psi_{t,s}(x) &= -(2\pi)^{-d} \int e^{ipx} \psi_t(p) \exp\Big\{-\int_s^t \psi_\tau(p) \, d\tau\Big\} dp, \\
\frac{\partial}{\partial s} G^\psi_{t,s}(x) &= (2\pi)^{-d} \int e^{ipx} \psi_s(p) \exp\Big\{-\int_s^t \psi_\tau(p) \, d\tau\Big\} dp,
\end{aligned}
\]
the claims about $C_\infty(\mathbb{R}^d)$ follow from the Riemann–Lebesgue lemma. Moreover, (2.68) implies
\[
\sup_x |G^\psi_{t,s}(x)| \le \frac{|S^{d-1}|}{(2\pi)^d} \tilde p^{\,d} + \frac{|S^{d-1}|}{(2\pi)^d} \int_0^\infty \exp\{-C_{low}(t-s)|p|^\delta\} |p|^{d-1} \, d|p|.
\]
A change of variables to $r = C_{low}(t-s)|p|^\delta$ in the last integral allows for this integral to be expressed in terms of the Gamma function, using the equation
\[
\int_0^\infty r^\omega \exp\{-Ar^\delta\} \, dr = \frac{1}{\delta} \Gamma\Big(\frac{1+\omega}{\delta}\Big) A^{-(1+\omega)/\delta}.
\]
This leads to (2.69) with $l = 0$. The other estimates are similarly obtained.



Equations that satisfy (2.68) are usually called parabolic. Theorem 2.4.2 implies that the solution $f_t$ given by (2.61) is infinitely smooth for any $f_s \in L^1(\mathbb{R}^d)$. This, however, implies neither that the solution is well defined for $f_s \in C_\infty(\mathbb{R}^d)$, nor that the spaces $C_\infty(\mathbb{R}^d)$ or $L^1(\mathbb{R}^d)$ are preserved by the resolving mapping $f_s \mapsto f_t$. In order for such conclusions to hold, we need the integrability of $G$, and for proving it we need to know the behaviour of $G$ for large $x$. For an appropriate decrease of $G(x)$, as $x \to \infty$, some smoothness of the symbol is required. The simplest result in this direction is as follows.

Theorem 2.4.3. Under the assumptions of Theorem 2.4.2, let us assume that $\psi_t(p)$ are infinitely differentiable in $p$ with all derivatives growing at most polynomially as $p \to \infty$ (uniformly in $t$). Then $G^\psi_{t,s}(x)$ decreases faster than any power as $x \to \infty$, that is, for any $m > 0$ there exists a constant $C(m)$ such that $|G^\psi_{t,s}(x)| \le C(m)|x|^{-m}$ for $|x| > 1$. In particular, the resolving operator $f_s \mapsto f_t$ maps the spaces $C_\infty(\mathbb{R}^d)$ or $L^1(\mathbb{R}^d)$ to themselves.


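Proof. Due to the Riemann–Lebesgue lemma and the assumptions on $\psi$, the functions
\[
(-i)^l x_{i_1} \cdots x_{i_l} G^\psi_{t,s}(x) = (2\pi)^{-d} \int e^{ipx} \frac{\partial^l}{\partial p_{i_1} \cdots \partial p_{i_l}} \exp\Big\{-\int_s^t \psi_\tau(p) \, d\tau\Big\} dp
\]
belong to $C_\infty(\mathbb{R}^d)$ for any $l > 0$.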



A case that has been studied extensively is the case when $\psi_t(p)$ is a polynomial in $p$ of an even order $2m$ such that the homogeneous part of the highest order $2m$, $\psi_t^{2m}(p)$, is positive non-degenerate, that is, $|\psi_t^{2m}(p)| \ge \epsilon |p|^{2m}$ with some $\epsilon > 0$. In this case, a very detailed analysis of $G$ is available. In particular, $G^\psi_{t,s}(x)$ decreases exponentially as $x \to \infty$ (see, e.g., [91]).

In many important examples, however, $\psi(p)$ has some singularity. A prominent example is given by the powers $\psi(p) = |p|^\beta$ with some real $\beta$. Let us give some results on the behaviour of $G$ for large $x$ in the presence of such singularities.

Theorem 2.4.4. Under the assumptions of Theorem 2.4.2, let us assume that $\psi_t(p)$ is $l$-times continuously differentiable with respect to $p$ outside some compact, nowhere dense set of singularities (the same for all $t$), such that all derivatives are growing at most polynomially as $p \to \infty$ and satisfy the condition that any product of the partial derivatives
\[
\frac{\partial^{l_1} \psi}{\partial p_{i_1} \cdots \partial p_{i_{l_1}}} \cdots \frac{\partial^{l_m} \psi}{\partial p_{j_1} \cdots \partial p_{j_{l_m}}}
\tag{2.72}
\]
with $l_1 + \cdots + l_m = l$ is locally integrable everywhere. Then, for any $s < t$,
\[
(1 + |\cdot|^l) G^\psi_{t,s}(\cdot), \quad (1 + |\cdot|^l) \frac{\partial G^\psi_{t,s}(\cdot)}{\partial t}, \quad (1 + |\cdot|^l) \frac{\partial G^\psi_{t,s}(\cdot)}{\partial s} \in C_\infty(\mathbb{R}^d),
\tag{2.73}
\]
and the same holds for all spatial derivatives of these functions. In particular, if this holds for $l = d + 1$, then the functions $G^\psi_{t,s}(\cdot)$, $\frac{\partial G^\psi_{t,s}(\cdot)}{\partial t}$ and $\frac{\partial G^\psi_{t,s}(\cdot)}{\partial s}$ belong to $L^1(\mathbb{R}^d)$ together with all their spatial derivatives. In other words, these functions belong to the Sobolev spaces $H^k_1(\mathbb{R}^d)$ for all natural $k$. The corresponding norms are again uniformly bounded for $t - s \in [T_1, T_2]$ with any $0 < T_1 < T_2$.

Proof. According to the assumptions, the function under the integral in the expression  t  ∂l ψ l −d ipx (−i) xi1 · · · xil Gt,s (x) = (2π) exp − ψτ (p) dτ dp e ∂pi1 · · · ∂pil s belongs to L1 (Rd ). Therefore, the first inclusion in (2.73) follows again by the Riemann–Lebesgue lemma. Since multiplying by polynomials in p or by ψ(p) does

2.4. Linear evolutions involving spatially homogeneous ΨDOs

103

not increase the singularity, the other inclusions in (2.73) immediately follow. All other statements follow from the fact that the condition $(1 + |x|^{d+1}) f(x) \in C_\infty(\mathbf{R}^d)$ for a continuous $f$ implies $f \in L_1(\mathbf{R}^d)$.

Remark 31. Although Theorem 2.4.2 and (2.4.4) imply that the resolving operator $f_s \mapsto f_t$ of the Cauchy problem (2.56) takes $C_\infty(\mathbf{R}^d)$ to itself and $L_1(\mathbf{R}^d)$ to itself, the question remains open as to how the initial conditions are met, that is, whether $\|f_t - f_s\| \to 0$ as $t - s \to 0$ in these spaces. This question is ultimately linked with the question of uniform boundedness for small $t - s$ of the operators $f_s \mapsto f_t$. We shall return to these questions in Chapter 4.

For example, if $\psi(p) = |p|^\beta$ with a $\beta > 0$, the only singularity of $\psi$ is at the origin, and the condition of the integrability of the products (2.72) is fulfilled for $l = d + k$, where $k$ is the biggest integer that is strictly smaller than $\beta$. In particular, $l \ge d + 1$ for $\beta > 1$. The case of homogeneous $\psi$ will be considered in detail in Sections 4.4 and 4.5. Here, as an instructive example, let the operator $\psi(-i\nabla)$ be a fractional derivative (right or left) on $\mathbf{R}$. The corresponding Green functions or heat kernels solve the problems
\[
\frac{\partial G}{\partial t}(t, x) = -\frac{d^\beta}{d(\pm x)^\beta} G(t, x), \quad t \ge 0, \qquad G|_{t=0} = \delta(x), \tag{2.74}
\]

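with $\beta \in (0, 1)$. From Proposition 1.8.6, it follows that the corresponding symbols $\psi$ equal
\[
\psi(p) = \exp\Big\{ \pm \frac{i}{2} \pi\beta\, \mathrm{sgn}\, p \Big\} |p|^\beta,
\]
so that
\[
G(t, x) = G_{\pm\beta}(t, x) = \frac{1}{2\pi} \int_{-\infty}^{\infty} \exp\Big\{ ipx - t|p|^\beta \exp\Big\{ \pm \frac{i}{2} \pi\beta\, \mathrm{sgn}\, p \Big\} \Big\}\, dp. \tag{2.75}
\]
Taking into account that this expression is real, it follows that
\[
G(t, x) = G_{\pm\beta}(t, x) = \frac{1}{2\pi}\, \mathrm{Re} \int_{-\infty}^{\infty} \exp\Big\{ ipx - t|p|^\beta \exp\Big\{ \pm \frac{i}{2} \pi\beta\, \mathrm{sgn}\, p \Big\} \Big\}\, dp
= \frac{1}{\pi}\, \mathrm{Re} \int_0^{\infty} \exp\Big\{ ipx - t p^\beta \exp\Big\{ \pm \frac{i}{2} \pi\beta \Big\} \Big\}\, dp. \tag{2.76}
\]
From this formula, it follows that $G_{\pm\beta}(t, x) = G_{\mp\beta}(t, -x)$. Moreover, these functions satisfy a scaling law:
\[
G_{\pm\beta}(t, x) = t^{-1/\beta}\, G_{\pm\beta}(1, t^{-1/\beta} x) = \frac{1}{|x|}\, G_{\pm\beta}(t|x|^{-\beta}, \mathrm{sgn}\, x). \tag{2.77}
\]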

This law is crucial for their analysis. Moreover, Theorem 2.4.2 (i) implies that the heat kernel $G_{\pm\beta}(t, .)$ belongs to $C_\infty(\mathbf{R})$ together with all its spatial derivatives, and that the derivative $\partial G_{\pm\beta}(t, .)/\partial t$ also belongs to $C_\infty(\mathbf{R})$ together with all its spatial derivatives.
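The scaling law (2.77) can be checked numerically in the one case where a stable density has an elementary closed form. The sketch below assumes $\beta = 1/2$ and uses the classical Lévy density $t\,x^{-3/2} e^{-t^2/(4x)}/(2\sqrt{\pi})$ on the positive half-line, taken here as a stand-in for $G_{+1/2}$ under the normalisation of (2.75). The function names are illustrative only, and the check is a sanity test, not a proof.

```python
import math

BETA = 0.5  # the only index for which this closed form is used


def levy_density(t, x):
    """Assumed closed form of the one-sided stable density of index 1/2."""
    if x <= 0.0:
        return 0.0
    return t / (2.0 * math.sqrt(math.pi)) * x ** -1.5 * math.exp(-t * t / (4.0 * x))


def scaling_residuals(t, x):
    """Differences between G(t, x) and the two right-hand sides of (2.77)."""
    g = levy_density(t, x)
    # First form: t^{-1/beta} G(1, t^{-1/beta} x)
    first = t ** (-1.0 / BETA) * levy_density(1.0, t ** (-1.0 / BETA) * x)
    # Second form: |x|^{-1} G(t |x|^{-beta}, sgn x)
    second = levy_density(t * abs(x) ** (-BETA), math.copysign(1.0, x)) / abs(x)
    return g - first, g - second


r1, r2 = scaling_residuals(0.7, 2.3)
print(r1, r2)  # both residuals should vanish up to rounding
```

Both residuals vanish up to floating-point rounding, which is what (2.77) predicts for this particular density.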


Remark 32. In probability theory, the functions $p_{\pm\beta}(t, x) = G_{\mp\beta}(t, -x) = G_{\pm\beta}(t, x)$ are referred to as stable densities. They represent transition densities for increasing and decreasing stable subordinators, respectively.

More specific properties of these stable densities are collected in the following assertion.

Proposition 2.4.1. (i) The functions $G_{\pm\beta}(t, x)$ are non-negative, and
\[
G_{\pm\beta}(t, x) = 0 \quad \text{for} \quad \mp x > 0. \tag{2.78}
\]

(ii) For any $t$, these functions behave like $|x|^{-1-\beta}$ for large $x$, in the sense that there exists a finite positive limit
\[
\lim_{|x| \to \infty} G_{\pm\beta}(t, x)\, |x|^{1+\beta} \quad \text{for} \quad \pm x > 0. \tag{2.79}
\]

(iii) $G_{\pm\beta}(t, x) > 0$ for $\pm x > 0$, and they are unimodal in this region, that is, they increase monotonically from zero up to the maximum and then decrease monotonically back to zero.

Proof. Assertion (i) follows from the more general Proposition 5.11.3. Assertion (ii) follows from the more general Proposition 4.5.1 that shall be proven later on. Assertion (iii) will be neither proved nor used here. Proofs can be found, e.g., in [78, 267] or [148].

Another key property of these functions has to do with the following representation of the Mittag-Leffler function:
\[
E_\beta(s) = \frac{1}{\beta} \int_0^\infty e^{sx}\, x^{-1-1/\beta}\, G_\beta(1, x^{-1/\beta})\, dx = \int_0^\infty e^{s y^{-\beta}}\, G_\beta(1, y)\, dy, \tag{2.80}
\]
which holds for $\beta \in (0, 1)$ and all $s \in \mathbf{C}$. As we shall see later on, this formula is crucial for analysing general fractional equations. We shall give two analytic proofs of this formula in Proposition 8.1.1 and in Section 8.4. A probabilistic proof can be found in [106].
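Formula (2.80) can be tested numerically for $\beta = 1/2$. The following is a minimal sketch under two assumptions that are not taken from the text: $G_{1/2}(1, y)$ is the Lévy density $y^{-3/2} e^{-1/(4y)}/(2\sqrt{\pi})$, and $E_\beta$ is the standard Mittag-Leffler series $\sum_k s^k/\Gamma(\beta k + 1)$. After the substitution $u = y^{-1/2}$, the second integral in (2.80) becomes $\pi^{-1/2}\int_0^\infty e^{su - u^2/4}\,du$, which is integrated here by Simpson's rule. The same code also checks the tail limit (2.79).

```python
import math


def mittag_leffler_half(s, terms=80):
    """Partial sum of the series E_{1/2}(s) = sum_k s^k / Gamma(k/2 + 1)."""
    return sum(s ** k / math.gamma(0.5 * k + 1.0) for k in range(terms))


def rhs_integral(s, upper=60.0, n=20000):
    """Simpson approximation of the second integral in (2.80) for beta = 1/2.

    The substitution u = y^{-1/2} turns the integrand into
    exp(s u - u^2 / 4) / sqrt(pi) on (0, infinity), truncated at `upper`.
    """
    def f(u):
        return math.exp(s * u - 0.25 * u * u) / math.sqrt(math.pi)

    h = upper / n  # n must be even for Simpson's rule
    total = f(0.0) + f(upper)
    for i in range(1, n):
        total += (4.0 if i % 2 else 2.0) * f(i * h)
    return total * h / 3.0


def levy_density(t, x):
    """Assumed closed form of the stable density of index 1/2 for x > 0."""
    return t / (2.0 * math.sqrt(math.pi)) * x ** -1.5 * math.exp(-t * t / (4.0 * x))


s = 0.7
series_value = mittag_leffler_half(s)
integral_value = rhs_integral(s)
# Tail behaviour (2.79): G(t, x) |x|^{1 + beta} should approach t / (2 sqrt(pi)).
tail_value = levy_density(0.7, 1e8) * (1e8) ** 1.5
print(series_value, integral_value, tail_value)
```

The series and the integral agree to high accuracy for moderate real $s$. For large $|s|$ the series would need more terms and the truncation point `upper` would have to grow.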

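2.5 Hamiltonian systems, boundary-value problems and the method of shooting

For a smooth function $H(x, p)$, $x, p \in \mathbf{R}^m$, the system of equations
\[
\begin{cases}
\dot x = \dfrac{\partial H}{\partial p} \\[1ex]
\dot p = -\dfrac{\partial H}{\partial x}
\end{cases} \tag{2.81}
\]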


is called the system of Hamiltonian equations with the Hamiltonian function, or just the Hamiltonian, $H$. It represents one of the key examples of ODEs that occur in physics. The coordinates $x$ and $p$ are referred to as position and momentum. Projections of the solutions onto the position coordinates are often referred to as characteristics or extremals. The reason for this nomenclature will be revealed below.

This and the next section analyse Hamiltonian systems in more detail. Note that the obtained results are not used in other parts of the book. Unlike most of the rest of the book, we mostly stick to finite-dimensional Hamiltonian systems, since their infinite-dimensional extension is nontrivial and has to be specially developed. Many specialized treatises are devoted to this topic. But even the finite-dimensional theory of Hamiltonian systems is quite specific, which is why it can only be touched upon in the framework of this book, namely by introducing the method of shooting that can be used to solve boundary-value problems for ODEs, provided that the corresponding Cauchy (or initial value) problem is well understood. The core idea of this method is to consider all trajectories that emerge from a certain point (shooting the corresponding mechanical particles out of this point) and then to try to find the trajectory that satisfies the required condition on its end point. In the next section, we shall explain how Hamiltonian systems can be used to solve first-order partial differential equations, namely the Hamilton–Jacobi equations.

A basic class of Hamiltonian functions that is responsible for the second-order Newton equations of mechanics is given by the convex quadratic-in-momentum Hamiltonians
\[
H(x, p) = \frac{1}{2} (G(x)p, p) - (A(x), p) - V(x), \tag{2.82}
\]
where $G(x)$ is a non-negative symmetric matrix. Such a Hamiltonian is called non-degenerate if $G$ is strictly positive, so that $G(x)^{-1}$ exists for all $x$ and is uniformly bounded.
In this section, we develop the theory of boundary-value problems for the basic non-degenerate case. For the more general case, we shall refer to the original papers. Before proceeding with the boundary-value problem, one needs some estimates for the solutions to the Cauchy problem for the Hamiltonian system (2.81). For the Hamiltonian (2.82), it has the form
\[
\begin{cases}
\dot x = G(x)p - A(x) \\[1ex]
\dot p_i = -\dfrac{1}{2} \Big( \dfrac{\partial G}{\partial x_i} p, p \Big) + \Big( \dfrac{\partial A}{\partial x_i}, p \Big) + \dfrac{\partial V}{\partial x_i}, \quad i = 1, \dots, m,
\end{cases} \tag{2.83}
\]
where we have written the second equation for each coordinate separately. In what follows, $B_r$ denotes the ball in $\mathbf{R}^m$ of radius $r$ centered at zero. Furthermore, let us denote by $X(s, x_0, p_0)$, $P(s, x_0, p_0)$ the solution to (2.81) or (2.83) with the initial data $(x_0, p_0)$ (whenever it exists).
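To make the flow $(X, P)(s, x_0, p_0)$ concrete, here is a minimal numerical sketch of the system (2.83) for $m = 1$. The coefficients $G$, $A$, $V$ below are hypothetical choices made only for illustration, and the classical fourth-order Runge–Kutta scheme is just one convenient integrator, not a method taken from the text. Since $H$ does not depend on time, it is conserved along exact solutions, which gives a simple accuracy check.

```python
import math

# Hypothetical one-dimensional coefficients (m = 1), chosen only for illustration.
def G(x):  return 1.0 + 0.2 * math.sin(x)
def dG(x): return 0.2 * math.cos(x)
def A(x):  return 0.3 * x
def dA(x): return 0.3
def V(x):  return math.cos(x)
def dV(x): return -math.sin(x)


def hamiltonian(x, p):
    """H(x, p) from (2.82) for m = 1."""
    return 0.5 * G(x) * p * p - A(x) * p - V(x)


def rhs(x, p):
    """Right-hand side of (2.83) for m = 1."""
    return G(x) * p - A(x), -0.5 * dG(x) * p * p + dA(x) * p + dV(x)


def flow(s, x0, p0, steps=200):
    """Approximate (X, P)(s, x0, p0) by the classical Runge-Kutta scheme."""
    h = s / steps
    x, p = x0, p0
    for _ in range(steps):
        k1 = rhs(x, p)
        k2 = rhs(x + 0.5 * h * k1[0], p + 0.5 * h * k1[1])
        k3 = rhs(x + 0.5 * h * k2[0], p + 0.5 * h * k2[1])
        k4 = rhs(x + h * k3[0], p + h * k3[1])
        x += h * (k1[0] + 2.0 * k2[0] + 2.0 * k3[0] + k4[0]) / 6.0
        p += h * (k1[1] + 2.0 * k2[1] + 2.0 * k3[1] + k4[1]) / 6.0
    return x, p


x_end, p_end = flow(0.5, 0.1, 2.0)
energy_drift = abs(hamiltonian(x_end, p_end) - hamiltonian(0.1, 2.0))
print(x_end, p_end, energy_drift)
```

A small `energy_drift` indicates that the step size is adequate for the chosen time horizon. It does not by itself certify the accuracy of the individual coordinates.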


Lemma 2.5.1. Let $G(x)$, $A(x)$, $V(x)$ be twice continuously differentiable functions. Then for an arbitrary $x_0 \in \mathbf{R}^m$ and an arbitrary open bounded neighbourhood $U(x_0)$ of $x_0$, there exist positive constants $t_0$, $c_0$, $C$ such that for $t \in (0, t_0]$, $c \in (0, c_0]$ and $p_0 \in B_{c/t}$, the solution $X(s, x_0, p_0)$, $P(s, x_0, p_0)$ exists on the interval $[0, t]$, and for all $s \in [0, t]$,
\[
X(s, x_0, p_0) \in U(x_0), \qquad |P(s, x_0, p_0)| < C(|p_0| + t). \tag{2.84}
\]

Proof. Let $T$ be the time of exit of the solution from the domain $U(x_0)$, namely
\[
T(t) = \min \big( t, \sup\{ s : X(s, x_0, p_0) \in U(x_0),\ |P(s, x_0, p_0)| < \infty \} \big).
\]
Since the derivatives of $G$, $A$ and $V$ are bounded in $U(x_0)$, it follows that for $s \le T(t)$ the growth of $|X(s, x_0, p_0) - x_0|$ and $|P(s, x_0, p_0)|$ is bounded by the solution to the system
\[
\dot x = K(p + 1), \qquad \dot p = K(p^2 + 1) \tag{2.85}
\]
in $\mathbf{R}^2$ with the initial conditions $x(0) = 0$, $p(0) = |p_0|$ and some constant $K$. The solution to the second equation is
\[
p(s) = \tan(Ks + \arctan p(0)) = \frac{p(0) + \tan Ks}{1 - p(0) \tan Ks}. \tag{2.86}
\]
Therefore, if $|p_0| \le c/t$ with $c \le c_0 < 1/\tilde K$, where $\tilde K$ is chosen in such a way that $\tan Ks \le \tilde K s$ for $s \le t_0$, then
\[
1 - |p_0| \tan Ks > 1 - |p_0| \tilde K s \ge 1 - c_0 \tilde K
\]
for all $s \le T(t)$. Consequently, for such $s$,
\[
|P(s, x_0, p_0)| \le \frac{|p_0| + \tilde K s}{1 - c_0 \tilde K}, \qquad |X(s, x_0, p_0) - x_0| \le Ks + K\, \frac{c + \tilde K s^2}{1 - c_0 \tilde K}.
\]

Choosing $t_0$, $c_0$ in such a way that the last inequality implies $X(s, x_0, p_0) \in U(x_0)$ for $s \le t_0$, $c \le c_0$, it follows that $T(t) = t$. This implies (2.84) as required.

Lemma 2.5.2. Let $G(x)$, $A(x)$, $V(x)$ be twice continuously differentiable functions. Then there exist $t_0 > 0$ and $c_0 > 0$ such that, for $s \le t \le t_0$, $c \le c_0$, $p_0 \in B_{c/t}$,
\[
\frac{1}{s} \frac{\partial X}{\partial p_0}(s, x_0, p_0) = G(x_0) + O(c + t), \qquad \frac{\partial P}{\partial p_0}(s, x_0, p_0) = 1 + O(c + t). \tag{2.87}
\]

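Proof. Differentiating the first equation in (2.83) yields
\[
\ddot x_i = \frac{\partial G_{ik}}{\partial x_l}(x)\, \dot x_l\, p_k + G_{ik}(x)\, \dot p_k - \frac{\partial A_i}{\partial x_l}\, \dot x_l
= \Big( \frac{\partial G_{ik}}{\partial x_l} p_k - \frac{\partial A_i}{\partial x_l} \Big) (G_{lj} p_j - A_l) + G_{ik} \Big( -\frac{1}{2} \Big( \frac{\partial G}{\partial x_k} p, p \Big) + \Big( \frac{\partial A}{\partial x_k}, p \Big) + \frac{\partial V}{\partial x_k} \Big). \tag{2.88}
\]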


Consequently, differentiating the Taylor expansion
\[
x(s) = x_0 + \dot x(0) s + \int_0^s (s - \tau)\, \ddot x(\tau)\, d\tau
\]
with respect to the initial momentum $p_0$ and using (2.84), one gets
\[
\frac{\partial X}{\partial p_0}(s, x_0, p_0) = G(x_0) s + \int_0^s \Big[ O(1 + |p_0|^2)\, \frac{\partial X}{\partial p_0}(\tau, x_0, p_0) + O(1 + |p_0|)\, \frac{\partial P}{\partial p_0}(\tau, x_0, p_0) \Big] (s - \tau)\, d\tau. \tag{2.89}
\]
Similarly, differentiating $p(s) = p_0 + \int_0^s \dot p(\tau)\, d\tau$ leads to
\[
\frac{\partial P}{\partial p_0}(s, x_0, p_0) = 1 + \int_0^s \Big[ O(1 + |p_0|^2)\, \frac{\partial X}{\partial p_0}(\tau, x_0, p_0) + O(1 + |p_0|)\, \frac{\partial P}{\partial p_0}(\tau, x_0, p_0) \Big]\, d\tau. \tag{2.90}
\]
Let us now look at the matrices
\[
v(s) = \frac{1}{s} \frac{\partial X}{\partial p_0}(s, x_0, p_0), \qquad u(s) = \frac{\partial P}{\partial p_0}(s, x_0, p_0)
\]
as elements of the Banach space $C([0, t], M_m)$ of continuous $m \times m$-matrix-valued functions $M(s)$ on $[0, t]$ equipped with the norm $\sup\{ |M(s)| : s \in [0, t] \}$. Then one can write the equations (2.89), (2.90) in the abstract form
\[
v = G(x_0) + L_1 v + \tilde L_1 u, \qquad u = 1 + L_2 v + \tilde L_2 u,
\]
where $L_1, L_2, \tilde L_1, \tilde L_2$ are linear operators in $C([0, t], M_m)$ with the norms $|L_i| = O(c^2 + t^2)$ and $|\tilde L_i| = O(c + t)$. This implies (2.87) for $c$ and $t$ small enough. In fact, the second equation yields $u = 1 + O(c + t) + O(c^2 + t^2) v$. Substituting this equality in the first equation yields $v = G(x_0) + O(c + t) + O(c^2 + t^2) v$, and solving this equation with respect to $v$ leads to the first equation in (2.87).

Now we are ready to prove the existence of the family $\Gamma(x_0)$ of solutions of the system (2.83) that start at $x_0$ and cover a neighbourhood of $x_0$ in times $t \le t_0$.

Theorem 2.5.1. Let $G(x)$, $A(x)$, $V(x)$ be twice continuously differentiable functions and let the matrix $G$ be positive non-degenerate. Then
(i) for each $x_0 \in \mathbf{R}^m$ there exist $c$ and $t_0$ such that for all $t \le t_0$ the mapping $p_0 \mapsto X(t, x_0, p_0)$ defined on the ball $B_{c/t}$ is a diffeomorphism onto its image;
(ii) for an arbitrary small enough $c$, there are positive $r = O(c)$ and $t_0 = O(c)$ such that the image of this diffeomorphism contains the ball $B_r(x_0)$ for all $t \le t_0$.


Proof. (i) Note first that, by Lemma 2.5.2, the mapping $p_0 \mapsto X(t, x_0, p_0)$ is a local diffeomorphism for all $t \le t_0$. Moreover, if $p_0, q_0 \in B_{c/t}$, then
\[
X(t, x_0, p_0) - X(t, x_0, q_0) = \int_0^1 \frac{\partial X}{\partial p_0}(t, x_0, q_0 + s(p_0 - q_0))\, ds\, (p_0 - q_0) = t (G(x_0) + O(c + t)) (p_0 - q_0). \tag{2.91}
\]

Therefore, if $c$ and $t$ are sufficiently small, the r.h.s. of (2.91) cannot vanish for $p_0 - q_0 \ne 0$.

(ii) We must prove that for $x \in B_r(x_0)$ there exists $p_0 \in B_{c/t}$ such that $x = X(t, x_0, p_0)$, or equivalently, such that
\[
p_0 = p_0 + \frac{1}{t} G(x_0)^{-1} (x - X(t, x_0, p_0)).
\]
In other words, the mapping
\[
F_x : p_0 \mapsto p_0 + \frac{1}{t} G(x_0)^{-1} (x - X(t, x_0, p_0)) \tag{2.92}
\]
has a fixed point in the ball $B_{c/t}$. Since every continuous mapping from a ball to itself has a fixed point (Schauder fixed-point principle), it is enough to prove that $F_x$ takes the ball $B_{c/t}$ into itself, i.e., that
\[
|F_x(p_0)| \le c/t \tag{2.93}
\]
whenever $x \in B_r(x_0)$ and $|p_0| \le c/t$. By (2.84), (2.88) and (2.89), we have
\[
X(t, x_0, p_0) = x_0 + t (G(x_0) p_0 - A(x_0)) + O(c^2 + t^2).
\]
Therefore, it follows from (2.92) that (2.93) is equivalent to
\[
|G(x_0)^{-1} (x - x_0) + O(t + c^2 + t^2)| \le c,
\]
which holds for $t \le t_0$, $|x - x_0| \le r$ and sufficiently small $r$, $t_0$, provided that $c$ is chosen small enough.

As a consequence, we get the following local well-posedness result for the boundary-value problem for quadratic Hamiltonians.

Theorem 2.5.2. If either (i) $A, V, G, G^{-1} \in C^2(\mathbf{R}^d)$, or (ii) $G$ is a constant positive matrix, $A \in C^2(\mathbf{R}^d)$ and the second derivatives $V''$ exist and are uniformly bounded, then there exist positive $r$, $c$, $t_0$ such that for any $t \in (0, t_0]$ and any $x_1, x_2$ with $|x_1 - x_2| \le r$, there exists a solution to the system (2.83) with the boundary conditions $x(0) = x_1$, $x(t) = x_2$. Moreover, this solution is unique under the additional assumption that $|p(0)| \le c/t$.


Proof. The case (i) follows directly from Theorem 2.5.1. Under the assumptions (ii), the proof of Lemma 2.5.1 can be repeated with the system
\[
\dot x = K(p + 1), \qquad \dot p = K(1 + p + x)
\]
as a bound for the solution to the Hamiltonian system. This system is linear (this is where the assumption of $G$ being constant comes in) and its solutions are estimated by exponentials. The rest of the proof remains the same.

The proof of the existence of a solution to the boundary-value problem given above is not constructive. However, once the well-posedness is given, one can construct approximate solutions up to any order in small $t$ for smooth enough Hamiltonians. For this purpose, one again begins with the construction of the asymptotic solution for the Cauchy problem.

Proposition 2.5.1. If the functions $G$, $A$, $V$ in (2.82) have continuous derivatives of order up to $k + 1$, then for the solution to the Cauchy problem for equation (2.83) with initial data $x(0) = x_0$, $p(0) = p_0$, the following asymptotic formulae hold:
\[
X(t, x_0, p_0) = x_0 + t G(x_0) p_0 - A(x_0) t + \sum_{j=2}^{k} Q_j(t, t p_0) + O(c + t)^{k+1}, \tag{2.94}
\]
\[
P(t, x_0, p_0) = p_0 + \frac{1}{t} \Big[ \sum_{j=2}^{k} P_j(t, t p_0) + O(c + t)^{k+1} \Big], \tag{2.95}
\]
where $Q_j(t, q) = Q_j(t, q^1, \dots, q^m)$, $P_j(t, q) = P_j(t, q^1, \dots, q^m)$ are homogeneous polynomials of degree $j$ with respect to all their arguments, with coefficients that depend on the values of $G$, $A$, $V$ and their derivatives up to order $j$ at the point $x_0$. Moreover, one has the following expansion for the derivatives with respect to the initial momentum:
\[
\frac{1}{t} \frac{\partial X}{\partial p_0} = G(x_0) + \sum_{j=1}^{k} \tilde Q_j(t, t p_0) + O(c + t)^{k+1}, \tag{2.96}
\]
\[
\frac{\partial P}{\partial p_0} = 1 + \sum_{j=1}^{k} \tilde P_j(t, t p_0) + O(c + t)^{k+1}, \tag{2.97}
\]
where $\tilde Q_j$, $\tilde P_j$ are again homogeneous polynomials of degree $j$, but now matrix-valued.

Proof. This follows directly from differentiating the equations (2.83), then using the Taylor expansion for its solution up to $k$th order and finally estimating the remainder with the help of Lemma 2.5.1.
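Although the existence proof is not constructive, the map $F_x$ from (2.92) can be iterated numerically. The sketch below is an illustration under assumptions, not the author's algorithm: the coefficients $G$, $A$, $V$ are hypothetical one-dimensional choices, the flow is computed by a Runge–Kutta scheme, and convergence of the plain iteration is merely suggested by Lemma 2.5.2, which makes the derivative of $F_x$ with respect to $p_0$ of order $O(c + t)$.

```python
import math

# Hypothetical one-dimensional coefficients (m = 1), chosen only for illustration.
def G(x):  return 1.0 + 0.2 * math.sin(x)
def dG(x): return 0.2 * math.cos(x)
def A(x):  return 0.3 * x
def dA(x): return 0.3
def dV(x): return -math.sin(x)  # derivative of the assumed V(x) = cos(x)


def rhs(x, p):
    """Right-hand side of (2.83) for m = 1."""
    return G(x) * p - A(x), -0.5 * dG(x) * p * p + dA(x) * p + dV(x)


def flow(t, x0, p0, steps=100):
    """Approximate (X, P)(t, x0, p0) by the classical Runge-Kutta scheme."""
    h = t / steps
    x, p = x0, p0
    for _ in range(steps):
        k1 = rhs(x, p)
        k2 = rhs(x + 0.5 * h * k1[0], p + 0.5 * h * k1[1])
        k3 = rhs(x + 0.5 * h * k2[0], p + 0.5 * h * k2[1])
        k4 = rhs(x + h * k3[0], p + h * k3[1])
        x += h * (k1[0] + 2.0 * k2[0] + 2.0 * k3[0] + k4[0]) / 6.0
        p += h * (k1[1] + 2.0 * k2[1] + 2.0 * k3[1] + k4[1]) / 6.0
    return x, p


def shoot(t, x0, x_target, iterations=50):
    """Iterate the map F_x from (2.92), starting from p0 = 0."""
    g0_inv = 1.0 / G(x0)
    p0 = 0.0
    for _ in range(iterations):
        x_end, _ = flow(t, x0, p0)
        p0 = p0 + g0_inv * (x_target - x_end) / t
    return p0


t, x0, x_target = 0.1, 0.0, 0.05
p0 = shoot(t, x0, x_target)
miss = abs(flow(t, x0, p0)[0] - x_target)
print(p0, miss)
```

For these values the computed initial momentum stays close to the leading-order guess $G(x_0)^{-1}(x - x_0 + A(x_0)t)/t = 0.5$. For larger $t$ or $|x - x_0|$ the iteration may fail to converge, in line with the local nature of Theorem 2.5.1.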


Proposition 2.5.2. If the functions $G$, $A$, $V$ in (2.82) have continuous derivatives of order up to $k + 1$ and $G(x_0)$ is a non-degenerate positive matrix, the function $p_0(t, x, x_0)$, defined by the equation $x = X(t, x_0, p_0(t, x, x_0))$ according to Theorem 2.5.1, has the asymptotic expansion
\[
p_0(t, x, x_0) = \frac{1}{t} G(x_0)^{-1} \Big[ (x - x_0) + A(x_0) t + \sum_{j=2}^{k} P_j(t, x - x_0) + O(c + t)^{k+1} \Big], \tag{2.98}
\]
where $P_j(t, x - x_0)$ are certain homogeneous polynomials of degree $j$ in all their arguments.

Proof. It follows from (2.94) that $x - x_0$ can be expressed as an asymptotic power series in the variable $(p_0 t)$ with coefficients that have asymptotic expansions in powers of $t$. This implies the existence and uniqueness of the formal power series of the form (2.98) which solves equation (2.94) with respect to $p_0$. The well-posedness of this equation (which follows from Theorem 2.5.1) completes the proof.

We have shown how the method of shooting is used for proving the well-posedness and for effectively calculating the solution to the boundary-value problem for Hamiltonian systems locally. The vast development of the theory of Hamilton equations is beyond the scope of this book, but we shall provide some comments on its various directions. (More bibliographical comments are provided in the last section of the chapter.)

For convex Hamiltonians, it will be shown in the next section that the projections of the solutions to the boundary-value problems on the position coordinate $x$, $X(\tau, x_0, p_0)$ with $x = X(t, x_0, p_0)$, provide local minimizers for the integral functional
\[
I_t(y(.)) = \int_0^t L(y(\tau), \dot y(\tau))\, d\tau \tag{2.99}
\]

among all piecewise smooth curves $y(\tau)$ with given boundary conditions $y(0) = x_0$, $y(t) = x$ (hence the term extremals mentioned above), where $L(x, v)$ is the Lagrange function or the Lagrangian of $H$ defined as the Legendre transform of $H(x, p)$ in the variable $p$:
\[
L(x, v) = \max_p (pv - H(x, p)). \tag{2.100}
\]
For example, for the quadratic Hamiltonian $H(x, p) = \frac{1}{2} (G(x)p, p)$, the corresponding Lagrangian is also quadratic:
\[
L(x, v) = \frac{1}{2} (G^{-1}(x)v, v). \tag{2.101}
\]
Exercise 2.5.1. Check formula (2.101).
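A quick numerical sanity check of (2.100) and (2.101) in one dimension may help before doing the exercise by hand. The sketch below assumes a hypothetical scalar coefficient $G(x) = 2 + \sin x$ and computes the maximum in (2.100) by ternary search, which is valid because $p \mapsto pv - H(x, p)$ is concave for convex $H$. It is a check on one example, not a proof.

```python
import math


def G(x):
    """Hypothetical positive scalar coefficient, for illustration only."""
    return 2.0 + math.sin(x)


def hamiltonian(x, p):
    """Pure quadratic Hamiltonian H(x, p) = G(x) p^2 / 2 in one dimension."""
    return 0.5 * G(x) * p * p


def legendre(x, v, lo=-100.0, hi=100.0, iterations=200):
    """Approximate L(x, v) = max_p (p v - H(x, p)) by ternary search.

    Assumes the maximiser lies in [lo, hi] and that the objective is concave in p.
    """
    def objective(p):
        return p * v - hamiltonian(x, p)

    for _ in range(iterations):
        m1 = lo + (hi - lo) / 3.0
        m2 = hi - (hi - lo) / 3.0
        if objective(m1) < objective(m2):
            lo = m1
        else:
            hi = m2
    return objective(0.5 * (lo + hi))


x, v = 0.4, 1.3
numeric = legendre(x, v)
closed_form = 0.5 * v * v / G(x)  # formula (2.101) with G^{-1}(x) = 1 / G(x)
print(numeric, closed_form)
```

The two values agree up to rounding, and the maximiser is $p = v/G(x)$, which is the one-dimensional form of $p = G^{-1}(x)v$.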


This link to the theory of optimization (or the calculus of variations) makes it possible to derive from the local well-posedness of the boundary-value problem (proved above) the global existence result for its solution, without the restriction that $x_0$ and $x$ are close and $t$ is small. (Note, however, that this does not imply the uniqueness!) Here, a possible solution can be identified as a curve supplying the global minimum of (2.99). This existence result for non-degenerate quadratic Hamiltonians is referred to as Tonelli's theorem.

The key application of this theory in geometry and general relativity arises from pure quadratic Hamiltonians with a Lagrangian of the form (2.101). When defined on the tangent bundles to a Riemannian manifold, these functions that are quadratic in the velocity $v$ specify Riemannian metrics. The corresponding local minimizers or extremals are called geodesics. These are curves of minimal length, that is, the analogue to straight lines in Euclidean geometry. The corresponding evolution generated by the Hamilton equations is referred to as the geodesic flow.

While non-degenerate quadratic Hamiltonians arise in geodesic flows and in standard problems of calculus of variation with Lagrangians $L(x, v)$ that depend quadratically on $v$, degenerate quadratic Hamiltonians arise in the study of stochastic geodesic flows (i.e., geodesic flows that are perturbed by some noisy input) and in problems of calculus of variation with Lagrangians $L(x, \dot x, \dots, x^{(k)})$ that depend on higher derivatives and show a quadratic dependence on the highest derivative $x^{(k)}$. For degenerate quadratic Hamiltonians, the analysis of the boundary-value problem is much less straightforward. For example, let us consider the following quadratic convex Hamiltonian on $\mathbf{R}^{4d}$:
\[
H(x, y, p, q) = -f(y) p + \frac{1}{2} q^2 \tag{2.102}
\]
with an everywhere-positive function $f$, for instance $f(y) = y^2$. (Note that $(p, q)$ is the momentum related to the position coordinates $(x, y)$.) The corresponding Hamiltonian system reads
\[
\dot x = -f(y), \qquad \dot y = q, \qquad \dot p = 0, \qquad \dot q = f'(y) p.
\]

Therefore, $\dot x$ is always negative and there is no solution of the Hamiltonian system joining $(x_0, y_0)$ and $(x, y)$ whenever $x > x_0$, even for small positive $t$ and close $x$, $x_0$. In other words, the boundary-value problem is not solvable even locally.

A natural approach to address the local boundary-value problem for a degenerate Hamiltonian near a point $x_0$ is to consider its linear approximation, obtained by taking the quadratic (in both $p$ and $x$) approximation to the Hamiltonian around some point $x_0$. However, it turns out that the solution to the local boundary-value problem is approximated by the solutions to the corresponding linear approximation for a particular class of Hamiltonians only. This class of Hamiltonians has been identified in [136], where the full theory of the boundary-value problem for the corresponding Hamiltonian equations (local well-posedness


and global existence of the global minimizing solutions) is provided. The Hamiltonians of this class are called regular in [136]. They can be classified with the help of Young diagrams (finite non-increasing sequences of natural numbers). For a diagram that consists of only one number, the Hamiltonian is non-degenerate. For a diagram that consists of two numbers $k \ge n$, the corresponding Hamiltonians $H(x, y, p, q)$, $x, p \in \mathbf{R}^n$, $y, q \in \mathbf{R}^k$, have the form
\[
H = \frac{1}{2} (g(x)q, q) - (a(x) + \alpha(x)y, p) - \Big( b(x) + \beta(x)y + \frac{1}{2} (\gamma(x)y, y), q \Big) - V(x, y), \tag{2.103}
\]
or more explicitly
\[
H = \frac{1}{2} g_{ij}(x) q^i q^j - (a_i(x) + \alpha_{ij} y_j) p_i - \Big( b_i(x) + \beta_{ij}(x) y_j + \frac{1}{2} \gamma_i^{jl} y_j y_l \Big) q^i - V(x, y),
\]
where $g$ is a positive $k \times k$ matrix, $V(x, y)$ is a polynomial in $y$ of degree $\le 4$, bounded from below, and the rank of the matrix $\alpha(x)$ is $n$.

Exercise 2.5.2. Write down explicitly the Hamiltonian system arising from Hamiltonians of the type (2.103). Afterwards, prove the following generalization of Lemma 2.5.1 for this system: There exist constants $K$, $t_0$, and $c_0$ such that for all $c \in (0, c_0]$ and $t \in (0, t_0]$, the solution $(X, Y, P, Q)(t, x_0, y_0, p_0, q_0)$ to this system with the initial data $(x_0, y_0, p_0, q_0)$ exists on the interval $[0, t]$ whenever
\[
|y_0| \le \frac{c}{t}, \qquad |q_0| \le \frac{c^2}{t^2}, \qquad |p_0| \le \frac{c^3}{t^3}.
\]
On this interval, the following estimates hold:
\[
|x - x_0| \le Kt \Big( 1 + \frac{c}{t} \Big), \qquad |y - y_0| \le Kt \Big( 1 + \frac{c^2}{t^2} \Big),
\]
\[
|q - q_0| \le Kt \Big( 1 + \frac{c^3}{t^3} \Big), \qquad |p - p_0| \le Kt \Big( 1 + \frac{c^4}{t^4} \Big).
\]

Note that a solution to this exercise can be found in [136].

Exercise 2.5.3. Prove the following local well-posedness result for the boundary-value problem for a Hamiltonian system with a Hamiltonian of the form (2.103):
(i) There exist positive real numbers $c$ and $t_0$ (depending only on $x_0$) such that for all $t \le t_0$ and $|y| \le c/t$, the mapping $(p_0, q_0) \mapsto (X, Y)(t, x_0, y_0, p_0, q_0)$ defined on the polydisc $B_{c^3/t^3} \times B_{c^2/t^2}$ is a diffeomorphism onto its image.
(ii) There exist positive $r$, $c$, $t_0$ such that the image of this diffeomorphism contains the polydisc $B_{r/t}(\tilde y) \times B_r(\tilde x)$, where $(\tilde x, \tilde y) = (X, Y)(t, x_0, y_0, 0, 0)$.
The arguments are rather lengthy, see again [136].


Remark 33. For a general Young diagram $\mathcal{M} = \{ m_{M+1} \ge m_M \ge \cdots \ge m_0 > 0 \}$, let $x^0, \dots, x^M, x^{M+1} = y$ be the position coordinates in the spaces of dimensions $m_0, \dots, m_{M+1}$ respectively, and let $p_0, \dots, p_M, p_{M+1} = q$ be the momenta. The corresponding regular Hamiltonians are defined by the formula
\[
H(x, y, p, q) = \frac{1}{2} \big( g(x^0) q, q \big) - R_1(x, y) p_0 - \cdots - R_{M+1}(x, y) p_M - R_{M+2}(x, y) q - R_{2(M+2)}(x, y),
\]
where the $R_I(x, y)$ are (vector-valued) polynomials in the variables $x^1, \dots, x^M, y = x^{M+1}$ of $\mathcal{M}$-degree $I$ with smooth coefficients depending on $x^0$, and where the $\mathcal{M}$-degree of a polynomial is defined by prescribing the degree $I$ to the variables $x^I$, $I = 0, \dots, M + 1$. Moreover, $g(x^0)$ is non-degenerate and the matrices $\partial R_I / \partial x^I$ have the rank $m_{I-1}$.

The problem of solving Hamiltonian equations is closely related to (in fact, is often even reduced to) the problem of finding the conservation laws (also referred to as first integrals), which are functions $g(x, p)$ such that the values $g(X(\tau, x_0, p_0), P(\tau, x_0, p_0))$ do not depend on $\tau$ for any $x_0$, $p_0$. A tool for finding such first integrals is supplied by the notion of the Poisson bracket, which is defined for functions $g$, $h$ on $\mathbf{R}^{2d}$ as follows:
\[
\{g, h\}(x, p) = \frac{\partial g}{\partial x} \frac{\partial h}{\partial p} - \frac{\partial g}{\partial p} \frac{\partial h}{\partial x}. \tag{2.104}
\]

The functions $g$, $h$ are said to be in involution if their bracket vanishes. It is straightforward to see that if $g$ and $H$ are in involution then $g$ is a conservation law for the Hamiltonian system with the Hamiltonian function $H$.

Exercise 2.5.4. Check this claim.

If a sufficient number of conservation laws can be found (i.e., sufficient to identify the solutions), the Hamiltonian system is referred to as integrable. Integrable infinite-dimensional systems are of particular interest. Their study brought to life the remarkable theory of solitons and instantons in nonlinear PDEs, based on the famous KdV equation.
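The bracket (2.104) is easy to evaluate numerically, which gives a cheap way to test whether a candidate function is a first integral. The sketch below reads the products in (2.104) as sums over coordinates and approximates the partial derivatives by central differences. The central-potential Hamiltonian and the angular momentum are standard textbook choices introduced here only as an example. They do not come from the text.

```python
def poisson_bracket(g, h, x, p, eps=1e-5):
    """Central-difference approximation of {g, h}(x, p) from (2.104)."""
    def partial(f, slot, i):
        # Derivative of f in coordinate i of x (slot 0) or of p (slot 1).
        plus = [list(x), list(p)]
        minus = [list(x), list(p)]
        plus[slot][i] += eps
        minus[slot][i] -= eps
        return (f(plus[0], plus[1]) - f(minus[0], minus[1])) / (2.0 * eps)

    total = 0.0
    for i in range(len(x)):
        total += partial(g, 0, i) * partial(h, 1, i)
        total -= partial(g, 1, i) * partial(h, 0, i)
    return total


def H(x, p):
    """Example Hamiltonian: unit mass in a central quartic potential."""
    return 0.5 * (p[0] ** 2 + p[1] ** 2) + 0.25 * (x[0] ** 2 + x[1] ** 2) ** 2


def angular_momentum(x, p):
    return x[0] * p[1] - x[1] * p[0]


def first_coordinate(x, p):
    return x[0]


point_x, point_p = [0.7, -0.4], [0.3, 1.1]
bracket_conserved = poisson_bracket(angular_momentum, H, point_x, point_p)
bracket_position = poisson_bracket(first_coordinate, H, point_x, point_p)
print(bracket_conserved, bracket_position)
```

The first bracket vanishes up to discretisation error, consistent with angular momentum being conserved in a central field. The second equals $\partial H/\partial p_1 = p_1$, so the first coordinate is not a first integral.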

2.6 Hamilton–Jacobi equation, method of characteristics and calculus of variation

We shall now introduce the method of characteristics, which links ODEs and first-order PDEs and is a key tool for solving the latter.

Remark 34. Remarkably, the method also turns out to be effective in the opposite direction, making it possible to derive solutions to ODEs in some important cases when the solutions to PDEs are easier to find, see, e.g., [16].


We develop the method based on the most important case of the Hamilton–Jacobi equations, which differ from the general first-order PDE by the restriction that these equations do not depend explicitly on the unknown function, but only on its partial derivatives. (Note, however, that general linear equations can be found in Section 2.10.) As a consequence, we will show the central role of Hamiltonian systems in solving minimization problems of the calculus of variation. We will build the theory of the Cauchy problem for the Hamilton–Jacobi equation for small times and then briefly indicate the ways for constructing global generalized solutions.

Let $H = H(x, p)$ be a twice continuously differentiable function on $\mathbf{R}^{2n}$. Let $X(t, x_0, p_0)$, $P(t, x_0, p_0)$ denote the solution to the Hamiltonian system (2.81) with initial conditions $(x_0, p_0)$ at time zero. (As was already noted, the projections on the $x$-space of the solutions of (2.81) are called characteristics of the Hamiltonian $H$, or extremals.) Suppose that for some $x_0$, $t_0 > 0$ and all $t \in (0, t_0]$, there exists a neighbourhood of the origin $\Omega_t \subset \mathbf{R}^n$ such that the mapping $p_0 \mapsto X(t, x_0, p_0)$ is a diffeomorphism from $\Omega_t$ onto its image and, moreover, this image contains a fixed neighbourhood $D(x_0)$ of $x_0$ (not depending on $t$). Then the family $\Gamma(x_0)$ of solutions of (2.81) with initial data $(x_0, p_0)$, $p_0 \in \Omega_t$, is called the field of characteristics starting from $x_0$ and covering $D(x_0)$ in times $t$. Basic classes of Hamiltonians where such a field exists (which occurs whenever the boundary-value problem is locally well posed) were identified in the previous section. Assuming that this field exists, one can define the smooth function
\[
p_0(t, x, x_0) : (0, t_0] \times D(x_0) \to \Omega_t(x_0)
\]
so that $X(t, x_0, p_0(t, x, x_0)) = x$. The family $\Gamma(x_0)$ defines two natural vector fields in $(0, t_0] \times D(x_0)$. Namely, each point of this set is associated with the momentum and velocity vectors
\[
p(t, x) = P(t, x_0, p_0(t, x, x_0)), \qquad v(t, x) = \frac{\partial H}{\partial p}(x, p(t, x)) \tag{2.105}
\]
of the solution to (2.81) joining $x_0$ and $x$ in time $t$. With each solution $X(t, x_0, p_0)$, $P(t, x_0, p_0)$, one associates the action function defined by the integral
\[
\sigma(t, x_0, p_0) = \int_0^t \big( P(\tau, x_0, p_0) \dot X(\tau, x_0, p_0) - H(X(\tau, x_0, p_0), P(\tau, x_0, p_0)) \big)\, d\tau. \tag{2.106}
\]
Due to the properties of the field of characteristics $\Gamma(x_0)$, one can locally define the two-point function $S(t, x, x_0)$ as the action along the trajectory from $\Gamma(x_0)$ that joins $x_0$ and $x$ in the time $t$, i.e.,
\[
S(t, x, x_0) = \sigma(t, x_0, p_0(t, x, x_0)). \tag{2.107}
\]


Using the vector field $p(t, x)$, one can rewrite this in the equivalent form
\[
S(t, x, x_0) = \int_0^t [ p(\tau, x)\, dx - H(x, p(\tau, x))\, d\tau ], \tag{2.108}
\]
the curvilinear integral being taken along the characteristic $X(\tau, x_0, p_0(t, x, x_0))$. The following statement represents the key link between ODEs and first-order partial differential equations, more concretely between Hamiltonian equations and the Hamilton–Jacobi equation. It also plays the central role in the classical calculus of variations.

Theorem 2.6.1. For any $x_0$, the function $S(t, x, x_0)$ satisfies the Hamilton–Jacobi equation
\[
\frac{\partial S}{\partial t} + H\Big( x, \frac{\partial S}{\partial x} \Big) = 0 \tag{2.109}
\]
in the domain $(0, t_0] \times D(x_0)$. Moreover
\[
\frac{\partial S}{\partial x}(t, x) = p(t, x). \tag{2.110}
\]
Finally, the integral in the r.h.s. of (2.108) does not depend on the path of integration, i.e., it has the same value for all smooth curves $x(\tau)$ joining $x_0$ and $x$ in the time $t$ and lying completely in the domain $D(x_0)$.

Proof. First we prove (2.110). This equation can be rewritten as
\[
P(t, x_0, p_0) = \frac{\partial S}{\partial x}(t, X(t, x_0, p_0)),
\]
or equivalently as
\[
P(t, x_0, p_0) = \frac{\partial \sigma}{\partial p_0}(t, x_0, p_0(t, x, x_0))\, \frac{\partial p_0}{\partial x}(t, x, x_0).
\]
Since $X(t, x_0, p_0(t, x, x_0)) = x$, we find
\[
\frac{\partial p_0}{\partial x}(t, x, x_0) = \Big( \frac{\partial X}{\partial p_0} \Big)^{-1} (t, x_0, p_0(t, x, x_0)).
\]
It follows that equation (2.110), written in terms of the variables $(t, p_0)$, has the form
\[
P(t, x_0, p_0)\, \frac{\partial X}{\partial p_0}(t, x_0, p_0) = \frac{\partial \sigma}{\partial p_0}(t, x_0, p_0). \tag{2.111}
\]
To prove this equation, we note that it holds at $t = 0$, since both parts vanish at that time. Moreover, differentiating this equation with respect to $t$ and using (2.81), one gets the tautology
\[
-\frac{\partial H}{\partial x} \frac{\partial X}{\partial p_0} + P \frac{\partial^2 X}{\partial t\, \partial p_0} = \frac{\partial P}{\partial p_0} \frac{\partial H}{\partial p} + P \frac{\partial^2 X}{\partial t\, \partial p_0} - \frac{\partial H}{\partial p} \frac{\partial P}{\partial p_0} - \frac{\partial H}{\partial x} \frac{\partial X}{\partial p_0},
\]
showing that (2.111) holds for all $t$.


In order to prove (2.109), let us first rewrite it as
\[
\frac{\partial \sigma}{\partial t} + \frac{\partial \sigma}{\partial p_0} \frac{\partial p_0}{\partial t}(t, x) + H(x, p(t, p_0(t, x))) = 0.
\]
Inserting the expressions for $\partial \sigma / \partial t$ and $\partial \sigma / \partial p_0$ from (2.106) and (2.111), respectively, yields
\[
P(t, x_0, p_0) \dot X(t, x_0, p_0) + P(t, x_0, p_0)\, \frac{\partial X}{\partial p_0}(t, x_0, p_0)\, \frac{\partial p_0}{\partial t} = 0,
\]
which is true, because the equation
\[
\frac{\partial X}{\partial p_0}(t, x_0, p_0)\, \frac{\partial p_0}{\partial t} + \dot X(t, x_0, p_0) = 0
\]
is obtained by differentiating the equation $X(t, x_0, p_0(t, x, x_0)) = x$ with respect to $t$. The final statement follows, because the integrand in (2.108) is a complete differential due to (2.109) and (2.110).

The integral on the r.h.s. of (2.108) is often referred to as the invariant Hilbert integral.

For the sake of completeness, let us examine the key link of Hamiltonian systems with the calculus of variations. For this purpose, one introduces the Lagrange function $L(x, v)$ as the Legendre transform (2.100) of $H(x, p)$ in the variable $p$, the integral functional (2.99) defined on piecewise-smooth curves joining $x_0$ and $x$ in the time $t$, and the Weierstrass function $W(x, q, p)$ defined in the Hamiltonian picture as
\[
W(x, q, p) = H(x, q) - H(x, p) - \Big( q - p, \frac{\partial H}{\partial p}(x, p) \Big).
\]
One says that the Weierstrass condition holds for a solution $(x(\tau), p(\tau))$ to the system (2.81), if $W(x(\tau), q, p(\tau)) \ge 0$ for all $\tau$ and all $q \in \mathbf{R}^n$. For example, if the Hamiltonian $H$ is convex (even non-strictly) in the variable $p$, then the Weierstrass function is non-negative for any choice of its arguments, thus in this case the Weierstrass condition holds trivially for all curves.

The following result is the basic Weierstrass sufficient condition for minimum in the calculus of variation.

Theorem 2.6.2. If the Weierstrass condition holds on a trajectory $X(\tau, x_0, p_0)$, $P(\tau, x_0, p_0)$ of the field $\Gamma(x_0)$ joining $x_0$ and $x$ in the time $t$ (i.e., such that $X(t, x_0, p_0) = x$), then the characteristic $X(\tau, x_0, p_0)$ provides a minimum for the functional (2.99) over all curves that lie completely in $D(x_0)$. Furthermore, $S(t, x, x_0)$ is the corresponding minimal value.


Proof. For any curve $y(\tau)$ joining $x_0$ and $x$ in the time $t$ and lying in $D(x_0)$, one finds
\[
I_t(y(.)) = \int_0^t L(y(\tau), \dot y(\tau))\, d\tau \ge \int_0^t \big( p(\tau, y(\tau)) \dot y(\tau) - H(y(\tau), p(\tau, y(\tau))) \big)\, d\tau.
\]
Since the r.h.s. is the invariant Hilbert integral, it equals $S(t, x, x_0)$. It remains to prove that $S(t, x, x_0)$ provides the value of $I_t$ on the characteristic $X(\tau, x_0, p_0(t, x, x_0))$. For this purpose, it is enough to show that
\[
P(\tau, x_0, p_0) \dot X(\tau, x_0, p_0) - H(X(\tau, x_0, p_0), P(\tau, x_0, p_0))
\]
equals $L(X(\tau, x_0, p_0), \dot X(\tau, x_0, p_0))$, where $p_0 = p_0(t, x, x_0)$; in other words, that
\[
P(\tau, x_0, p_0)\, \frac{\partial H}{\partial p}(X(\tau, x_0, p_0), P(\tau, x_0, p_0)) - H(X(\tau, x_0, p_0), P(\tau, x_0, p_0))
\ge q\, \frac{\partial H}{\partial p}(X(\tau, x_0, p_0), P(\tau, x_0, p_0)) - H(X(\tau, x_0, p_0), q)
\]
for all $q$. But this inequality is just the Weierstrass condition.
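Theorems 2.6.1 and 2.6.2 can be illustrated on an explicitly solvable case. The sketch below assumes the one-dimensional Hamiltonian $H(x, p) = p^2/2 + x^2/2$ (that is, $G = 1$, $A = 0$, $V = -x^2/2$ in (2.82)), for which $X = x_0 \cos t + p_0 \sin t$ and the two-point function has the classical closed form used in `S` below, valid for $0 < t < \pi$. The finite-difference steps and the particular perturbation of the characteristic are illustrative choices.

```python
import math


def S(t, x, x0):
    """Two-point function for H = p^2/2 + x^2/2 (assumed closed form, 0 < t < pi)."""
    return ((x * x + x0 * x0) * math.cos(t) - 2.0 * x * x0) / (2.0 * math.sin(t))


def hamiltonian(x, p):
    return 0.5 * p * p + 0.5 * x * x


def hj_residual(t, x, x0, eps=1e-5):
    """Left-hand side of (2.109) by central differences, together with dS/dx."""
    dS_dt = (S(t + eps, x, x0) - S(t - eps, x, x0)) / (2.0 * eps)
    dS_dx = (S(t, x + eps, x0) - S(t, x - eps, x0)) / (2.0 * eps)
    return dS_dt + hamiltonian(x, dS_dx), dS_dx


def action(y, dy, t, n=2000):
    """Simpson approximation of (2.99) with L(x, v) = v^2/2 - x^2/2."""
    def f(tau):
        return 0.5 * dy(tau) ** 2 - 0.5 * y(tau) ** 2

    h = t / n  # n must be even for Simpson's rule
    total = f(0.0) + f(t)
    for i in range(1, n):
        total += (4.0 if i % 2 else 2.0) * f(i * h)
    return total * h / 3.0


t, x0, x = 0.8, 0.3, 1.1
p0 = (x - x0 * math.cos(t)) / math.sin(t)          # solves X(t, x0, p0) = x
p_end = -x0 * math.sin(t) + p0 * math.cos(t)       # P(t, x0, p0)
residual, dS_dx = hj_residual(t, x, x0)

def X(tau):  return x0 * math.cos(tau) + p0 * math.sin(tau)
def dX(tau): return -x0 * math.sin(tau) + p0 * math.cos(tau)
# A competing curve with the same end points (perturbation vanishes at 0 and t).
def Y(tau):  return X(tau) + 0.1 * math.sin(math.pi * tau / t)
def dY(tau): return dX(tau) + 0.1 * (math.pi / t) * math.cos(math.pi * tau / t)

action_char = action(X, dX, t)
action_perturbed = action(Y, dY, t)
print(residual, dS_dx - p_end, action_char - S(t, x, x0), action_perturbed - action_char)
```

The first three printed numbers are close to zero, in agreement with (2.109), (2.110) and the identification of $S$ with the action along the characteristic. The last one is positive, as Theorem 2.6.2 predicts for this convex Hamiltonian.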



Exercise 2.6.1. Let the field of characteristics exist for all $x_0$ in some open set. Show the following further link between the field of extremals and the two-point function:
\[
\frac{\partial S}{\partial x_0}(t, x, x_0) = -p_0(t, x, x_0). \tag{2.112}
\]
Moreover, show that the function $S(t, x, x_0)$ as a function of $t$, $x_0$ satisfies the Hamilton–Jacobi equation corresponding to the Hamiltonian $\tilde H(x, p) = H(x, -p)$.

Exercise 2.6.2. Using equations (2.110) and (2.112) and assuming that the matrices $\partial X / \partial p_0$ are non-degenerate, derive the equations
\[
\frac{\partial^2 S}{\partial x^2}(t, x, x_0) = \frac{\partial P}{\partial p_0}(t, x_0, p_0) \Big( \frac{\partial X}{\partial p_0}(t, x_0, p_0) \Big)^{-1}, \qquad
\frac{\partial^2 S}{\partial x_0^2}(t, x, x_0) = \Big( \frac{\partial X}{\partial p_0}(t, x_0, p_0) \Big)^{-1} \frac{\partial X}{\partial x_0}(t, x_0, p_0), \tag{2.113}
\]
\[
\frac{\partial^2 S}{\partial x_0\, \partial x}(t, x, x_0) = -\Big( \frac{\partial X}{\partial p_0}(t, x_0, p_0) \Big)^{-1}, \tag{2.114}
\]
linking the derivatives of the solutions to Hamiltonian systems with the second derivatives of the two-point function. By (2.96), this result implies that, for non-degenerate quadratic Hamiltonians and small $t$,
\[
\frac{\partial^2 S}{\partial x^2}(t, x, x_0) \sim \frac{1}{t} G(x_0)^{-1}, \qquad \frac{\partial^2 S}{\partial x_0^2}(t, x, x_0) \sim \frac{1}{t} G(x_0)^{-1}, \tag{2.115}
\]
so that $S$ is convex in $x$ and $x_0$.


Using the asymptotic solutions to the boundary-value problems constructed in the previous section, one can derive the asymptotic representation of the two-point function.

Proposition 2.6.1. Under the assumptions of Proposition 2.5.1, the two-point function $S(t, x, x_0)$ can be expanded in the form
\[
S(t, x, x_0) = \frac{1}{2t} \big( x - x_0 + A(x_0) t,\ G(x_0)^{-1} (x - x_0 + A(x_0) t) \big)
+ \frac{1}{t} \Big( V(x_0) t^2 + \sum_{j=3}^{k} P_j(t, x - x_0) + O(c + t)^{k+1} \Big), \tag{2.116}
\]

where the $P_j$ are again polynomials in $t$ and $x - x_0$ of degree $j$, and where the term that is quadratic in $x - x_0$ is explicitly written down.

Proof. First, one needs to find the asymptotic expansion for the action $\sigma(t,x_0,p_0)$ defined by (2.106). For a Hamiltonian of the form (2.82), one gets that $\sigma(t,x_0,p_0)$ equals
\[ \int_0^t \Big[\frac12\big(G(X(\tau,x_0,p_0))P(\tau,x_0,p_0),\,P(\tau,x_0,p_0)\big) + V(X(\tau,x_0,p_0))\Big]\,d\tau. \]
Using the expansions of Proposition 2.5.1, one obtains
\[ \sigma(t,x_0,p_0) = \frac{1}{t}\Big[\frac12\big(p_0 t,\,G(x_0)p_0 t\big) + V(x_0)t^2 + \sum_{j=3}^{k} P_j(t,tp_0) + O(c+t)^{k+1}\Big], \]
where the $P_j$ are polynomials of degree $\le j$ in $p_0$. Inserting the asymptotic expansion (2.98) for $p_0(t,x,x_0)$ in this formula yields (2.116).

Remark 35. Of course, one can calculate the coefficients of the expansion (2.116) directly from the Hamilton–Jacobi equation, without solving the boundary-value problem. However, the well-posedness of the boundary-value problem explains why the asymptotic expansion has such a form, and it justifies the formal calculation of its coefficients by means of, e.g., the method of undetermined coefficients.

Remark 36. Similarly (but with considerably more computational effort), for the Hamiltonian (2.103) one finds that the main term of the small-time asymptotics of the two-point function has the form
\[ S_0(t,x,y,x_0,y_0) = \frac{1}{2t}\big(y - y_0,\; g^{-1}(x_0)(y - y_0)\big) + \frac{6}{t^3}\big(g^{-1}(x_0)z,\,z\big), \]
with
\[ z = \alpha^{-1}(x_0)(x - x_0) + \frac{t}{2}\big[y + y_0 + 2\alpha^{-1}(x_0)a(x_0)\big]. \]
This expression is mostly known due to its appearance in the Green function of the so-called Kolmogorov diffusion.


2.7 Hamilton–Jacobi–Bellman equation and optimal control

Theorem 2.6.1 showed that the two-point function $S(t,x,x_0)$ satisfies the Hamilton–Jacobi equation (2.109) locally. From these local solutions, one can also derive the solutions to the corresponding Cauchy problem:
\[ \frac{\partial S}{\partial t} + H\Big(x,\frac{\partial S}{\partial x}\Big) = 0, \qquad S(0,x) = S_0(x). \tag{2.117} \]
For this purpose, a different field of characteristics has to be employed. Namely, assuming that $S_0$ is differentiable, let us consider the family of characteristics that exit from all points $\xi \in \mathbb{R}^n$ with the momentum $p(\xi) = \frac{\partial S_0}{\partial x}(\xi)$ and the corresponding mapping $Y(t,\xi) = X(t,\xi,p(\xi))$.

Proposition 2.7.1. Let $H$ be a quadratic Hamiltonian of the type (2.82) with $G, A, V \in C^2(\mathbb{R}^n)$, and let $S_0 \in C^2(\mathbb{R}^n)$. Then there exists $t_0$ such that the mappings $\xi \mapsto Y(t,\xi)$ are diffeomorphisms $\mathbb{R}^n \to \mathbb{R}^n$ for all $t \in [0,t_0]$.

Proof. Let us denote $\|\nabla S\| = \sup_x \big\|\frac{\partial S_0}{\partial x}\big\|$. According to (2.86), if $t_0$ is chosen such that $\tan(Kt_0)\|\nabla S\| < 1$, then the solution (2.85) exists for all $p_0 = \frac{\partial S_0}{\partial \xi}$, $s \le t_0$, and is bounded in norm by
\[ C = \frac{\|\nabla S\| + \tan(Kt_0)}{1 - \tan(Kt_0)\|\nabla S\|}. \]
Hence, by (2.86), for $t \le t_0$,
\[ |Y(t,\xi) - \xi| \le K(1+C)t_0. \]
Following the same arguments as in the proof of Lemma 2.5.1, we find that
\[ \frac{\partial X}{\partial \xi}(s) = 1 + O(s), \qquad \frac{\partial X}{\partial p_0}(s) = O(s), \]
so that
\[ \frac{\partial Y}{\partial \xi}(s) = \frac{\partial X}{\partial \xi}(s) + \frac{\partial X}{\partial p_0}(s)\,\frac{\partial^2 S_0}{\partial \xi^2} = 1 + O(s). \]
Consequently, for small enough $t_0$ and $t < t_0$, the determinant of $\frac{\partial Y}{\partial \xi}(t)$ is uniformly bounded from below and above. Hence the mappings $\xi \mapsto Y(t,\xi)$ are local diffeomorphisms. Moreover, since
\[ \big(Y(t,\xi_1) - Y(t,\xi_2),\; \xi_1 - \xi_2\big) = \Big(\xi_1 - \xi_2,\; \int_0^1 \frac{\partial Y}{\partial \xi}\big(t,\xi_2 + h(\xi_1 - \xi_2)\big)(\xi_1 - \xi_2)\,dh\Big), \]


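it follows that the mappings $\xi \mapsto Y(t,\xi)$ are injective. Finally, since $Y(t,\xi) \to \infty$ as $\xi \to \infty$, it follows that the image of $Y(t,\cdot)$ is both open and closed in $\mathbb{R}^n$, and therefore coincides with $\mathbb{R}^n$.

Theorem 2.7.1. Under the assumptions of Proposition 2.7.1, the function
\[ S(t,x) = \big[S_0(\xi) + S(t,x,\xi)\big]\big|_{\xi=\xi(t,x)} \tag{2.118} \]
is the unique smooth solution to the Cauchy problem (2.117) for $t \le t_0$, where $\xi(t,x)$ is the inverse function to $Y(t,\xi)$.

Proof. (i) Let us prove that $S$ is a solution. Firstly,
\[ \frac{\partial S}{\partial x}(t,x) = \frac{\partial S(t,x,\xi)}{\partial x}\Big|_{\xi=\xi(t,x)} + \Big[\frac{\partial S_0}{\partial \xi} + \frac{\partial S(t,x,\xi)}{\partial \xi}\Big]\Big|_{\xi=\xi(t,x)}\frac{\partial \xi}{\partial x}. \]
The second term vanishes due to (2.112), which implies
\[ \frac{\partial S}{\partial x}(t,x) = \frac{\partial S(t,x,\xi)}{\partial x}\Big|_{\xi=\xi(t,x)}. \]
Next, we find
\[ \frac{\partial S}{\partial t}(t,x) = \frac{\partial S(t,x,\xi)}{\partial t}\Big|_{\xi=\xi(t,x)} + \Big[\frac{\partial S_0}{\partial \xi} + \frac{\partial S(t,x,\xi)}{\partial \xi}\Big]\Big|_{\xi=\xi(t,x)}\frac{\partial \xi}{\partial t}. \]
The second term again vanishes due to (2.112). Hence
\[ \frac{\partial S}{\partial t}(t,x) = -H\Big(x,\frac{\partial S(t,x,\xi)}{\partial x}\Big|_{\xi=\xi(t,x)}\Big) = -H\Big(x,\frac{\partial S}{\partial x}(t,x)\Big), \]
as required.

(ii) Let $g(t,x)$ be a smooth solution to the Cauchy problem (2.117) for $t \le t_0$. The main point is to show that the spatial gradient of $g$ coincides with the momentum on the corresponding characteristics:
\[ \frac{\partial g}{\partial x}\Big(t, X\Big(t,\xi,p_0 = \frac{\partial S_0}{\partial \xi}\Big)\Big) = P\Big(t,\xi,p_0 = \frac{\partial S_0}{\partial \xi}\Big). \tag{2.119} \]
Fixing $\xi$, let $f(t)$ and $p(t)$ denote the l.h.s. and the r.h.s. of this equation, respectively. First of all, we find $f(0) = p(0) = p_0 = \frac{\partial S_0}{\partial \xi}$, because $g(0,x) = S_0(x)$. Next, the Hamiltonian equations yield
\[ \dot p(t) = -\frac{\partial H}{\partial x}\big(X(t,\xi,p_0),p(t)\big). \]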


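Moreover, we find
\[ \dot f(t) = \frac{\partial^2 g}{\partial x\,\partial t}(t,x)\Big|_{x=X(t,\xi,p_0)} + \frac{\partial^2 g}{\partial x^2}\big(t,X(t,\xi,p_0)\big)\,\frac{\partial H}{\partial p}\big(X(t,\xi,p_0),p(t)\big). \]
Due to
\[ \frac{\partial^2 g}{\partial x\,\partial t}(t,x) = -\frac{\partial}{\partial x}H\Big(x,\frac{\partial g}{\partial x}(t,x)\Big) = -\frac{\partial H}{\partial x}\Big(x,\frac{\partial g}{\partial x}(t,x)\Big) - \frac{\partial^2 g}{\partial x^2}(t,x)\,\frac{\partial H}{\partial p}\Big(x,\frac{\partial g}{\partial x}(t,x)\Big), \]
it follows that
\[ \dot f(t) = -\frac{\partial H}{\partial x}(x,f(t)) + \frac{\partial^2 g}{\partial x^2}(t,x)\Big[\frac{\partial H}{\partial p}(x,p(t)) - \frac{\partial H}{\partial p}(x,f(t))\Big], \]
with $x = X(t,\xi,p_0)$. Since the equation for $p(t)$ can be equivalently rewritten in the form
\[ \dot p(t) = -\frac{\partial H}{\partial x}(x,p(t)) + \frac{\partial^2 g}{\partial x^2}(t,x)\Big[\frac{\partial H}{\partial p}(x,p(t)) - \frac{\partial H}{\partial p}(x,p(t))\Big], \]
again with $x = X(t,\xi,p_0)$, the functions $f(t)$ and $p(t)$ satisfy the same ODE, and hence coincide. Consequently,
\[ \frac{\partial S}{\partial x}(t,x) = \frac{\partial g}{\partial x}(t,x) \]
at all points $t \le t_0$, $x$. Therefore, we also find
\[ \frac{\partial g}{\partial t}(t,x) = -H\Big(x,\frac{\partial g}{\partial x}(t,x)\Big) = -H\Big(x,\frac{\partial S}{\partial x}(t,x)\Big) = \frac{\partial S}{\partial t}(t,x), \]
which implies $g(t,x) = S(t,x)$.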
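The construction of Theorem 2.7.1 can be checked numerically in the simplest one-dimensional case. The following Python sketch is only an illustration under stated assumptions: it takes the free Hamiltonian H(x, p) = p²/2 (assumed here to be of the type (2.82) with G = 1, A = 0, V = 0), for which X(t, ξ, p) = ξ + tp and the two-point function is (x − ξ)²/(2t). The initial function S0 = cos and all function names are hypothetical choices, not taken from the text.

```python
import math

def S0(xi):
    # Hypothetical smooth initial condition (an assumption of this sketch).
    return math.cos(xi)

def dS0(xi):
    # Initial momentum p(xi) = S0'(xi).
    return -math.sin(xi)

def Y(t, xi):
    # For H = p^2/2 the momentum is conserved, so X(t, xi, p) = xi + t*p.
    return xi + t * dS0(xi)

def xi_of(t, x, iterations=60):
    # Inverts xi -> Y(t, xi) by bisection. Valid for 0 <= t < 1, where
    # dY/dxi = 1 - t*cos(xi) > 0 and Y(t, x - 1) < x < Y(t, x + 1).
    lo, hi = x - 1.0, x + 1.0
    for _ in range(iterations):
        mid = 0.5 * (lo + hi)
        if Y(t, mid) < x:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)

def two_point(t, x, xi):
    # Two-point function of the free Hamiltonian H = p^2/2.
    return (x - xi) ** 2 / (2.0 * t)

def S(t, x):
    # Formula (2.118): S0(xi) + S(t, x, xi) evaluated at xi = xi(t, x).
    xi = xi_of(t, x)
    return S0(xi) + two_point(t, x, xi)

def hj_residual(t, x, h=1e-4):
    # Central-difference approximation of dS/dt + (dS/dx)^2 / 2.
    S_t = (S(t + h, x) - S(t - h, x)) / (2 * h)
    S_x = (S(t, x + h) - S(t, x - h)) / (2 * h)
    return S_t + 0.5 * S_x ** 2

def gradient_gap(t, x, h=1e-4):
    # Difference between dS/dx and the conserved momentum, cf. (2.119).
    S_x = (S(t, x + h) - S(t, x - h)) / (2 * h)
    return S_x - dS0(xi_of(t, x))
```

For this choice of S0, the map ξ → Y(t, ξ) stops being monotone at t = 1, so the sketch is meaningful only for t < 1, in line with the local nature of the theorem.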



A very special class of Hamiltonians is given by functions that are linear with respect to one of their two variables. For instance, if $H(x,p) = -(p,f(x))$ is linear with respect to $p$, then $p\dot x = -pf(x) = H(x,p)$ and hence $\sigma(t,x_0,p_0) = 0$. Therefore, the two-point function $S(t,x,x_0)$ also vanishes. Moreover, the first equation of the Hamiltonian system, $\dot x = \frac{\partial H}{\partial p} = -f(x)$, does not depend on $p$, so that $X(t,x_0,p_0)$ does not depend on $p_0$ and is the solution to the ODE $\dot x = -f(x)$ with the initial condition $x_0$. The following fact is a specification of Theorem 2.7.1 for the case of a Hamiltonian that is linear in $p$.

Theorem 2.7.2. In case $H(x,p) = -(p,f(x))$ with $f \in C^1(\mathbb{R}^d)$, the mapping $\xi \mapsto Y(t,\xi)$ from Proposition 2.7.1 is given by the solution to the ODE $\dot x = -f(x)$


with the initial condition $\xi$ and is a diffeomorphism for any $t \in \mathbb{R}$. The solution (2.118) is globally defined (for all $t \in \mathbb{R}$) and equals
\[ S(t,x) = S_0(\xi)\big|_{\xi=\xi(t,x)}, \tag{2.120} \]
where $\xi(t,x)$ is the inverse mapping to $\xi \mapsto Y(t,\xi)$, which is also given as the solution $\xi(t,x)$ to the ODE $\dot\xi = f(\xi)$ with the initial condition $x$.

In Section 2.10, we shall derive this important result for a more general infinite-dimensional setting in a different way, independent of the theory of Hamiltonian systems.

Unlike the case of Hamiltonians that are linear in $p$, the mapping $\xi \mapsto X(t,\xi,p_0 = \frac{\partial S_0}{\partial \xi})$ usually does not remain a diffeomorphism for all $t$, which implies that the classical (smooth) solutions to the Cauchy problem (2.117) fail to exist globally, and that one has to resort to some kind of generalized solutions. Equation (2.115) implies that, for non-degenerate quadratic Hamiltonians, solution (2.118) can be rewritten in the following insightful form:
\[ S(t,x) = \min_{\xi}\,[S_0(\xi) + S(t,x,\xi)]. \tag{2.121} \]

This has numerous interesting consequences. First, since this form does not depend on the derivatives, it can be used for defining generalized solutions by approximations, i.e., as a limit of sequences of classical solutions. Another approach can be derived from the observation that the mapping $S_0 \mapsto S_t = S(t,\cdot)$ given by (2.121) can be considered linear in the exotic (max, +)-structure (also referred to as tropical algebra), i.e., $\min(S_0^1,S_0^2) \mapsto \min(S_t^1,S_t^2)$. Therefore, the methods of linear equations can be applied in this case and one can define generalized solutions by duality in the sense of the corresponding (max, +)-'generalized functions'. Finally, the representation (2.121) is convenient for developing the methods of viscosity solutions. References for these developments are provided in Section 2.19.

In the theory of optimization, one of the basic equations is the so-called Bellman equation. It has the form
\[ \frac{\partial S}{\partial t} + \sup_{u\in U}\Big[\Big(g(x,u),\frac{\partial S}{\partial x}\Big) - J(x,u)\Big] = 0, \tag{2.122} \]
with some functions $g(x,u)$, $J(x,u)$ and the parameter $u$ taken from some set $U$. This equation is nothing but the Hamilton–Jacobi equation (2.109) with the specific Hamiltonian
\[ H(x,p) = \sup_{u\in U}\,[(g(x,u),p) - J(x,u)]. \tag{2.123} \]
Therefore, the equations (2.109) are often referred to as the Hamilton–Jacobi–Bellman equations, or shortly HJB-equations.
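To make the passage from (2.122) to the Hamiltonian (2.123) concrete, here is a small Python sketch for a toy one-dimensional control problem. All specific choices are assumptions made for illustration only: the velocity g(x, u) = u, the charge J(x, u) = u²/2, the control set U = [−1, 1], and the grid approximation of the supremum. For these choices the supremum can be computed by hand, which gives a closed form to compare against.

```python
def bellman_hamiltonian(p, controls):
    # H(p) = sup_u [g(u)*p - J(u)] with the toy choices g(u) = u, J(u) = u^2/2.
    # The supremum over U is approximated by a maximum over a finite grid.
    return max(u * p - 0.5 * u * u for u in controls)

# Uniform grid on the (hypothetical) control set U = [-1, 1].
CONTROLS = [-1.0 + i / 1000.0 for i in range(2001)]

def closed_form(p):
    # For |p| <= 1 the maximizer is u = p; otherwise it sits on the boundary.
    return 0.5 * p * p if abs(p) <= 1.0 else abs(p) - 0.5

def is_midpoint_convex(p, q):
    # A supremum of functions that are affine in p is convex in p.
    mid = bellman_hamiltonian(0.5 * (p + q), CONTROLS)
    ends = 0.5 * (bellman_hamiltonian(p, CONTROLS) + bellman_hamiltonian(q, CONTROLS))
    return mid <= ends + 1e-12
```

For |p| ≤ 1 this Hamiltonian coincides with the quadratic one, p²/2, while the bounded control set makes it grow only linearly for large |p|.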


Remark 37. Notice that any function $H(x,p)$ that is convex in $p$ can be written in the form (2.123). Therefore, the Bellman equations and the Hamilton–Jacobi equations (with a convex Hamiltonian) represent in fact the same class of equations.

Remark 38. For the sake of completeness, let us recall the basic heuristic derivation of the Bellman equation (2.122). Assume that an agent has a position $x \in \mathbb{R}^d$ at the time $t$. Also assume that the position will be moved to a new position $x(t+\tau)$ during a small time $\tau$ according to the ODE $\dot x = g(x,u)$, where the 'control parameter' $u$ can be chosen by the agent from some given set $U$. Moreover, the agent has to pay a charge of $J(x,u)$ per unit of time during the transition. Furthermore, let us assume that at the next time $t+\tau$ the agent can choose another $u$ for the next transition during the time interval $[t+\tau,t+2\tau]$ and so on, until the terminal time $T$ is reached, where the agent receives the award $V_T(x(T))$ that depends on the final position. Then for the total optimal payoff $S(t,x)$ of the agent starting at $x$ at the time $t$, we can write the following approximate equation (in the first order for small $\tau$):
\[ S(t,x) = \sup_{u\in U}\,[S(t+\tau,x+g(x,u)\tau) - J(x,u)\tau]. \]
Expanding $S$ by the first-order Taylor expansion yields the approximation
\[ S(t,x) = \sup_{u\in U}\Big[S(t,x) + \frac{\partial S}{\partial t}(t,x)\tau + \frac{\partial S}{\partial x}(t,x)g(x,u)\tau - J(x,u)\tau\Big]. \]
Cancelling $S(t,x)$ from both sides and dividing by $\tau$ yields (2.122).

Equation (2.122) arises from a control problem, when an agent can control the velocity of its movement. A different class of Bellman equations arises when the motion of an agent is not smooth, but subject to possible jumps. Namely, if the number of allowed jumps from a position $x \in \mathbb{R}^d$ is finite and if the jumps are given by some functions $x \mapsto y_1(x),\dots,y_m(x)$ with the intensities $u_j\nu_j(x)$ controlled by an agent via the control parameters $u = (u_1,\dots,u_m) \in U$, the equation for the optimal payoff gets the form
\[ \frac{\partial S}{\partial t} + \sup_{u\in U}\Big[\sum_{j=1}^{m} u_j\nu_j(x)\big(S(t,y_j(x)) - S(t,x)\big) - J(x,u)\Big] = 0, \tag{2.124} \]
which is called the Bellman equation for controlled jump-processes.

Remark 39. The heuristic derivation of Remark 38 is modified for the jump-type dynamics as follows: Saying that possible jumps $x \mapsto y_1(x),\dots,y_m(x)$ occur with the rates $u_j\nu_j(x)$ per unit time means that the probability of having a jump in a small time $\tau$ equals approximately $R = \tau\sum_j u_j\nu_j(x)$, and the probability to have


the jump $x \mapsto y_j(x)$ (when a jump occurs) is $\tau u_j\nu_j(x)/R$. Under this assumption, the approximate equation for the optimal payoff becomes
\[ S(t,x) = \sup_{u\in U}\Big[\tau\sum_{j=1}^{m} u_j\nu_j(x)S(t+\tau,y_j(x)) - J(x,u)\tau + \Big(1 - \tau\sum_{j=1}^{m} u_j\nu_j(x)\Big)S(t+\tau,x)\Big]. \]
Expanding the change in small $\tau$ into a Taylor series yields (2.124).

More generally, if jumps are distributed over $\mathbb{R}^d$ with some controlled intensities $\nu(u,x,dy)$ (given by a transition kernel in $\mathbb{R}^d$ that depends on $u \in U$ as a parameter), one gets the more general Bellman equation for controlled jump-processes in the form

\[ \frac{\partial S}{\partial t} + \sup_{u\in U}\Big[\int \big(S(t,y) - S(t,x)\big)\,\nu(u,x,dy) - J(x,u)\Big] = 0. \tag{2.125} \]
From the very essence of optimal control problems, it is seen that the natural problem for Bellman equations is the backward Cauchy problem, where the solutions of (2.124) or (2.125) are sought for times $t \le T$ under the additional terminal conditions $S(T,x) = V_T(x)$ with a given $V_T(x)$.

Remark 40. The same backward problem is natural in stochastic analysis when one is concerned with the value of some payoff (say, a financial obligation) at the time $t$, where the payoff value depends on the position of the process in the future time $T > t$. This remark provides an additional motivation for the study of backward problems (and thus the backward propagators of Chapter 4).

The equations (2.125) may be regarded as being simpler than the HJB-equation (2.122), since they are not PDEs, but ODEs. However, they illustrate the importance of the abstract Banach-space setting for ODEs, because equation (2.125) cannot be written in the form $\dot x = F(t,x)$ for a function $F(t,x)$ in any finite-dimensional Euclidean space. On the other hand, equation (2.125) is an equation of the type $\dot S + F(t,S) = 0$, with $S$ being an element of the functional Banach space $C(\mathbb{R}^d)$. Hence, we get the following well-posedness result for the Bellman equation for jump processes as a direct application of Theorems 2.2.1 and 2.2.2 or of Proposition 2.2.1.

Theorem 2.7.3. (i) Let the transition kernels $\nu$ be weakly continuous (for any $u$) and uniformly bounded, i.e.,
\[ \sup_{u\in U}\,\sup_{x\in\mathbb{R}^d}\int \nu(u,x,dy) < \infty. \]
Then the backward Cauchy problem for equation (2.125) is well posed in $C(\mathbb{R}^d)$, i.e., for any $V_T \in C(\mathbb{R}^d)$ there exists a unique curve $S_t$ in $C(\mathbb{R}^d)$, $t \le T$, such that (2.125) holds and $S(T,x) = V_T(x)$. (Note that the derivative


in $t$ is defined with respect to the Banach topology of $C(\mathbb{R}^d)$.) Moreover, the solution depends Lipschitz-continuously on the terminal value $V_T$.

(ii) If additionally $\nu = \nu_\alpha$ depends Lipschitz-continuously on a parameter $\alpha$ from a Banach space $B_1$, so that
\[ \|\nu_\alpha(u,x,dy) - \nu_\beta(u,x,dy)\| \le \kappa\|\alpha - \beta\|, \]
then the solution $S_t$ also depends Lipschitz-continuously on $\alpha$.

We shall pick up the story of the jump-type Bellman equations in Chapter 7 in the context of forward-backward systems.
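The backward Cauchy problem for a jump-type Bellman equation can be illustrated on a toy analogue of (2.124) with only two states, where the equation becomes an ODE in R². The Python sketch below rests on assumptions introduced purely for illustration: the state space {0, 1}, a single allowed jump y(x) = 1 − x, a constant intensity, the control set U = {0, 1}, the charge J(x, u) = COST·u, and an explicit Euler scheme run backward from the terminal time. None of these choices come from the text.

```python
import math

NU = 1.0          # jump intensity nu(x), taken constant (hypothetical)
COST = 0.2        # running charge J(x, u) = COST * u (hypothetical)
T = 1.0           # terminal time
V_T = [0.0, 1.0]  # terminal payoff on the two states {0, 1}

def rhs(S):
    # dS/dt = -sup_{u in {0,1}} [u * NU * (S(y(x)) - S(x)) - COST * u],
    # where the only allowed jump is y(x) = 1 - x.
    out = []
    for x in (0, 1):
        gain = NU * (S[1 - x] - S[x]) - COST
        out.append(-max(0.0, gain))  # u = 0 contributes 0, u = 1 contributes gain
    return out

def solve_backward(steps=2000):
    # Explicit Euler steps from t = T down to t = 0: S(t - dt) = S(t) - dt * dS/dt.
    dt = T / steps
    S = list(V_T)
    for _ in range(steps):
        d = rhs(S)
        S = [S[i] - dt * d[i] for i in range(2)]
    return S

def exact_value_state0():
    # In state 0 jumping is always profitable for these data, which gives
    # S(0, 0) = (1 - COST/NU) * (1 - exp(-NU * T)) by solving a linear ODE.
    return (1.0 - COST / NU) * (1.0 - math.exp(-NU * T))
```

In state 1 the agent never pays for a jump, so the value stays equal to the terminal payoff, while in state 0 the value at time 0 approaches the closed-form expression as the step size decreases.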

2.8 Sensitivity of integral equations

We continue the development of the general theory of ODEs by addressing the issue of sensitivity or smooth dependence of the solutions on the initial data or parameters. In order to present sensitivity results for various equations in a unified way, we shall discuss here the smooth dependence of the fixed points on initial data and parameters for a special class of integral mappings $\Phi$, namely for the operators $\Phi_Y$ in $C([\tau,t],B)$ of the form
\[ [\Phi_{Y,\tau}(\mu_\cdot)](t) = G_{t,\tau}Y + \int_\tau^t \Omega_{t,s}(\mu_s)\,ds, \tag{2.126} \]
where $G_{t,s}$ and $\Omega_{t,s}$, $t > s$, are families of linear operators and (possibly) nonlinear mappings in $B$, respectively, such that $G_{t,t}$ is the identity. As usual, we write $\|\cdot\|$ for the norm $\|\cdot\|_B$. In this section, we shall deal with uniformly bounded $\Omega$. Later on, we shall discuss an extension for $\Omega$ having a singularity at $t = s$.

Therefore, assuming the existence of the unique fixed points $\mu_{t,\tau}(Y) \in C([\tau,T],B)$ of these mappings for any $Y \in B$, $T > \tau$, we are interested in the derivatives
\[ \xi_t = \xi_{t,\tau}(Y)[\xi] = D\mu_{t,\tau}(Y)[\xi] \tag{2.127} \]
in some direction $\xi$, where $D$ denotes the derivative with respect to the argument from $B$ (for fixed $t,\tau$) throughout this section. Also, $B_R$ denotes the ball of radius $R$ in $B$ centered at zero.

As an important hint for obtaining these derivatives, we note that if they are well defined, then they satisfy the equation
\[ \xi_t = G_{t,\tau}\xi + \int_\tau^t D\Omega_{t,s}(\mu_{s,\tau})[\xi_s]\,ds, \tag{2.128} \]
which is obtained by differentiation from the fixed-point equation
\[ \mu_{t,\tau} = G_{t,\tau}Y + \int_\tau^t \Omega_{t,s}(\mu_{s,\tau})\,ds. \tag{2.129} \]


Theorem 2.8.1. Let $G_{t,s}$ and $\Omega_{t,s}$ depend continuously on $t$ and be measurable in $s$. Let equation (2.129) be well posed in the following sense: for any $T, R > 0$ there exists $M_T(R) > 0$ such that for any $Y \in B_R$, $\tau \in [0,T]$, there exists a unique solution to equation (2.129) for $t \in [\tau,T]$ such that $\|\mu_\cdot\|_{C([\tau,T],B)} \le M_T(R)$. Let $G_{t,t}$ be the identity operator and
\[ \|G_{t,\tau}\|_{B\to B} \le G, \tag{2.130} \]
with some constant $G \ge 1$. Moreover, let $\Omega_{t,s} \in C^1_{uc}(B_R,B)$ uniformly in $t,s$ for any $R$. In other words, for any $\epsilon, R$, there exist $L(R)$ and $\delta = \delta(R,\epsilon)$ such that
\[ \|D\Omega_{t,s}(\mu^1)\|_{B\to B} \le L(R), \qquad \|D\Omega_{t,s}(\mu^1) - D\Omega_{t,s}(\mu^2)\|_{B\to B} \le \epsilon, \tag{2.131} \]
for all $t,s$, and any $\mu^1,\mu^2$ such that $\mu^1 \in B_R$, $\|\mu^1 - \mu^2\| \le \delta$.

Then the mapping $\mu_\tau = Y \mapsto \mu_{t,\tau}(Y)$ belongs to $C^1_{uc}(B_R,B)$ for all $t$, $\xi_t = D\mu_{t,\tau}(Y)[\xi]$ represents the unique solution to equation (2.128) and
\[ \|\xi_t\| \le G e^{(t-\tau)L(M_T(R))}\|\xi\|, \qquad 0 \le \tau \le t \le T. \tag{2.132} \]

Proof. First of all, subtracting the equations for $\mu_{t,\tau}(Y_1)$ and $\mu_{t,\tau}(Y_2)$ yields
\[ \|\mu_{t,\tau}(Y_1) - \mu_{t,\tau}(Y_2)\| \le G\|Y_1 - Y_2\| + \int_\tau^t L(M_T(R))\,\|\mu_{s,\tau}(Y_1) - \mu_{s,\tau}(Y_2)\|\,ds, \]
which implies
\[ \|\mu_{t,\tau}(Y_1) - \mu_{t,\tau}(Y_2)\| \le G e^{(t-\tau)L(M_T(R))}\|Y_1 - Y_2\|, \qquad Y_1,Y_2 \in B_R, \tag{2.133} \]
by Gronwall's lemma. Next, (2.52) and (2.53) ensure that the Cauchy problem for the linear equation (2.128) is well posed for any $\xi$, and that it defines a unique $\xi_t$ that satisfies (2.132).

In order to show that this solution defines the derivative $D\mu_{t,\tau}(Y)[\xi]$, we have to prove that $\|\phi_t\| \le \epsilon\|\xi\|$ for any $\epsilon$ and $\xi$ whenever $\|\xi\| \le \delta$ with sufficiently small $\delta$, where
\[ \phi_t = \mu_{t,\tau}(Y+\xi) - \mu_{t,\tau}(Y) - \xi_t. \]
For this, the idea is to find the equation for $\phi$ and then estimate its solution. Subtracting from the integral equation for $\mu_{t,\tau}(Y+\xi)$ the integral equations for $\mu_{t,\tau}(Y)$ and $\xi_t$ yields
\[ \phi_t = \int_\tau^t \big[\Omega_{t,s}(\mu_{s,\tau}(Y+\xi)) - \Omega_{t,s}(\mu_{s,\tau}(Y)) - D\Omega_{t,s}(\mu_{s,\tau}(Y))[\xi_s]\big]\,ds, \]


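and adding and subtracting $D\Omega_{t,s}(\mu_{s,\tau}(Y))[\mu_{s,\tau}(Y+\xi) - \mu_{s,\tau}(Y)]$ leads to
\[ \phi_t = \int_\tau^t D\Omega_{t,s}(\mu_{s,\tau}(Y))[\phi_s]\,ds + \int_\tau^t \Big(\Omega_{t,s}(\mu_{s,\tau}(Y+\xi)) - \Omega_{t,s}(\mu_{s,\tau}(Y)) - D\Omega_{t,s}(\mu_{s,\tau}(Y))[\mu_{s,\tau}(Y+\xi) - \mu_{s,\tau}(Y)]\Big)\,ds. \tag{2.134} \]
For any $\epsilon$, let us choose $\delta \le 1$ such that (1.58) holds for $F = \Omega_{t,s}$, that is
\[ \|\Omega_{t,s}(Z+\xi) - \Omega_{t,s}(Z) - D\Omega_{t,s}(Z)[\xi]\| \le \epsilon\|\xi\| \quad\text{for}\quad \|\xi\| \le \delta,\ \|Z\| \le M_T(R+1). \]
By (2.133), if $\|\xi\| \le G^{-1}e^{-(T-\tau)L(M_T(R))}\delta$, then
\[ \|\mu_{t,\tau}(Y+\xi) - \mu_{t,\tau}(Y)\| \le G e^{(t-\tau)L(M_T(R))}\|\xi\| \le \delta, \]
so that, for $s \in [\tau,t]$,
\[ \big\|\Omega_{t,s}(\mu_{s,\tau}(Y+\xi)) - \Omega_{t,s}(\mu_{s,\tau}(Y)) - D\Omega_{t,s}(\mu_{s,\tau}(Y))[\mu_{s,\tau}(Y+\xi) - \mu_{s,\tau}(Y)]\big\| \le \epsilon\|\mu_{s,\tau}(Y+\xi) - \mu_{s,\tau}(Y)\| \le \epsilon G e^{(t-\tau)L(M_T(R))}\|\xi\|. \tag{2.135} \]
Consequently, (2.134) implies
\[ \|\phi_t\| \le \int_\tau^t L(M_T(R))\|\phi_s\|\,ds + \epsilon(t-\tau)G e^{(t-\tau)L(M_T(R))}\|\xi\|. \]
By Gronwall's lemma, this implies
\[ \sup_{s\in[\tau,t]}\|\phi_s\| \le \epsilon(t-\tau)G e^{2(t-\tau)L(M_T(R))}\|\xi\|, \tag{2.136} \]
thus showing that $\xi_t = D\mu_{t,\tau}(Y)[\xi]$.

It remains to show that $\xi_t = D\mu_{t,\tau}(Y)[\xi]$ is a uniformly continuous function of $Y \in B_R$. Let $\xi_t^1 = D\mu_{t,\tau}(Y_1)[\xi]$ and $\xi_t^2 = D\mu_{t,\tau}(Y_2)[\xi]$. Subtracting the respective equations (2.128) for $\xi_t^1$ and $\xi_t^2$ yields
\[ \|\xi_t^1 - \xi_t^2\| \le \int_\tau^t \big\|D\Omega_{t,s}(\mu_{s,\tau}(Y_1))[\xi_s^1 - \xi_s^2]\big\|\,ds + \int_\tau^t \big\|\big(D\Omega_{t,s}(\mu_{s,\tau}(Y_1)) - D\Omega_{t,s}(\mu_{s,\tau}(Y_2))\big)[\xi_s^2]\big\|\,ds \]
and thus
\[ \|\xi_t^1 - \xi_t^2\| \le L(M_T(R))\int_\tau^t \|\xi_s^1 - \xi_s^2\|\,ds + (t-\tau)G e^{(t-\tau)L(M_T(R))}\|\xi\|\sup_{s\in[\tau,t]}\big\|D\Omega_{t,s}(\mu_{s,\tau}(Y_1)) - D\Omega_{t,s}(\mu_{s,\tau}(Y_2))\big\|. \tag{2.137} \]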


The last term can be made arbitrarily small for small $\|Y_1 - Y_2\|$, because $F = \Omega_{t,s} \in C^1_{uc}(B_R,B)$. Again by Gronwall's lemma, $\|\xi_t^1 - \xi_t^2\|$ can therefore be made arbitrarily small for small $\|Y_1 - Y_2\|$, as required.

The next statement provides an alternative proof of Theorem 2.8.1 under a slightly stronger assumption: Lipschitz continuity of $D\Omega$ rather than simple continuity. The idea is to show that the derivative $\xi_t$ can be obtained as the limit of the derivatives $\xi_t^n = D[\Phi^n_{Y,\tau}](t)[\xi]$ of the approximations that are used in the proof of Theorem 2.1.1 in order to obtain a fixed point for the mapping $\Phi$, given by (2.126). From the recursion formula for $\Phi^n_t$ and Proposition 1.5.1, it follows that all $\xi_t^n$ are well defined and satisfy the recursion formula
\[ \xi_t^n = \xi + \int_0^t D\Omega_{t,s}\big([\Phi^{n-1}_{Y,\tau}](t)\big)[\xi_s^{n-1}]\,ds, \tag{2.138} \]
whenever $\Omega_{t,s}$ is differentiable.

Theorem 2.8.2. Under the assumptions of Theorem 2.8.1, let us additionally suppose that $\Omega_{t,s} \in C^1_{bLip}(B_R,B)$ uniformly in $t,s$ for any $R$, so that (2.131) is improved to
\[ \|D\Omega_{t,s}(\mu^1)\|_{B\to B} \le L(R), \qquad \|D\Omega_{t,s}(\mu^1) - D\Omega_{t,s}(\mu^2)\|_{B\to B} \le L_D(R)\|\mu^1 - \mu^2\|, \tag{2.139} \]
for all $\mu^1,\mu^2 \in B_R$ and some constants $L(R)$ and $L_D(R)$.

Then the mapping $\mu_\tau = Y \mapsto \mu_{t,\tau}(Y)$ belongs to $C^1_{bLip}(B_R,B)$ for all $t,\tau$. Moreover, $\xi_t = D\mu_{t,\tau}(Y)[\xi]$ is the unique solution to equation (2.128). It satisfies (2.132) and is the limit of the approximations (2.138).

Proof. First of all, it follows from (2.130) and (2.139) that
\[ \big\|[\Phi_{Y,\tau}(\mu^1_\cdot)](t) - [\Phi_{Y,\tau}(\mu^2_\cdot)](t)\big\| \le L(M_T(R))\int_\tau^t \|\mu^1_\cdot - \mu^2_\cdot\|_{C([\tau,s],B)}\,ds, \qquad \big\|[\Phi_{Y_1,\tau}(\mu_\cdot)](t) - [\Phi_{Y_2,\tau}(\mu_\cdot)](t)\big\| \le G\|Y_1 - Y_2\|. \tag{2.140} \]
By Theorem 2.1.1, this implies the convergence of the approximations $\Phi^n_Y$ to the unique fixed point $\mu_{t,\tau}(Y)$ of $\Phi_Y$. The well-posedness of the linear equation (2.128) as well as the estimate (2.132) follow as in Theorem 2.8.1.

In order to show that the solution $\xi_t$ to (2.128) yields the derivative of $\mu_{t,\tau}(Y)$, let us show that the derivatives $\xi_t^n$ of the approximations converge, where $\xi_t^0 = \xi$. This would imply that the limit $\xi_t$ is precisely $D\mu_{t,\tau}(Y)[\xi]$ and satisfies equation (2.128). In fact, if a sequence of functions converges uniformly, and if the sequence of their derivatives converges uniformly as well, then the limit of the sequence of the derivatives coincides with the derivative of the limit.


From (2.138), it follows that
\[ \|\xi_t^n\| \le G\|\xi\| + \int_\tau^t L(M_T(R))\|\xi_s^{n-1}\|\,ds, \]
which implies by Lemma 9.1.1 that all approximations are bounded by (2.132). In order to prove that the $\xi^n$ converge, we derive from (2.138) that
\[ \|\xi_t^{n+1} - \xi_t^n\| \le \int_\tau^t \big\|D\Omega_{t,s}([\Phi^n_{Y,\tau}(\mu_\cdot)](s))\big\|_{B\to B}\,\|\xi_s^n - \xi_s^{n-1}\|\,ds + \int_\tau^t \big\|D\Omega_{t,s}([\Phi^n_{Y,\tau}(\mu_\cdot)](s)) - D\Omega_{t,s}([\Phi^{n-1}_{Y,\tau}(\mu_\cdot)](s))\big\|_{B\to B}\,\|\xi_s^{n-1}\|\,ds. \]
For estimating the last term, we can use (2.4), which yields
\[ \big\|[\Phi^n_{Y,\tau}(\mu_\cdot)] - [\Phi^{n-1}_{Y,\tau}(\mu_\cdot)]\big\|_{C([\tau,t],B)} \le \frac{(t-\tau)^{n-1}L^{n-1}(M_T(R))}{(n-1)!}\,\|\Phi^1_Y(Y) - Y\|_{C([\tau,t],B)}. \]
Therefore, taking into account (2.139), we get
\[ \|\xi_t^{n+1} - \xi_t^n\| \le L(M_T(R))\int_\tau^t \|\xi_s^n - \xi_s^{n-1}\|\,ds + \|\xi\|G\exp\{(t-\tau)L(M_T(R))\}\,\frac{(t-\tau)^n L^{n-1}(M_T(R))L_D(M_T(R))}{n!}\,\|\Phi_Y(Y) - Y\|_{C([\tau,t],B)}. \]
Consequently, we find
\[ \|\xi_t^{n+1} - \xi_t^n\| \le L^2(M_T(R))\int_\tau^t ds\int_\tau^s \|\xi_{s_1}^{n-1} - \xi_{s_1}^{n-2}\|\,ds_1 + 2\|\xi\|G\exp\{(t-\tau)L(M_T(R))\}\,\frac{(t-\tau)^n L^{n-1}(M_T(R))L_D(M_T(R))}{n!}\,\|\Phi_Y(Y) - Y\|_{C([\tau,t],B)}. \]
By induction, it follows that
\[ \|\xi_t^{n+1} - \xi_t^n\| \le \|\xi_t^1 - \xi^0\|\,\frac{(L(t-\tau))^n}{n!} + n\,2\|\xi\|G\exp\{(t-\tau)L(M_T(R))\}\,\frac{(t-\tau)^n L^{n-1}(M_T(R))L_D(M_T(R))}{n!}\,\|\Phi_Y(Y) - Y\|_{C([\tau,t],B)}. \tag{2.141} \]
Therefore, the sequence $\xi_t^n$ converges in $C([\tau,T],B)$.
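The scheme behind Theorem 2.8.2, where the Picard approximations and their derivatives (2.138) are iterated together, can be tried out on a scalar toy problem. The Python sketch below is an illustration under assumptions that are not taken from the text: B = R, τ = 0, G_{t,s} = 1, Ω_{t,s}(μ) = sin μ (so that DΩ is bounded and Lipschitz), a uniform time grid, and trapezoidal quadrature. The closed-form flow of μ' = sin μ is used only as a reference for comparison.

```python
import math

def cumtrapz(values, dt):
    # Cumulative trapezoidal integral of grid values, starting from 0.
    out = [0.0]
    for a, b in zip(values, values[1:]):
        out.append(out[-1] + 0.5 * dt * (a + b))
    return out

def picard_with_sensitivity(Y, xi, T=1.0, n_grid=2000, n_iter=30):
    # Iterates mu^n = Y + int sin(mu^{n-1}) ds together with
    # xi^n = xi + int cos(mu^{n-1}) * xi^{n-1} ds, cf. (2.126) and (2.138).
    dt = T / n_grid
    mu = [Y] * (n_grid + 1)   # mu^0 = Y
    d = [xi] * (n_grid + 1)   # xi^0 = xi
    for _ in range(n_iter):
        new_mu = [Y + I for I in cumtrapz([math.sin(m) for m in mu], dt)]
        new_d = [xi + I for I in
                 cumtrapz([math.cos(m) * v for m, v in zip(mu, d)], dt)]
        mu, d = new_mu, new_d
    return mu[-1], d[-1]

def exact_flow(Y, T):
    # Closed-form solution of mu' = sin(mu), mu(0) = Y, for -pi < Y < pi.
    return 2.0 * math.atan(math.tan(0.5 * Y) * math.exp(T))

def exact_derivative(Y, T):
    # In one dimension, the derivative of the flow with respect to Y
    # equals sin(mu_T(Y)) / sin(Y) for Y != 0.
    return math.sin(exact_flow(Y, T)) / math.sin(Y)
```

With these data the iterates converge factorially fast, so thirty iterations are far more than needed; the remaining discrepancy to the reference values comes from the quadrature on the grid.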




As was mentioned in the comments after equation (2.16), the problem of sensitivity with respect to a parameter can be reduced to the problem of sensitivity with respect to initial data. So let us formulate the obtained result directly via this approach. For that purpose, we shall work with real parameters, since one can always reduce the general case to this particular one by using directional derivatives in the space of vector-valued parameters.

Instead of looking for fixed points of the single integral operator (2.126), let us now consider the family of such equations
\[ [\Phi_{Y,\tau;\alpha}(\mu_\cdot)](t) = G^\alpha_{t,\tau}Y + \int_\tau^t \Omega^\alpha_{t,s}(\mu_s)\,ds, \tag{2.142} \]
where $G^\alpha_{t,s}$ and $\Omega^\alpha_{t,s}$, $t \ge s$, are families of linear operators and (possibly) nonlinear mappings in $B$, respectively, that depend on a real parameter $\alpha$. Differentiating this equation for the fixed point $\mu^\alpha_{t,\tau}(Y)$, we find that if the derivative $\beta_t = \frac{\partial \mu^\alpha_{t,\tau}(Y)}{\partial\alpha}$ is well defined, then it satisfies the equation
\[ \beta_t = \frac{\partial G^\alpha_{t,\tau}}{\partial\alpha}Y + \int_\tau^t \Big[\frac{\partial \Omega^\alpha_{t,s}(\mu_{s,\tau})}{\partial\alpha} + D\Omega^\alpha_{t,s}(\mu_{s,\tau})[\beta_s]\Big]\,ds. \tag{2.143} \]

Theorem 2.8.3. Assume that $G^\alpha_{t,s}$ and $\Omega^\alpha_{t,s}$ satisfy all the assumptions of Theorem 2.8.1, with all estimates being uniform in $\alpha$. Moreover, let $\Omega^\alpha_{t,s}(\mu) \in C^1_{uc}(\mathbb{R}\times B_R,B)$ as a function of $(\alpha,\mu)$ and $G^\alpha_{t,s} \in C^1_{uc}(\mathbb{R},L(B,B))$ as a function of $\alpha$ uniformly in $t,s$ for any $R$. Then the mapping $\mu_\tau = Y \mapsto \mu^\alpha_{t,\tau}(Y)$ belongs to $C^1_{uc}(\mathbb{R}\times B_R,B)$ for all $t,\tau$, and $\beta_t = \frac{\partial \mu^\alpha_{t,\tau}(Y)}{\partial\alpha}$ is the unique solution to equation (2.143).

Higher-order derivatives of the fixed points of operators (2.126) with respect to initial data can be derived analogously. However, it is easier to obtain them by looking at the differentiability of equation (2.128) with respect to the parameter $\mu_\tau$. Looking at the directional derivative
\[ \eta_t = D^2\mu_{t,\tau}(Y)[\xi^1,\xi^2] = D\big(\xi_{t,\tau}(Y)[\xi^1]\big)[\xi^2] = \frac{d}{dh}\Big|_{h=0}\xi_{t,\tau}(Y + h\xi^2)[\xi^1], \]
differentiating (2.128) gives us the equation
\[ \eta_t = \int_\tau^t D\Omega_{t,s}(\mu_{s,\tau})[\eta_s]\,ds + \int_\tau^t D^2\Omega_{t,s}(\mu_{s,\tau})[\xi_s^1,\xi_s^2]\,ds, \tag{2.144} \]
where $\xi_t^j = D\mu_{t,\tau}[\xi^j]$. As a consequence of Theorem 2.8.3, we get the following.

Theorem 2.8.4. Under the assumptions of Theorem 2.8.1, let $\Omega_{t,s} \in C^2_{uc}(B_R,B)$ for all $R$. Then the mapping $\mu_\tau = Y \mapsto \mu_{t,\tau}(Y)$ belongs to $C^2_{uc}(B_R,B)$ for all $t,\tau$, and $\eta_t = D^2\mu_{t,\tau}(Y)[\xi^1,\xi^2]$ is the unique solution to equation (2.144).


2.9 ODEs in Banach spaces: sensitivity

As a consequence of Theorems 2.8.1 and 2.8.2, we get the following result on the sensitivity of ODEs.

Theorem 2.9.1. Let $F \in C^1_{luc}(B,B)$ for a Banach space $B$ (and consequently the assumptions of Theorem 2.2.1 hold). Then the mapping $\mu_0 = Y \mapsto \mu_t(Y)$ that is constructed in Theorem 2.2.1 belongs to $C^1_{luc}(B,B)$ for all $t$, and $\xi_t = D\mu_t(Y)[\xi]$ is the unique solution to the linear equation
\[ \xi_t = \xi + \int_0^t DF(\mu_s)[\xi_s]\,ds \iff \dot\xi_t = DF(\mu_t)[\xi_t] \ \text{and}\ \xi_0 = \xi, \tag{2.145} \]
with the initial condition $\xi_0 = \xi$. Moreover,
\[ \|\xi_t\| \le e^{|t|L}\|\xi\| \le \exp\{|t|\,\|F\|_{C^1(B)}\}\|\xi\|. \tag{2.146} \]
If additionally $F \in C^1_{uc}(B,B)$ or $F \in C^1_{bLip}(B,B)$, then the mapping $\mu_0 = Y \mapsto \mu_t(Y)$ belongs to $C^1_{uc}(B,B)$ or $C^1_{bLip}(B,B)$, respectively.

Proof. We apply Theorems 2.8.1 and 2.8.2 with $\Omega_{t,s} = F$. Moreover, the result for $t < 0$ is obtained by changing the variable $t$ to $-t$.

Similarly, the following result is a consequence of Theorem 2.8.4.

Theorem 2.9.2. Under the assumption of Theorem 2.2.1, let us additionally assume that $F \in C^2_{uc}(B_R,B)$ for any $R$. Then the mapping $Y \mapsto \mu_t(Y)$ belongs to $C^2_{uc}(B_R,B)$ for all $t$, and $\eta_t = D^2\mu_t(Y)[\xi^1,\xi^2]$ is the unique solution to the linear equation
\[ \dot\eta_t = D^2F(\mu_t)[\xi_t^1,\xi_t^2] + DF(\mu_t)[\eta_t], \tag{2.147} \]
with the initial condition $\eta_0 = 0$, where $\xi_t^i = D\mu_t(Y)[\xi^i]$, $i = 1,2$. If $F \in C^2_{bLip}(B_R,B)$, then the mapping $Y \mapsto \mu_t(Y)$ belongs to $C^2_{bLip}(B,B)$.

The key property of the differential $D\mu_t(Y)$ is that it transfers the vector field $F(Y) = \dot\mu_t(Y)|_{t=0}$ along its integral curves $\mu_t$:

Proposition 2.9.1. Under the assumptions of Theorem 2.9.1,
\[ F(\mu_t(Y)) = D\mu_t(Y)[F(Y)] \tag{2.148} \]
for any $t$ and $Y$. Moreover, $\mu_t(Y) \in C^1(K\times B)$ as a function of two variables for any bounded interval $K$ of $\mathbb{R}$.

Proof. The function $\xi_t = F(\mu_t(Y))$ satisfies the equation
\[ \dot\xi_t = DF(\mu)|_{\mu=\mu_t(Y)}[\xi_t], \]


with the initial condition $\xi_0 = F(Y)$. On the other hand, $z_t = D\mu_t(Y)[F(Y)]$ has the same initial data and satisfies the same equation, because
\[ \dot z_t = D(F\circ\mu_t)(Y)[F(Y)] = DF(\mu)|_{\mu=\mu_t(Y)}\big[D\mu_t(Y)[F(Y)]\big] \]
by the chain rule. Therefore, the uniqueness of the solution to (2.145) yields $\xi_t = D\mu_t(Y)[F(Y)]$, as required. The last statement follows from Proposition 1.5.2.

In the time-dependent case with $F(t,\mu_t)$ and the corresponding solutions $\mu_{t,s}(Y)$, we have
\[ F(t,Y) = \frac{\partial\mu_{t,s}(Y)}{\partial t}\Big|_{s=t} = -\frac{\partial\mu_{t,s}(Y)}{\partial s}\Big|_{t=s}, \]
and the natural question arises as to which of these derivatives is transferred along the solutions by the differential $D\mu_{t,s}(Y)$.

Proposition 2.9.2. Under the assumptions of Theorem 2.2.2 and assuming that $F(t,Y,\alpha) \in C^1_{luc}(B,B)$ as a function of $Y$ uniformly in $t$ and $\alpha$, it follows that
\[ \frac{\partial\mu_{t,s}(Y,\alpha)}{\partial s} = -D\mu_{t,s}(Y,\alpha)[F(s,Y,\alpha)]. \tag{2.149} \]

Proof. Since
\[ \mu_{t,s}(Y,\alpha) = Y + \int_s^t F(\tau,\mu_{\tau,s}(Y,\alpha),\alpha)\,d\tau, \]
it follows that $\eta_{t,s} = \frac{\partial\mu_{t,s}(Y,\alpha)}{\partial s}$ satisfies the equation
\[ \eta_{t,s} = -F(s,Y,\alpha) + \int_s^t DF(\tau,\mu_{\tau,s}(Y,\alpha),\alpha)[\eta_{\tau,s}]\,d\tau, \]
and consequently
\[ \frac{d}{dt}\eta_{t,s} = DF(t,\mu,\alpha)|_{\mu=\mu_{t,s}(Y,\alpha)}[\eta_{t,s}]. \]
By the chain rule, the same equation is satisfied by $z_t = D\mu_{t,s}(Y,\alpha)[F(s,Y,\alpha)]$, which implies (2.149).

Exercise 2.9.1. Under the assumptions of Proposition 2.9.2, write down the equation for $\frac{\partial\mu_{t,s}(Y,\alpha)}{\partial\alpha}$.

Exercise 2.9.2. Prove that under the assumptions of Theorem 2.8.3, equation (2.148) generalizes to
\[ F(t,\mu_t) = D\mu_{t,s}(Y)[F(s,Y)] + \int_s^t D\mu_{t,\tau}(\mu_{\tau,s}(Y))\Big[\frac{\partial F}{\partial\tau}(\tau,\mu_{\tau,s}(Y))\Big]\,d\tau. \tag{2.150} \]


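Exercise 2.9.3. Prove the analogue of Theorem 2.9.1 for Gâteaux derivatives: Under the assumption of Theorem 2.2.1 and additionally assuming $F \in C^1_{Gat}(B,B)$, it follows that the mapping $\mu_0 = Y \mapsto \mu_t(Y)$ belongs to $C^1_{Gat}(B,B)$ for all $t$, and $\xi_t = D\mu_t(Y)[\xi]$ is still the unique solution to (2.145). Hint: Inequality (1.20) extends to $Y$ and $\xi$ from any compact set, and the set of solutions $\mu_s(\mu_0 + h\eta)$, $s \le t$, $h \le 1$, is compact for a given $\eta$ (as the image of a continuous mapping of the square).

The next exercise provides more concrete estimates for the derivatives with respect to initial data in the case of $\mathbb{R}^d$. Exactly the same estimates can also be proved for the equations in $l_1$.

Exercise 2.9.4. (i) Let $g \in C^1(\mathbb{R}^d)$, and $X^x(t) = \{X^x_j(t)\}$ be the solution to the equation $\dot x = g(x)$ in $\mathbb{R}^d$. Then $X^x(t) \in C^1(\mathbb{R}^d)$ as a function of $x$ and
\[ \sup_{j,x}\Big|\frac{\partial X^x_k(t)}{\partial x_j}\Big| \le \exp\Big\{t\sup_{k,x}\Big|\frac{\partial g(x)}{\partial x_k}\Big|\Big\}. \tag{2.151} \]
Moreover, if $f \in C^1(\mathbb{R}^d)$, then
\[ \sup_{j,x}\Big|\frac{\partial}{\partial x_j}f(X^x(t))\Big| \le \sup_{j,x}\Big|\frac{\partial}{\partial x_j}f(x)\Big|\exp\Big\{t\sup_{k,x}\Big|\frac{\partial g(x)}{\partial x_k}\Big|\Big\}. \tag{2.152} \]
(ii) Let $g \in C^2(\mathbb{R}^d)$. Then $X^x(t) \in C^2(\mathbb{R}^d)$ as a function of $x$ and
\[ \sup_{j,i,x}\Big|\frac{\partial^2 X^x_k(t)}{\partial x_i\,\partial x_j}\Big| \le t\sup_{j,i,x}\Big|\frac{\partial^2 g(x)}{\partial x_j\,\partial x_i}\Big|\exp\Big\{3t\sup_{j,x}\Big|\frac{\partial g(x)}{\partial x_j}\Big|\Big\}. \tag{2.153} \]
Moreover, if $f \in C^2(\mathbb{R}^d)$, then
\[ \sup_{j,i,x}\Big|\frac{\partial^2}{\partial x_j\,\partial x_i}f(X^x(t))\Big| \le \Big[\sup_{j,i,x}\Big|\frac{\partial^2}{\partial x_j\,\partial x_i}f(x)\Big| + t\sup_{k,x}\Big|\frac{\partial f(x)}{\partial x_k}\Big|\,\sup_{j,i,x}\Big|\frac{\partial^2 g(x)}{\partial x_j\,\partial x_i}\Big|\Big]\exp\Big\{3t\sup_{k,x}\Big|\frac{\partial g(x)}{\partial x_k}\Big|\Big\}. \tag{2.154} \]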
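The sensitivity results of this section lend themselves to a quick numerical check in finite dimensions. The Python sketch below is a hedged illustration, not part of the theory: it assumes B = R² with the pendulum field F(x, v) = (v, −sin x), integrates the ODE together with the linearized equation (2.145) by a classical Runge–Kutta scheme, and exposes the pair (μ_t(Y), Dμ_t(Y)[ξ]). The field, the integrator, and all names are choices made for this example only.

```python
import math

def F(mu):
    # Pendulum vector field on R^2 (a hypothetical example of F).
    x, v = mu
    return (v, -math.sin(x))

def DF(mu, xi):
    # Directional derivative DF(mu)[xi] of the pendulum field.
    x, _ = mu
    return (xi[1], -math.cos(x) * xi[0])

def rhs(state):
    # Right-hand side of the coupled system for (mu, xi), cf. (2.145).
    mu, xi = state[:2], state[2:]
    return F(mu) + DF(mu, xi)  # tuple concatenation

def rk4(state, t, steps=1000):
    # Classical fourth-order Runge-Kutta integration over [0, t].
    h = t / steps
    for _ in range(steps):
        k1 = rhs(state)
        k2 = rhs(tuple(s + 0.5 * h * k for s, k in zip(state, k1)))
        k3 = rhs(tuple(s + 0.5 * h * k for s, k in zip(state, k2)))
        k4 = rhs(tuple(s + h * k for s, k in zip(state, k3)))
        state = tuple(s + h / 6.0 * (a + 2 * b + 2 * c + d)
                      for s, a, b, c, d in zip(state, k1, k2, k3, k4))
    return state

def flow_and_derivative(Y, xi, t):
    # Returns (mu_t(Y), D mu_t(Y)[xi]) for initial point Y and direction xi.
    out = rk4(tuple(Y) + tuple(xi), t)
    return out[:2], out[2:]
```

Two natural uses are to compare the returned derivative with a central finite difference of the flow in the direction ξ, and to call the function with ξ = F(Y), in which case the second output should agree with F(μ_t(Y)) up to discretization error, as stated in (2.148).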

2.10 Linear first-order partial differential equations

A standard method of solving partial differential equations of first order is based on obtaining the solutions in terms of the solutions of certain ODEs called the characteristics of the original equations. In Sections 2.6 and 2.7, we developed this method for finite-dimensional Hamiltonian systems. Now, we show how this method works in the infinite-dimensional setting, although we restrict our attention to linear equations. We start with the simplest case of time-homogeneous equations.


Theorem 2.10.1. Under the assumptions of Theorem 2.9.1, let $S \in C^1(B)$. Then the function $G(t,Y) = S(\mu_t(Y))$ is the unique solution in $C^1(\mathbb{R}\times B)$ to the linear partial differential equation
\[ \frac{\partial G}{\partial t}(t,Y) = DG(t,Y)[F(Y)] \tag{2.155} \]
with the initial condition $G(0,Y) = S(Y)$, where $D$ denotes the derivative with respect to the second variable of $G$. The curves $\mu_t(Y)$ are referred to as the characteristics of the partial differential equation (2.155).

Proof. Let $G(t,Y) = S(\mu_t(Y))$. Then $G \in C^1(\mathbb{R}\times B)$ by Propositions 2.9.1 and 1.5.1. Moreover, by (2.148) and Proposition 1.5.1, we find
\[ \frac{\partial G}{\partial t}(t,Y) = DS(\mu)|_{\mu=\mu_t(Y)}[F(\mu_t)] = DS(\mu)|_{\mu=\mu_t(Y)}\big[D\mu_t(Y)[F(Y)]\big] = D(S\circ\mu_t)(Y)[F(Y)] = DG(t,Y)[F(Y)], \]
therefore $G$ is in fact a solution.

In order to prove the uniqueness, let us assume that $g(t,Y)$ is another solution to (2.155) from $C^1(\mathbb{R}\times B)$. Let us introduce a function $\phi(t,Y) = g(t,\mu_{-t}(Y))$. Then this function does not depend on time, since
\[ \frac{\partial\phi}{\partial t}(t,Y) = \frac{\partial g}{\partial t}(t,\mu_{-t}(Y)) - Dg(t,\mu_{-t}(Y))[F(\mu_{-t}(Y))] = 0. \]
Consequently, $g(t,Y) = g(0,\mu_t(Y))$, i.e., $g$ is a function of $\mu_t(Y)$. Therefore it coincides with $S(\mu_t(Y))$.

Let us now turn to the time-dependent case. Extrapolating the previous result to the dynamics of functions that arise from the evolution $Y \mapsto \mu_{t,s}(Y)$ of Theorem 2.2.2 (omitting the irrelevant dependence on a parameter $\alpha$), one could expect a function of the type $G(t,Y) = S(\mu_{t,s}(Y))$ to satisfy the equation $\frac{\partial G}{\partial t} = DG(t,Y)[F_t(Y)]$. The correct result, however, is different.

Theorem 2.10.2. Under the assumptions of Theorem 2.2.2 (omitting the irrelevant dependence on a parameter $\alpha$), let $S \in C^1(B)$. Then the function $G(t,s,Y) = S(\mu_{t,s}(Y))$ satisfies the linear equations
\[ \frac{\partial G}{\partial s}(t,s,Y) = -DG(t,s,Y)[F_s(Y)], \tag{2.156} \]
where $D$ again denotes the derivative with respect to the last variable of $G$. Moreover, this function is the unique solution to this equation from the space $C^1(\mathbb{R}\times\mathbb{R}\times B)$ with the initial condition $G(t,t,Y) = S(Y)$.


Proof. By (2.149), we find
\[ \frac{\partial G}{\partial s}(t,s,Y) = DS(\mu_{t,s}(Y)) \frac{\partial \mu_{t,s}(Y)}{\partial s} = -DS(\mu_{t,s}(Y))[D\mu_{t,s}(Y)[F(s,Y)]] = -DG(t,s,Y)[F_s(Y)], \]
as claimed in (2.156). Suppose that $\varphi(s,Y)$ is another solution, and let $g(s) = \varphi(s, \mu_{s,t}(Y))$. Then
\[ g'(s) = \frac{\partial \varphi}{\partial s}(s, \mu_{s,t}(Y)) + D\varphi(s, \mu_{s,t}(Y)) \frac{\partial \mu_{s,t}}{\partial s}(Y) = -D\varphi(s, \mu_{s,t}(Y))[F(s, \mu_{s,t}(Y))] + D\varphi(s, \mu_{s,t}(Y))[F(s, \mu_{s,t}(Y))] = 0, \]
because of (2.149) and the assumption that $\varphi$ solves (2.156). Hence $\varphi(s, \mu_{s,t}(Y)) = \varphi(t,Y) = S(Y)$ and thus $\varphi(s,Y) = S(\mu_{t,s}(Y))$, as claimed. □

The already discussed link between the nonlinear dynamics $\mu_t(Y)$ in $B$ solving the ODE $\dot\mu_t = F(\mu_t)$ and the linear evolution on the functions on $B$, $T_t S(Y) = S(\mu_t(Y))$, solving the PDE (2.155) is a crucial tool in many branches of analysis. We shall return to this link in Chapter 4 (see Proposition 4.1.1) and in Chapter 7 (when deriving the kinetic equations).
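The statement of Theorem 2.10.1 can be checked numerically in the simplest possible setting. The following is only a minimal sketch under assumptions that are not part of the text: $B = \mathbb{R}$, the illustrative choices $F(Y) = -Y$ and $S = \sin$, and hence the explicit flow $\mu_t(Y) = Ye^{-t}$. The names `F`, `S`, `flow`, `G` and `residual` are hypothetical. The sketch compares both sides of (2.155) by central finite differences.

```python
import math

def F(y):
    """Right-hand side of the ODE d(mu_t)/dt = F(mu_t); illustrative choice."""
    return -y

def S(y):
    """Initial condition G(0, Y) = S(Y); illustrative choice."""
    return math.sin(y)

def flow(t, y):
    """Exact characteristic mu_t(Y) for the choice F(Y) = -Y."""
    return y * math.exp(-t)

def G(t, y):
    """Candidate solution G(t, Y) = S(mu_t(Y)) of equation (2.155)."""
    return S(flow(t, y))

def residual(t, y, h=1e-5):
    """Central-difference value of dG/dt - DG[F(Y)] at the point (t, Y)."""
    dG_dt = (G(t + h, y) - G(t - h, y)) / (2 * h)
    dG_dy = (G(t, y + h) - G(t, y - h)) / (2 * h)
    return dG_dt - dG_dy * F(y)

# Largest residual over a few sample points; it should vanish up to
# discretisation and rounding errors.
max_residual = max(abs(residual(t, y))
                   for t in (0.0, 0.5, 1.3)
                   for y in (-2.0, 0.1, 1.7))
print(max_residual)
```

The residual is only a finite-difference diagnostic; it illustrates the identity at sample points and is not a substitute for the proof.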

2.11 Equations with memory: causality

In practice, one often has to deal with extensions of ODEs that incorporate memory. Memory is usually implemented by one of two approaches. Firstly, one can work with an extension of (2.9), where the r.h.s. depends on the past values of the unknown function:
\[ \dot\mu_t = F(t, \mu_{\le t}), \tag{2.157} \]
with $F(t, \mu_{\le t})$ a continuous mapping $[0,T] \times C([0,T],B) \to B$ such that for any $t$, $F(t, \mu_\cdot)$ depends only on the values $\mu_s$, $s \in [0,t]$. Such equations are also referred to as causal equations. A particular case is the class of delay equations, where $F(t, \mu_{\le t})$ is of the type $F_t(\mu_t, \mu_{t-\delta_1}, \dots, \mu_{t-\delta_k})$, with some $\delta_k > \dots > \delta_1 > 0$, i.e., the function depends on several past values of $\mu_s$. In this section, we shall discuss such equations. The second approach is based on replacing the usual derivative on the l.h.s. of (2.9) by a fractional derivative, which will be discussed in the next section.

Theorem 2.11.1. Let $B$ be a Banach space and $T > 0$. For any $t$, let $F(t, \mu_\cdot)$ be a Lipschitz-continuous mapping $C([0,t],B) \to B$, with a Lipschitz constant $\|F\|_{Lip} = L$ that can be chosen uniformly in $t \in [0,T]$, so that
\[ \|F(t, \mu^1_{\le t}) - F(t, \mu^2_{\le t})\| \le L \|\mu^1_\cdot - \mu^2_\cdot\|_{C([0,t],B)} = L \sup_{s \in [0,t]} \|\mu^1_s - \mu^2_s\| \tag{2.158} \]


for all $t \in [0,T]$ and $\mu^1_\cdot, \mu^2_\cdot \in C([0,t],B)$. Then for any $Y \in B$ there exists a unique solution $\mu_\cdot(Y) \in C([0,T],B)$ to the Cauchy problem for equation (2.157) with the initial condition $\mu_0 = Y$. Moreover,
\[ \|\mu_\cdot(Y) - Y\|_{C([0,t],B)} \le e^{tL} \left( tL\|Y\| + \int_0^t \|F(s,0)\|\,ds \right) \tag{2.159} \]
for all $t \in [0,T]$. Finally, for solutions $\mu_t(Y_1)$ and $\mu_t(Y_2)$ with different initial data $Y_1, Y_2$, the following estimate holds:
\[ \|\mu_\cdot(Y_1) - \mu_\cdot(Y_2)\|_{C([0,t],B)} \le e^{tL} \|Y_1 - Y_2\|. \tag{2.160} \]

Proof. This is an example where the abstract Theorem 2.1.1 can be applied. Namely, the mapping $\Phi_Y : C([0,T],B) \to C_Y([0,T],B)$ defined by the equation
\[ [\Phi_Y(\mu_\cdot)](t) = Y + \int_0^t F(s, \mu_\cdot)\,ds, \quad t \in [0,T], \tag{2.161} \]
satisfies the same estimate (2.14) as for usual ODEs, as well as the estimate
\[ \|[\Phi_Y(Y)](t) - Y\| \le \int_0^t \|F(s,Y)\|\,ds \le tL\|Y\| + \int_0^t \|F(s,0)\|\,ds. \tag{2.162} \]
□

Remark 41. We introduced causal equations in the most transparent, somewhat simplified way. More established definitions of a causal r.h.s. of the equation $\dot\mu = F(\mu_\cdot)$ require that the functional $F$ on curves $\mu_\cdot$ satisfies the following property: if two curves $\mu^1_\cdot$ and $\mu^2_\cdot$ coincide up to a time $t$, then the corresponding curves $F(\mu^1_\cdot)$ and $F(\mu^2_\cdot)$ also coincide up to time $t$.

Exercise 2.11.1. Show that the unique solution to the equation $\dot x = \int_0^t x(s)\,ds$ on $\mathbb{R}$ with the initial condition $x_0$ equals $x(t) = x_0 \cosh t$.

As a concrete example of causal equations, let us consider the case where the r.h.s. depends on a fractional integral of the unknown function:
\[ \dot\mu_t = F(t, (I_a^\beta \mu_\cdot)(t)), \tag{2.163} \]
with any $\beta > 0$. The following result is a direct consequence of Theorem 2.11.1.

Theorem 2.11.2. Let $B$ be a Banach space and $T > 0$. Let $F(t,\mu)$ be a continuous mapping $[0,T] \times B \to B$, which is Lipschitz-continuous in the second variable, i.e.,
\[ \|F(t,\mu^1) - F(t,\mu^2)\| \le L\|\mu^1 - \mu^2\| \tag{2.164} \]
for all $t \in [0,T]$ and $\mu^1, \mu^2 \in B$. Then for any $a \in [0,T)$ and $Y \in B$, there exists a unique solution $\mu_\cdot(Y) \in C([a,T],B)$ to the Cauchy problem for equation (2.163) with the initial condition $\mu_a = Y$. Moreover, the estimates (2.159) and (2.160) hold with the constant $LT^\beta/\Gamma(\beta)$ instead of $L$.
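The claim of Exercise 2.11.1 can be illustrated numerically. The sketch below is not part of the text's argument: it integrates the causal equation $\dot x = \int_0^t x(s)\,ds$ by an explicit Euler step for $x$ combined with a trapezoidal update of the running integral, and compares the result with $x_0 \cosh t$. The function name `solve_causal` and the step count are illustrative assumptions.

```python
import math

def solve_causal(x0, T, n):
    """Explicit Euler scheme for x'(t) = int_0^t x(s) ds with x(0) = x0.

    The memory term is carried along as a running integral, so each
    step only uses values of x from the past (causality).
    """
    h = T / n
    x = x0
    integral = 0.0  # approximation of int_0^t x(s) ds at the current time
    for _ in range(n):
        x_new = x + h * integral             # Euler step for x
        integral += 0.5 * h * (x + x_new)    # trapezoidal update of the memory
        x = x_new
    return x

approx = solve_causal(1.0, 1.0, 10_000)
exact = math.cosh(1.0)
print(approx, exact)
```

The scheme is first-order accurate, so the two printed values should agree only up to an error proportional to the step size.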


2.12 Equations with memory: fractional derivatives

An alternative standard way of introducing memory into a system that is governed by an ODE is to change the usual derivative on the l.h.s. into a fractional derivative of order $\beta \in (0,1)$. Thus, for a Banach space $B$, a vector $Y \in B$ and constants $a \in \mathbb{R}$, $\beta \in (0,1)$, let us consider the Cauchy problem
\[ D^\beta_{a+*} \mu_t = F(t, \mu_t), \quad \mu_a = Y, \quad t \ge a, \tag{2.165} \]
where $D^\beta_{a+*}$ is the Caputo fractional derivative of order $\beta$, see (1.109) and (1.106). More generally, for $\beta \in (k-1,k)$ with $k \in \mathbb{N}$, and $Y_0, Y_1, \dots, Y_{k-1} \in B$, we consider the Cauchy problem
\[ D^\beta_{a+*} \mu_t = F_t(\mu_t, \mu'_t, \dots, \mu^{(k-1)}_t), \quad \mu_a = Y_0, \quad \frac{d}{dt}\mu_a = Y_1, \ \dots, \ \frac{d^{k-1}}{dt^{k-1}}\mu_a = Y_{k-1}, \quad t \ge a. \tag{2.166} \]
By Proposition 1.8.4, the problems (2.165) and (2.166) are equivalent to the integral equations
\[ \mu_t = Y + \frac{1}{\Gamma(\beta)} \int_a^t (t-s)^{\beta-1} F(s, \mu_s)\,ds, \tag{2.167} \]
respectively
\[ \mu_t = \sum_{j=0}^{k-1} \frac{(t-a)^j}{j!} Y_j + \frac{1}{\Gamma(\beta)} \int_a^t (t-s)^{\beta-1} F(s, \mu_s, \mu'_s, \dots, \mu^{(k-1)}_s)\,ds. \tag{2.168} \]

Remark 42. Readers who do not wish to enter the world of fractional calculus can just consider the equations (2.167) and (2.168) (containing nothing that is explicitly ‘fractional’) as defining the evolutions (2.165) and (2.166) driven by the fractional Caputo derivatives.

Recall that $E_\beta$ denotes the Mittag-Leffler function.

Theorem 2.12.1. Let $F$ be a continuous function $\mathbb{R} \times B \to B$, which is Lipschitz-continuous in the variable $\mu \in B$, with a Lipschitz constant $\|F\|_{Lip} = L$ as defined in (1.53). Then for any $Y \in B$ there exists a unique global (defined for all $t \ge a$) solution $\mu_t = \mu_t(Y)$ to the problem (2.167), and therefore also to (2.165). Moreover,
\[ \|\mu_t(Y) - Y\| \le E_\beta(L(t-a)^\beta) \frac{(t-a)^\beta}{\Gamma(\beta+1)} \max_{s \in [a,t]} \|F(s,Y)\|, \tag{2.169} \]
and the solutions $\mu_t(Y_1)$ and $\mu_t(Y_2)$ with different initial data $Y_1, Y_2$ satisfy the estimate
\[ \|\mu_t(Y_1) - \mu_t(Y_2)\| \le \|Y_1 - Y_2\| E_\beta(L(t-a)^\beta). \tag{2.170} \]


Proof. This is a direct consequence of Theorem 2.1.3, since the solutions to the problem (2.167) are fixed points of the mapping
\[ [\Phi(\mu_\cdot)](t) = Y + \frac{1}{\Gamma(\beta)} \int_a^t (t-s)^{\beta-1} F(s, \mu_s)\,ds, \]
which satisfies all assumptions of Theorem 2.1.3 with $\omega = 1-\beta$, $\kappa = 1$, $L(Y) = L/\Gamma(\beta)$. (Strictly speaking, this applies only after shifting $t$ to $t-a$.) □

Theorem 2.2.3 suggests the application of the same trick to equation (2.166). For this, notice first that differentiating equation (2.168) $(k-1)$ times yields
\[ \mu^{(k-1)}_t = Y_{k-1} + \frac{1}{\Gamma(\beta-(k-1))} \int_a^t (t-s)^{\beta-k} F(s, \mu_s, \mu'_s, \dots, \mu^{(k-1)}_s)\,ds. \]
(Of course, this also follows from (1.103).) Hence, in terms of the vector-function
\[ \nu_t = (\nu^0_t, \nu^1_t, \dots, \nu^{k-1}_t) = (\mu_t, \mu'_t, \dots, \mu^{(k-1)}_t) \in B^k \]
and the vector $Y = (Y_0, \dots, Y_{k-1}) \in B^k$, the problem (2.168) can be rewritten as
\[ \nu_t = (\nu^0_t, \dots, \nu^{k-1}_t) = Y + \int_a^t \left( \nu^1_s, \dots, \nu^{k-1}_s, \ \frac{(t-s)^{\beta-k}}{\Gamma(\beta-(k-1))} F(s, \nu^0_s, \dots, \nu^{k-1}_s) \right) ds. \tag{2.171} \]
Noticing that $1 \le (t-a)^{-\omega}(T-a)^{\omega}$ for $t \in [a,T]$, we can apply Theorem 2.1.3 in the Banach space $B^k$ with $\kappa = 1$, $\omega = k-\beta$, which gives the following result.

Theorem 2.12.2. Let $F$ be a continuous function $\mathbb{R} \times B^k \to B$ such that
\[ \|F(t, Y_0, \dots, Y_{k-1}) - F(t, Z_0, \dots, Z_{k-1})\| \le L \sum_{j=0}^{k-1} \|Y_j - Z_j\|. \]
Then for any $Y = (Y_0, \dots, Y_{k-1}) \in B^k$ there exists a unique global (defined for all $t \ge 0$) solution $\nu_t = \nu_t(Y)$ to the problem (2.171), and hence the unique solution $\mu_t = \mu_t(Y) = \nu^0_t(Y)$ to the problem (2.168). Moreover, for $t \in [a,T]$ with any $T$,
\[ \|\nu_t(Y) - Y\|_{B^k} \le \left( \|Y\|_{B^k} + \frac{(t-a)^\beta}{\Gamma(\beta+1)} \max_{s \in [a,t]} \|F(s,Y)\| \right) E_{\beta-(k-1)}\big[ (L + (T-a)^{k-\beta} \Gamma(\beta-(k-1))) (t-a)^{\beta-(k-1)} \big], \tag{2.172} \]
and the solutions $\nu_t(Y)$ and $\nu_t(Z)$ with different initial data $Y, Z$ satisfy the estimate
\[ \|\nu_t(Y) - \nu_t(Z)\|_{B^k} \le \|Y - Z\|_{B^k} E_{\beta-(k-1)}\big[ (L + (T-a)^{k-\beta} \Gamma(\beta-(k-1))) (t-a)^{\beta-(k-1)} \big]. \tag{2.173} \]


2.13 Linear fractional ODEs and related integral equations

As a key example for fractional ODEs, let us consider the linear Cauchy problem
\[ D^\beta_{a+*} \mu_t = A\mu_t + b_t, \quad \mu_a = Y, \quad t \ge a, \tag{2.174} \]
with $\beta \in (0,1)$, $b_t$ a continuous curve in $B$, and $A$ a bounded linear operator in $B$. The equivalent integral form of the problem reads
\[ \mu_t = Y + \frac{1}{\Gamma(\beta)} \int_a^t (t-s)^{\beta-1} (A\mu_s + b_s)\,ds. \tag{2.175} \]

Proposition 2.13.1. The unique solution (from Theorem 2.12.1) to the problem (2.174) or (2.175) equals
\[ \mu_t = E_\beta(A(t-a)^\beta) Y + \int_a^t (t-s)^{\beta-1} E_{\beta,\beta}(A(t-s)^\beta) b_s\,ds = E_\beta(A(t-a)^\beta) Y + \beta \int_a^t (t-s)^{\beta-1} E'_\beta(A(t-s)^\beta) b_s\,ds. \tag{2.176} \]

Proof. By recursively replacing $\mu_s$ under the integral in equation (2.175) by the whole expression of the r.h.s. of (2.175) and by using the semigroup property of the fractional integral $I^\beta_a$, one finds
\[ \mu_t = \big( 1 + A(I^\beta_a 1)(t) + \dots + A^{k-1}(I^{(k-1)\beta}_a 1)(t) \big) Y + \frac{A^k}{\Gamma(k\beta)} \int_a^t (t-s)^{k\beta-1} \mu_s\,ds + \big( I^\beta_a + A I^{2\beta}_a + \dots + A^{k-1} I^{k\beta}_a \big) b(t), \]
which yields (2.176) by passing to the limit $k \to \infty$ and using (9.14). □
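Formula (2.176) can be illustrated in the scalar case. The following minimal sketch rests on assumptions that are not taken from the text: $B = \mathbb{R}$, $a = 0$, $b \equiv 0$ and $A = \lambda$, so that (2.176) reduces to $\mu_t = E_\beta(\lambda t^\beta) Y$. The integral equation (2.175) is discretised by a product-rectangle rule (often called the fractional forward Euler method), which is a technique chosen here for illustration and not the method of the proof. The helper names `mittag_leffler` and `fractional_euler` are hypothetical, and the truncated power series is adequate only for moderate arguments.

```python
import math

def mittag_leffler(beta, z, terms=100):
    """Truncated series E_beta(z) = sum_k z**k / Gamma(beta*k + 1).

    Suitable for moderate |z| only; for large beta*terms the call to
    math.gamma overflows, so this is not a general-purpose evaluator.
    """
    return sum(z**k / math.gamma(beta * k + 1) for k in range(terms))

def fractional_euler(beta, lam, y0, T, n):
    """Product-rectangle discretisation of (2.175) with A = lam, b = 0, a = 0.

    On each step [t_j, t_{j+1}] the integrand lam * mu_s is frozen at its
    left endpoint and the kernel (t - s)**(beta - 1) is integrated exactly.
    """
    h = T / n
    # weights[m] = (m+1)**beta - m**beta comes from the exact kernel integral
    weights = [(m + 1)**beta - m**beta for m in range(n)]
    scale = h**beta / math.gamma(beta + 1)
    mu = [y0]
    for i in range(1, n + 1):
        acc = sum(weights[i - 1 - j] * lam * mu[j] for j in range(i))
        mu.append(y0 + scale * acc)
    return mu[-1]

beta, lam, y0, T = 0.5, -1.0, 1.0, 1.0
numerical = fractional_euler(beta, lam, y0, T, 1000)
closed_form = mittag_leffler(beta, lam * T**beta) * y0
print(numerical, closed_form)
```

The full history sum makes the cost quadratic in the number of steps, which reflects the memory of the fractional derivative: every new value depends on all previous ones.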



As in the case of usual linear equations, the analysis of equation (2.174) extends straightforwardly to the time-dependent case,
\[ D^\beta_{a+*} \mu_t = A_t \mu_t + b_t, \quad \mu_a = Y, \quad t \ge a, \tag{2.177} \]
where $A_t$ is a family of linear operators, provided that one carefully takes into account the non-commutativity of the family $A_t$.

Proposition 2.13.2. Let $\beta \in (0,1)$. Suppose that $A_t$ is a family of uniformly bounded operators in $B$ that depend continuously on $t$, and that $b_t$ is a continuous curve in $B$. Then equation (2.177) has a unique solution. This solution has a representation as the geometric series
\[ \mu_t = \sum_{m=0}^{\infty} (I^\beta_a \circ A)^m [Y + (I^\beta_a b)(\cdot)](t), \tag{2.178} \]
where $I^\beta_a \circ A$ acts in $C([a,t],B)$ by the formula
\[ (I^\beta_a \circ A) g(t) = \frac{1}{\Gamma(\beta)} \int_a^t (t-s)^{\beta-1} A_s g_s\,ds. \]
Finally,
\[ \|\mu_t\| \le E_\beta\Big( \sup_{s \in [a,t]} \|A_s\| (t-a)^\beta \Big) \Big( \|Y\| + \sup_{s \in [a,t]} \|(I^\beta_a b)(s)\| \Big). \tag{2.179} \]

Proof. As mentioned before, the expansion (2.178) is obtained as the direct extension of the corresponding expansion for a constant $A$. □

Remark 43. Since the integral version (2.167) of (2.177) reads $(1 - I^\beta_a \circ A)\mu_\cdot = Y + (I^\beta_a b)(\cdot)$, representation (2.178) is formally obtained by the expansion of $(1 - I^\beta_a \circ A)^{-1}$ into a geometric series.

More generally, estimates of the type (2.179) extend to linear equations of the type
\[ \mu_t = G_t Y + \int_a^t A_{t,s} \mu_s\,ds + g_t. \tag{2.180} \]

Namely, the following assertion is obtained by using the same arguments as in the proof of Proposition 2.13.1, i.e., recursively inserting the r.h.s. of equation (2.180) in order to express $\mu_s$ under the integral and then estimating the terms of the resulting series via fractional integrals.

Proposition 2.13.3. Suppose that $A_{t,s}$, $t > s$, and $G_t$, $t \ge 0$, are families of bounded linear operators in $B$ that depend continuously on $t$ and measurably on $s$, and let $g_t$ be a continuous curve in $B$. Suppose further that the family $G_t$ is uniformly bounded and
\[ \|A_{t,s}\| \le |A| (t-s)^{-\omega} \tag{2.181} \]
for some constants $|A| > 0$ and $\omega \in (0,1)$. Then equation (2.180) has a unique solution. This solution has a representation as the geometric series
\[ \mu_t = \sum_{m=0}^{\infty} (IA)^m (G_\cdot Y + g_\cdot)(t), \tag{2.182} \]
where $IA$ acts in $C([a,t],B)$ by the formula
\[ (IAh)(t) = \int_a^t A_{t,s} h_s\,ds. \]
Finally,
\[ \|\mu_t\| \le E_{1-\omega}\big( |A| \Gamma(1-\omega) (t-a)^{1-\omega} \big) \sup_{s \in [a,t]} \big( \|G_s\|_{B \to B} \|Y\| + \|g_s\| \big). \tag{2.183} \]


Remark 44. The reason for studying equation (2.180) is that the mild forms of many standard PDEs (including diffusions) are represented by equations of this type, see, e.g., Sections 4.6 and 4.8. Therefore, equation (2.180) is a handy way to put both fractional and usual evolutions under one single umbrella.

The discussed theory of linear equations with $\beta \in (0,1)$ extends directly to the case of arbitrary positive $\beta$. In fact, by (2.168), the equivalent integral representation for the Cauchy problem
\[ D^\beta_{a+*} \mu_t = A_t \mu_t + b_t, \quad \mu_a = Y_0, \quad \frac{d}{dt}\mu_a = Y_1, \ \dots, \ \frac{d^{k-1}}{dt^{k-1}}\mu_a = Y_{k-1}, \quad t \ge a, \tag{2.184} \]
is
\[ \mu_t = \sum_{j=0}^{k-1} \frac{(t-a)^j}{j!} Y_j + \frac{1}{\Gamma(\beta)} \int_a^t (t-s)^{\beta-1} (A_s \mu_s + b_s)\,ds. \tag{2.185} \]

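Theorem 2.13.1. Let $\beta \in (k-1,k)$ for a natural $k$, and suppose that $A_t$ is a family of uniformly bounded operators in $B$ depending continuously on $t$ and $b_t$ is a continuous curve in $B$. Then equation (2.185) has a unique solution. This solution has a representation as the geometric series
\[ \mu_t = \sum_{m=0}^{\infty} (I^\beta_a \circ A)^m \Big[ Y_0 + (\cdot - a) Y_1 + \dots + \frac{(\cdot - a)^{k-1}}{(k-1)!} Y_{k-1} + (I^\beta_a b)(\cdot) \Big](t) \tag{2.186} \]
(the summation index is written as $m$ to avoid a clash with the order $k$) and is bounded:
\[ \|\mu_t\| \le E_\beta\Big( \sup_{s \in [a,t]} \|A_s\| (t-a)^\beta \Big) \Big( \sum_{l=0}^{k-1} \frac{(t-a)^l}{l!} \|Y_l\| + \sup_{s \in [a,t]} \|(I^\beta_a b)(s)\| \Big). \tag{2.187} \]
If $A_t = A$ does not depend on $t$, the solution takes the form
\[ \mu_t = \sum_{l=0}^{k-1} \big( I^l_a E_\beta(A(\cdot - a)^\beta) \big)(t) Y_l + \beta \int_a^t (t-s)^{\beta-1} E'_\beta(A(t-s)^\beta) b_s\,ds. \tag{2.188} \]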

Proof. Let us only check how (2.188) is obtained for the case $k = 2$. Again, replacing $\mu_s$ under the integral in equation (2.185) by the whole expression of the r.h.s. of (2.185) yields
\[ \mu_t = Y_0 + (t-a) Y_1 + I^\beta_a \big[ b_\cdot + A\big( Y_0 + (\cdot - a) Y_1 + I^\beta_a (A\mu_\cdot + b_\cdot) \big) \big](t) \]
\[ = Y_0 + (I^\beta_a 1)(t) A Y_0 + (I^1_a 1)(t) Y_1 + (I^{\beta+1}_a 1)(t) A Y_1 + (I^\beta_a b_\cdot)(t) + (I^{2\beta}_a A b(\cdot))(t) + (I^{2\beta}_a A^2 \mu_\cdot)(t). \]
Repeating this recursively leads to
\[ \mu_t = \sum_{l=0}^{m} (I^{l\beta}_a 1)(t) A^l Y_0 + \sum_{l=0}^{m} (I^{l\beta+1}_a 1)(t) A^l Y_1 + \sum_{l=0}^{m} (I^{(l+1)\beta}_a A^l b_\cdot)(t) + (I^{(m+1)\beta}_a A^{m+1} \mu_\cdot)(t). \]
Passing to the limit $m \to \infty$ and taking into account formula (1.104) yields (2.188) for $k = 2$. The formulae (2.186) and (2.187) are obtained similarly as straightforward extensions of Propositions 2.13.1 and 2.13.2. □

Finally, let us consider a system of fractional differential equations of different orders, for $\mu = (\mu^1, \dots, \mu^k) \in B^k$:
\[ \begin{cases} D^{\beta_1}_{a+*} \mu^1_t = F_1(t, \mu_t), & \mu^1_a = Y_1, \\ \quad \cdots \\ D^{\beta_k}_{a+*} \mu^k_t = F_k(t, \mu_t), & \mu^k_a = Y_k, \end{cases} \tag{2.189} \]
or in the integral form
\[ \mu^j_t = Y_j + (I^{\beta_j}_a F_j(\cdot, \mu_\cdot))(t), \quad j = 1, \dots, k. \tag{2.190} \]

The following statement is yet another direct consequence of Theorem 2.1.3.

Theorem 2.13.2. For a natural $k$, let $\beta_1, \dots, \beta_k \in (0,1)$ and let $F_1, \dots, F_k$ be Lipschitz-continuous mappings $B^k \to B$, so that
\[ \|F_j(\mu) - F_j(\tilde\mu)\| \le L \|\mu - \tilde\mu\|_{B^k} = L \sum_{j=1}^{k} \|\mu^j - \tilde\mu^j\| \]
with a constant $L$. Then the system (2.190) has a unique global solution $\mu_t(Y)$, $t \ge a$, for any $Y = (Y_1, \dots, Y_k) \in B^k$. Moreover, for any $T > a$ there exists a constant $C(T,L)$ such that for all $t \in [a,T]$
\[ \|\mu_t(Y) - Y\| \le C(T,L) \max_{s \in [a,t]} \|F(s,Y)\|, \qquad \|\mu_t(Y) - \mu_t(\tilde Y)\| \le C(T,L) \|Y - \tilde Y\|. \tag{2.191} \]

2.14 Linear fractional evolutions involving spatially homogeneous ΨDOs

In this section, we extend the results of Section 2.4 to fractional evolutions. Namely, let us consider the equations
\[ D^\beta_{a+*} f_t = -\psi(-i\nabla) f_t, \quad f|_{t=a} = f_a, \tag{2.192} \]
for differential or pseudo-differential operators with constant coefficients, with sufficiently regular symbols $\psi$ (choosing $\psi$ to be time-independent for simplicity), where fractional derivatives are taken with respect to the variable $t$. Passing to the Fourier transform $\hat f(p) = \int e^{-ipx} f(x)\,dx$, the Cauchy problem (2.192) turns into
\[ D^\beta_{a+*} \hat f_t(p) = -\psi(p) \hat f_t(p), \quad \hat f|_{t=a} = \hat f_a. \]
By Proposition 2.13.1, it has the solution $\hat f_t(p) = E_\beta(-(t-a)^\beta \psi(p)) \hat f_a(p)$. Returning to $f$ via the inverse Fourier transform yields
\[ f_t(x) = \int G^{\psi,\beta}_{t-a}(x-y) f_a(y)\,dy \tag{2.193} \]
with
\[ G^{\psi,\beta}_{t-a}(x) = \frac{1}{(2\pi)^d} \int e^{ipx} E_\beta(-(t-a)^\beta \psi(p))\,dp, \tag{2.194} \]

whenever this integral is well defined. The ‘fractional heat kernels’ $G^{\psi,\beta}_{t-a}(x)$ solve the Cauchy problem (2.56) with the Dirac initial condition $\delta(x)$. Similar to the usual linear equations, the solution (2.193) is well defined and unique in $S'(\mathbb{R}^d)$ for continuous functions $\psi$ that are bounded from below. In order to see how the solution can be defined in spaces of more regular functions, we can exploit the integral representation for the Mittag-Leffler function (2.80), that is
\[ \beta E_\beta(s) = \int_0^\infty e^{sx} x^{-1-1/\beta} G_\beta(1, x^{-1/\beta})\,dx. \]
Using this formula, we rewrite (2.194) as
\[ G^{\psi,\beta}_{t-a}(x) = \frac{1}{(2\pi)^d \beta} \int e^{ipx}\,dp \int_0^\infty e^{-(t-a)^\beta \psi(p) y} y^{-1-1/\beta} G_\beta(1, y^{-1/\beta})\,dy, \tag{2.195} \]
or as
\[ G^{\psi,\beta}_{t-a}(x) = \frac{1}{\beta} \int_0^\infty G^\psi_{(t-a)^\beta y}(x)\, y^{-1-1/\beta} G_\beta(1, y^{-1/\beta})\,dy, \tag{2.196} \]

where $G^\psi_{(t-a)^\beta y}(x)$ is the heat kernel (2.62) of the corresponding problem (2.56) with the usual derivative. The following assertion is an application of a general result that relates linear Cauchy problems with usual and fractional derivatives.

Theorem 2.14.1. Let the family of linear mappings $f_s \mapsto f_t$ resolving problem (2.56) and given by (2.59) be well defined in some functional space $L$ with norms that are uniformly bounded by a constant $M$ for $T_1 \le s \le t \le T_2$. For instance, according to Theorem 2.4.1, these mappings are contractions if $L$ is chosen as $L^2(\mathbb{R}^d)$ or the Sobolev space $H^k_2(\mathbb{R}^d)$ for any $k \in \mathbb{R}$, whenever $\psi$ is a locally bounded measurable function with a non-negative real part. Then the mappings $f_a \mapsto f_t$ given by (2.193) and resolving problem (2.192) are also well defined in the same space $L$ with norms that are bounded by the same constant $M$.

Proof. From Proposition 2.4.1 (i) and (iii), it follows that the function $y^{-1-1/\beta} G_\beta(1, y^{-1/\beta})$ under the integral in (2.196) is bounded as $y \to 0$ and decays faster than any power as $y \to \infty$. In particular, it belongs to $L^1(\mathbb{R})$. For the mapping $f_s \mapsto f_t$ given by (2.193), we therefore find
\[ \|f_t\|_L \le \frac{1}{\beta} M \|f_a\|_L \int_0^\infty y^{-1-1/\beta} G_\beta(1, y^{-1/\beta})\,dy = M \|f_a\|_L E_\beta(0) = M \|f_a\|_L, \]
as required. □

Let us extend this result to equations with a nontrivial r.h.s., i.e., to equations of the type
\[ D^\beta_{a+*} f_t = -\psi(-i\nabla) f_t + g_t, \quad f|_{t=a} = f_a, \tag{2.197} \]
and let us formulate the result in the framework of Theorem 2.4.1.

Theorem 2.14.2. Let $\psi(p)$ be a locally bounded measurable function with a non-negative real part, and let $g_t$ be a continuous curve in the space $L$, which is either $L^1(\mathbb{R}^d)$ or $L^2(\mathbb{R}^d)$ or $F(L^1(\mathbb{R}^d))$. Then there exists a unique solution to the problem (2.197), and it is given by the formula
\[ f_t(x) = \frac{1}{\beta} \int_0^\infty dy \int_{\mathbb{R}^d} G^\psi_{y(t-a)^\beta}(x-z)\, y^{-1-1/\beta} G_\beta(1, y^{-1/\beta}) f_a(z)\,dz + \int_a^t ds \int_0^\infty dy \int_{\mathbb{R}^d} (t-s)^{\beta-1} G^\psi_{y(t-s)^\beta}(x-z)\, y^{-1/\beta} G_\beta(1, y^{-1/\beta}) g_s(z)\,dz. \tag{2.198} \]
Moreover, it satisfies the estimate
\[ \|f_t\|_L \le \|f_a\|_L + \frac{1}{\Gamma(\beta)} \int_a^t (t-s)^{\beta-1} \|g_s(\cdot)\|_L\,ds. \tag{2.199} \]

Proof. By the Fourier transform, equation (2.197) turns into the equation
\[ D^\beta_{a+*} \hat f_t(p) = -\psi(p) \hat f_t(p) + \hat g_t(p), \quad \hat f|_{t=a} = \hat f_a. \tag{2.200} \]
According to Proposition 2.13.1, its solution is unique and given by the formula
\[ \hat f_t(p) = E_\beta(-\psi(p)(t-a)^\beta) \hat f_a(p) + \beta \int_a^t (t-s)^{\beta-1} E'_\beta(-\psi(p)(t-s)^\beta) \hat g_s(p)\,ds. \tag{2.201} \]


Using again (2.80), this can be written as
\[ \hat f_t(p) = \frac{1}{\beta} \int_0^\infty \exp\{-y\psi(p)(t-a)^\beta\}\, y^{-1-1/\beta} G_\beta(1, y^{-1/\beta})\,dy\, \hat f_a(p) + \int_a^t ds \int_0^\infty \exp\{-y\psi(p)(t-s)^\beta\} (t-s)^{\beta-1} y^{-1/\beta} G_\beta(1, y^{-1/\beta})\,dy\, \hat g_s(p). \tag{2.202} \]
Applying the inverse Fourier transform yields (2.198). Estimate (2.199) follows from the contraction property in $L$ of the integral operators with the kernel $G^\psi_t(x-z)$, and due to $E_\beta(0) = 1$, $E'_\beta(0) = 1/\Gamma(\beta+1)$. □

As an example, let us consider the Cauchy problem
\[ D^\alpha_{a+*} f_t(x) = -\frac{d^\beta}{dx^\beta} f_t(x) + g_t(x), \quad f|_{t=a} = f_a, \tag{2.203} \]
where $\alpha, \beta \in (0,1)$ and the operator $D^\alpha_{a+*}$ is assumed to act on the variable $t$. According to Theorem 2.14.2 and Proposition 2.4.1, its unique solution is given by the formula
\[ f_t(x) = \frac{1}{\alpha} \int_0^\infty dy \int_{-\infty}^x G_\beta(y(t-a)^\alpha, x-z)\, y^{-1-1/\alpha} G_\alpha(1, y^{-1/\alpha}) f_a(z)\,dz + \int_a^t ds \int_0^\infty dy \int_{-\infty}^x (t-s)^{\alpha-1} G_\beta(y(t-s)^\alpha, x-z)\, y^{-1/\alpha} G_\alpha(1, y^{-1/\alpha}) g_s(z)\,dz. \tag{2.204} \]

2.15 Sensitivity of integral and differential equations: advanced version

Let us now prove an advanced version of the sensitivity for integral equations. Its application will include both the fractional equations discussed above and nonlinear diffusions that will be considered later on. In fact, the above Theorems 2.8.1 and 2.8.2 were formulated in such a way that they can be more or less straightforwardly extended to the present setting. We are again looking for the derivatives of the fixed points of equation (2.129), the only difference being now the presence of a singularity of $\Omega_{t,s}$ for $t = s$.

Theorem 2.15.1. Suppose that the assumptions of Theorem 2.8.1 hold with a slight modification concerning $\Omega$. Namely, instead of (2.131), we assume that $\Omega_{t,s} \in C^1_{uc}(B_R, B)$ for $t > s$ and any $R$, and there exists $\omega \in (0,1)$ such that for any $\epsilon, R$, there exist $L(R)$ and $\delta = \delta(R,\epsilon)$ such that
\[ \|D\Omega_{t,s}(\mu)\|_{B \to B} \le L(R)(t-s)^{-\omega}, \qquad \|D\Omega_{t,s}(\mu_1) - D\Omega_{t,s}(\mu_2)\|_{B \to B} \le \epsilon (t-s)^{-\omega}, \tag{2.205} \]
for all $t, s$, and for any $\mu_1, \mu_2$ with $\mu_1 \in B_R$, $\|\mu_1 - \mu_2\| \le \delta$.


Then the mapping $\mu_\tau = Y \mapsto \mu_{t,\tau}(Y)$ belongs to $C^1_{uc}(B_R, B)$, and thus to $C^1_{luc}(B, B)$, for all $t$. Moreover, $\xi_t = D\mu_{t,\tau}(Y)[\xi]$ is the unique solution to equation (2.128) and the following estimate holds:
\[ \|\xi_t\| \le E_{1-\omega}\big( L(M_T(R)) \Gamma(1-\omega) (t-\tau)^{1-\omega} \big) G \|\xi\|. \tag{2.206} \]

Proof. We are using the same arguments as in the proof of Theorem 2.8.1. First of all, equation (2.128) is well posed and its solution satisfies (2.206) due to Proposition 2.13.3. Next, we find
\[ \|\mu_{t,\tau}(Y_1) - \mu_{t,\tau}(Y_2)\| \le G \|Y_1 - Y_2\| E_{1-\omega}\big( L(M_T(R)) \Gamma(1-\omega) (t-\tau)^{1-\omega} \big) \tag{2.207} \]
by Theorem 2.1.3. As in Theorem 2.8.1, the function $\varphi_t = \mu_{t,\tau}(Y+\xi) - \mu_{t,\tau}(Y) - \xi_t$ again satisfies equation (2.134). For an $\epsilon$, we choose $\delta \le 1$ such that
\[ \|\Omega_{t,s}(Z+\xi) - \Omega_{t,s}(Z) - D\Omega_{t,s}(Z)[\xi]\| \le \epsilon \|\xi\| (t-s)^{-\omega} \]
for $\|\xi\| \le \delta$ and $\|Z\| \le M_T(R+1)$. By (2.133), if
\[ \|\xi\| \le \big[ G E_{1-\omega}\big( L(M_T(R)) \Gamma(1-\omega) (t-\tau)^{1-\omega} \big) \big]^{-1} \delta, \]
then $\|\mu_{t,\tau}(Y+\xi) - \mu_{t,\tau}(Y)\| \le \delta$, so that
\[ \|\Omega_{t,s}(\mu_{s,\tau}(Y+\xi)) - \Omega_{t,s}(\mu_{s,\tau}(Y)) - D\Omega_{t,s}(\mu_{s,\tau}(Y))[\mu_{s,\tau}(Y+\xi) - \mu_{s,\tau}(Y)]\| \le \epsilon \|\mu_{t,\tau}(Y+\xi) - \mu_{t,\tau}(Y)\| (t-s)^{-\omega} \le \epsilon G (t-s)^{-\omega} E_{1-\omega}\big( L(M_T(R)) \Gamma(1-\omega) (t-\tau)^{1-\omega} \big) \|\xi\|. \tag{2.208} \]
Consequently, by (2.134) and Proposition 2.13.3, we find
\[ \sup_{s \in [\tau,t]} \|\varphi_s\| \le \epsilon \|\xi\| \kappa, \]
with a $\kappa$ depending on $L(M_T(R))$, $T$, $\tau$ and $G$. This shows that $\xi_t = D\mu_{t,\tau}(Y)[\xi]$. The continuity of $\xi_t = D\mu_{t,\tau}(Y)[\xi]$ as a function of $Y \in B_R$ can be shown like in Theorem 2.8.1. □

As an application, we can get the following result for the sensitivity for fractional equations.


Theorem 2.15.2. Let $F \in C^1_{luc}(B, B)$ for a Banach space $B$, so that, in particular, the assumptions of Theorem 2.12.1 hold. Then the mapping $Y \mapsto \mu_t(Y)$ as constructed in Theorem 2.12.1 belongs to $C^1_{luc}(B, B)$ for all $t$, and $\xi_t = D\mu_t(Y)[\xi]$ is the unique solution to the linear equation
\[ \xi_t = \xi + \frac{1}{\Gamma(\beta)} \int_0^t (t-s)^{\beta-1} DF(s, \mu_s)[\xi_s]\,ds \iff D^\beta_{a+*} \xi_t = DF(t, \mu_t)[\xi_t] \ \text{ and } \ \xi_0 = \xi. \tag{2.209} \]
Moreover, the following estimate holds:
\[ \|\xi_t\| \le E_\beta(L(t-a)^\beta) \|\xi\|. \tag{2.210} \]

Next, let us formulate an extension of Theorem 2.8.2 to singular $\Omega_{t,s}$.

Theorem 2.15.3. Under the assumptions of Theorem 2.15.1, let us additionally suppose that $\Omega_{t,s} \in C^1_{bLip}(B_R, B)$ for $t > s$, so that
\[ \|D\Omega_{t,s}(\mu_1) - D\Omega_{t,s}(\mu_2)\|_{B \to B} \le L_D(R)(t-s)^{-\omega} \|\mu_1 - \mu_2\| \tag{2.211} \]
for all $\mu_1, \mu_2 \in B_R$ and some constants $L(R)$ and $L_D(R)$. Then the mapping $Y \mapsto \mu_{t,\tau}(Y)$ belongs to $C^1_{bLip}(B_R, B)$ for all $t, \tau$. Moreover, $\xi_t = D\mu_{t,\tau}(Y,\alpha)[\xi]$ is the unique solution to equation (2.128), it satisfies (2.206) and is the limit of the approximations (2.138).

Exercise 2.15.1. Give the full proof of Theorem 2.15.3 by identifying those arguments in the proof of Theorem 2.8.2 that have to be modified.

Exercise 2.15.2. Formulate and prove the extension of Theorem 2.8.3 for singular $\Omega_{t,s}$.

Finally, let us formulate a direct extension of Theorem 2.8.4 that deals with second-order derivatives.

Theorem 2.15.4. Under the assumptions of Theorem 2.15.1, let $\Omega_{t,s} \in C^2_{uc}(B_R, B)$ for all $R$ and $t > s$, and for any $\epsilon, R$, let $L_2(R)$ and $\delta_2 = \delta_2(R,\epsilon)$ exist such that
\[ \|D^2\Omega_{t,s}(\mu)\|_{B \times B \to B} \le L_2(R)(t-s)^{-\omega}, \qquad \|D^2\Omega_{t,s}(\mu_1) - D^2\Omega_{t,s}(\mu_2)\|_{B \times B \to B} \le \epsilon (t-s)^{-\omega}, \tag{2.212} \]
for all $t, s$, and any $\mu_1, \mu_2$ with $\mu_1 \in B_R$, $\|\mu_1 - \mu_2\| \le \delta_2$. Then the mapping $\mu_\tau = Y \mapsto \mu_{t,\tau}(Y)$ belongs to $C^2_{uc}(B_R, B)$ for all $t, \tau$. Moreover, $\eta_t = D^2\mu_{t,\tau}(Y)[\xi^1, \xi^2]$ is the unique solution to equation (2.144).


2.16 ODEs in locally convex spaces

In this section, we are going to show how the previous results for Banach spaces can be extended to general locally convex spaces. All necessary definitions can be found in Section 1.6. The main idea is rather simple: convergence of sequences in locally convex spaces is equivalent to their convergence in each of the semi-norms that define the topology. Therefore, in order to extend the main results for the Banach spaces, the requirement is that all the relevant estimates must hold in each of the semi-norms. Note that we will not discuss the more subtle situation when this uniformity does not hold.

Throughout this section, let us denote by $V$ any complete Hausdorff locally convex linear topological space, with the topology being defined by a separating family of semi-norms $p_\gamma$. If the family $\{p_\gamma\}$ is countable, then $V$ is a Fréchet space, which can be metricised by the metric (1.69).

First, let us formulate the extensions of the fixed-point principles. The analogue to Proposition 9.1.1 reads as follows.

Proposition 2.16.1. If $\Phi$ is a mapping $V \to V$ such that $p_\gamma(\Phi^n(x) - \Phi^n(y)) \le \alpha^\gamma_n p_\gamma(x-y)$ for all $x, y$ and $\gamma$ with some $\alpha^\gamma_n$ such that $A_\gamma = 1 + \sum_{n=1}^\infty \alpha^\gamma_n < \infty$, then $\Phi$ has a unique fixed point $x^*$, $\Phi^n(x)$ converges to $x^*$ for any $x$ and
\[ p_\gamma(x - x^*) \le A_\gamma p_\gamma(x - \Phi(x)). \tag{2.213} \]

Proof. Using the estimates from the proof of Proposition 9.1.1 for each semi-norm $p_\gamma$ (instead of the metric $\rho$) implies that $\Phi^n(x)$ is Cauchy in each semi-norm and hence converges to a point $x^*$. Applying (2.213) to another fixed point $x = \tilde x^*$ yields $p_\gamma(\tilde x^* - x^*) = 0$ for any $\gamma$ and hence $\tilde x^* = x^*$. □

Similarly, Proposition 9.1.3 on the stability of fixed points can be directly extended:

Proposition 2.16.2. If $\Phi_1, \Phi_2$ are two mappings $V \to V$ such that $p_\gamma(\Phi^n_j(x) - \Phi^n_j(y)) \le \alpha^\gamma_n(j) p_\gamma(x-y)$ for $j = 1, 2$ and all $x, y$ and $\gamma$, with some $\alpha^\gamma_n(j)$ such that $A_\gamma(j) = 1 + \sum_{n=1}^\infty \alpha^\gamma_n(j) < \infty$, and if $p_\gamma(\Phi_1(x) - \Phi_2(x)) \le \epsilon_\gamma$ for all $x$, then
\[ p_\gamma(x^*_1 - x^*_2) \le \epsilon_\gamma \min_{j=1,2} A_\gamma(j) \]
for the fixed points $x^*_j$ of the mappings $\Phi_j$.

In order to extend Theorems 2.1.1 and 2.1.3, we introduce the spaces $C([\tau,t], V)$ of continuous functions $[\tau,t] \to V$. These spaces are considered complete Hausdorff locally convex spaces if equipped with the family of semi-norms
\[ p^{[\tau,t]}_\gamma(\mu_\cdot) = \sup_{s \in [\tau,t]} p_\gamma(\mu(s)). \]
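The a priori estimate (2.213) of Proposition 2.16.1 can be illustrated in the most elementary situation. The sketch below assumes, purely for illustration, that $V = \mathbb{R}$ with the single semi-norm $p(x) = |x|$ and that $\Phi(x) = \tfrac12 \cos x$, so that one may take $\alpha_n = 2^{-n}$ and $A = 2$. The names `phi` and `iterate` are hypothetical.

```python
import math

def phi(x):
    """Illustrative contraction on R: |phi(x) - phi(y)| <= 0.5 * |x - y|."""
    return 0.5 * math.cos(x)

def iterate(x, n):
    """Apply phi to x repeatedly, n times (the iterates Phi^n(x))."""
    for _ in range(n):
        x = phi(x)
    return x

x0 = 3.0
x_star = iterate(x0, 200)      # numerical approximation of the fixed point
A = 2.0                        # A = 1 + sum_{n>=1} 0.5**n
lhs = abs(x0 - x_star)         # p(x - x*)
rhs = A * abs(x0 - phi(x0))    # A * p(x - Phi(x)), right-hand side of (2.213)
print(lhs, rhs)
```

In a genuinely locally convex setting the same comparison would have to be made separately for every semi-norm $p_\gamma$, with constants $A_\gamma$ that may depend on $\gamma$.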


For a closed convex set $M \subset V$ and $Y \in M$, let $C([\tau,t], M)$ be a closed convex subset of $C([\tau,t], V)$ of functions with values in $M$, and let $C_Y([\tau,t], M)$ be a subset of functions $\mu$ with $\mu_\tau = Y$. If the family of semi-norms $p_\gamma$ is countable, then both $V$ and $C([s,t], V)$ are Fréchet spaces, and hence metric spaces. However, this property will not be used here.

The proof of the following result is derived in the same way from Propositions 2.16.1 and 2.16.2 as the proofs of Theorems 2.1.1 and 2.1.3 are derived from the corresponding fixed-point principles in metric spaces.

Theorem 2.16.1. Suppose that for any $Y \in M$ and $\alpha \in B_1$ (with $B_1$ a Banach space), a mapping $\Phi_{Y,\alpha} : C([\tau,T], M) \to C_Y([\tau,T], M)$ is given with some $T > \tau$ such that for any $t$ the restriction of $\Phi_{Y,\alpha}(\mu_\cdot)$ on $[\tau,t]$ depends only on the restriction of the function $\mu_s$ on $[\tau,t]$. Moreover, for any $\gamma$, let
\[ p_\gamma\big( [\Phi_{Y,\alpha}(\mu^1_\cdot)](t) - [\Phi_{Y,\alpha}(\mu^2_\cdot)](t) \big) \le L_\gamma(Y) \int_\tau^t (t-s)^{-\omega} p^{[\tau,s]}_\gamma(\mu^1_\cdot - \mu^2_\cdot)\,ds, \]
\[ p_\gamma\big( [\Phi_{Y_1,\alpha_1}(\mu_\cdot)](t) - [\Phi_{Y_2,\alpha_2}(\mu_\cdot)](t) \big) \le \kappa_\gamma p_\gamma(Y_1 - Y_2) + \kappa^1_\gamma \|\alpha_1 - \alpha_2\|, \tag{2.214} \]
for any $\mu^1, \mu^2 \in C([\tau,T], M)$, some constants $\omega \in [0,1)$, $\kappa_\gamma$, $\kappa^1_\gamma$ and continuous functions $L_\gamma$ on $M$. Then for any $Y \in M$ and $\alpha \in B_1$, the mapping $\Phi_{Y,\alpha}$ has a unique fixed point $\mu_{t,\tau}(Y,\alpha)$ in $C_Y([\tau,T], M)$. Moreover, for all $t \in [\tau,T]$ and all $\gamma$,
\[ p_\gamma(\mu_{t,\tau}(Y,\alpha) - Y) \le e^{(t-\tau)L_\gamma(Y)} p_\gamma([\Phi_{Y,\alpha}(Y)](t) - Y), \tag{2.215} \]
if $\omega = 0$, or
\[ p_\gamma(\mu_{t,\tau}(Y,\alpha) - Y) \le E_{1-\omega}\big( L_\gamma(Y) \Gamma(1-\omega) (t-\tau)^{1-\omega} \big) p_\gamma([\Phi_Y(Y)](t) - Y), \tag{2.216} \]
if $\omega > 0$. Finally, the fixed points $\mu_{t,\tau}(Y_1,\alpha_1)$ and $\mu_{t,\tau}(Y_2,\alpha_2)$ with different initial data $Y_1, Y_2$ and parameters $\alpha_1, \alpha_2$ satisfy the estimate
\[ p_\gamma(\mu_{t,\tau}(Y_1,\alpha_1) - \mu_{t,\tau}(Y_2,\alpha_2)) \le (\kappa_\gamma p_\gamma(Y_1 - Y_2) + \kappa^1_\gamma \|\alpha_1 - \alpha_2\|) \exp\{(t-\tau) L_\gamma(Y_j)\}, \tag{2.217} \]
if $\omega = 0$, or
\[ p_\gamma(\mu_{t,\tau}(Y_1,\alpha_1) - \mu_{t,\tau}(Y_2,\alpha_2)) \le (\kappa_\gamma p_\gamma(Y_1 - Y_2) + \kappa^1_\gamma \|\alpha_1 - \alpha_2\|) E_{1-\omega}\big( L_\gamma(Y_j) \Gamma(1-\omega) (t-\tau)^{1-\omega} \big), \tag{2.218} \]
if $\omega > 0$, for $j = 1, 2$.

All results that have been previously derived for Banach spaces can be straightforwardly extended to the case of general $V$. For instance, Theorems 2.2.1 and 2.12.1 rewrite as follows.


Theorem 2.16.2. Let $F$ be a continuous mapping $\mathbb{R} \times V \to V$ which is Lipschitz-continuous in the variable $\mu \in V$ in all semi-norms, i.e.,
\[ p_\gamma(F(t,\mu_1) - F(t,\mu_2)) \le L_\gamma p_\gamma(\mu_1 - \mu_2) \]
for all $\gamma$ and some constants $L_\gamma$. Then for any $Y \in V$ there exists a unique global solution $\mu_t = \mu_t(Y)$ to the Cauchy problems
\[ \dot\mu_t = F(t, \mu_t), \quad \mu_a = Y, \quad t \in \mathbb{R}, \tag{2.219} \]
and
\[ D^\beta_{a+*} \mu_t = F(t, \mu_t), \quad \mu_a = Y, \quad t \ge a, \tag{2.220} \]
with any $\beta \in (0,1)$. Moreover,
\[ p_\gamma(\mu_t(Y) - Y) \le |t-a| e^{|t-a| L_\gamma} \sup_{s \in [a,t]} p_\gamma(F(s,Y)) \tag{2.221} \]
and
\[ p_\gamma(\mu_t(Y) - Y) \le E_\beta(L_\gamma (t-a)^\beta) \frac{(t-a)^\beta}{\Gamma(\beta+1)} \max_{s \in [a,t]} p_\gamma(F(s,Y)), \tag{2.222} \]
for the equations (2.219) and (2.220), respectively. Finally, the solutions $\mu_t(Y_1)$ and $\mu_t(Y_2)$ with different initial data $Y_1, Y_2$ satisfy the estimate
\[ p_\gamma(\mu_t(Y_1) - \mu_t(Y_2)) \le e^{|t-a| L_\gamma} p_\gamma(Y_1 - Y_2) \tag{2.223} \]
and
\[ p_\gamma(\mu_t(Y_1) - \mu_t(Y_2)) \le p_\gamma(Y_1 - Y_2) E_\beta(L_\gamma (t-a)^\beta) \tag{2.224} \]
for the equations (2.219) and (2.220), respectively.

Similar extensions can be achieved for the results on sensitivity.

2.17 Monotone and accretive operators

In this section, we remind the reader of the useful notion of monotone and accretive operators, returning back to Banach spaces for simplicity. Moreover, for the sake of completeness, we sketch the main results of the general theory of ODEs, involving operators that can cover quite many interesting problems. For more details, however, we refer to the abundant literature, where this theory is well documented, see, e.g., [27, 244, 262].

A Banach space is called strictly convex if $x \ne y$, $\|x\| = \|y\| = 1$ implies $\|hx + (1-h)y\| < 1$ for any $h \in (0,1)$. A Banach space is called uniformly convex if for any $\epsilon \in (0,2)$ there exists $\delta$ such that $\|x\| \le 1$, $\|y\| \le 1$ and $\|x - y\| > \epsilon$ implies $\|x + y\| < 2(1-\delta)$.


Exercise 2.17.1. (i) A Banach space is strictly convex, if $x \ne y$, $\|x\| = \|y\| = 1$ implies $\|hx + (1-h)y\| < 1$ for some $h \in (0,1)$. (ii) If a Banach space is uniformly convex, then it is strictly convex.

Proposition 2.17.1. If a Banach space $B$ or its dual $B^*$ is uniformly convex, then $B$ is reflexive (see, e.g., [244]).

Basic examples of uniformly convex Banach spaces are Hilbert spaces and the spaces $L^p(\mathbb{R}^n)$ with $p > 1$. On the other hand, the spaces $L^1(\mathbb{R}^n)$ and $M(\mathbb{R}^n)$ are neither strictly convex nor reflexive.

The duality mapping $J$ of $B$ is defined as the following multi-valued mapping from $B$ to $B^*$:
\[ J(x) = \{ x^* \in B^* : (x^*, x) = \|x\|^2 = \|x^*\|^2 \}. \tag{2.225} \]

Exercise 2.17.2. If $B = L^p(\mathbb{R}^n)$, $p > 1$, then $J$ is single-valued. Namely, if $f \in B$ with $\|f\| = 1$, then $(J(f))(x) = \mathrm{sgn}(f(x)) |f(x)|^{p-1} \in L^q(\mathbb{R}^n)$ with $1/q + 1/p = 1$.

Exercise 2.17.3. Let $B = L^1(\mathbb{R}^n)$ and $f \in L^1(\mathbb{R}^n)$. If $f > 0$ everywhere, then $J(f) = 1$. In general, $g \in J(f)$ if $g(x) = \mathrm{sgn}(f(x))$ for $f(x) \ne 0$ and $g(x) \in [-1,1]$ otherwise.

Proposition 2.17.2. The image $J(x)$ is never empty. (This is a consequence of the Hahn–Banach theorem.) If $B$ is strictly convex, then the duality mapping is single-valued.

Proof. See [244], Section II.8. □
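The formula of Exercise 2.17.2 can be tested on a finite-dimensional analogue. The sketch below replaces $L^p(\mathbb{R}^n)$ by the sequence space $\ell^p$ on finitely many coordinates (counting measure), which is an assumption made only for illustration. For a unit vector $f$ it checks the two defining identities in (2.225), namely $(J(f), f) = \|f\|_p^2 = 1$ and $\|J(f)\|_q = 1$. The names `lp_norm` and `duality_map` and the sample vector are hypothetical.

```python
import math

def lp_norm(v, p):
    """The l^p norm of a finite sequence v."""
    return sum(abs(c)**p for c in v) ** (1.0 / p)

def duality_map(f, p):
    """J(f) = sgn(f) |f|^(p-1), as in Exercise 2.17.2, for a unit vector f."""
    return [math.copysign(abs(c)**(p - 1), c) for c in f]

p = 3.0
q = p / (p - 1.0)                      # conjugate exponent, 1/p + 1/q = 1

raw = [1.0, -2.0, 0.5, 3.0]            # arbitrary sample vector
norm = lp_norm(raw, p)
f = [c / norm for c in raw]            # normalised so that ||f||_p = 1

Jf = duality_map(f, p)
pairing = sum(a * b for a, b in zip(Jf, f))   # the duality pairing (J(f), f)
dual_norm = lp_norm(Jf, q)                    # ||J(f)||_q
print(pairing, dual_norm)
```

Both printed numbers should equal 1 up to rounding, since $\sum |f_i|^p = 1$ and $(p-1)q = p$.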



A mapping A from a subspace D of a Banach space B to subsets of its dual B ∗ (in other words, a multi-valued mapping D → B ∗ ) is called monotone if (x∗ − y ∗ , x − y) ≥ 0 for all x, y ∈ D, x∗ ∈ A(x), y ∗ ∈ A(y). In particular, if A is linear and single-valued, this is equivalent to the requirement that (A(x), x) ≥ 0. The most fundamental example of monotone mappings are sub-gradients of convex functions. Recall that for a convex function φ : B → R the sub-gradient ∂φ(x) at x is defined as ∂φ(x) = {x∗ ∈ B ∗ : φ(z) − φ(x) ≥ (x∗ , z − x) for all z ∈ B}.

(2.226)

Summing up these conditions for the pairs (x, y) and (y, x) yields (x* − y*, x − y) ≥ 0 for all x* ∈ ∂φ(x), y* ∈ ∂φ(y), that is, the monotonicity of the sub-gradient mapping x ↦ ∂φ(x). If H is a Hilbert space and D ⊂ H, a mapping A : H → H is called accretive if it becomes monotone after the usual identification of H and H* (see Remark 1), i.e., if (A(x) − A(y), x − y) ≥ 0 for all x, y ∈ D. In order to define accretivity for Banach spaces, one uses the duality mapping to transfer B to B*. As for monotonicity, the most natural notion of accretivity is formulated in terms of multi-valued mappings. Recall that by a (binary) relation A on B, one means any subset of B × B whose domain D(A) is defined as {x ∈ B : ∃y : {x, y} ∈ A} and whose range or image is defined as {y ∈ B : ∃x : {x, y} ∈ A}. These relations are naturally identified with multi-valued mappings (which we denote by the same letter): A(x) = {y : {x, y} ∈ A}. The inverse relation or the inverse mapping is defined as A^{−1} = {{y, x} : {x, y} ∈ A}. Linear operations on the relations are defined as λA = {{x, λy} : {x, y} ∈ A} for λ ∈ R and A + B = {{x, y + z} : {x, y} ∈ A, {x, z} ∈ B}. A relation A is said to be a contraction if ‖y_1 − y_2‖ ≤ ‖x_1 − x_2‖ for all y_j ∈ A(x_j). The relation or the multi-valued mapping A is called accretive if [x_1 − x_2, y_1 − y_2]_+ ≥ 0 ⇐⇒ (x_1 − x_2, y_1 − y_2)_+ ≥ 0

(2.227)

whenever {x_j, y_j} ∈ A, j = 1, 2. (The equivalence follows from (1.40).) By the definition of the semi-inner product and by the monotonicity of the slopes of convex functions (1.37), this condition is equivalent to the requirement that ‖x_1 − x_2‖ ≤ ‖(x_1 + αy_1) − (x_2 + αy_2)‖

(2.228)

whenever α > 0 and {x_j, y_j} ∈ A, j = 1, 2. In other words, this requirement means that the relations J_α = (I + αA)^{−1} are contractions for all α > 0. If such an A is single-valued, one can call it an accretive mapping. Note, however, that the term ‘accretive mapping’ often refers to multi-valued mappings in the literature. For a linear operator A, condition (2.228) turns into ‖x + αAx‖ ≥ ‖x‖ for any α > 0, in which case one says that the operator (−A) is dissipative.

Proposition 2.17.3. (i) A is accretive if and only if {x_j, y_j} ∈ A, j = 1, 2 implies that there exists f ∈ J(x_1 − x_2) such that (f, y_1 − y_2) ≥ 0. (ii) If A is accretive, then the image of (I + αA) coincides with the whole B for some α > 0 if and only if it coincides with the whole B for all α > 0.

Proof. See [244], Section IV.7.
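The contraction property of the resolvents J_α = (I + αA)^{−1} can be illustrated on the real line. The sketch below assumes the scalar operator A(x) = x³ (an illustrative choice, accretive because it is increasing) and computes the resolvent by bisection; the function name `resolvent` is hypothetical.

```python
def resolvent(z, alpha):
    """Return x solving x + alpha * x**3 = z, i.e. x = (I + alpha*A)^(-1) z."""
    lo, hi = -abs(z), abs(z)          # the root lies in [-|z|, |z|]
    for _ in range(200):              # bisection; the left side is increasing in x
        mid = 0.5 * (lo + hi)
        if mid + alpha * mid ** 3 < z:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)

alpha = 0.7
points = [-3.0, -1.0, -0.2, 0.0, 0.5, 2.0, 4.0]
ratios = [
    abs(resolvent(z1, alpha) - resolvent(z2, alpha)) / abs(z1 - z2)
    for i, z1 in enumerate(points)
    for z2 in points[i + 1:]
]
print(max(ratios))   # stays below 1, as the contraction property requires
```

For this particular A the image of I + αA is the whole line for every α > 0, so the example is also m-accretive in the sense defined in the text.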

An accretive relation is called m-accretive if the last condition in Proposition 2.17.3 holds, i.e., the image of (I + αA) coincides with the whole B for all α > 0. The following theorem of Kato is the main result for accretive operators in a uniformly convex context.

Theorem 2.17.1. Let B* be uniformly convex and A an m-accretive relation in B. Then for any u_0 ∈ D(A), ω ∈ R and T > 0, there exists a unique curve u ∈ C([0, T]; B) such that ωu(t) ∈ u′(t) + A(u(t))

(2.229)

for almost all t ∈ [0, T]. Moreover, u′ exists almost everywhere and is uniformly bounded.




Proof. See [244], Section IV.7.

For single-valued relations A, the inclusion (2.229) reduces to the usual ODE u′(t) + A(u(t)) = ωu(t). An extension of this result exists for general Banach spaces, namely the Crandall–Liggett Theorem, see, e.g., [244] or [262]. In this case, however, one cannot generally guarantee the existence of a classical solution. Therefore, one proves the existence of a unique generalized solution, the so-called C^0-solution, which is defined as the limit of certain natural discrete approximations. Kato’s and Crandall–Liggett’s Theorems extend the famous Hille–Yosida result for linear operators A to the nonlinear case. As for the Hille–Yosida case, the necessity to check m-accretivity is a very delicate point when checking the assumptions of these theorems, since m-accretivity is a much stronger requirement than just accretivity. We refer to the above-mentioned books for numerous examples of successful applications of Kato’s and Crandall–Liggett’s Theorems. A notable example is the so-called porous medium equation in L^1(Ω), Ω ⊂ R^n, with Au = −Δρ(u) and ρ some real function, the main example being ρ(r) = r|r|^{m−1}.
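One common choice for the discrete approximations mentioned above is the implicit Euler scheme u_{k+1} = (I + hA)^{−1}u_k. The following sketch is a scalar illustration under assumptions that are not taken from the text: ω = 0, A(u) = u³, and the exact solution u(t) = u_0/√(1 + 2u_0²t) of u′ + u³ = 0 is used for comparison. The names `resolvent` and `implicit_euler` are illustrative.

```python
import math

def resolvent(z, h):
    """(I + h*A)^(-1) z for A(u) = u**3, computed by bisection on [-|z|, |z|]."""
    lo, hi = -abs(z), abs(z)
    for _ in range(200):
        mid = 0.5 * (lo + hi)
        if mid + h * mid ** 3 < z:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)

def implicit_euler(u0, t, n):
    """Apply the resolvent with step t/n exactly n times."""
    u, h = u0, t / n
    for _ in range(n):
        u = resolvent(u, h)
    return u

u0, t = 1.0, 2.0
exact = u0 / math.sqrt(1.0 + 2.0 * u0 ** 2 * t)     # solves u' = -u**3, u(0) = u0
errors = [abs(implicit_euler(u0, t, n) - exact) for n in (10, 100, 1000)]
print(errors)   # the error shrinks as the number of steps grows
```

Each step only requires solving x + hA(x) = z, which is exactly where the range condition of m-accretivity enters.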

2.18 Hints and answers to chosen exercises

Exercise 2.2.1. The solution is x²(t) = (x_0^{−2} − 2t)^{−1}, and the explosion time is given by t_0 = 1/(2x_0²).

Exercise 2.2.2. For any t_0 ≥ 0, the formulae x(t) = 0 for t ≤ t_0 and x(t) = (t − t_0)²/4 for t ≥ t_0 define a solution.

Exercise 2.2.3. Substituting u(t, x) = w(x + ct) in (2.28) yields the equation

\[ cw' = \frac{3}{2} w w' + \frac{1}{4} w''', \]

which implies

\[ cw = \frac{3}{4} w^2 + \frac{1}{4} w'' + c_1 \]

with a constant c_1. For further integration and to get rid of the second derivative, the trick is to multiply this equation by the factor w′.

Exercise 2.3.2. Since σ_j² = 1, exp{tσ_j} = cosh t + σ_j sinh t.

Exercise 2.3.3. The key point is that (J_2)² = 0, (J_3)³ = 0. Answer:

\[ \exp\{tJ_2\} = 1 + tJ_2 = \begin{pmatrix} 1 & t \\ 0 & 1 \end{pmatrix}, \qquad \exp\{tJ_3\} = 1 + tJ_3 + \frac{t^2}{2} J_3^2 = \begin{pmatrix} 1 & t & t^2/2 \\ 0 & 1 & t \\ 0 & 0 & 1 \end{pmatrix}. \]


Exercise 2.3.4. Rewrite the equation in terms of f and y = ḟ, and use the Duhamel formula. Alternatively, rewrite the equation in terms of the function ḟ. Answer:

\[ f(t) = f_0 + \frac{1}{b}\,(1 - e^{-bt})\, y_0 + \frac{1}{b} \int_0^t (1 - e^{-(t-s)b})\, g(s)\, ds. \tag{2.230} \]

Exercise 2.3.5. Answer:

\[ f(t) = f_0 + (f_T - f_0)\, \frac{1 - e^{-bt}}{1 - e^{-bT}} + \frac{1}{b} \int_0^t (1 - e^{-(t-s)b})\, g(s)\, ds - \frac{1 - e^{-bt}}{b\,(1 - e^{-bT})} \int_0^T (1 - e^{-(T-s)b})\, g(s)\, ds. \tag{2.231} \]
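The boundary-value formula (2.231) can be checked numerically. The sketch below assumes that the underlying equation is f″ + bf′ = g (this is what differentiating (2.230) twice gives) and that the last integral in (2.231) runs over [0, T] with the kernel 1 − e^{−(T−s)b}, which is what the condition f(T) = f_T requires. The choice g = cos, the parameter values, and the helper names `simpson` and `f_boundary` are illustrative.

```python
import math

def simpson(func, a, b, n=2000):
    """Composite Simpson rule with an even number n of subintervals."""
    if a == b:
        return 0.0
    h = (b - a) / n
    total = func(a) + func(b)
    for i in range(1, n):
        total += (4 if i % 2 else 2) * func(a + i * h)
    return total * h / 3.0

def f_boundary(t, f0, fT, T, b, g):
    """Right-hand side of (2.231) for given boundary values f0, fT."""
    ratio = (1.0 - math.exp(-b * t)) / (1.0 - math.exp(-b * T))
    part_t = simpson(lambda s: (1.0 - math.exp(-(t - s) * b)) * g(s), 0.0, t) / b
    part_T = simpson(lambda s: (1.0 - math.exp(-(T - s) * b)) * g(s), 0.0, T) / b
    return f0 + (fT - f0) * ratio + part_t - ratio * part_T

f0, fT, T, b = 1.0, -2.0, 1.5, 0.8
g = math.cos

# boundary values at t = 0 and t = T
left = f_boundary(0.0, f0, fT, T, b, g)
right = f_boundary(T, f0, fT, T, b, g)

# residual of f'' + b f' - g at an interior point, by central differences
t, d = 0.6, 1e-3
fm, fc, fp = (f_boundary(t + k * d, f0, fT, T, b, g) for k in (-1, 0, 1))
residual = (fp - 2 * fc + fm) / d ** 2 + b * (fp - fm) / (2 * d) - g(t)
print(left, right, residual)
```

The residual is only as small as the finite-difference step allows, so it should be read as a consistency check rather than a proof.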

Exercise 2.5.1. The maximum in max_p (pv − (G(x)p, p)/2) is attained at p = G^{−1}(x)v.

Exercise 2.6.1. If the curve (x(τ), p(τ)) is a solution to (2.81) joining x_0 and x in the time t, then the curve (x̃(τ) = x(t − τ), p̃(τ) = −p(t − τ)) is the solution to the Hamiltonian system with the Hamiltonian H̃ joining the points x and x_0 in the time t.

Exercise 2.6.2. For the first equation in (2.113), note that

\[ \frac{\partial^2 S}{\partial x^2}(t, x, x_0) = \frac{\partial p(t, x)}{\partial x}, \]

where p(t, x) = P(t, x_0, p_0), x = X(t, x_0, p_0).

Exercise 2.11.1. Differentiation turns this equation into the ODE ẍ = x.

Exercise 2.15.1. The estimates for the increments ξ_t^{n+1} − ξ_t^n become

\[ \|\xi_t^{n+1} - \xi_t^n\| \le \Gamma(1-\omega)\, L(M_T(R))\, I_\tau^{1-\omega} \|\xi_.^n - \xi_.^{n-1}\|(t) + \kappa \|\xi\| (t-\tau)\, \frac{\big(L(M_T(R))\,\Gamma(1-\omega)\,(t-\tau)^{1-\omega}\big)^{n-1}}{\Gamma(n(1-\omega)+1)} \]

for some constant κ = κ(T) that depends on T.

Exercise 2.17.1. (i) Let h_0 ∈ (0, 1) be such that ‖h_0 x + (1 − h_0)y‖ < 1. Suppose that there is another h̃ ∈ (0, 1), say h̃ > h_0, such that ‖h̃x + (1 − h̃)y‖ = 1. Then ‖hx + (1 − h)y‖ = 1 for all h ∈ [h̃, 1], which contradicts the assumption for the pair of points h̃x + (1 − h̃)y and x. (ii) This follows from (i), since it yields the condition there with h = 1/2.

Exercise 2.17.2. This follows from the Hölder inequality, which states that ∫ g(x)f(x) dx ≤ ‖g‖_p ‖f‖_q. The equality can only occur if |g(x)|^p = |f(x)|^q for almost all x.


2.19 Summary and comments

As mentioned before, the material of this chapter is more or less standard. Still, we tried hard to balance clarity, brevity and a reasonable generality, as well as to streamline and simplify the proofs. A methodological novelty lies in the systematic use of abstract fixed-point results for curves from Section 2.1, which is amplified by the use of the semigroup of fractional integration for dealing with singularities in time. This leads to a very concise presentation of various well-posedness results, extending usual ODEs to rather general causal equations and equations with fractional derivatives, and to precise estimates (including constants) for the growth of solutions, the continuous dependence on the initial data and the derivatives with respect to initial data and parameters. Later in this book, we shall see that the results from Section 2.1 allow for quick and elegant arguments in other, more advanced situations like nonlinear diffusions or the Hamilton–Jacobi–Bellman equations. Also, much care was given to distinguishing the differentials of Gâteaux and Fréchet, which is a specific feature of the infinite-dimensional setting. Moreover, the method of T-products or chronological exponentials was developed in some detail. This method is used in abundance in the physics literature, while (strangely enough) it is a rare guest in mathematics textbooks. The representation (2.80) for the Mittag-Leffler function, which is crucial for our approach to fractional calculus, was probably first established by Zolotarev in [266], following the results of Pollard [223], who proved that the Mittag-Leffler function is the Laplace transform of a positive function. In Chapter 8, we shall give two new proofs for this formula and its extension to generalized Mittag-Leffler functions. Section 2.5 is a very short introduction to Hamiltonian systems, with the emphasis on boundary-value problems.
For a full exposition of such boundary-value problems (including complex and/or stochastic characteristics), we refer the reader to [136] and [159]. The classical book on the mathematical aspects of Hamiltonian mechanics is [16]. Hamiltonian dynamics can be integrable or exhibit chaotic behaviour. Integrable systems have rather transparent general structures, since their trajectories fill the tori (Arnold–Liouville theorem). Therefore, much effort was given to the description and classification of integrable systems. For two-dimensional flows, a powerful topological invariant is the Fomenko–Zieschang invariant that was discovered in [85]. This invariant can be effectively calculated for many classical systems in order to prove their topological equivalence or nonequivalence, see, e.g., [203] and [40]. Among the conservation laws for Hamiltonian systems, a most prominent role is played by laws that are polynomial in the momentum. Therefore, much attention has been given to geodesic flows with such first integrals. Starting from the classification of geodesic flows with quadratic integrals in [132], the problem was intensively studied with impressive results on the geodesic flows, see, e.g., [252] and [172], as well as for other systems, see [170].


The integrability of geodesic flows with polynomial integrals leads to various interesting geometric properties (like all geodesics being closed), see [133] and [126], as well as spectral properties of the corresponding geometric Laplacians, see [67]. Complete integrability of infinite-dimensional Hamiltonian systems remains a very active area of research that we do not touch here at all. Sections 2.6, 2.7 and 2.10 touch on the vast area of the method of characteristics for first-order PDEs. Its development in the geometric theory of branching solutions leads to the dynamics of Lagrangian manifolds, which specifies the quasi- or semiclassical approximations for quantum mechanical problems via the Maslov canonical operator, see, e.g., [199, 200]. One version of this theory can be applied to diffusions and other stochastic processes, see [136]. The idea is that in the asymptotic analysis of PDEs with a small parameter, like the Schrödinger equation ihψ̇ = (V(x) − h²Δ/2)ψ or the diffusion equation hu̇ = −(V(x) − h²Δ/2)u, the solution is sought in a quickly oscillating form ψ(t, x) = φ(t, x) exp{iS(t, x)/h} or in a bell-shaped form u(t, x) = φ(t, x) exp{−S(t, x)/h} with S ≥ 0, respectively. The developments in the HJB-equation theory in applications to optimization and games lead to the theory of generalized solutions based on the idea of viscosity solutions, see [83], or on Subbotin’s minimax solutions, see [250] and [251], or on the idempotent (or tropical, or max-plus) superposition principle and related ideas of approximations, see, e.g., [158] and [194]. The analogues of usual characteristics for the generalized solutions are certain piecewise smooth curves that are specified by the Pontryagin maximum principle. Apart from classical Hamiltonian systems, the method of characteristics has been successfully applied to general first-order PDEs, as well as to integro-differential and general abstract operator equations, see, e.g., [200] and [201].
Semiclassical asymptotics involving general first-order PDEs were developed in the framework of superprocesses in [135]. The method of characteristics can even be successfully developed if the solutions to the underlying ODEs are not well defined. In this case, certain generalized solutions arise, see, e.g., the case of the transport equation in [64, 193] and references therein. An interesting development is the method of stochastic characteristics, which can be used for semiclassical approximations of the stochastic heat or Schrödinger equation (see [136]). This method transforms stochastic PDEs to simpler PDEs with random coefficients, see [174] for the general theory and [162] for its application to the analysis of sensitivity of stochastic McKean–Vlasov equations. For stochastic characteristics in optimal control, we can refer to [66] and references therein. Various probabilistic methods for integral representations and numeric solutions (see, e.g., [61] and references therein) can also be considered an extension to the method of characteristics, where the solutions to PDEs propagate via the random trajectories of Markov processes. More details on causal equations that have been touched upon in Section 2.11 can be found, e.g., in [178]. Insightful examples occur in the modelling of epidemics, see, e.g., [230]. Fractional equations were quickly discussed as an exemplary application of general fixed-point theorems for curves. This topic will be picked up and further developed in Chapter 8. Well-written books on ODEs in Banach spaces include [27, 197, 257, 262]. A more application-oriented presentation is given in [244]. The so-called degenerate equations (where the l.h.s. (d/dt)Mf contains a linear operator M with a nontrivial kernel, rather than just d/dt) are developed in [75]. The basics of ordinary differential equations in locally convex spaces have been developed in [209] and [189]. The porous medium equation u̇ = Δρ(u) that was mentioned in Section 2.17 remains a very active area of research, see, e.g., [217] and [4] and references therein. Note that we did not touch on the very important direction of research that arises from ODEs with a discontinuous r.h.s. For such equations, we refer to [79, 80] and [171]. A proper analysis of such equations naturally leads to the theory of differential inclusions, see [18], [248]. Abundant literature exists that is devoted solely to the existence of solutions to ODEs when the r.h.s. is continuous, but not Lipschitz-continuous. For this purpose, a different class of fixed-point principles must be used. The most standard principle is the Schauder fixed-point principle, stating that a compact (completely continuous) mapping from a convex closed subset of a Banach space to itself has a fixed point. An almost direct application of this result yields the existence of a solution to the Carathéodory equation ẋ = f(t, x) in R^d, where f is measurable with respect to t, continuous with respect to x and bounded by a summable function of t (see, e.g., [80] for a proof). If f is continuous, this result is the classical Peano theorem. In Banach spaces, however, it does not hold.
Namely, Godunov’s theorem states that for any infinite-dimensional Banach space B there exists a continuous function f(t, x) on R × B and initial condition x_0 such that the equation ẋ = f(t, x) has no (even local) solution with this initial condition. Hájek and M. Johanis showed in [102] that for any separable Banach space B there exists a continuous mapping f : B → B such that the autonomous equation ẋ = f(x) has no solution at any point. In order to prove the existence of solutions of Banach space-valued ODEs, one uses various extensions of the fixed-point principles that often depend on some measure of non-compactness, the simplest one being the Kuratowski measure of non-compactness χ(S) of subsets of a metric space, which is the infimum of numbers ε such that there exists a covering of S by a finite number of sets whose diameter does not exceed ε. E.g., the Sadovskii fixed-point principle states that a continuous condensing mapping from a convex closed subset of a Banach space to itself has a fixed point (see [236]). (A mapping f from a bounded subset S of a Banach space S is called condensing if χ(f(S′)) < χ(S′) for all S′ ⊂ S.) The related weaker Darbo theorem [58] states the existence of a fixed point if χ(f(S′)) ≤ kχ(S′) for all S′ ⊂ S with some k ∈ (0, 1). Many peculiarities arise when passing from Banach spaces to Fréchet spaces. For instance, one can use a version of the Lipschitz condition with the Lipschitz constant being an infinite-dimensional matrix that calibrates the stretching of different norms (see, e.g., [108] and references therein). Another approach is based on the fact that Fréchet spaces are projective limits of Banach spaces and that ODEs in Fréchet spaces can be recast in terms of the systems of equations in sequences of Banach spaces (see, e.g., [89] and references therein). Yet another approach is based on various subtle smoothing properties of specific Fréchet spaces and the resulting inverse function theorems (see, e.g., [225] and references therein). Let us stress again that we were mainly interested in constructing sufficiently regular global solutions to infinite-dimensional ODEs. Abundant literature exists that is devoted to the general classification of various particular features of the solutions to infinite-dimensional ODEs as compared to finite-dimensional ones, see, e.g., [52, 101, 112, 189].

Chapter 3

Discrete Kinetic Systems: Equations in l_p^+

In this chapter, we initiate the theory of positivity-preserving ODEs with unbounded coefficients in the simplest case of spaces of sequences. Unlike the Lipschitz-continuous case, the behaviour for forward- and backward-times is quite different. As a warm-up, we begin with an elementary theory of ODEs in R^n_+. This setting allows for a succinct demonstration of two new tools that arise as a substitute for global Lipschitz continuity, namely the bounds from positivity preservation and from linear Lyapunov functions. Afterwards, we introduce the main examples for equations in R^n_+ and l_p^+ that occur in natural and social sciences. The analysis of equations in R^n_+ is concluded by basic results on equilibria and ergodicity that arise from analysing entropy and its extensions. The study of chemical reactions gives inspiration for a useful general representation of positivity-preserving ODEs in order to describe the evolution of interacting particle systems. This representation is the starting point for more advanced methods that are based on two additional tools, namely moment estimates and accretivity. These methods will be further developed for equations in l_p. Hence, this chapter is effectively decomposed into two parts that can be read independently: first Sections 3.1 to 3.4 on the evolutions in R^n_+, with the emphasis on large time behaviour, and secondly Sections 3.5 to 3.14 on the evolutions in l_p^+, with the emphasis on non-explosion, uniqueness and sensitivity. Note that the results of this chapter are not used in other parts of the book and give a more or less self-contained overview of the topic. In Chapter 7, the theory will be generalized to measure-valued evolutions on arbitrary state spaces.

© Springer Nature Switzerland AG 2019 V. Kolokoltsov, Differential Equations on Measures and Functional Spaces, Birkhäuser Advanced Texts Basler Lehrbücher, https://doi.org/10.1007/978-3-030-03377-4_3


3.1 Equations in R^n_+

Even very simple equations with a quadratic r.h.s. may have no global solutions. For instance, the equation ẋ = x² in R has the general solution x(t) = (x_0^{−1} − t)^{−1}, which for x_0 > 0 is only defined for t < 1/x_0. Nevertheless, important classes of equations with a quadratic or polynomial r.h.s. do have global solutions due to the combined effect of two bounds, arising from positivity and growth estimates that are governed by a linear Lyapunov function. First of all, recall the notations

\[ \mathbb{R}^n_+ = \{x = (x_1, \dots, x_n) \in \mathbb{R}^n : x_j \ge 0 \text{ for all } j\}, \qquad \Sigma_n = \Big\{x \in \mathbb{R}^n_+ : \sum_j x_j = 1\Big\} \]

for the positive quadrant and the standard simplex in R^n. We shall denote the interior of R^n_+ by R^n_{++} = {x = (x_1, . . . , x_n) ∈ R^n : x_j > 0 for all j}. The expression (x, y) denotes the usual inner product in R^n. The following simple observation is the starting point for our analysis. If A is an n × n-matrix, then the solution e^{tA}x to the linear equation ẋ = Ax in R^n with the initial condition x always takes R^n_+ to itself if and only if A is conditionally positive in the sense that its off-diagonal terms are non-negative (or equivalently, if (Av)_j ≥ 0 whenever v_j = 0). We say that the r.h.s. f(x) of the ODE ẋ = f(x) ⇐⇒ {ẋ_j = f_j(x) for all j = 1, . . . , n}

(3.1)

is conditionally positive if fj (x) ≥ 0 whenever xj = 0. Moreover, we say that a vector L ∈ Rn++ is a Lyapunov function or a Lyapunov vector for equation (3.1), or that the mapping f has the Lyapunov function L, if the Lyapunov condition (L, f (x)) ≤ a(L, x) + b

(3.2)

holds with some constants a, b. The function L is called a subcritical (respectively critical) Lyapunov function, or f is said to be L-subcritical (respectively L-critical), if (L, f(x)) ≤ 0 (respectively (L, f(x)) = 0) for all x ∈ R^n_+.

Theorem 3.1.1. If f : R^n_+ → R^n is conditionally positive, locally Lipschitz-continuous (that is, Lipschitz-continuous on any bounded subset of R^n_+) and has a Lyapunov vector L ∈ R^n_{++}, then for any x ∈ R^n_+ there exists a unique global solution X(t, x) (defined for all t ≥ 0) to equation (3.1) with the initial condition x, and this solution lies in R^n_+ for all t. Moreover, if a ≠ 0, then

\[ 0 \le (L, X(t, x)) \le e^{at}\Big((L, x) + \frac{b}{a}\Big) - \frac{b}{a}. \tag{3.3} \]

If a = 0, then (L, X(t, x)) ≤ (L, x) + bt. Finally, if f is L-critical, then (L, X(t, x)) = (L, x).
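The statement of Theorem 3.1.1 can be illustrated numerically. The sketch below makes several assumptions that are not from the text: the quadratic field f(x) = (x_1 + x_2 − x_1² + 1, x_1² − x_2) on R²_+ is an illustrative example, L = (1, 1) is a Lyapunov vector for it with a = b = 1 because (L, f(x)) = x_1 + 1, and a standard Runge–Kutta step is used for integration (it is not positivity-preserving in general, it merely behaves well for this example).

```python
import math

def f(x):
    """Conditionally positive quadratic field on R^2_+ (an illustrative choice)."""
    x1, x2 = x
    return (x1 + x2 - x1 * x1 + 1.0, x1 * x1 - x2)

def rk4_step(x, h):
    """One classical Runge-Kutta step for x' = f(x)."""
    def shifted(u, v, c):
        return tuple(a + c * b for a, b in zip(u, v))
    k1 = f(x)
    k2 = f(shifted(x, k1, h / 2))
    k3 = f(shifted(x, k2, h / 2))
    k4 = f(shifted(x, k3, h))
    return tuple(xi + h / 6 * (p + 2 * q + 2 * r + s)
                 for xi, p, q, r, s in zip(x, k1, k2, k3, k4))

L = (1.0, 1.0)            # Lyapunov vector: (L, f(x)) = x1 + 1 <= a*(L, x) + b
a, b = 1.0, 1.0
x0 = (0.5, 0.0)
h, steps = 1e-3, 3000

x, ok_positive, ok_bound = x0, True, True
Lx0 = sum(l * xi for l, xi in zip(L, x0))
for n in range(1, steps + 1):
    x = rk4_step(x, h)
    t = n * h
    bound = math.exp(a * t) * (Lx0 + b / a) - b / a      # right-hand side of (3.3)
    ok_positive = ok_positive and min(x) >= 0.0
    ok_bound = ok_bound and sum(l * xi for l, xi in zip(L, x)) <= bound + 1e-9
print(x, ok_positive, ok_bound)
```

The quadratic term −x_1² would cause no trouble here anyway, but the same pattern applies to fields where only the Lyapunov bound prevents an explosion.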


Remark 45. Intuitively, this result is clear. In fact, by conditional positivity, the vector field f(x) on any boundary point of R^n_+ is directed inside or tangent to the boundary, thus not allowing a solution to leave. On the other hand, the Lyapunov condition implies that

\[ (L, X(t, x)) \le (L, x) + a \int_0^t (L, X(s, x))\, ds + bt, \]

which leads to (3.3) by Gronwall’s lemma. However, a rigorous proof is not fully straightforward, since already the existence of a solution is not clear: Any attempts to construct a solution via the usual approximation schemes (see, e.g., Theorem 2.2.1) or by standard Euler or Peano approximations encounter a problem, since all these approximations may not be positivity-preserving (and may therefore jump out of the domain where f is defined).

Proof. We obtain positivity-preserving approximations by using a linear bound for the negative part of f. Namely, assuming a ≠ 0 for definiteness, let

\[ M^x_{a,b}(t) = \Big\{ y \in \mathbb{R}^n_+ : (L, y) \le e^{at}\Big((L, x) + \frac{b}{a}\Big) - \frac{b}{a} \Big\}. \]

Fixing T, let us define the space C_{a,b}(T) of continuous functions y : [0, T] → R^n_+ such that y([0, t]) ⊂ M^x_{a,b}(t) for all t ∈ [0, T]. Now, let f_L = f_L(x) be the maximum of the Lipschitz constants (1.7) of all f_j on M^x_{a,b}(T), and let us pick a constant K = K(x) ≥ f_L. By conditional positivity, f_j(y) ≥ −Ky_j in M^x_{a,b}(T). Therefore, we can rewrite equation (3.1) equivalently as

\[ \dot y = (f(y) + Ky) - Ky \iff \{\dot y_j = (f_j(y) + Ky_j) - Ky_j, \quad j = 1, \dots, n\}, \tag{3.4} \]

which ensures that the nonlinear part f_j(y) + Ky_j of the r.h.s. is always non-negative. Next, we modify the usual approximation scheme (see Theorem 2.2.1) by defining the map Φ_x from C_{a,b}(T) to itself in the following way: For a y ∈ C([0, T], R^n_+), let Φ_x(y) be the solution to the equation

\[ \frac{d}{dt} [\Phi_x(y)](t) = f(y(t)) + Ky(t) - K[\Phi_x(y)](t), \]

with the initial data [Φ_x(y)](0) = x. It is a linear equation with a unique explicit solution, which can be taken as an alternative definition of Φ_x:

\[ [\Phi_x(y)](t) = e^{-Kt} x + \int_0^t e^{-K(t-s)} [f(y(s)) + Ky(s)]\, ds. \]

Clearly, a fixed point of Φ is a solution to (3.1) with the initial data x.


Next, let us check that Φ takes C_{a,b}(T) to itself. In fact, if y ∈ C_{a,b}(T), we find that

\[
\begin{aligned}
(L, [\Phi_x(y)](t)) &\le e^{-Kt}(L, x) + \int_0^t e^{-K(t-s)} [(a + K)(L, y(s)) + b]\, ds \\
&\le e^{-Kt}(L, x) + \int_0^t e^{-K(t-s)} \Big[ (a + K)\Big( e^{as}\Big((L, x) + \frac{b}{a}\Big) - \frac{b}{a} \Big) + b \Big]\, ds \\
&= (L, x)e^{-Kt} + e^{-Kt}\big(e^{(K+a)t} - 1\big)\Big((L, x) + \frac{b}{a}\Big) - e^{-Kt}\big(e^{Kt} - 1\big)\frac{b}{a} \\
&= (L, x)e^{at} + \frac{b}{a}\,(e^{at} - 1).
\end{aligned}
\]

Notice that, due to this bound, the iterations of Φ remain in C_{a,b}(T). Therefore, it is justified to use the Lipschitz constant K for f. The proof is completed by referring to Theorem 2.1.2, because

\[ \|[\Phi_{x_1}(y^1_.)](t) - [\Phi_{x_2}(y^2_.)](t)\| \le (K + f_L) \int_0^t \|y^1_s - y^2_s\|\, ds + \|x_1 - x_2\| \]

and ‖[Φ_x(x)](t) − x‖ ≤ t‖f(x)‖.
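The positivity-preserving iteration used in the proof can be sketched on a time grid. Everything specific below is an assumption made for illustration: the field f (the same kind of quadratic example, with Lyapunov vector L = (1, 1) and a = b = 1), the constant K = 8 chosen by hand so that f_j(y) + Ky_j ≥ 0 on the relevant bounded set, and the discretization, which freezes the integrand at the left node of each step and integrates the linear part exactly.

```python
import math

def f(y):
    """Illustrative conditionally positive field on R^2_+."""
    y1, y2 = y
    return (y1 + y2 - y1 * y1 + 1.0, y1 * y1 - y2)

def phi(y_path, x, K, h):
    """One application of the map Phi_x on a uniform grid with step h.

    On each step g = f(y) + K*y is frozen at the left node, and the
    linear equation z' = g - K*z is then integrated exactly.
    """
    decay = math.exp(-K * h)
    weight = (1.0 - decay) / K
    z = [tuple(x)]
    for y in y_path[:-1]:
        g = tuple(fj + K * yj for fj, yj in zip(f(y), y))   # non-negative here
        z.append(tuple(decay * zj + weight * gj for zj, gj in zip(z[-1], g)))
    return z

x, T, K, n = (0.5, 0.0), 0.5, 8.0, 500
h = T / n
path = [x] * (n + 1)                  # start from the constant curve y(t) = x
min_seen = min(x)
for _ in range(60):
    new_path = phi(path, x, K, h)
    gap = max(abs(u - v) for p, q in zip(path, new_path) for u, v in zip(p, q))
    min_seen = min(min_seen, min(min(p) for p in new_path))
    path = new_path

print(path[-1], gap, min_seen)   # gap between the last two iterates, smallest coordinate seen
```

Every iterate is a combination of non-negative quantities with positive weights, which is the discrete counterpart of the reason why Φ_x keeps curves inside R^n_+.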



3.2 Examples in R^n_+: replicator dynamics and mass-action-law kinetics

This section presents basic examples of equations in R^n_+ as they arise in chemistry, physics, biology and economics. Apart from supplying practical examples, the relevant models of natural science motivate an important representation of such equations in cases when an appropriate Lyapunov function is available. Namely, they make it possible to represent the r.h.s. of any such equation as the description of a combined effect that results from many small-scale reactions, each of which satisfies the corresponding Lyapunov condition. This representation is crucial for the advanced analysis of such equations, since it allows for an estimate of higher moments of the conservation law and for proving the property of accretivity that leads to uniqueness and continuous dependence on initial data. To begin with, let us consider equations with a quadratic r.h.s.:

\[ \dot x = (Ax, x) \iff \{\dot x_j = (A^j x, x) \text{ for all } j = 1, \dots, N\}, \tag{3.5} \]

where each A^j is a symmetric N × N-matrix. Let Ã^j denote the (N − 1) × (N − 1)-matrix obtained from A^j by deleting the row and column of index j. Clearly, the r.h.s. of (3.5) is conditionally positive if and only if

\[ (\tilde A^j v, v) \ge 0 \quad \text{for all } v \in \mathbb{R}^{N-1}_+. \tag{3.6} \]


The following equation has similar properties:

\[ \dot x = A[x^{\otimes k}] \iff \{\dot x_j = A^j[x^{\otimes k}] \text{ for all } j = 1, \dots, N\}, \tag{3.7} \]

with the r.h.s. being a polynomial of order k:

\[ A^j[Y(1) \otimes \cdots \otimes Y(k)] = \sum_{i_1, \dots, i_k} A^j_{i_1, \dots, i_k}\, Y_{i_1}(1) \cdots Y_{i_k}(k), \tag{3.8} \]

with some array of N^{k+1} numbers A = (A^j) = {A^j_{i_1,...,i_k}}, which is symmetric with respect to changes of the order of i_1, . . . , i_k for any j. Note that the algebra of tensor products will not be used: Here, the notation [Y(1) ⊗ · · · ⊗ Y(k)] can be understood as merely denoting the collection of vectors {Y(1), . . . , Y(k)}. Therefore, X^{⊗k} is just a handy notation for the collection of k vectors, each of which equals X. The r.h.s. of (3.7) is conditionally positive if and only if

\[ \tilde A^j[v^{\otimes k}] \ge 0 \quad \text{for all } v \in \mathbb{R}^{N-1}_+, \tag{3.9} \]

where Ã^j denotes the array of (N − 1)^k numbers that is obtained from A^j by deleting all elements A^j_{i_1,...,i_k} with at least one of the i_l being equal to j. An equation with an analytic r.h.s. can thus be written as

\[ \dot x = A(0) + \sum_{k=1}^{\infty} A(k)[x^{\otimes k}], \tag{3.10} \]

where A(0) is a constant vector and each A(k), k > 0, has the form of the r.h.s. of (3.7). If only a finite number of the A(k) in (3.10) are non-vanishing, then we have an equation with a general polynomial r.h.s. Clearly, if each A(k) is conditionally positive, then the same applies to the r.h.s. of (3.10). Note, however, that the converse is not true. (A full discussion of this point is given in [143].) The polynomial r.h.s. of equation (3.7) can be equivalently written as a summation over unordered and ordered collections i_1, . . . , i_k. This notation is widely used in the theory of interacting particles and will be frequently used in our exposition.

Proposition 3.2.1. Let Ψ(i_1, . . . , i_k) ∈ Z^∞_+ denote the profile of the collection {i_1, . . . , i_k}, that is, the coordinate ψ_j of Ψ equals the number of times the index j enters the collection {i_1, . . . , i_k}. With the notation

\[ x^{\Psi} = \prod_j x_j^{\psi_j}, \qquad \Psi! = \prod_j \psi_j!, \tag{3.11} \]

the polynomial A[x^{⊗k}] given by the symmetric arrays A = (A_{i_1···i_k}) can be rewritten as

\[
\begin{aligned}
A[x^{\otimes k}] &= \sum_{i_1, \dots, i_k} A_{i_1 \cdots i_k}\, x_{i_1} \cdots x_{i_k} \\
&= k! \sum_{i_1 \le \cdots \le i_k} \Big( \prod_m [\psi_m(i_1, \dots, i_k)]! \Big)^{-1} A_{i_1 \cdots i_k}\, x_{i_1} \cdots x_{i_k} \\
&= k! \sum_{\Psi \in \mathbb{Z}^{\infty}_+} A_{\Psi}\, \frac{x^{\Psi}}{\Psi!}.
\end{aligned}
\tag{3.12}
\]

Therefore, equation (3.7) can be written as

\[ \dot x_j = k! \sum_{\Psi \in \mathbb{Z}^{\infty}_+} A^j_{\Psi}\, \frac{x^{\Psi}}{\Psi!}. \tag{3.13} \]
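The passage from ordered to unordered collections in (3.12) can be verified by brute force for small sizes. In the following sketch the dimensions N = 4, k = 3 and the random coefficients are arbitrary illustrative choices; the symmetric array is stored with one value per unordered collection of indices.

```python
from itertools import product, combinations_with_replacement
from collections import Counter
from math import factorial, prod
import random

N, k = 4, 3
random.seed(0)

# symmetric array A_{i1...ik}: one random value per unordered collection of indices
A = {c: random.uniform(-1.0, 1.0) for c in combinations_with_replacement(range(N), k)}
x = [random.uniform(0.0, 2.0) for _ in range(N)]

def term(idx):
    """A_{i1...ik} * x_{i1} * ... * x_{ik} for an ordered collection idx."""
    return A[tuple(sorted(idx))] * prod(x[i] for i in idx)

# sum over all ordered collections (i1, ..., ik)
ordered = sum(term(idx) for idx in product(range(N), repeat=k))

# k! times the sum over i1 <= ... <= ik, each term divided by Psi! = prod_j psi_j!
unordered = factorial(k) * sum(
    term(idx) / prod(factorial(m) for m in Counter(idx).values())
    for idx in combinations_with_replacement(range(N), k)
)
print(ordered, unordered)   # the two sums should agree up to rounding
```

The factor k!/Ψ! is simply the number of ordered arrangements of a collection with profile Ψ, which is why each unordered term is weighted in this way.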

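Proof. Equation (3.12) is proved by direct induction. In order to get a feeling for how the combinatorics works, let us write it down specifically for k = 2 and k = 3:

\[ \sum_{i_1, i_2} A_{i_1 i_2}\, x_{i_1} x_{i_2} = \sum_i A_{ii}\, x_i x_i + 2 \sum_{i_1 < i_2} A_{i_1 i_2}\, x_{i_1} x_{i_2}, \tag{3.14} \]

\[ \sum_{i_1, i_2, i_3} A_{i_1 i_2 i_3}\, x_{i_1} x_{i_2} x_{i_3} = \sum_i A_{iii}\, x_i^3 + 3 \sum_i \sum_{j \ne i} A_{iij}\, x_i^2 x_j + 3! \sum_{i_1 < i_2 < i_3} A_{i_1 i_2 i_3}\, x_{i_1} x_{i_2} x_{i_3}. \]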
[…] > 0 such that the product (b, c) is preserved by (3.29):

\[ \frac{d}{dt}(b, c) = (b, f(c)) = 0 \]

for all c. In other words, b is a critical Lyapunov vector for (3.29). Conservativity is the simplest way to ensure well-posedness, which under this condition follows from Theorem 3.1.1. As usual, an equilibrium point for the system (3.29) is defined as a vector c* = (c*_1, . . . , c*_n) ∈ R^n_+ such that f(c*) = 0. This equilibrium is positive if c*_j > 0 for all j. By analogy with linear systems, one says that c* is a point of detailed balance if, for all j and l, the reaction rates of any pair of forward and backward reactions coincide at c*, that is, if R(c*) = R^T(c*). By (3.32), any point of detailed balance is an equilibrium point. But formula (3.32) suggests the introduction of an intermediate notion (between equilibrium and detailed balance). Namely, one says that c* is a point of complex balance if the complex formation vector vanishes at this point: (R(c*) − R^T(c*))1 = 0. Therefore, a point of complex balance is also an equilibrium. For linear systems, complexes and species coincide, and therefore the notions of detailed and complex balance also coincide. One says that system (3.29) is complex balanced (respectively detailed balanced) if the set of positive equilibria is not empty and coincides with the set of points of complex (respectively detailed) balance. Complex balancing turns out to be crucial for finding the ‘thermodynamic properties’ of (3.29), i.e., for finding out whether a function (3.44) or some natural extension of it can serve as a Lyapunov function for (3.29), as in the linear case. We shall now see how it works for mass-action-law systems. For this purpose, recall that a kinetic system is a mass-action-law kinetics if all admissible reactions (3.26) of this system are of the form (3.33).

Theorem 3.4.1. Assume that a positive point of detailed balance c* exists for a mass-action-law kinetics.
Then the system is detailed balanced, that is, all other

3.4. Entropy and equilibria for nonlinear evolutions in Rn +

173

positive points of equilibria are points of detailed balance. Moreover, the function
$$G(c) = \sum_j c_j \left( \ln \frac{c_j}{c_j^*} - 1 \right) \tag{3.45}$$
is a Lyapunov function for the system (as for the linear case), that is, $\frac{dG(c)}{dt} \le 0$ for all $c$, with equality only for the points of detailed balance. Finally, these points $c$ of detailed balance can be characterized as the critical points of $G$ on the hyperplane $c + S$, where $S$ is the stoichiometric space.

Proof. Since $\frac{\partial G}{\partial c_i}$ equals $\ln(c_i/c_i^*)$, one can rewrite the reaction rates in the following equivalent forms:
$$r_{ij}(c) = r_{ij}(c^*) \prod_{l=1}^{m} \left( \frac{c_l}{c_l^*} \right)^{y_l(j)} = r_{ij}(c^*) \exp\left\{ \sum_{l=1}^{m} y_l(j) \ln \frac{c_l}{c_l^*} \right\}, \tag{3.46}$$
and consequently as
$$r_{ij}(c) = r_{ij}(c^*) \exp\{(y(j), \nabla G(c))\}. \tag{3.47}$$

Taking into account the detailed balance condition $r_{ij}(c^*) = r_{ji}(c^*)$, this implies
$$\ln \frac{r_{ij}(c)}{r_{ji}(c)} = (y(j) - y(i), \nabla G(c)) \tag{3.48}$$
whenever $k_{ij} \neq 0$. Consequently, applying the last expression in (3.30) takes us to $\frac{dG(c)}{dt} = (\nabla G(c), f(c)) =$ and thus

dG(c) =− dt

(rij (c) − rji (c))(y(i) − y(j), ∇G(c))

i 0) combined with an external input of particles.

Suppose next that any particle of the type $k$ can mutate to a type $m$ with the rate $Q_k^m$, independently of the presence of other particles. Since the number of such mutations will be proportional to the size $x_j$ of the population of type $j$, the process of mutation can be described by the following infinite system of equations:
$$\dot x_j = \sum_{k=1}^{\infty} \sum_{m \neq k} x_k Q_k^m (\delta_j^m - \delta_j^k) = \sum_{k \neq j} (x_k Q_k^j - x_j Q_j^k), \tag{3.54}$$

Chapter 3. Discrete Kinetic Systems: Equations in lp+


where $Q_j^k$ are any non-negative numbers such that $\sum_k Q_j^k < \infty$ and $Q_j^j = 0$ for all $j$. Equation (3.54) plays the central role in the theory of Markov chains, where it is referred to as Kolmogorov's forward equation.

On the other hand, if any particle of the type $k$ can be decomposed into two particles of the types $m$ and $n$ with the rate $P_k^{mn}$ independently of the presence of other particles, the evolution is governed by the system
$$\dot x_j = \sum_{k=1}^{\infty} x_k \sum_{(m,n)} P_k^{mn} (\delta_j^m + \delta_j^n - \delta_j^k), \tag{3.55}$$

which describes the process of fragmentation or the process of branching. (We mean fragmentation if particles $m$, $n$ are in some sense smaller than $k$, and branching if they are the same as $k$.) In this equation, $\sum_{(m,n)}$ denotes the sum over all pairs of types $(m, n)$ with irrelevant order.

The linearity of the evolutions (3.54) and (3.55) reflects the absence of interactions: any particle is subject to certain transformations independently of other particles. Meanwhile, a quadratic r.h.s. appears when one deals with binary interactions. For example, if any two particles of the type $k$ and $l$ can merge into a new particle of the type $m$ with the rate $P_{kl}^m$, the evolution is governed by the system
$$\dot x_j = \frac{1}{2} \sum_{k=1}^{\infty} \sum_{l=1}^{\infty} \sum_{m=1}^{\infty} x_k x_l P_{kl}^m (\delta_j^m - \delta_j^l - \delta_j^k), \tag{3.56}$$
which describes the processes of merging, or coagulation, or coalescence. On the other hand, if any two particles of the type $k$ and $l$ can collide or mutate to create a new pair of particles of the type $m$ and $n$ with the rate $P_{kl}^{mn}$, the evolution is governed by the system
$$\dot x_j = \frac{1}{2} \sum_{k=1}^{\infty} \sum_{l=1}^{\infty} \sum_{(m,n)} x_k x_l P_{kl}^{mn} (\delta_j^m + \delta_j^n - \delta_j^l - \delta_j^k), \tag{3.57}$$

which describes the processes of collision, or collision breakage, or pairwise mutations.

Another insightful example is the interest-driven migration in decision processes like evolutionary games, where agents of the behavioural type $i$ can migrate to the type $j$ under the influence of an agent of the type $j$ if some performance function $R_j(x)$ is better for the type $j$, i.e., if the migration probabilities are $P_{ij} = (R_j - R_i)_+$. The corresponding evolution is governed by the equation
$$\dot x_j = \frac{1}{2} \sum_{k=1}^{\infty} \sum_{l=1}^{\infty} x_k x_l (R_k - R_l)_+ (\delta_j^k - \delta_j^l), \tag{3.58}$$
which can be simplified to its most common form
$$\dot x_j = \frac{1}{2} \sum_{k=1}^{\infty} x_j x_k (R_j - R_k). \tag{3.59}$$

3.5. Kinetic equations for collisions, fragmentation, reproduction . . .


Exercise 3.5.1. Check that (3.59) is equivalent to (3.58). Hint: equation (3.58) can be rewritten as
$$\dot x_j = \frac{1}{2} x_j \sum_{l \neq j} x_l (R_j - R_l)_+ - \frac{1}{2} x_j \sum_{k \neq j} x_k (R_k - R_j)_+ .$$
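The equivalence claimed in Exercise 3.5.1 can also be checked numerically on a finite truncation. The following is only a sketch under stated assumptions: the number of types is finite, the performance values `R` are fixed numbers (no mean-field dependence), and the function names `rhs_358` and `rhs_359` are hypothetical labels introduced here for the right-hand sides of (3.58) and (3.59).

```python
import random


def rhs_358(x, R):
    """Right-hand side of (3.58), summing the Kronecker-delta terms directly."""
    n = len(x)
    out = [0.0] * n
    for k in range(n):
        for l in range(n):
            rate = 0.5 * x[k] * x[l] * max(R[k] - R[l], 0.0)
            out[k] += rate  # contribution of delta_j^k
            out[l] -= rate  # contribution of -delta_j^l
    return out


def rhs_359(x, R):
    """Right-hand side of the simplified form (3.59)."""
    n = len(x)
    return [0.5 * x[j] * sum(x[k] * (R[j] - R[k]) for k in range(n))
            for j in range(n)]


def max_discrepancy(n=7, seed=1):
    """Largest coordinate-wise difference between the two forms for random data."""
    rng = random.Random(seed)
    x = [rng.uniform(0.0, 1.0) for _ in range(n)]
    R = [rng.uniform(-1.0, 1.0) for _ in range(n)]
    return max(abs(a - b) for a, b in zip(rhs_358(x, R), rhs_359(x, R)))
```

The agreement rests on the identity $(u)_+ - (-u)_+ = u$ applied to $u = R_j - R_l$, which is the content of the hint.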

A polynomial r.h.s. of order $k$ describes interactions that involve simultaneous transformations of $k$ particles ($k$th-order interaction). Namely, the evolution of a system where any $k$ particles of the types $i_1, \dots, i_k$ can be changed into a collection of particles described by a profile $\Phi = (\phi_1, \phi_2, \dots) \in \mathbb{Z}^\infty_{+,fin}$ with some rate $P^\Phi_{i_1,\dots,i_k}$ is governed by the equations
$$\dot x_j = \frac{1}{k!} \sum_{i_1,\dots,i_k=1}^{\infty} x_{i_1} \cdots x_{i_k} \sum_{\Phi} P^\Phi_{i_1,\dots,i_k} (\phi_j - \delta_j^{i_1} - \dots - \delta_j^{i_k}). \tag{3.60}$$
Equivalently, if $\Psi$ denotes the profiles of the collection of $k$ particles that are eligible for a transformation to the profile $\Phi$ with the rate $P_\Psi^\Phi$, equation (3.60) can be rewritten in the form
$$\dot x_j = \sum_{\Psi : \#(\Psi) = k} \frac{x^\Psi}{\Psi!} \sum_{\Phi} P_\Psi^\Phi (\phi_j - \psi_j), \tag{3.61}$$

where notation (3.11) and the last equation in (3.12) were used, and where $\#(\Psi) = \sum_j \psi_j$ denotes the number of particles in the profile $\Psi$. If one relaxes the constraints on $\Psi$ by requiring that it includes at most $k$ rather than exactly $k$ particles, one can include the transformations of all orders not exceeding $k$, or even more generally transformations of any order. In the latter case, the equation gets the form
$$\dot x_j = \sum_{\Psi} \frac{x^\Psi}{\Psi!} \sum_{\Phi} P_\Psi^\Phi (\phi_j - \psi_j) = \sum_{k=1}^{\infty} \frac{1}{k!} \sum_{i_1,\dots,i_k=1}^{\infty} x_{i_1} \cdots x_{i_k} \sum_{\Phi} P^\Phi_{i_1,\dots,i_k} (\phi_j - \delta_j^{i_1} - \dots - \delta_j^{i_k}). \tag{3.62}$$
The equations (3.62) are the general kinetic equations with a polynomial or analytic r.h.s. Even more generally, the coefficients $P_\Psi^\Phi$ can also depend on $x$, in which case one speaks of a mean-field dependence of the reaction rates. With the mean-field dependence $R_j(x)$ included, the equations (3.59) contain the famous replicator dynamics of evolutionary games in its most general form.

Remark 51. Let us point out some slight abuse of the used notation: by $P_k^{mn}$ in (3.55), we mean $P_\Psi^\Phi$ with $\Psi = \{\psi_j = \delta_j^k\}$, $\Phi = \{\phi_j = \delta_j^m + \delta_j^n\}$.

The equations (3.62) can be equivalently written in another insightful form, namely the weak form, where the elements $x$ of $l_1$ are characterized by their scalar


products with arbitrary elements $g \in c_\infty$. Multiplying (3.62) by a $g \in c_\infty$ yields the representation of this equation in its weak form:
$$\frac{d}{dt}(g, x) = \sum_{\Psi} \frac{x^\Psi}{\Psi!} \sum_{\Phi} P_\Psi^\Phi (g, \phi - \psi) = \sum_{k=1}^{\infty} \frac{1}{k!} \sum_{i_1,\dots,i_k=1}^{\infty} x_{i_1} \cdots x_{i_k} \sum_{\Phi} P^\Phi_{i_1,\dots,i_k} [g(x_{j_1}) + \dots + g(x_{j_m}) - g(x_{i_1}) - \dots - g(x_{i_k})]. \tag{3.63}$$
This equation must hold for all $g \in c_\infty$ or, more generally, for $g$ from some dense subset of $c_\infty$, where $j_1, \dots, j_m$ are the types of particles in a profile $\Phi$: $\Phi = \Phi(j_1, \dots, j_m)$. We shall use this form in the $l_p$-theory implicitly, when calculating the derivatives of the linear functionals on the solutions. In Chapter 7, we shall discuss its very important extension to the analysis of evolutions of measures on general state spaces.

Remark 52. Let us stress again that the summation over profiles $\Phi$ in (3.63) means a summation over an unordered collection of particles. Therefore, switching to a summation over ordered indices $j_1, \dots, j_m$ would require additional combinatorial adjustments, as was explicitly demonstrated for $\Psi$.

When the number of possible transformations $\Psi \to \Phi$ in (3.61) is finite and the set of types $j$ is a finite set $\{1, \dots, N\}$, i.e., if equation (3.61) is an equation in $\mathbb{R}^N$ with a polynomial r.h.s., then equation (3.61) describes a general mass-action-law kinetics (3.27) from chemistry, where the transformations $\Psi \to \Phi$ are referred to as reactions and the input-output profiles $\Psi$, $\Phi$ as complexes or compounds.

If the quantity $(L, x)$ is conserved for an $L \in \mathbb{R}^\infty_{++}$ and all transformations $\Psi \to \Phi$ in (3.61), i.e., if $(L, \Phi) = (L, \Psi)$ whenever $P_\Psi^\Phi \neq 0$, then the function $j \mapsto L_j$ is the critical Lyapunov vector for the system (3.61), and it is also referred to as the conservation law. In chemistry, this usually holds with $L_j = m_j$ being the mass of a particle of the type $j$, due to the principle of conservation of mass or the Lomonosov–Lavoisier law. A natural subclass of processes arises once the index $j$ itself is interpreted as the mass of a particle: $m_j = j$. In this case, particles differ only by their masses. The corresponding processes are referred to as mass-exchange processes.
Another important class of processes includes some quantity $L$ that is allowed to grow only by reactions involving no more than one particle. In this case, equation (3.62) gets the form
$$\dot x_j = a_j + \sum_{k} x_k \sum_{\Phi : (L,\Phi) \le L_k + b} P_k^\Phi (\phi_j - \delta_j^k) + \sum_{\Psi} \frac{x^\Psi}{\Psi!} \sum_{\Phi : (L,\Phi) \le (L,\Psi)} P_\Psi^\Phi (\phi_j - \psi_j), \tag{3.64}$$


with non-negative $b$ and $a_j$. A notable example is the celebrated model of preferential attachment. In this model, one interprets $x_j$ as the concentration of coalitions of the size $j$, and the growth obeys the following rule: new particles are regularly injected into the system in such a way that they either become independent (i.e., they form a new coalition of the unit size) or randomly join one of the coalitions with a probability that is proportional to its size. This process is governed by the equation
$$\dot x_j = a(x)\delta_j^1 + \sum_{k=1}^{\infty} k x_k P(x) (\delta_j^{k+1} - \delta_j^k) = a(x)\delta_j^1 + P(x)[(j-1)x_{j-1} - j x_j], \tag{3.65}$$
with $P_k^{k+1} = k x_k P(x)$ as discussed in Remark 51. Here, we included some additional mean-field dependence of $a$ and $P$. This equation is a variation of (3.64) with $L = (1, 2, \dots)$ and $b = 1$.

By far the most important interactions are binary interactions. Therefore, we shall often deal with equations that combine a linear and/or quadratic r.h.s. For example, the most studied mass-exchange system is the Smoluchowski coagulation-fragmentation, where any pair of particles with the masses $k$ and $l$ can coagulate to produce a particle of the mass $k + l$ with some rate $Q_{kl}$, and any particle of the mass $k$ can fragment into two pieces of the masses $l$ and $k - l$ for any $l < k$ with some rate $P^{k-l,l}$. These processes are special cases of (3.55) and (3.56). The corresponding evolution equations are

$$\dot x_j = \frac{1}{2} \sum_{k=1}^{\infty} \sum_{l=1}^{\infty} x_k x_l Q_{kl} (\delta_j^{k+l} - \delta_j^l - \delta_j^k) + \frac{1}{2} \sum_{k=1}^{\infty} x_k \sum_{m<k} P^{m,k-m} (\delta_j^m + \delta_j^{k-m} - \delta_j^k), \tag{3.66}$$
j0 and all il do not exceed n. Roughly speaking, this means that species of lower levels cannot directly influence the changes in species of considerably higher levels.

Theorem 3.6.3. Let $L \in \mathbb{R}^\infty_{++}$ be non-decreasing with finite set levels and such that for any $m = 1, \dots, k$
$$L(j) \le L(i_1) + \dots + L(i_m)$$

3.6. Simplest equations in lp+


whenever $A^j_{i_1 \cdots i_m}(m) \neq 0$ (in particular, all $A(m)$ are local) and
$$\sum_j L(j) A^j_{i_1 \cdots i_m}(m) = 0$$
for all $m, i_1, \dots, i_m$. Assume that
(i) the r.h.s. of (3.82) is conditionally positive;
(ii) there is a fixed number $J$ such that the number of values $j$ with $A^j_{i_1 \cdots i_m}(m) \neq 0$ is uniformly bounded by $J$ for all $m, i_1, \dots, i_m$;
(iii) all arrays are uniformly bounded: $|A^j_{i_1 \cdots i_m}(m)| < K$ with some $K$.
Then equation (3.82) satisfies the conditions of Theorem 3.6.1 for $p = 1$, and therefore has a unique global solution in $l_1^+$ for any $x \in M^+(L)$. Moreover, $\sum_j L(j) x_j$ is constant along any solution.

Proof. We have
$$\sum_j L(j) \sum_{i_1,\dots,i_m} |A^j_{i_1 \cdots i_m}(m)|\, x_{i_1} \cdots x_{i_m} \le \sum_{i_1,\dots,i_m} JK (L(i_1) + \dots + L(i_m))\, x_{i_1} \cdots x_{i_m} \le JKm (L, x) \|x\|_{l_1}^{m-1},$$
which shows that the r.h.s. of (3.82) takes $M^+(L)$ to $M(L)$. Furthermore, we find
$$\sum_j \Big| \sum_{i_1,\dots,i_m} A^j_{i_1 \cdots i_m}(m)\, x_{i_1} \cdots x_{i_m} - \sum_{i_1,\dots,i_m} A^j_{i_1 \cdots i_m}(m)\, y_{i_1} \cdots y_{i_m} \Big| \le \sum_{i_1,\dots,i_m} \sum_j |A^j_{i_1 \cdots i_m}(m)| \sum_{p=1}^{m} x_{i_1} \cdots x_{i_{p-1}} |x_{i_p} - y_{i_p}|\, y_{i_{p+1}} \cdots y_{i_m} \le JKm \|x - y\|_{l_1} \max(\|x\|_{l_1}^{m-1}, \|y\|_{l_1}^{m-1}),$$
which shows the Lipschitz continuity of the r.h.s. of (3.82) in $l_1$ on any set $M_{\le\lambda}(L)$.

As an example, let us apply this result to the Smoluchowski equation (3.66) or (3.67).

Corollary 2. If $\sup_{k,l} Q_{kl} < \infty$ and $\sup_k$ mn

i1 ,...,im

JL(i1 ) · · · L(im )ω(n)xi1 · · · xim ≤ Jω(n)(L, x)m ,

i1 ,...,im

where ω(n) → 0, as n → ∞. This establishes condition (iii) of Theorem 3.7.1 for p = 1.  As an example, let us apply this result to the Smoluchowski equation (3.66) or (3.67) again.  Corollary 3. If Qkl ≤ o(1)kl and m 1. Before going ahead, let us first collect some elementary inequalities that are routinely used for estimating moments in this context. Lemma 3.9.1. (i) For any m, β ≥ 1, aj ≥ 0 and bj ≥ 1, m

⎛ aβj bj

≤⎝

j=1

m

⎞β aj b j ⎠ .

(3.93)

j=1

(ii) For any m, aj ≥ 0 and β ≥ 1, m

⎞β ⎛ m m

aβj ≤ ⎝ aj ⎠ ≤ mβ−1 aβj ,

j=1

j=1

(3.94)

j=1

m m while for β ∈ [0, 1] one has ( j=1 aj )β ≤ j=1 aβj . (iii) For any a, b, β > 0, the following estimates hold: (a + b)β − aβ − bβ ≤ 2β (abβ−1 + baβ−1 ) (a + b)β − aβ ≤ β2β (bβ + baβ−1 ).

(3.95)
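Before proving the inequalities of Lemma 3.9.1, it can be reassuring to test them on random data. The sketch below is only a sanity check, not a proof; the function name `check_lemma_391`, the sampling ranges and the tolerance are arbitrary choices made here.

```python
import random


def check_lemma_391(trials=2000, seed=0):
    """Return True if (3.93), (3.94) and (3.95) hold on all random samples."""
    rng = random.Random(seed)
    tol = 1e-9

    def leq(lhs, rhs):
        # Comparison with a small relative and absolute tolerance for rounding.
        return lhs <= rhs * (1.0 + tol) + tol

    for _ in range(trials):
        m = rng.randint(1, 6)
        beta = rng.uniform(1.0, 4.0)
        a = [rng.uniform(0.0, 3.0) for _ in range(m)]
        b = [rng.uniform(1.0, 3.0) for _ in range(m)]

        # (3.93): sum a_j^beta b_j <= (sum a_j b_j)^beta for b_j >= 1, beta >= 1
        if not leq(sum(aj ** beta * bj for aj, bj in zip(a, b)),
                   sum(aj * bj for aj, bj in zip(a, b)) ** beta):
            return False

        # (3.94): sum a_j^beta <= (sum a_j)^beta <= m^(beta-1) sum a_j^beta
        s, sp = sum(a), sum(aj ** beta for aj in a)
        if not (leq(sp, s ** beta) and leq(s ** beta, m ** (beta - 1.0) * sp)):
            return False

        # Reverse inequality for an exponent gamma in [0, 1]
        gamma = rng.uniform(0.0, 1.0)
        if not leq(s ** gamma, sum(aj ** gamma for aj in a)):
            return False

        # (3.95) for arbitrary positive u, v and exponent bt
        u, v, bt = rng.uniform(0.01, 5.0), rng.uniform(0.01, 5.0), rng.uniform(0.1, 4.0)
        cross = u * v ** (bt - 1.0) + v * u ** (bt - 1.0)
        if not leq((u + v) ** bt - u ** bt - v ** bt, 2.0 ** bt * cross):
            return False
        if not leq((u + v) ** bt - u ** bt,
                   bt * 2.0 ** bt * (v ** bt + v * u ** (bt - 1.0))):
            return False
    return True
```

A failed sample would point to a transcription error in the inequality being tested rather than to a flaw in the lemma.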

Exercise 3.9.1. Prove these inequalities.

Proposition 3.9.1. Let $L \in \mathbb{R}^\infty_{++}$ be non-decreasing with finite set levels such that $L_j \ge 1$ for all $j$. Moreover, let us assume that (3.87) holds. Then
$$(L^\beta, f(P_n(x))) \le (L^\beta, x)\kappa(\beta, \lambda) + (L^\beta, a(x)) \tag{3.96}$$
for any $\beta > 1$, $f$ from (3.86), any $n$ and $x \in M^+_{\le\lambda}(L) \cap M(L^\beta)$, where

κ(β, λ) = Ccβ



λm mβ b λ+ m! m=1 β

 ,

(3.97)

with a constant $c_\beta$. Therefore, if the coordinates of $a(x)$ are uniformly bounded by the coordinates of a vector $a \in M^+(L^\beta)$, then $L^\beta$ is a Lyapunov vector in the sense of (3.80). In case of equation (3.85), the sum in (3.97) is over $m \le k - 1$.

Proof. Notice first that using $P_n(x)$ rather than $x$ in (3.96) ensures by locality that $f(P_n(x)) \in M(L^\beta)$. Therefore, the l.h.s. of (3.96) is well defined. Let us set


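$a = 0$ and $b = 0$, since their contributions are straightforward to obtain. By (3.85), we find
$$(L^\beta, f(P_n(x))) \le \sum_{\Psi} \frac{x^\Psi}{\Psi!} \sum_{\Phi : (L,\Phi) \le (L,\Psi)} P_\Psi^\Phi(x) \sum_j (L_j^\beta \phi_j - L_j^\beta \psi_j).$$
By (3.93), it follows that
$$(L^\beta, f(P_n(x))) \le \sum_{\Psi} \frac{x^\Psi}{\Psi!} \sum_{\Phi : (L,\Phi) \le (L,\Psi)} P_\Psi^\Phi(x) \Big[ \Big( \sum_j L_j \phi_j \Big)^\beta - \sum_j L_j^\beta \psi_j \Big],$$
and consequently by (3.87)
$$(L^\beta, f(P_n(x))) \le \sum_{\Psi} \frac{x^\Psi}{\Psi!} \sum_{\Phi : (L,\Phi) \le (L,\Psi)} P_\Psi^\Phi(x) \Big[ \Big( \sum_j L_j \psi_j \Big)^\beta - \sum_j L_j^\beta \psi_j \Big] \le C \sum_{\Psi} \frac{x^\Psi}{\Psi!} (L, \Psi) \Big[ \Big( \sum_j L_j \psi_j \Big)^\beta - \sum_j L_j^\beta \psi_j \Big],$$
which can be rewritten with the summation over unordered indices by (3.12) as
$$C \sum_{m=1}^{\infty} \frac{1}{m!} \sum_{i_1,\dots,i_m \le n} x_{i_1} \cdots x_{i_m} (L_{i_1} + \dots + L_{i_m}) [(L_{i_1} + \dots + L_{i_m})^\beta - L_{i_1}^\beta - \dots - L_{i_m}^\beta],$$
which, due to the symmetry, yields
$$(L^\beta, f(P_n(x))) \le C \sum_{m=1}^{\infty} \frac{1}{(m-1)!} \sum_{i_1,\dots,i_m \le n} x_{i_1} \cdots x_{i_m} L_{i_1} [(L_{i_1} + \dots + L_{i_m})^\beta - L_{i_1}^\beta - \dots - L_{i_m}^\beta]. \tag{3.98}$$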

Let us first perform the next step for the easiest case β = 2, in order to understand the main point of the argument: the use of cancelation for reducing the highest power of L by one. In this case, one can solve the brackets in (Li1 + · · · + Lim )2 to find ⎛ ⎞ ∞

1 xi1 · · · xim Li1 ⎝ Lip Liq ⎠ (L2 , f (Pn (x))) ≤ 2C (m − 1)! m=2 i1 ,...,im ≤n 1≤p1 p>1 i1 ,...,im ≤n

which, by symmetry, equals ∞

m(m − 1) 2C (m − 1)! m=2

i1 ,...,im ≤n

xi1 · · · xim L2i1 Li2 .

3.9. Evolution of moments under additive bounds


Since $L_j \ge 1$ for all $j$, the above term is therefore bounded by
$$2C \sum_{m=1}^{\infty} \frac{m(m+1)}{m!} (L^2, x)(L, x)^m,$$
as required.

Let us now turn to arbitrary $\beta$. In order to do a similar cancelation, we employ the second inequality of (3.95) with $a = L_{i_1}$ and $b = L_{i_2} + \dots + L_{i_m}$ to obtain
$$L_{i_1} [(L_{i_1} + \dots + L_{i_m})^\beta - L_{i_1}^\beta - \dots - L_{i_m}^\beta] \le \beta 2^\beta [L_{i_1} (L_{i_2} + \dots + L_{i_m})^\beta + L_{i_1}^\beta (L_{i_2} + \dots + L_{i_m})] + L_{i_1} [(L_{i_2} + \dots + L_{i_m})^\beta - L_{i_2}^\beta - \dots - L_{i_m}^\beta].$$
By (3.94) and symmetry, (3.98) therefore yields the estimate
$$(L^\beta, f(P_n(x))) \le C c_\beta \sum_{m=2}^{\infty} \frac{(m-1)^\beta}{(m-1)!} \sum_{i_1,\dots,i_m \le n} x_{i_1} \cdots x_{i_m} L_{i_1}^\beta L_{i_2} \le C c_\beta \sum_{m=2}^{\infty} \frac{(m-1)^\beta}{(m-1)!} (L^\beta, x)(L, x)^{m-1},$$
which implies (3.96).
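The cancelation that lowers the highest power of $L$ by one can be seen explicitly in a simple special case. The sketch below assumes pure coagulation (the first sum in (3.66), no fragmentation) with the hypothetical additive kernel $Q_{kl} = k + l$ and $L_j = j$; the helper names `coag_rhs` and `moment` are introduced here for illustration only. In this case one gets $(L^2, f(x)) = \sum_{k,l} x_k x_l Q_{kl}\, kl = 2 (L^2, x)(L, x)$, so that no third moment appears.

```python
import random


def coag_rhs(x, kernel):
    """Pure coagulation r.h.s.; x[k-1] is the concentration of mass k.

    The output has length 2n because two particles of mass <= n
    can merge into a particle of mass <= 2n.
    """
    n = len(x)
    f = [0.0] * (2 * n)
    for k in range(1, n + 1):
        for l in range(1, n + 1):
            r = 0.5 * x[k - 1] * x[l - 1] * kernel(k, l)
            f[k + l - 1] += r  # a particle of mass k + l is created
            f[k - 1] -= r      # a particle of mass k is lost
            f[l - 1] -= r      # a particle of mass l is lost
    return f


def moment(v, beta):
    """The pairing (L^beta, v) for L_j = j."""
    return sum((j + 1) ** beta * vj for j, vj in enumerate(v))


def second_moment_gap(n=8, seed=3):
    """Relative difference between (L^2, f(x)) and 2 (L^2, x)(L, x)."""
    rng = random.Random(seed)
    x = [rng.uniform(0.0, 1.0) for _ in range(n)]
    f = coag_rhs(x, lambda k, l: float(k + l))
    expected = 2.0 * moment(x, 2) * moment(x, 1)
    return abs(moment(f, 2) - expected) / expected
```

The same helper also shows that the first moment (the total mass) is not changed by coagulation, since `moment(f, 1)` vanishes up to rounding.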

As a consequence of these moment estimates, one obtains the following existence result.

Theorem 3.9.1. Let $L \in \mathbb{R}^\infty_{++}$ be non-decreasing with finite set levels such that $L_j \ge 1$ for all $j$, and let the coefficients $P_\Psi^\Phi(P_n(x))$ be Lipschitz-continuous in $x$ for any $n$.
(i) Let (3.87) and (3.88) hold. Let $x \in M^+_{\le\lambda}(L) \cap M_{\le\nu}(L^\beta)$, and let the coordinates of $a(x)$ be uniformly bounded by the coordinates of a vector $a \in M^+(L^\beta)$ with $\beta > 1$. Then for any $p \ge 1$ there exists a global solution $X(t, x)$ to (3.85) in $l_p$ with the initial condition $x$ such that
$$X(t, x) \in M^+_{\le\lambda(t)}(L) \cap M_{\le\nu(t)}(L^\beta), \qquad \nu(t) = e^{\kappa(\beta,\lambda(t))t} \nu + (e^{\kappa(\beta,\lambda(t))t} - 1) \frac{(L^\beta, a)}{\kappa(\beta, \lambda(t))}, \tag{3.99}$$
with $\kappa$ from (3.97) and $\lambda(t)$ from (3.92).


(ii) Let (3.87) hold. Then the same existence result is valid for equation (3.86) and all $\beta > 2$.

Proof. (i) We shall use Lemma 3.7.1 with $B = l_1$, $K = M^+_{\le\lambda}(L) \cap M_{\le\nu}(L^\beta)$ and $K_T = M^+_{\le\lambda(T)}(L) \cap M_{\le\nu(T)}(L^\beta)$. Due to locality and the conditional positivity of $f(x)$, the equation $\dot x = f(P_n(x))$ is well posed for all $n$, and its solution stays in $K_T$ by Proposition 3.9.1 and Theorem 3.6.1 applied to the Lyapunov function $L^\beta$. In order to apply Lemma 3.7.1 in the space $l_p$, $p \ge 1$, one therefore only has to prove that $\|f(x) - f(P_n(x))\|_{l_1} \to 0$ as $n \to \infty$ uniformly for $x \in K_T$ (because the norms in $l_p$ are bounded by the norm in $l_1$). Similar to the estimates that are used in the proof of Proposition 3.8.1(ii), we get
$$\sum_j |f_j(P_n(x)) - f_j(x)| \le (J + k) C \sum_{m=1}^{k} \frac{1}{m!} \sum_{i_1,\dots,i_m : \max(i_q) \ge n} x_{i_1} \cdots x_{i_m} (L_{i_1} + \dots + L_{i_m}) = (J + k) C \sum_{m=1}^{k} \frac{1}{(m-1)!} \sum_{i_1,\dots,i_m : \max(i_q) \ge n} x_{i_1} \cdots x_{i_m} L_{i_1},$$
which tends to zero, as $n \to \infty$, uniformly in $x \in K_T$, since we can estimate
$$x_{i_1} L_{i_1} \le x_{i_1} \frac{L_{i_1}^\beta}{L_n^{\beta-1}}, \qquad x_{i_l} \le x_{i_l} \frac{L_{i_l}^\beta}{L_n^\beta},$$
for $i_1 > n$ or $i_l > n$ with $l \neq 1$, respectively.
(ii) This proof is analogous, based on the estimates from Proposition 3.8.1(i). Namely, we can prove that $(L, |f(x) - f(P_n(x))|) \to 0$ as $n \to \infty$ for $x \in K_T$ (for which $\beta > 2$ is required). This implies again that $\|f(x) - f(P_n(x))\|_{l_1} \to 0$.

Unlike Theorem 3.7.1, the conditions of Theorem 3.9.1 also allow for a proof of the uniqueness, as will be shown shortly.
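The finite-dimensional approximations $\dot x = f(P_n(x))$ used in this proof are easy to experiment with. The following sketch makes several assumptions that are not taken from the text: pure coagulation with the additive kernel $Q_{kl} = k + l$, a monodisperse initial condition $x_1 = 1$, a classical fourth-order Runge–Kutta step, and the hypothetical helper names `rhs_truncated`, `solve_truncated` and `mass`. Only the masses $1, \dots, n$ interact, while products of mass up to $2n$ are recorded, so that the total mass $(L, x)$ with $L_j = j$ is preserved by the scheme up to rounding.

```python
def additive_kernel(k, l):
    """Hypothetical coagulation rate Q_kl = k + l (an additive bound)."""
    return float(k + l)


def rhs_truncated(x, n, kernel):
    """f(P_n(x)) for pure coagulation; x has length 2n, only x[0..n-1] interact."""
    f = [0.0] * len(x)
    for k in range(1, n + 1):
        for l in range(1, n + 1):
            r = 0.5 * x[k - 1] * x[l - 1] * kernel(k, l)
            f[k + l - 1] += r
            f[k - 1] -= r
            f[l - 1] -= r
    return f


def solve_truncated(n, t_end=0.2, steps=200, kernel=additive_kernel):
    """Integrate x' = f(P_n(x)) from the monodisperse state by RK4."""
    x = [0.0] * (2 * n)
    x[0] = 1.0
    dt = t_end / steps
    for _ in range(steps):
        k1 = rhs_truncated(x, n, kernel)
        k2 = rhs_truncated([xi + 0.5 * dt * ki for xi, ki in zip(x, k1)], n, kernel)
        k3 = rhs_truncated([xi + 0.5 * dt * ki for xi, ki in zip(x, k2)], n, kernel)
        k4 = rhs_truncated([xi + dt * ki for xi, ki in zip(x, k3)], n, kernel)
        x = [xi + dt / 6.0 * (a + 2.0 * b + 2.0 * c + d)
             for xi, a, b, c, d in zip(x, k1, k2, k3, k4)]
    return x


def mass(x):
    """The conserved quantity (L, x) with L_j = j."""
    return sum((j + 1) * xj for j, xj in enumerate(x))
```

Comparing, say, `solve_truncated(10)` and `solve_truncated(20)` on their first few coordinates gives a feeling for how quickly the approximations stabilise on a short time interval.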

3.10 Accretive operators in $l_p$

The standard assumption for the r.h.s. of an ODE that ensures the uniqueness of solutions is Lipschitz continuity, as used, e.g., in Theorem 2.2.1. Another condition, which is a bit more subtle, can serve the same purpose: accretivity. The general notion of accretivity was given in Section 2.17. In this section, we make an independent sketch of the theory that is adapted to the spaces $l_p$.

A motivation for the definition can be gained from the attempt to estimate the norm of the solutions to an ODE. Namely, if $x(t) \in l_p$, $p > 1$, for all $t$ and


$\dot x(t) + f(t, x(t)) = 0$, then
$$\frac{d}{dt} \|x(t)\|_p^p = \frac{d}{dt} \sum_j |x_j(t)|^p = -\sum_j p |x_j(t)|^{p-1} \operatorname{sgn}(x_j(t)) f_j(t, x(t)).$$
Remark 57. In the literature on accretive operators, one traditionally writes equations in the form $\dot x(t) + f(t, x(t)) = 0$ rather than our usual $\dot x(t) = f(t, x(t))$.

In order to estimate this derivative in terms of $\|x(t)\|_p^p$, and consequently to be able to estimate the norm $\|x(t)\|_p^p$ via Gronwall's lemma, one needs an estimate of the type
$$-\sum_j |x_j|^{p-1} \operatorname{sgn}(x_j) f_j(t, x) \le \kappa \sum_j |x_j|^p \tag{3.100}$$
with a constant $\kappa \ge 0$. Similarly, in order to estimate the derivative
$$\frac{d}{dt} \|x(t) - y(t)\|_p^p = -\sum_j p |x_j(t) - y_j(t)|^{p-1} \operatorname{sgn}(x_j(t) - y_j(t)) (f_j(t, x(t)) - f_j(t, y(t)))$$
in terms of $\|x(t) - y(t)\|_p^p$, one needs an estimate of the type
$$-\sum_j |x_j - y_j|^{p-1} \operatorname{sgn}(x_j - y_j)(f_j(x) - f_j(y)) \le \kappa \sum_j |x_j - y_j|^p = \kappa \|x - y\|_p^p. \tag{3.101}$$
One says that a mapping $f : M \to l_p$ defined on a subset $M \subset l_p$ is quasi-accretive in $l_p$ if (3.101) holds with a constant $\kappa$ for all $x, y \in M$. Such a mapping is called accretive if the same holds with $\kappa = 0$.

Notice that if $f : l_2 \to l_2$ is Lipschitz-continuous, that is $\|f(x) - f(y)\|_2 \le \kappa \|x - y\|_2$, then
$$\Big| \sum_j (x_j - y_j)(f_j(x) - f_j(y)) \Big| \le \|x - y\|_2 \|f(x) - f(y)\|_2 \le \kappa \|x - y\|_2^2,$$
which implies that $f$ is quasi-accretive in $l_2$. Note, however, that the converse may not be true! Therefore, quasi-accretivity is a weaker property than Lipschitz continuity.

For spaces $l_p$ with $p > 1$, the norm is a differentiable function. It is a peculiarity of the case $l_1$ that its norm is not differentiable. However, since the norm is a convex function, the right and left directional derivatives of the norm are well defined, which leads to the notion of the semi-inner products (1.38), (1.39). For absolutely continuous functions, the set of points where the right and left derivatives differ has zero measure and does not contribute to the integrals. Therefore, it is essentially irrelevant whether the right or left derivative or any of their convex combinations is chosen when estimating quantities like $\|x(t) - y(t)\|_p^p$, see Proposition 1.4.2.

Strictly speaking, the accretivity of $f : M \to l_1$ means that $[x - y, f(x) - f(y)]_+ \ge 0$, with the formula for $[x, z]_\pm$ in $l_1$ being given in (1.47). Since by (1.40), $[x, z]_+ \ge ([x, z]_+ + [x, z]_-)/2$, and since the formula for the r.h.s.


is simpler in $l_1$, one naturally modifies the accretivity condition in $l_1$ by defining modified accretivity as
$$\frac{1}{2} ([x - y, f(x) - f(y)]_+ + [x - y, f(x) - f(y)]_-) \ge 0, \tag{3.102}$$
which is just a bit stronger than the standard accretivity (and coincides with it for all $l_p$ with $p > 1$). Due to (1.47), the modified accretivity (that we shall stick to) and quasi-accretivity of a mapping $f$ in $l_1$ can be formally written in the same way as for other $l_p$ as
$$-\sum_j \operatorname{sgn}(x_j - y_j)(f_j(x) - f_j(y)) \le \kappa \sum_j |x_j - y_j|, \tag{3.103}$$
where the function $\operatorname{sgn}(y)$ has the usual meaning of being equal to $1$, $-1$, $0$ for $y > 0$, $y < 0$, $y = 0$ respectively.

Remark 58. Using the accretivity notion in its standard form would require a more specific description of the contribution of the terms with $x_j - y_j = 0$ in (3.103), which turns out to be irrelevant in all concrete examples.

The following simple result shows in detail how the notion of accretivity works.

Proposition 3.10.1. Let $f : l_p \to l_p$ be quasi-accretive and let $x(t)$ and $y(t)$ be two solutions to the equation $\dot x + f(x) = 0$ in $l_p$, $p \ge 1$. Then
$$\|x(t) - y(t)\|_p \le \|x(0) - y(0)\|_p\, e^{t\kappa}. \tag{3.104}$$

In particular, for a given $x(0)$ there can exist at most one solution. Moreover, if $f$ is accretive, then the norm of the difference of any two solutions is non-increasing in time.

Proof. In fact, by quasi-accretivity, we find
$$\frac{d}{dt} \|x(t) - y(t)\|_p^p \le p\kappa \|x(t) - y(t)\|_p^p,$$
and (3.104) follows from Gronwall's lemma. For $p = 1$, the norm is not differentiable, and the most direct proof is completed by Proposition 1.4.4 in the space $l_1$.

As we shall see, accretivity with respect to the weighted norm $x \mapsto (L, |x|)$ with $L \in \mathbb{R}^\infty_{++}$ arises naturally in the analysis of equations of the type (3.61). Such accretivity of $f(x)$ means that
$$-\sum_j L_j \operatorname{sgn}(x_j - y_j)(f_j(x) - f_j(y)) \le \kappa \sum_j L_j |x_j - y_j|. \tag{3.105}$$
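A finite-dimensional toy example may help to see what Proposition 3.10.1 says. The sketch below is an illustration under assumptions made here, not an example from the text: the mapping $f_j(x) = x_j^3$ is accretive in $l_1$ (each coordinate is non-decreasing, so (3.103) holds with $\kappa = 0$) although it is not globally Lipschitz, and the equation $\dot x + f(x) = 0$ is integrated by the explicit Euler scheme with a step small enough for the chosen initial data. The names `f_cubic`, `euler_flow` and `l1_distance` are hypothetical.

```python
def f_cubic(x):
    """Coordinate-wise cubic map: accretive in l_1, but not globally Lipschitz."""
    return [xj ** 3 for xj in x]


def euler_flow(x0, f, dt=0.01, steps=300):
    """Explicit Euler trajectory for x' + f(x) = 0 (note the sign convention)."""
    x = list(x0)
    trajectory = [list(x)]
    for _ in range(steps):
        fx = f(x)
        x = [xj - dt * fj for xj, fj in zip(x, fx)]
        trajectory.append(list(x))
    return trajectory


def l1_distance(x, y):
    return sum(abs(a - b) for a, b in zip(x, y))


def distances(x0, y0):
    """l_1 distance between the two discrete solutions at every time step."""
    tx = euler_flow(x0, f_cubic)
    ty = euler_flow(y0, f_cubic)
    return [l1_distance(a, b) for a, b in zip(tx, ty)]
```

For initial data such as `[2.0, -1.0, 0.5]` and `[1.5, 1.0, -0.5]`, the list returned by `distances` is non-increasing, in line with the last claim of the proposition for accretive $f$.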


3.11 Accretivity for evolutions with additive rates

The following result on the accretivity with respect to the weighted norms is a major ingredient for the analysis of well-posedness in $l_p$ for equations with additive bounds for rates.

Lemma 3.11.1. Let $L_j \ge 1$ for all $j$, let the r.h.s. $f$ of equation (3.86) satisfy (3.87), and let the coefficients $P_\Psi^\Phi(x)$, $a(x)$ satisfy the following Lipschitz continuity condition:
$$\sum_{\Phi} |P_\Psi^\Phi(x) - P_\Psi^\Phi(y)| \le \tilde C (L, |x - y|)(L, \Psi), \tag{3.106}$$
$$(L, |a(x) - a(y)|) \le C_a (L, |x - y|). \tag{3.107}$$
Then $-f$ is accretive on $M^+(L^2)$ with respect to the weighted norm $x \mapsto (L, |x|)$. More precisely,
$$\sum_{i=1}^{\infty} L_i \sigma_i (f_i(x) - f_i(y)) \le [\alpha(\lambda)\nu + C_a + b(C + \lambda \tilde C)](L, |x - y|), \tag{3.108}$$
for $x, y \in M^+_{\le\lambda}(L) \cap M^+_{\le\nu}(L^2)$ and $\sigma_i = \operatorname{sgn}(x_i - y_i)$, where
$$\alpha(\lambda) = 2(C + \tilde C) \sum_{m=1}^{\infty} \frac{(m + 1)\lambda^{m-1}}{(m - 1)!}. \tag{3.109}$$
In the case of equation (3.85), the summation in the last formula is over $m \le k - 1$.

Remark 59. Here, we mean that $\sigma_i$ equals $1$, $-1$, $0$ for $x_i - y_i > 0$, $x_i - y_i < 0$, $x_i - y_i = 0$, respectively, in line with our convention on the accretivity for the space $l_1$, see (3.102) and (3.103). However, the proof below will show that the assignment of the value $\sigma_i$ for the case $x_i - y_i = 0$ turns out to be irrelevant, which confirms the claim made in Remark 58 above.

Proof. Proposition 3.8.1(i) ensures that the l.h.s. of (3.108) is well defined. In order to keep the formulae slim, we set $a = 0$, since the contribution of $a$ is straightforward. Let us first consider the case when $P_\Psi^\Phi(x)$ does not depend on $x$ and when $b = 0$. By (3.85), we get
$$\sum_{i=1}^{\infty} L_i \sigma_i (f_i(x) - f_i(y)) = \sum_{i=1}^{\infty} L_i \sigma_i \sum_{\Psi} \frac{x^\Psi - y^\Psi}{\Psi!} \sum_{\Phi : (L,\Phi) \le (L,\Psi)} P_\Psi^\Phi (\phi_i - \psi_i),$$


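which rewrites as
$$\sum_{m=1}^{\infty} \frac{1}{m!} \sum_{i_1,\dots,i_m} (x_{i_1} \cdots x_{i_m} - y_{i_1} \cdots y_{i_m}) \sum_{(L,\Phi) \le L_{i_1} + \dots + L_{i_m}} P^\Phi_{i_1,\dots,i_m} \Big( \sum_{i=1}^{\infty} L_i \sigma_i \phi_i - L_{i_1}\sigma_{i_1} - \dots - L_{i_m}\sigma_{i_m} \Big)$$
$$= \sum_{m=1}^{\infty} \frac{1}{m!} \sum_{i_1,\dots,i_m} \sum_{q=1}^{m} x_{i_1} \cdots x_{i_{q-1}} (x_{i_q} - y_{i_q})\, y_{i_{q+1}} \cdots y_{i_m} \sum_{(L,\Phi) \le L_{i_1} + \dots + L_{i_m}} P^\Phi_{i_1,\dots,i_m} \Big( \sum_{i=1}^{\infty} L_i \sigma_i \phi_i - L_{i_1}\sigma_{i_1} - \dots - L_{i_m}\sigma_{i_m} \Big)$$
$$= \sum_{m=1}^{\infty} \frac{1}{m!} \sum_{i_1,\dots,i_m} \sum_{q=1}^{m} x_{i_1} \cdots x_{i_{q-1}}\, y_{i_{q+1}} \cdots y_{i_m} |x_{i_q} - y_{i_q}|\, \sigma_{i_q} \sum_{(L,\Phi) \le L_{i_1} + \dots + L_{i_m}} P^\Phi_{i_1,\dots,i_m} \Big( \sum_{i=1}^{\infty} L_i \sigma_i \phi_i - L_{i_1}\sigma_{i_1} - \dots - L_{i_m}\sigma_{i_m} \Big).$$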

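And now comes the main trick:
$$\sigma_{i_q} \sum_{(L,\Phi) \le L_{i_1} + \dots + L_{i_m}} P^\Phi_{i_1,\dots,i_m} \Big( \sum_{i=1}^{\infty} L_i \sigma_i \phi_i - L_{i_1}\sigma_{i_1} - \dots - L_{i_m}\sigma_{i_m} \Big) = \sum_{(L,\Phi) \le L_{i_1} + \dots + L_{i_m}} P^\Phi_{i_1,\dots,i_m} \Big( \sum_i \sigma_{i_q} L_i \sigma_i \phi_i - \sigma_{i_q} L_{i_1}\sigma_{i_1} - \dots - \sigma_{i_q} L_{i_m}\sigma_{i_m} \Big)$$
$$\le C (L_{i_1} + \dots + L_{i_m})(L_{i_1} + \dots + L_{i_m} - \sigma_{i_q} L_{i_1}\sigma_{i_1} - \dots - \sigma_{i_q} L_{i_m}\sigma_{i_m}). \tag{3.110}$$
Since the term $L_{i_q}$ cancels in the second bracket, the last expression vanishes for $m = 1$. For $m > 1$, it does not exceed
$$2C L_{i_q} \sum_{p \neq q} L_{i_p} + 2C \Big( \sum_{p \neq q} L_{i_p} \Big)^2 \le 2C L_{i_q} \sum_{p \neq q} L_{i_p} + 2C(m - 1) \sum_{p \neq q} L_{i_p}^2.$$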

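Consequently,
$$\sum_{i=1}^{\infty} L_i \sigma_i (f_i(x) - f_i(y)) \le 2C \sum_{m=2}^{\infty} \frac{1}{m!} \sum_{i_1,\dots,i_m} \sum_{q=1}^{m} x_{i_1} \cdots x_{i_{q-1}}\, y_{i_{q+1}} \cdots y_{i_m} |x_{i_q} - y_{i_q}| \Big[ L_{i_q} \sum_{p \neq q} L_{i_p} + (m - 1) \sum_{p \neq q} L_{i_p}^2 \Big], \tag{3.111}$$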


so that $L_{i_q}$ enters only linearly! This expression can be bounded by
$$2C (L, |x - y|) \sum_{m=2}^{\infty} \frac{1}{(m - 2)!} [\lambda^{m-1} + (m - 1)\lambda^{m-2}\nu] \le 2C (L, |x - y|) \sum_{m=1}^{\infty} \frac{(m + 1)\lambda^{m-1}}{(m - 1)!} \nu,$$
which implies (3.108). (Note that we used the estimate $\lambda \le \nu$ in the second inequality.) If nontrivial $b$ are included, the estimates (3.110) are modified for $m = 1$. An additional term that is bounded by $CbL_i$ then adds to the r.h.s. of (3.111):
$$Cb \sum_i |x_i - y_i| L_i = Cb (L, |x - y|).$$

Turning to a general $P_\Psi^\Phi(x)$ satisfying (3.106) and $b = 0$, we have to add to the r.h.s. of (3.111) the term
$$\sum_{i=1}^{\infty} L_i \sigma_i \sum_{\Psi} \frac{y^\Psi}{\Psi!} \sum_{\Phi : (L,\Phi) \le (L,\Psi)} [P_\Psi^\Phi(x) - P_\Psi^\Phi(y)](\phi_i - \psi_i) \le \sum_{i=1}^{\infty} \sum_{\Psi} \frac{y^\Psi}{\Psi!} \sum_{\Phi : (L,\Phi) \le (L,\Psi)} |P_\Psi^\Phi(x) - P_\Psi^\Phi(y)|\, |L_i \phi_i + L_i \psi_i|,$$
which by (3.106) and the estimate $(L, \Phi) \le (L, \Psi)$ does not exceed
$$2 \tilde C \sum_{\Psi} \frac{y^\Psi}{\Psi!} (L, \Psi)^2 (L, |x - y|) \le 2 \tilde C (L, |x - y|) \sum_{m} \frac{1}{m!} \sum_{i_1,\dots,i_m} y_{i_1} \cdots y_{i_m}\, m (L_{i_1}^2 + \dots + L_{i_m}^2) = 2 \tilde C (L, |x - y|) \sum_{m=1}^{\infty} \frac{m}{(m - 1)!} \sum_{i_1,\dots,i_m} y_{i_1} \cdots y_{i_m} L_{i_1}^2.$$
This gives the remaining contribution to (3.109). Including $b$ adds an additional term $b\lambda \tilde C (L, |x - y|)$.

It is remarkable that similar to the case with moments, the accretivity estimates can also be obtained for weighted norms $(L^\beta, |x|)$ with any $\beta > 1$. In order to keep the formulae slim, let us only discuss the case $\beta = 2$.

Lemma 3.11.2. Under the assumptions of Lemma 3.11.1, the function $-f$ is accretive on $M^+(L^3)$ with respect to the weighted norm $x \mapsto (L^2, |x|)$. More precisely,
$$\sum_{i=1}^{\infty} L_i^2 \sigma_i (f_i(x) - f_i(y)) \le \alpha(C, \tilde C, C_a, (L^3, x)) (L^2, |x - y|), \tag{3.112}$$
for $x, y \in M^+(L^3)$ with some continuous function $\alpha(C, \tilde C, C_a, (L^3, x))$.


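Proof. The difference to Lemma 3.11.1 only affects the parts with $a = 0$, $b = 0$ and $P$ independent of $x$. In this case, working like in the proof of Lemma 3.11.1 we obtain
$$\sum_{i=1}^{\infty} L_i^2 \sigma_i (f_i(x) - f_i(y)) = \sum_{m=1}^{\infty} \frac{1}{m!} \sum_{i_1,\dots,i_m} \sum_{q=1}^{m} x_{i_1} \cdots x_{i_{q-1}}\, y_{i_{q+1}} \cdots y_{i_m} |x_{i_q} - y_{i_q}|\, \sigma_{i_q} \sum_{(L,\Phi) \le L_{i_1} + \dots + L_{i_m}} P^\Phi_{i_1,\dots,i_m} \Big( \sum_{i=1}^{\infty} L_i^2 \sigma_i \phi_i - L_{i_1}^2\sigma_{i_1} - \dots - L_{i_m}^2\sigma_{i_m} \Big).$$
Now the main trick is the estimate
$$\sigma_{i_q} \Big( \sum_i L_i^2 \sigma_i \phi_i - L_{i_1}^2\sigma_{i_1} - \dots - L_{i_m}^2\sigma_{i_m} \Big) \le \Big( \sum_i L_i^2 \phi_i - L_{i_q}^2 + \sum_{p \neq q} L_{i_p}^2 \Big) \le \Big( \Big( \sum_i L_i \phi_i \Big)^2 - L_{i_q}^2 + \sum_{p \neq q} L_{i_p}^2 \Big)$$
$$\le \Big( (L_{i_1} + \dots + L_{i_m})^2 - L_{i_q}^2 + \sum_{p \neq q} L_{i_p}^2 \Big) \le 2 L_{i_q} \sum_{p \neq q} L_{i_p} + m \sum_{p \neq q} L_{i_p}^2.$$
The leftover part is fully analogous to Lemma 3.11.1.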
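The last inequality in this chain is elementary: after expanding the square, it reduces to $\big( \sum_{p \neq q} L_{i_p} \big)^2 \le (m - 1) \sum_{p \neq q} L_{i_p}^2$, which is the Cauchy–Schwarz inequality. A quick numerical sanity check can be sketched as follows; the function name `trick_gap` and the sampling ranges are choices made here for illustration.

```python
import random


def trick_gap(L, q):
    """Right-hand side minus left-hand side of the last inequality.

    L is the list (L_{i_1}, ..., L_{i_m}) and q is the distinguished position
    (0-based). The result should be non-negative.
    """
    m = len(L)
    others = [L[p] for p in range(m) if p != q]
    lhs = sum(L) ** 2 - L[q] ** 2 + sum(v * v for v in others)
    rhs = 2.0 * L[q] * sum(others) + m * sum(v * v for v in others)
    return rhs - lhs


def smallest_gap(trials=1000, seed=7):
    """Minimum of trick_gap over random samples with L_j >= 1."""
    rng = random.Random(seed)
    worst = float("inf")
    for _ in range(trials):
        m = rng.randint(1, 6)
        L = [rng.uniform(1.0, 10.0) for _ in range(m)]
        worst = min(worst, trick_gap(L, rng.randrange(m)))
    return worst
```

Equality occurs, for instance, when all the entries other than the distinguished one coincide, which shows that the factor $m$ cannot be improved in general.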

3.12 The major well-posedness result in $l_p^+$

We can now improve Theorem 3.9.1 to the full well-posedness result.

Theorem 3.12.1. Let $L \in \mathbb{R}^\infty_{++}$ be non-decreasing with finite set levels such that $L_j \ge 1$ for all $j$. Let (3.87), (3.88), (3.106) and (3.107) hold, and let $x \in M^+_{\le\lambda}(L) \cap M_{\le\nu}(L^2)$, $a(x) = a \in \mathbb{R}^\infty_{+,fin}$.
(i) Then for any $p \ge 1$ there exists a unique global solution $X(t, x)$ to (3.85) in $l_p$ with the initial condition $x$ such that
$$X(t, x) \in M^+_{\le\lambda(t)}(L) \cap M_{\le\nu(t)}(L^2), \qquad \nu(t) = e^{\kappa(2,\lambda(t))t} \nu + (e^{\kappa(2,\lambda(t))t} - 1) \frac{(L^2, a)}{\kappa(2, \lambda(t))}, \tag{3.113}$$
with $\kappa$ from (3.97) and $\lambda(t)$ from (3.92). If $x \in M^+(L^\beta)$ with some $\beta > 2$, then the corresponding solution $X(t, x)$ also belongs to $M^+(L^\beta)$ with similar estimates (given in Theorem 3.9.1).


(ii) If $X(t, x)$ and $X(t, y)$ are two solutions with the initial conditions $x$ and $y$, then
$$(L, |X(t, x) - X(t, y)|) \le \exp\{[\alpha(\lambda(t))\nu(t) + C_a + b(C + \tilde C)]t\}\,(L, |x - y|), \tag{3.114}$$
with $\alpha(\lambda)$ from (3.109).
(iii) If $X_n(t, x)$ denotes the solution to the finite-dimensional approximation $\dot x = f(P_n(x))$, then
$$\|X_n(t, x) - X(t, x)\|_{l_p} \to 0 \quad \text{and} \quad (L, |X_n(t, x) - X(t, x)|) \to 0 \tag{3.115}$$
as $n \to \infty$ uniformly on $x \in M_{\le\nu}(L^2)$. In particular, it follows that if the sum in (3.86) is over $\Phi : (L, \Phi) = (L, \Psi)$ and not just over $(L, \Phi) \le (L, \Psi) + b$, then $(L, X(t, x)) = (L, x)$.

Proof. The existence part is already proved in Theorem 3.9.1. Estimate (3.114) and hence uniqueness follow from the accretivity (3.108) in the weighted norm in the same way as (3.104) of Proposition 3.10.1 follows from the accretivity in $l_p$ and Proposition 1.4.4 in the case $p = 1$. It remains to show (iii): The first limit in (3.115) follows from the construction of $X(t, x)$ in Theorem 3.9.1, and the second limit follows from the convergence in $l_1$ and the observation that $\sum_{j \ge m} L_j y_j \to 0$ as $m \to \infty$ uniformly for $y \in M^+_{\le\nu}(L^2)$.

Remark 60. Part (ii) of Theorem 3.9.1 can be similarly extended into a well-posedness of equation (3.86), but with the additional constraint that $x, a \in M^+(L^\beta)$ with some $\beta > 2$.

As an example, let us consider equation (3.68) describing the mean-field dependent merging-splitting or coagulation-fragmentation, enhanced by the process of preferential attachment.

Theorem 3.12.2. Let $P(x)$, $a(x)$, $Q_{kl}(x)$, $P^{m,k}(x)$ be continuous non-negative functions on $l_1$ such that

P (x) + a(x) ≤ c, Qkl (x) ≤ c(k + l), P mn (x) ≤ ck, (3.116) m+n 1 and to $(L, \Phi) \le (L, \Psi) + b$ for $m = 1$.

Assuming the existence of the partial derivative $\xi(t) = \xi^p(t) = \frac{\partial X(t,x)}{\partial x_p}$ of the solutions to (3.119), one gets the following equation for $\xi = \xi(t) = \xi^p(t)$ by differentiation:
$$\dot \xi_j = \sum_{m=1}^{k} \frac{1}{m!} \sum_{i_1,\dots,i_m} \sum_{q=1}^{m} \xi_{i_q} \prod_{r \neq q} X_{i_r}(t, x) \sum_{\Phi} P^\Phi_{i_1,\dots,i_m} (\phi_j - \delta_j^{i_1} - \dots - \delta_j^{i_m})$$
$$= \sum_{m=0}^{k-1} \frac{1}{m!} \sum_{l,i_1,\dots,i_m} \xi_l X_{i_1}(t, x) \cdots X_{i_m}(t, x) \sum_{\Phi} P^\Phi_{i_1,\dots,i_m,l} (\phi_j - \delta_j^l - \delta_j^{i_1} - \dots - \delta_j^{i_m}), \tag{3.120}$$
with the initial condition $\xi_j(0) = \delta_j^p$. Since we shifted the parameter $m$ in the last equation, the restriction on $\Phi$ turns to $(L, \Phi) \le (L, \Psi)$ for $m > 0$ and $(L, \Phi) \le (L, \Psi) + b$ for $m = 0$.

Equation (3.120) differs from (3.119) in two essential points. First of all, it is linear: it can be written as
$$\dot \xi_j = \sum_l A_{jl}(t) \xi_l \iff \dot \xi = A(t)\xi, \tag{3.121}$$
with the infinite matrix $A$ having the elements
$$A_{jl}(t) = \sum_{m=0}^{k-1} \frac{1}{m!} \sum_{i_1,\dots,i_m} X_{i_1}(t, x) \cdots X_{i_m}(t, x) \sum_{\Phi} P^\Phi_{i_1,\dots,i_m,l} (\phi_j - \delta_j^l - \delta_j^{i_1} - \dots - \delta_j^{i_m}).$$
And secondly, no positivity for $\xi$ can be expected. This implies that an estimate of $(L, \xi)$ does not make much sense, and that only estimates for $(L, |\xi|)$ can be relevant.

3.13. Sensitivity


The natural finite-dimensional approximations to (3.120) are given by the equations
$$\dot \xi_j = \sum_l A^{(n)}_{jl}(t) \xi_l = \sum_{m=0}^{k-1} \frac{1}{m!} \sum_{l,i_1,\dots,i_m} \xi_l X_{i_1}(t, P_n(x)) \cdots X_{i_m}(t, P_n(x)) \sum_{\Phi} P^\Phi_{i_1,\dots,i_m,l} (\phi_j - \delta_j^l - \delta_j^{i_1} - \dots - \delta_j^{i_m}). \tag{3.122}$$
(Note that they are finite-dimensional due to the locality of the r.h.s. of (3.119).) Obviously, the solutions $\xi_j^{(n)}$ to these finite-dimensional equations with the initial conditions $\delta_j^p$ are well defined.

We shall analyse the smoothness of the solutions to equation (3.119) by the following steps. First we shall see with the help of accretivity that the approximations (3.122) fit the assumptions of Lemma 3.7.1, which leads to the existence of solutions to (3.120). Next, we shall use accretivity again for proving the uniqueness. This would imply that the approximations $\xi_j^{(n)} = \xi_j^{(n)}(t)$ (note that we often omit the dependence on $t$ in order to keep the formulae slim) solving (3.122) actually converge to the solutions of (3.120) (not only their subsequences), which again implies that the derivatives $\xi_j^{(n)}$ of $X(t, P_n(x))$ with respect to $x_p$ converge. Therefore, their limit $\xi_j$ is the derivative of $X(t, x)$.

Let us start with accretivity estimates in the weighted norm $y \mapsto (L, |y|)$, as in Lemma 3.11.1, in order to get the estimates for the growth of $(L, |\xi^{(n)}|)$. The estimate (3.108) for the finite difference of the r.h.s. of equation (3.119) suggests that a similar estimate should hold for the derivatives of this r.h.s.:
$$\sum_{i=1}^{\infty} L_i \operatorname{sgn}(\xi_i^{(n)}) [A^{(n)}(t)\xi^{(n)}]_i \le C (L, |\xi^{(n)}|) \Big[ b + 2(L^2, X(t, P_n(x))) \sum_{m=1}^{k-1} \frac{(m + 1)(L, x)^{m-1}}{(m - 1)!} \Big]. \tag{3.123}$$

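In fact, this can be proved similarly to the proof in Lemma 3.11.1. Namely, setting $b = 0$ in order to avoid lengthy formulae (its contribution is straightforward), we have
\[
\begin{aligned}
\sum_{i=1}^{\infty} L_i \operatorname{sgn}(\xi_i^{(n)}) [A^{(n)}(t)\xi^{(n)}]_i
&= \sum_{m=0}^{k-1} \frac{1}{m!} \sum_{l,i_1,\dots,i_m} |\xi_l^{(n)}|\, X_{i_1}(t,P_n(x)) \cdots X_{i_m}(t,P_n(x)) \\
&\quad \times \sum_{\Phi:\,(L,\Phi)\le L_{i_1}+\cdots+L_{i_m}+L_l} P^{\Phi}_{l,i_1,\dots,i_m} \operatorname{sgn}(\xi_l^{(n)}) \\
&\quad \times \Big( \sum_i L_i \operatorname{sgn}(\xi_i^{(n)})\phi_i - L_{i_1}\operatorname{sgn}(\xi_{i_1}^{(n)}) - \cdots - L_{i_m}\operatorname{sgn}(\xi_{i_m}^{(n)}) - L_l \operatorname{sgn}(\xi_l^{(n)}) \Big).
\end{aligned}
\]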


Similarly to the proof of Lemma 3.11.1, we can show that this expression is bounded as follows:
\[
\begin{aligned}
&\le \sum_{m=1}^{k-1} \frac{1}{m!} \sum_{l,i_1,\dots,i_m} |\xi_l^{(n)}|\, X_{i_1}(t,P_n(x)) \cdots X_{i_m}(t,P_n(x))
\times 2C (L_{i_1} + \cdots + L_{i_m} + L_l)(L_{i_1} + \cdots + L_{i_m}) \\
&\le 2C (L,|\xi^{(n)}|) \sum_{m=1}^{k-1} \frac{(L,P_n(x))^m}{(m-1)!}
+ 2C (1,|\xi^{(n)}|) \sum_{m=1}^{k-1} (L^2, X(t,P_n(x)))\, \frac{m (L,P_n(x))^{m-1}}{(m-1)!},
\end{aligned}
\]

which yields (3.123). By (3.123) and Proposition 1.4.4 in the Banach space of sequences with the weighted norm $(L,|\xi|)$, it follows that
\[
(L,|\xi^{(n)}(t)|) \le C \int_0^t (L,|\xi^{(n)}(s)|) \Big( b + 2 (L^2, X(s,P_n(x))) \sum_{m=1}^{k-1} \frac{(m+1)(L,x)^{m-1}}{(m-1)!} \Big)\, ds.
\]
By Gronwall's lemma and taking into account that $X(t,P_n(x)) \in M_{\le \nu(t)}(L^2)$ by (3.113) and that $(L,|\xi^{(n)}|) = L_p$ at $t = 0$, it follows that
\[
(L,|\xi^{(n)}(t)|) \le \exp\Big\{ Ct \Big( b + 2\nu(t) \sum_{m=1}^{k-1} \frac{m (L,x)^{m-1}}{(m-1)!} \Big) \Big\} L_p. \tag{3.124}
\]
Consequently, we can apply Lemma 3.7.1 to the approximations $\xi^{(n)}$ with
\[
K_T = \Big\{ y \in l_1 : (L,|y|) \le \exp\Big\{ CT \Big( b + 2\nu(T) \sum_{m=1}^{k-1} \frac{m (L,x)^{m-1}}{(m-1)!} \Big) \Big\} L_p \Big\}.
\]
It can be seen that all conditions are met due to (3.124) and Theorem 3.12.1(iii), which implies the existence of solutions to (3.120) with the initial condition $\xi_j(0) = \delta_j^p$ that can be obtained as the limit of a converging subsequence of $\xi^{(n)}$. Moreover, for any solution $\xi$ to (3.120) from $K_T$, the following estimate can be obtained similarly to (3.123):
\[
\sum_{i=1}^{\infty} L_i \operatorname{sgn}(\xi_i) [A(t)\xi]_i \le C (L,|\xi|) \Big( b + 2\nu(t) \sum_{m=1}^{k-1} \frac{(m+1)(L,x)^{m-1}}{(m-1)!} \Big). \tag{3.125}
\]
Next, by linearity of $A(t)$, this estimate can be applied to the difference $\xi^p - \xi^q$ of any two solutions of (3.120) from $K_T$ with the initial conditions $\xi_j^p(0) = \delta_j^p$,


$\xi_j^q(0) = \delta_j^q$. This leads to
\[
\sum_{i=1}^{\infty} L_i \operatorname{sgn}(\xi_i^p - \xi_i^q) [A(t)(\xi^p - \xi^q)]_i
\le C (L,|\xi^p - \xi^q|) \Big( b + 2\nu(t) \sum_{m=1}^{k-1} \frac{(m+1)(L,x)^{m-1}}{(m-1)!} \Big). \tag{3.126}
\]
Using again Proposition 1.4.4 and Gronwall's lemma, we can conclude that
\[
(L,|\xi^p(t) - \xi^q(t)|) \le \exp\Big\{ Ct \Big( b + 2\nu(t) \sum_{m=1}^{k-1} \frac{(m+1)(L,x)^{m-1}}{(m-1)!} \Big) \Big\} |L_p - L_q|. \tag{3.127}
\]

The uniqueness of solutions to (3.120) with the initial condition $\xi_j(0) = \delta_j^p$ follows from (3.127). Consequently, it also follows that this solution is the limit in $l_1$ of the approximating solutions $\xi^{(n)}$. As already noted, this implies that the solution $\xi^p(t)$ to (3.120) with the initial condition $\xi_j(0) = \delta_j^p$ equals in fact the derivative $\partial X(t,x)/\partial x_p$.
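The scheme above (differentiate the kinetic equation formally, solve the resulting linear equation with initial condition $\delta^p$, and identify the solution with $\partial X/\partial x_p$) can be tried out numerically on a finite-dimensional toy model. The following is only a minimal sketch under stated assumptions: the two-component quadratic system $\dot x_1 = -x_1^2$, $\dot x_2 = x_1^2/2$, the Runge–Kutta integrator and all function names are illustrative choices and are not taken from the text.

```python
def rhs(x):
    """Toy quadratic kinetic field (an assumption): x1' = -x1^2, x2' = x1^2 / 2."""
    x1, _ = x
    return [-x1 * x1, 0.5 * x1 * x1]


def jacobian(x):
    """Matrix A(t) of the linearised equation, evaluated along X(t, x)."""
    x1, _ = x
    return [[-2.0 * x1, 0.0], [x1, 0.0]]


def augmented_rhs(state):
    """Right-hand side for the pair (X, xi): X' = F(X), xi' = A(t) xi."""
    x, xi = state[:2], state[2:]
    a = jacobian(x)
    dxi = [a[0][0] * xi[0] + a[0][1] * xi[1],
           a[1][0] * xi[0] + a[1][1] * xi[1]]
    return rhs(x) + dxi


def rk4(f, y0, t_end, steps):
    """Classical fourth-order Runge-Kutta integration on [0, t_end]."""
    h = t_end / steps
    y = list(y0)
    for _ in range(steps):
        k1 = f(y)
        k2 = f([yi + 0.5 * h * ki for yi, ki in zip(y, k1)])
        k3 = f([yi + 0.5 * h * ki for yi, ki in zip(y, k2)])
        k4 = f([yi + h * ki for yi, ki in zip(y, k3)])
        y = [yi + h / 6.0 * (a + 2.0 * b + 2.0 * c + d)
             for yi, a, b, c, d in zip(y, k1, k2, k3, k4)]
    return y


def sensitivity(x0, p, t_end=1.0, steps=2000):
    """Solve the linearised equation with xi(0) = delta^p and return xi(t_end)."""
    xi0 = [1.0 if j == p else 0.0 for j in range(2)]
    return rk4(augmented_rhs, list(x0) + xi0, t_end, steps)[2:]


def finite_difference(x0, p, t_end=1.0, steps=2000, eps=1e-6):
    """Central difference quotient of X(t_end, x) in the coordinate x_p."""
    plus, minus = list(x0), list(x0)
    plus[p] += eps
    minus[p] -= eps
    y_plus = rk4(rhs, plus, t_end, steps)
    y_minus = rk4(rhs, minus, t_end, steps)
    return [(a - b) / (2.0 * eps) for a, b in zip(y_plus, y_minus)]
```

For this toy system the solution is explicit, $X_1(t,x) = x_1/(1 + x_1 t)$, so that $\partial X_1/\partial x_1 = (1 + x_1 t)^{-2}$, which gives an independent check of both the linearised equation and the difference quotient.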

Summing up, we proved the following sensitivity result for discrete kinetic equations:

Theorem 3.13.1. Let $L \in \mathbb{R}^{\infty}_{++}$ be non-decreasing with finite set levels such that $L_j \ge 1$ for all $j$ and that (3.87), (3.88) hold. Then the solutions $X(t,x)$ to equation (3.119) (which are uniquely defined according to Theorem 3.12.1) are differentiable with respect to each coordinate $x_p$ for $x \in M^+(L^2)$, and the derivatives $\xi^p(t) = \partial X(t,x)/\partial x_p$ are the unique solutions to equation (3.120) in $l_p$ belonging to $M(L)$, with the initial condition $\xi_j^p(0) = \delta_j^p$. Their growth is estimated by the same estimate (3.124) as for $\xi^{(n)}$, and the difference of two derivatives satisfies (3.127).

Let us now extend Theorem 3.13.1 to the full equations. Note that we omit the details of the proof and show only a straightforward modification to the above proof.

Theorem 3.13.2. Under the condition of Theorem 3.12.1, assume additionally that the functions $a(x)$, $P^{\Phi}_{\Psi}(x)$ are continuously differentiable in $x$, so that
\[
\sup_j \sum_{\Phi} \Big| \frac{\partial P^{\Phi}_{\Psi}(x)}{\partial x_j} \Big| \le \tilde C (L,\Psi), \qquad
\sup_j \Big( L, \Big| \frac{\partial a(x)}{\partial x_j} \Big| \Big) \le C_a. \tag{3.128}
\]
Then the solutions $X(t,x)$ to equation (3.119) are differentiable with respect to each coordinate $x_p$ for $x \in M^+(L^2)$. The derivatives $\xi^p(t) = \partial X(t,x)/\partial x_p$ belong to


$M(L)$ and have the bounds
\[
(L,|\xi^p(t)|) \le \exp\Big\{ t \Big( C_a + (C + (L,x)\tilde C) b + 2 (C + \tilde C)\nu(t) \sum_{m=1}^{k-1} \frac{(m+1)(L,x)^{m-1}}{(m-1)!} \Big) \Big\} L_p, \tag{3.129}
\]
and the difference of two derivatives can be estimated by
\[
(L,|\xi^p(t) - \xi^q(t)|) \le \exp\Big\{ Ct \Big( C_a + (C + (L,x)\tilde C) b + 2 (C + \tilde C)\nu(t) \sum_{m=1}^{k-1} \frac{(m+1)(L,x)^{m-1}}{(m-1)!} \Big) \Big\} |L_p - L_q|. \tag{3.130}
\]

Let us emphasize that the solutions $X(t,x)$ to equation (3.119) belong to $M(L^2)$ and their derivatives $\xi$ belong to $M(L)$, which reflects the important general effect that derivatives with respect to the initial condition are often less regular than the solution itself. This regularity decay extends to other moments as follows.

Theorem 3.13.3. Under the condition of Theorem 3.13.2, let $x \in M^+(L^3)$ (and therefore $X(t,x) \in M^+(L^3)$ for all $t$ by Theorem 3.12.1(i)). Then the derivatives $\xi^p(t) = \partial X(t,x)/\partial x_p$ belong to $M(L^2)$ and the following estimates hold:
\[
(L^2,|\xi^p(t)|) \le \alpha_1(t, C_a, C, \tilde C, (L^3,x))\, L_p^2, \tag{3.131}
\]
\[
(L^2,|\xi^p(t) - \xi^q(t)|) \le \alpha_1(t, C_a, C, \tilde C, (L^3,x))\, |L_p^2 - L_q^2|, \tag{3.132}
\]
with some continuous function $\alpha_1(t, C_a, C, \tilde C, (L^3,x))$.

Proof. It is the same as in Theorem 3.13.2, but one must now use the accretivity of Lemma 3.11.2 rather than that of Lemma 3.11.1. Namely, one obtains the following accretivity estimate for the derivative approximations:
\[
\sum_{i=1}^{\infty} L_i^2 \operatorname{sgn}(\xi_i^{(n)}) [A^{(n)}(t)\xi^{(n)}]_i \le (L^2,|\xi^{(n)}|)\, \alpha_1(t, C_a, C, \tilde C, (L^3,x)) \tag{3.133}
\]
with some continuous function $\alpha_1(t, C_a, C, \tilde C, (L^3,x))$. □

In the following section, we will show that – as one may now expect – taking second derivatives leads to a further decay of the regularity.

3.14 Second-order sensitivity

For the analysis of fluctuations in interacting particle systems as described by the kinetic equations, the second-order derivatives of the solutions with respect to


the initial data play a key role. Their analysis is based on similar ideas as for the first-order derivatives. However, the expected further decay of regularity requires that they can only be obtained for solutions with a third-order moment in $L$, i.e., the solutions from $M^+(L^3)$ according to Theorem 3.13.3.

Theorem 3.14.1. Under the conditions of Theorem 3.13.2, assume additionally that the functions $a(x)$, $P^{\Phi}_{\Psi}(x)$ are twice continuously differentiable in $x$, so that
\[
\sup_{j,k} \sum_{\Phi} \Big| \frac{\partial^2 P^{\Phi}_{\Psi}(x)}{\partial x_j \partial x_k} \Big| \le \tilde C (L,\Psi), \qquad
\sup_{j,k} \Big( L, \Big| \frac{\partial^2 a(x)}{\partial x_j \partial x_k} \Big| \Big) \le C_a. \tag{3.134}
\]
Let $x \in M^+(L^3)$, such that $X(t,x) \in M^+(L^3)$ for all $t$ by Theorem 3.12.1(i) and that $\xi^p(t) = \partial X(t,x)/\partial x_p$ belong to $M(L^2)$ by Theorem 3.13.3. Then the continuous derivatives $\eta^{p,q}(t) = \partial^2 X(t,x)/\partial x_p \partial x_q$ exist and belong to $M(L)$. Moreover, the following estimate holds:
\[
(L,|\eta^{p,q}(t)|) \le t L_p L_q\, \alpha_2(t, C_a, C, \tilde C, (L^3,x)) \tag{3.135}
\]
with some continuous function $\alpha_2(t, C_a, C, \tilde C, (L^3,x))$.

Proof. It is very similar to Theorem 3.13.1. Reducing the discussion to the case of $a$ and $P$ independent of $x$, we first formally differentiate equation (3.120) (assuming that the derivative exists), which leads to the equation
\[
\dot\eta_j^{p,q}(t) = [A(t)\eta^{p,q}(t)]_j + \sum_{m=0}^{k-2} \frac{1}{m!} \sum_{l_1,l_2,i_1,\dots,i_m} \xi^p_{l_1} \xi^q_{l_2} X_{i_1}(t,x) \cdots X_{i_m}(t,x)
\times \sum_{\Phi} P^{\Phi}_{l_1,l_2,i_1,\dots,i_m} (\phi_j - \delta_j^{l_1} - \delta_j^{l_2} - \delta_j^{i_1} - \cdots - \delta_j^{i_m}), \tag{3.136}
\]
which must be satisfied with the initial condition $\eta_j^{p,q}(0) = 0$. This equation has the same linear part as equation (3.120), and its non-homogeneous term $g$, i.e., the sum in (3.136), has a uniformly bounded weighted norm $(L,|g|)$, because $(L^2,|\xi^p(t)|)$ and $(L^2,|\xi^q(t)|)$ are bounded. Therefore, equation (3.136) has a unique solution which belongs to $M(L)$. By the same approximation as in Theorem 3.13.1, one can show that this solution yields in fact the required second-order derivative. □

For example, applying this result to equation (3.68) yields the following.

Theorem 3.14.2. Under the assumptions of Theorem 3.12.2, assume additionally that the functions $a$, $P$, $Q_{kl}$, $P^{mk}$ are twice continuously differentiable and

 ∂P mn (x, b)   ∂Qkj (x, b)   ∂a(x, b) ∂P (x, b)    ≤ c(k + j),   ≤ ck,  ≤ c, | + |      ∂xp ∂xp  ∂xp ∂xp m+n 0. This difference is important when it comes to μ − δx ∈ positivity-preserving solutions, since for l1 the corresponding conditional positivity is a property on the boundary of l1+ , and for M(Rn ) it is a property for the whole set M+ (Rn ). Section 3.1 that deals with the equations in Rn that preserve the positive cone Rn+ can be considered an application of the general theory of invariant sets for ODEs, see, e.g., [233], and [176] for Banach spaces. For the history of the detailed and complex balance conditions that culminate in Theorem 3.4.2, we refer to [95]. Extensions to these results for continuous state spaces are given in [94]. Sections 3.7 to 3.12 cover the theory of discrete kinetic equations, culminating in Theorem 3.12.1. For the basic Smoluchowski coagulation-fragmentation model, this result was first obtained in [25] and then extended to various models by many authors. The general form given here is taken from [139] (although it is further extended in order to account for a possible growth of L by injections and unilateral transformations), where a rather extensive bibliography can be found.

3.17. Summary and comments


Let us emphasize, however, that the method of [25] developed specifically to treat Smoluchowski equations can be considered an application of the general method of accretive operators. Theorem 3.13.1 is an adaptation to the discrete case of the corresponding sensitivity result for continuous state models from [147]. In the framework of increasing Lyapunov functions, the results of Theorems 3.13.2, 3.13.3, 3.14.1 and 3.14.2 are new. Sensitivity for the Smoluchowski equation with respect to a parameter was developed in [22] and applied in [23] for obtaining effective numeric solutions.

The kinetic equations that have been dealt with in this chapter can be obtained as weak limits of the Markov model of interacting particles, i.e., as a dynamic law of large numbers, see [139] and [147] for the general kinetics of mass-exchange processes. For the coagulation-fragmentation processes described by the Smoluchowski equations, the approximating Markov process is referred to as the Marcus–Lushnikov process. The corresponding convergence result was obtained by many authors under various assumptions, see, e.g., [214] and references therein. This relation leads to a deep study of the coagulation-fragmentation process by probabilistic methods, see [36] and references therein. In particular, this method leads to a detailed description of special solutions once the total mass is not conserved, a situation that is referred to as gelation (creation of infinite mass clusters in a finite time).

It turns out that the limiting behaviour of fluctuation processes of particle systems around their law-of-large-number limits can be described by an appropriate infinite-dimensional diffusion process. The corresponding convergence result, i.e., an infinite-dimensional central limit theorem, was obtained for the discrete case with bounded coefficients in [59], and for a general case in [145].
The method of [59] is essentially probabilistic, while the method of [145] is analytic with the main ingredient being the sensitivity of the solutions to kinetic equations with respect to the initial data, which is a very strong motivation for studying sensitivity.

The systematic inclusion of the positive influx of particles (growing Lyapunov functions) that was presented here is motivated by the vast modern literature on the models of growth, in particular the famous preferential-attachment model that was introduced in the context of complex networks in [5]. Its remarkable history is nicely described in [246]. Our results on sensitivity of the kinetic equations in these models allow for a rigorous proof in a very general setting that the corresponding Markov systems of interacting particles with a linear growth rate converge to the deterministic limit as described by these kinetic equations, see [154].

Chapter 4

Linear Evolutionary Equations: Foundations

This chapter deals with linear equations with an unbounded r.h.s. in Banach spaces (or even locally convex spaces), and the methods of semigroups and propagators are developed. The basic tools are duality, perturbation theory and the Fourier transform. The general theory is illustrated on various examples, including evolutions with mixed fractional Laplacians, Schrödinger equations with singular and polynomially growing potentials and magnetic fields, complex diffusions and parabolic higher-order PDEs and ΨDEs with local and nonlocal perturbations.

© Springer Nature Switzerland AG 2019. V. Kolokoltsov, Differential Equations on Measures and Functional Spaces, Birkhäuser Advanced Texts Basler Lehrbücher, https://doi.org/10.1007/978-3-030-03377-4_4

4.1 Semigroups and their generators

We start by recalling the basic facts on semigroups of linear operators, which are defined as collections of continuous operators $T_t$, $t \in \mathbb{R}_+$, in a linear topological space $V$ such that $T_0$ is the identity and the chain rule (also called group equation) $T_{t+s} = T_t T_s$ holds for all $t, s \ge 0$. Such a semigroup $T_t$ is called strongly continuous (or, alternatively, a semigroup of the class $(C_0)$) if $T_t f \to f$ as $t \to 0$ for any $f \in V$. In a Banach space $B$, this condition reads $\|T_t f - f\| \to 0$ as $t \to 0$, and in a locally convex space with the topology generated by semi-norms $p_\alpha$, it turns into the requirement that $p_\alpha(T_t f - f) \to 0$ for any $\alpha$ and $f$. If bounded operators $T_t$ are defined for all $t \in \mathbb{R}$ and satisfy the group equation $T_t T_s = T_{t+s}$ there, then the family $\{T_t\}$ is referred to as a group of operators.

We shall mostly work with Banach spaces. Note, however, that an extension of the basic theory to general locally convex spaces is usually straightforward if all the required conditions on the norm are extended in such a way that they hold for each semi-norm of the locally convex space.

If $V$ is a barrelled space (for instance, a Banach space or a Fréchet space), then it follows from the principle of uniform boundedness, Theorem 1.6.1, and the strong continuity of $T_t$ (and, of course, the continuity of each $T_t$) that the family $T_t$ is locally equicontinuous. For a Banach space, this means that the norms $\|T_t\|$ are uniformly bounded for $t$ from any compact interval. For a barrelled space $V$ with the topology induced by the family of semi-norms $p_\alpha$, it means, according to Proposition 1.6.2, that for any $t > 0$ and $\alpha$ there exist a constant $C$ and a finite number of semi-norms $\{p_{\alpha_1}, \dots, p_{\alpha_k}\}$ such that
\[
p_\alpha(T_s x) \le C (p_{\alpha_1}(x) + \cdots + p_{\alpha_k}(x)) \tag{4.1}
\]
for all $x \in V$ and $s \in [0,t]$. The semigroup $T_t$ in a locally convex space $V$ is called equicontinuous if for any $\alpha$ there exist a constant $C$ and a finite number of semi-norms $\{p_{\alpha_1}, \dots, p_{\alpha_k}\}$ such that (4.1) holds for all $s \ge 0$. In the case of a Banach space, this is equivalent to the requirement that the family of operators $T_t$ is uniformly bounded. A particularly important case is when $T_t$ is a semigroup of contractions in a Banach space $B$, i.e., $\|T_t\| \le 1$ for all $t$.

The simplest example of strongly continuous semigroups of operators in a Banach space $B$ is the family of exponents
\[
T_t = e^{tA} = \sum_{n=0}^{\infty} \frac{t^n}{n!} A^n, \tag{4.2}
\]
defined for any bounded linear operator $A$ on $B$. In this case, the $T_t$ actually form a group. A special case of such $A$ are square matrices, i.e., linear operators in $\mathbb{R}^d$. Furthermore, the shifts $T_t f(x) = f(x+t)$ form a strongly continuous semigroup (and even a group) of contractions on the Banach spaces $C_\infty(\mathbb{R})$ and $L_p(\mathbb{R})$, $p \ge 1$, an equicontinuous semigroup on the Schwartz space $S(\mathbb{R}^d)$ and on the space of test functions $D(\mathbb{R}^d)$, i.e., infinitely differentiable functions with a compact support. However, this semigroup is not strongly continuous on $C(\mathbb{R})$.

Exercise 4.1.1. Check these assertions. Observe also that if $f$ is an analytic function, then
\[
f(x+t) = \sum_{n=0}^{\infty} \frac{t^n}{n!} (D^n f)(x), \tag{4.3}
\]
which can be formally written as $e^{tD} f(x)$.

The resolving operators $f_0 \mapsto f_t$ from Theorem 2.4.1 in the case of a time-independent $\psi_t = \psi$ provide more examples of strongly continuous semigroups. The general relation between semigroups and Cauchy problems will be established in the framework of propagators in Theorem 4.10.1.

If $D \subset V$ and $A : D \to V$ is a linear mapping, then $A$ is usually referred to as a linear operator on $V$ with the domain $D$. Such an operator is called closed if its graph is a closed subset of $V \times V$. If $V$ is metrizable (Banach or Fréchet space),


this is equivalent to the requirement that if $x_n \to x$ and $A x_n \to y$ as $n \to \infty$ for a sequence $x_n \in D$, then $x \in D$ and $y = Ax$. $A$ is called closable if a closed extension of $A$ exists, in which case the closure of $A$ is defined as the minimal closed extension of $A$, i.e., the operator with the graph being the closure of the graph of $A$. A subspace $D$ of the domain $D_A$ of a closed operator $A$ is called a core for $A$ if $A$ is the closure of $A$ restricted to $D$.

Let $T_t$ be a strongly continuous semigroup of linear operators on a space $V$. The infinitesimal generator (or just the generator) of $T_t$ is defined as the operator
\[
Af = \lim_{t \to 0} \frac{T_t f - f}{t}
\]
on the linear subspace $D_A \subset V$ (the domain of $A$) where this limit exists. For example, any bounded $A$ on a Banach space $B$ is the generator of the semigroup (4.2). By analogy, one often denotes the semigroup $T_t$ that is generated by an operator $A$ by the exponential function: $T_t = e^{tA}$. Another standard example is the semigroup of shifts (4.3), whose generator is the differentiation operator $D$.

A more complicated example is given by the link between nonlinear dynamics in a Banach (or locally convex) space and the corresponding linear dynamics on the functions, performed by the method of characteristics (see Theorem 2.10.1). Let us prove a simple assertion revealing this link.

Proposition 4.1.1.
(i) Under the assumptions of Theorem 2.2.1, let $F$ be uniformly bounded. Then the operators $T_t S(Y) = S(\mu_t(Y))$ form a strongly continuous group of linear contractions in the space $C_{uc}(B)$ of uniformly continuous bounded functions on $B$ (as a closed subspace of $C(B)$).
(ii) Under the assumptions of Theorem 2.9.1, the space $C^1(B)$ is an invariant subspace for this semigroup, where the generator is given by the formula
\[
LS(Y) = DS(Y)[F(Y)], \tag{4.4}
\]
and $G(t,Y) = T_t S(Y)$ solves the PDE (2.155) for any $S \in C^1(B)$.

Proof. (i) The space $C_{uc}(B)$ is invariant under $T_t$ by (2.11). The strong continuity follows from (2.10) and the assumption that $F$ is bounded. The contraction property is seen from the following estimate:
\[
\|T_t S\| = \sup_Y |T_t S(Y)| = \sup_Y |S(\mu_t(Y))| \le \sup_\mu |S(\mu)| = \|S\|.
\]
(ii) The space $C^1(B)$ is invariant under $T_t$ by (2.146) and the chain rule. The remaining statements are consequences of Theorem 2.10.1. □

Remark 61. This result can be extended to more general situations, when the dynamics $\mu_t(Y)$ does not satisfy the assumptions of Theorem 2.2.1, in particular


when $F(\mu)$ is not Lipschitz-continuous. Whenever the dynamics $t \mapsto \mu_t$ is well defined at least for positive $t$, the operators $T_t S$ form a semigroup (not a group!) of linear contractions in an appropriate space of functions. We shall explain in Chapter 7 how such extensions can be used for deriving general kinetic equations.

Exercise 4.1.2. Let $F \in C_{bLip}(\mathbb{R}^d, \mathbb{R}^d)$ and $T_t S(Y) = S(\mu_t(Y))$, the semigroup in the space $C_{uc}(\mathbb{R}^d)$ given by Proposition 4.1.1. Show that the space $C^1(\mathbb{R}^d)$ is a core (although possibly not invariant) for this semigroup.

The following result shows that the domain of the generator is always rather rich.

Proposition 4.1.2. Let $T_t$ be a strongly continuous semigroup of continuous linear operators on a locally convex space $V$. Then for all $\psi \in V$ and $t > 0$, the vector $\psi(t) = \int_0^t T_u \psi\, du$ belongs to $D_A$ and $A\psi(t) = T_t\psi - \psi$. The vectors of this form, and hence the vectors from the domain $D_A$, are dense in $V$.

Proof. We have
\[
\frac{T_\delta \psi(t) - \psi(t)}{\delta} = \frac{1}{\delta} \int_t^{t+\delta} T_u \psi\, du - \frac{1}{\delta} \int_0^{\delta} T_u \psi\, du,
\]
which implies $A\psi(t) = T_t\psi - \psi$ by the continuity of $T_u\psi$. Moreover, $\psi(t)/t \to \psi$ as $t \to 0$ implies the density. □

If the semigroup $T_t$ on $V$ is equicontinuous (in particular, if $T_t$ is a uniformly bounded family on a Banach space), then the resolvent of $T_t$ (or of $A$) is defined for any $\lambda > 0$ as the operator
\[
R_\lambda f = \int_0^{\infty} e^{-\lambda t} T_t f\, dt. \tag{4.5}
\]
The equicontinuity implies that the resolvent is a well-defined continuous operator for any $\lambda > 0$. In particular, if the family of semi-norms $p_\alpha$ on $V$ is ordered, then for any $\alpha$ there exist $\beta$ and a constant $C_{\alpha\beta}$ such that
\[
\|R_\lambda f\|_\alpha \le \lambda^{-1} C_{\alpha\beta} \|f\|_\beta. \tag{4.6}
\]

The following result collects the basic properties of generators and resolvents. It also provides another proof (for the equicontinuous case) that the generator always has a dense domain.

Theorem 4.1.1. Let $T_t = e^{tA}$ be an equicontinuous and strongly continuous semigroup of linear operators on a locally convex space $V$ with the generator $A$. Then the following assertions hold:
(i) $T_t D_A \subset D_A$ for each $t \ge 0$ and $T_t A f = A T_t f$ for each $t \ge 0$, $f \in D_A$.
(ii) $T_t f = \int_0^t A T_s f\, ds + f$ and $\dot T_t f = A T_t f$ for $f \in D_A$.
(iii) $\lambda R_\lambda f \to f$ as $\lambda \to \infty$.
(iv) $R_\lambda f \in D_A$ for any $f$ and $\lambda > 0$ and $(\lambda - A) R_\lambda f = f$, i.e. $R_\lambda = (\lambda - A)^{-1}$.
(v) If $f \in D_A$, then $R_\lambda A f = A R_\lambda f$; in particular, $R_\lambda (\lambda - A) f = f$, so that $R_\lambda$ is a bijection $V \to D_A$ for any $\lambda > 0$.
(vi) $D_A$ is dense in $V$.
(vii) $A$ is closed on $D_A$.
(viii) If $T_t$ acts in a Banach space and $\|T_t\| \le M$, then $\|R_\lambda\| \le M/\lambda$ for any $\lambda > 0$.
(ix) All $R_\lambda$ commute, and the following resolvent equation holds:
\[
R_\lambda - R_\mu = (\mu - \lambda) R_\lambda R_\mu. \tag{4.7}
\]

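Proof. (i) For $\psi \in D_A$,
\[
A T_t \psi = \lim_{h \to 0} \frac{1}{h} (T_h - I) T_t \psi = T_t \lim_{h \to 0} \frac{1}{h} (T_h - I) \psi = T_t A \psi.
\]
(ii) Follows from (i).
(iii) Follows from the equation
\[
\lambda \int_0^{\infty} e^{-\lambda t} T_t f\, dt = \lambda \int_0^{\infty} e^{-\lambda t} f\, dt + \lambda \int_0^{\epsilon} e^{-\lambda t} (T_t f - f)\, dt + \lambda \int_{\epsilon}^{\infty} e^{-\lambda t} (T_t f - f)\, dt,
\]
observing that the first term on the r.h.s. is $f$, the second (respectively third) term is small for small $\epsilon$ (respectively for any $\epsilon$ and large $\lambda$).
(iv) By definition,
\[
\begin{aligned}
A R_\lambda f &= \lim_{h \to 0} \frac{1}{h} (T_h - 1) R_\lambda f = \lim_{h \to 0} \frac{1}{h} \int_0^{\infty} e^{-\lambda t} (T_{t+h} f - T_t f)\, dt \\
&= \lim_{h \to 0} \Big( \frac{e^{\lambda h} - 1}{h} \int_0^{\infty} e^{-\lambda t} T_t f\, dt - \frac{e^{\lambda h}}{h} \int_0^{h} e^{-\lambda t} T_t f\, dt \Big) = \lambda R_\lambda f - f.
\end{aligned}
\]
(v) Follows from the definitions and (ii).
(vi) Follows from (iv) and (v).
(vii) If $f_n \to f$ as $n \to \infty$ for a sequence $f_n \in D$ and $A f_n \to g$, then
\[
T_t f - f = \lim_{n \to \infty} \int_0^t T_s A f_n\, ds = \int_0^t T_s g\, ds.
\]
Applying the fundamental theorem of calculus shows that $g = Af$, which completes the proof.
(viii) Follows from (4.5).
(ix) Equation (4.7) follows more or less directly from the definitions. It implies that all $R_\lambda$ commute. □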
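Several assertions of Theorem 4.1.1 can be checked numerically in the simplest bounded case. The sketch below is an illustration only: the $2 \times 2$ matrix `A` (with eigenvalues $-1$ and $-2$, so that $e^{tA}$ is uniformly bounded), the closed form used for $e^{tA}$ and the helper names are assumptions chosen for the example, not objects from the text.

```python
import math

# Hypothetical test generator (an assumption): upper triangular, eigenvalues -1, -2.
A = [[-1.0, 1.0], [0.0, -2.0]]
I2 = [[1.0, 0.0], [0.0, 1.0]]


def mat_mul(p, q):
    return [[sum(p[i][k] * q[k][j] for k in range(2)) for j in range(2)]
            for i in range(2)]


def mat_sub(p, q):
    return [[p[i][j] - q[i][j] for j in range(2)] for i in range(2)]


def scale(c, p):
    return [[c * x for x in row] for row in p]


def max_abs(p):
    return max(abs(x) for row in p for x in row)


def resolvent(lam):
    """R_lambda = (lambda - A)^{-1}, computed by the explicit 2x2 inverse."""
    a, b = lam - A[0][0], -A[0][1]
    c, d = -A[1][0], lam - A[1][1]
    det = a * d - b * c
    return [[d / det, -b / det], [-c / det, a / det]]


def semigroup(t):
    """Closed form of exp(tA), valid for this particular triangular A only."""
    e1, e2 = math.exp(-t), math.exp(-2.0 * t)
    return [[e1, e1 - e2], [0.0, e2]]


def laplace_resolvent(lam, t_max=40.0, n=40000):
    """Trapezoidal approximation of the integral (4.5), truncated at t_max."""
    h = t_max / n
    total = [[0.0, 0.0], [0.0, 0.0]]
    for k in range(n + 1):
        t = k * h
        weight = (0.5 if k in (0, n) else 1.0) * h * math.exp(-lam * t)
        s = semigroup(t)
        for i in range(2):
            for j in range(2):
                total[i][j] += weight * s[i][j]
    return total
```

With these helpers one can compare `laplace_resolvent(lam)` with `resolvent(lam)` (assertion (iv)), test the resolvent equation (4.7) on a pair such as $\lambda = 1$, $\mu = 3$, and observe that `scale(lam, resolvent(lam))` approaches the identity for large `lam` (assertion (iii)).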


Exercise 4.1.3. Give a detailed derivation of the resolvent equation (4.7).

Proposition 4.1.3. Let an operator $A$ with domain $D_A$ generate a strongly continuous semigroup of linear continuous operators $T_t$ on a locally convex space $V$. If $D$ is a dense subspace of the domain $D_A$ of $A$ that is invariant under all $T_t$, then $D$ is a core for $A$.

Proof. Let $\bar D$ be the domain of the closure of $A$ restricted to $D$. We have to show that for $\psi \in D_A$ there exists a sequence $\psi_n \in \bar D$, $n \in \mathbb{N}$, such that $\psi_n \to \psi$ and $A\psi_n \to A\psi$. By Proposition 4.1.2, it is enough to show this for vectors $\psi(t) = \int_0^t T_u \psi\, du$. Since $D$ is dense, there exists a sequence $\psi_n \in D$ converging to $\psi$. Therefore, also $A\psi_n(t) \to A\psi(t)$ holds, because $A\psi(t) = T_t\psi - \psi$. The observation that $\psi_n(t) \in \bar D$ by the invariance of $D$ completes the proof. □

Under the assumptions of Theorem 4.1.1, the resolvent $R_0$ for $\lambda = 0$ is in general not well defined. Nevertheless, it is defined in many interesting situations. The operator $R_0$ is then referred to as the potential operator.

Proposition 4.1.4. Under the assumptions of Theorem 4.1.1, let the operator $R_0 f = \int_0^{\infty} T_t f\, dt$ be well defined as a continuous operator in $V$. Then $R_0$ is a bijection $V \to D_A$, and $R_0 A g = A R_0 g = -g$ for any $g \in D_A$.

Proof. The assumption implies that $T_t\psi \to 0$ and $\psi(t) = \int_0^t T_s \psi\, ds \to R_0\psi$ as $t \to \infty$. But $A\psi(t) = T_t\psi - \psi$ (see Proposition 4.1.2), and hence $A\psi(t) \to -\psi$. Consequently, since $A$ is closed, we find $R_0\psi \in D_A$ and $A R_0 \psi = -\psi$ for any $\psi$. On the other hand, if $\psi \in D_A$, then
\[
R_0 A \psi = \lim_{t \to \infty} \int_0^t A T_s \psi\, ds = \lim_{t \to \infty} (T_t\psi - \psi) = -\psi,
\]
so the image of $R_0$ coincides with $D_A$.



Let us now point out the two main relations by which the theory of semigroups enters the theory of differential equations. Proposition 4.1.4 states that the potential operator is the inverse of $A$ up to the minus sign. Therefore, it can be considered the abstract analogue of the fundamental solution from the theory of generalized functions. More precisely, by (1.179), if $A$ is a ΨDO, then the fundamental solution is the integral kernel of the operator $(-R_0)$.

Theorem 4.1.1 (ii) states that $T_t f$ solves the Cauchy problem for the equation $\dot f_t = A f_t$ with the initial condition $f_0 = f$, whenever $f \in D$. The inclusion $f_t \in D$ is an abstract version of what is meant by the notion of classical solutions in classical ODEs and PDEs. The theory of semigroups also fits the natural notion of generalized solutions. Let us say that $f_t$ is a generalized solution to the Cauchy problem for the equation $\dot f_t = A f_t$, if it is a continuous function of $t$, satisfies the initial condition $f_0 = f$, and if a sequence of elements $f^n \in D$ exists such that $f^n \to f$ and the corresponding (classical, i.e., belonging to the domain) solutions $T_t f^n$ converge to $f_t$, as $n \to \infty$. Therefore, the curves $T_t f$ represent generalized solutions for any $f \in V$. Another way of defining generalized solutions is based on duality. Both these notions as well as the related question of well-posedness will be discussed in more detail later (on a more general level of propagators).
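For a bounded generator, the statements of this section can be made concrete with the exponential series (4.2). The following sketch is illustrative only: the rotation generator used in the usage note, the truncation of the series at a fixed number of terms and the helper names are assumptions made for the example.

```python
import math


def mat_mul(p, q):
    n = len(p)
    return [[sum(p[i][k] * q[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]


def mat_vec(p, v):
    return [sum(p[i][k] * v[k] for k in range(len(v))) for i in range(len(p))]


def exp_series(a, t, terms=40):
    """Partial sum of the series (4.2): sum over k < terms of t^k A^k / k!."""
    n = len(a)
    result = [[1.0 if i == j else 0.0 for j in range(n)] for i in range(n)]
    term = [row[:] for row in result]
    for k in range(1, terms):
        # term becomes t^k A^k / k! from t^(k-1) A^(k-1) / (k-1)!
        term = [[t * x / k for x in row] for row in mat_mul(term, a)]
        result = [[result[i][j] + term[i][j] for j in range(n)]
                  for i in range(n)]
    return result


def difference_quotient(a, f, h):
    """(T_h f - f) / h, which should approximate A f for small h."""
    th_f = mat_vec(exp_series(a, h), f)
    return [(x - y) / h for x, y in zip(th_f, f)]
```

For the generator $A = \begin{pmatrix} 0 & 1 \\ -1 & 0 \end{pmatrix}$ the series sums to the rotation matrix with entries $\cos t$, $\sin t$, $-\sin t$, $\cos t$, so one can compare `exp_series(A, 1.2)` with the product of `exp_series(A, 0.7)` and `exp_series(A, 0.5)` (the group equation) and `difference_quotient(A, f, 1e-6)` with `mat_vec(A, f)` (the definition of the generator).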

4.2 Semigroups of operators on Banach spaces

We shall now provide more details for the case of semigroups on Banach spaces $B$. Let us start with the classification of unbounded semigroups.

Proposition 4.2.1. Let $T_t$ be a strongly continuous semigroup in a Banach space $B$. Then there exist constants $M$ and $m$ such that $\|T_t\| \le M e^{mt}$.

Proof. As was already mentioned above, the principle of uniform boundedness implies that for any $S$ the norms $\|T_t\|$, $t \in [0,S]$, are uniformly bounded, say, by $M$. Next, for any $t$, let $n \in \mathbb{N}$ be the integer part of $t/S$, so that $t = nS + \delta$ with $\delta \le S$. Then $\|T_t\| \le M^{n+1} = M e^{n \ln M} \le M e^{mt}$ with $m = (\ln M)/S$. □

The infimum of those $m$ for which $\|T_t\| \le M e^{mt}$ with some $M$ is called the type of growth of the semigroup $T_t$. The following result is straightforward.

Theorem 4.2.1. If $T_t$ is a strongly continuous semigroup of type $m_0$ in a Banach space $B$, Theorem 4.1.1 still holds with the only modification that $R_\lambda$ is defined for $\lambda > m_0$ and (viii) is replaced by the estimate $\|R_\lambda\| \le M(m)/(\lambda - m)$, which holds for $\lambda > m > m_0$ with some constants $M(m)$ that may depend on $m$.

Next, we provide a simple convergence result.

Proposition 4.2.2. Let $T_t$ be a strongly continuous semigroup in a Banach space $B$ generated by the operator $A$ on $D$. Let $T_t^n$ be a sequence of strongly continuous semigroups generated by the operators $A_n$ on the domains $D_n \supset D$ such that $(A_n - A) f \to 0$ as $n \to \infty$ for $f \in D$, and either (i) this convergence is uniform on the sets $\{f : \|Af\| \le K\}$, or (ii) the norms $\|A_n f\|$ are uniformly bounded on the sets $\{f : \|Af\| \le K\}$. Then $T_t^n f \to T_t f$ for any $f \in B$, uniformly for $t$ from compact intervals.

Proof. We have
\[
T_t - T_t^n = (T^n_{t-s} T_s)\big|_{s=0}^{t} = \int_0^t \frac{d}{ds} (T^n_{t-s} T_s)\, ds = \int_0^t T^n_{t-s} (A - A_n) T_s\, ds. \tag{4.8}
\]
If $\|T_t\|$ is bounded by $M$ on $[0,t]$, then $\|A T_s f\| \le M \|Af\|$. In the case (i), $(A - A_n) T_s f \to 0$ uniformly for $s \in [0,t]$ for any $f \in D$. Therefore, $(T_t - T_t^n) f \to 0$ uniformly for $t$ from compact segments and any $f \in D$. Approximating arbitrary $f$ by elements from $D$ yields the claimed statement. In the case (ii), the convergence $(T_t - T_t^n) f \to 0$ for each $t$ follows from the dominated convergence theorem. The uniform convergence is a consequence of the boundedness of the integrand in (4.8). □
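As a sanity check of formula (4.8), one may look at the scalar case. This is an illustrative assumption and not part of the argument above: take $B = \mathbb{R}$, $A = a$ and $A_n = a_n$ with $a \neq a_n$, so that $T_t = e^{at}$ and $T_t^n = e^{a_n t}$.

```latex
\[
\int_0^t T^n_{t-s} (A - A_n) T_s \, ds
= (a - a_n)\, e^{a_n t} \int_0^t e^{(a - a_n) s} \, ds
= e^{a_n t} \bigl( e^{(a - a_n) t} - 1 \bigr)
= e^{a t} - e^{a_n t}
= T_t - T_t^n .
\]
\[
|T_t - T_t^n| \le |a - a_n| \int_0^t e^{a_n (t-s) + a s} \, ds \le t \, |a - a_n| \, e^{t \max(a, a_n)} .
\]
```

The second line shows, in this toy setting, how the convergence $a_n \to a$ of the generators translates into convergence of the semigroups that is uniform for $t$ from compact intervals.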


Remark 62. With a somewhat more elaborate argument, this proposition can be proved without the additional assumptions (i) or (ii), see, e.g., [119].

A notable example for approximating semigroups is given by the Yosida approximation $A_\lambda$, which for any generator $A$ of a semigroup $T_t$ is defined as
\[
A_\lambda = \lambda A R_\lambda = \lambda (\lambda R_\lambda - 1). \tag{4.9}
\]
For any strongly continuous semigroup $T_t$ of type $m_0$, this operator is bounded for $\lambda > m_0$. For $f \in D_A$, we find
\[
(A_\lambda - A) f = \lambda R_\lambda A f - A f,
\]
which according to Theorems 4.1.1 and 4.2.1 tends to zero, as $\lambda \to \infty$. Moreover,
\[
\|A_\lambda f\| \le \frac{M(m) \lambda \|Af\|}{\lambda - m}.
\]

Proposition 4.2.3. Let $T_t$ be a strongly continuous semigroup in a Banach space $B$ generated by the operator $A$ on $D$. Let $A_\lambda$ be its Yosida approximation. Let $T_t^\lambda$ denote the semigroups $\exp\{t A_\lambda\}$. Then the semigroups $T_t^\lambda$ converge to $T_t$ uniformly on compact sets of $t$. Moreover, all semigroups $\exp\{t A_\lambda\}$ and $T_t$ commute, $D$ is invariant under all $T_t^\lambda$, and $A$ and $T_t^\lambda$ commute on $D$. Finally
\[
A_\lambda T_t^\lambda f \to A T_t f, \quad \text{as } \lambda \to \infty,
\]
for any $f \in D$.

Proof. The first statement is a corollary to Proposition 4.2.2. All $A_\lambda$ commute, because the resolvents $R_\lambda$ commute. Therefore, their semigroups commute. Passing to the limit in the relation $\exp\{t A_\lambda\} \exp\{t A_\mu\} = \exp\{t A_\mu\} \exp\{t A_\lambda\}$ yields $\exp\{t A_\lambda\} T_t = T_t \exp\{t A_\lambda\}$. Finally,
\[
A_\lambda T_t^\lambda f - A T_t f = (A_\lambda - A) T_t^\lambda f + A (T_t^\lambda - T_t) f = T_t^\lambda (A_\lambda - A) f + (T_t^\lambda - T_t) A f,
\]
which tends to 0, as $\lambda \to \infty$, due to the convergence of the operators $A_\lambda$ and their semigroups. □

For the analysis of semigroups $T_t$ generated by an operator $A$ on the domain $D$, it is often convenient to measure the size of the elements $f \in D$ by another norm, a canonical choice being
\[
\|f\|_D = \|f\|_B + \|Af\|_B. \tag{4.10}
\]
In particular, this is useful for the analysis of the regularity classes of solutions to the equation $\dot f_t = A f_t$, which are classified according to the number $k \in \mathbb{N}$ such that $A^k f$ is well defined.
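The Yosida approximation (4.9) and the convergence stated in Proposition 4.2.3 can be observed numerically for a matrix generator. This is a minimal sketch under stated assumptions: for a bounded $A$ the approximation is of course not needed to construct $e^{tA}$, and the test matrix, the truncated exponential series and the helper names are illustrative choices.

```python
def mat_mul(p, q):
    return [[sum(p[i][k] * q[k][j] for k in range(2)) for j in range(2)]
            for i in range(2)]


def max_diff(p, q):
    return max(abs(p[i][j] - q[i][j]) for i in range(2) for j in range(2))


def resolvent(a, lam):
    """R_lambda = (lambda - A)^{-1} for a 2x2 matrix A."""
    m00, m01 = lam - a[0][0], -a[0][1]
    m10, m11 = -a[1][0], lam - a[1][1]
    det = m00 * m11 - m01 * m10
    return [[m11 / det, -m01 / det], [-m10 / det, m00 / det]]


def yosida(a, lam):
    """A_lambda = lambda (lambda R_lambda - 1), as in (4.9)."""
    r = resolvent(a, lam)
    return [[lam * (lam * r[i][j] - (1.0 if i == j else 0.0))
             for j in range(2)] for i in range(2)]


def exp_series(a, t, terms=60):
    """Partial sum of the exponential series for exp(tA)."""
    result = [[1.0, 0.0], [0.0, 1.0]]
    term = [[1.0, 0.0], [0.0, 1.0]]
    for k in range(1, terms):
        term = [[t * x / k for x in row] for row in mat_mul(term, a)]
        result = [[result[i][j] + term[i][j] for j in range(2)]
                  for i in range(2)]
    return result


def yosida_error(a, lam, t=1.0):
    """Largest entry of exp(t A_lambda) - exp(t A)."""
    return max_diff(exp_series(yosida(a, lam), t), exp_series(a, t))
```

For a stable test matrix such as `[[-1.0, 1.0], [0.0, -2.0]]`, the value of `yosida_error` should decrease roughly like $1/\lambda$ as `lam` runs through 10, 100, 1000, and `yosida(a, lam)` should agree with $\lambda A R_\lambda$ up to rounding.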


Proposition 4.2.4. Let $T_t$ be a strongly continuous semigroup of linear operators in $B$ generated by an operator $A$ on the domain $D$. Then
(i) $D$ is a Banach space with respect to $\|.\|_D$;
(ii) the operator $A$ is a contraction as an operator $D \to B$;
(iii) the operators $T_t$ form a strongly continuous semigroup of bounded operators in $D$ with the same bounds for the norms as in $B$;
(iv) $T_t$ in $D$ is generated by the same operator $A$ defined on the domain $\tilde D = \{f \in D : Af \in D\}$.

Proof. (i) Completeness with respect to $\|.\|_D$ is a consequence of $A$ being a closed operator on $D$.
(ii) $\|Af\|_B \le \|Af\|_B + \|f\|_B = \|f\|_D$.
(iii) Since $A$ and $T_t$ commute,
\[
\|T_t f\|_D = \|T_t f\|_B + \|T_t A f\|_B \le \|T_t\|_{B \to B} \|f\|_D.
\]
Moreover, $\|T_t f - f\|_D = \|T_t f - f\|_B + \|T_t A f - A f\|_B$, which tends to zero due to the strong continuity of $T_t$ in $B$ whenever $Af \in B$, that is, for all $f \in D$.
(iv) If $\tilde A$ is the generator of $T_t$ in $D$ with the domain $D'$, then $\|(T_t f - f)/t - \tilde A f\|_D \to 0$, as $t \to 0$, for any $f \in D'$. But this implies $\|(T_t f - f)/t - \tilde A f\|_B \to 0$, and hence $\tilde A f = Af$. Moreover, since $\tilde A f = Af$ is the limit in $D$ of $(T_t f - f)/t$, it follows that $Af \in D$. Therefore, $D' \subset \tilde D$. On the other hand, if $f \in \tilde D$, then
\[
\|(T_t A f - A f)/t - A A f\|_B \to 0 \implies \|(T_t f - f)/t - A f\|_D \to 0,
\]
as $t \to 0$, so that $f \in D'$. □

By iteration, one can therefore build a sequence of decreasing dense subspaces of arbitrary regularity $D(k) = \{f \in B : A^k f \in B\}$, which are Banach spaces with the norm
\[
\|f\|_{D(k)} = \|f\|_B + \|Af\|_B + \cdots + \|A^k f\|_B,
\]
so that $T_t$ is a strongly continuous semigroup in each $D(k)$ generated by $A$ with the domain $D(k+1)$. This reflects an important property, namely the preservation of regularity of the solutions to the equation $\dot f_t = A f_t$.

As the later discussions on perturbation theory, on $T$-products and on generator mixing will show, Banach towers of the embedded regularity classes $D(k)$ of $A$ can be very useful for the construction of semigroups. Generally, a $k$-level Banach tower is meant to be a collection of $k$ embedded Banach spaces $D_{k-1} \subset \cdots \subset D_1 \subset B$ with the ordered norms $\|.\|_{D_{k-1}} \ge \cdots \ge \|.\|_{D_1} \ge \|.\|_B$ such that every space in this row is dense in the next space with respect to the topology of the latter.

As a simple example for the application of such Banach towers, let us obtain a convergence result for semigroups that extends Proposition 4.2.2 to the case when the limiting semigroup is not given a priori. (For an exemplary application of this result, see Section 5.2.)


Chapter 4. Linear Evolutionary Equations: Foundations

Proposition 4.2.5.
(i) Let $\tilde D \subset D \subset B$ be a three-level Banach tower. Let $T_t^n$ be a sequence of strongly continuous semigroups in $B$ generated by the operators $A_n$ on the common invariant core $D$ such that $\|T_t^n\|_B \le e^{Kt}$, $\|T_t^n\|_D \le e^{Kt}$ with a constant $K$ for all $n$. Moreover, let $A_n \in L(D, B)$ represent a Cauchy sequence in $L(D, B)$. Then the sequence $T_t^n$ converges to a strongly continuous semigroup $T_t$ in $B$ with a domain containing $D$, where its generator $A$ acts as $Af = \lim_{n \to \infty} A_n f$. Moreover, $\|T_t\|_B \le e^{Kt}$.
(ii) If $\tilde D$ is additionally invariant under $T_t^n$, $\|T_t^n\|_{\tilde D} \le e^{Kt}$, $A_n \in L(\tilde D, D)$ and they represent a Cauchy sequence in $L(\tilde D, D)$ as well, then $D$ is an invariant core for $T_t$, and the $T_t^n$ converge to $T_t$ also in $D$.

Proof. (i) Using the formulae (4.8) for comparing the semigroups $T_t^n$ and $T_t^m$, we find that
$$\|(T_t^m - T_t^n) f\|_B \le t e^{Kt} \|A_m - A_n\|_{D \to B} \|f\|_D. \tag{4.11}$$
Therefore, $T_t^m f$ is a Cauchy sequence in $C([0, T], B)$ for any $T > 0$ and $f \in D$. This implies that the curves $T_t^m f$ converge to a curve $T_t f$. By the density argument, we can derive that the convergence holds for any $f \in B$. Consequently, the operators $T_t$ form a strongly continuous semigroup in $B$ satisfying the required growth rate. In order to see that $D$ is in the domain of $T_t$, we write
$$\frac{T_t f - f}{t} = \frac{T_t^n f - f}{t} + \frac{T_t f - T_t^n f}{t}.$$
By (4.11), the second term tends to zero, as $t \to 0$. Therefore, we can write
$$\frac{T_t f - f}{t} = \frac{1}{t} \int_0^t T_s^n A f \, ds + \frac{1}{t} \int_0^t T_s^n (A_n - A) f \, ds + \frac{T_t f - T_t^n f}{t},$$

where the second and the third term can be made arbitrarily small by choosing $n$ large enough and $t$ small enough. The first term tends to $Af$ as $t \to 0$ for any $n$ (and in fact uniformly in $n$).
(ii) Repeating the above arguments for the pair of embedded spaces $(\tilde D, D)$ shows that the $T_t^n$ converge to $T_t$ also in the topology of $D$. This implies the invariance of $D$ under $T_t$.

Since the canonical norm (4.10) may be rather artificial (after all, it strongly depends on $A$), it might be handy to choose it, and the corresponding Banach tower, in a more universal way. For instance, for semigroups $T_t$ on $C_\infty(\mathbb{R}^d)$, a useful universal Banach tower is formed by the spaces of smooth functions $C_\infty^k(\mathbb{R}^d)$. The following modification of Proposition 4.2.4 addresses this issue.

Proposition 4.2.6. Let $T_t$ be a strongly continuous semigroup of linear operators in $B$ generated by an operator $A$ with an invariant core $D$. Assume that $\|\cdot\|_D \ge \|\cdot\|_B$ is a norm on $D$ with respect to which $D$ is a Banach space, such that $T_t$ is a strongly continuous semigroup there. Then $T_t$ in $D$ is generated by the same operator $A$ defined on the domain $\tilde D = \{f \in D : Af \in D\}$.

Proof. Exactly as above, we can deduce that if $f$ belongs to the domain of the generator $\tilde A$ of $T_t$ in $D$, then $\tilde A f = Af$ (this is where the assumption $\|\cdot\|_D \ge \|\cdot\|_B$ is used) and $Af \in D$. Conversely, assume that $f \in D$ and $Af \in D$. The first inclusion implies that $T_t f - f = \int_0^t A T_s f \, ds$. And the second inclusion, together with the assumed strong continuity of $T_t$ in $D$, implies that the function under the integral is continuous in the norm topology of $D$. Consequently,
$$\frac{T_t f - f}{t} - Af = \frac{1}{t} \int_0^t (T_s Af - Af) \, ds,$$
and the r.h.s. tends to zero, as $t \to 0$, in the topology of $D$.



One can further extend the preservation of regularity to several semigroups with a common invariant domain.

Proposition 4.2.7. Let $T_t^1, \ldots, T_t^k$ be $k$ strongly continuous semigroups of linear operators in $B$ generated by the operators $A_1, \ldots, A_k$, respectively, with a common invariant core $D$.
(i) Then the closure $\bar D$ of $D$ with respect to the norm
$$\|f\|_D = \|f\|_B + \sum_j \|A_j f\|_B$$
is a Banach space, which is a core for each semigroup $T_t^j$.
(ii) If additionally
$$\|A_j T_t^i f\|_B \le \kappa \Big( \|f\|_B + \sum_j \|A_j f\|_B \Big) = \kappa \|f\|_D$$
for all $t$, $i$, $j$, and a constant $\kappa$, then all $T_t^j$ are strongly continuous bounded semigroups in $D$ generated by the same operators $A_j$ but with the domain $\tilde D_j = \{f \in D : A_j f \in D\}$.

Exercise 4.2.1. Prove this proposition.

4.3 Simple diffusions and the Schrödinger equation

Possibly the most fundamental example of an operator semigroup is the semigroup generated by the (customary half of the) Laplacian $\Delta/2$ with
$$\Delta f(x) = \sum_{j=1}^d \frac{\partial^2 f}{\partial x_j^2}, \quad x \in \mathbb{R}^d.$$

Apart from playing a fundamental role in various domains of natural science, it has a large mathematical value, since all basic objects related to this semigroup (e.g., resolvent, generator) can be explicitly calculated. Therefore, one can use it as a basic example for testing general assertions.


The heat conduction semigroup $f_t = T_t f$ solving the Cauchy problem for the simplest heat conduction or diffusion equation
$$\dot f_t = \frac{1}{2} \Delta f_t, \quad f_0 = f, \tag{4.12}$$
has the following closed form:
$$f_t(x) = T_t f(x) = \int G_t(x - y) f(y) \, dy, \tag{4.13}$$
with the function
$$G_t(x) = (2\pi t)^{-d/2} \exp\Big\{ -\frac{x^2}{2t} \Big\}, \quad t > 0. \tag{4.14}$$

This function is referred to as the heat kernel or the Green function of equation (4.12).

Exercise 4.3.1.
(i) $G_t(x)$ represents a so-called $\delta$-sequence as $t \to 0$, meaning that
$$\int G_t(x - y) f(y) \, dy \to f(x), \quad t \to 0, \tag{4.15}$$
for any $f \in C(\mathbb{R}^d)$ and moreover
$$\frac{\partial}{\partial t} G_t(x) = \frac{1}{2} \Delta G_t(x). \tag{4.16}$$
(ii) For any $t > 0$ and any bounded measurable function $f$, $T_t f$ given by (4.13) is an infinitely differentiable function satisfying (4.12). Moreover, $T_t f \to f$ in $C_\infty(\mathbb{R}^d)$ as $t \to 0$ whenever $f \in C_\infty(\mathbb{R}^d)$.
(iii) The operators $T_t$ of (4.13) form a strongly continuous semigroup in $C_\infty(\mathbb{R}^d)$ with $C_\infty^2(\mathbb{R}^d)$ being its invariant core.
(iv) The operators $T_t$ form a semigroup of positivity-preserving contractions in $C(\mathbb{R}^d)$ that also preserve constants.

The semigroup $T_t$ is a basic example for the class of Feller semigroups (see definitions in Section 5.10). These semigroups play a key role in the theory of Markov processes.

It is a bit more tricky to find not only a handy invariant core like $C_\infty^2(\mathbb{R}^d)$, but the full domain of the generator $\Delta/2$ of the heat semigroup $T_t$ of (4.13). To this end, let us limit our attention to the one-dimensional case $d = 1$. The key ingredient for the analysis is the resolvent of $T_t$, which for $d = 1$ is given by the operator
$$R_\lambda f(x) = \frac{1}{\sqrt{2\lambda}} \int_{-\infty}^{\infty} \exp\{ -\sqrt{2\lambda} |y - x| \} f(y) \, dy. \tag{4.17}$$
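As a plausibility check of (4.17), one can use the standard fact that the resolvent is the Laplace transform of the semigroup, so that the kernel of $R_\lambda$ should equal $\int_0^\infty e^{-\lambda t} G_t(x) \, dt$. The following is a minimal numerical sketch for $d = 1$, not part of the original text; the function names, the substitution $t = s^2$, the truncation point and the step count are illustrative assumptions.

```python
import math

def heat_kernel(t, x):
    """One-dimensional heat kernel G_t(x) from (4.14) with d = 1."""
    return math.exp(-x * x / (2.0 * t)) / math.sqrt(2.0 * math.pi * t)

def resolvent_kernel(lam, x):
    """Kernel of R_lambda from (4.17): exp(-sqrt(2 lam) |x|) / sqrt(2 lam)."""
    root = math.sqrt(2.0 * lam)
    return math.exp(-root * abs(x)) / root

def laplace_transform_of_heat_kernel(lam, x, s_max=8.0, n=20000):
    """Midpoint rule for int_0^inf exp(-lam t) G_t(x) dt.

    The substitution t = s**2 removes the t**(-1/2) factor at t = 0;
    the integral is truncated at s = s_max (an assumption that is
    adequate for lam of order one).
    """
    h = s_max / n
    total = 0.0
    for i in range(n):
        s = (i + 0.5) * h
        t = s * s
        total += math.exp(-lam * t) * heat_kernel(t, x) * 2.0 * s
    return total * h

lam, x = 1.0, 0.7
numeric = laplace_transform_of_heat_kernel(lam, x)
exact = resolvent_kernel(lam, x)
print(numeric, exact)  # the two numbers should agree to several digits
```

The agreement of the two values is only a sanity check of the kernel in (4.17); it does not replace the argument asked for in the exercises.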

Exercise 4.3.2. Check that
$$R_\lambda' f(x) = \int_{-\infty}^{\infty} \exp\{ -\sqrt{2\lambda} |y - x| \} f(y) \, \mathrm{sgn}(y - x) \, dy,$$
$$R_\lambda'' f(x) = \sqrt{2\lambda} \int_{-\infty}^{\infty} \exp\{ -\sqrt{2\lambda} |y - x| \} f(y) \, dy - 2 f(x), \tag{4.18}$$
where the primes denote derivatives with respect to $x$. Confirm that the functions $R_\lambda f$, $R_\lambda' f$, $R_\lambda'' f$ belong to $C_\infty(\mathbb{R})$ whenever $f \in C_\infty(\mathbb{R})$, in which case $R_\lambda$ satisfies the resolvent equation
$$\frac{1}{2} \frac{d^2}{dx^2} R_\lambda f = \lambda R_\lambda f - f. \tag{4.19}$$
Hence conclude that equation (4.17) yields in fact the resolvent of the heat semigroup (4.13), and that the space $C_\infty^2(\mathbb{R})$ is the domain of its generator.

Exercise 4.3.3. Check that the heat semigroup (4.13) is also a strongly continuous semigroup in any of the spaces $L_p(\mathbb{R}^d)$ with $p \ge 1$, but that it is not strongly continuous in $C(\mathbb{R}^d)$.

Exercise 4.3.4. The heat semigroup (4.13) extends to the semigroup on $\mathcal{M}(\mathbb{R}^d)$ such that $T_t f \in L_1(\mathbb{R}^d)$ for any $t > 0$ and $f \in \mathcal{M}(\mathbb{R}^d)$. However, in $\mathcal{M}(\mathbb{R}^d)$ this semigroup is not strongly continuous, but only weakly continuous. Check this claim on the example of the Green function $G_t$, which solves (4.12) with the Dirac $\delta$-function $\delta \in \mathcal{M}(\mathbb{R}^d)$ as initial condition.

The heat semigroup (4.13) is an example of a so-called analytic semigroup, meaning that the operators $T_t$ can be extended to complex values of times $t$. In other words, formula (4.13) solving (4.12) extends to the formula
$$f_t(x) = T_t^\sigma f(x) = \int G_{t\sigma}(x - y) f(y) \, dy, \tag{4.20}$$
which solves the complex diffusion equation
$$\dot f_t = \frac{1}{2} \sigma \Delta f_t, \quad f_0 = f, \tag{4.21}$$
with any $\sigma \in \mathbb{C}$ having positive real part. For any such $\sigma$, formula (4.20) still defines a strongly continuous semigroup in the space of complex-valued continuous functions on $\mathbb{R}^d$ vanishing at infinity. Formula (4.20) can be further extended to pure imaginary $\sigma$, in which case equation (4.21) turns into the most basic Schrödinger equation of quantum mechanics. In this case, however, the natural functional space becomes $L_2$ rather than $C_\infty$. Notice for the sake of completeness that, more generally, the evolutionary Schrödinger equation is an equation of the type
$$\dot f_t = -iH f_t, \quad f_0 = f, \tag{4.22}$$


where $H$ is a self-adjoint operator in a Hilbert space. It is shown in functional analysis that the resolving operators $\exp\{-iHt\}$ are well defined and form the group (not only semigroup) of unitary operators in this Hilbert space. Equation (4.21) with $\sigma = i + \epsilon$ and a small real $\epsilon$ is sometimes considered a Schrödinger equation that is regularized by a complexification of the time or the mass. The properties of this regularized equation can be exploited for obtaining useful information about the limiting case $\sigma = i$.

Exercise 4.3.5. Show that, if $\sigma = i\kappa$ with a real $\kappa$, the operators (4.20) form a strongly continuous semigroup in $L_2(\mathbb{R}^d)$ and solve the corresponding Schrödinger equation (4.21). An invariant core for this semigroup is given by the Schwartz space $S(\mathbb{R}^d)$.

A key feature of the heat semigroup (4.13) is its smoothing property.

Proposition 4.3.1. For the Green function $G_t$ from (4.14) and $f \in C(\mathbb{R}^d)$, $j = 1, \ldots, d$, we have
$$\Big\| \frac{\partial G_t(x)}{\partial x_j} \Big\|_{L_1(\mathbb{R}^d)} = \sqrt{\frac{2}{\pi t}}, \tag{4.23}$$
$$\sup_x \Big| \frac{\partial}{\partial x_j} T_t f(x) \Big| \le \sqrt{\frac{2}{\pi t}} \, \|f\|_{C(\mathbb{R}^d)}, \qquad \Big\| \frac{\partial}{\partial x_j} T_t f(x) \Big\|_{L_1(\mathbb{R}^d)} \le \sqrt{\frac{2}{\pi t}} \, \|f\|_{L_1(\mathbb{R}^d)}. \tag{4.24}$$

Proof. The estimates (4.24) are consequences of (4.23). In order to get (4.23), we calculate
$$\frac{\partial G_t(x)}{\partial x_j} = -(2\pi t)^{-d/2} \frac{x_j}{t} \exp\Big\{ -\frac{x^2}{2t} \Big\},$$
which yields
$$\Big\| \frac{\partial G_t(x)}{\partial x_j} \Big\|_{L_1(\mathbb{R}^d)} = \frac{1}{\sqrt{t}} (2\pi)^{-d/2} \int |z_j| \exp\Big\{ -\frac{z^2}{2} \Big\} dz = \frac{2}{\sqrt{t}} (2\pi)^{-d/2} (2\pi)^{(d-1)/2}.$$
This implies (4.23).
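The identity (4.23) is easy to test numerically in the one-dimensional case. The sketch below is an illustration only (the function names, the integration window of twelve standard deviations and the number of midpoints are assumptions, not taken from the text); it compares a midpoint-rule approximation of the $L_1$-norm of $\partial G_t / \partial x$ with $\sqrt{2/(\pi t)}$.

```python
import math

def heat_kernel_derivative(t, x):
    """d/dx of G_t(x) = (2 pi t)^(-1/2) exp(-x^2 / (2 t)) for d = 1."""
    return -(x / t) * math.exp(-x * x / (2.0 * t)) / math.sqrt(2.0 * math.pi * t)

def l1_norm_of_derivative(t, n=20000):
    """Midpoint rule for the L1 norm of dG_t/dx on [-12 sqrt(t), 12 sqrt(t)].

    With an even n, the kink of |dG_t/dx| at x = 0 sits on a cell boundary,
    so the midpoint rule keeps its second-order accuracy.
    """
    half_width = 12.0 * math.sqrt(t)
    h = 2.0 * half_width / n
    total = 0.0
    for i in range(n):
        x = -half_width + (i + 0.5) * h
        total += abs(heat_kernel_derivative(t, x))
    return total * h

for t in (0.1, 1.0, 4.0):
    # left: numerical L1 norm, right: the closed form sqrt(2 / (pi t)) of (4.23)
    print(t, l1_norm_of_derivative(t), math.sqrt(2.0 / (math.pi * t)))
```

The blow-up of the norm like $t^{-1/2}$ as $t \to 0$ is exactly the smoothing rate that enters (4.24).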



The estimates (4.24) extend to the derivatives of all orders. For instance, for any $f \in C^1(\mathbb{R}^d)$, we get
$$\sup_{j,k} \sup_x \Big| \frac{\partial^2}{\partial x_j \partial x_k} T_t f(x) \Big| \le \sqrt{\frac{2}{\pi t}} \, \|f\|_{C^1(\mathbb{R}^d)}. \tag{4.25}$$

Exercise 4.3.6. Prove (4.25).

The next result shows that the smoothing property of $T_t$ works also for integral norms, that is for Sobolev spaces.


Exercise 4.3.7. Following the line of arguments of Proposition 4.3.1 above, show that the heat semigroup $T_t$ takes $L_1(\mathbb{R}^d)$ to $H_1^k(\mathbb{R}^d)$ with any $k > 0$, and that
$$\|T_t f\|_{H_1^k(\mathbb{R}^d)} \le \kappa_k(d) \, t^{-k/2} \|f\|_{L_1(\mathbb{R}^d)}. \tag{4.26}$$

Exercise 4.3.8. Extend (4.24), (4.25) and (4.26) to the semigroup $T_t^\sigma$ of (4.20) for any $\sigma$ with a positive real part.

Another key feature of the heat semigroup (4.13) is that it preserves smoothness and polynomial decay at infinity. Namely, for any $f \in C^k(\mathbb{R}^d)$,
$$\|T_t f\|_{C^k(\mathbb{R}^d)} \le \|f\|_{C^k(\mathbb{R}^d)}. \tag{4.27}$$
On the other hand, if $f$ increases not faster than $|x|^{2n}$ at infinity, $n \in \mathbb{N}$, then
$$\sup_x |(1 + |x|^{2n}) T_t f(x)| \le (1 + c_n t) \sup_x |(1 + |x|^{2n}) f(x)|, \tag{4.28}$$
for $t \in [0, 1]$ and a constant $c_n$. Estimate (4.27) follows from the observation that $T_t$ commutes with the operation of differentiation. In order to prove (4.28), one must first show a rougher estimate:
$$\sup_x |(1 + |x|^\alpha) T_t f(x)| \le \kappa_\alpha \sup_x |(1 + |x|^\alpha) f(x)|, \tag{4.29}$$
which holds for any $\alpha > 0$ and $t \in [0, 1]$ with some constant $\kappa_\alpha$.

Exercise 4.3.9. Derive the inequality (4.29) and give an estimate for the constant $\kappa_\alpha$.

Next, we show that (4.28) is equivalent to the inequality
$$(1 + |x|^{2n}) \int (2\pi t)^{-d/2} \exp\Big\{ -\frac{(x - y)^2}{2t} \Big\} \frac{dy}{1 + |y|^{2n}} \le 1 + c_n t. \tag{4.30}$$
At $t = 0$, the l.h.s. of (4.30) equals one. In order to prove (4.30), one therefore has to show that
$$\sup_{t \in [0,1]} \sup_x \Big| \frac{\partial}{\partial t} \Big[ (1 + |x|^{2n}) \int (2\pi t)^{-d/2} \exp\Big\{ -\frac{(x - y)^2}{2t} \Big\} \frac{dy}{1 + |y|^{2n}} \Big] \Big|$$
$$= \frac{1}{2} \sup_{t \in [0,1]} \sup_x \Big| (1 + |x|^{2n}) \int \Delta \Big[ (2\pi t)^{-d/2} \exp\Big\{ -\frac{(x - y)^2}{2t} \Big\} \Big] \frac{dy}{1 + |y|^{2n}} \Big|$$
$$= \frac{1}{2} \sup_{t \in [0,1]} \sup_x \Big| (1 + |x|^{2n}) \int (2\pi t)^{-d/2} \exp\Big\{ -\frac{(x - y)^2}{2t} \Big\} \Delta \frac{1}{1 + |y|^{2n}} \, dy \Big| < \infty.$$


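But this follows from (4.29), since
$$\Big| \Delta \frac{1}{1 + |y|^{2n}} \Big| = \Big| \Big( \frac{1}{1 + r^{2n}} \Big)'' + \frac{d - 1}{r} \Big( \frac{1}{1 + r^{2n}} \Big)' \Big| = \Big| -\frac{2n r^{2n-2}}{(1 + r^{2n})^2} (d + 2n - 2) + \frac{8 n^2 r^{4n-2}}{(1 + r^{2n})^3} \Big| \le \kappa_n \frac{1}{1 + r^{2n}}$$
with $r = |y|$, which completes the proof of (4.30). Summing up, we proved the following result.

Proposition 4.3.2. The action of the semigroup $T_t$ in the weighted spaces $C_{L_n}(\mathbb{R}^d)$ with the functions $L_n(x) = 1 + |x|^n$, $n \in \mathbb{N}$, satisfies the estimates
$$\|T_t\|_{L(C_{L_n}(\mathbb{R}^d))} \le \exp\{C_n t\} \tag{4.31}$$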

with a constant $C_n$.

Remark 63. By duality, this implies that $T_t$ preserves the sets of measures with bounded moments of any order. The fact that one can reduce evolutions to measures of bounded moments is crucial for nonlinear equations (see, e.g., Theorems 6.4.3 and 6.10.1) because, by Proposition 1.1.1, the sets of measures with a bounded moment are compact in the weak topology.

Both for mathematical developments and physical applications, a highly interesting field is that of heat conduction and diffusion in bounded domains, for which one needs to find appropriate modifications of the heat semigroup $T_t$ that act in spaces $C(\Omega)$ with $\Omega$ a subset of $\mathbb{R}^d$. As the simplest example, let us consider the case when the one-dimensional domain $\Omega$ is the half-line $\mathbb{R}_+$.

Exercise 4.3.10. Check that the operators
$$T_t^{Neu} f(x) = \int_0^\infty (G_t(x - y) + G_t(x + y)) f(y) \, dy \tag{4.32}$$
define a strongly continuous semigroup in $C_\infty([0, \infty))$ with the generator $Lf = f''/2$ defined in this way on the domain
$$D_{Neu} = \{f \in C_\infty^2([0, \infty)) : f'(0) = 0\}.$$
Note that the condition $f'(0) = 0$ is the so-called Neumann boundary condition. Formula (4.32) can be obtained from $T_t$ of (4.13) by reducing it to the space of even functions.

Exercise 4.3.11. Check that the operators
$$T_t^{Dir} f(x) = \int_0^\infty (G_t(x - y) - G_t(x + y)) f(y) \, dy \tag{4.33}$$
define a strongly continuous semigroup in the subspace $C_{kill(0),\infty}([0, \infty))$ of $C_\infty([0, \infty))$, consisting of functions that vanish at zero, with the generator $Lf = f''/2$ defined in this way on the domain
$$D_{Dir} = \{f \in C_\infty^2([0, \infty)) : f(0) = f''(0) = 0\}.$$
The condition $f(0) = 0$ is the so-called Dirichlet boundary condition. Formula (4.33) can be obtained from $T_t$ of (4.13) by reducing it to the space of odd functions.

Exercise 4.3.12. Check that the operators
$$T_t^{stop} f(x) = 2 f(0) \int_x^\infty G_t(y) \, dy + \int_0^\infty (G_t(x - y) - G_t(x + y)) f(y) \, dy \tag{4.34}$$
define a strongly continuous semigroup in $C_\infty([0, \infty))$ with the generator $Lf = f''/2$ defined in this way on the domain
$$D_{stop} = \{f \in C_\infty^2([0, \infty)) : f''(0) = 0\}.$$
This semigroup is an extension of the semigroup $T_t^{Dir}$ to the whole space $C_\infty([0, \infty))$. Moreover, check the intertwining relation
$$(T_t^{stop} f)' = T_t^{Neu}(f'). \tag{4.35}$$
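The three half-line formulas (4.32)–(4.34) can be compared numerically by applying them to the constant function $f \equiv 1$. This function does not vanish at infinity, so the computation is only a heuristic illustration and not part of the exercises. Under that assumption one expects the Neumann and the stopped formula to return 1, and the Dirichlet formula to return $\mathrm{erf}(x/\sqrt{2t})$. The helper names and the truncation of the integrals at $y = 40$ are assumptions made for this sketch.

```python
import math

def G(t, x):
    """One-dimensional heat kernel (4.14)."""
    return math.exp(-x * x / (2.0 * t)) / math.sqrt(2.0 * math.pi * t)

def integrate(func, a, b, n=40000):
    """Composite midpoint rule on [a, b]."""
    h = (b - a) / n
    return h * sum(func(a + (i + 0.5) * h) for i in range(n))

def neumann_of_one(t, x, y_max=40.0):
    """Formula (4.32) applied to f = 1, truncated at y_max."""
    return integrate(lambda y: G(t, x - y) + G(t, x + y), 0.0, y_max)

def dirichlet_of_one(t, x, y_max=40.0):
    """Formula (4.33) applied to f = 1, truncated at y_max."""
    return integrate(lambda y: G(t, x - y) - G(t, x + y), 0.0, y_max)

def stopped_of_one(t, x, y_max=40.0):
    """Formula (4.34) applied to f = 1 (so that f(0) = 1)."""
    tail = integrate(lambda y: G(t, y), x, y_max)
    return 2.0 * tail + dirichlet_of_one(t, x, y_max)

t, x = 0.5, 0.8
print(neumann_of_one(t, x))    # compare with 1
print(dirichlet_of_one(t, x), math.erf(x / math.sqrt(2.0 * t)))
print(stopped_of_one(t, x))    # compare with 1
```

In this sketch the Dirichlet formula loses mass through the boundary point, while the extra term $2 f(0) \int_x^\infty G_t(y) \, dy$ in (4.34) restores exactly the missing amount.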

The theory extends directly to the case when $\Delta$ in (4.12) is substituted by an arbitrary non-degenerate second-order operator with constant coefficients:
$$\dot f_t = \frac{1}{2} (A\nabla, \nabla) f_t + (b, \nabla) f_t, \quad f_0 = f, \tag{4.36}$$
where $b \in \mathbb{R}^d$ and $A$ is a symmetric positive matrix. In this case, the solution is
$$f_t(x) = T_t^{A,b} f(x) = \int G_t^{A,b}(x - y) f(y) \, dy, \tag{4.37}$$
with the heat kernel
$$G_t^{A,b}(x) = (2\pi t)^{-d/2} (\det A)^{-1/2} \exp\Big\{ -\frac{1}{2t} (A^{-1}(x + bt), x + bt) \Big\}. \tag{4.38}$$
Formula (4.37) can either be directly verified or derived via the Fourier transform, see (2.60) and (1.89).

Another example is given by Gaussian diffusions, which are an extension of equation (4.12) in the sense that their heat kernels or Green functions (i.e., the integral kernels of the operators that form the semigroup) can be written explicitly as the exponential of a quadratic form. Namely, a Gaussian diffusion operator is a second-order differential operator of the form
$$L = \Big( Bx, \frac{\partial}{\partial x} \Big) + \frac{1}{2} \mathrm{tr} \Big( A \frac{\partial^2}{\partial x^2} \Big), \tag{4.39}$$
where $x \in \mathbb{R}^d$, $B$ and $A$ are $d \times d$-matrices, and the matrix $A$ is symmetric and non-negative. The corresponding parabolic equation $\partial f_t / \partial t = L f_t$ can be written more explicitly as
$$\frac{\partial f_t}{\partial t} = \sum_{i,j} b_{ij} x_i \frac{\partial f_t}{\partial x_j} + \frac{1}{2} \sum_{i,j} A_{ij} \frac{\partial^2 f_t}{\partial x_i \partial x_j}. \tag{4.40}$$

If $A$ is the unit matrix, the second term turns into the Laplacian $\Delta$, and the resulting equation is usually referred to as the Ornstein–Uhlenbeck diffusion equation.

Remark 64. In Section 4.12, we shall look at infinite-dimensional generalizations of Gaussian or Ornstein–Uhlenbeck diffusions, by using an approach that is different from the one employed here. Namely, instead of building a Green function (which is not a straightforward object in the infinite-dimensional case), we will consider the evolution of Gaussian packets, which naturally leads to the so-called Riccati equations.

If the matrix
$$E = E(t) = \int_0^t e^{B\tau} A e^{B^T \tau} \, d\tau \tag{4.41}$$
is non-singular, where $B^T$ is the transpose of $B$, the Gaussian diffusion semigroup $f_t = T_t f$ solving the Cauchy problem for equation (4.40) has the following closed form:
$$f_t(x) = T_t f(x) = \int G_t^{A,B}(x - y) f(y) \, dy, \tag{4.42}$$
with
$$G_t^{A,B}(x) = (2\pi)^{-d/2} (\det E(t))^{-1/2} \exp\Big\{ -\frac{1}{2} \big( E^{-1} (x - e^{Bt} x_0), x - e^{Bt} x_0 \big) \Big\} \tag{4.43}$$
the heat kernel (or Green function) of equation (4.40).

Exercise 4.3.13. (i) Check that (4.42) yields a solution to equation (4.40). (ii) Derive (4.43) via the Fourier transform.

It is possible to fully classify operators of the type (4.39) having a nonsingular matrix $E(t)$, as well as the possible small-time asymptotics of $E(t)$. Namely, it turns out (see, e.g., Chapter 1 of [136] for the proofs) that if $E(t)$ is non-singular, then coordinates exist where $E(t)$ is block-diagonal with blocks of


dimension $p$ that have the form $\Lambda_p(t)(1 + O(t))$, where the main term $\Lambda_p(t)$ has the entries
$$\Lambda_p(t)_{ij} = \frac{1}{(p - i)! (p - j)!} \lambda_p(t)_{ij} = \frac{1}{(p - i)! (p - j)!} \, \frac{t^{2p + 1 - (i + j)}}{2p + 1 - (i + j)}, \quad i, j = 1, \ldots, p, \tag{4.44}$$
and the determinant (entering formula (4.43))
$$\det \Lambda_p(t) = t^{p^2} \frac{2! \cdots (p - 1)!}{p! (p + 1)! \cdots (2p - 1)!}.$$
In order to better understand this structure, let us write down the blocks $\lambda_p$ and the inverses of $\Lambda_p$ for $p = 1, 2, 3$:
$$\lambda_1(t) = t, \quad \lambda_2(t) = \begin{pmatrix} t^3/3 & t^2/2 \\ t^2/2 & t \end{pmatrix}, \quad \lambda_3(t) = \begin{pmatrix} t^5/5 & t^4/4 & t^3/3 \\ t^4/4 & t^3/3 & t^2/2 \\ t^3/3 & t^2/2 & t \end{pmatrix}, \tag{4.45}$$
$$(\Lambda_1(t))^{-1} = \frac{1}{t}, \quad (\Lambda_2(t))^{-1} = \begin{pmatrix} 12/t^3 & -6/t^2 \\ -6/t^2 & 4/t \end{pmatrix}, \quad (\Lambda_3(t))^{-1} = \begin{pmatrix} 6!/t^5 & -6!/(2t^4) & 60/t^3 \\ -6!/(2t^4) & 192/t^3 & -36/t^2 \\ 60/t^3 & -36/t^2 & 9/t \end{pmatrix}. \tag{4.46}$$
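The determinant formula and the inverse of $\Lambda_3(t)$ can be verified in exact rational arithmetic. The following sketch is an illustration only; the function names are assumptions made here, and the entries of the inverse are copied from (4.46) with $6! = 720$.

```python
from fractions import Fraction
from math import factorial

def Lambda(p, t):
    """Matrix with the entries (4.44) for an exact rational time t."""
    t = Fraction(t)
    return [[t ** (2 * p + 1 - (i + j))
             / (factorial(p - i) * factorial(p - j) * (2 * p + 1 - (i + j)))
             for j in range(1, p + 1)] for i in range(1, p + 1)]

def det(m):
    """Determinant by Laplace expansion along the first row (small matrices only)."""
    if len(m) == 1:
        return m[0][0]
    total = Fraction(0)
    for j, entry in enumerate(m[0]):
        minor = [row[:j] + row[j + 1:] for row in m[1:]]
        total += (-1) ** j * entry * det(minor)
    return total

def det_formula(p, t):
    """t^(p^2) * 2! ... (p-1)! / (p! (p+1)! ... (2p-1)!)."""
    num = 1
    for k in range(2, p):
        num *= factorial(k)
    den = 1
    for k in range(p, 2 * p):
        den *= factorial(k)
    return Fraction(t) ** (p * p) * Fraction(num, den)

def matmul(a, b):
    """Product of two square matrices given as lists of rows."""
    n = len(a)
    return [[sum(a[i][k] * b[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]

def inverse_Lambda3(t):
    """The matrix (Lambda_3(t))^{-1} as stated in (4.46)."""
    t = Fraction(t)
    return [[720 / t**5, -360 / t**4, 60 / t**3],
            [-360 / t**4, 192 / t**3, -36 / t**2],
            [60 / t**3, -36 / t**2, 9 / t]]

t = Fraction(3, 2)
for p in (1, 2, 3, 4):
    print(p, det(Lambda(p, t)) == det_formula(p, t))
print(matmul(Lambda(3, t), inverse_Lambda3(t)))  # should be the identity matrix
```

Using `Fraction` avoids any rounding, which matters here because the blocks become badly conditioned for small $t$.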

Therefore, the Gaussian diffusions with non-singular $E(t)$ (and thus with a smooth heat kernel) are fully classified by the numbers of blocks of order $p$ (any collection can be realized). This can be nicely encoded by the so-called Young schemes or Young diagrams, which are non-decreasing finite sequences of natural numbers. Also, the small-time asymptotics of their heat kernels are given in closed form. The situation with the blocks $\Lambda_1$ only corresponds to the usual non-degenerate diffusions without drift (having heat kernels (4.38) with $b = 0$). Diffusion equations that arise from this scheme with only one block $\Lambda_2$ are usually referred to as Kolmogorov's diffusion. One can also classify diffusions with variable coefficients, whose small-time asymptotics is given by the heat kernels of the above Gaussian diffusions. The symbols $H(x, p)$ of these diffusion operators are the regular Hamiltonians from formula (2.103) in connection with boundary-value problems for Hamilton systems.

As a last example which is important to many applications (see e.g. Theorem 6.9.1), let us present without a proof the fundamental result on the degenerate diffusions that can be obtained by the method of stochastic analysis. (Proofs of various versions can be found, e.g., in [87, 147] or [174].) The result deals with situations when the matrix of diffusion coefficients can be written in a special product form:
$$A(x) = \sigma(x) \sigma^T(x), \tag{4.47}$$
with some square matrix $\sigma$ and $\sigma^T$ its transpose.

Theorem 4.3.1.
(i) In the diffusion equation of the form
$$\frac{\partial u}{\partial t} = Lu = \frac{1}{2} (A(x)\nabla, \nabla) u + (b(x), \nabla) u(x), \quad x \in \mathbb{R}^d, \tag{4.48}$$
let $A$ be given by (4.47), and let both $\sigma$ and $b$ be Lipschitz-continuous in $x$. Then the operator $L$ in (4.48) generates a strongly continuous semigroup of contractions $T_t$ in $C_\infty(\mathbb{R}^d)$, whose domain contains the space of twice continuously differentiable functions with a compact support. Moreover, $T_t$ extends to the strongly continuous semigroup $T_t$ in the space of weighted functions $C_{L,\infty}(\mathbb{R}^d)$ (as defined in (1.2)) with $L(x) = 1 + x^2$, so that
$$\|T_t\|_{C_{L,\infty}(\mathbb{R}^d)} \le e^{Kt}, \tag{4.49}$$
with a constant $K$ depending on $d$ and the Lipschitz constants of $\sigma$, $b$ in $C^k(\mathbb{R}^d)$.
(ii) If $\nabla\sigma$, $\nabla b$ are well defined and belong to $C^{k-1}(\mathbb{R}^d)$ with $k \in \mathbb{N}$, then the space $C_\infty^k(\mathbb{R}^d)$ is invariant under the action of the semigroup $T_t$. Moreover, the $T_t$ are bounded operators with respect to the usual norm of these spaces, so that
$$\|T_t\|_{C^k(\mathbb{R}^d)} \le e^{K_k t}, \tag{4.50}$$
with a constant $K_k$ depending on $d$, $k$ and the norms of $\sigma$, $b$ in $C^k(\mathbb{R}^d)$. In particular, if $k \ge 2$, the space $C_\infty^2(\mathbb{R}^d)$ is an invariant core for the semigroup $T_t$ in $C_\infty(\mathbb{R}^d)$, and the space of functions $D$ consisting of twice continuously differentiable functions $f$ such that their second-order derivatives belong to $C_\infty(\mathbb{R}^d)$ is an invariant core for the semigroup $T_t$ in $C_{L,\infty}(\mathbb{R}^d)$. Moreover, if we equip the space $D$ with a Banach space structure via the norm $\|f\|_D = \|f\|_{C_L(\mathbb{R}^d)} + \|\nabla^2 f\|_{C(\mathbb{R}^d)}$, then $\|T_t\|_D \le e^{K_D t}$ with a constant $K_D$.


(iii) Under the assumptions of (i), $T_t$ respects the rates of polynomial growth or decay at infinity in the following sense: for any real $\alpha$, the spaces $C_{L^\alpha}(\mathbb{R}^d)$ and $C_{L^\alpha,\infty}(\mathbb{R}^d)$ are invariant under $T_t$, and
$$\|T_t\|_{C_{L^\alpha}(\mathbb{R}^d)} \le e^{\kappa_\alpha t} \tag{4.51}$$
with constants $\kappa_\alpha$.

Remark 65. (i) Notice that the conditions of the theorem permit a linear growth of $\sigma(x)$ and $b(x)$, and therefore a quadratic growth of $A(x) = \sigma(x)\sigma^T(x)$. (ii) If $A$ is a non-degenerate (strictly elliptic) matrix, then the analytic method of frozen coefficients can be used for obtaining this result. This will be demonstrated in Chapter 5. (iii) Concerning the representation (4.47), it can be shown (see, e.g., Chapter 3 of [87]) that if $A(x) \in C^2(\mathbb{R}^d)$ and if $A(x)$ is a non-negative symmetric matrix, then there exists a symmetric matrix $\sigma(x)$ such that $A(x) = \sigma^2(x)$ and $\sigma(x)$ is Lipschitz-continuous in $x$ with a Lipschitz constant $L \le C(d) \|A\|_{C^2(\mathbb{R}^d)}$ with a constant $C(d)$.

4.4 Evolutions generated by powers of the Laplacian

Since the Fourier transform takes the differentiation to multiplication, it takes the negation of the Laplacian $-\Delta$ to the multiplication by $|p|^2 = p^2$. This fact is a motivation for calling $-\Delta$ the magnitude of $\Delta$, $|\Delta| = -\Delta$ (which actually fits the general definition of $|A|$ in the operator calculus). Consequently, one can naturally define the operator $|\Delta|^{\alpha/2}$ for any $\alpha > 0$ as the operator that becomes the operator of multiplication by $|p|^\alpha$ under Fourier transformation. Of course, for an integer $\beta$, $|\Delta|^\beta = (-\Delta)^\beta$ defined in such a way coincides with the corresponding power of $-\Delta$ calculated in the usual way. The semigroups generated by powers of the Laplacian represent another fundamental example of semigroups. These semigroups solve the equation
$$\dot f_t = -\sigma |\Delta|^{\alpha/2} f_t, \quad f_0 = f, \tag{4.52}$$

with $\alpha > 0$ and $\sigma > 0$. In what follows, we shall stick to real $\sigma$, since in the case of complex diffusion, the theory remains more or less the same for a complex $\sigma$ with a positive real part. We will now obtain some basic properties of the problem (4.52), whereby we use no other tools than integration by parts. In the next section, we will refine these results in a more general setting of homogeneous symbols, using more sophisticated methods.

In order to solve (4.52), one has to apply the Fourier transform to it (see (1.87), if needed). According to the above definition of $|\Delta|^\beta$, this yields the following equation for $\hat f_t = F(f_t)$:
$$\frac{\partial}{\partial t} \hat f_t = -\sigma |p|^\alpha \hat f_t, \quad \hat f_0 = \hat f. \tag{4.53}$$
This linear problem has the solution
$$\hat f_t(p) = \exp\{-t\sigma |p|^\alpha\} \hat f(p). \tag{4.54}$$
Consequently, taking the inverse Fourier transform and using (1.93) yields
$$f_t(x) = T_t^{\alpha,\sigma} f(x) = \int G_{t\sigma}^\alpha(x - y) f(y) \, dy, \tag{4.55}$$
with
$$G_\sigma^\alpha(x) = (2\pi)^{-d} \int \exp\{-\sigma |p|^\alpha + ipx\} \, dp. \tag{4.56}$$
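For $d = 1$ the integral (4.56) is easy to evaluate numerically, and two special cases have well-known closed forms: $\alpha = 2$ gives a Gaussian density and $\alpha = 1$ gives a Cauchy density. The sketch below is an illustration under stated assumptions (the function names, the truncation of the $p$-integral where $e^{-\sigma p^\alpha} = e^{-40}$, and the midpoint rule are choices made here, not taken from the text). It also prints a few values for $\alpha = 4$, where the kernel is expected to change sign, in line with the non-positivity for $\alpha > 2$ discussed in this section.

```python
import math

def green_function(alpha, sigma, x, n=20000):
    """Approximate G^alpha_sigma(x) of (4.56) for d = 1.

    By symmetry in p, the integral equals
        (1 / pi) * int_0^inf exp(-sigma p^alpha) cos(p x) dp,
    which is truncated at p_max and evaluated with the midpoint rule.
    """
    p_max = (40.0 / sigma) ** (1.0 / alpha)
    h = p_max / n
    total = 0.0
    for i in range(n):
        p = (i + 0.5) * h
        total += math.exp(-sigma * p ** alpha) * math.cos(p * x)
    return total * h / math.pi

def gaussian_case(sigma, x):
    """Closed form for alpha = 2: (4 pi sigma)^(-1/2) exp(-x^2 / (4 sigma))."""
    return math.exp(-x * x / (4.0 * sigma)) / math.sqrt(4.0 * math.pi * sigma)

def cauchy_case(sigma, x):
    """Closed form for alpha = 1: sigma / (pi (sigma^2 + x^2))."""
    return sigma / (math.pi * (sigma * sigma + x * x))

x = 1.3
print(green_function(2.0, 0.5, x), gaussian_case(0.5, x))
print(green_function(1.0, 0.5, x), cauchy_case(0.5, x))

# For alpha = 4 the kernel is not a probability density; look for negative values.
for y in (3.5, 4.0, 4.5, 5.0, 5.5):
    print(y, green_function(4.0, 1.0, y))
```

With $\alpha = 2$ and $\sigma = 1/2$ the Gaussian case reduces to the heat kernel (4.14) at $t = 1$, which ties this computation back to Section 4.3.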

In order to assess the properties of the semigroup $T_t^{\alpha,\sigma}$, we have to analyse the properties of the functions $G_\sigma^\alpha$, given by the integral (4.56). By the Riemann–Lebesgue Lemma, we find $G_\sigma^\alpha \in C_\infty(\mathbb{R}^d)$, and all derivatives of $G_\sigma^\alpha$ with respect to $x$ also belong to $C_\infty(\mathbb{R}^d)$ for all positive $\alpha$, $\sigma$. Equation (4.52) is a special case of equation (2.56) discussed in Theorems 2.4.2 and 2.4.4. The conditions of both of these theorems are satisfied for (4.52) with $\delta = N = \alpha$ and $l = d + k$, where $k$ is the biggest integer that is strictly smaller than $\alpha$. In particular, $l \ge d + 1$ for $\alpha > 1$. Therefore, if $\sigma > 0$ and $\alpha > 1$, then $G_\sigma^\alpha \in L_1(\mathbb{R}^d)$ and
$$\frac{\partial^m}{\partial x_{j_1} \cdots \partial x_{j_m}} G_\sigma^\alpha(x) \in L_1(\mathbb{R}^d)$$
for any indices $j_1, \ldots, j_m$. However, much more can be said for the semigroup $T_t^{\alpha,\sigma}$ resolving the problem (4.52).

Theorem 4.4.1. Let $\sigma > 0$ and $\alpha > 1$. Then the semigroup $T_t^{\alpha,\sigma}$ given by (4.56) is a uniformly bounded and strongly continuous semigroup in $C_\infty(\mathbb{R}^d)$, with all spaces $C_\infty^k(\mathbb{R}^d)$ being invariant. In particular, the subspaces $C_\infty^k(\mathbb{R}^d)$ for all $k \ge \alpha$ represent invariant cores. Moreover, the semigroup is smoothing in the sense that $T_t^{\alpha,\sigma} f \in C_\infty^k(\mathbb{R}^d)$ for any $k > 0$, $t > 0$ and $f \in C(\mathbb{R}^d)$. Finally, for all non-negative integers $k$, $l$,
$$\Big\| \frac{\partial^k}{\partial x_{j_1} \cdots \partial x_{j_k}} G_{t\sigma}^\alpha(\cdot) \Big\|_{L_1(\mathbb{R}^d)} \le c_k t^{-k/\alpha}, \tag{4.57}$$
$$\|T_t^{\alpha,\sigma} f\|_{C^{l+k}(\mathbb{R}^d)} \le c_{k,l} \, t^{-k/\alpha} \|f\|_{C^l(\mathbb{R}^d)}, \tag{4.58}$$
with some constants $c_k$ and $c_{k,l}$ depending on $\alpha$, $\sigma$ and $d$.


Proof. Since
$$G_{t\sigma}^\alpha(x) = (2\pi)^{-d} \int \exp\{-t\sigma |p|^\alpha + ipx\} \, dp = (2\pi)^{-d} \int \exp\{-\sigma |q|^\alpha + iqx t^{-1/\alpha}\} t^{-d/\alpha} \, dq,$$
it follows that (with $y = x t^{-1/\alpha}$)
$$\int |G_{t\sigma}^\alpha(x)| \, dx = (2\pi)^{-d} \int \Big| \int \exp\{-\sigma |q|^\alpha + iqy\} \, dq \Big| \, dy,$$
which has a bound that is independent of $t$. This implies that the semigroup $T_t^{\alpha,\sigma}$ is uniformly bounded in $C_\infty(\mathbb{R}^d)$. Next, we find
$$\frac{\partial}{\partial x_j} G_{t\sigma}^\alpha(x) = (2\pi)^{-d} t^{-1/\alpha} \int i q_j \exp\{-\sigma |q|^\alpha + iqx t^{-1/\alpha}\} t^{-d/\alpha} \, dq,$$
so that
$$\int \Big| \frac{\partial}{\partial x_j} G_{t\sigma}^\alpha(x) \Big| \, dx = t^{-1/\alpha} \int \Big| (2\pi)^{-d} \int i q_j \exp\{-\sigma |q|^\alpha + iqy\} \, dq \Big| \, dy,$$
which is bounded by a constant times $t^{-1/\alpha}$. (4.57) is similarly obtained. This implies (4.58) for $l = 0$. The case of arbitrary $l$ can be reduced to the case $l = 0$, because the gradient $\nabla f$ evolves according to the same equation as $f$ itself.

In order to show strong continuity, we have to show that $f_t = T_t^{\alpha,\sigma} f \to f$ in $C_\infty(\mathbb{R}^d)$. By the Riemann–Lebesgue lemma, this follows from the fact that $\hat f_t \to \hat f$ in $L_1(\mathbb{R}^d)$, which again follows straightforwardly from (4.54).

Finally, for showing that the $C_\infty^k(\mathbb{R}^d)$ represent invariant cores for all $k \ge \alpha$, it remains to show that these spaces belong to the domain $D(A)$ of the generator $A$ of the semigroup $T_t^{\alpha,\sigma}$. For this, we observe that $D(A)$ contains the subspace of $C_\infty(\mathbb{R}^d)$ consisting of functions that are representable as Fourier transforms of functions $\phi \in L_1(\mathbb{R}^d)$ such that $|p|^\alpha \phi(p) \in L_1(\mathbb{R}^d)$, because
$$(\exp\{-t\sigma |p|^\alpha\} - 1) \phi(p) / t \to -\sigma |p|^\alpha \phi(p), \quad t \to 0,$$
both pointwise and in $L_1(\mathbb{R}^d)$. In particular, $S(\mathbb{R}^d) \subset D(A)$. In order to conclude that $C_\infty^k(\mathbb{R}^d) \subset D(A)$ for all $k \ge \alpha$, it remains to note that $A$ is closed (by Theorem 4.1.1(vii)) and that $C_\infty^k(\mathbb{R}^d)$ belongs to the closure of $A$ defined on $S(\mathbb{R}^d)$ for all $k \ge \alpha$ (due to the formulae (1.145) and (1.143)).

Remark 66. Spaces of smooth functions of fractional order (i.e., subspaces of $C_\infty^k(\mathbb{R}^d)$ that consist of functions with Hölder-continuous derivatives) yield cores that represent the structure of the semigroups $T_t^{\alpha,\sigma}$ in a better way. However, we shall stick to $C^k(\mathbb{R}^d)$ for the sake of simplicity.


Proposition 4.4.1. Under the assumptions of Theorem 4.4.1, the following holds:
(i) The semigroups $T_t^{\alpha,\sigma}$ also represent strongly continuous bounded semigroups in each space $C_\infty^k(\mathbb{R}^d)$.
(ii) The semigroups $T_t^{\alpha,\sigma}$ extend to $C(\mathbb{R}^d)$, so that $T_t f(x) \to f(x)$, as $t \to 0$, for any $f \in C(\mathbb{R}^d)$ and each $x$ (uniformly on compact subsets of $x$, but not necessarily uniformly in all $x$).
(iii) $\int G_{\sigma t}^\alpha(x) \, dx = 1$ for all $t$, $\sigma$.

Proof. Statement (i) is straightforward. In order to prove (ii), one must approximate $f \in C(\mathbb{R}^d)$ by functions $f_n \in C_\infty(\mathbb{R}^d)$ that converge uniformly on any compact subset. Statement (iii) follows from (ii) by choosing $f = 1$ and taking into account that, by homogeneity, $\int G_{\sigma t}^\alpha(x) \, dx$ does not depend on time.

In particular, it follows that if $G$ is not positive everywhere (which is in fact the case for $\alpha > 2$), then $\|T_t\| = \int |G_{\sigma t}^\alpha(x)| \, dx > 1$. Therefore, $\|T_t\|$ does not tend to 1, as $t \to 0$, although $T_t$ strongly converges to the identity operator.

Remark 67. $G_{\sigma t}^\alpha(x)$ is positive for $\alpha \le 2$. This follows from the conditional positivity of the generator, see Section 5.10.

For the analysis of duality (which is needed, e.g., when dealing with variable coefficients), one has to know how the evolution $T_t$ acts on the dual spaces to $C^k(\mathbb{R}^d)$, i.e., on the spaces of integrable functions.

Proposition 4.4.2. Under the assumptions of Theorem 4.4.1, the following holds:
(i) The semigroup $T_t^{\alpha,\sigma}$ is strongly continuous in $L_1(\mathbb{R}^d)$, and each Sobolev space $H_1^k(\mathbb{R}^d)$, $k \in \mathbb{N}$, is invariant.
(ii) For $t > 0$, $T_t^{\alpha,\sigma}$ takes $L_1(\mathbb{R}^d)$ to $H_1^k(\mathbb{R}^d)$, and the equation $\dot f_t = -\sigma |\Delta|^{\alpha/2} f_t$ holds in the norm of $L_1(\mathbb{R}^d)$.
(iii) The semigroups $T_t^{\alpha,\sigma}$ extend by weak continuity from $L_1(\mathbb{R}^d)$ to $\mathcal{M}(\mathbb{R}^d)$, so that $T_t^{\alpha,\sigma} \mu \in C(\mathbb{R}^d) \cap L_1(\mathbb{R}^d)$ (more precisely, the measure $T_t^{\alpha,\sigma} \mu$ has a density that belongs to this space), and $T_t^{\alpha,\sigma} \mu \to \mu$ weakly (but not necessarily strongly), as $t \to 0$, for any $\mu \in \mathcal{M}(\mathbb{R}^d)$.

Proof. (i) and (ii) follow directly from the properties of $G_{t\sigma}^\alpha$.
(iii) Again by the properties of $G_{t\sigma}^\alpha$, $T_t^{\alpha,\sigma} \mu$ is a well-defined continuous and integrable function (even infinitely differentiable) for any $\mu \in \mathcal{M}(\mathbb{R}^d)$. Since $\int G_{t\sigma}^\alpha(x - y) \phi(x) \, dx \to \phi(y)$ uniformly due to the strong continuity of $T_t^{\alpha,\sigma}$, it follows that
$$\int \int G_{t\sigma}^\alpha(x - y) \phi(x) \, dx \, \mu(dy) \to \int \phi(y) \, \mu(dy),$$
as $t \to 0$, for any $\mu \in \mathcal{M}(\mathbb{R}^d)$.



In order to assess the properties of the solutions to (4.52), let us obtain some pointwise estimates for the Green function Gα σ (x), uniform with respect to its two key variables σ and x.

Proposition 4.4.3. There exists a constant $C = C(\alpha)$ such that
$$|G_\sigma^\alpha(x)| \le C \min\Big( \sigma^{-d/\alpha}, \frac{1}{|x|^d}, \frac{\sigma^{k/\alpha}}{|x|^{d+k}} \Big), \tag{4.59}$$
where $k$ is the biggest integer being strictly smaller than $\alpha$.

Proof. Multiplying $G_\sigma^\alpha(x)$ by $i^m x_{i_1} \cdots x_{i_m}$ is equivalent to differentiating its Fourier transform $\exp\{-\sigma |p|^\alpha\}$ with respect to $p_{i_1}, \ldots, p_{i_m}$. Each consecutive differentiation yields a multiplier of order either $\sigma |p|^{\alpha - 1}$ or $|p|^{-1}$. The resulting expression for $|x_{i_1} \cdots x_{i_m} G_\sigma^\alpha(x)|$ can then be estimated using $|e^{ipx}| \le 1$ and changing the integration variable $p$ to $q$ via the equation $\sigma |p|^\alpha = |q|^\alpha$ with $dp = \sigma^{-d/\alpha} dq$, which turns these multipliers to $\sigma |p|^{\alpha - 1} = |q|^{\alpha - 1} \sigma^{1/\alpha}$ and $|p|^{-1} = |q|^{-1} \sigma^{1/\alpha}$. Therefore, the total dependence of the estimate on $\sigma$ is reduced to $\sigma^{m/\alpha}$, which implies that
$$|G_\sigma^\alpha(x)| \le C \frac{\sigma^{-d/\alpha} \sigma^{m/\alpha}}{|x|^m} \tag{4.60}$$
for any $m = 0, 1, \ldots, k$. The estimates (4.59) are obtained by choosing $m = 0, d, d + k$.

As we shall show in the next section by a more refined method, the estimate (4.59) can be optimized to $|G_\sigma^\alpha(x)| \le C\sigma / |x|^{d + \alpha}$. This implies, in particular, that Theorem 4.4.1 can be extended to all $\alpha > 0$.

4.5 Evolutions generated by ΨDOs with homogeneous symbols and their mixtures

The case $\alpha \in (0, 1]$ that had been excluded above will now be dealt with in a much more general setting of problems (2.56) with homogeneous symbols $\psi$ of a positive order $\beta$. For the sake of simplicity, we shall work with time-homogeneous equations. Therefore, we are looking at the Cauchy problem
$$\dot f_t = -\psi(-i\nabla) f_t, \quad f|_{t=0} = f_0, \tag{4.61}$$
where
$$\psi(p) = |p|^\beta \omega(p/|p|), \tag{4.62}$$
where $\omega = \omega_r + i\omega_i$ is a continuous complex-valued function on the sphere $S^{d-1}$ with positive real part. The corresponding semigroup of operators resolving (4.61) acts as
$$T_t f(x) = \int G_t^\psi(x - y) f(y) \, dy, \tag{4.63}$$
with the Green function (2.62) of the form
$$G_t^\psi(x) = \frac{1}{(2\pi)^d} \int e^{ipx} \exp\{-t |p|^\beta \omega(p/|p|)\} \, dp. \tag{4.64}$$


Obviously, this function is real, so that $T_t$ preserves the reality of functions, if and only if the condition
$$\bar\omega(p/|p|)=\omega(-p/|p|) \tag{4.65}$$
holds for all $p$. An important example of an operator $\psi(-i\nabla)$ with a homogeneous symbol (4.62) is given by the mixed fractional symmetric derivatives (1.141) with symbols
$$\psi(p)=\int_{S^{d-1}}|(p,s)|^{\beta}\mu(ds),$$
and
$$\omega(p/|p|)=\int_{S^{d-1}}|(p/|p|,s)|^{\beta}\mu(ds). \tag{4.66}$$

This $\omega$ is real and non-negative, and the condition of its strict positivity is equivalent to the requirement that the support of the measure $\mu$ on $S^{d-1}$ is not contained in any hyperplane of $\mathbb{R}^d$.

Exercise 4.5.1. Check the last claim.

Let us start with the one-dimensional case $d = 1$. Then
$$\psi(p)=a^{+}\mathbf{1}_{p\ge 0}|p|^{\beta}+a^{-}\mathbf{1}_{p<0}|p|^{\beta} \tag{4.67}$$
with complex constants $a^{\pm}$ such that
$$\operatorname{Re}a^{\pm}>0. \tag{4.68}$$
The reality condition (4.65) becomes $\bar a^{\pm}=a^{\mp}$, in which case $\psi$ takes the form
$$\psi(p)=(a_r+ia_i\,\mathrm{sgn}(p))|p|^{\beta} \tag{4.69}$$

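with constants $a_r > 0$, $a_i\in\mathbb{R}$. By changing the integration variable $p$ to $-p$ in (4.64), the Green function for $d = 1$ can be rewritten as
$$G^{\psi}_t(x)=\frac{1}{2\pi}\int_0^{\infty}e^{ipx}\exp\{-ta^{+}p^{\beta}\}\,dp+\frac{1}{2\pi}\int_0^{\infty}e^{-ipx}\exp\{-ta^{-}p^{\beta}\}\,dp. \tag{4.70}$$

Proposition 4.5.1. For any $\beta > 0$ and constants (4.68), the Green function (4.70) is infinitely smooth in both variables, and the estimates
$$\left|\frac{\partial^l}{\partial x^l}G^{\psi}_t(x)\right|\le C(\beta,a^{\pm},l)\min\left(t^{-(1+l)/\beta},\frac{t}{|x|^{1+\beta+l}}\right), \tag{4.71}$$
$$\left|\frac{\partial}{\partial t}\frac{\partial^l}{\partial x^l}G^{\psi}_t(x)\right|\le \frac{1}{t}\,\tilde C(\beta,a^{\pm},l)\min\left(t^{-(1+l)/\beta},\frac{t}{|x|^{1+\beta+l}}\right) \tag{4.72}$$
hold for any integer $l$ with constants $C(\beta,a^{\pm},l)$, $\tilde C(\beta,a^{\pm},l)$.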
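As a sanity check of the one-dimensional estimate with $l = 0$, the following minimal sketch evaluates the two half-line integrals of (4.70) numerically for $\beta = 1$ and $a^{\pm} = 1$, a case in which the Green function is known to be the Cauchy kernel $t/(\pi(t^2+x^2))$. The function names, the truncation point `p_max` and the grid size are illustrative assumptions and are not taken from the text.

```python
import cmath
import math


def green_1d(t, x, beta, a_plus, a_minus, p_max=60.0, n=200_000):
    """Trapezoidal approximation of the two integrals in (4.70).

    The half-line is truncated at p_max, which is harmless only when
    exp(-t * Re(a) * p_max**beta) is negligible (an assumption of this sketch).
    """
    h = p_max / n
    total = 0.0 + 0.0j
    for k in range(n + 1):
        p = k * h
        weight = 0.5 if k in (0, n) else 1.0
        damp = p ** beta
        total += weight * (
            cmath.exp(1j * p * x - t * a_plus * damp)
            + cmath.exp(-1j * p * x - t * a_minus * damp)
        )
    return total * h / (2.0 * math.pi)


def cauchy_kernel(t, x):
    """Closed form of the Green function for beta = 1, a_plus = a_minus = 1."""
    return t / (math.pi * (t * t + x * x))


t, x = 1.0, 2.0
value = green_1d(t, x, beta=1.0, a_plus=1.0, a_minus=1.0)
# Right-hand side of (4.71) for l = 0, beta = 1, with the constant C = 1/pi.
bound = min(1.0 / t, t / abs(x) ** 2) / math.pi
print(value.real, cauchy_kernel(t, x), bound)
```

For the Cauchy kernel the constant $C = 1/\pi$ works in (4.71) with $l = 0$, since $t/(t^2+x^2)$ is dominated both by $1/t$ and by $t/x^2$.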


Proof. Applying Proposition 9.3.5 with $\omega = l$ to (4.70) and noting that the first terms of the asymptotics $\pm i/|x|$ for the two terms in (4.70) always cancel, we find the estimate
$$\left|\frac{\partial^l}{\partial x^l}G^{\psi}_t(x)\right|\le C(\beta,a^{\pm},l)\frac{t}{|x|^{1+\beta+l}}.$$
Taking (2.69) into account completes the proof of (4.71). For the proof of (4.72), one applies Proposition 9.3.5 with $\omega = l + \beta$. □

Let us extend these estimates to arbitrary dimensions $d$.

Theorem 4.5.1. (i) Let $d > 1$, $\beta > 0$ and let a function $\omega$ on $S^{d-1}$ be $(d+1+[\beta])$ times continuously differentiable (where $[\beta]$ is the integer part of $\beta$; also note that these derivatives are uniformly bounded due to the compactness of $S^{d-1}$), with a real part that is bounded from below by a positive number. Then the Green function (4.64) satisfies the estimate
$$|G^{\psi}_t(x)|\le C\min\left(t^{-d/\beta},\frac{t}{|x|^{d+\beta}}\right). \tag{4.73}$$
(ii) Moreover, $G$ is differentiable with respect to $t$, and the following bounds apply:
$$\left|\frac{\partial}{\partial t}G^{\psi}_t(x)\right|\le C\min\left(t^{-1-d/\beta},\frac{1}{|x|^{d+\beta}}\right). \tag{4.74}$$
(iii) If additionally $\omega$ is $(d+1+[\beta]+l)$ times continuously differentiable, then $G^{\psi}_t(x)$ is $l$ times continuously differentiable in $x$ and
$$\left|\frac{\partial^k}{\partial x_{i_1}\cdots\partial x_{i_k}}G^{\psi}_t(x)\right|\le C\min\left(t^{-(d+k)/\beta},\frac{t}{|x|^{d+\beta+k}}\right) \tag{4.75}$$
for all $k\le l$ and all $i_1,\dots,i_k$. In all these estimates, $C$ is a constant that depends on $\beta$, $d$ and the corresponding bounds for $\omega$ and its derivatives.

Proof. (i) The first bound $t^{-d/\beta}$ is already known from Proposition 4.4.3. Using spherical coordinates with the axis along the direction of $\bar x = x/|x|$, we can rewrite (4.64) as
$$G^{\psi}_t(x)=\frac{1}{(2\pi)^d}\int_0^{\infty}d|p|\int_0^{\pi}d\theta\int_{S^{d-2}}dn\;e^{i|p|\,|x|\cos\theta}\exp\{-t|p|^{\beta}\omega(\bar x,\cos\theta,n)\}\sin^{d-2}\theta\,|p|^{d-1}, \tag{4.76}$$
where $\omega(\bar x,\cos\theta,n)=(\omega_r+i\omega_i)(\bar x,\cos\theta,n)$ denotes the function $\omega(p/|p|)$ expressed in terms of the spherical coordinates $\theta$, $n$ (that depend on $\bar x$). Changing $\theta$ to $u=\cos\theta$ yields the equivalent formulation
$$G^{\psi}_t(x)=\frac{1}{(2\pi)^d}\int_0^{\infty}d|p|\int_{-1}^{1}du\int_{S^{d-2}}dn\;e^{i|p|\,|x|u}\exp\{-t|p|^{\beta}\omega(\bar x,u,n)\}(1-u^2)^{(d-3)/2}|p|^{d-1}. \tag{4.77}$$


When trying to directly apply Proposition 9.3.5, a problem arises from the possibility of $u = 0$. Therefore, the idea is to separate the small values of $u$. Namely, let $\chi_1(u)$ be an even infinitely differentiable function $\mathbb{R}\to[0,1]$ with support $[-1/2,1/2]$ and such that $\chi_1(u)=1$ for $u\in[-1/4,1/4]$. Let $\chi_2(u)=1-\chi_1(u)$ and let us represent $G^{\psi}_t$ as the sum of two integrals $G^{\psi}_t(x)=I_1+I_2$ with
$$I_j=\frac{1}{(2\pi)^d}\int_0^{\infty}d|p|\int_{-1}^{1}du\int_{S^{d-2}}dn\;e^{i|p|\,|x|u}\exp\{-t|p|^{\beta}\omega(\bar x,u,n)\}(1-u^2)^{(d-3)/2}\chi_j(u)|p|^{d-1}. \tag{4.78}$$

Let us start with $I_2$. Proposition 9.3.5 with $\omega = d-1$ implies $I_2=I_2^0+\tilde I_2$, where
$$I_2^0=\frac{|S^{d-2}|\,\Gamma(d)}{(2\pi)^d|x|^d}\int_{-1}^{1}\exp\{i\,\mathrm{sgn}(u)\pi d/2\}(1-u^2)^{(d-3)/2}\chi_2(u)\frac{du}{|u|^d}$$
and
$$|\tilde I_2|\le C(d,\beta)\sup_s|\omega(s)|\,\frac{t}{|x|^{d+\beta}}.$$

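Now let us turn to $I_1$. Changing the integration variable $|p|$ to $r=|x|\,|p|$ yields
$$I_1=\frac{1}{(2\pi)^d|x|^d}\int_0^{\infty}dr\int_{-1}^{1}du\int_{S^{d-2}}dn\;e^{iru}\exp\left\{-t\omega(\bar x,u,n)\frac{r^{\beta}}{|x|^{\beta}}\right\}(1-u^2)^{(d-3)/2}\chi_1(u)r^{d-1},$$
which we represent as the sum $I_1=I_1^0+\tilde I_1$, where
$$I_1^0=\frac{|S^{d-2}|}{(2\pi)^d|x|^d}\int_0^{\infty}dr\int_{-1}^{1}du\;e^{iru}(1-u^2)^{(d-3)/2}\chi_1(u)r^{d-1},$$
$$\tilde I_1=\frac{1}{(2\pi)^d|x|^d}\int_0^{\infty}dr\int_{-1}^{1}du\int_{S^{d-2}}dn\;e^{iru}\left[\exp\left\{-t\omega(\bar x,u,n)\frac{r^{\beta}}{|x|^{\beta}}\right\}-1\right](1-u^2)^{(d-3)/2}\chi_1(u)r^{d-1}.$$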

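Equivalently, the second term can be written as
$$\tilde I_1=-\frac{t}{(2\pi)^d|x|^{d+\beta}}\int_0^{\infty}dr\int_{-1}^{1}du\int_{S^{d-2}}dn\;e^{iru}\,\omega(\bar x,u,n)\,r^{\beta}\,\phi\left(t\omega(\bar x,u,n)\frac{r^{\beta}}{|x|^{\beta}}\right)(1-u^2)^{(d-3)/2}\chi_1(u)r^{d-1},$$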


where $\phi(y)=(1-e^{-y})/y$ and $y=t\omega(\bar x,u,n)r^{\beta}/|x|^{\beta}$. Using $e^{iru}\,du=(ir)^{-1}d(e^{iru})$, we can now integrate by parts $(d+1+[\beta])$ times with respect to the variable $u$. Hereby, the required differentiation of the function $\omega(\bar x,u,n)(1-u^2)^{(d-3)/2}\chi_1(u)$ does not create any additional singularities or growing terms. On the other hand,
$$\frac{d}{du}\phi(y(u))=[\omega(\bar x,u,n)]^{-1}\frac{\partial}{\partial u}\omega(\bar x,u,n)\left(y\frac{d}{dy}\phi(y)\right)\Big|_{y=t\omega(\bar x,u,n)r^{\beta}/|x|^{\beta}}.$$
Applying this differentiation $(d+1+[\beta])$ times gives a bounded result, because the functions $[y(\frac{d}{dy})]^k\phi(y)$ are uniformly bounded on $\mathbb{R}_+$ for any $k\in\mathbb{N}$. As a result, we get a function of $r$ that decreases as $r^{-2+\beta-[\beta]}$, as $r\to\infty$, and is therefore integrable. This yields the estimate
$$|\tilde I_1|\le C_1\frac{t}{|x|^{d+\beta}}.$$

Exercise 4.5.2. Check the claim about $[y(\frac{d}{dy})]^k\phi(y)$.

Consequently, the correction terms $\tilde I_1+\tilde I_2$ satisfy the required bounds, and in order to complete the proof, it is sufficient to show that $I_1^0+I_2^0=0$. This is easy to see for odd dimensions $d$, since in this case both $I_1^0$ and $I_2^0$ vanish. For even dimensions, the direct proof does not seem to be so straightforward (see Exercise 4.5.3 below). The simplest way to see that the sum $I_1^0+I_2^0$ vanishes is to observe that it does not depend on $\omega$ or $\beta$. But for $\omega = 1$ and $\beta = 2$ it must vanish, since otherwise the Green function $G^{2}_{\sigma}(x)$ from (4.56) would have a decay of order $1/|x|^d$ as $x\to\infty$, which contradicts the fact that the heat kernel of the standard diffusion decays exponentially as $x\to\infty$.

(ii) The proof of the second bound in (4.74) is analogous, the only difference now being that the corresponding terms $I_j^0$ already depend on $\beta$ and $\omega$, which is why we cannot generally prove that they vanish or cancel. Instead, we must directly estimate the corresponding integrals
$$\tilde I_j=-\frac{1}{(2\pi)^d|x|^d}\int_0^{\infty}dr\int_{-1}^{1}du\int_{S^{d-2}}dn\;e^{iru}\,\omega(\bar x,u,n)\frac{r^{\beta}}{|x|^{\beta}}\exp\left\{-t\omega(\bar x,u,n)\frac{r^{\beta}}{|x|^{\beta}}\right\}(1-u^2)^{(d-3)/2}\chi_j(u)r^{d-1},$$
without the above-used subtraction for distinguishing the major term of the asymptotics. In particular, for the integral $\tilde I_2$ we use the estimate (9.31) of Proposition 9.3.5 with $\omega=\beta+d-1$. The first bound in (4.74) is obtained by the variable change $tr^{\beta}=q^{\beta}$.

(iii) The first estimate is obtained by differentiating (4.64) and then using the estimate $|e^{i(p,x)}|\le 1$. The second, more subtle estimate is proved as in (i).


The difference is that differentiating the function $G_t(x)$ $k$ times with respect to $x$ yields multipliers of the order $|p|^k$. Therefore, one has to integrate by parts $d+1+[\beta]+k$ times in the corresponding integral $\tilde I_1$ in order to get the decay of order $r^{-2+\beta-[\beta]}$ in the integrand. □

Exercise 4.5.3. Show that, for odd dimensions, both $I_1^0$ and $I_2^0$ from the proof of (i) vanish. On the other hand, for $d = 2k$, show that $I_1^0+I_2^0$

2 +2 =− 2k − 1

1

du 1 (2k−2) 1  2 2k−2 (0)u , φ(u) − φ(0) − φ (0)u − · · · − φ 2 2 u2k 0

where $\phi(u)=(1-u^2)^{(d-3)/2}$. (Hint: you may use (1.175) and (1.163).) Use this formula to confirm that $I_1^0+I_2^0=0$ at least for small dimensions $d = 2$ or $4$. The solution to this exercise can be found in Section 5.2 of [136].

Remark 68. (i) Expanding the exponent $\exp\{-t|p|^{\beta}\omega(\bar x,u,n)\}$ that appears in the integrals $I_j$ into a power series yields an asymptotic expansion of $G$ in power series of the variable $t/|x|^{\beta}$. One can even show that this power series is convergent for $\beta\in(0,1)$. (ii) For $\beta = 2k$ with $k\in\mathbb{N}$, the above power estimate is very rough, since in this case $G^{\psi}_t(x)$ actually decreases exponentially as $x\to\infty$, like in the case of diffusion with $\beta = 2$.

As a corollary of the estimates (4.73), one can conclude that the corresponding semigroup $T_t$ given by (4.63), and resolving equation (4.61), acts by bounded operators in the weighted spaces $C_{L_\alpha}(\mathbb{R}^d)$ with $L_\alpha(x)=1+|x|^{\alpha}$. (Recall Remark 63 on why this is of importance.)

Proposition 4.5.2. Under the assumptions of Theorem 4.5.1, $T_t$ preserves the weighted spaces $C_{L_\alpha}(\mathbb{R}^d)$ for $\alpha\in[0,\beta)$, and the norms $\|T_t\|_{C_{L_\alpha}(\mathbb{R}^d)}$ are bounded on any bounded time interval.

Proof. We need to show that
$$\left|\int G^{\psi}_t(x-y)|y|^{\alpha}\,dy\right|\le C(\alpha,\beta,t)(1+|x|^{\alpha}),$$
for which it is sufficient to show that
$$\left|\int_{|y|>1}G^{\psi}_t(x-y)|y|^{\alpha}\,dy\right|\le C(\alpha,\beta,t)(1+|x|^{\alpha}). \tag{4.79}$$

Let us decompose this integral into the sum of two integrals $I_1+I_2$, where $I_1$ is over the set $\{y:|x-y|<t^{1/\beta}\}$.


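Then, we have
$$I_1\le C(\beta)\int_{\{|x-y|<t^{1/\beta}\}}|y|^{\alpha}t^{-d/\beta}\,dy\le C(\alpha,\beta,t)(1+|x|^{\alpha})\,t^{-d/\beta}$$
and
$$I_2\le C(\beta)\int_{\{|x-y|>t^{1/\beta}\}\cap\{|y|>1\}}\frac{t|y|^{\alpha}}{|x-y|^{d+\beta}}\,dy.$$
The part of the latter integral over the set $|y|\le 2|x|$ is bounded by a constant multiple of $|x|^{\alpha}$. Therefore, the following integral remains:
$$\int_D\frac{t|y|^{\alpha}}{|x-y|^{d+\beta}}\,dy$$
over the domain $D=\{y:|y|>1,\ |y|>2|x|,\ |x-y|>t^{1/\beta}\}$. But on $D$ we have $|x-y|\ge|y|/2$, so that this integral can be estimated by
$$C(t,\alpha,\beta)\int_D\frac{t|y|^{\alpha}}{|y|^{d+\beta}}\,dy,$$
which converges because $\alpha<\beta$ and again yields the same estimate. □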

An interesting extension of the above results concerns equations with mixed homogeneous symbols. Namely, let us consider the problem (4.61) with
$$\psi(p)=\int_U|p|^{\beta(u)}\omega(u,p/|p|)\,\mu(du), \tag{4.80}$$
where $\beta(u)$ and $\omega(u,\cdot)$ are continuous functions of $u\in U$ (an arbitrary metric space), and $\mu$ is a (positive) measure on $U$. The corresponding semigroup of operators resolving (4.61) acts as (4.63) with
$$G^{\psi}_t(x)=\frac{1}{(2\pi)^d}\int e^{ipx}\exp\left\{-t\int_U|p|^{\beta(u)}\omega(u,p/|p|)\,\mu(du)\right\}dp. \tag{4.81}$$

Theorem 4.5.2. Let $\beta(u)\in[b_{\min},b_{\max}]$ with some $0<b_{\min}\le b_{\max}<\infty$, and let $\omega(u,\cdot)$ satisfy all conditions of Theorem 4.5.1 with all estimates uniform in $u$. Let $\beta$, $\omega$ depend continuously on $u$, and let $\mu$ be a Borel measure on $U$ such that $\mu\{u:\beta(u)=b_{\max}\}>0$.


Then, for $t\in(0,1)$, the Green function (4.81) and its derivatives satisfy the estimates
$$|G^{\psi}_t(x)|\le C\min\left(t^{-d/b_{\max}},\ t\int_U\frac{\mu(du)}{|x|^{d+\beta(u)}}\right), \tag{4.82}$$
$$\left|\frac{\partial}{\partial t}G^{\psi}_t(x)\right|\le C\min\left(t^{-1-d/b_{\max}},\ \int_U\frac{\mu(du)}{|x|^{d+\beta(u)}}\right), \tag{4.83}$$
$$\left|\frac{\partial^k}{\partial x_{i_1}\cdots\partial x_{i_k}}G^{\psi}_t(x)\right|\le C\min\left(t^{-(d+k)/b_{\max}},\ t\int_U\frac{\mu(du)}{|x|^{d+\beta(u)+k}}\right), \tag{4.84}$$
for all $k\le l$ and $i_1,\dots,i_k$.

Proof. The second parts of the estimates in (4.82) to (4.84) are obtained in literally the same way as in the proof of Theorem 4.5.1, except that Proposition 9.3.6 is used rather than Proposition 9.3.5. In order to get the first estimate in (4.82), one decomposes the integral in (4.81) into two integrals over the sets $\{|p|\le 1\}$ and $\{|p|>1\}$, respectively. The first integral is bounded by a constant. Hence, up to a constant, we find
$$|G^{\psi}_t(x)|\le\frac{1}{(2\pi)^d}\int\exp\left\{-t|p|^{b_{\max}}\mu\{u:\beta(u)=b_{\max}\}\min_{u,n}\omega(u,n)\right\}dp.$$
This yields the first estimate in (4.82). Similar decompositions yield the first estimates in (4.83) and (4.84). □

Exercise 4.5.4. (i) Extend Theorem 4.4.1 to all $\alpha > 0$. (ii) Formulate and prove the analogue of Theorem 4.4.1 for the mixed homogeneous setting of Theorem 4.5.2.

4.6 Perturbation theory and the interaction picture

An important tool for the construction of semigroups is perturbation theory, which can be applied once a generator of interest can be written as the sum of a well-understood operator and a term that is smaller (in some sense). We start with the simplest result of this kind.

Theorem 4.6.1. Let an operator $A$ with domain $D_A$ generate a strongly continuous semigroup $T_t$ on a Banach space $B$, and let $L$ be a bounded operator on $B$. Then $A + L$ with the same domain $D_A$ also generates a strongly continuous semigroup $\Phi_t$ on $B$, which is given by the series
$$\Phi_t=T_t+\sum_{m=1}^{\infty}\int_{0\le s_1\le\cdots\le s_m\le t}T_{t-s_m}LT_{s_m-s_{m-1}}\cdots LT_{s_1}\,ds_1\cdots ds_m$$
$$\phantom{\Phi_t}=T_t+\sum_{m=1}^{\infty}\int_{0\le s_1\le\cdots\le s_m\le t}T_{s_1}LT_{s_2-s_1}\cdots LT_{t-s_m}\,ds_1\cdots ds_m \tag{4.85}$$


that converges in the operator norm. Moreover, $\Phi_tf$ is the unique (bounded) solution to the integral equation
$$\Phi_tf=T_tf+\int_0^tT_{t-s}L\Phi_sf\,ds, \tag{4.86}$$
with a given $f_0 = f$.

Remark 69. The two versions of the series in (4.85) are obtained from each other by a trivial change of the integration variables. For the path-integral representation, however, it turns out that one of them can be preferable as corresponding to the 'natural' time-direction on the path. See Section 4.7 for more details on this point.

Proof. Clearly
$$\|\Phi_t\|\le\|T_t\|+\sum_{m=1}^{\infty}\frac{(\|L\|t)^m}{m!}\Big(\sup_{s\in[0,t]}\|T_s\|\Big)^{m+1},$$

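which implies the convergence of the series. Next, the main semigroup condition is shown by
$$\Phi_t\Phi_\tau f=\sum_{m=0}^{\infty}\int_{0\le s_1\le\cdots\le s_m\le t}T_{t-s_m}LT_{s_m-s_{m-1}}\cdots LT_{s_1}\,ds_1\cdots ds_m$$
$$\qquad\times\sum_{n=0}^{\infty}\int_{0\le u_1\le\cdots\le u_n\le\tau}T_{\tau-u_n}LT_{u_n-u_{n-1}}\cdots LT_{u_1}f\,du_1\cdots du_n$$
$$=\sum_{m,n=0}^{\infty}\int_{0\le u_1\le\cdots\le u_n\le\tau\le v_1\le\cdots\le v_m\le t+\tau}dv_1\cdots dv_m\,du_1\cdots du_n$$
$$\qquad\times T_{t+\tau-v_m}LT_{v_m-v_{m-1}}L\cdots T_{v_2-v_1}LT_{v_1-u_n}LT_{u_n-u_{n-1}}\cdots LT_{u_1}f$$
$$=\sum_{k=0}^{\infty}\int_{0\le u_1\le\cdots\le u_k\le t+\tau}T_{t+\tau-u_k}LT_{u_k-u_{k-1}}L\cdots LT_{u_1}f\,du_1\cdots du_k=\Phi_{t+\tau}f.$$
Equation (4.86) is a consequence of (4.85). On the other hand, if (4.86) holds, then substituting the l.h.s. of this equation into its r.h.s. recursively yields
$$\Phi_tf=T_tf+\int_0^tT_{t-s}LT_sf\,ds+\int_0^tds_2\,T_{t-s_2}L\int_0^{s_2}ds_1\,T_{s_2-s_1}L\Phi_{s_1}f$$
$$=T_tf+\sum_{m=1}^{N}\int_{0\le s_1\le\cdots\le s_m\le t}T_{t-s_m}LT_{s_m-s_{m-1}}\cdots LT_{s_1}f\,ds_1\cdots ds_m$$
$$\qquad+\int_{0\le s_1\le\cdots\le s_{N+1}\le t}T_{t-s_{N+1}}LT_{s_{N+1}-s_N}\cdots LT_{s_2-s_1}L\Phi_{s_1}f\,ds_1\cdots ds_{N+1}$$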


for arbitrary $N$. Since the last term tends to zero as $N\to\infty$, the series representation (4.85) follows, which implies that the solution is unique. Finally, since the terms with $m > 1$ in (4.85) are of order $O(t^2)$ for small $t$, we find
$$\frac{d}{dt}\Big|_{t=0}\Phi_tf=\frac{d}{dt}\Big|_{t=0}\left(T_tf+\int_0^tT_{t-s}LT_sf\,ds\right)=\frac{d}{dt}\Big|_{t=0}T_tf+Lf.$$
Therefore, $\frac{d}{dt}|_{t=0}\Phi_tf$ exists if and only if $\frac{d}{dt}|_{t=0}T_tf$ exists, and in this case
$$\frac{d}{dt}\Big|_{t=0}\Phi_tf=(A+L)f.$$
Thus the domains of $A$ and $A + L$ coincide. □
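In finite dimensions the perturbation series (4.85) can be summed numerically and compared with the exponential of $A + L$. The following minimal sketch does this for $2\times 2$ matrices, assuming a diagonal $A$ (so that $T_t$ is explicit) and a non-commuting bounded $L$; it uses the recursion $\Phi^{(m)}_t=\int_0^tT_{t-s}L\Phi^{(m-1)}_s\,ds$ for the $m$th term of the first version of (4.85). All function names, the grid size and the chosen matrices are illustrative assumptions.

```python
import math


def matmul(X, Y):
    return [[sum(X[i][k] * Y[k][j] for k in range(2)) for j in range(2)]
            for i in range(2)]


def matadd(X, Y):
    return [[X[i][j] + Y[i][j] for j in range(2)] for i in range(2)]


def scale(c, X):
    return [[c * X[i][j] for j in range(2)] for i in range(2)]


def expm(X, terms=40):
    """Matrix exponential via its Taylor series (adequate for small 2x2 matrices)."""
    result = [[1.0, 0.0], [0.0, 1.0]]
    term = [[1.0, 0.0], [0.0, 1.0]]
    for n in range(1, terms):
        term = scale(1.0 / n, matmul(term, X))
        result = matadd(result, term)
    return result


def perturbation_series(a, L, t, n_grid=100, n_terms=12):
    """T_t plus the first n_terms terms of (4.85) for A = diag(a[0], a[1]).

    Each term is obtained from the previous one by the trapezoidal rule
    applied to Phi^(m)_t = int_0^t T_{t-s} L Phi^(m-1)_s ds on a uniform grid.
    """
    h = t / n_grid
    T = [[[math.exp(a[0] * j * h), 0.0], [0.0, math.exp(a[1] * j * h)]]
         for j in range(n_grid + 1)]
    TL = [matmul(Tj, L) for Tj in T]          # T_s L on the grid
    current = T                               # Phi^(0)_s = T_s
    total = T[n_grid]
    for _ in range(n_terms):
        nxt = [[[0.0, 0.0], [0.0, 0.0]]]      # the integral over [0, 0] vanishes
        for j in range(1, n_grid + 1):
            acc = [[0.0, 0.0], [0.0, 0.0]]
            for i in range(j + 1):
                w = 0.5 if i in (0, j) else 1.0
                acc = matadd(acc, scale(w, matmul(TL[j - i], current[i])))
            nxt.append(scale(h, acc))
        current = nxt
        total = matadd(total, current[n_grid])
    return total


a = (-1.0, -2.0)
L = [[0.0, 1.0], [0.5, 0.0]]
series = perturbation_series(a, L, 1.0)
exact = expm([[a[0] + L[0][0], L[0][1]], [L[1][0], a[1] + L[1][1]]])
error = max(abs(series[i][j] - exact[i][j]) for i in range(2) for j in range(2))
print(error)
```

The remaining discrepancy comes from the quadrature and from truncating the series; the $m$th term is bounded by $(\|L\|t)^m/m!$ times a power of $\sup_s\|T_s\|$, exactly as in the convergence estimate of the proof.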



Equation (4.86) is often referred to as the mild form of the equation $\dot\Phi_tf=(A+L)\Phi_tf$, and the solutions to (4.86) are referred to as mild solutions to the equation $\dot\Phi_tf=(A+L)\Phi_tf$.

Remark 70. If $D\subset D_A$ is a core for $T_t$ from Theorem 4.6.1, such that for any $f\in D_A$ there exists a sequence $f_n\in D$ with $f_n\to f$, $Af_n\to Af$, as $n\to\infty$, then also $Lf_n\to Lf$, and therefore $D$ is also a core for $\Phi_t$. If $D$ is an invariant core for $T_t$, however, we cannot generally conclude that it is also an invariant core for $\Phi_t$. Yet, if $L$ and $T_t$ are both bounded in $D$ with respect to some norm in $D$, then the perturbation series also converges in $D$, which ensures its invariance under $\Phi_t$.

Theorem 4.6.1 can be used to show the strong convergence of operator semigroups.

Theorem 4.6.2. Under the assumptions of Theorem 4.6.1, suppose that we are additionally given a sequence of operators $A_n$ generating the uniformly bounded semigroups $T^n_t$ on some domains $D_n$ and a family of uniformly bounded operators $L_n$ in $B$. Assume that the $T^n_t$ converge strongly to $T_t$ and the $L_n$ converge strongly to $L$. Then the corresponding semigroups $\Phi^n_t$ (built by Theorem 4.6.1 from $T^n_t$ and $L_n$) converge strongly to the semigroup $\Phi_t$.

Proof. For each $n$, formula (4.85) can be rewritten as
$$\Phi^n_t=T^n_t+\sum_{m=1}^{\infty}\int_{0\le s_1\le\cdots\le s_m\le t}T^n_{t-s_m}L_nT^n_{s_m-s_{m-1}}\cdots L_nT^n_{s_1}\,ds_1\cdots ds_m. \tag{4.87}$$
By the dominated convergence theorem, in order to prove that the $\Phi^n_tf$ converge to $\Phi_tf$ it is sufficient to show that
$$T^n_{t-s_m}L_nT^n_{s_m-s_{m-1}}\cdots L_nT^n_{s_1}f\to T_{t-s_m}LT_{s_m-s_{m-1}}\cdots LT_{s_1}f,$$
for any collection $0\le s_1\le\cdots\le s_m\le t$. But this follows from the assumption of the theorem. □


Perturbation theory is also used in a more general setting of unbounded operators $L$, when their unboundedness can be somehow estimated by $A$. Such estimates can be given in terms of $A$, or in terms of its resolvent, or in terms of its semigroup. We shall work with the last approach.

Theorem 4.6.3. (i) Let an operator $A$ with domain $D_A$ generate a strongly continuous semigroup $T_t$ on a Banach space $B$, and let $L$ be an unbounded operator in $B$ with the domain $D_L$ such that $T_tf\in D_L$ for any $f\in B$, $t > 0$ and
$$\|LT_t\|\le\kappa t^{-\omega} \tag{4.88}$$
with constants $\kappa > 0$, $\omega\in(0,1)$, uniformly for $t$ from a fixed bounded interval. Then the series (4.85) converges in the operator norm, and the operators $\Phi_t$ form a strongly continuous semigroup in $B$. If $\|T_t\|\le Me^{mt}$, then
$$\|\Phi_t\|\le Me^{mt}E_{1-\omega}(\Gamma(1-\omega)M\kappa t^{1-\omega}), \tag{4.89}$$
with $E_{1-\omega}$ the Mittag-Leffler function. (ii) If additionally $L$ is a closed operator, then $\Phi_tf\in D_L$ for any $f\in B$, $t > 0$, with the same order of growth as for $T_t$:
$$\|L\Phi_t\|\le Me^{mt}\kappa\Gamma(1-\omega)t^{-\omega}E_{1-\omega,1-\omega}(\kappa\Gamma(1-\omega)Mt^{1-\omega}). \tag{4.90}$$
Moreover, $\Phi_tf$ is the unique solution to the integral equation (4.86) for any given $f_0 = f$ such that $\|L\Phi_tf\|\le ct^{-\omega_1}$ with some constants $\omega_1\in(0,1)$ and $c > 0$.

Proof. (i) For the $n$th term $\Phi_t(n)$ of the series (4.85), we have
$$\|\Phi_t(n)\|\le M^{n+1}e^{mt}\kappa^n\int_{0\le s_1\le\cdots\le s_n\le t}(s_n-s_{n-1})^{-\omega}\cdots(s_2-s_1)^{-\omega}s_1^{-\omega}\,ds_1\cdots ds_n.$$
By the definition of the fractional Riemann integral (see (1.101)) and by the composition rule $I^kI^n=I^{k+n}$ for fractional integrals, this estimate can be rewritten as
$$\|\Phi_t(n)\|\le M^{n+1}e^{mt}(\kappa\Gamma(1-\omega))^nI^{n(1-\omega)}(1)(t)\le M^{n+1}e^{mt}\frac{(\kappa\Gamma(1-\omega))^n}{\Gamma(n(1-\omega))}\int_0^t(t-s)^{n(1-\omega)-1}\,ds=M^{n+1}e^{mt}\frac{(\kappa\Gamma(1-\omega)t^{1-\omega})^n}{\Gamma(n(1-\omega)+1)},$$
where in the last equation the formula $\Gamma(x+1)=x\Gamma(x)$ was used. Alternatively, this estimate follows directly from the Dirichlet formula (9.16). The series with these terms converges, and its sum can be written in terms of the Mittag-Leffler


function (9.13). This takes us to (4.89). The semigroup property is proved as in Theorem 4.6.1.

(ii) We shall now prove that the series (4.85) still converges if we apply $L$ to all of its terms. For the term $\Phi_t(n)$, we have
$$\|L\Phi_t(n)\|\le M^{n+1}e^{mt}\kappa^{n+1}\int_{0\le s_1\le\cdots\le s_n\le t}(t-s_n)^{-\omega}(s_n-s_{n-1})^{-\omega}\cdots(s_2-s_1)^{-\omega}s_1^{-\omega}\,ds_1\cdots ds_n$$
$$=M^{n+1}e^{mt}\kappa^{n+1}t^{n(1-\omega)-\omega}\int_{0\le h_1\le\cdots\le h_n\le 1}(1-h_n)^{-\omega}(h_n-h_{n-1})^{-\omega}\cdots(h_2-h_1)^{-\omega}h_1^{-\omega}\,dh_1\cdots dh_n$$
$$=M^{n+1}e^{mt}\kappa t^{-\omega}(\kappa t^{1-\omega})^n\frac{(\Gamma(1-\omega))^{n+1}}{\Gamma((n+1)(1-\omega))},$$
where the Dirichlet formula (9.15) was used. Consequently, by the definition of the Mittag-Leffler function (9.13), we find
$$\sum_{n=0}^{\infty}\|L\Phi_t(n)\|\le Me^{mt}\kappa\Gamma(1-\omega)t^{-\omega}E_{1-\omega,1-\omega}(\kappa\Gamma(1-\omega)Mt^{1-\omega}). \tag{4.91}$$
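The bounds (4.89) to (4.91) are expressed through the Mittag-Leffler function. The following minimal sketch evaluates it by its power series, assuming the standard two-parameter definition $E_{\alpha,\beta}(z)=\sum_{k\ge 0}z^k/\Gamma(\alpha k+\beta)$ (which is what (9.13) is presumed to state), and then evaluates the right-hand side of (4.89). The function names are illustrative, and the truncation is only adequate for moderate non-negative arguments.

```python
import math


def mittag_leffler(alpha, z, beta=1.0, tol=1e-15, max_terms=150):
    """Partial sum of sum_k z**k / Gamma(alpha*k + beta) for moderate z >= 0."""
    total = 0.0
    for k in range(max_terms):
        arg = alpha * k + beta
        if arg > 170.0:  # math.gamma overflows for arguments slightly above 171
            break
        term = z ** k / math.gamma(arg)
        total += term
        if abs(term) < tol * abs(total):
            break
    return total


def semigroup_bound(t, M, m, kappa, omega):
    """Right-hand side of (4.89): M e^{mt} E_{1-omega}(Gamma(1-omega) M kappa t^{1-omega})."""
    z = math.gamma(1.0 - omega) * M * kappa * t ** (1.0 - omega)
    return M * math.exp(m * t) * mittag_leffler(1.0 - omega, z)


# Classical special cases of the Mittag-Leffler function.
print(mittag_leffler(1.0, 2.0), math.exp(2.0))
print(mittag_leffler(2.0, 4.0), math.cosh(2.0))
print(mittag_leffler(0.5, 1.0), math.e * math.erfc(-1.0))
# A sample value of the bound (4.89) with illustrative constants.
print(semigroup_bound(t=1.0, M=1.0, m=0.0, kappa=1.0, omega=0.5))
```

In the formal limit $\omega\to 0$ the bound (4.89) reduces to $Me^{mt}e^{M\kappa t}$, which is what the convergence estimate of Theorem 4.6.1 gives for a bounded perturbation with $\|L\|=\kappa$.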

Since the series (4.85) still converges if we apply $L$ to all of its terms, it follows from the closedness of $L$ that $\Phi_tf\in D_L$ for any $f\in B$, $t > 0$. The estimate (4.91) then implies (4.90). Consequently, the proof that $\Phi_tf$ is the unique solution to equation (4.86) can be completed as in Theorem 4.6.1. □

It is reasonable to ask whether the strongly continuous semigroup $\Phi_t$ from Theorem 4.6.3 is actually generated by $A + L$, as was the case with bounded $L$. A simple answer can be obtained in terms of an intermediate Banach space, which is often explicitly given in applications.

Theorem 4.6.4. Let an operator $A$ with domain $D_A$ generate a strongly continuous semigroup $T_t$ on a Banach space $B$. Let $\tilde B$ be a dense subspace of $B$, which is itself a Banach space under the norm $\|\cdot\|_{\tilde B}\ge\|\cdot\|_B$, and let $L\in\mathcal{L}(\tilde B,B)$. Let $D_A\subset\tilde B$ and let the semigroup $T_t$ have the following regularization property: $T_tf\in\tilde B$ for any $f\in B$, $t > 0$, and
$$\|T_t\|_{B\to\tilde B}\le\kappa t^{-\omega}, \tag{4.92}$$
with constants $\omega\in(0,1)$, $\kappa > 0$, uniformly for $t\in(0,1]$. Moreover, let $T_t$ be strongly continuous in $\tilde B$.

Then the semigroup $\Phi_t$ constructed in Theorem 4.6.3 has $D_A$ as an invariant core, where its generator equals $A+L$. Moreover, $\Phi_t$ maps $B$ to $\tilde B$ with the estimate
$$\|\Phi_t\|_{B\to\tilde B}\le\tilde\kappa t^{-\omega}, \tag{4.93}$$
with a constant $\tilde\kappa$ depending on $\kappa$, $\omega$, $\|L\|_{\tilde B\to B}$. Furthermore, $\Phi_t$ is also strongly continuous in $\tilde B$.


Proof. First of all, we can conclude that $\Phi_t$ maps $B$ to $\tilde B$ with the estimate (4.93), because the series (4.85) for $t > 0$ converges in $\tilde B$ if applied to any $f\in B$. In fact, its norm is bounded by
$$\|\Phi_tf\|_{\tilde B}\le\kappa t^{-\omega}\|f\|_B+\sum_{m=1}^{\infty}\|f\|_B(\kappa\|L\|_{\tilde B\to B})^m\int_{0\le s_1\le\cdots\le s_m\le t}(t-s_m)^{-\omega}\cdots(s_2-s_1)^{-\omega}s_1^{-\omega}\,ds_1\cdots ds_m,$$
which is estimated like the above series (4.91), yielding (4.93). Similar estimates show that the $\tilde B$-norm of the sum in (4.85) converges to zero, as $t\to 0$, if applied to any $f\in\tilde B$. This shows that $\Phi_t$ is strongly continuous in $\tilde B$.

Finally, if $f\in\tilde B$, then
$$\frac{d}{dt}\Big|_{t=0}\int_0^tT_{t-s}L\Phi_sf\,ds=Lf,$$
because of the strong continuity of $\Phi_t$ in $\tilde B$ and the strong continuity of $T_t$ in $B$ (and of course the boundedness of $L:\tilde B\to B$). Therefore, if $f\in\tilde B$, then it belongs to the domain of the generator of $\Phi_t$ if and only if it belongs to $D_A$, in which case the generator equals $A + L$. Since the domain of any generator is invariant under its semigroup and since $\tilde B$ is invariant under $\Phi_t$, $D_A$ must be invariant under $\Phi_t$, and hence constitutes an invariant core. □

Similarly to the above results, one can analyse a situation when the operator $L$ can be regularized by left multiplication with $T_t$, which leads to the following result. We omit the proof, since it is the same as the proof of Theorem 4.6.3 above.

Theorem 4.6.5. Let an operator $A$ with domain $D_A$ generate a strongly continuous semigroup $T_t$ on a Banach space $B$, and let $L$ be an operator mapping a subspace of $B$ to some (possibly different) space, but in such a way that the composition $T_tL$ is well defined for $t > 0$ as a bounded operator in $B$ such that
$$\|T_tL\|\le\kappa t^{-\omega} \tag{4.94}$$
with constants $\kappa > 0$, $\omega\in(0,1)$, uniformly for $t\in(0,1]$. Then the series (4.85) converges in the operator norm, the operators $\Phi_t$ form a strongly continuous semigroup in $B$, and $\Phi_tf$ is the unique bounded solution to the integral equation (4.86) for any given $f_0 = f$. Moreover, $\Phi_tL$ is well defined as a bounded operator in $B$ for any $t > 0$, and
$$\|\Phi_tL\|\le\tilde\kappa t^{-\omega} \tag{4.95}$$
for $t\in(0,1]$ with a constant $\tilde\kappa$.

Remark 71. (i) In situations as described by Theorem 4.6.5, it can be difficult to identify the domain of the generator of the semigroup $\Phi_t$.


(ii) As an application, we shall consider the Schrödinger equation with a singular potential in Section 4.8.

As an insightful example for the perturbation series, let us consider an equation of the form
$$\dot f_t(x)=\int_0^{\infty}(f_t(x+y)-f_t(x))\,\nu(dy), \tag{4.96}$$
with a finite measure $\nu$. The solution to the equation $\dot f_t=-\|\nu\|f_t$ with the initial condition $f_0$ equals $f_t=\exp\{-t\|\nu\|\}f_0$. Considering the operator $f\mapsto\int f(x+y)\,\nu(dy)$ as the perturbation, one gets the formula
$$f_t(x)=\exp\{-t\|\nu\|\}\sum_{k=0}^{\infty}\frac{t^k}{k!}\int_0^{\infty}\cdots\int_0^{\infty}f_0(x+y_1+\cdots+y_k)\,\nu(dy_1)\cdots\nu(dy_k) \tag{4.97}$$
for the resolving operator (4.85) of the Cauchy problem of equation (4.96).

Exercise 4.6.1. How must (4.97) be modified in order to include the case of a signed bounded measure $\nu$?

Let us distinguish an important special case of the above treatment of the equation $\dot f_t=(A+L)f_t$, when $A$ generates not only a semigroup, but a group. Namely, there exists a family of bounded operators $T_t$, $t\in\mathbb{R}$, depending strongly continuously on $t$ such that $T_tT_s=T_{t+s}$ and $\dot T_tf=AT_tf=T_tAf$ for any $f\in D$. For instance, this is the case when the operator $A$ is bounded or $A$ is a self-adjoint operator in a Hilbert space. In this case, choosing a new function $\mu_t=T_{-t}f_t$ in the equation $\dot f_t=(A+L)f_t$ takes us to $\dot\mu_t=-AT_{-t}f_t+T_{-t}(A+L)f_t$, and therefore, the equation $\dot f_t=(A+L)f_t$ can be written as
$$\dot\mu_t=\tilde L_t\mu_t=T_{-t}LT_t\mu_t. \tag{4.98}$$
This variant of the original equation $\dot f_t=(A+L)f_t$ plays a crucial role in quantum physics, where it is referred to as the interaction representation or the interaction picture. A standard technique in quantum physics is based on finding the approximate solutions to (4.98) from the first terms of the series (2.36), with $\tilde L$ instead of $A$, even if $L$ is unbounded and the series does not converge. The point to emphasize is that the series (2.36) for $\mu_t=T_{-t}f_t$ and $\tilde L$ instead of $A$ can obviously be obtained from the perturbation series (4.85) for $f_t=\Phi_tf$ by applying $T_{-t}$ to both sides, so that the perturbation series is just another representation of the basic expansion (2.36) of the chronological exponential in the interaction picture.
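The passage to the interaction picture can be checked numerically in a finite-dimensional toy case. The following minimal sketch assumes that $A$ is the generator of planar rotations with frequency `w` (so that $T_t$ is an explicit rotation group) and that $L$ is an arbitrary $2\times 2$ matrix; it compares a central finite difference of $\mu_t=T_{-t}f_t$ with $T_{-t}LT_t\mu_t$. All names and numerical values are illustrative assumptions.

```python
import math


def matmul(X, Y):
    return [[sum(X[i][k] * Y[k][j] for k in range(2)) for j in range(2)]
            for i in range(2)]


def matvec(X, v):
    return [X[0][0] * v[0] + X[0][1] * v[1], X[1][0] * v[0] + X[1][1] * v[1]]


def expm(X, terms=40):
    """Matrix exponential via its Taylor series (adequate for small 2x2 matrices)."""
    result = [[1.0, 0.0], [0.0, 1.0]]
    term = [[1.0, 0.0], [0.0, 1.0]]
    for n in range(1, terms):
        prod = matmul(term, X)
        term = [[prod[i][j] / n for j in range(2)] for i in range(2)]
        result = [[result[i][j] + term[i][j] for j in range(2)] for i in range(2)]
    return result


def rotation(theta):
    """T_t = exp(tA) for A = [[0, -w], [w, 0]] evaluated at theta = w * t."""
    c, s = math.cos(theta), math.sin(theta)
    return [[c, -s], [s, c]]


def mu(t, w, L, f0):
    """mu_t = T_{-t} f_t, where f_t = exp(t (A + L)) f0."""
    gen = [[t * L[0][0], t * (L[0][1] - w)], [t * (L[1][0] + w), t * L[1][1]]]
    return matvec(rotation(-w * t), matvec(expm(gen), f0))


def interaction_rhs(t, w, L, mu_t):
    """Right-hand side T_{-t} L T_t mu_t of the interaction-picture equation."""
    L_tilde = matmul(rotation(-w * t), matmul(L, rotation(w * t)))
    return matvec(L_tilde, mu_t)


w, L, f0, t, h = 2.0, [[0.3, -0.7], [0.2, 0.1]], [1.0, -0.5], 0.8, 1e-5
plus, minus = mu(t + h, w, L, f0), mu(t - h, w, L, f0)
derivative = [(plus[i] - minus[i]) / (2.0 * h) for i in range(2)]
rhs = interaction_rhs(t, w, L, mu(t, w, L, f0))
residual = max(abs(derivative[i] - rhs[i]) for i in range(2))
print(residual)
```

The residual is of the size of the finite-difference error, which illustrates that the generator $A$ has been absorbed into the time-dependent operator $T_{-t}LT_t$.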

4.7 Path integral representation

Integral representations for the solutions to linear PDEs in terms of an infinite-dimensional integral over some space of trajectories are a very important tool both for modern analysis and theoretical physics. Let us show here a typical result of this kind that is implied by the perturbation theory developed above. For a more complete picture, we refer to Chapter 9 of [148].

The path-space that we shall work on is the space of piecewise constant paths. Namely, a sample path $Z$ in $\mathbb{R}^d$ on the time interval $[\tau,t]$ that starts at a point $x$ is defined by a finite number, say $n$, of jump-times $\tau<s_1<\cdots<s_n<t$, and by jump-sizes $z_1,\dots,z_n$ (each $z_j\in\mathbb{R}^d\setminus\{0\}$) at these times:
$$Z_x(s)=x-Z^{s_1\cdots s_n}_{z_1\cdots z_n}(s),\qquad Z^{s_1\cdots s_n}_{z_1\cdots z_n}(s)=\begin{cases}Z^0=0,&\tau\le s<s_1,\\ Z^1=z_1,&s_1\le s<s_2,\\ \cdots\\ Z^n=z_1+\cdots+z_n,&s_n\le s\le t.\end{cases} \tag{4.99}$$

Let $PC_x(\tau,t)$, abbreviated as $PC_x(t)$ for $\tau = 0$, denote the set of all such right-continuous and piecewise-constant paths $[\tau,t]\to\mathbb{R}^d$ starting from the point $x$ at $\tau$, and let $PC^n_x(\tau,t)$ denote the subset of paths with exactly $n$ discontinuities. Topologically, $PC^0_x(\tau,t)$ is a point and $PC^n_x(\tau,t)=\mathrm{Sim}^n_{\tau,t}\times(\mathbb{R}^d)^n$, $n = 1, 2,\dots$, where
$$\mathrm{Sim}^n_{\tau,t}=\{s_1,\dots,s_n:\tau<s_1<s_2<\cdots<s_n<t\} \tag{4.100}$$
denotes the standard $n$-dimensional simplex. For $\tau = 0$, we simply write $\mathrm{Sim}^n_t$.

To each $\sigma$-finite measure $M$ on $\mathbb{R}^d$, there corresponds the $\sigma$-finite measure $M^{PC}$ on $PC_x(t)$, which is defined as the sum of measures $M^{PC}_n$, $n = 0, 1,\dots$, where each $M^{PC}_n$ is the product-measure on $PC^n_x(t)$ of the Lebesgue measure on $\mathrm{Sim}^n_t$ and of $n$ copies of the measure $M$ on $\mathbb{R}^d$. Therefore, if $Z$ is parametrized as in (4.99), then
$$M^{PC}_n(dZ(\cdot))=ds_1\cdots ds_n\,M(dz_1)\cdots M(dz_n),$$
and for any measurable functional $F(Z_x(\cdot))=\{F_n(x-Z^0,x-Z^1,\dots,x-Z^n)\}$ on $PC_x(t)$, given by a collection of functions $F_n$ on $\mathbb{R}^{dn}$, $n = 0, 1,\dots$, we find
$$\int_{PC_x(t)}F(Z_x(\cdot))\,M^{PC}(dZ(\cdot))=F(x)+\sum_{n=1}^{\infty}\int_{PC^n_x(t)}F(Z_x(\cdot))\,M^{PC}_n(dZ(\cdot))$$
$$=\sum_{n=0}^{\infty}\int_{\mathrm{Sim}^n_t}ds_1\cdots ds_n\int_{\mathbb{R}^d}\cdots\int_{\mathbb{R}^d}M(dz_1)\cdots M(dz_n)\,F_n(x-Z^0,x-Z^1,\dots,x-Z^n). \tag{4.101}$$
If the measure $M$ on $\mathbb{R}^d$ is finite, then the measure $M^{PC}=M^{PC}(t,x)$ on $PC_x(t)$ is also finite with
$$\|M^{PC}\|=1+\sum_{n=1}^{\infty}\int_{\mathrm{Sim}^n_t}ds_1\cdots ds_n\int_{\mathbb{R}^{dn}}M(dz_1)\cdots M(dz_n)=e^{t\|M\|}.$$


Therefore, using the probabilistic notation $\mathbf{E}$ (the expectation) for the integral over the normalized (probability) measure $\tilde M^{PC}=e^{-t\|M\|}M^{PC}$ on the path-space $PC_x(t)$, we can write (4.101) as
$$\int_{PC_x(t)}F(Z_x(\cdot))\,M^{PC}(dZ(\cdot))=e^{t\|M\|}\int_{PC_x(t)}F(Z_x(\cdot))\,\tilde M^{PC}(dZ(\cdot))=e^{t\|M\|}\,\mathbf{E}F(Z_x(\cdot)). \tag{4.102}$$
Let us now look at the perturbation series (4.85) assuming that $A$ is the operator of multiplication by a function $A(y)$ in $\mathbb{R}^d$ and $L$ is an integral operator in $C(\mathbb{R}^d)$. For the sake of simplicity, we take this integral operator to be spatially homogeneous, i.e., $Lf(x)=\int f(x-y)\,\nu(dy)$ with a measure $\nu$ on $\mathbb{R}^d$ (possibly unbounded and complex-valued). Then the series (4.85) (its second version!) can be rewritten as
$$\Phi_tY(x)=e^{tA(x)}Y(x)+\sum_{m=1}^{\infty}\int_{0\le s_1\le\cdots\le s_m\le t}ds_1\cdots ds_m\int_{\mathbb{R}^{dm}}\nu(dz_1)\cdots\nu(dz_m)\,Y(x-z_1-\cdots-z_m)$$
$$\qquad\times\exp\big\{s_1A(x)+(s_2-s_1)A(x-z_1)+\cdots+(t-s_m)A(x-z_1-\cdots-z_m)\big\}. \tag{4.103}$$
The latter exponential term can also be written as
$$\exp\left\{\int_0^tA(Z_x(s))\,ds\right\}.$$

Comparing with (4.101), we find the following path integral representation for the solutions to the equation $\dot f_t=(A+L)f_t$.

Theorem 4.7.1. Under the assumptions of any one of the Theorems 4.6.1, 4.6.3 or 4.6.5, let $A$ be the operator of multiplication by the function $A(y)$ in $\mathbb{R}^d$ and $Lf(x)=\int f(x-y)\,\nu(dy)$ with a measure $\nu$ on $\mathbb{R}^d$. Then the convergent series (4.85) or (4.103), expressing the resolving operator for the Cauchy problem of equation $\dot f_t=(A+L)f_t$, can be represented as a path integral of the type (4.101) with
$$F(Z_x(\cdot))=\exp\left\{\int_0^tA(Z_x(s))\,ds\right\}Y(Z_x(t))$$
and $\nu$ instead of $M$. If $\nu$ is not positive, then it can be written as $\nu(dz)=\xi(z)M(dz)$ with a real (or complex) density $\xi$ and some positive measure $M$. In this case, (4.103) can be written as (4.101), with
$$F(Z_x(\cdot))=\exp\left\{\int_0^tA(Z_x(s))\,ds\right\}\xi(z_1)\cdots\xi(z_n)\,Y(x-z_1-\cdots-z_n).$$
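The path integral representation lends itself to a Monte Carlo check. The following minimal sketch assumes $d = 1$ and the simplest positive measure $\nu=c\,\delta_{z_0}$ (a point mass, so that $Lf(x)=cf(x-z_0)$), together with illustrative choices of $A$ and $Y$. Under the normalized measure of (4.102) the jump times then form a Poisson process of rate $c$, and the estimate $e^{ct}\,\mathbf{E}F(Z_x(\cdot))$ is compared with a Runge-Kutta solution of $\dot f_t=(A+L)f_t$ restricted to the lattice $x-kz_0$. All names and parameter values are assumptions made for this illustration.

```python
import math
import random


def A(y):
    """Illustrative bounded multiplication operator (an assumption of this sketch)."""
    return -1.0 / (1.0 + y * y)


def Y(y):
    """Illustrative bounded initial condition."""
    return math.cos(y)


def path_integral_mc(x, t, c, z0, n_samples=40_000, seed=0):
    """Monte Carlo estimate of e^{ct} E[exp(int_0^t A(Z_x(s)) ds) Y(Z_x(t))]."""
    rng = random.Random(seed)
    acc = 0.0
    for _ in range(n_samples):
        pos, prev, integral = x, 0.0, 0.0
        s = rng.expovariate(c)           # first jump time of a rate-c Poisson process
        while s < t:
            integral += (s - prev) * A(pos)
            pos -= z0                    # Z_x jumps from pos to pos - z0
            prev = s
            s += rng.expovariate(c)
        integral += (t - prev) * A(pos)
        acc += math.exp(integral) * Y(pos)
    return math.exp(c * t) * acc / n_samples


def lattice_reference(x, t, c, z0, n_sites=30, n_steps=500):
    """RK4 for g_k' = A(x - k z0) g_k + c g_{k+1}, where g_k(t) = f_t(x - k z0).

    The chain is truncated after n_sites sites; the neglected coupling only
    affects g_0 through terms of order (c t)^n_sites / n_sites!.
    """
    rates = [A(x - k * z0) for k in range(n_sites)]
    g = [Y(x - k * z0) for k in range(n_sites)]

    def deriv(v):
        return [rates[k] * v[k] + (c * v[k + 1] if k + 1 < n_sites else 0.0)
                for k in range(n_sites)]

    h = t / n_steps
    for _ in range(n_steps):
        k1 = deriv(g)
        k2 = deriv([g[i] + 0.5 * h * k1[i] for i in range(n_sites)])
        k3 = deriv([g[i] + 0.5 * h * k2[i] for i in range(n_sites)])
        k4 = deriv([g[i] + h * k3[i] for i in range(n_sites)])
        g = [g[i] + h * (k1[i] + 2.0 * k2[i] + 2.0 * k3[i] + k4[i]) / 6.0
             for i in range(n_sites)]
    return g[0]


mc = path_integral_mc(x=0.5, t=1.0, c=1.0, z0=1.0)
ref = lattice_reference(x=0.5, t=1.0, c=1.0, z0=1.0)
print(mc, ref)
```

Since $|F|\le 1$ for these choices, the statistical error of the estimate is at most of order $e^{ct}/\sqrt{N}$ for $N$ samples, so the two numbers should agree to within a few hundredths.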


As a basic example, let us consider the regularized Schrödinger equation
$$\frac{\partial}{\partial t}f_t=\sigma(-\Delta+V(x))f_t, \tag{4.104}$$
where $\sigma$ is a complex constant with a non-negative real part; $\sigma=-i$ for the standard Schrödinger equation. If the potential $V$ is the Fourier transform of some measure $\nu$ on $\mathbb{R}^d$ (possibly unbounded), then applying the Fourier transform to equation (4.104) leads (by (1.94)) to the following equation for the Fourier transform $\hat f$ of $f$:
$$\frac{\partial}{\partial t}\hat f_t(p)=-ip^2\hat f_t(p)-i(2\pi)^{-d}\int\hat f_t(p-q)\,\nu(dq). \tag{4.105}$$
Therefore, we are in the framework of Theorem 4.7.1, which implies the path integral representation for the solutions to the Schrödinger equation. Different conditions on $\nu$ that ensure the compliance with all assumptions of this theorem are discussed in detail in Chapter 9 of [148]. Later on, we will extend this theory to time-dependent $A$ and $L$, see (4.181). In Chapter 8, we will extend it in other directions, in particular to the case where $A$ is not necessarily a multiplication operator.

Remark 72. For readers with some probabilistic background, let us indicate an alternative approach to the path integral representation. To this end, let us start with the series (4.85) in its first version:
$$\Phi_tY=T_tY+\sum_{m=1}^{\infty}\int_{0\le s_1\le\cdots\le s_m\le t}T_{t-s_m}LT_{s_m-s_{m-1}}\cdots LT_{s_1}Y\,ds_1\cdots ds_m. \tag{4.106}$$
It represents the solutions $f_t=\Phi_tY$ to the equation $\dot f_t=(A+L)f_t$ with the initial condition $Y$. A similar series expansion can be used to find the solution to the Cauchy problem for the SDE $d\phi_t=A\phi_t\,dt+L\phi_t\,dW_t$, where $W$ is the Wiener process. This leads to the formula
$$\phi_tY=T_tY+\sum_{m=1}^{\infty}\int_{0\le s_1\le\cdots\le s_m\le t}T_{t-s_m}LT_{s_m-s_{m-1}}\cdots LT_{s_1}Y\,dW_{s_1}\cdots dW_{s_m}, \tag{4.107}$$
expressed in terms of the iterated (multiplicative) Wiener (or Itô) integral. The celebrated Wiener isometry states that for any two expansions
$$\phi^j=\sum_{m=0}^{\infty}\int_{0\le s_1\le\cdots\le s_m\le t}g^j_m(s_1,\dots,s_m)\,dW_{s_1}\cdots dW_{s_m},\qquad j=1,2$$
(under appropriate growth conditions), one has
$$\mathbf{E}(\phi^1\bar\phi^2)=\sum_{m=0}^{\infty}\int_{0\le s_1\le\cdots\le s_m\le t}g^1_m\bar g^2_m\,ds_1\cdots ds_m.$$


Using this isometry for $g^2_m = 1$ for all $m$ and taking into account the well-known formula
$$\exp\left\{W_t-\frac{t}{2}\right\}=\sum_{m=0}^{\infty}\int_{0\le s_1\le\cdots\le s_m\le t}dW_{s_1}\cdots dW_{s_m},$$
one finds a representation of the series (4.106) in terms of the expectation $\mathbf{E}$ with respect to the Wiener measure:
$$\Phi_tY=\mathbf{E}\left[\phi_t(Y)\exp\left\{W_t-\frac{t}{2}\right\}\right], \tag{4.108}$$
where $\phi_tY$ solves the Cauchy problem for the SDE $d\phi_t=A\phi_t\,dt+L\phi_t\,dW_t$.

4.8 Diffusion with drifts and Schrödinger equations with singular potentials and magnetic fields

As an example for the application of perturbation theory as developed above, let us analyse the diffusion or heat conduction equation in $\mathbb{R}^d$ with a variable drift $b$ and a source $V$:
$$\dot f_t=\frac{1}{2}\Delta f_t+(b(x),\nabla)f_t(x)+V(x)f_t(x),\qquad f_0=f, \tag{4.109}$$
where $(b(x),\nabla)=\sum_jb_j(x)\frac{\partial}{\partial x_j}$. Hereby, the operator
$$Lf=(b(x),\nabla)f(x)+V(x)f(x) \tag{4.110}$$
is considered a perturbation. If $b = 0$ and $V\in C(\mathbb{R}^d)$, then the semigroup generated by $L+\Delta/2$ can be constructed via Theorem 4.6.1. In order to deal with a non-vanishing drift $b$, Theorem 4.6.3 or 4.6.4 is required.

Proposition 4.8.1. Let $V$ and all $b_j$ belong to $C(\mathbb{R}^d)$. Then the operator $L+\Delta/2$ generates a strongly continuous semigroup in $C_\infty(\mathbb{R}^d)$, and $C^2_\infty(\mathbb{R}^d)$ is an invariant core.

Proof. This is a direct application of Theorem 4.6.4 with $\tilde B=C^1_\infty(\mathbb{R}^d)$ and of the regularization property (4.24) of the semigroup generated by $\Delta$. □

One is often interested in situations with discontinuous drifts or sources.

Proposition 4.8.2. Let $V$ and all $b_j$ be bounded measurable functions on $\mathbb{R}^d$. Then the operator $L + \Delta/2$ still generates a strongly continuous semigroup $\Phi_t$ in $C_\infty(\mathbb{R}^d)$. Its members $\Phi_t$ provide unique solutions to the corresponding mild form of equation (4.109).

Proof. Formally, Theorem 4.6.4 does not apply even for the vanishing drift $b$, since $L$ is not a well-defined operator in $C_\infty(\mathbb{R}^d)$. However, the arguments proving the convergence of the series (4.85) and the strong continuity of $\Phi_t$ are still perfectly applicable. Alternatively, the proof can be based on Theorem 4.6.5, as will be done in Theorem 4.8.1 below.

These results automatically extend to complex diffusion equations, or regularized Schrödinger equations with magnetic fields:

$$\dot f_t = \frac{1}{2} \sigma \Delta f_t + (b(x), \nabla) f_t(x) + V(x) f_t(x) = \left( \frac{1}{2} \sigma \Delta + L \right) f_t(x), \quad f_0 = f, \tag{4.111}$$

where $\sigma = i + \varepsilon$ is a complex constant with a positive real part $\varepsilon$. This takes us to the following result.

Proposition 4.8.3. Let $V$ and all $b_j$ be bounded continuous complex-valued functions. Then the operator on the r.h.s. of (4.111) generates a strongly continuous semigroup in the space $C^{\mathbb{C}}_\infty(\mathbb{R}^d)$ of complex-valued continuous functions vanishing at infinity, and the corresponding complex-valued space $C^{2,\mathbb{C}}_\infty(\mathbb{R}^d)$ is an invariant core.

The limiting equation with $\sigma = i$ is the Schrödinger equation with magnetic fields. In its standard representation, it is written as

$$i\hbar \frac{\partial}{\partial t} f_t = \frac{1}{2} (-i\hbar \nabla + A(x))^2 f_t + V(x) f_t, \tag{4.112}$$

with $\hbar$ the Planck constant and the functions $V(x)$ and $A(x)$, the latter specifying the so-called vector potential.

In physics, one is often interested in more general potentials $V$, e.g. potentials that are represented by measures. For this purpose, let us consider the regularized Schrödinger equation (4.111) when $V$ and $\nabla b$ are complex Radon measures. (Of course, $V$ in (4.110) should in this case better be denoted by $V(dx)$.) Let us introduce the subspace $\mathcal{M}^{\mathbb{C}}_{R,\alpha}(\mathbb{R}^d)$ of Radon measures $\mu$ on $\mathbb{R}^d$ such that

$$|\mu|(B_r(x)) \le C r^\alpha \tag{4.113}$$

for all $x \in \mathbb{R}^d$, $r \in (0, 1)$ and some constant $\alpha \in (\max(d - 2, 0), d]$ called the dimensionality of $\mu$. Here $B_r(x)$ denotes the ball of radius $r$ centered at $x$. Notice that (4.113) implies

$$|\mu|(B_r(x)) \le C_1 \max(1, r^d) \tag{4.114}$$

for all $r$ and some other constant $C_1$ depending on $d$ and $\alpha$. The space $\mathcal{M}^{\mathbb{C}}_{R,\alpha}(\mathbb{R}^d)$ of measures of dimensionality $\alpha$ is a Banach space with respect to the norm

$$\|\mu\|_{R,\alpha} = \sup_{x \in \mathbb{R}^d,\, r \in (0,1]} \frac{|\mu|(B_r(x))}{r^\alpha}. \tag{4.115}$$

Natural examples of measures from $\mathcal{M}^{\mathbb{C}}_{R,2}(\mathbb{R}^3)$ are the volumes on regular hypersurfaces. Dirac's point masses (or $\delta$-functions) on $\mathbb{R}$ and their finite linear combinations belong to $\mathcal{M}^{\mathbb{C}}_{R,0}(\mathbb{R})$.

Theorem 4.8.1. Let $b$ be a bounded measurable function. Let $V \in \mathcal{M}^{\mathbb{C}}_{R,\alpha}(\mathbb{R}^d)$ and $\frac{\partial b}{\partial x_j} \in \mathcal{M}^{\mathbb{C}}_{R,\alpha}(\mathbb{R}^d)$ for all $j$, where $d > 1$, $\alpha \in (d - 2, d]$, and let $\sigma$ be a complex constant with a positive real part. Then the perturbation series (4.85) with $A = \sigma\Delta/2$ and $L$ from (4.110) converges in the norm of the space of bounded operators in $C_\infty(\mathbb{R}^d)$, and $\Phi_t f$ represents a unique bounded solution to the mild equation (4.86) for any $f \in C_\infty(\mathbb{R}^d)$. Moreover, the operators $\Phi_t$ extend to the space $\mathcal{M}^{\mathbb{C}}_{R,\alpha}(\mathbb{R}^d)$, so that for any $f \in \mathcal{M}^{\mathbb{C}}_{R,\alpha}(\mathbb{R}^d)$, $\Phi_t f$ solves (4.86) and satisfies the estimate

$$\|\Phi_t f\|_{C^{\mathbb{C}}_\infty(\mathbb{R}^d)} \le \tilde\kappa\, t^{-\omega} \|f\|_{\mathcal{M}^{\mathbb{C}}_{R,\alpha}(\mathbb{R}^d)}, \tag{4.116}$$

with any $\omega > (d - \alpha)/2$ and some $\tilde\kappa$.

Remark 73. The derivative $\frac{\partial b}{\partial x_j}$ is defined in the sense of generalized functions, i.e., it is effectively defined inside the integral $\int f(x) \frac{\partial b}{\partial x_j}(dx)$ only. This fact will be used below in (4.119). For instance, if $b(x) = \mathbf{1}_{[0,\infty)}$ is the indicator function of the positive half line, then $\int f(x)\, b'(dx) = f(0)$. Readers who do not wish to touch generalized functions can work with the special case when $b$ is a Lipschitz-continuous function, for which the partial derivatives are well defined almost everywhere.

Proof. In view of Theorem 4.6.5, we only need to show the corresponding version of estimate (4.94):

$$\left| \int G_{t\sigma}(x - y)\, (Lf)(y)\,(dy) \right| \le \kappa t^{-\omega} \|f\|_{C(\mathbb{R}^d)} \tag{4.117}$$

with some constants $\omega \in (0, 1)$, $\kappa > 0$, where the heat kernel $G$ is given by (4.14). Due to the structure of $L$, this boils down to proving the estimates

$$\int |G_{t\sigma}(x - y)|\, |V|(dy) \le \kappa t^{-\omega} \tag{4.118}$$

and

$$\left| \int G_{t\sigma}(x - y) \sum_j b_j(y) \frac{\partial}{\partial y_j} f(y)\,(dy) \right| = \left| \int \sum_j \frac{\partial}{\partial y_j} \big( G_{t\sigma}(x - y)\, b_j(y) \big) f(y)\,(dy) \right| \le \kappa t^{-\omega} \|f\|_{C(\mathbb{R}^d)}. \tag{4.119}$$

Let us start with (4.118). Since

$$|G_{t\sigma}(x)| = (2\pi t |\sigma|)^{-d/2} \exp\left\{ -\frac{\sigma_1 x^2}{2t} \right\},$$

with $\sigma_1$ being the real part of $\sigma^{-1}$, re-scaling of $t$ and $\kappa$ leads to the following reduced estimate:

$$(2\pi t)^{-d/2} \int \exp\left\{ -\frac{(x - y)^2}{2t} \right\} |V|(dy) \le \kappa t^{-\omega}. \tag{4.120}$$

Since condition (4.113) is shift-invariant, one can further reduce the proof to the estimate

$$(2\pi t)^{-d/2} \int \exp\left\{ -\frac{x^2}{2t} \right\} |V|(dx) \le \kappa t^{-\omega}. \tag{4.121}$$

In order to estimate the integral on the l.h.s., we decompose it into three parts: over the ball $B_{t^\delta}(0)$, over the band $B_1(0) \setminus B_{t^\delta}(0)$ and over the remaining part. It then follows from (4.113) and (4.114) that

$$(2\pi t)^{-d/2} \int \exp\left\{ -\frac{x^2}{2t} \right\} |V|(dx) \le (2\pi t)^{-d/2} C t^{\delta\alpha} + (2\pi t)^{-d/2} C_1 \exp\left\{ -\frac{t^{2\delta}}{2t} \right\} + (2\pi t)^{-d/2} \int_{|x| \ge 1} \exp\left\{ -\frac{x^2}{2t} \right\} |V|(dx). \tag{4.122}$$

For $\delta < 1/2$, the second term is exponentially small for small $t$, and the first term is of order $t^{-\omega}$ with $\omega = d/2 - \delta\alpha$. In order to have $\omega < 1$, one has to choose $\delta\alpha > d/2 - 1$, which is consistent with the restriction $\delta < 1/2$ precisely under the assumed condition $\alpha > d - 2$. It remains to estimate the last term in (4.122). It can be rewritten as

$$(2\pi t)^{-d/2} \int_1^\infty \exp\left\{ -\frac{r^2}{2t} \right\} \tilde V(dr),$$

where $\tilde V((r_1, r_2]) = |V|(B_{r_2}(0)) - |V|(B_{r_1}(0))$. With the help of (4.114), this is further estimated by

$$(2\pi t)^{-d/2}\, C_1 \int_1^\infty \exp\left\{ -\frac{r^2}{2t} \right\} r^d\, dr,$$

which is exponentially small for small $t$, as expected.

Turning to (4.119), we note that, since $b$ is bounded, it is sufficient to show the following two estimates:

$$\int |G_{t\sigma}(x - y)| \left| \sum_j \frac{\partial}{\partial y_j} b_j(y) \right| (dy) \le \kappa t^{-\omega}, \tag{4.123}$$

$$\int \left| \sum_j \frac{\partial}{\partial y_j} G_{t\sigma}(x - y) \right| dy \le \kappa t^{-\omega}. \tag{4.124}$$

But estimate (4.123) is the same as (4.118), since only the dimensionality of $V$ and $\nabla b$ is relevant. Estimate (4.124) is straightforward (the calculations are performed in Proposition 4.3.1).

The situation is much simpler in the dimension $d = 1$. If the measures $V$ and $b'$ are bounded, one can even deal with the Schrödinger equation directly, without any regularization.

Theorem 4.8.2. Let $b$ be a bounded measurable function, $V$ and $b'$ be bounded complex measures in $\mathbb{R}$ and $\sigma \ne 0$ a complex constant with a non-negative real part. Then the perturbation series (4.85) with $A = \sigma\Delta/2$ and $L$ from (4.110) converges in the norm of the space of bounded operators in $C_\infty(\mathbb{R})$, and $\Phi_t f$ represents a unique bounded solution to the mild equation (4.86) for any $f \in C_\infty(\mathbb{R})$.

Proof. Estimating $|G_{t\sigma}|$ roughly by $(2\pi t |\sigma|)^{-1/2}$ implies that

$$\int |G_{t\sigma}(x - y)|\, |V|(dy) \le (2\pi t |\sigma|)^{-1/2} \|V\|,$$

which yields (4.118) with $\omega = 1/2$. Estimate (4.123) is obtained similarly.

It is instructive to compare equation (4.109) with its dual equation

$$\dot g_t = \frac{1}{2} \Delta g_t - (\nabla, b(x) g_t(x)) + V(x) g_t(x), \quad g_0 = g. \tag{4.125}$$

In this case, $b$ stands under the operator $\nabla$, therefore it seems as if more regularity was required for $b$ than in (4.109). However, this is not the case. Working with (4.125) as with (4.109) in Theorem 4.8.1, we again need to obtain the estimate (4.117). But here, unlike the case of (4.109), the integration by parts transfers the derivatives to $G$ only. Therefore, there is no need to prove the estimate (4.123). Consequently, one obtains the following result:

Theorem 4.8.3. Let $b$ be a bounded measurable function and $V \in \mathcal{M}^{\mathbb{C}}_{R,\alpha}(\mathbb{R}^d)$, where $d > 1$, $\alpha \in (d - 2, d]$, and let $\sigma$ be a complex constant with a positive real part. Then the perturbation series (4.85) with $A = \sigma\Delta/2$ and

$$Lg = -(\nabla, b(x) g(x)) + V(x) g(x)$$

converges in the norm of the space of bounded operators in $C_\infty(\mathbb{R}^d)$, and $\Phi_t f$ represents a unique bounded solution to the corresponding mild form (4.86) of equation (4.125) for any $f \in C_\infty(\mathbb{R}^d)$. Moreover, the operators $\Phi_t$ extend to the space $\mathcal{M}^{\mathbb{C}}_{R,\alpha}(\mathbb{R}^d)$, so that for any $f \in \mathcal{M}^{\mathbb{C}}_{R,\alpha}(\mathbb{R}^d)$, $\Phi_t f$ solves (4.86) and satisfies the estimate

$$\|\Phi_t f\|_{C^{\mathbb{C}}_\infty(\mathbb{R}^d)} \le \tilde\kappa\, t^{-\omega} \|f\|_{\mathcal{M}^{\mathbb{C}}_{R,\alpha}(\mathbb{R}^d)}, \tag{4.126}$$

with any $\omega > (d - \alpha)/2$ and some $\tilde\kappa$.


4.9 Propagators and their generators

For a set $S$, a family of mappings $U^{t,r}$ from $S$ to itself, parametrized by the pair of numbers $r \le t$ (respectively $t \le r$) from a given finite or infinite interval of $\mathbb{R}$, is called a propagator (respectively a backward propagator) in $S$ if $U^{t,t}$ is the identity operator in $S$ for all $t$ and the following chain rule, or propagator equation, holds for $r \le s \le t$ (respectively for $t \le s \le r$):

$$U^{t,s} U^{s,r} = U^{t,r}.$$

If the mappings $U^{t,r}$ forming a backward propagator depend only on the differences $r - t$, then the family $T^t = U^{0,t}$ is clearly a semigroup.

Remark 74. In the literature, propagators are also referred to as two-parameter semigroups and evolutionary families.

The propagators of continuous linear operators in linear topological spaces $V$ (sometimes shortly referred to as propagators on $V$) arise naturally when solving linear Cauchy problems in $V$:

$$\dot f(t) = A_t f(t), \quad t \ge s, \tag{4.127}$$

with a given $f = f(s)$, or its backward version

$$\dot f(s) = -A_s f(s), \quad s \le t, \tag{4.128}$$

with a given $f = f(t)$, where $A_t$ is a family of densely defined operators in $V$, and where for $t = s$ in (4.127) (respectively $s = t$ in (4.128)) the derivative is understood as the right derivative (respectively left derivative).

One says that the propagator (respectively backward propagator) $U^{t,r}$ of continuous linear operators solves the Cauchy problem (4.127) (respectively the backward Cauchy problem (4.128)) on $D \subset V$ for a family of densely defined operators $A_t$, or equivalently, that a family of densely defined operators $A_t$ generates the propagator (respectively backward propagator) $U^{t,r}$ on $D$, if $D$ is a subspace contained in the domains of all $A_t$, $D$ is invariant under all $U^{t,s}$, and for any $f \in D$, $U^{t,s} f$ is a solution to (4.127) (respectively (4.128)), i.e.,

$$\frac{d}{dt} U^{t,s} f = A_t U^{t,s} f, \quad t \ge s, \tag{4.129}$$

respectively

$$\frac{d}{ds} U^{s,t} f = -A_s U^{s,t} f, \quad s \le t, \tag{4.130}$$

for the backward case.

Remark 75. When dealing with discrete approximations, see (5.3), or in applications to stochastic equations, see [146], it can be useful to extend the above notion to the case when (4.129) or (4.130) hold only outside some fixed finite or even countable subset of $t$ and $s$. Note that practically all results below remain valid under this extension.

It is clear that if the Cauchy problems (4.127) or (4.128) have unique solutions given by continuous operators, then these operators form a propagator, respectively a backward propagator. Shortly, we shall prove the converse statement using the method of duality.

It is natural to ask what happens with a backward propagator $U^{s,t}$ when it is differentiated with respect to $t$ and whether it is sufficient to assume (4.130) only for $s = t$.

Proposition 4.9.1. Let $U^{s,t}$ be a backward propagator on $V$ with some invariant subspace $D$ and

$$\left. \frac{d}{ds^-} U^{s,t} f \right|_{s=t} = -A_t f, \quad f \in D, \tag{4.131}$$

where $d/ds^-$ denotes the left derivative, for all $t$ and some family of operators $A_t$ with domains containing $D$. Then

$$\frac{d}{ds^-} U^{s,t} f = -A_s U^{s,t} f, \qquad \frac{d}{dt^-} U^{s,t} f = U^{s,t} A_t f, \tag{4.132}$$

for all $s < t$. If additionally the family $A_t$ is strongly continuous as a family of operators $D \to V$, i.e., if $A_t f$ is a continuous mapping $t \to V$ for any $f \in D$, then the second equation of (4.132) improves to

$$\frac{d}{dt} U^{s,t} f = U^{s,t} A_t f, \quad s \le t, \tag{4.133}$$

with the derivative being understood as the right derivative in the case of $s = t$.

Proof. The first equation in (4.132) follows from the chain rule for propagators and (4.131). The second equation in (4.132) is obtained by writing

$$\frac{1}{h} (U^{s,t} - U^{s,t-h}) f = \frac{1}{h} U^{s,t-h} (U^{t-h,t} f - f) = U^{s,t-h} \left[ \frac{1}{h} (U^{t-h,t} f - f) - A_t f \right] + U^{s,t-h} A_t f,$$

for $h > 0$. The first term tends to zero and the second to $U^{s,t} A_t f$, as $h \to 0$. This implies the second equation in (4.132). Finally, in order to get (4.133), Lemma 1.3.1 must be applied.

Remark 76. In order to deduce (4.130) from the first equation in (4.132) by Lemma 1.3.1, one has to require that the mapping $s \to A_s U^{s,t} f$ is continuous for all $f \in D$ and $s < t$.

Remark 77. Property (4.133) is crucial for the theory of propagators. It can be included into the definition of the generation of $U$ by $A$. However, this property is difficult to check in other ways than via the continuity assumptions of Proposition 4.9.1.

A propagator or a backward propagator $U^{t,r}$ of uniformly (for $t, r$ from a compact set) continuous linear operators on a locally convex linear space $V$ is called strongly continuous if the family $U^{t,r}$ depends strongly continuously on the pair of variables $(t, r)$, i.e., $U^{t,r} f$ is a continuous function $(t, r) \to V$ for any $f \in V$. As for the semigroups, it follows from the principle of uniform boundedness, Theorem 1.6.1, that if $V$ is a barrelled space, then the strongly continuous family of continuous operators $U^{t,r}$ is locally (that is, for $t, r$ from any compact set) equicontinuous.

However, unlike the semigroup case, the link between strongly continuous propagators and Cauchy problems (4.127) or (4.128) is much less straightforward in general. For instance, no analogues to Proposition 4.2.4 seem to exist. Similarly, Proposition 4.2.1 on exponential bounds does not extend to propagators, see, e.g., [42] for a review on this topic.

If a propagator (respectively a backward propagator) $U^{t,r}$ of continuous linear operators solves the Cauchy problem (4.127) or (4.128) on $D$, then $U^{t,r}$ depends strongly continuously on $r$ (respectively on $t$) on the closure of $D$ in $V$. In this case, one says that if $f$ belongs to the closure of $D$ in $V$, then $U^{t,r} f$ defines the generalized solution to the Cauchy problems (4.127) or (4.128).

As for semigroups, we shall mostly work with propagators in Banach spaces, although most of the abstract results have direct extensions to locally convex spaces. The following simple fact extends formula (4.8) and is crucial for comparing different propagators.

Proposition 4.9.2. Let $U_1^{s,t}$ and $U_2^{s,t}$ be two backward propagators of bounded linear operators in a Banach space $B$ with a common invariant subspace $D$. Let $A^1_t$ and $A^2_t$ be two families of operators in $B$ with domains containing $D$ such that $U_2^{s,t}$, $A^2_t$ satisfy (4.130) and $U_1^{s,t}$, $A^1_t$ satisfy (4.133). Moreover, let $U_1^{s,t}$ depend strongly continuously on $t$. Then

$$(U_1^{t,r} - U_2^{t,r}) f = \int_t^r U_1^{t,s} (A^1_s - A^2_s) U_2^{s,r} f \, ds, \quad f \in D. \tag{4.134}$$
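Identity (4.134) can be sanity-checked in the scalar case before reading the proof. The sketch below is an illustration under stated assumptions, not part of the argument: the generators are the hypothetical scalar functions `a1(s) = sin s` and `a2(s) = s/2`, so that the backward propagators are $U_i^{s,t} = \exp\{\int_s^t a_i(\tau)\,d\tau\}$ for $s \le t$, and the integral in (4.134) is approximated by Simpson's rule.

```python
import math

def simpson(func, lo, hi, n=400):
    """Composite Simpson rule with an even number n of subintervals."""
    h = (hi - lo) / n
    total = func(lo) + func(hi)
    for k in range(1, n):
        total += (4 if k % 2 else 2) * func(lo + k * h)
    return total * h / 3.0

# Two hypothetical scalar generator families A^1_s and A^2_s.
a1 = lambda s: math.sin(s)
a2 = lambda s: 0.5 * s

def u1(s, t):
    # Backward propagator of a1: exp(int_s^t sin) = exp(cos s - cos t), s <= t.
    return math.exp(math.cos(s) - math.cos(t))

def u2(s, t):
    # Backward propagator of a2: exp(int_s^t tau/2 dtau) = exp((t^2 - s^2)/4).
    return math.exp((t * t - s * s) / 4.0)

def duhamel_sides(t, r, f=1.0):
    """Return the left- and right-hand sides of (4.134) for t <= r."""
    lhs = (u1(t, r) - u2(t, r)) * f
    rhs = simpson(lambda s: u1(t, s) * (a1(s) - a2(s)) * u2(s, r) * f, t, r)
    return lhs, rhs
```

For example, `duhamel_sides(0.3, 1.7)` returns two numbers that agree up to the quadrature error.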

Proof. This follows from the observation that the function $U_1^{t,s} U_2^{s,r} f$ is differentiable in $s \in [t, r]$ for any $f \in D$, and from

$$\frac{d}{ds} (U_1^{t,s} U_2^{s,r} f) = U_1^{t,s} (A^1_s - A^2_s) U_2^{s,r} f. \tag{4.135}$$

In order to prove this formula, we write

$$\frac{1}{\delta} (U_1^{t,s+\delta} U_2^{s+\delta,r} f - U_1^{t,s} U_2^{s,r} f) = U_1^{t,s+\delta} \frac{1}{\delta} (U_2^{s+\delta,r} - U_2^{s,r}) f + \frac{1}{\delta} (U_1^{t,s+\delta} - U_1^{t,s}) U_2^{s,r} f.$$

The second term tends to $U_1^{t,s} A^1_s U_2^{s,r} f$, as $\delta \to 0$, because $U_1^{s,t}$, $A^1_t$ satisfy (4.133). The first term can be written as

$$U_1^{t,s+\delta} \left[ \frac{1}{\delta} (U_2^{s+\delta,r} - U_2^{s,r}) f + A^2_s U_2^{s,r} f \right] - U_1^{t,s+\delta} A^2_s U_2^{s,r} f,$$

which converges to $-U_1^{t,s} A^2_s U_2^{s,r} f$, because $U_1^{t,s}$ is strongly continuous and $U_2^{s,t}$, $A^2_s$ satisfy (4.130).

As a direct consequence, we obtain the following stability (or continuity) result for propagators.

Proposition 4.9.3. Under the assumptions of Proposition 4.9.2, assume that $D$ is itself a Banach space under the norm $\|.\|_D \ge \|.\|_B$ such that the operators $A^1_t$, $A^2_t$ are bounded as operators $D \to B$ and $U_2^{t,s}$ are bounded operators in $D$. Then

$$\|(U_1^{t,r} - U_2^{t,r}) f\|_B \le \|f\|_D \sup_{s \in [t,r]} \|U_1^{t,s}\|_{B \to B} \sup_{s \in [t,r]} \|U_2^{s,r}\|_{D \to D} \times \int_t^r \|A^1_s - A^2_s\|_{D \to B}\, ds. \tag{4.136}$$

This result can be used to find the derivative of a propagator or a semigroup with respect to a parameter. In fact, the following statement is a consequence of Proposition 4.9.2.

Proposition 4.9.4. Let $U_\alpha^{s,t}$ be a family, depending on a parameter $\alpha \in \mathbb{R}$, of strongly continuous backward propagators of bounded linear operators in a Banach space $B$ with a common invariant subspace $D$. Let $A^\alpha_t$ be the families of operators in $B$ with domains containing $D$ that generate the propagators $U_\alpha^{s,t}$ on $D$. Assume that $D$ is itself a Banach space under the norm $\|.\|_D \ge \|.\|_B$ such that the operators $A^\alpha_t$ are bounded as operators $D \to B$ and $U_\alpha^{t,s}$ are also bounded as operators in $D$. Finally assume that, for any $t$ and $g \in D$, $A^\alpha_t g$ is differentiable with respect to $\alpha$ as a mapping $\mathbb{R} \to B$, and that their derivatives are uniformly bounded for $t$ and $g$ from any bounded sets. Then the mappings $U_\alpha^{t,r} f$ are differentiable in $\alpha$ for any $f \in B$ and

$$\frac{\partial (U_\alpha^{t,r} f)}{\partial \alpha} = \int_t^r U_\alpha^{t,s} \frac{\partial A^\alpha_s}{\partial \alpha} (U_\alpha^{s,r} f)\, ds. \tag{4.137}$$

In particular, if $\exp\{tA_\alpha\}$ is a family of strongly continuous semigroups of linear operators in $B$ with the same domain $D$, then the derivative with respect to the parameter $\alpha$ is given by the formula

$$\frac{\partial}{\partial \alpha} \exp\{tA_\alpha\} = \int_0^t \exp\{(t - s) A_\alpha\} \frac{\partial A_\alpha}{\partial \alpha} \exp\{s A_\alpha\}\, ds, \tag{4.138}$$

which is far from being the same as if it were the derivative of a usual exponent.
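Formula (4.138) already shows its non-commutative character for $2 \times 2$ matrices. The following sketch is a numerical illustration under assumptions that are not in the text: the hypothetical family is $A_\alpha = A_0 + \alpha A_1$ with two fixed non-commuting matrices `A0` and `A1`, the matrix exponential is computed by its Taylor series, and the integral is approximated by Simpson's rule. The result is compared with a central finite difference in $\alpha$.

```python
def mat_mul(x, y):
    """Product of two 2x2 matrices given as nested lists."""
    return [[sum(x[i][k] * y[k][j] for k in range(2)) for j in range(2)]
            for i in range(2)]

def mat_add(x, y, cy=1.0):
    """Return x + cy * y for 2x2 matrices."""
    return [[x[i][j] + cy * y[i][j] for j in range(2)] for i in range(2)]

def scale(x, c):
    return [[c * v for v in row] for row in x]

def mat_exp(x, terms=40):
    """exp(x) for a small 2x2 matrix via the Taylor series sum x^n / n!."""
    result = [[1.0, 0.0], [0.0, 1.0]]
    power = [[1.0, 0.0], [0.0, 1.0]]
    for n in range(1, terms):
        power = scale(mat_mul(power, x), 1.0 / n)   # now equals x^n / n!
        result = mat_add(result, power)
    return result

A0 = [[0.0, 1.0], [-1.0, 0.0]]
A1 = [[1.0, 0.0], [0.0, -1.0]]   # equals dA_alpha/d(alpha); does not commute with A0

def generator(alpha):
    return mat_add(A0, A1, alpha)

def derivative_by_formula(alpha, t, n=200):
    """Right-hand side of (4.138), integrated by the composite Simpson rule."""
    a = generator(alpha)
    h = t / n
    total = [[0.0, 0.0], [0.0, 0.0]]
    for k in range(n + 1):
        s = k * h
        w = 1 if k in (0, n) else (4 if k % 2 else 2)
        term = mat_mul(mat_mul(mat_exp(scale(a, t - s)), A1), mat_exp(scale(a, s)))
        total = mat_add(total, term, w * h / 3.0)
    return total

def derivative_by_difference(alpha, t, eps=1e-5):
    """Central finite difference of exp(t A_alpha) in alpha."""
    plus = mat_exp(scale(generator(alpha + eps), t))
    minus = mat_exp(scale(generator(alpha - eps), t))
    return scale(mat_add(plus, minus, -1.0), 1.0 / (2 * eps))
```

For these particular matrices, $A_1 A_0 = -A_0 A_1$, so at $\alpha = 0$ and $t = 1$ the integral in (4.138) reduces to $\sin(1)\, A_1$, whereas the naive guess $t A_1 \exp\{t A_0\}$ has non-zero off-diagonal entries.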

Finally, let us mention that there exists a standard method for building semigroups from propagators by enlarging the state space. Namely, each strongly continuous propagator $U^{t,s}$ of bounded linear operators in a Banach space $B$ can be associated with a semigroup of operators in the space $C(\mathbb{R}, B)$, which is often referred to as the Howland semigroup and given by the formula

$$(T_t f)(x) = U^{x,x-t} f(x - t), \quad t \ge 0. \tag{4.139}$$

The semigroup property $T_{t+s} = T_t T_s$ is readily checked. Many results on evolution equations that arise from this link can be found, e.g., in [53]. See also [243] for an application to second-order equations of the type $\ddot u = Bu + f$. More generally, for any $\delta \in \mathbb{R}$ and $\alpha > 0$, the semigroup can be formed by the operators

$$(T_t f)(x) = U^{x+\delta,\,x+\delta-\alpha t} f(x - \alpha t), \quad t \ge 0. \tag{4.140}$$

Exercise 4.9.1. Check that these operators define a strongly continuous semigroup in $C_\infty(\mathbb{R}, B)$. Moreover, if $A_t$ generates the forward propagator $U^{t,s}$, then the generator of the semigroup $T_t$ is

$$(Lf)(x) = \alpha A_{x+\delta} f(x) - \alpha f'(x). \tag{4.141}$$

An extension of this dynamics with mixed fractional derivative instead of $f'$ will be presented in Chapter 8, see Theorems 8.7.2 and 8.8.1.
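The semigroup property of the operators (4.140) and the generator formula (4.141) can be probed numerically in the scalar case $B = \mathbb{R}$. The sketch below is a plausibility check under stated assumptions, not a solution of Exercise 4.9.1: the forward propagator is taken to be $U^{t,s} = \exp\{\int_s^t a(\tau)\,d\tau\}$ with the hypothetical choice $a(\tau) = \cos\tau$, and the test function `f` is likewise an arbitrary illustration.

```python
import math

def propagator(t, s):
    """Scalar forward propagator exp(int_s^t cos) = exp(sin t - sin s)."""
    return math.exp(math.sin(t) - math.sin(s))

def make_T(t, alpha=1.0, delta=0.0):
    """Return the operator T_t of (4.140), acting on functions f: R -> R."""
    def T(f):
        return lambda x: (propagator(x + delta, x + delta - alpha * t)
                          * f(x - alpha * t))
    return T

# Hypothetical test function vanishing at infinity, with its derivative.
f = lambda x: 1.0 / (1.0 + x * x)
f_prime = lambda x: -2.0 * x / (1.0 + x * x) ** 2

def generator_formula(x, alpha, delta):
    """Right-hand side of (4.141) with A_t = multiplication by cos(t)."""
    return alpha * math.cos(x + delta) * f(x) - alpha * f_prime(x)

def generator_by_difference(x, alpha, delta, h=1e-6):
    """Forward difference quotient (T_h f - f)(x) / h."""
    return (make_T(h, alpha, delta)(f)(x) - f(x)) / h
```

Composing `make_T(0.4, 2.0, 0.3)` with `make_T(0.9, 2.0, 0.3)` and comparing against `make_T(1.3, 2.0, 0.3)` at a few points illustrates $T_t T_s = T_{t+s}$; the chain rule for the propagator is what makes the two exponential factors combine.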

4.10 Well-posedness of linear Cauchy problems

The following result employs the method of duality for establishing the well-posedness of the linear Cauchy problem corresponding to a propagator (and, in particular, to a semigroup).

Theorem 4.10.1. Let $U^{t,r}$ be a strongly continuous backward propagator of bounded linear operators in a Banach space $B$ generated by a family of linear operators $A_t$ on a common dense domain $D$ that is invariant under all $U^{t,r}$. Moreover, let $A_t f$ be a continuous function $t \to B$ for any $f \in D$. Then the following holds:

(i) The family of dual operators $V^{s,t} = (U^{t,s})^*$ forms a $*$-weakly continuous (in $s, t$) propagator of bounded linear operators in $B^*$ (contractions if all $U^{t,r}$ are contractions), so that

$$\frac{d}{dt} V^{s,t} \xi = -V^{s,t} A^*_t \xi, \qquad \frac{d}{ds} V^{s,t} \xi = A^*_s V^{s,t} \xi, \quad t \le s \le r, \tag{4.142}$$

holds weakly on $D$, i.e., say, for the second equation

$$\frac{d}{ds} (f, V^{s,t} \xi) = (A_s f, V^{s,t} \xi), \quad t \le s \le r, \quad f \in D; \tag{4.143}$$

(ii) $V^{s,t} \xi$ is the unique solution to the Cauchy problem of equation (4.143), i.e., if $\xi_t = \xi$ for a given $\xi \in B^*$ and $\xi_s$, $s \in [t, r]$, is a $*$-weakly continuous family in $B^*$ satisfying

$$\frac{d}{ds} (f, \xi_s) = (A_s f, \xi_s), \quad t \le s \le r, \quad f \in D, \tag{4.144}$$

then $\xi_s = V^{s,t} \xi$ for $t \le s \le r$;

(iii) $U^{s,r} f$ is the unique solution to the inverse Cauchy problem (4.130), i.e., if $f_r = f$, $f_s \in D$ for $s \in [t, r]$ and satisfies the equation

$$\frac{d}{ds} f_s = -A_s f_s, \quad t \le s \le r, \tag{4.145}$$

then $f_s = U^{s,r} f$.

Proof. Statement (i) is a direct consequence of duality and the equations (4.130) and (4.133).

(ii) Let $g(s) = (U^{s,r} f, \xi_s)$ for a given $f \in D$. We will show that $g'(s) = 0$ for all $s$. In fact, we have

$$[(U^{s+\delta,r} f, \xi_{s+\delta}) - (U^{s,r} f, \xi_s)]/\delta = (U^{s+\delta,r} f - U^{s,r} f, \xi_{s+\delta})/\delta + (U^{s,r} f, \xi_{s+\delta} - \xi_s)/\delta,$$

and the second term tends to $(A_s U^{s,r} f, \xi_s)$, as $\delta \to 0$. The first term can be written as

$$-(A_s U^{s,r} f, \xi_{s+\delta}) + \left( \frac{1}{\delta} (U^{s+\delta,r} - U^{s,r}) f + A_s U^{s,r} f, \; \xi_{s+\delta} \right),$$

where the first term tends to $-(A_s U^{s,r} f, \xi_s)$, as $\delta \to 0$, since $\xi$ is $*$-weakly continuous, and the second term tends to zero, because the $\xi_s$ are uniformly bounded in $B^*$. Therefore, we find $g'(s) = 0$ as claimed. This implies

$$g(r) = (f, \xi_r) = g(t) = (U^{t,r} f, \xi_t),$$

which shows that $\xi_r$ is uniquely defined. Similarly, we can analyse any other point $r' \in (s, r)$.

(iii) Similar to (ii), it is sufficient to prove the equation

$$\frac{d}{ds} (f_s, V^{s,t} \xi) = 0.$$

To this end, let us write

$$(f_{s+\delta} - f_s, V^{s+\delta,t} \xi)/\delta = \left( \frac{1}{\delta} (f_{s+\delta} - f_s) + A_s f_s, \; V^{s+\delta,t} \xi \right) - (A_s f_s, V^{s+\delta,t} \xi).$$

The first term tends to zero, as $\delta \to 0$, because $\|(f_{s+\delta} - f_s)/\delta + A_s f_s\| \to 0$ and the family $V^{s+\delta,t} \xi$ is uniformly bounded in $B^*$. The second term tends to $-(A_s f_s, V^{s,t} \xi)$ because of the $*$-weak continuity of $V^{s,t}$. Consequently,

$$\lim_{\delta \to 0} [(f_{s+\delta}, V^{s+\delta,t} \xi)/\delta - (f_s, V^{s,t} \xi)/\delta] = -(A_s f_s, V^{s,t} \xi) + \lim_{\delta \to 0} (f_s, V^{s+\delta,t} \xi - V^{s,t} \xi)/\delta = 0,$$

as required.




Sometimes, not only the continuity, but also quantitative measures of the regularity of the dual propagator are required. For that purpose, a bit more structure is usually assumed.

Proposition 4.10.1. Under the assumptions of Theorem 4.10.1, suppose that $D$ is itself a Banach space under some norm $\|.\|_D$ such that $\|.\|_D \ge \|.\|_B$ and

$$\|A_s\|_{D \to B} \le \bar A, \qquad \|U^{s,r}\|_{B \to B} \le U_B,$$

for all $s, r$ and some constants $\bar A$, $U_B$. Then the dual curve $V^{s,t} \xi$ is Lipschitz-continuous in $s$ in the norm topology of $D^*$:

$$\|V^{s+\delta,t} \xi - V^{s,t} \xi\|_{D^*} \le \delta\, U_B \bar A\, \|\xi\|_{B^*}. \tag{4.146}$$

If additionally $\|U^{s,r}\|_{D \to D} \le U_D$ with a constant $U_D$, then also

$$\|V^{s,t} \xi - V^{s,t \pm \delta} \xi\|_{D^*} \le \delta\, U_D \bar A\, \|\xi\|_{B^*}. \tag{4.147}$$

Proof. We have

$$\|V^{s+\delta,t} \xi - V^{s,t} \xi\|_{D^*} = \sup_{\|f\|_D \le 1} |(f, V^{s+\delta,t} \xi - V^{s,t} \xi)| = \sup_{\|f\|_D \le 1} |(U^{t,s+\delta} f - U^{t,s} f, \xi)| \le \delta\, U_B \bar A\, \|\xi\|_{B^*}.$$

(4.147) is proved in a similar way.

Let us now extend the uniqueness result to affine equations of the form

$$\frac{d}{ds} f_s = -A_s f_s - g_s, \quad t \le s \le r, \tag{4.148}$$

with a given $f_r$ and a continuous curve $s \to g_s \in B$.

Proposition 4.10.2. Under the assumptions of Theorem 4.10.1, suppose that $s \to g_s$ is a continuous mapping $s \to B$ and $g_s \in D$ for all $s$ with uniformly bounded $\|g_s\|_D$. Then equation (4.148) has a unique solution $f_s$ for any boundary condition $f = f_r \in D$, and we have

$$f_s = U^{s,r} f + \int_s^r U^{s,t} g_t \, dt. \tag{4.149}$$

Proof. Let us assume that there are two solutions to (4.148) with the same boundary condition. Then their difference satisfies the equation $f' = -A_s f$ with a vanishing boundary condition. In this case, Theorem 4.10.1 implies that $f$ vanishes. Therefore, there can be at most one solution to (4.148) with a given boundary condition. Differentiating (4.149) with respect to $s$ (this is where we use $g_s \in D$!), one finds that it satisfies (4.148).
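The differentiation step at the end of this proof can be illustrated in the scalar case. The sketch below rests on assumptions that are not in the text: the generator is a hypothetical constant `A`, so that the backward propagator is $U^{s,t} = e^{(t-s)A}$ for $s \le t$, the source is the hypothetical curve $g_t = \cos t$, the integral in (4.149) is approximated by Simpson's rule, and the derivative in $s$ by a central difference.

```python
import math

A = -0.7                      # hypothetical constant generator, A_s = A
g = lambda t: math.cos(t)     # hypothetical source term g_t

def simpson(func, lo, hi, n=400):
    """Composite Simpson rule with an even number n of subintervals."""
    h = (hi - lo) / n
    total = func(lo) + func(hi)
    for k in range(1, n):
        total += (4 if k % 2 else 2) * func(lo + k * h)
    return total * h / 3.0

def U(s, t):
    """Scalar backward propagator exp((t - s) A) for s <= t."""
    return math.exp((t - s) * A)

def f_affine(s, r, f_r):
    """Right-hand side of (4.149): U^{s,r} f + int_s^r U^{s,t} g_t dt."""
    return U(s, r) * f_r + simpson(lambda t: U(s, t) * g(t), s, r)

def residual(s, r, f_r, h=1e-4):
    """d/ds f_s + A f_s + g_s, which should vanish according to (4.148)."""
    dfds = (f_affine(s + h, r, f_r) - f_affine(s - h, r, f_r)) / (2 * h)
    return dfds + A * f_affine(s, r, f_r) + g(s)
```

The boundary term of the integral contributes $-g_s$ and the $s$-dependence of $U^{s,t}$ contributes $-A_s$ times the integral, which is exactly the structure of (4.148); `residual(0.5, 2.0, 1.5)` is zero up to discretization error.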

Remark 78. Of course, one obtains the direct analogues of Theorem 4.10.1 and Proposition 4.10.2 for the forward propagators by reverting the direction of time.

The dual analogue of (4.148), written in the weak form, is the equation

$$\frac{d}{ds} (f, \xi_s) = (A_s f, \xi_s) + (f, \nu_s), \quad s \ge t, \quad f \in D,$$

with some given curve $\nu_s$. This equation often appears in the slightly modified form

$$\frac{d}{ds} (f, \xi_s) = (A_s f, \xi_s) + (L_s f, \nu_s), \quad s \ge t, \quad f \in D, \tag{4.150}$$

with $L_s : B \to B$ a family of bounded operators. The next statement is the dual version of Proposition 4.10.2.

Proposition 4.10.3. Under the assumptions of Theorem 4.10.1, suppose that $s \to \nu_s$ is a continuous curve in $B^*$ and $s \to L_s$ is a continuous curve in $L(B, B)$. Then the weak equation (4.150) in $B^*$ with the initial condition $\xi_t = \xi$ has a unique solution $\xi_s$ given by the formula

$$(f, \xi_s) = (U^{t,s} f, \xi_t) + \int_t^s (L_r U^{r,s} f, \nu_r)\, dr. \tag{4.151}$$

Proof. The uniqueness follows directly by subtracting two solutions and referring to the uniqueness of the solutions to (4.144). Therefore, one only has to show that $\xi_s$ given by the explicit formula (4.151) solves the Cauchy problem for equation (4.150). This, however, follows by differentiation.

4.11 The operator-valued Riccati equation

As an insightful example for the application of linear propagator methods, let us analyse an important class of quadratic equations, the so-called Riccati equations.

As usual, let $B$ and $B^*$ be a real Banach space and its dual. Let us say that a densely defined operator $C$ from $B$ to $B^*$ (possibly unbounded) is symmetric (respectively positive) if $(Cv, w) = (Cw, v)$ (respectively if additionally $(Cv, v) \ge 0$) for all $v, w$ from the domain of $C$. Let us denote the space of bounded symmetric operators $B \to B^*$ (respectively its convex subset of positive operators) by $SL(B, B^*)$ (respectively $SL^+(B, B^*)$). Analogous definitions are applied to the operators $B^* \to B$. The notion of positivity of course induces a (partial) order relation on the space of symmetric operators.

For the study of equations in $SL(B, B^*)$ or $SL(B^*, B)$, it is convenient to introduce a special norm therein:

$$\|C\|_S = \sup\{ |(Cv, v)| : |v| \le 1 \}.$$

Proposition 4.11.1. $\|C\|_S$ is in fact a norm and it is equivalent on $SL(B, B^*)$ to the usual norm of the space $L(B, B^*)$. More precisely,

$$\|C\|_S \le \|C\|_{L(B,B^*)} \le 3 \|C\|_S. \tag{4.152}$$

Proof. Homogeneity, positivity and the triangle inequality are straightforward consequences of the definition. Next, for any $C \in SL(B, B^*)$, we find

$$(Cv, w) = \frac{1}{2} [(C(v + w), v + w) - (Cv, v) - (Cw, w)]. \tag{4.153}$$

Consequently, if $\|C\|_S = 0$, then $(Cv, w) = 0$ for all $v, w$, and thus $C = 0$. Therefore, $\|C\|_S$ satisfies all properties of a norm. Next, the l.h.s. inequality of (4.152) follows from the definition. In order to get the r.h.s. inequality of (4.152), we use (4.153) to obtain

$$\|C\|_{L(B,B^*)} = \sup\{ |(Cv, w)| : |w|, |v| \le 1 \} \le \frac{1}{2} \sup\{ |(C(v + w), v + w)| : |w|, |v| \le 1 \} + \sup\{ |(Cv, v)| : |v| \le 1 \} \le 3 \|C\|_S,$$

as required.

Let us start with the simplest quadratic equation

$$\dot\pi(t) = -\pi(t) C(t, s) \pi(t), \quad t \ge s, \tag{4.154}$$

for an unknown operator-valued function $\pi(t) \in SL^+(B^*, B)$.

Remark 79. Before touching operator-valued evolutions, it is useful to get a clear understanding of the simpler situation in $\mathbb{R}$. Namely, look at the equation $\dot x = -x^2$. For $x_0 > 0$, the solution is globally defined and tends to zero as $t \to \infty$. For $x_0 < 0$, the solution explodes, i.e., tends to $-\infty$, in a finite time.

Exercise 4.11.1. Solve the equation $\dot x = -x^2$ explicitly to confirm the claim in the above remark.

Proposition 4.11.2. Suppose that $C(t, s)$, $t \ge s$, is a family of elements in $SL^+(B, B^*)$ such that $C(t, s)$ are strongly continuous in $t$ for $t > s$, with an integrable singularity at $t = s$ at most, i.e., with $\int_s^t \|C(\tau, s)\|\, d\tau < \infty$. Then the following holds:

(i) For any $\pi_s \in SL^+(B^*, B)$ there exists a unique global strongly continuous family of operators $\pi(t) \in SL^+(B^*, B)$, $t \ge s$, such that

$$\pi(t) = \pi_s - \int_s^t \pi(\tau) C(\tau, s) \pi(\tau)\, d\tau. \tag{4.155}$$

(Note that the integral is defined in the norm topology.)

(ii) $\pi(t) \le \pi_s$ and the image of $\pi(t)$ belongs to the image of $\pi_s$ for all $t \ge s$.

(iii) Equation (4.154) holds in the norm topology of $L(B, B^*)$ for $t > s$.
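The scalar picture of Remark 79 can be made concrete with a short numerical experiment. The sketch below is only a hedged illustration: it assumes the candidate closed form $x(t) = x_0 / (1 + x_0 t)$ (which readers attempting Exercise 4.11.1 may wish to derive themselves first) and compares it with a classical Runge-Kutta integration of $\dot x = -x^2$. All function names are hypothetical.

```python
def riccati_closed_form(x0, t):
    """Candidate solution x(t) = x0 / (1 + x0 * t) of x' = -x^2, x(0) = x0."""
    return x0 / (1.0 + x0 * t)

def riccati_rk4(x0, t, steps=2000):
    """Integrate x' = -x^2 from 0 to t with the classical Runge-Kutta scheme."""
    f = lambda x: -x * x
    h = t / steps
    x = x0
    for _ in range(steps):
        k1 = f(x)
        k2 = f(x + 0.5 * h * k1)
        k3 = f(x + 0.5 * h * k2)
        k4 = f(x + h * k3)
        x += h * (k1 + 2 * k2 + 2 * k3 + k4) / 6.0
    return x

def blow_up_time(x0):
    """Time at which 1 + x0 * t vanishes; None if x0 >= 0 (global solution)."""
    return -1.0 / x0 if x0 < 0 else None
```

For $x_0 > 0$ the denominator never vanishes and the solution decays like $1/t$; for $x_0 < 0$ it vanishes at $t = -1/x_0$, which is the finite explosion time mentioned in the remark.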

Proof. The uniqueness follows from Gronwall's lemma in Proposition 9.1.4. The existence of a positive solution for small $t - s$ follows from the explicit formula

$$\pi(t) = \pi_s \left( 1 + \int_s^t C(\tau, s)\, d\tau \, \pi_s \right)^{-1} = \left( 1 + \pi_s \int_s^t C(\tau, s)\, d\tau \right)^{-1} \pi_s. \tag{4.156}$$

In fact, for small $t - s$ the inverse operators in this formula are well defined, because $(1 + A)^{-1}$ exists whenever $\|A\|_{L(B,B)} < 1$. In order to check that this is indeed a solution, we can use (1.60), which implies, e.g., for the first formula

$$\dot\pi(t) = -\pi_s \left( 1 + \int_s^t C(\tau, s)\, d\tau \, \pi_s \right)^{-1} C(t, s)\, \pi_s \left( 1 + \int_s^t C(\tau, s)\, d\tau \, \pi_s \right)^{-1} = -\pi(t) C(t, s) \pi(t),$$

as required.

Remark 80. In order to derive (or to guess) formula (4.156), one can assume that $\pi_t^{-1}$ exists. In this case, (4.154) is equivalent to the equation $\frac{d}{dt} \pi^{-1}(t) = C(t, s)$.

It remains to show the global existence. To this end, it is natural to try to employ some kind of accretivity. As it turns out, the norm $\|.\|_S$ is convenient for that purpose. More precisely, for any $t > s$ there exists a decomposition $s = t_0 < t_1 < \cdots < t_n = t$ of the interval $[s, t]$ such that $3 \|\pi_s\|_S \left\| \int_{t_j}^{t_{j+1}} C(\tau, s)\, d\tau \right\|$ […] $s$, with norms that have at most an integrable singularity at $s = t$. Then for any $R \in SL^+(B^*, B)$ with the image belonging to $D$, the family

$$R_t = U^{t,s} \pi(t) (U^{t,s})^*, \quad t \ge s, \tag{4.158}$$

where $\pi(t)$ is the solution to (4.154) given by Proposition 4.11.2 with $\pi_s = R$ and $C(t, s) = (U^{t,s})^* C(t) U^{t,s}$, is a continuous function $t \to SL^+(B^*, B)$, $t \ge s$, in the strong operator topology, the images of all $R_t$ belong to $D$, $R_t$ depends Lipschitz-continuously on $R$ and satisfies the Riccati equation weakly, i.e.,

$$\frac{d}{dt} (R_t v, w) = (A(t) R_t v, w) + (v, A(t) R_t w) - (R_t C(t) R_t v, w) \tag{4.159}$$

for all $v, w \in B^*$. If $R$ extends to a bounded operator $D^* \to D$, then $R_t$ satisfies the Riccati equation (4.157).

Proof. Everything follows from inspecting the given explicit formula. In fact, this formula implies that

$$\frac{d}{dt} R_t v = A_t U^{t,s} \pi(t) (U^{t,s})^* v - U^{t,s} \pi(t) C(t, s) \pi(t) (U^{t,s})^* v + U^{t,s} \pi(t) (U^{t,s})^* A^*_s v.$$

The only problem in this formula is rooted in its last term, since $A^*_s v$ (and hence $(U^{t,s})^* A^*_s v$) may belong to $D^*$, where $R$ and thus $\pi(t)$ may be not defined. This is why one generally resorts to weak solutions.

Remark 81. The above results were formulated for the usual forward Cauchy problem. Of course, everything remains valid for the backward Riccati equation

$$\dot R_s = -A(s) R_s - (A(s) R_s)^* + R_s C(s) R_s, \quad s \le t, \tag{4.160}$$

if $A(t)$ is assumed to generate the backward propagator $U^{s,t}$, $s \le t$, in $B$. Then the family $R_s = U^{s,t} \pi_s (U^{s,t})^*$ solves (4.160) with the terminal condition $R_t = R$, where $\pi_s$ solves the reduced backward Riccati equation $\dot\pi_s = \pi_s C(s, t) \pi_s$ with the same initial condition $\pi_t = R$ and $C(s, t) = (U^{s,t})^* C(s) U^{s,t}$.
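The reduction of the Riccati equation to the simple quadratic equation (4.154) via formula (4.158) can be traced in the scalar forward case. The sketch below is an illustration under assumptions that are not in the text: $A(t) = a$ and $C(t) = c$ are hypothetical real constants, so that $U^{t,s} = e^{a(t-s)}$, $C(t, s) = c\, e^{2a(t-s)}$, and the weak equation (4.159) becomes $\dot R_t = 2 a R_t - c R_t^2$. The closed form obtained from (4.156) and (4.158) is compared with a Runge-Kutta integration.

```python
import math

def riccati_via_propagator(t, s, r0, a, c):
    """R_t = U pi(t) U with U = exp(a (t - s)): scalar instance of (4.158)."""
    u = math.exp(a * (t - s))
    # int_s^t C(tau, s) dtau with C(tau, s) = c * exp(2 a (tau - s)).
    if a != 0:
        integral = c * (math.exp(2 * a * (t - s)) - 1.0) / (2 * a)
    else:
        integral = c * (t - s)
    pi_t = r0 / (1.0 + r0 * integral)   # scalar form of (4.156) with pi_s = r0
    return u * pi_t * u

def riccati_rk4(t, s, r0, a, c, steps=4000):
    """Integrate R' = 2 a R - c R^2 from s to t by classical Runge-Kutta."""
    f = lambda r: 2 * a * r - c * r * r
    h = (t - s) / steps
    r = r0
    for _ in range(steps):
        k1 = f(r)
        k2 = f(r + 0.5 * h * k1)
        k3 = f(r + 0.5 * h * k2)
        k4 = f(r + h * k3)
        r += h * (k1 + 2 * k2 + 2 * k3 + k4) / 6.0
    return r
```

The design mirrors the operator argument: the linear part is absorbed into the propagator, and only the purely quadratic equation for $\pi(t)$ has to be solved explicitly.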

In the case $B = C_\infty(\mathbb{R}^d)$, $B^* = \mathcal{M}(\mathbb{R}^d)$, the elements $C$ of $SL(B, B^*)$ are usually integral operators with symmetric kernels $C(x, y)$, so that the corresponding bilinear form can be written as

$$L_C(v, w) = (Cv, w) = \int\!\!\int C(x, y) v(x) w(y)\, dx\, dy. \tag{4.161}$$

This extends to linear forms on the space of symmetric functions of two variables as

$$L_C(R) = \int\!\!\int C(x, y) R(x, y)\, dx\, dy. \tag{4.162}$$

The corresponding bilinear form of the dual operator $C^*$ from $SL(B^*, B)$ is then analogously given by

$$(C\mu, \nu) = \int\!\!\int C(x, y)\, \mu(dx)\, \nu(dy).$$

Note that the requirement of $C \in SL(B, B^*)$ being positive turns into the requirement of $(Cv, v)$ of (4.161) being positive for any $v$. Symmetric functions $C(x, y)$ that satisfy this property are called positive-definite kernels (real-valued in our case). In applications, such kernels arise as correlation functions of stationary random fields, where their structure is highly developed and well understood.

4.12 An infinite-dimensional diffusion equation in variational derivatives As an example for the application of the Riccati equation, let us analyse the backward propagators in C(M(Rd )) specified by a second-order operator in the variational derivatives of the form   1 δ2F δF , Y + LC(t) 2 , (4.163) Ot F (Y ) = A(t) δY (.) 2 δY (., .) where A(t) and C(t) are as in Proposition 4.11.3 with B = C∞ (Rd ), and LC(t) is the bilinear form corresponding to C(t) given by (4.161) and (4.162). This equation is an infinite-dimensional extension of Gaussian diffusion (4.39) and can therefore be referred to as measure-valued Ornstein–Uhlenbeck diffusion. Equations of this type naturally appear in the analysis of fluctuations of systems with a large number of particles (or agents) around their law-of-large-number-limits that are specified by the kinetic equations considered in Chapter 7. Unlike the method of Green functions employed in Section 4.3, we shall analyse the evolution generated by Ot by looking at the evolution of Gaussian packets, i.e., functions of Y ∈ M(Rd ) of the form   1 Ft (Y ) = FRt ,μt ,γt (Y ) = exp − (Rt (Y − μt ), Y − μt ) + γt , (4.164) 2


where $R_t \in SL^+(B^*, B)$ are given by a symmetric bounded positive-definite kernel $R_t(x, y)$, so that

$$(R_t Y, Y) = \int\!\!\int R_t(x, y)\,Y(dx)\,Y(dy),$$

and $\mu_t \in \mathcal{M}(\mathbb{R}^d)$, $\gamma_t \in C(\mathbb{R}^d)$. It follows that

$$\frac{\delta F_t}{\delta Y(z)} = -R_t(Y - \mu_t)(z)\,F_t(Y) = -\int R_t(z, y)(Y - \mu_t)(dy)\,F_t(Y),$$

$$\frac{\delta^2 F_t}{\delta Y(z)\,\delta Y(w)} = \big[R_t(Y - \mu_t)(z)\,R_t(Y - \mu_t)(w) - R_t(z, w)\big]F_t(Y),$$

and therefore

$$O_t F_t(Y) = \Big[-(A(t)R_t(Y - \mu_t), Y) + \frac12 \big(C(t)R_t(Y - \mu_t), R_t(Y - \mu_t)\big) - \frac12 L_{C(t)}R_t\Big]F_t(Y)$$

$$= -F_t(Y)\int\!\!\int [A(t)R_t(., y)](z)\,(Y - \mu_t)(dy)\,Y(dz) + \frac12 F_t(Y)\int\!\!\int C(t, x, y)\Big[\int R_t(x, v)(Y - \mu_t)(dv) \int R_t(y, w)(Y - \mu_t)(dw) - R_t(x, y)\Big] dx\,dy.$$

On the other hand,

$$\dot F_t(Y) = \Big[-\frac12 (\dot R_t(Y - \mu_t), Y - \mu_t) + (\dot\mu_t, R_t(Y - \mu_t)) + \dot\gamma_t\Big]F_t(Y).$$

Therefore, the backward equation $\dot F_t(Y) = -O_t F_t(Y)$ reads

$$\frac12 (\dot R_t(Y - \mu_t), Y - \mu_t) - (\dot\mu_t, R_t(Y - \mu_t)) - \dot\gamma_t = -(A(t)R_t(Y - \mu_t), Y - \mu_t) - (A(t)R_t(Y - \mu_t), \mu_t) + \frac12 (C(t)R_t(Y - \mu_t), R_t(Y - \mu_t)) - \frac12 L_{C(t)}R_t.$$

This equation is satisfied if the following conditions hold:

$$(\dot R_t v, v) = -2(A(t)R_t v, v) + (C(t)R_t v, R_t v), \qquad \dot\mu_t = A^*(t)\mu_t, \qquad \dot\gamma_t = \frac12 L_{C(t)}R_t. \tag{4.165}$$


The first equation (written in the weak form in order to hold for any $v$) is a consequence of the backward Riccati equation

$$(\dot R_t v, w) = -(A(t)R_t v, w) - (A(t)R_t w, v) + (R_t C(t)R_t v, w).$$

For the sake of definiteness, let $D = C^1_\infty(\mathbb{R}^d)$ or $D = C^2_\infty(\mathbb{R}^d)$. The following result is a consequence of the above calculations and Proposition 4.11.3 (and Remark 81).

Proposition 4.12.1. Assume that

(i) $A(t)$ and $C(t)$ are bounded strongly continuous families of operators $D \to B$ and $D \to B^*$, respectively;

(ii) $A(t)$ generates a backward propagator $U^{s,t}$ in $B$ on the common invariant domain $D$, and $U^{t,s}$ is strongly continuous both in $B$ and $D$;

(iii) all $C(t)$ are positive and $C(s, t) = (U^{s,t})^* C(s) U^{s,t} \in SL^+(B, B^*)$ for $t > s$, with the norms having an integrable singularity at $s = t$ at most.

Then for any $R \in SL^+(B^*, B)$ with the image belonging to $D$, all three equations in (4.165) are well posed and therefore specify a well-defined evolution of Gaussian packets that solves the backward Ornstein–Uhlenbeck equation $\dot F_t(Y) = -O_t F_t(Y)$.

From the equations (4.165) and the formulae of Proposition 4.11.3, it additionally follows that the backward evolution specified by the equation $\dot F_t(Y) = -O_t F_t(Y)$ is positivity-preserving and non-expansive (a contraction) on the set of linear combinations of the above Gaussian packets.

Exercise 4.12.1. Prove this claim.
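The algebra leading to (4.165) can be sanity-checked in a finite-dimensional analogue. The sketch below is an illustration under stated assumptions, not the construction of the text: measures $Y$ are replaced by vectors in $\mathbb{R}^n$, variational derivatives by ordinary gradients and Hessians, $L_C$ by the trace pairing, and $A^*$ by the transpose. All matrices are random and hypothetical. The check confirms that, with the right-hand sides of (4.165), the logarithmic time derivative of the Gaussian packet cancels $O_t F / F$ at an arbitrary point $Y$.

```python
import numpy as np

rng = np.random.default_rng(1)
n = 3

# Hypothetical finite-dimensional data (stand-ins for A(t), C(t), R_t, mu_t).
A = rng.normal(size=(n, n))
M = rng.normal(size=(n, n))
C = M @ M.T                      # symmetric positive matrix
N = rng.normal(size=(n, n))
R = N @ N.T + np.eye(n)          # symmetric positive-definite matrix
mu = rng.normal(size=n)
Y = rng.normal(size=n)           # arbitrary evaluation point

# Right-hand sides of (4.165); the first one is written in symmetrized form,
# since (R' v, v) = -2 (A R v, v) + (C R v, R v) for all v.
R_dot = -(A @ R + R @ A.T) + R @ C @ R
mu_dot = A.T @ mu
gamma_dot = 0.5 * np.trace(C @ R)

d = Y - mu
g = R @ d                        # so that grad F = -g F

# (d/dt) log F_t(Y) for F = exp(-(R d, d)/2 + gamma)
dlogF_dt = -0.5 * d @ R_dot @ d + mu_dot @ g + gamma_dot

# (O_t F)/F with O_t F = (A grad F, Y) + (1/2) tr(C Hess F)
OF_over_F = -(Y @ A @ g) + 0.5 * (g @ C @ g) - 0.5 * np.trace(C @ R)

# The backward equation F' = -O_t F means that this residual vanishes.
residual = dlogF_dt + OF_over_F
print(residual)
```

The residual vanishes up to rounding for every choice of `Y`, which mirrors the fact that (4.165) was obtained by matching the quadratic, linear and constant terms in $Y - \mu_t$.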

4.13 Perturbation theory for propagators

The next result extends Theorem 4.6.1 to propagators.

Theorem 4.13.1. Let $U^{t,r}$ be a strongly continuous backward propagator of bounded linear operators in a Banach space $B$, generated by a family of linear operators $A_t$ on a common dense domain $D$, such that (4.130) holds. Let $L_t$ be a family of bounded operators in $B$ that depend strongly continuously on $t$. Then the series

$$\Phi^{t,r} = U^{t,r} + \sum_{m=1}^{\infty} \int_{t \le s_1 \le \cdots \le s_m \le r} U^{t,s_1} L_{s_1} U^{s_1,s_2} \cdots L_{s_m} U^{s_m,r}\, ds_1 \cdots ds_m \tag{4.166}$$

is well defined as a converging series of bounded operators in $B$. The operators $\Phi^{t,r}$ form a strongly continuous backward propagator in $B$ such that $\Phi^{t,r} f$ is the unique bounded solution to the integral equation

$$\Phi^{t,r} f = U^{t,r} f + \int_t^r U^{t,s} L_s \Phi^{s,r} f\, ds, \tag{4.167}$$

with a given $f_r = f$. Moreover,

$$\frac{d}{dt}\Big|_{t=r} \Phi^{t,r} f = -(A_r + L_r) f \tag{4.168}$$

for any $f \in D$.

Remark 82. Unlike for semigroups, one cannot deduce directly from (4.168) that $D$ is invariant under $\Phi^{t,r}$ and that this propagator solves the Cauchy problem for $A_t + L_t$ there.

Proof. This is a straightforward extension of Theorem 4.6.1. The only difference to note is that in order to conclude that

$$\frac{d}{dt}\Big|_{t=r} \int_t^r U^{t,s} L_s U^{s,r} f\, ds = -L_r f,$$

one must use the continuous dependence of $L_s$ on $s$ and the observation that if $A^{(1)}_s, \ldots, A^{(k)}_s$ are $k$ families of bounded strongly continuous operators, then the family of compositions $A^{(1)}_s \circ \cdots \circ A^{(k)}_s$ is also bounded and strongly continuous.

As for semigroups, equation (4.167) is often referred to as the mild form of the equation $\frac{d}{dt}\Phi^{t,r} f = (A + L)\Phi^{t,r} f$ and the solutions to (4.86) are referred to as the mild solutions of the equation $\frac{d}{dt}\Phi^{t,r} f = (A + L)\Phi^{t,r} f$.

In order to conclude further (from the conditions of Theorem 4.13.1) that $\Phi^{t,r}$ solves the Cauchy problem for $A_t + L_t$ on $D$, one has to know that $D$ is invariant under $\Phi^{t,r}$. This, however, does not follow from (4.166), even if we assume that $D$ is invariant under all $L_t$. A way to overcome this difficulty arises once the domain $D$ has itself a natural Banach space structure, given by a certain norm $\|.\|_D \ge \|.\|_B$ (similar to the situation in Proposition 4.9.3). This is often the case for concrete equations of practical interest. In this case, in order to check that $A_s U^{s,t} f$ is continuous, it is sufficient to prove that $U^{s,t}$ is a strongly continuous family of bounded operators in $D$ and that $A_s$ is a strongly continuous family of bounded operators $D \to B$ (not $B \to B$ of course!). With the help of such structure, one obtains the following improvement of Theorem 4.13.1.

Proposition 4.13.1. Under the conditions of Theorem 4.13.1, assume additionally that $D$ is itself a Banach space under another norm $\|.\|_D \ge \|.\|_B$ such that $A_s$ is a strongly continuous family of bounded operators $D \to B$ and $U^{s,t}$ is a strongly continuous family of operators in $D$. Moreover, assume that the family $L_t$ is strongly continuous as a family of operators $B \to B$ and is bounded as a family of operators $D \to D$. Then $D$ is invariant under $\Phi^{t,r}$, the propagator $\Phi^{t,r}$ is generated by the family $A_t + L_t$ on $D$ and $(A_s + L_s)\Phi^{s,t} f$ is a continuous function $(s, t) \to B$ for any $f \in D$.

Proof. It follows that the series (4.166) converges in the norm of the Banach space $D$. Therefore, $D$ is invariant under $\Phi^{t,r}$. The continuity assumptions are specifically designed in order to ensure the continuity of $(A_s + L_s)\Phi^{s,t} f$ as a function $(s, t) \to B$ for any $f \in D$. The rest follows from Theorem 4.13.1, Proposition 4.9.1 and Lemma 1.3.1.

As a simple example, let us consider the straightforward time-dependent extension of formula (4.97). Namely, the unique solution given by the perturbation series to the backward Cauchy problem

$$\dot f_t(x) = -\int_0^\infty (f_t(x + y) - f_t(x))\,\nu_t(dy), \quad t \le T, \tag{4.169}$$

with a continuous family of finite measures $\nu_s$ and a given terminal condition $f_T$ can be written as

$$f_t(x) = \exp\left\{-\int_t^T \|\nu_s\|\, ds\right\} \times \sum_{k=0}^{\infty} \int_{t \le s_1 \le \cdots \le s_k \le T} ds_1 \cdots ds_k \int_0^\infty \cdots \int_0^\infty f_T(x + y_1 + \cdots + y_k)\,\nu_{s_1}(dy_1) \cdots \nu_{s_k}(dy_k). \tag{4.170}$$
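The perturbation series (4.166) and the mild equation (4.167) can be illustrated in finite dimensions. The following is a minimal sketch under simplifying assumptions that are not in the text: $B = \mathbb{R}^2$, the operators $A = \mathrm{diag}(a)$ and $L$ are constant, so that $U^{t,s} = \exp\{(s-t)A\}$ and the perturbed backward propagator is known in closed form, $\Phi^{t,r} = \exp\{(r-t)(A+L)\}$. The function names are hypothetical. Successive Picard iterations of (4.167) reproduce the partial sums of (4.166); the integral is discretized by the trapezoidal rule.

```python
import numpy as np

def expm_sym(M):
    """Matrix exponential of a symmetric matrix via its eigendecomposition."""
    w, V = np.linalg.eigh(M)
    return (V * np.exp(w)) @ V.T

def picard_propagator(a, L, t, r, n_grid=101, n_iter=25):
    """Approximate Phi^{t,r} by Picard iteration of the mild equation (4.167).

    Assumes A = diag(a) constant, hence U^{t,s} = diag(exp((s - t) a)),
    and a constant matrix L.
    """
    d = len(a)
    s = np.linspace(t, r, n_grid)
    h = s[1] - s[0]
    # U[i, j, :] holds the diagonal of U^{s_i, s_j}.
    U = np.exp((s[None, :, None] - s[:, None, None]) * a)
    # K[i, j] = U^{s_i, s_j} L, the kernel of the integral equation.
    K = U[:, :, :, None] * L[None, None, :, :]
    # Trapezoidal weights for the integral over [s_i, r] (zero for s_j < s_i).
    W = np.triu(np.full((n_grid, n_grid), h))
    idx = np.arange(n_grid)
    W[idx, idx] *= 0.5
    W[:-1, -1] *= 0.5
    W[-1, -1] = 0.0
    # Free term U^{s_i, r} as full matrices.
    U_end = U[:, -1, :, None] * np.eye(d)
    Phi = U_end.copy()
    for _ in range(n_iter):
        Phi = U_end + np.einsum('ij,ijab,jbc->iac', W, K, Phi)
    return Phi[0]

a = np.array([-1.0, -2.0])
L = np.array([[0.0, 0.5], [0.5, 0.3]])    # symmetric, so A + L is symmetric
Phi = picard_propagator(a, L, 0.0, 1.0)
exact = expm_sym(np.diag(a) + L)          # exp{(r - t)(A + L)} with r - t = 1
err = np.abs(Phi - exact).max()
print(err)
```

The remaining error is dominated by the quadrature step, since the Picard iterates converge factorially fast for bounded $L$, in line with the convergence of (4.166) in the operator norm.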

In the sensitivity analysis for nonlinear propagators (which will be carried out later on), the following modification of Theorem 4.13.1 becomes important for the case when the operators $L_t$ are not bounded in $B$, but only in $D$:

Theorem 4.13.2. Again let $D$ be a dense subspace of $B$, which is itself a Banach space under another norm $\|.\|_D \ge \|.\|_B$, and $A_t$ be a family of bounded strongly continuous linear operators $D \to B$. Let $U^{t,r}$ be a strongly continuous backward propagator of bounded linear operators in $B$ generated by $A_t$ on $D$ such that $U^{t,r}$ is also a strongly continuous propagator in the Banach space $D$. Let $L_t$ be a strongly continuous family of bounded operators in $D$. Then the series (4.166) is well defined as a converging series of bounded operators in $D$. The operators $\Phi^{t,r}$ form a strongly continuous backward propagator in $D$ such that, for any $f \in D$, $\Phi^{t,r} f$ is the unique bounded solution to equation (4.167) with a given $f_r = f$. Moreover,

$$\frac{d}{dt}\Phi^{t,r} f = -(A_t + L_t)\Phi^{t,r} f, \qquad \frac{d}{dr}\Phi^{t,r} f = \Phi^{t,r}(A_r + L_r) f, \tag{4.171}$$

for any $f \in D$, where the derivatives are understood in the sense of the topology of $B$, and $\Phi^{t,r} f$ yields the unique solution to the backward Cauchy problem for the equation $\dot f_t = -(A_t + L_t) f_t$ with a given boundary condition $f_r = f \in D$.

Proof. The convergence of the series (4.166) follows directly from the conditions of the theorem, which imply a) the invariance of $D$ under all $\Phi^{t,r}$ and b) that $\Phi^{t,r}$ defines a strongly continuous backward propagator in $D$ (although possibly not in $B$). The invariance of $D$ leads to (4.171), first with the left derivatives instead of the full derivatives, as in Proposition 4.9.1. The extension to the full derivatives follows from the assumed strong continuity of $A_t$ and $L_t$. Finally, by Proposition 4.10.2, any solution $f_t$ to the equation $\dot f_t = -(A_t + L_t) f_t$ satisfies the equation

$$f_t = U^{t,r} f + \int_t^r U^{t,s} L_s f_s\, ds,$$

and is therefore unique.

Let us present the counterpart of Theorem 4.6.3 for propagators. Its proof is almost identical to the proof of Theorem 4.6.3 and is therefore omitted.

Theorem 4.13.3. (i) Let the family of operators $A_t$ with the common domain $D$ generate a strongly continuous backward propagator $U^{t,s}$ on a Banach space $B$, and let $L_t$ be a family of operators in $B$ with the common domain $D_L$ such that $U^{t,s} f \in D_L$ for any $f \in B$, $t < s$, and

$$\|L_t U^{t,s}\| \le \kappa (s - t)^{-\omega} \tag{4.172}$$

with constants $\kappa > 0$, $\omega \in (0, 1)$, uniformly for $t, s$ from any fixed bounded interval. Then the series (4.166) converges in the operator norm, and the operators $\Phi^{t,r}$ form a strongly continuous backward propagator in $B$.

(ii) If additionally each $L_t$ is a closed (or closable) operator on $D_L$, then $\Phi^{t,r} f \in D_L$ for any $f \in B$, $t < r$ (or $\Phi^{t,r} f$ belongs to the domain of the closure of $L_t$, respectively), with the same order of growth as for $U^{t,r}$:

$$\|L_t \Phi^{t,r}\| \le \tilde\kappa (r - t)^{-\omega}, \tag{4.173}$$

with some other constant $\tilde\kappa$. Moreover, $\Phi^{t,r} f$ is the unique solution to the integral equation (4.167) for any given $f_r = f$ such that $\|L_t \Phi^{t,r} f\| \le c (r - t)^{-\omega_1}$ with some constants $\omega_1 \in (0, 1)$ and $c > 0$.

Let us now present a version of Theorem 4.6.4 for propagators, which exploits the method of three-level Banach towers.

Theorem 4.13.4. Let $U^{t,s}$ be a strongly continuous backward propagator in a Banach space $B$ generated by the family $A_t$ on an invariant dense subspace $D$. Let $\tilde B$ be another dense subspace of $B$ such that $D \subset \tilde B \subset B$. Assume that $D$ and $\tilde B$ are themselves Banach spaces under the norms $\|.\|_D \ge \|.\|_{\tilde B} \ge \|.\|_B$, and let $L_t \in \mathcal{L}(\tilde B, B) \cap \mathcal{L}(D, \tilde B)$ be a strongly continuous family. Let $U^{t,s}$ have the following regularization property: $U^{t,s} f \in D$ for any $f \in B$, $t < s$, and

$$\|U^{t,s}\|_{B \to \tilde B} \le \kappa (s - t)^{-\omega}, \qquad \|U^{t,s}\|_{\tilde B \to D} \le \kappa (s - t)^{-\omega} \tag{4.174}$$

with constants $\omega \in (0, 1)$, $\kappa > 0$, uniformly for $(s - t) \in (0, 1]$. Finally, let $U^{t,s}$ be strongly continuous in $\tilde B$ and $D$, and let

$$\|U^{t,s}\|_{\tilde B \to \tilde B} \le M e^{m(s-t)}, \qquad \|U^{t,s}\|_{D \to D} \le M_D e^{m_D(s-t)}. \tag{4.175}$$


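Then

(i) the backward propagator $\Phi^{t,s}$ constructed in Theorem 4.13.3 is strongly continuous in both $D$ and $\tilde B$, and it satisfies the estimates

$$\|\Phi^{t,s}\|_{\tilde B \to \tilde B} \le M e^{m(s-t)} E_{1-\omega}\big[\Gamma(1 - \omega)\kappa M (s - t)^{1-\omega} \sup_s \|L_s\|_{\tilde B \to B}\big],$$

$$\|\Phi^{t,s}\|_{D \to D} \le M_D e^{m_D(s-t)} E_{1-\omega}\big[\Gamma(1 - \omega)\kappa M_D (s - t)^{1-\omega} \sup_s \|L_s\|_{D \to \tilde B}\big]; \tag{4.176}$$

(ii) it has the same regularization property as $U^{t,s}$, namely

$$\|\Phi^{t,s}\|_{B \to \tilde B} \le \tilde\kappa (s - t)^{-\omega}, \qquad \|\Phi^{t,s}\|_{\tilde B \to D} \le \tilde\kappa (s - t)^{-\omega}, \tag{4.177}$$

with a constant $\tilde\kappa$ that can also be expressed in terms of Mittag-Leffler functions;

(iii) the backward propagator $\Phi^{t,r}$ is generated by $A_t + L_t$ on $D$, i.e., the equations (4.171) hold in $B$ for any $f \in D$;

(iv) $\Phi^{t,s} f \in D$ for any $f \in B$ and $t < s$, and

$$\|\Phi^{t,s}\|_{B \to D} \le 4\tilde\kappa^2 (s - t)^{-2\omega}. \tag{4.178}$$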
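The bounds (4.176) are expressed through the Mittag-Leffler function $E_{1-\omega}$. As a small numerical aid, the sketch below evaluates it by its power series, assuming the standard definition $E_\alpha(z) = \sum_{k \ge 0} z^k / \Gamma(\alpha k + 1)$ (the convention used in the book should be checked against its own definition). The function names `mittag_leffler` and `growth_factor` are hypothetical, and the parameter values in the last line are arbitrary.

```python
import math

def mittag_leffler(alpha, z, n_terms=200):
    """Power series of E_alpha(z) for alpha > 0 and z >= 0.

    Terms are computed in logarithmic form to avoid overflow of Gamma.
    """
    if z < 0:
        raise ValueError("this sketch only covers z >= 0")
    if z == 0:
        return 1.0
    log_z = math.log(z)
    return sum(math.exp(k * log_z - math.lgamma(alpha * k + 1.0))
               for k in range(n_terms))

def growth_factor(omega, kappa, M, L_norm, dt):
    """The Mittag-Leffler factor appearing in the first bound of (4.176)."""
    arg = math.gamma(1.0 - omega) * kappa * M * dt ** (1.0 - omega) * L_norm
    return mittag_leffler(1.0 - omega, arg)

# Two classical special cases serve as consistency checks:
# E_1(z) = exp(z) and E_{1/2}(z) = exp(z^2) erfc(-z).
check_exp = mittag_leffler(1.0, 1.5)
check_half = mittag_leffler(0.5, 1.0)
print(check_exp, check_half, growth_factor(0.5, 1.0, 1.0, 2.0, 0.25))
```

The series is adequate for the moderate arguments that occur for small $s - t$; for large arguments, asymptotic formulae are numerically preferable.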

Proof. The conditions (4.174) and $L_t \in \mathcal{L}(\tilde B, B) \cap \mathcal{L}(D, \tilde B)$ (with norms that are uniform in $t$) ensure that the perturbation series converge in $\tilde B$ for any $f \in B$ and in $D$ for any $f \in \tilde B$. The estimates for the sum are obtained as in the proof of Theorem 4.6.4. The required strong continuity of $\Phi^{t,r}$ follows from the strong continuity of $U^{t,r}$. Due to the invariance of $D$, it is sufficient to check the equations (4.171) only for $t = r$, where they are readily seen. Finally, (4.178) follows from (4.177) and the chain rule.

An important example for the application of Theorem 4.13.4 will be given in Theorem 6.8.3, which is devoted to quantitative estimates of the sensitivity of McKean–Vlasov diffusions with respect to the initial data.

Perturbation theory can be used for constructing solutions to the perturbed equation even if the $U^{t,s}$ are neither supposed to be propagators, nor to represent unique solutions to the Cauchy problem of the equation $\dot f_t = -A_t f_t$.

Theorem 4.13.5. Let $D \subset \tilde B \subset B$ be a triple of Banach spaces with $\|.\|_D \ge \|.\|_{\tilde B} \ge \|.\|_B$ such that $D$ (respectively $\tilde B$) is dense in $\tilde B$ (respectively in $B$) in the topology of $\tilde B$ (respectively $B$). Let $U^{t,s}$, $t \le s$, be a family of operators in $B$ such that $D$ and $\tilde B$ are invariant and the $U^{t,s}$ are bounded and strongly continuous in all three spaces $D$, $\tilde B$, $B$. Also, let the $U^{t,s}$ be smoothing, such that $U^{t,s} f \in D$ for any $f \in B$, $t < s$, and that (4.174) holds. Let $A_t \in \mathcal{L}(D, B)$ and $L_t \in \mathcal{L}(\tilde B, B) \cap \mathcal{L}(D, \tilde B)$ be strongly continuous families, and for any $f \in B$, let the equation $\frac{d}{dt} U^{t,s} f = -A_t U^{t,s} f$ hold for $t < s$.

Then the series (4.166) converges in the norms of all three spaces $D$, $\tilde B$, $B$, its sum $\Phi^{t,s}$ is strongly continuous in all these spaces, the operators $\Phi^{t,s}$ have the same regularization property as $U^{t,s}$ (namely the estimates (4.177) hold), and for any $f \in \tilde B$ the equation

$$\frac{d}{dt}\Phi^{t,r} f = -(A_t + L_t)\Phi^{t,r} f, \quad t < r, \tag{4.179}$$

is satisfied.

Proof. This is just a repetition of the arguments used in Theorem 4.13.4, where everything concerning uniqueness and the chain rule is omitted.

Finally, let us give a direct extension of the path integral representation of Theorem 4.7.1. Keeping the notation (4.99) for piecewise constant paths, and $PC_x(\tau, t)$ for the set of such paths $[\tau, t] \to \mathbb{R}^d$ starting from the point $x$ at $\tau$, let us assign to each continuous family of bounded measures $M_t$ on $\mathbb{R}^d$ the measure $M^{PC}$ on $PC_x(\tau, t)$, which is defined as the sum of measures $M^{PC}_n$, $n = 0, 1, \ldots$, where each $M^{PC}_n$ is the product-measure on $PC^n_x(\tau, t)$ of the Lebesgue measure on $Sim^n_{\tau,t}$ and of $n$ measures $M_{s_j}$ on $\mathbb{R}^d$ taken at the points of discontinuity of $Z_x(s)$. That is, if $Z$ is parametrized as in (4.99), then

$$M^{PC}_n(dZ(.)) = ds_1 \cdots ds_n\, M_{s_1}(dz_1) \cdots M_{s_n}(dz_n),$$

and for any measurable functional $F(Z_x(.)) = \{F_n(x - Z^0, x - Z^1, \ldots, x - Z^n)\}$ on $PC_x(\tau, t)$, given by the collection of functions $F_n$ on $\mathbb{R}^{dn}$, $n = 0, 1, \ldots$,

$$\int_{PC_x(\tau,t)} F(Z_x(.))\,M^{PC}(dZ(.)) = F(x) + \sum_{n=1}^{\infty} \int_{PC^n_x(t)} F(Z_x(.))\,M^{PC}_n(dZ(.))$$

$$= \sum_{n=0}^{\infty} \int_{Sim^n_{\tau,t}} ds_1 \cdots ds_n \int_{\mathbb{R}^d} \cdots \int_{\mathbb{R}^d} M_{s_1}(dz_1) \cdots M_{s_n}(dz_n) \times F_n(x - Z^0, x - Z^1, \ldots, x - Z^n). \tag{4.180}$$

Since

$$\|M^{PC}\| = 1 + \sum_{n=1}^{\infty} \int_{Sim^n_{\tau,t}} ds_1 \cdots ds_n\, \|M_{s_1}\| \cdots \|M_{s_n}\| = \exp\left\{\int_\tau^t \|M_s\|\, ds\right\},$$

and using the probabilistic notation $\mathbf{E}$ (the expectation) for the integral over the normalized (probability) measure $\tilde M^{PC} = \exp\{-\int_\tau^t \|M_s\|\, ds\}\, M^{PC}$ on the path-space $PC_x(\tau, t)$, we can write (4.180) as

$$\int_{PC_x(\tau,t)} F(Z_x(.))\,M^{PC}(dZ(.)) = \exp\left\{\int_\tau^t \|M_s\|\, ds\right\} \int_{PC_x(t)} F(Z_x(.))\,\tilde M^{PC}(dZ(.)) = \exp\left\{\int_\tau^t \|M_s\|\, ds\right\} \mathbf{E}\, F(Z_x(.)). \tag{4.181}$$
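The normalization identity used above, namely that the sum over $n$ of the integrals of $\|M_{s_1}\| \cdots \|M_{s_n}\|$ over the ordered simplex equals $\exp\{\int_\tau^t \|M_s\|\, ds\}$, can be checked numerically. The sketch below is illustrative only: the function `m` is a hypothetical stand-in for $s \mapsto \|M_s\|$, and the simplex integrals are computed through the recursion $I_n(u) = \int_\tau^u m(s) I_{n-1}(s)\, ds$, $I_0 = 1$, with a cumulative trapezoidal rule.

```python
import numpy as np

def simplex_series(m, tau, t, n_terms=30, n_grid=2001):
    """Return (series, closed_form) for the total mass of M^{PC}.

    series      : 1 + sum_{n>=1} of the ordered-simplex integrals of prod m(s_j)
    closed_form : exp(int_tau^t m(s) ds)
    """
    s = np.linspace(tau, t, n_grid)
    h = s[1] - s[0]
    ms = m(s)

    def cumulative_trapezoid(y):
        # out[i] approximates the integral of y over [tau, s_i]
        out = np.zeros_like(y)
        out[1:] = np.cumsum(0.5 * h * (y[1:] + y[:-1]))
        return out

    term = np.ones_like(s)      # I_0(u) = 1
    total = 1.0
    for _ in range(n_terms):
        term = cumulative_trapezoid(ms * term)   # I_n from I_{n-1}
        total += term[-1]
    return total, float(np.exp(cumulative_trapezoid(ms)[-1]))

# Hypothetical norm profile ||M_s|| = 1 + sin(s)^2 on [0, 2].
series, closed_form = simplex_series(lambda s: 1.0 + np.sin(s) ** 2, 0.0, 2.0)
print(series, closed_form)
```

The agreement reflects the symmetry argument behind the identity: the integral over the ordered simplex is $1/n!$ times the integral over the full cube.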


Assuming that the propagators $U^{s,t}$ in (4.166) are generated by multiplication operators of the time-dependent family of functions $A(t, x)$ and that $L_t$ is a family of bounded operators in $C(\mathbb{R}^d)$ of the form $L_t f(x) = \int f(x - y)\,\nu_t(dy)$ with a family of bounded measures $\nu_t$ on $\mathbb{R}^d$, the series (4.166) can be rewritten as

$$\Phi^{t,r} Y(x) = \exp\left\{\int_t^r A(s, x)\, ds\right\} Y(x) + \sum_{m=1}^{\infty} \int_{t \le s_1 \le \cdots \le s_m \le r} \int \cdots \int Y(x - z_1 - \cdots - z_m)\, ds_1 \cdots ds_m\, \nu_{s_1}(dz_1) \cdots \nu_{s_m}(dz_m)$$

$$\times \exp\left\{\int_t^{s_1} A(s, x)\, ds + \int_{s_1}^{s_2} A(s, x - z_1)\, ds + \cdots + \int_{s_m}^{r} A(s, x - z_1 - \cdots - z_m)\, ds\right\}. \tag{4.182}$$

The last exponential term can also be written as $\exp\{\int_t^r A(s, Z_x(s))\, ds\}$. Therefore, the series (4.182), which is a particular instance of the perturbation series (4.166), can be represented as a path integral of the type (4.181) (with $M_s = \nu_s$):

$$\int_{PC_x(t,r)} \exp\left\{\int_t^r A(s, Z_x(s))\, ds\right\} Y(Z_x(r))\, M^{PC}(dZ(.)) = \exp\left\{\int_t^r \|M_s\|\, ds\right\} \mathbf{E}\left[\exp\left\{\int_t^r A(s, Z_x(s))\, ds\right\} Y(Z_x(r))\right]. \tag{4.183}$$

4.14 Diffusions and Schrödinger equations with nonlocal terms

As a first example for the application of perturbation theory of propagators, let us extend Proposition 4.8.1 to time-dependent drifts and sources, namely to the heat conduction equation

$$\dot f_t = \frac12 \Delta f_t + (b_t(x), \nabla) f_t(x) + V_t(x) f_t(x), \quad f_0 = f, \tag{4.184}$$

with the time-dependent perturbation $L_t f = (b_t(x), \nabla) f(x) + V_t(x) f(x)$.

Proposition 4.14.1. Let $V_t$, $b_t$ be bounded measurable functions. Then the series (4.166) with $U^{t,s} = \exp\{(s - t)\Delta/2\}$ converge, and the corresponding family $\Phi^{t,r}$ forms a backward propagator that solves the mild equation (4.167).

As in the time-homogeneous case, these results automatically extend to complex diffusion equations, or the regularized Schrödinger equation with magnetic fields:

$$\dot f_t = \frac12 \sigma\Delta f_t + (b_t(x), \nabla) f_t(x) + V_t(x) f_t(x) = \Big(\frac12 \sigma\Delta + L_t\Big) f_t(x), \quad f_0 = f, \tag{4.185}$$

where $\sigma = i + \epsilon$ is a complex constant with a positive real part $\epsilon$.


Theorem 4.6.5 also extends to propagators, which almost literally leads to the following time-dependent version of Theorems 4.8.1 and 4.8.2.

Theorem 4.14.1. (i) Let $b_t(x)$ be a bounded measurable function and $V_t$ a Radon measure on $\mathbb{R}^d$, $d > 1$, depending measurably on $t$. Let $V_t \in \mathcal{M}^C_{R,\alpha}(\mathbb{R}^d)$ and $\frac{\partial b_t}{\partial x_j} \in \mathcal{M}^C_{R,\alpha}(\mathbb{R}^d)$ for all $j$, with all constants uniform in $t$, $\alpha \in (d - 2, d]$ and $\sigma$ a complex constant with a positive real part. Then equation (4.185) is well posed in the sense of mild solutions.

(ii) Let $b_t(x)$ be a bounded measurable function. Let $V_t$ and $\frac{\partial b_t}{\partial x}$ be bounded (uniformly in $t$) measures on $\mathbb{R}$, depending measurably on $t$. Let $\sigma$ be a complex constant with a non-negative real part (thus including the case $\sigma = i$). Then equation (4.185) is well posed in the sense of mild solutions.

As an example related to Theorem 4.13.4, let us provide conditions that ensure that the equations (4.12) or (4.185) can be classically solved.

Theorem 4.14.2. Let $b_t, V_t \in C([0, T], C^1(\mathbb{R}^d))$ (or its complex-valued version), where $d \ge 1$ and $\sigma$ is a complex constant with a positive real part. Then the propagator $\Phi^{t,r}$ solving (according to Theorem 4.14.1) the mild form of equation (4.185) acts strongly continuously in $C_\infty(\mathbb{R}^d)$, $C^1_\infty(\mathbb{R}^d)$ and $C^2_\infty(\mathbb{R}^d)$, it is generated by $L_t + \sigma\Delta/2$ on $C^2_\infty(\mathbb{R}^d)$ and therefore solves equation (4.185) classically.

Proof. This is a direct consequence of Theorem 4.13.4 with $D = C^2_\infty$, $\tilde B = C^1_\infty$, $B = C_\infty$ and the regularization property of the semigroups $T^\sigma_t$ of (4.20), see (4.24), (4.25) and Exercise 4.3.8.

Remark 83. By carefully looking at the proof of Theorem 4.13.4 in this case, we can conclude that the assumption $b_t, V_t \in C([0, T], C_{bLip}(\mathbb{R}^d))$ is sufficient for the validity of the results of Theorem 4.14.2.

Let us now extend the theory for the equations (4.185) to the case of diffusions or Schrödinger equations with non-local terms, i.e., to equations of the type

$$\dot f_t = \Big(\frac12 \sigma\Delta + L_t\Big) f_t(x) = \frac12 \sigma\Delta f_t + (b_t(x), \nabla) f_t(x) + V_t(x) f_t(x) + \int f_t(y)\,\nu_t(x, dy) + \int (\nabla f_t(y), \tilde\nu_t(x, dy)), \quad f_0 = f, \tag{4.186}$$

where $\nu_t(x, dy)$ is a (possibly signed or even complex) transition kernel for any $t$, i.e., $\nu_t(x, dy)$ is measurable with respect to $x \in \mathbb{R}^d$ and belongs to $\mathcal{M}(\mathbb{R}^d)$ or $\mathcal{M}^C(\mathbb{R}^d)$ as a function of the second variable. Moreover, $\tilde\nu_t = (\tilde\nu^1_t, \ldots, \tilde\nu^d_t)$ is a vector-valued transition kernel. An important special case are operators $L_t$ of the Lévy–Khintchine type, which arise in the theory of nonlocal Markov processes:

$$L_t f(x) = (b_t(x), \nabla) f(x) + V_t(x) f(x) + \int (f(x + y) - f(x))\,\nu_t(x, dy), \tag{4.187}$$

where for any $x$ the (positive) measure $\nu_t(x, .)$ may be unbounded, but such that $\int \min(1, |y|)\,\nu_t(x, dy) < \infty$. By (1.84), these operators can be expressed in the form (4.186) with bounded $\nu_t$, $\tilde\nu_t$. Interestingly enough, nonlocal equations of the type (4.186) also arise naturally in linearized evolutions of nonlinear diffusions, see equation (6.81) below.

Recall that the transition kernel $\nu_t(x, dy)$ is said to be weakly continuous if $\int f(y)\,\nu_t(x, dy)$ is a continuous function of $t$ and $x$ for any continuous bounded $f$.

Theorem 4.14.3. (i) Let $\sigma$ be a complex constant with a positive real part, let $V_t$, $b_t$ be bounded measurable complex-valued functions and $\nu_t$, $\tilde\nu_t$ uniformly bounded complex transition kernels. Then the series (4.166) with $U^{t,s} = \exp\{(s - t)\sigma\Delta/2\}$ converges both in $C(\mathbb{R}^d)$ and $C^1(\mathbb{R}^d)$, and the corresponding family $\Phi^{t,r}$ forms a strongly continuous backward propagator both in $C_\infty(\mathbb{R}^d)$ and $C^1_\infty(\mathbb{R}^d)$. This propagator solves the mild equation (4.167).

(ii) Let additionally $b_t, V_t \in C([0, T], C^1(\mathbb{R}^d))$ (possibly complex-valued), and let the partial derivatives $\frac{\partial \nu_t(x, .)}{\partial x_j}$, $\frac{\partial \tilde\nu_t(x, .)}{\partial x_j}$ exist as uniformly bounded weakly continuous families of complex transition kernels. Then the propagator $\Phi^{t,r}$ from (i) acts strongly continuously in $C_\infty(\mathbb{R}^d)$, $C^1_\infty(\mathbb{R}^d)$ and $C^2_\infty(\mathbb{R}^d)$, it is generated by $L_t + \sigma\Delta/2$ on $C^2_\infty(\mathbb{R}^d)$ and therefore solves equation (4.186) classically.

Proof. Again, this is a direct consequence of Theorem 4.13.4.

For the sake of simplicity, we used the standard Laplacian operators as main term of the equations. The above results extend straightforwardly to equations of the type

$$\dot f_t = \frac12 (A_t \nabla, \nabla) f_t + (b_t(x), \nabla) f_t(x) + V_t(x) f_t(x), \quad f_s = f, \tag{4.188}$$

with $A_t$ a family of symmetric positive matrices. In fact, the solution to the corresponding homogeneous problem

$$\dot f_t = \frac12 (A_t \nabla, \nabla) f_t, \quad f_s = f, \tag{4.189}$$

is expressed in the following closed form (obtained directly by the Fourier transform, using (1.89)):

$$f_t(x) = \frac{(2\pi)^{-d/2}}{\sqrt{\det\big(\int_s^t A_\tau\, d\tau\big)}} \int \exp\left\{-\frac12 \left(\Big(\int_s^t A_\tau\, d\tau\Big)^{-1}(x - y),\, x - y\right)\right\} f(y)\, dy. \tag{4.190}$$

The corresponding propagator has all properties that are required for an extension of the above results to equations of the type (4.188).
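The closed form (4.190) can be checked in the simplest setting. The sketch below assumes $d = 1$ and a hypothetical scalar coefficient $A_\tau = a(\tau) = 1 + \tau$, for which $\int_s^t a(\tau)\, d\tau$ is explicit. It verifies by finite differences that the Gaussian kernel of (4.190) satisfies $\partial_t G = \frac12 a(t)\, \partial_x^2 G$, and that it has unit mass.

```python
import numpy as np

def a(tau):
    """Hypothetical time-dependent diffusion coefficient (d = 1)."""
    return 1.0 + tau

def variance(s, t):
    """Closed form of int_s^t a(tau) d tau for a(tau) = 1 + tau."""
    return (t - s) + 0.5 * (t ** 2 - s ** 2)

def green(x, s, t):
    """Kernel of (4.190) in one dimension: a centred Gaussian density."""
    v = variance(s, t)
    return np.exp(-0.5 * x ** 2 / v) / np.sqrt(2.0 * np.pi * v)

def pde_residual(x, s, t, hx=1e-3, ht=1e-4):
    """Central-difference residual of dG/dt - (1/2) a(t) d^2G/dx^2."""
    dG_dt = (green(x, s, t + ht) - green(x, s, t - ht)) / (2.0 * ht)
    d2G_dx2 = (green(x + hx, s, t) - 2.0 * green(x, s, t)
               + green(x - hx, s, t)) / hx ** 2
    return dG_dt - 0.5 * a(t) * d2G_dx2

x = np.linspace(-3.0, 3.0, 13)
max_residual = np.abs(pde_residual(x, 0.0, 1.0)).max()

# Total mass of the kernel, by a Riemann sum on a wide grid.
xs = np.linspace(-15.0, 15.0, 3001)
mass = green(xs, 0.0, 1.0).sum() * (xs[1] - xs[0])
print(max_residual, mass)
```

Only the accumulated quantity $\int_s^t A_\tau\, d\tau$ enters the kernel, which is why the time-dependent case is no harder than the time-homogeneous one here.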

4.15 ΨDOs with homogeneous symbols (time-dependent case)

In this section, we extend the theory of Section 4.5 into several directions, including time-dependent homogeneous symbols. We start with the Cauchy problem

$$\dot f_t = -\psi_t(-i\nabla) f_t, \quad f|_{t=s} = f_s, \quad t \ge s, \tag{4.191}$$

with

$$\psi_t(p) = |p|^\beta \omega_t(p/|p|), \tag{4.192}$$

where $\omega_t = \omega_{rt} + i\omega_{it}$ is a continuous function on $\mathbb{R} \times S^{d-1}$ with a positive real part (see (4.66) for key examples). The corresponding propagator resolving (4.61) acts as

$$U^{t,s} f_s(x) = \int G^\psi_{t,s}(x - y) f_s(y)\, dy, \tag{4.193}$$

with the Green function (2.60) of the form

$$G^\psi_{t,s}(x) = \frac{1}{(2\pi)^d} \int e^{ipx} \exp\left\{-|p|^\beta \int_s^t \omega_\tau(p/|p|)\, d\tau\right\} dp. \tag{4.194}$$

This function is real, so that $U^{t,s}$ preserves the reality of functions, if and only if the condition

$$\bar\omega_t(p/|p|) = \omega_t(-p/|p|) \tag{4.195}$$

holds for all $p$ and $t$. The following two theorems are proved by directly extending the arguments from Theorems 4.5.1 and 4.4.1.

Theorem 4.15.1. Let $\beta > 0$ and a continuous function $\omega_t(s)$ on $S^{d-1}$ be $(d + 1 + [\beta])$-times (where $[\beta]$ is the integer part of $\beta$) continuously differentiable, with its real part being bounded from below by a positive number. Then the following holds:


(i) The Green function (4.194) is infinitely differentiable in all variables $t, s, x$ for $t - s > 0$ and satisfies the estimate

$$\max\left(|G^\psi_{t,s}(x)|,\ (t - s)\left|\frac{\partial G_{t,s}(x)}{\partial t}\right|,\ (t - s)\left|\frac{\partial G_{t,s}(x)}{\partial s}\right|\right) \le C \min\left((t - s)^{-d/\beta},\ \frac{t - s}{|x|^{d+\beta}}\right) \tag{4.196}$$

for a constant $C$.

(ii) The propagator $U^{t,s}$ given by (4.193) is a uniformly bounded and strongly continuous propagator in $C_\infty(\mathbb{R}^d)$, with all spaces $C^k_\infty(\mathbb{R}^d)$ being invariant. It is smoothing in the sense that $U^{t,s} f$ is infinitely differentiable for $t > s$ and any $f \in C_\infty(\mathbb{R}^d)$. If $\beta \ge 1$, then for all non-negative integers $k, l$,

$$\|U^{t,s} f\|_{C^{l+k}(\mathbb{R}^d)} \le c_{k,l}(t - s)^{-k/\beta} \|f\|_{C^l(\mathbb{R}^d)} \tag{4.197}$$

with some constants $c_{k,l}$.

(iii) The family of operators $-\psi_t(-i\nabla)$ generates the propagator $U^{t,s}$ on the invariant subspace of $C_\infty(\mathbb{R}^d)$ consisting of functions that can be represented as Fourier transforms of functions $\phi \in L^1(\mathbb{R}^d)$ such that $|p|^\beta \phi(p) \in L^1(\mathbb{R}^d)$, so that equation (4.129) holds for functions from this space. (The $-\psi_t(-i\nabla)$ are defined as operators that multiply the Fourier transform of a function by $-\psi_t(p)$.)

(iv) If additionally $\omega$ is $(d + 1 + [\beta] + l)$-times continuously differentiable, then

$$\left|\frac{\partial^k}{\partial x_{i_1} \cdots \partial x_{i_k}} G^\psi_t(x)\right| \le C \min\left(t^{-(d+l)/\beta},\ \frac{t}{|x|^{d+\beta+l}}\right) \tag{4.198}$$

for a constant $C$ and all $k \le l$ and $i_1, \ldots, i_k$.

Theorem 4.15.2. Under the assumptions of Theorem 4.15.1 (i) to (iii), assume additionally that the $\psi_t$ are symmetric mixed fractional derivatives

$$\int_{S^{d-1}} |(\nabla, s)|^\beta \mu_t(ds),$$

that is, their symbols are given by formula (1.152) with some time-dependent family of spectral measures $\mu_t$. Then the family of operators $-\psi_t(-i\nabla)$ generates the propagator $U^{t,s}$ on the invariant spaces $C^k_\infty(\mathbb{R}^d)$ with any integer $k \ge \beta$, such that equation (4.129) holds for the functions from this space. (Here, the $-\psi_t(-i\nabla)$ are defined on $C^k_\infty(\mathbb{R}^d)$ by the formulae (1.145) and (1.143).)

Next, let us consider the Cauchy problem

$$\dot f_t = -\psi_t(-i\nabla) f_t + g_t, \quad f|_{t=s} = f_s, \quad t \ge s, \tag{4.199}$$

where $\psi_t$ is given by (4.192) and $g_t$ is a given curve in $C_\infty(\mathbb{R}^d)$.

Theorem 4.15.3. Let the assumptions of Theorem 4.15.2 hold and $g_t$ be a continuous curve $[s, \infty) \to C_\infty(\mathbb{R}^d)$. Let an integer $k \ge \beta$. Then:

(i) The function

$$f_t = U^{t,s} f_s + \int_s^t U^{t,\tau} g_\tau\, d\tau \tag{4.200}$$

represents the unique solution to equation (4.199), whenever $f_s \in C^k_\infty(\mathbb{R}^d)$ and $g_t$ is a bounded curve $[s, \infty) \to C^k_\infty(\mathbb{R}^d)$.

(ii) If $\beta \ge 1$, $f_s \in C_\infty(\mathbb{R}^d)$ and $g_t$ is a bounded curve $[s, \infty) \to C^{k-1}_\infty(\mathbb{R}^d)$, then the function $f_t$ from (4.200) represents the unique continuous curve $[s, \infty) \to C_\infty(\mathbb{R}^d)$ with the initial condition $f_s$ such that $f_t$ belongs to $C^k_\infty(\mathbb{R}^d)$ and equation (4.199) holds for all $t > s$.

Proof. Statement (i) is a direct consequence of Proposition 4.10.2 and Remark 78. Statement (ii) is the effect of smoothing. Namely, by Theorem 4.15.1(ii), $U^{t,s} f_s \in C^k_\infty(\mathbb{R}^d)$ for all $t > s$, and by (4.197), $U^{t,\tau} g_\tau \in C^k_\infty(\mathbb{R}^d)$ for all $t > \tau$ and the family of the corresponding norms in $C^k_\infty(\mathbb{R}^d)$ is integrable in $\tau$.

Finally, let us briefly touch the extension of the theory to homogeneous symbols, but with time-dependent order. Namely, let us consider the Cauchy problem (4.191) with

$$\psi_t(p) = |p|^{\beta_t} \omega_t(p/|p|). \tag{4.201}$$

A generalization of all above results can again be obtained by directly extending the arguments from Theorems 4.5.1 and 4.4.1, where we now use Proposition 9.3.6 rather than 9.3.5, see also similar arguments in Theorem 4.5.2. For instance, the following theorem holds:

Theorem 4.15.4. Let $\beta_t$ be a continuous curve on $\mathbb{R}$ with strictly positive values, and let $\mu_t$ be a continuous curve with values in the set of (positive) measures on $S^{d-1}$ such that the functions $\int_{S^{d-1}} |(p, s)|^{\beta_t} \mu_t(ds)$ are $(d + 2 + [b_{\max}])$-times continuously differentiable in $p \in S^{d-1}$ and bounded from below by a positive constant. Let us denote

$$b_{\min}[s, t] = \min\{\beta_\tau : \tau \in [s, t]\}, \qquad b_{\max}[s, t] = \max\{\beta_\tau : \tau \in [s, t]\}.$$

Then the Green function $G^\psi_{t,s}(x)$ of the corresponding propagator (4.193) resolving (4.191) is infinitely differentiable with respect to all variables $t, s, x$, and for $t - s \in (0, 1)$ it satisfies the estimates

$$|G^\psi_{t,s}(x)| \le C \min\left((t - s)^{-d/b_{\min}[s,t]},\ \int_s^t \frac{d\tau}{|x|^{d+\beta(\tau)}}\right),$$

$$\left|\frac{\partial G_{t,s}(x)}{\partial t}\right| \le C \min\left((t - s)^{-(d+\beta(t))/b_{\min}[s,t]},\ \frac{1}{|x|^{d+\beta(t)}}\right), \tag{4.202}$$

$$\left|\frac{\partial G_{t,s}(x)}{\partial s}\right| \le C \min\left((t - s)^{-(d+\beta(s))/b_{\min}[s,t]},\ \frac{1}{|x|^{d+\beta(s)}}\right),$$

for a constant $C$.


Moreover, the propagator $U^{t,s}$ given by (4.193) is a uniformly bounded and strongly continuous propagator in $C_\infty(\mathbb{R}^d)$, with all spaces $C^k_\infty(\mathbb{R}^d)$ being invariant. It is smoothing in the sense that $U^{t,s} f$ is infinitely differentiable for $t > s$ and any $f \in C_\infty(\mathbb{R}^d)$, and for all non-negative integers $k, l$ and $t - s \le 1$,

$$\|U^{t,s} f\|_{C^{l+k}(\mathbb{R}^d)} \le c_{k,l}(t - s)^{-k/b_{\min}[s,t]} \|f\|_{C^l(\mathbb{R}^d)} \tag{4.203}$$

with some constants $c_{k,l}$.

Finally, the family of operators $-\psi_t(-i\nabla)$ generates the propagator $U^{t,s}$ on the invariant spaces $C^k_\infty(\mathbb{R}^d)$ with any integer $k \ge \max\{\beta_\tau\}$.

Exercise 4.15.1. Give a full proof of this theorem.

Exercise 4.15.2. Extend Theorem 4.15.4 to time-varying mixtures, i.e., to symbols of the type

$$\psi_t(p) = \int_U |p|^{\beta(t,u)} \omega_t(u, p/|p|)\, \mu_t(du).$$
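The Green function (4.194) can be evaluated numerically in the simplest case. The sketch below makes assumptions that are not in the text: $d = 1$, $\beta = 1$, and a real, direction-independent $\omega_\tau = 1 + \tau$ (a hypothetical choice), so that the Fourier integral reduces to a cosine transform and the result is an explicit Cauchy kernel with parameter $\theta = \int_s^t \omega_\tau\, d\tau$. The quadrature is a plain trapezoidal rule on a truncated interval.

```python
import numpy as np

def theta(s, t):
    """int_s^t omega_tau d tau for the hypothetical choice omega_tau = 1 + tau."""
    return (t - s) + 0.5 * (t ** 2 - s ** 2)

def green_fourier(x, s, t, beta=1.0, p_max=60.0, n=400001):
    """(4.194) for d = 1 and real symmetric omega:
    G(x) = (1/pi) int_0^infty cos(p x) exp(-p^beta theta) dp,
    truncated to [0, p_max] and computed by the trapezoidal rule.
    """
    p = np.linspace(0.0, p_max, n)
    f = np.cos(p * x) * np.exp(-p ** beta * theta(s, t))
    h = p[1] - p[0]
    return h * (f.sum() - 0.5 * (f[0] + f[-1])) / np.pi

def cauchy_kernel(x, s, t):
    """Closed form for beta = 1: theta / (pi (theta^2 + x^2))."""
    th = theta(s, t)
    return th / (np.pi * (th ** 2 + x ** 2))

numeric = green_fourier(0.7, 0.0, 1.0)
exact = cauchy_kernel(0.7, 0.0, 1.0)
print(numeric, exact)
```

In this special case the two-sided decay of the closed form, bounded by a constant times $\min(\theta^{-1}, \theta/|x|^2)$, has the same shape as the estimate (4.196) with $d = \beta = 1$.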

4.16 Higher-order ΨDEs with nonlocal terms

As another example for the application of perturbation theory to propagators, let us give an extension of Theorem 4.14.2 to higher (and/or fractional) order PDEs with nonlocal terms:

$$\dot f_t = -\sigma|\Delta|^{\alpha/2} f_t + L_t f_t \tag{4.204}$$

with

$$L_t f = \sum_{m=0}^{k} \sum_{j_1 \le \cdots \le j_m \le d} b^t_{j_1 \cdots j_m}(x) \frac{\partial^m f(x)}{\partial x_{j_1} \cdots \partial x_{j_m}} + \sum_{m=0}^{k} \sum_{j_1 \le \cdots \le j_m \le d} \int \frac{\partial^m f(y)}{\partial y_{j_1} \cdots \partial y_{j_m}}\, \nu^t_{j_1 \cdots j_m}(x, dy), \tag{4.205}$$

where $k \ge 1$ is any natural number that is strictly less than $\alpha$ and $b^t_{j_1 \cdots j_m}(x)$ and $\nu^t_{j_1 \cdots j_m}(x, dy)$ are families of measurable functions and transition kernels on $\mathbb{R}^d$.

Let $p$ be the smallest integer that is not less than $\alpha$. The following result is a direct consequence of Theorem 4.13.4 and Proposition 4.4.1 with $D = C^p_\infty(\mathbb{R}^d)$, $\tilde B = C^k_\infty(\mathbb{R}^d)$.

Theorem 4.16.1. (i) Let $\sigma$ be a complex constant with a positive real part, let $b^t_{j_1 \cdots j_m}(x)$ be bounded measurable complex-valued functions and $\nu^t_{j_1 \cdots j_m}(x, dy)$ uniformly bounded complex stochastic kernels. Then the series (4.166) with $U^{t,s} = \exp\{-(s - t)\sigma|\Delta|^{\alpha/2}\}$ and $L_t$ from (4.205) converges in $C_\infty(\mathbb{R}^d)$, and the corresponding family $\Phi^{t,r}$ forms a backward propagator that solves the mild equation (4.167).

(ii) Let additionally $b^t_{j_1 \cdots j_m} \in C([0, T], C^k(\mathbb{R}^d))$ (possibly complex-valued), and let the partial derivatives of order up to and including $k$ of the kernels $\nu^t_{j_1 \cdots j_m}(x, .)$ with respect to $x$ exist as uniformly bounded, weakly continuous families of complex transition kernels. Then the propagator $\Phi^{t,r}$ from (i) acts strongly continuously in $C_\infty(\mathbb{R}^d)$, $C^k_\infty(\mathbb{R}^d)$ and $C^p_\infty(\mathbb{R}^d)$, it is generated by $L_t - \sigma|\Delta|^{\alpha/2}$ on $C^p_\infty(\mathbb{R}^d)$ and therefore solves equation (4.204) classically.

Exercise 4.16.1. By applying Theorems 4.15.1 or 4.15.4, show how Theorem 4.16.1 extends to the case when $\sigma|\Delta|^{\alpha/2} f$ is substituted by a general homogeneous operator of (possibly time-varying) orders $\beta_t \ge \alpha$.

Remark 84. A corollary to Proposition 5.9.1 will extend this result to the case when the variable $\sigma$ depends on $x$.

4.17 Hints and answers to chosen exercises

Exercise 4.1.1. All statements on strong continuity and equicontinuity are obtained by reducing them to the problem of convergence $T_t f \to f$ uniform on compact sets. For a counterexample, take $f(x) = \sin\sqrt{x}$.

Exercise 4.1.2. The invariant core is given by all shifts $T_t S$ for all $t$ and $S \in C^1(\mathbb{R}^d)$. Therefore, one needs to show that any such $T_t S$ belongs to the closure of $L$ defined on $C^1(\mathbb{R}^d)$. To this end, approximate $F$ by a sequence $F_n \in C^1(\mathbb{R}^d, \mathbb{R}^d)$ and show that $T^n_t S(Y) \to T_t S(Y)$ and $L T^n_t S(Y) \to L T_t S(Y)$.

Exercise 4.1.3.

$$R_\lambda R_\mu f = \int_0^\infty e^{-\lambda t}\, T_t \left(\int_0^\infty e^{-\mu s}\, T_s f\, ds\right) dt = \int_0^\infty\!\!\int_0^\infty e^{-\lambda t - \mu s}\, T_{t+s} f\, ds\, dt.$$

Afterwards, change the integration variable $t$ to $t + s$.

Exercise 4.3.1. (i) For any $\epsilon$ there exists $\delta$ such that $|f(y) - f(x)| < \epsilon$ for $|x - y| < \delta$. Next,

$$\int G_t(x - y)(f(y) - f(x))\, dy = \int_{|x-y| > t^{\epsilon+1/2}} G_t(x - y)(f(y) - f(x))\, dy + \int_{|x-y| \le t^{\epsilon+1/2}} G_t(x - y)(f(y) - f(x))\, dy,$$

and the first integral tends to zero as $t \to 0$. Finally, for small enough $t$,

$$\int_{|x-y| \le t^{\epsilon+1/2}} G_t(x - y)|f(y) - f(x)|\, dy \le \epsilon \int G_t(x - y)\, dy = \epsilon.$$

(iii) For $f \in C^2_\infty(\mathbb{R}^d)$,

$$\frac{d}{dt} T_t f(x) = \int \frac{\partial}{\partial t} G_t(x - y) f(y)\, dy = \frac12 \int \Delta G_t(x - y) f(y)\, dy = \frac12 \int G_t(x - y)\Delta f(y)\, dy.$$

Exercise 4.3.3. Strong continuity for continuous $f$ with a compact support follows as in Exercise 4.3.1. It can be extended to the whole space by a density argument.

Exercise 4.3.5. This semigroup turns to the multiplication semigroup $f(p) \to e^{i\kappa p^2} f(p)$ under Fourier transform.

Exercise 4.3.6. Use integration by parts.

Exercise 4.3.10. One only has to check that $D_{Neu}$ is invariant, which follows from the explicit formula for $T^{Neu}_t$.

Exercise 4.3.11. Again, the invariance of the spaces $C([0, \infty))$ and $D_{Dir}$ follows from the explicit formula for $T^{Dir}_t$.

Exercise 4.5.2. $y\varphi'(y) = e^{-y} - \varphi(y)$.

Exercise 4.9.1. $T_s T_t f(x) = U^{x+\delta,\, x+\delta-\alpha s}(T_t f)(x - \alpha s) = U^{x+\delta,\, x+\delta-\alpha s}\, U^{x-\alpha s+\delta,\, x+\delta-\alpha s-\alpha t} f(x - \alpha s - \alpha t) = T_{s+t} f(x)$.

Exercise 4.12.1. One first shows this for positive and negative linear combinations separately, and then uses the observation that these two subsets evolve independently of each other.

4.18 Summary and comments

In this chapter, we developed the theory of semigroups and propagators of linear operators in locally convex spaces, with the emphasis on the case of Banach spaces. The theory provides a tool for analysing a wide class of evolutionary partial differential and pseudo-differential equations. The presented results are mostly known. However, as in the previous chapters, the aim was to simplify, streamline and unify various facts and approaches, while at the same time supplying a sufficiently detailed exposition and various insightful examples.

The classical books on linear differential equations are the treatises [110] and [175]. An extensive literature is devoted to the asymptotics of the Green function of the Cauchy problem for spatially homogeneous ΨDOs, the main emphasis being usually given to the case of homogeneous symbols. One of the first papers in this direction was [77]. The case of the homogeneity indices α ∈ (0, 2) is crucial for probability theory, since the corresponding Green functions represent transition probabilities of stable Lévy processes. These transition probabilities have been studied in detail by probabilists, leading to very precise asymptotic estimates and expansions, see, e.g., the classical monograph [267] for the one-dimensional case. For the multi-dimensional case, asymptotic expansions were obtained in [134]. Based on the approaches of this paper, we tried to give a concise representation for a reasonably general case of homogeneous symbols.

In many cases, the r.h.s. of an evolutionary equation can be written as the sum $(A + L)u$ of two operators applied to the unknown function, one, $A$, yielding a well-understood evolution $e^{tA}$ (ideally given in a closed form) and the other, $L$, being in some sense more regular than $A$. Perturbation theory is a classical tool for dealing with such systems, and it has been developed by many authors based on various ways of comparing $A$ and $L$, see, e.g., the well-known books [199, 200, 232]. Our approach is based on the systematic exploitation of the integrability of the singularity of the compositions $e^{tA}L$ or $Le^{tA}$. This provides an effective and systematic way of obtaining various concrete results. The advantage of this approach also lies in its more or less straightforward extensibility to time-nonhomogeneous situations, i.e., to propagators. (Note that this may not be the case for other methods, e.g., those that are based on resolvents.) For instance, Theorems 4.8.1 and 4.8.2 are obtained in [138] with much harder calculations, and their time-nonhomogeneous extension, Theorem 4.14.1, is derived in our approach more or less automatically, including even more extensions to equations with nonlocal terms. An interesting problem that was not considered here is the analysis of diffusion equations with measure-valued drifts, see [122] and references therein.

Path integral representations for the solutions of PDEs are the main link between PDEs and probability theory. We do not develop this topic thoroughly here. A specific branch of research on path integrals arises from the equations of quantum mechanics, because the corresponding Feynman path integral does not always match the rigorous probabilistic treatment. In this chapter, we followed an approach to path integrals as it arises from jump-type processes. This approach to Feynman path integrals was first considered in [201] and fully developed in [136–138]. It exploits the link between path integrals and perturbation theory, which, in the theory of quantum fields, is usually designated graphically via Feynman diagrams. For more details and an extensive bibliography, we refer to [148]. More details on other approaches to Feynman path integrals can be found, e.g., in [6].

The literature on the infinite-dimensional Riccati equation is also extensive, mainly due to its application in optimal control, see, e.g., McEneaney [204] and references therein. It is often analysed in the setting of Hilbert spaces. Following [147], we gave a direct proof of its well-posedness for unbounded families $A(t)$, $C(t)$ in a Banach space via an explicit formula that arises from the ‘interaction representation’, while we by-passed any optimization interpretations and related tools. The related theory of measure-valued infinite-dimensional diffusion arises naturally in the analysis of fluctuations (dynamical central-limit theorem) of various statistical mechanical systems, see [143] and [145].

The method of duality is a well-established tool for proving the uniqueness of evolution equations. It will be further developed in Chapter 5. Its concrete version in Theorem 4.10.1 follows the exposition from [147], but is close to the classical expositions of [219] and [91].

Chapter 5

Linear Evolutionary Equations: Advanced Theory

In this chapter, we continue the study of linear equations with unbounded r.h.s. in Banach spaces, thereby further developing the methods of semigroups and propagators. The first part (Sections 5.1 to 5.3) is devoted to methods for building propagators and their generator families by combining (mixing, adding, making them time-varying) the generators of the semigroup. The second part of the chapter (Sections 5.4 to 5.7) is devoted to the method of frozen coefficients, which is a variation of the more general method of parametrix. This method aims at building solutions to ΨDEs with variable coefficients from combinations of the solution to the equations with constant coefficients. It is developed here in a rather general form and illustrated on various examples, including Lévy–Khintchin-type generators and various mixed fractional Laplacians. In the final part of the chapter, general methods for proving uniqueness are introduced, with a discussion of the notion of generalized solutions as a by-product. Particular attention is given to positivity-preserving evolutions, which leads to the so-called Feller semigroups that play a key role in stochastic calculus. As insightful examples of linear equations and the related semigroups, we discuss in some detail the convolution semigroups generated by Lévy–Khintchin operators in Banach spaces and by generators of order at most one. The chapter is completed with a brief discussion of smoothing and smoothness.

© Springer Nature Switzerland AG 2019 V. Kolokoltsov, Differential Equations on Measures and Functional Spaces, Birkhäuser Advanced Texts Basler Lehrbücher, https://doi.org/10.1007/978-3-030-03377-4_5

5.1 T-products with three-level Banach towers

Extending the theory of Section 2.3 to unbounded generators, we shall now develop the notion of T-products or chronological products, or operator-valued multiplicative integrals. They are an important tool for building propagators that are generated by families of operators, each of which generates a sufficiently regular semigroup.

Let $L_t$, $t \in \mathbf{R}$, be a family of operators in the Banach space $B$ such that each of them generates a strongly continuous semigroup $\exp\{sL_t\}$ in $B$, having the dense subspace $D \subset B$ as an invariant core. The natural question arises whether the family $L_t$ generates a propagator that solves the Cauchy problem
\[
\dot{u}(t) = L_t u(t), \quad t \ge s, \quad u_s = u \in D, \tag{5.1}
\]
or a backward propagator solving the backward Cauchy problem
\[
\dot{u}(t) = -L_t u(t), \quad t \le s, \quad u_s = u \in D. \tag{5.2}
\]

Since each $L_t$ generates a semigroup, one can expect the solution to be obtained by the limit of discrete approximations, where the evolution is defined by the corresponding semigroup for each step. To be precise, for a partition $\Delta = \{0 = t_0 < t_1 < \cdots < t_N = T\}$ of an interval $[0, T]$, let us define a family of operators $U_\Delta(\tau, s)$, $0 \le s \le \tau \le T$, by the following rules:
\[
\begin{aligned}
U_\Delta(\tau, s) &= \exp\{(\tau - s) L_{t_j}\}, & t_j \le s \le \tau \le t_{j+1}, \\
U_\Delta(\tau, r) &= U_\Delta(\tau, s)\, U_\Delta(s, r), & 0 \le r \le s \le \tau \le T.
\end{aligned}
\tag{5.3}
\]

It is clear that each $U_\Delta$ is a propagator in $B$ with the invariant domain $D$, generated by the family $L_{s(\Delta)}$, where $s(\Delta) = t_j$ for $t_j \le s < t_{j+1}$.

Remark 85. In this context, the notion of generation is used in the sense of Remark 75, since the corresponding differential equations hold outside the finite set $\{t_1, \ldots, t_N\}$.

Let $\Delta t_j = t_{j+1} - t_j$ and $|\Delta| = \max_j \Delta t_j$. If the limit
\[
U^{s,r} f = \lim_{|\Delta| \to 0} U_\Delta(s, r) f, \quad s \ge r, \tag{5.4}
\]
exists for some $f$ and all $0 \le r \le s \le T$, it is called the T-product (or chronological product or chronological exponential) of $L_t$ applied to $f$. It is denoted by
\[
T \exp\Big\{ \int_r^s L_\tau \, d\tau \Big\} f = U^{s,r}_{\mathrm{for}} f, \quad r \le s,
\]
where the suffix ‘for’ refers to the fact that this T-product is ‘forward’ in time. As mentioned above, one expects the T-product to provide a solution to (5.1).

The backward Cauchy problem (5.2) can be handled similarly. Namely, for a partition $\Delta = \{0 = t_0 < t_1 < \cdots < t_N = t\}$ we define the family of operators $U_\Delta(s, \tau)$, $0 \le s \le \tau \le t$, by the following rules:
\[
\begin{aligned}
U_\Delta(s, \tau) &= \exp\{(\tau - s) L_{t_{j+1}}\}, & t_j \le s \le \tau \le t_{j+1}, \\
U_\Delta(r, \tau) &= U_\Delta(r, s)\, U_\Delta(s, \tau), & 0 \le r \le s \le \tau \le T.
\end{aligned}
\tag{5.5}
\]
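The discrete approximations (5.3) and the limit (5.4) can be explored numerically in finite dimensions. The following is a minimal sketch, assuming 2×2 matrices in place of the operators $L_t$ (so that $B = D = \mathbf{R}^2$); the family `generator(t)`, the uniform partitions and all helper names are illustrative choices and do not come from the text.

```python
def matmul(a, b):
    """Product of two 2x2 matrices given as nested lists."""
    return [[sum(a[i][k] * b[k][j] for k in range(2)) for j in range(2)]
            for i in range(2)]


def expm(a, terms=20):
    """Matrix exponential of a 2x2 matrix (Taylor series with scaling and squaring)."""
    norm = max(sum(abs(x) for x in row) for row in a)
    squarings = 0
    while norm > 0.5:
        norm /= 2.0
        squarings += 1
    scaled = [[x / 2.0 ** squarings for x in row] for row in a]
    result = [[1.0, 0.0], [0.0, 1.0]]
    term = [[1.0, 0.0], [0.0, 1.0]]
    for n in range(1, terms):
        term = [[x / n for x in row] for row in matmul(term, scaled)]
        result = [[result[i][j] + term[i][j] for j in range(2)] for i in range(2)]
    for _ in range(squarings):
        result = matmul(result, result)
    return result


def generator(t):
    """Hypothetical family L_t: an oscillator with a time-dependent frequency."""
    return [[0.0, 1.0], [-(1.0 + t), 0.0]]


def u_delta(n_steps, horizon=1.0):
    """U_Delta(T, 0) as in (5.3) for the uniform partition with n_steps cells."""
    dt = horizon / n_steps
    result = [[1.0, 0.0], [0.0, 1.0]]
    for j in range(n_steps):
        frozen = generator(j * dt)  # generator frozen at the left endpoint t_j
        step = expm([[dt * x for x in row] for row in frozen])
        result = matmul(step, result)  # later time steps act on the left
    return result


def distance(a, b):
    """Maximal absolute row sum of the difference a - b."""
    return max(sum(abs(a[i][j] - b[i][j]) for j in range(2)) for i in range(2))


def errors(step_counts=(8, 16, 32, 64), reference_steps=4096):
    """Distances between coarse approximations and a much finer one."""
    reference = u_delta(reference_steps)
    return [distance(u_delta(n), reference) for n in step_counts]


if __name__ == "__main__":
    for n, err in zip((8, 16, 32, 64), errors()):
        print(n, err)
```

In this toy setting the distances to the fine-partition reference shrink as the mesh $|\Delta|$ decreases, which is the finite-dimensional shadow of the convergence in (5.4). The fine partition is only a stand-in for the limit, not an exact solution.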

If the limit
\[
U^{s,r} f = \lim_{|\Delta| \to 0} U_\Delta(s, r) f, \quad s \le r, \tag{5.6}
\]
exists for some $f$ and all $0 \le s \le r \le T$, it is called the backward T-product of $L_t$ applied to $f$. It is denoted by
\[
\tilde{T} \exp\Big\{ \int_s^r L_\tau \, d\tau \Big\} f = T \exp\Big\{ \int_r^s L_\tau \, d\tau \Big\} f = T \exp\Big\{ \int_s^r (-L_\tau) \, d\tau \Big\} f = U^{s,r}_{\mathrm{back}} f, \quad s \le r. \tag{5.7}
\]

One expects this T-product to provide a solution to (5.2).

Remark 86. Due to this fact, it is customary in physics to denote propagators that are generated by a family $L_t$ by the corresponding T-product. In fact, one can show that if the propagator is well defined, then the convergence in (5.6) takes place under very general conditions.

In what follows, we tackle the more complicated problem of proving convergence without initially assuming the existence of a propagator generated by $L_t$. For a rigorous treatment, it is handy to work with the common invariant core being equipped with its own Banach space structure (two-level Banach tower), as we did in Proposition 4.13.1.

Theorem 5.1.1. Let $B$ be a Banach space and $D$ a dense subspace therein, which is itself Banach with respect to the norm $\|\cdot\|_D \ge \|\cdot\|_B$. Let a family $L_t$, $t \in [0, T]$, of linear operators in $B$ be given such that

(i) each $L_t$ generates a strongly continuous semigroup $e^{sL_t}$, $s \ge 0$, in $B$ with the invariant core $D$ such that
\[
\|\exp\{sL_t\}\|_{B \to B} \le e^{Ks}, \quad \|\exp\{sL_t\}\|_{D \to D} \le e^{Ks}, \quad s, t \in [0, T], \tag{5.8}
\]
with some constant $K$;

(ii) $L_t$ are uniformly bounded operators $D \to B$, so that $\|L_t\|_{D \to B} \le \bar{L} < \infty$. They depend continuously on $t$ in the norm topology of $\mathcal{L}(D, B)$.

Then:

(i) the T-products $U^{r,s}_{\mathrm{for}} f = T \exp\{\int_s^r L_\tau \, d\tau\} f$ and $U^{s,r}_{\mathrm{back}} f = T \exp\{\int_s^r (-L_\tau) \, d\tau\} f$ exist for all $f \in B$, where the limit is understood in the norm of $B$. Moreover, the convergence in (5.4), (5.6) is uniform in $f$ on any bounded subset of $D$ and in $s, t \in [0, T]$;

(ii) the T-products $U^{r,s}_{\mathrm{for}}$ and $U^{s,r}_{\mathrm{back}}$ form a bounded strongly continuous propagator and a backward propagator, respectively, with the bounds
\[
\|U^{r,s}_{\mathrm{for}}\|_{B \to B} \le e^{K(r-s)}, \quad \|U^{s,r}_{\mathrm{back}}\|_{B \to B} \le e^{K(r-s)}; \tag{5.9}
\]

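(iii) the T-products $U^{r,s}_{\mathrm{for}}$ and $U^{s,r}_{\mathrm{back}}$ satisfy the following equations:
\[
\frac{d}{dt} U^{s,t}_{\mathrm{for}} f = -U^{s,t}_{\mathrm{for}} L_t f, \quad s \ge t, \tag{5.10}
\]
and
\[
\frac{d}{dt} U^{s,t}_{\mathrm{back}} f = U^{s,t}_{\mathrm{back}} L_t f, \quad s \le t, \tag{5.11}
\]
for any $f \in D$;

(iv) the dual propagators $V^{t,s}_{\mathrm{for}} = (U^{s,t}_{\mathrm{back}})^*$ and $V^{t,s}_{\mathrm{back}} = (U^{s,t}_{\mathrm{for}})^*$ in $B^*$ solve the weak versions of the dual equations to (5.2) and (5.1):
\[
\frac{d}{ds} (f, V^{s,t}_{\mathrm{for}} \xi) = (L_s f, V^{s,t}_{\mathrm{for}} \xi), \quad t \le s \le r, \quad f \in D, \tag{5.12}
\]
\[
\frac{d}{ds} (f, V^{s,t}_{\mathrm{back}} \xi) = (L_s f, V^{s,t}_{\mathrm{back}} \xi), \quad r \le s \le t, \quad f \in D. \tag{5.13}
\]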

Remark 87. Let us highlight the importance of exponential bounds (5.8) that we sometimes refer to as regular bounds. In the sequel, they shall be regularly used. They are much stronger than just the requirement of boundedness of the semigroups. Semigroups with such estimates for the resolvents are called ‘stable’ by some authors (see, e.g., [100]). Examples for the case when this property does not hold are given after Proposition 4.4.1.

Proof. For the sake of definiteness, let us work with the forward propagator, the case of the backward propagator being analogous.

(i) For two partitions $\Delta$ and $\Delta'$, we use formula (4.134) to write
\[
(U_\Delta(s, r) - U_{\Delta'}(s, r)) f = \int_r^s U_\Delta(s, \tau) (L_{\tau(\Delta)} - L_{\tau(\Delta')})\, U_{\Delta'}(\tau, r) f \, d\tau,
\]
where $s(\Delta) = t_j$ for $t_j \le s < t_{j+1}$. In fact, each $U_\Delta$ is a propagator in $B$ with the invariant domain $D$ generated by the family $L_{t(\Delta)}$.

Remark 88. Unlike the setting of Proposition 4.9.2, the propagators $U_\Delta^{s,r}$ are differentiable only outside a finite subset of $s$ and $r$. However, this does not affect the validity of the representation (4.134), see Remark 75.

By (5.8), we find
\[
\|U_{\Delta'}(\tau, r)\|_{D \to D} \le e^{K(\tau - r)}, \quad \|U_\Delta(s, \tau)\|_{B \to B} \le e^{K(s - \tau)}.
\]
The assumed continuity of $L_t$ therefore yields
\[
\|(U_\Delta(s, r) - U_{\Delta'}(s, r)) f\|_B \to 0, \quad \text{as } |\Delta|, |\Delta'| \to 0,
\]
uniformly in $f$ on any bounded subset of $D$ and in $s, t \in [0, T]$. The existence of the T-product for arbitrary $f \in B$ follows by approximating $f$ by elements from $D$ and using the uniform boundedness of all approximations $U_\Delta(s, r)$.

(ii) The bound (5.9) follows from the convergence of the approximations with the same bound. Next, choosing the partitions $\Delta$ of $[r, \tau]$ containing the point $s \in (r, \tau)$ in such a way that they are composed of a partition $\Delta_1$ of $[r, s]$ and a partition $\Delta_2$ of $[s, \tau]$, we get
\[
U^{\tau,r}_{\mathrm{for}} = \lim_{|\Delta_1|, |\Delta_2| \to 0} U_{\Delta_2}(\tau, s)\, U_{\Delta_1}(s, r) = U^{\tau,s}_{\mathrm{for}} U^{s,r}_{\mathrm{for}},
\]
showing the chain rule for $U^{r,s}_{\mathrm{for}}$. If $f \in D$, then the equations
\[
U_\Delta(s, r) f - f = \int_r^s L_{\tau(\Delta)} U_\Delta(\tau, r) f \, d\tau
\]
imply that
\[
\|U_\Delta(s, r) f - f\|_B \le \bar{L} (s - r) e^{K(s-r)} \|f\|_D.
\]
Therefore, the family $U_\Delta(s, r) f$ is Lipschitz-continuous as a function $(s, r) \to B$ uniformly for $f$ bounded in $D$ and all $\Delta$. By approximations, one gets the strong continuity of the family $U^{r,s}_{\mathrm{for}}$ in $B$.

(iii) Applying (4.133) to the propagator $U_\Delta(s, r) f$ (with the inverted sign, since (4.133) applies to backward propagators) and integrating yields
\[
U^{s,t}_\Delta f - f = -\int_s^t U^{s,r}_\Delta L_{r(\Delta)} f \, dr, \quad f \in D. \tag{5.14}
\]
Passing to the limit $|\Delta| \to 0$ yields
\[
U^{s,t}_{\mathrm{for}} f - f = -\int_s^t U^{s,r}_{\mathrm{for}} L_r f \, dr.
\]
For $f \in D$, the function under the integral on the r.h.s. is continuous, which implies (5.10) after differentiation.

(iv) Equation (5.13) follows from (5.10) by the definition of duality.

The difficulty to get $U_{\mathrm{for}}$ or $U_{\mathrm{back}}$ fully generated by the family $L_t$ in the sense of the ‘strong’ equations (4.129) and (4.130) stems from the fact that under the assumptions of Theorem 5.1.1 there seems to be no way to check the invariance of $D$ under $U_{\mathrm{for}}$ or $U_{\mathrm{back}}$. Therefore, the link between the propagators and the ‘generators’ is only given by the weaker equations (5.10) and (5.11). As we shall show shortly, this issue can be resolved by working with three Banach spaces. But let us first specify the notion of solutions arising from Theorem 5.1.1.

Let us say that $u_t$ is a generalized solution via discrete approximations to the Cauchy problem (5.1) or (5.2), if for any $\epsilon > 0$ there exists $\delta$ such that for any partition $\Delta$ with $|\Delta| < \delta$,
\[
\|u_t - U_\Delta(t, s) u\| < \epsilon, \tag{5.15}
\]
with $U_\Delta$ given by (5.3), $t \in [s, T]$, or by (5.5), $t \in [0, s]$, respectively.

Theorem 5.1.2. Under the assumptions of Theorem 5.1.1, $U^{s,t}_{\mathrm{for}} u$ and $U^{s,t}_{\mathrm{back}} u$ represent the unique generalized solutions to (5.1) or (5.2), respectively. Moreover, any classical solution $u_t$ (such that $u_t \in D$ for all $t$ and the equations (5.1) or (5.2) hold rigorously) is a generalized solution.

Remark 89. Defining generalized solutions via discrete approximations is a standard approach in the theory of differential equations. For instance, the so-called $C^0$-solutions that are used in the Crandall–Liggett theory of accretive operators (see, e.g., [244] or [262]) are defined in this way. Of course, the reasonability of such a definition should be supported by a result like Theorem 5.1.2.

Proof. The uniqueness follows from the existence of the T-product, see Theorem 5.1.1(i). Furthermore, suppose that $u_t$ is a classical solution with the initial condition $u \in D$ to, say, equation (5.1). Then
\[
u_t - U_\Delta(t, s) u_s = U_\Delta(t, r) u_r \Big|_{r=s}^{t} = \int_s^t U_\Delta(t, r) (L_r - L_{r(\Delta)}) u_r \, dr,
\]
which tends to 0 as $|\Delta| \to 0$.



As mentioned before, the natural setting for proving that the above T-products solve the equations (5.1) and (5.2) classically is obtained by the method of embedded Banach spaces or Banach towers. In this context, a triple of spaces (three-level Banach tower) is sufficient. Namely, let us introduce another dense subspace $\tilde{D} \subset D \subset B$ such that $\tilde{D}$ is itself a Banach space under the norm $\|\cdot\|_{\tilde{D}} \ge \|\cdot\|_D$.

Theorem 5.1.3. Under the assumptions of Theorem 5.1.1, suppose additionally that

(i) the $L_t$ are also uniformly bounded as operators $\tilde{D} \to D$, so that $\|L_t\|_{\tilde{D} \to D} \le \bar{L} < \infty$;

(ii) $\tilde{D}$ is also invariant under all $e^{sL_t}$, and these operators are uniformly bounded as operators in $\tilde{D}$, with their norms not exceeding $e^{Ks}$.

Then, if $f \in D$, the approximations $U_\Delta(s, r)$ converge also in $D$. Therefore, the space $D$ is invariant under all T-products $U^{t,s}_{\mathrm{for}}$ and $U^{t,s}_{\mathrm{back}}$. Moreover, these T-products define strongly continuous propagators in $D$ which solve the problems (5.1) or (5.2) for any $f \in D$.

Proof. Using the pair $(\tilde{D}, D)$ instead of $(D, B)$, one shows analogously to Theorem 5.1.1 the convergence in $D$ for $f \in \tilde{D}$ first, and then extends the result to all $f \in D$ by a density argument. Similarly, one shows strong continuity in $D$ first for $f \in \tilde{D}$ and then for all $f \in D$. Once this is proven, the equations (5.1) or (5.2) can be obtained either by passing to the limit in similar equations for the approximations $U^{t,s}_\Delta$, or from the equations (5.10), (5.11) by the arguments used in Proposition 4.9.1.

As another example for the use of this three-spaces technique, let us extend (at least partially) Proposition 4.2.6 to the preservation of regularity for propagators. To this end, we assume the Banach spaces $\tilde{D} \subset D \subset B$ as above.

Proposition 5.1.1. Let $U^{t,s}$ be a strongly continuous propagator (or backward propagator) of linear operators in $B$ generated by the family $A_t$ on the invariant domain $D$. Assume next that the space $\tilde{D}$ is also invariant, that all operators $U^{t,s}$ are uniformly bounded and strongly continuous as operators $D \to D$ and $\tilde{D} \to \tilde{D}$, and that the $A_t$ are bounded and strongly continuous as operators $D \to B$ and $\tilde{D} \to D$. Then the propagator (or backward propagator) $U^{t,s}$ in $D$ is generated by the same family $A_t$ on the invariant domain $\tilde{D}$.

Proof. Let us talk about backward propagators for the sake of definiteness. Since $U^{t,s}$ is generated by $A_t$ on $D$, we have
\[
U^{t,s} f - f = (s - t) A_s f + \int_t^s (A_r U^{r,s} - A_s) f \, dr
\]
for any $f \in D$. If $f \in \tilde{D}$, then the function under the integral is continuous in the topology of $D$, because $U^{t,s}$ is strongly continuous in $\tilde{D}$ and the $A_t$ are continuous as mappings $\tilde{D} \to D$. This implies
\[
\lim_{s - t \to 0+} \Big\| \frac{1}{s - t} (U^{t,s} f - f) - A_s f \Big\|_D = 0,
\]
so that equation (4.131) holds in the topology of $D$. The rest follows from Proposition 4.9.1.

As a direct consequence of the obtained results, we can extend Theorem 4.3.1 to the time-dependent case (with stronger assumptions on regularity):

Theorem 5.1.4. In the diffusion operators
\[
L_t u = \frac{1}{2} (A(t, x) \nabla, \nabla) u + (b(t, x), \nabla) u(x), \quad x \in \mathbf{R}^d, \tag{5.16}
\]
let $A(t, x) = \sigma(t, x) \sigma^T(t, x)$, and let both $\sigma$ and $b$ be continuous and belong to $C^4(\mathbf{R}^d)$ as functions of $x$. Then the family of operators $L_t$ in (5.16) generates a strongly continuous backward propagator $U^{s,t}$ in $C_\infty(\mathbf{R}^d)$ on the common invariant domain $C^2_\infty(\mathbf{R}^d)$, so that $\|U^{s,t}\|_{\mathcal{L}(C^2(\mathbf{R}^d))} \le e^{Kt}$.

Remark 90. In this setting, the direct probabilistic approach still gives better results, since it allows for a straightforward time-dependent extension of Theorem 4.3.1 without strengthening the regularity assumptions.

5.2 Adding generators with 4-level Banach towers

We now address the problem of ‘adding generators’, i.e., constructing a propagator that is generated by the family $L_t^1 + \cdots + L_t^n$ from the propagators that are generated by each $L_t^j$ separately. We start with the case $n = 2$. The Lie–Trotter–Daletski–Chernoff formula
\[
e^{L_1 + L_2} = \lim_{n \to \infty} (e^{L_1/n} e^{L_2/n})^n \tag{5.17}
\]
was established and widely applied under various assumptions for linear operators $L_1$, $L_2$. In most cases, it is obtained under the condition that the semigroup $\exp\{t(L_1 + L_2)\}$ exists, see [232] and the bibliography therein. We are now going to discuss a situation where the semigroups generated by $L_1$, $L_2$ are sufficiently regular for deducing the existence of the semigroup generated by $L_1 + L_2$, for identifying its invariant core and for getting the precise rates of convergence. Furthermore, we will extend this result to time-dependent generators.

For two given operators $L_1$, $L_2$ that generate the bounded semigroups $e^{tL_1}$ and $e^{tL_2}$ in a Banach space and for a given $\tau > 0$, let us define the family of bounded operators $U_t^\tau$, $t > 0$, in the following way: For $k = 0$ or $k \in \mathbf{N}$, let
\[
U_t^\tau = e^{(t - 2k\tau) L_1} (e^{\tau L_2} e^{\tau L_1})^k, \quad 2k\tau \le t \le (2k+1)\tau, \tag{5.18}
\]
\[
U_t^\tau = e^{(t - (2k+1)\tau) L_2} e^{\tau L_1} (e^{\tau L_2} e^{\tau L_1})^k, \quad (2k+1)\tau \le t \le (2k+2)\tau. \tag{5.19}
\]

As in Theorem 5.1.3, we shall work with Banach towers. Here, however, the four-level tower $D_3 \subset D_2 \subset D \subset B$ turns out to be useful. (A definition of Banach towers can be found prior to Proposition 4.2.6.) As the following results reveal, higher-level Banach towers can often be used as an intermediate tool, when the operators under analysis belong to a class where more regular approximations are naturally available. In particular, this is the case for the wide class of pseudodifferential operators (see the discussion prior to Theorem 5.15.1).

Theorem 5.2.1. Suppose that

(i) the linear operators $L_1$, $L_2$ in $B$ generate strongly continuous semigroups $e^{tL_1}$ and $e^{tL_2}$ in $B$ with $D$ being their common invariant core, and
\[
\max\big( \|e^{tL_1}\|_B, \|e^{tL_2}\|_B, \|e^{tL_1}\|_D, \|e^{tL_2}\|_D \big) \le e^{Kt}
\]
with a constant $K$;

(ii) $D_2$ and $D_3$ are also invariant under $e^{tL_1}$ and $e^{tL_2}$ with
\[
\max\big( \|e^{tL_1}\|_{D_2}, \|e^{tL_2}\|_{D_2} \big) \le e^{K_2 t}, \quad \max\big( \|e^{tL_1}\|_{D_3}, \|e^{tL_2}\|_{D_3} \big) \le e^{K_3 t},
\]
with constants $K_2$, $K_3$, where we assume for the sake of definiteness that $K_3 \ge K_2 \ge K$;

(iii) $L_1$, $L_2$ are bounded as operators $D \to B$, $D_2 \to D$ and $D_3 \to D_2$ with norms that are bounded by some constants $\bar{L}_{DB}$, $\bar{L}_2$, $\bar{L}_3$.

Then

(i) for any $T > 0$ and $f \in B$, the curves $U_t^{\tau_k} f$, $\tau_k = 2^{-k}$, converge in $C([0, T], B)$ to a curve $U_t f$, as $k \to \infty$. For $f \in D$, this convergence holds in $C([0, T], D)$, so that $U_t f \in C([0, T], D)$;

(ii) the norms $\|U_t\|_B$ and $\|U_t\|_D$ are bounded by $e^{Kt}$;

(iii) for $f \in D_2$ and $\tau_k < \min(1, t/2)$, we have
\[
\|(U_t^{\tau_k} - U_t) f\|_B \le 6 \bar{L}_{DB} \bar{L}_2 e^{2K_2 \tau_k} e^{K_2 t} \frac{t \tau_k}{1 - \tau_k} \|f\|_{D_2}; \tag{5.20}
\]

(iv) the operators $U_t$ form a strongly continuous semigroup in $B$ with the generator $(L_1 + L_2)/2$, with $D$ being its invariant core;

(v) the curve $U_t f$ is a Lipschitz-continuous function $t \to U_t f \in D$ for any $f \in D_2$.

Proof. Let $\tau \le t \in [0, T]$ with an arbitrary given $T > 0$. Assumption (ii) implies
\[
\|U_t^\tau\|_{B \to B} \le e^{Kt}, \quad \|U_t^\tau\|_{D \to D} \le e^{Kt}, \quad \|U_t^\tau\|_{D_2 \to D_2} \le e^{K_2 t}. \tag{5.21}
\]
Next, for any $f \in D$ and $i = 1, 2$, we find
\[
e^{tL_i} f - f = \int_0^t L_i e^{sL_i} f \, ds, \tag{5.22}
\]
which implies
\[
\|e^{tL_i} f - f\|_B \le t \bar{L}_{DB} e^{Kt} \|f\|_D, \quad \|e^{tL_i} f - f\|_D \le t \bar{L}_2 e^{K_2 t} \|f\|_{D_2}. \tag{5.23}
\]
Moreover, we find
\[
e^{tL_i} f = f + t L_i f + \int_0^t L_i (e^{sL_i} f - f) \, ds,
\]
which implies
\[
\|e^{tL_i} f - f - t L_i f\|_B \le t^2 \bar{L}_{DB} \bar{L}_2 e^{K_2 t} \|f\|_{D_2}. \tag{5.24}
\]
Consequently,
\[
\|e^{tL_2} e^{tL_1} f - e^{tL_2} f - t e^{tL_2} L_1 f\|_B \le t^2 \bar{L}_{DB} \bar{L}_2 e^{2K_2 t} \|f\|_{D_2},
\]
and therefore, by approximating $e^{tL_2} f$ by $f + t L_2 f$ via (5.24) and $t e^{tL_2} L_1 f$ by $t L_1 f$ via (5.22) applied to $t L_1 f$, the following estimate holds:
\[
\|e^{tL_2} e^{tL_1} f - f - t (L_1 + L_2) f\|_B \le 3 t^2 \bar{L}_{DB} \bar{L}_2 e^{2K_2 t} \|f\|_{D_2}. \tag{5.25}
\]
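The quadratic smallness in (5.25) is easy to observe in finite dimensions. The sketch below is only an illustration under simplifying assumptions: $L_1$, $L_2$ are two hypothetical non-commuting 2×2 matrices (so all spaces of the tower coincide with $\mathbf{R}^2$), and the function names are not from the text. It evaluates the norm of $e^{tL_2} e^{tL_1} - 1 - t(L_1 + L_2)$ divided by $t^2$, which should stay bounded as $t \to 0$.

```python
def matmul(a, b):
    """Product of two 2x2 matrices given as nested lists."""
    return [[sum(a[i][k] * b[k][j] for k in range(2)) for j in range(2)]
            for i in range(2)]


def expm(a, terms=20):
    """Matrix exponential of a 2x2 matrix (Taylor series with scaling and squaring)."""
    norm = max(sum(abs(x) for x in row) for row in a)
    squarings = 0
    while norm > 0.5:
        norm /= 2.0
        squarings += 1
    scaled = [[x / 2.0 ** squarings for x in row] for row in a]
    result = [[1.0, 0.0], [0.0, 1.0]]
    term = [[1.0, 0.0], [0.0, 1.0]]
    for n in range(1, terms):
        term = [[x / n for x in row] for row in matmul(term, scaled)]
        result = [[result[i][j] + term[i][j] for j in range(2)] for i in range(2)]
    for _ in range(squarings):
        result = matmul(result, result)
    return result


# Hypothetical generators: a rotation and a diagonal decay (they do not commute).
L1 = [[0.0, 1.0], [-1.0, 0.0]]
L2 = [[-1.0, 0.0], [0.0, -2.0]]


def second_order_ratio(t, l1, l2):
    """Norm of exp(t*l2) exp(t*l1) - 1 - t*(l1 + l2), divided by t**2."""
    e1 = expm([[t * x for x in row] for row in l1])
    e2 = expm([[t * x for x in row] for row in l2])
    product = matmul(e2, e1)
    identity = [[1.0, 0.0], [0.0, 1.0]]
    remainder = [[product[i][j] - identity[i][j] - t * (l1[i][j] + l2[i][j])
                  for j in range(2)] for i in range(2)]
    # Maximal absolute row sum, used here as the operator norm.
    return max(sum(abs(x) for x in row) for row in remainder) / t ** 2


if __name__ == "__main__":
    for t in (0.1, 0.01, 0.001):
        print(t, second_order_ratio(t, L1, L2))
```

For matrices the ratio tends to the norm of $L_2^2/2 + L_2 L_1 + L_1^2/2$, which plays the role of the constant on the right-hand side of (5.25); in the operator setting this constant is controlled by $\bar{L}_{DB} \bar{L}_2$ instead.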

Consequently,
\[
\|(e^{tL_1} e^{tL_2} - e^{tL_2} e^{tL_1}) f\|_B \le 6 t^2 \bar{L}_{DB} \bar{L}_2 e^{2K_2 t} \|f\|_{D_2}, \tag{5.26}
\]
and therefore
\[
\|(e^{2tL_1} e^{2tL_2} - e^{tL_1} e^{tL_2} e^{tL_1} e^{tL_2}) f\|_B = \|e^{tL_1} (e^{tL_1} e^{tL_2} - e^{tL_2} e^{tL_1}) e^{tL_2} f\|_B \le 6 t^2 \bar{L}_{DB} \bar{L}_2 e^{4K_2 t} \|f\|_{D_2}.
\]
Writing now
\[
(e^{\tau L_2} e^{\tau L_1})^k - (e^{\tau L_2/2} e^{\tau L_1/2})^{2k} = \sum_{l=1}^{k} (e^{\tau L_2} e^{\tau L_1})^{k-l} \big[ e^{\tau L_2} e^{\tau L_1} - (e^{\tau L_2/2} e^{\tau L_1/2})^2 \big] (e^{\tau L_2/2} e^{\tau L_1/2})^{2l-2},
\]
we can conclude that
\[
\Big\| \big( (e^{\tau L_2} e^{\tau L_1})^k - (e^{\tau L_2/2} e^{\tau L_1/2})^{2k} \big) f \Big\|_B \le 6 k \tau^2 \bar{L}_{DB} \bar{L}_2 e^{2K_2 \tau} e^{K_2 t} \|f\|_{D_2},
\]
so that
\[
\|(U_t^\tau - U_t^{\tau/2}) f\|_B \le 6 \tau t \bar{L}_{DB} \bar{L}_2 e^{2K_2 \tau} e^{K_2 t} \|f\|_{D_2}. \tag{5.27}
\]
Consequently, for a natural number $k < l$ and $\tau < 1$, we find
\[
\|(U_t^{\tau_k} - U_t^{\tau_l}) f\|_B \le \sum_{n > k} \|(U_t^{\tau_n} - U_t^{\tau_{n+1}}) f\|_B \le 6 \bar{L}_{DB} \bar{L}_2 e^{4K_2 \tau} e^{K_2 t} \frac{t \tau_k}{1 - \tau_k} \|f\|_{D_2},
\]
which implies that $U_t^{\tau_k}$ converges in $C([0, T], B)$ as $k \to \infty$ to a curve (which we denote by $U_t f$), and that the estimate (5.20) holds. Since the $U_t$ are uniformly bounded, we can deduce by the usual density argument that $U_t^{\tau_k} f$ converges for any $f \in B$, and that the limiting set of operators $U_t$ forms a bounded semigroup in $B$.

Repeating the above arguments for the triple of spaces $D_3 \subset D_2 \subset D$ yields the convergence of the approximations in $D$ and therefore the invariance of $D$ under $U_t$ and the required bound for $U_t$ in $D$. Next, by (5.23), we find
\[
\big\| e^{\tau \tilde{L}_k} \cdots e^{\tau \tilde{L}_1} f - f \big\|_D = \Big\| \sum_{i=1}^{k} (e^{\tau \tilde{L}_i} - 1) \prod_{l=1}^{i-1} e^{\tau \tilde{L}_l} f \Big\|_D \le \tau k \bar{L}_2 e^{k K_2 \tau} \|f\|_{D_2},
\]
where each $\tilde{L}_i$ is any of the operators $L_1$, $L_2$. Therefore,
\[
\|U_t^\tau f - f\|_D \le t \bar{L}_2 e^{K_2 t} \|f\|_{D_2},
\]

and thus, for $t_2 > t_1$,
\[
\|U_{t_2}^\tau f - U_{t_1}^\tau f\|_D = \|(U_{t_2 - t_1}^\tau - 1) U_{t_1}^\tau f\|_D \le (t_2 - t_1) \bar{L}_2 e^{K_2 t_2} \|f\|_{D_2}, \tag{5.28}
\]
which proves the Lipschitz continuity of the approximations and therefore also of the limiting propagator in $D$.

It remains to prove statement (iv) of the theorem. Let first $f \in D_2$ and let $t$ be a binary rational. Then $t/2\tau_k \in \mathbf{N}$ for $k$ large enough, so that
\[
U_t^{\tau_k} f = (e^{\tau_k L_2} e^{\tau_k L_1})^{t/2\tau_k} f
\]
and
\[
U_t^{\tau_k} f - f = \sum_{l=0}^{(t/2\tau_k) - 1} (e^{\tau_k L_2} e^{\tau_k L_1} - 1) (e^{\tau_k L_2} e^{\tau_k L_1})^l f.
\]
Therefore, (5.25) leads to
\[
\begin{aligned}
\Big\| U_t^{\tau_k} f - f - \tau_k \sum_{l=0}^{(t/2\tau_k) - 1} (L_1 + L_2) (e^{\tau_k L_2} e^{\tau_k L_1})^l f \Big\|_B
&= \Big\| \sum_{l=0}^{(t/2\tau_k) - 1} \big[ e^{\tau_k L_2} e^{\tau_k L_1} - 1 - \tau_k (L_1 + L_2) \big] (e^{\tau_k L_2} e^{\tau_k L_1})^l f \Big\|_B \\
&\le 3 \sum_{l=0}^{(t/2\tau_k) - 1} \tau_k^2 \bar{L}_{DB} \bar{L}_2 e^{2K_2 \tau_k l} \|f\|_{D_2} \le 3 t \tau_k e^{tK_2} \bar{L}_{DB} \bar{L}_2 \|f\|_{D_2}.
\end{aligned}
\]
Consequently,
\[
U_t^{\tau_k} f - f = \frac{1}{2} (2\tau_k) \sum_{l=0}^{(t/2\tau_k) - 1} (L_1 + L_2) U_{2l\tau_k} f + \tau_k \sum_{l=0}^{(t/2\tau_k) - 1} (L_1 + L_2) \big[ U_{2l\tau_k}^{\tau_k} - U_{2l\tau_k} \big] f + \alpha_k,
\]
\[
\|\alpha_k\|_B \le 3 t \tau_k e^{tK_2} \bar{L}_{DB} \bar{L}_2 \|f\|_{D_2}.
\]
Passing to the limit as $k \to \infty$ in the topology of $B$ and applying (5.20) yields
\[
U_t f - f = \frac{1}{2} \int_0^t (L_1 + L_2) U_s f \, ds. \tag{5.29}
\]
By a density argument, the same formula holds for any $f \in D$. Due to the continuity of $U_t f$ in $D$, it follows from (5.29) that
\[
\frac{d}{dt} \Big|_{t=0} U_t f = \frac{1}{2} (L_1 + L_2) f
\]
for any $f \in D$, where the derivative is defined in the topology of $B$.
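The alternating scheme (5.18)–(5.19) and the first-order rate suggested by (5.20) can be tried out on matrices. What follows is a hedged sketch rather than a faithful model of the theorem: $L_1$, $L_2$ are hypothetical 2×2 matrices, the limit semigroup is computed directly as $\exp\{t(L_1 + L_2)/2\}$, and the helper names are illustrative.

```python
def matmul(a, b):
    """Product of two 2x2 matrices given as nested lists."""
    return [[sum(a[i][k] * b[k][j] for k in range(2)) for j in range(2)]
            for i in range(2)]


def expm(a, terms=20):
    """Matrix exponential of a 2x2 matrix (Taylor series with scaling and squaring)."""
    norm = max(sum(abs(x) for x in row) for row in a)
    squarings = 0
    while norm > 0.5:
        norm /= 2.0
        squarings += 1
    scaled = [[x / 2.0 ** squarings for x in row] for row in a]
    result = [[1.0, 0.0], [0.0, 1.0]]
    term = [[1.0, 0.0], [0.0, 1.0]]
    for n in range(1, terms):
        term = [[x / n for x in row] for row in matmul(term, scaled)]
        result = [[result[i][j] + term[i][j] for j in range(2)] for i in range(2)]
    for _ in range(squarings):
        result = matmul(result, result)
    return result


def scale(c, a):
    """Scalar multiple c * a of a 2x2 matrix."""
    return [[c * x for x in row] for row in a]


# Hypothetical non-commuting generators.
L1 = [[0.0, 1.0], [-1.0, 0.0]]
L2 = [[-1.0, 0.0], [0.0, -2.0]]


def u_tau(t, tau, l1, l2):
    """The operator U_t^tau defined by (5.18) and (5.19)."""
    k = int(t // (2 * tau))  # number of completed cycles of length 2*tau
    cycle = matmul(expm(scale(tau, l2)), expm(scale(tau, l1)))
    result = [[1.0, 0.0], [0.0, 1.0]]
    for _ in range(k):
        result = matmul(cycle, result)
    rest = t - 2 * k * tau
    if rest <= tau:  # case (5.18): a partial step with l1
        return matmul(expm(scale(rest, l1)), result)
    # case (5.19): a full step with l1, then a partial step with l2
    partial = matmul(expm(scale(tau, l1)), result)
    return matmul(expm(scale(rest - tau, l2)), partial)


def splitting_errors(t=1.0, exponents=range(2, 9)):
    """Distance between U_t^{tau_k}, tau_k = 2**-k, and exp(t*(L1+L2)/2)."""
    half_sum = [[0.5 * (L1[i][j] + L2[i][j]) for j in range(2)] for i in range(2)]
    target = expm(scale(t, half_sum))
    errs = []
    for k in exponents:
        approx = u_tau(t, 2.0 ** (-k), L1, L2)
        errs.append(max(sum(abs(approx[i][j] - target[i][j]) for j in range(2))
                        for i in range(2)))
    return errs


if __name__ == "__main__":
    for k, err in zip(range(2, 9), splitting_errors()):
        print(k, err)
```

Halving $\tau_k$ roughly halves the error in this example, in line with the factor $t\tau_k$ in (5.20). Note that the limit is generated by $(L_1 + L_2)/2$ and not by $L_1 + L_2$, because each generator is active only half of the time.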



The possibility of regular approximations provides a way to get rid of lengthy Banach towers. Namely, as a direct consequence of Theorem 5.2.1 and Proposition 4.2.5, we can conclude the following:

Theorem 5.2.2. (i) Suppose that $L_1, L_2 \in \mathcal{L}(D, B)$ and there exists a sequence of pairs of operators $L_1^n$, $L_2^n$, $n \in \mathbf{N}$, satisfying the assumptions of Theorem 5.2.1 for each $n$, with the constant $K$ being independent of $n$, and such that $L_1^n \to L_1$ and $L_2^n \to L_2$, as $n \to \infty$, in the operator topologies of the space $\mathcal{L}(D, B)$. Then the semigroups $\exp\{tL_1^n\}$, $\exp\{tL_2^n\}$ and $\exp\{t(L_1^n + L_2^n)/2\}$ converge in $B$ to some strongly continuous semigroups $T_t^1$, $T_t^2$ and $T_t^{12}$, respectively, such that $D$ belongs to the domain of the generators of these semigroups. Their generators coincide with $L_1$, $L_2$ and $(L_1 + L_2)/2$, respectively.

(ii) Suppose additionally that the constant $K_2$ is independent of $n$ and the convergence $L_1^n \to L_1$ and $L_2^n \to L_2$ holds also in the operator topologies of the space $\mathcal{L}(D_2, D)$. Then the space $D$ is an invariant core for the semigroups $T_t^1$, $T_t^2$, $T_t^{12}$, where these semigroups are generated by the operators $L_1$, $L_2$, $(L_1 + L_2)/2$, respectively.

Another way of reducing the length of the tower is based on additional duality assumptions, as shown in the following result.

Theorem 5.2.3. Suppose that

(i) the linear operators $L_1$, $L_2$ in $B$ generate strongly continuous semigroups $e^{tL_1}$ and $e^{tL_2}$ in $B$ with the spaces $D$ and $D_2$ being invariant, such that $e^{tL_j}$, $j = 1, 2$, have norms that are bounded by $e^{Kt}$ with some $K$ in all spaces $B$, $D$, $D_2$. Moreover, $L_1$ and $L_2$ are bounded as operators $D \to B$ and $D_2 \to D$ with norms that are bounded by some constant $\bar{L}$;

(ii) the spaces $D$ and $B$ are dual Banach spaces, i.e., $B = E_B^*$, $D = E_D^*$ for some Banach spaces $E_B \subset E_D$.

Then the semigroups $U_t^{\tau_k}$ converge to a strongly continuous semigroup $U_t$ in $B$ with an invariant core $D$, where it is generated by $(L_1 + L_2)/2$.

Proof.
Following the proof of Theorem 5.2.1, we obtain (5.27) and therefore the convergence of $U_t^{\tau_k}$ in $B$, as well as the Lipschitz continuity in $t$ of the curves $U_t^{\tau_k} f$ in both spaces $B$ and $D$ for any $f \in D_2$. Using the Banach–Alaoglu theorem on the compactness of the unit ball in $D = E_D^*$ in the ∗-weak topology, we can conclude that for any $f \in D$ the sequence $U_t^{\tau_k} f$ is bounded in $D$ and therefore relatively compact in the ∗-weak topology. Hence there exists a converging subsequence $f_l$. But since $E_B \subset E_D$, it follows that $f_l$ converges ∗-weakly in $B$. But then its limit is $U_t f$. Therefore, all converging subsequences have the same limit. Consequently, the sequence $U_t^{\tau_k} f$ converges ∗-weakly in $D$, and hence $U_t f \in D$. Therefore, $D$ turns out to be invariant under $U_t$, and the $U_t$ are bounded operators in $D$. Moreover, the Lipschitz continuity of $U_t^{\tau_k} f$ in $D$ implies that the $U_t f$ are also Lipschitz-continuous in $D$. The proof that $U_t$ is generated by $(L_1 + L_2)/2$ in $D$ is similar to the proof of Theorem 5.2.1.

It is more or less straightforward to extend the theory to a finite number of generators $L_j$, $j = 1, \ldots, n$, with the approximations being defined as
\[
U_t^\tau = e^{(t - (nk + l)\tau) L_{l+1}} e^{\tau L_l} \cdots e^{\tau L_1} (e^{\tau L_n} \cdots e^{\tau L_1})^k, \quad (nk + l)\tau \le t \le (nk + l + 1)\tau. \tag{5.30}
\]

Theorem 5.2.4. Suppose that the operators $L_j$ in $B$, $j = 1, \ldots, n$, satisfy all the conditions assumed for $L_1$, $L_2$ in Theorem 5.2.1. Then for any $T > 0$ and $f \in B$, the curves $U_t^{\tau_k} f$, $\tau_k = 2^{-k}$, converge in $C([0, T], B)$ to a curve $U_t f$, as $k \to \infty$, and for $f \in D$ this convergence holds in $C([0, T], D)$, so that $U_t f \in C([0, T], D)$. Moreover, the norms $\|U_t\|_B$ and $\|U_t\|_D$ are bounded by $e^{Kt}$ and the operators $U_t$ form a strongly continuous semigroup in $B$ with the generator $(L_1 + \cdots + L_n)/n$ and the invariant core $D$.

Proof. From (5.24), we derive (5.25) as in Theorem 5.2.1. Then we repeat the procedure. Namely, from (5.25), we can derive
\[
\|e^{tL_3} e^{tL_2} e^{tL_1} f - e^{tL_3} f - t e^{tL_3} (L_1 + L_2) f\|_B \le 3 t^2 \bar{L}_{DB} \bar{L}_2 e^{3K_2 t} \|f\|_{D_2}, \tag{5.31}
\]
and consequently
\[
\begin{aligned}
\|e^{tL_3} e^{tL_2} e^{tL_1} f - f - t (L_1 + L_2 + L_3) f\|_B
&\le 3 t^2 \bar{L}_{DB} \bar{L}_2 e^{3K_2 t} \|f\|_{D_2} + \|e^{tL_3} f - f - t L_3 f\|_B + \|t (e^{tL_3} - 1)(L_1 + L_2) f\|_B \\
&\le 6 t^2 \bar{L}_{DB} \bar{L}_2 e^{3K_2 t} \|f\|_{D_2}.
\end{aligned}
\]
By induction, it follows that
\[
\|e^{tL_n} \cdots e^{tL_1} f - f - t (L_1 + \cdots + L_n) f\|_B \le \frac{1}{2} n(n+1) t^2 \bar{L}_{DB} \bar{L}_2 e^{nK_2 t} \|f\|_{D_2}, \tag{5.32}
\]
and therefore
\[
\|e^{tL_n} \cdots e^{tL_1} f - e^{tL_{\pi(n)}} \cdots e^{tL_{\pi(1)}} f\|_B \le n(n+1) t^2 \bar{L}_{DB} \bar{L}_2 e^{nK_2 t} \|f\|_{D_2}, \tag{5.33}
\]
where $\pi$ is any permutation of the set $\{1, \ldots, n\}$. Consequently,
\[
\|e^{tL_n} \cdots e^{tL_1} f - (e^{tL_n/2} \cdots e^{tL_1/2})^2 f\|_B \le 2n(2n+1) t^2 \bar{L}_{DB} \bar{L}_2 e^{nK_2 t} \|f\|_{D_2},
\]
and therefore
\[
\|(e^{\tau_k L_n} \cdots e^{\tau_k L_1})^k f - (e^{\tau_k L_n/2} \cdots e^{\tau_k L_1/2})^{2k} f\|_B \le n(n+1) t \tau_k \bar{L}_{DB} \bar{L}_2 e^{nK_2 \tau_k} e^{Kt} \|f\|_{D_2}. \tag{5.34}
\]
The rest of the proof is as in Theorem 5.2.1.

302

Chapter 5. Linear Evolutionary Equations: Advanced Theory

Exercise 5.2.1. Derive the analogue of Theorems 5.2.2 and 5.2.3 for finite families of operators Lj . As an example, let us consider the heat-conduction equation in Rd with bounded sources and unbounded sinks: 1 ∂ft = Δft − V (x)ft , ∂t 2

(5.35)

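where $V$ is a function that is bounded from below and has polynomial growth at infinity. This example also illustrates the importance of using weighted spaces of measures and functions. For the sake of simplicity, we shall work with positive $V$. Since we aim at applying Theorem 5.2.1, we note that the operator $\Delta/2$ generates the heat semigroup $T_t = \exp\{t\Delta/2\}$ in $B = C_\infty(\mathbb{R}^d)$, given by (4.13), with invariant cores $C^k_\infty(\mathbb{R}^d)$ for any $k \ge 2$, and the operator $(-\hat V)$ of multiplication by $(-V(x))$ generates the semigroup $\exp\{-t\hat V\}$ of multiplication by $\exp\{-tV(x)\}$ in $B = C_\infty(\mathbb{R}^d)$ with cores being spaces of functions that decrease sufficiently fast at infinity. In order to find a common core, let us write down the derivatives of the semigroup $\exp\{-t\hat V\}$:
\[
\frac{\partial}{\partial x_j}\left(e^{-tV(x)} f(x)\right) = -t\, \frac{\partial V}{\partial x_j}(x)\, e^{-tV(x)} f(x) + e^{-tV(x)} \frac{\partial f}{\partial x_j}, \tag{5.36}
\]
\[
\begin{aligned}
\frac{\partial^2}{\partial x_i \partial x_j}\left(e^{-tV(x)} f(x)\right) ={}& -t\, \frac{\partial^2 V}{\partial x_i \partial x_j}(x)\, e^{-tV(x)} f(x) + t^2\, \frac{\partial V}{\partial x_i}(x) \frac{\partial V}{\partial x_j}(x)\, e^{-tV(x)} f(x) \\
& - t\, \frac{\partial V}{\partial x_i}\, e^{-tV(x)} \frac{\partial f}{\partial x_j} - t\, \frac{\partial V}{\partial x_j}\, e^{-tV(x)} \frac{\partial f}{\partial x_i} + e^{-tV(x)} \frac{\partial^2 f}{\partial x_i \partial x_j}.
\end{aligned} \tag{5.37}
\]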
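The product-rule formulas (5.36) and (5.37) can be sanity-checked numerically. The following is a minimal sketch in $d = 2$, assuming a hypothetical potential `V` and test function `f` that are not taken from the text; it compares the right-hand sides of (5.36) and (5.37) with central finite differences.

```python
import math

def V(x, y):
    # Hypothetical non-negative potential with polynomial growth (an assumption).
    return x * x + x * y + y * y

def f(x, y):
    # Hypothetical rapidly decaying test function (an assumption).
    return math.exp(-(x * x + 2.0 * y * y))

def g(t, x, y):
    # The function exp(-t V) f whose derivatives appear in (5.36) and (5.37).
    return math.exp(-t * V(x, y)) * f(x, y)

def dg_dx_formula(t, x, y):
    # Right-hand side of (5.36) for the derivative with respect to x.
    e = math.exp(-t * V(x, y))
    v_x = 2.0 * x + y
    f_x = -2.0 * x * f(x, y)
    return -t * v_x * e * f(x, y) + e * f_x

def d2g_dxdy_formula(t, x, y):
    # Right-hand side of (5.37) with x_i = x and x_j = y.
    e = math.exp(-t * V(x, y))
    fv = f(x, y)
    v_x, v_y, v_xy = 2.0 * x + y, x + 2.0 * y, 1.0
    f_x, f_y, f_xy = -2.0 * x * fv, -4.0 * y * fv, 8.0 * x * y * fv
    return (-t * v_xy * e * fv + t * t * v_x * v_y * e * fv
            - t * v_x * e * f_y - t * v_y * e * f_x + e * f_xy)

def dg_dx_numeric(t, x, y, h=1e-5):
    # Central difference in x.
    return (g(t, x + h, y) - g(t, x - h, y)) / (2.0 * h)

def d2g_dxdy_numeric(t, x, y, h=1e-3):
    # Central difference for the mixed second derivative.
    return (g(t, x + h, y + h) - g(t, x + h, y - h)
            - g(t, x - h, y + h) + g(t, x - h, y - h)) / (4.0 * h * h)
```

Evaluating both versions at a sample point such as `(t, x, y) = (0.7, 0.3, -0.4)` should give agreement up to the finite-difference error.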

For any $k \in \mathbb{R}$, let us now introduce the space $C_k$ of continuous functions on $\mathbb{R}^d$ with a finite norm
\[
\|f\|_k = \sup_x \left(|f(x)|(1 + |x|^k)\right),
\]
and denote by $C_{k,\infty}$ its subspace of functions such that $|f(x)|(1 + |x|^k) \to 0$ as $x \to \infty$. Let us denote by $\nabla^l f$ the collection of all partial derivatives of $f$ of order $l$, and let us say that $\nabla^l f \in C_k$ or $\nabla^l f \in C_{k,\infty}$ if all these derivatives belong to $C_k$ or $C_{k,\infty}$, respectively, with the norm $\|\nabla^l f\|_k$ being defined as the supremum of the $C_k$-norms of all partial derivatives of order $l$. The next result identifies the simplest common core for the semigroups $\exp\{t\Delta/2\}$ and $\exp\{-t\hat V\}$ in $B = C_\infty(\mathbb{R}^d)$.

Lemma 5.2.1. Assume that $V \ge 0$ and, for some even $k > 0$, that $V, \nabla V, \nabla^2 V \in C_{-k}$. Then the space
\[
D = \{f \in C_{2k,\infty} : \nabla f \in C_{k,\infty},\ \nabla^2 f \in C_{0,\infty} = B\}
\]
is an invariant core for both $\exp\{t\Delta/2\}$ and $\exp\{-t\hat V\}$. If $D$ is equipped with the norm
\[
\|f\|_D = \|f\|_{2k} + \|\nabla f\|_k + \|\nabla^2 f\|_B,
\]


then the semigroups $\exp\{t\Delta/2\}$ and $\exp\{-t\hat V\}$ act strongly continuously in $D$ and $\hat V, \Delta \in L(D, B)$. Moreover, for $t \in (0, 1)$, we have
\[
\|\exp\{-t\hat V\}\|_{D \to D} \le 1 + \kappa t, \qquad \|\exp\{t\Delta/2\}\|_{D \to D} \le 1 + \kappa t, \tag{5.38}
\]
with some constant $\kappa$ depending on $d$ and on the norms of $V, \nabla V, \nabla^2 V$ in the corresponding spaces.

Proof. It can be seen from (5.36) that $\nabla(e^{-tV(x)} f(x)) \in C_{k,\infty}$ whenever $\nabla f \in C_{k,\infty}$ and $f \in C_{2k,\infty}$. Similarly, (5.37) implies that $\nabla^2(e^{-tV(x)} f(x)) \in B$ whenever $\nabla^2 f \in B$, $\nabla f \in C_{k,\infty}$ and $f \in C_{2k,\infty}$. This implies the invariance of $D$ under $\exp\{-t\hat V\}$. The invariance of $D$ under $\exp\{t\Delta/2\}$ follows from (4.27) and (4.28). The estimates (5.38) also follow from (5.36) and (5.37). □

The following lemma is proved in a completely analogous manner.

Lemma 5.2.2. Assume that $V \ge 0$ and, for some even $k > 0$, that $V, \nabla V, \nabla^2 V, \nabla^3 V, \nabla^4 V \in C_{-k}$. Then the space
\[
\tilde D = \{f \in C_{4k,\infty} : \nabla f \in C_{3k,\infty},\ \nabla^2 f \in C_{2k,\infty},\ \nabla^3 f \in C_{k,\infty},\ \nabla^4 f \in B\}
\]
is also an invariant core for both $\exp\{t\Delta/2\}$ and $\exp\{-t\hat V\}$. If $\tilde D$ is equipped with the norm
\[
\|f\|_{\tilde D} = \|f\|_{4k} + \|\nabla f\|_{3k} + \|\nabla^2 f\|_{2k} + \|\nabla^3 f\|_k + \|\nabla^4 f\|_B,
\]
then the semigroups $\exp\{t\Delta/2\}$ and $\exp\{-t\hat V\}$ act strongly continuously in $\tilde D$ and $\hat V, \Delta \in L(\tilde D, D)$. Moreover, for $t \in (0, 1)$, we have
\[
\|\exp\{-t\hat V\}\|_{\tilde D \to \tilde D} \le 1 + \kappa t, \qquad \|\exp\{t\Delta/2\}\|_{\tilde D \to \tilde D} \le 1 + \kappa t, \tag{5.39}
\]

with some constant $\kappa$ depending on $d$ and on the norms of $V$ and its derivatives in $C_{-k}$.

The next result is a direct consequence of Lemmas 5.2.1, 5.2.2 and Theorem 5.2.2.

Proposition 5.2.1. Under the assumptions of Lemma 5.2.2, the operator $-\hat V + \Delta/2$ generates a strongly continuous semigroup in $B$, with $D$ as introduced in Lemma 5.2.1 being its invariant core. In particular, the Cauchy problem for equation (5.35) in $B$ is well posed for initial conditions in $D$.

Exercise 5.2.2. Derive the analogue of Proposition 5.2.1 for a time-dependent family $V_t$ using Theorem 5.1.1.

Exercise 5.2.3. Use Theorems 5.2.4 and 4.3.1 to show that the diffusion operator (5.16) generates a strongly continuous semigroup whenever
\[
A(t, x) = \sum_{j=1}^n \sigma_j(x) \sigma_j^T(x)
\]
and $b, \sigma_j$ are sufficiently regular.


5.3 Mixing generators

In this section, we extend the above construction to time-dependent mixtures of an infinite number of generators. The extension to a finite number of time-dependent terms is more or less straightforward. For this purpose, the Banach spaces $D_3 \subset D_2 \subset D \subset B$ are supposed to form a four-level tower as in Theorem 5.2.1. For two families $L^s_1$ and $L^s_2$, $s \ge 0$, of linear operators in $B$ such that each $L^s_i$ generates a bounded semigroup, and for a given $\tau > 0$, let us define the family $U^\tau_{t,s}$ in the following way:
\[
U^\tau_{2k\tau, 2l\tau} = \exp\{\tau L_2^{(2k-1)\tau}\} \exp\{\tau L_1^{(2k-2)\tau}\} \cdots \exp\{\tau L_2^{(2l+1)\tau}\} \exp\{\tau L_1^{2l\tau}\}, \quad k, l \in \mathbb{N},\ k > l; \tag{5.40}
\]
\[
U^\tau_{t,s} = \exp\{(t - s) L_1^{2k\tau}\}, \quad 2k\tau \le s \le t \le (2k+1)\tau, \tag{5.41}
\]
\[
U^\tau_{t,s} = \exp\{(t - s) L_2^{(2k+1)\tau}\}, \quad (2k+1)\tau \le s \le t \le (2k+2)\tau. \tag{5.42}
\]
For other $t, s$, it is obtained by gluing in such a way that $U^\tau_{t,s}$ becomes a propagator.
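To get a feeling for the alternating product (5.40), one can experiment with matrices. The following is a minimal sketch under simplifying assumptions that are not part of the text: the generators are time-independent $2 \times 2$ matrices (the hypothetical `L1` and `L2` below), and the matrix exponential is computed by a truncated Taylor series. It measures the distance between $U^\tau_{t,0}$ with $\tau = t/(2k)$ and $\exp\{t(L_1 + L_2)/2\}$, which is the limit one expects from a generator of the form $(L_1 + L_2)/2$.

```python
def mat_mul(a, b):
    n = len(a)
    return [[sum(a[i][k] * b[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]

def mat_add(a, b):
    return [[x + y for x, y in zip(ra, rb)] for ra, rb in zip(a, b)]

def mat_scale(a, c):
    return [[c * x for x in row] for row in a]

def identity(n):
    return [[1.0 if i == j else 0.0 for j in range(n)] for i in range(n)]

def mat_exp(a, terms=40):
    # Truncated Taylor series; adequate for the small matrices used here.
    result = identity(len(a))
    term = identity(len(a))
    for m in range(1, terms):
        term = mat_scale(mat_mul(term, a), 1.0 / m)
        result = mat_add(result, term)
    return result

def alternating_propagator(l1, l2, tau, k):
    # (exp(tau L2) exp(tau L1))^k, i.e. U^tau_{2k tau, 0} of (5.40)
    # when L1 and L2 do not depend on time (a simplifying assumption).
    step = mat_mul(mat_exp(mat_scale(l2, tau)), mat_exp(mat_scale(l1, tau)))
    u = identity(len(l1))
    for _ in range(k):
        u = mat_mul(step, u)
    return u

def splitting_error(l1, l2, t, k):
    # Entrywise distance between U^tau_{t,0}, tau = t/(2k), and exp(t (L1 + L2)/2).
    tau = t / (2.0 * k)
    approx = alternating_propagator(l1, l2, tau, k)
    exact = mat_exp(mat_scale(mat_add(l1, l2), t / 2.0))
    return max(abs(x - y) for ra, rb in zip(approx, exact) for x, y in zip(ra, rb))

# Hypothetical non-commuting generators, chosen only for illustration.
L1 = [[0.0, 1.0], [0.0, 0.0]]
L2 = [[0.0, 0.0], [-1.0, 0.0]]
```

Calling `splitting_error(L1, L2, 1.0, k)` for increasing `k` shows the error shrinking roughly in proportion to $\tau$, in line with the first-order estimates of the type (5.46).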

Theorem 5.3.1. Suppose that

(i) the linear operators $L^s_1$ and $L^s_2$, $s \in [0, T]$, generate strongly continuous semigroups $\exp\{tL^s_i\}$ in $B$ with the common invariant core $D$, and
\[
\max\left(\|\exp\{tL^s_1\}\|_B, \|\exp\{tL^s_2\}\|_B, \|\exp\{tL^s_1\}\|_D, \|\exp\{tL^s_2\}\|_D\right) \le e^{Kt}
\]
with a constant $K$;

(ii) $D_2$ and $D_3$ are also invariant under $\exp\{tL^s_1\}$ and $\exp\{tL^s_2\}$ with
\[
\max\left(\|\exp\{tL^s_1\}\|_{D_2}, \|\exp\{tL^s_2\}\|_{D_2}\right) \le e^{K_2 t}, \qquad \max\left(\|\exp\{tL^s_1\}\|_{D_3}, \|\exp\{tL^s_2\}\|_{D_3}\right) \le e^{K_3 t},
\]
with constants $K_3 \ge K_2 \ge K$;

(iii) $L^s_1, L^s_2$ are bounded as operators $D \to B$, $D_2 \to D$ and $D_3 \to D_2$ with norms that are bounded by some constants $\bar L_{DB}, \bar L_2, \bar L_3$;

(iv) the $L^t_j$ depend Lipschitz-continuously on $t$ in the following sense:
\[
\|(L_j^{t+\tau} - L_j^t) f\|_B \le \kappa \tau \|f\|_{D_2}, \qquad \|(L_j^{t+\tau} - L_j^t) f\|_D \le \kappa \tau \|f\|_{D_3} \tag{5.43}
\]
uniformly for finite $t, \tau$.

Then the propagators $U^{\tau_k}_{t,s}$ converge in $C([0, T], B)$ and in $C([0, T], D)$ to a propagator $U_{t,s}$ such that
\[
\|U_{t,s}\|_B \le e^{K(t-s)}, \qquad \|U_{t,s}\|_D \le e^{K(t-s)};
\]
and the propagator $U_{t,s}$ is generated on $D$ by the family $(L^t_1 + L^t_2)/2$ (in the sense of the definition given prior to Proposition 4.9.1).


Proof. The proof follows the same path as that of Theorem 5.2.1. Namely, instead of (5.25) one gets
\[
\|\exp\{\tau L_1^{t_1}\} \exp\{\tau L_2^{t_2}\} f - f - \tau (L_1^{t_1} + L_2^{t_2}) f\|_B \le 3\tau^2 \bar L_{DB} \bar L_2 e^{2K_2 t} \|f\|_{D_2}, \tag{5.44}
\]
for any $t_1, t_2 \in (0, t)$. Together with (5.43), this implies the following modification of (5.26):
\[
\|(\exp\{tL_1^{t_1}\} \exp\{tL_2^{t_2}\} - \exp\{tL_2^{t_2}\} \exp\{tL_1^{t_1}\}) f\|_B \le 6t^2 \bar L_{DB} \bar L_2 e^{2K_2 t} \|f\|_{D_2} + 2\kappa t^2 \|f\|_{D_2}. \tag{5.45}
\]
Therefore, instead of (5.27) one gets the estimate
\[
\|(U^\tau_{t,s} - U^{\tau/2}_{t,s}) f\|_B \le \tau (t - s) C(T) \|f\|_{\tilde D}, \tag{5.46}
\]
with another constant $C(T)$. The remaining argument of Theorem 5.2.1 can be similarly applied. □

Exercise 5.3.1. Extend Theorems 5.2.2 and 5.2.3 to a time-dependent setting.

Exercise 5.3.2. Extend Theorems 5.45, 5.2.2 and 5.2.3 to the case of $n$ families of operators $L^s_i$, $i = 1, \ldots, n$, in $B$ generating strongly continuous semigroups $\exp\{tL^s_i\}$, and show that the time-dependent extensions of the approximations (5.30) converge to the propagator that is generated by the family $(L^s_1 + \cdots + L^s_n)/n$ on $D$.

The main result of this section concerns infinite mixtures of the generators. Let $D_4 \subset D_3 \subset D_2 \subset D \subset B$ be a 5-level Banach tower, as defined prior to Proposition 4.2.6.

Theorem 5.3.2. Suppose that we are given a family of operators $L(x)$ in $B$ depending on a parameter $x \in \mathbb{R}^d$ such that $L(x)$ are bounded operators $D_4 \to D_3$, $D_3 \to D_2$, $D_2 \to D$ and $D \to B$ with norms not exceeding the constants $\bar L_4$, $\bar L_3$, $\bar L_2$ and $\bar L_1$, respectively, so that each $L(x)$ generates a strongly continuous semigroup $e^{tL(x)}$ in $B$, with $D$ their common invariant core and with all $D_j$ being invariant. Moreover, assume that the mapping $x \mapsto L(x)$ is continuous in the operator topologies of $L(D_4, D_3)$, $L(D_3, D_2)$, $L(D_2, D)$, $L(D, B)$, and that the norms of the operators $e^{tL(x)}$ in the spaces $D_4, D_3, D_2, D, B$ are bounded by $e^{K_4 t}$, $e^{K_3 t}$, $e^{K_2 t}$, $e^{K_1 t}$ and $e^{Kt}$, respectively, with some constants $K_j, K$. Let $\mu_t$ be a Lipschitz-continuous (in the norm topology of $M(\mathbb{R}^d)$) family of probability measures on $\mathbb{R}^d$. Then the time-dependent family of operators
\[
L_t = \int L(x)\, \mu_t(dx)
\]
generates strongly continuous forward and backward propagators in $B$ with the common domain $D$ such that the norms of $U^{t,s}$ in $B$ and $D$ do not exceed $e^{K|t-s|}$ and $e^{K_2|t-s|}$, respectively.

Proof. Let us work with forward propagators for the sake of definiteness. The idea is to approximate the integral $L_t$ by Riemann sums and use the version of Theorem 5.2.4 with time-dependent generators (see Exercise 5.3.2) to construct the propagators that are generated by these sums. More concretely, by


applying this latter result to the 4-level tower $D_3 \subset D_2 \subset D \subset B$, for any finite partition $\Delta = \{A_j\}$ of $\mathbb{R}^d$ into a disjoint union of measurable sets $A_j$ and any points $x_j \in A_j$, the family of operators
\[
L^\Delta_t = \sum_j \mu_t(A_j) L(x_j) = \frac{1}{n} \sum_j L^t_j, \qquad L^t_j = n \mu_t(A_j) L(x_j),
\]
generates a propagator $U^{t,s}_\Delta$ in $B$ on the common invariant domain $D$ such that $\|U^{t,s}_\Delta\|_{B \to B}$ and $\|U^{t,s}_\Delta\|_{D \to D}$ are bounded by $e^{K|t-s|}$ and $e^{K_2|t-s|}$, respectively. In fact, estimates of the type (5.43) follow from the Lipschitz continuity of $\mu_t$. Moreover, although
\[
\|\exp\{tL^s_j\}\| \le \exp\{tnK\mu_s(A_j)\}
\]
does not provide a uniform bound as required in Theorem 5.2.4, it can be shown that their product is uniformly bounded:
\[
\|\exp\{tL^s_1\} \cdots \exp\{tL^s_n\}\|_B \le \exp\{tnK\mu_s(A_1)\} \cdots \exp\{tnK\mu_s(A_n)\} = e^{tnK}.
\]
This is sufficient for the proof of Theorem 5.2.4 (or its version with time-dependent generators). Using the 4-level tower $D_4 \subset D_3 \subset D_2 \subset D$, we show next that the space $D_2$ is also invariant under $U^{t,s}_\Delta$, whereby these operators are bounded in norm by $e^{tK_2}$.

Since $\mu_t$ is a continuous curve, one can choose, for any $\epsilon > 0$, the cube $[-R, R]^d \subset \mathbb{R}^d$ such that $\mu_t(\mathbb{R}^d \setminus [-R, R]^d) < \epsilon$. Using the continuity of $L(x)$ in $\mathbb{R}^d$ and hence its uniform continuity in $[-R, R]^d$, we can find a partition of $[-R, R]^d$, $[-R, R]^d = \cup_{j=1}^n A_j$, such that
\[
\|L(x) - L(y)\|_{D \to B} < \epsilon, \qquad \|L(x) - L(y)\|_{D_2 \to D} < \epsilon, \qquad \|L(x) - L(y)\|_{D_3 \to D_2} < \epsilon
\]
for any $j$ and any $x, y \in A_j$. Consequently, for the partition $\Delta = \{A_j\} \cup (\mathbb{R}^d \setminus [-R, R]^d)$ of $\mathbb{R}^d$, the norms of the operator $L^\Delta_t - L_t$ in the spaces $L(D, B)$, $L(D_2, D)$ and $L(D_3, D_2)$ are bounded by $\epsilon(1 + \bar L_1)$, $\epsilon(1 + \bar L_2)$ and $\epsilon(1 + \bar L_3)$, respectively. Therefore, we can apply Proposition 4.2.5 (more precisely, its direct extension to time-dependent generators) to derive the convergence of the propagators $U^{t,s}_\Delta$ to the required propagator $U^{t,s}$ generated by the family $L_t$ on the invariant domain $D$. □

In applications to nonlinear equations, one usually works with weakly continuous curves $\mu_t$, rather than strongly continuous ones. This issue is now addressed. Also, we relax the assumptions on the regularity of $L(x)$, which can be substituted by appropriate approximations, as was the case with the addition of a finite number of generators.

Theorem 5.3.3. Let $L(x)$ be a family of operators in $B$ such that the mapping $x \mapsto L(x)$ is continuous in the operator topologies of the spaces $L(D_2, D)$ and


$L(D, B)$. Let a sequence of families $L_n(x)$ exist that satisfies the assumptions of Theorem 5.3.2 with constants $K, K_1, K_2$ independent of $n$ and such that
\[
\sup_x \|L_n(x) - L(x)\|_{D \to B} \to 0, \qquad \sup_x \|L_n(x) - L(x)\|_{D_2 \to D} \to 0, \tag{5.47}
\]
as $n \to \infty$. Let $\mu_t$ be a weakly continuous family of probability measures on $\mathbb{R}^d$, which is Lipschitz-continuous in the topology of the space $(C^k(\mathbb{R}^d))^*$ with some $k \in \mathbb{N}$, that is
\[
\left| \int \phi(x) (\mu_t(dx) - \mu_s(dx)) \right| \le \kappa |t - s| \|\phi\|_{C^k(\mathbb{R}^d)}
\]
for all $\phi \in C^k(\mathbb{R}^d)$ and a constant $\kappa$. Then the time-dependent family of operators
\[
L_t = \int L(x)\, \mu_t(dx)
\]
generates strongly continuous forward and backward propagators in $B$ with the common domain $D$ such that the norms of $U^{t,s}$ in $B$ and $D$ do not exceed $e^{K|t-s|}$ and $e^{K_1|t-s|}$, respectively.

Proof. Again, let us work with forward propagators. The idea is to approximate $\mu_t$ by strongly continuous measure curves. For this purpose, we can use the standard approximation (1.5). However, we must additionally use a cutoff in $x$, i.e., the approximations
\[
f^n_t(x) = \phi(x/n)\, n^d \int \phi(n(x - y))\, \mu_t(dy), \tag{5.48}
\]
and choose $\phi \in C^k(\mathbb{R}^d)$ such that $\phi(x)$ equals 1 in a neighbourhood of the origin. Notice first that the curves $f^n_t$ are Lipschitz-continuous in $t$ in the norm topology of $L_1(\mathbb{R}^d)$, so that the measure-valued curves $\mu^n_t$ with the densities $f^n_t$ are Lipschitz-continuous in $t$ in the norm topology of $M(\mathbb{R}^d)$. In fact,
\[
|f^n_t(x) - f^n_s(x)| = \phi(x/n)\, n^d \left| \int \phi(n(x - y)) (\mu_t(dy) - \mu_s(dy)) \right| \le \kappa |t - s|\, n^{d+k} \|\phi\|_{C^k(\mathbb{R}^d)}\, \phi(x/n),
\]
and thus
\[
\int |f^n_t(x) - f^n_s(x)|\, dx \le \kappa |t - s|\, n^{2d+k} \|\phi\|_{C^k(\mathbb{R}^d)}.
\]
Therefore, for any $n$, the operators $L_n(x)$ and the curves $\mu^n_t$ satisfy all assumptions of Theorem 5.3.2. Consequently, the families of the operators $\int L_n(x)\, \mu^n_t(dx)$ generate forward propagators $U^{t,s}_n$ such that
\[
\|U^{t,s}_n\|_{B \to B} \le e^{Kt}, \qquad \|U^{t,s}_n\|_{D \to D} \le e^{K_1 t}, \qquad \|U^{t,s}_n\|_{D_2 \to D_2} \le e^{K_2 t}.
\]


To complete the proof, we apply Proposition 4.2.5, more precisely its direct extension to propagators. For doing so, we have to know that
\[
\left\| \int L_n(x)\, \mu^n_t(dx) - \int L(x)\, \mu_t(dx) \right\|_{D \to B} \to 0, \qquad \left\| \int L_n(x)\, \mu^n_t(dx) - \int L(x)\, \mu_t(dx) \right\|_{D_2 \to D} \to 0,
\]
as $n \to \infty$. Due to (5.47), it is sufficient to show that
\[
\left\| \int L(x)\, \mu^n_t(dx) - \int L(x)\, \mu_t(dx) \right\|_{D \to B} \to 0, \qquad \left\| \int L(x)\, \mu^n_t(dx) - \int L(x)\, \mu_t(dx) \right\|_{D_2 \to D} \to 0. \tag{5.49}
\]
Of course, we expect this to be true, since the $\mu^n_t$ converge weakly to $\mu_t$. To be more precise, let us write down, e.g., the first limit in (5.49) in more detail:
\[
\left\| \int\!\!\int L(x)\, \phi(x/n)\, n^d \phi(n(x - y))\, \mu_t(dy)\, dx - \int L(x)\, \mu_t(dx) \right\|_{D \to B} \to 0.
\]
Since $\phi(x/n) \to 1$, as $n \to \infty$, it is sufficient to show that
\[
\left\| \int\!\!\int L(x)\, n^d \phi(n(x - y))\, \mu_t(dy)\, dx - \int L(y)\, \mu_t(dy) \right\|_{D \to B} \to 0,
\]
or equivalently
\[
\left\| \int\!\!\int (L(x) - L(y))\, n^d \phi(n(x - y))\, \mu_t(dy)\, dx \right\|_{D \to B} \to 0. \tag{5.50}
\]
But this follows from the continuity of $L(x)$ (and hence its uniform continuity on compact subsets of $\mathbb{R}^d$) and the tightness of the family $\mu_t$. □
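The convergence (5.50) can be illustrated in a toy setting. The following is a minimal sketch under several assumptions that are not part of the text: $d = 1$, the operator-valued map $x \mapsto L(x)$ is replaced by the scalar function `L`, the measure is a hypothetical discrete measure with two atoms, the mollifier is a triangular density, and the cutoff factor $\phi(x/n)$ is omitted because the measure has bounded support.

```python
import math

def hat(u):
    # Hypothetical mollifier: the triangular density on [-1, 1], with integral 1.
    return max(0.0, 1.0 - abs(u))

def L(x):
    # Scalar stand-in for the operator-valued map x -> L(x) (an assumption).
    return math.cos(x)

# Hypothetical discrete measure: (location y_j, weight w_j), weights sum to 1.
ATOMS = [(-1.0, 0.5), (0.5, 0.5)]

def mixture_exact():
    # int L(y) mu(dy) for the discrete measure mu.
    return sum(w * L(y) for y, w in ATOMS)

def mixture_mollified(n, cells=2000):
    # int L(x) f^n(x) dx with f^n(x) = sum_j w_j n hat(n (x - y_j)),
    # computed by the midpoint rule on the support of each bump.
    total = 0.0
    for y, w in ATOMS:
        h = (2.0 / n) / cells
        left = y - 1.0 / n
        for i in range(cells):
            x = left + (i + 0.5) * h
            total += w * L(x) * n * hat(n * (x - y)) * h
    return total

def mollification_error(n):
    # Scalar analogue of the norm in (5.50).
    return abs(mixture_mollified(n) - mixture_exact())
```

As `n` grows, `mollification_error(n)` tends to zero, which is the scalar counterpart of the argument based on the continuity of $L(x)$.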

5.4 The method of frozen coefficients: heuristics

The method of frozen coefficients is a classical approach to solving equations with variable coefficients by approximating the solution with the solutions of equations that have constant, or frozen, coefficients. We start with some formal calculations that will suggest some natural assumptions on the operator symbols, which are then given a rigorous form. Let $\psi_t(x, -i\nabla)$ be a time-dependent family of pseudo-differential operators with symbols $\psi_t(x, p)$ whose real part is bounded from below. As basic examples,


one can have in mind the diffusion operator $-\frac{1}{2}(A_t(x)\nabla, \nabla)$, a fractional Laplacian with a variable scale, $\psi(x, -i\nabla) = \sigma_t(x)|\Delta|^\alpha$, or a more general ΨDO having homogeneous symbols with position- and time-dependent coefficients, $\psi(x, p) = \omega_t(x, p/|p|)|p|^{\beta_t(x)}$. Lower-order terms can always be added, but we prefer to treat them separately via perturbation theory. Assume that we are interested in solving the Cauchy problem
\[
\frac{\partial}{\partial t} f_t(x) = -\psi_t(x, -i\nabla) f_t(x), \qquad f|_{t=s} = f_s, \qquad t > s, \tag{5.51}
\]
expecting to obtain a solution of the form
\[
f_t(x) = \int G_{t,s}(x, y) f_s(y)\, dy, \tag{5.52}
\]
with a Green function $G$ that solves equation (5.51) with the initial condition $G_{s,s}(x, y) = \delta(x - y)$. Let us write $\psi_{t,[z]}(-i\nabla)$ for the operator $\psi$ with frozen coefficients $z$, i.e., it acts on functions $f(x)$ as an operator with constant coefficients (which are fixed by the choice of $z$). We know that the solution to the Cauchy problem for operators with constant coefficients,
\[
\frac{\partial}{\partial t} g_t(x) = -\psi_{t,[z]}(-i\nabla) g_t(x), \qquad g|_{t=s} = g_s, \tag{5.53}
\]
is given by the formulae (2.59) and (2.60), so that
\[
g_t(x) = \int G^{\psi,z}_{t,s}(x - y) g_s(y)\, dy, \tag{5.54}
\]
\[
G^{\psi,z}_{t,s}(x) = \frac{1}{(2\pi)^d} \int e^{i(p,x)} \exp\left\{ -\int_s^t \psi_\tau(z, p)\, d\tau \right\} dp. \tag{5.55}
\]
The idea is to use the function $G^{ap}_{t,s}(x, y) = G^{\psi,y}_{t,s}(x - y)$ as a first approximation for the actual $G_{t,s}(x, y)$. It is readily seen that it satisfies the equation
\[
\frac{\partial}{\partial t} G^{ap}_{t,s}(x, y) = -\psi_{t,[y]}(-i\nabla) G^{ap}_{t,s}(x, y),
\]
where the operator $\nabla$ acts on the variable $x$, or equivalently
\[
\frac{\partial}{\partial t} G^{ap}_{t,s}(x, y) = -\psi_t(x, -i\nabla) G^{ap}_{t,s}(x, y) - F_{t,s}(x, y), \tag{5.56}
\]
with
\[
F_{t,s}(x, y) = (\psi_{t,[y]}(-i\nabla) - \psi_t(x, -i\nabla)) G^{ap}_{t,s}(x, y). \tag{5.57}
\]


For these entities, we have
\[
\begin{aligned}
\psi_{t,[y]}(-i\nabla) G^{ap}_{t,s}(x, y) &= \frac{1}{(2\pi)^d} \int e^{i(p,x-y)} \psi_t(y, p) \exp\left\{ -\int_s^t \psi_\tau(y, p)\, d\tau \right\} dp, \\
\psi_t(x, -i\nabla) G^{ap}_{t,s}(x, y) &= \frac{1}{(2\pi)^d} \int e^{i(p,x-y)} \psi_t(x, p) \exp\left\{ -\int_s^t \psi_\tau(y, p)\, d\tau \right\} dp,
\end{aligned} \tag{5.58}
\]
which implies
\[
F_{t,s}(x, y) = \frac{1}{(2\pi)^d} \int e^{i(p,x-y)} [\psi_t(y, p) - \psi_t(x, p)] \exp\left\{ -\int_s^t \psi_\tau(y, p)\, d\tau \right\} dp. \tag{5.59}
\]
If the evolution (5.52) is well defined and $F$ is sufficiently regular, then it follows from (5.56) and Proposition 4.10.2 (see also Remark 78) that
\[
G^{ap}_{t,s}(x, y) = G_{t,s}(x, y) - \int_s^t d\tau \int G_{t,\tau}(x, z) F_{\tau,s}(z, y)\, dz. \tag{5.60}
\]

Remark 91. Equation (5.60) can be considered a (slightly generalized) variant of (4.134).

In operator form, the r.h.s. of (5.60) reads $G - \tilde F G$, where $\tilde F$ is the linear operator acting on functions of four variables as
\[
(\tilde F \psi)_{t,s}(x, y) = \int_s^t d\tau \int \psi_{t,\tau}(x, z) F_{\tau,s}(z, y)\, dz. \tag{5.61}
\]
Therefore, it follows from (5.60) that
\[
G_{t,s}(x, y) = [(I - \tilde F)^{-1} G^{ap}]_{t,s}(x, y) = \sum_{n=0}^\infty (\tilde F^n G^{ap})_{t,s}(x, y). \tag{5.62}
\]
The $m$th term of this series equals
\[
(\tilde F^m G^{ap})_{t,s}(x, y) = \int_{s \le s_1 \le \cdots \le s_m \le t} ds_1 \cdots ds_m \int G^{ap}_{t,s_m}(x, z_m) F_{s_m,s_{m-1}}(z_m, z_{m-1}) \cdots F_{s_2,s_1}(z_2, z_1) F_{s_1,s}(z_1, y)\, dz_1 \cdots dz_m. \tag{5.63}
\]
Or in other words,
\[
G_{t,s}(x, y) = G^{ap}_{t,s}(x, y) + \int_s^t d\tau \int G^{ap}_{t,\tau}(x, z) \Phi_{\tau,s}(z, y)\, dz, \tag{5.64}
\]
with
\[
\Phi_{t,s}(x, y) = \sum_{n=0}^\infty (\tilde F^n F)_{t,s}(x, y). \tag{5.65}
\]


Formula (5.64) suggests an alternative approach to the derivation of (5.62), namely by searching for the heat kernel $G_{t,s}(x, y)$ in the form (5.64) with a new unknown function $\Phi$. Plugging this expression into the equation
\[
\frac{\partial}{\partial t} G_{t,s}(x, y) = -\psi_t(x, -i\nabla) G_{t,s}(x, y) \tag{5.66}
\]
yields
\[
\begin{aligned}
&-\psi_{t,[y]}(-i\nabla) G^{ap}_{t,s}(x, y) + \Phi_{t,s}(x, y) - \int_s^t d\tau \int \psi_{t,[z]}(-i\nabla) G^{ap}_{t,\tau}(x, z) \Phi_{\tau,s}(z, y)\, dz \\
&\quad = -\psi_t(x, -i\nabla) G^{ap}_{t,s}(x, y) - \int_s^t d\tau \int \psi_t(x, -i\nabla) G^{ap}_{t,\tau}(x, z) \Phi_{\tau,s}(z, y)\, dz,
\end{aligned} \tag{5.67}
\]
which, by the definition (5.57) of $F$, implies the following integral equation for $\Phi$:
\[
\Phi_{t,s}(x, y) = F_{t,s}(x, y) + \int_s^t d\tau \int F_{t,\tau}(x, z) \Phi_{\tau,s}(z, y)\, dz. \tag{5.68}
\]
Solving this equation by successive approximation yields again (5.65), and therefore also the expansion (5.62). In order to render all these calculations rigorous, one has to show at least that all series involved in the definition of $G_{t,s}(x, y)$ do indeed converge, and then to clarify in what sense (if any) this function satisfies equation (5.51) with the initial condition $f(y) = \delta(y)$, obtained as the limit for $t - s \to 0$ in the sense of generalized functions. We shall carry out these tasks in the following sections. At this stage, it should only be noted that the expansion (5.65) is similar to the perturbation series (4.166), with $\tilde F$ playing the role of $LU$. When analysing the convergence of the series (5.62), one can therefore use similar arguments as in Theorems 4.13.3 and 4.13.4.
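The successive approximations for (5.68) can be tried out on a scalar caricature. The following is a minimal sketch, assuming that the spatial variables are dropped and $s = 0$, so that the equation reads $\phi(t) = F(t, 0) + \int_0^t F(t, \tau) \phi(\tau)\, d\tau$; the constant kernel used in the usage note is a hypothetical choice for which the Neumann series sums to $c\, e^{ct}$.

```python
import math

def solve_volterra(kernel, t_max, steps, iterations):
    # Successive approximations for
    #   phi(t) = kernel(t, 0) + int_0^t kernel(t, tau) phi(tau) d tau
    # on a uniform grid, with the trapezoidal rule for the integral.
    h = t_max / steps
    grid = [i * h for i in range(steps + 1)]
    free = [kernel(t, 0.0) for t in grid]
    phi = free[:]  # zeroth approximation: phi = F
    for _ in range(iterations):
        new_phi = []
        for i, t in enumerate(grid):
            integral = 0.0
            for j in range(i):
                left = kernel(t, grid[j]) * phi[j]
                right = kernel(t, grid[j + 1]) * phi[j + 1]
                integral += 0.5 * h * (left + right)
            new_phi.append(free[i] + integral)
        phi = new_phi  # each sweep adds one more term of the Neumann series
    return grid, phi

def constant_kernel(t, tau, c=2.0):
    # Hypothetical kernel F = c, chosen because the exact answer is known.
    return c
```

With `grid, phi = solve_volterra(constant_kernel, 1.0, 100, 25)`, the final value `phi[-1]` should be close to `2 * math.exp(2)`, up to the quadrature error of the trapezoidal rule.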

5.5 The method of frozen coefficients: estimates for the Green function

As a starting point for a rigorous theory, let us begin with the convergence of a series of the type (5.65). We are interested in convergence both in $L_1(\mathbb{R}^d)$ and $C_\infty(\mathbb{R}^d)$. A quick look at the expression (5.59) (details will be given later) suggests that $F_{t,s}(., y)$ should belong to the intersection $L_1(\mathbb{R}^d) \cap C_\infty(\mathbb{R}^d)$, although with possible power-type singularities as $t \to s$. This observation is reflected in the following lemma, which is built on the weakest general assumptions that still ensure the required convergence of (5.65).

Lemma 5.5.1. Let $F_{t,s}(x, y)$, $T_1 \le s < t \le T_2$, $T = T_2 - T_1$, $x, y \in \mathbb{R}^d$, be a continuous function in all its variables such that $F_{t,s}(., y) \in L_1(\mathbb{R}^d) \cap C_\infty(\mathbb{R}^d)$ for


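all $t, s, y$ and
\[
|F_{t,s}(x, y)| \le \min\left( (t - s)^{-\omega_C}\, \Omega^C_{t-s}(x - y),\ (t - s)^{-\omega_L}\, \Omega^L_{t-s}(x - y) \right) \tag{5.69}
\]
with some constants $\omega_L \in (0, 1)$, $\omega_C > 0$ and non-negative functions $\Omega^C_t(.) \in C_\infty(\mathbb{R}^d)$, $\Omega^L_t(.) \in L_1(\mathbb{R}^d)$ that have uniformly bounded norms:
\[
\|\Omega^C_t(.)\|_{C_\infty(\mathbb{R}^d)} \le \Omega_C, \qquad \|\Omega^L_t(.)\|_{L_1(\mathbb{R}^d)} \le \Omega_L,
\]
with some positive constants $\Omega_C, \Omega_L$. Then the series on the r.h.s. of (5.65), where $\tilde F$ is given by (5.61), converges for any $t > s$, $y$ both in $L_1(\mathbb{R}^d)$ and $C_\infty(\mathbb{R}^d)$, and the following estimates hold for its sum:
\[
\max\left( \|\Phi_{t,s}(., y)\|_{L_1(\mathbb{R}^d)}, \|\Phi_{t,s}(x, .)\|_{L_1(\mathbb{R}^d)} \right) \le (t - s)^{-\omega_L} C(\Omega_L, \omega_L, T), \tag{5.70}
\]
\[
\max\left( \|\Phi_{t,s}(., y)\|_{C_\infty(\mathbb{R}^d)}, \|\Phi_{t,s}(x, .)\|_{C_\infty(\mathbb{R}^d)} \right) \le (t - s)^{-\omega_C}\, \Omega_C\, C(\Omega_L, \omega_L, \omega_C, T). \tag{5.71}
\]
Explicit expressions for the constants $C(\Omega_L, \omega_L, T)$ and $C(\Omega_L, \omega_L, \omega_C, T)$ are given below. Moreover, equation (5.68) has a unique solution $\Phi$ that satisfies the estimates (5.70), (5.71), and it is given by the sum on the r.h.s. of (5.65).

Proof. It is well known (and easy to see) that for any functions $\phi, \psi \in L_1(\mathbb{R}^d)$ the convolution $(\phi \star \psi)(x) = \int \phi(x - y) \psi(y)\, dy$ also belongs to $L_1(\mathbb{R}^d)$ and
\[
\|\phi \star \psi\|_{L_1(\mathbb{R}^d)} \le \|\phi\|_{L_1(\mathbb{R}^d)} \|\psi\|_{L_1(\mathbb{R}^d)}.
\]
For the $n$th term of the series on the r.h.s. of (5.65), we therefore get
\[
\begin{aligned}
|(\tilde F^n F)_{t,s}(x, y)| \le{}& \int_{s \le s_1 \le \cdots \le s_n \le t} ds_1 \cdots ds_n\, (t - s_n)^{-\omega_L} (s_n - s_{n-1})^{-\omega_L} \cdots (s_1 - s)^{-\omega_L} \\
& \times \int_{\mathbb{R}^{dn}} \Omega^L_{t-s_n}(x - y_n) \cdots \Omega^L_{s_1-s}(y_1 - y)\, dy_1 \cdots dy_n,
\end{aligned}
\]
and hence, after integrating over $x$,
\[
\|(\tilde F^n F)_{t,s}(., y)\|_{L_1(\mathbb{R}^d)} \le \Omega_L^{n+1} \int_{s \le s_1 \le \cdots \le s_n \le t} ds_1 \cdots ds_n\, (t - s_n)^{-\omega_L} (s_n - s_{n-1})^{-\omega_L} \cdots (s_1 - s)^{-\omega_L}.
\]
Using formula (9.15), we can conclude that
\[
\|(\tilde F^n F)_{t,s}(., y)\|_{L_1(\mathbb{R}^d)} \le \Omega_L^{n+1} (t - s)^{n(1-\omega_L)-\omega_L}\, \frac{[\Gamma(1 - \omega_L)]^{n+1}}{\Gamma((n + 1)(1 - \omega_L))}.
\]
Therefore, by the definition of the Mittag-Leffler function (9.13), the series defining $\Phi$ converges in $L_1(\mathbb{R}^d)$ and
\[
\|\Phi_{t,s}(., y)\|_{L_1(\mathbb{R}^d)} \le \Omega_L (t - s)^{-\omega_L}\, \Gamma(1 - \omega_L)\, E_{1-\omega_L, 1-\omega_L}\left( \Gamma(1 - \omega_L)\, \Omega_L (t - s)^{1-\omega_L} \right), \tag{5.72}
\]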


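which proves the estimate (5.70) with
\[
C = \Omega_L\, \Gamma(1 - \omega_L)\, E_{1-\omega_L, 1-\omega_L}\left( \Gamma(1 - \omega_L)\, \Omega_L T^{1-\omega_L} \right).
\]
Next, one observes that if $\phi \in L_1(\mathbb{R}^d)$ and $\psi \in C_\infty(\mathbb{R}^d)$, then $\phi \star \psi \in C_\infty(\mathbb{R}^d)$ and
\[
\|\phi \star \psi\|_{C_\infty(\mathbb{R}^d)} \le \|\phi\|_{L_1(\mathbb{R}^d)} \|\psi\|_{C_\infty(\mathbb{R}^d)}. \tag{5.73}
\]

Exercise 5.5.1. Prove the assertion (5.73).

Consequently, $\int F_{t,\tau}(., z) F_{\tau,s}(z, y)\, dz \in C_\infty(\mathbb{R}^d)$ and
\[
\left\| \int F_{t,\tau}(., z) F_{\tau,s}(z, y)\, dz \right\|_{C_\infty(\mathbb{R}^d)} \le \Omega_C \Omega_L \min\left( (t - \tau)^{-\omega_C} (\tau - s)^{-\omega_L},\ (t - \tau)^{-\omega_L} (\tau - s)^{-\omega_C} \right). \tag{5.74}
\]
Now, a problem arises: how to push the (possibly non-integrable) singularity $(t - \tau)^{-\omega_C}$ or $(\tau - s)^{-\omega_C}$ through the integration over $\tau$? The trick is to decompose the integral into two parts:
\[
\int_s^t d\tau \int F_{t,\tau}(x, z) F_{\tau,s}(z, y)\, dz = \int_s^{s+(t-s)/2} d\tau \int F_{t,\tau}(x, z) F_{\tau,s}(z, y)\, dz + \int_{s+(t-s)/2}^t d\tau \int F_{t,\tau}(x, z) F_{\tau,s}(z, y)\, dz,
\]
and to estimate the first and the second integral by the first and the second estimate of (5.74), respectively. This yields
\[
\begin{aligned}
\left\| \int_s^t d\tau \int F_{t,\tau}(., z) F_{\tau,s}(z, y)\, dz \right\|_{C_\infty(\mathbb{R}^d)}
&\le \Omega_C \Omega_L \int_s^{s+(t-s)/2} (t - \tau)^{-\omega_C} (\tau - s)^{-\omega_L}\, d\tau + \Omega_C \Omega_L \int_{s+(t-s)/2}^t (t - \tau)^{-\omega_L} (\tau - s)^{-\omega_C}\, d\tau \\
&\le 2^{\omega_C}\, \Omega_C \Omega_L (t - s)^{-\omega_C} \left[ \int_s^{s+(t-s)/2} (\tau - s)^{-\omega_L}\, d\tau + \int_{s+(t-s)/2}^t (t - \tau)^{-\omega_L}\, d\tau \right] \\
&= \frac{1}{1 - \omega_L}\, 2^{\omega_C + \omega_L}\, \Omega_C \Omega_L (t - s)^{1-\omega_L-\omega_C}.
\end{aligned}
\]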
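The effect of splitting the time integral at the midpoint can be checked numerically. The following is a minimal sketch with hypothetical exponent values; the substitution $v = u^{1-\omega_L}$ used to remove the integrable endpoint singularity is a quadrature device of this sketch, not part of the proof.

```python
def half_integral(length, omega_c, omega_l, n=20000):
    # int_0^{length/2} (length - u)^(-omega_c) u^(-omega_l) du, computed by the
    # midpoint rule after the substitution v = u^(1 - omega_l), which removes
    # the integrable singularity at u = 0.
    a = 1.0 - omega_l
    v_max = (0.5 * length) ** a
    h = v_max / n
    total = 0.0
    for i in range(n):
        v = (i + 0.5) * h
        u = v ** (1.0 / a)
        total += (length - u) ** (-omega_c)
    return total * h / a

def split_integral(length, omega_c, omega_l):
    # Sum of the two half-range integrals from the decomposition; by the
    # reflection tau -> s + t - tau the second one equals the first.
    return 2.0 * half_integral(length, omega_c, omega_l)

def split_bound(length, omega_c, omega_l):
    # The closed-form bound 2^(omega_c + omega_l) (t - s)^(1 - omega_l - omega_c) / (1 - omega_l).
    return 2.0 ** (omega_c + omega_l) * length ** (1.0 - omega_l - omega_c) / (1.0 - omega_l)
```

For instance, with `length = 0.3`, `omega_c = 1.5` and `omega_l = 0.5`, the value of `split_integral` stays below `split_bound`, although $(t - \tau)^{-\omega_C}$ alone is not integrable near $\tau = t$.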

In order to estimate the term $\tilde F^n F$, we observe that for any partition $s = s_0 \le s_1 \le \cdots \le s_n \le s_{n+1} = t$, there exists $k \in \{0, \ldots, n\}$ such that $s_{k+1} - s_k \ge (t - s)/(n + 1)$. Therefore, the integral in the term $\tilde F^n F$ can be bounded by the


sum of $(n + 1)$ integrals $I_k$ such that $s_{k+1} - s_k \ge (t - s)/(n + 1)$ in $I_k$. In each integral $I_k$, we estimate $F_{s_{k+1},s_k}(y_{k+1}, y_k)$ by
\[
(s_{k+1} - s_k)^{-\omega_C}\, \Omega^C_{s_{k+1}-s_k}(y_{k+1} - y_k) \le (n + 1)^{\omega_C} (t - s)^{-\omega_C}\, \Omega^C_{s_{k+1}-s_k}(y_{k+1} - y_k)
\]
and the other $F_{s_{j+1},s_j}(y_{j+1}, y_j)$ by $(s_{j+1} - s_j)^{-\omega_L}\, \Omega^L_{s_{j+1}-s_j}(y_{j+1} - y_j)$. This leads to the estimate
\[
\begin{aligned}
\|(\tilde F^n F)_{t,s}(., y)\|_{C_\infty(\mathbb{R}^d)} &\le (n + 1)^{\omega_C+1} (t - s)^{-\omega_C}\, \Omega_C \Omega_L^n \int_{s \le s_1 \le \cdots \le s_n \le t} (s_n - s_{n-1})^{-\omega_L} \cdots (s_1 - s)^{-\omega_L}\, ds_1 \cdots ds_n \\
&= (n + 1)^{\omega_C+1} (t - s)^{n(1-\omega_L)-\omega_C}\, \Omega_C \Omega_L^n\, \frac{[\Gamma(1 - \omega_L)]^n}{\Gamma(n(1 - \omega_L) + 1)},
\end{aligned}
\]
where (9.16) was used. Consequently, we find
\[
\|\Phi_{t,s}(., y)\|_{C_\infty(\mathbb{R}^d)} \le \Omega_C (t - s)^{-\omega_C} \sum_{n=0}^\infty (n + 1)^{\omega_C+1}\, \Omega_L^n\, \frac{[(t - s)^{1-\omega_L}\, \Gamma(1 - \omega_L)]^n}{\Gamma(n(1 - \omega_L) + 1)}.
\]
This series converges and can also be expressed in terms of the Mittag-Leffler functions, which proves (5.71). The last statement is a direct consequence of the obtained estimates. □

We can now obtain a general criterion for ensuring that a candidate for the Green function (5.64) suggested by the method of frozen coefficients is well defined and satisfies the required initial condition $\delta(y)$.

Lemma 5.5.2. (i) Suppose that $\Phi$ satisfies the estimates (5.70), (5.71) of Lemma 5.5.1. Let $G^{ap}_{t,s}(x, y)$, $t > s$, be a continuous function in all its variables such that $G^{ap}_{t,s}(., y) \in L_1(\mathbb{R}^d) \cap C_\infty(\mathbb{R}^d)$ for all $t, s, y$, and
\[
\|G^{ap}_{t,s}(., y)\|_{L_1(\mathbb{R}^d)} \le \Omega_{GL}, \qquad \|G^{ap}_{t,s}(x, .)\|_{L_1(\mathbb{R}^d)} \le \Omega_{GL}, \qquad \|G^{ap}_{t,s}(., .)\|_{C_\infty(\mathbb{R}^{2d})} \le (t - s)^{-\omega_G}\, \Omega_{GC}, \tag{5.75}
\]
with some constants $\omega_G, \Omega_{GL}, \Omega_{GC} > 0$. Then $G_{t,s}(x, y)$ given by (5.64) is well defined, $G_{t,s}(., y) \in L_1(\mathbb{R}^d) \cap C_\infty(\mathbb{R}^d)$ for all $y$ and $t > s$, and for the


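second term in (5.64) we have the estimates
\[
\left\| \int_s^t d\tau \int G^{ap}_{t,\tau}(., z) \Phi_{\tau,s}(z, y)\, dz \right\|_{L_1(\mathbb{R}^d)} \le \frac{\Omega_{GL}\, C(\Omega_L, \omega_L, T)}{1 - \omega_L}\, (t - s)^{1-\omega_L}, \tag{5.76}
\]
\[
\left\| \int_s^t d\tau \int G^{ap}_{t,\tau}(x, z) \Phi_{\tau,s}(z, .)\, dz \right\|_{L_1(\mathbb{R}^d)} \le \frac{\Omega_{GL}\, C(\Omega_L, \omega_L, T)}{1 - \omega_L}\, (t - s)^{1-\omega_L}, \tag{5.77}
\]
\[
\left\| \int_s^t d\tau \int G^{ap}_{t,\tau}(., z) \Phi_{\tau,s}(z, .)\, dz \right\|_{C_\infty(\mathbb{R}^{2d})} \le C \max\left[ (t - s)^{1-\omega_L-\omega_G},\ (t - s)^{1-\omega_C} \right] \tag{5.78}
\]
with a constant $C = C(\Omega_{GL}, \Omega_{GC}, \Omega_L, \Omega_C, \omega_L, \omega_C, \omega_G, T)$.

(ii) Additionally, let
\[
\left\| \int G^{ap}_{t,s}(., y) f(y)\, dy - f(.) \right\|_{C_\infty(\mathbb{R}^d)} \to 0, \quad \text{as } (t - s) \to 0, \tag{5.79}
\]
for any $f \in C_\infty(\mathbb{R}^d)$. (This requirement expresses the strong continuity of the family of operators $f \mapsto \int G^{ap}_{t,s}(., y) f(y)\, dy$ in $C_\infty(\mathbb{R}^d)$ at the diagonal $s = t$ and is a bit stronger than the requirement that $G^{ap}_{t,s}(., y)$ tends to $\delta(y)$ in the sense of generalized functions.) Then the same holds for $G_{t,s}(x, y)$:
\[
\left\| \int G_{t,s}(., y) f(y)\, dy - f(.) \right\|_{C_\infty(\mathbb{R}^d)} \to 0, \quad \text{as } (t - s) \to 0. \tag{5.80}
\]

Proof. (i) The estimates (5.76) and (5.77) are a direct consequence of (5.70) and the first two estimates of (5.75). Next, (5.75), (5.70), (5.71) and (5.73) imply that
\[
\left| \int G^{ap}_{t,\tau}(x, z) \Phi_{\tau,s}(z, y)\, dz \right| \le \min\left[ (t - \tau)^{-\omega_G}\, \Omega_{GC}\, (\tau - s)^{-\omega_L}\, C(\Omega_L, \omega_L, T),\ \Omega_{GL}\, (\tau - s)^{-\omega_C}\, \Omega_C\, C(\Omega_L, \omega_L, \omega_C, T) \right]. \tag{5.81}
\]
In order to estimate the integral over $\tau$, we use the same trick as in the proof of Lemma 5.5.1. Namely, we decompose this integral into two parts:
\[
\int_s^t d\tau \int G^{ap}_{t,\tau}(x, z) \Phi_{\tau,s}(z, y)\, dz = \int_s^{s+(t-s)/2} d\tau \int G^{ap}_{t,\tau}(x, z) \Phi_{\tau,s}(z, y)\, dz + \int_{s+(t-s)/2}^t d\tau \int G^{ap}_{t,\tau}(x, z) \Phi_{\tau,s}(z, y)\, dz.
\]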


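Afterwards, we can estimate the first and the second integral by the first and the second estimate of (5.81), respectively. This yields the estimate
\[
\left| \int_s^t d\tau \int G^{ap}_{t,\tau}(x, z) \Phi_{\tau,s}(z, y)\, dz \right| \le 2^{\omega_G} (t - s)^{-\omega_G}\, \frac{\Omega_{GC}\, C(\Omega_L, \omega_L, T)}{1 - \omega_L}\, (t - s)^{1-\omega_L} + 2^{\omega_C} (t - s)^{-\omega_C}\, \Omega_{GL} \Omega_C\, C(\Omega_L, \omega_L, \omega_C, T)\, (t - s),
\]
which implies (5.78).

(ii) The estimate (5.77) implies that
\[
\left\| \int_s^t d\tau \int\!\!\int G^{ap}_{t,\tau}(., z) \Phi_{\tau,s}(z, y) f(y)\, dz\, dy \right\|_{C_\infty(\mathbb{R}^d)} \to 0, \quad \text{as } t - s \to 0,
\]
since $(t - s)^{1-\omega_L} \|f\|_{C(\mathbb{R}^d)} \to 0$ for any $f \in C(\mathbb{R}^d)$, so that (5.80) is equivalent to (5.79). □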
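The Mittag-Leffler bound (5.72) from the proof of Lemma 5.5.1 is easy to evaluate numerically. The following is a minimal sketch, assuming that (9.13) is the standard two-parameter definition $E_{\alpha,\beta}(z) = \sum_{n \ge 0} z^n / \Gamma(\alpha n + \beta)$; the function names are hypothetical.

```python
import math

def mittag_leffler(alpha, beta, z, terms=200):
    # E_{alpha,beta}(z) = sum_n z^n / Gamma(alpha n + beta), for z > 0;
    # lgamma keeps the individual terms from overflowing.
    log_z = math.log(z)
    return sum(math.exp(n * log_z - math.lgamma(alpha * n + beta))
               for n in range(terms))

def phi_l1_bound(omega_l, big_omega_l, dt):
    # Right-hand side of (5.72) for dt = t - s > 0.
    a = 1.0 - omega_l
    gamma_a = math.gamma(a)
    return (big_omega_l * dt ** (-omega_l) * gamma_a
            * mittag_leffler(a, a, gamma_a * big_omega_l * dt ** a))
```

For $\alpha = \beta = 1/2$ the series can be compared with the closed form $z e^{z^2} \operatorname{erfc}(-z) + 1/\sqrt{\pi}$, and for small `dt` the product `phi_l1_bound(omega_l, big_omega_l, dt) * dt ** omega_l` approaches `big_omega_l`, which reflects the $(t - s)^{-\omega_L}$ singularity in (5.70).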

5.6 The method of frozen coefficients: main examples

In the next section, we shall search for conditions to ensure that $G$ actually solves the required equation. In this section, let us confirm that the conditions of Lemmas 5.5.1 and 5.5.2 are indeed satisfied for the basic examples that we have in mind, namely equations arising from operators with homogeneous symbols. We shall consider three cases separately: diffusions, operators with homogeneous symbols and operators with homogeneous symbols of variable order. Their respective symbols are:
\[
\psi_t(x, p) = (A_t(x) p, p), \qquad \psi_t(x, p) = \omega_t(x, p/|p|)|p|^\beta, \qquad \psi_t(x, p) = \omega_t(x, p/|p|)|p|^{\beta_t(x)}. \tag{5.82}
\]
Of course, the first case is a special case of the second one. However, it is reasonable to discuss diffusion separately, (a) because of its importance and (b) because its simplicity makes it possible to avoid the machinery of pseudo-differential operators.

For a diffusion operator with the symbol $\psi_t(x, p) = -(A_t(x) p, p)$ with a family of symmetric positive matrices $A_t(x)$, the Green function for the corresponding equation with frozen coefficients is given by (4.190):
\[
G^{ap}_{t,s}(x, y) = \frac{(2\pi)^{-d/2}}{\sqrt{\det\left( \int_s^t A_\tau(y)\, d\tau \right)}} \exp\left\{ -\frac{1}{2} \left( \left( \int_s^t A_\tau(y)\, d\tau \right)^{-1} (x - y),\ x - y \right) \right\}. \tag{5.83}
\]
It satisfies the following estimate:
\[
|G^{ap}_{t,s}(x, y)| \le (2\pi \lambda_{\min}(t - s))^{-d/2} \exp\left\{ -\frac{(x - y)^2}{2(t - s)\lambda_{\max}} \right\}, \tag{5.84}
\]


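where $\lambda_{\min}$ and $\lambda_{\max}$ are the minimum and the maximum, respectively, of the eigenvalues of all matrices $A_\tau$, $\tau \in [s, t]$. Consequently, (5.75) holds with $\omega_G = d/2$. Next, we find
\[
\frac{\partial G^{ap}_{t,s}(x, y)}{\partial x} = -\left( \int_s^t A_\tau(y)\, d\tau \right)^{-1} (x - y)\, G^{ap}_{t,s}(x, y) \tag{5.85}
\]
and
\[
\frac{\partial^2 G^{ap}_{t,s}(x, y)}{\partial x_j \partial x_k} = G^{ap}_{t,s}(x, y) \left[ \sum_{l,m} \left( \int_s^t A_\tau(y)\, d\tau \right)^{-1}_{jm} (x - y)_m \left( \int_s^t A_\tau(y)\, d\tau \right)^{-1}_{kl} (x - y)_l - \left( \int_s^t A_\tau(y)\, d\tau \right)^{-1}_{jk} \right]. \tag{5.86}
\]
Assume that the $A_t(x)$ are $\gamma$-Hölder-continuous in $x$ with an index $\gamma \in (0, 1]$, i.e.,
\[
\|A_t(y) - A_t(x)\| \le H_A \|x - y\|^\gamma. \tag{5.87}
\]
The function $F$ given by (5.57) equals
\[
F_{t,s}(x, y) = \frac{1}{2} \operatorname{tr}\left[ (A_t(y) - A_t(x))\, \frac{\partial^2 G^{ap}_{t,s}(x, y)}{\partial x^2} \right] = \frac{1}{2} \sum_{j,k} (A_{t,jk}(y) - A_{t,jk}(x))\, \frac{\partial^2 G^{ap}_{t,s}(x, y)}{\partial x_j \partial x_k}. \tag{5.88}
\]
In order to estimate this trace, we can use the fact that $\operatorname{tr}(CD) \le \max|\lambda_C|\, \|D\|$ for any symmetric matrices $C, D$, where $\max|\lambda_C|$ is the maximal magnitude of the eigenvalues of $C$, and $\|D\|$ is the Euclidean norm of $D$, so that
\[
\|D\|^2 \le \sum_{i,j} D_{ij}^2 \le d^2 \max_{ij} |D_{ij}|^2.
\]
Therefore, we get
\[
|F_{t,s}(x, y)| \le \min(\lambda_{\max}, H_A \|x - y\|^\gamma) \left\| \frac{\partial^2 G^{ap}_{t,s}(x, y)}{\partial x_j \partial x_k} \right\|.
\]
The last term can be estimated as the sum of two terms in the bracket on the r.h.s. of (5.86), which yields
\[
|F_{t,s}(x, y)| \le \frac{d\, \min(H_A \|x - y\|^\gamma, \lambda_{\max})}{\lambda_{\min}(t - s)} \left( 1 + \frac{(x - y)^2}{\lambda_{\min}(t - s)} \right) G^{ap}_{t,s}(x, y). \tag{5.89}
\]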


Integrating over $x$ and changing the variable $x$ to $w = (x - y)/\sqrt{t - s}$ leads to
\[
\int |F_{t,s}(x, y)|\, dx \le \frac{d H_A}{\lambda_{\min}(t - s)^{1-\gamma/2}} \int |w|^\gamma \left( 1 + \frac{w^2}{\lambda_{\min}} \right) (2\pi\lambda_{\min})^{-d/2} \exp\left\{ -\frac{|w|^2}{2\lambda_{\max}} \right\} dw.
\]
Therefore, the second estimate in (5.69) holds with $\omega_L = 1 - \gamma/2$. Similarly, the first estimate in (5.69) holds with $\omega_C = 1 + d/2$.

Next, let us discuss the second symbol in (5.82). Assume that for each $x$ the conditions of Theorem 4.15.1 are satisfied with bounds that are uniform in $x$, and that $\omega_t(x, s)$ is $\gamma$-Hölder-continuous in $x$, i.e.,
\[
|\omega_t(x, p/|p|) - \omega_t(y, p/|p|)| \le H_\omega |x - y|^\gamma. \tag{5.90}
\]
It follows from (4.196) that for all $t > s$ and $x, y, z$,
\[
|G^{\psi,z}_{t,s}(x, y)| \le C \min\left( (t - s)^{-d/\beta},\ \frac{t - s}{|x - y|^{d+\beta}} \right), \tag{5.91}
\]
where $C$ depends on $d$, $\beta$, the minimal value of the real part and the maximum magnitude of $\omega_t(x, s)$. Consequently, for the corresponding $G^{ap}_{t,s}(x, y) = G^{\psi,y}_{t,s}(x, y)$ the estimates (5.75) hold with $\omega_G = d/\beta$. In order to estimate the corresponding function $F$ of (5.57), we note that the action of $\psi_{t,[y]}(-i\nabla)$ on $G^{\psi,y}_{t,s}(x, y)$ is equivalent to a differentiation in $t$, for which the estimate (4.196) can be used. In the exact same way, one estimates the action of $\psi_{t,[x]}(-i\nabla) = \psi(x, -i\nabla)$ on $G^{\psi,y}_{t,s}(x, y)$. As a consequence, one sees that the action of $\psi_{t,[y]}(-i\nabla) - \psi(x, -i\nabla)$ on $G^{\psi,y}_{t,s}(x, y)$ has the same upper bound, but with an additional factor $\min(1, |x - y|^\gamma)$. In fact, analogously to (4.77), we have
\[
\begin{aligned}
F_{t,s}(x, y) ={}& \frac{1}{(2\pi)^d} \int_0^\infty d|p| \int_{-1}^1 du \int_{S^{d-2}} dn\, e^{i|p||x-y|u} (1 - u^2)^{(d-3)/2} |p|^{d-1} |p|^\beta \\
& \times \exp\left\{ -|p|^\beta \int_s^t \omega_\tau(y, x - y, u, n)\, d\tau \right\} [\omega_t(y, x - y, u, n) - \omega_t(x, x - y, u, n)].
\end{aligned}
\]
The last term in the square brackets brings in the factor $\min(1, |x - y|^\gamma)$. The remaining integral can be decomposed into two parts by writing $1 = \chi_1(u) + \chi_2(u)$ as for (4.77). The second integral is estimated by (9.31). Therefore, we get the estimate
\[
|F_{t,s}(x, y)| \le C \min(1, |x - y|^\gamma)\, \frac{1}{t - s} \min\left( (t - s)^{-d/\beta},\ \frac{t - s}{|x - y|^{d+\beta}} \right). \tag{5.92}
\]
Similar to the case of diffusions we observe that, when estimating the norms of $F$, the natural scaled variable is $w = (x - y)(t - s)^{-1/\beta}$. This implies the estimates (5.69) with
\[
\omega_L = 1 - \gamma/\beta, \qquad \omega_C = 1 + d/\beta, \tag{5.93}
\]


which is of course consistent with the above estimate for diffusions corresponding to $\beta = 2$.

Let us now analyse the case of variable order of homogeneity, i.e., the third case in (5.82): $\psi_t(x,p) = \omega_t(x,p/|p|)|p|^{\beta_t(x)}$. Assuming that for each $x$ the conditions of Theorem 4.15.4 are satisfied with bounds that are uniform in $x$, it follows from (4.202) that for $t-s < T$, with any $T$,
\[
|G^{\psi,z}_{t,s}(x,y)| \le C\min\Bigl((t-s)^{-d/b_{min}},\ \max\Bigl(\frac{t-s}{|x-y|^{d+b_{min}}},\ \frac{t-s}{|x-y|^{d+b_{max}}}\Bigr)\Bigr), \tag{5.94}
\]
where $C$ depends on $T$, $b_{max} = \max_{t,x}\beta_t(x)$, $b_{min} = \min_{t,x}\beta_t(x)$, the minimal value of the real part and the maximum magnitude of $\omega_t(x,s)$. Consequently, for the corresponding $G^{ap}_{t,s}(x,y) = G^{\psi,y}_{t,s}(x,y)$ the estimates (5.75) hold with $\omega_G = d/b_{min}$. Next, assuming the Hölder continuity of the coefficients, i.e., assuming (5.90) and
\[
|\beta_t(x) - \beta_t(y)| \le H_\beta |x-y|^{\gamma}, \tag{5.95}
\]
it follows that
\[
|F_{t,s}(x,y)| \le C\min(1,|x-y|^{\gamma})\,\frac{1}{t-s}\,(1 + |\ln(t-s)|)
\]
\[
\times \min\Bigl((t-s)^{-d/b_{min}},\ \max\Bigl(\frac{t-s}{|x-y|^{d+b_{min}}},\ \frac{t-s}{|x-y|^{d+b_{max}}}\Bigr)\Bigr), \tag{5.96}
\]
because, for $|p| > 1$,
\[
\bigl|\,|p|^{\beta_t(x)} - |p|^{\beta_t(y)}\bigr| \le |\beta_t(x) - \beta_t(y)|\,|p|^{b_{min}}\ln|p|,
\]
and the multiplier $\ln|p|$ yields an additional multiplier $|\ln(t-s)|$ in the final estimate. In particular, the estimates (5.69) hold with
\[
\omega_L = 1 - \epsilon - \gamma/b_{max}, \qquad \omega_C = 1 - \epsilon + d/b_{min}
\]
with any $\epsilon > 0$. Summing up, we proved the following.

Theorem 5.6.1. Let the symbols of the Cauchy problem (5.51) have one of the three types given by (5.82) and their coefficients $(A, \omega, \beta)$ be Hölder-continuous in $x$. Suppose that $A$ is symmetric with all eigenvalues between certain positive numbers $\lambda_{min}$, $\lambda_{max}$, that the $\beta_t(x)$ belong to a certain interval $[b_{min}, b_{max}] \subset \mathbb{R}_+$ and that $\omega_t(x,s)$ has a positive real part separated from zero and is $(d+1+b_{max})$-times continuously differentiable in $s$. Then the Green function of the type (5.64) constructed by the method of frozen coefficients is a well-defined continuous function satisfying (5.76), (5.78) and (5.80). In particular, $\omega_L = 1 - \gamma/\beta$, $\omega_C = 1 + d/\beta$ in the first two cases.
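The exponents $\omega_L = 1 - \gamma/\beta$ and $\omega_C = 1 + d/\beta$ for a constant order $\beta$ can be checked directly from the bound (5.92) by the scaling $w = (x-y)(t-s)^{-1/\beta}$. The following is only a sketch of that check. It uses $\min(1,|x-y|^{\gamma}) \le |x-y|^{\gamma}$ and assumes additionally that $\gamma < \beta$, so that the $w$-integral converges; the constants $C$, $C'$ are generic and are not taken from the text.

```latex
% Scaling check of (5.93) from (5.92); assumes constant \beta and \gamma < \beta.
\begin{align*}
w &= (x-y)(t-s)^{-1/\beta}, \qquad dx = (t-s)^{d/\beta}\,dw, \\
\min\Bigl((t-s)^{-d/\beta},\ \frac{t-s}{|x-y|^{d+\beta}}\Bigr)
  &= (t-s)^{-d/\beta}\,\min\bigl(1,\ |w|^{-(d+\beta)}\bigr), \\
\int |F_{t,s}(x,y)|\,dx
  &\le C\,(t-s)^{-1+\gamma/\beta}\int |w|^{\gamma}\min\bigl(1,\ |w|^{-(d+\beta)}\bigr)\,dw
   = C'\,(t-s)^{-(1-\gamma/\beta)}, \\
\sup_{x}|F_{t,s}(x,y)| &\le C\,(t-s)^{-1-d/\beta}.
\end{align*}
```

The $L_1$-bound gives the exponent $1 - \gamma/\beta$ and the uniform bound gives $1 + d/\beta$, in agreement with (5.93).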


By a more careful analysis of the series (5.65), one can often obtain rather precise bounds and even two-sided estimates for the Green function $G_{t,s}(x,y)$, as well as its asymptotic expansions in small times and around the diagonal $\{x = y\}$. An extensive literature is devoted to such estimates and asymptotic expansions. For the sake of completeness, let us quote some key results. In [116], the case of diffusion is studied in detail (see also [227] and references therein), which leads to the following theorem:

Theorem 5.6.2. For the general heat equation
\[
\frac{\partial u_t(x)}{\partial t} = L_t u_t(x), \qquad
L_t u_t(x) = \frac{1}{2}(a(t,x)\nabla,\nabla)u_t(x) + (b(t,x),\nabla u_t(x)) + c(t,x)u_t(x), \tag{5.97}
\]
suppose that $a$ is uniformly elliptic, i.e., the expression $(a(x)\xi,\xi)$ for unit vectors $\xi$ is uniformly bounded from below and above by positive constants $m$, $m^{-1}$, and that it is continuously differentiable in $x$. Moreover, suppose that $b$, $c$ are continuous with respect to all their variables and the following uniform bounds hold:
\[
\sup_{t,x}\max(|\nabla a(t,x)|, |b(t,x)|, |c(t,x)|) \le M.
\]
Then there exist constants $\sigma_i$, $C_i$, $i = 1,2$, depending only on $m$, $M$, $T$ such that the Green function $G_{t,s}(x,\xi)$ of equation (5.97) is well defined and satisfies the following two-sided bounds:
\[
C_1 G_{\sigma_1(t-s)}(x-\xi) \le G_{t,s}(x,\xi) \le C_2 G_{\sigma_2(t-s)}(x-\xi), \qquad 0 < s, t < T, \tag{5.98}
\]
where $G_{\sigma_1(t-s)}(x-\xi)$ is the heat kernel (4.14) of the basic heat equation. Moreover, if $a$ is twice continuously differentiable in $x$, and $b$, $c$ are continuously differentiable (with all derivatives bounded), then $G_{t,s}(x,\xi)$ is differentiable in $x$, $\xi$ and
\[
\max\Bigl(\Bigl|\frac{\partial}{\partial\xi}G_{t,s}(x,\xi)\Bigr|,\ \Bigl|\frac{\partial}{\partial x}G_{t,s}(x,\xi)\Bigr|\Bigr) \le C t^{-1/2} G_{t,s}(x,\xi), \qquad 0 < s, t < T, \tag{5.99}
\]
where the constant $C$ depends only on $m$, $M$, $T$ and the bounds for the derivatives.

In [134, 136], two-sided bounds are obtained for the equation generated by mixed fractional Laplacians of order not exceeding 2, with variable coefficients, in terms of the corresponding heat kernels of the constant coefficient case, as well as the related estimates for the derivatives. Let us give some precise results for the equation
\[
\frac{\partial u}{\partial t} = -a(x)|\nabla|^{\beta(x)}u, \qquad x \in \mathbb{R}^d, \quad t \ge 0. \tag{5.100}
\]


Theorem 5.6.3. Let $\beta(x) \in [\beta_d, \beta_u]$, $a(x) \in [a_d, a_u]$ be $\gamma$-Hölder-continuous functions on $\mathbb{R}^d$ with values in compact subsets of $(0,2)$ and $(0,\infty)$, respectively, and $\gamma \in (0,1]$. Then the Green function $G(t,x,\xi)$, $t > 0$, $x, \xi \in \mathbb{R}^d$, for equation (5.100) has the following upper bound:
\[
G(t,x,\xi) \le K\bigl[G^{\beta_d}_t(x-\xi) + G^{\beta_u}_t(x-\xi)\bigr], \tag{5.101}
\]
for all $t \le T$ with an arbitrary $T$, where $K = K(T)$ is a constant and where $G^{\beta}_t(x-\xi)$ is the Green function of the Cauchy problem for the ΨDO with the symbol $|p|^{\beta}$. Moreover, if the index $\beta(x) = \beta$ does not depend on $x$, then the following two-sided estimate holds as well:
\[
K^{-1}G^{\beta}_t(x-\xi) \le G(t,x,\xi) \le K G^{\beta}_t(x-\xi).
\]
A proof and various extensions can be found in [134, 136, 148].

Abundant literature exists on estimates and asymptotic expansions of the heat kernels for diffusions on manifolds, including the case of degenerate diffusions. For some basic results, see, e.g., [33, 182]. Starting from the classification of Gaussian degenerate diffusions (with the Green function given by the exponent of a quadratic form) in terms of the Young schemes, see the discussion around formula (4.44), and taking into account the extreme importance of Gaussian estimates, it is natural to ask which class of diffusions has a Gaussian form as the main term of the asymptotics for small times. This question was answered in [136], where the full classification of such diffusions is given in accordance with the classification of Gaussian diffusions themselves. For instance, in the case of the Young scheme $(k,n)$, $k \ge n$ (Kolmogorov's Gaussian diffusion), diffusion equations where the heat kernel has a corresponding Gaussian main term turn out to be given by second-order operators of the following type:
\[
\frac{1}{2}\Bigl(G(x)\frac{\partial}{\partial y}, \frac{\partial}{\partial y}\Bigr) + \Bigl(a(x) + \alpha(x)y,\ \frac{\partial}{\partial x}\Bigr)
+ \Bigl(b(x) + \beta(x)y + \frac{1}{2}(\gamma(x)y,y),\ \frac{\partial}{\partial y}\Bigr) - V(x,y), \tag{5.102}
\]
where $x \in \mathbb{R}^n$, $y \in \mathbb{R}^k$, the rank of the matrix $\alpha(x)$ is $n$, $G$ is a square $k \times k$ positive matrix, and $V(x,y)$ is a polynomial in $y$ of order at most 4 that is bounded from below. These operators naturally arise in the analysis of stochastic geodesic flows on manifolds.

For some recent developments in degenerate stable-like equations (with homogeneous symbols of order not exceeding 2), we can refer, e.g., to [114, 167] and references therein.


5.7 The method of frozen coefficients: regularity

In this section, we further investigate the regularity of the Green function $G_{t,s}(x,y)$ as constructed above. To this end, we shall assume for the sake of simplicity that the symbols $\psi_t = \psi$ do not depend on $t$. (The general case is not more demanding from an ideological point of view, but it requires lengthier formulae.) Therefore, the functions $G^{as}_{t,s}$, $F_{t,s}$, $\Phi_{t,s}$ and $G_{t,s}$ become functions of the difference $t-s$ only. From now on, we will denote the functions $G^{as}_{t,0}$, $F_{t,0}$, $\Phi_{t,0}$ and $G_{t,0}(x,y)$ in a shorter form by $G^{as}_t$, $F_t$, $\Phi_t$ and $G_t(x,y)$.

Aiming at the differentiability of $G$ with respect to $t$, we start again with the question concerning the series $\Phi$, see (5.64) and (5.65). A quick look at the examples of $\psi$ considered above suggests that differentiating $F_t(x,y)$ with respect to $t$ should increase the singularity at small $t$ by a factor $1/t$, which motivates the general assumption in the following result.

Lemma 5.7.1. Under the assumptions of Lemma 5.5.1, assume additionally that $F$ depends only on the difference $t-s$ and that $F_t(x,y)$ is continuously differentiable in $t$, where the derivative has the same estimate as $F$ with an additional multiplier of the order $1/t$, i.e.,
\[
\Bigl|t\frac{\partial}{\partial t}F_t(x,y)\Bigr| \le C_F\min\bigl(t^{-\omega_C}\Omega^C_t(x-y),\ t^{-\omega_L}\Omega^L_t(x-y)\bigr), \tag{5.103}
\]
with some constant $C_F$. Then the function $\Phi_t(x,y)$ (which is $\Phi_{t,0}(x,y)$ in the previous notation of the time-dependent case and denotes the sum of the series on the r.h.s. of (5.65) with $s = 0$) is also continuously differentiable in $t$, and this derivative has the same estimates as $\Phi$ with an additional multiplier of the order $1/t$, i.e.,
\[
\Bigl\|t\frac{\partial}{\partial t}\Phi_t(.,y)\Bigr\|_{L_1(\mathbb{R}^d)} \le t^{-\omega_L}C(\Omega_L,\omega_L,T,C_F), \tag{5.104}
\]
\[
\Bigl\|t\frac{\partial}{\partial t}\Phi_t(.,y)\Bigr\|_{C_\infty(\mathbb{R}^d)} \le t^{-\omega_C}\Omega_C\,C(\Omega_L,\omega_L,\omega_C,T,C_F). \tag{5.105}
\]

Proof. Let us first look at the differentiability of the first nontrivial term in the series (5.65):
\[
I_1(t,x,y) = \int_0^t d\tau \int F_{t-\tau}(x,z)F_\tau(z,y)\,dz.
\]
The problem is that direct differentiation, combined with the estimates for $F$, yields a non-integrable singularity $1/(t-\tau)$ inside the integral. This difficulty can be overcome by changing the variable of integration:
\[
I_1(t,x,y) = t\int_0^1 ds \int F_{t(1-s)}(x,z)F_{ts}(z,y)\,dz.
\]
Differentiating now yields
\[
\frac{\partial}{\partial t}I_1(t,x,y) = \int_0^1 ds \int F_{t(1-s)}(x,z)F_{ts}(z,y)\,dz
+ t\int_0^1 ds\,(1-s)\int \frac{\partial F_\tau(x,z)}{\partial\tau}\Big|_{\tau=t(1-s)}F_{ts}(z,y)\,dz
\]
\[
+ t\int_0^1 ds \int sF_{t(1-s)}(x,z)\frac{\partial F_\tau(z,y)}{\partial\tau}\Big|_{\tau=ts}\,dz.
\]

All three terms are perfectly well defined and can be estimated as $I_2$ itself in the proof of Lemma 5.5.1 (using (5.103) for the second and third term). This leads to
\[
\Bigl\|t\frac{\partial}{\partial t}I_1(t,.,y)\Bigr\|_{L_1(\mathbb{R}^d)} \le (1+2C_F)\,\Omega_L^2\,t^{1-2\omega_L}\frac{[\Gamma(1-\omega_L)]^2}{\Gamma(2(1-\omega_L))}.
\]
Similarly, the $n$th term of the series (5.65),
\[
I_n(t,x,y) = \int_{0\le s_1\le\cdots\le s_n\le t} ds_1\cdots ds_n \int F_{t-s_n}(x,z_n)F_{s_n-s_{n-1}}(z_n,z_{n-1})\cdots F_{s_1}(z_1,y)\,dz_1\cdots dz_n,
\]
can be rewritten as
\[
I_n(t,x,y) = t^n\int_{0\le s_1\le\cdots\le s_n\le 1} ds_1\cdots ds_n \int F_{t(1-s_n)}(x,z_n)F_{t(s_n-s_{n-1})}(z_n,z_{n-1})\cdots F_{ts_1}(z_1,y)\,dz_1\cdots dz_n.
\]
Consequently,
\[
\frac{\partial}{\partial t}I_n(t,x,y) = \frac{n}{t}I_n(t,x,y) + t^n\sum_{k=1}^{n}\int\cdots(s_k-s_{k-1})\frac{\partial F_\tau(z_k,z_{k-1})}{\partial\tau}\Big|_{\tau=t(s_k-s_{k-1})}\cdots,
\]
where $\cdots$ denotes all terms that were not subject to differentiation. Again, all terms can be estimated like $I_n(t,x,y)$ itself, which yields
\[
\Bigl\|t\frac{\partial}{\partial t}I_n(t,.,y)\Bigr\|_{L_1(\mathbb{R}^d)} \le n(1+nC_F)\,\Omega_L^{n+1}\,t^{n(1-2\omega_L)-\omega_L}\frac{[\Gamma(1-\omega_L)]^n}{\Gamma(n(1-\omega_L))}.
\]
The corresponding sum is again convergent, thus yielding (5.104). The terms $I_n(t,.,y)$ are estimated as elements of $C_\infty(\mathbb{R}^d)$ exactly as in Lemma 5.5.1, which yields (5.105).
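The effect of the substitution $\tau = ts$ used in this proof can be seen in a scalar toy model. This is an illustration only and not part of the argument: the kernel $F_t(x,y)$ is replaced by the function $t^{-\omega}$ with an assumed exponent $0 < \omega < 1$, and the spatial integration is dropped.

```latex
% Toy model (an assumption for illustration): F_t is replaced by t^{-\omega}, 0 < \omega < 1.
\begin{align*}
I_1(t) &= \int_0^t (t-\tau)^{-\omega}\,\tau^{-\omega}\,d\tau
        = t^{1-2\omega}\int_0^1 (1-s)^{-\omega}s^{-\omega}\,ds
        = t^{1-2\omega}\,\frac{[\Gamma(1-\omega)]^2}{\Gamma(2(1-\omega))}, \\
t\,\frac{d}{dt}I_1(t) &= (1-2\omega)\,I_1(t).
\end{align*}
% Differentiating under the original integral sign instead would produce the factor
% (t-\tau)^{-\omega-1}, which is not integrable at \tau = t.
```

After rescaling, the whole $t$-dependence sits in the prefactor $t^{1-2\omega}$, so $t\,\partial_t I_1$ is bounded by a multiple of $I_1$. The Beta-function constant is the same ratio of Gamma functions that appears in the estimate for $I_1$ above.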


Lemma 5.7.2. Under the assumptions of Lemma 5.5.2, assume additionally that $\psi_t = \psi$ does not depend on $t$, that $\Phi$ satisfies the estimates (5.104), (5.105) from Lemma 5.7.1 and that $t\frac{\partial}{\partial t}G^{ap}_t(x,y)$ satisfies the same estimates as $G^{ap}_t(x,y)$ itself, i.e.,
\[
\Bigl\|t\frac{\partial}{\partial t}G^{ap}_t(.,y)\Bigr\|_{L_1(\mathbb{R}^d)} \le \tilde\Omega_{GL}, \qquad
\Bigl\|t\frac{\partial}{\partial t}G^{ap}_t(.,y)\Bigr\|_{C_\infty(\mathbb{R}^d)} \le t^{-\omega_G}\tilde\Omega_{GC}, \tag{5.106}
\]
with some constants $\tilde\Omega_{GL}, \tilde\Omega_{GC} > 0$. Then the function $G_t(x,y)$ given by (5.64) (with $s = 0$) is continuously differentiable in $t$ and $|t\frac{\partial}{\partial t}G_t(x,y)|$ has again the same estimates as $G_t(x,y)$ itself, i.e., the following estimates hold for the second term in (5.64):
\[
\Bigl\|t\frac{\partial}{\partial t}\int_0^t d\tau\int G^{ap}_{t-\tau}(.,z)\Phi_\tau(z,y)\,dz\Bigr\|_{L_1(\mathbb{R}^d)} \le Ct^{1-\omega_L}, \tag{5.107}
\]
\[
\Bigl\|t\frac{\partial}{\partial t}\int_0^t d\tau\int G^{ap}_{t-\tau}(.,z)\Phi_\tau(z,y)\,dz\Bigr\|_{C_\infty(\mathbb{R}^d)} \le C\max[t^{1-\omega_L-\omega_G},\ t^{1-\omega_C}] \tag{5.108}
\]
with constants $C$ depending on all constants that have entered the assumptions of the Lemma.

Proof. For estimating the derivative of the second term in (5.64), one uses the same trick as for the integral $I_1$ in the proof of Lemma 5.7.1. Namely, one rewrites it as
\[
\int_0^t d\tau\int G^{ap}_{t-\tau}(x,z)\Phi_\tau(z,y)\,dz = t\int_0^1 ds\int G^{ap}_{t(1-s)}(x,z)\Phi_{ts}(z,y)\,dz
\]
so that
\[
\frac{\partial}{\partial t}\int_0^t d\tau\int G^{ap}_{t-\tau}(x,z)\Phi_\tau(z,y)\,dz = \int_0^1 ds\int G^{ap}_{t(1-s)}(x,z)\Phi_{ts}(z,y)\,dz
\]
\[
+ t\int_0^1 ds\,(1-s)\int\frac{\partial G^{ap}_\tau(x,z)}{\partial\tau}\Big|_{\tau=t(1-s)}\Phi_{ts}(z,y)\,dz
+ t\int_0^1 ds\int sG^{ap}_{t(1-s)}(x,z)\frac{\partial\Phi_\tau(z,y)}{\partial\tau}\Big|_{\tau=ts}\,dz.
\]
Finally, one uses the same estimates as for the second term in (5.64) itself.



Everything is now ready for the main result that summarizes the outcome of the method of frozen coefficients in an abstract form.

Theorem 5.7.1. Let $\psi(x,p)$ be a continuous function such that
\[
G^{ap}_t(x,y) = \frac{1}{(2\pi)^d}\int e^{i(p,x-y)}\exp\{-t\psi(y,p)\}\,dp
\]
is a well-defined continuous function with respect to all its variables for $t > 0$, is differentiable in $t$, and such that $G^{ap}_t(.,y), \frac{\partial}{\partial t}G^{ap}_t(.,y) \in L_1(\mathbb{R}^d)\cap C_\infty(\mathbb{R}^d)$ and the estimates (5.75), (5.106) hold, as well as the convergence property (5.79). Moreover, let the function
\[
F_t(x,y) = (\psi_{t,[y]}(-i\nabla) - \psi_t(x,-i\nabla))G^{ap}_t(x,y)
= \frac{1}{(2\pi)^d}\int e^{i(p,x-y)}[\psi_t(y,p) - \psi_t(x,p)]\exp\{-t\psi(y,p)\}\,dp \tag{5.109}
\]
be well defined and continuously differentiable in $t$, and let the estimates (5.69) and (5.103) hold, i.e.,
\[
|F_t(x,y)| \le \min\bigl(t^{-\omega_C}\Omega^C_t(x-y),\ t^{-\omega_L}\Omega^L_t(x-y)\bigr),
\]
\[
\Bigl|t\frac{\partial}{\partial t}F_t(x,y)\Bigr| \le C_F\min\bigl(t^{-\omega_C}\Omega^C_t(x-y),\ t^{-\omega_L}\Omega^L_t(x-y)\bigr).
\]
Then the function $\Phi$ given by (5.65) (with $s = 0$) and the function
\[
G_t(x,y) = G^{ap}_t(x,y) + \int_0^t d\tau\int G^{ap}_{t-\tau}(x,z)\Phi_\tau(z,y)\,dz \tag{5.110}
\]
are well defined continuous functions, which are differentiable in $t$ and have the properties (5.76), (5.78), (5.80), (5.107) and (5.108). Moreover, $G_t$ satisfies equation (5.66), i.e.,
\[
\frac{\partial}{\partial t}G_t(x,y) = -\psi_t(x,-i\nabla)G_t(x,y) \tag{5.111}
\]
(where the action of $\psi$ on $G$ is defined classically via its action on $G^{ap}$ given by (5.58)), with the initial condition (5.80), i.e.,
\[
\Bigl\|\int G_t(.,y)f(y)\,dy - f(.)\Bigr\|_{C_\infty(\mathbb{R}^d)} \to 0, \quad \text{as } (t-s)\to 0,
\]
for all $f \in C_\infty(\mathbb{R}^d)$.

Proof. When putting together the results of all lemmas above, the only thing that is left to check is equation (5.111). From the derivation of equation (5.67), i.e., the equation
\[
-\psi_{t,[y]}(-i\nabla)G^{ap}_t(x,y) + \Phi_t(x,y) - \int_0^t d\tau\int\psi_{t,[z]}(-i\nabla)G^{ap}_{t-\tau}(x,z)\Phi_\tau(z,y)\,dz
\]
\[
= -\psi_t(x,-i\nabla)G^{ap}_t(x,y) - \int_0^t d\tau\int\psi_t(x,-i\nabla)G^{ap}_{t-\tau}(x,z)\Phi_\tau(z,y)\,dz, \tag{5.112}
\]
it follows that it is equivalent to (5.111) and equation (5.68) on $\Phi$,
\[
\Phi_t(x,y) = F_t(x,y) + \int_0^t d\tau\int F_{t-\tau}(x,z)\Phi_\tau(z,y)\,dz, \tag{5.113}
\]


as long as all terms in (5.112) make sense. The point to emphasize here is that although we already know that the function
\[
\int_0^t d\tau\int F_{t-\tau}(x,z)\Phi_\tau(z,y)\,dz = \int_0^t d\tau\int(\psi_{t,[z]}(-i\nabla) - \psi_t(x,-i\nabla))G^{ap}_{t-\tau}(x,z)\Phi_\tau(z,y)\,dz
\]
is well defined (since the integration over $t$ can be performed), this does not directly imply that each of the integrals on the l.h.s. and the r.h.s. of (5.112) makes sense. One still has to show that at least one of them is well defined. For this purpose, we note that
\[
\frac{\partial G_t(x,y)}{\partial t} = \lim_{\delta\to 0}\frac{1}{\delta}(G_{t+\delta}(x,y) - G_t(x,y))
= \frac{\partial G^{ap}_t(x,y)}{\partial t} + \lim_{\delta\to 0}\frac{1}{\delta}\int_t^{t+\delta}d\tau\int G^{ap}_{t+\delta-\tau}(x,z)\Phi_\tau(z,y)\,dz
\]
\[
+ \lim_{\delta\to 0}\int_0^t d\tau\int\frac{1}{\delta}(G^{ap}_{t+\delta-\tau}(x,z) - G^{ap}_{t-\tau}(x,z))\Phi_\tau(z,y)\,dz.
\]
The l.h.s. is defined by Lemma 5.7.2, and the first limit on the r.h.s. is defined and equals $\Phi_t(x,y)$ by (5.80). Consequently, we may conclude that the second integral on the r.h.s. is also well defined. It equals
\[
\int_0^t d\tau\int\frac{\partial G^{ap}_{t-\tau}(x,z)}{\partial t}\Phi_\tau(z,y)\,dz = -\int_0^t d\tau\int\psi_{t,[z]}(-i\nabla)G^{ap}_{t-\tau}(x,z)\Phi_\tau(z,y)\,dz,
\]
as required. This completes the proof.
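For orientation, the algebra behind the equivalence of (5.111) and (5.113) can be written out in one chain. This is a purely formal sketch: it ignores the questions of well-definedness that the proof addresses, and it assumes the identity $\partial_t G^{ap}_t(x,y) = -\psi_{t,[y]}(-i\nabla)G^{ap}_t(x,y)$, which is how the first term on the l.h.s. of (5.112) arises.

```latex
% Formal check: differentiate (5.110) in t, add \psi_t(x,-i\nabla) G_t, and use (5.109), (5.113).
\begin{align*}
\frac{\partial}{\partial t}G_t(x,y) + \psi_t(x,-i\nabla)G_t(x,y)
 &= -\bigl(\psi_{t,[y]}(-i\nabla) - \psi_t(x,-i\nabla)\bigr)G^{ap}_t(x,y) + \Phi_t(x,y) \\
 &\quad - \int_0^t d\tau \int \bigl(\psi_{t,[z]}(-i\nabla) - \psi_t(x,-i\nabla)\bigr)
      G^{ap}_{t-\tau}(x,z)\,\Phi_\tau(z,y)\,dz \\
 &= -F_t(x,y) + \Phi_t(x,y) - \int_0^t d\tau \int F_{t-\tau}(x,z)\,\Phi_\tau(z,y)\,dz = 0 .
\end{align*}
```

The term $\Phi_t(x,y)$ comes from differentiating the upper limit of the time integral in (5.110), and the last equality is exactly the integral equation (5.113).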



5.8 The method of frozen coefficients: the Cauchy problem

In this section, we derive some basic properties of the resolving operator for the Cauchy problem of equations with the symbol $\psi(x,p)$, as they arise as consequences of the properties of the Green function obtained above.

Proposition 5.8.1. Under the assumptions of Theorem 5.7.1, for any $f \in C(\mathbb{R}^d)$, the function
\[
T_t f(x) = \int G_t(x,y)f(y)\,dy \tag{5.114}
\]
is continuously differentiable in $t$ for $t > 0$ and satisfies equation (5.51) classically (also for $t > 0$). Moreover, the initial condition is met in the sense that $T_t f(x) \to f(x)$, as $t \to 0$, for any $x$, and this convergence is uniform in $x$ for $f \in C_\infty(\mathbb{R}^d)$.


Finally, the mappings $T_t$ extend to $\mathcal{M}(\mathbb{R}^d)$, so that for any $\mu \in \mathcal{M}(\mathbb{R}^d)$, $T_t\mu$ is a measure with a continuous density and the curve $t \mapsto T_t\mu$ is weakly continuous in $\mathcal{M}(\mathbb{R}^d)$.

Proof. The only statement that is left to be proved is the weak continuity of the extension of $T_t$ to measures. This follows from the corresponding property of $G^{as}$ (see Proposition 4.4.2) and the estimate
\[
\Bigl\|\int_0^t d\tau\iint G^{ap}_{t-\tau}(.,z)\Phi_\tau(z,y)\,dz\,\mu(dy)\Bigr\|_{L_1(\mathbb{R}^d)} \le Ct^{1-\omega_L}\|\mu\|,
\]
which is a consequence of (5.76).
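One way to see where the power $t^{1-\omega_L}$ in this estimate comes from is sketched below. The sketch assumes a uniform bound $\|G^{ap}_t(\cdot,z)\|_{L_1} \le \Omega_{GL}$ and a bound $\|\Phi_\tau(\cdot,y)\|_{L_1} \le C_\Phi\tau^{-\omega_L}$ with $\omega_L < 1$. The names $\Omega_{GL}$ and $C_\Phi$ are placeholders chosen here by analogy with (5.106) and are not quoted from (5.75) or (5.76).

```latex
% Sketch under the stated assumptions; \Omega_{GL} and C_\Phi are hypothetical constants.
\begin{align*}
\Bigl\| \int_0^t d\tau \iint G^{ap}_{t-\tau}(\cdot,z)\,\Phi_\tau(z,y)\,dz\,\mu(dy) \Bigr\|_{L_1(\mathbb{R}^d)}
 &\le \int_0^t d\tau\, \sup_z \|G^{ap}_{t-\tau}(\cdot,z)\|_{L_1}\,
      \sup_y \|\Phi_\tau(\cdot,y)\|_{L_1}\,\|\mu\| \\
 &\le \Omega_{GL}\,C_\Phi\,\|\mu\| \int_0^t \tau^{-\omega_L}\,d\tau
  = \frac{\Omega_{GL}\,C_\Phi}{1-\omega_L}\; t^{1-\omega_L}\,\|\mu\| .
\end{align*}
```

The first inequality only uses Fubini's theorem and the total variation norm of $\mu$; the singularity $\tau^{-\omega_L}$ is integrable precisely because $\omega_L < 1$.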



By Theorem 5.6.1, we know that the conditions of Lemmas 5.5.1, 5.5.2 are satisfied for our basic examples. It is readily seen that the additional requirements of Lemmas 5.7.1 and 5.7.2 are also met by these examples. Namely, for the case of diffusion it is seen from (5.83), (5.86), (5.88). For the other two cases in (5.82), the estimates (5.106) follow from (4.74) and (4.196). The estimates for the derivatives of $F$ are obtained analogously. This leads to the following result:

Theorem 5.8.1. Under the assumptions of Theorem 5.6.1, the Green function of the type (5.64) constructed by the method of frozen coefficients satisfies the assumptions of Theorem 5.7.1 and consequently its assertions and corollary.

We are interested in the additional regularity of the solutions to the Cauchy problems (5.51). For simplicity, we formulate them only for the class of homogeneous symbols, keeping in mind that the applied principles are rather general.

Theorem 5.8.2. Assume that the assumptions of Theorem 5.6.1 hold for the symbol $\psi = \psi(x,p) = \omega(x,p/|p|)|p|^{\beta}$ (the second case in (5.82)) with $\beta > 1$. Let $k$ denote the maximal integer that is strictly less than $\beta$, and let $\omega(x,s)$ be $(d+1+\beta+k)$-times continuously differentiable in $s$. Then for any integer $l \le k$, the heat kernel $G_t(x,y)$ given by (5.110) is $l$-times continuously differentiable in $x$ and the second term of (5.110) has the following bounds:
\[
\Bigl\|\frac{\partial^l}{\partial x_{i_1}\cdots\partial x_{i_l}}\int_0^t d\tau\int G^{ap}_{t-\tau}(.,z)\Phi_\tau(z,y)\,dz\Bigr\|_{L_1(\mathbb{R}^d)} \le Ct^{(\gamma-l)/\beta}, \tag{5.115}
\]
\[
\Bigl\|\frac{\partial^l}{\partial x_{i_1}\cdots\partial x_{i_l}}\int_0^t d\tau\int G^{ap}_{t-\tau}(x,z)\Phi_\tau(z,.)\,dz\Bigr\|_{L_1(\mathbb{R}^d)} \le Ct^{(\gamma-l)/\beta}, \tag{5.116}
\]
\[
\sup_{x,y}\Bigl|\frac{\partial^l}{\partial x_{i_1}\cdots\partial x_{i_l}}\int_s^t d\tau\int G^{ap}_{t,\tau}(x,z)\Phi_{\tau,s}(z,y)\,dz\Bigr| \le Ct^{(\gamma-d-l)/\beta}. \tag{5.117}
\]
Moreover, the family of operators (5.114) is smoothing in both $C_\infty(\mathbb{R}^d)$ and $\mathcal{M}(\mathbb{R}^d)$ in the sense that $T_t f \in C^k_\infty(\mathbb{R}^d)$ for any $f \in C(\mathbb{R}^d)$ and $(T_t\mu)(x) \in H^k_1(\mathbb{R}^d)$ for any $\mu \in \mathcal{M}(\mathbb{R}^d)$, and the following estimates hold for any $l \le k$ and indices $i_1,\ldots,i_l$:
\[
\Bigl\|\frac{\partial^l}{\partial x_{i_1}\cdots\partial x_{i_l}}(T_t\mu)(.)\Bigr\|_{L_1(\mathbb{R}^d)} \le Ct^{-l/\beta}\|\mu\|_{\mathcal{M}(\mathbb{R}^d)}, \tag{5.118}
\]
\[
\Bigl\|\frac{\partial^l}{\partial x_{i_1}\cdots\partial x_{i_l}}T_t f(.)\Bigr\|_{C(\mathbb{R}^d)} \le Ct^{-l/\beta}\|f\|_{C(\mathbb{R}^d)}. \tag{5.119}
\]

Proof. The function $G_t(x,y)$ is given by (5.110). The required properties of its first term follow from the corresponding results for position-independent equations, namely that the $L_1$-norms of the spatial derivatives of order $m$ of $G^{ap}_t$ are bounded by a constant times $t^{-m/\beta}$, see (4.57). Using these properties of $G^{ap}_t$, the estimates (5.115), (5.116) and (5.117) are proved by the same argument as estimates (5.76), (5.77) and (5.78) are proved in Lemma 5.5.2, i.e., due to the estimate of the $L_1$-norm of $\Phi_t(x,.)$ by a constant times $t^{-\omega_L}$ with $\omega_L = 1 - \gamma/\beta$. The condition $k < \beta$ ensures the integrability of the singularity that arises for small $(t-\tau)$ in the integrals (5.115) and (5.116). The estimates (5.118) and (5.119) are consequences of (5.115) and (5.116).

In Theorem 5.8.2, we obtained the differentiability of the heat kernel of order $k < \beta$ without any essential additional assumptions. This could be expected, since $\psi(x,-i\nabla)$ acts as a kind of generalized derivative of order $\beta$. However, the action of $\psi(x,-i\nabla)$ on the above $G_t$ is defined in terms of its Fourier transform. In order to be able to define its action on $G_t$ (and hence on other solutions to the Cauchy problems (5.51)) directly via a formula like (1.145), (1.143) (at least for a particular choice of $\omega$), we need regularity of an order that is higher than $\beta$. The next result shows how this can be achieved.

Theorem 5.8.3. Let the assumptions of Theorem 5.6.1 again hold for the symbol $\psi(x,p) = -\omega(x,p/|p|)|p|^{\beta}$ with $\beta > 0$, and let $k$ denote the maximal integer that is strictly less than $\beta$.
Let additionally $\omega(x,s)$ be $q$-times continuously differentiable in $x$, and let each of these derivatives be $(d+1+(k+q)(\beta+1))$-times continuously differentiable in $s$ with all bounds uniform in $x$, $s$. Then the family of operators (5.114) preserves the space $C^q_\infty(\mathbb{R}^d)$ and is locally bounded in these spaces (with respect to their standard norm). Moreover, they are smoothing in $C^q(\mathbb{R}^d)$ in the sense that $T_t f \in C^{q+k}_\infty(\mathbb{R}^d)$ for any $f \in C^q(\mathbb{R}^d)$, and the following estimates hold for any $l \le k$:
\[
\|T_t f\|_{C^{l+q}(\mathbb{R}^d)} \le Ct^{-l/\beta}\|f\|_{C^q(\mathbb{R}^d)}. \tag{5.120}
\]

Remark 92. The statement about smoothing becomes void for $\beta \le 1$. In this case, the smoothing effect can still be observed, but requires spaces of fractional order of smoothness (see Remark 66) that we avoid here.

Proof. As in the previous theorem, we show that each term in formula (5.110) for $G_t(x,y)$ defines an integral operator with the required properties. These operators are
\[
T^1_t f(x) = \int G^{ap}_t(x,y)f(y)\,dy, \qquad G^{ap}_t(x,y) = \frac{1}{(2\pi)^d}\int e^{i(p,x-y)}\exp\{-t\psi(y,p)\}\,dp,
\]
\[
T^2_t f(x) = \int_0^t d\tau\iint\frac{\partial}{\partial x}G^{ap}_{t-\tau}(x,z)\Phi_\tau(z,y)f(y)\,dz\,dy.
\]
As far as $T^1_t$ is concerned, the idea is to transfer the first $m$ derivatives with respect to $x$ to the variable $y$, in order to further transfer them to $f$ via integration by parts. Thereby, the main observation is that
\[
\tilde G_t(x,y) = \frac{\partial G^{ap}_t(x,y)}{\partial x} + \frac{\partial G^{ap}_t(x,y)}{\partial y}
= -\frac{t}{(2\pi)^d}\int e^{i(p,x-y)}\frac{\partial\psi(y,p)}{\partial y}\exp\{-t\psi(y,p)\}\,dp, \tag{5.121}
\]
which has the same bounds as $G^{ap}_t(x,y)$ itself, given by (4.73), (4.75), i.e.,
\[
\Bigl|\frac{\partial^l}{\partial x_{i_1}\cdots\partial x_{i_l}}\tilde G_t(x,y)\Bigr| \le C\min\Bigl(t^{-(d+l)/\beta},\ \frac{t}{|x|^{d+\beta+l}}\Bigr)
\]
for $l \le k$, although each term on the l.h.s. of (5.121) is more singular due to the additional factor $t^{-1/\beta}$. Consequently, one has
\[
\frac{\partial}{\partial x}\int G^{ap}_t(x,y)f(y)\,dy = \int\tilde G_t(x,y)f(y)\,dy - \int\frac{\partial}{\partial y}G^{ap}_t(x,y)f(y)\,dy
= \int\tilde G_t(x,y)f(y)\,dy + \int G^{ap}_t(x,y)\frac{\partial}{\partial y}f(y)\,dy.
\]
Therefore, if $f \in C^1(\mathbb{R}^d)$, then both terms are uniformly bounded. Repeating this procedure $m$ times shows that
\[
\|T^1_t f\|_{C^m(\mathbb{R}^d)} \le C\|f\|_{C^m(\mathbb{R}^d)},
\]
i.e., the operators $T^1_t$ act as bounded operators on $C^m(\mathbb{R}^d)$. The extension to (5.120) is obtained as in the proof of Theorem 5.8.2.

When dealing with the operators $T^2_t$, the same procedure transfers the first $m$ derivatives from $G^{ap}$ to $\Phi_\tau$. Therefore, we need the regularity of $\Phi$. Two cases must be distinguished. The simpler case is when $m \le k$. In this case, the differentiation of $F$ in the terms of the series (5.65) that defines $\Phi$, like in the second term $\int F_{t,\tau}(.,z)F_{\tau,s}(z,y)\,dz$, does not create a non-integrable singularity in $\tau$, since each differentiation just 'spoils' the estimates by a factor of the order $(t-\tau)^{-1/\beta}$ (assuming of course the differentiability of $\omega(x,p)$ with respect to $x$), and the estimate of the derivatives of all terms of the series (5.65) is done as in Lemma 5.5.2 for $\Phi$ itself. An additional difficulty arises when dealing with $m > k$. In this


case, the direct differentiation of $F$ in the terms of the series (5.65) does create a non-integrable singularity. However, the situation is similar to the case when we dealt with the derivatives in $t$ in Lemma 5.7.1. Therefore, it can be resolved by exactly the same procedure as there.

Theorems 5.8.2 and 5.8.3 include the case of an even integer $\beta = 2q$, $q \in \mathbb{N}$, and $\psi$ being a homogeneous polynomial of order $2q$ in the variable $p$, which is the case of classical parabolic PDEs. In this case, however, many things become simpler. For instance, since $\omega = m(x,s)$ is a polynomial in $s$, it is automatically infinitely smooth in $s$. And the conditions on the positivity of the real part of $\omega$ are fulfilled if the polynomial $\psi(x,p)$ is (strictly) elliptic, i.e.,
\[
C^{-1}|p|^{2q} \le \psi(x,p) \le C|p|^{2q} \tag{5.122}
\]
for all $x$ and some constant $C > 0$.

As we already mentioned, the theory extends to evolutions that are generated by homogeneous operators supplemented by operators of lower terms, like in (4.204). Namely, let us consider the equation
\[
\dot f_t = -\psi(x,-i\nabla)f_t + L_t f_t \tag{5.123}
\]
with $\psi(x,p) = -\omega(x,p/|p|)|p|^{\beta}$, with a $\beta > 1$, and
\[
L_t f = \sum_{m=0}^{k}\ \sum_{j_1\le\cdots\le j_m\le d} b^t_{j_1\cdots j_m}(x)\frac{\partial^m f(x)}{\partial x_{j_1}\cdots\partial x_{j_m}}
+ \sum_{m=0}^{k}\ \sum_{j_1\le\cdots\le j_m\le d}\int\frac{\partial^m f(y)}{\partial y_{j_1}\cdots\partial y_{j_m}}\,\nu^t_{j_1\cdots j_m}(x,dy), \tag{5.124}
\]
where $k \ge 1$ is any natural number that is strictly less than $\beta$, and $b^t_{j_1\cdots j_m}(x)$ and $\nu^t_{j_1\cdots j_m}(x,dy)$ are families of measurable functions and transition kernels on $\mathbb{R}^d$, respectively. Let $p$ be the smallest integer that is not less than $\beta$. The following result is a direct consequence of Theorems 4.13.5 and 5.8.3 with $D = C^p_\infty(\mathbb{R}^d)$ and $\tilde B = C^k_\infty(\mathbb{R}^d)$.

Theorem 5.8.4. (i) Let $\psi(x,p) = \omega(x,p/|p|)|p|^{\beta}$ with a $\beta > 1$ be a symbol satisfying the assumptions of Theorem 5.8.3 with any $q \ge 0$. Let $b^t_{j_1\cdots j_m}(x)$ be bounded measurable complex-valued functions and $\nu^t_{j_1\cdots j_m}(x,dy)$ uniformly bounded complex stochastic kernels. Then the series (4.166) with $U^{t,s} = T_{t-s}$ and $L_t$ from (5.124) converges in $C_\infty(\mathbb{R}^d)$, and the corresponding functions $\Phi^{t,r}f$ solve the mild form of equation (5.123).

(ii) Let additionally $b^t_{j_1\cdots j_m} \in C([0,T], C^q(\mathbb{R}^d))$ (possibly complex-valued), and let the partial derivatives of the kernels $\nu^t_{j_1\cdots j_m}(x,.)$ with respect to $x$ of the order up to and including $q$ exist as uniformly bounded weakly continuous families of complex transition kernels. Then the family of operators $\Phi^{t,r}$ from (i) acts strongly continuously in $C_\infty(\mathbb{R}^d)$ and $C^q_\infty(\mathbb{R}^d)$, and the corresponding series (4.166) converges in $C^q(\mathbb{R}^d)$. Moreover, the operators $\Phi^{t,r}$ are smoothing, so that
\[
\|\Phi^{t,r}f\|_{C^{l+q}(\mathbb{R}^d)} \le C(t-r)^{-l/\beta}\|f\|_{C^q(\mathbb{R}^d)}. \tag{5.125}
\]
If $q \ge k$, then $\Phi^{t,r}f$ solves equation (4.186) classically.

5.9 Uniqueness via duality and accretivity; generalized solutions

The method of frozen coefficients did not supply the uniqueness of the solutions to the Cauchy problems (5.51), since we can only assert that the constructed Green function is unique among functions represented by (5.64) with a sufficiently regular $\Phi$. Therefore, we cannot derive from Theorem 5.6.1 that the operators $T_t$ form a semigroup. In the sequel, we shall present four approaches to uniqueness that allow for a completion of the analysis of the equations (5.51). In this section, we discuss two approaches that are based on duality, the second one being reminiscent of the method of accretivity.

The result on uniqueness that we begin with is very close to Theorem 4.10.1. The difference is that instead of starting with a propagator and building the dual propagator via duality, we start with the solutions (that do not necessarily form a propagator) for both the forward and the backward problems that have been constructed independently from the duality. We formulate the result in terms of dual pairs of Banach spaces, i.e., pairs of Banach spaces $(B_{obs}, B_{st})$ such that each of these spaces is a closed subspace of the dual of the other space that separates the points of the latter.

Theorem 5.9.1. Suppose that $(B_{obs}, B_{st})$ is a dual pair of Banach spaces and $D_{obs} \subset B_{obs}$, $D_{st} \subset B_{st}$ are their dense subspaces. Let $A_t$ be a family of linear operators $D_{obs} \to B_{obs}$ such that their dual operators $A^*_t$ acting from $B^*_{obs}$ to $D^*_{obs}$ represent the extensions (by weak continuity) of the family of operators, which is also denoted by $A^*_t$ and acts from $D_{st}$ to $B_{st}$. Let $U^{t,r}$, $t \le r$, be a strongly continuous family of bounded linear operators in $B_{obs}$ such that $D_{obs}$ is invariant and the equation
\[
\frac{d}{ds}f_s = -A_s f_s, \qquad s < t, \tag{5.126}
\]
is satisfied by $f_s = U^{s,t}f$ for any $f \in D_{obs}$. Let $V^{r,t}$, $r \ge t$, be a strongly continuous family of bounded linear operators in $B_{st}$ such that $D_{st}$ is invariant and the equation
\[
\frac{d}{ds}\xi_s = A^*_s\xi_s, \qquad s > t, \tag{5.127}
\]
is satisfied by $\xi_s = V^{s,t}\xi$ for any $\xi \in D_{st}$.


Then $V^{s,t}\xi$ is the unique solution to the Cauchy problem of equation (5.127), i.e., if $\xi_t = \xi$ for a given $\xi \in D_{st}$ and $\xi_s$, $s \ge t$, is a continuous curve in $B_{st}$ such that $\xi_s \in D_{st}$ for all $s$ and satisfies (5.127), then $\xi_s = V^{s,t}\xi$ for all $s \ge t$. Analogously, $U^{s,r}f$ is the unique solution to the inverse Cauchy problem (5.126).

Proof. It is almost identical to the proof of Theorem 4.10.1. Namely, for a solution $\xi_s$ to equation (5.127) with the initial condition $\xi_t = \xi$ and any $f \in D_{obs}$, we define the function $g(s) = (U^{s,t}f, \xi_s)$ and show exactly as in the proof of Theorem 4.10.1 that $g'(s) = 0$. Therefore, we find $g(s) = (f,\xi_s) = g(t) = (U^{t,s}f,\xi_t)$, which shows that $\xi_s$ is uniquely defined, because $B_{st}$ separates the points of $B_{obs}$ and $D_{obs}$ is dense in $B_{obs}$. The second statement is proved similarly.

Of course, the uniqueness stated in Theorem 5.9.1 implies that the families $U^{t,s}$ and $V^{s,t}$ form a backward and a forward propagator, respectively.

Exercise 5.9.1. Analyse the proof of Theorem 4.10.1 in order to find out where the assumption of $U^{t,r}$ being a backward propagator was really used. Why is it not possible to just start with any family $U^{t,r}$ resolving the corresponding Cauchy problem, then build a family of dual operators solving the dual problem, and then complete the proof as above?

As an example, let us apply this result to equations of the type (5.123) with slightly more restrictive assumptions.

Theorem 5.9.2. Let us consider the equation
\[
\dot f_t = -\psi(x,-i\nabla)f_t + L_t f_t, \tag{5.128}
\]
with $\psi(x,p)$ being an elliptic polynomial (see (5.122)) of order $2q$, $q \in \mathbb{N}$, and
\[
L_t f = \sum_{m=0}^{k}\ \sum_{j_1\le\cdots\le j_m\le d} b^t_{j_1\cdots j_m}(x)\frac{\partial^m f(x)}{\partial x_{j_1}\cdots\partial x_{j_m}}
+ \sum_{m=0}^{k}\ \sum_{j_1\le\cdots\le j_m\le d}\int\frac{\partial^m f(y)}{\partial y_{j_1}\cdots\partial y_{j_m}}\,\nu^t_{j_1\cdots j_m}(x,y)\,dy, \tag{5.129}
\]
where $k \ge 1$ is any natural number that is strictly less than $2q$, and $b^t_{j_1\cdots j_m}(x)$ and $\nu^t_{j_1\cdots j_m}(x,y)$ are functions of the class $C^{2k}(\mathbb{R}^d)$ and $C^{2k}(\mathbb{R}^{2d}) \cap H^{2k}_1(\mathbb{R}^{2d})$, respectively. Then the solutions constructed in Theorem 5.8.4 are unique and the operators $\Phi^{t,r}$ form a propagator.

Proof. This result follows from Theorem 5.9.1, since the dual operator to $-\psi(x,-i\nabla) + L_t$ is $-\psi(x,-i\nabla) + \tilde L_t$, with the same main term $-\psi(x,-i\nabla)$ and $\tilde L_t$ of the same type as $L_t$. Therefore, the solutions to the Cauchy problem with this operator $-\psi(x,-i\nabla) + \tilde L_t$ or to the time-reversed Cauchy problem with the operator $\psi(x,-i\nabla) - \tilde L_t$ can be constructed by Theorem 5.8.4.

Remark 93. The additional smoothness assumptions on $\nu$ have the specific purpose that the dual operator satisfies the conditions of Theorem 5.8.4(ii). Moreover, we restricted the main term $\psi$ to be a polynomial, because otherwise the dual operator would not be of the same form as $\psi$, but would rather contain hyper-singular integrals of the type (1.154) with nontrivial characteristics. In such a case, dealing with the corresponding Cauchy problem would require additional work.

The duality can be used for proving uniqueness in a slightly different way (reminiscent of using accretivity) that we are going to explain now. This approach allows us to treat some other classes of equations, for instance those with symbols $\psi(x,p) = a_t(x)|p|^{\alpha}$. Instead of presenting the abstract framework, we just show how the method works for this particular example.

Proposition 5.9.1. Let $\beta > 0$ and $a(x)$ be a continuous function on $\mathbb{R}^d$ such that $C^{-1} \le a(x) \le C$ for all $x$ and some $C > 0$. Then for any $m \ge \beta$ and $f \in C^m_\infty(\mathbb{R}^d) \cap H^m_1(\mathbb{R}^d)$, there may exist at most one (real) curve $f_t \in C^m_\infty(\mathbb{R}^d) \cap H^m_1(\mathbb{R}^d)$ that is continuous both in $C_\infty(\mathbb{R}^d)$ and $L_1(\mathbb{R}^d)$ and satisfies the equation $\dot f_t = -a(x)|\nabla|^{\beta}f_t$, $t \ge s$, both in the topology of $C_\infty(\mathbb{R}^d)$ and $L_1(\mathbb{R}^d)$, and the initial condition $f_s = f$.

Proof. It is sufficient to prove the uniqueness for $f_s = 0$. In this case, we have
\[
\frac{d}{dt}\int f_t(x)f_t(x)[a(x)]^{-1}\,dx = -2\int f_t(x)|\nabla|^{\beta}f_t(x)\,dx = -2\int|\nabla|^{\beta/2}f_t(x)\,|\nabla|^{\beta/2}f_t(x)\,dx \le 0,
\]
and hence $\int f_t(x)f_t(x)[a(x)]^{-1}\,dx = 0$ for all $t$. Therefore, we find $f_t = 0$.
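The two steps of this energy argument can be spelled out as follows. The sketch assumes the Fourier transform convention $\hat f(p) = \int e^{-i(p,x)}f(x)\,dx$ (other normalizations only change the constant in front of the $p$-integral) and uses that $|\nabla|^{\beta}$ is the operator with the symbol $|p|^{\beta}$.

```latex
% Sketch of the energy argument; the Fourier normalization is an assumption made here.
\begin{align*}
\frac{d}{dt}\int \frac{f_t(x)^2}{a(x)}\,dx
  &= 2\int \frac{f_t(x)\,\dot f_t(x)}{a(x)}\,dx
   = -2\int f_t(x)\,|\nabla|^{\beta} f_t(x)\,dx, \\
\int f_t(x)\,|\nabla|^{\beta} f_t(x)\,dx
  &= \frac{1}{(2\pi)^d}\int |p|^{\beta}\,|\hat f_t(p)|^2\,dp
   = \int \bigl(|\nabla|^{\beta/2} f_t(x)\bigr)^2\,dx \;\ge\; 0 .
\end{align*}
```

Hence $t \mapsto \int f_t^2\,a^{-1}\,dx$ is non-negative, non-increasing and vanishes at $t = s$, so it vanishes for all $t \ge s$. Since $a^{-1} \ge C^{-1} > 0$, this forces $f_t = 0$.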



Corollary 4. Let $\beta > 0$, $m \ge \beta$, $a(x) \in C^m(\mathbb{R}^d)$ and $C^{-1} \le a(x) \le C$ for all $x$ and some $C > 0$. Then the resolving family of operators $T_t$ of the Cauchy problem for the equation $\dot f_t = -a(x)|\nabla|^{\beta}f_t$ constructed in Theorem 5.8.3 form a semigroup and yield unique solutions both in $C^m_\infty(\mathbb{R}^d)$ and $H^m_1(\mathbb{R}^d)$.

Proof. The uniqueness of the solutions in $C^m_\infty(\mathbb{R}^d) \cap H^m_1(\mathbb{R}^d)$, which follows from Theorem 5.9.1, implies that the $T_t$ form a semigroup, because $T_sT_t$ and $T_{t+s}$ must coincide as a solution to the same problem. Therefore, the uniqueness in each space $C^m_\infty(\mathbb{R}^d)$ or $H^m_1(\mathbb{R}^d)$ is derived from Theorem 4.10.1.

Corollary 5. Theorem 4.14.2 about the equation
\[
\dot f_t = -\sigma|\Delta|^{\alpha/2}f_t + L_t f_t
\]
remains true if $\sigma$ is not a constant, but a function from $C^k(\mathbb{R}^d)$ that is bounded from below by a positive constant.

Proof. It is straightforward.

Exercise 5.9.2. Extend Proposition 5.9.1 to the case of the time-dependent family $a_t(x)$.

Theorem 5.9.1 yields a convenient framework to discuss generalized solutions, as mentioned at the end of Section 4.1. Let us say that a continuous curve $f_s$, $s \le t$, in $B_{obs}$ is a generalized solution by approximation to the Cauchy problem of equation (5.126) with the terminal condition $f_t$, if it satisfies this terminal condition and there exists a sequence of elements $f_t^n \in D$ such that $f_t^n \to f_t$ and the corresponding classical (i.e., belonging to the domain) solutions $U^{s,t} f_t^n$ converge to $f_s$, as $n \to \infty$. Moreover, let us say that a continuous curve $f_s$, $s \le t$, in $B_{obs}$ (continuity can be understood in the strong sense or weakly with respect to the pair) is a generalized solution by duality to the Cauchy problem of equation (5.126) with the terminal condition $f_t$, if for any $\xi \in D_{st}$, the weak form of equation (5.126) holds, i.e.,
$$\frac{d}{ds}(f_s, \xi) = -(f_s, A^{*}\xi), \quad s < t, \ \xi \in D_{st}. \tag{5.130}$$

Generalized solutions to (5.127) are defined in a symmetrical manner. Applying Theorem 4.10.1 and taking into account that $D_{obs}$ is dense in $B_{obs}$, we can derive the following corollary to Theorem 5.9.1.

Proposition 5.9.2. Under the assumptions of Theorem 5.9.1, for any $f \in B_{obs}$ (respectively for any $\xi \in B_{st}$), the curve $U^{s,t} f$ (respectively $V^{s,t}\xi$) represents the unique generalized solution by approximation and the unique generalized solution by duality to the Cauchy problem of equation (5.126) (respectively (5.127)).

Of course, analogues of these notions of generalized solutions can be defined and exploited for more general pairs of spaces $(B_{obs}, B_{st})$ that do not necessarily satisfy the conditions of Theorem 5.9.1, and do not necessarily have to be Banach. For instance, the usual notion of a generalized solution in the theory of generalized functions corresponds to the duality arising from pairs of locally convex spaces $(D, D')$.

5.10 Uniqueness via positivity and approximations; Feller semigroups

In this section, we show that the uniqueness of solutions to the Cauchy problems of equations of at most second order can be established with the help of the property of positivity-preservation. Towards the end of the section, we will develop


an alternative approach that is based on approximation and can also be directly applied to equations of at most second order.

One says that a linear operator $L$ acting from a subspace $D$ of $C(\mathbb{R}^d)$ to some other space of functions on $\mathbb{R}^d$

(i) is conditionally positive, if $Lf(x) \ge 0$ for any $f \in D$ such that $f(x) = 0 = \min_y f(y)$;

(ii) satisfies the positive maximum principle (PMP), if $Lf(x) \le 0$ for any $f \in D$ such that $f(x) = \max_y f(y) \ge 0$.

By passing from $f$ to $-f$, the property (ii) is seen to be equivalent to the requirement that $Lf(x) \ge 0$ for any $f \in D$ such that $f(x) = \min_y f(y) \le 0$. In particular, it implies conditional positivity. By shifting, one sees that conditional positivity implies the PMP whenever $D$ contains constants and $L$ takes non-positive values on them. For example, the operator of multiplication $u(x) \mapsto c(x)u(x)$ with a function $c \in C(\mathbb{R}^d)$ is always conditionally positive, but it satisfies the PMP only in the case of a non-positive $c$.

The main class of examples for conditionally positive operators are special representatives of the operators of at most second order, namely the so-called Lévy–Khintchin-type operators, which are generally given by the following formula:
$$Lf(x) = \frac{1}{2}(A(x)\nabla, \nabla)f(x) + (b(x), \nabla)f(x) + \int \bigl(f(x+y) - f(x) - (\nabla f(x), y)\chi(|y|)\bigr)\,\nu(x, dy) + c(x)f(x), \tag{5.131}$$
where $A$ is a non-negative symmetric matrix-valued function, $\nu$ is a non-negative kernel such that $\int \min(|y|^2, 1)\,\nu(x, dy) < \infty$ (a so-called Lévy kernel) and $\chi$ is a non-negative decreasing function (a mollifier, needed to make the integral well defined), which is normally taken to be either the indicator function $\chi(s) = \mathbf{1}_{s \le 1}$ or $\chi(s) = 1/(1+s^2)$. If $\nu(x, .)$ has a bounded first moment, then $\chi(|y|)$ is not needed at all. It is readily seen that these operators are defined on $C^2(\mathbb{R}^d)$ and conditionally positive. They satisfy the PMP if and only if the function $c(x)$ is non-positive.

Remark 94. It can be shown (Courrège's theorem) that, under mild additional assumptions, conditionally positive operators have to be of the type (5.131).

The following result is representative of a class of results referred to as the maximum principle.

Theorem 5.10.1. Let a subspace $D \subset C(\mathbb{R}^d)$ contain constant functions, and let a family of operators $L_t : D \to C(\mathbb{R}^d)$ satisfying the PMP be given. Let the numbers $s < T$ be given and $u(t,x) \in C([s,T] \times \mathbb{R}^d)$. Assume that $u(s,x)$ is non-negative everywhere, $u(t, .) \in C_\infty(\mathbb{R}^d) \cap D$ for all $t \in [s,T]$, is differentiable in $t$ for $t > s$ and satisfies the evolutionary equation
$$\frac{\partial u}{\partial t} = L_t u, \quad t \in (s, T].$$


Then $u(t,x) \ge 0$ everywhere, and
$$\max\{u(t,x) : t \in [s,T],\ x \in \mathbb{R}^d\} = \max\{u(s,x) : x \in \mathbb{R}^d\}. \tag{5.132}$$

Proof. Suppose $\inf u = -\alpha < 0$. For a $\delta \in (0, \alpha/(T-s))$, consider the function $v = u(t,x) + \delta(t-s)$. Clearly, this function also has a negative infimum. Since $v$ tends to the non-negative constant $\delta(t-s)$ as $x \to \infty$, $v$ has a global negative minimum at some point $(t_0, x_0)$, which lies in $(s,T] \times \mathbb{R}^d$. Therefore, we find $\frac{\partial v}{\partial t}(t_0, x_0) \le 0$ (the equality being true if $t_0 < T$), and by the PMP $L_t v(t_0, x_0) \ge 0$. Consequently,
$$\left(\frac{\partial v}{\partial t} - L_t v\right)(t_0, x_0) \le 0.$$
On the other hand, from the evolution equation (and since $L_t \delta \le 0$ by the PMP), one deduces that
$$\left(\frac{\partial v}{\partial t} - L_t v\right)(t_0, x_0) \ge \left(\frac{\partial u}{\partial t} - L_t u\right)(t_0, x_0) + \delta = \delta,$$
because, by the PMP, $L$ takes positive (respectively negative) constants to non-positive (respectively non-negative) functions. This contradiction completes the proof of the first statement.

Next, assume that $\max\{u(t,x) : t \in [s,T],\ x \in \mathbb{R}^d\} > \max\{u(s,x) : x \in \mathbb{R}^d\}$. Then there exists a $\delta > 0$ such that the function $v = u(t,x) - \delta(t-s)$ also attains its maximum at a point $(t_0, x_0)$ with $t_0 > s$. Therefore, we find $\frac{\partial v}{\partial t}(t_0, x_0) \ge 0$ (the equality being true if $t_0 < T$), and by the PMP $L_t v(t_0, x_0) \le 0$. Consequently,
$$\left(\frac{\partial v}{\partial t} - L_t v\right)(t_0, x_0) \ge 0.$$
But from the evolution equation,
$$\left(\frac{\partial v}{\partial t} - L_t v\right)(t_0, x_0) \le \left(\frac{\partial u}{\partial t} - L_t u\right)(t_0, x_0) - \delta = -\delta.$$
This is again a contradiction.



Remark 95. We assumed that the domain of Lt contains constants. This is not necessary. If Lt satisfies the PMP on a subspace D of C∞ (Rd ), we can extend it to the space generated by D and constants by linearity and by setting Lt to be zero on constant functions. In this case, the result of Theorem 5.10.1 and its proof would remain valid. Alternatively, the extension can be performed by continuity.
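The maximum principle can be observed in a discrete toy model. The following sketch is only an illustration under stated assumptions, not part of the theory above: the operator is the periodic second difference on a grid (a finite-dimensional operator that satisfies the PMP), time is discretized by the explicit Euler scheme, and the names `heat_step` and `evolve` are hypothetical. With `dt / h**2 <= 1/2`, every step is a convex combination of neighbouring values, so non-negativity and the bound by the initial maximum are preserved.

```python
def heat_step(u, dt, h):
    """One explicit Euler step for du/dt = L u, with L the periodic second difference."""
    n = len(u)
    r = dt / (h * h)
    return [
        u[i] + r * (u[(i + 1) % n] - 2.0 * u[i] + u[(i - 1) % n])
        for i in range(n)
    ]


def evolve(u0, dt, h, steps):
    """Apply `steps` Euler steps to the initial profile u0."""
    u = list(u0)
    for _ in range(steps):
        u = heat_step(u, dt, h)
    return u


h = 0.1
dt = 0.4 * h * h  # r = dt / h^2 = 0.4 <= 1/2, so each step is a convex combination
u0 = [max(0.0, 1.0 - abs(i - 10) / 5.0) for i in range(40)]  # non-negative bump
u = evolve(u0, dt, h, 200)
lowest, highest = min(u), max(u)
```

Here `lowest` stays non-negative and `highest` does not exceed `max(u0)`, up to rounding, which mirrors the two statements of Theorem 5.10.1 in this simplified setting.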


A strongly continuous semigroup $T_t$ (or a propagator $U^{t,s}$) in $C_\infty(\mathbb{R}^d)$ is called a Feller semigroup (respectively a Feller propagator) if for any $f$ with values in $[0,1]$, all functions $T_t f$ (respectively $U^{t,s} f$) also have values in $[0,1]$.

Corollary 6. Strongly continuous semigroups in $C_\infty(\mathbb{R}^d)$ generated by operators that satisfy the PMP are Feller semigroups. In particular, this is the case for operators $L$ of the type (5.131) with a non-positive $c(x)$.

Corollary 7. (i) Under the conditions on $D$ and $L_t$ as in the above theorem, assume that $f \in C([s,T] \times \mathbb{R}^d)$, $g \in C_\infty(\mathbb{R}^d)$. Then the Cauchy problem
$$\frac{\partial u}{\partial t} = L_t u + f_t, \quad u(s,x) = g(x), \tag{5.133}$$
can have at most one solution $u \in C([s,T] \times \mathbb{R}^d)$ such that $u(t, .) \in C_\infty(\mathbb{R}^d)$ for all $t \in [s,T]$ and $u(t, .) \in D$ for all $t \in (s,T]$ (and the equation is supposed to hold for $t > s$).

(ii) In particular, this result holds for operators $L_t$ of the Lévy–Khintchin type (5.131), if their coefficients are continuous bounded functions, where one can take $B = C_\infty(\mathbb{R}^d)$ and $D = C^2_\infty(\mathbb{R}^d)$. (The kernels $\nu_t(x, dy)$ should depend continuously on $t$ and $x$ in the weak sense, i.e., $\int f(y)\,\nu(x, dy)$ is a continuous function whenever $f$ is continuous and such that $f(x) \le \min(1, |x|^2)$.)

As a direct consequence of this uniqueness result, we get the following improvement of Theorem 5.8.4.

Theorem 5.10.2. If $\beta \in (1,2]$ in Theorem 5.8.4 and $\psi(x,p) = \int_{S^{d-1}} |(p,s)|^{\beta}\,\mu(ds)$ (so that $\psi(x,-i\nabla)$ is of the Lévy–Khintchin type by (1.145)), then the operators $\Phi^{t,r}$ form a propagator yielding unique solutions to equation (5.123). If additionally the operators $L_t$ are of the Lévy–Khintchin type, the propagators $\Phi^{t,r}$ are Feller.

From the characterization of the continuous operators in $C_\infty(\mathbb{R}^d)$ (see (1.77)) and the property of positivity-preservation, it follows that the Feller semigroups $T_t$ are given by the formulae $T_t f(x) = \int f(y)\,\mu_t(x, dy)$ with some transition kernels $\mu_t$ such that $\|\mu_t(x, .)\| \le 1$. With this formula, one can extend the action of $T_t$ on $C_\infty(\mathbb{R}^d)$ to the space $C(\mathbb{R}^d)$ (although $T_t f$ for $f \in C(\mathbb{R}^d)$ may turn out not to be continuous). A Feller semigroup is called conservative, if its extension to $C(\mathbb{R}^d)$ preserves constants. In terms of $\mu_t$, this condition means that all $\mu_t(x, .)$ are probability measures: $\|\mu_t(x, .)\| = 1$.

The simplest examples of Feller semigroups are generated by the Lévy–Khintchin operators, that is, operators (5.131) with $A$, $b$, $\nu$ constant (not depending on $x$) and vanishing $c$. We shall discuss these examples in some detail in the next section. For now, let us introduce another approach to uniqueness, based on approximations.


Theorem 5.10.3. Let $B$ be a Banach space and $D$ a dense subspace that is itself a Banach space under some norm $\|.\|_D \ge \|.\|_B$. Let $A_t$ be a family of uniformly bounded linear operators $D \to B$. Let there exist a sequence of families of bounded operators $A^n_t$ in $B$ such that

(i) $A^n_t f \to A_t f$ for any $f \in D$,

(ii) the norms $\|A^n_t f\|$ are uniformly bounded on bounded subsets of $D$,

(iii) the propagators $U_n^{t,s} = T\exp\{\int_s^t A^n_\tau\,d\tau\}$ in $B$ (see (2.39) for the notation used) generated by $A^n_t$ are uniformly bounded in $n$ for $t$ from any compact set.

Then for any $f \in D$ there exists at most one continuous curve $f_t$, $t \ge s$, such that $f_s = f$, $f_t \in D$ for all $t$, $\sup_{t \in [s,T]} \|f_t\|_D < \infty$ and $f_t$ satisfies the equation $\dot f_t = A_t f_t$.

Proof. Let us show that any curve $f_t$ satisfying the requirements of the theorem is necessarily the limit of the sequence $f_t^n = U_n^{t,s} f$ solving the Cauchy problem for the equations $\dot f_t^n = A^n_t f_t^n$. This would imply the uniqueness. From the equations for $f_t$ and $f_t^n$, we get the equation
$$\frac{d}{dt}(f_t^n - f_t) = A^n_t f_t^n - A_t f_t = A^n_t (f_t^n - f_t) + (A^n_t - A_t) f_t.$$
Since $f_s^n - f_s = 0$, it follows from Proposition 4.10.2 (and Remark 78) that
$$f_t^n - f_t = \int_s^t U_n^{t,\tau} (A^n_\tau - A_\tau) f_\tau\,d\tau.$$
By the assumption (i), the integrand converges to 0, as $n \to \infty$. By the assumptions (ii) and (iii), one can apply the dominated convergence theorem to conclude that $f_t^n - f_t \to 0$, as claimed.

This theorem is most easily applied to Lévy–Khintchin-type operators, where it provides an alternative proof to statement (ii) of the corollary to Theorem 5.10.1. In fact, by approximating the Lévy kernels $\nu(x, dy)$ by bounded kernels $\mathbf{1}_{|y| > \epsilon}\,\nu(x, dy)$ and the derivatives by the corresponding finite differences, one gets approximations to the Lévy–Khintchin-type operators that also satisfy the PMP and therefore generate propagators that are contractions and hence uniformly bounded. We will apply this method to fractional equations in Chapter 8.

Finally, let us formulate the general result on the well-posedness of Lévy–Khintchin-type operators that can be kept in mind as a standard example for well-posed linear problems. This result extends the properties of diffusions given in Theorem 4.3.1, as well as Theorem 5.10.2. Recall that the Wasserstein–Kantorovich distance of order $p$ between measures on $\mathbb{R}^d$ with a finite $p$th moment is defined as
$$W_p(\nu_1, \nu_2) = \left(\inf_{\nu} \int |y_1 - y_2|^p\,\nu(dy_1\,dy_2)\right)^{1/p}, \tag{5.134}$$


where inf is taken over all $\nu$ on $\mathbb{R}^d \times \mathbb{R}^d$ that couple $\nu_1$ and $\nu_2$, i.e., their projections on the first and second variable coincide with $\nu_1$ and $\nu_2$, respectively.

Remark 96. This distance is usually defined for probability measures. But (5.134) also makes perfect sense for unbounded measures, as long as they have a finite second moment.

Theorem 5.10.4. Let an operator $L$ have the form (5.131) with vanishing $c$, where
$$\left\|\sqrt{A(x_1)} - \sqrt{A(x_2)}\right\| + |b(x_1) - b(x_2)| + W_2\bigl(\mathbf{1}_{B_1}(.)\nu(x_1; .),\ \mathbf{1}_{B_1}(.)\nu(x_2; .)\bigr) \le \kappa|x_1 - x_2| \tag{5.135}$$
with a certain constant $\kappa$, and
$$\sup_x \left(\left\|\sqrt{A(x)}\right\| + |b(x)| + \int_{B_1} |y|^2\,\nu(x, dy)\right) < \infty. \tag{5.136}$$
Let the family of finite measures $\{\mathbf{1}_{\mathbb{R}^d \setminus B_1}(.)\nu(x; .)\}$ be uniformly bounded, tight and depend weakly continuously on $x$. Then $L$ extends to the generator of a conservative Feller semigroup.

As in the case of degenerate diffusions, the proof is based on the application of stochastic differential equations. We shall not give it here, but refer to the original paper [146].

In the remainder of this chapter, we shall look in more detail at some concrete classes of ΨDEs and their corresponding propagators.
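To make the distance (5.134) concrete, here is a minimal sketch for the simplest case only: two probability measures on the real line, each given by the same number of equally weighted atoms, and $p \ge 1$. In that case the standard fact is used that the monotone (sorted) coupling is optimal, so no optimization is needed. The function name `wasserstein_1d` is hypothetical, and the restriction to $d = 1$ and to equal weights is an assumption of this example, not of the theorem above.

```python
def wasserstein_1d(xs, ys, p=2.0):
    """W_p between two empirical measures on R with equally weighted atoms xs and ys.

    Assumes len(xs) == len(ys) and p >= 1, in which case pairing the sorted
    atoms gives an optimal coupling.
    """
    if len(xs) != len(ys) or not xs:
        raise ValueError("both samples must be non-empty and of equal size")
    n = len(xs)
    cost = sum(abs(a - b) ** p for a, b in zip(sorted(xs), sorted(ys))) / n
    return cost ** (1.0 / p)


shifted = wasserstein_1d([0.0, 1.0], [1.0, 2.0], p=2.0)    # pure shift by 1
spread = wasserstein_1d([0.0, 2.0], [1.0, 1.0], p=1.0)     # spread vs. point mass
```

For a pure shift of all atoms by the same amount, the distance equals the size of the shift for every $p$, which is a quick sanity check of the implementation.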

5.11 Lévy–Khintchin generators and convolution semigroups

In this section, we shall deal with the properties of semigroups that are generated by Lévy–Khintchin operators, that is, operators of the type (5.131) with constant coefficients and vanishing $c$:
$$Lf(x) = \frac{1}{2}(A\nabla, \nabla)f(x) + (b, \nabla)f(x) + \int \bigl(f(x+y) - f(x) - (\nabla f(x), y)\mathbf{1}_{|y| \le 1}\bigr)\,\nu(dy), \tag{5.137}$$
where $\nu$ is a Lévy measure, i.e., a measure on $\mathbb{R}^d \setminus \{0\}$ such that
$$\int \min(|y|^2, 1)\,\nu(dy) < \infty. \tag{5.138}$$
In probability theory, the symbol of the ΨDO on the r.h.s. of (5.137),
$$\psi(p) = e^{-ipx} L e^{ipx} = -\frac{1}{2}(Ap, p) + i(b, p) + \int \bigl[e^{i(p,y)} - 1 - i(p,y)\mathbf{1}_{|y| \le 1}\bigr]\,\nu(dy), \tag{5.139}$$


is called the Lévy exponent or Lévy symbol or characteristic exponent of $L$, and the semigroups generated by Lévy–Khintchin operators describe the so-called Lévy processes. Let us emphasize that the condition (5.138) ensures that the Lévy exponent $\psi(p)$ is well defined as a continuous function on $\mathbb{R}^d$ such that $\psi(0) = 0$ and $|\psi(p)| \le C(1 + |p|^2)$ with some constant $C$ and $\operatorname{Re}\psi(p) \le 0$.

Proposition 5.11.1. For any non-negative matrix $A$, a vector $b$ and a Lévy measure $\nu$, the operator $L$ of the type (5.137) generates a Feller semigroup $T_t$ in $C_\infty(\mathbb{R}^d)$ with all spaces $C^k_\infty(\mathbb{R}^d)$ invariant and, for $k \ge 2$, representing cores. Moreover, $T_t$ acts as
$$T_t f(x) = \int f(x - y)\,\tilde\mu_t(dy) = \int f(x + y)\,\mu_t(dy), \tag{5.140}$$
where the probability measure $\tilde\mu_t(dy)$ is the Green function of the Cauchy problem of the operator $L$ given by the formulae
$$\tilde\mu_t = F^{-1}(e^{t\psi(.)}) \iff e^{t\psi(p)} = (F\tilde\mu_t)(p), \tag{5.141}$$
and where $\mu_t(dy) = \tilde\mu_t(-dy)$.

Proof. The real part of the function $-\psi(p)$ is bounded from below. Therefore, the Cauchy problem for the equation $\frac{d}{dt}\hat f_t = \psi(p)\hat f_t$, which is obtained by Fourier transforming the equation $\dot f_t = Lf_t$, has the solution $\hat f_t = \exp\{t\psi(p)\}\hat f_0$. This solution defines a strongly continuous semigroup in all spaces $L_p$, see Theorem 2.4.1. Consequently, for any $f_0 \in C_\infty(\mathbb{R}^d)$ from the Wiener ring $F(L_1(\mathbb{R}^d))$ of the Fourier transforms of functions $\hat f_0 \in L_1(\mathbb{R}^d)$, the solution $T_t f_0 = f_t$ to the Cauchy problem for the equation $\dot f_t = Lf_t$ is a uniquely defined continuous curve in $C_\infty(\mathbb{R}^d)$. Moreover, by Theorem 5.10.1, the mapping $f_0 \to T_t f_0$ is such that for any $f_0$ with values in $[0,1]$ the functions $T_t f_0$ also have values in $[0,1]$. Therefore, the operators $T_t$ are contractions in $C_\infty(\mathbb{R}^d)$ when reduced to the Wiener ring $F(L_1(\mathbb{R}^d))$. Hence $T_t$ extends to a strongly continuous semigroup on the whole $C_\infty(\mathbb{R}^d)$ by a density argument.

Since the ΨDO $L$ has constant coefficients, its semigroup is given by the convolution with the generalized function $G^{\psi}_{t-s}$, see (2.67) and (2.62), referred to as the Green function of the Cauchy problem of $L$. Since the Green function $G^{\psi}_{t-s}$ in our case defines a positive continuous operator on $C_\infty(\mathbb{R}^d)$, it is a measure (by the Riesz–Markov theorem). This yields (5.140) and (5.141).

The invariance of the spaces $C^k_\infty(\mathbb{R}^d)$ is seen from (5.140). Since the Schwartz space is invariant under the Fourier transform and belongs to the domain of the Fourier transformed semigroup $T_t$, it belongs to the domain of $L$. It follows that $C^2_\infty(\mathbb{R}^d)$ belongs to the domain of $L$, because $L$ is closed on its domain (Theorem 4.1.1(vii)) and $L$ is bounded as an operator $C^2_\infty(\mathbb{R}^d) \to C_\infty(\mathbb{R}^d)$.
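The relation between the Lévy exponent and the measures $\mu_t$ can be checked numerically in the simplest pure-jump case. The sketch below is an illustration under assumptions of its own: $A = 0$, $b = 0$, $d = 1$ and $\nu = c\,\delta_a$ with a jump size $a > 1$, so that the compensating term with $\mathbf{1}_{|y| \le 1}$ in (5.137) vanishes and $\psi(p) = c(e^{ipa} - 1)$. In this case $\mu_t$ is the law of $a N_t$ with $N_t$ Poisson of mean $ct$, and $\int e^{ipy}\mu_t(dy) = e^{t\psi(p)}$. The function names are hypothetical.

```python
import cmath
import math


def levy_exponent(p, jumps):
    """psi(p) = sum over atoms y of (e^{ipy} - 1) * nu({y}); jumps is {y: mass}."""
    return sum(w * (cmath.exp(1j * p * y) - 1.0) for y, w in jumps.items())


def poisson_char_function(p, rate, t, jump, terms=80):
    """Integral of e^{ipy} against the law of jump * N_t, N_t Poisson(rate * t)."""
    total = 0j
    weight = math.exp(-rate * t)  # P(N_t = 0)
    for k in range(terms):
        total += weight * cmath.exp(1j * p * jump * k)
        weight *= rate * t / (k + 1)  # P(N_t = k + 1)
    return total


rate, t, jump = 1.5, 2.0, 2.0
nu = {jump: rate}
errors = [
    abs(poisson_char_function(p, rate, t, jump) - cmath.exp(t * levy_exponent(p, nu)))
    for p in (0.3, 1.0, 2.5)
]
```

The values in `errors` are at the level of rounding, and `levy_exponent(0.0, nu)` vanishes, in line with $\psi(0) = 0$ and $\operatorname{Re}\psi \le 0$.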


In terms of $\mu_t$, the semigroup equation $T_t T_s = T_{s+t}$ can be rewritten as the convolution equation $\mu_t \star \mu_s = \mu_{s+t}$, showing that the $\mu_t$ form a convolution semigroup, i.e., a semigroup with respect to the convolution. The strong continuity of $T_t$ translates into the vague continuity of $\mu_t$.

Remark 97. One can show that the vague continuity of a family of probability measures implies its weak continuity, so that $\mu_t$ is a weakly continuous family.

Notice that $T_t$ of (5.140) extends directly to a semigroup of contractions on the space of bounded Borel functions on $\mathbb{R}^d$. For the theory of differential equations, it is important to know in which sense the initial condition and the equation $\dot f_t = Lf_t$ are satisfied by $T_t f$ whenever $f$ does not belong to the domain of $L$. Besides, one is also interested in regularization properties of $T_t$. The following result gives a partial answer (see Proposition 5.11.4 for a follow-up).

Proposition 5.11.2. Let the assumptions of Proposition 5.11.1 hold.

(i) If $f$ is a bounded Borel function, then $T_t f(x) \to f(x)$, as $t \to 0$, at any point $x$ of continuity of $f$.

(ii) The semigroup $T_t$ extends to a strongly continuous semigroup on the space $C_{uc}(\mathbb{R}^d)$ of uniformly continuous functions on $\mathbb{R}^d$.

(iii) If all measures $\mu_t$ have no atoms, then $T_t f$ is continuous whenever $f$ is bounded with at most a countable number of discontinuities.

Proof. (i) Let $x$ be a point of continuity of $f$. For any $\epsilon$, we can choose $\delta$ such that $|x - y| < \delta$ implies $|f(x) - f(y)| < \epsilon$. Due to the strong continuity of $T_t$ on $C_\infty(\mathbb{R}^d)$, we can choose $t$ small, so that
$$\mu_t[\mathbb{R}^d \setminus (-\delta, \delta)^d] < \epsilon, \quad \mu_t[(-\delta, \delta)^d] > 1 - \epsilon.$$
It follows that
$$\left|\int f(x+y)\,\mu_t(dy) - f(x)\right| = \left|\int (f(x+y) - f(x))\,\mu_t(dy)\right| \le 2\|f\|\epsilon + \left|\int_{(-\delta,\delta)^d} (f(x+y) - f(x))\,\mu_t(dy)\right| \le 2\|f\|\epsilon + \epsilon,$$
with all terms being of order $\epsilon$.

(ii) This is the same as in (i). The only modification is that, for $f \in C_{uc}(\mathbb{R}^d)$, $\delta$ can be chosen uniformly in $x$.

(iii) If $\mu_t$ have no atoms, $f$ is bounded with at most a countable number of discontinuities, and $x_n \to x$ as $n \to \infty$, then $f(x_n + y) \to f(x + y)$ almost surely with respect to $\mu_t$. Therefore, the dominated convergence theorem completes the proof.
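The convolution semigroup property can be illustrated in the same Poisson toy case, where each $\mu_t$ is a Poisson law on the non-negative integers with mean proportional to $t$. The following sketch is only an example under that assumption; `poisson_pmf` and `convolve` are hypothetical helper names. Truncating to the first `size` atoms is harmless here, because the mass of a convolution at the point $k$ only involves atoms at points not exceeding $k$.

```python
import math


def poisson_pmf(mean, size):
    """First `size` probabilities of the Poisson law with the given mean."""
    pmf = [math.exp(-mean)]
    for k in range(1, size):
        pmf.append(pmf[-1] * mean / k)
    return pmf


def convolve(a, b):
    """Convolution of two measures on {0, 1, ...}, truncated to len(a) atoms."""
    out = [0.0] * len(a)
    for i, ai in enumerate(a):
        for j, bj in enumerate(b):
            if i + j < len(out):
                out[i + j] += ai * bj
    return out


size, s, t = 40, 0.7, 1.8
lhs = convolve(poisson_pmf(t, size), poisson_pmf(s, size))  # mu_t * mu_s
rhs = poisson_pmf(s + t, size)                              # mu_{s+t}
gap = max(abs(x - y) for x, y in zip(lhs, rhs))
```

The quantity `gap` is of the order of rounding errors, which is the discrete counterpart of the identity for $\mu_t$, $\mu_s$ and $\mu_{s+t}$ stated above.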


Specific features arise in equations with one-sided ν, namely equations of the type



f˙t = Lν ft (x) =

(ft (x + y) − ft (x))ν(dy) =

0



−∞

(ft (x + y) − ft (x))ν(dy),

(5.142)

where $\nu$ is a measure supported on $\{y : y > 0\}$ that satisfies the one-sided Lévy condition
$$\int_0^\infty \min(1, y)\,\nu(dy) < \infty. \tag{5.143}$$

The latter condition ensures that the symbol
$$\psi_\nu(p) = \int (e^{ipy} - 1)\,\nu(dy) \tag{5.144}$$
of the operator $L_\nu$ on the r.h.s. of (5.142) is well defined for all $p$, as a continuous function. Moreover, $\psi_\nu(0) = 0$, $|\psi_\nu(p)| \le C(1 + |p|)$ with some constant $C$ and $\operatorname{Re}\psi_\nu(p) \le 0$.

Proposition 5.11.3. (i) Under the condition (5.143), the equations (5.142) generate the Feller semigroups $T_t$ on $C_\infty(\mathbb{R})$ such that
$$T_t f(x) = \int_0^\infty f(x + y)\,G_{(\nu)}(t, dy) = \int_{-\infty}^0 f(x - y)\,\tilde G_{(\nu)}(t, dy), \tag{5.145}$$
with some probability measures $G_{(\nu)}(t, dy)$ on $\mathbb{R}_+$ and $\tilde G_{(\nu)}(t, dy) = G_{(\nu)}(t, d(-y))$ on $\mathbb{R}_-$, such that the value of $T_t f(x)$ depends only on $f(z)$ with $z \ge x$. The space $C^1_\infty(\mathbb{R})$ is an invariant core for $T_t$.

(ii) The $T_t$ have the following monotonicity properties: If $f$ is non-decreasing, then $T_t f(x)$ is non-decreasing both in $t$ and in $x$, and $T_t f(x) \ge f(x)$ for all $x$ and $t$.

(iii) Comparison principle: Let $\nu_1$, $\nu_2$ be two measures satisfying (5.143) and defining the semigroups $T^1_t$ and $T^2_t$. Let $\nu_1(dy) \ge \nu_2(dy)$. Then $T^1_t f(x) \ge T^2_t f(x)$ for any non-decreasing $f$.

Proof. (i) If $\nu$ is a finite measure, then the assertion follows from (4.97). A general $\nu$ can be approximated by finite $\nu_\epsilon(dy) = \mathbf{1}_{|y| \ge \epsilon}\,\nu(dy)$, in which case the assertion follows from the convergence of the corresponding semigroups, see Proposition 4.2.2. The statement about the core is a consequence of three facts: $C^2_\infty(\mathbb{R})$ is a core by Proposition 5.11.1, $L_\nu$ is bounded as an operator $C^1_\infty(\mathbb{R}) \to C_\infty(\mathbb{R})$, and the generator $L_\nu$ is closed on its domain.


(ii) It is seen from (5.145) (and the fact that the $G_{(\nu)}(t, .)$ are probability measures) that $T_t f(x)$ is non-decreasing in $x$, and $T_t f(x) \ge f(x)$ for non-decreasing $f$. The monotonicity in $t$ follows from equation (5.142).

(iii) Let $\nu_1 - \nu_2 = \nu_3$. Since $\nu_3$ is positive, it also defines a semigroup, say $T^3_t$, of the same kind. Moreover, all $T^j_t$ commute. This can be first checked for approximating bounded $\nu_j$ as in (i), and then by passing to the limit for general $\nu_j$. Moreover, $T^1_t = T^3_t T^2_t$. This is a straightforward consequence of the Lie–Trotter formula (5.17) for commuting generators, or can be directly proved. Therefore, to show that $T^1_t f(x) \ge T^2_t f(x)$ for any non-decreasing $f$ means to show that $T^3_t T^2_t f(x) \ge T^2_t f(x)$, and hence that $T^3_t g(x) \ge g(x)$ for any non-decreasing $g$ (since $T^2_t f$ is non-decreasing for non-decreasing $f$). But this holds due to the third statement in (ii).

Remark 98. In probability theory, the order relation $T^1_t f(x) \ge T^2_t f(x)$ for any non-decreasing $f$ is referred to as stochastic order, and the property that $T_t$ preserves the set of increasing functions is referred to as stochastic monotonicity.

Proposition 5.11.4. Under the assumption (5.143), let $T_t$ be the corresponding semigroup given by (5.145) and let the measures $\mu_t$ have no atoms. Let $f \in C_\infty(\mathbb{R})$ be piecewise differentiable, i.e., there exists a finite number of points $a_1 < \cdots < a_k$ such that $f$ is continuously differentiable outside the set of these points with a uniformly bounded derivative. Then
$$\lim_{t \to 0} \frac{1}{t}(T_t f(x) - f(x)) = L_\nu f(x), \tag{5.146}$$
for all $x \notin \{a_1, \ldots, a_k\}$.

Proof. Let $f_n$ be a sequence of uniformly bounded elements of $C^1_\infty(\mathbb{R})$ such that, for any $y \notin \{a_1, \ldots, a_k\}$, $f_n(y) = f(y)$ for large enough $n$ (depending on $y$). Consequently, $L_\nu f_n(y) \to L_\nu f(y)$ for $y \notin \{a_1, \ldots, a_k\}$, where $L_\nu f(y)$ is defined by the r.h.s. of (5.142) and is continuous for $y \notin \{a_1, \ldots, a_k\}$. In fact, for any $y \in (a_{k-1}, a_k)$ we can write
$$L_\nu f(y) = \int_0^\infty (f(y+z) - f(y))\,\nu(dz) = \int_0^{(a_k - y)/2} (f(y+z) - f(y))\,\nu(dz) + \int_{(a_k - y)/2}^\infty (f(y+z) - f(y))\,\nu(dz).$$
The first term is bounded and approximated by the corresponding expression for $f_n$, because $f'$ is bounded on the segment $[y, y + (a_k - y)/2]$. And the second term is bounded and approximated by the corresponding expression for $f_n$, because $\nu$ is a bounded measure on $\{z > (a_k - y)/2\}$. This argument shows that $L_\nu f(y)$ is uniformly bounded on $\mathbb{R} \setminus \{a_1, \ldots, a_k\}$. Hence, by the dominated convergence theorem and due to the fact that the $\mu_t$ have no atoms, we find
$$(T_s L_\nu f_n)(x) \to (T_s L_\nu f)(x)$$


for all $x$. Again by the dominated convergence theorem, we can pass to the limit $n \to \infty$ in the equation
$$T_t f_n(x) = f_n(x) + \int_0^t (T_s L_\nu f_n)(x)\,ds$$
and obtain
$$T_t f(x) = f(x) + \int_0^t (T_s L_\nu f)(x)\,ds.$$
This implies (5.146), because $(T_s L_\nu f)(x) \to L_\nu f(x)$, as $s \to 0$, by Proposition 5.11.2.

For a concrete $\nu$, the assumptions on $f$ ensuring (5.146) can be further weakened. The important methodological consequence of Proposition 5.11.4 is the possibility to talk about the values of $L_\nu f(x)$ at particular points $x$, even if the function $L_\nu f(x)$ is not globally defined. If (5.146) holds, then we can say that $f$ belongs to the domain of $L$ locally, at a point $x$. This concept is crucial for applications to boundary-value problems of PDEs and ΨDEs, see e.g. Proposition 8.4.2 below.

As for the general Lévy–Khintchin operators, $\tilde G_{(\nu)}(t, .)$ is the Green function of the Cauchy problem for equation (5.142), and
$$\tilde G_{(\nu)}(t, .) = F^{-1}(e^{t\psi_\nu(.)}) \iff e^{t\psi_\nu(p)} = (F\tilde G_{(\nu)}(t, .))(p). \tag{5.147}$$
Similarly, the Cauchy problem for the equations
$$\dot f_t = L'_\nu f_t(x) = \int_0^\infty (f_t(x - y) - f_t(x))\,\nu(dy), \tag{5.148}$$
with a generator that is dual to the operator on the r.h.s. of (5.142), has solutions of the form
$$T'_t f(x) = \int_{-\infty}^0 f(x + y)\,\tilde G_{(\nu)}(t, dy) = \int_0^\infty f(x - y)\,G_{(\nu)}(t, dy), \tag{5.149}$$
the Green function $G_{(\nu)}(t, dy)$ and the following symbol of the generator:
$$\psi_\nu^{-}(p) = \psi_\nu(-p) = \int (e^{-ipy} - 1)\,\nu(dy).$$

Remark 99. For the development of generalized fractional equations (see Chapter 8), it is crucial that the space $C([a, \infty))$ (as a subspace of $C(\mathbb{R})$ of functions being constant to the left of $a$) and its subspace $C_{kill(a)}([a, \infty))$ consisting of functions vanishing to the left of $a$ are both invariant under $T_t$. The restriction of the generator $L_\nu$ to these subspaces of $C(\mathbb{R})$ is a far-reaching extension of the mixed fractional derivative of the Caputo-type and of the RL-type.
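The monotonicity and comparison statements of Proposition 5.11.3 can be tested numerically in a toy case. The sketch below assumes $\nu = c\,\delta_a$ with $c, a > 0$ (a finite one-sided measure), for which the semigroup acts by averaging $f(x + ka)$ over a Poisson law with mean $ct$. It applies the formula to the bounded non-decreasing function `math.atan`, which lies outside $C_\infty(\mathbb{R})$ but within the bounded functions to which the integral formula extends. The name `apply_semigroup` is hypothetical.

```python
import math


def apply_semigroup(f, x, rate, t, jump=1.0, terms=120):
    """T_t f(x) for nu = rate * delta_jump: Poisson(rate * t) average of f(x + k * jump)."""
    total = 0.0
    weight = math.exp(-rate * t)  # probability of k = 0 jumps
    for k in range(terms):
        total += weight * f(x + k * jump)
        weight *= rate * t / (k + 1)
    return total


f = math.atan  # bounded and non-decreasing
points = [-3.0, -1.0, 0.0, 0.5, 2.0]
tol = 1e-12

# T_t f >= f for non-decreasing f
dominates_f = all(apply_semigroup(f, x, 1.0, 1.0) >= f(x) - tol for x in points)
# comparison principle: nu_1 = 2 * delta_1 >= nu_2 = delta_1
comparison = all(
    apply_semigroup(f, x, 2.0, 1.0) >= apply_semigroup(f, x, 1.0, 1.0) - tol
    for x in points
)
# T_t f is non-decreasing in t
monotone_in_t = all(
    apply_semigroup(f, x, 1.0, 1.5) >= apply_semigroup(f, x, 1.0, 0.5) - tol
    for x in points
)
```

All three flags evaluate to `True` in this example, in agreement with parts (ii) and (iii) of Proposition 5.11.3.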


5.12 Potential measures

When working with one-sided equations of the type (5.142), it is often convenient to analyse the properties of the Green functions $G_{(\nu)}(t, .)$ via their Laplace transform. Namely, one introduces the Laplace exponent of the operator $L_\nu$ as
$$\phi_\nu(\lambda) = -\psi_\nu(i\lambda) = \int_0^\infty (1 - e^{-\lambda y})\,\nu(dy), \tag{5.150}$$
so that
$$e^{-t\phi_\nu(\lambda)} = e^{t\psi_\nu(i\lambda)} = \int_{-\infty}^0 e^{\lambda y}\,\tilde G_{(\nu)}(t, dy) = \int_0^\infty e^{-\lambda y}\,G_{(\nu)}(t, dy) = (\mathcal{L}G_{(\nu)}(t, dy))(\lambda), \tag{5.151}$$
where we used the notation $\mathcal{L}$ for the Laplace transform.

An important general concept is the following: an infinitely differentiable function $f$ on $(0, \infty)$ with non-negative values is called completely monotone (respectively a Bernstein function) if $(-1)^n f^{(n)}(\lambda) \ge 0$ for all $n = 0, 1, \ldots$ (respectively if the derivative of $f$ is completely monotone). Bernstein's theorem states (for a proof, see, e.g., [241]) that a function $f : (0, \infty) \to \mathbb{R}$ is completely monotone if and only if it is the Laplace transform of a positive measure:
$$f(\lambda) = (\mathcal{L}(\mu))(\lambda) = \int_0^\infty e^{-\lambda y}\,\mu(dy)$$
with some $\mu$ on $\{y : y \ge 0\}$ that may not be finite, but such that $\mathcal{L}(\mu)$ is a well-defined function on $\{\lambda : \lambda > 0\}$. Therefore, the Laplace exponents $\phi_\nu(\lambda)$ of (5.150) represent the Bernstein functions, and the exponents $e^{-t\phi_\nu(\lambda)}$ are completely monotone for all $t$.

Proposition 5.12.1. (i) For any measure $\nu$ on $\{y : y > 0\}$ satisfying (5.143), there exists the vague limit
$$U^{(\nu)}(M) = \int_0^\infty G_{(\nu)}(t, M)\,dt$$
of the measures $\int_0^K G_{(\nu)}(t, .)\,dt$, $K \to \infty$, such that $U^{(\nu)}(M)$ is finite for any compact $M$. Moreover, the Laplace transform of $U^{(\nu)}$ is well defined for all $\lambda > 0$ and
$$(\mathcal{L}U^{(\nu)})(\lambda) = \int_0^\infty e^{-\lambda y}\,U^{(\nu)}(dy) = 1/\phi_\nu(\lambda). \tag{5.152}$$

(ii) Comparison principle: if $\nu_1(dy) \ge \nu_2(dy)$, then $\int f(y)\,U^{(\nu_1)}(dy) \ge \int f(y)\,U^{(\nu_2)}(dy)$ for any non-decreasing $f$. The opposite inequality holds for non-increasing $f$.


Proof. (i) We have
$$\int_0^\infty \left(\int_0^\infty e^{-\lambda y}\,G_{(\nu)}(t, dy)\right) dt = \int_0^\infty e^{-t\phi_\nu(\lambda)}\,dt = \frac{1}{\phi_\nu(\lambda)}.$$
For any function $u(y)$ with the compact support $[0, z]$, it holds $u(y) \le \|u\| e^{-\lambda y} e^{\lambda z}$. Therefore,
$$\int_0^\infty \left(\int_0^\infty u(y)\,G_{(\nu)}(t, dy)\right) dt \le \|u\| \frac{e^{\lambda z}}{\phi_\nu(\lambda)}.$$
Consequently, $U^{(\nu)}(.) = \int_0^\infty G_{(\nu)}(t, .)\,dt$ is a well-defined finite measure on $[0, z]$ for any finite $z$, and thus $U^{(\nu)}$ exists as a $\sigma$-finite measure on $\mathbb{R}_+$.

(ii) This follows from the comparison principle for the semigroups $T_t$, see Proposition 5.11.3(iii).

For example, if $\nu$ is finite, then it follows from (4.97) that
$$\int f(y)\,U^{(\nu)}(dy) = \frac{1}{\|\nu\|} f(0) + \sum_{k=1}^\infty \frac{1}{\|\nu\|^{k+1}} \int \cdots \int f(y_1 + \cdots + y_k)\,\nu(dy_1) \cdots \nu(dy_k). \tag{5.153}$$
Consequently, applying the comparison principle of Proposition 5.12.1 leads to the following result.

Corollary 8. The potential measure $U^{(\nu)}$ has an atom at zero if and only if $\nu$ is finite, in which case this atom is $\delta_0/\|\nu\|$.

Exercise 5.12.1. Extend this assertion to $\lambda$-potential measures as defined in (5.154) below.

As already mentioned, in the terminology of differential equations, $\tilde G_{(\nu)}(t, .)$ (respectively $G_{(\nu)}(t, .)$) is the Green function of the Cauchy problem for the operator $L_\nu$ (respectively for the operator $L'_\nu$). Then, by Proposition 1.11.4, the measure $U^{(\nu)}(dy)$ on $\{y \ge 0\}$ is the fundamental solution to the operator $-L'_\nu$, and the measure $U^{(\nu)}(-dy)$ on $\{y \le 0\}$ is the fundamental solution to the operator $-L_\nu$. In the terminology of semigroups, the operator $g \to \int g(x + y)\,U^{(\nu)}(dy)$ with the kernel $U^{(\nu)}(dy)$ is the potential operator for the semigroup $T_t$, see Proposition 4.1.4, and the convolution operator $g \to \int g(x - y)\,U^{(\nu)}(dy)$ is the potential operator for the semigroup $T'_t$. However, unlike the statement of Proposition 4.1.4, this operator is not defined as a strong limit and therefore may be unbounded in $C_\infty(\mathbb{R}^d)$.

The measure $U^{(\nu)}(dy)$ is usually referred to as the potential measure of the convolution semigroup $\{G_{(\nu)}(t, .)\}$. Accordingly, the measure
$$U^{(\nu)}_\lambda(A) = \int_0^\infty e^{-\lambda t}\,G_{(\nu)}(t, A)\,dt \tag{5.154}$$
is called the $\lambda$-potential measure of the convolution semigroup $\{G_{(\nu)}(t, .)\}$. Since the $G_{(\nu)}(t, .)$ are probability measures, it is finite with $\|U^{(\nu)}_\lambda\| = 1/\lambda$ and represents the integral kernel of the resolvent


operator $R'_\lambda$ of the semigroup $T'_t$ generated by $L'_\nu$. On the other hand, $R'_\lambda = (\lambda - L'_\nu)^{-1}$ implies that its integral kernel $U^{(\nu)}_\lambda(dy)$ is the fundamental solution to the operator $\lambda - L'_\nu$. Of course, $U^{(\nu)}_0(dy) = U^{(\nu)}(dy)$.

The final result of this section is devoted to the question of uniqueness of the fundamental solution to the operators $L'_\nu$ and their shifts.

Proposition 5.12.2. Let the measure $\nu$ on $\{y : y > 0\}$ satisfy (5.143).

(i) For any $\lambda > 0$, the $\lambda$-potential measure $U^{(\nu)}_\lambda$ represents the unique fundamental solution to the operator $\lambda - L'_\nu$.

(ii) If the support of $\nu$ is not contained in a lattice $\{\alpha n, n \in \mathbb{Z}\}$, with some $\alpha > 0$, then the measure $U^{(\nu)}(dy)$ represents the unique fundamental solution to the operator $-L'_\nu$ up to an additive constant.

(iii) Let $\{\alpha n, n \in \mathbb{Z}\}$ be the minimal lattice (that cannot be further rarefied) containing the support of $\nu$, such that for any $k \in \mathbb{Z}$, $k > 1$, there exists $n \in \mathbb{Z}$ such that $\alpha n$ belongs to the support of $\nu$ and $n/k \notin \mathbb{Z}$. Then any two fundamental solutions to the operator $-L'_\nu$ differ by a linear combination of the type
$$G(x) = \sum_{n \in \mathbb{Z}} a_n \exp\{2\pi n i x/\alpha\} \tag{5.155}$$
with some numbers $a_n$. In particular, $U^{(\nu)}(dy)$ is again the unique fundamental solution vanishing on the negative half-line.

Proof. (i) For any two fundamental solutions $U_1$, $U_2$ of $\lambda - L'_\nu$, the Fourier transform implies that $(\psi_\nu(-p) - \lambda)(FG)(p) = 0$ for $G = U_1 - U_2$. Since $\operatorname{Re}(\psi_\nu(-p) - \lambda) \le -\lambda < 0$ for all $p$, $FG(p) = 0$ and hence $G = 0$.

(ii) For any two fundamental solutions $U_1$, $U_2$ of $-L'_\nu$, the Fourier transform implies $\psi_\nu(-p)(FG)(p) = 0$ for $G = U_1 - U_2$. Since the support of $\nu$ is not contained in a lattice, $\operatorname{Re}\psi_\nu(-p) < 0$ everywhere except at $p = 0$, because $\cos(py) - 1 < 0$ everywhere except when $y = 2\pi n/p$ with some $n \in \mathbb{Z}$. Therefore, $FG$ has a support at zero. Consequently, by Proposition 1.9.1, $FG$ is a finite linear combination of the derivatives $\delta^{(j)}$ of the $\delta$-function. But the derivative of $\psi_\nu(-p)$ at zero does not vanish: it either equals $-i\int y\,\nu(dy)$, if this integral is finite, or is not finite at all otherwise. In both cases, $FG$ cannot have other terms in the sum apart from the $\delta$-function itself. In fact, $\psi_\nu(-p)\sum_j a_j \delta^{(j)}(p) = 0$ would mean that
$$\sum_{j=0}^m a_j [\psi_\nu(-p)\phi(p)]^{(j)}(0) = 0$$
for any $\phi \in D(\mathbb{R}^d)$. This is possible only if all $a_j = 0$ for $j > 0$. Hence $G$ is a constant, as claimed.

(iii) Under the assumption of (iii), we have $\nu(dy) = \sum_{n > 0} b_n \delta_{\alpha n}(y)$ with some non-negative numbers $b_n$ such that for any $k \in \mathbb{Z}$, $k > 1$, there exists $n \in \mathbb{Z}$ such


that $b_n > 0$ and $n/k \notin \mathbb{Z}$. Therefore,
$$\psi_\nu(p) = \sum_{n=1}^\infty b_n (e^{ip\alpha n} - 1).$$
Notice that $\psi_\nu(p) = 0$ if and only if $\cos(p\alpha n) = 1$, or equivalently $p\alpha n = 2\pi l$ with $l \in \mathbb{Z}$, for all $n$ with $b_n > 0$. Therefore, $\psi_\nu(p) = 0$ for $p_m = 2\pi m/\alpha$, $m \in \mathbb{Z}$. Moreover, if $p$ is not of this form, then $\psi_\nu(p) \ne 0$. To see this, let us assume otherwise, i.e., $\psi_\nu(p) = 0$ for some $p \ne p_m$. Then $p = (2\pi/\alpha)(m/k)$ with some rational number $m/k$. Let us choose it in such a way that the fraction $m/k$ is irreducible. Then $k > 1$, since $p \ne p_m$. Let us choose $n \in \mathbb{Z}$ such that $b_n \ne 0$ and $n/k \notin \mathbb{Z}$. Since $p\alpha n = 2\pi l$ with some integer $l$, it follows that $m/k = l/n$. Since $m/k$ is irreducible, $n/k$ is an integer, which leads to a contradiction.

Consequently, if $\psi_\nu(-p)(FG)(p) = 0$, then the support of $FG$ is on the lattice $\{p_m\}$ (called the dual lattice to the lattice $\alpha n$). As in (ii), we find that the derivatives of the $\delta$-function cannot enter the formula for $FG$. Therefore, we have $FG(p) = \sum_{m \in \mathbb{Z}} a_m \delta_{p_m}(p)$, which implies (5.155). The final statement is due to the fact that a linear combination of exponents cannot vanish on the negative half-line.

In Chapter 8, we shall develop this topic further, since Proposition 5.12.2 is the cornerstone for the development of the generalized fractional calculus.
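For a finite measure $\nu$ supported on positive integers, the potential measure can be computed explicitly and compared with the Laplace-transform identity $(\mathcal{L}U^{(\nu)})(\lambda) = 1/\phi_\nu(\lambda)$. The sketch below is an illustration under assumptions of its own. It uses the series $U^{(\nu)} = \sum_{k \ge 0} \nu^{\star k}/\|\nu\|^{k+1}$, rewritten as the renewal-type recursion $U^{(\nu)} = \delta_0/\|\nu\| + (\nu/\|\nu\|) \star U^{(\nu)}$. This recursion is a reformulation introduced here for computation, and the names `potential_masses` and `laplace_exponent` are hypothetical.

```python
import math


def potential_masses(nu, size):
    """Masses u[n] = U({n}) for n < size, with nu a dict {positive integer jump: mass}."""
    total = sum(nu.values())  # total mass of nu
    u = [1.0 / total]         # atom of the potential measure at zero
    for n in range(1, size):
        u.append(sum(w * u[n - j] for j, w in nu.items() if j <= n) / total)
    return u


def laplace_exponent(nu, lam):
    """phi_nu(lam) = sum over atoms j of (1 - e^{-lam * j}) * nu({j})."""
    return sum(w * (1.0 - math.exp(-lam * j)) for j, w in nu.items())


nu = {1: 0.5, 3: 1.5}
lam = 0.7
u = potential_masses(nu, 400)
laplace_of_u = sum(mass * math.exp(-lam * n) for n, mass in enumerate(u))
target = 1.0 / laplace_exponent(nu, lam)
```

In this example `laplace_of_u` agrees with `target` up to rounding, and `u[0]` equals the reciprocal of the total mass of $\nu$, which is the atom at zero described in Corollary 8.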

5.13 Vector-valued convolution semigroups

In this section, we present the Banach-valued extensions of the semigroups from Proposition 5.11.3, and some of their direct applications. Further extensions will be provided in Chapter 8.

Proposition 5.13.1. (i) Let $B$ be a Banach space. Under the condition (5.143), the operators $T_t$ and $T'_t$ given by (5.145) and (5.149) extend to $C(\mathbb{R}, B)$ (by the same formula) and represent strongly continuous semigroups in each of the spaces $C^k_\infty(\mathbb{R}, B)$, $k = 0, 1, \dots$, and in $C_{uc}(\mathbb{R}, B)$.
(ii) If $T^\varepsilon_t$ and $(T^\varepsilon_t)'$ denote the semigroups generated by the finite approximations $\nu_\varepsilon(dy) = \mathbf{1}_{|y|\ge\varepsilon}\,\nu(dy)$ of $\nu$, then $T^\varepsilon_t \to T_t$ and $(T^\varepsilon_t)' \to T'_t$ strongly, as $\varepsilon \to 0$, in each of the spaces $C^k_\infty(\mathbb{R}, B)$.
(iii) The space $C^1_\infty(\mathbb{R}, B)$ is an invariant core for both $T_t$ and $T'_t$ in $C_\infty(\mathbb{R}, B)$.
(iv) The resolvent operators $R'_\lambda$ of the semigroup $T'_t$, given by the formula

$$R'_\lambda f(x) = \int_0^\infty f(x - y)\,U^{(\nu)}_\lambda(dy),$$

also extend to bounded operators in $C_\infty(\mathbb{R}, B)$, so that $R'_\lambda(\lambda - L'_\nu)f = f$ for any $f \in C_\infty(\mathbb{R}, B)$.


Proof. (i) For the sake of definiteness, let us deal with $T_t$. For $f \in C_\infty(\mathbb{R}, B)$, the integral in (5.145) can be defined as the limit of Riemann sums, which converge (in the norm of $B$) due to the uniform continuity of $f$ on $\mathbb{R}$. The uniform continuity of $f$ also implies that $T_t f$ is continuous. Since $f$ tends to zero at infinity, the same holds for $T_t f$. The boundedness of the $T_t$ in $C_\infty(\mathbb{R}, B)$ follows from the boundedness of their restrictions to $C_\infty(\mathbb{R})$. The proof of the strong continuity in $C_{uc}(\mathbb{R}, B)$ relies on the same arguments as in Proposition 5.11.2(i). Finally, the same argument shows that $T_t$ acts strongly continuously in each of the spaces $C^k_\infty(\mathbb{R}, B)$, because differentiation commutes with all the operators $T_t$.

(ii) For any $\varepsilon$, the semigroups $T^\varepsilon_t$ are generated by bounded operators. Consequently, they can be defined by a convergent exponential series. Since $T^\varepsilon_t \to T_t$ as operators in $C_\infty(\mathbb{R})$ (by Proposition 4.2.2), it follows that the corresponding measures $G^{(\nu_\varepsilon)}(t, .)$ converge weakly to $G^{(\nu)}(t, .)$ and hence $T^\varepsilon_t f \to T_t f$ weakly in $B$ for any $f \in B$. In order to see that this convergence is strong, we can estimate the difference $T^{\varepsilon_1}_t - T^{\varepsilon_2}_t$, with some $\varepsilon_1 > \varepsilon_2$, by the same method as used in formula (4.8):

$$T^{\varepsilon_1}_t - T^{\varepsilon_2}_t = \int_0^t T^{\varepsilon_2}_{t-s}\,(L_{\nu_{\varepsilon_1}} - L_{\nu_{\varepsilon_2}})\,T^{\varepsilon_1}_s\,ds.$$

Therefore, if $f \in C^1(\mathbb{R}, B)$,

$$\|(T^{\varepsilon_1}_t - T^{\varepsilon_2}_t)f\|_{C(\mathbb{R},B)} \le \int_0^t ds\,\Big\|\int_{\varepsilon_2}^{\varepsilon_1} \bigl(T^{\varepsilon_1}_s f(. + y) - T^{\varepsilon_1}_s f(.)\bigr)\,\nu(dy)\Big\|_{C(\mathbb{R},B)} \le \int_0^t ds \int_{\varepsilon_2}^{\varepsilon_1} y\,\|T^{\varepsilon_1}_s f\|_{C^1(\mathbb{R},B)}\,\nu(dy) \le \int_0^t ds \int_{\varepsilon_2}^{\varepsilon_1} y\,\|f\|_{C^1(\mathbb{R},B)}\,\nu(dy), \tag{5.156}$$

which tends to zero as $\varepsilon_1 \to 0$. Therefore, the family $T^\varepsilon_t f$ is Cauchy in $C_\infty(\mathbb{R}, B)$, and hence the weak convergence implies the strong convergence.

(iii) For each $\varepsilon$ and any $f \in C_\infty(\mathbb{R}, B)$, we have

$$T^\varepsilon_t f(x) - f(x) = \int_0^t L_{\nu_\varepsilon} T^\varepsilon_s f(x)\,ds.$$

For $f \in C^1_\infty(\mathbb{R}, B)$ we can pass to the limit $\varepsilon \to 0$, which yields

$$T_t f(x) - f(x) = \int_0^t L_\nu T_s f(x)\,ds.$$

Therefore, $C^1_\infty(\mathbb{R}, B)$ is an invariant core for $T_t$ in $C_\infty(\mathbb{R}, B)$.

(iv) This follows from (i), (iii) and the general Theorem 4.1.1. □
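The remark in part (ii) of the proof that semigroups with bounded generators are given by a convergent exponential series can be made explicit. The following is a minimal sketch, assuming the convention $L_\nu f(x) = \int (f(x+y) - f(x))\,\nu(dy)$ and writing $\lambda_\varepsilon$ and $K_\varepsilon$ for auxiliary objects that are introduced here only for illustration:

```latex
\begin{align*}
\lambda_\varepsilon &= \nu_\varepsilon(\mathbb{R}) < \infty, \qquad
K_\varepsilon f(x) = \int f(x+y)\,\nu_\varepsilon(dy), \qquad
L_{\nu_\varepsilon} = K_\varepsilon - \lambda_\varepsilon, \\
T^\varepsilon_t &= e^{tL_{\nu_\varepsilon}}
  = e^{-\lambda_\varepsilon t}\sum_{k=0}^{\infty}\frac{t^k}{k!}\,K_\varepsilon^k, \\
\|T^\varepsilon_t\| &\le e^{-\lambda_\varepsilon t}\sum_{k=0}^{\infty}\frac{(\lambda_\varepsilon t)^k}{k!} = 1,
\qquad \text{since } \|K_\varepsilon\| \le \lambda_\varepsilon .
\end{align*}
```

The series converges in operator norm on $C_\infty(\mathbb{R}, B)$ because $K_\varepsilon$ is bounded, and the resulting bound shows that each $T^\varepsilon_t$ is a contraction, uniformly in $\varepsilon$.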




Exercise 5.13.1. Check that Propositions 5.11.4 and 5.11.2 extend to the present Banach-space-valued setting. Moreover, if the Banach space $B$ is a Banach lattice with respect to some partial order relation, then the monotonicity properties of Proposition 5.11.3(ii) and (iii) also hold in this setting.

Theorem 5.13.1. (i) Let $A$ be an operator in $B$ that generates a strongly continuous semigroup $e^{tA}$ in $B$ with an invariant core $D$. Let $D$ itself be a Banach space under some norm $\|.\|_D \ge \|.\|_B$ (for the canonical choice of such a norm see (4.10)) such that $e^{tA}$ is also strongly continuous in $D$ and $A$ is a bounded operator $D \to B$. Then $e^{tA}$ extends to the strongly continuous semigroup in $C_\infty(\mathbb{R}, B)$ with the invariant core $C_\infty(\mathbb{R}, D)$.
(ii) The semigroup $e^{tA}$ and the semigroups $T'_t$ generated by $L'_\nu$ (as constructed in Proposition 5.13.1) are commuting semigroups in $C_\infty(\mathbb{R}, B)$ with $C^1_\infty(\mathbb{R}, D)$ being their common invariant core. Moreover, the operator $L'_\nu + A$ generates a strongly continuous semigroup $T'_t e^{tA}$ in the following spaces: (a) $C_\infty(\mathbb{R}, B)$ with the invariant core $C^1_\infty(\mathbb{R}, D)$; (b) $C_{uc}(\mathbb{R}, B)$ with the invariant core $C^1_{uc}(\mathbb{R}, D)$ of functions from $C^1(\mathbb{R}, D)$ which are uniformly continuous together with their derivatives; (c) $C_{uc}((-\infty, b], B)$ for any $b$, with the invariant core $C^1_{uc}((-\infty, b], D)$.

Proof. (i) The operators $e^{tA}$ act pointwise in $C_\infty(\mathbb{R}, B)$: $(e^{tA}f)(x) = e^{tA}(f(x))$. These operators form a bounded semigroup in $C_\infty(\mathbb{R}, B)$, because the $e^{tA}$ form a bounded semigroup in $B$. For the strong continuity, we note that the pointwise convergence, $e^{tA}(f(x)) \to f(x)$, as $t \to 0$ for any $x$, follows from the strong continuity of $e^{tA}$ in $B$. The uniform convergence in $x$ is straightforward on compact sets and extends to all $x$ due to $f \in C_\infty(\mathbb{R}, B)$. By applying the same result to $D$, we can conclude that the operators $e^{tA}$ represent a strongly continuous semigroup in $C_\infty(\mathbb{R}, D)$ as well. Finally, for any $f \in C_\infty(\mathbb{R}, D)$, we have

$$\frac{1}{t}\bigl(e^{tA}f(x) - f(x)\bigr) = \frac{1}{t}\int_0^t A e^{sA} f(x)\,ds \to Af(x), \quad \text{as } t \to 0,$$

uniformly in $x$.

(ii) The commutativity of $e^{tA}$ and $T'_t$ can best be proved by starting from their approximations with a bounded generator (say, the Yosida approximation for $A$ and $(T^\varepsilon_t)'$ for $T'_t$) and then passing to the limit in the commutation relation. From the commutativity of $T'_t$ and $e^{tA}$, it follows that the operators $T'_t e^{tA}$ form a strongly continuous semigroup in $C_\infty(\mathbb{R}, B)$. Since both $T'_t$ and $e^{tA}$ have the core $C^1_\infty(\mathbb{R}, D)$, it follows that $C^1_\infty(\mathbb{R}, D)$ is also a core for $T'_t e^{tA}$. Similarly, one deals with other spaces. □

As a direct application, let us discuss some very simple fractional PDEs. Namely, let us consider the Cauchy problem

$$\frac{\partial f_t}{\partial t} = L'_\nu f_t + A f_t, \qquad f_t|_{t=0}(x, y) = f_0(x, y), \tag{5.157}$$


where $x \in \mathbb{R}$ and $y \in \mathbb{R}^d$, $L'_\nu$ acts on the variable $x$ and is given by (5.148), and $A$ is a linear operator acting on the variable $y$ that generates a strongly continuous semigroup $e^{tA}$ in $C_\infty(\mathbb{R}^d)$, with some invariant core $D \subset C_\infty(\mathbb{R}^d)$. For instance, $-L'_\nu$ can be taken as $\frac{d^\beta}{dx^\beta}$ with $\beta \in (0, 1)$ or as a linear combination of these derivatives with positive coefficients, and $A$ can be the generator of an arbitrary Feller semigroup, or an elliptic PDO, or a mixed fractional Laplacian like $A = a(y)|\Delta|^\alpha$. As a consequence of Theorem 5.13.1, we get the following result.

Proposition 5.13.2. For any $f_0 \in C_\infty(\mathbb{R}, D)$, there exists a unique solution $f_t \in C_\infty(\mathbb{R}, D)$ to the problem (5.157). It is given by the formula

$$f_t(x, .) = \int_0^\infty e^{tA} f_0(x - z, .)\,G^{(\nu)}(t, dz). \tag{5.158}$$
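A formal check that formula (5.158) solves (5.157) can be sketched as follows. It assumes that $f_0$ is regular enough for all derivatives to exist (say, $f_0 \in C^1_\infty(\mathbb{R}, D)$), and it reads (5.158) as the product of the two commuting semigroups, with the primes on $T'_t$ and $L'_\nu$ taken as a reading of the notation rather than a statement from the text:

```latex
\begin{align*}
f_t &= T'_t\,e^{tA} f_0, \\
\frac{\partial f_t}{\partial t}
  &= L'_\nu\,T'_t\,e^{tA} f_0 + T'_t\,A\,e^{tA} f_0 \\
  &= L'_\nu f_t + A\,T'_t\,e^{tA} f_0
   = (L'_\nu + A) f_t, \\
f_t\big|_{t=0} &= T'_0\,e^{0\cdot A} f_0 = f_0 .
\end{align*}
```

The only nontrivial step is moving $A$ past $T'_t$ in the second line, which is exactly the commutativity of the two semigroups used in the theorem.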

5.14 Equations of order at most one

In this section, we extend equations with the r.h.s. of the type $L_\nu$ or $L'_\nu$, as discussed above, to variable coefficients. Namely, we deal with semigroups generated by integro-differential (or pseudo-differential) operators of order at most one, i.e., by operators

$$Lf(x) = (b(x), \nabla f(x)) + \int_{\mathbb{R}^d\setminus\{0\}} (f(x + y) - f(x))\,\nu(x, dy) \tag{5.159}$$

with Lévy kernels $\nu(x, .)$ that have a finite local first moment $\int_{B_1} |y|\,\nu(x, dy)$. (Note that $B_a$ denotes the ball of radius $a$ in $\mathbb{R}^d$ centered at the origin.) The arguments will be similar to the arguments used in Proposition 5.13.1(ii), although we will restrict our attention to real-valued evolutions only.

Theorem 5.14.1. Assume that $b \in C^1(\mathbb{R}^d)$ and that $\nabla\nu(x, dy)$, the gradient of the Lévy kernel with respect to $x$, exists in the weak sense as a signed measure and depends weakly continuously on $x$, in the sense that $\int f(y)\,\nabla\nu(x, dy)$ is a continuous function for any $f \in C(\mathbb{R}^d)$ with a support separated from zero. Moreover, assume that

$$\sup_x \int \min(1, |y|)\,\nu(x, dy) < \infty, \qquad \sup_x \int \min(1, |y|)\,|\nabla\nu(x, dy)| < \infty, \tag{5.160}$$

and that for any $\varepsilon > 0$ there exists a $K > 0$ such that

$$\sup_x \int_{\mathbb{R}^d\setminus B_K} \nu(x, dy) < \varepsilon, \qquad \sup_x \int_{\mathbb{R}^d\setminus B_K} |\nabla\nu(x, dy)| < \varepsilon, \tag{5.161}$$

$$\sup_x \int_{B_{1/K}} |y|\,\nu(x, dy) < \varepsilon. \tag{5.162}$$


Then $L$ generates a conservative Feller semigroup $T_t$ in $C_\infty(\mathbb{R}^d)$ with the invariant core $C^1_\infty(\mathbb{R}^d)$. Moreover, $T_t$ reduced to $C^1_\infty(\mathbb{R}^d)$ is also a strongly continuous semigroup in the Banach space $C^1_\infty(\mathbb{R}^d)$, where it is regular in the sense that

$$\|T_t\|_{C^1_\infty(\mathbb{R}^d)} \le e^{Kt} \tag{5.163}$$

with a constant $K$.

Proof. Notice first that (5.160) implies, for any $\varepsilon > 0$,

$$\sup_x \int_{\mathbb{R}^d\setminus B_\varepsilon} \nu(x, dy) < \infty, \qquad \sup_x \int_{\mathbb{R}^d\setminus B_\varepsilon} |\nabla\nu(x, dy)| < \infty. \tag{5.164}$$

Next, since the operator

$$f \mapsto \int_{\mathbb{R}^d\setminus B_1} (f(x + y) - f(x))\,\nu(x, dy) \tag{5.165}$$

is bounded in the Banach spaces $C(\mathbb{R}^d)$ and $C^1(\mathbb{R}^d)$ (by (5.160)) and also in the Banach spaces $C_\infty(\mathbb{R}^d)$ and $C^1_\infty(\mathbb{R}^d)$ (by (5.161)), the standard perturbation argument (see Theorem 4.6.1 and Remark 70) makes it possible to reduce the situation to the case when all $\nu(x, dy)$ have support in $B_1$, which we shall assume from now on. Let us introduce the approximation

$$L_h f(x) = (b(x), \nabla f(x)) + \int_{\mathbb{R}^d\setminus B_h} (f(x + y) - f(x))\,\nu(x, dy). \tag{5.166}$$

For any $h > 0$, this operator generates a conservative Feller semigroup $T^h_t$ in $C_\infty(\mathbb{R}^d)$ with the invariant core $C^1_\infty(\mathbb{R}^d)$, because the first term in (5.166) does so and the second term is a bounded operator in the Banach spaces $C_\infty(\mathbb{R}^d)$ and $C^1_\infty(\mathbb{R}^d)$ (by (5.164)). Therefore, perturbation theory (Theorem 4.6.1) applies. The conservativity also follows from the perturbation series representation, and the contraction property follows, e.g., from Theorem 5.10.1.

Formally differentiating the equation $\dot f(x) = L_h f(x)$ with respect to $x$ (that is, assuming that all derivatives are well defined) yields the equation

$$\frac{d}{dt}\nabla_k f(x) = L_h \nabla_k f(x) + (\nabla_k b(x), \nabla f(x)) + \int_{B_1\setminus B_h} (f(x + y) - f(x))\,\nabla_k\nu(x, dy). \tag{5.167}$$

Considering this an evolution equation for $g = \nabla f$ in the Banach space $C_\infty(\mathbb{R}^d \times \{1, \dots, d\}) = C_\infty(\mathbb{R}^d) \times \cdots \times C_\infty(\mathbb{R}^d)$, we observe that the r.h.s. is represented as the sum of a diagonal operator that generates a Feller semigroup and of two bounded (uniformly in $h$ by (5.160)) operators of $g$ (by expanding $f(x + y) - f(x)$ into a Taylor series). Therefore, these evolutions are well posed and generate semigroups that are uniformly bounded in $h$.


Let us now show that if $f_0 \in C^1_\infty(\mathbb{R}^d)$, then $f_t \in C^1_\infty(\mathbb{R}^d)$ and its derivative $g = \nabla f$ is actually given by the semigroup generated by (5.167). To this end, we first approximate $b$ and $\nu$ by a sequence of twice continuously differentiable objects $b_n, \nu_n$, $n \to \infty$, and prove this claim for the corresponding operators $L^n_h$. For the corresponding approximating evolutions,

$$\frac{d}{dt}\nabla_k f^n(x) = L^n_h \nabla_k f^n(x) + (\nabla_k b_n(x), \nabla f^n(x)) + \int_{B_1\setminus B_h} (f^n(x + y) - f^n(x))\,\nabla_k\nu_n(x, dy), \tag{5.168}$$

for $g^n = \nabla f^n$, the space $C^1_\infty(\mathbb{R}^d)$ is an invariant core. Therefore, if $f_0 \in C^2_\infty(\mathbb{R}^d)$, then $f^n_t \in C^2_\infty(\mathbb{R}^d)$, so we can legitimately differentiate the evolution of $f^n_t$ and conclude that it does satisfy the equation (5.168). By the uniqueness Theorem 4.10.1, we can further conclude that the evolution is given by the semigroup generated by the operator on the r.h.s. of (5.168). Next, if $f_0 \in C^1_\infty(\mathbb{R}^d)$, then we can come to the same conclusion by approximating it by functions from $C^2_\infty(\mathbb{R}^d)$. Finally, the semigroup generated by the operator on the r.h.s. of (5.168) is given by a perturbation series, where all terms apart from $L^n_h$ are considered perturbations. Letting $n \to \infty$, we observe that all terms of this series converge to the corresponding series without the label 'n'. Therefore, the derivatives of the approximations $\nabla f^n_t$ converge, as $n \to \infty$, to the function given by the semigroup generated by (5.167). Hence the approximations $\nabla f^n_t$ converge to $\nabla f_t$ and the function $g_t = \nabla f_t$ is well defined and can be obtained by applying the operators of the semigroup generated by (5.167) to $g_0 = \nabla f_0$.

Consequently, we may conclude that the $\nabla_k T^h_t f$ are uniformly bounded for all $h \in (0, 1]$ and $t$ from any compact interval whenever $f \in C^1_\infty(\mathbb{R}^d)$. Therefore, by the same method as used in formula (4.8), we can write

$$(T^{h_1}_t - T^{h_2}_t)f = \int_0^t T^{h_2}_{t-s}\,(L_{h_1} - L_{h_2})\,T^{h_1}_s f\,ds$$

for arbitrary $h_1 > h_2$ and then estimate

$$|(L_{h_1} - L_{h_2})T^{h_1}_s f(x)| \le \int_{B_{h_1}\setminus B_{h_2}} |(T^{h_1}_s f)(x + y) - (T^{h_1}_s f)(x)|\,\nu(x, dy) \le \int_{B_{h_1}} \|\nabla T^{h_1}_s f\|\,|y|\,\nu(x, dy) = o(1)\,\|f\|_{C^1_\infty}, \quad \text{as } h_1 \to 0,$$

by (5.162), which yields

$$\|(T^{h_1}_t - T^{h_2}_t)f\| = o(1)\,t\,\|f\|_{C^1_\infty}, \quad \text{as } h_1 \to 0. \tag{5.169}$$

Therefore, the family $T^h_t f$ converges to a family $T_t f$, as $h \to 0$. Clearly, the limiting family $T_t$ specifies a strongly continuous semigroup in $C_\infty(\mathbb{R}^d)$. Writing

$$\frac{T_t f - f}{t} = \frac{(T_t - T^h_t)f}{t} + \frac{T^h_t f - f}{t}$$

and noting that the first term is of the order $o(1)\|f\|_{C^1_\infty}$ due to (5.169), as $h \to 0$, we can conclude that $C^1_\infty(\mathbb{R}^d)$ belongs to the domain of the generator of the semigroup $T_t$ in $C_\infty(\mathbb{R}^d)$ and that it is given there by (5.159).

In order to show that $C^1_\infty(\mathbb{R}^d)$ is an invariant core, let us now apply to $T_t$ the procedure applied above to $T^h_t$. Differentiating, first formally, the evolution equation with respect to $x$, we obtain for $g = \nabla f$ the equation

$$\frac{d}{dt}g_k(x) = Lg_k(x) + (\nabla_k b(x), \nabla f(x)) + \int_{B_1\setminus B_h} (f(x + y) - f(x))\,\nabla_k\nu(x, dy). \tag{5.170}$$

In order to show that the semigroup generated by this equation actually yields the derivatives of $f_t$ for $f \in C^1_\infty(\mathbb{R}^d)$, we again approximate $b$ and $\nu$ by a sequence of twice-continuously differentiable objects $b_n, \nu_n$, $n \to \infty$. For these $b_n, \nu_n$, the estimates (5.169) can be obtained in $C^1_\infty(\mathbb{R}^d)$ in the same way as they were obtained above in $C_\infty(\mathbb{R}^d)$. This implies the claim for the approximating evolutions with $b_n, \nu_n$. And again, the semigroups generated by the r.h.s. of (5.170) with $b_n, \nu_n$ converge to the semigroup generated by the r.h.s. of (5.170) without the label 'n', because all terms of the perturbation series converge.

The perturbation series representation also implies that $T_t$ is a strongly continuous semigroup in $C^1_\infty(\mathbb{R}^d)$, and that the estimate (5.163) holds. □
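To see what the assumptions (5.160) to (5.162) of Theorem 5.14.1 require in practice, the following sketch checks them for one illustrative kernel in dimension $d = 1$. The kernel, the exponent $\beta \in (0, 1)$ and the coefficient $a$ are assumptions chosen here for the example: $a \ge 0$ is taken bounded and continuously differentiable with bounded derivative.

```latex
\begin{align*}
\nu(x, dy) &= a(x)\,\mathbf{1}_{(0,1)}(y)\,y^{-1-\beta}\,dy, \qquad
\nabla\nu(x, dy) = a'(x)\,\mathbf{1}_{(0,1)}(y)\,y^{-1-\beta}\,dy, \\
\int \min(1, |y|)\,\nu(x, dy) &= a(x)\int_0^1 y^{-\beta}\,dy = \frac{a(x)}{1-\beta}, \qquad
\int \min(1, |y|)\,|\nabla\nu(x, dy)| = \frac{|a'(x)|}{1-\beta}, \\
\int_{\mathbb{R}\setminus B_K} \nu(x, dy) &= 0 = \int_{\mathbb{R}\setminus B_K} |\nabla\nu(x, dy)|
  \quad \text{for } K \ge 1, \\
\int_{B_{1/K}} |y|\,\nu(x, dy) &= a(x)\int_0^{1/K} y^{-\beta}\,dy
  = \frac{a(x)\,K^{-(1-\beta)}}{1-\beta} \;\longrightarrow\; 0 \quad (K \to \infty).
\end{align*}
```

All bounds are uniform in $x$ because $a$ and $a'$ are bounded, and the weak continuity of $\nabla\nu(x, dy)$ in $x$ reduces to the continuity of $a'$. The operator $L$ in (5.159) is then a drift plus a truncated one-sided fractional-type term with variable intensity $a(x)$.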

5.15 Smoothness and smoothing of propagators

In the abstract form, the regularization property of the semigroup $S_t$ in a Banach space $B$ generated by an operator $A$ with the domain $D(A)$ means that $S_t\mu \in D(A)$ for any $t > 0$ and any $\mu \in B$ (not necessarily from $D(A)$) and that an estimate of the type $\|AS_t\mu\| \le c(t)\|\mu\|$ holds with some $c(t)$ having a singularity at zero. The abstract theory is well developed for Hilbert spaces $B$. For instance, if $A$ is positive and self-adjoint, then its semigroup is known to be regularizing with $c(t)$ of the order $t^{-1}$, see, e.g., [244]. In the non-Hilbert setting of spaces of measures and continuous functions (where we mostly work), the regularization property is usually derived from the existence and properties of the Green function. In Theorem 5.8.3, we derived both the regularization property (5.120) of Cauchy problems that arise from ΨDOs with homogeneous symbols and the preservation-of-smoothness property via certain manipulations with the Green function.

Now we are going to show that the initial regularization property can be deepened, i.e., semigroups that regularize continuous functions will also regularize smoother functions. This deepening is the consequence of some mild assumption on the core and of the possibility to perform a regular approximation.

For a pseudo-differential operator $A$ in $\mathbb{R}^d$ with the symbol $A(x, p)$, which we denote by the same letter with some abuse of notation, let us denote by $\frac{\partial A}{\partial x_j}$ the operator with the symbol $\frac{\partial A}{\partial x_j}(x, p)$, and more generally by $\frac{\partial^k A}{\partial x_{i_1}\cdots\partial x_{i_k}}$ the operators with the symbol $\frac{\partial^k A(x, p)}{\partial x_{i_1}\cdots\partial x_{i_k}}$. The main examples are as follows: (i) $A$ is a differential operator

$$Af(x) = A_{i_1\cdots i_k}(x)\,\frac{\partial^k f(x)}{\partial x_{i_1}\cdots\partial x_{i_k}},$$

in which case the derivatives of $A$ are reduced to the derivatives of its coefficients, say

$$\frac{\partial A}{\partial x_j}f(x) = \frac{\partial A_{i_1\cdots i_k}(x)}{\partial x_j}\,\frac{\partial^k f(x)}{\partial x_{i_1}\cdots\partial x_{i_k}};$$

(ii) $A$ is an integral operator of the Lévy–Khintchin type (5.131), in which case the differentiation of its integral part reduces to the differentiation of the transition kernel $\nu$ with respect to the first variable:

$$\frac{\partial A}{\partial x_j}f(x) = \frac{1}{2}\left(\frac{\partial A(x)}{\partial x_j}\nabla, \nabla\right)f(x) + \left(\frac{\partial b(x)}{\partial x_j}, \nabla\right)f(x) + \int (f(x + y) - f(x) - (\nabla f(x), y)\chi(y))\,\frac{\partial\nu(x, dy)}{\partial x_j}, \tag{5.171}$$

with similar formulae for other derivatives.

Working in the framework of Banach spaces $C^k(\mathbb{R}^d)$ suggests using operators of at most $k$th order (see Proposition 1.7.2). Whenever we choose such operators as the generators of semigroups in $C_\infty(\mathbb{R}^d)$, we shall tacitly assume that their domains contain the space $C^k_\infty(\mathbb{R}^d)$. By far the most important class of operators are operators of at most second order. In particular, such operators arise in the analysis of Markov processes. Although our methods work for operators of arbitrary order, we shall stick to operators of second order for the sake of clarity and simplicity.

We shall use the fact that the equations of all practical examples are taken from a class where a natural approximation by operators of the same class is available, which are either bounded or have smoother coefficients. Abstractly speaking, the Yosida approximations can serve as $A(n)$. However, since we mostly apply the theory to differential or pseudo-differential operators, these approximations can be constructed explicitly by (i) using finite differences instead of differential operators (or, more generally, linear combinations of exponentials for approximating the symbol of a pseudo-differential operator), in order to get a bounded approximation, or (ii) approximating the coefficients of a differential operator (more generally, the symbol of a pseudo-differential operator) by smoother ones, in order to get more regular operators of the same class.
Therefore, the existence and even the explicit form of the approximation $A(n)$ is usually seen directly in concrete examples.

Theorem 5.15.1. Let $A$ be a pseudo-differential operator of at most second order which generates a semigroup $e^{tA}$ in $C_\infty(\mathbb{R}^d)$ having $C^2_\infty(\mathbb{R}^d)$ as its core and enjoys the following smoothing property: $e^{tA}f \in C^1_\infty(\mathbb{R}^d)$ for any $t > 0$, $f \in C_\infty(\mathbb{R}^d)$, and

$$\|e^{tA}f\|_{C^1(\mathbb{R}^d)} \le \kappa t^{-\omega}\|f\|_{C(\mathbb{R}^d)}, \tag{5.172}$$

with constants $\kappa > 0$, $\omega \in (0, 1)$, and $t \in (0, 1]$. Let the operator $\frac{\partial A}{\partial x_j}$ be well defined as an operator of at most second order. Moreover, assume that the operator $A$ can be approximated by a sequence of operators $A^n$, so that

$$\|A^n - A\|_{L(C^2_\infty(\mathbb{R}^d), C(\mathbb{R}^d))} \to 0, \qquad \left\|\frac{\partial A}{\partial x} - \frac{\partial A^n}{\partial x}\right\|_{L(C^2_\infty(\mathbb{R}^d), C(\mathbb{R}^d))} \to 0, \tag{5.173}$$

as $n \to \infty$, and such that the $A^n$ have the same properties as $A$, but additionally leave the spaces $C^1_\infty(\mathbb{R}^d)$, $C^2_\infty(\mathbb{R}^d)$ and $C^3_\infty(\mathbb{R}^d)$ invariant under $\exp\{tA^n\}$, and such that the semigroups $\exp\{tA^n\}$ are strongly continuous in $C^1_\infty(\mathbb{R}^d)$.

Then $e^{tA}$ takes $C^1(\mathbb{R}^d)$ to $C^2_\infty(\mathbb{R}^d)$,

$$\|e^{tA}f\|_{C^2(\mathbb{R}^d)} \le \tilde\kappa t^{-\omega}\|f\|_{C^1(\mathbb{R}^d)} \tag{5.174}$$

with some other constant $\tilde\kappa$, and the $e^{tA}$ are strongly continuous in $C^1_\infty(\mathbb{R}^d)$. Similarly, if $e^{At}$ acts strongly continuously on the spaces $C^l_\infty(\mathbb{R}^d)$ with $l = 1, \dots, k$, then

$$\|e^{tA}f\|_{C^k(\mathbb{R}^d)} \le \tilde\kappa t^{-\omega}\|f\|_{C^{k-1}(\mathbb{R}^d)}. \tag{5.175}$$

Proof. Let us only deal with the first statement, that is with $k = 2$. Assume first that the spaces $C^j_\infty(\mathbb{R}^d)$, $j = 1, 2, 3$, are invariant under all operators $e^{tA}$, and that the $e^{tA}$ are strongly continuous in $C^1_\infty(\mathbb{R}^d)$. Then, if $f_0 \in C^3_\infty(\mathbb{R}^d)$, we can differentiate the equation $\dot f = Af$ and obtain the equation

$$\dot g = A_{diag}\,g + \frac{\partial A}{\partial x}f \tag{5.176}$$

for the derivative $g = \frac{\partial f}{\partial x} = \left(\frac{\partial f}{\partial x_1}, \dots, \frac{\partial f}{\partial x_d}\right)$, where $A_{diag}$ is the diagonal operator with the element $A$ on the main diagonal. Due to Proposition 1.7.2, we can write

$$\frac{\partial A}{\partial x}f = A_1 g + B_1 f, \tag{5.177}$$

where $A_1$ is an operator of at most first order and $B_1$ is a bounded operator in $C_\infty(\mathbb{R}^d)$. Therefore, the first step in the analysis of equation (5.176) consists of analysing the equation

$$\dot g = A_{diag}\,g + A_1 g. \tag{5.178}$$

Since $C^2_\infty(\mathbb{R}^d)$ is an invariant core for $e^{tA}$, applying Theorem 4.6.4 with $B = (C_\infty(\mathbb{R}^d))^d$, $\tilde B = (C^1_\infty(\mathbb{R}^d))^d$ and $L = A_1$ leads to the conclusion that the semigroup $\Phi_t$ yielding mild solutions to equation (5.178) is strongly continuous both in $B$ and $\tilde B$, takes $B$ to $\tilde B$ with the estimate

$$\|\Phi_t\|_{B\to\tilde B} \le \tilde\kappa t^{-\omega} \tag{5.179}$$

and is generated by the operator $A_{diag} + A_1$ on the invariant core $(C^2_\infty(\mathbb{R}^d))^d$.
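The decomposition (5.177) can be illustrated in the simplest setting. The sketch below assumes $d = 1$ and a second-order differential operator with coefficients $a, b, c$ that are bounded and continuously differentiable with bounded derivatives; these coefficients are hypothetical and serve only as an example.

```latex
\begin{align*}
Af &= a(x)\,f'' + b(x)\,f' + c(x)\,f, \qquad g = f', \\
\dot g &= \frac{\partial}{\partial x}(Af)
   = a\,g'' + b\,g' + c\,g + a'\,f'' + b'\,f' + c'\,f
   = Ag + \frac{\partial A}{\partial x}f, \\
\frac{\partial A}{\partial x}f &= a'\,g' + b'\,g + c'\,f = A_1 g + B_1 f,
\qquad A_1 = a'(x)\frac{d}{dx} + b'(x), \quad B_1 = c'(x).
\end{align*}
```

Here $A_{diag} = A$ because there is only one coordinate, $A_1$ is of first order, and $B_1$ is the bounded multiplication operator by $c'$, which matches the structure used in (5.176) to (5.178).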


Next, by Proposition 4.10.2, equation (5.176) written as $\dot g = A_{diag}\,g + A_1 g + B_1 f$ has a unique solution $g_t$ for any $g_0 \in (C^2_\infty(\mathbb{R}^d))^d$ and a curve $f_. \in C([0, T], (C_\infty(\mathbb{R}^d))^d)$, and it is given by the formula

$$g_t = \exp\{t(A_{diag} + A_1)\}g_0 + \int_0^t \exp\{(t - s)(A_{diag} + A_1)\}B_1 f_s\,ds. \tag{5.180}$$

Moreover, for any $f_. \in C([0, T], (C_\infty(\mathbb{R}^d))^d)$ and $g_0 \in (C_\infty(\mathbb{R}^d))^d$ (respectively $g_0 \in (C^1_\infty(\mathbb{R}^d))^d$), the function $t \mapsto g_t$ is continuous in the topology of $(C_\infty(\mathbb{R}^d))^d$ (respectively $(C^1_\infty(\mathbb{R}^d))^d$) and

$$\|g_t\|_{(C^1_\infty(\mathbb{R}^d))^d} \le \tilde\kappa t^{-\omega}\|g_0\|_{(C_\infty(\mathbb{R}^d))^d} + \frac{\tilde\kappa\,t^{1-\omega}}{1 - \omega}\,\|B_1\|\sup_{s\in[0,t]}\|f_s\|_{C_\infty(\mathbb{R}^d)}. \tag{5.181}$$

Since $g = \frac{\partial f}{\partial x}$ solves equation (5.176) with the initial condition $g_0 = \frac{\partial f_0}{\partial x}$ and the unique solution to this problem is given by (5.180), we can conclude that this formula yields the derivative $g = \frac{\partial f}{\partial x}$. The estimate (5.181) translates to the required estimate (5.174).

Due to the continuous dependence of $g_t$ on $g_0$ and the assumed boundedness of the semigroup $e^{tA}$ in $C^1_\infty(\mathbb{R}^d)$, formula (5.180) for the derivative $g = \frac{\partial f}{\partial x}$ remains valid even for $f_0 \in C^1_\infty(\mathbb{R}^d)$, and due to the smoothing property of $e^{At}$ even for $f \in C_\infty(\mathbb{R}^d)$ and $t > 0$.

If $A$ is approximated by the sequence $A^n$ with the invariant spaces $C^2_\infty(\mathbb{R}^d)$ and $C^3_\infty(\mathbb{R}^d)$, then the $g^n$ given by (5.180) with all operators labeled by $n$ accordingly yield the derivatives $g^n = \frac{\partial f^n}{\partial x}$. Because of the first estimate in (5.173), Proposition 4.2.2 and the assumption that $C^2_\infty(\mathbb{R}^d)$ is a core for $e^{tA}$, we can conclude that the $f^n_t = e^{tA^n}f$ converge to $f_t = e^{tA}f$ in $C_\infty(\mathbb{R}^d)$ for any $f \in C_\infty(\mathbb{R}^d)$. By the second estimate in (5.173) and the perturbation argument, we can conclude that $\exp\{t(A^n_{diag} + A^n_1)\}g_0$ converges to $\exp\{t(A_{diag} + A_1)\}g_0$ for any $g_0 \in (C_\infty(\mathbb{R}^d))^d$. Consequently, the $g^n_t = \frac{\partial f^n_t}{\partial x}$ given by (5.180) with all objects labeled with 'n' also converge for any $f_0 \in C_\infty(\mathbb{R}^d)$, as $n \to \infty$. The limiting function is given by (5.180) and is equal to the derivative $g_t = \frac{\partial f_t}{\partial x}$, which implies all the required estimates. □

Corollary 9. Under the assumptions of Theorem 5.15.1, its statement generalizes to the assertion that $e^{At}$ for $t > 0$ takes the space $C_{bLip}(\mathbb{R}^d)$ to $C^2(\mathbb{R}^d)$, with the same estimate:

$$\|e^{tA}f\|_{C^2(\mathbb{R}^d)} \le \tilde\kappa t^{-\omega}\|f\|_{C_{bLip}(\mathbb{R}^d)}. \tag{5.182}$$

Proof. This follows from the observation that elements of $C_{bLip}(\mathbb{R}^d)$ can be uniformly approximated by elements of $C^1(\mathbb{R}^d)$ and that the norms of $C_{bLip}(\mathbb{R}^d)$ and $C^1(\mathbb{R}^d)$ coincide in $C^1(\mathbb{R}^d)$. □
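The smoothing assumption (5.172) can be illustrated by the classical heat semigroup. The sketch below assumes $d = 1$, $A = d^2/dx^2$, and the $C^1$-norm taken as $\|f\|_{C^1} = \|f\|_C + \|f'\|_C$; these choices are made here for illustration only and yield the exponent $\omega = 1/2$.

```latex
\begin{align*}
e^{tA}f(x) &= \int G_t(x - y)\,f(y)\,dy, \qquad
G_t(y) = \frac{1}{\sqrt{4\pi t}}\,e^{-y^2/(4t)}, \\
\Bigl\|\frac{\partial}{\partial x}e^{tA}f\Bigr\|_{C}
  &\le \|f\|_C \int \Bigl|\frac{\partial G_t}{\partial y}(y)\Bigr|\,dy
   = 2\,G_t(0)\,\|f\|_C = \frac{1}{\sqrt{\pi t}}\,\|f\|_C, \\
\|e^{tA}f\|_{C^1} &\le \Bigl(1 + \frac{1}{\sqrt{\pi t}}\Bigr)\|f\|_C
  \le \Bigl(1 + \frac{1}{\sqrt{\pi}}\Bigr)\,t^{-1/2}\,\|f\|_C, \qquad t \in (0, 1].
\end{align*}
```

The integral of $|\partial G_t/\partial y|$ equals $2G_t(0)$ because $G_t$ is even and decreasing on the positive half-line. In this example (5.172) holds with $\kappa = 1 + \pi^{-1/2}$ and $\omega = 1/2$.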


5.16 Summary and comments

In this chapter, we continued our analysis of linear systems and extended the theory into various directions. The method with the T-product is another classical tool for dealing with non-homogeneous, as well as nonlinear, equations. One of the first books to systematically present this method in the context of quantum mechanics (nonlinear Schrödinger equation and related questions) was [201]. We developed the T-product approximation in Section 5.1. Our exposition is close to [100], although we rely on the method of Banach towers for justifying the convergence instead of assuming the reflexivity of the involved Banach spaces.

In Section 5.2, we provided a result that belongs to a class of formulae which is generally referred to as the Lie–Trotter–Daletski–Chernoff formula. Following here the exposition of [148], we highlighted once again the convenient use of the method of Banach towers in conjunction with the method of regular approximation as a tool for proving convergence of Lie–Trotter approximations. The final results on mixing, Theorems 5.3.2 and 5.3.3, are possibly new. An alternative to finite Banach towers that turns out to be a handy tool in many cases is the scales of Banach spaces that depend on a continuous parameter, the so-called Ovsyannikov's method, for which we refer to [81] and references therein.

The second part of the chapter was devoted to the method of frozen coefficients, which permits the construction of propagators for equations with variable coefficients whenever the corresponding Cauchy problem for the equation with constant coefficients is well understood. The literature on this method is quite extensive, and we are not attempting to review it. Since we mainly apply the method to the case of homogeneous symbols, let us mention the paper [128], followed by [213], where this method was initially applied to rather general homogeneous symbols. The monograph [72] gives a very detailed presentation of the method of frozen coefficients for homogeneous symbols, using the theory of hyper-singular integrals and the Fourier expansion of coefficients in the series with respect to spherical harmonics. For some recent achievements, let us mention [127] and [173], which improve the method of frozen coefficients by using a modified initial approximation for the Green function (by adding some correcting term). Serious attention is given to equations with an index of homogeneity α ∈ (0, 2), since these equations naturally appear in the analysis of Markov processes, namely the so-called stable and stable-like processes. Two-sided estimates for the Green functions are provided in [134] and [136].

In our exposition, an abstract version of the method of frozen coefficients has been developed, which allows for a rather concise and unified presentation for various special cases, not only of the convergence of the main perturbation series, but also of the regularization properties of the obtained resolving operators. The detailed exposition given here in such generality seems to be new.


Unfortunately, we did not touch at all on the issues related to the evolution equations in domains with a boundary, see, e.g., [39, 51, 121] and references therein for various directions in this topic.

Next, we turned to the general methods for proving the uniqueness of solutions to Cauchy problems. We presented the methods based on duality, accretivity, positivity and approximations in an abstract form. As examples, we analysed various classes of Lévy–Khintchin-type operators that generate Feller semigroups. Considering the related convolution semigroups as lifted to Banach spaces, one can obtain various useful extensions. As examples, we analysed simple fractional PDEs. Next, following [147], we studied the class of Cauchy problems generated by operators of order at most one, which can be considered the most general representation of mixed fractional derivatives. Proposition 5.12.2, which is the methodological basis for our future treatment of generalized fractional calculus, is new.

In the final section, the link between smoothness and smoothing was clarified in a rather general setting. Smoothing properties of the operator semigroups are crucial for their application to nonlinear equations, as will be made clear in Chapter 6.

Chapter 6

The Method of Propagators for Nonlinear Equations

This chapter applies the method of linear propagators to nonlinear equations with an unbounded r.h.s. We provide the well-posedness of nonlinear evolutions and their sensitivity with respect to initial data and parameters. It is shown how these methods work for equations with memory and with anticipating path dependence. The general theory is illustrated on various concrete examples. In particular, we analyse nonlinear heat conduction equations, nonlinear Schrödinger equations and complex diffusions, HJB equations and McKean–Vlasov equations, Cahn–Hilliard equations, and the related forward-backward systems. The chapter 'celebrates' the strength of the abstract Theorems 2.1.1 to 2.1.3, showing how various nontrivial equations can be dealt with more or less directly by means of these results, with estimates for the growth of solutions via the Mittag-Leffler functions.

We shall look at evolutionary equations where the r.h.s. is given by pseudo-differential operators with symbols $A_t(x, p)$ that depend on an unknown function (equations in function spaces) or on an unknown measure (equations on measures), in particular by differential operators whose coefficients depend on the unknown function or measure. This dependence can be a) pointwise, i.e., $A_t(x, p)$ depends on the value $u_t(x)$ of the unknown function $u(x)$ (or of the density of the unknown measure), b) local, i.e., $A_t(x, p)$ depends on the spatial derivatives of $u_t(x)$, or c) integral. In the last case, one can distinguish (i) the spatial integral dependence, i.e., $A_t(x, p)$ depends on some integrals over $u_t(x)$ taken at time $t$, (ii) the adaptive or causal dependence, i.e., $A_t(x, p)$ depends on the values of $u_s$ at times $s \le t$, and (iii) general path dependence.

All of these cases require different regularity assumptions on the symbols or coefficients for the analysis. Namely, for pointwise or local dependence of $A_t(x, p)$ on some values of $u$, the smoothness is quantitatively given in terms of the partial derivatives with respect to these values. For the integral dependence of $A_t(x, p)$ on $u$, the smoothness is most naturally represented by variational derivatives.

© Springer Nature Switzerland AG 2019 V. Kolokoltsov, Differential Equations on Measures and Functional Spaces, Birkhäuser Advanced Texts Basler Lehrbücher, https://doi.org/10.1007/978-3-030-03377-4_6


A critical point for the classification of the methods that are used for analysis concerns the possible non-degeneracy of the major (most singular) term that can usually be used for deriving certain smoothing properties of the evolution generated by this major term and therefore allowing for much less regular behaviour of the remaining parts. The absence of such non-degeneracy assumptions makes the analysis much more demanding. Another key point is the presence or absence of nonlinearity in the major term.

Finally, as in the linear case, serious progress can be achieved by exploiting various representations of the equations, namely mild, strong or weak. These representations make it possible to deal with various regularity classes of solutions. The strong form for measure-valued evolutions seems to be most appropriate for the non-degenerate case, when the smoothing property turns arbitrary initial measures into measures with smooth densities, where pseudo-differential operators are well defined. Since pseudo-differential operators are not well defined on arbitrary measures, however, the strong form may not be available in the absence of smoothing, in which case the weak form becomes the only possibility for the analysis. This is why the sensitivity for the general degenerate case is developed separately at the end of the chapter.

Let us explain in abstract terms the precise link to Theorems 2.1.1 to 2.1.3, which will be used in the sequel. The basic approach to the analysis of the ODE $\dot\mu = F(\mu)$ in a Banach space $B$ with an initial state $\mu_0 = Y$ for a Lipschitz-continuous r.h.s. $F$ was based on successive approximations $\mu_{n+1}(t) = Y + \int_0^t F(\mu_n(s))\,ds$, which are obtained as solutions to the approximating recursive system of equations $\dot\mu_{n+1} = F(\mu_n)$. If $F$ is unbounded, then such approximations may become inappropriate, because the growth rate of $F$ may increase by the iterations. However, if one can distinguish a linear or affine unbounded part of $F$ such that $F(\mu) = A[\mu]\mu + C[\mu]$, where $A[\nu]$ is a linear operator in $B$ for any $\nu$ that can be handled by the linear theory, then the natural system of recursive approximations to the ODE $\dot\mu = F(\mu)$ is $\dot\mu_{n+1} = A[\mu_n]\mu_{n+1} + C[\mu_n]$. The corresponding fixed-point equation can be interpreted as an infinite-dimensional multiplicative-integral equation. In fact, we already used such approximations, with $A[\mu_n]$ being a constant, in order to handle the positivity in Theorem 3.1.1. In this chapter, we use this idea quite generally, exploiting it both for strong and weak equations. The proof of Theorem 6.1.1 below is the first application of this idea.
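The recursive scheme just described can be written in mild form, which makes the role of the linear propagators visible. In the sketch below, $U^{t,s}[\mu_n]$ is a notation introduced here (an assumption, not taken from the text) for the propagator of the linear non-autonomous equation $\dot\xi_t = A[\mu_n(t)]\xi_t$, which is assumed to exist:

```latex
\begin{align*}
\dot\mu_{n+1}(t) &= A[\mu_n(t)]\,\mu_{n+1}(t) + C[\mu_n(t)], \qquad \mu_{n+1}(0) = Y, \\
\mu_{n+1}(t) &= U^{t,0}[\mu_n]\,Y + \int_0^t U^{t,s}[\mu_n]\,C[\mu_n(s)]\,ds, \\
\text{and, if } A[\nu] \equiv A \text{ is constant:} \quad
\mu_{n+1}(t) &= e^{tA}\,Y + \int_0^t e^{(t-s)A}\,C[\mu_n(s)]\,ds .
\end{align*}
```

A fixed point of the map $\mu_n \mapsto \mu_{n+1}$ is then a mild solution of $\dot\mu = A[\mu]\mu + C[\mu]$. The constant-generator case has the same shape as the mild equation (6.4) considered in Section 6.1.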

6.1 Hamilton–Jacobi–Bellman (HJB) and Ginzburg–Landau equations

Let us start with the equation
\[ \frac{\partial f}{\partial t}(x) = Af(x) + H\Big(x, \frac{\partial f}{\partial x}(x), f(x)\Big), \tag{6.1} \]


where $A$ is the generator of a strongly continuous semigroup $e^{tA}$ in $C_\infty(\mathbb{R}^d)$ such that
\[ \|e^{tA} f\|_{C^1(\mathbb{R}^d)} \le \kappa t^{-\omega} \|f\|_{C(\mathbb{R}^d)}, \tag{6.2} \]
uniformly for $t$ from a compact interval $[0, T]$, $\omega \in (0, 1)$, and $H$ is a Lipschitz-continuous function in three variables, referred to as the Hamiltonian of equation (6.1). The smoothing assumption (6.2) is central for the following discussion. In the previous chapter, we showed that many natural evolutions (parabolic PDEs and ΨDEs) satisfy this property, see, e.g., Theorem 5.8.3.

A basic example for an equation of the type (6.1) comes from stochastic control theory, where $A$ is the generator of a sufficiently regular Feller process (say, diffusion or a stable or stable-like process) and the Hamiltonian function $H$ has the form (2.123). In this case, equation (6.1) represents the standard Hamilton–Jacobi–Bellman (HJB) equation of stochastic control (of Markov processes generated by $A$).

A substantial simplification arises when $H$ in (6.1) does not depend on the derivative $\partial f/\partial x$. An example for this case is the Ginzburg–Landau equation
\[ \frac{\partial f}{\partial t}(x) = \Delta f(x) - \psi'(f(x)) \tag{6.3} \]

with some given function $\psi$. The Ginzburg–Landau equation and the Cahn–Hilliard equation (considered below, see (6.28)) are central for materials science, see, e.g., [99] for their derivation and potential extensions.

Remark 100. If $A$ in (6.1) is not smoothing, then the theory is quite different. This can already be seen in the case of vanishing $A$, which was touched upon in Section 2.6.

Motivated by Theorem 4.6.3, we shall consider the term with $H$ as a perturbation (though a nonlinear one) and will start with the corresponding mild solutions to (6.1), i.e., with solutions to the mild form of this equation:
\[ f_t = e^{tA} Y + \int_0^t e^{(t-s)A} H\Big(\cdot, \frac{\partial f_s}{\partial x}(\cdot), f_s(\cdot)\Big)\, ds. \tag{6.4} \]

Theorem 6.1.1. Let $A$ be an operator in $C_\infty(\mathbb{R}^d)$ that generates a strongly continuous semigroup $e^{tA}$ in $C_\infty(\mathbb{R}^d)$ such that $e^{tA}$ is also a strongly continuous semigroup in $C^1_\infty(\mathbb{R}^d)$ and
\[ \|e^{tA}\|_{C_\infty(\mathbb{R}^d)\to C_\infty(\mathbb{R}^d)} \le T_C, \qquad \|e^{tA}\|_{C^1_\infty(\mathbb{R}^d)\to C^1_\infty(\mathbb{R}^d)} \le T_D, \tag{6.5} \]
with constants $T_C$, $T_D$ and $t \in [0, T]$. Let $e^{tA}$ take $C(\mathbb{R}^d)$ to $C^1_\infty(\mathbb{R}^d)$ and let (6.2) hold with $\kappa > 0$, $\omega \in (0, 1)$, and let $H(x, p, q)$ be a continuous function on $\mathbb{R}^d \times \mathbb{R}^d \times \mathbb{R}$ such that $h = \sup_x |H(x, 0, 0)| < \infty$ and
\[ |H(x, p_1, q_1) - H(x, p_2, q_2)| \le L_H |p_1 - p_2| + L_H |q_1 - q_2| \tag{6.6} \]


with a constant $L_H$. Then for any $Y \in C^1_\infty(\mathbb{R}^d)$ there exists a unique solution $f_\cdot \in C([0, T], C^1_\infty(\mathbb{R}^d))$ to equation (6.4). Moreover, for all $t \le T$,
\[ \|f_t(Y) - Y\|_{C^1(\mathbb{R}^d)} \le E_{1-\omega}\big(\kappa L_H \Gamma(1-\omega) t^{1-\omega}\big) \Big( \frac{t^{1-\omega}\kappa}{1-\omega} \big(h + L_H \|Y\|_{C^1(\mathbb{R}^d)}\big) + \|(e^{tA} - 1)Y\|_{C^1(\mathbb{R}^d)} \Big), \tag{6.7} \]
and the solutions $f_t(Y_1)$ and $f_t(Y_2)$ with different initial data $Y_1, Y_2$ satisfy the estimate
\[ \|f_t(Y_1) - f_t(Y_2)\|_{C^1(\mathbb{R}^d)} \le T_D \|Y_1 - Y_2\|_{C^1(\mathbb{R}^d)} E_{1-\omega}\big(\kappa L_H \Gamma(1-\omega) t^{1-\omega}\big), \tag{6.8} \]
where $E$ denotes the Mittag-Leffler function (9.13).

Remark 101. The estimate (6.7) expresses the rate of convergence $f_t(Y) \to Y$, as $t \to 0$, in terms of the rate of convergence $e^{tA}Y \to Y$. For instance, if $Y$ belongs to the domain of the generator of $e^{tA}$ in $C^1(\mathbb{R}^d)$, then the difference $\|e^{tA}Y - Y\|_{C^1(\mathbb{R}^d)}$ will be of order $t$.

Proof. Using the notations given prior to Theorem 2.1.1, let us define the mapping $\Phi_Y : C([0, t], C^1_\infty(\mathbb{R}^d)) \to C_Y([0, t], C^1_\infty(\mathbb{R}^d))$ by
\[ [\Phi_Y(f_\cdot)](t) = e^{tA} Y + \int_0^t e^{(t-s)A} H\Big(\cdot, \frac{\partial f_s}{\partial x}(\cdot), f_s(\cdot)\Big)\, ds. \tag{6.9} \]
Let us first show that this mapping is well defined. If $f \in C([0, t], C^1_\infty(\mathbb{R}^d))$, then $[\Phi_Y(f_\cdot)](t) \in C^1_\infty(\mathbb{R}^d)$ for any $t$, with
\[ \|[\Phi_Y(f_\cdot)](t)\|_{C^1(\mathbb{R}^d)} \le T_D \|Y\|_{C^1(\mathbb{R}^d)} + \kappa \int_0^t (t-s)^{-\omega} \big(h + L_H \|f_s\|_{C^1(\mathbb{R}^d)}\big)\, ds. \]

We need to show the continuous dependence of $[\Phi_Y(f_\cdot)](t)$ on $t$. The first term in (6.9) depends continuously on $t$ in the topology of $C^1_\infty(\mathbb{R}^d)$, because of the assumed strong continuity of $e^{tA}$ in $C^1_\infty(\mathbb{R}^d)$. For the difference of the values of the second term of (6.9) at different times $t_1 > t_2$, we find
\[ \int_0^{t_1} e^{(t_1-s)A} H\Big(\cdot, \frac{\partial f_s}{\partial x}(\cdot), f_s(\cdot)\Big)\, ds - \int_0^{t_2} e^{(t_2-s)A} H\Big(\cdot, \frac{\partial f_s}{\partial x}(\cdot), f_s(\cdot)\Big)\, ds \]
\[ = \int_{t_2}^{t_1} e^{(t_1-s)A} H\Big(\cdot, \frac{\partial f_s}{\partial x}(\cdot), f_s(\cdot)\Big)\, ds + \big(e^{(t_1-t_2)A} - 1\big) \int_0^{t_2} e^{(t_2-s)A} H\Big(\cdot, \frac{\partial f_s}{\partial x}(\cdot), f_s(\cdot)\Big)\, ds. \]
The first term on the r.h.s. tends to zero in the $C^1(\mathbb{R}^d)$-norm, as $t_1 \to t_2$, because of (6.2), and the second because of the strong continuity of $e^{tA}$ in $C^1_\infty(\mathbb{R}^d)$. Therefore, $\Phi_Y$ is well defined.


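The rest is the application of Theorem 2.1.3, because
\[ \|[\Phi_{Y_1}(f_\cdot)](t) - [\Phi_{Y_2}(f_\cdot)](t)\|_{C^1_\infty(\mathbb{R}^d)} \le T_D \|Y_1 - Y_2\|_{C^1(\mathbb{R}^d)}, \]
and
\[ \|[\Phi_Y(f^1_\cdot)](t) - [\Phi_Y(f^2_\cdot)](t)\|_{C^1(\mathbb{R}^d)} \le \kappa \int_0^t (t-s)^{-\omega} \Big\| H\Big(\cdot, \frac{\partial f^1_s}{\partial x}(\cdot), f^1_s(\cdot)\Big) - H\Big(\cdot, \frac{\partial f^2_s}{\partial x}(\cdot), f^2_s(\cdot)\Big) \Big\|_{C_\infty(\mathbb{R}^d)}\, ds \]
\[ \le \kappa L_H \int_0^t (t-s)^{-\omega} \|f^1_s - f^2_s\|_{C^1_\infty(\mathbb{R}^d)}\, ds, \]
which yields the estimate (2.5) of Theorem 2.1.3, and finally
\[ [\Phi_Y(Y)](t) - Y = (e^{tA} - 1)Y + \int_0^t e^{(t-s)A} H\Big(\cdot, \frac{\partial Y}{\partial x}, Y\Big)\, ds, \]
\[ \|[\Phi_Y(Y)](t) - Y\| \le \|(e^{tA} - 1)Y\|_{C^1(\mathbb{R}^d)} + \frac{t^{1-\omega}\kappa}{1-\omega} \big(h + L_H \|Y\|_{C^1(\mathbb{R}^d)}\big). \]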
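To get a feeling for the size of the factor $E_{1-\omega}(\kappa L_H \Gamma(1-\omega) t^{1-\omega})$ in (6.7) and (6.8), one can evaluate the Mittag-Leffler function numerically. The following sketch assumes that (9.13) is the standard series $E_\beta(x) = \sum_{k \ge 0} x^k / \Gamma(\beta k + 1)$. The function names and the truncation level are illustrative choices.

```python
import math


def mittag_leffler(beta, x, terms=200):
    """Partial sum of E_beta(x) = sum_{k>=0} x^k / Gamma(beta*k + 1), for beta > 0 and x >= 0."""
    if x < 0:
        raise ValueError("this sketch only treats x >= 0")
    if x == 0:
        return 1.0
    total = 0.0
    for k in range(terms):
        # logarithms avoid overflow of the Gamma function for large k
        total += math.exp(k * math.log(x) - math.lgamma(beta * k + 1))
    return total


def lipschitz_factor(T_D, kappa, L_H, omega, t):
    """Right-hand side of (6.8) divided by the C^1-norm of Y_1 - Y_2."""
    x = kappa * L_H * math.gamma(1 - omega) * t ** (1 - omega)
    return T_D * mittag_leffler(1 - omega, x)
```

For $\beta = 1$ the series reduces to the exponential function, so that for $\omega \to 0$ the bound (6.8) approaches the usual Gronwall-type exponential estimate.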



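The next result states the continuous and Lipschitz dependence of the solutions to the HJB equation on a parameter entering the expression for the Hamiltonian.

Theorem 6.1.2. Let $H_\alpha(x, p, q)$ be a family of Hamiltonians depending on a parameter $\alpha$ taken from an auxiliary Banach space $B_{par}$. Suppose that each $H_\alpha$ satisfies all assumptions of Theorem 6.1.1 with all bounds uniform in $\alpha$, and moreover
\[ |H_\alpha(x, p, q) - H_\beta(x, p, q)| \le \|\alpha - \beta\| L^{par}_H (1 + |p| + |q|), \tag{6.10} \]
with a constant $L^{par}_H$. Then the solutions $f_t(Y, \alpha)$ and $f_t(Y, \beta)$ of (6.4) (built in Theorem 6.1.1) with different parameter values satisfy the estimate
\[ \sup_{s \in [0,t]} \|f_s(Y, \alpha) - f_s(Y, \beta)\|_{C^1(\mathbb{R}^d)} \le L^{par}_H K \|\alpha - \beta\| \big(1 + \|Y\|_{C^1(\mathbb{R}^d)}\big), \tag{6.11} \]
where the constant $K$ depends continuously on $t$, $\omega$, $\kappa$, $h$, $L_H$ and $T_D$.

Proof. This is again a consequence of Theorem 2.1.3, because
\[ \|[\Phi_{Y,\alpha}(f_\cdot)](t) - [\Phi_{Y,\beta}(f_\cdot)](t)\|_{C^1(\mathbb{R}^d)} \le \Big\| \int_0^t e^{(t-s)A} \Big( H_\alpha\Big(\cdot, \frac{\partial f_s}{\partial x}(\cdot), f_s(\cdot)\Big) - H_\beta\Big(\cdot, \frac{\partial f_s}{\partial x}(\cdot), f_s(\cdot)\Big) \Big)\, ds \Big\|_{C^1(\mathbb{R}^d)} \]
\[ \le \frac{\kappa}{1-\omega} t^{1-\omega} \sup_{s \in (0,t]} \Big\| H_\alpha\Big(\cdot, \frac{\partial f_s}{\partial x}(\cdot), f_s(\cdot)\Big) - H_\beta\Big(\cdot, \frac{\partial f_s}{\partial x}(\cdot), f_s(\cdot)\Big) \Big\|_{C(\mathbb{R}^d)} \]
\[ \le \frac{\kappa}{1-\omega} t^{1-\omega} L^{par}_H \|\alpha - \beta\| \Big( 1 + \sup_{s \in (0,t]} \|f_s\|_{C^1(\mathbb{R}^d)} \Big). \]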




Let us now analyse the conditions under which the regularity of the solutions can be improved and one can therefore pass from the mild equation to the initial one.

Theorem 6.1.3.
(i) Under the assumptions of Theorem 6.1.1, assume additionally that $e^{tA}$ acts strongly continuously in $C^2_\infty(\mathbb{R}^d)$, so that
\[ \|e^{tA}\|_{C^2_\infty(\mathbb{R}^d)\to C^2_\infty(\mathbb{R}^d)} \le \tilde T_D, \tag{6.12} \]
with some constant $\tilde T_D$, and that $H$ is Lipschitz-continuous in the first argument:
\[ |H(x_1, p, q) - H(x_2, p, q)| \le L_H |x_1 - x_2|\, |p|. \tag{6.13} \]
(The linear dependence of the Lipschitz constant on $|p|$ is a standard feature in all natural examples.) Then for any $f_0 \in C^2_\infty(\mathbb{R}^d)$, the unique solution $f_\cdot \in C([0, T], C^1_\infty(\mathbb{R}^d))$ to equation (6.4) is such that $f_t \in C^2_\infty(\mathbb{R}^d)$ for any $t$, with the norm in $C^2_\infty(\mathbb{R}^d)$ being uniformly bounded for $t \in [0, T]$. Moreover, $f_t$ satisfies (6.1) for $t > 0$.
(ii) Let $H_\alpha(x, p, q)$ be a family of Hamiltonians and $A_\alpha$ a family of operators depending on a parameter $\alpha$ taken from an auxiliary Banach space $B_{par}$ in such a way that, for each $\alpha$, $H_\alpha$ and $A_\alpha$ satisfy the conditions of (i) with all bounds uniform in $\alpha$. Moreover, let (6.10) hold and
\[ \|A_\alpha - A_\beta\|_{C^2(\mathbb{R}^d)\to C(\mathbb{R}^d)} \le L^{par}_A \|\alpha - \beta\|, \tag{6.14} \]
with a constant $L^{par}_A$. Then the estimate
\[ \|f_t(Y, \alpha) - f_t(Y, \beta)\|_{C^1(\mathbb{R}^d)} \le (L^{par}_H + L^{par}_A) K \|\alpha - \beta\| \big(1 + \|Y\|_{C^1(\mathbb{R}^d)}\big) \tag{6.15} \]
holds for the solutions $f_t(Y, \alpha)$ and $f_t(Y, \beta)$ of (6.4) with different parameter values, where the constant $K$ depends continuously on $t$, $\omega$, $\kappa$, $h$, $L_H$, $T_D$ and $\tilde T_D$.

Remark 102. We do not state in (i) that $f_\cdot \in C([0, T], C^2_\infty(\mathbb{R}^d))$.

Remark 103. In the proof, we shall use the additional regularization estimates (5.174) and (5.182) obtained in Theorem 5.15.1. One can avoid referring to this theorem by additionally assuming the smoothing properties (5.174) and (5.182). According to Theorem 5.8.3, these properties hold for the evolutions generated by parabolic PDEs and ΨDEs.

Proof. (i) The mapping (6.9) is now a mapping
\[ \Phi_Y : C([0, t], C^2_\infty(\mathbb{R}^d)) \to C_Y([0, t], C^2_\infty(\mathbb{R}^d)), \]
and all functions $\Phi^n_Y(Y)(t)$ are uniformly bounded in $C_Y([0, t], C^2_\infty(\mathbb{R}^d))$. In fact, we know by Theorem 6.1.1 that for any $Y \in C^2_\infty(\mathbb{R}^d)$ all approximations are uniformly bounded in $C^1_\infty(\mathbb{R}^d)$, so the Lipschitz constant $L_H |p|$ in (6.13) is uniformly bounded for
\[ p = \frac{\partial}{\partial x} \Phi^n_Y(Y)(t) \]
and all $n$. Therefore, the boundedness of all $[\Phi_Y(f_\cdot)](t)$ in $C^2_\infty(\mathbb{R}^d)$ follows from (6.12) and (5.182). The continuity of the curve $[\Phi_Y(f_\cdot)](t)$ in $C^2_\infty(\mathbb{R}^d)$ follows as in the proof of Theorem 6.1.1, but now the strong continuity of $e^{tA}$ in $C^2_\infty(\mathbb{R}^d)$ and (5.182) are used.

Therefore, all approximations of $f_t$ have uniformly bounded norms in $C^2_\infty(\mathbb{R}^d)$, and hence the limit $f_t$ has a bounded norm in $C^1_{bLip}(\mathbb{R}^d)$. But since $f_t$ satisfies (6.4), it follows that $f_t \in C^2_\infty(\mathbb{R}^d)$ for any $t > 0$. Since the norms in $C^1_{bLip}(\mathbb{R}^d)$ and $C^2(\mathbb{R}^d)$ coincide, it follows that the norms of $f_t$ in $C^2(\mathbb{R}^d)$ are uniformly bounded, as claimed.

Finally, in order to show that (6.1) holds for $t > 0$, we can apply Proposition 4.10.2. Note that it is not directly applicable, since we did not show that $g_t = H(x, \frac{\partial f_t}{\partial x}, f_t)$ belongs to $C^2_\infty(\mathbb{R}^d)$ for each $t$. Still, its conclusion holds, since $e^{(t-s)A} g_s \in C^2_\infty(\mathbb{R}^d)$ for any $t - s > 0$, and therefore the mild solution $f_t$ is differentiable at least for $t > 0$.

(ii) Again we use Theorem 2.1.3. To this end, we have to estimate

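\[ \|(e^{tA_\alpha} - e^{tA_\beta})Y\|_{C^1(\mathbb{R}^d)} + \Big\| \int_0^t \Big[ e^{(t-s)A_\alpha} H_\alpha\Big(\cdot, \frac{\partial f_s}{\partial x}(\cdot), f_s(\cdot)\Big) - e^{(t-s)A_\beta} H_\beta\Big(\cdot, \frac{\partial f_s}{\partial x}(\cdot), f_s(\cdot)\Big) \Big]\, ds \Big\|_{C^1(\mathbb{R}^d)}. \]
Given the calculations in the proof of Theorem 6.1.2, we only need to estimate the differences
\[ D_1 = \|(e^{tA_\alpha} - e^{tA_\beta})Y\|_{C^1(\mathbb{R}^d)} \]
and
\[ D_2 = \Big\| \int_0^t \big(e^{(t-s)A_\alpha} - e^{(t-s)A_\beta}\big) H_\alpha\Big(\cdot, \frac{\partial f_s}{\partial x}(\cdot), f_s(\cdot)\Big)\, ds \Big\|_{C^1(\mathbb{R}^d)}. \]
The difference between two semigroups can be estimated with the help of formula (4.8):
\[ e^{tA_\alpha} - e^{tA_\beta} = \int_0^t e^{(t-s)A_\beta} (A_\alpha - A_\beta) e^{sA_\alpha}\, ds. \tag{6.16} \]
Consequently, by (6.2) and (5.174), we find
\[ D_1 \le \int_0^t \kappa \tilde\kappa (t-s)^{-\omega} \|\alpha - \beta\| L^{par}_A s^{-\omega}\, ds\, \|Y\|_{C^1(\mathbb{R}^d)} = B(1-\omega, 1-\omega)\, \kappa \tilde\kappa \|\alpha - \beta\| L^{par}_A t^{1-2\omega} \|Y\|_{C^1(\mathbb{R}^d)}, \]
as required in order for Theorem 2.1.3 to be applicable. (Note that $B$ is the Beta function defined in (9.7).) The difference $D_2$ is estimated in the same way.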
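The identity $\int_0^t s^{-\omega}(t-s)^{-\omega}\,ds = B(1-\omega, 1-\omega)\,t^{1-2\omega}$ behind the estimate of $D_1$ can be checked numerically. The sketch below assumes that (9.7) is the standard Beta function $B(a,b) = \Gamma(a)\Gamma(b)/\Gamma(a+b)$. It removes the endpoint singularities by the substitution $s = t\sin^2\theta$ and then applies a plain midpoint rule, so it is most accurate for $\omega < 1/2$. The function names are illustrative.

```python
import math


def beta_function(a, b):
    """Standard Beta function B(a, b) = Gamma(a) Gamma(b) / Gamma(a + b)."""
    return math.gamma(a) * math.gamma(b) / math.gamma(a + b)


def singular_convolution(omega, t, n=20000):
    """Midpoint-rule value of int_0^t s^(-omega) (t - s)^(-omega) ds.

    With s = t sin^2(theta) the integral becomes
    2 t^(1 - 2 omega) int_0^(pi/2) (sin(theta) cos(theta))^(1 - 2 omega) d theta.
    """
    h = (math.pi / 2) / n
    total = 0.0
    for i in range(n):
        theta = (i + 0.5) * h
        total += (math.sin(theta) * math.cos(theta)) ** (1 - 2 * omega)
    return 2 * t ** (1 - 2 * omega) * total * h
```

For instance, `singular_convolution(0.3, 2.0)` can be compared with `beta_function(0.7, 0.7) * 2.0 ** 0.4`.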


Let us now formulate the time-dependent version of the above result. (Note that we omit the proof, since it is almost identical to the above proof.) In the time-dependent setting, the Cauchy problem for HJB equations usually arises in the inverse time. Therefore, we shall analyse the equation
\[ \dot f_t(x) = -A_t f_t(x) - H_t\Big(x, \frac{\partial f_t}{\partial x}(x), f_t(x)\Big), \quad 0 \le t \le r \le T, \tag{6.17} \]
where $A_t$ is a family of operators generating a strongly continuous backward propagator $U^{t,r}$ in $C_\infty(\mathbb{R}^d)$ such that
\[ \|U^{t,r} f\|_{C^1(\mathbb{R}^d)} \le \kappa (r-t)^{-\omega} \|f\|_{C(\mathbb{R}^d)} \tag{6.18} \]
uniformly for $t, r$ from a compact interval $[0, T]$, and $H_t$ is a family of Lipschitz-continuous functions. Like in the time-homogeneous case (taking into account the inverse direction of time), the mild solutions to the Cauchy problem of equation (6.17) with the terminal data $Y = f_r$ are defined as the solutions to the mild form of the equation (6.17):
\[ f_t = U^{t,r} Y + \int_t^r U^{t,s} H_s\Big(\cdot, \frac{\partial f_s}{\partial x}(\cdot), f_s(\cdot)\Big)\, ds, \quad t \le r. \tag{6.19} \]

Theorem 6.1.4.
(i) Let $A_t$ be a family of operators in $C_\infty(\mathbb{R}^d)$ generating a strongly continuous backward propagator $U^{t,r}$ in $C_\infty(\mathbb{R}^d)$ such that $U^{t,r}$ is also a strongly continuous propagator in $C^1_\infty(\mathbb{R}^d)$ with
\[ \|U^{t,r}\|_{C_\infty(\mathbb{R}^d)\to C_\infty(\mathbb{R}^d)} \le T_C, \qquad \|U^{t,r}\|_{C^1_\infty(\mathbb{R}^d)\to C^1_\infty(\mathbb{R}^d)} \le T_D, \tag{6.20} \]
with constants $T_C$, $T_D$ and $t, r \in [0, T]$. Let $U^{t,r}$ take $C(\mathbb{R}^d)$ to $C^1_\infty(\mathbb{R}^d)$, let (6.18) hold with $\kappa > 0$, $\omega \in (0, 1)$, and let $H_t(x, p, q)$ be a continuous function on $[0, T] \times \mathbb{R}^d \times \mathbb{R}^d \times \mathbb{R}$ such that $h = \sup_{t,x} |H_t(x, 0, 0)| < \infty$ and
\[ |H_t(x, p_1, q_1) - H_t(x, p_2, q_2)| \le L_H |p_1 - p_2| + L_H |q_1 - q_2| \tag{6.21} \]
with a constant $L_H$. Then for any $Y = f_r \in C^1_\infty(\mathbb{R}^d)$ there exists a unique solution $f_t \in C^1_\infty(\mathbb{R}^d)$ to equation (6.19). Moreover, for all $t \le r$,
\[ \|f_t(Y) - Y\|_{C^1(\mathbb{R}^d)} \le E_{1-\omega}\big(\kappa \Gamma(1-\omega)(r-t)^{1-\omega}\big) \Big( h\kappa \frac{(r-t)^{1-\omega}}{1-\omega} + \|Y\|_{C^1(\mathbb{R}^d)} \Big( 1 + T_D + \kappa L_H \frac{(r-t)^{1-\omega}}{1-\omega} \Big) \Big), \tag{6.22} \]
and the solutions $f_t(Y_1)$ and $f_t(Y_2)$ with different initial data $Y_1, Y_2$ satisfy the estimate
\[ \|f_t(Y_1) - f_t(Y_2)\|_{C^1(\mathbb{R}^d)} \le T_D \|Y_1 - Y_2\| E_{1-\omega}\big(\kappa \Gamma(1-\omega)(r-t)^{1-\omega}\big), \tag{6.23} \]
where $E$ denotes the Mittag-Leffler function (9.13).


(ii) Let $H_{\alpha,t}(x, p, q)$ be a family of Hamiltonians depending on a parameter $\alpha$ taken from an auxiliary Banach space $B_{par}$, with each $H_\alpha$ satisfying the assumptions of (i) with all bounds uniform in $\alpha$. Moreover, let
\[ |H_{\alpha,t}(x, p, q) - H_{\beta,t}(x, p, q)| \le \|\alpha - \beta\| L^{par}_H (1 + |p| + |q|), \tag{6.24} \]
with a constant $L^{par}_H$. Then the solutions $f_t(Y, \alpha)$ and $f_t(Y, \beta)$ to (6.19) with different parameter values satisfy the estimate (6.11), where the constant $K$ depends continuously on $t - r$, $\omega$, $\kappa$, $h$, $L_H$ and $T_D$.

Exercise 6.1.1. Formulate and prove a non-homogeneous analogue of Theorem 6.1.3.

Finally, let us analyse the sensitivity (i.e., the smooth dependence on the initial data) of the nonlinear equations that we dealt with above. This analysis demonstrates once again the power of the abstract results of Chapter 2. The equations (6.4) and (6.19) are of the type (2.129) and can therefore be handled by Theorem 2.15.1. Let us discuss the sensitivity of the HJB equation in the simplest framework of Theorem 6.1.1; the other cases considered above can be dealt with analogously.

Theorem 6.1.5. Under the assumptions of Theorem 6.1.1, let us additionally assume that the derivatives $\frac{\partial H(x,p,q)}{\partial p}$ and $\frac{\partial H(x,p,q)}{\partial q}$ exist and are continuous functions, uniformly for $x \in \mathbb{R}^d$ and $p, q$ from any bounded set. Then the mapping $Y \to f_t \in C^1_\infty(\mathbb{R}^d)$ yielding the solution to equation (6.4) constructed in Theorem 6.1.1 belongs to $C^1_{luc}(C^1_\infty(\mathbb{R}^d), C^1_\infty(\mathbb{R}^d))$, and the derivative $\xi_t = Df_t(Y)[\xi]$ is the unique solution to the equation
\[ \xi_t = e^{tA}\xi + \int_0^t e^{(t-s)A} \Big[ \frac{\partial H}{\partial p}\Big(\cdot, \frac{\partial f_s}{\partial x}(\cdot), f_s(\cdot)\Big) \frac{\partial \xi_s}{\partial x} + \frac{\partial H}{\partial q}\Big(\cdot, \frac{\partial f_s}{\partial x}(\cdot), f_s(\cdot)\Big) \xi_s \Big]\, ds. \tag{6.25} \]
Moreover, this solution is bounded:
\[ \|\xi_t\| \le \kappa(T, Y) \|\xi\| (1 + t). \tag{6.26} \]

Proof. It is a consequence of (6.2) and Theorems 6.1.1, 2.15.1, if one notes that the derivative of the mapping $f \to H(x, \frac{\partial f}{\partial x}(x), f(x))$, as a mapping $C^1_\infty(\mathbb{R}^d) \to C(\mathbb{R}^d)$, is given by the formula
\[ DH\Big(\cdot, \frac{\partial f}{\partial x}(\cdot), f(\cdot)\Big)[\xi](x) = \frac{\partial H}{\partial p}\Big(x, \frac{\partial f}{\partial x}(x), f(x)\Big) \frac{\partial \xi(x)}{\partial x} + \frac{\partial H}{\partial q}\Big(x, \frac{\partial f}{\partial x}(x), f(x)\Big) \xi(x). \]


6.2 Higher-order PDEs and ΨDEs, and Cahn–Hilliard-type equations

Similar to the operators of at most second order, as developed in Theorem 5.15.1, one can prove a strengthened smoothing for operators of at most order $\alpha$ under the corresponding regularity assumptions. This leads to extensions of the theory that was developed above for the equations (6.1). For the sake of simplicity, we shall analyse these equations only for the simplest $A$ of order $\alpha$, namely for $A = \sigma(x)|\Delta|^{\alpha/2}$, whose semigroup was constructed in Proposition 4.4.1 for a constant $\sigma$ and in the corollary to Proposition 5.9.1 for variable ones. The corresponding extension of the equations (6.1) is the class of equations of the type
\[ \dot f_t(x) = -\sigma(x)|\Delta|^{\alpha/2} f_t + H\Big(x, \Big\{ \frac{\partial^m f_t}{\partial x_{i_1} \cdots \partial x_{i_m}} \Big\}(x), f_t(x)\Big), \tag{6.27} \]
where $\sigma$ is a positive constant (more generally, a complex constant with a positive real part), $H(x, \{p_{i_1,\dots,i_m}\}, q)$ is a function of the variables $x \in \mathbb{R}^d$ and $p_{i_1,\dots,i_m}, q \in \mathbb{R}$, where $p_{i_1,\dots,i_m}$ are parametrized by any sequence of $m$ numbers from $\{1, \dots, d\}$ with $m \in \{1, \dots, k\}$, where $k < \alpha$. Therefore, $H$ in (6.27) is a function of $f$ and all its derivatives of order up to $k$. Equation (6.27) is referred to as quasi-linear if $H$ is linear with respect to all partial derivatives of $f$. An important example for physics is the so-called Cahn–Hilliard equation:
\[ \dot f_t = -\sigma \Delta^2 f_t + \Delta(\gamma_2 f_t^3 + \gamma_1 f_t^2 - f_t), \tag{6.28} \]
with constants $\gamma_{1,2}$. It governs the thermodynamical process of the separation of mixtures (spinodal decompositions).

Remark 104. The growth of the coefficients in (6.28) does not fit the assumptions of our general treatment of (6.27) below. However, the particular structure makes it possible to get some a-priori estimates as an appropriate counterbalance for this growth, see, e.g., [73] and [74].

Similar to (6.4), one can define the mild solutions to (6.27) as the solutions to the mild form of this equation:
\[ f_t = \exp\{-t\sigma|\Delta|^{\alpha}\} Y + \int_0^t \exp\{-(t-s)\sigma|\Delta|^{\alpha}\} H\Big(\cdot, \Big\{ \frac{\partial^m f_s}{\partial x_{i_1} \cdots \partial x_{i_m}} \Big\}(\cdot), f_s(\cdot)\Big)\, ds. \tag{6.29} \]
The reason for the assumption $k < \alpha$ in (6.27) is the integrability of the singularity $t^{-k/2\alpha}$ in (4.58), which is ensured by this assumption. Keeping this in mind, the following result is a straightforward extension of Theorem 6.1.1.


Theorem 6.2.1. Let $H$ be a continuous function such that $h = \sup_x |H(x, 0, 0)| < \infty$ and
\[ |H(x, \{p^1_{i_1,\dots,i_m}\}, q^1) - H(x, \{p^2_{i_1,\dots,i_m}\}, q^2)| \le L_H \sum_{i_1,\dots,i_m} |p^1_{i_1,\dots,i_m} - p^2_{i_1,\dots,i_m}| + L_H |q^1 - q^2| \tag{6.30} \]
with a constant $L_H$. Then for any $Y \in C^k_\infty(\mathbb{R}^d)$ there exists a unique solution $f_\cdot \in C([0, T], C^k_\infty(\mathbb{R}^d))$ to equation (6.29). Moreover, the solutions $f_t(Y_1)$ and $f_t(Y_2)$ with different initial data $Y_1, Y_2$ satisfy the estimate
\[ \|f_t(Y_1) - f_t(Y_2)\|_{C^k(\mathbb{R}^d)} \le c_1 \|Y_1 - Y_2\|_{C^k(\mathbb{R}^d)} E_{1-k/2\alpha}\big(c_2 t^{1-k/2\alpha}\big), \tag{6.31} \]
with some constants $c_1, c_2$ that only depend on $\alpha$ and $\sigma$.

All the other results that have earlier been developed for equation (6.1) extend more or less automatically to the equations (6.27) and their related extension as well.

6.3 Nonlinear evolutions and multiplicative-integral equations

In the last two sections, we laid the foundation for studying the strong form of equations with a linear major term and a non-linearity that depends pointwise on the values of an unknown function and its derivatives, working with their mild representations. Now we set up an alternative scheme of analysis that arises from working with the weak form of the equations, and therefore with a strong emphasis on duality. The weak form arises naturally in many applications, e.g., in cases when the symbols bear some integral dependence on the unknown function. Sometimes the weak form is the only way to write down an equation rigorously. But it can also be used as an alternative approach to the equations discussed above.

A handy general framework for dealing with weak equations is given by the setting of the dual Banach pair $(B_{obs}, B_{st})$ (i.e., each of these spaces is a closed subspace of the dual of the other space that separates the points of the latter). Unlike the more popular pair $(B, B^*)$, the setting $(B_{obs}, B_{st})$ makes the results explicitly symmetric with respect to changing the order of spaces in the pair. Therefore, we shall analyse ODEs in the Banach space $B_{st}$ of the form
\[ \frac{d}{dt}(f, \mu_t) = (A_t(\mu_t)f, \mu_t), \quad \mu_0 = Y, \quad f \in D, \tag{6.32} \]
which should hold for all $f$ from a dense subspace $D$ of $B_{obs}$, and where $A_t[\mu]$ is a family of linear operators in $B_{obs}$ for any $\mu$, with a domain containing $D$.

It will be convenient to assume that $D$ itself is a Banach space if equipped with some other norm $\|\cdot\|_D \ge \|\cdot\|_{B_{obs}}$, which allows for working with its dual Banach space $D^* \supset B^*_{obs} \supset B_{st}$ (and thus $\|\cdot\|_{D^*} \le \|\cdot\|_{B^*_{obs}} \le \|\cdot\|_{B_{st}}$). Recall that $C([\tau, T], D^*)$ denotes the Banach space of continuous functions $[\tau, T] \to D^*$. For $M \subset B_{st}$, let us denote by $C([\tau, T], M(D^*))$ the subset of $C([\tau, T], D^*)$ of functions that take values in $M$, and by $C_Y([\tau, T], M(D^*))$ the subset of $C([\tau, T], D^*)$ of functions $\mu_t$ with the given initial value $\mu_\tau = Y$. If $M$ is closed in $D^*$, then $C([\tau, T], M(D^*))$ is closed in $C([\tau, T], D^*)$ and a complete metric space with the distance induced by the norm of $C([\tau, T], D^*)$.

Theorem 6.3.1. Let $M$ be a convex subset of $B_{st}$ that is closed in the norm topologies of both $B_{st}$ and $D^*$. Let $\xi \to A_t(\xi)$ be a mapping from $M \times [0, T]$ to the bounded linear operators $A_t[\xi] : D \to B_{obs}$, which is continuous in $t$ and Lipschitz-continuous as a mapping $D^* \to L(D, B_{obs})$:
\[ \|A_t(\xi) - A_t(\eta)\|_{D \to B_{obs}} \le L_A \|\xi - \eta\|_{D^*}, \quad \xi, \eta \in M, \tag{6.33} \]
for a constant $L_A$. Assume that for any $Y \in M$ and $\xi_\cdot \in C_Y([\tau, T], M(D^*))$, $\tau \in [0, T)$, the operator curve $A_t(\xi_t) : D \to B_{obs}$ generates a strongly continuous backward propagator of uniformly bounded linear operators $U^{r,s}[\xi_\cdot]$, $\tau \le r \le s \le T$, in $B_{obs}$ on the common invariant domain $D$ such that
\[ \|U^{r,s}[\xi_\cdot]\|_{D \to D} \le U_D, \qquad \|U^{r,s}[\xi_\cdot]\|_{B_{obs} \to B_{obs}} \le U_B, \tag{6.34} \]
for some constants $U_D$, $U_B$, and that the dual propagators $V^{s,r}[\xi_\cdot] = (U^{r,s})^*[\xi_\cdot]$ preserve the set $M$. Then the weak nonlinear Cauchy problem (6.32) is well posed in $M$. More precisely, for any $Y \in M$ it has a unique solution $\mu_t(Y) \in M$, and the transformation $Y \to \mu_t(Y)$ of $M$ depends Lipschitz-continuously on the time $t$ and the initial data in the norm of $D^*$, i.e.,
\[ \|\mu_t(Y_1) - \mu_t(Y_2)\|_{D^*} \le \exp\{t U_D U_B L_A\}\, U_B \|Y_1 - Y_2\|_{D^*}, \tag{6.35} \]
\[ \|\mu_t(Y) - Y\|_{D^*} \le t U_B \|Y\|_{B_{st}} \|A(Y)\|_{D \to B} \exp\{t \|Y\|_{B_{st}} U_D U_B L_A\}. \tag{6.36} \]

Remark 105.
(i) The constant $L_A$ may depend on $\xi$ and $\eta$, but it should be uniformly bounded for $\xi, \eta$ from any bounded subset of $M$.
(ii) The continuity (6.33) is a stronger requirement than the Lipschitz continuity in $B_{st}$.
(iii) The notation $C([\tau, T], M(D^*))$ (rather than just $C([\tau, T], M)$) emphasizes that $M$ is considered in the topology of $D^*$.
(iv) According to Proposition 4.10.1, the dual propagators $V^{s,r}[\xi_\cdot]$ are necessarily Lipschitz-continuous functions of $s, r$ in the norm topology of $D^*$. Therefore, Theorem 6.3.1 is still valid if the propagators $U^{r,s}[\xi_\cdot]$ can be constructed not for any $\xi \in C_Y([\tau, T], M(D^*))$, but only for those that are Lipschitz-continuous in the norm topology of $D^*$. As an application of this remark, see Theorem 6.9.1.


Proof. As mentioned before, we are planning to approximate the solutions to (6.32) by a recursive system solving the equations
\[ \frac{d}{dt}(f, \xi_{n+1}(t)) = (A_t[\xi_n(t)]f, \xi_{n+1}(t)), \quad \xi_n(\tau) = Y, \quad f \in D, \tag{6.37} \]
expecting that the $\xi_n$ converge to a fixed point of the mapping
\[ \xi_t \to \Phi_Y(\xi_\cdot)(t) = V^{t,\tau}[\xi_\cdot]Y \]
considered as a mapping of the metric space $C_Y([\tau, T], M(D^*))$ to itself. In order to estimate the difference of two propagators $U^{\tau,t}[\xi^1_\cdot]$ and $U^{\tau,t}[\xi^2_\cdot]$ via Proposition 4.9.2, we need the continuity of the mapping $A_t(\xi)f$ as a mapping $M(D^*) \to B_{obs}$ for any $f \in D$, which follows by (6.33). Since
\[ (f, (V^{t,\tau}[\xi^1_\cdot] - V^{t,\tau}[\xi^2_\cdot])Y) = (U^{\tau,t}[\xi^1_\cdot]f - U^{\tau,t}[\xi^2_\cdot]f, Y), \]
it follows that
\[ \|[\Phi_Y(\xi^1_\cdot)](t) - [\Phi_Y(\xi^2_\cdot)](t)\|_{D^*} = \|V^{t,\tau}[\xi^1_\cdot]Y - V^{t,\tau}[\xi^2_\cdot]Y\|_{D^*} \le \|U^{\tau,t}[\xi^1_\cdot] - U^{\tau,t}[\xi^2_\cdot]\|_{D \to B} \|Y\|_{B_{st}} \le U_D U_B L_A \|Y\|_{B_{st}} \int_\tau^t \|\xi^1_s - \xi^2_s\|_{D^*}\, ds. \tag{6.38} \]
Moreover,
\[ \|[\Phi_{Y_1}(\xi_\cdot)](t) - [\Phi_{Y_2}(\xi_\cdot)](t)\|_{D^*} = \|V^{t,0}[\xi_\cdot](Y_1 - Y_2)\|_{D^*} \le U_B \|Y_1 - Y_2\|_{B_{st}}. \tag{6.39} \]
Finally, using (4.133), we obtain
\[ \|\Phi_Y(Y) - Y\|_{C([\tau,t],D^*)} = \sup_{s \in [\tau,t]} \|V^{s,\tau}[Y]Y - Y\|_{D^*} = \sup_{s \in [\tau,t]} \sup_{\|f\|_D \le 1} (U^{\tau,s}[Y]f - f, Y) \le (t - \tau) U_B \sup_{s \in [\tau,t]} \|A_s(Y)\|_{D \to B} \|Y\|_{B_{st}}. \]
Therefore, everything follows from Theorem 2.1.1.
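The recursion (6.37) can be illustrated in the simplest finite-dimensional setting, where $B_{obs} = D = \mathbb{R}^3$ (functions on three states), $B_{st} = \mathbb{R}^3$ (measures on three states), $M$ is the simplex of probability vectors and $A(\mu)$ is a $\mu$-dependent $Q$-matrix, so that (6.32) becomes $\dot\mu_t = \mu_t Q(\mu_t)$. The particular matrix `generator`, the explicit Euler time stepping and all names below are assumptions made for this sketch only.

```python
def generator(mu):
    """Hypothetical mu-dependent Q-matrix on three states (each row sums to zero)."""
    a, b, c = mu
    return [
        [-(1.0 + b), 1.0 + b, 0.0],
        [0.5, -(1.5 + c), 1.0 + c],
        [1.0 + a, 0.0, -(1.0 + a)],
    ]


def frozen_flow(path, Y, dt):
    """Solve d/dt xi = xi Q(path(t)), xi(0) = Y, by explicit Euler: one step of the recursion."""
    xi = [list(Y)]
    for k in range(len(path) - 1):
        Q = generator(path[k])             # generator frozen along the previous iterate
        cur = xi[-1]
        xi.append([cur[j] + dt * sum(cur[i] * Q[i][j] for i in range(3)) for j in range(3)])
    return xi


def solve_nonlinear_chain(Y, T=1.0, steps=100, iterations=25):
    """Iterate xi_{n+1} = Phi_Y(xi_n), starting from the constant curve xi_0(t) = Y."""
    dt = T / steps
    path = [list(Y) for _ in range(steps + 1)]
    gaps = []                              # sup-in-time l1 distance between successive iterates
    for _ in range(iterations):
        new = frozen_flow(path, Y, dt)
        gaps.append(max(sum(abs(x - y) for x, y in zip(p, q)) for p, q in zip(new, path)))
        path = new
    return path, gaps
```

Each frozen flow preserves the simplex (the analogue of the dual propagators preserving $M$), and the recorded gaps between successive iterates decay rapidly, in line with the estimate (6.38).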



Remark 106. For measure-valued equations, the basic examples of the set $M$ are the set of probability measures or the set of positive measures. In the above fixed-point equation $\xi_\cdot = \Phi_Y(\xi_\cdot)$, the r.h.s. is expressed in terms of the propagator, which in turn represents a T-product or a multiplicative integral. Therefore, this equation is the multiplicative-integral analogue of the usual integral equations.

Let us now prove the stability (or continuous dependence) of the nonlinear semigroups of transformations $\mu_t$ with respect to small perturbations of the generators $A$.


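Theorem 6.3.2. Assume that we have a family $\xi \to A^\alpha_t(\xi)$ of mappings from $M \times [0, T]$ to the bounded linear operators $A^\alpha_t[\xi] : D \to B_{obs}$, satisfying the conditions of Theorem 6.3.1 with all constants being uniform in $\alpha$ for $\alpha$ from some auxiliary Banach space $B_1$. Suppose that
\[ \|A^\alpha_t(\xi) - A^\beta_t(\xi)\|_{D \to B} \le \kappa \|\alpha - \beta\|, \quad \xi \in M, \tag{6.40} \]
with a constant $\kappa$. Then
\[ \|\mu^\alpha_t(Y) - \mu^\beta_t(Y)\|_{D^*} \le (t - \tau) \|\alpha - \beta\| \kappa \exp\{(t - \tau) U_D U_B L_A \|Y\|_{B_{st}}\}\, U_D U_B \|Y\|_{B_{st}}, \tag{6.41} \]
where $\mu^\alpha_t(Y)$ denotes the corresponding solution to the equation with $A^\alpha_t$.

Proof. By duality and Proposition 4.9.3, we find
\[ \|[\Phi^\alpha_Y(\xi_\cdot)](t) - [\Phi^\beta_Y(\xi_\cdot)](t)\|_{D^*} = \|V^{t,\tau}_\alpha[\xi_\cdot]Y - V^{t,\tau}_\beta[\xi_\cdot]Y\|_{D^*} \le \|U^{\tau,t}_\alpha[\xi_\cdot] - U^{\tau,t}_\beta[\xi_\cdot]\|_{D \to B_{obs}} \|Y\|_{B_{st}} \le (t - \tau) U_D U_B \kappa \|\alpha - \beta\| \|Y\|_{B_{st}}, \tag{6.42} \]
which again implies (6.41) by Theorem 2.1.1.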

6.4 Causal equations and general path-dependent equations

In this section, we shall deal with a path-dependent version of equation (6.32), i.e., with the equation
\[ \frac{d}{dt}(f, \mu_t) = (A[t, \{\mu_s\}_{0 \le s \le T}]f, \mu_t), \quad \mu_0 = Y, \quad f \in D, \tag{6.43} \]
where $(t, \{\eta_s\}_{0 \le s \le T}) \to A[t, \{\eta_s\}_{0 \le s \le T}]$ maps $\mathbb{R}_+ \times C_\mu([0, T], M(D^*))$ to the bounded linear operators $D \to B_{obs}$. We refer to equation (6.43) as the general path-dependent kinetic equation. It should hold for all test functions $f \in D$.

If the operators $A$ only depend on the history of the trajectory $\{\mu_\cdot\} \in C_\mu([0, T], M(D^*))$, that is,
\[ \frac{d}{dt}(f, \mu_t) = (A[t, \{\mu_{\le t}\}]f, \mu_t), \quad \mu_0 = Y, \quad f \in D, \tag{6.44} \]
where $\{\mu_{\le t}\}$ is a short form for $\{\mu_s\}_{0 \le s \le t}$, then the equations (6.44) are causal, see (2.157). It is also natural to call them adaptive kinetic equations, in analogy to adaptive control and adaptive stochastic differential equations.

If, on the other hand, the generators $A$ only depend on the future of the trajectory $\{\mu_\cdot\}$, that is,
\[ \frac{d}{dt}(f, \mu_t) = (A[t, \{\mu_{\ge t}\}]f, \mu_t), \quad \mu_0 = Y, \quad f \in D, \tag{6.45} \]


where $\{\mu_{\ge t}\}$ is a short form for $\{\mu_s\}_{t \le s \le T}$, then we call (6.45) an anticipating kinetic equation.

Remark 107. The main reason for talking about (seemingly quite exotic) anticipating equations lies in their inevitable appearance in the study of forward-backward systems, see Section 6.10.

Let us start with the causal equations, where no additional difficulties arise compared to equation (6.32).

Theorem 6.4.1. As in Theorem 6.3.1, let $M$ be a bounded convex subset of $B_{st}$ that is closed in the norm topologies of both $B_{st}$ and $D^*$. Assume that for any $t \in [0, T]$ and a curve $\{\xi_\cdot\} \in C_\mu([0, T], M(D^*))$, a linear operator $A[t, \{\xi_{\le t}\}] : D \to B_{obs}$ is defined such that all $A[t, \{\xi_{\le t}\}] : D \to B_{obs}$ are uniformly bounded and Lipschitz-continuous in $\{\xi_\cdot\}$, i.e., for any $\{\xi_\cdot\}, \{\eta_\cdot\} \in C_Y([0, T], M(D^*))$, we have
\[ \sup_{s \in [0,t]} \|A[s, \{\xi_{\le s}\}] - A[s, \{\eta_{\le s}\}]\|_{D \to B_{obs}} \le L_A \|\xi_\cdot - \eta_\cdot\|_{C([0,t],D^*)}, \tag{6.46} \]
with a positive constant $L_A$. Moreover, assume that for any $\{\xi_\cdot\} \in C_Y([0, T], M(D^*))$, the operator curve $A[t, \{\xi_{\le t}\}] : D \to B_{obs}$ generates a strongly continuous backward propagator of bounded linear operators $U^{r,s}[\{\xi_{\le t}\}]$ in $B_{obs}$, $0 \le r \le s \le t$, on the common invariant domain $D$, so that
\[ \|U^{r,s}[\{\xi_\cdot\}]\|_{D \to D} \le U_D \quad \text{and} \quad \|U^{r,s}[\{\xi_\cdot\}]\|_{B_{obs} \to B_{obs}} \le U_B, \quad r \le s, \tag{6.47} \]
for some constants $U_D$, $U_B$, and that their dual propagators $V^{s,r} = (U^{r,s})^*[\{\xi_{\le t}\}]$ preserve the set $M$. Then the Cauchy problem (6.44) is well posed, that is, for any $Y \in M$, it has a unique solution $\mu_t(Y) \in M$ that depends Lipschitz-continuously on the time $t$ and the initial data in the norm of $D^*$. In other words, the same estimates (6.35), (6.36) hold, but with $\sup_{s \in [0,t]} \|A(s, Y)\|_{D \to B_{obs}}$ instead of $\|A(Y)\|_{D \to B_{obs}}$ in (6.36).

Proof. The proof of Theorem 6.3.1 was presented in such a way that it can be readily extended to the present, more general situation by working with the mapping $\Phi_Y(\xi_{\le t})(t) = V^{t,0}[\xi_{\le t}]Y$.

Let us now turn to general path-dependent equations.

Theorem 6.4.2. Let $M$ be a convex subset of $B_{st}$ with $\sup_{\mu \in M} \|\mu\|_{B_{st}} \le K$, which is closed in the norm topologies of both $B_{st}$ and $D^*$. Suppose that
(i) the linear operators $A[t, \{\xi_\cdot\}] : D \to B_{obs}$ are uniformly bounded and Lipschitz-continuous in $\{\xi_\cdot\}$, i.e., for any $\{\xi_\cdot\}, \{\eta_\cdot\} \in C_Y([0, T], M(D^*))$, we have
\[ \sup_{t \in [0,T]} \|A[t, \{\xi_\cdot\}] - A[t, \{\eta_\cdot\}]\|_{D \to B_{obs}} \le L_A \sup_{t \in [0,T]} \|\xi_t - \eta_t\|_{D^*}, \tag{6.48} \]
with a positive constant $L_A$;


(ii) for any $\{\xi_\cdot\} \in C_Y([0, T], M(D^*))$, the operator curve $A[t, \{\xi_\cdot\}] : D \to B_{obs}$ generates a strongly continuous backward propagator of bounded linear operators $U^{t,s}[\{\xi_\cdot\}]$ in $B$, $0 \le t \le s$, on the common invariant domain $D$, so that
\[ \|U^{t,s}[\{\xi_\cdot\}]\|_{D \to D} \le U_D \quad \text{and} \quad \|U^{t,s}[\{\xi_\cdot\}]\|_{B \to B} \le U_B, \quad t \le s, \tag{6.49} \]
for some positive constants $U_D$, $U_B$. Moreover, suppose that their dual propagators $V^{s,t}[\{\xi_\cdot\}]$ preserve the set $M$.
Then, if
\[ L_A U_B U_D K T < 1, \tag{6.50} \]
the Cauchy problem (6.43) is well posed, i.e., for any $Y \in M$ it has a unique solution $\mu_t(Y) \in M$ (that is, (6.43) holds for all $f \in D$) that depends Lipschitz-continuously on the initial data in the norm of $D^*$:
\[ \|\mu_\cdot(Y_1) - \mu_\cdot(Y_2)\|_{C([0,T],M(D^*))} \le \frac{U_B}{1 - L_A U_B U_D K T} \|Y_1 - Y_2\|_{D^*}. \tag{6.51} \]

Proof. The peculiarity of the general path dependence arises in the equation (6.42). Unlike for the causal case, it is not valid here. Instead, one gets
\[ \|[\Phi_{Y_1}(\xi^1_\cdot)](t) - [\Phi_{Y_2}(\xi^2_\cdot)](t)\|_{D^*} \le U_D U_B L_A \|Y_1\|_{B^*} T \|\xi^1_\cdot - \xi^2_\cdot\|_{C([0,T],D^*)} + U_B \|Y_1 - Y_2\|_{B^*}. \tag{6.52} \]
Therefore, the mapping $\Phi_Y$ is a contraction if (6.50) holds. The existence of the fixed point now follows from the Banach contraction principle, and (6.51) follows from the stability of fixed points, see Proposition 9.1.3.

Theorem 6.4.3. Under the assumptions in Theorem 6.4.2, assume additionally that for any $t$ from a dense subset of $[0, T]$, the set
\[ \{V^{t,0}[\{\xi_\cdot\}]Y : \{\xi_\cdot\} \in C_Y([0, T], M(D^*))\} \tag{6.53} \]
is relatively compact in $M(D^*)$. Then a solution to the Cauchy problem (6.43) exists in $M$ globally, i.e., without the restriction (6.50).

Proof. Since $M$ is convex, the space $C_Y([0, T], M(D^*))$ is also convex. Since the dual operators $V^{t,0}[\{\xi_\cdot\}]$ preserve the set $M$, the mapping $\Phi_Y$ acts from $C_Y([0, T], M(D^*))$ to itself. Moreover, by (6.52), this mapping is Lipschitz-continuous. Let us denote by $\hat C$ the image of $C_Y([0, T], M(D^*))$ in $C_Y([0, T], M(D^*))$ under $\Phi_Y$. In particular, $\Phi_Y$ takes $\hat C$ to itself. Together with (6.52), the assumption that the set (6.53) is relatively compact in $M$ for any $t$ from a dense subset of $[0, T]$ implies that the set $\hat C$ is relatively compact in $C_Y([0, T], M(D^*))$ (by the Arzelà–Ascoli Theorem). Finally, the Schauder fixed-point theorem implies that there exists a fixed point in $\hat C \subset C_Y([0, T], M(D^*))$, which ensures the existence of a solution to (6.43).


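In all applications, we have to keep in mind that $B = B_{obs} = C_\infty(\mathbb{R}^d)$, $D = C^k_\infty(\mathbb{R}^d)$ with some $k \in \mathbb{N}$, $B_{st} = B^* = \mathcal{M}(\mathbb{R}^d)$ and $M = \mathcal{P}(\mathbb{R}^d)$. According to Proposition 1.1.1, a handy class of compact subsets of $\mathcal{P}(\mathbb{R}^d)$ is given by the sets
\[ \mathcal{P}^1_{\le\lambda}(\mathbb{R}^d) = \Big\{ \mu \in \mathcal{P}(\mathbb{R}^d) : \int |x|\, \mu(dx) \le \lambda \Big\}. \]
Therefore, we get the following more concrete version of Theorem 6.4.3:

Theorem 6.4.4. Let $B = B_{obs} = C_\infty(\mathbb{R}^d)$, $D = C^k_\infty(\mathbb{R}^d)$ with some $k \in \mathbb{N}$, and $M = \mathcal{P}(\mathbb{R}^d)$. Under the assumptions in Theorem 6.4.2, assume additionally that all backward propagators $U^{t,s}[\{\xi_\cdot\}]$ act as uniformly bounded operators in the weighted spaces $C_L(\mathbb{R}^d)$ with $L(x) = 1 + |x|$. Then a solution to the Cauchy problem (6.43) exists in $\mathcal{P}^1(\mathbb{R}^d) = \cup_{\lambda > 0} \mathcal{P}^1_{\le\lambda}(\mathbb{R}^d)$ for any $Y \in \mathcal{P}^1(\mathbb{R}^d)$.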

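6.5 Simplest nonlinear diffusions: weak treatment

Let us consider some basic examples of nonlinear evolutions (with both pointwise and integral nonlinearities) and compare their strong and weak formulations.

First, let $B = B_{obs} = C_\infty(\mathbb{R})$, $B_{st} = B^* = \mathcal{M}(\mathbb{R})$, $M = \mathcal{M}^+_{\le\lambda}(\mathbb{R})$ and $D = C^1_\infty(\mathbb{R})$ equipped with the usual norm of $C^1_\infty(\mathbb{R})$. Let
\[ A(\xi)f = a(\xi, x) \frac{\partial f}{\partial x} \tag{6.54} \]
with some function $a : B^* \times \mathbb{R} \to \mathbb{R}$. Then, we have
\[ \|A(\xi) - A(\eta)\|_{D \to B} \le \|a(\xi, \cdot) - a(\eta, \cdot)\|_{C(\mathbb{R})}, \]
so that condition (6.33) requires
\[ \|a(\xi, \cdot) - a(\eta, \cdot)\|_{C(\mathbb{R})} \le L_A \|\xi - \eta\|_{D^*}. \tag{6.55} \]
Assuming that
\[ a(\xi, x) = \int_{\mathbb{R}^k} g(x, y_1, \dots, y_k)\, \xi(dy_1) \cdots \xi(dy_k) \tag{6.56} \]
is the integral operator with the kernel $g$, we find
\[ \|a(\xi, \cdot) - a(\eta, \cdot)\|_{C(\mathbb{R})} \le \|g\|_{C(\mathbb{R}^{k+1})}\, k b^{k-1} \|\xi - \eta\|_{B^*}, \]
where $b = \max(\|\xi\|_{B^*}, \|\eta\|_{B^*})$, and
\[ \|a(\xi, \cdot) - a(\eta, \cdot)\|_{C(\mathbb{R})} \le \sup_{x,y} \Big( b^{k-1} k |g(x, y)| + \sum_j \Big| \frac{\partial g}{\partial y_j}(x, y) \Big| \Big) \|\xi - \eta\|_{D^*}, \tag{6.57} \]
so that the continuity (6.33) requires smoothness of the integral kernel $g$. As a direct consequence of Theorem 6.3.1, we have the following result: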


Proposition 6.5.1. Let $A(\xi)$ be given by (6.54) with $a(\xi,.) \in C^1(\mathbb{R})$ uniformly in $\xi \in \mathcal{M}^+_{\le\lambda}(\mathbb{R})$, and let (6.55) hold there. For instance, the function $a$ can be given by (6.56) with $g \in C^1(\mathbb{R}^{k+1})$. Then the Cauchy problem

$$\frac{d}{dt}(f, \mu_t) = (a(\mu_t, x) f'(x), \mu_t), \quad \mu_0 = Y \in \mathcal{M}^+_{\le\lambda}(\mathbb{R}), \quad f \in D = C^1(\mathbb{R}) \cap C_\infty(\mathbb{R}), \tag{6.58}$$

with $t \in [0,T]$, satisfies the conditions of Theorem 6.3.1 with

$$M = \mathcal{M}^+_{\le\lambda}(\mathbb{R}), \quad U_B = 1, \quad U_D = \exp\{T \sup_\mu \|a(\mu,.)\|_{C^1(\mathbb{R})} : \mu \in \mathcal{M}^+_{\le\lambda}(\mathbb{R})\}.$$

Therefore, this Cauchy problem is well posed in $M = \mathcal{M}^+(\mathbb{R})$.

Exercise 6.5.1. Extend Proposition 6.5.1 to $B = C_\infty(\mathbb{R}^d)$, and also for causal and path-dependent nonlinearities.

Next, let us consider the (possibly complex) diffusion operator

$$A(\xi) f = \frac{1}{2}\sigma\Delta f + (b_t(\xi, x), \nabla) f(x) + V_t(\xi, x) f(x) \tag{6.59}$$

in $\mathbb{R}^d$, where $\sigma$ is a complex constant with a positive real part, with some functions $b_t(\xi,x)$, $V_t(\xi,x)$ (possibly complex-valued). In this case, the appropriate spaces are $B = C_\infty(\mathbb{R}^d)$ and $D = C^2_\infty(\mathbb{R}^d)$, with $B^* = \mathcal{M}(\mathbb{R}^d)$ and either $M = \mathcal{M}^+_{\le\lambda}(\mathbb{R}^d)$ or $M = \mathcal{M}_{\le\lambda}(\mathbb{R}^d)$ or, in the case of complex-valued functions, $M = \mathcal{M}^{\mathbb{C}}_{\le\lambda}(\mathbb{R}^d)$. The condition (6.33) requires that

$$\|b_t(\xi,.) - b_t(\eta,.)\|_{C(\mathbb{R}^d)} \le L_A\|\xi - \eta\|_{D^*}, \quad \|V_t(\xi,.) - V_t(\eta,.)\|_{C(\mathbb{R}^d)} \le L_A\|\xi - \eta\|_{D^*}. \tag{6.60}$$

If $b_t$, $V_t$ are integrals of the type (6.56), then this condition can be more explicitly expressed in terms of the second derivatives of the corresponding integral kernels, as shown above for the function $a$. The control over the growth of the solutions can be performed either via positivity or the PMP (see Corollary 6), or by just requiring uniform boundedness. As a direct consequence of Theorems 6.3.1 and 4.14.2 (and Corollary 6), we get the following result:

Proposition 6.5.2.
(i) Let $A_t(\xi)$ be given by (6.59) with a positive real $\sigma$ and with real continuous functions $b_t(\xi,x)$, $V_t(\xi,x)$ such that $V_t \le 0$ everywhere and $b_t(\xi,.), V_t(\xi,.) \in C^1(\mathbb{R}^d)$ uniformly in time and $\xi \in \mathcal{M}^+_{\le\lambda}(\mathbb{R}^d)$ for any $\lambda$, and let (6.60) hold again uniformly for $\xi \in \mathcal{M}^+_{\le\lambda}(\mathbb{R}^d)$. Then the Cauchy problem

$$\frac{d}{dt}(f, \mu_t) = \Big(\frac{1}{2}\sigma\Delta f + (b_t(\mu_t, x), \nabla) f(x) + V_t(\mu_t, x) f(x),\ \mu_t\Big), \quad \mu_0 = Y \in \mathcal{M}^+_{\le\lambda}(\mathbb{R}^d), \tag{6.61}$$

(written in the weak form for $f \in D = C^2_\infty(\mathbb{R}^d)$) satisfies the conditions of Theorem 6.3.1 with $M = \mathcal{M}^+_{\le\lambda}(\mathbb{R}^d)$ for any $\lambda$. Therefore, this Cauchy problem is well posed in $\mathcal{M}^+(\mathbb{R}^d)$.
(ii) Let $b_t$, $V_t$ be complex-valued and $\sigma$ be complex with a positive real part. Let $b_t(\xi,.), V_t(\xi,.) \in C^1(\mathbb{R}^d)$ and let (6.60) hold uniformly in time and all $\xi \in \mathcal{M}^{\mathbb{C}}(\mathbb{R}^d)$. Then the Cauchy problem (6.61) is well posed in $\mathcal{M}^{\mathbb{C}}(\mathbb{R}^d)$.
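For intuition only, the first-order weak equation (6.58) can be probed numerically on empirical measures. If $\mu_t = \frac{1}{N}\sum_i \delta_{x_i(t)}$, then $\frac{d}{dt}(f,\mu_t) = \frac{1}{N}\sum_i f'(x_i)\dot x_i$, so (6.58) is satisfied whenever each atom follows the characteristic equation $\dot x_i = a(\mu_t, x_i)$. The following Python sketch integrates this system by explicit Euler steps for a kernel of the type (6.56) with $k = 1$. The kernel `g`, the step count and all function names are illustrative assumptions and do not come from the text.

```python
import math

def g(x, y):
    # Hypothetical bounded C^1 kernel, chosen only for illustration.
    return math.sin(y - x)

def a(points, x):
    # a(mu, x) = integral of g(x, y) mu(dy) for the empirical measure of `points`.
    return sum(g(x, y) for y in points) / len(points)

def euler_flow(points, T, steps):
    # Explicit Euler scheme for the characteristics x_i' = a(mu_t, x_i).
    dt = T / steps
    pts = list(points)
    for _ in range(steps):
        pts = [x + dt * a(pts, x) for x in pts]
    return pts

def pairing(f, points):
    # (f, mu) for the empirical measure of `points`.
    return sum(f(x) for x in points) / len(points)

if __name__ == "__main__":
    start = [-1.0, -0.3, 0.2, 0.9]
    end = euler_flow(start, T=1.0, steps=1000)
    print(pairing(math.tanh, start), pairing(math.tanh, end))
```

The number of atoms and their weights never change along the flow, so total mass and positivity are preserved, in line with the choice $M = \mathcal{M}^+_{\le\lambda}(\mathbb{R})$. For this particular antisymmetric kernel the mean of the atoms is also conserved, which gives a simple sanity check on the scheme.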

6.6 Simplest nonlinear diffusions: strong treatment

The weak equations (6.32), (6.59) can be equivalently written in the strong form:

$$\frac{\partial}{\partial t}\mu_t = \frac{1}{2}\sigma\Delta\mu_t - (\nabla, b_t(\mu_t, x)\mu_t) + V_t(\mu_t, x)\mu_t, \quad \mu_0 = Y \in \mathcal{M}^+_{\le\lambda}(\mathbb{R}^d). \tag{6.62}$$

Since the major term $\Delta$ is non-degenerate, one can expect better results from the strong treatment, compared to the weak approach developed above. When comparing (6.62) with (6.1), it is important to note that we are now looking for measure-valued solutions, or at least solutions in $L_1(\mathbb{R}^d)$, as opposed to the well-posedness in $C^1_\infty(\mathbb{R}^d)$ that we proved for (6.1). Even more importantly, we do not assume the dependence of the coefficients on $\mu$ to be of a pointwise local form, as we did in (6.1). Moreover, for the sake of definiteness, we want to avoid this type of dependence here, since its treatment may be different from the treatment of the integral dependence, which we are going to mostly deal with from now on. In order to understand this point, observe that if $b_t(\mu_t, x)$ is given by a function of the type $h(\phi_t(x))$, where $\phi_t$ is the density of $\mu_t$, then $\frac{\partial b}{\partial x} = h'(\phi_t(x))\phi_t'(x)$, but if $b_t(\mu_t, x) = \int h(x)\mu_t(dx)$, then $\frac{\partial b}{\partial x} = 0$. To be specific, we shall usually assume that the dependence of the coefficients on $\mu$ and $x$ is given by some function of a finite number of integral monomials of the type (6.56), i.e.,

$$\int_{\mathbb{R}^k} g(x, y_1, \dots, y_k)\,\mu(dy_1)\cdots\mu(dy_k).$$

In particular, for $k = 0$, we have just a dependence on $x$, and if $g$ does not depend on $x$, then the monomial does not depend on $x$. This kind of dependence is convenient, because it allows for a more or less explicit representation of all variational derivatives. Also, it covers virtually all equations that arise from applications. As an important special case, let us mention the dependence of the coefficients on the convolutions $g * \mu_t$. This case occurs, e.g., in the Landau–Fokker–Planck model (7.54) of grazing collisions.

Equation (6.62) can also be written in a mild form. Since the semigroup generated by $\sigma\Delta$ takes measures to $L_1(\mathbb{R}^d)$, the solutions to the mild equation should belong to $L_1(\mathbb{R}^d)$ for $t > 0$ (even if $Y$ does not), or in other words, they should be measures with densities. In terms of the densities $\phi_t$, the mild form of (6.62) reads

$$\phi_t(x) = \int G_{t\sigma}(x-y)\,Y(dy) + \int_0^t ds \int G_{(t-s)\sigma}(x-y)\big[V_s(\phi_s, y)\phi_s(y) - (\nabla, b_s(\phi_s, y)\phi_s(y))\big]\,dy, \tag{6.63}$$

where $V_t(\phi, y)$ and $b_t(\phi, y)$ denote the values of the functionals $V_t$ and $b_t$ on a measure with the density $\phi$ (with some abuse of notation). Integrating by parts yields

$$\phi_t(x) = \int G_{t\sigma}(x-y)\,Y(dy) + \int_0^t ds \int G_{(t-s)\sigma}(x-y)V_s(\phi_s, y)\phi_s(y)\,dy + \int_0^t ds \int \Big(\frac{\partial}{\partial y}G_{(t-s)\sigma}(x-y),\ b_s(\phi_s, y)\phi_s(y)\Big)\,dy. \tag{6.64}$$

This is the most convenient form, since it makes sense even if no smoothness is assumed on $b$. Therefore, one can expect that the well-posedness for the mild equation (6.64) can be proved under weaker assumptions than for (6.61). This is actually true, as we will show in the following theorem that represents the main well-posedness result for equations of the type (6.61). In order to work with the equations (6.63) and (6.64), it is often instructive to write down the weak form of this mild equation, which can also be considered the mild form of the weak equation (6.61):

$$(f, \phi_t) = (T_t f, Y) + \int_0^t ds\,\big([V_s(\phi_s,.) + (b_s(\phi_s,.), \nabla)]T_{t-s} f,\ \phi_s\big), \tag{6.65}$$

where $T_t$ is the heat semigroup generated by $\sigma\Delta/2$.

Theorem 6.6.1. Let $\sigma > 0$ and $b_t(\xi,.) = \{b^j_t(\xi,.)\}$, $V_t(\xi,.)$ be measurable bounded real or complex functions, such that

$$V = \sup_{y,\xi,t} |V_t(\xi, y)| < \infty, \quad b = \sup_{y,\xi,t} \sum_j |b^j_t(\xi, y)| < \infty, \tag{6.66}$$

and let

$$\sup_{x,t} |b_t(\xi, x) - b_t(\eta, x)| \le L_A\|\xi - \eta\|_{\mathcal{M}(\mathbb{R}^d)}, \quad \sup_{x,t} |V_t(\xi, x) - V_t(\eta, x)| \le L_A\|\xi - \eta\|_{\mathcal{M}(\mathbb{R}^d)}. \tag{6.67}$$

Then:
(i) For any $T > 0$ and any $Y$ having a density $\phi$, the mild equation (6.64) has the unique bounded solution $\phi_. \in C([0,T], L_1(\mathbb{R}^d))$. This solution has the bound

$$\|\phi_t\|_{L_1(\mathbb{R}^d)} \le \|Y\|_{\mathcal{M}(\mathbb{R}^d)} E_{1/2}\big(\Gamma(1/2)(V\sqrt{t} + b)\bar\sigma\sqrt{t}\big), \tag{6.68}$$

and for any two solutions $\phi^1_t$ and $\phi^2_t$ with the initial conditions $Y^1$ and $Y^2$, respectively, the estimate

$$\|\phi^1_t - \phi^2_t\|_{L_1(\mathbb{R}^d)} \le \|Y^1 - Y^2\|_{L_1(\mathbb{R}^d)} E_{1/2}\big(\kappa(T)\Gamma(1/2)\sqrt{t}\big) \tag{6.69}$$

holds, where $E_{1/2}$ is the Mittag-Leffler function, $\bar\sigma = \max(1, \sigma^{-1/2})$ and $\kappa(T)$ is given by (6.72) below.
(ii) For any $T > 0$ and $Y \in \mathcal{M}(\mathbb{R}^d)$, the mild equation (6.64) has the unique solution $\phi_. \in C((0,T], L_1(\mathbb{R}^d))$ such that $\phi_t \to Y$ weakly, as $t \to 0$. In this case, the estimate (6.69) can be rewritten as

$$\|\phi^1_t - \phi^2_t\|_{L_1(\mathbb{R}^d)} \le \|Y^1 - Y^2\|_{\mathcal{M}(\mathbb{R}^d)} E_{1/2}\big(\kappa(T)\Gamma(1/2)\sqrt{t}\big). \tag{6.70}$$

(iii) Solutions $\phi_t$ to the mild equation (6.64) also solve the weak equation (6.61).
(iv) If additionally $b$ and $V$ are real and $V$ is non-positive, then for any positive $\phi \in L_1(\mathbb{R}^d)$ and any positive $Y \in \mathcal{M}^+(\mathbb{R}^d)$, the solution $\phi_t$ is also non-negative, we have $\|\phi_t\|_{L_1(\mathbb{R}^d)} \le \|Y\|_{\mathcal{M}(\mathbb{R}^d)}$, and (6.70) holds with

$$\kappa(T) = \Gamma(1/2)\bar\sigma\big[(V\sqrt{T} + b) + L_A(\sqrt{T} + 1)\big]\|Y\|.$$

Proof. (i) In order to be able to apply Theorem 2.1.3, we first need to get a bound for all iterations of the mapping $\Phi_Y(\phi_.)(t)$ defined as the r.h.s. of (6.64). Since $1 \le \sqrt{t}/\sqrt{t-s}$ and $\sqrt{2/\pi} \le 1$ holds, using (4.23) yields

$$\|[\Phi_Y(\phi_.)](t)\|_{L_1(\mathbb{R}^d)} \le \|Y\|_{L_1(\mathbb{R}^d)} + \Gamma(1/2)(V\sqrt{t} + b)\bar\sigma \int_0^t (t-s)^{-1/2}\|\phi_s\|_{L_1(\mathbb{R}^d)}\,ds.$$

Iterating and using the definition of the Mittag-Leffler function yields

$$\|[\Phi^n_Y(Y)](t)\|_{L_1(\mathbb{R}^d)} \le \|Y\|_{L_1(\mathbb{R}^d)} E_{1/2}\big(\Gamma(1/2)(V\sqrt{t} + b)\bar\sigma\sqrt{t}\big), \tag{6.71}$$

which implies (6.68). Again using (4.23) yields

$$\|[\Phi_Y(\phi^1_.)](t) - [\Phi_Y(\phi^2_.)](t)\|_{L_1(\mathbb{R}^d)} \le \Gamma(1/2)(V\sqrt{T} + b)\bar\sigma^{-1} \int_0^t (t-s)^{-1/2}\|\phi^1_. - \phi^2_.\|_{C([0,s],L_1(\mathbb{R}^d))}\,ds$$
$$\qquad + \Gamma(1/2)L_A(\sqrt{T} + 1)\bar\sigma^{-1} \int_0^t (t-s)^{-1/2}\|\phi^1_. - \phi^2_.\|_{C([0,s],L_1(\mathbb{R}^d))}\|\phi^1_.\|_{C([0,T],L_1(\mathbb{R}^d))}\,ds.$$

Therefore,

$$\|[\Phi_Y(\phi^1_.)](t) - [\Phi_Y(\phi^2_.)](t)\|_{L_1(\mathbb{R}^d)} \le \kappa(T)\int_0^t (t-s)^{-1/2}\|\phi^1_. - \phi^2_.\|_{C([0,s],L_1(\mathbb{R}^d))}\,ds$$

with

$$\kappa(T) = \Gamma(1/2)\bar\sigma\Big[(V\sqrt{T} + b) + L_A(\sqrt{T} + 1)E_{1/2}\big(\Gamma(1/2)(V\sqrt{T} + b)\bar\sigma\sqrt{T}\big)\Big]. \tag{6.72}$$

Since $\|[\Phi_{Y_1}(\phi_.)](t) - [\Phi_{Y_2}(\phi_.)](t)\| \le \|Y_1 - Y_2\|$, the result now follows from Theorem 2.1.3 with $\omega = 1/2$ and the Banach space $L_1(\mathbb{R}^d)$.

(ii) Theorem 2.1.3 can be applied for finding the unique solution in the Banach space $\mathcal{M}(\mathbb{R}^d)$. However, the curve $\Phi_Y(\mu_.)(t)$ is only weakly continuous in $t$ at $t = 0$, because the same applies to the first term in (6.64). It follows from (6.64) that $\Phi_Y(\mu_.)(t)$ has a density for all $t > 0$.

(iii) If $\phi_t$ solves (6.64), then it solves (6.65) for any $f \in C(\mathbb{R}^d)$. If $f \in C^2_\infty(\mathbb{R}^d)$, then we can differentiate (6.65) and get

$$\frac{d}{dt}(f, \phi_t) = \frac{1}{2}\sigma(\Delta T_t f, Y) + \frac{1}{2}\int_0^t ds\,\big([V_s(\phi_s,.) + (b_s(\phi_s,.), \nabla)]\sigma\Delta T_{t-s} f,\ \phi_s\big) + \big([V_t(\phi_t,.) + (b_t(\phi_t,.), \nabla)]f,\ \phi_t\big)$$
$$\qquad = \frac{1}{2}(\sigma\Delta f, \phi_t) + \big([V_t(\phi_t,.) + (b_t(\phi_t,.), \nabla)]f,\ \phi_t\big),$$

as required, where (6.63) was used in the last equation.

(iv) By Proposition 6.5.2(i), solutions to (6.65) preserve the positivity and do not increase the norm. □
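The bounds (6.68) to (6.72) are all expressed through the Mittag-Leffler function $E_{1/2}$. As a numerical aid, the following Python sketch evaluates it from the power series $E_{1/2}(z) = \sum_{k \ge 0} z^k/\Gamma(k/2 + 1)$ and compares the result with the classical closed form $E_{1/2}(z) = e^{z^2}\,\mathrm{erfc}(-z)$. The function names, the truncation level and the helper `l1_bound` (which simply evaluates the right-hand side of (6.68)) are assumptions made for illustration.

```python
import math

def mittag_leffler_half(z, terms=200):
    # Truncated power series E_{1/2}(z) = sum_{k>=0} z^k / Gamma(k/2 + 1).
    # Terms are formed through logarithms so that they stay representable.
    if z == 0.0:
        return 1.0
    total = 0.0
    for k in range(terms):
        sign = -1.0 if (z < 0 and k % 2 == 1) else 1.0
        total += sign * math.exp(k * math.log(abs(z)) - math.lgamma(k / 2 + 1))
    return total

def mittag_leffler_half_closed(z):
    # Closed form E_{1/2}(z) = exp(z^2) * erfc(-z).
    return math.exp(z * z) * math.erfc(-z)

def l1_bound(norm_y, V, b, sigma, t):
    # Right-hand side of (6.68) with sigma_bar = max(1, sigma^{-1/2}).
    sigma_bar = max(1.0, sigma ** -0.5)
    arg = math.gamma(0.5) * (V * math.sqrt(t) + b) * sigma_bar * math.sqrt(t)
    return norm_y * mittag_leffler_half(arg)

if __name__ == "__main__":
    for z in (0.5, 1.0, 2.0):
        print(z, mittag_leffler_half(z), mittag_leffler_half_closed(z))
```

The series form mirrors how the bound arises in the proof: each iteration of a weakly singular integral inequality with kernel $(t-s)^{-1/2}$ contributes one further term with $\Gamma(k/2+1)$ in the denominator.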

6.7 Simplest nonlinear diffusions: regularity and sensitivity

The regularity of the solutions can be enhanced by increasing the regularity of $b$:

Theorem 6.7.1.
(i) Under the assumptions of Theorem 6.6.1, assume additionally that $b_t(\xi,.) \in C^1(\mathbb{R}^d)$ uniformly for $\xi \in \mathcal{M}^+_{\le\lambda}(\mathbb{R}^d)$ and that

$$\sup_x \Big|\frac{\partial b_t(\xi, x)}{\partial x} - \frac{\partial b_t(\eta, x)}{\partial x}\Big| \le L_A\|\xi - \eta\|_{\mathcal{M}(\mathbb{R}^d)}. \tag{6.73}$$

Then for any $Y$ with the density $\phi \in H^1_1(\mathbb{R}^d)$, the solution $\phi_t$ to (6.64) belongs to $C([0,T], H^1_1(\mathbb{R}^d))$, and for any two solutions $\phi^1_t$ and $\phi^2_t$ with the initial conditions $\phi^1$ and $\phi^2$, respectively, the estimate

$$\|\phi^1_t - \phi^2_t\|_{H^1_1(\mathbb{R}^d)} \le c(t)\|\phi^1 - \phi^2\|_{H^1_1(\mathbb{R}^d)} \tag{6.74}$$

holds, with continuous functions $c(t)$ expressed again in terms of the Mittag-Leffler function, like in (6.70).


(ii) Moreover, the solution is smoothing in the following sense: For any $Y \in \mathcal{M}(\mathbb{R}^d)$, the solution $\phi_t$ to (6.64) belongs to $H^1_1(\mathbb{R}^d)$ for all $t > 0$ and we have

$$\|\phi_t\|_{H^1_1(\mathbb{R}^d)} \le \kappa t^{-1/2}\|Y\|_{\mathcal{M}(\mathbb{R}^d)} \tag{6.75}$$

with a constant $\kappa$.

Proof. (i) This is similar to the proof of Theorem 6.6.1, although it is more convenient to work with the representation (6.63). Under the present assumptions, the expression in the square brackets in (6.63) belongs to $L_1(\mathbb{R}^d)$ whenever $\phi_s \in H^1_1(\mathbb{R}^d)$. Therefore, the r.h.s. of (6.63) becomes bounded in $H^1_1(\mathbb{R}^d)$ due to the smoothing property of the heat semigroup. Now we can apply Theorem 2.1.3 in the Banach space $B = H^1_1(\mathbb{R}^d)$.

(ii) Suppose first that $\phi \in H^1_1(\mathbb{R}^d)$. Then $\|\phi_t\|_{H^1_1(\mathbb{R}^d)}$ is bounded. By (6.63), we have

$$\|\phi_t\|_{H^1_1(\mathbb{R}^d)} \le \frac{c}{\sqrt{t}}\|\phi\|_{L_1(\mathbb{R}^d)} + \int_0^t \frac{c}{\sqrt{t-s}}\|\phi_s\|_{H^1_1(\mathbb{R}^d)}\,ds,$$

with a constant $c$ that depends on $T$, $\sigma$ and the bounds for $V$ and $b$. Consequently,

$$\sup_{s \le t}\big(\sqrt{s}\,\|\phi_s\|_{H^1_1(\mathbb{R}^d)}\big) \le c\|\phi\|_{L_1(\mathbb{R}^d)} + \sqrt{t}\,\sup_{s \le t}\big(\sqrt{s}\,\|\phi_s\|_{H^1_1(\mathbb{R}^d)}\big)\int_0^t \frac{c}{\sqrt{t-s}\,\sqrt{s}}\,ds$$
$$\qquad = c\|\phi\|_{L_1(\mathbb{R}^d)} + \sqrt{t}\,\sup_{s \le t}\big(\sqrt{s}\,\|\phi_s\|_{H^1_1(\mathbb{R}^d)}\big)\int_0^1 \frac{c}{\sqrt{1-u}\,\sqrt{u}}\,du.$$

Therefore, for sufficiently small $t$, we have

$$\sup_{s \le t}\big(\sqrt{s}\,\|\phi_s\|_{H^1_1(\mathbb{R}^d)}\big) \le c\kappa\|\phi\|_{L_1(\mathbb{R}^d)},$$

with $\kappa = (1 - \sqrt{t}\,cB(1/2, 1/2))^{-1}$, which implies (6.75) for an initial $\phi \in H^1_1(\mathbb{R}^d)$. Next, let $\phi \in L_1(\mathbb{R}^d)$. Then it can be approximated in $L_1$ by a sequence $\phi^n$ of elements of $H^1_1(\mathbb{R}^d)$ with the same norm in $L_1$. Applying (6.75) to these approximations yields for any $f \in C^1_\infty(\mathbb{R}^d)$ the estimate

$$|(\phi_t, f')| = \lim_{n\to\infty} |(\phi^n_t, f')| \le \kappa t^{-1/2}\|\phi\|_{L_1(\mathbb{R}^d)}\|f\|_{C(\mathbb{R}^d)}.$$

Therefore, $\phi_t$ has a generalized derivative for $t > 0$, which is a signed measure with a norm that is bounded by the r.h.s. of (6.75). But it is seen from (6.64) that the derivative of $\phi_t$ is in fact a function, not just a measure. Finally, for $Y \in \mathcal{M}(\mathbb{R}^d)$, the solution has a density $\phi_t \in L_1(\mathbb{R}^d)$ for any $t > 0$, and the previous argument can be applied to the solution $\phi_t$ considered as the solution to the Cauchy problem starting at any time $t > 0$. □
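The constant $B(1/2, 1/2)$ in the proof above comes from the integral $\int_0^t ds/(\sqrt{t-s}\sqrt{s})$, which does not depend on $t$ (substitute $s = tu$) and equals $\Gamma(1/2)^2/\Gamma(1) = \pi$. The short Python sketch below evaluates this constant and makes the phrase "for sufficiently small $t$" concrete: the constant $\kappa$ is finite only while $\sqrt{t}\,c\,\pi < 1$. The function names and the sample value of `c` are illustrative assumptions.

```python
import math

def beta(p, q):
    # Euler Beta function B(p, q) = Gamma(p) Gamma(q) / Gamma(p + q).
    return math.gamma(p) * math.gamma(q) / math.gamma(p + q)

def kappa(t, c):
    # kappa = (1 - sqrt(t) * c * B(1/2, 1/2))^{-1}; defined only while the bracket is positive.
    gap = 1.0 - math.sqrt(t) * c * beta(0.5, 0.5)
    if gap <= 0.0:
        raise ValueError("t is too large for this value of c")
    return 1.0 / gap

def max_time(c):
    # Supremum of the times t for which kappa(t, c) is finite.
    return 1.0 / (c * beta(0.5, 0.5)) ** 2

if __name__ == "__main__":
    c = 0.5  # hypothetical constant, for illustration only
    print(beta(0.5, 0.5), max_time(c), kappa(0.5 * max_time(c), c))
```

For larger times the argument is repeated on successive short intervals, which is how the proof passes from small $t$ to arbitrary $t > 0$ by restarting the Cauchy problem.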


As a consequence of Theorems 2.15.1 and 6.6.1, we now prove the sensitivity result for the solutions to (6.64) and (6.61). For this purpose, we construct the variational derivatives of the solutions to nonlinear diffusion with respect to the initial data as Green functions (i.e., solutions with the Dirac initial data) of the Cauchy problem for the corresponding linearized equation (equations in variations).

Theorem 6.7.2. Under the assumptions of Theorem 6.6.1(iv), let the variational derivatives of $V_t(\mu, y)$ and $b_t(\mu, y)$ with respect to $\mu$ be well defined and locally bounded, i.e.,

$$\sup_{y,z,s}\Big|\frac{\delta V_s(\mu, y)}{\delta\mu(z)}\Big| \le R(\lambda), \quad \sup_{y,z,s}\sum_j\Big|\frac{\delta b^j_s(\mu, y)}{\delta\mu(z)}\Big| \le R(\lambda), \tag{6.76}$$

for $\mu \in \mathcal{M}^+_{\le\lambda}(\mathbb{R}^d)$ and constants $R(\lambda)$. Moreover, let these variational derivatives be locally Lipschitz-continuous, i.e.,

$$\sup_{y,z,s}\Big|\frac{\delta V_s(\mu_1, y)}{\delta\mu(z)} - \frac{\delta V_s(\mu_2, y)}{\delta\mu(z)}\Big| \le L(\lambda)\|\mu_1 - \mu_2\|_{\mathcal{M}(\mathbb{R}^d)}, \tag{6.77}$$

$$\sup_{y,z,s}\sum_j\Big|\frac{\delta b^j_s(\mu_1, y)}{\delta\mu(z)} - \frac{\delta b^j_s(\mu_2, y)}{\delta\mu(z)}\Big| \le L(\lambda)\|\mu_1 - \mu_2\|_{\mathcal{M}(\mathbb{R}^d)}, \tag{6.78}$$

for $\mu \in \mathcal{M}_{\le\lambda}(\mathbb{R}^d)$ and constants $L(\lambda)$.

Then the mapping $\phi = \phi_0 \mapsto \phi_t$ with the initial data $\phi \in L_1(\mathbb{R}^d)$ (or, more generally, $Y \mapsto \phi_t$ with the initial data $Y \in \mathcal{M}(\mathbb{R}^d)$), solving (6.64) according to Theorem 6.6.1, belongs to $C^1_{luc}(L_1(\mathbb{R}^d), L_1(\mathbb{R}^d))$ (or, more generally, to $C^1_{luc}(\mathcal{M}(\mathbb{R}^d), L_1(\mathbb{R}^d))$), for all $t > 0$, and $\xi_t(x) = \frac{\delta\phi_t(Y)}{\delta Y(x)}$ is the unique solution to the equation

$$\xi_t(x; z) = G_{t\sigma}(z - x) + \int_0^t ds \int G_{(t-s)\sigma}(z - y)\Big[\int \frac{\delta V_s(\phi_s, y)}{\delta\phi_s(w)}\xi_s(x; w)\phi_s(y)\,dw + V_s(\phi_s, y)\xi_s(x; y)\Big]\,dy$$
$$\qquad + \int_0^t ds \int \Big(\frac{\partial}{\partial y}G_{(t-s)\sigma}(z - y),\ \int \frac{\delta b_s(\phi_s, y)}{\delta\phi_s(w)}\xi_s(x; w)\phi_s(y)\,dw + b_s(\phi_s, y)\xi_s(x; y)\Big)\,dy. \tag{6.79}$$

It satisfies the Dirac initial condition $\xi_0(x;.) = \delta_x$ and has the bound

$$\|\xi_t(x,.)\|_{L_1(\mathbb{R}^d)} \le E_{1/2}\big[\Gamma(1/2)\sqrt{t}\,\big((\lambda R(\lambda) + V)\sqrt{t} + \lambda R(\lambda) + b\big)\big]. \tag{6.80}$$

Finally, $\xi_t$ also solves the weak equation obtained by formal differentiation of (6.61):

$$\frac{d}{dt}(f, \xi_t(x;.)) = \Big(\frac{1}{2}\sigma\Delta f + (b_t(\phi_t,.), \nabla)f + V_t(\phi_t,.)f,\ \xi_t(x;.)\Big) + \iint \frac{\delta V_t(\phi_t, y)}{\delta\phi_t(w)}\xi_t(x; w) f(y)\phi_t(y)\,dy\,dw$$
$$\qquad + \iint \Big(\frac{\delta b_t(\phi_t, y)}{\delta\phi_t(w)}\xi_t(x; w),\ \nabla f(y)\Big)\phi_t(y)\,dy\,dw. \tag{6.81}$$

Remark 108. Sensitivity still holds under Theorem 6.6.1(i) to (iii), although with more complicated (and more rapidly growing) bounds for $\xi_t$, namely with $\lambda$ not being $\|Y\|$ as in the proof below, but being given by the r.h.s. of (6.68).

Proof. We are in the setting of Theorem 2.15.1 with $\xi_t$ satisfying an equation of the type (2.128) with $\xi = \delta_x$, $G^{t,0}\xi = G_{t\sigma}(z - x)$ and $D\Omega_{t,s}$ given by the expression under the integral in (6.79). Therefore, we can estimate the norms as in the proof of the above Theorem 6.6.1 and get

$$\|D\Omega_{t,s}(\phi)\|_{\mathcal{M}(\mathbb{R}^d)\to L_1(\mathbb{R}^d)} \le (\lambda R(\lambda) + V) + (t-s)^{-1/2}\bar\sigma(\lambda R(\lambda) + b) \le (t-s)^{-1/2}\bar\sigma\big[\lambda R(\lambda) + b + \sqrt{t}(\lambda R(\lambda) + V)\big],$$

with $\lambda = \|Y\|_{\mathcal{M}(\mathbb{R}^d)}$, which yields the first estimate in (2.205). Furthermore,

$$\|D\Omega_{t,s}(\phi^1) - D\Omega_{t,s}(\phi^2)\|_{\mathcal{M}(\mathbb{R}^d)\to L_1(\mathbb{R}^d)} \le (\lambda L(\lambda) + R(\lambda))\|\phi^1 - \phi^2\|_{C([0,s],L_1(\mathbb{R}^d))}\big(1 + \bar\sigma(t-s)^{-1/2}\big)$$
$$\qquad \le \bar\sigma(\lambda L(\lambda) + R(\lambda))\|\phi^1 - \phi^2\|_{C([0,s],L_1(\mathbb{R}^d))}(1 + \sqrt{t}),$$

which yields the second estimate in (2.205). Therefore, the application of Theorem 2.15.1 completes the proof. □

The weak and mild representations (6.81) and (6.79) of the derivatives with respect to the initial conditions can be used for deriving different kinds of regularity for these derivatives. This will be shown in the next section for a more general model with nontrivial diffusion coefficients.
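The variational derivatives entering (6.76) to (6.78) are explicit for integral monomials of the type (6.56). For $k = 2$ and $F(\mu) = \iint g(x, y_1, y_2)\mu(dy_1)\mu(dy_2)$, one has $\frac{\delta F}{\delta\mu(z)} = \int g(x, z, y)\mu(dy) + \int g(x, y, z)\mu(dy)$, so that $|\delta F/\delta\mu(z)| \le 2\|g\|_C\|\mu\|$. The Python sketch below checks this formula on a discrete measure $\mu = \sum_i w_i\delta_{y_i}$ against the difference quotient $(F(\mu + \varepsilon\delta_z) - F(\mu))/\varepsilon$. The kernel `g`, the atoms and all names are hypothetical and serve only as an illustration.

```python
import math

def g(x, y1, y2):
    # Hypothetical smooth bounded kernel (illustration only).
    return math.cos(x - y1) * math.exp(-(y1 - y2) ** 2)

def monomial(atoms, weights, x):
    # F(mu) = double integral of g(x, y1, y2) mu(dy1) mu(dy2) for mu = sum_i w_i delta_{y_i}.
    return sum(wi * wj * g(x, yi, yj)
               for yi, wi in zip(atoms, weights)
               for yj, wj in zip(atoms, weights))

def variational_derivative(atoms, weights, x, z):
    # delta F / delta mu(z) = int g(x, z, y) mu(dy) + int g(x, y, z) mu(dy).
    return sum(w * (g(x, z, y) + g(x, y, z)) for y, w in zip(atoms, weights))

def difference_quotient(atoms, weights, x, z, eps):
    # (F(mu + eps * delta_z) - F(mu)) / eps, computed by adding one atom of weight eps.
    perturbed = monomial(atoms + [z], weights + [eps], x)
    return (perturbed - monomial(atoms, weights, x)) / eps

if __name__ == "__main__":
    atoms, weights = [0.0, 0.5, 1.2], [0.2, 0.5, 0.3]
    print(variational_derivative(atoms, weights, 0.3, 0.7),
          difference_quotient(atoms, weights, 0.3, 0.7, 1e-6))
```

The two numbers differ by the second-order term $\varepsilon\,g(x, z, z)$, which is why the quotient converges to the variational derivative as $\varepsilon \to 0$.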

6.8 McKean–Vlasov equations

Nonlinear diffusion equations represent a general class of diffusion-type equations with coefficients that depend on an unknown function. In other words, they are evolutionary equations of the second order, which are linear with respect to the derivatives:

$$\frac{\partial u_t}{\partial t} = \frac{1}{2}(a_t(u_t, x)\nabla, \nabla)u_t(x) + (b_t(u_t, x), \nabla)u_t(x) + V_t(u_t, x)u_t, \tag{6.82}$$

or equivalently

$$\frac{\partial u_t}{\partial t} = \frac{1}{2}\mathrm{tr}\Big(a_t(u_t, x)\frac{\partial^2 u_t(x)}{\partial x^2}\Big) + \Big(b_t(u_t, x), \frac{\partial u_t(x)}{\partial x}\Big) + V_t(u_t, x)u_t, \tag{6.83}$$

or more explicitly

$$\frac{\partial u_t}{\partial t} = \frac{1}{2}\sum_{i,j} a_{t,ij}(u_t, x)\frac{\partial^2 u_t(x)}{\partial x_i\partial x_j} + \sum_j b^j_t(u_t, x)\frac{\partial u_t(x)}{\partial x_j} + V_t(u_t, x)u_t(x), \tag{6.84}$$

with given functions $a_t(u, x)$, $b_t(u, x)$, $V_t(u, x)$. When the coefficient-functions $a$, $b$, $V$ are allowed to be complex-valued, we speak of complex nonlinear diffusion equations or (complex) nonlinear Schrödinger equations. This section is devoted to the sensitivity of nonlinear diffusions with respect to the initial data. It also provides explicit bounds for the norms of the derivatives in various regularity classes.

Nonlinear diffusion equations and nonlinear Schrödinger equations are basic equations that represent the evolution of the large-number-of-particles limit for systems of interacting classical or quantum particles. Their derivation from interacting particle systems will be sketched in Section 7.8. So far, we have analysed the special case when $a$ is the identity matrix. If the dependence of the functions $b$, $V$ on $u$ is simply a dependence on the values of $u$ at given points, say $b(u, x) = b(u(x), x)$, then the equations (6.84) can be considered special cases of equation (6.1), with $H$ depending linearly on $p$ and $q$. Therefore, its main properties can be obtained in the framework of the HJB theory, for instance from Theorems 6.1.1, 6.1.4 and 6.1.5. Of course, the linearity of $H$ allows for a weakening of the assumptions on its regularity, e.g., by choosing $b$ and $V$ to be measures of dimensionality $\alpha \in (d-2, d]$, in the spirit of Theorem 4.8.1.

Returning to the general case (6.82), we shall always assume that for any continuous function $u_t(x)$ the second-order operator $(a_t(u_t(.), x)\nabla, \nabla)/2$ generates a propagator $U^{t,s}_{a(u)}$ in $C_\infty(\mathbb{R}^d)$ on the invariant domain $C^2_\infty(\mathbb{R}^d)$. In this case, equation (6.82) with the initial condition $u_0$ can be rewritten in a mild form:

$$u_t = U^{t,0}_{a(u)}u_0 + \int_0^t U^{t,s}_{a(u)}\big((b(u_s, x), \nabla)u_s(x) + V(u_s, x)u_s\big)\,ds. \tag{6.85}$$

Very often, the nonlinear equations arise in their weak form, as equations of the type (6.32):

$$\frac{d}{dt}(f, \mu_t) = (A_t(\mu_t)f, \mu_t), \quad \mu_0 = Y, \quad f \in D, \tag{6.86}$$

where

$$A_t(\mu)f(x) = \frac{1}{2}(a_t(\mu, x)\nabla, \nabla)f(x) + (b_t(\mu, x), \nabla)f(x) + V_t(\mu, x)f(x). \tag{6.87}$$

The strong form of equation (6.86) for measures $\mu$ with densities $\phi$ reads

$$\frac{\partial\phi}{\partial t} = \frac{1}{2}(\nabla, \nabla(a_t(\phi, x)\phi(x))) - (\nabla, b_t(\phi, x)\phi(x)) + V_t(\phi, x)\phi(x), \tag{6.88}$$

where we denote the functions of measures and of their densities by the same letters, with some abuse of notation. More explicitly, equation (6.88) can be rewritten as

$$\frac{\partial\phi}{\partial t} = \frac{1}{2}\sum_{i,j}\frac{\partial^2}{\partial x_i\partial x_j}(a_{t,ij}(\phi, x)\phi) - \sum_j\frac{\partial}{\partial x_j}(b^j_t(\phi, x)\phi(x)) + V_t(\phi, x)\phi(x), \tag{6.89}$$

which can of course be rewritten in the general form (6.84), although with some other functions $a$, $b$ and $V$.

Nonlinear diffusion equations are often referred to as McKean–Vlasov diffusions, especially when the coefficients $a$, $b$, $V$ depend on the unknown function $u$ via moments of the type $\int g(x_1, \dots, x_k)u(x_1)\cdots u(x_k)\,dx_1\cdots dx_k$. As we already mentioned, the analysis of diffusions crucially depends on whether the major second-order term is degenerate or not. We shall first analyse the non-degenerate case. In this case, the strong form and its mild version seem to be most convenient for the analysis. Even if the initial condition is a non-regular object, say a measure, the smoothing property of non-degenerate diffusions makes it a smooth function at any positive time.

If the matrix $a(u, x)$ does not depend on $u$, then equation (6.82) belongs to the class of equations of the type

$$\frac{\partial u_t}{\partial t} = \frac{1}{2}(a_t(x)\nabla, \nabla)u_t(x) + (b_t(u_t, x), \nabla u_t(x)) + V_t(u_t, x)u_t, \tag{6.90}$$

where $a$ is a standard second-order operator. Working with this equation makes the analysis easier, as compared to the general case. In fact, the analysis of this equation with non-degenerate $a$ can be reduced to the case of $a = \Delta$ by Proposition 5.6.2. For the sake of simplicity, we shall mostly concentrate on equations of the type (6.90) and use such a reduction. More precisely, we shall work with the corresponding weak equation (6.86) and the corresponding strong form

$$\frac{\partial\phi}{\partial t} = \frac{1}{2}(\nabla, \nabla(a_t(x)\phi(x))) - (\nabla, b(\phi, x)\phi(x)) + V_t(\phi, x)\phi(x). \tag{6.91}$$

References to papers that treat more general cases are supplied in the last Section. Theorem 6.6.1 can be extended to the following result:

Theorem 6.8.1. Let $b_t(\xi,.) = \{b^j_t(\xi,.)\}$, $V_t(\xi,.)$ be measurable bounded real functions with $V$ non-positive, satisfying (6.66) and (6.67). Let $a_t(x)$ be a matrix-valued function, with elements belonging to $C^2(\mathbb{R}^d)$ (uniformly in $t$), which is uniformly elliptic, so that

$$m(\xi, \xi) \le (a_t(x)\xi, \xi) \le m^{-1}(\xi, \xi), \quad \sum_{ij}\|a_{t,ij}(x)\|_{C^2(\mathbb{R}^d)} \le M < \infty, \tag{6.92}$$

with some positive constants $m$, $M$.


Then for any $T > 0$ and any $Y$ having a non-negative density $\phi$ (respectively without density), the mild equation

$$\phi_t(x) = \int G_t(x, y)\,Y(dy) + \int_0^t ds \int G_{(t-s)}(x, y)V_s(\phi_s, y)\phi_s(y)\,dy + \int_0^t ds \int \Big(\frac{\partial}{\partial y}G_{(t-s)}(x, y),\ b_s(\phi_s, y)\phi_s(y)\Big)\,dy \tag{6.93}$$

has a unique bounded non-negative solution $\phi_. \in C([0,T], L_1(\mathbb{R}^d))$ (respectively a unique solution $\phi_. \in C((0,T], L_1(\mathbb{R}^d))$) such that $\phi_t \to Y$ weakly, as $t \to 0$, so that $\|\phi_t\|_{L_1(\mathbb{R}^d)} \le \|Y\|_{\mathcal{M}(\mathbb{R}^d)}$, and for any two solutions $\phi^1_t$ and $\phi^2_t$ with the initial conditions $Y^1$ and $Y^2$, respectively, the estimate

$$\|\phi^1_t - \phi^2_t\|_{L_1(\mathbb{R}^d)} \le \|Y^1 - Y^2\|_{\mathcal{M}(\mathbb{R}^d)} E_{1/2}\big(\kappa(T)\sqrt{t}\big) \tag{6.94}$$

holds, where

$$\kappa(T) = \bar C\big[(V\sqrt{T} + b) + L_A(\sqrt{T} + 1)\big]\|Y\|,$$

with a constant $\bar C$ depending on $m$, $M$, $T$. Finally, solutions $\phi_t$ to the mild equation (6.93) also solve the corresponding weak equation that extends (6.58).

Proof. The solution $\phi_t$ is the fixed point of the mapping $\Phi_Y$ that extends the mapping arising from (6.64), i.e.,

$$[\Phi_Y(\phi_.)](t)(x) = \int G_t(x, y)\,Y(dy) + \int_0^t ds \int G_{(t-s)}(x, y)V_s(\phi_s, y)\phi_s(y)\,dy + \int_0^t ds \int \Big(\frac{\partial}{\partial y}G_{(t-s)}(x, y),\ b_s(\phi_s, y)\phi_s(y)\Big)\,dy. \tag{6.95}$$

By Proposition 5.6.2, there exist constants $\sigma$ and $C$ depending only on $m$, $M$, $T$ such that the Green function $G_{t,s}(x, y)$ of the Cauchy problem for the operator $\phi(x) \mapsto \frac{1}{2}(\nabla, \nabla(a_t(x)\phi(x)))$ is differentiable in $x$ and $y$ and satisfies the estimate

$$G_{t,s}(x, y) \le C G_{\sigma(t-s)}(x - \xi), \quad 0 < s, t < T, \tag{6.96}$$

$$\max\Big(\Big|\frac{\partial}{\partial y}G_{t,s}(x, y)\Big|, \Big|\frac{\partial}{\partial x}G_{t,s}(x, y)\Big|\Big) \le C t^{-1/2} G_{\sigma(t-s)}(x - \xi), \quad 0 < s, t < T. \tag{6.97}$$

(Recall that this Green function is just the transpose kernel to the Green function $G_{s,t}(y, x)$ of the backward Cauchy problem for the dual operator $\frac{1}{2}(a_t(x)\nabla, \nabla)$.) Therefore, all estimates used in the proof of Theorem 6.6.1 remain valid, if the additional constant $C$ is included. □
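The factor $t^{-1/2}$ in (6.97), and with it the weakly singular kernel $(t-s)^{-1/2}$ in the fixed-point estimates, can already be seen for the plain heat kernel. In one dimension, and assuming the convention that $G_t(x) = (2\pi t)^{-1/2}e^{-x^2/(2t)}$ is the kernel of the semigroup generated by $\Delta/2$, one has $\int |\partial_x G_t(x)|\,dx = 2G_t(0) = \sqrt{2/(\pi t)}$. The Python sketch below compares a midpoint-rule quadrature with this closed form. The function names, the truncation of the integration domain and the grid size are illustrative assumptions.

```python
import math

def heat_kernel(t, x):
    # One-dimensional Gaussian kernel of the semigroup generated by Delta/2 (assumed convention).
    return math.exp(-x * x / (2.0 * t)) / math.sqrt(2.0 * math.pi * t)

def heat_kernel_dx(t, x):
    # Derivative of the kernel with respect to x.
    return -x / t * heat_kernel(t, x)

def l1_norm_of_gradient(t, n=20000):
    # Midpoint rule for the integral of |dG_t/dx| over [-L, L] with L = 12 sqrt(t).
    half_width = 12.0 * math.sqrt(t)
    h = 2.0 * half_width / n
    return h * sum(abs(heat_kernel_dx(t, -half_width + (i + 0.5) * h)) for i in range(n))

def l1_norm_exact(t):
    # Closed form 2 G_t(0) = sqrt(2 / (pi t)).
    return math.sqrt(2.0 / (math.pi * t))

if __name__ == "__main__":
    for t in (0.1, 1.0, 4.0):
        print(t, l1_norm_of_gradient(t), l1_norm_exact(t))
```

Since $\int_0^t (t-s)^{-1/2}\,ds = 2\sqrt{t}$ is finite, this singularity is integrable, which is what allows the Mittag-Leffler-type bounds such as (6.94).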


Similarly, Theorems 6.7.1 and 6.7.2 and Proposition 6.8.3 have direct extensions. For instance, the following sensitivity result for McKean–Vlasov diffusion holds. Theorem 6.8.2. Under the assumptions of Theorem 6.8.1, let the variational derivatives of Vt (μ, y) and bt (μ, y) with respect μ be well defined, locally bounded and satisfy (6.76), (6.77) and (6.78). Then the mapping φ = φ0 → φt , with the initial data φ ∈ L1 (Rd ) (or, more generally, Y → φt with the initial data Y ∈ M(Rd )), solving (6.95) ac1 (L1 (Rd ), L1 (Rd )) (or, more generally, to cording to Theorem 6.8.1, belongs to Cluc t (Y ) 1 d d Cluc (M(R ), L1 (R ))), for all t > 0, and ξt (x) = δφ δY (x) is the unique solution to the equation ξt (x; z) = Gt (z, x) (6.98)   t δVs (φs , y) + ξs (x; w)φs (y) dw + Vs (φs , y)ξs (x; y) dy ds Gt−s (z, y) δφs (w) 0   t ∂ δbs (φs , y) Gt−s (z, y), ξs (x; w)φs (y) dw + bs (φs , y)ξs (x; y) dy. + ds ∂y δφs (w) 0 It satisfies the Dirac initial condition ξ0 = δx and has the bound ξt (x, .)L1 (Rd ) ≤ E1/2 [C(2t + 1)(2λR(λ) + V + b)],

(6.99)

with a constant C depending on m, M and T . Finally, ξt also solves the weak equation obtained by formal differentiation of (6.58):   1 d (f, ξt (x; .)) = (A(.)∇, ∇)f + (bt (φt , .), ∇)f + Vt (φt , .)f, ξt (x; .) dt 2 δVt (φt , y) ξt (x; w)f (y)φt (y) dydw (6.100) + δφt (w)   δbt (φt , y) ξt (x; w), ∇f (y) φt (y) dydw. + δφt (w) As already mentioned, the weak and mild representations (6.100) and (6.98), respectively, of the derivatives with respect to the initial conditions can be used to derive different kinds of regularity for these derivatives. In order to illustrate this claim, let us observe that the weak equation (6.81) shows that the evolution ξ → ξt of the directional derivatives of the solutions φt is dual to the backward evolution in C(Rd ) generated by the equation 1 (6.101) f˙t (z) = − (A(z)∇, ∇)ft (z) − (bt (φt , z), ∇)ft (z) − Vt (φt , z)ft (z)  2  δbt (φt , y) δVt (φt , y) ft (y)φt (y) dy − , ∇ft (y) φt (y) dy. − δφt (z) δφt (z)


Theorem 6.8.3. Under the assumptions of Theorem 6.8.2, let additionally bt (μ, .), Vt (μ, .),

δbt (μ, y) δVt (μ, y) , ∈ C 1 (Rd ) δμ(.) δμ(.)

d uniformly for bounded μ, so that for μ ∈ M+ ≤λ (R ) with any λ, we have

  sup bt (μ, .)C 1 (Rd ) + Vt (μ, .)C 1 (Rd ) t )  ) ) ) ) δbt (μ, y) ) ) δVt (μ, y) ) ) ) ) ) + sup ) + ≤ c1 (λ) δμ(.) )C 1 (Rd ) ) δμ(.) )C 1 (Rd ) t,y

(6.102)

with a continuous function c1 (λ). Then: 2 (Rd ) and it generates a backward (i) The equation (6.101) is well posed in C∞ t,s 1 propagator Φ acting strongly continuously in the spaces C∞ (Rd ), C∞ (Rd ) 2 d and C∞ (R ), so that √ C(m,M)(s−t) E1/2 [(V + b + 2R(λ))C(m, M ) s − t], Φt,s L(C∞ 1 (Rd )) ≤ e √ C(m,M)(s−t) Φt,s L(C∞ 2 (Rd )) ≤ e E1/2 [c1 (λ)C(m, M ) s − t], (6.103)

with a constant C(m, M ) depending only on m and M . Consequently, by duality, the weak equation (6.81) generates a (forward) propagator (Φt,s )∗ in 1 (Rd ))∗ M(Rd ) that extends to bounded propagators in the dual spaces (C∞ 2 d ∗ and (C∞ (R )) . Moreover, the variational derivatives ξt (x; .) are twice differentiable in x, so that ∂ξt (x; .) 1 ∈ (C∞ (Rd ))∗ , ∂x and

∂ 2 ξt (x; .) 2 ∈ (C∞ (Rd ))∗ , ∂x2

(6.104)

) ) √ ) ∂ξt (x; .) ) C(m,M)t ) ) E1/2 [(V + b + 2R(λ))C(m, M ) t], ) ∂x ) 1 d ∗ ≤ e (C (R )) ) 2 ) ∞ √ ) ∂ ξt (x; .) ) C(m,M)t ) ) E1/2 [c1 (λ)C(m, M ) t]. (6.105) ) ∂x2 ) 2 d ∗ ≤ e (C∞ (R ))

(ii) The directional derivative ξt [ξ] = Dφt (Y )[ξ] belongs to H11 (Rd ) whenever ξ does, and ξt H11 (Rd ) ≤ C(T )ξH11 (Rd ) , where C(T ) depends on the bounds of all derivatives of A, b and V mentioned in the assumptions of the proposition, and can also be explicitly expressed in terms of the Mittag-Leffler function.


Proof. (i) It follows directly from Theorem 4.13.4 applied to equation (6.101) ˜ = C 1 (Rd ), D = C 2 (Rd ). Notice also that the derivatives with B = C∞ (Rd ), B ∞ ∞ (6.104) satisfy the same equation as ξt itself, but have the initial conditions ∇δx ∈ 1 2 (C∞ (Rd ))∗ , ∇2 δx ∈ (C∞ (Rd ))∗ . (ii) It follows by applying Theorem 2.1.3 to equation (6.98) in the Banach space H11 (Rd ).  Exercise 6.8.1. Apply Theorem 4.13.4 to further confirm that Φt,s is smoothing 1 2 1 (Rd ) to C∞ (Rd ) and C∞ (Rd ) to C∞ (Rd ). Find the corresponding when taking C∞ 2 1 estimates. Show that the dual propagators take (C∞ (Rd ))∗ to (C∞ (Rd ))∗ and 2 (x;.) t (x;.) 1 (C∞ (Rd ))∗ to (C∞ (Rd ))∗ , and thus that ∂ξt∂x and ∂ ξ∂x belong to H11 (Rd ) 2 for any t > 0. Again obtain the corresponding estimates for the norms. Finally, let us look at the second variational derivatives of the solutions to McKean–Vlasov equations with respect to the initial data: ηt (x, z; .) =

δ 2 φt (Y ) . δY (x)δY (z)

Remark 109. Second-order derivatives are crucial for the analysis of fluctuations in a system of interacting particles around its limit given by the nonlinear kinetic equations (see Section 7.8), for instance, the McKean–Vlasov equation. In particular, the estimates (6.110) and (6.113) are of great relevance, see Remark 116. Differentiating equation (6.100), we obtain for η the weak equation   1 d (f, ηt (x, z; .)) = (at (.)∇, ∇)f + (bt (φt , .), ∇)f + Vt (φt , .)f, ηt (x, z; .) dt 2 δVt (φt , y) ηt (x, z; w)f (y)φt (y) dydw + (6.106) δφt (w)   δbt (φt , y) ηt (x, z; w), ∇f (y) φt (y) dydw + (f, gt ), + δφt (w) with (f, gt ) given by  

δbt (φt , y) δVt (φt , y) f (y) + , ∇f (y) δφt (w) δφt (w) × [ξt (x; y)ξt (z; w) + ξt (x; w)ξt (z; y)] dydw  2  2 δ bt (φt , y) δ Vt (φt , y) f (y) + , ∇f (y) + δφt (w)δφt (u) δφt (w)δφt (u)

(6.107)

× ξt (x; w)ξt (z; u)φt (y) dydwdu, which should be satisfied with the vanishing initial condition η0 (x, z, .) = 0. This is the same equation as (6.100), but with an additional non-homogeneous term


(g, ft ). Therefore, its solution can be expressed in terms of the propagators (Φt,s )∗ from Theorem 6.8.3: t

ηt (x, z; .) =

(Φ0,s )∗ gs ds.

(6.108)

0

Therefore, in spite of the challengingly looking expression (6.107), the analysis of η is more or less straightforward. However, the structure of (6.107) conveys an important message, namely 2,k×k d (M+ that for this analysis one needs the exotic spaces Cweak ≤λ (R )), which are 2

1,1 δ F (Y ) d subspaces of Cweak (M+ ≤λ (R )) consisting of functionals F (μ) such that δY (x)δY (z) exists for all x, z and belongs to C k×k (R2d ) (see the definition of this space in d Section 1.1) uniformly for Y ∈ M+ ≤λ (R ). Their relevance for the approximation of systems of interacting particles will be further elaborated in Section 7.8. From Theorem 6.8.3 and formula (6.108), we can now derive the following consequence on the second-order sensitivity for the McKean–Vlasov diffusion:

Theorem 6.8.4. (i) Under the assumptions of Theorem 6.8.3, assume the existence of continuous bounded second-order variational derivatives:    2 

 δ 2 bj (μ, y)   δ Vt (μ, y)  t  ≤ R2 (λ), sup sup    ≤ R2 (λ). (6.109)  j  δμ(w)δμ(u)  y,w,u,t δμ(w)δμ(u) y,w,u,t Then ηt (x, z; .) is well defined for any t as an element of (C 1 (Rd ))∗ and has the following bound: ηt (x, z; .)(C 1 (Rd ))∗ ≤ tC(m, M, T )λ[R(λ) + R2 (λ)]  3 × E1/2 [C(m, M, T )(2λR(λ) + V + b)] with λ = Y M(Rd ) . (ii) Assuming additionally that ) ) ) 2 ) ) j

) ) δ Vt (μ, y) ) ) δ 2 bt (μ, y) ) ) ) ≤ R (λ), sup sup ) ) ) 3 j ) δμ(.)δμ(.) ) δμ(.)δμ(.) )C 1×1 (R2d ) y,t y,t

(6.110)

≤ R3 (λ),

C 1×1 (R2d )

) ) ) ) j )

) ) δVt (μ, .) ) ) δbt (μ, .) ) ) ) sup ) ≤ R (λ), sup ) ) 4 j ) δμ(w) ) δμ(w) )C 1 (Rd ) w,t w,t

(6.111) ≤ R4 (λ),

C 1 (Rd )

(6.112) it follows that the derivatives of ηt (x, z; .) with respect to x and z of order at most one are well defined as elements of (C 2 (Rd ))∗ and ) ) α β ) ) ∂ ∂ ) ) (6.113) ) ∂xα ∂z β ηt (x, z; .)) 2 d ∗ (C (R ))  3 ≤ tC(m, M, T )λ[R(λ) + R2 (λ) + R3 (λ) + R4 (λ)] E1/2 [C(m, M, T )c1 (λ)] for α, β = 0, 1.


Proof. (i) For gt in (6.108), we have the estimate gt (C 1 (Rd ))∗ ≤ 4λ[R(λ) + R2 (λ)]Y M(Rd ) ξt (x; .)L1 (Rd ) ξt (z; .)L1 (Rd )  2 ≤ 4λ[R(λ) + R2 (λ)] E1/2 [C(m, M, T )c1 (λ)] , where (6.99) was used. Therefore, (6.110) follows by the first estimate in (6.103). In order to prove that the solution ηt to equation (6.106) yields in fact the second derivative of μt with respect to the initial data, one can either use the strong form of this equation and apply Theorem 2.15.1, like in the proof of Theorem 6.7.2, or use an approximation by bounded operators, as will be explained in Section 6.12 for a more general setting. (ii) The same as (i), but using the second estimate in (6.103). 

6.9 Landau–Fokker–Planck-type equations We shall now analyse the class of possibly degenerate nonlinear diffusions of the type (6.32) with     1 ∂2f ∂f A(μ)f = tr σ(x, y)σ T (x, y)μ(dy) 2 + b(x, y)μ(dy), , (6.114) 2 ∂x ∂x with a d × d-square-matrix-valued function σ(x, y), x, y ∈ Rd , and a vector-valued function b(x, y). As a key example, this class contains the celebrated Landau– Fokker–Planck equation from statistical physics (see Section 7.4 for concrete σ, b arising in this context). Unlike the rest of the book, the results of this section depend strongly on the use of probability theory, since one of the ingredients of the proofs is supplied by Theorem 4.3.1, which we formulated without proof while mentioning that it arises as a consequence of the methods of stochastic analysis. Theorem 6.9.1. Let σ, b be continuous functions such that σ(., y), b(., y) ∈ C 4 (Rd ) with all required derivatives being continuous and uniformly bounded in both variables and σ(x, .), b(x, .) ∈ C 2 (Rd ) with all required derivatives being continuous and uniformly bounded in both variables. Then the Cauchy problem (6.32), (6.114) is well posed in the sense that for any Y ∈ P(Rd ) there exists a unique global solution μt (Y ) in M(Rd ) with the initial condition Y such that μt (Y ) ∈ P(Rd ) for all t. Proof. Similar to our dealing with nonlinear diffusions, we are going to use The2 (Rd ), M = P(Rd ). By the assumption orem 6.3.1 with Bobs = C∞ (Rd ), D = C∞ 4 d 2 σ(., y), b(., y) ∈ C (R ), by Theorem 5.3.3 with D2 = C∞ (Rd ), and by Theorem 4.3.1, for any Lipschitz-continuous curve ξt ∈ CY ([0, T ], M (D∗ )) there exists a backward propagator U r,s [ξ. ] generated by the family of operators     1 ∂2f ∂f T A(ξt )f = tr σ(x, y)σ (x, y)ξt (dy) 2 + b(x, y)ξt (dy), , 2 ∂x ∂x


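so that
\[
\|U^{r,s}[\xi_.]\|_{D\to D} \le U_D, \qquad \|U^{r,s}[\xi_.]\|_{B\to B} \le U_B
\]
uniformly for any compact interval containing r, s, with some constants $U_D$, $U_B$ that depend on the norms of $\sigma(.,y)$, $b(.,y)$ in $C^4(\mathbb{R}^d)$. By the assumption $\sigma(x,.), b(x,.) \in C^2(\mathbb{R}^d)$, we can conclude that
\[
\|A(\xi) - A(\eta)\|_{D\to B} \le \sup_x \left(\|\sigma(x,.)\|^2_{C^2(\mathbb{R}^d)} + \|b(x,.)\|_{C^2(\mathbb{R}^d)}\right) \|\xi - \eta\|_{D^*}.
\]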

Consequently, by Theorem 6.3.1 and Remark 105(iv), we find the required well-posedness of the Cauchy problem (6.32), (6.114).

Exercise 6.9.1. Extend Theorem 6.9.1 to the case of time-dependent σ and b.

In most concrete applications (in particular in the setting of Landau equations), σ and b are not bounded. For instance, in the case of collisions of so-called Maxwellian molecules, these coefficients are of linear growth, see (7.55), (7.56). We shall not go into details concerning the necessary extensions, but refer instead to the original papers, see, e.g., the comments in Section 7.10 on the Landau–Fokker–Planck equation.
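To make the structure of (6.114) concrete, the following is a minimal numerical sketch, not a method taken from the text. It assumes dimension d = 1 and replaces μ by the empirical measure of N particles, which are then moved by an Euler–Maruyama step. The function name `simulate_particles` and the example coefficients are hypothetical choices made only for this illustration.

```python
import numpy as np

def simulate_particles(x0, sigma, b, dt, n_steps, rng):
    """Euler-Maruyama scheme for N interacting particles in dimension one.

    Assumption (illustration only): the measure mu in (6.114) is replaced
    by the empirical measure (1/N) * sum_j delta_{x_j}, so that
        diffusion coefficient  a(x) = mean_j sigma(x, x_j)**2,
        drift                  B(x) = mean_j b(x, x_j).
    """
    x = np.array(x0, dtype=float)
    for _ in range(n_steps):
        # Averages over the second argument play the role of the mu(dy)-integrals.
        diff = np.mean(sigma(x[:, None], x[None, :]) ** 2, axis=1)
        drift = np.mean(b(x[:, None], x[None, :]), axis=1)
        noise = rng.standard_normal(x.shape)
        x = x + drift * dt + np.sqrt(diff * dt) * noise
    return x

# Hypothetical bounded-noise example with attraction towards the mean.
rng = np.random.default_rng(0)
particles = simulate_particles(
    np.linspace(-1.0, 1.0, 50),
    lambda x, y: 0.3 + 0.0 * (x + y),   # constant sigma
    lambda x, y: -(x - y),              # drift pulling x towards y
    dt=0.01, n_steps=100, rng=rng,
)
```

With σ set to zero the scheme becomes deterministic and the empirical mean is conserved, which gives a simple sanity check of the drift term.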

6.10 Forward-backward systems

As we already pointed out, two different classes of the considered nonlinear equations can be distinguished: (i) with a local nonlinearity, i.e., depending on the pointwise values of the unknown function and its derivatives, and (ii) with a nonlinearity that depends on the integral characteristics of the unknown function (this is when the weak form often turns out to be useful). More recent studies (essentially arising from the mean-field games) brought to light the analysis of systems that are described by coupled combinations of equations of these two classes, which moreover are evolving in the opposite direction of time. A sufficiently general forward-backward system of this kind can be written as follows:
\[
\begin{cases}
\dfrac{d}{dt}(\phi, \mu_t) = \left(A(\mu_t)\phi + \left(b_t\left(x, \mu_t, \hat u\left(x, \dfrac{\partial f_t}{\partial x}, \mu_t\right)\right), \nabla\phi\right), \mu_t\right), & \mu|_{t=0} = \mu_0, \quad \phi \in D, \\[2ex]
\dfrac{\partial f_t}{\partial t} + A f_t + H_{\mu_t}\left(x, \dfrac{\partial f_t}{\partial x}\right) = 0, & f|_{t=T} = f_T.
\end{cases}
\tag{6.115}
\]

Let us explain the meaning of all occurring terms. The unknown functions are $\mu_t \in \mathcal{P}(\mathbb{R}^d)$, $t \in [0,T]$, and $f_t \in B = C_\infty(\mathbb{R}^d)$. The first equation is written in the weak form, which has to hold for all test functions $\phi \in D$, with D usually taken to be $D = C_\infty^2(\mathbb{R}^d)$. A and A(μ) are operators that generate strongly continuous semigroups in $C_\infty(\mathbb{R}^d)$ (for the sake of simplicity, we choose A to be independent of μ). The drift $b_t(x,\mu,u)$ is continuous, and the Hamilton function H (or, more precisely, the family of these functions depending on μ as a parameter) is of the form
\[
H_\mu(x,p) = \max_{u\in U}\,[g(x,\mu,u)p - J(x,\mu,u)], \tag{6.116}
\]

as it arises from the theory of optimization, with some continuous functions g, J and a closed set $U \subset \mathbb{R}^n$. It is assumed that this representation is chosen in such a way that the value $\hat u(x,p,\mu)$, where the maximum in (6.116) is attained, is always uniquely defined. Therefore, $\hat u$ in the first equation of (6.115) expresses the coupling between the two equations of (6.115). Assuming that the backward Cauchy problem for the second equation in (6.115) is well posed for any curve $\mu_t \in C([0,T], \mathcal{P}^1(D^*))$ (at least in the sense of the mild solutions), the function
\[
\bar u(x, \{\mu_{\ge t}\}; f_T) = \hat u\left(x, \frac{\partial f_t}{\partial x}(x, \{\mu_{\ge t}\}, f_T), \mu_t\right) \tag{6.117}
\]
is well defined. Consequently, the forward-backward system (6.115) can be rewritten as a single anticipating kinetic equation of the type (6.45):
\[
\frac{d}{dt}(\phi, \mu_t) = \left(A(\mu_t)\phi + (b_t(x, \mu_t, \bar u(x, \{\mu_{\ge t}\}; f_T)), \nabla\phi), \mu_t\right), \qquad \mu|_{t=0} = \mu_0, \qquad \phi \in D. \tag{6.118}
\]
Let us say that the Hamiltonian (6.116) has a Lipschitz minimizer if the function $\hat u(x,p,\mu)$ is a Lipschitz-continuous function of all its three variables, with $\mu \in \mathcal{P}(\mathbb{R}^d)$ considered in the norm topology of the space $D^*$. In the sequel, we shall work only with Hamiltonians of this class. The most important nontrivial point is the Lipschitz continuity in p. This property is often fulfilled for convex (in u) functions J and convex sets U. In the Appendix, see Theorem 9.5.1, a simple subclass of such Hamiltonians is explicitly identified. Recall that we denoted $\mathcal{P}^1 = \mathcal{P}^1(\mathbb{R}^d) = \{\mu \in \mathcal{P}(\mathbb{R}^d) : \int |x|\,\mu(dx) < \infty\}$.

Theorem 6.10.1. Let $H_\mu(x,p)$ be a family of Hamiltonians (6.116) having a Lipschitz minimizer and satisfying the assumptions of Theorem 6.1.2. Let A satisfy the assumption on A of Theorem 6.1.1 and A(μ) satisfy the assumptions of Theorem 6.1.3(ii) with $B_{par} = C([0,T], \mathcal{P}^1(D^*))$. Assume that for any continuous function $u(x, \{\mu_.\}; f_T)$, which is Lipschitz-continuous in the first two variables $x \in \mathbb{R}^d$, $\mu_. \in C([0,T], \mathcal{P}^1(D^*))$, and any curve $\xi_. \in C([0,T], \mathcal{P}^1(D^*))$, the operators
\[
\bar A_t = \bar A_t(\{\xi_.\}) = A(\xi_t) + (b_t(x, \xi_t, u(x, \{\xi_.\}; f_T)), \nabla) : D \to B \tag{6.119}
\]
generate conservative Feller propagators $U^{t,s}[\{\xi_.\}]$ in B (recall that ‘conservative’ means ‘preserving constants’) on the common invariant domain D such that the


$U^{t,s}[\{\xi_.\}]$ also act as bounded operators in the space of weighted functions $C_L(\mathbb{R}^d)$ with $L(x) = 1 + |x|$, so that
\[
\|U^{t,s}[\{\xi_.\}]\|_{D\to D} \le U_D \quad\text{and}\quad \|U^{t,s}[\{\xi_.\}]\|_{C_L(\mathbb{R}^d)\to C_L(\mathbb{R}^d)} \le U_L, \tag{6.120}
\]
for some constants $U_D$, $U_L$. Moreover, let
\[
\sup_{t\in[0,T]} \|\bar A[t, \{\xi_.\}] - \bar A[t, \{\eta_.\}]\|_{D\to B} \le L_A \sup_{t\in[0,T]} \|\xi_t - \eta_t\|_{D^*} \tag{6.121}
\]
for a constant $L_A$. Then the forward-backward problem (6.115) has a solution $\mu_t \in \mathcal{P}^1$ and $f_t \in C^1(\mathbb{R}^d)$ for any $f_T \in C^1(\mathbb{R}^d)$ and $\mu_0 \in \mathcal{P}^1$. If $L_A U_D T < 1$, this solution is unique.

Proof. By Theorem 6.1.4(ii) with $B_{par} = C([0,T], \mathcal{P}^1(D^*))$, for any $\mu_t \in C([0,T], \mathcal{P}^1(D^*))$ the backward equation in (6.115), which is a HJB equation, has a unique mild solution in $C^1(\mathbb{R}^d)$ for any $f_T \in C^1(\mathbb{R}^d)$. By the assumption of a Lipschitz minimizer and again by Theorem 6.1.2, the function $\bar u(x, \{\mu_{\ge t}\}; f_T)$ depends Lipschitz-continuously on its arguments x and $\{\mu_{\ge t}\}$. Therefore the proof can be completed by applying Theorems 6.4.2 and 6.4.4.

The main examples are supplied by operators A and A(μ) of the Lévy–Khintchin type. For instance, the following holds:

Theorem 6.10.2. The assumption on $\bar A$ from Theorem 6.10.1 holds if $b_t$ is a Lipschitz-continuous function of its variables and the A(μ) are either non-degenerate diffusion operators or mixed fractional Laplacians of the type (5.100) satisfying the assumptions of Theorem 5.6.3 uniformly in μ.

Proof. The basic conditions (6.120) follow from Theorems 5.6.2 and 5.6.3 as well as from Propositions 4.5.2 and 4.3.2.
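The notion of a Lipschitz minimizer for the Hamiltonian (6.116) can be illustrated by a minimal sketch under strong simplifying assumptions that are not taken from the text: scalar control, $g(x,\mu,u) = u$, quadratic cost $J(x,\mu,u) = u^2/2$ and $U = [-c, c]$. In this hypothetical case the maximizer is the projection of p onto U, which is Lipschitz in p with constant 1.

```python
import numpy as np

def u_hat(p, c=1.0):
    """Maximizer of u*p - u**2/2 over U = [-c, c].

    Assumption (illustration only): g(x, mu, u) = u and J(x, mu, u) = u**2/2,
    so the dependence on x and mu is dropped.
    """
    return np.clip(p, -c, c)

def hamiltonian(p, c=1.0):
    """Value of the maximum in (6.116) for the model g and J above."""
    u = u_hat(p, c)
    return u * p - 0.5 * u ** 2

def u_hat_bruteforce(p, c=1.0, n_grid=2001):
    """Grid search over U, used only to cross-check the closed form."""
    grid = np.linspace(-c, c, n_grid)
    return grid[np.argmax(grid * p - 0.5 * grid ** 2)]
```

The concavity of $u \mapsto up - u^2/2$ and the convexity of U are what make the maximizer unique here, in line with the remark in the text that convex J and convex U often yield the Lipschitz property.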

6.11 Linearized evolution around non-linear propagators

In this section, we make a preparatory step to proving the sensitivity for general evolutions provided by Theorem 6.3.1 without the simplifying assumption of the major term being non-degenerate. In terms of the general notations for the spaces of mappings (see (1.53)), the assumption (6.33) means that $A \in C_{bLip}(M(D^*), \mathcal{L}(D,B))$ with $L_A$ being the Lipschitz constant, where writing $M(D^*)$ emphasizes that M is considered in the topology induced from $D^*$. Let us now make the stronger assumption that $A \in C^1(M(D^*), \mathcal{L}(D,B))$. We aim at showing that the derivative
\[
\xi_t = D\mu_t(Y)[\xi] = \lim_{h\to 0+} \frac{1}{h}\left(\mu_t(Y + h\xi) - \mu_t(Y)\right) \tag{6.122}
\]

is well defined as an element of $D^*$, with the limit in (6.122) also existing in the sense of the topology $D^*$. This will illustrate a general feature of evolutions with a non-Lipschitz r.h.s., where the derivatives with respect to the initial data lie in a different regularity class than the solutions. Assuming that (6.122) exists, one can differentiate equation (6.32) to obtain
\[
\frac{d}{dt}(f, \xi_t) = (A(\mu_t)f, \xi_t) + (DA(\mu_t)[\xi_t]f, \mu_t), \qquad \xi_0 = \xi, \qquad f \in D, \tag{6.123}
\]
which describes the linearized evolution around a path of a nonlinear semigroup $\mu_t = \mu_t(Y)$. As in similar situations, in order to prove the existence of the derivative (6.122), we first show the well-posedness of the evolution (6.123), and then we show that it yields the derivative (6.122). A careful look at equation (6.123) leads to the conclusion that one cannot generally expect to solve it by a curve $\xi_t$ in $B_{st}$, but only in $D^*$. This in turn implies the necessity of additional assumptions on its r.h.s. Namely, let us assume that there exists the representation
\[
(D_\xi A(\mu)g, \mu) = (F(\mu)g, \xi), \tag{6.124}
\]

with F(μ) a Lipschitz-continuous mapping $M(D^*) \to \mathcal{L}(D,D)$, so that
\[
\|F(\mu) - F(\eta)\|_{D\to D} \le L_F \|\mu - \eta\|_{D^*}, \tag{6.125}
\]
with a constant $L_F$ that may depend on μ and η, but must be uniformly bounded for μ, η from bounded subsets of M. This condition does not seem to be very intuitive from an abstract point of view. In concrete examples, however, the operators A(μ) are usually some differential operators with coefficients depending on μ, i.e., they are sums of terms like
\[
A_\Omega(\mu) = \Omega(x,\mu)\,\frac{\partial^k}{\partial x_1 \cdots \partial x_k}
\]
with some functional Ω(x, μ) that smoothly depends on μ. In this case, we find
\[
(D_\xi A(\mu)g, \mu) = \left(D_\xi\Omega(x,\mu)\,\frac{\partial^k g}{\partial x_1 \cdots \partial x_k}, \mu\right) = \int\!\!\int \frac{\delta\Omega(x,\mu)}{\delta\mu(y)}\,\xi(dy)\,\frac{\partial^k g(x)}{\partial x_1 \cdots \partial x_k}\,\mu(dx), \tag{6.126}
\]


so that the structure (6.124) becomes transparent with the function
\[
(F(\mu)g)(y) = \int \frac{\delta\Omega(x,\mu)}{\delta\mu(y)}\,\frac{\partial^k g(x)}{\partial x_1 \cdots \partial x_k}\,\mu(dx).
\]
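As a hedged illustration of the representation (6.124), consider the first-order case k = 1 in dimension d = 1 with a coefficient of mean-field type. The kernel b(x, z) below is a hypothetical bounded smooth function introduced only for this example; it is not a coefficient specified in the text.

```latex
% Assumption: b(x,z) is bounded and smooth in both variables (illustrative choice).
\begin{align*}
\Omega(x,\mu) &= \int b(x,z)\,\mu(dz), \qquad
  (A(\mu)g)(x) = \Omega(x,\mu)\,g'(x), \\
\frac{\delta\Omega(x,\mu)}{\delta\mu(y)} &= b(x,y), \qquad
  D_\xi\Omega(x,\mu) = \int b(x,y)\,\xi(dy), \\
(D_\xi A(\mu)g,\mu) &= \int\!\!\int b(x,y)\,g'(x)\,\xi(dy)\,\mu(dx) = (F(\mu)g,\xi), \\
(F(\mu)g)(y) &= \int b(x,y)\,g'(x)\,\mu(dx).
\end{align*}
```

In this example F(μ) is linear in μ, and the variable y enters only through the second argument of b.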

More generally, if A(μ) is a family of pseudo-differential operators with the symbols H(x, p, μ) (see (1.95) if needed) depending smoothly on μ, one gets
\[
(F(\mu)g)(y) = \int \left(\left[\frac{\delta H(x,p,\mu)}{\delta\mu(y)}\right] g\right)(x)\,\mu(dx), \tag{6.127}
\]
where we denoted (only for this argument) by [ψ] the pseudo-differential operator with the symbol ψ. Therefore, the following theory is effectively meant to be applicable to differential or pseudo-differential operators A(μ). Under the assumption (6.124), equation (6.123) can be rewritten as
\[
\frac{d}{dt}(f, \xi_t) = (A(\mu_t)f, \xi_t) + (F(\mu_t)f, \xi_t), \qquad f \in D, \tag{6.128}
\]
which is dual to the equation
\[
\dot g = -(A(\mu_t) + F(\mu_t))g. \tag{6.129}
\]
Equations of this type were analysed in Theorem 4.13.2, which implies that this equation is well posed in D with the derivative understood in the sense of the topology of $B_{obs}$. The resolving backward propagator $\Phi^{t,s}[Y]$ of equation (6.129) acts in D (and generally not in B), and the $\xi_t$ evolving according to the dual propagator $\Psi^{s,t}[Y] = (\Phi^{t,s}[Y])^*$ are well defined in $D^*$. However, since $\Psi^{s,t}[Y] = (\Phi^{t,s}[Y])^*$ acts in $D^*$ and not in $B^*$, the expression on the r.h.s. of (6.128) may not make sense for $f \in D$. Therefore, this propagator supplies, strictly speaking, only a generalized solution to (6.128). This is a typical difficulty related to the already mentioned fact that nonlinear evolutions with unbounded coefficients usually push the derivatives of the solutions with respect to parameters outside the domain where the solutions live. In order to ensure that the $\Psi^{s,t}[Y]\xi$ do solve (6.128), further regularity assumptions are required, which can be conveniently formulated in the setting of the three Banach spaces $\tilde D \subset D \subset B_{obs}$ as in Theorem 5.1.3, so that $\|\cdot\|_{\tilde D} \ge \|\cdot\|_D \ge \|\cdot\|_{B_{obs}}$, D is dense in $B_{obs}$ in the topology of B, and $\tilde D$ is dense in D in the topology of D. The main result of this section is as follows:

Theorem 6.11.1. (i) Under the assumptions of Theorem 6.3.1, let the representation (6.124), (6.125) hold. Then for each $Y \in M$, equation (6.129) is well posed, the solution being given by a strongly continuous backward propagator $\Phi^{t,s}[Y]$ in D such that $\Phi^{t,s}[Y]g$ is the unique curve in D solving equation (6.129) in $B_{obs}$


(the derivative being defined with respect to the topology of B). The dual operators $\Psi^{s,t}[Y] = (\Phi^{t,s}[Y])^*$ form a ∗-weakly continuous propagator in $D^*$.

(ii) Assume additionally that the backward propagators $U^{t,s}[\eta_.]$ generated by $A(\eta_t)$ are also strongly continuous and bounded in $\tilde D$, and that the operators A(η) are also bounded operators $\tilde D \to D$ such that
\[
\|A(\xi) - A(\eta)\|_{\tilde D\to D} \le L_A \|\xi - \eta\|_{D^*}. \tag{6.130}
\]
Moreover, let the representation (6.124) hold, with F(μ) being a Lipschitz-continuous mapping $M(D^*) \to \mathcal{L}(D,D)$ and $M(D^*) \to \mathcal{L}(\tilde D, \tilde D)$ such that (6.125) holds and
\[
\|F(\mu) - F(\eta)\|_{\tilde D\to\tilde D} \le L_F \|\mu - \eta\|_{D^*}. \tag{6.131}
\]
Then, for each $Y \in M$ and $\xi \in D^*$, $\xi_t = \Psi^{t,0}[Y]\xi$ is the unique solution to (6.123) in the sense that it holds for any $f \in \tilde D$.

Remark 110. The construction of propagators from condition (ii) can naturally be carried out via Theorem 5.1.1, that is via T-products.

Proof. (i) This is a direct consequence of Theorem 4.13.2. (ii) By Proposition 5.1.1, the $U^{t,s}[\mu_t]g$ solve the equation $\dot g_t = -A(\mu_t)g_t$ in D for any $g \in \tilde D$. Proposition 4.13.1, applied to $(\tilde D, D)$ rather than $(D, B_{obs})$, yields the existence of the strong backward propagator $\Phi^{t,s}$ of linear operators in D generated by the family $A(\mu_t) + F(\mu_t)$ on $\tilde D$. By Theorem 4.10.1, again applied to $(\tilde D, D)$, the dual propagator $\Psi^{s,t} = (\Phi^{t,s})^*$ supplies a unique solution to the Cauchy problem for equation (6.128).

Both for numerical simulations and for the application to interacting particles, it is crucial to analyse the dependence of the solutions on other parameters, not only on the initial data. Therefore, we shall consider a more general situation, where we are given a family of operators $A^\alpha(\mu)$ that depends on a real parameter α and satisfies the assumptions of Theorem 6.3.1 for each α. For $\mu_t^\alpha = \mu_t^\alpha(Y)$, a solution corresponding to (6.32) with the initial condition Y, we are interested in the derivative
\[
\xi_t(\alpha) = \frac{\partial\mu_t^\alpha}{\partial\alpha}. \tag{6.132}
\]
Differentiating (6.32) (at least formally for the moment) with respect to α yields the equation
\[
\frac{d}{dt}(g, \xi_t(\alpha)) = (A^\alpha(\mu_t^\alpha)g, \xi_t(\alpha)) + (D_{\xi_t(\alpha)}A^\alpha(\mu_t^\alpha)g, \mu_t^\alpha) + \left(\frac{\partial A^\alpha(\mu_t^\alpha)}{\partial\alpha}g, \mu_t^\alpha\right), \tag{6.133}
\]
as an extension of (6.123), with the initial condition
\[
\xi_0 = \xi_0(\alpha) = \frac{\partial\mu_0^\alpha}{\partial\alpha}. \tag{6.134}
\]
We can now extend Theorem 6.11.1 to the case of the linearized evolution (6.133).


Theorem 6.11.2. Let the conditions of Theorem 6.11.1(ii) hold for each family $A^\alpha$, with all bounds being uniform in α. Moreover, let $\partial A^\alpha(\mu_t^\alpha)/\partial\alpha$ exist and define a continuous mapping $M(D^*) \to \mathcal{L}(D, B_{obs})$. Then the Cauchy problem for the weak equation (6.133) has a unique solution $\xi_t = \Pi^{t,0}[\alpha, Y]\xi_0$ (in the sense that (6.133) holds for all $g \in \tilde D$) given by the formula
\[
(g, \xi_r) = (\Phi^{t,r}[\alpha, Y]g, \xi_t) + \int_t^r \left(\frac{\partial A^\alpha(\mu_s^\alpha)}{\partial\alpha}\,\Phi^{s,r}[\alpha, Y]g, \mu_s^\alpha\right) ds. \tag{6.135}
\]

Proof. It follows directly from Theorem 6.11.1 and Proposition 4.10.3.

We complete this section by a simple stability result for $\Pi^{s,t}$.

Theorem 6.11.3. (i) Under the assumptions of Theorem 6.11.1(ii), the propagator Ψ depends continuously on $Y = \mu_0$ in the following sense:
\[
\|\Psi^{s,t}[Y_1] - \Psi^{s,t}[Y_2]\|_{D^*\to\tilde D^*} \le C(L_A + L_F)\|Y_1 - Y_2\|_{D^*}, \tag{6.136}
\]
with a constant C that is uniform for $Y_1$, $Y_2$ in any bounded subset of $M(D^*)$.

(ii) Under the assumptions of Theorem 6.11.2, suppose that
\[
\left\|\frac{\partial A^\alpha(\mu)}{\partial\alpha} - \frac{\partial A^\alpha(\eta)}{\partial\alpha}\right\|_{D\to B} \le L_{\partial A}\|\mu - \eta\|_{D^*},
\]
with a constant $L_{\partial A}$ that can be chosen uniformly for μ, η from any bounded subset of $M(D^*)$. Then the solutions $\xi_t = \Pi^{t,0}[\alpha, Y]\xi_0$ depend continuously on $Y = \mu_0$ in the following sense:
\[
\|(\Pi^{t,0}[\alpha, Y_1] - \Pi^{t,0}[\alpha, Y_2])\xi_0\|_{\tilde D^*} \le C\left(L_{\partial A} + (L_A + L_F)\|\xi\|_{D^*}\right)\|Y_1 - Y_2\|_{D^*}. \tag{6.137}
\]

Proof. (i) By Proposition 4.9.3, we have
\[
\|(\Phi^{t,s}[Y_1] - \Phi^{t,s}[Y_2])f\|_D \le t\,\|f\|_{\tilde D}\,(L_A + L_F)\,\sup_{\tau,r}\|\Phi^{\tau,r}[Y_1]\|_{D\to D} \times \sup_{\tau,r}\|\Phi^{\tau,r}[Y_2]\|_{\tilde D\to\tilde D}\,\sup_\tau\|\mu_\tau(Y_1) - \mu_\tau(Y_2)\|_{D^*}.
\]
The last term can be estimated by $U_B \exp\{tU_D U_B L_A\}\|Y_1 - Y_2\|_{D^*}$, which implies (6.136) by duality. (ii) This again follows by Proposition 4.9.3.




6.12 Sensitivity of nonlinear propagators

In order to deduce that $\Psi^{s,t}[Y]\xi$ from the previous section equals the derivative (6.122), and therefore to prove the sensitivity result for general nonlinear propagators, we shall approximate A(μ) by a sequence of bounded operators $A_n(\mu)$, for which the statement is a straightforward consequence of the theory for equations with a Lipschitz-continuous r.h.s., see Theorem 2.9.1. Afterwards, we pass to the limit. For the setting of ΨDOs that we are dealing with, a natural way is to approximate operators by approximating their symbols, see the discussion prior to Theorem 5.15.1.

Theorem 6.12.1. Under the assumptions of Theorem 6.11.1(ii), assume that there exists a sequence of families of operators $A_n(\mu)$ such that for each n they satisfy all the assumptions of Theorem 6.3.1 with the same bounds, they are bounded as operators $D \to D$ and $B_{obs} \to B_{obs}$, and
\[
\|A_n(\eta) - A(\eta)\|_{\tilde D\to D} \to 0, \qquad \|F_n(\eta) - F(\eta)\|_{\tilde D\to D} \to 0, \tag{6.138}
\]
as $n \to \infty$, uniformly for η from any bounded subset of M. Then the derivative (6.122) exists (the limit being understood in the sense of $D^*$) and it equals the unique solution $\xi_t = \Psi^{t,0}[Y]\xi$ to equation (6.123) constructed in Theorem 6.11.1(ii).

Proof. We need to add a dependence on n to all objects constructed from $A_n$. By Theorem 2.9.1, the claim of Theorem 6.12.1 holds for $\xi_t^n[Y] = \Psi_n^{t,0}[Y]\xi$, which implies that
\[
\mu_t^n(Y + h\xi) - \mu_t^n(Y) = \int_0^h \xi_t^n[Y + r\xi]\,dr. \tag{6.139}
\]
By Theorem 6.3.2 and the first relation in (6.138), $\mu_t^n$ converges to $\mu_t$ in $D^*$, as $n \to \infty$. By (6.130), (6.131), (6.138) and Proposition 4.9.3, applied to the backward propagators $\Phi^{t,s}[Y]$ and $\Phi_n^{t,s}[Y]$ in D, it follows that $\xi_s^n \to \xi_s$ in $D^*$. Therefore, passing to the limit in (6.139) in the topology of $D^*$, as $n \to \infty$, yields
\[
\mu_t(Y + h\xi) - \mu_t(Y) = \int_0^h \xi_t[Y + r\xi]\,dr = h\,\xi_t[Y] + \int_0^h (\xi_t[Y + r\xi] - \xi_t[Y])\,dr. \tag{6.140}
\]
Theorem 6.11.3 implies that $\|\xi_t[Y + r\xi] - \xi_t[Y]\|_{D^*} \to 0$, as $r \to 0$, and consequently $(\mu_t(Y + h\xi) - \mu_t(Y))/h \to \xi_t[Y]$ in $D^*$, as $h \to 0$, as required.

The extension to a more general dependence on a parameter is as follows:
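The statement that the solution of the linearized equation (6.123) coincides with the directional derivative (6.122) can be checked numerically in a finite-dimensional toy model. The sketch below is an illustration under assumptions that are not part of the text: a three-point state space, the hypothetical family $A(\mu) = Q_0 + (c, \mu)Q_1$ with generator matrices $Q_0$, $Q_1$, and a classical Runge–Kutta integrator. In this model $(D_\xi A(\mu)g, \mu) = (c, \xi)(Q_1 g, \mu)$, so that (6.124) holds with $(F(\mu)g)(y) = c_y (Q_1 g, \mu)$.

```python
import numpy as np

def rhs(mu, xi, Q0, Q1, c):
    """Right-hand sides of the nonlinear equation and of its linearization."""
    A = Q0 + (c @ mu) * Q1                      # A(mu), acting on functions
    dmu = A.T @ mu                              # weak form d/dt (f, mu) = (A(mu) f, mu)
    dxi = A.T @ xi + (c @ xi) * (Q1.T @ mu)     # finite-dimensional analogue of (6.123)
    return dmu, dxi

def flow(Y, xi0, Q0, Q1, c, t_end=1.0, n_steps=200):
    """Classical RK4 integration of the pair (mu_t, xi_t)."""
    dt = t_end / n_steps
    mu, xi = np.array(Y, dtype=float), np.array(xi0, dtype=float)
    for _ in range(n_steps):
        k1 = rhs(mu, xi, Q0, Q1, c)
        k2 = rhs(mu + 0.5 * dt * k1[0], xi + 0.5 * dt * k1[1], Q0, Q1, c)
        k3 = rhs(mu + 0.5 * dt * k2[0], xi + 0.5 * dt * k2[1], Q0, Q1, c)
        k4 = rhs(mu + dt * k3[0], xi + dt * k3[1], Q0, Q1, c)
        mu = mu + dt / 6.0 * (k1[0] + 2 * k2[0] + 2 * k3[0] + k4[0])
        xi = xi + dt / 6.0 * (k1[1] + 2 * k2[1] + 2 * k3[1] + k4[1])
    return mu, xi

# Hypothetical data: rows of Q0 and Q1 sum to zero, c is non-negative.
Q0 = np.array([[-1.0, 1.0, 0.0], [0.5, -1.0, 0.5], [0.0, 1.0, -1.0]])
Q1 = np.array([[-1.0, 1.0, 0.0], [0.0, -1.0, 1.0], [1.0, 0.0, -1.0]])
c = np.array([0.0, 0.5, 1.0])
Y = np.array([0.5, 0.3, 0.2])
xi0 = np.array([1.0, -1.0, 0.0])

mu_t, xi_t = flow(Y, xi0, Q0, Q1, c)
h = 1e-6
mu_t_shifted, _ = flow(Y + h * xi0, xi0, Q0, Q1, c)
finite_difference = (mu_t_shifted - mu_t) / h   # difference quotient from (6.122)
```

Comparing `finite_difference` with `xi_t` gives agreement up to an error of order h in this smooth bounded setting, which is the situation covered by the approximating operators $A_n$ in the proof.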


Theorem 6.12.2. Let $\xi_0 = \xi \in B^*$ be defined by (6.134), where the derivative exists in the norm-topology of $D^*$. Then the derivative (6.132) exists in $D^*$ and it equals the unique solution $\xi_t[\alpha] = \Pi^{t,0}[\alpha, \mu_0^\alpha]\xi$ to equation (6.133) constructed in Theorem 6.11.2.

Proof. As above, we get the following equation for the approximating family:
\[
\mu_t^\alpha(n) - \mu_t^{\alpha_0}(n) = \int_{\alpha_0}^{\alpha} \xi_t[\beta](n)\,d\beta, \tag{6.141}
\]
which holds as an equation in $D^*$. Again passing to the limit completes the proof.

Theorems 6.12.1 and 6.12.2 are applicable to a wide range of nonlinear evolutions, including degenerate McKean–Vlasov diffusions, various nonlinear evolutions with fractional derivatives and the Landau–Fokker–Planck-type equations of Section 6.9.

6.13 Summary and comments

A method for constructing nonlinear dynamics via infinite-dimensional multiplicative-integral equations was developed for the quantum mechanical problem of many particle systems in [201]. Further development of this method for the analysis of nonlinear equations of control theory (HJB) and statistical mechanics of interacting Markov evolutions (like nonlinear diffusions and nonlinear evolutions with fractional Laplacians) in the spirit of Sections 6.3 and 6.12, including most importantly their sensitivity and applications to forward-backward systems as presented in this chapter, was carried out in several works of the author with co-workers, see, e.g., [143, 147, 163–165]. We did not touch here the important analysis of nonlinear diffusions in domains with a boundary, see, e.g., [226, 247] and references therein.

Our approach to HJB equations was essentially via mild solutions, which is effective for equations with a smoothing linear part. For first-order HJB equations of the type $H(x, u(x), \nabla u(x)) = 0$ with $x \in \Omega$ for open subsets Ω of a Banach space B (including the evolutionary equation $\partial u/\partial t + H(x, u, \nabla u) = 0$ in $(0,T) \times \Omega$), the most appropriate notion of a solution is that of a viscosity solution. There are several similar definitions. The most transparent one that suffices to fully treat equations in $\mathbb{R}^d$ and in reflexive Banach spaces (see [54] and references therein) is the following: A continuous function u on Ω is a viscosity solution to the equation $H(x, u(x), \nabla u(x)) = 0$ if it is both a supersolution and a subsolution, meaning that $H(x, u(x), p) \ge 0$ (respectively $H(x, u(x), p) \le 0$) for any x and any subdifferential (respectively superdifferential) p of u at x, where $p \in B^*$ is called a subdifferential of u at x, if
\[
\liminf_{y\to x} \frac{u(y) - u(x) - (p, y - x)}{\|y - x\|} \ge 0,
\]
and a superdifferential of u at x, if
\[
\limsup_{y\to x} \frac{u(y) - u(x) - (p, y - x)}{\|y - x\|} \le 0.
\]

The literature on the popular topic of McKean–Vlasov diffusions (deterministic or random) is extensive, see, e.g., [3, 46, 56, 221] and references therein. The main new point of [147] was the systematic development of sensitivity (smoothness of solutions with respect to the initial data). There is quite a lot of work available on the Landau–Fokker–Planck equation (bibliographical comments are given in Section 7.10). We only explained how our general approach works for the class of equations of a similar type, but with bounded coefficients.

As mentioned, the rising interest in forward-backward systems has been due to the development of mean-field games, which currently represent one of the most popular directions in game theory. Mean-field games were initiated in [113] and [180]. At present, there exist already several excellent surveys and monographs on various directions of the theory, see [34, 43, 48, 92, 97]. For results on the existence of solutions to forward-backward systems under various assumptions, we can refer, e.g., to [29, 93, 164] and references therein. For an analysis in terms of the master equation, see [35, 45, 47, 48, 162].

The general Theorem 6.3.1 can be used in many other situations that have not been covered in this book. For instance, quantum dynamic semigroups acting in the space $\mathcal{L}(\mathcal{H}, \mathcal{H})$ of bounded operators in a Hilbert space $\mathcal{H}$ are known to have generators of the form
\[
L(X) = \sum_{j=1}^{\infty}\left(V_j^* X V_j - \frac{1}{2}\left(V_j^* V_j X + X V_j^* V_j\right)\right) + i[H, X], \tag{6.142}
\]
where H is a self-adjoint operator in the Hilbert space $\mathcal{H}$ and $V_i, \sum_{i=1}^{\infty} V_i^* V_i \in \mathcal{L}(\mathcal{H}, \mathcal{H})$. A straightforward manipulation shows that the corresponding dual evolution on the space of trace-class operators has the generator
\[
L'(Y) = \frac{1}{2}\sum_{j=1}^{\infty}\left([V_j Y, V_j^*] + [V_j, Y V_j^*]\right) - i[H, Y], \tag{6.143}
\]
where $L'$ denotes the dual operator with respect to the usual pairing given by the trace (see Section 1.14). Nonlinear counterparts of dynamic semigroups that appear as the law of large numbers or the mean field limit for interacting quantum particles are given by nonlinear equations of the form
\[
\dot Y_t = L'_Y(Y) = \frac{1}{2}\sum_{j=1}^{\infty}\left([V_j(Y)Y, V_j^*(Y)] + [V_j(Y), Y V_j^*(Y)]\right) - i[H(Y), Y], \tag{6.144}
\]
where the operators $V_i$ and H additionally depend on the current state Y. As a more or less direct consequence of Theorem 6.3.1 (see [147]), one can derive a


rather general well-posedness result for equation (6.144), with its solution forming the so-called nonlinear quantum dynamic semigroup.

Combining the results of Chapters 4 and 5 with the general construction of Sections 6.3 and 6.4, one can construct nonlinear extensions of basically all linear evolutions covered in Chapters 4 and 5. We refer to them as nonlinear Markov if they define evolutions of measures that preserve positivity. As we saw in Section 5.10, this positivity-preservation is the hallmark of nonlinear evolutions that have generators of the Lévy–Khintchin type with coefficients depending on an unknown measure. For instance, fractional Laplacians $|\Delta|^{\alpha/2}$ have this form for $\alpha \le 2$. We refer to [143] for an analysis of a reasonably general equation
\[
\frac{d}{dt}(g, \mu_t) = -(\sigma(x)|\Delta|^{\alpha/2}g, \mu_t) + \int\!\!\int (V(x,y), \nabla g(x))\,\mu_t(dx)\mu_t(dy) + \int\!\!\int\!\!\int (g(x+z) - g(x))\,\psi(x,y;z)\,dz\,\mu_t(dx)\mu_t(dy), \tag{6.145}
\]
which combines fractional Laplacian, drift and integral terms. As other examples, one can mention stochastic geodesic flows on manifolds, including non-local evolutions of the velocities (see [147]) and nonlinear diffusions with singular coefficients (depending, e.g., on the median or the VaR), as developed in [151] for tackling the model of financial asset pricing suggested in [55].
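The trace duality between the generators (6.142) and (6.143) discussed earlier in this section can be verified numerically for matrices. The following sketch assumes a finite-dimensional Hilbert space and finitely many operators $V_j$; the function names are hypothetical and chosen only for this illustration.

```python
import numpy as np

def comm(a, b):
    """Commutator [a, b]."""
    return a @ b - b @ a

def dag(a):
    """Adjoint (conjugate transpose) of a matrix."""
    return a.conj().T

def generator(X, Vs, H):
    """Generator of type (6.142), acting on observables X (finite sum over Vs)."""
    out = 1j * comm(H, X)
    for V in Vs:
        VdV = dag(V) @ V
        out = out + dag(V) @ X @ V - 0.5 * (VdV @ X + X @ VdV)
    return out

def dual_generator(Y, Vs, H):
    """Generator of type (6.143), acting on states Y (finite sum over Vs)."""
    out = -1j * comm(H, Y)
    for V in Vs:
        out = out + 0.5 * (comm(V @ Y, dag(V)) + comm(V, Y @ dag(V)))
    return out
```

For any matrices X and Y one then expects `np.trace(generator(X, Vs, H) @ Y)` and `np.trace(X @ dual_generator(Y, Vs, H))` to agree up to rounding, while `generator` annihilates the identity and `dual_generator` produces traceless matrices.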

Chapter 7

Equations in Spaces of Weighted Measures

In this chapter, we deal with equations in spaces of weighted measures and continuous functions. Basic examples for this type of equation are supplied by general kinetic equations of statistical physics, including the celebrated equations of Boltzmann and Smoluchowski, as well as by replicator dynamics and its various modifications from evolutionary biology and dynamic games. We shall briefly explain how these nonlinear equations arise from the linear evolutions of systems of interacting particles when the number of particles tends to infinity, i.e., as the dynamic law of large numbers. Note that we shall only lay the foundations, while we will supply a brief guide to the enormous literature at the end of the chapter. Most of the exposition can be regarded as a far-reaching extension of the discrete setting of Chapter 3.

7.1 Conditional positivity

Let us start with the infinitesimal structure of positivity-preserving evolutions.

Theorem 7.1.1. Let X be a complete metric space. Let $\dot\mu_t = F_t(\mu_t)$ be an equation in $\mathcal{M}(X)$ such that $F : \mathbb{R} \times \mathcal{M}(X) \to \mathcal{M}(X)$ is a continuous function. Let $\mu_t$, $t \ge s$, be a solution to this equation with the initial data $\mu_s = \mu \in \mathcal{M}^+(X)$. If the negative part $F_s^-(\mu)$ of the Hahn decomposition $F_s(\mu) = F_s^+(\mu) - F_s^-(\mu)$ of $F_s(\mu)$ is not absolutely continuous with respect to μ, then $\mu_t \notin \mathcal{M}^+(X)$ for all sufficiently small $t - s$.

Proof. By the Lebesgue decomposition theorem, we find $F_s^-(\mu) = F_{s,abs}^-(\mu) + F_{s,sing}^-(\mu)$, where $F_{s,abs}^-(\mu)$ and $F_{s,sing}^-(\mu)$ are the absolutely continuous respectively

© Springer Nature Switzerland AG 2019 V. Kolokoltsov, Differential Equations on Measures and Functional Spaces, Birkhäuser Advanced Texts Basler Lehrbücher, https://doi.org/10.1007/978-3-030-03377-4_7


the singular parts of the measure $F_s^-(\mu)$ with respect to μ. If $F_s^-(\mu)$ is not absolutely continuous with respect to μ, there exists A such that $(F_{s,sing}^-(\mu))(A) > 0$ and $(F_{s,abs}^-(\mu))(A) = (F_s^+(\mu))(A) = \mu(A) = 0$. For any solution $\mu_t$, we have
\[
\|\mu_{s+\delta} - \mu - \delta F_s(\mu)\| \le \epsilon\delta,
\]
where ε can be made arbitrarily small by choosing a sufficiently small δ. Therefore,
\[
\mu_{s+\delta}(A) + \delta\,(F_{s,sing}^-(\mu))(A) \le \epsilon\delta.
\]
If $\epsilon < (F_{s,sing}^-(\mu))(A)$, it follows that $\mu_{s+\delta}(A) < 0$, as required.
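The mechanism of this proof can be seen on a two-point space, where measures are vectors in the plane. The sketch below is only an illustration with hypothetical right-hand sides: `F_bad` removes mass from a point that carries none (its negative part is singular with respect to μ), whereas `F_good` removes mass at a rate proportional to the mass present.

```python
import numpy as np

def euler_step(mu, F, delta):
    """First-order approximation mu + delta * F(mu) of the solution after time delta."""
    return mu + delta * F(mu)

def F_bad(mu):
    """Constant outflow from point 1, irrespective of the mass mu[1] sitting there."""
    return np.array([1.0, -1.0])

def F_good(mu):
    """Mass exchange between the two points with unit rate.

    The negative part is -mu, hence absolutely continuous with respect to mu.
    """
    omega = np.array([mu[1], mu[0]])   # incoming mass, a non-negative measure
    return omega - mu

mu0 = np.array([1.0, 0.0])             # unit mass at point 0, nothing at point 1
```

For `F_bad` the equation is linear in time with exact solution `mu0 + t * F_bad(mu0)`, so the second component is negative for every t > 0, whereas the Euler step with `F_good` stays non-negative for step sizes up to 1.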



This result motivates the following definition: A mapping $F : \mathcal{M}(X) \to \mathcal{M}(X)$ is conditionally positive if its negative part $F^-(\mu)$ is everywhere absolutely continuous with respect to μ, i.e.,
\[
F(\mu) = \Omega(\mu) - a(x,\mu)\mu, \tag{7.1}
\]
where a(x, μ) is a non-negative function and $\Omega(\mu) \in \mathcal{M}^+(X)$.

The proof given above is sufficiently robust to have several extensions. For instance, the same conclusion holds if we consider the same equation in the space of weighted measures (see $\mathcal{M}(X, L)$ below), or if we require the equation to hold weakly. However, it does not cover equations where $F_t(\mu)$ maps $\mathcal{M}(X)$ into a more irregular class of generalized functions.

Introducing the transition kernel
\[
\nu(x, \mu, dy) = \frac{a(x,\mu)}{\int a(z,\mu)\,\mu(dz)}\,\Omega(\mu)(dy), \tag{7.2}
\]
depending on μ as on a parameter, one can rewrite (7.1) in the equivalent form
\[
F(\mu) = \int_X \mu(dw)\,\nu(w, \mu, .) - a(., \mu)\mu(.), \tag{7.3}
\]
which is convenient for comparisons with the linear theory, and therefore for the weak formulation. In fact, by (7.3), the equation $\dot\mu_t = F(\mu_t)$ can be rewritten in the standard weak form (6.32), i.e.,
\[
\frac{d}{dt}(f, \mu_t) = (A(\mu_t)f, \mu_t) = (f, F(\mu_t)), \qquad \mu_0 = Y, \tag{7.4}
\]
with test-functions f from C(X) and
\[
(A(\mu)f)(x) = \int_X f(y)\,\nu(x, \mu, dy) - a(x,\mu)f(x). \tag{7.5}
\]
In particular, if $\int F(\mu)(dy) = 0$ for all μ (the conservativity condition), that is, $\int \Omega(\mu)(dy) = \int a(x,\mu)\,\mu(dx)$, then the kernel (7.2) satisfies the equation $a(x,\mu) = \int \nu(x,\mu,dy) = \|\nu(x,\mu,.)\|$, so that
\[
A(\mu)f = \int_X (f(y) - f(x))\,\nu(x, \mu, dy). \tag{7.6}
\]


Remark 111. Formula (7.2) provides the simplest ν satisfying (7.3). In all practical examples, F is expressed by (7.3) with ν differing from (7.2). Therefore, we shall be constantly working with (7.3) without assuming the validity of (7.2).

7.2 Simplest equations that preserve positivity

Recall the following notations for weighted measure spaces: For a non-negative continuous function L on X, let us introduce the sets of weighted measures
\[
\mathcal{M}(X, L) = \{\mu \in \mathcal{M}(X) : (L, |\mu|) < \infty\}, \qquad \mathcal{M}_{\le\lambda}(X, L) = \{\mu \in \mathcal{M}(X, L) : (L, |\mu|) \le \lambda\}
\]
for a λ > 0. We denote by $\mathcal{M}^+(X, L)$ and $\mathcal{M}^+_{\le\lambda}(X, L)$ the corresponding subsets of non-negative elements. The space $\mathcal{M}(X, L)$ is a Banach space if equipped with the norm
\[
\|\mu\|_{\mathcal{M}(X,L)} = \int L(x)\,|\mu|(dx) = \sup\{|(f, \mu)| : |f| \le L\}.
\]
Similar to the usual spaces of measures, the natural weak topology on the weighted space $\mathcal{M}(X, L)$ is defined with respect to the duality relation with the weighted space $C_L(X)$ consisting of continuous functions on X with a bounded norm
\[
\|f\|_{C_L(X)} = \inf\{K : |f(x)| \le KL(x) \text{ for all } x\}.
\]
For kernels (even signed kernels) ν(x, dy) acting in a space of weighted measures and functions, the natural norm is the ‘double-norm’, i.e., the norm in $C_L(X)$ with respect to the first variable and the norm in $\mathcal{M}(X, L)$ with respect to the second one:
\[
\|\nu(., .)\|_{LL(X)} = \inf\{K : \|\nu(x, .)\|_{\mathcal{M}(X,L)} \le KL(x)\}. \tag{7.7}
\]
For strictly positive L, this is equivalent to
\[
\|\nu(., .)\|_{LL(X)} = \sup_x \frac{\|\nu(x, .)\|_{\mathcal{M}(X,L)}}{L(x)} = \sup_x \left\{L^{-1}(x)\int_X L(y)\,|\nu(x, dy)|\right\}.
\]
These norms also define the norms of the integral operators (1.77) with the kernel ν, when they act in weighted spaces:
\[
\|T_\nu\|_{\mathcal{L}(C_L(X))} \le \|\nu(., .)\|_{LL(X)}, \qquad \|T_\nu\|_{\mathcal{L}(\mathcal{M}(X,L))} \le \|\nu(., .)\|_{LL(X)}. \tag{7.8}
\]
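On a finite state space the weighted norms and the double-norm (7.7) reduce to elementary matrix computations, and the bounds (7.8) can be tested directly. The sketch below assumes $X = \{0, \dots, n-1\}$ and a strictly positive weight; the function names are hypothetical and used only for this illustration.

```python
import numpy as np

def weighted_measure_norm(mu, L):
    """Norm of a (signed) measure mu in M(X, L): sum of L(x) * |mu({x})|."""
    return float(np.sum(L * np.abs(mu)))

def weighted_function_norm(f, L):
    """Norm of a function f in C_L(X) for strictly positive L: sup |f(x)| / L(x)."""
    return float(np.max(np.abs(f) / L))

def double_norm(nu, L):
    """Double-norm (7.7) of a kernel; nu[x, y] is the mass nu(x, {y})."""
    return float(np.max((np.abs(nu) @ L) / L))

def act_on_function(nu, f):
    """(T f)(x) = sum_y f(y) nu(x, {y})."""
    return nu @ f

def act_on_measure(mu, nu):
    """(mu T)({y}) = sum_x mu({x}) nu(x, {y})."""
    return mu @ nu
```

With a weight such as `L = 1.0 + np.arange(n)`, the inequalities (7.8) amount to `weighted_function_norm(act_on_function(nu, f), L) <= double_norm(nu, L) * weighted_function_norm(f, L)` and to the analogous bound for `act_on_measure`.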

Exercise 7.2.1. Prove these inequalities. Show that these inequalities become equalities if L is strictly positive.

Let us clarify the relation between the Lipschitz continuity of ν, a and F connected by (7.3).


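Lemma 7.2.1. Suppose, for $\mu, \xi, \eta \in \mathcal{M}_{\le\lambda}(X, L)$, that
\[
a(x,\mu) + \|\nu(., \mu, .)\|_{LL(X)} \le c_1(\lambda), \tag{7.9}
\]
\[
|a(x,\xi) - a(x,\eta)| + \|\nu(., \xi, .) - \nu(., \eta, .)\|_{LL(X)} \le c_2(\lambda)\|\xi - \eta\|_{\mathcal{M}(X,L)}. \tag{7.10}
\]
Then
\[
\|F(\mu)\|_{\mathcal{M}(X,L)} \le \lambda c_1(\lambda), \tag{7.11}
\]
\[
\|F(\xi) - F(\eta)\|_{\mathcal{M}(X,L)} \le (c_1(\lambda) + \lambda c_2(\lambda))\|\xi - \eta\|_{\mathcal{M}(X,L)}. \tag{7.12}
\]

Proof. We have
\[
\|F(\mu)\|_{\mathcal{M}(X,L)} \le \int_{X^2} L(y)\,\mu(dw)\,\nu(w, \mu, dy) + \int L(y)\,a(y,\mu)\,\mu(dy) \le c_1(\lambda)\|\mu\|_{\mathcal{M}(X,L)},
\]
which shows (7.11). Next, we write
\[
\begin{aligned}
\|F(\xi) - F(\eta)\|_{\mathcal{M}(X,L)} \le{} & \int L(y)\,|a(y,\xi) - a(y,\eta)|\,\eta(dy) + \int L(y)\,a(y,\xi)\,|\xi(dy) - \eta(dy)| \\
& + \int_{X^2} L(y)\,|\xi(dw) - \eta(dw)|\,\nu(w, \xi, dy) + \int_{X^2} L(y)\,\eta(dw)\,|\nu(w, \xi, dy) - \nu(w, \eta, dy)|,
\end{aligned}
\]
which leads to (7.12).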

Most of the analysis of the equations (7.4) is linked with some Lyapunov functions. A non-negative continuous function L on X is called the Lyapunov function for F(μ) or A(μ) if $\nu(x, \mu, .) \in \mathcal{M}^+(X, L)$ for all x and $\mu \in \mathcal{M}^+(X, L)$ and
\[
(L, F(\mu)) = (A(\mu)L, \mu) \le \alpha(L, \mu) + \beta \tag{7.13}
\]
with some constants α, β. As in the discrete setting, the Lyapunov function L is called subcritical (respectively critical), or F is said to be L-subcritical (respectively L-critical), if $(L, F(\mu)) \le 0$ (respectively $(L, F(\mu)) = 0$) for all $\mu \in \mathcal{M}^+(X, L)$. We shall now prove the basic well-posedness result for measure-valued ODEs with a Lyapunov condition and bounded rates, extending its discrete version of Theorem 3.6.1.

Theorem 7.2.1. (i) Let X be a complete metric space, L a non-negative continuous function on X, and a(x, μ) a continuous non-negative function on $X \times \mathcal{M}^+(X, L)$. Let ν be a family of transition kernels such that $\nu(x, \mu, .) \in \mathcal{M}^+(X, L)$ for all $x \in X$, $\mu \in \mathcal{M}^+(X, L)$, depending continuously on x, with ν considered in the weak topology of $\mathcal{M}^+(X, L)$. Let F be given by (7.3). Let the Lyapunov condition (7.13) and the conditions of growth and Lipschitz continuity (7.9), (7.10) hold. Then for any $Y \in \mathcal{M}^+_{\le\lambda}(X, L)$, there exists a unique global solution $\mu_t(Y) \in \mathcal{M}^+(X, L)$ (defined for all t ≥ 0) to the Cauchy problem for the equation $\dot\mu = F(\mu)$ in $\mathcal{M}(X, L)$ (i.e., with the derivative understood in the Banach topology of $\mathcal{M}(X, L)$), with the initial condition Y. Moreover, if α ≠ 0, then
\[
\mu_t(Y) \in \mathcal{M}^+_{\le\lambda(t)}(X, L), \qquad \lambda(t) = e^{\alpha t}\lambda + (e^{\alpha t} - 1)\beta/\alpha. \tag{7.14}
\]

If α = 0, then the same holds with λ(t) = λ + βt. If f is L-subcritical, then the same holds with λ(t) = λ. (ii) Suppose that all the conditions of (i) hold apart from (7.9) and (7.10), which are substituted by the analogous conditions in the topology of M(X) = M(X, L = 1), i.e., (7.15) a(x, μ) + ν(., μ, .)11(X) ≤ c1 (λ), |a(x, ξ) − a(x, η)| + ν(., ξ, .) − ν(., η, .)11(X) ≤ c2 (λ)ξ − ηM(X) . (7.16) Assume moreover that L is bounded from below by a constant L0 > 0. Then the same conclusion as in (i) holds, but the solution to the equation μ˙ = F (μ) is understood in the sense of the Banach topology of M(X). Proof. (i) As was the case for Theorem 3.6.1, this is again a straightforward extension of the proof of Theorem 3.1.1 and a consequence of Theorem 2.1.2. Namely, fixing T , let us define the convex set Ca,b (T ) of continuous functions μ : [0, T ] → M+ (X, L) such that μ([0, t]) ∈ M+ ≤λ(t) (X, L) for all t ∈ [0, T ]. Let K = max{a(x, μ) : x ∈ X, μ ∈ M+ ≤λ(T ) (X, L)}. Therefore, by defining the mapping ΦY as t −Kt [ΦY (μ. )](t) = e Y + e−K(t−s) [F (μs ) + Kμs ] ds, 0

it follows that $\Phi_Y$ preserves positivity and that the fixed points of $\Phi_Y$ are the solutions to $\dot\mu = F(\mu)$. The same integration as in Theorem 3.6.1 shows that $\Phi_Y$ preserves the set $C_{a,b}(T)$. The proof is completed by referring to Theorem 2.1.2, because, by (7.11) and (7.12),

$$\|[\Phi_{Y_1}(\mu^1_.)](t) - [\Phi_{Y_2}(\mu^2_.)](t)\|_{\mathcal{M}(X,L)} \le (K + c_1(\lambda) + \lambda c_2(\lambda)) \int_0^t \|\mu^1_s - \mu^2_s\|_{\mathcal{M}(X,L)}\,ds + \|Y_1 - Y_2\|_{\mathcal{M}(X,L)}$$

and

$$[\Phi_Y(Y)](t) - Y = e^{-Kt}Y - Y + \int_0^t e^{-K(t-s)}[F(Y) + KY]\,ds = \frac{1}{K}(1 - e^{-Kt})F(Y),$$

which implies $\|[\Phi_Y(Y)](t) - Y\|_{\mathcal{M}(X,L)} \le t\|F(Y)\|_{\mathcal{M}(X,L)}$.

(ii) The conditions (7.15) and (7.16) imply

$$\|F(\mu)\|_{\mathcal{M}(X)} \le (a(x, \mu) + \|\nu(., \mu, .)\|_{11(X)})\|\mu\|_{\mathcal{M}(X)} \le \lambda c_1(\lambda)/L_0, \tag{7.17}$$

$$\|F(\xi) - F(\eta)\|_{\mathcal{M}(X)} \le (c_1(\lambda) + \lambda c_2(\lambda)/L_0)\|\xi - \eta\|_{\mathcal{M}(X)}. \tag{7.18}$$

Consequently,

$$\|[\Phi_{Y_1}(\mu^1_.)](t) - [\Phi_{Y_2}(\mu^2_.)](t)\|_{\mathcal{M}(X)} \le (K + c_1(\lambda) + \lambda c_2(\lambda)/L_0) \int_0^t \|\mu^1_s - \mu^2_s\|_{\mathcal{M}(X)}\,ds + \|Y_1 - Y_2\|_{\mathcal{M}(X)}.$$

Therefore, the proof can again be completed by referring to Theorem 2.1.2 and the observation that the sets $\mathcal{M}^+_{\le\lambda}(X, L)$ are closed in $\mathcal{M}(X)$ for any $\lambda$. □

Remark 112. In [147], four different proofs of Theorem 7.2.1 are provided for the special case $L = 1$.

Let us now prove a sensitivity result for the evolution $\dot\mu = F(\mu)$. Translating the basic setting of Section 1.13 into the present setting with weighted spaces, we say that a mapping $F : \mathcal{M}^+_{\le\lambda}(X, L) \to \mathcal{M}(X, L)$ has a strong variational derivative $\delta F(Y, x)$ if for any $Y \in \mathcal{M}_{\le\lambda}(X)$, $x \in X$ the limit

$$\delta F(Y, x) = \frac{\delta F}{\delta Y(x)} = \lim_{s \to 0_+} \frac{1}{s}(F(Y + s\delta_x) - F(Y))$$

exists in the norm topology of $\mathcal{M}(X, L)$.

We say that $F$ belongs to $C^1_{weak}(\mathcal{M}_{\le\lambda}(X, L), \mathcal{M}(X, L))$, if the strong variational derivative $\delta F(Y, x)$ exists for all $x \in X$, $Y \in \mathcal{M}(X, L)$ and is a continuous (in the sense of the weak topology) mapping $\mathcal{M}^+_{\le\lambda}(X, L) \times X \to \mathcal{M}(X, L)$. As in Theorem 1.13.2(i), it follows that a mapping $F \in C^1_{weak}(\mathcal{M}_{\le\lambda}(X, L), \mathcal{M}(X, L))$ belongs to $C^1(\mathcal{M}_{\le\lambda}(X, L), \mathcal{M}(X, L))$ if the mapping $Y \to \delta F(Y, x)$ is a continuous bounded mapping $\mathcal{M}^+_{\le\lambda}(X, L) \to \mathcal{L}(\mathcal{M}(X, L))$ with respect to the norm topologies on both sides, so that $\|\delta F(Y_1, .) - \delta F(Y_2, .)\|_{LL(X)} \to 0$, as $\|Y_1 - Y_2\|_{\mathcal{M}(X,L)} \to 0$. (For this, recall that the norm in $\mathcal{L}(\mathcal{M}(X, L))$ is given by (7.8).) The space $C^1_{weak}(\mathcal{M}_{\le\lambda}(X, L), \mathcal{M}(X, L)) \cap C^1(\mathcal{M}_{\le\lambda}(X, L), \mathcal{M}(X, L))$ is a Banach space when equipped with the norm

$$\sup_{Y \in \mathcal{M}_{\le\lambda}(X,L)} \|F(Y)\|_{\mathcal{M}(X,L)} + \sup_{Y \in \mathcal{M}_{\le\lambda}(X,L)} \|\delta F(Y, .)\|_{LL(X)}. \tag{7.19}$$


As a direct consequence of Theorems 2.8.1 and 7.2.1, we obtain the following sensitivity result:

Theorem 7.2.2. Under the assumptions of Theorem 7.2.1(i), suppose that

$$F \in C^1_{weak}(\mathcal{M}_{\le\lambda}(X, L), \mathcal{M}(X, L)) \cap C^1(\mathcal{M}_{\le\lambda}(X, L), \mathcal{M}(X, L))$$

for any $\lambda$ and that the derivative is uniformly continuous in the sense that for any $\lambda > 0$ and $\epsilon > 0$ there exists $\delta > 0$ such that $\|\delta F(Y_1, .) - \delta F(Y_2, .)\|_{LL(X)} < \epsilon$, whenever $Y_1, Y_2 \in \mathcal{M}^+_{\le\lambda}(X, L)$ and $\|Y_1 - Y_2\|_{\mathcal{M}(X,L)} < \delta$. Then for any $\xi \in \mathcal{M}^+(X, L)$, the derivative $\xi_t = D\mu_t(Y)[\xi]$ exists strongly (i.e., the limit exists in the norm topology of $\mathcal{M}(X, L)$) and is the unique solution to the integral equation

$$\xi_t = e^{-Kt}\xi + \int_0^t e^{-K(t-s)}\Big[\int \delta F(\mu_s, x)\,\xi_s(dx) + K\xi_s\Big]\,ds,$$

which is equivalent to the Cauchy problem

$$\dot\xi_t = \int \delta F(\mu_t, x)\,\xi_t(dx), \quad \xi_0 = \xi.$$

Exercise 7.2.2. Formulate the analogous result in the framework of Theorem 7.2.1(ii).

Let us emphasize that the regularity conditions on $F$ in Theorem 7.2.2 follow from the following regularity conditions on $\nu$ and $a$ via (7.3): the strong variational derivatives

$$\frac{\delta a(x, \mu)}{\delta\mu(y)}, \quad \frac{\delta\nu(x, \mu, .)}{\delta\mu(y)}$$

exist, depend continuously on their arguments (measures considered in the weak topology of $\mathcal{M}(X, L)$) and are locally uniformly continuous as functions of $\mu$ in the sense that for any $\lambda > 0$ and $\epsilon > 0$ there exists $\delta > 0$ such that

$$\sup_y L^{-1}(y) \int L(x) \left| \frac{\delta a(x, Y_1)}{\delta\mu(y)} Y_1(dx) - \frac{\delta a(x, Y_2)}{\delta\mu(y)} Y_2(dx) \right| < \epsilon, \tag{7.20}$$

$$\sup_y L^{-1}(y) \int L(x) \left| \int \frac{\delta\nu(w, Y_1, dx)}{\delta\mu(y)} Y_1(dw) - \int \frac{\delta\nu(w, Y_2, dx)}{\delta\mu(y)} Y_2(dw) \right| < \epsilon, \tag{7.21}$$

whenever $Y_1, Y_2 \in \mathcal{M}^+_{\le\lambda}(X, L)$ and $\|Y_1 - Y_2\|_{\mathcal{M}(X,L)} < \delta$.

A particularly interesting case is that of equations with unbounded rates, where the non-uniqueness of solutions can be naturally expected. A characteristic example of such a situation is given by the following existence result.


Theorem 7.2.3. Let $X$ be a locally compact metric space, $L$ a continuous function on $X$ that is bounded from below by a constant $L_0 > 0$ and tends to infinity as $x \to \infty$. Let $a(x, \mu)$ be a continuous non-negative function on $X \times \mathcal{M}^+(X, L)$ and $\nu$ a family of transition kernels in $\mathcal{M}(X, L)$ such that for $\mu, \xi, \eta \in \mathcal{M}^+_{\le\lambda}(X, L)$, we have

$$|a(x, \mu)| + \|\nu(x, \mu, .)\|_{\mathcal{M}(X)} \le \omega(x)L(x)c_1(\lambda), \tag{7.22}$$

$$|a(x, \xi) - a(x, \eta)|\,\mathbf{1}_{\{L(x) \le \rho\}} \le c_2(\lambda)\tilde{c}_2(\rho)\|\xi - \eta\|_{\mathcal{M}(X)}, \tag{7.23}$$

$$\|\nu(x, \xi, .) - \nu(x, \eta, .)\|_{\mathcal{M}(X)}\,\mathbf{1}_{\{L(x) \le \rho\}} \le c_2(\lambda)\tilde{c}_2(\rho)\|\xi - \eta\|_{\mathcal{M}(X)}, \tag{7.24}$$

with some $c_1(\lambda)$, $c_2(\lambda)$, $\tilde{c}_2(\rho)$ and a bounded function $\omega$ on $\mathbb{R}_+$ tending to zero at infinity. Let the Lyapunov condition (7.13) hold and $F$ be given by (7.3). Then for any $Y \in \mathcal{M}^+_{\le\lambda}(X, L)$, there exists a global solution $\mu_t(Y)$ to the Cauchy problem for the equation $\dot\mu = F(\mu)$ in $\mathcal{M}(X)$, with the initial condition $Y$, such that (7.14) holds for $\alpha \neq 0$. For $\alpha = 0$, the same condition holds with $\lambda(t) = \lambda + \beta t$.

Proof. Let $\chi_n(x)$, $n \in \mathbb{N}$, be a continuous function $X \to [0, 1]$ such that $\chi_n(x) = 1$ for $L(x) \le n - 1$ and $\chi_n(x) = 0$ for $L(x) \ge n$. For any $n < m$, let us consider the cut-off data

$$a_m(x, \mu) = \chi_m(x)a(x, \mu), \quad \nu_n(x, \mu, .) = \chi_n(x)\nu(x, \mu, .).$$

Let $F_{nm}$ be given by (7.3) with $a_m$, $\nu_n$ instead of $a$, $\nu$. Due to the assumption $n < m$, the Lyapunov condition (7.13) holds for $F_{nm}$. In fact, one can just fix $n = m - 1$. Applying Theorem 7.2.1(ii) leads to the conclusion that for any $n < m$ there exists a unique solution $\mu^{nm}_t(Y)$ to the Cauchy problem for the equation $\dot\mu = F_{nm}(\mu)$ which satisfies the required growth conditions (e.g., (7.14) for $\alpha \neq 0$). Next, we find

$$\|F_{nm}(\mu^{nm}_t(Y))\|_{\mathcal{M}(X)} \le \sup_z\{\omega(z)\}\,\lambda(t)c_1(\lambda(t)),$$

which is uniformly bounded in $n, m$. Consequently, for any sequence of pairs $n < m$ tending to infinity, the Arzelà–Ascoli theorem ensures the existence of a subsequence such that $\mu^{nm}_t(Y)$ converges in $C([0, t], \mathcal{M}(X))$ to some function $\mu_t(Y)$. It remains to show that $\mu_t(Y)$ satisfies the equation

$$\mu_t = Y + \int_0^t F(\mu_s)\,ds.$$

This can be achieved by passing to the limit in the corresponding equation for $\mu^{nm}_t$. Since

$$\|F_{nm}(\mu) - F(\mu)\|_{\mathcal{M}(X)} \le \omega(n)\lambda c_1(\lambda) \to 0,$$

as $n \to \infty$, uniformly for $\mu \in \mathcal{M}^+_{\le\lambda}(X, L)$ with any $\lambda$, one only has to show that $\|F(\mu^{nm}_t(Y)) - F(\mu_t(Y))\|_{\mathcal{M}(X)} \to 0$ as $m \to \infty$. Decomposing the integrals in the expression

$$\|F(\mu^{nm}_t(Y)) - F(\mu_t(Y))\|_{\mathcal{M}(X)} \le \int_X |a(x, \mu^{nm}_t(Y))\mu^{nm}_t(Y)(dx) - a(x, \mu_t(Y))\mu_t(Y)(dx)| + \int_{X^2} |\mu^{nm}_t(Y)(dx)\nu(x, \mu^{nm}_t(Y), dy) - \mu_t(Y)(dx)\nu(x, \mu_t(Y), dy)|$$

into two parts over the sets $\{L(x) > \rho\}$ and $\{L(x) \le \rho\}$, respectively, we find that the integrals over the first set tend to zero by (7.22), as $\rho \to \infty$, and the integrals over the second set tend to zero, as $n, m \to \infty$, for any $\rho$ due to the convergence $\mu^{nm}_t(Y) \to \mu_t(Y)$. □
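The fixed-point construction used in the proof of Theorem 7.2.1 can be tried out numerically on a finite state space. The following is only a sketch under stated assumptions: the toy model (states 0, ..., 4, weights L(i) = 1 + i, rate a(μ) = 1/(1 + |μ|), a jump one step down plus half a new particle at 0), the trapezoidal quadrature and all function names are hypothetical and not taken from the text. For this model one checks by hand that (L, F(μ)) ≤ 0.5 (L, μ), so (7.13) holds with α = 0.5 and β = 0.

```python
import math

N = 5                                   # finite state space X = {0, ..., N-1}
L = [1.0 + i for i in range(N)]         # Lyapunov weights L(i) = 1 + i >= 1
K = 1.0                                 # upper bound for the rate a(x, mu)
ALPHA, BETA = 0.5, 0.0                  # constants in (7.13) for this toy model


def F(mu):
    """Toy right-hand side F(mu) = mu nu(mu) - a(mu) mu with a(mu) = 1/(1 + |mu|)."""
    r = 1.0 / (1.0 + sum(mu))           # bounded, Lipschitz rate, r <= K
    out = [-r * m for m in mu]          # loss term -a(mu) mu
    for i, m in enumerate(mu):
        out[max(i - 1, 0)] += r * m     # the particle moves one step down
        out[0] += 0.5 * r * m           # and half a new particle appears at 0
    return out


def phi(Y, curve, h):
    """Apply Phi_Y to a curve sampled on a uniform grid (trapezoidal rule)."""
    g = [[f + K * m for f, m in zip(F(mu), mu)] for mu in curve]   # F + K mu >= 0
    new = []
    for n in range(len(curve)):
        val = [math.exp(-K * n * h) * y for y in Y]
        for s in range(n + 1):
            w = 0.0 if n == 0 else h * (0.5 if s in (0, n) else 1.0)
            e = w * math.exp(-K * (n - s) * h)
            val = [v + e * gi for v, gi in zip(val, g[s])]
        new.append(val)
    return new


def solve(Y, T=1.0, steps=100, iters=40):
    """Picard iteration mu -> Phi_Y(mu), started from the constant curve Y."""
    h = T / steps
    curve = [list(Y) for _ in range(steps + 1)]
    gap = float("inf")
    for _ in range(iters):
        new = phi(Y, curve, h)
        gap = max(abs(a - b) for u, v in zip(new, curve) for a, b in zip(u, v))
        curve = new
    return curve, gap, h


Y = [0.0, 0.0, 1.0, 0.0, 2.0]
curve, gap, h = solve(Y)
lam = sum(l * y for l, y in zip(L, Y))  # lambda = (L, Y)
bound_ok = all(
    sum(l * m for l, m in zip(L, mu)) <= math.exp(ALPHA * n * h) * lam + 1e-9
    for n, mu in enumerate(curve)
)
min_entry = min(min(mu) for mu in curve)
print(gap, min_entry, bound_ok)
```

Because a ≤ K, the integrand F(μ) + Kμ is non-negative whenever μ is, which is why every iterate stays in the positive cone; the last lines compare (L, μ_t) with the bound λ(t) from (7.14) on the grid.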

7.3 Path-dependent equations and forward-backward systems

Let us briefly sketch the analogue of the theory of forward-backward systems from Section 6.10 for the framework of kinetic equations in weighted measure spaces. The starting point is the class of path-dependent kinetic equations, which can be written as

$$\frac{d}{dt}\mu_t(dx) = F(t, \{\mu_.\}) = \int \mu_t(dw)\nu(t, w, \{\mu_.\}, dx) - a(t, x, \{\mu_.\})\mu_t(dx), \quad \mu_0 = Y, \tag{7.25}$$

on some interval $t \in [0, T]$. Extending Theorem 7.2.1 and limiting our attention to the subcritical case only gives us the following result:

Theorem 7.3.1. Let $X$ be a locally compact metric space, $L$ a continuous function on $X$ such that $L(x) \ge L_0 > 0$ for all $x$, and $L(x) \to \infty$, as $x \to \infty$. Let $a(t, x, \{\mu_.\})$ be a continuous non-negative function on $[0, T] \times X \times C([0, T], \mathcal{M}^+(X, L))$ and $\nu$ a family of transition kernels such that $\nu(t, x, \{\mu_.\}, .) \in \mathcal{M}^+(X, L)$ for all $t \in [0, T]$, $x \in X$, $\mu_. \in C([0, T], \mathcal{M}^+(X, L))$, depending continuously on $t$ and $x$, with $\nu$ considered in the weak topology of $\mathcal{M}^+(X)$. Let the Lyapunov condition $(L, F(t, \{\mu_.\})) \le 0$ hold, as well as the following conditions of growth and Lipschitz continuity:

$$a(t, x, \{\mu_.\}) + \|\nu(t, ., \{\mu_.\}, .)\|_{11(X)} \le c_1(\lambda), \tag{7.26}$$

$$|a(t, x, \{\xi_.\}) - a(t, x, \{\eta_.\})| + \|\nu(t, ., \{\xi_.\}, .) - \nu(t, ., \{\eta_.\}, .)\|_{11(X)} \le c_2(\lambda)\|\xi - \eta\|_{C([0,T],\mathcal{M}(X))}. \tag{7.27}$$


Then for any $Y \in \mathcal{M}^+_{\le\lambda}(X, L)$ there exists a solution $\mu_t(Y) \in \mathcal{M}^+(X, L)$ to the Cauchy problem for equation (7.25) (with the derivative understood in the Banach topology of $\mathcal{M}(X)$), with the initial condition $Y$. Moreover, this solution is unique for sufficiently small $T$, namely for

$$T(1 + c_1(\lambda) + \lambda c_2(\lambda)/L_0) < 1. \tag{7.28}$$

Proof. As in Theorem 7.2.1, our solution is a fixed point of the mapping

$$[\Phi_Y(\mu_.)](t) = e^{-Kt}Y + \int_0^t e^{-K(t-s)}[F(s, \{\mu_.\}) + K\mu_s]\,ds.$$

Since

$$\|[\Phi_Y(\mu^1_.)](t) - [\Phi_Y(\mu^2_.)](t)\|_{\mathcal{M}(X)} \le t\|\mu^1_. - \mu^2_.\|_{C([0,T],\mathcal{M}(X))}(K + c_1(\lambda) + \lambda c_2(\lambda)/L_0),$$

the mapping $\Phi_Y$ is a contraction in $C([0, T], \mathcal{M}(X))$ under (7.28), which implies the existence and uniqueness of a fixed point.

For arbitrary $T$, $\Phi_Y$ maps $C([0, T], \mathcal{M}^+_{\le\lambda}(X, L))$ to itself. By Proposition 1.1.2, the sets $\mathcal{M}^+_{\le\lambda}(X, L)$ are metric and compact in the weak topology. Since the image of $\Phi_Y$ consists of Lipschitz-continuous curves, we can conclude by the Arzelà–Ascoli theorem that this image is compact in $C([0, T], \mathcal{M}^+_{\le\lambda}(X, L))$, where $\mathcal{M}^+_{\le\lambda}(X, L)$ is metricized by the metric of Proposition 1.1.2. Consequently, as in Theorem 6.4.3, the Schauder fixed-point theorem ensures the existence of a fixed point of $\Phi_Y$. □

Let us now consider the forward-backward system with a forward kinetic equation and a backward Bellman equation of the jump-type, see (2.124) and (2.125). In the latter equation, we separate jumps that depend on the control $u$ and jumps that depend on the 'environment' $\mu$ that couples it with the kinetic equation:

$$\begin{aligned}
&\frac{d}{dt}\mu_t(dx) = F(t, \mu_t, \bar{u}(x, \{\mu_{\ge t}\}; S_T)) \\
&\qquad = \int \mu_t(dw)\nu(w, \mu_t, \bar{u}(x, \{\mu_{\ge t}\}; S_T), dx) - a(x, \mu_t, \bar{u}(x, \{\mu_{\ge t}\}; S_T))\mu_t(dx), \quad \mu_0 = Y, \\
&\frac{\partial S}{\partial t} + \sup_{u \in U}\Big[\sum_{j=1}^m u_j\nu_j(x)(S(t, y_j(x)) - S(t, x)) - J(x, u)\Big] \\
&\qquad + \int (S(t, y) - S(t, x))\,n(\mu_t, x, dy) = 0, \quad S|_{t=T} = S_T,
\end{aligned} \tag{7.29}$$

where $U$ is a convex set and $J$ a strictly convex function of $u$, so that the value $\hat{u}(x, p)$ where the 'Hamiltonian'

$$H(x, p) = \sup_{u \in U}\Big[\sum_{j=1}^m u_j\nu_j(x)p_j - J(x, u)\Big], \quad x \in X,\ p \in \mathbb{R}^m,$$

reaches its maximum is always unique. The control $\bar{u}$ in the first equation is defined by the formula

$$\bar{u}(x, \{\mu_{\ge t}\}; S_T) = \hat{u}(x, p)|_{p_j = S(t, y_j(x)) - S(t, x)}, \tag{7.30}$$

with $S(t, x)$ depending on $\{\mu_{\ge t}\}$ and $S_T$ via the solution to the second equation of (7.29).

Theorem 7.3.2. Let $X$ be a locally compact metric space, $L$ a continuous function on $X$ such that $L(x) \ge L_0 > 0$ for all $x$, and $L(x) \to \infty$, as $x \to \infty$. Let $U$ be a convex compact subset of a Euclidean space. Let $a(x, \mu, u)$ be a continuous non-negative function on $X \times \mathcal{M}^+(X, L) \times U$ and $\nu$ a family of transition kernels such that $\nu(x, \mu, u, .) \in \mathcal{M}^+(X, L)$ for all $x \in X$, $\mu \in \mathcal{M}^+(X, L)$, $u \in U$, depending continuously on $x$, with $\nu$ considered in the weak topology of $\mathcal{M}^+(X)$. Let the Lyapunov condition $(L, F(t, \mu, u)) \le 0$ hold, as well as the following conditions of growth and Lipschitz continuity:

$$a(x, \mu, u) + \|\nu(., \mu, u, .)\|_{11(X)} \le c_1(\lambda), \tag{7.31}$$

$$|a(x, \mu^1, u^1) - a(x, \mu^2, u^2)| + \|\nu(., \mu^1, u^1, .) - \nu(., \mu^2, u^2, .)\|_{11(X)} \le c_2(\lambda)(\|\mu^1 - \mu^2\|_{C([0,T],\mathcal{M}(X))} + \|u^1 - u^2\|). \tag{7.32}$$

Let $\nu_j$ and $n$ be uniformly bounded stochastic kernels and let the Hamiltonian $H(x, p)$ have a Lipschitz minimizer (see the discussion after equation (6.117) and Theorem 9.5.1 in the Appendix). Then for any $Y \in \mathcal{M}^+_{\le\lambda}(X, L)$ and $S_T \in C(X)$, there exists a solution $\mu_t(Y) \in \mathcal{M}^+(X, L)$, $S(t, .) \in C(X)$ to the forward-backward system (7.29). Moreover, this solution is unique for sufficiently small $T$.

Proof. By Theorem 2.7.3, the solution to the second equation of (7.29) is well defined and depends Lipschitz-continuously on $\mu$ for any continuous curve $\mu_t(Y) \in \mathcal{M}^+(X, L)$. Therefore, since $H$ has a Lipschitz minimizer, $\bar{u}(x, \{\mu_{\ge t}\}; S_T)$ depends Lipschitz-continuously on $\mu$. Hence we find ourselves in the setting of Theorem 7.3.1. □

An important observation that leads to an advanced study of the forward-backward system (7.29) is the possibility to encode the system into a single backward equation with an unknown function $G(t, x, \mu)$. This equation is referred to as the master equation and has the form

$$\begin{aligned}
&\frac{\partial G}{\partial t}(t, x, \mu) + \int (G(t, y, \mu) - G(t, x, \mu))\,n(\mu, x, dy) \\
&\quad + \Big[\sum_{j=1}^m \bar{u}_j\nu_j(x)(G(t, y_j(x), \mu) - G(t, x, \mu)) - J(x, \bar{u})\Big]\Big|_{\bar{u} = \bar{u}(x, G(t,.), \mu)} \\
&\quad + \int \frac{\delta G}{\delta\mu(z)}(t, x, \mu)\Big[\int_{w \in X}\mu(dw)\nu(w, \mu, \bar{u}, dz) - a(z, \mu, \bar{u})\mu(dz)\Big]\Big|_{\bar{u} = \bar{u}(z, G(t,.), \mu)} = 0,
\end{aligned}$$


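where $\bar{u} = \{\bar{u}_j\}(x, G(t, .), \mu)$ is the argmax of the expression

$$\sum_{j=1}^m u_j\nu_j(x)[G(t, y_j(x), \mu) - G(t, x, \mu)] - J(x, u).$$

For instance, if $X = \{1, \dots, n\}$ is finite, then this equation simplifies to

$$\begin{aligned}
&\frac{\partial G_i}{\partial t}(t, \mu) + \sum_k (G_k(t, \mu) - G_i(t, \mu))\,n_{ik}(\mu) \\
&\quad + \Big[\sum_{j=1}^m \bar{u}_j\nu_{ji}(G_j(t, \mu) - G_i(t, \mu)) - J_i(\bar{u})\Big]\Big|_{\bar{u} = \bar{u}(i, G(t,.), \mu)} \\
&\quad + \sum_l \frac{\partial G_i}{\partial\mu_l}(t, \mu)\Big[\sum_p \mu_p\nu_{pl}(\mu, \bar{u}) - a_l(\mu, \bar{u})\mu_l\Big]\Big|_{\bar{u} = \bar{u}(l, G(t,.), \mu)} = 0,
\end{aligned}$$

where $\bar{u} = \{\bar{u}_j\}(i, G(t, .), \mu)$ is the argmax of the expression

$$\sum_{j=1}^m u_j\nu_{ji}(G_j(t, \mu) - G_i(t, \mu)) - J_i(u).$$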

7.4 Kinetic equations (Boltzmann, Smoluchowski, Vlasov, Landau) and replicator dynamics

The basic examples of measure-valued evolutions as they appear in natural sciences are general kinetic equations. Some special cases of these equations, arising from a discrete state space and given by (3.63) in the weak representation, were analysed in detail in Chapter 3. In this section, we are going to formulate the analogues for a state space $X$ being an arbitrary complete metric space and point out their most well-known special cases.

Let us first introduce some notation. Denoting by $X^0$ a one-point space and by $X^j$ the powers $X \times \cdots \times X$ ($j$ times) considered with their product topologies, we denote by $\mathcal{X}$ their disjoint union $\mathcal{X} = \cup_{j=0}^\infty X^j$. In applications, $X$ specifies the state space of one particle and $\mathcal{X} = \cup_{j=0}^\infty X^j$ stands for the state space of an arbitrary number of similar particles. We denote by $C_{sym}(\mathcal{X})$ the Banach space of symmetric bounded continuous functions on $\mathcal{X}$ and by $C_{sym}(X^k)$ the corresponding spaces of functions on the finite power $X^k$. The space of symmetric (positive finite Borel) measures is denoted by $\mathcal{M}_{sym}(\mathcal{X})$. The elements of $\mathcal{M}_{sym}(\mathcal{X})$ and $C_{sym}(\mathcal{X})$ are the (mixed) states and the observables, respectively, for an evolution on $\mathcal{X}$. We denote the elements of $\mathcal{X}$ by bold letters, say $\mathbf{x}$, $\mathbf{y}$. For a finite subset $I = \{i_1, \dots, i_k\}$ of a finite set $J = \{1, \dots, n\}$, we denote by $|I|$ the number of elements in $I$, by $\bar{I}$ its complement $J \setminus I$ and by $x_I$ the collection of variables $x_{i_1}, \dots, x_{i_k}$.


Reducing the set of observables to $C_{sym}(\mathcal{X})$ effectively means that our state space is not $\mathcal{X}$ (or $X^k$) but rather the quotient space $S\mathcal{X}$ (respectively $SX^k$) obtained by factorization with respect to all permutations. This allows for the identifications $C_{sym}(\mathcal{X}) = C(S\mathcal{X})$ and $C_{sym}(X^k) = C(SX^k)$. The set of equivalence classes $S\mathcal{X}$ can be identified with the set of all finite subsets of $X$, the order being irrelevant. Each $f \in C_{sym}(\mathcal{X})$ is defined by its components (restrictions) $f^k$ on $X^k$. E.g., for $\mathbf{x} = (x_1, \dots, x_k) \in X^k \subset \mathcal{X}$, say, we can write $f(\mathbf{x}) = f(x_1, \dots, x_k) = f^k(x_1, \dots, x_k)$. Similar notations will be used for the components of measures from $\mathcal{M}(\mathcal{X})$. In particular, the pairing between $C_{sym}(\mathcal{X})$ and $\mathcal{M}(\mathcal{X})$ can be written as

$$(f, \rho) = \int f(\mathbf{x})\rho(d\mathbf{x}) = f^0\rho^0 + \sum_{n=1}^\infty \int f(x_1, \dots, x_n)\rho(dx_1 \cdots dx_n)$$

for $f \in C_{sym}(\mathcal{X})$, $\rho \in \mathcal{M}(\mathcal{X})$. A useful class of measures (and mixed states) on $\mathcal{X}$ is given by the decomposable measures of the form $Y^\otimes$, which are defined for an arbitrary finite measure $Y(dx)$ on $X$ by their components

$$(Y^\otimes)^n(dx_1 \cdots dx_n) = Y^{\otimes n}(dx_1 \cdots dx_n) = Y(dx_1) \cdots Y(dx_n).$$

Similarly, decomposable observables, multiplicative or additive, are defined for an arbitrary $Q \in C(X)$ as follows:

$$(Q^\otimes)^n(x_1, \dots, x_n) = Q^{\otimes n}(x_1, \dots, x_n) = Q(x_1) \cdots Q(x_n), \tag{7.33}$$

$$(Q^\oplus)(x_1, \dots, x_n) = Q(x_1) + \cdots + Q(x_n). \tag{7.34}$$

(Note that $Q^\oplus$ vanishes on $X^0$.) In particular, if $Q = 1$, then $Q^\oplus = 1^\oplus$ is the number of particles: $1^\oplus(x_1, \dots, x_n) = n$.

The analogues of the rates $P_\Psi^\Phi$ from the discrete setting of the equations (3.62) and (3.63) are (a) the transition kernels $P^1(x; d\mathbf{y}) = \{P^1_m(x; dy_1 \cdots dy_m)\}$ from $X$ to $S\mathcal{X}$ (they describe unilateral (or spontaneous) transformations of particles such as splitting or mutations), (b) the transition kernels $P^2(x_1, x_2; d\mathbf{y}) = \{P^2_m(x_1, x_2; dy_1 \cdots dy_m)\}$ from $SX^2$ to $S\mathcal{X}$ (they describe binary interactions such as collisions or breakage), and generally (c) the transition kernels

$$P^k(x_1, \dots, x_k; d\mathbf{y}) = \{P^k_m(x_1, \dots, x_k; dy_1 \cdots dy_m)\} \tag{7.35}$$

from $SX^k$ to $S\mathcal{X}$ (they describe simultaneous interactions of $k$ particles, i.e., $k$th-order interaction). The norms of the kernels,

$$\|P^1(x)\| = \int_{\mathcal{X}} P^1(x; d\mathbf{y}) = \sum_{m=0}^\infty \int_{X^m} P^1_m(x; dy_1 \cdots dy_m)$$

for unilateral transformations, and

$$\|P^k(x_1, \dots, x_k)\| = \int_{\mathcal{X}} P^k(x_1, \dots, x_k; d\mathbf{y}) = \sum_{m=0}^\infty \int_{X^m} P^k_m(x_1, \dots, x_k; dy_1 \cdots dy_m) \tag{7.36}$$

for $k$th-order interaction, are usually referred to as the transition intensities (or rates).

The natural analogue of the weak equation (3.63) for a continuous state space is given by the following kinetic equation of pure jump-type in the weak form:

$$\frac{d}{dt}(g, \mu_t) = \sum_{k=1}^K \frac{1}{k!} \int_{X^k} \int_{\mathcal{X}} (g^\oplus(\mathbf{y}) - g^\oplus(\mathbf{z}))P^k(\mathbf{z}; d\mathbf{y})\mu_t^{\otimes k}(d\mathbf{z}), \tag{7.37}$$

where we used a finite bound $K$ for the order of interaction, since only this case prevails in concrete applications. Loosely speaking, these equations describe the evolution of the particle concentrations $\mu_t(dz)$ in regions $(dz)$ under the transformations $\mathbf{z} \to \mathbf{y}$ with the rates $P(\mathbf{z}; d\mathbf{y})$. A systematic way for deriving these equations from the evolution of systems of interacting particles will be outlined in Section 7.8. There, it will become clear (see (7.99)) that (7.37) is a consequence of a certain scaling of the interactions of order $k$. This scaling is well established in statistical mechanics. In a biological context, however, a different scaling is more appropriate. With this other scaling, (7.37) is replaced by the following re-scaled modification:

$$\frac{d}{dt}(g, \mu_t) = \sum_{k=1}^K \frac{1}{k!\,\|\mu_t\|^{k-1}} \int_{X^k} \int_{\mathcal{X}} (g^\oplus(\mathbf{y}) - g^\oplus(\mathbf{z}))P^k(\mathbf{z}; d\mathbf{y})\mu_t^{\otimes k}(d\mathbf{z}). \tag{7.38}$$

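Moreover, in the biological context, one is usually interested in normalized (probability) measures $\nu = \mu/\|\mu\|$. Since $\|\mu\| = \int_X \mu(dx)$ for positive $\mu$, we find for positive solutions $\mu_t$ to (7.38) that

$$\frac{d}{dt}\|\mu_t\| = \sum_{k=1}^K \frac{1}{k!} \int_{X^k} Q(\mathbf{z})\Big(\frac{\mu_t}{\|\mu_t\|}\Big)^{\otimes k}(d\mathbf{z})\,\|\mu_t\|, \tag{7.39}$$

where

$$Q(\mathbf{z}) = \int_{\mathcal{X}} (1^\oplus(\mathbf{y}) - 1^\oplus(\mathbf{z}))P^k(\mathbf{z}; d\mathbf{y}). \tag{7.40}$$

Consequently, rewriting equation (7.38) in terms of the normalized measure $\nu_t = \mu_t/\|\mu_t\|$ yields

$$\frac{d}{dt}\int_X g(z)\nu_t(dz) = \sum_{k=1}^K \frac{1}{k!} \int_{X^k} \int_{\mathcal{X}} (g^\oplus(\mathbf{y}) - g^\oplus(\mathbf{z}))P^k(\mathbf{z}; d\mathbf{y})\nu_t^{\otimes k}(d\mathbf{z}) - \int_X g(z)\nu_t(dz) \sum_{k=1}^K \frac{1}{k!} \int_{X^k} Q(\mathbf{z})\nu_t^{\otimes k}(d\mathbf{z}). \tag{7.41}$$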
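The passage from (7.38) to (7.41) is the quotient rule applied to $(g, \nu_t) = (g, \mu_t)/\|\mu_t\|$. The following sketch spells this out; it assumes that the sums over $k$ are meant in (7.39) and (7.41), as in (7.38), and it uses $\mu_t^{\otimes k}/\|\mu_t\|^k = \nu_t^{\otimes k}$.

```latex
\begin{align*}
\frac{d}{dt}(g,\nu_t)
 &= \frac{1}{\|\mu_t\|}\,\frac{d}{dt}(g,\mu_t)
  - \frac{(g,\mu_t)}{\|\mu_t\|^{2}}\,\frac{d}{dt}\|\mu_t\| \\
 &= \sum_{k=1}^{K}\frac{1}{k!\,\|\mu_t\|^{k}}
    \int_{X^k}\int_{\mathcal X}
    \bigl(g^{\oplus}(\mathbf y)-g^{\oplus}(\mathbf z)\bigr)
    P^k(\mathbf z;d\mathbf y)\,\mu_t^{\otimes k}(d\mathbf z)
  - (g,\nu_t)\sum_{k=1}^{K}\frac{1}{k!}
    \int_{X^k} Q(\mathbf z)\,\nu_t^{\otimes k}(d\mathbf z) \\
 &= \sum_{k=1}^{K}\frac{1}{k!}
    \int_{X^k}\int_{\mathcal X}
    \bigl(g^{\oplus}(\mathbf y)-g^{\oplus}(\mathbf z)\bigr)
    P^k(\mathbf z;d\mathbf y)\,\nu_t^{\otimes k}(d\mathbf z)
  - (g,\nu_t)\sum_{k=1}^{K}\frac{1}{k!}
    \int_{X^k} Q(\mathbf z)\,\nu_t^{\otimes k}(d\mathbf z).
\end{align*}
```

The first line uses (7.38) for the derivative of $(g, \mu_t)$ and the formula for the derivative of $\|\mu_t\|$, which is (7.38) with $g = 1$.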


Remark 113. The re-scaling of interactions that leads to (7.38) is equivalent to a time change in (7.37). A special case of this reduction in evolutionary biology is the well-known trajectory-wise equivalence of the Lotka–Volterra model and replicator dynamics, see, e.g., [111].

Notice that equation (7.37) is a special case of the equations (7.4) and (7.5) with

$$a(x, \mu) = \sum_{k=1}^K \frac{1}{(k-1)!} \int_{\mathcal{X}} \int_{X^{k-1}} P^k(x, z_2, \dots, z_k; d\mathbf{y})\mu(dz_2) \cdots \mu(dz_k), \tag{7.42}$$

$$\nu(w, \mu, dy) = \sum_{k=1}^K \sum_{m=1}^\infty \frac{m}{k!} \int_{X^{m-1}} \int_{X^{k-1}} P^k_m(w, z_2, \dots, z_k; dy\,dy_2 \cdots dy_m)\mu(dz_2) \cdots \mu(dz_k). \tag{7.43}$$
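The combinatorial factors in (7.42) and (7.43) can be traced back to (7.37) by a symmetry argument. The following sketch assumes that each kernel $P^k_m$ is symmetric in the incoming variables $z_1, \dots, z_k$ and in the outgoing variables $y_1, \dots, y_m$, which is the reading suggested by the kernels being defined on the quotient spaces.

```latex
\begin{align*}
\frac{1}{k!}\int_{X^k} g^{\oplus}(\mathbf z)\,\|P^k(\mathbf z)\|\,
   \mu^{\otimes k}(d\mathbf z)
 &= \frac{1}{(k-1)!}\int_X g(x)
   \Bigl[\int_{X^{k-1}}\|P^k(x,z_2,\dots,z_k)\|\,
   \mu(dz_2)\cdots\mu(dz_k)\Bigr]\mu(dx), \\
\frac{1}{k!}\int_{X^k}\int_{\mathcal X} g^{\oplus}(\mathbf y)\,
   P^k(\mathbf z;d\mathbf y)\,\mu^{\otimes k}(d\mathbf z)
 &= \sum_{m=1}^{\infty}\frac{m}{k!}\int_{X^k}\int_{X^m} g(y_1)\,
   P^k_m(\mathbf z;dy_1\cdots dy_m)\,\mu^{\otimes k}(d\mathbf z).
\end{align*}
```

In the first identity, each of the $k$ summands of $g^{\oplus}(\mathbf z)$ contributes equally, which turns $1/k!$ into $1/(k-1)!$ and gives the rate $a(x, \mu)$. In the second, each of the $m$ summands of $g^{\oplus}(\mathbf y)$ contributes equally, which produces the factor $m/k!$ in $\nu(w, \mu, dy)$ after renaming $z_1 = w$ and $y_1 = y$.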

Example 1. Generalized Smoluchowski coagulation model. The classical Smoluchowski model describes the process of mass-preserving binary coagulation of particles. In a more general context, referred to as cluster coagulation (see [214]), a particle is characterized by a parameter $x$ from a locally compact state space $X$, where a mapping $E : X \to \mathbb{R}_+$, the generalized mass, and a transition kernel $P^2_1(z_1, z_2; dy) = K(z_1, z_2; dy)$, the coagulation kernel, are given such that the measures $K(z_1, z_2; .)$ are supported on the set $\{y : E(y) = E(z_1) + E(z_2)\}$. In this setting, equation (7.37) takes the form

$$\frac{d}{dt}\int_X g(z)\mu_t(dz) = \frac{1}{2}\int_{X^3} [g(y) - g(z_1) - g(z_2)]K(z_1, z_2; dy)\mu_t(dz_1)\mu_t(dz_2) \tag{7.44}$$

and is known as Smoluchowski's equation. In the classical Smoluchowski model, we have $X = \mathbb{R}_+$, $E(x) = x$ and $K(x_1, x_2; dy) = K(x_1, x_2)\delta(x_1 + x_2 - y)$ for a certain symmetric function $K(x_1, x_2)$.

Example 2. Spatially homogeneous Boltzmann collisions. This model describes the process of binary collisions that transform the velocities of two particles $(v_1, v_2) \to (w_1, w_2)$ in such a way that the total momentum and energy are conserved:

$$v_1 + v_2 = w_1 + w_2, \quad v_1^2 + v_2^2 = w_1^2 + w_2^2. \tag{7.45}$$

These equations imply that

$$w_1 = v_1 - n(v_1 - v_2, n), \quad w_2 = v_2 + n(v_1 - v_2, n), \quad n \in S^{d-1},\ (n, v_2 - v_1) \ge 0. \tag{7.46}$$
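That the collision rule (7.46) indeed conserves momentum and energy as required in (7.45) can be checked numerically. The following sketch draws random velocities and unit vectors in dimension d = 3; the function names and the sampling scheme are illustrative choices, not part of the text.

```python
import math
import random


def collide(v1, v2, n):
    """Post-collision velocities according to (7.46) for a unit vector n."""
    c = sum((a - b) * k for a, b, k in zip(v1, v2, n))    # scalar product (v1 - v2, n)
    w1 = [a - k * c for a, k in zip(v1, n)]
    w2 = [b + k * c for b, k in zip(v2, n)]
    return w1, w2


def invariants(u, v):
    """Total momentum and total squared speed of a pair of velocities."""
    momentum = [a + b for a, b in zip(u, v)]
    energy = sum(a * a for a in u) + sum(b * b for b in v)
    return momentum, energy


random.seed(0)
d = 3
max_defect = 0.0
for _ in range(1000):
    v1 = [random.gauss(0, 1) for _ in range(d)]
    v2 = [random.gauss(0, 1) for _ in range(d)]
    n = [random.gauss(0, 1) for _ in range(d)]
    norm = math.sqrt(sum(k * k for k in n))
    n = [k / norm for k in n]                              # point on the sphere S^{d-1}
    if sum(k * (b - a) for k, a, b in zip(n, v1, v2)) < 0:
        n = [-k for k in n]                                # enforce (n, v2 - v1) >= 0
    w1, w2 = collide(v1, v2, n)
    (p_in, e_in), (p_out, e_out) = invariants(v1, v2), invariants(w1, w2)
    defect = max(abs(e_in - e_out), *(abs(a - b) for a, b in zip(p_in, p_out)))
    max_defect = max(max_defect, defect)
print(max_defect)
```

The conservation laws hold for every unit vector n, so the sign condition on (n, v2 − v1) only selects one representative of each pair ±n and does not affect the check.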


If we assume that the collision rates are shift-invariant, i.e., they depend on $v_1, v_2$ only via their difference, and ignore the spatial distribution of particles, then the weak kinetic equation (7.37) for describing such collisions turns into the spatially trivial Boltzmann equation:

$$\frac{d}{dt}(g, \mu_t) = \frac{1}{2}\int_{\mathbb{R}^{2d}} \int_{n \in S^{d-1} : (n, v_2 - v_1) \ge 0} [g(w_1) + g(w_2) - g(v_1) - g(v_2)]\,B(v_2 - v_1, dn)\,\mu_t(dv_1)\mu_t(dv_2), \tag{7.47}$$

with a certain collision kernel $B(v, dn)$ that specifies a concrete physical model for the collisions. In the most common models, the kernel $B$ has a density with respect to the Lebesgue measure on $S^{d-1}$ and depends on $v$ only via its magnitude $|v|$ and the angle $\theta \in [0, \pi/2]$ between $v$ and $n$. In other words, one assumes $B(v, dn)$ to have the form $B(|v|, \theta)dn$ for a certain function $B$. By extending $B$ to the angles $\theta \in [\pi/2, \pi]$ by

$$B(|v|, \theta) = B(|v|, \pi - \theta), \tag{7.48}$$

we can write the weak form of the Boltzmann equation in the equivalent form

$$\frac{d}{dt}(g, \mu_t) = \frac{1}{4}\int_{S^{d-1}} \int_{\mathbb{R}^{2d}} [g(w_1) + g(w_2) - g(v_1) - g(v_2)]\,B(|v_1 - v_2|, \theta)\,dn\,\mu_t(dv_1)\mu_t(dv_2), \tag{7.49}$$

where $w_1, w_2$ are given by (7.46), $\theta$ is the angle between $v_2 - v_1$ and $n$, and $B$ satisfies the condition (7.48).

Example 3. Multiple coagulation, fragmentation and collision breakage. Processes that combine pure coagulation of no more than $k$ particles, spontaneous fragmentation into no more than $k$ pieces, and collisions (or collision breakages) of no more than $k$ particles are specified by the following transition kernels:

$$P^l_1(z_1, \dots, z_l; dy) = K_l(z_1, \dots, z_l; dy), \quad l = 2, \dots, k,$$

called coagulation kernels,

$$P^1_m(z; dy_1 \cdots dy_m) = F_m(z; dy_1 \cdots dy_m), \quad m = 2, \dots, k,$$

called fragmentation kernels and

$$P^l_l(z_1, \dots, z_l; dy_1 \cdots dy_l) = C_l(z_1, \dots, z_l; dy_1 \cdots dy_l), \quad l = 2, \dots, k,$$


called collision kernels. The corresponding kinetic equation (7.37) takes the form

$$\begin{aligned}
\frac{d}{dt}\int g(z)\mu_t(dz) &= \sum_{l=2}^k \frac{1}{l!}\int_{z_1, \dots, z_l, y} [g(y) - g(z_1) - \cdots - g(z_l)]\,K_l(z_1, \dots, z_l; dy)\prod_{j=1}^l \mu_t(dz_j) \\
&\quad + \sum_{m=2}^k \int_{z, y_1, \dots, y_m} [g(y_1) + \cdots + g(y_m) - g(z)]\,F_m(z; dy_1 \cdots dy_m)\mu_t(dz) \\
&\quad + \sum_{l=2}^k \int [g(y_1) + \cdots + g(y_l) - g(z_1) - \cdots - g(z_l)]\,C_l(z_1, \dots, z_l; dy_1 \cdots dy_l)\prod_{j=1}^l \mu_t(dz_j).
\end{aligned} \tag{7.50}$$

Apart from the instantaneous transformations of particles described by the equations (7.37), one can also consider other processes involving groups of particles and described by some generators $A^k$ acting in the spaces $C_{sym}(X^k)$ (given that the process involves $k$ particles). If $A = (A^1, A^2, \dots)$ denotes the collection of such operators, then the natural extension of the evolutions (7.37) is described by the equations

$$\frac{d}{dt}(g, \mu_t) = \sum_{k=1}^K \frac{1}{k!}\int_{X^k} \Big[(Ag^\oplus)(\mathbf{z}) + \int_{\mathcal{X}} (g^\oplus(\mathbf{y}) - g^\oplus(\mathbf{z}))P^k(\mathbf{z}; d\mathbf{y})\Big]\mu_t^{\otimes k}(d\mathbf{z}), \tag{7.51}$$

where the intuitive notation $(Ag^\oplus)(z_1, \dots, z_l) = A^l g^\oplus(z_1, \dots, z_l)$ was used. As a final level of extension, let us mention the possibility for the rates and evolutions $A$ to depend on the current state $\mu$ (the so-called mean-field interaction). This leads to the following general kinetic equation in the weak form:

$$\frac{d}{dt}(g, \mu_t) = \sum_{k=1}^K \frac{1}{k!}\int_{X^k} \Big[(A(\mu_t)g^\oplus)(\mathbf{z}) + \int_{\mathcal{X}} (g^\oplus(\mathbf{y}) - g^\oplus(\mathbf{z}))P^k(\mu_t, \mathbf{z}; d\mathbf{y})\Big]\mu_t^{\otimes k}(d\mathbf{z}). \tag{7.52}$$

This equation with $A = 0$ is again a special case of the equations (7.4) and (7.5), with $a, \nu$ given by (7.42), (7.43) and $P$ additionally depending on $\mu$.

Example 4. Vlasov's equation. In standard dynamics, particles are described by their positions and momenta, so that $X = \mathbb{R}^{2d}$. The Vlasov equation of plasma physics in the weak form is

$$\frac{d}{dt}(g, \mu_t) = \int_{\mathbb{R}^{2d}} \Big(\frac{\partial H}{\partial p}\frac{\partial g}{\partial x} - \frac{\partial H}{\partial x}\frac{\partial g}{\partial p}\Big)\mu_t(dx\,dp) + \int_{\mathbb{R}^{4d}} \Big(\nabla V(x_1 - x_2), \frac{\partial g}{\partial p_1}(x_1, p_1)\Big)\mu_t(dx_1\,dp_1)\mu_t(dx_2\,dp_2). \tag{7.53}$$


The function $H(x, p)$ is called the Hamiltonian, say $H = p^2/2 - U(x)$ with a given potential $U$. $V$ stands for the potential of the interaction.

Example 5. Landau–Fokker–Planck equation. This is the equation

$$\frac{d}{dt}(g, \mu_t) = \int_{\mathbb{R}^{2d}} \Big[\frac{1}{2}(G(v - w)\nabla, \nabla)g(v) + (b(v - w), \nabla g(v))\Big]\mu_t(dw)\mu_t(dv), \tag{7.54}$$

with a certain non-negative matrix-valued function $G(v)$ and a vector field $b(v)$. In the original equation that specifies the limiting regime of Boltzmann collisions as described by (7.47) when they become grazing, i.e., when $v_1$ is close to $v_2$, we have

$$G_{ij}(v) = \psi(|v|)(|v|^2\delta_{ij} - v_iv_j), \quad b_i(v) = \sum_j \frac{\partial}{\partial v_j}G_{ij}(v), \tag{7.55}$$

with some function $\psi(r)$ that reflects the details of the interaction. The special case of $\psi$ being a constant is referred to as the case of Maxwellian molecules. Increasing or decreasing the function $\psi$ specifies the cases of hard and soft potentials, respectively. A key property of $G$ from (7.55) (essentially due to Theorem 4.3.1) is the possibility to represent it as a product $G(z) = \sigma(z)\sigma^T(z)$. In fact, for $d = 2$ and $d = 3$, this representation holds with

$$\sigma(z) = \sqrt{\psi(|z|)}\begin{pmatrix} z_2 & 0 \\ -z_1 & 0 \end{pmatrix}, \quad \sigma(z) = \sqrt{\psi(|z|)}\begin{pmatrix} 0 & z_2 & -z_3 \\ z_3 & -z_1 & 0 \\ -z_2 & 0 & z_1 \end{pmatrix}, \tag{7.56}$$

respectively.

Example 6. McKean–Vlasov diffusions. Taking $A^1$ in (7.52) to be a second-order (or diffusion) operator and setting all other terms to zero, we obtain the McKean–Vlasov nonlinear diffusion that we already analysed in Chapter 6.

Example 7. Generalized replicator dynamics. The evolution

$$\frac{d}{dt}\int_X g(x)\nu_t(dx) = \frac{1}{(k-1)!}\int_X (H^\star(\nu_t\|x) - H^\star(\nu_t))g(x)\nu_t(dx), \tag{7.57}$$

represents the replicator dynamics in weak form for a symmetric $k$-person game with an arbitrary compact space of strategies. It is a special case of (7.41). $H(z_1; z_2, \dots, z_k)$ is a function that is symmetric with respect to permutations of all variables apart from the first one and which is interpreted as the costs for a player employing the strategy $z_1$ when other players employ the strategies $z_2, \dots, z_k$. Moreover, the following notations apply:

$$H_i(P) = \int_{X^k} H_i(x_1, \dots, x_k)P(dx_1 \cdots dx_k), \quad P = (p_1, \dots, p_k) \in \mathcal{P}(X_1) \times \cdots \times \mathcal{P}(X_k),$$

$$H_i(P\|x_i) = \int_{X_1 \times \cdots \times X_{i-1} \times X_{i+1} \times \cdots \times X_n} H_i(x_1, \dots, x_n)\,dp_1 \cdots dp_{i-1}\,dp_{i+1} \cdots dp_n.$$


If the $\nu_t$ have densities $f_t$ with respect to some reference probability measure $M$ on $X$, then equation (7.57) can be rewritten in terms of $f_t$ as

$$\dot{f}_t(x) = f_t(x)(H^\star(f_tM\|x) - H^\star(f_tM)), \tag{7.58}$$

which is the most established form of replicator dynamics. Even more generally, if the pairwise interaction of players does not constitute a game between themselves, but rather a process of adapting better strategies for playing against a common adversary (an interaction that is analysed in [154] and called there the pressure-and-resistance game), then the replicator dynamics (7.57) turns into the equation

$$\frac{d}{dt}\int_X g(x)\nu_t(dx) = \int_{X^2} (R(x, \mu) - R(y, \mu))g(x)\nu_t(dx)\nu_t(dy), \tag{7.59}$$

where $R$ is some function on $X \times \mathcal{M}(X)$.

Other examples include spatially nontrivial Boltzmann and Smoluchowski equations, and various extensions of the replicator dynamics to evolutionary biology and many other models.

Let us make explicit the crucial link between the dynamics $Y \to \mu_t(Y)$ and the corresponding linear dynamics in $C(\mathcal{M}(X))$ realized by the method of characteristics in the spirit of Proposition 4.1.1. According to this proposition (and Remark 61), once the well-posedness of equations (7.52) is proved, one can expect the operators $T_tS(Y) = S(\mu_t(Y))$ to form a semigroup of contractions on some space of smooth functions on $\mathcal{M}^+(X)$, with the generator

$$\Lambda_{A,P}S(Y) = \Big(\frac{\delta S(Y)}{\delta Y(.)}, \dot\mu_t(Y)|_{t=0}\Big)$$

or explicitly

$$\Lambda_{A,P}S(Y) = \sum_{k=1}^K \frac{1}{k!}\int_{X^k} \Big[\Big(A(Y)\Big(\frac{\delta S(Y)}{\delta Y(.)}\Big)^\oplus\Big)(\mathbf{z}) + \int_{\mathcal{X}} \Big(\Big(\frac{\delta S(Y)}{\delta Y(.)}\Big)^\oplus(\mathbf{y}) - \Big(\frac{\delta S(Y)}{\delta Y(.)}\Big)^\oplus(\mathbf{z})\Big)P^k(Y, \mathbf{z}; d\mathbf{y})\Big]Y^{\otimes k}(d\mathbf{z}). \tag{7.60}$$

Therefore, the function $F_t(Y) = T_tS(Y)$ solves the following first-order infinite-dimensional PDE in variational derivatives:

$$\frac{d}{dt}F_t(Y) = \sum_{k=1}^K \frac{1}{k!}\int_{X^k} \Big[\Big(A(Y)\Big(\frac{\delta F_t(Y)}{\delta Y(.)}\Big)^\oplus\Big)(\mathbf{z}) + \int_{\mathcal{X}} \Big(\Big(\frac{\delta F_t(Y)}{\delta Y(.)}\Big)^\oplus(\mathbf{y}) - \Big(\frac{\delta F_t(Y)}{\delta Y(.)}\Big)^\oplus(\mathbf{z})\Big)P^k(Y, \mathbf{z}; d\mathbf{y})\Big]Y^{\otimes k}(d\mathbf{z}). \tag{7.61}$$

In Section 7.8, we shall sketch the derivation of this equation (and hence of the corresponding kinetic equation) from the evolution of interacting particle systems.
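The factorization $G = \sigma\sigma^T$ mentioned in Example 5 is easy to test numerically. The sketch below builds $G$ from (7.55) and compares it with $\sigma\sigma^T$ for one admissible choice of $\sigma$ in dimensions 2 and 3; the concrete matrices in `sigma`, the test function `psi` and all names are assumptions made for this illustration (other square roots of $G$ exist).

```python
import math
import random


def G(z, psi):
    """Matrix G_ij(z) = psi(|z|) (|z|^2 delta_ij - z_i z_j) from (7.55)."""
    r2 = sum(c * c for c in z)
    p = psi(math.sqrt(r2))
    d = len(z)
    return [[p * ((r2 if i == j else 0.0) - z[i] * z[j]) for j in range(d)]
            for i in range(d)]


def sigma(z, psi):
    """One admissible square root of G for d = 2 or d = 3 (an assumed choice)."""
    s = math.sqrt(psi(math.sqrt(sum(c * c for c in z))))
    if len(z) == 2:
        rows = [[z[1], 0.0], [-z[0], 0.0]]
    else:
        rows = [[0.0, z[1], -z[2]], [z[2], -z[0], 0.0], [-z[1], 0.0, z[0]]]
    return [[s * a for a in row] for row in rows]


def times_transpose(a):
    """Return the product a a^T of a square matrix given as a list of rows."""
    d = len(a)
    return [[sum(a[i][k] * a[j][k] for k in range(d)) for j in range(d)]
            for i in range(d)]


def psi(r):
    """Hypothetical positive interaction profile used only for the check."""
    return 1.0 / (1.0 + r)


random.seed(1)
max_error = 0.0
for d in (2, 3):
    for _ in range(200):
        z = [random.uniform(-2.0, 2.0) for _ in range(d)]
        lhs, rhs = G(z, psi), times_transpose(sigma(z, psi))
        err = max(abs(lhs[i][j] - rhs[i][j]) for i in range(d) for j in range(d))
        max_error = max(max_error, err)
print(max_error)
```

A factorization of this kind is what allows the second-order part of (7.54) to be read as the generator of a diffusion with diffusion coefficient $\sigma$.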


Remark 114. If $A$ and $P$ do not depend explicitly on $\mu$, then the r.h.s. of (7.61) preserves the analytic functions $F$ of the type

$$F_g(Y) = \sum_{m=0}^\infty \int g^m(x_1, \dots, x_m)Y(dx_1) \cdots Y(dx_m)$$

with some $g \in C_{sym}(\mathcal{X})$. Therefore, equation (7.61) can be rewritten in terms of the coefficient functions $g^m_t$. This leads to the so-called Bogolubov chains or BBGKY hierarchies, which are another useful tool for analysing kinetic equations.
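As a concrete instance of the weak equation (7.44) from Example 1, one can take discrete masses 1, 2, ..., M and the constant kernel K = 1. The sketch below integrates the resulting system with a classical Runge-Kutta step; the truncation at mass M (pairs with total mass above M are simply not allowed to merge), the step size and all names are assumptions of this illustration. Choosing g(x) = x in (7.44) shows that the total mass is conserved, and g = 1 gives dN/dt = −N²/2 for the number of particles N, hence N(t) = 1/(1 + t/2) for N(0) = 1 in the untruncated model.

```python
def rhs(c):
    """Right-hand side for c[k-1] = concentration of clusters of mass k, kernel K = 1."""
    M = len(c)
    out = [0.0] * M
    for i in range(1, M + 1):
        for j in range(1, M + 1 - i):           # only ordered pairs with i + j <= M
            rate = 0.5 * c[i - 1] * c[j - 1]    # factor 1/2 from (7.44)
            out[i + j - 1] += rate              # term g(y): a cluster of mass i + j appears
            out[i - 1] -= rate                  # term -g(z1)
            out[j - 1] -= rate                  # term -g(z2)
    return out


def rk4_step(c, h):
    """One classical fourth-order Runge-Kutta step of size h."""
    k1 = rhs(c)
    k2 = rhs([x + 0.5 * h * k for x, k in zip(c, k1)])
    k3 = rhs([x + 0.5 * h * k for x, k in zip(c, k2)])
    k4 = rhs([x + h * k for x, k in zip(c, k3)])
    return [x + h * (a + 2.0 * b + 2.0 * e + f) / 6.0
            for x, a, b, e, f in zip(c, k1, k2, k3, k4)]


M = 40
c = [1.0] + [0.0] * (M - 1)                     # monodisperse initial condition
h, steps = 0.01, 100                            # integrate up to t = 1
for _ in range(steps):
    c = rk4_step(c, h)

number = sum(c)                                  # (1, mu_t): number of particles
mass = sum((k + 1) * x for k, x in enumerate(c)) # (E, mu_t) with E(x) = x
print(number, mass)
```

In the terminology of the next section, the mass E(x) = x is preserved by every coagulation event (the kernel is E-critical), while the number of particles decreases.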

7.5 Well-posedness for basic kinetic equations

The class of equations (7.52) unifies an immense variety of important concrete evolutions. Moreover, it provides a fast unified approach for obtaining various quantitative and qualitative features of concrete examples that had initially been developed in lengthy case-by-case studies.

In all models of interest, one can distinguish a function on $X$ that measures some key property of particles and their clusters, and that does not increase during all the transitions. For instance, this can be the mass in mass-exchange models like coagulation-fragmentation or the kinetic energy for Boltzmann collisions. This function plays the role of a Lyapunov function. In the sequel, we shall denote it by $E$ (energy), which is the well-established notation in the Boltzmann setting. The formal definition goes as follows: Let $E$ be a positive function on $X$. The transition kernel $P = P(\mathbf{x}; d\mathbf{y})$ in (7.35) is called $E$-subcritical (respectively $E$-critical), if

$$\int_{\mathcal{X}} (E^\oplus(\mathbf{y}) - E^\oplus(\mathbf{x}))P(\mathbf{x}; d\mathbf{y}) \le 0 \tag{7.62}$$

for all $\mathbf{x}$ (respectively if the equality holds). We say that $P(\mathbf{x}; d\mathbf{y})$ is $E$-preserving (respectively $E$-non-increasing) if the measure $P(\mathbf{x}; d\mathbf{y})$ is supported on the set $\{\mathbf{y} : E^\oplus(\mathbf{y}) = E^\oplus(\mathbf{x})\}$ (respectively $\{\mathbf{y} : E^\oplus(\mathbf{y}) \le E^\oplus(\mathbf{x})\}$). Clearly, if $P(\mathbf{x}; d\mathbf{y})$ is $E$-preserving (respectively $E$-non-increasing), then it is also $E$-critical (respectively $E$-subcritical). E.g., if $E = 1$, then the preservation of $E$ (subcriticality) means that the number of particles remains constant (does not increase on average) during the evolution of the process. As we shall see later, subcriticality enters practically all natural assumptions and ensures the non-explosion of the models of interaction.

The well-posedness for bounded kernels is mostly covered by Theorems 7.2.1 and 7.2.2.

Theorem 7.5.1. Let $X$ be a complete metric space and $E$ a continuous function on $X$ that is bounded from below by a constant $E_0 > 0$. In order to keep the formulae slim, we shall take $E_0 = 1$. Let $P(\mathbf{x}; .)$ be a continuous transition kernel in $S\mathcal{X}$ that is $E$-subcritical with uniformly bounded intensities,

$$\sup_{\mathbf{z} \in \mathcal{X}}\ \sup_{\mu \in \mathcal{M}(X,E)} \int_{\mathcal{X}} P(\mathbf{z}; d\mathbf{y}) = \bar{P} < \infty.$$

Then the Cauchy problem for equation (7.37) is well posed in $\mathcal{M}^+(X, E)$ and the solutions depend smoothly on the initial data.

Proof. It follows from Theorems 7.2.1 and 7.2.2 by taking into account the formulae (7.42) and (7.43), as well as the resulting formulae for the variational derivatives:

$$\frac{\delta a(x, \mu)}{\delta\mu(z)} = \sum_{k=2}^K \frac{1}{(k-2)!}\int_{\mathcal{X}} \int_{X^{k-2}} P^k(x, z, z_3, \dots, z_k; d\mathbf{y})\mu(dz_3) \cdots \mu(dz_k), \tag{7.63}$$

$$\frac{\delta\nu(w, \mu, dy)}{\delta\mu(z)} = \sum_{k=2}^K \sum_{m=1}^\infty \frac{m(k-1)}{k!}\int_{X^{m-1}} \int_{X^{k-2}} P^k_m(w, z, z_3, \dots, z_k; dy\,dy_2 \cdots dy_m)\mu(dz_3) \cdots \mu(dz_k). \tag{7.64}$$

Moreover, the following estimates apply:

$$a(x, \mu) \le \bar{P}\sum_k \frac{(E, \mu)^{k-1}}{(k-1)!} \le \bar{P}\exp\{(E, \mu)\},$$

$$\int E(y)\nu(x, \mu, dy) \le \sum_{k=1}^K \frac{1}{k!}\int_{X^{k-1}} \int_{\mathcal{X}} E^\oplus(x, z_2, \dots, z_k)P^k(x, z_2, \dots, z_k; d\mathbf{y})\mu(dz_2) \cdots \mu(dz_k) \le E(x)\bar{P}\exp\{(E, \mu)\}.$$

□

This result can be extended to equations with a mean-field dependence:
\[
\frac{d}{dt}(g, \mu_t) = \sum_{k=1}^{K} \frac{1}{k!} \int_{X^k} \int_X \bigl(g^{\oplus}(y) - g^{\oplus}(z)\bigr)\, P^k(\mu_t, z; dy)\, \mu_t^{\otimes k}(dz). \tag{7.65}
\]

Theorem 7.5.2. Let $X$ be a complete metric space and $E$ a continuous function on $X$ that is bounded from below by a positive constant. Let $P(\mu, x; .)$ be a family of continuous transition kernels in $SX$ that are $E$-subcritical with uniformly bounded intensities,
\[
\sup_{z \in X}\, \sup_{\mu \in \mathcal{M}(X,E)} \int_X P(\mu, z; dy) < \infty.
\]
Let $P(\mu, x; .)$ depend locally Lipschitz-continuously on $\mu$, so that
\[
\sup_x \|P(\xi, x; .) - P(\eta, x; .)\|_{\mathcal{M}(X,E)} \le C(\lambda) \|\xi - \eta\|_{\mathcal{M}(X,E)}
\]
for $\xi, \eta \in \mathcal{M}^+_{\le \lambda}(X, E)$ and some constants $C(\lambda)$. Then the Cauchy problem for equation (7.65) is well posed in $\mathcal{M}^+(X, E)$ and the solutions depend continuously on the initial data. If additionally $P(\mu, x; .)$ has uniformly continuous variational derivatives with respect to $\mu$ on each set $\mathcal{M}^+_{\le \lambda}(X, E)$, then the solutions depend smoothly on the initial data.

Equations with unbounded rates and kernels are particularly interesting. Two main classes of such kernels are usually discussed in the literature: Namely, the transition kernel $P$ is called multiplicatively $E$-bounded or $E^{\otimes}$-bounded (respectively additively $E$-bounded or $E^{\oplus}$-bounded) for a positive function $E$ on $X$ whenever $\|P(\mu, x; .)\| \le c E^{\otimes}(x)$ (respectively $\|P(\mu, x; .)\| \le c E^{\oplus}(x)$) for all $\mu$ and $x$ and some constant $c > 0$, where we used the notations (7.33) and (7.34). In order to keep the formulae short, we shall consider this definition to hold with $c = 1$. The next result is a variation of Theorem 7.2.3.

Theorem 7.5.3. Let $X$ be a locally compact metric space and $E$ a continuous function on $X$ that is bounded from below by a constant $E_0$ that we choose to equal 1 without loss of generality. Let $P(\mu, x, .)$ be a family of continuous transition kernels in $SX$ such that the number of particles created in one transition is uniformly bounded by some number $M$ (i.e., $m$ is bounded by $M$ in (7.35)). Let $P$ be $E$-subcritical and sub-multiplicatively $E$-bounded in the sense that
\[
\|P(\mu, z; .)\| \le \omega(z) E^{\otimes}(z), \tag{7.66}
\]
with some positive function $\omega(z)$ that is bounded by a constant $\Omega$ and tends to zero as $z \to \infty$. Let $P(\mu, x, .)$ depend locally Lipschitz-continuously on $\mu$, so that
\[
\sup_x \|P(\xi, z, .) - P(\eta, z, .)\|_{\mathcal{M}(X)} \le E^{\otimes}(z)\, C(\lambda) \|\xi - \eta\|_{\mathcal{M}(X)}
\]

for $\xi, \eta \in \mathcal{M}^+_{\le \lambda}(X, E)$ and some constants $C(\lambda)$. Then the Cauchy problem for equation (7.65) has a global solution $\mu_t \in \mathcal{M}^+(X, E)$ for any $\mu_0 \in \mathcal{M}^+(X, E)$.

Proof. The proof is essentially the same as in Theorem 7.2.3. The only difference is that one has to choose the approximations in accordance with the particular structure of the problem. Namely, one approximates the kernels $P$ by
\[
P_n(\mu, z_1, \dots, z_k; dy) = \prod_{j=1}^{k} \chi_n(z_j)\, P(\mu, z_1, \dots, z_k; dy),
\]
and takes $a_n$, $\nu_n$ as given by (7.42), (7.43) with $P_n$ instead of $P$. This choice yields the Lipschitz continuity conditions
\[
|a_n(x, \xi) - a_n(x, \eta)| \le E(n) \|\xi - \eta\|_{\mathcal{M}(X)} \sum_{k=1}^{K} \frac{1}{(k-1)!} \bigl[C(\lambda) \lambda^{k-1} + \Omega (k-1) \lambda^{k-2}\bigr], \tag{7.67}
\]
\[
\|\nu_n(x, \xi, .) - \nu_n(x, \eta, .)\|_{\mathcal{M}(X)} \le M E(n) \|\xi - \eta\|_{\mathcal{M}(X)} \sum_{k=1}^{K} \frac{1}{k!} \bigl[C(\lambda) \lambda^{k-1} + \Omega (k-1) \lambda^{k-2}\bigr] \tag{7.68}
\]
for the approximating equations, which implies the well-posedness of these approximating equations. The rest of the proof is the same as in Theorem 7.2.3.

7.6 Equations with additive bounds for rates

The methods for the unified analysis of the equations (7.52) are largely based on ideas that are similar to the ones used in the discrete case of Chapter 3: Lyapunov functions, preservation of positivity, finite-dimensional (or bounded rates) approximations, moment estimates and accretivity. In this section, we show how the last two ideas can be exploited. For a complete picture, we refer to the more specialized literature (see Section 7.10). Here, we shall only analyse a subclass of the equations (7.52) that describes binary interactions, i.e., the equations
\[
\frac{d}{dt}(g, \mu_t) = \sum_{l=1}^{2} \frac{1}{l!} \int_{X^l} \int_X \bigl(g^{\oplus}(y) - g^{\oplus}(z)\bigr)\, P(z; dy)\, \mu_t^{\otimes l}(dz), \qquad \mu_0 = Y, \tag{7.69}
\]
and its integral version
\[
(g, \mu_t) - (g, \mu) = \int_0^t ds \sum_{l=1}^{2} \frac{1}{l!} \int_{X^l} \int_X \bigl(g^{\oplus}(y) - g^{\oplus}(z)\bigr)\, P(z; dy)\, \mu_s^{\otimes l}(dz), \tag{7.70}
\]
where we assume that the $P(z; dy)$ are transition kernels in $X \cup SX^2$. Therefore, only two particles take part in any instantaneous act of interaction and only two particles can be created as the result of this interaction. Already this system shows the advantage of a concise notation for functions and measures on $X$ used in (7.52). In a more detailed description, equation (7.69) can be written as
\[
\begin{aligned}
\frac{d}{dt}(g, \mu_t) = {} & \int_X \Bigl[ \int_{X^2} \bigl(g(y_1) + g(y_2) - g(z)\bigr)\, P^1_2(z; dy_1 dy_2) + \int_X \bigl(g(y) - g(z)\bigr)\, P^1_1(z; dy) \Bigr] \mu_t(dz) \\
& + \frac{1}{2} \int_{X^2} \int_{X^2} \bigl(g(y_1) + g(y_2) - g(z_1) - g(z_2)\bigr)\, P^2(z_1, z_2; dy_1, dy_2)\, \mu_t(dz_1) \mu_t(dz_2) \\
& + \frac{1}{2} \int_{X^2} \int_X \bigl(g(y) - g(z_1) - g(z_2)\bigr)\, P^2(z_1, z_2; dy)\, \mu_t(dz_1) \mu_t(dz_2), \qquad \mu_0 = Y.
\end{aligned} \tag{7.71}
\]
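As an illustration of the binary equation (7.69), the following minimal sketch treats pure coagulation on a finite grid of masses, with $E(x) = x$ playing the role of the mass. Everything in it is an assumption made for illustration and is not taken from the text: the additive rate `additive_kernel`, the cut-off that forbids mergers beyond the largest tracked mass, and the explicit Euler time stepping. It only shows that a kernel preserving $E$ keeps $(E, \mu_t)$ constant while the number of particles $(1, \mu_t)$ decreases.

```python
def additive_kernel(x, y):
    """Hypothetical coagulation rate of two clusters with masses x and y."""
    return float(x + y)


def rhs(m, kernel):
    """Right-hand side of the binary kinetic equation for pure coagulation.

    m[i] is the weight of the measure at mass i + 1. A merger (x, y) -> x + y
    is switched off when x + y exceeds the largest tracked mass (a cut-off).
    """
    n = len(m)
    dm = [0.0] * n
    for i in range(n):
        for j in range(n):
            merged = (i + 1) + (j + 1)
            if merged > n:
                continue
            # Weak form: (1/2) (g(x + y) - g(x) - g(y)) K(x, y) mu(dx) mu(dy)
            flux = 0.5 * kernel(i + 1, j + 1) * m[i] * m[j]
            dm[merged - 1] += flux
            dm[i] -= flux
            dm[j] -= flux
    return dm


def evolve(m0, kernel, t_end, steps):
    """Explicit Euler scheme for d(mu_t)/dt = rhs(mu_t)."""
    dt = t_end / steps
    m = list(m0)
    for _ in range(steps):
        dm = rhs(m, kernel)
        m = [mi + dt * di for mi, di in zip(m, dm)]
    return m


def moment(m, g):
    """Pairing (g, mu) for the discrete measure mu with weights m."""
    return sum(g(i + 1) * mi for i, mi in enumerate(m))


if __name__ == "__main__":
    start = [1.0] + [0.0] * 15  # unit weight on mass 1
    end = evolve(start, additive_kernel, t_end=1.0, steps=500)
    print("mass  :", moment(start, float), "->", moment(end, float))
    print("number:", moment(start, lambda x: 1.0), "->", moment(end, lambda x: 1.0))
```

The time step has to be small enough for the Euler scheme to keep the weights non-negative; the sketch makes no attempt to choose it adaptively.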


Equations of this type contain spatially homogeneous evolutions of the Boltzmann and Smoluchowski type (Examples 1 and 2 above). We shall first prove the existence of the solutions and the moment estimates. Afterwards, we shall consider uniqueness and the continuous dependence on the initial data.

Theorem 7.6.1. Let $X$ be a locally compact metric space and $P(x; .)$ be a continuous transition kernel from $X \cup SX^2$ to itself such that $P(x; .)$ is $E$-non-increasing and $(1 + E)^{\oplus}$-bounded for some continuous non-negative function $E$ on $X$ with $E(x) \to \infty$ as $x \to \infty$. Suppose that $\int (1 + E^{\beta})(x)\, \mu(dx) < \infty$ for the initial condition $\mu$ with some $\beta > 1$. Then there exists a global non-negative solution (in the topology of $\mathcal{M}(X)$) to (7.69), which does not increase $E$, i.e., with $(E, \mu_t) \le (E, Y)$, $t \ge 0$, such that for an arbitrary $T$,
\[
\sup_{t \in [0,T]} \int (1 + E^{\beta})(x)\, \mu_t(dx) \le C(T, \beta, (1 + E, Y))\, (1 + E^{\beta}, Y) \tag{7.72}
\]
with some constant $C(T, \beta, (1 + E, Y))$.

Proof. Let us first approximate the transition kernel $P$ by the cut-off kernels $P_n$ defined by the equation
\[
\int g(y)\, P_n(z; dy) = \chi_n(E^{\oplus}(z)) \int g(y)\, \chi_n(E^{\oplus}(y))\, P(z; dy) \tag{7.73}
\]
for arbitrary $g$, where $\chi_n$ is the same as in Theorem 7.2.3. Then $P_n$ has the same properties as $P$, but is bounded at the same time. Therefore, the solutions $\mu^n_t$ to the corresponding kinetic equations with initial condition $Y$ exist by Proposition 7.5.2. Since the evolution defined by $P_n$ does not change measures outside the compact region $\{y : E(y) \le n\}$, it follows that if $\int (1 + E^{\beta})(x)\, Y(dx) < \infty$, then the same holds for $\mu^n_t$ for all $t$. Our aim now is to obtain a bound for this quantity that is independent of $n$. For that purpose, let us denote by $F_g$ the linear functional on measures $F_g(\mu) = (g, \mu)$, and by $\Lambda F_g(\mu_t)$ the r.h.s. of equation (7.69). Due to the structure of $\Lambda F_g(\mu_t)$ and the assumptions on $P$, we find
\[
\Lambda F_1(\mu) \le F_{1+E}(\mu), \qquad \Lambda F_E(\mu) \le 0,
\]
which by Gronwall's lemma implies
\[
\sup_{t \in [0,T]} \int (1 + E)(x)\, \mu^n_t(dx) \le e^{T} (1 + E, Y). \tag{7.74}
\]

This implies that the sequence of curves $\mu^n_t$ in $C([0, T], \mathcal{M}(X))$ is relatively compact by the Ascoli theorem. Therefore, one can extract a subsequence (again denoted by $\mu^n_t$) that converges to some curve $\mu_. \in C([0, T], \mathcal{M}(X))$ which also satisfies the growth condition (7.74). Next, for any $y$ in the support of $P(x, .)$, we have
\[
(E^{\beta})^{\oplus}(y) \le (E^{\oplus}(y))^{\beta} \le (E^{\oplus}(x))^{\beta},
\]
since $P$ is $E$-non-increasing and the function $z \mapsto z^{\beta}$ is convex. Consequently, one has
\[
\begin{aligned}
\Lambda F_{E^{\beta}}(\mu) & = \sum_{l=1}^{2} \frac{1}{l!} \int_{X^l} \int \bigl[(E^{\beta})^{\oplus}(y) - (E^{\beta})^{\oplus}(x)\bigr]\, P(x; dy)\, \mu^{\otimes l}(dx) \\
& \le \frac{1}{2} \int_{X^2} \int \bigl[(E(x_1) + E(x_2))^{\beta} - E^{\beta}(x_1) - E^{\beta}(x_2)\bigr]\, P(x; dy)\, \mu(dx_1) \mu(dx_2).
\end{aligned}
\]
Using the symmetry with respect to permutations of $x_1, x_2$ and the assumption that $P$ is $(1 + E)^{\oplus}$-bounded, one deduces that this expression does not exceed
\[
\int_{X^2} \bigl[(E(x_1) + E(x_2))^{\beta} - E^{\beta}(x_1) - E^{\beta}(x_2)\bigr] (1 + E(x_1))\, \mu(dx_1) \mu(dx_2).
\]
Using the inequalities (3.95) with $a = E(x_1)$, $b = E(x_2)$ yields
\[
(E(x_1) + E(x_2))^{\beta} - E^{\beta}(x_1) - E^{\beta}(x_2) \le 2^{\beta} \bigl[E(x_1) E^{\beta-1}(x_2) + E(x_2) E^{\beta-1}(x_1)\bigr] \le 2^{\beta+1} \bigl[E^{\beta}(x_1) + E^{\beta}(x_2)\bigr],
\]
and
\[
\bigl[(E(x_1) + E(x_2))^{\beta} - E^{\beta}(x_1)\bigr] E(x_1) \le \beta 2^{\beta} \bigl[E(x_2)^{\beta} E(x_1) + E(x_1)^{\beta} E(x_2)\bigr].
\]
Again by the symmetry, this implies
\[
\Lambda F_{E^{\beta}}(\mu) \le 2^{\beta+1} (2 + \beta) \int_{X^2} E^{\beta}(x_1) (1 + E(x_2))\, \mu(dx_1) \mu(dx_2). \tag{7.75}
\]
By (7.74), it follows that
\[
\Lambda F_{E^{\beta}}(\mu) \le e^{T} (1 + E, \mu)\, 2^{\beta+1} (2 + \beta)\, (E^{\beta}, \mu).
\]
The same estimates hold for the transitions $P_n$ instead of $P$. Consequently, Gronwall's lemma implies for an arbitrary $T$ that
\[
F_{E^{\beta}}(\mu^n_t) = (E^{\beta}, \mu^n_t) < C(T, \beta, (1 + E, Y))\, (E^{\beta}, Y)
\]
with some constant $C(T, \beta, (1 + E, Y))$ for all $t \in [0, T]$ and all $n$. This implies that the limiting curve $\mu_t$ satisfies the estimate (7.72). It remains to show that $\mu_t$ satisfies (7.70) by passing to the limit in the corresponding equations for $\mu^n_t$. This is done as in the proof of Theorem 7.2.3: all integrals outside the domain $\{y : E(y) < K\}$ can be made arbitrarily small by choosing a large $K$ (because of (7.72)); and inside this domain, the result follows from the convergence of $\mu^n_t$ to $\mu_t$.
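The first chain of elementary inequalities used in the proof above can be checked numerically on a grid. The sketch below is only a sanity check under assumptions chosen for illustration (the sample values of $a$, $b$, $\beta$ and the helper names `bounds` and `holds_on_grid` are hypothetical); it does not replace the reference to (3.95).

```python
def bounds(a, b, beta):
    """Return the three quantities compared in the chain of inequalities."""
    lhs = (a + b) ** beta - a ** beta - b ** beta
    mid = 2 ** beta * (a * b ** (beta - 1) + b * a ** (beta - 1))
    rhs = 2 ** (beta + 1) * (a ** beta + b ** beta)
    return lhs, mid, rhs


def holds_on_grid(values, betas, slack=1e-12):
    """Check lhs <= mid <= rhs for all sampled a, b and beta."""
    for beta in betas:
        for a in values:
            for b in values:
                lhs, mid, rhs = bounds(a, b, beta)
                # A small relative slack absorbs floating-point rounding.
                if lhs > mid * (1 + slack) or mid > rhs * (1 + slack):
                    return False
    return True


if __name__ == "__main__":
    sample = [0.01, 0.1, 0.5, 1.0, 2.0, 10.0, 100.0]
    print(holds_on_grid(sample, [1.0, 1.5, 2.0, 3.0, 5.0]))
```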


Theorem 7.6.2. Suppose that the assumptions of Theorem 7.6.1 hold and that $P$ is $(1 + E^{\alpha})^{\oplus}$-bounded for some $\alpha \in [0, 1]$ such that $\beta \ge \alpha + 1$. Then there exists a unique non-negative solution $\mu_t$ to (7.69) satisfying (7.72) and a given initial condition $Y \in \mathcal{M}^+(X, 1 + E^{\beta})$. Moreover, the mapping $Y \mapsto \mu_t$ is Lipschitz-continuous in the norm of $\mathcal{M}_{1+E}(X)$, i.e., for any two solutions $\mu_t$ and $\nu_t$ to (7.69) satisfying (7.72) with the initial conditions $Y_{\mu}$ and $Y_{\nu}$, one has
\[
\int (1 + E)(x)\, |\mu_t - \nu_t|(dx) \le C \int (1 + E)(x)\, |Y_{\mu} - Y_{\nu}|(dx) \tag{7.76}
\]
for some constant $C$ uniformly for all $t \in [0, T]$.

Proof. Given the previous results, we only need to prove (7.76). For that purpose, let us apply Proposition 1.4.5 to the measure-valued curve $(1 + E)(x)(\mu_t - \nu_t)(dx)$. Let $f_t$ denote a version of the density of $\mu_t - \nu_t$ with respect to $|\mu_t - \nu_t|$. Then we get
\[
\begin{aligned}
\int (1 + E)(x)\, |\mu_t - \nu_t|(dx) & = \|(1 + E)(\mu_t - \nu_t)\| \\
& = \int (1 + E)(x)\, |Y_{\mu} - Y_{\nu}|(dx) + \int_0^t ds \int_X f_s(x) (1 + E)(x)\, (\dot{\mu}_s - \dot{\nu}_s)(dx).
\end{aligned}
\]
By (7.69), the last integral in this expression equals
\[
\begin{aligned}
& \int_0^t ds \int\!\!\int \bigl\{[f_s(1 + E)]^{\oplus}(y) - [f_s(1 + E)](z)\bigr\}\, P^1(z; dy)\, (\mu_s(dz) - \nu_s(dz)) \\
& + \int_0^t ds \int\!\!\int \bigl\{[f_s(1 + E)]^{\oplus}(y) - [f_s(1 + E)]^{\oplus}(z)\bigr\}\, P^2(z; dy)\, (\mu_s(dz_1) \mu_s(dz_2) - \nu_s(dz_1) \nu_s(dz_2)).
\end{aligned} \tag{7.77}
\]

The second term on the r.h.s. equals
\[
\int_0^t ds \int\!\!\int \bigl\{[f_s(1 + E)]^{\oplus}(y) - [f_s(1 + E)]^{\oplus}(z)\bigr\}\, P(z; dy)\, \bigl[(\mu_s - \nu_s)(dz_1) \mu_s(dz_2) + \nu_s(dz_1) (\mu_s - \nu_s)(dz_2)\bigr]. \tag{7.78}
\]
Let us now estimate the integral arising from the first term in the last square bracket. (Note that the second term can be dealt with analogously.) We have
\[
\begin{aligned}
& \int\!\!\int \bigl\{[f_s(1 + E)]^{\oplus}(y) - [f_s(1 + E)]^{\oplus}(z)\bigr\}\, P^2(z; dy)\, (\mu_s - \nu_s)(dz_1) \mu_s(dz_2) \\
& = \int\!\!\int \bigl\{[f_s(1 + E)]^{\oplus}(y) - [f_s(1 + E)]^{\oplus}(z)\bigr\}\, P^2(z; dy)\, f_s(z_1)\, |\mu_s - \nu_s|(dz_1) \mu_s(dz_2).
\end{aligned} \tag{7.79}
\]
Since $E$ is non-increasing by $P(z; dy)$, we find
\[
\begin{aligned}
\bigl\{[f_s(1 + E)]^{\oplus}(y) - [f_s(1 + E)]^{\oplus}(z)\bigr\} f_s(z_1) & \le (1 + E)^{\oplus}(y) - f_s(z_1) [f_s(1 + E)]^{\oplus}(z) \\
& \le 4 + E^{\oplus}(z) - E(z_1) - f_s(z_1) f_s(z_2) E(z_2) \le 4 + 2 E(z_2).
\end{aligned}
\]
Therefore, (7.79) does not exceed
\[
\int\!\!\int (4 + 2 E(z_2)) (1 + E^{\alpha}(z_1) + E^{\alpha}(z_2))\, |\mu_s - \nu_s|(dz_1) \mu_s(dz_2).
\]
Consequently, since $1 + \alpha \le \beta$ and $\alpha \le 1$, the integral (7.78) does not exceed
\[
C(T, \beta, (1 + E, Y_{\mu} + Y_{\nu}))\, (1 + E^{\beta}, Y_{\mu} + Y_{\nu}) \int_0^t ds \int (1 + E)(x)\, |\mu_s - \nu_s|(dx)
\]
with some constant $C$. A similar procedure with the first term in (7.77) shows that it is non-negative. Consequently, (7.76) follows by Gronwall's lemma.

Remark 115. As in the discrete setting of Chapter 3, the above proof of the uniqueness is effectively the proof of the accretivity of the corresponding kinetic equation with respect to the weighted norm of $\mathcal{M}(X, L)$.

7.7 On the sensitivity of kinetic equations

After the brief but systematic exposition of the previous sections, let us now sketch some directions of further developments. In this section, we give some comments on how to deal with sensitivity. Thereby, the sensitivity for kinetic equations with additively bounded kernels is derived from the sensitivity of the approximated model with cut-off rates, like in the framework of Theorem 7.6.2. To begin with, one differentiates equation (7.69) with respect to the initial data and obtains the following linear equation for the derivative $\xi_t = D\mu_t(Y)[\xi]$:
\[
\begin{aligned}
\frac{d}{dt}(g, \xi_t) = {} & \int_X \int_X \bigl(g^{\oplus}(y) - g(z)\bigr)\, P^1(z; dy)\, \xi_t(dz) \\
& + \int_{X^2} \int_X \bigl(g^{\oplus}(y) - g(x) - g(z)\bigr)\, P^2(x, z; dy)\, \bigl[\xi_t(dz) \mu_t(dx) + \xi_t(dx) \mu_t(dz)\bigr].
\end{aligned} \tag{7.80}
\]
The dual backward equation on functions reads
\[
\begin{aligned}
\frac{d}{dt} g_t(x) = {} & - \int_X \bigl(g_t^{\oplus}(y) - g_t(x)\bigr)\, P^1(x; dy) - \int_X \int_X \bigl(g_t^{\oplus}(y) - g_t(x)\bigr)\, P^2(x, z; dy)\, \mu_t(dz) \\
& + \int_X \int_X g_t(z)\, P^2(x, z; dy)\, \mu_t(dz).
\end{aligned} \tag{7.81}
\]


Note that we wrote the last term separately in order to highlight that this term destroys the conditional positivity of the evolution. Therefore, the propagator for the evolution (7.81) is built in two steps: (1) for the corresponding equation without the last term, where the methods of positivity-preserving evolutions can be used; (2) by dealing with the last term based on perturbation theory. The systematic treatment reveals an important additional complication: solutions to the linearized equations (7.80) describing the evolution of the derivatives with respect to the initial data usually belong to other weighted spaces (i.e., have stronger growth at infinity) than the solutions to the kinetic equations. We refer to [147] for the full story. At this point, we only mention the remarkable observation that the pair of equations (7.69) and (7.81) (the kinetic equation and the dual equation to its linearized evolution) can be written in the infinite-dimensional Hamiltonian form with the help of partial Fréchet derivatives:
\[
\frac{d}{dt} \mu_t = D_2 H(\mu_t, g_t), \qquad \frac{d}{dt} g_t = -D_1 H(\mu_t, g_t) \tag{7.82}
\]
(the first equation being the strong form of the weak equation (7.69)), where $D_1$, $D_2$ denote the derivatives with respect to the first or second variable, respectively, and where the Hamiltonian function is
\[
H(\mu, g) = (A_{\mu} g, \mu), \tag{7.83}
\]
if written in the general form for all kinetic equations, or
\[
H(\mu, g) = \sum_{l=1}^{2} \frac{1}{l!} \int_{X^l} \int_X \bigl(g^{\oplus}(y) - g^{\oplus}(z)\bigr)\, P(z; dy)\, \mu^{\otimes l}(dz), \tag{7.84}
\]
if written for the considered special case. In its weak form and using variational derivatives, the Hamiltonian system (7.82) can be rewritten as
\[
\frac{d}{dt}(f, \mu_t) = D_2 H(\mu_t, g_t)[f], \qquad \frac{d}{dt} g_t(.) = -\frac{\delta H(\mu_t, g_t)}{\delta \mu_t(.)}. \tag{7.85}
\]

7.8 On the derivation of kinetic equations: second quantization and beyond

In this section, we shall outline a general scheme for the derivation of kinetic equations from the evolution of systems of interacting particles, referring to [147] for a detailed, rigorous derivation based on this scheme. In the sequel, we shall use the notation from Section 7.4.

Let us start with spontaneous transitions under mean-field interaction. Let $A$ be an arbitrary ΨDO in $\mathbb{R}^d$ describing the evolution of a state (a particle) in $X = \mathbb{R}^d$: $\dot{f}_t(x) = A f_t(x)$. This equation is interpreted as describing the evolution of observables $f$ (some measurable quantities) that depend on a (possibly random) state $x$. E.g., the first-order equation $\dot{f}_t = a(x) \frac{\partial f_t}{\partial x}$ describes the evolution $f_t(x) = f(X_t(x))$ of the quantity $f$ depending on the position of a particle that moves deterministically according to the equation $\dot{x} = a(x)$, $X_t(x)$ being its solution starting at $x$ at time zero. A second-order equation stands for diffusion processes. Equations with the integral generator $\dot{f}(x) = \int (f(y) - f(x))\, \nu(dy)$ describe processes of random jumps occurring with the rate $\nu(y)$. In the quantum setting, various self-adjoint operators $A$ can be used.

To any operator $A^1$ acting on the functions on $X$, there corresponds an operator $\hat{A}^1$ acting on the space of functions on $SX$ that describes the evolution of an arbitrary number of particles, each one developing according to $A^1$, independently from each other. Therefore, $\hat{A}^1$ acts as
\[
\hat{A}^1 f(x_1, \dots, x_k) = \sum_j A^1_j f(x_1, \dots, x_k), \tag{7.86}
\]

where $A^1_j$ denotes the operator $A^1$ acting on $f$ seen as a function of $x_j$. In quantum physics, the operator $\hat{A}^1$ is referred to as the second quantization of $A^1$. However, in quantum physics the emphasis is on operators that act in $L^2(\mathbb{R}^d)$ (therefore, the operator (7.86) acts in $L^2(\mathbb{R}^{dk})$). In this section, we will deal with classical particles, where the more appropriate functional space is $C(X)$ or better $C_{\infty}(X)$.

One says that an operator $A^1$ in $C(X)$ is subject to a mean-field interaction, if instead of one operator we are given a family of operators $A^1(\mu)$ depending on $\mu \in \mathcal{M}^+(X)$ as a parameter. The corresponding evolution $\dot{f}_t = \hat{A}^1 f_t$ in $C(SX)$ given by (7.86) is then modified by letting $A^1$ depend on the 'environment' via the empirical distribution $\mu = \delta_x / k = (\delta_{x_1} + \cdots + \delta_{x_k}) / k \in \mathcal{P}(X)$, where we introduced the notation $\delta_x = \delta_{x_1} + \cdots + \delta_{x_k}$ for a point $x = (x_1, \dots, x_k)$ in $SX$. Therefore, the mean-field interacting particle system described by the family of operators $A^1(\mu)$ in $C(X)$ is given by the equation
\[
\dot{f}(x_1, \dots, x_k) = \hat{A}^1(\delta_x / k) f(x_1, \dots, x_k) = \sum_j A^1_j(\delta_x / k) f(x_1, \dots, x_k), \tag{7.87}
\]
or, in a more concise form,
\[
\dot{f}(x) = \hat{A}^1(\mu) f(x) \big|_{\mu = \delta_x / k}, \qquad x \in SX^k. \tag{7.88}
\]
As we saw earlier already, the inclusion of $SX$ into $\mathcal{M}(X)$ given by the scaled transformation
\[
x = (x_1, \dots, x_l) \mapsto h(\delta_{x_1} + \cdots + \delta_{x_l}) = h \delta_x \tag{7.89}
\]

plays a key role in the theory of measure-valued limits of interacting particle systems. This transformation defines a bijection between $SX$ and the set $\mathcal{M}^+_{\delta}(X)$ of finite linear combinations of Dirac $\delta$-measures with natural coefficients. This bijection can be used for equipping $SX$ with the structure of a metric space by pulling back any distance on $\mathcal{M}(X)$ that is compatible with its weak topology.

In the above-considered situation when the total number of particles is preserved by the evolution, one naturally chooses $h = 1/k$, where $k$ is the number of particles. But in other situations, $h$ is best fixed as the inverse $1/N_0$ of the number of particles in the initial state. The final objective is to pass to the limit $h \to 0$, $k \to \infty$ in such a way that $\mu = h \delta_x$ has a finite limit in $\mathcal{M}^+(X)$. In order to be able to find such a limit, one has to transfer the action of $\hat{A}^1$ from the functions on $SX$ to the functionals $F(h \delta_x)$ on $\mathcal{M}^+(X)$. For this transformation, the formulae for the differentiation of these functionals play a crucial role. In order to properly describe these formulae, we need the notations from Section 1.13 and their extensions specifying the regularity of the variational derivatives $\frac{\delta F(Y)}{\delta Y(x)}$ with respect to $x$. Namely, let us define the space $C^{1,k}_{\mathrm{weak}}(\mathcal{M}^+_{\le \lambda}(X))$ as the subspace of $C^{1}_{\mathrm{weak}}(\mathcal{M}^+_{\le \lambda}(X))$ consisting of functionals $F(\mu)$ such that $\frac{\delta F(Y)}{\delta Y(.)} \in C^k(X)$ uniformly for $Y \in \mathcal{M}^+_{\le \lambda}(X)$. This space is Banach with the norm
\[
\|F\|_{C^{1,k}_{\mathrm{weak}}(\mathcal{M}^+_{\le \lambda}(X))} = \sup_{Y \in \mathcal{M}^+_{\le \lambda}(X)} \left\| \frac{\delta F(Y)}{\delta Y(.)} \right\|_{C^k(X)}. \tag{7.90}
\]
Similarly, one can introduce the space $C^{n,k}_{\mathrm{weak}}(\mathcal{M}^+_{\le \lambda}(X))$ of functionals having variational derivatives of order up to $k$, which are continuously differentiable of order up to $n$ with respect to the parameters entering these variational derivatives. It turns out, however, that for the analysis of particle systems the key role is played by the following space that may seem rather artificial at first sight: Let $C^{2,k \times k}_{\mathrm{weak}}(\mathcal{M}^+_{\le \lambda}(\mathbb{R}^d))$ denote the subspace of $C^{1,1}_{\mathrm{weak}}(\mathcal{M}^+_{\le \lambda}(\mathbb{R}^d))$ consisting of functionals $F(\mu)$ such that $\frac{\delta^2 F(Y)}{\delta Y(x) \delta Y(z)}$ exists for all $x, z$ and belongs to $C^{k \times k}(\mathbb{R}^{2d})$ (see the definition of this space in Section 1.1).

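Lemma 7.8.1.
(i) If $F \in C^{1,1}_{\mathrm{weak}}(\mathcal{M}(\mathbb{R}^d))$, then
\[
\frac{\partial}{\partial x_i} F(h \delta_x) = h \frac{\partial}{\partial x_i} \frac{\delta F(Y)}{\delta Y(x_i)} \bigg|_{Y = h \delta_x}. \tag{7.91}
\]
(ii) If $F \in C^{1,2}_{\mathrm{weak}}(\mathcal{M}(\mathbb{R}^d)) \cap C^{2,1 \times 1}_{\mathrm{weak}}(\mathcal{M}^+_{\le \lambda}(\mathbb{R}^d))$, then
\[
\frac{\partial^2}{\partial x_i^2} F(h \delta_x) = h \frac{\partial^2}{\partial x_i^2} \frac{\delta F(Y)}{\delta Y(x_i)} \bigg|_{Y = h \delta_x} + h^2 \frac{\partial^2}{\partial y \partial z} \frac{\delta^2 F(Y)}{\delta Y(y) \delta Y(z)} \bigg|_{Y = h \delta_x,\ y = z = x_i}, \tag{7.92}
\]
\[
\frac{\partial^2}{\partial x_i \partial x_j} F(h \delta_x) = h^2 \frac{\partial^2}{\partial x_i \partial x_j} \frac{\delta^2 F(Y)}{\delta Y(x_i) \delta Y(x_j)} \bigg|_{Y = h \delta_x}, \qquad i \ne j. \tag{7.93}
\]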

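Proof. Let us prove only (7.91) and leave the other formulae as an exercise. In fact, this is a consequence of Theorem 1.13.1, because the use of (1.202) leads to
\[
\begin{aligned}
\frac{\partial}{\partial x_i} F(h \delta_x) & = \lim_{\epsilon \to 0} \frac{1}{\epsilon} \bigl[F(h \delta_x + h \delta_{x_i + \epsilon} - h \delta_{x_i}) - F(h \delta_x)\bigr] \\
& = \lim_{\epsilon \to 0} \frac{h}{\epsilon} \int_0^1 \left( \frac{\delta F(Y)}{\delta Y(.)} \bigg|_{Y = h \delta_x + h s (\delta_{x_i + \epsilon} - \delta_{x_i})},\ \delta_{x_i + \epsilon} - \delta_{x_i} \right) ds \\
& = \lim_{\epsilon \to 0} \frac{h}{\epsilon} \int_0^1 \left[ \frac{\delta F(h \delta_x + h s (\delta_{x_i + \epsilon} - \delta_{x_i}))}{\delta Y(x_i + \epsilon)} - \frac{\delta F(h \delta_x + h s (\delta_{x_i + \epsilon} - \delta_{x_i}))}{\delta Y(x_i)} \right] ds,
\end{aligned}
\]
which implies (7.91).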
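Formula (7.91) can be tested on a concrete functional. The sketch below assumes, purely for illustration, the quadratic functional $F(Y) = (g, Y)^2$ with $g = \sin$ on the real line, whose variational derivative is $2 (g, Y) g(x)$; the function names are hypothetical. It compares a central finite difference of $F(h \delta_x)$ in the coordinate $x_i$ with the right-hand side of (7.91).

```python
import math


def g(x):
    """Test function entering the functional F(Y) = (g, Y)^2."""
    return math.sin(x)


def dg(x):
    """Derivative of g."""
    return math.cos(x)


def F_on_particles(xs, h):
    """F(h * delta_x) for the quadratic functional F(Y) = (g, Y)^2."""
    return (h * sum(g(x) for x in xs)) ** 2


def derivative_by_differences(xs, h, i, eps=1e-6):
    """Central finite difference of F(h * delta_x) in the coordinate x_i."""
    up = list(xs)
    up[i] += eps
    down = list(xs)
    down[i] -= eps
    return (F_on_particles(up, h) - F_on_particles(down, h)) / (2 * eps)


def derivative_by_formula(xs, h, i):
    """Right-hand side of (7.91) with Y = h * delta_x held fixed.

    For F(Y) = (g, Y)^2 the variational derivative is 2 (g, Y) g(x),
    so its derivative in x at x = x_i is 2 (g, Y) g'(x_i).
    """
    pairing = h * sum(g(x) for x in xs)
    return h * 2.0 * pairing * dg(xs[i])


if __name__ == "__main__":
    points = [0.3, 1.1, -0.7, 2.0]
    for i in range(len(points)):
        print(derivative_by_differences(points, 0.25, i),
              derivative_by_formula(points, 0.25, i))
```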

Lemma 7.8.2. Let $A^1(\mu)$ be a differential operator with coefficients that depend on $\mu$, such that the symbol of $A^1$, $A^1(\mu, x, p)$, is a polynomial in $p$ with a vanishing free term: $A^1(\mu, x, 0) = 0$. Then, if $h \delta_x$ tends to a measure $\mu \in \mathcal{M}^+(X)$ as $h \to 0$ (of course, $x$ also changes with $h$), we have
\[
\lim_{h \to 0} \hat{A}^1(\mu) F(h \delta_x) = \int_X \left( A^1(\mu) \frac{\delta F(\mu)}{\delta \mu(.)} \right)(x)\, \mu(dx). \tag{7.94}
\]
Proof. In fact, (7.91) implies that the highest-order term (in small $h$) of the derivatives of $F(h \delta_x)$ with respect to $x_i$ coincides with the derivatives of the function $\frac{\delta F(Y)}{\delta Y(x_i)}$, i.e., in the highest order we have
\[
\hat{A}^1(\mu) F(h \delta_x) = h \sum_i A^1_i(\mu) \frac{\delta F(Y)}{\delta Y(x_i)} = \left( A^1(\mu) \frac{\delta F(\mu)}{\delta \mu(.)},\ h \delta_x \right).
\]
This yields (7.94).

Formula (7.94) extends to ΨDOs with symbols $A^1(\mu, x, p)$ such that $A^1(\mu, x, 0) = 0$. In particular, it holds for the integral operators $A^1 f(x) = \int (f(y) - f(x))\, \nu(\mu, x, dy)$.

Exercise 7.8.1. As an instructive exercise, give an independent proof of (7.94) for such integral operators.

As a consequence of Lemma 7.8.2, the generator $\hat{A}^1(\mu) F(h \delta_x)$ acting on $C(SX)$ tends to the operator on the r.h.s. of (7.94) acting in $C(\mathcal{M}(X))$ as $h \to 0$. This, however, is a particular instance of the operator (7.60) with $k = 1$ and vanishing $P$, and the evolution (7.88) on $C(SX)$ tends to the evolution
\[
\dot{F}_t(\mu) = \Lambda_{A^1} F_t(\mu) = \int_X \left( A^1(\mu) \frac{\delta F_t}{\delta \mu(.)} \right)(x)\, \mu(dx). \tag{7.95}
\]

Therefore, we derived (7.60) as well as the corresponding kinetic equation
\[
\frac{d}{dt}(g, \mu_t) = (A^1(\mu_t) g, \mu_t) = \int_X A^1(\mu_t) g(x)\, \mu_t(dx), \tag{7.96}
\]
with $k = 1$ and vanishing $P$.

Remark 116.
(i) Of course, by Proposition 4.2.2, in order to rigorously prove the convergence of the semigroups generated by $\hat{A}^1$ on $C(SX)$ to the semigroups generated by $\Lambda_{A^1}$ on $C(\mathcal{M}(X))$, one has to ensure that the convergence of the generators holds on the core of the limiting semigroup. Therefore, one has to show that smooth functionals $F(\mu)$ on $\mathcal{M}^+(X)$ are invariant under the semigroup generated by $\Lambda_A$, which boils down to the smooth dependence of the solutions to kinetic equations with respect to the initial data, see [147] for details.
(ii) The second term in the approximation of $\hat{A}^1(\mu)$ as $h \to 0$, which is required for providing error estimates, can be written in terms of expressions of the type (7.93) involving second-order variational derivatives. This leads to a necessity to work with the spaces $C^{2,1 \times 1}_{\mathrm{weak}}$, which in turn requires the second-order sensitivity for solutions to the kinetic equations, as analysed in Theorem 6.8.4.
(iii) The formulae of Lemma 7.8.1(ii) are used when the next-order corrections in $h$ to the evolution on $F(h \delta_x)$ are sought.

Let us now extend the story to operators that change the number of particles. For a transition kernel $P^1(x; dy)$ of spontaneous transformations of single particles, the analogue of the evolution (7.87), which describes the process of spontaneous and independent transformations of any particle that is present in the system, is the evolution
\[
\dot{f}_t(x_1, \dots, x_k) = \sum_j \int_X \bigl[f_t(x : x_j \to y) - f_t(x_1, \dots, x_k)\bigr]\, P^1(x_j; dy), \tag{7.97}
\]

where $(x : x_j \to y)$ is the collection of points obtained from the collection $x = (x_1, \dots, x_k)$ by substituting the point $x_j$ by the collection $y$. In terms of the functionals $F(h \delta_x) = f(x)$, this evolution can be rewritten as
\[
\dot{F}_t(h \delta_x) = \sum_j \int_X \bigl[F_t(h \delta_x - h \delta_{x_j} + h \delta_y) - F_t(h \delta_x)\bigr]\, P^1(x_j; dy).
\]
In case of the mean-field dependence, i.e., if $P^1 = P^1(\mu, x; dy)$, the equation is modified accordingly:
\[
\dot{F}_t(h \delta_x) = \sum_j \int_X \bigl[F_t(h \delta_x - h \delta_{x_j} + h \delta_y) - F_t(h \delta_x)\bigr]\, P^1(h \delta_x, x_j; dy).
\]

Using (1.202) and passing to the limit $h \to 0$ with $h \delta_x \to \mu$ yields the equation
\[
\dot{F}_t(\mu) = \int_X \int_X \left[ \left( \frac{\delta F_t(\mu)}{\delta \mu(.)} \right)^{\oplus}(y) - \frac{\delta F_t(\mu)}{\delta \mu(x)} \right] P^1(\mu, x; dy)\, \mu(dx). \tag{7.98}
\]
Therefore we derived equation (7.61) as well as the corresponding kinetic equation for $k = 1$ and vanishing $A$.

Exercise 7.8.2. In the above scheme, identify $A^1(\mu)$ or $P^1(\mu)$ that lead to the Boltzmann, Smoluchowski and Vlasov equations of Section 7.4.

Let us now turn to binary interactions. In this case, the limiting equation on $C(\mathcal{M}(X))$ is influenced by the way the scaling parameter $h$ is applied, which should reflect the physical setting of the problem. Let $A^2$ be an operator in $C_{\mathrm{sym}}(X^2)$, i.e., the state space of two particles. As for the case of the operator $A^1$ in $C(X)$, the standard procedure (extending the second quantization of operators $A^1$) for lifting the action $A^2$ to a system of arbitrary particle numbers is to define
\[
\hat{A}^2 f(x_1, \dots, x_k) = \sum_{(i,j) \subset \{1, \dots, k\}} A^2_{ij} f(x_1, \dots, x_k),
\]
where $A^2_{ij}$ denotes the operator $A^2$ acting on $f$ considered as a function of the variables $x_i, x_j$. Following the general idea that, as the number of particles grows, the size of each particle decreases and the action of each particle on another should also decrease proportionally, one can scale the binary interaction by another multiplier $h$ (compared to spontaneous transitions). Therefore, the operator $\hat{A}^2$ lifted to the functionals on measures $F(h \delta_x) = f(x)$ should be scaled as
\[
h \hat{A}^2 F(h \delta_x) = h \sum_{(i,j) \subset \{1, \dots, k\}} A^2_{ij} F(h \delta_x), \qquad x \in SX^k. \tag{7.99}
\]

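Applying again Lemma 7.8.1 in conjunction with the identity
\[
h^2 \sum_{(i,j) \subset \{1, \dots, n\}} \varphi(x_i, x_j) = \frac{1}{2} \int\!\!\int \varphi(z_1, z_2)\, (h \delta_x)(dz_1) (h \delta_x)(dz_2) - \frac{h}{2} \int \varphi(z, z)\, (h \delta_x)(dz), \tag{7.100}
\]
we find that, if $h \delta_x$ tends to a measure $\mu \in \mathcal{M}^+(X)$ as $h \to 0$, then
\[
\lim_{h \to 0} h \hat{A}^2(\mu) F(h \delta_x) = \frac{1}{2} \int_{X^2} \left( A^2(\mu) \left( \frac{\delta F(\mu)}{\delta \mu(.)} \right)^{\oplus} \right)(x, y)\, \mu(dx) \mu(dy). \tag{7.101}
\]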
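The combinatorial identity (7.100) is elementary and can be confirmed on a small configuration. In the sketch below, the symmetric test function `phi`, the particle positions and the value of $h$ are arbitrary choices made for illustration.

```python
import math
from itertools import combinations


def phi(x, y):
    """A hypothetical symmetric test function of two particle positions."""
    return math.cos(x - y) + x * y


def pair_sum(xs, h):
    """Left-hand side of (7.100): h^2 times the sum over unordered pairs."""
    return h * h * sum(phi(a, b) for a, b in combinations(xs, 2))


def measure_form(xs, h):
    """Right-hand side of (7.100), written through the measure h * delta_x."""
    double_integral = sum(h * h * phi(a, b) for a in xs for b in xs)
    diagonal_integral = sum(h * phi(a, a) for a in xs)
    return 0.5 * double_integral - 0.5 * h * diagonal_integral


if __name__ == "__main__":
    positions = [0.2, -1.3, 0.9, 2.4, 0.0]
    print(pair_sum(positions, 0.2), measure_form(positions, 0.2))
```

The diagonal correction carries an extra factor $h$, which is why it disappears in the limit $h \to 0$ with $h \delta_x \to \mu$.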

Therefore, the evolution $\dot{F}_t = h \hat{A}^2 F(h \delta_x)$ tends to the evolution
\[
\dot{F}_t(\mu) = \frac{1}{2} \int_{X^2} \left( A^2(\mu) \left( \frac{\delta F_t(\mu)}{\delta \mu(.)} \right)^{\oplus} \right)(x, y)\, \mu(dx) \mu(dy). \tag{7.102}
\]
Including the transition kernels of binary interactions yields the equation
\[
\begin{aligned}
\dot{F}_t(\mu) = {} & \frac{1}{2} \int_{X^2} \left( A^2(\mu) \left( \frac{\delta F_t(\mu)}{\delta \mu(.)} \right)^{\oplus} \right)(x, y)\, \mu(dx) \mu(dy) \\
& + \frac{1}{2} \int_{X^2} \int_X \left[ \left( \frac{\delta F_t(\mu)}{\delta \mu(.)} \right)^{\oplus}(y) - \left( \frac{\delta F_t(\mu)}{\delta \mu(.)} \right)^{\oplus}(z) \right] P^2(z; dy)\, \mu(dz_1) \mu(dz_2).
\end{aligned} \tag{7.103}
\]
Therefore, we derived a first-order infinite-dimensional PDE of the type (7.61) for $k = 2$. Arbitrary equations (7.61) are obtained similarly from particle systems with $k$th-order interaction.

Notice that the Boltzmann, Smoluchowski and Vlasov equations can be obtained either via certain mean-field dependent processes of spontaneous transformations as specified by the kernel $P^1(\mu, x; dy)$ or the operator $A$ that depends linearly on $\mu$, or via the process of pure binary interactions, without any mean-field dependence, as specified by the operators $A^2$ or the transition kernels $P^2$.

Exercise 7.8.3. In the above scheme, identify $A^2$ or $P^2$ that lead to the Boltzmann, Smoluchowski and Vlasov equations of Section 7.4.

7.9 Interacting particles and measure-valued diffusions

A very natural uniform scaling of the binary interaction (7.99) always leads to a first-order infinite-dimensional PDE in variational derivatives, whose respective system of characteristics is usually referred to as kinetic equations. However, such scaling is far from being the unique reasonable way to do so. Non-uniform scaling may lead to important higher-order PDEs. For instance, for the discrete setting, these limits are described in [140] and [139]. One of the most famous infinite-dimensional limits obtained in this way is the celebrated super-Brownian motion, as well as more general super-processes. In this section, we shall just consider an example which is related to McKean–Vlasov equations as analysed in Chapter 6.

Assume that the operator $A^2$ on $C_{\mathrm{sym}}(\mathbb{R}^d \times \mathbb{R}^d)$ from above is of the second-order diffusive type and mixes the coordinates:
\[
A^2 f(x, y) = \operatorname{tr}\left( \sigma(x) \sigma^T(y) \frac{\partial^2 f}{\partial x \partial y} \right) = \sum_{i,j,k} \sigma_{ik}(x) \sigma_{jk}(y) \frac{\partial^2 f}{\partial x_i \partial y_j}. \tag{7.104}
\]
Let $A^1(\mu)$ be a family of ΨDOs in $C(\mathbb{R}^d)$ with the symbol $A^1(\mu, x, p)$ such that $A^1(\mu, x, 0) = 0$. The interacting particle system specified by $A^1(\mu)$ and $A^2$ is generated by the operator
\[
\sum_j A^1_j(h \delta_x) f(x_1, \dots, x_k) + \sum_{(i,j) \subset \{1, \dots, k\}} A^2_{ij} f(x_1, \dots, x_k). \tag{7.105}
\]
If we do not apply additional scaling by $h$ to the binary interactions (unlike (7.99)) when transforming it to an action on $C(\mathcal{M}(X))$, then the resulting evolution on $C(\mathcal{M}(X))$ is generated by the operator
\[
\hat{A}^1(h \delta_x) F(h \delta_x) + \hat{A}^2 F(h \delta_x), \qquad x \in SX^k. \tag{7.106}
\]

Applying the formulae (7.100) and (7.93) shows that if $h \delta_x$ converges to a measure $\mu \in \mathcal{M}(X)$, then the expressions in (7.105) converge to
\[
\Lambda_A F(\mu) = \int_X \left( A^1(\mu) \frac{\delta F}{\delta \mu(.)} \right)(x)\, \mu(dx) + \frac{1}{2} \int_{\mathbb{R}^{2d}} \operatorname{tr}\left( \sigma(x) \sigma^T(y) \frac{\partial^2}{\partial x \partial y} \frac{\delta^2 F}{\delta \mu(x) \delta \mu(y)} \right) \mu(dx) \mu(dy), \tag{7.107}
\]
i.e., unlike the scaling result (7.99), we obtained a second-order infinite-dimensional PDE in variational derivatives, a measure-valued diffusion.

Remark 117. For readers with a background in probability, let us point out why equations that are generated by operators of the type (7.105) (and thus their limits (7.107)) are of particular interest: As can be directly checked by Itô's formula, for a system of $k$ SDEs, we have
\[
dX^j_t = b(X^j_t, \mu_t)\, dt + \sigma_1(X^j_t)\, dB^j_t + \sigma_2(X^j_t)\, dW_t, \quad j = 1, \dots, k, \qquad \mu_t = \sum_j \delta_{X^j_t} / k, \tag{7.108}
\]
where $B^1, \dots, B^k, W$ are independent Wiener processes, and the function $\mathbf{E} S(X^1_t, \dots, X^k_t)(x)$ satisfies the equation

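∂S 1 ∂2S ∂2S ∂S 1 = + σ1 (xj ) 2 + σ2 (xi )σ2 (xj ) b(xj , μt ) j ∂t ∂xj 2 ∂xj 2 i

0, we have
\[
\int_0^{\infty} x^{-s} G_{\beta}(t, x)\, dx = \frac{\Gamma(s/\beta)}{\beta \Gamma(s)}\, t^{-s/\beta}; \tag{8.1}
\]
(ii) For any $s \in \mathbb{R}$, Zolotarev's formula holds:
\[
E_{\beta}(s) = \frac{1}{\beta} \int_0^{\infty} e^{sx} x^{-1-1/\beta} G_{\beta}(1, x^{-1/\beta})\, dx. \tag{8.2}
\]
Equivalently, this formula can be written as
\[
E_{\beta}(s) = \frac{1}{\beta} \int_0^{\infty} e^{sx} x^{-1} G_{\beta}(x, 1)\, dx = \int_0^{\infty} e^{s y^{-\beta}} G_{\beta}(1, y)\, dy. \tag{8.3}
\]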

Moreover, the derivatives of $E_{\beta}(s)$ (with respect to $s$) can be obtained by differentiation inside the integrals (8.2) or (8.3).

Remark 118. Formula (8.2) basically states that $\beta E_{\beta}(-s)$ is the Laplace transform of the positive function $x^{-1-1/\beta} G_{\beta}(1, x^{-1/\beta})$. In particular, $\beta E_{\beta}(-s)$ is a completely monotone function.

8.1. Green functions of fractional derivatives and the Mittag-Leffler function


Proof. (i) Notice first of all that by Proposition 2.4.1(i) all derivatives of the function $G_{\beta}(t, x)$ with respect to $x$ vanish at zero. Therefore, all integrals in (8.1) are well defined. Assume now that $s \notin \mathbb{N}$. By (1.176) and the definition of $G$, we find
\[
\begin{aligned}
\int_0^{\infty} x^{-s} G_{\beta}(t, x)\, dx & = \int_{\mathbb{R}} (F^{-1}(x_+^{-s}))(p) \exp\left\{ -t |p|^{\beta} \exp\left\{ \frac{i}{2} \pi \beta\, \mathrm{sgn}\, p \right\} \right\} dp \\
& = 2\, \mathrm{Re} \int_0^{\infty} F^{-1}(x_+^{-s})(p) \exp\{ -t p^{\beta} e^{i \pi \beta / 2} \}\, dp \\
& = \mathrm{Re}\, \frac{\Gamma(1 - s)}{\pi} e^{i \pi (1 - s)/2} \int_0^{\infty} p^{s-1} \exp\{ -t p^{\beta} e^{i \pi \beta / 2} \}\, dp.
\end{aligned}
\]
Therefore, by (9.9), we find
\[
\begin{aligned}
\int_0^{\infty} x^{-s} G_{\beta}(t, x)\, dx & = \mathrm{Re}\, \frac{\Gamma(1 - s)}{\pi} e^{i \pi (1 - s)/2}\, t^{-s/\beta}\, \frac{\Gamma(s/\beta)}{\beta}\, e^{-i \pi s / 2} \\
& = \sin(\pi s)\, \frac{\Gamma(1 - s)}{\pi}\, t^{-s/\beta}\, \frac{\Gamma(s/\beta)}{\beta} = \frac{\Gamma(s/\beta)}{\beta \Gamma(s)}\, t^{-s/\beta},
\end{aligned}
\]
where formula (9.10) was used. By continuity, this extends to all positive $s$ including $s \in \mathbb{N}$.

(ii) The first formula in (8.3) is obtained from (8.2) by scaling (2.77). The second formula in (8.3) is obtained by changing the integration variable to $x = y^{-\beta}$. Let us prove the second formula in (8.3). By expanding the exponents on the r.h.s. of (8.2) into a power series, we are led to show that
\[
E_{\beta}(s) = \sum_{n=0}^{\infty} \frac{s^n}{n!} \int_0^{\infty} y^{-\beta n} G_{\beta}(1, y)\, dy. \tag{8.4}
\]
But by (8.1), we find
\[
\frac{1}{n!} \int_0^{\infty} y^{-\beta n} G_{\beta}(1, y)\, dy = \frac{\Gamma(n)}{n!\, \beta\, \Gamma(\beta n)} = \frac{1}{\Gamma(1 + \beta n)},
\]
so that we get precisely the defining series for $E_{\beta}$. Since the series representation (8.4) of the analytic function $E_{\beta}(s)$ can be differentiated termwise, and since the terms of the corresponding series are positive, the order of summation and integration can be changed in (8.4) as well as in its derivatives. This justifies a differentiation under the integrals in (8.2) or (8.3).

Remark 119. The proof implies that the integrals in (8.2) are well defined, which is a strong requirement on the behaviour of $G_{\beta}$ near zero. Moreover, differentiating under the integrals in (8.2) or (8.3) increases the singularity at one side of $\mathbb{R}_+$, but it always remains integrable.


Formula (8.2) is of key importance, since it allows for a definition of the Mittag-Leffler function $E_\beta(A)$ for any operator $A$ that generates a strongly continuous semigroup $T_t$ in a Banach space $B$ via the formula

$$E_\beta(A) = \frac{1}{\beta} \int_0^\infty e^{Ax} x^{-1-1/\beta} G_\beta(1, x^{-1/\beta})\, dx = \frac{1}{\beta} \int_0^\infty T_x\, x^{-1-1/\beta} G_\beta(1, x^{-1/\beta})\, dx, \tag{8.5}$$

and its derivative by

$$E'_\beta(A) = \frac{1}{\beta} \int_0^\infty e^{Ax} x^{-1/\beta} G_\beta(1, x^{-1/\beta})\, dx = \frac{1}{\beta} \int_0^\infty T_x\, x^{-1/\beta} G_\beta(1, x^{-1/\beta})\, dx. \tag{8.6}$$

Since the solutions to linear fractional equations are expressed in terms of $E_\beta(At^\beta)$, let us derive the most concise integral representations for these expressions. Namely, changing variables in (8.5), (8.6) yields the equations

$$E_\beta(At^\beta) = \frac{t}{\beta} \int_0^\infty e^{Ay} G_\beta(y, t)\, \frac{dy}{y}, \tag{8.7}$$

$$E'_\beta(At^\beta) = \frac{t^{1-\beta}}{\beta} \int_0^\infty e^{Ay} G_\beta(y, t)\, dy. \tag{8.8}$$
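Formula (8.7) can be checked numerically in the scalar case $A = -\lambda$ for $\beta = 1/2$, where the stable density is explicit. The sketch below is an illustration under stated assumptions, not the author's construction: it assumes that $G_{1/2}(t, x)$ is the Lévy density $\frac{t}{2\sqrt{\pi}} x^{-3/2} e^{-t^2/(4x)}$ (the density with Laplace transform $e^{-t\sqrt{p}}$, which agrees with the moment formula (8.1)), and the function names and quadrature parameters are hypothetical.

```python
import math


def ml_half_series(s, n_terms=200):
    """E_{1/2}(s) from the power series sum s^n / Gamma(1 + n/2)."""
    return sum(s**n * math.exp(-math.lgamma(1.0 + 0.5 * n))
               for n in range(n_terms))


def levy_density(t, x):
    """Assumed form of G_{1/2}(t, x): 1/2-stable subordinator density."""
    return t / (2.0 * math.sqrt(math.pi)) * x**-1.5 * math.exp(-t * t / (4.0 * x))


def ml_half_via_8_7(A, t, y_max=60.0, n=20000):
    """Right-hand side of (8.7) for scalar A and beta = 1/2 (midpoint rule).

    (t / beta) * int_0^inf exp(A*y) * G_beta(y, t) dy / y,
    where y plays the role of time and t the role of position in G_beta.
    """
    beta = 0.5
    h = y_max / n
    total = 0.0
    for k in range(n):
        y = (k + 0.5) * h
        total += math.exp(A * y) * levy_density(y, t) / y
    return t / beta * total * h


# Compare with the series value of E_{1/2}(A * t^{1/2}).
gap_1 = abs(ml_half_via_8_7(-1.0, 1.0) - ml_half_series(-1.0 * math.sqrt(1.0)))
gap_2 = abs(ml_half_via_8_7(-2.0, 0.5) - ml_half_series(-2.0 * math.sqrt(0.5)))
```

In the operator setting the factor `math.exp(A * y)` is replaced by the semigroup $T_y$, which is why (8.7) only requires $A$ to be a generator rather than a bounded operator.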

8.2 Linear evolution

The main result for linear fractional equations with time-independent generators is as follows:

Theorem 8.2.1. Let $\beta \in (0, 1)$ and let $A$ be a generator of a strongly continuous semigroup $T_t$ in a Banach space $B$ with the domain $D(A)$. Let $Y \in D(A)$, and let $b_t$ be a continuous curve in $B$ such that $b_t \in D(A)$ for any $t$ and the norms $\|Ab_t\|$ are bounded on compact intervals of $t$. Then the fractional linear Cauchy problem (2.174), i.e.,

$$D^\beta_{a+*} \mu_t = A\mu_t + b_t, \quad \mu_a = Y, \quad t \ge a, \tag{8.9}$$

has a unique solution, which is given by (2.176). Namely, the solution reads

$$\mu_t = E_\beta(A(t-a)^\beta)Y + \beta \int_a^t (t-s)^{\beta-1} E'_\beta(A(t-s)^\beta) b_s\, ds, \tag{8.10}$$

where $E_\beta$, $E'_\beta$ are given by (8.5), (8.6). This function is also the unique solution to the fractional integral equation

$$\mu_t = Y + \frac{1}{\Gamma(\beta)} \int_a^t (t-s)^{\beta-1} (A\mu_s + b_s)\, ds = Y + I^\beta(A\mu_. + b_.)(t). \tag{8.11}$$

Finally, if $T_t$ has the growth type $m_0$ (see the definition before Theorem 4.2.1), so that $\|T_t\| \le M e^{mt}$ with some $m > m_0$ and $M$, then the norms of $\mu_t$ satisfy the estimate

$$\|\mu_t\|_B \le M E_\beta(m(t-a)^\beta) \|Y\|_B + M\beta \int_a^t (t-s)^{\beta-1} E'_\beta(m(t-s)^\beta) \|b_s\|_B\, ds. \tag{8.12}$$

Proof. Let $A_\lambda$ be the Yosida approximation of $A$, see (4.9). By Proposition 2.13.1, the Cauchy problem

$$D^\beta_{a+*} \mu_t = A_\lambda \mu_t + b_t, \quad \mu_a = Y, \quad t \ge a, \tag{8.13}$$

is equivalent to the integral equation

$$\mu_t = Y + \frac{1}{\Gamma(\beta)} \int_a^t (t-s)^{\beta-1} (A_\lambda \mu_s + b_s)\, ds = Y + I^\beta(A_\lambda \mu_. + b_.)(t), \tag{8.14}$$

and its unique solution is given by the formula

$$\mu^\lambda_t = E_\beta(A_\lambda(t-a)^\beta)Y + \beta \int_a^t (t-s)^{\beta-1} E'_\beta(A_\lambda(t-s)^\beta) b_s\, ds, \tag{8.15}$$

where the Mittag-Leffler function can be defined both by its series representation and by (8.5) due to the boundedness of $A_\lambda$. By Proposition 4.2.3, the semigroups generated by $A_\lambda$ converge to $T_t$. By the principle of uniform boundedness (see Theorem 1.6.1), this implies that the norms $\|T^\lambda_t\|$ are uniformly bounded for all $\lambda$ and $t$ from compact intervals. Consequently, by the dominated convergence and Proposition 4.2.3, the $\mu^\lambda_t$ converge to $\mu_t$ given by (8.10), as $\lambda \to \infty$, uniformly on compact intervals of $t$. This holds for any $Y$ and $b_t$ continuous in $B$. Using now that $Y \in D(A)$ and $b_t \in D(A)$, we can conclude that $A_\lambda \mu^\lambda_t \to A\mu_t$. Therefore, passing to the limit in equation (8.14) yields (8.11). This implies (8.9) by Proposition 1.8.4.

It remains to show the uniqueness. This will be done similarly to Theorem 5.10.3. For that purpose, assume that some continuous curve $f_t$ in $B$ is such that $f_t \in D(A)$ and satisfies the equation $D^\beta_{a+*} f_t = Af_t + b_t$ with $f_0 = Y$. We shall show that this $f_t$ is necessarily the limit of $\mu^\lambda_t$ (and hence coincides with $\mu_t$). In fact, from the equations for $f_t$ and $\mu^\lambda_t$, we get the equation

$$D^\beta_{a+*}(\mu^\lambda_t - f_t) = A_\lambda \mu^\lambda_t - Af_t = A_\lambda(\mu^\lambda_t - f_t) + (A_\lambda - A)f_t.$$

By (8.15), applied with vanishing initial value and with $(A_\lambda - A)f_t$ in place of $b_t$, this implies

$$\mu^\lambda_t - f_t = \beta \int_a^t (t-s)^{\beta-1} E'_\beta(A_\lambda(t-s)^\beta)(A_\lambda - A)f_s\, ds.$$

But this tends to zero by the dominated convergence, since $(A_\lambda - A)f_s \to 0$ and the norms $\|Af_s\|$ and $\|A_\lambda f_s\|$ are uniformly bounded. Finally, the estimate (8.12) is a direct consequence of (8.10) and the inequality $\|T_t\| \le M e^{mt}$. □
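The equivalence between the solution formula (8.10) and the integral equation (8.11) can be illustrated in the simplest scalar situation. The sketch below assumes $B = \mathbb{R}$, $A = -\lambda$ with $\lambda > 0$, $a = 0$ and a constant $b_t = b$; these choices, the function names, and the quadrature are illustrative assumptions and not taken from the text. In this case (8.10) reduces to $\mu_t = E_\beta(-\lambda t^\beta) Y + b\,(1 - E_\beta(-\lambda t^\beta))/\lambda$, because $\frac{d}{dr} E_\beta(-\lambda r^\beta) = -\lambda\beta r^{\beta-1} E'_\beta(-\lambda r^\beta)$.

```python
import math


def mittag_leffler(beta, s, n_terms=80):
    """Partial sum of E_beta(s) = sum s^n / Gamma(1 + beta*n), moderate |s|."""
    return sum(s**n * math.exp(-math.lgamma(1.0 + beta * n))
               for n in range(n_terms))


def scalar_solution(t, beta, lam, Y, b):
    """Formula (8.10) for A = -lam, a = 0 and constant b (scalar special case)."""
    e = mittag_leffler(beta, -lam * t**beta)
    return e * Y + b * (1.0 - e) / lam


def frac_integral(f, beta, t, n=4000):
    """I^beta f(t) = (1/Gamma(beta)) * int_0^t (t-s)^(beta-1) f(s) ds.

    The substitution u = (t-s)^beta removes the kernel singularity:
    the integral equals (1/beta) * int_0^{t^beta} f(t - u^(1/beta)) du,
    which is evaluated here by the midpoint rule.
    """
    upper = t**beta
    h = upper / n
    total = 0.0
    for k in range(n):
        u = (k + 0.5) * h
        total += f(t - u**(1.0 / beta))
    return total * h / (beta * math.gamma(beta))


beta, lam, Y, b, t = 0.7, 1.5, 1.0, 0.3, 1.0
lhs = scalar_solution(t, beta, lam, Y, b)
# Right-hand side of (8.11): Y + I^beta(A*mu + b)(t) with A = -lam.
rhs = Y + frac_integral(
    lambda s: -lam * scalar_solution(s, beta, lam, Y, b) + b, beta, t)
residual = abs(lhs - rhs)
```

The residual is limited by the quadrature error near $s = 0$, where the solution behaves like $s^\beta$; it is a consistency check of the two formulas, not a proof of uniqueness.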


With Theorem 8.2.1, it becomes possible to get well-posedness and an integral representation for the solutions to the equations (8.9) once the theory of the corresponding usual evolutionary equation $\dot\mu_t = A\mu_t$ is developed. As basic examples and consequences of Theorem 8.2.1, we can indicate the following cases:

(i) The general fractional Schrödinger equation, see (4.21):

$$D^\beta_{a+*} \psi_t = \sigma H \psi_t, \tag{8.16}$$

where $H$ is a self-adjoint operator in a Hilbert space $\mathcal{H}$, and either $\sigma = -i$ or $H$ is bounded from above and $\sigma$ is any complex number with a non-negative real part. In both cases, Theorem 8.2.1 applies and ensures the well-posedness in $\mathcal{H}$. Moreover, Theorem 8.2.1 applies to the regularized fractional Schrödinger equation or complex fractional diffusion of the type

$$D^\beta_{a+*} f_t = \frac{1}{2}\sigma\Delta f_t + (b(x), \nabla) f_t(x) + V(x) f_t(x), \quad f_0 = f, \tag{8.17}$$

considered in the Banach space $C_\infty(\mathbb{R}^d)$, where $\sigma$ is a complex constant with a positive real part, with $V$ and all $b_j$ bounded continuous complex-valued functions (see Proposition 4.8.3), or more generally for $V, b$ from Theorem 4.8.1.

(ii) The general fractional Feller evolution, which is given by equation (8.9), where $A$ is the generator of an arbitrary Feller semigroup, e.g., a diffusion or a more general Lévy–Khintchin-type operator from Theorem 5.10.4. In this case, (8.10) reveals that the evolution operators $Y \mapsto \mu_t$ given by (8.10) with vanishing $b_t$ have the following additional property: if $0 \le Y \le 1$, then $0 \le \mu_t \le 1$. Moreover, if the Feller semigroup generated by $A$ is conservative, i.e., it preserves constants, then the mapping $Y \mapsto \mu_t$ is also conservative.

(iii) Fractional evolutions generated by ΨDOs with symbols that do not depend on positions:

$$D^\beta_{a+*} f_t = -\psi(-i\nabla) f_t + g_t, \quad f|_{t=a} = f_a, \tag{8.18}$$

with $\beta \in (0, 1)$, under various assumptions on the symbols $\psi(p)$, as described in Theorems 2.4.1, 4.4.1, 4.5.1 or 4.5.2. In this case, the general formula (8.10) takes the more concrete form (2.198):

$$f_t(x) = \frac{1}{\beta} \int_0^\infty dy \int_{\mathbb{R}^d} G^\psi_{y(t-a)^\beta}(x - z)\, y^{-1-1/\beta} G_\beta(1, y^{-1/\beta}) f_a(z)\, dz$$
$$+ \int_a^t ds \int_0^\infty dy\, (t-s)^{\beta-1} \int_{\mathbb{R}^d} G^\psi_{y(t-s)^\beta}(x - z)\, y^{-1/\beta} G_\beta(1, y^{-1/\beta}) g_s(z)\, dz. \tag{8.19}$$

Exercise 8.2.1. Derive the well-posedness of the problems (8.18) for homogeneous symbols $\psi$ or their mixtures by using the Fourier transform that leads to (8.19) and by-passing the general Theorem 8.2.1.


(iv) The equations (8.9) with $A$ being a fractional Laplacian with variable scale or a parabolic polynomial perturbed by local or nonlocal terms of lower order, see the corollary to Proposition 5.9.1 as well as Theorems 4.14.2 and 5.9.2.

For $Y, b_t \in D$, the solutions to the equations (8.9) given by (8.10) belong to the domain of $A$ at all times, and they are classical in this sense. In analogy to the case of usual linear evolutions (see the discussion prior to Proposition 5.9.2), one can introduce the natural notion of generalized solutions to the equations (8.9). For instance, let us say that a continuous curve $\mu_t$, $t \ge a$, in $B$ is a generalized solution by approximation to the Cauchy problem (8.9), if it satisfies the initial condition $\mu_a = Y$, and if there exists a sequence of elements $\mu^n \in D$ and curves $b^n_t$ in $D$ such that $\mu^n \to Y$ and $b^n_t \to b_t$, as $n \to \infty$, and the corresponding (classical, i.e., belonging to the domain) solutions $\mu^n_t$, given by (8.10) with $\mu^n$ instead of $Y$ and $b^n_t$ instead of $b_t$, converge to $\mu_t$, as $n \to \infty$. The following assertion is a direct consequence of Theorem 8.2.1 (and its proof):

Proposition 8.2.1. Under the assumptions of Theorem 8.2.1, formula (8.10) supplies the unique generalized solution by approximation to the Cauchy problem (8.9) for any $Y \in B$ and any bounded measurable curve $b_t \in B$.

As one can expect, if the semigroup generated by an operator $A$ has some regularization property, then the same holds for solutions to the problem (8.9). As it turns out, the solution to (8.9) has even better regularization properties, since it 'spreads' the singularity by the integration. Let us formulate the precise result:

Theorem 8.2.2. Let $B \supset \tilde B$ be two Banach spaces with $\|.\|_{\tilde B} \ge \|.\|_B$. Let $\beta \in (0, 1)$, and let $A$ be a generator of a strongly continuous semigroup $T_t$ in $B$. Let it be regularizing so that $T_t$ takes $B$ to $\tilde B$ and

$$\|T_t f\|_{\tilde B} \le \kappa t^{-\omega} \|f\|_B, \quad t \le S, \tag{8.20}$$

with constants $\omega \in (0, 1)$, $\kappa > 0$. Then the mapping $(Y, \{b_t\}) \mapsto \mu_t(Y)$ given by formula (8.10) (which by Proposition 8.2.1 yields the generalized solution to the problem (8.9)) is also regularizing. More precisely, if $S = \infty$, then

$$\|\mu_t(Y)\|_{\tilde B} \le \frac{\kappa}{\beta} (t-a)^{-\beta\omega} \|Y\|_B \int_0^\infty x^{-\omega-1-1/\beta} G_\beta(1, x^{-1/\beta})\, dx$$
$$+ \kappa \int_a^t (t-s)^{\beta(1-\omega)-1} \|b_s\|_B\, ds \int_0^\infty x^{-\omega-1/\beta} G_\beta(1, x^{-1/\beta})\, dx \tag{8.21}$$

for all $t > a$ (where the integrals are finite), and if $S$ is finite, then

$$\|\mu_t(Y)\|_{\tilde B} \le \tilde\kappa (t-a)^{-\omega\beta} \|Y\|_B + \tilde\kappa \int_a^t (t-s)^{\beta(1-\omega)-1} \|b_s\|_B\, ds, \quad t \le S, \tag{8.22}$$

with a constant $\tilde\kappa$ depending on $\kappa$, $S$ and the growth type $m_0$ of $T_t$.


Proof. Using (8.5) and (8.10) together with the estimates

$$\|e^{A(t-a)^\beta x} Y\|_{\tilde B} \le \kappa (t-a)^{-\beta\omega} x^{-\omega} \|Y\|_B, \quad \|e^{A(t-s)^\beta x} b_s\|_{\tilde B} \le \kappa (t-s)^{-\beta\omega} x^{-\omega} \|b_s\|_B,$$

yields (8.21) in the case of $S = \infty$. The first integral in (8.21) is finite, since the singularity of the integrand at zero is of order $x^{-\omega}$. If $S$ is finite, then for any $m > m_0$ there exists $M$ such that

$$\|T_t f\|_{\tilde B} \le \tilde\kappa S^{-\omega} M e^{m(t-S)} \|f\|_B, \quad t \ge S. \tag{8.23}$$

Plugging this estimate into (8.10) yields (8.22). □

In earlier chapters, you will find many examples for the smoothing property (8.20) for $B = C_\infty(\mathbb{R}^d)$ or $B = L_1(\mathbb{R}^d)$ and $\tilde B = C^k_\infty(\mathbb{R}^d)$ or $H^k_1(\mathbb{R}^d)$, respectively, with some $k \in \mathbb{N}$, see, e.g., (4.24) or (4.26), and Theorems 4.4.1 and 5.8.3.

8.3 The fractional HJB equation and related equations with smoothing generators

In this section, we shall analyse the equation

$$D^\beta_{a+} f_t = Af_t + H\left(x, \frac{\partial f_t}{\partial x}, f_t\right), \tag{8.24}$$

where $A$ is the generator of a strongly continuous semigroup $e^{tA}$ in $C_\infty(\mathbb{R}^d)$ satisfying (6.2), i.e.,

$$\|e^{tA} f\|_{C^1(\mathbb{R}^d)} \le \kappa t^{-\omega} \|f\|_{C(\mathbb{R}^d)}, \tag{8.25}$$

for $t \in (0, S]$ with some finite or infinite $S$. $D^\beta_{a+}$ is the Caputo derivative of order $\beta \in (0, 1)$, and $H$ is a function that is Lipschitz-continuous in its three variables. A basic example comes from stochastic control theory, where $A$ is the generator of a Feller process (say, a diffusion or a stable or stable-like process) and the Hamiltonian function $H$ has the form (2.123). The corresponding equation (6.1) was initially derived in [160], where it describes the optimal cost for controlled processes governed by scaling limits of continuous-time random walks. Similarly, one can deal with the fractional versions of higher-order PDEs or ΨDEs as analysed in Section 6.2, i.e., equations of the type

$$D^\beta_{a+} f_t = -\sigma(x)|\Delta|^{\alpha/2} f_t + H\left(x, \left\{\frac{\partial^m f_t}{\partial x_{i_1} \cdots \partial x_{i_m}}\right\}, f_t\right), \tag{8.26}$$

where the notations are explained after equation (6.27), or even more general equations with $\sigma$ varying in space or general parabolic differential operators instead of $|\Delta|^{\alpha/2}$. In order to keep all formulae short, we shall stick to equations of the type (8.24).

According to Theorem 8.2.1, if a sufficiently regular $f_t$ solves (8.24) with the initial condition $f_0 = Y$, then it satisfies the integral equation

$$f_t = E_\beta(A(t-a)^\beta)Y + \beta \int_a^t (t-s)^{\beta-1} E'_\beta(A(t-s)^\beta) H\left(., \frac{\partial f_s}{\partial x}, f_s\right) ds, \tag{8.27}$$

or equivalently

$$f_t = \frac{1}{\beta} \int_0^\infty \exp\{A(t-a)^\beta y\}\, y^{-1-1/\beta} G_\beta(1, y^{-1/\beta})\, dy\, Y$$
$$+ \int_a^t (t-s)^{\beta-1} \int_0^\infty \exp\{A(t-s)^\beta y\}\, y^{-1/\beta} G_\beta(1, y^{-1/\beta})\, dy\, H\left(., \frac{\partial f_s}{\partial x}, f_s\right) ds. \tag{8.28}$$

Therefore, this equation can be naturally referred to as the mild equation or the mild form of (8.24). In other words, $f_t$ is a solution to the mild equation (8.28), if it is a fixed point of the mapping $\Phi_Y$:

$$[\Phi_Y(f_.)](t) = \frac{1}{\beta} \int_0^\infty \exp\{A(t-a)^\beta y\}\, y^{-1-1/\beta} G_\beta(1, y^{-1/\beta})\, dy\, Y$$
$$+ \int_a^t (t-s)^{\beta-1} \int_0^\infty \exp\{A(t-s)^\beta y\}\, y^{-1/\beta} G_\beta(1, y^{-1/\beta})\, dy\, H\left(., \frac{\partial f_s}{\partial x}, f_s\right) ds. \tag{8.29}$$

The theory will now be developed in full analogy with Theorems 6.1.1 and 6.1.2. The only difference is that it is not sufficient to have local bounds of the type (6.5), since the formula for the Mittag-Leffler function includes an integration over all times. Therefore, the assumptions must be modified in order to take into account the growth type of $e^{At}$.

Theorem 8.3.1. Let $A$ be an operator in $C_\infty(\mathbb{R}^d)$ generating a strongly continuous semigroup $e^{tA}$ in $C_\infty(\mathbb{R}^d)$ such that $e^{tA}$ is also a strongly continuous semigroup in $C^1_\infty(\mathbb{R}^d)$ satisfying (8.25) and

$$\|e^{tA}\|_{C_\infty(\mathbb{R}^d) \to C_\infty(\mathbb{R}^d)} \le M_C e^{m_C t}, \quad \|e^{tA}\|_{C^1_\infty(\mathbb{R}^d) \to C^1_\infty(\mathbb{R}^d)} \le M_D e^{m_D t}, \tag{8.30}$$

with constants $M_C$, $M_D$, $m_C$, $m_D$ and for all $t$. Let $H(x, p, q)$ be a continuous function on $\mathbb{R}^d \times \mathbb{R}^d \times \mathbb{R}$ such that $h = \sup_x |H(x, 0, 0)| < \infty$, and let the Lipschitz continuity property (6.6) hold. Then for any $Y \in C^1_\infty(\mathbb{R}^d)$ there exists a unique solution $f_. \in C([0, T], C^1_\infty(\mathbb{R}^d))$ to the mild equation (8.28). Moreover, the solutions $f_t(Y_1)$ and $f_t(Y_2)$ with different initial data $Y_1$, $Y_2$ satisfy the estimate

$$\|f_t(Y_1) - f_t(Y_2)\|_{C^1(\mathbb{R}^d)} \le C \|Y_1 - Y_2\|_{C^1(\mathbb{R}^d)}, \tag{8.31}$$

with the constant $C$ depending continuously on $t$, $\kappa$, $\omega$, $\beta$, $L_H$, $M_C$, $M_D$, $m_C$ and $m_D$.


Proof. By the same argument as in Theorem 6.1.1, one shows with the help of (8.21) or (8.10) that the mapping $\Phi_Y$ given by (8.29) is a well-defined mapping $C([a, t], C^1_\infty(\mathbb{R}^d)) \to C_Y([a, t], C^1_\infty(\mathbb{R}^d))$ for any $t > a$ that takes bounded subsets to bounded subsets. (The notations given prior to Theorem 2.1.1 are used.) More precisely,

$$\|[\Phi_Y(f_.)](t)\|_{C^1(\mathbb{R}^d)} \le M_D \|Y\|_{C^1(\mathbb{R}^d)} E_\beta(m_D(t-a)^\beta) + \tilde\kappa \int_a^t (t-s)^{\beta(1-\omega)-1} (h + \|f_s\|_{C^1(\mathbb{R}^d)})\, ds. \tag{8.32}$$

Moreover, by (8.5) and (8.10), we have

$$\|[\Phi_{Y_1}(f_.)](t) - [\Phi_{Y_2}(f_.)](t)\|_{C^1_\infty(\mathbb{R}^d)} \le M_D E_\beta(m_D t^\beta) \|Y_1 - Y_2\|_{C^1(\mathbb{R}^d)}.$$

By (8.21) or (8.10), we find

$$\|[\Phi_Y(f^1_.)](t) - [\Phi_Y(f^2_.)](t)\|_{C^1(\mathbb{R}^d)} \le \tilde\kappa L_H \int_a^t (t-s)^{\beta(1-\omega)-1} \|f^1_s - f^2_s\|_{C^1_\infty(\mathbb{R}^d)}\, ds,$$

uniformly for $t$ from any compact interval. Therefore, we have the estimate (2.5) of Theorem 2.1.3, and applying this theorem completes the proof. □

By similarly modifying the proof of Theorem 6.1.2, we obtain the following result on the continuous dependence of solutions to fractional HJB equations on a parameter:

Theorem 8.3.2. Let $H_\alpha(x, p, q)$ be a family of Hamiltonians depending on a parameter $\alpha$ taken from an auxiliary Banach space $B^{par}$. Suppose that each $H_\alpha$ satisfies all assumptions of Theorem 8.3.1 with all bounds uniform in $\alpha$. Moreover, suppose that $H$ is Lipschitz-continuous in $\alpha$, in the sense that

$$|H_{\alpha_1}(x, p, q) - H_{\alpha_2}(x, p, q)| \le \|\alpha_1 - \alpha_2\|\, L^{par}_H (1 + |p| + |q|), \tag{8.33}$$

with a constant $L^{par}_H$. Then the solutions $f_t(Y, \alpha_1)$ and $f_t(Y, \alpha_2)$ to (8.28) (built in Theorem 8.3.1) with different parameter values satisfy the estimate

$$\sup_{s \in [a, t]} \|f_s(Y, \alpha_1) - f_s(Y, \alpha_2)\|_{C^1(\mathbb{R}^d)} \le L^{par}_H K \|\alpha_1 - \alpha_2\| (1 + \|Y\|_{C^1(\mathbb{R}^d)}), \tag{8.34}$$

where the constant $K$ depends continuously on $t$, $\omega$, $\kappa$, $\beta$, $h$, $L_H$, $M_C$, $M_D$, $m_C$ and $m_D$.

In full analogy to Theorem 6.1.5, one can prove the sensitivity (i.e., smooth dependence) of solutions to the fractional HJB equation (8.24) with respect to a parameter or to initial values. Similar to Theorem 6.1.3, one can prove additional regularity of the solutions.


8.4 Generalized fractional integration and differentiation

The fractional derivative $\frac{d^\beta f}{dx^\beta}$, $\beta \in (0, 1)$, was suggested as a substitute to the usual derivative $\frac{df}{dx}$, which can model some kind of memory by taking into account the past values of $f$. An obvious extension that is widely used in the literature are various mixtures of such derivatives, both discrete and continuous,

$$\sum_{j=1}^N a_j \frac{d^{\beta_j} f}{dx^{\beta_j}}, \quad \int_0^1 \frac{d^\beta f}{dx^\beta}\, \mu(d\beta). \tag{8.35}$$

In order to take this idea further, one can observe that $\frac{d^\beta f}{dx^\beta}$ represents a weighted sum of the increments of $f$, $f(x - y) - f(x)$, from various past values of $f$ to the 'present value' at $x$. From this point of view, the natural class of generalized mixed fractional derivatives is represented by the causal integral operators (already discussed in Section 5.11)

$$L'_\nu f(x) = \int_0^\infty (f(x - y) - f(x))\, \nu(dy), \tag{8.36}$$

with some positive measure $\nu$ on $\{y : y > 0\}$ satisfying the one-sided Lévy condition (5.143):

$$\int_0^\infty \min(1, y)\, \nu(y)\, dy < \infty. \tag{8.37}$$

This condition ensures that $L'_\nu$ is well defined at least for the set of bounded infinitely smooth functions on $\{y : y \ge 0\}$. The dual operators to $L'_\nu$ are given by the anticipating integral operators, i.e., weighted sums of the increments from the 'present' to any point 'in the future':

$$L_\nu f(x) = \int_0^\infty (f(x + y) - f(x))\, \nu(dy). \tag{8.38}$$

Of course, one can weight the points in the past or the future differently, depending on the present position. Also, one can add a local part for completing the picture, which leads to the operators

$$L^l_{\nu,b} f(x) = \int_0^\infty (f(x - y) - f(x))\, \nu(x, dy) + b(x) \frac{df}{dx}, \tag{8.39}$$

with a non-positive drift $b(x)$ and a transition kernel $\nu(x, .)$ such that $\int \min(1, y)\, \nu(x, dy) < \infty$.


These operators fully capture the idea of 'weighting the past' and can be called one-sided, namely left-sided or causal, operators of order at most one. Similarly, one can define the right-sided or anticipating operators of order at most one as

$$L^r_{\nu,b} f(x) = \int_0^\infty (f(x + y) - f(x))\, \nu(x, dy) - b(x) \frac{df}{dx}. \tag{8.40}$$

Remark 120. Notice that $L^l$ and $L^r$ are dual only if $\nu$ and $b$ do not depend on $x$. General operators of order at most one are given by linear combinations of one-sided operators, and their semigroups were systematically studied in [147], [148] (see also Section 5.14). The theory of the corresponding fractional differential equations was built in [106].

For the sake of simplicity, let us stick to the mixed derivatives (8.38) and (8.36) and use the following notations:

$$D^{(\nu)}_+ f(x) = -L'_\nu f(x) = -\int_0^\infty (f(x - y) - f(x))\, \nu(dy),$$
$$D^{(\nu)}_- f(x) = -L_\nu f(x) = -\int_0^\infty (f(x + y) - f(x))\, \nu(dy).$$
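To make the notation concrete, the operators $D^{(\nu)}_\pm$ can be evaluated for a measure with finitely many atoms, $\nu = \sum_j w_j \delta_{y_j}$, in which case the integrals become finite sums. The sketch below is an illustrative assumption (a finite discrete $\nu$ and hypothetical function names), not a construction from the text. With a single atom of weight $1/h$ at $y = h$, $D^{(\nu)}_+$ is the backward difference quotient and $D^{(\nu)}_-$ is minus the forward difference quotient.

```python
def d_plus(f, x, atoms):
    """D_+^{(nu)} f(x) = -sum_j w_j * (f(x - y_j) - f(x)).

    `atoms` is a list of (y_j, w_j) pairs describing the assumed discrete
    measure nu = sum_j w_j * delta_{y_j} with y_j > 0 and w_j >= 0.
    """
    return -sum(w * (f(x - y) - f(x)) for y, w in atoms)


def d_minus(f, x, atoms):
    """D_-^{(nu)} f(x) = -sum_j w_j * (f(x + y_j) - f(x))."""
    return -sum(w * (f(x + y) - f(x)) for y, w in atoms)


def square(x):
    return x * x


h = 1e-3
single_atom = [(h, 1.0 / h)]

# Backward difference of x^2 at x = 1: (1 - (1 - h)^2) / h = 2 - h.
backward = d_plus(square, 1.0, single_atom)

# Minus the forward difference of x^2 at x = 1: -((1 + h)^2 - 1) / h = -(2 + h).
forward = d_minus(square, 1.0, single_atom)
```

The opposite signs of the two results reflect that the left-sided operator approximates $\frac{d}{dx}$ while the right-sided one approximates $-\frac{d}{dx}$.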

(With some abuse of notations, if $\nu$ has a density, we shall denote this density again by $\nu$.) The minus sign was introduced in order to comply with the standard notation for fractional derivatives, so that, e.g.,

$$\frac{d^\beta}{dx^\beta} f(x) = D^\beta_{-\infty+} f(x) = D^{(\nu)}_+ f(x)$$

with $\nu(y) = -1/[\Gamma(-\beta) y^{1+\beta}]$, because (see (1.111))

$$\frac{d^\beta}{dx^\beta} f(x) = D^\beta_{-\infty+} f(x) = \frac{1}{\Gamma(-\beta)} \int_0^\infty \frac{f(x - y) - f(x)}{y^{1+\beta}}\, dy$$

and $\Gamma(-\beta) < 0$.

The symbols of the ΨDOs $D^{(\nu)}_+$ and $D^{(\nu)}_-$ are $-\psi_\nu(-p)$ and $-\psi_\nu(p)$, respectively, where $\psi_\nu(p) = \int (e^{ipy} - 1)\, \nu(dy)$ is the symbol of the operator $L_\nu$.

If $\nu$ is finite, then the operators $D^{(\nu)}_+$ are bounded, which is not the case for the derivatives. Therefore, the proper extensions of the derivatives represent only those operators $D^{(\nu)}_+$ arising from infinite measures $\nu$ that satisfy (8.37). The operators arising from a finite $\nu$ can better be considered analogues of the finite differences that approximate the derivatives (see Proposition 5.13.1(ii)).

The operators $D^{(\nu)}_\pm$ are an extension of the fractional derivatives $D^\beta_{-\infty+}$ and $D^\beta_{\infty-}$, often referred to as the fractional derivatives in generator form. When looking for the corresponding extensions of the operators $D^\beta_{a\pm}$ and $D^\beta_{a\pm*}$ with a finite $a$, we note that, by Proposition 1.8.2 (and the symmetric property of the right derivatives), $D^\beta_{a+*}$ (respectively $D^\beta_{a-*}$) is obtained from $D^\beta_{-\infty+}$ (respectively $D^\beta_{\infty-}$) by restricting its action on the subspace $C^1([a, \infty))$ (respectively $C^1((-\infty, a])$). Therefore, the analogues of the Caputo derivatives should be defined as

$$D^{(\nu)}_{a+*} f(x) = -\int_0^{x-a} (f(x - y) - f(x))\, \nu(dy) - \int_{x-a}^\infty (f(a) - f(x))\, \nu(dy),$$
$$D^{(\nu)}_{a-*} f(x) = -\int_0^{a-x} (f(x + y) - f(x))\, \nu(dy) - \int_{a-x}^\infty (f(a) - f(x))\, \nu(dy). \tag{8.41}$$

Let us denote by $C^k_{kill(a)}([a, \infty))$ and $C^k_{kill(a)}((-\infty, a])$ the subspaces of $C^k([a, \infty))$ and $C^k((-\infty, a])$, respectively, consisting of functions that vanish to the right or to the left of $a$. Again by Proposition 1.8.2, the operators $D^\beta_{a+}$ or $D^\beta_{a-}$, the analogues of the Riemann–Liouville derivatives, are obtained by further restricting the actions of $D^\beta_{-\infty+}$ and $D^\beta_{\infty-}$ to the spaces $C^1_{kill(a)}([a, \infty))$ and $C^1_{kill(a)}((-\infty, a])$:

$$D^{(\nu)}_{a+} f(x) = -\int_0^{x-a} (f(x - y) - f(x))\, \nu(y)\, dy + \int_{x-a}^\infty f(x)\, \nu(y)\, dy,$$
$$D^{(\nu)}_{a-} f(x) = -\int_0^{a-x} (f(x + y) - f(x))\, \nu(y)\, dy + \int_{a-x}^\infty f(x)\, \nu(y)\, dy. \tag{8.42}$$

In order to see what the proper analogue of the fractional integral could be, notice that, according to (1.184), the fundamental solution (that vanishes on the negative half-line) to the fractional derivative $\frac{d^\beta}{dx^\beta}$ is $U^\beta(x) = x_+^{\beta-1}/\Gamma(\beta)$. Therefore, the usual fractional integral

$$I^\beta_a f(x) = \frac{1}{\Gamma(\beta)} \int_a^x (x - y)^{\beta-1} f(y)\, dy = \frac{1}{\Gamma(\beta)} \int_0^{x-a} z^{\beta-1} f(x - z)\, dz \tag{8.43}$$

is nothing but the potential operator of the semigroup generated by $-\frac{d^\beta}{dx^\beta}$, or, in other words, the integral operator with the kernel being the fundamental solution to $-\frac{d^\beta}{dx^\beta}$ (or, yet in other words, the convolution with this fundamental solution), restricted to the space $C_{kill(a)}([a, \infty))$. By Proposition 5.12.2, the potential measure $U^{(\nu)}(dy)$ represents the unique fundamental solution to the operator $L'_\nu$, vanishing on the negative half-line. Therefore, the analogue of the fractional integral $I^\beta_a$ for such $\nu$ should be the potential operator of the semigroup $T_t$ generated by $L'_\nu$, i.e., the convolution with $U^{(\nu)}(dy)$ restricted to the space $C_{kill(a)}([a, \infty))$:

$$I^{(\nu)}_a f(x) = \int_0^{x-a} f(x - z)\, U^{(\nu)}(dz). \tag{8.44}$$


The following result corroborates this identification by showing the analogy with Proposition 1.8.4 reduced to the case $\beta \in (0, 1)$ and vanishing initial conditions, and extending it into various directions.

Proposition 8.4.1. Let the measure $\nu$ on $\{y : y > 0\}$ satisfy (5.143).

(i) For any generalized function $g \in D'(\mathbb{R})$ supported on the half-line $[a, \infty)$ with any $a \in \mathbb{R}$, and for any $\lambda \ge 0$, the convolution $U^{(\nu)}_\lambda \star g$ with the $\lambda$-potential measure (5.154) is a well defined element of $D'(\mathbb{R})$, which is also supported on $[a, \infty)$. This convolution represents the unique solution (in the sense of generalized function) to the equation $(\lambda - L'_\nu) f = g$, or equivalently

$$D^{(\nu)}_+ f = -\lambda f + g,$$

supported on $[a, \infty)$.

(ii) If $\lambda > 0$, and if $g \in C_\infty(\mathbb{R})$ and is supported on the half-line $[a, \infty)$, i.e., $g \in C_{kill(a)}([a, \infty)) \cap C_\infty(\mathbb{R})$, then

$$f(x) = (U^{(\nu)}_\lambda \star g)(x) = R_\lambda g(x) = \int_{-\infty}^\infty g(x - y)\, U^{(\nu)}_\lambda(dy) = \int_0^\infty e^{-\lambda t} \int_0^{x-a} g(x - y)\, G_{(\nu)}(t, dy)\, dt \tag{8.45}$$

belongs to the domain of the operator $L'_\nu$ and therefore represents the classical solution to the equation $(\lambda - L'_\nu) f = g$, or equivalently

$$D^{(\nu)}_+ f = D^{(\nu)}_{a+} f = D^{(\nu)}_{a+*} f = -\lambda f + g. \tag{8.46}$$

(iii) If $\lambda = 0$, then the potential $U^{(\nu)}$ defines an unbounded operator in $C_\infty(\mathbb{R})$ that does not fully fit into the framework of Proposition 4.1.4. However, if reduced to the space $C_{kill(a)}([a, b])$ of continuous functions on $[a, b]$ vanishing at $a$ (this space is invariant under $T_t$ and hence under all $R_\lambda$), the potential operator $R_0$ with the kernel $U^{(\nu)}$ becomes bounded, and therefore

$$(U^{(\nu)} \star g)(x) = R_0 g(x) = I^{(\nu)}_a g(x) \tag{8.47}$$

belongs to the domain of $L'_\nu$ and represents the classical solution to the equation

$$-L'_\nu f = D^{(\nu)}_+ f = D^{(\nu)}_{a+} f = D^{(\nu)}_{a+*} f = g \tag{8.48}$$

on $C_{kill(a)}([a, b])$.

Proof. (i) By Propositions 1.11.1 and 1.9.2, the convolution $U^{(\nu)}_\lambda \star g$ is well defined and solves the equation $(\lambda - L'_\nu) f = g$. The uniqueness follows as in Proposition 5.12.2.


(ii) Since $L'_\nu$ generates a semigroup $T_t$ from (5.149), which preserves the spaces $C_\infty([a, \infty))$ and $C_{kill(a)}([a, \infty)) \cap C_\infty([a, \infty))$ (see Remark 99), these spaces are also invariant under the resolvent $R_\lambda = (\lambda - L'_\nu)^{-1}$. By Theorem 4.1.1(v), the image of the resolvent always coincides with the domain of the generator. Therefore, $R_\lambda g$ belongs to the intersection of $C_{kill(a)}([a, \infty)) \cap C_\infty([a, \infty))$ and the domain of $D^{(\nu)}_+$.

(iii) The potential operator

$$R_0 g(x) = (U^{(\nu)} \star g)(x) = \int_0^{x-a} g(x - y)\, U^{(\nu)}(dy)$$

is bounded on $C_{kill(a)}([a, b])$ by Proposition 5.12.1. Therefore, Proposition 4.1.4 applies to this space. □

In particular, applying (8.45) to $L'_\nu = -\frac{d^\beta}{dx^\beta}$ and a comparison with (8.10) yield

$$\beta z^{\beta-1} E'_\beta(-\lambda z^\beta) = \int_0^\infty e^{-\lambda t} G_\beta(t, z)\, dt = U^\beta_\lambda(z), \tag{8.49}$$

which is equivalent to (8.8) and therefore gives another proof of this formula, and hence also of (8.2).

Unlike the case of usual fractional derivatives, for general $\nu$ the classical interpretation of the solution $R_\lambda g(x)$ is more subtle for $g \in C([a, \infty))$ not vanishing at $a$. Moreover, one must carefully distinguish the cases when $g$ is extended to the left of $a$ as $g(a)$, which we denote by $g$, or as zero, which we denote by $g_0$. By the corollary to Proposition 5.12.1, if $\nu$ is not finite, then $R_0 g$ is continuous at zero even if $g \in C([a, b])$ does not vanish at $a$. Still, it does not belong to the domain of $L'_\nu$. However, it may well belong to the domain locally, outside the boundary point $a$, see Proposition 5.11.4 and the follow-up discussion. In fact, the requirement for the solution to belong to the domain outside a boundary point is common for classical problems of PDEs. The following assertion gives a concrete illustration of this point.

Proposition 8.4.2. Under the assumptions of Proposition 8.4.1, let the potential measure $U^{(\nu)}(dy)$ have a continuous density, $U^{(\nu)}(y)$, with respect to the Lebesgue measure. Let $g \in C^1[a, b]$. Then the function $f(x) = R_0 g_0(x)$ belongs to $C_{kill(a)}([a, b])$ and is continuously differentiable in $(a, b]$. Therefore, by Proposition 5.11.4, it satisfies the equation $D^{(\nu)}_{a+*} f = g_0$ locally, at all points from the interval $(a, b]$.

Proof. From the formula for $R_0 g_0(x)$, it follows that

$$\frac{d}{dx} R_0 g_0(x) = \int_0^{x-a} \frac{d}{dx} g(x - y)\, U^{(\nu)}(y)\, dy + g(a)\, U^{(\nu)}(x - a),$$

which is well defined and continuous for $x \ge a$. The limit from the right of $\frac{d}{dx} R_0 g_0(x)$ as $x \to a$ is $g(a) U^{(\nu)}(0)$, which may cause a jump when this function crosses the value $x = a$. □

As mentioned before, the image of the resolvent coincides with the domain of the generator. This implies that the function (8.45) belongs to the domain of $L'_\nu$, restricted to $C_{kill(a)}([a, \infty))$, whenever $g \in C_{kill(a)}([a, \infty))$. For any other $g$, our generalized solution was defined in the sense of a generalized function (which reflects the notion of generalized solutions via duality). As usual (see the discussion prior to Proposition 5.9.2), one can also introduce the notion of generalized solutions with the help of approximations. Namely, for a measurable bounded function $g(x)$ on $[a, \infty)$, a continuous curve $f(x)$, $x \ge a$, is the generalized solution via approximation to the problem $D^{(\nu)}_+ f = -\lambda f + g$ on $C([a, b])$, if there exists a sequence of curves $g^n(.) \in C_{kill(a)}([a, b])$ such that $g^n \to g$ almost surely, as $n \to \infty$, and the corresponding classical (i.e., belonging to the domain) solutions $f^n(x)$, given by (8.45) with $g^n(x)$ instead of $g(x)$, converge pointwise to $f(x)$, as $n \to \infty$. The following assertion is a consequence of Proposition 8.4.1.

Proposition 8.4.3. For any measurable bounded function $g(x)$ on $[a, \infty)$, the formula (8.45) (respectively (8.47)) supplies the unique generalized solution by approximation to the problem (8.46) (respectively (8.48)) on $[a, b]$ for any $b > a$.

8.5 Generalized fractional linear equations, part I

In this section, we will analyse linear equations with a non-vanishing boundary value at $a$ by extending Proposition 8.4.1.

Proposition 8.5.1. Let a non-negative measure $\nu$ on $\{y : y > 0\}$ satisfy (8.37).

(i) If $g$ is a generalized function (from $S'(\mathbb{R})$ or $D'(\mathbb{R})$) vanishing to the left of $a$, then

$$f(x) = Y + I^{(\nu)}_a g(x) = Y + \int_0^{x-a} g(x - y)\, U^{(\nu)}(dy) = Y + (g \star U^{(\nu)})(x) \tag{8.50}$$

is the unique solution (from $S'(\mathbb{R})$ or $D'(\mathbb{R})$, respectively) to the equation

$$g = D^{(\nu)}_{a+*} f = D^{(\nu)}_+ f \tag{8.51}$$

that equals the constant $Y$ to the left of $a$.

(ii) If $g \in C_{kill(a)}([a, b])$, then $f$ from (8.50) belongs to the domain of the generator of the semigroup $T_t$ defined either on the space $C_{uc}((-\infty, b])$ of uniformly continuous functions on $(-\infty, b]$ or on its subspace $C([a, b])$ of functions that are constants to the left of $a < b$. In this case, $f$ is the classical solution to equation (8.51) that equals $Y$ to the left of $a$.


(iii) If $g \in C([a, b])$ (and is extended to the left of $a$ by zero) and $\nu$ is not finite, then $f \in C(-\infty, b]$ for any $b > a$ and therefore takes the initial condition $f(a) = Y$ in the classical sense.

Remark 121. If $\nu$ is finite and $g(a) \ne 0$, then $f$ has a discontinuity at $a$, since in this case the limit of $f$ from the right at $a$ equals $Y + g(a)/\|\nu\|$, see the corollary to Proposition 5.12.1.

Proof. (i) By Propositions 5.12.2 and 1.11.1, for any $g \in S'(\mathbb{R})$ supported on $[a, \infty)$, the function $I^{(\nu)}_a g(x)$ is the unique solution to the equation $g = D^{(\nu)}_{a+*} f$, in the sense of generalized functions, up to an additive constant. Therefore, adding $Y$ fixes the initial condition in a unique way.

(ii) As in Proposition 8.4.1, this follows from the fact that the image of the potential operator, when it is bounded, coincides with the domain.

(iii) This follows from the corollary to Proposition 5.12.1. □

Let us now look at equations with $g$ being continued to the left of $a$ as $g(a)$.

Proposition 8.5.2. Let the measure $\nu$ on $\{y : y > 0\}$ satisfy (8.37) and let $\lambda > 0$.

(i) For any $g \in C_\infty[a, \infty)$, considered as an element of $C_{uc}(\mathbb{R})$ by extending it to the left of $a$ by the constant $g(a)$, the function

$$f(x) = \int_0^\infty g(x - y)\, U^{(\nu)}_\lambda(dy) = (g \star U^{(\nu)}_\lambda)(x) = \int_0^{x-a} g(x - y)\, U^{(\nu)}_\lambda(dy) + g(a) \int_{x-a}^\infty U^{(\nu)}_\lambda(dy) \tag{8.52}$$

is the unique solution to the equation

$$D^{(\nu)}_{a+*} f(x) = -\lambda f(x) + g(x) \tag{8.53}$$

in the domain of the generator of the semigroup $T_t$ on $C_{uc}(\mathbb{R})$. This function equals $g(a)/\lambda$ to the left of $a$.

(ii) For any $g \in S'(\mathbb{R})$ that is constant to the left of $a$, the generalized function $g \star U^{(\nu)}_\lambda$ is a well-defined element of $S'(\mathbb{R})$, and it represents the unique solution to equation (8.53) in the sense of generalized functions.

Proof. (i) By (8.52), $f = R_\lambda g$ is obtained by applying the resolvent to $g$. Therefore, it belongs to the domain of $L'_\nu$ and solves the equation $(\lambda - L'_\nu) f = g$.

(ii) This follows from Proposition 5.12.2(i). □

As above, one can also interpret the formula (8.52) in the sense of generalized solutions by approximation. However, function (8.52) is not the solution that we are mostly interested in, since it prescribes the boundary value at $a$ rather


than solving the boundary-value problem. The most straightforward way to deal properly with the problem

$$D^{(\nu)}_{a+*} f(x) = -\lambda f(x) + g(x), \quad f(a) = Y, \quad x \ge a, \tag{8.54}$$

is to turn it into a problem with vanishing boundary value, which is a common trick in the theory of PDEs. Namely, by introducing the new unknown function $u = f - Y$, we see that $u$ must solve the problem

$$D^{(\nu)}_{a+} u(x) = -\lambda u(x) - \lambda Y + g(x), \quad u(a) = 0, \quad x \ge a, \tag{8.55}$$

just with $g - \lambda Y$ instead of $g$. We can therefore define the solution to (8.54) to be the function $f = u + Y$, where $u$ solves (8.55). For the sake of clarity, let us emphasize that in (8.55) the r.h.s. $g(x) - \lambda Y$ is considered to be continued as zero to the left of $a$. This definition also complies with one of the definitions of $D^{(\nu)}_{a+*}$, given by $D^{(\nu)}_{a+*} f = D^{(\nu)}_{a+}(f - f(a))$ (arising from (1.109)). Taking first $g = 0$, we find the solution to (8.54) to be

$$f(x) = Y + u(x) = Y - \lambda Y \int_0^\infty e^{-\lambda t} \left( \int_0^{x-a} G_{(\nu)}(t, dy) \right) dt = \lambda Y \int_0^\infty e^{-\lambda t} \left( \int_{x-a}^\infty G_{(\nu)}(t, dy) \right) dt. \tag{8.56}$$

Integrating by parts leads to an alternative expression for $x > a$:

$$f(x) = Y \int_0^\infty e^{-\lambda t} \frac{\partial}{\partial t} \left( \int_{x-a}^\infty G_{(\nu)}(t, dy) \right) dt.$$

Restoring $g$, we arrive at the following result:

Proposition 8.5.3. For any $g$ supported on $[a, \infty)$, the unique solution to the problem (8.54) in the sense defined above is given by the formula

$$f(x) = Y \int_0^\infty e^{-\lambda t} \frac{\partial}{\partial t} \left( \int_{x-a}^\infty G_{(\nu)}(t, dy) \right) dt + \int_0^\infty \int_0^{x-a} g(x - y)\, e^{-\lambda t}\, G_{(\nu)}(t, dy)\, dt. \tag{8.57}$$

This solution can be classified as classical (from the domain of the generator) or generalized (in the sense of generalized functions or by approximation) according to Proposition 8.4.1 applied to the problem (8.55).

Since for $L_\nu = -\frac{d^\beta}{dx^\beta}$, the coefficient at $Y$ for $x - a = 1$ is the Mittag-Leffler function $E_\beta(-\lambda)$ of index $\beta$, one can define the analogue of the Mittag-Leffler function for arbitrary $\nu$ as
$$E_{(\nu)}(-\lambda) = \int_0^\infty e^{-\lambda t}\, \frac{\partial}{\partial t} \left( \int_1^\infty G_{(\nu)}(t, dy) \right) dt = \lambda \int_0^\infty e^{-\lambda t} \left( \int_1^\infty G_{(\nu)}(t, dy) \right) dt = 1 - \lambda \int_0^\infty e^{-\lambda t} \left( \int_0^1 G_{(\nu)}(t, dy) \right) dt. \tag{8.58}$$

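By Proposition 5.11.3(ii), the function $\int_{x-a}^\infty G_{(\nu)}(t, dy)$ increases with $t$. Therefore, its derivative is well defined as a positive measure (and as a function almost everywhere), which makes the function $E_{(\nu)}(-\lambda)$ a completely monotone function of $\lambda$. This function is well defined and continuous for $\mathrm{Re}\, \lambda \ge 0$, since it is bounded there by 1:
$$|E_{(\nu)}(-\lambda)| \le \int_0^\infty \frac{\partial}{\partial t} \left( \int_1^\infty G_{(\nu)}(t, dy) \right) dt = \int_1^\infty G_{(\nu)}(t, dy)\, \Big|_{t=0}^{t=\infty} = 1. \tag{8.59}$$
Moreover, we have $E_{(\nu)}(0) = 1$. In fact, one can define the family of these Mittag-Leffler functions depending on the positive parameter $z$ as
$$E_{(\nu),z}(-\lambda) = \int_0^\infty e^{-\lambda t}\, \frac{\partial}{\partial t} \left( \int_z^\infty G_{(\nu)}(t, dy) \right) dt = 1 - \lambda \int_0^\infty e^{-\lambda t} \left( \int_0^z G_{(\nu)}(t, dy) \right) dt. \tag{8.60}$$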

They all are completely monotone, and the solution (8.56) to the problem (8.55) is then expressed as
$$f(x) = Y E_{(\nu),x-a}(-\lambda) + \int_0^{x-a} g(x - y)\, U^{(\nu)}_\lambda(dy), \tag{8.61}$$
where the $\lambda$-potential measure is expressed in terms of $E_{(\nu),z}$ by the equation
$$\int_0^z U^{(\nu)}_\lambda(dy) = (1 - E_{(\nu),z}(-\lambda))/\lambda. \tag{8.62}$$
If the measures $G_{(\nu)}(t, dy)$ have densities with respect to the Lebesgue measure, say $G_{(\nu)}(t, y)$, then the $\lambda$-potential measure also has a density $U^{(\nu)}_\lambda(y)$, and (8.62) can be rewritten as
$$U^{(\nu)}_\lambda(y) = -\frac{1}{\lambda}\, \frac{\partial E_{(\nu),y}(-\lambda)}{\partial y}. \tag{8.63}$$
However, the additional relation $E_{(\nu),z}(-\lambda) = E_{(\nu)}(-\lambda z^\beta)$ only applies for the case of the derivative $\frac{d^\beta}{dx^\beta}$, due to the particular scaling property of $G_\beta$.
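The relations (8.60), (8.62) and the scaling relation can be checked numerically in one classical case. The following is a minimal sketch, assuming $\beta = 1/2$ and assuming that $G_{1/2}(t, dy)$ is the law of the $1/2$-stable subordinator with Laplace transform $e^{-t\sqrt{p}}$, so that $G_{1/2}(t, [0, z]) = \mathrm{erfc}(t/(2\sqrt{z}))$. The function names are hypothetical, and the closed form $E_{1/2}(-x) = e^{x^2}\mathrm{erfc}(x)$ is the standard one for the classical Mittag-Leffler function.

```python
import math

def mittag_leffler(beta, x, terms=120):
    """Power series E_beta(x) = sum_k x^k / Gamma(beta*k + 1); fine for moderate |x|."""
    return sum(x ** k / math.gamma(beta * k + 1.0) for k in range(terms))

def cdf_half(t, z):
    """Assumed closed form G_{1/2}(t, [0, z]) = erfc(t / (2 sqrt(z)))."""
    return math.erfc(t / (2.0 * math.sqrt(z)))

def ml_from_transition(lam, z, t_max=60.0, n=6000):
    """Last expression of (8.60): 1 - lam * int_0^inf e^{-lam t} G(t, [0, z]) dt.

    The integral is truncated at t_max and evaluated by Simpson's rule (n even).
    """
    h = t_max / n
    total = 0.0
    for i in range(n + 1):
        weight = 1.0 if i in (0, n) else (4.0 if i % 2 else 2.0)
        t = i * h
        total += weight * math.exp(-lam * t) * cdf_half(t, z)
    return 1.0 - lam * total * h / 3.0

def potential_mass(lam, z):
    """U_lambda([0, z]) via (8.62), using the scaling relation for beta = 1/2."""
    return (1.0 - mittag_leffler(0.5, -lam * math.sqrt(z))) / lam

lam, z = 0.8, 2.0
x = lam * math.sqrt(z)
closed_form = math.exp(x * x) * math.erfc(x)   # E_{1/2}(-x)
print(mittag_leffler(0.5, -x), ml_from_transition(lam, z), closed_form)
```

The three printed numbers should agree up to quadrature error, which illustrates that for this particular $\nu$ the family $E_{(\nu),z}(-\lambda)$ collapses to $E_\beta(-\lambda z^\beta)$. For a general $\nu$ only the quadrature route remains available, with `cdf_half` replaced by the relevant transition probabilities.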

In the case of $L_\nu = -\frac{d^\beta}{dx^\beta}$, we have
$$\int_1^\infty G_\beta(t, y)\, dy = \int_1^\infty t^{-1/\beta} G_\beta(1, t^{-1/\beta} y)\, dy = \int_{t^{-1/\beta}}^\infty G_\beta(1, x)\, dx,$$
which shows that
$$\frac{\partial}{\partial t} \int_1^\infty G_\beta(t, y)\, dy = \frac{1}{\beta}\, t^{-1-1/\beta}\, G_\beta(1, t^{-1/\beta}).$$

Therefore, we again arrive at formula (8.2). In particular, $E_\beta(\lambda)$ is an entire analytic function of $\lambda$. The same is true for the $\lambda$-potential measure $U^\beta_\lambda$, due to (8.10). In order for $E_{(\nu)}(s)$ to be an entire analytic function like $E_\beta$, some regularity assumptions on $\nu$ are needed. This will be discussed in the next section.

Let us now turn to the extension of linear equations to the Banach-space-valued setting, i.e., to the equations
$$D^{(\nu)}_{a+*} \mu(x) = A\mu(x) + g(x), \quad \mu(a) = Y, \quad x \ge a. \tag{8.64}$$
For $\mu(a) = Y = 0$, this turns into the RL-type equation
$$D^{(\nu)}_{a+} \mu(x) = A\mu(x) + g(x), \quad \mu(a) = 0, \quad x \ge a. \tag{8.65}$$
As above, we define the solution to (8.64) as a function $\mu(x) = Y + u(x)$, where $u(x)$ solves the problem
$$D^{(\nu)}_{a+} u(x) = Au(x) + AY + g(x), \quad u(a) = 0, \quad x \ge a. \tag{8.66}$$

Compared to the case of real-valued $A$, the only new point is the application of Theorem 5.13.1 for building the semigroup $T_t e^{tA}$ and of Proposition 1.11.2 if one is interested in generalized solutions in the sense of generalized functions. Notice also that the assumption of $e^{tA}$ to be a contraction naturally extends the case $A = -\lambda$ with $\lambda > 0$ (since $e^{-\lambda t} \le 1$) and allows for a definition of the operator-valued generalized Mittag-Leffler functions by the operator-valued integral
$$E_{(\nu),z}(A) = \int_0^\infty e^{tA}\, \frac{\partial}{\partial t} \left( \int_z^\infty G_{(\nu)}(t, dy) \right) dt = 1 + A \int_0^\infty e^{tA} \left( \int_0^z G_{(\nu)}(t, dy) \right) dt. \tag{8.67}$$

Theorem 8.5.1.
(i) Let the measure $\nu$ on $\{y : y > 0\}$ satisfy (5.143) and let $A$ be the generator of the strongly continuous semigroup $e^{tA}$ of contractions in the Banach space $B$, with the domain of the generator $D \subset B$. Then the $L(B, B)$-valued potential measure
$$U^{(\nu)}_{-A}(M) = \int_0^\infty e^{tA}\, G_{(\nu)}(t, M)\, dt \tag{8.68}$$
of the semigroup $T_t e^{tA}$ on the subspace $C_{kill(a)}([a, b], B)$ of $C_{uc}((-\infty, b], B)$ (as constructed in Theorem 5.13.1) is well defined as a $\sigma$-finite measure on $\{y : y \ge 0\}$ such that for any $z > 0$, $\lambda > 0$,
$$\|U^{(\nu)}_{-A}([0, z])\| \le e^{\lambda z}/\phi_\nu(\lambda).$$
Therefore, the potential operator (given by convolution with $U^{(\nu)}_{-A}$) of the semigroup $T_t e^{tA}$ on $C_{kill(a)}([a, b], B)$ is bounded for any $b > a$.
(ii) For any $g \in C_{kill(a)}([a, b], B)$, the $B$-valued function
$$f(x) = \int_0^{x-a} U^{(\nu)}_{-A}(dy)\, g(x - y) = \int_0^{x-a} \int_0^\infty e^{tA}\, G_{(\nu)}(t, dy)\, dt\; g(x - y) \tag{8.69}$$
belongs to the domain of the generator of the semigroup $T_t e^{tA}$ and is the unique solution to problem (8.65) from the domain. For any $g \in C([a, b], B)$, continued as zero to the left of $a$, this function represents the unique generalized solution to (8.65), both by approximation and in the sense of generalized functions.
(iii) For any $g \in C([a, b], B)$ (continued as zero to the left of $a$) and $Y \in B$, the function
$$f(x) = Y + \int_0^{x-a} U^{(\nu)}_{-A}(dy)\, (AY + g(x - y)) = E_{(\nu),x-a}(A)\, Y + \int_0^{x-a} U^{(\nu)}_{-A}(dy)\, g(x - y) \tag{8.70}$$
is the unique generalized solution to problem (8.64).

Proof. (i) For the measure $U^{(\nu)}_{-A}$, we obtain the same estimate as for $U^{(\nu)}$ in Proposition 5.12.1, because $e^{tA}$ are contractions. (ii) Concerning the solutions in the domain, this is again a consequence of Proposition 4.1.4. Generalized solutions in the sense of generalized functions are obtained from Proposition 1.11.2. The existence and uniqueness of generalized solutions by approximation is a consequence of the explicit integral formula. (iii) This follows from (ii) by the definition of the solution to (8.64). □
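For orientation, the operator-valued function $E_{(\nu),z}(A)$ can be made concrete in the simplest finite-dimensional situation. The sketch below assumes the classical derivative $L_\nu = -d^\beta/dx^\beta$ and a bounded $A$ given by a small matrix, in which case the standard series $\sum_k A^k z^{k\beta}/\Gamma(k\beta + 1)$ represents the matrix Mittag-Leffler function and $f(x) = E_\beta(A(x-a)^\beta)\,Y$ solves the homogeneous problem. The helper names are hypothetical and the matrices are plain lists of lists.

```python
import math

def mat_mul(a, b):
    """Product of two square matrices stored as lists of rows."""
    n = len(a)
    return [[sum(a[i][k] * b[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]

def identity(n):
    return [[1.0 if i == j else 0.0 for j in range(n)] for i in range(n)]

def matrix_mittag_leffler(A, beta, z, terms=80):
    """Partial sum of sum_k A^k z^{k beta} / Gamma(k beta + 1) for a square matrix A."""
    n = len(A)
    result = [[0.0] * n for _ in range(n)]
    power = identity(n)                      # holds A^k
    for k in range(terms):
        coeff = z ** (k * beta) / math.gamma(k * beta + 1.0)
        for i in range(n):
            for j in range(n):
                result[i][j] += coeff * power[i][j]
        power = mat_mul(power, A)
    return result

# A diagonal generator of a contraction semigroup: each entry decays separately.
decay = matrix_mittag_leffler([[-1.0, 0.0], [0.0, -2.0]], 1.0, 1.0)
print(decay[0][0], decay[1][1])   # for beta = 1 these are exp(-1) and exp(-2)
```

For $\beta = 1$ the series is the ordinary matrix exponential, which gives a quick consistency check. For a nilpotent $A$ the series terminates after finitely many terms.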

8.6 Generalized fractional linear equations, part II

We have constructed the solutions to the linear problems (8.65) and (8.64) only for the case of $A$ generating a contraction semigroup (with a direct extension to the case of a uniformly bounded semigroup $e^{tA}$). This restriction is ultimately linked with the formula (8.58) for the generalized Mittag-Leffler function, which is not directly seen to be extensible to negative $\lambda$. In this section, we present some additional assumptions on $\nu$ which ensure that this extension is possible and therefore allow for an extension of the earlier results to $A$ generating arbitrary strongly continuous semigroups. These assumptions are of two kinds: (a) via lower bounds for $\nu(dy)$ and (b) via its asymptotics at small $y$.

By arguments that are similar to those in Proposition 2.13.1, we can derive that the unique solution to the linear problem $D^{(\nu)}_{a+*} f(x) = Af(x)$ with a bounded operator $A$ in a Banach space $B$ and a given initial condition $f(a) = Y$ must be represented by a geometric series of the operators $I^{(\nu)}_a$:
$$\left(1 + A (I^{(\nu)}_a 1)(x) + \cdots + A^k [(I^{(\nu)}_a)^k 1](x) + \cdots\right) Y, \tag{8.71}$$

whenever this series converges. Therefore we are looking for assumptions on $\nu$ which can ensure the convergence and provide reasonable estimates for the sum.

Remark 122. Since the measure $U^{(\nu)}(dy)$ is always finite on any finite interval $[0, z]$, the series (8.71) has a non-vanishing radius of convergence (it converges for $A$ sufficiently small). This means that Theorem 8.5.1 can be extended to $A$ generating semigroups of the small-growth type. What we are looking for now are the conditions that ensure the convergence of (8.71) for all $A$.

The following assertion is a direct consequence of Propositions 9.4.3 and 9.4.2 from the Appendix.

Proposition 8.6.1. Let $\kappa > 0$ and $\beta \in (1/2, 1)$. Let $\nu(y)$ be a continuous function on $\{y : y \ge 0\}$ satisfying (8.37) and such that the function $\nu(y) - \kappa\beta y^{-\beta-1}$ tends to zero, as $y \to 0$, and is integrable around the origin. Then the symbol $\psi_\nu(p)$ of the operator $L_\nu$ has the asymptotic behaviour
$$\psi_\nu(p) = -e^{-i\pi\beta\, \mathrm{sgn}(p)/2}\, \Gamma(1 - \beta)\, \kappa\, |p|^\beta + O(1). \tag{8.72}$$
Moreover, the potential measure $U^{(\nu)}(dx)$ on $\{x : x \ge 0\}$, representing the fundamental solution to the operator $L_\nu$, has a continuous density $E_\nu(x)$, whose asymptotic behaviour is given by (9.64) (where $\tilde{E}(x) = E(-x)$):
$$E_\nu(x) = \frac{1}{\pi\kappa}\, x^{\beta-1} \sin(\pi\beta) + O(1),$$
with a uniformly bounded function $O(1)$. In particular,
$$E_\nu(x) \le C_\nu (1 + x^{\beta-1})/\Gamma(\beta), \tag{8.73}$$
with some constant $C_\nu$.

Remark 123. The restriction to $\beta > 1/2$ is a consequence of the elementary methods that were used for proving Proposition 9.4.3. The result also holds without this restriction.
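The leading term of the density in Proposition 8.6.1 can be compared with the pure fractional case. The sketch below assumes that an operator whose symbol has the leading coefficient $\Gamma(1-\beta)\kappa$, as in (8.72), has the potential density $x^{\beta-1}/(\kappa\,\Gamma(1-\beta)\Gamma(\beta))$, and checks that this coincides with $x^{\beta-1}\sin(\pi\beta)/(\pi\kappa)$ by Euler's reflection formula. Both function names are hypothetical.

```python
import math

def leading_density(x, beta, kappa):
    """Leading term of E_nu(x) as stated in Proposition 8.6.1."""
    return math.sin(math.pi * beta) * x ** (beta - 1.0) / (math.pi * kappa)

def scaled_fractional_density(x, beta, kappa):
    """Assumed potential density for the pure symbol Gamma(1 - beta) * kappa * p^beta."""
    return x ** (beta - 1.0) / (kappa * math.gamma(1.0 - beta) * math.gamma(beta))

for beta in (0.55, 0.7, 0.9):
    print(beta, leading_density(0.3, beta, 2.0), scaled_fractional_density(0.3, beta, 2.0))
```

The two columns agree to rounding error, which is just the identity $\Gamma(\beta)\Gamma(1-\beta) = \pi/\sin(\pi\beta)$.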


Rougher estimates are available from the comparison principle, as the following consequence of Proposition 5.12.1 reveals (together with formula (8.43) for the potential measure of the fractional derivative of order $\beta$):

Proposition 8.6.2. Let $\nu(dy)$ be a measure on $\{y : y > 0\}$ satisfying (8.37) and having the lower bound of the $\beta$-fractional type
$$\nu(dy) \ge (-1/\Gamma(-\beta))\, C_\nu\, y^{-1-\beta}\, dy \tag{8.74}$$
with some $\beta \in (0, 1)$ and $C_\nu > 0$. Then
$$\int_0^x U^{(\nu)}(dy) \le C_\nu (I^\beta_0 1)(x) = C_\nu x^\beta/\Gamma(\beta + 1) \tag{8.75}$$
for any $x > 0$.

Using (8.73) or (8.75), we can estimate the geometric series (8.71) with the integral operators (8.44). Namely, we get from (8.75) that
$$\left|1 + \lambda (I^{(\nu)}_a 1)(x) + \cdots + \lambda^k [(I^{(\nu)}_a)^k 1](x) + \cdots\right| \le 1 + C_\nu |\lambda| (I^\beta_a 1)(x) + \cdots + (C_\nu |\lambda|)^k [I^{k\beta}_a 1](x) + \cdots \le E_\beta(C_\nu |\lambda| (x - a)^\beta), \tag{8.76}$$
where we refer to Proposition 2.13.1 for the last equation. With (8.73), we find that the second term in this estimate dominates for $x - a < 1$. This implies
$$\left|1 + \lambda (I^{(\nu)}_a 1)(x) + \cdots + \lambda^k [(I^{(\nu)}_a)^k 1](x) + \cdots\right| \le E_\beta(2 C_\nu |\lambda| (x - a)^\beta), \quad x - a \le 1. \tag{8.77}$$
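The comparison series on the right-hand side of (8.76) can be reproduced numerically. The sketch below assumes the standard Riemann–Liouville definition $(I^\beta_a f)(x) = \Gamma(\beta)^{-1}\int_a^x (x-y)^{\beta-1} f(y)\,dy$, checks one application of the operator by quadrature, and then sums the series using the resulting closed form $(I^\beta_a)^k 1 = (x-a)^{k\beta}/\Gamma(k\beta+1)$. The function names and the choice of quadrature are illustrative only.

```python
import math

def rl_integral(f, beta, x, a=0.0, n=20000):
    """(I_a^beta f)(x) by the midpoint rule after the substitution u = (x - y)^beta.

    The substitution removes the kernel singularity:
    I_a^beta f(x) = 1/Gamma(beta + 1) * int_0^{(x-a)^beta} f(x - u^{1/beta}) du.
    """
    if x <= a:
        return 0.0
    upper = (x - a) ** beta
    h = upper / n
    total = 0.0
    for i in range(n):
        u = (i + 0.5) * h
        total += f(x - u ** (1.0 / beta))
    return total * h / math.gamma(beta + 1.0)

def power_term(k, beta, x, a=0.0):
    """Closed form of [(I_a^beta)^k 1](x)."""
    return (x - a) ** (k * beta) / math.gamma(k * beta + 1.0)

def geometric_series(lam, beta, x, a=0.0, terms=80):
    """Partial sum of 1 + lam (I 1)(x) + ... = E_beta(lam (x - a)^beta)."""
    return sum(lam ** k * power_term(k, beta, x, a) for k in range(terms))

beta, x = 0.6, 1.5
# One more application of I^beta to the first term reproduces the second term.
print(rl_integral(lambda y: power_term(1, beta, y), beta, x), power_term(2, beta, x))
print(geometric_series(0.7, beta, x))
```

For $\beta = 1$ the series reduces to $e^{\lambda(x-a)}$, which is a convenient sanity check of the summation.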

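For $x - a > 1$, we can give the following rough bound:
$$(I^1_a + I^\beta_a)^n 1(x) \le 2^n \max_{k \in [0, n]} I^{\beta k + n - k}_a 1(x) \le 2^n \max_k \frac{(x - a)^{\beta k + n - k}}{\Gamma(\beta k + n - k + 1)} \le 2^n\, \frac{(x - a)^n}{\Gamma(\beta n + 1)},$$
and therefore
$$\left|1 + \lambda (I^{(\nu)}_a 1)(x) + \cdots + \lambda^k [(I^{(\nu)}_a)^k 1](x) + \cdots\right| \le E_\beta(2 C_\nu |\lambda| (x - a)), \quad x - a \ge 1. \tag{8.78}$$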

Putting these estimates together yields
$$\left|1 + \lambda (I^{(\nu)}_a 1)(x) + \cdots + \lambda^k [(I^{(\nu)}_a)^k 1](x) + \cdots\right| \le E_\beta(2 C_\nu |\lambda| \max(x - a, (x - a)^\beta)). \tag{8.79}$$

Theorem 8.6.1. Under the assumptions of either Proposition 8.6.2 or Proposition 8.6.1, the integral (8.60) converges for all complex $\lambda$, so that the function $E_{(\nu),z}(\lambda)$ (defined initially by (8.60) for negative parameter values) is an entire analytic function of $\lambda$. Its series expansion is
$$E_{(\nu),z}(\lambda) = 1 + \lambda (I^{(\nu)}_0 1)(z) + \cdots + \lambda^k [(I^{(\nu)}_0)^k 1](z) + \cdots, \tag{8.80}$$
or $E_{(\nu),z}(\lambda) = 1 + \lambda (I^{(\nu)}_a 1)(x) + \cdots + \lambda^k [(I^{(\nu)}_a)^k 1](x) + \cdots$, with $x - a = z$. It can also be obtained by expanding the last expression of (8.60) into a power series in $\lambda$. The series (8.80) is bounded either by (8.76) or by (8.79), respectively. Moreover, the integral expressing the $\lambda$-potential measure,
$$U^{(\nu)}_\lambda([0, z]) = \int_0^\infty e^{-\lambda t}\, G_{(\nu)}(t, [0, z])\, dt,$$
converges for all complex $\lambda$, so that the $\lambda$-potential measure is also an entire analytic function of $\lambda$. Its series expansion is obtained from that of $E_{(\nu),z}(-\lambda)$ via the formula (8.62). Finally, in the case of the setting of Proposition 8.6.2, we have
$$U^{(\nu)}_\lambda([0, z]) \le C_\nu\, \beta \int_0^z y^{\beta-1} E_\beta(|\lambda| y^\beta)\, dy, \tag{8.81}$$
and in the case of the setting of Proposition 8.6.1, the measures $U^{(\nu)}_\lambda(dy)$ and $G_{(\nu)}(t, dy)$ have densities with respect to the Lebesgue measure.

Remark 124. The statement that the expansion (8.80) coincides with the integral (8.60) is a far-reaching extension of Zolotarev's formula (2.80).

Proof. (i) Let us start with the setting of Proposition 8.6.2. Expanding the last expression of (8.60) into a power series in $\lambda$, the comparison principle of Proposition 5.11.3 shows us that all terms are bounded by the corresponding terms of the series with $G_\beta(t, dy)$ instead of $G_{(\nu)}(t, dy)$. Therefore, this series is convergent for all $\lambda$. Since both the last expression in (8.60) and the series (8.80) solve the same linear fractional equation, they coincide. Finally, again by the comparison principle of Proposition 5.11.3, we find
$$U^{(\nu)}_\lambda([0, z]) \le \int_0^\infty e^{|\lambda| t}\, G_{(\nu)}(t, [0, z])\, dt \le C_\nu \int_0^\infty e^{|\lambda| t}\, G_\beta(t, [0, z])\, dt,$$
which implies (8.81) by (8.10).

(ii) Let us turn to the setting of Proposition 8.6.1. Notice first that the potential measure has a density, as was stated in Proposition 8.6.1. The existence of a density of $G_{(\nu)}(t, dy)$ follows from (8.72) and the representation of $G_{(\nu)}(t, dy)$ as the Fourier transform of $\exp\{t\psi_\nu(p)\}$, see, e.g., (5.147).


As in (i), for positive $\lambda$, $E_{(\nu),z}(-\lambda)$ given by the last expression of (8.60) coincides with its series representation (8.80), because they solve the same linear fractional equation. In order to see that the expansion (8.80) can be obtained from the expansion of the last expression of (8.60), we note that all terms of the latter are bounded. (This is seen from (8.72) and the fact that $G_{(\nu)}(t, dy)$ is the Fourier transform of $\exp\{t\psi_\nu(p)\}$.) Therefore, this expansion is at least asymptotic. But two asymptotic expansions necessarily coincide. Hence, the expansion (8.80) coincides with the expansion obtained from the last expression of (8.60). But since the former converges for all $\lambda$, the latter converges as well. Therefore, the integrals in (8.60) also converge for all $\lambda$. Finally, the analytic property of the potential measure follows from the corresponding properties of $E_{(\nu),z}$ via the formula (8.63). □

We are now ready for the main result of this section, which extends Theorem 8.5.1 to arbitrary semigroups $e^{tA}$. The proof is literally the same as that of Theorem 8.5.1 (once the properties of the $\lambda$-potential measures from Theorem 8.6.1 are obtained) and is therefore omitted.

Theorem 8.6.2. Under the assumptions of either Proposition 8.6.2 or Proposition 8.6.1, let $A$ be the generator of the strongly continuous semigroup $e^{tA}$ in the Banach space $B$, with the domain of the generator $D \subset B$. Let the growth type of $e^{tA}$ be $m_0$, so that $\|e^{tA}\| \le M e^{mt}$ with any $m > m_0$ and some $M$. Then the following holds:
(i) The $L(B, B)$-valued potential measure
$$U^{(\nu)}_{-A}([0, z]) = \int_0^\infty e^{tA}\, G_{(\nu)}(t, [0, z])\, dt \tag{8.82}$$
of the semigroup $T_t e^{tA}$ on the subspace $C_{kill(a)}([a, b], B)$ of $C_{uc}((-\infty, b], B)$, as constructed in Theorem 5.13.1, is well defined as a $\sigma$-finite measure on $\{y : y \ge 0\}$. In the setting of Proposition 8.6.2, we have
$$\|U^{(\nu)}_{-A}([0, z])\| \le C_\nu M \beta \int_0^z y^{\beta-1} E_\beta(m y^\beta)\, dy \tag{8.83}$$
for any $z > 0$.
(ii) The $L(B, B)$-valued generalized families of Mittag-Leffler functions
$$E_{(\nu),z}(A) = \int_0^\infty e^{At}\, \frac{\partial}{\partial t} \left( \int_z^\infty G_{(\nu)}(t, dy) \right) dt = 1 + A \int_0^\infty e^{At} \left( \int_0^z G_{(\nu)}(t, dy) \right) dt \tag{8.84}$$
are well defined and bounded by
$$\|E_{(\nu),z}(A)\| \le M E_\beta(C_\nu m z^\beta) \tag{8.85}$$
or by
$$\|E_{(\nu),z}(A)\| \le M E_\beta(C_\nu m \max(z, z^\beta)), \tag{8.86}$$
respectively.
(iii) For any $g \in C_{kill(a)}([a, b], B)$, the $B$-valued function (8.69) belongs to the domain of the generator of the semigroup $T_t e^{tA}$ and is the unique solution to the problem (8.65) from the domain. For any $g \in C([a, b], B)$, this function represents the unique generalized solution to (8.65), both by approximation and in the sense of generalized functions.
(iv) For any $g \in C([a, b], B)$ (continued as zero to the left of $a$) and $Y \in B$, the function (8.70) represents the unique generalized solution to the problem (8.64).

8.7 The time-dependent case; path integral representation

In this section, our aim is to extend the above results for (8.64) to the case of a family of operators $A$ depending on $x$, i.e., to the problem
$$D^{(\nu)}_{a+*} \mu(x) = A(x)\mu(x) + g(x), \quad \mu(a) = Y, \quad x \ge a. \tag{8.87}$$

This development is based on an appropriate extension of Theorem 5.13.1, which we shall carry out in two steps, first for bounded and then for unbounded measures $\nu$. In any case, the framework of 3-level Banach towers turns out to be convenient.

Theorem 8.7.1.
(i) Let $\tilde{D} \subset D \subset B$ be three Banach spaces with the ordered norms $\|\cdot\|_{\tilde{D}} \ge \|\cdot\|_D \ge \|\cdot\|_B$, and let $\tilde{D}$ be dense in both $D$ and $B$ with respect to their topologies (three-level Banach tower). Let $A(x)$, $x \in \mathbb{R}$, be a uniformly bounded family of operators in $L(D, B)$ depending strongly continuously on $x$, which is also uniformly bounded and strongly continuous in $L(\tilde{D}, D)$. Let all $A(x)$ generate uniformly bounded (for $x \in \mathbb{R}$ and $t$ from any compact segment) strongly continuous semigroups $e^{tA(x)}$ in $B$ with the common core $D$, which also represent uniformly bounded strongly continuous semigroups in $D$ with the common core $\tilde{D}$, and uniformly bounded semigroups in $\tilde{D}$. Then the operators $e^{tA(\cdot)} : f(x) \mapsto e^{tA(x)} f(x)$ form a strongly continuous semigroup in $C_\infty(\mathbb{R}, B)$ with the invariant core $C_\infty(\mathbb{R}, D)$, and a strongly continuous semigroup in $C_\infty(\mathbb{R}, D)$.
(ii) Assume additionally that the function $x \mapsto A(x)$ is differentiable both as a mapping $\mathbb{R} \to L(D, B)$ and as a mapping $\mathbb{R} \to L(\tilde{D}, D)$, and that the derivatives $A'(x)$ (the prime denotes a derivative with respect to $x$) are uniformly (in $x$) bounded and strongly continuous families of operators again both in $L(D, B)$ and $L(\tilde{D}, D)$. Then the operators $e^{tA(\cdot)}$ represent a strongly continuous semigroup in the Banach space $C^1_\infty(\mathbb{R}, B) \cap C_\infty(\mathbb{R}, D)$ (with the norm defined as the sum of the norms in $C^1_\infty(\mathbb{R}, B)$ and $C_\infty(\mathbb{R}, D)$) with the invariant core $C^1_\infty(\mathbb{R}, D) \cap C_\infty(\mathbb{R}, \tilde{D})$. Reduced to the latter space, the operators $e^{tA(\cdot)}$ form a semigroup of bounded operators, if this space is equipped with its own Banach topology.
(iii) Under the assumptions (i) and (ii), let $e^{tA(x)}$ have common growth types $m^B_0$, $m^D_0$ and $\tilde{m}^D_0$ as semigroups in $B$, $D$ and $\tilde{D}$, respectively, so that
$$\|e^{tA(x)}\|_{B \to B} \le M_B e^{t m_B}, \quad \|e^{tA(x)}\|_{D \to D} \le M_D e^{t m_D}, \quad \|e^{tA(x)}\|_{\tilde{D} \to \tilde{D}} \le \tilde{M}_D e^{t \tilde{m}_D} \tag{8.88}$$
for any $m_B > m^B_0$, $m_D > m^D_0$, $\tilde{m}_D > \tilde{m}^D_0$ and some $M_B$, $M_D$, $\tilde{M}_D$. Then the semigroup $e^{tA(\cdot)}$ in $C^1_\infty(\mathbb{R}, B) \cap C_\infty(\mathbb{R}, D)$ has a growth type that does not exceed $\max(m^B_0, m^D_0)$, and it satisfies the estimates
$$\|e^{tA(\cdot)}\|_{L(C^1_\infty(\mathbb{R}, B) \cap C_\infty(\mathbb{R}, D))} \le \max\Big(M_B e^{t m_B},\; M_D e^{t m_D} + M_B M_D\, t\, e^{t \max(m_B, m_D)} \sup_x \|A'(x)\|_{D \to B}\Big) \le \max(M_D, M_B) \exp\Big\{t \Big(\max(m_B, m_D) + M_B \sup_x \|A'(x)\|_{D \to B}\Big)\Big\}. \tag{8.89}$$
In $C^1_\infty(\mathbb{R}, D) \cap C_\infty(\mathbb{R}, \tilde{D})$, this semigroup has a growth type that does not exceed $\max(m^D_0, \tilde{m}^D_0)$, and it satisfies the estimates
$$\|e^{tA(\cdot)}\|_{L(C^1_\infty(\mathbb{R}, D) \cap C_\infty(\mathbb{R}, \tilde{D}))} \le \max\Big(M_D e^{t m_D},\; \tilde{M}_D e^{t \tilde{m}_D} + M_D \tilde{M}_D\, t\, e^{t \max(m_D, \tilde{m}_D)} \sup_x \|A'(x)\|_{\tilde{D} \to D}\Big) \le \max(M_D, \tilde{M}_D) \exp\Big\{t \Big(\max(m_D, \tilde{m}_D) + M_D \sup_x \|A'(x)\|_{\tilde{D} \to D}\Big)\Big\}. \tag{8.90}$$

Proof. (i) Since
$$e^{tA(x)} f(x) - e^{tA(x_0)} f(x_0) = e^{tA(x)} (f(x) - f(x_0)) + (e^{tA(x)} - e^{tA(x_0)}) f(x_0)$$
and by Proposition 4.2.2, $e^{tA(x)} f(x)$ belongs to $C_\infty(\mathbb{R}, B)$ (respectively $C_\infty(\mathbb{R}, D)$) whenever $f$ does. Therefore, the operators $e^{tA(\cdot)}$ represent semigroups both in $C_\infty(\mathbb{R}, B)$ and $C_\infty(\mathbb{R}, D)$. By uniform boundedness of $e^{tA(x)}$ with respect to $x$, these semigroups are locally bounded (bounded for $t$ from compact segments). Next, for $f \in C_\infty(\mathbb{R}, D)$, we have
$$e^{tA(x)} f(x) - f(x) = \int_0^t A(x) e^{sA(x)} f(x)\, ds,$$
which tends to zero in $B$, as $t \to 0$, because $A(x)$ and $e^{tA(x)}$ are uniformly bounded as operators from $L(D, B)$ and $L(D, D)$, respectively. By a density argument and the boundedness of the operators $e^{tA(x)}$ in $L(B, B)$, the strong continuity of the semigroup $e^{tA(\cdot)}$ in $C_\infty(\mathbb{R}, B)$ is implied.

Similarly, for $f \in C_\infty(\mathbb{R}, \tilde{D})$, $e^{tA(x)} f(x) - f(x) \to 0$ in $D$, as $t \to 0$, because $A(x)$ and $e^{tA(x)}$ are uniformly bounded as operators from $L(\tilde{D}, D)$ and $L(\tilde{D}, \tilde{D})$, respectively. The boundedness of the operators $e^{tA(x)}$ in $L(D, D)$ implies the strong continuity of the semigroup $e^{tA(\cdot)}$ in $C_\infty(\mathbb{R}, D)$.

It remains to show that any $f \in C_\infty(\mathbb{R}, D)$ belongs to the domain of the generator $A(\cdot)$ of the semigroup $e^{tA(\cdot)}$ in $C_\infty(\mathbb{R}, B)$. We have
$$\frac{1}{t} (e^{tA(x)} f(x) - f(x)) = A(x) f(x) + \frac{1}{t} \int_0^t A(x) (e^{sA(x)} - 1) f(x)\, ds,$$
and the second term tends to zero, as $t \to 0$, due to the strong continuity of $e^{sA(\cdot)}$ in $C_\infty(\mathbb{R}, B)$.

(ii) Since
$$\frac{d}{dx} [e^{tA(x)} f(x)] = \lim_{\delta \to 0} \frac{1}{\delta} \Big[ (e^{tA(x+\delta)} - e^{tA(x)}) f(x) + e^{tA(x+\delta)} (f(x + \delta) - f(x)) \Big]$$
and using (4.138), we find that, for $f \in C^1_\infty(\mathbb{R}, B) \cap C_\infty(\mathbb{R}, D)$, the expression
$$\frac{d}{dx} [e^{tA(x)} f(x)] = \int_0^t e^{(t-s)A(x)} A'(x) e^{sA(x)} f(x)\, ds + e^{tA(x)} f'(x) \tag{8.91}$$
is well defined in the topology of $B$ and represents an element of $C_\infty(\mathbb{R}, B)$, because $A'(x)$ is assumed to be bounded and strongly continuous as a family in $L(D, B)$. By the strong continuity of $e^{sA(\cdot)}$ in $C_\infty(\mathbb{R}, B)$, it follows that
$$\frac{d}{dx} \big[ e^{tA(x)} f(x) \big] \to f'(x),$$
as $t \to 0$. But by (i), the operators $e^{sA(\cdot)}$ depend strongly continuously on $s$ in $C_\infty(\mathbb{R}, D)$. Consequently, the $e^{sA(\cdot)}$ form a strongly continuous semigroup in $C^1_\infty(\mathbb{R}, B) \cap C_\infty(\mathbb{R}, D)$.

If $f \in C^1_\infty(\mathbb{R}, D) \cap C_\infty(\mathbb{R}, \tilde{D})$, then formula (8.91) holds also in the topology of $C_\infty(\mathbb{R}, D)$. This implies that the $e^{sA(\cdot)}$ preserve the space $C^1_\infty(\mathbb{R}, D) \cap C_\infty(\mathbb{R}, \tilde{D})$ and act as bounded operators in the Banach topology of this space.

(iii) By (8.91), we have
$$\Big\| \frac{d}{dx} \big[ e^{tA(x)} f(x) \big] \Big\|_{C_\infty(\mathbb{R}, B)} \le M_B e^{t m_B} \|f(x)\|_{C^1_\infty(\mathbb{R}, B)} + M_B M_D\, t\, e^{(t-s) m_B} e^{s m_D} \sup_x \|A'(x)\|_{D \to B}\, \|f(x)\|_{C_\infty(\mathbb{R}, D)},$$
which implies the first inequality in (8.89). From this, it follows that the growth type of $e^{tA(\cdot)}$ in $C^1_\infty(\mathbb{R}, B) \cap C_\infty(\mathbb{R}, D)$ does not exceed $\max(m^B_0, m^D_0)$. The last inequality in (8.89) follows from the estimate
$$1 + t M_B \sup_x \|A'(x)\|_{D \to B} \le \exp\Big\{ t M_B \sup_x \|A'(x)\|_{D \to B} \Big\}.$$
Similarly, (8.90) is obtained from the estimate
$$\Big\| \frac{d}{dx} \big[ e^{tA(x)} f(x) \big] \Big\|_{C_\infty(\mathbb{R}, D)} \le M_D e^{t m_D} \|f(x)\|_{C^1_\infty(\mathbb{R}, D)} + M_D \tilde{M}_D\, t\, e^{(t-s) m_D} e^{s \tilde{m}_D} \sup_x \|A'(x)\|_{\tilde{D} \to D}\, \|f(x)\|_{C_\infty(\mathbb{R}, \tilde{D})}. \quad \square$$
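The key identity behind (8.91) is the formula $\frac{d}{dx} e^{tA(x)} = \int_0^t e^{(t-s)A(x)} A'(x) e^{sA(x)}\,ds$, which can be tested in finite dimensions. The sketch below uses a hypothetical smooth family of non-commuting $2 \times 2$ matrices $A(x)$ and compares a quadrature of the integral with a central finite difference. The matrix exponential is computed by its power series, which is adequate only for small matrices of moderate norm.

```python
import math

def mat_mul(a, b):
    """Product of two square matrices stored as lists of rows."""
    n = len(a)
    return [[sum(a[i][k] * b[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]

def mat_exp(a, t=1.0, terms=60):
    """exp(t * a) via the power series sum_k (t a)^k / k!."""
    n = len(a)
    result = [[1.0 if i == j else 0.0 for j in range(n)] for i in range(n)]
    term = [row[:] for row in result]
    for k in range(1, terms):
        term = [[t * v / k for v in row] for row in mat_mul(term, a)]
        result = [[result[i][j] + term[i][j] for j in range(n)] for i in range(n)]
    return result

def A(x):
    """Hypothetical family of generators; A(x) and A(y) do not commute in general."""
    return [[-1.0, x], [math.sin(x), -2.0]]

def dA(x):
    """Entrywise derivative A'(x) of the family above."""
    return [[0.0, 1.0], [math.cos(x), 0.0]]

def duhamel_derivative(t, x, n=400):
    """Midpoint rule for int_0^t e^{(t-s)A(x)} A'(x) e^{sA(x)} ds."""
    h = t / n
    total = [[0.0, 0.0], [0.0, 0.0]]
    for i in range(n):
        s = (i + 0.5) * h
        g = mat_mul(mat_mul(mat_exp(A(x), t - s), dA(x)), mat_exp(A(x), s))
        total = [[total[r][c] + h * g[r][c] for c in range(2)] for r in range(2)]
    return total

def finite_difference(t, x, h=1e-5):
    """Central difference approximation of d/dx exp(t A(x))."""
    plus, minus = mat_exp(A(x + h), t), mat_exp(A(x - h), t)
    return [[(p - m) / (2 * h) for p, m in zip(rp, rm)] for rp, rm in zip(plus, minus)]

print(duhamel_derivative(1.0, 0.4))
print(finite_difference(1.0, 0.4))
```

The two printed matrices should agree to several digits. Applying both sides to a constant vector $f$ gives the first term of (8.91); the second term $e^{tA(x)} f'(x)$ comes from the product rule.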



Theorem 8.7.2.
(i) Under the assumptions of Theorem 8.7.1(i), let $\nu$ be a bounded measure on the ray $\{y : y > 0\}$. Then the operator $L_\nu + A(\cdot)$ generates a strongly continuous semigroup $\Phi^{\nu,A}_t$ in $C_\infty(\mathbb{R}, B)$ with the invariant core $C_\infty(\mathbb{R}, D)$, where this semigroup is also strongly continuous. The semigroup $\Phi^{\nu,A}_t$ has the following representation:
$$\Phi^{\nu,A}_t Y(x) = e^{-t\|\nu\|} \Big[ e^{tA(x)} Y(x) + \sum_{m=1}^\infty \int_{0 \le s_1 \le \cdots \le s_m \le t} ds_1 \cdots ds_m \int \nu(dz_1) \cdots \nu(dz_m) \\ \times \exp\{s_1 A(x)\} \exp\{(s_2 - s_1) A(x - z_1)\} \cdots \exp\{(t - s_m) A(x - z_1 - \cdots - z_m)\}\, Y(x - z_1 - \cdots - z_m) \Big], \tag{8.92}$$
or, using the notation (4.99) for piecewise-continuous paths,
$$\Phi^{\nu,A}_t Y(x) = e^{-t\|\nu\|} \Big[ e^{tA(x)} Y(x) + \sum_{m=1}^\infty \int_{0 \le s_1 \le \cdots \le s_m \le t} ds_1 \cdots ds_m \int \nu(dz_1) \cdots \nu(dz_m) \\ \times \exp\Big\{\int_0^{s_1} A(Z_x(\tau))\, d\tau\Big\} \exp\Big\{\int_{s_1}^{s_2} A(Z_x(\tau))\, d\tau\Big\} \cdots \exp\Big\{\int_{s_m}^t A(Z_x(\tau))\, d\tau\Big\}\, Y(Z_x(t)) \Big]. \tag{8.93}$$
(ii) For any $b > a$, the operators $\Phi^{\nu,A}_t$ represent strongly continuous semigroups in the spaces $C_{kill(a)}([a, b], B)$ and $C_{kill(a)}([a, \infty), B) \cap C_\infty(\mathbb{R}, B)$ as well.


(iii) If the assumptions of Theorem 8.7.1 (ii) hold, then $L_\nu + A(\cdot)$ also generates a strongly continuous semigroup in $C^1_\infty(\mathbb{R}, B) \cap C_\infty(\mathbb{R}, D)$ with the invariant core $C^1_\infty(\mathbb{R}, D) \cap C_\infty(\mathbb{R}, \tilde{D})$. The operators $\Phi^{\nu,A}_t$ are bounded in the space $C^1_\infty(\mathbb{R}, D) \cap C_\infty(\mathbb{R}, \tilde{D})$ equipped with its own Banach topology.

Proof. (i) Since $L_\nu$ is a bounded operator both in $C_\infty(\mathbb{R}, B)$ and in $C_\infty(\mathbb{R}, D)$, it follows from Theorem 4.6.1 that the operator
$$(L_\nu + A(\cdot)) f(x) = \int f(x - y)\, \nu(dy) + (A(x) - \|\nu\|) f(x)$$
generates a strongly continuous semigroup in $C_\infty(\mathbb{R}, B)$ with the invariant core $C_\infty(\mathbb{R}, D)$, where this semigroup is also strongly continuous. Moreover, formula (4.85) (where the operator $\int f(x - y)\, \nu(dy)$ is considered a bounded perturbation) provides the representation (8.92). Unlike in (4.103), the operators $A(x)$ may not commute. Therefore, the exponents cannot be put together. Due to the notations (4.99), the equations (8.93) and (8.92) are equivalent.
(ii) The invariance of the spaces $C_{kill(a)}([a, b], B)$ and $C_{kill(a)}([a, \infty), B)$ under $\Phi^{\nu,A}_t$ is seen from (8.92).
(iii) This follows again from Theorem 4.6.1 and the observation that the operators $e^{tA(\cdot)}$ and $L_\nu$ are bounded in the space $C^1_\infty(\mathbb{R}, D) \cap C_\infty(\mathbb{R}, \tilde{D})$ equipped with its own Banach topology. □

By (2.43), the product of the exponents in (8.93) or (8.92) equals the backward chronological product $\tilde{T} \exp\{\int_0^t A(Z_x(\tau))\, d\tau\}$. Recalling the notations of Section 4.7 and denoting by $\nu^{PC}$ the measure on $PC_x(t)$ constructed from $\nu$, we can rewrite (8.93) as
$$\Phi^{\nu,A}_t Y(x) = e^{-t\|\nu\|} \int_{PC_x(t)} F(Z_x(\cdot))\, \nu^{PC}(dZ(\cdot)),$$
with
$$F(Z_x(\cdot)) = \tilde{T} \exp\Big\{ \int_0^t A(Z_x(\tau))\, d\tau \Big\}\, Y(Z_x(t)).$$
By introducing the normalized probability measure $\tilde{\nu}^{PC} = e^{-t\|\nu\|} \nu^{PC}$ as in Section 4.7 and by denoting the integration with respect to this measure on the path-space $PC_x(t)$ by $E_\nu$ (the expectation), we arrive at the main path integral representation for the solutions:

Corollary 10. Under the assumptions of Theorem 8.7.2, the semigroup $\Phi^{\nu,A}_t$ yielding the unique solution to the Cauchy problem
$$\dot{\mu}_t(x) = A(x)\mu_t(x) - D^{(\nu)}_+ \mu_t(x), \quad \mu_0 = Y, \tag{8.94}$$
has the following integral representation in terms of the backward chronological exponential:
$$\Phi^{\nu,A}_t Y(x) = \int_{PC_x(t)} F(Z_x(\cdot))\, \tilde{\nu}^{PC}(dZ(\cdot)) = E_\nu \Big[ \tilde{T} \exp\Big\{ \int_0^t A(Z_x(\tau))\, d\tau \Big\}\, Y(Z_x(t)) \Big]. \tag{8.95}$$
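The representation (8.95) lends itself to simulation, because under the normalized measure the path $Z_x$ is a compound Poisson path: jumps arrive at rate $\|\nu\|$ and move the path to the left by independent amounts distributed as $\nu/\|\nu\|$. The following is a minimal Monte Carlo sketch for the scalar case, where $A(x)$ is a real number and the chronological exponential reduces to the ordinary exponential of $\int_0^t A(Z_x(\tau))\,d\tau$. All names and the test data are hypothetical; for operator-valued $A$ the factors would have to be multiplied in the order shown in (8.92).

```python
import math
import random

def path_integral_mc(A, Y, x, t, nu_mass, sample_jump, n_paths, rng):
    """Monte Carlo estimate of E_nu[ exp{int_0^t A(Z_x(tau)) dtau} Y(Z_x(t)) ], scalar A."""
    total = 0.0
    for _ in range(n_paths):
        pos, clock, integral = x, 0.0, 0.0
        while True:
            wait = rng.expovariate(nu_mass)      # waiting time until the next jump
            if clock + wait >= t:
                integral += A(pos) * (t - clock) # last flat piece, up to time t
                break
            integral += A(pos) * wait            # A is constant along a flat piece
            clock += wait
            pos -= sample_jump(rng)              # jump to the left by z ~ nu / ||nu||
        total += math.exp(integral) * Y(pos)
    return total / n_paths

# Constant A and nu = 2 * delta_{0.5}: the series (8.92) sums in closed form.
rng = random.Random(12345)
estimate = path_integral_mc(lambda y: -0.3, math.exp, 0.0, 1.0, 2.0,
                            lambda r: 0.5, 40000, rng)
exact = math.exp(-0.3 + 2.0 * (math.exp(-0.5) - 1.0))
print(estimate, exact)
```

In this test case $Y(x) = e^x$, so each jump multiplies $Y$ by $e^{-0.5}$ and the Poisson sum gives the closed form used for `exact`. The agreement is only up to statistical error of order $n^{-1/2}$ in the number of paths.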

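The next consequence shows that the formula (8.92) makes it possible to find the growth of the semigroup $\Phi^{\nu,A}_t$, whenever the growth of $e^{tA(x)}$ is known.

Corollary 11. Under the assumptions of Theorem 8.7.2(i) to (iii), the following estimates hold:
$$\|\Phi^{\nu,A}_t\|_{L(C_\infty(\mathbb{R}, B))} \le M_B \exp\{t (m_B + \|\nu\| (M_B - 1))\}, \tag{8.96}$$
$$\|\Phi^{\nu,A}_t\|_{L(C_\infty(\mathbb{R}, D))} \le M_D \exp\{t (m_D + \|\nu\| (M_D - 1))\}, \tag{8.97}$$
$$\|\Phi^{\nu,A}_t\|_{L(C^1_\infty(\mathbb{R}, B) \cap C_\infty(\mathbb{R}, D))} \le \max(M_B, M_D) \exp\Big\{ t \Big[ \max(m_D, m_B) + M_B \sup_x \|A'(x)\|_{D \to B} + \|\nu\| (\max(M_B, M_D) - 1) \Big] \Big\}, \tag{8.98}$$
$$\|\Phi^{\nu,A}_t\|_{L(C^1_\infty(\mathbb{R}, D) \cap C_\infty(\mathbb{R}, \tilde{D}))} \le \max(M_D, \tilde{M}_D) \exp\Big\{ t \Big[ \max(m_D, \tilde{m}_D) + M_D \sup_x \|A'(x)\|_{\tilde{D} \to D} + \|\nu\| (\max(M_D, \tilde{M}_D) - 1) \Big] \Big\}. \tag{8.99}$$
In particular, if the semigroups $e^{tA(x)}$ are regular in $B$ and $D$ in the sense that (8.88) holds with $M_D = M_B = 1$ and some $m_D$, $m_B$, which is equivalent to the requirement that
$$\sup_t \sup_x \frac{1}{t} \ln \|e^{tA(x)}\|_{L(B)} < \infty, \quad \sup_t \sup_x \frac{1}{t} \ln \|e^{tA(x)}\|_{L(D)} < \infty, \tag{8.100}$$
then the same applies to the semigroup $\Phi^{\nu,A}_t$ both in $C_\infty(\mathbb{R}, B)$ and $C^1_\infty(\mathbb{R}, B) \cap C_\infty(\mathbb{R}, D)$, and its growth rates are given by the estimates
$$\|\Phi^{\nu,A}_t\|_{L(C_\infty(\mathbb{R}, B))} \le \exp\{t m_B\}, \quad \|\Phi^{\nu,A}_t\|_{L(C_\infty(\mathbb{R}, D))} \le \exp\{t m_D\}, \tag{8.101}$$
$$\|\Phi^{\nu,A}_t\|_{L(C^1_\infty(\mathbb{R}, B) \cap C_\infty(\mathbb{R}, D))} \le \exp\Big\{ t \Big[ \max(m_D, m_B) + \sup_x \|A'(x)\|_{D \to B} \Big] \Big\}, \tag{8.102}$$
independently of $\nu$.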
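The bound (8.96) is what one obtains by estimating each term of the series (8.92) separately: the $m$-th term carries $m + 1$ semigroup factors, the simplex has volume $t^m/m!$, and the jumps contribute $\|\nu\|^m$. The sketch below sums this majorant numerically and compares it with the closed form. It is only an arithmetic illustration with hypothetical parameter values.

```python
import math

def series_bound(t, m, M, nu_mass, terms=400):
    """e^{-t||nu||} * M e^{mt} * sum_n (||nu|| M t)^n / n!, summed term by term."""
    term, total = 1.0, 1.0
    for n in range(1, terms):
        term *= nu_mass * M * t / n      # builds (||nu|| M t)^n / n! incrementally
        total += term
    return M * math.exp((m - nu_mass) * t) * total

def closed_bound(t, m, M, nu_mass):
    """Right-hand side of (8.96): M exp{t (m + ||nu|| (M - 1))}."""
    return M * math.exp(t * (m + nu_mass * (M - 1.0)))

print(series_bound(2.0, 0.5, 1.5, 3.0), closed_bound(2.0, 0.5, 1.5, 3.0))
# With M = 1 the bound no longer depends on the total mass of nu:
print(closed_bound(2.0, 0.5, 1.0, 3.0), closed_bound(2.0, 0.5, 1.0, 300.0))
```

The second line of output illustrates why regular semigroups ($M_B = M_D = 1$) are singled out: only then do the estimates survive the passage to unbounded $\nu$.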


Proof. By (8.92), we have
$$\|\Phi^{\nu,A}_t\|_{C_\infty(\mathbb{R}, B)} \le M_B e^{m_B t} e^{-t\|\nu\|} \Big( 1 + \sum_{n=1}^\infty \frac{\|\nu\|^n M_B^n t^n}{n!} \Big),$$
which implies (8.96). Similarly, the other estimates are obtained due to (8.89) and (8.90). □

We can now address the problem (8.87) for the simplest case of bounded $\nu$.

Theorem 8.7.3. Under the assumptions of Theorem 8.7.2, the resolvent operators $R^{A,\nu}_\lambda$ of the semigroup $\Phi^{\nu,A}_t$ in the space $C_{kill(a)}([a, \infty), B) \cap C_\infty(\mathbb{R}, B)$ yielding the classical solutions to the problems
$$\big( \lambda - A(x) + D^{(\nu)}_{a+} \big) \mu(x) = g(x), \quad \mu(a) = 0, \quad x \ge a, \tag{8.103}$$
are well defined for $\lambda > m_B + \|\nu\| (M_B - 1)$, and are given by the formula
$$R^{A,\nu}_\lambda g(x) = \int_0^\infty e^{-\lambda t}\, E_\nu \Big[ \tilde{T} \exp\Big\{ \int_0^t A(Z_x(\tau))\, d\tau \Big\}\, g(Z_x(t)) \Big] dt. \tag{8.104}$$
When reduced to $C_{kill(a)}([a, b], B)$, they are also well defined for $\lambda \ge m_B + \|\nu\| (M_B - 1)$. In particular, if all semigroups generated by $A(x)$ in $B$ are contractions, then the problem (8.87) with $Y = 0$ has a unique classical solution (belonging to the domain of the generator of the semigroup $\Phi^{\nu,A}_t$ in $C_{kill(a)}([a, b], B)$) given by (8.104) with $\lambda = 0$ for any $g \in C_{kill(a)}([a, b], B)$.

Since $g \in C_{kill(a)}([a, \infty), B)$, formula (8.104) can be rewritten as
$$R^{A,\nu}_\lambda g(x) = E_\nu \int_0^{\sigma_a} e^{-\lambda t}\, \tilde{T} \exp\Big\{ \int_0^t A(Z_x(\tau))\, d\tau \Big\}\, g(Z_x(t))\, dt, \tag{8.105}$$
where $\sigma_a = \inf\{t : Z_x(t) \le a\}$. This formula can be used for defining various generalized solutions to (8.103).
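Formula (8.105) can also be sampled directly, since the time integral stops at the killing time $\sigma_a$ and the integrand is explicit between jumps. The sketch below again treats the scalar case with a bounded $\nu$ (compound Poisson paths), integrates exactly over each flat piece of the path, and is tested on a case with a closed form. The function name and the test data are hypothetical, and a bounded measurable $g$ is used, so the output should be read as a generalized solution in the sense of the text.

```python
import math
import random

def resolvent_mc(A, g, x, a, lam, nu_mass, sample_jump, n_paths, rng):
    """Estimate E_nu int_0^{sigma_a} e^{-lam t} exp{int_0^t A(Z(tau)) dtau} g(Z(t)) dt."""
    total = 0.0
    for _ in range(n_paths):
        pos, log_weight, value = x, 0.0, 0.0
        while pos > a:                           # killed once Z_x(t) <= a
            wait = rng.expovariate(nu_mass)      # length of the current flat piece
            rate = lam - A(pos)                  # decay rate of the integrand on this piece
            piece = wait if rate == 0.0 else (1.0 - math.exp(-rate * wait)) / rate
            value += math.exp(log_weight) * g(pos) * piece
            log_weight -= rate * wait            # weight accumulated up to the next jump
            pos -= sample_jump(rng)
        total += value
    return total / n_paths

# A = -0.2, lam = 0.8, g = 1, jumps of size 0.5 at rate 2, start 1.2, barrier 0:
# the path is killed at the third jump, so sigma_a is Gamma(3, 2)-distributed.
rng = random.Random(2024)
estimate = resolvent_mc(lambda y: -0.2, lambda y: 1.0, 1.2, 0.0, 0.8, 2.0,
                        lambda r: 0.5, 20000, rng)
exact = 1.0 - (2.0 / 3.0) ** 3                   # E[(1 - e^{-sigma_a})]
print(estimate, exact)
```

No truncation of the time horizon is needed here, because every path reaches the barrier after finitely many jumps.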


8.8 Chronological operator-valued Feynman–Kac formula

Let us now turn to problems of the type (8.87) with an unbounded $\nu$.

Theorem 8.8.1.
(i) Under the assumptions of Theorem 8.7.1 (i) to (iii), let the semigroups $e^{tA(x)}$ be regular in $B$ and $D$ in the sense that (8.88) holds with $M_D = M_B = 1$ and some $m_D$, $m_B$ (equivalently, if (8.100) holds). Let $\nu$ be a measure on the ray $\{y : y > 0\}$ satisfying (8.37). Then the operator $L_\nu + A(\cdot)$ generates a strongly continuous semigroup $\Phi^{\nu,A}_t$ both in $C_\infty(\mathbb{R}, B)$ and $C_\infty(\mathbb{R}, D)$ solving the Cauchy problem (8.94), with the domains of the generator containing the spaces $C^1_\infty(\mathbb{R}, B) \cap C_\infty(\mathbb{R}, D)$ and $C^1_\infty(\mathbb{R}, D) \cap C_\infty(\mathbb{R}, \tilde{D})$, respectively. The semigroup $\Phi^{\nu,A}_t$ can be obtained as the limit, as $\epsilon \to 0$, of the semigroups $\Phi^{\nu_\epsilon,A}_t$ built by Theorem 8.7.2 for the finite approximations $\nu_\epsilon(dy) = 1_{|y| \ge \epsilon}\, \nu(dy)$ of $\nu$, so that the semigroup $\Phi^{\nu,A}_t$ has the representation
$$\Phi^{\nu,A}_t Y(x) = \lim_{\epsilon \to 0} \Phi^{\nu_\epsilon,A}_t Y(x) = \lim_{\epsilon \to 0} E_{\nu_\epsilon} \Big[ \tilde{T} \exp\Big\{ \int_0^t A(Z_x(\tau))\, d\tau \Big\}\, Y(Z_x(t)) \Big], \tag{8.106}$$
where the limit is well defined both in the topologies of $B$ and $D$, and
$$\|\Phi^{\nu,A}_t\|_{L(C_\infty(\mathbb{R}, B))} \le \exp\{t m_B\}, \quad \|\Phi^{\nu,A}_t\|_{L(C_\infty(\mathbb{R}, D))} \le \exp\{t m_D\}. \tag{8.107}$$
(ii) For any $b > a$, the operators $\Phi^{\nu,A}_t$ represent strongly continuous semigroups also in the spaces $C_{kill(a)}([a, b], B)$, $C_{kill(a)}([a, \infty), B) \cap C_\infty(\mathbb{R}, B)$, $C_{kill(a)}([a, b], D)$, $C_{kill(a)}([a, \infty), D) \cap C_\infty(\mathbb{R}, D)$.

Proof. (i) This is similar to the proof of Proposition 5.13.1. By (8.101) and (8.102), the semigroups $\Phi^{\nu_\epsilon,A}_t$ are uniformly (in $\epsilon$) bounded in both $C_\infty(\mathbb{R}, B)$ and $C^1_\infty(\mathbb{R}, B) \cap C_\infty(\mathbb{R}, D)$.

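Estimating the difference between the actions of $\Phi^{\nu_\epsilon,A}_t$ for two values $\epsilon_2 < \epsilon_1 < 1$ in the usual way, i.e., by a formula of the type (4.8), leads to
$$\Phi^{\nu_{\epsilon_1},A}_t - \Phi^{\nu_{\epsilon_2},A}_t = \int_0^t \Phi^{\nu_{\epsilon_2},A}_{t-s} (L_{\nu_{\epsilon_1}} - L_{\nu_{\epsilon_2}})\, \Phi^{\nu_{\epsilon_1},A}_s\, ds.$$
Hence, for $Y \in C^1_\infty(\mathbb{R}, B) \cap C_\infty(\mathbb{R}, D)$, we can derive by (8.101) and (8.102) that
$$\|(\Phi^{\nu_{\epsilon_1},A}_t - \Phi^{\nu_{\epsilon_2},A}_t) Y\|_{C_\infty(\mathbb{R}, B)} \le \int_0^t ds\, e^{(t-s) m_B} \sup_x \Big\| \int_{\epsilon_2}^{\epsilon_1} \big( \Phi^{\nu_{\epsilon_1},A}_s Y(x - y) - \Phi^{\nu_{\epsilon_1},A}_s Y(x) \big)\, \nu(dy) \Big\|_B \le \int_0^t ds\, e^{(t-s) m_B} \int_{\epsilon_2}^{\epsilon_1} y\, \nu(dy)\, \|\Phi^{\nu_{\epsilon_1},A}_s Y\|_{C^1_\infty(\mathbb{R}, B)},$$
and therefore
$$\|(\Phi^{\nu_{\epsilon_1},A}_t - \Phi^{\nu_{\epsilon_2},A}_t) Y\|_{C_\infty(\mathbb{R}, B)} \le \int_{\epsilon_2}^{\epsilon_1} y\, \nu(dy) \int_0^t ds\, e^{(t-s) m_B} \exp\Big\{ s \Big[ \max(m_D, m_B) + \sup_x \|A'(x)\|_{D \to B} \Big] \Big\}\, \|Y\|_{C^1_\infty(\mathbb{R}, B) \cap C_\infty(\mathbb{R}, D)}, \tag{8.108}$$
which tends to zero, as $\epsilon_1, \epsilon_2 \to 0$, uniformly for $t$ from any compact set. Therefore, the families $\Phi^{\nu_\epsilon,A}_t Y$ converge, as $\epsilon \to 0$, for any $Y \in C^1_\infty(\mathbb{R}, B) \cap C_\infty(\mathbb{R}, D)$. By a density argument, this convergence extends to all $Y \in C_\infty(\mathbb{R}, B)$. Passing to the limit in the semigroup equation, we find that the limiting operators form a bounded semigroup in $C_\infty(\mathbb{R}, B)$, with the same bounds (8.101). We denote this semigroup by $\Phi^{\nu,A}_t$. Its strong continuity follows from the strong continuity of $\Phi^{\nu_\epsilon,A}_t$. Next, writing
$$\frac{\Phi^{\nu,A}_t Y - Y}{t} = \frac{\Phi^{\nu_\epsilon,A}_t Y - Y}{t} + \frac{\Phi^{\nu,A}_t Y - \Phi^{\nu_\epsilon,A}_t Y}{t}$$
and noting that, by (8.108), the second term tends to zero, as $t, \epsilon \to 0$, we can conclude that the space $C^1_\infty(\mathbb{R}, B) \cap C_\infty(\mathbb{R}, D)$ belongs to the domain of the generator of the semigroup $\Phi^{\nu,A}_t$ in $C_\infty(\mathbb{R}, B)$.

Finally, the same estimates as above can be established in the topology of $C_\infty(\mathbb{R}, D)$, which shows the required properties of $\Phi^{\nu,A}_t$ in the space $C_\infty(\mathbb{R}, D)$. The estimates (8.107) follow from the same estimate for $\Phi^{\nu_\epsilon,A}_t$.

(ii) The invariance of the spaces $C_{kill(a)}([a, b], B)$, $C_{kill(a)}([a, \infty), B)$, $C_{kill(a)}([a, b], D)$, and $C_{kill(a)}([a, \infty), D)$ under $\Phi^{\nu,A}_t$ follows from their invariance under all $\Phi^{\nu_\epsilon,A}_t$. □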



From the point of view of numeric calculations, the limiting integral representation formula (8.106) seems to be most appropriate. From a theoretical perspective, it is of course desirable to get rid of the $\lim_{\epsilon \to 0}$.

Remark 125. For the next result, some methods of stochastic analysis have to be invoked. Readers who are not familiar with this material are advised to skip the next theorem and to simply think of $E_\nu$ used below as a notation for $\lim_{\epsilon \to 0} E_{\nu_\epsilon}$.
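To use (8.106) numerically one needs the truncated measures $\nu_\epsilon$ explicitly. The sketch below takes, purely as an assumed example, $\nu(dy) = y^{-1-\beta}\,dy$ on $(0, 1]$ with $\beta \in (0, 1)$. It shows the two competing effects of the truncation: the total mass $\|\nu_\epsilon\|$ (the jump rate of the simulated paths) blows up as $\epsilon \to 0$, while the small-jump moment $\int_{\epsilon_2}^{\epsilon_1} y\,\nu(dy)$, which controls the error in (8.108), tends to zero. A jump sampler by inversion of the distribution function is included; all names are hypothetical.

```python
import math

def truncated_mass(eps, beta, upper=1.0):
    """||nu_eps|| for nu(dy) = y^{-1-beta} dy on (0, upper], restricted to y >= eps."""
    return (eps ** (-beta) - upper ** (-beta)) / beta

def small_jump_moment(eps1, eps2, beta):
    """int_{eps2}^{eps1} y nu(dy), the factor in front of the estimate (8.108)."""
    return (eps1 ** (1.0 - beta) - eps2 ** (1.0 - beta)) / (1.0 - beta)

def sample_jump(u, eps, beta, upper=1.0):
    """Inverse-CDF sample from nu_eps / ||nu_eps|| for a uniform number u in [0, 1]."""
    lo, hi = eps ** (-beta), upper ** (-beta)
    return (lo - u * (lo - hi)) ** (-1.0 / beta)

for eps in (1e-1, 1e-2, 1e-4):
    print(eps, truncated_mass(eps, 0.5), small_jump_moment(eps, 0.0, 0.5))
```

The growing jump rate is the practical cost of a small $\epsilon$: the expected number of jumps per simulated path on $[0, t]$ is $t\,\|\nu_\epsilon\|$, so $\epsilon$ has to be balanced against the truncation error.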


Theorem 8.8.2. Under the assumptions of Theorem 8.7.1, the formula (8.95) is generalized for the present setting as

$$\Phi_t^{\nu,A} Y(x) = E_\nu \left[ \tilde{T} \exp\left\{ \int_0^t A(Z_x(\tau))\, d\tau \right\} Y(Z_x(t)) \right], \tag{8.109}$$

where $E_\nu$ means the expectation with respect to the measure on the cadlag paths of the Lévy process generated by the operator $L_\nu$, starting at $x$.

Proof. This follows from (8.103) and three additional points: (i) the convergence of Feller semigroups implies the weak convergence of the corresponding Markov processes; (ii) the limiting process generated by $L_\nu$ is a Lévy process, whose trajectories are non-increasing cadlag paths; (iii) the convergence of propagators that are parametrized by cadlag paths, see Theorem 2 of [149] or Theorem 1.9.5 of [148]. □

The formula (8.109) is a time-ordered operator-valued version of the classical Feynman–Kac formula of stochastic calculus. As a consequence, like in the case of bounded $\nu$, we obtain the solutions to the problem (8.87).

Theorem 8.8.3. Under the assumptions of Theorem 8.7.1, the resolvent operators $R_\lambda^{A,\nu}$ of the semigroup $\Phi_t^{\nu,A}$ in the space $C_{kill(a)}([a, \infty), B) \cap C_\infty(\mathbb{R}, B)$ yielding the classical solutions to the problem (8.103) are well defined for $\lambda > m_B$ and are given by the formula

$$R_\lambda^{A,\nu} g(x) = E_\nu \int_0^{\sigma_a} e^{-\lambda s}\, \tilde{T} \exp\left\{ \int_0^s A(Z_x(\tau))\, d\tau \right\} g(Z_x(s))\, ds = \lim_{\epsilon \to 0} E_{\nu_\epsilon} \int_0^{\sigma_a} e^{-\lambda s}\, \tilde{T} \exp\left\{ \int_0^s A(Z_x(\tau))\, d\tau \right\} g(Z_x(s))\, ds. \tag{8.110}$$

If all semigroups generated by $A(x)$ in $B$ are contractions, then the problem (8.87) with $Y = 0$ has a unique classical solution (belonging to the domain of the generator of the semigroup $\Phi_t^{\nu,A}$ in $C_{kill(a)}([a, b], B)$) for any $g \in C_{kill(a)}([a, b], B)$.

Note that the formula (8.110) is a time-ordered operator-valued version of the stationary Feynman–Kac formula of stochastic calculus. As usual, if $g$ is any bounded measurable function $[a, \infty) \to B$, then the formula (8.110) also yields generalized solutions, by approximation or by duality, to the problem (8.103). Again as usual, one defines solutions to the problem (8.87) with arbitrary $Y$ by shifting, i.e., as a function $\mu(x) = Y + u(x)$, where $u$ solves the problem

$$D^{(\nu)}_{a+*} u(x) = A(x)u(x) + A(x)Y + g(x), \quad u(a) = 0, \quad x \ge a. \tag{8.111}$$

This leads to the following result:


Corollary 12. Under the assumptions of Theorem 8.7.1, if all semigroups generated by $A(x)$ in $B$ are contractions, then the problem (8.87) has the unique generalized solution

$$\mu(x) = Y + E_\nu \int_0^{\sigma_a} \tilde{T} \exp\left\{ \int_0^s A(Z_x(\tau))\, d\tau \right\} \bigl( A(Z_x(s))Y + g(Z_x(s)) \bigr)\, ds, \tag{8.112}$$

for any $Y \in D$ and a bounded measurable curve $g : [a, \infty) \to B$.

Remark 126. (i) If one assumes some regularity on $\nu$, like in Propositions 8.6.1 or 8.6.2, then one can relax the assumptions on $A(x)$. Various additional regularity properties of solutions can be obtained by assuming some smoothing properties of the semigroups $e^{tA(x)}$. (ii) Assuming the existence of a bounded second derivative $A''(x)$ allows for showing that the space $C^1_\infty(\mathbb{R}, B) \cap C_\infty(\mathbb{R}, D)$ is an invariant core for $\Phi_t^{\nu,A}$. (iii) The backward time-ordered exponential $\tilde{T} \exp\{ \int_s^t A(Z_x(\tau))\, d\tau \}$ represents the backward propagator $U^{s,t}$ solving the backward Cauchy problem

$$\dot{f}_s(x) = -A(Z_x(s)) f_s(x), \quad s \le t, \tag{8.113}$$

with the given terminal condition $f_t$, where the family $A(Z_x(t))$ is bounded (as operators $D \to B$), but discontinuous in $t$. However, by the property of Lévy processes, the set of discontinuities is at most countable.

Let us present some examples, extending those of Section 8.2, when the basic formula (8.112) is applicable. For better comparison with Section 8.2, we shall use the letter $t$ for the argument, rather than $x$ used in the present section (where $t$ was used as the time variable in the auxiliary semigroups).

(i) The generalized fractional Schrödinger equation with time-dependent Hamiltonian and a generalized fractional derivative:

$$D^{(\nu)}_{a+*} \psi_t = -iH(t)\psi_t, \tag{8.114}$$

where $H(t)$ is a family of self-adjoint operators in a Hilbert space $\mathcal{H}$ such that the unitary groups generated by $H(t)$ have a common domain $D \subset \mathcal{H}$, where they are regular in the sense of the second condition of (8.100). The most basic concrete examples are Hamiltonians of the form $H(t) = -\Delta + V(t, x)$ with $V(t, \cdot) \in C^2(\mathbb{R}^d)$, where $D$ can be chosen as the Sobolev space $H_2^2(\mathbb{R}^d)$. Similarly, one can deal with the fractional Schrödinger equation with a complex parameter:

$$D^{(\nu)}_{a+*} \psi_t = \sigma H(t)\psi_t, \tag{8.115}$$

if $H$ is a negative operator and $\sigma$ is a complex number with a non-negative real part, and where again a common domain $D \subset \mathcal{H}$ exists such that the semigroups generated by $\sigma H(t)$ are regular. In both cases, formula (8.112) is applicable.


(ii) Generalized fractional Feller evolutions, where each $A(t)$ in (8.111) generates a Feller semigroup in $C_\infty(\mathbb{R}^d)$, again with the additional property that the semigroups generated by $A(t)$ act regularly in their invariant cores $D$ that can often be taken as $C^1_\infty(\mathbb{R}^d)$ (for operators of at most first order) or $C^2_\infty(\mathbb{R}^d)$ (e.g., for diffusions).

(iii) Generalized fractional evolutions generated by ΨDOs with spatially homogeneous symbols (or with constant coefficients):

$$D^{(\nu)}_{a+*} f_t = -\psi_t(-i\nabla) f_t + g_t, \quad f|_{t=a} = f_a, \tag{8.116}$$

under various assumptions on the symbols $\psi_t(p)$, as given, e.g., by Theorems 2.4.2, 4.15.1 or 4.15.4. In this case, propagators solving (8.113) are explicitly constructed via (2.60) and (2.65). For instance, formula (8.110) for the solution to (8.116) with $f_a = 0$ becomes

$$R_0^{A,\nu} g(t, w) = E_\nu \int_0^{\sigma_a} \int_{\mathbb{R}^d} G^{\psi,Z}_{s,0}(w - v)\, g_{Z_t(s)}(v)\, dv\, ds, \tag{8.117}$$

where

$$G^{\psi,Z}_{s,0}(w) = \frac{1}{(2\pi)^d} \int_{\mathbb{R}^d} e^{ipw} \exp\left\{ -\int_0^s \psi_{Z_t(\tau)}(p)\, d\tau \right\} dp. \tag{8.118}$$

8.9 Summary and comments

Fractional and fractal thinking is quickly developing as a new paradigm of tomorrow’s science of complexity, see, e.g., [263] and numerous references therein. For an extensive review of the applications of fractional differential equations in natural sciences, we can refer to [192, 206, 237, 253] and [255]. The book [72] analyses fractional PDEs by first solving the case of constant coefficients via the Fourier transform and then exploiting the method of frozen coefficients. Fractional equations are discussed in many books, see, e.g., [24, 62, 120, 222, 229, 256]. See also [177] specifically for the Banach-space setting. Computational aspects are discussed in [186], for boundary-value problems see [105]. For equations with constant coefficients, one of the basic methods for solving fractional PDEs is based on the Fourier transform, see, e.g., [11]. We used this method for equations with homogeneous symbols in Chapter 4. An important development with respect to fractional equations is the fractional calculus of variations, see [8, 195, 196]. Optimization problems of this theory are formulated in terms of a certain class of fractional equations on bounded domains, the so-called fractional Euler–Lagrange equations. Their analysis seems to have been initiated in [41]. For fractional control problems and related equations, we can refer to [1, 160, 218] and references therein. The theory of the fractional Hamilton–Jacobi–Bellman (HJB) equation as presented here was essentially developed in [160] and [161], where further details can be found. The HJB evolution which is fractional in time arises from the scaling limits of controlled continuous-time random walks. The Euler–Lagrange equations of the fractional calculus of variations and related models of fractional mechanics lead to another type of the HJB equation, for which we refer to [196] and references therein. Related classes of nonlinear fractional equations are analysed in [4]. As mentioned before, generalized fractional calculus is usually developed by extending fractional integrals to integral operators with arbitrary integral kernels and then defining fractional derivatives as the derivatives of these integral operators, see [2, 123–125, 129, 130, 196]. Alternatively, one can use some compositions and mixtures of the standard derivatives, see, e.g., [70]. In this chapter, we used yet another approach, suggested in [152] and motivated by a probabilistic interpretation. It starts with the definition of a generalized fractional derivative, and the generalized fractional operator is then defined as the corresponding potential operator, or in other words, as the fundamental solution. When following this approach, we emphasized the analytic part of the construction. This approach leads to a certain new class of generalized Mittag-Leffler functions that can be represented, as their classical counterpart, as the Laplace transform of positive functions, expressed in terms of the Green function of the corresponding generalized-mixed-fractional-derivative operators. For the general background on the Feynman–Kac formulae, we can refer to [190]. The operator-valued versions that we exploited in this chapter were put forward in [100]. The fractional Schrödinger equation is gaining popularity in the physics community, see, e.g., [28, 179, 211] and references therein. For fractional versions of the wave equations, we refer to [228]. A detailed analysis of fractional Pearson diffusions is given in [185].
Important recent developments in fractional differential equations concern fractional kinetic equations with applications to statistical mechanics and fractional stochastic PDEs. For these developments, we refer to [130, 157, 188, 191, 268, 269] and references therein. The main physical source for fractional equations is their appearance in the scaled limit of continuous-time random walks, see, e.g., [144, 183, 184, 206] and the extensive bibliography therein.

Chapter 9

Appendix

In this final chapter, we provide some useful facts and formulas from analysis as a convenient source of references. For the sake of completeness, we provide most of the proofs, which are either very simple (like in Section 9.1) or not quite standard, except for the basic properties of Euler’s Gamma and Beta functions from Section 9.2 that can be found in many sources. Theorem 9.5.1 may be new – at least, the author did not find an appropriate reference.

9.1 Fixed-point principles

The most handy form of a fixed-point theorem for us will be the following generalized contraction principle, also referred to as the Weissinger fixed-point theorem:

Proposition 9.1.1. If $\Phi$ is a mapping $X \to X$ in a complete metric space $X$ with a metric $\rho$ such that $\rho(\Phi^n(x), \Phi^n(y)) \le \alpha_n \rho(x, y)$ for all $x, y$ with some $\alpha_n$ such that $A = 1 + \sum_{n=1}^{\infty} \alpha_n < \infty$, then $\Phi$ has a unique fixed point $x^*$, $\Phi^n(x)$ converges to $x^*$ for any $x$ and

$$\rho(x, x^*) \le A \rho(x, \Phi(x)). \tag{9.1}$$

Proof. For any $x$, we find $\rho(\Phi^n(x), \Phi^{n+1}(x)) \le \alpha_n \rho(x, \Phi(x))$, and thus

$$\rho(\Phi^n(x), \Phi^m(x)) \le \sum_{k=n}^{m-1} \alpha_k\, \rho(x, \Phi(x)).$$

Consequently, the sequence $\Phi^n(x)$ is Cauchy and hence converges to a point $x^*$, so that

$$\rho(x, x^*) \le \rho(x, \Phi(x)) + \rho(\Phi(x), \Phi^2(x)) + \cdots \le A \rho(x, \Phi(x)).$$

This implies (9.1). Passing to the limit in the inequality $\rho(\Phi^{n+1}(x), \Phi^n(x)) \le \alpha_n \rho(x, \Phi(x))$ yields $\rho(\Phi(x^*), x^*) = 0$. Uniqueness follows from (9.1) by applying it to another fixed point $\tilde{x}^*$, which yields

$$\rho(\tilde{x}^*, x^*) \le A \rho(\tilde{x}^*, \Phi(\tilde{x}^*)) = A \rho(\tilde{x}^*, \tilde{x}^*) = 0.$$
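The role of the summability condition on $\alpha_n$ in Proposition 9.1.1 can be illustrated numerically. The following is a minimal sketch, not taken from the text: it assumes the discretized Volterra map $\Phi(u)(t) = 1 + c\int_0^t u(s)\,ds$ on a uniform grid (left Riemann sums), for which $\alpha_n = (cT)^n/n!$ and $A = e^{cT}$. The names `volterra_map` and `fixed_point` and the parameter values are illustrative assumptions. With $cT = 6$ the map is not a contraction in the sup-norm, yet the iterations converge and the a priori bound (9.1) holds.

```python
import numpy as np

def volterra_map(u, c, h):
    """Phi(u)(t_k) = 1 + c * (left Riemann sum of u over [0, t_k])."""
    integral = np.concatenate(([0.0], np.cumsum(u[:-1]) * h))
    return 1.0 + c * integral

def fixed_point(phi, x0, tol=1e-10, max_iter=500):
    """Iterate x -> phi(x) until successive iterates differ by less than tol."""
    x = x0
    for n in range(1, max_iter + 1):
        x_next = phi(x)
        if np.max(np.abs(x_next - x)) < tol:
            return x_next, n
        x = x_next
    raise RuntimeError("iteration did not converge")

# Illustrative parameters (assumptions): c*T = 6 > 1, so Phi is not a contraction.
c, T, N = 3.0, 2.0, 2000
h = T / N
t = np.linspace(0.0, T, N + 1)
phi = lambda u: volterra_map(u, c, h)

x0 = np.zeros(N + 1)
x_star, n_iter = fixed_point(phi, x0)

# A = 1 + sum_{n>=1} (cT)^n / n! = exp(cT) for this map.
A = np.exp(c * T)
lhs = np.max(np.abs(x0 - x_star))           # rho(x, x*)
rhs = A * np.max(np.abs(x0 - phi(x0)))      # A * rho(x, Phi(x))
print(n_iter, lhs, rhs)
```

The fixed point approximates $e^{ct}$, the solution of $u' = cu$, $u(0) = 1$, up to the discretization error of the Riemann sums.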



The following contraction principle (also referred to as the Banach fixed-point theorem) is a consequence of Proposition 9.1.1.

Proposition 9.1.2. If the mapping $\Phi$ is a contraction, that is $\rho(\Phi(x), \Phi(y)) \le \theta \rho(x, y)$ with $\theta \in (0, 1)$, then $\rho(\Phi^n(x), \Phi^n(y)) \le \theta^n \rho(x, y)$ and the previous assertion applies with $A = \sum_{n=0}^{\infty} \theta^n = 1/(1 - \theta)$.

Unlike the contraction principle that is commonly used in the classical texts on ODEs for proving well-posedness for small times (which then extends to finite times by iterations), Proposition 9.1.1 makes it possible to prove well-posedness directly for arbitrary times, which is especially handy for generalizations that include memory.

Proposition 9.1.3. If $\Phi_1, \Phi_2$ are two mappings $X \to X$ in a complete metric space $X$ such that $\rho(\Phi_j^n(x), \Phi_j^n(y)) \le \alpha_n(j) \rho(x, y)$ for $j = 1, 2$ and all $x, y$ with some $\alpha_n(j)$ such that $A(j) = 1 + \sum_{n=1}^{\infty} \alpha_n(j) < \infty$, and if $\rho(\Phi_1(x), \Phi_2(x)) \le \epsilon$ for all $x$, then

$$\rho(x_1^*, x_2^*) \le \epsilon \min_{j=1,2} A(j) \tag{9.2}$$

for the fixed points $x_j^*$ of the mappings $\Phi_j$.

Proof. Note that

$$\rho(x_1^*, \Phi_2^n(x_1^*)) \le \rho(x_1^*, \Phi_2(x_1^*)) + \cdots + \rho(\Phi_2^{n-1}(x_1^*), \Phi_2^n(x_1^*)) \le A(2) \rho(x_1^*, \Phi_2(x_1^*)) \le A(2)\epsilon.$$

Passing to the limit yields $\rho(x_1^*, x_2^*) \le A(2)\epsilon$. Exchanging the roles of $\Phi_1$ and $\Phi_2$ gives the same bound with $A(1)$.



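Fixed points are also often obtained from the following (generalized) Gronwall’s lemma:

Proposition 9.1.4. If a continuous function $u : \mathbb{R}_+ \to \mathbb{R}_+$ satisfies the inequality

$$u(t) \le a + \int_0^t c(s) u(s)\, ds,$$

where $c(s)$ is an integrable (possibly unbounded) function, then

$$u(t) \le a \exp\left\{ \int_0^t c(s)\, ds \right\}. \tag{9.3}$$

Proof. Using the assumed inequality recursively yields

$$u(t) \le a + a \int_0^t c(s)\, ds + \int_0^t c(s_2) \left( \int_0^{s_2} c(s_1) u(s_1)\, ds_1 \right) ds_2,$$

and by induction we find

$$u(t) \le a \left( 1 + \int_0^t c(s)\, ds + \cdots + \int_{0 \le s_1 \le \cdots \le s_n \le t} c(s_n) \cdots c(s_1)\, ds_1 \cdots ds_n \right) + \int_{0 \le s_1 \le \cdots \le s_{n+1} \le t} c(s_{n+1}) \cdots c(s_1) u(s_1)\, ds_1 \cdots ds_{n+1}.$$

This implies the required estimate by passing to the limit $n \to \infty$, because the integral of $c(s_n) \cdots c(s_1)$ over the simplex equals $\left( \int_0^t c(s)\, ds \right)^n / n!$, and the last term tends to zero since $u$ is bounded on $[0, t]$.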



We shall also use the following discrete version of the above result.

Lemma 9.1.1. Let a positive sequence $\{x_n\}$ of functions on $\mathbb{R}_+$ satisfy the estimates

$$x_n(t) \le a + \int_0^t (b + c x_{n-1}(s))\, ds \tag{9.4}$$

with some non-negative constants $a, b, c$. Then the sequence $\{x_n\}$ is bounded with

$$x_n(t) \le [a + bt] e^{ct} + \frac{x_0 c^n}{n!}. \tag{9.5}$$

Proof. By direct induction, one gets

$$x_n \le a \left( 1 + ct + \cdots + \frac{c^n t^n}{n!} \right) + bt \left( 1 + \frac{ct}{2} + \cdots + \frac{c^n t^n}{(n+1)!} \right) + \frac{c^{n+1} t^{n+1}}{(n+1)!} x_0, \tag{9.6}$$

which implies (9.5).

Exercise 9.1.1. Extend the previous result to the case of an integrable function c(s) instead of a constant c.

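9.2 Special functions

The Euler Gamma and Beta functions are defined for positive arguments as

$$\Gamma(x) = \int_0^\infty e^{-t} t^{x-1}\, dt, \qquad B(x, y) = \int_0^1 t^{x-1} (1 - t)^{y-1}\, dt. \tag{9.7}$$

From the definition, one can trivially derive the formula $\Gamma(x + 1) = x\Gamma(x)$ and the useful integral

$$\int_0^\infty e^{-t p^\beta} p^\omega\, dp = \frac{1}{\beta}\, \Gamma\left( \frac{1 + \omega}{\beta} \right) t^{-(1+\omega)/\beta}. \tag{9.8}$$

Continued to complex $t$ with positive real part, this yields

$$\int_0^\infty \exp\left\{ -t e^{i\phi} p^\beta \right\} p^\omega\, dp = \frac{1}{\beta}\, \Gamma\left( \frac{1 + \omega}{\beta} \right) e^{-i\phi(1+\omega)/\beta}\, t^{-(1+\omega)/\beta}, \tag{9.9}$$

where $t > 0$ and $\phi \in (-\pi/2, \pi/2)$. By a lengthy derivation, one obtains the fundamental relation

$$\Gamma(\beta)\Gamma(1 - \beta) = \pi / \sin(\pi\beta). \tag{9.10}$$

The functions $\Gamma$ and $B$ are linked by the equation

$$\Gamma(x)\Gamma(y) = \Gamma(x + y) B(x, y). \tag{9.11}$$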
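The identities (9.8), (9.10) and (9.11) are easy to sanity-check numerically. The sketch below uses only the Python standard library and a plain midpoint rule; the helper names (`midpoint_integral`, `lhs_9_8`, `rhs_9_8`, `beta_numeric`) and the parameter values are illustrative assumptions, not part of the text.

```python
import math

def midpoint_integral(f, a, b, n):
    """Composite midpoint rule for the integral of f over [a, b]."""
    h = (b - a) / n
    return h * sum(f(a + (k + 0.5) * h) for k in range(n))

def lhs_9_8(t, beta, omega, upper=12.0, n=200_000):
    """Left-hand side of (9.8), truncated to [0, upper] (the tail is negligible here)."""
    return midpoint_integral(lambda p: math.exp(-t * p**beta) * p**omega, 0.0, upper, n)

def rhs_9_8(t, beta, omega):
    """Right-hand side of (9.8)."""
    s = (1.0 + omega) / beta
    return math.gamma(s) * t ** (-s) / beta

def beta_numeric(x, y, n=100_000):
    """B(x, y) from its defining integral in (9.7)."""
    return midpoint_integral(lambda u: u ** (x - 1) * (1 - u) ** (y - 1), 0.0, 1.0, n)

# Illustrative parameter values (assumptions).
t, beta, omega = 0.7, 2.0, 1.5
print(lhs_9_8(t, beta, omega), rhs_9_8(t, beta, omega))            # (9.8)
print(math.gamma(0.3) * math.gamma(0.7), math.pi / math.sin(0.3 * math.pi))  # (9.10)
print(math.gamma(2.5) * math.gamma(1.5), math.gamma(4.0) * beta_numeric(2.5, 1.5))  # (9.11)
```

The exponents were chosen so that the integrands have no singularities at the endpoints, which keeps the midpoint rule accurate.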

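The following well-known formulae express the volume of the unit ball $V_d$ in $\mathbb{R}^d$ and the area of the unit sphere $|S^{d-1}|$ in terms of the $\Gamma$-function:

$$V_d = \frac{2}{d} \frac{\pi^{d/2}}{\Gamma(d/2)} = \frac{\pi^{d/2}}{\Gamma(1 + d/2)} = \frac{1}{d} |S^{d-1}|, \qquad |S^{d-1}| = 2 \frac{\pi^{d/2}}{\Gamma(d/2)}. \tag{9.12}$$

The latter formula includes the case $|S^0| = 2$ (two-point set), since $\Gamma(1/2) = \sqrt{\pi}$.

The Mittag-Leffler functions with a parameter $\alpha > 0$ or with two parameters $\alpha, \beta > 0$ are defined as

$$E_\alpha(x) = \sum_{k=0}^{\infty} \frac{x^k}{\Gamma(\alpha k + 1)}, \qquad E_{\alpha,\beta}(x) = \sum_{k=0}^{\infty} \frac{x^k}{\Gamma(\alpha k + \beta)}. \tag{9.13}$$

These series converge for all $x$, therefore the functions $E_\alpha$ and $E_{\alpha,\beta}$ are analytic in $\mathbb{C}$. As with other analytic functions, the same expansion can be used for defining the function $E(A)$ for bounded linear operators $A$ in a Banach space $B$. Differentiating the series and using $\Gamma(\alpha k + 1) = \Gamma(\alpha k)\alpha k$, we get

$$\frac{d}{dx} E_\alpha(x) = \sum_{k=1}^{\infty} \frac{k x^{k-1}}{\Gamma(\alpha k + 1)} = \sum_{k=1}^{\infty} \frac{x^{k-1}}{\Gamma(\alpha k)\alpha} = \sum_{m=0}^{\infty} \frac{x^m}{\Gamma(\alpha(m + 1))\alpha},$$

in other words,

$$\frac{d}{dx} E_\alpha(x) = \frac{1}{\alpha} E_{\alpha,\alpha}(x). \tag{9.14}$$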
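The series (9.13) can be summed directly for moderate arguments, which gives a quick check of the derivative formula (9.14). The following is a minimal sketch assuming plain truncated summation with the standard-library `math.gamma`; the function name `mittag_leffler`, the truncation rule and the test points are illustrative assumptions, and this naive summation is not suitable for large $|x|$.

```python
import math

def mittag_leffler(x, alpha, beta=1.0, terms=200):
    """Truncated series (9.13): sum of x^k / Gamma(alpha*k + beta).

    The loop stops before math.gamma would overflow (argument above about 170).
    """
    total = 0.0
    for k in range(terms):
        arg = alpha * k + beta
        if arg > 170.0:
            break
        total += x**k / math.gamma(arg)
    return total

# Classical special cases: E_1(x) = exp(x) and E_2(x) = cosh(sqrt(x)).
print(mittag_leffler(1.3, 1.0), math.exp(1.3))
print(mittag_leffler(2.0, 2.0), math.cosh(math.sqrt(2.0)))

# Check (9.14) by a central finite difference at an illustrative point.
alpha, x, h = 0.5, 0.8, 1e-5
numeric_derivative = (mittag_leffler(x + h, alpha) - mittag_leffler(x - h, alpha)) / (2 * h)
formula_derivative = mittag_leffler(x, alpha, alpha) / alpha
print(numeric_derivative, formula_derivative)
```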

Let us derive the so-called Dirichlet formulae that generalize (9.11). Firstly, for $\alpha_1, \ldots, \alpha_n \in (0, 1)$, one has

$$\Gamma(\alpha_1) \cdots \Gamma(\alpha_n) = \Gamma(\alpha_1 + \cdots + \alpha_n) \int_{0 \le s_1 \le \cdots \le s_{n-1} \le 1} ds_1 \cdots ds_{n-1}\, (1 - s_{n-1})^{\alpha_n - 1} (s_{n-1} - s_{n-2})^{\alpha_{n-1} - 1} \cdots (s_2 - s_1)^{\alpha_2 - 1} s_1^{\alpha_1 - 1}. \tag{9.15}$$

The integral in this formula is often referred to as the multinomial Beta function. In order to see that (9.15) holds, let us write

$$\Gamma(\alpha_1) \cdots \Gamma(\alpha_n) = \int_0^\infty \cdots \int_0^\infty e^{-t_1} t_1^{\alpha_1 - 1} \cdots e^{-t_n} t_n^{\alpha_n - 1}\, dt.$$

Then a change to the variables $x_j = t_1 + \cdots + t_j$ yields

$$\Gamma(\alpha_1) \cdots \Gamma(\alpha_n) = \int_0^\infty e^{-x_n}\, dx_n \int_{0 \le x_1 \le \cdots \le x_n} (x_n - x_{n-1})^{\alpha_n - 1} (x_{n-1} - x_{n-2})^{\alpha_{n-1} - 1} \cdots (x_2 - x_1)^{\alpha_2 - 1} x_1^{\alpha_1 - 1}\, dx_1 \cdots dx_{n-1},$$

which implies (9.15) after yet another change of variables $x_j = x_n s_j$, $j = 1, \ldots, n - 1$. Another version of the Dirichlet formula is

$$\int_{s \le s_1 \le \cdots \le s_n \le t} ds_1 \cdots ds_n\, (s_n - s_{n-1})^{\alpha_n - 1} \cdots (s_1 - s)^{\alpha_1 - 1} = (t - s)^{\alpha_1 + \cdots + \alpha_n} \frac{\Gamma(\alpha_1) \cdots \Gamma(\alpha_n)}{\Gamma(\alpha_1 + \cdots + \alpha_n + 1)}. \tag{9.16}$$

For the proof, one writes the l.h.s. as the repeated integral

$$\int_0^{t-s} d\tau \int_{0 \le x_1 \le \cdots \le x_{n-1} \le \tau} dx_1 \cdots dx_{n-1}\, (\tau - x_{n-1})^{\alpha_n - 1} \cdots (x_2 - x_1)^{\alpha_2 - 1} x_1^{\alpha_1 - 1}$$

$$= \int_0^{t-s} \tau^{\alpha_1 + \cdots + \alpha_n - 1}\, d\tau \times \int_{0 \le s_1 \le \cdots \le s_{n-1} \le 1} ds_1 \cdots ds_{n-1}\, (1 - s_{n-1})^{\alpha_n - 1} \cdots (s_2 - s_1)^{\alpha_2 - 1} s_1^{\alpha_1 - 1}.$$

Then (9.16) follows by applying (9.15).

9.3 Asymptotics of the Fourier transform: power functions and their exponents

Let us begin with the Fourier transform of power functions. Since $x_+^\omega$ for any real $\omega$ is not integrable, its Fourier transform cannot be defined in the classical sense. For $\omega > -1$, however, the function $x_+^\omega$ is locally integrable, and can therefore be considered an element of the space of tempered distributions $S'(\mathbb{R})$. Hence its Fourier transform can be defined in the sense of generalized functions. In the most elementary approach, one can define the Fourier integral via certain regularizations. For instance, for $\omega > -1$ and $p > 0$, one can define

$$\int_0^\infty r^\omega e^{\pm irp}\, dr = \lim_{\epsilon \to 0+} \int_0^\infty r^\omega e^{\pm ir(p \pm i\epsilon)}\, dr. \tag{9.17}$$

Note that it is only for $\omega \in (-1, 0)$ that the l.h.s. is also defined as an improper Riemann integral.

Proposition 9.3.1. Let $\omega > -1$. Then, for the integral defined by (9.17),

$$\int_0^\infty r^\omega e^{\pm irp}\, dr = p^{-1-\omega}\, \Gamma(1 + \omega) \exp\left\{ \pm i \frac{\pi}{2} (1 + \omega) \right\} \tag{9.18}$$

for $p > 0$, or equivalently

$$\int_0^\infty r^\omega e^{irp}\, dr = |p|^{-1-\omega}\, \Gamma(1 + \omega) \exp\left\{ i \frac{\pi}{2} (1 + \omega)\, \mathrm{sgn}\, p \right\} \tag{9.19}$$

for real $p \ne 0$, where $\mathrm{sgn}\, p = \mathrm{sgn}(p)$ denotes the sign of the real number $p$. If $\tau > 0$, then

$$\int_0^\infty r^\omega e^{-r(ip + \tau)}\, dr = (ip + \tau)^{-1-\omega}\, \Gamma(1 + \omega), \tag{9.20}$$

where $\arg(ip + \tau)$ (which is needed for a proper definition of the r.h.s. of (9.20)) is chosen in its main branch, that is, $-\pi/2 < \arg(ip + \tau) < \pi/2$.

Proof. For $\omega > -1$ and $q > 0$, we have

$$\int_0^\infty r^\omega e^{-rq}\, dr = q^{-1-\omega}\, \Gamma(1 + \omega). \tag{9.21}$$

Since the functions on both sides are analytic in the complex half-plane $\{\mathrm{Re}\, q > 0\}$, they must coincide there. This implies (9.20). The r.h.s. of (9.20) has a limit as $q = \mp ip + \tau$ tends to any non-vanishing point on the imaginary axis, that is, as $\tau \to 0$ for a fixed $p > 0$. Consequently, choosing $q = \mp ip = p \exp\{\mp i\pi/2\}$ yields (9.18) and hence (9.19). □

Proposition 9.3.2. If $\alpha \in (0, 1)$ and $p$ is real, $p \ne 0$, then

$$\int_0^\infty (e^{irp} - 1) \frac{dr}{r^{1+\alpha}} = -\frac{\Gamma(1 - \alpha)}{\alpha}\, e^{-i\pi\alpha\, \mathrm{sgn}\, p / 2}\, |p|^\alpha = \Gamma(-\alpha)\, e^{-i\pi\alpha\, \mathrm{sgn}\, p / 2}\, |p|^\alpha. \tag{9.22}$$

Proof. Integration by parts yields

$$\int_0^\infty (e^{irp} - 1) \frac{dr}{r^{1+\alpha}} = \frac{ip}{\alpha} \int_0^\infty r^{-\alpha} e^{irp}\, dr$$

for any real $p$. By (9.18), this implies (9.22), where we also used the identity $\Gamma(1 - \alpha) = -\alpha\Gamma(-\alpha)$. □
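Formula (9.20) of Proposition 9.3.1 and its boundary limit (9.18) can be checked numerically. The sketch below is an illustration under stated assumptions: it uses a truncated midpoint rule, the helper names `laplace_power_integral` and `closed_form` and the parameter values are hypothetical, and Python's complex power is relied on to select the principal branch required in (9.20).

```python
import cmath
import math

def laplace_power_integral(omega, p, tau, upper=40.0, n=400_000):
    """Midpoint-rule approximation of the l.h.s. of (9.20), truncated to [0, upper]."""
    h = upper / n
    q = complex(tau, p)  # q = tau + i p
    total = 0j
    for k in range(n):
        r = (k + 0.5) * h
        total += r**omega * cmath.exp(-r * q)
    return total * h

def closed_form(omega, p, tau):
    """R.h.s. of (9.20); complex ** float uses the principal branch of the power."""
    q = complex(tau, p)
    return q ** (-1.0 - omega) * math.gamma(1.0 + omega)

# Illustrative parameters (assumptions).
omega, p, tau = 0.5, 2.0, 1.0
numeric = laplace_power_integral(omega, p, tau)
exact = closed_form(omega, p, tau)
print(numeric, exact)

# Letting tau -> 0 recovers (9.18) with the lower sign (kernel e^{-irp}).
boundary = p ** (-1.0 - omega) * math.gamma(1.0 + omega) * cmath.exp(-1j * math.pi * (1.0 + omega) / 2)
print(closed_form(omega, p, 1e-9), boundary)
```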


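Proposition 9.3.3. Let $p \in \mathbb{R}$ and $p \ne 0$. If $\alpha \in (1, 2)$, then

$$\int_0^\infty \frac{e^{irp} - 1 - irp}{r^{1+\alpha}}\, dr = -\frac{\Gamma(1 - \alpha)}{\alpha}\, |p|^\alpha e^{-i\pi\alpha\, \mathrm{sgn}\, p / 2} = \Gamma(-\alpha)\, |p|^\alpha e^{-i\pi\alpha\, \mathrm{sgn}\, p / 2}. \tag{9.23}$$

More generally, if $\alpha \in (k, k + 1)$ with a natural $k$, then

$$\int_0^\infty \left( e^{irp} - 1 - irp - \cdots - \frac{(irp)^k}{k!} \right) \frac{dr}{r^{1+\alpha}} = -\frac{\Gamma(1 - \alpha)}{\alpha}\, |p|^\alpha e^{-i\pi\alpha\, \mathrm{sgn}\, p / 2}. \tag{9.24}$$

Proof. Integration by parts yields

$$\int_0^\infty \left( e^{irp} - 1 - irp - \cdots - \frac{(irp)^k}{k!} \right) \frac{dr}{r^{1+\alpha}} = \frac{ip}{\alpha} \int_0^\infty \left( e^{irp} - 1 - irp - \cdots - \frac{(irp)^{k-1}}{(k - 1)!} \right) \frac{dr}{r^\alpha}$$

for any real $p$. Therefore, (9.24) is obtained from (9.22) by induction.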



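All the above formulae have natural extensions to finite dimensions. The following result is a direct corollary of the formulae (9.19) and (9.24).

Proposition 9.3.4. Let $\mu(ds)$ be a measure on the unit sphere $S^{d-1}$ in $\mathbb{R}^d$ with $d > 1$, $p \in \mathbb{R}^d$ and $p \ne 0$.

(i) Let $\omega \in (-1, 0)$. Then

$$\int_0^\infty \int_{S^{d-1}} |y|^\omega e^{i(y,p)}\, d|y|\, \mu(d\bar{y}) = \Gamma(1 + \omega) \int_{S^{d-1}} |(p, \bar{y})|^{-1-\omega}\, e^{i\pi(1+\omega)\, \mathrm{sgn}(p, \bar{y}) / 2}\, \mu(d\bar{y}), \tag{9.25}$$

where $\bar{y} = y/|y| \in S^{d-1}$. If $\mu$ is symmetric, i.e., invariant with respect to the inversion $s \to -s$, then

$$\int_0^\infty \int_{S^{d-1}} |y|^\omega e^{i(y,p)}\, d|y|\, \mu(d\bar{y}) = \Gamma(1 + \omega) \int_{S^{d-1}} |(p, \bar{y})|^{-1-\omega} \cos(\pi(1 + \omega)/2)\, \mu(d\bar{y}). \tag{9.26}$$

(ii) Let $\alpha \in (k, k + 1)$ with a natural $k$. Then

$$\int_0^\infty \int_{S^{d-1}} \left( e^{i(y,p)} - 1 - i(y, p) - \cdots - \frac{i^k (y, p)^k}{k!} \right) \frac{d|y|\, \mu(d\bar{y})}{|y|^{1+\alpha}} = \Gamma(-\alpha) \int_{S^{d-1}} e^{-i\pi\alpha\, \mathrm{sgn}(p, \bar{y}) / 2}\, |(p, \bar{y})|^\alpha\, \mu(d\bar{y}). \tag{9.27}$$

If $\mu$ is symmetric, then the last expression simplifies to

$$\Gamma(-\alpha) \cos(\pi\alpha/2) \int_{S^{d-1}} |(p, \bar{y})|^\alpha\, \mu(d\bar{y}). \tag{9.28}$$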

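Next, we need the Fourier transform of exponentials of power functions,

$$I^\beta_{\sigma\omega}(x) = \int_0^\infty \exp\left\{ ixp - \sigma p^\beta \right\} p^\omega\, dp = \int_0^\infty \exp\left\{ ixp - |\sigma| e^{i\psi} p^\beta \right\} p^\omega\, dp, \tag{9.29}$$

for $x \in \mathbb{R}$, where $\sigma = \sigma_r + i\sigma_i = |\sigma| e^{i\psi}$, and $\omega, \beta$ are real constants. These integrals can usually not be calculated in a closed form, but we are interested in the main term of the asymptotics for large $x$. Note that changing the integration variable yields

$$I^\beta_{\sigma\omega}(x) = \frac{1}{|x|^{1+\omega}} \int_0^\infty \exp\left\{ ip\, \mathrm{sgn}(x) - \sigma p^\beta |x|^{-\beta} \right\} p^\omega\, dp. \tag{9.30}$$

Proposition 9.3.5. Let $\omega > -1$, $\beta > 0$. Let $\sigma_r > 0$, or equivalently $\psi \in (-\pi/2, \pi/2)$. Then

$$|I^\beta_{\sigma\omega}(x)| \le C^{1+\omega} \frac{1}{|x|^{1+\omega}}\, \Gamma(1 + \omega). \tag{9.31}$$

More precisely,

$$I^\beta_{\sigma\omega}(x) = \frac{\Gamma(1 + \omega)}{|x|^{1+\omega}}\, i\, \mathrm{sgn}(x) \exp\{ i\, \mathrm{sgn}(x) \pi\omega/2 \} + \tilde{I}^\beta_{\sigma\omega}(x) \tag{9.32}$$

with

$$|\tilde{I}^\beta_{\sigma\omega}(x)| \le C^{1+\omega+\beta} \frac{|\sigma|}{|x|^{1+\omega+\beta}}\, \Gamma(1 + \omega + \beta), \tag{9.33}$$

where $C$ can be taken either as $C = 1$ or as $2\beta|\sigma|/(\pi\sigma_r)$ depending on certain relations between $\sigma$ and $\beta$ that are made explicit in the proof below. (Namely, when the rotation angle $\phi = \min(\phi_0, \pi/2)$ equals $\pi/2$ or $\phi_0$, respectively.) In particular, if $\beta \ge 2$, the second case is realized.

Proof. The idea is to rotate the ray of integration (which is $\mathbb{R}_+$) by the angle $\pm\phi$, where the sign $\pm$ corresponds to $\mathrm{sgn}(x)$ and denotes the anticlockwise or clockwise rotation, respectively (so that the real part of the number $ip\, \mathrm{sgn}(x)$ is negative throughout the rotation). Also, $\phi \in (0, \pi/2]$ is chosen as close as possible to $\pi/2$, with the restriction that the real part of $\sigma p^\beta$ must remain positive throughout the rotation. These conditions justify that a rotation according to the Cauchy theorem of complex analysis can be performed. On the ray $p = re^{\pm i\phi} = re^{i\phi\, \mathrm{sgn}(x)}$, $r > 0$, the real and imaginary parts of the complex number $\sigma p^\beta$ equal

$$r^\beta \xi_r = r^\beta [\sigma_r \cos(\beta\phi) - \mathrm{sgn}(x) \sigma_i \sin(\beta\phi)], \qquad r^\beta \xi_i = r^\beta [\sigma_i \cos(\beta\phi) + \mathrm{sgn}(x) \sigma_r \sin(\beta\phi)].$$

The value $\xi_r$ is positive for small $\phi$, because $\sigma_r > 0$. By monotonicity, if $\mathrm{sgn}(x)\sigma_i \ge 0$, there exists a unique $\phi_0 \in (0, \pi/(2\beta)]$ so that $\xi_r = 0$, and if $\mathrm{sgn}(x)\sigma_i < 0$, there exists a unique $\phi_0 \in (\pi/(2\beta), \pi/\beta)$ so that $\xi_r = 0$. This $\phi_0$ is specified by the equations

$$\sin(\phi_0 \beta) = \sigma_r/|\sigma|, \qquad \cos(\phi_0 \beta) = \mathrm{sgn}(x) \sigma_i/|\sigma|.$$

Therefore, the right choice of $\phi$ is $\phi = \min(\phi_0, \pi/2)$. If $\beta \ge 2$, then $\pi/\beta \le \pi/2$, which implies $\phi = \phi_0$. If $\beta < 2$, then $\phi = \pi/2$ whenever $\sigma_r \cos(\beta\pi/2) - \mathrm{sgn}(x) \sigma_i \sin(\beta\pi/2) \ge 0$. Performing the rotation and turning to the real variable $r$ on the ray $p = re^{\pm i\phi}$ that is obtained by the rotation, we find

$$I^\beta_{\sigma\omega}(x) = \frac{e^{i\phi(1+\omega)\, \mathrm{sgn}(x)}}{|x|^{1+\omega}} \int_0^\infty \exp\left\{ i\, \mathrm{sgn}(x)\, r e^{i\phi\, \mathrm{sgn}(x)} - \frac{r^\beta}{|x|^\beta} (\xi_r + i\xi_i) \right\} r^\omega\, dr, \tag{9.34}$$

where $\xi_r = 0$ and $\xi_i = \mathrm{sgn}(x)|\sigma|$, if $\phi = \phi_0$, and

$$\xi_r = \sigma_r \cos(\beta\pi/2) - \mathrm{sgn}(x) \sigma_i \sin(\beta\pi/2) > 0, \qquad \xi_i = \sigma_i \cos(\beta\pi/2) + \mathrm{sgn}(x) \sigma_r \sin(\beta\pi/2)$$

otherwise. In both cases, we find $\xi_r^2 + \xi_i^2 = \sigma_r^2 + \sigma_i^2 = |\sigma|^2$. Therefore,

$$|I^\beta_{\sigma\omega}(x)| \le \frac{1}{|x|^{1+\omega}} \int_0^\infty e^{-r \sin\phi} r^\omega\, dr = \frac{1}{|x|^{1+\omega}}\, \Gamma(1 + \omega) (\sin\phi)^{-(1+\omega)},$$

which yields (9.31) with $C = (\sin\phi)^{-1}$. On the other hand, using the elementary inequality $|\exp\{-(x + iy)\} - 1| \le |x + iy|$ that is valid for any $x > 0$ and $y$, we obtain

$$I^\beta_{\sigma\omega}(x) = \frac{e^{i\phi(1+\omega)\, \mathrm{sgn}(x)}}{|x|^{1+\omega}} \int_0^\infty \exp\left\{ i\, \mathrm{sgn}(x)\, r e^{i\phi\, \mathrm{sgn}(x)} \right\} r^\omega\, dr + \tilde{I}^\beta_{\sigma\omega}(x),$$

where

$$|\tilde{I}^\beta_{\sigma\omega}(x)| \le \frac{1}{|x|^{1+\omega+\beta}} \int_0^\infty \exp\{-r \sin\phi\}\, r^{\beta+\omega} \sqrt{\xi_r^2 + \xi_i^2}\, dr = \frac{|\sigma|}{|x|^{1+\omega+\beta}} \int_0^\infty \exp\{-r \sin\phi\}\, r^{\beta+\omega}\, dr = \frac{|\sigma|}{|x|^{1+\omega+\beta}}\, \Gamma(1 + \omega + \beta) (\sin\phi)^{-(1+\omega+\beta)},$$

which yields (9.33) with $C = (\sin\phi)^{-1}$.

Using (9.21) with $q = -i\, \mathrm{sgn}(x) e^{i\phi\, \mathrm{sgn}(x)} = \exp\{i\, \mathrm{sgn}(x)(\phi - \pi/2)\} = \sin\phi - i\, \mathrm{sgn}(x) \cos\phi$ (which has a positive real part) yields

$$\frac{e^{i\phi(1+\omega)\, \mathrm{sgn}(x)}}{|x|^{1+\omega}} \int_0^\infty \exp\left\{ i\, \mathrm{sgn}(x)\, r e^{i\phi\, \mathrm{sgn}(x)} \right\} r^\omega\, dr = \frac{e^{i\phi(1+\omega)\, \mathrm{sgn}(x)}}{|x|^{1+\omega}}\, \Gamma(1 + \omega) \exp\{-i\, \mathrm{sgn}(x)(\phi - \pi/2)(1 + \omega)\} = \frac{\Gamma(1 + \omega)}{|x|^{1+\omega}} \exp\{i\, \mathrm{sgn}(x) \pi(1 + \omega)/2\}, \tag{9.35}$$

which implies (9.32). Moreover, if $\phi = \pi/2$, then

$$|\tilde{I}^\beta_{\sigma\omega}(x)| \le \frac{|\sigma|}{|x|^{1+\omega+\beta}}\, \Gamma(1 + \omega + \beta).$$

If $\phi = \phi_0$, then $\sin(\phi\beta) = \sigma_r/|\sigma|$, and therefore

$$\sin\phi > \pi\phi/2 > \pi \arcsin(\sigma_r/|\sigma|)/(2\beta) > \frac{\pi\sigma_r}{2\beta|\sigma|},$$

which yields the required estimate for $C$.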
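The leading term in (9.32) can be compared with a direct numerical evaluation of (9.29). The sketch below assumes the illustrative choice $\beta = 2$, $\sigma = 1$, $\omega = 0$, $x = 20$, for which the main term equals $i/x$ and the remainder should be of order $|x|^{-3}$; the helper names `fourier_exp_power` and `main_term` and the quadrature parameters are hypothetical.

```python
import cmath
import math

def fourier_exp_power(x, sigma, beta, omega, upper=8.0, n=200_000):
    """Midpoint-rule approximation of (9.29), truncated to [0, upper]."""
    h = upper / n
    total = 0j
    for k in range(n):
        p = (k + 0.5) * h
        total += cmath.exp(1j * x * p - sigma * p**beta) * p**omega
    return total * h

def main_term(x, omega):
    """Leading term of (9.32)."""
    s = math.copysign(1.0, x)
    return (math.gamma(1.0 + omega) / abs(x) ** (1.0 + omega)) * 1j * s * cmath.exp(1j * s * math.pi * omega / 2)

# Illustrative parameters (assumptions).
x, sigma, beta, omega = 20.0, 1.0, 2.0, 0.0
value = fourier_exp_power(x, sigma, beta, omega)
leading = main_term(x, omega)
remainder = abs(value - leading)

print(value, leading)
# The remainder scaled by |x|^(1 + omega + beta) should stay bounded, in line with (9.33).
print(remainder * abs(x) ** (1.0 + omega + beta))
```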

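Remark 127. Expanding the exponent $\exp\{-r^\beta (\xi_r + i\xi_i)/|x|^\beta\}$ in (9.34) into a power series yields the full asymptotic expansion of $I^\beta_{\sigma\omega}(x)$ in power series of $|x|^{-\beta}$.

An extension of Proposition 9.3.5 concerns a variation or a mixing of $\beta$ and $\sigma$. Namely, extending (9.29), let us consider the integral

$$I^\beta_{\sigma\omega}(x) = \int_0^\infty \exp\left\{ ixp - \int_s^t \sigma_\tau p^{\beta_\tau}\, \mu(d\tau) \right\} p^\omega\, dp, \tag{9.36}$$

with continuous curves $\beta_t \in \mathbb{R}_+$, $\sigma_t \in \mathbb{C}$.

Proposition 9.3.6. Let $\omega > -1$. Let $\beta(\tau)$ be a continuous curve $[s, t] \to [b_{min}, b_{max}]$ with $b_{min} > 0$, let $\sigma(\tau) = \sigma_r(\tau) + i\sigma_i(\tau)$ be a continuous curve $[s, t] \to \mathbb{C}$ whose real part $\sigma_r(\tau)$ is bounded from below by a constant $\sigma > 0$ and whose magnitude $|\sigma(\tau)|$ is bounded from above by a constant $\Sigma$, and let $\mu(d\tau)$ be an arbitrary finite (non-negative) Borel measure on $[s, t]$ (for instance, a discrete one). Then

$$|I^\beta_{\sigma\omega}(x)| \le \frac{C}{|x|^{1+\omega}} \tag{9.37}$$

with a constant $C$ depending on $b_{max}$, $\omega$ and the ratio $\Sigma/\sigma$. More precisely,

$$I^\beta_{\sigma\omega}(x) = \frac{\Gamma(1 + \omega)}{|x|^{1+\omega}}\, i\, \mathrm{sgn}(x) \exp\{i\, \mathrm{sgn}(x) \pi\omega/2\} + \tilde{I}^\beta_{\sigma\omega}(x) \tag{9.38}$$

with

$$|\tilde{I}^\beta_{\sigma\omega}(x)| \le C \int_s^t \frac{|\sigma(\tau)|\, \mu(d\tau)}{|x|^{1+\omega+\beta_\tau}}, \tag{9.39}$$

where the constant $C$ depends on $b_{min}$, $b_{max}$, $\omega$ and $\Sigma/\sigma$.

Proof. For this proof, the arguments used in Proposition 9.3.5 can be directly extended. The rotation angle $\phi$ is defined as follows: $\phi = \min(\phi_0, \pi/2)$ with

$$\phi_0 = \inf\{\psi : \exists \tau \in \mathrm{supp}\, \mu : \sigma_r(\tau) \cos(\beta(\tau)\psi) - \mathrm{sgn}(x) \sigma_i(\tau) \sin(\beta(\tau)\psi) = 0\}.$$

After the rotation, the integral turns into

$$I^\beta_{\sigma\omega}(x) = \frac{e^{i\phi(1+\omega)\, \mathrm{sgn}(x)}}{|x|^{1+\omega}} \int_0^\infty \exp\{i\, \mathrm{sgn}(x)\, r e^{i\phi\, \mathrm{sgn}(x)} - (\xi_r + i\xi_i)\}\, r^\omega\, dr, \tag{9.40}$$

where

$$\xi_r = \int_s^t \frac{r^{\beta(\tau)}}{|x|^{\beta(\tau)}} [\sigma_r(\tau) \cos(\beta(\tau)\phi) - \mathrm{sgn}(x) \sigma_i(\tau) \sin(\beta(\tau)\phi)]\, d\tau,$$

$$\xi_i = \int_s^t \frac{r^{\beta(\tau)}}{|x|^{\beta(\tau)}} [\sigma_i(\tau) \cos(\beta(\tau)\phi) + \mathrm{sgn}(x) \sigma_r(\tau) \sin(\beta(\tau)\phi)]\, d\tau.$$

Therefore, we obtain (9.38) with

$$|\tilde{I}^\beta_{\sigma\omega}(x)| \le \frac{1}{|x|^{1+\omega}} \int_0^\infty e^{-r \sin\phi}\, r^\omega \sqrt{\xi_r^2 + \xi_i^2}\, dr.$$

The equation

$$[\sigma_r(\tau) \cos(\beta(\tau)\phi) - \mathrm{sgn}(x) \sigma_i(\tau) \sin(\beta(\tau)\phi)]^2 + [\sigma_i(\tau) \cos(\beta(\tau)\phi) + \mathrm{sgn}(x) \sigma_r(\tau) \sin(\beta(\tau)\phi)]^2 = |\sigma(\tau)|^2$$

yields

$$\sqrt{\xi_r^2 + \xi_i^2} \le |\xi_r| + |\xi_i| \le 2 \int_s^t \frac{r^{\beta(\tau)}}{|x|^{\beta(\tau)}}\, |\sigma(\tau)|\, d\tau.$$

Estimating $r^{\beta(\tau)}$ by $r^{b_{max}}$ and $r^{b_{min}}$ for $r > 1$ and $r < 1$, respectively, yields (9.39).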

9.4 Asymptotics of the Fourier transform: functions of power growth

In this section, we briefly discuss a branch of analysis that deals with the relation between the asymptotic behaviour of functions and their Fourier transform. Such relations are often referred to as Tauberian theorems (see [37] for the full story). We present here only some more or less elementary results. We start with two basic lemmas of asymptotic analysis, which are presented under more general assumptions than usual. Moreover, they give emphasis to the main terms of the asymptotics with precise error estimates, rather than to the corresponding asymptotic expansions. The first of these lemmas is Watson’s lemma.

Lemma 9.4.1. Let $a, \beta, \alpha, \lambda > 0$, and let $f \in C([0, a])$ be such that $|f(x) - f(0)| \le Kx$. Then

$$\int_0^a x^{\beta-1} f(x) \exp\{-\lambda x^\alpha\}\, dx = \frac{1}{\alpha}\, \Gamma(\beta/\alpha) f(0) \lambda^{-\beta/\alpha} + O(\lambda^{-(\beta+1)/\alpha}), \tag{9.41}$$

where

$$|O(\lambda^{-(\beta+1)/\alpha})| \le C(a, \beta, \alpha) K \lambda^{-(\beta+1)/\alpha} \tag{9.42}$$

with a constant $C(a, \beta, \alpha)$.

Proof. This can be seen by writing $f(x) = f(0) + (f(x) - f(0))$ and using (9.8) for each term. Further details are available in many textbooks on calculus. □

The following Erdélyi’s lemma (in a modified version) is more involved:

Lemma 9.4.2. Let $a > 0$, $\beta \in (0, 1)$ and $p$ a complex number with a non-negative imaginary part. Let $f \in C([0, a]) \cap C^1((0, a])$ be such that $f(a) = 0$ and $g(x) = x^{\beta-1}(f(x) - f(0))$ has an integrable derivative. Then

$$I = \int_0^a x^{\beta-1} f(x) \exp\{ipx\}\, dx = \Gamma(\beta)(-ip)^{-\beta} f(0) + O(p^{-1}), \tag{9.43}$$

where $(-ip)^{-\beta}$ is a branch of an analytic function on the half-plane of $p$ with non-negative imaginary part such that $(-ip)^{-\beta} = \lambda^{-\beta}$ for $p = i\lambda$ with a positive $\lambda$, and where

$$|O(p^{-1})| \le C(a, \beta)\, \|g'\|_{L^1[0,a]}\, |p|^{-1} \tag{9.44}$$

with a constant $C(a, \beta)$. For instance, for real $p$,

$$I = \int_0^a x^{\beta-1} f(x) \exp\{ipx\}\, dx = e^{i\pi\beta\, \mathrm{sgn}(p)/2}\, \Gamma(\beta) |p|^{-\beta} f(0) + O(p^{-1}). \tag{9.45}$$

Remark 128. The assumption that $x^{\beta-1}(f(x) - f(0))$ has an integrable derivative is exactly the assumption that is needed for our main application below. It is much weaker than the requirement that $f$ itself has an integrable derivative.

Proof. Let $p = qe^{i\psi}$ with $q > 0$ and $\psi \in [0, \pi]$. First, we prove the assertion under the additional assumption that $f(x) = 1$ for $x \in [0, \delta]$ with some $\delta \in (0, a)$. For that purpose, the trick is to decompose the integral $I$ into a sum of two integrals over the segments $[0, \delta]$ and $[\delta, a]$, respectively. To the first integral, which does not depend on $f$, one can apply the Cauchy theorem for complex variables and replace it by a sum of integrals over the segment $x = ie^{-i\psi}y$, $y \in [0, \delta]$, and the part of a circle that joins the points $ie^{-i\psi}\delta$ and $\delta$, respectively. Let us assume for the sake of definiteness that $\psi \in [0, \pi/2]$. Then this quarter-circle can be parametrized as $x = \delta e^{i\phi}$, $\phi \in (0, (\pi/2) - \psi)$, to be integrated over $\phi$ in the opposite direction. This yields $I = I_1 + I_2 + I_3$ with

$$I_1 = e^{i\pi\beta/2} e^{-i\psi\beta} \int_0^\delta y^{\beta-1} e^{-qy}\, dy,$$

$$I_2 = -i \int_0^{(\pi/2)-\psi} \delta^\beta e^{i\phi\beta} \exp\{iq\delta e^{i(\phi+\psi)}\}\, d\phi,$$

$$I_3 = \int_\delta^a x^{\beta-1} f(x) \exp\{iqxe^{i\psi}\}\, dx.$$

By Watson’s lemma, we find that

$$I_1 = e^{i\pi\beta/2} e^{-i\psi\beta}\, \Gamma(\beta) q^{-\beta} + O(q^{-1}) = \Gamma(\beta)(-ip)^{-\beta} + O(q^{-1-\beta}).$$

Integration by parts shows that $I_2$ and $I_3$ are of order $1/q$, which completes the proof in this case. Let us now turn to a general $f$. For that purpose, let $\chi(x)$ be a mollifier, i.e., an infinitely differentiable function $\mathbb{R}_+ \to [0, 1]$ that has the value 1 in a neighbourhood of zero and vanishes for $x \ge a$. If we change $f$ to $(1 - \chi)f$ in (9.43), the integrand vanishes near zero and integration by parts shows that it is of order $1/p$. Therefore, it is sufficient to prove the result for $f\chi$ instead of $f$. For this, we get

$$\int_0^a x^{\beta-1} f(0)\chi(x) \exp\{ipx\}\, dx = e^{i\pi\beta/2}\, \Gamma(\beta) p^{-\beta} f(0) + O(p^{-1})$$

by the above result. It remains to show that the integral

$$\int_0^a g(x)\chi(x) \exp\{ipx\}\, dx$$

is of order $O(1/p)$. Here, $g(x)\chi(x)$ is a continuous function that vanishes at the boundaries $x = 0, a$ and has an integrable derivative. Integration by parts shows that the boundary terms vanish, and we get the integral of order $O(1/p)$, as claimed. □

The next statement reflects the main application of Erdélyi’s lemma that we are interested in.

Proposition 9.4.1. Let $\alpha \in (0, 1)$ and $p \in \mathbb{R}$, $p \ne 0$. Moreover, let $\nu(y)$ be a continuous complex-valued function on $y > 0$ such that $G(y) = \int_y^\infty \nu(z)\, dz$ is well defined for $y > 0$, that there exists a finite limit

$$\kappa = \lim_{y \to 0} y^\alpha G(y), \tag{9.46}$$

and that the function

$$\frac{d}{dy} (G(y) - \kappa y^{-\alpha}) = \kappa\alpha y^{-\alpha-1} - \nu(y) \tag{9.47}$$

is integrable around the origin. Then the inverse Fourier transform of $G$ is an analytic function in the complex open upper half-plane $\{\mathrm{Im}\, p > 0\}$, has a continuous extension to $p \in \mathbb{R} \setminus \{0\}$ and it holds

$$\int_0^\infty e^{ipy} G(y)\, dy = \kappa\Gamma(1 - \alpha)(-ip)^{\alpha-1} + O(1/p), \tag{9.48}$$

where, for any $a > 0$,

$$|O(1/p)| \le C(a, \alpha)\, \|\kappa\alpha y^{-\alpha-1} - \nu(y)\|_{L^1([0,a])}\, |p|^{-1} \tag{9.49}$$

with a constant $C(a, \alpha)$. In particular, for real $p$,

$$\int_0^\infty e^{ipy} G(y)\, dy = e^{i\pi(1-\alpha)\, \mathrm{sgn}(p)/2}\, \Gamma(1 - \alpha) \kappa |p|^{-(1-\alpha)} + O(1/p). \tag{9.50}$$

Remark 129. If the function (9.47) tends to zero as $y \to 0$, then

$$\kappa\alpha = \lim_{y \to 0} y^{\alpha+1} \nu(y), \tag{9.51}$$

which implies (9.46) by L’Hôpital’s rule.

Proof. Notice first that the integral on the l.h.s. of (9.48) is well defined as an improper Riemann integral (i.e., as the limit of integrals over $[0, K]$, $K \to \infty$), because of

$$\int_0^K e^{ipy} G(y)\, dy = \frac{1}{ip} \int_0^K (e^{ipy} - 1)\nu(y)\, dy + \frac{1}{ip} (e^{ipK} - 1) G(K)$$

and $\lim_{y \to \infty} G(y) = 0$. Choosing a mollifier $\chi(y)$, i.e., an infinitely differentiable function $\mathbb{R}_+ \to [0, 1]$ such that $\chi(y) = 1$ for $y \le a/2$ and $\chi(y) = 0$ for $y \ge a$ with some (arbitrarily chosen) $a > 0$, we see that

$$\int_0^\infty e^{ipy} G(y)(1 - \chi(y))\, dy = -\frac{1}{ip} \int_{a/2}^\infty e^{ipy} [\nu(y)(1 - \chi(y)) - G(y)\chi'(y)]\, dy,$$

which is of order $1/|p|$. Therefore, it is sufficient to prove the proposition for $\chi(y)G(y)$ instead of $G(y)$. We can write $\chi(y)G(y) = y^{-\alpha} f(y)$ with a function $f$ that has a compact support such that $f(0) = \kappa$, and $y^{-\alpha}(f(y) - f(0)) = G(y)\chi(y) - y^{-\alpha}\kappa$ has an integrable derivative by (9.47). Hence estimate (9.48) follows from Erdélyi’s lemma. □

9.4. Asymptotics of the Fourier transform: functions of power growth

495

The following asymptotic result should be compared with the exact formulae (9.3.2). Proposition 9.4.2. Let α ∈ (0, 1), and let ν(y) be a continuous complex-valued function on y > 0 such that ∞ min(1, y)|ν(y)|dy < ∞. 0

Set G(y) =

∞ y

ν(z)dz and



ψν (p) =

(eipy − 1)ν(y) dy = ip

0



eipy G(y)dy.

(9.52)

0

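(The equation holds due to integration by parts.)

(i) If $G(y)$ satisfies (9.46) and (9.47), then $\psi_\nu(p)$ is an analytic function in the complex open upper half-plane $\{\mathrm{Im}\, p > 0\}$ and has a continuous extension to the closed half-plane $\{\mathrm{Im}\, p \ge 0\}$. In this half-plane,
\[
\psi_\nu(p) = -(-ip)^{\alpha}\,\Gamma(1-\alpha)\kappa + O(1), \tag{9.53}
\]
where, for any $a > 0$,
\[
|O(1)| \le C(a,\alpha)\, \|\kappa\alpha y^{-\alpha-1} - \nu(y)\|_{L^1([0,a])} \tag{9.54}
\]
with a constant $C(a,\alpha)$. In particular, for real $p$,
\[
\psi_\nu(p) = -e^{-i\pi\alpha\,\mathrm{sgn}(p)/2}\,\Gamma(1-\alpha)\kappa |p|^{\alpha} + O(1). \tag{9.55}
\]

(ii) If $\int_0^\infty y|\nu(y)|\, dy < \infty$, then
\[
\psi_\nu(p) = \int_0^\infty (e^{ipy}-1)\nu(y)\, dy = ip\left[\int_0^\infty y\nu(y)\, dy + o(1)\right], \tag{9.56}
\]
where $o(1) \to 0$ as $p \to 0$, and the derivative
\[
\frac{d}{dp}\int_0^\infty (e^{ipy}-1)\nu(y)\, dy = \int_0^\infty iy e^{ipy}\nu(y)\, dy
\]
is well defined and uniformly bounded in the half-plane $\{\mathrm{Im}\, p \ge 0\}$.

(iii) If
\[
G(y) = \left(g + O(1/y^{\lambda})\right) y^{-\gamma}, \tag{9.57}
\]
with $|O(1/y^{\lambda})| \le C/y^{\lambda}$ and some $g \ne 0$, $\gamma \in (0,1)$, $\lambda > 1-\gamma$, $C > 0$, then
\[
\psi_\nu(p) = \int_0^\infty (e^{ipy}-1)\nu(y)\, dy = -g\Gamma(1-\gamma)(-ip)^{\gamma} + O(p), \tag{9.58}
\]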


where, for any $a > 0$,
\[
|O(p)| \le |p|\left(\int_0^a |G(y)|\, dy + \frac{C}{\gamma+\lambda-1}\, a^{1-\gamma-\lambda} + \frac{|g|}{1-\gamma}\, a^{1-\gamma}\right). \tag{9.59}
\]

In particular, for real $p$, we find
\[
\psi_\nu(p) = \int_0^\infty (e^{ipy}-1)\nu(y)\, dy = -g\Gamma(1-\gamma)\, e^{-i\pi\gamma\,\mathrm{sgn}(p)/2}\, |p|^{\gamma} + O(p). \tag{9.60}
\]

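Moreover, if additionally
\[
\nu(y) = \gamma\left(g + O(1/y^{\lambda})\right) y^{-\gamma-1} \tag{9.61}
\]
(note that (9.61) implies (9.57) by L'Hôpital's rule), then
\[
\frac{d}{dp}\psi_\nu(p) = -g\gamma\,\mathrm{sgn}(p)\,\Gamma(1-\gamma)\, e^{-i\pi\gamma\,\mathrm{sgn}(p)/2}\, |p|^{\gamma-1} + O(1) \tag{9.62}
\]
for real $p$, with a bounded function $O(1)$.

Proof. (i) Due to (9.52), (9.53) follows from (9.48) by multiplying this formula by $ip$.

(ii) If $\int_0^\infty y|\nu(y)|\, dy < \infty$, then
\[
\left|\int_0^\infty (e^{ipy}-1)\nu(y)\, dy\right| \le \int_0^\infty |p|\, y\, |\nu(y)|\, dy,
\]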

which tends to zero at least as fast as $|p|$, as $p \to 0$. The asymptotic equation (9.56) is then obtained from the formula for the derivative.

(iii) Using (9.52) and splitting the integral into two parts over $[0,a]$ and $[a,\infty)$, respectively, we see that the first integral yields the first estimate on the r.h.s. of (9.59). Setting $p = qe^{i\psi}$ with $q = |p| > 0$ and $\psi \in [0,\pi]$ and changing the variable of integration $y$ to $z = yq$ in the second integral, we get
\[
ip\int_a^\infty e^{ipy} G(y)\, dy = ip|p|^{\gamma-1} g\int_{|p|a}^\infty \exp\{ize^{i\psi}\}\, z^{-\gamma}\, dz + r
\]
with
\[
|r| \le C|p|^{\lambda+\gamma}\int_{|p|a}^\infty z^{-\gamma-\lambda}\, dz = \frac{C}{\gamma+\lambda-1}\, a^{1-\gamma-\lambda}\, |p|,
\]
where the condition $\lambda + \gamma > 1$ was used. It remains to observe that
\[
\int_{|p|a}^\infty \exp\{ize^{i\psi}\}\, z^{-\gamma}\, dz = \int_0^\infty \exp\{ize^{i\psi}\}\, z^{-\gamma}\, dz - \int_0^{|p|a} \exp\{ize^{i\psi}\}\, z^{-\gamma}\, dz,
\]


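and
\[
\left|\int_0^{|p|a} \exp\{ize^{i\psi}\}\, z^{-\gamma}\, dz\right| \le \frac{1}{1-\gamma}\,(|p|a)^{1-\gamma},
\qquad
\int_0^\infty \exp\{ize^{i\psi}\}\, z^{-\gamma}\, dz = \Gamma(1-\gamma)\exp\{i(\psi-\pi/2)(\gamma-1)\},
\]
where (9.20) was used. Therefore, the main term of $\psi_\nu(p)$ equals
\[
ip|p|^{\gamma-1} g\,\Gamma(1-\gamma)\exp\{i(\psi-\pi/2)(\gamma-1)\} = -g\Gamma(1-\gamma)(-ip)^{\gamma},
\]
as required. This proves (9.58) and (9.59). Finally, the previous analysis implies (9.62) for
\[
\frac{d}{dp}\int_0^\infty (e^{ipy}-1)\nu(y)\, dy = \int_0^\infty iy e^{ipy}\nu(y)\, dy
\]
by using $y\nu(y)$ instead of $G(y)$.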
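The phase bookkeeping at the end of this proof is easy to get wrong, so here is a small numerical cross-check. It is a sketch only: the function names are illustrative, and the power $(-ip)^{\gamma}$ is assumed to be taken on the principal branch, which is what Python's complex power uses. It compares the polar form of the main term, the closed form in (9.58), and the real-$p$ form in (9.60).

```python
import cmath
import math


def polar_main_term(q, psi, gamma, g=1.0):
    """Main term as written in the proof, with p = q * exp(i * psi), psi in [0, pi]."""
    p = q * cmath.exp(1j * psi)
    phase = cmath.exp(1j * (psi - math.pi / 2) * (gamma - 1.0))
    return 1j * p * q ** (gamma - 1.0) * g * math.gamma(1.0 - gamma) * phase


def closed_main_term(p, gamma, g=1.0):
    """Main term -g * Gamma(1 - gamma) * (-i p)**gamma of (9.58), principal branch."""
    return -g * math.gamma(1.0 - gamma) * (-1j * p) ** gamma


def real_main_term(p, gamma, g=1.0):
    """Main term of (9.60) for real nonzero p."""
    sgn = math.copysign(1.0, p)
    phase = cmath.exp(-1j * math.pi * gamma * sgn / 2)
    return -g * math.gamma(1.0 - gamma) * phase * abs(p) ** gamma


q, psi, gamma = 0.7, 1.1, 0.4
p = q * cmath.exp(1j * psi)
print(abs(polar_main_term(q, psi, gamma) - closed_main_term(p, gamma)))
print(abs(closed_main_term(-2.0, gamma) - real_main_term(-2.0, gamma)))
```

Both printed differences should be at the level of rounding error.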



Remark 130. In order to get rough asymptotics (without error estimates), some conditions can be relaxed. For instance, the general Tauberian theorems (see Chapter 4 in [37]) imply that for monotone positive $G$, the main terms in (9.58) and (9.53) follow just from the existence of the limits $\kappa = \lim_{y\to 0} y^{\alpha} G(y)$ and $g = \lim_{y\to\infty} y^{\gamma} G(y)$.

The main reason for the asymptotic analysis that has been performed above is to get estimates for the fundamental solution to ΨDOs with symbols $\psi_\nu(p)$ from (9.53). By (1.180), this fundamental solution is given by the inverse Fourier transform: $E_\nu = F^{-1}(1/\psi_\nu)$.

Proposition 9.4.3. Under the assumption of Proposition 9.4.2(i), let $\alpha \in (\tfrac12, 1)$. Then the function
\[
\tilde E_\nu(x) = -[F^{-1}(1/\psi_\nu)](x) = -\frac{1}{2\pi}\int_{-\infty}^{\infty} e^{ipx}\, \frac{dp}{\psi_\nu(p)} \tag{9.63}
\]
is well defined for real $x \ne 0$. (Note that the integral can be defined either as an improper Riemann integral or as the limit of an appropriately chosen complex $x$.) Moreover, this function is uniformly bounded for positive $x$ and has the following asymptotic behaviour for negative $x$:
\[
\tilde E_\nu(x) = \frac{1}{\pi\kappa}\, |x|^{\alpha-1}\sin(\pi\alpha) + O(1), \tag{9.64}
\]
with a uniformly bounded (for $x < 0$) function $O(1)$.


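Remark 131. (i) The minus sign was introduced in (9.63) for convenience. In fact, the main object is the function $E(x) = \tilde E(-x)$, which represents the density of the potential measure $U(dx)$ of the operator $L_\nu$ with the symbol $\psi_\nu(p)$. It turns out that the measure $U(dx)$ is supported on $\mathbb{R}_+$ (see Section 8.4), so that $\tilde E(x)$ vanishes identically for $x > 0$. (ii) The assumption $\alpha > 1/2$ is a consequence of our technique. It can be relaxed by more advanced methods (see Remark 130).

Proof. By (9.63), we have
\[
\tilde E_\nu(x) = -\frac{1}{2\pi}\int_{-\infty}^{\infty} e^{ipx}\, \frac{dp}{\psi_\nu(p)}
= -\frac{1}{2\pi}\int_0^{\infty} e^{ipx}\, \frac{dp}{\psi_\nu(p)} - \frac{1}{2\pi}\int_0^{\infty} e^{-ipx}\, \frac{dp}{\psi_\nu(-p)}.
\]
In order to apply Proposition 9.4.2 (iii) (with $x$ instead of $p$ and $p$ instead of $y$), we observe from (9.55) that, for positive $p$,
\[
\frac{1}{\psi_\nu(p)} = -\frac{e^{i\pi\alpha/2}}{p^{\alpha}\Gamma(1-\alpha)\kappa}\left(1 + O(|p|^{-\alpha})\right),
\qquad
\frac{1}{\psi_\nu(-p)} = -\frac{e^{-i\pi\alpha/2}}{p^{\alpha}\Gamma(1-\alpha)\kappa}\left(1 + O(|p|^{-\alpha})\right).
\]
Therefore, by Proposition 9.4.2 (with $\gamma = \lambda = \alpha$, hence the condition $\alpha > 1/2$), we find
\[
\begin{aligned}
\tilde E_\nu(x) &= -\frac{1}{2\pi i x}\,\frac{e^{i\pi\alpha/2}}{\Gamma(1-\alpha)\kappa}\,\Gamma(1-\alpha)\, e^{-i\pi\alpha\,\mathrm{sgn}\, x/2}\, |x|^{\alpha}
+ \frac{1}{2\pi i x}\,\frac{e^{-i\pi\alpha/2}}{\Gamma(1-\alpha)\kappa}\,\Gamma(1-\alpha)\, e^{i\pi\alpha\,\mathrm{sgn}\, x/2}\, |x|^{\alpha} + O(1) \\
&= -\frac{1}{2\pi i\kappa\,\mathrm{sgn}\, x}\, |x|^{\alpha-1}\left(e^{i\pi\alpha/2} e^{-i\pi\alpha\,\mathrm{sgn}\, x/2} - e^{-i\pi\alpha/2} e^{i\pi\alpha\,\mathrm{sgn}\, x/2}\right) + O(1).
\end{aligned}
\]
Remarkably, the main terms cancel for positive $x$, which proves that $\tilde E_\nu$ is bounded in this region. For negative $x$, we get
\[
\tilde E_\nu(x) = \frac{1}{2\pi i\kappa}\, |x|^{\alpha-1}\left(e^{i\pi\alpha} - e^{-i\pi\alpha}\right) + O(1) = \frac{1}{\pi\kappa}\, |x|^{\alpha-1}\sin(\pi\alpha) + O(1),
\]
as claimed.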
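The cancellation for positive $x$ and the constant $\sin(\pi\alpha)/(\pi\kappa)$ for negative $x$ can be confirmed with a few lines of complex arithmetic. This is only an illustrative sketch; the helper name and the sample values of $\alpha$ and $\kappa$ are assumptions made for the example.

```python
import cmath
import math


def leading_coefficient(alpha, kappa, sgn_x):
    """Coefficient of |x|**(alpha - 1) in the final display of the proof."""
    half = 1j * math.pi * alpha / 2
    bracket = (cmath.exp(half) * cmath.exp(-half * sgn_x)
               - cmath.exp(-half) * cmath.exp(half * sgn_x))
    return -bracket / (2j * math.pi * kappa * sgn_x)


alpha, kappa = 0.75, 2.0
print(abs(leading_coefficient(alpha, kappa, +1)))   # main terms cancel for x > 0
print(leading_coefficient(alpha, kappa, -1))        # compare with the line below
print(math.sin(math.pi * alpha) / (math.pi * kappa))
```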

9.5 Argmax in convex Hamiltonians

In control theory, the Hamiltonian function often appears in the form
\[
H(p) = \max_{x\in X}\,(xp - U(x)), \qquad p \in \mathbb{R}^d, \tag{9.65}
\]


where $X$ is a closed set in $\mathbb{R}^d$ and $U(x)$ a continuous function. In many situations (see specifically Theorem 6.10.1), it is important to know the properties of the (possibly multi-valued) function
\[
\hat x(p) = \operatorname{argmax}\,(xp - U(x)),
\]
that is, $\hat x(p)$ is the set of points where the maximum in (9.65) is achieved. It is straightforward to see that if $X$ is convex and $U(x)$ is a strictly convex function, then $\hat x$ is single-valued. In this case, one can expect $\hat x(p)$ to be Lipschitz-continuous. For simplicity, we shall establish this fact only for $X$ being a ball.

Theorem 9.5.1. Let $X = \{x : |x| \le 1\}$ and let $U(x)$ be a strictly convex twice continuously differentiable function such that
\[
\left(\frac{\partial^2 U}{\partial x^2}(x)\,\xi, \xi\right) \ge a(\xi,\xi)
\]
for a constant $a > 0$ and all $x$ and $\xi$, and such that $x = 0$ is the point of the global minimum of $U$. Then $\hat x(p) : \mathbb{R}^d \to X$ is a well-defined (globally) Lipschitz-continuous function, with the Lipschitz constant depending on $a$ and
\[
b = \max\left\{\left\|\frac{\partial^2 U}{\partial x^2}(x)\right\| : x \in X\right\}.
\]

Proof. The global maximum (without the constraint $x \in X$) of $px - U(x)$ is achieved at the point $x$ for which $p = \nabla U(x)$. Due to the convexity, the mapping $x \mapsto \nabla U(x)$ is a diffeomorphism of $\mathbb{R}^d$. Therefore, its inverse $G(p)$ is also a diffeomorphism. Hence, for $p \in (\nabla U)(X)$, we find that $\hat x = G(p)$ and that $\hat x$ is a Lipschitz-continuous and smooth function of $p$. For $p \in P = \mathbb{R}^d \setminus (\nabla U)(X)$, $\hat x(p)$ belongs to the unit sphere $S^{d-1}$. Thus we only need to show the Lipschitz continuity of the mapping $\hat x$ as a mapping $P \to S^{d-1}$. From the method of Lagrange multipliers, it follows that, for $p \in P$, $\hat x(p)$ solves the equation $p = \nabla U(x) + \lambda x$ with some $\lambda > 0$.

Remark 132. One can see this also directly. In fact, since the function $xp - U(x)$ has its maximum $\hat x(p)$ on $S^{d-1}$, the gradient of this function must be orthogonal to the tangent plane of $S^{d-1}$ at $\hat x(p)$.

Therefore, for all $x \in S^{d-1}$, the set $\{p : \hat x(p) = x\}$ is the ray
\[
\{p_\lambda = \nabla U(x) + \lambda x, \quad \lambda \ge 0\}.
\]
By convexity, we find
\[
(\nabla U(x) - \nabla U(y), x - y) = \left(x - y, \int_0^1 \frac{\partial^2 U}{\partial x^2}(y + t(x-y))\, dt\, (x - y)\right) \ge a(x-y)^2,
\]


and thus $|\nabla U(x) - \nabla U(y)| \ge a|x - y|$. Consequently,
\[
|x - y| \le \frac{1}{a}\, |p_0(x) - p_0(y)|. \tag{9.66}
\]
In order to prove the theorem, we have to show that
\[
|x - y| \le C\, |p_\lambda(x) - p_\mu(y)| \tag{9.67}
\]
for all $\lambda, \mu \ge 0$ and a constant $C$.

Remark 133. In order to get a better overview, notice that the norm $|p_\lambda|$ increases with $\lambda$ due to $(x, \nabla U(x)) > 0$. Therefore, $p_0(x)$ is the minimal (in magnitude) solution to the equation $\hat x(p) = x$.

For proving (9.67), let us look for the value of
\[
\min\{|p_\lambda(x) - p_\mu(y)| : \lambda, \mu \ge 0\} \tag{9.68}
\]
for given $x \ne y$ from $S^{d-1}$. Notice first of all that this value is positive, since $p_\lambda(x) \ne p_\mu(y)$ for any $\lambda, \mu$ and any $x \ne y$, because otherwise $\nabla U(x) + \lambda x = \nabla U(y) + \mu y$, and thus $\hat x(p) = x = y$ (because $\hat x(q)$ is uniquely defined for any $q$). Next, let us show that the minimum in (9.68) cannot be realized on a pair $(\lambda_0, \mu_0)$ such that both of these numbers are positive. In fact, assuming $\lambda_0 > 0$, $\mu_0 > 0$, it follows that
\[
(x, \nabla U(x) - \nabla U(y) + \lambda_0 x - \mu_0 y) = 0, \qquad
(y, \nabla U(x) - \nabla U(y) + \lambda_0 x - \mu_0 y) = 0,
\]
or, equivalently, that
\[
\lambda_0 - \mu_0 (x,y) = -(x, \nabla U(x) - \nabla U(y)), \qquad
-\lambda_0 (x,y) + \mu_0 = (y, \nabla U(x) - \nabla U(y)).
\]
Summing up these equations yields
\[
(\lambda_0 + \mu_0)(1 - (x,y)) = -(x - y, \nabla U(x) - \nabla U(y)) < 0,
\]
and thus $\lambda_0 + \mu_0 < 0$, which is a contradiction. Therefore, either the minimum in (9.68) is realized on $\mu_0 = \lambda_0 = 0$, in which case we have the required Lipschitz continuity by (9.66), or exactly one of the numbers $\mu_0, \lambda_0$ vanishes. For considering this second case, let $\lambda_0 = 0$, $\mu_0 > 0$. Then
\[
\mu_0 = (y, \nabla U(x) - \nabla U(y))
\]


and
\[
\min_{\lambda,\mu}\{|p_\lambda(x) - p_\mu(y)|\} = |p_0(x) - p_{\mu_0}(y)| = |\nabla U(x) - \nabla U(y) - (\nabla U(x) - \nabla U(y), y)\, y|.
\]
Therefore, it remains to show that
\[
|x - y| \le C\, |\nabla U(x) - \nabla U(y) - (\nabla U(x) - \nabla U(y), y)\, y|
\]
with a constant $C$. This, however, would follow directly from the inequality
\[
(x-y)^2 \le C\,(x - y, \nabla U(x) - \nabla U(y) - (\nabla U(x) - \nabla U(y), y)\, y). \tag{9.69}
\]
Now we have
\[
(x - y, \nabla U(x) - \nabla U(y)) \ge a(x-y)^2, \qquad |(\nabla U(x) - \nabla U(y), y)| \le b|x - y|.
\]
Noting that $|(y, x - y)| = |x - y|^2/2$ (whenever $|x| = |y| = 1$, since then $(y, x-y) = (x,y) - 1$ and $|x-y|^2 = 2 - 2(x,y)$), it follows that
\[
|(x - y, y)(\nabla U(x) - \nabla U(y), y)| \le b|x - y|^3/2.
\]
Therefore,
\[
(x - y, \nabla U(x) - \nabla U(y) - (\nabla U(x) - \nabla U(y), y)\, y) \ge a(x-y)^2 (1 - \omega),
\]
where $\omega \le b|x - y|/(2a)$. Thus for $|x - y| \le a/b$,
\[
(x - y, \nabla U(x) - \nabla U(y) - (\nabla U(x) - \nabla U(y), y)\, y) \ge a(x-y)^2/2,
\]
which implies (9.69) with $C = 2/a$. On the other hand, on the set $\{x, y \in X : |x - y| \ge a/b\}$ the expression
\[
|\nabla U(x) - \nabla U(y) - (\nabla U(x) - \nabla U(y), y)\, y|
\]
is bounded from below by a positive constant, say $c$, since it is continuous and cannot vanish. Therefore, we have
\[
|x - y| \le \frac{2}{c}\, |\nabla U(x) - \nabla U(y) - (\nabla U(x) - \nabla U(y), y)\, y|,
\]
which again implies (9.69).
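The two elementary facts used in the last step, namely $|(y, x-y)| = |x-y|^2/2$ on the unit sphere and the lower bound $a(x-y)^2/2$ for $|x-y| \le a/b$, can be spot-checked numerically. The sketch below assumes a sample potential $U(x) = x_1^2 + 2x_2^2$ in $d = 2$, for which $a = 2$ and $b = 4$. The potential and all names are hypothetical and chosen only for illustration.

```python
import math

A, B = 2.0, 4.0  # smallest and largest Hessian eigenvalues of the sample U


def grad_U(x):
    """Gradient of the illustrative potential U(x) = x1**2 + 2 * x2**2."""
    return (2.0 * x[0], 4.0 * x[1])


def dot(u, v):
    return u[0] * v[0] + u[1] * v[1]


def check_pair(s, t):
    """Return |x-y|^2, (y, x-y) and the l.h.s. pairing of (9.69) for unit vectors."""
    x = (math.cos(s), math.sin(s))
    y = (math.cos(t), math.sin(t))
    diff = (x[0] - y[0], x[1] - y[1])
    gx, gy = grad_U(x), grad_U(y)
    d = (gx[0] - gy[0], gx[1] - gy[1])          # grad U(x) - grad U(y)
    dy = dot(d, y)
    tangential = (d[0] - dy * y[0], d[1] - dy * y[1])
    return dot(diff, diff), dot(y, diff), dot(diff, tangential)


dist2, y_diff, pairing = check_pair(0.3, 0.5)
print(abs(y_diff), dist2 / 2)       # equal up to rounding
print(pairing, A * dist2 / 2)       # first value is at least the second
```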



Exercise 9.5.1. Show that if $U$ in Theorem 9.5.1 is spherically symmetric, i.e., it is of the form $U(x) = J(|x|)$, then
\[
\hat x(p) = \min\left(1, (J')^{-1}(|p|)\right) p/|p|. \tag{9.70}
\]
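Formula (9.70) can be compared with a direct numerical maximization. The sketch below assumes the sample choice $J(r) = r^2$, so that $\nabla U(x) = 2x$ and $(J')^{-1}(s) = s/2$, and uses projected gradient ascent on the unit ball as a stand-in solver. The solver, its step size and all names are illustrative assumptions and not the method of the text.

```python
import math


def argmax_formula(p, inv_dJ):
    """x_hat(p) = min(1, (J')^{-1}(|p|)) * p / |p| as in (9.70); p must be nonzero."""
    norm_p = math.hypot(*p)
    radius = min(1.0, inv_dJ(norm_p))
    return tuple(radius * c / norm_p for c in p)


def argmax_projected_gradient(p, grad_U, steps=500, eta=0.1):
    """Maximize x.p - U(x) over the unit ball by projected gradient ascent."""
    x = [0.0] * len(p)
    for _ in range(steps):
        g = grad_U(x)
        x = [xi + eta * (pi - gi) for xi, pi, gi in zip(x, p, g)]
        norm_x = math.hypot(*x)
        if norm_x > 1.0:                      # project back onto the ball
            x = [xi / norm_x for xi in x]
    return tuple(x)


def grad_U(x):
    """Gradient of the sample potential U(x) = |x|**2."""
    return [2.0 * xi for xi in x]


def inv_dJ(s):
    """Inverse of J'(r) = 2 r."""
    return s / 2.0


for p in [(0.6, 0.8), (3.0, 4.0)]:
    print(argmax_formula(p, inv_dJ), argmax_projected_gradient(p, grad_U))
```

For $p = (0.6, 0.8)$ the maximizer lies inside the ball, while for $p = (3, 4)$ it lies on the unit sphere, so the two sample points exercise both branches of the minimum in (9.70).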

Bibliography

[1] O.P. Agrawal. A General Formulation and Solution Scheme for Fractional Optimal Control Problems. J. Nonlinear Dynamics 38 (2004), 323–337.
[2] O.P. Agrawal. Generalized Variational Problems and Euler–Lagrange equations. Computers and Mathematics with Applications 59 (2010), 1852–1864.
[3] N.U. Ahmed. A general class of McKean–Vlasov stochastic evolution equations driven by Brownian motion and Lévy process and controlled by Lévy measure. Discuss. Math. Differ. Incl. Control Optim. 36:2 (2016), 181–206.
[4] G. Akagi, G. Schimperna and A. Segatti. Fractional Cahn–Hilliard, Allen–Cahn and porous medium equations. J. Differential Equations 261:6 (2016), 2935–2985.
[5] R. Albert and A.-L. Barabási. Statistical mechanics of complex networks. Reviews of Modern Physics 74:1 (2002), 47–97. arXiv:cond-mat/0106096.
[6] S.A. Albeverio, R. Høegh-Krohn and S. Mazzucchi. Mathematical theory of Feynman path integrals. An introduction. Second edition. Lecture Notes in Mathematics, 523. Springer-Verlag, Berlin, 2008.
[7] L. Ambrosio, N. Gigli and G. Savaré. Gradient Flows in Metric Spaces and in the Space of Probability Measures. Birkhäuser, 2005.
[8] R. Almeida, Sh. Pooseh and D.F.M. Torres. Computational methods in the fractional calculus of variations. Imperial College Press, London, 2015.
[9] R. Alonso. Boltzmann-type equations and their applications. IMPA Mathematical Publications. 30th Brazilian Mathematics Colloquium. Instituto Nacional de Matemática Pura e Aplicada (IMPA), Rio de Janeiro, 2015.
[10] A. Ananova and R. Cont. Pathwise integration with respect to paths of finite quadratic variation. J. Math. Pures Appl. 107:6 (2017), 737–757.
[11] V.V. Anh and N.N. Leonenko. Spectral Analysis of Fractional Kinetic Equations with Random Data. Journal of Statistical Physics 104, Nos. 5/6 (2001), 1349–1387.
[12] D. Applebaum. Lévy Processes and Stochastic Calculus. Cambridge Studies in Advanced Mathematics, vol. 93, CUP, 2004.
© Springer Nature Switzerland AG 2019 V. Kolokoltsov, Differential Equations on Measures and Functional Spaces, Birkhäuser Advanced Texts Basler Lehrbücher, https://doi.org/10.1007/978-3-030-03377-4


[13] V.V. Aristov. Direct methods for solving the Boltzmann equation and study of nonequilibrium flows. Fluid Mechanics and its Applications, 60. Kluwer Academic Publishers Group, Dordrecht, 2001.
[14] V.I. Arnold. Mathematical Understanding of Nature. AMS, Providence, RI, 2014.
[15] V.I. Arnold. Ordinary differential equations. Translated from the Russian by Roger Cooke. Second printing of the 1992 edition. Universitext. Springer-Verlag, Berlin, 2006.
[16] V.I. Arnold. Mathematical methods of classical mechanics. Nauka, Moscow, 1979 (in Russian). French translation Mir, Moscow, 1976. Polish translation PWN, Warsaw, 1981.
[17] A.A. Arsen'ev and O.E. Buryak. On the connection between a solution of the Boltzmann equation and a solution of the Landau–Fokker–Planck equation. Math. USSR Sbornik 69:2 (1991), 465–478.
[18] J.P. Aubin and A. Cellina. Differential Inclusions. Springer, New York, 1994.
[19] Yu.V. Averbukh. A minimax approach to mean field games. Mat. Sbornik 206:7 (2015), 3–32 (in Russian).
[20] V.I. Averbukh and O.G. Smolyanov. The theory of differentiation in linear topological spaces. Russian Math Survey 22:6 (1967), 201–258.
[21] V.I. Averbukh and O.G. Smolyanov. The various definitions of the derivatives in linear topological spaces. Russian Math Survey 23:4 (1968), 67–258.
[22] I.F. Bailleul. Sensitivity for the Smoluchowski equation. J. Phys. A 44 (2011), no. 24, 245004.
[23] I.F. Bailleul, Peter L.W. Man and M. Kraft. A stochastic algorithm for parametric sensitivity in Smoluchowski's coagulation equation. SIAM J. Numer. Anal. 48:3 (2010), 1064–1086.
[24] D. Baleanu, K. Diethelm, E. Scalas and J.J. Trujillo. Fractional calculus. Models and numerical methods. Second edition. Series on Complexity, Nonlinearity and Chaos, 5. World Scientific, Hackensack, NJ, 2017.
[25] J.M. Ball, J. Carr. The discrete coagulation-fragmentation equations: existence, uniqueness and density conservation. J. Stat. Phys. 61 (1990), 203–234.
[26] V. Bally, L. Caramellino and R. Cont. Stochastic Integration by Parts and Functional Itô Calculus. Advanced Courses in Mathematics CRM Barcelona. Birkhäuser, 2016.
[27] V. Barbu. Nonlinear Differential Equations of Monotone Type in Banach Spaces. Springer, New York, 2010.
[28] S.S. Bayin. Time fractional Schrödinger equation. Fox's H-functions and the effective potential. J. Math. Phys. 54 (2013), 012103.


[29] R. Basna, A. Hilbert and V.N. Kolokoltsov. An epsilon-Nash equilibrium for non-linear Markov games of mean-field-type on finite spaces. Commun. Stoch. Anal. 8:4 (2014), 449–468.
[30] V.P. Belavkin. Quantum branching processes and nonlinear dynamics of multi-quantum systems. Dokl. Acad. Nauk SSSR 301:6 (1988), 1348–1352 (in Russian).
[31] V.P. Belavkin. Multiquantum systems and point processes I. Reports on Math. Phys. 28 (1989), 57–90.
[32] V. Belavkin, V. Kolokoltsov. On general kinetic equation for many particle systems with interaction, fragmentation and coagulation. Proc. R. Soc. Lond. A 459 (2002), 1–22.
[33] G. Ben Arous. Développement asymptotique du noyau de la chaleur hypoelliptique sur la diagonale. Ann. Inst. Fourier (Grenoble) 39:1 (1989), 73–99.
[34] A. Bensoussan, J. Frehse and Ph. Yam. Mean field games and mean field type control theory. Springer Briefs in Mathematics. Springer, New York, 2013.
[35] A. Bensoussan, J. Frehse and Ph. Yam. On the interpretation of the Master Equation. Stochastic Process. Appl. 127:3 (2017), 2093–2137.
[36] J. Bertoin. Random fragmentation and coagulation processes. Cambridge Studies in Advanced Mathematics, 102. Cambridge University Press, Cambridge, 2006.
[37] N.H. Bingham, C.M. Goldie and J.L. Teugels. Regular variation. Cambridge University Press, 1987.
[38] G. Birkhoff. Lattice Theory. American Mathematical Society, 1995 [3rd ed. with corrections].
[39] K. Bogdan. Sharp estimates for the Green function in Lipschitz domains. J. Math. Anal. Appl. 243 (2000), 326–337.
[40] A.V. Bolsinov and A.T. Fomenko. Integrable geodesic flows on two-dimensional surfaces. Monographs in Contemporary Mathematics. Consultants Bureau, New York, 2000.
[41] L. Bourdin. Existence of a weak solution for fractional Euler–Lagrange equations. J. Math. Anal. Appl. 399:1 (2013), 239–251.
[42] C. Buse, M. Megan, M.-S. Prajea and P. Preda. The strong variant of a Barbashin Theorem on Stability of Solutions for Non-Autonomous Differential Equations in Banach Spaces. Integr. Equ. Oper. Theory 59 (2007), 491–500.
[43] P.E. Caines. "Mean Field Games". Encyclopedia of Systems and Control, Eds. T. Samad and J. Ballieul. Springer Reference 364780; DOI 10.1007/978-1-4471-5102-9_30-1, Springer-Verlag, London, 2014.
[44] P. Cardaliaguet, J.-M. Lasry, P.-L. Lions and A. Porretta. Long time average of mean field games with a nonlocal coupling. SIAM J. Control Optim. 51:5 (2013), 3558–3591.


[45] P. Cardaliaguet, F. Delarue, J.-M. Lasry and P.-L. Lions. The master equation and the convergence problem in mean field games. arXiv:1509.02505v1 [math.AP].
[46] R. Carmona and F. Delarue. Forward-backward stochastic differential equations and controlled McKean–Vlasov dynamics. Ann. Probab. 43:5 (2015), 2647–2700.
[47] R. Carmona and F. Delarue. The master equation for large population equilibriums. Stochastic analysis and applications 2014, 77–128, Springer Proc. Math. Stat., 100, Springer, Cham, 2014.
[48] R. Carmona and F. Delarue. Probabilistic Theory of Mean Field Games with Applications, vol. I, II. Probability Theory and Stochastic Modelling vol. 83, 84. Springer, 2018.
[49] C. Cercignani, G.M. Kremer. The relativistic Boltzmann equation: theory and applications. Progress in Mathematical Physics, 22. Birkhäuser Verlag, Basel, 2002.
[50] M. Chaleyat-Maurel and L. Elie. Diffusions Gaussiannes. Astérisque 84-85 (1981), 1762–1768.
[51] S. Cho, P. Kim and H. Park. Two-sided estimates on Dirichlet heat kernels for time-dependent parabolic operators with singular drifts in $C^{1,\alpha}$ domains. J. Differential Equations 252:2 (2012), 1101–1145.
[52] M. Cichon. On solutions of differential equations in Banach spaces. Nonlinear Analysis 60 (2005), 651–667.
[53] C. Chicone and Y. Latushkin. Evolution Semigroups in Dynamical Systems and Differential Equations. Mathematical Surveys and Monographs, vol. 70. AMS, 1999.
[54] M.G. Crandall and P.-L. Lions. Hamilton–Jacobi Equations in Infinite Dimensions I. J. Funct. Anal. 62 (1985), 379–396.
[55] D. Crisan, Th. Kurtz and Y. Lee. Conditional distributions, exchangeable particle systems, and stochastic partial differential equations. Ann. Inst. Henri Poincaré Probab. Stat. 50:4 (2014), 946–974.
[56] D. Crisan and E. McMurray. Smoothing properties of McKean–Vlasov SDEs. Probab. Theory Related Fields 171:1-2 (2018), 97–148.
[57] J.L. Da Silva, A.N. Kochubei and Y. Kondratiev. Fractional statistical dynamics and fractional kinetics. Methods Funct. Anal. Topology 22:3 (2016), 197–209.
[58] G. Darbo. Punti uniti in transformazioni a condominio non compacto. Rend. Sem. Mat. Univ. Padova 24 (1955), 84–92.
[59] M. Deaconu, N. Fournier, E. Tanré. Rate of convergence of a stochastic particle system for the Smoluchowski coagulation equation. Methodol. Comput. Appl. Probab. 5:2 (2003), 131–158.


[60] F. Delarue, S. Menozzi and E. Nualart. The Landau equation for Maxwellian molecules and the Brownian motion on $SO_N(\mathbb{R})$. Electron. J. Probab. 20 (2015), no. 92.
[61] P. Del Moral and A. Doucet. Interacting Markov chain Monte Carlo methods for solving nonlinear measure-valued equations. Ann. Appl. Probab. 20:2 (2010), 593–639.
[62] K. Diethelm. The Analysis of Fractional Differential Equations. Lecture Notes in Mathematics, vol. 2004. Springer, 2010.
[63] N. Dinculeanu. Vector integration and stochastic integration in Banach spaces. Wiley, New York, 2000.
[64] R.J. DiPerna. Measure-valued solutions to conservation laws. Arch. Rational Mech. Anal. 88:3 (1985), 223–270.
[65] R.J. DiPerna and P.-L. Lions. On the Cauchy problem for Boltzmann equations: global existence and weak stability. Ann. of Math. 130:2 (1989), 321–366.
[66] B. Djehiche, H. Tembine and R. Tempone. A stochastic maximum principle for risk-sensitive mean-field type control. IEEE Trans. Automat. Control 60:10 (2015), 2640–2649.
[67] S.Yu. Dobrokhotov and A.I. Shafarevich. Tunnel splitting of the spectrum of Laplace–Beltrami operators on two-dimensional surfaces with a square-integrable geodesic flow. Funktsional. Anal. i Prilozhen. 34:2 (2000), 67–69 (in Russian). Engl. transl. Funct. Anal. Appl. 34:2 (2000), 133–134.
[68] J. Duan. An introduction to Stochastic Dynamics. Cambridge Texts in Applied Mathematics. Cambridge, 2015.
[69] P. Dubovskii. Mathematical theory of coagulation. Lecture Notes Series, 23. Seoul National University, Research Institute of Mathematics, Global Analysis Research Center, Seoul, 1994.
[70] M.M. Dzherbashian and A.B. Nersesian. Fractional derivatives and the Cauchy problem for differential equations of fractional order. Izv. Acad. Nauk Armjanskvy SSR 3:1 (1968), 3–29.
[71] Dzherbashian. Integral Transforms and Representations of Functions in the Complex Plane. Nauka, Moscow, 1966 (in Russian).
[72] S.D. Eidelman, S.D. Ivasyshen and A.N. Kochubei. Analytic Methods in the Theory of Differential and Pseudo-Differential Equations of Parabolic Type. Operator Theory: Advances and Applications, vol. 152. Springer, Basel, 2004.
[73] Ch.M. Elliott and Z. Songmu. On the Cahn–Hilliard Equation. Arch. Rational Mech. Anal. 96:4 (1986), 339–357.
[74] Ch.M. Elliott and H. Garcke. On the Cahn–Hilliard equation with degenerate mobility. SIAM J. Math. Anal. 27:2 (1996), 404–423.


[75] A. Favini, A. Yagi. Degenerate differential equations in Banach spaces. Monographs and Textbooks in Pure and Applied Mathematics, 215. Marcel Dekker, Inc., New York, 1999.
[76] H.O. Fattorini. The Cauchy problem. Encycl. Math. Appl. 18, Addison Wesley, Reading Mass., 1983.
[77] M.V. Fedoryuk. Asymptotics of the Green function of pseudo-differential parabolic equations (in Russian). Differential equations 14:7 (1978), 1296–1301.
[78] W. Feller. An introduction to Probability. 2nd Edition, vol. 2, John Wiley, 1971.
[79] A.F. Filippov. Differential equations with discontinuous right-hand side. Mat. Sb. 51(93):1 (1960), 99–128.
[80] A.F. Filippov. Differential Equations with Discontinuous Righthand Sides. Kluwer Academic, Dordrecht, 1988.
[81] D. Finkelshtein. Around Ovsyannikov's method. Methods Funct. Anal. Topology 21:2 (2015), 134–150.
[82] D.L. Finkelshtein, Yu.G. Kondratiev, M.J. Oliveira. Markov evolutions and hierarchical equations in the continuum. II: Multicomponent systems. Rep. Math. Phys. 71:1 (2013), 123–148.
[83] W.H. Fleming and H.M. Soner. Controlled Markov Processes and Viscosity Solutions. Sec. Ed. Springer, 2006.
[84] H. Föllmer. Calcul d'Itô sans probabilités. In Seminar on Probability, XV (Univ. Strasbourg, Strasbourg, 1979/1980) (French), Lecture Notes in Mathematics 850 (1981), 143–150. Springer, Berlin.
[85] A.T. Fomenko and Kh. Tsishang. A topological invariant and a criterion for the equivalence of integrable Hamiltonian systems with two degrees of freedom. Izv. Akad. Nauk SSSR Ser. Mat. 54:3 (1990), 546–575 (in Russian). English transl. Math. USSR-Izv. 36:3 (1991), 567–596.
[86] T.D. Frank. Nonlinear Fokker–Planck Equations, Fundamentals and Applications. Springer Series in Synergetics, 2005.
[87] M. Freidlin. Functional Integration and Partial Differential Equations. Princeton Univ. Press, Princeton, NY, 1985.
[88] I. Gallagher, L. Saint-Raymond, B. Texier. From Newton to Boltzmann: hard spheres and short-range potentials. Zurich Lectures in Advanced Mathematics. European Mathematical Society, Zürich, 2013.
[89] G.N. Galanis and P.K. Palamides. Nonlinear differential equations in Fréchet spaces and continuum cross-section. Analele Stiintifice ale universitatii "Al. I. Cuza" IASI. Tomul LI, s.I, Matematica (2005), f.1.
[90] I.M. Gelfand and G.E. Shilov. Generalized Functions, vol. 1. Academic Press 1964. Transl. from Russian, Moscow, 1958.


[91] I.M. Gelfand and G.E. Shilov. Generalized Functions, vol. 3. Academic Press 1964. Transl. from Russian, Moscow, 1958.
[92] D.A. Gomes and J. Saude. Mean field games models – a brief survey. Dyn. Games Appl. 4:2 (2014), 110–154.
[93] D.A. Gomes, J. Mohr and R.R. Souza. Continuous time finite state space mean field games. Appl. Math. Optim. 68:1 (2013), 99–143.
[94] A. Gorban and V. Kolokoltsov. Generalized Mass Action Law and Thermodynamics of Nonlinear Markov Processes. Mathematical Modeling of Natural Phenomena (MMNP) 10:5 (2015), 16–46, http://www.mmnp-journal.org/
[95] A.N. Gorban, M. Shahzad. The Michaelis–Menten–Stueckelberg Theorem. Entropy 13 (2011), 966–1019.
[96] P. Górka, H. Prado and J. Trujillo. The time fractional Schrödinger equation on Hilbert space. Integral Equations Operator Theory 87:1 (2017), 1–14.
[97] O. Guéant, J.-M. Lasry and P.-L. Lions. Mean Field Games and Applications. Paris-Princeton Lectures on Mathematical Finance 2010. Lecture Notes in Math. (2003), Springer, Berlin, p. 205–266.
[98] H. Guérin, S. Méléard and E. Nualart. Estimates for the density of a nonlinear Landau process. Journal of Functional Analysis 238 (2006), 649–677.
[99] M.E. Gurtin. Generalized Ginzburg–Landau and Cahn–Hilliard equations based on a microforce balance. Physica D 92 (1996), 178–192.
[100] J.W. Hagood. The operator-valued Feynman–Kac formula with noncommutative operators. J. Funct. Anal. 38:1 (1980), 99–117.
[101] P. Hájek and P. Vivi. Some problems on ordinary differential equations in Banach spaces. RACSAM 104:2 (2010), 245–255.
[102] P. Hájek and M. Johanis. On Peano's theorem in Banach spaces. J. Differential Equations 249:12 (2010), 3342–3351.
[103] A. Hammond and F. Rezakhanlou. The kinetic limit for a system of coagulating Brownian particles. Arch. Ration. Mech. Anal. 185:1 (2007), 1–67.
[104] Ph. Hartman. Ordinary differential equations. Corrected reprint of the Sec. Edition. Classics in Applied Mathematics, 38. Society for Industrial and Applied Mathematics (SIAM), Philadelphia, PA, 2002.
[105] J. Henderson and R. Luca. Boundary-value problems for systems of differential, difference and fractional equations. Positive solutions. Elsevier, Amsterdam, 2016.
[106] M.E. Hernández-Hernández and V.N. Kolokoltsov. On the probabilistic approach to the solution of generalized fractional differential equations of Caputo and Riemann–Liouville type. Journal of Fractional Calculus and Applications 7:1 (2016), 147–175.
[107] M.E. Hernández-Hernández and V.N. Kolokoltsov. On the solution of two-sided fractional ordinary differential equations of Caputo type. Fract. Calc. Appl. Anal. 19:6 (2016), 1393–1413.


[108] G. Herzog. On Lipschitz conditions for ordinary differential equations in Fréchet spaces. Czechoslovak Mathematical Journal 48 (123) (1998), 95–103.
[109] L. Hörmander. Pseudo-differential operators. Comm. Pure Appl. Math. 18 (1965), 501–517.
[110] L. Hörmander. The analysis of linear partial differential operators. Vol. 1: Distribution theory and Fourier analysis. Vol. 2: Differential operators with constant coefficients. Vol. 3: Pseudo-differential operators. Vol. 4: Fourier integral operators. Springer, Berlin, New York, 2nd Ed., 2003–2009.
[111] J. Hofbauer, K. Sigmund. Evolutionary Games and Population Dynamics. Cambridge University Press, 1998.
[112] E. Horst. Differential equations in Banach spaces: five examples. Arch. Math. 46 (1986), 440–444.
[113] M. Huang, R.P. Malhame, P.E. Caines. Large population stochastic dynamic games: closed-loop McKean–Vlasov systems and the Nash certainty equivalence principle. Commun. Inf. Syst. 6:3 (2006), 221–251.
[114] L. Huang and St. Menozzi. A parametrix approach for some degenerate stable driven SDEs. Ann. Inst. Henri Poincaré Probab. Stat. 52:4 (2016), 1925–1975.
[115] K. Iosida. Functional analysis. Springer, 1965.
[116] S. Itô. Diffusion equation. Translations of Mathematical Monographs, 114. AMS, Providence, RI, 1992.
[117] L.M. Graves. Riemann Integration and Taylor's Theorem in General Analysis. Transactions of the American Mathematical Society 29:1 (1927), 163–177.
[118] N. Jacob. Pseudo-differential operators and Markov processes. Vol. II: Generators and their potential theory. Imperial College Press, London, 2002.
[119] O. Kallenberg. Foundations of Modern Probability. Sec. Ed., Springer, Berlin, 2002.
[120] A. Kilbas, H.M. Srivastava and J.J. Trujillo. Theory and applications of fractional differential equations. Elsevier, Amsterdam, 2006.
[121] P. Kim and R. Song. Estimates on Green functions and Schrödinger-type equations for non-symmetric diffusions with measure-valued drifts. J. Math. Anal. Appl. 332:1 (2007), 57–80.
[122] P. Kim and R. Song. Intrinsic ultracontractivity of nonsymmetric diffusions with measure-valued drifts and potentials. Ann. Probab. 36:5 (2008), 1904–1945.
[123] V. Kiryakova. Generalized fractional calculus and applications. Pitman Research Notes in Mathematics Series, 301. Longman Scientific, Harlow. Copublished in the United States with John Wiley and Sons, New York, 1994.
[124] V. Kiryakova. A brief story about the operators of the generalized fractional calculus. Fract. Calc. Appl. Anal. 11:2 (2008), 203–220.


[125] V. Kiryakova. From the hyper-Bessel operators of Dimovski to the generalized fractional calculus. Fract. Calc. Appl. Anal. 17:4 (2014), 977–1000.
[126] K. Kiyohara. Two-dimensional geodesic flows having first integrals of higher degree. Math. Ann. 320:3 (2001), 487–505.
[127] V. Knopova and A. Kulik. Parametrix construction of the transition probability density of the solution to an SDE driven by α-stable noise. Annales de l'Institut Henri Poincaré. Probabilités et Statistiques 54:1 (2018), 100–140.
[128] A. Kochubei. Parabolic pseudodifferential equations, hypersingular integrals, and Markov processes. Math USSR Izvestiya 33:2 (1989), 233–259.
[129] A. Kochubei. General fractional calculus, evolution equations, and renewal processes. Integral Equations Operator Theory 71:4 (2011), 583–600.
[130] A.N. Kochubei and Y. Kondratiev. Fractional kinetic hierarchies and intermittency. Kinet. Relat. Models 10:3 (2017), 725–740.
[131] A.N. Kolmogorov and S.V. Fomin. Elements of the theory of functions and functional analysis (in Russian). Nauka, Moscow, 1976.
[132] V.N. Kolokoltsov. Geodesic flows on two-dimensional manifolds with an additional first integral that is polynomial with respect to velocities. Izv. Akad. Nauk SSSR Ser. Mat. 46:5 (1982), 994–1010 (in Russian).
[133] V.N. Kolokoltsov. New examples of manifolds with closed geodesics. Vestnik Moskov. Univ. Ser. I Mat. Mekh. 4 (1984), 80–82 (in Russian).
[134] V.N. Kolokoltsov. Symmetric Stable Laws and Stable-like Jump-Diffusions. Proc. London Math. Soc. 3:80 (2000), 725–768.
[135] V.N. Kolokoltsov. Small diffusion and fast dying out asymptotics for superprocesses as non-Hamiltonian quasi-classics for evolution equations. Electronic Journal of Probability 6 (2001), paper 21.
[136] V.N. Kolokoltsov. Semiclassical Analysis for Diffusions and Stochastic Processes. Springer LNM, vol. 1724, Springer 2000.
[137] V.N. Kolokoltsov. A new path integral representation for the solutions of the Schrödinger and stochastic Schrödinger equation. Math. Proc. Cam. Phil. Soc. 132 (2002), 353–375.
[138] V.N. Kolokoltsov. On the singular Schrödinger equations with magnetic fields. Matem. Zbornik 194:6 (2003), 105–126 (in Russian). Engl. transl. Sbornik Mathematics, p. 897–918.
[139] V.N. Kolokoltsov. Hydrodynamic limit of coagulation-fragmentation type models of k-nary interacting particles. Journal of Statistical Physics 115, 5/6 (2004), 1621–1653.
[140] V.N. Kolokoltsov. Measure-valued limits of interacting particle systems with k-nary interaction II. Finite-dimensional limits. Stochastics and Stochastics Reports 76:1 (2004), 45–58.

[141] V.N. Kolokoltsov. Kinetic equations for the pure jump models of k-nary interacting particle systems. Markov Processes and Related Fields 12 (2006), 95–138.
[142] V.N. Kolokoltsov. On the regularity of solutions to the spatially homogeneous Boltzmann equation with polynomially growing collision kernel. Advanced Studies in Contemp. Math. 12 (2006), 9–38.
[143] V.N. Kolokoltsov. Nonlinear Markov Semigroups and Interacting Lévy Type Processes. Journ. Stat. Physics 126:3 (2007), 585–642.
[144] V.N. Kolokoltsov. Generalized Continuous-Time Random Walks (CTRW), Subordination by Hitting Times and Fractional Dynamics. arXiv:0706.1928v1 [math.PR] 2007. Probab. Theory and Applications 53:4 (2009).
[145] V.N. Kolokoltsov. The central limit theorem for the Smoluchovski coagulation model. arXiv:0708.0329v1 [math.PR] 2007. Prob. Theory Relat. Fields 146:1 (2010), 87–153.
[146] V.N. Kolokoltsov. The Lévy–Khintchine type operators with variable Lipschitz continuous coefficients generate linear or nonlinear Markov processes and semigroups. arXiv:0911.5688 (2009). Prob. Theory Related Fields 151 (2011), 95–123.
[147] V.N. Kolokoltsov. Nonlinear Markov processes and kinetic equations. Cambridge Tracts in Mathematics 182, Cambridge Univ. Press, 2010.
[148] V.N. Kolokoltsov. Markov processes, semigroups and generators. De Gruyter Studies in Mathematics vol. 38, De Gruyter, 2011.
[149] V.N. Kolokoltsov. Stochastic Integrals and SDE Driven by Nonlinear Lévy Noise. In: D. Crisan (Ed). Stochastic Analysis 2010. Springer, Berlin Heidelberg, 2011, p. 227–242.
[150] V.N. Kolokoltsov. Nonlinear Markov games on a finite state space (mean-field and binary interactions). International Journal of Statistics and Probability 1:1 (2012), 77–91. http://www.ccsenet.org/journal/index.php/ijsp/article/view/16682
[151] V.N. Kolokoltsov. Nonlinear diffusions and stable-like processes with coefficients depending on the median or VaR. Applied Mathematics and Optimization 68:1 (2013), 85–98.
[152] V.N. Kolokoltsov. On fully mixed and multidimensional extensions of the Caputo and Riemann–Liouville derivatives, related Markov processes and fractional differential equations. Fract. Calc. Appl. Anal. 18:4 (2015), 1039–1073. http://arxiv.org/abs/1501.03925.
[153] V.N. Kolokoltsov. Stochastic monotonicity and duality of kth order with application to put-call symmetry of powered options. http://arxiv.org/abs/1405.3894 Journal of Applied Probability 52:1 (2015), 82–101.

[154] V.N. Kolokoltsov. The evolutionary game of pressure (or interference), resistance and collaboration (2014). MOR (Mathematics of Operations Research), 42 (2017), no. 4, 915–944.
[155] V.N. Kolokoltsov and O.A. Malafeyev. Understanding Game Theory. World Scientific, Singapore, 2010.
[156] V.N. Kolokoltsov and O.A. Malafeyev. Mean field game model of corruption. Dynamic Games and Applications 7:1 (2017), 34–47. Open Access mode.
[157] V.N. Kolokoltsov and O.A. Malafeyev. Many agent games in socio-economic systems: corruption, inspection, coalition building, network growth, security. Springer, 2019.
[158] V.N. Kolokoltsov and V.P. Maslov. Idempotent analysis and its applications. Kluwer Publishing House, 1997.
[159] V.N. Kolokoltsov and A.E. Tyukov. Boundary-value problems for Hamiltonian systems and absolute minimizers in calculus of variations. Electron. J. Differential Equations, No. 90 (2006).
[160] V. Kolokoltsov and M. Veretennikova. Fractional Hamilton Jacobi Bellman equations for scaled limits of controlled Continuous Time Random Walks. Communications in Applied and Industrial Mathematics 6:1 (2014), e-484. DOI: 10.1685/journal.caim.484 http://caim.simai.eu/index.php/caim/article/view/484/PDF
[161] V. Kolokoltsov and M. Veretennikova. Well-posedness and regularity of the Cauchy problem for nonlinear fractional in time and space equations. http://arxiv.org/abs/1402.6735. ‘Fractional Differential Calculus’ 4:1 (2014), 1–30, http://files.ele-math.com/articles/fdc-04-01.pdf
[162] V. Kolokoltsov and M. Troeva. On the mean field games with common noise and the McKean–Vlasov SPDEs. arXiv:1506.04594. To appear in Stochastic Analysis and Applications.
[163] V. Kolokoltsov, M. Troeva and W. Yang. On the rate of convergence for the mean-field approximation of controlled diffusions with large number of players. Dyn. Games Appl. 4:2 (2014), 208–230.
[164] V.N. Kolokoltsov and W. Yang. Existence of solutions to path-dependent kinetic equations and related forward-backward systems. Open Journal of Optimization 2:2 (2013), 39–44.
[165] V.N. Kolokoltsov and W. Yang. Sensitivity analysis for HJB equations with an application to coupled backward-forward systems (2013). arXiv:1303.6234
[166] A.N. Kolmogorov and S.V. Fomin. Elements of the theory of functions and functional analysis. Moscow, Nauka, 6th Edition, 1989 (in Russian). Arabic translation by Dar Mir, Moscow, 1988. Portuguese translation by Mir, Moscow, 1982. German translation as Hochschulbücher für Mathematik, Band 78. VEB Deutscher Verlag der Wissenschaften, Berlin, 1975. French translation by Mir, Moscow, 1974. English translation by Dover Publications, Inc., New York, 1975.
[167] V. Konakov, S. Menozzi and S. Molchanov. Explicit parametrix and local limit theorems for some degenerate diffusion processes. Ann. Inst. Henri Poincaré Probab. Stat. 46:4 (2010), 908–923.
[168] Y. Kondratiev, T. Kuna and N. Ohlerich. Spectral gap for Glauber type dynamics for a special class of potentials. Electron. J. Probab. 18 (2013), no. 42.
[169] Y. Kondratiev, T. Pasurek and M. Röckner. Gibbs measures of continuous systems: an analytic approach. Rev. Math. Phys. 24:10 (2012), 1250026.
[170] V.V. Kozlov. Polynomial conservation laws for the Lorentz and the Boltzmann–Gibbs gases. Uspekhi Mat. Nauk 71:2 (2016), 81–120 (in Russian). Engl. transl. Russian Math. Surveys 71:2 (2016), 253–290.
[171] N.N. Krasovskii and A.I. Subbotin. Game-Theoretical Control Problems. New York, Springer, 1988.
[172] B. Kruglikov. Invariant characterization of Liouville metrics and polynomial integrals. J. Geom. Phys. 58:8 (2008), 979–995.
[173] A.M. Kulik. On weak uniqueness and distributional properties of a solution to an SDE with α-stable noise. To appear in SPA.
[174] H. Kunita. Stochastic Flows and Stochastic Differential Equations. Cambridge studies in advanced mathematics, vol. 24. Cambridge Univ. Press, 1990.
[175] O.A. Ladyzhenskaia, V.A. Solonnikov, and N.N. Uraltceva. Linear and quasilinear equations of parabolic type. Translations of mathematical monographs, vol. 23. Providence, American Mathematical Society, 1968. Russian edition published in Moscow in 1967.
[176] V. Lakshmikantham, A.R. Mitchell and R.W. Mitchell. Differential equations on closed subsets of a Banach space. Transactions of the AMS 220 (1976), 103–113.
[177] V. Lakshmikantham and J. Vasundhara Devi. Theory of Fractional Differential Equations in a Banach Space. European Journal of Pure and Applied Mathematics 1:1 (2008), 38–45.
[178] V. Lakshmikantham et al. Theory of causal differential equations. Atlantis studies in mathematics for engineering and science, vol. 5. Amsterdam, Paris. Atlantis Press, World Scientific, 2009.
[179] N. Laskin. Fractional Schrödinger equation. Phys. Rev. E 66 (2002), 056108.
[180] J.-M. Lasry and P.-L. Lions. Jeux à champ moyen. I. Le cas stationnaire (French). C.R. Math. Acad. Sci. Paris 343:9 (2006), 619–625.
[181] Ph. Laurençot and S. Mischler. Global existence for the discrete diffusive coagulation-fragmentation equations in L^1. Rev. Mat. Iberoamericana 18:3 (2002), 731–745.

[182] R. Léandre. Développement asymptotique de la densité d’une diffusion dégénérée. Forum Math. 4 (1992), 45–75.
[183] N.N. Leonenko, M.M. Meerschaert and A. Sikorskii. Correlation structure of fractional Pearson diffusions. Comput. Math. Appl. 66:5 (2013), 737–745.
[184] N.N. Leonenko, M.M. Meerschaert and A. Sikorskii. Fractional Pearson diffusions. J. Math. Anal. Appl. 403 (2013), 532–546.
[185] N.N. Leonenko, I. Papic, A. Sikorskii and N. Suvak. Heavy-tailed fractional Pearson diffusions. Stochastic Processes and their Applications 127 (2017), 3512–3535.
[186] Ch. Li and F. Zeng. Numerical methods for fractional calculus. CRC Press, Boca Raton, 2015.
[187] J. Lindenstrauss and L. Tzafriri. Classical Banach Spaces, vols. 1 and 2. Springer-Verlag, 1977.
[188] W. Liu, M. Röckner and J.L. da Silva. Quasi-linear (stochastic) partial differential equations with time-fractional derivatives. SIAM J. Math. Anal. 50:3 (2018), 2588–2607.
[189] S.G. Lobanov and O.G. Smolyanov. Ordinary differential equations in locally convex spaces (Russian). Uspekhi Mat. Nauk 49:3 (1994), 93–168. English Transl. in Russian Math. Surveys 49:3 (1994), 97–175.
[190] J. Lőrinczi, F. Hiroshima and V. Betz. Feynman–Kac-type theorems and Gibbs measures on path space. With applications to rigorous quantum field theory. De Gruyter Studies in Mathematics, 34. Walter de Gruyter, Berlin, 2011.
[191] G. Lv, J. Duan, H. Gao and J.-L. Wu. On a stochastic nonlocal conservation law in a bounded domain. Bull. Sci. Math. 140:6 (2016), 718–746.
[192] R.L. Magin. Fractional Calculus in Bioengineering. Begell House Publisher, Inc, Connecticut, 2006.
[193] S. Maniglia. Probabilistic representation and uniqueness results for measure-valued solutions of transport equations. J. Math. Pures Appl. (9) 87:6 (2007), 601–626.
[194] O.A. Malafeyev. Controlled conflict systems. Petersburg University, 2000 (in Russian).
[195] A.B. Malinowska and D.F.M. Torres. Introduction to the Fractional Calculus of Variations. Imperial College Press, 2012.
[196] A.B. Malinowska, T. Odzijewicz and D.F.M. Torres. Advanced Methods in the Fractional Calculus of Variations. Springer, Heidelberg, 2015.
[197] R.H. Martin. Nonlinear operators and differential equations in Banach spaces. New York, 1976.
[198] P.R. Masani. Multiplicative Riemann integration in normed rings. Trans. Amer. Math. Soc. 61 (1947), 147–192.

[199] V.P. Maslov. Perturbation Theory and Asymptotical Methods. Moscow State University Press, 1965 (in Russian). French Transl. Dunod, Paris, 1972.
[200] V.P. Maslov. Méthodes Opératorielles. Moscow, Nauka 1974 (in Russian). French transl. Moscow, Mir, 1987.
[201] V.P. Maslov. Complex Markov Chains and Functional Feynman Integral. Moscow, Nauka, 1976 (in Russian).
[202] V.P. Maslov and M.V. Fedoryuk. Semiclassical Approximation in Quantum Mechanics. Nauka, Moscow, 1976 (in Russian). Engl. transl. Reidel, Dordrecht, 1981.
[203] V.S. Matveev and P.I. Topalov. Geodesic equivalence of metrics on surfaces, and their integrability. (Russian) Dokl. Akad. Nauk 367:6 (1999), 736–738.
[204] W.M. McEneaney. A new fundamental solution for differential Riccati equations arising in control. Automatica (Journal of IFAC) 44:4 (2008), 920–936.
[205] M. Meerschaert, E. Nane and P. Vellaisamy. Distributed-order fractional diffusions on bounded domains. J. Math. Anal. Appl. 379:1 (2011), 216–228.
[206] M.M. Meerschaert and A. Sikorskii. Stochastic Models for Fractional Calculus. De Gruyter Studies in Mathematics Vol. 43, NY (2012).
[207] R. Metzler and J. Klafter. The random walk’s guide to anomalous diffusion: a fractional dynamics approach. Physics Reports 339:1 (2000), 1–77.
[208] P.-A. Meyer. Quantum Probability for Probabilists. Springer LNM, vol. 1538. Springer, 1991.
[209] V.M. Millionshchikov. On the theory of differential equations in locally convex spaces. (Russian) Mat. Sb. (N.S.) 57:4 (1962), 385–406.
[210] S. Mischler, B. Wennberg. On the spatially homogeneous Boltzmann equation. Ann. Inst. H. Poincaré Anal. Non Linéaire 16:4 (1999), 467–501.
[211] M. Naber. Time fractional Schrödinger equation. J. Math. Phys. 45 (2004), 3339–3352.
[212] M.A. Naimark. Normed rings. Translated from the first Russian edition by Leo F. Boron. Reprinting of the revised English edition. Wolters–Noordhoff Publishing, Groningen, 1970.
[213] A. Negoro. Stable-like processes: construction of the transition density and the behavior of sample paths near t = 0. Osaka J. Math. 31 (1994), 189–214.
[214] J. Norris. Cluster Coagulation. Comm. Math. Phys. 209 (2000), 407–435.
[215] J. Norris. A consistency estimate for Kac’s model of elastic collisions in a dilute gas. Ann. Appl. Probab. 26:2 (2016), 1029–1081.
[216] J. Norris. Measure solutions for the Smoluchowski coagulation-diffusion equation. ArXiv:1408.5228, 2014.
[217] L. Orsina, M.M. Porzio, F. Smarrazzo. Measure-valued solutions of nonlinear parabolic equations with logarithmic diffusion. J. Evol. Equ. 15:3 (2015), 609–645.

[218] F. Padula and A. Visioli. Advances in robust fractional control. Springer, 2015.
[219] A. Pazy. Semigroups of Linear Operators and Applications to Partial Differential Equations. Springer-Verlag, New York, 1983.
[220] I.G. Petrovskii. Lectures on the theory of ordinary differential equations (in Russian). Sixth corrected edition. Nauka, Moscow 1970. English translation by Prentice-Hall, Englewood Cliffs, N.J., 1966.
[221] H. Pham. Linear quadratic optimal control of conditional McKean–Vlasov equation with random coefficients and applications. Probab. Uncertain. Quant. Risk 1 (2016), Paper No. 7.
[222] I. Podlubny. Fractional differential equations, An introduction to fractional derivatives, fractional differential equations, to methods of their solution and some of their applications. Mathematics in Science and Engineering, vol. 198. Academic Press, Inc., San Diego (1999).
[223] H. Pollard. The completely monotonic character of the Mittag-Leffler function E_a(−x). Bull. Amer. Math. Soc. 54 (1948), 1115–1116.
[224] L.S. Pontryagin. Ordinary differential equations (in Russian). Fifth edition. Nauka, Moscow, 1982. Spanish translation by Aguilar, Madrid, 1973.
[225] M. Poppenberg. An application of the Nash–Moser theorem to ordinary differential equations in Fréchet spaces. Studia Mathematica 137:2 (1999), 101–121.
[226] M.M. Porzio and F. Smarrazzo. Radon measure-valued solutions for some quasilinear degenerate elliptic equations. Ann. Mat. Pura Appl. (4) 194:2 (2015), 495–532.
[227] F.O. Porper and S.D. Eidelman. Two-sided estimates of fundamental solutions of second-order parabolic equations, and some applications. Uspekhi Mat. Nauk 39:3 (1984), 107–156. Engl. transl. in Russian Math. Surv. 39:3 (1984), 119–178.
[228] A.V. Pskhu. Fundamental solution of the diffusive wave equation of fractional order (in Russian). Izvestia RAN, Ser. Math. 73:2 (2009), 141–182.
[229] A.V. Pskhu. Partial differential equations of fractional order (in Russian). Nauka, Moscow (2005).
[230] L. Rass and J. Radcliffe. Spatial Deterministic Epidemics. Mathematical Surveys and Monographs, vol. 102. AMS 2003.
[231] M. Reed and B. Simon. Methods of Modern Mathematical Physics, vol. 1, Functional Analysis. Academic Press, N.Y. 1972.
[232] M. Reed and B. Simon. Methods of Modern Mathematical Physics, vol. 2, Harmonic Analysis. Academic Press, N.Y. 1975.
[233] R.M. Redheffer. The Theorems of Bony and Brezis on Flow-Invariant Sets. The American Mathematical Monthly 79:7 (1972), 740–747.

[234] S. Rjasanow and W. Wagner. Stochastic numerics for the Boltzmann equation. Springer Series in Computational Mathematics, 37. Springer-Verlag, Berlin, 2005.
[235] A.P. Robertson and W. Robertson. Topological vector spaces. Cambridge University Press, 1973.
[236] B.N. Sadovskii. A fixed-point principle. Funktsional. Anal. i Prilozhen. 1:2 (1967), 74–76.
[237] R.S. Saha. Fractional calculus with applications for nuclear reactor dynamics. CRC Press, Boca Raton, 2016.
[238] S.G. Samko, A.A. Kilbas, O.I. Marichev. Fractional integrals and derivatives. Theory and applications. Translated from the 1987 Russian original. Gordon and Breach Science Publishers, Yverdon, 1993.
[239] S.G. Samko. Hypersingular integrals and their applications (in Russian). Rostov State University, 1984. Engl. transl. Analytical Methods and Special Functions, 5. Taylor and Francis, London, 2002.
[240] H.H. Schaefer. Banach Lattices and Positive Operators. Springer, Berlin-Heidelberg, 1974.
[241] R.L. Schilling, R. Song and Z. Vondracek. Bernstein Functions. Theory and Applications. Studies in Math 37, De Gruyter, 2010.
[242] W. Schneider. Completely monotone generalized Mittag-Leffler functions. Expo Math 14 (1996), 3–16.
[243] A. Schumacher. Second order Banach space valued differential equations: a semigroup approach. Arch. Math. 81 (2003), 446–456.
[244] R.E. Showalter. Monotone Operators in Banach Space and Nonlinear Partial Differential Equations. Mathematics Surveys and Monographs 49, American Mathematical Society, 1997.
[245] M.A. Shubin. Pseudo-differential operators. Moscow, Nauka, 1978 (in Russian).
[246] M.V. Simkin and V.P. Roychowdhury. Re-inventing Willis. Physics Reports 502 (2011), 1–35.
[247] M. Slemrod. Dynamics of Measure Valued Solutions to a Backward-Forward Heat Equation. Journal of Dynamics and Differential Equations 3:1 (1991), 1–28.
[248] G.V. Smirnov. Introduction to the Theory of Differential Inclusions. Providence, RI, AMS, 2001.
[249] O.G. Smolyanov. Analysis in topological linear spaces. Moscow State University, 1979 (in Russian).
[250] A.I. Subbotin. Generalized Solutions of First Order PDEs: The Dynamical Optimization Perspective. Boston, Birkhäuser, 1995.

[251] N.N. Subbotina et al. The Method of Characteristics for the Hamilton–Jacobi–Bellman equation. Ekaterinburg, RIO RAN, 2013 (in Russian).
[252] I.A. Taimanov. Topological obstructions to the integrability of geodesic flows on nonsimply connected manifolds. Izv. Akad. Nauk SSSR Ser. Mat. 51:2 (1987), 429–435 (in Russian). Engl. transl. Math. USSR-Izv. 30:2 (1988), 403–409.
[253] V.E. Tarasov. Fractional Dynamics, Applications of Fractional Calculus to Dynamics of Particles, Fields and Media. Springer, Higher Education Press (2011).
[254] M.E. Taylor. Pseudo-differential operators. Princeton University Press, 1981.
[255] V.V. Uchaikin. Fractional Derivatives for Physicists and Engineers. Springer (2012).
[256] S. Umarov. Introduction to fractional pseudo-differential equations with singular symbols. Developments in Mathematics, 41. Springer, 2015.
[257] V.V. Vasil’ev and S.I. Piskarëv. Differential equations in a Banach space (in Russian). Moscow University Press, 1996.
[258] C. Villani. On the spatially homogeneous Landau equation for Maxwellian molecules. Math. Models Methods Appl. Sci. 8:6 (1998), 957–983.
[259] C. Villani. On a new class of weak solutions to the spatially homogeneous Boltzmann and Landau equations. Arch. Rational Mech. Anal. 143:3 (1998), 273–307.
[260] V. Vedenyapin, A. Sinitsyn and E. Dulov. Kinetic Boltzmann, Vlasov and related equations. Elsevier, Inc., Amsterdam, 2011.
[261] V.S. Vladimirov. Equations of mathematical physics (in Russian). Moscow, Nauka, 1988.
[262] I.I. Vrabie. Compactness Methods for Nonlinear Evolutions. Pitman Monographs and Surveys in Pure and Applied Mathematics 32, Longman Scientific, 1987.
[263] B.J. West. Fractional calculus View of Complexity. Tomorrow’s Science. CRC Press, Boca Raton, 2016.
[264] S. Yamamuro. Differential Calculus in Topological Linear Spaces. Springer LNM 374, Springer, Berlin, 1974.
[265] R. Yano. On quantum Fokker–Planck equation. J. Stat. Phys. 158:1 (2015), 231–247.
[266] V.M. Zolotarev. On analytic properties of stable distribution laws (in Russian). Vestnik Leningrad. Univ. 11:1 (1956), 49–52.
[267] V.M. Zolotarev. One-dimensional Stable Distributions. Moscow, Nauka, 1983 (in Russian). Engl. transl. in vol. 65 of Translations of Mathematical Monographs AMS, Providence, Rhode Island, 1986.

[268] G. Zou, G. Lv and J.-L. Wu. On the regularity of weak solutions to space-time fractional stochastic heat equations. Statist. Probab. Lett. 139 (2018), 84–89.
[269] G. Zou, G. Lv and J.-L. Wu. Stochastic Navier–Stokes equations with Caputo derivative driven by fractional noises. J. Math. Anal. Appl. 461:1 (2018), 595–609.

Index

∗-weak topology, 2
abstract M-space, 79
accretive mapping, 152, 195
accretive relation, 152
accretivity in weighted norm, 196, 197
autocatalysis, 166
backward Cauchy problem, 124
backward propagator, 259
Baire theorem, 33
Banach lattice, 79
Banach space
  dual pair, 2
Banach tower, 221
Bellman equation, 122
  jump processes, 123, 124
  jump processes, well-posedness, 124
binary relation, 151
  contraction, 152
Bochner integral, 15
Bogolyubov chains, 424
Boltzmann collisions, 419
Boltzmann’s equation
  spatially trivial, 420
bounded set, 31
bp-topology, 38
branching, 176
Cahn–Hilliard equation, 370
Caputo fractional derivative, 44
  generalized, 455
catalyst, 166
Cauchy principal value, 61
Cauchy sequence, 29
causal equations, 135, 136
causal integral operator, 453

chain rule, 259
chemical reaction, 165
  order, 165
  mechanism, 166
chronological exponential, 18, 95, 290
  backward, 96
coagulation, 176
coagulation kernel, 419, 420
collision, 176
collision breakage, 176, 420
collision kernel, 420, 421
completely monotone function, 345
complex balance point, 172
complex diffusion equation, 255, 278
complex matrix, 168
conditional positivity, 335
  for matrices, 160
  mappings in R^∞, 160
  multilinear mappings, 181
  strong form, 181
contraction, 2
contraction principle, 482
  generalized, 481
convolution semigroup, 341
Csiszar–Morimoto entropy, 170
δ-sequence, 224
decomposable measures, 417
delay equations, 135
detailed balance, 171
detailed balance point, 172
diffusion equation, 68, 224, 254
  complex, 225
Dirac measure, 4
directional derivative, 9, 35
Dirichlet boundary condition, 229
Dirichlet formulae, 484


discrete measure, 4
dissipative operator, 152
dual Banach space, 2
dual operator, 2
dual pair of Banach spaces, 77
dual space, 32
  strong topology, 32
  weak topology, 32
Duhamel formula, 93
elliptic polynomial, 330
equicontinuity, 29
Erdélyi’s lemma, 492
evolutionary coalition building, 180
explosion, 92
Feller propagator, 337
Feller semigroup, 337
  conservative, 337
Feynman–Kac formula, 477
  stationary, 477
  time-ordered operator-valued, 477
first order kinetics
  ergodic, 169
  weakly reversible, 169
first-order kinetics, 169
forward-backward systems, 394, 414
Fourier theorem, 40
Fourier transform, 40
  inversion formula, 41
Fréchet derivative, 36
  strong, 36
Fréchet space, 30
fractional complex diffusion, 448
fractional derivative
  generalized, 453
  spectral measure, 54
  symmetric, 51
  symmetric mixed, 53
fractional derivatives in generator form, 454
fractional Feller evolution, 448, 479
fractional Laplacian, 55
fractional RL integral, 44
  generalized, 455
  left, 49

fractional Schrödinger equation, 448, 478
  regularized, 448
fragmentation, 420
fragmentation kernel, 420
frozen coefficients, 309
fundamental solution, 65
  for Cauchy problem, 67
Gâteaux derivative, 10, 35
  compatible with duality, 77
Gaussian diffusion operator, 230
Gaussian diffusion semigroup, 230
generalized function, 34, 56
  convolution, 58
  differentiation, 34, 57
  direct product, 57
  Fourier transform, 57
  pointwise product, 58
  support, 58
  tempered, 34, 56
generalized solution, 57
  by duality, 334
  to the Cauchy problem, 218
  via approximation, 334, 449, 458
  via discrete approximations, 293
generator of a strongly continuous semigroup, 215
generators of order at most one, 351
geodesic flow, 111
Ginzburg–Landau equation, 363
Godunov’s theorem, 157
Green function, 67, 68, 99, 224, 230
  for fractional derivative, 103
  for fractional derivative, Mellin transform, 444
Hadamard derivative, 36
Hamilton–Jacobi–Bellman (HJB) equation, 363
Hamilton–Jacobi–Bellman equation, 122
Hamiltonian
  optimization theory, 122
  with Lipschitz minimizer, 395
Hamiltonian equations, 105
heat conduction equation, 68, 224, 254, 278

heat kernel, 67, 99, 224, 230
  for fractional derivative, 103
Holand semigroup, 263
hyper-singular integrals, 56
inductive limit, 33
  of linear spaces, 34
intensity of transitions, 418
interest driven migration, 176
iterated Riemann integral, 43
KdV-equation, 92
kinetic equation, 177, 421
  anticipating, 375
  causal, 374
  path dependent, 374, 413
  weak form, 178, 418
kinetic system, 167
  complex balanced, 172
  conservative, 172
  detailed balanced, 172
Kolmogorov’s diffusion, 231
Kolmogorov’s forward equation, 176
Lévy exponent, 340
Lévy kernel, 335
Lévy measure, 339
Lévy–Khintchin operators, 339
Lévy–Khintchin-type operators, 39, 335
Landau–Fokker–Planck equation, 422
Laplace exponent, 345
Laplacian, 223
lattice, 78
  distributive, 78
Lebesgue decomposition theorem, 6
Lie–Trotter formula, 296
linear Cauchy problem, 259
linear operator, 2
  bounded, 2, 32
  closable, 215
  closed, 214
  closure, 215
  core, 215
  densely defined, 2
  domain, 2
  norm of, 3
  strong convergence, 3
  strong topology, 3
linear topological space, 27
  metricizable, 29
locally convex space, 27
  barrelled, 33
  bornological, 32
Lomonosov–Lavoisier law, 178
lower and upper semi-inner product, 20
Lyapunov function, 408
  subcritical, 408
m-accretive relation, 152
Markov chain
  graph, 169
  Kolmogorov’s forward equation, 169
mass-action-law, 165
mass-action-law kinetics, 178
mass-exchange process, 178
master equation, 169, 415
maximum principle, 335
McKean–Vlasov diffusions, 387, 422
measurable mapping, 14
  Banach-space-valued, 14
method of duality, 263
mild equation, 246
  for fractional evolutions, 451
mild solution, 246, 273, 370
  to HJB, 363, 368
Minkowski functional, 27
Mittag-Leffler function, 484
mixed states, 416
molecularity, 167
monotone mapping, 151
Morimoto H-theorem, 170
multiple coagulation, 420
multiplicative Riemann integral, 18
Neumann boundary condition, 228
nonlinear diffusion, 385
  complex, 386
nonlinear quantum dynamic semigroup, 404
observables, 416
  decomposable, 417
operator of order at most one, 454
  left-sided, 454
  right-sided, 454
operators of at most kth order, 38
Ornstein–Uhlenbeck diffusion, 230
  measure-valued, 270
Ovsyannikov’s method, 358
pairwise mutations, 176
Paley–Wiener theorem, 41
path integral, 252
  for linear fractional evolution, 472
  Schrödinger equation, 253
Peano’s theorem, 157
perturbation theory, 244
Poisson bracket, 113
porous medium equation, 153
positive maximum principle (PMP), 335
positive-definite kernels, 270
potential measure, 346
potential operator, 218
preferential attachment, 179
principle of uniform boundedness, 33
probability kernel, 37
probability measure, 4
Prokhorov’s compactness criterion, 5
propagator, 259
  generated by, 259
  of generalized solutions, 261
  solving Cauchy problem, 259
  strongly continuous, 261
propagator equation, 259
pseudo-differential operator, 42
  symbol of, 42
quasi-accretive mapping, 195, 196
qubit, 79
Radon measure, 4
  complex, 4
  dimensionality, 255
reflexive Banach space, 2
replicator dynamics, 164, 177, 422
  generalized, 423
resolvent, 216, 224
resolvent equation, 217
Riccati equation, 268
  backward, 269
Riemann–Lebesgue lemma, 41

Riesz–Markov theorem, 5, 37
RL fractional derivative, 44
  generalized, 455
Schrödinger equation, 225, 253
  regularized, 226
  with magnetic fields, 255
  with magnetic fields, regularized, 255, 278
Schwartz space, 30
second quantization, 433
semi-norm, 27
semigroup of linear operators, 213
  equicontinuous, 214
  strongly continuous, 213
  type of growth, 219
sensitivity, 125
  discrete kinetic equations, 205
  for nonlinear propagators, 401
  fractional HJB, 452
  fractional ODEs, 146
  general kinetic equation, 431
  HJB equation, 369
  integral equations, 125, 145
  McKean–Vlasov diffusion, 389
  McKean–Vlasov diffusion, second order, 392
  of ODEs, 131
  simplest nonlinear diffusion, 384
Smoluchowski coagulation-fragmentation, 179
Smoluchowski’s equation, 419
Sobolev space, 70
  local, 71
Sobolev embedding, 70
spectral measure, 54
stable densities, 104
stoichiometric coefficients, 165
stoichiometric space, 167
stoichiometric vectors, 167
strictly convex Banach space, 150
sub-gradient, 151
T-product, 18, 95, 290
  backward, 96, 291
Tauberian theorems, 492

Taylor expansion
  first order, 9
  second and third order, 74
tightness, 5
time-ordered exponential, 18, 95
Tonelli’s theorem, 111
topology of bounded convergence, 32
topology of pointwise convergence, 32
total variation measure, 4
  for complex measures, 4
transition kernel, 37, 279
  additively bounded, 426
  bounded, 37
  critical, 424
  E preserving, 424
  multiplicatively bounded, 426
  signed, 37
  subcritical, 424
  weakly continuous, 37, 280
uniformly convex Banach space, 150
variational derivative, 71
  strong or weak, 75, 410
vector lattice, 79
viscosity solution, 402
Vlasov’s equation, 421
Watson’s lemma, 492
weak topology, 2, 6
  for a pair, 2
  for measures, 5
Weierstrass function, 116
Yosida approximation, 220
