Springer Monographs in Mathematics

Shin-ichi Ohta

Comparison Finsler Geometry

Springer Monographs in Mathematics

Editors-in-Chief
Minhyong Kim, School of Mathematics, Korea Institute for Advanced Study, Seoul, South Korea; Mathematical Institute, University of Warwick, Coventry, UK
Katrin Wendland, Research group for Mathematical Physics, Albert Ludwigs University of Freiburg, Freiburg, Germany

Series Editors
Sheldon Axler, Department of Mathematics, San Francisco State University, San Francisco, CA, USA
Mark Braverman, Department of Mathematics, Princeton University, Princeton, NY, USA
Maria Chudnovsky, Department of Mathematics, Princeton University, Princeton, NY, USA
Tadahisa Funaki, Department of Mathematics, University of Tokyo, Tokyo, Japan
Isabelle Gallagher, Département de Mathématiques et Applications, Ecole Normale Supérieure, Paris, France
Sinan Güntürk, Courant Institute of Mathematical Sciences, New York University, New York, NY, USA
Claude Le Bris, CERMICS, Ecole des Ponts ParisTech, Marne la Vallée, France
Pascal Massart, Département de Mathématiques, Université de Paris-Sud, Orsay, France
Alberto A. Pinto, Department of Mathematics, University of Porto, Porto, Portugal
Gabriella Pinzari, Department of Mathematics, University of Padova, Padova, Italy
Ken Ribet, Department of Mathematics, University of California, Berkeley, CA, USA
René Schilling, Institute for Mathematical Stochastics, Technical University Dresden, Dresden, Germany
Panagiotis Souganidis, Department of Mathematics, University of Chicago, Chicago, IL, USA
Endre Süli, Mathematical Institute, University of Oxford, Oxford, UK
Shmuel Weinberger, Department of Mathematics, University of Chicago, Chicago, IL, USA
Boris Zilber, Mathematical Institute, University of Oxford, Oxford, UK

This series publishes advanced monographs giving well-written presentations of the “state-of-the-art” in fields of mathematical research that have acquired the maturity needed for such a treatment. They are sufficiently self-contained to be accessible to more than just the intimate specialists of the subject, and sufficiently comprehensive to remain valuable references for many years. Besides the current state of knowledge in its field, an SMM volume should ideally describe its relevance to and interaction with neighbouring fields of mathematics, and give pointers to future directions of research.

Shin-ichi Ohta Department of Mathematics Osaka University Osaka, Japan

Dedicated to my parents

Preface

The main aim of this book is to present recent developments of comparison geometry and geometric analysis on Finsler manifolds in an accessible way to students and researchers who are familiar only with Riemannian geometry. We especially focus on a Finsler manifold endowed with a measure on it such that its weighted Ricci curvature is bounded from below by some constant. A Finsler manifold (or a Finsler space) is a manifold equipped with a (possibly asymmetric) norm on each tangent space, named after Paul Finsler [100] (see [55, 239]). This class of spaces arises as a subject of natural interest from many different (but related) viewpoints including:

(A) A generalization of Riemannian manifolds
(B) A nonlinearization of normed spaces
(C) A special (smooth) class of metric spaces
(D) A special (autonomous, positively 1-homogeneous) class of Lagrangian or Hamiltonian structures

The standpoint of this book is the combination of (A) and (C), which is reflected in the choice of topics. The fundamental book [25] of D. Bao, S.-S. Chern and Z. Shen in 2000 provided a comprehensive introduction to Finsler geometry, more directly following (A). In the present book, we do not repeat the exposition of [25] but try to be as geometric as possible, partly inspired by Z. Shen’s book [230] in 2001. The following quotation from H. Busemann’s paper [53] may also be suitable to explain our motivation:

The term “Finsler space” evokes in most mathematicians the picture of an impenetrable forest whose entire vegetation consists of tensors. The purpose of the present lecture is to show that the association of tensors (or differential forms) with Finsler spaces is due to an historical accident, and that, at least at the present time, the fruitful and relevant problems lie in a different direction.

In Part I, we start with the natural notion of an (asymmetric) distance function induced from a Finsler metric (Chap. 2). The geodesic equation is obtained by the standard variational argument (Chap. 3), and analyzing the behavior of geodesics, we are led to the notion of Jacobi fields. Via Jacobi fields, we can introduce the flag curvature, which corresponds to the sectional curvature in Riemannian geometry (Chap. 5). In this strategy, one naturally encounters an important observation that the flag curvature coincides with the sectional curvature with respect to a certain Riemannian metric induced from the original Finsler metric (Theorem 5.12). This observation goes back to (at least) O. Varga [241, 242] (see also [16, 220]), and has been diversely utilized by Z. Shen and the author. This part is closed with some important examples (Chap. 6) and comparison theorems (Chap. 8).

In Part II, in order to develop (geometric) analysis on Finsler manifolds, we equip a Finsler manifold with a measure (obtaining a so-called measured Finsler manifold). Unlike the Riemannian case, however, there is no unique canonical measure. Therefore, we take an arbitrary measure and, inspired by the Riemannian characterization of flag curvature mentioned above, introduce the weighted Ricci curvature (Chap. 9). Then, coupled with the corresponding notions of nonlinear Laplacian (Chap. 11) and nonlinear heat flow (Chap. 13), numerous geometric and analytic applications become possible. Among them, the Bochner–Weitzenböck formula and the corresponding Bochner inequality (Chap. 12) will play the most important role. They enable us to develop a nonlinear analogue of the celebrated Γ-calculus à la D. Bakry and his collaborators (see a recent book [21] for the linear theory). We in particular obtain Finsler counterparts of gradient estimates (Chap. 14), Bakry–Ledoux’s Gaussian isoperimetric inequality (Chap. 15), and functional inequalities (Chap. 16).

In Part III, we review three advanced topics linked with the weighted Ricci curvature. In Chap. 17, we discuss a generalization of Cheeger–Gromoll’s classical splitting theorem.
The splitting phenomenon in the general Finsler setting is not yet well understood and there remains room for further investigation (especially for non-Berwald spaces). The following chapters are devoted to two further powerful techniques in comparison geometry (besides the Γ-calculus): the curvature-dimension condition in Chap. 18 and the needle decomposition (also called the localization) in Chap. 19. The applications of these methods can be compared with those of the Γ-calculus in Part II. A common feature of these strategies (Γ-calculus, curvature-dimension condition, needle decomposition) is their validity on non-smooth metric measure spaces. This is the subject of recent intensive research (we refer to [8, 21, 59, 60, 244]), and Finsler manifolds provide an important class of examples (from the viewpoint (C) above).

We do stress that this is merely one of the possible strategies to study Finsler manifolds. Our description would be suitable for those who are familiar with comparison geometry and geometric analysis. There are other successful ways, for instance, directly based on connections [25], Lagrangian and Hamiltonian structures [50, 181], complex geometry [1], convex geometry, etc. Among them, the most famous approach is arguably the first one, developed by the French group (H. Akbar-Zadeh, É. Cartan, A. Lichnerowicz et al.), S.-S. Chern’s group and M. Matsumoto’s group.

The list of references is by no means exhaustive. In particular, since we focus on rather recent developments, some important contributions at earlier stages could be missed. I recommend H. Rund’s book [220] in 1959 for historical accounts.

The basic conception of this book stemmed from the author’s lecture courses at Kyoto University in 2012 and at Osaka University in 2017. The survey [202] was written in the same spirit, and this book largely extends its perspective. The introduction of the weighted Ricci curvature goes back to the author’s paper [193] published in 2009. Since then, there has been a lot of progress, part of which is included in this book. I hope that it keeps going on, and it would bring me great pleasure if this book provided some help.

To draw the attention of interested readers, I mention here three widely open subjects:

(a) Finsler–Ricci flow. The theory of Ricci flow is not well investigated for Finsler metrics. While the definition itself can be generalized to the Finsler situation, we know very little about its behavior. There are some works on the subclass consisting of Berwald metrics (e.g., [147]); however, it is seemingly unknown if this subclass is stable along the flow. (Precisely, when we formulate the Ricci flow in the space of general Finsler metrics, does a path starting from a Berwald metric remain Berwald in the future?) The evolution of the S-curvature is then a key issue. Another difficulty stems from the lack of an appropriate notion of scalar curvature. Scalar curvature plays an important role in various ways, including Perelman’s W-entropy and the study of Ricci solitons (see, for instance, [93, 215]). In the Finsler setting, however, it is unclear how to define the scalar curvature.

(b) Gradient flows of convex functions on Finsler manifolds.
Analysis of gradient flows of convex functions has quite wide applications in and outside of mathematics (evolution equations, geometric analysis, optimization, and machine learning, to name a few). Nonetheless, as for Finsler-like spaces, much less is known than for Riemannian-like spaces (where the angle makes sense). On the one hand, there is a quite successful theory of gradient flows in singular Riemannian-like spaces such as CAT(0)-spaces (nonpositively curved metric spaces) as in [9, 17]. On the other hand, as for Finsler-like spaces, even the behavior of gradient flows of convex functions on normed spaces is yet to be understood. We know that the contraction property (in the usual exponential form) fails in normed spaces (see [207]), which reveals the critical difference between Riemannian and Finsler situations.

(c) Maps between Finsler manifolds. As seen in Part II of this book, geometric analysis of functions on Finsler manifolds is now well investigated. The Bochner inequality is a key ingredient and there are fruitful analytic and geometric applications. In contrast, analysis of maps between Finsler manifolds is widely open. There are some results on harmonic maps (e.g., found in the book [227, Chap. 9] and the references therein; see also [74] for a different approach); however, much less is known than in the case of functions. (We remark that the energy functional for maps in [74, 227] is defined via an averaging procedure in tangent spaces. Thus, when we apply it to functions, the resulting energy functional does not coincide with that in this book.) This subject may be related to analysis of differential forms on Finsler manifolds, which seems even less studied.

Acknowledgments

I began to study Finsler geometry in 2007 (with the book [25] as a matter of course) during my long-term stay in Bonn supported by JSPS. After completing my first two papers [192, 193] in Finsler geometry, a fruitful collaboration with Theo Sturm started and resulted in the papers [206–208]. I learned a lot from these joint works, especially about analysis on manifolds and metric measure spaces. I would like to express my sincere gratitude to Theo Sturm for these and other collaborations, as well as his hospitality during my stays in Bonn. I am also grateful to anonymous reviewers for many valuable comments and suggestions. Last but not least, I thank my family for their constant support during the writing of this book, especially in the last round of revisions that were done mostly on a work-at-home basis during the COVID-19 pandemic. Actually I felt like writing this book was my homework.

Osaka, Japan

Shin-ichi Ohta

Contents

Part I Foundations of Finsler Geometry
1 Warm-Up: Norms and Inner Products
1.1 Norms and Inner Products
1.2 Three Characterizations of Inner Products
1.2.1 Sharp Uniform Convexity and Smoothness
1.2.2 Smoothness at the Origin
1.2.3 Center of Circumscribed Triangle
2 Finsler Manifolds
2.1 Minkowski Normed Spaces
2.2 Euler’s Homogeneous Function Theorem
2.3 Finsler Manifolds
2.4 Asymmetric Distance and Geodesics
2.5 Reverse Finsler Structures
3 Properties of Geodesics
3.1 Fundamental and Cartan Tensors
3.2 Dual Norms and the Legendre Transformation
3.3 The Geodesic Equation
3.4 The Exponential Map
3.5 Completenesses and the Hopf–Rinow Theorem
4 Covariant Derivatives
4.1 The Geodesic Equation Revisited
4.2 Covariant Derivatives
4.3 Covariant Derivatives Along Curves
4.4 The Chern Connection
5 Curvature
5.1 Jacobi Fields and the Curvature Tensor
5.2 Properties of the Curvature Tensor
5.3 Flag and Ricci Curvatures and Their Characterizations
5.4 Further Properties of the Curvature Tensor
6 Examples of Finsler Manifolds
6.1 Minkowski Normed Spaces
6.2 Finsler Manifolds of Constant Curvature
6.3 Berwald Spaces
6.3.1 Isometry of Tangent Spaces and Its Applications
6.3.2 T-Curvature
6.3.3 Characterizations of Berwald Spaces
6.4 Randers Spaces
6.5 Hilbert and Funk Geometries
6.6 Teichmüller Space
7 Variation Formulas for Arclength
7.1 First Variation Formula
7.2 Second Variation Formula
7.3 Cut Points and Conjugate Points
8 Some Comparison Theorems
8.1 The Bonnet–Myers Theorem
8.2 The Cartan–Hadamard Theorem
8.3 Uniform Convexity and Smoothness
8.3.1 Background: k-Convexity and k-Concavity
8.3.2 Uniform Convexity and Smoothness Constants
8.3.3 T-Curvature Revisited
8.3.4 k-Concavity of (M, F)
8.3.5 k-Convexity of (M, F)
8.4 Busemann NPC for Berwald Spaces
Part II Geometry and Analysis of Weighted Ricci Curvature
9 Weighted Ricci Curvature
9.1 Measures on Finsler Manifolds
9.2 Riemannian Weighted Ricci Curvature
9.3 Finsler Weighted Ricci Curvature
9.4 Volume and Diameter Comparison Theorems
10 Examples of Measured Finsler Manifolds
10.1 Minkowski Normed Spaces
10.2 Berwald Spaces
10.3 Randers Spaces
10.3.1 Properties of the S-Curvature
10.3.2 Randers Spaces of Vanishing S-Curvature
10.4 Hilbert and Funk Geometries
11 The Nonlinear Laplacian
11.1 Energy Functional and Sobolev Spaces
11.2 Laplacian and Harmonic Functions
11.3 Laplacian Comparison Theorem
11.4 Linearized Laplacians
12 The Bochner–Weitzenböck Formula
12.1 Hessian
12.2 Pointwise Formula
12.3 Integrated Formula
12.4 Improved Bochner Inequality
13 Nonlinear Heat Flow
13.1 Global Solutions
13.2 Existence
13.3 Large Time Behavior
13.4 Regularity
13.5 Linearized Heat Semigroups and Their Adjoints
14 Gradient Estimates
14.1 $L^2$-Gradient Estimate
14.2 $L^1$-Gradient Estimate
14.3 Characterizations of Lower Ricci Curvature Bounds
14.4 The Li–Yau Estimates
15 Bakry–Ledoux Isoperimetric Inequality
15.1 Background
15.2 Poincaré–Lichnerowicz Inequality and Variance Decay
15.3 The Key Estimate
15.4 Proof of Theorem 15.1
16 Functional Inequalities
16.1 Logarithmic Sobolev Inequality
16.1.1 Entropy Decay
16.1.2 Logarithmic Sobolev Inequality
16.2 Beckner Inequality
16.3 Sobolev Inequality
16.3.1 Logarithmic Entropy-Energy and Nash Inequalities
16.3.2 Sharp Sobolev Inequality
16.3.3 Addendum to the Proof of Theorem 16.17
Part III Further Topics
17 Splitting Theorems
17.1 Busemann Functions
17.2 Diffeomorphic Splitting
17.3 The Berwald Case
18 Curvature-Dimension Condition
18.1 Optimal Transport Theory
18.2 Curvature-Dimension Condition
18.3 Brunn–Minkowski Inequality
18.4 Analytic Applications
18.4.1 Functional Inequalities
18.4.2 Concentration of Measures
18.5 Further Developments
18.5.1 Riemannian Curvature-Dimension Condition
18.5.2 Heat Flow as Gradient Flow
18.5.3 Measure Contraction Property
19 Needle Decompositions
19.1 Lipschitz Functions and Optimal Transports
19.1.1 Transport Rays
19.1.2 Cyclical Monotonicity
19.2 Construction of Needle Decompositions
19.2.1 Transport Sets
19.2.2 Disintegration
19.2.3 Conditioned Version
19.3 Properties of Needles
19.4 Isoperimetric Inequalities
19.5 Further Applications
References
Index

List of Definitions and Formulas

Here we collect some important definitions and formulas: $(M, F)$ is a Finsler manifold of dimension $n$, $(x^i)_{i=1}^n$ is a local coordinate system on an open set $U \subset M$, $(x^i, v^j)_{i,j=1}^n$ denotes the associated fiber-wise linear local coordinate system on $TU$ such that $v = \sum_{j=1}^n v^j(v)\,(\partial/\partial x^j)|_{\pi(v)}$ for $v \in TU$, where $\pi : TM \longrightarrow M$ is the natural projection (see Sect. 2.3). For simplicity we denote $v^j(v)$ by $v^j$.
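As a concrete illustration of two entries in this list, the fundamental tensor (I-3) and the reversibility constant (I-11), the following is a minimal numerical sketch, not taken from the book. It assumes a hypothetical Randers-type Minkowski norm $F(v) = |v| + \langle b, v \rangle$ on $\mathbb{R}^2$ with $|b| < 1$; the vector `B`, the step size `h`, and the function names are illustrative choices only. The Hessian of $F^2/2$ is approximated by central differences, so the output is approximate.

```python
import math

# Hypothetical example data (an assumption, not from the book):
# F(v) = |v| + <b, v> on R^2 with |b| < 1, a positive, asymmetric norm.
B = (0.5, 0.0)

def F(v):
    """Randers-type Minkowski norm: Euclidean length plus a linear term."""
    return math.hypot(v[0], v[1]) + B[0] * v[0] + B[1] * v[1]

def fundamental_tensor(v, h=1e-4):
    """Approximate g_ij(v) = (1/2) d^2[F^2]/dv^i dv^j by central differences."""
    def half_f_squared(w):
        return 0.5 * F(w) ** 2

    n = len(v)
    g = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(n):
            def shifted(si, sj):
                # Evaluate F^2/2 at v + si*h*e_i + sj*h*e_j.
                w = list(v)
                w[i] += si * h
                w[j] += sj * h
                return half_f_squared(w)
            g[i][j] = (shifted(1, 1) - shifted(1, -1)
                       - shifted(-1, 1) + shifted(-1, -1)) / (4 * h * h)
    return g

def reversibility(samples=3600):
    """Estimate sup F(-v)/F(v) by sampling unit directions.

    The ratio is invariant under positive scaling of v, so sampling
    the Euclidean unit circle is enough in this 2-dimensional example.
    """
    best = 0.0
    for k in range(samples):
        t = 2 * math.pi * k / samples
        v = (math.cos(t), math.sin(t))
        best = max(best, F((-v[0], -v[1])) / F(v))
    return best

if __name__ == "__main__":
    v = (1.0, 2.0)
    g = fundamental_tensor(v)
    # Euler's theorem for the 2-homogeneous function F^2 gives g_v(v, v) = F(v)^2.
    quad = sum(g[i][j] * v[i] * v[j] for i in range(2) for j in range(2))
    print("g_v(v, v) =", quad, " F(v)^2 =", F(v) ** 2)
    print("estimated reversibility constant:", reversibility())
```

For this particular norm the supremum of $F(-v)/F(v)$ is attained where $\langle b, v \rangle = -|b|\,|v|$, giving $(1 + |b|)/(1 - |b|) = 3$ for the `B` chosen here, which the sampled estimate should reproduce closely. Finite differences were chosen only to keep the sketch dependency-free; an automatic-differentiation library would be a natural substitute.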

Part I (I-1)

Length of a piecewise C1 -curve η : [0, l] −→ M (Sect. 2.4): 

l

L(η) :=

  F η(t) ˙ dt.

0

(I-2)

Asymmetric distance from x to y (Sect. 2.4): d(x, y) := inf{L(η) | η : [0, 1] −→ M such that η(0) = x, η(1) = y}.

(I-3)

Components of the fundamental tensor in (3.1): gij (v) :=

(I-4)

1 ∂ 2 [F 2 ] (v), 2 ∂v i ∂v j

v ∈ T U \ 0.

Riemannian metric gv on Tπ(v) M for v ∈ T U \ 0 in (3.2): gv

 n i=1

  n n   ∂  ∂  := ai i  , bj j  ai bj gij (v). ∂x π(v) ∂x π(v) j =1

i,j =1

xv

xvi

List of Definitions and Formulas

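(I-5) Components of the Cartan tensor in (3.5):

$$A_{ijk}(v) := \frac{F(v)}{2}\frac{\partial g_{ij}}{\partial v^k}(v), \qquad v \in TU \setminus 0.$$

(I-6) Dual norm in (3.8):

$$F^*(\alpha) := \sup_{v \in T_xM,\ F(v) \le 1} \alpha(v) = \sup_{v \in T_xM,\ F(v) = 1} \alpha(v), \qquad \alpha \in T_x^*M.$$

(I-7) Components of the dual fundamental tensor in (3.10):

$$g^*_{ij}(\alpha) := \frac{1}{2}\frac{\partial^2 [(F^*)^2]}{\partial \alpha_i \partial \alpha_j}(\alpha), \qquad \alpha \in T^*U \setminus 0,$$

where $(x^i, \alpha_j)_{i,j=1}^n$ denotes the fiber-wise linear coordinate system on $T^*U$ such that $\alpha = \sum_{j=1}^n \alpha_j(\alpha)\,dx^j$.

(I-8) Legendre transformations (Lemma 3.8):

$$\mathcal{L}(v) = \frac{1}{2}\sum_{i=1}^n \frac{\partial [F^2]}{\partial v^i}(v)\,dx^i = \sum_{i,j=1}^n g_{ij}(v)\,v^j\,dx^i, \qquad v \in TU,$$

$$\mathcal{L}^*(\alpha) = \frac{1}{2}\sum_{i=1}^n \frac{\partial [(F^*)^2]}{\partial \alpha_i}(\alpha)\frac{\partial}{\partial x^i} = \sum_{i,j=1}^n g^*_{ij}(\alpha)\,\alpha_j\frac{\partial}{\partial x^i}, \qquad \alpha \in T^*U.$$

(I-9) Formal Christoffel symbols in (3.14):

$$\gamma^i_{jk}(v) := \frac{1}{2}\sum_{l=1}^n g^{il}(v)\left\{\frac{\partial g_{lk}}{\partial x^j}(v) + \frac{\partial g_{jl}}{\partial x^k}(v) - \frac{\partial g_{jk}}{\partial x^l}(v)\right\}, \qquad v \in TU \setminus 0,$$

where $(g^{ij}(v))$ denotes the inverse matrix of $(g_{ij}(v))$.

(I-10) Forward and backward open balls in (3.16):

$$B^+_M(x, r) := \{y \in M \mid d(x, y) < r\}, \qquad B^-_M(x, r) := \{y \in M \mid d(y, x) < r\}.$$

(I-11) Reversibility constant in (3.17):

$$\Lambda_F := \sup_{v \in TM \setminus 0} \frac{F(-v)}{F(v)}.$$

(I-12) Covariant derivatives (Definition 4.1):

$$D^v_w X := \sum_{i,j=1}^n \left\{ w^j \frac{\partial X^i}{\partial x^j}(x) + \sum_{k=1}^n \Gamma^i_{jk}(v)\,w^j X^k(x) \right\} \frac{\partial}{\partial x^i}\Big|_x.$$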

(I-13) Coefficients of the geodesic spray in (4.4) (giving the geodesic equation $\ddot\eta^i + G^i(\dot\eta) = 0$):

$$G^i(v) := \sum_{j,k=1}^n \gamma^i_{jk}(v)\,v^j v^k, \qquad v \in TU.$$

(I-14) Coefficients of the nonlinear connection in (4.4) (and (4.10)):

$$N^i_j(v) := \frac{1}{2}\frac{\partial G^i}{\partial v^j}(v) = \sum_{k=1}^n \Gamma^i_{jk}(v)\,v^k, \qquad v \in TU.$$

(I-15) Coefficients for the covariant derivative in (4.8) (or the Chern connection in Definition 4.12):

$$\Gamma^i_{jk} := \gamma^i_{jk} - \frac{1}{F}\sum_{l,m=1}^n g^{il}\big(A_{lkm} N^m_j + A_{jlm} N^m_k - A_{jkm} N^m_l\big)$$

on $TU \setminus 0$.

(I-16) Jacobi equation (Definition 5.1):

$$D^{\dot\eta}_{\dot\eta} D^{\dot\eta}_{\dot\eta} J + R_{\dot\eta}(J) = 0, \qquad \text{where } R_v(w) := \sum_{i,j=1}^n R^i_j(v)\,w^j \frac{\partial}{\partial x^i}\Big|_x, \quad v, w \in T_xM,\ x \in U.$$

(I-17) Components of the curvature tensor in (5.5):

$$R^i_j(v) := \frac{\partial G^i}{\partial x^j}(v) - \sum_{k=1}^n \left\{\frac{\partial N^i_j}{\partial x^k}(v)\,v^k - \frac{\partial N^i_j}{\partial v^k}(v)\,G^k(v)\right\} - \sum_{k=1}^n N^i_k(v)\,N^k_j(v), \qquad v \in TU.$$

(See Lemmas 5.6, 5.8 for some properties of the curvature tensor.)

(I-18) Flag curvature (Definition 5.9):

$$K(v, w) := \frac{g_v(R_v(w), w)}{F^2(v)\,g_v(w, w) - g_v(v, w)^2},$$

where $v, w \in T_xM$ are linearly independent.

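(I-19) Ricci curvature (Definition 5.11):

$$\mathrm{Ric}(v) := \mathrm{trace}(R_v) = F^2(v)\sum_{i=1}^{n-1} K(v, e_i), \qquad v \in TM \setminus 0,$$

where $\{v/F(v)\} \cup \{e_i\}_{i=1}^{n-1}$ is a $g_v$-orthonormal basis of $T_{\pi(v)}M$.

(I-20) Expansion of the curvature tensor in (5.10):

$$R_l{}^i{}_{jk} := \left\{\frac{\partial \Gamma^i_{kl}}{\partial x^j} - \sum_{m=1}^n N^m_j \frac{\partial \Gamma^i_{kl}}{\partial v^m}\right\} - \left\{\frac{\partial \Gamma^i_{jl}}{\partial x^k} - \sum_{m=1}^n N^m_k \frac{\partial \Gamma^i_{jl}}{\partial v^m}\right\} + \sum_{m=1}^n \big(\Gamma^i_{jm}\Gamma^m_{kl} - \Gamma^i_{km}\Gamma^m_{jl}\big)$$

on $TU \setminus 0$. (See Lemma 5.15 for some properties of $R_l{}^i{}_{jk}$.)

(I-21) T-curvature (Definition 6.10):

$$T_v(w) := \sum_{i,j,k,l=1}^n g_{il}(v)\big\{\Gamma^i_{jk}(w) - \Gamma^i_{jk}(v)\big\}\,w^j w^k v^l, \qquad v, w \in T_xM,\ x \in U.$$

(I-22) First variation formula for $\sigma : [0, l] \times (-\varepsilon, \varepsilon) \to M$, $L(s) = L(\sigma(\cdot, s))$, $T = \partial_t \sigma$, $U = \partial_s \sigma$ (Proposition 7.1):

$$L'(s) = \left[g_T\left(U, \frac{T}{F(T)}\right)(\cdot, s)\right]_0^l - \int_0^l g_T\left(U, D^T_T\left[\frac{T}{F(T)}\right]\right)(t, s)\,dt = \int_0^l g_T\left(D^T_T U, \frac{T}{F(T)}\right)(t, s)\,dt.$$

(I-23) Index form for vector fields $W, X$ along a geodesic $\eta$ in (7.1):

$$I(W, X) := \frac{1}{F(\dot\eta)}\int_0^l \Big\{g_{\dot\eta}\big(D^{\dot\eta}_{\dot\eta} W, D^{\dot\eta}_{\dot\eta} X\big) - g_{\dot\eta}\big(R_{\dot\eta}(W), X\big)\Big\}\,dt.$$

(I-24) Second variation formula where $\eta = \sigma(\cdot, 0)$ is a geodesic (Theorem 7.6):

$$L''(0) = I\big(U(\cdot, 0), U(\cdot, 0)\big) + \left[g_{\dot\eta}\left(D^T_U U(\cdot, 0), \frac{\dot\eta}{F(\dot\eta)}\right)\right]_0^l - \int_0^l \frac{\partial_s[F(T)](t, 0)^2}{F(\dot\eta(t))}\,dt.$$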

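(I-25) Uniform convexity and smoothness constants (Definition 8.14, Lemma 8.16):

$$\mathsf{C}_F := \sup_{x \in M}\ \sup_{v, w \in T_xM \setminus \{0\}} \frac{F^2(w)}{g_v(w, w)} = \sup_{x \in M}\ \sup_{\alpha, \beta \in T_x^*M \setminus \{0\}} \frac{g^*_\alpha(\beta, \beta)}{F^*(\beta)^2},$$

$$\mathsf{S}_F := \sup_{x \in M}\ \sup_{v, w \in T_xM \setminus \{0\}} \frac{g_v(w, w)}{F^2(w)} = \sup_{x \in M}\ \sup_{\alpha, \beta \in T_x^*M \setminus \{0\}} \frac{F^*(\beta)^2}{g^*_\alpha(\beta, \beta)}.$$

Part II

In Part II, we equip $(M, F)$ with a positive $C^\infty$-measure $m$ on $M$.

(II-1) Weighted Ricci curvature (Definition 9.11):

$$\mathrm{Ric}_N(v) := \mathrm{Ric}(v) + \psi_\eta''(0) - \frac{\psi_\eta'(0)^2}{N - n}, \qquad v \in TM,$$

for $N \in \mathbb{R} \setminus \{n\}$, where $\eta : (-\varepsilon, \varepsilon) \to M$ is the geodesic with $\dot\eta(0) = v$ and $\psi_\eta : (-\varepsilon, \varepsilon) \to \mathbb{R}$ is given by the decomposition

$$dm = e^{-\psi_\eta}\sqrt{\det\big[g_{ij}(\dot\eta)\big]}\,dx^1 dx^2 \cdots dx^n$$

of $m$ along $\eta$. As the limits,

$$\mathrm{Ric}_\infty(v) := \mathrm{Ric}(v) + \psi_\eta''(0), \qquad \mathrm{Ric}_n(v) := \begin{cases} \mathrm{Ric}(v) + \psi_\eta''(0) & \text{if } \psi_\eta'(0) = 0, \\ -\infty & \text{if } \psi_\eta'(0) \neq 0. \end{cases}$$

(II-2) S-curvature (Remark 9.14):

$$S(v) := \psi_\eta'(0), \qquad v \in TM.$$

(II-3) Comparison functions in (9.4):

$$s_\kappa(t) := \begin{cases} \dfrac{1}{\sqrt{\kappa}}\sin(\sqrt{\kappa}\,t) & \text{for } \kappa > 0, \\ t & \text{for } \kappa = 0, \\ \dfrac{1}{\sqrt{-\kappa}}\sinh(\sqrt{-\kappa}\,t) & \text{for } \kappa < 0. \end{cases}$$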

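(II-4) Energy functional on an open set $\Omega \subset M$ in (11.1):

$$\mathcal{E}_\Omega(u) := \frac{1}{2}\int_\Omega F^*(du)^2\,dm, \qquad u \in H^1_{\mathrm{loc}}(\Omega).$$

(II-5) Sobolev spaces (Definition 11.2):

$$H^1(\Omega) := \{u \in L^2(\Omega) \cap H^1_{\mathrm{loc}}(\Omega) \mid \mathcal{E}_\Omega(u) + \mathcal{E}_\Omega(-u) < \infty\},$$

and $H^1_0(\Omega)$ is the closure of $C^\infty_c(\Omega)$ with respect to the norm

$$\|u\|_{H^1(\Omega)} := \sqrt{\|u\|^2_{L^2(\Omega)} + \mathcal{E}_\Omega(u) + \mathcal{E}_\Omega(-u)}.$$

(II-6) Gradient vectors of a function $u$ in (11.6):

$$\nabla u(x) := \mathcal{L}^*(du_x) = \sum_{i,j=1}^n g^*_{ij}(du_x)\frac{\partial u}{\partial x^j}(x)\frac{\partial}{\partial x^i}\Big|_x \in T_xM.$$

(II-7) Essential domain in (11.8):

$$M_u := \{x \in M \mid du_x \neq 0\}.$$

(II-8) Divergence of a vector field $V$ in (11.9):

$$\mathrm{div}_m V := \sum_{i=1}^n \left\{\frac{\partial V^i}{\partial x^i} + V^i \frac{\partial \Phi}{\partial x^i}\right\}, \qquad V = \sum_{i=1}^n V^i \frac{\partial}{\partial x^i},$$

where $dm = e^{\Phi}\,dx^1 dx^2 \cdots dx^n$, and in the weak form that

$$\int_M \phi\,\mathrm{div}_m V\,dm = -\int_M d\phi(V)\,dm, \qquad \phi \in C^\infty_c(M).$$

(II-9) Nonlinear Laplacian (Definition 11.16): $\Delta u := \mathrm{div}_m(\nabla u)$ in the weak sense that

$$\int_M \phi\,\Delta u\,dm := -\int_M d\phi(\nabla u)\,dm, \qquad \phi \in C^\infty_c(M).$$

(II-10) Linearized gradient vectors and Laplacian in (11.16):

$$\nabla^V f := \sum_{i,j=1}^n g^{ij}(V)\frac{\partial f}{\partial x^j}\frac{\partial}{\partial x^i}, \qquad \Delta^V f := \mathrm{div}_m(\nabla^V f).$$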

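(II-11) Hessian in (12.1):

$$\nabla^2 u(v) := D^{\nabla u}_v(\nabla u) \in T_xM, \qquad v \in T_xM,\ x \in M_u.$$

(II-12) Laplacian as the trace of Hessian (Lemma 12.4):

$$\Delta u = \mathrm{trace}(\nabla^2 u) - S(\nabla u).$$

(II-13) Pointwise Bochner–Weitzenböck formula (Theorem 12.7): For $u \in C^\infty(M)$, we have

$$\Delta^{\nabla u}\left[\frac{F^2(\nabla u)}{2}\right] - d(\Delta u)(\nabla u) = \mathrm{Ric}_\infty(\nabla u) + \|\nabla^2 u\|^2_{HS(\nabla u)}$$

and

$$\Delta^{\nabla u}\left[\frac{F^2(\nabla u)}{2}\right] - d(\Delta u)(\nabla u) \ge \mathrm{Ric}_N(\nabla u) + \frac{(\Delta u)^2}{N}$$

on $M_u$ for $N \in (-\infty, 0) \cup [n, \infty]$.

(II-14) Integrated Bochner–Weitzenböck formula (Theorem 12.13): Given $u \in H^2_{\mathrm{loc}}(M) \cap C^1(M)$ with $\Delta u \in H^1_{\mathrm{loc}}(M)$, we have

$$-\int_M d\phi\left(\nabla^{\nabla u}\left[\frac{F^2(\nabla u)}{2}\right]\right) dm = \int_M \phi\,\Big\{d(\Delta u)(\nabla u) + \mathrm{Ric}_\infty(\nabla u) + \|\nabla^2 u\|^2_{HS(\nabla u)}\Big\}\,dm$$

for $\phi \in H^1_c(M) \cap L^\infty(M)$, and

$$-\int_M d\phi\left(\nabla^{\nabla u}\left[\frac{F^2(\nabla u)}{2}\right]\right) dm \ge \int_M \phi\,\left\{d(\Delta u)(\nabla u) + \mathrm{Ric}_N(\nabla u) + \frac{(\Delta u)^2}{N}\right\}\,dm$$

for $N \in (-\infty, 0) \cup [n, \infty]$ and nonnegative functions $\phi \in H^1_c(M) \cap L^\infty(M)$.

(II-15) Linearized heat semigroup and its adjoint (Definitions 13.19, 13.21): Given a global solution $(u_t)_{t \ge 0}$ to the heat equation, $f \in H^1_0(M)$ and $s \ge 0$, $(P^{\nabla u}_{s,t}(f))_{t \ge s}$ is the weak solution to

$$\partial_t\big[P^{\nabla u}_{s,t}(f)\big] = \Delta^{\nabla u_t}\big[P^{\nabla u}_{s,t}(f)\big], \qquad P^{\nabla u}_{s,s}(f) = f.$$

For $\phi \in H^1_0(M)$ and $t > 0$, $(\widehat{P}^{\nabla u}_{s,t}(\phi))_{s \in [0,t]}$ is the weak solution to

$$\partial_s\big[\widehat{P}^{\nabla u}_{s,t}(\phi)\big] = -\Delta^{\nabla u_s}\big[\widehat{P}^{\nabla u}_{s,t}(\phi)\big], \qquad \widehat{P}^{\nabla u}_{t,t}(\phi) = \phi.$$

(II-16) Minkowski's exterior boundary measure of $A \subset M$ in (15.1):

$$m^+(A) := \liminf_{\varepsilon \to 0} \frac{m(B^+(A, \varepsilon)) - m(A)}{\varepsilon}.$$

(II-17) Isoperimetric profile (under $m(M) = 1$) in (15.2):

$$\mathcal{I}_{(M,F,m)}(\theta) := \inf\{m^+(A) \mid A \subset M : \text{Borel set with } m(A) = \theta\}, \qquad \theta \in [0, 1].$$

(II-18) Variance of a function $f$ in (15.7) (provided $m(M) = 1$):

$$\mathrm{Var}_m(f) := \int_M \left(f - \int_M f\,dm\right)^2 dm = \int_M f^2\,dm - \left(\int_M f\,dm\right)^2.$$

(II-19) $\Gamma_2$-operator in (16.1):

$$\Gamma_2(u) := \Delta^{\nabla u}\left[\frac{F^2(\nabla u)}{2}\right] - d(\Delta u)(\nabla u).$$

(II-20) Relative entropy of a probability measure $fm$ in (16.2):

$$\mathrm{Ent}_m(fm) := \int_M f \log f\,dm.$$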

Part I

Foundations of Finsler Geometry

Part I is devoted to the foundations of Finsler geometry. Our discussion is as geometric as possible, and the behavior of geodesics plays an important role throughout.

Chapter 1

Warm-Up: Norms and Inner Products

In this chapter, as a warm-up before the general theory of Finsler manifolds, we consider normed spaces and discuss some characterizations of inner product spaces among normed spaces. These special properties of inner product spaces will help us to understand the difference between Riemannian and Finsler manifolds.

1.1 Norms and Inner Products

A nonnegative function $\|\cdot\| : \mathbb{R}^n \to [0, \infty)$ on the $n$-dimensional Euclidean space is called a norm if it satisfies the following conditions:

(1) (Positive-definiteness) $\|v\| = 0$ if and only if $v = 0$, where $0 \in \mathbb{R}^n$ denotes the origin;
(2) (Homogeneity) $\|cv\| = |c| \cdot \|v\|$ for all $v \in \mathbb{R}^n$ and $c \in \mathbb{R}$;
(3) (Triangle inequality) $\|v + w\| \le \|v\| + \|w\|$ for all $v, w \in \mathbb{R}^n$.

It follows from (2) and (3) that $\|\cdot\|$ is a convex function, i.e.,

$$\|(1 - t)v + tw\| \le (1 - t)\|v\| + t\|w\|$$

holds for all $v, w \in \mathbb{R}^n$ and $t \in [0, 1]$. A norm $\|\cdot\|$ is determined by its unit ball $B := \{v \in \mathbb{R}^n \mid \|v\| \le 1\}$, which is a centrally symmetric, bounded, closed, and convex set including the origin in its interior. That is to say, two norms of $\mathbb{R}^n$ having the same unit ball necessarily coincide.

Exercise 1.1 Prove this claim. Moreover, construct the associated norm from a set $B \subset \mathbb{R}^n$ satisfying the above conditions.

© The Author(s), under exclusive license to Springer Nature Switzerland AG 2021 S.-i. Ohta, Comparison Finsler Geometry, Springer Monographs in Mathematics, https://doi.org/10.1007/978-3-030-80650-7_1

We call $S := \{v \in \mathbb{R}^n \mid \|v\| = 1\}$ the unit sphere of $\|\cdot\|$, which is the boundary of $B$. (The set $S$ is also called the indicatrix in Finsler geometry.) If a norm $\|\cdot\|$ satisfies the parallelogram identity

$$\|v + w\|^2 + \|v - w\|^2 = 2\|v\|^2 + 2\|w\|^2 \tag{1.1}$$

for all $v, w \in \mathbb{R}^n$, then

$$\langle v, w \rangle := \frac{1}{4}\big\{\|v + w\|^2 - \|v - w\|^2\big\}$$

gives an inner product, i.e., $\langle \cdot, \cdot \rangle : \mathbb{R}^n \times \mathbb{R}^n \to \mathbb{R}$ is a bilinear function such that $\langle v, w \rangle = \langle w, v \rangle$ for all $v, w \in \mathbb{R}^n$ and $\langle v, v \rangle > 0$ for all $v \in \mathbb{R}^n \setminus \{0\}$. One can recover the original norm from the induced inner product by the relation $\|v\| = \sqrt{\langle v, v \rangle}$.

Exercise 1.2 Prove that $(\mathbb{R}^n, \|\cdot\|)$ is an inner product space if and only if its unit sphere is an ellipsoid.

This is an exercise of linear algebra. On one hand, by an ellipsoid we mean a set

$$\left\{\sum_{i=1}^n c_i e_i \in \mathbb{R}^n \,\middle|\, \sum_{i=1}^n \left(\frac{c_i}{a_i}\right)^2 = 1\right\}$$

for some $a_i > 0$ and for an orthonormal basis $\{e_i\}_{i=1}^n$ of $\mathbb{R}^n$ with respect to the canonical inner product. On the other hand, if a norm $\|\cdot\|$ comes from an inner product, then there exists a positive-definite symmetric matrix $G$ such that $S = \{v \in \mathbb{R}^n \mid v^T \cdot G \cdot v = 1\}$, where $v^T$ is the transpose of $v$.
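The parallelogram identity (1.1) and the polarization formula can be probed numerically. The following is a minimal sketch, not part of the original text: the helper names `p_norm`, `parallelogram_defect` and `polarization` are hypothetical, and the $\ell^p$ norms serve only as convenient test cases.

```python
def p_norm(v, p):
    """The l^p norm of a vector v given as a sequence of floats."""
    return sum(abs(x) ** p for x in v) ** (1.0 / p)

def parallelogram_defect(v, w, norm):
    """Left-hand side minus right-hand side of (1.1); zero for inner product norms."""
    plus = [a + b for a, b in zip(v, w)]
    minus = [a - b for a, b in zip(v, w)]
    return norm(plus) ** 2 + norm(minus) ** 2 - 2 * norm(v) ** 2 - 2 * norm(w) ** 2

def polarization(v, w, norm):
    """Candidate inner product (1/4)(|v + w|^2 - |v - w|^2)."""
    plus = [a + b for a, b in zip(v, w)]
    minus = [a - b for a, b in zip(v, w)]
    return 0.25 * (norm(plus) ** 2 - norm(minus) ** 2)

euclid = lambda u: p_norm(u, 2)   # comes from the canonical inner product
l1 = lambda u: p_norm(u, 1)       # does not come from any inner product

v, w = [1.0, 0.0], [0.0, 1.0]
print(parallelogram_defect(v, w, euclid))
print(parallelogram_defect(v, w, l1))
print(polarization([1.0, 2.0], [3.0, -1.0], euclid))
```

For the Euclidean norm the defect vanishes up to rounding and the polarization recovers the dot product, whereas the $\ell^1$ norm violates (1.1) already at the pair of canonical basis vectors.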

1.2 Three Characterizations of Inner Products

Besides those in the previous section, there are many characterizations of inner products among norms. Here we discuss three of them, which are important in the Finsler context.

1.2.1 Sharp Uniform Convexity and Smoothness

The first characterization is related to Banach space theory and metric geometry. We say that a normed space $(\mathbb{R}^n, \|\cdot\|)$ is uniformly convex (or, more precisely, 2-uniformly convex) if there is a constant $\mathsf{C} \ge 1$ such that

$$\left\|\frac{v + w}{2}\right\|^2 \le \frac{1}{2}\|v\|^2 + \frac{1}{2}\|w\|^2 - \frac{1}{4\mathsf{C}}\|w - v\|^2 \tag{1.2}$$

for all $v, w \in \mathbb{R}^n$. Then we have, by recursive applications of (1.2) and the continuity of the norm, the inequality

$$\|(1 - t)v + tw\|^2 \le (1 - t)\|v\|^2 + t\|w\|^2 - \frac{(1 - t)t}{\mathsf{C}}\|w - v\|^2 \tag{1.3}$$

for all $v, w \in \mathbb{R}^n$ and $t \in [0, 1]$. Observe that what simply follows from the convexity of the norm is $\|(1 - t)v + tw\|^2 \le (1 - t)\|v\|^2 + t\|w\|^2$, and (1.3) means that $\|\cdot\|$ possesses a stronger convexity.

Exercise 1.3 Deduce (1.3) from (1.2).

These equivalent inequalities can be regarded as weak formulations of the convexity "$\mathrm{Hess}(\|\cdot\|^2/2) \ge \mathsf{C}^{-1}$" measured by the distance structure $\|\cdot\|$. (To be precise, the first three norm squared terms in (1.2) are playing the role of a potential function, and the last term $\|w - v\|^2$ appears as the distance structure.) The least constant $\mathsf{C}$ satisfying (1.2) is called the uniform convexity constant. One can also regard $\mathsf{C}^{-1}$ as a kind of (positive) lower curvature bound of the unit sphere $S$ of $\|\cdot\|$ (see [188]). Similarly, we introduce the uniform smoothness (or the 2-uniform smoothness) as

$$\left\|\frac{v + w}{2}\right\|^2 \ge \frac{1}{2}\|v\|^2 + \frac{1}{2}\|w\|^2 - \frac{\mathsf{S}}{4}\|w - v\|^2 \tag{1.4}$$

for some $\mathsf{S} \ge 1$ and for all $v, w \in \mathbb{R}^n$. Then we have

$$\|(1 - t)v + tw\|^2 \ge (1 - t)\|v\|^2 + t\|w\|^2 - \mathsf{S}(1 - t)t\,\|w - v\|^2$$

for all $v, w \in \mathbb{R}^n$ and $t \in [0, 1]$. The uniform smoothness constant $\mathsf{S}$ is the least constant satisfying (1.4). It measures the concavity of $\|\cdot\|^2/2$ and gives an upper curvature bound of the unit sphere $S$.

Remark 1.4 (Connections with Banach Space Theory) The uniform convexity and smoothness are fundamental notions in Banach space theory (see, e.g., [24]). One can also consider the $p$-uniform convexity and smoothness for general $p \in (1, \infty)$ by replacing the squared norm $\|\cdot\|^2$ with the $p$-th power $\|\cdot\|^p$. For instance, an $L^p$-space with $p \in (1, 2]$ is 2-uniformly convex with $\mathsf{C} = (p - 1)^{-1}$ ([24, Proposition 3]) and $p$-uniformly smooth with $\mathsf{S} = 1$ (Clarkson's inequality). By duality, for $p \in [2, \infty)$, an $L^p$-space is $p$-uniformly convex with $\mathsf{C} = 1$ and 2-uniformly smooth with $\mathsf{S} = p - 1$. Among $L^p$-spaces, only $L^2$-spaces are both 2-uniformly convex and smooth.

Exercise 1.5 Let $(\mathbb{R}^2, \|\cdot\|_p)$ be the 2-dimensional $L^p$-space, namely $\|(x, y)\|_p := (|x|^p + |y|^p)^{1/p}$. Prove that it is not 2-uniformly convex (resp. 2-uniformly smooth) when $p \in (2, \infty)$ (resp. $p \in (1, 2)$).

Proposition 1.6 (First Characterization) Given a normed space $(\mathbb{R}^n, \|\cdot\|)$, the following are equivalent:

(A) $\|\cdot\|$ comes from an inner product.
(B) $\|\cdot\|$ is uniformly convex with $\mathsf{C} = 1$.
(C) $\|\cdot\|$ is uniformly smooth with $\mathsf{S} = 1$.

Proof It follows from the parallelogram identity (1.1) that (A) implies (B) and (C). In order to see (B) ⇒ (A), we apply (1.2) with $\mathsf{C} = 1$ to the pairs $v, w$ and $v + w, v - w$ and find

$$\left\|\frac{v + w}{2}\right\|^2 \le \frac{1}{2}\|v\|^2 + \frac{1}{2}\|w\|^2 - \frac{1}{4}\|v - w\|^2, \qquad \|v\|^2 \le \frac{1}{2}\|v + w\|^2 + \frac{1}{2}\|v - w\|^2 - \|w\|^2.$$

The first inequality gives "$\le$" in (1.1) and the second gives "$\ge$". Combining these gives the parallelogram identity and hence (A) holds. The implication (C) ⇒ (A) is similarly shown. □

Notice that we did not use the finiteness of the dimension, so Proposition 1.6 in fact gives a characterization of Hilbert spaces among Banach spaces. This characterization also shows that the synthetic upper and lower sectional curvature bounds in the sense of Alexandrov for metric spaces (called CAT-spaces and Alexandrov spaces) make sense only in the Riemannian setting. See Sect. 8.3 for further discussions.
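To get a feeling for the uniform convexity constant, one can evaluate the inequality (1.2) on finitely many pairs of vectors. The sketch below is only an illustration under stated assumptions: it works in $(\mathbb{R}^2, \|\cdot\|_p)$, the function names and the sampling grid are hypothetical choices, and a finite search yields merely a lower bound for the least constant in (1.2), to be compared with the value $(p - 1)^{-1}$ quoted in Remark 1.4.

```python
import math

def p_norm(v, p):
    """The l^p norm on R^2."""
    return (abs(v[0]) ** p + abs(v[1]) ** p) ** (1.0 / p)

def pair_constant(v, w, p):
    """Least C for which (1.2) holds at the single pair (v, w); None if degenerate."""
    mid = ((v[0] + w[0]) / 2.0, (v[1] + w[1]) / 2.0)
    gap = 0.5 * p_norm(v, p) ** 2 + 0.5 * p_norm(w, p) ** 2 - p_norm(mid, p) ** 2
    dist2 = p_norm((w[0] - v[0], w[1] - v[1]), p) ** 2
    if gap <= 1e-12 or dist2 <= 1e-12:
        return None  # v = w (or numerically indistinguishable): no information
    return dist2 / (4.0 * gap)

def estimate_convexity_constant(p, angles=72, radii=(0.5, 1.0, 2.0)):
    """Lower bound for the uniform convexity constant from a finite grid of pairs."""
    points = [(r * math.cos(2.0 * math.pi * k / angles),
               r * math.sin(2.0 * math.pi * k / angles))
              for r in radii for k in range(angles)]
    best = 0.0
    for v in points:
        for w in points:
            c = pair_constant(v, w, p)
            if c is not None and c > best:
                best = c
    return best

for p in (1.5, 2.0):
    # Grid estimate next to the reference value 1/(p - 1) from Remark 1.4.
    print(p, estimate_convexity_constant(p), 1.0 / (p - 1.0))
```

For $p = 2$ every admissible pair returns the value 1 up to rounding, in accordance with Proposition 1.6, while for $p = 1.5$ the estimate exceeds 1 and stays below $(p - 1)^{-1} = 2$.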

1.2.2 Smoothness at the Origin

The next characterization has an analytic influence (see [228, Proposition 2.2] for the manifold case).

Proposition 1.7 (Second Characterization) A norm $\|\cdot\|$ on $\mathbb{R}^n$ is induced from an inner product if and only if the squared norm $\|\cdot\|^2$ is twice differentiable at the origin $0$.

Proof The "only if" part is obvious, thus we prove only the "if" part. If the squared norm is twice differentiable at $0$, then the Taylor expansion yields

$$\|v\|^2 = \sum_{i,j=1}^n a_{ij} v^i v^j + o(\|v\|^2)$$

for some symmetric matrix $(a_{ij})_{i,j=1}^n$, where we write $v = (v^i)_{i=1}^n$. Thanks to the homogeneity of the norm, we have for $t > 0$

$$\|v\|^2 = \frac{1}{t^2}\|tv\|^2 = \sum_{i,j=1}^n a_{ij} v^i v^j + \frac{o(\|tv\|^2)}{t^2}.$$

Taking the limit as $t \to 0$, we find $\|v\|^2 = \sum_{i,j=1}^n a_{ij} v^i v^j$. Observe from the positive-definiteness of the norm that $(a_{ij})$ is a positive-definite matrix. Therefore, the norm $\|\cdot\|$ is induced from the inner product $\langle v, w \rangle := \sum_{i,j=1}^n a_{ij} v^i w^j$. □

Hence, on a general normed space, the squared norm $\|\cdot\|^2$ is not $C^2$ at the origin. This explains the reason why one can obtain only the $C^{1,\alpha}$-regularity for solutions to the heat equation on Finsler manifolds (see Chap. 13).

Exercise 1.8 Let $\|\cdot\|$ be a norm on $\mathbb{R}^n$ which is $C^2$ on $\mathbb{R}^n \setminus \{0\}$. Prove that $\|\cdot\|^2$ is $C^{1,\alpha}$ for some $\alpha > 0$ on $\mathbb{R}^n$.

1.2.3 Center of Circumscribed Triangle

We finally discuss a less known, but elegant geometric characterization in dimension 2.

Proposition 1.9 (Third Characterization) A normed space $(\mathbb{R}^2, \|\cdot\|)$ comes from an inner product space if and only if, for any triangle $\triangle ABC$ tangent to $S$ at $a, b, c$, we have

$$v := |\triangle OBC|\,\overrightarrow{Oa} + |\triangle OAC|\,\overrightarrow{Ob} + |\triangle OAB|\,\overrightarrow{Oc} = 0, \tag{1.5}$$

where $S \subset \mathbb{R}^2$ is the unit circle of $\|\cdot\|$, $O := 0$ and $a, b, c \in S$ are on the edges $BC$, $CA$, $AB$, respectively (see Fig. 1.1). We denote by $|\triangle OBC|$ the area of $\triangle OBC$ with respect to the Lebesgue measure. To be precise, "$\triangle ABC$ is tangent to $S$ at $a, b, c$" means that the unit ball $\{v \in \mathbb{R}^2 \mid \|v\| \le 1\}$ is contained in $\triangle ABC$ and $a, b, c \in \partial[\triangle ABC] \cap S$.

Fig. 1.1 $\triangle ABC$ in Proposition 1.9

Fig. 1.2 Non-strictly convex and non-differentiable cases

Proof It is not difficult to see that inner product spaces satisfy (1.5); we leave the proof to the reader (Exercise 1.10). There are in fact many different ways to show this fact. In order to see the converse, we first note that, if the norm is not strictly convex, then we have a flexibility in choosing $a, b, c$ for some fixed $A, B, C$ (see the left figure of Fig. 1.2). Hence we can find $a, b, c$ with $v \neq 0$. Similarly, if $S$ is not differentiable, then there is a flexibility in the choices of $A, B, C$ for some fixed $a, b, c$ (see the right figure of Fig. 1.2). Thus, again, we can take $v \neq 0$ (consider, by rotation, the situation that $c$ is on the $y$-axis and $C$ is fixed).

Now we assume that the norm $\|\cdot\|$ is strictly convex and differentiable on $\mathbb{R}^2 \setminus \{0\}$. By rotation and expansion/contraction in the $x$- and $y$-axes, we can suppose that $S$ is tangent to the lines $x = 1$ and $y = -1$, and passes through $(1, 0)$. Let $p = (p_1, p_2)$ be the point attaining the minimum of the $x$-component in $S$, and $q = (q_1, q_2)$ be the point attaining the maximum of the $y$-component in $S$. Put $a = (1, 0)$, $b = (\alpha, -1)$, and $c = (c_1, c_2)$ for some $c_1 \in (p_1, q_1)$ and $c_2 \in (p_2, q_2)$. The triangle $\triangle ABC$ corresponding to $a, b, c$ is given by

$$A = (1 - t, -1), \qquad B = (1, -1 + st), \qquad C = (1, -1)$$

for some $s, t > 0$; see Fig. 1.3.

Fig. 1.3 $\triangle ABC$, $p$ and $q$

Notice that the line passing through $A$ and $B$ is written as $y = s(x - (1 - t)) - 1$. Observe that

$$|\triangle OBC| = \frac{st}{2}, \qquad |\triangle OCA| = \frac{t}{2}, \qquad |\triangle OAB| = \frac{st^2}{2} - \left(\frac{st}{2} + \frac{t}{2}\right) = \frac{t}{2}(st - s - 1).$$

Therefore,

$$v = \frac{st}{2}(1, 0) + \frac{t}{2}(\alpha, -1) + \frac{t}{2}(st - s - 1)\,\overrightarrow{Oc}$$

and the hypothesis (1.5) implies that

$$\overrightarrow{Oc} = -\frac{1}{st - s - 1}(s + \alpha, -1).$$

Since $c$ is on the segment $AB$, we have

$$\frac{1}{st - s - 1} = s\left(-\frac{s + \alpha}{st - s - 1} - (1 - t)\right) - 1.$$

This is rewritten as $st^2 - 2(s + 1)t - \alpha + 2 = 0$, and hence

$$t = \frac{(s + 1) \pm \sqrt{s^2 + \alpha s + 1}}{s}. \tag{1.6}$$

Since st − s − 1 = 2|OAB|/t > 0, we find that st − s − 1 =



s 2 + αs + 1.

Substituting this into (1.6), we obtain − → Oc = − √

1 s2

+ αs + 1

(s + α, −1).

Now we regard c as a curve parametrized by s. Recall that the slope of S at c is (1, s), thus we have s= =



∂ 1 −s − α ∂ √ √ ∂s ∂s s 2 + αs + 1 s 2 + αs + 1 −2(s 2

=−

−(2s + α) + αs + 1) + (s + α)(2s + α)

2s + α . αs + α 2 − 2

This shows that α = 0 (otherwise there are at most only two solutions for s), and hence b = (0, −1) and 1 − → Oc = √ (−s, 1). 2 s +1 It follows that c is on the Euclidean unit circle S. In particular, we have p = (−1, 0) and q = (0, 1). By reflecting S over the y-axis, we deduce that S coincides with S also between q and a. Finally, letting S upside down, we conclude that whole S coincides with S.   Exercise 1.10 Prove the “only if” part of Proposition 1.9. The following may be an open question. Exercise 1.11 Give a counterpart to Proposition 1.9 in R3 or, more generally, in Rn . The characterization in Proposition 1.9 was used in [207] to reveal that the behavior of heat flow on normed spaces is very different from that on inner product spaces (see Remark 14.9). The author does not know if Proposition 1.9 had been known before. We remark that the above proof, different from that in the original paper [207], was provided by a student at “Galois Festival,” which is an event for first-year students in the Faculty of Science, Kyoto University. Exercise 1.12 Give an alternative (simpler) proof of the “if” part of Proposition 1.9.
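As a numerical sanity check of the "only if" part of Proposition 1.9 (it is of course not a proof), one can evaluate the vector $v$ in (1.5) for the Euclidean unit circle. The following is a minimal sketch under our own conventions: the tangent points are described by hypothetical angle parameters, which are assumed to be chosen so that consecutive angles differ by less than $\pi$ (then the three tangent lines bound a triangle containing the unit disk).

```python
import math

def tangent_vertex(theta1, theta2):
    """Intersection of the tangent lines to the unit circle at angles theta1, theta2."""
    n1 = (math.cos(theta1), math.sin(theta1))
    n2 = (math.cos(theta2), math.sin(theta2))
    # The point P = (n1 + n2) / (1 + <n1, n2>) satisfies <P, n1> = <P, n2> = 1.
    scale = 1.0 + n1[0] * n2[0] + n1[1] * n2[1]
    return ((n1[0] + n2[0]) / scale, (n1[1] + n2[1]) / scale)

def area_with_origin(P, Q):
    """Area of the triangle with vertices O, P, Q."""
    return 0.5 * abs(P[0] * Q[1] - P[1] * Q[0])

def weighted_sum(theta_a, theta_b, theta_c):
    """The vector v in (1.5) for the Euclidean unit circle and tangent points a, b, c."""
    a = (math.cos(theta_a), math.sin(theta_a))
    b = (math.cos(theta_b), math.sin(theta_b))
    c = (math.cos(theta_c), math.sin(theta_c))
    A = tangent_vertex(theta_b, theta_c)  # a lies on the edge BC opposite to A
    B = tangent_vertex(theta_c, theta_a)
    C = tangent_vertex(theta_a, theta_b)
    wa, wb, wc = area_with_origin(B, C), area_with_origin(C, A), area_with_origin(A, B)
    return (wa * a[0] + wb * b[0] + wc * c[0],
            wa * a[1] + wb * b[1] + wc * c[1])

print(weighted_sum(0.3, 2.5, 4.4))
```

Both components of the printed vector vanish up to rounding errors, as (1.5) predicts for an inner product norm.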

Chapter 2

Finsler Manifolds

In this chapter, we begin with Minkowski normed spaces which appear as tangent spaces of Finsler manifolds, and recall Euler’s homogeneous function theorem as an important calculus tool throughout the book. Then we give the definition of a Finsler manifold, followed by some examples and a naturally induced (asymmetric) distance structure.

2.1 Minkowski Normed Spaces

Instead of inner product spaces in Riemannian geometry, we will employ the following far more general class of spaces as tangent spaces in Finsler geometry.

Definition 2.1 (Minkowski Norms) Let $n \ge 1$. We say that a nonnegative function $\|\cdot\| : \mathbb{R}^n \to [0, \infty)$ is a Minkowski norm if the following hold:

(1) (Regularity) $\|\cdot\|$ is $C^\infty$ on $\mathbb{R}^n \setminus \{0\}$;
(2) (Positive 1-homogeneity) $\|cv\| = c\|v\|$ for all $v \in \mathbb{R}^n$ and $c > 0$;
(3) (Strong convexity) For any $v \in \mathbb{R}^n \setminus \{0\}$, the Hessian matrix of $\|\cdot\|^2$ at $v$,

$$\left(\frac{\partial^2 (\|\cdot\|^2)}{\partial v^i \partial v^j}(v)\right)_{i,j=1}^n,$$

is positive-definite, where $v = \sum_{i=1}^n v^i e_i$ with the canonical basis $\{e_i\}_{i=1}^n$ of $\mathbb{R}^n$.

We call the pair $(\mathbb{R}^n, \|\cdot\|)$ a Minkowski normed space.

Let us compare the Minkowski norms with the usual norms in the previous chapter. First of all, a certain regularity is necessary to do differential geometry; we will assume the $C^\infty$-smoothness as in (1) throughout for the sake of simplicity.

It is natural to remove the origin $0$, since the $C^2$-smoothness at $0$ holds only in inner product spaces (Proposition 1.7). We also remark that we need to be careful about the product structure.

Exercise 2.2 Let $(\mathbb{R}^{n_i}, \|\cdot\|_i)$, $i = 1, 2$, be Minkowski normed spaces in the above sense. Prove that the $L^2$-product structure

$$\|(v_1, v_2)\| := \sqrt{\|v_1\|_1^2 + \|v_2\|_2^2}, \qquad (v_1, v_2) \in \mathbb{R}^{n_1} \times \mathbb{R}^{n_2},$$

on the product space $\mathbb{R}^{n_1} \times \mathbb{R}^{n_2}$ satisfies the regularity (1) in Definition 2.1 if and only if both $\|\cdot\|_1$ and $\|\cdot\|_2$ are inner products.

We stress that the homogeneity (2) is imposed only in the positive direction ($c > 0$). Hence the asymmetry $\|-v\| \neq \|v\|$ is allowed; this is an important special feature of Minkowski norms. The corresponding unit ball $B = \{v \in \mathbb{R}^n \mid \|v\| \le 1\}$ is a bounded, closed, and convex set including the origin in its interior, but not necessarily centrally symmetric. If the homogeneity holds also in the negative direction (i.e., $\|cv\| = |c| \cdot \|v\|$ for $c \in \mathbb{R}$), then we say that the Minkowski norm $\|\cdot\|$ is reversible or absolutely homogeneous.

Remark 2.3 (Non-reversibility) We remark that, in general, there is no relation at all between the behavior of the norm near $v$ and $-v$. One can indeed modify the norm $\|\cdot\|$ on a neighborhood of $-v$ without changing it on a neighborhood of $v$.

It follows from the strong convexity (3) that:

• (Positive-definiteness) $\|v\| = 0$ if and only if $v = 0$;
• (Strict convexity) $\|v + w\| \le \|v\| + \|w\|$ for all $v, w \in \mathbb{R}^n$, and equality holds if and only if $v = aw$ or $w = av$ for some $a \ge 0$.

Exercise 2.4 Prove the above two properties from (3).

The regularity (1) and the strong convexity (3) imply the uniform smoothness and convexity (in the same forms as (1.4) and (1.2)), respectively. Therefore, the unit sphere $S = \{v \in \mathbb{R}^n \mid \|v\| = 1\}$ is a smooth and positively curved submanifold. One can occasionally relax the regularity and strong convexity conditions, for instance, to the positive-definiteness and (strict) convexity. Such a non-smooth Minkowski norm can be treated as the limit of smooth ones (see Example 2.6 for an example of a regularization procedure, and Subsect. 18.5.1 for a related discussion).

We finally remark that, in the 1-dimensional case, there is only a unique norm (up to isometry) $\|1\| = \|-1\| = 1$, which is regarded as an inner product, while a Minkowski norm is not unique since $\|-1\|$ can be chosen independently from $\|1\|$.

Exercise 2.5 Show that the $L^p$-norm $\|v\|_p = (\sum_{i=1}^n |v^i|^p)^{1/p}$ on $\mathbb{R}^n$ with $n \ge 2$ does not satisfy (1) (resp. (3)) when $p \in (1, 2)$ (resp. $p \in (2, \infty)$). (Recall Exercise 1.5.)

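Example 2.6 (Regularization) A general norm $\|\cdot\|$ on $\mathbb{R}^n$ (as in Sect. 1.1) can be approximated by a sequence of Minkowski norms as follows. Let $|\cdot|$ be the Euclidean norm of $\mathbb{R}^n$, and $\phi \in C^\infty(\mathbb{R}^n)$ be a rotationally symmetric mollifier such that $\mathrm{supp}\,\phi$ is included in the Euclidean unit ball and $\int_{\mathbb{R}^n} \phi\,dL^n = 1$, where $L^n$ is the Lebesgue measure on $\mathbb{R}^n$. Given $\varepsilon \in (0, 1)$, define a function $\|\cdot\|'_\varepsilon$ on $\mathbb{R}^n$ by

$$\|v\|'_\varepsilon := \frac{1}{(\varepsilon|v|)^n}\int_{\mathbb{R}^n} \|v - w\| \cdot \phi\left(\frac{w}{\varepsilon|v|}\right) L^n(dw)$$

for $v \in \mathbb{R}^n \setminus \{0\}$ and $\|0\|'_\varepsilon := 0$. Then $\|\cdot\|'_\varepsilon$ is $C^\infty$ on $\mathbb{R}^n \setminus \{0\}$ and is a norm (we leave this claim as an exercise below). We further define

$$\|v\|_\varepsilon := \sqrt{(\|v\|'_\varepsilon)^2 + \varepsilon|v|^2}, \qquad v \in \mathbb{R}^n,$$

which is strongly convex and hence a Minkowski norm. It is clear from the construction that $\lim_{\varepsilon \to 0} \|v\|_\varepsilon = \|v\|$ for all $v \in \mathbb{R}^n$.

Exercise 2.7 In the above example, prove that

$$\|v\| \le \|v\|'_\varepsilon \le \|v\|'_{\varepsilon'}$$

holds for all $v \in \mathbb{R}^n$ and $0 < \varepsilon < \varepsilon' < 1$, and that $\|\cdot\|'_\varepsilon$ is a norm.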

2.2 Euler's Homogeneous Function Theorem

The following basic observation on homogeneous functions on $\mathbb{R}^n$ will be used countless times. We include a proof for completeness.

Theorem 2.8 (Euler's Homogeneous Function Theorem) Let $n \ge 1$. Suppose that a differentiable function $H : \mathbb{R}^n \setminus \{0\} \to \mathbb{R}$ satisfies $H(cv) = c^r H(v)$ for some $r \in \mathbb{R}$ and for all $v \in \mathbb{R}^n \setminus \{0\}$ and $c > 0$ (that is, $H$ is positively $r$-homogeneous). Then we have

$$\sum_{i=1}^n \frac{\partial H}{\partial v^i}(v)\,v^i = rH(v) \tag{2.1}$$

for all $v \in \mathbb{R}^n \setminus \{0\}$.

Proof Differentiating the equality $H(cv) = c^r H(v)$ in $c$, we have by the chain rule

$$\sum_{i=1}^n \frac{\partial H}{\partial v^i}(cv)\,v^i = rc^{r-1}H(v).$$

Letting $c = 1$ shows the claim. □

Exercise 2.9 Prove the converse of Theorem 2.8. That is to say, if a differentiable function $H : \mathbb{R}^n \setminus \{0\} \to \mathbb{R}$ satisfies (2.1) for all $v \in \mathbb{R}^n \setminus \{0\}$, then $H$ is positively $r$-homogeneous.
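The identity (2.1) is easy to test numerically with finite differences. In the following sketch the sample function `F` (a Euclidean norm plus a linear term, hence positively 1-homogeneous) and the helper `euler_sum` are hypothetical choices made only for illustration; the partial derivatives are approximated by central differences, so the two sides agree only up to discretization error.

```python
import math

def F(v):
    """A positively 1-homogeneous sample function on R^2 minus the origin."""
    return math.hypot(v[0], v[1]) + 0.5 * v[0]

def euler_sum(H, v, h=1e-6):
    """Approximate the left-hand side of (2.1) by central differences."""
    total = 0.0
    for i in range(len(v)):
        up, down = list(v), list(v)
        up[i] += h
        down[i] -= h
        total += (H(up) - H(down)) / (2.0 * h) * v[i]
    return total

v = [0.7, -1.3]
F2 = lambda u: F(u) ** 2  # positively 2-homogeneous

print(euler_sum(F, v), 1 * F(v))     # r = 1
print(euler_sum(F2, v), 2 * F2(v))   # r = 2
```

The second line is the case used most often in the sequel: the square of a positively 1-homogeneous function is positively 2-homogeneous.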

2.3 Finsler Manifolds

We refer to [247] for the basic theory of differentiable manifolds. Throughout the book, unless otherwise indicated, a manifold $M$ is connected, Hausdorff, $\sigma$-compact, without boundary, $C^\infty$, and of dimension $n \ge 2$ (we exclude the 1-dimensional case since we are interested in curvature). A local coordinate system on a (nonempty) open set $U \subset M$ is a homeomorphism $\varphi$ from $U$ to an open set in $\mathbb{R}^n$ such that both $\varphi$ and its inverse $\varphi^{-1}$ are $C^\infty$. The components of $\varphi$ are denoted by $(x^i)_{i=1}^n$ (called the coordinate functions), i.e., $\varphi = (x^1, x^2, \ldots, x^n)$. We will suppress $\varphi$ and a local coordinate system will be written as $(x^i)_{i=1}^n$.

We denote the tangent space at $x \in M$ by $T_xM$, the tangent bundle over $M$ by $TM := \bigcup_{x \in M} T_xM$, and the associated natural projection by $\pi : TM \to M$. The cotangent space at $x \in M$ and the cotangent bundle over $M$ are denoted by $T_x^*M$ and $T^*M := \bigcup_{x \in M} T_x^*M$, respectively. Given a differentiable function $f$ on $M$, its derivative at $x \in M$ is denoted by $df_x \in T_x^*M$. For $v \in T_xM$, we may write $df(v)$ instead of $df_x(v)$ for simplicity.

Given a local coordinate system $(x^i)_{i=1}^n$ on an open set $U \subset M$, we will always use the fiber-wise linear coordinates $(x^i, v^j)_{i,j=1}^n$ on $TU$ such that

$$v = \sum_{j=1}^n v^j(v)\frac{\partial}{\partial x^j}\Big|_{\pi(v)}.$$

We will write $v = \sum_{j=1}^n v^j (\partial/\partial x^j)|_x \in T_xM$ for simplicity.

Definition 2.10 (Finsler Structures) We say that a nonnegative function $F : TM \to [0, \infty)$ is a Finsler structure of $M$ (also called a Finsler metric or a Finsler function) if the following conditions hold:

(1) (Regularity) $F$ is $C^\infty$ on $TM \setminus 0$, where $0 \subset TM$ is the zero section;
(2) (Positive 1-homogeneity) $F(cv) = cF(v)$ for all $v \in TM$ and $c > 0$;
(3) (Strong convexity) For any $v \in TM \setminus 0$ and a local coordinate system $(x^i)_{i=1}^n$ around $\pi(v)$, the $n \times n$ matrix

$$\left(\frac{\partial^2 (F^2)}{\partial v^i \partial v^j}(v)\right)_{i,j=1}^n$$

is positive-definite.

We call the pair $(M, F)$ a Finsler manifold. We say that $F$ is reversible (or absolutely homogeneous, symmetric) if $F(-v) = F(v)$ holds for all $v \in TM$.

Thus, the function $F$ provides a Minkowski norm on each tangent space which varies smoothly also in horizontal directions.

Exercise 2.11 Prove that the positive-definiteness in Definition 2.10(3) is independent of the choice of a local coordinate system.

We give three fundamental examples of Finsler manifolds; more examples and further properties can be found in Chaps. 6 and 10.

Example 2.12
(a) (Minkowski normed spaces) A Minkowski normed space $(\mathbb{R}^n, \|\cdot\|)$ (in the sense of Definition 2.1) is a Finsler manifold by canonically identifying each of its tangent spaces with $(\mathbb{R}^n, \|\cdot\|)$.
(b) (Randers spaces) Let $(M, g)$ be a Riemannian manifold and take a 1-form $\beta$ on $M$ with $|\beta(v)| < \sqrt{g(v, v)}$ for all $v \in TM \setminus 0$. Then

$$F(v) := \sqrt{g(v, v)} + \beta(v) \tag{2.2}$$

is a Finsler structure of $M$. A Finsler manifold constructed in this way is called a Randers space (named after the physicist Gunnar Randers). Notice that $F$ is reversible if and only if $\beta = 0$ (and then $F$ is Riemannian). One could regard that the 1-form $\beta$ represents the effect of wind blown on the Riemannian manifold $(M, g)$. This interpretation leads to a resolution of Zermelo's navigation problem by means of Randers metrics (see [27], [227, Sect. 5.4]). Randers himself originally considered this metric in connection with general relativity in [218].
(c) ($(\alpha, \beta)$-metrics) Again, let $(M, g)$ be a Riemannian manifold and $\beta$ be a 1-form on $M$. We also consider a smooth positive function $\phi : (-b_0, b_0) \to (0, \infty)$, and assume that $|\beta(v)| < b_0\sqrt{g(v, v)}$ for all $v \in TM \setminus 0$. Set $\alpha(v) := \sqrt{g(v, v)}$. Then the function $F$ defined by

$$F(v) := \alpha(v)\,\phi\left(\frac{\beta(v)}{\alpha(v)}\right) \text{ on } TM \setminus 0, \qquad F(0) := 0$$

is called an $(\alpha, \beta)$-metric. This is in fact a Finsler structure if and only if

$$\phi(s) - s\phi'(s) + (b^2 - s^2)\phi''(s) > 0$$

for all $|s| \le b < b_0$ (see [227, Lemma 2.1]). This construction provides a number of important classes of Finsler manifolds. For example, choosing $\phi(s) = 1 + s$ (with $b_0 = 1$) gives Randers metrics. We refer to [227] for further discussions.

We denote the unit sphere in $T_xM$ by $U_xM := T_xM \cap F^{-1}(1)$. It is also called the indicatrix at $x$ in Finsler geometry.

Exercise 2.13 (Unit Spheres of Randers Spaces) Let $(M, F)$ be a Randers space as in (2.2) and take a point $x \in M$. Show that the unit sphere $U_xM$ is an ellipsoid, but its center is not necessarily at the origin. (This is an exercise of linear algebra.)

The non-reversibility of Finsler structures opens the door to many applications not captured by Riemannian metrics. Besides Zermelo's navigation problem, there are many applications of Finsler structures to physics, biology, mechanics, dynamical systems, and so on (we refer to, for example, the books [15, 50, 181]). One of the most famous applications is Katok's construction in [133] of non-reversible Finsler structures of $\mathbb{S}^n$ with only finitely many closed geodesics (see also [261]).

Remark 2.14 (Lagrangian/Hamiltonian Structures) As a further generalization, one can drop the positive homogeneity in Definition 2.10(2). Then we obtain Lagrangian and Hamiltonian structures. It is known that, even in this case, one can define a suitable notion of curvature. We refer to [2] from the view of optimal control system (see also [5, 101, 111] for some preceding works, and [149, 198] for investigations related to the subject of this book). Without homogeneity, however, one cannot define the (asymmetric) distance function as in the following section because the length of a curve can change under a reparametrization.
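Returning to the Randers construction (2.2), the three conditions of Definition 2.10 can be probed numerically on a single tangent space. The sketch below assumes the simplest possible setting, which is our own choice: the tangent space is $\mathbb{R}^2$, $g$ is the Euclidean inner product, and $\beta$ is the linear form given by a hypothetical vector `B` of Euclidean length less than 1. The Hessian of $F^2$ is approximated by central differences, so the check is only indicative.

```python
import math

B = (0.5, 0.0)  # hypothetical 1-form beta(v) = <B, v>; Euclidean length 0.5 < 1

def randers(v):
    """Randers norm F(v) = sqrt(g(v, v)) + beta(v) with Euclidean g."""
    return math.hypot(v[0], v[1]) + B[0] * v[0] + B[1] * v[1]

def hessian_of_square(F, v, h=1e-4):
    """Central-difference approximation of the Hessian matrix of F^2 at v."""
    def G(u):
        return F(u) ** 2
    n = len(v)
    H = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(n):
            def shifted(si, sj):
                u = list(v)
                u[i] += si * h
                u[j] += sj * h
                return G(u)
            H[i][j] = (shifted(1, 1) - shifted(1, -1)
                       - shifted(-1, 1) + shifted(-1, -1)) / (4.0 * h * h)
    return H

def is_positive_definite_2x2(H):
    """Sylvester's criterion for a symmetric 2 x 2 matrix."""
    return H[0][0] > 0 and H[0][0] * H[1][1] - H[0][1] * H[1][0] > 0

print(randers((1.0, 0.0)), randers((-1.0, 0.0)))  # non-reversibility
for k in range(8):
    v = (math.cos(math.pi * k / 4), math.sin(math.pi * k / 4))
    print(v, is_positive_definite_2x2(hessian_of_square(randers, v)))
```

With this choice of `B`, the vectors $(1, 0)$ and $(-1, 0)$ have different norms, while the approximate Hessian of $F^2$ is positive-definite in all sampled directions.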

2.4 Asymmetric Distance and Geodesics Now we introduce a natural distance structure associated with a Finsler structure F . For a piecewise C1 -curve η : [0, l] −→ M, we define its length by 

l

L(η) :=

  F η(t) ˙ dt,

0

where η(t) ˙ ∈ Tη(t) M denotes the tangent (velocity) vector of η at t. Then, given x, y ∈ M, the (asymmetric) distance from x to y is defined as d(x, y) := inf L(η), η

where η runs over all piecewise C1 -curves from x to y. Due to the non-reversibility of F , the distance d(y, x) from y to x can be different from the distance d(x, y)


from x to y. Thus d is not a distance function in the usual sense; however, it is common to call it a distance function in the Finsler context.

Exercise 2.15 Prove that d is symmetric if and only if F is reversible.

It is straightforward from its definition that the distance function d satisfies:

• (Positive-definiteness) d(x, y) = 0 if and only if x = y;
• (Triangle inequality) d(x, z) ≤ d(x, y) + d(y, z) for all x, y, z ∈ M.

In particular, we have −d(z, y) ≤ d(x, z) − d(x, y) ≤ d(y, z). Due to the asymmetry of d, we need to be very careful about the order of x, y, z in these inequalities.

Exercise 2.16 (Distance Functions are Lipschitz) Given z ∈ M, show that the distance function f(x) := d(z, x) is 1-Lipschitz in the sense that f(y) − f(x) ≤ d(x, y) holds for all x, y ∈ M. Similarly, prove that h(x) := −d(x, z) is also 1-Lipschitz.

The above length and distance structures lead us to a natural notion of geodesics, in a way similar to metric geometry, as follows.

Definition 2.17 (Geodesics) We say that a continuous curve η : [0, l] −→ M is a geodesic if it is locally minimizing and of constant speed with respect to the distance function d. That is to say, for some c ≥ 0 and any a ∈ [0, l], there is ε > 0 such that d(η(s), η(t)) = c(t − s) holds for all s, t ∈ [0, l] ∩ [a − ε, a + ε] with s ≤ t. If we have d(η(0), η(l)) = cl (and hence d(η(s), η(t)) = c(t − s) for all 0 ≤ s < t ≤ l and d(η(0), η(l)) = L(η)), then we call η a minimal geodesic. The constant c ≥ 0 represents the speed of η. If c = 1, then we say that η is of unit speed.

Exercise 2.18 Prove the observation in the above definition: If η : [0, l] −→ M is a geodesic of speed c and d(η(0), η(l)) = cl holds, then we have d(η(s), η(t)) = c(t − s) for all 0 ≤ s < t ≤ l.

If a geodesic η of speed c is $C^1$, then we have $F(\dot\eta) \equiv c$ (all geodesics turn out to be $C^\infty$ by Proposition 3.11(i)).

We remark that the order of s and t cannot be reversed in the equality d(η(s), η(t)) = c(t − s) because of the asymmetry of d. The reverse curve $\bar\eta(t) := \eta(l - t)$ is not necessarily locally minimizing or of constant speed, and hence it is not a geodesic in general. For example, recall a Randers space regarded as a Riemannian manifold with wind blowing on it (Example 2.12(b)).
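The asymmetry of d can be seen in the simplest flat model. The following sketch is not taken from the text: it assumes a Randers-type norm F(v) = |v| + ⟨b, v⟩ on R² (the covector `B` is a hypothetical choice with Euclidean norm less than 1) and uses that, in such a normed space, the distance from x to y is F(y − x). It checks that d(x, y) and d(y, x) differ while the triangle inequality and its two-sided consequence still hold.

```python
import math

# Hypothetical covector; |B| < 1 keeps F positive away from the origin.
B = (0.5, 0.0)

def randers_norm(v):
    """F(v) = |v| + <B, v>: a non-reversible Minkowski norm on R^2."""
    return math.hypot(v[0], v[1]) + B[0] * v[0] + B[1] * v[1]

def dist(x, y):
    """Asymmetric distance from x to y in the flat model: d(x, y) = F(y - x)."""
    return randers_norm((y[0] - x[0], y[1] - x[1]))

x, y, z = (0.0, 0.0), (1.0, 0.0), (0.0, 2.0)

d_xy = dist(x, y)  # 1 + 0.5
d_yx = dist(y, x)  # 1 - 0.5

# Triangle inequality d(x, z) <= d(x, y) + d(y, z).
triangle_ok = dist(x, z) <= dist(x, y) + dist(y, z)

# Two-sided consequence: -d(z, y) <= d(x, z) - d(x, y) <= d(y, z).
gap = dist(x, z) - dist(x, y)
two_sided_ok = -dist(z, y) <= gap <= dist(y, z)

print(d_xy, d_yx, triangle_ok, two_sided_ok)
```

Note that the lower bound in the two-sided inequality involves d(z, y) and the upper bound involves d(y, z); swapping them is exactly the kind of mistake the asymmetry invites.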


2.5 Reverse Finsler Structures

It will be useful to introduce the following notion.

Definition 2.19 (Reverse Finsler Structures) Define the reverse Finsler structure $\overleftarrow{F}$ of F by $\overleftarrow{F}(v) := F(-v)$ for $v \in TM$.

Let us put an arrow ← on a quantity associated with $\overleftarrow{F}$. Observe, for example, that $\overleftarrow{d}(x, y) = d(y, x)$ and a curve η is a geodesic with respect to F if and only if its reverse curve $\bar\eta$ is a geodesic with respect to $\overleftarrow{F}$. Moreover, for the fundamental tensor (Sect. 3.1) we have $\overleftarrow{g}_v = g_{-v}$, the curvatures (Chaps. 5, 9) satisfy

\[ \overleftarrow{K}(v, w) = K(-v, w), \qquad \overleftarrow{\mathrm{Ric}}(v) = \mathrm{Ric}(-v), \qquad \overleftarrow{\mathrm{Ric}}_N(v) = \mathrm{Ric}_N(-v), \]

and the gradient vectors and Laplacians (Chap. 11) are related by

\[ \overleftarrow{\nabla} u = -\nabla(-u), \qquad \overleftarrow{\Delta} u = -\Delta(-u), \]

respectively.

Exercise 2.20 Show the relations above (between the quantities with respect to F and $\overleftarrow{F}$).

One might consider some symmetrizations by taking averages like

\[ \hat{F}(v) = \frac{F(v) + \overleftarrow{F}(v)}{2} \qquad \text{or} \qquad \sqrt{\frac{F^2(v) + \overleftarrow{F}^2(v)}{2}}. \]

We will indeed make use of this kind of symmetrization in the formulation of Sobolev spaces (Sect. 11.1). From the geometric viewpoint, however, the structure of $(M, \hat{F})$ can be totally different from that of (M, F). This symmetrization preserves neither geodesics nor any curvature.
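How much a symmetrization can forget is already visible for a flat Randers-type norm. The sketch below is an illustration under assumptions not made in the text: F(v) = |v| + ⟨b, v⟩ on R² with a hypothetical covector `B`, and the second symmetrization taken as the square root of the averaged squares. For this F the arithmetic symmetrization is exactly the Euclidean norm, so the linear part of F disappears entirely.

```python
import math

B = (0.5, 0.0)  # hypothetical covector with Euclidean norm < 1

def F(v):
    """Randers-type norm F(v) = |v| + <B, v> on R^2."""
    return math.hypot(v[0], v[1]) + B[0] * v[0] + B[1] * v[1]

def F_rev(v):
    """Reverse structure: F_rev(v) := F(-v)."""
    return F((-v[0], -v[1]))

def F_sym_arith(v):
    """Arithmetic symmetrization (F(v) + F_rev(v)) / 2."""
    return 0.5 * (F(v) + F_rev(v))

def F_sym_quad(v):
    """Quadratic symmetrization sqrt((F(v)^2 + F_rev(v)^2) / 2)."""
    return math.sqrt(0.5 * (F(v) ** 2 + F_rev(v) ** 2))

v = (3.0, 4.0)
minus_v = (-3.0, -4.0)
print(F(v), F_rev(v), F_sym_arith(v), F_sym_quad(v))
```

Both symmetrized norms are reversible by construction, but they differ from each other and from F, which is consistent with the remark that geodesics and curvature are not preserved in general.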

Chapter 3

Properties of Geodesics

In this chapter, we begin our study of differential calculus on Finsler manifolds. The main subject of the chapter is the geodesic equation as the Euler–Lagrange equation for the energy functional. To this end, some important quantities such as the fundamental and Cartan tensors are introduced. We will see that the metric definition of geodesics in Definition 2.17 coincides with the variational definition as solutions to the geodesic equation. We also prove the Finsler analogue of the Hopf–Rinow theorem.

3.1 Fundamental and Cartan Tensors

Let (M, F) be a Finsler manifold. Choose a local coordinate system $(x^i)_{i=1}^n$ on an open set U ⊂ M, and let x ∈ U. For $v \in T_xM \setminus \{0\}$, we define the components of the fundamental tensor by

\[ g_{ij}(v) := \frac{1}{2} \frac{\partial^2 [F^2]}{\partial v^i \partial v^j}(v), \qquad i, j = 1, 2, \ldots, n. \tag{3.1} \]

Clearly $g_{ji}(v) = g_{ij}(v)$. Recall that this symmetric matrix is positive-definite by the strong convexity (Definition 2.10(3)). Therefore, $(g_{ij}(v))_{i,j=1}^n$ defines an inner product on $T_xM$, denoted by $g_v$, as follows.

Definition 3.1 (Riemannian Metric $g_v$) Given $v \in T_xM \setminus \{0\}$, we define a Riemannian metric $g_v$ on $T_xM$ by

\[ g_v\bigg( \sum_{i=1}^n a_i \frac{\partial}{\partial x^i}\Big|_x,\ \sum_{j=1}^n b_j \frac{\partial}{\partial x^j}\Big|_x \bigg) := \sum_{i,j=1}^n a_i b_j\, g_{ij}(v). \tag{3.2} \]

© The Author(s), under exclusive license to Springer Nature Switzerland AG 2021 S.-i. Ohta, Comparison Finsler Geometry, Springer Monographs in Mathematics, https://doi.org/10.1007/978-3-030-80650-7_3


Exercise 3.2 (Well-definedness of $g_v$) Show that the metric $g_v$ is well-defined in the sense that its definition (3.2) is independent of the choice of a local coordinate system.

Note that, if F is induced from a Riemannian metric g (by $F(v) = \sqrt{g(v, v)}$), then $g_v$ coincides with the original metric g for all $v \in TM \setminus 0$. The Riemannian metric $g_v$ will play a significant role in the Riemannian geometric approach to Finsler geometry.

Let us explain here a geometric meaning of $g_v$, which will be helpful to understand the validity of some results in the sequel (such as Theorem 5.12). Since $F^2|_{T_xM}$ is positively 2-homogeneous and its derivative $(\partial[F^2]/\partial v^i)|_{T_xM}$ is positively 1-homogeneous, using Euler's theorem (Theorem 2.8), we find that

\[ g_v(v, v) = \frac{1}{2} \sum_{i,j=1}^n \frac{\partial^2 [F^2]}{\partial v^i \partial v^j}(v)\, v^i v^j = \frac{1}{2} \sum_{i=1}^n \frac{\partial [F^2]}{\partial v^i}(v)\, v^i = F^2(v) \tag{3.3} \]

for all $v \in T_xM \setminus \{0\}$. By this equality and the definition of $g_v$, the unit sphere of the inner product $g_v$ is tangent to that of $F|_{T_xM}$ at $v/F(v)$ up to second order (see Fig. 3.1). In this sense, $g_v$ can be regarded as the best Riemannian approximation of F in the direction v. Thereby the metric $g_v$ is useful when we are interested in the behavior of some geometric or analytic quantity in the direction v (in another direction we need to employ a different metric). The difference (or, more precisely, the ratio) between F and $g_v$ can be estimated by the uniform convexity and smoothness (see Subsect. 8.3.2 below).

Fig. 3.1 Unit spheres of F and $g_v$ (labels in the figure: $U_xM$, $g_v(\cdot, \cdot) = 1$, $v/F(v)$)

Exercise 3.3 (Fundamental Inequality) For $v, w \in T_xM$ with $v \neq 0$, prove that we have

\[ \sum_{i=1}^n w^i \frac{\partial F}{\partial v^i}(v) \le F(w) \tag{3.4} \]

and equality holds if and only if w = av for some a ≥ 0 (by using the strict convexity of F). This inequality is called the fundamental inequality in [25], and is equivalent to $g_v(v, w) \le F(v)F(w)$ (see also Exercise 3.9 below).

Next we define the Cartan tensor by its components

\[ A_{ijk}(v) := \frac{F(v)}{2} \frac{\partial g_{ij}}{\partial v^k}(v), \qquad v \in T_xM \setminus \{0\}. \tag{3.5} \]

It measures the variation of $g_v$ in the vertical direction. (This notation follows [25]; $C_{ijk}(v) := A_{ijk}(v)/F(v)$ is called the Cartan tensor in some other literature including [230].) Since $g_{ij}(v)$ is positively 0-homogeneous in v, we deduce from Theorem 2.8 the useful relation

\[ \sum_{i=1}^n A_{ijk}(v)\, v^i = \sum_{j=1}^n A_{ijk}(v)\, v^j = \sum_{k=1}^n A_{ijk}(v)\, v^k = 0 \tag{3.6} \]

for all $v \in T_xM \setminus \{0\}$. On a Riemannian manifold (M, g), the Cartan tensor vanishes on the whole of $TM \setminus 0$ since $g_v = g$ for all $v \in TM \setminus 0$. Conversely, if $A_{ijk} = 0$ holds on $T_xM \setminus \{0\}$ for all i, j, k, then $g_v$ is independent of $v \in T_xM \setminus \{0\}$ and $F|_{T_xM}$ coincides with the norm induced from $g_v$ thanks to (3.3). Therefore, the Cartan tensor is a genuinely non-Riemannian quantity appearing only in the Finsler world.

Exercise 3.4 Show that

\[ \sum_{i=1}^n \frac{\partial A_{ijk}}{\partial v^l}(v)\, v^i = -A_{jkl}(v) \tag{3.7} \]

for all $v \in TM \setminus 0$ and 1 ≤ j, k, l ≤ n.
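The identity (3.3) and the fundamental inequality can be checked numerically. The sketch below is only an illustration under assumptions not made in the text: it uses a flat Randers-type norm F(v) = |v| + ⟨b, v⟩ on R² with a hypothetical covector `B`, and approximates the second derivatives in (3.1) by central differences with a hand-picked step `h`. It also shows that the matrix of the fundamental tensor really depends on the reference vector.

```python
import math

B = (0.5, 0.0)  # hypothetical covector with Euclidean norm < 1

def F(v):
    """Randers-type Minkowski norm F(v) = |v| + <B, v> on R^2."""
    return math.hypot(v[0], v[1]) + B[0] * v[0] + B[1] * v[1]

def fundamental_tensor(v, h=1e-4):
    """Approximate g_ij(v) = (1/2) d^2[F^2]/dv^i dv^j by central differences."""
    n = len(v)

    def F2_shifted(i, si, j, sj):
        u = list(v)
        u[i] += si * h
        u[j] += sj * h
        return F(u) ** 2

    g = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(n):
            second = (F2_shifted(i, 1, j, 1) - F2_shifted(i, 1, j, -1)
                      - F2_shifted(i, -1, j, 1) + F2_shifted(i, -1, j, -1)) / (4 * h * h)
            g[i][j] = 0.5 * second
    return g

def g_ref(v, a, b):
    """Inner product g_v(a, b) = sum_ij a^i b^j g_ij(v) with reference vector v."""
    g = fundamental_tensor(v)
    n = len(v)
    return sum(a[i] * b[j] * g[i][j] for i in range(n) for j in range(n))

v, w = (1.0, 2.0), (-1.0, 0.5)

lhs_33 = g_ref(v, v, v)   # g_v(v, v)
rhs_33 = F(v) ** 2        # F^2(v), cf. (3.3)

# Fundamental inequality in the form g_v(v, w) <= F(v) F(w).
fundamental_ok = g_ref(v, v, w) <= F(v) * F(w)

# The fundamental tensor depends on the reference vector.
g11_v = fundamental_tensor(v)[0][0]
g11_w = fundamental_tensor(w)[0][0]
depends_on_v = abs(g11_v - g11_w) > 1e-3

print(lhs_33, rhs_33, fundamental_ok, g11_v, g11_w)
```

For a norm induced by an inner product the two matrix entries `g11_v` and `g11_w` would agree up to discretization error, in line with the vanishing of the Cartan tensor in the Riemannian case.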

3.2 Dual Norms and the Legendre Transformation

In this section, we digress for a while to introduce the dual norm and the Legendre transformation. We define the dual norm $F^* : T^*M \longrightarrow [0, \infty)$ to F by

\[ F^*(\alpha) := \sup_{v \in T_xM,\ F(v) \le 1} \alpha(v) = \sup_{v \in T_xM,\ F(v) = 1} \alpha(v) \tag{3.8} \]

for $\alpha \in T_x^*M$. Thanks to the duality, for each x ∈ M, $F^*|_{T_x^*M}$ provides a Minkowski norm in the sense of Definition 2.1. It is clear from the definition that $\alpha(v) \le F^*(\alpha)F(v)$ holds for any $v \in T_xM$ and $\alpha \in T_x^*M$, and hence

\[ -\min\{F^*(\alpha)F(-v),\ F^*(-\alpha)F(v)\} \le \alpha(v) \le \min\{F^*(\alpha)F(v),\ F^*(-\alpha)F(-v)\}. \tag{3.9} \]

We emphasize that, however, $\alpha(v) \ge -F^*(\alpha)F(v)$ does not necessarily hold unless F is reversible.

Exercise 3.5 Given two Finsler structures $F_1$ and $F_2$ on the same manifold M, suppose that $\lambda F_2(v) \le F_1(v) \le \Lambda F_2(v)$ holds for some $0 < \lambda \le \Lambda < \infty$ and for all $v \in TM$. Show that $\Lambda^{-1} F_2^*(\alpha) \le F_1^*(\alpha) \le \lambda^{-1} F_2^*(\alpha)$ holds for all $\alpha \in T^*M$.

Exercise 3.6 (Lipschitz Functions) Assume that a $C^1$-function f on M satisfies $F^*(df_x) \le C$ for some C ≥ 0 and for all x ∈ M. Show that f is C-Lipschitz in the sense that f(y) − f(x) ≤ Cd(x, y) for all x, y ∈ M (recall Exercise 2.16). We remark that |f(y) − f(x)| ≤ Cd(x, y) does not hold in general due to the non-reversibility of F.

Given a local coordinate system $(x^i)_{i=1}^n$ on U ⊂ M, we use the coordinates $(x^i, \alpha_j)_{i,j=1}^n$ on $T^*U$ such that

\[ \alpha = \sum_{j=1}^n \alpha_j(\alpha)\, dx^j, \]

and we will write $\alpha = \sum_{j=1}^n \alpha_j\, dx^j$ for simplicity. Then, as an analogue to the fundamental tensor (3.1), we define for $\alpha \in T^*U \setminus 0$

\[ g^*_{ij}(\alpha) := \frac{1}{2} \frac{\partial^2 [(F^*)^2]}{\partial \alpha_i \partial \alpha_j}(\alpha), \qquad i, j = 1, 2, \ldots, n. \tag{3.10} \]

As a canonical correspondence between the tangent and cotangent spaces via the Finsler structure, we define the Legendre transformation $\mathcal{L} : T_xM \longrightarrow T_x^*M$ for each x ∈ M by the relations

\[ F^*\big(\mathcal{L}(v)\big) = F(v), \qquad \big[\mathcal{L}(v)\big](v) = F^2(v). \]

By the strict convexity of $F^*$, such an element $\mathcal{L}(v) \in T_x^*M$ is indeed uniquely determined. We denote the inverse map of $\mathcal{L}$ by $\mathcal{L}^* : T^*M \longrightarrow TM$. Then, given $\alpha \in T_x^*M$, $\mathcal{L}^*(\alpha) \in T_xM$ is similarly characterized by

\[ F\big(\mathcal{L}^*(\alpha)\big) = F^*(\alpha), \qquad \alpha\big(\mathcal{L}^*(\alpha)\big) = F^*(\alpha)^2. \]

Exercise 3.7 Prove that $\alpha = \mathcal{L}(v)$ and $v = \mathcal{L}^*(\alpha)$ are characterized by attaining equality in the inequality

\[ \alpha(v) \le \frac{1}{2} F^2(v) + \frac{1}{2} F^*(\alpha)^2. \]

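A local coordinate expression of the Legendre transformation can be given as follows.

Lemma 3.8 (Legendre Transformations) The Legendre transformations $\mathcal{L}$ and $\mathcal{L}^*$ can be written in local coordinates as

\[ \mathcal{L}(v) = \frac{1}{2} \sum_{i=1}^n \frac{\partial [F^2]}{\partial v^i}(v)\, dx^i = \sum_{i,j=1}^n g_{ij}(v)\, v^j\, dx^i, \]

\[ \mathcal{L}^*(\alpha) = \frac{1}{2} \sum_{i=1}^n \frac{\partial [(F^*)^2]}{\partial \alpha_i}(\alpha)\, \frac{\partial}{\partial x^i} = \sum_{i,j=1}^n g^*_{ij}(\alpha)\, \alpha_j\, \frac{\partial}{\partial x^i}. \]

(Precisely, the latter expressions including $g_{ij}(v)$ and $g^*_{ij}(\alpha)$ make sense only outside the zero sections.)

Proof We prove the claim only for $\mathcal{L}(v)$; that for $\mathcal{L}^*(\alpha)$ can be obtained in the same way. Fix $\bar v \in T_xM \setminus \{0\}$, put $\bar\alpha := \mathcal{L}(\bar v)$, and consider the (positively 0-homogeneous) function $v \longmapsto \bar\alpha(v)/F(v)$ on $T_xM \setminus \{0\}$. This function attains its maximum at $v = \bar v$ by the definition of $\mathcal{L}^*$ (and $\mathcal{L}^*(\bar\alpha) = \bar v$), thus we find that

\[ \frac{\partial}{\partial v^i}\bigg[ \frac{\bar\alpha(v)}{F(v)} \bigg] = -\frac{1}{F^2(v)} \frac{\partial F}{\partial v^i}(v) \cdot \bar\alpha(v) + \frac{\bar\alpha_i}{F(v)} \]

vanishes at $v = \bar v$ for all 1 ≤ i ≤ n. Hence,

\[ \bar\alpha_i = \frac{\bar\alpha(\bar v)}{F(\bar v)} \frac{\partial F}{\partial v^i}(\bar v) = F(\bar v) \frac{\partial F}{\partial v^i}(\bar v) = \frac{1}{2} \frac{\partial [F^2]}{\partial v^i}(\bar v). \]

This shows the first equality, and the second follows from Theorem 2.8. □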


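Exercise 3.9 For $v, w \in T_xM$ with $v \neq 0$, show that

\[ \big[\mathcal{L}(v)\big](w) = g_v(v, w). \]

In particular, the fundamental inequality (3.4) is equivalent to $[\mathcal{L}(v)](w) \le F(v)F(w)$ or $\alpha(w) \le F^*(\alpha)F(w)$ by letting $\alpha = \mathcal{L}(v)$.

Exercise 3.10 Prove that $\mathcal{L}|_{T_xM}$ is a linear map if and only if $g_{ij}$ is constant on $T_xM \setminus \{0\}$.

Thus, the Legendre transformations $\mathcal{L}|_{T_xM}$ and $\mathcal{L}^*|_{T_x^*M}$ are not linear at some point x on a non-Riemannian Finsler manifold. This is the reason why the Finsler Laplacian is nonlinear (see Sect. 11.2).

Let us finally remark that, differentiating the $(\partial/\partial x^i)$-component of the equality

\[ v = \mathcal{L}^*\big(\mathcal{L}(v)\big) = \frac{1}{2} \sum_{i=1}^n \frac{\partial [(F^*)^2]}{\partial \alpha_i}\big(\mathcal{L}(v)\big)\, \frac{\partial}{\partial x^i} \]

in $v^j$, we obtain

\[ \delta_{ij} = \sum_{k=1}^n \frac{1}{2} \frac{\partial^2 [(F^*)^2]}{\partial \alpha_k \partial \alpha_i}\big(\mathcal{L}(v)\big) \cdot \frac{1}{2} \frac{\partial^2 [F^2]}{\partial v^j \partial v^k}(v) = \sum_{k=1}^n g^*_{ik}\big(\mathcal{L}(v)\big)\, g_{kj}(v). \]

Therefore, $(g^*_{ij}(\mathcal{L}(v)))_{i,j=1}^n$ is the inverse matrix of $(g_{ij}(v))_{i,j=1}^n$ for each $v \in TM \setminus 0$.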
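The two defining relations of the Legendre transformation, and the failure of the lower bound α(v) ≥ −F*(α)F(v) for non-reversible F, can be observed numerically. The following sketch rests on assumptions that are not part of the text: a flat Randers-type norm F(v) = |v| + ⟨b, v⟩ on R² with a hypothetical covector `B`, a central-difference approximation of the components (1/2) ∂[F²]/∂v^i, and a brute-force approximation of the supremum in the dual norm over sampled directions.

```python
import math

B = (0.5, 0.0)  # hypothetical covector with Euclidean norm < 1

def F(v):
    """Randers-type Minkowski norm F(v) = |v| + <B, v> on R^2."""
    return math.hypot(v[0], v[1]) + B[0] * v[0] + B[1] * v[1]

def legendre(v, h=1e-6):
    """Components (1/2) d[F^2]/dv^i of the Legendre transform, by central differences."""
    comps = []
    for i in range(2):
        up, dn = list(v), list(v)
        up[i] += h
        dn[i] -= h
        comps.append(0.5 * (F(up) ** 2 - F(dn) ** 2) / (2 * h))
    return comps

def dual_norm(alpha, samples=20000):
    """Approximate F*(alpha) = sup alpha(u)/F(u) over sampled unit directions u."""
    best = -math.inf
    for k in range(samples):
        t = 2 * math.pi * k / samples
        u = (math.cos(t), math.sin(t))
        best = max(best, (alpha[0] * u[0] + alpha[1] * u[1]) / F(u))
    return best

v = (1.0, 2.0)
alpha = legendre(v)

pairing = alpha[0] * v[0] + alpha[1] * v[1]  # [L(v)](v), expected to equal F(v)^2
dual = dual_norm(alpha)                      # F*(L(v)), expected to equal F(v)

# Evaluate alpha on u = -v: the naive bound alpha(u) >= -F*(alpha) F(u) fails here.
u = (-v[0], -v[1])
alpha_u = alpha[0] * u[0] + alpha[1] * u[1]
lower_bound_fails = alpha_u < -dual * F(u)

print(pairing, F(v) ** 2, dual, F(v), lower_bound_fails)
```

The sampled maximum slightly underestimates the true supremum, so the comparison with F(v) should be read up to the grid resolution.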

3.3 The Geodesic Equation

Having the fundamental and Cartan tensors in hand, we begin the differential geometric study of geodesics (defined in Definition 2.17). Similarly to Riemannian geometry, one can characterize geodesics as solutions to the associated Euler–Lagrange equation, called the geodesic equation. To this end, for a $C^1$-curve $\eta : [0, l] \longrightarrow M$, define its energy by

\[ E(\eta) := \frac{1}{2} \int_0^l F^2\big(\dot\eta(t)\big)\,dt. \]

It is a standard fact that the critical points of the energy functional E (in a class of curves fixing the endpoints) are both locally minimizing and of constant speed, and vice versa (see Exercises 3.12, 3.13 and 3.19 below). We remark that, on the other hand, the critical points of the length functional are merely locally minimizing (see Sect. 7.1).


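In order to obtain an equation for the critical points of E, we consider a $C^2$-curve $\eta : [0, l] \longrightarrow M$ and its $C^2$-variation $\sigma : [0, l] \times (-\varepsilon, \varepsilon) \longrightarrow M$ such that $\sigma(t, 0) = \eta(t)$ for all $t \in [0, l]$ as well as $\sigma(0, s) = \eta(0)$ and $\sigma(l, s) = \eta(l)$ for all $s \in (-\varepsilon, \varepsilon)$ (i.e., the endpoints are fixed). For simplicity, we set

\[ E(s) := E\big(\sigma(\cdot, s)\big) = \frac{1}{2} \int_0^l F^2\big(\partial_t \sigma(t, s)\big)\,dt, \qquad s \in (-\varepsilon, \varepsilon). \]

We shall calculate $E'(0)$. Recall from (3.3) that

\[ F^2(\partial_t \sigma) = \sum_{i,j=1}^n g_{ij}(\partial_t \sigma)\, \partial_t \sigma^i\, \partial_t \sigma^j, \qquad \text{where } \partial_t \sigma = \sum_{i=1}^n \partial_t \sigma^i \frac{\partial}{\partial x^i}. \]

(Here and henceforth we fix a local coordinate system on a neighborhood of the image of η, by shortening η if necessary.) Hence we have

\[ \begin{aligned} \frac{\partial [F^2(\partial_t \sigma)]}{\partial s}\bigg|_{s=0} &= \sum_{i,j,k=1}^n \bigg\{ \frac{\partial g_{ij}}{\partial x^k}(\dot\eta)\, \partial_s \sigma^k(\cdot, 0) + \frac{\partial g_{ij}}{\partial v^k}(\dot\eta)\, \partial_s \partial_t \sigma^k(\cdot, 0) \bigg\} \dot\eta^i \dot\eta^j \\ &\quad + \sum_{i,j=1}^n g_{ij}(\dot\eta) \big\{ \partial_s \partial_t \sigma^i(\cdot, 0) \cdot \dot\eta^j + \dot\eta^i \cdot \partial_s \partial_t \sigma^j(\cdot, 0) \big\}. \end{aligned} \tag{3.11} \]

Note that, since $g_{ij}$ is a function on the tangent bundle, we need to differentiate it both in the horizontal directions ($\partial g_{ij}/\partial x^k$) and in the vertical directions ($\partial g_{ij}/\partial v^k$). We are going to omit "$(\cdot, 0)$" in the sequel. We find

\[ \begin{aligned} g_{ij}(\dot\eta)\, \partial_s \partial_t \sigma^i \cdot \dot\eta^j &= g_{ij}(\dot\eta)\, \partial_t \partial_s \sigma^i \cdot \dot\eta^j \\ &= \frac{d[g_{ij}(\dot\eta)\, \partial_s \sigma^i \cdot \dot\eta^j]}{dt} - \sum_{k=1}^n \bigg\{ \frac{\partial g_{ij}}{\partial x^k}(\dot\eta)\, \dot\eta^k + \frac{\partial g_{ij}}{\partial v^k}(\dot\eta)\, \ddot\eta^k \bigg\} \partial_s \sigma^i \cdot \dot\eta^j - g_{ij}(\dot\eta)\, \partial_s \sigma^i \cdot \ddot\eta^j \end{aligned} \]

for each i and j, and it follows from (3.6) that

\[ \sum_{i=1}^n \frac{\partial g_{ij}}{\partial v^k}(\dot\eta)\, \dot\eta^i = \sum_{j=1}^n \frac{\partial g_{ij}}{\partial v^k}(\dot\eta)\, \dot\eta^j = 0. \]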


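Substituting these into (3.11) yields

\[ \begin{aligned} \frac{\partial [F^2(\partial_t \sigma)]}{\partial s}\bigg|_{s=0} &= \sum_{i,j,k=1}^n \frac{\partial g_{ij}}{\partial x^k}(\dot\eta)\, \partial_s \sigma^k \cdot \dot\eta^i \dot\eta^j + \sum_{i,j=1}^n \frac{d}{dt}\Big[ g_{ij}(\dot\eta) (\partial_s \sigma^i \cdot \dot\eta^j + \partial_s \sigma^j \cdot \dot\eta^i) \Big] \\ &\quad - \sum_{i,j,k=1}^n \frac{\partial g_{ij}}{\partial x^k}(\dot\eta) (\partial_s \sigma^i \cdot \dot\eta^j + \partial_s \sigma^j \cdot \dot\eta^i)\, \dot\eta^k - \sum_{i,j=1}^n g_{ij}(\dot\eta) (\partial_s \sigma^i \cdot \ddot\eta^j + \partial_s \sigma^j \cdot \ddot\eta^i) \\ &= \sum_{i,j,k=1}^n \bigg\{ \frac{\partial g_{ij}}{\partial x^k}(\dot\eta) - \frac{\partial g_{jk}}{\partial x^i}(\dot\eta) - \frac{\partial g_{ik}}{\partial x^j}(\dot\eta) \bigg\} \partial_s \sigma^k \cdot \dot\eta^i \dot\eta^j \\ &\quad - 2 \sum_{i,j=1}^n g_{ij}(\dot\eta)\, \partial_s \sigma^i \cdot \ddot\eta^j + 2 \sum_{i,j=1}^n \frac{d}{dt}\Big[ g_{ij}(\dot\eta)\, \partial_s \sigma^i \cdot \dot\eta^j \Big]. \end{aligned} \tag{3.12} \]

Since $\partial_s \sigma(0, 0) = \partial_s \sigma(l, 0) = 0$, we obtain

\[ \begin{aligned} E'(0) &= \frac{1}{2} \int_0^l \frac{\partial [F^2(\partial_t \sigma)]}{\partial s}\bigg|_{s=0} dt \\ &= \int_0^l \sum_{i=1}^n \bigg\{ \frac{1}{2} \sum_{j,k=1}^n \bigg( \frac{\partial g_{jk}}{\partial x^i}(\dot\eta) - \frac{\partial g_{ij}}{\partial x^k}(\dot\eta) - \frac{\partial g_{ik}}{\partial x^j}(\dot\eta) \bigg) \dot\eta^j \dot\eta^k - \sum_{j=1}^n g_{ij}(\dot\eta)\, \ddot\eta^j \bigg\} \partial_s \sigma^i \,dt. \end{aligned} \tag{3.13} \]

Now, we define the formal Christoffel symbols by

\[ \gamma^i_{jk}(v) := \frac{1}{2} \sum_{l=1}^n g^{il}(v) \bigg\{ \frac{\partial g_{lk}}{\partial x^j}(v) + \frac{\partial g_{jl}}{\partial x^k}(v) - \frac{\partial g_{jk}}{\partial x^l}(v) \bigg\}, \qquad v \in TU \setminus 0, \tag{3.14} \]

where U ⊂ M is the domain of the local coordinate system and $(g^{ij}(v))_{i,j=1}^n$ denotes the inverse matrix of $(g_{ij}(v))_{i,j=1}^n$. Then we can rewrite (3.13) as

\[ E'(0) = -\int_0^l \sum_{i,m=1}^n g_{im}(\dot\eta) \bigg\{ \sum_{j,k=1}^n \gamma^i_{jk}(\dot\eta)\, \dot\eta^j \dot\eta^k + \ddot\eta^i \bigg\} \partial_s \sigma^m \,dt. \]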


Therefore, η is a critical point for E (in the sense that $E'(0) = 0$ for all variations σ) if and only if it solves the system of ordinary differential equations

\[ \ddot\eta^i + \sum_{j,k=1}^n \gamma^i_{jk}(\dot\eta)\, \dot\eta^j \dot\eta^k = 0, \qquad 1 \le i \le n, \tag{3.15} \]

on (0, l). We call (3.15) the geodesic equation. The basic ODE theory yields the smoothness, existence, and uniqueness of solutions as follows.

Proposition 3.11 (Properties of Geodesics)
(i) All solutions to the geodesic equation are $C^\infty$.
(ii) For any $v \in TM$, there exist ε > 0 and a solution $\eta : (-\varepsilon, \varepsilon) \longrightarrow M$ to the geodesic equation with $\dot\eta(0) = v$.
(iii) If two solutions $\eta_i : [a_i, b_i] \longrightarrow M$ to the geodesic equation with $a_i < 0 < b_i$, i = 1, 2, satisfy $\dot\eta_1(0) = \dot\eta_2(0)$, then we have $\eta_1(t) = \eta_2(t)$ for all $t \in [a_1, b_1] \cap [a_2, b_2]$.

Exercise 3.12 Prove that every solution to the geodesic equation (3.15) has a constant speed, namely the function $F \circ \dot\eta$ is constant.

Exercise 3.13 Prove that a locally minimizing $C^2$-curve of constant speed necessarily satisfies the geodesic equation (3.15).

The $C^2$-condition in Exercise 3.13 is in fact redundant, see Exercise 3.19 below. By combining Exercises 3.12 and 3.19, a geodesic in the sense of Definition 2.17 will equivalently mean a solution to the geodesic equation.

Example 3.14 In a Minkowski normed space $(\mathbb{R}^n, \|\cdot\|)$ with the standard (linear) coordinates, the formal Christoffel symbols $\gamma^i_{jk}$ vanish. Hence every geodesic is a segment, and vice versa.

Exercise 3.15 Let $(\mathbb{R}^2, \|\cdot\|)$ be a 2-dimensional Minkowski normed space, and $M = \mathbb{R}^2/\mathbb{Z}^2$ be the 2-dimensional torus equipped with the quotient Finsler structure. By taking a non-reversible Minkowski norm $\|\cdot\|$, construct an example of a minimal geodesic in M whose reverse curve is not minimizing (but a geodesic, so locally minimizing).

We close this section with some remarks on the formal Christoffel symbols. In the above (somewhat heuristic) argument, there remains a certain freedom in the choice of the functions $\gamma^i_{jk}$. For example, replacing $\gamma^i_{jk}$ with

\[ \frac{1}{2} \sum_{l=1}^n g^{il} \bigg\{ 2 \frac{\partial g_{jl}}{\partial x^k} - \frac{\partial g_{jk}}{\partial x^l} \bigg\} \]

gives rise to the same geodesic equation (3.15). However, these functions are not symmetric in j and k, while we clearly have $\gamma^i_{jk} = \gamma^i_{kj}$. For this and other reasons, $\gamma^i_{jk}$ in (3.14) is a natural choice.

Observe from the above calculation that, thanks to (3.6), all non-Riemannian terms in the geodesic equation vanished and the formal Christoffel symbols $\gamma^i_{jk}$ have the same form as the Riemannian Christoffel symbols (see, for example, [66]), although the Riemannian ones are functions on the manifold M (to be precise, on the domain of the local coordinate system). This phenomenon, however, will turn out to be a special feature of the geodesic equation. In order to take one step forward to define general covariant derivatives of vector fields, we need to modify $\gamma^i_{jk}$ by using some non-Riemannian quantities (without affecting the geodesic equation); see the next chapter. (This fact would be a reason why the functions $\gamma^i_{jk}$ are called “formal” Christoffel symbols.)
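The geodesic equation (3.15) and the constant speed property of Exercise 3.12 can be tested on a concrete curved example. The sketch below is an illustration chosen here, not an example from the text: it takes the Riemannian special case of the upper half-plane with $g = (dx^2 + dy^2)/y^2$, for which the formal Christoffel symbols reduce to the classical ones hard-coded in `christoffel`, and integrates (3.15) with a hand-written fourth-order Runge–Kutta step. The step size and the initial data are arbitrary choices.

```python
import math

def christoffel(p):
    """Christoffel symbols gam[i][j][k] of g = (dx^2 + dy^2) / y^2 at p = (x, y)."""
    y = p[1]
    gam = [[[0.0, 0.0], [0.0, 0.0]], [[0.0, 0.0], [0.0, 0.0]]]
    gam[0][0][1] = gam[0][1][0] = -1.0 / y
    gam[1][0][0] = 1.0 / y
    gam[1][1][1] = -1.0 / y
    return gam

def rhs(state):
    """First-order form of (3.15): d/dt (p, v) = (v, -gamma^i_jk v^j v^k)."""
    p, v = state[:2], state[2:]
    gam = christoffel(p)
    acc = [-sum(gam[i][j][k] * v[j] * v[k] for j in range(2) for k in range(2))
           for i in range(2)]
    return [v[0], v[1], acc[0], acc[1]]

def rk4_step(state, dt):
    """One classical Runge-Kutta step for the system defined by rhs."""
    def shifted(s, k, c):
        return [a + c * b for a, b in zip(s, k)]
    k1 = rhs(state)
    k2 = rhs(shifted(state, k1, dt / 2))
    k3 = rhs(shifted(state, k2, dt / 2))
    k4 = rhs(shifted(state, k3, dt))
    return [s + dt / 6 * (a + 2 * b + 2 * c + d)
            for s, a, b, c, d in zip(state, k1, k2, k3, k4)]

def speed(state):
    """F(eta_dot) = |v| / y for the half-plane metric."""
    return math.hypot(state[2], state[3]) / state[1]

# Start at (0, 1) with velocity (1, 0), which has unit speed.
state = [0.0, 1.0, 1.0, 0.0]
steps, dt = 1000, 1e-3
for _ in range(steps):
    state = rk4_step(state, dt)

final_speed = speed(state)
print(state[0], state[1], final_speed)
```

For these initial data the exact solution is $(\tanh t, 1/\cosh t)$, an arc of the unit circle, so the numerical endpoint at t = 1 can be compared with a closed form in addition to checking that the speed stays equal to 1.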

3.4 The Exponential Map

We next consider the exponential map. For $v \in T_xM$, if there is a geodesic $\eta : [0, 1] \longrightarrow M$ with $\dot\eta(0) = v$, then we define $\exp_x(v) := \eta(1)$ (such a geodesic is unique by Proposition 3.11(iii)). The domain of $\exp_x$ will be denoted by $D_x \subset T_xM$, which is a starlike subset of $T_xM$. Let $D := \bigcup_{x \in M} D_x \subset TM$. The mapping

\[ \exp : D \longrightarrow M, \qquad \exp|_{D_x} := \exp_x, \]

is called the exponential map. By definition, for each $v \in D_x$, the curve $t \longmapsto \exp_x(tv)$ ($t \in [0, 1]$) is a geodesic emanating from x. Thus we have $d(x, \exp_x(tv)) \le tF(v)$ for all $t \in [0, 1]$ (and equality holds for small t > 0 by Exercise 3.19 below).

Remark 3.16 For $v \in T_xM \setminus \{0\}$ and any small ε > 0, we remark that $\eta(t) = \exp_x(tv)$ is not necessarily a geodesic on (−ε, ε) unless F is reversible. The curve $\eta|_{(-\varepsilon, 0]}$ is a geodesic with respect to the reverse Finsler structure $\overleftarrow{F}$.

Some more fundamental properties of the exponential map and its domain are summarized in the following proposition. Given x ∈ M and r > 0, we define the open forward and backward balls with center x and radius r by

\[ B^+_M(x, r) := \{ y \in M \mid d(x, y) < r \}, \qquad B^-_M(x, r) := \{ y \in M \mid d(y, x) < r \}, \tag{3.16} \]

respectively. The closures of them clearly coincide with the closed balls:

\[ \overline{B^+_M(x, r)} = \{ y \in M \mid d(x, y) \le r \}, \qquad \overline{B^-_M(x, r)} = \{ y \in M \mid d(y, x) \le r \}. \]

We may write $B^+(x, r)$ instead of $B^+_M(x, r)$ for simplicity (when the space in question is transparent), or $B^+_F(x, r)$ to stress that the distance structure is induced from F. Open and closed balls in $T_xM$ are defined in a similar way by using the (asymmetric) distance function $d_{T_xM}(v, w) := F(w - v)$.

Proposition 3.17 (Properties of the Exponential Map)
(i) The domain D of the exponential map exp is an open set including the zero section 0.
(ii) For any x ∈ M, there exist ε > 0 and an open neighborhood U of x such that any two points in U are joined by a unique solution to the geodesic equation with length less than ε.
(iii) The exponential map exp is $C^1$ on D, and its restriction to $D \setminus 0$ is $C^\infty$.
(iv) Given x ∈ M, for sufficiently small ε > 0, the map $\exp_x : B^+_{T_xM}(0, \varepsilon) \longrightarrow B^+_M(x, \varepsilon)$ is a $C^1$-diffeomorphism and its restriction $\exp_x : B^+_{T_xM}(0, \varepsilon) \setminus \{0\} \longrightarrow B^+_M(x, \varepsilon) \setminus \{x\}$ is a $C^\infty$-diffeomorphism.

These assertions essentially follow from the ODE theory. In (iii) and (iv), by construction, we find that the derivative $d(\exp_x)_0 : T_0(T_xM) \longrightarrow T_xM$ of $\exp_x$ at $0 \in T_xM$ is the identity map on $T_xM$ (by identifying its domain $T_0(T_xM)$ with $T_xM$); we refer to [25, Sect. 5.3] for details. The exponential map becomes $C^2$ at the zero section 0 if and only if F belongs to the special class of Berwald metrics (see Proposition 6.11).

To prove that solutions of the geodesic equation are locally minimizing, we can follow the standard line as follows.

Lemma 3.18 (Gauss' Lemma) Let $v \in U_xM$ and $\eta : [0, l] \longrightarrow M$ be the solution to the geodesic equation (3.15) with $\dot\eta(0) = v$. For $t \in (0, l)$ and $w \in T_v(T_xM)$ tangent to $U_xM$, we have

\[ g_{\dot\eta(t)}\big( \dot\eta(t),\ d(\exp_x)_{tv}(w) \big) = 0, \]

where we regard w as an element of $T_{tv}(T_xM)$ by identifying both $T_v(T_xM)$ and $T_{tv}(T_xM)$ with $T_xM$ in the canonical way.

Proof Fix $T \in (0, l)$ and, for sufficiently small ε > 0, define a variation $\sigma : [0, T] \times (-\varepsilon, \varepsilon) \longrightarrow M$ by $\sigma(t, s) := \exp_x(t(v + sw))$. Then we have on one hand

\[ E\big(\sigma(\cdot, s)\big) = \frac{T}{2} F^2(v + sw), \]

which takes its minimum at s = 0 since w is tangent to $U_xM$. On the other hand, observe from the calculation in (3.12) and the geodesic equation (3.15) that

\[ \frac{d}{ds}\Big[ E\big(\sigma(\cdot, s)\big) \Big]\Big|_{s=0} = \sum_{i,j=1}^n g_{ij}\big(\dot\eta(T)\big)\, \partial_s \sigma^i(T, 0) \cdot \dot\eta^j(T) = g_{\dot\eta(T)}\big( d(\exp_x)_{Tv}(Tw),\ \dot\eta(T) \big). \]

This shows the claim at t = T, and completes the proof. □

Note that w ∈ Tv (Tx M) being tangent to Ux M is equivalent to gv (v, w) = 0. Exercise 3.19 Let ε > 0 and U ⊂ M be given as in Proposition 3.17(ii), y, z ∈ U , and η : [0, 1] −→ M be the unique solution to the geodesic equation satisfying η(0) = y, η(1) = z and L(η) < ε. Then, for any C1 -curve ξ : [0, 1] −→ M from y to z, show that we have L(η) ≤ L(ξ ) and equality holds only if ξ is a reparametrization of η. We may consider the spherical coordinates about y and make use of Lemma 3.18 (see [66, Sect. I.6] for the Riemannian case). By a similar argument to the characterization of equality in the above exercise, one can also find that a geodesic in the sense of Definition 2.17 is necessarily a solution to the geodesic equation.

3.5 Completenesses and the Hopf–Rinow Theorem

Here we introduce completeness conditions and prove the Hopf–Rinow theorem.

Definition 3.20 (Completenesses) We say that (M, F) is forward complete if D = TM, i.e., if the exponential map is defined on the whole tangent bundle TM. If the reverse Finsler structure $(M, \overleftarrow{F})$ (recall Definition 2.19) is forward complete, then (M, F) is said to be backward complete.

We remark that the forward completeness is not necessarily equivalent to the backward completeness, when the reversibility constant

\[ \Lambda_F := \sup_{v \in TM \setminus 0} \frac{F(-v)}{F(v)} \in [1, \infty] \tag{3.17} \]

is infinite. For instance, the Funk metric appearing in Sect. 6.5 is forward complete but not backward complete (see Exercise 6.22).

In order to state the Finsler analogue of the Hopf–Rinow theorem, we define two notions for asymmetric metric spaces. We say that a set A ⊂ M is forward bounded if $A \subset B^+_M(x, r)$ for some x ∈ M and r > 0. A sequence $(x_i)_{i \in \mathbb{N}}$ in M is called a forward Cauchy sequence if, for any ε > 0, there exists $N \in \mathbb{N}$ such that $d(x_i, x_j) < \varepsilon$ for any N ≤ i < j (notice that we cannot bound $d(x_j, x_i)$ with i < j).

Theorem 3.21 (Hopf–Rinow Theorem) The following statements are equivalent:

(I) (M, F) is forward complete.
(II) We have $D_x = T_xM$ for some x ∈ M.
(III) Any closed, forward bounded set is compact.
(IV) Any forward Cauchy sequence converges.

Furthermore, if (M, F) is forward complete, then:

(V) For any points x, y ∈ M, there exists a minimal geodesic from x to y.

Proof First we assume (II) and show that (V) holds for x from (II) and any y ∈ M. This in particular implies (I) ⇒ (V). For small $\varepsilon \in (0, d(x, y))$ such that the closed ball $\overline{B^+_M(x, \varepsilon)}$ is compact, we can take

\[ z_1 \in S^+_M(x, \varepsilon) := \{ z \in M \mid d(x, z) = \varepsilon \} \]

with $\varepsilon + d(z_1, y) = d(x, y)$ and a minimal geodesic $\eta : [0, \varepsilon] \longrightarrow M$ from $\eta(0) = x$ to $\eta(\varepsilon) = z_1$. Thanks to (II), the domain of the geodesic η is extended to [0, ∞). Let T ≥ ε be the supremum of t > 0 such that

\[ d\big(x, \eta(t)\big) + d\big(\eta(t), y\big) = d(x, y). \]

Suppose T < d(x, y) and put $z_2 := \eta(T)$. For small $\varepsilon' \in (0, d(z_2, y))$, we can choose $z_3 \in S^+_M(z_2, \varepsilon')$ such that $\varepsilon' + d(z_3, y) = d(z_2, y)$ and a minimal geodesic $\xi : [0, \varepsilon'] \longrightarrow M$ from $z_2$ to $z_3$. Then we have

\[ d(x, z_2) + d(z_2, z_3) = d(x, z_2) + d(z_2, y) - d(z_3, y) = d(x, y) - d(z_3, y) \le d(x, z_3). \]

Hence the concatenation of $\eta|_{[0, T]}$ and ξ is a minimizing curve from x to $z_3$. Therefore, $\xi(t) = \eta(T + t)$ holds for $t \in [0, \varepsilon']$, and $\eta|_{[0, T + \varepsilon']}$ is a minimal geodesic from x to $z_3$. This, however, contradicts the maximality of T, thereby we have T = d(x, y) and $\eta|_{[0, d(x, y)]}$ is a minimal geodesic from x to y.

The implication (I) ⇒ (II) is trivial. In order to prove (II) ⇒ (III), it suffices to see that the forward closed ball $\overline{B^+_M(x, r)}$ is compact for any r > 0, where x is from (II). Notice that $\exp_x\big( \overline{B^+_{T_xM}(0, r)} \big) \subset \overline{B^+_M(x, r)}$ always holds. Moreover, we deduce from (V) that $\exp_x\big( \overline{B^+_{T_xM}(0, r)} \big) \supset \overline{B^+_M(x, r)}$. Hence, we find

\[ \exp_x\Big( \overline{B^+_{T_xM}(0, r)} \Big) = \overline{B^+_M(x, r)}. \]

Since the map $\exp_x$ is continuous and $\overline{B^+_{T_xM}(0, r)}$ is compact, $\overline{B^+_M(x, r)}$ is compact as well.

The proof of (III) ⇒ (IV) is almost straightforward. Given a forward Cauchy sequence $(x_i)_{i \in \mathbb{N}}$ such that $d(x_i, x_j) < \varepsilon$ for N ≤ i < j, we have $\{x_i\}_{i \ge N} \subset B^+_M(x_N, \varepsilon)$. Since $\overline{B^+_M(x_N, \varepsilon)}$ is compact, $(x_i)_{i \in \mathbb{N}}$ has a convergent subsequence. Then, by the forward Cauchy condition, the original sequence $(x_i)_{i \in \mathbb{N}}$ also converges to the same limit point.

We finally show (IV) ⇒ (I). Given $v \in T_xM \setminus \{0\}$, the domain of the geodesic $\eta(t) := \exp_x(tv)$ is, in general, a half-open interval [0, l). It follows from (IV) that it is also closed, therefore l = ∞. This completes the proof. □

Exercise 3.22 Let (M, F) be noncompact and forward complete. Show that there exists a nonconstant globally minimizing geodesic $\eta : [0, \infty) \longrightarrow M$. (Such η is called a ray and plays a role in splitting theorems; see Chap. 17.)

The forward Cauchy condition is not equivalent to the backward one; the Funk metric again provides a counter-example. Nonetheless, we remark that the convergence $\lim_{i \to \infty} d(x_i, x) = 0$ is equivalent to the convergence $\lim_{i \to \infty} d(x, x_i) = 0$ since d and $\overleftarrow{d}$ are locally comparable around x. By using the condition (III) for instance, one readily finds the following.

Corollary 3.23 If the reversibility constant $\Lambda_F$ is finite, then the forward completeness is equivalent to the backward completeness. In this case, we can simply speak of completeness without ambiguity.
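For a single Minkowski norm the reversibility constant can be estimated by brute force. The sketch below is only an illustration under assumptions not made in the text: a flat Randers-type norm F(v) = |v| + ⟨b, v⟩ on R² with a hypothetical covector `B`, for which the ratio F(−v)/F(v) on the Euclidean unit circle equals (1 − ⟨b, v⟩)/(1 + ⟨b, v⟩) and hence has supremum (1 + |b|)/(1 − |b|). Since the ratio is positively 0-homogeneous, sampling unit directions suffices.

```python
import math

B = (0.5, 0.0)  # hypothetical covector with Euclidean norm < 1

def F(v):
    """Randers-type Minkowski norm F(v) = |v| + <B, v> on R^2."""
    return math.hypot(v[0], v[1]) + B[0] * v[0] + B[1] * v[1]

def reversibility_constant(samples=3600):
    """Approximate sup F(-v)/F(v) over sampled directions on the unit circle."""
    best = 1.0
    for k in range(samples):
        t = 2 * math.pi * k / samples
        v = (math.cos(t), math.sin(t))
        best = max(best, F((-v[0], -v[1])) / F(v))
    return best

norm_b = math.hypot(B[0], B[1])
lam_numeric = reversibility_constant()
lam_closed = (1 + norm_b) / (1 - norm_b)  # closed form for this particular norm

print(lam_numeric, lam_closed)
```

As |b| approaches 1 the closed form blows up, which gives a feeling for how a non-reversible structure can have a very large (or, on a noncompact manifold, infinite) reversibility constant.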

Chapter 4

Covariant Derivatives

In this chapter, we revisit the geodesic equation (3.15) and give an appropriate definition of covariant derivatives of vector fields (associated with the Chern connection). Our argument will be heuristic and is motivated by a Riemannian characterization described in Proposition 4.6 (going back to [241]; see also [220, Sect. III.5] and [230, Sect. 6.2]). We refer to [25] (as well as a brief account in Sect. 4.4) for a more systematic treatment.

4.1 The Geodesic Equation Revisited

It is sometimes useful to compare a Finsler notion with the corresponding Riemannian notion for a Riemannian structure $g_V$ obtained from the fundamental tensor along a nowhere vanishing (local) vector field V (recall (3.2)). As a first example, let $\eta : [0, l] \longrightarrow M$ be a nonconstant (short) geodesic, and take a nowhere vanishing $C^\infty$-vector field V on a neighborhood of $\eta([0, l])$ such that $V(\eta(t)) = \dot\eta(t)$ for all $t \in [0, l]$. We shall compare the Finsler geodesic equation (3.15) with the Riemannian one with respect to $g_V$. Fix $t_0 \in (0, l)$ and put $x_0 := \eta(t_0)$, $v_0 := \dot\eta(t_0)$.

On one hand, recall from (3.14) that the formal Christoffel symbols with respect to F are

\[ \gamma^i_{jk}(v_0) = \frac{1}{2} \sum_{l=1}^n g^{il}(v_0) \bigg\{ \frac{\partial g_{lk}}{\partial x^j} + \frac{\partial g_{jl}}{\partial x^k} - \frac{\partial g_{jk}}{\partial x^l} \bigg\}(v_0). \]

On the other hand, the Christoffel symbols $\widetilde{\Gamma}^i_{jk}(x_0)$ of $g_V$, which are functions on the domain of the local coordinate system, are defined by

\[ \begin{aligned} \widetilde{\Gamma}^i_{jk}(x_0) &= \frac{1}{2} \sum_{l=1}^n g^{il}(v_0) \bigg\{ \frac{\partial [g_{lk}(V)]}{\partial x^j} + \frac{\partial [g_{jl}(V)]}{\partial x^k} - \frac{\partial [g_{jk}(V)]}{\partial x^l} \bigg\}(x_0) \\ &= \gamma^i_{jk}(v_0) + \frac{1}{2} \sum_{l,m=1}^n g^{il}(v_0) \bigg\{ \frac{\partial g_{lk}}{\partial v^m}(V) \frac{\partial V^m}{\partial x^j} + \frac{\partial g_{jl}}{\partial v^m}(V) \frac{\partial V^m}{\partial x^k} - \frac{\partial g_{jk}}{\partial v^m}(V) \frac{\partial V^m}{\partial x^l} \bigg\}(x_0), \end{aligned} \tag{4.1} \]

where $V = \sum_{m=1}^n V^m (\partial/\partial x^m)$. Thanks to (3.6), the terms other than $\gamma^i_{jk}(v_0)$ vanish when we substitute (4.1) into the geodesic equation. Thus we have

\[ \ddot\eta^i(t_0) + \sum_{j,k=1}^n \widetilde{\Gamma}^i_{jk}(x_0)\, v_0^j v_0^k = \ddot\eta^i(t_0) + \sum_{j,k=1}^n \gamma^i_{jk}(v_0)\, v_0^j v_0^k. \tag{4.2} \]

Therefore, η is a geodesic also for the Riemannian structure $g_V$.

We stress that, to cancel the extra non-Riemannian terms appearing in (4.1), we need to take the contraction with both $v_0^j$ and $v_0^k$ in (4.2). Hence the appearance of the formal Christoffel symbols $\gamma^i_{jk}$ in the geodesic equation essentially depends on the special “double contraction” structure of the equation. We need to modify the functions $\gamma^i_{jk}$ when we consider covariant derivatives of general vector fields (and connections).

© The Author(s), under exclusive license to Springer Nature Switzerland AG 2021 S.-i. Ohta, Comparison Finsler Geometry, Springer Monographs in Mathematics, https://doi.org/10.1007/978-3-030-80650-7_4

4.2 Covariant Derivatives

We shall argue a heuristic way of finding an appropriate definition of covariant derivatives of vector fields. Recall that, in the Riemannian case, Christoffel symbols are functions on the domain U of a local coordinate system. In the Finsler case, similarly to formal Christoffel symbols, we need to consider functions on the (slit) tangent bundle $TU \setminus 0$.

Definition 4.1 (Covariant Derivatives) Let X be a $C^1$-vector field on M, $w \in T_xM$ and $v \in T_xM \setminus \{0\}$. Then we define

\[ D^v_w X := \sum_{i,j=1}^n \bigg\{ w^j \frac{\partial X^i}{\partial x^j}(x) + \sum_{k=1}^n \Gamma^i_{jk}(v)\, w^j X^k(x) \bigg\} \frac{\partial}{\partial x^i}\Big|_x \in T_xM, \tag{4.3} \]

where $\Gamma^i_{jk}(v)$ is a modification of $\gamma^i_{jk}(v)$ given later in (4.8). We call $D^v_w X$ the covariant derivative of X by w with reference vector v.


The freedom of the choice of a reference vector $v$ is a special feature of the Finsler setting; we most commonly choose $v = w$ or $v = X(x)$. Now the question is: How do we find a reasonable definition of $\Gamma^i_{jk}(v)$? We have seen that simply setting $\Gamma^i_{jk} = \gamma^i_{jk}$ makes sense when we consider the tangent vector field of a geodesic $\eta$. Motivated by this observation, we introduce the coefficients of the geodesic spray and the nonlinear connection as
\[
G^i(v) := \sum_{j,k=1}^n \gamma^i_{jk}(v)\, v^j v^k, \qquad N^i_j(v) := \frac{1}{2} \frac{\partial G^i}{\partial v^j}(v), \tag{4.4}
\]
respectively, where $v \in TU \setminus 0$, $i, j = 1, 2, \ldots, n$, and $U \subset M$ is the domain of the local coordinate system (we refer to [25, Sect. 2.3] for an explanation on how the quantities $N^i_j$ arise). We also set $G^i = N^i_j := 0$ on the zero section $0$. Notice that $G^i$ is positively 2-homogeneous, hence by Theorem 2.8
\[
\sum_{j=1}^n N^i_j(v)\, v^j = G^i(v). \tag{4.5}
\]
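As a sanity check on (4.4) and (4.5), the following sketch specializes to the Riemannian case. The assumption that $g_{ij}$, and hence $\gamma^i_{jk}$, depends only on $x$ is introduced here for illustration and is not part of the argument in the text.

```latex
% Assumption: g_{ij} = g_{ij}(x) does not depend on v, so that
% \gamma^i_{jk} = \gamma^i_{jk}(x) is symmetric in j and k.
\begin{align*}
G^i(v) &= \sum_{j,k=1}^n \gamma^i_{jk}(x)\, v^j v^k, \\
N^i_j(v) &= \frac{1}{2} \frac{\partial G^i}{\partial v^j}(v)
          = \frac{1}{2} \sum_{k=1}^n \bigl( \gamma^i_{jk}(x) + \gamma^i_{kj}(x) \bigr) v^k
          = \sum_{k=1}^n \gamma^i_{jk}(x)\, v^k, \\
\sum_{j=1}^n N^i_j(v)\, v^j &= \sum_{j,k=1}^n \gamma^i_{jk}(x)\, v^j v^k = G^i(v).
\end{align*}
```

The last line is (4.5), obtained here by direct computation rather than from Theorem 2.8.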

The geodesic equation (3.15) can be rewritten as $\ddot\eta^i + G^i(\dot\eta) = 0$. Let $\eta : [0, l] \longrightarrow M$ and $V$ be as in the previous section, take $t_0 \in (0, l)$ and put $x_0 := \eta(t_0)$ and $v_0 := \dot\eta(t_0)$. It follows from (4.1) and (3.6) that we have
\[
\sum_{j=1}^n \bar\Gamma^i_{jk}(x_0)\, v_0^j = \sum_{j=1}^n \Big\{ \gamma^i_{jk}(v_0) + \frac{1}{2} \sum_{l,m=1}^n g^{il}(v_0) \frac{\partial g_{lk}}{\partial v^m}(v_0) \frac{\partial V^m}{\partial x^j}(x_0) \Big\} v_0^j \tag{4.6}
\]
for the Christoffel symbols $\bar\Gamma^i_{jk}$ of $g_V$ and for all $1 \le i, k \le n$. Since $\eta$ is a geodesic (of $F$) and $V(\eta(t)) = \dot\eta(t)$, we deduce from the geodesic equation and (4.5) that
\[
\sum_{j=1}^n \frac{\partial V^m}{\partial x^j}(x_0)\, v_0^j = \ddot\eta^m(t_0) = -G^m(v_0) = -\sum_{j=1}^n N^m_j(v_0)\, v_0^j.
\]
Substituting this into (4.6), we obtain an expression which is independent of the choice of $V$:
\[
\sum_{j=1}^n \bar\Gamma^i_{jk}(x_0)\, v_0^j = \sum_{j=1}^n \Big\{ \gamma^i_{jk}(v_0) - \frac{1}{F(v_0)} \sum_{l,m=1}^n g^{il}(v_0) A_{lkm}(v_0) N^m_j(v_0) \Big\} v_0^j. \tag{4.7}
\]


The above calculation suggests the following definition of $\Gamma^i_{jk}(v)$:
\[
\Gamma^i_{jk}(v) := \gamma^i_{jk}(v) - \frac{1}{F(v)} \sum_{l,m=1}^n g^{il}(v) \bigl( A_{lkm} N^m_j + A_{jlm} N^m_k - A_{jkm} N^m_l \bigr)(v) \tag{4.8}
\]

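for $v \in TU \setminus 0$ and $i, j, k = 1, 2, \ldots, n$. Note that $\Gamma^i_{jk}(v)$ has the same symmetries in $i, j, k$ as $\gamma^i_{jk}(v)$, in particular, $\Gamma^i_{kj}(v) = \Gamma^i_{jk}(v)$. The right-hand side of (4.7) coincides with $\sum_{j=1}^n \Gamma^i_{jk}(v_0)\, v_0^j$ thanks to (3.6).

Example 4.2 In a Minkowski normed space $(\mathbb{R}^n, \|\cdot\|)$ with the standard (linear) coordinates, $\gamma^i_{jk}$ vanishes (recall Example 3.14) and so does $N^i_j$. Therefore, we have $\Gamma^i_{jk} = 0$ and the covariant derivatives mean the component-wise derivatives.

Lemma 4.3 (Well-definedness of Covariant Derivatives) The covariant derivative given in (4.3) with (4.8) is well-defined, i.e., it is independent of the choice of a local coordinate system.

Proof This is a reasonable exercise for coordinate calculations; we give only an outline. Let $(x^i)_{i=1}^n$ and $(y^a)_{a=1}^n$ be two local coordinate systems around $x \in M$. Then we have
\[
v = \sum_{i=1}^n v^i \frac{\partial}{\partial x^i}\Big|_x = \sum_{a=1}^n \Big( \sum_{i=1}^n v^i \frac{\partial y^a}{\partial x^i}(x) \Big) \frac{\partial}{\partial y^a}\Big|_x
\]
for $v \in T_xM$. Put $u^a = \sum_{i=1}^n v^i (\partial y^a / \partial x^i)$ for simplicity and notice that
\[
\frac{\partial}{\partial v^i} = \sum_{a=1}^n \frac{\partial u^a}{\partial v^i} \frac{\partial}{\partial u^a} = \sum_{a=1}^n \frac{\partial y^a}{\partial x^i} \frac{\partial}{\partial u^a},
\]
\[
\frac{\partial}{\partial x^i} = \sum_{a=1}^n \Big\{ \frac{\partial y^a}{\partial x^i} \frac{\partial}{\partial y^a} + \frac{\partial u^a}{\partial x^i} \frac{\partial}{\partial u^a} \Big\} = \sum_{a=1}^n \Big\{ \frac{\partial y^a}{\partial x^i} \frac{\partial}{\partial y^a} + \sum_{j=1}^n v^j \frac{\partial^2 y^a}{\partial x^i \partial x^j} \frac{\partial}{\partial u^a} \Big\}
\]
as tangent vectors on $TM$ (we remark that $\partial/\partial x^i = \sum_{a=1}^n (\partial y^a/\partial x^i)\, \partial/\partial y^a$ as tangent vectors on $M$). Using these relations, we find
\[
g_{ij}(v) = \sum_{a,b=1}^n \frac{\partial y^a}{\partial x^i}(x) \frac{\partial y^b}{\partial x^j}(x)\, \tilde g_{ab}(v),
\qquad
A_{ijk}(v) = \sum_{a,b,c=1}^n \frac{\partial y^a}{\partial x^i}(x) \frac{\partial y^b}{\partial x^j}(x) \frac{\partial y^c}{\partial x^k}(x)\, \tilde A_{abc}(v),
\]
and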

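\begin{align*}
\gamma^i_{jk}(v) &= \sum_{a,b,c=1}^n \frac{\partial x^i}{\partial y^a}(x) \frac{\partial y^b}{\partial x^j}(x) \frac{\partial y^c}{\partial x^k}(x)\, \tilde\gamma^a_{bc}(v) + \sum_{a=1}^n \frac{\partial x^i}{\partial y^a}(x) \frac{\partial^2 y^a}{\partial x^j \partial x^k}(x) \\
&\quad + \frac{1}{F(v)} \sum_{a,b,c,d,l=1}^n \frac{\partial x^i}{\partial y^a} \Big\{ \frac{\partial y^b}{\partial x^j} \frac{\partial^2 y^c}{\partial x^k \partial x^l} + \frac{\partial y^c}{\partial x^k} \frac{\partial^2 y^b}{\partial x^j \partial x^l} \Big\}(x)\, \tilde g^{ad}(v)\, v^l \tilde A_{bcd}(v) \\
&\quad - \frac{1}{F(v)} \sum_{a,b,c,d,e,l,m=1}^n \frac{\partial x^i}{\partial y^a} \frac{\partial x^l}{\partial y^d} \frac{\partial y^b}{\partial x^j} \frac{\partial y^c}{\partial x^k} \frac{\partial^2 y^e}{\partial x^l \partial x^m}(x)\, \tilde g^{ad}(v)\, v^m \tilde A_{bce}(v), \\
N^i_j(v) &= \sum_{a,b=1}^n \frac{\partial x^i}{\partial y^a}(x) \frac{\partial y^b}{\partial x^j}(x)\, \tilde N^a_b(v) + \sum_{a,k=1}^n \frac{\partial x^i}{\partial y^a}(x) \frac{\partial^2 y^a}{\partial x^j \partial x^k}(x)\, v^k,
\end{align*}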

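where $g_{ij}$, $A_{ijk}$, $\gamma^i_{jk}$, and $N^i_j$ (resp. $\tilde g_{ab}$, $\tilde A_{abc}$, $\tilde\gamma^a_{bc}$, and $\tilde N^a_b$) denote the tensor components and connection coefficients with respect to the coordinates $(x^i)_{i=1}^n$ (resp. $(y^a)_{a=1}^n$). Then we have
\[
\Gamma^i_{jk}(v) = \sum_{a=1}^n \frac{\partial x^i}{\partial y^a}(x) \Big\{ \sum_{b,c=1}^n \frac{\partial y^b}{\partial x^j}(x) \frac{\partial y^c}{\partial x^k}(x)\, \tilde\Gamma^a_{bc}(v) + \frac{\partial^2 y^a}{\partial x^j \partial x^k}(x) \Big\},
\]
where $\tilde\Gamma^a_{bc}$ denotes the functions (4.8) with respect to $(y^a)_{a=1}^n$. Hence
\begin{align*}
&\sum_{i,j=1}^n \Big\{ w^j \frac{\partial X^i}{\partial x^j} + \sum_{k=1}^n \Gamma^i_{jk}(v)\, w^j X^k \Big\} \frac{\partial}{\partial x^i} \\
&= \sum_{i,j,a=1}^n \Big\{ w^j \sum_{b=1}^n \frac{\partial y^b}{\partial x^j} \frac{\partial}{\partial y^b} \Big[ \frac{\partial x^i}{\partial y^a} \tilde X^a \Big]
 + \sum_{k=1}^n \frac{\partial x^i}{\partial y^a} \Big( \sum_{b,c=1}^n \frac{\partial y^b}{\partial x^j} \frac{\partial y^c}{\partial x^k}\, \tilde\Gamma^a_{bc}(v) + \frac{\partial^2 y^a}{\partial x^j \partial x^k} \Big) w^j X^k \Big\} \frac{\partial}{\partial x^i} \\
&= \sum_{a,b=1}^n \Big\{ \tilde w^b \frac{\partial \tilde X^a}{\partial y^b} + \sum_{c=1}^n \tilde\Gamma^a_{bc}(v)\, \tilde w^b \tilde X^c \Big\} \frac{\partial}{\partial y^a}
\end{align*}
as we desired, where we write
\[
w = \sum_{j=1}^n w^j \frac{\partial}{\partial x^j}\Big|_x = \sum_{b=1}^n \tilde w^b \frac{\partial}{\partial y^b}\Big|_x,
\qquad
X = \sum_{i=1}^n X^i \frac{\partial}{\partial x^i} = \sum_{a=1}^n \tilde X^a \frac{\partial}{\partial y^a}.
\]
$\square$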

Exercise 4.4 Complete the details of the proof of the above lemma.

It is straightforward from its definition (4.3) that the covariant derivative enjoys the linearity
\[
D^v_{w_1 + w_2} X = D^v_{w_1} X + D^v_{w_2} X, \qquad D^v_{cw} X = c D^v_w X, \qquad D^v_w (X_1 + X_2) = D^v_w X_1 + D^v_w X_2,
\]
where $c \in \mathbb{R}$, and satisfies the Leibniz rule
\[
D^v_w (fX) = f \cdot D^v_w X + df(w) \cdot X \tag{4.9}
\]
for $f \in C^1(M)$. It has also the torsion-freeness as in the next exercise (see also (4.14) in Theorem 4.13 below).

Exercise 4.5 (Torsion-Freeness) Show that
\[
D^v_W X - D^v_X W = [W, X]
\]
holds for any $C^1$-vector fields $W, X$ on $M$ and $v \in T_xM \setminus \{0\}$, where $[W, X]$ is the Lie bracket.

The following important and inspiring proposition is a consequence of the calculations above.

Proposition 4.6 (A Riemannian Characterization of Covariant Derivatives) Let $V$ be a nowhere vanishing $C^\infty$-vector field on $M$ and assume that all integral curves of $V$ are geodesic. Then we have
\[
D^V_V W = D^{g_V}_V W, \qquad D^V_W V = D^{g_V}_W V
\]
for any $C^1$-vector field $W$ on $M$, where $D^{g_V}$ denotes the covariant derivative with respect to the Riemannian structure $g_V$.

Proof We first consider
\[
D^V_V W = \sum_{i,j=1}^n \Big\{ V^j \frac{\partial W^i}{\partial x^j} + \sum_{k=1}^n \Gamma^i_{jk}(V)\, V^j W^k \Big\} \frac{\partial}{\partial x^i}.
\]
Recalling (4.7) and (4.8), we find that
\[
\sum_{j=1}^n \Gamma^i_{jk}(V)\, V^j = \sum_{j=1}^n \bar\Gamma^i_{jk} V^j,
\]
where $\bar\Gamma^i_{jk}$ is the Christoffel symbol of $g_V$. Thus we have $D^V_V W = D^{g_V}_V W$. We similarly obtain
\begin{align*}
D^V_W V &= \sum_{i,j=1}^n \Big\{ W^j \frac{\partial V^i}{\partial x^j} + \sum_{k=1}^n \Gamma^i_{jk}(V)\, W^j V^k \Big\} \frac{\partial}{\partial x^i} \\
&= \sum_{i,j=1}^n \Big\{ W^j \frac{\partial V^i}{\partial x^j} + \sum_{k=1}^n \bar\Gamma^i_{jk} W^j V^k \Big\} \frac{\partial}{\partial x^i} = D^{g_V}_W V,
\end{align*}
which completes the proof. $\square$

A vector field $V$ satisfying the condition in the above proposition is sometimes called a geodesic field (see, e.g., [230, Sect. 6.2]). As it turns out from Proposition 4.6 and a similar characterization of the curvature in the next chapter (Theorem 5.12), such a vector field plays an important role in comparison geometry.

We close this section with an exercise on an expression of $N^i_j(v)$ we use later. We may apply the standard fact
\[
\frac{\partial g^{im}}{\partial v^j}(v) = -\sum_{a,b=1}^n g^{ia}(v) \frac{\partial g_{ab}}{\partial v^j}(v)\, g^{bm}(v)
\]
and, as always, Theorem 2.8.

Exercise 4.7 Show that
\[
N^i_j(v) = \sum_{k=1}^n \gamma^i_{jk}(v)\, v^k - \frac{1}{F(v)} \sum_{k,l=1}^n g^{ik}(v) A_{kjl}(v) G^l(v) = \sum_{k=1}^n \Gamma^i_{jk}(v)\, v^k. \tag{4.10}
\]
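The second equality in (4.10) can be checked by contracting (4.8) with $v^k$. The following is a short sketch of that step only; it assumes (3.6) in the form $\sum_k A_{jkm}(v) v^k = 0$, which is how (3.6) is used after (4.8), together with (4.5).

```latex
\begin{align*}
\sum_{k=1}^n \Gamma^i_{jk}(v)\, v^k
&= \sum_{k=1}^n \gamma^i_{jk}(v)\, v^k
   - \frac{1}{F(v)} \sum_{k,l,m=1}^n g^{il}(v)
     \bigl( A_{lkm} N^m_j + A_{jlm} N^m_k - A_{jkm} N^m_l \bigr)(v)\, v^k \\
% the terms with A_{lkm} v^k and A_{jkm} v^k vanish by (3.6)
&= \sum_{k=1}^n \gamma^i_{jk}(v)\, v^k
   - \frac{1}{F(v)} \sum_{l,m=1}^n g^{il}(v) A_{jlm}(v) \sum_{k=1}^n N^m_k(v)\, v^k \\
% (4.5): \sum_k N^m_k(v) v^k = G^m(v)
&= \sum_{k=1}^n \gamma^i_{jk}(v)\, v^k
   - \frac{1}{F(v)} \sum_{l,m=1}^n g^{il}(v) A_{ljm}(v) G^m(v).
\end{align*}
```

Renaming the summation indices $(l, m)$ as $(k, l)$ gives the middle expression of (4.10); the first equality, which involves differentiating $G^i$, is the part left to the exercise.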

4.3 Covariant Derivatives Along Curves

This section is devoted to some further fundamental properties of covariant derivatives introduced in Definition 4.1. Given a $C^1$-curve $\eta : [0, l] \longrightarrow M$ and a nowhere vanishing vector field $V$ along $\eta$, we define the covariant derivative of a $C^1$-vector field $X$ along $\eta$ with reference vector field $V$ as
\[
D^V_{\dot\eta} X(t) := \sum_{i=1}^n \Big\{ \dot X^i + \sum_{j,k=1}^n \Gamma^i_{jk}(V)\, \dot\eta^j X^k \Big\}(t)\, \frac{\partial}{\partial x^i}\Big|_{\eta(t)},
\]
where $X(t) = \sum_{i=1}^n X^i(t) (\partial/\partial x^i)|_{\eta(t)}$. The Leibniz rule (4.9) is then read as
\[
D^V_{\dot\eta} (fX)(t) = f(t) \cdot D^V_{\dot\eta} X(t) + f'(t) \cdot X(t), \tag{4.11}
\]
where $f$ is a $C^1$-function on $[0, l]$. Note that the geodesic equation (3.15) for $\eta$ coincides with $D^{\dot\eta}_{\dot\eta} \dot\eta = 0$.

For three vector fields $V, W, X$ along $\eta$, where $V$ is nowhere vanishing, one may expect that
\[
\frac{d}{dt} \bigl[ g_V(W, X) \bigr] = g_V(D^V_{\dot\eta} W, X) + g_V(W, D^V_{\dot\eta} X)
\]
holds (in other words, the metric compatibility holds). In a local coordinate system, on one hand, the left-hand side is
\[
\frac{d}{dt} \Big[ \sum_{i,j=1}^n g_{ij}(V) W^i X^j \Big] = \sum_{i,j,k=1}^n \Big\{ \frac{\partial g_{ij}}{\partial x^k}(V)\, \dot\eta^k + \frac{\partial g_{ij}}{\partial v^k}(V)\, \dot V^k \Big\} W^i X^j + \sum_{i,j=1}^n g_{ij}(V) \bigl( \dot W^i X^j + W^i \dot X^j \bigr).
\]

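On the other hand, using (4.8), (3.14), and (3.6), we find
\begin{align*}
&g_V(D^V_{\dot\eta} W, X) + g_V(W, D^V_{\dot\eta} X) \\
&= \sum_{i,j=1}^n g_{ij}(V) \Big\{ \dot W^i + \sum_{k,l=1}^n \Gamma^i_{kl}(V)\, \dot\eta^k W^l \Big\} X^j + \sum_{i,j=1}^n g_{ij}(V)\, W^i \Big\{ \dot X^j + \sum_{k,l=1}^n \Gamma^j_{kl}(V)\, \dot\eta^k X^l \Big\} \\
&= \sum_{i,j=1}^n g_{ij}(V) \bigl( \dot W^i X^j + W^i \dot X^j \bigr) + \sum_{i,j,k,l=1}^n \bigl\{ g_{lj}(V) \Gamma^l_{ki}(V) + g_{il}(V) \Gamma^l_{kj}(V) \bigr\} \dot\eta^k W^i X^j \\
&= \sum_{i,j=1}^n g_{ij}(V) \bigl( \dot W^i X^j + W^i \dot X^j \bigr) + \sum_{i,j,k=1}^n \Big\{ \frac{\partial g_{ij}}{\partial x^k} - \frac{2}{F} \sum_{l=1}^n A_{ijl} N^l_k \Big\}(V)\, \dot\eta^k W^i X^j. \tag{4.12}
\end{align*}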

Thus we obtain
\[
\frac{d}{dt} \bigl[ g_V(W, X) \bigr] - g_V(D^V_{\dot\eta} W, X) - g_V(W, D^V_{\dot\eta} X) = \frac{2}{F(V)} \sum_{i,j,k=1}^n \Big\{ A_{ijk}(V)\, \dot V^k + \sum_{l=1}^n A_{ijl}(V) N^l_k(V)\, \dot\eta^k \Big\} W^i X^j, \tag{4.13}
\]
which does not necessarily vanish in general (compare this with (4.15) in Theorem 4.13 below). We shall present two important cases in which the right-hand side of (4.13) vanishes.

Lemma 4.8 For any $C^1$-vector fields $V, W$ along a $C^1$-curve $\eta : [0, l] \longrightarrow M$ such that $V(t) \neq 0$ for all $t \in [0, l]$, we have
\[
\frac{d}{dt} \bigl[ g_V(V, W) \bigr] = g_V(D^V_{\dot\eta} V, W) + g_V(V, D^V_{\dot\eta} W).
\]

Proof Let $X = V$ in (4.13); then (3.6) yields the claim. $\square$
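To spell out the one-line proof of Lemma 4.8: with $X = V$, each term on the right-hand side of (4.13) contains a contraction of the Cartan tensor with $V$. The sketch below assumes (3.6) in the form $\sum_j A_{ijk}(V) V^j = 0$, which is how (3.6) is used elsewhere in this chapter.

```latex
\begin{align*}
&\frac{2}{F(V)} \sum_{i,j,k=1}^n \Big\{ A_{ijk}(V)\, \dot V^k
   + \sum_{l=1}^n A_{ijl}(V) N^l_k(V)\, \dot\eta^k \Big\} W^i V^j \\
&= \frac{2}{F(V)} \sum_{i,k=1}^n \Big( \sum_{j=1}^n A_{ijk}(V) V^j \Big) \dot V^k W^i
 + \frac{2}{F(V)} \sum_{i,k,l=1}^n \Big( \sum_{j=1}^n A_{ijl}(V) V^j \Big) N^l_k(V)\, \dot\eta^k W^i
 = 0.
\end{align*}
```

Both inner sums over $j$ vanish, so no condition on $\eta$ or on $\dot V$ is needed in this case.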

 

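The next lemma can be alternatively reduced to the Riemannian case by Proposition 4.6.

Lemma 4.9 (Covariant Derivatives Along Geodesics) For any nonconstant geodesic $\eta : [0, l] \longrightarrow M$ and $C^1$-vector fields $W, X$ along $\eta$, we have
\[
\frac{d}{dt} \bigl[ g_{\dot\eta}(W, X) \bigr] = g_{\dot\eta}(D^{\dot\eta}_{\dot\eta} W, X) + g_{\dot\eta}(W, D^{\dot\eta}_{\dot\eta} X).
\]

Proof Letting $V = \dot\eta$ in (4.13), we have
\[
\frac{d}{dt} \bigl[ g_{\dot\eta}(W, X) \bigr] - g_{\dot\eta}(D^{\dot\eta}_{\dot\eta} W, X) - g_{\dot\eta}(W, D^{\dot\eta}_{\dot\eta} X) = \frac{2}{F(\dot\eta)} \sum_{i,j,k=1}^n \Big\{ A_{ijk}(\dot\eta)\, \ddot\eta^k + \sum_{l=1}^n A_{ijl}(\dot\eta) N^l_k(\dot\eta)\, \dot\eta^k \Big\} W^i X^j.
\]
We deduce from the geodesic equation (3.15) and (4.5) that
\[
\sum_{k=1}^n \Big\{ A_{ijk}(\dot\eta)\, \ddot\eta^k + \sum_{l=1}^n A_{ijl}(\dot\eta) N^l_k(\dot\eta)\, \dot\eta^k \Big\} = -\sum_{k=1}^n A_{ijk}(\dot\eta) G^k(\dot\eta) + \sum_{l=1}^n A_{ijl}(\dot\eta) G^l(\dot\eta) = 0
\]
for all $1 \le i, j \le n$. This completes the proof. $\square$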

 


One can slightly generalize Lemma 4.9 to locally minimizing curves (see Exercise 7.2 below) as follows.

Lemma 4.10 (Covariant Derivatives Along Locally Minimizing Curves) Let $\eta : [0, l] \longrightarrow M$ be a $C^2$-curve satisfying $F(\dot\eta) \neq 0$ and $D^{\dot\eta}_{\dot\eta} [\dot\eta / F(\dot\eta)] = 0$ on $[0, l]$. Then we have, for any $C^1$-vector fields $W, X$ along $\eta$,
\[
\frac{d}{dt} \bigl[ g_{\dot\eta}(W, X) \bigr] = g_{\dot\eta}(D^{\dot\eta}_{\dot\eta} W, X) + g_{\dot\eta}(W, D^{\dot\eta}_{\dot\eta} X).
\]

Exercise 4.11 Prove Lemma 4.10 by computation or by reducing it to Lemma 4.9.

4.4 The Chern Connection

In this final section of this chapter, we review another introduction of the functions $\Gamma^i_{jk}$ as the coefficients of a connection which satisfies some fine properties. We refer to [25, Chap. 2] for more details as well as historical accounts.

In Riemannian geometry (see [66]), the Levi-Civita connection is the unique connection which is metric compatible and torsion-free. In the Finsler case, however, one cannot always find a connection which satisfies both of these properties. Therefore, we need to modify (at least) one of the requirements, and so there arise a number of possible candidates. Here we discuss one of them, called the Chern connection, corresponding to $\Gamma^i_{jk}$ given in (4.8). Some other connections are also briefly mentioned in Example 4.15. An important fact is that, despite the non-uniqueness of connections, curvatures will be uniquely determined (see [25, Exercise 3.10.7] as well as the Riemannian characterization of curvatures in Theorem 5.12).

Let $\pi : TM \longrightarrow M$ be the natural projection and consider the pulled-back tangent bundle $\pi^*(TM) \longrightarrow TM \setminus 0$, which is a vector bundle such that the fiber over $v \in T_xM \setminus \{0\}$ is given by $T_v(T_xM) = T_xM$. An element of $\pi^*(TM)$ can be denoted by $(x, v; w)$, where $v \in T_xM \setminus \{0\}$ represents its base point (reference vector) and $w$ lives in the fiber $T_xM$ over $v$. A $C^\infty$-section $\xi : TM \setminus 0 \longrightarrow \pi^*(TM)$ can be regarded as a $C^\infty$-map $\xi : TM \setminus 0 \longrightarrow TM$ satisfying $\pi \circ \xi = \pi$. They form a $C^\infty(TM \setminus 0)$-module denoted by $\Gamma(\pi^*(TM))$. For example, if $X$ is a nowhere vanishing $C^\infty$-vector field on $M$, then $\xi_X := X \circ \pi$ is a $C^\infty$-section of $\pi^*(TM)$ and is called a basic section. Locally, basic sections generate the module $\Gamma(\pi^*(TM))$.

A connection on $\pi^*(TM)$ is an $\mathbb{R}$-bilinear map
\[
\nabla : \pi^*(TM) \times \Gamma\bigl(\pi^*(TM)\bigr) \longrightarrow \pi^*(TM), \qquad (X, \xi) \longmapsto \nabla_X \xi,
\]
which is a derivation in its second variable, i.e.,
\[
\nabla_X (f\xi) = f(v) \cdot \nabla_X \xi + df_v(X) \cdot \xi(v) \in T_xM
\]
for all $X \in T_v(T_xM) = T_xM$ over $v \in T_xM \setminus \{0\}$, $\xi \in \Gamma(\pi^*(TM))$ and $f \in C^\infty(TM \setminus 0)$. A connection $\nabla$ is determined by its connection 1-forms $\omega^i_j$ on $TM \setminus 0$ via the relations
\[
\nabla_X \xi_{\partial/\partial x^j} = \sum_{i=1}^n \bigl[ \omega^i_j(v) \bigr](X)\, \frac{\partial}{\partial x^i}\Big|_x, \qquad i, j = 1, 2, \ldots, n,
\]
where $X \in T_v(T_xM)$ over $v \in T_xM \setminus \{0\}$ and $\omega^i_j(v) \in T^*_v(TM \setminus 0)$ ($\xi_{\partial/\partial x^j}$ in the left-hand side is the basic section, i.e., $\xi_{\partial/\partial x^j}(v) = (\partial/\partial x^j)|_{\pi(v)}$).

Definition 4.12 (Chern Connection) Consider the functions $\Gamma^i_{jk}$ defined in (4.8). The Chern connection of $(M, F)$ is a connection $\nabla$ on $\pi^*(TM)$ whose connection 1-forms are given by
\[
\omega^i_j(v) := \sum_{k=1}^n \Gamma^i_{jk}(v)\, dx^k, \qquad v \in TM \setminus 0.
\]

The Chern connection is sometimes also called the Chern–Rund connection. Rund discovered this connection independently from Chern, and later it turned out that Rund’s connection coincides with the Chern connection; see [14]. The absence of the $dv^k$-terms in $\omega^i_j$ is essential in the torsion-freeness (that we have seen in Exercise 4.5). The metric compatibility needs to be modified as in the following characterization.

Theorem 4.13 (A Characterization of the Chern Connection) The Chern connection is the unique connection satisfying the following conditions.

(I) (Torsion-freeness) For any $1 \le i \le n$, we have
\[
\sum_{j=1}^n dx^j \wedge \omega^i_j = 0. \tag{4.14}
\]

(II) (Almost metric compatibility) For any $1 \le i, j \le n$, we have
\[
dg_{ij} - \sum_{k=1}^n \bigl( g_{kj}\, \omega^k_i + g_{ik}\, \omega^k_j \bigr) = \frac{2}{F} \sum_{k=1}^n A_{ijk}\, \delta v^k, \tag{4.15}
\]
where we set
\[
\delta v^k := dv^k + \sum_{l=1}^n N^k_l\, dx^l.
\]


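Proof For the connection 1-forms $\widetilde\omega^i_j$ of a given connection $\widetilde\nabla$, define $\widetilde\Gamma^i_{jk}$ and $\widetilde\Xi^i_{jk}$ by
\[
\widetilde\omega^i_j(v) = \sum_{k=1}^n \bigl\{ \widetilde\Gamma^i_{jk}(v)\, dx^k + \widetilde\Xi^i_{jk}(v)\, dv^k \bigr\}.
\]
Our goal is to show that (I) and (II) hold if and only if $\widetilde\Gamma^i_{jk} = \Gamma^i_{jk}$ and $\widetilde\Xi^i_{jk} = 0$ for all $i, j, k$.

First we assume (I) and (II). The torsion-freeness condition (4.14) can be rewritten as
\[
\sum_{j,k=1}^n \bigl\{ \widetilde\Gamma^i_{jk}\, dx^j \wedge dx^k + \widetilde\Xi^i_{jk}\, dx^j \wedge dv^k \bigr\} = 0,
\]
which holds if and only if $\widetilde\Gamma^i_{jk} = \widetilde\Gamma^i_{kj}$ and $\widetilde\Xi^i_{jk} = 0$ (thereby $\Gamma^i_{jk}$ satisfies (4.14)). By assuming $\widetilde\Gamma^i_{jk} = \widetilde\Gamma^i_{kj}$ and $\widetilde\Xi^i_{jk} = 0$, the almost metric compatibility condition (4.15) is divided into two parts:
\[
\sum_{k=1}^n \bigl( g_{kj} \widetilde\Gamma^k_{il} + g_{ik} \widetilde\Gamma^k_{jl} \bigr) = \frac{\partial g_{ij}}{\partial x^l} - \frac{2}{F} \sum_{k=1}^n A_{ijk} N^k_l, \qquad \frac{\partial g_{ij}}{\partial v^l} = \frac{2}{F} A_{ijl}, \tag{4.16}
\]
for all $i, j, l$. The latter condition is just the definition of the Cartan tensor. Taking into account the symmetry $\widetilde\Gamma^i_{jk} = \widetilde\Gamma^i_{kj}$, we deduce from the former condition that
\begin{align*}
2 \sum_{k=1}^n g_{kj} \widetilde\Gamma^k_{il}
&= \sum_{k=1}^n \bigl( g_{kj} \widetilde\Gamma^k_{il} + g_{ik} \widetilde\Gamma^k_{jl} \bigr) + \sum_{k=1}^n \bigl( g_{kj} \widetilde\Gamma^k_{il} + g_{lk} \widetilde\Gamma^k_{ij} \bigr) - \sum_{k=1}^n \bigl( g_{ik} \widetilde\Gamma^k_{jl} + g_{lk} \widetilde\Gamma^k_{ij} \bigr) \\
&= \Big\{ \frac{\partial g_{ij}}{\partial x^l} - \frac{2}{F} \sum_{k=1}^n A_{ijk} N^k_l \Big\} + \Big\{ \frac{\partial g_{jl}}{\partial x^i} - \frac{2}{F} \sum_{k=1}^n A_{jlk} N^k_i \Big\} - \Big\{ \frac{\partial g_{il}}{\partial x^j} - \frac{2}{F} \sum_{k=1}^n A_{ilk} N^k_j \Big\} \\
&= 2 \sum_{k=1}^n g_{kj} \Gamma^k_{il}
\end{align*}
for all $i, j, l$. Therefore, we have $\widetilde\Gamma^k_{il} = \Gamma^k_{il}$ (since the matrix $(g_{kj})$ is invertible).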

Conversely, we have already seen that the Chern connection satisfies (4.14), and the latter condition in (4.16) always holds. The former condition in (4.16) can be also readily shown (we did the same calculation in (4.12)). This completes the proof. $\square$

One can regard the 1-forms $\delta v^k = dv^k + \sum_{l=1}^n N^k_l\, dx^l$ as the natural dual to $\partial/\partial v^k$ on $TM \setminus 0$. Then the natural dual to $dx^i$ is given by
\[
\frac{\delta}{\delta x^i} := \frac{\partial}{\partial x^i} - \sum_{j=1}^n N^j_i \frac{\partial}{\partial v^j}.
\]
Note that $\delta v^k(\delta/\delta x^i) = 0$ indeed holds. We refer to [25, Sect. 2.3] for details and further discussions. Using $\delta/\delta x^i$, we obtain the concise expression:
\[
\Gamma^i_{jk} = \frac{1}{2} \sum_{l=1}^n g^{il} \Big\{ \frac{\delta g_{lk}}{\delta x^j} + \frac{\delta g_{jl}}{\delta x^k} - \frac{\delta g_{jk}}{\delta x^l} \Big\}.
\]

Exercise 4.14 In the change of local coordinates as in Lemma 4.3, show that
\[
\frac{\delta}{\delta x^i} = \sum_{a=1}^n \frac{\partial y^a}{\partial x^i} \frac{\delta}{\delta y^a}, \qquad \delta v^i = \sum_{a=1}^n \frac{\partial x^i}{\partial y^a}\, \delta u^a.
\]
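The concise expression for $\Gamma^i_{jk}$ in terms of $\delta/\delta x^i$ can be compared with (4.8) term by term. The sketch below assumes the usual formula $\gamma^i_{jk} = \frac12 \sum_l g^{il} ( \partial g_{lk}/\partial x^j + \partial g_{jl}/\partial x^k - \partial g_{jk}/\partial x^l )$ for the formal Christoffel symbols, which is defined outside this section, and uses the latter condition in (4.16).

```latex
% \delta/\delta x^j = \partial/\partial x^j - \sum_m N^m_j \partial/\partial v^m, and
% \partial g_{lk}/\partial v^m = (2/F) A_{lkm} by the latter condition in (4.16).
\begin{align*}
\frac{\delta g_{lk}}{\delta x^j}
&= \frac{\partial g_{lk}}{\partial x^j} - \frac{2}{F} \sum_{m=1}^n A_{lkm} N^m_j, \\
\frac{1}{2} \sum_{l=1}^n g^{il} \Big\{ \frac{\delta g_{lk}}{\delta x^j}
   + \frac{\delta g_{jl}}{\delta x^k} - \frac{\delta g_{jk}}{\delta x^l} \Big\}
&= \gamma^i_{jk} - \frac{1}{F} \sum_{l,m=1}^n g^{il}
   \bigl( A_{lkm} N^m_j + A_{jlm} N^m_k - A_{jkm} N^m_l \bigr).
\end{align*}
```

The right-hand side of the second line is exactly (4.8), so the correction terms in (4.8) are what one gets by replacing $\partial/\partial x$ with $\delta/\delta x$ in the formal Christoffel symbols.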

Example 4.15 (Some Other Connections) As we have mentioned, there are some other connections having their own merits. Here we present two of them. We will denote by $\omega^i_j$ the connection 1-forms associated with the Chern connection.

(a) The connection forms for the Berwald connection are given by
\[
\widehat\omega^i_j := \omega^i_j + \sum_{k,l=1}^n g^{il} \dot A_{ljk}\, dx^k,
\]
where
\[
\dot A_{ijk}(v) := \frac{1}{F(v)} \sum_{a=1}^n A_{ijk|a}(v)\, v^a,
\qquad
dA_{ijk} - \sum_{m=1}^n \bigl( A_{mjk}\, \omega^m_i + A_{imk}\, \omega^m_j + A_{ijm}\, \omega^m_k \bigr) = \sum_{a=1}^n \Big\{ A_{ijk|a}\, dx^a + A_{ijk;a} \frac{\delta v^a}{F} \Big\}
\]
(see [25, Exercises 2.5.4, 2.5.5]). The Berwald connection is torsion-free since there is no $dv^k$-term, and $\dot A_{ijk} = \dot A_{ikj}$.

(b) The Cartan connection is given by
\begin{align*}
\overline\omega^i_j &:= \omega^i_j + \frac{1}{F} \sum_{k,l=1}^n g^{il} A_{ljk}\, \delta v^k \\
&= \sum_{k=1}^n \Big\{ \Gamma^i_{jk} + \frac{1}{F} \sum_{l,m=1}^n g^{il} A_{ljm} N^m_k \Big\} dx^k + \frac{1}{F} \sum_{k,l=1}^n g^{il} A_{ljk}\, dv^k.
\end{align*}
This connection has a torsion due to the $dv^k$-terms, and is metric compatible in the sense that
\[
dg_{ij} - \sum_{k=1}^n \bigl( g_{kj}\, \overline\omega^k_i + g_{ik}\, \overline\omega^k_j \bigr) = 0
\]
(compare this with (4.15)).

We refer to [25, Sect. 2.4], [231, Sect. 7.2], and the references therein for more explanations on these and other connections.

Exercise 4.16 Prove the metric compatibility of the Cartan connection.

Chapter 5

Curvature

This chapter is devoted to the derivation of a natural notion of curvature via a Jacobi field, which is the variational vector field of a geodesic variation. This argument goes back to Ludwig Berwald’s important posthumous paper [35]. The appearance of a geodesic variation reminds us of Proposition 4.6, where we characterized covariant derivatives by using the Riemannian metric gV associated with a vector field V whose integral curves are geodesics. In fact, this viewpoint leads us to a useful and inspiring description of the Finsler curvature as the Riemannian curvature of gV . The metric gV is also called an osculating Riemannian metric, and its application to the Riemannian characterization of the Finsler curvature goes back to Ottó Varga [241, 242]; see also [16] and [220, Sect. III.5]. The argument in this chapter is partly indebted to [230, Chap. 6].

5.1 Jacobi Fields and the Curvature Tensor

Recall that, in Riemannian geometry, curvature is intimately linked to the behavior of geodesics. Such a tight connection can be seen from the Jacobi equation
\[
D_{\dot\eta} D_{\dot\eta} J + R(J, \dot\eta)\dot\eta = 0, \tag{5.1}
\]
where $\eta : [0, l] \longrightarrow M$ is a geodesic, $\sigma : [0, l] \times (-\varepsilon, \varepsilon) \longrightarrow M$ is a $C^\infty$-map such that $\sigma(t, 0) = \eta(t)$ and each $\sigma(\cdot, s)$ is a geodesic, $J(t) := \partial_s \sigma(t, 0)$ is the variational vector field along $\eta$, and $R$ is the curvature operator. A vector field along a geodesic satisfying (5.1) is called a Jacobi field and, conversely, every Jacobi field arises in this way as the variational vector field of some geodesic variation. Having the Finsler versions of geodesics and covariant derivatives in hand, we shall follow the same line in the Finsler setting.

© The Author(s), under exclusive license to Springer Nature Switzerland AG 2021 S.-i. Ohta, Comparison Finsler Geometry, Springer Monographs in Mathematics, https://doi.org/10.1007/978-3-030-80650-7_5


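On a Finsler manifold $(M, F)$, consider a nonconstant geodesic $\eta : [0, l] \longrightarrow M$ and a $C^\infty$-map $\sigma : [0, l] \times (-\varepsilon, \varepsilon) \longrightarrow M$ such that $\sigma(t, 0) = \eta(t)$ for $t \in [0, l]$ and that $\sigma(\cdot, s)$ is a geodesic for all $s \in (-\varepsilon, \varepsilon)$. Put
\[
J(t) := \partial_s \sigma(t, 0) \in T_{\eta(t)}M, \qquad t \in [0, l],
\]
and let us calculate $D^{\dot\eta}_{\dot\eta} D^{\dot\eta}_{\dot\eta} J$, which is a vector field along $\eta$. We deduce from (4.10) that
\[
D^{\dot\eta}_{\dot\eta} J = \sum_{i=1}^n \Big\{ \dot J^i + \sum_{j,k=1}^n \Gamma^i_{jk}(\dot\eta)\, J^j \dot\eta^k \Big\} \frac{\partial}{\partial x^i}\Big|_\eta = \sum_{i=1}^n \Big\{ \dot J^i + \sum_{j=1}^n N^i_j(\dot\eta)\, J^j \Big\} \frac{\partial}{\partial x^i}\Big|_\eta.
\]
Hence we find
\[
D^{\dot\eta}_{\dot\eta} D^{\dot\eta}_{\dot\eta} J = \sum_{i=1}^n \Big\{ \ddot J^i + \frac{d}{dt} \Big[ \sum_{j=1}^n N^i_j(\dot\eta)\, J^j \Big] + \sum_{k=1}^n N^i_k(\dot\eta) \Big( \dot J^k + \sum_{j=1}^n N^k_j(\dot\eta)\, J^j \Big) \Big\} \frac{\partial}{\partial x^i}\Big|_\eta. \tag{5.2}
\]
Now we apply the geodesic equation (3.15) to obtain
\begin{align*}
\ddot J^i &= \frac{d^2}{dt^2} \bigl[ \partial_s \sigma^i \bigr]\Big|_{s=0} = \frac{\partial}{\partial s} \bigl[ \partial_t^2 \sigma^i \bigr]\Big|_{s=0} = -\frac{\partial}{\partial s} \bigl[ G^i(\partial_t \sigma) \bigr]\Big|_{s=0} \\
&= -\sum_{j=1}^n \Big\{ \frac{\partial G^i}{\partial x^j}(\dot\eta)\, J^j + \frac{\partial G^i}{\partial v^j}(\dot\eta)\, \frac{\partial^2 \sigma^j}{\partial s \partial t}\Big|_{s=0} \Big\} \\
&= -\sum_{j=1}^n \Big\{ \frac{\partial G^i}{\partial x^j}(\dot\eta)\, J^j + 2 N^i_j(\dot\eta)\, \dot J^j \Big\}
\end{align*}
for all $i$, and
\begin{align*}
\frac{d}{dt} \bigl[ N^i_j(\dot\eta)\, J^j \bigr] &= \sum_{k=1}^n \Big\{ \frac{\partial N^i_j}{\partial x^k}(\dot\eta)\, \dot\eta^k + \frac{\partial N^i_j}{\partial v^k}(\dot\eta)\, \ddot\eta^k \Big\} J^j + N^i_j(\dot\eta)\, \dot J^j \\
&= \sum_{k=1}^n \Big\{ \frac{\partial N^i_j}{\partial x^k}(\dot\eta)\, \dot\eta^k - \frac{\partial N^i_j}{\partial v^k}(\dot\eta)\, G^k(\dot\eta) \Big\} J^j + N^i_j(\dot\eta)\, \dot J^j
\end{align*}
for all $i, j$. Substituting these into (5.2) yields
\[
D^{\dot\eta}_{\dot\eta} D^{\dot\eta}_{\dot\eta} J = \sum_{i,j=1}^n \Big\{ -\frac{\partial G^i}{\partial x^j}(\dot\eta) + \sum_{k=1}^n \Big[ \frac{\partial N^i_j}{\partial x^k}(\dot\eta)\, \dot\eta^k - \frac{\partial N^i_j}{\partial v^k}(\dot\eta)\, G^k(\dot\eta) \Big] + \sum_{k=1}^n N^i_k(\dot\eta) N^k_j(\dot\eta) \Big\} J^j \frac{\partial}{\partial x^i}\Big|_\eta.
\]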
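The substitution into (5.2) produces three contributions containing first derivatives of $J$. The following bookkeeping sketch collects them; it uses only the expressions for $\ddot J^i$ and $\frac{d}{dt}[N^i_j(\dot\eta) J^j]$ derived above.

```latex
\[
\underbrace{-2 \sum_{j=1}^n N^i_j(\dot\eta)\, \dot J^j}_{\text{from } \ddot J^i}
\;+\;
\underbrace{\sum_{j=1}^n N^i_j(\dot\eta)\, \dot J^j}_{\text{from } \frac{d}{dt}[N^i_j(\dot\eta) J^j]}
\;+\;
\underbrace{\sum_{k=1}^n N^i_k(\dot\eta)\, \dot J^k}_{\text{from the last sum in (5.2)}}
\;=\; 0 .
\]
```

Only terms that are linear in $J$ itself survive, which is what allows the result to be written as an endomorphism applied to $J$.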

Note that the terms involving $\dot J^k$ vanished. This is reasonable and necessary to write down the Jacobi equation since $J(t) = 0$ should imply $D^{\dot\eta}_{\dot\eta} D^{\dot\eta}_{\dot\eta} J(t) = 0$ in the equation. Thus we arrive at the following definition.

Definition 5.1 (Jacobi Fields and the Curvature Tensor) A $C^2$-vector field $J$ along a geodesic $\eta : [0, l] \longrightarrow M$ is called a Jacobi field if it satisfies the Jacobi equation
\[
D^{\dot\eta}_{\dot\eta} D^{\dot\eta}_{\dot\eta} J + R_{\dot\eta}(J) = 0 \tag{5.3}
\]
on $(0, l)$. Here the curvature tensor $R : T_xM \ni v \longmapsto R_v \in T^*_xM \otimes T_xM$ is defined by
\[
R_v(w) := \sum_{i,j=1}^n R^i_j(v)\, w^j\, \frac{\partial}{\partial x^i}\Big|_x \tag{5.4}
\]
for $v, w \in T_xM$, where
\[
R^i_j(v) := \frac{\partial G^i}{\partial x^j}(v) - \sum_{k=1}^n \Big\{ \frac{\partial N^i_j}{\partial x^k}(v)\, v^k - \frac{\partial N^i_j}{\partial v^k}(v)\, G^k(v) \Big\} - \sum_{k=1}^n N^i_k(v) N^k_j(v). \tag{5.5}
\]

Clearly $R_v(w)$ is linear in $w$, thereby $R_v$ is an endomorphism of $T_xM$. Observe that $R^i_j$ is positively 2-homogeneous and $R^i_j(0) = 0$.

Exercise 5.2 Prove that the curvature tensor $R$ is well-defined (independent of the choice of a local coordinate system) by computation, by using (5.3), or by giving a coordinate-free definition.

Exercise 5.3 Write down the Jacobi equation on a Minkowski normed space (by using Examples 3.14, 4.2).

We deduce the following fundamental fact from the ODE theory.

Lemma 5.4 Let $J_1, J_2$ be Jacobi fields along a common geodesic $\eta : [0, l] \longrightarrow M$. If they share the same initial conditions $J_1(0) = J_2(0)$ and $D^{\dot\eta}_{\dot\eta} J_1(0) = D^{\dot\eta}_{\dot\eta} J_2(0)$, then we have $J_1(t) = J_2(t)$ for all $t \in [0, l]$.

From the above construction, we obtain a characterization of Jacobi fields as the variational vector fields of geodesic variations.

Proposition 5.5 (A Characterization of Jacobi Fields) Let $\eta : [0, l] \longrightarrow M$ be a nonconstant geodesic. A $C^2$-vector field $J$ along $\eta$ is a Jacobi field if and only if there is a $C^\infty$-variation $\sigma : [0, l] \times (-\varepsilon, \varepsilon) \longrightarrow M$ such that $\sigma(\cdot, 0) = \eta$, $J = \partial_s \sigma(\cdot, 0)$, and that $\sigma(\cdot, s)$ is a geodesic for every $s \in (-\varepsilon, \varepsilon)$. In particular, all Jacobi fields are $C^\infty$.

Proof We have already shown the “if” part, hence it suffices to prove the “only if” part. Given a Jacobi field $J$ and small $\varepsilon > 0$, we take a (nonconstant) $C^\infty$-curve $\xi : (-\varepsilon, \varepsilon) \longrightarrow M$ with $\dot\xi(0) = J(0)$. Let $V_1, V_2$ be $C^\infty$-vector fields along $\xi$ such that
\[
V_1(0) = \dot\eta(0), \qquad V_2(0) = D^{\dot\eta}_{\dot\eta} J(0), \qquad D^{\dot\eta(0)}_{\dot\xi} V_1(0) = D^{\dot\eta(0)}_{\dot\xi} V_2(0) = 0.
\]
Then we define a variation $\sigma : [0, l] \times (-\varepsilon, \varepsilon) \longrightarrow M$ by
\[
\sigma(t, s) := \exp_{\xi(s)} \bigl( t \bigl[ V_1(s) + s V_2(s) \bigr] \bigr).
\]
Observe that $\sigma(t, 0) = \exp(t \dot\eta(0)) = \eta(t)$, and that each $\sigma(\cdot, s)$ is a geodesic. Moreover, if $\varepsilon$ is sufficiently small, then $V_1(s) + s V_2(s) \neq 0$ and hence $\sigma$ is $C^\infty$. Thus the variational vector field $X := \partial_s \sigma(\cdot, 0)$ is a Jacobi field along $\eta$. Since, by (4.11), $X(0) = \dot\xi(0) = J(0)$ and
\begin{align*}
D^{\dot\eta}_{\dot\eta} X(0) &= \sum_{i=1}^n \Big\{ \frac{\partial^2 \sigma^i}{\partial t \partial s}(0, 0) + \sum_{j,k=1}^n \Gamma^i_{jk}\bigl( \dot\eta(0) \bigr)\, \dot\eta^j(0)\, \dot\xi^k(0) \Big\} \frac{\partial}{\partial x^i}\Big|_{\eta(0)} \\
&= D^{\dot\eta(0)}_{\dot\xi} \bigl[ \partial_t \sigma(0, s) \bigr]\Big|_{s=0} = D^{\dot\eta(0)}_{\dot\xi} \bigl[ V_1(s) + s V_2(s) \bigr]\Big|_{s=0} \\
&= V_2(0) = D^{\dot\eta}_{\dot\eta} J(0),
\end{align*}
we find that $X$ coincides with $J$ thanks to Lemma 5.4. This completes the proof. $\square$

From the construction in the above proof, we also find that any initial data $v, w \in T_{\eta(0)}M$ admit a Jacobi field $J$ with $J(0) = v$ and $D^{\dot\eta}_{\dot\eta} J(0) = w$. Together with the linearity of (5.3) in $J$ and Lemma 5.4, Jacobi fields along a fixed geodesic form a $2n$-dimensional vector space.

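5.2 Properties of the Curvature Tensor

We list some basic properties of the curvature tensor $R$ defined in (5.4).

Lemma 5.6 (Properties of $R$)
(i) For all $v \in T_xM$, we have $R_v(v) = 0$.
(ii) For all $v \in T_xM \setminus \{0\}$ and $w \in T_xM$, we have $g_v(R_v(w), v) = 0$.

Proof (i) Let $v \neq 0$ without loss of generality. It follows from (5.5) and (4.5) that, for all $1 \le i \le n$,
\[
\sum_{j=1}^n R^i_j(v)\, v^j = \sum_{j=1}^n \frac{\partial G^i}{\partial x^j}(v)\, v^j - \sum_{k=1}^n \Big\{ \frac{\partial G^i}{\partial x^k}(v)\, v^k - N^i_k(v)\, G^k(v) \Big\} - \sum_{k=1}^n N^i_k(v)\, G^k(v) = 0.
\]
In the first equality, we used Theorem 2.8 to see
\[
\sum_{j=1}^n \frac{\partial N^i_j}{\partial x^k}(v)\, v^j = \frac{1}{2} \sum_{j=1}^n \frac{\partial^2 G^i}{\partial x^k \partial v^j}(v)\, v^j = \frac{\partial G^i}{\partial x^k}(v)
\]
and
\[
\sum_{j=1}^n \frac{\partial N^i_j}{\partial v^k}(v)\, v^j = \frac{1}{2} \sum_{j=1}^n \frac{\partial^2 G^i}{\partial v^k \partial v^j}(v)\, v^j = \frac{1}{2} \frac{\partial G^i}{\partial v^k}(v) = N^i_k(v).
\]

(ii) The claimed equation is equivalent to $\sum_{i,k=1}^n g_{ik}(v) R^i_j(v)\, v^k = 0$ for all $1 \le j \le n$. Let $\eta : (-\varepsilon, \varepsilon) \longrightarrow M$ be the geodesic with $\dot\eta(0) = v$. For simplicity, we consider a local coordinate system $(x^i)_{i=1}^n$ such that $g_{ij}(\dot\eta(t)) = \delta_{ij}$ and $\dot\eta(t) = F(v) \cdot (\partial/\partial x^1)|_{\eta(t)}$ for all $t \in (-\varepsilon, \varepsilon)$. Then we have, for all $1 \le k \le n$ and $t \in (-\varepsilon, \varepsilon)$,
\[
G^k\bigl( \dot\eta(t) \bigr) = -\ddot\eta^k(t) = 0. \tag{5.6}
\]
In particular, in the expression of $R^i_j(v)$ in (5.5), the third term vanishes. To calculate $\sum_{i=1}^n R^i_j(v)\, v^i$, observe that
\begin{align*}
\sum_{i=1}^n \frac{\partial G^i}{\partial x^j}(v)\, v^i &= \sum_{i,k,l=1}^n \frac{\partial \gamma^i_{kl}}{\partial x^j}(v)\, v^i v^k v^l \\
&= \sum_{i,k,l=1}^n \Big\{ \frac{1}{2} \frac{\partial^2 g_{il}}{\partial x^j \partial x^k}(v) - \sum_{m=1}^n \frac{\partial g_{im}}{\partial x^j}(v)\, \gamma^m_{kl}(v) \Big\} v^i v^k v^l \\
&= \frac{1}{2} \sum_{i,k,l=1}^n \frac{\partial^2 g_{il}}{\partial x^j \partial x^k}(v)\, v^i v^k v^l,
\end{align*}
where we used $g_{ij}(v) = \delta_{ij}$ in the second equality and (5.6) in the third equality. Next we deduce from (4.10) that
\[
\sum_{k=1}^n \Big[ \sum_{i=1}^n \frac{\partial N^i_j}{\partial x^k}(v)\, v^i \Big] v^k = \sum_{i,k,l=1}^n \frac{\partial \gamma^i_{jl}}{\partial x^k}(v)\, v^i v^l v^k = \frac{1}{2} \sum_{i,k,l=1}^n \frac{\partial^2 g_{il}}{\partial x^k \partial x^j}(v)\, v^i v^k v^l
\]
since, for any $1 \le m \le n$, (5.6) and $g_{ij}(\dot\eta(t)) = \delta_{ij}$ imply
\[
\sum_{k=1}^n \frac{\partial G^m}{\partial x^k}(v)\, v^k = 0, \qquad \sum_{k=1}^n \frac{\partial g_{im}}{\partial x^k}(v)\, v^k = 0. \tag{5.7}
\]
Now the remaining term is $\sum_{i,k=1}^n N^i_k(v) N^k_j(v)\, v^i$. It follows from
\[
\sum_{i=1}^n N^i_k(v)\, v^i = \sum_{i,l=1}^n \gamma^i_{kl}(v)\, v^i v^l = \frac{1}{2} \sum_{i,l=1}^n \frac{\partial g_{il}}{\partial x^k}(v)\, v^i v^l,
\]
(5.6) and (5.7) that
\[
\sum_{i=1}^n N^i_k(v)\, v^i = \frac{1}{2} \sum_{i,l=1}^n \Big\{ \frac{\partial g_{kl}}{\partial x^i}(v) + \frac{\partial g_{ik}}{\partial x^l}(v) \Big\} v^i v^l = 0
\]
for all $1 \le k \le n$. Therefore, $\sum_{i=1}^n R^i_j(v)\, v^i = 0$ for all $1 \le j \le n$, and this completes the proof of (ii). $\square$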
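For orientation, both properties in Lemma 5.6 reduce to familiar symmetries in the Riemannian case. The sketch below assumes that $(M, g)$ is Riemannian and that, comparing (5.1) with (5.3), the endomorphism $R_v$ corresponds to $w \mapsto R(w, v)v$; this identification is an assumption made for illustration.

```latex
% Assumption: Riemannian case, with R_v(w) = R(w, v)v as suggested by
% comparing the Jacobi equations (5.1) and (5.3).
\begin{align*}
R_v(v) &= R(v, v)v = 0
  && \text{since } R(X, Y) = -R(Y, X), \\
g\bigl( R_v(w), v \bigr) &= g\bigl( R(w, v)v, v \bigr) = 0
  && \text{since } g\bigl( R(X, Y)Z, Z \bigr) = 0 .
\end{align*}
```

In the Finsler setting no such operator identities are available a priori, which is why the proof above proceeds through the coordinate expression (5.5).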

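Exercise 5.7 Let $J$ be a Jacobi field along a geodesic $\eta : [0, 1] \longrightarrow M$. Show that
\[
g_{\dot\eta}\bigl( J(t), \dot\eta(t) \bigr) = g_{\dot\eta}\bigl( D^{\dot\eta}_{\dot\eta} J(0), \dot\eta(0) \bigr) \cdot t + g_{\dot\eta}\bigl( J(0), \dot\eta(0) \bigr).
\]
In particular, if both $J(0)$ and $D^{\dot\eta}_{\dot\eta} J(0)$ are perpendicular to $\dot\eta(0)$ in $g_{\dot\eta}$, then we have $g_{\dot\eta}(J(t), \dot\eta(t)) = 0$ for all $t$.

The following important symmetry property of $R_v$ (see [25, (3.4.7)]) is not apparent from its definition (5.5).

Lemma 5.8 (Symmetry of $R_v$) For any $v \in T_xM \setminus \{0\}$ and $w_1, w_2 \in T_xM$, we have
\[
g_v\bigl( w_1, R_v(w_2) \bigr) = g_v\bigl( R_v(w_1), w_2 \bigr).
\]

Proof In terms of local coordinates, the claim is written as
\[
\sum_{k=1}^n g_{ik}(v) R^k_j(v) = \sum_{k=1}^n g_{jk}(v) R^k_i(v)
\]
for all $1 \le i, j \le n$. One way of showing this is to reduce it to the Riemannian case by using a vector field $V$ as in Proposition 4.6 (or Theorem 5.12 below); see Exercise 5.13. Here we give a direct proof by computation.

Similarly to the proof of Lemma 5.6(ii), we choose a local coordinate system $(x^i)_{i=1}^n$ such that $g_{ij}(\dot\eta(t)) = \delta_{ij}$ and $\dot\eta(t) = F(v) \cdot (\partial/\partial x^1)|_{\eta(t)}$ for $t \in (-\varepsilon, \varepsilon)$, where $\eta : (-\varepsilon, \varepsilon) \longrightarrow M$ is the geodesic with $\dot\eta(0) = v$. We will omit the evaluations at $v$ for brevity. Then our goal is to show $R^i_j = R^j_i$. By (5.6), the first term of $R^i_j$ in (5.5) can be written as
\[
\frac{\partial G^i}{\partial x^j} = \sum_{k,l=1}^n \frac{\partial \gamma^i_{kl}}{\partial x^j}\, v^k v^l = \sum_{k,l=1}^n \Big\{ \frac{\partial^2 g_{il}}{\partial x^j \partial x^k} - \frac{1}{2} \frac{\partial^2 g_{kl}}{\partial x^j \partial x^i} \Big\} v^k v^l.
\]
Hence we have
\[
\frac{\partial G^i}{\partial x^j} - \frac{\partial G^j}{\partial x^i} = \sum_{k,l=1}^n \Big\{ \frac{\partial^2 g_{il}}{\partial x^j \partial x^k} - \frac{\partial^2 g_{jl}}{\partial x^i \partial x^k} \Big\} v^k v^l.
\]
From (4.10), (5.6), and (5.7), we obtain
\[
\sum_{k=1}^n \Big\{ \frac{\partial N^i_j}{\partial x^k} - \frac{\partial N^j_i}{\partial x^k} \Big\} v^k = \sum_{k=1}^n \frac{\partial}{\partial x^k} \Big[ \sum_{l=1}^n (\gamma^i_{jl} - \gamma^j_{il})\, v^l \Big] v^k = \sum_{k,l=1}^n \Big\{ \frac{\partial^2 g_{il}}{\partial x^k \partial x^j} - \frac{\partial^2 g_{jl}}{\partial x^k \partial x^i} \Big\} v^k v^l.
\]
Therefore, it suffices to show that $\sum_{k=1}^n N^i_k N^k_j$ is symmetric in $i$ and $j$. Again by (4.10), (5.6), and (5.7), we have
\begin{align*}
\sum_{k=1}^n N^i_k N^k_j &= \sum_{k=1}^n \Big\{ \sum_{l=1}^n \gamma^i_{kl}\, v^l \Big\} \Big\{ \sum_{m=1}^n \gamma^k_{jm}\, v^m \Big\} \\
&= \frac{1}{4} \sum_{k,l,m=1}^n \Big\{ \frac{\partial g_{il}}{\partial x^k} - \frac{\partial g_{kl}}{\partial x^i} \Big\} \Big\{ \frac{\partial g_{km}}{\partial x^j} - \frac{\partial g_{jm}}{\partial x^k} \Big\} v^l v^m \\
&= -\frac{1}{4} \sum_{k,l,m=1}^n \Big\{ \frac{\partial g_{kl}}{\partial x^i} - \frac{\partial g_{il}}{\partial x^k} \Big\} \Big\{ \frac{\partial g_{km}}{\partial x^j} - \frac{\partial g_{jm}}{\partial x^k} \Big\} v^l v^m.
\end{align*}
This is indeed symmetric in $i$ and $j$, thereby we conclude that $R^i_j(v) = R^j_i(v)$ for all $1 \le i, j \le n$ as desired. $\square$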


5.3 Flag and Ricci Curvatures and Their Characterizations

We define the flag curvature, which generalizes the sectional curvature in Riemannian geometry, by using the curvature tensor introduced in Definition 5.1.

Definition 5.9 (Flag Curvature) For linearly independent vectors $v, w \in T_xM$, we define the flag curvature of $(v, w)$ by
\[
K(v, w) := \frac{g_v(R_v(w), w)}{F^2(v)\, g_v(w, w) - g_v(v, w)^2}.
\]

Notice that $K(av, bw) = K(v, w)$ for all $a, b > 0$ by the positive homogeneity. Moreover, $K(v, w)$ is independent of the choice of $w$ in the 2-plane $v \wedge w \subset T_xM$ spanned by $v$ and $w$ as follows.

Exercise 5.10 Show that we have $K(v, w + av) = K(v, w)$ for all $a \in \mathbb{R}$ (one may use Lemma 5.6).

The 2-plane $v \wedge w$ is called a flag, and $v$ is called a flagpole in it. Thus it is more precise to denote the flag curvature by $K(v; v \wedge w)$, whereas we will write $K(v, w)$ for convenience. In the Riemannian case, the flag curvature coincides with the sectional curvature of the flag (regardless of the choice of a flagpole). In the Finsler case, however, the flag curvature depends not only on the flag, but also on the choice of a flagpole in it. For instance, $K(w, v) \neq K(v, w)$ may happen.

Definition 5.11 (Ricci Curvature) For $v \in T_xM$, we define the Ricci curvature $\mathrm{Ric}(v)$ of $v$ as the trace of the endomorphism $R_v : T_xM \longrightarrow T_xM$.

If $v \neq 0$, then $\mathrm{Ric}(v)$ can be written as
\[
\mathrm{Ric}(v) = \sum_{i=1}^{n-1} g_v\bigl( R_v(e_i), e_i \bigr) = F^2(v) \sum_{i=1}^{n-1} K(v, e_i),
\]
where $\{v/F(v)\} \cup \{e_i\}_{i=1}^{n-1} \subset T_xM$ is an orthonormal basis with respect to $g_v$. We remark that the trace of an endomorphism of a finite-dimensional vector space is defined independently of the choice of a basis. Observe that $\mathrm{Ric}$ is positively 2-homogeneous and $\mathrm{Ric}(0) = 0$.

It is naturally expected from the construction of the curvature tensor via Jacobi fields that these curvatures have close connections with the behavior of geodesics (see Chaps. 7, 8). Moreover, Proposition 4.6 leads to the following important and useful characterizations of the Finsler curvatures in terms of the Riemannian curvatures.

Theorem 5.12 (Riemannian Characterizations of Curvatures) Given a nonzero vector $v \in T_xM \setminus \{0\}$, take a nowhere vanishing $C^\infty$-vector field $V$ on a neighborhood $U$ of $x$ such that $V(x) = v$ and every integral curve of $V$ is a geodesic. Then, for any $w \in T_xM$ linearly independent of $v$, the flag curvature $K(v, w)$ coincides with the sectional curvature of the 2-plane $v \wedge w$ with respect to the Riemannian metric $g_V$. Similarly, the Finsler Ricci curvature $\mathrm{Ric}(v)$ coincides with the Riemannian Ricci curvature of $v$ with respect to $g_V$.

Proof Without loss of generality, we assume $g_v(w, w) = 1$. Let $\eta : (-\delta, \delta) \longrightarrow M$ be the geodesic with $\dot\eta(0) = v$ and observe that $V(\eta(t)) = \dot\eta(t)$ by the condition imposed on $V$. Take a $C^\infty$-variation $\sigma : (-\delta, \delta) \times (-\varepsilon, \varepsilon) \longrightarrow M$ such that $\sigma(\cdot, 0) = \eta$, $\partial_s \sigma(0, 0) = w$ and $\sigma(\cdot, s)$ is an integral curve of $V$ for each $s$. Then, by hypothesis, $\sigma(\cdot, s)$ is a geodesic for each $s$, and hence $J(t) := \partial_s \sigma(t, 0)$ is a Jacobi field along $\eta$ with $J(0) = w$. Thereby we deduce from the Jacobi equation (5.3) that
\[
K(v, w) = \frac{g_v(R_v(w), w)}{F^2(v) - g_v(v, w)^2} = -\frac{g_v\bigl( D^{\dot\eta}_{\dot\eta} D^{\dot\eta}_{\dot\eta} J(0), w \bigr)}{g_v(v, v) - g_v(v, w)^2}.
\]
Now we compare this observation with the Riemannian counterpart for $g_V$. Since $\sigma$ is also a geodesic variation with respect to $g_V$ as we saw in Sect. 4.1, $J$ is a Jacobi field also for $g_V$. Moreover, it follows from Proposition 4.6 that
\[
D^{\dot\eta}_{\dot\eta} D^{\dot\eta}_{\dot\eta} J(0) = D^{g_V}_{\dot\eta} D^{g_V}_{\dot\eta} J(0),
\]
where $D^{g_V}$ denotes the covariant derivative with respect to $g_V$. Therefore, we obtain the first assertion
\[
K(v, w) = -\frac{g_v\bigl( D^{g_V}_{\dot\eta} D^{g_V}_{\dot\eta} J(0), w \bigr)}{g_v(v, v) - g_v(v, w)^2} = K_{g_V}(v \wedge w),
\]
where $K_{g_V}$ denotes the sectional curvature with respect to $g_V$. The second assertion on the Ricci curvatures then immediately follows. $\square$
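The last equality in the proof of Theorem 5.12 uses the Riemannian Jacobi equation (5.1) for $g_V$. The following sketch makes that step explicit; the notation $R^{g_V}$ for the curvature operator of $g_V$ and the normalization of the sectional curvature shown in the comment are assumptions chosen to match the sign convention of (5.1).

```latex
% Assumptions: R^{g_V} is the curvature operator of g_V as in (5.1), and
% K_{g_V}(v \wedge w) = g_V(R^{g_V}(w,v)v, w) / (g_V(v,v) g_V(w,w) - g_V(v,w)^2).
% Recall g_V = g_v at x, and g_v(w,w) = 1 in the proof.
\begin{align*}
D^{g_V}_{\dot\eta} D^{g_V}_{\dot\eta} J(0) &= -R^{g_V}(w, v)v
  && \text{by (5.1) with } J(0) = w,\ \dot\eta(0) = v, \\
-\frac{g_v\bigl( D^{g_V}_{\dot\eta} D^{g_V}_{\dot\eta} J(0), w \bigr)}{g_v(v, v) - g_v(v, w)^2}
 &= \frac{g_v\bigl( R^{g_V}(w, v)v, w \bigr)}{g_v(v, v)\, g_v(w, w) - g_v(v, w)^2}
  = K_{g_V}(v \wedge w).
\end{align*}
```

Only the value of $D^{g_V}_{\dot\eta} D^{g_V}_{\dot\eta} J$ at $t = 0$ enters, so the comparison is pointwise at $x$.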

 

A discussion similar to the above proof could be used to give an alternative proof of Lemma 5.8.

Exercise 5.13 Give an alternative proof of Lemma 5.8 by using Proposition 4.6.

Remark 5.14 (Geodesic Fields) Recall (from Sect. 4.2) that a vector field $V$ as in Theorem 5.12 is sometimes called a geodesic field. Given $v \in T_xM \setminus \{0\}$, one can always find such a vector field $V$. For instance, let $\eta : [-\delta, \delta] \longrightarrow M$ be a (sufficiently short) geodesic with $\dot\eta(0) = v$, and consider geodesics $\xi : [0, 2\delta] \longrightarrow M$ emanating from $\eta(-\delta)$ of the same speed as $\eta$. Then $V(\xi(t)) := \dot\xi(t)$ is well-defined and $C^\infty$ on a small neighborhood of $x$, and satisfies the hypothesis of Theorem 5.12 ($F(v)^{-1} \cdot V$ is the gradient vector field of the distance function $d(\eta(-\delta), \cdot)$; see Sect. 11.2 below). Notice also that the choice of $V$ is not unique (in the above construction $V$ depends on $\delta$). Therefore, Theorem 5.12 in particular shows that $K_{g_V}(v \wedge w)$ is independent of the choice of $V$ satisfying the hypothesis.


Theorem 5.12 provides almost straightforward generalizations of some comparison geometric results on Riemannian manifolds to Finsler manifolds. This powerful method goes back to (at least) Auslander [16], where the Finsler analogues of the Bonnet–Myers and Cartan–Hadamard theorems were shown (see Sects. 8.1 and 8.2 for the precise statements and proofs). One can further generalize this idea to Finsler manifolds equipped with measures by employing the weighted Ricci curvature, which is the subject of Part II.

5.4 Further Properties of the Curvature Tensor

In this section we collect some further properties of the curvature tensor $R$ defined in (5.4) and (5.5). We refer to [25, Sects. 3.1–3.5] for further accounts. Our use of the results in this section will be limited, so readers may skip this section on a first reading.

First we rewrite (5.5) in terms of the functions $\Gamma^i_{kl}$ with the help of $G^i(v) = \sum_{k,l=1}^n \Gamma^i_{kl}(v) v^k v^l$, (4.10) and (4.5) as
\[
\begin{aligned}
R^i_j(v)
&= \sum_{k,l=1}^n \Big( \frac{\partial \Gamma^i_{kl}}{\partial x^j}(v) v^k - \frac{\partial \Gamma^i_{jl}}{\partial x^k}(v) v^k + \frac{\partial \Gamma^i_{jl}}{\partial v^k}(v) G^k(v) \Big) v^l \\
&\quad + \sum_{k=1}^n \Gamma^i_{jk}(v) G^k(v) - \sum_{k,l,m=1}^n \Gamma^i_{kl}(v) \Gamma^k_{jm}(v) v^l v^m \\
&= \sum_{k,l=1}^n \Big( \frac{\partial \Gamma^i_{kl}}{\partial x^j} - \frac{\partial \Gamma^i_{jl}}{\partial x^k} + \sum_{m=1}^n \Big\{ \frac{\partial \Gamma^i_{jl}}{\partial v^m} N^m_k + \Gamma^i_{jm} \Gamma^m_{kl} - \Gamma^i_{km} \Gamma^m_{jl} \Big\} \Big)(v)\, v^k v^l.
\end{aligned}
\tag{5.8}
\]
Note that the inside of the brackets becomes skew-symmetric in $j$ and $k$ (this is expected from the Riemannian experience) if we add $-(\partial \Gamma^i_{kl}/\partial v^m) \cdot N^m_j$. This is certainly possible since we deduce from (4.10) that, for all $1 \le i, m \le n$,
\[
\sum_{k,l=1}^n \frac{\partial \Gamma^i_{kl}}{\partial v^m}(v) v^k v^l
= \frac{\partial}{\partial v^m}\Big[ \sum_{k,l=1}^n \Gamma^i_{kl}(v) v^k v^l \Big] - 2 \sum_{k=1}^n \Gamma^i_{km}(v) v^k
= 2N^i_m(v) - 2N^i_m(v) = 0.
\tag{5.9}
\]

Thus we define (see [25, (3.3.2)])
\[
\begin{aligned}
R_l{}^i{}_{jk}(v)
:= {}& \Big( \frac{\partial \Gamma^i_{kl}}{\partial x^j} - \sum_{m=1}^n \frac{\partial \Gamma^i_{kl}}{\partial v^m} N^m_j \Big)(v)
- \Big( \frac{\partial \Gamma^i_{jl}}{\partial x^k} - \sum_{m=1}^n \frac{\partial \Gamma^i_{jl}}{\partial v^m} N^m_k \Big)(v) \\
& + \sum_{m=1}^n \big( \Gamma^i_{jm} \Gamma^m_{kl} - \Gamma^i_{km} \Gamma^m_{jl} \big)(v)
\end{aligned}
\tag{5.10}
\]
for $v \in TM \setminus 0$. Observe that $\sum_{k,l=1}^n R_l{}^i{}_{jk}(v) v^k v^l = R^i_j(v)$ and that $R_l{}^i{}_{jk}$ is positively 0-homogeneous. Next we summarize some symmetry and skew-symmetry properties of $R_l{}^i{}_{jk}$ (see (3.1.3), (3.2.4) and (3.4.4) in [25]).

Lemma 5.15 (Properties of $R_l{}^i{}_{jk}$) Let $v \in T_xM \setminus \{0\}$.
(i) We have the skew-symmetry in $j$ and $k$, namely $R_l{}^i{}_{jk}(v) = -R_l{}^i{}_{kj}(v)$ for all $1 \le i, j, k, l \le n$.
(ii) The first Bianchi identity
\[
R_l{}^i{}_{jk}(v) + R_j{}^i{}_{kl}(v) + R_k{}^i{}_{lj}(v) = 0
\]
holds for all $1 \le i, j, k, l \le n$.
(iii) We have an almost skew-symmetry in $i$ and $l$ in the sense that
\[
\sum_{l,m=1}^n \big\{ g_{im}(v) R_l{}^m{}_{jk}(v) + g_{lm}(v) R_i{}^m{}_{jk}(v) \big\} v^l = 0
\tag{5.11}
\]
for all $1 \le i, j, k \le n$.
(iv) For any $w \in T_xM$, we have the following symmetry:
\[
g_v\big( R_v(w), w \big)
= g_v\Big( \sum_{i,j,k,l=1}^n R_l{}^i{}_{jk}(v) w^j v^k v^l \frac{\partial}{\partial x^i}\Big|_x, w \Big)
= g_v\Big( \sum_{i,j,k,l=1}^n R_l{}^i{}_{jk}(v) v^j w^k w^l \frac{\partial}{\partial x^i}\Big|_x, v \Big).
\]

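Proof (i) is straightforward from the definition (5.10), and we leave (ii) as an exercise (Exercise 5.16).

(iii) This is not straightforward and requires a long calculation. We will omit the evaluations at $v$ for simplicity. First we deduce from (5.10) and (4.10) that
\[
\begin{aligned}
\sum_{l=1}^n R_l{}^m{}_{jk} v^l
&= \frac{\partial N^m_k}{\partial x^j} - \sum_{a=1}^n \Big( \frac{\partial N^m_k}{\partial v^a} - \Gamma^m_{ka} \Big) N^a_j
 - \frac{\partial N^m_j}{\partial x^k} + \sum_{a=1}^n \Big( \frac{\partial N^m_j}{\partial v^a} - \Gamma^m_{ja} \Big) N^a_k \\
&\quad + \sum_{a=1}^n \big( \Gamma^m_{ja} N^a_k - \Gamma^m_{ka} N^a_j \big) \\
&= \frac{\partial N^m_k}{\partial x^j} - \frac{\partial N^m_j}{\partial x^k}
 - \sum_{a=1}^n \frac{\partial N^m_k}{\partial v^a} N^a_j + \sum_{a=1}^n \frac{\partial N^m_j}{\partial v^a} N^a_k.
\end{aligned}
\]
Observe from (4.10) and (4.8) that
\[
\sum_{m=1}^n g_{im} N^m_k + \sum_{l,m=1}^n g_{lm} \Gamma^m_{ki} v^l = \sum_{l=1}^n \frac{\partial g_{il}}{\partial x^k} v^l.
\tag{5.12}
\]
Then we have
\[
\sum_{m=1}^n g_{im} \frac{\partial N^m_k}{\partial x^j} + \sum_{l,m=1}^n g_{lm} \frac{\partial \Gamma^m_{ki}}{\partial x^j} v^l
= \sum_{l=1}^n \frac{\partial^2 g_{il}}{\partial x^j \partial x^k} v^l
 - \sum_{m=1}^n \frac{\partial g_{im}}{\partial x^j} N^m_k
 - \sum_{l,m=1}^n \frac{\partial g_{lm}}{\partial x^j} \Gamma^m_{ki} v^l,
\]
where the first term in the right-hand side is symmetric in $j$ and $k$ so will be canceled. We similarly find
\[
\sum_{m=1}^n g_{im} \frac{\partial N^m_k}{\partial v^a} + \sum_{l,m=1}^n g_{lm} \frac{\partial \Gamma^m_{ki}}{\partial v^a} v^l
= \frac{\partial g_{ia}}{\partial x^k} - \sum_{m=1}^n \frac{\partial g_{im}}{\partial v^a} N^m_k - \sum_{m=1}^n g_{am} \Gamma^m_{ki}.
\]
Hence we have
\[
\begin{aligned}
\sum_{l,m=1}^n \big( g_{im} R_l{}^m{}_{jk} + g_{lm} R_i{}^m{}_{jk} \big) v^l
&= -\sum_{l,m=1}^n \Big( \frac{\partial g_{lm}}{\partial x^j} \Gamma^m_{ki} - \frac{\partial g_{lm}}{\partial x^k} \Gamma^m_{ji} \Big) v^l
 + \sum_{l,m=1}^n g_{lm} \big( N^l_j \Gamma^m_{ki} - N^l_k \Gamma^m_{ji} \big) \\
&\quad + \sum_{l,m,a=1}^n g_{lm} \big( \Gamma^m_{ja} \Gamma^a_{ki} - \Gamma^m_{ka} \Gamma^a_{ji} \big) v^l.
\end{aligned}
\]
Finally, it follows from (5.12) that the terms including $\Gamma^m_{ki}$ and $\Gamma^a_{ki}$ will vanish:
\[
-\sum_{l=1}^n \frac{\partial g_{lm}}{\partial x^j} v^l + \sum_{l=1}^n g_{lm} N^l_j + \sum_{l,a=1}^n g_{la} \Gamma^a_{jm} v^l = 0.
\]
The terms including $\Gamma^m_{ji}$ and $\Gamma^a_{ji}$ vanish in the same way, with $j$ replaced by $k$. Therefore, we obtain $\sum_{l,m=1}^n (g_{im} R_l{}^m{}_{jk} + g_{lm} R_i{}^m{}_{jk}) v^l = 0$ as desired.

(iv) This follows from (i) and (iii). The first equality is immediate from the definitions of $R_v$ and $R_l{}^i{}_{jk}$. Applying (i) and (iii), we have
\[
\begin{aligned}
g_v\Big( \sum_{i,j,k,l=1}^n R_l{}^i{}_{jk}(v) w^j v^k v^l \frac{\partial}{\partial x^i}\Big|_x, w \Big)
&= \sum_{i,j,k,l,m=1}^n g_{im}(v) R_l{}^i{}_{jk}(v) w^j v^k v^l w^m \\
&= -\sum_{i,j,k,l,m=1}^n g_{im}(v) R_l{}^i{}_{jk}(v) v^j w^k v^l w^m \\
&= \sum_{i,j,k,l,m=1}^n g_{im}(v) R_l{}^i{}_{jk}(v) v^j w^k w^l v^m \\
&= g_v\Big( \sum_{i,j,k,l=1}^n R_l{}^i{}_{jk}(v) v^j w^k w^l \frac{\partial}{\partial x^i}\Big|_x, v \Big).
\end{aligned}
\]
Thus we also obtain the second equality.

Exercise 5.16 Prove (ii) of the above lemma.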

Now we associate to the Chern connection (recall Definition 4.12) the following curvature 2-forms:
\[
\Omega_j{}^i := d\omega_j{}^i - \sum_{k=1}^n \omega_j{}^k \wedge \omega_k{}^i
= \frac{1}{2} \sum_{k,l=1}^n R_j{}^i{}_{kl}\, dx^k \wedge dx^l + \frac{1}{F} \sum_{k,l=1}^n P_j{}^i{}_{kl}\, dx^k \wedge \delta v^l,
\]
where we naturally impose $R_j{}^i{}_{lk} = -R_j{}^i{}_{kl}$ (see [25, Sect. 3.1], and recall Theorem 4.13 for the definition of $\delta v^l$). The components $R_j{}^i{}_{kl}$ of the horizontal curvature tensor coincide with $R_j{}^i{}_{kl}$ in (5.10) derived in a heuristic way. The components $P_j{}^i{}_{kl}$ of the mixed curvature tensor are given by
\[
P_j{}^i{}_{kl} = -F \frac{\partial \Gamma^i_{jk}}{\partial v^l},
\]
which vanish in the Riemannian case (see [25, (3.3.3)]). The exterior differential of $\Omega_j{}^i$,
\[
d\Omega_j{}^i = -\sum_{k=1}^n \big( \Omega_j{}^k \wedge \omega_k{}^i - \omega_j{}^k \wedge \Omega_k{}^i \big),
\]
corresponds to the second Bianchi identity (see [25, Sect. 3.5], [66]).

Exercise 5.17 Computing $\Omega_j{}^i$ with $\omega_j{}^i = \sum_{k=1}^n \Gamma^i_{jk}\, dx^k$ in Definition 4.12, show that $R_j{}^i{}_{kl}$ and $P_j{}^i{}_{kl}$ are indeed given as above.

Given a nowhere vanishing $C^1$-vector field $V$, the Chern curvature is defined by
\[
R^V(X, Y)Z := D^V_X [D^V_Y Z] - D^V_Y [D^V_X Z] - D^V_{[X,Y]} Z,
\tag{5.13}
\]

where $X, Y$ are $C^1$-vector fields and $Z$ is a $C^2$-vector field on $M$. When we begin with a nonzero tangent vector $v \in T_xM \setminus \{0\}$ and take a $C^1$-vector field $V$ such that $V(x) = v$, the Chern curvature $R^V(X, Y)Z$ at $x$ depends on the choice of $V$. If, in addition, we assume that all integral curves of $V$ are geodesic, then the corresponding curvature coincides with the flag curvature in the sense of the next proposition. Thus the Chern curvature (5.13) could be used to introduce the flag curvature (see, e.g., [251, Sect. 2]). We will not use this expression of the flag curvature in the sequel.

Proposition 5.18 Let $V$ be a nowhere vanishing $C^\infty$-vector field on an open set $U \subset M$ such that all of its integral curves are geodesic. Then, for any $C^2$-vector field $W$ on $U$, we have
\[
g_V\big( R^V(V, W)W, V \big) = g_V\big( R_V(W), W \big).
\]
In particular, if $v := V(x)$ and $w := W(x)$ are linearly independent at $x \in U$, then we have
\[
K(v, w) = \frac{g_V\big( R^V(V, W)W, V \big)(x)}{F^2(v) g_v(w, w) - g_v(v, w)^2}.
\]

Proof We give only an outline of the proof, and leave the detailed calculation to the readers (Exercise 5.19). We first observe that, for arbitrary vector fields $X, Y, Z$,
\[
\begin{aligned}
D^V_X [D^V_Y Z] - D^V_Y [D^V_X Z]
&= D^V_{[X,Y]} Z + \sum_{i,j,k,l=1}^n X^j Y^k Z^l R_l{}^i{}_{jk}(V) \frac{\partial}{\partial x^i} \\
&\quad + \sum_{i,j,k,l,m=1}^n X^j Y^k Z^l \Big\{ \frac{\partial \Gamma^i_{kl}}{\partial v^m}(V) \Big( \frac{\partial V^m}{\partial x^j} + N^m_j(V) \Big)
 - \frac{\partial \Gamma^i_{jl}}{\partial v^m}(V) \Big( \frac{\partial V^m}{\partial x^k} + N^m_k(V) \Big) \Big\} \frac{\partial}{\partial x^i}.
\end{aligned}
\]
Now let $X = V$ and $Y = Z = W$. We deduce from Lemma 5.15(iv) that
\[
g_V\Big( \sum_{i,j,k,l=1}^n V^j W^k W^l R_l{}^i{}_{jk}(V) \frac{\partial}{\partial x^i}, V \Big) = g_V\big( R_V(W), W \big).
\]
Moreover, by Theorem 2.8, we have
\[
g_V\Big( \sum_{i,j=1}^n \frac{\partial \Gamma^i_{jl}}{\partial v^m}(V) V^j \frac{\partial}{\partial x^i}, V \Big) = 0
\]
for all $l, m$. Combining these, we find
\[
g_V\big( R^V(V, W)W, V \big) = g_V\big( R_V(W), W \big)
+ g_V\Big( \sum_{i,k,l,m=1}^n W^k W^l \frac{\partial \Gamma^i_{kl}}{\partial v^m}(V) \Big[ \sum_{j=1}^n V^j \frac{\partial V^m}{\partial x^j} + G^m(V) \Big] \frac{\partial}{\partial x^i}, V \Big).
\]
It follows from the condition on $V$ that $D^V_V V = 0$, and hence the second term in the right-hand side vanishes. Therefore, we obtain $g_V(R^V(V, W)W, V) = g_V(R_V(W), W)$.

Exercise 5.19 Complete the proof of Proposition 5.18 by giving the omitted calculations (or give an alternative proof).

Chapter 6

Examples of Finsler Manifolds

This chapter is devoted to some fundamental and important examples of Finsler manifolds. In addition to Minkowski normed spaces and Randers spaces already mentioned in Example 2.12, we introduce Berwald spaces, Hilbert and Funk geometries, and Teichmüller spaces and discuss their characteristic properties. Further examples (including (α, β)-metrics as in Example 2.12(c)) can be found in, e.g., [227, Subsect. 2.1.2]. We will revisit some of these examples in Chap. 10 in the context of measured Finsler manifolds (i.e., Finsler manifolds equipped with measures).

6.1 Minkowski Normed Spaces

As mentioned in Example 2.12(a), a (smooth) Minkowski normed space $(\mathbb{R}^n, \|\cdot\|)$ in the sense of Definition 2.1 is regarded as a Finsler manifold by identifying each tangent space with $(\mathbb{R}^n, \|\cdot\|)$ in the canonical way.

Proposition 6.1 The flag curvature of a Minkowski normed space $(\mathbb{R}^n, \|\cdot\|)$ is identically 0.

Proof We give two simple proofs of this fundamental fact. First, let us consider a scaled space $(\mathbb{R}^n, \lambda\|\cdot\|)$ for $\lambda > 0$. Then the map $(\mathbb{R}^n, \|\cdot\|) \ni x \mapsto \lambda^{-1} x \in (\mathbb{R}^n, \lambda\|\cdot\|)$ is isometric (i.e., its derivative preserves the norm). Hence we have $K_{\|\cdot\|}(v, w) = K_{\lambda\|\cdot\|}(v, w)$ for all linearly independent vectors $v, w \in T_0\mathbb{R}^n$. However, by the definition of the flag curvature, we find $K_{\lambda\|\cdot\|}(v, w) = \lambda^{-2} K_{\|\cdot\|}(v, w)$ (see Exercise 6.2 below). Therefore $K_{\|\cdot\|}(v, w) = 0$ necessarily holds for all $v, w \in T_0\mathbb{R}^n$. By the homogeneity of the space (parallel translations are isometric), we obtain $K = 0$ on the whole of $\mathbb{R}^n$.

The second proof utilizes the Riemannian characterization of the flag curvature in Theorem 5.12. Given $v \in \mathbb{R}^n \setminus \{0\}$, take the constant vector field $V \equiv v$ on $\mathbb{R}^n$. Integral curves of $V$ are straight lines and hence geodesic. Then, since $(\mathbb{R}^n, g_V)$ is an inner product space, we find $K(v, w) = K^{g_V}(v \wedge w) = 0$.

Exercise 6.2 Let $(M, F)$ be a Finsler manifold, and consider another Finsler structure $F_\lambda(v) := \lambda F(v)$ on $M$ for $\lambda > 0$. Show that $K_{F_\lambda}(v, w) = \lambda^{-2} K_F(v, w)$ and $\mathrm{Ric}_{F_\lambda}(v) = \mathrm{Ric}_F(v)$. We remark that the latter relation can be rewritten in a more reasonable (homogeneous) form: $\mathrm{Ric}_{F_\lambda}(v)/F_\lambda^2(v) = \lambda^{-2}\, \mathrm{Ric}_F(v)/F^2(v)$.

© The Author(s), under exclusive license to Springer Nature Switzerland AG 2021 S.-i. Ohta, Comparison Finsler Geometry, Springer Monographs in Mathematics, https://doi.org/10.1007/978-3-030-80650-7_6

6.2 Finsler Manifolds of Constant Curvature

Proposition 6.1 above shows that Finsler manifolds of constant flag curvature exhibit more variety than in the Riemannian case. The classification of such spaces has stimulated intensive research. One of the first major contributions was due to Akbar-Zadeh [4]. For simplicity, we state his result only in the compact case. We refer to [25, Sect. 12.4] for more details.

Theorem 6.3 (Akbar-Zadeh's Theorem, Compact Case) Let $(M, F)$ be a compact Finsler manifold of constant flag curvature $\kappa$.
(i) If $\kappa = 0$, then $(M, F)$ is locally Minkowskian.
(ii) If $\kappa < 0$, then $(M, F)$ is Riemannian.

We say that $(M, F)$ is locally Minkowskian if every point $x \in M$ has a neighborhood $U$ such that $(U, F)$ is isometric to an open set of some $n$-dimensional Minkowski normed space. In contrast to Theorem 6.3(ii), we will see in Sect. 6.5 that there are rich families of reversible and non-reversible noncompact Finsler manifolds of constant negative flag curvature.

In the positively curved case, we have different phenomena for reversible and non-reversible metrics. On the one hand, Bryant [49] constructed a family of non-reversible Finsler metrics of constant positive flag curvature on the 2-sphere $S^2$. On the other hand, Kim and Min [137] showed that any reversible Finsler manifold of constant positive flag curvature is necessarily Riemannian. Some more rigidity results in the reversible case can be found in [227, Subsect. 7.2.2].

We refer to [230, Chap. 9] and [227, Chap. 5] for more examples and further discussions on Finsler manifolds of constant flag curvature. For example, there are nontrivial Finsler manifolds whose flag curvature is identically 0 (see [230, Example 9.2.3]). Some characterizations of Randers spaces having constant flag curvature can be found in [230, Sect. 9.3]. One can find a classification of projectively flat Finsler metrics with constant flag curvature in [154] (see [227, Chap. 6] for the definition and related studies of projectively flat Finsler metrics).

6.3 Berwald Spaces

Next we introduce Berwald spaces, named after Ludwig Berwald (who called them “affinely connected Finsler spaces”; see, e.g., [34]).

6.3.1 Isometry of Tangent Spaces and Its Applications

Berwald spaces are defined in terms of $\Gamma^i_{jk}$ as follows.

Definition 6.4 (Berwald Spaces) We say that a Finsler manifold $(M, F)$ is a Berwald space (or of Berwald type) if the connection coefficients $\Gamma^i_{jk}$ are constant on the (slit) tangent space $T_xM \setminus \{0\}$ for every $x \in M$.

The constancy of $\Gamma^i_{jk}$ in the vertical direction makes the covariant derivative independent of the choice of a reference vector (recall Definition 4.1). This is a rather strong condition that, however, can be helpful when one intends to generalize some Riemannian arguments (see, for instance, Sect. 8.4 on the Busemann nonpositive curvature and Sect. 17.3 on a splitting theorem). Riemannian manifolds as well as Minkowski normed spaces (where $\Gamma^i_{jk} = 0$) are Berwald spaces. Non-Riemannian, non-locally Minkowskian Berwald spaces can be constructed by taking perturbed Cartesian products of Berwald spaces (see [135, 237, 238] and Remark 6.6 below).

Berwald spaces enjoy several fine properties. The following is particularly important and characteristic (see [122], [25, Proposition 10.1.1]).

Proposition 6.5 (Isometry of Tangent Spaces) Let $(M, F)$ be a Berwald space. Then, for any $C^1$-curve $\eta : [0, 1] \to M$ with $\dot\eta \neq 0$, the parallel transport along $\eta$ gives a linear isometry between $(T_{\eta(0)}M, F)$ and $(T_{\eta(1)}M, F)$. In particular, all tangent spaces of $(M, F)$ are mutually linearly isometric.

Proof Let us denote by $D_{\dot\eta}$ the covariant derivative along $\eta$, which is independent of the choice of a reference vector. Then the parallel transport $P$ sends $v \in T_{\eta(0)}M$ to $P(v) := V(1) \in T_{\eta(1)}M$, where $V$ is the vector field along $\eta$ such that $V(0) = v$ and $D_{\dot\eta} V = 0$. Observe that $P$ is a linear map by the linearity of $D_{\dot\eta}$, and Lemma 4.8 yields
\[
\frac{d}{dt}\big[ g_V(V, V) \big] = 2 g_V(D_{\dot\eta} V, V) = 0.
\]
Hence $F(v) = \sqrt{g_v(v, v)} = \sqrt{g_{P(v)}(P(v), P(v))} = F(P(v))$ as desired.

 


We remark that, on a general Finsler manifold, we have
\[
\frac{d}{dt}\big[ g_V(V, V) \big] = 2 g_V(D^V_{\dot\eta} V, V),
\]
while $V$ satisfies $D^{\dot\eta}_{\dot\eta} V = 0$ (if we assume $D^V_{\dot\eta} V = 0$ instead, then the map $P$ is not necessarily linear). At this point, we used the Berwald condition to obtain $D^V_{\dot\eta} V = D^{\dot\eta}_{\dot\eta} V = 0$.

Remark 6.6 (Szabó's Classification) By Proposition 6.5, for each closed curve $\eta : [0, 1] \to M$ with $x = \eta(0) = \eta(1)$, the parallel transport along $\eta$ provides a linear isometric transformation of $(T_xM, F)$. This gives rise to a holonomy group and, unless the holonomy group is trivial (as for Minkowski normed spaces), $F|_{T_xM}$ possesses some symmetries. Then Szabó's classification asserts that a simply connected, complete Berwald space is necessarily one of the following four types:
(a) Riemannian manifold;
(b) Minkowski normed space;
(c) symmetric non-Riemannian Berwald space (of rank $\ge 2$); and
(d) perturbed Cartesian product of the above three types.
We refer to [237, 238] for the detailed classification and to [87, 135] for further investigations.

In [237], the following important metrizability theorem was also established (see also [25, Exercise 10.1.4]).

Theorem 6.7 (Szabó's Metrizability Theorem) For any Berwald space $(M, F)$, there exists a Riemannian metric whose Levi-Civita connection coincides with the Chern connection of $F$.

In the 2-dimensional case, Szabó's theory provides the following explicit classification (see [237] and [25, Theorem 10.6.2]).

Theorem 6.8 (Szabó's Rigidity) Let $(M, F)$ be a 2-dimensional Berwald space. Then the following dichotomy holds:
(i) If the flag curvature is identically 0, then $(M, F)$ is locally Minkowskian.
(ii) If the flag curvature is not identically 0, then $(M, F)$ is Riemannian.

The assertion (i) holds also in higher dimensions (see [25, Proposition 10.5.1]). Notice that its converse implication clearly holds, namely a locally Minkowskian Finsler manifold is a Berwald space whose flag curvature is identically 0. We finally mention a more recent rigidity result in [39, 135] (see also [170]).

Theorem 6.9 (Berwald Spaces of Nowhere Vanishing Curvature) Let $(M, F)$ be a complete Berwald space whose flag curvature is nowhere vanishing. Then $(M, F)$ is necessarily a Riemannian manifold.


We remark that, thanks to Proposition 6.5, any Berwald space has finite reversibility, and hence the forward completeness is equivalent to the backward completeness (recall Corollary 3.23).

6.3.2 T-Curvature

In order to give some characterizations of Berwald spaces, we introduce a quantity which measures the variation of the connection coefficients in reference vectors. It will turn out in Proposition 6.11 (in comparison with Proposition 6.5) that this quantity is related to the variation of tangent spaces. We will utilize this notion also in the estimates of the convexity and concavity of the distance function in Sect. 8.3.

Definition 6.10 (T-curvature) Define the T-curvature of $(M, F)$ by, for $x \in M$ and $v, w \in T_xM \setminus \{0\}$,
\[
T_v(w) := \sum_{i,j,k,l=1}^n g_{il}(v) \big\{ \Gamma^i_{jk}(w) - \Gamma^i_{jk}(v) \big\} w^j w^k v^l.
\]
We also set $T_v(w) := 0$ if $v = 0$ or $w = 0$.

By using a vector field $W$ with $W(x) = w$, we can rewrite $T_v(w)$ as
\[
T_v(w) = g_v\big( D^w_w W - D^v_w W, v \big).
\]
Note that the right-hand side is independent of the choice of the vector field $W$. This expression ensures that $T_v(w)$ is well-defined (independent of the choice of a local coordinate system). Clearly $T_v(w) = 0$ for all $v$ and $w$ on Berwald spaces.

6.3.3 Characterizations of Berwald Spaces

There are a number of characterizations of Berwald spaces among Finsler manifolds. We present two of them which are relevant to our interest. See [25, Sect. 10.2] and [230, Sect. 10.1] for further characterizations and related discussions.

Proposition 6.11 (Characterizations of Berwald Spaces) For a Finsler manifold $(M, F)$, the following assertions are equivalent:
(I) $(M, F)$ is a Berwald space.
(II) The T-curvature vanishes identically.
(III) The exponential map $\exp_x : D_x \to M$ is $C^2$ on a neighborhood of 0 for all $x \in M$.

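Moreover, if $(M, F)$ is a Berwald space, then the exponential map $\exp_x$ is in fact $C^\infty$ on a neighborhood of 0 for every $x \in M$.

Proof We prove only the equivalence between (I) and (II). By the very definition of Berwald spaces, (I) clearly implies (II). Conversely, if (II) holds, then we have
\[
\sum_{i,j,k,l=1}^n g_{il}(v) \big\{ \Gamma^i_{jk}(w) - \Gamma^i_{jk}(v) \big\} w^j w^k v^l
= \sum_{i,l=1}^n g_{il}(v) \Big\{ G^i(w) - \sum_{j,k=1}^n \Gamma^i_{jk}(v) w^j w^k \Big\} v^l = 0
\]
for all $v, w \in T_xM \setminus \{0\}$, $x \in M$. Fixing $v$ and differentiating the latter expression in $w$ in the directional (vertical) coordinates three times, we find
\[
\sum_{i,l=1}^n g_{il}(v) \frac{\partial^3 G^i}{\partial v^j \partial v^k \partial v^m}(w)\, v^l = 0
\]
for all $1 \le j, k, m \le n$. This means that $\partial^3 G^i/\partial v^j \partial v^k \partial v^m = 0$ (otherwise choosing $v^l = \partial^3 G^l/\partial v^j \partial v^k \partial v^m(w)$ gives a contradiction). Hence $\partial^2 G^i/\partial v^j \partial v^k$ is constant on $T_xM \setminus \{0\}$ for all $1 \le i, j, k \le n$ (this is in fact the classical definition by Berwald himself; see, e.g., [34]). Therefore, for each $1 \le i, j, k \le n$,
\[
\frac{\partial^2}{\partial v^j \partial v^k}\Big[ \sum_{a,b=1}^n \Gamma^i_{ab}(v) v^a v^b \Big]
= 2\Gamma^i_{jk}(v) + 2 \sum_{a=1}^n \Big\{ \frac{\partial \Gamma^i_{ja}}{\partial v^k}(v) + \frac{\partial \Gamma^i_{ak}}{\partial v^j}(v) \Big\} v^a
+ \sum_{a,b=1}^n \frac{\partial^2 \Gamma^i_{ab}}{\partial v^j \partial v^k}(v) v^a v^b
\tag{6.1}
\]
is constant on $T_xM \setminus \{0\}$. Now, by differentiating (5.9) in $v$, it generally holds that
\[
\sum_{a,b=1}^n \frac{\partial^2 \Gamma^i_{ab}}{\partial v^j \partial v^k}(v) v^a v^b
= -2 \sum_{a=1}^n \frac{\partial \Gamma^i_{ja}}{\partial v^k}(v) v^a
= -2 \sum_{a=1}^n \frac{\partial \Gamma^i_{ak}}{\partial v^j}(v) v^a.
\]
Substituting this into (6.1) yields
\[
\frac{\partial^2}{\partial v^j \partial v^k}\Big[ \sum_{a,b=1}^n \Gamma^i_{ab}(v) v^a v^b \Big]
= 2\Gamma^i_{jk}(v) + 2 \sum_{a=1}^n \frac{\partial \Gamma^i_{ja}}{\partial v^k}(v) v^a.
\tag{6.2}
\]
By differentiating the right-hand side in $v^m$ (and recalling that it is constant), we find
\[
\frac{\partial \Gamma^i_{jk}}{\partial v^m}(v) + \frac{\partial \Gamma^i_{jm}}{\partial v^k}(v) + \sum_{a=1}^n \frac{\partial^2 \Gamma^i_{ja}}{\partial v^m \partial v^k}(v) v^a = 0.
\tag{6.3}
\]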

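We shall see by contracting this equation in $i$ that the second term in the right-hand side of (6.2) vanishes. By putting $\Gamma_{ljk}(v) := \sum_{i=1}^n g_{li}(v) \Gamma^i_{jk}(v)$ and omitting the evaluations at $v$ for brevity, first observe from Theorem 2.8, (4.8), and (4.5) that
\[
\begin{aligned}
\sum_{i,l=1}^n v^l g_{li} \frac{\partial \Gamma^i_{jk}}{\partial v^m}
&= \sum_{l=1}^n v^l \frac{\partial \Gamma_{ljk}}{\partial v^m} \\
&= -\frac{1}{2} \sum_{l=1}^n v^l \frac{\partial^2 g_{jk}}{\partial v^m \partial x^l}
 - \frac{1}{2} \sum_{l,p=1}^n v^l \Big( \frac{\partial^2 g_{lk}}{\partial v^m \partial v^p} N^p_j + \frac{\partial^2 g_{jl}}{\partial v^m \partial v^p} N^p_k \Big) \\
&\quad + \frac{1}{2} \sum_{p=1}^n \frac{\partial^2 g_{jk}}{\partial v^m \partial v^p} G^p
 + \frac{1}{2} \sum_{l,p=1}^n v^l \frac{\partial g_{jk}}{\partial v^p} \frac{\partial N^p_l}{\partial v^m} \\
&= -\frac{1}{2} \sum_{l=1}^n v^l \frac{\partial^2 g_{jk}}{\partial v^m \partial x^l}
 + \frac{1}{F} \sum_{p=1}^n \big( A_{kmp} N^p_j + A_{jmp} N^p_k + A_{jkp} N^p_m \big)
 + \frac{1}{2} \sum_{p=1}^n \frac{\partial^2 g_{jk}}{\partial v^m \partial v^p} G^p.
\end{aligned}
\tag{6.4}
\]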

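Note that this is symmetric in $j$, $k$, and $m$. Next, as for the last term in (6.3), we deduce from (4.10) that
\[
\begin{aligned}
\sum_{i,l,a=1}^n v^l g_{li} \frac{\partial^2 \Gamma^i_{ja}}{\partial v^m \partial v^k} v^a
&= \sum_{l,a=1}^n v^l \frac{\partial^2 \Gamma_{lja}}{\partial v^m \partial v^k} v^a
 - \sum_{i,l,a=1}^n v^l \frac{\partial^2 g_{li}}{\partial v^m \partial v^k} \Gamma^i_{ja} v^a \\
&= \sum_{l,a=1}^n v^l \frac{\partial^2 \Gamma_{lja}}{\partial v^m \partial v^k} v^a + \sum_{i=1}^n \frac{\partial g_{km}}{\partial v^i} N^i_j.
\end{aligned}
\]
This vanishes since the first term in the right-hand side is calculated by using some symmetries in $l$ and $a$ (and by repeatedly applying Theorem 2.8) as
\[
\begin{aligned}
\sum_{l,a=1}^n \frac{\partial^2 \Gamma_{lja}}{\partial v^m \partial v^k} v^l v^a
&= \frac{1}{2} \sum_{l,a=1}^n v^l v^a \frac{\partial^3 g_{la}}{\partial v^m \partial v^k \partial x^j}
 - \frac{1}{2} \sum_{i,l,a=1}^n v^l v^a \frac{\partial^2}{\partial v^m \partial v^k}\Big[ \frac{\partial g_{la}}{\partial v^i} N^i_j \Big] \\
&= -\frac{1}{2} \sum_{i,l,a=1}^n \frac{\partial^3 g_{la}}{\partial v^m \partial v^k \partial v^i} N^i_j v^l v^a
 = -\sum_{i=1}^n \frac{\partial g_{km}}{\partial v^i} N^i_j.
\end{aligned}
\]
Hence we find, from (6.3) and (6.4),
\[
\sum_{l=1}^n v^l \frac{\partial^2 g_{jk}}{\partial v^m \partial x^l}
- \frac{2}{F} \sum_{p=1}^n \big( A_{kmp} N^p_j + A_{jmp} N^p_k + A_{jkp} N^p_m \big)
- \sum_{p=1}^n \frac{\partial^2 g_{jk}}{\partial v^m \partial v^p} G^p = 0
\]
for all $1 \le j, k, m \le n$. This yields, together with (4.8), (4.5), and (4.10),
\[
\begin{aligned}
2 \sum_{a=1}^n \frac{\partial \Gamma^i_{ja}}{\partial v^k} v^a
&= \sum_{l,a=1}^n g^{il} v^a \Big\{ \frac{\partial^2 g_{jl}}{\partial v^k \partial x^a}
 - \sum_{m=1}^n \Big( \frac{\partial}{\partial v^k}\Big[ \frac{\partial g_{jl}}{\partial v^m} N^m_a \Big] + \frac{\partial^2 g_{la}}{\partial v^k \partial v^m} N^m_j - \frac{\partial^2 g_{ja}}{\partial v^k \partial v^m} N^m_l \Big) \Big\} \\
&\quad - 2 \sum_{l,m,a=1}^n g^{il} \frac{\partial g_{lm}}{\partial v^k} \Gamma^m_{ja} v^a \\
&= \sum_{l=1}^n g^{il} \Big\{ \sum_{a=1}^n \frac{\partial^2 g_{jl}}{\partial v^k \partial x^a} v^a
 - \sum_{m=1}^n \Big( \frac{\partial^2 g_{jl}}{\partial v^k \partial v^m} G^m + \frac{\partial g_{jl}}{\partial v^m} N^m_k - \frac{\partial g_{lk}}{\partial v^m} N^m_j + \frac{\partial g_{jk}}{\partial v^m} N^m_l \Big) \Big\} \\
&\quad - 2 \sum_{l,m=1}^n g^{il} \frac{\partial g_{lm}}{\partial v^k} N^m_j \\
&= \sum_{l=1}^n g^{il} \Big\{ \sum_{a=1}^n \frac{\partial^2 g_{jl}}{\partial v^k \partial x^a} v^a
 - \sum_{m=1}^n \Big( \frac{\partial^2 g_{jl}}{\partial v^k \partial v^m} G^m + \frac{\partial g_{jl}}{\partial v^m} N^m_k + \frac{\partial g_{lk}}{\partial v^m} N^m_j + \frac{\partial g_{jk}}{\partial v^m} N^m_l \Big) \Big\} \\
&= 0
\end{aligned}
\]
as desired. Therefore, recalling (6.2) and (6.1), we see that
\[
\frac{\partial^2}{\partial v^j \partial v^k}\Big[ \sum_{a,b=1}^n \Gamma^i_{ab}(v) v^a v^b \Big] = 2\Gamma^i_{jk}(v)
\]
is constant on $T_xM \setminus \{0\}$. Thus $(M, F)$ is a Berwald space.

The equivalence between (I) and (III) was established by Akbar-Zadeh [4]. Detailed guidance for the proof, along with the $C^\infty$-smoothness of $\exp_x$ around 0, can be found in [25, Exercise 5.3.5] (see also [228, p. 315]). We omit the proof and leave it as an exercise.

Exercise 6.12 ((I) ⇔ (III) in Proposition 6.11) Prove the equivalence between (I) and (III) in Proposition 6.11 (along [25, Exercise 5.3.5]).

The calculation in the above proof of the implication (II) ⇒ (I) is related to the fact that the Berwald connection coincides with the Chern connection on Berwald spaces (recall Example 4.15). We refer to [25, Sect. 10.2] for more details.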
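The proof of (II) ⇒ (I) repeatedly uses consequences of Euler's theorem (Theorem 2.8), such as the 0-homogeneity of $g_{ij}$ and the identity $\sum_l v^l\, \partial g_{il}/\partial v^m = 0$. The following is only a numerical illustration of these two facts, not part of the argument. It assumes a sample Randers norm $F(v) = |v| + b(v)$ on $\mathbb{R}^3$ with Euclidean $\alpha$ and a constant 1-form $b$, and it uses the closed formula for $g_{ij}$ of Lemma 6.13. The data `B`, the test vector, and all function names are hypothetical choices made for this sketch.

```python
import math

# Assumed sample data (not from the text): Euclidean alpha on R^3 and a
# constant 1-form b with |b| < 1, so that F(v) = |v| + b(v) is a Randers norm.
B = (0.3, -0.2, 0.1)


def randers_norm(v):
    """F(v) = |v| + b(v) for the sample data above."""
    return math.sqrt(sum(c * c for c in v)) + sum(bi * c for bi, c in zip(B, v))


def fundamental_tensor(v):
    """g_ij(v) from the closed formula of Lemma 6.13 with a_ij = delta_ij."""
    n = len(v)
    norm = math.sqrt(sum(c * c for c in v))
    big_f = randers_norm(v)
    unit = [c / norm for c in v]
    ell = [unit[i] + B[i] for i in range(n)]  # components of dF at v
    return [
        [
            (big_f / norm) * ((1.0 if i == j else 0.0) - unit[i] * unit[j])
            + ell[i] * ell[j]
            for j in range(n)
        ]
        for i in range(n)
    ]


def euler_defect(v, h=1e-5):
    """Largest |sum_l v^l * dg_il/dv^m| over i and m (central differences)."""
    n = len(v)
    worst = 0.0
    for m in range(n):
        plus = list(v)
        minus = list(v)
        plus[m] += h
        minus[m] -= h
        g_plus = fundamental_tensor(plus)
        g_minus = fundamental_tensor(minus)
        for i in range(n):
            s = sum(v[l] * (g_plus[i][l] - g_minus[i][l]) / (2.0 * h) for l in range(n))
            worst = max(worst, abs(s))
    return worst


v = [0.7, -1.1, 0.4]
g = fundamental_tensor(v)
g_scaled = fundamental_tensor([2.5 * c for c in v])
# g_ij(c v) = g_ij(v) for c > 0 (positive 0-homogeneity).
homogeneity_defect = max(abs(g[i][j] - g_scaled[i][j]) for i in range(3) for j in range(3))
print("0-homogeneity defect:", homogeneity_defect)
print("Euler defect:", euler_defect(v))
```

Both printed quantities should be zero up to rounding and finite-difference error; the step size `h` is a tuning choice for this sketch.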


6.4 Randers Spaces

Recall from Example 2.12(b) that a Randers space is a special kind of Finsler manifold $(M, F)$ whose Finsler structure is given by
\[
F(v) = \sqrt{\alpha(v, v)} + \beta(v),
\]
where $\alpha$ is a Riemannian metric and $\beta$ is a 1-form on $M$ such that $\beta(v)^2 < \alpha(v, v)$ for all $v \in TM \setminus 0$. Randers spaces are important in applications and amenable to concrete calculations. In this section, we shall calculate the geodesic spray coefficients $G^i$ for later use (in Sect. 10.3). See [15] and [25, Chap. 11] for more discussions on Randers spaces.

In a local coordinate system $(x^i)_{i=1}^n$ on an open set $U \subset M$, we represent $\alpha$ and $\beta$ as
\[
\alpha(v, v) = \sum_{i,j=1}^n a_{ij}(x) v^i v^j, \qquad \beta(v) = \sum_{i=1}^n b_i(x) v^i
\]
for $v \in T_xU$. We denote by $\widetilde\nabla$ the Levi-Civita connection of $(M, \alpha)$ and by $\widetilde\Gamma^i_{jk}$ its Christoffel symbols. Define a function $b_{jk}$ on $U$ by
\[
\sum_{j=1}^n b_{jk}(x)\, dx^j = \widetilde\nabla_{\partial/\partial x^k} \beta
= \sum_{j=1}^n \Big\{ \frac{\partial b_j}{\partial x^k}(x) - \sum_{i=1}^n b_i(x) \widetilde\Gamma^i_{jk}(x) \Big\} dx^j.
\tag{6.5}
\]

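First we calculate the fundamental tensor of $(M, F)$ (see [25, (11.1.3), (11.1.4)]).

Lemma 6.13 ($g_{ij}$ of Randers Spaces) For $v \in T_xU \setminus \{0\}$, we have
\[
\begin{aligned}
g_{ij}(v) = {}& \frac{F(v)}{\|v\|_\alpha} \Big\{ a_{ij}(x) - \frac{1}{\|v\|_\alpha^2} \Big( \sum_{m=1}^n a_{im}(x) v^m \Big) \Big( \sum_{m=1}^n a_{jm}(x) v^m \Big) \Big\} \\
& + \Big\{ \frac{1}{\|v\|_\alpha} \sum_{m=1}^n a_{im}(x) v^m + b_i(x) \Big\} \Big\{ \frac{1}{\|v\|_\alpha} \sum_{m=1}^n a_{jm}(x) v^m + b_j(x) \Big\},
\end{aligned}
\]
where we set $\|v\|_\alpha := \sqrt{\alpha(v, v)}$.

Proof We will abbreviate $\|v\|_\alpha$ as $\|v\|$ in the proofs in this section. For $v \in T_xU \setminus \{0\}$, we have
\[
\begin{aligned}
g_{ij}(v) &= \frac{1}{2} \frac{\partial^2}{\partial v^i \partial v^j} \big[ \|v\|^2 + 2\|v\| \beta(v) + \beta(v)^2 \big] \\
&= a_{ij}(x) + \frac{1}{\|v\|} \Big\{ \sum_{m=1}^n a_{im}(x) v^m\, b_j(x) + \sum_{m=1}^n a_{jm}(x) v^m\, b_i(x) \Big\} \\
&\quad + \Big\{ \frac{a_{ij}(x)}{\|v\|} - \frac{1}{\|v\|^3} \Big( \sum_{m=1}^n a_{im}(x) v^m \Big) \Big( \sum_{m=1}^n a_{jm}(x) v^m \Big) \Big\} \beta(v) + b_i(x) b_j(x).
\end{aligned}
\]
A rearrangement, using $F(v) = \|v\| + \beta(v)$, yields the claim.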

 

To calculate $G^i$, we would like to know the inverse matrix $(g^{ij})$ of $(g_{ij})$. It is given by an application of the following simple fact (see [25, Proposition 11.2.1]; we remark that the symmetry of $Q$ is necessary to write down the inverse): For an invertible, symmetric, $n \times n$ complex matrix $(Q_{ij})$ and a complex vector $(z_i) \in \mathbb{C}^n$ such that $1 + \sum_{i=1}^n z^i z_i \neq 0$ with $(Q^{ij}) := (Q_{ij})^{-1}$ and $z^i := \sum_{j=1}^n Q^{ij} z_j$, we have
\[
(Q_{ij} + z_i z_j)^{-1} = \Big( Q^{ij} - \frac{z^i z^j}{1 + \sum_{i=1}^n z^i z_i} \Big).
\tag{6.6}
\]
(We consider complex matrices in order to include both $(Q_{ij} + z_i z_j)$ and $(Q_{ij} - z_i z_j)$.) We set
\[
(a^{ij}) := (a_{ij})^{-1}, \qquad b^i := \sum_{j=1}^n a^{ij} b_j, \qquad \|\beta\|_\alpha := \sqrt{\sum_{i=1}^n b^i b_i}.
\]
Note that $\|\beta\|_\alpha$ is the dual norm of $\beta$ and $|\beta(v)| \le \|\beta\|_\alpha \|v\|_\alpha$ holds and that the condition $\beta(v)^2 < \|v\|_\alpha^2$ is equivalent to $\|\beta\|_\alpha < 1$. The next lemma corresponds to [25, (11.2.2)].

Lemma 6.14 ($g^{ij}$ of Randers Spaces) For $v \in T_xU \setminus \{0\}$, we have
\[
g^{ij}(v) = \frac{\|v\|_\alpha}{F(v)} a^{ij}(x) + \frac{\beta(v) + \|\beta\|_\alpha^2(x) \|v\|_\alpha}{F^3(v)} v^i v^j - \frac{\|v\|_\alpha}{F^2(v)} \big( b^i(x) v^j + b^j(x) v^i \big).
\]

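Proof In order to apply (6.6), we slightly rearrange the expression of $g_{ij}(v)$ given in Lemma 6.13 (omitting the evaluations of $a_{ij}$ and $b_i$ at $x$) as follows:
\[
g_{ij}(v) = \frac{F(v)}{\|v\|} a_{ij}
+ \Big\{ \frac{1}{\|v\|} \sum_{m=1}^n a_{im} v^m + b_i \Big\} \Big\{ \frac{1}{\|v\|} \sum_{m=1}^n a_{jm} v^m + b_j \Big\}
- \frac{F(v)}{\|v\|^3} \Big( \sum_{m=1}^n a_{im} v^m \Big) \Big( \sum_{m=1}^n a_{jm} v^m \Big).
\tag{6.7}
\]
We first focus on the first two terms in the right-hand side of (6.7). Since
\[
\begin{aligned}
\sum_{i,j=1}^n a^{ij} \Big\{ \frac{1}{\|v\|} \sum_{m=1}^n a_{im} v^m + b_i \Big\} \Big\{ \frac{1}{\|v\|} \sum_{m=1}^n a_{jm} v^m + b_j \Big\}
&= \sum_{j=1}^n \Big\{ \frac{v^j}{\|v\|} + b^j \Big\} \Big\{ \frac{1}{\|v\|} \sum_{m=1}^n a_{jm} v^m + b_j \Big\} \\
&= 1 + 2 \frac{\beta(v)}{\|v\|} + \|\beta\|^2 \ge 1 - 2\|\beta\| + \|\beta\|^2 > 0,
\end{aligned}
\]
we deduce from (6.6) that the inverse matrix of the first two terms in (6.7) is given by
\[
\frac{\|v\|}{F(v)} a^{ij} - \Big\{ 1 + \frac{\|v\|}{F(v)} \Big( 1 + 2 \frac{\beta(v)}{\|v\|} + \|\beta\|^2 \Big) \Big\}^{-1} \frac{\|v\|^2}{F^2(v)} \Big( \frac{v^i}{\|v\|} + b^i \Big) \Big( \frac{v^j}{\|v\|} + b^j \Big).
\]
We denote this by $Q^{ij}$ and set
\[
z_i := \frac{\sqrt{F(v)}}{\|v\|^{3/2}} \sum_{m=1}^n a_{im} v^m, \qquad z^i := \sum_{j=1}^n Q^{ij} z_j.
\]
Then
\[
z^i = \frac{v^i}{\sqrt{F(v)} \sqrt{\|v\|}} - \Big\{ 1 + \frac{\|v\|}{F(v)} \Big( 1 + 2 \frac{\beta(v)}{\|v\|} + \|\beta\|^2 \Big) \Big\}^{-1} \frac{\sqrt{\|v\|}}{\sqrt{F(v)}} \Big( \frac{v^i}{\|v\|} + b^i \Big).
\]
In order to apply (6.6) once again, observe that
\[
1 - \sum_{i=1}^n z^i z_i = \frac{F(v)}{\|v\|} \Big\{ 1 + \frac{\|v\|}{F(v)} \Big( 1 + 2 \frac{\beta(v)}{\|v\|} + \|\beta\|^2 \Big) \Big\}^{-1} > 0.
\]
Therefore we obtain
\[
\begin{aligned}
g^{ij}(v) &= Q^{ij} + \frac{z^i z^j}{1 - \sum_{i=1}^n z^i z_i} \\
&= Q^{ij} + \frac{v^i v^j}{F(v) \|v\| (1 - \sum_{i=1}^n z^i z_i)}
 - \frac{\|v\|}{F^2(v)} v^i \Big( \frac{v^j}{\|v\|} + b^j \Big) - \frac{\|v\|}{F^2(v)} v^j \Big( \frac{v^i}{\|v\|} + b^i \Big) \\
&\quad + \Big\{ 1 + \frac{\|v\|}{F(v)} \Big( 1 + 2 \frac{\beta(v)}{\|v\|} + \|\beta\|^2 \Big) \Big\}^{-1} \frac{\|v\|^2}{F^2(v)} \Big( \frac{v^i}{\|v\|} + b^i \Big) \Big( \frac{v^j}{\|v\|} + b^j \Big) \\
&= \frac{\|v\|}{F(v)} a^{ij} + \frac{v^i v^j}{F(v) \|v\| (1 - \sum_{i=1}^n z^i z_i)} - \frac{2 v^i v^j}{F^2(v)} - \frac{\|v\|}{F^2(v)} (b^i v^j + b^j v^i) \\
&= \frac{\|v\|}{F(v)} a^{ij} + \frac{\beta(v) + \|\beta\|^2 \|v\|}{F^3(v)} v^i v^j - \frac{\|v\|}{F^2(v)} (b^i v^j + b^j v^i)
\end{aligned}
\]
as desired.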
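Lemmas 6.13 and 6.14 can be cross-checked numerically in a simple special case. The sketch below assumes Euclidean $\alpha$ on $\mathbb{R}^3$ (so $a_{ij} = a^{ij} = \delta_{ij}$ and $b^i = b_i$) and a constant 1-form with hypothetical components `B`. It compares the formula for $g_{ij}$ with a finite-difference Hessian of $F^2/2$ and checks that the formula for $g^{ij}$ is indeed the inverse. All names and sample values are assumptions made for illustration.

```python
import math

B = (0.3, -0.2, 0.1)  # assumed constant 1-form; alpha is Euclidean (a_ij = delta_ij)


def randers_norm(v):
    """F(v) = |v| + b(v)."""
    return math.sqrt(sum(c * c for c in v)) + sum(bi * c for bi, c in zip(B, v))


def lower_g(v):
    """g_ij(v) from Lemma 6.13 with a_ij = delta_ij."""
    n = len(v)
    norm = math.sqrt(sum(c * c for c in v))
    big_f = randers_norm(v)
    return [
        [
            (big_f / norm) * ((1.0 if i == j else 0.0) - v[i] * v[j] / norm**2)
            + (v[i] / norm + B[i]) * (v[j] / norm + B[j])
            for j in range(n)
        ]
        for i in range(n)
    ]


def upper_g(v):
    """g^ij(v) from Lemma 6.14 with a^ij = delta^ij, so that b^i = b_i."""
    n = len(v)
    norm = math.sqrt(sum(c * c for c in v))
    beta = sum(bi * c for bi, c in zip(B, v))
    big_f = norm + beta
    bb = sum(bi * bi for bi in B)  # squared dual norm of beta
    return [
        [
            (norm / big_f) * (1.0 if i == j else 0.0)
            + (beta + bb * norm) / big_f**3 * v[i] * v[j]
            - (norm / big_f**2) * (B[i] * v[j] + B[j] * v[i])
            for j in range(n)
        ]
        for i in range(n)
    ]


def hessian_half_f2(v, h=1e-4):
    """Central-difference approximation of (1/2) d^2(F^2) / dv^i dv^j."""
    n = len(v)
    hess = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(n):
            acc = 0.0
            for si in (1, -1):
                for sj in (1, -1):
                    w = list(v)
                    w[i] += si * h
                    w[j] += sj * h
                    acc += si * sj * 0.5 * randers_norm(w) ** 2
            hess[i][j] = acc / (4.0 * h * h)
    return hess


v = [0.7, -1.1, 0.4]
g_low, g_up, g_fd = lower_g(v), upper_g(v), hessian_half_f2(v)
hessian_defect = max(abs(g_low[i][j] - g_fd[i][j]) for i in range(3) for j in range(3))
inverse_defect = max(
    abs(sum(g_low[i][k] * g_up[k][j] for k in range(3)) - (1.0 if i == j else 0.0))
    for i in range(3)
    for j in range(3)
)
print("Hessian defect:", hessian_defect)
print("inverse defect:", inverse_defect)
```

The inverse check is exact up to rounding, whereas the Hessian comparison is limited by the finite-difference step `h`.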

Combining the above lemmas, we arrive at the following expression of $G^i(v)$ (see [25, (11.3.11)]). We will omit some calculations, and those are left as an exercise (Exercise 6.16).

Proposition 6.15 ($G^i$ of Randers Spaces) For $v \in T_xU \setminus \{0\}$, we have
\[
\begin{aligned}
G^i(v) = \sum_{j,k=1}^n \Big\{ & \widetilde\Gamma^i_{jk}(x) v^j v^k + b_{jk}(x) \big( a^{ij}(x) v^k - a^{ik}(x) v^j \big) \|v\|_\alpha \\
& + b_{jk}(x) \frac{v^i}{F(v)} \Big( v^j v^k + \big( b^k(x) v^j - b^j(x) v^k \big) \|v\|_\alpha \Big) \Big\}.
\end{aligned}
\]

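Proof For $1 \le l \le n$, observe that
\[
\sum_{j,k=1}^n \frac{\partial g_{jk}}{\partial x^l} v^j v^k = \frac{\partial [F^2(v)]}{\partial x^l}
= \frac{F(v)}{\|v\|} \sum_{j,k=1}^n \frac{\partial a_{jk}}{\partial x^l} v^j v^k + 2F(v) \sum_{j=1}^n \frac{\partial b_j}{\partial x^l} v^j.
\]
We also find
\[
\begin{aligned}
\sum_{j,k=1}^n \frac{\partial g_{lk}}{\partial x^j} v^j v^k
&= \frac{1}{2} \sum_{j=1}^n \frac{\partial^2 [F^2(v)]}{\partial v^l \partial x^j} v^j \\
&= \frac{F(v)}{\|v\|} \sum_{j,k=1}^n \frac{\partial a_{lk}}{\partial x^j} v^j v^k
 + \Big( \frac{b_l}{2\|v\|} - \frac{\beta(v)}{2\|v\|^3} \sum_{m=1}^n a_{lm} v^m \Big) \sum_{j,p,q=1}^n \frac{\partial a_{pq}}{\partial x^j} v^p v^q v^j \\
&\quad + F(v) \sum_{j=1}^n \frac{\partial b_l}{\partial x^j} v^j
 + \Big( b_l + \frac{1}{\|v\|} \sum_{m=1}^n a_{lm} v^m \Big) \sum_{j,k=1}^n \frac{\partial b_k}{\partial x^j} v^j v^k.
\end{aligned}
\]
Hence we have
\[
\begin{aligned}
\sum_{j,k,l=1}^n a^{il} \Big( \frac{\partial g_{lk}}{\partial x^j} + \frac{\partial g_{jl}}{\partial x^k} - \frac{\partial g_{jk}}{\partial x^l} \Big) v^j v^k
&= \frac{2F(v)}{\|v\|} \sum_{j,k=1}^n \widetilde\Gamma^i_{jk} v^j v^k
 + \Big( \frac{b^i}{\|v\|} - \frac{\beta(v)}{\|v\|^3} v^i \Big) \sum_{j,p,q=1}^n \frac{\partial a_{pq}}{\partial x^j} v^p v^q v^j \\
&\quad + 2F(v) \sum_{j,l=1}^n a^{il} \Big( \frac{\partial b_l}{\partial x^j} - \frac{\partial b_j}{\partial x^l} \Big) v^j
 + \Big( 2b^i + \frac{2}{\|v\|} v^i \Big) \sum_{j,k=1}^n \frac{\partial b_k}{\partial x^j} v^j v^k.
\end{aligned}
\tag{6.8}
\]
In order to apply Lemma 6.14, we also calculate
\[
\sum_{j,k,l=1}^n v^l \Big( \frac{\partial g_{lk}}{\partial x^j} + \frac{\partial g_{jl}}{\partial x^k} - \frac{\partial g_{jk}}{\partial x^l} \Big) v^j v^k
= \frac{F(v)}{\|v\|} \sum_{j,k,l=1}^n \frac{\partial a_{kl}}{\partial x^j} v^j v^k v^l + 2F(v) \sum_{j,k=1}^n \frac{\partial b_k}{\partial x^j} v^j v^k.
\]
Moreover, by (6.8), we have
\[
\begin{aligned}
\sum_{j,k,l=1}^n b^l \Big( \frac{\partial g_{lk}}{\partial x^j} + \frac{\partial g_{jl}}{\partial x^k} - \frac{\partial g_{jk}}{\partial x^l} \Big) v^j v^k
&= \frac{2F(v)}{\|v\|} \sum_{m,j,k=1}^n b_m \widetilde\Gamma^m_{jk} v^j v^k
 + \Big( \frac{\|\beta\|^2}{\|v\|} - \frac{\beta(v)^2}{\|v\|^3} \Big) \sum_{j,p,q=1}^n \frac{\partial a_{pq}}{\partial x^j} v^p v^q v^j \\
&\quad + 2F(v) \sum_{j,k=1}^n \Big( \frac{\partial b_k}{\partial x^j} - \frac{\partial b_j}{\partial x^k} \Big) b^k v^j
 + 2 \Big( \|\beta\|^2 + \frac{\beta(v)}{\|v\|} \Big) \sum_{j,k=1}^n \frac{\partial b_k}{\partial x^j} v^j v^k.
\end{aligned}
\]
Bringing all the calculations and Lemma 6.14 together, we obtain
\[
\begin{aligned}
G^i &= \frac{1}{2} \sum_{j,k,l=1}^n g^{il} \Big( \frac{\partial g_{lk}}{\partial x^j} + \frac{\partial g_{jl}}{\partial x^k} - \frac{\partial g_{jk}}{\partial x^l} \Big) v^j v^k \\
&= \sum_{j,k=1}^n \widetilde\Gamma^i_{jk} v^j v^k + \|v\| \sum_{j,l=1}^n a^{il} \Big( \frac{\partial b_l}{\partial x^j} - \frac{\partial b_j}{\partial x^l} \Big) v^j
 - \frac{v^i}{F(v)} \sum_{j,k,l=1}^n b_l \widetilde\Gamma^l_{jk} v^j v^k \\
&\quad - \frac{\|v\| v^i}{F(v)} \sum_{j,k=1}^n \Big( \frac{\partial b_k}{\partial x^j} - \frac{\partial b_j}{\partial x^k} \Big) b^k v^j
 + \frac{v^i}{F(v)} \sum_{j,k=1}^n \frac{\partial b_k}{\partial x^j} v^j v^k.
\end{aligned}
\tag{6.9}
\]
Recalling the definition of $b_{jk}$ in (6.5) and noticing the symmetry of $b_{jk} - \partial b_j/\partial x^k$ in $j$ and $k$, we have
\[
\sum_{j,l=1}^n a^{il} \Big( \frac{\partial b_l}{\partial x^j} - \frac{\partial b_j}{\partial x^l} \Big) v^j
= \sum_{j,k=1}^n \frac{\partial b_j}{\partial x^k} (a^{ij} v^k - a^{ik} v^j)
= \sum_{j,k=1}^n b_{jk} (a^{ij} v^k - a^{ik} v^j).
\]
We also deduce from (6.5) that
\[
\begin{aligned}
& -\frac{v^i}{F(v)} \sum_{j,k,l=1}^n b_l \widetilde\Gamma^l_{jk} v^j v^k - \frac{\|v\| v^i}{F(v)} \sum_{j,k=1}^n \Big( \frac{\partial b_k}{\partial x^j} - \frac{\partial b_j}{\partial x^k} \Big) b^k v^j \\
&\qquad = \frac{v^i}{F(v)} \sum_{j,k=1}^n \Big( b_{jk} - \frac{\partial b_j}{\partial x^k} \Big) v^j v^k + \frac{\|v\| v^i}{F(v)} \sum_{j,k=1}^n b_{jk} (b^k v^j - b^j v^k).
\end{aligned}
\]
Substituting these into (6.9) completes the proof.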

 
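As a sanity check on Proposition 6.15, one can compare its formula with a direct evaluation of $G^i = \sum_l g^{il} W_l$, where $W_l = \frac12 \sum_j \frac{\partial^2 [F^2]}{\partial v^l \partial x^j} v^j - \frac12 \frac{\partial [F^2]}{\partial x^l}$ (this is the same quantity as in the first line of (6.9), rewritten via the identities at the beginning of the proof). The sketch below assumes Euclidean $\alpha$ on $\mathbb{R}^2$, so $\widetilde\Gamma^i_{jk} = 0$ and $b_{jk} = \partial b_j/\partial x^k$, and it prescribes hypothetical values of $b_j$ and $\partial b_j/\partial x^k$ at a single point. The names `b`, `db` and both functions are illustrative assumptions.

```python
import math

# Assumed 1-jet of the 1-form beta at one point (Euclidean alpha on R^2):
b = [0.3, -0.2]                    # b_j(x), with |b| < 1
db = [[0.1, 0.4], [-0.25, 0.05]]   # db[j][k] = (d b_j / d x^k)(x) = b_jk(x) here


def spray_formula(v):
    """G^i(v) from Proposition 6.15 with a_ij = delta_ij and Gamma-tilde = 0."""
    n = len(v)
    norm = math.sqrt(sum(c * c for c in v))
    big_f = norm + sum(b[j] * v[j] for j in range(n))
    result = []
    for i in range(n):
        total = 0.0
        for j in range(n):
            for k in range(n):
                a_ij = 1.0 if i == j else 0.0
                a_ik = 1.0 if i == k else 0.0
                total += db[j][k] * (a_ij * v[k] - a_ik * v[j]) * norm
                total += db[j][k] * (v[i] / big_f) * (
                    v[j] * v[k] + (b[k] * v[j] - b[j] * v[k]) * norm
                )
        result.append(total)
    return result


def spray_direct(v):
    """G^i = sum_l g^{il} W_l, with g^{il} taken from Lemma 6.14 (a^ij = delta^ij)."""
    n = len(v)
    norm = math.sqrt(sum(c * c for c in v))
    beta = sum(b[j] * v[j] for j in range(n))
    big_f = norm + beta
    bb = sum(c * c for c in b)
    # dF/dx^l = sum_k (d b_k / d x^l) v^k, since alpha does not depend on x here.
    df_dx = [sum(db[k][l] * v[k] for k in range(n)) for l in range(n)]
    quad = sum(df_dx[j] * v[j] for j in range(n))
    w = []
    for l in range(n):
        df_dv = v[l] / norm + b[l]
        mixed = df_dv * quad + big_f * sum(db[l][j] * v[j] for j in range(n))
        w.append(mixed - big_f * df_dx[l])
    g_up = [
        [
            (norm / big_f) * (1.0 if i == l else 0.0)
            + (beta + bb * norm) / big_f**3 * v[i] * v[l]
            - (norm / big_f**2) * (b[i] * v[l] + b[l] * v[i])
            for l in range(n)
        ]
        for i in range(n)
    ]
    return [sum(g_up[i][l] * w[l] for l in range(n)) for i in range(n)]


v = [0.7, -1.1]
g_formula = spray_formula(v)
g_direct = spray_direct(v)
spray_defect = max(abs(p - q) for p, q in zip(g_formula, g_direct))
print("spray defect:", spray_defect)
```

In this flat setting the two evaluations should agree up to rounding. Setting all entries of `db` to zero makes both vanish, in line with the fact that $b_{jk} = 0$ reduces the geodesic equation of $F$ to that of $\alpha$.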

Exercise 6.16 Follow all the calculations in the proof of Proposition 6.15.

We close this section with two interesting characterizations of a Randers space being also a Berwald space (see [25, Theorem 11.5.1] for a detailed historical account on (I) ⇔ (II) and [83] for the equivalence to (III)).

Theorem 6.17 Let $(M, F)$ be a Randers space. Then the following are equivalent:
(I) $(M, F)$ is a Berwald space.
(II) $\beta$ is parallel with respect to $\alpha$, i.e., $\widetilde\nabla \beta = 0$.
(III) We have $G^i(v) = G^i(-v)$ for all $v \in TM$.

Note that $\widetilde\nabla \beta = 0$ is equivalent to $b_{jk} = 0$ by (6.5). Then, as can be seen from Proposition 6.15, the geodesic equation for $F$ coincides with that for $\alpha$. The condition $b_{jk} = 0$ in particular implies that $\beta$ is a Killing form with respect to $\alpha$ (see Sect. 10.3).

Exercise 6.18 Prove Theorem 6.17. (One may apply the characterization of Berwald spaces corresponding to Berwald's original definition, by the quadraticity of the geodesic spray coefficients $G^i$, which we encountered in the proof of Proposition 6.11.)

6.5 Hilbert and Funk Geometries

Let $D \subset \mathbb{R}^n$ be a bounded convex domain. Related to his fourth problem, Hilbert [121] introduced a distance function $d_H$ on D as follows. Given distinct points x, y ∈ D, denote by $x' = x + s(y - x)$ and $y' = x + t(y - x)$ the intersections of the boundary ∂D and the line passing through x and y with s < 0 < t (see Fig. 6.1). Then we define
\[
d_H(x, y) := \frac{1}{2} \log\left( \frac{|x' - y| \cdot |x - y'|}{|x' - x| \cdot |y - y'|} \right),
\]
where |·| is the Euclidean norm. This is indeed a distance function on D and satisfies an interesting property that line segments between any two points are minimizing. In the particular case where D is the unit ball, $(D, d_H)$ coincides with the Klein model of the hyperbolic space. The structure of $(D, d_H)$ has been investigated from geometric and dynamical viewpoints (see [33, 80, 91] among others). For instance, $(D, d_H)$ is known to be Gromov hyperbolic under some mild smoothness and convexity assumptions on D. We also refer to [30] for a result on the spectral gap of Hilbert geometry for a linear Laplacian defined in [29].

Fig. 6.1 Hilbert and Funk geometries (the line through x and y meets ∂D at x′ and y′; the line through x in the direction v meets ∂D at a and b)

Funk [103] introduced the following non-symmetrization of $d_H$:
\[
d_F(x, y) := \log\left( \frac{|x - y'|}{|y - y'|} \right).
\]

Note that $d_F(x, y) \neq d_F(y, x)$ in general, while the triangle inequality $d_F(x, z) \le d_F(x, y) + d_F(y, z)$ still holds. Clearly we have $2d_H(x, y) = d_F(x, y) + d_F(y, x)$. Moreover, line segments are minimizing also for $d_F$.

Exercise 6.19 Show that $d_H$ and $d_F$ satisfy the triangle inequality. (This is usually shown by using the cross ratio. We refer to [257] for an alternative proof as well as a connection with Teichmüller spaces.)

Exercise 6.20 Show that line segments are minimizing with respect to both $d_H$ and $d_F$.

If ∂D is smooth and D is strongly convex (in other words, ∂D is positively curved), then $d_H$ and $d_F$ are realized by the smooth Finsler structures
\[
F_H(x, v) = \frac{|v|}{2} \left( \frac{1}{|x - a|} + \frac{1}{|x - b|} \right), \qquad F_F(x, v) = \frac{|v|}{|x - b|}
\]
for $v \in T_x D = \mathbb{R}^n$, respectively, where a = x + sv and b = x + tv denote the intersections of ∂D and the line passing through x in the direction v with s < 0 < t as in Fig. 6.1 (see, e.g., [231, Sect. 2.3]). Note that $2F_H(x, v) = F_F(x, v) + F_F(x, -v)$. A remarkable feature of these metrics is that they have constant negative flag curvature, as follows (see [212, Theorem 1] and [231, Theorem 12.2.11]).

Theorem 6.21 (Flag Curvature of Hilbert and Funk Geometries) A Finsler manifold $(D, F_H)$ as above has the constant flag curvature −1, and $(D, F_F)$ has the constant flag curvature −1/4.
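The definitions of $d_H$ and $d_F$ can be checked numerically in the simplest case. The following is a minimal sketch, assuming D is the open unit disk in the plane; the function names (`boundary_params`, `funk_distance`, `hilbert_distance`, `klein_distance`) are illustrative and not taken from the text. It computes both distances from the parameters s < 0 and t > 1 of the boundary points x′ and y′, and compares $d_H$ with the standard distance formula of the Klein model.

```python
import math

def boundary_params(x, y):
    """Return (s, t) with s < 0 < 1 < t such that x + s(y - x) and
    x + t(y - x) lie on the unit circle (x, y distinct points of the open disk)."""
    d = (y[0] - x[0], y[1] - x[1])
    a = d[0] ** 2 + d[1] ** 2
    b = x[0] * d[0] + x[1] * d[1]
    c = x[0] ** 2 + x[1] ** 2 - 1.0  # negative because x is inside the disk
    root = math.sqrt(b * b - a * c)
    return (-b - root) / a, (-b + root) / a

def funk_distance(x, y):
    """d_F(x, y) = log(|x - y'| / |y - y'|) = log(t / (t - 1))."""
    _, t = boundary_params(x, y)
    return math.log(t / (t - 1.0))

def hilbert_distance(x, y):
    """d_H(x, y) = (1/2) log(|x' - y| |x - y'| / (|x' - x| |y - y'|))."""
    s, t = boundary_params(x, y)
    return 0.5 * math.log((1.0 - s) * t / ((-s) * (t - 1.0)))

def klein_distance(x, y):
    """Hyperbolic distance in the Klein disk model (curvature -1)."""
    dot = x[0] * y[0] + x[1] * y[1]
    nx = 1.0 - (x[0] ** 2 + x[1] ** 2)
    ny = 1.0 - (y[0] ** 2 + y[1] ** 2)
    return math.acosh((1.0 - dot) / math.sqrt(nx * ny))

x, y, z = (0.1, 0.2), (-0.3, 0.5), (0.4, -0.1)
print(funk_distance(x, y), funk_distance(y, x))    # asymmetric
print(hilbert_distance(x, y), klein_distance(x, y))  # agree on the unit disk
```

For a general convex domain only `boundary_params` would change, since both distances depend on x and y solely through s and t.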


Hence it is natural to expect that (D, dH ) and (D, dF ) enjoy some properties of negatively curved spaces. Exercise 6.22 Prove that (D, FH ) is complete and that (D, FF ) is forward complete and backward incomplete. We find from the non-equivalence of the forward and backward completenesses that (D, FF ) can never be a Berwald space. As for (D, FH ), it follows from Theorem 6.9 that it is a Berwald space if and only if it is Riemannian (and then it is the hyperbolic space).

6.6 Teichmüller Space

The Teichmüller metric on Teichmüller space (of a surface of genus g with p punctures) is arguably one of the most famous Finsler structures in differential geometry. The Teichmüller metric is known to be complete by Earle and Eells [90], while, for instance, the Weil–Petersson metric is Riemannian but incomplete (see [249]). Moreover, the Teichmüller metric does not satisfy the Gromov hyperbolicity ([169]; see also [123, 175, 176] for alternative proofs), and the Weil–Petersson metric is Gromov hyperbolic if and only if 3g − 3 + p ≤ 2 (see [47]). It is also known that the Teichmüller metric does not have the Busemann nonpositive curvature if 3g − 3 + p ≥ 2 ([167, 168]; see Definition 8.26 below for the definition of the Busemann nonpositive curvature). We refer to nice surveys [168, 250] for the geometry of the Teichmüller metric and the Weil–Petersson metric, respectively. Further studies of the Teichmüller metric from the viewpoint of Finsler geometry seem to be worthwhile.

Chapter 7

Variation Formulas for Arclength

In this chapter we study the first and second variation formulas for arclength, followed by some applications to the behavior of the distance function along geodesics, including the study of cut and conjugate points. The first variation formula (Proposition 7.1) is closely related to the geodesic equation (3.15), which was introduced as the Euler–Lagrange equation for the energy functional. The second variation formula (Theorem 7.6) will be related to the flag curvature (see Theorem 7.8 and comparison theorems in Sect. 8.3).

© The Author(s), under exclusive license to Springer Nature Switzerland AG 2021 S.-i. Ohta, Comparison Finsler Geometry, Springer Monographs in Mathematics, https://doi.org/10.1007/978-3-030-80650-7_7

7.1 First Variation Formula

Throughout this and the next sections, we fix a $C^\infty$-map $\sigma : [0, l] \times (-\varepsilon, \varepsilon) \longrightarrow M$ and denote the tangent and variational vector fields by
\[
T(t, s) := \partial_t \sigma(t, s) = \frac{\partial \sigma}{\partial t}(t, s), \qquad U(t, s) := \partial_s \sigma(t, s) = \frac{\partial \sigma}{\partial s}(t, s).
\]
For simplicity, we will assume that $T(t, s) \neq 0$ for all t and s. Denote the length of the curve σ(·, s) by
\[
L(s) := L\big(\sigma(\cdot, s)\big) = \int_0^l F\big(T(t, s)\big) \,dt.
\]
First we consider the first variation of L, which is calculated similarly to the variation of the energy functional in Sect. 3.3. We remark that, different from Sect. 3.3, σ does not necessarily fix the endpoints.

Proposition 7.1 (First Variation Formula) Let σ, T, U, and L be as above. Then we have
\[
L'(s) = \left[ g_T\left( U, \frac{T}{F(T)} \right)(\cdot, s) \right]_0^l - \int_0^l g_T\left( U, D^T_T\left[ \frac{T}{F(T)} \right] \right)(t, s) \,dt
= \int_0^l g_T\left( D^T_T U, \frac{T}{F(T)} \right)(t, s) \,dt.
\]

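Proof We shall modify the calculation in Sect. 3.3 focusing on T/F(T). Observe from (3.6) that
\[
\frac{\partial [F(T)]}{\partial s} = \frac{1}{2F(T)} \frac{\partial}{\partial s} \left[ \sum_{i,j=1}^n g_{ij}(T) T^i T^j \right]
= \frac{1}{2F(T)} \sum_{i,j=1}^n \left\{ \sum_{k=1}^n \frac{\partial g_{ij}}{\partial x^k}(T) U^k T^i T^j + g_{ij}(T) (\partial_s T^i \cdot T^j + T^i \cdot \partial_s T^j) \right\}.
\]
Then, we deduce again from (3.6) that
\[
\sum_{j=1}^n \frac{1}{F(T)} g_{ij}(T) \partial_s T^i \cdot T^j = \sum_{j=1}^n g_{ij}(T) \partial_t U^i \frac{T^j}{F(T)}
= \sum_{j=1}^n \left\{ \frac{\partial}{\partial t} \left[ g_{ij}(T) U^i \frac{T^j}{F(T)} \right] - \sum_{k=1}^n \frac{\partial g_{ij}}{\partial x^k}(T) T^k U^i \frac{T^j}{F(T)} - g_{ij}(T) U^i \frac{\partial}{\partial t} \left[ \frac{T^j}{F(T)} \right] \right\}.
\]
Hence we obtain
\begin{align*}
\frac{\partial [F(T)]}{\partial s}
&= \frac{\partial}{\partial t} \left[ \sum_{i,j=1}^n g_{ij}(T) U^i \frac{T^j}{F(T)} \right] - \sum_{i,j=1}^n g_{ij}(T) U^i \frac{\partial}{\partial t} \left[ \frac{T^j}{F(T)} \right] \\
&\quad + \frac{1}{2F(T)} \sum_{i,j,k=1}^n \left( \frac{\partial g_{ij}}{\partial x^k} - \frac{\partial g_{kj}}{\partial x^i} - \frac{\partial g_{ik}}{\partial x^j} \right)(T) U^k T^i T^j \\
&= \frac{\partial}{\partial t} \left[ g_T\left( U, \frac{T}{F(T)} \right) \right] - \sum_{i,j=1}^n g_{ij}(T) U^i \frac{\partial}{\partial t} \left[ \frac{T^j}{F(T)} \right] - \frac{1}{F(T)} \sum_{i,j,k,l=1}^n g_{kl}(T) \gamma^l_{ij}(T) U^k T^i T^j \\
&= \frac{\partial}{\partial t} \left[ g_T\left( U, \frac{T}{F(T)} \right) \right] - g_T\left( U, D^T_T\left[ \frac{T}{F(T)} \right] \right).
\end{align*}
This completes the proof of the first equality. The second equality follows from Lemma 4.8, which implies
\[
\frac{\partial}{\partial t} \left[ g_T\left( U, \frac{T}{F(T)} \right) \right] = g_T\left( U, D^T_T\left[ \frac{T}{F(T)} \right] \right) + g_T\left( D^T_T U, \frac{T}{F(T)} \right)
\]
since $g_{T/F(T)} = g_T$ and $\Gamma^i_{jk}(T/F(T)) = \Gamma^i_{jk}(T)$.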

Exercise 7.2 Prove that a $C^2$-curve $\eta : [0, l] \longrightarrow M$ with $\dot\eta \neq 0$ is locally minimizing if and only if it satisfies
\[
D^{\dot\eta}_{\dot\eta} \left[ \frac{\dot\eta}{F(\dot\eta)} \right] = 0
\]
on (0, l).

Note that combining $D^{\dot\eta}_{\dot\eta}[\dot\eta / F(\dot\eta)] = 0$ with the constancy of the speed gives rise to the geodesic equation $D^{\dot\eta}_{\dot\eta} \dot\eta = 0$.

In the remainder of this section, we digress for a little while to discuss the first variation formula for the distance function. We refer to [51, Sect. 4.5] for the case of metric spaces. We begin with a toy example of Minkowski normed spaces.

Exercise 7.3 Let $(\mathbb{R}^n, \|\cdot\|)$ be a Minkowski normed space, and take $y \in \mathbb{R}^n \setminus \{0\}$ and $w \in T_y \mathbb{R}^n$. Show that, by identifying $\mathbb{R}^n$ and $T_y \mathbb{R}^n$,
\[
\lim_{\varepsilon \to 0} \frac{\|y + \varepsilon w\| - \|y\|}{\varepsilon} = \frac{1}{\|y\|} g_y(y, w).
\]

On a Finsler manifold, on one hand, one may apply Proposition 7.1 in a small neighborhood of ξ(0) to see the following.

Exercise 7.4 Let (M, F) be a forward complete Finsler manifold, and fix x ∈ M and a geodesic $\xi : (-\delta, \delta) \longrightarrow M$. Prove that, for any unit speed minimal geodesic $\eta : [0, l] \longrightarrow M$ from x to ξ(0) with l := d(x, ξ(0)), we have
\[
\limsup_{\varepsilon \to 0} \frac{d(x, \xi(\varepsilon)) - d(x, \xi(0))}{\varepsilon} \le g_{\dot\eta}\big( \dot\eta(l), \dot\xi(0) \big).
\]

On the other hand, we can show the following.

Exercise 7.5 Let (M, F), x and ξ be as in the previous exercise and again put l := d(x, ξ(0)). Assume that there is a sequence $(\eta_i)_{i \in \mathbb{N}}$ of unit speed minimal geodesics from x to $\xi(s_i)$ such that $s_i \to 0$ and $\dot\eta_i(0)$ converges to some $v \in U_x M$. Prove that $\eta(t) := \exp_x(tv)$, t ∈ [0, l], is a minimal geodesic from x to ξ(0), and we have
\[
\lim_{\varepsilon \to 0} \frac{d(x, \xi(\varepsilon)) - d(x, \xi(0))}{\varepsilon} = g_{\dot\eta}\big( \dot\eta(l), \dot\xi(0) \big).
\]


Combining the above exercises, we arrive at the following first variation formula for the distance function:
\[
\lim_{\varepsilon \to 0} \frac{d(x, \xi(\varepsilon)) - d(x, \xi(0))}{\varepsilon} = \min_{\eta} g_{\dot\eta}\big( \dot\eta(l), \dot\xi(0) \big),
\]
where l = d(x, ξ(0)) and η runs over all unit speed minimal geodesics from x to ξ(0).
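The Minkowski toy case (Exercise 7.3) lends itself to a quick numerical sanity check. The sketch below is only an illustration under stated assumptions: the norm is a hypothetical Randers-type norm |v| + ⟨b, v⟩ on the plane with |b| < 1, and $g_{ij}(y) = \frac{1}{2} \partial^2 (\|\cdot\|^2) / \partial v^i \partial v^j$ is approximated by central finite differences rather than computed in closed form. All function names are made up for this example.

```python
import math

def randers_norm(v, b=(0.3, 0.1)):
    """Hypothetical Minkowski norm |v| + <b, v> on R^2 (requires |b| < 1)."""
    return math.hypot(v[0], v[1]) + b[0] * v[0] + b[1] * v[1]

def fundamental_tensor(norm, y, h=1e-4):
    """Approximate g_ij(y) = (1/2) d^2(norm^2)/dv^i dv^j by central differences."""
    n = len(y)

    def half_square(i, j, si, sj):
        v = list(y)
        v[i] += si * h
        v[j] += sj * h
        return 0.5 * norm(v) ** 2

    g = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(n):
            g[i][j] = (half_square(i, j, 1, 1) - half_square(i, j, 1, -1)
                       - half_square(i, j, -1, 1) + half_square(i, j, -1, -1)) / (4 * h * h)
    return g

def g_inner(norm, y, u, w):
    """g_y(u, w) = sum_{i,j} g_ij(y) u^i w^j."""
    g = fundamental_tensor(norm, y)
    n = len(y)
    return sum(g[i][j] * u[i] * w[j] for i in range(n) for j in range(n))

def difference_quotient(norm, y, w, eps=1e-6):
    """(||y + eps w|| - ||y||) / eps for small eps > 0."""
    shifted = [y[k] + eps * w[k] for k in range(len(y))]
    return (norm(shifted) - norm(y)) / eps

y, w = [1.0, 2.0], [-0.5, 0.7]
lhs = difference_quotient(randers_norm, y, w)
rhs = g_inner(randers_norm, y, y, w) / randers_norm(y)
print(lhs, rhs)  # the two values should agree up to discretization error
```

The same helper also illustrates the Euler-type identity $g_y(y, y) = \|y\|^2$, which is what makes the right-hand side of Exercise 7.3 independent of the scaling of y.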

7.2 Second Variation Formula

Next, we consider the second variation of L. We assume that η := σ(·, 0) is a nonconstant geodesic (otherwise the formula becomes more complicated), and define the index form for piecewise $C^1$-vector fields W, X along η by
\[
I(W, X) := \frac{1}{F(\dot\eta)} \int_0^l \Big\{ g_{\dot\eta}(D^{\dot\eta}_{\dot\eta} W, D^{\dot\eta}_{\dot\eta} X) - g_{\dot\eta}\big( R_{\dot\eta}(W), X \big) \Big\} \,dt. \tag{7.1}
\]
Observe from Lemma 5.8 that this is symmetric, i.e., I(X, W) = I(W, X).

Theorem 7.6 (Second Variation Formula) Let σ, T, U, and L be as in the previous section, and assume that η := σ(·, 0) is a nonconstant geodesic. Then we have
\[
L''(0) = I(U_0, U_0) + \left[ g_{\dot\eta}\left( D^T_U U(\cdot, 0), \frac{\dot\eta}{F(\dot\eta)} \right) \right]_0^l - \int_0^l \frac{\partial_s [F(T)](t, 0)^2}{F(\dot\eta(t))} \,dt,
\]
where we set $U_0 := U(\cdot, 0)$.

Proof We differentiate the first variation
\[
L'(s) = \int_0^l \frac{g_T(D^T_T U, T)}{F(T)}(t, s) \,dt
\]
given in Proposition 7.1 and apply Lemma 4.8 to see
\[
L''(0) = \int_0^l \left\{ \frac{g_T(D^T_U [D^T_T U], T) + g_T(D^T_T U, D^T_U T)}{F(T)} - \frac{\partial_s [F(T)]}{F(T)^2} g_T(D^T_T U, T) \right\}(t, 0) \,dt.
\]
Since $D^T_T U = D^T_U T$ by the very definition of covariant derivatives, we find in the second term that
\[
g_T(D^T_T U, T) = g_T(D^T_U T, T) = \frac{1}{2} \frac{\partial [g_T(T, T)]}{\partial s} = F(T) \frac{\partial [F(T)]}{\partial s}.
\]
Hence we have
\[
L''(0) = \int_0^l \frac{g_T(D^T_U [D^T_T U], T) + g_T(D^T_T U, D^T_T U)}{F(T)}(t, 0) \,dt - \int_0^l \frac{\partial_s [F(T)](t, 0)^2}{F(\dot\eta(t))} \,dt.
\]
By combining this with
\[
\left[ \frac{g_T(D^T_U U, T)}{F(T)}(\cdot, 0) \right]_0^l = \int_0^l \frac{g_T(D^T_T [D^T_U U], T)}{F(T)}(t, 0) \,dt
\]
following from the condition that η is a geodesic, it now suffices to show
\[
g_T\big( D^T_U [D^T_T U] - D^T_T [D^T_U U], T \big)(t, 0) = -g_{\dot\eta}\big( R_{\dot\eta}(U_0), U_0 \big)(t). \tag{7.2}
\]

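For the sake of simplicity, we will omit the evaluations at (t, 0) and write T and U instead of $\dot\eta$ and $U_0$, respectively. Observe first that
\begin{align*}
D^T_U [D^T_T U]
&= D^T_U \left[ \sum_{i=1}^n \left\{ \partial_t U^i + \sum_{j,k=1}^n \Gamma^i_{jk}(T) T^j U^k \right\} \frac{\partial}{\partial x^i} \right] \\
&= \sum_{i=1}^n \left\{ \partial_s \partial_t U^i + \sum_{j,k=1}^n \Big( \partial_s [\Gamma^i_{jk}(T)] T^j U^k + \Gamma^i_{jk}(T) (\partial_s T^j \cdot U^k + T^j \cdot \partial_s U^k) \Big) \right\} \frac{\partial}{\partial x^i} \\
&\quad + \sum_{i,l,m=1}^n \Gamma^i_{lm}(T) U^l \left\{ \partial_t U^m + \sum_{j,k=1}^n \Gamma^m_{jk}(T) T^j U^k \right\} \frac{\partial}{\partial x^i}.
\end{align*}
Similarly,
\begin{align*}
D^T_T [D^T_U U]
&= D^T_T \left[ \sum_{i=1}^n \left\{ \partial_s U^i + \sum_{j,k=1}^n \Gamma^i_{jk}(T) U^j U^k \right\} \frac{\partial}{\partial x^i} \right] \\
&= \sum_{i=1}^n \left\{ \partial_t \partial_s U^i + \sum_{j,k=1}^n \Big( \partial_t [\Gamma^i_{jk}(T)] U^j U^k + \Gamma^i_{jk}(T) (\partial_t U^j \cdot U^k + U^j \cdot \partial_t U^k) \Big) \right\} \frac{\partial}{\partial x^i} \\
&\quad + \sum_{i,l,m=1}^n \Gamma^i_{lm}(T) T^l \left\{ \partial_s U^m + \sum_{j,k=1}^n \Gamma^m_{jk}(T) U^j U^k \right\} \frac{\partial}{\partial x^i}.
\end{align*}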


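Recalling $T = \partial_t \sigma$ and $U = \partial_s \sigma$ (so that $\partial_s T = \partial_t U$), we obtain
\begin{align*}
D^T_U [D^T_T U] - D^T_T [D^T_U U]
&= \sum_{i,j,k=1}^n \Big\{ \partial_s [\Gamma^i_{jk}(T)] T^j U^k - \partial_t [\Gamma^i_{jk}(T)] U^j U^k \Big\} \frac{\partial}{\partial x^i} \\
&\quad + \sum_{i,j,k,l,m=1}^n (\Gamma^i_{lm} \Gamma^m_{jk} - \Gamma^i_{jm} \Gamma^m_{lk})(T) T^j U^k U^l \frac{\partial}{\partial x^i}.
\end{align*}
Note that
\[
\frac{\partial [\Gamma^i_{jk}(T)]}{\partial s} = \sum_{l=1}^n \left\{ \frac{\partial \Gamma^i_{jk}}{\partial x^l}(T) U^l + \frac{\partial \Gamma^i_{jk}}{\partial v^l}(T) \partial_s T^l \right\}
\]
and, since η is a geodesic,
\[
\frac{\partial [\Gamma^i_{jk}(T)]}{\partial t} = \sum_{l=1}^n \left\{ \frac{\partial \Gamma^i_{jk}}{\partial x^l}(T) T^l + \frac{\partial \Gamma^i_{jk}}{\partial v^l}(T) \partial_t T^l \right\}
= \sum_{l=1}^n \left\{ \frac{\partial \Gamma^i_{jk}}{\partial x^l}(T) T^l - \frac{\partial \Gamma^i_{jk}}{\partial v^l}(T) G^l(T) \right\}.
\]
Therefore we have, choosing local coordinates with $g_{ij}(\dot\eta) = \delta_{ij}$ for brevity,
\begin{align*}
g_T\big( D^T_U [D^T_T U] - D^T_T [D^T_U U], T \big)(\cdot, 0)
&= \sum_{i,j,k,l=1}^n \left\{ \frac{\partial \Gamma^i_{jk}}{\partial x^l}(T) T^i T^j U^k U^l - \frac{\partial \Gamma^i_{lk}}{\partial x^j}(T) T^i T^j U^k U^l + \frac{\partial \Gamma^i_{lk}}{\partial v^j}(T) T^i G^j(T) U^k U^l \right\} \\
&\quad + \sum_{i,j,k,l,m=1}^n (\Gamma^i_{lm} \Gamma^m_{jk} - \Gamma^i_{jm} \Gamma^m_{lk})(T) T^i T^j U^k U^l \\
&\quad + \sum_{i,j,k,l=1}^n \frac{\partial \Gamma^i_{jk}}{\partial v^l}(T) T^i T^j U^k \partial_s T^l. \tag{7.3}
\end{align*}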

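Since the last term involving $\partial_s T^l$ is unwelcome, we would like to show that it vanishes. This follows from the relation
\[
\sum_{i,j=1}^n \frac{\partial \Gamma^i_{jk}}{\partial v^l}(v) v^i v^j = 0 \tag{7.4}
\]
for v ∈ TM \ 0 and 1 ≤ k, l ≤ n, where we again use local coordinates such that $g_{ij}(v) = \delta_{ij}$. In order to see (7.4), observe from (4.10) that
\[
\sum_{i,j=1}^n \frac{\partial \Gamma^i_{jk}}{\partial v^l}(v) v^j v^i = \sum_{i=1}^n \left\{ \frac{\partial N^i_k}{\partial v^l}(v) v^i - \Gamma^i_{kl}(v) v^i \right\}.
\]
Moreover, it follows from (4.10) and Theorem 2.8 that
\begin{align*}
\sum_{i=1}^n \frac{\partial N^i_k}{\partial v^l}(v) v^i
&= \sum_{i=1}^n \left\{ \sum_{m=1}^n \frac{\partial \gamma^i_{km}}{\partial v^l}(v) v^m + \gamma^i_{kl}(v) - \frac{1}{2} \sum_{m=1}^n \frac{\partial^2 g_{ik}}{\partial v^l \partial v^m}(v) G^m(v) \right\} v^i \\
&= \sum_{i=1}^n \gamma^i_{kl}(v) v^i + \frac{1}{2} \sum_{m=1}^n \frac{\partial g_{kl}}{\partial v^m}(v) G^m(v) \\
&= \sum_{i=1}^n \Gamma^i_{kl}(v) v^i.
\end{align*}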

This completes the proof of (7.4). Going back to (7.3), we deduce from (7.4), (4.5), and (5.10) that
\[
g_T\big( D^T_U [D^T_T U] - D^T_T [D^T_U U], T \big)(\cdot, 0) = \sum_{i,j,k,l=1}^n R^i_{klj}(T) T^i T^j U^k U^l.
\]
Then we apply the almost skew-symmetry (5.11) to obtain
\[
\sum_{i,j,k,l=1}^n R^i_{klj}(T) T^i T^j U^k U^l = -\sum_{i,j,k,l=1}^n R^k_{ilj}(T) T^i T^j U^k U^l = -\sum_{k,l=1}^n R^l_k(T) U^k U^l = -g_T\big( R_T(U), U \big).
\]
Thus we have (7.2), which completes the proof.

 

If σ fixes the endpoints (namely σ(0, s) = η(0) and σ(l, s) = η(l) for all s), then we have the following useful estimate.

Corollary 7.7 In the situation of Theorem 7.6, if σ fixes the endpoints, then we have $L''(0) \le I(U_0, U_0)$. Moreover, if $U_0$ is in addition a Jacobi field, then we have $L''(0) \le 0$.

Proof The first assertion is straightforward from Theorem 7.6 since U(0, ·) = 0 and U(l, ·) = 0. If $U_0$ is a Jacobi field, then we deduce from the Jacobi equation (5.3) and Lemma 4.9 that
\[
L''(0) \le I(U_0, U_0) = \frac{1}{F(\dot\eta)} \int_0^l \Big\{ g_{\dot\eta}(D^{\dot\eta}_{\dot\eta} U_0, D^{\dot\eta}_{\dot\eta} U_0) + g_{\dot\eta}(D^{\dot\eta}_{\dot\eta} D^{\dot\eta}_{\dot\eta} U_0, U_0) \Big\} \,dt
= \frac{1}{F(\dot\eta)} \Big[ g_{\dot\eta}(D^{\dot\eta}_{\dot\eta} U_0, U_0) \Big]_0^l = 0
\]
as desired.

As a standard application of the second variation formula, Synge’s theorem can be shown in the same way as the Riemannian case (going back to [16]; see [25, Sect. 8.8]).

Theorem 7.8 (Synge’s Theorem) Let (M, F) be a forward complete, oriented, even-dimensional Finsler manifold whose flag curvature is bounded from below by a positive constant. Then M is simply connected.

Exercise 7.9 Prove Theorem 7.8 by using the second variation formula.

We remark that, by the Bonnet–Myers theorem below (Theorem 8.1), M as in Theorem 7.8 is in fact compact. A key observation for Exercise 7.9 is that, if η is a loop (i.e., η(0) = η(l)), σ(0, s) = σ(l, s) and $D^{\dot\eta}_{\dot\eta} U_0 = 0$ in Theorem 7.6, then we have
\[
L''(0) \le -\frac{1}{F(\dot\eta)} \int_0^l g_{\dot\eta}\big( R_{\dot\eta}(U_0), U_0 \big) \,dt < 0
\]
by the positivity of the flag curvature.

7.3 Cut Points and Conjugate Points

Having at our disposal the second variation formula and the index form, we can analyze the behavior of the distance function along geodesics in the same manner as the Riemannian case (occasionally one can even reduce to the Riemannian situation via Proposition 4.6 and Theorem 5.12). First we introduce two fundamental notions.

Definition 7.10 (Cut Points) Fix x ∈ M. For a unit vector $v \in U_x M$, define
\[
\rho(v) := \sup\big\{ t > 0 \,\big|\, d\big( x, \exp_x(tv) \big) = t \big\} \in (0, \infty].
\]
If ρ(v) < ∞, then we call the point $\exp_x(\rho(v)v)$ a cut point of x. The set of all cut points of x is called the cut locus of x and will be denoted by Cut(x).


Definition 7.11 (Conjugate Points) Let $\eta : [0, l] \longrightarrow M$ be a nonconstant geodesic. If there is a nontrivial Jacobi field J along η such that J(0) = J(t) = 0 for some t ∈ (0, l], then we call η(t) a conjugate point of η(0) along η.

The following lemma is readily seen from the definition of cut points.

Lemma 7.12 (Uniqueness Before Cut Points) Take $v \in U_x M$ and consider the geodesic $\eta(t) := \exp_x(tv)$. Then, for each l ∈ (0, ρ(v)), $\eta|_{[0,l]}$ is a unique minimal geodesic from x to η(l).

Proof We prove by contradiction. Assume that there is another minimal geodesic $\xi : [0, l] \longrightarrow M$ from x to η(l). Then we have $\dot\xi(l) \neq \dot\eta(l)$ by Proposition 3.11(iii). Note that, however, the broken curve obtained by concatenating $\xi|_{[0,l]}$ and $\eta|_{[l,l+\varepsilon]}$ is minimizing provided that l + ε < ρ(v). This contradicts the fact that a locally minimizing curve of constant speed is necessarily a solution to the geodesic equation and hence $C^\infty$ (recall Exercise 3.19).

Since Jacobi fields are variational vector fields of geodesic variations (Proposition 5.5), conjugate points arise from the degeneracy of the exponential map. Precisely, y is a conjugate point of x if and only if there is $v \in T_x M$ such that $y = \exp_x(v)$ and the derivative of the exponential map $\exp_x$ at v does not have full rank. Thus the set of conjugate points of x (the conjugate locus of x) coincides with the set of critical values of $\exp_x$, and hence it is a closed set. One can also show that the cut locus is a closed set under the forward completeness (see Exercise 7.18).

For analyzing the behavior of the distance function along a geodesic beyond a conjugate point, the index form defined in (7.1) will be useful.

Lemma 7.13 Let $\eta : [0, l] \longrightarrow M$ be a nonconstant geodesic including no conjugate point of η(0). Then, for any piecewise $C^1$-vector field W along η with W(0) = 0 and W(l) = 0, we have I(W, W) ≥ 0. Moreover, equality holds if and only if W(t) = 0 for all t.
Proof Fix an orthonormal basis $\{v_i\}_{i=1}^n$ of $(T_{\eta(0)} M, g_{\dot\eta(0)})$, and let $J_i$ be the Jacobi field along η such that $J_i(0) = 0$ and $D^{\dot\eta}_{\dot\eta} J_i(0) = v_i$. Then, since η does not include conjugate points of η(0), $\{J_i(t)\}_{i=1}^n$ becomes a basis of $T_{\eta(t)} M$ for each t ∈ (0, l] (recall that linear combinations of $J_i$ are again Jacobi fields). Given a vector field W, we write in this basis
\[
W(t) = \sum_{i=1}^n f_i(t) J_i(t).
\]
Then we have
\[
I(W, W) = \sum_{i,j=1}^n \frac{1}{F(\dot\eta)} \int_0^l \Big\{ g_{\dot\eta}\big( D^{\dot\eta}_{\dot\eta} [f_i J_i], D^{\dot\eta}_{\dot\eta} [f_j J_j] \big) - f_i f_j \, g_{\dot\eta}\big( R_{\dot\eta}(J_i), J_j \big) \Big\} \,dt.
\]
Since $J_i$ is a Jacobi field, for any 1 ≤ i, j ≤ n, we find from Lemma 4.9 that
\begin{align*}
& g_{\dot\eta}\big( D^{\dot\eta}_{\dot\eta} [f_i J_i], D^{\dot\eta}_{\dot\eta} [f_j J_j] \big) - f_i f_j \, g_{\dot\eta}\big( R_{\dot\eta}(J_i), J_j \big) \\
&= g_{\dot\eta}\big( f_i D^{\dot\eta}_{\dot\eta} J_i + f_i' J_i, f_j D^{\dot\eta}_{\dot\eta} J_j + f_j' J_j \big) + f_i f_j \, g_{\dot\eta}(D^{\dot\eta}_{\dot\eta} D^{\dot\eta}_{\dot\eta} J_i, J_j) \\
&= \frac{d}{dt} \Big[ f_i f_j \, g_{\dot\eta}(D^{\dot\eta}_{\dot\eta} J_i, J_j) \Big] + f_i' f_j \Big\{ g_{\dot\eta}(J_i, D^{\dot\eta}_{\dot\eta} J_j) - g_{\dot\eta}(D^{\dot\eta}_{\dot\eta} J_i, J_j) \Big\} + f_i' f_j' \, g_{\dot\eta}(J_i, J_j)
\end{align*}
at t ∈ (0, l) where all the functions $f_i$ are differentiable. Moreover, Lemma 5.8 implies
\begin{align*}
\frac{d}{dt} \Big[ g_{\dot\eta}(J_i, D^{\dot\eta}_{\dot\eta} J_j) - g_{\dot\eta}(D^{\dot\eta}_{\dot\eta} J_i, J_j) \Big]
&= g_{\dot\eta}(J_i, D^{\dot\eta}_{\dot\eta} D^{\dot\eta}_{\dot\eta} J_j) - g_{\dot\eta}(D^{\dot\eta}_{\dot\eta} D^{\dot\eta}_{\dot\eta} J_i, J_j) \\
&= -g_{\dot\eta}\big( J_i, R_{\dot\eta}(J_j) \big) + g_{\dot\eta}\big( R_{\dot\eta}(J_i), J_j \big) = 0.
\end{align*}
Combining this with $J_i(0) = J_j(0) = 0$, we obtain $g_{\dot\eta}(J_i, D^{\dot\eta}_{\dot\eta} J_j) - g_{\dot\eta}(D^{\dot\eta}_{\dot\eta} J_i, J_j) = 0$. Since the total derivative term integrates to zero by W(0) = 0 and W(l) = 0, we have
\[
I(W, W) = \frac{1}{F(\dot\eta)} \int_0^l g_{\dot\eta}\left( \sum_{i=1}^n f_i' J_i, \sum_{j=1}^n f_j' J_j \right) dt \ge 0
\]
as desired. Equality holds if and only if $f_i$ is constant for all i, and then $f_i = 0$ since W(l) = 0.

The following corollary to Lemma 7.13, called the index lemma, provides a characterization of Jacobi fields in the absence of conjugate points.

Lemma 7.14 (Index Lemma) Let $\eta : [0, l] \longrightarrow M$ be a geodesic including no conjugate point of η(0), and W be a piecewise $C^1$-vector field along η. Then we have I(W, W) ≥ I(J, J), where J is the Jacobi field along η with J(0) = W(0) and J(l) = W(l). Moreover, equality holds if and only if W = J.

We remark that the absence of conjugate points ensures that such a Jacobi field J indeed exists and is unique (see Exercise 7.15 below).

Proof Applying Lemma 7.13 to W − J, we find that I(W − J, W − J) ≥ 0 and equality holds if and only if W = J. Then the claim follows from
\[
I(W - J, W - J) = I(W, W) + I(J, J) - 2I(J, W) = I(W, W) - I(J, J),
\]
where in the second equality we deduce from the Jacobi equation, Lemma 4.9 and J(t) = W(t) at t = 0, l that
\[
I(J, W) = \frac{1}{F(\dot\eta)} \int_0^l \Big\{ g_{\dot\eta}(D^{\dot\eta}_{\dot\eta} J, D^{\dot\eta}_{\dot\eta} W) + g_{\dot\eta}(D^{\dot\eta}_{\dot\eta} D^{\dot\eta}_{\dot\eta} J, W) \Big\} \,dt
= \frac{1}{F(\dot\eta)} \Big[ g_{\dot\eta}(D^{\dot\eta}_{\dot\eta} J, W) \Big]_0^l = \frac{1}{F(\dot\eta)} \Big[ g_{\dot\eta}(D^{\dot\eta}_{\dot\eta} J, J) \Big]_0^l = I(J, J).
\]

Exercise 7.15 Given a nonconstant geodesic $\eta : [0, l] \longrightarrow M$, show that the following assertions are mutually equivalent:
(a) η(l) is not a conjugate point of η(0) along η.
(b) For any $v \in T_{\eta(0)} M$ and $w \in T_{\eta(l)} M$, there exists a unique Jacobi field J along η satisfying J(0) = v and J(l) = w.

Combining the index lemma with the second variation formula, we obtain the following fundamental property of conjugate points.

Proposition 7.16 (Beyond Conjugate Points) Let $\eta : [0, l] \longrightarrow M$ be a unit speed geodesic. If η(τ) is a conjugate point of η(0) along η for some τ ∈ (0, l), then we have d(η(0), η(t)) < t for all t ∈ (τ, l].

Proof It is sufficient to show d(η(0), η(l)) < l. Choose a nontrivial Jacobi field $J_1$ along $\eta|_{[0,\tau]}$ with $J_1(0) = 0$ and $J_1(\tau) = 0$, and extend it to [0, l] as $J_1 := 0$ on (τ, l]. For small δ > 0 such that $\eta|_{[\tau-\delta,\tau+\delta]}$ contains no pair of conjugate points (precisely, $\eta(s_2)$ is not conjugate to $\eta(s_1)$ for all pairs $(s_1, s_2)$ with $\tau - \delta \le s_1 < s_2 \le \tau + \delta$), we take the Jacobi field $J_2$ along $\eta|_{[\tau-\delta,\tau+\delta]}$ with $J_2(\tau - \delta) = J_1(\tau - \delta)$ and $J_2(\tau + \delta) = 0$. Observe that $J_2(\tau) \neq 0$ since $\eta|_{[\tau-\delta,\tau+\delta]}$ has no pair of conjugate points, and hence $J_2 \neq J_1|_{[\tau-\delta,\tau+\delta]}$.

Now we define a piecewise $C^\infty$-vector field W along η by $W := J_1$ on [0, τ − δ], $W := J_2$ on [τ − δ, τ + δ], and W := 0 on [τ + δ, l]. Notice that W differs from $J_1$ only on [τ − δ, τ + δ]. Then it follows from Lemma 7.14 and $J_2 \neq J_1|_{[\tau-\delta,\tau+\delta]}$ that $I(W, W) < I(J_1, J_1)$. Moreover, we find $I(J_1, J_1) = 0$ by the same calculation as in the proof of Lemma 7.14.

For small ε > 0, consider a variation $\sigma : [0, l] \times (-\varepsilon, \varepsilon) \longrightarrow M$ such that σ(·, 0) = η, σ(0, ·) = η(0), σ(l, ·) = η(l) and $\partial_s \sigma(t, 0) = W(t)$. Set L(s) := L(σ(·, s)), and observe that $L'(0) = 0$ since η is a geodesic (to be precise, we apply Proposition 7.1 on the intervals [0, τ − δ], [τ − δ, τ + δ] and [τ + δ, l]). Moreover, we deduce from Corollary 7.7 that $L''(0) \le I(W, W) < 0$. Therefore, we have L(s) < L(0) for s ≠ 0 sufficiently close to 0, which implies
\[
d\big( \eta(0), \eta(l) \big) \le L(s) < L(0) = l.
\]
This completes the proof.
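The sign change of the index form across a conjugate point can be seen in the most classical Riemannian model. The sketch below is an illustration only, assuming the round unit 2-sphere (flag curvature 1), a unit speed geodesic of length l, and the test field $W(t) = \sin(\pi t / l) E(t)$ with E a unit parallel field orthogonal to the geodesic; in that setting the integrand of (7.1) reduces to $|W'|^2 - |W|^2$. The function name `index_form_sphere` is hypothetical.

```python
import math

def index_form_sphere(l, steps=2000):
    """I(W, W) on the unit 2-sphere for W(t) = sin(pi t / l) E(t), 0 <= t <= l,
    where E is a unit parallel normal field along a unit speed geodesic.
    The integrand is |W'|^2 - K |W|^2 with K = 1; composite Simpson rule
    (steps must be even)."""
    def integrand(t):
        w = math.sin(math.pi * t / l)
        dw = (math.pi / l) * math.cos(math.pi * t / l)
        return dw * dw - w * w

    h = l / steps
    total = integrand(0.0) + integrand(l)
    for k in range(1, steps):
        total += (4 if k % 2 else 2) * integrand(k * h)
    return total * h / 3.0

# The first conjugate point along a unit speed geodesic of the unit sphere is at t = pi.
print(index_form_sphere(3.0))  # l < pi: positive, consistent with Lemma 7.13
print(index_form_sphere(3.5))  # l > pi: negative, so the geodesic is not minimizing
```

In closed form the integral equals $\pi^2/(2l) - l/2$, which changes sign exactly at l = π, matching the statement that a geodesic stops minimizing beyond its first conjugate point.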

 


In particular, along a geodesic, a cut point appears no later than a conjugate point. We close this section with another important fact on what happens at cut points.

Proposition 7.17 (Behavior at Cut Points) Let (M, F) be forward complete, take a unit vector $v \in U_x M$ and put $\eta(t) := \exp_x(tv)$. If ρ(v) < ∞, then either (I) or (II) below holds:
(I) There exist two (or more) distinct minimal geodesics from x to η(ρ(v)).
(II) η(ρ(v)) is the first conjugate point of x along η.

We remark that (I) and (II) could simultaneously hold; a typical (Riemannian) example is a pair of antipodal points in a sphere.

Proof Note first that there is no conjugate point on (0, ρ(v)) by Proposition 7.16. Take a sequence $t_i > \rho(v)$ converging to ρ(v) as i → ∞ and, for each i, choose a minimal geodesic $\xi_i : [0, d(x, \eta(t_i))] \longrightarrow M$ from x to $\eta(t_i)$. Since $t_i > \rho(v)$, we have $d(x, \eta(t_i)) < t_i$ and hence $\dot\xi_i(0) \neq v$. Take a convergent subsequence of $\dot\xi_i(0)$ and denote its limit by $w \in U_x M$.

On one hand, if w ≠ v, then $\xi(t) := \exp_x(tw)$ provides a minimal geodesic from x to η(ρ(v)) different from η. Hence (I) holds. On the other hand, if w = v, then $\exp_x$ is not injective in any neighborhood of ρ(v)v. Therefore, $d(\exp_x)$ degenerates at ρ(v)v and η(ρ(v)) is a conjugate point of x along η. Thus (II) holds.

We remark that the forward completeness ensured the existence of $\xi_i$ as well as the well-definedness of ξ. Another fine property under the forward completeness is the following.

Exercise 7.18 For a forward complete Finsler manifold (M, F) and x ∈ M, prove that the cut locus Cut(x) is a closed set.

One way to prove the above exercise is to realize the cut locus Cut(x) as the boundary of the set
\[
\exp_x\big( \{ tv \in T_x M \mid v \in U_x M, \ 0 \le t < \rho(v) \} \big).
\]
Alternatively, one may make use of the continuity of the function ρ:

Exercise 7.19 Let (M, F) be forward complete and take x ∈ M. Show that the function $\rho : U_x M \longrightarrow (0, \infty]$ defined in Definition 7.10 is continuous.

We refer to [25, Chap. 8] for further discussions on cut points and conjugate points (see also [66, Sect. III.2] for the Riemannian case).

Chapter 8

Some Comparison Theorems

Comparison theorems are the main subjects of this book. They are concerned with quantitative or qualitative properties of a space with a certain condition on its curvature. In this book, we will consider Finsler manifolds whose flag or (weighted) Ricci curvature is bounded from below or above by a constant. This chapter is devoted to some fundamental examples of geometric comparison theorems. The first two of them (the Bonnet–Myers and Cartan–Hadamard theorems) are verbatim analogues of the Riemannian counterparts. Then we study the convexity and concavity of the distance function using some non-Riemannian quantities besides the flag curvature. We refer to [66] for the Riemannian case.

© The Author(s), under exclusive license to Springer Nature Switzerland AG 2021 S.-i. Ohta, Comparison Finsler Geometry, Springer Monographs in Mathematics, https://doi.org/10.1007/978-3-030-80650-7_8

8.1 The Bonnet–Myers Theorem

The Riemannian characterization of the Finsler curvature as in Theorem 5.12 has some straightforward applications in comparison geometry. We discuss two such applications in this and the next sections. Although they can be essentially reduced to the Riemannian case, we give detailed proofs for the sake of self-containedness.

We will say that Ric ≥ K holds for some constant $K \in \mathbb{R}$ if $\mathrm{Ric}(v) \ge K F^2(v)$ holds for all v ∈ TM (or, equivalently, Ric(v) ≥ K for all unit vectors v ∈ UM). The diameter of M is defined by diam(M) := sup{d(x, y) | x, y ∈ M}.

Theorem 8.1 (Bonnet–Myers Theorem (Unweighted)) Assume that (M, F) is forward or backward complete and satisfies Ric ≥ K for some K > 0. Then we have
\[
\mathrm{diam}(M) \le \pi \sqrt{\frac{n-1}{K}}.
\]

In particular, M is compact and has finite fundamental group.

Proof We shall prove by contradiction. Assume that there are x, y ∈ M with $l := d(x, y) > \pi \sqrt{(n-1)/K}$. By the Hopf–Rinow theorem (Theorem 3.21), there exists a unit vector $v \in U_x M$ such that $\eta(t) := \exp_x(tv)$ is a minimal geodesic from x to y = η(l). We follow the same lines as the Riemannian case to see that there necessarily exists a conjugate point $\eta(t_0)$ of η(0) for some $t_0 \in (0, \pi \sqrt{(n-1)/K}]$.

Choose an orthonormal basis $\{e_i\}_{i=1}^{n-1} \cup \{v\}$ of $(T_x M, g_v)$, and consider the vector fields
\[
E_i(t) := d(\exp_x)_{tv}(t e_i) \in T_{\eta(t)} M
\]
along η for i = 1, 2, . . . , n − 1. Note that $E_i$ is the variational vector field of the geodesic variation
\[
\sigma_i(t, s) := \exp_x\big( t(v + s e_i) \big), \qquad (t, s) \in [0, l) \times (-\varepsilon, \varepsilon).
\]
Hence $E_i$ is a Jacobi field (for both F and $g_{\dot\eta}$; recall Proposition 4.6), and we deduce from Gauss’ lemma (Lemma 3.18) that $g_{\dot\eta}(\dot\eta, E_i) = 0$ for all i. Then, since l ≤ ρ(v), we find that $\{E_i(t)\}_{i=1}^{n-1}$ is a basis of the orthogonal complement of $\dot\eta(t)$ in $(T_{\eta(t)} M, g_{\dot\eta})$ for every t ∈ (0, l) (similarly to the proof of Lemma 7.13).

For brevity, we will denote by $E_i'$ the covariant derivative $D^{\dot\eta}_{\dot\eta} E_i$ along η. Observe from $g_{\dot\eta}(\dot\eta, E_i) = 0$ and Lemma 4.9 that $g_{\dot\eta}(\dot\eta, E_i') = 0$. We define (n − 1) × (n − 1)-matrices $A(t) = (a_{ij}(t))$ and $B(t) = (b_{ij}(t))$ for t ∈ (0, l) by
\[
a_{ij}(t) = g_{\dot\eta}\big( E_i(t), E_j(t) \big), \qquad E_i'(t) = \sum_{j=1}^{n-1} b_{ij}(t) E_j(t).
\]
We summarize some fundamental properties of A and B in the next claim.

Claim 8.2
(i) We have $BA = AB^T$ and $A' = 2BA$, where $B^T$ denotes the transpose matrix of B.
(ii) The matrix $A^{-1/2} B A^{1/2}$ is symmetric. Here we set
\[
A^{1/2} := P^{-1} \begin{pmatrix} \sqrt{a_1} & & 0 \\ & \ddots & \\ 0 & & \sqrt{a_{n-1}} \end{pmatrix} P
\]
and $A^{-1/2} := (A^{1/2})^{-1}$, where $a_1, \ldots, a_{n-1} > 0$ are the eigenvalues of A and P is an orthogonal matrix which diagonalizes A as
\[
A = P^{-1} \begin{pmatrix} a_1 & & 0 \\ & \ddots & \\ 0 & & a_{n-1} \end{pmatrix} P.
\]
(iii) We have the Riccati equation
\[
B' + B^2 + R A^{-1} = 0, \tag{8.1}
\]
where the matrix $R(t) = (R_{ij}(t))$ is defined by
\[
R_{ij}(t) := g_{\dot\eta}\big( R_{\dot\eta}(E_i), E_j \big)(t) = g_{\dot\eta}\big( E_i, R_{\dot\eta}(E_j) \big)(t).
\]

Proof (i) Observe from Lemma 4.9 that $A' = BA + AB^T$. Moreover, it follows from the Jacobi equation (5.3) and Lemma 5.8 that
\[
\frac{d}{dt} \Big[ g_{\dot\eta}(E_i', E_j) - g_{\dot\eta}(E_i, E_j') \Big] = g_{\dot\eta}(E_i'', E_j) - g_{\dot\eta}(E_i, E_j'')
= -g_{\dot\eta}\big( R_{\dot\eta}(E_i), E_j \big) + g_{\dot\eta}\big( E_i, R_{\dot\eta}(E_j) \big) = 0.
\]
Combining this with $E_i(0) = E_j(0) = 0$, we find that $g_{\dot\eta}(E_i', E_j) = g_{\dot\eta}(E_i, E_j')$ (similarly to the proof of Lemma 7.13), which yields $BA = AB^T$. Thus we have $A' = 2BA$.

(ii) It follows from $BA = AB^T$ that BA is symmetric. Since A is symmetric, we find that $A^{-1/2}(BA)A^{-1/2} = A^{-1/2} B A^{1/2}$ is also symmetric.

(iii) On one hand, we deduce from
\[
a_{ij}''(t) = -g_{\dot\eta}\big( R_{\dot\eta}(E_i), E_j \big) - g_{\dot\eta}\big( E_i, R_{\dot\eta}(E_j) \big) + 2g_{\dot\eta}(E_i', E_j'),
\]
Lemma 5.8 and (i) that
\[
A'' = -2R + 2BAB^T = -2R + 2B^2 A.
\]
On the other hand, it follows from (i) that
\[
A'' = 2B'A + 2BA' = 2B'A + 4B^2 A.
\]
Comparing these expressions of $A''$, we obtain the claim.

 

Now, we consider the function $h(t) := (\det[A(t)])^{1/(2(n-1))}$ (which can be regarded as the area form of the sphere $\partial B^+(x, t)$ at η(t) with respect to $g_{\dot\eta}$). Observe from $A' = 2BA$ that
\[
(n-1) h' = \frac{h}{2} (\det A)^{-1} (\det A)' = \frac{h}{2} \operatorname{trace}(A' A^{-1}) = h \operatorname{trace}(B).
\]
Moreover, by (8.1), we have
\[
(n-1) h'' = h' \operatorname{trace}(B) + h \operatorname{trace}(B') = \frac{h \operatorname{trace}(B)^2}{n-1} - h \operatorname{trace}(B^2) - h \operatorname{trace}(R A^{-1}).
\]
Since $A^{-1/2} B A^{1/2}$ is symmetric, we deduce from the Cauchy–Schwarz inequality that
\[
\operatorname{trace}(B)^2 = \operatorname{trace}(A^{-1/2} B A^{1/2})^2 \le (n-1) \operatorname{trace}(A^{-1/2} B^2 A^{1/2}) = (n-1) \operatorname{trace}(B^2).
\]
Furthermore, it follows from the definitions of A and R that
\[
\operatorname{trace}(R A^{-1}) = \operatorname{trace}(A^{-1/2} R A^{-1/2}) = \mathrm{Ric}(\dot\eta).
\]
Thus we obtain a Finsler analogue of the Bishop inequality asserting
\[
h''(t) \le -\frac{\mathrm{Ric}(\dot\eta(t))}{n-1} h(t). \tag{8.2}
\]
If Ric ≥ K > 0, then (8.2) yields
\[
h''(t) \le -\frac{K}{n-1} h(t) \tag{8.3}
\]
for t ∈ (0, l). We compare this differential inequality with
\[
\mathbf{s}''(t) = -\frac{K}{n-1} \mathbf{s}(t), \qquad \text{where} \quad \mathbf{s}(t) := \sin\left( \sqrt{\frac{K}{n-1}} \, t \right).
\]

Put $f := h/\mathbf{s}$ on $(0, \pi \sqrt{(n-1)/K})$. Observe that $f' = (h' \mathbf{s} - h \mathbf{s}')/\mathbf{s}^2$ and $(h' \mathbf{s} - h \mathbf{s}')' = h'' \mathbf{s} - h \mathbf{s}'' \le 0$ by (8.3). Moreover, we have h(t) = O(t) as t → 0 and $\lim_{t \to 0} h'(t) = h'(0) = 1$ (we leave $h'(0) = 1$ as an exercise below). Combining these, we find that f is non-increasing. Hence $h(t) \le (h(\varepsilon)/\mathbf{s}(\varepsilon)) \cdot \mathbf{s}(t)$ holds for t > ε > 0. However, since $\mathbf{s}(\pi \sqrt{(n-1)/K}) = 0$, this implies that $h(t_0) = 0$ necessarily holds at some $t_0 \in (0, \pi \sqrt{(n-1)/K}]$. Then $\eta(t_0)$ is a conjugate point of η(0), and η is no longer minimizing beyond $t_0$ (recall Proposition 7.16). This contradicts the choice of v, therefore we conclude that $\mathrm{diam}(M) \le \pi \sqrt{(n-1)/K}$.

The compactness of M follows from the Hopf–Rinow theorem (Theorem 3.21). To show the assertion on the fundamental group, let us consider the universal cover $\widetilde{M}$ of M equipped with the induced Finsler structure $\widetilde{F}$. Then $\widetilde{F}$ again satisfies Ric ≥ K, and $(\widetilde{M}, \widetilde{F})$ is forward (and backward) complete since geodesics in M can be lifted to $\widetilde{M}$. Therefore, $\widetilde{M}$ is also compact, and hence the fundamental group of M is finite.

Exercise 8.3 Show that $h'(0) = 1$ holds in the above proof.

The Bonnet–Myers theorem for the weighted Ricci curvature will be discussed in Sect. 9.4. This is the reason why we called Theorem 8.1 the unweighted one.

Remark 8.4 (Bishop Inequality) What is commonly called the Bishop inequality (or Bishop’s comparison theorem) in Riemannian geometry is an upper bound of the volume of a ball by that in a model space of constant curvature (see [36, Sect. 11.10] and, e.g., [66, Theorem III.4.4]). A generalization to a Berwald space equipped with the Busemann–Hausdorff measure will be given in Corollary 10.3. Since the inequality (8.2) is an essential step from which the volume comparison can be deduced rather straightforwardly, we call the inequality (8.2) (and its generalizations) a Bishop inequality.

Remark 8.5 (Reversing Stability) In the proof of Theorem 8.1, we used the Hopf–Rinow theorem for finding a minimal geodesic from x to y and for showing that the finiteness of the diameter implies the compactness. For these assertions, it is sufficient to assume either the forward completeness or the backward completeness. Moreover, if we consider the reverse Finsler structure $\overleftarrow{F}$, then we have $\overleftarrow{\mathrm{Ric}}(v) = \mathrm{Ric}(-v)$ (see Sect. 2.5) and hence the curvature bound $\mathrm{Ric}(v) \ge K F^2(v)$ is equivalent to $\overleftarrow{\mathrm{Ric}}(v) \ge K \overleftarrow{F}^2(v)$. Therefore, Theorem 8.1 is stable when we take the reverse structure.

Finally we explain how to reduce Theorem 8.1 to the Riemannian case. Thanks to Remark 8.5 above, we may assume the forward completeness. Fix arbitrary x ∈ M and define a vector field V on M \ ({x} ∪ Cut(x)) by $V(\eta_v(t)) := \dot\eta_v(t)$ for t ∈ (0, ρ(v)), along each unit speed geodesic $\eta_v : [0, \rho(v)) \longrightarrow M$ with $\dot\eta_v(0) = v \in U_x M$. This vector field V is indeed well-defined, and all of its integral curves are geodesic. Therefore, $g_V$ enjoys $\mathrm{Ric}_{g_V}(V) \ge K$ by Theorem 5.12, and the Riemannian Bonnet–Myers theorem asserts that $\rho(v) \le \pi \sqrt{(n-1)/K}$ for all $v \in U_x M$. Hence we obtain $\mathrm{diam}(M) \le \pi \sqrt{(n-1)/K}$.
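The quantities appearing in the proof of Theorem 8.1 can be made explicit in the equality case. The following sketch assumes the round unit sphere $S^n$ (a Riemannian special case, not a general Finsler manifold), where $E_i(t) = \sin(t)$ times a parallel unit field, so that A, B and R are the scalar multiples $\sin^2 t$, $\cot t$ and $\sin^2 t$ of the identity and $h(t) = \sin t$. It evaluates the residual of the Riccati equation (8.1) numerically and the diameter bound $\pi \sqrt{(n-1)/K}$; the function names are illustrative only.

```python
import math

def bonnet_myers_bound(n, K):
    """Diameter bound pi * sqrt((n - 1) / K) of Theorem 8.1 (requires K > 0)."""
    return math.pi * math.sqrt((n - 1) / K)

def sphere_riccati_residual(t, dt=1e-5):
    """Residual of B' + B^2 + R A^{-1} on the unit sphere, where
    A = sin(t)^2 I, B = cot(t) I and R = sin(t)^2 I, so it suffices to
    work with the scalar factors. B' is taken by a central difference."""
    def b(u):
        return math.cos(u) / math.sin(u)

    a = math.sin(t) ** 2
    r = math.sin(t) ** 2
    b_prime = (b(t + dt) - b(t - dt)) / (2 * dt)
    return b_prime + b(t) ** 2 + r / a

def sphere_bishop_gap(t, n):
    """h'' + (K / (n - 1)) h for h = sin and K = n - 1; zero in the equality case."""
    K = n - 1
    h, h_second = math.sin(t), -math.sin(t)
    return h_second + (K / (n - 1)) * h

print(sphere_riccati_residual(0.7))   # close to 0
print(bonnet_myers_bound(3, 2.0))     # pi: the diameter of the unit 3-sphere
```

On the unit sphere Ric = n − 1, so the bound is attained; this is the sense in which the comparison function $\mathbf{s}$ in the proof is the model case of (8.3).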

8.2 The Cartan–Hadamard Theorem

By analyzing Jacobi fields again, we can show another fundamental comparison theorem, this time under the nonpositive flag curvature.

Theorem 8.6 (Cartan–Hadamard Theorem) Assume that (M, F) is forward complete and satisfies K ≤ 0. Then we have the following:
(i) No geodesic has conjugate points.
(ii) For any x ∈ M, the exponential map $\exp_x : T_x M \to M$ is a C¹-covering.
(iii) If M is simply connected, then $\exp_x : T_x M \to M$ is a C¹-diffeomorphism for every x ∈ M, and its restriction $\exp_x : T_x M \setminus \{0\} \to M \setminus \{x\}$ is a C∞-diffeomorphism.

Proof (i) For any Jacobi field J along a geodesic η : [0, ∞) → M, we deduce from Lemma 4.9 and the hypothesis K ≤ 0 that
\[
  \frac{d^2}{dt^2}\bigl[g_{\dot\eta}(J, J)\bigr]
  = 2g_{\dot\eta}\bigl(D^{\dot\eta}_{\dot\eta} J, D^{\dot\eta}_{\dot\eta} J\bigr)
  + 2g_{\dot\eta}\bigl(D^{\dot\eta}_{\dot\eta} D^{\dot\eta}_{\dot\eta} J, J\bigr)
  \ge -2g_{\dot\eta}\bigl(R_{\dot\eta}(J), J\bigr) \ge 0.
\]
Hence $g_{\dot\eta}(J, J)$ is a nonnegative convex function. This implies that, if J(0) = 0 and J(l) = 0 for some l > 0, then we have J = 0 on [0, l]. Therefore, there is no conjugate point along η.

(ii) It follows from (i) that $\exp_x$ does not have critical points. Hence $\exp_x$ is C¹ and locally diffeomorphic on $T_x M$, and C∞ on $T_x M \setminus \{0\}$. Moreover, one can define a Finsler metric $\widetilde{F}$ on $T_x M$ as the pull-back of F by $\exp_x$, i.e.,
\[
  \widetilde{F}(w) := F\bigl(d(\exp_x)_v(w)\bigr)
  \quad \text{for } w \in T_v(T_x M),\ v \in T_x M.
\]
By definition, we have
\[
  d_{\widetilde{F}}(v_1, v_2) \ge d_F\bigl(\exp_x(v_1), \exp_x(v_2)\bigr) \tag{8.4}
\]
for all $v_1, v_2 \in T_x M$ (see Exercise 8.7 below).

Given y ∈ M \ {x}, take small r > 0 such that, for any $z_1, z_2 \in B_F^{\pm}(y, r) := B_F^{+}(y, r) \cap B_F^{-}(y, r)$, there is a unique minimal geodesic from $z_1$ to $z_2$ and there is no other geodesic from $z_1$ to $z_2$ included in $B_F^{+}(y, 3r)$. Notice that $d_F(z_1, z_2) \le d_F(z_1, y) + d_F(y, z_2) < 2r$ and hence the minimal geodesic from $z_1$ to $z_2$ is indeed included in $B_F^{+}(y, 3r)$.

Now, given $v \in \exp_x^{-1}(y)$, we shall show that the restriction $\exp_x : B_{\widetilde{F}}^{\pm}(v, r) \to B_F^{\pm}(y, r)$ is a diffeomorphism. First, for $z \in B_F^{\pm}(y, r)$, take a minimal geodesic η : [0, 1] → M from y to z. We can lift η to the geodesic $\tilde\eta : [0, 1] \to T_x M$ with $\tilde\eta(0) = v$, then we have $\tilde\eta(1) \in B_{\widetilde{F}}^{\pm}(v, r)$ and $\exp_x(\tilde\eta(1)) = \eta(1) = z$. Thus we obtain the surjectivity. To see the injectivity, take distinct points $w_1, w_2 \in B_{\widetilde{F}}^{\pm}(v, r)$ and let $\xi : [0, 1] \to T_x M$ be a minimal geodesic from $w_1$ to $w_2$ with respect to $\widetilde{F}$. Since ξ is included in $B_{\widetilde{F}}^{+}(v, 3r)$, we deduce from (8.4) that $\eta := \exp_x \circ \xi$ is a geodesic included in $B_F^{+}(y, 3r)$. Then η is the unique minimal geodesic from $z_1$ to $z_2$, where $z_i := \exp_x(w_i)$, by the choice of r. Therefore, we find $z_1 \ne z_2$, which yields the injectivity and hence $\exp_x : B_{\widetilde{F}}^{\pm}(v, r) \to B_F^{\pm}(y, r)$ is a diffeomorphism as desired. Finally, since r depends only on y (independent of the choice of $v \in \exp_x^{-1}(y)$), $\exp_x$ is a covering.

(iii) This is straightforward from (ii).


Exercise 8.7 (Pull-back Finsler Structures) Let Φ : N → M be a C¹ map between n-dimensional manifolds without critical points. Assume that M is equipped with a Finsler structure F, and denote by $\widetilde{F} := F \circ d\Phi : TN \to [0, \infty)$ the pull-back Finsler structure of F by Φ. This is a continuous Finsler structure of N (here we do not need a higher regularity).
(1) Prove that $\Phi : (N, \widetilde{F}) \to (M, F)$ is a local isometry in the sense that, for any p ∈ N, there exists r > 0 such that we have $d_{\widetilde{F}}(q_1, q_2) = d_F(\Phi(q_1), \Phi(q_2))$ for all $q_1, q_2 \in B_{\widetilde{F}}^{+}(p, r)$. In particular, for any geodesic ξ in N (in the sense of Definition 2.17), Φ ∘ ξ is a geodesic in M of the same speed as ξ.
(2) Show that $d_F(\Phi(p), \Phi(q)) \le d_{\widetilde{F}}(p, q)$ holds for all p, q ∈ N.

Exercise 8.8 In the situation of Theorem 8.6(iii), prove that every pair of points x, y ∈ M is connected by a unique (minimal) geodesic from x to y (up to a reparametrization).

8.3 Uniform Convexity and Smoothness

In this and the next sections, we consider several comparison theorems involving the flag curvature and some non-Riemannian quantities. The necessity of these non-Riemannian concepts will explain the difference between Riemannian and Finsler geometries. The discussion in this section is based on [192] to a large extent; similar results can be found in [230, Chap. 15] as well. In [192] we applied these comparison theorems to obtain the almost everywhere second order differentiability (in the sense of Alexandrov) of semi-convex functions and (d²/2)-concave functions. Such a regularity plays a role in optimal transport theory on Finsler manifolds (see [193] and Sect. 18.1).

8.3.1 Background: k-Convexity and k-Concavity

Recall from Subsect. 1.2.1 that a (Minkowski) normed space $(\mathbb{R}^n, \|\cdot\|)$ is said to be 2-uniformly convex if there is a constant C ≥ 1 such that
\[
  \Bigl\| \frac{v + w}{2} \Bigr\|^2 \le \frac{1}{2}\|v\|^2 + \frac{1}{2}\|w\|^2 - \frac{1}{4C}\|w - v\|^2
\]
for all v, w ∈ Rⁿ, and 2-uniformly smooth if there is S ≥ 1 such that
\[
  \Bigl\| \frac{v + w}{2} \Bigr\|^2 \ge \frac{1}{2}\|v\|^2 + \frac{1}{2}\|w\|^2 - \frac{S}{4}\|w - v\|^2
\]
for all v, w ∈ Rⁿ. Replacing the norm with a distance function, one can consider some generalizations (nonlinearizations) of them on metric spaces. The most successful (and natural) way of generalizing the 2-uniform convexity is as follows.

Let (X, d) be a metric space (possibly with an asymmetric distance function). Geodesics and minimal geodesics in (X, d) can be defined in the same manner as Definition 2.17. Then, we say that (X, d) is a geodesic space if any pair of points x, y ∈ X can be joined by a minimal geodesic from x to y. For instance, a forward or backward complete Finsler manifold is a geodesic space by the Hopf–Rinow theorem (Theorem 3.21).

Definition 8.9 (k-Convex Metric Spaces) A geodesic space (X, d) is said to be k-convex for k ∈ (0, 1] if, for any three points x, y, z ∈ X and any minimal geodesic η : [0, 1] → X from y to z, we have
\[
  d^2\bigl(x, \eta(t)\bigr) \le (1 - t)d^2(x, y) + t\,d^2(x, z) - k(1 - t)t\,d^2(y, z) \tag{8.5}
\]

for all t ∈ [0, 1].

The inequality (8.5) means that the squared distance function $f(t) := d^2(x, \eta(t))$ along η satisfies $f'' \ge 2k\,d^2(y, z)$ in the weak sense. The following celebrated theorem of Alexandrov reveals the tight connection between the k-convexity and upper sectional curvature bounds for Riemannian manifolds (see, e.g., [46, Chap. II.1]).

Theorem 8.10 (Triangle Comparison Theorem I) A complete, simply connected Riemannian manifold (M, g) is 1-convex in the sense of (8.5) if and only if it has the nonpositive sectional curvature (i.e., (M, g) is an Hadamard manifold).

The implication from the nonpositive curvature to the 1-convexity will be generalized to the Finsler setting in Theorem 8.23 by using some non-Riemannian quantities. Theorem 8.10 was an origin of the theory of 1-convex metric spaces, called CAT(0)-spaces after Cartan, Alexandrov, and Toponogov, regarded as metric spaces of nonpositive sectional curvature in a synthetic geometric sense (0 in CAT(0) means its upper curvature bound). This viewpoint turned out to be quite powerful and useful. There have been fruitful and diverse applications in, for instance, harmonic maps, geometric group theory, and optimization (we refer to the books [17, 46, 51, 112, 130] among others; see also [189] for a related study in the case of k ∈ (0, 1)).

We next consider the reverse inequality to (8.5).

Definition 8.11 (k-Concave Metric Spaces) We define the k-concavity for k ∈ [1, ∞) by reversing the inequality (8.5), namely

\[
  d^2\bigl(x, \eta(t)\bigr) \ge (1 - t)d^2(x, y) + t\,d^2(x, z) - k(1 - t)t\,d^2(y, z) \tag{8.6}
\]

for all x, y, z ∈ X, η : [0, 1] −→ X and t ∈ [0, 1] as in Definition 8.9. Exercise 8.12 Show that, if X contains more than one point, then one cannot take k > 1 in (8.5) and k < 1 in (8.6). A counterpart to Theorem 8.10 for a lower curvature bound, due to Alexandrov and Toponogov, is given as follows (see [66, Sect. IX.5]). Theorem 8.13 (Triangle Comparison Theorem II) A complete Riemannian manifold (M, g) is 1-concave in the sense of (8.6) if and only if it has the nonnegative sectional curvature. We remark that (M, g) is not necessarily simply connected in this case. Similarly to CAT(0)-spaces, 1-concave metric spaces have been investigated from various viewpoints, and they are called Alexandrov spaces of nonnegative curvature (see, e.g., [51]). Under lower curvature bounds, we have more results of analytic flavor. In many of those results, however, the lower Ricci curvature bound is in fact a more essential condition. We will discuss such analytic comparison theorems under lower weighted Ricci curvature bounds in Part II.
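To see why k = 1 is the natural threshold in Definitions 8.9 and 8.11, it may help to recall the Euclidean case (a standard computation, added here only as an illustration). For x, y, z ∈ Rⁿ with the Euclidean norm |·| and η(t) := (1 − t)y + tz, expanding the square and using $2\langle x - y, x - z\rangle = |x - y|^2 + |x - z|^2 - |y - z|^2$ gives equality in both (8.5) and (8.6) with k = 1:

```latex
\begin{align*}
  |x - \eta(t)|^2
  &= \bigl|(1-t)(x - y) + t(x - z)\bigr|^2 \\
  &= (1-t)^2|x - y|^2 + t^2|x - z|^2 + 2(1-t)t\,\langle x - y, x - z\rangle \\
  &= (1-t)|x - y|^2 + t|x - z|^2 - (1-t)t\,|y - z|^2 .
\end{align*}
```

In this sense the constants k < 1 in (8.5) and k > 1 in (8.6) quantify the deviation from the Euclidean identity.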

8.3.2 Uniform Convexity and Smoothness Constants

Coming back to our Finsler world, we observe from the characterization of inner product spaces by the sharp uniform convexity or smoothness in Proposition 1.6 that no non-Riemannian Finsler manifold can satisfy the 1-convexity or 1-concavity. Indeed, if (M, F) is 1-convex (resp. 1-concave), then each of its tangent spaces is 1-convex (resp. 1-concave), thus (M, F) is necessarily a Riemannian manifold by Proposition 1.6.

Now, our aim will be to give some estimates of k in the k-convexity and k-concavity, in terms of the flag curvature and some (natural) non-Riemannian concepts. To this end, we define the following quantities which measure the fiber-wise uniform convexity and smoothness of F.

Definition 8.14 (Uniform Convexity and Smoothness Constants) We define the uniform convexity constant of (M, F) by
\[
  C_F := \sup_{x \in M}\ \sup_{v, w \in T_x M \setminus \{0\}} \frac{F^2(w)}{g_v(w, w)} \in [1, \infty],
\]
and the uniform smoothness constant of (M, F) by
\[
  S_F := \sup_{x \in M}\ \sup_{v, w \in T_x M \setminus \{0\}} \frac{g_v(w, w)}{F^2(w)} \in [1, \infty].
\]

Since $g_v(w, w) \ge C_F^{-1} F^2(w)$ and $g_v$ is the Hessian of $F^2/2$ at v, the constant $C_F$ indeed measures the fiber-wise convexity of $F^2$. Moreover, $C_F$ coincides with the least constant C ≥ 1 for which we have
\[
  F^2\Bigl(\frac{v + w}{2}\Bigr) \le \frac{F^2(v)}{2} + \frac{F^2(w)}{2} - \frac{1}{4C}F^2(w - v) \tag{8.7}
\]
for all $v, w \in T_x M$ and x ∈ M. Similarly, $S_F$ is the least constant S ≥ 1 such that
\[
  F^2\Bigl(\frac{v + w}{2}\Bigr) \ge \frac{F^2(v)}{2} + \frac{F^2(w)}{2} - \frac{S}{4}F^2(w - v) \tag{8.8}
\]

holds for all $v, w \in T_x M$ and x ∈ M.

Exercise 8.15 Show the above relation between $C_F$ (resp. $S_F$) and the uniform convexity (8.7) (resp. uniform smoothness (8.8)). Observe that the definition of the fundamental tensor (3.1) yields
\[
  \frac{d^2}{dt^2}\Bigl[F^2\bigl((1 - t)v + tw\bigr)\Bigr] = 2g_{(1-t)v + tw}(w - v, w - v).
\]

It follows from Proposition 1.6 that the following are equivalent:
• (M, F) is a Riemannian manifold;
• $C_F = 1$;
• $S_F = 1$.

For later use, we introduce dual expressions of $C_F$ and $S_F$ in the next lemma. Recall Sect. 3.2 for the dual norm F* and the Legendre transformations L, L*. We define an inner product $g^*_\alpha$ of $T_x^* M$ for $\alpha \in T_x^* M \setminus \{0\}$ in the same way as $g_v$ in (3.2), i.e.,
\[
  g^*_\alpha\Bigl(\sum_{i=1}^n a_i\,dx^i,\ \sum_{j=1}^n b_j\,dx^j\Bigr) := \sum_{i,j=1}^n a_i b_j\,g^*_{ij}(\alpha). \tag{8.9}
\]
This is indeed the dual metric to $g_{L^*(\alpha)}$.

Lemma 8.16 (Dual Expressions of $C_F$ and $S_F$) For any x ∈ M and $v \in T_x M \setminus \{0\}$, we have
\[
  \sup_{w \in T_x M \setminus \{0\}} \frac{F^2(w)}{g_v(w, w)} = \sup_{\beta \in T_x^* M \setminus \{0\}} \frac{g^*_\alpha(\beta, \beta)}{F^*(\beta)^2},
  \qquad
  \sup_{w \in T_x M \setminus \{0\}} \frac{g_v(w, w)}{F^2(w)} = \sup_{\beta \in T_x^* M \setminus \{0\}} \frac{F^*(\beta)^2}{g^*_\alpha(\beta, \beta)},
\]
where we put α := L(v). In particular, we have
\[
  C_F = \sup_{x \in M}\ \sup_{\alpha, \beta \in T_x^* M \setminus \{0\}} \frac{g^*_\alpha(\beta, \beta)}{F^*(\beta)^2},
  \qquad
  S_F = \sup_{x \in M}\ \sup_{\alpha, \beta \in T_x^* M \setminus \{0\}} \frac{F^*(\beta)^2}{g^*_\alpha(\beta, \beta)}.
\]

Proof We prove only the claim for $C_F$; that for $S_F$ is seen in the same way. Choose a local coordinate system $(x^i)_{i=1}^n$ around x such that $g_{ij}(v) = \delta_{ij}$, and set
\[
  S_x := \Bigl\{ w = \sum_{i=1}^n w^i \frac{\partial}{\partial x^i} \in T_x M \,\Big|\, \sum_{i=1}^n (w^i)^2 = 1 \Bigr\},
  \qquad
  S_x^* := \Bigl\{ \beta = \sum_{i=1}^n \beta_i\,dx^i \in T_x^* M \,\Big|\, \sum_{i=1}^n (\beta_i)^2 = 1 \Bigr\}.
\]
Observe that $S_x$ is the unit sphere of $g_v$, and $S_x^*$ is the unit sphere of $g^*_\alpha$, where α = L(v). Given $w \in S_x$, on one hand, we can choose $\beta_1 \in S_x^*$ such that $\beta_1(w) = 1$. Then we have $1 = \beta_1(w) \le F^*(\beta_1)F(w)$, and hence
\[
  \frac{g^*_\alpha(\beta_1, \beta_1)}{F^*(\beta_1)^2} = \frac{1}{F^*(\beta_1)^2} \le F^2(w) = \frac{F^2(w)}{g_v(w, w)}.
\]
On the other hand, if we take $\beta_2 \in S_x^*$ satisfying $\beta_2(w) = F^*(\beta_2)F(w)$, then we find $F^*(\beta_2)F(w) = \beta_2(w) \le 1$. This implies
\[
  \frac{F^2(w)}{g_v(w, w)} = F^2(w) \le \frac{1}{F^*(\beta_2)^2} = \frac{g^*_\alpha(\beta_2, \beta_2)}{F^*(\beta_2)^2}.
\]
Combining these completes the proof.

Combining the above lemma with Lemma 3.8, we can estimate the Lipschitz constant of the Legendre transformation on each tangent space. This estimate will be used in Sect. 13.2, and seems to have more applications.

Proposition 8.17 (Lipschitz Constants of Legendre Transformations) Fix x ∈ M. We have
\[
  F\bigl(L^*(\beta) - L^*(\alpha)\bigr) \le C_x^{3/2} S_x \cdot F^*(\beta - \alpha)
\]
for any $\alpha, \beta \in T_x^* M$, where we set
\[
  C_x := \sup_{v, w \in T_x M \setminus \{0\}} \frac{F^2(w)}{g_v(w, w)},
  \qquad
  S_x := \sup_{v, w \in T_x M \setminus \{0\}} \frac{g_v(w, w)}{F^2(w)}.
\]
Similarly, it holds that
\[
  F^*\bigl(L(w) - L(v)\bigr) \le S_x^{3/2} C_x \cdot F(w - v)
\]

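for any $v, w \in T_x M$.

Proof We assume α ≠ β without loss of generality. Take a local coordinate system around x such that $g^*_{ij}(\beta - \alpha) = \delta_{ij}$ (and hence $g_{ij}(L^*(\beta - \alpha)) = \delta_{ij}$). Then we deduce from Lemma 3.8 that
\[
  F\bigl(L^*(\beta) - L^*(\alpha)\bigr)
  \le \sqrt{C_x}\,\Bigl( \sum_{i=1}^n \bigl\{ \bigl(L^*(\beta) - L^*(\alpha)\bigr)^i \bigr\}^2 \Bigr)^{1/2}
  = \frac{\sqrt{C_x}}{2}\,\Bigl( \sum_{i=1}^n \Bigl\{ \frac{\partial[(F^*)^2]}{\partial\alpha_i}(\beta) - \frac{\partial[(F^*)^2]}{\partial\alpha_i}(\alpha) \Bigr\}^2 \Bigr)^{1/2}.
\]
For each i, it follows from Hölder's inequality that
\[
  \Bigl\{ \frac{\partial[(F^*)^2]}{\partial\alpha_i}(\beta) - \frac{\partial[(F^*)^2]}{\partial\alpha_i}(\alpha) \Bigr\}^2
  = \Bigl( \int_0^1 \sum_{j=1}^n \frac{\partial^2[(F^*)^2]}{\partial\alpha_j\,\partial\alpha_i}\bigl((1 - t)\alpha + t\beta\bigr)(\beta_j - \alpha_j)\,dt \Bigr)^2
  \le \int_0^1 \Bigl( \sum_{j=1}^n \frac{\partial^2[(F^*)^2]}{\partial\alpha_j\,\partial\alpha_i}\bigl((1 - t)\alpha + t\beta\bigr)(\beta_j - \alpha_j) \Bigr)^2 dt.
\]
Now, observe that the largest eigenvalue of the matrix $(g^*_{ij}(\omega))$ is bounded above by $C_x S_x$ for every $\omega \in T_x^* M \setminus \{0\}$, since $g^*_\omega(\omega', \omega') \le C_x F^*(\omega')^2 \le C_x S_x\,g^*_{\beta - \alpha}(\omega', \omega')$ for any $\omega' \in T_x^* M$ by Lemma 8.16. Therefore, we obtain
\[
  \sum_{i=1}^n \int_0^1 \Bigl( \sum_{j=1}^n \frac{\partial^2[(F^*)^2]}{\partial\alpha_j\,\partial\alpha_i}\bigl((1 - t)\alpha + t\beta\bigr)(\beta_j - \alpha_j) \Bigr)^2 dt
  \le 4C_x^2 S_x^2 \sum_{j=1}^n (\beta_j - \alpha_j)^2
  = 4C_x^2 S_x^2\,F^*(\beta - \alpha)^2.
\]
This completes the proof of the first assertion. The second assertion can be obtained in the same way.

Finally we remark that $C_F$ and $S_F$ also give an upper bound of the reversibility constant $\Lambda_F$ defined in (3.17).

Lemma 8.18 (Uniform Convexity/Smoothness Versus Reversibility) We have
\[
  \Lambda_F \le \min\bigl\{ \sqrt{C_F}, \sqrt{S_F} \bigr\}.
\]
Proof For any v ∈ TM \ 0, observe that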

\[
  \frac{F^2(-v)}{F^2(v)} = \frac{F^2(-v)}{g_v(v, v)} = \frac{F^2(-v)}{g_v(-v, -v)} \le C_F.
\]
Similarly, we find
\[
  \frac{F^2(-v)}{F^2(v)} = \frac{g_{-v}(-v, -v)}{F^2(v)} = \frac{g_{-v}(v, v)}{F^2(v)} \le S_F.
\]
Taking the supremum in v shows the claim.
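As an illustration of Lemma 8.18 (an example added here; the choice of a Randers-type norm is an assumption and is not taken from this section), consider on Rⁿ the norm F(v) := |v| + ⟨b, v⟩ for a fixed vector b with 0 < |b| < 1. Assuming, as the proof above suggests, that the reversibility constant of (3.17) is $\Lambda_F = \sup_{v \ne 0} F(-v)/F(v)$, one finds:

```latex
\[
  \Lambda_F
  = \sup_{v \ne 0} \frac{|v| - \langle b, v\rangle}{|v| + \langle b, v\rangle}
  = \frac{1 + |b|}{1 - |b|},
  \qquad\text{hence}\qquad
  \min\{C_F, S_F\} \ge \Bigl( \frac{1 + |b|}{1 - |b|} \Bigr)^2 > 1 .
\]
```

The supremum is attained at v = −b, and the lower bound shows that both constants necessarily blow up as |b| → 1.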

8.3.3 T-Curvature Revisited

In order to estimate the convexity and concavity of the squared distance function of a Finsler manifold, in addition to the flag curvature and the uniform convexity and smoothness constants, we need to control the variation of tangent spaces. This is another non-Riemannian phenomenon, since all tangent spaces of a Riemannian manifold are equipped with the same Euclidean structure. There are several ways of estimating such variations; here we employ the T-curvature which appeared in Subsect. 6.3.2. Recall that, for x ∈ M and $v, w \in T_x M \setminus \{0\}$,
\[
  T_v(w) = \sum_{i,j,k,l=1}^n g_{il}(v)\bigl\{ \Gamma^i_{jk}(w) - \Gamma^i_{jk}(v) \bigr\} w^j w^k v^l.
\]
Notice that $T_v(w)$ is positively 1-homogeneous in v and positively 2-homogeneous in w. Thus, for δ ≥ 0, we will say that T ≥ −δ (resp. T ≤ δ) holds if we have $T_v(w) \ge -\delta F(v)F^2(w)$ (resp. $T_v(w) \le \delta F(v)F^2(w)$) for all x ∈ M and $v, w \in T_x M$.

Recall from Proposition 6.11 that T is identically 0 if and only if (M, F) is a Berwald space, and then all tangent spaces of M are linearly isometric (Proposition 6.5). Therefore, we can certainly regard that, in a sense, the T-curvature measures the variation of tangent spaces.

8.3.4 k-Concavity of (M, F)

We are ready to study the k-convexity and k-concavity of a Finsler manifold. We begin with the k-concavity, for which we do not need to worry about cut points.

Theorem 8.19 (k-Concavity of M) Let (M, F) be forward or backward complete and satisfy
\[
  K \ge -\kappa, \qquad T \ge -\delta, \qquad S_F \le S
\]
for some κ, δ ≥ 0 and S ≥ 1. Then we have
\[
  \limsup_{s \to 0} \frac{d^2(x, \xi(-s)) + d^2(x, \xi(s)) - 2d^2(x, y)}{2s^2}
  \le S\,\frac{\sqrt{\kappa}\,r\cosh(\sqrt{\kappa}\,r)}{\sinh(\sqrt{\kappa}\,r)} + r\delta \tag{8.10}
\]
for any x, y ∈ M, $v \in U_y M$ and the geodesic ξ : (−ε, ε) → M with $\dot\xi(0) = v$, where r := d(x, y) and the right-hand side of (8.10) is read as S + rδ when κ = 0 or r = 0.

In particular, if diam(M) ≤ R, then (M, F) is k(κ, δ, S, R)-concave with
\[
  k(\kappa, \delta, S, R) := S\,\frac{\sqrt{\kappa}\,R\cosh(\sqrt{\kappa}\,R)}{\sinh(\sqrt{\kappa}\,R)} + R\delta.
\]

We remark that $\xi(s) = \exp_y(sv)$ holds only for s ≥ 0, due to the non-reversibility of F.

Proof If y = x (r = 0), then we deduce from Lemma 8.18 that
\[
  \lim_{s \to 0} \frac{d^2(x, \xi(-s)) + d^2(x, \xi(s))}{2s^2} = \frac{F^2(-v) + F^2(v)}{2} \le \frac{S + 1}{2} \le S.
\]

Hence we assume y ≠ x. Moreover, we remark that it is sufficient to consider the case of κ > 0, since then the case of κ = 0 can be obtained as the limit.

To prove (8.10), we utilize the second variation formula in the previous chapter. Let η : [0, r] → M be a unit speed minimal geodesic from x to y, and V be the $g_{\dot\eta}$-parallel vector field along η with V(r) = v (i.e., $D^{\dot\eta}_{\dot\eta} V = 0$). We consider a variation σ : [0, r] × (−ε, ε) → M of η given by
\[
  \sigma(t, s) := \xi_{V(t)}\bigl(s f(t)\bigr),
  \qquad
  f(t) := \frac{\sinh(\sqrt{\kappa}\,t)}{\sinh(\sqrt{\kappa}\,r)},
\]
where $\xi_w$ denotes the geodesic with $\dot\xi_w(0) = w$ (see Fig. 8.1). Note that σ(0, ·) = x and σ(r, ·) = ξ. As in Chap. 7, we introduce the vector fields $T(t, s) := \partial_t\sigma(t, s)$ and $U(t, s) := \partial_s\sigma(t, s)$, and define $L(s) := \int_0^r F(T(t, s))\,dt$. Observe that U(t, 0) = f(t)V(t) and $D^{\dot\eta}_{\dot\eta}[fV](t) = f'(t)V(t)$. Hence the second variation formula (Theorem 7.6) yields that
\[
  L''(0) = I(fV, fV) + g_{\dot\eta}\bigl(D^T_U U(r, 0), \dot\eta(r)\bigr) - \int_0^r \Bigl( \frac{\partial[F(T)]}{\partial s}(t, 0) \Bigr)^2 dt
\]
since $F(\dot\eta) = 1$, and the index form (7.1) is given by
\[
  I(fV, fV) = \int_0^r \Bigl\{ (f')^2 g_{\dot\eta}(V, V) - f^2 g_{\dot\eta}\bigl(R_{\dot\eta}(V), V\bigr) \Bigr\}\,dt.
\]

Fig. 8.1 Variation σ (labels: x, y, η(t), f(t)V(t), σ(t, ·), ξ)

We shall estimate each term of this expression of L''(0).

It follows from $D^{\dot\eta}_{\dot\eta} V = 0$ and Lemma 4.9 that $g_{\dot\eta}(V, V)$ is constant, and hence the hypothesis $S_F \le S$ implies
\[
  g_{\dot\eta(t)}\bigl(V(t), V(t)\bigr) = g_{\dot\eta(r)}(v, v) \le S F^2(v) = S
\]
for all t. Combining this with K ≥ −κ and $|g_{\dot\eta}(\dot\eta, V)| \le \sqrt{g_{\dot\eta}(V, V)}$ (by the Cauchy–Schwarz inequality of $g_{\dot\eta}$), we find
\[
  g_{\dot\eta}\bigl(R_{\dot\eta}(V), V\bigr)
  = K(\dot\eta, V)\bigl\{ g_{\dot\eta}(V, V) - g_{\dot\eta}(\dot\eta, V)^2 \bigr\}
  \ge -\kappa\bigl\{ g_{\dot\eta}(V, V) - g_{\dot\eta}(\dot\eta, V)^2 \bigr\}
  \ge -\kappa\,g_{\dot\eta}(V, V) \ge -\kappa S
\]
(this is also the case when $\dot\eta$ and V are not linearly independent, since then we have $R_{\dot\eta}(V) = 0$ by Lemma 5.6(i)). Moreover, since $D^U_U U = 0$ by construction, we have
\[
  g_{\dot\eta}\bigl(D^T_U U(r, 0), \dot\eta(r)\bigr)
  = g_{\dot\eta}\bigl(D^U_U U(r, 0), \dot\eta(r)\bigr) - T_{\dot\eta}\bigl(U(r, 0)\bigr)
  \le \delta F^2(v) = \delta. \tag{8.11}
\]
Finally note that, by the Cauchy–Schwarz inequality,
\[
  \int_0^r \Bigl( \frac{\partial[F(T)]}{\partial s}(t, 0) \Bigr)^2 dt
  \ge \frac{1}{r}\Bigl( \int_0^r \frac{\partial[F(T)]}{\partial s}(t, 0)\,dt \Bigr)^2
  = \frac{1}{r}L'(0)^2.
\]
These together imply
\[
  L''(0) \le \int_0^r \bigl\{ (f')^2 S + f^2\kappa S \bigr\}\,dt + \delta - \frac{1}{r}L'(0)^2.
\]
Therefore, we obtain

\[
  (L^2)''(0) = 2rL''(0) + 2L'(0)^2
  \le 2r\Bigl( \kappa S\int_0^r \frac{\cosh(2\sqrt{\kappa}\,t)}{\sinh^2(\sqrt{\kappa}\,r)}\,dt + \delta \Bigr)
  = 2r\Bigl( \sqrt{\kappa}\,S\,\frac{\cosh(\sqrt{\kappa}\,r)}{\sinh(\sqrt{\kappa}\,r)} + \delta \Bigr).
\]
This completes the proof by observing that
\[
  \frac{d^2(x, \xi(-s)) + d^2(x, \xi(s)) - 2d^2(x, y)}{2s^2}
  \le \frac{L^2(-s) + L^2(s) - 2L^2(0)}{2s^2}
  \to \frac{1}{2}(L^2)''(0)
\]
as s → 0.

Compared with Theorem 8.13 in the Riemannian case, we needed two non-Riemannian quantities in Theorem 8.19: the uniform smoothness constant and the T-curvature. The uniform smoothness constant controls the infinitesimal concavity and is clearly inevitable. Then the T-curvature plays a role of estimating the variation of tangent spaces. There could be some other ways to do this, while a benefit of considering the T-curvature is its connection with Berwald spaces. For instance, we deduce from Proposition 6.11 the following.

Corollary 8.20 (Berwald Spaces of Nonnegative Curvature) Let (M, F) be a complete Berwald space satisfying K ≥ 0 and $S_F \le S$ for some S ≥ 1. Then (M, F) is S-concave in the sense of (8.6).

Recall that the reversibility constant $\Lambda_F$ is finite in Berwald spaces, thereby we can simply speak of the completeness (recall Subsect. 6.3.1). It is not known if any kind of converse to Theorem 8.19 or Corollary 8.20 holds true.

Exercise 8.21 Taking a closer look at the proof of Theorem 8.19 in the Riemannian case (where S = 1 and δ = 0), prove the implication from the 1-concavity to the nonnegative sectional curvature in Theorem 8.13.
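To get a feeling for the constant $k(\kappa, \delta, S, R) = S\,a\coth a + R\delta$ with $a := \sqrt{\kappa}\,R$ in Theorem 8.19, one can expand its hyperbolic factor for small curvature bounds (an elementary Taylor expansion, added here as an illustration). The function a ↦ a coth a is increasing on (0, ∞) and satisfies:

```latex
\[
  a\coth a = \frac{a\cosh a}{\sinh a} = 1 + \frac{a^2}{3} + O(a^4) \quad (a \to 0),
  \qquad\text{so that}\qquad
  k(\kappa, \delta, S, R) = S\Bigl( 1 + \frac{\kappa R^2}{3} \Bigr) + R\delta + O(\kappa^2 R^4).
\]
```

In particular k(κ, δ, S, R) → S + Rδ as κ → 0, in accordance with the convention stated for (8.10), and the case κ = δ = 0 gives the constant S appearing in Corollary 8.20.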

8.3.5 k-Convexity of (M, F)

The estimate of the k-convexity makes use of the following modification of the argument in the proof of Theorem 8.6(i).

Lemma 8.22 (Rauch's Comparison Theorem) Assume that (M, F) satisfies K ≤ κ for some κ ≥ 0. Let η : [0, l] → M be a unit speed geodesic, with $l < \pi/\sqrt{\kappa}$ if κ > 0, and J be a Jacobi field along η with J(0) = 0. Then we have
\[
  \frac{1}{2}\frac{d}{dt}\bigl[g_{\dot\eta}(J, J)\bigr](l)
  \ge \frac{\sqrt{\kappa}\cos(\sqrt{\kappa}\,l)}{\sin(\sqrt{\kappa}\,l)}\,g_{\dot\eta}\bigl(J(l), J(l)\bigr),
\]
where the right-hand side is read as $g_{\dot\eta}(J(l), J(l))/l$ when κ = 0.

Proof First we assume κ > 0. We will denote the covariant derivative $D^{\dot\eta}_{\dot\eta} J$ by J' for brevity. The claim is equivalent to
\[
  \frac{d}{dt}\Bigl[\sqrt{g_{\dot\eta}(J, J)}\Bigr](l)
  \ge \frac{\sqrt{\kappa}\cos(\sqrt{\kappa}\,l)}{\sin(\sqrt{\kappa}\,l)}\sqrt{g_{\dot\eta}\bigl(J(l), J(l)\bigr)}. \tag{8.12}
\]
In order to see this, we deduce from the Jacobi equation and the hypothesis K ≤ κ that, at t ∈ (0, l) with J(t) ≠ 0,
\begin{align*}
  \frac{d^2}{dt^2}\Bigl[\sqrt{g_{\dot\eta}(J, J)}\Bigr]
  &= \frac{d}{dt}\Bigl[ \frac{g_{\dot\eta}(J', J)}{\sqrt{g_{\dot\eta}(J, J)}} \Bigr]
   = \frac{g_{\dot\eta}(J'', J) + g_{\dot\eta}(J', J')}{\sqrt{g_{\dot\eta}(J, J)}} - \frac{g_{\dot\eta}(J', J)^2}{g_{\dot\eta}(J, J)^{3/2}} \\
  &= -\frac{g_{\dot\eta}(R_{\dot\eta}(J), J)}{\sqrt{g_{\dot\eta}(J, J)}} + \frac{g_{\dot\eta}(J', J')\,g_{\dot\eta}(J, J) - g_{\dot\eta}(J', J)^2}{g_{\dot\eta}(J, J)^{3/2}} \\
  &\ge -\frac{\kappa\{ g_{\dot\eta}(J, J) - g_{\dot\eta}(\dot\eta, J)^2 \}}{\sqrt{g_{\dot\eta}(J, J)}}
   \ge -\kappa\sqrt{g_{\dot\eta}(J, J)}.
\end{align*}
When J(t) = 0, observe from $g_{\dot\eta}(J, J) \ge 0$ that
\[
  \frac{d}{dt}\bigl[g_{\dot\eta}(J, J)\bigr](t) = 0,
  \qquad
  \frac{d^2}{dt^2}\bigl[g_{\dot\eta}(J, J)\bigr](t) \ge 0.
\]
Therefore, we have, in both cases,
\[
  \frac{d}{dt}\Bigl[ \frac{d}{dt}\Bigl[\sqrt{g_{\dot\eta}(J, J)}\Bigr](t)\cdot\sin(\sqrt{\kappa}\,t) - \sqrt{g_{\dot\eta}(J, J)}(t)\cdot\sqrt{\kappa}\cos(\sqrt{\kappa}\,t) \Bigr] \ge 0.
\]
Combining this with J(0) = 0, we obtain (8.12) as desired. The case of κ = 0 is given as the limit.

Theorem 8.23 (k-Convexity of M) Let (M, F) be forward complete and satisfy
\[
  K \le \kappa, \qquad T \le \delta, \qquad C_F \le C
\]
for some κ, δ ≥ 0 and C ≥ 1. Take x ∈ M and y ∈ M \ Cut(x) such that $r := d(x, y) < \pi/\sqrt{\kappa}$ if κ > 0. Then we have
\[
  \lim_{s \to 0} \frac{d^2(x, \xi(-s)) + d^2(x, \xi(s)) - 2d^2(x, y)}{2s^2}
  \ge C^{-1}\frac{\sqrt{\kappa}\,r\cos(\sqrt{\kappa}\,r)}{\sin(\sqrt{\kappa}\,r)} - r\delta \tag{8.13}
\]
for any $v \in U_y M$ and the geodesic ξ : (−ε, ε) → M with $\dot\xi(0) = v$, where the right-hand side of (8.13) is read as $C^{-1} - r\delta$ when κ = 0 or r = 0.

Fig. 8.2 Variation σ (labels: x, y, v, η, ξ(s), σ(·, s))

Proof By the same reasoning as Theorem 8.19, we can assume y ≠ x and κ > 0. Let η : [0, r] → M be the unique minimal geodesic from x to y, and take small ε > 0 so that ξ(s) ∉ Cut(x) ∪ {x} for all s ∈ (−ε, ε) (recall Exercise 7.18). Then we consider a C∞-variation σ : [0, r] × (−ε, ε) → M of η such that σ(·, s) is the unique minimal geodesic from x to ξ(s) (see Fig. 8.2; compare this construction of the variation σ with that in the proof of Theorem 8.19 drawn in Fig. 8.1).

We again consider $T(t, s) := \partial_t\sigma(t, s)$, $U(t, s) := \partial_s\sigma(t, s)$ and $L(s) := \int_0^r F(T(t, s))\,dt = d(x, \xi(s))$. Observe that J := U(·, 0) is a Jacobi field along η by construction, and hence
\[
  I(J, J) = \Bigl[ g_{\dot\eta}\bigl(D^{\dot\eta}_{\dot\eta} J, J\bigr) \Bigr]_0^r = g_{\dot\eta(r)}\bigl(D^{\dot\eta}_{\dot\eta} J(r), v\bigr)
\]
by the same calculation as in the proof of Lemma 7.14. Then it follows from the second variation formula (Theorem 7.6), F(T(t, s)) = d(x, ξ(s))/r and a modification of (8.11) under T ≤ δ that
\[
  L''(0) = g_{\dot\eta(r)}\bigl(D^{\dot\eta}_{\dot\eta} J(r), v\bigr) + g_{\dot\eta(r)}\bigl(D^T_U U(r, 0), \dot\eta(r)\bigr) - \frac{L'(0)^2}{r}
  \ge g_{\dot\eta(r)}\bigl(D^{\dot\eta}_{\dot\eta} J(r), v\bigr) - \delta - \frac{L'(0)^2}{r}.
\]
We finally apply Lemma 8.22 and the hypothesis $C_F \le C$ to see
\[
  g_{\dot\eta(r)}\bigl(D^{\dot\eta}_{\dot\eta} J(r), v\bigr)
  = \frac{1}{2}\frac{d}{dt}\bigl[g_{\dot\eta}(J, J)\bigr](r)
  \ge \frac{\sqrt{\kappa}\cos(\sqrt{\kappa}\,r)}{\sin(\sqrt{\kappa}\,r)}\,g_{\dot\eta(r)}(v, v)
  \ge C^{-1}\frac{\sqrt{\kappa}\cos(\sqrt{\kappa}\,r)}{\sin(\sqrt{\kappa}\,r)}.
\]
Therefore, we obtain
\[
  (L^2)''(0) = 2rL''(0) + 2L'(0)^2 \ge 2r\Bigl( C^{-1}\frac{\sqrt{\kappa}\cos(\sqrt{\kappa}\,r)}{\sin(\sqrt{\kappa}\,r)} - \delta \Bigr).
\]
Since the left-hand side of (8.13) coincides with $(L^2)''(0)/2$, this completes the proof.

If κ = 0 and M is simply connected, then the Cartan–Hadamard theorem (Theorem 8.6) yields that Cut(x) = ∅. Hence we have the following counterpart to Corollary 8.20.

Corollary 8.24 (Berwald Spaces of Nonpositive Curvature) Let (M, F) be a complete, simply connected Berwald space satisfying K ≤ 0 and $C_F \le C$ for some C ≥ 1. Then (M, F) is $C^{-1}$-convex in the sense of (8.5).

It is again an open problem if any kind of converse to Theorem 8.23 or Corollary 8.24 holds. Compare this with Theorem 8.31 below.

Exercise 8.25 Similarly to Exercise 8.21, prove the implication from the 1-convexity to the nonpositive sectional curvature in Theorem 8.10.
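The trigonometric factor in (8.13) can be examined in the same elementary way as the hyperbolic factor in (8.10) (again an added illustration, not part of the original argument). With $a := \sqrt{\kappa}\,r \in (0, \pi)$:

```latex
\[
  a\cot a = \frac{a\cos a}{\sin a} = 1 - \frac{a^2}{3} + O(a^4) \quad (a \to 0),
  \qquad
  C^{-1}\frac{\sqrt{\kappa}\,r\cos(\sqrt{\kappa}\,r)}{\sin(\sqrt{\kappa}\,r)} - r\delta
  = C^{-1}\Bigl( 1 - \frac{\kappa r^2}{3} \Bigr) - r\delta + O(\kappa^2 r^4).
\]
```

Letting κ → 0 recovers the value $C^{-1} - r\delta$ prescribed in Theorem 8.23, and with δ = 0 this is the constant $C^{-1}$ of Corollary 8.24.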

8.4 Busemann NPC for Berwald Spaces We saw in Proposition 1.6 and the previous section that the 1-convexity and the 1-concavity (the nonpositive and nonnegative curvature bounds in the sense of Alexandrov) make sense only for Riemannian manifolds. Nonetheless, there is another synthetic geometric notion of nonpositive curvature, going back to Busemann, which holds also for non-Riemannian Finsler manifolds. The following condition was introduced by Busemann. Definition 8.26 (Busemann NPC) We say that (M, F ) has the Busemann nonpositive curvature (Busemann NPC for short) if, for any pair of geodesics η, ξ : [0, 1] −→ M, the distance function d(η(t), ξ(t)) is convex in t. Busemann in his influential paper [52] proved the following. Theorem 8.27 (Busemann’s Comparison Theorem) A complete, simply connected Riemannian manifold has the Busemann NPC if and only if it has the nonpositive sectional curvature.


More precisely, it was shown that a Riemannian manifold has the nonpositive sectional curvature if and only if it satisfies the Busemann NPC condition locally. We refer to [17, 46, 130] and the references therein for more on Busemann NPC spaces.

Exercise 8.28 Assume that (M, F) is forward complete and has the Busemann NPC.
(1) Prove the analogue to Exercise 8.8, i.e., every pair of points x, y ∈ M is connected by a unique (minimal) geodesic from x to y.
(2) Show that M is contractible.

Exercise 8.29 Show that a 1-convex metric space in the sense of Definition 8.9 has the Busemann NPC.

The converse of Exercise 8.29 is not true; indeed, all strictly convex normed spaces obviously have the Busemann NPC (the strict convexity is necessary to ensure that line segments are the only geodesics, whereas the reversibility is unnecessary). From this fact, it is natural to look for some conditions for Finsler manifolds to have the Busemann NPC. We can find a reasonable answer in [142] as follows (compare this with Corollary 8.24).

Theorem 8.30 (Busemann NPC for Berwald Spaces) Let (M, F) be a complete, simply connected Berwald space satisfying K ≤ 0. Then it has the Busemann NPC.

Proof First we consider two geodesics η, ξ : [0, 1] → M emanating from a common point denoted by x. This case will turn out essential by a standard technique. Let ζ : [0, 1] → M be the geodesic from η(1) to ξ(1), and consider a variation σ : [0, 1] × [0, 1] → M such that σ(·, s) is the geodesic from x to ζ(s). (Note that the construction of this variation is closer to Fig. 8.2 than Fig. 8.1.) We remark that, thanks to Theorem 8.6, σ is C∞ on (0, 1] × [0, 1]. Set
\[
  L(t) := \int_0^1 F\bigl(\partial_s\sigma(t, s)\bigr)\,ds
\]
for t ∈ [0, 1], and observe that L(0) = 0 and L(1) = L(ζ) = d(η(1), ξ(1)). We shall show that L''(t) ≥ 0. To this end, we put $T := \partial_t\sigma$ and $U := \partial_s\sigma$ as before, and observe
\[
  \frac{\partial^2[F^2(\partial_s\sigma)]}{\partial t^2} = \frac{\partial^2[g_U(U, U)]}{\partial t^2}
  = 2g_U\bigl(U, D^U_T D^U_T U\bigr) + 2g_U\bigl(D^U_T U, D^U_T U\bigr).
\]
Note that U(·, s) is a Jacobi field for each fixed s, and that the covariant derivative is independent of a reference vector by the very definition of Berwald spaces. Therefore, we have

\[
  g_U\bigl(U, D^U_T D^U_T U\bigr) = -g_U\bigl(U, R_T(U)\bigr)
  = -g_U\Bigl( U, \sum_{i,j,k,l=1}^n R_{l}{}^{i}{}_{jk}(T)\,U^j T^k T^l \frac{\partial}{\partial x^i} \Bigr).
\]
From the definition (5.10) of $R_{l}{}^{i}{}_{jk}$, we find that it is fiber-wise constant in Berwald spaces. Hence $R_{l}{}^{i}{}_{jk}(T) = R_{l}{}^{i}{}_{jk}(U)$, and then Lemma 5.15(iv) shows
\[
  g_U\Bigl( U, \sum_{i,j,k,l=1}^n R_{l}{}^{i}{}_{jk}(U)\,U^j T^k T^l \frac{\partial}{\partial x^i} \Bigr) = g_U\bigl(R_U(T), T\bigr),
\]
which is nonpositive by the hypothesis K ≤ 0. Combining these, we obtain
\begin{align*}
  L''(t) &= \int_0^1 \frac{\partial^2[F(U)]}{\partial t^2}\,ds
  = \int_0^1 \Bigl\{ \frac{1}{2F(U)}\frac{\partial^2[F^2(U)]}{\partial t^2} - \frac{1}{F(U)}\Bigl( \frac{\partial[F(U)]}{\partial t} \Bigr)^2 \Bigr\}\,ds \\
  &\ge \int_0^1 \Bigl\{ \frac{g_U(D^U_T U, D^U_T U)}{F(U)} - \frac{1}{F(U)}\Bigl( \frac{g_U(U, D^U_T U)}{F(U)} \Bigr)^2 \Bigr\}\,ds \\
  &= \int_0^1 \frac{1}{F^3(U)}\Bigl\{ F^2(U)\,g_U(D^U_T U, D^U_T U) - g_U(U, D^U_T U)^2 \Bigr\}\,ds \ge 0,
\end{align*}
where we used the Cauchy–Schwarz inequality of $g_U$. Therefore, L is a convex function and we have
\[
  d\bigl(\eta(t), \xi(t)\bigr) \le L(t) \le (1 - t)L(0) + tL(1) = t\,d\bigl(\eta(1), \xi(1)\bigr)
\]
for any t ∈ (0, 1).

Next we consider the general case. Given a pair of geodesics η, ξ : [0, 1] → M, let ζ : [0, 1] → M be the geodesic from η(0) to ξ(1). For any t ∈ (0, 1), on one hand, the above discussion implies
\[
  d\bigl(\eta(t), \zeta(t)\bigr) \le t\,d\bigl(\eta(1), \zeta(1)\bigr) = t\,d\bigl(\eta(1), \xi(1)\bigr).
\]
On the other hand, the same inequality but for the reverse curves of ζ and ξ, which are geodesics with respect to the reverse Finsler structure $\overleftarrow{F}(v) = F(-v)$, yields that
\[
  d\bigl(\zeta(t), \xi(t)\bigr) \le (1 - t)\,d\bigl(\zeta(0), \xi(0)\bigr) = (1 - t)\,d\bigl(\eta(0), \xi(0)\bigr)
\]
(since $(M, \overleftarrow{F})$ also enjoys $\overleftarrow{K} \le 0$; recall Sect. 2.5). Together with the triangle inequality, we obtain

\[
  d\bigl(\eta(t), \xi(t)\bigr) \le d\bigl(\eta(t), \zeta(t)\bigr) + d\bigl(\zeta(t), \xi(t)\bigr)
  \le (1 - t)\,d\bigl(\eta(0), \xi(0)\bigr) + t\,d\bigl(\eta(1), \xi(1)\bigr)
\]
for t ∈ (0, 1). Since η and ξ were arbitrary geodesics, we conclude that d(η, ξ) is a convex function. This completes the proof.

The converse of Theorem 8.30 also holds true as follows.

Theorem 8.31 (Busemann NPC Implies Berwald) A simply connected, forward (or backward) complete Finsler manifold has the Busemann NPC if and only if it is a Berwald space of nonpositive flag curvature.

Precisely, after a partial result in [140] that a Berwald space possesses the Busemann NPC only if its flag curvature is nonpositive, Ivanov–Lytchak [125] completed the proof of Theorem 8.31.
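The remark made before Theorem 8.30, that strictly convex normed spaces have the Busemann NPC, can be checked by hand. The following short computation is a sketch under the assumptions that the (possibly asymmetric) distance is d(p, q) = F(q − p) for a Minkowski norm F on Rⁿ and that geodesics are affinely parametrized line segments; the points $p_0, p_1, q_0, q_1$ are names introduced only for this illustration. For $\eta(t) = (1 - t)p_0 + tp_1$ and $\xi(t) = (1 - t)q_0 + tq_1$:

```latex
\begin{align*}
  d\bigl(\eta(t), \xi(t)\bigr)
  &= F\bigl( (1-t)(q_0 - p_0) + t(q_1 - p_1) \bigr) \\
  &\le (1-t)\,F(q_0 - p_0) + t\,F(q_1 - p_1)
   = (1-t)\,d\bigl(\eta(0), \xi(0)\bigr) + t\,d\bigl(\eta(1), \xi(1)\bigr).
\end{align*}
```

Only the triangle inequality and the positive 1-homogeneity of F are used, so F need not be reversible. Since ξ(t) − η(t) is affine in t and F is convex, the same argument on every subinterval gives the convexity required in Definition 8.26.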

Part II

Geometry and Analysis of Weighted Ricci Curvature

In Part II, we consider a Finsler manifold endowed with a measure on it and develop geometric analysis using the weighted Ricci curvature. The Bochner–Weitzenböck formula and the corresponding Bochner inequality will play prominent roles.

Chapter 9

Weighted Ricci Curvature

In Part I, we saw that the natural notions of Finsler curvatures (the flag and Ricci curvatures) can be introduced through the behavior of geodesics, and then several comparison theorems follow smoothly by similar arguments to the Riemannian case, or through the characterizations of these curvatures from the Riemannian geometric point of view (Theorem 5.12). In order to proceed further in this direction, we would like to equip our Finsler manifold with a measure on it. At this point, however, we face a difficulty in choosing a measure, because a Finsler manifold does not necessarily have a unique canonical measure like the volume measure in the Riemannian case. Then our standpoint is that, instead of choosing some constructive measure, we begin with an arbitrary measure and modify the Ricci curvature into the weighted Ricci curvature according to the choice of a measure. This is motivated by the theory deeply investigated in the Riemannian case by Lichnerowicz, Bakry and others (see Sect. 9.2). It will turn out that this strategy fits the Finsler setting very well.

© The Author(s), under exclusive license to Springer Nature Switzerland AG 2021 S.-i. Ohta, Comparison Finsler Geometry, Springer Monographs in Mathematics, https://doi.org/10.1007/978-3-030-80650-7_9

9.1 Measures on Finsler Manifolds

Let $(M, F)$ be an $n$-dimensional Finsler manifold (we continue assuming all the standard conditions described at the beginning of Sect. 2.3). Unlike the Riemannian case, we cannot fix a unique canonical measure on $(M, F)$. There are several constructive measures having their own merits; however, they do not reduce to a single measure unless $(M, F)$ is Riemannian. Moreover, in a sense, some Finsler manifolds do not admit any measure as good as the Riemannian volume measure (see Sect. 10.3 for a precise account).

In this section, before introducing the weighted Ricci curvature, we explain the two most fundamental measures in Finsler geometry. The first one is the Busemann–Hausdorff measure, for which the unit ball in each tangent space has the same volume as the unit ball in $\mathbb{R}^n$. We denote by $\omega_n$ the volume of the unit ball in $\mathbb{R}^n$, and by $\mathcal{L}^n$ the $n$-dimensional Lebesgue measure on $\mathbb{R}^n$.

Definition 9.1 (Busemann–Hausdorff Measure) Define the Busemann–Hausdorff measure $m_{\mathrm{BH}}$ on $M$ by $m_{\mathrm{BH}}(dx) := \sigma_{\mathrm{BH}}(x)\, dx^1 dx^2 \cdots dx^n$ in each local coordinate system $(x^i)_{i=1}^n$, where the function $\sigma_{\mathrm{BH}}$ is defined by
\[
\frac{\omega_n}{\sigma_{\mathrm{BH}}(x)} = \mathcal{L}^n\Biggl( \biggl\{ (a^i)_{i=1}^n \in \mathbb{R}^n \,\biggm|\, F\biggl( \sum_{i=1}^n a^i \frac{\partial}{\partial x^i}\Big|_x \biggr) < 1 \biggr\} \Biggr).
\]
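
As a numerical illustration of Definition 9.1, the following minimal sketch computes the density of the Busemann–Hausdorff measure at a single point for $n = 2$, given the norm on the tangent space in coordinates. The helper names (`unit_ball_area`, `sigma_bh`) and the sample norms are assumptions made for illustration only; the Randers-type norm $F(a) = |a| + b\,a^1$ with $b = 0.5$ is chosen because its unit ball has the known area $\pi/(1 - b^2)^{3/2}$, in agreement with Lemma 10.8.

```python
import math

def unit_ball_area(F, steps=20000):
    """Lebesgue area of {a in R^2 : F(a) < 1} for a positively 1-homogeneous F.

    In polar coordinates the ball is {r < 1 / F(cos t, sin t)}, so the area is
    the integral over [0, 2*pi] of 1 / (2 * F(cos t, sin t)^2); we use the
    midpoint rule on a uniform grid.
    """
    dt = 2.0 * math.pi / steps
    total = 0.0
    for k in range(steps):
        t = (k + 0.5) * dt
        total += 0.5 / F(math.cos(t), math.sin(t)) ** 2
    return total * dt

def sigma_bh(F):
    """Density of m_BH with respect to dx^1 dx^2 (n = 2, omega_2 = pi)."""
    return math.pi / unit_ball_area(F)

def euclid(a1, a2):
    # Euclidean norm: the Riemannian case, where the density should be 1.
    return math.hypot(a1, a2)

b = 0.5  # hypothetical size of the 1-form part, must satisfy |b| < 1

def randers(a1, a2):
    # Randers-type norm |a| + b * a^1 (illustrative choice, not from the text).
    return math.hypot(a1, a2) + b * a1

if __name__ == "__main__":
    print(sigma_bh(euclid))                      # close to 1
    print(sigma_bh(randers), (1 - b * b) ** 1.5)  # the two values should agree
```

The polar-coordinate formula only uses positive homogeneity of $F$, so the same sketch applies to non-reversible norms as well.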

The measure $m_{\mathrm{BH}}$ can be regarded as a generalization of the Hausdorff measure. Indeed, if $F$ is reversible ($F(-v) = F(v)$), then $m_{\mathrm{BH}}$ coincides with the $n$-dimensional Hausdorff measure associated with the Finsler distance function $d$.

Exercise 9.2 Prove that $m_{\mathrm{BH}}$ is well-defined (independent of the choice of a local coordinate system). Moreover, for a Riemannian manifold $(M, g)$, show that $m_{\mathrm{BH}}$ coincides with the volume measure $\mathrm{vol}_g$ (see the next section for the definition of $\mathrm{vol}_g$).

The second one is the Holmes–Thompson measure, which is defined in terms of the cotangent bundle and is suitable from the viewpoint of dynamical systems. Let $\Omega$ be the standard symplectic form on the cotangent bundle $T^*M$ given by
\[
\Omega = \sum_{i=1}^n dx^i \wedge d\alpha_i
\]
for each local coordinate system $(x^i)_{i=1}^n$, and set
\[
\Omega^n := \Omega \wedge \cdots \wedge \Omega = (-1)^{n(n-1)/2}\, n! \cdot dx^1 \wedge \cdots \wedge dx^n \wedge d\alpha_1 \wedge \cdots \wedge d\alpha_n.
\]

Definition 9.3 (Holmes–Thompson Measure) The Holmes–Thompson measure $m_{\mathrm{HT}}$ on $M$ is defined by
\[
m_{\mathrm{HT}}(A) := \frac{1}{\omega_n} \int_{\widetilde{A}} \frac{(-1)^{n(n-1)/2}}{n!}\, \Omega^n
\]
for $A \subset M$, where $\widetilde{A} := \{ \alpha \in T_x^*M \mid x \in A,\ F^*(\alpha) < 1 \} \subset T^*M$.

The measure $m_{\mathrm{HT}}$ is preserved by the geodesic flow and has connections with dynamical systems and integral geometry (see [124]). In the Riemannian case, both $m_{\mathrm{BH}}$ and $m_{\mathrm{HT}}$ coincide with the Riemannian volume measure. We refer to [6], [230, Chap. 2], and [227, Sect. 4.1] for further information and discussions on measures on Finsler manifolds.

Exercise 9.4 Prove that $m_{\mathrm{HT}}$ coincides with the volume measure $\mathrm{vol}_g$ for a Riemannian manifold $(M, g)$.

9.2 Riemannian Weighted Ricci Curvature

Prior to introducing the weighted Ricci curvature of Finsler manifolds, we briefly review the Riemannian case. Given a Riemannian manifold $(M, g)$, the volume measure $\mathrm{vol}_g$ on $M$ is written in local coordinates as
\[
\mathrm{vol}_g(dx) = \sqrt{\det\bigl(g_{ij}(x)\bigr)}\; dx^1 dx^2 \cdots dx^n.
\]
Now, let $m$ be an arbitrary positive $C^\infty$-measure on $M$, i.e., $m$ is written in local coordinates as $m = \Phi\, dx^1 dx^2 \cdots dx^n$ with a positive $C^\infty$-function $\Phi$. Then we can represent $m$ as a conformal change of $\mathrm{vol}_g$ in the form
\[
m = e^{-\psi}\, \mathrm{vol}_g, \qquad \psi \in C^\infty(M).
\]
We will call $(M, g, m)$ a weighted Riemannian manifold. The function $\psi$ is sometimes called the weight function (with respect to the volume measure).

It is well known that the Ricci curvature plays an essential role in controlling the behavior of the volume measure. Hence, on $(M, g, m)$, it is natural (and necessary) to modify the Ricci curvature according to the choice of $m$ (or, equivalently, $\psi$). It turns out that an appropriate modification is given as follows, with a real parameter $N$ (going back to [18, 20, 157, 217]).

Definition 9.5 (Riemannian Weighted Ricci Curvature) Let $(M, g, m)$ be an $n$-dimensional weighted Riemannian manifold as above. Then, for $N \in \mathbb{R} \setminus \{n\}$ and a tangent vector $v \in T_xM$, we define the weighted Ricci curvature associated with the measure $m$ (also called the Bakry–Émery–Ricci curvature or the $N$-Ricci curvature) by
\[
\mathrm{Ric}_N(v) := \mathrm{Ric}_g(v, v) + \mathrm{Hess}\,\psi(v, v) - \frac{(d\psi(v))^2}{N - n},
\]
where $\mathrm{Ric}_g$ denotes the Ricci curvature tensor of $g$ and $\mathrm{Hess}\,\psi : TM \otimes TM \longrightarrow \mathbb{R}$ is the Hessian of $\psi$. As the limits of $N \to \infty$ and $N \downarrow n$, we also define
\[
\mathrm{Ric}_\infty(v) := \mathrm{Ric}_g(v, v) + \mathrm{Hess}\,\psi(v, v),
\]
\[
\mathrm{Ric}_n(v) :=
\begin{cases}
\mathrm{Ric}_g(v, v) + \mathrm{Hess}\,\psi(v, v) & \text{if } d\psi(v) = 0, \\
-\infty & \text{if } d\psi(v) \neq 0.
\end{cases}
\]
Given $K \in \mathbb{R}$, we say that $\mathrm{Ric}_N \ge K$ holds if we have $\mathrm{Ric}_N(v) \ge K g(v, v)$ for all $v \in TM$.

The Riemannian Hessian is defined by
\[
\mathrm{Hess}\,\psi(X, Y) := X\bigl( d\psi(Y) \bigr) - d\psi(\nabla_X Y) = g\bigl( \nabla_X(\nabla\psi), Y \bigr) \tag{9.1}
\]
for vector fields $X$ and $Y$, where $\nabla$ is the Levi-Civita connection of $g$ and $\nabla\psi$ is the gradient vector field of $\psi$ (see [66]; a Finsler notion of Hessian will be introduced in Sect. 12.1). What is relevant to us is the fact that $\mathrm{Hess}\,\psi(v, v) = (\psi \circ \eta)''(0)$ holds along the geodesic $\eta(t) := \exp_x(tv)$.

Exercise 9.6 Show that, in a local coordinate system, we have
\[
\mathrm{Hess}\,\psi(X, Y) = \sum_{i,j=1}^n X^i Y^j \biggl\{ \frac{\partial^2 \psi}{\partial x^i \partial x^j} - \sum_{k=1}^n \Gamma^k_{ij} \frac{\partial \psi}{\partial x^k} \biggr\},
\]
where $\Gamma^k_{ij}$ denotes the Christoffel symbol of $g$.

Exercise 9.7 Putting $m = \Phi\, \mathrm{vol}_g$ (here $\Phi = e^{-\psi}$ denotes the density of $m$ with respect to $\mathrm{vol}_g$), show the following expression of $\mathrm{Ric}_N$:
\[
\mathrm{Ric}_N(v) = \mathrm{Ric}_g(v, v) - (N - n)\, \frac{\mathrm{Hess}[\Phi^{1/(N-n)}](v, v)}{\Phi(x)^{1/(N-n)}}.
\]
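
Definition 9.5 and the identity of Exercise 9.7 can be sanity-checked numerically in the simplest situation. The sketch below assumes the flat line $(\mathbb{R}, |\cdot|)$ (so $n = 1$ and $\mathrm{Ric}_g = 0$), a hypothetical weight function $\psi(x) = x^2/2 + \sin x$, and central finite differences in place of exact derivatives; all function names are illustrative. It is a numerical check under these assumptions, not a proof of the exercise.

```python
import math

def first_derivative(fn, x, h=1e-4):
    # Central difference approximation of fn'(x).
    return (fn(x + h) - fn(x - h)) / (2.0 * h)

def second_derivative(fn, x, h=1e-4):
    # Central difference approximation of fn''(x).
    return (fn(x + h) - 2.0 * fn(x) + fn(x - h)) / (h * h)

def ric_N_from_psi(psi, x, N, n=1):
    """Definition 9.5 on the flat line: psi'' - (psi')^2 / (N - n)."""
    return second_derivative(psi, x) - first_derivative(psi, x) ** 2 / (N - n)

def ric_N_from_density(Phi, x, N, n=1):
    """Exercise 9.7 on the flat line: -(N - n) * u'' / u with u = Phi^{1/(N-n)}."""
    def u(y):
        return Phi(y) ** (1.0 / (N - n))
    return -(N - n) * second_derivative(u, x) / u(x)

def psi(x):
    # Hypothetical weight function chosen only for illustration.
    return 0.5 * x * x + math.sin(x)

def Phi(x):
    # Density of m = e^{-psi} dx with respect to the Lebesgue measure.
    return math.exp(-psi(x))

if __name__ == "__main__":
    for N in (4.0, -2.0):
        print(N, ric_N_from_psi(psi, 0.3, N), ric_N_from_density(Phi, 0.3, N))
```

Both a value $N > n$ and a negative value of $N$ are tried, since the definition makes sense for every $N \neq n$.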

Observe that, if $m = \mathrm{vol}_g$ (i.e., in the unweighted case), then we have $\psi = 0$ and $\mathrm{Ric}_N(v) = \mathrm{Ric}_g(v, v)$ for all $N$. Note also that the weighted Ricci curvature $\mathrm{Ric}_N$ is concerned with only the first and second order derivatives of $\psi$; therefore, it depends on the gradient vector field $\nabla\psi$ (or the 1-form $d\psi$) rather than the weight function $\psi$ itself. Thus one can generalize the weighted Ricci curvature to the one associated with a vector field $V$ by replacing $\nabla\psi$ with $V$, which is called the non-gradient type.

The following properties of $\mathrm{Ric}_N$ are immediately observed from its definition.

Lemma 9.8 (Monotonicity and Scaling Invariance)
(i) Given $v \in TM$, $\mathrm{Ric}_N(v)$ is monotone non-decreasing in $N$ on each of the intervals $[n, \infty]$ and $(-\infty, n)$. Moreover, we have
\[
\mathrm{Ric}_n(v) \le \mathrm{Ric}_N(v) \le \mathrm{Ric}_\infty(v) \le \mathrm{Ric}_{N'}(v)
\]
for all $-\infty < N' < n < N < \infty$.
(ii) The weighted Ricci curvature $\mathrm{Ric}_N$ for the measure $am$ multiplied by a positive constant $a > 0$ coincides with $\mathrm{Ric}_N$ for $m$.

Proof (i) directly follows from the definition of $\mathrm{Ric}_N$. To see (ii), observe that
\[
am = a e^{-\psi}\, \mathrm{vol}_g = e^{-\psi + \log a}\, \mathrm{vol}_g.
\]
Hence the weight function for the measure $am$ is given by $\psi - \log a$. Then the claim is straightforward from $d(\psi - \log a) = d\psi$. □

Though we gave the definition of $\mathrm{Ric}_N$ for all $N \in (-\infty, \infty]$, we will mainly consider $N \in (-\infty, 0) \cup [n, \infty]$. The parameter $N$ in $\mathrm{Ric}_N$ is sometimes called the effective dimension. One can indeed regard $N$ as “an upper bound of the dimension” in a certain sense, whereas this interpretation does not fit the case of $N \in (-\infty, 0)$. Roughly speaking, if $(M, g, m)$ satisfies $\mathrm{Ric}_N \ge K$ for some $K \in \mathbb{R}$ and $N \in [n, \infty]$, then it behaves like having “$\mathrm{Ric} \ge K$ and $\dim \le N$.” We will see some such phenomena, generalized to Finsler manifolds, in Sect. 9.4.

Remark 9.9 (When Weighted Measures Appear) We encounter weighted Riemannian manifolds in various contexts.
(a) Let $(X, d, m)$ be the measured Gromov–Hausdorff limit of a sequence of (unweighted) $n$-dimensional Riemannian manifolds $((M_i, g_i, \mathrm{vol}_{g_i}))_{i \in \mathbb{N}}$ (we refer to [102, 112] for the definition of this important notion of convergence), and assume that every $(M_i, g_i)$ enjoys $\mathrm{Ric}_{g_i} \ge K$. Then the limit space $(X, d, m)$ is not necessarily a Riemannian manifold, but a metric measure space satisfying the curvature-dimension condition $\mathrm{CD}(K, n)$ in the sense of Lott–Sturm–Villani (see Chap. 18 below). Even if $(X, d)$ is a Riemannian manifold, the limit measure $m$ may not coincide with the Hausdorff measure of $d$ and becomes a weighted measure.
(b) In the needle decomposition (also called the localization) method that will be discussed in Chap. 19, we decompose an (unweighted) Riemannian manifold $(M, g, \mathrm{vol}_g)$ into a family of geodesics with weighted measures. These weighted geodesics (called needles) inherit the weighted Ricci curvature bound $\mathrm{Ric}_n \ge K$ if $(M, g)$ satisfies $\mathrm{Ric}_g \ge K$.
(c) A classical object in convex geometry and analysis is the Euclidean space $\mathbb{R}^n$ equipped with a log-concave measure $m = e^{-\psi} \mathcal{L}^n$, where $\psi$ is a convex function. This space can be analyzed by means of the weighted Ricci curvature (or the curvature-dimension condition) by noticing that $\mathrm{Ric}_\infty = \mathrm{Hess}\,\psi \ge 0$, possibly in the weak sense.
(d) A Riemannian manifold $(M, g)$ is called a Ricci soliton if there is a vector field $V$ such that we have $\mathrm{Ric}_g + (L_V g)/2 = \lambda g$ for some $\lambda \in \mathbb{R}$, where $L_V$ denotes the Lie derivative with respect to $V$. If $V$ is the gradient vector field of a function $\psi \in C^\infty(M)$, then $(M, g)$ is called a gradient Ricci soliton and we have $\mathrm{Ric}_g + \mathrm{Hess}\,\psi = \lambda g$, which can be written as $\mathrm{Ric}_\infty = \lambda g$ by employing $\psi$ as the weight function. When $\psi$ is constant, we have $\mathrm{Ric}_g = \lambda g$ and hence $(M, g)$ is an Einstein manifold. Thus a gradient Ricci soliton $(M, g, \psi)$ (or a Ricci soliton $(M, g, V)$) can be regarded as an analogue of an Einstein manifold in the context of weighted Riemannian manifolds.

Remark 9.10 (On the Range of $N$) Here we list some more remarks on the investigations of $\mathrm{Ric}_N$ in the cases of $N = \infty$, $N \in [n, \infty)$ and $N < n$.
(a) The most classical object is $\mathrm{Ric}_\infty$, which goes back to Lichnerowicz’ pioneering work [157] on a generalization of Cheeger–Gromoll’s splitting theorem [72] to weighted Riemannian manifolds of $\mathrm{Ric}_\infty \ge 0$. We refer to Chap. 17 below for a further account and some generalizations to Finsler manifolds.
(b) Bakry and his collaborators systematically investigated the case of $N \in [n, \infty]$ through the celebrated $\Gamma$-calculus approach. We refer to [18, 20] and a book [21] for this powerful and highly successful theory, and to Chaps. 14–16 below for some generalizations to the Finsler context. The $\Gamma$-calculus is concerned with linear diffusion operators satisfying the so-called $\Gamma_2$-criterion, which corresponds to the Bochner inequality for the (weighted) Laplacian in the (weighted) Riemannian setting. This theory fits well with (and is one of the precursors of) the rapidly developing theory of metric measure spaces satisfying synthetic Ricci curvature bounds. In fact, the $\Gamma_2$-criterion turned out to be equivalent to the corresponding Riemannian curvature-dimension condition. We refer to [11, 12, 95] as well as Chap. 18 for more details (see also Remark 14.8). We remark that the $\Gamma_2$-criterion is also called the curvature-dimension condition (see, e.g., [21]). Indeed, the curvature-dimension condition in the sense of Lott–Sturm–Villani was named due to the similarity to Bakry et al.’s theory (though the equivalence mentioned above is not obvious at all). To avoid confusion, by the curvature-dimension condition we will mean only the one in the sense of Lott–Sturm–Villani.
(c) An early investigation in the range $N \in (-\infty, 0]$ can be found in [225] in the context of $\Gamma$-calculus. Systematic studies have been carried out more recently. Some Poincaré-type and isoperimetric inequalities for $N \in (-\infty, 0]$ or $N \in (-\infty, 1)$ were investigated in [139] and [179], respectively (see Sect. 15.2 and Chap. 19). Beckner-type inequalities were studied in [22, 105] (see Sect. 16.2). The curvature-dimension conditions for $N \in (-\infty, 0)$ and $N = 0$ were given in [200] and [203], respectively (see Chap. 18), inspired by preceding works in [209, 210]. Moreover, a Cheeger–Gromoll-type splitting theorem was established under $\mathrm{Ric}_N \ge 0$ with $N \in (-\infty, 1]$ in [252], where in the critical case $N = 1$ we have only a warped product splitting. See also [253] for an interesting interpretation of $\mathrm{Ric}_1$.

9.3 Finsler Weighted Ricci Curvature

Let $(M, F)$ be a Finsler manifold and $m$ be a positive $C^\infty$-measure on $M$ (in the same manner as Sect. 9.2). We will call this triple $(M, F, m)$ a measured Finsler manifold. Inspired by the Riemannian characterization of the Finsler Ricci curvature (Theorem 5.12) as well as the definition of the Riemannian weighted Ricci curvature (Definition 9.5), one arrives at the following definition of the Finsler weighted Ricci curvature, introduced in [193].

Definition 9.11 (Finsler Weighted Ricci Curvature) Let $(M, F, m)$ be a triple as above. Given $v \in T_xM \setminus \{0\}$, let $\eta : (-\varepsilon, \varepsilon) \longrightarrow M$ be the geodesic with $\dot\eta(0) = v$ and decompose the measure $m$ as
\[
dm = e^{-\psi_\eta} \sqrt{\det\bigl(g_{ij}(\dot\eta)\bigr)}\; dx^1 dx^2 \cdots dx^n \tag{9.2}
\]
along $\eta$, where $\psi_\eta : (-\varepsilon, \varepsilon) \longrightarrow \mathbb{R}$ is a $C^\infty$-function. Then, for $N \in \mathbb{R} \setminus \{n\}$, define
\[
\mathrm{Ric}_N(v) := \mathrm{Ric}(v) + \psi_\eta''(0) - \frac{\psi_\eta'(0)^2}{N - n}.
\]
As the limits of $N \to \infty$ and $N \downarrow n$, we define
\[
\mathrm{Ric}_\infty(v) := \mathrm{Ric}(v) + \psi_\eta''(0),
\]
\[
\mathrm{Ric}_n(v) :=
\begin{cases}
\mathrm{Ric}(v) + \psi_\eta''(0) & \text{if } \psi_\eta'(0) = 0, \\
-\infty & \text{if } \psi_\eta'(0) \neq 0.
\end{cases}
\]
We also set $\mathrm{Ric}_N(0) := 0$. We say that $\mathrm{Ric}_N \ge K$ holds for $K \in \mathbb{R}$ if we have $\mathrm{Ric}_N(v) \ge K F^2(v)$ for all $v \in TM$.

Observe that we have $\mathrm{Ric}_n \ge K$ only if $\psi_\eta' = 0$ along all geodesics $\eta$ (see Remark 9.14 for a related discussion). Note also that $\mathrm{Ric}_N(cv) = c^2\, \mathrm{Ric}_N(v)$ for any $c \ge 0$, and the analogue of Lemma 9.8 holds verbatim.

A coordinate-free expression of the function $\psi_\eta$ in (9.2) can be given as follows. Denote by $m_{\eta(t)}$ (resp. $\mathrm{vol}_{g_{\dot\eta(t)}}$) the measure on $T_{\eta(t)}M$ induced from $m$ (resp. $g_{\dot\eta(t)}$). Then we have
\[
\psi_\eta(t) = \log\biggl( \frac{\mathrm{vol}_{g_{\dot\eta(t)}}(A)}{m_{\eta(t)}(A)} \biggr), \tag{9.3}
\]
where $A$ is any bounded open set in $T_{\eta(t)}M$. This shows that, in particular, $\mathrm{Ric}_N(v)$ is independent of the choice of a local coordinate system.

Remark 9.12 (Riemannian Characterization Again) In the same manner as Theorem 5.12, consider a vector field $V$ on a neighborhood $U$ of $x$ such that $V(x) = v$ and all integral curves of $V$ are geodesic. Then, using the volume measure of $g_V$, we can decompose $m$ as
\[
dm = e^{-\psi}\, d\mathrm{vol}_{g_V} = e^{-\psi} \sqrt{\det\bigl(g_{ij}(V)\bigr)}\; dx^1 dx^2 \cdots dx^n
\]
on $U$, where $\psi$ is a function on $U$. Recall that $V(\eta(t)) = \dot\eta(t)$ by construction and $\eta$ is a geodesic also for $g_V$ (see Sect. 4.1). Hence, we have $\psi(\eta(t)) = \psi_\eta(t)$ and $\mathrm{Ric}_N(v)$ in Definition 9.11 coincides with the weighted Ricci curvature $\mathrm{Ric}_N^{g_V}(v)$ of the weighted Riemannian manifold $(U, g_V, m)$.

From the above discussions, one can regard $\psi_\eta$ as a weight function of $m$ with respect to $g_{\dot\eta}$ (or $\mathrm{vol}_{g_{\dot\eta}}$). An important difference from the Riemannian case (Definition 9.5) is that one cannot take a weight function defined on the manifold $M$. Instead, a natural weight function in the Finsler case will be $\Psi \in C^\infty(TM \setminus 0)$ given by $\Psi(\dot\eta(t)) := \psi_\eta(t)$.

Exercise 9.13 Prove that the function $\Psi$ defined above is positively 0-homogeneous, i.e., $\Psi(cv) = \Psi(v)$ for all $v \in TM \setminus 0$ and $c > 0$.

We refer to [163] for a recent investigation on some comparison theorems in terms of a general positively 0-homogeneous weight function $\Psi$ (not necessarily induced from a measure). An important feature of this approach is that we can simultaneously consider the weighted case and the unweighted case (corresponding to $\Psi = 0$). This is not the case when we consider only weight functions associated with measures, since some Finsler manifolds do not admit any measure with $\Psi = 0$ (see Remark 10.11 below).

Remark 9.14 (S-Curvature) The quantity $S(v) := \psi_\eta'(0)$ appearing in the definition of $\mathrm{Ric}_N(v)$ is called the S-curvature and has been intensively studied (for instance, coupled with the Busemann–Hausdorff measure). Observe that $S$ is positively 1-homogeneous and $S(-v) = -S(v)$ holds if $F$ is reversible. Although we do not pursue such a direction, one can obtain some comparison theorems under the combination of a lower bound on the (unweighted) Ricci curvature and a (negative) lower bound on the S-curvature (see, e.g., [227, 228, 230], and also [248] for related results in the Riemannian setting). The condition $S = 0$, corresponding to the unweighted situation for Riemannian manifolds, is assumed in some analytic studies including those on the Finsler–Ricci flow (see [147]). A fundamental example satisfying $S = 0$ is the Busemann–Hausdorff measure on Berwald spaces (see Proposition 10.2).

9.4 Volume and Diameter Comparison Theorems

Having the Riemannian characterization of the weighted Ricci curvature in hand (Remark 9.12), we naturally expect that some comparison theorems on weighted Riemannian manifolds can be generalized to measured Finsler manifolds $(M, F, m)$. In this section, we consider geometric comparison theorems concerning the volume and diameter, along the lines of [193, Sect. 7] and [234]. We will use the function $s_\kappa : \mathbb{R} \longrightarrow \mathbb{R}$ ($\kappa \in \mathbb{R}$) defined by
\[
s_\kappa(t) :=
\begin{cases}
\dfrac{1}{\sqrt{\kappa}} \sin(\sqrt{\kappa}\, t) & \text{for } \kappa > 0, \\[2mm]
t & \text{for } \kappa = 0, \\[2mm]
\dfrac{1}{\sqrt{-\kappa}} \sinh(\sqrt{-\kappa}\, t) & \text{for } \kappa < 0.
\end{cases} \tag{9.4}
\]
Observe that $s_\kappa$ is the solution to the differential equation
\[
f'' + \kappa f = 0, \qquad f(0) = 0, \qquad f'(0) = 1.
\]
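
The function in (9.4) translates directly into code. The following is a minimal sketch (the name `s_kappa` and the finite-difference step are assumptions made for illustration) that evaluates $s_\kappa$ and checks the differential equation $f'' + \kappa f = 0$ approximately at a sample point.

```python
import math

def s_kappa(kappa, t):
    """The comparison function s_kappa(t) from (9.4)."""
    if kappa > 0:
        root = math.sqrt(kappa)
        return math.sin(root * t) / root
    if kappa < 0:
        root = math.sqrt(-kappa)
        return math.sinh(root * t) / root
    return t

def ode_residual(kappa, t, h=1e-4):
    """Approximate value of f''(t) + kappa * f(t) for f = s_kappa (should be near 0)."""
    second = (s_kappa(kappa, t + h) - 2.0 * s_kappa(kappa, t)
              + s_kappa(kappa, t - h)) / (h * h)
    return second + kappa * s_kappa(kappa, t)

if __name__ == "__main__":
    for kappa in (1.5, 0.0, -2.0):
        print(kappa, s_kappa(kappa, 0.7), ode_residual(kappa, 0.7))
    # For kappa > 0 the first positive zero is at pi / sqrt(kappa).
    print(s_kappa(4.0, math.pi / 2.0))
```

For $\kappa > 0$ the first positive zero $\pi/\sqrt{\kappa}$ is the radius restriction $R \le \pi\sqrt{(N-1)/K}$ appearing in the comparison theorems of this section with $\kappa = K/(N-1)$.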

The proof of the next theorem should be compared with that of Theorem 8.1. Recall (3.16) for the definition of open forward balls $B^+(x, r)$.

Theorem 9.15 (Bishop–Gromov Volume Comparison) Let $(M, F, m)$ be a forward or backward complete measured Finsler manifold satisfying $\mathrm{Ric}_N \ge K$ for some $K \in \mathbb{R}$ and $N \in [n, \infty)$. Then we have
\[
\frac{m(B^+(x, R))}{m(B^+(x, r))} \le \frac{\int_0^R s_{K/(N-1)}(t)^{N-1}\, dt}{\int_0^r s_{K/(N-1)}(t)^{N-1}\, dt}
\]
for all $x \in M$ and $0 < r < R$, where $R \le \pi\sqrt{(N-1)/K}$ when $K > 0$.

Proof This theorem can be reduced to the weighted Riemannian case in the same manner as the Bonnet–Myers theorem (see the last paragraph of Sect. 8.1). First we explain this reduction and then give a direct proof.

Let $V$ be the vector field on $M \setminus (\{x\} \cup \mathrm{Cut}(x))$ such that $V(\eta(t)) = \dot\eta(t)$ for all unit speed minimal geodesics $\eta : [0, l] \longrightarrow M$ with $\eta(0) = x$. Then one can reduce the claim to the volume comparison theorem on $(M \setminus (\{x\} \cup \mathrm{Cut}(x)), g_V, m)$ (the singularity of $g_V$ at $x$ does not affect the proof of the volume comparison).

We turn to a direct proof, which in fact follows essentially the same lines as the weighted Riemannian case. Fix a unit vector $v \in U_xM$ and put $\eta(t) := \exp_x(tv)$ for $t \in [0, l)$, where $\eta([0, l)) \subset M \setminus \mathrm{Cut}(x)$ (equivalently, $l \le \rho(v)$). Choose a $g_v$-orthonormal basis $\{e_i\}_{i=1}^{n-1} \cup \{v\} \subset T_xM$ and consider the vector fields
\[
E_i(t) := d(\exp_x)_{tv}(t e_i) \in T_{\eta(t)}M
\]
along $\eta$ for $i = 1, 2, \ldots, n-1$. Set
\[
h(t) := \Bigl[ \det\Bigl( g_{\dot\eta}\bigl( E_i(t), E_j(t) \bigr) \Bigr) \Bigr]^{1/(2(n-1))}
\]
as in the proof of Theorem 8.1, which represents the area form of $\partial B^+(x, t)$ at $\eta(t)$ with respect to $\mathrm{vol}_{g_{\dot\eta}}$. Recall from the Bishop inequality (8.2) that
\[
\frac{h''(t)}{h(t)} \le -\frac{\mathrm{Ric}(\dot\eta(t))}{n-1}. \tag{9.5}
\]
In order to estimate the growth of the measure $m$ taking its weight function into account, we put $m = e^{-\psi_\eta}\, \mathrm{vol}_{g_{\dot\eta}}$ along $\eta$ as in Definition 9.11 and introduce the functions
\[
f(t) := e^{-\psi_\eta(t)} h(t)^{n-1}, \qquad h_1(t) := e^{-\psi_\eta(t)/(N-n)}, \qquad h_2(t) := f(t)^{1/(N-1)}.
\]
Then we find $h_2 = h_1^{(N-n)/(N-1)} h^{(n-1)/(N-1)}$ and
\begin{align*}
(N-1) \frac{h_2''}{h_2}
&= (N-n) \frac{h_1''}{h_1} + \frac{(N-n)(1-n)}{N-1} \frac{(h_1')^2}{h_1^2} + (n-1) \frac{h''}{h} + \frac{(n-1)(n-N)}{N-1} \frac{(h')^2}{h^2} + \frac{2(N-n)(n-1)}{N-1} \frac{h_1' h'}{h_1 h} \\
&= (N-n) \frac{h_1''}{h_1} + (n-1) \frac{h''}{h} - \frac{(N-n)(n-1)}{N-1} \biggl( \frac{h_1'}{h_1} - \frac{h'}{h} \biggr)^2.
\end{align*}
Combining this with (9.5), we have
\[
(N-1) \frac{h_2''}{h_2}(t) \le -\psi_\eta''(t) + \frac{\psi_\eta'(t)^2}{N-n} - \mathrm{Ric}\bigl( \dot\eta(t) \bigr) = -\mathrm{Ric}_N\bigl( \dot\eta(t) \bigr).
\]
This is regarded as a weighted version of the Bishop inequality. Then we obtain from the hypothesis $\mathrm{Ric}_N \ge K$ that
\[
\frac{h_2''}{h_2} \le -\frac{K}{N-1}. \tag{9.6}
\]
Comparing (9.6) with
\[
\frac{s_{K/(N-1)}''}{s_{K/(N-1)}} = -\frac{K}{N-1},
\]
we find $(h_2' s_{K/(N-1)} - h_2 s_{K/(N-1)}')' \le 0$. Observe also that $\lim_{t \to 0} h_2'(t) s_{K/(N-1)}(t) = 0$ since $h_2'(t) = O(t^{(n-N)/(N-1)})$. Therefore, the function
\[
\frac{h_2(t)}{s_{K/(N-1)}(t)} = \biggl( \frac{f(t)}{s_{K/(N-1)}(t)^{N-1}} \biggr)^{1/(N-1)}
\]
is non-increasing in $t$. Then, we can compare the integrals of $f$ and $s_{K/(N-1)}^{N-1}$ by a well known argument by Gromov as follows (see, e.g., [66, Lemma III.4.1]): Putting $\varphi := s_{K/(N-1)}^{N-1}$ for simplicity, we have for $0 < r < R$
\begin{align*}
\int_0^r \varphi\, dt \int_0^R f\, dt
&= \int_0^r \varphi\, dt \int_0^r f\, dt + \int_0^r \varphi\, dt \int_r^R \varphi \cdot \frac{f}{\varphi}\, dt \\
&\le \int_0^r \varphi\, dt \int_0^r f\, dt + \frac{f(r)}{\varphi(r)} \int_0^r \varphi\, dt \int_r^R \varphi\, dt \\
&\le \int_0^r \varphi\, dt \int_0^r f\, dt + \int_0^r f\, dt \int_r^R \varphi\, dt \\
&= \int_0^r f\, dt \int_0^R \varphi\, dt. \tag{9.7}
\end{align*}
Finally, we integrate the above inequality in $v \in U_xM$. Let $A_x$ be the measure on $U_xM$ induced from the Riemannian metric $(g_v)_{v \in U_xM}$. Note that, since the forward or backward completeness ensures the existence of minimal geodesics (by the Hopf–Rinow theorem), we have
\[
m\bigl( B^+(x, r) \bigr) = \int_{U_xM} \int_0^{\min\{r, \rho(v)\}} f_v(t)\, dt\, A_x(dv),
\]
where $f_v$ denotes the function $f$ as above associated with $v \in U_xM$. Therefore, we obtain from (9.7) that
\begin{align*}
m\bigl( B^+(x, R) \bigr)
&\le \int_{U_xM} \frac{\int_0^{\min\{R, \rho(v)\}} \varphi\, dt}{\int_0^{\min\{r, \rho(v)\}} \varphi\, dt} \int_0^{\min\{r, \rho(v)\}} f_v(t)\, dt\, A_x(dv) \\
&\le \frac{\int_0^R \varphi\, dt}{\int_0^r \varphi\, dt} \int_{U_xM} \int_0^{\min\{r, \rho(v)\}} f_v(t)\, dt\, A_x(dv) \\
&= \frac{\int_0^R \varphi\, dt}{\int_0^r \varphi\, dt} \cdot m\bigl( B^+(x, r) \bigr).
\end{align*}
This completes the proof. □
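
The right-hand side of Theorem 9.15 is easy to evaluate numerically. The sketch below is an illustration under stated assumptions: `integrate` is a plain midpoint rule, `model_ratio` is a hypothetical helper name, and the radial density $f(t) = t e^{-t}$ is chosen only because $f/\varphi$ is non-increasing for $\varphi(t) = t^2$, so that Gromov's argument (9.7) applies to it; it is not claimed to come from a space with a Ricci curvature bound.

```python
import math

def s_kappa(kappa, t):
    # Comparison function from (9.4).
    if kappa > 0:
        root = math.sqrt(kappa)
        return math.sin(root * t) / root
    if kappa < 0:
        root = math.sqrt(-kappa)
        return math.sinh(root * t) / root
    return t

def integrate(fn, a, b, steps=4000):
    # Midpoint rule on [a, b].
    h = (b - a) / steps
    return h * sum(fn(a + (k + 0.5) * h) for k in range(steps))

def model_ratio(K, N, r, R):
    """Ratio of integrals of s_{K/(N-1)}^{N-1} over [0, R] and [0, r] (needs N > 1)."""
    kappa = K / (N - 1.0)
    def phi(t):
        return s_kappa(kappa, t) ** (N - 1.0)
    return integrate(phi, 0.0, R) / integrate(phi, 0.0, r)

def sample_density(t):
    # Hypothetical density with f / t^2 non-increasing (illustration only).
    return t * math.exp(-t)

if __name__ == "__main__":
    # K = 0: the bound reduces to (R / r)^N.
    print(model_ratio(0.0, 3.0, 1.0, 2.0))
    # K = 2, N = 3: kappa = 1 and the integrand is sin(t)^2.
    print(model_ratio(2.0, 3.0, 1.0, 2.0))
    # Gromov's argument: monotone f / phi gives the integral ratio bound.
    lhs = integrate(sample_density, 0.0, 2.0) / integrate(sample_density, 0.0, 1.0)
    print(lhs, "<=", model_ratio(0.0, 3.0, 1.0, 2.0))
```

In the case $K = 0$ the bound is the polynomial growth $(R/r)^N$, which is one way to read $N$ as an upper bound of the dimension.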

Remark 9.16 (Area Comparison) The monotonicity of $f_v / s_{K/(N-1)}^{N-1}$ in the above proof implies
\[
f_v(R) \le \frac{s_{K/(N-1)}(R)^{N-1}}{s_{K/(N-1)}(r)^{N-1}}\, f_v(r)
\]
for $0 < r < R < \rho(v)$. Directly integrating this inequality in $v \in U_xM$ shows that
\[
\frac{m^+(B^+(x, R))}{m^+(B^+(x, r))} \le \frac{s_{K/(N-1)}(R)^{N-1}}{s_{K/(N-1)}(r)^{N-1}}, \tag{9.8}
\]
where
\[
m^+\bigl( B^+(x, r) \bigr) := \lim_{\varepsilon \to 0} \frac{m(B^+(x, r + \varepsilon)) - m(B^+(x, r))}{\varepsilon}
\]
is the exterior boundary measure of the ball $B^+(x, r)$ (see (15.1) below for the general definition of $m^+$). The inequality (9.8) is an area comparison on concentric spheres (see [36, Sect. 11.10]), and an integration of (9.8) gives rise to Theorem 9.15 via Gromov’s argument as in (9.7). A “directional” version of (9.8) is called the measure contraction property, which is known to be useful in the synthetic geometric investigation of lower Ricci curvature bounds (see Subsect. 18.5.3 for a brief account).

In the case of $K > 0$, as a corollary to (the proof of) Theorem 9.15, we obtain a weighted version of the Bonnet–Myers theorem (compare this with the unweighted one in Theorem 8.1).

Corollary 9.17 (Bonnet–Myers Theorem (Weighted)) Let $(M, F, m)$ be forward or backward complete and satisfy $\mathrm{Ric}_N \ge K$ for some $K > 0$ and $N \in [n, \infty)$. Then we have
\[
\mathrm{diam}(M) \le \pi \sqrt{\frac{N-1}{K}}.
\]
In particular, $M$ is compact and has finite fundamental group.

Proof We saw in the proof of Theorem 9.15 that the function $f(t)\, s_{K/(N-1)}(t)^{1-N}$ is non-increasing in $t$. Since $s_{K/(N-1)}(\pi\sqrt{(N-1)/K}) = 0$, this implies that $h(t_0) = 0$ necessarily holds at some $t_0 \in (0, \pi\sqrt{(N-1)/K}]$. Then $\eta(t_0)$ is a conjugate point of $x = \eta(0)$ along $\eta$, and hence $\rho(v) \le \pi\sqrt{(N-1)/K}$. This completes the proof of the diameter bound. The topological assertions immediately follow from the same argument as the proof of Theorem 8.1. □

The next exercise is motivated by the synthetic geometric theory of metric measure spaces, e.g., by means of the measure contraction property or the curvature-dimension condition (we refer to [190, 197, 235]).

Exercise 9.18 Give an alternative proof of Corollary 9.17 by using only the assertion of Theorem 9.15 (instead of its proof).

We close this section with a consideration of the case of $N = \infty$ and $K > 0$. In this case, one cannot expect a diameter bound since Gaussian-type spaces can satisfy $\mathrm{Ric}_\infty \ge K > 0$ (see Sect. 10.1 below). Nonetheless, we can show that the measure $m$ enjoys the Gaussian decay and its total mass is finite (see [234, Theorem 4.26] in the context of the curvature-dimension condition).

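Theorem 9.19 (Finiteness of Total Mass for $N = \infty$) Let $(M, F, m)$ be forward or backward complete and satisfy $\mathrm{Ric}_\infty \ge K$ for some $K > 0$. Then we have
\[
m\bigl( B^+(x, r) \bigr) \le m\bigl( B^+(x, 4\varepsilon) \bigr) + C_1 \int_{4\varepsilon}^r e^{C_2 t} e^{-Kt^2/2}\, dt
\]
for any $x \in M$, sufficiently small $\varepsilon > 0$ and for all $r > 4\varepsilon$, where $C_1 = C_1(x)$ and $C_2 = C_2(\varepsilon)$ are positive constants depending on $x$ and $\varepsilon$, respectively. In particular, we have $m(M) < \infty$.

Proof Fix a unit vector $v \in U_xM$ and put $\eta(t) := \exp_x(tv)$. We use the functions $h$, $\psi_\eta$ and $f$ in the proof of Theorem 9.15. In order to see the claim, we shall estimate the ratio $f(t)/e^{-Kt^2/2}$. We deduce from the Bishop inequality (9.5) (or (8.2)) and the hypothesis $\mathrm{Ric}_\infty \ge K$ that
\begin{align*}
\frac{d^2}{dt^2} \Bigl[ \log\bigl( f(t) e^{Kt^2/2} \bigr) \Bigr]
&= \frac{d}{dt} \biggl[ (n-1) \frac{h'(t)}{h(t)} - \psi_\eta'(t) + Kt \biggr] \\
&= (n-1) \frac{h''(t) h(t) - h'(t)^2}{h(t)^2} - \psi_\eta''(t) + K \\
&\le -\mathrm{Ric}\bigl( \dot\eta(t) \bigr) - \psi_\eta''(t) + K \le 0.
\end{align*}
This concavity implies that, for $t > 4\varepsilon > 0$,
\[
\log\bigl( f(2\varepsilon) e^{2K\varepsilon^2} \bigr) \ge \frac{t - 2\varepsilon}{t - \varepsilon} \log\bigl( f(\varepsilon) e^{K\varepsilon^2/2} \bigr) + \frac{\varepsilon}{t - \varepsilon} \log\bigl( f(t) e^{Kt^2/2} \bigr).
\]
Since $\lim_{\varepsilon \to 0} f(\varepsilon) = 0$ (it indeed holds that $\lim_{\varepsilon \to 0} f(\varepsilon)/\varepsilon^{n-1} = e^{-\psi_\eta(0)}$ by Exercise 8.3), the left-hand side is negative if $\varepsilon > 0$ is sufficiently small. Then we obtain, assuming $f(\varepsilon) e^{K\varepsilon^2/2} \ge \varepsilon^n$,
\[
\log\bigl( f(t) e^{Kt^2/2} \bigr) \le -\frac{t - 2\varepsilon}{\varepsilon} \log\bigl( f(\varepsilon) e^{K\varepsilon^2/2} \bigr) \le -\frac{\log( f(\varepsilon) e^{K\varepsilon^2/2} )}{2\varepsilon}\, t \le \frac{\log(\varepsilon^{-n})}{2\varepsilon}\, t.
\]
Therefore, we have, setting $C_2(\varepsilon) := (\log \varepsilon^{-n})/(2\varepsilon)$,
\begin{align*}
m\bigl( B^+(x, r) \bigr)
&\le m\bigl( B^+(x, 4\varepsilon) \bigr) + \int_{U_xM} \int_{\min\{4\varepsilon, \rho(v)\}}^{\min\{r, \rho(v)\}} f_v(t)\, dt\, A_x(dv) \\
&\le m\bigl( B^+(x, 4\varepsilon) \bigr) + A_x(U_xM) \int_{4\varepsilon}^r e^{C_2 t} e^{-Kt^2/2}\, dt.
\end{align*}
This shows the first assertion. Then the second assertion immediately follows as
\[
m(M) \le m\bigl( B^+(x, 4\varepsilon) \bigr) + C_1 \int_{4\varepsilon}^\infty \exp\biggl( -\frac{K}{2} \Bigl( t - \frac{C_2}{K} \Bigr)^2 + \frac{C_2^2}{2K} \biggr) dt < \infty,
\]
which completes the proof. □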

We remark that, under the weaker condition $\mathrm{Ric}_N \ge K > 0$ with $N < 0$, the total mass is not necessarily finite. For instance, $(\mathbb{R}, |\cdot|, e^x\, dx)$ satisfies $\mathrm{Ric}_N = (1 - N)^{-1}$. Nonetheless, there exists another model space with finite total mass, $(\mathbb{R}, |\cdot|, \cosh^{N-1} x\, dx)$, which satisfies $\mathrm{Ric}_N = 1 - N$ and plays an important role in the study of Poincaré and isoperimetric inequalities (see [164, 165, 179] and Remark 15.6(b) below).

Exercise 9.20 Prove that the space
\[
\Biggl( \mathbb{R},\ |\cdot|,\ \cosh^{N-1}\biggl( \sqrt{\frac{K}{1-N}}\; x \biggr) dx \Biggr)
\]
with $K > 0$ and $N < 0$ indeed satisfies $\mathrm{Ric}_N = K$ (i.e., $\mathrm{Ric}_N(v) = K|v|^2$ for all $v \in T\mathbb{R}$).
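
The two one-dimensional model spaces above can be checked numerically before attempting the exercise. The sketch below assumes $n = 1$, so that $\mathrm{Ric}_N(\partial/\partial x) = \psi'' - (\psi')^2/(N-1)$ for the weight function $\psi$ with $m = e^{-\psi}\, dx$; the function names and the sample values $K = 2$, $N = -3$ are illustrative assumptions, and finite differences replace exact derivatives.

```python
import math

def ric_N_1d(psi, x, N, h=1e-4):
    """Weighted Ricci curvature of (R, |.|, e^{-psi} dx) at x in the unit direction."""
    d1 = (psi(x + h) - psi(x - h)) / (2.0 * h)
    d2 = (psi(x + h) - 2.0 * psi(x) + psi(x - h)) / (h * h)
    return d2 - d1 * d1 / (N - 1.0)

def psi_cosh(K, N):
    """Weight function of cosh^{N-1}(sqrt(K / (1 - N)) x) dx, for K > 0 and N < 0."""
    c = math.sqrt(K / (1.0 - N))
    return lambda x: -(N - 1.0) * math.log(math.cosh(c * x))

def psi_exp(x):
    # Weight function of the measure e^x dx.
    return -x

if __name__ == "__main__":
    K, N = 2.0, -3.0  # sample values, chosen only for illustration
    for x in (-1.0, 0.0, 0.8):
        print(x, ric_N_1d(psi_cosh(K, N), x, N))   # approximately K
    print(ric_N_1d(psi_exp, 0.7, N), 1.0 / (1.0 - N))  # approximately equal
```

The first model has finite total mass because $\cosh^{N-1}$ decays exponentially when $N < 1$, whereas $e^x\, dx$ has infinite mass.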

10 Examples of Measured Finsler Manifolds

In this chapter, we analyze the weighted Ricci curvature for some examples appearing in Chap. 6. We remark that a suitable choice of a measure is unclear in some cases. In fact, the theory of weighted Ricci curvature has so far been focused on general theory and there are not many investigations on concrete examples. One of the most important observations in this chapter is the absence of “canonical measures” in some Randers spaces. Precisely, we will see that some Randers spaces do not admit any measure whose S-curvature vanishes identically (see Remark 10.11).

© The Author(s), under exclusive license to Springer Nature Switzerland AG 2021 S.-i. Ohta, Comparison Finsler Geometry, Springer Monographs in Mathematics, https://doi.org/10.1007/978-3-030-80650-7_10

10.1 Minkowski Normed Spaces

Let $(\mathbb{R}^n, \|\cdot\|)$ be a Minkowski normed space (recall Example 2.12(a), Sect. 6.1) and consider a measure $m = e^{-\psi} \mathcal{L}^n$ with $\psi \in C^\infty(\mathbb{R}^n)$. Then the weighted Ricci curvature of $(\mathbb{R}^n, \|\cdot\|, m)$ is given by
\[
\mathrm{Ric}_N(v) = (\psi \circ \eta_v)''(0) - \frac{(\psi \circ \eta_v)'(0)^2}{N - n}
\]
for $v \in T_x\mathbb{R}^n$, where $\eta_v(t) := x + tv$ for $t \in \mathbb{R}$ (by identifying $T_x\mathbb{R}^n$ and $\mathbb{R}^n$). We also find that $S(v) = (\psi \circ \eta_v)'(0) = d\psi(v)$ (recall Remark 9.14), and hence the S-curvature is identically 0 if and only if $\psi$ is a constant function (see Exercise 10.7 below for a related fact on general measured Finsler manifolds).

Exercise 10.1 Prove the above expression of the weighted Ricci curvature $\mathrm{Ric}_N(v)$.

In the case of $N = \infty$, the weighted Ricci curvature $\mathrm{Ric}_\infty$ is directly related to the convexity (or concavity) of $\psi$. If $\psi$ is $K$-convex in the sense that $(\psi \circ \eta_v)''(0) \ge K \|v\|^2$ for all $v \in T\mathbb{R}^n$, then we have $\mathrm{Ric}_\infty \ge K$. When $K = 0$ (i.e., $\psi$ is convex), we encounter a log-concave measure as in Remark 9.9(c). Another important example is a Gaussian-type measure
\[
m(dx) = e^{-K\|x\|^2/2}\, \mathcal{L}^n(dx)
\]
with $K > 0$, which satisfies $\mathrm{Ric}_\infty \ge K C^{-1}$ for the uniform convexity constant $C$ of $(\mathbb{R}^n, \|\cdot\|)$ as in (1.2). Here, precisely, $\mathrm{Ric}_\infty \ge K C^{-1}$ is understood in the weak sense (around the origin) since the weight function $\psi(x) = K\|x\|^2/2$ is only $C^1$ at the origin (recall Subsect. 1.2.2).

10.2 Berwald Spaces

We next consider Berwald spaces (recall Sect. 6.3). In this case, the Busemann–Hausdorff measure can be regarded as a canonical measure since its S-curvature vanishes identically. Recall Definition 9.1 for the definition of the Busemann–Hausdorff measure.

Proposition 10.2 (S-Curvature Vanishes for $m_{\mathrm{BH}}$) Let $(M, F, m_{\mathrm{BH}})$ be a Berwald space equipped with the Busemann–Hausdorff measure. Then the corresponding S-curvature is identically 0.

Proof Given $v \in U_xM$, let $\eta : (-\varepsilon, \varepsilon) \longrightarrow M$ be the geodesic with $\dot\eta(0) = v$. Fix an orthonormal basis $\{e_i\}_{i=1}^{n-1} \cup \{v\} \subset T_xM$ with respect to $g_v$, and let $P_i$ be the parallel vector field along $\eta$ with $P_i(0) = e_i$ for $i = 1, 2, \ldots, n-1$. (Recall that the covariant derivative is independent of the choice of a reference vector in Berwald spaces.) We consider a local coordinate system around $x$ given by
\[
\exp_{\eta(t)}\biggl( \sum_{i=1}^{n-1} a^i P_i(t) \biggr) \longmapsto (a^1, a^2, \ldots, a^{n-1}, t) \in \mathbb{R}^n.
\]
We remark that the exponential map $\exp_{\eta(t)}$ is $C^\infty$ on a neighborhood of 0 by Proposition 6.11. We shall compare $\mathrm{vol}_{g_{\dot\eta}}$ and $m_{\mathrm{BH}}$ along $\eta$ by using this coordinate system. On one hand, observe that
\[
g_{ij}\bigl( \dot\eta(t) \bigr) = g_{\dot\eta}\biggl( \frac{\partial}{\partial x^i}\Big|_{\eta(t)}, \frac{\partial}{\partial x^j}\Big|_{\eta(t)} \biggr) = g_{\dot\eta}\bigl( P_i(t), P_j(t) \bigr),
\]
where we set $P_n(t) := \dot\eta(t)$. Since $P_i$ is a parallel vector field for all $1 \le i \le n$, we have $g_{ij}(\dot\eta(t)) = \delta_{ij}$ for all $t$. Therefore we obtain $\mathrm{vol}_{g_{\dot\eta}}(dx) = dx^1 dx^2 \cdots dx^n$ along $\eta$. On the other hand, since the parallel transport along $\eta$ preserves $F$ (by Proposition 6.5; here we essentially need the Berwald condition), we find that
\[
F\biggl( \sum_{i=1}^n a^i \frac{\partial}{\partial x^i}\Big|_{\eta(t)} \biggr) = F\biggl( \sum_{i=1}^n a^i P_i(t) \biggr)
\]
is constant in $t$ for all $(a^i)_{i=1}^n \in \mathbb{R}^n$. Hence, the Busemann–Hausdorff measure $m_{\mathrm{BH}}$ is a constant multiple of $dx^1 dx^2 \cdots dx^n$ along $\eta$. Therefore the weight function $\psi_\eta$ along $\eta$ is constant, and we obtain $S(v) = 0$. □

In particular, for the Busemann–Hausdorff measure $m_{\mathrm{BH}}$, we have $\mathrm{Ric}_N = \mathrm{Ric}$ for all $N$. This is, however, a special feature of Berwald spaces and it is not always possible to find such a nice measure, as we will discuss in the next section.

Combining Proposition 10.2 with Theorem 9.15 yields the following upper bound of the volume of balls, as a direct analogue to Bishop’s comparison theorem in the Riemannian case (recall Remark 8.4). Observe that the right-hand side of (10.1) is the volume of a ball of radius $r$ in the $n$-dimensional model space of constant sectional curvature $K/(n-1)$.

Corollary 10.3 (Bishop’s Comparison Theorem) Let $(M, F)$ be a complete Berwald space satisfying $\mathrm{Ric} \ge K$ for some $K \in \mathbb{R}$. Then we have
\[
m_{\mathrm{BH}}\bigl( B^+(x, r) \bigr) \le n\omega_n \int_0^r s_{K/(n-1)}(t)^{n-1}\, dt \tag{10.1}
\]
for any $x \in M$ and $r > 0$, where $r \le \pi\sqrt{(n-1)/K}$ if $K > 0$.

Proof We deduce from Proposition 10.2 that $\mathrm{Ric}_n = \mathrm{Ric}$ for $m_{\mathrm{BH}}$. Thus it follows from Theorem 9.15 with $\mathrm{Ric}_n \ge K$ that
\[
\frac{m_{\mathrm{BH}}(B^+(x, r))}{m_{\mathrm{BH}}(B^+(x, \varepsilon))} \le \frac{\int_0^r s_{K/(n-1)}(t)^{n-1}\, dt}{\int_0^\varepsilon s_{K/(n-1)}(t)^{n-1}\, dt}
\]
for $0 < \varepsilon < r$. Letting $\varepsilon \to 0$ and recalling that the unit ball in $T_xM$ has the volume $\omega_n$ with respect to $m_{\mathrm{BH}}$ (see Definition 9.1), we obtain
\begin{align*}
m_{\mathrm{BH}}\bigl( B^+(x, r) \bigr)
&\le n \lim_{\varepsilon \to 0} \frac{m_{\mathrm{BH}}(B^+(x, \varepsilon))}{\varepsilon^n} \int_0^r s_{K/(n-1)}(t)^{n-1}\, dt \\
&= n\omega_n \int_0^r s_{K/(n-1)}(t)^{n-1}\, dt
\end{align*}
as desired. □

Exercise 10.4 Using the area comparison (9.8) instead of Theorem 9.15, prove that
\[
m_{\mathrm{BH}}^+\bigl( B^+(x, r) \bigr) \le n\omega_n\, s_{K/(n-1)}(r)^{n-1}
\]
holds in the same situation as Corollary 10.3. Integrating this inequality in $r$ recovers (10.1).

Exercise 10.5 For a measure $m = e^{-\psi} m_{\mathrm{BH}}$ on a Berwald space with $\psi \in C^\infty(M)$, show that we have
\[
\mathrm{Ric}_N(v) = \mathrm{Ric}(v) + g_{\nabla\psi}\bigl( v, D_v(\nabla\psi) \bigr) - \frac{(d\psi(v))^2}{N - n}
\]
for $v \in T_xM$ provided that $d\psi_x \neq 0$, where $\nabla\psi(x) := \mathcal{L}^*(d\psi_x) \in T_xM$ is the gradient vector of $\psi$ at $x$ (see (11.6) below).

We remark that the Finsler Hessian of a function $\psi$ will be defined by $\nabla^2\psi(v) := D_v^{\nabla\psi}(\nabla\psi)$ (see (12.1)). Thus, the above expression of $\mathrm{Ric}_N(v)$ has a similar form to the Riemannian weighted Ricci curvature in Definition 9.5 by replacing $\mathrm{vol}_g$ with $m_{\mathrm{BH}}$.

10.3 Randers Spaces For Randers spaces (recall Example 2.12(b), Sect. 6.4), we encounter a different phenomenon from Berwald spaces: There may not be any measure whose Scurvature vanishes. Our discussion will follow the lines of [195, 230], with the help of some calculations in Sect. 6.4.

10.3.1 Properties of the S-Curvature

We begin with an auxiliary lemma on the S-curvature of a general measured Finsler manifold. Recall that the S-curvature is defined as $S(v) := \psi_\eta'(0)$ (Remark 9.14), where $\eta$ is the geodesic with $\dot\eta(0) = v$ and the function $\psi_\eta$ is given by (9.2) or (9.3).

Lemma 10.6 (A Coordinate Representation of the S-Curvature) Let $(M, F, m)$ be a measured Finsler manifold and take $x \in M$. We represent $m$ in a local coordinate system $(x^i)_{i=1}^n$ around $x \in M$ as $dm = e^{\Phi}\, dx^1 dx^2 \cdots dx^n$. Then we have, for any $v \in T_xM$,
\[
S(v) = \sum_{i=1}^n \Bigl\{ N^i_i(v) - \frac{\partial\Phi}{\partial x^i}(x)\, v^i \Bigr\}. \tag{10.2}
\]

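Proof Let $\eta : (-\varepsilon, \varepsilon) \longrightarrow M$ be the geodesic with $\dot\eta(0) = v$. Then the S-curvature is written as
\[
S(v) = \frac{d}{dt}\Bigl[ \frac{1}{2}\log\det\bigl(g_{ij}(\dot\eta)\bigr) - \Phi(\eta) \Bigr](0)
= \frac{1}{2\det(g_{ij}(v))}\, \frac{d[\det(g_{ij}(\dot\eta))]}{dt}(0) - \sum_{i=1}^n \frac{\partial\Phi}{\partial x^i}(x)\, v^i.
\]
Thanks to the geodesic equation $\ddot\eta^i + G^i(\dot\eta) = 0$ and (4.10), the first term in the right-hand side is calculated as
\begin{align*}
\frac{1}{2}\,\mathrm{trace}\Bigl[ \bigl(g^{ij}(v)\bigr) \cdot \frac{d[g_{ij}(\dot\eta)]}{dt}(0) \Bigr]
&= \frac{1}{2} \sum_{i,j,k=1}^n g^{ij}(v) \Bigl\{ \frac{\partial g_{ji}}{\partial x^k}(v)\, v^k + \frac{\partial g_{ji}}{\partial v^k}(v)\, \ddot\eta^k(0) \Bigr\} \\
&= \sum_{i,k=1}^n \gamma^i_{ki}(v)\, v^k - \frac{1}{F(v)} \sum_{i,j,k=1}^n g^{ij}(v) A_{jik}(v) G^k(v) \\
&= \sum_{i=1}^n N^i_i(v).
\end{align*}
This completes the proof.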

 

Exercise 10.7 Let m1 and m2 be two measures on a Finsler manifold (M, F ) such that their S-curvatures coincide everywhere on T M. Then, show that it necessarily holds that m1 = cm2 for some constant c > 0.

10.3.2 Randers Spaces of Vanishing S-Curvature

We turn to a Randers space $(M, F)$ given by $F(v) = \sqrt{\alpha(v, v)} + \beta(v)$. We will use the notations in Sect. 6.4: $a_{ij} = \alpha(\partial/\partial x^i, \partial/\partial x^j)$, $b_i = \beta(\partial/\partial x^i)$, $\|v\|_\alpha = \sqrt{\alpha(v, v)}$, $\|\beta\|_\alpha = \sqrt{\sum_{i,j=1}^n a^{ij} b_i b_j}$, etc. The next lemma is related to Exercise 2.13.

Lemma 10.8 (Busemann–Hausdorff Measure on Randers Spaces) Let $(M, F)$ be a Randers space with $F(v) = \sqrt{\alpha(v, v)} + \beta(v)$. Then the Busemann–Hausdorff measure is given in local coordinates as
\[
m_{\mathrm{BH}}(dx) = \bigl(1 - \|\beta\|_\alpha^2(x)\bigr)^{(n+1)/2} \sqrt{\det\bigl(a_{ij}(x)\bigr)}\; dx^1 dx^2 \cdots dx^n.
\]


Proof Recalling the definition of $m_{\mathrm{BH}}$ (Definition 9.1), we find that it suffices to show
\[
\mathcal{L}^n\Bigl( \Bigl\{ v \in \mathbb{R}^n \,\Big|\, \|v\|_A + \sum_{i=1}^n b_i v^i < 1 \Bigr\} \Bigr) = \frac{\omega_n}{(1 - \|b\|_{A^{-1}}^2)^{(n+1)/2} \sqrt{\det A}} \tag{10.3}
\]
for any symmetric positive-definite matrix $A = (a_{ij})_{i,j=1}^n$ and any (co-)vector $b = (b_i)_{i=1}^n$ with $\|b\|_{A^{-1}} < 1$, where we denote $A^{-1}$ by $(a^{ij})_{i,j=1}^n$ and set
\[
\|v\|_A := \sqrt{\sum_{i,j=1}^n a_{ij} v^i v^j}, \qquad \|b\|_{A^{-1}} := \sqrt{\sum_{i,j=1}^n a^{ij} b_i b_j}.
\]

The condition $\|v\|_A + \sum_{i=1}^n b_i v^i < 1$ is equivalent to $\|v\|_A^2 < (1 - \sum_{i=1}^n b_i v^i)^2$ (since $-\|v\|_A \le -\sum_{i=1}^n b_i v^i < 1 - \sum_{i=1}^n b_i v^i$), which can be rewritten as
\[
(v + c)^T \cdot (A - bb^T) \cdot (v + c) < 1 + c^T \cdot (A - bb^T) \cdot c,
\]
where $c := (A - bb^T)^{-1} \cdot b$. Hence, we observe that the set in the left-hand side of (10.3) is the inside of an ellipsoid with center $-c$ (we remark that $A - bb^T$ is positive-definite by the hypothesis $\|b\|_{A^{-1}} < 1$). The volume of this set is given by
\[
\frac{\omega_n}{\sqrt{\det(A - bb^T)}} \bigl( 1 + c^T \cdot (A - bb^T) \cdot c \bigr)^{n/2}. \tag{10.4}
\]

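To calculate (10.4), on one hand, we have
\[
\det(A - bb^T) = \det\bigl( I_n - A^{-1/2} b \cdot (A^{-1/2} b)^T \bigr) \cdot \det A = (1 - \|b\|_{A^{-1}}^2) \cdot \det A
\]
(notice that $\|b\|_{A^{-1}}^2$ is the only nonzero eigenvalue of the matrix $A^{-1/2} b \cdot (A^{-1/2} b)^T$). On the other hand, we deduce from (6.6) that
\[
c = \Bigl( A^{-1} + \frac{(A^{-1} b) \cdot (A^{-1} b)^T}{1 - \|b\|_{A^{-1}}^2} \Bigr) \cdot b = \Bigl( 1 + \frac{\|b\|_{A^{-1}}^2}{1 - \|b\|_{A^{-1}}^2} \Bigr) A^{-1} b = \frac{1}{1 - \|b\|_{A^{-1}}^2}\, A^{-1} b,
\]
and hence
\[
1 + c^T \cdot (A - bb^T) \cdot c = 1 + \frac{\|b\|_{A^{-1}}^2}{1 - \|b\|_{A^{-1}}^2} = \frac{1}{1 - \|b\|_{A^{-1}}^2}.
\]
Substituting these into (10.4) completes the proof of (10.3).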
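As a sanity check of (10.3), one can compare the closed formula with a direct numerical computation in dimension $n = 2$, where $\omega_2 = \pi$. The sketch below is only an illustration: the matrix `A` and the covector `b` in the usage note are hypothetical sample data, and the area of the unit ball is computed in polar coordinates, using that the boundary in the direction $u_\theta$ lies at radius $1/F(u_\theta)$.

```python
import math

def randers_unit_ball_area(A, b, steps=4000):
    """Area of {v in R^2 : sqrt(v^T A v) + b.v < 1} via (1/2) * int r(theta)^2 d(theta)."""
    (a11, a12), (_, a22) = A
    total = 0.0
    for k in range(steps):
        th = 2 * math.pi * k / steps
        c, s = math.cos(th), math.sin(th)
        # F evaluated on the Euclidean unit vector (c, s); the boundary radius is 1/F.
        F = math.sqrt(a11 * c * c + 2 * a12 * c * s + a22 * s * s) + b[0] * c + b[1] * s
        total += 0.5 / (F * F)
    return total * 2 * math.pi / steps

def closed_form_area(A, b):
    """Right-hand side of (10.3) for n = 2."""
    (a11, a12), (_, a22) = A
    det = a11 * a22 - a12 * a12
    # ||b||^2_{A^{-1}} = b^T A^{-1} b, written out for a 2x2 matrix.
    bn2 = (a22 * b[0] ** 2 - 2 * a12 * b[0] * b[1] + a11 * b[1] ** 2) / det
    return math.pi / ((1 - bn2) ** 1.5 * math.sqrt(det))
```

For instance, with `A = ((2.0, 0.5), (0.5, 1.0))` and `b = (0.3, 0.2)` (so that $\|b\|_{A^{-1}} < 1$), the two functions should agree up to quadrature error.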

 


We call a 1-form $\beta$ a Killing form if $b_{ij} + b_{ji} = 0$ holds (recall (6.5) for the definition of $b_{ij}$). Precisely, a Killing form is the dual of a Killing vector field on the Riemannian manifold $(M, \alpha)$ as follows.

Exercise 10.9 (Killing Vector Fields on $(M, \alpha)$) Show that, on the Riemannian manifold $(M, \alpha)$, the dual vector field (the Legendre transformation) of $\beta$ given by
\[
X(x) := \sum_{i,j=1}^n a^{ij}(x)\, b_j(x)\, \frac{\partial}{\partial x^i}\Big|_x
\]
is Killing if and only if $b_{ij} + b_{ji} = 0$. Here we call $X$ a Killing vector field if we have $\alpha(\widetilde{\nabla}_Y X, Z) + \alpha(Y, \widetilde{\nabla}_Z X) = 0$ for all vector fields $Y$ and $Z$, where $\widetilde{\nabla}$ denotes the Levi-Civita connection of $\alpha$. A Killing vector field preserves the metric, i.e., the flow generated by a Killing vector field consists of isometries of $\alpha$.

In [230, Example 7.3.2], Shen gave a characterization of Randers spaces whose Busemann–Hausdorff measure enjoys $S \equiv 0$ (see (ii) in the next proposition). Along the same lines but for a general measure $m$, we can also give a necessary condition for a Randers space to admit a measure of vanishing S-curvature.

Proposition 10.10 (When S-Curvature Vanishes) Let $(M, F)$ be a Randers space with $F(v) = \sqrt{\alpha(v, v)} + \beta(v)$.

(i) If there is a measure $m$ on $M$ such that the associated S-curvature is identically 0, then the function
\[
v \longmapsto \frac{1}{F(v)} \sum_{i,j=1}^n (b_{ij} + b_{ji})(x)\, v^i v^j + \frac{2\|v\|_\alpha}{F(v)} \sum_{i,j=1}^n (b_{ij} - b_{ji})(x)\, b^j(x)\, v^i \tag{10.5}
\]
is linear in $v \in T_xM$ (for all $x$ in the domain of the local coordinate system).

(ii) The S-curvature associated with the Busemann–Hausdorff measure $m_{\mathrm{BH}}$ on $(M, F)$ is identically 0 if and only if we have
\[
(1 - \|\beta\|_\alpha^2)(b_{ij} + b_{ji})(x) + 2 \sum_{k=1}^n b^k(x)(b_{ki} b_j + b_{kj} b_i)(x) = 0 \tag{10.6}
\]
for all $1 \le i, j \le n$. In particular, if $\beta$ is a Killing form such that its length $\|\beta\|_\alpha$ is constant, then we have $S \equiv 0$ for $m_{\mathrm{BH}}$.

Proof We will suppress $\alpha$ in $\|v\|_\alpha$ and $\|\beta\|_\alpha$ throughout this proof.

(i) If $S$ is identically 0, then we deduce from (10.2) that $\sum_{i=1}^n N^i_i$ is necessarily linear in each tangent space $T_xM$. We saw in Proposition 6.15 that

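\[
G^i(v) = \sum_{j,k=1}^n \widetilde{\gamma}^i_{jk}(x)\, v^j v^k + \sum_{j,k=1}^n b_{jk}(x) \bigl\{ a^{ij}(x) v^k - a^{ik}(x) v^j \bigr\} \|v\| + \sum_{j,k=1}^n b_{jk}(x)\, \frac{v^i}{F(v)} \bigl\{ v^j v^k + \bigl( b^k(x) v^j - b^j(x) v^k \bigr) \|v\| \bigr\}.
\]
We set
\[
X^i(v) := \sum_{j,k=1}^n b_{jk}(x) \bigl\{ a^{ij}(x) v^k - a^{ik}(x) v^j \bigr\} \|v\|, \qquad
Y^i(v) := \sum_{j,k=1}^n b_{jk}(x)\, \frac{v^i}{F(v)} \bigl\{ v^j v^k + \bigl( b^k(x) v^j - b^j(x) v^k \bigr) \|v\| \bigr\}.
\]
Since $2N^i_i = \partial G^i/\partial v^i$, the linearity of $\sum_{i=1}^n N^i_i$ is equivalent to the linearity of
\[
\sum_{i=1}^n \Bigl( \frac{\partial X^i}{\partial v^i} + \frac{\partial Y^i}{\partial v^i} \Bigr).
\]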

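Observe first that, omitting the evaluations at $x$,
\[
\sum_{i=1}^n \frac{\partial X^i}{\partial v^i}(v) = \sum_{i,j=1}^n b_{ji}(a^{ij} - a^{ji}) \|v\| + \sum_{i,j,k,l=1}^n b_{jk} \bigl\{ a^{ij} v^k - a^{ik} v^j \bigr\} \frac{a_{il} v^l}{\|v\|} = 0.
\]
Next, since
\[
\sum_{i=1}^n \frac{\partial}{\partial v^i}\Bigl[ \frac{v^i}{F(v)} \Bigr] = \frac{n}{F(v)} - \frac{1}{F^2(v)} \sum_{i=1}^n v^i \frac{\partial F}{\partial v^i}(v) = \frac{n-1}{F(v)}
\]
by Theorem 2.8, we have
\begin{align*}
\sum_{i=1}^n \frac{\partial Y^i}{\partial v^i}(v)
&= \sum_{i=1}^n \frac{v^i}{F(v)} \Bigl[ \sum_{j=1}^n \bigl\{ (b_{ij} + b_{ji}) v^j + (b_{ij} - b_{ji}) b^j \|v\| \bigr\} + \sum_{j,k,l=1}^n b_{jk} \bigl( b^k v^j - b^j v^k \bigr) \frac{a_{il} v^l}{\|v\|} \Bigr] \\
&\quad + \frac{n-1}{F(v)} \sum_{j,k=1}^n b_{jk} \bigl\{ v^j v^k + (b^k v^j - b^j v^k) \|v\| \bigr\} \\
&= \frac{n+1}{2F(v)} \sum_{i,j=1}^n (b_{ij} + b_{ji})\, v^i v^j + \frac{(n+1)\|v\|}{F(v)} \sum_{i,j=1}^n (b_{ij} - b_{ji})\, b^j v^i.
\end{align*}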


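Therefore the function in (10.5) is necessarily linear in $v \in T_xM$.

(ii) We shall calculate the S-curvature of $m_{\mathrm{BH}}$. It follows from Lemma 10.8 that the function $\Phi$ in Lemma 10.6 is given by
\[
\Phi = \frac{n+1}{2} \log(1 - \|\beta\|^2) + \frac{1}{2} \log\bigl( \det(a_{ij}) \bigr).
\]
Thus we have
\begin{align*}
\sum_{i=1}^n \frac{\partial\Phi}{\partial x^i}\, v^i
&= -\frac{n+1}{2(1 - \|\beta\|^2)} \sum_{i=1}^n \frac{\partial[\|\beta\|^2]}{\partial x^i}\, v^i + \frac{1}{2} \sum_{i=1}^n \mathrm{trace}\Bigl[ (a^{jk}) \cdot \Bigl( \frac{\partial a_{jk}}{\partial x^i} \Bigr) \Bigr] v^i \\
&= -\frac{n+1}{1 - \|\beta\|^2} \sum_{i,j=1}^n \Bigl\{ \frac{\partial b_j}{\partial x^i}\, b^j v^i - \frac{1}{2} \sum_{k=1}^n \frac{\partial a_{jk}}{\partial x^i}\, b^j b^k v^i \Bigr\} + \sum_{i,j=1}^n \widetilde{\gamma}^j_{ij}\, v^i \\
&= -\frac{n+1}{1 - \|\beta\|^2} \sum_{i,j=1}^n b_{ji}\, b^j v^i + \sum_{i,j=1}^n \widetilde{\gamma}^j_{ij}\, v^i. \tag{10.7}
\end{align*}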

Substituting this and the calculations in (i) into (10.2), we obtain
\[
S(v) = \frac{n+1}{4F(v)} \sum_{i,j=1}^n (b_{ij} + b_{ji})\, v^i v^j + \frac{(n+1)\|v\|}{2F(v)} \sum_{i,j=1}^n (b_{ij} - b_{ji})\, b^j v^i + \frac{n+1}{1 - \|\beta\|^2} \sum_{i,j=1}^n b^j b_{ji}\, v^i.
\]
This can be rewritten as, since $F(v) = \|v\| + \beta(v)$,
\begin{align*}
\frac{4F(v)}{n+1}\, S(v)
&= \sum_{i,j=1}^n (b_{ij} + b_{ji})\, v^i v^j + \frac{4\beta(v)}{1 - \|\beta\|^2} \sum_{i,j=1}^n b^j b_{ji}\, v^i \\
&\quad + 2\|v\| \sum_{i,j=1}^n (b_{ij} - b_{ji})\, b^j v^i + \frac{4\|v\|}{1 - \|\beta\|^2} \sum_{i,j=1}^n b^j b_{ji}\, v^i.
\end{align*}
Comparing the evaluations at $v$ and $-v$ in the right-hand side, we find that $S \equiv 0$ holds if and only if we have
\[
b_{ij} + b_{ji} + \frac{2}{1 - \|\beta\|^2} \sum_{k=1}^n b^k (b_{ki} b_j + b_{kj} b_i) = 0 \tag{10.8}
\]
for all $1 \le i, j \le n$ and

\[
\sum_{j=1}^n (b_{ij} - b_{ji})\, b^j + \frac{2}{1 - \|\beta\|^2} \sum_{j=1}^n b^j b_{ji} = 0 \tag{10.9}
\]
for all $1 \le i \le n$. Note that (10.8) is equivalent to (10.6). Moreover, if we assume (10.8), then contracting it with $b^i$ and $b^j$ yields
\[
2 \sum_{i,j=1}^n b^i b^j b_{ij} + \frac{4\|\beta\|^2}{1 - \|\beta\|^2} \sum_{i,j=1}^n b^i b^j b_{ij} = 0,
\]
and hence $\sum_{i,j=1}^n b^i b^j b_{ij} = 0$. Plugging it into the contraction of (10.8) with $b^j$, we further obtain
\[
\sum_{j=1}^n b^j b_{ij} + \frac{1 + \|\beta\|^2}{1 - \|\beta\|^2} \sum_{j=1}^n b^j b_{ji} = 0.
\]
This is equivalent to (10.9), therefore (10.8) actually implies (10.9). This completes the proof of the first assertion.

In order to see the second assertion, observe from the calculation in (10.7) that
\[
\frac{\partial[\|\beta\|^2]}{\partial x^i} = 2 \sum_{j=1}^n b_{ji}\, b^j.
\]
Therefore, if $\beta$ is a Killing form of constant length, then (10.6) is satisfied and the S-curvature vanishes.

Remark 10.11 (Absence of Canonical Measure) We can easily construct Randers spaces not satisfying the condition in Proposition 10.10(i). For example, if $b_1(x) = 0$ but $b_{11}(x) \neq 0$, then $v = (\partial/\partial x^1)|_x$ satisfies $F(v) = \|v\|_\alpha$, and the value of the function (10.5) at $-v$ is not the negative of its value at $v$ (since the first term in the right-hand side of (10.5) is not 0). Therefore, in general, a Finsler manifold does not necessarily possess a measure whose S-curvature vanishes identically. This is one of the reasons why we think it natural to begin with an arbitrary measure.

One can also see from [230, Example 7.3.3] that a Killing form of constant length is not necessarily parallel. Hence, the S-curvature can vanish for some non-Berwald Randers spaces (recall Theorem 6.17).

10.4 Hilbert and Funk Geometries

Finally, we consider the weighted Ricci curvature for Hilbert and Funk geometries (recall Sect. 6.5) equipped with the Lebesgue measure $\mathcal{L}^n$. We refer to [196] for details. Recall from Theorem 6.21 that the Hilbert geometry (resp. Funk geometry) has the constant flag curvature $-1$ (resp. $-1/4$). Hence, as for the unweighted Ricci curvature, we have $\mathrm{Ric} = -(n-1)$ for $(D, F_H)$ and $\mathrm{Ric} = -(n-1)/4$ for $(D, F_F)$.

Theorem 10.12 ($\mathrm{Ric}_N$ for Hilbert and Funk Geometries) Let $(D, F_H)$ be the Hilbert geometry associated with a bounded convex domain $D \subset \mathbb{R}^n$ with smooth and strongly convex boundary $\partial D$. Then $(D, F_H, \mathcal{L}^n)$ satisfies
\[
-(n-1) \le \mathrm{Ric}_\infty \le 2, \qquad -(n-1) - \frac{(n+1)^2}{N-n} \le \mathrm{Ric}_N \le 2
\]
for $N \in (n, \infty)$. Similarly, for the Funk geometry $(D, F_F, \mathcal{L}^n)$, we have
\[
\mathrm{Ric}_\infty = -\frac{n-1}{4}, \qquad \mathrm{Ric}_N = -\frac{n-1}{4} - \frac{(n+1)^2}{4(N-n)}
\]
for $N \in (n, \infty)$.

Therefore, in both cases, $\mathrm{Ric}_\infty$ and $\mathrm{Ric}_N$ are bounded below ($\mathrm{Ric}_N = K$ means that $\mathrm{Ric}_N(v) = KF^2(v)$ holds for all $v \in TM$). As immediate corollaries, we obtain the corresponding Bishop–Gromov volume comparison (Theorem 9.15), Laplacian comparison (Theorem 11.20), and Brunn–Minkowski inequality (Theorem 18.8) among others.

We chose the Lebesgue measure on $D$ merely for the sake of computability. It is a natural and open problem to calculate the weighted Ricci curvature for other measures and relate the results with the investigations of Hilbert (or Funk) geometry from the geometric and dynamical viewpoints.

Exercise 10.13 Calculate (or estimate) the weighted Ricci curvature of the Hilbert geometry $(D, F_H)$ with respect to the Busemann–Hausdorff measure $m_{\mathrm{BH}}$ or the Holmes–Thompson measure $m_{\mathrm{HT}}$. One can consider the same problem also for the Funk geometry $(D, F_F)$.

Recall Sect. 9.1 for the definitions of these measures. On one hand, the Busemann–Hausdorff measure is the $n$-dimensional Hausdorff measure of $d_H$ and may be suitable from the geometric viewpoint. On the other hand, the Holmes–Thompson measure could be appropriate from the viewpoint of dynamical systems.

Chapter 11

The Nonlinear Laplacian

In this chapter, we consider the natural energy functional (for functions) and the corresponding Sobolev spaces. Then we introduce the nonlinear Laplacian in a way that its associated harmonic functions are minimizers of the energy functional (Lemma 11.19). We also show the Laplacian comparison theorem as the first analytic comparison theorem (Theorem 11.20). The main reference of this chapter is [206]. We remark that, in some papers (such as [201, 206, 208]), we used Du to represent the derivative of a function u. In this book, however, we use du which seems more common in differential geometry.

11.1 Energy Functional and Sobolev Spaces

Let $(M, F, m)$ be a measured Finsler manifold (in the sense of Sect. 9.3). Given an open set $\Omega \subset M$, we denote by $H^1_{\mathrm{loc}}(\Omega)$ the space of weakly differentiable functions $u$ on $\Omega$ such that both $u$ and $F^*(du)$ belong to $L^2_{\mathrm{loc}}(\Omega)$ (we refer to [115, 116] for the basic theory of Sobolev spaces on Riemannian manifolds). Precisely, for any $x \in \Omega$, there is an open neighborhood $U \subset \Omega$ of $x$ such that $\int_U u^2 \,dm < \infty$ and $\int_U F^*(du)^2 \,dm < \infty$. We remark that $H^1_{\mathrm{loc}}(\Omega)$ is defined solely in terms of the differentiable structure of $M$, that is to say, it is a linear space determined independently of the choices of $F$ and $m$.

Definition 11.1 (Energy Functional) We define the energy functional $\mathcal{E}_\Omega : H^1_{\mathrm{loc}}(\Omega) \longrightarrow [0, \infty]$ by
\[
\mathcal{E}_\Omega(u) := \frac{1}{2} \int_\Omega F^*(du)^2 \,dm. \tag{11.1}
\]
We will suppress the subscript $\Omega$ when $\Omega = M$, namely $\mathcal{E}$ means $\mathcal{E}_M$.
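Because $F$ need not be reversible, the energy of $-u$ can differ from that of $u$. The following minimal sketch illustrates this in the simplest hypothetical setting: the interval $(0, 1)$ with the Lebesgue measure and the norm $F(v) = |v| + bv$ on $\mathbb{R}$ with $|b| < 1$, whose dual norm is $F^*(\alpha) = \alpha/(1+b)$ for $\alpha \ge 0$ and $F^*(\alpha) = -\alpha/(1-b)$ for $\alpha < 0$. The names and the sample function $u(x) = x^2$ are assumptions made only for this example.

```python
def dual_norm(alpha, b):
    """F*(alpha) for the norm F(v) = |v| + b*v on R, assuming |b| < 1."""
    return alpha / (1 + b) if alpha >= 0 else -alpha / (1 - b)

def energy(du, b, steps=20000):
    """(1/2) * int_0^1 F*(u'(x))^2 dx by the midpoint rule; `du` is the derivative u'."""
    h = 1.0 / steps
    return 0.5 * h * sum(dual_norm(du((k + 0.5) * h), b) ** 2 for k in range(steps))

b = 0.5
E_plus = energy(lambda x: 2 * x, b)     # energy of u(x) = x^2
E_minus = energy(lambda x: -2 * x, b)   # energy of -u
```

Here `E_plus` approximates $(2/3)/(1+b)^2$ while `E_minus` approximates $(2/3)/(1-b)^2$, so the two energies differ by the factor $((1+b)/(1-b))^2 = 9$ for $b = 1/2$.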

© The Author(s), under exclusive license to Springer Nature Switzerland AG 2021 S.-i. Ohta, Comparison Finsler Geometry, Springer Monographs in Mathematics, https://doi.org/10.1007/978-3-030-80650-7_11


Clearly we have $\mathcal{E}_\Omega(cu) = c^2 \mathcal{E}_\Omega(u)$ for $c \ge 0$. We remark that, due to the non-reversibility of $F$, $\mathcal{E}_\Omega(-u)$ may not coincide with $\mathcal{E}_\Omega(u)$.

Definition 11.2 (Sobolev Spaces) Define the Sobolev space associated with $\mathcal{E}_\Omega$ as
\[
H^1(\Omega) := \{ u \in L^2(\Omega) \cap H^1_{\mathrm{loc}}(\Omega) \mid \mathcal{E}_\Omega(u) + \mathcal{E}_\Omega(-u) < \infty \},
\]
and denote by $H^1_0(\Omega)$ the closure of $C^\infty_c(\Omega)$ with respect to the (absolutely homogeneous) Sobolev norm
\[
\|u\|_{H^1(\Omega)} := \sqrt{ \|u\|_{L^2(\Omega)}^2 + \mathcal{E}_\Omega(u) + \mathcal{E}_\Omega(-u) },
\]
where $C^\infty_c(\Omega)$ denotes the set of $C^\infty$-functions on $\Omega$ with compact support.

As in the above definitions, we will always suppress the dependence on the measure $m$. It follows from the convexity of $F^*$ that $\sqrt{\mathcal{E}_\Omega}$ is convex, i.e.,
\[
\sqrt{\mathcal{E}_\Omega(u_1 + u_2)} \le \sqrt{\mathcal{E}_\Omega(u_1)} + \sqrt{\mathcal{E}_\Omega(u_2)} \tag{11.2}
\]
for $u_1, u_2 \in H^1(\Omega)$. This, in particular, implies
\[
\|u_1 + u_2\|_{H^1(\Omega)} \le \|u_1\|_{H^1(\Omega)} + \|u_2\|_{H^1(\Omega)}
\]
for $u_1, u_2 \in H^1(\Omega)$. Together with the absolute homogeneity of $\|\cdot\|_{H^1(\Omega)}$, we find that $H^1(\Omega)$ is a linear space and $\|\cdot\|_{H^1(\Omega)}$ is a norm on it. Moreover, we also deduce from (11.2) the convexity of $\mathcal{E}_\Omega$:
\[
\mathcal{E}_\Omega\bigl( (1-t) u_1 + t u_2 \bigr) \le (1-t)\, \mathcal{E}_\Omega(u_1) + t\, \mathcal{E}_\Omega(u_2) \tag{11.3}
\]
for all $u_1, u_2 \in H^1(\Omega)$ and $t \in [0, 1]$.

Remark 11.3 (On the Definition of Sobolev Spaces) As a Sobolev space, one may alternatively employ
\[
\{ u \in L^2(\Omega) \cap H^1_{\mathrm{loc}}(\Omega) \mid \mathcal{E}_\Omega(u) < \infty \}
\]
which includes $H^1(\Omega)$ above. This larger space, however, may not be a linear space since $\mathcal{E}_\Omega(u) < \infty$ does not necessarily imply $\mathcal{E}_\Omega(-u) < \infty$ if $M$ is noncompact (see [141] for a related study). For avoiding extra care needed due to this nonlinearity, we will restrict ourselves to the space $H^1(\Omega)$ as in Definition 11.2.

We discuss some fundamental properties of the Sobolev spaces. Though we give outlines of the proofs, it is also possible to reduce them to the (weighted) Riemannian situation.


Lemma 11.4 (Properties of Sobolev Spaces) Let $\Omega \subset M$ be an open set.

(i) The space $(H^1(\Omega), \|\cdot\|_{H^1(\Omega)})$ is a reflexive Banach space, and $H^1_0(\Omega)$ is a subspace in it.

(ii) If $(M, F)$ has finite reversibility $\Lambda_F < \infty$ and is complete, then we have $H^1_0(M) = H^1(M)$. That is to say, $C^\infty_c(M)$ is dense in $(H^1(M), \|\cdot\|_{H^1})$.

Proof Observe first that one can find a reversible Finsler structure associated with the Sobolev norm $\|\cdot\|_{H^1}$: Consider a symmetrization of the dual norm given by
\[
\widehat{F}^*(\alpha) := \sqrt{ \frac{F^*(\alpha)^2 + F^*(-\alpha)^2}{2} },
\]
then the corresponding reversible Finsler structure $\widehat{F} := (\widehat{F}^*)^*$ gives rise to the same Sobolev norm $\|\cdot\|_{H^1}$ as that of $F$. Now, on each tangent space $T_xM$, we choose an inner product $g_x$ which varies smoothly in $x$ and is comparable with $\widehat{F}$ up to a multiplicative constant $C_n$ depending only on $n = \dim M$ (in the sense that $C_n^{-1} \widehat{F}(v) \le \sqrt{g_x(v, v)} \le C_n \widehat{F}(v)$ for all $v \in T_xM$). For example, the classical John ellipsoid provides such a Riemannian metric (see [126]). Precisely, in each $T_xM$, the unique ellipsoid of minimal (internal) volume circumscribing $T_xM \cap \widehat{F}^{-1}(1)$ will be the unit sphere of $g_x$, where the reversibility of $\widehat{F}$ ensures that the center of the ellipsoid is the origin, and we have $\widehat{F}(v)/n \le \sqrt{g_x(v, v)} \le \widehat{F}(v)$.

(i) We have seen that $H^1(\Omega)$ is a normed space. The completeness of $H^1(\Omega)$ with respect to the Sobolev norm of $\widehat{F}$ is equivalent to the completeness of the Sobolev space of the weighted Riemannian manifold $(\Omega, g, m)$. The latter is shown in a standard way as follows (we refer to, e.g., [151, Theorem 10.5], [172, Theorem 1.1.12] for the Euclidean case). Let $\{(U_a, \xi_a)\}_{a \in \mathbb{N}}$ be a partition of unity of $\Omega$ such that each $U_a$ is included in the domain of a local coordinate system. Given a Cauchy sequence $(u_i)_{i \in \mathbb{N}}$ in $(H^1(\Omega), \|\cdot\|_{H^1(\Omega)})$, we can take $u \in L^2(\Omega)$ with $\lim_{i \to \infty} \|u_i - u\|_{L^2(\Omega)} = 0$ and, for each $a \in \mathbb{N}$, a 1-form $\omega_a$ on $U_a$ such that
\[
\lim_{i \to \infty} \int_{U_a} \|du_i - \omega_a\|_g^2 \,dm = 0.
\]
By construction we find $\omega_a|_{U_a \cap U_b} = \omega_b|_{U_a \cap U_b}$ almost everywhere, thereby we can introduce a 1-form $\omega$ on $\Omega$ by $\omega := \omega_a$ on $U_a$. Then we have $\omega = du$, $u \in H^1(\Omega)$ and $\lim_{i \to \infty} \|u_i - u\|_{H^1(\Omega)} = 0$ as desired. Finally, the reflexivity immediately follows from that of the Sobolev space of $(\Omega, g, m)$, which is actually a Hilbert space.

(ii) We deduce from Exercise 3.5 (between $F$ and $\overleftarrow{F}$) that $F^*(-\alpha) \le \Lambda_F F^*(\alpha)$ for any $\alpha \in T^*M$. Thus we have

\[
\widehat{F}^*(\alpha)^2 \le \frac{1 + \Lambda_F^2}{2}\, F^*(\alpha)^2
\]
and, again by Exercise 3.5,
\[
\widehat{F}(v) \ge \sqrt{\frac{2}{1 + \Lambda_F^2}}\, F(v).
\]
Hence, any bounded closed set with respect to $g$ is also bounded for $F$ and is compact by the completeness assumption. Therefore $(M, g)$ is also complete (by the Hopf–Rinow theorem), and one can reduce the claim to the Riemannian case (see, for example, [115, Theorem 2.7]). Precisely, we fix $x_0 \in M$ and consider the cut-off functions
\[
\rho_k := \min\bigl\{ \max\{ 2 - k^{-1} d_g(x_0, \cdot), 0 \}, 1 \bigr\}
\]
for $k \in \mathbb{N}$, where $d_g$ denotes the distance function induced from $g$. Then, given $u \in H^1(M)$, we have $\rho_k u \in H^1_0(M)$ by the completeness and the classical (Euclidean) Meyers–Serrin theorem (see, e.g., [151, Theorem 10.15], [172, Theorem 1.1.5/1]). In order to see $u \in H^1_0(M)$, observe from the definition of $\rho_k$ that
\[
\|u - \rho_k u\|_{L^2}^2 \le \int_{M \setminus B_g(x_0, k)} u^2 \,dm \to 0
\]
and
\begin{align*}
\mathcal{E}_g(u - \rho_k u) &= \frac{1}{2} \int_M \|(1 - \rho_k)\, du - u\, d\rho_k\|_g^2 \,dm \\
&\le \int_M \bigl\{ (1 - \rho_k)^2 \|du\|_g^2 + u^2 \|d\rho_k\|_g^2 \bigr\} \,dm \\
&\le \int_{M \setminus B_g(x_0, k)} \|du\|_g^2 \,dm + \int_{B_g(x_0, 2k) \setminus B_g(x_0, k)} k^{-2} u^2 \,dm \\
&\to 0,
\end{align*}
both as $k \to \infty$, where $\mathcal{E}_g$ is the energy functional with respect to $g$. This completes the proof.

The following corollary to Lemma 11.4(ii) will play an essential role when we consider the mass conservation of heat flow.

Lemma 11.5 (Constant Functions in $H^1_0(M)$) If $(M, F, m)$ satisfies $\Lambda_F < \infty$, $m(M) < \infty$, and is complete, then constant functions on $M$ belong to $H^1_0(M)$.
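The cut-off functions $\min\{\max\{2 - k^{-1} d_g(x_0, \cdot), 0\}, 1\}$ used in the proof of Lemma 11.4(ii) depend on the point only through its distance from $x_0$. A minimal sketch of this profile, with the hypothetical name `cutoff` and the distance passed in as a number, reads as follows.

```python
def cutoff(k, dist):
    """Value of min{max{2 - dist/k, 0}, 1} at a point whose g-distance from x0 is `dist`."""
    return min(max(2.0 - dist / k, 0.0), 1.0)
```

The profile equals 1 up to distance $k$, vanishes beyond distance $2k$, and is $k^{-1}$-Lipschitz in between, which is exactly what the two estimates in the proof use.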


Another important property of the energy functional $\mathcal{E}_\Omega$, which is essential especially in the study of heat flow as gradient flow of the energy (Sect. 13.2), is the lower semi-continuity as a function on $L^2(\Omega)$. Here we assume the finite reversibility (which may be redundant) for comparing $\mathcal{E}_\Omega$ and $\|\cdot\|_{H^1(\Omega)}$ and for applying some tools from functional analysis. We refer to [260] for the basics of functional analysis.

Lemma 11.6 (Lower Semi-continuity of $\mathcal{E}_\Omega$) Let $\Omega \subset M$ be an open set and assume $\Lambda_F < \infty$. Then the energy functional $\mathcal{E}_\Omega$ is lower semi-continuous in $L^2(\Omega)$. Precisely, if a sequence $(u_i)_{i \in \mathbb{N}}$ in $H^1(\Omega)$ converges to a function $u$ in the $L^2$-norm and satisfies $\liminf_{i \to \infty} \mathcal{E}_\Omega(u_i) < \infty$, then we have $u \in H^1(\Omega)$ and
\[
\mathcal{E}_\Omega(u) \le \liminf_{i \to \infty} \mathcal{E}_\Omega(u_i).
\]
Moreover, if $u_i \in H^1_0(\Omega)$ for all $i \in \mathbb{N}$, then we have $u \in H^1_0(\Omega)$.

Proof Notice that $\liminf_{i \to \infty} \|u_i\|_{H^1(\Omega)} < \infty$ thanks to $\Lambda_F < \infty$, and let us assume that $\lim_{i \to \infty} \mathcal{E}_\Omega(u_i)$ exists without loss of generality. Now, since $(H^1(\Omega), \|\cdot\|_{H^1(\Omega)})$ is reflexive (Lemma 11.4(i)), a subsequence of $(u_i)_{i \in \mathbb{N}}$ is weakly convergent in $H^1(\Omega)$ by Kakutani's theorem (see [260, Sect. V.2, Theorem 1], [151, Theorem A.59]). This weak limit coincides with $u$ by hypothesis (and, e.g., Mazur's lemma below) and belongs to $H^1_0(\Omega)$ if $u_i \in H^1_0(\Omega)$ for all $i \in \mathbb{N}$. Then, if $F$ is reversible, the lower semi-continuity of the norm under weak convergence yields the claim: $\mathcal{E}_\Omega(u) \le \lim_{i \to \infty} \mathcal{E}_\Omega(u_i)$ (see [260, Sect. V.1, Theorem 1]).

In the non-reversible case, we use Mazur's lemma (see [260, Sect. V.1, Theorem 2]): For any $\varepsilon > 0$, there exists a convex combination $u_\varepsilon := \sum_{i=1}^m c_i u_i$ (with $c_i \ge 0$ and $\sum_{i=1}^m c_i = 1$) such that $\|u - u_\varepsilon\|_{H^1(\Omega)} \le \varepsilon$. We can assume $c_i = 0$ for all $i \le \varepsilon^{-1}$ without loss of generality. Then we have, by the convexity of $\sqrt{\mathcal{E}_\Omega}$ as in (11.2),
\[
\sqrt{\mathcal{E}_\Omega(u)} \le \sqrt{\mathcal{E}_\Omega(u_\varepsilon)} + \sqrt{\mathcal{E}_\Omega(u - u_\varepsilon)} \le \sum_{i=1}^m c_i \sqrt{\mathcal{E}_\Omega(u_i)} + \varepsilon.
\]
Letting $\varepsilon \to 0$ shows the claim.

We close this section with two exercises on the strict convexity of the energy functional. Define the ground state energy as
\[
\chi_\Omega := \inf\bigl\{ 2\mathcal{E}_\Omega(u) \,\big|\, u \in H^1_0(\Omega),\ \|u\|_{L^2(\Omega)} = 1 \bigr\}. \tag{11.4}
\]
Then we have $\chi_\Omega \|u\|_{L^2(\Omega)}^2 \le 2\mathcal{E}_\Omega(u)$ for all $u \in H^1_0(\Omega)$, which can be compared with the Poincaré inequality (see Sect. 15.2). Note that $\chi_\Omega$ may be 0 (when, for example, constant functions belong to $H^1_0(\Omega)$). In such a case it is more convenient to consider
\[
\chi'_\Omega := \inf\Bigl\{ 2\mathcal{E}_\Omega(u) \,\Big|\, u \in H^1_0(\Omega),\ \|u\|_{L^2(\Omega)} = 1,\ \int_\Omega u \,dm = 0 \Bigr\}. \tag{11.5}
\]

Exercise 11.7 Show that $\mathcal{E}_\Omega$ is $(\chi_\Omega/S_\Omega)$-convex in the $L^2$-norm in the sense that, for any $u_1, u_2 \in H^1_0(\Omega)$,
\[
\mathcal{E}_\Omega\Bigl( \frac{u_1 + u_2}{2} \Bigr) \le \frac{\mathcal{E}_\Omega(u_1) + \mathcal{E}_\Omega(u_2)}{2} - \frac{\chi_\Omega}{8 S_\Omega} \|u_2 - u_1\|_{L^2(\Omega)}^2,
\]
where $S_\Omega$ denotes the uniform smoothness constant of $\Omega$ (recall Definition 8.14).

Exercise 11.8 Prove that, for each $a \in \mathbb{R}$, $\mathcal{E}_\Omega$ is $(\chi'_\Omega/S_\Omega)$-convex on the affine subspace $\{ u \in H^1_0(\Omega) \mid \int_\Omega u \,dm = a \}$ of $H^1_0(\Omega)$ in the $L^2$-norm.

11.2 Laplacian and Harmonic Functions

In this section we shall introduce a Laplacian, acting on functions, as the divergence of the gradient vector field. Here the gradient vector field is determined by the Finsler structure $F$, while the divergence depends only on the measure $m$. This Laplacian is nonlinear but naturally associated with the energy functional defined in the previous section.

Definition 11.9 (Gradient Vectors) For a differentiable function $u : M \longrightarrow \mathbb{R}$, its gradient vector at $x \in M$ is defined as the Legendre transformation of the derivative $du_x \in T_x^*M$:
\[
\nabla u(x) := \mathcal{L}^*(du_x) \in T_xM. \tag{11.6}
\]
Recall Sect. 3.2 for the definition of the Legendre transformation, and note that $F(\nabla u(x)) = F^*(du_x)$. If $du_x \neq 0$, then we can write down $\nabla u(x)$ in local coordinates as
\[
\nabla u(x) = \sum_{i,j=1}^n g^*_{ij}(du_x)\, \frac{\partial u}{\partial x^j}(x)\, \frac{\partial}{\partial x^i}\Big|_x.
\]
This implies that, for any $v \in T_xM$,
\[
g_{\nabla u}\bigl( \nabla u(x), v \bigr) = du(v). \tag{11.7}
\]
From the definition of the Legendre transformation, observe that $\nabla u(x)$ represents the direction in which $u$ increases the most. For example, we see the following.

Exercise 11.10 In a Minkowski normed space $(\mathbb{R}^n, \|\cdot\|)$, show that
\[
\nabla(\|x\|^2) = 2x, \qquad \nabla(-\|{-x}\|^2) = -2x
\]
for $x \in \mathbb{R}^n$, by identifying the tangent spaces $T_x\mathbb{R}^n$ and $T_{-x}\mathbb{R}^n$ with $\mathbb{R}^n$.

We need to be careful when $du_x = 0$ because the Legendre transformation $\mathcal{L}^*$ is only continuous at the zero section and $g^*_{ij}(du_x)$ is not defined. Therefore we introduce the essential domain for $u$, defined by
\[
M_u := \{ x \in M \mid du_x \neq 0 \}, \tag{11.8}
\]
which will play a role in various regularity arguments. Observe that, if $u \in C^l(M)$, then $\nabla u$ is $C^{l-1}$ on $M_u$ and continuous on $M$.

Exercise 11.11 Prove the following analytic description of the distance function:
\[
d(x, y) = \sup\{ u(y) - u(x) \mid u \in C^\infty(M),\ F^*(du) \le 1 \}
\]
for $x, y \in M$. This kind of distance structure is sometimes called the intrinsic distance, for example, in the theory of Dirichlet forms (see, e.g., [12]).

Exercise 11.12 Let $u \in C^\infty(M)$ be a function satisfying $F^*(du) \equiv 1$ on $M$. Show that any integral curve of the gradient vector field $\nabla u$ is a minimal geodesic. In particular, $\nabla u$ is a geodesic field (recall Sect. 4.2 and Remark 5.14).

Next we introduce the divergence of a vector field.

Definition 11.13 (Divergences) For a differentiable vector field $V$ on $M$, we define its divergence associated with a measure $m$ on $M$ by
\[
\mathrm{div}_m V := \sum_{i=1}^n \Bigl( \frac{\partial V^i}{\partial x^i} + V^i \frac{\partial\Phi}{\partial x^i} \Bigr), \tag{11.9}
\]
where $V = \sum_{i=1}^n V^i (\partial/\partial x^i)$ and $\Phi$ is given by $dm = e^{\Phi}\, dx^1 dx^2 \cdots dx^n$.

Note that the divergence is a linear operator. The integrated (weak) formulation of (11.9) gives rise to the divergence (or integration by parts) formula:
\[
\int_M \phi\, \mathrm{div}_m V \,dm = -\int_M d\phi(V) \,dm \tag{11.10}
\]
for all $\phi \in C^\infty_c(M)$. We remark that the right-hand side makes sense for measurable vector fields $V$ with $F(V) \in L^1_{\mathrm{loc}}(M)$. We will use this weak formulation of the divergence in the sequel.

Exercise 11.14 Show that the definition (11.9) of $\mathrm{div}_m V$ is independent of the choice of a local coordinate system.

Exercise 11.15 Prove the above divergence formula for $C^1$-vector fields $V$.
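The divergence formula (11.10) can be checked numerically in one variable. The sketch below is only an illustration under hypothetical choices: the interval $(0, 1)$, a measure $dm = e^{\Phi} dx$ with a sample weight $\Phi$, a sample vector field $V$, and a test function $\phi$ vanishing at both endpoints (standing in for compact support).

```python
import math

def integrate(f, steps=20000):
    """Midpoint rule on (0, 1)."""
    h = 1.0 / steps
    return h * sum(f((k + 0.5) * h) for k in range(steps))

# Hypothetical sample data on (0, 1).
Phi = lambda x: math.sin(3 * x)                 # weight: dm = exp(Phi) dx
dPhi = lambda x: 3 * math.cos(3 * x)
V = lambda x: math.exp(x)                       # vector field V(x) d/dx
dV = lambda x: math.exp(x)
phi = lambda x: math.sin(math.pi * x) ** 2      # vanishes at x = 0 and x = 1
dphi = lambda x: math.pi * math.sin(2 * math.pi * x)

div_V = lambda x: dV(x) + V(x) * dPhi(x)        # (11.9) in one variable

lhs = integrate(lambda x: phi(x) * div_V(x) * math.exp(Phi(x)))
rhs = -integrate(lambda x: dphi(x) * V(x) * math.exp(Phi(x)))
```

The two numbers `lhs` and `rhs` agree up to quadrature error, since $\phi\, \mathrm{div}_m V\, e^{\Phi} = \phi\, (V e^{\Phi})'$ and the boundary terms vanish.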


Now, we define our Laplacian as the divergence of the gradient vector field.

Definition 11.16 (Nonlinear Laplacian) We define the (distributional) Laplacian of a function $u \in H^1_{\mathrm{loc}}(M)$ by $\Delta u := \mathrm{div}_m(\nabla u)$ in the weak sense that
\[
\int_M \phi\, \Delta u \,dm := -\int_M d\phi(\nabla u) \,dm \tag{11.11}
\]
for all $\phi \in C^\infty_c(M)$.

Since we will always fix a measure $m$ on $M$, we suppress the dependence of the Laplacian $\Delta$ on $m$. Notice that the right-hand side of (11.11) is well-defined since (3.9) yields
\begin{align*}
\int_M |d\phi(\nabla u)| \,dm &\le \int_M \max\{ F^*(d\phi) F(\nabla u),\ F^*(-d\phi) F(\nabla u) \} \,dm \\
&\le \Bigl( \int_M \max\{ F^*(d\phi)^2,\ F^*(-d\phi)^2 \} \,dm \Bigr)^{1/2} \Bigl( \int_{\mathrm{supp}\,\phi} F^2(\nabla u) \,dm \Bigr)^{1/2} \\
&< \infty.
\end{align*}
We similarly find that one can take $\phi \in H^1_0(M)$ in (11.11) if $u \in H^1(M)$. Since the gradient vector field $\nabla u$ is merely continuous at $x$ with $du_x = 0$ (i.e., on $M \setminus M_u$) even when $u \in C^\infty(M)$, it is necessary to introduce the Laplacian in the weak form as above. Note also that our Laplacian is a negative operator in the sense that
\[
\int_\Omega u\, \Delta u \,dm = -2\mathcal{E}_\Omega(u) \le 0
\]
for all $u \in H^1_0(\Omega)$ and equality holds if and only if $u$ is constant almost everywhere on each connected component of $\Omega$.

Taking the gradient vector field (more precisely, the Legendre transformation) is a nonlinear operation, thus our Laplacian $\Delta$ is a nonlinear operator unless $F$ is Riemannian (recall Exercise 3.10). Precisely, although $\Delta(cu) = c\Delta u$ holds for $c \ge 0$ (and also for $c < 0$ if $F$ is reversible), we have $\Delta(u_1 + u_2) = \Delta u_1 + \Delta u_2$ if and only if $F$ is Riemannian. Some more remarks on the definitions of Laplacian-type operators are in order.

Remark 11.17 (Definitions of Laplacians) (a) The nonlinear Laplacian defined above goes back to (at least) [229] (see also [104, 230]). It is also possible (and more natural from some aspects) to define a Laplacian-type operator without using a measure, as the trace of a Hessian operator (see Lemma 12.4 below). Then our Laplacian can be regarded as the weighted Laplacian (sometimes also called the Witten Laplacian) with respect


to the measure $m$. We will revisit this issue in Sect. 12.1, where we introduce the Hessian to write down the Bochner–Weitzenböck formula.

(b) There are a number of notions of linearized Laplacians in the literature. Such a linearization involves some kind of averaging procedure to kill the anisotropy of $F$, and different ways of taking averages give rise to different linearizations (see, e.g., [26, 29, 64, 65]). In our context, however, the nonlinear Laplacian is more natural by virtue of its tight link to the weighted Ricci curvature as well as of its consistency with analysis on metric measure spaces. Actually, our energy functional $\mathcal{E}$ coincides with the Cheeger energy introduced in [67] utilizing the notion of upper gradients. We refer to [114, 119, 226] as well as [10, 117, 118, 120] for this quite powerful and highly successful theory.

Exercise 11.18 Given a $C^1$-function $u$ on $M$, prove that
\[
u\bigl(\eta(l)\bigr) - u\bigl(\eta(0)\bigr) \le \int_0^l F\bigl( \nabla u(\eta) \bigr) \,dt
\]
holds for any continuous curve $\eta : [0, l] \longrightarrow M$ parametrized by arclength (compare this with Exercise 3.6). This means that the function $F(\nabla u) = F^*(du)$ is an upper gradient for $u$. The Cheeger energy of $u$ is defined as the infimum of the squared $L^2$-norm of upper gradients for $u$, via an approximation procedure for general $L^2$-functions (see [67] for the precise definition).

Next, we discuss a fundamental relationship between the energy functional and the Laplacian via harmonic functions. Let $\Omega \subset M$ be an open set. We say that a function $u \in H^1(\Omega)$ is (weakly) harmonic on $\Omega$ if $\Delta u = 0$ holds on $\Omega$ in the weak sense, i.e.,
\[
\int_\Omega d\phi(\nabla u) \,dm = 0
\]

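for all $\phi \in H^1_0(\Omega)$. We remark that, by the negativity of the Laplacian $\Delta$, $u \in H^1_0(\Omega)$ is harmonic if and only if it is constant almost everywhere on each connected component of $\Omega$.

Lemma 11.19 (Harmonic Functions Minimize the Energy) A function $u \in H^1(\Omega)$ is harmonic on $\Omega$ if and only if it satisfies
\[
\mathcal{E}_\Omega(u) = \inf\{ \mathcal{E}_\Omega(u + \phi) \mid \phi \in H^1_0(\Omega) \}.
\]

Proof Given $\phi \in H^1_0(\Omega)$, we deduce from Lemma 3.8 and $\mathcal{L}^*(du) = \nabla u$ that
\begin{align*}
\frac{d}{dt}\bigl[ \mathcal{E}_\Omega(u + t\phi) \bigr]_{t=0}
&= \frac{1}{2} \int_\Omega \frac{d}{dt}\bigl[ F^*(du + t\, d\phi)^2 \bigr]_{t=0} \,dm \\
&= \frac{1}{2} \int_\Omega \sum_{i=1}^n \frac{\partial[(F^*)^2]}{\partial\alpha_i}(du)\, \frac{\partial\phi}{\partial x^i} \,dm \\
&= \int_\Omega d\phi(\nabla u) \,dm. \tag{11.12}
\end{align*}
If $u$ is a minimizer of $\mathcal{E}_\Omega$ in the sense above, then it follows from (11.12) that $\int_\Omega d\phi(\nabla u) \,dm = 0$ necessarily holds for all $\phi \in H^1_0(\Omega)$. Hence, $u$ is harmonic.

Conversely, let $u$ be harmonic and suppose that there is some $\phi \in H^1_0(\Omega)$ such that $\mathcal{E}_\Omega(u + \phi) < \mathcal{E}_\Omega(u)$. Then the convexity of $\mathcal{E}_\Omega$ as in (11.3) implies $\mathcal{E}_\Omega(u + t\phi) \le (1-t)\, \mathcal{E}_\Omega(u) + t\, \mathcal{E}_\Omega(u + \phi)$ for any $t \in (0, 1)$, and hence
\[
\frac{d}{dt}\bigl[ \mathcal{E}_\Omega(u + t\phi) \bigr]_{t=0} \le \mathcal{E}_\Omega(u + \phi) - \mathcal{E}_\Omega(u) < 0.
\]
This contradicts the harmonicity of $u$ due to (11.12). Therefore $u$ is a minimizer of $\mathcal{E}_\Omega$ as desired.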
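The first variation formula (11.12) can be tested numerically in a hypothetical one-dimensional model: the interval $(0, 1)$ with the Lebesgue measure and the non-reversible norm $F(v) = |v| + Bv$ on $\mathbb{R}$, for which $F^*(\alpha)^2$ equals $\alpha^2/(1+B)^2$ for $\alpha \ge 0$ and $\alpha^2/(1-B)^2$ for $\alpha < 0$. All names and sample functions below are assumptions made for this sketch; it compares a central difference quotient of $t \mapsto \mathcal{E}(u + t\phi)$ with the integral of $d\phi(\nabla u)$.

```python
B = 0.5  # asymmetry parameter of the hypothetical norm F(v) = |v| + B*v, |B| < 1

def dual_norm_sq(alpha):
    """F*(alpha)^2 for the norm above."""
    return (alpha / (1 + B)) ** 2 if alpha >= 0 else (alpha / (1 - B)) ** 2

def legendre(alpha):
    """Legendre transformation L*(alpha) = (1/2) d[(F*)^2]/d(alpha), i.e. the gradient."""
    return alpha / (1 + B) ** 2 if alpha >= 0 else alpha / (1 - B) ** 2

def integrate(f, steps=20000):
    """Midpoint rule on (0, 1)."""
    h = 1.0 / steps
    return h * sum(f((k + 0.5) * h) for k in range(steps))

du = lambda x: 1 - 3 * x      # derivative of u(x) = x - 1.5 x^2 (changes sign at x = 1/3)
dphi = lambda x: 1 - 2 * x    # derivative of phi(x) = x (1 - x), which vanishes at 0 and 1

def energy(t):
    """Energy of u + t*phi on (0, 1)."""
    return 0.5 * integrate(lambda x: dual_norm_sq(du(x) + t * dphi(x)))

h = 1e-4
finite_difference = (energy(h) - energy(-h)) / (2 * h)
first_variation = integrate(lambda x: dphi(x) * legendre(du(x)))
```

The two quantities agree up to discretization error. The same map `legendre` is not additive (for example `legendre(1.0) + legendre(-1.0)` is not `legendre(0.0)`), which reflects the nonlinearity of the Laplacian for non-Riemannian $F$.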

11.3 Laplacian Comparison Theorem

In this section, we show the Laplacian comparison theorem as an analytic counterpart to the (directional) area comparison (9.8) (see Remark 11.21 below). This was shown in [206, Theorem 5.2] via a reduction to the Riemannian case, and here we give a direct proof by a calculation similar to Theorem 9.15. We refer to, e.g., [155, Theorem 4.1] for the Riemannian case.

Theorem 11.20 (Laplacian Comparison) Let $(M, F, m)$ be forward or backward complete and assume $\mathrm{Ric}_N \ge K$ for some $K \in \mathbb{R}$ and $N \in [n, \infty)$. Then, for any $z \in M$, the distance function $u(x) := d(z, x)$ from $z$ satisfies
\[
\Delta u(x) \le (N-1)\, \frac{s'_{K/(N-1)}(d(z, x))}{s_{K/(N-1)}(d(z, x))} \tag{11.13}
\]
pointwise on $M \setminus (\{z\} \cup \mathrm{Cut}(z))$, and in the weak sense on $M$.

Recall (9.4) for the definition of $s_{K/(N-1)}$, and Definition 7.10 for the definition of the cut locus $\mathrm{Cut}(z)$. For example, when $K = 0$, the inequality (11.13) means
\[
\Delta u(x) \le \frac{N-1}{d(z, x)}.
\]
We remark that, in the case of $K > 0$, $\mathrm{diam}(M) \le \pi\sqrt{(N-1)/K}$ holds by the Bonnet–Myers theorem (Corollary 9.17), and hence $d(z, x) = \pi\sqrt{(N-1)/K}$ can be achieved only when $x$ is a cut point of $z$.
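To get a feeling for the right-hand side of (11.13), the following minimal sketch evaluates it for the three signs of $K$. It assumes that $s_\kappa$ in (9.4) is the usual comparison function ($\sin(\sqrt{\kappa}\, t)/\sqrt{\kappa}$, $t$, or $\sinh(\sqrt{-\kappa}\, t)/\sqrt{-\kappa}$ according to the sign of $\kappa$); the function names are hypothetical.

```python
import math

def s_kappa(kappa, t):
    """Comparison function: solves s'' + kappa*s = 0 with s(0) = 0, s'(0) = 1."""
    if kappa > 0:
        r = math.sqrt(kappa)
        return math.sin(r * t) / r
    if kappa < 0:
        r = math.sqrt(-kappa)
        return math.sinh(r * t) / r
    return t

def s_kappa_prime(kappa, t):
    """Derivative of s_kappa with respect to t."""
    if kappa > 0:
        return math.cos(math.sqrt(kappa) * t)
    if kappa < 0:
        return math.cosh(math.sqrt(-kappa) * t)
    return 1.0

def laplacian_bound(K, N, d):
    """Right-hand side of (11.13) at distance d > 0 (and d below the diameter bound if K > 0)."""
    kappa = K / (N - 1)
    return (N - 1) * s_kappa_prime(kappa, d) / s_kappa(kappa, d)
```

For $K = 0$ this returns $(N-1)/d$, and at a fixed distance the bound decreases as $K$ increases, in accordance with the usual picture that a larger lower curvature bound forces a smaller Laplacian of the distance function.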


Proof As in the proof of Theorem 9.15, we fix a unit vector $v \in U_zM$ and put $\eta(t) := \exp_z(tv)$. We also take an orthonormal basis $\{e_i\}_{i=1}^{n-1} \cup \{v\}$ of $(T_zM, g_v)$ and define $E_i(t) := d(\exp_z)_{tv}(te_i)$ for $t \in [0, \rho(v))$ and $i = 1, 2, \ldots, n-1$. Now, let us consider polar coordinates $(x^i)_{i=1}^n$ on a neighborhood of $\eta((0, \rho(v)))$ such that $x^n = u$ and $(\partial/\partial x^i)|_{\eta(t)} = E_i(t)$ for $t \in (0, \rho(v))$ (one may recall Gauss' lemma (Lemma 3.18)). Then we have $\nabla u = \partial/\partial x^n$ and, by (11.9),
\[
\Delta u\bigl(\eta(t)\bigr) = -\psi_\eta'(t) + \frac{d}{dt}\Bigl[ \log\sqrt{ \det\bigl[ g_{ij}\bigl(\dot\eta(t)\bigr) \bigr] } \Bigr].
\]
Notice that
\[
g_{ij}\bigl(\dot\eta(t)\bigr) = g_{\dot\eta(t)}\Bigl( \frac{\partial}{\partial x^i}, \frac{\partial}{\partial x^j} \Bigr) = g_{\dot\eta(t)}\bigl( E_i(t), E_j(t) \bigr),
\]
and consider the function $h := (\det[g_{ij}(\dot\eta)])^{1/(2(n-1))}$ as in the proofs of Theorems 8.1 and 9.15. Then we find
\[
\Delta u\bigl(\eta(t)\bigr) = -\psi_\eta'(t) + \frac{(h^{n-1})'}{h^{n-1}}(t) = \frac{(e^{-\psi_\eta} h^{n-1})'}{e^{-\psi_\eta} h^{n-1}}(t) = \frac{f'}{f}(t), \tag{11.14}
\]
where $f(t) := e^{-\psi_\eta(t)} h(t)^{n-1}$. In the proof of Theorem 9.15, we showed
\[
\Bigl( \frac{f(t)}{s_{K/(N-1)}(t)^{N-1}} \Bigr)' \le 0
\]
as a consequence of the weighted Bishop inequality (9.6) under $\mathrm{Ric}_N \ge K$. This is equivalent to
\[
\frac{f'}{f}(t) \le (N-1)\, \frac{s'_{K/(N-1)}}{s_{K/(N-1)}}(t), \tag{11.15}
\]

which yields the claim on M \ ({z} ∪ Cut(z)). In order to extend (11.13) to M in the weak sense, let φ ∈ C∞ c (M) be an arbitrary nonnegative function with compact support. Then the claim is 



  φ(x) d(z, x) m(dx),

dφ(∇u) dm ≤ M

M

where we denote by (d(z, x)) the right-hand side of (11.13). We remark that (d(z, ·)) is integrable around z since (t) = O(t −1 ) as t → 0 and n ≥ 2. Given a unit vector v ∈ Uz M, let fv : (0, ρ(v)) −→ (0, ∞) be the function f as in the first part of this proof associated with v. Recall that dm = fv (t) dt Az (dv)

152

11 The Nonlinear Laplacian

on M \ ({z} ∪ Cut(z)) = {expz (tv) | v ∈ Uz M, t ∈ (0, ρ(v))}, where Az is the measure on Uz M induced from gv as in the proof of Theorem 9.15. Thanks to this decomposition of m and fv (t) ≤ (t)fv (t) from (11.15), we have 

  φ(x) d(z, x) m(dx) = M

 



Uz M



ρ(v)

  φ expz (tv) (t)fv (t) dt Az (dv)

ρ(v)

  φ expz (tv) fv (t) dt Az (dv).

0

≥ 0

Uz M

Moreover, for each v ∈ Uz M, 

ρ(v)

0

  φ expz (tv) fv (t) dt

!ρ(v)   = φ expz (tv) fv (t) −



0

 ≥− 0

ρ(v) 0

∂[φ(expz (tv))] fv (t) dt ∂t

"  ρ(v) # dφ ∇u expz (tv) fv (t) dt

since φ ≥ 0 and fv (0) = 0. Therefore we obtain 

  φ(x) d(z, x) m(dx) ≥ −

M



 Uz M

 =−

0

ρ(v)

"  # dφ ∇u expz (tv) fv (t) dt Az (dv)

dφ(∇u) dm, M

 

which completes the proof.
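The equivalence between the monotonicity of $f/\mathbf{s}_{K/(N-1)}^{N-1}$ and the inequality (11.15), which the proof above uses without comment, can be spelled out by a one-line computation. The following is only a sketch, assuming $N$ is finite and that $f > 0$ and $\mathbf{s}_{K/(N-1)} > 0$ on the interval under consideration; the shorthand $\mathbf{s} := \mathbf{s}_{K/(N-1)}$ is introduced here for brevity and is not notation from the text.

```latex
\[
\left( \frac{f}{\mathbf{s}^{N-1}} \right)'
= \frac{f'\,\mathbf{s}^{N-1} - (N-1)\, f\, \mathbf{s}^{N-2}\,\mathbf{s}'}{\mathbf{s}^{2(N-1)}}
= \frac{f}{\mathbf{s}^{N-1}}
  \left( \frac{f'}{f} - (N-1)\,\frac{\mathbf{s}'}{\mathbf{s}} \right).
\]
```

Since the prefactor $f/\mathbf{s}^{N-1}$ is positive under the stated assumptions, the left-hand side is nonpositive exactly when $f'/f \le (N-1)\,\mathbf{s}'/\mathbf{s}$.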

Remark 11.21 (Laplacian and Area Comparisons) As seen in the above proof, the Laplacian comparison is equivalent to the inequality (11.15), which is essentially the integration of the Bishop inequality (9.6). More precisely, (11.15) can be regarded as a directional and infinitesimal version of the area comparison (9.8). This observation, in particular, shows that the Laplacian comparison is closely related to the measure contraction property (see Remark 9.16 and Subsect. 18.5.3).

For the distance function $\bar u(x) := d(x,z)$ to $z$, we have

\[
\overleftarrow{\Delta} \bar u(x) = -\Delta(-\bar u)(x) \le (N-1) \frac{\mathbf{s}_{K/(N-1)}'\big(d(x,z)\big)}{\mathbf{s}_{K/(N-1)}\big(d(x,z)\big)}
\]

by Theorem 11.20 for the reverse Finsler structure $\overleftarrow{F}$, since the Ricci curvature bound $\mathrm{Ric}_N \ge K$ is common between $(M, F, m)$ and $(M, \overleftarrow{F}, m)$ (recall Sect. 2.5).


11.4 Linearized Laplacians

Besides the nonlinear Laplacian $\Delta$, we will also make use of its linearization associated with the Riemannian metric induced from a vector field (in most cases the gradient vector field of some function).

Let $V$ be a measurable vector field on $M$. For $f \in H^1_{\mathrm{loc}}(M)$ such that $df = 0$ almost everywhere on the set $\{x \in M \mid V(x) = 0\}$ (equivalently, $V \neq 0$ almost everywhere on the essential domain $M_f$ as in (11.8)), we define the gradient vector field and the Laplacian of $f$ on the (singular) weighted Riemannian manifold $(M, g_V, m)$ by

\[
\nabla^V f :=
\begin{cases}
\displaystyle \sum_{i,j=1}^n g^{ij}(V)\, \frac{\partial f}{\partial x^j} \frac{\partial}{\partial x^i} & \text{on } M_f, \\[2mm]
0 & \text{on } M \setminus M_f,
\end{cases}
\qquad
\Delta^V f := \operatorname{div}_m(\nabla^V f), \tag{11.16}
\]

where the latter is understood in the weak sense. Clearly $\nabla^V$ and $\Delta^V$ are linear operators (for each fixed $V$). Moreover, we immediately obtain the following by definition.

Lemma 11.22 For any $u \in H^1_{\mathrm{loc}}(M)$, we have $\nabla^{\nabla u} u = \nabla u$ as measurable vector fields and $\Delta^{\nabla u} u = \Delta u$ in the weak sense. Moreover, for a measurable, nowhere vanishing vector field $V$ such that $V = \nabla u$ on $M_u$, we similarly have $\nabla^V u = \nabla u$ and $\Delta^V u = \Delta u$.

Finally, we observe for later use that, for $u, f_1, f_2 \in H^1_{\mathrm{loc}}(M)$ satisfying $df_1 = df_2 = 0$ almost everywhere on $M \setminus M_u$, we have

\[
df_2(\nabla^{\nabla u} f_1) = g_{\nabla u}(\nabla^{\nabla u} f_1, \nabla^{\nabla u} f_2) = df_1(\nabla^{\nabla u} f_2). \tag{11.17}
\]
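The symmetry (11.17) becomes transparent once it is written in local coordinates. The display below is a sketch on the essential domain $M_u$, using only the coordinate expression of the linearized gradient in (11.16) with $V = \nabla u$; on $M \setminus M_u$ all three quantities vanish almost everywhere by the assumption $df_1 = df_2 = 0$.

```latex
\[
df_2(\nabla^{\nabla u} f_1)
= \sum_{i,j=1}^n g^{ij}(\nabla u)\,
  \frac{\partial f_1}{\partial x^j}\,\frac{\partial f_2}{\partial x^i}
= g_{\nabla u}\big(\nabla^{\nabla u} f_1, \nabla^{\nabla u} f_2\big)
= \sum_{i,j=1}^n g^{ij}(\nabla u)\,
  \frac{\partial f_2}{\partial x^j}\,\frac{\partial f_1}{\partial x^i}
= df_1(\nabla^{\nabla u} f_2).
\]
```

The middle expression is symmetric in $f_1$ and $f_2$ because the matrix $(g^{ij}(\nabla u))$ is symmetric, which is all that is needed.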

Chapter 12

The Bochner–Weitzenböck Formula

This chapter is devoted to the main ingredients of our geometric analysis on measured Finsler manifolds, the Bochner–Weitzenböck formula (or the Bochner formula), and the corresponding Bochner inequality, established in [208] in terms of the nonlinear Laplacian and its linearization introduced in the previous chapter. We refer to [216, Chap. 9] for the Riemannian case as well as a historical account. In the language of the celebrated $\Gamma$-calculus à la Bakry et al., the Bochner inequality can be regarded as a nonlinear analogue of the $\Gamma_2$-criterion (recall Remark 9.10(b)). Coupled with the nonlinear heat flow discussed in the next chapter, the Bochner inequality has fruitful applications including gradient estimates (Chap. 14), isoperimetric inequalities (Chap. 15), and functional inequalities (Chap. 16). For further applications of the Bochner inequality, we refer to [246] for an eigenvalue estimate of the nonlinear Laplacian, [254] for a gradient estimate for harmonic functions, and to [255, 256, 258, 259] for investigations of the $p$-Laplacian, among others.

12.1 Hessian

We begin with a Finsler notion of Hessian, which will be needed in the Bochner–Weitzenböck formula. For a twice differentiable function $u : M \longrightarrow \mathbb{R}$ and a point $x \in M_u$ (i.e., $du_x \neq 0$), we define the Hessian $\nabla^2 u(x) \in T_x^*M \otimes T_xM$ of $u$ at $x$ by using the covariant derivative (4.3) as

\[
\nabla^2 u(v) := D_v^{\nabla u}(\nabla u) \in T_xM, \qquad v \in T_xM \tag{12.1}
\]

(we will denote $[\nabla^2 u(x)](v)$ by $\nabla^2 u(v)$ for simplicity). Clearly, $\nabla^2 u(v)$ is linear in $v \in T_xM$, so that $\nabla^2 u(x)$ is an endomorphism of $T_xM$. This Hessian is symmetric in the following sense.

© The Author(s), under exclusive license to Springer Nature Switzerland AG 2021 S.-i. Ohta, Comparison Finsler Geometry, Springer Monographs in Mathematics, https://doi.org/10.1007/978-3-030-80650-7_12


Lemma 12.1 (Symmetry of Hessian) Let $u : M \longrightarrow \mathbb{R}$ be twice differentiable. Then we have

\[
g_{\nabla u}\big(\nabla^2 u(v), w\big) = g_{\nabla u}\big(v, \nabla^2 u(w)\big)
\]

for all $v, w \in T_xM$ with $x \in M_u$.

Proof This is essentially a consequence of the torsion-freeness as in Exercise 4.5. Let $V$ and $W$ be extensions of $v$ and $w$ to smooth vector fields on a neighborhood of $x$, respectively. Then we deduce from Lemma 4.8 and (11.7) that

\[
g_{\nabla u}\big(D_V^{\nabla u}(\nabla u), W\big) = V\big[g_{\nabla u}(\nabla u, W)\big] - g_{\nabla u}(\nabla u, D_V^{\nabla u} W) = V\big[du(W)\big] - du(D_V^{\nabla u} W).
\]

Combining this with $D_V^{\nabla u} W - D_W^{\nabla u} V = [V, W]$ from Exercise 4.5, we obtain

\[
g_{\nabla u}\big(D_V^{\nabla u}(\nabla u), W\big) - g_{\nabla u}\big(D_W^{\nabla u}(\nabla u), V\big) = du([V,W]) - du([V,W]) = 0.
\]

This completes the proof.
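The final cancellation in the proof of Lemma 12.1 combines two facts, and it may help to see them side by side. The following display is a sketch of that step only; it uses the identity $V[W u] - W[V u] = [V,W]u$ for vector fields acting on functions, together with the torsion-freeness quoted from Exercise 4.5.

```latex
\begin{align*}
& g_{\nabla u}\big(D^{\nabla u}_V(\nabla u), W\big)
  - g_{\nabla u}\big(D^{\nabla u}_W(\nabla u), V\big) \\
&\quad = \Big( V\big[du(W)\big] - W\big[du(V)\big] \Big)
  - du\big(D^{\nabla u}_V W - D^{\nabla u}_W V\big) \\
&\quad = du([V,W]) - du([V,W]) = 0 .
\end{align*}
```

Evaluating at $x$, where $V(x) = v$ and $W(x) = w$, gives the claimed symmetry.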

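Exercise 12.2 Let $u : M \longrightarrow \mathbb{R}$ be a twice differentiable function and take $x \in M_u$. In a local coordinate system $(x^i)_{i=1}^n$ such that $\{(\partial/\partial x^i)|_x\}_{i=1}^n$ is orthonormal with respect to $g_{\nabla u(x)}$, show the following expression of the Hessian:

\[
\nabla^2 u\left( \frac{\partial}{\partial x^i}\Big|_x \right)
= \sum_{j=1}^n \left\{ \frac{\partial^2 u}{\partial x^i \partial x^j} - \sum_{k=1}^n \Gamma^k_{ij}(\nabla u) \frac{\partial u}{\partial x^k} \right\}(x)\, \frac{\partial}{\partial x^j}\Big|_x. \tag{12.2}
\]

(Compare this with Exercise 9.6 in the Riemannian case.)

Exercise 12.3 In [230, Sect. 14.1], a different kind of Hessian $D^2u : TM \longrightarrow \mathbb{R}$ of a twice differentiable function $u$ is defined as $D^2u(v) := (u \circ \eta)''(0)$, where $\eta : (-\varepsilon, \varepsilon) \longrightarrow M$ is the geodesic with $\dot\eta(0) = v$. Prove that $D^2u(v)$ does not coincide with $g_{\nabla u}(\nabla^2 u(v), v)$ in general, and that they coincide in Berwald spaces.

The following relation between the Laplacian and the Hessian is worth noting (recall Remark 11.17(a)). A similar calculation will play a role in Lemma 12.6(ii) below as well. We remark that the trace of an endomorphism of a finite-dimensional vector space is defined independently of the choice of a basis.

Lemma 12.4 (Laplacian as the Trace of Hessian) Let $u$ be a twice differentiable function on a measured Finsler manifold $(M, F, m)$ and take $x \in M_u$. Then the trace of the Hessian $\nabla^2 u(x)$ coincides with $\Delta u(x) + \psi_\eta'(0)$, where $\eta : (-\varepsilon, \varepsilon) \longrightarrow M$ is the geodesic with $\dot\eta(0) = \nabla u(x)$ and the measure $m$ is decomposed as

\[
dm = e^{-\psi_\eta} \sqrt{\det\big(g_{ij}(\dot\eta)\big)}\, dx^1 dx^2 \cdots dx^n
\]

along $\eta$ as in Definition 9.11.

Proof Let $(x^i)_{i=1}^n$ be a local coordinate system such that $\{(\partial/\partial x^i)|_x\}_{i=1}^n$ is orthonormal with respect to $g_{\nabla u(x)}$. By the expression (11.9) of the divergence in local coordinates, we have at $x$

\begin{align*}
\Delta u(x)
&= \operatorname{div}_m \left( \sum_{i,j=1}^n g^*_{ij}(du) \frac{\partial u}{\partial x^j} \frac{\partial}{\partial x^i} \right) \\
&= \sum_{i=1}^n \frac{\partial^2 u}{(\partial x^i)^2} - \sum_{i,j=1}^n \frac{\partial g_{ij}}{\partial x^i}(\nabla u) \frac{\partial u}{\partial x^j} + \frac{d}{dt}\left[ -\psi_\eta + \frac{1}{2} \log\Big( \det\big(g_{ij}(\dot\eta)\big) \Big) \right](0) \\
&= \sum_{i=1}^n \frac{\partial^2 u}{(\partial x^i)^2} - \sum_{i,j=1}^n \frac{\partial g_{ij}}{\partial x^i}(\nabla u) \frac{\partial u}{\partial x^j} - \psi_\eta'(0) + \frac{1}{2} \operatorname{trace}\left( \frac{d[g_{ij}(\dot\eta)]}{dt}(0) \right),
\end{align*}

where we used Euler's theorem (Theorem 2.8) in the second equality. We deduce from the geodesic equation (3.15) of $\eta$, (4.5) and (3.6) that

\begin{align*}
& \frac{1}{2} \operatorname{trace}\left( \frac{d[g_{ij}(\dot\eta)]}{dt}(0) \right) - \sum_{i,j=1}^n \frac{\partial g_{ij}}{\partial x^i}(\nabla u) \frac{\partial u}{\partial x^j} \\
&= \frac{1}{2} \sum_{i,j=1}^n \left\{ \frac{\partial g_{ii}}{\partial x^j}(\nabla u) \frac{\partial u}{\partial x^j} + \frac{\partial g_{ii}}{\partial v^j}(\nabla u)\, \ddot\eta^j(0) \right\} - \sum_{i,j=1}^n \frac{\partial g_{ij}}{\partial x^i}(\nabla u) \frac{\partial u}{\partial x^j} \\
&= -\sum_{i,j=1}^n \gamma^j_{ii}(\nabla u) \frac{\partial u}{\partial x^j} - \frac{1}{F(\nabla u)} \sum_{i,j=1}^n A_{iij}(\nabla u)\, G^j(\nabla u) \\
&= -\sum_{i,j=1}^n \left\{ \gamma^j_{ii}(\nabla u) + \frac{1}{F(\nabla u)} \sum_{k=1}^n A_{iik}(\nabla u)\, N^k_j(\nabla u) \right\} \frac{\partial u}{\partial x^j} \\
&= -\sum_{i,j=1}^n \Gamma^j_{ii}\big(\nabla u(x)\big) \frac{\partial u}{\partial x^j}(x).
\end{align*}

Combining these with (12.2), we obtain

\[
\Delta u(x) = \sum_{i=1}^n \left\{ \frac{\partial^2 u}{(\partial x^i)^2} - \sum_{j=1}^n \Gamma^j_{ii}(\nabla u) \frac{\partial u}{\partial x^j} \right\}(x) - \psi_\eta'(0)
= \operatorname{trace}\big( \nabla^2 u(x) \big) - \psi_\eta'(0)
\]

as desired.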

 


The above expression of the Laplacian means that $\operatorname{trace}(\nabla^2 u)$ can be regarded as the unweighted Laplacian (determined only by $F$) and then $\Delta u$ is the weighted one with respect to $m$. One can also write

\[
\Delta u(x) = \operatorname{trace}\big( \nabla^2 u(x) \big) - \mathbf{S}\big( \nabla u(x) \big)
\]

by using the S-curvature (recall Remark 9.14).

12.2 Pointwise Formula

In this section, we prove the pointwise version of our Bochner–Weitzenböck formula. Our argumentation, originally established in [208], follows the lines of [244, Chap. 14] (in the Riemannian case) and is related to optimal transport theory (see Remark 12.10).

Let $u : M \longrightarrow \mathbb{R}$ be a $C^\infty$-function and put $V := \nabla u$ that is $C^\infty$ on $M_u$ and continuous on $M$. We fix $x \in M_u$ (i.e., $du_x \neq 0$) and, for sufficiently small $t \ge 0$, introduce a map $T_t$ and a vector field $V_t$ on a small neighborhood of $x$ defined by

\[
T_t(y) := \exp_y\big( tV(y) \big), \qquad V_t\big( T_t(y) \big) := \frac{d}{dt}\big[ T_t(y) \big], \tag{12.3}
\]

respectively. In other words, $(V_t)_{t \ge 0}$ is the time-dependent vector field evolved by the geodesic flow with the initial condition $V_0 = V$, and $(T_t)_{t \ge 0}$ is the flow generated by $(V_t)_{t \ge 0}$ such that $T_0$ is the identity map. Since the curve $\sigma(t) := T_t(y)$ is a geodesic for each $y$, we deduce from the geodesic equation (3.15) and $\dot\sigma(t) = V_t(\sigma(t))$ that

\begin{align*}
D^{\dot\sigma}_{\dot\sigma} \dot\sigma(t)
&= \sum_{i=1}^n \left\{ \ddot\sigma^i + \sum_{j,k=1}^n \Gamma^i_{jk}(\dot\sigma)\, \dot\sigma^j \dot\sigma^k \right\}(t)\, \frac{\partial}{\partial x^i}\Big|_{\sigma(t)} \\
&= \sum_{i=1}^n \left\{ \frac{\partial V_t^i}{\partial t} + \sum_{j=1}^n \frac{\partial V_t^i}{\partial x^j} V_t^j + \sum_{j,k=1}^n \Gamma^i_{jk}(V_t)\, V_t^j V_t^k \right\}\big(\sigma(t)\big)\, \frac{\partial}{\partial x^i}\Big|_{\sigma(t)} \\
&= 0.
\end{align*}

Therefore, we obtain

\[
\sum_{i=1}^n \frac{\partial V_t^i}{\partial t}\big(\sigma(t)\big)\, \frac{\partial}{\partial x^i}\Big|_{\sigma(t)} + D^{V_t}_{V_t} V_t\big(\sigma(t)\big) = 0, \tag{12.4}
\]

which is called the pressureless Euler equation in [244, Chap. 14].


Now we put $\eta(t) := T_t(x)$, take an orthonormal basis $\{e_i\}_{i=1}^n$ of $(T_xM, g_V)$ such that $e_n = \dot\eta(0)/F(\dot\eta(0))$, and consider the vector fields

\[
E_i(t) := d(T_t)_x(e_i) \in T_{\eta(t)}M, \qquad i = 1, 2, \ldots, n,
\]

along $\eta$. By construction, each $E_i$ is a Jacobi field along $\eta$. We remark that, in general, $T_t(\eta(s)) \neq \eta(s+t)$ for $s, t > 0$, and hence, $E_n(t)$ may not coincide with $\dot\eta(t)/F(\dot\eta(t))$ (they coincide if the geodesic $\eta$ is also an integral curve of $V = \nabla u$). Similarly to the proof of the Bonnet–Myers theorem (Theorem 8.1), we will write $E_i'(t) := D^{\dot\eta}_{\dot\eta} E_i(t)$ for simplicity and define an $n \times n$ matrix $B(t) = (b_{ij}(t))$ by $E_i'(t) = \sum_{j=1}^n b_{ij}(t) E_j(t)$. By a similar discussion to Claim 8.2, we obtain the following Riccati equation analogous to (8.1).

Lemma 12.5 (Riccati Equation) For $\eta$ and $B$ as above and $t \ge 0$, we have

\[
\frac{d[\operatorname{trace} B]}{dt}(t) + \operatorname{trace}\big( B^2(t) \big) + \mathrm{Ric}\big( \dot\eta(t) \big) = 0.
\]

Proof We can follow the same lines as the proof of Claim 8.2, up to a difference that $E_i(0) \neq 0$. Consider an $n \times n$ matrix $A(t) = (a_{ij}(t))$ given by $a_{ij}(t) := g_{\dot\eta}(E_i(t), E_j(t))$, and observe that $A' = BA + AB^{\mathsf T}$ by definition. Since each $E_i$ is a Jacobi field, we have $(BA - AB^{\mathsf T})' = 0$. This implies

\[
BA - AB^{\mathsf T} \equiv B(0) - B(0)^{\mathsf T}, \qquad A' = 2BA - B(0) + B(0)^{\mathsf T},
\]

and hence, $A'' = 2B'A + 2BA' = 2B'A + 2B^2A + 2BAB^{\mathsf T}$. We also deduce from the Jacobi equation that $A'' = -2R + 2BAB^{\mathsf T}$, where $R_{ij}(t) := g_{\dot\eta}(R_{\dot\eta}(E_i), E_j)(t)$. Comparing these expressions of $A''$, we obtain $B' + B^2 + RA^{-1} = 0$. Taking the trace completes the proof.

Next we analyze an endomorphism $\nabla V_t(\eta(t)) \in T^*_{\eta(t)}M \otimes T_{\eta(t)}M$ defined by

\[
\nabla V_t(v) := D^{V_t}_v V_t
\]

for $v \in T_{\eta(t)}M$. Note that $\nabla V = \nabla^2 u$.
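For readers who want the last step of the proof of Lemma 12.5 written out, the following sketch compares the two expressions of $A''$ and then takes the trace. It assumes that $A(t)$ is invertible (true for small $t$, since $A(0)$ is the identity matrix) and that $\mathrm{Ric}(\dot\eta)$ is the trace of the endomorphism $R_{\dot\eta}$; the entries $a^{ij}$ of $A^{-1}$ are notation introduced here for illustration.

```latex
\begin{align*}
2B'A + 2B^2A + 2BAB^{\mathsf T} = -2R + 2BAB^{\mathsf T}
 &\;\Longrightarrow\; B'A + B^2A + R = 0
 \;\Longrightarrow\; B' + B^2 + RA^{-1} = 0, \\
\operatorname{trace}(B') = \frac{d[\operatorname{trace} B]}{dt},
 &\qquad
 \operatorname{trace}(RA^{-1}) = \sum_{i,j=1}^n R_{ij}\, a^{ij}
 = \mathrm{Ric}(\dot\eta).
\end{align*}
```

The identity $\operatorname{trace}(RA^{-1}) = \mathrm{Ric}(\dot\eta)$ holds because, writing $R_{\dot\eta}(E_i) = \sum_k r_i^k E_k$, one has $R_{ij} = \sum_k r_i^k a_{kj}$, so the matrix $(r_i^k)$ of the endomorphism in the frame $(E_i)$ is exactly $RA^{-1}$.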


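Lemma 12.6
(i) We have $\nabla V_t(\eta(t)) = B(t)^{\mathsf T}$ in the sense that, for each $1 \le i \le n$, $\nabla V_t(E_i(t)) = \sum_{j=1}^n b_{ij}(t) E_j(t)$.
(ii) It holds that $\operatorname{trace}(B(t)) = \operatorname{div}_m V_t(\eta(t)) + \psi_\eta'(t)$, where $m = e^{-\psi_\eta} \operatorname{vol}_{g_{\dot\eta}}$ along $\eta$ as in Definition 9.11.

Proof (i) It suffices to show that $\nabla V_t(E_i(t))$ coincides with $E_i'(t)$. On one hand, note that $\nabla V_t(E_i(t)) = D^{V_t(\eta(t))}_{E_i(t)} V_t = D^{\dot\eta(t)}_{E_i(t)} V_t$, since $\dot\eta(t) = V_t(\eta(t))$ by (12.3). On the other hand, recall that

\[
E_i'(t) = \sum_{l=1}^n \left\{ \frac{dE_i^l}{dt}(t) + \sum_{j,k=1}^n \Gamma^l_{jk}\big( \dot\eta(t) \big)\, \dot\eta^j(t) E_i^k(t) \right\} \frac{\partial}{\partial x^l}\Big|_{\eta(t)},
\]

and observe from the definition of $E_i$ that

\[
E_i(t) = \frac{\partial}{\partial s}\Big[ T_t\big( \exp_x(se_i) \big) \Big]_{s=0}.
\]

Exchanging the order of differentiation (in $s$ and $t$), we deduce from (12.3) that

\[
\frac{dE_i^l}{dt}(t) = \frac{\partial}{\partial s}\Big[ V_t^l\Big( T_t\big( \exp_x(se_i) \big) \Big) \Big]_{s=0} = \sum_{k=1}^n \frac{\partial V_t^l}{\partial x^k}\big( \eta(t) \big)\, E_i^k(t).
\]

Therefore, we obtain $E_i'(t) = D^{\dot\eta(t)}_{E_i(t)} V_t = \nabla V_t(E_i(t))$.

(ii) This is shown in a similar way to Lemma 12.4. Choose a local coordinate system $(x^i)_{i=1}^n$ around $\eta(t)$ such that $\{(\partial/\partial x^i)|_{\eta(t)}\}_{i=1}^n$ is orthonormal with respect to $g_{\dot\eta(t)} = g_{V_t(\eta(t))}$. We suppress the evaluations at $\eta(t)$ in this proof. First we deduce from (4.10) that

\[
\nabla V_t\left( \frac{\partial}{\partial x^i} \right) = D^{V_t}_{\partial/\partial x^i} V_t
= \sum_{j=1}^n \left\{ \frac{\partial V_t^j}{\partial x^i} + \sum_{k=1}^n \Gamma^j_{ik}(V_t)\, V_t^k \right\} \frac{\partial}{\partial x^j}
= \sum_{j=1}^n \left\{ \frac{\partial V_t^j}{\partial x^i} + N^j_i(V_t) \right\} \frac{\partial}{\partial x^j}.
\]

Hence, it follows from (i) that

\[
\operatorname{trace}\big( B(t) \big) = \operatorname{trace}(\nabla V_t) = \sum_{i=1}^n \left\{ \frac{\partial V_t^i}{\partial x^i} + N^i_i(V_t) \right\}.
\]

Next we have

\[
\operatorname{div}_m V_t\big( \eta(t) \big) = \sum_{i=1}^n \frac{\partial V_t^i}{\partial x^i} - \psi_\eta'(t) + \frac{1}{2} \operatorname{trace}\left( \frac{d[g_{ij}(\dot\eta)]}{dt}(t) \right)
\]

and, by the geodesic equation (3.15) of $\eta$ and (4.10),

\begin{align*}
\frac{1}{2} \operatorname{trace}\left( \frac{d[g_{ij}(\dot\eta)]}{dt}(t) \right)
&= \frac{1}{2} \sum_{i,j=1}^n \left\{ \frac{\partial g_{ii}}{\partial x^j}(V_t)\, V_t^j + \frac{\partial g_{ii}}{\partial v^j}(V_t)\, \ddot\eta^j(t) \right\} \\
&= \sum_{i,j=1}^n \left\{ \gamma^i_{ij}(V_t)\, V_t^j - \frac{A_{iij}(V_t)}{F(V_t)}\, G^j(V_t) \right\} \\
&= \sum_{i=1}^n N^i_i(V_t).
\end{align*}

Combining these implies $\operatorname{trace}(B(t)) = \operatorname{div}_m V_t(\eta(t)) + \psi_\eta'(t)$ as desired.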

 

Substituting Lemma 12.6(ii) into Lemma 12.5, we find

\[
\frac{d}{dt}\Big[ \operatorname{div}_m V_t\big( \eta(t) \big) \Big] + \operatorname{trace}\big( B^2(t) \big) + \mathrm{Ric}_\infty\big( \dot\eta(t) \big) = 0.
\]

Moreover, the linearity of $\operatorname{div}_m$ (by the definition (11.9)) and (12.4) imply

\begin{align*}
\frac{d}{dt}\Big[ \operatorname{div}_m V_t\big( \eta(t) \big) \Big]
&= d(\operatorname{div}_m V_t)\big( \dot\eta(t) \big) + \operatorname{div}_m\left( \sum_{i=1}^n \frac{\partial V_t^i}{\partial t} \frac{\partial}{\partial x^i} \right)\big( \eta(t) \big) \\
&= d(\operatorname{div}_m V_t)\big( \dot\eta(t) \big) - \operatorname{div}_m(D^{V_t}_{V_t} V_t)\big( \eta(t) \big).
\end{align*}

Hence, we have

\[
\operatorname{div}_m(D^{V_t}_{V_t} V_t)\big( \eta(t) \big) - d(\operatorname{div}_m V_t)\big( \dot\eta(t) \big) = \mathrm{Ric}_\infty\big( \dot\eta(t) \big) + \operatorname{trace}\big( B^2(t) \big). \tag{12.5}
\]

Now, we put $t = 0$ and consider the benefit of the symmetry of the Hessian $\nabla V = \nabla^2 u$. We deduce from Lemma 12.6(i) that $g_{\nabla u}(\nabla^2 u(e_i), e_j) = b_{ij}(0)$. Combining this with the symmetry of $\nabla^2 u$ (Lemma 12.1), we find that $B(0)$ is symmetric. Hence, we have

\[
\operatorname{trace}\big( B^2(0) \big) = \sum_{i,j=1}^n g_{\nabla u}\big( \nabla^2 u(e_i), e_j \big)^2 = \| \nabla^2 u(x) \|^2_{HS(\nabla u)},
\]

where $\| \cdot \|_{HS(\nabla u)}$ denotes the Hilbert–Schmidt norm with respect to $g_{\nabla u}$. Furthermore, again by the symmetry of $\nabla^2 u$ and by Lemma 4.8, we have for each $1 \le i \le n$

\[
g_{\nabla u}\left( D^{\nabla u}_{\nabla u}(\nabla u), \frac{\partial}{\partial x^i} \right) = g_{\nabla u}\big( D^{\nabla u}_{\partial/\partial x^i}(\nabla u), \nabla u \big) = \frac{\partial}{\partial x^i}\left[ \frac{F^2(\nabla u)}{2} \right].
\]

This yields

\[
D^{\nabla u}_{\nabla u}(\nabla u) = \sum_{i,j=1}^n g^{ij}(\nabla u)\, g_{\nabla u}\left( D^{\nabla u}_{\nabla u}(\nabla u), \frac{\partial}{\partial x^i} \right) \frac{\partial}{\partial x^j} = \nabla^{\nabla u}\left[ \frac{F^2(\nabla u)}{2} \right]
\]

(recall (11.16) for the definition of the linearized gradient $\nabla^{\nabla u}$). Plugging these into (12.5), we arrive at the following pointwise Bochner–Weitzenböck formula on the essential domain $M_u$. Notice that we utilize the linearized Laplacian $\Delta^{\nabla u}$ as in (11.16).

Theorem 12.7 (Pointwise Bochner–Weitzenböck Formula) For $u \in C^\infty(M)$, we have

\[
\Delta^{\nabla u}\left[ \frac{F^2(\nabla u)}{2} \right] - d(\Delta u)(\nabla u) = \mathrm{Ric}_\infty(\nabla u) + \| \nabla^2 u \|^2_{HS(\nabla u)} \tag{12.6}
\]

on $M_u$. Moreover,

\[
\Delta^{\nabla u}\left[ \frac{F^2(\nabla u)}{2} \right] - d(\Delta u)(\nabla u) \ge \mathrm{Ric}_N(\nabla u) + \frac{(\Delta u)^2}{N} \tag{12.7}
\]

holds on $M_u$ for any $N \in (-\infty, 0) \cup [n, \infty]$. The last term $(\Delta u)^2/N$ is read as $0$ when $N = \infty$.

Proof The first assertion (12.6) was shown above. Then the second assertion (12.7) is clear if $N = \infty$. For $N \in (-\infty, 0) \cup (n, \infty)$, we deduce from Lemma 12.6 and the symmetry of $B(0)$ that

\[
\| \nabla^2 u \|^2_{HS(\nabla u)} = \operatorname{trace}\big( B^2(0) \big)
= \frac{(\operatorname{trace} B(0))^2}{n} + \operatorname{trace}\left[ \left( B(0) - \frac{\operatorname{trace} B(0)}{n} I_n \right)^2 \right]
\ge \frac{(\Delta u + \psi_\eta'(0))^2}{n}.
\]

Plugging $a = \Delta u$ and $b = \psi_\eta'(0)$ into the equation

\[
\frac{(a+b)^2}{n} = \frac{a^2}{N} - \frac{b^2}{N-n} + \frac{N(N-n)}{n} \left( \frac{a}{N} + \frac{b}{N-n} \right)^2
\]

and noticing that $N(N-n) > 0$ holds for $N \in (-\infty, 0) \cup (n, \infty)$, we have

\[
\frac{(\Delta u + \psi_\eta'(0))^2}{n} \ge \frac{(\Delta u)^2}{N} - \frac{\psi_\eta'(0)^2}{N-n}.
\]

Combining this with (12.6) yields the desired inequality (12.7) (recall Definition 9.11 for the definition of $\mathrm{Ric}_N$). The remaining case of $N = n$ is obtained as the limit.

If $\mathrm{Ric}_N \ge K$ for some $K \in \mathbb{R}$, then it follows from (12.7) that

\[
\Delta^{\nabla u}\left[ \frac{F^2(\nabla u)}{2} \right] - d(\Delta u)(\nabla u) \ge K F^2(\nabla u) + \frac{(\Delta u)^2}{N} \tag{12.8}
\]

on $M_u$, which we will call the pointwise Bochner inequality.

One can generalize the Bochner–Weitzenböck formula and the Bochner inequality to a more general class of Hamiltonian structures (by dropping the positive 1-homogeneity of $F$; recall Remark 2.14). We refer to [149, 198] and the references therein for details.

Remark 12.8 ($F$ Versus $g_{\nabla u}$) In contrast to the relation $\Delta^{\nabla u} u = \Delta u$ (Lemma 11.22), $\mathrm{Ric}_N(\nabla u)$ may not coincide with the weighted Ricci curvature $\mathrm{Ric}_N^{g_{\nabla u}}(\nabla u)$ of the weighted Riemannian manifold $(M, g_{\nabla u}, m)$ (unless all integral curves of $\nabla u$ are geodesic; recall Remark 9.12). This is compensated in the formula (12.6) by the fact that $\nabla^2 u$ does not necessarily coincide with the Hessian of $u$ with respect to $g_{\nabla u}$. As for the Bochner inequality (12.8), although the quantities in the inequality are common between $F$ and $g_{\nabla u}$, the assumption $\mathrm{Ric}_N \ge K$ is not.

Exercise 12.9 Let $u \in C^\infty(M)$ and suppose that all integral curves of $\nabla u$ are geodesic. Prove that the Finsler Hessian $\nabla^2 u$ at $x \in M_u$ coincides with the Riemannian Hessian $\operatorname{Hess} u$ at $x$ with respect to the Riemannian metric $g_{\nabla u}$. Recall (9.1) for the Riemannian Hessian. By polarization, it is sufficient to show $g_{\nabla u}(\nabla^2 u(v), v) = \operatorname{Hess} u(v, v)$ for $v \in T_xM$.

We close this section with a remark on the connection with optimal transport theory.

Remark 12.10 (Relation with Optimal Transport Theory) Let us explain an interpretation of the vector field $V_t$ as in (12.3) from the viewpoint of optimal transport theory. See Sect. 18.1 for some terminologies, and we refer to Villani's books [243, 244] for further information and diverse developments of this quite fruitful theory.

Briefly speaking, $(V_t)_{t \ge 0}$ can be regarded as the tangent (velocity) vector of a geodesic in the $L^2$-Wasserstein space with the initial vector $\nabla u$. A potential function $\varphi_t$ of the vector field $V_t$ (i.e., $\nabla \varphi_t = V_t$) is given by

\[
\varphi_t(y) := \inf_{x \in M} \left\{ u(x) + \frac{d^2(x,y)}{2t} \right\}. \tag{12.9}
\]

This is the Hopf–Lax formula providing a viscosity solution to the Hamilton–Jacobi equation:

\[
\frac{\partial \varphi_t}{\partial t} + \frac{F^2(\nabla \varphi_t)}{2} = 0, \tag{12.10}
\]

which corresponds to the pressureless Euler equation (12.4) (see [159, Theorem 2.5], [244]). In view of optimal transport theory, (12.9) is rewritten as

\[
t\varphi_t(y) = \inf_{x \in M} \left\{ \frac{d^2(x,y)}{2} - \big( -tu(x) \big) \right\} = (-tu)^c(y),
\]

where $(-tu)^c$ is called the $c$-transform of the function $-tu$ for the quadratic cost function $c(x,y) = d^2(x,y)/2$. This implies that the map $\exp(t\nabla u)$ provides an optimal transport from $\mu$ to $[\exp(t\nabla u)]_* \mu$ with respect to $F$, and $\exp(t \overleftarrow{\nabla}(-\varphi_t))$ gives its reverse transport from $[\exp(t\nabla u)]_* \mu$ to $\mu$ that is optimal with respect to the reverse Finsler structure $\overleftarrow{F}$. Here $\mu$ is any absolutely continuous probability measure (with sufficiently small support) and $[\exp(t\nabla u)]_* \mu$ denotes the push-forward of $\mu$ by the map $\exp(t\nabla u)$ (see Theorem 18.2). We in particular find that, along a geodesic $\eta(t) = \exp_x(t\nabla u(x))$,

\[
\dot\eta(t) = -\overleftarrow{\nabla}(-\varphi_t)\big( \eta(t) \big) = \nabla \varphi_t\big( \eta(t) \big).
\]

This corresponds to (12.3). In terms of Wasserstein geometry (or the Otto calculus on the $L^2$-Wasserstein space), the vector field $\nabla \varphi_t$ can be regarded as the tangent vector at time $t$ of the geodesic $([\exp(t\nabla u)]_* \mu)_{t \in [0,\delta]}$ for the $L^2$-Wasserstein distance induced from $F$ (provided that $\delta > 0$ is sufficiently small depending on the support of $\mu$).

Exercise 12.11 Show that (12.10) implies (12.4) by taking $\nabla^{\nabla \varphi_t}$ of (12.10). One may use the symmetry of $\nabla^2 \varphi_t = \nabla V_t$ as in Lemma 12.1.
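As a sanity check of the Hopf–Lax formula (12.9), the Hamilton–Jacobi equation (12.10) and the definition (12.3), one can work out the simplest flat case by hand. The following is only an illustrative sketch under assumptions not made in the text: $M = \mathbb{R}^n$ with the Euclidean norm $|\cdot|$ (so that $d(x,y) = |y-x|$, geodesics are straight lines, and $\nabla$ is the usual gradient), and the hypothetical initial function $u(x) = |x|^2/2$.

```latex
\begin{align*}
\varphi_t(y)
 &= \inf_{x \in \mathbb{R}^n}
    \left\{ \frac{|x|^2}{2} + \frac{|x-y|^2}{2t} \right\}
  = \frac{|y|^2}{2(1+t)}
  \qquad \text{(attained at } x = y/(1+t)\text{)}, \\
\frac{\partial \varphi_t}{\partial t}(y)
 &= -\frac{|y|^2}{2(1+t)^2},
 \qquad
 \frac{|\nabla \varphi_t(y)|^2}{2} = \frac{|y|^2}{2(1+t)^2}, \\
T_t(x) &= x + t\nabla u(x) = (1+t)\,x,
 \qquad
 V_t\big(T_t(x)\big) = \frac{d}{dt}\big[T_t(x)\big] = x
 = \nabla \varphi_t\big(T_t(x)\big).
\end{align*}
```

The second line confirms (12.10), and the third line confirms $\nabla \varphi_t = V_t$ in this model case. Since $V_t(y) = y/(1+t)$, one also checks directly that $\partial_t V_t + (V_t \cdot \nabla) V_t = -y/(1+t)^2 + y/(1+t)^2 = 0$, which is the flat form of the pressureless Euler equation (12.4).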

12.3 Integrated Formula

In the previous section, we have shown the Bochner–Weitzenböck formula in the pointwise sense on the essential domain $M_u$ for a function $u \in C^\infty(M)$. Since the gradient vector field $\nabla u$ is only continuous on $M \setminus M_u$, we need to consider an integrated (weak) formulation to extend the formula to $M$. Such an integrated version of the Bochner–Weitzenböck formula was established in [208, Theorem 3.6].

The following fact will play an essential role in overcoming the ill-posedness of $\nabla u$ on $M \setminus M_u$ (see, e.g., [151, Exercise 10.37(iv)], [42, Chap. I, Theorem 7.1.1]).


Lemma 12.12 For $f \in H^1_{\mathrm{loc}}(M)$, we have $df = 0$ almost everywhere on $f^{-1}(0)$. If $f \in H^1_{\mathrm{loc}}(M) \cap L^\infty_{\mathrm{loc}}(M)$, then it also holds that $d(f^2/2) = f\, df = 0$ almost everywhere on $f^{-1}(0)$.

We denote by $H^1_c(M)$ the set of functions in $H^1(M)$ with compact support.

Theorem 12.13 (Integrated Bochner–Weitzenböck Formula) For $u \in H^2_{\mathrm{loc}}(M) \cap C^1(M)$ such that $\Delta u \in H^1_{\mathrm{loc}}(M)$, we have

\[
-\int_M d\phi\left( \nabla^{\nabla u}\left[ \frac{F^2(\nabla u)}{2} \right] \right) dm
= \int_M \phi \Big\{ d(\Delta u)(\nabla u) + \mathrm{Ric}_\infty(\nabla u) + \| \nabla^2 u \|^2_{HS(\nabla u)} \Big\}\, dm \tag{12.11}
\]

for all bounded functions $\phi \in H^1_c(M) \cap L^\infty(M)$, and

\[
-\int_M d\phi\left( \nabla^{\nabla u}\left[ \frac{F^2(\nabla u)}{2} \right] \right) dm
\ge \int_M \phi \left\{ d(\Delta u)(\nabla u) + \mathrm{Ric}_N(\nabla u) + \frac{(\Delta u)^2}{N} \right\} dm \tag{12.12}
\]

for all $N \in (-\infty, 0) \cup [n, \infty]$ and bounded nonnegative functions $\phi \in H^1_c(M) \cap L^\infty(M)$. The last term $(\Delta u)^2/N$ in (12.12) is read as $0$ for $N = \infty$.

In particular, if $\mathrm{Ric}_N \ge K$ for some $N \in (-\infty, 0) \cup [n, \infty]$ and $K \in \mathbb{R}$, then we have the integrated Bochner inequality:

\[
-\int_M d\phi\left( \nabla^{\nabla u}\left[ \frac{F^2(\nabla u)}{2} \right] \right) dm
\ge \int_M \phi \left\{ d(\Delta u)(\nabla u) + K F^2(\nabla u) + \frac{(\Delta u)^2}{N} \right\} dm \tag{12.13}
\]

for all bounded nonnegative functions $\phi \in H^1_c(M) \cap L^\infty(M)$.

Proof First of all, observe that all the integrals in (12.11) and (12.12) are well-defined. Precisely, in the right-hand side, it is sufficient to assume $u \in H^2_{\mathrm{loc}}(M)$, $\Delta u \in H^1_{\mathrm{loc}}(M)$, $\phi \in L^\infty(M)$ and the compactness of $\operatorname{supp} \phi$. In the left-hand side, since $F^*(d\phi)$ may not be bounded, we need to assume $u \in C^1(M)$ besides $u \in H^2_{\mathrm{loc}}(M)$ and $\phi \in H^1_c(M)$ (see also Exercise 12.17 below). We also remark that all the integrands of (12.11) and (12.12) vanish almost everywhere on $M \setminus M_u$ thanks to Lemma 12.12. Since the proofs are common, we will prove only (12.12) assuming $N < \infty$.

First we consider the case of $u \in C^\infty(M)$. If $\phi \in H^1_c(M_u)$, then (12.12) follows from the pointwise inequality (12.7) on $M_u$ via the integration by parts (11.10). For an arbitrary nonnegative function $\phi \in H^1_c(M) \cap L^\infty(M)$, set

\[
\phi_k := \min\Big\{ \phi, \max\big\{ kF^2(\nabla u) - k^{-1}, 0 \big\} \Big\}, \qquad k \in \mathbb{N}.
\]

Observe that $0 \le \phi_k \le \phi$ and $\lim_{k \to \infty} \phi_k(x) = \phi(x)$ for all $x \in M_u$. Moreover, since $\phi_k(x) = 0$ if $F(\nabla u(x)) \le k^{-1}$, we find $\phi_k \in H^1_c(M_u)$. Hence, we have

\[
-\int_M d\phi_k\left( \nabla^{\nabla u}\left[ \frac{F^2(\nabla u)}{2} \right] \right) dm
\ge \int_M \phi_k \left\{ d(\Delta u)(\nabla u) + \mathrm{Ric}_N(\nabla u) + \frac{(\Delta u)^2}{N} \right\} dm. \tag{12.14}
\]

In the limit as $k \to \infty$, on one hand, the right-hand side of (12.14) converges to

\[
\int_M \phi \left\{ d(\Delta u)(\nabla u) + \mathrm{Ric}_N(\nabla u) + \frac{(\Delta u)^2}{N} \right\} dm.
\]

On the other hand, concerning the left-hand side, we put

\[
\Omega_k := \Big\{ x \in M_u \,\Big|\, \phi(x) > \max\big\{ kF^2\big( \nabla u(x) \big) - k^{-1}, 0 \big\} \Big\} = \{ x \in M_u \mid \phi(x) \neq \phi_k(x) \}.
\]

Then, since $\nabla^{\nabla u}[F^2(\nabla u)] = 0$ almost everywhere on $M \setminus M_u$, we have

\begin{align*}
\left| \int_M d(\phi - \phi_k)\big( \nabla^{\nabla u}[F^2(\nabla u)] \big)\, dm \right|
&\le \left| \int_{\Omega_k} d\phi\big( \nabla^{\nabla u}[F^2(\nabla u)] \big)\, dm \right| + \left| \int_{\Omega_k} d\phi_k\big( \nabla^{\nabla u}[F^2(\nabla u)] \big)\, dm \right| \\
&\le \left| \int_{\Omega_k} d\phi\big( \nabla^{\nabla u}[F^2(\nabla u)] \big)\, dm \right| + k \int_{\Omega_k} d[F^2(\nabla u)]\big( \nabla^{\nabla u}[F^2(\nabla u)] \big)\, dm.
\end{align*}

The first term of the right-hand side tends to $0$ as $k \to \infty$ since $\Omega_k$ decreases to a null set. In the second term, notice that

\[
d[F^2(\nabla u)]\big( \nabla^{\nabla u}[F^2(\nabla u)] \big) = 4F^2(\nabla u) \cdot d[F(\nabla u)]\big( \nabla^{\nabla u}[F(\nabla u)] \big)
\]

on $M_u$. Combining this with the choice of $\Omega_k$, we obtain

\[
k \int_{\Omega_k} d[F^2(\nabla u)]\big( \nabla^{\nabla u}[F^2(\nabla u)] \big)\, dm
\le 4 \int_{\Omega_k} (\phi + k^{-1}) \cdot d[F(\nabla u)]\big( \nabla^{\nabla u}[F(\nabla u)] \big)\, dm \to 0
\]

as $k \to \infty$. Therefore, the left-hand side of (12.14) converges to

\[
-\int_M d\phi\left( \nabla^{\nabla u}\left[ \frac{F^2(\nabla u)}{2} \right] \right) dm
\]

as $k \to \infty$. This completes the proof for $u \in C^\infty(M)$.

Next we consider the general case of $u \in H^2_{\mathrm{loc}}(M) \cap C^1(M)$ with $\Delta u \in H^1_{\mathrm{loc}}(M)$. Since $u \in H^2_{\mathrm{loc}}(M)$, one can choose $u_i \in C^\infty(M)$ such that $u_i \to u$ locally in the $H^2$-norm as $i \to \infty$. Using the integration by parts (11.10), one can rewrite (12.12) for the function $u_i$ as

\[
-\int_M d\phi\left( \nabla^{\nabla u_i}\left[ \frac{F^2(\nabla u_i)}{2} \right] \right) dm
\ge -\int_M \operatorname{div}_m(\phi \nabla u_i)\, \Delta u_i\, dm + \int_M \phi \left\{ \mathrm{Ric}_N(\nabla u_i) + \frac{(\Delta u_i)^2}{N} \right\} dm.
\]

As $i \to \infty$, each of the terms appearing in this inequality converges to the respective term with $u$ in place of $u_i$. Hence, we obtain (12.12) for general $u$.

As we will see in the next chapter (Theorem 13.18), the required regularities $u \in H^2_{\mathrm{loc}}(M) \cap C^1(M)$ and $\Delta u \in H^1_{\mathrm{loc}}(M)$ are fulfilled by solutions to the nonlinear heat equation $\partial_t u = \Delta u$. This leads to various applications of known techniques in the $\Gamma$-calculus, discussed in Chaps. 14–16.

Exercise 12.14 Let $(M, F, m)$ be compact and satisfy $\mathrm{Ric}_N \ge K$ for some $K > 0$ and $N \in (-\infty, 0) \cup [n, \infty]$. Assume that there is a nonconstant function $u \in H^2(M) \cap C^1(M)$ satisfying $\Delta u = -\lambda u$ for a positive constant $\lambda > 0$ (i.e., a kind of eigenfunction). Show that

\[
\lambda \ge \frac{KN}{N-1}
\]

holds, where the right-hand side is read as $K$ when $N = \infty$.

Recall that the Laplacian $\Delta$ is a negative operator (Sect. 11.2); thereby $\lambda$ above is necessarily positive ($\lambda = 0$ only when $u$ is constant). For weighted Riemannian manifolds, the above inequality is understood as the spectral gap $\lambda_1 \ge KN/(N-1)$ for the first nonzero eigenvalue $\lambda_1$ of the weighted Laplacian. One should compare Exercise 12.14 with the Poincaré–Lichnerowicz inequality in Theorem 15.4 below (see also Remark 15.6(a)).
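One possible route to the estimate in Exercise 12.14, offered only as a hedged sketch and not necessarily the intended solution, is to test the integrated Bochner inequality (12.13) with the constant function $\phi \equiv 1$, which is admissible because $M$ is compact. The sketch assumes $N$ is finite with $N \neq 1$, and it uses $d(\Delta u)(\nabla u) = -\lambda\, du(\nabla u) = -\lambda F^2(\nabla u)$ together with the integration by parts $\int_M F^2(\nabla u)\, dm = -\int_M u\, \Delta u\, dm = \lambda \int_M u^2\, dm$.

```latex
\begin{align*}
0 &\ge \int_M \left\{ d(\Delta u)(\nabla u) + K F^2(\nabla u)
      + \frac{(\Delta u)^2}{N} \right\} dm \\
  &= (K - \lambda) \int_M F^2(\nabla u)\, dm
     + \frac{\lambda^2}{N} \int_M u^2\, dm \\
  &= \lambda \left( K - \lambda + \frac{\lambda}{N} \right) \int_M u^2\, dm .
\end{align*}
```

Since $\lambda > 0$ and $\int_M u^2\, dm > 0$, this gives $K \le \lambda (N-1)/N$, and $(N-1)/N > 0$ for $N < 0$ as well as for $N > 1$, so $\lambda \ge KN/(N-1)$. For $N = \infty$ the last term is dropped and the same computation yields $\lambda \ge K$.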

12.4 Improved Bochner Inequality

In this section, we give an inequality improving the Bochner inequality (12.8) with $N = \infty$, followed by its integrated form improving (12.13). This improved (or reinforced) Bochner inequality will be used to show the $L^1$-gradient estimate (Theorem 14.4) and the Gaussian isoperimetric inequality (Theorem 15.1). We remark that the Bochner inequality (12.8) itself implies the $L^2$-gradient estimate (Theorem 14.2), which is a priori weaker than the $L^1$-gradient estimate (see the proof of Theorem 14.6).

In the context of linear diffusion operators, such an inequality can be derived from (12.8) by a self-improvement argument (see [21, Sect. C.6], and also [224] for an extension to RCD$(K, \infty)$-spaces). Instead of trying to generalize the self-improvement procedure, here we give a direct proof.

Proposition 12.15 (Improved Bochner Inequality) Assume that $\mathrm{Ric}_\infty \ge K$ holds for some $K \in \mathbb{R}$. Then we have, for any $u \in C^\infty(M)$,

\[
\Delta^{\nabla u}\left[ \frac{F^2(\nabla u)}{2} \right] - d(\Delta u)(\nabla u) \ge K F^2(\nabla u) + d[F(\nabla u)]\big( \nabla^{\nabla u}[F(\nabla u)] \big) \tag{12.15}
\]

on $M_u$.

Proof Observe from the Bochner–Weitzenböck formula (12.6) that

\[
\Delta^{\nabla u}\left[ \frac{F^2(\nabla u)}{2} \right] - d(\Delta u)(\nabla u) \ge K F^2(\nabla u) + \| \nabla^2 u \|^2_{HS(\nabla u)}.
\]

Therefore, it suffices to show

\[
4F^2(\nabla u)\, \| \nabla^2 u \|^2_{HS(\nabla u)} \ge d[F^2(\nabla u)]\big( \nabla^{\nabla u}[F^2(\nabla u)] \big). \tag{12.16}
\]

Fix $x \in M_u$ and choose a local coordinate system $(x^i)_{i=1}^n$ such that $g_{ij}(\nabla u(x)) = \delta_{ij}$. First we calculate the right-hand side of (12.16) at $x$ as

\begin{align*}
d[F^2(\nabla u)]\big( \nabla^{\nabla u}[F^2(\nabla u)] \big)
&= \sum_{i=1}^n \left( \frac{\partial [F^2(\nabla u)]}{\partial x^i} \right)^2 \\
&= 4 \sum_{i=1}^n g_{\nabla u}\big( D^{\nabla u}_{\partial/\partial x^i}(\nabla u), \nabla u \big)^2 \\
&= 4 \sum_{i=1}^n \left( \sum_{j=1}^n \frac{\partial u}{\partial x^j} \cdot g_{\nabla u}\left( D^{\nabla u}_{\partial/\partial x^i}(\nabla u), \frac{\partial}{\partial x^j} \right) \right)^2.
\end{align*}

Then we deduce from the Cauchy–Schwarz inequality that

\begin{align*}
\sum_{i=1}^n \left( \sum_{j=1}^n \frac{\partial u}{\partial x^j} \cdot g_{\nabla u}\left( D^{\nabla u}_{\partial/\partial x^i}(\nabla u), \frac{\partial}{\partial x^j} \right) \right)^2
&\le \sum_{i=1}^n \left\{ \sum_{j=1}^n g_{\nabla u}\left( D^{\nabla u}_{\partial/\partial x^i}(\nabla u), \frac{\partial}{\partial x^j} \right)^2 \cdot \sum_{j=1}^n \left( \frac{\partial u}{\partial x^j} \right)^2 \right\} \\
&= \| \nabla^2 u \|^2_{HS(\nabla u)} \cdot F^2(\nabla u).
\end{align*}

This yields (12.16) as well as (12.15).
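The passage from (12.16) to (12.15) at the end of the proof rests on the chain rule for the linearized gradient, which is left implicit. The following sketch spells it out on $M_u$, where $F(\nabla u) > 0$; it uses only that $\nabla^{\nabla u}$ is the gradient with respect to the Riemannian metric $g_{\nabla u}$ and hence obeys the usual chain rule.

```latex
\begin{align*}
d[F^2(\nabla u)] &= 2F(\nabla u)\, d[F(\nabla u)],
 \qquad
 \nabla^{\nabla u}[F^2(\nabla u)] = 2F(\nabla u)\, \nabla^{\nabla u}[F(\nabla u)], \\
d[F^2(\nabla u)]\big(\nabla^{\nabla u}[F^2(\nabla u)]\big)
 &= 4F^2(\nabla u)\, d[F(\nabla u)]\big(\nabla^{\nabla u}[F(\nabla u)]\big).
\end{align*}
```

Dividing (12.16) by $4F^2(\nabla u) > 0$ therefore gives $\| \nabla^2 u \|^2_{HS(\nabla u)} \ge d[F(\nabla u)](\nabla^{\nabla u}[F(\nabla u)])$, and inserting this into the inequality obtained from (12.6) yields (12.15).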

 

The integrated form of the above improved inequality is obtained in the same way as Theorem 12.13.

Corollary 12.16 (Integrated Form) Assume that $\mathrm{Ric}_\infty \ge K$ holds for some $K \in \mathbb{R}$. Then, given $u \in H^2_{\mathrm{loc}}(M) \cap C^1(M)$ such that $\Delta u \in H^1_{\mathrm{loc}}(M)$, we have

\[
-\int_M d\phi\left( \nabla^{\nabla u}\left[ \frac{F^2(\nabla u)}{2} \right] \right) dm
\ge \int_M \phi \Big\{ d(\Delta u)(\nabla u) + K F^2(\nabla u) + d[F(\nabla u)]\big( \nabla^{\nabla u}[F(\nabla u)] \big) \Big\}\, dm
\]

for all bounded nonnegative functions $\phi \in H^1_c(M) \cap L^\infty(M)$.

Exercise 12.17 For $u \in H^2_{\mathrm{loc}}(M)$, prove that $F(\nabla u) \in H^1_{\mathrm{loc}}(M)$. As a corollary, we have $F^2(\nabla u) \in H^1_{\mathrm{loc}}(M)$ for $u \in H^2_{\mathrm{loc}}(M) \cap C^1(M)$.

Exercise 12.18 For $u \in C^2(M)$, prove that $F^2(\nabla u) \in C^1(M)$ (but $F(\nabla u)$ is not necessarily differentiable on $M \setminus M_u$).

Chapter 13

Nonlinear Heat Flow

In this chapter, we discuss fundamental properties of the nonlinear heat equation $\partial_t u = \Delta u$ associated with the nonlinear Laplacian $\Delta$ defined in Chap. 11. In particular, we establish the existence (Theorem 13.10) and the regularity (Theorem 13.18) of global solutions to the heat equation. Coupled with the Bochner inequalities in the previous chapter, the analysis of heat flow leads to various analytic and geometric applications as we will see in the following chapters. We remark that, due to the nonlinearity, there is no heat kernel.

Although it is nonlinear, our Laplacian is locally uniformly elliptic by virtue of the smoothness (on $TM \setminus 0$) and the strong convexity of Finsler structures. Therefore one can analyze the heat equation by using well-established techniques in partial differential equations. We remark that analytic arguments in the proof of the regularity will be only sketched since they are beyond the scope of this book; we refer to [104, 206] and the references therein for more details.

13.1 Global Solutions

First we introduce global solutions to the nonlinear heat equation $\partial_t u = \Delta u$ (with the Dirichlet boundary condition, to be precise). Actually, since in applications we utilize only global solutions, we will not consider local solutions in this book (see [206, Sect. 4] for local solutions).

Definition 13.1 (Global Solutions) We say that a function $u : (0,T) \times M \longrightarrow \mathbb{R}$ is a global solution to the heat equation $\partial_t u = \Delta u$ if it satisfies the following:

(1) $u \in L^2\big( (0,T), H^1_0(M) \big) \cap H^1\big( (0,T), H^{-1}(M) \big)$;

(2) For any $\phi \in C_c^\infty(M)$, it holds that

© The Author(s), under exclusive license to Springer Nature Switzerland AG 2021 S.-i. Ohta, Comparison Finsler Geometry, Springer Monographs in Mathematics, https://doi.org/10.1007/978-3-030-80650-7_13




\[
\int_M \phi \cdot \partial_t u_t\, dm = -\int_M d\phi(\nabla u_t)\, dm \tag{13.1}
\]

at almost every $t \in (0,T)$, where we set $u_t := u(t, \cdot)$.

In addition, we will call $u : (0,\infty) \times M \longrightarrow \mathbb{R}$ a global solution if its restriction $u : (0,T) \times M \longrightarrow \mathbb{R}$ is a global solution for every $T > 0$.

Let us explain the notations in (1) (we refer to [96, 219] for more details). The condition $u \in L^2((0,T), H^1_0(M))$ means that $u_t \in H^1_0(M)$ for almost every $t$ and $\int_0^T \| u_t \|^2_{H^1}\, dt < \infty$ holds. This, in particular, implies $\int_0^T \mathcal{E}(u_t)\, dt < \infty$. Denoted by $H^{-1}(M)$ is the dual Banach space of $H^1_0(M)$, so that $L^2(M) \subset H^{-1}(M)$. Then lying in $H^1((0,T), H^{-1}(M))$ means that $u$ satisfies the following conditions:

• The weak derivative $\partial_t u_t \in H^{-1}(M)$ exists for almost every $t \in (0,T)$ in the sense that we have

\[
\int_0^T \int_M \phi_t \cdot \partial_t u_t\, dm\, dt = -\int_0^T \int_M \partial_t \phi_t \cdot u_t\, dm\, dt
\]

for all $\phi \in C_c^\infty((0,T) \times M)$ (precisely, $\int_M \phi_t \cdot \partial_t u_t\, dm$ denotes the coupling of $\phi_t$ and $\partial_t u_t$);

• $\int_0^T \big( \| u_t \|^2_{L^2} + \| \partial_t u_t \|^2_{H^{-1}} \big)\, dt < \infty$ holds.

We will see that solutions to the heat equation eventually enjoy a reasonably higher regularity (Theorem 13.18), and then we have a clearer vision for the behavior of $u$. Here we only remark that the condition (1) implies $u \in C([0,T], L^2(M))$ (see [96, Subsect. 5.9.2], [219, Lemma 11.4]). This continuous extension is important when we discuss the initial condition. In the sequel, we will refer to the continuous extension $u : [0,T] \times M \longrightarrow \mathbb{R}$ as a global solution and represent it as $(u_t)_{t \in [0,T]}$.

In (2), by the same reasoning as (11.11), the test function $\phi$ can be taken from $H^1_0(M)$. Since $(H^1(M), \| \cdot \|_{H^1})$ is separable, the condition (2) is equivalent to requiring that, for any $\phi \in L^2((0,T), H^1_0(M))$, (13.1) holds at almost every $t \in (0,T)$.

Since the Laplacian $\Delta$ is nonlinear, given two global solutions $(u_t)_{t \in [0,T]}$ and $(\bar u_t)_{t \in [0,T]}$, their sum $(u_t + \bar u_t)_{t \in [0,T]}$ may not be a global solution. What we can show is that, if $M$ is compact, then $(au_t + b)_{t \in [0,T]}$ is a global solution for any $a \ge 0$ and $b \in \mathbb{R}$. Some more examples are discussed in the following exercises.

Exercise 13.2 (Examples of Global Solutions) Let $(\mathbb{R}^n, \| \cdot \|, L^n)$ be a Minkowski normed space equipped with the Lebesgue measure and fix $y \in \mathbb{R}^n$.

(1) Prove that

\[
u_t(x) = t^{-n/2} e^{-\| y - x \|^2/4t} \tag{13.2}
\]

is a global solution to the heat equation on $(\mathbb{R}^n, \| \cdot \|, L^n)$ (see Exercise 11.10).


(2) More generally, let $f : [0,\infty)^2 \longrightarrow \mathbb{R}$ be a $C^2$-function such that $\partial_r f(t,0) = 0$, $\partial_r f(t,r) \le 0$ and

\[
\partial_r^2 f(t,r) + \frac{n-1}{r} \partial_r f(t,r) = \partial_t f(t,r). \tag{13.3}
\]

Show that $u_t(x) = f(t, \| y - x \|)$ solves the heat equation.

(3) Similarly, letting $f$ satisfy $\partial_r f(t,0) = 0$, $\partial_r f(t,r) \ge 0$ and (13.3), show that $u_t(x) = f(t, \| x - y \|)$ solves the heat equation.

The Gaussian-like distribution $u_t$ in (13.2) is $C^2$ at $x = y$ if and only if the norm $\| \cdot \|$ comes from an inner product (Proposition 1.7). Therefore global solutions are typically not $C^2$ (and the $C^{1,\alpha}$-regularity in Theorem 13.18 will be the most we can expect).

Exercise 13.3 Let $f \in H^1_0(M)$ be an eigenfunction satisfying $\Delta f = -\lambda f$ for some $\lambda \ge 0$. Show that $u_t(x) := e^{-\lambda t} f(x)$ is a global solution to the heat equation.

Exercise 13.4 Given a global solution $(u_t)_{t \in [0,T]}$ to the heat equation on $(M, F, m)$, prove that $(-u_t)_{t \in [0,T]}$ is a global solution to the heat equation on $(M, \overleftarrow{F}, m)$.
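To see how part (1) of Exercise 13.2 fits the radial framework of part (2), one can check by hand that the Gaussian profile satisfies the radial equation (13.3). The following is only a sketch of that calculation, for $t > 0$ and with the profile $f(t,r) := t^{-n/2} e^{-r^2/4t}$ read off from (13.2); it does not address the remaining step of the exercise, namely passing from the radial equation to the heat equation on the normed space.

```latex
\begin{align*}
\partial_r f &= -\frac{r}{2t}\, f \;\le\; 0,
 \qquad \partial_r f(t,0) = 0, \\
\partial_r^2 f + \frac{n-1}{r}\,\partial_r f
 &= \left( -\frac{1}{2t} + \frac{r^2}{4t^2} \right) f - \frac{n-1}{2t}\, f
  = \left( -\frac{n}{2t} + \frac{r^2}{4t^2} \right) f, \\
\partial_t f &= \left( -\frac{n}{2t} + \frac{r^2}{4t^2} \right) f .
\end{align*}
```

The last two lines agree, so $f$ satisfies (13.3) together with the sign condition $\partial_r f \le 0$ required in (2).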

13.2 Existence

This section is devoted to the construction of global solutions to the heat equation. Since the energy functional $E$ is a lower semi-continuous convex function on the Hilbert space $L^2(M)$ (by (11.3) and Lemma 11.6), we can construct global solutions as gradient curves of $E$ with respect to the $L^2$-norm, applying the classical theory going back to [45, 84] (see also [182]). Here we give an outline along the lines of [206, Sect. 3]. We sometimes refer to results in the more general setting of CAT(0)-spaces (recall Subsect. 8.3.1; note that Hilbert spaces are CAT(0)-spaces), both because of the author's familiarity with it and to draw the reader's attention to recent diverse developments of gradient flow theory on CAT(0)-spaces (originally established in [128, 129, 171], see also [9, 17]).

Given $u \in H^1_0(M)$, we define the local descending slope of $E$ at $u$ by

\[ |\nabla(-E)|(u) := \max\biggl\{ 0, \limsup_{f \to u} \frac{E(u) - E(f)}{\|f - u\|_{L^2}} \biggr\}, \]

where $f \in H^1_0(M)$ and the convergence $f \to u$ is with respect to the $L^2$-norm. Note that, thanks to the convexity of $E$, this is in fact equivalent to the global descending slope, that is,

\[ |\nabla(-E)|(u) = \max\biggl\{ 0, \sup_{f \neq u} \frac{E(u) - E(f)}{\|f - u\|_{L^2}} \biggr\} \tag{13.4} \]

(see [9, Theorem 2.4.9]). We, in particular, find that $|\nabla(-E)|(u) = 0$ holds if and only if $u$ is a minimizer of $E$ in $H^1_0(M)$, i.e., a constant function.

Exercise 13.5 Prove (13.4).

Next we deduce that a direction in which $E$ decreases the most (the "gradient vector" of $-E$) is uniquely determined.

Lemma 13.6 (Gradient Vectors of $-E$) For $u \in H^1_0(M)$ with $0 < |\nabla(-E)|(u) < \infty$, there exists a unique element $h \in L^2(M)$ satisfying $\|h\|_{L^2} = |\nabla(-E)|(u)$ and

\[ \limsup_{\hat{h} \to h} \lim_{t \downarrow 0} \frac{E(u) - E(u + t\hat{h})}{t \|\hat{h}\|_{L^2}} = |\nabla(-E)|(u), \tag{13.5} \]

where $\hat{h} \to h$ in $L^2(M)$.

Proof Take a sequence $(\hat{h}_i)_{i \in \mathbb{N}}$ in $H^1_0(M) \setminus \{0\}$ such that $\hat{h}_i \to 0$ in $L^2(M)$ and

\[ \lim_{i \to \infty} \frac{E(u) - E(u + \hat{h}_i)}{\|\hat{h}_i\|_{L^2}} = |\nabla(-E)|(u). \]

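We put $h_i := (|\nabla(-E)|(u)/\|\hat{h}_i\|_{L^2}) \cdot \hat{h}_i$ and deduce from the convexity of $E$ that

\[ \lim_{i \to \infty} \lim_{t \downarrow 0} \frac{E(u) - E(u + th_i)}{t \|h_i\|_{L^2}} \ge \lim_{i \to \infty} \frac{E(u) - E(u + \hat{h}_i)}{\|\hat{h}_i\|_{L^2}} = |\nabla(-E)|(u). \]

Hence, we have

\[ \lim_{i \to \infty} \lim_{t \downarrow 0} \frac{E(u) - E(u + th_i)}{t \|h_i\|_{L^2}} = |\nabla(-E)|(u). \]

Moreover, for any $i, j \ge 1$, it follows from the definition of $|\nabla(-E)|(u)$, the convexity of $E$ and $\|h_i\|_{L^2} = \|h_j\|_{L^2} = |\nabla(-E)|(u)$ that

\begin{align*}
|\nabla(-E)|(u) &\ge \lim_{t \downarrow 0} \frac{E(u) - E(u + t(h_i + h_j)/2)}{t \|(h_i + h_j)/2\|_{L^2}} \\
&\ge \frac{|\nabla(-E)|(u)}{\|(h_i + h_j)/2\|_{L^2}} \lim_{t \downarrow 0} \biggl\{ \frac{E(u) - E(u + th_i)}{2t \|h_i\|_{L^2}} + \frac{E(u) - E(u + th_j)}{2t \|h_j\|_{L^2}} \biggr\}.
\end{align*}

Letting $i, j \to \infty$, we obtain $\lim_{i,j \to \infty} \|h_i + h_j\|_{L^2} = 2|\nabla(-E)|(u)$. Therefore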

\[ \|h_i - h_j\|_{L^2}^2 = 2\|h_i\|_{L^2}^2 + 2\|h_j\|_{L^2}^2 - \|h_i + h_j\|_{L^2}^2 \to 0 \]

as $i, j \to \infty$, and hence $(h_i)_{i \in \mathbb{N}}$ is a Cauchy sequence converging to some $h \in L^2(M)$ with $\|h\|_{L^2} = |\nabla(-E)|(u)$. Then the desired property (13.5) follows directly from the construction. The uniqueness also follows from the above construction, since the minimizing sequence $(\hat{h}_i)_{i \in \mathbb{N}}$ was arbitrary and we did not extract a subsequence. □

We define $\nabla(-E)(u) := h$ for $h \in L^2(M)$ granted by Lemma 13.6 above, and call it the gradient vector of $-E$ at $u$. We simply set $\nabla(-E)(u) := 0 \in L^2(M)$ if $|\nabla(-E)|(u) = 0$. Then, a gradient curve of $E$ will be understood as a (weak) solution to the equation $\partial_t u_t = \nabla(-E)(u_t)$.

Henceforth, we assume the finite reversibility $\Lambda_F < \infty$ for applying the lower semi-continuity of $E$ in Lemma 11.6. Following a well-established technique in gradient flow theory, we shall construct gradient curves of $E$ via a discrete approximation. For $u \in H^1_0(M)$ and $\delta > 0$, we denote by $J_\delta(u) \in H^1_0(M)$ a unique minimizer of the function

\[ f \longmapsto E(f) + \frac{\|f - u\|_{L^2}^2}{2\delta} \tag{13.6} \]

(one may compare this with (12.9)). The existence as well as the uniqueness of such a minimizer can be obtained in a similar way to Lemma 13.6 thanks to the lower semi-continuity of $E$ (i.e., any minimizing sequence is necessarily a Cauchy sequence in $L^2(M)$). We also remark that, if $|\nabla(-E)|(u) = 0$, then $u$ is constant and $J_\delta(u) = u$ for all $\delta$. One can regard $J_\delta(u)$ as an approximation of a point on a gradient curve of $E$ at time $\delta$ from $u$.

Lemma 13.7 (Infinitesimal Behavior of $J_\delta(u)$) Assume $\Lambda_F < \infty$ and fix $u \in H^1_0(M)$.

(i) As $\delta \to 0$, $J_\delta(u)$ converges to $u$ in the $L^2$-norm, as well as locally in the $H^1$-norm in the sense that

\[ \int_\Omega F^*\bigl( d[J_\delta(u)] - du \bigr)^2 \,dm \to 0 \]

for every compact set $\Omega \subset M$.

(ii) If $|\nabla(-E)|(u) < \infty$, then we have

\[ \lim_{\delta \to 0} \biggl\| \frac{J_\delta(u) - u}{\delta} - \nabla(-E)(u) \biggr\|_{L^2} = 0. \]

Proof First of all, if $|\nabla(-E)|(u) = 0$, then $J_\delta(u) = u$ and there is nothing to prove. Thus we will assume $|\nabla(-E)|(u) > 0$. We put $f_\delta := J_\delta(u)$ throughout this proof.

(i) It follows from the definition of $f_\delta = J_\delta(u)$ that

\[ E(f_\delta) + \frac{\|f_\delta - u\|_{L^2}^2}{2\delta} \le E(u). \tag{13.7} \]

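This implies $\lim_{\delta \to 0} \|f_\delta - u\|_{L^2} = 0$, and hence $E(u) \le \liminf_{\delta \to 0} E(f_\delta)$ by Lemma 11.6. Combining this with (13.7) shows that $\lim_{\delta \to 0} E(f_\delta) = E(u)$. Then we find, again from (13.7), $\lim_{\delta \to 0} \|f_\delta - u\|_{L^2}^2/\delta = 0$. Now, since

\[ E(f_\delta) + \frac{\|f_\delta - u\|_{L^2}^2}{2\delta} \le E\biggl( \frac{f_\delta + u}{2} \biggr) + \frac{\|f_\delta - u\|_{L^2}^2}{8\delta} \le \frac{E(f_\delta) + E(u)}{2} + \frac{\|f_\delta - u\|_{L^2}^2}{8\delta} \]

by the choice of $f_\delta$, we obtain $\lim_{\delta \to 0} E((f_\delta + u)/2) = E(u)$. Together with

\[ F^*\biggl( \frac{df_\delta + du}{2} \biggr)^2 \le \frac{F^*(df_\delta)^2}{2} + \frac{F^*(du)^2}{2} - \frac{1}{4S_x} F^*(df_\delta - du)^2 \]

at $x$ (dual to (8.8), recall Proposition 8.17 for the definition of $S_x$), we deduce that

\[ \int_\Omega F^*(df_\delta - du)^2 \,dm \le 4 \sup_{x \in \Omega} S_x \cdot \biggl\{ E(f_\delta) + E(u) - 2E\biggl( \frac{f_\delta + u}{2} \biggr) \biggr\} \to 0 \]

as $\delta \to 0$ for any compact set $\Omega \subset M$.

(ii) Let $(h_i)_{i \in \mathbb{N}}$ be a sequence in $H^1_0(M) \setminus \{0\}$ constructed in the proof of Lemma 13.6, satisfying $\|h_i\|_{L^2} = |\nabla(-E)|(u)$ and

\[ \lim_{i \to \infty} \lim_{t \downarrow 0} \frac{E(u) - E(u + th_i)}{t \|h_i\|_{L^2}} = |\nabla(-E)|(u). \]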

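Set $a := |\nabla(-E)|(u)$ for simplicity and consider

\[ h_i^\delta := u + \frac{\|f_\delta - u\|_{L^2}}{a} \cdot h_i. \]

Then we deduce from $\|h_i^\delta - u\|_{L^2} = \|f_\delta - u\|_{L^2}$ and the choice of $f_\delta = J_\delta(u)$ that

\[ \liminf_{\delta \to 0} \frac{E(u) - E(f_\delta)}{\|f_\delta - u\|_{L^2}} \ge \lim_{\delta \to 0} \frac{E(u) - E(h_i^\delta)}{\|h_i^\delta - u\|_{L^2}} = \lim_{t \downarrow 0} \frac{E(u) - E(u + th_i)}{t \|h_i\|_{L^2}}. \]

Letting $i \to \infty$ yields

\[ \lim_{\delta \to 0} \frac{E(u) - E(f_\delta)}{\|f_\delta - u\|_{L^2}} = a. \]

This implies that $(a/\|f_\delta - u\|_{L^2}) \cdot (f_\delta - u)$ converges to $\nabla(-E)(u)$ as $\delta \to 0$ by the same argument as the proof of Lemma 13.6.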

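It remains to show $\lim_{\delta \to 0} \|f_\delta - u\|_{L^2}/\delta = a$. Notice that

\[ E(f_\delta) + \frac{\|f_\delta - u\|_{L^2}^2}{2\delta} \le E(u + \delta h_i) + \frac{a^2 \delta}{2} \]

by the choices of $f_\delta$ and $h_i$, and that $E(u) - E(f_\delta) \le a \|f_\delta - u\|_{L^2}$ by (13.4). Combining these, we find

\begin{align*}
\frac{\|f_\delta - u\|_{L^2}^2}{2\delta^2} &\le \frac{E(u + \delta h_i) - E(f_\delta)}{\delta} + \frac{a^2}{2} \\
&= \frac{E(u + \delta h_i) - E(u)}{\delta} + \frac{E(u) - E(f_\delta)}{\delta} + \frac{a^2}{2} \\
&\le \frac{E(u + \delta h_i) - E(u)}{\delta} + \frac{a \|f_\delta - u\|_{L^2}}{\delta} + \frac{a^2}{2}.
\end{align*}

By rearrangement we obtain

\[ \frac{1}{2} \biggl( \frac{\|f_\delta - u\|_{L^2}}{\delta} - a \biggr)^2 \le \frac{E(u + \delta h_i) - E(u)}{\delta} + a^2. \]

Letting $\delta \to 0$ and then $i \to \infty$, we have $\lim_{\delta \to 0} \|f_\delta - u\|_{L^2}/\delta = a$ as desired. □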

Iterating the procedure (13.6) provides a discrete approximation of a gradient curve of $E$. In fact, given $u_0 \in H^1_0(M)$, $((J_{t/k})^k(u_0))_{t \ge 0}$ converges to an $L^2$-continuous curve $(u_t)_{t \ge 0}$ in $H^1_0(M)$ emanating from $u_0$ as $k \to \infty$, uniformly (in the $L^2$-norm) on each bounded interval $[0,T]$ (see, e.g., [171, Theorem 1.13]). The limit curve $(u_t)_{t \ge 0}$ satisfies the following properties:

• $(u_t)_{t \ge 0}$ is locally Lipschitz on $(0,\infty)$ as a curve in $L^2(M)$, and the function $t \longmapsto E(u_t)$ is continuous on $[0,\infty)$ as well as locally Lipschitz on $(0,\infty)$ (see [171, Theorem 2.9, Corollary 2.10]);

• We have

\[ \lim_{\delta \downarrow 0} \frac{E(u_t) - E(u_{t+\delta})}{\|u_{t+\delta} - u_t\|_{L^2}} = |\nabla(-E)|(u_t) \tag{13.8} \]

for all $t \ge 0$, and $|\nabla(-E)|(u_t) < \infty$ for all $t > 0$ (see [171, Theorem 2.14]);

• It holds that

\[ \lim_{\delta \downarrow 0} \frac{\|u_{t+\delta} - u_t\|_{L^2}}{\delta} = |\nabla(-E)|(u_t) \tag{13.9} \]

for all $t \ge 0$ (see [171, Theorem 2.17]).
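The iteration $(J_{t/k})^k(u_0)$ can be tried out in a finite-dimensional toy model. The sketch below is an illustration under assumptions that are not part of the text: $M$ is replaced by a cycle graph with $n$ vertices and $E$ by the quadratic energy $E(f) = \frac{1}{2}\sum_i (f_{i+1} - f_i)^2$ (a linear, Riemannian-type case), so that the minimizer of (13.6) solves the linear system $(I + \delta L)f = u$ with $L$ the graph Laplacian. All function names are hypothetical.

```python
def energy(f):
    """Quadratic energy E(f) = (1/2) * sum_i (f[i+1] - f[i])^2 on a cycle graph."""
    n = len(f)
    return 0.5 * sum((f[(i + 1) % n] - f[i]) ** 2 for i in range(n))

def solve(A, b):
    """Solve A x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    M = [A[i][:] + [b[i]] for i in range(n)]
    for c in range(n):
        p = max(range(c, n), key=lambda r: abs(M[r][c]))
        M[c], M[p] = M[p], M[c]
        for r in range(c + 1, n):
            m = M[r][c] / M[c][c]
            for k in range(c, n + 1):
                M[r][k] -= m * M[c][k]
    x = [0.0] * n
    for i in range(n - 1, -1, -1):
        x[i] = (M[i][n] - sum(M[i][k] * x[k] for k in range(i + 1, n))) / M[i][i]
    return x

def resolvent(u, delta):
    """J_delta(u): the minimizer of f -> E(f) + |f - u|^2 / (2 * delta).
    For this quadratic E the Euler-Lagrange equation is (I + delta * L) f = u."""
    n = len(u)
    A = [[0.0] * n for _ in range(n)]
    for i in range(n):
        A[i][i] += 1.0 + 2.0 * delta
        A[i][(i + 1) % n] -= delta
        A[i][(i - 1) % n] -= delta
    return solve(A, list(u))

def discrete_flow(u0, t, k):
    """Discrete approximation (J_{t/k})^k (u0) of the gradient curve at time t."""
    u = list(u0)
    for _ in range(k):
        u = resolvent(u, t / k)
    return u

u0 = [1.0, 0.0, 0.0, 0.0, 0.0, 0.0]
for k in (1, 10, 100):
    ut = discrete_flow(u0, 1.0, k)
    print(k, round(energy(ut), 6), round(sum(ut), 6))
```

In this linear model each step is one implicit Euler step for $\partial_t u = -Lu$, and the printed energies stabilize as $k$ grows. In the genuinely nonlinear Finsler setting the minimization in (13.6) no longer reduces to a linear system.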

Equation (13.8) ensures that the curve $(u_t)_{t \ge 0}$ is tangent to the gradient vector field $\nabla(-E)$. Together with (13.9), a similar discussion to Lemma 13.7(ii) shows the following.

Lemma 13.8 (Time Derivatives) In the notations as above, we have

\[ \lim_{\delta \downarrow 0} \biggl\| \frac{u_{t+\delta} - u_t}{\delta} - \nabla(-E)(u_t) \biggr\|_{L^2} = 0 \]

for all $t > 0$. In particular, the weak derivative $\partial_t u_t$ exists and coincides with $\nabla(-E)(u_t)$.

Exercise 13.9 Prove the latter assertion of the lemma above, i.e.,

\[ \int_0^T \int_M \partial_t \phi_t \cdot u_t \,dm\,dt = -\int_0^T \int_M \phi_t \cdot \nabla(-E)(u_t) \,dm\,dt \]

for all $T > 0$ and $\phi \in C^\infty_c((0,T) \times M)$.

Now we state the existence result of global solutions to the heat equation.

Theorem 13.10 (Existence) Assume $\Lambda_F < \infty$. For each initial datum $u_0 \in H^1_0(M)$ and $T > 0$, there exists a unique global solution $(u_t)_{t \in [0,T]}$ to the heat equation $\partial_t u = \Delta u$ lying in $L^2([0,T], H^1_0(M)) \cap H^1([0,T], L^2(M))$. Moreover, the distributional Laplacian $\Delta u_t$ is absolutely continuous with respect to $m$ for all $t \in (0,T)$, with the density function $\nabla(-E)(u_t) \in L^2(M)$. In particular, the weak derivative $\partial_t u_t$ coincides with $\Delta u_t$ and

\[ \lim_{\delta \downarrow 0} \frac{E(u_t) - E(u_{t+\delta})}{\delta} = \|\nabla(-E)(u_t)\|_{L^2}^2 = \|\Delta u_t\|_{L^2}^2 \tag{13.10} \]

holds for all $t > 0$.

Proof Let $(u_t)_{t \ge 0}$ be the gradient curve of $E$ constructed above. Note that, on one hand, $\int_0^T E(u_t) \,dt \le T E(u_0) < \infty$. On the other hand, we deduce from Lemma 13.8 and the combination of (13.8) and (13.9) that

\[ \int_0^T \|\partial_t u_t\|_{L^2}^2 \,dt = \int_0^T |\nabla(-E)|^2(u_t) \,dt = E(u_0) - E(u_T) < \infty. \]

Hence, we have $u \in L^2([0,T], H^1_0(M)) \cap H^1([0,T], L^2(M))$.

We shall show that (13.1) holds for all $t \in (0,T)$ by a variational argument on the minimizing procedure (13.6). Fix $t \in (0,T)$ and $\phi \in C^\infty_c(M)$. Given small $\delta, \varepsilon > 0$, we consider the functions

\[ f_\delta^t := J_\delta(u_t), \qquad \tilde{f}_{\delta,\varepsilon}^t := f_\delta^t + \varepsilon\phi. \]

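Then the definition of $f_\delta^t = J_\delta(u_t)$ implies

\[ E(\tilde{f}_{\delta,\varepsilon}^t) + \frac{\|\tilde{f}_{\delta,\varepsilon}^t - u_t\|_{L^2}^2}{2\delta} \ge E(f_\delta^t) + \frac{\|f_\delta^t - u_t\|_{L^2}^2}{2\delta}. \tag{13.11} \]

Firstly, observe that

\[ \lim_{\varepsilon \to 0} \frac{\|\tilde{f}_{\delta,\varepsilon}^t - u_t\|_{L^2}^2 - \|f_\delta^t - u_t\|_{L^2}^2}{\varepsilon} = 2 \int_M \phi (f_\delta^t - u_t) \,dm. \]

Secondly, we deduce from the convexity and the uniform smoothness of $(F^*)^2$ (recall Subsect. 8.3.2) that

\begin{align*}
&(1-s) F^*(df_\delta^t)^2 + s F^*(df_\delta^t + \varepsilon \,d\phi)^2 - C_x (1-s) s F^*(\varepsilon \,d\phi)^2 \\
&\le F^*(df_\delta^t + s\varepsilon \,d\phi)^2 \le (1-s) F^*(df_\delta^t)^2 + s F^*(df_\delta^t + \varepsilon \,d\phi)^2
\end{align*}

at $x \in M$ for all $s \in (0,1)$, where $C_x$ is from Proposition 8.17. By rearrangement we have

\begin{align*}
\frac{F^*(df_\delta^t + s\varepsilon \,d\phi)^2 - F^*(df_\delta^t)^2}{s} &\le F^*(df_\delta^t + \varepsilon \,d\phi)^2 - F^*(df_\delta^t)^2 \\
&\le \frac{F^*(df_\delta^t + s\varepsilon \,d\phi)^2 - F^*(df_\delta^t)^2}{s} + C_x (1-s) F^*(\varepsilon \,d\phi)^2.
\end{align*}

Letting $s \to 0$, we find from Lemma 3.8 that

\[ 2\varepsilon \,d\phi(\nabla f_\delta^t) \le F^*(df_\delta^t + \varepsilon \,d\phi)^2 - F^*(df_\delta^t)^2 \le 2\varepsilon \,d\phi(\nabla f_\delta^t) + C_x \varepsilon^2 F^*(d\phi)^2. \]

Hence, we obtain

\begin{align*}
\lim_{\varepsilon \to 0} \frac{E(\tilde{f}_{\delta,\varepsilon}^t) - E(f_\delta^t)}{\varepsilon} &= \lim_{\varepsilon \to 0} \frac{1}{2\varepsilon} \int_M \bigl\{ F^*(df_\delta^t + \varepsilon \,d\phi)^2 - F^*(df_\delta^t)^2 \bigr\} \,dm \\
&= \int_M d\phi(\nabla f_\delta^t) \,dm.
\end{align*}

Substituting these into (13.11), we have

\[ \lim_{\delta \to 0} \frac{1}{\delta} \int_M \phi (f_\delta^t - u_t) \,dm \ge -\lim_{\delta \to 0} \int_M d\phi(\nabla f_\delta^t) \,dm = -\int_M d\phi(\nabla u_t) \,dm \]

with the help of Lemma 13.7(i) and Proposition 8.17.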

Now, it follows from Lemma 13.7(ii) that

\[ \int_M \phi \cdot \nabla(-E)(u_t) \,dm = \lim_{\delta \to 0} \frac{1}{\delta} \int_M \phi (f_\delta^t - u_t) \,dm \ge -\int_M d\phi(\nabla u_t) \,dm. \]

We have the reverse inequality as well by replacing $\phi$ with $-\phi$, and hence

\[ \int_M \phi \cdot \nabla(-E)(u_t) \,dm = -\int_M d\phi(\nabla u_t) \,dm = \int_M \phi \Delta u_t \,dm. \]

Therefore we conclude that (13.1) holds with $\partial_t u_t = \nabla(-E)(u_t)$. Moreover, $\Delta u_t$ is absolutely continuous with respect to $m$ with the density function $\nabla(-E)(u_t)$. Finally, the uniqueness is included in the next lemma. □

The next lemma ensures the uniqueness in the above theorem and shows that the heat semigroup uniquely extends to a (nonlinear) contraction semigroup acting on $L^2(M)$.

Lemma 13.11 (Non-expansion in $L^2$) For any two global solutions $(u_t)_{t \ge 0}$ and $(\bar{u}_t)_{t \ge 0}$ to the heat equation, we have

\[ \frac{d}{dt} \bigl[ \|u_t - \bar{u}_t\|_{L^2} \bigr] \le 0 \]

for almost every $t > 0$. In particular, if $u_0 = \bar{u}_0$ almost everywhere, then we have $u_t = \bar{u}_t$ almost everywhere for all $t > 0$.

Proof Since $(u_t)_{t \ge 0}$ and $(\bar{u}_t)_{t \ge 0}$ are locally Lipschitz in $L^2(M)$ on $(0,\infty)$, it is sufficient to consider the right derivative. We deduce from (13.1) (with the test function $u_t - \bar{u}_t \in H^1_0(M)$) that

\begin{align*}
\frac{d^+}{dt} \bigl[ \|u_t - \bar{u}_t\|_{L^2}^2 \bigr] &= \lim_{\delta \downarrow 0} \frac{1}{\delta} \int_M \bigl\{ (u_{t+\delta} - \bar{u}_{t+\delta})^2 - (u_t - \bar{u}_t)^2 \bigr\} \,dm \\
&= \lim_{\delta \downarrow 0} \int_M (u_{t+\delta} + u_t - \bar{u}_{t+\delta} - \bar{u}_t) \biggl( \frac{u_{t+\delta} - u_t}{\delta} - \frac{\bar{u}_{t+\delta} - \bar{u}_t}{\delta} \biggr) \,dm \\
&= 2 \int_M (u_t - \bar{u}_t)(\partial_t u_t - \partial_t \bar{u}_t) \,dm \\
&= -2 \int_M d(u_t - \bar{u}_t)(\nabla u_t - \nabla \bar{u}_t) \,dm.
\end{align*}

Now, it follows from Lemma 3.8 that

\begin{align*}
d(u_t - \bar{u}_t)(\nabla u_t - \nabla \bar{u}_t) &= (du_t - d\bar{u}_t)\bigl( L^*(du_t) - L^*(d\bar{u}_t) \bigr) \\
&= -\lim_{s \to 0} \frac{F^*(du_t + s(d\bar{u}_t - du_t))^2 - F^*(du_t)^2}{2s} \\
&\quad - \lim_{s \to 0} \frac{F^*(d\bar{u}_t + s(du_t - d\bar{u}_t))^2 - F^*(d\bar{u}_t)^2}{2s}.
\end{align*}

This is nonnegative by the convexity of $(F^*)^2$ in $T_x^*M$ (along the segment between $du_t(x)$ and $d\bar{u}_t(x)$). This completes the proof. □

We remark that, in the case of $\bar{u}_t = 0$, we more straightforwardly obtain

\[ \frac{d^+}{dt} \bigl[ \|u_t\|_{L^2}^2 \bigr] = -2 \int_M du_t(\nabla u_t) \,dm = -4E(u_t) \le 0. \tag{13.12} \]

Exercise 13.12 At the end of the proof of Lemma 13.11, show that we have moreover

\[ \frac{1}{S_F} F^*(d\bar{u}_t - du_t)^2 \le d(u_t - \bar{u}_t)(\nabla u_t - \nabla \bar{u}_t) \le C_F F^*(d\bar{u}_t - du_t)^2 \]

(recall Subsect. 8.3.2 for $S_F$ and $C_F$).

We will use the former inequality to give estimates improving Lemma 13.11 in the next section. By our construction of heat flow as the gradient flow of $E$, one readily obtains the following fundamental property.

Lemma 13.13 Let $(u_t)_{t \ge 0}$ be a global solution to the heat equation. If $u_0 \ge 0$ almost everywhere, then we have $u_t \ge 0$ almost everywhere for all $t > 0$. Similarly, if $c \le u_0 \le C$ almost everywhere for some $c, C \in \mathbb{R}$, then we have $c \le u_t \le C$ almost everywhere for all $t > 0$.

Proof If $u_0 \ge 0$ almost everywhere, then we have $J_\delta(u_0) \ge 0$ almost everywhere since, otherwise, $\max\{J_\delta(u_0), 0\}$ has less energy and is closer to $u_0$. Therefore $u_t \ge 0$ holds almost everywhere for all $t > 0$. The latter assertion $c \le u_t \le C$ is shown in the same way. □

Remark 13.14 (Heat Flow as Gradient Flow of Entropy) An important and suggestive observation, going back to Jordan–Kinderlehrer–Otto's seminal work [127] on Euclidean spaces, is that one can regard heat flow also as gradient flow of the relative entropy functional on the $L^2$-Wasserstein space. A generalization to the Finsler setting was investigated in [206, Sect. 7]. In this case, to be precise, we need to employ the Wasserstein distance with respect to the reverse Finsler structure $\overleftarrow{F}$. We refer to Subsect. 18.5.2 for a further account.
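Lemmas 13.11 and 13.13 can be observed in a discrete toy model. The following sketch is only an illustration under explicit assumptions that are not part of the text: $M$ is replaced by a cycle graph, the dual norm is the asymmetric function $F^*(p) = \alpha p$ for $p \ge 0$ and $F^*(p) = \beta |p|$ for $p < 0$, and the flow is discretized by explicit Euler steps with a small fixed time step. The names `legendre`, `laplacian`, `heat_step` and the constants are hypothetical.

```python
import math

ALPHA, BETA = 1.0, 0.5  # hypothetical constants of the asymmetric dual norm F*

def legendre(p):
    """L*(p) = (1/2) d/dp [F*(p)^2], with F*(p) = ALPHA*p (p >= 0), BETA*|p| (p < 0)."""
    return ALPHA ** 2 * p if p >= 0 else BETA ** 2 * p

def laplacian(u):
    """Discrete nonlinear Laplacian: difference of L*(du) over the two edges at each vertex."""
    n = len(u)
    return [legendre(u[(i + 1) % n] - u[i]) - legendre(u[i] - u[i - 1]) for i in range(n)]

def energy(u):
    """E(u) = (1/2) * sum over edges of F*(du)^2, using F*(p)^2 = p * L*(p)."""
    n = len(u)
    return 0.5 * sum((u[(i + 1) % n] - u[i]) * legendre(u[(i + 1) % n] - u[i])
                     for i in range(n))

def heat_step(u, dt=0.1):
    """One explicit Euler step of du/dt = laplacian(u)."""
    return [ui + dt * li for ui, li in zip(u, laplacian(u))]

def l2_distance(u, v):
    """Euclidean (discrete L^2) distance between two vertex functions."""
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(u, v)))

u = [0.0, 1.0, 0.0, 0.0, 2.0, 0.0]
v = [1.0, 0.0, 0.0, 3.0, 0.0, 0.0]
for step in range(5):
    print(step, round(l2_distance(u, v), 6), round(min(u), 6), round(max(u), 6))
    u, v = heat_step(u), heat_step(v)
```

With this step size the printed distances do not increase and the values of `u` stay between the initial minimum and maximum. Since `legendre` is only positively homogeneous, `laplacian(-u)` differs from `-laplacian(u)` in general, which mirrors the nonlinearity of $\Delta$ and the role of the reverse structure in Exercise 13.4.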

13.3 Large Time Behavior

The asymptotic behavior of heat semigroup for large time ($t \to \infty$) plays an essential role in the $\Gamma$-calculus, as we will see in Chaps. 15, 16. Here we begin with an immediate application of Exercise 13.12. Recall (11.4) for the definition of the ground state energy $\chi_M \ge 0$, for which we have $\chi_M \|u\|_{L^2}^2 \le 2E(u)$ for all $u \in H^1_0(M)$.

Lemma 13.15 (Exponential Contraction in $L^2$) For any two global solutions $(u_t)_{t \ge 0}$ and $(\bar{u}_t)_{t \ge 0}$ to the heat equation, we have

\[ \|u_t - \bar{u}_t\|_{L^2} \le e^{-(\chi_M/S_F)t} \|u_0 - \bar{u}_0\|_{L^2} \]

for all $t \ge 0$, where $\chi_M/S_F$ is read as $0$ if $S_F = \infty$. It also holds that $\|u_t\|_{L^2} \le e^{-\chi_M t} \|u_0\|_{L^2}$ for all $t \ge 0$.

Proof We argue as in the proof of Lemma 13.11, and apply Exercise 13.12 to see

\[ \frac{d^+}{dt} \bigl[ \|u_t - \bar{u}_t\|_{L^2}^2 \bigr] \le -\frac{4}{S_F} E(\bar{u}_t - u_t) \le -\frac{2\chi_M}{S_F} \|u_t - \bar{u}_t\|_{L^2}^2. \]

Then Gronwall's lemma yields $\|u_t - \bar{u}_t\|_{L^2}^2 \le e^{-(2\chi_M/S_F)t} \|u_0 - \bar{u}_0\|_{L^2}^2$, which is the first assertion. The second assertion follows in the same way from (13.12) and

\[ \frac{d^+}{dt} \bigl[ \|u_t\|_{L^2}^2 \bigr] = -4E(u_t) \le -2\chi_M \|u_t\|_{L^2}^2. \]

This completes the proof. □

The above lemma is indeed a standard contraction estimate for the gradient flow of a $(\chi_M/S_F)$-convex function (see Exercise 11.7). Note that, if $\chi_M > 0$, then we have $\lim_{t \to \infty} \|u_t\|_{L^2} = 0$ for any initial point $u_0 \in H^1_0(M)$.

Since $\chi_M$ can be $0$ (e.g., for compact manifolds), we have also introduced $\bar{\chi}_M$ in (11.5) so as to satisfy $\bar{\chi}_M \|u\|_{L^2}^2 \le 2E(u)$ for all $u \in H^1_0(M)$ with $\int_M u \,dm = 0$. This is suitable to analyze the mass conservative situation. If $(M, F, m)$ satisfies $\Lambda_F < \infty$, $m(M) < \infty$, and is complete, then we have $H^1_0(M) = H^1(M)$ (Lemma 11.4(ii)) and the constant functions belong to $H^1_0(M)$ (Lemma 11.5). Thus we have the mass conservation:

\[ \int_M u_t \,dm = \int_M u_0 \,dm \tag{13.13} \]

for any global solution $(u_t)_{t \ge 0}$ to the heat equation and all $t \ge 0$ (by letting $\phi = 1$ in (13.1)). In this case, by the same argument as Lemma 13.15, we have

\[ \|u_t - \bar{u}_t\|_{L^2} \le e^{-(\bar{\chi}_M/S_F)t} \|u_0 - \bar{u}_0\|_{L^2} \]

for any global solutions $(u_t)_{t \ge 0}$ and $(\bar{u}_t)_{t \ge 0}$ such that $\int_M u_0 \,dm = \int_M \bar{u}_0 \,dm$.
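The exponential contraction and the mass conservation (13.13) can be illustrated in a linear toy model. This is a sketch under assumptions not made in the text: $M$ is replaced by a cycle graph with $n$ vertices and counting measure, and the energy is $E(u) = \frac{1}{2}\sum_i (u_{i+1} - u_i)^2$. In this model $S_F = 1$, and the constant playing the role of $\bar{\chi}_M$ is the smallest nonzero eigenvalue $2 - 2\cos(2\pi/n)$ of the graph Laplacian. The flow is discretized by explicit Euler steps. All names are hypothetical.

```python
import math

def heat_step(u, dt):
    """One explicit Euler step of du/dt = -L u, with L the graph Laplacian of a cycle."""
    n = len(u)
    return [u[i] + dt * (u[(i + 1) % n] - 2.0 * u[i] + u[i - 1]) for i in range(n)]

def l2_norm(u):
    """Euclidean (discrete L^2) norm of a vertex function."""
    return math.sqrt(sum(x * x for x in u))

n, dt = 8, 0.05
chi_bar = 2.0 - 2.0 * math.cos(2.0 * math.pi / n)  # smallest nonzero eigenvalue of L

u0 = [3.0, -1.0, 4.0, -1.0, 5.0, -9.0, 2.0, -3.0]  # entries sum to 0, so the mean is 0
u = list(u0)
for k in range(1, 201):
    u = heat_step(u, dt)
    bound = math.exp(-chi_bar * k * dt) * l2_norm(u0)
    assert l2_norm(u) <= bound + 1e-12   # discrete analogue of the decay estimate
    assert abs(sum(u)) < 1e-9            # discrete analogue of (13.13)
print(l2_norm(u), math.exp(-chi_bar * 200 * dt) * l2_norm(u0))
```

For this step size every nonconstant eigenmode is multiplied by a factor of absolute value at most $1 - \bar{\chi}\,dt \le e^{-\bar{\chi}\,dt}$ in each step, which is why the discrete norms stay below the exponential bound.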

Proposition 13.16 (Large Time Behavior of Heat Flow) Assume $\Lambda_F < \infty$ and let $(u_t)_{t \ge 0}$ be a global solution to the heat equation.

(i) We have $\lim_{t \to \infty} |\nabla(-E)|(u_t) = 0$ and $\lim_{t \to \infty} E(u_t) = 0$.

(ii) Let $(M, F, m)$ be complete and satisfy $m(M) < \infty$ and $\bar{\chi}_M > 0$. Then $u_t$ converges to the constant function $\bar{u} = m(M)^{-1} \int_M u_0 \,dm$ in $L^2(M)$.

(iii) If $M$ is compact, then all the conditions in (ii) are fulfilled and we have $u_t \to \bar{u}$ in $L^2(M)$.

Proof (i) Since the energy functional $E$ is convex, $E(u_t)$ is convex in $t$ (see [171, Theorem 2.36] and Exercise 13.17 below). Together with the nonnegativity $E \ge 0$, this implies that $\lim_{t \to \infty} |\nabla(-E)|(u_t) = 0$. Moreover, since

\[ |\nabla(-E)|(u_t) \ge \frac{E(u_t)}{\|u_t\|_{L^2}} \]

by (13.4) and $\|u_t\|_{L^2}$ is bounded above by Lemma 13.11, we find that $\lim_{t \to \infty} E(u_t) = 0$.

(ii) Since $(u_t - \bar{u})_{t \ge 0}$ is again a global solution thanks to $m(M) < \infty$, we can assume $\int_M u_0 \,dm = 0$ without loss of generality. Then $\int_M u_t \,dm = 0$ holds for all $t > 0$ by the mass conservation (13.13). Hence, we have

\[ \frac{d^+}{dt} \bigl[ \|u_t\|_{L^2}^2 \bigr] = -4E(u_t) \le -2\bar{\chi}_M \|u_t\|_{L^2}^2, \]

which implies $\|u_t\|_{L^2}^2 \le e^{-2\bar{\chi}_M t} \|u_0\|_{L^2}^2$ similarly to Lemma 13.15. Therefore we obtain $\lim_{t \to \infty} \|u_t\|_{L^2} = 0$.

(iii) Take a sequence $(\hat{u}_i)_{i \in \mathbb{N}}$ in $H^1_0(M)$ such that $\|\hat{u}_i\|_{L^2} = 1$, $\int_M \hat{u}_i \,dm = 0$ and $\lim_{i \to \infty} E(\hat{u}_i) = \bar{\chi}_M/2$. Now, since $M$ is compact, we can apply the Rellich–Kondrachov theorem asserting that the embedding $H^1(M) \hookrightarrow L^2(M)$ is compact (see, e.g., [115, Corollary 3.7] or [116, Theorem 2.9]). Thus there exists a subsequence of $(\hat{u}_i)_{i \in \mathbb{N}}$ which converges to some $u \in L^2(M)$ in the $L^2$-norm, so that $\|u\|_{L^2} = 1$ and $\int_M u \,dm = 0$. Then, if $\bar{\chi}_M = 0$, it follows from the lower semi-continuity of $E$ (Lemma 11.6) that $E(u) = 0$. Hence $u$ is constant, and $\int_M u \,dm = 0$ forces $u = 0$, which contradicts $\|u\|_{L^2} = 1$. Therefore we obtain $\bar{\chi}_M > 0$. □

Exercise 13.17 Let $(u_t)_{t \ge 0}$ be a global solution to the heat equation.

(1) Prove that

\[ |\nabla(-E)|(u_t) = \sup_{s > 0} \frac{\|u_{t+s} - u_t\|_{L^2}}{s} \]

for all $t > 0$ (one may apply the non-expansion property in Lemma 13.11).

(2) Prove that

\[ \lim_{\delta \downarrow 0} \frac{E(u_t) - E(u_{t+\delta})}{\delta} = \sup_{s > 0} \frac{E(u_t) - E(u_{t+s})}{s} \]

for all $t > 0$, and use this assertion to show that $t \longmapsto E(u_t)$ is a convex function.

We refer to [21, Subsect. 3.1.9] for the case of linear semigroups, where the spectral decomposition plays an essential role.

13.4 Regularity

Let $(u_t)_{t \ge 0}$ be a global solution to the heat equation $\partial_t u = \Delta u$. Notice that the standard elliptic regularity yields that $u$ is $C^\infty$ on

\[ \bigcup_{t > 0} (\{t\} \times M_{u_t}) = \bigl\{ (t,x) \in (0,\infty) \times M \,\big|\, du_t(x) \neq 0 \bigr\}. \]

Outside the essential domain $M_{u_t}$, one cannot obtain the $C^2$-regularity as we mentioned after Exercise 13.2. Nonetheless, our Laplacian $\Delta$ is locally uniformly elliptic since, in each local coordinate system on a relatively compact set $U \subset M$, we have

\[ \frac{1}{\lambda} \sum_{i=1}^n (w^i)^2 \le \sum_{i,j=1}^n g_{ij}(v) w^i w^j \le \lambda \sum_{i=1}^n (w^i)^2 \tag{13.14} \]

for some $\lambda \ge 1$ and for all $v, w \in T_xU$ with $v \neq 0$ (by the smoothness and the strong convexity of $F$). Therefore we can apply the classical theory going back to Aronson, Moser, Nash, Serrin, etc. on Euclidean spaces as well as Saloff-Coste's seminal work [221] on Riemannian manifolds. The next theorem summarizes the regularity properties established along this way in [104, 206].

Theorem 13.18 (Regularity of Heat Flow) Assume $\Lambda_F < \infty$ and let $u = (u_t)_{t \ge 0}$ be a global solution to the heat equation $\partial_t u = \Delta u$. Then one can take the continuous version of $u$, and it enjoys the $H^2_{\mathrm{loc}}$-regularity in $x$ as well as the $C^{1,\alpha}$-regularity in both $t$ and $x$ on $(0,\infty) \times M$. Moreover, $\partial_t u_t$ lies in $H^1_{\mathrm{loc}}(M) \cap C(M)$, and further in $H^1_0(M)$ if $S_F < \infty$.

Techniques used in the proof of this theorem are standard from the analytic viewpoint, and we will give only an outline along [206]. We refer to [104, 206, 221] for more details, and to [32, 88, 229] for some related works.

Proof The main ingredient of the proof is the regularity property for locally uniformly elliptic operators. For our purpose, [221, Sects. 4, 5] will be sufficient.

Step 1 First we show that $u$ is Hölder continuous. Recall from Lemma 11.22 that $\Delta u_t = \Delta^{\nabla u_t} u_t$, thereby $u$ is a solution to the equation $\partial_t w = \Delta^{\nabla u_t} w$ (or $\partial_t w = \Delta^{V_t} w$ for a measurable one-parameter family of nowhere vanishing vector fields $(V_t)_{t \ge 0}$ such that $V_t = \nabla u_t$ on $M_{u_t}$). The time-dependent operator $\Delta^{\nabla u_t}$ (or $\Delta^{V_t}$) is locally uniformly elliptic thanks to (13.14). Therefore $u$ is Hölder continuous on $(0,\infty) \times M$ by [221, Corollary 5.5].

Step 2 Next we show that $\partial_t u_t \in H^1_0(M)$ assuming $S_F < \infty$. A modified argument using a cut-off function yields $\partial_t u_t \in H^1_{\mathrm{loc}}(M)$ in general. For $t > 0$ and $\delta > 0$, consider the function

\[ w_t^\delta := \frac{u_{t+\delta} - u_t}{\delta}, \]

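which converges to $\partial_t u_t$ in $L^2(M)$ as $\delta \to 0$ by Lemma 13.8. As in the proof of Lemma 13.11 (with the help of Exercise 13.12), observe that

\begin{align*}
\frac{d}{dt} \bigl[ \|w_t^\delta\|_{L^2}^2 \bigr] &= -\frac{2}{\delta^2} \int_M d(u_{t+\delta} - u_t)(\nabla u_{t+\delta} - \nabla u_t) \,dm \\
&\le -\frac{2}{S_F \delta^2} \int_M \max\bigl\{ F^*(du_{t+\delta} - du_t)^2, F^*(du_t - du_{t+\delta})^2 \bigr\} \,dm \\
&\le -\frac{2}{S_F} \bigl\{ E(w_t^\delta) + E(-w_t^\delta) \bigr\}
\end{align*}

for almost all $t > 0$. Hence, we have, given $0 < \tau < T$,

\begin{align*}
\frac{2\tau}{S_F} \int_\tau^T \bigl\{ E(w_t^\delta) + E(-w_t^\delta) \bigr\} \,dt &\le \frac{2}{S_F} \int_\tau^T t \bigl\{ E(w_t^\delta) + E(-w_t^\delta) \bigr\} \,dt \\
&\le \frac{2}{S_F} \int_0^T \int_0^t \bigl\{ E(w_t^\delta) + E(-w_t^\delta) \bigr\} \,ds\,dt \\
&\le \int_0^T \bigl( \|w_s^\delta\|_{L^2}^2 - \|w_T^\delta\|_{L^2}^2 \bigr) \,ds \\
&\le \int_0^T \|w_s^\delta\|_{L^2}^2 \,ds.
\end{align*}

Moreover, we deduce from Hölder's inequality and (13.10) that

\begin{align*}
\int_0^T \|w_s^\delta\|_{L^2}^2 \,ds &= \int_0^T \int_M \biggl( \frac{1}{\delta} \int_s^{s+\delta} \partial_t u \,dt \biggr)^2 dm\,ds \\
&\le \frac{1}{\delta} \int_0^T \int_M \int_s^{s+\delta} (\partial_t u)^2 \,dt\,dm\,ds \\
&\le \int_0^{T+\delta} \int_M (\partial_t u)^2 \,dm\,dt \\
&\le E(u_0).
\end{align*}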

Therefore $(w^\delta)_{\delta > 0}$ is bounded in $L^2([\tau,T], H^1_0(M))$ with respect to the norm

\[ \biggl( \int_\tau^T \bigl\{ \|w_t\|_{L^2}^2 + E(w_t) + E(-w_t) \bigr\} \,dt \biggr)^{1/2}, \]

and hence one can extract a weakly convergent subsequence (as $\delta \to 0$). The limit coincides with $\partial_t u$ by construction and belongs to $L^2([\tau,T], H^1_0(M))$. Finally, since $\tau > 0$ and $T > \tau$ were arbitrary, we obtain $\partial_t u_t \in H^1_0(M)$ for almost all $t > 0$.

Step 3 The Hölder continuity of $\partial_t u$ can be seen in a similar way to Step 1. For an arbitrary test function $\phi \in C^\infty_c((0,T) \times M)$, we have

\begin{align*}
\int_0^T \int_M \partial_t \phi \cdot \partial_t u \,dm\,dt &= -\int_0^T \int_M d(\partial_t \phi)(\nabla u) \,dm\,dt \\
&= \int_0^T \int_M d\phi\bigl( \nabla^{\nabla u_t}(\partial_t u) \bigr) \,dm\,dt,
\end{align*}

where we used Euler's theorem (Theorem 2.8) in the latter equality to see

\[ \sum_{j=1}^n \frac{\partial[g^*_{ij}(du)]}{\partial t} \frac{\partial u}{\partial x^j} = \sum_{j,k=1}^n \frac{\partial g^*_{ij}}{\partial \alpha_k}(du) \frac{\partial^2 u}{\partial t \partial x^k} \frac{\partial u}{\partial x^j} = 0 \]

(we remark that $\nabla^{\nabla u_t}(\partial_t u)$ and $\partial^2 u/(\partial t \partial x^k)$ make sense thanks to Step 2). Hence, $\partial_t u$ is a weak solution to the equation $\partial_t w = \Delta^{\nabla u_t} w$ (or $\partial_t w = \Delta^{V_t} w$) and is Hölder continuous similarly to Step 1.

Step 4 The proof of the $H^2_{\mathrm{loc}}$-regularity follows the same lines as Step 2, whereas we need more involved calculations. We fix a local coordinate system $(x^i)_{i=1}^n$ on $U \subset M$, define $\Phi$ by $dm = e^\Phi \,dx^1 \cdots dx^n$, and introduce the notations

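\[ D_k^\delta[u](x) := \frac{u(x + \delta e_k) - u(x)}{\delta}, \qquad D_k^\delta[\nabla u](x) := \frac{\nabla u(x + \delta e_k) - \nabla u(x)}{\delta}, \]

where $e_k = (\delta_{1k}, \delta_{2k}, \ldots, \delta_{nk}) \in \mathbb{R}^n$ is the $k$th element in the standard basis. Then, for $\phi \in C^\infty_c((0,T) \times U)$, we have on one hand

\begin{align*}
\int_M D_k^{-\delta}[\phi] \,\partial_t u \,dm &= -\int_M \phi D_k^\delta[e^\Phi \partial_t u] \,dx \\
&= -\int_M \phi(x) \,\partial_t u(x^\delta) D_k^\delta[e^\Phi](x) \,dx - \int_M \phi \,\partial_t(D_k^\delta[u]) \,dm,
\end{align*}

where we set $x^\delta := x + \delta e_k$. On the other hand, we deduce from the heat equation $\partial_t u = \Delta u$ that

\begin{align*}
\int_M D_k^{-\delta}[\phi] \,\partial_t u \,dm &= -\int_M d(D_k^{-\delta}[\phi])(\nabla u) \,dm \\
&= \int_M d\phi(D_k^\delta[e^\Phi \nabla u]) \,dx \\
&= \int_M d\phi_x\bigl( D_k^\delta[\nabla u](x) \bigr) e^{\Phi(x^\delta)} \,dx + \int_M d\phi(\nabla u) D_k^\delta[e^\Phi] \,dx.
\end{align*}

Comparing these yields

\begin{align*}
-\int_M \phi \,\partial_t(D_k^\delta[u]) \,dm &= \int_M d\phi_x\bigl( D_k^\delta[\nabla u](x) \bigr) e^{\Phi(x^\delta)} \,dx + \int_M d\phi(\nabla u) D_k^\delta[e^\Phi] \,dx \\
&\quad + \int_M \phi(x) \,\partial_t u(x^\delta) D_k^\delta[e^\Phi](x) \,dx. \tag{13.15}
\end{align*}

Choosing $\phi = D_k^\delta[u]$ or its cut-off in $U$, we see that the integral

\[ \int_0^T E_U(D_k^\delta[u]) \,dt \]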

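(arising essentially from the first term in the right-hand side of (13.15)) is bounded above uniformly in $\delta$ (as $\delta \to 0$). Therefore the limit $\partial u/\partial x^k$ of $D_k^\delta[u]$ as $\delta \to 0$ satisfies $\int_0^T E_U(\partial u/\partial x^k) \,dt < \infty$, and we have $\partial u_t/\partial x^k \in H^1_{\mathrm{loc}}(M)$ for almost all $t > 0$.

Step 5 In order to consider the Hölder continuity of the spatial derivatives, let $(x^i)_{i=1}^n$ and $\Phi$ on $U$ be as in the previous step. For $\phi \in C^\infty_c((0,T) \times U)$, observe on one hand

\begin{align*}
\int_0^T \int_M \frac{\partial^2 \phi}{\partial t \partial x^k} u \,dm\,dt &= -\int_0^T \int_M \partial_t \phi \biggl( \frac{\partial u}{\partial x^k} + u \frac{\partial \Phi}{\partial x^k} \biggr) \,dm\,dt \\
&= -\int_0^T \int_M \biggl( \partial_t \phi \frac{\partial u}{\partial x^k} - \phi \frac{\partial \Phi}{\partial x^k} \partial_t u \biggr) \,dm\,dt.
\end{align*}

On the other hand, by the heat equation $\partial_t u = \Delta u$, we have

\begin{align*}
\int_0^T \int_M \frac{\partial^2 \phi}{\partial t \partial x^k} u \,dm\,dt &= \int_0^T \int_M d\biggl( \frac{\partial \phi}{\partial x^k} \biggr)(\nabla u) \,dm\,dt \\
&= -\int_0^T \int_M d\phi\biggl( \nabla^{\nabla u_t} \frac{\partial u}{\partial x^k} + \sum_{i,j=1}^n \frac{\partial[g^*_{ij}(du)]}{\partial x^k} \frac{\partial u}{\partial x^j} \frac{\partial}{\partial x^i} + \frac{\partial \Phi}{\partial x^k} \nabla u \biggr) \,dm\,dt \\
&= -\int_0^T \int_M d\phi\biggl( \nabla^{\nabla u_t} \frac{\partial u}{\partial x^k} + \sum_{i,j=1}^n \frac{\partial g^*_{ij}}{\partial x^k}(du) \frac{\partial u}{\partial x^j} \frac{\partial}{\partial x^i} + \frac{\partial \Phi}{\partial x^k} \nabla u \biggr) \,dm\,dt
\end{align*}

(thanks to Theorem 2.8 and Step 4, similarly to Step 3). Comparing these, we find that $\partial u/\partial x^k$ is a weak solution to the equation

\[ \partial_t w = \Delta^{\nabla u_t} w + \operatorname{div}\biggl( \sum_{i,j=1}^n \frac{\partial g^*_{ij}}{\partial x^k}(du) \frac{\partial u}{\partial x^j} \frac{\partial}{\partial x^i} + \frac{\partial \Phi}{\partial x^k} \nabla u \biggr) - \frac{\partial \Phi}{\partial x^k} \partial_t u. \]

Since the vector field inside the divergence, denoted by $W$, is in $L^2_{\mathrm{loc}}((0,T) \times U)$, we deduce that $\partial u/\partial x^k \in L^p_{\mathrm{loc}}((0,T) \times U)$ for some $p > 2$. It follows that $W \in L^p_{\mathrm{loc}}((0,T) \times U)$ and then, iterating this self-improvement, we obtain $\partial u/\partial x^k \in L^q_{\mathrm{loc}}((0,T) \times U)$ for $q > n + 2$ and eventually the Hölder continuity of $\partial u/\partial x^k$. Therefore, together with Step 3, we conclude that $u$ is $C^{1,\alpha}$ on $(0,\infty) \times M$. □

We remark that Steps 1–5 correspond to Proposition 4.4, Appendices A and B, Proposition 4.5, Appendix C (Theorem 4.6), and Lemma 4.7–Theorem 4.9 in [206], respectively.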

13.5 Linearized Heat Semigroups and Their Adjoints

We saw in the Bochner inequality (Chap. 12) that it is meaningful to consider a linearization of the Laplacian with respect to the gradient vector field of a function. In the same spirit, we introduce a linearization of the heat semigroup by using a global solution. This concept will play a central role in the applications of the Bochner inequality in the following chapters.

Let $(u_t)_{t \ge 0}$ be a global solution to the heat equation $\partial_t u = \Delta u$. We fix a measurable one-parameter family of nowhere vanishing vector fields $(V_t)_{t \ge 0}$ such that $V_t = \nabla u_t$ on $M_{u_t}$ for each $t \ge 0$.

Definition 13.19 (Linearized Heat Semigroups) Given $f \in H^1_0(M)$ and $s \ge 0$, we define $(P^{\nabla u}_{s,t}(f))_{t \ge s}$ as the weak solution to the linearized heat equation:

\[ \partial_t \bigl[ P^{\nabla u}_{s,t}(f) \bigr] = \Delta^{V_t} \bigl[ P^{\nabla u}_{s,t}(f) \bigr], \qquad P^{\nabla u}_{s,s}(f) = f. \tag{13.16} \]

(By an abuse of notation, we will suppress the dependence on the choice of $(V_t)_{t \ge 0}$ in $P^{\nabla u}_{s,t}$; see also Remark 13.22 below.)

Precisely, as in Definition 13.1, for any $T > 0$, $(P^{\nabla u}_{s,t}(f))_{t \in (s,s+T)}$ will belong to $L^2((s,s+T), H^1_0(M)) \cap H^1((s,s+T), H^{-1}(M))$ and satisfy

\[ \int_M \phi \cdot \partial_t \bigl[ P^{\nabla u}_{s,t}(f) \bigr] \,dm = -\int_M d\phi\bigl( \nabla^{V_t} \bigl[ P^{\nabla u}_{s,t}(f) \bigr] \bigr) \,dm \]

for any $\phi \in C^\infty_c(M)$ and almost all $t \in (s,s+T)$. Recall that then $(P^{\nabla u}_{s,t}(f))_{t \in (s,s+T)}$ also lies in $C([s,s+T], L^2(M))$ and the initial condition $P^{\nabla u}_{s,s}(f) = f$ makes sense.

The linearity of $\Delta^{V_t}$ enables us to apply classical results on linear operators. We summarize the existence and regularity properties of linearized heat semigroups in the next proposition.

Proposition 13.20 (Properties of Linearized Heat Semigroups) Let $(M, F, m)$ be complete and satisfy $C_F < \infty$ and $S_F < \infty$, and take $(u_t)_{t \ge 0}$ and $(V_t)_{t \ge 0}$ as above.

(i) For each $s \ge 0$, $T > 0$ and $f \in H^1_0(M)$, there exists a unique solution $f_t = P^{\nabla u}_{s,t}(f)$, $t \in [s,s+T]$, to the linearized heat equation (13.16).

(ii) The solution $(f_t)_{t \in [s,s+T]}$ in (i) is Hölder continuous on $(s,s+T] \times M$.

(iii) Assume that either $m(M) < \infty$ or $\mathrm{Ric}_\infty \ge K$ for some $K \in \mathbb{R}$ holds. If $c \le f \le C$ almost everywhere for some $-\infty < c < C < \infty$, then we have $c \le f_t \le C$ almost everywhere for all $t \in (s,s+T]$.

Proof (i) Let $s = 0$ without loss of generality. This unique existence follows from Theorem 4.1 and Remark 4.3 in [158, Chap. III] (see also [219, Theorem 11.3]). Precisely, in the notations in [158], we take $H = L^2(M)$ and $V = H^1_0(M)$, and put $A_t := -\Delta^{V_t} : H^1_0(M) \longrightarrow H^{-1}(M)$ (recall that the norm $\|\cdot\|_{H^1}$ of $H^1_0(M)$ is comparable with the Sobolev norm (inner product) with respect to the Riemannian metric $g$ constructed in the proof of Lemma 11.4). We deduce from Lemma 8.16 that, for any $h, \bar{h} \in H^1_0(M)$,

\begin{align*}
\biggl| \int_M \bar{h} \Delta^{V_t} h \,dm \biggr| &= \biggl| \int_M g^*_{L(V_t)}(dh, d\bar{h}) \,dm \biggr| \le 2 \sqrt{E^{V_t}(h)} \sqrt{E^{V_t}(\bar{h})} \\
&\le 2C_F \sqrt{E(h)} \sqrt{E(\bar{h})} \le 2C_F \|h\|_{H^1} \|\bar{h}\|_{H^1},
\end{align*}

where $E^{V_t}(h) := (1/2) \int_M g^*_{L(V_t)}(dh, dh) \,dm$ denotes the energy functional on $(M, g_{V_t}, m)$ (recall (8.9) for the definition of $g^*_{L(V_t)}$). This implies that $A_t = -\Delta^{V_t}$ is a bounded linear map. Moreover, it follows again from Lemma 8.16 that

\[ -\int_M h \Delta^{V_t} h \,dm = 2E^{V_t}(h) \ge \frac{2}{S_F} E(h) = \frac{1}{S_F} \bigl\{ \|h\|_{L^2}^2 + 2E(h) - \|h\|_{L^2}^2 \bigr\}. \]

Since $\Lambda_F < \infty$ by Lemma 8.18, $\sqrt{\|h\|_{L^2}^2 + 2E(h)}$ is comparable with $\|h\|_{H^1}$. Therefore we have a unique solution $(f_t)_{t \in [0,T]}$ to (13.16) with $f_0 = f$, which lies in $L^2([0,T], H^1_0(M)) \cap H^1([0,T], H^{-1}(M))$ and also in $C([0,T], L^2(M))$.

(ii) The Hölder continuity is a consequence of the local uniform ellipticity of $\Delta^{V_t}$, in the same way as Step 1 of the proof of Theorem 13.18.

(iii) This is shown, e.g., by using the fundamental solution $q(t,x;s,y)$ to (13.16) (see [221, Sect. 6]). In fact, we have

\[ f_t(x) = \int_M q(t,x;s,y) f(y) \,m(dy), \]

where $q(t,x;s,y) \ge 0$ and $\int_M q(t,x;s,y) \,m(dy) = 1$ for all $t > s$ and $x \in M$ (by $1 \in H^1_0(M)$ if $m(M) < \infty$, or by [221, Sect. 7] if $\mathrm{Ric}_\infty \ge K$ since it implies the squared exponential volume bound as in [234, Theorem 4.24]). From this expression of $f_t$, one readily obtains the claim. □

The uniqueness in (i) above ensures that $u_t = P^{\nabla u}_{s,t}(u_s)$. Moreover, it follows from the non-expansion property

\[ \frac{d}{dt} \bigl[ \|f_t\|_{L^2}^2 \bigr] = -4E^{V_t}(f_t) \le 0 \tag{13.17} \]

(similar to (13.12)) that $(P^{\nabla u}_{s,t})_{t \ge s}$ uniquely extends to a linear contraction semigroup acting on $L^2(M)$. Note also that 2 ({t} × Mut ) f ∈ C∞ s 0. In particular, we have $\partial[F^2(\nabla u_t)]/\partial t \in L^1(M)$.

Proof Observe that both sides are $0$ almost everywhere on $M \setminus M_{u_t}$ (with the help of $\partial_t u_t \in H^1_{\mathrm{loc}}(M)$ in Theorem 13.18). On $M_{u_t}$ (where $u_t$ is $C^\infty$), we deduce from Theorem 2.8 that

\begin{align*}
\frac{\partial}{\partial t} \bigl[ F^2(\nabla u_t) \bigr] &= \frac{\partial}{\partial t} \biggl[ \sum_{i,j=1}^n g^*_{ij}(du_t) \frac{\partial u_t}{\partial x^i} \frac{\partial u_t}{\partial x^j} \biggr] \\
&= 2 \sum_{i,j=1}^n g^*_{ij}(du_t) \frac{\partial u_t}{\partial x^i} \frac{\partial^2 u_t}{\partial t \partial x^j} + \sum_{i,j,k=1}^n \frac{\partial g^*_{ij}}{\partial \alpha_k}(du_t) \frac{\partial^2 u_t}{\partial t \partial x^k} \frac{\partial u_t}{\partial x^i} \frac{\partial u_t}{\partial x^j} \\
&= 2 \,d(\partial_t u_t)(\nabla u_t).
\end{align*}

Then the heat equation shows the claim. □

The $L^2$-gradient estimate is shown in a standard way, providing a good example of a typical argument in the $\Gamma$-calculus based on the integration by parts and the Bochner inequality. Notice that, when $K > 0$, the estimate (14.1) (or, more directly, Corollary 14.3) implies that $u_t$ approaches a constant function exponentially fast as $t \to \infty$.

Theorem 14.2 ($L^2$-Gradient Estimate) Assume that $(M, F, m)$ is compact and satisfies $\mathrm{Ric}_\infty \ge K$ for some $K \in \mathbb{R}$. Then, for any global solution $(u_t)_{t \ge 0}$ to the heat equation $\partial_t u = \Delta u$ with $u_0 \in H^2(M) \cap C^1(M)$, we have

\[ F^2\bigl( \nabla u_t(x) \bigr) \le e^{-2K(t-s)} P^{\nabla u}_{s,t}\bigl( F^2(\nabla u_s) \bigr)(x) \tag{14.1} \]

for any x ∈ M and 0 ≤ s < t < ∞. We emphasize that the nonlinear semigroup (us → ut ) appears in the left-hand ∇u (defined in Sect. 13.5) is employed in the side, while the linearized semigroup Ps,t right-hand side. Proof First of all, recall that u0 ∈ H 2 (M) ∩ C1 (M) ensures F 2 (∇u0 ) ∈ H 1 (M) (Exercise 12.17). Note also that both sides of (14.1) are Hölder continuous by Theorem 13.18 and Proposition 13.20. For fixed t > 0 and an arbitrary nonnegative function h ∈ C∞ (M), consider the function  H (s) := e

2Ks

F 2 (∇us ) ∇u s,t dm, (h) · P 2 M

0 ≤ s ≤ t.

∇u that s,t Then we deduce from the definition (13.18) of the adjoint heat semigroup P

195

F 2 (∇us )  ∇us  ∇u  ∇ Ps,t (h) dm H (s) = e d 2 M

 ∂ F 2 (∇us ) ∇u s,t dm + 2KH (s). + e2Ks (h) · P ∂s 2 M 





2Ks

(14.2)

By (11.17), the first term in the right-hand side coincides with  e2Ks

  ∇u  ∇u F 2 (∇us ) s,t (h) ∇ s dm. d P 2 M

The sum of the other terms coincides with, by Lemma 14.1,  e2Ks M

  ∇u s,t (h) d(us )(∇us ) + KF 2 (∇us ) dm. P

Then the integrated Bochner inequality (12.13) for N = ∞ with the test function ∇u (h) implies H  (s) ≤ 0. (We remark that P ∇u (h) lies in H 1 (M) ∩ L∞ (M) s,t s,t φ=P by Proposition 13.20 and hence can be a test function.) Therefore, H is nonincreasing, and the resulting inequality H (t) ≤ H (s) yields  h·

e2Kt M

F 2 (∇ut ) dm ≤ H (s) = e2Ks 2

 M

∇u h · Ps,t



F 2 (∇us ) dm, 2

where we used (13.19). Since this inequality holds for every nonnegative function $h \in C^\infty(M)$, we obtain (14.1) for almost every $x \in M$. Finally, by the Hölder continuity (in $x$) of both sides of (14.1), we conclude that (14.1) holds for all $x \in M$. □

In (14.2), the linearized gradient $\nabla^{\nabla u_s}$ is not defined outside the essential domain $M_{u_s}$, but we have $d(F^2(\nabla u_s)) = 0$ almost everywhere on $M \setminus M_{u_s}$ by virtue of Lemma 12.12. Hence, we did not replace $\nabla^{\nabla u_s}$ with $\nabla^{V_s}$. The same remark applies to the calculations in the sequel without further mention.

Combining (14.1) with Proposition 13.20(iii), we obtain an estimate of the decay of the Lipschitz constant along heat flow.

Corollary 14.3 (Decay of Lipschitz Constant) Assume that $(M, F, m)$ is compact and satisfies $\mathrm{Ric}_\infty \ge K$ for some $K \in \mathbb{R}$. Then, for any global solution $(u_t)_{t \ge 0}$ to the heat equation with $u_0 \in H^2(M) \cap C^1(M)$, we have
\[ \| F(\nabla u_t) \|_{L^\infty} \le e^{-Kt} \| F(\nabla u_0) \|_{L^\infty} \qquad \text{for all } t \ge 0. \]
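The passage from (14.1) to Corollary 14.3 can be spelled out in one line. The following is only a sketch: it assumes that $P^{\nabla u}_{0,t}$ preserves the order of functions and fixes constants, which is how we read the reference to Proposition 13.20(iii).

```latex
\[
  F^2(\nabla u_t)(x)
  \le e^{-2Kt}\, P^{\nabla u}_{0,t}\bigl( F^2(\nabla u_0) \bigr)(x)
  \le e^{-2Kt}\, \| F(\nabla u_0) \|_{L^\infty}^2 \, P^{\nabla u}_{0,t}(1)(x)
  = e^{-2Kt}\, \| F(\nabla u_0) \|_{L^\infty}^2 .
\]
```

Taking the supremum over $x \in M$ and then square roots gives the claimed decay rate $e^{-Kt}$.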


14.2 $L^1$-Gradient Estimate

By slightly modifying the above derivation of the $L^2$-gradient estimate, one can obtain the $L^1$-gradient estimate. Precisely, we replace the Bochner inequality in the proof with the improved one as in Proposition 12.15 and Corollary 12.16. We remark that the $L^1$-estimate is a priori stronger than the $L^2$-estimate (by Jensen's inequality; see the proof of Theorem 14.6).

Theorem 14.4 ($L^1$-Gradient Estimate) Assume that $(M, F, m)$ is compact and satisfies $\mathrm{Ric}_\infty \ge K$ for some $K \in \mathbb{R}$. Then, for any global solution $(u_t)_{t \ge 0}$ to the heat equation $\partial_t u = \Delta u$ with $u_0 \in H^2(M)$, we have
\[ F\bigl(\nabla u_t(x)\bigr) \le e^{-K(t-s)} P^{\nabla u}_{s,t}\bigl(F(\nabla u_s)\bigr)(x) \tag{14.3} \]

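for any $x \in M$ and $0 \le s < t < \infty$.

Proof Similarly to Theorem 14.2, observe that $F(\nabla u_0) \in H^1(M)$ and that both sides of (14.3) are Hölder continuous. Fix arbitrary $\varepsilon > 0$ and consider the positive function
\[ \xi_\tau := \sqrt{e^{-2K\tau} F^2(\nabla u_{t-\tau}) + \varepsilon}, \qquad 0 < \tau < t - s. \]
We deduce from Lemma 14.1 that
\[ \frac{\partial}{\partial \tau}\bigl[F^2(\nabla u_{t-\tau})\bigr] = -2\, d(\Delta u_{t-\tau})(\nabla u_{t-\tau}). \]
Thus, we find, on one hand,
\[ \partial_\tau \xi_\tau = -\frac{e^{-2K\tau}}{\xi_\tau} \bigl\{ K F^2(\nabla u_{t-\tau}) + d(\Delta u_{t-\tau})(\nabla u_{t-\tau}) \bigr\}. \]
On the other hand, for any nonnegative test function $\phi \in C^\infty(M)$, we have
\begin{align*}
\int_M d\phi\bigl(\nabla^{\nabla u_{t-\tau}} \xi_\tau\bigr) \, dm
&= \int_M \frac{e^{-2K\tau}}{\xi_\tau} \, d\phi\biggl(\nabla^{\nabla u_{t-\tau}}\biggl[\frac{F^2(\nabla u_{t-\tau})}{2}\biggr]\biggr) \, dm \\
&= \int_M \biggl\{ d\biggl[\phi \frac{e^{-2K\tau}}{\xi_\tau}\biggr] + \phi \frac{e^{-2K\tau}}{\xi_\tau^2} \, d\xi_\tau \biggr\}\biggl(\nabla^{\nabla u_{t-\tau}}\biggl[\frac{F^2(\nabla u_{t-\tau})}{2}\biggr]\biggr) \, dm \\
&= \int_M d\biggl[\phi \frac{e^{-2K\tau}}{\xi_\tau}\biggr]\biggl(\nabla^{\nabla u_{t-\tau}}\biggl[\frac{F^2(\nabla u_{t-\tau})}{2}\biggr]\biggr) \, dm \\
&\quad + \int_M \phi \frac{e^{-4K\tau}}{\xi_\tau^3} \, d\biggl[\frac{F^2(\nabla u_{t-\tau})}{2}\biggr]\biggl(\nabla^{\nabla u_{t-\tau}}\biggl[\frac{F^2(\nabla u_{t-\tau})}{2}\biggr]\biggr) \, dm \\
&\le \int_M d\biggl[\phi \frac{e^{-2K\tau}}{\xi_\tau}\biggr]\biggl(\nabla^{\nabla u_{t-\tau}}\biggl[\frac{F^2(\nabla u_{t-\tau})}{2}\biggr]\biggr) \, dm \\
&\quad + \int_M \phi \frac{e^{-2K\tau}}{\xi_\tau} \, d\bigl[F(\nabla u_{t-\tau})\bigr]\bigl(\nabla^{\nabla u_{t-\tau}}[F(\nabla u_{t-\tau})]\bigr) \, dm,
\end{align*}
where we used $F^2(\nabla u_{t-\tau}) \le e^{2K\tau} \xi_\tau^2$ in the last inequality. Therefore, the improved Bochner inequality (Corollary 12.16) with the test function $\phi e^{-2K\tau}/\xi_\tau$ shows that
\[ \Delta^{\nabla u_{t-\tau}} \xi_\tau + \partial_\tau \xi_\tau \ge 0 \tag{14.4} \]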

in the weak sense. For an arbitrary nonnegative function $\psi \in C^\infty(M)$ and $\tau \in (0, t - s)$, set
\[ \Phi(\tau) := \int_M \psi \cdot P^{\nabla u}_{t-\tau,t}(\xi_\tau) \, dm = \int_M \widehat{P}^{\nabla u}_{t-\tau,t}(\psi) \cdot \xi_\tau \, dm. \]
Observe from (13.18) and (11.17) that
\begin{align*}
\Phi'(\tau) &= \int_M \widehat{P}^{\nabla u}_{t-\tau,t}(\psi) \cdot \partial_\tau \xi_\tau \, dm - \int_M d\xi_\tau\Bigl(\nabla^{\nabla u_{t-\tau}}\bigl[\widehat{P}^{\nabla u}_{t-\tau,t}(\psi)\bigr]\Bigr) \, dm \\
&= \int_M \widehat{P}^{\nabla u}_{t-\tau,t}(\psi) \cdot \partial_\tau \xi_\tau \, dm - \int_M d\bigl[\widehat{P}^{\nabla u}_{t-\tau,t}(\psi)\bigr]\bigl(\nabla^{\nabla u_{t-\tau}} \xi_\tau\bigr) \, dm.
\end{align*}
Hence, it follows from (14.4) with the test function $\widehat{P}^{\nabla u}_{t-\tau,t}(\psi)$ that $\Phi'(\tau) \ge 0$. Therefore, we obtain $\Phi(t - s) \ge \Phi(0)$, which implies
\[ \int_M \psi \cdot P^{\nabla u}_{s,t}(\xi_{t-s}) \, dm \ge \int_M \psi \cdot \xi_0 \, dm. \]
By the arbitrariness of $\psi$, we have $P^{\nabla u}_{s,t}(\xi_{t-s}) \ge \xi_0 \ge F(\nabla u_t)$ almost everywhere. Finally, letting $\varepsilon \to 0$, we deduce from the non-expansion property (13.17) that
\[ e^{-K(t-s)} P^{\nabla u}_{s,t}\bigl(F(\nabla u_s)\bigr) \ge F(\nabla u_t) \]
almost everywhere. Since both sides are Hölder continuous, this completes the proof. □

Note that we can deduce Corollary 14.3 also from Theorem 14.4. As a corollary to the $L^2$- or $L^1$-gradient estimate (Theorems 14.2 and 14.4), one can also obtain the $N = \infty$ case of the spectral gap given in Exercise 12.14 (corresponding to the Poincaré inequality (15.9)).

Exercise 14.5 Let $(M, F, m)$ be compact and satisfy $\mathrm{Ric}_\infty \ge K$ for some $K > 0$. Assume that there is a nonconstant (eigen)function $u \in H^1(M)$ satisfying $\Delta u = -\lambda u$ in the weak sense for a positive constant $\lambda > 0$. Prove that $\lambda \ge K$ necessarily holds by using the $L^2$- or $L^1$-gradient estimate.

Observe that $\bar{u}_t(x) := e^{-\lambda t} u(x)$ is a global solution to the heat equation by Exercise 13.3, and hence $u \in H^2(M) \cap C^{1,\alpha}(M)$ by Theorem 13.18.

14.3 Characterizations of Lower Ricci Curvature Bounds

From our discussions so far, it is not difficult to see that the Bochner inequalities and the gradient estimates under the lower weighted Ricci curvature bound $\mathrm{Ric}_\infty \ge K$ are in fact all equivalent to $\mathrm{Ric}_\infty \ge K$. This fact guarantees that employing the Bochner inequality in substitution for $\mathrm{Ric}_\infty \ge K$ does not lose sharpness, and forms a basis for the $\Gamma$-calculus.

Theorem 14.6 (Characterizations of $\mathrm{Ric}_\infty \ge K$) Let $(M, F, m)$ be a compact measured Finsler manifold. Then, for each $K \in \mathbb{R}$, the following assertions are equivalent:

(I) $\mathrm{Ric}_\infty \ge K$.

(II) The Bochner inequality
\[ \Delta^{\nabla u}\biggl[\frac{F^2(\nabla u)}{2}\biggr] - d(\Delta u)(\nabla u) \ge K F^2(\nabla u) \]
holds on $M_u$ for all $u \in C^\infty(M)$.

(III) The improved Bochner inequality
\[ \Delta^{\nabla u}\biggl[\frac{F^2(\nabla u)}{2}\biggr] - d(\Delta u)(\nabla u) - K F^2(\nabla u) \ge d[F(\nabla u)]\bigl(\nabla^{\nabla u}[F(\nabla u)]\bigr) \]
holds on $M_u$ for all $u \in C^\infty(M)$.

(IV) The $L^2$-gradient estimate
\[ F^2(\nabla u_t) \le e^{-2K(t-s)} P^{\nabla u}_{s,t}\bigl(F^2(\nabla u_s)\bigr), \qquad 0 \le s < t < \infty, \]
holds for all global solutions $(u_t)_{t \ge 0}$ to the heat equation with $u_0 \in C^\infty(M)$.

(V) The $L^1$-gradient estimate
\[ F(\nabla u_t) \le e^{-K(t-s)} P^{\nabla u}_{s,t}\bigl(F(\nabla u_s)\bigr), \qquad 0 \le s < t < \infty, \]
holds for all global solutions $(u_t)_{t \ge 0}$ to the heat equation with $u_0 \in C^\infty(M)$.

Proof We have shown (I) ⇒ (III) in Proposition 12.15. Since the integrated Bochner inequality follows from its pointwise version for smooth functions as in the proof of Theorem 12.13, we obtain (III) ⇒ (V) from Theorem 14.4. The implication (V) ⇒ (IV) is a consequence of a kind of Jensen's inequality:
\[ P^{\nabla u}_{s,t}(f)^2 \le P^{\nabla u}_{s,t}(f^2) \tag{14.5} \]


for continuous functions $f \in H^1(M)$. In order to see (14.5), observe that
\[ 0 \le P^{\nabla u}_{s,t}\bigl((rf + 1)^2\bigr) = r^2 P^{\nabla u}_{s,t}(f^2) + 2r P^{\nabla u}_{s,t}(f) + 1 \]
holds for all $r \in \mathbb{R}$ since $P^{\nabla u}_{s,t}(1) = 1$ by Proposition 13.20(iii). Hence, the discriminant $P^{\nabla u}_{s,t}(f)^2 - P^{\nabla u}_{s,t}(f^2)$ is nonpositive as desired.

One can deduce (IV) ⇒ (II) from the proof of Theorem 14.2 by letting $u_0 = u$. Precisely, the $L^2$-gradient estimate with $s = 0$ implies $H'(0) \le 0$, which then yields the Bochner inequality for $u$ (the detail of the proof is left as an exercise below).

Finally, we prove (II) ⇒ (I). Given arbitrary $v_0 \in T_{x_0}M \setminus \{0\}$, we fix a local coordinate system $(x^i)_{i=1}^n$ around $x_0$ such that $g_{ij}(v_0) = \delta_{ij}$ and $x^i(x_0) = 0$ for all $1 \le i, j \le n$, and consider the smooth function
\[ u(x) := \sum_{i=1}^n v_0^i x^i + \frac{1}{2} \sum_{i,j,k=1}^n \Gamma^k_{ij}(v_0) v_0^k x^i x^j \]
on a neighborhood of $x_0$. Then we find $\nabla u(x_0) = v_0$ as well as $(\nabla^2 u)|_{T_{x_0}M} = 0$ (recall (12.2) for the local coordinate expression of $\nabla^2 u$). Thus the Bochner–Weitzenböck formula (12.6) and (II) imply
\[ \mathrm{Ric}_\infty(v_0) = \Delta^{\nabla u}\biggl[\frac{F^2(\nabla u)}{2}\biggr](x_0) - d(\Delta u)(\nabla u)(x_0) \ge K F^2(v_0). \]
This completes the proof. □

Exercise 14.7 Give a detailed proof of the implication (IV) ⇒ (II) above.

There are several directions of further research, briefly discussed below.

Remark 14.8 (Non-smooth Counterparts) On "Riemannian" metric measure spaces (precisely, infinitesimally Hilbertian metric measure spaces), the Bochner inequality (formulated in a suitable way) is known to be equivalent to the corresponding curvature-dimension condition (recall Remark 9.10(b) and see [12, 95] as well as Chap. 18). A Finsler counterpart to this characterization (in other words, a nonsmooth analogue to Theorem 14.6) is not known, since a suitable formulation of the Finsler Bochner inequality on metric measure spaces is not yet established. In particular, it is unclear how to generalize the linearized Laplacian to metric measure spaces.

Remark 14.9 (The Lack of Contraction) In the Riemannian case, lower Ricci curvature bounds are also equivalent to the corresponding contraction estimates of heat flow with respect to the Wasserstein distance (see [245] for the Riemannian case and [95] for the case of metric measure spaces satisfying the Riemannian curvature-dimension condition). We refer to Chap. 18 for the Wasserstein distance, and to (18.17) for the $L^2$-contraction estimate. More generally, for linear semigroups, gradient estimates are directly equivalent to the corresponding contraction properties (see [143, 145]). In our Finsler setting, however, the lack of the commutativity (introduced and studied in [205]) prevents such a contraction estimate, at least in the usual exponential form (see [207]). This is related to one of the most important problems in the theory of gradient flows of convex functions on Finsler manifolds or normed spaces (see (b) at the end of Preface).

Remark 14.10 (Noncompact Case) The $\Gamma$-calculus approach on noncompact Finsler manifolds is yet to be fully developed. For example, known generalizations of the $L^2$- and $L^1$-gradient estimates to the noncompact case require some additional assumptions. The lack of higher order regularity (Theorem 13.18) or the contraction property (Remark 14.9) prevents a direct application of techniques in the linear calculus or the Riemannian curvature-dimension condition. See [204] for a further discussion.
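The Jensen-type inequality (14.5), used for (V) ⇒ (IV) in the proof of Theorem 14.6, relies only on linearity, positivity and $P(1) = 1$. As a rough illustration, the sketch below replaces $P^{\nabla u}_{s,t}$ by a row-stochastic matrix acting on functions on a five-point set. This finite model and the names `apply_kernel`, `jensen_gap` and `random_markov_matrix` are assumptions made for the example, not objects from the text.

```python
import random


def apply_kernel(P, f):
    """Return (P f)(i) = sum_j P[i][j] * f[j] for a row-stochastic matrix P."""
    return [sum(p_ij * f_j for p_ij, f_j in zip(row, f)) for row in P]


def jensen_gap(P, f):
    """Pointwise gap P(f^2) - (P f)^2, which (14.5) predicts to be nonnegative."""
    Pf = apply_kernel(P, f)
    Pf2 = apply_kernel(P, [x * x for x in f])
    return [b - a * a for a, b in zip(Pf, Pf2)]


def random_markov_matrix(n, rng):
    """Random matrix with nonnegative entries whose rows sum to one (toy stand-in for P)."""
    rows = [[rng.random() for _ in range(n)] for _ in range(n)]
    return [[x / sum(row) for x in row] for row in rows]


rng = random.Random(0)
P = random_markov_matrix(5, rng)
f = [rng.uniform(-1.0, 1.0) for _ in range(5)]
gaps = jensen_gap(P, f)
```

In this toy model every entry of `gaps` is nonnegative up to rounding, and the gap vanishes for constant `f`, which matches the equality case of the discriminant argument.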

14.4 The Li–Yau Estimates

Next we consider the Li–Yau-type estimates under $\mathrm{Ric}_N \ge K$ for $N \in [n, \infty)$ and $K \le 0$, essentially along the lines of the Riemannian case (see [85, 155, 156]). They provide "dimensional" estimates depending on $N$, which improve the "dimension-free" estimates under $\mathrm{Ric}_\infty \ge K$ as in Theorems 14.2 and 14.4. Recall that the condition $\mathrm{Ric}_N \ge K$ can be regarded as the combination of $\mathrm{Ric} \ge K$ and $\dim \le N$ (see Sect. 9.2), and that $\mathrm{Ric}_\infty \ge \mathrm{Ric}_N$ holds for $N \in [n, \infty)$ (Lemma 9.8(i)).

Theorem 14.11 (Li–Yau Gradient Estimate) Assume that $(M, F, m)$ is compact and satisfies $\mathrm{Ric}_N \ge K$ for some $N \in [n, \infty)$ and $K \le 0$, and take a positive global solution $(u_t)_{t \in [0,T]}$ to the heat equation $\partial_t u = \Delta u$. Then we have
\[ F^2\bigl(\nabla(\log u_t)\bigr) - \theta \cdot \partial_t(\log u_t) \le N\theta^2 \biggl( \frac{1}{2t} - \frac{K}{4(\theta - 1)} \biggr) \tag{14.6} \]
on $(0, T] \times M$ for any $\theta > 1$.

Proof We can follow the argument in [85, Sect. 5.3] (for Riemannian manifolds), despite the nonlinearity of the Laplacian and the lack of higher order regularity. We divide the proof into five steps. It is sufficient to show the claim for $t = T$.

Step 1 Put $f_t := \log u_t$, which is $H^2$ in space and $C^{1,\alpha}$ in both space and time by Theorem 13.18. Since $u_t > 0$, we have
\[ \nabla f_t = \frac{\nabla u_t}{u_t}, \qquad \Delta f_t = \frac{\Delta u_t}{u_t} - \frac{F^2(\nabla u_t)}{u_t^2}, \]
and, in particular, $g_{\nabla f_t} = g_{\nabla u_t}$ on $M_{u_t}$. Thus the heat equation for $u$ implies
\[ \Delta f_t + F^2(\nabla f_t) = \partial_t f_t. \tag{14.7} \]

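Observe again from Theorem 13.18 that $\partial_t f_t$ and $\Delta f_t$ belong to $H^1(M)$. Moreover, $d(\partial_t f_t) = 0$ holds almost everywhere on $M \setminus M_{u_t}$ by (14.7) and Lemma 12.12.

Step 2 Next we verify that the function $\sigma(t, x) := t \cdot \partial_t f_t(x)$ satisfies
\[ \Delta^{\nabla u} \sigma + 2 d\sigma(\nabla f) - \partial_t \sigma + \frac{\sigma}{t} = 0 \tag{14.8} \]
in the weak sense on $(0, T) \times M$. For each $\phi \in C_c^\infty((0, T) \times M)$, we have
\begin{align*}
&\int_0^T \int_M \Bigl\{ -d\phi(\nabla^{\nabla u} \sigma) + 2\phi \cdot d\sigma(\nabla f) + \partial_t \phi \cdot \sigma + \frac{\phi \sigma}{t} \Bigr\} \, dm \, dt \\
&= \int_0^T \int_M \Bigl\{ -d(t\phi)\bigl(\nabla^{\nabla u}(\partial_t f)\bigr) + 2t\phi \cdot d(\partial_t f)(\nabla f) + \partial_t(t\phi) \cdot \partial_t f \Bigr\} \, dm \, dt \\
&= \int_0^T \int_M \Bigl\{ -d(t\phi)\bigl(\nabla^{\nabla u}(\partial_t f)\bigr) + t\phi \cdot \partial_t\bigl[F^2(\nabla f)\bigr] + \partial_t(t\phi) \bigl( \Delta f + F^2(\nabla f) \bigr) \Bigr\} \, dm \, dt \\
&= -\int_0^T \int_M \Bigl\{ d(t\phi)\bigl(\nabla^{\nabla u}(\partial_t f)\bigr) + d\bigl[\partial_t(t\phi)\bigr](\nabla f) \Bigr\} \, dm \, dt,
\end{align*}
where we used (the calculation in the proof of) Lemma 14.1 and (14.7) in the second equality. Again by a similar calculation to Lemma 14.1, we find
\begin{align*}
\frac{\partial}{\partial t}\bigl[ d(t\phi)(\nabla f) \bigr]
&= \frac{\partial}{\partial t}\biggl[ \sum_{i,j=1}^n g^*_{ij}(df) \frac{\partial(t\phi)}{\partial x^i} \frac{\partial f}{\partial x^j} \biggr] \\
&= \sum_{i,j=1}^n g^*_{ij}(df) \biggl\{ \frac{\partial^2(t\phi)}{\partial t \partial x^i} \frac{\partial f}{\partial x^j} + \frac{\partial(t\phi)}{\partial x^i} \frac{\partial^2 f}{\partial t \partial x^j} \biggr\} \\
&= d\bigl[\partial_t(t\phi)\bigr](\nabla f) + d(t\phi)\bigl(\nabla^{\nabla u}(\partial_t f)\bigr).
\end{align*}
Therefore, we obtain
\[ \int_0^T \int_M \Bigl\{ d(t\phi)\bigl(\nabla^{\nabla u}(\partial_t f)\bigr) + d\bigl[\partial_t(t\phi)\bigr](\nabla f) \Bigr\} \, dm \, dt = 0, \]
and hence (14.8) holds in the weak sense.

Step 3 Now we consider the function
\[ \alpha(t, x) := t \bigl\{ F^2\bigl(\nabla f_t(x)\bigr) - \theta \cdot \partial_t f_t(x) \bigr\} = t F^2\bigl(\nabla f_t(x)\bigr) - \theta \sigma(t, x), \]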

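which is $H^1$ in space and Hölder continuous in both space and time. Moreover, $d\alpha_t = 0$ holds almost everywhere on $M \setminus M_{u_t}$. Using the previous identity (14.8) for $\sigma$ and the Bochner inequality, we shall show that
\[ \Delta^{\nabla u} \alpha + 2 d\alpha(\nabla f) - \partial_t \alpha \ge \beta \tag{14.9} \]
holds in the weak sense on $(0, T) \times M$, where $\beta$ is a continuous function defined by
\[ \beta(t, x) := -\frac{\alpha(t, x)}{t} + 2t \biggl\{ K F^2(\nabla f) + \frac{1}{N} \bigl( F^2(\nabla f) - \partial_t f \bigr)^2 \biggr\}(t, x). \]
For each nonnegative function $\phi \in C_c^\infty((0, T) \times M)$, observe from (14.8), Lemma 14.1 and (14.7) that
\begin{align*}
&\int_0^T \int_M \Bigl\{ -d\phi(\nabla^{\nabla u} \alpha) + 2\phi \cdot d\alpha(\nabla f) + \partial_t \phi \cdot \alpha + \frac{\phi \alpha}{t} \Bigr\} \, dm \, dt \\
&= \int_0^T \int_M \Bigl\{ -t \, d\phi\bigl(\nabla^{\nabla u}\bigl[F^2(\nabla f)\bigr]\bigr) + 2t\phi \cdot d\bigl[F^2(\nabla f)\bigr](\nabla f) - \phi \cdot \partial_t\bigl[t F^2(\nabla f)\bigr] + \phi F^2(\nabla f) \Bigr\} \, dm \, dt \\
&= \int_0^T \int_M \Bigl\{ -t \, d\phi\bigl(\nabla^{\nabla u}\bigl[F^2(\nabla f)\bigr]\bigr) + 2t\phi \cdot d\bigl[F^2(\nabla f)\bigr](\nabla f) - 2t\phi \cdot d(\partial_t f)(\nabla f) \Bigr\} \, dm \, dt \\
&= \int_0^T \int_M \Bigl\{ -t \, d\phi\bigl(\nabla^{\nabla u}\bigl[F^2(\nabla f)\bigr]\bigr) - 2t\phi \cdot d(\Delta f)(\nabla f) \Bigr\} \, dm \, dt.
\end{align*}
It follows from the integrated Bochner inequality (12.13) for $f$ that the right-hand side is bounded from below by
\[ \int_0^T 2t \int_M \phi \biggl\{ K F^2(\nabla f) + \frac{(\Delta f)^2}{N} \biggr\} \, dm \, dt. \]
Then (14.7) completes the proof of (14.9).

Step 4 By the compactness of $M$, one can take a maximizing point $(t_0, x_0)$ of $\alpha$ in $[0, T] \times M$. Since the assertion (14.6) is obvious if $\alpha(t_0, x_0) \le 0$, we will assume $\alpha(t_0, x_0) > 0$ and then $t_0 > 0$. Our claim at this step is $\beta(t_0, x_0) \le 0$ and we prove it by contradiction. If $\beta(t_0, x_0) > 0$, then we have $\beta > 0$ on a neighborhood of $(t_0, x_0)$. Thanks to (14.9), on such a neighborhood, $\alpha$ is a strict sub-solution to the linear parabolic operator
\[ \Delta^{\nabla u} \alpha + 2 d\alpha(\nabla f) - \partial_t \alpha. \]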

Therefore, $\alpha(t_0, x_0)$ is strictly less than the supremum of $\alpha$ on the boundary of a parabolic cylinder $[t_0 - \delta, t_0] \times B^+_M(x_0, \delta)$ for sufficiently small $\delta > 0$. In particular, $\alpha$ cannot be maximal at $(t_0, x_0)$, a contradiction.

Step 5 The conclusion $\beta(t_0, x_0) \le 0$ of the previous step yields that
\[ \alpha(t_0, x_0) \ge 2t_0^2 \biggl\{ K F^2(\nabla f) + \frac{1}{N} \bigl( F^2(\nabla f) - \partial_t f \bigr)^2 \biggr\}(t_0, x_0) \tag{14.10} \]
holds at any maximizer $(t_0, x_0)$ of $\alpha$. We shall show that
\[ \alpha(t_0, x_0) \le \frac{N\theta^2}{2} \biggl( 1 - \frac{K t_0}{2(\theta - 1)} \biggr). \tag{14.11} \]
Recall that we can assume $\alpha(t_0, x_0) > 0$ and $t_0 \in (0, T]$. All the inequalities below will be evaluated at $(t_0, x_0)$. Setting $\rho := F^2(\nabla f)/\alpha$, which is nonnegative at $(t_0, x_0)$, we rewrite (14.10) as
\[ \frac{\alpha}{2t_0^2} \ge \frac{1}{N} \biggl( \alpha\rho - \frac{\alpha\rho t_0 - \alpha}{\theta t_0} \biggr)^2 + K\alpha\rho = \frac{\alpha^2}{N\theta^2 t_0^2} (1 - \rho t_0 + \theta\rho t_0)^2 + K\alpha\rho. \]
Dividing both sides by $\alpha$ and rearranging, we find that
\[ \alpha \le \frac{N\theta^2}{2} \cdot \frac{1 - 2K\rho t_0^2}{(1 + (\theta - 1)\rho t_0)^2} = \frac{N\theta^2}{2} \cdot \frac{1 + ab}{(1 + a)^2}, \]
where we put
\[ a := (\theta - 1)\rho t_0 \ge 0, \qquad b := -\frac{2K t_0}{\theta - 1} \ge 0. \]
Then, it follows from
\[ 1 + ab \le 1 + \frac{ab}{2} + \frac{b(a^2 + 1)}{4} \le \biggl( 1 + \frac{b}{4} \biggr) (1 + a)^2 \]
that (14.11) holds. Finally, since $K \le 0$, (14.11) implies
\[ \alpha(T, x) \le \alpha(t_0, x_0) \le \frac{N\theta^2}{2} \biggl( 1 - \frac{KT}{2(\theta - 1)} \biggr) \]
for any $x \in M$. This completes the proof of the claim (14.6). □
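The elementary inequality closing Step 5, namely $(1 + ab)/(1 + a)^2 \le 1 + b/4$ for $a, b \ge 0$, can be spot-checked on a grid. The snippet below is only a numerical sanity check under that reading of the inequality. The helper names `li_yau_ratio` and `max_violation` are hypothetical.

```python
def li_yau_ratio(a, b):
    """Left-hand quantity (1 + ab) / (1 + a)^2 from Step 5, for a, b >= 0."""
    return (1.0 + a * b) / (1.0 + a) ** 2


def max_violation(a_values, b_values):
    """Largest value of ratio - (1 + b/4) over the grid; it should not be positive."""
    return max(
        li_yau_ratio(a, b) - (1.0 + b / 4.0)
        for a in a_values
        for b in b_values
    )


# Grid on [0, 20] x [0, 20] with step 0.05 (an arbitrary choice for illustration).
grid = [0.05 * k for k in range(401)]
worst = max_violation(grid, grid)
```

On this grid the difference is never positive and it vanishes at $a = b = 0$, which is consistent with the two-step estimate in the proof (first $2a \le a^2 + 1$, then $1 \le (1 + a)^2$).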

 


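We refer to [19] for various variants and generalizations of the Li–Yau inequality. We proceed to a Harnack inequality as an application of Theorem 14.11.

Theorem 14.12 (Li–Yau Harnack Inequality) Assume that $(M, F, m)$ is compact and satisfies $\mathrm{Ric}_N \ge K$ for some $N \in [n, \infty)$ and $K \le 0$, and take a nonnegative global solution $(u_t)_{t \in [0,T]}$ to the heat equation $\partial_t u = \Delta u$. Then we have
\[ u_s(x) \le u_t(y) \cdot \biggl( \frac{t}{s} \biggr)^{N\theta/2} \exp\biggl( \frac{\theta \, d^2(y, x)}{4(t - s)} - \frac{KN\theta (t - s)}{4(\theta - 1)} \biggr) \]
for any $x, y \in M$, $0 < s < t \le T$ and $\theta > 1$.

Proof Note that, for any $\varepsilon > 0$, $u + \varepsilon$ is again a global solution to the heat equation. Therefore, replacing $u$ with $u + \varepsilon$ (and then letting $\varepsilon \to 0$), we can assume that $u$ is positive without loss of generality. Take $v \in T_y M$ such that $\exp_y((t - s)v) = x$ and $F(v) = d(y, x)/(t - s)$. Then $\eta(\tau) := \exp_y((t - \tau)v)$, $\tau \in [s, t]$, is the reverse curve of a minimal geodesic from $y = \eta(t)$ to $x = \eta(s)$, and we have $F(-\dot{\eta}(\tau)) = d(y, x)/(t - s)$ for all $\tau$. We again consider $f := \log u$ and put
\[ \Theta := \frac{\theta \, d^2(y, x)}{4(t - s)^2} - \frac{KN\theta}{4(\theta - 1)}, \qquad \zeta(\tau) := f_\tau\bigl(\eta(\tau)\bigr) + \frac{N\theta}{2} \log \tau + \Theta \tau. \]
Then we deduce from Theorem 14.11 and $df(v) \ge -F^*(df) F(-v)$ (by (3.9)) that
\begin{align*}
\partial_\tau \zeta &= df_\tau(\dot{\eta}) + \partial_\tau f_\tau(\eta) + \frac{N\theta}{2\tau} + \Theta \\
&\ge -F\bigl(\nabla f_\tau(\eta)\bigr) F(-\dot{\eta}) + \biggl\{ \frac{F^2(\nabla f_\tau(\eta))}{\theta} - N\theta \biggl( \frac{1}{2\tau} - \frac{K}{4(\theta - 1)} \biggr) \biggr\} + \frac{N\theta}{2\tau} + \Theta \\
&= -F\bigl(\nabla f_\tau(\eta)\bigr) \frac{d(y, x)}{t - s} + \frac{F^2(\nabla f_\tau(\eta))}{\theta} + \frac{\theta \, d^2(y, x)}{4(t - s)^2} \ge 0,
\end{align*}
where the last expression is nonnegative because it equals the perfect square $\bigl( F(\nabla f_\tau(\eta))/\sqrt{\theta} - \sqrt{\theta} \, d(y, x)/(2(t - s)) \bigr)^2$. Hence, we obtain
\[ u_s(x) \cdot s^{N\theta/2} e^{\Theta s} = e^{\zeta(s)} \le e^{\zeta(t)} = u_t(y) \cdot t^{N\theta/2} e^{\Theta t}, \]
which proves the claim. □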

Generalizations of Theorems 14.11 and 14.12 to complete, noncompact measured Finsler manifolds are yet to be given (see [85, Sect. 5.3] for the Riemannian case). In both Theorems 14.11 and 14.12, we can take $\theta = 1$ if $K = 0$. Thus we have the following.

Corollary 14.13 (Nonnegatively Curved Case) Let $(M, F, m)$ be compact and satisfy $\mathrm{Ric}_N \ge 0$ for some $N \in [n, \infty)$, and take a global solution $(u_t)_{t \in [0,T]}$ to the heat equation. If $u_t > 0$ for all $t$, then we have
\[ F^2\bigl(\nabla(\log u_t)\bigr) - \partial_t(\log u_t) \le \frac{N}{2t} \]
on $(0, T] \times M$. If $u_t \ge 0$ for all $t$, then
\[ u_s(x) \le u_t(y) \cdot \biggl( \frac{t}{s} \biggr)^{N/2} \exp\biggl( \frac{d^2(y, x)}{4(t - s)} \biggr) \]
holds for any $x, y \in M$ and $0 < s < t \le T$.

We remark that the conditions $u_t > 0$ and $u_t \ge 0$ need to be verified only at $t = 0$ thanks to Lemma 13.13. Substituting (14.7) into the gradient estimate in Corollary 14.13, we immediately obtain the following.

Corollary 14.14 Let $(M, F, m)$ be compact and satisfy $\mathrm{Ric}_N \ge 0$ for some $N \in [n, \infty)$, and take a positive global solution $(u_t)_{t \in [0,T]}$ to the heat equation. Then we have
\[ \Delta(\log u_t) \ge -\frac{N}{2t} \]
on $(0, T] \times M$.

Remark 14.15 (Order of $x, y$) The order of $x, y$ in $d^2(y, x)$ in the right-hand side of the Harnack inequality needs to be handled with care. Although the curvature bound $\mathrm{Ric}_N \ge K$ is common between $F$ and its reverse structure $\overleftarrow{F}$, $(u_t)_{t \in [0,T]}$ may not be a global solution to the heat equation with respect to $\overleftarrow{F}$. Instead, recall from Exercise 13.4 that $(-u_t)_{t \in [0,T]}$ is a global solution to the heat equation: $\partial_t(-u) = \overleftarrow{\Delta}(-u)$. Since $-u_t$ is nonpositive, one cannot apply Theorem 14.12.

Remark 14.16 (Gradient Estimates for $N < 0$) Recall that the Bochner inequality (12.8) is valid for $N \in (-\infty, 0) \cup [n, \infty]$. For negative $N$, however, no reasonable gradient estimate for heat flow corresponding to the bound $\mathrm{Ric}_N \ge K$ is known. We remark that, since $\mathrm{Ric}_N \ge \mathrm{Ric}_\infty$ for $N < 0$ (Lemma 9.8(i)), the condition $\mathrm{Ric}_N \ge K$ is weaker than $\mathrm{Ric}_\infty \ge K$ (recall Remark 9.10(c) as well).

We refer to [254] for a local gradient estimate of Cheng–Yau type (see [75]) for harmonic functions on complete, noncompact measured Finsler manifolds (built with the Bochner inequality).
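To get a feeling for the constant $N/(2t)$ in Corollary 14.13, one can look at the classical Euclidean heat kernel, for which the Li–Yau quantity equals $n/(2t)$ identically. The sketch below checks this by finite differences. Note that $\mathbb{R}^n$ with the Lebesgue measure is noncompact, so it lies outside the compact setting of the corollary and serves only as a model illustration with $N = n$. The function names are hypothetical.

```python
import math


def log_heat_kernel(t, x):
    """log of the Euclidean heat kernel (4 pi t)^(-n/2) exp(-|x|^2 / (4t)) centred at 0."""
    n = len(x)
    r2 = sum(xi * xi for xi in x)
    return -0.5 * n * math.log(4.0 * math.pi * t) - r2 / (4.0 * t)


def li_yau_quantity(t, x, h=1e-5):
    """|grad log u|^2 - d/dt log u, approximated by central differences of step h."""
    grad_sq = 0.0
    for i in range(len(x)):
        xp, xm = list(x), list(x)
        xp[i] += h
        xm[i] -= h
        d = (log_heat_kernel(t, xp) - log_heat_kernel(t, xm)) / (2.0 * h)
        grad_sq += d * d
    dt = (log_heat_kernel(t + h, x) - log_heat_kernel(t - h, x)) / (2.0 * h)
    return grad_sq - dt


t, x = 0.7, [0.3, -1.2, 0.5]
value = li_yau_quantity(t, x)
bound = len(x) / (2.0 * t)  # n / (2t) with n = 3
```

Here `value` agrees with `bound` up to discretization error, so the bound $N/(2t)$ cannot be improved in this model case.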

Chapter 15

Bakry–Ledoux Isoperimetric Inequality

This chapter is devoted to a geometric application of the improved Bochner inequality (Sect. 12.4), the Bakry–Ledoux isoperimetric inequality (also called the Gaussian isoperimetric inequality) under $\mathrm{Ric}_\infty \ge K > 0$ (Theorem 15.1). This is one of the most important geometric applications of the $\Gamma$-calculus. The asymptotic behavior of (nonlinear or linearized) heat semigroups for large time will play an essential role. A related analysis also shows the Poincaré–Lichnerowicz inequality under $\mathrm{Ric}_N \ge K > 0$ with $N \in (-\infty, 0) \cup [n, \infty]$ (Theorem 15.4). Similarly to the previous chapter, we assume the compactness and fix auxiliary nowhere vanishing vector fields $(V_t)_{t \ge 0}$ associated with a global solution $(u_t)_{t \ge 0}$ to the heat equation.

15.1 Background

First we explain the precise statement and background of the Bakry–Ledoux isoperimetric inequality. Let $(M, F, m)$ be a compact measured Finsler manifold satisfying $\mathrm{Ric}_\infty \ge K$ for some $K > 0$. We can normalize $m$ as $m(M) = 1$ without changing $\mathrm{Ric}_\infty$ (by the Finsler analogue of Lemma 9.8(ii)). For a Borel set $A \subset M$, define Minkowski's exterior boundary measure as
\[ m^+(A) := \liminf_{\varepsilon \to 0} \frac{m(B^+(A, \varepsilon)) - m(A)}{\varepsilon}, \tag{15.1} \]
where $B^+(A, \varepsilon) := \{ y \in M \mid \inf_{x \in A} d(x, y) < \varepsilon \}$ is the forward $\varepsilon$-neighborhood of $A$ for $\varepsilon > 0$. Then the (forward) isoperimetric profile $I_{(M,F,m)} : [0, 1] \to [0, \infty)$ of $(M, F, m)$ is defined by
\[ I_{(M,F,m)}(\theta) := \inf\{ m^+(A) \mid A \subset M : \text{Borel set with } m(A) = \theta \}. \tag{15.2} \]

© The Author(s), under exclusive license to Springer Nature Switzerland AG 2021 S.-i. Ohta, Comparison Finsler Geometry, Springer Monographs in Mathematics, https://doi.org/10.1007/978-3-030-80650-7_15

Clearly we have $I_{(M,F,m)}(0) = I_{(M,F,m)}(1) = 0$. An isoperimetric inequality means a lower bound of $I_{(M,F,m)}(\theta)$, i.e., a lower bound of the measure (area) of the boundary of any set of prescribed volume. The main result of this chapter is the following (established in [204]).

Theorem 15.1 (Bakry–Ledoux Isoperimetric Inequality) Assume that $(M, F, m)$ is compact and satisfies $m(M) = 1$ and $\mathrm{Ric}_\infty \ge K$ for some $K > 0$. Then we have
\[ I_{(M,F,m)}(\theta) \ge I_K(\theta) \tag{15.3} \]
for all $\theta \in (0, 1)$, where
\[ I_K(\theta) := \sqrt{\frac{K}{2\pi}} \, e^{-K c^2(\theta)/2} \qquad \text{with} \quad \theta = \int_{-\infty}^{c(\theta)} \sqrt{\frac{K}{2\pi}} \, e^{-K a^2/2} \, da. \]

In the weighted Riemannian case, Theorem 15.1 (under completeness instead of compactness) is due to Bakry and Ledoux [23]. The inequality (15.3) can be regarded as a dimension-free version of Lévy–Gromov's isoperimetric inequality. Lévy–Gromov's classical isoperimetric inequality states that the isoperimetric profile of an $n$-dimensional (unweighted) Riemannian manifold $(M, g)$ with $\mathrm{Ric}_g \ge n - 1$ is bounded from below by the profile of the unit sphere $\mathbb{S}^n$ (both spaces are equipped with the normalized volume measures; see [112, 152, 153]). In $\mathbb{S}^n$, isoperimetric minimizers (i.e., sets attaining the infimum in (15.2)) are given by balls.

In (15.3), the role of the unit sphere is played by the real line $\mathbb{R}$ equipped with the Euclidean distance and the Gaussian measure $m_K(dx) := \sqrt{K/(2\pi)} \, e^{-Kx^2/2} \, dx$, thus (15.3) is also called the Gaussian isoperimetric inequality. Recall that $(\mathbb{R}, |\cdot|, m_K)$ satisfies $\mathrm{Ric}_\infty = K$. Isoperimetric minimizers in $(\mathbb{R}, |\cdot|, m_K)$ are the half-lines $(-\infty, c(\theta)]$ and $[-c(\theta), \infty)$, thus $I_K$ is indeed the isoperimetric profile of $(\mathbb{R}, |\cdot|, m_K)$. Higher dimensional Gaussian spaces $(\mathbb{R}^n, |\cdot|, (K/(2\pi))^{n/2} e^{-K|x|^2/2} \, dx)$ have the same isoperimetric profile $I_K$, and then isoperimetric minimizers are again half-spaces of the form $\{ x \in \mathbb{R}^n \mid \langle x, w \rangle \le c(\theta) \}$ with $w \in \mathbb{S}^{n-1}$. We refer to [40, 236] for the Gaussian isoperimetric inequality in the classical Euclidean or Hilbert setting, and to [13] for the case of metric measure spaces satisfying the Riemannian curvature-dimension condition $\mathrm{RCD}(K, \infty)$ (by means of the $\Gamma$-calculus).

Exercise 15.2 Prove that, in $(\mathbb{R}, |\cdot|, m_K)$, the half-lines $(-\infty, c(\theta)]$ and $[-c(\theta), \infty)$ are isoperimetric minimizers.

We will prove Theorem 15.1 by generalizing the argument in [23]. This approach based on the $\Gamma$-calculus is known to give a sharp estimate only in the case of $\mathrm{Ric}_\infty \ge K$. As for general $\mathrm{Ric}_N$, we will discuss Lévy–Gromov-type isoperimetric inequalities in Sect. 19.4 as an application of another powerful technique called the needle decomposition (or the localization). This more geometric method is also available for singular spaces, precisely, for essentially non-branching metric measure spaces satisfying the curvature-dimension condition (see [59]). This wide class of spaces includes, in particular, reversible Finsler manifolds of weighted Ricci curvature bounded below. However, in the non-reversible case, the needle decomposition gives only weaker estimates, e.g., $I_{(M,F,m)}(\theta) \ge \Lambda_F^{-1} \cdot I_K(\theta)$ instead of (15.3), where $\Lambda_F \ge 1$ is the reversibility constant as in (3.17). See Chap. 19 (and [203]) for more details.
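The profile $I_K$ in Theorem 15.1 is the density of the Gaussian measure $m_K$ evaluated at the quantile $c(\theta)$, so it is easy to evaluate numerically. The following sketch uses Python's standard `statistics.NormalDist`. The function names `gaussian_profile` and `half_line_boundary_measure` are hypothetical. It also evaluates the difference quotient from (15.1) for the half-line $A = (-\infty, c(\theta)]$, whose $\varepsilon$-neighborhood in $(\mathbb{R}, |\cdot|)$ is $(-\infty, c(\theta) + \varepsilon)$.

```python
from math import sqrt
from statistics import NormalDist


def gaussian_profile(theta, K=1.0):
    """I_K(theta): density of m_K = N(0, 1/K) at the quantile c(theta), 0 < theta < 1."""
    gauss = NormalDist(mu=0.0, sigma=1.0 / sqrt(K))
    c = gauss.inv_cdf(theta)
    return gauss.pdf(c)


def half_line_boundary_measure(theta, K=1.0, eps=1e-6):
    """Difference quotient in (15.1) for the half-line A = (-inf, c(theta)] under m_K."""
    gauss = NormalDist(mu=0.0, sigma=1.0 / sqrt(K))
    c = gauss.inv_cdf(theta)
    return (gauss.cdf(c + eps) - gauss.cdf(c)) / eps
```

For example, `gaussian_profile(0.5)` returns the peak value $1/\sqrt{2\pi}$ of the standard Gaussian density, and the profile is symmetric under $\theta \mapsto 1 - \theta$. The second function only confirms that half-lines have boundary measure $I_K(\theta)$. It does not address their minimality, which is the content of Exercise 15.2.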

15.2 Poincaré–Lichnerowicz Inequality and Variance Decay

Throughout this and the next chapter, the behavior of heat semigroups as $t \to \infty$ will play an essential role. In this section, as an auxiliary step towards the Bakry–Ledoux isoperimetric inequality, we consider the outcome of the ergodicity
\[ u_t \to \int_M u_0 \, dm \qquad \text{in } L^2(M) \tag{15.4} \]
for all global solutions $(u_t)_{t \ge 0}$ to the heat equation (Proposition 13.16(iii)). Another ingredient is the mass conservation
\[ \int_M P^{\nabla u}_{s,t}(f) \, dm = \int_M f \, dm \tag{15.5} \]
for any global solution $(u_t)_{t \ge 0}$ to the heat equation, $f \in L^2(M)$ and for any $0 \le s < t < \infty$ (this is a consequence of Lemma 11.5, (13.19) and Proposition 13.20(iii)).

We shall show a Poincaré-type inequality as a crucial step. Since the proof is common, here we consider the more general setting of $\mathrm{Ric}_N \ge K > 0$ with $N \in (-\infty, 0) \cup [n, \infty]$, yielding a Finsler analogue to the Lichnerowicz inequality (compare Theorem 15.4 below with Exercise 12.14). We begin with a simple application of the Bochner inequality.

Lemma 15.3 Assume that $(M, F, m)$ is compact and satisfies $\mathrm{Ric}_N \ge K$ for some $K \in \mathbb{R}$ and $N \in (-\infty, 0) \cup [n, \infty]$. Then we have, for any $u \in H^2(M) \cap C^1(M)$ such that $\Delta u \in H^1(M)$,
\[ \int_M \biggl\{ d(\Delta u)(\nabla u) + K F^2(\nabla u) + \frac{(\Delta u)^2}{N} \biggr\} \, dm \le 0. \]
In particular, if $K > 0$, then it holds that
\[ \int_M F^2(\nabla u) \, dm \le \frac{N - 1}{KN} \int_M (\Delta u)^2 \, dm. \tag{15.6} \]
In the case of $N = \infty$, the coefficient in the right-hand side of (15.6) is read as $1/K$.

Proof The first assertion is a direct consequence of the Bochner inequality (12.13) with $\phi = 1$. It further implies that, by the integration by parts,
\[ K \int_M F^2(\nabla u) \, dm + \frac{\| \Delta u \|_{L^2}^2}{N} \le -\int_M d(\Delta u)(\nabla u) \, dm = \| \Delta u \|_{L^2}^2. \]
Rearranging this inequality yields (15.6) when $K > 0$. □
Normalizing the measure $m$ as $m(M) = 1$, we define the variance of a function $f \in L^2(M)$ by
\[ \mathrm{Var}_m(f) := \int_M \biggl( f - \int_M f \, dm \biggr)^2 dm = \int_M f^2 \, dm - \biggl( \int_M f \, dm \biggr)^2. \tag{15.7} \]
Then the Poincaré–Lichnerowicz inequality is obtained with the help of the ergodicity (15.4) as follows (see [21, Theorem 4.8.4]).

Theorem 15.4 (Poincaré–Lichnerowicz Inequality) Suppose that $(M, F, m)$ is compact and satisfies $m(M) = 1$ and $\mathrm{Ric}_N \ge K$ for some $K > 0$ and $N \in (-\infty, 0) \cup [n, \infty]$. Then we have, for any $f \in H^1(M)$,
\[ \mathrm{Var}_m(f) \le \frac{N - 1}{KN} \int_M F^2(\nabla f) \, dm. \tag{15.8} \]
The coefficient in the right-hand side is read as $1/K$ for $N = \infty$.

Proof Let $(u_t)_{t \ge 0}$ be the global solution to the heat equation with $u_0 = f$ and consider the function $\Phi(t) := \| u_t \|_{L^2}^2$. It follows from the ergodicity (15.4) that
\[ \mathrm{Var}_m(f) = \Phi(0) - \lim_{t \to \infty} \Phi(t) = -\int_0^\infty \Phi'(t) \, dt. \]
In order to bound $\Phi'(t)$, recall from (13.12) that
\[ \Phi'(t) = -2 \int_M F^2(\nabla u_t) \, dm = -4 E(u_t). \]
Moreover, we deduce from Lemma 14.1 that
\[ \Phi''(t) = -4 \int_M d(\Delta u_t)(\nabla u_t) \, dm = 4 \| \Delta u_t \|_{L^2}^2. \]
Then (15.6) implies
\[ -2 \Phi'(t) \le \frac{N - 1}{KN} \Phi''(t). \]
Integrating this differential inequality and noticing $\lim_{t \to \infty} \Phi'(t) = 0$ (by Proposition 13.16(i)), we obtain
\[ \mathrm{Var}_m(f) = -\int_0^\infty \Phi'(t) \, dt \le \frac{N - 1}{2KN} \Bigl( \lim_{t \to \infty} \Phi'(t) - \Phi'(0) \Bigr) = \frac{2(N - 1)}{KN} E(f). \]
This completes the proof. □
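As a sanity check on the $N = \infty$ case of (15.8), one can evaluate both sides on the one-dimensional Gaussian space $(\mathbb{R}, |\cdot|, m_K)$, which satisfies $\mathrm{Ric}_\infty = K$. The sketch below is a numerical illustration only: the space is noncompact, the trapezoid rule on a truncated interval is an assumption of the example, and the names `integrate` and `poincare_sides` are hypothetical.

```python
import math


def gaussian_density(x, K):
    """Density of m_K(dx) = sqrt(K / (2 pi)) exp(-K x^2 / 2) dx."""
    return math.sqrt(K / (2.0 * math.pi)) * math.exp(-K * x * x / 2.0)


def integrate(g, K, half_width=12.0, steps=48000):
    """Composite trapezoid rule for the integral of g against m_K on [-half_width, half_width]."""
    h = 2.0 * half_width / steps
    total = 0.0
    for i in range(steps + 1):
        x = -half_width + i * h
        weight = 0.5 if i in (0, steps) else 1.0
        total += weight * g(x) * gaussian_density(x, K)
    return total * h


def poincare_sides(f, df, K):
    """Return (Var_m(f), (1/K) * integral of f'^2 dm) for a function f with derivative df."""
    mean = integrate(f, K)
    variance = integrate(lambda x: (f(x) - mean) ** 2, K)
    energy = integrate(lambda x: df(x) ** 2, K)
    return variance, energy / K


variance, bound = poincare_sides(math.sin, math.cos, K=2.0)
```

For $f = \sin$ and $K = 2$ the variance is $(1 - e^{-1})/2 \approx 0.316$, which stays below the right-hand side $(1 + e^{-1})/4 \approx 0.342$.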

 

Observe that (15.8) with $N \in [n, \infty)$ (resp. $N \in (-\infty, 0)$) is stronger (resp. weaker) than that for $N = \infty$,
\[ \mathrm{Var}_m(f) \le \frac{1}{K} \int_M F^2(\nabla f) \, dm, \tag{15.9} \]
as naturally expected from the monotonicity of $\mathrm{Ric}_N$ in $N$ (Lemma 9.8(i)). The Poincaré inequality (15.9) can be alternatively derived from the curvature-dimension condition $\mathrm{CD}(K, \infty)$, which is equivalent to $\mathrm{Ric}_\infty \ge K$ (see Theorems 18.6 and 18.12). Precisely, $\mathrm{CD}(K, \infty)$ implies the logarithmic Sobolev inequality (Theorem 18.10), and then the Poincaré inequality follows by a general argument due to Otto–Villani [213] (see also [161]). We remark that, in this more flexible method, we do not need to assume even the finite reversibility $\Lambda_F < \infty$. The case of $\mathrm{CD}(K, N)$ with $N \in (1, \infty)$, based on a similar but more involved calculation, can be found in [160, Theorem 5.34].

Next we show that the Poincaré inequality (15.9) implies the exponential decay of the variance in the same way as Proposition 13.16(ii) (see also [21, Theorem 4.2.5]). This can be compared with the decay of the entropy in Lemma 16.2 below, which is linked with the logarithmic Sobolev inequality.

Proposition 15.5 (Variance Decay) Assume that $(M, F, m)$ is compact and satisfies $m(M) = 1$ and $\mathrm{Ric}_\infty \ge K$ for some $K > 0$. Then, given any global solution $(u_t)_{t \ge 0}$ to the heat equation $\partial_t u = \Delta u$ and $f \in H^1(M)$, we have
\[ \mathrm{Var}_m\bigl( P^{\nabla u}_{s,t}(f) \bigr) \le e^{-2K(t-s)/S_F} \, \mathrm{Var}_m(f) \]
for all $0 \le s < t < \infty$. In particular, $P^{\nabla u}_{s,t}(f)$ converges to the constant function $\int_M f \, dm$ as $t \to \infty$ in $L^2(M)$.

Proof Put $f_t := P^{\nabla u}_{s,t}(f)$. It follows from (13.17) together with (15.5), Lemma 8.16 and (15.9) that
\begin{align*}
\frac{d}{dt} \mathrm{Var}_m(f_t) &= -2 \int_M df_t(\nabla^{V_t} f_t) \, dm = -2 \int_M g^*_{\mathcal{L}(V_t)}(df_t, df_t) \, dm \\
&\le -\frac{2}{S_F} \int_M F^*(df_t)^2 \, dm \le -\frac{2K}{S_F} \mathrm{Var}_m(f_t).
\end{align*}
Hence $e^{2Kt/S_F} \, \mathrm{Var}_m(f_t)$ is non-increasing in $t$ and we obtain
\[ e^{2Kt/S_F} \, \mathrm{Var}_m(f_t) \le e^{2Ks/S_F} \, \mathrm{Var}_m(f), \]
which is the first assertion. The second assertion is straightforward from (15.5) and $\lim_{t \to \infty} \mathrm{Var}_m(f_t) = 0$. □

We remark that Proposition 15.5 indeed holds for $f \in L^2(M)$ since $(P^{\nabla u}_{s,t})_{t \ge s}$ is extended to a contraction semigroup on $L^2(M)$ (recall Sect. 13.5). Moreover, thanks to Theorem 18.12, it is in fact sufficient to assume only $S_F < \infty$, $C_F < \infty$ and the completeness instead of the compactness. We close this section with some more remarks on the Poincaré–Lichnerowicz inequality (15.8).

Remark 15.6 (Riemannian Case and Rigidity Results)

(a) For weighted Riemannian manifolds, by the linearity of the weighted Laplacian $\Delta$, the inequality (15.8) is equivalent to the spectral gap $\lambda_1 \ge KN/(N - 1)$ for the first nonzero eigenvalue $\lambda_1$ of $-\Delta$ (recall also Exercise 12.14). The classical (unweighted) case of $N = n$ is known as the Lichnerowicz inequality, and the weighted case with $N \in (n, \infty]$ can be found in [20]. The negative range $N \in (-\infty, 0)$ was studied independently in [139] and [200] (see also [225] for an earlier independent work).

(b) The constant $(N - 1)/(KN)$ in the inequality (15.8) is known to be sharp for $N \in (-\infty, -1] \cup [n, \infty]$. As for $N \in (-1, 0)$, it was shown in [139] that $(N - 1)/(KN)$ is not optimal at least for $N < 0$ close to $0$. Rigidity concerning characterizations of spaces attaining the sharp constants is an important problem in comparison geometry. We summarize known results on unweighted and weighted Riemannian manifolds:

• In the unweighted case (i.e., $N = n$), Obata's classical theorem [187] asserts that $\lambda_1 = Kn/(n - 1)$ holds if and only if $(M, g)$ is the sphere of constant sectional curvature $K/(n - 1)$ (thus $\mathrm{Ric}_g = K$).

• When $N = \infty$, equality in (15.8) forces $(M, g, m)$ to isometrically split off the 1-dimensional Gaussian space $(\mathbb{R}, |\cdot|, m_K)$ (see [76]).

• For $N < -1$, equality holds in (15.8) if and only if $(M, g, m)$ is isometric to a warped product of hyperbolic nature,
\[ \mathbb{R} \times_{\cosh(\sqrt{K/(1-N)} \, t)} \Sigma := \biggl( \mathbb{R} \times \Sigma, \ dt^2 + \cosh^2\biggl( \sqrt{\frac{K}{1 - N}} \, t \biggr) \cdot g_\Sigma \biggr), \]
and the measure is decomposed in this product structure as
\[ m(dt \, dx) = \cosh^{N-1}\biggl( \sqrt{\frac{K}{1 - N}} \, t \biggr) \, dt \, m_\Sigma(dx). \]
Here $(\Sigma, g_\Sigma, m_\Sigma)$ is an $(n - 1)$-dimensional weighted Riemannian manifold satisfying $\mathrm{Ric}_{N-1} \ge K(2 - N)/(1 - N)$ (compare this structure with the 1-dimensional model space $(\mathbb{R}, |\cdot|, \cosh^{N-1}(\sqrt{K/(1 - N)} \, x) \, dx)$ as in Exercise 9.20). Moreover, for $N = -1$, equality is never achieved. We refer to [164] for details.

• In the remaining case of $N \in (n, \infty)$, equality is not achieved by Riemannian manifolds. Precisely, it was shown in [136] (in the generalized setting of $\mathrm{RCD}(K, N)$-spaces) that equality holds in (15.8) only if the maximal diameter $\pi \sqrt{(N - 1)/K}$ (in the Bonnet–Myers theorem) is attained. This is, however, possible only when $N = n$ by [144]. If we admit singularities, then the maximal diameter can be attained by spherical suspensions (with singularities at north and south poles).

The rigidity problem in the Finsler situation is yet to be investigated due to the less clear understanding of the splitting phenomenon (see Chap. 17 for a related discussion).

(c) We refer to [62] for an interesting result on almost rigidity (a quantitative stability estimate) for the Poincaré–Lichnerowicz inequality by means of the needle decomposition, in the context of essentially non-branching $\mathrm{CD}(K, N)$-spaces with $N \in (1, \infty)$ (see also Sect. 19.5).

Exercise 15.7 Give an eigenfunction on the Gaussian space $(\mathbb{R}, |\cdot|, m_K)$ for the first nonzero eigenvalue $K$ (recall $m_K(dx) = \sqrt{K/(2\pi)} \, e^{-Kx^2/2} \, dx$). Similarly, for $N < -1$, give an eigenfunction on $(\mathbb{R}, |\cdot|, \cosh^{N-1}(\sqrt{K/(1 - N)} \, x) \, dx)$ corresponding to the first nonzero eigenvalue $KN/(N - 1)$.

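15.3 The Key Estimate

We next prove a kind of gradient estimate as a key ingredient, by an argument somewhat similar to the $L^1$-gradient estimate (Theorem 14.4). Define
\[ \varphi(c) := \frac{1}{\sqrt{2\pi}} \int_{-\infty}^c e^{-b^2/2} \, db \quad \text{for } c \in \mathbb{R}, \qquad \mathcal{N}(\theta) := \varphi' \circ \varphi^{-1}(\theta) = \frac{1}{\sqrt{2\pi}} e^{-\varphi^{-1}(\theta)^2/2} \quad \text{for } \theta \in (0, 1). \]
Set also $\mathcal{N}(0) = \mathcal{N}(1) = 0$. Observe that, for $\theta \in (0, 1)$,
\[ \mathcal{N}''(\theta) = \frac{d}{d\theta}\biggl[ -\frac{\varphi^{-1}(\theta)}{\sqrt{2\pi}} e^{-\varphi^{-1}(\theta)^2/2} \cdot \frac{1}{\varphi'(\varphi^{-1}(\theta))} \biggr] = -\frac{d}{d\theta}\bigl[ \varphi^{-1}(\theta) \bigr] = -\frac{1}{\mathcal{N}(\theta)}. \tag{15.10} \]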
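The identity (15.10), read as $\mathcal{N}'' = -1/\mathcal{N}$ for the Gaussian profile $\mathcal{N} = \varphi' \circ \varphi^{-1}$, can be checked by a second-order finite difference. The sketch below relies on `statistics.NormalDist` from the Python standard library for $\varphi$ and $\varphi^{-1}$. The names `profile` and `second_difference` and the step size are choices made for this illustration.

```python
from statistics import NormalDist

STANDARD_GAUSSIAN = NormalDist()  # mean 0, standard deviation 1


def profile(theta):
    """N(theta) = phi'(phi^{-1}(theta)) for 0 < theta < 1."""
    return STANDARD_GAUSSIAN.pdf(STANDARD_GAUSSIAN.inv_cdf(theta))


def second_difference(theta, h=1e-3):
    """Central second difference approximating N''(theta)."""
    return (profile(theta + h) - 2.0 * profile(theta) + profile(theta - h)) / (h * h)
```

If (15.10) holds, then `second_difference(theta) * profile(theta)` should be close to $-1$ for every $\theta \in (0, 1)$ away from the endpoints, up to an error of order $h^2$.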

Proposition 15.8 (Key Estimate) Assume that (M, F, m) is compact and satisfies Ric∞ ≥ K for some K ∈ R. Then, given any global solution (ut )t≥0 to the heat equation ∂t u = u such that u0 ∈ H 2 (M) and 0 ≤ u0 ≤ 1 almost everywhere, we have   ∇u 2 2 2 2 N (ut ) + αF (∇ut ) ≤ P0,t N (u0 ) + cα (t)F (∇u0 ) (15.11) for all α ≥ 0 and t > 0, where we set cα (t) :=

1 − e−2Kt + αe−2Kt > 0 K

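for $K \neq 0$ and $c_\alpha(t) := 2t + \alpha$ for $K = 0$. For brevity, we suppressed the dependence of $c_\alpha$ on $K$.

Proof Recall from Lemma 13.13 that the hypothesis $0 \le u_0 \le 1$ implies $0 \le u_t \le 1$ for all $t > 0$, and hence $\mathcal{N}(u_t)$ makes sense. Moreover, by modifying $u_0$ into $(1 - 2\varepsilon)u_0 + \varepsilon$ and then letting $\varepsilon \to 0$, we can assume that $\varepsilon \le u_0 \le 1 - \varepsilon$ holds for some $\varepsilon > 0$ without loss of generality. Note also that both sides of (15.11) are Hölder continuous by Theorem 13.18 and Proposition 13.20(ii). We fix $t > 0$ and consider the function

$$\zeta_s := \sqrt{\mathcal{N}^2(u_s) + c_\alpha(t - s) F^2(\nabla u_s)}, \quad 0 \le s \le t.$$

Note that $\zeta_s \ge \mathcal{N}(\varepsilon)$ since $\varepsilon \le u_s \le 1 - \varepsilon$, and $\zeta_s \in H^1(M)$ by Exercise 12.17. Moreover, $d\zeta_s = 0$ almost everywhere on $M \setminus M_{u_s}$ by Lemma 12.12. One can rewrite (15.11) as $\zeta_t \le P^{\nabla u}_{0,t}(\zeta_0)$, therefore it is sufficient to show that $\partial_s[P^{\nabla u}_{s,t}(\zeta_s)] \le 0$ holds in the weak sense on $(0,t) \times M$. To this end, for any $\phi \in C_c^\infty((0,t) \times M)$, we deduce from (13.19), (13.18) and the linearity of $\widehat{P}^{\nabla u}_{s,t}$ that

$$\begin{aligned}
\int_0^t \int_M \phi_s \cdot \partial_s\big[P^{\nabla u}_{s,t}(\zeta_s)\big] \,dm\,ds
&= -\int_0^t \int_M \partial_s \phi_s \cdot P^{\nabla u}_{s,t}(\zeta_s) \,dm\,ds \\
&= -\int_0^t \int_M \widehat{P}^{\nabla u}_{s,t}(\partial_s \phi_s) \cdot \zeta_s \,dm\,ds \\
&= -\int_0^t \int_M \Big\{ \partial_s\big[\widehat{P}^{\nabla u}_{s,t}(\phi_s)\big] + \Delta^{V_s}\big[\widehat{P}^{\nabla u}_{s,t}(\phi_s)\big] \Big\} \cdot \zeta_s \,dm\,ds \\
&= \int_0^t \int_M \widehat{P}^{\nabla u}_{s,t}(\phi_s) \cdot \big(\partial_s \zeta_s - \Delta^{\nabla u_s} \zeta_s\big) \,dm\,ds.
\end{aligned}$$

To be precise, the third equality follows from

$$\begin{aligned}
\frac{\widehat{P}^{\nabla u}_{s+\delta,t}(\phi_{s+\delta}) - \widehat{P}^{\nabla u}_{s,t}(\phi_s)}{\delta}
&= \frac{\widehat{P}^{\nabla u}_{s+\delta,t}(\phi_{s+\delta} - \phi_s)}{\delta} + \frac{\widehat{P}^{\nabla u}_{s+\delta,t}(\phi_s) - \widehat{P}^{\nabla u}_{s,t}(\phi_s)}{\delta} \\
&= \widehat{P}^{\nabla u}_{s+\delta,t}\Big( \frac{\phi_{s+\delta} - \phi_s}{\delta} - \partial_s \phi_s \Big) + \widehat{P}^{\nabla u}_{s+\delta,t}(\partial_s \phi_s) + \frac{\widehat{P}^{\nabla u}_{s+\delta,t}(\phi_s) - \widehat{P}^{\nabla u}_{s,t}(\phi_s)}{\delta} \\
&\to \widehat{P}^{\nabla u}_{s,t}(\partial_s \phi_s) - \Delta^{V_s}\big[\widehat{P}^{\nabla u}_{s,t}(\phi_s)\big]
\end{aligned}$$

as $\delta \to 0$ by the contractivity (13.17) and the $L^2$-continuity of $(\widehat{P}^{\nabla u}_{s,t})_{s \in [0,t]}$. Hence our goal is to prove $\Delta^{\nabla u_s} \zeta_s - \partial_s \zeta_s \ge 0$ in the weak sense on $(0,t) \times M$. All the calculations below will be understood in such a weak sense (with the help of Lemma 12.12).

Observe from $c_\alpha'(t) = -2Kc_\alpha(t) + 2$ and Lemma 14.1 that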

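$$\partial_s \zeta_s = \frac{1}{\zeta_s} \Big\{ \mathcal{N}(u_s)\mathcal{N}'(u_s)\Delta u_s + \big(Kc_\alpha(t-s) - 1\big) F^2(\nabla u_s) + c_\alpha(t-s)\, d(\Delta u_s)(\nabla u_s) \Big\}.$$

We also find

$$\nabla^{\nabla u_s} \zeta_s = \frac{1}{\zeta_s} \Big\{ \mathcal{N}(u_s)\mathcal{N}'(u_s) \nabla u_s + \frac{c_\alpha(t-s)}{2} \nabla^{\nabla u_s}\big[F^2(\nabla u_s)\big] \Big\}.$$

Hence we have

$$\begin{aligned}
\Delta^{\nabla u_s} \zeta_s
&= \frac{\mathcal{N}(u_s)\mathcal{N}'(u_s)}{\zeta_s} \Delta u_s + \frac{\mathcal{N}'(u_s)^2 - 1}{\zeta_s} F^2(\nabla u_s) - \frac{\mathcal{N}(u_s)\mathcal{N}'(u_s)}{\zeta_s^2}\, d\zeta_s(\nabla u_s) \\
&\quad + \frac{c_\alpha(t-s)}{2\zeta_s} \Delta^{\nabla u_s}\big[F^2(\nabla u_s)\big] - \frac{c_\alpha(t-s)}{2\zeta_s^2}\, d\zeta_s\Big( \nabla^{\nabla u_s}\big[F^2(\nabla u_s)\big] \Big),
\end{aligned}$$

where we used $\mathcal{N}'' = -1/\mathcal{N}$ in (15.10). Now we apply the improved Bochner inequality (Corollary 12.16) to obtain

$$\begin{aligned}
\Delta^{\nabla u_s} \zeta_s - \partial_s \zeta_s
&= \frac{\mathcal{N}'(u_s)^2 - Kc_\alpha(t-s)}{\zeta_s} F^2(\nabla u_s) + \frac{c_\alpha(t-s)}{\zeta_s} \Big\{ \Delta^{\nabla u_s}\Big[\frac{F^2(\nabla u_s)}{2}\Big] - d(\Delta u_s)(\nabla u_s) \Big\} \\
&\quad - \frac{\mathcal{N}(u_s)\mathcal{N}'(u_s)}{\zeta_s^2}\, d\zeta_s(\nabla u_s) - \frac{c_\alpha(t-s)}{2\zeta_s^2}\, d\zeta_s\Big( \nabla^{\nabla u_s}\big[F^2(\nabla u_s)\big] \Big) \\
&\ge \frac{\mathcal{N}'(u_s)^2}{\zeta_s} F^2(\nabla u_s) + \frac{c_\alpha(t-s)}{\zeta_s}\, d\big[F(\nabla u_s)\big]\Big( \nabla^{\nabla u_s}\big[F(\nabla u_s)\big] \Big) \\
&\quad - \frac{\mathcal{N}(u_s)\mathcal{N}'(u_s)}{\zeta_s^2}\, d\zeta_s(\nabla u_s) - \frac{c_\alpha(t-s)}{2\zeta_s^2}\, d\zeta_s\Big( \nabla^{\nabla u_s}\big[F^2(\nabla u_s)\big] \Big).
\end{aligned}$$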

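Substituting

$$d\zeta_s = \frac{1}{\zeta_s} \Big\{ \mathcal{N}(u_s)\mathcal{N}'(u_s)\, du_s + \frac{c_\alpha(t-s)}{2}\, d\big[F^2(\nabla u_s)\big] \Big\}$$

and recalling (11.17) yields

$$\begin{aligned}
\Delta^{\nabla u_s} \zeta_s - \partial_s \zeta_s
&\ge \frac{\zeta_s^2 \mathcal{N}'(u_s)^2 - \mathcal{N}^2(u_s)\mathcal{N}'(u_s)^2}{\zeta_s^3} F^2(\nabla u_s) - \frac{c_\alpha(t-s)\mathcal{N}(u_s)\mathcal{N}'(u_s)}{\zeta_s^3}\, du_s\Big( \nabla^{\nabla u_s}\big[F^2(\nabla u_s)\big] \Big) \\
&\quad + \frac{c_\alpha(t-s)}{\zeta_s^3} \big( \zeta_s^2 - c_\alpha(t-s) F^2(\nabla u_s) \big)\, d\big[F(\nabla u_s)\big]\Big( \nabla^{\nabla u_s}\big[F(\nabla u_s)\big] \Big) \\
&= \frac{c_\alpha(t-s)\mathcal{N}'(u_s)^2}{\zeta_s^3} F^4(\nabla u_s) - \frac{c_\alpha(t-s)\mathcal{N}(u_s)\mathcal{N}'(u_s)}{\zeta_s^3}\, du_s\Big( \nabla^{\nabla u_s}\big[F^2(\nabla u_s)\big] \Big) \\
&\quad + \frac{c_\alpha(t-s)\mathcal{N}^2(u_s)}{\zeta_s^3}\, d\big[F(\nabla u_s)\big]\Big( \nabla^{\nabla u_s}\big[F(\nabla u_s)\big] \Big).
\end{aligned}$$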

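Since the Cauchy–Schwarz inequality for $g_{\nabla u_s}$ implies

$$\Big| du_s\Big( \nabla^{\nabla u_s}\big[F^2(\nabla u_s)\big] \Big) \Big| \le F(\nabla u_s) \sqrt{ d\big[F^2(\nabla u_s)\big]\Big( \nabla^{\nabla u_s}\big[F^2(\nabla u_s)\big] \Big) } = 2F^2(\nabla u_s) \sqrt{ d\big[F(\nabla u_s)\big]\Big( \nabla^{\nabla u_s}\big[F(\nabla u_s)\big] \Big) },$$

we conclude that

$$\begin{aligned}
\Delta^{\nabla u_s} \zeta_s - \partial_s \zeta_s
&\ge \frac{c_\alpha(t-s)\mathcal{N}'(u_s)^2}{\zeta_s^3} F^4(\nabla u_s) - \frac{2c_\alpha(t-s)\mathcal{N}(u_s)|\mathcal{N}'(u_s)|}{\zeta_s^3} F^2(\nabla u_s) \sqrt{ d\big[F(\nabla u_s)\big]\Big( \nabla^{\nabla u_s}\big[F(\nabla u_s)\big] \Big) } \\
&\quad + \frac{c_\alpha(t-s)\mathcal{N}^2(u_s)}{\zeta_s^3}\, d\big[F(\nabla u_s)\big]\Big( \nabla^{\nabla u_s}\big[F(\nabla u_s)\big] \Big) \\
&= \frac{c_\alpha(t-s)}{\zeta_s^3} \bigg( |\mathcal{N}'(u_s)| F^2(\nabla u_s) - \mathcal{N}(u_s) \sqrt{ d\big[F(\nabla u_s)\big]\Big( \nabla^{\nabla u_s}\big[F(\nabla u_s)\big] \Big) } \bigg)^2 \\
&\ge 0
\end{aligned}$$

in the weak sense. Therefore we have (15.11) almost everywhere, and then it holds everywhere thanks to the Hölder continuity. This completes the proof.

When $K > 0$, choosing $\alpha = K^{-1}$ and letting $t \to \infty$ in (15.11), we obtain the following functional inequality.

Corollary 15.9 Assume that $(M, F, m)$ is compact and satisfies $m(M) = 1$ and $\mathrm{Ric}_\infty \ge K$ for some $K > 0$. Then, for any $u \in H^2(M)$ such that $0 \le u \le 1$ almost everywhere, we have

$$\sqrt{K}\, \mathcal{N}\Big( \int_M u \,dm \Big) \le \int_M \sqrt{K\mathcal{N}^2(u) + F^2(\nabla u)} \,dm. \tag{15.12}$$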

Proof Let $(u_t)_{t \ge 0}$ be the global solution to the heat equation with $u_0 = u$. Taking $\alpha = K^{-1}$, we find $c_\alpha \equiv K^{-1}$ and then (15.11) implies

$$\sqrt{K}\, \mathcal{N}(u_t) \le \sqrt{K\mathcal{N}^2(u_t) + F^2(\nabla u_t)} \le P^{\nabla u}_{0,t}\Big( \sqrt{K\mathcal{N}^2(u) + F^2(\nabla u)} \Big).$$

As $t \to \infty$, (15.4) and Proposition 15.5 show the convergence to the constant functions,

$$u_t \to \int_M u \,dm, \qquad P^{\nabla u}_{0,t}\Big( \sqrt{K\mathcal{N}^2(u) + F^2(\nabla u)} \Big) \to \int_M \sqrt{K\mathcal{N}^2(u) + F^2(\nabla u)} \,dm,$$

respectively, in $L^2(M)$. Hence we obtain (15.12).

Alternatively, taking $\alpha = 0$ in (15.11) implies

$$\mathcal{N}(u_t) \le P^{\nabla u}_{0,t}\Big( \sqrt{\mathcal{N}^2(u) + c_0(t) F^2(\nabla u)} \Big).$$

If $K > 0$, then $\lim_{t \to \infty} c_0(t) = K^{-1}$ and hence we can obtain (15.12) as the limit of $t \to \infty$.

It follows from (15.12) and $\sqrt{K\mathcal{N}^2(u) + F^2(\nabla u)} \le \sqrt{K}\,\mathcal{N}(u) + F(\nabla u)$ that

$$\sqrt{K} \bigg\{ \mathcal{N}\Big( \int_M u \,dm \Big) - \int_M \mathcal{N}(u) \,dm \bigg\} \le \int_M F(\nabla u) \,dm. \tag{15.13}$$

This is a functional form of the Gaussian isoperimetric inequality studied by Bobkov [37, 38] in the Gaussian spaces. Bakry–Ledoux’s work [23] was strongly influenced by these works of Bobkov.
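The elementary properties of $c_\alpha$ used in this section (the differential equation $c_\alpha' = -2Kc_\alpha + 2$, the fact that $c_\alpha \equiv K^{-1}$ when $\alpha = K^{-1}$, and $c_0(t) \to K^{-1}$ for $K > 0$) can be checked numerically. Below is a minimal sketch; the function names `c_alpha` and `ode_residual` and the sample parameters are hypothetical and only serve the illustration.

```python
import math

def c_alpha(t: float, K: float, alpha: float) -> float:
    """c_alpha(t) as defined in Proposition 15.8 (two cases: K != 0 and K == 0)."""
    if K == 0:
        return 2 * t + alpha
    return (1 - math.exp(-2 * K * t)) / K + alpha * math.exp(-2 * K * t)

def ode_residual(t: float, K: float, alpha: float, h: float = 1e-6) -> float:
    """c_alpha'(t) - (-2 K c_alpha(t) + 2), with c_alpha' from a central difference."""
    derivative = (c_alpha(t + h, K, alpha) - c_alpha(t - h, K, alpha)) / (2 * h)
    return derivative - (-2 * K * c_alpha(t, K, alpha) + 2)

for K, alpha in ((1.5, 0.3), (-0.5, 2.0), (0.0, 1.0)):
    print(K, alpha, ode_residual(0.7, K, alpha))   # residuals close to zero

print(c_alpha(3.0, 2.0, 0.5))    # alpha = 1/K: the value stays at 1/K
print(c_alpha(50.0, 2.0, 0.0))   # alpha = 0, large t: the value approaches 1/K
```

The second printed line reflects why $\alpha = K^{-1}$ is the convenient choice in Corollary 15.9: the time-dependent coefficient on the right-hand side of (15.11) then does not move at all.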

15.4 Proof of Theorem 15.1

Now, we can prove Theorem 15.1 by applying (15.12) (or (15.13)) to functions approximating the characteristic function of a given set. This is a standard strategy for showing an isoperimetric inequality from a functional inequality.

Proof Let $\theta \in (0,1)$. Fix an arbitrary Borel set $A \subset M$ with $m(A) = \theta$ and consider the function

$$u^\varepsilon(x) := \max\big\{ 1 - \varepsilon^{-1} d(x, A),\, 0 \big\}, \quad \varepsilon > 0,$$

where we set $d(x, A) := \inf_{y \in A} d(x, y)$. Notice that $u^\varepsilon = 1$ on $A$ and $u^\varepsilon = 0$ on $M \setminus B^-(A, \varepsilon)$, where $B^-(A, \varepsilon) := \{x \in M \mid d(x, A) < \varepsilon\}$ is the backward $\varepsilon$-neighborhood of $A$. Moreover, it follows from the triangle inequality that $u^\varepsilon$ is $\varepsilon^{-1}$-Lipschitz, i.e., $u^\varepsilon(y) - u^\varepsilon(x) \le \varepsilon^{-1} d(x, y)$ for all $x, y \in M$ (recall Exercises 2.16 and 3.6). Hence we have $F(\nabla u^\varepsilon) \le \varepsilon^{-1}$ almost everywhere on $B^-(A, \varepsilon) \setminus A$. Thus, applying (15.12) to (smooth approximations of) $u^\varepsilon$ shows, with the help of $\mathcal{N}(0) = \mathcal{N}(1) = 0$ and $0 \le \mathcal{N} \le 1/\sqrt{2\pi}$,

$$\sqrt{K}\, \mathcal{N}\Big( \int_M u^\varepsilon \,dm \Big) \le \sqrt{ \frac{K}{2\pi} + \frac{1}{\varepsilon^2} } \cdot m\big( B^-(A, \varepsilon) \setminus A \big).$$

Letting $\varepsilon \to 0$, we find

$$\sqrt{K}\, \mathcal{N}(\theta) \le \liminf_{\varepsilon \to 0} \frac{m(B^-(A, \varepsilon) \setminus A)}{\varepsilon}. \tag{15.14}$$

Now, put $c := \varphi^{-1}(\theta)/\sqrt{K}$ and observe from the definitions of $\varphi$ and $\mathcal{N}$ in the previous section that

$$\sqrt{K}\, \mathcal{N}(\theta) = \sqrt{\frac{K}{2\pi}}\, e^{-Kc^2/2}, \qquad \theta = \frac{1}{\sqrt{2\pi}} \int_{-\infty}^{\sqrt{K}c} e^{-b^2/2} \,db = \sqrt{\frac{K}{2\pi}} \int_{-\infty}^{c} e^{-Ka^2/2} \,da.$$

Therefore (15.14) implies the desired isoperimetric inequality (15.3) but for the reverse Finsler structure $\overleftarrow{F}$ (due to the use of $B^-(A, \varepsilon)$ instead of $B^+(A, \varepsilon)$). Finally, because the curvature bound $\mathrm{Ric}_\infty \ge K$ is common between $F$ and $\overleftarrow{F}$, we also obtain (15.3) for $F$. This completes the proof.

Similarly to the gradient estimates (recall Remark 14.10), the noncompact case in the Finsler context is yet to be developed (see also [204]). Finally we mention some rigidity results, which can be compared with the rigidity results for the Poincaré–Lichnerowicz inequality in Remark 15.6.

Remark 15.10 (Rigidity Results and Quantitative Estimates)

(a) For weighted Riemannian manifolds, equality in the isoperimetric inequality (15.3) forces $(M, g, m)$ to isometrically split off the 1-dimensional Gaussian space $(\mathbb{R}, |\cdot|, \sqrt{K/(2\pi)}\, e^{-Kx^2/2}\,dx)$ (see [184, Theorem 18.7]). This is a similar phenomenon to the Poincaré inequality (recall Remark 15.6(b)). For the Lévy–Gromov-type isoperimetric inequalities with $K > 0$ and $N \in (-\infty, -1) \cup [n, \infty)$, we also have rigidity results in the same form as Remark 15.6(b) (see [59] for the generalized setting of $\mathrm{RCD}(K,N)$-spaces with $N \in [2, \infty)$, and [165] for the case of $N < -1$).

(b) Besides rigidity, quantitative stability estimates (in other words, almost rigidity) for isoperimetric inequalities have attracted growing interest recently. For example, for the Gaussian spaces $(\mathbb{R}^n, |\cdot|, (K/(2\pi))^{n/2} e^{-K|x|^2/2}\,dx)$, a quantitative isoperimetric inequality (in a sharp, dimension-free form) was established in [28, 92] (see also [79, 185]). Here a quantitative isoperimetric inequality means that, if $m^+(A)$ is close to $I_K(m(A))$, then the difference between $A$ and some half-space (which is an isoperimetric minimizer in the Gaussian case) is necessarily small in some quantitative way (depending on the deficit $m^+(A) - I_K(m(A)) \ge 0$ in the isoperimetric inequality).
For weighted Riemannian manifolds as well as reversible measured Finsler manifolds, the needle decomposition has turned out to be a fine tool for this problem as well, and has given some quantitative estimates. We refer to [58] for the case of $N \in [n, \infty)$ (in the general framework of essentially non-branching $\mathrm{CD}(K,N)$-spaces; recall also Remark 15.6(c)), and to [166] for the case of $N = \infty$ (see also Sect. 19.5).
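As a small illustration of why half-spaces are the natural comparison sets in the Gaussian case, one can check numerically that a half-line $(-\infty, c]$ in the 1-dimensional Gaussian space with variance $1/K$ has boundary density exactly $\sqrt{K}$ times the Gaussian profile evaluated at its mass, as in the last step of the proof of Theorem 15.1. This is a sketch under the assumption that the profile in (15.3) is $\sqrt{K}$ times the function defined in Sect. 15.3; the helper names are ours.

```python
import math
from statistics import NormalDist

STD = NormalDist()  # standard Gaussian, used for phi and phi^{-1}

def profile(theta: float, K: float) -> float:
    """sqrt(K) * N(theta), with N(theta) = phi'(phi^{-1}(theta))."""
    return math.sqrt(K) * STD.pdf(STD.inv_cdf(theta))

def half_line(c: float, K: float) -> tuple[float, float]:
    """Mass of (-inf, c] and density at c for the Gaussian measure of variance 1/K."""
    m_K = NormalDist(mu=0.0, sigma=1.0 / math.sqrt(K))
    return m_K.cdf(c), m_K.pdf(c)

for K, c in ((1.0, 0.0), (2.5, 0.4), (0.3, -1.2)):
    theta, boundary_density = half_line(c, K)
    print(K, c, theta, boundary_density, profile(theta, K))  # last two columns agree
```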

Chapter 16

Functional Inequalities

In this last chapter of Part II, we make full use of the $\Gamma$-calculus technique to establish important functional inequalities. We have already shown the Poincaré–Lichnerowicz inequality in the previous chapter (Theorem 15.4). In this chapter we further obtain the logarithmic Sobolev inequality (Theorem 16.4), the Beckner inequality (Theorem 16.8), and the Sobolev inequality (Theorem 16.17). We will closely follow the arguments in the linear (Riemannian) case as in [21, 105] and generalize them to our nonlinear (Finsler) setting along [201]. All the estimates will be of the same forms as the linear case, except for the range of the exponent $p$ adopted in the Sobolev inequality in the non-reversible case (see Remark 16.18). We remark that, in contrast with the gradient estimates in Chaps. 14 and 15, linearized heat semigroups will play only a subsidiary role.

We will sometimes use the following common notation for brevity (called the $\Gamma_2$-operator):

$$\Gamma_2(u) := \Delta^{\nabla u}\Big[ \frac{F^2(\nabla u)}{2} \Big] - d(\Delta u)(\nabla u). \tag{16.1}$$

Then the Bochner inequality (12.8) under $\mathrm{Ric}_N \ge K$ can be written in a shorthand way as

$$\Gamma_2(u) \ge KF^2(\nabla u) + \frac{(\Delta u)^2}{N}$$

and is also called the $\Gamma_2$-criterion (recall Remark 9.10(b)).

© The Author(s), under exclusive license to Springer Nature Switzerland AG 2021 S.-i. Ohta, Comparison Finsler Geometry, Springer Monographs in Mathematics, https://doi.org/10.1007/978-3-030-80650-7_16


16.1 Logarithmic Sobolev Inequality

First we study the logarithmic Sobolev inequality, which plays an important role in infinite-dimensional analysis and probability theory. Among others, its equivalence with hypercontractivity is fundamental in semigroup theory; we refer to [21, Chap. 5] for further information. From the geometric viewpoint, the logarithmic Sobolev inequality plays a crucial role in the study of entropy in Ricci flow theory; a notable example is Perelman's no local collapsing theorem (see [215] and, e.g., [77, Chap. 6], [78, Chap. 5]).

In the logarithmic Sobolev inequality of this section (Theorem 16.4), we will assume that $(M, F, m)$ is (forward or backward) complete and satisfies $\mathrm{Ric}_N \ge K$ for some $K > 0$ and $N \in [n, \infty)$ (see Remark 16.5 and Theorem 18.10 for the case of $N = \infty$). Then $M$ is compact (Corollary 9.17), and we will normalize $m$ as $m(M) = 1$ without loss of generality. Recall that the Poincaré–Lichnerowicz inequality allowed $N$ to be negative (Theorem 15.4). This is, however, not the case for the logarithmic Sobolev inequality (see Remark 16.6, and observe that the admissible range (16.17) of the Beckner inequality does not include $p = 2$ if $N < -2$).

16.1.1 Entropy Decay

We begin with an auxiliary lemma on the decay of entropy. This can be compared with the decay of the variance (Proposition 15.5) in the study of the Poincaré–Lichnerowicz inequality. For a nonnegative function $f \in L^1(M)$ with $\int_M f \,dm = 1$, define the relative entropy with respect to $m$ as

$$\mathrm{Ent}_m(fm) := \int_M f \log f \,dm. \tag{16.2}$$

Observe that $\lim_{t \downarrow 0} t \log t = 0$ holds and the function $t \log t$ is convex in $t \in (0, \infty)$. One can regard $\mathrm{Ent}_m(fm)$ as measuring the difference between the probability measures $m$ and $fm$. We indeed find the following (from Jensen's inequality).

Exercise 16.1 Show that $\mathrm{Ent}_m(fm) \ge 0$ holds, and that equality holds if and only if $f = 1$ almost everywhere.

When $f$ is not normalized as $\int_M f \,dm = 1$, we generalize the definition as

$$\mathrm{Ent}_m(fm) := \int_M f \log f \,dm - \int_M f \,dm \cdot \log\Big( \int_M f \,dm \Big),$$

which is also nonnegative. Setting $\bar{f} := (\int_M f \,dm)^{-1} \cdot f$ (provided $\int_M f \,dm > 0$), we find that

$$\mathrm{Ent}_m(fm) = \int_M f \,dm \cdot \mathrm{Ent}_m(\bar{f} m). \tag{16.3}$$
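The scaling identity (16.3) and the nonnegativity of the generalized entropy can be illustrated on a toy model. The sketch below replaces $(M, m)$ by a finite probability space with three atoms, which is purely our assumption for the sake of a runnable example; the weights and the values of $f$ are arbitrary.

```python
import math

def entropy(f, m):
    """Generalized relative entropy Ent_m(f m) on a finite probability space.

    f: nonnegative values at the atoms, m: probability weights of the atoms.
    """
    mass = sum(fi * mi for fi, mi in zip(f, m))
    integral = sum(fi * math.log(fi) * mi for fi, mi in zip(f, m) if fi > 0)
    return integral - mass * math.log(mass)

m = [0.2, 0.5, 0.3]                 # hypothetical probability weights
f = [3.0, 0.5, 1.5]                 # hypothetical nonnegative density (not normalized)
mass = sum(fi * mi for fi, mi in zip(f, m))
f_bar = [fi / mass for fi in f]     # normalized density, total mass one

print(entropy(f, m), mass * entropy(f_bar, m))   # the two sides of (16.3)
print(entropy([1.0, 1.0, 1.0], m))               # vanishes for f = 1
```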

Lemma 16.2 (Entropy Decay) Assume that $(M, F, m)$ is compact and satisfies $m(M) = 1$. Let $f \in L^2(M)$ be a nonnegative function with $\int_M f \,dm = 1$, and $(u_t)_{t \ge 0}$ be the global solution to the heat equation $\partial_t u = \Delta u$ with $u_0 = f$. Then we have $\lim_{t \to \infty} \mathrm{Ent}_m(u_t m) = 0$.

Recall that the heat semigroup extends to $L^2(M)$ (see Lemma 13.11).

Proof It follows from Lemma 13.13 and the mass conservation (13.13) that $u_t \ge 0$ and $\int_M u_t \,dm = 1$ for all $t > 0$. Since

$$s \log s \le (s - 1) + \frac{1}{2}(s - 1)^2$$

for $s \ge 1$, we have

$$0 \le \mathrm{Ent}_m(u_t m) \le \int_{\{u_t \ge 1\}} u_t \log u_t \,dm \le \int_{\{u_t \ge 1\}} \Big\{ (u_t - 1) + \frac{1}{2}(u_t - 1)^2 \Big\} \,dm \le \|u_t - 1\|_{L^1} + \frac{1}{2}\|u_t - 1\|_{L^2}^2.$$

Letting $t \to \infty$, we deduce from the ergodicity (Proposition 13.16(iii)) that $\mathrm{Ent}_m(u_t m) \to 0$.

16.1.2 Logarithmic Sobolev Inequality

The strategy of the proof of the logarithmic Sobolev inequality is essentially similar to that of the Poincaré–Lichnerowicz inequality. We shall employ the relative entropy $\Psi(t) = \mathrm{Ent}_m(u_t m)$ instead of the squared $L^2$-norm $\Phi(t) = \|u_t\|_{L^2}^2$ in the proof of Theorem 15.4. In order to compare $\Psi'(t)$ and $\Psi''(t)$, however, we need a longer calculation than that for $\Phi(t)$. Thus first we give a sufficient condition (corresponding to (15.6)) for obtaining the logarithmic Sobolev inequality, and then we show that the sufficient condition follows from $\mathrm{Ric}_N \ge K$.

Proposition 16.3 (A Sufficient Condition) Assume that $(M, F, m)$ is compact and satisfies $m(M) = 1$ and

$$\int_M \frac{F^2(\nabla u)}{u} \,dm \le -C \int_M \bigg\{ du\bigg( \nabla^{\nabla u}\Big[ \frac{F^2(\nabla u)}{2u^2} \Big] \bigg) + u \cdot d(\Delta[\log u])(\nabla[\log u]) \bigg\} \,dm \tag{16.4}$$

for some constant $C > 0$ and all positive functions $u \in H^2(M) \cap C^1(M)$ such that $\Delta u \in H^1(M)$. Then the logarithmic Sobolev inequality

$$\int_M f \log f \,dm \le \frac{C}{2} \int_M \frac{F^2(\nabla f)}{f} \,dm \tag{16.5}$$

holds for all nonnegative functions $f \in H^1(M)$ with $\int_M f \,dm = 1$.

By using $\Gamma_2$ introduced in (16.1), the assumed inequality (16.4) can be written as

$$\int_M u \cdot F^2(\nabla[\log u]) \,dm \le C \int_M u \cdot \Gamma_2(\log u) \,dm. \tag{16.6}$$

Note that both sides are well-defined thanks to $\inf_M u > 0$. In the right-hand side of (16.5) appears the Fisher information of the probability measure $fm$ with respect to $m$:

$$\int_M \frac{F^2(\nabla f)}{f} \,dm = 4 \int_M F^2\big( \nabla\big[\sqrt{f}\big] \big) \,dm, \tag{16.7}$$

where we set $F^2(\nabla f)/f = 0$ on $f^{-1}(0)$ (almost everywhere). If $f$ is nonnegative but not normalized as $\int_M f \,dm = 1$, then the logarithmic Sobolev inequality (16.5) is modified into the scaling invariant form (recall (16.3)):

$$\int_M f \log f \,dm - \int_M f \,dm \cdot \log\Big( \int_M f \,dm \Big) \le \frac{C}{2} \int_M \frac{F^2(\nabla f)}{f} \,dm. \tag{16.8}$$

Proof By a truncation argument, one can assume that $f \ge \varepsilon$ holds for some $\varepsilon > 0$. Then the solution $(u_t)_{t \ge 0}$ to the heat equation with $u_0 = f$ also satisfies $u_t \ge \varepsilon$ for all $t > 0$ (Lemma 13.13). Moreover, we know that $u_t \in H^2(M) \cap C^1(M)$ and $\Delta u_t \in H^1(M)$ for $t > 0$ by Theorem 13.18. Hence we can apply the hypothesis (16.4) to $u_t$, and then the claim follows from a similar argument to the proof of Theorem 15.4 by considering $\Psi(t) := \mathrm{Ent}_m(u_t m)$. Observe that

$$\Psi'(t) = \int_M (\log u_t + 1) \Delta u_t \,dm = -\int_M d[\log u_t](\nabla u_t) \,dm = -\int_M \frac{F^2(\nabla u_t)}{u_t} \,dm.$$

Moreover, it follows from Lemma 14.1 and (11.17) that

$$\begin{aligned}
\Psi''(t) &= \int_M \Big\{ \frac{\Delta u_t}{u_t^2} F^2(\nabla u_t) - \frac{2}{u_t}\, d(\Delta u_t)(\nabla u_t) \Big\} \,dm \\
&= -\int_M du_t\bigg( \nabla^{\nabla u_t}\Big[ \frac{F^2(\nabla u_t)}{u_t^2} \Big] \bigg) \,dm - 2\int_M d(\Delta u_t)(\nabla[\log u_t]) \,dm.
\end{aligned}$$

Since

$$\int_M d(\Delta u_t)(\nabla[\log u_t]) \,dm = -\int_M \Delta[\log u_t]\, \Delta u_t \,dm = \int_M d(\Delta[\log u_t])(\nabla u_t) \,dm = \int_M u_t \cdot d(\Delta[\log u_t])(\nabla[\log u_t]) \,dm,$$

the supposed inequality (16.4) implies $-\Psi'(t) \le (C/2)\Psi''(t)$. We deduce from Lemma 16.2, $\lim_{t \to \infty} E(u_t) = 0$ (Proposition 13.16(i)) and $u_t \ge \varepsilon$ that $\lim_{t \to \infty} \Psi(t) = \lim_{t \to \infty} \Psi'(t) = 0$. Therefore we obtain

$$\int_M f \log f \,dm = -\int_0^\infty \Psi'(t) \,dt \le \frac{C}{2} \int_0^\infty \Psi''(t) \,dt = -\frac{C}{2} \Psi'(0).$$

This completes the proof.

We remark that the assertion (16.5) includes the situation that the right-hand side (Fisher information) is infinite. The left-hand side (relative entropy) is finite for $f \in L^2(M)$ by the estimate in the proof of Lemma 16.2.

Now we prove the logarithmic Sobolev inequality under a lower weighted Ricci curvature bound along the lines of [21, Theorem 5.7.4]. Some calculations in this proof will be used again in the proof of the Sobolev inequality.

Theorem 16.4 (Logarithmic Sobolev Inequality) Assume that $(M, F, m)$ is complete and satisfies $m(M) = 1$ and $\mathrm{Ric}_N \ge K$ for some $K > 0$ and $N \in [n, \infty)$. Then we have

$$\int_M f \log f \,dm \le \frac{N - 1}{2KN} \int_M \frac{F^2(\nabla f)}{f} \,dm \tag{16.9}$$

for all nonnegative functions $f \in H^1(M)$ with $\int_M f \,dm = 1$.

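Proof Fix $h \in C^\infty(M)$ and consider a positive function $e^{ah}$ for $a > 0$. We begin with some preliminary and useful calculations. Observe that, since $a > 0$,

$$\nabla(e^{ah}) = ae^{ah} \nabla h, \qquad \Delta(e^{ah}) = ae^{ah}\big\{ \Delta h + aF^2(\nabla h) \big\}. \tag{16.10}$$

Hence, on one hand, we find from (11.17) that

$$\begin{aligned}
\Gamma_2(e^{ah})
&= \Delta^{\nabla h}\Big[ \frac{a^2 e^{2ah} F^2(\nabla h)}{2} \Big] - a^2 e^{ah}\, d\Big( e^{ah}\big\{ \Delta h + aF^2(\nabla h) \big\} \Big)(\nabla h) \\
&= a^2 \operatorname{div}_m\bigg( e^{2ah} \nabla^{\nabla h}\Big[ \frac{F^2(\nabla h)}{2} \Big] + ae^{2ah} F^2(\nabla h) \nabla h \bigg) \\
&\quad - a^2 e^{2ah} \Big\{ a\big( \Delta h + aF^2(\nabla h) \big) F^2(\nabla h) + d(\Delta h)(\nabla h) + a\, d[F^2(\nabla h)](\nabla h) \Big\} \\
&= a^2 e^{2ah} \bigg\{ \Delta^{\nabla h}\Big[ \frac{F^2(\nabla h)}{2} \Big] + a\, d[F^2(\nabla h)](\nabla h) + a^2 F^4(\nabla h) - d(\Delta h)(\nabla h) \bigg\} \\
&= a^2 e^{2ah} \Big\{ \Gamma_2(h) + a\, d[F^2(\nabla h)](\nabla h) + a^2 F^4(\nabla h) \Big\}
\end{aligned} \tag{16.11}$$

in the weak sense (note that $\Gamma_2(e^{ah}) = 0$ almost everywhere on $M \setminus M_h$ by Lemma 12.12). On the other hand, it follows from the integration by parts that

$$\begin{aligned}
\int_M \Gamma_2(e^{ah}) \,dm = \|\Delta(e^{ah})\|_{L^2}^2
&= a^2 \int_M e^{2ah} \Big\{ (\Delta h)^2 + 2aF^2(\nabla h)\Delta h + a^2 F^4(\nabla h) \Big\} \,dm \\
&= a^2 \int_M e^{2ah} \Big\{ (\Delta h)^2 - 2a\, d[F^2(\nabla h)](\nabla h) - 3a^2 F^4(\nabla h) \Big\} \,dm.
\end{aligned}$$

Comparing this with (16.11), we have

$$\int_M e^{2ah} (\Delta h)^2 \,dm = \int_M e^{2ah} \Big\{ \Gamma_2(h) + 3a\, d[F^2(\nabla h)](\nabla h) + 4a^2 F^4(\nabla h) \Big\} \,dm. \tag{16.12}$$

Now, applying the Bochner inequality (12.13) to $e^{ah}$ and by (16.11) and (16.10), we deduce that

$$\begin{aligned}
\Gamma_2(h) + a\, d[F^2(\nabla h)](\nabla h) + a^2 F^4(\nabla h)
&= \frac{e^{-2ah}}{a^2} \Gamma_2(e^{ah}) \ge \frac{e^{-2ah}}{a^2} \bigg\{ KF^2\big( \nabla(e^{ah}) \big) + \frac{(\Delta(e^{ah}))^2}{N} \bigg\} \\
&= KF^2(\nabla h) + \frac{1}{N} \Big\{ (\Delta h)^2 + 2aF^2(\nabla h)\Delta h + a^2 F^4(\nabla h) \Big\}
\end{aligned} \tag{16.13}$$

in the weak sense. We shall multiply both sides by $e^h$ and integrate. Then the right-hand side is calculated with the help of (16.12) with $a = 1/2$ and the integration by parts as

$$\begin{aligned}
&\int_M e^h \bigg\{ KF^2(\nabla h) + \frac{1}{N}\Big[ (\Delta h)^2 + 2aF^2(\nabla h)\Delta h + a^2 F^4(\nabla h) \Big] \bigg\} \,dm \\
&= \int_M e^h \Big\{ KF^2(\nabla h) + \frac{a^2}{N} F^4(\nabla h) \Big\} \,dm + \frac{1}{N} \int_M e^h \Big\{ \Gamma_2(h) + \frac{3}{2}\, d[F^2(\nabla h)](\nabla h) + F^4(\nabla h) \Big\} \,dm \\
&\quad - \frac{2a}{N} \int_M e^h \Big\{ F^4(\nabla h) + d[F^2(\nabla h)](\nabla h) \Big\} \,dm \\
&= K \int_M e^h F^2(\nabla h) \,dm + \frac{1}{N} \int_M e^h \Big\{ \Gamma_2(h) + \Big( \frac{3}{2} - 2a \Big) d[F^2(\nabla h)](\nabla h) + (a - 1)^2 F^4(\nabla h) \Big\} \,dm.
\end{aligned}$$

Hence we obtain from (16.13) that

$$\begin{aligned}
\Big( 1 - \frac{1}{N} \Big) \int_M e^h \Gamma_2(h) \,dm
&\ge K \int_M e^h F^2(\nabla h) \,dm + \frac{3 - 2(N + 2)a}{2N} \int_M e^h\, d[F^2(\nabla h)](\nabla h) \,dm \\
&\quad + \frac{(a - 1)^2 - Na^2}{N} \int_M e^h F^4(\nabla h) \,dm.
\end{aligned} \tag{16.14}$$

Choosing $a = 3/(2(N + 2)) > 0$ yields

$$\begin{aligned}
\frac{N - 1}{N} \int_M e^h \Gamma_2(h) \,dm
&\ge K \int_M e^h F^2(\nabla h) \,dm + \frac{(4N - 1)(N - 1)}{4N(N + 2)^2} \int_M e^h F^4(\nabla h) \,dm \\
&\ge K \int_M e^h F^2(\nabla h) \,dm.
\end{aligned} \tag{16.15}$$

This is the desired inequality (16.4) (recall (16.6)) with $C = (N - 1)/(KN)$ for $u = e^h$, namely for all positive $C^\infty$-functions $u$. By approximation this implies (16.4) for all $u$ in the required class (similarly to the last step of the proof of Theorem 12.13). This completes the proof by Proposition 16.3.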
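The passage from (16.14) to (16.15) rests on two pieces of algebra: the choice $a = 3/(2(N+2))$ kills the coefficient of the term without fixed sign, and the remaining coefficient of the $F^4$ term simplifies to $(4N-1)(N-1)/(4N(N+2)^2)$. A minimal sketch in exact rational arithmetic, with function names of our own choosing, confirms both for a few sample values of $N$:

```python
from fractions import Fraction

def coefficients(N, a):
    """Coefficients of the d[F^2](grad h) term and of the F^4 term in (16.14)."""
    N, a = Fraction(N), Fraction(a)
    cross = (3 - 2 * (N + 2) * a) / (2 * N)
    quartic = ((a - 1) ** 2 - N * a ** 2) / N
    return cross, quartic

def quartic_in_16_15(N):
    """Coefficient of the F^4 term as stated in (16.15)."""
    N = Fraction(N)
    return (4 * N - 1) * (N - 1) / (4 * N * (N + 2) ** 2)

for N in (2, 3, Fraction(7, 2), 10):
    a = Fraction(3) / (2 * (Fraction(N) + 2))
    cross, quartic = coefficients(N, a)
    # cross is exactly zero; quartic matches (16.15) and is nonnegative for N >= 1
    print(N, cross, quartic, quartic == quartic_in_16_15(N))
```

For negative $N$ the same formula shows that the $F^4$ coefficient changes sign, which is consistent with the failure of the last step of (16.15) when $N < 0$.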

Remark 16.5 (The Case of $N = \infty$) The logarithmic Sobolev inequality under $\mathrm{Ric}_\infty \ge K > 0$ will be derived from the curvature-dimension condition $\mathrm{CD}(K, \infty)$; see Theorem 18.10.

Remark 16.6 (The Case of $N < 0$) Unlike the Poincaré–Lichnerowicz inequality (Theorem 15.4), the calculation in the above proof is not applicable to the case of $N < 0$. Precisely, though taking $a < 0$ is acceptable in the reversible case, the last inequality (16.15) fails if $N < 0$. In fact, the logarithmic Sobolev inequality of the form (16.5) does not hold in general under the assumption $\mathrm{Ric}_N \ge K > 0$ with $N < 0$. A simple counter-example is $(\mathbb{R}, |\cdot|, e^{-\sqrt{x^2 + 1}}\,dx)$ of exponential (but not normal) decay, which satisfies only the exponential concentration, whereas (16.5) implies the normal concentration. We refer to [148] for the theory of concentration of measures (see also Subsect. 18.4.2), and also to [164, 179] for related discussions.

Exercise 16.7 Prove that the above example $(\mathbb{R}, |\cdot|, e^{-\sqrt{x^2 + 1}}\,dx)$ satisfies $\mathrm{Ric}_\infty \ge 0$ as well as

$$\mathrm{Ric}_N \ge \frac{27(1 - N)^2 - 4}{27(1 - N)^3}$$

for $N < 0$.
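Before attempting Exercise 16.7, one may want to see the claimed bound numerically. The sketch below assumes the standard 1-dimensional formula $\mathrm{Ric}_N = \psi'' - (\psi')^2/(N - 1)$ on unit vectors for the measure $e^{-\psi}\,dx$ (the specialization to $n = 1$ of the weighted Ricci curvature of Chap. 9), with $\psi(x) = \sqrt{x^2 + 1}$; the grid search and the function names are ours and prove nothing by themselves.

```python
import math

def ric_N(x: float, N: float) -> float:
    """Ric_N on unit vectors for (R, |.|, exp(-psi) dx), psi(x) = sqrt(x^2 + 1)."""
    d1 = x / math.sqrt(x * x + 1)      # psi'(x)
    d2 = (x * x + 1) ** -1.5           # psi''(x), which is also Ric_infty
    return d2 - d1 * d1 / (N - 1)

def claimed_bound(N: float) -> float:
    """Lower bound stated in Exercise 16.7."""
    return (27 * (1 - N) ** 2 - 4) / (27 * (1 - N) ** 3)

def grid_minimum(N: float, radius: float = 50.0, steps: int = 200_001) -> float:
    """Minimum of Ric_N over a uniform grid on [-radius, radius]."""
    xs = (-radius + 2 * radius * i / (steps - 1) for i in range(steps))
    return min(ric_N(x, N) for x in xs)

for N in (-0.5, -1.0, -5.0):
    print(N, grid_minimum(N), claimed_bound(N))   # the two values nearly coincide
```

The near-equality of the two columns suggests that the bound in the exercise is attained, so it cannot be improved for this example.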

16.2 Beckner Inequality

Next we consider the Beckner inequality interpolating the Poincaré–Lichnerowicz and logarithmic Sobolev inequalities. We follow the lines of [105] (see also [22, 186]). We emphasize that $N$ can be negative in the Beckner inequality, like the Poincaré–Lichnerowicz inequality (Theorem 15.4).

Theorem 16.8 (Beckner Inequality) Assume that $(M, F, m)$ is compact and satisfies $m(M) = 1$ and $\mathrm{Ric}_N \ge K$ for some $K > 0$ and $N \in (-\infty, -2) \cup [n, \infty)$. Then we have

$$\frac{\|f\|_{L^2}^2 - \|f\|_{L^p}^2}{2 - p} \le \frac{N - 1}{KN} \int_M F^2(\nabla f) \,dm \tag{16.16}$$

for all $f \in H^1(M)$, where $1 \le p \le 2$ for $N \in [n, \infty)$ and

$$1 \le p \le \frac{2N^2 + 1}{(N - 1)^2} \tag{16.17}$$

for $N < -2$.

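The case of $N = \infty$ will be discussed in Corollary 16.11. Observe that, in the case of $N < -2$, $1 < (2N^2 + 1)/(N - 1)^2 < 2$.

Proof We first treat nonnegative functions $f$ and $1 \le p < 2$, assuming by truncation that $\inf_M f > 0$. Let $(u_t)_{t \ge 0}$ be the global solution to the heat equation with $u_0 = f^p$. We put $q := 2/p \in (1, 2]$ for simplicity and introduce

$$\Phi(s) := s^q, \qquad \Upsilon(t) := \int_M \Phi(u_t) \,dm,$$

for $t \ge 0$. It follows from the heat equation that

$$\Upsilon'(t) = \int_M \Phi'(u_t) \Delta u_t \,dm = -\int_M \Phi''(u_t) F^2(\nabla u_t) \,dm = -\int_M \frac{F^2(\nabla[\Phi'(u_t)])}{\Phi''(u_t)} \,dm,$$

since $F(\nabla[\Phi'(u_t)]) = \Phi''(u_t) F(\nabla u_t)$ by the convexity of $\Phi$. Observe also that $\Upsilon(0) = \|f\|_{L^2}^2$ and

$$\Upsilon'(0) = -q(q - 1) \int_M u_0^{q-2} F^2(\nabla u_0) \,dm = 2(p - 2) \int_M F^2(\nabla f) \,dm. \tag{16.18}$$

We shall compare $\Upsilon'(0)$ and $\Upsilon''(0)$ in the same spirit as the proofs of Theorem 15.4 and Proposition 16.3. We obtain from the calculation as in the proof of Lemma 14.1 that

$$\Upsilon''(t) = -\int_M \bigg\{ \Big[ \frac{1}{\Phi''} \Big]'(u_t) \Delta u_t \cdot F^2\big( \nabla[\Phi'(u_t)] \big) + \frac{2\, d\big( \Phi''(u_t) \Delta u_t \big)(\nabla[\Phi'(u_t)])}{\Phi''(u_t)} \bigg\} \,dm.$$

Then we deduce from

$$\Delta[\Phi'(u_t)] = \Phi''(u_t) \Delta u_t + \Phi'''(u_t) F^2(\nabla u_t) = \Phi''(u_t) \Delta u_t - \Big[ \frac{1}{\Phi''} \Big]'(u_t) F^2\big( \nabla[\Phi'(u_t)] \big)$$

that

$$\begin{aligned}
\Upsilon''(t)
&= \int_M \bigg\{ \Big[ \frac{1}{\Phi''} \Big]''(u_t) F^2(\nabla u_t) F^2\big( \nabla[\Phi'(u_t)] \big) + \Big[ \frac{1}{\Phi''} \Big]'(u_t)\, d\Big[ F^2\big( \nabla[\Phi'(u_t)] \big) \Big](\nabla u_t) \bigg\} \,dm \\
&\quad - \int_M \bigg\{ \frac{2\, d(\Delta[\Phi'(u_t)])(\nabla[\Phi'(u_t)])}{\Phi''(u_t)} + \Big[ \frac{1}{\Phi''} \Big]'(u_t) \frac{2\, d[F^2(\nabla[\Phi'(u_t)])](\nabla[\Phi'(u_t)])}{\Phi''(u_t)} \\
&\qquad\qquad + 2 \Big[ \frac{1}{\Phi''} \Big]''(u_t) F^2\big( \nabla[\Phi'(u_t)] \big) \frac{du_t(\nabla[\Phi'(u_t)])}{\Phi''(u_t)} \bigg\} \,dm \\
&= -\int_M \bigg\{ d\Big[ F^2\big( \nabla[\Phi'(u_t)] \big) \Big]\bigg( \nabla\Big[ \frac{1}{\Phi''(u_t)} \Big] \bigg) + \frac{2\, d(\Delta[\Phi'(u_t)])(\nabla[\Phi'(u_t)])}{\Phi''(u_t)} + \Big[ \frac{1}{\Phi''} \Big]''(u_t) \frac{F^4(\nabla[\Phi'(u_t)])}{\Phi''(u_t)^2} \bigg\} \,dm \\
&= \int_M \bigg\{ \frac{2\Gamma_2(\Phi'(u_t))}{\Phi''(u_t)} - \Big[ \frac{1}{\Phi''} \Big]''(u_t) \frac{F^4(\nabla[\Phi'(u_t)])}{\Phi''(u_t)^2} \bigg\} \,dm,
\end{aligned}$$

where we used $[(\Phi'')^{-1}]' \ge 0$ to see

$$\nabla\Big[ \frac{1}{\Phi''(u_t)} \Big] = \Big[ \frac{1}{\Phi''} \Big]'(u_t) \nabla u_t.$$

Substituting $\Phi(s) = s^q$ yields

$$\Upsilon'(t) = -\frac{q}{q - 1} \int_M u_t^{2-q} F^2\big( \nabla[u_t^{q-1}] \big) \,dm, \tag{16.19}$$

$$\Upsilon''(t) = \frac{q}{q - 1} \int_M \Big\{ 2u_t^{2-q}\, \Gamma_2(u_t^{q-1}) + \frac{2 - q}{q - 1}\, u_t^{4-3q} F^4\big( \nabla[u_t^{q-1}] \big) \Big\} \,dm. \tag{16.20}$$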

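Now, we fix $t > 0$, put $h := u_t^{q-1}$ and apply the Bochner inequality (12.13) to $h^{\theta+1}$ with $\theta > -1$ chosen later, i.e.,

$$\Gamma_2(h^{\theta+1}) \ge KF^2\big( \nabla[h^{\theta+1}] \big) + \frac{(\Delta[h^{\theta+1}])^2}{N}$$

(in the weak sense). By noticing $g_{\nabla h} = g_{\nabla[h^{\theta+1}]}$ (since $\theta > -1$), this is expanded as

$$h^{2\theta} \Gamma_2(h) + \theta h^{2\theta-1}\, d[F^2(\nabla h)](\nabla h) + \theta^2 h^{2\theta-2} F^4(\nabla h) \ge Kh^{2\theta} F^2(\nabla h) + \frac{1}{N} \big( h^\theta \Delta h + \theta h^{\theta-1} F^2(\nabla h) \big)^2.$$

Then, by rearrangement, we have

$$\Gamma_2(h) \ge KF^2(\nabla h) + \theta^2\, \frac{1 - N}{N}\, \frac{F^4(\nabla h)}{h^2} - \theta\, \frac{d[F^2(\nabla h)](\nabla h)}{h} + \frac{2\theta}{N}\, \frac{F^2(\nabla h) \Delta h}{h} + \frac{(\Delta h)^2}{N}. \tag{16.21}$$

We shall combine this inequality with (16.19) and (16.20) to obtain an inequality between $\Upsilon'(t)$ and $\Upsilon''(t)$. To this end, we calculate the last two terms of the right-hand side of (16.21) multiplied by $u_t^{2-q}$. We set $r := (2 - q)/(q - 1) \in [0, \infty)$ (so that $h^r = u_t^{2-q}$) and find

$$\int_M h^{r-1} F^2(\nabla h) \Delta h \,dm = -\int_M \Big\{ (r - 1) h^{r-2} F^4(\nabla h) + h^{r-1}\, d[F^2(\nabla h)](\nabla h) \Big\} \,dm$$

as well as

$$\begin{aligned}
\int_M h^r (\Delta h)^2 \,dm
&= -\int_M \Big\{ rh^{r-1} F^2(\nabla h) \Delta h + h^r\, d(\Delta h)(\nabla h) \Big\} \,dm \\
&= \int_M \Big\{ r(r - 1) h^{r-2} F^4(\nabla h) + rh^{r-1}\, d[F^2(\nabla h)](\nabla h) \Big\} \,dm \\
&\quad + \int_M \Big\{ h^r \Gamma_2(h) + rh^{r-1}\, \frac{d[F^2(\nabla h)](\nabla h)}{2} \Big\} \,dm.
\end{aligned}$$

Substituting these into (16.21), we obtain

$$\begin{aligned}
\frac{N - 1}{N} \int_M h^r \Gamma_2(h) \,dm
&\ge K \int_M h^r F^2(\nabla h) \,dm + \Big\{ \frac{\theta^2(1 - N)}{N} - \frac{2\theta(r - 1)}{N} + \frac{r(r - 1)}{N} \Big\} \int_M h^{r-2} F^4(\nabla h) \,dm \\
&\quad + \Big\{ -\theta - \frac{2\theta}{N} + \frac{3r}{2N} \Big\} \int_M h^{r-1}\, d[F^2(\nabla h)](\nabla h) \,dm.
\end{aligned}$$

To eliminate the last term (which does not have fixed sign), we choose

$$\theta = \frac{3r}{2(N + 2)} > -1.$$

Precisely, $\theta \ge 0$ clearly holds if $N \ge n$, and we have $\theta > -3/4$ if $N < -2$ and $p \le (2N^2 + 1)/(N - 1)^2$. Then we find, since $(N - 1)/N > 0$,

$$\int_M h^r \Gamma_2(h) \,dm \ge \frac{KN}{N - 1} \int_M h^r F^2(\nabla h) \,dm + r\, \frac{4(r - 1)(N + 2) - 9r}{4(N + 2)^2} \int_M h^{r-2} F^4(\nabla h) \,dm.$$

Combining this with (16.19), (16.20) and $q/(q - 1) = r + 2$ yields

$$\begin{aligned}
\Upsilon''(t)
&\ge -\frac{2KN}{N - 1} \Upsilon'(t) + r(r + 2) \Big\{ \frac{4(r - 1)(N + 2) - 9r}{2(N + 2)^2} + 1 \Big\} \int_M h^{r-2} F^4(\nabla h) \,dm \\
&= -\frac{2KN}{N - 1} \Upsilon'(t) + r(r + 2)\, \frac{(4N - 1)r + 2(N + 2)N}{2(N + 2)^2} \int_M h^{r-2} F^4(\nabla h) \,dm.
\end{aligned}$$

Finally, note that $(4N - 1)r + 2(N + 2)N \ge 0$ holds for all $r \in [0, \infty)$ if $N \ge n$, or for

$$r \in \Big[ 0, \frac{2(N + 2)N}{1 - 4N} \Big]$$

if $N < -2$. Observe that the latter range of $r$ coincides with (16.17) in terms of $p$. Therefore we obtain

$$\Upsilon''(t) \ge -\frac{2KN}{N - 1} \Upsilon'(t).$$

Now, since $u_t$ converges to the constant function $\int_M f^p \,dm$ as $t \to \infty$ in $L^2(M)$ (by (15.4)), we have $\lim_{t \to \infty} \Upsilon(t) = \|f\|_{L^p}^2$. Together with $\lim_{t \to \infty} \Upsilon'(t) = 0$ (by Proposition 13.16(i)), we deduce that

$$\Upsilon(0) - \|f\|_{L^p}^2 = -\int_0^\infty \Upsilon'(t) \,dt \le \frac{N - 1}{2KN} \int_0^\infty \Upsilon''(t) \,dt = -\frac{N - 1}{2KN} \Upsilon'(0).$$

Recalling $\Upsilon(0) = \|f\|_{L^2}^2$ and (16.18), we conclude that

$$\|f\|_{L^2}^2 - \|f\|_{L^p}^2 \le (2 - p)\, \frac{N - 1}{KN} \int_M F^2(\nabla f) \,dm.$$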
This completes the proof of (16.16) for nonnegative functions $f$.

For a general function $f \in H^1(M)$, we divide it into $f_+ := \max\{f, 0\}$ and $f_- := \max\{-f, 0\}$, and apply the above inequality with respect to $F$ and to the reverse structure $\overleftarrow{F}$, respectively. Precisely, we apply the above argument to $f_+$ to see

$$\|f_+\|_{L^p}^2 \ge \|f_+\|_{L^2}^2 - (2 - p)\, \frac{N - 1}{KN} \int_M F^2(\nabla f_+) \,dm.$$

A similar inequality for $f_-$ but with respect to $\overleftarrow{F}$ yields that, thanks to $\overleftarrow{\mathrm{Ric}}_N(v) = \mathrm{Ric}_N(-v) \ge K \overleftarrow{F}^2(v)$ (recall Sect. 2.5),

$$\|f_-\|_{L^p}^2 \ge \|f_-\|_{L^2}^2 - (2 - p)\, \frac{N - 1}{KN} \int_M \overleftarrow{F}^2\big( \overleftarrow{\nabla} f_- \big) \,dm.$$

Observe that $\overleftarrow{F}(\overleftarrow{\nabla} f_-) = F(\nabla(-f_-))$, and that $df = df_+ = df_- = 0$ almost everywhere on $f^{-1}(0)$ by Lemma 12.12. Moreover, since $1 \le p < 2$, we have

$$\|f\|_{L^p}^2 = \big( \|f_+\|_{L^p}^p + \|f_-\|_{L^p}^p \big)^{2/p} \ge \|f_+\|_{L^p}^2 + \|f_-\|_{L^p}^2.$$

Therefore we obtain

$$\|f\|_{L^p}^2 \ge \|f\|_{L^2}^2 - (2 - p)\, \frac{N - 1}{KN} \int_M F^2(\nabla f) \,dm.$$

This completes the proof.
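The parameter bookkeeping in the proof of Theorem 16.8 (the choice of $\theta$, the simplification of the $F^4$ coefficient, and the translation between the ranges of $r$ and $p$) can be verified in exact rational arithmetic. The following is a minimal sketch; all function names are hypothetical, and the sample values of $N$ and $p$ are arbitrary.

```python
from fractions import Fraction

def r_of_p(p):
    """r = (2 - q)/(q - 1) with q = 2/p, rewritten in terms of p."""
    p = Fraction(p)
    return (2 * p - 2) / (2 - p)

def theta_choice(N, r):
    """theta = 3r / (2(N + 2)), chosen to eliminate the term without fixed sign."""
    return 3 * Fraction(r) / (2 * (Fraction(N) + 2))

def cross_coefficient(N, r, theta):
    """Coefficient of the d[F^2](grad h) integral after substituting into (16.21)."""
    N, r = Fraction(N), Fraction(r)
    return -theta - 2 * theta / N + 3 * r / (2 * N)

def quartic_coefficient(N, r, theta):
    """Coefficient of the F^4 integral after substituting into (16.21)."""
    N, r = Fraction(N), Fraction(r)
    return (theta ** 2 * (1 - N) - 2 * theta * (r - 1) + r * (r - 1)) / N

def final_factor(N, r):
    """(4N - 1) r + 2 (N + 2) N, whose sign decides the last step of the proof."""
    N, r = Fraction(N), Fraction(r)
    return (4 * N - 1) * r + 2 * (N + 2) * N

N = Fraction(-3)
p_max = (2 * N ** 2 + 1) / (N - 1) ** 2          # upper end of (16.17), here 19/16
for p in (Fraction(1), Fraction(11, 10), p_max):
    r = r_of_p(p)
    theta = theta_choice(N, r)
    print(p, r, theta, cross_coefficient(N, r, theta), final_factor(N, r))
```

At the endpoint $p = (2N^2+1)/(N-1)^2$ the final factor vanishes exactly, which is how the range (16.17) arises from this argument.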

The Beckner inequality (16.16) recovers the Poincaré–Lichnerowicz inequality and the logarithmic Sobolev inequality as follows (see [21, Remark 6.8.4]).

Remark 16.9 (Relations of Functional Inequalities)

(a) The inequality (16.16) with $p = 1$ recovers the Poincaré–Lichnerowicz inequality (15.8). Precisely, when $f \ge 0$, the left-hand side of (16.16) is exactly the variance $\mathrm{Var}_m(f)$ and we have (15.8). In the general case, by truncation one can assume that $f$ is bounded. Then (16.16) for $f + \|f\|_{L^\infty}$ yields (15.8) since $\mathrm{Var}_m(f) = \mathrm{Var}_m(f + \|f\|_{L^\infty})$.

(b) The case of $p = 2$ is understood as the limit and gives the logarithmic Sobolev inequality (16.9) for $f^2$ of the form:

$$\int_M f^2 \log(f^2) \,dm - \|f\|_{L^2}^2 \cdot \log\big( \|f\|_{L^2}^2 \big) \le \frac{2(N - 1)}{KN} \int_M F^2(\nabla f) \,dm$$

(recall (16.8) and see Exercise 16.10 below).

(c) It seems not to be known whether the admissible range (16.17) of $p$ for $N < -2$ is optimal. As for the Sobolev inequality with $N \ge n$, one can take $2 \le p \le 2N/(N - 2)$ (in the reversible case; see Theorem 16.17). Notice that, for $N < -2$,

$$\frac{2N^2 + 1}{(N - 1)^2} < \frac{2N}{N - 2} < 2.$$

Hence it would be worth considering whether the Beckner inequality (16.16) with $N < -2$ can be extended to $1 \le p \le 2N/(N - 2)$. Observe also that $\lim_{N \to -2} 2N/(N - 2) = 1$, therefore the condition $N < -2$ may be optimal, though the Poincaré–Lichnerowicz inequality (15.8) holds for all $N < 0$.

Exercise 16.10 Show that

$$\lim_{p \to 2} \frac{\|f\|_{L^2}^2 - \|f\|_{L^p}^2}{2 - p} = \frac{1}{2} \int_M f^2 \log(f^2) \,dm - \frac{1}{2} \|f\|_{L^2}^2 \cdot \log\big( \|f\|_{L^2}^2 \big).$$

Finally, we consider the case of $N = \infty$ as a corollary to Theorem 16.8.

Corollary 16.11 (Beckner Inequality for $N = \infty$) Assume that $(M, F, m)$ is compact and satisfies $m(M) = 1$ and $\mathrm{Ric}_\infty \ge K$ for some $K > 0$. Then we have

$$\frac{\|f\|_{L^2}^2 - \|f\|_{L^p}^2}{2 - p} \le \frac{1}{K} \int_M F^2(\nabla f) \,dm \tag{16.22}$$

for all $f \in H^1(M)$ and $1 \le p \le 2$.

Proof For any $N \in (-\infty, -2)$, since $\mathrm{Ric}_N \ge \mathrm{Ric}_\infty \ge K$ (by Lemma 9.8(i)), we deduce from Theorem 16.8 that

$$\frac{\|f\|_{L^2}^2 - \|f\|_{L^p}^2}{2 - p} \le \frac{N - 1}{KN} \int_M F^2(\nabla f) \,dm$$

holds for $1 \le p \le (2N^2 + 1)/(N - 1)^2$. Since $\lim_{N \to -\infty} (2N^2 + 1)/(N - 1)^2 = 2$ and $\lim_{N \to -\infty} (N - 1)/N = 1$, this yields (16.22) for all $1 \le p < 2$, and the case of $p = 2$ is given as the limit. This completes the proof.
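The limit $p \to 2$ used here (and stated as Exercise 16.10) can be observed numerically. The sketch below works on a finite probability space with three atoms, which is our simplifying assumption and not the setting of the text; it compares the Beckner quotient for $p$ slightly below 2 with one half of the entropy-type expression.

```python
import math

def lp_norm_sq(f, m, p):
    """Squared L^p norm of f on a finite probability space with weights m."""
    return sum(abs(fi) ** p * mi for fi, mi in zip(f, m)) ** (2.0 / p)

def beckner_quotient(f, m, p):
    """Left-hand side of (16.16) for p < 2."""
    return (lp_norm_sq(f, m, 2.0) - lp_norm_sq(f, m, p)) / (2.0 - p)

def half_entropy(f, m):
    """Half of: integral of f^2 log(f^2) minus ||f||^2 log(||f||^2)."""
    l2 = lp_norm_sq(f, m, 2.0)
    integral = sum(fi * fi * math.log(fi * fi) * mi
                   for fi, mi in zip(f, m) if fi != 0)
    return 0.5 * integral - 0.5 * l2 * math.log(l2)

m = [0.2, 0.5, 0.3]        # hypothetical probability weights
f = [2.0, -0.5, 1.2]       # hypothetical function values (sign changes allowed)

for p in (1.5, 1.9, 1.99, 2.0 - 1e-6):
    print(p, beckner_quotient(f, m, p), half_entropy(f, m))
```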

16.3 Sobolev Inequality

Finally, we consider the Sobolev inequality, which could be regarded as a variant of the logarithmic Sobolev inequality in the opposite direction to the Beckner inequality (i.e., $p > 2$). In this case, similarly to the logarithmic Sobolev inequality (recall Remark 16.6), one cannot take $N < 0$.

First we establish the logarithmic entropy-energy and Nash inequalities, followed by a "non-sharp" Sobolev-type inequality. Then, with the help of some qualitative properties induced from the non-sharp Sobolev inequality, we will proceed to a sharp estimate. In this section we assume $\mathrm{Ric}_N \ge K$ for some $K > 0$ and $N \in [n, \infty)$ (thus $M$ is compact), and normalize $m$ as $m(M) = 1$.

16.3.1 Logarithmic Entropy-Energy and Nash Inequalities

We start with an inequality between the relative entropy and the energy (see [21, Theorem 6.8.1]).

Proposition 16.12 (Logarithmic Entropy-Energy Inequality) Let $(M, F, m)$ be complete and satisfy $m(M) = 1$ and $\mathrm{Ric}_N \ge K$ for some $K > 0$ and $N \in [n, \infty)$. Then we have

$$\mathrm{Ent}_m(f^2 m) \le \frac{N}{2} \log\Big( 1 + \frac{8}{KN} E(f) \Big)$$

for all $f \in H^1(M)$ with $\int_M f^2 \,dm = 1$.

Proof First we assume that $f$ is nonnegative. By truncation we can assume $f \in L^\infty(M)$ and $\inf_M f > 0$, in particular, $f^2 \in H^1(M)$. Consider the global solution $(u_t)_{t \ge 0}$ to the heat equation with $u_0 = f^2$, and put $\Psi(t) := \mathrm{Ent}_m(u_t m)$. Then we have, as in the proof of Proposition 16.3,

$$\Psi'(t) = -\int_M \frac{F^2(\nabla u_t)}{u_t} \,dm = -\int_M u_t F^2\big( \nabla[\log u_t] \big) \,dm \le 0.$$

Recall also from the proof of Proposition 16.3 that

$$\Psi''(t) = -\int_M du_t\Big( \nabla^{\nabla u_t}\big[ F^2(\nabla[\log u_t]) \big] \Big) \,dm - 2\int_M u_t \cdot d(\Delta[\log u_t])(\nabla[\log u_t]) \,dm.$$

Then the integrated Bochner inequality (12.13) for $\log u_t$ with the test function $u_t$ shows that

$$\Psi''(t) \ge -2K\Psi'(t) + \frac{2}{N} \int_M u_t (\Delta[\log u_t])^2 \,dm.$$

Moreover, by the Cauchy–Schwarz inequality with the probability measure $u_t m$ and the integration by parts, we find that

$$\begin{aligned}
\Psi''(t) &\ge -2K\Psi'(t) + \frac{2}{N} \Big( \int_M u_t \Delta[\log u_t] \,dm \Big)^2 \\
&= -2K\Psi'(t) + \frac{2}{N} \Big( \int_M \frac{F^2(\nabla u_t)}{u_t} \,dm \Big)^2 \\
&= -2K\Psi'(t) + \frac{2}{N} \Psi'(t)^2.
\end{aligned} \tag{16.23}$$

We deduce from this differential inequality that the function

$$t \longmapsto e^{-2Kt} \Big( 1 - \frac{KN}{\Psi'(t)} \Big)$$

is non-decreasing. Thus we have

$$-\Psi'(t) \le KN \bigg\{ e^{2Kt} \Big( 1 - \frac{KN}{\Psi'(0)} \Big) - 1 \bigg\}^{-1}.$$

Integrating this inequality implies that

$$\Psi(0) - \Psi(t) \le \frac{N}{2} \log\Big( 1 - \frac{\Psi'(0)}{KN} + \frac{\Psi'(0)}{KN} e^{-2Kt} \Big). \tag{16.24}$$

Finally observe that, since $f \ge 0$,

$$\Psi'(0) = -\int_M \frac{F^2(\nabla(f^2))}{f^2} \,dm = -4 \int_M F^2(\nabla f) \,dm.$$

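Therefore we obtain the claim for $f \ge 0$ by letting $t \to \infty$ in (16.24), with the help of Lemma 16.2.

For general $f \in H^1(M)$, in the same manner as the last step of the proof of Theorem 16.8, we divide it into $f_+ := \max\{f, 0\}$ and $f_- := \max\{-f, 0\}$. We apply the above argument to $f_+/\|f_+\|_{L^2}$ to see

$$\frac{1}{\|f_+\|_{L^2}^2} \int_M f_+^2 \log(f_+^2) \,dm - \log\big( \|f_+\|_{L^2}^2 \big) \le \frac{N}{2} \log\Big( 1 + \frac{4}{KN \|f_+\|_{L^2}^2} \int_M F^2(\nabla f_+) \,dm \Big).$$

A similar inequality for $f_-/\|f_-\|_{L^2}$ with respect to $\overleftarrow{F}$ yields

$$\frac{1}{\|f_-\|_{L^2}^2} \int_M f_-^2 \log(f_-^2) \,dm - \log\big( \|f_-\|_{L^2}^2 \big) \le \frac{N}{2} \log\Big( 1 + \frac{4}{KN \|f_-\|_{L^2}^2} \int_M \overleftarrow{F}^2\big( \overleftarrow{\nabla} f_- \big) \,dm \Big).$$

Then it follows from $\overleftarrow{F}(\overleftarrow{\nabla} f_-) = F(\nabla(-f_-))$, the concavity of the function $\log(1 + s)$ in $s$ and $\|f_+\|_{L^2}^2 + \|f_-\|_{L^2}^2 = 1$ that

$$\begin{aligned}
\mathrm{Ent}_m(f^2 m)
&\le \|f_+\|_{L^2}^2\, \frac{N}{2} \log\Big( 1 + \frac{4}{KN \|f_+\|_{L^2}^2} \int_M F^2(\nabla f_+) \,dm \Big) \\
&\quad + \|f_-\|_{L^2}^2\, \frac{N}{2} \log\Big( 1 + \frac{4}{KN \|f_-\|_{L^2}^2} \int_M F^2\big( \nabla(-f_-) \big) \,dm \Big) \\
&\le \frac{N}{2} \log\Big( 1 + \frac{4}{KN} \int_M F^2(\nabla f_+) \,dm + \frac{4}{KN} \int_M F^2\big( \nabla(-f_-) \big) \,dm \Big) \\
&= \frac{N}{2} \log\Big( 1 + \frac{4}{KN} \int_M F^2(\nabla f) \,dm \Big).
\end{aligned}$$

This completes the proof.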

Remark 16.13 The inequality in (16.23) is invalid for $N < 0$ since the Cauchy–Schwarz inequality cannot be reversed.

Next, following the strategy in [21, Proposition 6.2.3], we show the Nash inequality and then a non-sharp Sobolev inequality.

Lemma 16.14 (Nash Inequality) Assume that $(M, F, m)$ is complete and satisfies $m(M) = 1$ and $\mathrm{Ric}_N \ge K$ for some $K > 0$ and $N \in [n, \infty)$. Then we have, for all $f \in H^1(M)$,
\[
\|f\|_{L^2}^{N+2} \le \biggl(\|f\|^2_{L^2} + \frac{8}{KN}E(f)\biggr)^{N/2}\|f\|^2_{L^1}.
\]

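Proof We can assume $f \in L^\infty(M)$ (by truncation) and $\|f\|_{L^2} \neq 0$, and normalize $f$ so as to satisfy $\int_M f^2\,dm = 1$ (thanks to the scaling invariance of the claimed inequality). Put $\psi(\theta) := \log(\|f\|_{L^{1/\theta}})$ for $\theta \in (0, 1]$. Then $\psi$ is a convex function by Hölder's inequality: For any $\theta, \theta' \in (0, 1]$ and $\lambda \in (0, 1)$, writing $\theta_\lambda := (1-\lambda)\theta + \lambda\theta'$, we have
\[
\begin{aligned}
\psi(\theta_\lambda)
&= \theta_\lambda \log\biggl(\int_M |f|^{(1-\lambda)/\theta_\lambda}\,|f|^{\lambda/\theta_\lambda}\,dm\biggr) \\
&\le \theta_\lambda \log\biggl[\biggl(\int_M |f|^{1/\theta}\,dm\biggr)^{(1-\lambda)\theta/\theta_\lambda}\biggl(\int_M |f|^{1/\theta'}\,dm\biggr)^{\lambda\theta'/\theta_\lambda}\biggr] \\
&= \log\bigl(\|f\|^{1-\lambda}_{L^{1/\theta}} \cdot \|f\|^{\lambda}_{L^{1/\theta'}}\bigr)
= (1-\lambda)\psi(\theta) + \lambda\psi(\theta').
\end{aligned}
\]
Therefore we find
\[
\psi(1) \ge \psi\Bigl(\frac{1}{2}\Bigr) + \frac{1}{2}\psi'\Bigl(\frac{1}{2}\Bigr) = \frac{1}{2}\psi'\Bigl(\frac{1}{2}\Bigr)
\]
since $\|f\|_{L^2} = 1$. Combining this with
\[
\psi'\Bigl(\frac{1}{2}\Bigr) = \frac{1}{2}\,\frac{d}{d\theta}\biggl[\int_M |f|^{1/\theta}\,dm\biggr]_{\theta = 1/2}
= \frac{1}{2}\int_M \bigl(-4|f|^2\log|f|\bigr)\,dm = -\mathrm{Ent}_m(f^2 m)
\]
and Proposition 16.12, we obtain
\[
\|f\|_{L^1} \ge \exp\Bigl(-\frac{1}{2}\mathrm{Ent}_m(f^2 m)\Bigr) \ge \biggl(1 + \frac{8}{KN}E(f)\biggr)^{-N/4},
\]
which completes the proof.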
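The convexity of $\psi(\theta) = \log\|f\|_{L^{1/\theta}}$ used above can be illustrated on a finite probability space. The sketch below is only a numerical illustration under stated assumptions: the sample values of $f$, the point masses, and the helper names are hypothetical and not taken from the text.

```python
import math

def psi(theta, values, weights):
    """psi(theta) = log ||f||_{L^{1/theta}} on a finite probability space."""
    integral = sum(w * abs(v) ** (1.0 / theta) for v, w in zip(values, weights))
    return theta * math.log(integral)

values = [0.3, 1.1, 2.0, 0.7]    # sample values of f (hypothetical)
weights = [0.1, 0.4, 0.2, 0.3]   # point masses summing to 1

def convexity_gap(t0, t1, lam):
    """(1-lam) psi(t0) + lam psi(t1) - psi((1-lam) t0 + lam t1)."""
    mid = (1.0 - lam) * t0 + lam * t1
    return ((1.0 - lam) * psi(t0, values, weights)
            + lam * psi(t1, values, weights)
            - psi(mid, values, weights))

# The gap should be nonnegative for every admissible triple.
gaps = [convexity_gap(t0, t1, lam)
        for t0 in (0.2, 0.5, 1.0)
        for t1 in (0.3, 0.7, 1.0)
        for lam in (0.25, 0.5, 0.75)]
```

Every entry of `gaps` is nonnegative up to rounding, in line with the Hölder argument; the tangent-line inequality $\psi(1) \ge \psi(1/2) + \psi'(1/2)/2$ is the same convexity read at $\theta = 1/2$.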

Proposition 16.15 (Non-sharp Sobolev Inequality) Assume that $(M, F, m)$ is complete and satisfies $m(M) = 1$ and $\mathrm{Ric}_N \ge K$ for some $K > 0$ and $N \in [n, \infty) \setminus \{2\}$. Then we have
\[
\|f\|^2_{L^p} \le C_1\|f\|^2_{L^2} + C_2 E(f)
\tag{16.25}
\]
for all $f \in H^1(M)$, where $p = 2N/(N-2)$, $C_1 = C_1(N) > 1$ and $C_2 = C_2(K, N) > 0$.

Proof By the same reasoning as the last step of the proof of Theorem 16.8 with the help of
\[
\|f\|^2_{L^p} = \bigl(\|f_+\|^p_{L^p} + \|f_-\|^p_{L^p}\bigr)^{2/p} \le \|f_+\|^2_{L^p} + \|f_-\|^2_{L^p}
\tag{16.26}
\]
(since $p > 2$), it suffices to consider nonnegative functions $f$. Moreover, one can also assume that $f \in L^\infty(M)$ and $\inf_M f > 0$. We consider the decreasing sequence
\[
A_k := \{x \in M \mid f(x) > 2^k\}, \qquad k \in \mathbb{Z},
\]
and set
\[
f_k := \min\bigl\{\max\{f - 2^k, 0\}, 2^k\bigr\} =
\begin{cases}
2^k & \text{on } A_{k+1}, \\
f - 2^k & \text{on } A_k \setminus A_{k+1}, \\
0 & \text{on } M \setminus A_k.
\end{cases}
\]

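Then we observe that
\[
2^{2k}m(A_{k+1}) \le \|f_k\|^2_{L^2} \le 2^{2k}m(A_k), \qquad \|f_k\|_{L^1} \le 2^k m(A_k).
\]
Together with the Nash inequality (Lemma 16.14) applied to $f_k$, we find
\[
\begin{aligned}
\bigl(2^{2k}m(A_{k+1})\bigr)^{(N+2)/2}
&\le \|f_k\|^{N+2}_{L^2}
\le \biggl(\|f_k\|^2_{L^2} + \frac{8}{KN}E(f_k)\biggr)^{N/2}\|f_k\|^2_{L^1} \\
&\le \biggl(2^{2k}m(A_k) + \frac{8}{KN}E(f_k)\biggr)^{N/2} 2^{2k}m(A_k)^2.
\end{aligned}
\]
One can rewrite this inequality by using $p = 2N/(N-2)$ as
\[
2^{p(k+1)}m(A_{k+1}) \le 2^p\biggl(2^{2k}m(A_k) + \frac{8}{KN}E(f_k)\biggr)^{N/(N+2)}\bigl(2^{pk}m(A_k)\bigr)^{4/(N+2)}.
\]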

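Combining this with Hölder's inequality implies
\[
\begin{aligned}
\sum_{k \in \mathbb{Z}} 2^{p(k+1)}m(A_{k+1})
&\le 2^p\sum_{k \in \mathbb{Z}}\biggl(2^{2k}m(A_k) + \frac{8}{KN}E(f_k)\biggr)^{N/(N+2)}\bigl(2^{2pk}m(A_k)^2\bigr)^{2/(N+2)} \\
&\le 2^p\biggl(\sum_{k \in \mathbb{Z}}\Bigl\{2^{2k}m(A_k) + \frac{8}{KN}E(f_k)\Bigr\}\biggr)^{N/(N+2)}\biggl(\sum_{k \in \mathbb{Z}} 2^{2pk}m(A_k)^2\biggr)^{2/(N+2)} \\
&\le 2^p\biggl(\sum_{k \in \mathbb{Z}}\Bigl\{2^{2k}m(A_k) + \frac{8}{KN}E(f_k)\Bigr\}\biggr)^{N/(N+2)}\biggl(\sum_{k \in \mathbb{Z}} 2^{pk}m(A_k)\biggr)^{4/(N+2)}.
\end{aligned}
\]
We remark that, by the definition of $A_k$,
\[
\sum_{k \in \mathbb{Z}} 2^{pk}m(A_k) = \frac{1}{1 - 2^{-p}}\sum_{k \in \mathbb{Z}} 2^{pk}\{m(A_k) - m(A_{k+1})\} \le \frac{1}{1 - 2^{-p}}\|f\|^p_{L^p} < \infty.
\tag{16.27}
\]
Hence we have
\[
\biggl(\sum_{k \in \mathbb{Z}} 2^{pk}m(A_k)\biggr)^{(N-2)/(N+2)} \le 2^p\biggl(\sum_{k \in \mathbb{Z}}\Bigl\{2^{2k}m(A_k) + \frac{8}{KN}E(f_k)\Bigr\}\biggr)^{N/(N+2)}.
\tag{16.28}
\]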

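One sees similarly to (16.27) that
\[
\sum_{k \in \mathbb{Z}} 2^{2k}m(A_k) = \frac{4}{3}\sum_{k \in \mathbb{Z}} 2^{2k}\{m(A_k) - m(A_{k+1})\} \le \frac{4}{3}\|f\|^2_{L^2}.
\]
Combining this with $\sum_{k \in \mathbb{Z}} E(f_k) = E(f)$, we have
\[
\sum_{k \in \mathbb{Z}}\Bigl\{2^{2k}m(A_k) + \frac{8}{KN}E(f_k)\Bigr\} \le \frac{4}{3}\|f\|^2_{L^2} + \frac{8}{KN}E(f).
\]
Moreover, we deduce that
\[
\sum_{k \in \mathbb{Z}} 2^{pk}m(A_k) = \frac{1}{2^p - 1}\sum_{k \in \mathbb{Z}} 2^{p(k+1)}\{m(A_k) - m(A_{k+1})\} \ge 2^{-p}\|f\|^p_{L^p}.
\]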

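Substituting these into (16.28) yields
\[
\bigl(2^{-p}\|f\|^p_{L^p}\bigr)^{(N-2)/N} \le 2^{p(N+2)/N}\biggl(\frac{4}{3}\|f\|^2_{L^2} + \frac{8}{KN}E(f)\biggr).
\]
Finally, recalling $p = 2N/(N-2)$, we obtain
\[
\|f\|^2_{L^p} \le 2^2 \cdot 2^{p(N+2)/N}\biggl(\frac{4}{3}\|f\|^2_{L^2} + \frac{8}{KN}E(f)\biggr)
= 2^{4N/(N-2)}\biggl(\frac{4}{3}\|f\|^2_{L^2} + \frac{8}{KN}E(f)\biggr).
\]
This completes the proof.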
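The exponent bookkeeping in the proof of Proposition 16.15 is easy to get wrong by hand, so here is a small exact-arithmetic check. It is a sketch for illustration only; the function names are hypothetical, and it verifies nothing beyond the three exponent identities named in the comments.

```python
from fractions import Fraction

def sobolev_exponent(N):
    """Critical exponent p = 2N/(N-2), for a rational N > 2."""
    N = Fraction(N)
    return 2 * N / (N - 2)

def check_exponents(N):
    """Exact check of the exponent identities used with p = 2N/(N-2)."""
    N = Fraction(N)
    p = sobolev_exponent(N)
    # Rewriting the Nash estimate: (p - 2) + 4/(N+2) equals 4p/(N+2).
    step1 = (p - 2) + Fraction(4) / (N + 2) == 4 * p / (N + 2)
    # Passing to (16.28): 1 - 4/(N+2) equals (N-2)/(N+2).
    step2 = 1 - Fraction(4) / (N + 2) == (N - 2) / (N + 2)
    # Final constant: p(N-2)/N = 2 and 2 + p(N+2)/N = 4N/(N-2).
    step3 = (p * (N - 2) / N == 2) and (2 + p * (N + 2) / N == 4 * N / (N - 2))
    return step1 and step2 and step3

results = {N: check_exponents(N) for N in (3, 4, Fraction(7, 2), 10)}
```

For each sample dimension the three identities hold exactly, which matches the final constant $2^{4N/(N-2)}$.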

 

We remark that, on one hand, the exponent $p = 2N/(N-2)$ in Proposition 16.15 is the same as in the classical Sobolev embedding (see, e.g., [115, 116]). On the other hand, the constants $C_1$ and $C_2$ in (16.25) are by no means sharp. We will therefore utilize some qualitative consequences of (16.25) to obtain a sharper estimate. In fact, one can reduce those qualitative arguments to the Riemannian (linear) case by using the uniform smoothness (see Subsect. 16.3.3 for details). For instance, the following corollary to Proposition 16.15 is straightforward.

Corollary 16.16 Assume that $(M, F, m)$ is complete and satisfies $m(M) = 1$ and $\mathrm{Ric}_N \ge K$ for some $K > 0$ and $N \in [n, \infty) \setminus \{2\}$. Then there exists a $C^\infty$ Riemannian metric $g$ of $M$ such that
\[
\|f\|^2_{L^p} \le C_1\|f\|^2_{L^2} + C_2 S_F E_g(f)
\]
holds for all $f \in H^1(M)$, where $p = 2N/(N-2)$, $C_1 > 1$ and $C_2 > 0$ are as in Proposition 16.15, and $E_g$ is the energy functional of $(M, g, m)$.

Proof Let $\{U_i\}_{i=1}^l$ be an open cover of $M$, $V_i$ a nowhere vanishing $C^\infty$-vector field on $U_i$, and $\{\rho_i\}_{i=1}^l$ a partition of unity subordinate to $\{U_i\}_{i=1}^l$. Consider a Riemannian metric $g$ whose dual metric $g^*$ is given by $g^* := \sum_{i=1}^l \rho_i\, g^*_{\mathcal{L}(V_i)}$ (recall (8.9)). Then, for any $f \in H^1(M)$, we deduce from Lemma 8.16 that
\[
2E(f) = \sum_{i=1}^l \int_{U_i} \rho_i F^*(df)^2\,dm \le \sum_{i=1}^l \int_{U_i} \rho_i S_F\, g^*_{\mathcal{L}(V_i)}(df, df)\,dm = 2S_F E_g(f).
\]
Combining this with Proposition 16.15, we obtain the claim.

In particular, it follows from Corollary 16.16 that the embedding $H^1(M) \hookrightarrow L^p(M)$ is compact for $1 \le p < 2N/(N-2)$ (the Rellich–Kondrachov theorem; see [115, 116] and [21, Sect. 6.4]).

16.3.2 Sharp Sobolev Inequality

We show the sharp Sobolev inequality along the lines of [21, Theorem 6.8.3] (see also [21, Remark 6.8.4]).

Theorem 16.17 (Sobolev Inequality) Assume that $(M, F, m)$ is complete and satisfies $m(M) = 1$ and $\mathrm{Ric}_N \ge K$ for some $K > 0$ and $N \in [n, \infty)$. Then we have
\[
\frac{\|f\|^2_{L^p} - \|f\|^2_{L^2}}{p - 2} \le \frac{N-1}{KN}\int_M F^2(\nabla f)\,dm
\tag{16.29}
\]

for all f ∈ H 1 (M) and 2 ≤ p ≤ 2(N + 1)/N. If F is reversible, then the range of p can be extended to 2 ≤ p ≤ 2N/(N − 2) (2 ≤ p < ∞ when N = 2). Observe that 2(N + 1)/N < 2N/(N − 2). The inequality (16.29) has exactly the same form as the Beckner inequality (16.16), therefore one can indeed take p ∈ [1, 2(N +1)/N ] (and p ∈ [1, 2N/(N −2)] or p ∈ [1, ∞) in the reversible case). Recall that the case of p = 2 corresponds to the logarithmic Sobolev inequality (Remark 16.9(b)). Proof Let p > 2 and, for simplicity, N > 2. This certainly covers the case of N = n = 2 by taking the limit as N → 2 (recall that RicN ≥ Ricn by Lemma 9.8(i)). Note also that, thanks to (16.26), it suffices to show the claim for nonnegative functions f (similarly to the proof of the Beckner inequality).

Step 1 Take the smallest possible constant $C > 0$ such that the inequality
\[
\frac{\|f\|^2_{L^p} - \|f\|^2_{L^2}}{p - 2} \le 2CE(f)
\tag{16.30}
\]
holds for all nonnegative functions $f \in H^1(M)$. Then our goal is to show $C \le (N-1)/(KN)$. Suppose that there exists an extremal (nonconstant) function $\hat f \ge 0$ enjoying equality in (16.30) as well as $\hat f \in L^\infty(M)$ and $\inf_M \hat f > 0$, and normalize it as $\|\hat f\|_{L^p} = 1$. Then, for any $\phi \in C^\infty(M)$ and $\varepsilon > 0$, we find from the choices of $C$ and $\hat f$ that
\[
\frac{\|\hat f + \varepsilon\phi\|^2_{L^p} - \|\hat f\|^2_{L^p} - \|\hat f + \varepsilon\phi\|^2_{L^2} + \|\hat f\|^2_{L^2}}{p - 2} \le 2C\{E(\hat f + \varepsilon\phi) - E(\hat f)\}.
\]
Dividing both sides by $\varepsilon$ and letting $\varepsilon \to 0$, we obtain from $\|\hat f\|_{L^p} = 1$ and Lemma 3.8 that
\[
\frac{1}{p - 2}\int_M \Bigl(\frac{2}{p}\,p\hat f^{p-1}\phi - 2\hat f\phi\Bigr)\,dm \le 2C\int_M d\phi(\nabla\hat f)\,dm = -2C\int_M \phi\,\Delta\hat f\,dm.
\]
Replacing $\phi$ with $-\phi$, we also obtain the reverse inequality. Thus it follows that $\Delta\hat f$ is absolutely continuous with respect to $m$ and
\[
\hat f^{p-1} - \hat f = -C(p - 2)\Delta\hat f.
\tag{16.31}
\]

In particular, we have fˆ ∈ L∞ (M) ∩ H 1 (M). Put ζ := log fˆ and observe that ∇ζ =

∇ fˆ , fˆ

ζ =

fˆ F 2 (∇ fˆ) ∈ L1 (M). − fˆ fˆ2

(16.32)

Moreover, one can rewrite (16.31) as e(p−1)ζ − eζ = −C(p − 2)fˆ, and hence e(p−2)ζ = 1 − C(p − 2)

fˆ . fˆ

(16.33)

Multiplying both sides by a function −ebζ ζ ∈ L1 (M) with b ∈ R and applying the integration by parts, we have  (p − 2 + b)

e(p−2+b)ζ F 2 (∇ζ ) dm M

16.3 Sobolev Inequality

243





=b

ebζ F 2 (∇ζ ) dm + C(p − 2) M

ebζ ζ M

fˆ dm. fˆ

Then, substituting (16.33) into e(p−2)ζ in the left-hand side yields  (p − 2)

ebζ F 2 (∇ζ ) dm M

fˆ fˆ 2 dm e (p − 2 + b)F (∇ζ ) = C(p − 2) + ζ fˆ fˆ M

 ˆ 2  f fˆ = C(p − 2) dm, ebζ + (p − 3 + b)F 2 (∇ζ ) fˆ fˆ M 

where we used (16.32) in the second equality. Therefore we obtain 1 C





 ebζ F 2 (∇ζ ) dm = M

ebζ M

fˆ fˆ

2 + (p − 3 + b)F 2 (∇ζ )

fˆ dm. fˆ (16.34)

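Step 2 We will compare (16.34) with the inequality (16.13) for $\zeta$. To this end, in order to improve the regularity, we consider the global solution $(u_t)_{t \ge 0}$ to the heat equation with $u_0 = \hat f$. Thanks to $\Delta\hat f \in H^1(M)$ from (16.31), we find that $\Delta u_t = \partial_t u_t$ converges to $\Delta\hat f$ in $L^2(M)$ as $t \to 0$ (see Claim 16.20 below for a further account). Moreover, since $\partial_t u$ solves the linearized heat equation $\partial_t w = \Delta^{\nabla u}w$ (see Step 3 in the proof of Theorem 13.18) and $\Delta\hat f \in L^\infty(M)$, we have $\|\partial_t u_t\|_{L^\infty} \le \|\Delta\hat f\|_{L^\infty} < \infty$ for all $t > 0$ by Proposition 13.20(iii).

Now, we fix $t > 0$ and set $h := \log u_t$. One can expand $\Gamma_2(h)$ as
\[
\int_M e^{bh}\Gamma_2(h)\,dm = \int_M e^{bh}\Bigl\{-\frac{b}{2}\,d[F^2(\nabla h)](\nabla h) + (\Delta h)^2 + bF^2(\nabla h)\Delta h\Bigr\}\,dm.
\]
Hence we have, for the left-hand side of (16.13),
\[
\begin{aligned}
&\int_M e^{bh}\bigl\{\Gamma_2(h) + a\,d[F^2(\nabla h)](\nabla h) + a^2F^4(\nabla h)\bigr\}\,dm \\
&\quad= \int_M e^{bh}\Bigl\{\Bigl(a - \frac{b}{2}\Bigr)d[F^2(\nabla h)](\nabla h) + (\Delta h)^2 + bF^2(\nabla h)\Delta h + a^2F^4(\nabla h)\Bigr\}\,dm \\
&\quad= \int_M e^{bh}\Bigl\{(\Delta h)^2 + \Bigl(\frac{3b}{2} - a\Bigr)F^2(\nabla h)\Delta h + \Bigl(\frac{b^2}{2} - ab + a^2\Bigr)F^4(\nabla h)\Bigr\}\,dm.
\end{aligned}
\]
Therefore (16.13) and (16.32) imply
\[
\begin{aligned}
K\int_M e^{bh}F^2(\nabla h)\,dm
&\le \int_M e^{bh}\Bigl\{\Bigl(1 - \frac{1}{N}\Bigr)(\Delta h)^2 + \Bigl(\frac{3b}{2} - \frac{(N+2)a}{N}\Bigr)F^2(\nabla h)\Delta h \\
&\qquad\qquad + \Bigl(\frac{b^2}{2} - ab + \frac{(N-1)a^2}{N}\Bigr)F^4(\nabla h)\Bigr\}\,dm \\
&= \int_M e^{bh}\Bigl\{\frac{N-1}{N}\Bigl(\frac{\Delta u_t}{u_t}\Bigr)^2 + \Bigl(\frac{3b}{2} - \frac{(N+2)a}{N} - \frac{2(N-1)}{N}\Bigr)F^2(\nabla h)\frac{\Delta u_t}{u_t} \\
&\qquad\qquad + \Bigl(\frac{b^2}{2} - ab + \frac{(N-1)a^2}{N} - \frac{3b}{2} + \frac{(N+2)a}{N} + \frac{N-1}{N}\Bigr)F^4(\nabla h)\Bigr\}\,dm.
\end{aligned}
\tag{16.35}
\]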

Recall that, in the non-reversible case, $a$ is necessarily positive (or nonnegative by taking the limit) since we used the Bochner inequality for $e^{ah}$ to see (16.13). Note also that $b$ does not need to be nonnegative.

Step 3 The coefficient of the last term in (16.35) should vanish, i.e.,
\[
\frac{b^2}{2} - ab + \frac{(N-1)a^2}{N} - \frac{3b}{2} + \frac{(N+2)a}{N} + \frac{N-1}{N} = 0,
\tag{16.36}
\]
since $F^4(\nabla\zeta)$ may not be integrable. Moreover, comparing (16.35) and (16.34), we would like to choose $a$ and $b$ satisfying
\[
p - 3 + b = \frac{N}{N-1}\Bigl(\frac{3b}{2} - \frac{(N+2)a}{N} - \frac{2(N-1)}{N}\Bigr).
\tag{16.37}
\]

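Granted the existence of $a$ and $b$ satisfying (16.36) and (16.37), we finish the proof of $C \le (N-1)/(KN)$. Plugging $a$ and $b$ as above into (16.35) yields
\[
\frac{KN}{N-1}\int_M e^{bh}F^2(\nabla h)\,dm \le \int_M e^{bh}\biggl\{\Bigl(\frac{\Delta u_t}{u_t}\Bigr)^2 + (p - 3 + b)F^2(\nabla h)\frac{\Delta u_t}{u_t}\biggr\}\,dm.
\tag{16.38}
\]
Note that, as $t \to 0$, we have $u_t \to \hat f$, $\Delta u_t \to \Delta\hat f$ (both in $L^2(M)$) and $E(u_t) \to E(\hat f)$ (recall Sect. 13.2). Moreover, $F(\nabla u_t) \to F(\nabla\hat f)$ in $L^2(M)$ since
\[
\begin{aligned}
\int_M \bigl(F(\nabla u_t) - F(\nabla\hat f)\bigr)^2\,dm
&= 2E(u_t) + 2E(\hat f) - 2\int_M F(\nabla u_t)F(\nabla\hat f)\,dm \\
&\le 2E(u_t) + 2E(\hat f) - 2\int_M du_t(\nabla\hat f)\,dm \\
&= 2E(u_t) + 2E(\hat f) + 2\int_M u_t\,\Delta\hat f\,dm \\
&\to 4E(\hat f) + 2\int_M \hat f\,\Delta\hat f\,dm = 0.
\end{aligned}
\]
Recall also that $\inf_M u_t \ge \inf_M \hat f > 0$ and $\|\Delta u_t\|_{L^\infty} \le \|\Delta\hat f\|_{L^\infty} < \infty$ for all $t > 0$. Therefore we can take the limit of (16.38) as $t \to 0$ to see
\[
\frac{KN}{N-1}\int_M e^{b\zeta}F^2(\nabla\zeta)\,dm \le \int_M e^{b\zeta}\biggl\{\biggl(\frac{\Delta\hat f}{\hat f}\biggr)^2 + (p - 3 + b)F^2(\nabla\zeta)\frac{\Delta\hat f}{\hat f}\biggr\}\,dm.
\]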

Combining this with (16.34) shows $C \le (N-1)/(KN)$ as desired, and hence we obtain the inequality (16.29).

Step 4 Now, we show that $a$ and $b$ satisfying (16.36) and (16.37) exist under the assumption $p \in (2, 2(N+1)/N]$. At this point, due to the non-reversibility, we need additional care regarding the necessary nonnegativity of $a$. (If $F$ is reversible, then $a$ does not need to be nonnegative and one can relax the condition to $p \in (2, 2N/(N-2)]$.) We deduce from the latter equation (16.37) that
\[
a = \frac{b}{2} - \frac{N-1}{N+2}(p - 1).
\]
Substituting this into the former equation (16.36) implies
\[
b^2 + 4\biggl\{\Bigl(\frac{p-1}{N+2} - 1\Bigr)b - (p - 2) + \Bigl(\frac{N-1}{N+2}\Bigr)^2(p - 1)^2\biggr\} = 0.
\tag{16.39}
\]
Let us denote the left-hand side by $\varphi(b)$. The discriminant of $\varphi$ is given, up to the positive factor 16, by
\[
\frac{N(p - 1)}{N+2}\Bigl(\frac{2 - N}{N+2}(p - 1) + 1\Bigr),
\]
which vanishes at $p = 1, 2N/(N-2)$ and is nonnegative for $p \in [1, 2N/(N-2)]$. Hence (16.39) has at least one solution
\[
b_0 \ge 2\Bigl(1 - \frac{p-1}{N+2}\Bigr),
\]
and we put
\[
a_0 := \frac{b_0}{2} - \frac{N-1}{N+2}(p - 1).
\]
Then we make use of the hypothesis $p \le 2(N+1)/N$ to see the desired nonnegativity:
\[
a_0 \ge 1 - \frac{p-1}{N+2} - \frac{N-1}{N+2}(p - 1) \ge 0.
\tag{16.40}
\]
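The choice of $a_0$ and $b_0$ in Step 4 can be checked numerically. The sketch below is an illustration under stated assumptions rather than part of the proof: it takes the larger root of (16.39), forms $a_0$, and evaluates the residuals of (16.36) and (16.37). The helper names and the sample dimension $N = 5$ are hypothetical.

```python
import math

def sobolev_parameters(N, p):
    """Larger root b0 of (16.39) and the corresponding a0."""
    q = p - 1.0
    B = q / (N + 2.0) - 1.0                      # half of the linear coefficient / 2
    c = (N - 1.0) * q / (N + 2.0)
    disc = (N * q / (N + 2.0)) * ((2.0 - N) * q / (N + 2.0) + 1.0)
    b0 = -2.0 * B + 2.0 * math.sqrt(max(disc, 0.0))
    a0 = 0.5 * b0 - c
    return a0, b0

def residual_36(N, a, b):
    """Left-hand side of (16.36)."""
    return (0.5 * b * b - a * b + (N - 1.0) * a * a / N
            - 1.5 * b + (N + 2.0) * a / N + (N - 1.0) / N)

def residual_37(N, p, a, b):
    """Difference of the two sides of (16.37)."""
    rhs = N / (N - 1.0) * (1.5 * b - (N + 2.0) * a / N - 2.0 * (N - 1.0) / N)
    return (p - 3.0 + b) - rhs

N = 5.0
p = 2.0 * (N + 1.0) / N      # endpoint of the non-reversible range
a0, b0 = sobolev_parameters(N, p)
```

With these sample values both residuals vanish up to rounding and `a0` is nonnegative, as (16.40) predicts. Evaluating `sobolev_parameters(N, 2 * N / (N - 2))` instead gives a negative `a0`, which is why the reversible range cannot be reached this way.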

Step 5 Finally, since a good extremal function as described in Step 1 may not exist, it is necessary to consider an approximation procedure. This step can be essentially reduced to the Riemannian case (and requires $p > 2$). Here we explain only an outline and leave the details (along the last part of the proof of [21, Theorem 6.8.3]) to the next subsection.

Given $p \in (2, 2(N+1)/N]$ (or $p \in (2, 2N/(N-2))$ in the reversible case) and $\delta > 0$, we consider the smallest possible constant $C(p, \delta) > 0$ such that the inequality
\[
\frac{\|f\|^2_{L^p} - (1 + \delta)\|f\|^2_{L^2}}{p - 2} \le 2C(p, \delta)E(f)
\]
holds for all nonnegative functions $f \in H^1(M)$. Take a sequence $(f_k)_{k \in \mathbb{N}}$ of nonconstant positive functions in $H^1(M)$ such that $\|f_k\|_{H^1} = 1$ and
\[
\lim_{k \to \infty}\frac{\|f_k\|^2_{L^p} - (1 + \delta)\|f_k\|^2_{L^2}}{2(p - 2)E(f_k)} = C(p, \delta).
\]
Thanks to the compactness of the embedding $H^1(M) \hookrightarrow L^p(M)$ (recall Corollary 16.16), a subsequence of $(f_k)_{k \in \mathbb{N}}$ converges to some nonnegative function $\hat f \in L^p(M)$. If $C(p, \delta) = \infty$, then it necessarily holds that $\lim_{k \to \infty}E(f_k) = 0$, which implies $E(\hat f) = 0$ by the lower semi-continuity of $E$ (Lemma 11.6). This is, however, a contradiction, because $E(\hat f) = 0$ implies $\hat f = 1$ (due to $\|f_k\|_{H^1} = 1$) and hence $\|f_k\|^2_{L^p} - (1 + \delta)\|f_k\|^2_{L^2}$ is negative for large $k$. Therefore we find $C(p, \delta) < \infty$ and
\[
\|\hat f\|^2_{L^p} - (1 + \delta)\|\hat f\|^2_{L^2} \ge 2C(p, \delta)(p - 2)E(\hat f).
\tag{16.41}
\]
Together with the choice of $C(p, \delta)$, we have equality in (16.41). This shows that
\[
\hat f^{p-1} - (1 + \delta)\hat f = -C(p, \delta)(p - 2)\Delta\hat f
\tag{16.42}
\]

holds instead of (16.31). Now, let $V$ be a nowhere vanishing measurable vector field on $M$ such that $V = \nabla\hat f$ on $M_{\hat f}$. Then, we find that the weighted Riemannian manifold $(M, g_V, m)$ satisfies a (non-sharp) Sobolev inequality in the same form as Corollary 16.16. Having this Sobolev inequality in hand, one can show that $\sup_M \hat f < \infty$ and $\inf_M \hat f > 0$ (see the next subsection). Therefore the same argument as Steps 1–3 above yields
\[
C(p, \delta) \le (1 + \delta)\frac{N-1}{KN}.
\]
Hence,
\[
\frac{\|f\|^2_{L^p} - (1 + \delta)\|f\|^2_{L^2}}{p - 2} \le 2(1 + \delta)\frac{N-1}{KN}E(f)
\]
holds for all nonnegative functions $f \in H^1(M)$. Letting $\delta \to 0$ (and possibly $p \to 2N/(N-2)$ in the reversible case) completes the proof.

We close this subsection with a further discussion on the admissible range of $p$ in the sharp Sobolev inequality (16.29).

Remark 16.18 (Admissible Range of $p$) In (16.40), in order to see $a_0 \ge 0$, we used the condition $p \le 2(N+1)/N$ which is more restrictive than the usual range $p \le 2N/(N-2)$ available for Riemannian or reversible Finsler manifolds (see [21, 60]). In fact, a more precise estimate implies that $a_0 \ge 0$ holds also for
\[
p \in \biggl[\frac{2(N+1)}{N},\ \frac{7N^2 + 2N + (N+2)\sqrt{N^2 + 8N}}{4N(N-1)}\biggr],
\]
which slightly improves the acceptable range of $p$. This is, however, still more restrictive than $p \le 2N/(N-2)$. Actually, in the extremal case of $p = 2N/(N-2)$, one can explicitly calculate:
\[
b_0 = 2\Bigl(1 - \frac{p-1}{N+2}\Bigr) = \frac{2(N-3)}{N-2}, \qquad a_0 = -\frac{2}{N-2} < 0.
\]

… necessarily converges to $\Delta\hat f$ as $t \to 0$ in $L^2(M)$. Then, by the contraction property (13.17), we obtain $\partial_t u_t = P^{\nabla u}_{0,t}(\Delta\hat f)$. This completes the proof.

Recall that $V$ is a nowhere vanishing vector field extending $(\nabla\hat f)|_{M_{\hat f}}$. Henceforth, we work on the weighted Riemannian manifold $(M, g_V, m)$ and show the upper and lower boundedness of $\hat f$ satisfying (16.42) (see Step 5). Let $(P_t)_{t \ge 0}$ be the linear heat semigroup on $(M, g_V, m)$ associated with the linearized Laplacian $\Delta^V$ (constructed in the same way as Proposition 13.20 for the static vector field $V$). For $1 \le p \le q \le \infty$, we denote by $\|P_t\|_{p,q}$ the operator norm of $P_t$ from $L^p(M)$ to $L^q(M)$, i.e.,
\[
\|P_t(\xi)\|_{L^q} \le \|P_t\|_{p,q} \cdot \|\xi\|_{L^p}
\]
for all $\xi \in L^p(M)$. Note that $\|P_t\|_{p,p} = 1$ holds for all $1 \le p \le \infty$ and $t \ge 0$ in the same way as (13.17) (or by Proposition 13.20(iii) for $p = \infty$).

We begin with important contraction properties associated with Sobolev-type inequalities (called the ultracontractivity; see [21, Sect. 6.3]). Recall that we are assuming $\mathrm{Ric}_N \ge K$ for some $K > 0$ and $N \in [n, \infty)$.

Claim 16.21 (Ultracontractivity)
(i) We have $\|P_t\|_{1,2} \le C_0 t^{-N/4}$ for all $t \in (0, 1]$, where $C_0 = C_0(K, N, S_F) > 0$.
(ii) It holds that $\|P_t\|_{1,\infty} \le 2^{N/2}C_0^2 t^{-N/2}$ for all $t \in (0, 1]$.
(iii) More generally, we have
\[
\|P_t\|_{p,q} \le (2^{N/2}C_0^2)^{(q-p)/(pq)} \cdot t^{-N(q-p)/(2pq)}
\]
for all $t \in (0, 1]$ and $1 \le p < q \le \infty$.

Proof (i) Take an arbitrary function $\xi \in L^2(M)$ with $\|\xi\|_{L^1} = 1$ and put $\Phi(t) := \|P_t(\xi)\|^2_{L^2}$. Then we have $\Phi'(t) = -4E^V(P_t(\xi))$ and deduce from the Nash inequality (Lemma 16.14) and $\|P_t(\xi)\|_{L^1} = 1$ that
\[
\Phi(t)^{(N+2)/N} \le \Phi(t) + \frac{8}{KN}E\bigl(P_t(\xi)\bigr) \le \Phi(t) - \frac{2S_F}{KN}\Phi'(t),
\]
where the second inequality is seen in the same way as the proof of Corollary 16.16. This implies that the function $t \longmapsto e^{Kt/S_F}\bigl(1 - \Phi(t)^{-2/N}\bigr)$ is non-increasing. Thus we find
\[
e^{Kt/S_F}\bigl(1 - \Phi(t)^{-2/N}\bigr) \le 1 - \Phi(0)^{-2/N} \le 1,
\]
and hence
\[
\Phi(t) \le \bigl(1 - e^{-Kt/S_F}\bigr)^{-N/2} \le \Bigl(\frac{K}{S_F}e^{-K/S_F}\,t\Bigr)^{-N/2}
\]
for all $t \in (0, 1]$. This completes the proof (since $L^2(M)$ is dense in $L^1(M)$).

(ii) Note that $P_t$ is symmetric (since $V$ is static). Hence we obtain $\|P_t\|_{2,\infty} \le C_0 t^{-N/4}$ as the dual of (i). Then we have
\[
\|P_t\|_{1,\infty} \le \|P_{t/2}\|_{2,\infty} \cdot \|P_{t/2}\|_{1,2} \le C_0^2\Bigl(\frac{t}{2}\Bigr)^{-N/2}.
\]
(iii) This is seen by interpolating (ii) and the contractivity of $P_t$. The classical Riesz–Thorin interpolation theorem (see, e.g., [89]) asserts that
\[
\|P_t\|_{p_\theta, q_\theta} \le \|P_t\|^{1-\theta}_{p_0, q_0} \cdot \|P_t\|^{\theta}_{p_1, q_1}
\]
holds for $\theta \in (0, 1)$, where
\[
\frac{1}{p_\theta} = \frac{1-\theta}{p_0} + \frac{\theta}{p_1}, \qquad \frac{1}{q_\theta} = \frac{1-\theta}{q_0} + \frac{\theta}{q_1}.
\]
First we apply this theorem with $p_0 = p_1 = 1$, $q_0 = 1$ and $q_1 = \infty$ to see
\[
\|P_t\|_{1,q} \le \|P_t\|^{1/q}_{1,1} \cdot \|P_t\|^{(q-1)/q}_{1,\infty} \le \bigl(2^{N/2}C_0^2 t^{-N/2}\bigr)^{(q-1)/q},
\]
where we used $\|P_t\|_{1,1} = 1$ and (ii). Then we put $q_0 = q_1 = q$ in the theorem and obtain
\[
\|P_t\|_{p,q} \le \|P_t\|^{(q-p)/(p(q-1))}_{1,q} \cdot \|P_t\|^{(p-1)q/(p(q-1))}_{q,q} \le \bigl(2^{N/2}C_0^2 t^{-N/2}\bigr)^{(q-p)/(pq)},
\]

which completes the proof.

We define the resolvent operator associated with $\Delta^V$ by
\[
R_\lambda := \int_0^\infty e^{-\lambda t}P_t\,dt, \qquad \lambda > 0,
\tag{16.43}
\]
which can be (formally) represented as $R_\lambda = (\lambda \cdot \mathrm{id} - \Delta^V)^{-1}$ (see [21, Sect. A.1]).

Claim 16.22 Let $\lambda > 0$.
(i) For $p > N/2$, we have $\|R_\lambda\|_{p,\infty} < \infty$.
(ii) For $p \in [1, N/2]$, we have $\|R_\lambda\|_{p,q} < \infty$ for all $q \in [p, pN/(N-2p))$, where $pN/(N-2p)$ is read as $\infty$ when $p = N/2$.

Proof (i) Observe from Claim 16.21(iii) that $\|P_t\|_{p,\infty} \le (2^{N/2}C_0^2)^{1/p}t^{-N/(2p)}$ for $t \in (0, 1]$. Moreover, for $t > 1$, we find
\[
\|P_t\|_{p,\infty} \le \|P_1\|_{p,\infty} \cdot \|P_{t-1}\|_{p,p} \le (2^{N/2}C_0^2)^{1/p}.
\]
Combining these, we obtain
\[
\|R_\lambda\|_{p,\infty} \le \int_0^\infty e^{-\lambda t}\|P_t\|_{p,\infty}\,dt
\le (2^{N/2}C_0^2)^{1/p}\biggl\{\int_0^1 t^{-N/(2p)}\,dt + \int_1^\infty e^{-\lambda t}\,dt\biggr\} < \infty.
\]
(ii) A similar discussion to (i) yields
\[
\|R_\lambda\|_{p,q} \le (2^{N/2}C_0^2)^{(q-p)/(pq)}\biggl\{\int_0^1 t^{-N(q-p)/(2pq)}\,dt + \int_1^\infty e^{-\lambda t}\,dt\biggr\} < \infty.
\]
This completes the proof.
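The exponents in Claims 16.21(iii) and 16.22(ii) can be traced with exact rational arithmetic. The following sketch is for illustration only and rests on assumptions of its own: `None` encodes the exponent $\infty$, and the function names and sample values $N = 6$, $p = 2$, $q = 5$ are hypothetical. It reproduces the two Riesz–Thorin steps and checks that $t^{-N(q-p)/(2pq)}$ is integrable near $0$ exactly when $q < pN/(N-2p)$.

```python
from fractions import Fraction

def interpolate(p0, q0, p1, q1, theta):
    """Return (p_theta, q_theta); None encodes the exponent infinity."""
    def inv(x):
        return Fraction(0) if x is None else 1 / Fraction(x)
    def back(y):
        return None if y == 0 else 1 / y
    theta = Fraction(theta)
    ip = (1 - theta) * inv(p0) + theta * inv(p1)
    iq = (1 - theta) * inv(q0) + theta * inv(q1)
    return back(ip), back(iq)

def singular_exponent(N, p, q):
    """Exponent s with ||P_t||_{p,q} <= const * t^{-s} on (0, 1]."""
    N, p, q = Fraction(N), Fraction(p), Fraction(q)
    return N * (q - p) / (2 * p * q)

N, p, q = 6, Fraction(2), Fraction(5)
# Step 1: between (1, 1) and (1, infinity), with theta = (q - 1)/q.
step1 = interpolate(1, 1, 1, None, (q - 1) / q)
# Step 2: between (1, q) and (q, q), with theta = (p - 1) q / (p (q - 1)).
step2 = interpolate(1, q, q, q, (p - 1) * q / (p * (q - 1)))
# Resulting power of 2^{N/2} C_0^2 t^{-N/2}.
power = ((q - 1) / q) * ((q - p) / (p * (q - 1)))
critical_q = p * N / (N - 2 * p)
```

Here `step1` lands at $(1, q)$ and `step2` at $(p, q)$, `power` equals $(q-p)/(pq)$, and `singular_exponent` stays below $1$ for $q$ below `critical_q` and equals $1$ at `critical_q`, which is where the integral over $(0,1]$ in Claim 16.22(ii) stops converging.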

Let $p_t : M \times M \longrightarrow \mathbb{R}$ be the heat kernel associated with the linear operator $P_t$, i.e.,
\[
[P_t(\xi)](x) = \int_M \xi(y)\,p_t(x, y)\,m(dy)
\]
for any $\xi \in L^1(M)$, $t > 0$, and almost every $x \in M$ (see, e.g., [21, Proposition 1.2.5]). We obtain the following estimate of $p_t$ from a Poincaré-type inequality.

Claim 16.23 We have
\[
|p_t(x, y) - 1| \le 2C_0^2 e^{2K/S_F} \cdot e^{-Kt/S_F}
\]
for all $t \ge 2$ and $(m \otimes m)$-almost every $(x, y) \in M \times M$.

Proof Consider the linear operator
\[
P_t^0(\xi) := P_t(\xi) - \int_M \xi\,dm = P_t\Bigl(\xi - \int_M \xi\,dm\Bigr),
\]
and notice that its kernel is $p_t - 1$. Observe also that $\int_M P_t^0(\xi)\,dm = 0$, $\|P_t^0(\xi)\|^2_{L^2} = \mathrm{Var}_m(P_t(\xi))$, and $P^0_{2+t} = P_1 \circ P_t \circ P_1^0$. Then, given $\xi \in L^1(M)$, it follows from Claim 16.21(i) that
\[
\|P_1^0(\xi)\|_{L^2} \le \|P_1\|_{1,2} \cdot \|P_0^0(\xi)\|_{L^1} \le 2C_0\|\xi\|_{L^1}.
\]
Next, we deduce from $\mathrm{Ric}_\infty \ge \mathrm{Ric}_N \ge K$ the variance decay (as in Proposition 15.5):
\[
\|P^0_{1+t}(\xi)\|_{L^2} \le e^{-Kt/S_F}\|P_1^0(\xi)\|_{L^2}.
\]
Finally, we apply the dual of Claim 16.21(i) to see
\[
\|P^0_{2+t}(\xi)\|_{L^\infty} \le \|P_1\|_{2,\infty} \cdot \|P^0_{1+t}(\xi)\|_{L^2} \le C_0\|P^0_{1+t}(\xi)\|_{L^2}.
\]
Combining these implies
\[
\|P^0_{2+t}(\xi)\|_{L^\infty} \le 2C_0^2 e^{-Kt/S_F}\|\xi\|_{L^1}.
\]
Since $\xi \in L^1(M)$ was arbitrary, for $t \ge 2$, we obtain that
\[
|p_t(x, y) - 1| \le 2C_0^2 e^{-K(t-2)/S_F} = 2C_0^2 e^{2K/S_F}e^{-Kt/S_F}
\]
holds $(m \otimes m)$-almost everywhere.

We are ready to show that the (nonnegative) extremal function $\hat f \in L^p(M)$ satisfying (16.42) is bounded from below and above. Put $C_p := C(p, \delta)(p - 2) > 0$ for brevity. It follows from (16.42) and $\Delta\hat f = \Delta^V\hat f$ (Lemma 11.22) that
\[
\hat f^{p-1} = C_p\Bigl(\frac{1 + \delta}{C_p} \cdot \mathrm{id} - \Delta^V\Bigr)\hat f.
\]
Hence, by setting $\lambda := (1 + \delta)/C_p$, we find
\[
\hat f = \frac{1}{C_p}(\lambda \cdot \mathrm{id} - \Delta^V)^{-1}(\hat f^{p-1}) = \frac{1}{C_p}R_\lambda(\hat f^{p-1}).
\tag{16.44}
\]
(We remark that $p > 2$ is necessary to ensure $\lambda > 0$.) Then a positive lower bound of $\hat f$ can be obtained from Claim 16.23 as follows: By the integral representation (16.43) of $R_\lambda$, we have
\[
\begin{aligned}
R_\lambda(\hat f^{p-1}) &\ge \int_2^\infty e^{-\lambda t}P_t(\hat f^{p-1})\,dt \\
&\ge \int_2^\infty e^{-\lambda t}\max\bigl\{1 - 2C_0^2 e^{2K/S_F}e^{-Kt/S_F}, 0\bigr\}\,dt \cdot \int_M \hat f^{p-1}\,dm > 0
\end{aligned}
\]
almost everywhere on $M$.

The upper boundedness is a consequence of Claim 16.22. On one hand, if $\hat f \in L^r(M)$ for some $r > (p - 1)N/2$, then we have $\hat f^{p-1} \in L^{r/(p-1)}(M)$ and deduce from Claim 16.22(i) and (16.44) that $\hat f \in L^\infty(M)$. On the other hand, granted that $\hat f \in L^r(M)$ for some $r \in [p, (p - 1)N/2]$, Claim 16.22(ii) yields $\hat f \in L^q(M)$ for any $q$ less than
\[
\frac{N \cdot r}{(p - 1)N - 2r}
\]
(read as $\infty$ if $r = (p - 1)N/2$), which is greater than $r$ since $r \ge p > (p - 2)N/2$ (by $p < 2N/(N-2)$). Iterating this improving procedure shows that $\hat f \in L^r(M)$ actually holds for some $r > (p - 1)N/2$, and eventually $\hat f \in L^\infty(M)$.
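The improving procedure described above can be sketched as a short iteration on the integrability exponent. This is a hedged illustration, not the author's construction: the rule "take the midpoint between $r$ and the supremum of admissible $q$" is an arbitrary choice made here (any $q$ strictly between them would do), and the function name and sample values are hypothetical.

```python
def bootstrap_integrability(N, p, max_steps=100):
    """Exponents r obtained by iterating Claim 16.22(ii), starting from r = p.

    Stops once r > (p - 1) N / 2, where Claim 16.22(i) gives boundedness.
    Requires N > 2 and 2 < p < 2N/(N - 2).
    """
    if not (N > 2 and 2 < p < 2 * N / (N - 2)):
        raise ValueError("need N > 2 and 2 < p < 2N/(N-2)")
    threshold = (p - 1) * N / 2
    r, history = p, [p]
    while r <= threshold:
        if len(history) > max_steps:
            raise RuntimeError("did not reach the threshold")
        if r == threshold:
            r = 2 * threshold                      # every finite q is admissible
        else:
            upper = N * r / ((p - 1) * N - 2 * r)  # supremum of admissible q
            r = 0.5 * (r + upper)                  # illustrative choice of q
        history.append(r)
    return history

exponents = bootstrap_integrability(N=6.0, p=2.5)
```

For $N = 6$ and $p = 2.5$ the threshold is $4.5$ and the iteration passes it after two steps. Since the ratio between the admissible supremum and $r$ is bounded below by $N/((p-1)N - 2p) > 1$, the exponents grow at least geometrically, so the loop terminates for every admissible pair.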

Part III Further Topics

In Part III, we overview three advanced topics in comparison Finsler geometry. The first chapter is concerned with Cheeger–Gromoll-type splitting theorems. The following two chapters are devoted to two powerful techniques, besides the calculus we discussed in Part II, in the investigation of spaces of weighted Ricci curvature bounded below.

Chapter 17 Splitting Theorems

Cheeger–Gromoll's classical splitting theorem [72] asserts that, if a complete Riemannian manifold $(M, g)$ of nonnegative Ricci curvature includes a straight line (an isometric copy of the real line $\mathbb{R}$), then it isometrically splits off the real line (i.e., $M$ can be represented as $\Sigma \times \mathbb{R}$ isometrically). We refer to [98, 157, 252] for generalizations to weighted Riemannian manifolds. This beautiful theorem and its generalizations have had quite rich applications in the structure theories of Riemannian manifolds [73], measured Gromov–Hausdorff limits of Riemannian manifolds [68–71], and of metric measure spaces satisfying the Riemannian curvature-dimension condition [48, 106, 183].

In the Finsler setting, one cannot expect the isometric splitting since normed spaces provide counter-examples. We will show a diffeomorphic and measure-preserving splitting instead (Proposition 17.6). In the case of Berwald spaces, we can moreover obtain a one-parameter family of isometries as translations (Proposition 17.9, Theorem 17.10). The Laplacian comparison theorem (Theorem 11.20) and the Bochner–Weitzenböck formula (Chap. 12) are key ingredients. The main reference of this chapter is [199].

17.1 Busemann Functions

Let $(M, F, m)$ be a forward complete measured Finsler manifold. We begin with the analysis of a Busemann function associated with a ray, which plays a fundamental role in splitting theorems. We call a geodesic $\eta : [0, \infty) \longrightarrow M$ a ray if it is globally minimizing and of unit speed, namely $d(\eta(s), \eta(t)) = t - s$ holds for all $0 \le s < t$. The Busemann function $b_\eta : M \longrightarrow \mathbb{R}$ associated with $\eta$ is defined by
\[
b_\eta(x) := \lim_{t \to \infty}\bigl\{t - d\bigl(x, \eta(t)\bigr)\bigr\}.
\]
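As a concrete illustration of this definition, consider the simplest special case, the Euclidean plane with the ray $\eta(t) = tv$ for a unit vector $v$. This flat, reversible setting is an assumption made only for the example; there the Busemann function is the linear projection $x \mapsto \langle x, v\rangle$. The sketch below evaluates $t - d(x, \eta(t))$ for increasing $t$; the names and sample points are hypothetical.

```python
import math

def busemann_approx(x, v, t):
    """t - |x - eta(t)| for the Euclidean ray eta(t) = t v with |v| = 1."""
    return t - math.dist(x, (t * v[0], t * v[1]))

v = (0.6, 0.8)        # unit direction of the ray (illustrative)
x = (2.0, -1.0)       # sample point
times = (1.0, 10.0, 100.0, 1e4, 1e6)
values = [busemann_approx(x, v, t) for t in times]
projection = x[0] * v[0] + x[1] * v[1]   # <x, v>, the expected limit
```

The computed values increase with $t$, stay below $d(\eta(0), x) = |x|$, and approach `projection`, in agreement with the monotonicity and boundedness that guarantee the existence of the limit.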

© The Author(s), under exclusive license to Springer Nature Switzerland AG 2021 S.-i. Ohta, Comparison Finsler Geometry, Springer Monographs in Mathematics, https://doi.org/10.1007/978-3-030-80650-7_17

This limit indeed exists because the triangle inequality ensures that, for any $s < t$,
\[
s - d\bigl(x, \eta(s)\bigr) \le t - d\bigl(x, \eta(t)\bigr) \le d\bigl(\eta(0), x\bigr).
\]
The triangle inequality also shows that $b_\eta$ is 1-Lipschitz in the sense that $b_\eta(y) - b_\eta(x) \le d(x, y)$ for all $x, y \in M$ (recall Exercise 3.6). Hence, $b_\eta$ is differentiable almost everywhere by Rademacher's theorem.

We say that another ray $\sigma : [0, \infty) \longrightarrow M$ is asymptotic to $\eta$, denoted by $\sigma \sim \eta$, if there are sequences $(t_i)_{i \in \mathbb{N}}$ with $\lim_{i \to \infty}t_i = \infty$ and $(\sigma_i)_{i \in \mathbb{N}}$ such that $\sigma_i : [0, d(\sigma(0), \eta(t_i))] \longrightarrow M$ is a minimal geodesic from $\sigma(0)$ to $\eta(t_i)$ and that $\lim_{i \to \infty}\sigma_i(t) = \sigma(t)$ for all $t \ge 0$. In the next lemma we summarize some fundamental properties of the Busemann function (see, e.g., [199, Lemma 3.1], [232, Theorem 3.8.2]).

Lemma 17.1 (Properties of Busemann Function) Let $\eta : [0, \infty) \longrightarrow M$ be a ray.
(i) For any $x \in M$, there exists a ray $\sigma$ asymptotic to $\eta$ such that $\sigma(0) = x$.
(ii) For any ray $\sigma \sim \eta$ and $t \ge 0$, we have $b_\eta(\sigma(t)) = b_\eta(\sigma(0)) + t$.
(iii) If $b_\eta$ is differentiable at $x \in M$, then $\sigma(t) := \exp_x(t\nabla b_\eta(x))$ is a unique ray asymptotic to $\eta$ emanating from $x$.

From these properties, one could regard the Busemann function $b_\eta$ as "a projection to $\eta$" or "the (signed) distance function from the level set $b_\eta^{-1}(0)$".

The next proposition is a key analytic step utilizing the Laplacian comparison theorem (Theorem 11.20) under $\mathrm{Ric}_N \ge 0$. When $N = \infty$, we assume that the weight function $\Psi : UM \longrightarrow \mathbb{R}$ (restricted to the unit tangent sphere bundle), defined by $\Psi(v) := \psi_\eta(0)$ with $\eta(t) = \exp_x(tv)$ as in Sect. 9.3, is bounded above.

Proposition 17.2 (Subharmonicity of Busemann Function) Suppose that $\mathrm{Ric}_N \ge 0$ holds for some $N \in [n, \infty]$, and that $\Psi : UM \longrightarrow \mathbb{R}$ defined above is bounded above if $N = \infty$. Then, for any ray $\eta : [0, \infty) \longrightarrow M$, the associated Busemann function $b_\eta$ is subharmonic in the sense that $\Delta b_\eta \ge 0$ holds in the weak sense.

Proof First we consider the case of $N < \infty$. Fix an arbitrary bounded open set $\Omega \subset M$ and a nonnegative test function $\phi \in H_0^1(\Omega)$. We put $r_i(x) := -d(x, \eta(i))$ for $i \in \mathbb{N}$. Then, for $x \notin \{\eta(i)\} \cup \overleftarrow{\mathrm{Cut}}(\eta(i))$, $r_i$ is differentiable at $x$ and $\nabla r_i(x)$ coincides with the initial vector of the unique unit speed minimal geodesic from $x$ to $\eta(i)$. By Lemma 17.1(iii) and the construction of asymptotic rays, we find that $\lim_{i \to \infty}\nabla r_i(x) = \nabla b_\eta(x)$ at every point $x$ where $b_\eta$ is differentiable. Hence, we have, by the dominated convergence theorem,

\[
\lim_{i \to \infty}\int_\Omega d\phi(\nabla r_i)\,dm = \int_\Omega d\phi(\nabla b_\eta)\,dm.
\]
In order to apply Theorem 11.20, observe that $\nabla r_i = -\overleftarrow{\nabla}(-r_i)$ and $-r_i = \overleftarrow{d}(\eta(i), \cdot)$ (recall Sect. 2.5). Therefore it follows from the integration by parts and Theorem 11.20 for $\overleftarrow{F}$ that
\[
\int_\Omega d\phi(\nabla b_\eta)\,dm = \lim_{i \to \infty}\int_\Omega \phi\,\overleftarrow{\Delta}(-r_i)\,dm \le \lim_{i \to \infty}\int_\Omega \phi\,\frac{N-1}{-r_i}\,dm = 0.
\tag{17.1}
\]
Thus $\Delta b_\eta \ge 0$ holds in the weak sense.

In the case of $N = \infty$, we modify the calculation in the proof of Theorem 11.20 (see [98, (2.1)]). As in Theorem 11.20, let $u(x) := d(z, x)$ and $\sigma(t) := \exp_z(tv)$ for some unit vector $v \in U_zM$. Then we have, for $T > 0$ such that there is no cut point of $z$ on $\sigma((0, T])$,
\[
\Delta u\bigl(\sigma(T)\bigr) = -[\Psi(\dot\sigma)]'(T) + (n - 1)\frac{h'}{h}(T)
\tag{17.2}
\]
by (11.14), where $h$ is defined as in the proof of Theorem 8.1. Now, we consider the function
\[
\frac{t^2}{T^2}\,\frac{h'}{h}(t),
\]
which tends to $0$ as $t \to 0$ since $h(t) = O(t)$ and $h'(t) \to 1$ (recall the proof of Theorem 8.1). We find from the Bishop inequality (8.2) and the hypothesis $\mathrm{Ric}_\infty \ge 0$ that
\[
\begin{aligned}
\frac{h'}{h}(T) &= \int_0^T \Bigl[\frac{t^2}{T^2}\,\frac{h'}{h}(t)\Bigr]'\,dt \\
&= \frac{1}{T^2}\int_0^T \Bigl\{2t\,\frac{h'}{h}(t) + t^2\Bigl(\frac{h''}{h}(t) - \frac{h'}{h}(t)^2\Bigr)\Bigr\}\,dt \\
&\le \frac{1}{T^2}\int_0^T \Bigl\{1 - t^2\,\frac{\mathrm{Ric}(\dot\sigma(t))}{n - 1}\Bigr\}\,dt \\
&\le \frac{1}{T} + \frac{1}{(n - 1)T^2}\int_0^T t^2[\Psi(\dot\sigma)]''(t)\,dt.
\end{aligned}
\]
Moreover, it follows from the integration by parts that
\[
\begin{aligned}
\int_0^T t^2[\Psi(\dot\sigma)]''(t)\,dt
&= T^2[\Psi(\dot\sigma)]'(T) - \int_0^T 2t[\Psi(\dot\sigma)]'(t)\,dt \\
&= T^2[\Psi(\dot\sigma)]'(T) - 2T\,\Psi\bigl(\dot\sigma(T)\bigr) + \int_0^T 2\Psi(\dot\sigma)\,dt.
\end{aligned}
\]
Substituting these into (17.2) yields
\[
\Delta u\bigl(\sigma(T)\bigr) \le \frac{n - 1}{T} - \frac{2\Psi(\dot\sigma(T))}{T} + \frac{2}{T^2}\int_0^T \Psi(\dot\sigma)\,dt.
\tag{17.3}
\]

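Coming back to the analysis of $r_i$, we apply (17.3) with respect to $\overleftarrow{F}$ to $-r_i = \overleftarrow{d}(\eta(i), \cdot)$. Noticing $\overleftarrow{\Psi}(v) = \Psi(-v)$, we obtain
\[
\begin{aligned}
\overleftarrow{\Delta}(-r_i)(x)
&\le -\frac{n - 1}{r_i(x)} + \frac{2\Psi(\dot\sigma_i(0))}{r_i(x)} + \frac{2}{r_i(x)^2}\int_0^{-r_i(x)}\Psi(\dot\sigma_i)\,dt \\
&\le -\frac{1}{r_i(x)}\Bigl\{(n - 1) - 2\Psi\bigl(\dot\sigma_i(0)\bigr) + 2\sup_{UM}\Psi\Bigr\}
\end{aligned}
\]
for any $x \in M \setminus (\{\eta(i)\} \cup \overleftarrow{\mathrm{Cut}}(\eta(i)))$, where $\sigma_i : [0, -r_i(x)] \longrightarrow M$ is the unique minimal geodesic from $x$ to $\eta(i)$ with respect to $F$. Therefore (17.1) is available with
\[
N = n + 2\Bigl(\sup_{UM}\Psi - \inf_{U\Omega}\Psi\Bigr)
\]
in place of $N$, and we deduce that $b_\eta$ is subharmonic.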

Remark 17.3 (-Completeness) By modifying the estimates in the above proof, one can in fact weaken the condition supU M  < ∞ into  lim inf

r→∞ σ

r

e−2(σ˙ )/(n−1) dt = ∞

(17.4)

0

for all x ∈ M, where σ runs over all unit speed minimal geodesics with σ (0) = x and of length r. This condition was introduced in [252] and called the -completeness. We refer to [146, 163, 253] for related geometric and analytic comparison theorems. Note that, if supU M  < ∞, then (17.4) follows from the forward infinite extendibility of geodesics, namely the forward completeness. In order to see that (17.4) implies the subharmonicity, we deduce from (17.2) that ! d 2(σ˙ )/(n−1) e u(σ ) dt

2[(σ˙ )] hh − (h )2 u(σ ) − [(σ˙ )] + (n − 1) = e2(σ˙ )/(n−1) n−1 h2

17.1 Busemann Functions

2(σ˙ )/(n−1)

≤ −e ≤−

261

2 ([(σ˙ )] )2 1   u(σ ) + + [(σ˙ )] + Ric(σ˙ ) n−1 n−1

2 e2(σ˙ )/(n−1)  u(σ ) , n−1

where we used the Bishop inequality (8.2) in the first inequality and the hypothesis Ric∞ ≥ 0 in the second inequality (notice that Ric1 ≥ 0 is enough at this step). By letting L(t) := e2(σ˙ (t))/(n−1) u(σ (t)), the above differential inequality yields L ≤ −

e−2(σ˙ )/(n−1) 2 L . n−1

Therefore we have, since limt→0 L(t) = ∞, 1 =− L(T )



L 1 dt ≥ n−1 L2

T 0



T

e−2(σ˙ )/(n−1) dt,

0

and hence   e−2(σ˙ (T ))/(n−1) . u σ (T ) ≤ (n − 1)  T −2(σ˙ )/(n−1) dt 0 e ← − It follows from this inequality with respect to F that, on any bounded open set  ⊂ M and for any nonnegative function φ ∈ H01 () with φ ≤ 1,  



e−2(σ˙ i (0))/(n−1)  −ri (x) −2(σ˙ x )/(n−1) m(dx) i  e dt 0  1 ≤ (n − 1) sup e−2/(n−1)  −ri (x) −2(σ˙ x )/(n−1) m(dx), i  U e dt 0

← − φ  (−ri ) dm ≤ (n − 1)

x

where σix : [0, −ri (x)] −→ M is a minimal geodesic from x to η(i). Since the hypothesis (17.4) ensures 

−ri (x)

lim

i→∞ 0

e−2(σ˙ i

x )/(n−1)

dt = ∞

for all x ∈ , the dominated convergence theorem shows  lim

i→∞ 

← − φ  (−ri ) dm ≤ 0.

This completes the proof of the subharmonicity of bη .

17.2 Diffeomorphic Splitting

Henceforth, let $(M, F, m)$ be both forward and backward complete (in other words, both $(M, F)$ and $(M, \overleftarrow{F})$ are forward complete). Similarly to Proposition 17.2, we assume that $\mathrm{Ric}_N \ge 0$ for some $N \in [n, \infty]$ and that $\Psi$ is bounded above if $N = \infty$. Suppose that $(M, F)$ admits a straight line $\eta : \mathbb{R} \longrightarrow M$, which is a geodesic with $d(\eta(s), \eta(t)) = t - s$ for all $s < t$. Then we can consider the two Busemann functions
\[
b_\eta(x) := \lim_{t \to \infty}\bigl\{t - d\bigl(x, \eta(t)\bigr)\bigr\}, \qquad
b_{\bar\eta}(x) := \lim_{t \to \infty}\bigl\{t - d\bigl(\eta(-t), x\bigr)\bigr\},
\]
where $b_{\bar\eta}$ is precisely the Busemann function for the ray $\bar\eta(t) := \eta(-t)$ ($t \in [0, \infty)$) with respect to $\overleftarrow{F}$. Proposition 17.2 together with the comparison principle for the nonlinear Laplacian shows the harmonicity of $b_\eta$ and $b_{\bar\eta}$ as follows.

Proposition 17.4 (Harmonicity of Busemann Functions) Assume that $\mathrm{Ric}_N \ge 0$ holds for some $N \in [n, \infty]$ and that $\sup_{UM}\Psi < \infty$ if $N = \infty$. Let $\eta : \mathbb{R} \longrightarrow M$ be a straight line. Then we have $b_\eta + b_{\bar\eta} = 0$, and $b_\eta$ and $b_{\bar\eta}$ are harmonic with respect to $F$ and $\overleftarrow{F}$, respectively. Moreover, $b_\eta$ and $b_{\bar\eta}$ are $C^\infty$ and $\Delta b_\eta = \overleftarrow{\Delta}b_{\bar\eta} = 0$ holds in the pointwise sense.

Proof On one hand, observe from the triangle inequality that $b_\eta + b_{\bar\eta} \le 0$. On the other hand, Proposition 17.2 implies $\Delta b_\eta \ge 0$ and $\overleftarrow{\Delta}b_{\bar\eta} \ge 0$ (note that $\overleftarrow{\Psi}(v) = \Psi(-v)$ is bounded above if $N = \infty$), and hence
\[
\Delta b_\eta \ge 0 \ge -\overleftarrow{\Delta}b_{\bar\eta} = \Delta(-b_{\bar\eta}).
\]
Since $b_\eta(\eta(s)) = -b_{\bar\eta}(\eta(s)) = s$ by construction, the strong comparison principle (see [104, Lemma 5.4]) then yields that $b_\eta = -b_{\bar\eta}$, and we also find $\Delta b_\eta = \overleftarrow{\Delta}b_{\bar\eta} = 0$ in the weak sense. Since a harmonic function is a static solution to the heat equation, $b_\eta$ is $C^{1,\alpha}$ by Theorem 13.18. Furthermore, $\nabla b_\eta$ does not vanish since $F(\nabla b_\eta) = 1$ by Lemma 17.1(ii). Therefore $b_\eta$ and $b_{\bar\eta} = -b_\eta$ are eventually $C^\infty$ (recall Sect. 13.4), and $\Delta b_\eta = \overleftarrow{\Delta}b_{\bar\eta} = 0$ holds in the pointwise sense.

We say that a straight line $\sigma : \mathbb{R} \longrightarrow M$ is bi-asymptotic to $\eta$ if $\sigma|_{[0,\infty)} \sim \eta|_{[0,\infty)}$ and $\bar\sigma(t) := \sigma(-t)$ is asymptotic to $\bar\eta$ with respect to $\overleftarrow{F}$. Combining Proposition 17.4 with Lemma 17.1(iii), we obtain the following.

Lemma 17.5 Let $\eta : \mathbb{R} \longrightarrow M$ be a straight line. Then, for any $x \in M$, the geodesic $\sigma : \mathbb{R} \longrightarrow M$ with $\dot\sigma(0) = \nabla b_\eta(x)$ is a unique straight line bi-asymptotic to $\eta$ such that $\sigma(0) = x$.


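Lemma 17.1(ii) implies not only $\nabla b_\eta \neq 0$ but also that every integral curve of $\nabla b_\eta$ is a geodesic (of unit speed). Therefore $\mathrm{Ric}_N(\nabla b_\eta) = \mathrm{Ric}_N^{g_{\nabla b_\eta}}(\nabla b_\eta)$ by Remark 9.12 and we can apply the splitting theorem for weighted Riemannian manifolds in [98] to $(M, g_{\nabla b_\eta}, m)$. Thus we obtain the following, yielding a diffeomorphic and measure-preserving splitting of $(M, m)$ (see (17.5) below).

Proposition 17.6 (Isometric Splitting of $(M, g_{\nabla b_\eta}, m)$) Assume that $\mathrm{Ric}_N \ge 0$ holds for some $N \in [n, \infty]$ and that $\sup_{UM} \Psi < \infty$ if $N = \infty$. If $(M, F)$ includes a straight line $\eta : \mathbb{R} \to M$, then $(M, g_{\nabla b_\eta})$ splits isometrically as $M = \Sigma \times \mathbb{R}$ with $\Sigma = b_\eta^{-1}(0)$. Moreover, for each $x_0 \in \Sigma$, $\sigma(t) = (x_0, t) \in \Sigma \times \mathbb{R}$ is a straight line bi-asymptotic to $\eta$ and $\Psi \circ \dot\sigma$ is constant.

Proof Applying the Bochner–Weitzenböck formula (12.6) with respect to $g_{\nabla b_\eta}$ to $b_\eta$, we deduce from $\nabla^{\nabla b_\eta} b_\eta = \nabla b_\eta$ and $\Delta^{\nabla b_\eta} b_\eta = \Delta b_\eta$ (recall Lemma 11.22) that

\[ \mathrm{Ric}_\infty(\nabla b_\eta) + \|\mathrm{Hess}\, b_\eta\|^2_{HS(\nabla b_\eta)} = \Delta^{\nabla b_\eta}\biggl[ \frac{F^2(\nabla b_\eta)}{2} \biggr] - d(\Delta b_\eta)(\nabla b_\eta) = 0, \]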

where the Hessian $\mathrm{Hess}\, b_\eta$ is taken with respect to $g_{\nabla b_\eta}$ and we used $F(\nabla b_\eta) = 1$ and $\Delta b_\eta = 0$ in the latter equality. Then the hypothesis $\mathrm{Ric}_\infty \ge \mathrm{Ric}_N \ge 0$ implies that $\mathrm{Hess}\, b_\eta = 0$, thus $\nabla b_\eta$ is a parallel (and hence Killing) vector field with respect to $g_{\nabla b_\eta}$. Therefore the associated one-parameter family of transformations $\varphi_t : M \to M$, $t \in \mathbb{R}$, consists of isometries and $M$ is isometric to the product space $\Sigma \times \mathbb{R}$ with $\Sigma := b_\eta^{-1}(0)$, both with respect to $g_{\nabla b_\eta}$. Note also that the map $\varphi_t$ is written as $\varphi_t(x, s) = (x, s + t)$ in this product structure. For each $x_0 \in \Sigma$, $\sigma(t) := (x_0, t)$ is an integral curve of $\nabla b_\eta$ and hence a straight line bi-asymptotic to $\eta$.

In order to split the measure, observe from the proof of Theorem 12.7 that

\[ \|\mathrm{Hess}\, b_\eta\|^2_{HS(\nabla b_\eta)} \ge \frac{\bigl( \Delta b_\eta + d(\Psi \circ \nabla b_\eta)(\nabla b_\eta) \bigr)^2}{n}. \]

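Combining this with $\mathrm{Hess}\, b_\eta = 0$ and $\Delta b_\eta = 0$, we find that $d(\Psi \circ \nabla b_\eta)(\nabla b_\eta) = 0$. Recalling that $\sigma(t) = (x_0, t)$ is an integral curve of $\nabla b_\eta$, we obtain that $\Psi \circ \dot\sigma$ is constant.

The last assertion in Proposition 17.6 shows that the measure $m$ splits as $m|_\Sigma \otimes L^1$, where $L^1$ is the 1-dimensional Lebesgue measure on $\mathbb{R}$ and the measure $m|_\Sigma$ on $\Sigma$ is defined as $m|_\Sigma(A) := m(A \times [0, 1])$ through the isometry $M = \Sigma \times \mathbb{R}$. Hence, the map

\[ (\Sigma \times \mathbb{R}, m|_\Sigma \otimes L^1) \ni (x_0, t) \longmapsto \varphi_t(x_0) \in (M, m) \tag{17.5} \]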

is diffeomorphic and measure-preserving. It is unclear if this splitting procedure can be iterated, because it seems difficult to determine the structure of (!, F |T ! , m|! ) from the construction in


Proposition 17.6. Precisely, the discussion in terms of $g_{\nabla b_\eta}$ as above does not provide any information about $F|_{T\Sigma}$. For instance, we do not know if $\Sigma$ is totally convex in $(M, F)$ (see Lemma 17.8 below for the Berwald case). One may be able to derive from $\nabla^2 b_\eta = 0$ (which is equivalent to $\mathrm{Hess}\, b_\eta = 0$ by Exercise 12.9) some more information on the structure of $(\Sigma, F|_{T\Sigma})$; however, this has been done only in the Berwald case discussed in the next section.

Remark 17.7 (Necessity of $\sup_{UM} \Psi < \infty$) We remark that the upper boundedness of $\Psi$ (or the $\Psi$-completeness as in Remark 17.3) is necessary for the splitting within the hypothesis $\mathrm{Ric}_\infty \ge 0$. In fact, if $(M, g)$ is a hyperbolic space and $\psi(x) := n d^2(z, x)$ for some fixed point $z \in M$, then we find from the $(2n)$-convexity of $\psi$ (recall Subsect. 8.3.1) that $(M, g, e^{-\psi} \mathrm{vol}_g)$ satisfies $\mathrm{Ric}_\infty \ge -(n - 1) + 2n = n + 1$, whereas $M$ does not isometrically split off the real line.

Splitting theorems for weighted Riemannian manifolds with $\mathrm{Ric}_N \ge 0$ for some $N \in (-\infty, 1]$ were established in [252] by using a variant of the Bochner inequality. In this case, we have an isometric splitting for $N \in (-\infty, 1)$, while only a warped product splitting holds for $N = 1$ (see also [146, 253]).

17.3 The Berwald Case

This section is devoted to a more detailed analysis of the splitting phenomenon for Berwald spaces (recall Sects. 6.3, 10.2). We shall try to generalize the argument in the Riemannian case in order to take a closer look at the construction. Alternatively, one can make use of the structure theorem of Berwald spaces (recall Remark 6.6 and Theorem 6.7, and see [135]).

Throughout this section, let $(M, F, m)$ be a complete Berwald space, and suppose that $\mathrm{Ric}_N \ge 0$ holds for some $N \in [n, \infty]$ and that $\Psi : UM \to \mathbb{R}$ is bounded above if $N = \infty$. By the definition of Berwald spaces, the covariant derivative (4.3) does not depend on the choice of a reference vector. Thus we will omit reference vectors in this section.

A subset $A \subset M$ is said to be totally convex if any minimal geodesic joining two points in $A$ is included in $A$. We say that $A \subset M$ is geodesically complete if, for any geodesic $\eta : [0, 1] \to M$ included in $A$, its extension $\tilde\eta : \mathbb{R} \to M$ (as a geodesic) is still included in $A$. The next lemma is a crucial step concerning the structure of the set $\Sigma = b_\eta^{-1}(0)$ given in Proposition 17.6. We do not know if any analogous property holds in the non-Berwald setting.

Lemma 17.8 ($b_\eta$ is Affine) Assume that $(M, F)$ includes a straight line $\eta : \mathbb{R} \to M$. Then, given any geodesic $\xi : [0, l] \to M$, we have $(b_\eta \circ \xi)'' = 0$. In particular, for each $t \in \mathbb{R}$, the level set $b_\eta^{-1}(t)$ of the Busemann function is totally convex and geodesically complete.

Proof Observe from Lemma 4.8 and $\nabla^2 b_\eta = 0$ (recall Proposition 17.6) that


\[ (b_\eta \circ \xi)'' = \bigl[ g_{\nabla b_\eta}(\nabla b_\eta \circ \xi, \dot\xi) \bigr]' = g_{\nabla b_\eta}\bigl( D_{\dot\xi}(\nabla b_\eta \circ \xi), \dot\xi \bigr) + g_{\nabla b_\eta}(\nabla b_\eta \circ \xi, D_{\dot\xi}\dot\xi) = 0 \]

(i.e., $b_\eta$ is a totally geodesic or affine function). In particular, $b_\eta \circ \xi$ is constant if $b_\eta(\xi(s)) = b_\eta(\xi(t))$ for some $s \neq t$. Therefore, for each $t \in \mathbb{R}$, we find that $b_\eta^{-1}(t)$ is totally convex and geodesically complete.

We remark that, in the non-Berwald case, the covariant derivatives in the above proof are taken with respect to the reference vector $\nabla b_\eta$. Then, $D^{\nabla b_\eta}_{\dot\xi}(\nabla b_\eta \circ \xi)$ still vanishes by $\nabla^2 b_\eta = 0$, whereas it is unclear if $D^{\nabla b_\eta}_{\dot\xi}\dot\xi$ vanishes.

We define a map $\varphi_t : M \to M$ for $t \in \mathbb{R}$ as in the proof of Proposition 17.6, which is the one-parameter family of $C^\infty$-transformations generated from $\nabla b_\eta$ (i.e., $\partial\varphi_t/\partial t = \nabla b_\eta(\varphi_t)$). Combining Proposition 17.6 and Lemma 17.8, we determine the metric structure of the splitting given in Proposition 17.6 as follows.

Proposition 17.9 (Berwald Isometric Splitting) Let $(M, F, m)$ be a complete Berwald space satisfying $\mathrm{Ric}_N \ge 0$ for some $N \in [n, \infty]$, and assume $\sup_{UM} \Psi < \infty$ if $N = \infty$. If $(M, F)$ includes a straight line $\eta : \mathbb{R} \to M$, then we have the following.

(i) For every $t \in \mathbb{R}$, $\varphi_t$ is a measure-preserving isometry such that $\varphi_t(M_0) = M_t$, where we set $M_t := b_\eta^{-1}(t)$. Moreover, it holds that $M = \bigsqcup_{t \in \mathbb{R}} \varphi_t(M_0)$.
(ii) The $(n - 1)$-dimensional submanifold $(M_0, F|_{TM_0}, m|_{M_0})$ is again of Berwald type and satisfies $\mathrm{Ric}_{N-1} \ge 0$.
(iii) Define a projection $\rho : M \to M_0$ by $\rho(\varphi_t(x_0)) := x_0$ for $(x_0, t) \in M_0 \times \mathbb{R}$. Then, a curve $\xi : \mathbb{R} \to M$ is a geodesic if and only if its projections $\rho(\xi) : \mathbb{R} \to M_0$ and $b_\eta(\xi) : \mathbb{R} \to \mathbb{R}$ are geodesic.

Proof (i) We have already seen in (17.5) that $\varphi_t$ is measure-preserving. In order to see that $(d\varphi_t)_x : T_xM \to T_{\varphi_t(x)}M$ is isometric (preserves $F$), we utilize the isometry of parallel transports in Proposition 6.5. To this end, for any $v \in T_xM$, we deduce from Proposition 17.6 that $V(t) := d\varphi_t(v)$ is a parallel vector field with respect to $g_{\nabla b_\eta}$ along the geodesic $\sigma(t) := \varphi_t(x)$ (which is an integral curve of $\nabla b_\eta$). Then, since all integral curves of $\nabla b_\eta$ are geodesic, Proposition 4.6 yields that $V$ is also a parallel vector field with respect to $F$. Hence, we find from Proposition 6.5 that $(d\varphi_t)_x$ preserves $F$.

(ii) The total convexity of $M_0$ in Lemma 17.8 implies that $(M_0, F|_{TM_0})$ is of Berwald type (by, e.g., (III) of Proposition 6.11). As for the curvature bound, given a nonzero vector $v_0 \in T_{x_0}M_0 \setminus \{0\}$, we extend it to a vector field $V_0$ on a neighborhood $U_0 \subset M_0$ of $x_0$ such that all of its integral curves are geodesic. Then we consider a vector field $V$ on $U := U_0 \times \mathbb{R} \subset M$ given by $V(y_0, t) := (V_0(y_0), 0)$. Since all integral curves of $V$ are geodesic again, we obtain from Theorem 5.12 that


\[ \mathrm{Ric}^{M_0}_{N-1}(v_0) = \mathrm{Ric}^M_N\bigl( (v_0, 0) \bigr) \ge 0, \]

where $(v_0, 0) \in T_{x_0}M_0 \times T_0\mathbb{R} = T_{(x_0,0)}M$ via the product structure $M = M_0 \times \mathbb{R}$ (note that $(N - 1) - (n - 1) = N - n$).

(iii) The assertion means that the geodesic equation on $M$ splits into those on $M_0$ and $\mathbb{R}$. Given an open set $U_0 \subset M_0$ with local coordinates $(x^i)_{i=1}^{n-1}$, we consider the coordinates $(x^i)_{i=1}^n$ of $U_0 \times \mathbb{R} \subset M$ with $x^n = b_\eta$. Then, since $g_{\nabla b_\eta}(\nabla b_\eta, TM_t) = 0$, $F(\nabla b_\eta) = 1$ and $\varphi_t$ is isometric, we have $\gamma^i_{jk}(\nabla b_\eta) = 0$ unless $1 \le i, j, k \le n - 1$. This implies $G^i(\nabla b_\eta) = 0$ and hence $N^i_j(\nabla b_\eta) = 0$ for all $1 \le i, j \le n$ by (4.10). Therefore we find $\Gamma^i_{jk}(\nabla b_\eta) = 0$ unless $1 \le i, j, k \le n - 1$. Then, since $\Gamma^i_{jk}$ is fiber-wise constant by the very definition of Berwald spaces (Definition 6.4), the geodesic equation on $M$ splits into those on $M_0$ and $\mathbb{R}$ as

\[ D_{\dot\xi}\dot\xi = \sum_{i=1}^{n-1} \biggl\{ \ddot\xi^i + \sum_{j,k=1}^{n-1} \Gamma^i_{jk}(\xi) \dot\xi^j \dot\xi^k \biggr\} \frac{\partial}{\partial x^i}\Big|_\xi + \ddot\xi^n \frac{\partial}{\partial x^n}\Big|_\xi. \]

This completes the proof.

We remark that, unlike the Riemannian case, one cannot reconstruct the Finsler structure $F$ on $M$ only from the factor $F|_{TM_0}$. Indeed, given $x \in M_0$, what we know is $F|_{T_xM_0}$ and the fact that $T_xM_0$ is perpendicular to $\nabla b_\eta(x)$ with respect to $g_{\nabla b_\eta}$. These provide only a little information about $F|_{T_xM \setminus T_xM_0}$.

Thanks to Proposition 17.9(ii), one can iterate the construction in Proposition 17.9 and obtain the following.

Theorem 17.10 (Iterated Splitting) Let $(M, F, m)$ and $\eta : \mathbb{R} \to M$ be as in Proposition 17.9. Then we have the following.

(i) There exist a $k$-parameter family of measure-preserving isometries $\varphi_p : M \to M$, $p \in \mathbb{R}^k$, and an $(n - k)$-dimensional totally convex, geodesically complete submanifold $\Sigma \subset M$ such that

• $(\Sigma, F|_{T\Sigma})$ does not include a straight line,
• $(\Sigma, F|_{T\Sigma}, m|_\Sigma)$ is of Berwald type with $\mathrm{Ric}_{N-k} \ge 0$,
• $\bigsqcup_{p \in \mathbb{R}^k} \varphi_p(\Sigma) = M$,
• $\varphi_{p+q} = \varphi_q \circ \varphi_p$ holds for any $p, q \in \mathbb{R}^k$.

In particular, $(M, m)$ admits a diffeomorphic, measure-preserving splitting $(M, m) = (\Sigma \times \mathbb{R}^k, m|_\Sigma \otimes L^k)$, where $L^k$ is the $k$-dimensional Lebesgue measure on $\mathbb{R}^k$.
(ii) For each $x_0 \in \Sigma$, $H_{x_0} := \{\varphi_p(x_0) \mid p \in \mathbb{R}^k\}$ is a $k$-dimensional submanifold of $M$. Moreover, $(H_{x_0}, F|_{TH_{x_0}})$ is of Berwald type and its flag curvature is identically 0.
(iii) For any $x_0, y_0 \in \Sigma$, $(H_{x_0}, F|_{TH_{x_0}})$ is isometric to $(H_{y_0}, F|_{TH_{y_0}})$.


Proof (i) This is seen by iterating the splitting procedure as in Proposition 17.9. If we split as $M = M_1 \times \mathbb{R}$ along a straight line $\eta_1 : \mathbb{R} \to M$ and $M_1$ again includes a straight line $\eta_2 : \mathbb{R} \to M_1$, then we define $\varphi_{(t,s)} : M \to M$ by

\[ \varphi_{(t,s)}(x, r) := \varphi^{(1)}_t\bigl( \varphi^{(2)}_s(x), r \bigr) = \bigl( \varphi^{(2)}_s(x), r + t \bigr) \]

for $(x, r) \in M_1 \times \mathbb{R}$, where $\varphi^{(1)}_t : M \to M$ and $\varphi^{(2)}_s : M_1 \to M_1$ are the one-parameter families of measure-preserving isometries associated with $\eta_1$ and $\eta_2$, respectively. By construction, $\varphi_{(t,s)}$ is isometric. We obtain the claim by iterating this construction.

(ii) It follows from Proposition 17.9(iii) that $H_{x_0}$ is totally geodesic (i.e., locally totally convex) and hence of Berwald type. Moreover, for each 2-dimensional affine subspace $P \subset \mathbb{R}^k$, the 2-dimensional submanifold $\Pi := \{\varphi_p(x_0) \mid p \in P\}$ of $H_{x_0}$ is totally geodesic again by Proposition 17.9(iii). Hence, it is of Berwald type and we find from Theorem 6.8 that $\Pi$ is either locally Minkowskian (hence flat) or Riemannian. Observe that Proposition 17.9(iii) implies the flatness of $\Pi$ even if it is Riemannian. Therefore the flag curvature of $(H_{x_0}, F|_{TH_{x_0}})$ is identically 0.

(iii) This is a consequence of the isometry of parallel transports (Proposition 6.5). Let $\eta_1, \eta_2, \ldots, \eta_k$ be the straight lines used in the construction of $\Sigma$. Observe that $TH_{x_0}$ is spanned by the gradient vector fields of the $k$ Busemann functions $b_{\eta_i}$, $i = 1, 2, \ldots, k$, and these gradient vector fields are parallel by $\nabla^2 b_{\eta_i} = 0$. Then, for a minimal geodesic $\xi : [0, 1] \to \Sigma$ from $x_0$ to $y_0$, the parallel transport along $\varphi_p \circ \xi$ sends $\nabla b_{\eta_i}(\varphi_p(x_0))$ to $\nabla b_{\eta_i}(\varphi_p(y_0))$ and is linearly isometric. Therefore $H_{y_0}$ is isometric to $H_{x_0}$.

We remark that a Berwald space with vanishing flag curvature $K = 0$ is necessarily locally Minkowskian (see Theorem 6.8, [25, Proposition 10.5.1]). Hence, if $\Sigma$ degenerates to a single point $\{x_0\}$ ($k = n$), then $M = H_{x_0}$ is an $n$-dimensional Minkowski normed space. In terms of Szabó's classification (Remark 6.6), assuming that $M$ is simply connected, we find that $H_{x_0}$ corresponds to the maximal flat factor (a Minkowski normed space) and $\Sigma$ is the product of (irreducible) Riemannian manifolds and symmetric Berwald spaces of compact type (see [135]).

The argument in this section relies on various fine properties of Berwald spaces, already in the key step (Lemma 17.8) at the beginning. It is an open problem whether any of the results in this section can be (or cannot be) generalized to non-Berwald Finsler manifolds.

Chapter 18

Curvature-Dimension Condition

Since the end of the twentieth century, optimal transport theory has been making breathtaking and diverse progress in and outside mathematics, e.g., in partial differential equations, probability theory, differential geometry, economics, and image processing, to name a few. What is especially relevant to our interest is the convexity of an entropy functional along optimal transports, called the curvature-dimension condition. This notion, due to Lott, Sturm, and Villani, turned out to have rich applications in analysis and geometry. In this chapter, in our framework of Finsler manifolds, we overview the basic ideas of optimal transport theory and its relation with the weighted Ricci curvature. We refer to [243] for the fundamentals of optimal transport theory, [244] for a comprehensive treatise on the curvature-dimension condition, [222] for a unique overview including applied subjects, and to [8] for a survey including more recent developments (on metric measure spaces). As for the Finsler setting, we refer to [193] as well as the surveys [194, 197].

18.1 Optimal Transport Theory

Optimal transport theory is concerned with the minimal cost we need for transporting (pushing forward) one probability measure to another one. This is formulated as follows in our setting. Let $(M, F)$ be a Finsler manifold, and denote by $\mathcal{P}(M)$ the set of all Borel probability measures on $M$. We also introduce $\mathcal{P}_p(M) \subset \mathcal{P}(M)$ for $p \in [1, \infty)$ as the subset consisting of measures with finite $p$-th moment, i.e., $\mu \in \mathcal{P}_p(M)$ if we have

\[ \int_M \bigl\{ d^p(x, y) + d^p(y, x) \bigr\} \, \mu(dy) < \infty \]

© The Author(s), under exclusive license to Springer Nature Switzerland AG 2021 S.-i. Ohta, Comparison Finsler Geometry, Springer Monographs in Mathematics, https://doi.org/10.1007/978-3-030-80650-7_18


for some (equivalently, for all) $x \in M$. The transport cost will be defined by using the cost of transporting a unit mass from a point $x \in M$ to another point $y \in M$. We will employ $d^p(x, y)$ with $p \in [1, \infty)$ as such a cost. Given $\mu, \nu \in \mathcal{P}_p(M)$, we say that a measurable map $T : M \to M$ transports $\mu$ to $\nu$ if $\nu$ is the push-forward of $\mu$ by $T$, denoted by $T_*\mu = \nu$, in the sense that

\[ \int_M \phi \, d\nu = \int_M \phi \circ T \, d\mu \]

holds for all bounded continuous functions $\phi \in C_b(M)$. The transport cost of $T$ is, naturally,

\[ \int_M d^p\bigl( x, T(x) \bigr) \, \mu(dx). \]

Then the Monge problem for the cost $d^p$ asks how to find and characterize a map $T : M \to M$ attaining

\[ \inf \biggl\{ \int_M d^p\bigl( x, T(x) \bigr) \, \mu(dx) \biggm| T_*\mu = \nu \biggr\}. \tag{18.1} \]

We call $T$ attaining the infimum in (18.1) a $d^p$-optimal transport map from $\mu$ to $\nu$. The Monge problem is, however, ill-posed when there is no map $T$ satisfying $T_*\mu = \nu$. For instance, if $\mu$ is a Dirac measure $\delta_x$ at some point $x$ and the support of $\nu$ contains more than one point, then we need to transport from $x$ to more than one point, and hence no transport from $\mu$ to $\nu$ can be represented by a map like $T$. Moreover, the non-convex constraint $T_*\mu = \nu$ (in the "horizontal" direction) is not easy to analyze.

In order to overcome these difficulties of the Monge problem, instead of considering the pairs $(x, y)$ between which we transport mass (i.e., $y = T(x)$), we focus on the amount of mass we transport from $x$ to $y$ for all pairs $(x, y)$. Precisely, we consider a probability measure $\pi \in \mathcal{P}(M \times M)$ on the product space and call it a coupling of $(\mu, \nu)$ if the push-forward of $\pi$ to the first (resp. second) marginal coincides with $\mu$ (resp. $\nu$); in other words, we have

\[ \pi(A \times M) = \mu(A), \qquad \pi(M \times A) = \nu(A) \]

for any measurable set $A \subset M$. For $A, B \subset M$, one can regard $\pi(A \times B)$ as the amount of mass transported from $A$ to $B$ according to the transport plan $\pi$. We denote by $\Pi(\mu, \nu)$ the set of all couplings of $(\mu, \nu)$. Note that the product measure $\mu \otimes \nu$ is a coupling of $(\mu, \nu)$, and thus $\Pi(\mu, \nu)$ is nonempty. The Kantorovich problem then asks how to find and characterize a coupling $\pi$ of $(\mu, \nu)$ attaining


\[ \inf_{\pi \in \Pi(\mu, \nu)} \int_{M \times M} d^p(x, y) \, \pi(dx\,dy). \tag{18.2} \]

This problem is well-posed in a very general situation (for both costs and spaces), and the convex constraint $\pi \in \Pi(\mu, \nu)$ (in the "vertical" direction) is much easier to analyze. We call a coupling attaining the infimum in (18.2) a $d^p$-optimal coupling of $(\mu, \nu)$ (or simply an optimal coupling when the cost in question is clear from the context). Note that, for a map $T : M \to M$ satisfying $T_*\mu = \nu$, the push-forward $(\mathrm{id}_M \times T)_*\mu$ of $\mu$ by the map $(\mathrm{id}_M \times T)(x) := (x, T(x))$ is a coupling of $(\mu, \nu)$. Thus we find that the infimum in the Kantorovich problem (18.2) is not greater than that in the Monge problem (18.1).

The infimum in (18.2) can be used to measure how far it is from $\mu$ to $\nu$. Then we arrive at the following notion.

Definition 18.1 (Wasserstein Distance $W_p$) For $\mu, \nu \in \mathcal{P}_p(M)$, we define the $L^p$-Wasserstein distance from $\mu$ to $\nu$ by

\[ W_p(\mu, \nu) := \biggl( \inf_{\pi \in \Pi(\mu, \nu)} \int_{M \times M} d^p(x, y) \, \pi(dx\,dy) \biggr)^{1/p}. \]

We call $(\mathcal{P}_p(M), W_p)$ the $L^p$-Wasserstein space over $M$.

The finiteness of the $p$-th moment ensures that $W_p(\mu, \nu) < \infty$ holds for any $\mu, \nu \in \mathcal{P}_p(M)$. The function $W_p$ is indeed an asymmetric distance function on $\mathcal{P}_p(M)$ ($W_p(\nu, \mu)$ may not coincide with $W_p(\mu, \nu)$ when $F$ is not reversible), and the base space $(M, d)$ is isometrically embedded into $(\mathcal{P}_p(M), W_p)$ by the map $x \mapsto \delta_x$ (i.e., $W_p(\delta_x, \delta_y) = d(x, y)$). We will utilize the cases of $p = 1, 2$. The quadratic case appears in the curvature-dimension condition, and the case of $p = 1$ will be instrumental in the needle decomposition discussed in the next chapter.

Let $p = 2$ and $(M, F, m)$ be a forward (or backward) complete measured Finsler manifold in the remainder of this chapter. We say that $\mu$ is absolutely continuous with respect to $m$ if we have $\mu(A) = 0$ for any set $A \subset M$ with $m(A) = 0$, and then we write $\mu \ll m$. Note that Dirac measures are not absolutely continuous with respect to $m$. We will denote by $\mathcal{P}_2^{ac}(M; m)$ the set consisting of $\mu \in \mathcal{P}_2(M)$ with $\mu \ll m$. For $\mu \in \mathcal{P}_2^{ac}(M; m)$, we can solve the Monge problem (18.1) as follows (due to [44] in the Euclidean case and [174] in the Riemannian case).

Theorem 18.2 (Optimal Transport Map) For any $\mu \in \mathcal{P}_2^{ac}(M; m)$ and $\nu \in \mathcal{P}_2(M)$, there exists a $(d^2/2)$-concave function $\varphi : M \to \mathbb{R}$ such that the map

\[ T(x) := \exp_x\bigl( \nabla(-\varphi)(x) \bigr) \]

is a unique optimal transport map from $\mu$ to $\nu$ and that $\pi := (\mathrm{id}_M \times T)_*\mu$ is a unique optimal coupling of $(\mu, \nu)$.


Therefore, under the mild assumption $\mu \ll m$, the infima in (18.1) and (18.2) coincide and we have

\[ W_2^2(\mu, \nu) = \int_M d^2\bigl( x, T(x) \bigr) \, \mu(dx) = \int_{M \times M} d^2(x, y) \, \pi(dx\,dy). \]

A function $\varphi : M \to \mathbb{R}$ is said to be $(d^2/2)$-concave if there is a function $\psi : M \to \mathbb{R}$ such that

\[ \varphi(x) = \inf_{y \in M} \biggl\{ \frac{d^2(x, y)}{2} - \psi(y) \biggr\} \]

for all $x \in M$ (in other words, $\varphi$ is the $\bar{c}$-transform of $\psi$ for the cost $c(x, y) = d^2(x, y)/2$; recall Remark 12.10). This condition naturally arises from the celebrated Kantorovich duality asserting

\[ \frac{1}{2} W_2^2(\mu, \nu) = \sup \biggl\{ \int_M \varphi \, d\mu + \int_M \psi \, d\nu \biggm| \varphi(x) + \psi(y) \le \frac{d^2(x, y)}{2} \biggr\}, \tag{18.3} \]

which is behind the construction of an optimal transport map $T$ as in Theorem 18.2. The function $\varphi$ in Theorem 18.2 (or $-\varphi$) is called a Kantorovich potential generating the optimal transport from $\mu$ to $\nu$. The $L^1$-counterpart to (18.3) will play a role in the needle decomposition (see (19.6)).

A short outline of the proof of Theorem 18.2 via (18.3) is as follows: We can take a pair $(\varphi, \psi)$ attaining the supremum in (18.3). Then $\varphi$ is necessarily the $\bar{c}$-transform of $\psi$, and we have $\varphi(x) + \psi(y) = d^2(x, y)/2$ with $y = \exp_x(\nabla(-\varphi)(x))$ for $\mu$-almost every $x$. Moreover, the map $T(x) := \exp_x(\nabla(-\varphi)(x))$ satisfies $T_*\mu = \nu$ by a variational argument on the maximality of $(\varphi, \psi)$. This shows (18.3) and that $T$ is an optimal transport map from $\mu$ to $\nu$.

In the above proof, we use the fact that any $(d^2/2)$-concave function is locally Lipschitz and twice differentiable almost everywhere (by a generalization of Alexandrov's theorem; see [192, 193] for details). In particular, the optimal transport map $T$ in Theorem 18.2 is differentiable almost everywhere. Note also that, by construction, $\mu_\lambda := (T_\lambda)_*\mu$ with $T_\lambda(x) := \exp_x(\lambda\nabla(-\varphi)(x))$ provides a unique minimal geodesic from $\mu$ to $\nu$ with respect to $W_2$, i.e., we have $W_2(\mu_\lambda, \mu_\tau) = (\tau - \lambda) W_2(\mu, \nu)$ for all $0 \le \lambda < \tau \le 1$. We remark that $\mu_\lambda \ll m$ holds for all $\lambda \in [0, 1)$.
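To make Definition 18.1 concrete, here is a minimal sketch that computes $W_p$ between two uniform measures with the same number of atoms on the real line, using a hypothetical non-reversible distance in which moving to the right costs `a` per unit length and moving to the left costs `b` per unit length. For such measures the infimum over couplings is attained at a bijection between the atoms (the extreme points of the doubly stochastic matrices are permutation matrices), so a brute-force search over permutations suffices for small examples. The function names, the distance `d`, and the sample atoms are assumptions made for illustration only.

```python
from itertools import permutations

def d(x, y, a=1.0, b=2.0):
    """Hypothetical asymmetric distance on R: cost a per unit to the
    right, cost b per unit to the left (d(x, y) != d(y, x) if a != b)."""
    return a * (y - x) if y >= x else b * (x - y)

def wasserstein(xs, ys, p=2):
    """W_p from the uniform measure on xs to the uniform measure on ys
    (same number of atoms), by brute force over all bijections."""
    n = len(xs)
    best = min(
        sum(d(x, ys[j]) ** p for x, j in zip(xs, perm))
        for perm in permutations(range(n))
    )
    return (best / n) ** (1.0 / p)

mu = [0.0, 1.0, 3.0]
nu = [2.0, 4.0, 5.0]
forward = wasserstein(mu, nu)    # all mass moves to the right
backward = wasserstein(nu, mu)   # all mass moves to the left
dirac = wasserstein([0.0], [3.0])
print(forward, backward, dirac)
```

The two values `forward` and `backward` differ, which illustrates the asymmetry of $W_p$ for non-reversible structures, and the last value reproduces $W_p(\delta_x, \delta_y) = d(x, y)$. The brute-force search is only practical for a handful of atoms; it is meant to mirror the definition, not to be an efficient solver.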

18.2 Curvature-Dimension Condition

To motivate the interplay between optimal transports and the (weighted) Ricci curvature, it is worth mentioning the classical Brunn–Minkowski inequality on the Euclidean space $\mathbb{R}^n$. Given two compact sets $A, B \subset \mathbb{R}^n$, the Brunn–Minkowski


inequality asserts that

\[ |Z_\lambda(A, B)|^{1/n} \ge (1 - \lambda)|A|^{1/n} + \lambda|B|^{1/n} \tag{18.4} \]

holds for all $\lambda \in (0, 1)$, where $|\cdot|$ denotes the Lebesgue measure and $Z_\lambda(A, B) := \{(1 - \lambda)x + \lambda y \mid x \in A,\ y \in B\}$. One can prove (18.4) by analyzing the optimal transport between the uniform distributions on $A$ and $B$ (see [173], [197, Subsect. 4.1], and [243, Sect. 6.1]). We will discuss a generalization to measured Finsler manifolds in Theorem 18.8.

If we consider a weighted Euclidean space $(\mathbb{R}^n, |\cdot|, m)$, where $m = e^{-\psi} L^n$ with a weight function $\psi \in C^\infty(\mathbb{R}^n)$, then one can show that

\[ m\bigl( Z_\lambda(A, B) \bigr)^{1/N} \ge (1 - \lambda) m(A)^{1/N} + \lambda m(B)^{1/N} \tag{18.5} \]

holds for some $N \in (n, \infty)$ if and only if we have

\[ \mathrm{Hess}\, \psi(v, v) - \frac{(d\psi(v))^2}{N - n} \ge 0 \]

for all $v \in T\mathbb{R}^n$ (see [197, Subsect. 4.1]). Therefore the nonnegative weighted Ricci curvature $\mathrm{Ric}_N \ge 0$ is equivalent to the generalized Brunn–Minkowski inequality (18.5). Furthermore, notice that the case of $A = \{x\}$, $B = B(x, R)$ and $\lambda = r/R$ corresponds to the Bishop–Gromov volume comparison $m(B(x, R))/m(B(x, r)) \le (R/r)^N$ under $\mathrm{Ric}_N \ge 0$ (Theorem 9.15).

Remark 18.3 To be precise, such a function $\psi$ on the whole of $\mathbb{R}^n$ is necessarily constant due to the splitting theorem in the previous chapter. Therefore, for example, we should consider the above equivalence on a convex domain in $\mathbb{R}^n$.

Extending the above consideration from pairs of uniform distributions to pairs of arbitrary probability measures leads us to the curvature-dimension condition, which will be defined as the convexity of an entropy functional on the Wasserstein space. This "horizontal" notion of convexity, going back to McCann's pioneering work [173], is also called the displacement convexity to distinguish it from the "vertical" convexity with respect to the affine (linear) structure of $\mathcal{P}(M)$.

Recall from (16.2) that the relative entropy of an absolutely continuous measure $\mu = \rho m \in \mathcal{P}_2^{ac}(M; m)$ is defined by

\[ \mathrm{Ent}_m(\mu) := \int_M \rho \log \rho \, dm, \]

where $\rho$ is the density function (the Radon–Nikodym derivative) of $\mu$ with respect to $m$. Precisely, we set $\mathrm{Ent}_m(\mu) := \infty$ if the positive part is infinite (regardless of the negative part). The relative entropy will be closely related to the behavior of $\mathrm{Ric}_\infty$.
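The displacement convexity of the relative entropy can be watched at work in a one-dimensional toy model. The sketch below assumes the weighted line $(\mathbb{R}, |\cdot|, m)$ with $m = e^{-Kx^2/2} dx$ (so that the weighted Ricci curvature is the constant $K$) and Gaussian measures $N(a, s^2)$, for which three standard closed forms are used: the relative entropy is $-\frac{1}{2}\log(2\pi e s^2) + \frac{K}{2}(a^2 + s^2)$, the $W_2$-geodesic between two Gaussians interpolates mean and standard deviation linearly, and $W_2^2 = (a_0 - a_1)^2 + (s_0 - s_1)^2$. The code checks the $K$-convexity inequality that appears as (18.10) in Definition 18.5(2); for $K = 0$ it is plain convexity of the entropy along the geodesic. All function names and sample parameters are illustrative assumptions.

```python
import math

def entropy(a, s, K):
    """Ent_m of the Gaussian N(a, s^2) for m = exp(-K x^2 / 2) dx."""
    return (-0.5 * math.log(2 * math.pi * math.e * s * s)
            + 0.5 * K * (a * a + s * s))

def interpolate(g0, g1, lam):
    """Point at time lam on the W_2-geodesic between two Gaussians,
    each given as (mean, standard deviation)."""
    (a0, s0), (a1, s1) = g0, g1
    return ((1 - lam) * a0 + lam * a1, (1 - lam) * s0 + lam * s1)

def convexity_gap(g0, g1, lam, K):
    """Right-hand side minus left-hand side of the K-convexity
    inequality for Ent_m; nonnegative when the inequality holds."""
    (a0, s0), (a1, s1) = g0, g1
    w2_sq = (a0 - a1) ** 2 + (s0 - s1) ** 2
    a, s = interpolate(g0, g1, lam)
    rhs = ((1 - lam) * entropy(a0, s0, K) + lam * entropy(a1, s1, K)
           - 0.5 * K * (1 - lam) * lam * w2_sq)
    return rhs - entropy(a, s, K)

gaps = [convexity_gap((0.0, 1.0), (3.0, 0.25), k / 10, K)
        for K in (-1.0, 0.0, 2.0) for k in range(1, 10)]
print(min(gaps))
```

In this model the quadratic weight contributes exactly the term $\frac{K}{2}(1 - \lambda)\lambda W_2^2$, and the remaining slack comes from the concavity of the logarithm applied to the standard deviation, so the gap does not depend on $K$.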


For dealing with $\mathrm{Ric}_N$ with $N \neq \infty$, we also introduce other kinds of entropies as follows.

Definition 18.4 (Entropies) Let $\mu = \rho m \in \mathcal{P}_2^{ac}(M; m)$. Define the Rényi–Tsallis entropy $S_N(\mu)$ of $\mu$ by

\[ S_N(\mu) := -\int_M \rho^{(N-1)/N} \, dm \tag{18.6} \]

for $N \in [n, \infty)$, and

\[ S_N(\mu) := \int_M \rho^{(N-1)/N} \, dm \tag{18.7} \]

for $N \in (-\infty, 0)$. We also define $S_0(\mu) := \operatorname{ess\,sup} \rho = \|\rho\|_{L^\infty}$.

Notice that the generating functions $h(t) = t \log t$ ($N = \infty$), $-t^{(N-1)/N}$ ($N \ge n$) and $t^{(N-1)/N}$ ($N < 0$) are all convex on $[0, \infty)$ (and $h(0) = 0$), and therefore it is natural to change the sign in (18.6) and (18.7). The case of $N = 0$ can be regarded as the limit:

\[ S_0(\mu) = \lim_{N \uparrow 0} S_N(\mu)^{N/(N-1)}. \]

Observe also that

\[ \mathrm{Ent}_m(\mu) = \lim_{N \to \infty} N\bigl( 1 + S_N(\mu) \bigr) = \lim_{N \to \infty} \int_M N\rho(1 - \rho^{-1/N}) \, dm. \]

In order to describe convexity conditions for entropies, we introduce functions defined by using $s_\kappa$ in (9.4):

\[ \tau^{(\lambda)}_{K,N}(r) := \lambda^{1/N} \biggl( \frac{s_{K/(N-1)}(\lambda r)}{s_{K/(N-1)}(r)} \biggr)^{(N-1)/N} \tag{18.8} \]

for $K \in \mathbb{R}$, $N \in (-\infty, 0) \cup [n, \infty)$, $\lambda \in (0, 1)$ and $r > 0$ (or for $r \in (0, \pi\sqrt{(N-1)/K})$ if $K/(N-1) > 0$). We also define $\tau^{(\lambda)}_{K,N}(0) := \lambda$. Observe that $\tau^{(\lambda)}_{0,N}(r) = \lambda$ regardless of $N$ and $r$. For simplicity, we set $\tau^{(\lambda)}_{K,N}(r) := \infty$ for $r \ge \pi\sqrt{(N-1)/K}$ when $K/(N-1) > 0$.

Definition 18.5 (Curvature-Dimension Condition CD(K, N))
(1) Let $K \in \mathbb{R}$ and $N \in [n, \infty)$. We say that $(M, F, m)$ satisfies the curvature-dimension condition CD(K, N) (or $(M, F, m)$ is a CD(K, N)-space) if, for any


pair $\mu_i = \rho_i m \in \mathcal{P}_2^{ac}(M; m)$ ($i = 0, 1$), we have

\[ S_N(\mu_\lambda) \le -\int_{M \times M} \Bigl\{ \tau^{(1-\lambda)}_{K,N}\bigl( d(x, y) \bigr) \rho_0(x)^{-1/N} + \tau^{(\lambda)}_{K,N}\bigl( d(x, y) \bigr) \rho_1(y)^{-1/N} \Bigr\} \, \pi(dx\,dy) \tag{18.9} \]

for all $\lambda \in (0, 1)$, where $(\mu_\lambda)_{\lambda \in [0,1]}$ is the unique $W_2$-geodesic from $\mu_0$ to $\mu_1$ and $\pi$ is the unique optimal coupling of $(\mu_0, \mu_1)$.
(2) CD(K, ∞) for $K \in \mathbb{R}$ is defined in the same way by replacing (18.9) with

\[ \mathrm{Ent}_m(\mu_\lambda) \le (1 - \lambda) \mathrm{Ent}_m(\mu_0) + \lambda \mathrm{Ent}_m(\mu_1) - \frac{K}{2}(1 - \lambda)\lambda W_2^2(\mu_0, \mu_1). \tag{18.10} \]

(3) For $K \in \mathbb{R}$ and $N < 0$, CD(K, N) is defined also in the same way by replacing (18.9) with

\[ S_N(\mu_\lambda) \le \int_{M \times M} \Bigl\{ \tau^{(1-\lambda)}_{K,N}\bigl( d(x, y) \bigr) \rho_0(x)^{-1/N} + \tau^{(\lambda)}_{K,N}\bigl( d(x, y) \bigr) \rho_1(y)^{-1/N} \Bigr\} \, \pi(dx\,dy). \]

(4) CD(K, 0) for $K \in \mathbb{R}$ is defined again in the same way by replacing (18.9) with

\[ S_0(\mu_\lambda) \le \max\biggl\{ \operatorname*{ess\,sup}_{\operatorname{supp} \pi} \rho_0(x) \frac{s_{-K}((1 - \lambda) d(x, y))}{(1 - \lambda) s_{-K}(d(x, y))},\ \operatorname*{ess\,sup}_{\operatorname{supp} \pi} \rho_1(y) \frac{s_{-K}(\lambda d(x, y))}{\lambda s_{-K}(d(x, y))} \biggr\}, \]

where the essential supremum is taken with respect to $\pi$.

Observe that (18.10) can be regarded as "$\mathrm{Hess}\, \mathrm{Ent}_m \ge K$" in the Wasserstein space $(\mathcal{P}_2(M), W_2)$ (recall Definition 8.9). When $K = 0$, CD(0, N) with $N \in (-\infty, 0) \cup [n, \infty)$ can be simplified into the convexity of $S_N$:

\[ S_N(\mu_\lambda) \le (1 - \lambda) S_N(\mu_0) + \lambda S_N(\mu_1). \]

In the case of $K = N = 0$, we have the quasi-convexity:

\[ S_0(\mu_\lambda) \le \max\{ S_0(\mu_0), S_0(\mu_1) \}. \]

With these definitions of CD(K, N), we arrive at the highlight of this chapter.

Theorem 18.6 (CD(K, N) Characterizes $\mathrm{Ric}_N \ge K$) Let $(M, F, m)$ be a forward or backward complete measured Finsler manifold of dimension $n \ge 2$, and take


$K \in \mathbb{R}$ and $N \in (-\infty, 0] \cup [n, \infty]$. Then $\mathrm{Ric}_N \ge K$ holds if and only if $(M, F, m)$ satisfies the curvature-dimension condition CD(K, N).

An outline of the proof of Theorem 18.6 is as follows. The implication from $\mathrm{Ric}_N \ge K$ to CD(K, N) makes use of a similar (but sharper and more technical) argument to the proofs of the Bonnet–Myers theorem (Theorem 8.1) and the Bishop–Gromov volume comparison theorem (Theorem 9.15). Given probability measures $\mu_0, \mu_1 \in \mathcal{P}_2^{ac}(M; m)$ and the optimal transport map $T$ from $\mu_0$ to $\mu_1$, we show that the Jacobian of $T_\lambda$ with respect to $m$ satisfies a certain concavity estimate (in $\lambda$), and we integrate it in $\mu_0$ to obtain CD(K, N). The converse implication is seen via the Brunn–Minkowski inequality (Theorem 18.8). In particular, the Brunn–Minkowski inequality is also equivalent to $\mathrm{Ric}_N \ge K$.

Remark 18.7 (The Case of $K/(N - 1) > 0$) In the case where $N \in [n, \infty)$ and $K > 0$, the restriction $r < \pi\sqrt{(N-1)/K}$ in (18.8) is natural in view of the Bonnet–Myers theorem (Corollary 9.17). When $N < 0$ and $K < 0$, however, Theorem 18.6 provides only a local control within sets of diameter less than $\pi\sqrt{(N-1)/K}$ (see Theorem 18.8(iii)).

We briefly summarize the history of Theorem 18.6. The study of the displacement convexity of entropies in the Euclidean setting goes back to McCann's seminal work [173]. Its connection with the lower Ricci curvature bound was first heuristically discussed in [213] (see also Sect. 18.4 below). Then it was shown in [81] that, on Riemannian manifolds of nonnegative Ricci curvature, entropies in a certain class are displacement convex. In a subsequent paper [82], they also showed that $\mathrm{Ric}_\infty \ge K$ implies CD(K, ∞). In the meantime, Renesse and Sturm [245] established the converse implication for CD(K, ∞) in the Riemannian setting, namely Theorem 18.6 with $N = \infty$ for unweighted Riemannian manifolds. It was extended to the weighted case in [233].
Sturm and Lott–Villani then independently established Theorem 18.6 with N ∈ [n, ∞) for weighted Riemannian manifolds, along with the synthetic theory of metric measure spaces satisfying CD(K, N ) (see [234] for N = ∞, [235] for N ∈ [n, ∞), [161] for N = ∞ or K = 0, and [160] for N ∈ [n, ∞)). The generalization to measured Finsler manifolds was done in [193] for N ∈ [n, ∞]. The cases of N < 0 and N = 0 are more recent and can be found in [200] and [203], respectively.
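The distortion coefficients of (18.8) are easy to tabulate. The following sketch assumes that $s_\kappa$ in (9.4) is the usual comparison function, namely $\sin(\sqrt{\kappa}\, r)/\sqrt{\kappa}$ for $\kappa > 0$, $r$ for $\kappa = 0$, and $\sinh(\sqrt{-\kappa}\, r)/\sqrt{-\kappa}$ for $\kappa < 0$; this form, the function names `s` and `tau`, and the sample values are assumptions for illustration.

```python
import math

def s(kappa, r):
    """Comparison function s_kappa (assumed form of (9.4))."""
    if kappa > 0:
        return math.sin(math.sqrt(kappa) * r) / math.sqrt(kappa)
    if kappa < 0:
        return math.sinh(math.sqrt(-kappa) * r) / math.sqrt(-kappa)
    return r

def tau(K, N, lam, r):
    """tau^{(lam)}_{K,N}(r) as in (18.8), for N < 0 or N >= n >= 2,
    lam in (0, 1) and r >= 0."""
    if r == 0:
        return lam                      # convention tau(0) := lam
    kappa = K / (N - 1)
    if kappa > 0 and r >= math.pi / math.sqrt(kappa):
        return math.inf                 # beyond the Bonnet-Myers range
    ratio = s(kappa, lam * r) / s(kappa, r)
    return lam ** (1.0 / N) * ratio ** ((N - 1.0) / N)

flat = tau(0.0, 3.0, 0.3, 2.0)          # K = 0: equals lam
positive = tau(1.0, 3.0, 0.5, 1.0)      # K > 0, N >= n
negative = tau(-1.0, 3.0, 0.5, 1.0)     # K < 0, N >= n
print(flat, positive, negative)
```

For $N \ge n$ the coefficient exceeds $\lambda$ when $K > 0$ and falls below $\lambda$ when $K < 0$, which is how the curvature bound enters the right-hand side of (18.9).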

18.3 Brunn–Minkowski Inequality

We mentioned that the curvature-dimension condition can be interpreted as a generalization of the Brunn–Minkowski inequality (18.4) to pairs of general probability measures. Therefore one can consider a counterpart of the Brunn–Minkowski inequality in curved spaces by means of the curvature-dimension condition. For $A, B \subset M$ and $\lambda \in (0, 1)$, define $Z_\lambda(A, B)$ as the set of points $\eta(\lambda)$ such that $\eta : [0, 1] \to M$ runs over all minimal geodesics with $\eta(0) \in A$ and $\eta(1) \in B$.


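Theorem 18.8 (Brunn–Minkowski Inequality) Let $(M, F, m)$ be forward or backward complete and satisfy $\mathrm{Ric}_N \ge K$ for some $K \in \mathbb{R}$ and $N \in (-\infty, 0] \cup [n, \infty]$, and take compact sets $A, B \subset M$ with positive masses.

(i) If $N \in [n, \infty)$, then we have

\[ m\bigl( Z_\lambda(A, B) \bigr)^{1/N} \ge \inf_{x \in A, y \in B} \tau^{(1-\lambda)}_{K,N}\bigl( d(x, y) \bigr) \cdot m(A)^{1/N} + \inf_{x \in A, y \in B} \tau^{(\lambda)}_{K,N}\bigl( d(x, y) \bigr) \cdot m(B)^{1/N} \tag{18.11} \]

for all $\lambda \in (0, 1)$.
(ii) If $N = \infty$, then we have

\[ \log m\bigl( Z_\lambda(A, B) \bigr) \ge (1 - \lambda) \log m(A) + \lambda \log m(B) + \frac{K}{2}(1 - \lambda)\lambda W_2^2\biggl( \frac{1_A}{m(A)} m, \frac{1_B}{m(B)} m \biggr) \]

for all $\lambda \in (0, 1)$, where $1_A$ denotes the characteristic function of $A$.
(iii) If $N < 0$, then we have

\[ m\bigl( Z_\lambda(A, B) \bigr)^{1/N} \le \sup_{x \in A, y \in B} \tau^{(1-\lambda)}_{K,N}\bigl( d(x, y) \bigr) \cdot m(A)^{1/N} + \sup_{x \in A, y \in B} \tau^{(\lambda)}_{K,N}\bigl( d(x, y) \bigr) \cdot m(B)^{1/N} \]

for all $\lambda \in (0, 1)$, provided that $\operatorname{diam}(A \cup B) < \pi\sqrt{(N-1)/K}$ for $K < 0$.
(iv) If $N = 0$, then we have, for all $\lambda \in (0, 1)$,

\[ m\bigl( Z_\lambda(A, B) \bigr) \ge \min\biggl\{ \min_{x \in A, y \in B} \frac{(1 - \lambda) s_{-K}(d(x, y))}{s_{-K}((1 - \lambda) d(x, y))} \, m(A),\ \min_{x \in A, y \in B} \frac{\lambda s_{-K}(d(x, y))}{s_{-K}(\lambda d(x, y))} \, m(B) \biggr\}. \]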

Proof The strategies of the proofs are all common: We apply the curvature-dimension condition CD(K, N) to the uniform distributions on $A$ and $B$. Here we explain only the case of $N \in [n, \infty)$. Let $\mu_0$ and $\mu_1$ be the uniform distributions on the sets $A$ and $B$, respectively:

\[ \mu_0 := \frac{1_A}{m(A)} m, \qquad \mu_1 := \frac{1_B}{m(B)} m. \]

Then it follows from the curvature-dimension condition (18.9) that

\[ -S_N(\mu_\lambda) \ge \inf_{x \in A, y \in B} \tau^{(1-\lambda)}_{K,N}\bigl( d(x, y) \bigr) \int_A m(A)^{1/N} \, d\mu_0 + \inf_{x \in A, y \in B} \tau^{(\lambda)}_{K,N}\bigl( d(x, y) \bigr) \int_B m(B)^{1/N} \, d\mu_1 \]
\[ = \inf_{x \in A, y \in B} \tau^{(1-\lambda)}_{K,N}\bigl( d(x, y) \bigr) \cdot m(A)^{1/N} + \inf_{x \in A, y \in B} \tau^{(\lambda)}_{K,N}\bigl( d(x, y) \bigr) \cdot m(B)^{1/N}. \]

Setting $\mu_\lambda = \rho_\lambda m$, we deduce from Hölder's inequality that

\[ -S_N(\mu_\lambda) = \int_M \rho_\lambda^{-1/N} \, d\mu_\lambda \le \biggl( \int_M \rho_\lambda^{-1} \, d\mu_\lambda \biggr)^{1/N} = m(\operatorname{supp} \mu_\lambda)^{1/N}. \]

Since $\operatorname{supp} \mu_\lambda \subset Z_\lambda(A, B)$, we obtain the desired inequality (18.11).

 

Observe that, in (i) and (iii), we have

\[ \inf_{x \in A, y \in B} \tau^{(\lambda)}_{K,N}\bigl( d(x, y) \bigr) = \tau^{(\lambda)}_{K,N}\Bigl( \inf_{x \in A, y \in B} d(x, y) \Bigr) \quad (N \ge n), \qquad \sup_{x \in A, y \in B} \tau^{(\lambda)}_{K,N}\bigl( d(x, y) \bigr) = \tau^{(\lambda)}_{K,N}\Bigl( \inf_{x \in A, y \in B} d(x, y) \Bigr) \quad (N < 0) \]

for $K > 0$, and

\[ \inf_{x \in A, y \in B} \tau^{(\lambda)}_{K,N}\bigl( d(x, y) \bigr) = \tau^{(\lambda)}_{K,N}\Bigl( \sup_{x \in A, y \in B} d(x, y) \Bigr) \quad (N \ge n), \qquad \sup_{x \in A, y \in B} \tau^{(\lambda)}_{K,N}\bigl( d(x, y) \bigr) = \tau^{(\lambda)}_{K,N}\Bigl( \sup_{x \in A, y \in B} d(x, y) \Bigr) \quad (N < 0) \]

for $K < 0$. When $K = 0$, we obtain

\[ m\bigl( Z_\lambda(A, B) \bigr)^{1/N} \ge (1 - \lambda) m(A)^{1/N} + \lambda m(B)^{1/N} \]

for $N \in [n, \infty)$ (i.e., the "concavity" of $m^{1/N}$ of the same form as (18.5)) and the reverse inequality for $N < 0$. The latter inequality ($N < 0$) on weighted Euclidean spaces $(\mathbb{R}^n, e^{-\psi} L^n)$ is called the $(1/N)$-concavity in convex geometry, which is equivalent to the $p$-concavity of the density function $w = e^{-\psi}$:

\[ w^p\bigl( (1 - \lambda)x + \lambda y \bigr) \le (1 - \lambda) w^p(x) + \lambda w^p(y) \]

with $p = 1/(N - n) < 0$ (see [41, 43, 180]). One can readily see that the convexity of $w^p$ is indeed equivalent to the nonnegativity of $\mathrm{Ric}_N$:

\[ \mathrm{Hess}\, \psi(v, v) - \frac{(d\psi(v))^2}{N - n} \ge 0. \]

Under the weakest nonnegative curvature Ric0 ≥ 0, we have a kind of quasiconcavity: m(Zλ (A, B)) ≥ min{m(A), m(B)}. For N ∈ [n, ∞), we can show the Bishop–Gromov volume comparison (Theorem 9.15) by applying (18.11) to concentric balls (or annuli). Moreover, the finiteness of the total mass in the case of N = ∞ and K > 0 (Theorem 9.19) also follows from the corresponding Brunn–Minkowski inequality (see [234, 235]).
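The monotonicity in $r$ of the distortion coefficients used in the observation above can be checked numerically. The following is a minimal sketch, assuming the standard form $\tau^{(\lambda)}_{K,N}(r) = \lambda^{1/N}\big(s_{K/(N-1)}(\lambda r)/s_{K/(N-1)}(r)\big)^{(N-1)/N}$ for the coefficient defined in (18.8), which is not restated in this section; the function names `s_kappa` and `tau` are illustrative only.

```python
import math

def s_kappa(kappa, r):
    """Comparison function: sin, identity, or sinh scaled by sqrt(|kappa|)."""
    if kappa > 0:
        return math.sin(math.sqrt(kappa) * r) / math.sqrt(kappa)
    if kappa < 0:
        return math.sinh(math.sqrt(-kappa) * r) / math.sqrt(-kappa)
    return r

def tau(K, N, lam, r):
    """Distortion coefficient, ASSUMING the standard form
    lam^(1/N) * (s_{K/(N-1)}(lam r) / s_{K/(N-1)}(r))^((N-1)/N).
    Only meaningful for r below the first zero of s_{K/(N-1)}."""
    kappa = K / (N - 1.0)
    ratio = s_kappa(kappa, lam * r) / s_kappa(kappa, r)
    return lam ** (1.0 / N) * ratio ** ((N - 1.0) / N)

def is_monotone(values, increasing):
    """Check strict monotonicity of a finite sequence."""
    pairs = zip(values, values[1:])
    return all(a < b for a, b in pairs) if increasing else all(a > b for a, b in pairs)

radii = [0.1 * k for k in range(1, 31)]  # stays inside (0, pi) for the cases below
```

With these helpers, `tau(0.0, N, lam, r)` reduces to `lam`, and evaluating `tau` along `radii` shows the four monotonicity patterns (increasing for $K > 0$, $N \ge n$ and for $K < 0$, $N < 0$; decreasing in the other two cases), which is what allows the infimum or supremum to be moved inside the coefficient.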

18.4 Analytic Applications

In their influential paper [213], Otto and Villani discovered that the convexity of the relative entropy is instrumental in studying various functional inequalities (see also [161, 244]). We briefly review this intuitive viewpoint, followed by an application to the concentration of measures. We will give outlines of proofs to compare them with the arguments in Chaps. 15 and 16 based on the Γ-calculus.

18.4.1 Functional Inequalities

The following inequality is also called the (quadratic) transportation cost inequality or the transportation cost-information inequality (see, e.g., [148]).

Theorem 18.9 (Talagrand Inequality) Assume that (M, F, m) is forward or backward complete and satisfies $m(M) = 1$ and $\mathrm{Ric}_\infty \ge K$ for some $K > 0$. Then we have, for any $\mu \in \mathcal{P}_2^{\mathrm{ac}}(M; m)$,
\[
W_2^2(m, \mu) \le \frac{2}{K} \operatorname{Ent}_m(\mu). \tag{18.12}
\]

Proof One can derive (18.12) from the curvature-dimension condition CD(K, ∞) by a simple argument. For the $W_2$-geodesic $(\mu_\lambda)_{\lambda \in [0,1]}$ from $m$ to $\mu$, we deduce from (18.10) and $\operatorname{Ent}_m(m) = 0$ that
\[
\operatorname{Ent}_m(\mu_\lambda) \le \lambda \operatorname{Ent}_m(\mu) - \frac{K}{2} (1-\lambda) \lambda\, W_2^2(m, \mu). \tag{18.13}
\]
Recalling $\operatorname{Ent}_m(\mu_\lambda) \ge 0$ (see Exercise 16.1), dividing both sides by $\lambda$, and letting $\lambda \to 0$, we obtain (18.12).

We remark that applying (18.12) to the reverse Finsler structure $\overleftarrow{F}$ yields
\[
W_2^2(\mu, m) \le \frac{2}{K} \operatorname{Ent}_m(\mu). \tag{18.14}
\]
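As a sanity check of the Talagrand inequality in the simplest (reversible, one-dimensional) situation, one can take $m$ to be the Gaussian measure $N(0, 1/K)$ on $\mathbb{R}$, for which $\mathrm{Ric}_\infty = K$, and $\mu$ another Gaussian. The sketch below relies on the classical closed formulas for the $L^2$-Wasserstein distance and the relative entropy between one-dimensional Gaussians; these formulas and all function names are assumptions of the example, not taken from the text.

```python
import math

def w2_squared_gauss(a, s, sigma):
    """Squared L2-Wasserstein distance between N(a, s^2) and N(0, sigma^2) on R."""
    return a * a + (s - sigma) ** 2

def rel_entropy_gauss(a, s, sigma):
    """Relative entropy of N(a, s^2) with respect to N(0, sigma^2)."""
    return math.log(sigma / s) + (s * s + a * a) / (2.0 * sigma * sigma) - 0.5

def talagrand_gap(K, a, s):
    """(2/K) Ent_m(mu) - W_2^2(m, mu) for m = N(0, 1/K) and mu = N(a, s^2).

    The Talagrand inequality (18.12) predicts that this is nonnegative.
    """
    sigma = 1.0 / math.sqrt(K)
    return (2.0 / K) * rel_entropy_gauss(a, s, sigma) - w2_squared_gauss(a, s, sigma)
```

For a pure translation (`s` equal to `1 / sqrt(K)`) the gap vanishes, so in this model case the constant $2/K$ cannot be improved.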

A finer analysis again using the convexity (18.13) of $\operatorname{Ent}_m$ shows the logarithmic Sobolev inequality, as the $N = \infty$ version of Theorem 16.4 (which was shown by means of the Γ-calculus).

Theorem 18.10 (Logarithmic Sobolev Inequality) Assume that (M, F, m) is forward or backward complete and satisfies $m(M) = 1$ and $\mathrm{Ric}_\infty \ge K$ for some $K > 0$. Then we have
\[
\operatorname{Ent}_m(\rho m) \le \frac{1}{2K}\, I_m(\rho m) \tag{18.15}
\]
for any locally Lipschitz, nonnegative function $\rho$ with $\int_M \rho\, dm = 1$. Here we denote by $I_m(\rho m)$ the Fisher information of $\rho m$ as in (16.7),
\[
I_m(\rho m) := \int_M \frac{F^2(\nabla \rho)}{\rho}\, dm,
\]
where the integrand is regarded as 0 on $\rho^{-1}(0)$.

Proof We put $\mu := \rho m$ and begin again with (18.13), which can be rearranged as
\[
\operatorname{Ent}_m(\mu) \le \frac{\operatorname{Ent}_m(\mu) - \operatorname{Ent}_m(\mu_\lambda)}{1-\lambda} - \frac{K}{2} \lambda\, W_2^2(m, \mu).
\]

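Denote by $T = \exp(\nabla(-\phi))$ the optimal transport map from $m$ to $\mu$ as in Theorem 18.2, and consider $T_\lambda(x) := \exp_x(\lambda \nabla(-\phi)(x))$ (so that $\mu_\lambda = (T_\lambda)_* m$). We also set $\mu_\lambda = \rho_\lambda m$ and deduce from the convexity of the function $t \mapsto t \log t$ ($t > 0$) that
\[
\rho \log \rho - \rho_\lambda \log \rho_\lambda \le (\log \rho + 1)(\rho - \rho_\lambda).
\]
Integrating this inequality in $m$ and letting $\lambda \to 1$ this time, we have
\begin{align*}
\limsup_{\lambda \to 1} \frac{\operatorname{Ent}_m(\mu) - \operatorname{Ent}_m(\mu_\lambda)}{1-\lambda}
&\le \limsup_{\lambda \to 1} \frac{1}{1-\lambda} \bigg( \int_M \log \rho\, d\mu - \int_M \log \rho\, d\mu_\lambda \bigg) \\
&= \limsup_{\lambda \to 1} \int_M \frac{\log \rho(T(x)) - \log \rho(T_\lambda(x))}{1-\lambda}\, m(dx) \\
&\le \int_M F^*\big( d[\log \rho](T(x)) \big)\, d\big(x, T(x)\big)\, m(dx).
\end{align*}
Then it follows from the Cauchy–Schwarz inequality and $T_* m = \mu$ that
\[
\int_M F^*\big( d[\log \rho](T(x)) \big)\, d\big(x, T(x)\big)\, m(dx) \le \sqrt{I_m(\mu)}\; W_2(m, \mu),
\]
and therefore we arrive at the HWI inequality:
\[
\operatorname{Ent}_m(\mu) \le W_2(m, \mu) \sqrt{I_m(\mu)} - \frac{K}{2} W_2^2(m, \mu).
\]
Finally, we obtain (18.15) by completing the square.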
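For the reader's convenience, the final step can be written out as follows, abbreviating $W := W_2(m, \mu)$ and $I := I_m(\mu)$ (shorthand introduced only for this computation):

```latex
\[
  W\sqrt{I} - \frac{K}{2} W^2
  = \frac{I}{2K} - \frac{K}{2}\Big( W - \frac{\sqrt{I}}{K} \Big)^{2}
  \le \frac{I}{2K},
\]
% Combined with the HWI inequality this gives
\[
  \operatorname{Ent}_m(\mu) \le \frac{1}{2K}\, I_m(\mu).
\]
```

The bound is independent of $W$, which is why no information on the transport distance is needed in (18.15).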

Remark 18.11 (Dimensional Talagrand Inequality) One can derive the Talagrand inequality from the logarithmic Sobolev inequality; this implication also goes back to [213] (see also [110] and [201, Corollary 4.4]). Then the dimensional logarithmic Sobolev inequality (Theorem 16.4) yields the following improvement of (18.12) (and (18.14)): If $m(M) = 1$ and $\mathrm{Ric}_N \ge K$ for some $K > 0$ and $N \in [n, \infty)$, then we have, for any $\mu \in \mathcal{P}^{\mathrm{ac}}(M; m)$,
\[
\max\big\{ W_2^2(m, \mu),\, W_2^2(\mu, m) \big\} \le \frac{2(N-1)}{KN} \operatorname{Ent}_m(\mu).
\]

We finally explain that a variational analysis of the logarithmic Sobolev inequality (18.15) around $m$ gives rise to the Poincaré inequality (Theorem 15.4 for $N = \infty$). Recall (15.7) for the definition of the variance $\operatorname{Var}_m(f)$.

Theorem 18.12 (Poincaré Inequality) Suppose that (M, F, m) is forward or backward complete and satisfies $m(M) = 1$ and $\mathrm{Ric}_\infty \ge K$ for some $K > 0$. Then we have
\[
\operatorname{Var}_m(f) \le \frac{1}{K} \int_M F^2(\nabla f)\, dm \tag{18.16}
\]
for any locally Lipschitz function $f \in H^1(M)$.

Proof By truncation, it suffices to show the claim for bounded functions. We assume $\int_M f\, dm = 0$ without loss of generality and consider $\rho_\varepsilon := 1 + \varepsilon f$ for $\varepsilon \in \mathbb{R}$ close to 0. Then we have $\int_M \rho_\varepsilon\, dm = 1$ and deduce from (18.15) that
\[
\int_M (1 + \varepsilon f) \log(1 + \varepsilon f)\, dm \le \frac{\varepsilon^2}{2K} \int_M \frac{F^2(\nabla f)}{1 + \varepsilon f}\, dm.
\]
Expanding both sides around $\varepsilon = 0$ yields that, since $f$ is bounded,
\[
\int_M \Big( \varepsilon f + \frac{f^2}{2} \varepsilon^2 \Big)\, dm \le \frac{\varepsilon^2}{2K} \int_M F^2(\nabla f)\, dm + O(\varepsilon^3).
\]
Recalling $\int_M f\, dm = 0$, dividing both sides by $\varepsilon^2$, and letting $\varepsilon \to 0$, we obtain (18.16).
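The expansion used in the proof rests on two elementary Taylor series, recorded here as a sketch (valid for $|x| < 1$, with $x = \varepsilon f$ and $f$ bounded):

```latex
\[
  (1 + x)\log(1 + x) = x + \frac{x^2}{2} - \frac{x^3}{6} + O(x^4),
  \qquad
  \frac{1}{1 + x} = 1 - x + O(x^2).
\]
% Hence, with x = \varepsilon f and \int_M f \, dm = 0:
\[
  \frac{\varepsilon^2}{2} \int_M f^2 \, dm + O(\varepsilon^3)
  \le \frac{\varepsilon^2}{2K} \int_M F^2(\nabla f) \, dm + O(\varepsilon^3),
\]
% and \int_M f^2 \, dm = \operatorname{Var}_m(f) for mean-zero f.
```

The first-order term disappears precisely because $f$ has mean zero, so the inequality is genuinely of second order in $\varepsilon$.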


The case of N ∈ [n, ∞) can be found in [160, Theorem 5.34], which recovers the Poincaré–Lichnerowicz inequality (Theorem 15.4) for N ∈ [n, ∞). Although the proof in [160] requires involved calculations, it is also possible to apply the needle decomposition, which enables us to reduce some geometric and functional inequalities (including the Brunn–Minkowski, logarithmic Sobolev, and Sobolev inequalities) to their simpler 1-dimensional counterparts, provided that F is reversible (see [60] and the next chapter).

18.4.2 Concentration of Measures

It is well known that the Talagrand inequality (18.12) implies the concentration phenomenon of the measure $m$. Assuming $m(M) = 1$, we define the (forward) concentration function of (M, F, m) by
\[
\alpha_m(r) := \sup\Big\{ 1 - m\big(B^+(A, r)\big) \,\Big|\, A \subset M,\ m(A) \ge \frac{1}{2} \Big\}
\]
for $r > 0$, where $A$ is a Borel set and $B^+(A, r) := \{ y \in M \mid \inf_{x \in A} d(x, y) < r \}$ is the forward open $r$-neighborhood of $A$.

Proposition 18.13 (Normal Concentration) Assume that (M, F, m) is forward or backward complete and satisfies $m(M) = 1$ and $\mathrm{Ric}_\infty \ge K$ for some $K > 0$. Then we have, for all $r > 0$,
\[
\alpha_m(r) \le 2 e^{-K r^2 / 4}.
\]

Proof Given $A \subset M$ such that $m(A) \ge 1/2$ and $m(B^+(A, r)) < 1$, let $\mu_1$ and $\mu_2$ be the uniform distributions on $A$ and $B := M \setminus B^+(A, r)$, respectively. Then, on one hand, we find from construction and the triangle inequality of $W_2$ that
\[
r^2 \le W_2^2(\mu_1, \mu_2) \le \big( W_2(\mu_1, m) + W_2(m, \mu_2) \big)^2 \le 2 \big( W_2^2(\mu_1, m) + W_2^2(m, \mu_2) \big).
\]
On the other hand, it follows from the Talagrand inequalities (18.12) and (18.14) that
\begin{align*}
W_2^2(\mu_1, m) + W_2^2(m, \mu_2)
&\le \frac{2}{K} \big( \operatorname{Ent}_m(\mu_1) + \operatorname{Ent}_m(\mu_2) \big) \\
&= -\frac{2}{K} \Big\{ \log m(A) + \log\big( 1 - m(B^+(A, r)) \big) \Big\} \\
&\le -\frac{2}{K} \log\bigg( \frac{1 - m(B^+(A, r))}{2} \bigg).
\end{align*}
Thus we have $1 - m(B^+(A, r)) \le 2 e^{-K r^2 / 4}$. This completes the proof.
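To see the normal concentration bound at work, consider again the Gaussian measure $N(0, 1/K)$ on $\mathbb{R}$ and the half-line $A = (-\infty, 0]$, which has mass $1/2$ and whose open $r$-neighborhood is $(-\infty, r)$. The sketch below compares the exact tail with the bound of Proposition 18.13; it tests only this one family of sets, not the supremum defining $\alpha_m$, and the function names are illustrative.

```python
import math

def gaussian_tail(r, K=1.0):
    """Mass of [r, infinity) under the Gaussian measure N(0, 1/K) on R."""
    return 0.5 * math.erfc(r * math.sqrt(K / 2.0))

def normal_concentration_bound(r, K=1.0):
    """Right-hand side 2 exp(-K r^2 / 4) of the normal concentration estimate."""
    return 2.0 * math.exp(-K * r * r / 4.0)

def bound_holds(K, radii):
    """Compare the exact tail of the half-line A = (-inf, 0] with the bound."""
    return all(gaussian_tail(r, K) <= normal_concentration_bound(r, K) for r in radii)
```

In this example the true tail decays like $e^{-K r^2 / 2}$, so the estimate is far from sharp for half-lines; its value lies in holding for every set of mass at least $1/2$.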

 


The above normal concentration means that m is almost concentrated on the (forward) r-neighborhood of every set of volume ≥ 1/2, with the normal (Gaussian) decay in r. We refer to the beautiful book [148] for more on the concentration of measure phenomenon. There is a deep theory on the relationship between the concentration and functional inequalities (see also [177] among others). For example, the Poincaré inequality (18.16) is known to imply a weaker concentration of the form αm (r) ≤ Ce−cr , called the exponential concentration (see [148, Theorem 3.1]). The connection with the Ricci curvature goes back to Gromov and V. Milman’s seminal work (see, e.g., [112, 113, 148]).

18.5 Further Developments

Finally, we review some related investigations and further developments.

18.5.1 Riemannian Curvature-Dimension Condition

As we mentioned at the end of Sect. 18.2, the curvature-dimension condition CD(K, N) can be used to define a synthetic geometric notion of lower Ricci curvature bound for metric measure spaces without any differentiable structure (we refer to the book [244]). This can be regarded as an analogue for the Ricci curvature to the theory of Alexandrov spaces and CAT-spaces for the sectional curvature (recall Subsect. 8.3.1). An important feature of CD(K, N) is that it is preserved by the measured Gromov–Hausdorff convergence, and therefore the limit of a sequence of weighted Riemannian manifolds with $\mathrm{Ric}_N \ge K$ satisfies CD(K, N) (recall Remark 9.9(a) as well). Theorem 18.6 shows that CD(K, N) is available also for measured Finsler manifolds with $\mathrm{Ric}_N \ge K$ (and for their limit spaces). This is, on one hand, good news since the curvature-dimension condition turns out applicable to a wider class of spaces. On the other hand, the validity in the Finsler case reveals that one cannot expect genuinely Riemannian properties such as isometric splitting theorems (as we discussed in Chap. 17). Motivated by the latter observation, the combination of CD(K, N) and the linearity of heat flow (equivalently, the quadraticity of the energy form E) was introduced in [11, 95, 106, 108], which is now called the Riemannian curvature-dimension condition RCD(K, N). Since then, geometric and analytic investigations of metric measure spaces satisfying RCD(K, N) (called RCD(K, N)-spaces) have made rapid and deep progress. Especially, almost all results known for the limit spaces of Riemannian manifolds of Ricci curvature bounded below (by Cheeger, Colding, Naber, etc.) were generalized to this context, including the splitting theorem in [106, 107], rectifiability in [183], and constancy of the dimension of regular points in [48] (see also a survey [8]). Furthermore, RCD(K, N) is known to be equivalent to the corresponding Bochner inequality in an appropriate way (recall Remark 9.10(b) and Remark 14.8, and see [12, 95]). In contrast, the general theory of metric measure spaces satisfying CD(K, N) is not yet as well developed as that for RCD(K, N). The needle decomposition in the next chapter is available for (essentially non-branching) CD(K, N)-spaces; in that setting, however, reversibility is assumed in order to obtain sharp estimates. It is an intriguing open problem to what extent analysis on Finsler manifolds as in this book (nonlinear Γ-calculus) can be generalized to general metric measure spaces satisfying CD(K, N).

18.5.2 Heat Flow as Gradient Flow

Another related result available in the Finsler setting is that heat flow can be regarded as gradient flow of the relative entropy $\operatorname{Ent}_m$. This interpretation should be compared with the more classical introduction of heat flow as gradient flow of the energy functional E in the $L^2$-space (discussed in Sect. 13.2) and goes back to the celebrated work of Jordan et al. [127] on Euclidean spaces (followed by [94, 191] on Riemannian manifolds). It was established in [206] that, on compact measured Finsler manifolds, a curve $(\mu_t)_{t \ge 0}$ in $\mathcal{P}_2^{\mathrm{ac}}(M; m)$ with $\mu_t = \rho_t m$ is a gradient curve of $\operatorname{Ent}_m$ with respect to the $L^2$-Wasserstein distance associated with the reverse Finsler structure $\overleftarrow{F}$ if and only if $(\rho_t)_{t \ge 0}$ solves the heat equation $\partial_t \rho_t = \Delta \rho_t$. We refer to [10] for the case of metric measure spaces satisfying CD(K, ∞) (see also [109, 191, 223] for related preceding works on Alexandrov spaces). This interpretation of heat flow can be further generalized to the framework of Hamiltonian structures (recall Remark 2.14 and see [149, 198]). In the Riemannian case, as is natural from this gradient flow viewpoint, we can derive from RCD(K, ∞) (regarded as "$\operatorname{Hess} \operatorname{Ent}_m \ge K$") the contraction of heat flow:
\[
W_2(\mu_t, \nu_t) \le e^{-Kt}\, W_2(\mu_0, \nu_0) \tag{18.17}
\]
for all $t > 0$, $\mu_t = \rho_t m$ and $\nu_t = \sigma_t m$ such that $\rho_t$ and $\sigma_t$ obey the heat equation. Moreover, the contraction property (18.17) is equivalent to the $L^2$-gradient estimate (recall Remark 14.9). In the Finsler setting, however, the contraction of the form (18.17) fails (see (b) at the end of Preface and [207]), and the relation with the gradient estimate (Theorem 14.2) is yet to be understood.


18.5.3 Measure Contraction Property

Another direction of research is weakening the curvature-dimension condition to cover some wilder spaces such as sub-Riemannian manifolds. Given $K \in \mathbb{R}$ and $N \in [1, \infty)$, the measure contraction property MCP(K, N) means that, roughly speaking, the Brunn–Minkowski inequality (18.11) holds when A or B is a singleton (see [190, 235]). Precisely, we have
\[
m\big(Z_\lambda(\{x\}, B)\big) \ge \inf_{y \in B} \tau^{(\lambda)}_{K,N}\big(d(x,y)\big)^N \cdot m(B) \tag{18.18}
\]
when $A = \{x\}$ and
\[
m\big(Z_\lambda(A, \{y\})\big) \ge \inf_{x \in A} \tau^{(1-\lambda)}_{K,N}\big(d(x,y)\big)^N \cdot m(A) \tag{18.19}
\]
when $B = \{y\}$. This property can also be regarded as a directional version of the Bishop–Gromov volume comparison (Theorem 9.15; see also Remarks 9.16 and 11.21). For unweighted Riemannian manifolds, MCP(K, n) is equivalent to $\mathrm{Ric}_g \ge K$. In general, however, MCP(K, N) is strictly weaker than CD(K, N) (see [190, 235]). A typical and inspiring example exhibiting the difference between CD and MCP can be found in sub-Riemannian geometry. It is known that a number of sub-Riemannian manifolds do not satisfy CD(K, N) for any K and N but satisfy MCP(K, N) (see [3, 131, 150] among others). We also refer to [61, 63] for recent interesting contributions to the theory of metric measure spaces satisfying MCP(K, N).

Chapter 19

Needle Decompositions

The needle decomposition (also called the localization) is a quite interesting and powerful method, which enables us to reduce a high-dimensional problem to its 1-dimensional counterpart (on needles). It was first developed on Euclidean spaces in the context of convex analysis and geometry by [113, 132, 162, 214] etc. Klartag [138] successfully extended this method to weighted Riemannian manifolds and gave a striking alternative proof of the Lévy–Gromov isoperimetric inequality without relying on the deep regularity theory in geometric measure theory. Cavalletti–Mondino [59] further generalized the method to essentially non-branching metric measure spaces satisfying the curvature-dimension condition CD(K, N) (for $N \in [1, \infty)$) and showed the Lévy–Gromov isoperimetric inequality, which was new even for reversible Finsler manifolds. In this chapter, we briefly explain the construction of needle decompositions and an application to the Lévy–Gromov isoperimetric inequality, along the lines of [59, 203]. For general Finsler manifolds, however, the needle decomposition seems to have only limited power, since the non-reversibility prevents us from obtaining sharp estimates (see Remark 19.15). This observation should be compared with Chap. 15, where we used the Γ-calculus to deal with the case of $N = \infty$ and the non-reversibility did not worsen the estimate.

19.1 Lipschitz Functions and Optimal Transports First we introduce some necessary notions related to the analysis of 1-Lipschitz functions. We will find an inspiring connection with optimal transport theory (recall Sect. 18.1), especially with the L1 -case (employing the distance function d as the cost function).

© The Author(s), under exclusive license to Springer Nature Switzerland AG 2021 S.-i. Ohta, Comparison Finsler Geometry, Springer Monographs in Mathematics, https://doi.org/10.1007/978-3-030-80650-7_19


19.1.1 Transport Rays

Let (M, F) be a forward or backward complete Finsler manifold. Recall that a function $\phi : M \to \mathbb{R}$ is said to be 1-Lipschitz if $\phi(y) - \phi(x) \le d(x, y)$ holds for all $x, y \in M$. We are interested in pairs of points enjoying equality, namely
\[
\Gamma_\phi := \{ (x, y) \in M \times M \mid \phi(y) - \phi(x) = d(x, y) \}. \tag{19.1}
\]
Since $\phi$ is 1-Lipschitz, if $(x, y) \in \Gamma_\phi$, then we have $(\eta(s), \eta(t)) \in \Gamma_\phi$ along any minimal geodesic $\eta : [0, 1] \to M$ from $x$ to $y$ and for all $0 \le s \le t \le 1$. This observation leads to the following definition.

Definition 19.1 (Transport Rays) We call a unit speed geodesic $\eta : I \to M$ from a closed interval $I \subset \mathbb{R}$ a transport ray associated with $\phi$ if $(\eta(s), \eta(t)) \in \Gamma_\phi$ holds for all $s, t \in I$ with $s < t$ and if $\eta$ cannot be extended to a longer geodesic satisfying this property. We will choose an interval $I$ so as to satisfy $\phi(\eta(t)) = t$. If a singleton $\eta : \{t\} \to M$ is a transport ray, then we call it a degenerate transport ray. Sometimes we identify $I$ with its image $\eta(I) \subset M$.

Let us consider three simple examples.

Example 19.2
(a) For $\phi(x) = |x|$ on $\mathbb{R}$, we have two non-degenerate transport rays both emanating from 0: $\eta_1(t) = t$ and $\eta_2(t) = -t$ for $t \in [0, \infty)$.
(b) For a kind of cut-off function $\phi : \mathbb{R} \to [0, 1]$ given by
\[
\phi(x) = \begin{cases} 1 & \text{for } |x| \le 1, \\ 2 - |x| & \text{for } 1 < |x| < 2, \\ 0 & \text{for } |x| \ge 2, \end{cases}
\]
there are two non-degenerate transport rays: $\eta_1(t) = -t + 2$ and $\eta_2(t) = t - 2$ for $t \in [0, 1]$. All other points $x \in (-\infty, -2) \cup (-1, 1) \cup (2, \infty)$ are degenerate transport rays.
(c) If $\phi : M \to \mathbb{R}$ is C-Lipschitz for some $C < 1$, then we have only degenerate transport rays.

From these examples (and directly by definition), one could guess that $\phi$ cannot behave badly at interior points of non-degenerate transport rays. This is indeed the case and $\phi$ is differentiable at those points. Define
\[
\mathcal{S}_\phi := \big\{ x \in M \,\big|\, (w, x), (x, y) \in \Gamma_\phi \text{ for some } w, y \in M \setminus \{x\} \big\}.
\]
Note that $(w, y) \in \Gamma_\phi$ also holds by definition and that there exists a minimal geodesic from $w$ to $y$ passing through $x$.


Lemma 19.3 (Differentiability of ϕ) Given $x \in \mathcal{S}_\phi$, let $\eta : (-\varepsilon, \varepsilon) \to M$ be a unit speed minimal geodesic such that $\eta(0) = x$ and $(\eta(-\varepsilon), x), (x, \eta(\varepsilon)) \in \Gamma_\phi$. Then $\phi$ is differentiable at $x$ and we have $\nabla \phi(x) = \dot{\eta}(0)$. In particular, such a geodesic $\eta$ is unique. Consequently, non-degenerate transport rays can intersect only at their endpoints. See [99, Lemma 10] or [203, Lemma 4.5] for a proof of Lemma 19.3.

19.1.2 Cyclical Monotonicity

Before proceeding to a closer analysis of transport rays, we discuss the connection between the set $\Gamma_\phi$ in (19.1) and optimal transport theory. The following simple but useful notion will play an important role.

Definition 19.4 (Cyclical Monotonicity) For $p \in [1, \infty)$, a set $\Lambda \subset M \times M$ is said to be $d^p$-cyclically monotone if, for any finite collection $\{(x_i, y_i)\}_{i=1}^l \subset \Lambda$, we have
\[
\sum_{i=1}^l d^p(x_i, y_i) \le \sum_{i=1}^l d^p(x_i, y_{i+1}),
\]
where we set $y_{l+1} := y_1$ in the right-hand side.

This means that the transport cost of the pairings $\{(x_i, y_i)\}_{i=1}^l$ (i.e., we transport mass from $x_i$ to $y_i$) with respect to $d^p$ is minimal among permutations of $\{y_i\}_{i=1}^l$. Then we have an elegant characterization of optimal couplings as follows (see, e.g., [244, Theorem 5.10]).

Theorem 19.5 (Cyclical Monotonicity and Optimal Couplings) For $\mu, \nu \in \mathcal{P}_p(M)$, a coupling $\pi \in \Pi(\mu, \nu)$ is $d^p$-optimal if and only if its support $\operatorname{supp} \pi \subset M \times M$ is $d^p$-cyclically monotone.

Going back to the set $\Gamma_\phi$ associated with a 1-Lipschitz function $\phi$, it is readily seen that $\Gamma_\phi$ enjoys the $d$-cyclical monotonicity (i.e., with $p = 1$).

Lemma 19.6 The set $\Gamma_\phi$ is $d$-cyclically monotone.

Proof For any finite collection $\{(x_i, y_i)\}_{i=1}^l \subset \Gamma_\phi$, we deduce from the definition (19.1) of $\Gamma_\phi$ and the 1-Lipschitz continuity of $\phi$ that
\[
\sum_{i=1}^l d(x_i, y_i) = \sum_{i=1}^l \big\{ \phi(y_i) - \phi(x_i) \big\} = \sum_{i=1}^l \big\{ \phi(y_{i+1}) - \phi(x_i) \big\} \le \sum_{i=1}^l d(x_i, y_{i+1}),
\]
where $y_{l+1} := y_1$.
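The $d$-cyclical monotonicity of the set of equality pairs can be tested by brute force in the model case $\phi(x) = |x|$ on $\mathbb{R}$, where a pair $(x, y)$ satisfies $\phi(y) - \phi(x) = d(x, y)$ exactly when $x$ and $y$ lie on the same side of the origin with $|y| \ge |x|$. The following sketch is illustrative only (all names are hypothetical); it checks every permutation of the targets, which is implied by the cyclic condition because each permutation is a product of cycles.

```python
import itertools
import random

def phi(x):
    """Guiding function of Example 19.2(a)."""
    return abs(x)

def d(x, y):
    """Euclidean distance on R."""
    return abs(y - x)

def sample_pair(rng):
    """Random pair (x, y) with phi(y) - phi(x) = d(x, y): same side, |y| >= |x|."""
    sign = rng.choice((-1.0, 1.0))
    a, b = sorted((rng.uniform(0.0, 5.0), rng.uniform(0.0, 5.0)))
    return sign * a, sign * b

def is_cyclically_monotone(pairs, p=1, tol=1e-9):
    """Brute-force test: the given pairing is cheapest among all permutations."""
    xs = [x for x, _ in pairs]
    ys = [y for _, y in pairs]
    base = sum(d(x, y) ** p for x, y in pairs)
    return all(
        base <= sum(d(x, ys[j]) ** p for x, j in zip(xs, perm)) + tol
        for perm in itertools.permutations(range(len(pairs)))
    )
```

For instance, the pairs $(0, 3)$ and $(1, 2)$ both satisfy the equality and pass the test with `p=1`, but fail it with `p=2` (cost $9 + 1$ against $4 + 4$ after swapping targets), so $d$-cyclical monotonicity alone does not give the quadratic version.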

 


This lemma provides a connection with $d$-optimal transports. In the previous chapter, however, we used the quadratic cost function to capture the behavior of the Ricci curvature. The next lemma plays a crucial role in closing this gap (see [56, 57]).

Lemma 19.7 ($d^2$-Optimality) Assume that a set $\Lambda \subset \Gamma_\phi$ satisfies
\[
\sup_{(x,y) \in \Lambda} \phi(x) \le \inf_{(x,y) \in \Lambda} \phi(y) \tag{19.2}
\]
and
\[
\big( \phi(x_2) - \phi(x_1) \big) \big( \phi(y_2) - \phi(y_1) \big) \ge 0 \tag{19.3}
\]
for all $(x_1, y_1), (x_2, y_2) \in \Lambda$. Then $\Lambda$ is $d^2$-cyclically monotone.

Proof First we see that the set
\[
\widehat{\Lambda} := \big\{ \big( \phi(x), \phi(y) \big) \,\big|\, (x, y) \in \Lambda \big\} \subset \mathbb{R} \times \mathbb{R}
\]
is $|\cdot|^2$-cyclically monotone by induction. For any pair $(x_1, y_1), (x_2, y_2) \in \Lambda$, it follows from the hypothesis (19.3) that
\[
\sum_{i=1}^2 |\phi(y_{i+1}) - \phi(x_i)|^2 - \sum_{i=1}^2 |\phi(y_i) - \phi(x_i)|^2 = 2 \big( \phi(x_2) - \phi(x_1) \big) \big( \phi(y_2) - \phi(y_1) \big) \ge 0,
\]
where $y_3 := y_1$. Then, suppose that the claim holds true for any collection of $(l - 1)$ elements in $\widehat{\Lambda}$. Given $\{(x_i, y_i)\}_{i=1}^l \subset \Lambda$, observe from (19.3) that we have $\phi(x_i) < \phi(x_j)$ only if $\phi(y_i) \le \phi(y_j)$. Hence, by reordering, we can assume that $\phi(x_1) = \min_i \phi(x_i)$ and $\phi(y_1) = \min_i \phi(y_i)$. Applying the induction hypothesis to $\{(\phi(x_i), \phi(y_i))\}_{i=2}^l$, we obtain
\begin{align*}
&\sum_{i=1}^l |\phi(y_{i+1}) - \phi(x_i)|^2 - \sum_{i=1}^l |\phi(y_i) - \phi(x_i)|^2 \\
&\ge \big( \phi(y_2) - \phi(x_1) \big)^2 + \big( \phi(y_1) - \phi(x_l) \big)^2 - \big( \phi(y_2) - \phi(x_l) \big)^2 - \big( \phi(y_1) - \phi(x_1) \big)^2 \\
&= 2 \big( \phi(y_2) - \phi(y_1) \big) \big( \phi(x_l) - \phi(x_1) \big) \ge 0,
\end{align*}
where $y_{l+1} := y_1$. Hence $\widehat{\Lambda}$ is $|\cdot|^2$-cyclically monotone. Now, for any $\{(x_i, y_i)\}_{i=1}^l \subset \Lambda$, it follows from (19.2) and the 1-Lipschitz continuity of $\phi$ that
\[
0 \le \phi(y_{i+1}) - \phi(x_i) \le d(x_i, y_{i+1}).
\]
Therefore we obtain
\[
\sum_{i=1}^l d^2(x_i, y_i) = \sum_{i=1}^l \big( \phi(y_i) - \phi(x_i) \big)^2 \le \sum_{i=1}^l \big( \phi(y_{i+1}) - \phi(x_i) \big)^2 \le \sum_{i=1}^l d^2(x_i, y_{i+1})
\]
as desired, where we used the $|\cdot|^2$-cyclical monotonicity of $\widehat{\Lambda}$ in the first inequality.

Observe that the condition (19.2) was used only for ensuring $|\phi(y_{i+1}) - \phi(x_i)| \le d(x_i, y_{i+1})$. Thus we do not need to assume (19.2) in the reversible case.

19.2 Construction of Needle Decompositions

We continue considering a 1-Lipschitz function $\phi : M \to \mathbb{R}$ and explain how to construct the corresponding needle decomposition, which is a family of transport rays associated with $\phi$.

19.2.1 Transport Sets

Inspired by Lemma 19.3, we introduce the following decomposition of M.

Definition 19.8 (Transport Sets)
(1) We call $x \in M$ a degenerate point if there is no non-degenerate transport ray including $x$. The set of degenerate points will be denoted by $\mathcal{D}_\phi$.
(2) The set of points $x \in M$ such that there is exactly one non-degenerate transport ray including $x$ will be denoted by $\mathcal{T}_\phi$ and called the transport set associated with $\phi$.
(3) The set of points $x \in M$ such that there is more than one non-degenerate transport ray including $x$ will be denoted by $\mathcal{B}_\phi$, regarded as the set of branching points with respect to the behavior of $\phi$.

Observe from Lemma 19.3 that $\mathcal{S}_\phi \subset \mathcal{T}_\phi$. Moreover, since the branching does not occur at interior points of non-degenerate transport rays again by Lemma 19.3, any $x \in \mathcal{B}_\phi$ is either the initial point of all transport rays including $x$ or the terminal point of all such rays. Thus we can further decompose $\mathcal{B}_\phi$ into
\begin{align*}
\mathcal{B}^+_\phi &:= \big\{ x \in \mathcal{B}_\phi \,\big|\, (x, y) \in \Gamma_\phi \text{ for some } y \in M \setminus \{x\} \big\}, \\
\mathcal{B}^-_\phi &:= \big\{ x \in \mathcal{B}_\phi \,\big|\, (w, x) \in \Gamma_\phi \text{ for some } w \in M \setminus \{x\} \big\}.
\end{align*}
One can show that $\mathcal{S}_\phi$, $\mathcal{D}_\phi$, $\mathcal{T}_\phi$, $\mathcal{B}^+_\phi$, and $\mathcal{B}^-_\phi$ are all Borel sets.

Example 19.9 We perform the decomposition for the examples in Example 19.2.
(a) We have $\mathcal{T}_\phi = \mathbb{R} \setminus \{0\}$ and $\mathcal{B}^+_\phi = \{0\}$. Note also that $\mathcal{S}_\phi = \mathcal{T}_\phi$.
(b) We have $\mathcal{T}_\phi = [-2, -1] \cup [1, 2]$ and $\mathcal{D}_\phi = (-\infty, -2) \cup (-1, 1) \cup (2, \infty)$. In this case, $\mathcal{S}_\phi = (-2, -1) \cup (1, 2)$ is a proper subset of $\mathcal{T}_\phi$.
(c) All points are degenerate, i.e., $\mathcal{D}_\phi = M$.

We will be mainly interested in the transport set $\mathcal{T}_\phi$. In general, however, $\mathcal{D}_\phi$ can be a large set as in the above example. As for $\mathcal{B}_\phi$, we obtain the following as an interesting application of Theorem 18.2. Henceforth, we fix a measure $m$ on M (i.e., we consider a measured Finsler manifold (M, F, m)).

Proposition 19.10 We have $m(\mathcal{B}_\phi) = 0$.

We remark that, since measures we consider are mutually absolutely continuous, the assertion $m(\mathcal{B}_\phi) = 0$ is independent of the choice of $m$.

Proof The outline of the proof is as follows. We refer to [56, Proposition 4.5] or [203, Proposition 4.8] for details. If $m(\mathcal{B}^+_\phi) > 0$, then, by associating $x \in \mathcal{B}^+_\phi$ with two points on different transport rays emanating from $x$, we can construct a $d^2$-cyclically monotone set $\Lambda \subset M \times M$ of the form
\[
\Lambda = \big\{ \big( x, T^k(x) \big) \,\big|\, x \in A,\ \big( x, T^k(x) \big) \in \Gamma_\phi,\ k = 1, 2 \big\}
\]
for some $A \subset \mathcal{B}^+_\phi$ with $m(A) > 0$, where $T^1(x)$ and $T^2(x)$ live in different transport rays. We can deduce from Lemma 19.7 that $\Lambda$ is $d^2$-cyclically monotone. Then, by Theorem 19.5, the coupling
\[
\pi := \frac{1}{2} \sum_{k=1}^2 (\mathrm{id}_M \times T^k)_* \Big( \frac{1_A}{m(A)}\, m \Big)
\]
is $d^2$-optimal. This, however, contradicts Theorem 18.2 because $\Lambda$ cannot be represented as the graph of any map. Therefore we obtain $m(\mathcal{B}^+_\phi) = 0$ and similarly $m(\mathcal{B}^-_\phi) = 0$.

19.2.2 Disintegration

In order to construct a needle decomposition, we identify $\mathcal{T}_\phi$ with the set $\mathcal{R}_\phi$ of non-degenerate transport rays through the following equivalence relation for $x, y \in \mathcal{T}_\phi$:
\[
x \sim y \quad \text{if } (x, y) \in \Gamma_\phi \text{ or } (y, x) \in \Gamma_\phi.
\]
The map $\Theta : \mathcal{T}_\phi / {\sim} \to \mathcal{R}_\phi$ sending $[\eta(t)]$ to $\eta$ is indeed well-defined and bijective. We endow $\mathcal{R}_\phi$ with the quotient topology induced from $\mathcal{T}_\phi$ via $\Theta$. Now, if $m(\mathcal{T}_\phi) > 0$, then we define a measure $\widetilde{\omega}$ on $\mathcal{R}_\phi$ as the push-forward of $m|_{\mathcal{T}_\phi}$ by the projection $\sigma : \mathcal{T}_\phi \to \mathcal{T}_\phi / {\sim}$, precisely, $\widetilde{\omega} := (\Theta \circ \sigma)_* (m|_{\mathcal{T}_\phi})$. Thanks to the disintegration theorem associated with the map $\Theta \circ \sigma : \mathcal{T}_\phi \to \mathcal{R}_\phi$ (see, e.g., [86, III.70–73] and [9, Theorem 5.3.1]), there exists a family of Borel probability measures on M, denoted by $\{\mu_\eta\}_{\eta \in \mathcal{R}_\phi}$, such that $\mu_\eta$ is supported in the image of $\eta$ and we have
\[
\int_{\mathcal{T}_\phi} \varphi\, dm = \int_{\mathcal{R}_\phi} \bigg( \int_{I_\eta} \varphi\, d\mu_\eta \bigg)\, \widetilde{\omega}(d\eta)
\]
for all measurable functions $\varphi$ on M, where $I_\eta \subset M$ is the image of $\eta$. Therefore, $m|_{\mathcal{T}_\phi}$ is decomposed into $\{\mu_\eta\}_{\eta \in \mathcal{R}_\phi}$ on non-degenerate transport rays (called needles), and $m|_{\mathcal{T}_\phi}$ is recovered by integrating $\{\mu_\eta\}_{\eta \in \mathcal{R}_\phi}$ in $\widetilde{\omega}$. We summarize the outcome of the above construction in the next theorem.

Theorem 19.11 (Needle Decomposition for ϕ) Let (M, F, m) be a forward or backward complete measured Finsler manifold, and take a 1-Lipschitz function $\phi : M \to \mathbb{R}$. Then there exist a decomposition $\{I_\eta\}_{\eta \in Q}$ of M into transport rays associated with $\phi$, a measure $\omega$ on the index set $Q$, and a family $\{\mu_\eta\}_{\eta \in Q}$ of probability measures on M satisfying the following:
(i) For $\omega$-almost every $\eta \in Q$, we have $\operatorname{supp} \mu_\eta \subset I_\eta$.
(ii) For any measurable function $\varphi$ on M, we have
\[
\int_M \varphi\, dm = \int_Q \bigg( \int_{I_\eta} \varphi\, d\mu_\eta \bigg)\, \omega(d\eta).
\]

The function $\phi$ is called the guiding function of the needle decomposition. We identified a transport ray $\eta$ with its image $I_\eta$ for both non-degenerate and degenerate transport rays ($I_\eta$ is a singleton in the latter case). We put $Q := \mathcal{R}_\phi \sqcup \mathcal{D}_\phi$ and, for a degenerate transport ray $\eta \in \mathcal{D}_\phi$, choose as $\mu_\eta$ the Dirac measure at the point $I_\eta$. Then, by setting $\omega := \widetilde{\omega} + m|_{\mathcal{D}_\phi}$, we obtain Theorem 19.11. Note that we have $\omega(Q) = m(M)$ by Proposition 19.10.

19.2.3 Conditioned Version

Theorem 19.11 provides needle decompositions associated with general 1-Lipschitz functions. In most applications, we actually employ a 1-Lipschitz function $\phi$ induced from another function $f$ (satisfying $\int_M f\, dm = 0$) via the $L^1$-optimal transport theory. To this end, we take a function $f \in L^1(M)$ such that $\int_M f\, dm = 0$ and
\[
\int_M |f(x)| \big\{ d(x, y) + d(y, x) \big\}\, m(dx) < \infty \tag{19.4}
\]
for some (and hence all) $y \in M$. Then we consider the problem of finding a 1-Lipschitz function $\phi : M \to \mathbb{R}$ achieving
\[
\sup \bigg\{ \int_M f \phi\, dm \,\bigg|\, \phi \text{ is 1-Lipschitz} \bigg\}. \tag{19.5}
\]
Notice that $f \phi \in L^1(M)$ by the integrability conditions on $f$. Moreover, since $\int_M f\, dm = 0$, we have $\int_M f (\phi + c)\, dm = \int_M f \phi\, dm$ for any $c \in \mathbb{R}$. Hence we can readily show the existence of a maximizer $\phi$ by using the Ascoli–Arzelà theorem. The above maximizing problem is closely related to the $L^1$-optimal transport theory. Given $\mu, \nu \in \mathcal{P}_1(M)$, the celebrated Kantorovich–Rubinstein duality asserts that
\[
W_1(\mu, \nu) = \sup \bigg\{ \int_M \phi\, d\nu - \int_M \phi\, d\mu \,\bigg|\, \phi \text{ is 1-Lipschitz} \bigg\}. \tag{19.6}
\]

Compare this with the $L^2$-version in (18.3). Observe that the inequality "≥" in (19.6) is seen by integrating $d(x, y) \ge \phi(y) - \phi(x)$ in couplings of $(\mu, \nu)$. A function $\phi$ achieving the supremum in (19.6) is called a Kantorovich potential (and the $d$-optimal transport from $\mu$ to $\nu$ is done along transport rays of $\phi$; see the remark below). If $\mu, \nu \in \mathcal{P}_1^{\mathrm{ac}}(M; m)$ with $\mu = \rho m$ and $\nu = \bar{\rho} m$, then the function $f := \bar{\rho} - \rho$ satisfies the condition (19.4) since $\mu, \nu \in \mathcal{P}_1(M)$, and (19.5) coincides with the right-hand side of (19.6).

Remark 19.12 ($L^1$-Optimal Transport Theory) The analysis of $d$-optimal transports is more difficult than the quadratic case, due to the lack of strict convexity (of $d(\eta(0), \eta(\cdot))$ along minimal geodesics $\eta : [0, 1] \to M$). For example, for the uniform distributions $\mu = L^1|_{[-1,0]}$ and $\nu = L^1|_{[0,1]}$ on $\mathbb{R}$, the maps $T_1(x) = x + 1$ and $T_2(x) = -x$ both are $d$-optimal from $\mu$ to $\nu$. Thus $d$-optimal transport maps and $d$-optimal couplings are not unique even between absolutely continuous probability measures (compare this with the quadratic case in Theorem 18.2). Nonetheless, one can decompose the space into transport rays (via a Kantorovich potential as above), and then a $d$-optimal transport map is obtained by integrating 1-dimensional (nonunique) optimal transport maps on these transport rays. We refer to [56, 99] for details (see also [7, 54, 97, 240] and [243, Subsect. 2.4.6] for the Euclidean case).

For $\phi$ induced from $f$ as above, the associated needle decomposition satisfies an important additional property over Theorem 19.11. As an application of techniques in the $L^1$-optimal transport theory (see [97, 138]), we have
\[
\int_A f\, dm = 0
\]
for any saturated set $A \subset M$, which means that the set
\[
S(A) := \{ x \in M \mid (x, y) \in \Gamma_\phi \text{ or } (y, x) \in \Gamma_\phi \text{ for some } y \in A \}
\]
coincides with $A$ (observe that $A \subset S(A)$ always holds). For example, any subset of $\mathcal{D}_\phi$ is a saturated set. We then obtain the following reinforcement of Theorem 19.11.

Theorem 19.13 (Conditioned Needle Decomposition) Let (M, F, m) be a forward or backward complete measured Finsler manifold and $f \in L^1(M)$ satisfy $\int_M f\, dm = 0$ and (19.4). Take $\phi$ attaining the supremum in (19.5). Then, for the needle decomposition associated with $\phi$ as in Theorem 19.11, we have
\[
\int_{I_\eta} f\, d\mu_\eta = 0
\]
for $\omega$-almost every $\eta \in Q$.

Hence, the condition $\int_M f\, dm = 0$ on $(M, m)$ is inherited by $\omega$-almost every needle $(I_\eta, \mu_\eta)$. In particular, since $\omega|_{\mathcal{D}_\phi} = m|_{\mathcal{D}_\phi}$ by construction, we find that $f = 0$ necessarily holds $m$-almost everywhere on $\mathcal{D}_\phi$.
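The non-uniqueness example in Remark 19.12 can be verified by a direct computation of transport costs. The sketch below approximates $\int_{-1}^0 |T(x) - x|^p\, dx$ by a midpoint rule for the two maps $T_1(x) = x + 1$ and $T_2(x) = -x$; the helper name `transport_cost` and the choice of quadrature are assumptions of this illustration.

```python
def transport_cost(T, p, n=100_000):
    """Midpoint-rule approximation of the integral of |T(x) - x|^p over [-1, 0]."""
    h = 1.0 / n
    total = 0.0
    for k in range(n):
        x = -1.0 + (k + 0.5) * h  # midpoint of the k-th subinterval
        total += abs(T(x) - x) ** p
    return total * h

def T1(x):
    """Translation: pushes the uniform measure on [-1, 0] to that on [0, 1]."""
    return x + 1.0

def T2(x):
    """Reflection: also pushes the uniform measure on [-1, 0] to that on [0, 1]."""
    return -x
```

Both maps have cost 1 for $p = 1$, whereas for $p = 2$ the translation costs 1 and the reflection costs $4/3$, in line with the uniqueness of the quadratic optimal map in Theorem 18.2.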

19.3 Properties of Needles

Note that Theorems 19.11 and 19.13 hold for general measured Finsler manifolds without assuming any curvature bounds. In this section, we shall see how the lower weighted Ricci curvature bound influences the behavior of the measure $\mu_\eta$ on $(I_\eta, |\cdot|)$ for non-degenerate transport rays $\eta \in \mathcal{R}_\phi$, where $|\cdot|$ denotes the Euclidean distance induced from the identification of $I_\eta \subset M$ with the domain of $\eta$. First, we can show that the measure $\mu_\eta$ is absolutely continuous with respect to the 1-dimensional Lebesgue measure $L^1$ on $I_\eta$, and its density function is positive and locally Lipschitz on the interior of $I_\eta$. This step utilizes the measure contraction property. As we mentioned in Subsect. 18.5.3, the measure contraction property MCP(K, N) following from $\mathrm{Ric}_N \ge K$ can be regarded as the Brunn–Minkowski inequality with a one-point set at one side. Then, for ($\widetilde{\omega}$-almost every) $\eta \in \mathcal{R}_\phi$ and $a, b \in I_\eta$ with $a < b$, putting $\mu_\eta = \rho_\eta L^1$, we obtain the estimates
\[
\bigg( \frac{s_{K/(N-1)}(b - t)}{s_{K/(N-1)}(b - s)} \bigg)^{N-1} \le \frac{\rho_\eta(t)}{\rho_\eta(s)} \le \bigg( \frac{s_{K/(N-1)}(t - a)}{s_{K/(N-1)}(s - a)} \bigg)^{N-1}
\]
for $a < s < t < b$. Precisely, the latter (resp. former) inequality follows from the measure contraction property from the point $\eta(a)$ (resp. to the point $\eta(b)$), i.e., (18.18) with $x = \eta(a)$ (resp. (18.19) with $y = \eta(b)$). From these bounds, one can indeed deduce that $\rho_\eta$ is positive and locally Lipschitz on the interior of $I_\eta$. We


remark that the smoothness of the space is in fact enough for this qualitative step, since, given N ∈ (n, ∞) and each compact set  ⊂ M, one can always find K ∈ R for which RicN ≥ K holds on . Second, under the curvature bound RicN ≥ K, we can derive a sharper estimate from the curvature-dimension condition CD(K, N ) (recall that CD(K, N ) is strictly stronger than MCP(K, N )). Precisely, one can localize CD(K, N ) into each needle (Iη , | · |, μη ) with the help of Lemma 19.7 and obtain the following quantitative estimate of ρη . Theorem 19.14 (CD(K, N ) for Needles) Let (M, F, m) and ϕ be as in Theorem 19.11, and suppose that (M, F, m) satisfies RicN ≥ K for some K ∈ R and N ∈ (−∞, 0] ∪ [n, ∞). Then we have, along  ω-almost every η ∈ R ϕ ,   ρη (1 − λ)s + λt

N −1 sK/(N−1) (λ(t − s)) sK/(N−1) ((1 − λ)(t − s)) ≥ ρη (s)1/(N −1) + ρη (t)1/(N −1) sK/(N−1) (t − s) sK/(N−1) (t − s) (19.7) for any s, t in the interior of Iη with s < t and for all λ ∈ (0, 1). If we assume Ric∞ ≥ K for some K ∈ R, then we have   K log ρη (1 − λ)s + λt ≥ (1 − λ) log ρη (s) + λ log ρη (t) + (1 − λ)λ(t − s)2 2 instead of (19.7). When K = 0, (19.7) means that %N −1   \$ ρη (1 − λ)s + λt ≥ (1 − λ)ρη (s)1/(N −1) + λρη (t)1/(N −1) , 1/(N −1)

for N ∈ [n, ∞) (resp. N ∈ i.e., the concavity (resp. convexity) of ρη (−∞, 0]). It is important to have the power 1/(N − 1) rather than 1/N for showing sharp isoperimetric inequalities. Intuitively speaking, one obtains 1/(N − 1) by separately analyzing the component tangent to transport rays in the transport associated with ϕ. This is in the same spirit as CD(K, N ); recall the definition (18.8) of τ (λ) K,N (r) mixing λ and sK/(N −1) (λr)/sK/(N −1) (r) (λ comes from the component in the direction of transport). When ρη is C2 , it follows from (19.7) that (ρη1/(N −1) ) +

K ρ 1/(N −1) ≤ 0 N −1 η

for N ∈ [n, ∞), and the reverse inequality for N ∈ (−∞, 0]. These imply that, in either case,

\[
\psi'' - \frac{(\psi')^2}{N-1} \ge K
\]

by putting ρ_η = e^{−ψ}. Hence (I_η, | · |, μ_η) satisfies Ric_N ≥ K. In general, (19.7) can be regarded as a weak form of Ric_N ≥ K (i.e., CD(K, N)). Therefore the lower weighted Ricci curvature bound Ric_N ≥ K is inherited by ω-almost every (non-degenerate) needle (I_η, | · |, μ_η) with the same K and N. Thus, these needles could capture the “N-dimensional” behavior of (M, F, m), although they are 1-dimensional as metric spaces.

Remark 19.15 (Non-reversible Case) We remark that (I_η, | · |) is not isometric to (I_η, F|_{TI_η}) in the non-reversible case, since d(η(t), η(s)) ≠ |t − s| in general for s < t. Therefore we do not know if (I_η, F|_{TI_η}, μ_η) satisfies any curvature bound. Recall that the reverse curve of η may not be geodesic or of constant speed.
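The passage from (19.7) to the differential inequality for ψ can be sanity-checked numerically. The following is only a minimal sketch under stated assumptions: it takes the usual convention that s_κ solves f'' + κf = 0 with f(0) = 0 and f'(0) = 1, and it uses a hypothetical model density ρ(t) = s_{K/(N−1)}(t)^{N−1}, for which both (19.7) and ψ'' − (ψ')²/(N−1) ≥ K should hold with equality. The function names are illustrative and do not come from the text.

```python
import math


def s_kappa(kappa: float, r: float) -> float:
    """Assumed convention: solution of f'' + kappa * f = 0, f(0) = 0, f'(0) = 1."""
    if kappa > 0:
        root = math.sqrt(kappa)
        return math.sin(root * r) / root
    if kappa < 0:
        root = math.sqrt(-kappa)
        return math.sinh(root * r) / root
    return r


def model_density(K: float, N: float, t: float) -> float:
    """Hypothetical model needle density rho(t) = s_{K/(N-1)}(t)^(N-1)."""
    return s_kappa(K / (N - 1), t) ** (N - 1)


def curvature_lhs(rho, N: float, t: float, h: float = 1e-4) -> float:
    """Central finite-difference estimate of psi'' - (psi')^2 / (N - 1),
    where psi = -log(rho)."""
    def psi(x: float) -> float:
        return -math.log(rho(x))

    d1 = (psi(t + h) - psi(t - h)) / (2 * h)
    d2 = (psi(t + h) - 2 * psi(t) + psi(t - h)) / h ** 2
    return d2 - d1 ** 2 / (N - 1)


def cd_gap(rho, K: float, N: float, s: float, t: float, lam: float) -> float:
    """Left side minus right side of (19.7) after taking the power 1/(N-1).

    For N > 1 the estimate (19.7) says this gap is nonnegative; for N <= 0
    the power 1/(N-1) is negative, so the sign of the gap is reversed.
    """
    kappa = K / (N - 1)
    p = 1.0 / (N - 1)
    w_s = s_kappa(kappa, (1 - lam) * (t - s)) / s_kappa(kappa, t - s)
    w_t = s_kappa(kappa, lam * (t - s)) / s_kappa(kappa, t - s)
    mid = rho((1 - lam) * s + lam * t)
    return mid ** p - (w_s * rho(s) ** p + w_t * rho(t) ** p)


if __name__ == "__main__":
    K, N = 4.0, 5.0
    rho = lambda x: model_density(K, N, x)
    print(curvature_lhs(rho, N, 1.0))          # should be close to K
    print(cd_gap(rho, K, N, 0.4, 2.0, 0.3))    # should be close to 0
    print(cd_gap(rho, 0.0, N, 0.4, 2.0, 0.3))  # positive: weaker bound K = 0
```

The choice of finite differences keeps the sketch dependency-free; the step size h is an assumption and only gives a few digits of accuracy.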

19.4 Isoperimetric Inequalities

We are ready to prove a Lévy–Gromov-type isoperimetric inequality. Precisely, we shall generalize the following isoperimetric inequality by E. Milman [178, 179] to the Finsler setting.

Theorem 19.16 (Milman’s Isoperimetric Inequality) Let (M, g, m) be a complete weighted Riemannian manifold of dimension n ≥ 2 satisfying m(M) = 1, Ric_N ≥ K, and diam(M) ≤ D for some K ∈ R, N ∈ (−∞, 1) ∪ [n, ∞], and D ∈ (0, ∞]. Then we have

\[
I_{(M,g,m)}(\theta) \ge I_{K,N,D}(\theta)
\]

for all θ ∈ (0, 1), where I_{K,N,D}(θ) is an explicitly given function.

Recall (15.2) for the definition of the isoperimetric profile I_{(M,g,m)}. We refer to [178, 179] for the precise expression of I_{K,N,D} (we remark that, for some combinations of (K, N, D), only the trivial bound I_{(M,g,m)}(θ) ≥ 0 is available, i.e., I_{K,N,D}(θ) = 0). We also have 1-dimensional model spaces in nontrivial cases (see [178, Corollary 1.4] and [179, Corollary 1.4]).

When K > 0, N ∈ [n, ∞) and D = ∞ (or, equivalently, D = π√((N − 1)/K)), Theorem 19.16 recovers the classical Lévy–Gromov isoperimetric inequality (see [152, 153], [112, Appendix C]), and its weighted version (due to [31]). The case of K > 0 and N = D = ∞ corresponds to the Bakry–Ledoux isoperimetric inequality [23] studied in Chap. 15, i.e., I_{K,∞,∞} for K > 0 coincides with I_K in Theorem 15.1. Our generalization to measured Finsler manifolds is the following.


Theorem 19.17 (Isoperimetric Inequality) Let (M, F, m) be a complete measured Finsler manifold of dimension n ≥ 2 satisfying Λ_F < ∞, m(M) = 1, Ric_N ≥ K, and diam(M) ≤ D for some K ∈ R, N ∈ (−∞, 0] ∪ [n, ∞], and D ∈ (0, ∞]. Then we have

\[
I_{(M,F,m)}(\theta) \ge \Lambda_F^{-1} \cdot I_{K,N,D}(\theta)
\]

for all θ ∈ (0, 1).

We remark that, since the finite reversibility Λ_F < ∞ is assumed (recall (3.17) for Λ_F), the forward and backward completenesses are mutually equivalent.

Proof Let A ⊂ M be a Borel set with m(A) = θ, and consider the function f := 1_A − θ, where 1_A is the characteristic function of A. Then we find

\[
\int_M f \, dm = m(A) - \theta = 0,
\]

and therefore Theorems 19.11, 19.13 provide a guiding function ϕ and the elements of the associated needle decomposition: (Q, ω) and {(I_η, | · |, μ_η)}_{η∈Q}. In fact, since f never vanishes, we have m(D_ϕ) = ω(D_ϕ) = 0 (thereby m(T_ϕ) = ω(R_ϕ) = 1). We deduce from ∫_{I_η} f dμ_η = 0 that μ_η(A) = θ for ω-almost every η ∈ R_ϕ. Thanks to the curvature bound of (I_η, | · |, μ_η) as in Theorem 19.14 and diam(I_η) ≤ diam(M) ≤ D, the 1-dimensional isoperimetric inequality yields

\[
I_{(I_\eta, |\cdot|, \mu_\eta)}(\theta) \ge I_{K,N,D}(\theta).
\]

This is, however, with respect to the Euclidean distance | · |, which does not coincide with the distance function d in the reverse direction (recall Remark 19.15). To be more precise, we have only

\[
\Lambda_F^{-1} (t-s) \le d\big( \eta(t), \eta(s) \big) \le \Lambda_F (t-s)
\]

for s < t. Hence, as for the exterior boundary measure μ_η^+(A ∩ I_η) with respect to d|_{I_η} (recall (15.1) for the definition), we obtain

\[
\mu_\eta^+(A \cap I_\eta) \ge \Lambda_F^{-1} \cdot I_{K,N,D}(\theta).
\]

Then it follows from Theorem 19.11 and Fatou’s lemma that

\begin{align*}
m^+(A) &= \liminf_{\varepsilon \to 0} \frac{m(B^+(A,\varepsilon)) - m(A)}{\varepsilon} \\
&= \liminf_{\varepsilon \to 0} \int_Q \frac{\mu_\eta(B^+(A,\varepsilon)) - \mu_\eta(A)}{\varepsilon} \, \omega(d\eta) \\
&\ge \int_Q \liminf_{\varepsilon \to 0} \frac{\mu_\eta(B^+(A,\varepsilon)) - \mu_\eta(A)}{\varepsilon} \, \omega(d\eta) \\
&\ge \int_Q \mu_\eta^+(A \cap I_\eta) \, \omega(d\eta).
\end{align*}


We remark that, in the last inequality, we used an observation that the forward ε-neighborhood of A ∩ I_η inside I_η is included in the neighborhood B^+(A, ε) of A in M. Thus we conclude m^+(A) ≥ Λ_F^{−1} · I_{K,N,D}(θ), which completes the proof. □

Recall that, in the case of K > 0 and N = D = ∞, we had in Theorem 15.1 the sharp estimate without the factor Λ_F^{−1} (by means of the Γ-calculus). For other K, N, and D, removing Λ_F^{−1} from the assertion of Theorem 19.17 in the non-reversible case is an open problem. It is not known if one can modify the argument in this chapter to achieve such an improvement. Alternatively, one may follow the classical strategy based on the regularity of the boundary of an isoperimetric minimizing region (see, e.g., [178, 184]). This seems doable, and a generalization of the regularity theory to the Finsler setting is important in its own right. We may expect a lower regularity than in the Riemannian case, as for the heat equation (Sect. 13.4), and then we need to examine if such a lower regularity is sufficient to obtain isoperimetric inequalities.
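The origin of the reversibility factor in the proof can be illustrated on a toy needle. The sketch below is an assumption-laden illustration, not the argument of the text: it takes a needle [0, L] with density ρ, the set A = [c, L], and a hypothetical asymmetric distance in which moving along η costs Euclidean length while moving against η costs exactly `reversibility` times Euclidean length (the extreme case allowed by the two-sided bound on d(η(t), η(s))). It also assumes that the forward ε-neighborhood consists of the points y with d(x, y) < ε for some x in A. Under these assumptions the forward neighborhood of A reaches only ε/Λ below c, so the exterior boundary measure is about ρ(c)/Λ instead of the Euclidean value ρ(c).

```python
def mass(rho, lo: float, hi: float, steps: int = 1000) -> float:
    """Midpoint-rule approximation of the integral of rho over [lo, hi]."""
    if hi <= lo:
        return 0.0
    width = (hi - lo) / steps
    return sum(rho(lo + (i + 0.5) * width) for i in range(steps)) * width


def forward_boundary_measure(rho, c: float, reversibility: float,
                             eps: float = 1e-4) -> float:
    """Difference quotient approximating mu^+(A) for A = [c, L] on a needle [0, L].

    Toy model (an assumption, not taken from the text): reaching a point
    y < c from A means travelling against the needle, which costs
    reversibility * (c - y).  Hence the forward eps-neighbourhood of A
    adds exactly the interval (c - eps / reversibility, c).
    """
    reach = eps / reversibility
    return mass(rho, max(0.0, c - reach), c) / eps


if __name__ == "__main__":
    rho = lambda t: 2.0 * t  # hypothetical probability density on [0, 1]
    for lam_f in (1.0, 2.0, 3.0):
        # Compare with the Euclidean boundary measure rho(c) divided by lam_f.
        print(lam_f, forward_boundary_measure(rho, 0.5, lam_f), rho(0.5) / lam_f)
```

With `reversibility = 1.0` the quotient reduces to the Euclidean boundary measure, which matches the remark that the reversible case carries no extra factor.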

19.5 Further Applications

The needle decomposition is a powerful tool not only for establishing geometric and functional inequalities but also for investigating their rigidity and almost rigidity phenomena. Here we mention some known results.

After Klartag’s work [138] on weighted Riemannian manifolds, Cavalletti–Mondino [59] developed a construction more directly linked to optimal transport theory and established the needle decomposition for essentially non-branching metric measure spaces satisfying the curvature-dimension condition CD(K, N) with K ∈ R and N ∈ (1, ∞). This class includes RCD(K, N)-spaces, finite-dimensional Alexandrov spaces, and reversible Finsler manifolds. The needle decomposition for general CD(K, ∞)-spaces or RCD(K, ∞)-spaces is an open problem (note that we do not have “MCP(K, ∞)”). In [59], they showed the Lévy–Gromov–Milman isoperimetric inequality (in the form of Theorem 19.16) as an application of the needle decomposition (in particular, Theorem 19.17 with Λ_F = 1 is included in [59]). A rigidity result characterizing equality in the isoperimetric inequality for RCD(K, N)-spaces with K > 0 and N ∈ (1, ∞) was also established in [59] (recall Remarks 15.6(b) and 15.10(a) for related rigidity results). The technique of decomposing the isoperimetric inequality into those on needles was then used in [58] to show a quantitative isoperimetric inequality for essentially non-branching CD(K, N)-spaces with K > 0 and N ∈ (1, ∞). Furthermore, in [166], a quantitative version of the Bakry–Ledoux isoperimetric inequality was investigated for weighted Riemannian (or reversible Finsler) manifolds with Ric_∞ ≥ K > 0 (recall also Remark 15.10(b)).

Cavalletti–Mondino continued their work and studied various functional inequalities in [60], again for essentially non-branching CD(K, N)-spaces with N ∈ (1, ∞). In [211], the rigidity problem for the logarithmic Sobolev inequality under


Ric∞ ≥ K > 0 (as in Theorem 18.10) was investigated for weighted Riemannian manifolds, and we have a characterization of equality in the same manner as the spectral gap (Poincaré inequality) in [76] and the isoperimetric inequality in [184] (recall Remarks 15.6(b) and 15.10(a), respectively). See also [62] for a quantitative version of the Poincaré–Lichnerowicz inequality (recall Remark 15.6(c)). We refer to [63] for the needle decomposition and isoperimetric inequalities under the measure contraction property MCP(K, N) with N ∈ (1, ∞), which is, as we mentioned in Subsect. 18.5.3, a synthetic geometric notion of the lower Ricci curvature bound weaker than the curvature-dimension condition CD(K, N ).

References

1. Abate, M., Patrizio, G.: Finsler Metrics—A Global Approach. With Applications to Geometric Function Theory. Lecture Notes in Mathematics, vol. 1591. Springer, Berlin (1994)
2. Agrachev, A., Barilari, D., Rizzi, L.: Curvature: a variational approach. Mem. Am. Math. Soc. 256, 1225 (2018)
3. Agrachev, A., Lee, P.W.Y.: Generalized Ricci curvature bounds for three dimensional contact subriemannian manifolds. Math. Ann. 360, 209–253 (2014)
4. Akbar-Zadeh, H.: Sur les espaces de Finsler à courbures sectionnelles constantes. (French) Acad. Roy. Belg. Bull. Cl. Sci. (5) 74, 281–322 (1988)
5. Álvarez Paiva, J.C., Durán, C.E.: Geometric invariants of fanning curves. Adv. Appl. Math. 42, 290–312 (2009)
6. Álvarez Paiva, J.C., Thompson, A.C.: Volumes in normed and Finsler spaces. In: Bao, D., Bryant, R.L., Chern, S.-S., Shen, Z. (eds.) A Sampler of Riemann–Finsler Geometry, pp. 1–48. Cambridge University Press, Cambridge (2004)
7. Ambrosio, L.: Lecture notes on optimal transport problems. In: Colli, P., Rodrigues, J.F. (eds.) Mathematical Aspects of Evolving Interfaces. Lecture Notes in Mathematics, vol. 1812, pp. 1–52. Springer, Berlin (2003)
8. Ambrosio, L.: Calculus, heat flow and curvature-dimension bounds in metric measure spaces. In: Sirakov, B., de Souza, P.N., Viana, M. (eds.) Proceedings of the International Congress of Mathematicians—Rio de Janeiro 2018. Vol. I. Plenary Lectures, pp. 301–340. World Scientific Publishing, Hackensack, NJ (2018)
9. Ambrosio, L., Gigli, N., Savaré, G.: Gradient Flows in Metric Spaces and in the Space of Probability Measures. Birkhäuser Verlag, Basel (2005)
10. Ambrosio, L., Gigli, N., Savaré, G.: Calculus and heat flow in metric measure spaces and applications to spaces with Ricci bounds from below. Invent. Math. 195, 289–391 (2014)
11. Ambrosio, L., Gigli, N., Savaré, G.: Metric measure spaces with Riemannian Ricci curvature bounded from below. Duke Math. J. 163, 1405–1490 (2014)
12. Ambrosio, L., Gigli, N., Savaré, G.: Bakry–Émery curvature-dimension condition and Riemannian Ricci curvature bounds. Ann. Probab. 43, 339–404 (2015)
13. Ambrosio, L., Mondino, A.: Gaussian-type isoperimetric inequalities in RCD(K, ∞) probability spaces for positive K. Atti Accad. Naz. Lincei Rend. Lincei Mat. Appl. 27, 497–514 (2016)
14. Anastasiei, M.: A historical remark on the connections of Chern and Rund. In: Bao, D., Chern, S.-S., Shen, Z. (eds.) Finsler Geometry. Contemporary Mathematics, vol. 196, pp. 171–176. American Mathematical Society, Providence, RI (1996)

© The Author(s), under exclusive license to Springer Nature Switzerland AG 2021 S.-i. Ohta, Comparison Finsler Geometry, Springer Monographs in Mathematics, https://doi.org/10.1007/978-3-030-80650-7


15. Antonelli, P.L., Ingarden, R.S., Matsumoto, M.: The Theory of Sprays and Finsler Spaces with Applications in Physics and Biology. Kluwer Academic Publishers Group, Dordrecht (1993)
16. Auslander, L.: On curvature in Finsler geometry. Trans. Am. Math. Soc. 79, 378–388 (1955)
17. Bačák, M.: Convex Analysis and Optimization in Hadamard Spaces. Walter de Gruyter & Co., Berlin (2014)
18. Bakry, D.: L’hypercontractivité et son utilisation en théorie des semigroupes. (French) In: Bernard, P. (ed.) Lectures on Probability Theory. Lecture Notes in Mathematics, vol. 1581, pp. 1–114. Springer, Berlin (1994)
19. Bakry, D., Bolley, F., Gentil, I.: The Li–Yau inequality and applications under a curvature-dimension condition. Ann. Inst. Fourier (Grenoble) 67, 397–421 (2017)
20. Bakry, D., Émery, M.: Diffusions hypercontractives. (French) In: Azéma, J., Yor, M. (eds.) Séminaire de probabilités, XIX, 1983/84. Lecture Notes in Mathematics, vol. 1123, pp. 177–206. Springer, Berlin (1985)
21. Bakry, D., Gentil, I., Ledoux, M.: Analysis and Geometry of Markov Diffusion Operators. Springer, Cham (2014)
22. Bakry, D., Gentil, I., Scheffer, G.: Sharp Beckner-type inequalities for Cauchy and spherical distributions. Studia Math. 251, 219–245 (2020)
23. Bakry, D., Ledoux, M.: Lévy–Gromov’s isoperimetric inequality for an infinite-dimensional diffusion generator. Invent. Math. 123, 259–281 (1996)
24. Ball, K., Carlen, E.A., Lieb, E.H.: Sharp uniform convexity and smoothness inequalities for trace norms. Invent. Math. 115, 463–482 (1994)
25. Bao, D., Chern, S.-S., Shen, Z.: An Introduction to Riemann–Finsler Geometry. Springer, New York (2000)
26. Bao, D., Lackey, B.: A Hodge decomposition theorem for Finsler spaces. C. R. Acad. Sci. Paris Sér. I Math. 323, 51–56 (1996)
27. Bao, D., Robles, C., Shen, Z.: Zermelo navigation on Riemannian manifolds. J. Differ. Geom. 66, 377–435 (2004)
28. Barchiesi, M., Brancolini, A., Julin, V.: Sharp dimension free quantitative estimates for the Gaussian isoperimetric inequality. Ann. Probab. 45, 668–697 (2017)
29. Barthelmé, T.: A natural Finsler–Laplace operator. Israel J. Math. 196, 375–412 (2013)
30. Barthelmé, T., Colbois, B., Crampon, M., Verovic, P.: Laplacian and spectral gap in regular Hilbert geometries. Tohoku Math. J. (2) 66, 377–407 (2014)
31. Bayle, V.: Propriétés de concavité du profil isopérimétrique et applications. (French) Thèse de Doctorat, Institut Fourier, Université Joseph-Fourier, Grenoble (2003)
32. Belloni, M., Kawohl, B., Juutinen, P.: The p-Laplace eigenvalue problem as p → ∞ in a Finsler metric. J. Eur. Math. Soc. 8, 123–138 (2006)
33. Benoist, Y.: Convexes hyperboliques et fonctions quasisymétriques. (French) Publ. Math. Inst. Hautes Études Sci. 97, 181–237 (2003)
34. Berwald, L.: On Finsler and Cartan geometries. III. Two-dimensional Finsler spaces with rectilinear extremals. Ann. Math. (2) 42, 84–112 (1941)
35. Berwald, L.: Ueber Finslersche und Cartansche Geometrie. IV. Projektivkrümmung allgemeiner affiner Räume und Finslersche Räume skalarer Krümmung. (German) Ann. Math. (2) 48, 755–781 (1947)
36. Bishop, R.L., Crittenden, R.J.: Geometry of Manifolds. Academic, New York, London (1964)
37. Bobkov, S.: A functional form of the isoperimetric inequality for the Gaussian measure. J. Funct. Anal. 135, 39–49 (1996)
38. Bobkov, S.G.: An isoperimetric inequality on the discrete cube, and an elementary proof of the isoperimetric inequality in Gauss space. Ann. Probab. 25, 206–214 (1997)
39. Boonnam, N., Hama, R., Sabau, S.V.: Berwald spaces of bounded curvature are Riemannian. Acta Math. Acad. Paedagog. Nyházi. (N.S.) 33, 339–347 (2017)
40. Borell, C.: The Brunn–Minkowski inequality in Gauss space. Invent. Math. 30, 207–216 (1975)
41. Borell, C.: Convex set functions in d-space. Period. Math. Hungar. 6, 111–136 (1975)


42. Bouleau, N., Hirsch, F.: Dirichlet Forms and Analysis on Wiener Space. Walter de Gruyter & Co., Berlin (1991)
43. Brascamp, H.J., Lieb, E.H.: On extensions of the Brunn–Minkowski and Prékopa–Leindler theorems, including inequalities for log concave functions, and with an application to the diffusion equation. J. Funct. Anal. 22, 366–389 (1976)
44. Brenier, Y.: Polar factorization and monotone rearrangement of vector-valued functions. Commun. Pure Appl. Math. 44, 375–417 (1991)
45. Brézis, H.: Opérateurs maximaux monotones et semi-groupes de contractions dans les espaces de Hilbert (French). North-Holland Publishing Co., Amsterdam, London; American Elsevier Publishing Co., Inc., New York (1973)
46. Bridson, M.R., Haefliger, A.: Metric Spaces of Non-positive Curvature. Springer, Berlin (1999)
47. Brock, J., Farb, B.: Curvature and rank of Teichmüller space. Am. J. Math. 128, 1–22 (2006)
48. Brué, E., Semola, D.: Constancy of the dimension for RCD(K, N) spaces via regularity of Lagrangian flows. Commun. Pure Appl. Math. 73, 1141–1204 (2020)
49. Bryant, R.L.: Projectively flat Finsler 2-spheres of constant curvature. Selecta Math. (N.S.) 3, 161–203 (1997)
50. Bucataru, I., Miron, R.: Finsler–Lagrange Geometry. Applications to Dynamical Systems. Editura Academiei Române, Bucharest (2007)
51. Burago, D., Burago, Yu., Ivanov, S.: A Course in Metric Geometry. American Mathematical Society, Providence, RI (2001)
52. Busemann, H.: Spaces with non-positive curvature. Acta Math. 80, 259–310 (1948)
53. Busemann, H.: The geometry of Finsler spaces. Bull. Am. Math. Soc. 56, 5–16 (1950)
54. Caffarelli, L.A., Feldman, M., McCann, R.J.: Constructing optimal maps for Monge’s transport problem as a limit of strictly convex costs. J. Am. Math. Soc. 15, 1–26 (2002)
55. Cartan, É.: Sur les espaces de Finsler. C. R. Acad. Sci. Paris 196, 582–586 (1933)
56. Cavalletti, F.: Monge problem in metric measure spaces with Riemannian curvature-dimension condition. Nonlinear Anal. 99, 136–151 (2014)
57. Cavalletti, F.: Decomposition of geodesics in the Wasserstein space and the globalization problem. Geom. Funct. Anal. 24, 493–551 (2014)
58. Cavalletti, F., Maggi, F., Mondino, A.: Quantitative isoperimetry à la Levy–Gromov. Commun. Pure Appl. Math. 72, 1631–1677 (2019)
59. Cavalletti, F., Mondino, A.: Sharp and rigid isoperimetric inequalities in metric-measure spaces with lower Ricci curvature bounds. Invent. Math. 208, 803–849 (2017)
60. Cavalletti, F., Mondino, A.: Sharp geometric and functional inequalities in metric measure spaces with lower Ricci curvature bounds. Geom. Topol. 21, 603–645 (2017)
61. Cavalletti, F., Mondino, A.: New formulas for the Laplacian of distance functions and applications. Anal. PDE 13, 2091–2147 (2020)
62. Cavalletti, F., Mondino, A., Semola, D.: Quantitative Obata’s theorem. Preprint (2019). Available at arXiv:1910.06637
63. Cavalletti, F., Santarcangelo, F.: Isoperimetric inequality under measure-contraction property. J. Funct. Anal. 277, 2893–2917 (2019)
64. Centore, P.: A mean-value Laplacian for Finsler spaces. In: Antonelli, P.L., Lackey, B.C. (eds.) The Theory of Finslerian Laplacians and Applications, pp. 151–186. Kluwer Academic Publishers, Dordrecht (1998)
65. Centore, P.: Finsler Laplacians and minimal-energy maps. Int. J. Math. 11, 1–13 (2000)
66. Chavel, I.: Riemannian Geometry. A Modern Introduction, 2nd edn. Cambridge University Press, Cambridge (2006)
67. Cheeger, J.: Differentiability of Lipschitz functions on metric measure spaces. Geom. Funct. Anal. 9, 428–517 (1999)
68. Cheeger, J., Colding, T.H.: Lower bounds on Ricci curvature and the almost rigidity of warped products. Ann. Math. (2) 144, 189–237 (1996)
69. Cheeger, J., Colding, T.H.: On the structure of spaces with Ricci curvature bounded below. I. J. Differ. Geom. 46, 406–480 (1997)


70. Cheeger, J., Colding, T.H.: On the structure of spaces with Ricci curvature bounded below. II. J. Differ. Geom. 54, 13–35 (2000)
71. Cheeger, J., Colding, T.H.: On the structure of spaces with Ricci curvature bounded below. III. J. Differ. Geom. 54, 37–74 (2000)
72. Cheeger, J., Gromoll, D.: The splitting theorem for manifolds of nonnegative Ricci curvature. J. Differ. Geom. 6, 119–128 (1971/72)
73. Cheeger, J., Gromoll, D.: On the structure of complete manifolds of nonnegative curvature. Ann. Math. (2) 96, 413–443 (1972)
74. Chen, Q., Jost, J., Wang, G.: A maximum principle for generalizations of harmonic maps in Hermitian, affine, Weyl, and Finsler geometry. J. Geom. Anal. 25, 2407–2426 (2015)
75. Cheng, S.Y., Yau, S.T.: Differential equations on Riemannian manifolds and their geometric applications. Commun. Pure Appl. Math. 28, 333–354 (1975)
76. Cheng, X., Zhou, D.: Eigenvalues of the drifted Laplacian on complete metric measure spaces. Commun. Contemp. Math. 19, 1650001, 17 pp. (2017)
77. Chow, B., Chu, S.-C., Glickenstein, D., Guenther, C., Isenberg, J., Ivey, T., Knopf, D., Lu, P., Luo, F., Ni, L.: The Ricci Flow: Techniques and Applications. Part I. Geometric Aspects. American Mathematical Society, Providence, RI (2007)
78. Chow, B., Lu, P., Ni, L.: Hamilton’s Ricci Flow. American Mathematical Society, Providence, RI; Science Press Beijing, New York (2006)
79. Cianchi, A., Fusco, N., Maggi, F., Pratelli, A.: On the isoperimetric deficit in Gauss space. Am. J. Math. 133, 131–186 (2011)
80. Colbois, B., Verovic, P.: Hilbert geometry for strictly convex domains. Geom. Dedicata 105, 29–42 (2004)
81. Cordero-Erausquin, D., McCann, R.J., Schmuckenschläger, M.: A Riemannian interpolation inequality à la Borell, Brascamp and Lieb. Invent. Math. 146, 219–257 (2001)
82. Cordero-Erausquin, D., McCann, R.J., Schmuckenschläger, M.: Prékopa–Leindler type inequalities on Riemannian manifolds, Jacobi fields, and optimal transport. Ann. Fac. Sci. Toulouse Math. (6) 15, 613–635 (2006)
83. Crampin, M.: Randers spaces with reversible geodesics. Publ. Math. Debrecen 67, 401–409 (2005)
84. Crandall, M.G., Liggett, T.M.: Generation of semi-groups of nonlinear transformations on general Banach spaces. Am. J. Math. 93, 265–298 (1971)
85. Davies, E.B.: Heat Kernels and Spectral Theory. Cambridge University Press, Cambridge (1989)
86. Dellacherie, C., Meyer, P.-A.: Probabilities and Potential. North-Holland Publishing Co., Amsterdam, New York (1978)
87. Deng, S.: Homogeneous Finsler Spaces. Springer, New York (2012)
88. DiBenedetto, E.: C^{1+α} local regularity of weak solutions of degenerate elliptic equations. Nonlinear Anal. 7, 827–850 (1983)
89. Duoandikoetxea, J.: Fourier Analysis. American Mathematical Society, Providence, RI (2001)
90. Earle, J.C., Eells, J.: On the differential geometry of Teichmüller spaces. J. Analyse Math. 19, 35–52 (1967)
91. Egloff, D.: Uniform Finsler Hadamard manifolds. Ann. Inst. H. Poincaré Phys. Théor. 66, 323–357 (1997)
92. Eldan, R.: A two-sided estimate for the Gaussian noise stability deficit. Invent. Math. 201, 561–624 (2015)
93. Eminenti, M., La Nave, G., Mantegazza, C.: Ricci solitons: the equation point of view. Manuscripta Math. 127, 345–367 (2008)
94. Erbar, M.: The heat equation on manifolds as a gradient flow in the Wasserstein space. Ann. Inst. Henri Poincaré Probab. Stat. 46, 1–23 (2010)
95. Erbar, M., Kuwada, K., Sturm, K.-T.: On the equivalence of the entropic curvature-dimension condition and Bochner’s inequality on metric measure spaces. Invent. Math. 201, 993–1071 (2015)


96. Evans, L.C.: Partial Differential Equations. American Mathematical Society, Providence, RI (1998)
97. Evans, L.C., Gangbo, W.: Differential equations methods for the Monge–Kantorovich mass transfer problem. Mem. Am. Math. Soc. 137, 653 (1999)
98. Fang, F., Li, X.-D., Zhang, Z.: Two generalizations of Cheeger–Gromoll splitting theorem via Bakry–Emery Ricci curvature. Ann. Inst. Fourier (Grenoble) 59, 563–573 (2009)
99. Feldman, M., McCann, R.J.: Monge’s transport problem on a Riemannian manifold. Trans. Am. Math. Soc. 354, 1667–1697 (2002)
100. Finsler, P.: Über Kurven und Flächen in allgemeinen Räumen. (German) Dissertation, Göttingen (1918). Reprinted by Verlag Birkhäuser, Basel (1951)
101. Foulon, P.: Géométrie des équations différentielles du second ordre. (French) Ann. Inst. H. Poincaré Phys. Théor. 45, 1–28 (1986)
102. Fukaya, K.: Collapsing of Riemannian manifolds and eigenvalues of Laplace operator. Invent. Math. 87, 517–547 (1987)
103. Funk, P.: Über Geometrien, bei denen die Geraden die Kürzesten sind. (German) Math. Ann. 101, 226–237 (1929)
104. Ge, Y., Shen, Z.: Eigenvalues and eigenfunctions of metric measure manifolds. Proc. Lond. Math. Soc. (3) 82, 725–746 (2001)
105. Gentil, I., Zugmeyer, S.: A family of Beckner inequalities under various curvature-dimension conditions. Bernoulli 27, 751–771 (2021)
106. Gigli, N.: The splitting theorem in non-smooth context. Preprint (2013). Available at arXiv:1302.5555
107. Gigli, N.: An overview of the proof of the splitting theorem in spaces with non-negative Ricci curvature. Anal. Geom. Metr. Spaces 2, 169–213 (2014)
108. Gigli, N.: On the differential structure of metric measure spaces and applications. Mem. Am. Math. Soc. 236, 1113 (2015)
109. Gigli, N., Kuwada, K., Ohta, S.: Heat flow on Alexandrov spaces. Commun. Pure Appl. Math. 66, 307–331 (2013)
110. Gigli, N., Ledoux, M.: From log Sobolev to Talagrand: a quick proof. Discrete Contin. Dyn. Syst. 33, 1927–1935 (2013)
111. Grifone, J.: Structure presque-tangente et connexions. I. (French) Ann. Inst. Fourier (Grenoble) 22, 287–334 (1972)
112. Gromov, M.: Metric Structures for Riemannian and Non-Riemannian Spaces. Based on the 1981 French Original. With appendices by Katz, M., Pansu, P., Semmes, S. Birkhäuser Boston, Inc., Boston, MA (1999)
113. Gromov, M., Milman, V.D.: Generalization of the spherical isoperimetric inequality to uniformly convex Banach spaces. Compositio Math. 62, 263–282 (1987)
114. Hajłasz, P., Koskela, P.: Sobolev met Poincaré. Mem. Am. Math. Soc. 145, 688 (2000)
115. Hebey, E.: Sobolev Spaces on Riemannian Manifolds. Lecture Notes in Mathematics, vol. 1635. Springer, Berlin (1996)
116. Hebey, E.: Nonlinear Analysis on Manifolds: Sobolev Spaces and Inequalities. New York University, Courant Institute of Mathematical Sciences, New York; American Mathematical Society, Providence, RI (1999)
117. Heinonen, J.: Lectures on Analysis on Metric Spaces. Springer, New York (2001)
118. Heinonen, J.: Nonsmooth calculus. Bull. Am. Math. Soc. (N.S.) 44, 163–232 (2007)
119. Heinonen, J., Koskela, P.: Quasiconformal maps in metric spaces with controlled geometry. Acta Math. 181, 1–61 (1998)
120. Heinonen, J., Koskela, P., Shanmugalingam, N., Tyson, J.T.: Sobolev Spaces on Metric Measure Spaces. An Approach Based on Upper Gradients. Cambridge University Press, Cambridge (2015)
121. Hilbert, D.: Über die gerade Linie als kürzeste Verbindung zweier Punkte. (German) Math. Ann. 46, 91–96 (1895)
122. Ichijyō, Y.: Finsler manifolds modeled on a Minkowski space. J. Math. Kyoto Univ. 16, 639–652 (1976)


123. Ivanov, N.V.: A short proof of non-Gromov hyperbolicity of Teichmüller spaces. Ann. Acad. Sci. Fenn. Math. 27, 3–5 (2002)
124. Ivanov, S.: Volume comparison via boundary distances. In: Bhatia, R., Pal, A., Rangarajan, G., Srinivas, V., Vanninathan, M. (eds.) Proceedings of the International Congress of Mathematicians. Volume II, pp. 769–784. Hindustan Book Agency, New Delhi (2010)
125. Ivanov, S., Lytchak, A.: Rigidity of Busemann convex Finsler metrics. Comment. Math. Helv. 94, 855–868 (2019)
126. John, F.: Extremum problems with inequalities as subsidiary conditions. Studies and Essays Presented to R. Courant on his 60th Birthday, January 8, 1948, pp. 187–204. Interscience Publishers, Inc., New York, NY (1948)
127. Jordan, R., Kinderlehrer, D., Otto, F.: The variational formulation of the Fokker–Planck equation. SIAM J. Math. Anal. 29, 1–17 (1998)
128. Jost, J.: Equilibrium maps between metric spaces. Calc. Var. Partial Differ. Equ. 2, 173–204 (1994)
129. Jost, J.: Convex functionals and generalized harmonic maps into spaces of nonpositive curvature. Comment. Math. Helv. 70, 659–673 (1995)
130. Jost, J.: Nonpositive Curvature: Geometric and Analytic Aspects. Birkhäuser Verlag, Basel (1997)
131. Juillet, N.: Geometric inequalities and generalized Ricci bounds in the Heisenberg group. Int. Math. Res. Not. IMRN, 2347–2373 (2009)
132. Kannan, R., Lovász, L., Simonovits, M.: Isoperimetric problems for convex bodies and a localization lemma. Discrete Comput. Geom. 13, 541–559 (1995)
133. Katok, A.B.: Ergodic perturbations of degenerate integrable Hamiltonian systems. (Russian) Izv. Akad. Nauk SSSR Ser. Mat. 37, 539–576 (1973) English translation in Math. USSR-Izv. 7, 535–572 (1973)
134. Kell, M.: Sectional curvature-type conditions on metric spaces. J. Geom. Anal. 29, 616–655 (2019)
135. Kell, M.: A note on non-negatively curved Berwald spaces. Preprint (2015). Available at arXiv:1502.03764
136. Ketterer, C.: Obata’s rigidity theorem for metric measure spaces. Anal. Geom. Metr. Spaces 3, 278–295 (2015)
137. Kim, C.-W., Min, K.: Finsler metrics with positive constant flag curvature. Arch. Math. (Basel) 92, 70–79 (2009)
138. Klartag, B.: Needle decompositions in Riemannian geometry. Mem. Am. Math. Soc. 249, 1180 (2017)
139. Kolesnikov, A.V., Milman, E.: Brascamp–Lieb-type inequalities on weighted Riemannian manifolds with boundary. J. Geom. Anal. 27, 1680–1702 (2017)
140. Kristály, A., Kozma, L.: Metric characterization of Berwald spaces of non-positive flag curvature. J. Geom. Phys. 56, 1257–1270 (2006)
141. Kristály, A., Rudas, I.J.: Elliptic problems on the ball endowed with Funk-type metrics. Nonlinear Anal. 119, 199–208 (2015)
142. Kristály, A., Varga, C., Kozma, L.: The dispersing of geodesics in Berwald spaces of nonpositive flag curvature. Houston J. Math. 30, 413–420 (2004)
143. Kuwada, K.: Duality on gradient estimates and Wasserstein controls. J. Funct. Anal. 258, 3758–3774 (2010)
144. Kuwada, K.: A probabilistic approach to the maximal diameter theorem. Math. Nachr. 286, 374–378 (2013)
145. Kuwada, K.: Space-time Wasserstein controls and Bakry–Ledoux type gradient estimates. Calc. Var. Partial Differ. Equ. 54, 127–161 (2015)
146. Kuwae, K., Li, X.-D.: New Laplacian comparison theorem and its applications to diffusion processes on Riemannian manifolds. Preprint (2020). Available at arXiv:2001.00444
147. Lakzian, S.: Differential Harnack estimates for positive solutions to heat equation under Finsler–Ricci flow. Pacif. J. Math. 278, 447–462 (2015)


148. Ledoux, M.: The Concentration of Measure Phenomenon. American Mathematical Society, Providence, RI (2001)
149. Lee, P.W.Y.: Displacement interpolations from a Hamiltonian point of view. J. Funct. Anal. 265, 3163–3203 (2013)
150. Lee, P.W.Y.: On measure contraction property without Ricci curvature lower bound. Potential Anal. 44, 27–41 (2016)
151. Leoni, G.: A First Course in Sobolev Spaces. American Mathematical Society, Providence, RI (2009)
152. Lévy, P.: Leçons d’analyse fonctionnelle. Gauthier-Villars, Paris (1922)
153. Lévy, P.: Problèmes concrets d’analyse fonctionnelle. Avec un complément sur les fonctionnelles analytiques par F. Pellegrino. (French) 2nd edn. Gauthier-Villars, Paris (1951)
154. Li, B.: On the classification of projectively flat Finsler metrics with constant flag curvature. Adv. Math. 257, 266–284 (2014)
155. Li, P.: Geometric Analysis. Cambridge University Press, Cambridge (2012)
156. Li, P., Yau, S.-T.: On the parabolic kernel of the Schrödinger operator. Acta Math. 156, 153–201 (1986)
157. Lichnerowicz, A.: Variétés riemanniennes à tenseur C non négatif. (French) C. R. Acad. Sci. Paris Sér. A-B 271, A650–A653 (1970)
158. Lions, J.-L., Magenes, E.: Non-homogeneous Boundary Value Problems and Applications, vol. I. Springer, New York/Heidelberg (1972)
159. Lott, J., Villani, C.: Hamilton–Jacobi semigroup on length spaces and applications. J. Math. Pures Appl. 88, 219–229 (2007)
160. Lott, J., Villani, C.: Weak curvature conditions and functional inequalities. J. Funct. Anal. 245, 311–333 (2007)
161. Lott, J., Villani, C.: Ricci curvature for metric-measure spaces via optimal transport. Ann. Math. (2) 169, 903–991 (2009)
162. Lovász, L., Simonovits, M.: Random walks in a convex body and an improved volume algorithm. Random Struct. Algor. 4, 359–412 (1993)
163. Lu, Y., Minguzzi, E., Ohta, S.: Comparison theorems on weighted Finsler manifolds and spacetimes with ε-range. Preprint (2020). Available at arXiv:2007.00219
164. Mai, C.H.: On Riemannian manifolds with positive weighted Ricci curvature of negative effective dimension. Kyushu J. Math. 73, 205–218 (2019)
165. Mai, C.H.: Rigidity for the isoperimetric inequality of negative effective dimension on weighted Riemannian manifolds. Geom. Dedicata 202, 213–232 (2019)
166. Mai, C.H., Ohta, S.: Quantitative estimates for the Bakry–Ledoux isoperimetric inequality. Comment. Math. Helv. (to appear). Available at arXiv:1910.13686
167. Masur, H.: On a class of geodesics in Teichmüller space. Ann. Math. (2) 102, 205–221 (1975)
168. Masur, H.: Geometry of Teichmüller space with the Teichmüller metric. In: Ji, L., Wolpert, S.A., Yau, S.-T. (eds.) Geometry of Riemann Surfaces and Their Moduli Spaces. Surveys in Differential Geometry, vol. 14, pp. 295–313. International Press, Somerville, MA (2009)
169. Masur, H.A., Wolf, M.: Teichmüller space is not Gromov hyperbolic. Ann. Acad. Sci. Fenn. Ser. A I Math. 20, 259–267 (1995)
170. Matveev, V.S.: There exist no locally symmetric Finsler spaces of positive or negative flag curvature. C. R. Math. Acad. Sci. Paris 353, 81–83 (2015)
171. Mayer, U.F.: Gradient flows on nonpositively curved metric spaces and harmonic maps. Commun. Anal. Geom. 6, 199–253 (1998)
172. Maz’ya, V.: Sobolev Spaces with Applications to Elliptic Partial Differential Equations. Second, revised and augmented edition. Springer, Heidelberg (2011)
173. McCann, R.J.: A convexity principle for interacting gases. Adv. Math. 128, 153–179 (1997)
174. McCann, R.J.: Polar factorization of maps on Riemannian manifolds. Geom. Funct. Anal. 11, 589–608 (2001)
175. McCarthy, J.D., Papadopoulos, A.: The visual sphere of Teichmüller space and a theorem of Masur–Wolf. Ann. Acad. Sci. Fenn. Math. 24, 147–154 (1999)

176. McCarthy, J.D., Papadopoulos, A.: The mapping class group and a theorem of Masur–Wolf. Topology Appl. 96, 75–84 (1999)
177. Milman, E.: On the role of convexity in isoperimetry, spectral gap and concentration. Invent. Math. 177, 1–43 (2009)
178. Milman, E.: Sharp isoperimetric inequalities and model spaces for curvature-dimension-diameter condition. J. Eur. Math. Soc. (JEMS) 17, 1041–1078 (2015)
179. Milman, E.: Beyond traditional curvature-dimension I: new model spaces for isoperimetric and concentration inequalities in negative dimension. Trans. Am. Math. Soc. 369, 3605–3637 (2017)
180. Milman, E., Rotem, L.: Complemented Brunn–Minkowski inequalities and isoperimetry for homogeneous and non-homogeneous measures. Adv. Math. 262, 867–908 (2014)
181. Miron, R., Hrimiuc, D., Shimada, H., Sabau, S.V.: The Geometry of Hamilton and Lagrange Spaces. Kluwer Academic Publishers Group, Dordrecht (2001)
182. Miyadera, I.: Nonlinear Semigroups. American Mathematical Society, Providence, RI (1992)
183. Mondino, A., Naber, A.: Structure theory of metric-measure spaces with lower Ricci curvature bounds. J. Eur. Math. Soc. 21, 1809–1854 (2019)
184. Morgan, F.: Geometric Measure Theory. A Beginner’s Guide, 4th edn. Elsevier/Academic Press, Amsterdam (2009)
185. Mossel, E., Neeman, J.: Robust dimension free isoperimetry in Gaussian space. Ann. Probab. 43, 971–991 (2015)
186. Nguyen, V.H.: Φ-entropy inequalities and asymmetric covariance estimates for convex measures. Bernoulli 25, 3090–3108 (2019)
187. Obata, M.: Certain conditions for a Riemannian manifold to be isometric with a sphere. J. Math. Soc. Jpn. 14, 333–340 (1962)
188. Ohta, S.: Regularity of harmonic functions in Cheeger-type Sobolev spaces. Ann. Global Anal. Geom. 26, 397–410 (2004)
189. Ohta, S.: Convexities of metric spaces. Geom. Dedicata 125, 225–250 (2007)
190. Ohta, S.: On the measure contraction property of metric measure spaces. Comment. Math. Helv. 82, 805–828 (2007)
191. Ohta, S.: Gradient flows on Wasserstein spaces over compact Alexandrov spaces. Am. J. Math. 131, 475–516 (2009)
192. Ohta, S.: Uniform convexity and smoothness, and their applications in Finsler geometry. Math. Ann. 343, 669–699 (2009)
193. Ohta, S.: Finsler interpolation inequalities. Calc. Var. Partial Differ. Equ. 36, 211–249 (2009)
194. Ohta, S.: Optimal transport and Ricci curvature in Finsler geometry. Adv. Stud. Pure Math. 57, 323–342 (2010)
195. Ohta, S.: Vanishing S-curvature of Randers spaces. Differ. Geom. Appl. 29, 174–178 (2011)
196. Ohta, S.: Weighted Ricci curvature estimates for Hilbert and Funk geometries. Pacif. J. Math. 265, 185–197 (2013)
197. Ohta, S.: Ricci curvature, entropy, and optimal transport. In: Ollivier, Y., Pajot, H., Villani, C. (eds.) Optimal Transportation. London Mathematical Society Lecture Note Series, vol. 413, pp. 145–199. Cambridge University Press, Cambridge (2014)
198. Ohta, S.: On the curvature and heat flow on Hamiltonian systems. Anal. Geom. Metr. Spaces 2, 81–114 (2014)
199. Ohta, S.: Splitting theorems for Finsler manifolds of nonnegative Ricci curvature. J. Reine Angew. Math. 700, 155–174 (2015)
200. Ohta, S.: (K, N)-convexity and the curvature-dimension condition for negative N. J. Geom. Anal. 26, 2067–2096 (2016)
201. Ohta, S.: Some functional inequalities on non-reversible Finsler manifolds. Proc. Indian Acad. Sci. Math. Sci. 127, 833–855 (2017)
202. Ohta, S.: Nonlinear geometric analysis on Finsler manifolds. Eur. J. Math. 3, 916–952 (2017)
203. Ohta, S.: Needle decompositions and isoperimetric inequalities in Finsler geometry. J. Math. Soc. Jpn. 70, 651–693 (2018)

204. Ohta, S.: A semigroup approach to Finsler geometry: Bakry–Ledoux’s isoperimetric inequality. Comm. Anal. Geom. (to appear). Available at arXiv:1602.00390
205. Ohta, S., Pálfia, M.: Gradient flows and a Trotter–Kato formula of semi-convex functions on CAT(1)-spaces. Am. J. Math. 139, 937–965 (2017)
206. Ohta, S., Sturm, K.-T.: Heat flow on Finsler manifolds. Commun. Pure Appl. Math. 62, 1386–1433 (2009)
207. Ohta, S., Sturm, K.-T.: Non-contraction of heat flow on Minkowski spaces. Arch. Ration. Mech. Anal. 204, 917–944 (2012)
208. Ohta, S., Sturm, K.-T.: Bochner–Weitzenböck formula and Li–Yau estimates on Finsler manifolds. Adv. Math. 252, 429–448 (2014)
209. Ohta, S., Takatsu, A.: Displacement convexity of generalized relative entropies. Adv. Math. 228, 1742–1787 (2011)
210. Ohta, S., Takatsu, A.: Displacement convexity of generalized relative entropies. II. Commun. Anal. Geom. 21, 687–785 (2013)
211. Ohta, S., Takatsu, A.: Equality in the logarithmic Sobolev inequality. Manuscripta Math. 162, 271–282 (2020)
212. Okada, T.: On models of projectively flat Finsler spaces of constant negative curvature. Tensor (N.S.) 40, 117–124 (1983)
213. Otto, F., Villani, C.: Generalization of an inequality by Talagrand and links with the logarithmic Sobolev inequality. J. Funct. Anal. 173, 361–400 (2000)
214. Payne, L.E., Weinberger, H.F.: An optimal Poincaré inequality for convex domains. Arch. Ration. Mech. Anal. 5, 286–292 (1960)
215. Perelman, G.: The entropy formula for the Ricci flow and its geometric applications. Preprint (2002). Available at arXiv:math/0211159
216. Petersen, P.: Riemannian Geometry, 3rd edn. Springer, Cham (2016)
217. Qian, Z.: Estimates for weighted volumes and applications. Quart. J. Math. Oxford Ser. (2) 48, 235–242 (1997)
218. Randers, G.: On an asymmetrical metric in the fourspace of general relativity. Phys. Rev. (2) 59, 195–199 (1941)
219. Renardy, M., Rogers, R.C.: An Introduction to Partial Differential Equations, 2nd edn. Springer, New York (2004)
220. Rund, H.: The Differential Geometry of Finsler Spaces. Springer, Berlin, Göttingen, Heidelberg (1959)
221. Saloff-Coste, L.: Uniformly elliptic operators on Riemannian manifolds. J. Differ. Geom. 36, 417–450 (1992)
222. Santambrogio, F.: Optimal Transport for Applied Mathematicians. Birkhäuser/Springer, Basel (2015)
223. Savaré, G.: Gradient flows and diffusion semigroups in metric spaces under lower curvature bounds. C. R. Math. Acad. Sci. Paris 345, 151–154 (2007)
224. Savaré, G.: Self-improvement of the Bakry–Émery condition and Wasserstein contraction of the heat flow in RCD(K, ∞) metric measure spaces. Discrete Contin. Dyn. Syst. 34, 1641–1661 (2014)
225. Scheffer, G.: Local Poincaré inequalities in non-negative curvature and finite dimension. J. Funct. Anal. 198, 197–228 (2003)
226. Shanmugalingam, N.: Newtonian spaces: an extension of Sobolev spaces to metric measure spaces. Rev. Mat. Iberoamericana 16, 243–279 (2000)
227. Shen, Y.-B., Shen, Z.: Introduction to Modern Finsler Geometry. World Scientific Publishing Co., Singapore (2016)
228. Shen, Z.: Volume comparison and its applications in Riemann–Finsler geometry. Adv. Math. 128, 306–328 (1997)
229. Shen, Z.: The non-linear Laplacian for Finsler manifolds. In: Antonelli, P.L., Lackey, B.C. (eds.) The Theory of Finslerian Laplacians and Applications, pp. 187–198. Kluwer Academic Publishers, Dordrecht (1998)
230. Shen, Z.: Lectures on Finsler Geometry. World Scientific Publishing Co., Singapore (2001)


259. Yin, S.-T., He, Q.: The first eigenfunctions and eigenvalue of the p-Laplacian on Finsler manifolds. Sci. China Math. 59, 1769–1794 (2016)
260. Yosida, K.: Functional Analysis. Reprint of the sixth (1980) edition. Springer, Berlin (1995)
261. Ziller, W.: Geometry of the Katok examples. Ergodic Theory Dynam. Syst. 3, 135–157 (1983)

Index

Symbols
$\overleftarrow{F}$, 18

A
Absolutely continuous, 271
Affine, 264
Alexandrov space, 6, 99, 283, 284, 299
(α, β)-metric, 15
Asymptotic, 258
  bi-, 262

B
Ball
  backward, 28
  forward, 28
Basic section, 42
Berwald space, 65, 76, 106, 109, 130, 264
Bianchi identity
  first, 57
  second, 59
Bishop inequality, 94, 95, 123, 259
  weighted, 124, 151
Bochner inequality, 198, 226, 235, 284
  improved, 168, 169, 197, 198, 215
  integrated, 165, 195, 202
  pointwise, 163
Bochner–Weitzenböck formula
  integrated, 165
  pointwise, 162, 168, 199, 263
Branching point, 291
Busemann function, 257
Busemann nonpositive curvature, 78, 109

C
Cartan tensor, 21
CAT(0)-space, 6, 98, 173, 283
CD(K, N)-space, 274
Cheeger energy, 149
Comparison theorem, 91
  area, 125, 131, 152
  Bishop, 95, 131
  Bishop–Gromov, 123, 273, 279, 285
  Bonnet–Myers, 91, 126, 213, 276
  Busemann, 109
  Cartan–Hadamard, 95
  finite mass, 127, 279
  Laplacian, 150, 258
  Rauch, 106
  triangle, I, 98
  triangle, II, 99
Complete
  -, 260
  backward, 30
  forward, 30
Concentration, 282
  exponential, 228, 283
  function, 282
  normal, 228, 282
Conjugate locus, 87
Conjugate point, 87
Connection, 42
  Berwald, 45
  Cartan, 45
  Chern, 43, 59
  Chern–Rund, 43
  Levi-Civita, 42
Connection 1-form, 43
Contraction, 199, 284

© The Author(s), under exclusive license to Springer Nature Switzerland AG 2021 S.-i. Ohta, Comparison Finsler Geometry, Springer Monographs in Mathematics, https://doi.org/10.1007/978-3-030-80650-7

Coupling, 270
  optimal, 271, 289
Covariant derivative, 34
  along curve, 39
  along geodesic, 41
$\bar{c}$-transform, 272
c-transform, 164
Curvature
  Chern, 59
  flag, 54
  Ricci, 54
  S-, 122, 130, 132, 135
  T-, 67, 103
  weighted Ricci
    Finsler, 121
    Riemannian, 117
Curvature-dimension condition, 119, 120, 199, 211, 274, 296, 299
  Riemannian, 120, 199, 208, 283, 299
Curvature tensor, 49, 50, 59
Cut locus, 86
Cut point, 86
Cyclical monotonicity, 289, 292

D
$(d^2/2)$-concave function, 97, 272
Degenerate point, 291
Diameter, 91
Displacement convexity, 273
Distance, 16
  intrinsic, 147
Divergence, 147
Divergence formula, 147
Dual norm, 21, 100

E
Effective dimension, 119
Eigenfunction, 167, 173, 197
Ellipsoid, 4, 134
  John, 143
Energy, 24, 141, 283
Entropy
  Rényi–Tsallis, 274
  relative, 181, 222, 235, 273
Ergodicity, 209, 223
Essential domain, 147, 153, 162, 184
Exponential map, 28
Exterior boundary measure, 126, 207, 298

F
Finsler–Ricci flow, ix, 122
Finsler manifold, 15
  measured, 120
Finsler structure, 14
  pull-back, 97
Fisher information, 224, 280
Flag, 54
Flagpole, 54
Formal Christoffel symbol, 26
Forward bounded, 30
Forward Cauchy sequence, 30
Fundamental inequality, 21, 24
Fundamental solution, 190
Fundamental tensor, 19
Funk geometry, 30, 77, 138

G -calculus, 120, 155, 193, 221, 284 2 -criterion, 120, 155, 221 2 -operator, 221 Geodesic, 17, 27 minimal, 17, 272 Geodesically complete, 264 Geodesic equation, 27, 40 Geodesic field, 39, 55, 147 Geodesic space, 98 Geodesic spray, 35 Global solution, 171 existence, 178 non-expansion, 180 regularity, 184 Gradient curve, 173, 175, 177 Gradient estimate L1 -, 196–198 L2 -, 194, 197, 198, 284 Li–Yau, 200 Gradient flow, 173, 181, 200, 284 Gradient vector, 146, 175 linearized, 153 Gromov hyperbolic, 76, 78 Ground state energy, 145, 182 Guiding function, 293

H H01 (), 142 1 (), 141 Hloc H 1 (), 142

Index Hamiltonian structure, 16, 163, 284 Hamilton–Jacobi equation, 164 Harmonic function, 149, 262 Harnack inequality, 204 Heat equation, 171 linearized, 188 Heat flow, 10, 171, 284 Heat semigroup adjoint, 190 conjugate, 191 linearized, 188 Hessian, 118, 132, 148, 155, 163 Hilbert–Schmidt norm, 161 Hilbert geometry, 76, 138 Hopf–Lax formula, 164

I
Index form, 82
Index lemma, 88
Indicatrix, 4, 16
Inequality
  Beckner, 120, 228, 234, 241
  Brunn–Minkowski, 273, 276, 277, 285
  HWI, 281
  Jensen, 198, 222
  Lichnerowicz, 212
  logarithmic entropy-energy, 235
  logarithmic Sobolev, 211, 222, 224, 225, 229, 233, 241, 280, 300
  Nash, 237
  Poincaré, 120, 145, 197, 211, 281, 283, 300
  Poincaré–Lichnerowicz, 167, 210, 222, 229, 233, 282
  Sobolev, 234, 238, 241
  Talagrand, 279, 281, 282
  transportation cost, 279
Infinitesimally Hilbertian, 199
Inner product, 4
Isoperimetric inequality, 208
  Bakry–Ledoux, 207, 208, 297, 299
  Gaussian, 207, 218
  Lévy–Gromov, 208, 219, 287, 297, 299
  Milman, 120, 297, 299
  quantitative, 219, 299
Isoperimetric profile, 207, 297

J
Jacobi equation, 49
Jacobi field, 49

K
k-concavity, 98, 103
k-convexity, 98, 107
Kantorovich duality, 272
Kantorovich potential, 272, 294
Kantorovich problem, 270
Kantorovich–Rubinstein duality, 294
Killing form, 76, 135
Killing vector field, 135

L
Laplacian, 148, 156
  linearized, 153, 162, 188
  unweighted, 158
Legendre transformation, 22, 146
Length, 16
Lipschitz function, 17, 22, 218, 258, 288
Locally Minkowskian, 64, 66, 267
Locally uniformly elliptic, 171, 184, 189
Log-concave measure, 119, 130
Lower semi-continuous, 145

M
Mass conservation, 144, 182, 209
Measure, 120
  Busemann–Hausdorff, 116, 122, 130, 133, 139
  Dirac, 270, 293
  Gaussian, 130, 208, 212, 219
  Holmes–Thompson, 116, 139
  Lebesgue, 116, 138, 172, 263, 273
  volume, 117
Measure contraction property, 126, 152, 285, 295, 300
Measured Gromov–Hausdorff convergence, 119, 257, 283
Metric compatible, 40
  almost, 43
Minkowski normed space, 11, 15, 63, 129, 172
Monge problem, 270

N
Needle, 293
Needle decomposition, 119, 208, 213, 219, 282, 287, 293
  conditioned, 295
Nonlinear connection, 35
Norm, 3
  Minkowski, 11


O
Optimal transport, 163, 269, 289, 294
Optimal transport map, 270, 294
Otto calculus, 164

P
$P_2^{\mathrm{ac}}(M; m)$, 271
Parallel form, 76, 138
Parallelogram identity, 4
$P(M)$, 269
Positive homogeneity, 11, 13, 14
$P_p(M)$, 269
Pressureless Euler equation, 158, 164
Pulled-back tangent bundle, 42
Push-forward, 164, 270, 292

R
Randers space, 15, 71, 132
Ray, 32, 257
RCD(K, N)-space, 283
Reference vector, 34
Resolvent operator, 250
Reverse Finsler structure, 18
Reversibility constant, 30, 102, 298
Reversible, 12, 15
ρ(v), 86
Riccati equation, 93, 159
Ricci flow, ix, 191, 222
Ricci soliton, 119
  gradient, 119
Riemannian characterization
  covariant derivative, 38
  curvature, 54
  weighted Ricci curvature, 121
Riemannian metric
  $g_v$, 19
  $g^*_\alpha$, 100
  osculating, 47
Rigidity, 212, 219, 299
  almost, 213, 219, 299

S
Saturated set, 295
Slope, 173
Sobolev norm, 142
Sobolev space, 142
Spectral gap, 167, 197, 212, 300
Splitting, 283
  Berwald, 265, 266
  Cheeger–Gromoll, 120, 257
  diffeomorphic, 263
  warped product, 120, 264
Straight line, 262
Strong convexity, 11, 14
Sub-Riemannian manifold, 285
Subharmonic, 258

T
Teichmüller
  metric, 78
  space, 78
Theorem
  Akbar-Zadeh, 64
  Alexandrov, 97, 272
  disintegration, 293
  Euler, 13
  Hopf–Rinow, 31
  Meyers–Serrin, 144
  Rademacher, 258
  Rellich–Kondrachov, 183, 241
  Riesz–Thorin, 250
  Synge, 86
  Szabó, 66, 267
Torsion-free, 38, 43
Totally convex, 264
Transport ray, 288
  degenerate, 288
Transport set, 291
Triangle inequality, 17

U
Ultracontractivity, 249
Uniform convexity, 5, 99
  constant, 5, 99
Uniform smoothness, 5, 99
  constant, 5, 99
Unit ball, 3
Unit sphere, 4, 16
Upper gradient, 149

V
Variance, 210, 211
Variation formula
  distance function, 82
  first, 79
  second, 82

W
Wasserstein distance, 271
Wasserstein space, 163, 181, 271
Weighted Riemannian manifold, 117
Weight function, 117, 122
Weil–Petersson metric, 78