Quantization, PDEs, and Geometry: The Interplay of Analysis and Mathematical Physics [1st ed.] 3319224069, 978-3-319-22406-0, 978-3-319-22407-7, 3319224077

This book presents four survey articles on different topics in mathematical analysis that are closely linked to concepts


English Pages 314 [322] Year 2016


Table of contents:
Front Matter (pages i–viii)
Gelfand–Shilov Spaces: Structural Properties and Applications to Pseudodifferential Operators in ℝⁿ (pages 1–68)
An Excursion into Berezin–Toeplitz Quantization and Related Topics (pages 69–115)
Global Attraction to Solitary Waves (pages 117–152)
Geodesics in Geometry with Constraints and Applications (pages 153–314)


Operator Theory Advances and Applications 251

Dorothea Bahns Wolfram Bauer Ingo Witt Editors

Quantization, PDEs, and Geometry The Interplay of Analysis and Mathematical Physics

Operator Theory: Advances and Applications Volume 251 Founded in 1979 by Israel Gohberg

Editors: Joseph A. Ball (Blacksburg, VA, USA) Harry Dym (Rehovot, Israel) Marinus A. Kaashoek (Amsterdam, The Netherlands) Heinz Langer (Wien, Austria) Christiane Tretter (Bern, Switzerland) Associate Editors: Vadim Adamyan (Odessa, Ukraine) Wolfgang Arendt (Ulm, Germany) Albrecht Böttcher (Chemnitz, Germany) B. Malcolm Brown (Cardiff, UK) Raul Curto (Iowa, IA, USA) Fritz Gesztesy (Columbia, MO, USA) Pavel Kurasov (Stockholm, Sweden) Vern Paulsen (Houston, TX, USA) Mihai Putinar (Santa Barbara, CA, USA) Ilya M. Spitkovsky (Williamsburg, VA, USA)

Honorary and Advisory Editorial Board: Lewis A. Coburn (Buffalo, NY, USA) Ciprian Foias (College Station, TX, USA) J.William Helton (San Diego, CA, USA) Thomas Kailath (Stanford, CA, USA) Peter Lancaster (Calgary, Canada) Peter D. Lax (New York, NY, USA) Donald Sarason (Berkeley, CA, USA) Bernd Silbermann (Chemnitz, Germany) Harold Widom (Santa Cruz, CA, USA)

Subseries Linear Operators and Linear Systems Subseries editors: Daniel Alpay (Beer Sheva, Israel) Birgit Jacob (Wuppertal, Germany) André C.M. Ran (Amsterdam, The Netherlands) Subseries Advances in Partial Differential Equations Subseries editors: Bert-Wolfgang Schulze (Potsdam, Germany) Michael Demuth (Clausthal, Germany) Jerome A. Goldstein (Memphis, TN, USA) Nobuyuki Tose (Yokohama, Japan) Ingo Witt (Göttingen, Germany)


Editors Dorothea Bahns Mathematisches Institut Georg-August-Universität Göttingen Göttingen, Germany

Wolfram Bauer Institut für Analysis Leibniz Universität Hannover Hannover, Germany

Ingo Witt Mathematisches Institut Georg-August-Universität Göttingen Göttingen, Germany

ISSN 0255-0156 ISSN 2296-4878 (electronic) Operator Theory: Advances and Applications ISBN 978-3-319-22406-0 ISBN 978-3-319-22407-7 (eBook) DOI 10.1007/978-3-319-22407-7 Library of Congress Control Number: 2016930024 Springer Cham Heidelberg New York Dordrecht London © Springer International Publishing Switzerland 2016 This work is subject to copyright. All rights are reserved by the Publisher, whether the whole or part of the material is concerned, specifically the rights of translation, reprinting, reuse of illustrations, recitation, broadcasting, reproduction on microfilms or in any other physical way, and transmission or information storage and retrieval, electronic adaptation, computer software, or by similar or dissimilar methodology now known or hereafter developed. The use of general descriptive names, registered names, trademarks, service marks, etc. in this publication does not imply, even in the absence of a specific statement, that such names are exempt from the relevant protective laws and regulations and therefore free for general use. The publisher, the authors and the editors are safe to assume that the advice and information in this book are believed to be true and accurate at the date of publication. Neither the publisher nor the authors or the editors give a warranty, express or implied, with respect to the material contained herein or for any errors or omissions that may have been made. Printed on acid-free paper This book is published under the trade name Birkhäuser. The registered company is Springer International Publishing AG (www.birkhauser-science.com)

Contents

Introduction . . . vii

Gelfand–Shilov Spaces: Structural Properties and Applications to Pseudodifferential Operators in $\mathbb{R}^n$
T. Gramchev
1 Introduction . . . 1
2 Basic properties of Gelfand–Shilov spaces . . . 4
3 Anisotropic generalizations of Shubin operators . . . 12
4 Elliptic operators with irregular singularity at infinity . . . 36
Appendix: Pseudodifferential operators on Gelfand–Shilov spaces . . . 57
References . . . 64

An Excursion into Berezin–Toeplitz Quantization and Related Topics
M. Engliš
1 The problem of quantization . . . 70
2 The Fock space . . . 78
3 Bergman spaces and their operators . . . 84
4 Basic ideas of Berezin(–Toeplitz) quantization(s) . . . 88
5 Berezin quantization . . . 101
6 Berezin–Toeplitz quantization . . . 109
7 Concluding remarks . . . 112
References . . . 113

Global Attraction to Solitary Waves
A. Comech
1 Solitary waves. Linear stability . . . 117
2 The set of all solitary waves as a global attractor . . . 124
3 Klein–Gordon equation with one oscillator . . . 129
Appendix: The Titchmarsh convolution theorem . . . 144
References . . . 149

Geodesics in Geometry with Constraints and Applications
I. Markina
1 Introduction . . . 153
2 Main definitions . . . 154
3 Carnot groups . . . 170
4 Sub-Riemannian spheres . . . 198
5 Principal bundles . . . 210
6 Rolling manifolds . . . 238
7 Group of diffeomorphisms of the circle . . . 258
8 Appendix A . . . 283
9 Appendix B . . . 303
References . . . 307

Introduction

Over the long history of the natural sciences, many concepts originating in physics have become a rich source of mathematical research. This is particularly true in analysis, where embedding such physical ideas into a more abstract mathematical framework can be expected to provide deeper structural insight. Conversely, the mathematical treatment may lead to new applications and achievements in physics. The present volume collects four topics in modern mathematics that are under ongoing research and closely linked to applications in physics. Written by leading experts in the field, these papers have an introductory character and are accessible to non-specialists. The authors attached importance to a detailed introduction and clear motivation of the subject. Most of the proofs are given or, when too long, cited from the literature. Finally, open questions and some of the most recent approaches to the different topics are discussed.

The article by Todor Gramchev is concerned with linear and semilinear pseudodifferential equations on Euclidean space $\mathbb{R}^n$. Besides the usual local theory, global aspects become important here. A basic observation is that solutions of such equations often decay exponentially at infinity. To obtain sharp results in this direction, solutions of the equations under study are sought in Gelfand–Shilov spaces. Semilinear equations are then regarded as perturbations of the corresponding linear ones, and the linear equations are investigated with the help of various pseudodifferential calculi on $\mathbb{R}^n$, with global estimates on the amplitude functions and special control of the implied constants in these estimates. A number of typical applications are also mentioned, such as the exponential localization in phase space of eigenfunctions of stationary Schrödinger operators with polynomially growing potentials, and the construction of traveling wave solutions of certain nonlinear equations.

The paper by Miroslav Engliš aims to give a flavour of two quantization theories in physics: the Berezin and the Berezin–Toeplitz quantization. As an illuminating example, the cases where the quantized domains are the entire complex space $\mathbb{C}^n$ and the unit disc are studied by employing Toeplitz operators on the Fock and the Bergman space, respectively. Generalizing this concept, the quantization of a domain $\Omega \subset \mathbb{C}^n$ equipped with a Kähler form $\omega$ and the corresponding Poisson bracket is explained. Finally, the author sketches the proof of the existence theorem for Berezin and Berezin–Toeplitz quantization of smoothly bounded strictly pseudoconvex domains, which is based on the Boutet de Monvel theory and Fefferman's expansion of the Szegő kernel.


The article by Andrew Comech discusses the soliton resolution conjecture through the example of a nonlinear wave equation in one space dimension. This is in fact one of the main conjectures in the field of dispersive equations, and it is wide open in space dimensions two and higher. Here, the general motivation behind addressing such questions is explained. Then the Klein–Gordon equation in one space dimension with a nonlinearity located at one point is studied. This equation is globally well posed in the energy space, and the soliton resolution conjecture asserts in this case that any finite energy solution converges as $t \to \infty$ towards a global attractor which is entirely composed of solitary waves. A detailed proof of this result is provided.

The contribution by Irina Markina provides a glimpse into the area of subriemannian geometry and its various applications. In particular, there are close connections to classical mechanics, CR manifolds, and geometric control theory. Starting from a subriemannian metric, a Hamiltonian formalism is explained that produces geodesics under non-holonomic constraints. The author mentions a variety of concrete examples in which the subriemannian geometry originates from a Lie group structure or is induced via a principal fibre bundle. Subsequently, the kinematic system of a manifold rolling on another manifold without twisting and slipping is addressed in the framework of subriemannian geometry. The last part of the paper discusses subriemannian structures on infinite-dimensional Lie groups and the problem of controllability.

Göttingen, 27.05.2015

Operator Theory: Advances and Applications, Vol. 251, 1–68 © Springer International Publishing Switzerland 2016

Gelfand–Shilov Spaces: Structural Properties and Applications to Pseudodifferential Operators in $\mathbb{R}^n$

Todor Gramchev

Abstract. We present the basic definitions and properties of Gelfand–Shilov spaces and discuss applications to the study of the global analytic-Gevrey regularity and the rate of exponential decay of solutions of large classes of (semi-)linear pseudodifferential equations.

Mathematics Subject Classification (2010). Primary: 35S05; Secondary: 35B40, 35B65.

Keywords. Pseudo-differential equations, Gelfand–Shilov spaces, exponential decay, holomorphic extensions.

1. Introduction

In the 1950s, Gelfand and Shilov introduced new spaces of functions on $\mathbb{R}^n$, called $S$ spaces by the authors, suitable for the study of global properties of linear partial differential equations on the whole Euclidean space. More precisely, given $\mu > 0$, $\nu > 0$, we define $S^\mu_\nu(\mathbb{R}^n)$ as the set of all smooth functions $f(x)$, $x \in \mathbb{R}^n$, such that, for some positive constants $A$, $B$, and $C$, the inequalities

$$\sup_{x \in \mathbb{R}^n} |x^\beta \partial_x^\alpha f(x)| \le C A^{|\alpha|} B^{|\beta|} m_{\alpha\beta}, \qquad \alpha, \beta \in \mathbb{Z}^n_+, \tag{1.1}$$

with $m_{\alpha\beta} = |\alpha|^{\mu|\alpha|} |\beta|^{\nu|\beta|}$ hold. In view of Stirling's formula one can use other expressions for $m_{\alpha\beta}$, e.g., $m_{\alpha\beta} = (\alpha!)^\mu (\beta!)^\nu$, $\alpha! = \alpha_1! \cdots \alpha_n!$. The index $\mu$ represents the analytic-Gevrey index, with $\mu < 1$ implying extensions of the elements of $S^\mu_\nu(\mathbb{R}^n)$ to entire functions of exponential type $1/(1-\mu)$, while the index $\nu$ characterizes the type of exponential decay. For more details we refer to [29], see also the part of the present paper dedicated to functional-analytic properties of Gelfand–Shilov spaces.

(Footnote.) This work was completed with the support of a grant from the University of Göttingen. The author's research was partially supported by the Gruppo Nazionale per l'Analisi Matematica, la Probabilità e le loro Applicazioni (GNAMPA) of the Istituto Nazionale di Alta Matematica (INdAM), Italy, and by the Institute of Mathematics and Informatics of the Bulgarian Academy of Sciences.

The mathematical problems we are interested in can be summarized as follows: let $P(x, D)$ be a linear partial differential operator with smooth (analytic-Gevrey) coefficients in $\mathbb{R}^n$. Typically, $P$ is modeled by the linear Schrödinger operator $P = -\Delta + V(x)$, with $V(x)$ being a real-analytic potential. Our goal is to study the global analytic-Gevrey regularity and the decay at infinity of solutions of

$$P u + F(u) = f(x), \qquad x \in \mathbb{R}^n,$$

where $F(u)$ is a semi-linear perturbation (typically a polynomial nonlinearity), while the right-hand side $f$ is a uniformly regular ($C^\infty$, Gevrey, or real-analytic) function with suitable decay properties at infinity. More generally, we consider pseudodifferential operators $P$. The hypotheses on the behavior of the coefficients as $|x| \to \infty$ turn out to be important in order to derive uniform regularity and decay estimates. Moreover, both issues arise in the investigation of models relevant to mathematical physics and applied mathematics.

Broadly speaking, we dwell upon two problems, whose setting will be placed in an abstract functional-analytic framework given by scales of Banach spaces of functions belonging to the Gelfand–Shilov spaces $S^\mu_\nu(\mathbb{R}^n)$. For the sake of simplicity, we consider partial differential operators here, leaving the more general case of pseudodifferential operators for the corresponding sections, where more general assertions will be shown.

• Problem 1 (global regularity and decay). Given a linear partial differential operator $P$ with analytic coefficients, we consider the linear equation

$$P u = f(x), \qquad x \in \mathbb{R}^n, \tag{1.2}$$

where the right-hand side (the source term) f belongs to some functional reg space Ydec (Rn ) of (analytic-Gevrey) smooth functions on Rn which decay rapidly (exponentially, sub-exponentially, or super-exponentially) to zero as |x| → ∞ in a suitable functional space (Gelfand–Shilov space Sνμ (Rn )) with u being a weak solution (usually of “low regularity” and, in the nonlinear case, assuming additional hypothesis on the “slow decay” at infinity) belonging to some functional space X(Rn ) of Schwartz tempered (ultra-) distributions or to some Sobolev space of positive index in the presence of nonlinear terms. The main goal is to find optimal conditions on the linear operator P guarreg (Rn ) as well. Note that if f = 0, and P is a anteeing that u belongs to Ydec second-order linear elliptic operator on Rn , we are in the realm of the study of exponential decay and regularity properties of eigenfunctions of P , a field of great interest in and ramifications to various branches of mathematics and applications, starting from the fundamental work of Agmon [1]. There is an

Gelfand–Shilov Spaces and Pseudodifferential Operators

3

extensive literature on this subject, we only mention some of the work in this direction, namely [6], [15], [32], [56] and the references therein. We mention further that, broadly speaking, the estimates on the decay at infinity of the eigenfunctions ϕ(x) are pointwise and involve at most a finite number of derivatives (usually the first derivatives if one deals with second-order elliptic operators). We have as a simple example  2  2 2 ϕ1 (x) = e−x , ϕ2 (x) = e−x 2 + cos eηx . Both functions are restrictions of entire functions and have pointwise quadratic exponential decay as |x| → ∞. However, while all derivatives of ϕ1 (x) (k) preserve this decay property, the derivatives ϕ2 (x) are unbounded at infin−1 ity provided that k > η . Broadly speaking, the main purpose of this article is to show that the use of Gelfand–Shilov spaces as a functional-analytic framework yields simultaneously uniform analytic-Gevrey regularity and exponential decay excluding oscillations or finite derivative blow-up at infinity. This simple example shows that the pointwise estimates of eigenfunctions starting from the fundamental work of Agmon [1] do not distinguish between the two functions above. • Problem 2 (stability under semilinear perturbations). Given a semilinear equation (1.3) P (x, D)u = f (x) + F (u), x ∈ Rn , where F (u) is a polynomial nonlinear term (or entire function), at least quadratic at u = 0 with u being a “low regularity” solution belonging to some Banach algebra (because of the nonlinearity) X(Rn ) of finitely smooth functions, we are interested in finding optimal conditions on the linear operator reg (Rn ). As a particuP and the nonlinear term F (u) implying that u ∈ Ydec n lar case we consider P u = F (u) on R , relevant to the study of qualitative properties of solitary wave-type solutions of semilinear equations in mathereg n dec n (Rn ) = Adec matical physics. 
Here we take Ydec un (R ), where Aun (R ) stands n for the space of real-analytic functions on R which extend to holomorphic functions in the strip {z ∈ Cn : |z| < T }, T > 0, having exponential decay n as |x| → ∞. In fact, as we see in the sequel, Adec un (R ) coincides with the 1 n Gelfand–Shilov space S1 (R ). The objective of this article is to give a survey of some recent results on the above two problems, to propose some novel assertions exhibiting generalizations and refinements of previous results, and to outline some new directions in the study of global properties of pseudo-differential operators in Gelfand–Shilov spaces. Next, we outline the three classes of model Schr¨odinger operators representing the particular cases of the more general setting in the context of global regularity and exponential decay at infinity of solutions of (semi-) linear pseudodifferential equations on Rn , P u = −Δu + V (x)u, x ∈ Rn , where V (x) is real-analytic potential (not necessarily real-valued).
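To make the example in Problem 1 concrete, here is a short worked computation (ours, not part of the original text). It reads the example as $\varphi_2(x) = e^{-x^2}(2 + \cos e^{\eta x^2})$ and assumes $\eta > 0$; only the first derivative is written out.

```latex
\begin{aligned}
\varphi_2(x)  &= e^{-x^2}\bigl(2 + \cos e^{\eta x^2}\bigr),\\
\varphi_2'(x) &= -2x\,e^{-x^2}\bigl(2 + \cos e^{\eta x^2}\bigr)
                 \;-\; 2\eta x\,e^{(\eta-1)x^2}\,\sin e^{\eta x^2}.
\end{aligned}
```

The first term keeps the Gaussian decay. The second has amplitude $2\eta |x| e^{(\eta - 1)x^2}$, which is unbounded when $\eta > 1$, that is, when $k = 1 > \eta^{-1}$. Each further differentiation of the oscillating factor brings in another factor of size $2\eta x\, e^{\eta x^2}$, which is consistent with the threshold $k > \eta^{-1}$ quoted in the text.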


Broadly speaking, we have three different classes of potentials, which provide model cases for the more general classes of pseudodifferential operators that will appear in the complete results.

Case 1 (free particle operator). $V(x) = c^2$, $c > 0$. We recognize as a particular case the equation for solitary wave-type solutions $-v'' + c^2 v = F(v)$.

Case 2 (anisotropic harmonic oscillators). The starting model is the harmonic oscillator $V(x) = |x|^2 = x_1^2 + \cdots + x_n^2$. This is historically the most interesting case. It is also the model which motivated Shubin's pioneering work in the 1970s on pseudodifferential operators in $\mathbb{R}^n$, in particular the fundamental book [63], which led to new notions and results on pseudodifferential operators generalizing the harmonic oscillator, e.g., on hypoellipticity and spectral properties of such classes of pseudodifferential operators. Actually, we will address the case of an anisotropic harmonic oscillator, namely when $V(x) = |x|^{2k} + {}$lower-order terms.

Case 3 (irregular singularity at infinity). $V(x) = O(\langle x \rangle^{-\sigma})$, $|x| \to \infty$, for some $0 < \sigma < n$, where we use the notation $\langle x \rangle = (1 + |x|^2)^{1/2}$. We point out that the case $\sigma \ge n$ has been investigated in different contexts in the fundamental papers [45], [47], [48]. In a recent paper [17], the decay and regularity issues for $0 < \sigma < n$ have been settled completely in the framework of Gelfand–Shilov spaces.

The paper is organized as follows: Section 2 reviews some basic properties of Gelfand–Shilov spaces. In particular, we address the issue of relating the Gelfand–Shilov regularity of a function in the symmetric case $S^\mu_\mu(\mathbb{R}^n)$ with the decay properties of the sequence of its Fourier coefficients defined by an arbitrary positive globally elliptic Shubin-type differential operator. In Section 3, we dwell upon the Gelfand–Shilov regularity of solutions of (semi-)linear anisotropic harmonic oscillator-type differential equations following [16], with some novelties allowing more general classes of nonlinear terms. Section 4 deals with the Gelfand–Shilov regularity of solutions of semilinear elliptic equations with irregular singular behavior as $|x| \to \infty$ (Case 3, where $0 < \sigma < n$). The arguments here are highly nontrivial and technically quite involved, as one has to consider non-elliptic SG pseudodifferential operators in Gelfand–Shilov classes. We also provide an appendix with self-contained material on hypoelliptic SG pseudodifferential operators.
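As a concrete instance of Case 1 above, the following sketch checks numerically that $v(x) = \tfrac{3c^2}{2}\operatorname{sech}^2(cx/2)$ solves $-v'' + c^2 v = v^2$ in one dimension and decays like $e^{-c|x|}$. The choice $F(v) = v^2$, the function names, the step size and the sample grid are our assumptions for illustration; the text itself does not single out this nonlinearity.

```python
import math

def soliton(x, c):
    """v(x) = (3 c^2 / 2) sech^2(c x / 2), a solution of -v'' + c^2 v = v^2."""
    return 1.5 * c * c / math.cosh(0.5 * c * x) ** 2

def residual(x, c, h=1e-3):
    """Central finite-difference residual of -v'' + c^2 v - v^2 at the point x."""
    v0 = soliton(x, c)
    second = (soliton(x + h, c) - 2.0 * v0 + soliton(x - h, c)) / (h * h)
    return -second + c * c * v0 - v0 * v0

c = 1.0
# Largest residual on the grid x = -10, -9.9, ..., 10 (small, up to discretization error).
max_residual = max(abs(residual(0.1 * k, c)) for k in range(-100, 101))
# Exponential decay: v(x) * exp(c |x|) stays bounded and tends to 6 c^2.
decay_ratio = soliton(20.0, c) * math.exp(c * 20.0)
```

The profile is real-analytic, extends holomorphically to a strip around the real axis, and decays exponentially, which is the qualitative behavior that Problem 2 associates with the space $S^1_1(\mathbb{R}^n)$.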

2. Basic properties of Gelfand–Shilov spaces

2.1. Scales of defining Sobolev norms

We introduce some scales of Sobolev norms defining the Gelfand–Shilov spaces $S^\mu_\nu(\mathbb{R}^n)$ in (1.1). First of all we recall a result obtained in [22] which provides a useful characterization of $S^\mu_\nu(\mathbb{R}^n)$. For any $s \in \mathbb{R}$, we shall denote by $H^s(\mathbb{R}^n)$ the Sobolev space

$$H^s(\mathbb{R}^n) = \{u \in S'(\mathbb{R}^n) : \langle \xi \rangle^s \hat{u}(\xi) \in L^2(\mathbb{R}^n)\},$$

endowed with the standard norm $\|\langle \cdot \rangle^s \hat{u}(\cdot)\|_{L^2}$, where $\hat{u}$ denotes the Fourier transform of $u$. We recall that $S^\mu_\nu \ne \{0\}$ iff $\mu + \nu \ge 1$. If $\mu < 1$, then the space $S^\mu_\nu$ is a subset of the restrictions to $\mathbb{R}^n$ of the entire functions on $\mathbb{C}^n$, and a nice characterization is given in the book [29].

Proposition 2.1. Let $\mu \in (0, 1)$. Then $f \in S^\mu_\nu(\mathbb{R}^n)$ iff $f$ extends as an entire function to $\mathbb{C}^n$ and there exist $a, b > 0$ such that

$$\sup_{x, y \in \mathbb{R}^n} e^{a|x|^{1/\nu} - b|y|^{1/(1-\mu)}} |f(x + iy)| < +\infty.$$
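The next proposition (cf. [22]) states that, to show that a function belongs to $S^\mu_\nu(\mathbb{R}^n)$, we can prove decay and regularity estimates separately.

Proposition 2.2. Let $\mu > 0$, $\nu > 0$ with $\mu + \nu \ge 1$ and let $f \in C^\infty(\mathbb{R}^n)$. Then the following conditions are equivalent:

i) $f \in S^\mu_\nu(\mathbb{R}^n)$.

ii) There exist positive constants $A$, $B$ such that

$$\sup_{x \in \mathbb{R}^n} |x^k f(x)| \le A^{|k|+1} (k!)^\nu \quad \text{and} \quad \sup_{x \in \mathbb{R}^n} |\partial_x^j f(x)| \le B^{|j|+1} (j!)^\mu \tag{2.1}$$

for all $j, k \in \mathbb{Z}^n_+$.

iii) There exist positive constants $a$, $T$ such that

$$\sup_{x \in \mathbb{R}^n} \exp(a|x|^{1/\nu}) |f(x)| < +\infty \quad \text{and} \quad \sup_{j \in \mathbb{Z}^n_+} \Bigl( T^{-|j|} \, j!^{-\mu} \sup_{x \in \mathbb{R}^n} |\partial_x^j f(x)| \Bigr) < +\infty.$$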
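The decay half of condition ii) can be checked by hand for the one-dimensional Gaussian $f(x) = e^{-x^2/2}$ with $\nu = 1/2$. The sketch below is our own illustration: it uses the closed form $\sup_x |x^k f(x)| = (k/e)^{k/2}$ (attained at $|x| = \sqrt{k}$) and compares it with $A^{k+1}(k!)^{1/2}$ for the assumed constant $A = 1$. The derivative half of (2.1) is not checked here.

```python
import math

def sup_xk_gaussian(k):
    """Exact value of sup_x |x^k exp(-x^2/2)|; the maximum sits at |x| = sqrt(k)."""
    if k == 0:
        return 1.0
    return math.exp(0.5 * k * (math.log(k) - 1.0))

def decay_bound(k, A=1.0, nu=0.5):
    """Right-hand side A^(k+1) (k!)^nu of the decay estimate in (2.1)."""
    return A ** (k + 1) * math.factorial(k) ** nu

# (k, supremum, bound) for k = 0, ..., 59
checks = [(k, sup_xk_gaussian(k), decay_bound(k)) for k in range(60)]
decay_estimate_holds = all(s <= b for _, s, b in checks)
```

The comparison succeeds because $k! \ge (k/e)^k$, so the Gaussian satisfies the first inequality in (2.1) with $\nu = 1/2$.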

2.2. Characterization of $S^\mu_\mu(\mathbb{R}^n)$ by eigenfunction expansions in $\mathbb{R}^n$

We shall give a version in $\mathbb{R}^n$ of some results, already known on compact manifolds, concerning eigenfunction expansions. Broadly speaking, the aim is to relate the regularity of a function with the decay properties of the sequence of its Fourier coefficients. More precisely, we want to reproduce in $\mathbb{R}^n$ the classical results of [61, Section 10] on Sobolev regularity and [62] for analytic functions on a compact manifold, taking into account Weyl asymptotics of eigenvalues. Our basic example of an operator will be the harmonic oscillator appearing in quantum mechanics

$$H = -\Delta + |x|^2, \tag{2.2}$$

whose eigenfunctions are the Hermite functions

$$h_\alpha(x) = H_\alpha(x) e^{-|x|^2/2}, \qquad \alpha = (\alpha_1, \ldots, \alpha_n) \in \mathbb{N}^n, \tag{2.3}$$

where $H_\alpha(x)$ is the $\alpha$th Hermite polynomial, cf. [28]. See for example [57], [52], [43] for related Hermite expansions as well as [31], [71] for connections with a degenerate harmonic oscillator. Here we shall consider a more general class of operators with polynomial coefficients in $\mathbb{R}^n$,

$$P = \sum_{|\alpha| + |\beta| \le m} c_{\alpha\beta} x^\beta D_x^\alpha, \qquad D^\alpha = (-i)^{|\alpha|} \partial_x^\alpha, \tag{2.4}$$

studied by Shubin [63] in the framework of global pseudodifferential calculus, see also [36], [3], [50]. Let us recall, in short, some definitions and results from [63, Chapter IV]. First, global ellipticity of $P$ in (2.4) is defined by imposing

$$p_m(x, \xi) = \sum_{|\alpha| + |\beta| = m} c_{\alpha\beta} x^\beta \xi^\alpha \ne 0 \quad \text{for } (x, \xi) \ne (0, 0). \tag{2.5}$$

This condition is obviously satisfied by the operator $H$ in (2.2). For these operators, the counterpart of the standard Sobolev spaces are the spaces

$$Q^s(\mathbb{R}^n) = \Bigl\{ u \in S'(\mathbb{R}^n) : \|u\|_{Q^s} := \sum_{|\alpha| + |\beta| \le s} \|x^\beta \partial_x^\alpha u\|_{L^2(\mathbb{R}^n)} < +\infty \Bigr\},$$

where $S'(\mathbb{R}^n)$ is the class of the tempered distributions of Schwartz and $s \in \mathbb{N}$. Under the global ellipticity assumption (2.5), $P : Q^m(\mathbb{R}^n) \to L^2(\mathbb{R}^n)$ is a Fredholm operator. The finite-dimensional null-space $\operatorname{Ker} P$ is given by functions in the Schwartz space $S(\mathbb{R}^n)$.

Following Gelfand and Shilov, it is natural to consider as a global counterpart in $\mathbb{R}^n$ of the real-analytic class the inductive (respectively, projective) Gelfand–Shilov classes $S^\mu_\nu(\mathbb{R}^n)$ (respectively, $\Sigma^\mu_\nu(\mathbb{R}^n)$), where $\mu > 0$, $\nu > 0$, $\mu + \nu \ge 1$ (respectively, $\mu + \nu > 1$), defined as the set of all $u \in S(\mathbb{R}^n)$ for which there exist $A > 0$, $C > 0$ (respectively, for every $A > 0$ there exists $C > 0$) such that

$$|x^\beta \partial_x^\alpha u(x)| \le C A^{-|\alpha| - |\beta|} (\alpha!)^\mu (\beta!)^\nu, \qquad \alpha \in \mathbb{N}^n, \ \beta \in \mathbb{N}^n,$$

see [29], [43] and [50, Chapter 6]. In the sequel we shall limit our attention to $S^\mu_\mu(\mathbb{R}^n)$, $\mu \ge 1/2$ (respectively, $\Sigma^\mu_\mu(\mathbb{R}^n)$, $\mu > 1/2$). We recall that $u \in S^\mu_\mu(\mathbb{R}^n)$ (respectively, $u \in \Sigma^\mu_\mu(\mathbb{R}^n)$) iff there exist $A > 0$, $C > 0$ (respectively, for every $A > 0$ one can find $C > 0$) such that

$$\sum_{|\alpha| + |\beta| = s} \|x^\beta \partial_x^\alpha u\|_{L^2(\mathbb{R}^n)} \le C A^{-s} (s!)^\mu, \qquad s \in \mathbb{N}. \tag{2.6}$$

It was shown recently that every solution $u \in S'(\mathbb{R}^n)$ of $P u = 0$ belongs to $S^{1/2}_{1/2}(\mathbb{R}^n)$ provided (2.5) holds, see [11], [15] for details and more general results. We assume, as in Seeley [62], that $P$ is a normal operator (i.e., $P^* P = P P^*$) satisfying the global ellipticity condition (2.5). This guarantees the existence of a basis of orthonormal eigenfunctions $u_j$, $j \in \mathbb{N}$, with eigenvalues $\lambda_j$, $\lim_{j \to \infty} |\lambda_j| = +\infty$ (see Seeley [62] and Shubin [63]). Hence, given $u \in L^2(\mathbb{R}^n)$, or $u \in S'(\mathbb{R}^n)$, we can expand

$$u = \sum_{j=1}^{\infty} a_j u_j, \tag{2.7}$$

where the Fourier coefficient $a_j \in \mathbb{C}$ is defined by

$$a_j = (u, u_j)_{L^2(\mathbb{R}^n)}, \qquad j = 1, 2, \ldots, \tag{2.8}$$

with convergence in $L^2(\mathbb{R}^n)$, or $S'(\mathbb{R}^n)$, for (2.7). In view of [11] the eigenfunctions $u_j$ belong to $S^{1/2}_{1/2}(\mathbb{R}^n)$.

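We state the first main result.

Theorem 2.3. Suppose that $P$ is globally elliptic, cf. (2.4), (2.5), and normal. Then:

(i) $u \in Q^s(\mathbb{R}^n) \iff \sum_{j=1}^{\infty} |a_j|^2 |\lambda_j|^{2s/m} < \infty \iff \sum_{j=1}^{\infty} |a_j|^2 j^{s/n} < \infty$, $s \in \mathbb{N}$.

(ii) $u \in S(\mathbb{R}^n) \iff |a_j| = O(|\lambda_j|^{-s})$, $j \to \infty \iff |a_j| = O(j^{-s})$, $j \to \infty$, for all $s \in \mathbb{N}$.

Next, we show the global analogue to Seeley's theorem in [62].

Theorem 2.4. Let $P$ be as before and $\mu \ge 1/2$ (respectively, $\mu > 1/2$). Then we have that

$$u \in S^\mu_\mu(\mathbb{R}^n) \iff \sum_{j=1}^{\infty} |a_j|^2 e^{\varepsilon |\lambda_j|^{1/(m\mu)}} < \infty \ \text{for some } \varepsilon > 0 \iff \sum_{j=1}^{\infty} |a_j|^2 e^{\varepsilon j^{1/(2n\mu)}} < \infty \ \text{for some } \varepsilon > 0$$

$\iff$ there exist $C > 0$, $\varepsilon > 0$ such that

$$|a_j| \le C e^{-\varepsilon j^{1/(2n\mu)}}, \qquad j \in \mathbb{N} \tag{2.9}$$

(respectively, $u \in \Sigma^\mu_\mu(\mathbb{R}^n) \iff \sum_{j=1}^{\infty} |a_j|^2 e^{\varepsilon |\lambda_j|^{1/(m\mu)}} < \infty$ for all $\varepsilon > 0 \iff \sum_{j=1}^{\infty} |a_j|^2 e^{\varepsilon j^{1/(2n\mu)}} < \infty$ for all $\varepsilon > 0 \iff$ for every $\varepsilon > 0$ there exists $C > 0$ such that $|a_j| \le C e^{-\varepsilon j^{1/(2n\mu)}}$, $j \in \mathbb{N}$).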

Remark 2.5. Choosing as $P$ the harmonic oscillator $H$ in (2.2), with eigenfunctions $h_\alpha(x)$ as in (2.3), we recapture the results on the Hermite expansions related to Gelfand–Shilov type spaces for $n = 1$, whereas, for $n \ge 2$, taking into account the multiplicity of the eigenvalues $\lambda_\alpha = \sum_{j=1}^{n} (2\alpha_j + 1)$ for $h_\alpha$, $\alpha \in \mathbb{N}^n$, we obtain as a particular case of Theorem 2.4 the characterization $u \in S^{1/2}_{1/2}(\mathbb{R}^n)$ iff

$$|a_\alpha| \le C e^{-\varepsilon |\alpha|}, \qquad \alpha \in \mathbb{N}^n,$$

for positive constants $C$ and $\varepsilon$, where $u = \sum_{\alpha \in \mathbb{N}^n} a_\alpha h_\alpha$, cf. [43], [52] and the references therein.

2.3. Proof of Theorem 2.3

It is not restrictive to assume that $P$ is positive, with $\lambda_j > 0$, cf. [62]. We need some preliminary results from [63]. Namely, concerning asymptotics of eigenvalues, [63, Theorem 30.1] and [50, Proposition 4.6.4] give the following lemma.

Lemma 2.6. Let $P$ be globally elliptic of order $m > 0$, cf. (2.4), (2.5), and strictly positive. Then, for the eigenvalues $\lambda_j$, $j = 1, 2, \ldots$, we have

$$\lambda_j \sim C j^{m/(2n)} \quad \text{as } j \to +\infty,$$

for a positive constant $C$.
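Lemma 2.6 can be observed directly for the harmonic oscillator of Remark 2.5. The sketch below is our own illustration for $n = 2$, $m = 2$: it lists the eigenvalues $\lambda_\alpha = (2\alpha_1 + 1) + (2\alpha_2 + 1)$ with multiplicity, sorts them, and compares $\lambda_j$ with $j^{m/(2n)} = j^{1/2}$. The function name and the cut-off of 120 levels are assumptions; the limiting constant $\sqrt{8}$ comes from the standard count of roughly $\lambda^2/8$ eigenvalues below $\lambda$.

```python
import math

def oscillator_eigenvalues(levels):
    """Sorted eigenvalues (2*a1 + 1) + (2*a2 + 1) of -Laplacian + |x|^2 on R^2.

    Only the first `levels` complete levels a1 + a2 = k are kept, so the
    sorted list has no gaps and contains levels * (levels + 1) / 2 entries.
    """
    values = [2 * (a1 + a2) + 2
              for a1 in range(levels)
              for a2 in range(levels - a1)]
    return sorted(values)

eigs = oscillator_eigenvalues(120)
# Lemma 2.6 with m = 2, n = 2 predicts lambda_j ~ C * j^(1/2); here C = sqrt(8).
ratios = {j: eigs[j - 1] / math.sqrt(j) for j in (100, 1000, 5000)}
```

The ratios stay close to $\sqrt{8} \approx 2.83$, oscillating slightly because each eigenvalue $2k + 2$ is repeated $k + 1$ times.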

Now, for $P$ as before and $r \in \mathbb{R}$, $r \ne 0$, introduce the $r$th power

$$P^r u = \sum_{j=1}^{\infty} \lambda_j^r a_j u_j, \tag{2.10}$$

with $a_j$, $u_j$ as in (2.7), (2.8). The operator $P^r$ is well defined as a map $S'(\mathbb{R}^n) \to S'(\mathbb{R}^n)$.

Lemma 2.7. Let $u \in S'(\mathbb{R}^n)$. Then $u \in Q^s(\mathbb{R}^n)$ if and only if $P^{s/m} u \in L^2(\mathbb{R}^n)$, $s \in \mathbb{N}$. The norms $\|u\|_{Q^s}$ and $\|P^{s/m} u\|_{L^2(\mathbb{R}^n)}$ are equivalent.

In fact, $P^{s/m}$ is an elliptic operator of order $s$ in the pseudodifferential calculus of [63], cf. [50, Section 4.3], and consequently $P^{s/m} u \in L^2(\mathbb{R}^n)$ corresponds to $u \in Q^s(\mathbb{R}^n)$ with equivalence of norms, cf. [50, Proposition 2.1.9, Theorem 2.1.12].

Proof of Theorem 2.3. We may now prove (i) in Theorem 2.3. From (2.10) and Parseval's identity,

$$\|P^{s/m} u\|^2_{L^2(\mathbb{R}^n)} = \Bigl\| \sum_{j=1}^{\infty} \lambda_j^{s/m} a_j u_j \Bigr\|^2_{L^2(\mathbb{R}^n)} = \sum_{j=1}^{\infty} \lambda_j^{2s/m} |a_j|^2.$$

In view of Lemma 2.6 we have $\lambda_j^{2s/m} \sim C' j^{s/n}$ and, therefore, from Lemma 2.7,

$$c_1 \|u\|^2_{Q^s} \le \sum_{j=1}^{\infty} j^{s/n} |a_j|^2 \le c_2 \|u\|^2_{Q^s}$$

for suitable positive constants $c_1$, $c_2$. This gives (i). On the other hand, $S(\mathbb{R}^n) = \bigcap_{s \in \mathbb{N}} Q^s(\mathbb{R}^n)$, hence (ii) follows from (i). □

Note also that by Lemma 2.7 we may generalize the definition of $Q^s(\mathbb{R}^n)$ to all $s \in \mathbb{R}$, and (i) extends obviously to these spaces. Finally, we observe that the preceding arguments and the statement of Theorem 2.3 remain valid for any globally elliptic normal pseudo-differential operator in [63].

2.4. Proof of Theorem 2.4

We shall follow the argument of [62, pages 737–738]. Namely, we shall use the following adapted version of the celebrated theorem of the iterates of [42].

Lemma 2.8. Let $P$ be globally elliptic, cf. (2.4), (2.5), of order $m$. Let $\mu \ge 1/2$ and $u \in S'(\mathbb{R}^n)$. Then $u \in S^\mu_\mu(\mathbb{R}^n)$ if and only if, for some $C > 0$,

$$\|P^M u\|_{L^2(\mathbb{R}^n)} \le C^{M+1} (M!)^{\mu m} \quad \text{for all } M \in \mathbb{N}. \tag{2.11}$$

A short proof of Lemma 2.8 will be given in Section 2.5; for more detail we refer to the paper [21]. By applying Lemma 2.8, in the sequel we may then take the estimates (2.11) as an equivalent definition of the class $S^\mu_\mu(\mathbb{R}^n)$, fixing as $P$ the operator in Theorem 2.3. On the other hand, assuming without loss of generality that $u \in S(\mathbb{R}^n)$, we have

$$\|P^M u\|^2_{L^2(\mathbb{R}^n)} = \Bigl\| \sum_{j=1}^{\infty} a_j P^M u_j \Bigr\|^2_{L^2(\mathbb{R}^n)} = \Bigl\| \sum_{j=1}^{\infty} \lambda_j^M a_j u_j \Bigr\|^2_{L^2(\mathbb{R}^n)} = \sum_{j=1}^{\infty} \lambda_j^{2M} |a_j|^2,$$

in view of (2.7), (2.8) and Parseval's identity. It follows from Lemma 2.6 that

$$C_1 \|P^M u\|^2_{L^2(\mathbb{R}^n)} \le \sum_{j=1}^{\infty} j^{mM/n} |a_j|^2 \le C_2 \|P^M u\|^2_{L^2(\mathbb{R}^n)} \tag{2.12}$$

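for suitable positive constants $C_1$, $C_2$. Assume now that the estimate (2.9) is satisfied. Then from the first estimate in (2.12) we have, for some $C > 0$, $\epsilon > 0$,
$$\|P^M u\|^2_{L^2(\mathbb{R}^n)} \le C \sum_{j=1}^{\infty} j^{mM/n} e^{-2\epsilon j^{1/(2n\mu)}} \le \tilde{C} \sup_{j} j^{mM/n} e^{-\epsilon j^{1/(2n\mu)}}, \tag{2.13}$$
where
$$\tilde{C} = C \sum_{j=1}^{\infty} e^{-\epsilon j^{1/(2n\mu)}} .$$
Now observe that the identity
$$e^{\omega j^{1/(2n\mu)}} = \sum_{M=0}^{\infty} \frac{\omega^M j^{M/(2n\mu)}}{M!}$$
implies that, for any $\omega > 0$ and $M \in \mathbb{N}$,
$$j^{M/(2n\mu)} e^{-\omega j^{1/(2n\mu)}} \le \omega^{-M} M! . \tag{2.14}$$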
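The step from the exponential series to (2.14), and the power of (2.14) that is compared with (2.13), can be spelled out as follows. This is only a short worked sketch of the two elementary manipulations; the summation index $K$ is introduced here for illustration.

```latex
% Every term of the exponential series is nonnegative, so each single term
% is bounded by the full sum:
\begin{align*}
\frac{\omega^{M} j^{M/(2n\mu)}}{M!}
  &\le \sum_{K=0}^{\infty} \frac{\omega^{K} j^{K/(2n\mu)}}{K!}
   = e^{\omega j^{1/(2n\mu)}},\\
\text{hence}\quad
j^{M/(2n\mu)}\, e^{-\omega j^{1/(2n\mu)}} &\le \omega^{-M} M! .
\end{align*}
% Raising both sides to the power 2\mu m (monotone on nonnegative numbers)
% and using (j^{M/(2n\mu)})^{2\mu m} = j^{mM/n}:
\[
j^{mM/n}\, e^{-2\mu m \omega\, j^{1/(2n\mu)}}
  \le \bigl(\omega^{-M} M!\bigr)^{2\mu m}.
\]
```

The left-hand side has exactly the form of the supremum in (2.13) once the decay rate there equals $2\mu m\omega$.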

Raising both sides of (2.14) to the $2\mu m$th power and applying the estimate obtained in (2.13) with $\epsilon = 2\mu m \omega$, we have
$$\|P^M u\|^2_{L^2(\mathbb{R}^n)} \le \tilde{C} \bigl(\omega^{-M} M!\bigr)^{2\mu m},$$
which gives (2.11) for some $C > 0$. Similarly, assuming (2.11) and using the second estimate in (2.12), we deduce (2.9). The same computations give the other equivalences in Theorem 2.4. □

2.5. Proof of Lemma 2.8

We shall use the estimates (2.6) as definition of $S^\mu_\mu(\mathbb{R}^n)$. It is then easy to show that $u \in S^\mu_\mu(\mathbb{R}^n)$ implies (2.11). In the opposite direction, we assume (2.11) and prove (2.6). Write for short
$$|u|_s = \sum_{|\alpha|+|\beta|=s} \|x^\beta \partial_x^\alpha u\|_{L^2(\mathbb{R}^n)} . \tag{2.15}$$
The following interpolation result for the semi-norms $|u|_s$ is needed in the case when $m \ge 2$, the integer $m$ being the order of $P$.

Proposition 2.9. There exists a constant $C > 0$ such that for any $s \in \mathbb{N}$, where $s = pm + r$, $p \in \mathbb{N}$, $0 < r < m$, and for all $\epsilon > 0$
$$|u|_s \le \epsilon\, |u|_{(p+1)m} + C \epsilon^{-\frac{r}{m-r}} |u|_{pm} + C^s (s!)^{1/2} \|u\|_{L^2(\mathbb{R}^n)} . \tag{2.16}$$

The proof of Proposition 2.9 is omitted for brevity. A corresponding result for the homogeneous Sobolev spaces is well known, see for example [42, Lemma 3.3 and the remark after]. A novelty with regard to Sobolev spaces is the last term in the right-hand side of (2.16): the factor $(s!)^{1/2}$ comes from the symbolic calculus of [63, Section 24], see also [50, Sections 1.7, 1.8]. Since $\mu \ge 1/2$, Proposition 2.9, with $\epsilon = 1$ say, implies that we may limit ourselves to check (2.6) for $s = pm$, $p = 0, 1, \ldots$. Namely, we shall prove that the sequence
$$\sigma_p(u, \lambda) = (pm)!^{-\mu} \lambda^{-p} |u|_{pm}, \quad p = 0, 1, \ldots, \tag{2.17}$$
is bounded if $\lambda$ is sufficiently large. To this end, we use the following proposition.

Proposition 2.10. Let $P$ be as in Lemma 2.8. Then, for $\lambda > 0$ large enough, we have, for all $p \in \mathbb{N}$,
$$\sigma_{p+1}(u, \lambda) \le [(pm+1) \cdots (pm+m)]^{-\mu} \sigma_p(Pu, \lambda) + \sigma_p(u, \lambda) + \sigma_{p-1}(u, \lambda) + \sigma_0(u, \lambda). \tag{2.18}$$

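In Propositions 2.9, 2.10 and in the sequel we may assume $u \in \mathcal{S}(\mathbb{R}^n)$. Note that $\sigma_0(u, \lambda) = |u|_0 = \|u\|_{L^2(\mathbb{R}^n)}$.

Proof of Proposition 2.10. We recall that $P : Q^m(\mathbb{R}^n) \to L^2(\mathbb{R}^n)$ is Fredholm. Assuming $\operatorname{Ker} P = \{0\}$ for simplicity, we have, for a constant $C > 0$,
$$\|u\|_{Q^s} = \sum_{|\alpha|+|\beta| \le s} \|x^\beta D_x^\alpha u\|_{L^2(\mathbb{R}^n)} \le C \|Pu\|_{L^2(\mathbb{R}^n)}, \quad u \in \mathcal{S}(\mathbb{R}^n). \tag{2.19}$$
Consider then for $|\alpha| + |\beta| = (p+1)m$ the term $\|x^\beta D_x^\alpha u\|_{L^2(\mathbb{R}^n)}$ and write $x^\beta D_x^\alpha u = x^{\beta-\delta} x^\delta D_x^{\alpha-\gamma} D_x^\gamma u$, where we fix $\gamma \le \alpha$, $\delta \le \beta$ so that $|\gamma| + |\delta| = pm$ and $|\alpha - \gamma| + |\beta - \delta| = m$. Therefore, using (2.19), we may estimate
$$\begin{aligned}
\|x^\beta D_x^\alpha u\|_{L^2(\mathbb{R}^n)} &\le \|x^{\beta-\delta} D_x^{\alpha-\gamma} (x^\delta D_x^\gamma u)\|_{L^2(\mathbb{R}^n)} + \|x^{\beta-\delta} [x^\delta, D_x^{\alpha-\gamma}] D_x^\gamma u\|_{L^2(\mathbb{R}^n)} \\
&\le C \|P(x^\delta D_x^\gamma u)\|_{L^2(\mathbb{R}^n)} + \|x^{\beta-\delta} [x^\delta, D_x^{\alpha-\gamma}] D_x^\gamma u\|_{L^2(\mathbb{R}^n)} \\
&\le I_1 + I_2 + I_3,
\end{aligned}$$
where
$$I_1 = C \|x^\delta D_x^\gamma (Pu)\|_{L^2(\mathbb{R}^n)}, \quad I_2 = C \|[P, x^\delta D_x^\gamma] u\|_{L^2(\mathbb{R}^n)}, \quad I_3 = \|x^{\beta-\delta} [x^\delta, D_x^{\alpha-\gamma}] D_x^\gamma u\|_{L^2(\mathbb{R}^n)} .$$
Summing up $I_1$, $I_2$, $I_3$ over all $(\alpha, \beta)$ with $|\alpha| + |\beta| = (p+1)m$, we can estimate $|u|_{(p+1)m}$ and $\sigma_{p+1}(u, \lambda)$ accordingly from (2.15), (2.17). For short,
$$|u|_{(p+1)m} \le J_1 + J_2 + J_3, \qquad \sigma_{p+1}(u, \lambda) \le Y_1 + Y_2 + Y_3 . \tag{2.20}$$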

Since $J_1 \le C |Pu|_{pm}$ for a new constant $C$, then for $\lambda$ sufficiently large
$$Y_1 = ((p+1)m)!^{-\mu} \lambda^{-p-1} J_1 \le [(pm+1) \cdots (pm+m)]^{-\mu} \sigma_p(Pu, \lambda). \tag{2.21}$$

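To treat $Y_2$, by using the expression for $P$ in (2.4), we compute
$$[P, x^\delta D_x^\gamma] = \sum_{|\tilde\alpha| + |\tilde\beta| \le m} c_{\tilde\alpha \tilde\beta}\, [x^{\tilde\beta} D_x^{\tilde\alpha}, x^\delta D_x^\gamma],$$
where
$$[x^{\tilde\beta} D_x^{\tilde\alpha}, x^\delta D_x^\gamma] = \sum_{0 \ne \tau \le \tilde\alpha,\ \tau \le \delta} C_{1\tilde\alpha\delta\tau}\, x^{\delta + \tilde\beta - \tau} D_x^{\gamma + \tilde\alpha - \tau} - \sum_{0 \ne \tau \le \tilde\beta,\ \tau \le \gamma} C_{2\tilde\beta\gamma\tau}\, x^{\delta + \tilde\beta - \tau} D_x^{\gamma + \tilde\alpha - \tau},$$
and where $|C_{1\tilde\alpha\delta\tau}|$ and $|C_{2\tilde\beta\gamma\tau}|$ can be estimated by $C_3 (pm)^{|\tau|}$. The constants here and in the sequel do not depend on $p$. Hence,
$$\|[P, x^\delta D_x^\gamma] u\|_{L^2(\mathbb{R}^n)} \le C_4 \sum_{|\tilde\alpha| + |\tilde\beta| \le m} \sum_{\tau} (pm)^{|\tau|} \|x^{\delta + \tilde\beta - \tau} D_x^{\gamma + \tilde\alpha - \tau} u\|_{L^2(\mathbb{R}^n)}, \tag{2.22}$$
where $0 \ne \tau \le \tilde\alpha$, $\tau \le \delta$ or $0 \ne \tau \le \tilde\beta$, $\tau \le \gamma$. Set $s = |\delta + \tilde\beta - \tau| + |\gamma + \tilde\alpha - \tau| = pm + |\tilde\alpha| + |\tilde\beta| - 2|\tau|$. Since $|\tilde\alpha| + |\tilde\beta| \le m$ and $0 < |\tau| \le m$, we have $(p-1)m \le s < (p+1)m$. Note also that $s \le (p+1)m - 2|\tau|$, hence in (2.22) we can estimate $|\tau| \le ((p+1)m - s)/2$. We may then write
$$J_2 \le C_5 \bigl( J_2' + (pm)^{m/2} |u|_{pm} + J_2'' \bigr), \tag{2.23}$$
where
$$J_2' = \sum (pm)^{((p+1)m - s)/2}\, |u|_s ,$$
the sum being taken over $pm$ […]

[…] $A > 0$, $\varepsilon > 0$ an estimate of the form
$$|\partial_z^\alpha u(z)| \le A^{|\alpha|+1} (\alpha!)^{h/(h+1)} e^{-\varepsilon |z|^{h+1}} \tag{3.5}$$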

for $z$ in a conic neighborhood of the real axis in $\mathbb{C}$. Such estimates, with the term $(\alpha!)^{h/(h+1)}$ for the $\alpha$th derivatives, are optimal and, as far as we know, new in the literature. They apply to a number of special functions appearing as solutions of (3.4), see Section 3.4. It is interesting to observe that our global ellipticity condition (3.3) for (3.2) corresponds to a dichotomy between exponential growth and exponential decay for the solutions of (3.4), see Section 3.4 for a more precise description in terms of asymptotic theory. By a rotation in the complex plane, this property transfers to straight lines in the complex plane, provided global ellipticity is preserved.

The estimates (3.5) lead in a natural way to the idea that the appropriate functional framework in which to study holomorphic extensions and decay at infinity simultaneously is given by the Gelfand–Shilov spaces of type $S$ (cf. the classical book of Gelfand and Shilov [29], see also Mityagin [49], Pilipovic [52]). We recall that $f \in S^\mu_\nu(\mathbb{R}^n)$, $\mu > 0$, $\nu > 0$, $\mu + \nu \ge 1$, iff $f \in C^\infty(\mathbb{R}^n)$ and there exist $A > 0$, $\varepsilon > 0$ such that
$$|\partial_x^\alpha f(x)| \le A^{|\alpha|+1} (\alpha!)^\mu e^{-\varepsilon |x|^{1/\nu}} \tag{3.6}$$
for all $x \in \mathbb{R}^n$, $\alpha \in \mathbb{Z}^n_+$ or, equivalently, one can find $C > 0$ such that
$$\sup_{x \in \mathbb{R}^n} |x^\beta \partial_x^\alpha f(x)| \le C^{|\alpha|+|\beta|+1} (\alpha!)^\mu (\beta!)^\nu, \quad \alpha, \beta \in \mathbb{Z}^n_+ . \tag{3.7}$$

The bounds (3.6), (3.7) with $\mu < 1$ guarantee that $f$ extends to $\mathbb{C}^n$ as an entire function with uniform estimates, see [29] for precise statements. So, for example, (3.5) reads $u \in S^{h/(h+1)}_{1/(h+1)}(\mathbb{R})$.

Concerning recent applications of Gelfand–Shilov spaces, we mention that for traveling (i.e., solitary) wave solutions of dispersive and dissipative equations, $S^\mu_\nu$-regularity with index $\mu = 1$, together with exponential decay, i.e., $\nu = 1$, was recently studied by Bona and Li [4], Bondareva and Shubin [5], Biagioni and Gramchev [2], Gramchev [30], Cappiello, Gramchev and Rodino [14].

Let us now go back to the initial model, i.e., the Schrödinger operator (3.1) in $\mathbb{R}^n$. We assume that
$$V(x) = V_0(x) + R(x), \quad x \in \mathbb{R}^n, \tag{3.8}$$

where $V_0(x)$ is a homogeneous elliptic polynomial with complex coefficients of degree $2h$. Generalizing the condition (3.3) of the one-dimensional case, we assume
$$V_0(x) \notin \mathbb{R}_- \cup \{0\}, \quad x \in \mathbb{R}^n \setminus 0, \tag{3.9}$$

while $R(x)$ is a polynomial of degree at most $2h - 1$ (i.e., an anisotropic generalization of the multidimensional harmonic oscillator $-\Delta + |x|^2$ appearing in quantum mechanics). It is known that super-exponential decay estimates of type $\exp(-\varepsilon |x|^{h+1})$, $\varepsilon > 0$, hold also for second-order partial differential equations, under the assumptions (3.8), (3.9). The main interest here comes historically from quantum mechanics, where the exponential decay of eigenfunctions has been intensively studied, see for instance Agmon [1], Hislop and Sigal [38], Rabinovich [56], Buzano [6], and the references quoted therein. We also mention Davies [24], Davies and Simon [25], and the recent work of Rabier [54], Rabier and Stuart [55].

It is natural to discuss the validity of the bound (3.5), i.e., the information that $u \in S^{h/(h+1)}_{1/(h+1)}$, in the $n$-dimensional case. To this end, further generalizing to higher-order linear operators, we first study the $S^\mu_\nu$-regularity of eigenfunctions of anisotropic Shubin type partial differential operators in $\mathbb{R}^n$,
$$P = \sum_{\frac{|\alpha|}{m} + \frac{|\beta|}{k} \le 1} c_{\alpha\beta}\, x^\beta D_x^\alpha, \tag{3.10}$$
where $k$ and $m$ are positive integers. Here we use the standard notation $D_x^\alpha = (-i)^{|\alpha|} \partial_x^\alpha$. We assume that $P$ is anisotropic $(m, k)$-globally elliptic, namely, there exist $C > 0$ and $R > 0$ such that
$$\Bigl| \sum_{\frac{|\alpha|}{m} + \frac{|\beta|}{k} \le 1} c_{\alpha\beta}\, x^\beta \xi^\alpha \Bigr| \ge C \bigl( |x|^{2k} + |\xi|^{2m} \bigr)^{1/2}, \quad |x| + |\xi| \ge R. \tag{3.11}$$

Note that the operator $H$ in (3.1), (3.2) satisfies (3.11) with $m = 2$, $k = 2h$ under the assumptions (3.8), (3.9). Anisotropic global ellipticity in the previous sense implies both local regularity and asymptotic decay of the solutions, namely we have the following basic result (see [3]): $Pu = f \in \mathcal{S}(\mathbb{R}^n)$ for $u \in \mathcal{S}'(\mathbb{R}^n)$ implies that actually $u \in \mathcal{S}(\mathbb{R}^n)$. In this paper we want to improve this result focusing on the regularity of $P$ in the Gelfand–Shilov classes $S^\mu_\nu(\mathbb{R}^n)$. Namely we shall prove the following theorem.

Theorem 3.1. Assume that $P$ in (3.10) is $(m, k)$-globally elliptic, i.e., (3.11) is satisfied. If $u \in \mathcal{S}'(\mathbb{R}^n)$ is a solution of $Pu = f \in S^\mu_\nu(\mathbb{R}^n)$, where
$$\mu \ge \mu_{cr} = \frac{k}{k+m}, \qquad \nu \ge \nu_{cr} = \frac{m}{k+m},$$
then also $u \in S^\mu_\nu(\mathbb{R}^n)$. In particular, $Pu = 0$ for $u \in \mathcal{S}'(\mathbb{R}^n)$ implies that $u \in S^{k/(k+m)}_{m/(k+m)}(\mathbb{R}^n)$.

The proof of Theorem 3.1 will be given in Section 3.2. We refer to Section 3.4 for a simple alternative proof in the one-dimensional case by means of asymptotic theory and for some examples of explicit solutions. From (3.6), cf. [29], one easily deduces the following result in the complex domain, which refers to eigenfunctions of $P$. (If $P$ is $(m, k)$-globally elliptic, then also $P - \lambda$, $\lambda \in \mathbb{C}$, is $(m, k)$-globally elliptic.)

Proposition 3.2. Under the previous assumptions on $P$, if $u \in \mathcal{S}'(\mathbb{R}^n)$ is a solution of $Pu = \lambda u$ for some $\lambda \in \mathbb{C}$, then $u$ extends to an entire function on $\mathbb{C}^n$ and, for suitable constants $\varepsilon > 0$, $\gamma > 0$, and $C > 0$,
$$|\partial_z^\alpha u(z)| \le C^{|\alpha|+1} (\alpha!)^{\mu_{cr}} e^{-\varepsilon |z|^{1/\nu_{cr}}}, \quad z \in \mathbb{C}^n,\ |\operatorname{Im} z| < \gamma |\operatorname{Re} z|,\ \alpha \in \mathbb{Z}^n_+ . \tag{3.12}$$
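For orientation, the critical indices of Theorem 3.1 can be evaluated in the Schrödinger case $m = 2$, $k = 2h$. The two sample values of $h$ below are chosen here only as illustrations and are not singled out in the text.

```latex
% Critical indices \mu_{cr} = k/(k+m), \nu_{cr} = m/(k+m) for m = 2, k = 2h.
\[
\mu_{cr} = \frac{2h}{2h+2} = \frac{h}{h+1}, \qquad
\nu_{cr} = \frac{2}{2h+2} = \frac{1}{h+1}, \qquad
\mu_{cr} + \nu_{cr} = 1 .
\]
% h = 1 (quadratic potential):
\[
\mu_{cr} = \nu_{cr} = \tfrac{1}{2}, \qquad
e^{-\varepsilon |z|^{1/\nu_{cr}}} = e^{-\varepsilon |z|^{2}} .
\]
% h = 2 (quartic potential):
\[
\mu_{cr} = \tfrac{2}{3}, \quad \nu_{cr} = \tfrac{1}{3}, \qquad
e^{-\varepsilon |z|^{1/\nu_{cr}}} = e^{-\varepsilon |z|^{3}} .
\]
```

In both cases $1/\nu_{cr} = h + 1$, which is the exponent appearing in (3.5).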

Notice that for $m = 2$, $k = 2h$, (3.12) gives the estimates (3.5). The proof of Theorem 3.1 will also provide precise bounds on the constant $\varepsilon$ in (3.12), which does not depend on compact perturbations of $P$.

We pass now to semilinear equations. We shall, in addition, require that the spectrum $\sigma(P)$ of $P$ in $L^2(\mathbb{R}^n)$ does not coincide with the whole complex plane. This assumption is not necessary in the linear case as we can see from the proof of Theorem 3.1. Concerning the nonlinear term, we shall allow convolution terms in the nonlinearity $F(u)$, namely we assume that
$$F(u) = \sum_{j, \ell \in \mathbb{Z}_+,\ 2 \le j + \ell \le d} F_{j\ell}\, u^{j*} u^{\ell}, \quad F_{j\ell} \in \mathbb{C}, \tag{3.13}$$
where $u^{0*} = 1$, $u^{1*} = u$, and
$$u^{j*} = \underbrace{u * \cdots * u}_{j \text{ times}}, \quad j \ge 2, \tag{3.14}$$
where $*$ is convolution. Hence, we shall consider the equation
$$Pu = \sum_{\frac{|\alpha|}{m} + \frac{|\beta|}{k} \le 1} c_{\alpha\beta}\, x^\beta D_x^\alpha u = F(u) + f, \tag{3.15}$$
where $f$ is given, $f = 0$ or $f \in S^\mu_\nu(\mathbb{R}^n)$, $\mu \ge \mu_{cr}$, $\nu \ge \nu_{cr}$. In view of the $L^1$–$L^2$ convolution estimates one gets that
$$u \in H^s(\mathbb{R}^n) \cap L^1(\mathbb{R}^n),\ s > n/2 \quad \text{implies} \quad u^{j*} \in H^s(\mathbb{R}^n),\ j \ge 2. \tag{3.16}$$
We point out that $\langle x \rangle^s u \in L^2(\mathbb{R}^n)$, $s > n/2$, yields $u \in L^1(\mathbb{R}^n)$ with
$$\|u\|_{L^1} \le \Bigl( \int \frac{1}{\langle x \rangle^{2s}}\, dx \Bigr)^{1/2} \|\langle x \rangle^s u\|_{L^2},$$
by the Cauchy–Schwarz inequality. We show a refinement of the main result for the semilinear equation in [16] by allowing nonlinear convolution terms.


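Theorem 3.3. Let $P$ of the form (3.10) satisfy (3.11) and assume that $\sigma(P) \ne \mathbb{C}$. Let $F(u)$ be as in (3.13), (3.14) and let $f \in S^\mu_\nu(\mathbb{R}^n)$, $\mu \ge \mu_{cr} = \frac{k}{k+m}$, $\nu \ge \nu_{cr} = \frac{m}{k+m}$. Let $s > n/2$ and suppose that $u \in H^s(\mathbb{R}^n) \cap L^1(\mathbb{R}^n)$ is a solution of (3.15). Then $u \in S^{\mu^*}_{\nu^*}(\mathbb{R}^n)$, where
$$\mu^* = \begin{cases} \max\{1, \mu\} & \text{if } F_{j\ell} \ne 0 \text{ for some } \ell \ge 1, \\ \mu & \text{if } F \text{ has only convolution terms, i.e., } F_{j\ell} = 0,\ \ell \ge 1, \end{cases}$$
$$\nu^* = \begin{cases} \max\{1, \nu\} & \text{if } F_{j\ell} \ne 0 \text{ for some } j \ge 1, \\ \nu & \text{if } F \text{ has only polynomial terms, i.e., } F_{j\ell} = 0,\ j \ge 2. \end{cases}$$
In particular, if $f = 0$ we obtain that any solution $u \in H^s(\mathbb{R}^n) \cap L^1(\mathbb{R}^n)$ of (3.15) belongs to $S^{\mu_{cr}}_{\nu_{cr}}(\mathbb{R}^n)$, i.e., we have for positive constants $C$ and $\varepsilon$
$$|\partial_x^\beta u(x)| \le C^{|\beta|+1} (\beta!)^{\mu_{cr}} e^{-\varepsilon |x|^{1/\nu_{cr}}}, \quad x \in \mathbb{R}^n . \tag{3.17}$$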

The key point in Theorem 3.3 that we want to emphasize is that in the semilinear case we can keep the super-exponential decay of order $1/\nu_{cr}$ in the nonlinear convolution terms; however, in view of (3.17), the extension $u(z)$ to the complex domain is analytic only in a strip $\{z \in \mathbb{C}^n : |\operatorname{Im} z| < T\}$ for some $T > 0$, and not entire in general.

Our method allows us to treat, at least for particular models, more general nonlinear terms than (3.13). Namely, we give a generalization of Theorem 3.3 for Schrödinger operators $H$ defined by (3.1), (3.8), where $V_0(x) > 0$ for $x \in \mathbb{R}^n \setminus 0$ and $R(x)$ is a polynomial of degree at most $2h - 1$ with real coefficients. We shall allow for $H$ a more general nonlinear term of the form
$$F(x, u, \nabla u) = \sum_{2 \le \ell + |\gamma| \le d} F_{\ell,\gamma}(x)\, u^{\ell} (\nabla u)^{\gamma}, \tag{3.18}$$
where the $F_{\ell,\gamma}(x)$ are polynomials in $x$ such that
$$F_{\ell,\gamma}(x) = F_{\ell,\gamma} \in \mathbb{C} \ \text{ if } \gamma \ne 0 \quad \text{and} \quad \deg(F_{\ell,0}(x)) \le h. \tag{3.19}$$
We can deal with the presence of convolution terms as well, but we consider the simpler form in order to avoid heavy technicalities. We will obtain the following result.

Theorem 3.4. Let $H$ be the operator defined by (3.1), (3.8), where $V_0(x) > 0$ for $x \in \mathbb{R}^n \setminus 0$ and $R(x)$ real-valued and let $f \in S^\mu_\nu(\mathbb{R}^n)$ for some $\mu \ge \mu_{cr} = \frac{h}{h+1}$, $\nu \ge \nu_{cr} = \frac{1}{h+1}$. Then, if $u \in H^{s+1}(\mathbb{R}^n)$, $s > n/2$, is a solution of the equation
$$Hu = f + F(x, u, \nabla u), \tag{3.20}$$
with $F$ as in (3.18), (3.19), then $u \in S^{\max\{1,\mu\}}_{\nu}(\mathbb{R}^n)$.


Theorem 3.3 in the particular case $k = m$, i.e., $\mu = \nu$, and Theorem 3.4 in case $V_0(x) = |x|^2$ were already obtained in [11] (see the section on Shubin globally elliptic operators). In conclusion, it is worth returning to the one-dimensional equation (3.4) in the semilinear version
$$-u'' + \bigl( a_0 x^{2h} + a_1 x^{2h-1} + \cdots + a_{2h} \bigr) u = F(x, u, u')$$
under the preceding assumptions on the coefficients $a_j$ and the nonlinearity $F$. We have from Theorem 3.4 that every solution $u \in H^{s+1}(\mathbb{R})$, $s > 1/2$, extends to a holomorphic function $u(z)$ in the strip $\{z \in \mathbb{C} : |\operatorname{Im} z| < T\}$ satisfying there
$$|\partial_z^\alpha u(z)| \le A^{|\alpha|+1} \alpha!\, e^{-\varepsilon |z|^{h+1}}$$
for suitable positive constants $A$, $T$, $\varepsilon$. With regard to (3.5), entire extension is lost in general. We shall test this on a simple example in Section 3.4. The same example exhibits a solution with algebraic growth. This contradicts, in the semilinear case, the dichotomy between exponential growth and exponential decay from the asymptotic theory.

3.1. Preliminaries on anisotropic globally elliptic operators

We illustrate some basic properties of anisotropic globally elliptic operators of the form (3.10) and recall some equivalent formulations of the ellipticity condition (3.11). Moreover, we prove that the Fourier transformation preserves global ellipticity. This property will be crucial in the next sections to derive decay estimates for the solutions of (3.15). Finally, we recall some recent characterization of Gelfand–Shilov spaces $S^\mu_\nu(\mathbb{R}^n)$ that will be instrumental in the proofs of our results in the next subsections.

To place the operator (3.10) in the general theory of anisotropic operators, cf. [3], we recall that the Newton polyhedron of $P$ is defined as the convex hull of the set $A \cup \{(0, 0)\}$, where
$$A = \Bigl\{ (\alpha, \beta) \in \mathbb{Z}^{2n}_+ : \frac{|\alpha|}{m} + \frac{|\beta|}{k} \le 1,\ c_{\alpha\beta} \ne 0 \Bigr\}.$$
We can also define the principal part of $P$ as follows.

Definition 3.5. Let $P$ be defined by (3.10) for some positive integers $k$, $m$. We define the principal symbol $p_{m,k}(x, \xi)$ of $P$ as the function
$$p_{m,k}(x, \xi) = \sum_{\frac{|\alpha|}{m} + \frac{|\beta|}{k} = 1} c_{\alpha\beta}\, x^\beta \xi^\alpha . \tag{3.21}$$
The global ellipticity condition (3.11) can be easily reformulated as follows, cf. [3].

Proposition 3.6. Let $P$ be an operator of the form (3.10). Then (3.11) holds if and only if $p_{m,k}(x, \xi) \ne 0$ for all $(x, \xi) \ne (0, 0)$.

We now describe the action of the Fourier transformation on the operator (3.10).


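Proposition 3.7. Let $P$ be an operator of the form (3.10) and let $u \in \mathcal{S}(\mathbb{R}^n)$. Then $\widehat{Pu} = Q \hat{u}$ where $Q$ is an operator of the form
$$Q = \sum_{\frac{|\rho|}{k} + \frac{|\sigma|}{m} \le 1} a_{\rho\sigma}\, y^\sigma D_y^\rho .$$
Moreover, $P$ is $(m, k)$-globally elliptic if and only if $Q$ is $(k, m)$-globally elliptic, i.e., the following estimate holds true for some positive constants $C'$, $R'$:
$$\Bigl| \sum_{\frac{|\rho|}{k} + \frac{|\sigma|}{m} \le 1} a_{\rho\sigma}\, y^\sigma \eta^\rho \Bigr| \ge C' \bigl( |y|^{2m} + |\eta|^{2k} \bigr)^{1/2} \quad \text{for } |y| + |\eta| \ge R' > 0. \tag{3.22}$$

Proof. Applying the standard properties of the Fourier transform and the Leibniz formula we can compute as follows:
$$\begin{aligned}
\widehat{Pu}(\xi) &= \sum_{\frac{|\alpha|}{m} + \frac{|\beta|}{k} \le 1} c_{\alpha\beta}\, \widehat{(x^\beta D_x^\alpha u)}(\xi) = \sum_{\frac{|\alpha|}{m} + \frac{|\beta|}{k} \le 1} c_{\alpha\beta}\, D_\xi^\beta \bigl( \xi^\alpha \hat{u}(\xi) \bigr) \\
&= \sum_{\frac{|\alpha|}{m} + \frac{|\beta|}{k} \le 1} c_{\alpha\beta} \sum_{\gamma \le \alpha,\ \gamma \le \beta} \binom{\beta}{\gamma} \frac{\alpha!}{(\alpha - \gamma)!}\, \xi^{\alpha - \gamma} D_\xi^{\beta - \gamma} \hat{u}(\xi) \\
&= Q \hat{u}(\xi),
\end{aligned}$$
where
$$Q = \sum_{\frac{|\alpha|}{m} + \frac{|\beta|}{k} \le 1} c_{\alpha\beta} \sum_{\gamma \le \alpha,\ \gamma \le \beta} \binom{\beta}{\gamma} \frac{\alpha!}{(\alpha - \gamma)!}\, y^{\alpha - \gamma} D_y^{\beta - \gamma}, \tag{3.23}$$
and we observe that $\frac{|\alpha - \gamma|}{m} + \frac{|\beta - \gamma|}{k} \le 1$ in (3.23). The first part of the proposition is proved. Moreover, we notice from (3.23) that the principal symbol of $Q$ is given by
$$q_{k,m}(y, \eta) = \sum_{\frac{|\rho|}{k} + \frac{|\sigma|}{m} = 1} c_{\sigma\rho}\, y^\sigma \eta^\rho = p_{m,k}(\eta, y) \quad \text{for all } (y, \eta) \in \mathbb{R}^{2n} .$$
Then we can conclude the proof by applying Proposition 3.6.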
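A concrete one-dimensional instance of Proposition 3.7 may help fix the roles of $m$ and $k$. The operator below is an illustrative choice, and the normalization $\hat{u}(\xi) = \int e^{-ix\xi} u(x)\, dx$ is an assumption made here; with a different convention only signs of odd-order terms change.

```latex
% Assumed example: n = 1, P = D_x^2 + x^4, so m = 2 and k = 4.
% With \hat u(\xi) = \int e^{-i x \xi} u(x)\,dx one has
%   \widehat{D_x u} = \xi \hat u,   \widehat{x u} = -D_\xi \hat u .
\begin{align*}
\widehat{P u}(\xi) &= \xi^{2}\hat u(\xi) + (-D_\xi)^{4}\hat u(\xi)
                    = \bigl(D_\xi^{4} + \xi^{2}\bigr)\hat u(\xi),\\
Q &= D_y^{4} + y^{2},\\
p_{2,4}(x,\xi) &= \xi^{2} + x^{4}, \qquad
q_{4,2}(y,\eta) = \eta^{4} + y^{2} = p_{2,4}(\eta, y).
\end{align*}
% Both symbols vanish only at the origin, so P is (2,4)-globally elliptic
% and Q is (4,2)-globally elliptic, in agreement with Proposition 3.6.
```

The Fourier transform thus exchanges the order of differentiation with the degree of the polynomial coefficient, which is why the indices $\mu$ and $\nu$ swap roles when decay is deduced from regularity.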



To derive our estimates in Gelfand–Shilov classes, in the sequel we shall take advantage of a nice characterization of the space $S^\mu_\nu(\mathbb{R}^n)$ given by Chung, Chung and Kim [22] showing that it is sufficient to check (3.7) for $\alpha = 0$ and, separately, for $\beta = 0$. Moreover, the space $S^\mu_\nu(\mathbb{R}^n)$ is also characterized via the Fourier transform. We recall this result of Proposition 2.2 in detail, since it will be largely used in the next sections.

3.2. Linear estimates

In this subsection, we prove regularity and decay estimates for the solutions of the linear equation $Pu = f$. Although the approach will be essentially the same as for the general equation (3.15), we prefer to treat the linear case separately for two reasons. The first is that for $F = 0$ in (3.15) the results hold under weaker assumptions on $P$ and on the a priori regularity of the solution. The second, more important, reason is that in the linear case we are able to prove a stronger regularity for the solution, as we already claimed in the introduction.

Let us start from the study of the analytic-Gevrey regularity of the solutions. To this end we need to introduce suitable scales of Sobolev norms. Let $\mu \ge \mu_{cr} = \frac{k}{k+m}$. For fixed $\varepsilon > 0$, $s \ge 0$, we define the norm
$$\|u\|_{\{s,\mu;\varepsilon\}} = \sum_{\alpha \in \mathbb{Z}^n_+} \frac{\varepsilon^{|\alpha|}}{|\alpha|^{\mu|\alpha|}} \|\partial_x^\alpha u\|_s$$
and the corresponding partial sum
$$E_N^{s,\mu;\varepsilon}[u] = \sum_{|\alpha| \le N} \frac{\varepsilon^{|\alpha|}}{|\alpha|^{\mu|\alpha|}} \|\partial_x^\alpha u\|_s ,$$
where $\|\cdot\|_s$ denotes the standard norm in the Sobolev space $H^s(\mathbb{R}^n)$. By Stirling's formula and Sobolev embedding estimates it easily follows that if a function $u \in C^\infty(\mathbb{R}^n)$ is such that $\|u\|_{\{s,\mu;\varepsilon\}} < +\infty$ for some $\varepsilon > 0$, $s \ge 0$, then $u$ satisfies the global estimate
$$\sup_{\alpha \in \mathbb{Z}^n_+} C^{-|\alpha|} (\alpha!)^{-\mu} \sup_{x \in \mathbb{R}^n} |\partial_x^\alpha u(x)| < +\infty \tag{3.24}$$

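for some positive constant $C$. Let us now consider the equation $Pu = f$, where $P$ is an operator of the form (3.10) satisfying (3.11). Assume that we can find a $\lambda \in \mathbb{C} \setminus \sigma(P)$. Since also $P - \lambda$ satisfies (3.11), then by the results in [3], the linear operator
$$(P - \lambda)^{-1} \circ x^q \partial_x^p : H^s(\mathbb{R}^n) \to H^s(\mathbb{R}^n) \tag{3.25}$$
is continuous for any $p, q \in \mathbb{Z}^n_+$ with $\frac{|p|}{m} + \frac{|q|}{k} \le 1$ and for every $s \ge 0$. Differentiating and introducing commutators in the equation $Pu = f$, we get, for every $\alpha \in \mathbb{Z}^n_+$, that
$$P(\partial_x^\alpha u) = \partial_x^\alpha f - [\partial_x^\alpha, P] u .$$
Then, for $\lambda \notin \sigma(P)$, we obtain
$$(P - \lambda)(\partial_x^\alpha u) = \partial_x^\alpha f - \lambda \partial_x^\alpha u - [\partial_x^\alpha, P] u . \tag{3.26}$$
For fixed $\varepsilon > 0$, $\mu \ge \mu_{cr}$, we can now multiply both sides of (3.26) by $\frac{\varepsilon^{|\alpha|}}{|\alpha|^{\mu|\alpha|}}$ and invert $P - \lambda$. We get
$$\frac{\varepsilon^{|\alpha|}}{|\alpha|^{\mu|\alpha|}} \partial_x^\alpha u = \frac{\varepsilon^{|\alpha|}}{|\alpha|^{\mu|\alpha|}} (P - \lambda)^{-1} (\partial_x^\alpha f) - \lambda \frac{\varepsilon^{|\alpha|}}{|\alpha|^{\mu|\alpha|}} (P - \lambda)^{-1} (\partial_x^\alpha u) - \frac{\varepsilon^{|\alpha|}}{|\alpha|^{\mu|\alpha|}} (P - \lambda)^{-1} [\partial_x^\alpha, P] u .$$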

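Finally, taking $H^s$-norms and summing up for $|\alpha| \le N$, we obtain
$$\begin{aligned}
E_N^{s,\mu;\varepsilon}[u] \le{} & \sum_{|\alpha| \le N} \frac{\varepsilon^{|\alpha|}}{|\alpha|^{\mu|\alpha|}} \bigl\| (P - \lambda)^{-1} (\partial_x^\alpha f) \bigr\|_s + |\lambda| \sum_{|\alpha| \le N} \frac{\varepsilon^{|\alpha|}}{|\alpha|^{\mu|\alpha|}} \bigl\| (P - \lambda)^{-1} (\partial_x^\alpha u) \bigr\|_s \\
& + \sum_{|\alpha| \le N} \frac{\varepsilon^{|\alpha|}}{|\alpha|^{\mu|\alpha|}} \bigl\| (P - \lambda)^{-1} ([\partial_x^\alpha, P] u) \bigr\|_s .
\end{aligned} \tag{3.27}$$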

We will prove the following result.

Theorem 3.8. Let $P$ in (3.10) satisfy (3.11) and assume that $\sigma(P) \ne \mathbb{C}$. Moreover, let $f \in \mathcal{S}(\mathbb{R}^n)$ such that $\|f\|_{\{0,\mu;\varepsilon'\}} < +\infty$ for some $\mu \ge \mu_{cr}$, $\varepsilon' > 0$. If $u \in \mathcal{S}'(\mathbb{R}^n)$ is a solution of the equation $Pu = f$, then $u \in \mathcal{S}(\mathbb{R}^n)$ and there exists an $\varepsilon \in (0, \varepsilon']$ such that $\|u\|_{\{0,\mu;\varepsilon\}} < +\infty$. In particular, $u$ satisfies (3.24) for some positive constant $C$.

To prove the theorem we need to estimate the three terms in the right-hand side of (3.27) for $s = 0$ uniformly with respect to $N$. The most delicate term is the one containing commutators, which must be written in a suitable form in order to get a sharp critical value for the regularity index $\mu$. To treat it, we need some preliminary steps.

Lemma 3.9. Let $\varrho \in (0, 1)$, $r > 0$ and let $b$ be a positive integer. Then
$$t^{\varrho b} \le r t^b + (1 - \varrho) \Bigl( \frac{\varrho}{r} \Bigr)^{\varrho/(1-\varrho)}, \quad t \ge 0.$$

Proof. Clearly we can assume $b = 1$ setting $t^b = z$. Define $g(z) = z^\varrho - rz$, $z \ge 0$. Since $g'(z) = \varrho z^{\varrho - 1} - r = 0$ iff $z = z_{\varrho,r} = (\varrho/r)^{1/(1-\varrho)}$, we readily obtain that
$$\sup_{z \ge 0} g(z) = g(z_{\varrho,r}) = \Bigl( \frac{\varrho}{r} \Bigr)^{\varrho/(1-\varrho)} - r \Bigl( \frac{\varrho}{r} \Bigr)^{1/(1-\varrho)} = (1 - \varrho) \Bigl( \frac{\varrho}{r} \Bigr)^{\varrho/(1-\varrho)} .$$
The proof is complete. □

Using Lemma 3.9, we can prove a crucial estimate.

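Lemma 3.10. Let $\mu > 0$, $k$, $m$ be positive integers, and $\alpha, \gamma \in \mathbb{Z}^n_+$ such that $\alpha_j \ge 2 \frac{\gamma_j (m+k)}{k} > 0$ for some $j \in \{1, \ldots, n\}$. Then, for every $r > 0$, $\eta \ge 0$, we have
$$\frac{\eta^{\alpha_j - \gamma_j \frac{m+k}{k}}}{|\alpha|^{\mu(\alpha_j - \gamma_j \frac{m+k}{k})}} \le r \frac{\eta^{\alpha_j}}{|\alpha|^{\mu \alpha_j}} + r^{1 - \frac{\alpha_j k}{\gamma_j (m+k)}} .$$

Proof. We can write
$$\frac{\eta^{\alpha_j - \gamma_j \frac{m+k}{k}}}{|\alpha|^{\mu(\alpha_j - \gamma_j \frac{m+k}{k})}} = \Bigl( \frac{\eta}{|\alpha|^\mu} \Bigr)^{\varrho \alpha_j},$$
where $\varrho = 1 - \frac{\gamma_j (m+k)}{\alpha_j k} \in (0, 1)$. With this choice of $\varrho$, we have $1 - \varrho = \frac{\gamma_j (m+k)}{\alpha_j k}$ and $\frac{\varrho}{1 - \varrho} = \frac{\alpha_j k}{\gamma_j (m+k)} - 1$. Then applying Lemma 3.9 with $t = \frac{\eta}{|\alpha|^\mu}$ and $b = \alpha_j$, we obtain that, for any $r > 0$,
$$\begin{aligned}
\frac{\eta^{\alpha_j - \gamma_j \frac{m+k}{k}}}{|\alpha|^{\mu(\alpha_j - \gamma_j \frac{m+k}{k})}}
&\le r \frac{\eta^{\alpha_j}}{|\alpha|^{\mu \alpha_j}} + \frac{\gamma_j (m+k)}{\alpha_j k} \Bigl( 1 - \frac{\gamma_j (m+k)}{\alpha_j k} \Bigr)^{\frac{\alpha_j k}{\gamma_j (m+k)} - 1} r^{-\frac{\alpha_j k}{\gamma_j (m+k)} + 1} \\
&= r \frac{\eta^{\alpha_j}}{|\alpha|^{\mu \alpha_j}} + \frac{\frac{\gamma_j (m+k)}{\alpha_j k}}{1 - \frac{\gamma_j (m+k)}{\alpha_j k}} \Bigl( 1 - \frac{\gamma_j (m+k)}{\alpha_j k} \Bigr)^{\frac{\alpha_j k}{\gamma_j (m+k)}} r^{-\frac{\alpha_j k}{\gamma_j (m+k)} + 1} \\
&\le r \frac{\eta^{\alpha_j}}{|\alpha|^{\mu \alpha_j}} + \sup_{A \ge 2} \frac{1}{A - 1} \Bigl( 1 - \frac{1}{A} \Bigr)^{A} \cdot r^{1 - \frac{\alpha_j k}{\gamma_j (m+k)}} \\
&\le r \frac{\eta^{\alpha_j}}{|\alpha|^{\mu \alpha_j}} + r^{1 - \frac{\alpha_j k}{\gamma_j (m+k)}} .
\end{aligned}$$
The lemma is proved. □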
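Two elementary facts used above can be checked by hand. The sketch below specializes Lemma 3.9 to the exponent $1/2$ with $b = 1$ (a choice made here for illustration), and then verifies the bound on the supremum over $A \ge 2$ used in the last step of the proof of Lemma 3.10.

```latex
% Lemma 3.9 with exponent 1/2 and b = 1: the constant equals
%   (1 - 1/2) * ((1/2)/r)^{(1/2)/(1 - 1/2)} = 1/(4r),
% so the lemma reads
\[
t^{1/2} \le r t + \frac{1}{4r}, \qquad t \ge 0,\ r > 0,
\]
% which is a completed square:
\[
r t - t^{1/2} + \frac{1}{4r}
  = \Bigl( \sqrt{r t} - \frac{1}{2\sqrt{r}} \Bigr)^{2} \ge 0 .
\]
% Bound on the supremum, with A = \alpha_j k / (\gamma_j (m+k)) \ge 2:
\[
\frac{1}{A-1}\Bigl(1 - \frac{1}{A}\Bigr)^{A} \le \frac{1}{A-1} \le 1 .
\]
```

Equality in the first inequality holds at $t = 1/(4r^2)$, which is the critical point of $g$ in the proof of Lemma 3.9 for this exponent.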

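The following result is a straightforward consequence of the Leibniz formula.

Lemma 3.11. Let $\alpha, \rho, \sigma \in \mathbb{Z}^n_+$ and let $k$, $m$ be positive integers. Then the following identity holds:
$$\begin{aligned}
[\partial_x^\alpha, x^\sigma \partial_x^\rho] u &= \sum_{0 \ne \gamma \le \alpha,\ \gamma \le \sigma} \binom{\sigma}{\gamma} \frac{\alpha!}{(\alpha - \gamma)!}\, x^{\sigma - \gamma} (\partial_x^{\alpha + \rho - \gamma} u) \\
&= \sum_{0 \ne \gamma \le \alpha,\ \gamma \le \sigma} \binom{\sigma}{\gamma} \frac{\alpha!}{(\alpha - \gamma)!}\, x^{\sigma - \gamma} \partial_x^\rho \partial^+ (\partial_x^{\alpha - \gamma} \partial^- u),
\end{aligned}$$
where $\partial^\pm = \partial^\pm_{\alpha,\gamma,k,m}$ are the Fourier multipliers defined by the symbols
$$\prod_{1 \le j \le n,\ \alpha_j > 2\gamma_j \frac{k+m}{k}} |\xi_j|^{\pm \gamma_j m/k} . \tag{3.28}$$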
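The first identity of Lemma 3.11 can be tested on the smallest nontrivial case. The one-dimensional choice $\alpha = 2$, $\sigma = 1$ below is an illustration picked here, with $\rho$ left arbitrary and $v$ a shorthand introduced for $\partial_x^\rho u$.

```latex
% One-dimensional check (n = 1): \alpha = 2, \sigma = 1, arbitrary \rho.
% Direct computation with v = \partial_x^{\rho} u:
\begin{align*}
\partial_x^{2}(x v) &= x\,\partial_x^{2} v + 2\,\partial_x v,\\
[\partial_x^{2},\, x\,\partial_x^{\rho}]\,u
  &= \partial_x^{2}(x v) - x\,\partial_x^{2} v
   = 2\,\partial_x^{\rho+1} u .
\end{align*}
% Formula of Lemma 3.11: only \gamma = 1 satisfies
% 0 \ne \gamma \le \alpha and \gamma \le \sigma, which gives
\[
\binom{1}{1}\frac{2!}{(2-1)!}\, x^{1-1}\,\partial_x^{2+\rho-1} u
  = 2\,\partial_x^{\rho+1} u .
\]
```

Both computations agree, and the factor $\alpha!/(\alpha-\gamma)!$ is the source of the powers $\alpha_i^{\gamma_i}$ that have to be absorbed by the weights $|\alpha|^{-\mu|\alpha|}$ in the commutator estimate.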

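To estimate the commutator, we now use the assumption $\mu \ge \mu_{cr}$.

Lemma 3.12. Let $P$ satisfy the assumptions of Theorem 3.8 and assume that $\lambda \in \mathbb{C} \setminus \sigma(P)$. Then, for every $u \in \mathcal{S}(\mathbb{R}^n)$ and for every $s \ge 0$, there exist $C_s > 0$, $\varepsilon > 0$ such that
$$\sum_{2n(k+m) \le |\alpha| \le N} \frac{\varepsilon^{|\alpha|}}{|\alpha|^{\mu|\alpha|}} \bigl\| (P - \lambda)^{-1} [\partial_x^\alpha, P] u \bigr\|_s \le C_s \bigl( r E_N^{s,\mu;\varepsilon}[u] + \|u\|_{s+k+2m} \bigr) \tag{3.29}$$
for every integer $N \ge 2n(k+m)$, for every $r > 0$, and for some $\varepsilon > 0$ independent of $N$.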


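Proof. Let $\alpha \in \mathbb{Z}^n_+$ with $|\alpha| \ge 2n(k+m)$. By Lemma 3.11, we can write
$$\begin{aligned}
(P - \lambda)^{-1} [\partial_x^\alpha, P] u &= \sum_{\frac{|\rho|}{m} + \frac{|\sigma|}{k} \le 1} c_{\rho\sigma} (P - \lambda)^{-1} \bigl( [\partial_x^\alpha, x^\sigma] \partial_x^\rho u \bigr) \\
&= \sum_{\frac{|\rho|}{m} + \frac{|\sigma|}{k} \le 1} c_{\rho\sigma} \sum_{0 \ne \gamma \le \alpha,\ \gamma \le \sigma} \binom{\sigma}{\gamma} \frac{\alpha!}{(\alpha - \gamma)!} (P - \lambda)^{-1} \circ x^{\sigma - \gamma} \partial_x^\rho \partial^+ \bigl( \partial_x^{\alpha - \gamma} \partial^- u \bigr)
\end{aligned} \tag{3.30}$$
with $\partial^\pm$ defined as in (3.28). At this point, observe that the operator $(P - \lambda)^{-1} \circ x^{\sigma - \gamma} \partial_x^\rho \partial^+$ is bounded from $H^s(\mathbb{R}^n)$ into itself for every $s \ge 0$ uniformly with respect to $\alpha$, cf. [3]. Since $|\gamma| \le |\sigma| \le k$ in (3.30), we then obtain
$$\frac{1}{|\alpha|^{\mu|\alpha|}} \bigl\| (P - \lambda)^{-1} [\partial_x^\alpha, P] u \bigr\|_s \le C_s \frac{1}{|\alpha|^{\mu|\alpha|}} \sum_{0 \ne \gamma \le \alpha,\ |\gamma| \le k}\ \prod_{i=1}^{n} \alpha_i^{\gamma_i} \cdot \bigl\| \partial_x^{\alpha - \gamma} \partial^- u \bigr\|_s$$
for some positive constant $C_s$ independent of $\alpha$. Now, since $|\alpha| \ge 2n(k+m)$, we surely have $\alpha_j \ge 2 \frac{k+m}{k} \gamma_j$ for some $j \in \{1, \ldots, n\}$. Moreover, we can write
$$\bigl\| \partial_x^{\alpha - \gamma} \partial^- u \bigr\|_s = \Bigl\| \langle \xi \rangle^s \prod_{1 \le j \le n,\ \alpha_j > 2\gamma_j \frac{k+m}{k}} |\xi_j|^{\alpha_j - \gamma_j \frac{k+m}{k}} \cdot \prod_{1 \le h \le n,\ \alpha_h \le 2\gamma_h \frac{k+m}{k}} |\xi_h|^{\alpha_h - \gamma_h}\, \hat{u} \Bigr\|,$$
where we denote by $\|\cdot\|$ the norm in $L^2(\mathbb{R}^n)$. On the other hand, for every $\mu \ge \mu_{cr} = \frac{k}{k+m}$, we have
$$\begin{aligned}
\frac{\prod_{i=1}^{n} \alpha_i^{\gamma_i}}{|\alpha|^{\mu|\alpha|}}
&\le \Bigl( \prod_{1 \le j \le n,\ \alpha_j > 2\gamma_j \frac{k+m}{k}} \frac{|\alpha|^{\mu(\alpha_j - \gamma_j/\mu_{cr})}}{|\alpha|^{\mu \alpha_j - \gamma_j}} \cdot \frac{1}{|\alpha|^{\mu(\alpha_j - \gamma_j/\mu_{cr})}} \Bigr) \prod_{1 \le h \le n,\ \alpha_h \le 2\gamma_h \frac{k+m}{k}} \frac{\alpha_h^{\gamma_h}}{|\alpha|^{\mu(\alpha_h - \gamma_h)}} \\
&\le C \Bigl( \prod_{1 \le j \le n,\ \alpha_j > 2\gamma_j \frac{k+m}{k}} \frac{1}{|\alpha|^{\mu(\alpha_j - \gamma_j/\mu_{cr})}} \Bigr) \prod_{1 \le h \le n,\ \alpha_h \le 2\gamma_h \frac{k+m}{k}} \frac{1}{|\alpha|^{\mu(\alpha_h - \gamma_h)}} .
\end{aligned}$$
Now, for every $j \in \{1, \ldots, n\}$ such that $\alpha_j > 2\gamma_j \frac{k+m}{k}$ we can apply Lemma 3.10 with $\eta = |\xi_j|$. We obtain that, for every $r \in (0, 1)$,
$$\begin{aligned}
\frac{\prod_{i=1}^{n} \alpha_i^{\gamma_i}}{|\alpha|^{\mu|\alpha|}} \bigl\| \partial_x^{\alpha - \gamma} \partial^- u \bigr\|_s
\le{} & r\, \Bigl\| \langle \xi \rangle^s \prod_{1 \le j \le n,\ \alpha_j > 2\gamma_j \frac{k+m}{k}} \frac{|\xi_j|^{\alpha_j}}{|\alpha|^{\mu \alpha_j}} \cdot \prod_{1 \le h \le n,\ \alpha_h \le 2\gamma_h \frac{k+m}{k}} \frac{|\xi_h|^{\alpha_h - \gamma_h}}{|\alpha|^{\mu(\alpha_h - \gamma_h)}}\, \hat{u} \Bigr\| \\
& + \Bigl\| \langle \xi \rangle^s \prod_{1 \le h \le n,\ \alpha_h \le 2\gamma_h \frac{k+m}{k}} \frac{|\xi_h|^{\alpha_h - \gamma_h}}{|\alpha|^{\mu(\alpha_h - \gamma_h)}}\, \hat{u} \Bigr\| .
\end{aligned}$$
Choosing $\varepsilon < 1$, summing over $|\alpha|$, and observing that
$$\sum_{2n(k+m) \le |\alpha| \le N} \varepsilon^{|\alpha|} \Bigl\| \langle \xi \rangle^s \prod_{1 \le h \le n,\ \alpha_h \le 2\gamma_h \frac{k+m}{k}} \frac{|\xi_h|^{\alpha_h - \gamma_h}}{|\alpha|^{\mu(\alpha_h - \gamma_h)}}\, \hat{u} \Bigr\| \le C_s \|u\|_{s+k+2m} \sum_{2n(k+m) \le |\alpha| \le N} \varepsilon^{|\alpha|} \le C_s' \|u\|_{s+k+2m}$$
for some constant $C_s' > 0$ independent of $N$, we finally deduce estimate (3.29). □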



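Proof of Theorem 3.8. By [3, Corollary 8.1] we already know that $u \in \mathcal{S}(\mathbb{R}^n)$. To prove that $\|u\|_{\{0,\mu;\varepsilon\}} < +\infty$ we start from (3.27) for $s = 0$. Obviously, we have
$$\sum_{|\alpha| \le N} \frac{\varepsilon^{|\alpha|}}{|\alpha|^{\mu|\alpha|}} \bigl\| (P - \lambda)^{-1} (\partial_x^\alpha f) \bigr\| \le C \|f\|_{\{0,\mu;\varepsilon'\}} < +\infty$$
for every $\varepsilon \le \varepsilon'$. Concerning the second term, for every $\alpha \in \mathbb{Z}^n_+$, $\alpha \ne 0$, there exists $j = j_\alpha \in \{1, \ldots, n\}$ such that $\alpha_j > 0$. Writing $(P - \lambda)^{-1} (\partial_x^\alpha u) = (P - \lambda)^{-1} \circ \partial_{x_j} (\partial_x^{\alpha - e_j} u)$, by (3.25) the operator $(P - \lambda)^{-1} \circ \partial_{x_j}$ maps continuously $L^2(\mathbb{R}^n)$ into itself. Then we obtain
$$|\lambda| \sum_{|\alpha| \le N} \frac{\varepsilon^{|\alpha|}}{|\alpha|^{\mu|\alpha|}} \bigl\| (P - \lambda)^{-1} (\partial_x^\alpha u) \bigr\| \le C \bigl( \|u\| + \varepsilon E_{N-1}^{0,\mu;\varepsilon}[u] \bigr).$$
The last term in (3.27) can be estimated by applying Lemma 3.12. Then, choosing $\varepsilon$ sufficiently small, there exists $C > 0$ such that, for every $r \in (0, 1)$, the estimate
$$E_N^{0,\mu;\varepsilon}[u] \le C \Bigl( \|f\|_{\{0,\mu;\varepsilon'\}} + \varepsilon E_{N-1}^{0,\mu;\varepsilon}[u] + r E_N^{0,\mu;\varepsilon}[u] + \sum \|\partial_x^\alpha u\| \Bigr),$$
the last sum being taken over $|\alpha|$ […]

[…] $0$ sufficiently small, we can iterate this estimate to obtain that $\sup_{N \in \mathbb{Z}_+} E_N^{s,\tilde\mu;\varepsilon}[u] < +\infty$. This concludes the proof. □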

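To prove the decay properties for the solutions of (3.15), we can argue as in the previous section. Applying the Fourier transformation to (3.15), we obtain the new equation
$$Q \hat{u} = \hat{f} + \widehat{F(u)}, \tag{3.36}$$
where $Q$ is $(k, m)$-globally elliptic.

Theorem 3.18. Let $P$ satisfy the assumptions of Theorem 3.16 and let $u \in H^s(\mathbb{R}^n)$, $s > n/2$, be a solution of (3.15), where $F$ is of the form (3.13) and $f \in \mathcal{S}(\mathbb{R}^n)$ with $\|f\|_{s,\nu;\delta'} < +\infty$ for some $\nu \ge \nu_{cr}$, $\delta' > 0$. Then there exists a $\delta \in (0, \delta']$ such that $\|u\|_{s,\nu;\delta} < +\infty$.

To prove this theorem, we again need a further result.

Lemma 3.19. Let $Q$ be $(k, m)$-globally elliptic with $\sigma(Q) \ne \mathbb{C}$ and let $u \in \mathcal{S}(\mathbb{R}^n)$. Then, for fixed $\lambda \in \mathbb{C} \setminus \sigma(Q)$, $s > n/2$, $\delta > 0$, $\nu \ge \nu_{cr}$, there exists $C > 0$ such that
$$\sum_{|\alpha| \le N} \frac{\delta^{|\alpha|}}{|\alpha|^{\nu|\alpha|}} \bigl\| (Q - \lambda)^{-1} (\partial_\xi^\alpha \widehat{u^{\ell}}) \bigr\|_s \le C \bigl( \|\hat{u}\|_s^{\ell} + \delta \|u\|_s^{\ell - 1} \cdot E_{N-1}^{s,\nu;\delta}[\hat{u}] \bigr)$$
for every $N \in \mathbb{Z}_+$.

Proof. If $\alpha_j \ne 0$ for some $j \in \{1, \ldots, n\}$, then we have that $(Q - \lambda)^{-1} (\partial_\xi^\alpha (\widehat{u^{\ell}})) = (Q - \lambda)^{-1} (\partial_{\xi_j} \partial_\xi^{\alpha - e_j} (\widehat{u^{\ell}}))$. Moreover, since the linear operator $(Q - \lambda)^{-1} \circ \partial_{\xi_j}$ is continuous from $H^s(\mathbb{R}^n)$ to $H^s(\mathbb{R}^n)$, we obtain
$$\sum_{0 \ne |\alpha| \le N} \frac{\varepsilon^{|\alpha|}}{|\alpha|^{\nu|\alpha|}} \bigl\| (Q - \lambda)^{-1} (\partial_\xi^\alpha (\widehat{u^{\ell}})) \bigr\|_s \le C_s\, \varepsilon \sum_{0 \ne |\alpha| \le N} \frac{\varepsilon^{|\alpha| - 1}}{|\alpha|^{\nu|\alpha|}} \bigl\| \partial_\xi^{\alpha - e_j} \widehat{u^{\ell}} \bigr\|_s .$$
Now, using standard properties of the Fourier transform and Sobolev embedding estimates, we obtain
$$\begin{aligned}
\bigl\| \partial_\xi^{\alpha - e_j} \widehat{u^{\ell}} \bigr\|_s &= \bigl\| (\partial_\xi^{\alpha - e_j} \hat{u}) * \widehat{u^{\ell - 1}} \bigr\|_s \\
&= \Bigl( \int_{\mathbb{R}^n} \langle \eta \rangle^{2s} \bigl| \mathcal{F}^{-1}_{\xi \to \eta} \bigl( \partial_\xi^{\alpha - e_j} \hat{u} * \widehat{u^{\ell - 1}} \bigr)(\eta) \bigr|^2 d\eta \Bigr)^{1/2} \\
&= \Bigl( \int_{\mathbb{R}^n} \langle \eta \rangle^{2s} \bigl| \mathcal{F}^{-1}_{\xi \to \eta} \bigl( \partial_\xi^{\alpha - e_j} \hat{u} \bigr)(\eta) \cdot u^{\ell - 1}(\eta) \bigr|^2 d\eta \Bigr)^{1/2} \\
&\le C_s \|u\|_s^{\ell - 1} \bigl\| \partial_\xi^{\alpha - e_j} \hat{u} \bigr\|_s .
\end{aligned}$$
The lemma is proved. □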

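Proof of Theorem 3.18. First of all, by Lemma 3.15, it follows that $u \in \mathcal{S}(\mathbb{R}^n)$. As in the proof of Theorem 3.13, it is sufficient to show that there exists $\delta > 0$ such that $\|\hat{u}\|_{\{s,\nu;\delta\}} < \infty$. Starting from (3.36) and taking $\lambda \in \mathbb{C} \setminus \sigma(Q)$, we get, for every $\alpha \in \mathbb{Z}^n_+$,
$$\partial_\xi^\alpha \hat{u} = (Q - \lambda)^{-1} (\partial_\xi^\alpha \hat{f}) - \lambda (Q - \lambda)^{-1} (\partial_\xi^\alpha \hat{u}) - (Q - \lambda)^{-1} [Q, \partial_\xi^\alpha] \hat{u} + (Q - \lambda)^{-1} (\partial_\xi^\alpha \widehat{u^{\ell}}).$$
We can now apply Lemmas 3.12 and 3.19. We obtain that there exists $C > 0$ independent of $N$ such that the estimate
$$E_N^{s,\nu;\delta}[\hat{u}] \le \frac{C}{1 - rC} \Bigl( \|\hat{f}\|_{\{s,\nu;\delta'\}} + \delta E_{N-1}^{s,\nu;\delta}[\hat{u}] + \delta \|u\|_s^{\ell - 1} E_{N-1}^{s,\nu;\delta}[\hat{u}] + \sum \|\partial_\xi^\alpha \hat{u}\|_s \Bigr),$$
the last sum being taken over $|\alpha|$ […]

[…] $0$ smaller, we obtain that $\|\hat{u}\|_{\{s,\nu;\delta\}} < +\infty$. We leave the details to the reader. □

Arguing as in the previous section, the proof of Theorem 3.3 is a direct consequence of Theorems 3.16 and 3.18 combined with Proposition 2.2.

We conclude this section giving the proof of Theorem 3.4. As for equation (3.15), we prove separately regularity and decay estimates, but for the linear part of the equation the estimates are the same as proven before. To conclude we only need to give estimates on the new nonlinear term coming from (3.18), (3.19). That is what we do in the next lemmas.

Lemma 3.20. Let $H$ be as in Theorem 3.4 and let $\lambda \in \mathbb{C} \setminus \sigma(H)$. Then, for every $\mu \ge 1$, $s > n/2$, $\varepsilon \in (0, 1)$, $\ell, N \in \mathbb{Z}_+$, $\ell \ge 2$, $q, \gamma \in \mathbb{Z}^n_+$, $|q| \le h$, and for every $u \in H^{s+1}(\mathbb{R}^n)$ there exist positive constants $C_s$, $C_s'$ such that the following estimates hold:
$$\sum_{|\alpha| \le N} \frac{\varepsilon^{|\alpha|}}{|\alpha|^{\mu|\alpha|}} \bigl\| (H - \lambda)^{-1} (\partial_x^\alpha (x^q u^{\ell})) \bigr\|_s \le C_s \bigl( \|u\|_s^{\ell} + \varepsilon (E_{N-1}^{s,\mu;\varepsilon}[u])^{\ell} \bigr), \tag{3.37}$$
$$\sum_{|\alpha| \le N} \frac{\varepsilon^{|\alpha|}}{|\alpha|^{\mu|\alpha|}} \bigl\| (H - \lambda)^{-1} (\partial_x^\alpha (u^{\ell} (\nabla u)^\gamma)) \bigr\|_s \le C_s' \bigl( \|u\|_{s+1}^{\ell + |\gamma|} + \varepsilon (E_{N-1}^{s,\mu;\varepsilon}[u])^{\ell + |\gamma|} \bigr). \tag{3.38}$$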

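Proof. We start by proving (3.37). For fixed $\alpha \ne 0$, let $j = j_\alpha \in \{1, \ldots, n\}$ such that $\alpha_j > 0$. We have
$$\partial_x^\alpha (x^q u^{\ell}) = x^q \partial_{x_j} \partial_x^{\alpha - e_j} (u^{\ell}) + \sum_{\alpha' \le \alpha,\ 0 \ne \alpha' \le q} \binom{\alpha}{\alpha'} \frac{q!}{(q - \alpha')!}\, x^{q - \alpha'} \partial_x^{\alpha - \alpha'} (u^{\ell}).$$
Observe that (3.25) with $m = 2$, $k = 2h$ implies that the operators $(H - \lambda)^{-1} \circ x^q \partial_{x_j}$ and $(H - \lambda)^{-1} \circ x^{q - \alpha'}$ are bounded from $H^s(\mathbb{R}^n)$ to $H^s(\mathbb{R}^n)$ since $|q| \le h$. Then, arguing as in the proof of Lemma 3.17, we easily obtain (3.37).

As for (3.38), for $|\alpha| \ge 2$, we can write $(H - \lambda)^{-1} \circ \partial_x^\alpha = (H - \lambda)^{-1} \circ \partial_{x_i} \partial_{x_j} \circ \partial_x^{\alpha - e_i - e_j}$ for some $i, j \in \{1, \ldots, n\}$ and apply (3.25). Then
$$\begin{aligned}
\sum_{2 \le |\alpha| \le N} \frac{\varepsilon^{|\alpha|}}{|\alpha|^{\mu|\alpha|}} \bigl\| (H - \lambda)^{-1} (\partial_x^\alpha (u^{\ell} (\nabla u)^\gamma)) \bigr\|_s
&\le C_s \sum_{2 \le |\alpha| \le N} \frac{\varepsilon^{|\alpha|}}{|\alpha|^{\mu|\alpha|}} \bigl\| \partial_x^{\alpha - e_i - e_j} (u^{\ell} (\nabla u)^\gamma) \bigr\|_s \\
&\le C_s'\, \varepsilon\, (E_{N-2}^{s,\mu;\varepsilon}[u])^{\ell} (E_{N-1}^{s,\mu;\varepsilon}[u])^{|\gamma|} \\
&\le C_s'\, \varepsilon\, (E_{N-1}^{s,\mu;\varepsilon}[u])^{\ell + |\gamma|} .
\end{aligned}$$
We then obtain (3.38). □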
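The boundedness claims in the proof above all come from the condition $\frac{|p|}{m} + \frac{|q|}{k} \le 1$ in (3.25). The following bookkeeping, written out here as a sketch for $m = 2$, $k = 2h$, shows why the restriction $|q| \le h$ is exactly what is needed when one derivative is kept next to $x^q$.

```latex
% Condition in (3.25): |p|/m + |q|/k \le 1, here with m = 2, k = 2h.
\begin{align*}
x^{q}\,\partial_{x_j},\ |q| \le h:
  &\quad \frac{1}{2} + \frac{|q|}{2h} \le \frac{1}{2} + \frac{1}{2} = 1,\\
x^{q-\alpha'},\ |q - \alpha'| \le h:
  &\quad 0 + \frac{|q-\alpha'|}{2h} \le \frac{1}{2},\\
\partial_{x_i}\partial_{x_j}:
  &\quad \frac{2}{2} + 0 = 1 .
\end{align*}
```

A polynomial coefficient of degree larger than $h$ multiplying a first-order derivative would violate the first line, which is consistent with the degree bound in (3.19).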

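Repeating the steps of the proof of Theorem 3.16 with the aid of Lemma 3.20, we can easily prove that, if $f \in \mathcal{S}(\mathbb{R}^n)$ with $\|f\|_{\{s,\mu;\varepsilon'\}} < +\infty$ for some $\mu \ge \mu_{cr}$, $s > n/2$, $\varepsilon' > 0$, and $u \in H^{s+1}(\mathbb{R}^n)$ is a solution of (3.20), then $\|u\|_{\{s,\tilde\mu;\varepsilon\}} < +\infty$ for some $\varepsilon \in (0, \varepsilon']$, where $\tilde\mu = \max\{1, \mu\}$.

To prove decay estimates for (3.20), we apply the Fourier transform to both sides of (3.20). We obtain the new equation
$$\widehat{H} \hat{u} = \hat{f} + \widehat{F(x, u, \nabla u)}, \tag{3.39}$$
where $\widehat{H} = Q(D) + |\xi|^2$, $Q(D)$ being an elliptic operator with constant coefficients of order $2h$. To prove regularity estimates for $\hat{u}$, we need the following lemma.

Lemma 3.21. Let $\widehat{H}$ be the operator defined by (3.39) and let $\lambda \in \mathbb{C} \setminus \sigma(\widehat{H})$. Then, for every $\nu \ge \nu_{cr}$, $s > n/2$, $\varepsilon \in (0, 1)$, $\ell, N \in \mathbb{Z}_+$, $q, \gamma \in \mathbb{Z}^n_+$ with $|q| \le h$ and $\ell + |\gamma| \ge 2$, and for every $u \in H^{s+1}(\mathbb{R}^n)$ there exist positive constants $C_s$, $C_s'$ such that the following estimates hold:
$$\sum_{|\alpha| \le N} \frac{\varepsilon^{|\alpha|}}{|\alpha|^{\nu|\alpha|}} \bigl\| (\widehat{H} - \lambda)^{-1} (\partial_\xi^\alpha (\widehat{x^q u^{\ell}})) \bigr\|_s \le C_s \bigl( \|\hat{u}\|_s^{\ell} + \varepsilon \|u\|_s^{\ell - 1} \cdot E_{N-1}^{s,\nu;\varepsilon}[\hat{u}] \bigr), \tag{3.40}$$
$$\sum_{|\alpha| \le N} \frac{\varepsilon^{|\alpha|}}{|\alpha|^{\nu|\alpha|}} \bigl\| (\widehat{H} - \lambda)^{-1} (\partial_\xi^\alpha (\widehat{u^{\ell} (\nabla u)^\gamma})) \bigr\|_s \le C_s' \bigl( \|\hat{u}\|_{s+1}^{\ell + |\gamma|} + \varepsilon \|u\|_{s+1}^{\ell + |\gamma| - 1} E_{N-1}^{s,\nu;\varepsilon}[\hat{u}] \bigr). \tag{3.41}$$

Proof. The proof of (3.40) is immediate. In fact, for every $\alpha \in \mathbb{Z}^n_+$, $\alpha \ne 0$, we have
$$\bigl\| (\widehat{H} - \lambda)^{-1} (\partial_\xi^\alpha (\widehat{x^q u^{\ell}})) \bigr\|_s = \bigl\| (\widehat{H} - \lambda)^{-1} (\partial_\xi^{q + e_j} \partial_\xi^{\alpha - e_j} \widehat{u^{\ell}}) \bigr\|_s \le C_s \bigl\| \partial_\xi^{\alpha - e_j} \widehat{u^{\ell}} \bigr\|_s$$
and then we conclude as in the proof of Lemma 3.19.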

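To prove (3.41), we observe that, if $\ell \ge 1$, we have, for every $\alpha \in \mathbb{Z}^n_+$, $\alpha \ne 0$,
$$\begin{aligned}
\bigl\| (\widehat{H} - \lambda)^{-1} (\partial_\xi^\alpha (\widehat{u^{\ell} (\nabla u)^\gamma})) \bigr\|_s
&\le C_s \bigl\| \partial_\xi^{\alpha - e_j} \bigl( \hat{u} * \widehat{(u^{\ell - 1} (\nabla u)^\gamma)} \bigr) \bigr\|_s \\
&\le C_s \bigl\| \partial_\xi^{\alpha - e_j} \hat{u} \bigr\|_s \bigl\| u^{\ell - 1} (\nabla u)^\gamma \bigr\|_s \\
&\le C_s' \bigl\| \partial_\xi^{\alpha - e_j} \hat{u} \bigr\|_s \|u\|_{s+1}^{\ell + |\gamma| - 1} .
\end{aligned}$$
For $\ell = 0$, $|\gamma| \ge 2$, we can argue similarly. We leave the details to the reader. □

With the aid of Lemma 3.21 and arguing as in the proof of Theorem 3.18, we obtain that if $f \in \mathcal{S}(\mathbb{R}^n)$ is such that $\|f\|_{s,\nu;\delta'} < +\infty$ for some $s > n/2$, $\nu \ge \nu_{cr}$, $\delta' > 0$, and $u \in H^{s+1}(\mathbb{R}^n)$ is a solution of (3.20), then there exists $\delta \in (0, \delta']$ such that $\|u\|_{s,\nu;\delta} < +\infty$. We conclude by observing that, under the assumptions of Theorem 3.4, we have both $\|u\|_{\{s,\tilde\mu;\varepsilon\}} < +\infty$ and $\|u\|_{s,\nu;\delta} < +\infty$ for some positive $\varepsilon$, $\delta$, and $s > n/2$. Combining these two estimates we easily obtain the proof of Theorem 3.4.

Remark 3.22. We observe that our method can be easily adapted to a larger class of operators satisfying more general anisotropic estimates. Namely, for fixed multi-indices $k = (k_1, \ldots, k_n)$, $m = (m_1, \ldots, m_n)$, where $k_j > 0$, $m_j > 0$ for all $j = 1, \ldots, n$, we can consider an operator of the form
$$P = \sum_{(\alpha, \beta) \in A} c_{\alpha\beta}\, x^\beta D_x^\alpha, \quad c_{\alpha\beta} \in \mathbb{C}, \tag{3.42}$$
where
$$A = \Bigl\{ (\alpha, \beta) \in \mathbb{Z}^{2n}_+ : \frac{\alpha_1}{m_1} + \cdots + \frac{\alpha_n}{m_n} + \frac{\beta_1}{k_1} + \cdots + \frac{\beta_n}{k_n} \le 1 \Bigr\}.$$
The principal symbol $p_{m,k}(x, \xi)$ of $P$ is defined by
$$p_{m,k}(x, \xi) = \sum_{(\alpha, \beta) \in \widetilde{A}} c_{\alpha\beta}\, x^\beta \xi^\alpha,$$
where
$$\widetilde{A} = \Bigl\{ (\alpha, \beta) \in \mathbb{Z}^{2n}_+ : \frac{\alpha_1}{m_1} + \cdots + \frac{\alpha_n}{m_n} + \frac{\beta_1}{k_1} + \cdots + \frac{\beta_n}{k_n} = 1 \Bigr\}.$$
The operator $P$ in (3.42) is said to be $(m, k)$-globally elliptic if
$$\Bigl| \sum_{(\alpha, \beta) \in A} c_{\alpha\beta}\, x^\beta \xi^\alpha \Bigr| \ge C \sum_{j=1}^{n} \bigl( |x_j|^{k_j} + |\xi_j|^{m_j} \bigr) \quad \text{for } |x| + |\xi| \ge R \tag{3.43}$$
for some positive constants $C$, $R$ or, equivalently, if $p_{m,k}(x, \xi) \ne 0$ for all $(x, \xi) \ne (0, 0)$. For this class it is natural to prove estimates in general Gelfand–Shilov classes describing the regularity and decay properties with respect to each variable separately. We recall here the definition and refer the reader to [29] for a detailed presentation of these spaces.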


T. Gramchev

Definition 3.23. Let $\mu = (\mu_1, \dots, \mu_n)$, $\nu = (\nu_1, \dots, \nu_n) \in \mathbb{R}^n$, where $\mu_j > 0$, $\nu_j > 0$ for all $j = 1, \dots, n$. We denote by $S^{\mu}_{\nu}(\mathbb{R}^n)$ the space of all functions $u \in C^{\infty}(\mathbb{R}^n)$ such that
$$\sup_{x\in\mathbb{R}^n} |x^{\beta}\partial_x^{\alpha} u(x)| \le A^{|\alpha|+|\beta|+1}\, \alpha_1^{\alpha_1\mu_1} \cdots \alpha_n^{\alpha_n\mu_n}\, \beta_1^{\beta_1\nu_1} \cdots \beta_n^{\beta_n\nu_n}$$
for some constant $A > 0$.

We notice that Proposition 2.2 has an obvious extension to this class, cf. [22]. The assertion of Theorem 3.1 can be reformulated in this new framework as follows: if $P$ is an operator of the form (3.42) satisfying (3.43) and $f \in S^{\mu}_{\nu}(\mathbb{R}^n)$ with $\mu_j \ge \frac{k_j}{k_j+m_j}$, $\nu_j \ge \frac{m_j}{k_j+m_j}$ for any $j = 1, \dots, n$, then every solution $u \in \mathcal{S}'(\mathbb{R}^n)$ of the equation $Pu = f$ actually belongs to $S^{\mu}_{\nu}(\mathbb{R}^n)$. Similarly, for the semilinear equation $Pu = f + F(u)$ with $F(u)$ as in (3.13), starting from a solution $u \in H^{s}(\mathbb{R}^n)$, we can prove that $u \in S^{\tilde\mu}_{\nu}(\mathbb{R}^n)$, where $\tilde\mu_j = \max\{1, \mu_j\}$ for every $j = 1, \dots, n$. We leave the details to the reader.

3.4. The one-dimensional case: examples

First focusing attention on linear operators, we consider $P$ as in (3.10) and $p_{m,k}(x,\xi)$ as in (3.21):
$$P = \sum_{\frac{\alpha}{m}+\frac{\beta}{k}\le 1} c_{\alpha\beta}\, x^{\beta} D_x^{\alpha}, \tag{3.44}$$
$$p_{m,k}(x,\xi) = \sum_{\frac{\alpha}{m}+\frac{\beta}{k}= 1} c_{\alpha\beta}\, x^{\beta}\xi^{\alpha}, \tag{3.45}$$

where now $x \in \mathbb{R}$, $\xi \in \mathbb{R}$; we recall that $D_x = -i\,\frac{d}{dx}$. Assume that $P$ is $(m,k)$-globally elliptic, i.e., in view of Proposition 3.6,
$$p_{m,k}(x,\xi) \neq 0 \quad \text{for all } (x,\xi) \neq (0,0).$$
Consider then the algebraic equations
$$p_{m,k}(\pm 1, \lambda) = \sum_{\frac{\alpha}{m}+\frac{\beta}{k}= 1} c_{\alpha\beta}\, (\pm 1)^{\beta}\lambda^{\alpha}, \qquad \lambda \in \mathbb{C}. \tag{3.46}$$
In view of (3.46), the order of these equations is $m$ and all the roots $\lambda_1^{\pm}, \dots, \lambda_m^{\pm}$, counted with multiplicity, satisfy the condition $\operatorname{Im}\lambda_j^{\pm} \neq 0$. We may apply to $P$ the results of the asymptotic theory [31], [64], [70]; the following rough statements will be sufficient for our purposes.

Proposition 3.24. There exist two fundamental systems $u_1^{+}, \dots, u_m^{+}$ and $u_1^{-}, \dots, u_m^{-}$ of solutions of $Pu = 0$, of the form
$$u_j^{\pm}(x) = \exp\big( i\lambda_j^{\pm}\, \nu\, |x|^{1/\nu} \big)\, v_j^{\pm}(x), \qquad j = 1, \dots, m, \tag{3.47}$$
where $\nu = m/(k+m)$ and
$$|v_j^{\pm}(x)| \le C \exp(\delta |x|^{\sigma}), \qquad x \in \mathbb{R}_{\pm}, \tag{3.48}$$


for some $\sigma < 1/\nu$ and positive constants $C$ and $\delta$ (in the case of a multiple root $\lambda_j^{\pm}$, any linear combination of the corresponding independent solutions $u_j^{\pm}$ also satisfies (3.47), (3.48)).

We begin by giving a cheap proof of Theorem 3.1 in the case of a homogeneous ordinary differential equation.

Proposition 3.25. Let $P$ be defined as in (3.44), (3.45), (3.46). Assume that $Pu = 0$, $u \in \mathcal{S}'(\mathbb{R})$. Then $u \in S^{\mu}_{\nu}(\mathbb{R})$, where $\mu = k/(k+m)$, $\nu = m/(k+m)$.

Proof. Since (3.46) implies $\operatorname{Im}\lambda_j^{\pm} \neq 0$ in (3.47) for all $j = 1, \dots, m$, all solutions $u_j^{\pm}$ in Proposition 3.24 have exponential growth if $\operatorname{Im}\lambda_j^{\pm} < 0$ or exponential decay if $\operatorname{Im}\lambda_j^{\pm} > 0$ in $\mathbb{R}_{\pm}$. On the other hand we know that a solution $u \in \mathcal{S}'(\mathbb{R})$ of $Pu = 0$ belongs to $\mathcal{S}(\mathbb{R})$, hence $u \in \mathcal{S}(\mathbb{R}_+)$ and $u \in \mathcal{S}(\mathbb{R}_-)$. This implies that $u$ is a linear combination of the $u_j^{+}$ which have exponential decay in $\mathbb{R}_+$ and simultaneously a linear combination of the $u_j^{-}$ with exponential decay in $\mathbb{R}_-$. Note in particular that, if $\operatorname{Im}\lambda_j^{+} < 0$ for all $j = 1, \dots, m$ or $\operatorname{Im}\lambda_j^{-} < 0$ for all $j = 1, \dots, m$, then no non-trivial solutions $u \in \mathcal{S}'(\mathbb{R})$ can exist. Otherwise, from (3.47), (3.48) we have
$$|u(x)| \le C e^{-\delta |x|^{1/\nu}}, \qquad x \in \mathbb{R}, \tag{3.49}$$
for any constant $\delta$ satisfying
$$0 < \delta < \min\{ \nu \operatorname{Im}\lambda_j^{\pm} : \operatorname{Im}\lambda_j^{\pm} > 0 \}$$
and a suitable constant $C$ depending on $\delta$. We now use Proposition 3.7, namely, for every solution $u \in \mathcal{S}'(\mathbb{R})$ of $Pu = 0$, we may write
$$\widehat{Pu} = Q\hat{u} = 0,$$
where $Q$ is now $(k,m)$-globally elliptic. We then apply the preceding arguments to the ordinary differential equation $Q\hat{u} = 0$, exchanging the role of $k$ and $m$. We deduce that
$$|\hat{u}(\xi)| \le C' e^{-\delta' |\xi|^{1/\mu}}, \qquad \xi \in \mathbb{R}, \tag{3.50}$$
for some $C' > 0$, $\delta' > 0$. According to [22], we may read (3.49) and (3.50) as
$$\sup_{x\in\mathbb{R}} |x^{\beta} u(x)| \le C_1 A_1^{|\beta|} (\beta!)^{\nu}, \qquad \sup_{\xi\in\mathbb{R}} |\xi^{\alpha} \hat{u}(\xi)| \le C_1 B_1^{|\alpha|} (\alpha!)^{\mu}$$
for all $\alpha, \beta \in \mathbb{Z}_+$ and suitable positive constants $A_1$, $B_1$, $C_1$ independent of $\alpha$, $\beta$. In view of iii) of Proposition 2.2, these estimates give the conclusion $u \in S^{\mu}_{\nu}(\mathbb{R})$.

As an obvious byproduct of Proposition 3.25, we may recapture, in a special case, the celebrated non-triviality theorem of Gelfand and Shilov, cf. [29].

Proposition 3.26. Let $\mu > 0$, $\nu > 0$, $\mu + \nu = 1$. Assume that $\mu, \nu \in \mathbb{Q}$. Then $S^{\mu}_{\nu}(\mathbb{R}) \neq \{0\}$, i.e., there exists a non-trivial function $u \in S^{\mu}_{\nu}(\mathbb{R})$.


Proof. Consider the basic example of a $(2p, 2h)$-globally elliptic operator in $\mathbb{R}$,
$$P = D_x^{2p} + x^{2h}.$$
The spectrum of $P$ is discrete, with eigenvalues $\lambda_j \to +\infty$ and the eigenfunctions $\varphi_j$, $j = 1, 2, \dots$, forming a complete orthogonal system in $L^2(\mathbb{R})$, see [3]. Since also $P - \lambda_j$ is $(2p, 2h)$-globally elliptic, from Proposition 3.25 we have $\varphi_j \in S^{h/(h+p)}_{p/(h+p)}(\mathbb{R})$. It remains then to observe that, for any given $\mu \in \mathbb{Q}$, $0 < \mu < 1$, we may write $\mu = h/(h+p)$ for two positive integers $h$ and $p$ and, consequently, $\nu = 1 - \mu = p/(h+p)$. Hence, we have $\varphi_j \in S^{\mu}_{\nu}(\mathbb{R})$.

To see more explicit examples of functions in $S^{\mu}_{\nu}(\mathbb{R})$, we may address similar ordinary differential operators with polynomial coefficients. In particular, we recall, cf. [44], [64], that the $(2, 2h)$-globally elliptic equation
$$(D_x^2 + x^{2h} - \rho x^{h-1})u = 0 \tag{3.51}$$
admits non-trivial solutions in $L^2(\mathbb{R})$, hence, in $S^{h/(h+1)}_{1/(h+1)}(\mathbb{R})$, for special values of the parameter $\rho$, namely:

• When $h$ is even, for $\rho = 2(h+1)N + h + 1$, $N \in \mathbb{Z}$, the solution in $\mathbb{R}_+$ is given by
$$u(x,\rho) = \exp\big( -x^{h+1}/(h+1) \big)\, \Psi\Big( \frac{\rho+h}{2(h+1)}, \frac{h}{h+1}; \frac{2x^{h+1}}{h+1} \Big), \tag{3.52}$$
whose analytic extension coincides in $\mathbb{R}_-$ with $u(-x, -\rho)$. To be definite, we recall the definition of the Tricomi function $\Psi$, cf. [69]:
$$\Psi(a, c; x) = \frac{\Gamma(1-c)}{\Gamma(a-c+1)}\, \Phi(a, c; x) + \frac{\Gamma(c-1)}{\Gamma(a)}\, x^{1-c}\, \Phi(a-c+1, 2-c; x),$$
where the principal branch of $x^{1-c}$ is chosen and $\Phi$ is the confluent hypergeometric function,
$$\Phi(a, c; x) = \sum_{n=0}^{\infty} \frac{(a)_n}{(c)_n} \frac{x^n}{n!};$$
here, as standard, for $r \in \mathbb{R}$, $(r)_0 = 1$, $(r)_n = r(r+1)\cdots(r+n-1)$, $n \ge 1$. We have
$$\Psi(a, c; x) \sim x^{-a} \sum_{n=0}^{\infty} \binom{c-a-1}{n} \frac{(a)_n}{x^n} \quad \text{as } x \to +\infty, \tag{3.53}$$
which gives the expected exponential decay in (3.52).

• When $h$ is odd, for $\rho = -2(h+1)N - h$ or $\rho = -2(h+1)N - h - 2$, $N \in \mathbb{Z}$, the solution of (3.51) in $L^2(\mathbb{R})$ is of the form
$$u(x,\rho) = \exp\big( -x^{h+1}/(h+1) \big)\, P_{\rho}(x), \tag{3.54}$$
where $P_{\rho}(x)$ is a polynomial, cf. [66].


From the expression (3.54) we may directly recognize that $u \in S^{h/(h+1)}_{1/(h+1)}(\mathbb{R})$. It is natural to question whether solutions of exponential-polynomial type occur for other $(p, ph)$-globally elliptic equations, when $h$ is odd. For a detailed analysis of such solutions we refer to [44, Section 7.4]. As an example in the opposite direction: the $(3, 3h)$-globally elliptic equation
$$(D - ix^{h})(D + ix^{h})^2 u + \sigma x^{h-2} u = 0,$$
with $h$ odd, $h \ge 3$, admits for some $\sigma \in \mathbb{C}$ solutions $u \in S^{h/(h+1)}_{1/(h+1)}(\mathbb{R})$ which are not of type (3.54), see [44, Section 7.3] for their explicit expression in terms of Meijer's G-functions.

We pass now to consider nonlinear ordinary differential equations. We want to test the sharpness of Theorems 3.3 and 3.4 on a one-dimensional model. Generalizing the arguments in [11] we consider the equation
$$-u'' + x^{2h} u - h x^{h-1} u = x^{h} u^{\ell} - \ell\, u' u^{\ell-1}, \qquad x \in \mathbb{R}, \tag{3.55}$$
where $h, \ell \in \mathbb{Z}_+$, $\ell > 1$, $h > 1$, $h$ is odd. We notice that (3.55) corresponds to the equation (3.20) for $n = 1$, $V(x) = x^{2h} - h x^{h-1}$, and $F(x, u, u') = x^{h} u^{\ell} - \ell\, u' u^{\ell-1}$, $f = 0$. First of all we observe that (3.55) can be rewritten as follows:
$$\Big( \frac{d}{dx} - x^{h} \Big)(u' + x^{h} u) = \Big( \frac{d}{dx} - x^{h} \Big) u^{\ell}, \qquad x \in \mathbb{R}. \tag{3.56}$$
Then every solution $u \in H^{2}(\mathbb{R})$ of the Bernoulli equation
$$u' + x^{h} u = u^{\ell}, \qquad x \in \mathbb{R}, \tag{3.57}$$
is also a solution of (3.56). We restrict our study to the solutions of (3.57). Fixing $u(0) = u_o > 0$, by standard arguments we obtain
$$u(x) = e^{-\frac{x^{h+1}}{h+1}} \Big[ u_o^{1-\ell} + (1-\ell) \int_0^{x} e^{-(\ell-1)\frac{t^{h+1}}{h+1}}\, dt \Big]^{\frac{1}{1-\ell}} \tag{3.58}$$
or, equivalently,
$$u(x) = e^{-\frac{x^{h+1}}{h+1}} \Big[ \lambda + (\ell-1) \int_x^{+\infty} e^{-(\ell-1)\frac{t^{h+1}}{h+1}}\, dt \Big]^{\frac{1}{1-\ell}}, \tag{3.59}$$
where $\lambda = u_o^{1-\ell} + (1-\ell) \int_0^{+\infty} e^{-(\ell-1)\frac{t^{h+1}}{h+1}}\, dt$. We notice from (3.58) that $u$ is well defined for $x \le 0$ and $u(x) \sim e^{-\frac{x^{h+1}}{h+1}}$ as $x \to -\infty$. To analyze the global behavior of $u$ on $\mathbb{R}$, it is convenient to express it in terms of special functions. To be definite, write $\Gamma(\alpha) = \gamma(\alpha, x) + \Gamma(\alpha, x)$, where
$$\gamma(\alpha, x) = \int_0^{x} e^{-t} t^{\alpha-1}\, dt, \qquad \Gamma(\alpha, x) = \int_x^{+\infty} e^{-t} t^{\alpha-1}\, dt.$$
The function $\gamma(\alpha, x)$ is called the incomplete Gamma function, while $\Gamma(\alpha, x)$ is usually known as a complementary incomplete Gamma function. We recall that


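$\Gamma(\alpha, x) = x^{\alpha} e^{-x} \Psi(1, \alpha+1, x)$; hence, in view of (3.53), for fixed $\alpha \in \mathbb{R}$, the function $\Gamma(\alpha, x)$ has the asymptotic expansion
$$\Gamma(\alpha, x) \sim e^{-x} x^{\alpha-1} \sum_{n=0}^{+\infty} (-1)^{n} \frac{(1-\alpha)_n}{x^{n}} \quad \text{as } x \to +\infty, \tag{3.60}$$
cf. [69]. By a change of variable it easily follows that
$$u(x) = e^{-\frac{x^{h+1}}{h+1}} \Big[ \lambda + \Big( \frac{h+1}{\ell-1} \Big)^{-\frac{h}{h+1}} \Gamma\Big( \frac{1}{h+1}, \frac{\ell-1}{h+1}\, x^{h+1} \Big) \Big]^{\frac{1}{1-\ell}}, \tag{3.61}$$
where $\lambda = u_o^{1-\ell} - \big( \frac{h+1}{\ell-1} \big)^{-\frac{h}{h+1}} \Gamma\big( \frac{1}{h+1} \big)$. We can distinguish three cases:

a) $-\big( \frac{h+1}{\ell-1} \big)^{-\frac{h}{h+1}} \Gamma\big( \frac{1}{h+1} \big) < \lambda < 0$. In this case, the solution blows up at the point $x_o > 0$ defined by the equation
$$\lambda = (1-\ell) \int_{x_o}^{+\infty} e^{-(\ell-1)\frac{t^{h+1}}{h+1}}\, dt,$$
cf. (3.59).

b) $\lambda = 0$. The solution is well defined and real-analytic on $\mathbb{R}$. Moreover, by (3.60), (3.61), $u(x) \sim x^{\frac{h}{\ell-1}}$ as $x \to +\infty$. Therefore, $u \in \mathcal{S}'(\mathbb{R})$, $u \notin \mathcal{S}(\mathbb{R})$. Notice that this does not contradict our results, since $u \notin H^{s}(\mathbb{R})$ for $s > 1/2$, hence the assumptions of Theorems 3.3 and 3.4 are not fulfilled.

c) $\lambda > 0$. Also in this case, by (3.59), the solution $u$ is real-analytic on $\mathbb{R}$. Moreover,
$$0 < u(x) < \lambda^{\frac{1}{1-\ell}} e^{-\frac{x^{h+1}}{h+1}}.$$
Now $u \in H^{2}(\mathbb{R})$ and Theorem 3.3 applies and gives the more precise information that $u \in S^{1}_{\frac{1}{h+1}}(\mathbb{R})$. In particular, $u$ admits a holomorphic extension $u(z)$ to a strip of the form $\{z \in \mathbb{C} : |\operatorname{Im} z| < T\}$ for some $T > 0$. Nevertheless, Picard's great theorem of complex analysis implies that $u$ does not admit an entire extension to $\mathbb{C}$, since in (3.59), for any fixed $\lambda \in \mathbb{R}$, the equation
$$\lambda + (\ell-1) \int_z^{+\infty} e^{-(\ell-1)\frac{t^{h+1}}{h+1}}\, dt = 0$$
admits a solution $z_o$, cf. [69]. Hence, we cannot expect to obtain $u \in S^{\mu}_{\frac{1}{h+1}}(\mathbb{R})$ for some $\mu < 1$.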
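The closed form (3.58) and the bound in case c) can be probed numerically. The following Python sketch is only an illustration: the helper names (`simpson`, `bernoulli_solution`, `ode_residual`, `lambda_constant`) and the sample values $h = 3$, $\ell = 2$, $u_o = 0.5$ are assumptions made here and are not taken from the text. It evaluates (3.58) by Simpson quadrature and measures the residual of the Bernoulli equation (3.57) with a central difference.

```python
import math

def simpson(f, a, b, n=2000):
    """Composite Simpson approximation of the integral of f from a to b (n even)."""
    step = (b - a) / n
    total = f(a) + f(b)
    for i in range(1, n):
        total += (4 if i % 2 else 2) * f(a + i * step)
    return total * step / 3.0

def bernoulli_solution(x, u0, h, ell):
    """Formula (3.58): solution of u' + x^h u = u^ell with u(0) = u0 > 0."""
    integrand = lambda t: math.exp(-(ell - 1) * t ** (h + 1) / (h + 1))
    bracket = u0 ** (1 - ell) + (1 - ell) * simpson(integrand, 0.0, x)
    return math.exp(-x ** (h + 1) / (h + 1)) * bracket ** (1.0 / (1 - ell))

def ode_residual(x, u0, h, ell, d=1e-4):
    """Central-difference residual of u' + x^h u - u^ell at the point x."""
    u = lambda z: bernoulli_solution(z, u0, h, ell)
    du = (u(x + d) - u(x - d)) / (2 * d)
    return du + x ** h * u(x) - u(x) ** ell

def lambda_constant(u0, h, ell):
    """The constant lambda appearing in (3.59) and (3.61)."""
    scale = ((h + 1) / (ell - 1)) ** (-h / (h + 1))
    return u0 ** (1 - ell) - scale * math.gamma(1 / (h + 1))

if __name__ == "__main__":
    h, ell, u0 = 3, 2, 0.5  # illustrative values; they give lambda > 0 (case c)
    lam = lambda_constant(u0, h, ell)
    print("lambda =", lam)
    for x in (-1.5, -0.5, 0.0, 0.7, 1.5, 2.0):
        u = bernoulli_solution(x, u0, h, ell)
        bound = lam ** (1.0 / (1 - ell)) * math.exp(-x ** (h + 1) / (h + 1))
        print(x, u, bound, ode_residual(x, u0, h, ell))
```

With these sample values the printed residuals should be at the level of the discretization error, and each value of $u$ should stay below the bound of case c); other choices of $u_o$ may fall into cases a) or b), where the bracket in (3.58) can vanish and the sketch no longer applies.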

4. Elliptic operators with irregular singularity at infinity

The main goal of the present section is to study the equations (1.2) and (1.3) when $P$ admits "irregular" behavior for large $x$. More precisely, we consider the linear equation
$$P(x, D)u = f(x), \qquad x \in \mathbb{R}^n, \tag{4.1}$$


and the corresponding semilinear perturbation
$$P(x, D)u = f(x) + F(u), \qquad x \in \mathbb{R}^n, \tag{4.2}$$
where the pseudodifferential operator $P(x, D)$ is locally elliptic, but with coefficients that exhibit an "irregular" type of singularity as $|x| \to \infty$. Before stating the main results, as a motivating model operator we take
$$P = A_m(D) + \frac{\omega(x)}{\langle x\rangle^{\sigma}}, \tag{4.3}$$
where $A_m(D)$ is an elliptic homogeneous linear partial differential operator with constant coefficients and real-valued symbol $A_m(\xi)$ of order $m \in \mathbb{N}$ and $\omega \in C^{\infty}(\mathbb{R}^n)$ is a bounded function. In particular, if we take $m = 2$ and $A_2(D) = -\Delta$, we have
$$P = -\Delta + \frac{\omega(x)}{\langle x\rangle^{\sigma}}. \tag{4.4}$$
The case $\sigma \le 0$, also corresponding to an irregular-type singularity at infinity in the language of ordinary differential operators, has been studied in the previous sections. Namely, if $\sigma < 0$, we have in (4.4) a potential with algebraic growth at infinity. If the potential $\omega(x)/\langle x\rangle^{\sigma}$ is polynomial, then $P$ is included in the theory of Shubin operators and the subsequent anisotropic generalizations. For $\sigma = 0$, we are back to the case of SG pseudodifferential operators. Note that SG-ellipticity in (4.3) reads
$$\sigma = 0 \quad \text{and} \quad |A_m(\xi) + \omega(x)| \ge C \langle \xi\rangle^{m}$$
for $C > 0$ and large $|x| + |\xi|$. This is satisfied by (4.3) if $\sigma = 0$ and
$$A_m(\xi) > 0 \ \text{ for } \xi \neq 0, \qquad \operatorname{Re}\omega(x) \ge C' > 0 \ \text{ for } |x| \ge R', \tag{4.5}$$
or else
$$A_m(\xi) \in \mathbb{R} \ \text{ for } \xi \in \mathbb{R}^n, \qquad |\operatorname{Im}\omega(x)| \ge C' > 0 \ \text{ for } |x| \ge R', \tag{4.6}$$
for some positive constants $C'$, $R'$.

In the one-dimensional case, the assumption $\sigma > m$ implies regularity at infinity for the ordinary differential operator $P$, whereas $\sigma = m$ corresponds to the classical Fuchs condition at infinity. For the case $\sigma \ge m$, we refer to the fundamental work of McOwen [45] and Lockhart and McOwen [47], [48], where the authors carried out a comprehensive analysis of linear elliptic operators in $\mathbb{R}^n$ under the two assumptions (formulated for (4.3)) $\sigma = m$ and $\lim_{|x|\to\infty} \omega(x) = 0$. Gelfand–Shilov spaces do not appear in the above-mentioned papers. However, one easily gets from the characterization of the kernels of the elliptic systems in the aforementioned papers that even $Pu = 0$, $u \in \mathcal{S}'(\mathbb{R}^n)$ does not imply $u \in \mathcal{S}(\mathbb{R}^n)$. In fact, the following simple example of a linear ODE shows that no global hypoellipticity results are possible in any Gelfand–Shilov space $S^{\mu}_{\nu}(\mathbb{R}^n)$.


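Example. Let $Pu = u'(x) + a(x)u(x)$, $x \in \mathbb{R}$, where $a \not\equiv 0$ satisfies, for some $K > 0$, the estimate $|a(x)| \le K \langle x\rangle^{-1}$, $x \in \mathbb{R}$. Then one easily verifies that no nonzero solution $u$ of $Pu = 0$ belongs to $\mathcal{S}(\mathbb{R})$, while one can find $C_1 > C_2 > 0$ such that $C_2 \langle x\rangle^{-K} \le |u(x)| \le C_1 \langle x\rangle^{K}$, $x \in \mathbb{R}$. This estimate is sharp if we choose $a(x) = \pm K \langle x\rangle^{-1}$ for $\pm x \ge 1$. Furthermore, if $\omega(x) = a(x)\langle x\rangle$ is infinitesimal as $x \to \infty$, then, for every $\varepsilon > 0$, we can find $C_1(\varepsilon) > C_2(\varepsilon) > 0$ such that
$$C_2(\varepsilon) \langle x\rangle^{-\varepsilon} \le |u(x)| \le C_1(\varepsilon) \langle x\rangle^{\varepsilon}, \qquad x \in \mathbb{R}.$$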
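As a concrete instance of the two-sided bound (an illustrative choice made here, not taken from the text), take $a(x) = K\langle x\rangle^{-1}$ on all of $\mathbb{R}$, with $\langle x\rangle = (1+x^2)^{1/2}$. The solution of $u' + a u = 0$ with $u(0) = 1$ is
$$u(x) = \exp\Big( -K \int_0^{x} \frac{dt}{\langle t\rangle} \Big) = \exp(-K \operatorname{arsinh} x) = \big( x + \langle x\rangle \big)^{-K}.$$
Since $\langle x\rangle \le x + \langle x\rangle \le 2\langle x\rangle$ for $x \ge 0$, while $(x + \langle x\rangle)(\langle x\rangle - x) = 1$ gives $(2\langle x\rangle)^{-1} \le x + \langle x\rangle \le \langle x\rangle^{-1}$ for $x \le 0$, we obtain
$$2^{-K} \langle x\rangle^{-K} \le u(x) \le 2^{K} \langle x\rangle^{K}, \qquad x \in \mathbb{R},$$
with decay of order exactly $\langle x\rangle^{-K}$ as $x \to +\infty$ and growth of order exactly $\langle x\rangle^{K}$ as $x \to -\infty$. In particular $u \in \mathcal{S}'(\mathbb{R})$ but $u \notin \mathcal{S}(\mathbb{R})$.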

In view of the above results for $\sigma \ge m$ it is natural to focus our attention on the case of an irregular-type singularity $0 < \sigma < m$ […] $\mu > 0$, $\nu > 0$, $\mu + \nu \ge 1$, are defined as the set of all $f \in C^{\infty}(\mathbb{R}^n)$ satisfying the following estimates: there exist positive constants $C$, $\varepsilon$ such that
$$|\partial_x^{\alpha} f(x)| \le C^{|\alpha|+1} (\alpha!)^{\mu} e^{-\varepsilon |x|^{1/\nu}}, \qquad x \in \mathbb{R}^n, \tag{4.9}$$
cf. the book of Gelfand and Shilov [29] (see also Mityagin [49], Pilipovic [52]). We notice that for $\mu = 1$, functions from $S^{\mu}_{\nu}(\mathbb{R}^n)$ are real-analytic and admit a holomorphic extension to a strip of the form $\{z \in \mathbb{C} : |\operatorname{Im} z| < T\}$, $T > 0$. We also recall that the Fourier transformation $\mathcal{F}$ acts as an isomorphism
$$\mathcal{F} : S^{\mu}_{\nu}(\mathbb{R}^n) \longrightarrow S^{\nu}_{\mu}(\mathbb{R}^n). \tag{4.10}$$


Gelfand–Shilov spaces were already used by the authors of [15], [16] for semilinear Shubin equations, i.e., $\sigma < 0$ in (4.3), (4.4), giving estimates on the solutions of the form (4.9) with $\mu \ge 1/2$, $\nu \ge 1/2$, and in [12] for semilinear SG-elliptic equations, i.e., $\sigma = 0$ in (4.3), (4.4); in this case exponential decay of the type $e^{-\varepsilon|x|}$, $\varepsilon > 0$, was proved.

To state our results in full generality, let us refer to the following class of pseudodifferential operators. Given $m = (m_1, m_2) \in \mathbb{R}^2$, $\delta \in [0, 1)$, we denote by $\Gamma^{m,\delta} = \Gamma^{m,\delta}(\mathbb{R}^n)$ the space of all functions $p(x,\xi) \in C^{\infty}(\mathbb{R}^{2n})$ such that
$$\big| \partial_\xi^{\alpha} \partial_x^{\beta} p(x,\xi) \big| \le C_{\alpha\beta} \langle \xi\rangle^{m_1-|\alpha|} \langle x\rangle^{m_2-|\beta|+\delta|\alpha|}$$
for all $(x,\xi) \in \mathbb{R}^{2n}$, $\alpha, \beta \in \mathbb{Z}^n_+$ and for some positive constants $C_{\alpha\beta}$. We shall also denote by $OP\Gamma^{m,\delta}$ the class of pseudodifferential operators $P = p(x, D)$ defined by a symbol $p \in \Gamma^{m,\delta}$. We introduce fundamental hypotheses which turn out to be crucial for the global hypoellipticity in the weighted Sobolev spaces $H^{s_1,s_2}(\mathbb{R}^n)$: there exist $m' = (m_1', m_2')$ with $m_1' \le m_1$, $m_2' \le m_2$ and $R > 0$ such that
$$\inf_{|x|+|\xi|\ge R} \langle \xi\rangle^{-m_1'} \langle x\rangle^{-m_2'} |p(x,\xi)| = C_1 > 0 \tag{4.11}$$
and, for every $\alpha, \beta \in \mathbb{Z}^n_+$, one can find $C'_{\alpha\beta} > 0$ such that
$$\big| \partial_\xi^{\alpha} \partial_x^{\beta} p(x,\xi) \big| \le C'_{\alpha\beta} \langle \xi\rangle^{-|\alpha|} \langle x\rangle^{-|\beta|+\delta|\alpha|} |p(x,\xi)| \tag{4.12}$$
for all $\alpha, \beta \in \mathbb{Z}^n_+$ and all $(x,\xi) \in \mathbb{R}^{2n}$ with $|x| + |\xi| \ge R$. Notice that if $\delta = 0$, then $\Gamma^{m,0}$ coincides with the class of SG pseudodifferential operators studied in [23], [51], [59], [60], and if we assume further $m_1' = m_1$, $m_2' = m_2$ in (4.13), then the symbol $p$ is SG-elliptic (or md-elliptic). The metric $\langle x\rangle^{-2} |dx|^2 + \langle x\rangle^{2\delta} \langle \xi\rangle^{-2} |d\xi|^2$, $0 \le \delta < 1$, is an admissible metric for the Weyl–Hörmander calculus in [39] and we may regard the preceding pseudodifferential operators in this framework. For globally hypoelliptic operators we have then easily the following result, see also [8] for details.

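Theorem 4.1. Let $P = p(x, D)$ with $p \in \Gamma^{m,\delta}$ satisfy (4.11), (4.12). Then the operator $P$ admits a parametrix $E \in OP\Gamma^{-m',\delta}$ satisfying
$$E \circ P = I + R_1, \qquad P \circ E = I + R_2,$$
where $R_j$, $j = 1, 2$, is $\mathcal{S}$-regularizing, i.e.,
$$R_j : \mathcal{S}'(\mathbb{R}^n) \to \mathcal{S}(\mathbb{R}^n), \qquad j = 1, 2,$$
and
$$E : H^{s_1,s_2}(\mathbb{R}^n) \to H^{s_1+m_1',\, s_2+m_2'}(\mathbb{R}^n)$$
for all $s_1, s_2 \in \mathbb{R}$. Hence, $Pu = f \in \mathcal{S}(\mathbb{R}^n)$, $u \in \mathcal{S}'(\mathbb{R}^n)$ implies $u \in \mathcal{S}(\mathbb{R}^n)$. The operator $P$ is Fredholm in $\mathcal{S}(\mathbb{R}^n)$ and in $\mathcal{S}'(\mathbb{R}^n)$, cf. [63, Definition 2.54]. In particular, the solutions $u \in \mathcal{S}'(\mathbb{R}^n)$ of $Pu = 0$ form a finite-dimensional subspace of $\mathcal{S}(\mathbb{R}^n)$.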


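Our main goal is to identify the critical threshold of sub-exponential decay and to derive global analytic-Gevrey regularity of the solutions in the framework of Gelfand–Shilov spaces under suitable additional assumptions on the regularity of the symbol of $P$. Let us then introduce an analytic-Gevrey variant of the class $\Gamma^{m,\delta}$ defined above. Let then $m \in \mathbb{R}$, $\delta \in [0, 1)$, $\mu \ge 1$. We introduce the class of symbols $\Gamma^{m,\delta}_{\mu} = \Gamma^{m,\delta}_{\mu}(\mathbb{R}^n)$ as the set of all $p \in C^{\infty}(\mathbb{R}^{2n})$ such that one can find $C > 0$ such that
$$\big| \partial_\xi^{\alpha} \partial_x^{\beta} p(x,\xi) \big| \le C^{|\alpha|+|\beta|+1}\, \alpha! (\beta!)^{\mu} \langle \xi\rangle^{m-|\alpha|} \langle x\rangle^{-|\beta|+\delta|\alpha|}, \qquad \alpha, \beta \in \mathbb{Z}^n_+, \ (x,\xi) \in \mathbb{R}^{2n}.$$
We denote by $OP\Gamma^{m,\delta}_{\mu}$ the class of pseudodifferential operators with symbol in $\Gamma^{m,\delta}_{\mu}$.

We assume that (4.11) is satisfied with $m' = (m, -\sigma)$ for some $\sigma \ge 0$, namely
$$\inf_{|x|+|\xi|\ge R} \langle \xi\rangle^{-m} \langle x\rangle^{\sigma} |p(x,\xi)| = C_1 > 0. \tag{4.13}$$
The second crucial hypothesis is the following variant of condition (4.12): there exist $C_2, R > 0$ such that
$$\big| \partial_\xi^{\alpha} \partial_x^{\beta} p(x,\xi) \big| \le C_2^{|\alpha|+|\beta|+1}\, \alpha! (\beta!)^{\mu} \langle \xi\rangle^{-|\alpha|} \langle x\rangle^{-|\beta|+\delta|\alpha|} |p(x,\xi)| \tag{4.14}$$
for all $\alpha, \beta \in \mathbb{Z}^n_+$ and all $(x,\xi) \in \mathbb{R}^{2n}$ with $|x| + |\xi| \ge R$. We have the following result.

Theorem 4.2. Let $\mu \ge 1$, $\nu \ge 1$ and let $f \in S^{\mu}_{\nu}(\mathbb{R}^n)$. Let $P$ be a pseudodifferential operator with symbol $p \in \Gamma^{m,\delta}_{\mu}$ satisfying (4.13), (4.14). If $u \in \mathcal{S}'(\mathbb{R}^n)$ is a solution of the linear equation (4.1), then $u \in S^{\mu}_{\nu'}(\mathbb{R}^n)$, where $\nu' = \max\big\{ \nu, \frac{1}{1-\delta} \big\}$. In particular, every solution $u \in \mathcal{S}'(\mathbb{R}^n)$ of the equation $Pu = 0$ satisfies the estimate
$$\big| \partial_x^{\alpha} u(x) \big| \le C^{|\alpha|+1} (\alpha!)^{\mu} e^{-\varepsilon |x|^{1-\delta}}$$
for all $x \in \mathbb{R}^n$, $\alpha \in \mathbb{Z}^n_+$ and some positive constants $C$, $\varepsilon$ independent of $\alpha$.

Example. Consider the operator $P$ in (4.3). In view of the assumptions (4.8), (4.5), (4.7), its symbol $p(x,\xi)$ satisfies the conditions (4.12), (4.13). In fact, we have the estimate
$$|p(x,\xi)| = \big| A_m(\xi) + \omega(x) \langle x\rangle^{-\sigma} \big| \ge C \langle \xi\rangle^{m} \langle x\rangle^{-\sigma}$$
for $|x| + |\xi|$ large. Moreover, it is easy to see that the derivatives of $p$ with respect to $x$ satisfy (4.12) for $\delta = 0$. Nonetheless, $\xi$ derivatives require $\delta > 0$. For simplicity, we confine our attention to the expected estimate, for $|\alpha| = m$,
$$\big| \partial_\xi^{\alpha} p(x,\xi) \big| = C_{\alpha} \le C \big| A_m(\xi) + \omega(x) \langle x\rangle^{-\sigma} \big| \langle \xi\rangle^{-m} \langle x\rangle^{m\delta}$$
for $|x| + |\xi|$ large, which holds if and only if $\delta \ge \sigma/m \in (0, 1)$. Hence, Theorem 4.1 applies to $P$ in (4.3), (4.4). Similarly, if $\omega$ satisfies (4.8) for $A_{\alpha} = C^{|\alpha|+1} (\alpha!)^{\mu}$, then $p \in \Gamma^{m,\sigma/m}_{\mu}$ and condition (4.14) is fulfilled. Then Theorem 4.2 gives for solutions $u(x) \in \mathcal{S}'(\mathbb{R}^n)$ of
$$Pu = A_m(D)u + \frac{\omega(x)}{\langle x\rangle^{\sigma}} u = 0$$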


the regularity $u \in S^{\mu}_{m/(m-\sigma)}(\mathbb{R}^n)$, in particular, a sub-exponential decay $|u(x)| \le C e^{-\varepsilon |x|^{1-\sigma/m}}$ and uniform Gevrey regularity of order $\mu$. The pointwise decay rate is sharp (see below). We note that, if $\delta = \sigma = 0$, then the theorem above reduces to the known statements for SG elliptic operators, cf. [12, Theorem 7.13]. Finally, if $\mu = 1$, then $u$ admits a holomorphic extension to a strip of the form $\{z \in \mathbb{C} : |\operatorname{Im} z| < T\}$ for some $T > 0$. In particular, consider the equation
$$-\Delta u + \frac{\omega(x)}{\langle x\rangle^{\sigma}} u = 0 \tag{4.15}$$
with $0 < \sigma < 2$, where $\omega(x)$ satisfying (4.8) is of the form $\omega(x) = 1 + \omega_o(x)$ with $\lim_{|x|\to\infty} \omega_o(x) = 0$. For $\omega_o(x) \equiv 0$, solutions do not exist because of the positivity of the operator. Taking, for instance,
$$\omega_o(x) = (1 - n + \sigma/2) \langle x\rangle^{\frac{\sigma}{2}-1} - (\sigma/2 + 1) \langle x\rangle^{-3+\frac{\sigma}{2}} - \langle x\rangle^{-2},$$
we may easily verify that
$$u(x) = \exp\Big( -\frac{\langle x\rangle^{1-\sigma/2}}{1-\sigma/2} \Big) \in S^{1}_{2/(2-\sigma)}(\mathbb{R}^n)$$
is a solution of (4.15).
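The verification is a direct computation, sketched here with the auxiliary notation $a = 1 - \sigma/2$ and $g(x) = \langle x\rangle^{a}/a$ (introduced only for this check), so that $u = e^{-g}$. Using $|x|^2 = \langle x\rangle^2 - 1$ one finds
$$\nabla g = \langle x\rangle^{a-2} x, \qquad |\nabla g|^2 = \langle x\rangle^{2a-2} - \langle x\rangle^{2a-4}, \qquad \Delta g = (n + a - 2) \langle x\rangle^{a-2} - (a-2) \langle x\rangle^{a-4}.$$
Since $\Delta u = (|\nabla g|^2 - \Delta g)\, u$, equation (4.15) holds if and only if $\omega(x) = \langle x\rangle^{\sigma} (|\nabla g|^2 - \Delta g)$. With $2a - 2 = -\sigma$ this gives
$$\omega(x) = 1 - \langle x\rangle^{-2} - (n + a - 2) \langle x\rangle^{\frac{\sigma}{2}-1} + (a - 2) \langle x\rangle^{\frac{\sigma}{2}-3},$$
and inserting $a = 1 - \sigma/2$, i.e., $-(n + a - 2) = 1 - n + \sigma/2$ and $a - 2 = -(\sigma/2 + 1)$, one recovers $\omega = 1 + \omega_o$ with $\omega_o$ as above. All three terms of $\omega_o$ tend to zero as $|x| \to \infty$ because $0 < \sigma < 2$.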

Next, we treat semilinear perturbations. As in previous sections, we propose generalizations of the result in [17] by allowing nonlinearities containing convolution terms. More precisely, we suppose that the nonlinear term is of the form  Fij ui∗ uj , Fij ∈ C, (4.16) F (u) = i,j : ≤i+j≤d

for some integers $d \ge \ell \ge 2$. Under the hypotheses (4.13), (4.14), (4.16) we have the following global regularity results in Gelfand–Shilov spaces for the semilinear equation (4.2).

Theorem 4.3. Let $\mu \ge 1$, $\nu \ge 1$ and let $f \in S^{\mu}_{\nu}(\mathbb{R}^n)$. If $u$ is a solution of (4.2) such that $\langle x\rangle^{\varepsilon_o} u \in H^{s}(\mathbb{R}^n)$ for some $s > n/2$, $\varepsilon_o > \sigma/(\ell-1)$, then
$$u \in S^{\mu}_{\nu'}(\mathbb{R}^n), \qquad \text{where } \nu' = \max\Big\{ \nu, \frac{1}{1-\delta} \Big\}.$$

Concerning ordinary differential equations, i.e., $n = 1$ in Theorems 4.2 and 4.3, our results in their general form can be seen in the spirit of the classical analysis on regularity and asymptotic behavior at infinity (see, e.g., Wasow [70]) and also intersect recent results on Gevrey regularity for nonlinear equations proved by Djakov and Mityagin [26], [27]. They apply to a large class of equations described in detail in Section 4.4. The simplest model in this framework is given by the operator
$$L = \frac{d}{dx} + x \big( 1 + x^2 \big)^{-\gamma}, \qquad x \in \mathbb{R}, \tag{4.17}$$
where $\gamma > 0$. If $\gamma \ge 1$, the equation is Fuchsian or of regular type at infinity, so let us further assume $\gamma < 1$. After multiplication by $-i$, we recognize in (4.17) an operator of the form (4.3) with $m = n = 1$, $A_1(D) = D$, $\omega(x) = -ix/\langle x\rangle$, $\sigma = 2\gamma - 1$. The solutions of $Ly = 0$ are given by
$$y(x) = C \exp\Big( \frac{(1 + x^2)^{1-\gamma}}{2(\gamma - 1)} \Big). \tag{4.18}$$
Conditions (4.8), (4.6) are readily verified, so $L$ is SG-elliptic for $\gamma = 1/2$. The results in the present paper refer to the case $1/2 < \gamma < 1$; in particular, Theorem 4.2 applies. We are then exactly in the framework of example (4.15), where now $\mu = 1$, $\delta = \sigma = 2\gamma - 1$, so that we expect $y \in S^{1}_{1/(1-\delta)}(\mathbb{R})$, which is the regularity we may probe in (4.18).

Example. We give the nonlinear version of the previous example taking for simplicity $L$ in (4.17) as linear part. Consider the ordinary differential equation
$$Ly = y' + x \big( 1 + x^2 \big)^{-\gamma} y = y^{\ell}, \qquad x \in \mathbb{R}, \ \ell \ge 2, \tag{4.19}$$
where $1/2 < \gamma < 1$. Theorem 4.3 applies and we have that all the solutions of (4.19) such that $\langle x\rangle^{\varepsilon_o} y(x) \in H^{s}(\mathbb{R})$ for some $s > 1/2$ and $\varepsilon_o > (2\gamma - 1)/(\ell - 1)$ are analytic and decay at infinity like $\exp\big( -|x|^{2(1-\gamma)} \big)$. This will be tested on the explicit expression of the solutions given by (4.53) in Section 4.4. Notice that compared to the linear case (4.1) we need a priori decay on the solution. Such an assumption is necessary to obtain a sub-exponential decay. In fact, in Section 4.4 we shall check that the equation (4.19) admits two types of homoclinics: one with only algebraic decay $y(x) \sim x^{(1-2\gamma)/(\ell-1)}$ as $x \to +\infty$, which does not satisfy the required a priori bound; other homoclinics with $\langle x\rangle^{\varepsilon_o} y(x) \in H^{s}(\mathbb{R})$, $s > 1/2$, $\varepsilon_o > (2\gamma - 1)/(\ell - 1)$, which have the expected sub-exponential decay. Moreover, we may check that $\frac{2\gamma-1}{\ell-1}$ is indeed a sharp lower bound for $\varepsilon_o$.

In conclusion, we would like to observe that the problems of asymptotic decay and holomorphic extensions of solutions, apart from the interest per se in the general theory of differential equations (both ordinary and partial), arise in different contexts in mathematical physics, e.g., for analytic regularity and exponential decay of traveling wave type solutions, cf.
the fundamental work by Bona and Li [4] (see also [2]), for the exponential decay of eigenfunctions of Schrödinger operators appearing in quantum mechanics, starting from the celebrated work of Agmon [1] (see also [6], [25], [37], [56]) and, more generally, for solutions of second-order elliptic equations, cf. [54] and the references therein.

The section is organized as follows. In Subsection 4.1, we introduce some scales of Sobolev norms providing suitable characterizations of the space $S^{\mu}_{\nu}(\mathbb{R}^n)$, which will be instrumental in the proofs of our statements. In Subsections 4.2 and 4.3, we prove sub-exponential decay estimates and uniform regularity, respectively, for the solutions of the equations (4.1), (4.2). As a consequence we obtain Theorems 4.2 and 4.3. In Section 4.4, we fix the attention on a class of ordinary differential operators including (4.17) and also check the sharpness of our results on the solutions of (4.19). In the proofs we shall use the classical theorems of pseudodifferential calculus for the class $\Gamma^{m,\delta}_{\mu}$ (composition, adjoints, construction of parametrices). Unlike the case of $\Gamma^{m,\delta}$, we are not aware of an existing specific calculus for $\Gamma^{m,\delta}_{\mu}$ in Gelfand–Shilov classes, hence we proved these statements for a more general class including $\Gamma^{m,\delta}_{\mu}$. Nevertheless, in order to immediately introduce the reader to the proofs of the main results, we postponed the pseudodifferential calculus to an appendix at the end of the paper.

4.1. Separate norms for decay and regularity

Taking into account Proposition 2.2, we introduce norms which describe only the decay and the regularity properties, respectively. Precisely, let us set
$$\|u\|_{s,\nu;\varepsilon} = \sum_{k\in\mathbb{Z}^n_+} \frac{\varepsilon^{|k|}}{|k|!^{\nu}} \|x^{k} u\|_s$$
and write
$$H_N^{s,\nu;\varepsilon}[u] = \sum_{k\in\mathbb{Z}^n_+,\ |k|\le N} \frac{\varepsilon^{|k|}}{|k|!^{\nu}} \|x^{k} u\|_s.$$
By Sobolev embedding estimates, it is obvious that if $\|u\|_{s,\nu;\varepsilon} < +\infty$ for some $\nu > 0$, $s > n/2$, $\varepsilon > 0$, then $u$ satisfies the first inequality in (2.1). Similarly, we can define
$$\|u\|_{\{s,\mu;T\}} = \sum_{j\in\mathbb{Z}^n_+} \frac{T^{|j|}}{j!^{\mu}} \|\partial_x^{j} u\|_s.$$
It is easy to verify that if $\|u\|_{\{s,\mu;T\}} < +\infty$ for some $T > 0$, $s \ge 0$, then $u$ satisfies the second inequality in (2.1). In fact, for technical reasons that will be clear in the next sections, we shall use a slightly different scale of norms to prove regularity estimates for nonlinear equations. Precisely, for fixed $\varepsilon_o \ge 0$, we shall consider the norm
$$\|u\|_{\{s,\mu;T,\varepsilon_o\}} = \sum_{j\in\mathbb{Z}^n_+} \frac{T^{|j|}}{j!^{\mu}} \|\langle x\rangle^{\varepsilon_o} \partial_x^{j} u\|_s \tag{4.20}$$
and denote the corresponding partial sum as
$$E_N^{s,\mu;T,\varepsilon_o}[u] = \sum_{j\in\mathbb{Z}^n_+,\ |j|\le N} \frac{T^{|j|}}{j!^{\mu}} \|\langle x\rangle^{\varepsilon_o} \partial_x^{j} u\|_s.$$
We shall write $E_N^{s,\mu;T}[u]$ for $E_N^{s,\mu;T,0}[u]$.
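To get a feel for the decay norm $\|u\|_{s,\nu;\varepsilon}$, consider the one-dimensional Gaussian $u(x) = e^{-x^2/2}$ with $s = 0$. The choice $s = 0$ is an assumption made here only to keep the arithmetic explicit (the text works with $s > n/2$), and the function name below is hypothetical. In this case $\|x^{k} u\|_{L^2}^2 = \Gamma(k + 1/2)$, so the partial sums $H_N^{0,\nu;\varepsilon}[u]$ can be evaluated directly:

```python
import math

def gaussian_partial_sum(N, nu, eps):
    """H_N^{0,nu;eps}[u] for u(x) = exp(-x^2/2) on the real line (s = 0).

    Uses ||x^k u||_{L^2}^2 = Gamma(k + 1/2); each term is formed through
    logarithms so that large factorials do not overflow.
    """
    total = 0.0
    for k in range(N + 1):
        log_term = (k * math.log(eps)
                    + 0.5 * math.lgamma(k + 0.5)
                    - nu * math.lgamma(k + 1))
        total += math.exp(log_term)
    return total

if __name__ == "__main__":
    for N in (10, 50, 200):
        print(N, gaussian_partial_sum(N, nu=0.5, eps=0.5))
```

For $\nu = 1/2$ consecutive terms have ratio $\varepsilon \sqrt{(k + 1/2)/(k + 1)} < \varepsilon$, so the partial sums stay bounded for every $\varepsilon < 1$; this is consistent with the Gaussian decaying like $e^{-c|x|^{1/\nu}}$ with $\nu = 1/2$.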

4.2. Decay estimates

We first derive sharp decay estimates for the solutions of the equations (4.1), (4.2), where $F$ is of the form (4.16) and $P$ is a pseudodifferential operator with symbol $p \in \Gamma^{m,\delta}_{\mu}$ satisfying the conditions (4.13), (4.14). The approach will be the same in the linear and the semilinear case, but the latter case requires some a priori restrictions on the behavior at infinity of the solution. Let us then start with the linear case $F(u) = 0$.


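If $u \in \mathcal{S}(\mathbb{R}^n)$ is a solution of $Pu = f$, then for every $k \in \mathbb{Z}^n_+$, $\varepsilon > 0$, $\nu \ge 1$, we can write
$$\frac{\varepsilon^{|k|}}{|k|!^{\nu}} x^{k} P u(x) = \frac{\varepsilon^{|k|}}{|k|!^{\nu}} x^{k} f(x),$$
from which we get
$$\frac{\varepsilon^{|k|}}{|k|!^{\nu}} P(x^{k} u) = \frac{\varepsilon^{|k|}}{|k|!^{\nu}} x^{k} f(x) + \frac{\varepsilon^{|k|}}{|k|!^{\nu}} \big[ P, x^{k} \big] u.$$
Now, since $P$ satisfies (4.13) and (4.14), by Proposition A.13 there exists a left parametrix $E$ for $P$. Then we have
$$\frac{\varepsilon^{|k|}}{|k|!^{\nu}} x^{k} u = \frac{\varepsilon^{|k|}}{|k|!^{\nu}} E(x^{k} f) + \frac{\varepsilon^{|k|}}{|k|!^{\nu}} R(x^{k} u) + \frac{\varepsilon^{|k|}}{|k|!^{\nu}} E \big[ P, x^{k} \big] u,$$
where $R$ is a regularizing operator mapping $\mathcal{S}'(\mathbb{R}^n)$ into $\mathcal{S}(\mathbb{R}^n)$, cf. Remark A.8. Taking Sobolev norms and summing up for $|k| \le N$, $N \in \mathbb{Z}_+$, we obtain
$$H_N^{s,\nu;\varepsilon}[u] \le \sum_{|k|\le N} \frac{\varepsilon^{|k|}}{|k|!^{\nu}} \|E(x^{k} f)\|_s + \sum_{|k|\le N} \frac{\varepsilon^{|k|}}{|k|!^{\nu}} \|R(x^{k} u)\|_s + \sum_{0<|k|\le N} \frac{\varepsilon^{|k|}}{|k|!^{\nu}} \big\| E \big[ P, x^{k} \big] u \big\|_s. \tag{4.21}$$

Theorem 4.4. […] $\varepsilon' > 0$, $s > n/2$. If $u \in \mathcal{S}'(\mathbb{R}^n)$ is a solution of $Pu = f$, then there exists $\varepsilon > 0$ such that $\|u\|_{s,\nu';\varepsilon} < +\infty$, where $\nu' = \max\big\{ \nu, \frac{1}{1-\delta} \big\}$. In particular, there exist positive constants $C$, $c$ such that
$$|u(x)| \le C e^{-c|x|^{1/\nu'}}$$
for every $x \in \mathbb{R}^n$.

In order to prove Theorem 4.4 we want to show that, for some $\varepsilon > 0$, the left-hand side of (4.21) converges as $N \to +\infty$. To do this we need to estimate properly the three terms in the right-hand side. The most delicate term is the one containing commutators, for which some preliminary steps are necessary.

Lemma 4.5. Let $\delta \in (0, 1)$ and $r > 0$. Then
$$t^{\beta\delta} \le r t^{\beta} + (1-\delta) \Big( \frac{\delta}{r} \Big)^{\delta/(1-\delta)}, \qquad t \ge 0, \tag{4.22}$$
for all $\beta \in \mathbb{N}$.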
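Inequality (4.22) is easy to probe numerically before reading the proof. The sketch below is illustrative only: the function names and the parameter grid are assumptions made here, not part of the text. It evaluates the gap between the two sides of (4.22) on a grid of $t$ and at the point $t^{\beta} = (\delta/r)^{1/(1-\delta)}$, where the inequality becomes an equality.

```python
def young_type_gap(t, beta, delta, r):
    """Right-hand side minus left-hand side of (4.22); nonnegative if (4.22) holds."""
    const = (1 - delta) * (delta / r) ** (delta / (1 - delta))
    return r * t ** beta + const - t ** (beta * delta)

def extremal_point(beta, delta, r):
    """Value of t at which (4.22) is an equality: t^beta = (delta/r)^(1/(1-delta))."""
    return (delta / r) ** (1.0 / ((1 - delta) * beta))

if __name__ == "__main__":
    for delta in (0.1, 0.5, 0.9):
        for r in (0.01, 1.0, 10.0):
            for beta in (1, 2, 3):
                worst = min(young_type_gap(0.05 * i, beta, delta, r)
                            for i in range(2001))
                t_star = extremal_point(beta, delta, r)
                print(delta, r, beta, worst, young_type_gap(t_star, beta, delta, r))
```

Up to rounding, the smallest gap on the grid should be nonnegative, and the gap at the extremal point should vanish relative to the size of the terms involved.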


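Proof. Clearly we can assume $\beta = 1$, setting $t^{\beta} = z$. Set $g(z) = z^{\delta} - rz$, $z \ge 0$. Since $g'(z) = \delta z^{\delta-1} - r = 0$ iff $z = z_{\delta,r} = (\delta/r)^{1/(1-\delta)}$, we readily obtain that
$$\sup_{z\ge 0} g(z) = g(z_{\delta,r}) = \Big( \frac{\delta}{r} \Big)^{\delta/(1-\delta)} - r \Big( \frac{\delta}{r} \Big)^{1/(1-\delta)} = (1-\delta) \Big( \frac{\delta}{r} \Big)^{\delta/(1-\delta)}.$$
The proof is complete.

Lemma 4.6. Let $\delta \in (0, 1)$, $\nu \ge 1$, $\gamma, \eta > 0$. Then
$$\frac{|x|^{|\beta|\delta}}{(|k|(|k|-1)\cdots(|k-\beta|+1))^{\nu-1}} \le \frac{\eta\, \gamma^{(1-\delta)|\beta|} |x|^{|\beta|}}{(|k|(|k|-1)\cdots(|k-\beta|+1))^{\nu}} + \frac{(1-\delta)\gamma^{-\delta|\beta|}}{(|k|(|k|-1)\cdots(|k-\beta|+1))^{\nu-1/(1-\delta)}} \Big( \frac{\delta}{\eta} \Big)^{\delta/(1-\delta)} \tag{4.23}$$
for all $x \in \mathbb{R}^n$, $k, \beta \in \mathbb{Z}^n_+$, $|\beta| \le |k|$.

Proof. We set $r = \eta/(|k|(|k|-1)\cdots(|k-\beta|+1))$ and $t = \gamma|x|$. Then (4.23) follows from (4.22) and a straightforward calculation.

Lemma 4.7. Let $\delta \in (0, 1)$, $\nu \ge 1$. Then there exists $C_0 > 0$ such that for every $\gamma \in (0, 1)$, $\eta > 0$ the estimate
$$\frac{\big| \langle x\rangle^{|\beta|\delta} x^{k-\beta} \big|}{(|k|(|k|-1)\cdots(|k-\beta|+1))^{\nu-1} (|k-\beta|)!^{\nu}} \le C_0^{|\beta|}\, \eta\, \gamma^{(1-\delta)|\beta|} \sum_{q=1}^{n} \frac{\big| x^{k-\beta+|\beta|e_q} \big|}{|k|!^{\nu}} + C_0^{|\beta|} \gamma^{-\delta|\beta|} D^{|k-\beta|}_{\nu,\delta,\eta} \frac{|x^{k-\beta}|}{|k-\beta|!^{\nu}}, \tag{4.24}$$
where
$$D^{|k-\beta|}_{\nu,\delta,\eta} = \frac{1}{(|k|(|k|-1)\cdots(|k-\beta|+1))^{\nu-1}} + \frac{n(1-\delta)}{(|k|(|k|-1)\cdots(|k-\beta|+1))^{\nu-1/(1-\delta)}} \Big( \frac{\delta}{\eta} \Big)^{\delta/(1-\delta)}, \tag{4.25}$$
holds for all $x \in \mathbb{R}^n$, $k, \beta \in \mathbb{Z}^n_+$, $\beta \le k$.

Proof. Since $\delta \in (0, 1)$, we have
$$\langle x\rangle^{|\beta|\delta} \le (1 + |x|^{\delta})^{|\beta|} \le (n+2)^{|\beta|} \Big( 1 + \sum_{q=1}^{n} |x_q|^{|\beta|\delta} \Big). \tag{4.26}$$
Next we estimate by (4.23) and derive
$$\frac{|x_q|^{|\beta|\delta} |x^{k-\beta}|}{(|k|(|k|-1)\cdots(|k-\beta|+1))^{\nu-1}} \le \frac{\eta\, \gamma^{(1-\delta)|\beta|} \big| x^{k-\beta+|\beta|e_q} \big|}{(|k|(|k|-1)\cdots(|k-\beta|+1))^{\nu}} + \frac{(1-\delta)\gamma^{-\delta|\beta|} |x^{k-\beta}|}{(|k|(|k|-1)\cdots(|k-\beta|+1))^{\nu-1/(1-\delta)}} \Big( \frac{\delta}{\eta} \Big)^{\delta/(1-\delta)} \tag{4.27}$$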


for $q = 1, \dots, n$, $x \in \mathbb{R}^n$. Combining (4.26) and (4.27) and summing over $q$ we get (4.24) and (4.25).

The next lemma states some crucial estimates for the operator $P$ in (4.1), (4.2). Since the proof is based on some results contained in the appendix, we give here only the statement and refer the reader to the appendix for this proof.

Lemma 4.8. Let $P = p(x, D)$ with $p \in \Gamma^{m,\delta}_{\mu}$ satisfying (4.13), (4.14) and let $E$ be a left parametrix for $P$ as in Proposition A.13. Then, for every $s \in \mathbb{R}$, there exist positive constants $A_s$, $C_s$ such that, for every $u \in \mathcal{S}(\mathbb{R}^n)$, we have
$$\|Eu\|_s \le C_s \|\langle x\rangle^{\sigma} u\|_{s-m} \tag{4.28}$$
and
$$\frac{1}{|k|!^{\nu}} \big\| E \big[ P, x^{k} \big] u \big\|_s \le \sum_{\beta\le k,\ \beta\neq 0} \frac{A_s^{|\beta|}\, \|\langle x\rangle^{\delta|\beta|} x^{k-\beta} u\|_s}{(|k|(|k|-1)\cdots(|k-\beta|+1))^{\nu-1} |k-\beta|!^{\nu}} \tag{4.29}$$
for all $k \in \mathbb{Z}^n_+$, $k \neq 0$, $\nu \ge 1$.

Taking Lemmas 4.7 and 4.8 into account, we can now estimate the commutator in the right-hand side of (4.21).

Proposition 4.9. Let $\nu \ge \frac{1}{1-\delta}$, $s \in \mathbb{Z}_+$. Then there exist positive constants $\varepsilon$, $C_s$ such that, for every $\eta > 0$, the estimate
$$\sum_{k\in\mathbb{Z}^n_+,\ s\le|k|\le N} \frac{\varepsilon^{|k|}}{|k|!^{\nu}} \big\| E \big[ P, x^{k} \big] u \big\|_s \le C_s \big( \eta H_N^{s,\nu;\varepsilon}[u] + \varepsilon H_{N-1}^{s,\nu;\varepsilon}[u] \big) \tag{4.30}$$
holds for every $N \in \mathbb{Z}_+$ with $N \ge s$.

Proof. In view of Lemma 4.8 we have

ε|k|

E P, xk u ≤ ε|k| ν s |k|!

β≤k, β=0

= ε|k|



|β|

β≤k, β=0

×

|β|

As x δ|β| xk−β u s ν−1 (|k|(|k| − 1) · · · (|k − β| + 1)) |k − β|!ν



As (|k|(|k| − 1) · · · (|k − β| + 1))ν−1 

(k − β)!

α!

(∂ α1 u) xk−β−α2 ∂ α3 x |β|δ 2 x x L α !α !α ! (k − β − α2 )! =α, 1 2 3

|α|≤s α1 +α2 +α3 α2 ≤k−β

≤ ε|k|

 β≤k,β=0

×



(As C)|β| (|k|(|k| − 1) · · · (|k − β| + 1))ν−1



(k − β)!

α!

xk−β−α2 x |β|δ ∂xα1 u 2 L α !α ! (k − β − α )! 2 =α, 1 2

|α|≤s α1 +α2 α2 ≤k−β


  using the fact that ∂xα3 x |β|δ  ≤ C |α3 |+|β|+1 α3 ! x |β|δ . Then, by Lemma 4.7, we get, for any η > 0, γ ∈ (0, 1),  

ε|k|

E P, xk u ≤ η ν s |k|!



(M γ (1−δ) )|β|

×ε

α! α !α ! =α, 1 2



n  ε|k−α2 | xk−β−α2 +|β|eq ∂xα1 u L2 (k − β)! (k − β − α2 )! q=1 |k|!ν



+ D0



|α|≤s α1 +α2 α2 ≤k−β

β≤k, β=0

|α2 |



(M γ −δ )|β|





α! ε|β+α2 | α !α ! 1 2 =α,

|α|≤s α1 +α2 α2 ≤k−β

β≤k, β=0



ε|k−β−α2 | xk−β−α2 ∂xα1 u L2 (k − β)! × , (k − β − α2 )! |k − β|!ν where M is a positive constant independent of ε, k, β, γ and D0 = in view of the condition ν ≥

|k−β|

sup

k,β∈Zn + \0, β≤k

1 1−δ .

Dν,δ,η < +∞

Now, observing that

(k − β)! 1 1 ≤ (k − β − α2 )! |k|!ν |k − α2 |!ν and 1 (k − β)! 1 ≤ ν (k − β − α2 )! |k − β|! |k − β − α2 |!ν and choosing γ < M −1/(1−δ) , ε < 1, we obtain  

ε|k|

E P, xk u ≤ η s |k|!ν ×





α! ε|α2 | α !α ! 1 2 =α,

β≤k, β=0 |α|≤s α1 +α2 α2 ≤k−β



n  εk−|α2 | xk−β−α2 +|β|eq ∂xα1 u

|k − α2

q=1

×







α! α !α ! =α, 1 2

|α|≤s α1 +α2 α2 ≤k−β

|!ν

L2



+ D0 ε



(M γ −δ )|β|

β≤k,β=0



ε|k−β−α2 | xk−β−α2 ∂xα1 u L2 |k − β − α2 |!ν

.

(4.31)

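We now observe that in the first term in the right-hand side of (4.31), if $s \le |k| \le N$, then we have $0 \le |k-\beta-\alpha_2+|\beta|e_q| = |k-\alpha_2| \le N$. Then, rescaling indices in the sums we obtain that
\[
\begin{aligned}
&\eta\sum_{s\le|k|\le N}\ \sum_{\beta\le k,\ \beta\ne 0}\ \sum_{|\alpha|\le s}\ \sum_{\substack{\alpha_1+\alpha_2=\alpha \\ \alpha_2\le k-\beta}}
\frac{\alpha!}{\alpha_1!\,\alpha_2!}\,\varepsilon^{|\alpha_2|}
\sum_{q=1}^{n}\frac{\varepsilon^{|k|-|\alpha_2|}\,\|x^{k-\beta-\alpha_2+|\beta|e_q}\,\partial_x^{\alpha_1}u\|_{L^2}}{|k-\alpha_2|!^{\nu}} \\
&\qquad\le C_s\,\eta\sum_{|\alpha_1|\le s}\ \sum_{0\le|k|\le N}\frac{\varepsilon^{|k|}}{|k|!^{\nu}}\,\|x^{k}\partial_x^{\alpha_1}u\|_{L^2}.
\end{aligned}
\]
Similarly, choosing $\varepsilon < M^{-1}\gamma^{\delta}$ and taking into account that in the second term in the right-hand side of (4.31) we have $0 \le |k-\beta-\alpha_2| \le N-1$, since $\beta \ne 0$, we obtain the estimate
\[
\begin{aligned}
&D_0\,\varepsilon\sum_{s\le|k|\le N}\ \sum_{\beta\le k,\ \beta\ne 0}(M\gamma^{-\delta})^{|\beta|}
\sum_{|\alpha|\le s}\ \sum_{\substack{\alpha_1+\alpha_2=\alpha \\ \alpha_2\le k-\beta}}
\frac{\alpha!}{\alpha_1!\,\alpha_2!}\,
\frac{\varepsilon^{|k-\beta-\alpha_2|}\,\|x^{k-\beta-\alpha_2}\,\partial_x^{\alpha_1}u\|_{L^2}}{|k-\beta-\alpha_2|!^{\nu}} \\
&\qquad\le C_s\,D_0\,\varepsilon\sum_{|\alpha_1|\le s}\ \sum_{0\le|k|\le N-1}\frac{\varepsilon^{|k|}}{|k|!^{\nu}}\,\|x^{k}\partial_x^{\alpha_1}u\|_{L^2}.
\end{aligned}
\]
From the last two estimates we easily obtain (4.30) observing that
\[
x^{k}\partial_x^{\alpha_1}u = \sum_{j\le\alpha_1,\ j\le k}(-1)^{|j|}\,\frac{k!}{(k-j)!}\binom{\alpha_1}{j}\,\partial_x^{\alpha_1-j}(x^{k-j}u),
\]
cf. [15, Lemma 3.2].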
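As a sanity check of the commutation identity quoted from [15, Lemma 3.2], the following worked computation may help. It is an illustration added here, not part of the original argument, and it assumes dimension $n = 1$ with a smooth function $u$; the exponent of $\partial_x$ is written $a$ instead of $\alpha_1$.

```latex
% One-dimensional instance of
%   x^k \partial_x^{a} u
%     = \sum_{j \le \min(a,k)} (-1)^j \frac{k!}{(k-j)!} \binom{a}{j} \partial_x^{a-j}(x^{k-j} u).
% For k = a = 1 the coefficients are 1 (j=0) and -1 (j=1);
% for k = a = 2 they are 1 (j=0), -4 (j=1) and 2 (j=2).
\begin{align*}
k = a = 1:\quad & (xu)' - u = u + xu' - u = x\,u', \\
k = a = 2:\quad & (x^2u)'' - 4\,(xu)' + 2u \\
 &= \bigl(2u + 4xu' + x^2u''\bigr) - 4\,(u + xu') + 2u = x^2\,u''.
\end{align*}
```

In both cases the right-hand side of the identity collapses to $x^k\partial_x^{a}u$, which is what allows the weighted derivative norms in the two estimates above to be rewritten in terms of the quantities $H^{s,\nu;\varepsilon}_N[u]$.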

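Proof of Theorem 4.4. We observe that under the assumptions of Theorem 4.4, we already know that $u \in \mathcal{S}(\mathbb{R}^n)$, cf. Theorem 4.1. Now, by (4.28), we have, for any $\varepsilon \in (0, \varepsilon']$,
\[
\sum_{|k|\le N}\frac{\varepsilon^{|k|}}{|k|!^{\nu}}\,\|E(x^k f)\|_s
\le C_s\sum_{|k|\le N}\frac{\varepsilon^{|k|}}{|k|!^{\nu}}\,\|x^k\langle x\rangle^{\sigma}f\|_s
\le C_s\,\|\langle x\rangle^{\sigma}f\|_{s,\nu;\varepsilon} < +\infty.
\]
Moreover, since $R$ is $\mathcal{S}$-regularizing, also $R\circ x_j$ is $\mathcal{S}$-regularizing for every $j = 1, \ldots, n$. For fixed $k \ne 0$, there exists $j = j_k \in \{1, \ldots, n\}$ such that $R\circ x^k = R\circ x_{j_k}\circ x^{k-e_{j_k}}$. Then
\[
\sum_{|k|\le N}\frac{\varepsilon^{|k|}}{|k|!^{\nu}}\,\|R(x^k u)\|_s
\le \|u\|_s + C_s\,\varepsilon\sum_{0<|k|\le N}\frac{\varepsilon^{|k|-1}}{|k|!^{\nu}}\,\|x^{k-e_{j_k}}u\|_s.
\]
Then, taking $\eta > 0$ sufficiently small, we obtain
\[
H^{s,\nu';\varepsilon}_{N}[u] \le C_s\bigl(\|u\|_s + \|\langle x\rangle^{s-2+\delta}u\|_s + \varepsilon\,H^{s,\nu';\varepsilon}_{N-1}[u] + \|\langle x\rangle^{\sigma}f\|_{s,\nu;\varepsilon}\bigr). \tag{4.32}
\]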

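Then, possibly choosing $\varepsilon > 0$ smaller and iterating estimate (4.32), it follows that $H^{s,\nu';\varepsilon}_{N}[u]$ is bounded from above with respect to $N$. Then, as $N \to +\infty$, we obtain that $\|u\|_{s,\nu';\varepsilon} < +\infty$.

To treat the nonlinear case, we shall suppose without loss of generality that $F(u) = u^{\ell}$ for some integer $\ell \ge 2$. Compared to the linear case, here we need to assume some a priori decay of $u$.

Theorem 4.10. Let $P = p(x, D) \in OP\Gamma^{m,\delta}_{\mu}$ satisfy the assumptions of Theorem 4.3. Let $u$ be a solution of (4.2) such that $\langle x\rangle^{\varepsilon_o}u \in H^s(\mathbb{R}^n)$, $s \in \mathbb{Z}_+$, $s > n/2$, for some $\varepsilon_o > \sigma/(\ell-1)$. Assume, moreover, that $\|\langle x\rangle^{\sigma}f\|_{s,\nu;\varepsilon'} < \infty$ for some $\varepsilon' > 0$, $\nu \ge 1$. Then there exists $\varepsilon > 0$ such that $\|u\|_{s,\nu';\varepsilon} < +\infty$, where $\nu' = \max\bigl\{\nu, \frac{1}{1-\delta}\bigr\}$.

Lemma 4.11. Under the assumptions of Theorem 4.10 we have $\langle x\rangle^{\varepsilon_o+\rho}u \in H^s(\mathbb{R}^n)$ for every $\rho \le \min\{1-\delta,\ (\ell-1)\varepsilon_o-\sigma\}$.

Proof. By (4.2), we have $\langle x\rangle^{\varepsilon_o+\rho}Pu = \langle x\rangle^{\varepsilon_o+\rho}f + \langle x\rangle^{\varepsilon_o+\rho}u^{\ell}$, from which
\[
\langle x\rangle^{\varepsilon_o+\rho}u = E\bigl(\langle x\rangle^{\varepsilon_o+\rho}f\bigr) + R\bigl(\langle x\rangle^{\varepsilon_o+\rho}u\bigr) + E\bigl[P, \langle x\rangle^{\varepsilon_o+\rho}\bigr]u + E\bigl(\langle x\rangle^{\varepsilon_o+\rho}u^{\ell}\bigr) \tag{4.33}
\]
for some regularizing operator $R$ mapping $\mathcal{S}'(\mathbb{R}^n)$ into $\mathcal{S}(\mathbb{R}^n)$. Clearly, the assumption on $f$ and (4.28) imply that the Sobolev norm of the first term in the right-hand side of (4.33) is finite. Furthermore, as a consequence of Theorem A.11 and Lemma A.16, the operator $E\bigl[P, \langle x\rangle^{\varepsilon_o+\rho}\bigr]\langle x\rangle^{-\varepsilon_o-\rho-\delta+1}$ maps $H^s(\mathbb{R}^n)$ into itself. Hence,
\[
\bigl\|E\bigl[P, \langle x\rangle^{\varepsilon_o+\rho}\bigr]u\bigr\|_s \le C_s\,\|\langle x\rangle^{\varepsilon_o+\rho+\delta-1}u\|_s < +\infty,
\]
since $\rho \le 1-\delta$. Finally, we have
\[
\begin{aligned}
\bigl\|E\bigl(\langle x\rangle^{\varepsilon_o+\rho}u^{\ell}\bigr)\bigr\|_s
&\le C_s\,\|\langle x\rangle^{\varepsilon_o+\rho+\sigma}u^{\ell}\|_s
= C_s\,\bigl\|\langle x\rangle^{\varepsilon_o}u\,\bigl(\langle x\rangle^{\frac{\sigma+\rho}{\ell-1}}u\bigr)^{\ell-1}\bigr\|_s \\
&\le C_s'\,\|\langle x\rangle^{\varepsilon_o}u\|_s\,\bigl\|\langle x\rangle^{\frac{\sigma+\rho}{\ell-1}}u\bigr\|_s^{\ell-1} < +\infty,
\end{aligned}
\]
applying Schauder's lemma. The proof is complete.

Iterating Lemma 4.11, we obtain that $\langle x\rangle^{\tau}u \in H^s(\mathbb{R}^n)$ for all $\tau > 0$.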
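The exponent bookkeeping behind Lemma 4.11 can be made explicit as follows. This is a sketch added for illustration; the numerical values of $\ell$, $\sigma$, $\delta$, $\varepsilon_o$ in the comment are hypothetical and chosen only to satisfy the hypotheses of Theorem 4.10.

```latex
% Splitting of the weight in the nonlinear term and the resulting constraint on \rho.
\begin{align*}
\langle x\rangle^{\varepsilon_o+\rho+\sigma}\,u^{\ell}
 &= \langle x\rangle^{\varepsilon_o}u \cdot
    \bigl(\langle x\rangle^{\frac{\sigma+\rho}{\ell-1}}u\bigr)^{\ell-1}, \\
\frac{\sigma+\rho}{\ell-1} \le \varepsilon_o
 &\iff \rho \le (\ell-1)\,\varepsilon_o - \sigma .
\end{align*}
% Hypothetical example: \ell = 3, \sigma = 1/2, \delta = 1/2, \varepsilon_o = 1,
% so that \varepsilon_o > \sigma/(\ell-1) = 1/4. The admissible gain is then
%   \rho \le \min\{1-\delta,\ (\ell-1)\varepsilon_o - \sigma\} = \min\{1/2,\ 3/2\} = 1/2 .
```

Since $(\ell-1)\varepsilon_o-\sigma$ only increases when $\varepsilon_o$ is replaced by $\varepsilon_o+\rho$, every further application of the lemma gains at least the same positive amount of decay, which is why the iteration reaches every $\tau > 0$.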

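Lemma 4.12. Under the assumptions of Theorem 4.10, the estimate
\[
\sum_{k\in\mathbb{Z}^n_+,\ 0<|k|\le N}\frac{\varepsilon^{|k|}}{|k|!^{\nu}}\,\|E(x^k u^{\ell})\|_s
\le C_s\,\varepsilon\,\bigl\|\langle x\rangle^{\frac{\sigma+1}{\ell-1}}u\bigr\|_s^{\ell-1}\,H^{s,\nu;\varepsilon}_{N-1}[u]
\]
[…] $> 0$. Namely, in this case, if $u$ is a solution of $Pu = f$ and $u \in H^s(\mathbb{R}^n)$, then $\|u\|_{s,\nu';\varepsilon} < +\infty$ for some $\varepsilon > 0$, where $\nu'$ is as in Theorem 4.4.

4.3. Regularity estimates

In this section, we derive regularity estimates for the solutions of (4.1), (4.2). As in the previous section we first consider the linear case $F(u) = 0$. If $u \in \mathcal{S}(\mathbb{R}^n)$ is a solution of the equation $Pu = f$, then, for every $j \in \mathbb{Z}^n_+$, $T > 0$, $\mu \ge 1$ we have the identity
\[
\frac{T^{|j|}}{j!^{\mu}}\,\partial_x^{j}Pu(x) = \frac{T^{|j|}}{j!^{\mu}}\,\partial_x^{j}f(x)
\]
from which
\[
\frac{T^{|j|}}{j!^{\mu}}\,\partial_x^{j}u
= \frac{T^{|j|}}{j!^{\mu}}\,E\bigl(\partial_x^{j}f\bigr)
+ \frac{T^{|j|}}{j!^{\mu}}\,R\bigl(\partial_x^{j}u\bigr)
+ \frac{T^{|j|}}{j!^{\mu}}\,E\bigl[P, \partial_x^{j}\bigr]u, \tag{4.35}
\]
where $E$ is a left parametrix of $P$ and $R$ is an $\mathcal{S}$-regularizing operator.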


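Lemma 4.14. Let $P$ satisfy the assumptions of Theorems 4.2 and 4.3. Then, for every $\varepsilon_o \ge 0$, there exists a constant $B > 0$ such that
\[
\bigl\|E\bigl[P, \langle x\rangle^{\varepsilon_o}\partial_x^{j}\bigr]u\bigr\|_s
\le C\,j!^{\mu}\sum_{0\ne\gamma\le j}\frac{B^{|\gamma|+1}}{(j-\gamma)!^{\mu}}\,\|\langle x\rangle^{\varepsilon_o}\partial_x^{j-\gamma}u\|_s. \tag{4.36}
\]

Proof. See the appendix.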

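Theorem 4.15. Let $P = p(x, D) \in OP\Gamma^{m,\delta}_{\mu}$ satisfy the assumptions of Theorem 4.2. Assume, moreover, that $f \in \mathcal{S}(\mathbb{R}^n)$ is such that $\|f\|_{\{0,\mu;T',\sigma\}} < +\infty$ for some $\mu \ge 1$, $T' > 0$ and let $u \in \mathcal{S}'(\mathbb{R}^n)$ be a solution of equation (4.1). Then there exists a $T > 0$ such that $\|u\|_{\{0,\mu;T\}} < +\infty$.

Proof. As in the proof of Theorem 4.4, we know that $u$ is actually in $\mathcal{S}(\mathbb{R}^n)$. By (4.35), we can write
\[
\sum_{|j|\le N}\frac{T^{|j|}}{j!^{\mu}}\,\|\partial_x^{j}u\|_{L^2}
\le \sum_{|j|\le N}\frac{T^{|j|}}{j!^{\mu}}\,\|E(\partial_x^{j}f)\|_{L^2}
+ \sum_{|j|\le N}\frac{T^{|j|}}{j!^{\mu}}\,\|R(\partial_x^{j}u)\|_{L^2}
+ \sum_{0<|j|\le N}\frac{T^{|j|}}{j!^{\mu}}\,\|E[P, \partial_x^{j}]u\|_{L^2}.
\]
[…] $> n/2$. We have the following result.

Theorem 4.16. Let $P = p(x, D) \in OP\Gamma^{m,\delta}_{\mu}$ satisfy the assumptions of Theorem 4.3. Let $u$ be a solution of (4.2) with $\langle x\rangle^{\varepsilon_o}u \in H^s(\mathbb{R}^n)$, $s > n/2$ for some $\varepsilon_o > \sigma/(\ell-1)$ and assume, moreover, that $\|f\|_{\{s,\mu;T',\sigma+\varepsilon_o\}} < +\infty$ for some $\mu \ge 1$, $T' > 0$. Then there exists a $T > 0$ such that $\|u\|_{\{s,\mu;T,\varepsilon_o\}} < +\infty$.

Lemma 4.17. Under the assumptions of Theorem 4.16 the estimate
\[
\sum_{|j|\le N}\frac{T^{|j|}}{j!^{\mu}}\,\bigl\|E\bigl(\langle x\rangle^{\varepsilon_o}\partial_x^{j}u^{\ell}\bigr)\bigr\|_s
\le C_s'\Bigl(\bigl\|\langle x\rangle^{\frac{\varepsilon_o+\sigma}{\ell}}u\bigr\|_s^{\ell} + T\,\bigl(E^{s,\mu;T,\varepsilon_o}_{N-1}[u]\bigr)^{\ell}\Bigr) \tag{4.41}
\]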

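holds true.

Proof. Let $j \in \mathbb{Z}^n_+$, $j \ne 0$. Then $j_q \ne 0$ for some $q \in \{1, \ldots, n\}$. By (4.28), since $m \ge 1$, we have
\[
\begin{aligned}
\bigl\|E\bigl(\langle x\rangle^{\varepsilon_o}\partial_x^{j}u^{\ell}\bigr)\bigr\|_s
&\le C_s\,\|\langle x\rangle^{\varepsilon_o+\sigma}\partial_x^{j}u^{\ell}\|_{s-m} \\
&\le C_s\,\bigl\|\partial_{x_q}\bigl(\langle x\rangle^{\varepsilon_o+\sigma}\partial_x^{j-e_q}u^{\ell}\bigr)\bigr\|_{s-m}
+ C_s\,\bigl\|\bigl[\langle x\rangle^{\varepsilon_o+\sigma}, \partial_{x_q}\bigr]\partial_x^{j-e_q}u^{\ell}\bigr\|_s \\
&\le C_s\,\|\langle x\rangle^{\varepsilon_o+\sigma}\partial_x^{j-e_q}u^{\ell}\|_s
+ C_s\,\bigl\|\bigl[\langle x\rangle^{\varepsilon_o+\sigma}, \partial_{x_q}\bigr]\partial_x^{j-e_q}u^{\ell}\bigr\|_s.
\end{aligned}
\]
Since we can estimate
\[
\bigl\|\bigl[\langle x\rangle^{\varepsilon_o+\sigma}, \partial_{x_q}\bigr]\partial_x^{j-e_q}u^{\ell}\bigr\|_s
\le C_s'\,\|\langle x\rangle^{\varepsilon_o+\sigma-1}\partial_x^{j-e_q}u^{\ell}\|_s,
\]
we obtain
\[
\bigl\|E\bigl(\langle x\rangle^{\varepsilon_o}\partial_x^{j}u^{\ell}\bigr)\bigr\|_s
\le C_s''\,\|\langle x\rangle^{\varepsilon_o+\sigma}\partial_x^{j-e_q}u^{\ell}\|_s.
\]
Now, applying the Leibniz formula, we can write
\[
\langle x\rangle^{\varepsilon_o+\sigma}\partial_x^{j-e_q}u^{\ell}
= \sum_{j_1+\cdots+j_\ell=j-e_q}\frac{(j-e_q)!}{j_1!\cdots j_\ell!}\,
\langle x\rangle^{\varepsilon_o}\partial_x^{j_1}u\,\prod_{k=2}^{\ell}\langle x\rangle^{\frac{\sigma}{\ell-1}}\partial_x^{j_k}u.
\]
Then, since $\mu \ge 1$, we obtain
\[
\begin{aligned}
\frac{T^{|j|}}{j!^{\mu}}\,\|\langle x\rangle^{\varepsilon_o+\sigma}\partial_x^{j-e_q}u^{\ell}\|_s
&\le C_s\,T\,\frac{(j-e_q)!}{j!^{\mu}}\sum_{j_1+\cdots+j_\ell=j-e_q}\frac{1}{(j_1!\cdots j_\ell!)^{1-\mu}}\,
\frac{T^{|j_1|}}{j_1!^{\mu}}\,\|\langle x\rangle^{\varepsilon_o}\partial_x^{j_1}u\|_s
\prod_{k=2}^{\ell}\frac{T^{|j_k|}}{j_k!^{\mu}}\,\bigl\|\langle x\rangle^{\frac{\sigma}{\ell-1}}\partial_x^{j_k}u\bigr\|_s \\
&\le C_s\,T\sum_{j_1+\cdots+j_\ell=j-e_q}
\frac{T^{|j_1|}}{j_1!^{\mu}}\,\|\langle x\rangle^{\varepsilon_o}\partial_x^{j_1}u\|_s
\prod_{k=2}^{\ell}\frac{T^{|j_k|}}{j_k!^{\mu}}\,\bigl\|\langle x\rangle^{\frac{\sigma}{\ell-1}}\partial_x^{j_k}u\bigr\|_s,
\end{aligned}
\]
applying Schauder's lemma and using the condition $\varepsilon_o > \sigma/(\ell-1)$. Using the last estimate, summing up over $j$ we obtain (4.41).

Proof of Theorem 4.16. First observe that by an inductive argument similar to the one adopted in Lemma 4.11, we have that $\langle x\rangle^{\varepsilon_o}\partial_x^{j}u \in H^s(\mathbb{R}^n)$ for every $j \in \mathbb{Z}^n_+$. Then, arguing as in the proof of Theorem 4.15, we obtain that
\[
E^{s,\mu;T,\varepsilon_o}_{N}[u]
\le C_s\bigl(\|\langle x\rangle^{\varepsilon_o+\sigma}u\|_s + T\,E^{s,\mu;T,\varepsilon_o}_{N-1}[u] + \|f\|_{\{s,\mu;T,\sigma+\varepsilon_o\}}\bigr)
+ \sum_{0\ne|j|\le N}\frac{T^{|j|}}{j!^{\mu}}\,\bigl\|E\bigl(\langle x\rangle^{\varepsilon_o}\partial_x^{j}u^{\ell}\bigr)\bigr\|_s.
\]
Then, applying Lemma 4.17, we get, for any $T \le \min\{B^{-1}, T'\}$,
\[
E^{s,\mu;T,\varepsilon_o}_{N}[u]
\le C_s\Bigl(\|\langle x\rangle^{\varepsilon_o+\sigma}u\|_s + \bigl\|\langle x\rangle^{\frac{\varepsilon_o+\sigma}{\ell}}u\bigr\|_s^{\ell}
+ T\,E^{s,\mu;T,\varepsilon_o}_{N-1}[u] + T\,\bigl(E^{s,\mu;T,\varepsilon_o}_{N-1}[u]\bigr)^{\ell} + \|f\|_{\{s,\mu;T,\varepsilon_o+\sigma\}}\Bigr),
\]
from which we obtain that $\|u\|_{\{s,\mu;T,\varepsilon_o\}} < +\infty$.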

Similarly as for the linear case, Theorem 4.3 can be easily obtained by combining Theorems 4.10 and 4.16. We leave the details to the reader.

4.4. The case of ordinary differential operators

In this section, we apply the results obtained in the previous sections to a class of ordinary differential operators including (4.17) as an example. Consider the operator
\[
P = \frac{1}{\kappa^{m}(x)}\Bigl[\Bigl(\kappa(x)\frac{d}{dx}\Bigr)^{m} + a_1(x)\Bigl(\kappa(x)\frac{d}{dx}\Bigr)^{m-1} + \cdots + a_m(x)\Bigr]. \tag{4.42}
\]
The hypotheses on the coefficients of $P$ are as follows: $\kappa(x)$ is even, $\kappa(x) > 0$ for all $x \in \mathbb{R}$, and there exist $C_o, \kappa_o > 0$ such that, for $\mu \ge 1$, $0 < \delta < 1$,
\[
|D_x^{j}\kappa(x)| \le C_o^{j+1}(j!)^{\mu}\langle x\rangle^{\delta-j}, \quad x \in \mathbb{R},\ j \in \mathbb{Z}_+, \tag{4.43}
\]
\[
\kappa(x) = \kappa_o|x|^{\delta}(1 + o(1)) \quad\text{as } x \to \pm\infty. \tag{4.44}
\]
As for $a_j(x)$, $j = 1, \ldots, m$, we assume that these coefficients satisfy estimates of type (4.43) with $\delta = 0$ and $a_j(x) = a^{\pm}_{j0} + o(1)$, $a^{\pm}_{j0} \in \mathbb{C}$, as $x \to \pm\infty$. It is easy to prove that $P$ can be rewritten as
\[
P = i^{m}\bigl(D_x^{m} + b_1(x)D_x^{m-1} + \cdots + b_m(x)\bigr), \tag{4.45}
\]
where, for $j = 1, \ldots, m$,
\[
|D_x^{k}b_j(x)| \le C^{k+1}(k!)^{\mu}\langle x\rangle^{-j\delta-k}
\]
and
\[
b_j(x) = (-i)^{j}a_j(x)\kappa^{-j}(x) + O\bigl(\langle x\rangle^{-j\delta-1}\bigr),
\]
so that
\[
b_j(x) = b^{\pm}_{j0}|x|^{-j\delta}(1 + o(1)) \quad\text{as } x \to \pm\infty, \tag{4.46}
\]
where $b^{\pm}_{j0} = (-i)^{j}a^{\pm}_{j0}\kappa_o^{-j}$. We now consider the two algebraic equations
\[
L^{\pm}(\lambda) = \lambda^{m} + b^{\pm}_{10}\lambda^{m-1} + \cdots + b^{\pm}_{m0} = 0
\]
and assume that $\operatorname{Im}\lambda \ne 0$ for every $\lambda$ such that $L^{\pm}(\lambda) = 0$.

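Proposition 4.18. Under the previous assumptions, disregarding the factor $i^{m}$ in (4.45), we consider $P$ in (4.42) as a pseudodifferential operator with symbol
\[
p(x, \xi) = \xi^{m} + b_1(x)\xi^{m-1} + \cdots + b_m(x).
\]
Then $p(x, \xi)$ is a globally hypoelliptic symbol in $\Gamma^{m,\delta}_{\mu}$ that satisfies (4.13), (4.14) with $\sigma = m\delta$.

Proof. First observe that
\[
c\,\langle\xi\rangle^{m}\langle x\rangle^{-m\delta} \le |p(x, \xi)| \le C\,\langle\xi\rangle^{m} \quad\text{for } |x| + |\xi| \ge R \tag{4.47}
\]
for some positive constants $C$, $c$, $R$. The second estimate is obvious. To prove the estimate in the left-hand side, observe that under our assumptions $|L^{\pm}(\lambda)| \ge c\,(1 + |\lambda|^{m})$ for real $\lambda$, hence, setting $p^{\pm}_o(x, \xi) = \langle x\rangle^{-m\delta}L^{\pm}(\langle x\rangle^{\delta}\xi)$,
\[
|p^{\pm}_o(x, \xi)| = \frac{1}{\langle x\rangle^{m\delta}}\,\bigl|L^{\pm}(\langle x\rangle^{\delta}\xi)\bigr| \ge c\,\frac{1 + \langle x\rangle^{m\delta}|\xi|^{m}}{\langle x\rangle^{m\delta}}. \tag{4.48}
\]
Argue first for the region $x > 0$. Write there
\[
p(x, \xi) = p^{+}_o(x, \xi) + \bigl(p(x, \xi) - p^{+}_o(x, \xi)\bigr).
\]
In view of (4.46), given $\varepsilon > 0$, for $x > R$ we can estimate
\[
\bigl|p(x, \xi) - p^{+}_o(x, \xi)\bigr| \le \varepsilon\,\frac{1 + \langle x\rangle^{m\delta}|\xi|^{m}}{\langle x\rangle^{m\delta}}.
\]
Applying (4.48) and taking $\varepsilon > 0$ sufficiently small, we get for a new constant $c > 0$
\[
|p(x, \xi)| \ge c\,\frac{1 + \langle x\rangle^{m\delta}|\xi|^{m}}{\langle x\rangle^{m\delta}} \quad\text{for } x > R,\ \xi \in \mathbb{R}. \tag{4.49}
\]
Arguing similarly for $x < 0$, we obtain the same estimate for $x < -R$. On the other hand, for $|x| \le R$, the estimates (4.49) are trivial provided $|\xi|$ is large, so we have proved (4.49) for $|x| + |\xi| \ge R$. At this moment we observe that
\[
\frac{1 + \langle x\rangle^{m\delta}|\xi|^{m}}{\langle x\rangle^{m\delta}} \ge c\,\langle x\rangle^{-m\delta}\langle\xi\rangle^{m}
\]
for a suitable constant $c > 0$ depending only on $m$, and we get the left-hand side of (4.47). So we have proved that $p$ satisfies (4.13) with $\sigma = m\delta$. It remains to check the hypoellipticity condition (4.14). We first estimate
\[
|\partial_\xi p(x, \xi)| \le C\bigl(|\xi|^{m-1} + \langle x\rangle^{-\delta}|\xi|^{m-2} + \cdots + \langle x\rangle^{-(m-1)\delta}\bigr). \tag{4.50}
\]
To proceed, it is convenient to use an equivalent version of (4.49) for $|x| + |\xi| \ge R$, namely
\[
|p(x, \xi)| \ge c\sum_{j=0}^{m}H_j(x, \xi), \tag{4.51}
\]
where $H_j(x, \xi) = \langle x\rangle^{-(m-j)\delta}|\xi|^{j}$, which easily follows from the previous arguments. Let us estimate the generic term in the right-hand side of (4.50). We have to prove that
\[
\langle x\rangle^{-(m-j)\delta}|\xi|^{j-1} \le C\,|p(x, \xi)|\,\langle\xi\rangle^{-1}\langle x\rangle^{\delta}, \quad j = 1, \ldots, m. \tag{4.52}
\]
Arguing for small $|\xi|$, we observe that
\[
\langle x\rangle^{-(m-j)\delta}|\xi|^{j-1} \le C\,H_{j-1}(x, \xi)\,\langle\xi\rangle^{-1}\langle x\rangle^{\delta},
\]
and in view of (4.51) we obtain (4.52). For large $|\xi|$, we use the inequality
\[
\langle x\rangle^{-(m-j)\delta}|\xi|^{j-1} \le C\,H_j(x, \xi)\,\langle\xi\rangle^{-1}\langle x\rangle^{\delta},
\]
and again in view of (4.51), we deduce (4.52). We leave similar estimates of the other derivatives to the reader.

We may then construct a parametrix for $P$ in (4.42). Then, for $n = 1$, Theorems 4.2, 4.3 apply to (4.42) under the assumptions (4.43), (4.44). To be specific, for the solutions $y(x)$, $x \in \mathbb{R}$, of the semilinear homogeneous equation (i.e., $f = 0$) we obtain the estimates
\[
|y^{(\alpha)}(x)| \le C^{|\alpha|+1}(\alpha!)^{\mu}\,e^{-\varepsilon|x|^{1-\delta}}, \quad x \in \mathbb{R}.
\]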
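The step in the proof of Proposition 4.18 that passes from (4.49) to the left-hand side of (4.47) rests on an elementary comparison between $1 + |\xi|^{m}$ and $\langle\xi\rangle^{m}$. A short derivation is sketched below; it is added for illustration and assumes the usual convention $\langle t\rangle = (1 + |t|^{2})^{1/2}$, so that $\langle x\rangle \ge 1$. The explicit constant $2^{-m/2}$ is one admissible choice, not one stated in the source.

```latex
% Comparison of 1 + |\xi|^m with \langle\xi\rangle^m, then use of \langle x\rangle \ge 1.
\begin{align*}
\langle\xi\rangle^{m} = (1+|\xi|^{2})^{m/2}
 &\le \bigl(2\max\{1,|\xi|^{2}\}\bigr)^{m/2}
  = 2^{m/2}\max\{1,|\xi|^{m}\}
  \le 2^{m/2}\,\bigl(1+|\xi|^{m}\bigr), \\
\frac{1+\langle x\rangle^{m\delta}|\xi|^{m}}{\langle x\rangle^{m\delta}}
 &\ge \frac{1+|\xi|^{m}}{\langle x\rangle^{m\delta}}
  \ge 2^{-m/2}\,\langle x\rangle^{-m\delta}\,\langle\xi\rangle^{m}.
\end{align*}
```

For $m = 1, 2$ the comparison holds with constant $1$; for $m \ge 3$ a constant smaller than $1$ is needed (take $|\xi| = 1$), which only changes the value of $c$ in (4.47).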

We notice that in the particular case in which the coefficients $a_j$ in (4.42) are constant the operator $P$, besides being globally hypoelliptic, admits even a left inverse $P^{-1}$. We also notice that the example (4.17) in the introduction is included in the class described in this section. The same conclusions then apply to (4.17) with $\delta = 2\gamma - 1$ as we observed in the introduction. To conclude this section, let us write down the solutions of (4.19) and check on them that the assumption on $\varepsilon_o$ in Theorem 4.3 is sharp in this case. In fact, the ordinary differential equation
\[
y' + x(1 + x^{2})^{-\gamma}y = y^{\ell}, \quad \ell \ge 2,\ \tfrac{1}{2} < \gamma < 1,
\]
is a Bernoulli equation which we can treat explicitly. Namely, let us write
\[
\psi(x) = -\frac{(\ell-1)\,(1 + x^{2})^{1-\gamma}}{2(1-\gamma)} \quad\text{and}\quad A = (\ell-1)\int_{0}^{+\infty}e^{\psi(x)}\,dx.
\]
Fixing for simplicity attention on solutions $y(x)$ for which $y(0) = y_o > 0$, we have
\[
y(x) = \Bigl(\frac{e^{\psi(x)}}{\lambda + (\ell-1)\int_{x}^{+\infty}e^{\psi(t)}\,dt}\Bigr)^{\frac{1}{\ell-1}}, \tag{4.53}
\]
where $\lambda = y_o^{1-\ell}e^{\psi(0)} - A$. Here and in the following, roots are defined to be positive for positive numbers, with continuous extension to the complex domain, i.e., we take principal branches. To study the behavior of the solutions, let us observe that
\[
E(x) = (\ell-1)\int_{x}^{+\infty}e^{\psi(t)}\,dt
\]
is positive and decreasing on the real axis, where
\[
\lim_{x\to-\infty}E(x) = 2A, \quad E(0) = A, \quad \lim_{x\to+\infty}E(x) = 0, \tag{4.54}
\]
having the asymptotic expansion as $x \to +\infty$
\[
E(x) = e^{\psi(x)}\bigl(x^{2\gamma-1} + o(1)\bigr). \tag{4.55}
\]

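We may easily prove (4.55) by applying the classical de l'Hôpital rule. Let us test Theorem 4.3 on (4.53). We distinguish three cases.

• $\bigl(\frac{A}{e^{\psi(0)}}\bigr)^{1/(1-\ell)} < y_o < +\infty$, i.e., $-A < \lambda < 0$. Then the solution $y(x)$ blows up at the point $x_o > 0$ uniquely defined by $E(x_o) = -\lambda$, cf. (4.54).

• $y_o = \bigl(\frac{A}{e^{\psi(0)}}\bigr)^{1/(1-\ell)}$, i.e., $\lambda = 0$. Then the solution $y(x) = \bigl(\frac{e^{\psi(x)}}{E(x)}\bigr)^{\frac{1}{\ell-1}}$ is well defined and real-analytic on $\mathbb{R}$. The decay at $-\infty$ is sub-exponential, whereas from (4.55) we get
\[
y(x) \sim x^{\frac{1-2\gamma}{\ell-1}} \quad\text{as } x \to +\infty.
\]
Note that $y(x)$ is then homoclinic, in the sense that $\lim_{x\to\pm\infty}y(x) = 0$, but Theorem 4.3 cannot be applied, since for $\varepsilon_o > \sigma/(\ell-1)$, with $\sigma = 2\gamma - 1$ in the present case, we have $\langle x\rangle^{\varepsilon_o}y(x) \notin L^{\infty}(\mathbb{R})$, hence $\langle x\rangle^{\varepsilon_o}y(x) \notin H^{s}(\mathbb{R}) \subset L^{\infty}(\mathbb{R})$ for $s > n/2$.

• $0 < y_o < \bigl(\frac{A}{e^{\psi(0)}}\bigr)^{1/(1-\ell)}$, i.e., $\lambda > 0$. In this case, since $0 < \lambda < \lambda + E(x) < \lambda + 2A$ in view of (4.54), the solution $y(x)$ is well defined and real-analytic in $\mathbb{R}$ and
\[
0 < y(x) < \lambda^{1/(1-\ell)}\,e^{\psi(x)/(\ell-1)} \le c_1\,e^{-c_2|x|^{2-2\gamma}}
\]
for positive constants $c_1$, $c_2$. A similar sub-exponential bound is satisfied by $y'(x)$, hence $\langle x\rangle^{\varepsilon_o}y(x) \in H^{1}(\mathbb{R})$ for every $\varepsilon_o \in \mathbb{R}$. Therefore, Theorem 4.3 applies and gives the more precise information $y \in S^{1}_{1/(2-2\gamma)}(\mathbb{R})$.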
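For the reader's convenience, the explicit formula (4.53) and the leading behaviour of $E(x)$ can be checked by the standard Bernoulli substitution. The derivation below is a sketch added for illustration; the auxiliary function $v = y^{1-\ell}$ is introduced here and does not appear in the original text.

```latex
% Bernoulli substitution v = y^{1-\ell} for  y' + x(1+x^2)^{-\gamma} y = y^{\ell}.
\begin{align*}
v' &= (1-\ell)\,y^{-\ell}\,y'
    = (1-\ell) + (\ell-1)\,x(1+x^{2})^{-\gamma}\,v, \\
\psi'(x) &= -(\ell-1)\,x(1+x^{2})^{-\gamma}
 \quad\Longrightarrow\quad
 \bigl(e^{\psi}v\bigr)' = e^{\psi}\,(v' + \psi' v) = -(\ell-1)\,e^{\psi}, \\
e^{\psi(x)}\,v(x) &= \lambda + (\ell-1)\int_{x}^{+\infty} e^{\psi(t)}\,dt
 = \lambda + E(x), \\
y(x) &= v(x)^{-\frac{1}{\ell-1}}
 = \Bigl(\frac{e^{\psi(x)}}{\lambda + E(x)}\Bigr)^{\frac{1}{\ell-1}},
 \qquad \lambda = y_o^{1-\ell}\,e^{\psi(0)} - A .
\end{align*}
% Leading behaviour of E(x) as x \to +\infty via de l'Hopital (both terms tend to 0):
\[
\lim_{x\to+\infty}\frac{E(x)}{e^{\psi(x)}\,x^{2\gamma-1}}
 = \lim_{x\to+\infty}\frac{-(\ell-1)}{\psi'(x)\,x^{2\gamma-1} + (2\gamma-1)\,x^{2\gamma-2}}
 = 1 .
\]
```

The value of $\lambda$ follows by evaluating the third line at $x = 0$, where $E(0) = A$. The limit uses $\psi'(x)\,x^{2\gamma-1} \to -(\ell-1)$ and $x^{2\gamma-2} \to 0$ for $\gamma < 1$; it yields the ratio form $E(x) = e^{\psi(x)}x^{2\gamma-1}(1 + o(1))$, which is what the asymptotics $y(x) \sim x^{(1-2\gamma)/(\ell-1)}$ in the case $\lambda = 0$ requires.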


Appendix: Pseudodifferential operators on Gelfand–Shilov spaces

In the sequel, we will use the notation
\[
e_1 = (1, 0), \quad e_2 = (0, 1), \quad e = (1, 1).
\]
Moreover, we will denote as standard $D_x^{\alpha} = (-i)^{|\alpha|}\partial_x^{\alpha}$ for all $\alpha \in \mathbb{Z}^n_+$. Let $m = (m_1, m_2) \in \mathbb{R}^2$ and let $\mu$, $\nu$ be real numbers such that $\mu \ge 1$, $\nu \ge 1$. Let also $\rho_1$, $\rho_2$, $\delta_1$, $\delta_2$ be real numbers with $0 \le \delta_j < \rho_j \le 1$, $j = 1, 2$, and denote $\bar\rho = (\rho_1, \rho_2)$, $\bar\delta = (\delta_1, \delta_2)$.

Definition A.1. We shall denote by $\Gamma^{m,\bar\rho,\bar\delta}_{\nu,\mu}$ the space of all functions $p(x, \xi) \in C^{\infty}(\mathbb{R}^{2n})$ satisfying the following condition: there exists a positive constant $C$ such that
\[
\bigl|\partial_\xi^{\alpha}\partial_x^{\beta}p(x, \xi)\bigr| \le C^{|\alpha|+|\beta|+1}(\alpha!)^{\nu}(\beta!)^{\mu}\,\langle\xi\rangle^{m_1-\rho_1|\alpha|+\delta_1|\beta|}\,\langle x\rangle^{m_2-\rho_2|\beta|+\delta_2|\alpha|} \tag{A.1}
\]
for every $(x, \xi) \in \mathbb{R}^{2n}$ and $\alpha, \beta \in \mathbb{Z}^n_+$. We will denote by $OP\Gamma^{m,\bar\rho,\bar\delta}_{\nu,\mu}$ the space of all operators (A.2) with symbol in $\Gamma^{m,\bar\rho,\bar\delta}_{\nu,\mu}$.

We shall denote by $\Gamma^{0,\bar\rho,\bar\delta}_{\nu,\mu}$ the class $\Gamma^{(0,0),\bar\rho,\bar\delta}_{\nu,\mu}$. Given $p \in \Gamma^{m,\bar\rho,\bar\delta}_{\nu,\mu}$, we can consider the pseudodifferential operator defined as standard by
\[
Pu(x) = p(x, D)u(x) = (2\pi)^{-n}\int_{\mathbb{R}^n}e^{i\langle x,\xi\rangle}p(x, \xi)\hat u(\xi)\,d\xi, \quad u \in \mathcal{S}(\mathbb{R}^n), \tag{A.2}
\]
where $\hat u$ denotes the Fourier transform of $u$.

Remark A.2. In this appendix, we shall construct a calculus for the class $OP\Gamma^{m,\bar\rho,\bar\delta}_{\nu,\mu}$ on the Gelfand–Shilov spaces $S^{\mu}_{\nu}(\mathbb{R}^n)$. First of all we notice that the class $\Gamma^{m,\delta}_{\mu}$ considered in the previous sections corresponds in the notation of this section to the class $\Gamma^{me_1,e,\delta e_2}_{1,\mu}$, so all the results presented here apply to $\Gamma^{m,\delta}_{\mu}$. We observe that most of the results in the sequel can be proved following the same arguments used in other similar contexts, cf. [9], [12], [21]. For this reason some proofs will be just sketched or omitted for the sake of brevity.

We start by giving a continuity theorem between Sobolev spaces for operators from $OP\Gamma^{m,\bar\rho,\bar\delta}_{\nu,\mu}$ which gives precise factorial estimates for the norm of the operators. This is an obvious consequence of the Weyl–Hörmander calculus, see [39].

Theorem A.3. Given $p \in \Gamma^{0,\bar\rho,\bar\delta}_{\nu,\mu}$, the operator $p(x, D)$ defined by (A.2) is linear and continuous from $H^{s}(\mathbb{R}^n)$ to $H^{s}(\mathbb{R}^n)$ for every $s \in \mathbb{R}$ and
\[
\|p(x, D)\|_{\mathcal{L}(H^s,H^s)} \le K\max_{|\alpha|+|\beta|\le N}C^{|\alpha|+|\beta|}(\alpha!)^{\mu}(\beta!)^{\nu},
\]
where $C$ is the constant appearing in (A.1) and the constants $K$, $N$ depend only on $s$ and the dimension $n$.

The next result states the action of the operators defined above between Gelfand–Shilov spaces.

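Theorem A.4. Given $p \in \Gamma^{m,\bar\rho,\bar\delta}_{\nu,\mu}$, the operator $P$ defined by (A.2) is linear and continuous from $S^{\mu'}_{\nu'}(\mathbb{R}^n)$ into itself for any $\mu'$, $\nu'$ with $\mu' \ge \mu/(1-\delta_1)$, $\nu' \ge \nu/(1-\delta_2)$. Furthermore, $P$ can be extended to a linear and continuous map from $(S^{\mu'}_{\nu'}(\mathbb{R}^n))'$ into itself.

Proof. For any $\alpha, \beta \in \mathbb{Z}^n_+$ and for any positive integer $N$, we can write
\[
\begin{aligned}
x^{\alpha}D_x^{\beta}Pu(x)
&= (2\pi)^{-n}x^{\alpha}\sum_{\beta_1+\beta_2=\beta}\frac{\beta!}{\beta_1!\,\beta_2!}\int_{\mathbb{R}^n}e^{i\langle x,\xi\rangle}\xi^{\beta_1}D_x^{\beta_2}p(x, \xi)\hat u(\xi)\,d\xi \\
&= (2\pi)^{-n}x^{\alpha}\langle x\rangle^{-2N}\sum_{\beta_1+\beta_2=\beta}\frac{\beta!}{\beta_1!\,\beta_2!}\int_{\mathbb{R}^n}e^{i\langle x,\xi\rangle}(1-\Delta_\xi)^{N}\bigl[\xi^{\beta_1}D_x^{\beta_2}p(x, \xi)\hat u(\xi)\bigr]\,d\xi.
\end{aligned}
\]
Choose $N = \bigl[\frac{|\alpha|+m_2}{2}\bigr] + 1$. By (4.10), (A.1), and standard factorial inequalities, we obtain
\[
\Bigl|\langle x\rangle^{|\alpha|-2N}(1-\Delta_\xi)^{N}\bigl[\xi^{\beta_1}D_x^{\beta_2}p(x, \xi)\hat u(\xi)\bigr]\Bigr|
\le C^{|\alpha|+|\beta|+1}(\alpha!)^{\nu+\delta_2\nu'}(\beta_1!)^{\mu'}(\beta_2!)^{\mu+\delta_1\mu'}\,e^{-a\langle\xi\rangle^{1/\mu'}}
\]
for some positive constants $C$, $a$. Then, by the conditions $\mu' \ge \mu/(1-\delta_1)$, $\nu' \ge \nu/(1-\delta_2)$, it follows that $P$ is continuous from $S^{\mu'}_{\nu'}(\mathbb{R}^n)$ into itself. By standard arguments we can extend $P$ to the dual space $(S^{\mu'}_{\nu'})'(\mathbb{R}^n)$, cf. [21, Theorem 2.2].

For $t \ge 0$, denote by $Q_t$ the set
\[
Q_t = \{(x, \xi) \in \mathbb{R}^{2n} : \langle\xi\rangle^{\rho_1-\delta_1} < t,\ \langle x\rangle^{\rho_2-\delta_2} < t\}
\]
and by $Q^{e}_t = \mathbb{R}^{2n}\setminus Q_t$ its complement.

Definition A.5. We denote by $FS^{m,\bar\rho,\bar\delta}_{\nu,\mu}$ the space of all formal sums $\sum_{j\ge 0}p_j$ such that $p_j \in C^{\infty}(\mathbb{R}^{2n})$ for $j \ge 0$ and there exist positive constants $B$, $C$ such that for all $j \ge 0$
\[
\bigl|\partial_\xi^{\alpha}\partial_x^{\beta}p_j(x, \xi)\bigr| \le C^{|\alpha|+|\beta|+2j+1}(\alpha!)^{\nu}(\beta!)^{\mu}(j!)^{\mu+\nu-1}\,\langle\xi\rangle^{m_1-\rho_1|\alpha|+\delta_1|\beta|-(\rho_1-\delta_1)j}\,\langle x\rangle^{m_2-\rho_2|\beta|+\delta_2|\alpha|-(\rho_2-\delta_2)j}
\]
for all $\alpha, \beta \in \mathbb{Z}^n_+$ and for all $(x, \xi) \in Q^{e}_{Bj^{\mu+\nu-1}}$.

Definition A.6. We say that two sums $\sum_{j\ge 0}p_j$, $\sum_{j\ge 0}q_j \in FS^{m,\bar\rho,\bar\delta}_{\nu,\mu}$ are equivalent if there exist positive constants $B$, $C$ such that for every $N = 1, 2, \ldots$
\[
\Bigl|\partial_\xi^{\alpha}\partial_x^{\beta}\sum_{j<N}(p_j - q_j)\Bigr| \le C^{|\alpha|+|\beta|+2N+1}(\alpha!)^{\nu}(\beta!)^{\mu}(N!)^{\mu+\nu-1}
\]
[…] $> 0$. Then $p \in S^{\theta}_{\theta}(\mathbb{R}^{2n})$.

[…]
\[
\times \inf_{0\le N\le B\left(\max\{\langle\xi\rangle^{\rho_1-\delta_1},\,\langle x\rangle^{\rho_2-\delta_2}\}\right)^{\frac{1}{\mu+\nu-1}}}
\]
[…]

Remark A.8. Notice that if $R$ is $S^{\mu}_{\nu}$-regularizing, then in particular it is $\mathcal{S}$-regularizing, i.e., it maps $\mathcal{S}'(\mathbb{R}^n)$ into $\mathcal{S}(\mathbb{R}^n)$.

In order to construct a symbol in $\Gamma^{m,\bar\rho,\bar\delta}_{\nu,\mu}$ starting from a formal sum in $FS^{m,\bar\rho,\bar\delta}_{\nu,\mu}$, some restrictions on $\mu$, $\nu$ are necessary. In fact, the following arguments require the use of Gevrey cut-off functions of order $\mu$ and $\nu$. This leads to assuming the non-analyticity condition
\[
\mu > 1, \quad \nu > 1. \tag{A.4}
\]
Hence, the next results of this section hold for analytic symbols of $\Gamma^{m,\bar\rho,\bar\delta}_{1,1}$ only considering them as elements of $\Gamma^{m,\bar\rho,\bar\delta}_{\nu,\mu}$ for some choice of $\mu > 1$, $\nu > 1$. With the same argument used in [9, Theorem 2.14], it is easy to prove the following result.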


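Proposition A.9. Let $\sum_{j\ge 0}p_j \in FS^{m,\bar\rho,\bar\delta}_{\nu,\mu}$, where $\mu > 1$, $\nu > 1$. Then, for every fixed $R > 0$, we can find a sequence of non-negative functions $\varphi_j \in C^{\infty}(\mathbb{R}^{2n})$ satisfying the following conditions:
\[
\varphi_0(x, \xi) = 1 \text{ in } \mathbb{R}^{2n}, \tag{A.5}
\]
\[
\varphi_j(x, \xi) = 0 \text{ in } Q_{2Rj^{\mu+\nu-1}} \quad\text{and}\quad \varphi_j(x, \xi) = 1 \text{ in } Q^{e}_{3Rj^{\mu+\nu-1}}, \tag{A.6}
\]
\[
\sup_{(x,\xi)\in\mathbb{R}^{2n}}\bigl|\partial_\xi^{\alpha}\partial_x^{\beta}\varphi_j(x, \xi)\bigr| \le C^{|\alpha|+|\beta|+1}(\alpha!)^{\nu}(\beta!)^{\mu}\bigl(Rj^{\mu+\nu-1}\bigr)^{-|\alpha|-|\beta|}, \quad j \ge 1, \tag{A.7}
\]
for some positive constant $C$ and such that the function
\[
p(x, \xi) = \sum_{j\ge 0}\varphi_j(x, \xi)p_j(x, \xi) \tag{A.8}
\]
is in $\Gamma^{m,\bar\rho,\bar\delta}_{\nu,\mu}$ and $p \sim \sum_{j\ge 0}p_j$ in $FS^{m,\bar\rho,\bar\delta}_{\nu,\mu}$ for $R$ sufficiently large.

Using the same arguments as in [21], we obtain the following results about the transpose of an operator from $OP\Gamma^{m,\bar\rho,\bar\delta}_{\nu,\mu}$ and the composition of two operators. We omit the proofs for the sake of brevity.

Proposition A.10. Let $P = p(x, D) \in OP\Gamma^{m,\bar\rho,\bar\delta}_{\nu,\mu}$ and let ${}^{t}P$ be its transpose defined by
\[
\langle {}^{t}Pu, v\rangle = \langle u, Pv\rangle, \quad u \in (S^{\mu'}_{\nu'}(\mathbb{R}^n))',\ v \in S^{\mu'}_{\nu'}(\mathbb{R}^n), \tag{A.9}
\]
where $\mu' \ge \mu/(1-\delta_1)$, $\nu' \ge \nu/(1-\delta_2)$ as in Theorem A.4. Then, ${}^{t}P = Q + R$, where $Q = q(x, D)$ is in $OP\Gamma^{m,\bar\rho,\bar\delta}_{\nu,\mu}$ with
\[
q(x, \xi) \sim \sum_{j\ge 0}\ \sum_{|\alpha|=j}(\alpha!)^{-1}\partial_\xi^{\alpha}D_x^{\alpha}p(x, -\xi)
\]
in $FS^{m,\bar\rho,\bar\delta}_{\nu,\mu}$ and $R$ is an $S^{\mu'}_{\nu'}$-regularizing operator for any $\mu'$, $\nu'$ with $\min\{\mu', \nu'\} \ge \frac{\mu+\nu-1}{\min\{\rho_1-\delta_1,\ \rho_2-\delta_2\}}$.

Theorem A.11. Let $p \in \Gamma^{m,\bar\rho,\bar\delta}_{\nu,\mu}$, $q \in \Gamma^{m',\bar\rho,\bar\delta}_{\nu,\mu}$. Then there exists a symbol $s \in \Gamma^{m+m',\bar\rho,\bar\delta}_{\nu,\mu}$ such that $p(x, D)q(x, D) = s(x, D) + R$ for some $S^{\mu'}_{\nu'}$-regularizing operator $R$, with $\mu'$, $\nu'$ as in Proposition A.10. Moreover, if $p \sim \sum_{j\ge 0}p_j$ in $FS^{m,\bar\rho,\bar\delta}_{\nu,\mu}$ and $q \sim \sum_{j\ge 0}q_j$ in $FS^{m',\bar\rho,\bar\delta}_{\nu,\mu}$, then
\[
s(x, \xi) \sim \sum_{j\ge 0}\ \sum_{h+k+|\alpha|=j}\frac{1}{\alpha!}\,\partial_\xi^{\alpha}p_h(x, \xi)\,D_x^{\alpha}q_k(x, \xi) \tag{A.10}
\]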

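in $FS^{m+m',\bar\rho,\bar\delta}_{\nu,\mu}$. Similarly, the commutator $[P, Q] = c(x, D) \in OP\Gamma^{m+m'-\bar\rho+\bar\delta,\bar\rho,\bar\delta}_{\nu,\mu}$ with
\[
c(x, \xi) \sim \sum_{\alpha\ne 0}\frac{1}{\alpha!}\bigl(\partial_\xi^{\alpha}p(x, \xi)\,D_x^{\alpha}q(x, \xi) - \partial_\xi^{\alpha}q(x, \xi)\,D_x^{\alpha}p(x, \xi)\bigr)
\]
in $FS^{m+m'-\bar\rho+\bar\delta,\bar\rho,\bar\delta}_{\nu,\mu}$.

We now formulate the global hypoellipticity conditions in their general form for the class $\Gamma^{m,\bar\rho,\bar\delta}_{\nu,\mu}$.

Definition A.12. A symbol $p \in \Gamma^{m,\bar\rho,\bar\delta}_{\nu,\mu}$ is said to be globally hypoelliptic if there exist $B, C_1, C_2 > 0$ and $m' = (m_1', m_2') \in \mathbb{R}^2$ such that
\[
\inf_{(x,\xi)\in Q^{e}_B}\langle\xi\rangle^{-m_1'}\langle x\rangle^{-m_2'}|p(x, \xi)| = C_1 > 0 \tag{A.11}
\]
and
\[
|\partial_\xi^{\alpha}\partial_x^{\beta}p(x, \xi)| \le C_2^{|\alpha|+|\beta|}(\alpha!)^{\nu}(\beta!)^{\mu}\,|p(x, \xi)|\,\langle\xi\rangle^{-\rho_1|\alpha|+\delta_1|\beta|}\,\langle x\rangle^{-\rho_2|\beta|+\delta_2|\alpha|} \tag{A.12}
\]
for all $\alpha, \beta \in \mathbb{Z}^n_+$ and $(x, \xi) \in Q^{e}_B$.

Proposition A.13. Let $p$ be a globally hypoelliptic symbol in $\Gamma^{m,\bar\rho,\bar\delta}_{\nu,\mu}$. Then, there exists a left parametrix for $P$, i.e., an operator $E$ with symbol in $\Gamma^{-m',\bar\rho,\bar\delta}_{\nu,\mu}$ such that $EP = I + R$, where $I$ is the identity operator and $R$ is an $S^{\mu'}_{\nu'}$-regularizing operator for every $\mu'$, $\nu'$ such that $\min\{\mu', \nu'\} \ge \frac{\mu+\nu-1}{\min\{\rho_1-\delta_1,\ \rho_2-\delta_2\}}$.

Proof. As standard, we construct the symbol $e(x, \xi)$ of $E$ starting from its asymptotic expansion and applying Proposition A.9. Define
\[
e_0(x, \xi) = p(x, \xi)^{-1}(1 - \omega(x, \xi)), \tag{A.13}
\]
where $\omega$ is a Gevrey function of order $\sigma = \min\{\mu, \nu\}$ with compact support such that $\omega = 1$ in a neighborhood of $Q_B$. It is easy to prove by induction on $|\alpha + \beta|$ that
\[
\bigl|\partial_\xi^{\alpha}\partial_x^{\beta}e_0(x, \xi)\bigr| \le C^{|\alpha|+|\beta|}(\alpha!)^{\nu}(\beta!)^{\mu}\,\langle\xi\rangle^{-\rho_1|\alpha|+\delta_1|\beta|}\,\langle x\rangle^{-\rho_2|\beta|+\delta_2|\alpha|}\,|e_0(x, \xi)| \tag{A.14}
\]
for every $(x, \xi) \in Q^{e}_B$ and $\alpha, \beta \in \mathbb{Z}^n_+$. For $j = 1, 2, \ldots$, we can define by induction
\[
e_j(x, \xi) = -e_0(x, \xi)\sum_{0<|\alpha|\le j}\frac{1}{\alpha!}\,\partial_\xi^{\alpha}e_{j-|\alpha|}(x, \xi)\,D_x^{\alpha}p(x, \xi). \tag{A.15}
\]
[…] $> 1$, $\nu' > 1$, $\mu' \ge \mu$. Then, in particular, we deduce that
\[
\sum_{j\ge 0}r_{\alpha\beta j} \in FS^{(-|\alpha|,\,-|\beta|+\delta|\alpha|),\,e,\,\delta e_2}_{\nu',\mu'}. \tag{A.17}
\]
Finally, we apply Proposition A.9, taking in (A.8) cut-off functions $\varphi_j(x, \xi)$ independent of $\alpha$, $\beta$, and obtain (A.16).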


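Proof of Lemma 4.8. Estimate (4.28) is obvious by the previous arguments. Concerning (4.29), we can write
\[
\begin{aligned}
x^{k}Pu(x) &= (2\pi)^{-n}\int_{\mathbb{R}^n}e^{i\langle x,\xi\rangle}x^{k}p(x, \xi)\hat u(\xi)\,d\xi \\
&= (2\pi)^{-n}(-1)^{|k|}\int_{\mathbb{R}^n}e^{i\langle x,\xi\rangle}D_\xi^{k}\bigl(p(x, \xi)\hat u(\xi)\bigr)\,d\xi \\
&= (2\pi)^{-n}\int_{\mathbb{R}^n}e^{i\langle x,\xi\rangle}\sum_{\beta\le k}\binom{k}{\beta}(-D_\xi)^{\beta}p(x, \xi)\,(-D_\xi)^{k-\beta}\hat u(\xi)\,d\xi \\
&= \sum_{\beta\le k}\binom{k}{\beta}\bigl((-D_\xi)^{\beta}p\bigr)(x, D)(x^{k-\beta}u).
\end{aligned}
\]
Hence,
\[
\frac{1}{|k|!^{\nu}}\,E[P, x^{k}]u = -\frac{1}{|k|!^{\nu}}\sum_{0\ne\beta\le k}\binom{k}{\beta}E\bigl((-D_\xi)^{\beta}p\bigr)(x, D)(x^{k-\beta}u) \tag{A.18}
\]
and, therefore,
\[
\frac{1}{|k|!^{\nu}}\,\|E[P, x^{k}]u\|_s \le \frac{1}{|k|!^{\nu}}\sum_{0\ne\beta\le k}\binom{k}{\beta}\bigl\|E(\partial_\xi^{\beta}p)(x, D)\langle x\rangle^{-\delta|\beta|}\bigl(\langle x\rangle^{\delta|\beta|}x^{k-\beta}u\bigr)\bigr\|_s.
\]
Now observe that
\[
\bigl\|E(\partial_\xi^{\beta}p)(x, D)\langle x\rangle^{-\delta|\beta|}\bigr\|_{\mathcal{L}(H^s,H^s)} \le C^{|\beta|+1}\beta!. \tag{A.19}
\]
In fact, from Lemma A.16 we know that $E(\partial_\xi^{\beta}p)(x, D) = r_\beta(x, D)$ with $r_\beta(x, \xi)$ satisfying
\[
|\partial_\xi^{\theta}\partial_x^{\gamma}r_\beta(x, \xi)| \le C^{|\beta|+1}\beta!\,\langle\xi\rangle^{-|\beta|-|\theta|}\langle x\rangle^{-|\gamma|+\delta|\beta|+\delta|\theta|} \tag{A.20}
\]
for every $\theta, \gamma \in \mathbb{Z}^n_+$ and for some constant $C = C(\theta, \gamma, s) > 0$. Then we consider the operator $r_\beta(x, D)\langle x\rangle^{-\delta|\beta|}$ and its transpose, cf. Proposition A.10, with symbol given by $s_\beta(x, \xi) = \langle x\rangle^{-\delta|\beta|}\tilde r_\beta(x, \xi)$, with $\tilde r_\beta(x, D) = {}^{t}r_\beta(x, D)$. It is easy to see that also $\tilde r_\beta$ satisfies (A.20) and hence
\[
|\partial_\xi^{\theta}\partial_x^{\gamma}s_\beta(x, \xi)| \le C_{\theta\gamma}\,C^{|\beta|}\beta!\,\langle\xi\rangle^{-|\beta|-|\theta|}\langle x\rangle^{-|\gamma|+\delta|\theta|}.
\]
Then, by Theorem A.3, $\|s_\beta(x, D)\|_{\mathcal{L}(H^s,H^s)} \le KC^{|\beta|}\beta!$, where $K = \max_{|\theta|+|\gamma|\le N}C_{\theta\gamma}$, and we deduce (A.19). Summing up, we obtain
\[
\frac{1}{|k|!^{\nu}}\,\|E[P, x^{k}]u\|_s \le \frac{1}{|k|!^{\nu}}\sum_{0\ne\beta\le k}\binom{k}{\beta}\beta!\,A_s^{|\beta|+1}\,\|\langle x\rangle^{\delta|\beta|}x^{k-\beta}u\|_s
\]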


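for some constant $A_s$ depending only on $s$ and on the dimension $n$. Then we conclude observing that
\[
\binom{k}{\beta}\beta!\,\frac{1}{|k|!^{\nu}} \le \frac{1}{(|k|(|k|-1)\cdots(|k-\beta|+1))^{\nu-1}\,|k-\beta|!^{\nu}}.
\]

Proof of Lemma 4.14. Assume initially $\varepsilon_o = 0$. To deal with $[P, \partial_x^{j}]$, we write
\[
\partial_x^{j}P = \sum_{\gamma\le j}\binom{j}{\gamma}(\partial_x^{\gamma}p)(x, D)\,\partial_x^{j-\gamma}.
\]
Hence,
\[
E[P, \partial_x^{j}]u = -\sum_{0\ne\gamma\le j}\binom{j}{\gamma}E(\partial_x^{\gamma}p)(x, D)\,\partial_x^{j-\gamma}u.
\]
Applying Theorem A.3 and Lemma A.16, we obtain
\[
\|E[P, \partial_x^{j}]u\|_s \le \sum_{0\ne\gamma\le j}\binom{j}{\gamma}C^{|\gamma|+1}(\gamma!)^{\mu}\,\|\partial_x^{j-\gamma}u\|_s,
\]
from which we deduce (4.36) for $\varepsilon_o = 0$. The case $\varepsilon_o > 0$ is treated similarly. We leave the details to the reader.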
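The factorial inequality that closes the proof of Lemma 4.8 can be verified in two lines. The derivation below is a sketch added for illustration; it uses only the multi-index conventions $k! = k_1!\cdots k_n!$ and $|k-\beta| = |k| - |\beta|$ for $\beta \le k$, together with the elementary bounds $\binom{k}{\beta} \le \binom{|k|}{|\beta|}$ and $\beta! \le |\beta|!$.

```latex
% Step 1: the multi-index falling factorial is dominated by the scalar one,
%         since \binom{k}{\beta}\beta! \le \binom{|k|}{|\beta|}\,|\beta|! .
% Step 2: divide by |k|!^{\nu} and regroup the factorials.
\begin{align*}
\binom{k}{\beta}\beta! = \frac{k!}{(k-\beta)!}
 &= \prod_{i=1}^{n} k_i(k_i-1)\cdots(k_i-\beta_i+1)
 \le |k|(|k|-1)\cdots(|k-\beta|+1) = \frac{|k|!}{|k-\beta|!}, \\
\frac{1}{|k|!^{\nu}}\binom{k}{\beta}\beta!
 &\le \frac{1}{|k|!^{\nu-1}\,|k-\beta|!}
  = \frac{1}{\bigl(|k|(|k|-1)\cdots(|k-\beta|+1)\bigr)^{\nu-1}\,|k-\beta|!^{\nu}} .
\end{align*}
```

The last equality follows from writing $|k|!^{\nu-1} = \bigl(|k|!/|k-\beta|!\bigr)^{\nu-1}\,|k-\beta|!^{\nu-1}$, which is exactly the form of the weight appearing in (4.29).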



Acknowledgment

The author thanks Prof. D. Bahns and Prof. I. Witt for the invitation to visit the University of Göttingen, for the support, and for the help in the final stages of the preparations of the paper.

References

[1] S. Agmon, Lectures on exponential decay of second-order elliptic equations: bounds on eigenfunctions of N-body Schrödinger operators. Math. Notes, vol. 29, Princeton University Press, 1982.
[2] H.A. Biagioni and T. Gramchev, Fractional derivative estimates in Gevrey spaces, global regularity and decay for solutions to semilinear equations in $\mathbb{R}^n$. J. Differential Equations 194 (2003), 140–165.
[3] P. Boggiatto, E. Buzano, and L. Rodino, Global hypoellipticity and spectral theory. Math. Res., vol. 92, Akademie Verlag, 1996.
[4] J. Bona and Y. Li, Decay and analyticity of solitary waves. J. Math. Pures Appl. 76 (1997), 377–430.
[5] I. Bondareva and M. Shubin, Equations of Korteweg–de Vries type in classes of increasing functions. J. Soviet Math. 51 (1990), 2323–2332.
[6] E. Buzano, Super-exponential decay of solutions to differential equations in $\mathbb{R}^d$. In: J. Toft, M.W. Wong, and H. Zhu (eds.), Modern trends in pseudo-differential operators, pp. 117–133, Birkhäuser, 2006.


[7] D. Calvo and L. Rodino, Iterates of operators and Gelfand–Shilov functions. Integral Transforms Spec. Funct. 22 (2011), 269–276.
[8] I. Camperi, Global hypoellipticity and Sobolev estimates for generalized SG-pseudodifferential operators. Rend. Semin. Mat. Univ. Politec. Torino 66 (2008), 99–112.
[9] M. Cappiello, Fourier integral operators of infinite order and applications to SG-hyperbolic equations. Tsukuba J. Math. 28 (2004), 311–361.
[10] M. Cappiello, Fourier integral operators and Gelfand–Shilov spaces. In: Modern trends in pseudo-differential operators, pp. 81–100, Oper. Theory Adv. Appl., vol. 160, Birkhäuser, 2005.
[11] M. Cappiello, T. Gramchev, and L. Rodino, Super-exponential decay and holomorphic extensions for semilinear equations with polynomial coefficients. J. Funct. Anal. 237 (2006), 634–654.
[12] M. Cappiello, T. Gramchev, and L. Rodino, Exponential decay and regularity for SG-elliptic operators with polynomial coefficients. In: Hyperbolic problems and regularity questions, pp. 49–58, Trends Math., Birkhäuser, 2007.
[13] M. Cappiello, T. Gramchev, and L. Rodino, Gelfand–Shilov spaces, pseudo-differential operators and localization operators. In: Modern trends in pseudo-differential operators, pp. 297–312, Oper. Theory Adv. Appl., vol. 172, Birkhäuser, 2007.
[14] M. Cappiello, T. Gramchev, and L. Rodino, Semilinear pseudo-differential equations and travelling waves. In: L. Rodino, B.-W. Schulze, and M.W. Wong (eds.), Pseudo-differential operators: partial differential equations and time-frequency analysis, Fields Institute Communications 52 (2007), 213–238.
[15] M. Cappiello, T. Gramchev, and L. Rodino, Decay and regularity for harmonic oscillator-type equations. Integral Transforms Spec. Funct. 20 (2009), 283–290.
[16] M. Cappiello, T. Gramchev, and L. Rodino, Entire extensions and exponential decay for semilinear elliptic equations. J. Anal. Math. 111 (2010), 339–367.
[17] M. Cappiello, T. Gramchev, and L. Rodino, Sub-exponential decay and uniform holomorphic extensions for semilinear pseudodifferential equations. Comm. Partial Differential Equations 35 (2010), 846–877.
[18] M. Cappiello, T. Gramchev, and L. Rodino, Exponential estimates and holomorphic extensions for semilinear elliptic pseudodifferential equations. Complex Var. Elliptic Equ. 56 (2011), 1129–1142.
[19] M. Cappiello and F. Nicola, Holomorphic extension of solutions of semilinear elliptic equations. Nonl. Anal. 74 (2011), 2663–2681.
[20] M. Cappiello and F. Nicola, Regularity and decay of solutions of nonlinear harmonic oscillators. Adv. Math. 229 (2012), 1266–1299.
[21] M. Cappiello and L. Rodino, SG-pseudo-differential operators and Gelfand–Shilov spaces. Rocky Mountain J. Math. 36 (2006), 1117–1148.
[22] J. Chung, S.Y. Chung, and D. Kim, Characterization of the Gelfand–Shilov spaces via Fourier transforms. Proc. Am. Math. Soc. 124 (1996), 2101–2108.
[23] H.O. Cordes, The technique of pseudodifferential operators. Cambridge Univ. Press, 1995.
[24] E.B. Davies, Heat kernels and spectral theory. Cambridge Tracts Math., vol. 92, Cambridge Univ. Press, Cambridge, 1989.


[25] E.B. Davies and B. Simon, Ultracontractivity and the heat kernel for Schrödinger operators and Dirichlet Laplacians. J. Funct. Anal. 59 (1984), 335–395.
[26] P. Djakov and B. Mityagin, Smoothness of solutions of a nonlinear ODE. Int. Equations Oper. Theory 44 (2002), 149–171.
[27] P. Djakov and B. Mityagin, Smoothness of solutions of nonlinear ODE. Math. Ann. 324 (2002), 225–254.
[28] A. Erdelyi, W. Magnus, F. Oberhettinger, and F.G. Tricomi, Higher Transcendental Functions. Vol. 1–3, McGraw-Hill, New York, 1953.
[29] I.M. Gel'fand and G.E. Shilov, Generalized functions, II. Academic Press, New York, 1968.
[30] T. Gramchev, Perturbative methods in scales of Banach spaces: applications for Gevrey regularity of solutions to semilinear partial differential equations. Rend. Sem. Mat. Univ. Pol. Torino 61 (2003), 101–134.
[31] T. Gramchev, S. Pilipović, and L. Rodino, Global regularity and stability in S-Spaces for classes of degenerate Shubin operators. Pseudo-Differential Operators: Complex Analysis and Partial Differential Equations, Operator Theory: Advances and Applications 205 (2010), 81–90.
[32] T. Gramchev, S. Pilipović, and L. Rodino, Eigenfunction expansions in $\mathbb{R}^n$. Proc. Amer. Math. Soc. 139 (2011), 4361–4368.
[33] T. Gramchev and G. Tranquilli, Hypoellipticity and solvability in Gelfand–Shilov spaces for twisted Laplacian type operators. Compt. Rendus Acad. Bulg. Sci. 67 (2014), 1193–1200.
[34] T. Gramchev and G. Tranquilli, Cauchy problem for second-order hyperbolic equations for Shubin pseudodifferential operators. Operator Theory: Advances and Applications 245 (2015), 81–90.
[35] K. Gröchenig and G. Zimmermann, Spaces of test functions via the STFT. J. Funct. Spaces Appl. 2 (2004), 24–53.
[36] B. Helffer, Théorie spectrale pour des opérateurs globalement elliptiques. Astérisque, vol. 112. Soc. Math. France, Paris, 1984.
[37] B. Helffer and B. Parisse, Comparison of the decay of eigenfunctions for Dirac and Klein–Gordon operators. Applications to the study of the tunneling effect. Ann. Inst. H. Poincaré Phys. Théor. 60 (1994), 147–187.
[38] P.D. Hislop and I.M. Sigal, Introduction to spectral theory. Springer, Berlin, 1996.
[39] L. Hörmander, The analysis of linear partial differential operators III. Pseudodifferential operators. Springer, Berlin, 1985.
[40] B.Ya. Levin, Lectures on entire functions. In collaboration with and with a preface by Yu. Lyubarskii, M. Sodin and V. Tkachenko. Transl. Math. Monogr., vol. 150, Amer. Math. Soc., Providence, RI, 1996.
[41] H. Komatsu, A proof of Kotake and Narasimhan's theorem. Proc. Japan Acad. 38 (1962), 615–618.
[42] T. Kotake and M.S. Narasimhan, Regularity theorems for fractional powers of a linear elliptic operator. Bull. Soc. Math. France 90 (1962), 449–471.
[43] M. Langenbruch, Hermite functions and weighted spaces of generalized functions. Manuscripta Math. 119 (2006), 269–285.


[44] M. Mascarello and L. Rodino, Partial differential equations with multiple characteristics. Akademie Verlag, Berlin, 1997.
[45] R. McOwen, On elliptic operators in $\mathbb{R}^n$. Comm. Partial Differential Equations 5 (1980), 913–933.
[46] N. Lerner, Y. Morimoto, K. Pravda-Starov, and C.J. Xu, Gelfand–Shilov smoothing properties of the radially symmetric spatially homogeneous Boltzmann equation without angular cutoff. J. Differential Equations 256 (2014), 797–831.
[47] R. Lockhart and R. McOwen, On elliptic systems in $\mathbb{R}^n$. Acta Math. 150 (1983), 125–135.
[48] R. Lockhart and R. McOwen, Correction to "On elliptic systems in $\mathbb{R}^n$". Acta Math. 153 (1984), 303–304.
[49] B.S. Mityagin, Nuclearity and other properties of spaces of type S. Trudy Moskov. Mat. Obšč. 9 (1960), 317–328.
[50] F. Nicola and L. Rodino, Global pseudo-differential calculus on Euclidean spaces. Birkhäuser, Basel, 2010.
[51] C. Parenti, Operatori pseudodifferenziali in $\mathbb{R}^n$ e applicazioni. Ann. Mat. Pura Appl. 93 (1972), 359–389.
[52] S. Pilipovic, Tempered ultradistributions. Boll. Unione Mat. Ital. B (7) 2 (1988), 235–251.
[53] S. Pilipovic and N. Teofanov, Pseudodifferential operators on ultramodulation spaces. J. Functional Anal. 208 (2004), 194–228.
[54] P.J. Rabier, Asymptotic behavior of the solutions of linear and quasilinear elliptic equations on $\mathbb{R}^N$. Trans. Amer. Math. Soc. 356 (2004), 1889–1907.
[55] P.J. Rabier and C. Stuart, Exponential decay of the solutions of quasilinear second-order equations and Pohozaev identities. J. Differential Equations 165 (2000), 199–234.
[56] V.S. Rabinovich, Exponential estimates for eigenfunctions of Schrödinger operators with rapidly increasing and discontinuous potentials. Contemporary Math. 364 (2004), 225–236.
[57] M. Reed and B. Simon, Methods of modern mathematical physics. I. Academic Press, San Diego Ca., 1975.
[58] L. Rodino, Linear partial differential operators in Gevrey spaces. World Scientific, Singapore, 1993.
[59] E. Schrohe, Spaces of weighted symbols and weighted Sobolev spaces on manifolds. In: H.O. Cordes, B. Gramsch, and H. Widom (eds.), Pseudodifferential operators, pp. 360–377, Lecture Notes in Math., vol. 1256, Springer, New York, 1987.
[60] B.-W. Schulze, Boundary value problems and singular pseudodifferential operators. J. Wiley, Chichester, 1998.
[61] R.T. Seeley, Integro-differential operators on vector bundles. Trans. Amer. Math. Soc. 117 (1965), 167–204.
[62] R.T. Seeley, Eigenfunction expansions of analytic functions. Proc. Amer. Math. Soc. 21 (1969), 734–738.
[63] M. Shubin, Pseudodifferential operators and spectral theory. Springer Ser. Soviet Math., Springer, Berlin, 1987.

68

T. Gramchev

[64] Y. Sibuya, The Gevrey asymptotics in the case of singular perturbations. J. Differential Equations 165 (2000), 255–314. [65] Y. Sibuya, Formal power series solutions in a parameter. J. Differential Equations 190 (2003), 559–578. [66] G. Szeg¨ o, Orthogonal polynomials. Amer. Math. Soc., 1959. [67] N. Teofanov, Ultradistributions in time-frequency analysis. In: Pseudo-differential operators and related topics, Oper. Theory Adv. Appl., Birkh¨ auser, 2005. [68] J. Toft, A. Khrennikov, B. Nilsson, and S. Nordebo, Decompositions of Gelfand– Shilov kernels into kernels of similar class. J. Math. Anal. Appl. 396 (2012), 315–322. [69] F.G. Tricomi, Funzioni speciali. Ed. Tirrenia, Torino, 1965. [70] W. Wasow, Asymptotic expansions for ordinary differential equations. Wiley, New York, 1965. [71] M.W. Wong, The heat equation for the Hermite operator on the Heisenberg group. Hokkaido Math. J. 34 (2005), 393–404. Todor Gramchev Dipartimento di Matematica e Informatica Universit` a di Cagliari Via Ospedale 72 I-09124 Cagliari, Italy e-mail: [email protected]

Operator Theory: Advances and Applications, Vol. 251, 69–115
© Springer International Publishing Switzerland 2016

An Excursion into Berezin–Toeplitz Quantization and Related Topics

Miroslav Engliš

Abstract. We present an introduction to the Berezin and Berezin–Toeplitz quantizations, starting from their historical origins and relationships with other quantization methods, discussing various instructive examples like the Segal–Bargmann–Fock space, and culminating in highlights of proofs of the existence of these quantizations using both the Boutet de Monvel theory and the approach via Fefferman’s expansion and the Forelli–Rudin construction. The exposition strives to be reasonably self-contained and accessible to nonexperts.

Mathematics Subject Classification (2010). Primary 53D55; Secondary 46E22, 47B35, 32A36.

Keywords. Berezin quantization, Toeplitz operator, Bergman space, Bergman kernel, Berezin transform.

Quantization has traditionally been understood as a recipe in physics for passing from a classical system – which, loosely speaking, is something that concerns macroscopic objects and that we are familiar with from everyday life – to the “corresponding” quantum system, which pertains to microscopic objects where things are subject to more complicated rules. The latter should reduce to the former as the size of the objects gets large, that is, as the “Planck constant”, which, heuristically, corresponds to the magnitude where the quantum phenomena become relevant, tends to zero. (This is the so-called “correspondence principle”, or “classical limit”.)

Over time, it became apparent that such a concept is not totally appropriate, either mathematically or physically. From the point of view of physics, it is more appropriate to understand quantization just as a correspondence between classical and quantum systems; that is, there may be quantum systems which have no classical counterpart, as well as different quantum systems corresponding to the same classical system. From the mathematical point of view, one even encounters obstacles of a different kind – namely, various “no-go” theorems show that there can exist no mathematical recipe that would fulfill all the axioms required by the physical interpretation. As a result, nowadays we face the existence of many different quantization theories, ranging from geometric quantization, deformation quantization and various related operator-theoretic quantizations to Feynman path integrals, asymptotic quantization, or stochastic quantization, to mention just a few. None of the existing approaches solves the quantization problem completely; on the other hand, on the mathematics side all these have evolved into rich theories in their own right, with results of great depth and beauty.

The aim of this paper is to give a flavour of two of the approaches that belong to the list above, namely the Berezin and the Berezin–Toeplitz quantizations. Compared to other similar surveys like [1] or [41], we have tried to intersperse the exposition with simple examples that illustrate the main ideas, thus keeping it – we hope – accessible even to students or newcomers to the area.

The paper is organized as follows. In Section 1, we present in some more detail what has been mentioned in the first two paragraphs above, namely, the original aspirations of the quantization theory and the various ramifications that the subsequent developments have led to. Section 2 discusses what turns out to be the simplest example of Berezin–Toeplitz quantization, namely the Toeplitz operators on the Fock space. The basic principles of the Berezin–Toeplitz and Berezin quantizations in curved (i.e., non-Euclidean) spaces and the necessary tools for them are discussed in Sections 4 and 3, respectively, while the full account of these theories appears in Sections 5 and 6. The final Section 7 contains miscellaneous additional comments, bibliographic remarks, and the like.

This paper is an extended version of the series of lectures the author gave at the summer school Analysis – with Applications to Mathematical Physics in Göttingen on August 29–September 2, 2011. It is the author’s pleasure to thank the organizers for the opportunity to participate in the workshop and for the hospitality during his stay.

Research supported by GA AV ČR grant no. IAA100190802, GA ČR grant no. 201/09/0473, and by the Ministry of Education research plan no. MSM4781305904.

1. The problem of quantization

1.1. The canonical quantization

The original concept of quantization, going back to Weyl, von Neumann, and Dirac, consists in assigning operators to functions:
\[ f \longmapsto Q_f . \]
Here the functions $f$ are supposed to live on some manifold, called the classical phase space; for reasons going back to classical mechanics, the manifold is taken to be symplectic, meaning it is equipped with a differential form of a certain kind. (We will be more specific about this later.) The operators live on some fixed, separable infinite-dimensional Hilbert space $H$, and are assumed to be selfadjoint if $f$ is real-valued. (They need not be bounded in general.) One calls the functions $f$ classical observables, while the corresponding operators $Q_f$ are the associated quantum observables.

The physical interpretation is that upon performing some experiment to measure a quantity (position, velocity, momentum, energy, ...) represented by $f$, the possible outcomes will have the probability distribution $\langle \Pi(Q_f)u, u\rangle$, where $\Pi(Q_f)$ is the spectral measure of the operator $Q_f$, while $u \in H$ is a unit vector characterizing the “state” of the given quantum system. In particular, if $Q_f$ has pure point spectrum consisting of eigenvalues $\lambda_j$ with eigenvectors $u_j$, $\|u_j\| = 1$, then the possible outcomes of measuring $f$ will be $\lambda_j$ with probability $|\langle u, u_j\rangle|^2$; if $u = u_j$ for some $j$, the measurement will be deterministic and will always return $\lambda_j$. Noncommutativity of operators corresponds to the impossibility of measuring simultaneously the corresponding observables.

The simplest example of a quantization rule as above is for $M = \mathbb{R}^{2n}$, the real $2n$-space, with elements written as $(p, q) \in \mathbb{R}^n \times \mathbb{R}^n$; one thinks of $q_1, \dots, q_n$ as the coordinates of a particle in $\mathbb{R}^n$, and of $p_1, \dots, p_n$ as the velocities (or, more precisely, momenta) of the particle; in other words, $M$ is the phase space of a single particle moving in $\mathbb{R}^n$. We take $H = L^2(\mathbb{R}^n)$ for the Hilbert space, viewed as $L^2$-functions in the position variables $q$; and define the quantum observables $Q_f$, for $f$ one of the coordinate functions on $\mathbb{R}^{2n}$, by
\[ Q_{q_j} : f(q) \longmapsto q_j f(q), \qquad Q_{p_j} : f(q) \longmapsto \frac{h}{2\pi i}\,\frac{\partial f(q)}{\partial q_j} \tag{1} \]
(the Schrödinger representation). These operators satisfy the canonical commutation relations (or just CCR for short)
\[ \begin{gathered} {}[Q_{q_j}, Q_{q_k}] = [Q_{p_j}, Q_{p_k}] = 0 \quad \forall j, k, \\ [Q_{q_j}, Q_{p_k}] = 0 \quad \text{for } j \neq k, \\ [Q_{q_j}, Q_{p_j}] = \frac{ih}{2\pi}\, I, \end{gathered} \tag{2} \]
where $[A, B] := AB - BA$ denotes the commutator of two operators. The parameter $h$, on which this map $Q$ also depends, is the Planck constant; this should be thought of as a small positive number, and the classical limit $h \searrow 0$ should somehow recover the classical system from the quantum one, as already mentioned.

Note that under the physical interpretation just explained, (1) implies, in particular, that it is possible to measure simultaneously the position variables $q$ (in fact, the joint spectral distribution of the $Q_{q_1}, \dots, Q_{q_n}$ is just the Lebesgue measure on $\mathbb{R}^n$, so the probability of finding the particle in a state given by $u \in L^2(\mathbb{R}^n)$ to be present in some set $\Omega \subset \mathbb{R}^n$ in an experiment is equal to the integral of $|u|^2$ over $\Omega$), or the momentum variables $p$, or even $p_j$ and $q_k$ for $j \neq k$, but not $q_j$ and $p_j$; the last is a reflection of the celebrated Heisenberg uncertainty principle. As $h$ tends to zero, even the operators $Q_{q_j}$ and $Q_{p_j}$ become commutative, and the problems with simultaneous non-measurability thus disappear.

Of course, it remains to say how to assign the operators $Q_f$ to more general functions $f$ than the coordinate functions. There are some requirements which such an assignment should satisfy, coming from the physical interpretation:

(A1) The map $f \mapsto Q_f$ should be linear.

(A2) (The von Neumann rule.) For any polynomial $\phi : \mathbb{R} \to \mathbb{R}$, we should have $Q_{\phi\circ f} = \phi(Q_f)$. (In particular, $Q_1 = I$.)

(A3) $[Q_f, Q_g] = -\dfrac{ih}{2\pi}\, Q_{\{f,g\}}$, where
\[ \{f, g\} = \sum_{j=1}^{n} \Big( \frac{\partial f}{\partial p_j}\frac{\partial g}{\partial q_j} - \frac{\partial f}{\partial q_j}\frac{\partial g}{\partial p_j} \Big) \]
is the Poisson bracket of $f$ and $g$.
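As a quick sanity check (a worked computation added for illustration, taking $n = 1$ and an arbitrary smooth test function $u$, which is notation introduced here), the last relation in (2) follows from (1) by the product rule, and it agrees with what (A3) together with (A1) and $Q_1 = I$ gives for the coordinate functions:

```latex
\begin{align*}
[Q_q, Q_p]\,u
  &= q\,\frac{h}{2\pi i}\,u' - \frac{h}{2\pi i}\,(q u)'
   = -\frac{h}{2\pi i}\,u
   = \frac{ih}{2\pi}\,u, \\
\{q, p\}
  &= \frac{\partial q}{\partial p}\frac{\partial p}{\partial q}
   - \frac{\partial q}{\partial q}\frac{\partial p}{\partial p}
   = -1,
\qquad\text{so}\qquad
-\frac{ih}{2\pi}\,Q_{\{q,p\}} = -\frac{ih}{2\pi}\,Q_{-1} = \frac{ih}{2\pi}\,I .
\end{align*}
```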

Here the axiom (A2) just means that if our experiment yields $\lambda$ as an outcome for measuring $f$ with some probability, then it should yield $\lambda^2$ with the same probability when measuring $f^2$, or, more generally, $\phi(\lambda)$ with the same probability when measuring $\phi(f)$. Similarly, the linearity axiom (A1) is quite natural. Finally, the last axiom (A3) has to do with the time evolution of the system, as described by the Hamiltonian formalism in classical mechanics (we will not go into details about that here). (The last axiom also extends in an obvious way to any other manifold $M$ on which we have an analogue of the Poisson bracket defined – these are precisely the symplectic manifolds that we have already hinted at.) Note that for $f, g$ the coordinate functions on $M = \mathbb{R}^{2n}$, the last axiom reduces precisely to the canonical commutation relations (2).

We are thus led to the problem of extending the rules (1) in such a way that the axioms (A1)–(A3) above are satisfied. So, what are the solutions to this extension problem? (And, more generally, what would be the solutions for some more general symplectic manifold $M$?)

1.2. Inconsistencies

Unfortunately, here comes the bad news. Namely, the above axioms are inconsistent (even in the simplest case of $M = \mathbb{R}^{2n}$). To see that, denote for brevity $P = Q_{p_1}$, $Q = Q_{q_1}$, $p = p_1$, $q = q_1$; then
\[ pq = \frac{(p + q)^2 - p^2 - q^2}{2} \]
implies, using (A1) and (A2), that
\[ Q_{pq} = \frac{(P + Q)^2 - P^2 - Q^2}{2} = \frac{PQ + QP}{2}. \]
On the other hand, by (A2) $Q_{q^2} = Q^2$ and $Q_{p^2} = P^2$, so we can apply the same argument to $p^2, q^2$ in the place of $p, q$:
\[ p^2 q^2 = \frac{(p^2 + q^2)^2 - p^4 - q^4}{2} \]
implies, using (A1) and (A2), that
\[ Q_{p^2 q^2} = \frac{P^2 Q^2 + Q^2 P^2}{2}. \]
Finally, as $p^2 q^2 = (pq)^2$, (A2) requires that we should have $Q_{p^2 q^2} = Q_{pq}^2$. However, an easy computation, using the canonical commutation relation for $P$ and $Q$, shows that
\[ \Big( \frac{PQ + QP}{2} \Big)^2 \neq \frac{P^2 Q^2 + Q^2 P^2}{2} \]
(the two sides differ by a nonzero multiple of the identity). Thus we have arrived at a contradiction.

Note that our argument above used just (A1) and (A2), so even these two axioms alone are inconsistent. It was shown by Groenewold in 1946 (with an improvement by van Hove in 1951) that, likewise, (A1) and (A3) alone are inconsistent. Finally, the present author noticed (much later) that also (A2) and (A3) by themselves lead to a contradiction. In other words, not only the three axioms (A1)–(A3) all together – although quite innocuous and very natural from the point of view of physics – but even any two of them are already inconsistent!

The contradiction deduced above used polynomial classical observables $f$, i.e., very nice functions; if we allow some “wilder” functions $f$ as observables, then it can, in fact, be shown that already the von Neumann rule (A2) alone and the canonical commutation relations (2) lead to a contradiction. Namely, recall that there exists a continuous function $f$ (Peano curve) which maps $\mathbb{R}$ continuously and surjectively onto $\mathbb{R}^{2n}$. Let $g$ be a right inverse for $f$, so that $g : \mathbb{R}^{2n} \to \mathbb{R}$ and $f \circ g = \mathrm{id}$; such $g$ exists owing to the surjectivity of $f$, and can be chosen to be measurable and locally bounded. Denote, for brevity, $T = Q_g$ and consider the functions $\phi = p_1 \circ f$, $\psi = q_1 \circ f$. Then by the axiom (A2),
\[ \phi(T) = Q_{p_1 \circ f \circ g} = Q_{p_1}, \qquad \psi(T) = Q_{q_1 \circ f \circ g} = Q_{q_1}, \]
and
\[ 0 = (\phi\psi - \psi\phi)(T) = \phi(T)\psi(T) - \psi(T)\phi(T) = [Q_{p_1}, Q_{q_1}] = -\frac{ih}{2\pi}\, I, \]
a contradiction.

What should we do to resolve this disappointing situation? First of all, we will work solely with continuous or, still better, smooth (infinitely differentiable) functions; these are anyway the only ones that we really meet in the physical realm, and it rules out the pathologies we saw in the preceding paragraph. Next, we discard the von Neumann rule, except for $\phi = 1$, i.e., $Q_1 = I$.
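The “easy computation” behind the polynomial contradiction of §1.2 can be checked mechanically. The sketch below is an illustration rather than part of the original argument: it writes $P = cD$ with $c = h/(2\pi i)$ and $D = d/dq$, and $Q = M$ (multiplication by $q$), so that both sides of the claimed inequality are $c^2$ times an operator with rational coefficients, and it lets these operators act on polynomials stored as coefficient lists. All helper names (`D`, `M`, `lin`, `gap`) are ours.

```python
from fractions import Fraction

def trim(u):
    """Drop trailing zero coefficients (keep at least one entry)."""
    u = list(u)
    while len(u) > 1 and u[-1] == 0:
        u.pop()
    return u

def lin(a, u, b, v):
    """Return the polynomial a*u + b*v; u[k] is the coefficient of q**k."""
    n = max(len(u), len(v))
    u = list(u) + [Fraction(0)] * (n - len(u))
    v = list(v) + [Fraction(0)] * (n - len(v))
    return trim([a * x + b * y for x, y in zip(u, v)])

def D(u):
    """Derivative d/dq."""
    return trim([k * u[k] for k in range(1, len(u))] or [Fraction(0)])

def M(u):
    """Multiplication by q."""
    return trim([Fraction(0)] + list(u))

def sym(u):
    """(DM + MD)/2 applied to u, i.e. (PQ + QP)/2 up to the factor c."""
    half = Fraction(1, 2)
    return lin(half, D(M(u)), half, M(D(u)))

def gap(u):
    """[((DM + MD)/2)^2 - (D^2 M^2 + M^2 D^2)/2] applied to u."""
    half = Fraction(1, 2)
    lhs = sym(sym(u))
    rhs = lin(half, D(D(M(M(u)))), half, M(M(D(D(u)))))
    return lin(1, lhs, -1, rhs)

# Apply the difference of the two sides to the monomials 1, q, ..., q^5.
for k in range(6):
    monomial = [Fraction(0)] * k + [Fraction(1)]
    print(k, gap(monomial))
```

On each monomial the difference acts as multiplication by $-\tfrac34$, so on polynomials the two sides differ by $-\tfrac34 c^2 I = \tfrac{3h^2}{16\pi^2}\, I$, a nonzero multiple of the identity as claimed.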


The only discrepancy left there is thus the one between the linearity axiom (A1) and the Poisson brackets axiom (A3). There are two established approaches to dealing with that.

The first approach is to actually insist on both axioms, but restrict even further the set of quantizable observables, i.e., the domain of the map $f \mapsto Q_f$ (we have already restricted it to smooth functions a few lines above). For instance, for our quantization on $M = \mathbb{R}^{2n}$, if we allow only functions $f$ at most linear in the momentum variables $p_j$, then the recipe
\[ Q_f : \psi \longmapsto -\frac{ih}{2\pi} \sum_j \frac{\partial f}{\partial p_j}\frac{\partial \psi}{\partial q_j} + \Big( f - \sum_j p_j \frac{\partial f}{\partial p_j} \Big)\psi, \]
where $\psi = \psi(q) \in L^2(\mathbb{R}^n)$, does the job we need: it extends the Schrödinger representation (1) and satisfies (A1) and (A3). (Note that the last makes sense, since the Poisson bracket of two functions at most linear in $p$ is again at most linear in $p$.) In the case of a general symplectic manifold $M$ in the place of $\mathbb{R}^{2n}$, one can similarly make things work by restricting, in an appropriate sense, to functions at most linear in “half of the variables”. In technical terms, choosing this “half of the variables” requires the concept of the so-called polarizations of the manifold; by definition, a polarization is a smooth choice of subspaces of dimension $n$ in each fiber $T_x M$, $x \in M$, of the tangent bundle $TM$ of $M$. The whole approach leads to particularly appealing results in the context of manifolds $M$ with nice group actions (symmetries), when methods of representation theory apply, and is known as the geometric quantization (Kostant [35], Souriau [42]).

The second approach, on the other hand, starts by relaxing the Poisson brackets axiom (A3) to hold only asymptotically as $h \to 0$:
\[ [Q_f, Q_g] = -\frac{ih}{2\pi}\, Q_{\{f,g\}} + O(h^2). \tag{3} \]
This is the basic idea behind the deformation quantization. Before spelling out the precise definition of the latter in detail, let us look at a simple example on $\mathbb{R}^{2n}$, which we now describe.

1.3. Weyl quantization

An “arbitrary” function $f(p, q)$ on $\mathbb{R}^{2n}$ can be expanded into exponentials via the Fourier transform:
\[ f(p, q) = \int_{\mathbb{R}^n}\!\int_{\mathbb{R}^n} \hat f(\xi, \eta)\, e^{2\pi i(\xi\cdot p + \eta\cdot q)}\, d\xi\, d\eta. \tag{4} \]

From the Schrödinger representation (1) and the Taylor series for the exponential, it is easy to interpret the exponentials $e^{2\pi i \xi\cdot Q_p}$ and $e^{2\pi i \eta\cdot Q_q}$:
\[ e^{2\pi i \xi\cdot Q_p} u(q) = u(q + h\xi), \qquad e^{2\pi i \eta\cdot Q_q} u(q) = e^{2\pi i \eta\cdot q}\, u(q). \]

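With a bit of effort, one can also take a good guess what $e^{2\pi i(\xi\cdot Q_p + \eta\cdot Q_q)}$ should be. Indeed, given a $u \in L^2(\mathbb{R}^n)$, the function
\[ g(q, t) = [e^{2\pi i t(\xi\cdot Q_p + \eta\cdot Q_q)} u](q), \qquad t \in \mathbb{R}, \]
should be a solution to $\partial g/\partial t = 2\pi i(\xi\cdot Q_p + \eta\cdot Q_q)g$ subject to the initial condition $g(q, 0) = u(q)$; in other words,
\[ \frac{\partial g}{\partial t} - \sum_{j=1}^{n} h\xi_j \frac{\partial g}{\partial q_j} = 2\pi i\, \eta\cdot q\, g, \qquad g(q, 0) = u(q). \]
Fixing $q$ for a moment and setting $G(t) = g(q - th\xi, t)$, this becomes
\[ G'(t) = 2\pi i\, \eta\cdot(q - th\xi)\, G(t), \qquad G(0) = u(q), \]
with the solution $G(t) = e^{2\pi i t \eta\cdot q - \pi i t^2 h \eta\cdot\xi}\, u(q)$, or
\[ g(q, t) = e^{2\pi i t \eta\cdot(q + th\xi) - \pi i t^2 h \eta\cdot\xi}\, u(q + th\xi) = e^{2\pi i t \eta\cdot q + \pi i t^2 h \eta\cdot\xi}\, u(q + th\xi). \]
Taking $t = 1$ we are thus led to
\[ e^{2\pi i(\xi\cdot Q_p + \eta\cdot Q_q)} u(q) = e^{2\pi i \eta\cdot q + \pi i h \eta\cdot\xi}\, u(q + h\xi). \]
Returning to (4), let us now postulate that
\[ Q_f = \int_{\mathbb{R}^n}\!\int_{\mathbb{R}^n} \hat f(\xi, \eta)\, e^{2\pi i(\xi\cdot Q_p + \eta\cdot Q_q)}\, d\xi\, d\eta =: W_f. \]
In other words, using the previous formula,
\[ \begin{aligned} W_f u(q) &= \int_{\mathbb{R}^n}\!\int_{\mathbb{R}^n} \hat f(\xi, \eta)\, e^{2\pi i \eta\cdot q + \pi i h \eta\cdot\xi}\, u(q + h\xi)\, d\xi\, d\eta \\ &= h^{-n} \int_{\mathbb{R}^n}\!\int_{\mathbb{R}^n} \hat f\Big(\frac{\xi - q}{h}, \eta\Big)\, e^{\pi i \eta\cdot(q + \xi)}\, u(\xi)\, d\xi\, d\eta \\ &= h^{-n} \int_{\mathbb{R}^n}\!\int_{\mathbb{R}^n} f\Big(p, \frac{q + y}{2}\Big)\, e^{2\pi i (q - y)\cdot p/h}\, u(y)\, dy\, dp \end{aligned} \]
by Plancherel’s theorem. This is the celebrated Weyl calculus of pseudodifferential operators; a beautiful reference for it is Folland’s book [28].

It can be shown that, appropriately interpreted, $W_f$ makes sense even for any tempered distribution $f$ on $\mathbb{R}^{2n}$, being then a continuous operator from the Schwartz space $\mathcal{S}(\mathbb{R}^n)$ into the tempered distributions $\mathcal{S}'(\mathbb{R}^n)$ on $\mathbb{R}^n$. If $f$ is sufficiently nice – for instance, if $f \in \mathcal{S}(\mathbb{R}^{2n})$ – then $W_f$ is continuous even from $\mathcal{S}(\mathbb{R}^n)$ into itself. For such $f$ and $g$, the product $W_f W_g$ therefore makes sense, and it turns out that
\[ W_f W_g = W_{fg} + h W_{C_1(f,g)} + O(h^2) \qquad \text{as } h \searrow 0, \tag{5} \]
where
\[ C_1(f, g) = \frac{i}{4\pi} \sum_{j=1}^{n} \Big( \frac{\partial f}{\partial q_j}\frac{\partial g}{\partial p_j} - \frac{\partial f}{\partial p_j}\frac{\partial g}{\partial q_j} \Big) \]
satisfies
\[ C_1(f, g) - C_1(g, f) = -\frac{i}{2\pi}\{f, g\}. \]
Hence
\[ [W_f, W_g] = -\frac{ih}{2\pi}\, W_{\{f,g\}} + O(h^2), \]
so that the Weyl calculus satisfies (3).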
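The closed form obtained above for $g(q,t)$ can be tested numerically. The following is a minimal sketch under assumptions of our own choosing: $n = 1$, arbitrary sample values for $h$, $\xi$, $\eta$, a particular smooth test function `u`, and central finite differences in place of exact derivatives. It evaluates the residual of the transport equation $\partial g/\partial t - h\xi\,\partial g/\partial q = 2\pi i\,\eta q\, g$.

```python
import cmath
import math

# Arbitrary sample values of h, xi, eta (assumptions for the illustration).
H, XI, ETA = 0.1, 0.7, -1.3

def u(q):
    """A smooth test function; any other smooth choice would do."""
    return cmath.exp(-q * q) * (1 + 0.5j * q)

def g(q, t):
    """Claimed closed form of exp(2*pi*i*t*(xi*Q_p + eta*Q_q)) applied to u."""
    phase = 2 * math.pi * t * ETA * q + math.pi * t * t * H * ETA * XI
    return cmath.exp(1j * phase) * u(q + t * H * XI)

def residual(q, t, d=1e-5):
    """dg/dt - h*xi*dg/dq - 2*pi*i*eta*q*g, using central differences."""
    dg_dt = (g(q, t + d) - g(q, t - d)) / (2 * d)
    dg_dq = (g(q + d, t) - g(q - d, t)) / (2 * d)
    return dg_dt - H * XI * dg_dq - 2j * math.pi * ETA * q * g(q, t)

for q, t in [(0.0, 0.0), (0.5, 0.3), (-1.0, 1.0)]:
    print(q, t, abs(residual(q, t)))
```

The printed residuals should be at the level of the finite-difference error, and `g(q, 0)` reduces to `u(q)`, matching the initial condition; setting `t = 1` gives the formula for $e^{2\pi i(\xi\cdot Q_p+\eta\cdot Q_q)}u$.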


One can even do slightly better than that. Namely, the product formula (5) can even be improved to higher order: there exist $C_2, C_3, \dots$ such that
\[ \begin{aligned} W_f W_g &= W_{fg} + h W_{C_1(f,g)} + h^2 W_{C_2(f,g)} + O(h^3), \\ W_f W_g &= W_{fg} + h W_{C_1(f,g)} + h^2 W_{C_2(f,g)} + h^3 W_{C_3(f,g)} + O(h^4), \end{aligned} \]
and so on. Symbolically,
\[ W_f W_g = W_{f * g} \tag{6} \]
where
\[ f * g := fg + h C_1(f, g) + h^2 C_2(f, g) + h^3 C_3(f, g) + \cdots. \]
The last expression should be viewed just as a formal power series in $h$ (no convergence is asserted!), and (6) should just be understood as above, i.e.,
\[ W_f W_g = \sum_{j=0}^{N-1} h^j W_{C_j(f,g)} + O(h^N), \]
for any $N = 0, 1, 2, \dots$.

Ultimately, one is even led to the idea that for the quantization it is not really necessary to have the operators $Q_f$, but it suffices to have a noncommutative product like $*$. This is the essence of the second approach to resolving the inconsistency of the axioms (A1)–(A3), called the deformation quantization.

1.4. Deformation quantization

The precise definition runs as follows. Given our manifold $M$, consider the ring $C^\infty(M)[[h]]$ of all formal power series in $h$ over $C^\infty(M)$. That is, the elements of $C^\infty(M)[[h]]$ are formal power series
\[ f = \sum_{j=0}^{\infty} h^j f_j(x) \tag{7} \]
with $f_j \in C^\infty(M)$, and addition and multiplication defined in the usual way. A star product is an associative $\mathbb{C}[[h]]$-bilinear mapping $*$ such that
\[ f * g = \sum_{j=0}^{\infty} h^j C_j(f, g), \qquad \forall f, g \in C^\infty(M), \tag{8} \]
where the bilinear operators $C_j$ satisfy
\[ C_0(f, g) = fg, \qquad C_1(f, g) - C_1(g, f) = -\frac{i}{2\pi}\{f, g\}, \qquad C_j(f, 1) = C_j(1, f) = 0 \quad \forall j \ge 1. \]
(The $\mathbb{C}[[h]]$-bilinearity means that $f * g$ is linear in each argument and $(hf) * g = f * (hg) = h(f * g)$; consequently, for any $f, g$ as in (7),
\[ \Big( \sum_{j=0}^{\infty} h^j f_j(x) \Big) * \Big( \sum_{k=0}^{\infty} h^k g_k(x) \Big) = \sum_{j,k,m=0}^{\infty} h^{j+k+m}\, C_m(f_j, g_k)(x), \]
where the last sum should, of course, be re-arranged by combining together the terms with the same power $h^{j+k+m}$ of $h$.)

We have seen at the end of §1.3 that the Weyl calculus, with the star product defined by (6), satisfies (8) (in fact, that is exactly how the Weyl star-product was defined). From (6) and the fact that multiplication of operators is associative, i.e., $(W_f W_g)W_k = W_f(W_g W_k)$, it is also immediate that the Weyl star-product (6) is associative. Thus the Weyl calculus from §1.3 is an example of deformation quantization on $\mathbb{R}^{2n}$.

The drawback of the Weyl quantization is, however, that it does not readily extend to more general phase spaces than $\mathbb{R}^{2n}$. Indeed, its definition used heavily the Fourier transform, and the Fourier transform is something which is specific only to the Euclidean spaces and a few other situations.

Although the definition of deformation quantization, together with its physics interpretation etc., goes back to 1977 (it was introduced by Bayen, Flato, Fronsdal, Lichnerowicz and Sternheimer in [4]), its existence on a general symplectic manifold was established only years later. The first proof was given by DeWilde and Lecomte in 1983 [18], followed by different proofs by Fedosov in 1985 [26] and Omori, Maeda and Yoshioka in 1991 [39]; finally, in 1997 Kontsevich established its existence even on any Poisson (i.e., more general than symplectic) manifold [34]. These constructions also allow one to describe all possible deformation quantizations of a given manifold, and it turns out that they can be bijectively classified, up to a natural “equivalence”, by the elements of the formal power series ring $H^2(\Omega, \mathbb{R})[[h]]$ over the second cohomology group $H^2(\Omega, \mathbb{R})$. For a wealth of further information on deformation quantization, the reader is referred, e.g., to the survey by Gutt [30].
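To see the definition at work on the simplest pair of observables (a worked example added for illustration, taking $n = 1$ and using only the coefficient $C_1$ of the Weyl calculus from §1.3), one can compute the star-commutator of the coordinate functions $q$ and $p$ to first order in $h$:

```latex
\begin{align*}
C_1(q, p) &= \frac{i}{4\pi}\Big(\frac{\partial q}{\partial q}\frac{\partial p}{\partial p}
            - \frac{\partial q}{\partial p}\frac{\partial p}{\partial q}\Big) = \frac{i}{4\pi},
\qquad
C_1(p, q) = \frac{i}{4\pi}\Big(\frac{\partial p}{\partial q}\frac{\partial q}{\partial p}
            - \frac{\partial p}{\partial p}\frac{\partial q}{\partial q}\Big) = -\frac{i}{4\pi}, \\
q * p - p * q &= h\big(C_1(q, p) - C_1(p, q)\big) + O(h^2) = \frac{ih}{2\pi} + O(h^2).
\end{align*}
```

This agrees with $-\frac{ih}{2\pi}\{q, p\} = \frac{ih}{2\pi}$ and hence with the commutation relation $[Q_q, Q_p] = \frac{ih}{2\pi} I$ from (2).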
One disadvantage of the deformation quantization is that it works with formal power series: no convergence is assumed, nor – it turns out – can be guaranteed in general, which makes the whole thing somewhat awkward when it comes to performing some concrete calculations. It is therefore of interest to have deformation quantizations that would be induced by some underlying operators, as was the case of the Weyl quantization and the formula (6), and it would be even nicer if these operators were somehow naturally related to the geometry and analysis on the manifold in question – as was, again, the case for the Weyl transform and its relationship to the Fourier transform.

In the rest of this paper, we will discuss two instances of such deformation quantizations, which exist on domains in $\mathbb{C}^n$ (or, more generally, on nice Kähler manifolds). Before plunging into the formal definitions and technicalities, let us show how things work in the simplest example when the domain in question is the entire complex space $\mathbb{C}^n$.


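2. The Fock space

2.1. Fock space on C

The Fock, or Segal–Bargmann, space on $\mathbb{C}$ is, by definition,
\[ \mathcal{F}(\mathbb{C}) = \mathcal{F} := L^2_{\mathrm{hol}}(\mathbb{C}, \pi^{-1} e^{-|z|^2}\, dz), \]
the subspace of all entire functions in $L^2(\mathbb{C}, \pi^{-1} e^{-|z|^2}\, dz)$. Given a function $f \in \mathcal{F}$, its Taylor series $f(z) = \sum_{j=0}^{\infty} f_j z^j$ converges on all of $\mathbb{C}$, and uniformly on any compact subset. In particular, for any $R \in (0, +\infty)$ we have
\[ \int_{|z|<R} |f(z)|^2 e^{-|z|^2}\, \frac{dz}{\pi} = \int_{|z|<R} \sum_{j,k=0}^{\infty} f_j z^j\, \overline{f_k z^k}\, e^{-|z|^2}\, \frac{dz}{\pi} \]
[…]

[…] $r > 0$ is such that the polydisc $D_{z,r} := \{ w \in \mathbb{C}^n : |w_j - z_j| < r \ \forall j = 1, \dots, n \}$ lies wholly in $\Omega$, then
\[ f(z) = (\pi r^2)^{-n} \int_{D_{z,r}} f(w)\, dw, \]
so
\[ |f(z)| \le (\pi r^2)^{-n} \Big( \int_{D_{z,r}} dw \Big)^{1/2} \Big( \int_{D_{z,r}} |f(w)|^2\, dw \Big)^{1/2} \le (\pi r^2)^{-n/2}\, \|f\|. \]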


Consequently, the evaluation functional $f \mapsto f(z)$ is bounded on $L^2_{\mathrm{hol}}(\Omega)$, and uniformly for $z$ in compact subsets of $\Omega$. From the latter it follows, first of all, that $L^2_{\mathrm{hol}}$ is a closed subspace of $L^2$, hence a Hilbert space in its own right; while the former again implies that there exists a unique $K_z \in L^2_{\mathrm{hol}}(\Omega)$ such that
\[ f(z) = \langle f, K_z \rangle \qquad \forall f \in L^2_{\mathrm{hol}}(\Omega). \]
The function
\[ K(x, y) \equiv K_y(x) = \langle K_y, K_x \rangle = \overline{K(y, x)} \tag{14} \]
is thus the reproducing kernel of $L^2_{\mathrm{hol}}(\Omega)$, called the Bergman kernel; note that from (14) it is immediate that it is holomorphic in $x$ and anti-holomorphic in $y$. Furthermore, since $\Omega$ was assumed to be bounded, hence of finite Lebesgue measure, the constant function $\mathbf{1}$ on $\Omega$ belongs to $L^2_{\mathrm{hol}}(\Omega)$, and, consequently,
\[ 1 = \mathbf{1}(x) = \langle \mathbf{1}, K_x \rangle \le \|\mathbf{1}\|\, \|K_x\|, \tag{15} \]
implying that $\|K_x\| > 0$ for all $x \in \Omega$.

3.2. Berezin symbols

While quantization is a recipe for associating operators to functions, here we come across an assignment going in the other direction, i.e., mapping operators on some Hilbert space into functions on some domain. Such a function is commonly called the symbol of the corresponding operator, and the whole process is often called a symbol calculus, or dequantization. (Similarly, quantization is sometimes called an operator calculus in various contexts.) Here is an instance of such a process, which is characteristic for the Bergman spaces.

For an operator $T$ on the Bergman space $L^2_{\mathrm{hol}}(\Omega)$, the Berezin symbol $\widetilde{T}$ of $T$ is the function on $\Omega$ given by
\[ \widetilde{T}(x) = \frac{\langle T K_x, K_x \rangle}{\langle K_x, K_x \rangle} = \langle T k_x, k_x \rangle, \qquad k_x := \frac{K_x}{\|K_x\|}. \]
Note that this definition makes sense, since the denominator is positive by (15). There are a number of properties of the symbol map $T \mapsto \widetilde{T}$ immediate from its definition:

• The mapping $T \mapsto \widetilde{T}$ is linear.
• $\widetilde{I} = \mathbf{1}$, i.e., the symbol of the identity operator is the function constant one.
• $\widetilde{T^*} = \overline{\widetilde{T}}$.
• If $T$ is bounded, then $\widetilde{T}$ is a bounded function; in fact, $\|\widetilde{T}\|_\infty \le \|T\|$.

Moreover, the function $\widetilde{T}$ is smooth (in fact, even real-analytic), because it is the restriction to the diagonal $x = y$ of the function of two variables
\[ \widetilde{T}(x, y) := \frac{\langle T K_y, K_x \rangle}{\langle K_y, K_x \rangle} = \frac{\langle T K_y, K_x \rangle}{K(x, y)} \]
holomorphic in $x, \bar{y}$ on the set where $K(x, y) \neq 0$. (Since we know that $K(x, x) = \|K_x\|^2 > 0$ by (15), by continuity $K(x, y)$ is nonzero in some neighbourhood of the diagonal.)
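For a concrete instance of these notions, one can take $\Omega$ to be the unit disc in $\mathbb{C}$. This example is not worked out in the text above; it relies on the standard facts that the monomials $e_n(z) = \sqrt{(n+1)/\pi}\, z^n$ form an orthonormal basis of $L^2_{\mathrm{hol}}$ of the disc and that its Bergman kernel is $K(x, y) = \pi^{-1}(1 - x\bar{y})^{-2}$. The sketch below, with function names of our own, compares the truncated sum $\sum_n e_n(x)\overline{e_n(y)}$ with this closed form and evaluates the Berezin symbol of the operator of multiplication by $z$.

```python
import math

def e(n, z):
    """Orthonormal monomial e_n(z) = sqrt((n+1)/pi) * z^n on the unit disc."""
    return math.sqrt((n + 1) / math.pi) * z ** n

def kernel_series(x, y, terms=200):
    """Truncated sum over n of e_n(x) * conj(e_n(y))."""
    return sum(e(n, x) * e(n, y).conjugate() for n in range(terms))

def kernel_closed(x, y):
    """Closed form K(x, y) = 1 / (pi * (1 - x*conj(y))^2)."""
    return 1.0 / (math.pi * (1 - x * y.conjugate()) ** 2)

def berezin_symbol_of_Tz(x, terms=200):
    """<T_z K_x, K_x> / <K_x, K_x>, expanded in the basis e_n.

    Uses K_x = sum_n conj(e_n(x)) e_n and z*e_n = sqrt((n+1)/(n+2)) e_{n+1}.
    """
    num = sum(e(n, x).conjugate() * math.sqrt((n + 1) / (n + 2)) * e(n + 1, x)
              for n in range(terms))
    return num / kernel_series(x, x, terms)

x, y = 0.3 + 0.4j, -0.2 + 0.5j
print(abs(kernel_series(x, y) - kernel_closed(x, y)))
print(berezin_symbol_of_Tz(x))
```

In this example the symbol of multiplication by $z$ comes out as the function $x \mapsto x$ itself, which is consistent with $\langle T_z K_x, K_x\rangle = (zK_x)(x) = x\,K(x,x)$.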

However, the most important property of the symbol map is that
\[ T \mapsto \widetilde{T} \quad \text{is one-to-one.} \tag{16} \]
Indeed, suppose $\widetilde{T}(x) = \widetilde{T}(x, x) = 0$ $\forall x$. Setting $x = u + iv$, $\bar{y} = u - iv$, it follows that $G(u, v) := \widetilde{T}(u + iv, \bar{u} + i\bar{v})$ is a holomorphic function of $u, v$ which vanishes for all $u, v$ real. By the uniqueness principle for holomorphic functions, $G$ must vanish identically, so $\widetilde{T}(x, y) = 0$ $\forall x, y$, hence $\langle T K_x, K_y \rangle = T K_x(y) = 0$ $\forall x, y$. However,
\[ T^* f(x) = \langle T^* f, K_x \rangle = \langle f, T K_x \rangle = \int_\Omega f(y)\, \overline{T K_x(y)}\, dy, \]
so $T^* f(x) = 0$ for all $f$ and $x$. Hence, $T^* = 0$ and $T = 0$, proving the injectivity of the map $T \mapsto \widetilde{T}$.

3.3. Toeplitz operators on the Bergman space

As before, the Toeplitz operator on $L^2_{\mathrm{hol}}(\Omega)$ with symbol $\phi \in L^\infty(\Omega)$ is defined as
\[ T_\phi g = P(\phi g) \]
where $P : L^2 \to L^2_{\mathrm{hol}}$ is the orthogonal projection (called the Bergman projection). All the properties familiar from the Fock space setting remain in force here:

• $f \mapsto T_f$ is linear;
• $T_{\mathbf{1}} = I$;
• $T_f^* = T_{\bar{f}}$;
• $\|T_f\| \le \|f\|_\infty$.

Furthermore, for $\phi$ bounded holomorphic, $T_\phi$ is just the operator of “multiplication by $\phi$” on the Bergman space; and for $\phi$ bounded holomorphic and $f$ arbitrary,
\[ T_{f\phi} = T_f T_\phi, \qquad T_{\bar{\phi} f} = T_{\bar{\phi}}\, T_f. \]
The difference from the Fock space is that now, since $\Omega$ is bounded, there are plenty of bounded holomorphic functions on $\Omega$ (not just the constants), e.g., all holomorphic polynomials.

We finally remark that the map $f \mapsto T_f$ is also one-to-one, although, in contrast to the corresponding property of the Berezin symbol map from §3.2, this will not be needed. Indeed, assume that $T_f = 0$; then $\langle T_f u, v \rangle = \langle f u, v \rangle = 0$ for any holomorphic polynomials $u, v$, in particular, $\langle f z^j, z^m \rangle = 0$, or
\[ \int_\Omega f(z)\, z^j\, \bar{z}^m\, dz = 0 \]
for any multiindices $j, m$. By the Stone–Weierstrass theorem, this implies that $\int_\Omega f(z) g(z)\, dz = 0$ for any function $g$ continuous on the closure $\overline{\Omega}$ of $\Omega$. By the Riesz representation theorem, this means that $f(z)\, dz$ is the zero measure, and, consequently, that $f = 0$ almost everywhere, as claimed.
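The product rules above, and the fact that the order of the factors matters, can be seen in matrix form on the unit disc. This is again an illustrative assumption: the choice of domain, the orthonormal basis $e_n(z) = \sqrt{(n+1)/\pi}\, z^n$ and the function names are ours. For the symbol $z^a \bar{z}^b$, integration in polar coordinates gives $\langle T e_m, e_n \rangle = \sqrt{(m+1)(n+1)}/(a+m+1)$ if $a + m = b + n$, and $0$ otherwise.

```python
import math

N = 8  # truncation size of the matrices

def toeplitz(a, b, size=N):
    """Matrix [<T e_m, e_n>] (row n, column m) for the symbol z^a * conj(z)^b."""
    return [[math.sqrt((m + 1) * (n + 1)) / (a + m + 1) if a + m == b + n else 0.0
             for m in range(size)]
            for n in range(size)]

def matmul(A, B):
    """Plain matrix product of two square matrices given as nested lists."""
    size = len(A)
    return [[sum(A[n][k] * B[k][m] for k in range(size)) for m in range(size)]
            for n in range(size)]

T_z = toeplitz(1, 0)      # multiplication by z (a weighted shift)
T_zbar = toeplitz(0, 1)   # its adjoint
T_abs2 = toeplitz(1, 1)   # symbol |z|^2

good = matmul(T_zbar, T_z)   # T_{conj(z)} T_z, expected to match T_{|z|^2}
bad = matmul(T_z, T_zbar)    # T_z T_{conj(z)}, which is a different operator

print([round(good[n][n], 6) for n in range(N - 1)])
print([round(T_abs2[n][n], 6) for n in range(N - 1)])
print([round(bad[n][n], 6) for n in range(N - 1)])
```

Only the leading $(N-1)\times(N-1)$ block of `good` should be compared with `T_abs2`, since the last row and column are affected by the truncation.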


3.4. Berezin transform

The Toeplitz correspondence assigns the operator $T_f$ to a function $f$, while the Berezin symbol map assigns the function $\widetilde{T}$ to an operator $T$. The Berezin transform is the composition of these two maps; that is, it assigns to a function $f$ on $\Omega$ again a function on $\Omega$, denoted $Bf$ or $\tilde{f}$, and given by
\[ Bf := \tilde{f} := \widetilde{T_f}. \]
Chasing through the definitions shows that $B$ is in fact an integral operator:
\[ \tilde{f}(x) = \frac{\langle f K_x, K_x \rangle}{\langle K_x, K_x \rangle} = \int_\Omega f(y)\, \frac{|K(x, y)|^2}{K(x, x)}\, dy. \]
One also checks easily that $B$ has the following properties, which can either be derived from those of the Toeplitz operators and the Berezin symbols, or verified directly.

• $f \mapsto Bf$ is linear;
• $B\mathbf{1} = \mathbf{1}$;
• $B\bar{f} = \overline{Bf}$;
• $\|Bf\|_\infty \le \|f\|_\infty$.

Also, $Bf$ is always a real-analytic function on $\Omega$, and the operator $B$ is one-to-one.

3.5. Weighted variants

In an obvious manner, all the objects described in §§3.1–3.4 generalize also to the case of weighted $L^2$ spaces. Namely, let $w > 0$ be a positive continuous weight on $\Omega$, integrable there with respect to the Lebesgue measure. The associated weighted Bergman space on $\Omega$ with respect to $w$ is the subspace $L^2_{\mathrm{hol}}(\Omega, w)$ of all holomorphic functions in $L^2(\Omega, w)$. Using the mean-value property of harmonic functions, one again shows that the point evaluations $f \mapsto f(z)$ are continuous on $L^2_{\mathrm{hol}}(\Omega, w)$, uniformly on compact subsets (the continuity and positivity of $w$ is needed here); implying as before that $L^2_{\mathrm{hol}}(\Omega, w)$ is a closed subspace of $L^2(\Omega, w)$ – hence a Hilbert space in its own right – and that it possesses a reproducing kernel, the weighted Bergman kernel $K_w(x, y) \equiv K_{w,y}(x)$.

The Berezin symbol $\widetilde{T}$ of an operator $T$ on $L^2_{\mathrm{hol}}(\Omega, w)$ is the function on $\Omega$,
\[ \widetilde{T}(x) = \frac{\langle T K_{w,x}, K_{w,x} \rangle}{\langle K_{w,x}, K_{w,x} \rangle} = \langle T k_{w,x}, k_{w,x} \rangle, \qquad k_{w,x} := \frac{K_{w,x}}{\|K_{w,x}\|}. \]
(Naturally, $\widetilde{T}$ depends also on the weight $w$, although this is not reflected in the notation.) Here one needs that $K_w(x, x) = \|K_{w,x}\|^2 > 0$ for all $x \in \Omega$, which again follows as in (15) (and the hypothesis of the integrability of $w$ ensures that the function constant one belongs to $L^2_{\mathrm{hol}}(\Omega, w)$). Importantly, the Berezin symbol map $T \mapsto \widetilde{T}$ is still one-to-one (with the same proof as in the unweighted case).

The Toeplitz operator on $L^2_{\mathrm{hol}}(\Omega, w)$ with symbol $\phi \in L^\infty(\Omega)$ is defined as
\[ T_\phi f = P_w(\phi f) \]
where $P_w : L^2(\Omega, w) \to L^2_{\mathrm{hol}}(\Omega, w)$ is the orthogonal projection (the weighted Bergman projection). Finally, the weighted Berezin transform of a function $f$ on $\Omega$ is another function on $\Omega$, given by
\[ B_w f := \tilde{f} := \widetilde{T_f} \]
(again, the simpler notation $\tilde{f}$ does not reflect the fact that $\tilde{f}$ depends also on the weight $w$); and $B_w$ is in fact an integral operator
\[ B_w f(x) = \frac{\langle f K_{w,x}, K_{w,x} \rangle}{\langle K_{w,x}, K_{w,x} \rangle} = \int_\Omega f(y)\, \frac{|K_w(x, y)|^2}{K_w(x, x)}\, w(y)\, dy. \]
Let us now (at last!) describe how all these concepts can be utilized for the construction of the special deformation quantizations on $\Omega$ mentioned in the previous sections.
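As a numerical illustration of the (unweighted) Berezin transform of §3.4, one may again take $\Omega$ to be the unit disc and use the standard closed form $K(x, y) = \pi^{-1}(1 - x\bar{y})^{-2}$ of its Bergman kernel; neither the choice of domain nor the quadrature below comes from the text. The sketch approximates the integral formula for $Bf(x)$ by a midpoint rule in the radius and a uniform rule in the angle.

```python
import cmath
import math

def berezin_transform(f, x, nr=300, nt=256):
    """Approximate (Bf)(x) = integral of f(y) |K(x,y)|^2 / K(x,x) over the unit disc.

    Polar coordinates y = r*exp(i*theta); midpoint rule in r, uniform rule in theta.
    """
    kxx = 1.0 / (math.pi * (1 - abs(x) ** 2) ** 2)                 # K(x, x)
    total = 0.0
    for i in range(nr):
        r = (i + 0.5) / nr
        for j in range(nt):
            y = r * cmath.exp(2j * math.pi * j / nt)
            kxy = 1.0 / (math.pi * (1 - x * y.conjugate()) ** 2)   # K(x, y)
            total += f(y) * abs(kxy) ** 2 / kxx * r                # r from dy = r dr dtheta
    return total * (1.0 / nr) * (2 * math.pi / nt)

print(berezin_transform(lambda y: 1.0, 0.3 + 0.2j))     # compare with B1 = 1
print(berezin_transform(lambda y: abs(y) ** 2, 0j))     # integral of |y|^2 / pi over the disc
```

The first value illustrates the property $B\mathbf{1} = \mathbf{1}$; the second is the average of $|y|^2$ over the disc, since $|K(0, y)|^2 / K(0, 0) = 1/\pi$.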

4. Basic ideas of Berezin(–Toeplitz) quantization(s)

4.1. Berezin–Toeplitz quantization

For the Fock spaces $\mathcal{F}_\alpha$, $\alpha = \pi/h$, we have seen that the Toeplitz calculus assigning to a function $f$ on $\mathbb{C}^n$ the Toeplitz operator $T_f$ on $\mathcal{F}_\alpha$ yields a deformation quantization of $\mathbb{C}^n$. The main idea of Berezin–Toeplitz quantization is to use the Toeplitz operators in the same way also on a general domain $\Omega$. Of course, what is unclear is the right substitute for the Gaussian measures $e^{-\pi|z|^2/h}$ on $\mathbb{C}^n$. The main problem in the Berezin–Toeplitz quantization is thus to find a family of weights $\rho_h$, $h > 0$, on the domain $\Omega$ such that the corresponding Toeplitz operators on $L^2_{\mathrm{hol}}(\Omega, \rho_h)$ satisfy
\[ T_f T_g = \sum_{j=0}^{\infty} h^j\, T_{C_j(f,g)} \tag{17} \]
in some sense, where $C_j$ are some bidifferential operators such that $C_0(f, g) = fg$ and
\[ C_1(f, g) - C_1(g, f) = \frac{i}{2\pi}\{f, g\} \]
for some given Poisson bracket $\{\cdot\,, \cdot\}$ on $\Omega$.

Recall that for $\Omega = \mathbb{C}$ and $\rho_h(z) = e^{-\pi|z|^2/h}\, h^{-1}\, dz$, this was fulfilled with $C_j(f, g) = \frac{1}{j!}(\partial^j f)(\bar\partial^j g)$. (And similarly for $\mathbb{C}^n$.) The operators $C_j \equiv C_j^{BT}$ then define a star-product
\[ f *_{BT} g := \sum_{j=0}^{\infty} h^j\, C_j^{BT}(f, g), \qquad f, g \in C^\infty(\Omega), \]
called a Berezin–Toeplitz star-product (and denoted by $*_{BT}$ to distinguish it from the various other star-products around).


4.2. Berezin quantization

This method is not based on Toeplitz operators, but rather on the Berezin symbols. Consider, quite generally, any weight $w$ on $\Omega$ of the kind discussed in §3.5. Since the Berezin symbol map $T \mapsto \widetilde{T}$ is one-to-one, we can introduce a noncommutative product $*_w$ on (some) functions on $\Omega$ by
\[ \widetilde{S} *_w \widetilde{T} := \widetilde{ST}. \]
The product $f *_w g$ is thus defined only for functions $f, g$ in the set
\[ \mathcal{A}_w := \{ \widetilde{T} : T \text{ is a bounded linear operator on } L^2_{\mathrm{hol}}(\Omega, w) \} \]
(which also depends on $w$). The product $f *_w g$ then also belongs to $\mathcal{A}_w$, and $*_w$ is associative (since the multiplication of operators is). The idea is to glue these non-commutative products $*_w$, as $w$ is allowed to vary with the Planck constant $h$, into a star product.

More precisely, the Berezin quantization amounts to finding a family of weights $\rho_h$, $h > 0$, such that the intersection
\[ \mathcal{A} := \bigcap_{h>0} \mathcal{A}_{\rho_h} \]
is sufficiently large, and such that for $f, g \in \mathcal{A}$,
\[ f *_{\rho_h} g = \sum_{j=0}^{\infty} h^j\, C_j(f, g) \]
asymptotically as $h \searrow 0$, where $C_j$ are some bidifferential operators with $C_0(f, g) = fg$ and
\[ C_1(f, g) - C_1(g, f) = \frac{i}{2\pi}\{f, g\} \]
for a given Poisson bracket $\{\cdot\,, \cdot\}$ on $\Omega$. Here “sufficiently large” means, basically, that $\mathcal{A}$ should be so large that the bilinear operators $C_j(f, g)$ are uniquely determined by their values on $f, g \in \mathcal{A}$. Since $C_j$ are differential operators in each argument, this will be the case, for instance, whenever for any point $x$, any finite set $J$ of multiindices, and any set of complex numbers $c_j$, $j \in J$, we can find an element $f \in \mathcal{A}$ such that $\partial^j f(x) = c_j$ $\forall j \in J$. In particular, it is enough if $\mathcal{A}$ contains all polynomials (in $z$ and $\bar{z}$) on $\Omega$. The resulting bidifferential operators $C_j \equiv C_j^B$ then, of course, define the desired star-product
\[ f *_B g := \sum_{j=0}^{\infty} h^j\, C_j^B(f, g), \qquad f, g \in C^\infty(\Omega), \]
called the Berezin star-product (and denoted $*_B$ to distinguish it from the Berezin–Toeplitz star-product of §4.1).

So far, we have not exhibited any example of the Berezin quantization, even on $\mathbb{C}^n$. We will do that by showing that it is in fact related to another problem which has a very familiar answer on $\mathbb{C}^n$.


4.3. Berezin quantization via the Berezin transform

In fact, the problem described in §4.2 can be reduced to one concerning the asymptotic behaviour of the weighted Berezin transforms $B_w$ with the appropriate weights $w$. More precisely, the following holds. Suppose we can find a family of weights $\rho_h$, $h > 0$, on $\Omega$, such that as $h \to 0$, the corresponding weighted Berezin transforms $B_{\rho_h} \equiv B_h$ have an asymptotic expansion
\[ B_h = Q_0 + hQ_1 + h^2 Q_2 + \cdots, \tag{18} \]
with some differential operators $Q_j$, where $Q_0 = I$. Let $c_{j\alpha\beta}$ be the coefficients of $Q_j$, i.e.,
\[ Q_j f =: \sum_{\alpha,\beta \text{ multiindices}} c_{j\alpha\beta}\, \partial^\alpha \bar\partial^\beta f; \]
and set
\[ f *_{Bt} g := \sum_{j=0}^\infty h^j\, C_j(f,g), \]
where
\[ C_j(f,g) \equiv C_j^{Bt}(f,g) := \sum_{\alpha,\beta} c_{j\alpha\beta}\, (\bar\partial^\beta f)(\partial^\alpha g). \tag{19} \]
If it happens that
\[ C_1(f,g) - C_1(g,f) = \frac{i}{2\pi}\{f,g\}, \]
then $*_{Bt}$ is a star product and
\[ f *_{Bt} g = f *_B g \qquad \forall f, g, \tag{20} \]
i.e., $*_{Bt}$ coincides with the Berezin star-product from §4.2.

The rest of this subsection is devoted to the proof of this assertion. Once this has been done, the construction of the Berezin quantization reduces to constructing a family of weights for which the associated Berezin transforms have nice asymptotics (18); this will be done in Section 5. Furthermore, the assertion also yields immediately an easy example of a Berezin quantization on $\mathbb C^n$; this, as well as some other examples, will be presented in §4.6 below.

So let us prove (20). Suppose we have a family of weights $\rho_h$ such that (18) holds. Denote by $Z_j = T_{z_j}$, $j = 1, \dots, n$, the Toeplitz operator on $L^2_{\mathrm{hol}}(\Omega, \rho_h)$ whose symbol is the coordinate function $z_j$; we have seen that $Z_j$ are actually just the multiplication operators
\[ Z_j : f(z) \mapsto z_j f(z). \]
Let $Z_j^*$ be the adjoint of $Z_j$ on $L^2_{\mathrm{hol}}(\Omega, \rho_h)$. (Thus $Z_j^*$ depends also on $h$, although it is not visible in the notation.) For $p(z, \bar z) = \sum_{\alpha,\beta} p_{\alpha\beta}\, z^\alpha \bar z^\beta$ a polynomial in $z$ and $\bar z$, define the operators
\[ V_p := \sum_{\alpha,\beta} p_{\alpha\beta}\, Z^\alpha Z^{*\beta} \]


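on each $L^2_{\mathrm{hol}}(\Omega, \rho_h)$, $h > 0$ (where we are using the obvious multiindex conventions $Z^\alpha = Z_1^{\alpha_1} \cdots Z_n^{\alpha_n}$ etc.). Note that owing to the hypothesis that the domain $\Omega$ is bounded, $Z_j$ and, hence, $V_p$ are bounded linear operators. Recall now our notation $K_y = K_{\rho_h}(\cdot, y)$ for the reproducing kernels, and the notation for the "two-variable Berezin symbol" of an operator $T$ on $L^2_{\mathrm{hol}}(\Omega, \rho_h)$,
\[ \tilde T(x,y) := \frac{\langle T K_y, K_x\rangle}{\langle K_y, K_x\rangle} = \frac{T K_y(x)}{K(x,y)} = \frac{\overline{T^* K_x(y)}}{K(x,y)}, \]
which is defined in some neighbourhood of the diagonal in $\Omega \times \Omega$ (where $K(x,y) \neq 0$) and whose restriction to the diagonal $x = y$ coincides with the Berezin symbol $\tilde T(x)$ of $T$. Applying this in particular to the operator $V_p$, we get
\[
\begin{aligned}
\tilde V_p(x,y) &= \frac{V_p K_y(x)}{K(x,y)} = \frac{\sum_{\alpha,\beta} p_{\alpha\beta}\, (Z^\alpha Z^{*\beta} K_y)(x)}{K(x,y)} \\
&= \frac{\sum_{\alpha,\beta} p_{\alpha\beta}\, x^\alpha (Z^{*\beta} K_y)(x)}{K(x,y)} = \frac{\sum_{\alpha,\beta} p_{\alpha\beta}\, x^\alpha \langle Z^{*\beta} K_y, K_x\rangle}{K(x,y)} \\
&= \frac{\sum_{\alpha,\beta} p_{\alpha\beta}\, x^\alpha \langle K_y, Z^\beta K_x\rangle}{K(x,y)} = \frac{\sum_{\alpha,\beta} p_{\alpha\beta}\, x^\alpha \bar y^\beta\, \overline{K_x(y)}}{K(x,y)} \\
&= \sum_{\alpha,\beta} p_{\alpha\beta}\, x^\alpha \bar y^\beta = p(x, \bar y) \qquad \text{for any } h.
\end{aligned}
\]
In particular, $\tilde V_p(x) = \tilde V_p(x,x) = p(x, \bar x)$ for any $h$. Consequently, $p \in \mathcal A_{\rho_h}$ for all $h$, that is, $p \in \mathcal A$; thus $\mathcal A$ contains all polynomials, settling the first requirement for the Berezin quantization from §4.2.

Next, for any two operators $T_1$, $T_2$ on $L^2_{\mathrm{hol}}(\Omega, \rho_h)$,
\[
\begin{aligned}
\widetilde{T_1 T_2}(x,y) &= \frac{\langle T_2 K_y, T_1^* K_x\rangle}{\langle K_y, K_x\rangle} = \frac{\int_\Omega T_2 K_y(z)\, \overline{T_1^* K_x(z)}\, \rho_h(z)\, dz}{\langle K_y, K_x\rangle} \\
&= \int_\Omega \frac{\tilde T_2(z,y) K_{\rho_h}(z,y) \cdot \tilde T_1(x,z) K_{\rho_h}(x,z)}{\langle K_y, K_x\rangle}\, \rho_h(z)\, dz.
\end{aligned}
\]
In particular,
\[
\widetilde{T_1 T_2}(x,x) = \int_\Omega \tilde T_1(x,z)\, \tilde T_2(z,x)\, \frac{|K_{\rho_h}(x,z)|^2}{K_{\rho_h}(x,x)}\, \rho_h(z)\, dz = B_h\big[\tilde T_1(x, \cdot)\, \tilde T_2(\cdot, x)\big](x).
\]
Thus if (18) holds, i.e.,
\[ B_h = \sum_{j=0}^\infty h^j Q_j \qquad \text{as } h \to 0, \]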


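with some differential operators $Q_j f = \sum_{\alpha,\beta} c_{j\alpha\beta}\, \partial^\alpha \bar\partial^\beta f$, and $C_j$ is defined by $C_j(f,g) := \sum_{\alpha,\beta} c_{j\alpha\beta}\, (\bar\partial^\beta f)(\partial^\alpha g)$, then we get, for $h \to 0$,
\[
\begin{aligned}
\widetilde{T_1 T_2}(x,x) &= \sum_{j=0}^\infty h^j\, Q_j\big[\tilde T_1(x, \cdot)\, \tilde T_2(\cdot, x)\big](x) \\
&= \sum_{j,\alpha,\beta} h^j c_{j\alpha\beta}\, \big[\bar\partial^\beta \tilde T_1(x, \cdot)\big]\big|_x\, \big[\partial^\alpha \tilde T_2(\cdot, x)\big]\big|_x .
\end{aligned}
\]
Now since $\tilde T(x) = \tilde T(x,x)$ and $\tilde T(x,y)$ is holomorphic in $x$ and anti-holomorphic in $y$, we have
\[ \big[\bar\partial^\beta \tilde T_1(x, \cdot)\big]\big|_x = \bar\partial^\beta \tilde T_1(x) \]
(the $\tilde T$ on the left-hand side is the two-variable symbol $\tilde T(x,y)$, and the $\tilde T$ on the right-hand side is the Berezin symbol $\tilde T(x)$). Similarly,
\[ \big[\partial^\alpha \tilde T_2(\cdot, x)\big]\big|_x = \partial^\alpha \tilde T_2(x). \]
Thus
\[
\widetilde{T_1 T_2} = \sum_{j,\alpha,\beta} h^j c_{j\alpha\beta}\, (\bar\partial^\beta \tilde T_1)(\partial^\alpha \tilde T_2) = \sum_j h^j\, C_j(\tilde T_1, \tilde T_2) = \tilde T_1 *_{Bt} \tilde T_2,
\]
by the definition of $*_{Bt}$. On the other hand, $\widetilde{T_1 T_2} = \tilde T_1 *_{\rho_h} \tilde T_2$, by the definition of $*_w$ (with $w = \rho_h$) in §4.2; so
\[ \tilde T_1 *_{Bt} \tilde T_2 = \tilde T_1 *_{\rho_h} \tilde T_2. \]
Applying this to $T_1 = V_p$, $T_2 = V_q$ with some polynomials $p, q$ in $z, \bar z$, and recalling that $\tilde V_p = p$, this means that
\[ p *_{Bt} q = p *_{\rho_h} q \]
for any polynomials $p, q$ in $z, \bar z$. Since any $f \in C^\infty(\Omega)$ can be approximated, at any given point, to any finite order by polynomials, and the $C_j(\cdot,\cdot)$ for both $*_{Bt}$ and $*_B$ are differential operators in each argument, necessarily
\[ C_j^{Bt}(f,g)(x) = C_j^B(f,g)(x) \]
for all $f, g \in C^\infty(\Omega)$ and $x \in \Omega$; that is, $*_{Bt} = *_B$, completing our proof.

4.4. Berezin–Toeplitz quantization via the Berezin transform

On a slightly more heuristic level, it is possible to derive not only the Berezin, but also the Berezin–Toeplitz quantization (§4.1) from the asymptotics (18) of the Berezin transform; that is, we can show that if (18) holds, then
\[ [T_f, T_g] \approx h\, T_{\{f,g\}} \tag{21} \]
as the Planck constant $h \searrow 0$. While this will not be directly needed anywhere in the sequel, we believe it is worth mentioning here.

Assume first that $f$ and $\bar g$ are holomorphic. Then for any $\phi \in L^2_{\mathrm{hol}}$,
\[ \langle T_f \phi, K_x\rangle = \langle f\phi, K_x\rangle = f(x)\phi(x) = f(x)\langle \phi, K_x\rangle. \]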


It follows that $T_f^* K_x = \overline{f(x)}\, K_x$. Similarly $T_g K_x = g(x) K_x$. Hence
\[
\widetilde{T_f T_g}(x) = \frac{\langle T_f T_g K_x, K_x\rangle}{\langle K_x, K_x\rangle} = \frac{\langle T_g K_x, T_f^* K_x\rangle}{\langle K_x, K_x\rangle} = \frac{\langle g(x) K_x, \overline{f(x)} K_x\rangle}{\langle K_x, K_x\rangle} = f(x) g(x);
\]
that is, $\widetilde{T_f T_g} = fg$. On the other hand, by definition of the Berezin transform and (18),
\[ \widetilde{T_{fg}} = B_h(fg) = fg + hQ_1(fg) + O(h^2). \]
Subtracting this from $\widetilde{T_f T_g} = fg$ gives
\[ (T_f T_g - T_{fg})^\sim = -hQ_1(fg) + O(h^2) = -h\, \widetilde{T_{Q_1(fg)}} + O(h^2). \]
"Removing the tilde" (yes, this is the heuristic part) and noting that, by (19), $Q_1(fg) = C_1^B(g,f)$ when $f$ is holomorphic and $g$ is anti-holomorphic, we get, for $f$, $\bar g$ holomorphic,
\[ T_f T_g - T_{fg} = -h\, T_{C_1^B(g,f)} + O(h^2), \tag{22} \]
where $C_1^B$ is the $C_1$ from the Berezin quantization. Note that, as we have seen in §4.3, $C_1^B(g,f)$ involves only holomorphic derivatives of $f$ and anti-holomorphic derivatives of $g$ (i.e., only $\partial^\alpha f$ and $\bar\partial^\beta g$). This also means, in particular, that for any holomorphic functions $u, v$,
\[ C_1^B(ug, \bar v f) = u\, C_1^B(g,f)\, \bar v. \]
On the other hand, we have seen in §3.3 that for $u, v$ as above and arbitrary $F$ and $G$,
\[ T_G T_u = T_{uG}, \qquad T_{\bar v} T_F = T_{\bar v F}. \]
Multiplying (22) by $T_{\bar v}$ from the left and $T_u$ from the right, we therefore obtain
\[
\begin{aligned}
T_{\bar v f} T_{gu} - T_{\bar v f g u} &= T_{\bar v}\,[T_f T_g - T_{fg}]\,T_u \\
&= -h\, T_{\bar v}\, T_{C_1^B(g,f)}\, T_u + O(h^2) \\
&= -h\, T_{\bar v\, C_1^B(g,f)\, u} + O(h^2) \\
&= -h\, T_{C_1^B(ug, \bar v f)} + O(h^2).
\end{aligned}
\]
That is, (22) holds not only for $f$, $\bar g$ holomorphic, but for any $f, g$ of the form $u \bar v$ with holomorphic $u, v$. By the same approximation argument as in the end of §4.3, we conclude that actually
\[ T_f T_g - T_{fg} = -h\, T_{C_1^B(g,f)} + O(h^2) \]
for any $f, g \in C^\infty(\Omega)$. That is, we have obtained the first two terms
\[ T_f T_g = T_{C_0^{BT}(f,g)} + h\, T_{C_1^{BT}(f,g)} + O(h^2) \]


of the Berezin–Toeplitz star-product (17), showing, incidentally, that ($C_0^{BT}(f,g) = fg$ and)
\[ C_1^{BT}(f,g) = -C_1^B(g,f). \tag{23} \]
It is clear how to continue this argument to obtain also the higher-order terms $C_j^{BT}$ and, hence, the entire Berezin–Toeplitz star-product.

4.5. Connection between Berezin and Toeplitz quantizations

The relationship (23) between the Berezin and the Berezin–Toeplitz operator $C_1$ can actually be put into a rather neat form. Recall that we have our three mappings $f \mapsto T_f$ (the Toeplitz operators), $T \mapsto \tilde T$ (the Berezin symbol), and their composition $f \mapsto \widetilde{T_f} = B_h f$ (the Berezin transform). In terms of these, the Berezin–Toeplitz star product was defined by
\[ T_f T_g = T_{f *_{BT} g}, \tag{24} \]
while the Berezin star product was, essentially, defined by
\[ \tilde T *_B \tilde S = \widetilde{TS}. \]
Applying the last formula to $T = T_f$, $S = T_g$, and using (24), gives
\[ \widetilde{T_f} *_B \widetilde{T_g} = \widetilde{T_f T_g} = \widetilde{T_{f *_{BT} g}}, \]
or
\[ Bf *_B Bg = B(f *_{BT} g). \]
In other words, the Berezin and the Berezin–Toeplitz star-products are intertwined (conjugate) by the Berezin transform. From this, one easily gets the higher-order analogues of the relation (23), i.e., involving $C_j^B$ and $C_j^{BT}$ (and the operators $Q_j$) for $j \ge 1$.

4.6. Some examples of Berezin and Berezin–Toeplitz quantizations

We have already worked out the Berezin–Toeplitz quantization on $\mathbb C^n$ in some detail in Section 2 [footnote 1]; let us see how the other approaches discussed in this section work out in this case. Thus, let $\Omega = \mathbb C^n$ and $\rho_h(z) = e^{-\alpha|z|^2} (\alpha/\pi)^n\, dz$, with $\alpha = \pi/h > 0$; note that the "classical limit" $h \searrow 0$ now corresponds to $\alpha \to +\infty$. Since we know the reproducing kernel to be given by $K_\alpha(x,y) = e^{\alpha\langle x, y\rangle}$, the formula for the Berezin transform becomes
\[
B_\alpha f(x) = \int_{\mathbb C^n} f(y)\, \frac{|K_\alpha(x,y)|^2}{K_\alpha(x,x)}\, \rho_h(y)\, dy = \Big(\frac{\alpha}{\pi}\Big)^n \int_{\mathbb C^n} f(y)\, e^{-\alpha\|x-y\|^2}\, dy.
\]

[Footnote 1: Strictly speaking, the Berezin–Toeplitz quantization as defined in §4.1 does not apply to $\mathbb C^n$, since our domain $\Omega$ throughout this whole section is assumed to be bounded (in order to have nontrivial bounded holomorphic functions on $\Omega$, such as the polynomials); however, it is still illustrative to include also the case of $\Omega = \mathbb C^n$ here, albeit with the caveats about dealing with unbounded operators like $T_z$ etc. in general.]


This is precisely the heat solution operator at the time $t = 1/4\alpha$:
\[ B_\alpha f = e^{\Delta/4\alpha} f. \]
In particular, as $\alpha \to +\infty$, we get $B_\alpha f \to f$; more precisely, there is even an asymptotic expansion
\[ B_\alpha f(x) = e^{\Delta/4\alpha} f(x) = f(x) + \frac{\Delta f(x)}{4\alpha} + \frac{\Delta^2 f(x)}{2!\,(4\alpha)^2} + \cdots, \]
or more briefly
\[ B_\alpha = e^{\Delta/4\alpha} = \sum_{j=0}^\infty \alpha^{-j}\, \frac{\Delta^j}{j!\,4^j}. \]
From §4.3, we conclude that the Berezin quantization works for the above choice of weights $\rho_h$ on $\mathbb C^n$, with
\[ C_j(f,g) = C_j^B(f,g) := \frac{1}{j!} \sum_{|\alpha|=j} (\bar\partial^\alpha f)(\partial^\alpha g). \]
This can be compared with the Berezin–Toeplitz quantization formula for the same choice of weights from Section 2:
\[ C_j(f,g) = C_j^{BT}(f,g) := \frac{(-1)^j}{j!} \sum_{|\alpha|=j} (\partial^\alpha f)(\bar\partial^\alpha g). \]
Both quantize the Euclidean Poisson bracket on $\mathbb C^n$ (spelled out in the axiom (A3) in Section 1).

The second example which can be worked out explicitly to some level is the unit disc $\Omega = \mathbb D := \{z \in \mathbb C : |z| < 1\}$ in $\mathbb C$, with weights $\rho_h(z) = \frac{\alpha+1}{\pi}(1 - |z|^2)^\alpha$, $\alpha > -1$; the parameter $\alpha$ again plays the role of the reciprocal of $h$, so that $h \searrow 0$ corresponds to $\alpha \to +\infty$. A standard calculation in polar coordinates, similar to the one we did for the Fock space, shows that the reproducing kernels are
\[ K_\alpha(x,y) = \frac{1}{(1 - x\bar y)^{\alpha+2}}. \]
This gives the formula for the Berezin transform:
\[ B_\alpha f(x) = \frac{\alpha+1}{\pi} \int_{\mathbb D} f(y)\, \frac{(1 - |x|^2)^{\alpha+2}}{|1 - x\bar y|^{2\alpha+4}}\, (1 - |y|^2)^\alpha\, dy. \]
With some work, it can again be shown that as $\alpha \to +\infty$,
\[ B_\alpha f = f + \frac{\tilde\Delta f}{4\alpha} + \cdots \]
where
\[ \tilde\Delta f = (1 - |z|^2)^2\, \Delta f \]
is the invariant Laplacian on $\mathbb D$. (The $Q_j$ for $j > 1$ are already a bit complicated and involve Bernoulli numbers; an explicit expression for general $j$ is not known.)
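The two integral formulas for $B_\alpha$ can be checked numerically. The following is a minimal sketch, not taken from the source: it evaluates the Berezin transform on $\mathbb C$ and on $\mathbb D$ by a midpoint rule in polar coordinates and compares with values that follow from the expansions for the test function $f(z) = |z|^2$. The function names, grid sizes and the radial cut-off are illustrative assumptions.

```python
import cmath
import math


def berezin_fock(f, x, alpha, nr=2000, ntheta=64):
    """Approximate (alpha/pi) * integral over C of f(y) exp(-alpha |x-y|^2) dy (case n = 1)."""
    rmax = 8.0 / math.sqrt(alpha)      # assumed cut-off: the Gaussian is below exp(-64) beyond it
    dr = rmax / nr
    dtheta = 2.0 * math.pi / ntheta
    total = 0.0
    for i in range(nr):
        r = (i + 0.5) * dr             # midpoint rule in the radial variable, centred at x
        ring = sum(f(x + r * cmath.exp(1j * k * dtheta)) for k in range(ntheta))
        total += ring * math.exp(-alpha * r * r) * r
    return (alpha / math.pi) * total * dr * dtheta


def berezin_disc(f, x, alpha, nr=400, ntheta=256):
    """Approximate B_alpha f(x) on the unit disc for the weight (alpha+1)/pi * (1-|y|^2)^alpha."""
    dr = 1.0 / nr
    dtheta = 2.0 * math.pi / ntheta
    total = 0.0
    for i in range(nr):
        r = (i + 0.5) * dr
        for k in range(ntheta):
            y = r * cmath.exp(1j * k * dtheta)
            kernel = abs(1.0 - x * y.conjugate()) ** (-(2.0 * alpha + 4.0))
            total += f(y) * kernel * (1.0 - r * r) ** alpha * r
    norm = (alpha + 1.0) / math.pi * (1.0 - abs(x) ** 2) ** (alpha + 2.0)
    return norm * total * dr * dtheta


def f(z):
    return abs(z) ** 2


alpha = 20.0
x = 0.3 + 0.4j
fock_value = berezin_fock(f, x, alpha)                 # heat-operator value: |x|^2 + 1/alpha
disc_value = berezin_disc(f, 0.0, alpha)               # closed form at the origin: 1/(alpha + 2)
disc_mass = berezin_disc(lambda z: 1.0, 0.5, alpha)    # B_alpha maps constants to themselves
```

For $f = |z|^2$ on $\mathbb C$ the series terminates, since $\Delta f = 4$ and $\Delta^2 f = 0$, so the first value should agree with $|x|^2 + 1/\alpha$ up to quadrature error. On the disc, the value at the origin is $1/(\alpha+2) = 1/\alpha + O(\alpha^{-2})$, in agreement with $\tilde\Delta f(0)/4\alpha = 1/\alpha$ to leading order.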


The results of §4.3 thus again tell us that the Berezin quantization on $\mathbb D$ works for the above choice of weights, with
\[ C_0^B(f,g) = fg, \qquad C_1^B(f,g) = (1 - |z|^2)^2\, \bar\partial f\, \partial g. \]
Similarly, the Berezin–Toeplitz quantization works, with
\[ C_0^{BT}(f,g) = fg, \qquad C_1^{BT}(f,g) = -(1 - |z|^2)^2\, \partial f\, \bar\partial g. \]
Explicit expressions for $C_j^B$ and $C_j^{BT}$ for general $j \ge 2$ are again unknown. Both methods quantize the Poisson bracket
\[ \{f,g\} = (1 - |z|^2)^2\, (\bar\partial f\, \partial g - \bar\partial g\, \partial f) \]
associated to the invariant (= Poincaré, Lobachevsky) metric on $\mathbb D$.

Our third and final example concerns the unit ball $\Omega = \mathbb B^n := \{z \in \mathbb C^n : |z| < 1\}$ in $\mathbb C^n$, with weights $\rho_h(z) = c_\alpha (1 - |z|^2)^\alpha$, where $\alpha = 1/h \to +\infty$ and $c_\alpha$ is a normalizing constant giving $\rho_h$ total mass 1. The reproducing kernel equals
\[ K_\alpha(x,y) = \frac{1}{(1 - \langle x, y\rangle)^{\alpha+n+1}}, \]
yielding the expression for the Berezin transform
\[ B_\alpha f(x) = c_\alpha \int_{\mathbb B^n} f(y)\, \frac{(1 - |x|^2)^{\alpha+n+1}}{|1 - \langle x, y\rangle|^{2\alpha+2n+2}}\, (1 - |y|^2)^\alpha\, dy. \]
Again,
\[ B_\alpha f = f + \frac{\tilde\Delta f}{4\alpha} + \cdots \]
as $\alpha \to +\infty$, with $\tilde\Delta$ the invariant Laplacian on $\mathbb B^n$. Both the Berezin and the Berezin–Toeplitz quantizations work for the above choice of weights, and their coefficients $C_j$ are given by formulas of a similar nature as for the disc.

For a later occasion, it is instructive to summarize some observations from these examples here. Looking at the weights and the corresponding reproducing kernels in the three cases, namely,
\[ \rho_\alpha(z) = \Big(\frac{\alpha}{\pi}\Big)^n e^{-\alpha|z|^2}, \qquad K_\alpha(x,y) = e^{\alpha\langle x, y\rangle} \]
for the Fock space on $\mathbb C^n$;
\[ \rho_\alpha(z) = \frac{\alpha+1}{\pi}(1 - |z|^2)^\alpha, \qquad K_\alpha(x,y) = (1 - x\bar y)^{-\alpha-2} \]
for the disc; and
\[ \rho_\alpha(z) = c_\alpha (1 - |z|^2)^\alpha, \qquad K_\alpha(x,y) = (1 - \langle x, y\rangle)^{-\alpha-n-1} \]
for the ball, we observe that $K_\alpha(x,x)$ is just the reciprocal of the weight $\rho_h(x)$, up to the normalization constants and possibly a shift in the exponent $\alpha$. Furthermore, we have seen in all three cases that the Berezin transform $B_\alpha$ is an approximate identity as $\alpha \to +\infty$, more precisely
\[ B_\alpha = I + \frac{Q_1}{\alpha} + \frac{Q_2}{\alpha^2} + \cdots, \]


where $Q_1$ is, up to a constant factor, some kind of "invariant Laplacian" on the domain in question. We will see in Section 5 later that both these observations, in fact, remain in force in a much more general setting.

4.7. How to choose the weights $\rho_h$

The main problem for carrying out both the Berezin and the Berezin–Toeplitz quantization is thus to find the weights $\rho_h$, $h > 0$, on $\Omega$ so that (17) and (18) hold. There is a way to see what should be the right choice, which we now describe.

It is time we gave a precise definition of the object we wish to quantize, the Poisson bracket on our domain (or manifold) $\Omega$. Quite generally, a symplectic manifold is a real manifold equipped with a 2-form
\[ \omega = \sum_{j,k=1}^m g_{jk}\, dx_j \wedge dx_k \]
which is non-degenerate (i.e., the matrix $\{g_{jk}\}_{j,k=1}^m$ is invertible) and closed ($d\omega = 0$). Here $m$ is the real dimension of the manifold, which must necessarily be even. The Poisson bracket is then defined as
\[ \{f,g\} = \sum_{j,k=1}^m g^{jk}\, \frac{\partial f}{\partial x_j} \frac{\partial g}{\partial x_k} \]
where $\{g^{jk}\}_{j,k=1}^m$ is the inverse matrix to $\{g_{jk}\}_{j,k=1}^m$. For the case of complex manifolds that we have here, it is furthermore important that the symplectic form be compatible with the complex structure, and also it is more convenient to use the complex coordinates $z_j, \bar z_j$, $j = 1, \dots, n$, rather than the real coordinates $x_k$, $k = 1, \dots, m$, $m = 2n$. On the level of the form $\omega$, this translates into the fact that $\omega$ is Kähler, meaning that (in local coordinates)
\[ \omega = \sum_{j,k=1}^n g_{j\bar k}\, dz_j \wedge d\bar z_k \]
with some positive-definite matrix $\{g_{j\bar k}\}_{j,k=1}^n$ satisfying
\[ \partial_l g_{j\bar k} = \partial_j g_{l\bar k}, \qquad \bar\partial_l g_{j\bar k} = \bar\partial_k g_{j\bar l}. \tag{25} \]
The Poisson bracket is then given by
\[ \{f,g\} = \sum_{j,k=1}^n g^{\bar j k}\, (\bar\partial_j f\, \partial_k g - \partial_j f\, \bar\partial_k g), \tag{26} \]
where $\{g^{\bar j k}\}_{j,k=1}^n$ is the inverse matrix to $\{g_{j\bar k}\}$. Finally, the 2-form $\omega$ determines (both in the symplectic and in the Kähler case) also a nonvanishing volume element $\omega^n$ on $\Omega$.

To find the right choice of the weights $\rho_h$, we take guidance from group invariance.


Assume there is a group $G$ acting on $\Omega$ by biholomorphic transformations preserving the form $\omega$. Naturally, we would then want our quantizations to be $G$-invariant, i.e., to satisfy
\[ (f \circ \phi) * (g \circ \phi) = (f * g) \circ \phi, \qquad \forall \phi \in G. \]
On the level of the Berezin quantization, this means that the operators $Q_j$ in (18), and, hence, $B$ itself, should commute with the action of $G$. An examination of the formula defining the Berezin transform with respect to some weight $\rho$ shows that this happens if and only if
\[ \frac{|K_\rho(x,y)|^2}{K_\rho(y,y)}\, \rho(x)\, dx = \frac{|K_\rho(\phi(x), \phi(y))|^2}{K_\rho(\phi(y), \phi(y))}\, \rho(\phi(x))\, d\phi(x). \]
In particular, the ratio
\[ \frac{\rho(\phi(x))\, d\phi(x)}{\rho(x)\, dx} = \frac{|K_\rho(x,y)|^2}{K_\rho(y,y)} \cdot \frac{K_\rho(\phi(y), \phi(y))}{|K_\rho(\phi(x), \phi(y))|^2} \]
has to be the squared modulus of a holomorphic function. Writing
\[ \rho(z)\, dz = w(z) \cdot \omega^n(z) \tag{27} \]
with the ($G$-invariant) volume element $\omega^n$ and some (positive) weight function $w$, the last condition translates into
\[ w(\phi(z)) = w(z)\, |f_\phi(z)|^2 \]
for some holomorphic functions $f_\phi$. In other words, the form $\partial\bar\partial \log w$ is $G$-invariant. However, the simplest examples of $G$-invariant 2-forms (and if $G$ is sufficiently "ample", the only ones) are clearly the constant multiples of $\omega$. Thus we are led to
\[ \partial\bar\partial \log w = -c\,\omega \]
with some constant $c$. It follows that
\[ \omega = \partial\bar\partial \Phi, \qquad \Phi := -\tfrac{1}{c} \log w, \]
i.e., that $\Phi = -\frac{1}{c}\log w$ is a real-valued Kähler potential for $\omega$. This gives for the volume element $\omega^n(z) = \det[\partial\bar\partial\Phi(z)]\, dz$, and (27) gives
\[ \rho(z) = e^{-c\Phi(z)} \det[\partial\bar\partial\Phi(z)]. \]
Returning the Planck constant dependence into play, we therefore see that the sought weights $\rho_h$ should be of the form
\[ \rho_h = e^{-c\Phi} \det[\partial\bar\partial\Phi], \]
with some $c = c(h)$ depending only on $h$. Note that the condition $\omega = \partial\bar\partial\Phi$ means that
\[ g_{j\bar k}(z) = \frac{\partial^2 \Phi(z)}{\partial z_j\, \partial \bar z_k}. \]


The fact that this matrix is positive-definite, for each $z \in \Omega$, means precisely that the potential $\Phi$ is strictly plurisubharmonic on $\Omega$. We will usually abbreviate "strictly plurisubharmonic" to "strictly PSH". Finally, the condition
\[ C_1(f,g) - C_1(g,f) = -\frac{i}{2\pi}\{f,g\} \tag{28} \]
in the Berezin quantization will be satisfied if the operator $Q_1$ in (18) equals
\[ Q_1 = \sum_{j,k=1}^n g^{\bar j k}\, \partial_k \bar\partial_j =: \Delta, \]
the Laplace–Beltrami operator associated to $\omega$. Indeed, in that case we have by (19)
\[ C_1(f,g) = \sum_{j,k=1}^n g^{\bar j k}\, (\bar\partial_j f)(\partial_k g), \]
and (28) follows by (26).

We have thus arrived at a final recipe for the Berezin and Berezin–Toeplitz quantizations on a domain $\Omega \subset \mathbb C^n$ equipped with a Kähler form $\omega$ and the corresponding Poisson bracket. Namely:

1. There must exist a Kähler potential $\Phi$ for $\omega$, i.e., a strictly PSH function $\Phi$ such that $\omega = \partial\bar\partial\Phi$.

2. We take the Bergman spaces $L^2_{\mathrm{hol}}(\Omega, e^{-c\Phi} \det[\partial\bar\partial\Phi])$, where $c \in \mathbb R$ is a parameter. Denote by $K_c(x,y)$ the reproducing kernel of this space, by $B_c$ the associated Berezin transform, and by $T_f^{(c)}$ the Toeplitz operator on this space with symbol $f$.

3. See if $c = c(h)$ can be chosen so that
\[ B_c = I + h\Delta + h^2 Q_2 + h^3 Q_3 + \cdots \qquad \text{as } h \to 0 \]
with some differential operators $Q_j$, $Q_0 = I$, $Q_1 = \Delta$ (for the Berezin quantization); and
\[ T_f^{(c)} T_g^{(c)} = \sum_{j=0}^\infty h^j\, T^{(c)}_{C_j(f,g)} \qquad \text{as } h \searrow 0 \]
in some sense, with $C_0(f,g) = fg$ and $C_1(f,g) - C_1(g,f) = -\frac{i}{2\pi}\{f,g\}$ (for the Berezin–Toeplitz quantization).

It turns out that under suitable hypotheses on $\Omega$ and $\Phi$, this recipe indeed works, with $c(h) = 1/h$. For brevity, let us denote by $d\mu_h$ the corresponding measures
\[ d\mu_h(z) := e^{-\Phi(z)/h} \det[g_{k\bar j}(z)]\, dz, \qquad h > 0, \]


and by $L^2_{\mathrm{hol},h} = L^2_{\mathrm{hol}}(\Omega, d\mu_h)$ the associated weighted Bergman spaces; also $K_c$, $B_c$ and $T_f^{(c)}$ will be written as $K_h$, $B_h$ and $T_f$, respectively. We will also sometimes use our earlier notation $\alpha$ for $1/h$ rather than $c$.

For simplicity, we have so far really discussed only the situation when $\Omega$ is a domain in $\mathbb C^n$. It turns out that the whole formalism works also on arbitrary Kähler manifolds, just with some minor technical adjustments. The most conspicuous of them is that instead of considering Bergman spaces of functions on $\Omega$, one needs to consider, more generally, spaces of sections of a holomorphic line bundle $L$, equipped with a Hermitian metric (in the fibers) given locally by $e^{-\Phi}$ (more precisely: the curvature form of this Hermitian metric should coincide with the given Kähler form $\omega$). For such $L$ to exist, it is necessary that the cohomology class of $\omega$ be integral. The role of the weighted Bergman spaces $L^2_{\mathrm{hol}}(\Omega, d\mu_h)$ is then played by the spaces of holomorphic $L^2$ sections of the tensor powers $L^{\otimes m}$, $m = 1/h = 1, 2, \dots$; in particular, the Planck constant can approach 0 only through a discrete set of values. However, the whole formalism (weighted Bergman kernels, Berezin symbols, Toeplitz operators, and Berezin transforms) still makes perfect sense, and so does the above recipe for Berezin and Berezin–Toeplitz quantizations.

Since both $B_h$ and $T_f$ are defined by formulas involving the weighted Bergman kernels $K_h$, the key to proving the viability of our recipe is obviously an understanding of the behaviour of $K_h(x,y)$ as $h \searrow 0$. Historically, there are two approaches to this problem, which appeared independently around 1997–1998. The first one was developed in the context of compact manifolds by Zelditch [45], who gave, in our language, the asymptotics of the reproducing kernels $K_h(x,x)$ on the diagonal as $h \to 0$; this was subsequently extended also away from the diagonal by Catlin [13]. These two papers did not consider $B_h$ and $T_f$, but rather were inspired by certain geometric applications going back to Tian in 1990 [44] (with a follow-up by Ruan [40]). The proofs rely on a theory, due to Boutet de Monvel and Guillemin [11], of Fourier integral operators of Hermite type, which had in fact already been used in exactly the same way in 1994 by Bordemann, Meinrenken and Schlichenmaier [9] to establish the result about $T_f$ on compact manifolds directly, without those for $K_h$ and $B_h$ (thus bypassing the Berezin quantization). The second approach, due to the present author, dealt with domains in $\mathbb C^n$ rather than manifolds, and relied on somewhat simpler methods (Fefferman's expansion and $\bar\partial$-techniques) to obtain the asymptotics of $K_h$ and $B_h$ [19] [20] [21]; naturally, some hypotheses on the behaviour of $\Phi$ at the boundary were needed. The result for $T_f$ can, however, be established in this case only for bounded domains, and one still has to resort to the more sophisticated machinery used by Bordemann, Meinrenken and Schlichenmaier [9].

Prior to these general results, Berezin and Berezin–Toeplitz quantizations had been established only ad hoc in some special cases, such as in dimension $n = 1$ (i.e., for Riemann surfaces) with the Poincaré metric by Klimek and Lesniewski


in 1991 [33] (using uniformization), for $\Omega = \mathbb C^n$ with the Euclidean metric by Coburn in 1993 [14], or for bounded symmetric domains with the invariant metric by Borthwick, Lesniewski and Upmeier in 1994 [10]. The basic idea, in any case, goes back (as the terminology rightly suggests) to Berezin in 1975 [6]. The equivalence of the Berezin quantization and the asymptotic expansion of the Berezin transform is due to Karabegov [32]. Some recent extensions and generalizations of the theory are discussed, e.g., in the book [38] by Ma and Marinescu, or the paper [7] by Berndtsson, Berman and Sjöstrand.

In the rest of this paper, we will first handle the case of the Berezin quantization by the second of the above-mentioned approaches. Then we proceed to deal with the Berezin–Toeplitz quantization via the first approach, adapted to the context of domains in $\mathbb C^n$ rather than compact manifolds (the context to which we have also restricted ourselves hitherto in this paper).

5. Berezin quantization

5.1. Basic notions of several complex variables

Recall that a smooth function $\Phi : \Omega \to \mathbb R$ on a domain $\Omega$ in $\mathbb C^n$ is called strictly plurisubharmonic (strictly-PSH) if for any $z \in \Omega$ and $v \in \mathbb C^n$, the function of one complex variable
\[ t \mapsto \Phi(z + tv), \qquad t \in \mathbb C, \]
is strictly subharmonic where defined. Equivalently, $\Phi$ is strictly-PSH if the matrix of mixed second derivatives
\[ \Big[\frac{\partial^2 \Phi}{\partial z_j\, \partial \bar z_k}\Big]_{j,k=1}^n \]
is positive definite. A bounded domain $\Omega \subset \mathbb C^n$ with smooth boundary is called strictly pseudoconvex if there exists a smooth function $r$ such that
\[
\begin{aligned}
& r > 0 && \text{on } \Omega, \\
& r = 0,\ \|\nabla r\| > 0 && \text{on } \partial\Omega, \\
& -r \text{ is strictly-PSH} && \text{in a neighbourhood of } \overline\Omega.
\end{aligned}
\]
One calls $r$ a strictly-PSH defining function for $\Omega$.

For completeness (it will not be needed in the sequel), we remark that there are also (not necessarily strictly) plurisubharmonic (PSH) functions, for which $t \mapsto \Phi(z + tv)$ is assumed to be only subharmonic (not necessarily strictly), or, equivalently, the matrix of mixed second-order derivatives is only positive semi-definite; and (not necessarily strictly) pseudoconvex domains, which can be defined as increasing unions of strictly pseudoconvex domains. (This is not the same thing as having a, not necessarily strictly, PSH defining function.) Pseudoconvex domains are the natural domains in $\mathbb C^n$ on which holomorphic functions live: if $\Omega$ is not pseudoconvex, then there exists a larger domain $\Omega'$ such that every holomorphic function on $\Omega$ in fact extends holomorphically to $\Omega'$. An example of a non-pseudoconvex domain is the domain
\[ \Omega = \{z \in \mathbb C^n : 1 < |z| < 2\}, \]


$n > 1$, for which $\Omega' = \{z \in \mathbb C^n : |z| < 2\}$. In dimension $n = 1$, as we all know from basic complex analysis, all domains are pseudoconvex. Strictly pseudoconvex domains are those whose boundary is, additionally, in some sense "non-degenerate", which makes it possible to establish results which have as yet no known counterparts in the non-strictly pseudoconvex case. We will come across some of these results later in this section.

The upshot of all the above is that pseudoconvex domains are the ones on which it makes sense to study holomorphic functions; strictly pseudoconvex domains are the manageable ones.

5.2. Main theorem on Berezin quantization

Theorem B. Let $\Omega \subset \mathbb C^n$ be smoothly bounded and strictly pseudoconvex, and $\Phi$ a strictly-PSH function on $\Omega$ such that $e^{-\Phi} = r$ is a defining function for $\Omega$. Then for the weights $w = e^{-\alpha\Phi} \det[\partial\bar\partial\Phi]$, we have as $\alpha \to +\infty$, $\alpha \in \mathbb Z$,
\[ K_\alpha(x,x) \approx e^{\alpha\Phi(x)}\, \frac{\alpha^n}{\pi^n} \sum_{j=0}^\infty \frac{b_j(x)}{\alpha^j}, \]
with some functions $b_j \in C^\infty(\Omega)$, $b_0 = \det[\partial\bar\partial\Phi]$; and
\[ B_\alpha f \approx \sum_{j=0}^\infty \frac{Q_j f}{\alpha^j} \]
where $Q_j$ are some differential operators, in particular $Q_0 = I$ and
\[ Q_1 = \sum_{j,k=1}^n g^{\bar j k}\, \frac{\partial^2}{\partial z_k\, \partial \bar z_j} =: \Delta, \]
the Laplace–Beltrami operator. Here $g^{\bar j k}$ is the inverse matrix to
\[ g_{j\bar k} := \frac{\partial^2 \Phi}{\partial z_j\, \partial \bar z_k}. \]

It follows, as explained in §4.3, that denoting by $c_{j\alpha\beta}$ the coefficients of the operators $Q_j$,
\[ Q_j f = \sum_{\alpha,\beta \text{ multiindices}} c_{j\alpha\beta}\, \partial^\alpha \bar\partial^\beta f, \]
and setting
\[ f *_{Bt} g := \sum_{j=0}^\infty h^j\, C_j(f,g), \]
where
\[ C_j(f,g) := \sum_{\alpha,\beta} c_{j\alpha\beta}\, (\bar\partial^\beta f)(\partial^\alpha g), \]
we obtain a Berezin quantization on the domain $\Omega$ with the Poisson bracket associated to the Kähler form $\omega = \partial\bar\partial\Phi$.


It is instructive to see how Theorem B applies in the examples from §4.6. For the unit ball $\Omega = \mathbb B^n$ (which includes $\Omega = \mathbb D$ for $n = 1$), take
\[ \Phi(z) = \log \frac{1}{1 - |z|^2}, \]
which is a Kähler potential for the invariant metric on $\mathbb B^n$. Then $\Phi$ is strictly-PSH, $e^{-\Phi(z)} = 1 - |z|^2$ is a strictly-PSH defining function for $\mathbb B^n$, and
\[ b_0(z) = \det\Big[\frac{\partial^2 \Phi}{\partial z_j\, \partial \bar z_k}\Big] = \frac{1}{(1 - |z|^2)^{n+1}}. \]
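The determinant formula can be checked numerically. The following is a minimal sketch for $n = 2$, not taken from the source: it approximates the complex Hessian of $\Phi$ by central differences in the real coordinates, using $\partial_{z_j}\partial_{\bar z_k} = \tfrac14\big(\partial_{x_j}\partial_{x_k} + \partial_{y_j}\partial_{y_k} + i(\partial_{x_j}\partial_{y_k} - \partial_{y_j}\partial_{x_k})\big)$, and compares its determinant with $(1 - |z|^2)^{-3}$. The function names, the step size and the sample point are illustrative assumptions.

```python
import math


def phi(p):
    """Potential log(1/(1-|z|^2)) at p = (x1, y1, x2, y2), where z_j = x_j + i*y_j."""
    return -math.log(1.0 - sum(c * c for c in p))


def second_partial(func, p, a, b, eps=1e-4):
    """Central-difference approximation of the second partial d^2 func / dp_a dp_b."""
    def shifted(da, db):
        q = list(p)
        q[a] += da
        q[b] += db
        return func(q)
    return (shifted(eps, eps) - shifted(eps, -eps)
            - shifted(-eps, eps) + shifted(-eps, -eps)) / (4.0 * eps * eps)


def levi_matrix(func, p):
    """Matrix of mixed derivatives d^2 func / dz_j d(conj z_k), as nested lists."""
    n = len(p) // 2
    g = [[0j] * n for _ in range(n)]
    for j in range(n):
        for k in range(n):
            xx = second_partial(func, p, 2 * j, 2 * k)
            yy = second_partial(func, p, 2 * j + 1, 2 * k + 1)
            xy = second_partial(func, p, 2 * j, 2 * k + 1)
            yx = second_partial(func, p, 2 * j + 1, 2 * k)
            g[j][k] = 0.25 * (xx + yy + 1j * (xy - yx))
    return g


p = (0.3, -0.2, 0.1, 0.4)                      # sample point with |z|^2 = 0.3
g = levi_matrix(phi, p)
det = (g[0][0] * g[1][1] - g[0][1] * g[1][0]).real
expected = (1.0 - sum(c * c for c in p)) ** (-3)   # (1 - |z|^2)^(-(n+1)) with n = 2
```

The two numbers `det` and `expected` should agree up to the finite-difference error, which is of order `eps**2`.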

We thus recover the formulas from §4.6 ($b_0$ explains the "shift in the exponent $\alpha$"). Also, we see that $c_\alpha \sim \alpha^n$.

For the Fock space on $\Omega = \mathbb C^n$, a Kähler potential for the Euclidean metric is $\Phi(z) = |z|^2$. In that case $b_0(z) = \det[\delta_{jk}] = 1$, so there is no "shift" this time, and Theorem B again recovers the asymptotics of $K_\alpha$ and $B_\alpha$ on the Fock space from Section 2 and §4.6.

We need to review a few prerequisites before giving a proof of the theorem.

5.3. Hartogs domains

For a domain $\Omega \subset \mathbb C^n$ and a real-valued smooth function $\phi$ on it, the Hartogs domain with base $\Omega$ and radius-function $e^{-\phi}$ is
\[ \tilde\Omega := \{(z,t) \in \Omega \times \mathbb C : |t|^2 < e^{-\phi(z)}\}. \]
It can be shown that $\tilde\Omega$ is pseudoconvex if and only if $\Omega$ is pseudoconvex and $\phi$ is PSH; and that $\tilde\Omega$ is strictly pseudoconvex and smoothly bounded if $\Omega$ is strictly pseudoconvex, $\phi$ is strictly-PSH and $e^{-\phi} = r$ is a defining function for $\Omega$. Furthermore,
\[ \tilde r(z,t) := r(z) - |t|^2 = e^{-\phi(z)} - |t|^2 \tag{29} \]
is a defining function for $\tilde\Omega$.

Thus the hypotheses of Theorem B guarantee precisely that taking for $\phi$ the Kähler potential $\Phi$, the corresponding Hartogs domain $\tilde\Omega$ over $\Omega$ will be smoothly bounded and strictly pseudoconvex, with a defining function given by (29).

5.4. Hardy space

Continuing with the notation from the preceding paragraph, consider the compact manifold
\[ X := \partial\tilde\Omega \]
equipped with the measure
\[ d\sigma := \frac{J[\tilde r]}{\|\partial \tilde r\|}\, dS, \tag{30} \]


where $dS$ stands for the surface measure on $X$ and $J[\tilde r]$ for the Monge–Ampère determinant
\[ J[\tilde r] = -\det \begin{bmatrix} \tilde r & \bar\partial \tilde r \\ \partial \tilde r & \partial\bar\partial \tilde r \end{bmatrix} > 0. \]
Let $H^2(X) = H^2$ be the subspace in $L^2(X, d\sigma)$ of functions whose Poisson extension into $\tilde\Omega$ is holomorphic. (Alternatively, $H^2(X)$ is the closure in $L^2(X, d\sigma)$ of functions continuous on the closure of $\tilde\Omega$ and holomorphic in its interior.) One calls $H^2(X)$ the Hardy space on $X$.

We remark that the measure (30), which at first sight may look a bit artificial, is actually a familiar object in differential geometry. Namely, the restriction $\nu$ of the differential form $\operatorname{Im} \partial \tilde r = \frac{1}{2i}(\partial \tilde r - \bar\partial \tilde r)$ to $X$ is a contact form on $X$, meaning that $\nu \wedge (\partial\bar\partial\nu)^n$ is a non-vanishing volume element on $X$. Up to a constant factor, this volume element is precisely (30).

5.5. Szegő kernel

For each $(z,t) \in \tilde\Omega$, the evaluation functional $f \mapsto f(z,t)$ on $H^2$ turns out to be continuous, hence is given by the scalar product with a certain element $k_{(z,t)} \in H^2$. The function
\[ K_{\mathrm{Szeg\H{o}}}((x,t),(y,s)) := \langle k_{(y,s)}, k_{(x,t)}\rangle_{H^2} \]
on $\tilde\Omega \times \tilde\Omega$ is called the Szegő kernel. In other words, $K_{\mathrm{Szeg\H{o}}}$ is the reproducing kernel of the Hardy space $H^2(X)$, viewed as a space of holomorphic functions on $\tilde\Omega$ (rather than just their boundary values on $X$).

There is a simple relationship between the Hardy space $H^2(X)$ and the weighted Bergman spaces $L^2_{\mathrm{hol},h}$ on the base $\Omega$, as well as between the Szegő kernel $K_{\mathrm{Szeg\H{o}}}$ and the weighted Bergman kernels of $L^2_{\mathrm{hol},h}$, which we now explain.

5.6. Ligocka's formula

The boundary $X$ of $\tilde\Omega$ can be parameterized as
\[ X = \{(z, e^{i\theta} e^{-\phi(z)/2}) : z \in \Omega,\ \theta \in [0, 2\pi]\}. \]
In these coordinates, and recalling our notations $r(z) = e^{-\phi(z)}$, $\tilde r(z,t) = r(z) - |t|^2$, easy computations show that
\[ dS = \sqrt{r + \|\partial r\|^2}\; dz\, d\theta, \qquad \|\partial \tilde r\| = \sqrt{r + \|\partial r\|^2}, \qquad J[\tilde r] = J[r] = e^{-(n+1)\phi} \det[\partial\bar\partial\phi], \tag{31} \]
so
\[ d\sigma(z,t) = e^{-(n+1)\phi} \det[\partial\bar\partial\phi]\, dz\, d\theta. \tag{32} \]
Consider now a holomorphic function $f$ on $\tilde\Omega$. Taking the Taylor expansion in the fiber variable, we can write
\[ f(z,t) = \sum_{j=0}^\infty f_j(z)\, t^j, \qquad (z,t) \in \tilde\Omega, \]


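with $f_j$ holomorphic on $\Omega$. Expressing $t$ in polar coordinates, one also sees immediately that
\[ f(z)\, t^j \perp g(z)\, t^k \qquad \forall f, g \quad \text{if } k \neq j \]
(orthogonality is meant in $H^2$). For the norm of $f$ in $H^2(X)$, we thus get, using (32),
\[
\begin{aligned}
\int_X |f(z,t)|^2\, d\sigma(z,t) &= \sum_{j=0}^\infty \int_\Omega |f_j(z)|^2 \Big( \int_0^{2\pi} |e^{i\theta} e^{-\phi(z)/2}|^{2j}\, d\theta \Big)\, e^{-(n+1)\phi(z)} \det[\partial\bar\partial\phi(z)]\, dz \\
&= \sum_{j=0}^\infty 2\pi \int_\Omega |f_j|^2\, e^{-(j+n+1)\phi} \det[\partial\bar\partial\phi(z)]\, dz.
\end{aligned}
\]
It follows that
\[ H^2(X) = \bigoplus_{j=0}^\infty L^2_{\mathrm{hol}}\big(\Omega,\ 2\pi e^{-(j+n+1)\phi} \det[\partial\bar\partial\phi(z)]\, dz\big), \]
and
\[ K_{\mathrm{Szeg\H{o}}}((x,t),(y,s)) = \frac{1}{2\pi} \sum_{j=0}^\infty K_{e^{-(j+n+1)\phi} \det[\partial\bar\partial\phi]}(x,y)\, (t\bar s)^j. \]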
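A sanity check of this expansion in the simplest case, given here as an illustration rather than as part of the source: for $\Omega = \mathbb D$ and $\phi(z) = \log\frac{1}{1-|z|^2}$ the Hartogs domain is the unit ball of $\mathbb C^2$, whose Szegő kernel is, up to a constant factor, $(1 - x\bar y - t\bar s)^{-2}$. Its Taylor coefficients in $t\bar s$ are $(j+1)(1 - x\bar y)^{-(j+2)}$, which up to constants are the reproducing kernels for the weights $e^{-(j+2)\phi}\det[\partial\bar\partial\phi] = (1-|z|^2)^j$ from §4.6. The sketch below checks the expansion numerically; all normalization constants are dropped on purpose, and the function names and sample values are assumptions.

```python
def szego_ball_c2(a, w):
    """Model Szego kernel of the unit ball in C^2, constants dropped; a = x*conj(y), w = t*conj(s)."""
    return (1.0 - a - w) ** (-2)


def taylor_partial_sum(a, w, terms=200):
    """Partial sum over j of (j+1) * (1-a)^(-(j+2)) * w^j, the expansion in the fiber variable."""
    return sum((j + 1) * (1.0 - a) ** (-(j + 2)) * w ** j for j in range(terms))


a = 0.2 + 0.1j     # sample value of x * conj(y)
w = 0.1 - 0.3j     # sample value of t * conj(s), chosen so that |w| < |1 - a|
closed_form = szego_ball_c2(a, w)
series = taylor_partial_sum(a, w)
```

The series converges geometrically with ratio $|w|/|1-a|$, so `closed_form` and `series` should agree to near machine precision for these sample values.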

In other words, the weighted Bergman kernels of our spaces $L^2_{\mathrm{hol},h}$ are just the Taylor coefficients, with respect to the fiber variable, of the Szegő kernel of $H^2(X)$. This result is due to Ligocka [36]; the basic idea goes back to Forelli and Rudin [29].

5.7. Fefferman's theorem

This celebrated result of Fefferman [27] and Boutet de Monvel and Sjöstrand [12] describes the boundary behaviour of the Szegő kernel of an arbitrary (nice) domain in $\mathbb C^n$, thus including, in particular, the kernel $K_{\mathrm{Szeg\H{o}}}$ of our Hartogs domain $\tilde\Omega$. Here is the result.

Let $D \subset \mathbb C^n$ be a bounded strictly pseudoconvex domain with smooth boundary, and $r$ a defining function for $D$. As in the special case of $D = \tilde\Omega$ discussed before, one defines the Hardy space $H^2(\partial D)$ as the subspace in $L^2(\partial D, d\sigma)$ (with some non-vanishing volume element $\sigma$ on $\partial D$) of all functions whose Poisson extensions into $D$ are not only harmonic but holomorphic; and the Szegő kernel $K_{\mathrm{Szeg\H{o}}}(z,w)$, $z, w \in D$, as the reproducing kernel of $H^2(\partial D)$, viewed as a space of functions on $D$ (not just of their boundary values on $\partial D$). Then there are functions $a, b \in C^\infty(\mathbb C^n)$ such that

(a) for $x \in \partial D$,
\[ a(x) = \frac{n!}{\pi^n}\, J[r](x) > 0; \tag{33} \]

106

M. Engliˇs

(b) the Szeg¨ o kernel on the diagonal is given by the formula KSzeg¨o(x, x) =

a(x) + b(x) log r(x). r(x)n

This formula also extends to KSzeg¨o(x, y) with x = y, namely, KSzeg¨o(x, y) =

a(x, y) + b(x, y) log r(x, y), r(x, y)n

where $a(x, y)$, $b(x, y)$ and $r(x, y)$ are almost-sesquiholomorphic extensions of $a(x) = a(x, x)$, $b(x) = b(x, x)$ and $r(x) = r(x, x)$, respectively. The latter means that $\partial a(x, y)/\partial y$ and $\partial a(x, y)/\partial\bar x$ both vanish to infinite order on the diagonal $x = y$, and similarly for $b(x, y)$ and $r(x, y)$. Such extensions always exist, and it is a consequence of the strict pseudoconvexity that $r(x, y)$ can be chosen so that $\operatorname{Re} r(x, y) > 0$ for all $x, y \in D$, so that the logarithm can be defined as the principal branch.

(c) $K_{\text{Szegö}}(x, y)$ is smooth on $\overline D \times \overline D \setminus U$, for any neighbourhood $U$ of the boundary diagonal $\{(x, x) : x \in \partial D\}$.

Finally, there is a device for converting this description of the boundary behaviour into the description of the Taylor components from Ligocka's formula.

5.8. Resolution of singularities

Recall that the power series $\sum_{k=0}^{\infty} k^j z^k$ converges on the unit disc $\mathbf{D}$, and its sum equals
$$ \sum_{k=0}^{\infty} k^j z^k = \frac{j!}{(1 - z)^{j+1}} + \sum_{k=1}^{j} \frac{a_{jk}}{(1 - z)^k}, $$

with some constants $a_{jk}$, if $j = 0, 1, 2, \dots$; and
$$ \sum_{k=0}^{\infty} k^j z^k = \frac{(-1)^j}{j!}\, (1 - z)^j \log(1 - z) + F_j(z), $$

with some $F_j \in C^{-j}(\overline{\mathbf{D}})$, if $j = -1, -2, -3, \dots$. Also, by the familiar Cauchy estimates, if a holomorphic function $f(z) = \sum_k f_k z^k$ on the disc belongs to $C^j(\overline{\mathbf{D}})$, then its Taylor coefficients satisfy
$$ f_k = O(k^{-j}) \qquad \text{as } k \to +\infty. $$
Now suppose that $f(z) = \sum_k f_k z^k$ is a holomorphic function on $\mathbf{D}$ which satisfies
$$ f(z) = \frac{a(z)}{(1 - z)^{n+1}} + b(z) \log(1 - z) $$
for some $a, b \in C^\infty(\mathbf{C})$. Taking the Taylor expansions of $a$, $b$ around $z = 1$, this implies that there exist $\alpha_1, \dots, \alpha_{n+1}$ and $\beta_0, \beta_1, \beta_2, \dots$, with $\alpha_{n+1} = a(1)$, such that, for any $M = 0, 1, 2, \dots$,
$$ f(z) = \sum_{j=1}^{n+1} \frac{\alpha_j}{(1 - z)^j} + \sum_{j=0}^{M} \beta_j (1 - z)^j \log(1 - z) + F_M(z), $$
with $F_M \in C^M(\overline{\mathbf{D}})$. Combining this with the observations in the preceding paragraph, it follows that
$$ f_k \approx a_n k^n + a_{n-1} k^{n-1} + \dots + a_0 + \frac{a_{-1}}{k} + \cdots, \qquad a_n = \frac{a(1)}{n!}, \tag{34} $$

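for some constants $a_n, a_{n-1}, \dots$, as $k \to \infty$.

5.9. Sketch of proof of Theorem B

As already mentioned in §5.3, the hypotheses of the theorem guarantee that the Hartogs domain
$$ \widetilde\Omega = \{(z, t) \in \Omega \times \mathbf{C} : |t|^2 < e^{-\Phi(z)}\} $$
is smoothly bounded, strictly pseudoconvex, and with a defining function $\tilde r(z, t) := e^{-\Phi(z)} - |t|^2$. Consider the Hardy space $H^2(X)$ on the boundary $X = \partial\widetilde\Omega$. By Ligocka's formula from §5.6, we have
$$ H^2(X) = \bigoplus_{k=n+1}^{\infty} L^2_{\mathrm{hol}}\big(\Omega,\ e^{-k\Phi} \det[\partial\bar\partial\Phi]\big) \tag{35} $$
and (where $n = \dim \Omega$, so $n + 1 = \dim \widetilde\Omega$),
$$ K_{\text{Szegö}}((x, s), (y, t)) = \frac{1}{2\pi} \sum_{k=0}^{\infty} K_{k+n+1}(x, y)\, (s\bar t)^k, $$
where, for brevity, we are denoting the reproducing kernel of $L^2_{\mathrm{hol}}(\Omega, e^{-k\Phi} \det[\partial\bar\partial\Phi])$ by $K_k(x, y)$. Fefferman's theorem for the Szegö kernel tells us that
$$ K_{\text{Szegö}} = \frac{a}{\tilde r^{\,n+1}} + b \log \tilde r, $$
for some (almost-sesquiholomorphic) functions $a, b \in C^\infty(\mathbf{C}^{n+1} \times \mathbf{C}^{n+1})$. Hence, in particular,
$$ \begin{aligned} \frac{1}{2\pi} \sum_{k=0}^{\infty} K_{k+n+1}(x, x)\, s^k &= K_{\text{Szegö}}((x, s), (x, 1)) \\ &= \frac{a(x, s)}{(e^{-\Phi(x)} - s)^{n+1}} + b(x, s) \log(e^{-\Phi(x)} - s) \\ &= \frac{a(x, s)\, e^{(n+1)\Phi(x)}}{(1 - s e^{\Phi(x)})^{n+1}} + b(x, s) \log(1 - s e^{\Phi(x)}) - b(x, s)\, \Phi(x) \\ &= \frac{A(x, z)}{(1 - z)^{n+1}} + B(x, z) \log(1 - z), \qquad z := s e^{\Phi(x)}, \end{aligned} $$
where $A(x, z) = a(x, z e^{-\Phi(x)})\, e^{(n+1)\Phi(x)} - b(x, z e^{-\Phi(x)})\, \Phi(x)\, (1 - z)^{n+1}$ and $B(x, z) = b(x, z e^{-\Phi(x)})$. So for each $x \in \Omega$,
$$ \sum_{k=0}^{\infty} e^{-k\Phi(x)} K_{k+n+1}(x, x)\, z^k = \frac{A(x, z)}{(1 - z)^{n+1}} + B(x, z) \log(1 - z) $$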

with functions $A, B \in C^\infty(\Omega \times \overline{\mathbf{D}})$. Employing the resolution of singularities from §5.8 gives
$$ K_k(x, x) \approx \frac{k^n}{\pi^n}\, e^{k\Phi(x)} \sum_{j=0}^{\infty} \frac{b_j(x)}{k^j} $$
as $k \to +\infty$, proving the first part of Theorem B. (The formula for $b_0$ follows from (31), (33) and (34).) With a bit of technicalities which we omit, the last result can be extended also to $x \neq y$:
$$ K_k(x, y) \approx \frac{k^n}{\pi^n}\, e^{k\Phi(x, y)} \sum_{j=0}^{\infty} \frac{b_j(x, y)}{k^j} \tag{36} $$
for $(x, y)$ near the diagonal, where $\Phi(x, y)$, $b_j(x, y)$ are almost-sesquiholomorphic extensions of $\Phi(x) = \Phi(x, x)$ and $b_j(x) = b_j(x, x)$. (The technicalities involve an improved version of the resolution of singularities from §5.8, where $f(z)$, holomorphic in $z \in \mathbf{D}$, is replaced by $f(x, z)$, depending smoothly on $x$ and holomorphic in $z$ in the disc $|z| < r(x)$, where the radius $r(x)$ also depends smoothly on $x$; see Lemma 7 in [21].)

The second part of Theorem B (concerning the asymptotics of the Berezin transform) is then proved by first showing that in the integral defining $B_\alpha$,
$$ B_\alpha f(x) = \int_\Omega f(y)\, \frac{|K_\alpha(x, y)|^2}{K_\alpha(x, x)}\, e^{-\alpha\Phi(y)} \det[\partial\bar\partial\Phi(y)]\, dy, $$
the main contribution, as $\alpha \to +\infty$, comes from a small neighbourhood of $x$. In that neighbourhood, one then replaces $K_\alpha(x, y)$ by the asymptotic expansion (36). This reduces the problem to finding the asymptotics as $\alpha \to +\infty$ of integrals of the form
$$ \int_{\text{neighbourhood of } x} F(y)\, e^{\alpha [\Phi(x, y) + \Phi(y, x) - \Phi(x) - \Phi(y)]}\, dy, $$
where $F$ is an expression involving $f$, $\det[\partial\bar\partial\Phi]$, and the coefficient functions $b_j$ from (36). Finally, integrals of this kind are handled by the standard stationary-phase (Laplace, WJKB) method, yielding the result in the theorem. The first two terms in the asymptotic expansion for $B_\alpha$ can be evaluated explicitly, giving the desired outcomes $Q_0 = I$ and $Q_1 = \Delta$, and thus completing the proof of Theorem B.
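As a sanity check on the leading behaviour of $K_k(x, x)$, one can look at the unit disc with $\Phi(z) = -\log(1 - |z|^2)$. This is an illustrative choice that is not taken from the text above; here $n = 1$, and we assume the normalization $\det[\partial\bar\partial\Phi] = \partial^2\Phi/\partial z\,\partial\bar z = (1 - |z|^2)^{-2}$. The weight $e^{-k\Phi}\det[\partial\bar\partial\Phi]$ then equals $(1 - |z|^2)^{k-2}$, the monomials $z^j$ are orthogonal, and summing $|x|^{2j}/\|z^j\|^2$ should give $K_k(x, x) = \frac{k-1}{\pi} e^{k\Phi(x)} = \frac{k}{\pi} e^{k\Phi(x)} (1 - \frac{1}{k})$, in agreement with the shape of the expansion. The sketch below compares the truncated series with this closed form; all function names are hypothetical.

```python
import math


def monomial_norm_sq(j, k):
    """||z^j||^2 in L^2(disc, (1-|z|^2)^(k-2) dA), i.e. pi * B(j+1, k-1)."""
    # Beta function via log-gamma to avoid overflow for large j.
    log_beta = math.lgamma(j + 1) + math.lgamma(k - 1) - math.lgamma(j + k)
    return math.pi * math.exp(log_beta)


def kernel_on_diagonal(x_abs, k, terms=400):
    """Truncated series sum_j |x|^(2j) / ||z^j||^2 approximating K_k(x, x)."""
    u = x_abs ** 2
    return sum(u ** j / monomial_norm_sq(j, k) for j in range(terms))


def closed_form(x_abs, k):
    """(k-1)/pi * exp(k*Phi(x)) for the assumed Phi = -log(1-|x|^2)."""
    return (k - 1) / math.pi * (1.0 - x_abs ** 2) ** (-k)


if __name__ == "__main__":
    for k in (2, 5, 20):
        series = kernel_on_diagonal(0.5, k)
        exact = closed_form(0.5, k)
        print(k, series, exact, series / exact)
```

The weight is integrable only for $k \ge 2 = n + 1$, which matches the range of summation in (35).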


6. Berezin–Toeplitz quantization

For $f \in L^\infty(\Omega)$, let us denote, for brevity, the Toeplitz operator with symbol $f$ on $L^2_{\mathrm{hol}}(\Omega, e^{-m\Phi} \det[\partial\bar\partial\Phi])$ by $T_f^{(m)}$. The main result on the Berezin–Toeplitz quantization then reads as follows.

Theorem BT. Let $\Omega$ be a smoothly bounded strictly pseudoconvex domain in $\mathbf{C}^n$, and $\Phi : \Omega \to \mathbf{R}$ a smooth strictly-PSH function such that $e^{-\Phi} =: r$ is a defining function for $\Omega$. Then there exist bilinear differential operators $C_j$ ($j = 0, 1, 2, \dots$) such that for any $f, g \in C^\infty(\Omega)$ and any $M = 0, 1, 2, \dots$,
$$ \Big\| T_f^{(m)} T_g^{(m)} - \sum_{j=0}^{M} m^{-j}\, T^{(m)}_{C_j(f, g)} \Big\| = O(m^{-M-1}) \qquad \text{as } m \to \infty. $$
Furthermore,
$$ C_0(f, g) = fg, \qquad C_1(f, g) - C_1(g, f) = \frac{i}{2\pi} \{f, g\}. $$
Consequently, $f * g := \sum_{j=0}^{\infty} h^j C_j(f, g)$ defines a star-product on $\Omega$.

Observe that the theorem establishes the expansion for the product of two Toeplitz operators (17) in the strongest possible sense, namely, in the operator norm.

As already mentioned, the proof of Theorem BT involves a sophisticated machinery, due to Boutet de Monvel and Guillemin, of Fourier integral operators of Hermite type; more specifically, of Toeplitz operators with pseudodifferential symbols. It is not our intention to introduce all the necessary notions and technicalities here; we will, however, try to highlight at least the main ideas.

Consider again the Hartogs domain $\widetilde\Omega$ from Section 5,
$$ \widetilde\Omega = \{(z, t) \in \Omega \times \mathbf{C} : |t|^2 < e^{-\Phi(z)}\}. $$
Again, the hypotheses of Theorem BT guarantee that $\widetilde\Omega$ is smoothly bounded, strictly pseudoconvex, and admits $\tilde r(z, t) := e^{-\Phi(z)} - |t|^2$ as a defining function. As before, consider the Szegö kernel on the compact manifold $X = \partial\widetilde\Omega$ with respect to the measure
$$ d\sigma := \frac{J[\tilde r]}{\|\partial\tilde r\|}\, dS. $$
We have already seen that (Ligocka's formula)
$$ K_{\text{Szegö}}(x, s; y, t) = \frac{1}{2\pi} \sum_{k=0}^{\infty} K_{k+n+1}(x, y)\, (s\bar t)^k, \qquad H^2(X) = \bigoplus_{k=n+1}^{\infty} L^2_{\mathrm{hol}}\big(\Omega,\ e^{-k\Phi} \det[\partial\bar\partial\Phi]\big). \tag{37} $$


The space $H^2(X)$ also admits its own "Hardy-space" Toeplitz operators: namely, if $F$ is a function in, say, $C^\infty(X)$, one defines the Toeplitz operator $T_F$ on $H^2(X)$ with symbol $F$ as
$$ T_F \psi := P_{\text{Szegö}}(F\psi), \qquad \psi \in H^2(X), $$
where $P_{\text{Szegö}} : L^2(X, d\sigma) \to H^2(X)$ is the orthogonal projection (the Szegö projection).

Now if $f$ is a smooth function on $\Omega$, we can lift it to a function $F \in C^\infty(\widetilde\Omega)$ by composing with the projection on the first variable, i.e., $F(x, t) := f(x)$. An easy verification then reveals that under the orthogonal decomposition (37), the Toeplitz operators $T_f^{(m)}$ on $L^2_{\mathrm{hol}}(\Omega, e^{-m\Phi} \det[\partial\bar\partial\Phi])$ and the Toeplitz operator $T_F$ on $H^2(X)$ are related by
$$ T_F = \bigoplus_{m=n+1}^{\infty} T_f^{(m)}. $$

The main ingredient in the whole proof is that, following the ideas of Boutet de Monvel and Guillemin, we can define Toeplitz operators $T_Q$ on $H^2(X)$ by the same recipe not only for functions, but also for pseudodifferential operators (ΨDO for short) $Q$ on $X$ as symbols. That is, for a ΨDO $Q$ on $X$, we define
$$ T_Q \psi := P_{\text{Szegö}}\, Q\psi. $$
For $Q$ the operator of multiplication by a function $F \in C^\infty(X)$, this recovers the Toeplitz operators $T_F$ above as a particular case. Toeplitz operators on $H^2(X)$ with ΨDO symbols are often called generalized Toeplitz operators.

One proceeds to define the order $\operatorname{ord}(T_Q)$ and the symbol $\sigma(T_Q)$ of $T_Q$ as the order of $Q$ and the restriction of the principal symbol $\sigma(Q)$ of $Q$ to the symplectic submanifold
$$ \Sigma := \{(x, \xi) : \xi = t(\partial\tilde r - \bar\partial\tilde r)_x,\ t > 0\} $$
of the cotangent bundle of $X$, respectively. It can be shown that these two definitions are unambiguous: although it may happen that $T_Q = T_{Q'}$ for two different ΨDOs $Q$, $Q'$ (which is peculiar for ΨDO symbols; it is never the case that $T_F = T_{F'}$ for $F \neq F'$), in that case either $Q$, $Q'$ have the same order and their symbols coincide on $\Sigma$, or one of them, say $Q$, has greater order than the other and its symbol vanishes on $\Sigma$ to order $\operatorname{ord}(Q) - \operatorname{ord}(Q')$. Also, the order and the symbol of $T_Q$ obey the usual rules one would expect, as well as some additional ones:

(P1) the generalized Toeplitz operators form an algebra under composition (i.e., $\forall Q_1, Q_2\ \exists Q_3 : T_{Q_1} T_{Q_2} = T_{Q_3}$);
(P2) $\operatorname{ord}(T_{Q_1} T_{Q_2}) = \operatorname{ord}(T_{Q_1}) + \operatorname{ord}(T_{Q_2})$; $\sigma(T_{Q_1} T_{Q_2}) = \sigma(T_{Q_1})\, \sigma(T_{Q_2})$;
(P3) $\sigma([T_{Q_1}, T_{Q_2}]) = \{\sigma(T_{Q_1}), \sigma(T_{Q_2})\}_\Sigma$;
(P4) if $\operatorname{ord}(T_Q) = 0$, then $T_Q$ is a bounded operator on $H^2$;
(P5) if $\operatorname{ord}(T_{Q_1}) = \operatorname{ord}(T_{Q_2}) = k$ and $\sigma(T_{Q_1}) = \sigma(T_{Q_2})$, then $\operatorname{ord}(T_{Q_1} - T_{Q_2}) \le k - 1$;
(P6) for $F \in C^\infty(X)$ and $(x, \xi) \in \Sigma$, $\sigma(T_F)(x, \xi) = F(x)$.

Returning to the proof of Theorem BT, let $\mathcal{T}$ be the subalgebra of all generalized Toeplitz operators on $H^2(X)$ which commute with the rotations
$$ U_\theta : f(z, w) \mapsto f(z, e^{i\theta} w), \qquad (z, w) \in X,\ \theta \in \mathbf{R}, $$
in the fiber variable. Clearly, the operators $T_F$ with $F(x, t) = f(x)$ for some function $f \in C^\infty(\Omega)$ (i.e., with $F$ constant along fibers) belong to $\mathcal{T}$. Denote by $D : H^2(X) \to H^2(X)$ the infinitesimal generator of the semigroup $U_\theta$. Then $D$ acts as multiplication by $im$ on the $m$th summand in (37), for each $m$:
$$ D = \bigoplus_m im\, I; $$

and also $D = T_{\partial/\partial\theta}$ is a generalized Toeplitz operator of order 1. Using (P1)–(P6) it can be shown that if $T \in \mathcal{T}$ is of order 0, then
$$ T = T_F + D^{-1} R $$
for some (uniquely determined) $F \in C^\infty(X)$ which is constant along the fibers (hence, descends to a function on $\Omega$), and $R \in \mathcal{T}$ of order 0. Repeated application of this formula shows that, for each $k \ge 0$,
$$ T = \sum_{j=0}^{k} D^{-j}\, T_{F_j} + D^{-k-1} R_k, $$
with $F_j(x, t) = f_j(x)$ for some $f_j \in C^\infty(\Omega)$ and $R_k \in \mathcal{T}$ of order 0. Invoking the fact that zero-order operators are bounded, it follows that
$$ D^{k+1} \Big( T - \sum_{j=0}^{k} D^{-j}\, T_{F_j} \Big) = R_k $$
is a bounded operator on $H^2$. In view of the decomposition $T_F = \bigoplus_m T_f^{(m)}$, this means that
$$ \Big\| T\big|_{L^2_{\mathrm{hol}}(\Omega,\, e^{-m\Phi} \det[\partial\bar\partial\Phi])} - \sum_{j=0}^{k} m^{-j}\, T^{(m)}_{f_j} \Big\| = O(m^{-k-1}) $$
as $m \to +\infty$. Taking for $T$ the product $T_F T_G$, with $F(x, t) = f(x)$, $G(x, t) = g(x)$ for some $f, g \in C^\infty(\Omega)$, and setting $C_j(f, g) := f_j$, we obtain the desired asymptotic expansion for $T_f^{(m)} T_g^{(m)}$. Finally, the assertions concerning $C_0$ and $C_1$ follow from the above properties (P2) and (P3) of the symbol by a routine calculation.


7. Concluding remarks

This paper is by no means intended as an exhaustive survey of quantization methods, or even of the Berezin and the Berezin–Toeplitz quantizations; its main goal was to serve as a first introduction to the subject for a newcomer interested in the area. From the many surveys and overviews of various quantization techniques, the reader is referred, e.g., to [1] for a somewhat more in-depth account of many (but not all) things discussed here, as well as for abundant references to other literature. Two good surveys of traditional deformation quantization (i.e., on the level of formal power series) are Gutt [30] and Sternheimer [43]; a very nice recent overview focused on the Berezin–Toeplitz quantization discussed here is Schlichenmaier [41]. Some more technical aspects of several points left out here can be found in the author's article [22]. Excellent reading on the material discussed in Section 1 is provided by several books by Folland, in particular [28].

It should, finally, be mentioned that the subject of Berezin and Berezin–Toeplitz quantization is still far from being understood completely, and many things are still waiting to be resolved in a satisfactory way. For instance, in both Theorem B and Theorem BT the semiclassical limit $\alpha = \frac{1}{h} \to +\infty$ is taken only for $\alpha$ ranging through the integers; this is of course natural if $\Omega$ is a compact manifold (as was the original context in [9]), but is only an artifact of the methods of proof for $\Omega$ a domain in $\mathbf{C}^n$. Removing this restriction, i.e., extending the asymptotics of the reproducing kernels $K_\alpha$, the Berezin transforms $B_\alpha$, and the Toeplitz operators $T_f^{(\alpha)}$ also to non-integer $\alpha \to +\infty$, would be most desirable.

Another highly active area concerns the generalizations of Fefferman's theorem on the Szegö kernel from §5.7 (and the analogous theorem of his for the Bergman kernel, which was not mentioned here) to domains which are only weakly (i.e., not necessarily strictly) pseudoconvex; at the moment, there are only some partial results for special types of domains (see, e.g., [31]). Having a result of that kind would make it possible to extend Theorems B and BT to more general domains. Similarly, having a result of that kind for domains which are not necessarily smoothly bounded (more specifically, for Hartogs domains $\widetilde\Omega$ whose radius-function $e^{-\phi}$ has a logarithmic singularity at the boundary of $\Omega$) would make it possible to quantize metrics whose Kähler potential behaves like that at the boundary; the latter include, for instance, the important Cheng–Yau metric on $\Omega$ (the Kähler–Einstein metric; see [5] for more information on this). Carrying out the Berezin–Toeplitz quantization in the last case by the method described in Section 6 would also require an extension of the Boutet de Monvel and Guillemin theory of generalized Toeplitz operators to noncompact manifolds, which is another open problem at present.

Closely related ideas concern also the boundary behaviour of weighted Bergman kernels with respect to weights having some kind of singularity at the boundary (e.g., involving the logarithm of the defining function); some results of the present author in that direction can be found in [23]. Interestingly, the same technique as in that paper can also be used to establish that the weighted Bergman kernels $K_\alpha(x, y)$ appearing in the previous sections can be continued to meromorphic functions of $\alpha$ in the entire complex plane [25]; this is somewhat reminiscent of the resonances occurring in scattering theory, and is related to zeta functions of elliptic operators.

A subject of a completely different flavour is the extension of Theorems B and BT above also to the setting of harmonic, rather than holomorphic, functions; although this seems not to have any direct relevance for quantization, the results are equally interesting, and, apparently, much more intriguing, than in the holomorphic case (see, e.g., [24]). There is also a variety of problems, though again not directly related to quantization, concerning the range of the Berezin symbol map $T \mapsto \widetilde T$ (see, e.g., Coburn [16] and Bommier-Hato [8]), while notable applications of Toeplitz operators and the Berezin transform appear in operator theory and in time-frequency analysis; let us mention at least [17], [37], [2], [3] and [46].

References

[1] S. Twareque Ali, M. Engliš: Quantization methods: a guide for physicists and analysts, Rev. Math. Phys. 17 (2005), 391–490.
[2] S. Axler, D. Zheng: Compact operators via the Berezin transform, Indiana Univ. Math. J. 47 (1998), 387–400.
[3] W. Bauer, L.A. Coburn, J. Isralowitz: Heat flow, BMO, and the compactness of Toeplitz operators, J. Funct. Anal. 259 (2010), 57–78.
[4] F. Bayen, M. Flato, C. Fronsdal, A. Lichnerowicz, D. Sternheimer: Deformation theory and quantization, Ann. Phys. 111 (1978), 61–110 (part I), 111–151 (part II).
[5] M. Beals, C. Fefferman, R. Grossman: Strictly pseudoconvex domains in C^n, Bull. Amer. Math. Soc. 8 (1983), 125–326.
[6] F.A. Berezin: General concept of quantization, Comm. Math. Phys. 40 (1975), 153–174.
[7] R. Berman, B. Berndtsson, J. Sjöstrand: A direct approach to Bergman kernel asymptotics for positive line bundles, Ark. Mat. 46 (2008), 197–217.
[8] H. Bommier-Hato: Lipschitz estimates for the Berezin transform, J. Funct. Spaces Appl. 8 (2010), 103–128.
[9] M. Bordemann, E. Meinrenken, M. Schlichenmaier: Toeplitz quantization of Kähler manifolds and gl(N), N → ∞ limits, Comm. Math. Phys. 165 (1994), 281–296.
[10] D. Borthwick, A. Lesniewski, H. Upmeier: Nonperturbative deformation quantization of Cartan domains, J. Funct. Anal. 113 (1993), 153–176.
[11] L. Boutet de Monvel, V. Guillemin: The spectral theory of Toeplitz operators, Ann. Math. Studies, vol. 99, Princeton University Press, Princeton, 1981.
[12] L. Boutet de Monvel, J. Sjöstrand: Sur la singularité des noyaux de Bergman et de Szegö, Astérisque 34–35 (1976), 123–164.
[13] D. Catlin: The Bergman kernel and a theorem of Tian, Analysis and geometry in several complex variables (Katata 1997), pp. 1–23, Trends in Math., Birkhäuser, Boston, 1999.


[14] L.A. Coburn: Deformation estimates for the Berezin–Toeplitz quantization, Comm. Math. Phys. 149 (1992), 415–424.
[15] L.A. Coburn: Berezin–Toeplitz quantization, Algebraic methods in operator theory, pp. 101–108, Birkhäuser, Boston, 1994.
[16] L.A. Coburn: A Lipschitz estimate for Berezin's operator calculus, Proc. Amer. Math. Soc. 133 (2005), 127–131.
[17] L. Coburn: Symbol calculus for Gabor–Daubechies windowed Fourier localization operators, preprint, 2005.
[18] M. DeWilde, P.B.A. Lecomte: Existence of star products and of formal deformations of the Poisson Lie algebra of arbitrary symplectic manifolds, Lett. Math. Phys. 7 (1983), 487–496.
[19] M. Engliš: A Forelli–Rudin construction and asymptotics of weighted Bergman kernels, J. Funct. Anal. 177 (2000), 257–281.
[20] M. Engliš: The asymptotics of a Laplace integral on a Kähler manifold, J. reine angew. Math. 528 (2000), 1–39.
[21] M. Engliš: Weighted Bergman kernels and quantization, Comm. Math. Phys. 227 (2002), 211–241.
[22] M. Engliš: Berezin and Berezin–Toeplitz quantizations for general function spaces, Rev. Mat. Complut. 19 (2006), 385–430.
[23] M. Engliš: Weighted Bergman kernels for logarithmic weights, Pure Appl. Math. Quarterly (Kohn special issue) 6 (2010), 781–813.
[24] M. Engliš: Berezin transform on the harmonic Fock space, J. Math. Anal. Appl. 367 (2010), 75–97.
[25] M. Engliš: Analytic continuation of weighted Bergman kernels, J. Math. Pures Appl. 94 (2010), 622–650.
[26] B.V. Fedosov: A simple geometric construction of deformation quantization, J. Diff. Geo. 40 (1994), 213–238.
[27] C. Fefferman: The Bergman kernel and biholomorphic mappings of pseudoconvex domains, Inv. Math. 26 (1974), 1–65.
[28] G.B. Folland: Harmonic analysis in phase space, Annals of Mathematics Studies, vol. 122, Princeton University Press, Princeton, 1989.
[29] F. Forelli, W. Rudin: Projections on spaces of holomorphic functions in balls, Indiana Univ. Math. J. 24 (1974), 593–602.
[30] S. Gutt: Variations on deformation quantization, Conference Moshe Flato (Dijon, 1999), vol. I, pp. 217–254, Math. Phys. Stud. 21, Kluwer, Dordrecht, 2000.
[31] J. Kamimoto: Newton polyhedra and the Bergman kernel, Math. Z. 246 (2004), 405–440.
[32] A.V. Karabegov: Deformation quantization with separation of variables on a Kähler manifold, Comm. Math. Phys. 180 (1996), 745–755.
[33] S. Klimek, A. Lesniewski: Quantum Riemann surfaces, I: The unit disc, Comm. Math. Phys. 146 (1992), 103–122; II: The discrete series, Lett. Math. Phys. 24 (1992), 125–139; III: The exceptional cases, Lett. Math. Phys. 32 (1994), 45–61.
[34] M. Kontsevich: Deformation quantization of Poisson manifolds, preprint (1997), arXiv:q-alg/9709040.


[35] B. Kostant: Quantization and unitary representations, Lecture Notes in Math., vol. 170, Springer, Berlin, 1970.
[36] E. Ligocka: On the Forelli–Rudin construction and weighted Bergman projections, Studia Math. 94 (1989), 257–272.
[37] M.-L. Lo: The Bargmann transform and windowed Fourier localization, Integral Eqs. Oper. Theory 57 (2007), 397–412.
[38] X. Ma, G. Marinescu: Holomorphic Morse inequalities and Bergman kernels, Progress in Mathematics, vol. 254, Birkhäuser Verlag, Basel, 2007.
[39] H. Omori, Y. Maeda, A. Yoshioka: Weyl manifolds and deformation quantization, Adv. Math. 85 (1991), 224–255.
[40] W.-D. Ruan: Canonical coordinates and Bergman metrics, Comm. Anal. Geom. 6 (1998), 589–631.
[41] M. Schlichenmaier: Berezin–Toeplitz quantization for compact Kähler manifolds. A review of results, Adv. Math. Phys. (2010), Art. ID 927280, 38 pp.
[42] J.-M. Souriau: Structure des systèmes dynamiques, Dunod, Paris, 1969.
[43] D. Sternheimer: Deformation quantization: twenty years after, Particles, Fields and Gravitation (Lodz, 1998), pp. 107–145, AIP Conf. Proc. vol. 453, Amer. Inst. Phys., Woodbury, 1998.
[44] G. Tian: On a set of polarized Kähler metrics on algebraic manifolds, J. Diff. Geom. 32 (1990), 99–130.
[45] S. Zelditch: Szegö kernels and a theorem of Tian, Int. Math. Res. Not. 6 (1998), 317–331.
[46] K. Zhu: Operator theory in function spaces, 2nd edition, Amer. Math. Soc., Providence, 2007.

Miroslav Engliš
Mathematics Institute
Silesian University in Opava
Na Rybníčku 1
CZ-74601 Opava, Czech Republic
and
Mathematics Institute
Žitná 25
CZ-11567 Prague 1, Czech Republic
e-mail: [email protected]

Operator Theory: Advances and Applications, Vol. 251, 117–152
© 2016 Springer International Publishing Switzerland

Global Attraction to Solitary Waves

Andrew Comech

Abstract. We study properties of solitary wave solutions of the form $\phi(x) e^{-i\omega t}$, with $\omega$ real and $\phi(x)$ localized in space. In the first section, we sketch two fundamental results on stability of solitary waves: Derrick's theorem on instability of time-independent solutions and the Vakhitov–Kolokolov stability criterion for spectral stability of solitary waves. The main subject of the article is the structure of the (weak) global attractor of finite energy solutions to nonlinear Hamiltonian systems. The solitary resolution conjecture states that such an attractor is formed by the set of all solitary waves. We give the proof of this result for the simplest model: the Klein–Gordon field in one spatial dimension, coupled to a nonlinear oscillator. The main building block of the proof is the Titchmarsh convolution theorem; for completeness, we provide its proof.

Mathematics Subject Classification (2010). Primary 35B41; Secondary 37K40.

Keywords. Solitary waves, linear stability, Klein–Gordon equation, attractors, Titchmarsh convolution theorem.

1. Solitary waves. Linear stability

1.1. Derrick's theorem

As a warm-up, let us consider linear stability and instability of time-independent solutions to a nonlinear wave equation,
$$ -\partial_t^2 \psi = -\Delta\psi + f'(\psi), \qquad \psi = \psi(x, t) \in \mathbf{R}, \quad x \in \mathbf{R}^n, \quad n \ge 1. \tag{1.1} $$
We assume that $f(s)$ is smooth and satisfies $f(0) = 0$. Equation (1.1) is a Hamiltonian system,
$$ \partial_t \psi = \frac{\delta E}{\delta \pi}, \qquad \partial_t \pi = -\frac{\delta E}{\delta \psi}, $$
with the Hamiltonian functional
$$ E(\psi, \pi) = \int_{\mathbf{R}^n} \Big( \frac{\pi^2}{2} + \frac{|\nabla\psi|^2}{2} + f(\psi) \Big)\, dx. \tag{1.2} $$
The paper of G.H. Derrick [Der64] contained the following argument (now widely known as Derrick's theorem) to show the non-existence of stable localized time-independent solutions of the form $u(x, t) = \theta(x)$ to (1.1) in three spatial dimensions ($n = 3$):

"Let the energy of the time-independent solution $\theta(x)$ be given by
$$ E = \int \big[ (\nabla\theta)^2 + f(\theta) \big]\, d^3x. $$
A necessary condition for the solution to be stable is $\delta^2 E \ge 0$. Suppose $\theta(x)$ is a localized solution of $\delta E = 0$. Define $\theta_\lambda(x) = \theta(\lambda x)$ where $\lambda$ is an arbitrary constant, and write $I_1 = \int (\nabla\theta)^2\, d^3x$, $I_2 = \int f(\theta)\, d^3x$. Then
$$ E_\lambda = \int \big[ (\nabla\theta_\lambda)^2 + f(\theta_\lambda) \big]\, d^3x = I_1/\lambda + I_2/\lambda^3. $$
Whence $(dE_\lambda/d\lambda)|_{\lambda=1} = -I_1 - 3I_2 = 0$, and since $I_1 > 0$, $(d^2E_\lambda/d\lambda^2)|_{\lambda=1} = 2I_1 + 12I_2 = -2I_1 < 0$. That is, $\delta^2 E < 0$ for a variation corresponding to a uniform stretching of the 'particle'. Hence the solution $\theta(x)$ is unstable."

Problem 1.1. Repeat the above computation for other dimensions and prove that time-independent localized solutions in spatial dimensions $n \ge 3$ do not minimize the value of the Hamiltonian functional (1.2).

Problem 1.2. Find the conditions on $f$ so that there are localized solutions $\theta(x)$ in one spatial dimension. Show that they also do not minimize the Hamiltonian (1.2).

Since in the above argument one has $I_2 = -I_1/3 < 0$, we know that $f(\theta(x))$ takes some negative values. The instability of solutions in the potential $f(u)$ which does not satisfy $f(u) > 0$ for $u \neq 0$ seems natural to expect: solutions would tend to slip into the negative values of the potential energy, where $u \neq 0$. In fact, this mechanical analogy is misleading, and in general the condition $\delta^2 E \ge 0$ is not necessary for stability. For example, by [PS13], there are stable solitary waves in the massive Thirring model, although the second derivative of $E$ has infinitely many negative directions. A rigorous treatment of the instability of time-independent solutions to the nonlinear wave equation (1.1) in any dimension is in [KS07].

Let us modify Derrick's argument to show the linear instability of smooth time-independent solutions in any dimension (cf. Definition 1.6 below).
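Derrick's scaling computation quoted above can be repeated mechanically in $n$ spatial dimensions. The sketch below assumes the same energy $\int [(\nabla\theta)^2 + f(\theta)]\, d^nx$, for which the substitution $\theta_\lambda(x) = \theta(\lambda x)$ gives $E_\lambda = I_1 \lambda^{2-n} + I_2 \lambda^{-n}$. It uses exact rational arithmetic to impose stationarity at $\lambda = 1$ and evaluate the second derivative; the function name and the normalization $I_1 = 1$ are illustrative choices, not part of the original text.

```python
from fractions import Fraction


def derrick_second_variation(n, I1=Fraction(1)):
    """Return (I2, d2E) at lambda = 1 for E_lambda = I1*lam**(2-n) + I2*lam**(-n).

    I2 is fixed by the stationarity condition dE_lambda/dlambda = 0 at lambda = 1.
    """
    # dE/dlam at lam = 1 equals (2-n)*I1 - n*I2; setting it to zero gives I2.
    I2 = Fraction(2 - n, n) * I1
    # Second derivative of I1*lam**(2-n) + I2*lam**(-n) evaluated at lam = 1.
    d2E = (2 - n) * (1 - n) * I1 + n * (n + 1) * I2
    return I2, d2E


if __name__ == "__main__":
    for n in range(1, 7):
        I2, d2E = derrick_second_variation(n)
        print(f"n={n}: I2/I1={I2}, second derivative/I1={d2E}")
```

For $n = 3$ this reproduces $I_2 = -I_1/3$ and the value $-2I_1$ from the quotation. In this sketch the second derivative works out to $2(2 - n) I_1$, so the scaling variation alone detects a negative direction only for $n \ge 3$.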
More precisely, given a time-independent solution $\theta(x)$ to (1.1), we would like to show that if $\psi(x, t) = \theta(x) + r(x, t)$ is a perturbed solution, then the linearized equation on $r$ will have exponentially growing modes. This suggests that certain small perturbations of $\theta(x)$ will grow exponentially in time, implying instability of $\theta(x)$ (see Remark 1.7 below).

Lemma 1.3 (Derrick's theorem for $n \ge 1$). Let $n \ge 1$. If the nonlinear wave equation (1.1) admits a smooth time-independent solution $\theta \in H^\infty(\mathbf{R}^n)$, then this solution is linearly unstable.

Above, $H^\infty(\mathbf{R})$ denotes the intersection of all Sobolev spaces $H^k(\mathbf{R})$, $k \in \mathbf{N}$.


Proof. Since $\theta$ satisfies $-\Delta\theta + g(\theta) = 0$, where $g := f'$, we also have $-\Delta \partial_{x_1}\theta + g'(\theta)\, \partial_{x_1}\theta = 0$. Due to $\lim_{|x| \to \infty} \theta(x) = 0$, $\partial_{x_1}\theta$ vanishes somewhere. Therefore, according to the minimum principle, $\partial_{x_1}\theta$ cannot be an eigenfunction corresponding to the smallest eigenvalue; hence there is a nonzero $\chi \in L^2(\mathbf{R}^n)$ which corresponds to some smaller (hence negative) eigenvalue of $L = -\Delta + g'(\theta)$, so that $L\chi = -c^2\chi$, with some $c > 0$. Taking the Ansatz which describes a perturbed solution, $\psi(x, t) = \theta(x) + r(x, t)$, we obtain the linearization at $\theta$:
$$ \partial_t^2 r = -Lr. $$
We rewrite this linearization as the first-order system:
$$ \partial_t \begin{bmatrix} r \\ s \end{bmatrix} = \begin{bmatrix} 0 & 1 \\ -L & 0 \end{bmatrix} \begin{bmatrix} r \\ s \end{bmatrix}. $$
The matrix in the right-hand side has eigenvectors $\begin{bmatrix} \chi \\ \pm c\chi \end{bmatrix}$, corresponding to the eigenvalues $\pm c \in \mathbf{R}$; thus, the solution $\theta$ is linearly unstable.

Let us also mention that $\partial_\tau^2|_{\tau=0}\, E(\theta + \tau\chi) = \langle \chi, E''(\theta)\chi \rangle = \langle \chi, L\chi \rangle < 0$, showing that $\delta^2 E(\theta)$ is not positive-definite. Here and below, $\langle\ ,\ \rangle$ refers to the inner product in $L^2$. □

Remark 1.4. It is shown in [KS07, Theorem 5.6] that the linearization at a time-independent solution may be linearly stable when this particular solution is not from the Sobolev space $H^1$ (such examples exist in dimensions $n > 10$).

While Derrick's result [Der64] on absence of stable localized solutions could be considered an obstacle to interpreting soliton-like solutions as particles, Derrick suggested several ways to obtain stable localized solutions, including the following:

"Elementary particles might correspond to stable, localized solutions which are periodic in time, rather than time-independent."

We will perform the linear stability analysis of such solutions in Section 1.3.

1.2. Existence of solitary waves

The existence of solitary wave solutions of the form
$$ \psi_\omega(x, t) = \phi_\omega(x)\, e^{-i\omega t}, \qquad \omega \in \mathbf{R}, \quad \phi_\omega \in H^1(\mathbf{R}^n), \tag{1.3} $$
to the nonlinear Klein–Gordon equation (and nonlinear Schrödinger equation) in $\mathbf{R}^n$, in a rather generic situation, was established in [Str77] (a more general result was obtained in [BL83a, BL83b]). We will call such solutions solitary waves. Other appropriate names are nonlinear eigenfunctions and quantum stationary states. Note that while solitary waves are time-periodic, the observable quantities, such as the charge and current densities, are time-independent.

We denote the set of all solitary waves by $\mathcal{S}$. Typically, solitary waves exist for $\omega$ from an interval or a collection of intervals of the real line; therefore, the factor space $\mathcal{S}/U(1)$ in a generic situation is isomorphic to a finite union of intervals.


Solitary waves in one spatial dimension. Let us construct solitary wave solutions in the simple one-dimensional case. We consider the U(1)-invariant nonlinear Schrödinger equation,
$$ i\partial_t \psi = -\partial_x^2 \psi + g(|\psi|^2)\psi, \qquad \psi = \psi(x, t) \in \mathbf{C}, \quad x \in \mathbf{R}, \tag{1.4} $$
with $g(s)$ a smooth real-valued function. For our convenience, we will assume that $m := g(0) > 0$. The amplitude $\phi_\omega(x)$ of a solitary wave is to satisfy the stationary equation
$$ \omega\phi = -\partial_x^2 \phi + g(\phi^2)\phi, \tag{1.5} $$
which we rewrite as
$$ \partial_x^2 \phi = g(\phi^2)\phi - \omega\phi = -\partial_\phi\, \frac{\omega\phi^2 - G(\phi^2)}{2}, \tag{1.6} $$
with $G(s) = \int_0^s g(s')\, ds'$. (We will see in a moment that if there is a solitary wave, then $\phi_\omega$ could be chosen positive.) We will interpret equation (1.6) as describing the particle in the "effective potential"
$$ V_\omega(\phi) := -\frac{\omega\phi^2 - G(\phi^2)}{2}\cdot(-1) = \frac{\omega\phi^2 - G(\phi^2)}{2}, $$
so that $x$ is "the time" and $\phi$ is "the position" of the particle. The "mechanical" energy corresponding to the system described by equation (1.6) is $E(\phi) = |\phi'|^2/2 + V_\omega(\phi)$. For a particular solution $\phi(x)$ to (1.6), $E(\phi)$ is constant (it does not depend on the "time" $x$). We are interested in soliton-like solutions, such that $\phi \to 0$ and $\phi' \to 0$ as $|x| \to \infty$, and hence $E(\phi) \equiv 0$. If there is a "turning point" $\mu_\omega > 0$ such that $V_\omega(\phi) < 0$ for $\phi \in (0, \mu_\omega)$, $V_\omega(\mu_\omega) = 0$, and $V_\omega'(\mu_\omega) > 0$, then there exists a "trajectory" $\phi(x)$ with zero "mechanical" energy $E = 0$, which satisfies $\lim_{x \to \pm\infty} \phi(x) = 0$. Such a soliton is defined up to a shift along $x$; we fix $\phi_\omega$ by requiring that it assumes its maximum value at the origin: $\phi_\omega(0) = \mu_\omega$ (then $\phi_\omega$ is symmetric). $\phi_\omega$ is obtained by integration from $d\phi/dx = -\sqrt{-2V_\omega(\phi)}$ for $x > 0$. See Figure 1.
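To make the "particle in an effective potential" picture concrete, the following sketch takes the illustrative nonlinearity $g(s) = m - s$ (an assumption made only for this example; it satisfies $g(0) = m > 0$). Then $G(s) = ms - s^2/2$, the turning point is $\mu_\omega = \sqrt{2(m - \omega)}$, and the zero-energy trajectory is the familiar profile $\phi_\omega(x) = \sqrt{2}\,\kappa\, \mathrm{sech}(\kappa x)$ with $\kappa = \sqrt{m - \omega}$. The code checks numerically that the "mechanical" energy $|\phi'|^2/2 + V_\omega(\phi)$ vanishes along this profile; the function names are hypothetical.

```python
import math


def soliton_profile(x, m, omega):
    """sech profile for the assumed nonlinearity g(s) = m - s (needs omega < m)."""
    kappa = math.sqrt(m - omega)
    return math.sqrt(2.0) * kappa / math.cosh(kappa * x)


def soliton_slope(x, m, omega):
    """Exact x-derivative of soliton_profile."""
    kappa = math.sqrt(m - omega)
    return -math.sqrt(2.0) * kappa ** 2 * math.tanh(kappa * x) / math.cosh(kappa * x)


def effective_potential(phi, m, omega):
    """V_omega(phi) = (omega*phi^2 - G(phi^2))/2 with G(s) = m*s - s^2/2."""
    s = phi * phi
    return 0.5 * (omega * s - (m * s - 0.5 * s * s))


def mechanical_energy(x, m, omega):
    """|phi'|^2/2 + V_omega(phi) along the profile; expected to vanish."""
    phi = soliton_profile(x, m, omega)
    return 0.5 * soliton_slope(x, m, omega) ** 2 + effective_potential(phi, m, omega)


if __name__ == "__main__":
    m, omega = 1.0, 0.36
    print("turning point:", soliton_profile(0.0, m, omega))
    for x in (-3.0, -1.0, 0.0, 0.5, 2.0):
        print(x, mechanical_energy(x, m, omega))
```

For $x > 0$ the slope is negative, consistent with choosing the minus sign in $d\phi/dx = -\sqrt{-2V_\omega(\phi)}$.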

1.3. Stability of solitary waves

By Derrick's theorem (cf. Lemma 1.3), all spatially localized time-independent solutions to the nonlinear wave equation (1.1) turn out to be unstable. Derrick rightly suggested in [Der64] that localized solutions to (1.1) which are not static but rather time-periodic could be stable. To illustrate this, let us consider the (generalized) nonlinear Schrödinger equation in one dimension (1.4). Let $\phi_\omega(x) e^{-i\omega t}$ be a solitary wave solution to (1.4) with $\omega < m$ and with $\phi_\omega(x)$ even. Let us recall the definition of the stability of the orbit of $\phi_\omega(x) e^{-i\omega t}$ [CL82, GSS87].

Definition 1.5 (Orbital stability). The $\phi_\omega$-orbit is called orbitally stable if for any $\epsilon > 0$ there is $\delta > 0$ such that for any $u_0 \in H^1(\mathbf{R}^n)$ which satisfies $\|\phi_\omega - u_0\|_{H^1(\mathbf{R}^n)} < \delta$ there is a solution $u(t)$ to (1.4) with $u|_{t=0} = u_0$ such that
$$ \sup_{t \in \mathbf{R}}\, \inf_{s \in \mathbf{R}}\, \|u(t) - e^{is}\phi_\omega\|_{H^1(\mathbf{R}^n)} < \epsilon. $$
Otherwise, the $\phi_\omega$-orbit is called unstable.

Figure 1. Solitary wave profile $\phi_\omega(x)$ as a "particle trajectory" in the effective potential $V_\omega(\phi)$, with $x$ interpreted as "time".

To study the linear stability of $\phi_\omega(x) e^{-i\omega t}$, one considers the solution to (1.4) in the form of the Ansatz $\psi(x, t) = (\phi_\omega(x) + \rho(x, t)) e^{-i\omega t}$, with $\rho(x, t) \in \mathbf{C}$. The linearized equation on $\rho$ is called the linearization at a solitary wave:
$$ \partial_t R = JLR, \qquad R(x, t) = \begin{bmatrix} \operatorname{Re} \rho(x, t) \\ \operatorname{Im} \rho(x, t) \end{bmatrix}, \tag{1.7} $$
with
$$ J = \begin{bmatrix} 0 & 1 \\ -1 & 0 \end{bmatrix}, \qquad L = \begin{bmatrix} L_+ & 0 \\ 0 & L_- \end{bmatrix}, \tag{1.8} $$
$$ L_- = -\partial_x^2 + g(\phi_\omega^2) - \omega, \qquad L_+ = L_- + 2g'(\phi_\omega^2)\phi_\omega^2. \tag{1.9} $$

Note that since $L_- \neq L_+$, the action of $L$ on $\rho$ considered as taking values in $\mathbf{C}$ is $\mathbf{R}$-linear but not $\mathbf{C}$-linear. This is why we need to write the equation in the form of a system (1.7).

Definition 1.6 (Linear instability). The solitary wave $\phi_\omega(x) e^{-i\omega t}$ is called linearly unstable if the intersection of the spectrum of the linearized equation with the right half-plane is nonempty: $\sigma(JL) \cap \{\lambda \in \mathbf{C} : \operatorname{Re} \lambda > 0\} \neq \emptyset$. Otherwise, the solitary wave is called spectrally stable.

Remark 1.7. Although linear instability implies (nonlinear) instability in a rather general situation, this relation is nontrivial; see, e.g., [SS00, KS07, GO12].

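Since $\lim_{|x|\to\infty}\phi_\omega(x) = 0$ and $g(0) = m$, the essential spectrum of $L_-$ and $L_+$ is $[m-\omega, +\infty)$ (by Weyl's theorem on the essential spectrum); it follows that the essential spectrum of $JL$ is $i\mathbb{R} \setminus \big(-i(m-\omega),\, i(m-\omega)\big)$. Therefore, the linear stability is determined by the location of the point spectrum of $JL$.

First, let us note that the spectrum of $JL$ is located on the real and imaginary axes only: $\sigma(JL) \subset \mathbb{R} \cup i\mathbb{R}$. To prove this, we consider
$$(JL)^2 = -\begin{bmatrix} L_-L_+ & 0 \\ 0 & L_+L_- \end{bmatrix}.$$
Since $L_-$ is positive-semidefinite ($\phi_\omega \in \ker L_-$, being nowhere zero, corresponds to its smallest eigenvalue), we can define the selfadjoint root of $L_-$; then
$$\sigma_d((JL)^2)\setminus\{0\} = \sigma_d(L_-L_+)\setminus\{0\} = \sigma_d(L_+L_-)\setminus\{0\} = \sigma_d(L_-^{1/2}L_+L_-^{1/2})\setminus\{0\} \subset \mathbb{R},$$
with the inclusion due to $L_-^{1/2}L_+L_-^{1/2}$ being selfadjoint. Thus, any eigenvalue $\lambda \in \sigma_d(JL)$ satisfies $\lambda^2 \in \mathbb{R}$.

Given the family of solitary waves $\phi_\omega(x)e^{-i\omega t}$, $\omega \in \Omega$, with $\Omega$ some subset of the real line, we would like to know at which $\omega$ the eigenvalues of the linearized equation with $\operatorname{Re}\lambda > 0$ appear. Since $\lambda^2 \in \mathbb{R}$, such eigenvalues can only be located on the real axis, having emerged from $\lambda = 0$. Hence, the appearance of real eigenvalues follows the jump in the dimension of the generalized null space of $JL$. Taking the derivatives of (1.5) with respect to $x$ and $\omega$, one can check that there are relations
$$JL\begin{bmatrix} 0 \\ \phi_\omega \end{bmatrix} = 0, \qquad JL\begin{bmatrix} -\partial_\omega\phi_\omega \\ 0 \end{bmatrix} = \begin{bmatrix} 0 \\ \phi_\omega \end{bmatrix}, \tag{1.10}$$
$$JL\begin{bmatrix} \partial_x\phi_\omega \\ 0 \end{bmatrix} = 0, \qquad JL\begin{bmatrix} 0 \\ -x\phi_\omega/2 \end{bmatrix} = \begin{bmatrix} \partial_x\phi_\omega \\ 0 \end{bmatrix}. \tag{1.11}$$
This shows that $\lambda = 0$ belongs to the point spectrum of $JL$, and moreover that there are two Jordan blocks which correspond to $\lambda = 0$. A jump in the dimension of the generalized null space of $JL$ takes place if at a particular value of $\omega$ one can solve either the equation
$$JL\zeta = \begin{bmatrix} \partial_\omega\phi_\omega \\ 0 \end{bmatrix} \tag{1.12}$$
or
$$JL\zeta = \begin{bmatrix} 0 \\ x\phi_\omega \end{bmatrix}, \tag{1.13}$$
so that one of the Jordan blocks (1.10), (1.11) increases its size.

Let us consider the Jordan block (1.10). By the Fredholm alternative, there is a solution to (1.12) if and only if $\begin{bmatrix} \partial_\omega\phi_\omega \\ 0 \end{bmatrix}$ is orthogonal to the null space of $(JL)^* = -LJ$, which is given by
$$\ker LJ = \operatorname{Span}\left\{ \begin{bmatrix} \phi_\omega \\ 0 \end{bmatrix}, \begin{bmatrix} 0 \\ \partial_x\phi_\omega \end{bmatrix} \right\}.$$
Thus, there is a solution to (1.12) whenever
$$\left\langle \begin{bmatrix} \phi_\omega \\ 0 \end{bmatrix}, \begin{bmatrix} \partial_\omega\phi_\omega \\ 0 \end{bmatrix} \right\rangle = \langle \phi_\omega, \partial_\omega\phi_\omega \rangle = \partial_\omega\|\phi_\omega\|_{L^2}^2/2 = 0.$$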

Problem 1.8. Check that the Jordan block (1.11) is always of size 2.

A slightly more careful analysis [CP03] based on a construction of the moving frame in the generalized eigenspace of $\lambda = 0$ shows that there are two real eigenvalues $\pm\lambda \in \mathbb{R}$ that have emerged from $\lambda = 0$ when $\omega$ is such that $\partial_\omega\|\phi_\omega\|_{L^2}^2$ becomes positive, leading to a linear instability of the corresponding solitary wave. The opposite condition,
$$\partial_\omega\|\phi_\omega\|_{L^2}^2 < 0, \tag{1.14}$$
is called the Vakhitov–Kolokolov stability criterion, which guarantees the absence of nonzero real eigenvalues for the groundstates of the nonlinear Schrödinger equation. (Groundstates are defined as the family of solitary waves with $\phi_\omega(x)$ strictly positive.) The condition (1.14) appeared in [VK73, CL82, Sha83, Wei86, GSS87] in relation to linear and orbital stability of solitary waves. For more details and results on orbital stability, see the book [Str89]. The asymptotic stability of solitary waves has been studied by Soffer and Weinstein [SW90, SW92], Buslaev and Perelman [BP92, BP95], and then developed in [PW97, SW99, Cuc01, BS03, Cuc03] and other papers.

Let us now give the details of the linear instability argument by Vakhitov–Kolokolov [VK73].

Lemma 1.9 (Vakhitov–Kolokolov stability criterion). There is $\lambda \in \sigma_p(JL)$, $\lambda > 0$, where $JL$ is the linearization (1.7) at the solitary wave $\phi_\omega(x)e^{-i\omega t}$, if and only if $\frac{d}{d\omega}\|\phi_\omega\|_{L^2}^2 > 0$ at this value of $\omega$.

Remark 1.10. We are interested in the one-dimensional case $x \in \mathbb{R}$, although the proof works for any dimension.

Proof. We follow [VK73]. Assume that $\phi_\omega$ is linearly unstable, so that there is $\lambda \in \sigma_d(JL)$, $\lambda > 0$. The relation $(JL - \lambda)\Xi = 0$ implies that $\lambda^2\Xi_1 = -L_-L_+\Xi_1$. It follows that $\Xi_1$ is orthogonal to the kernel of the selfadjoint operator $L_-$ (which is spanned by $\phi_\omega$):
$$\langle \phi_\omega, \Xi_1 \rangle = -\frac{1}{\lambda^2}\langle \phi_\omega, L_-L_+\Xi_1 \rangle = -\frac{1}{\lambda^2}\langle L_-\phi_\omega, L_+\Xi_1 \rangle = 0,$$
hence there is $\eta \in L^2(\mathbb{R}, \mathbb{C})$ such that $\Xi_1 = L_-\eta$ and $\lambda^2\eta = -L_+\Xi_1$. Thus, the inverse to $L_-$ can be applied: $\lambda^2 L_-^{-1}\Xi_1 = -L_+\Xi_1$. Then
$$\lambda^2\langle \eta, L_-\eta \rangle = -\langle \Xi_1, L_+\Xi_1 \rangle.$$
Since $L_-$ is positive-semidefinite and $\eta \notin \ker L_-$, it follows that $\langle \eta, L_-\eta \rangle > 0$. Since $\lambda > 0$, one has $\langle \Xi_1, L_+\Xi_1 \rangle < 0$, therefore the quadratic form $\langle \cdot, L_+\cdot \rangle$ is not positive-definite on vectors orthogonal to $\phi_\omega$. According to Lagrange's principle, the function $r$ corresponding to the minimum of $\langle r, L_+r \rangle$ under conditions $\langle r, \phi_\omega \rangle = 0$ and $\langle r, r \rangle = 1$ satisfies
$$L_+r = \alpha r + \beta\phi_\omega, \qquad \alpha, \beta \in \mathbb{R}. \tag{1.15}$$
Since $\lambda$ is positive, $\langle r, L_+r \rangle = \alpha$ has to be negative. Since $L_+\partial_x\phi_\omega = 0$, one has $\lambda_1 = 0 \in \sigma_p(L_+)$. Due to $\partial_x\phi_\omega$ vanishing at one point ($x = 0$), there is exactly one negative eigenvalue of $L_+$, which we denote by $\lambda_0 \in \sigma_p(L_+)$. (This eigenvalue corresponds to some non-vanishing eigenfunction.) Note that $\beta \ne 0$, or else $\alpha$ would have to be equal to $\lambda_0$, with $r$ the corresponding eigenfunction of $L_+$, but then $r$, having to be nowhere zero, could not be orthogonal to $\phi_\omega$.

Denote $\lambda_2 := \inf(\sigma(L_+) \cap \mathbb{R}_+) > 0$. Let us consider $f(z) = \langle \phi_\omega, (L_+ - z)^{-1}\phi_\omega \rangle$, which is defined and is smooth for $z \in (\lambda_0, \lambda_2)$. Note that $f(z)$ is defined at $z = \lambda_1 := 0 \in \sigma_d(L_+)$ since $\phi_\omega$ (which is even) is orthogonal to the null space of $L_+^* = L_+$, spanned by $\partial_x\phi_\omega$ (which is odd). If $\alpha < 0$, then, by (1.15), we would have
$$f(\alpha) = \langle \phi_\omega, (L_+ - \alpha)^{-1}\phi_\omega \rangle = \frac{1}{\beta}\langle \phi_\omega, r \rangle = 0,$$
and since $f'(z) > 0$, one has $f(0) > 0$. On the other hand,
$$f(0) = \langle \phi_\omega, L_+^{-1}\phi_\omega \rangle = \langle \phi_\omega, \partial_\omega\phi_\omega \rangle = \frac{1}{2}\frac{d}{d\omega}\int_{\mathbb{R}} |\phi_\omega(x)|^2\,dx.$$
Thus, the linear instability leads to $\alpha < 0$, which results in $\frac{d}{d\omega}\int_{\mathbb{R}} |\phi_\omega(x)|^2\,dx > 0$.

To prove the converse ("if") part of the lemma, now assume that $\frac{d}{d\omega}\|\phi_\omega\|_{L^2}^2 > 0$. We consider the function $f(z) = \langle \phi_\omega, (L_+ - z)^{-1}\phi_\omega \rangle$, $z \in \rho(L_+)$. Since $f(0) = \langle \phi_\omega, L_+^{-1}\phi_\omega \rangle > 0$, $f'(z) > 0$, and $\lim_{z\to\lambda_0+} f(z) = -\infty$ (where $\lambda_0 < 0$ is the smallest eigenvalue of $L_+$), there is $\alpha \in (\lambda_0, 0) \subset \rho(L_+)$ such that $f(\alpha) = \langle \phi_\omega, (L_+ - \alpha)^{-1}\phi_\omega \rangle = 0$. Then we define $r = (L_+ - \alpha)^{-1}\phi_\omega$. Since $\langle \phi_\omega, r \rangle = f(\alpha) = 0$, there is $\eta$ such that $r = L_-^{1/2}\eta$. It follows that the quadratic form $L_-^{1/2}L_+L_-^{1/2}$ is not positive definite:
$$\langle \eta, L_-^{1/2}L_+L_-^{1/2}\eta \rangle = \langle L_-^{1/2}\eta, L_+L_-^{1/2}\eta \rangle = \langle r, L_+r \rangle = \langle r, \alpha r + \phi_\omega \rangle = \alpha\langle r, r \rangle < 0.$$
Thus, there is $\lambda > 0$ such that $-\lambda^2 \in \sigma(L_-^{1/2}L_+L_-^{1/2})$; then also $-\lambda^2 \in \sigma(L_-L_+)$. Let $\xi$ be the corresponding eigenvector, $L_-L_+\xi = -\lambda^2\xi$; then
$$\begin{bmatrix} 0 & L_- \\ -L_+ & 0 \end{bmatrix}\begin{bmatrix} \xi \\ -\frac{1}{\lambda}L_+\xi \end{bmatrix} = \lambda\begin{bmatrix} \xi \\ -\frac{1}{\lambda}L_+\xi \end{bmatrix},$$
hence $\lambda \in \sigma(JL)$. □
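The relation $\lambda^2 \in \sigma(-L_-L_+)$ behind the argument above can be illustrated by a finite-dimensional caricature. The sketch below is only a toy model under a loud assumption: the operators $L_+$ and $L_-$ are replaced by real numbers `l_plus` and `l_minus` (hypothetical names), so that $JL$ becomes a $2\times 2$ matrix whose eigenvalues are purely imaginary when both numbers are positive and real when `l_plus` turns negative.

```python
import cmath


def jl_eigenvalues(l_plus: float, l_minus: float):
    """Eigenvalues of the 2x2 matrix [[0, l_minus], [-l_plus, 0]].

    Toy assumption: L_+ and L_- are replaced by the real numbers
    l_plus and l_minus, so this is not the actual linearization.
    The characteristic polynomial is lambda^2 + l_minus * l_plus = 0.
    """
    root = cmath.sqrt(-l_minus * l_plus)
    return root, -root


# Both "operators" positive: lambda^2 < 0, eigenvalues on the imaginary axis.
stable = jl_eigenvalues(l_plus=2.0, l_minus=3.0)

# l_plus negative while l_minus stays positive: a pair of real eigenvalues
# +/- lambda appears, which is the scalar analogue of linear instability.
unstable = jl_eigenvalues(l_plus=-2.0, l_minus=3.0)

if __name__ == "__main__":
    print("stable case:  ", stable)
    print("unstable case:", unstable)
```

In both cases $\lambda^2$ is real, in line with $\sigma(JL) \subset \mathbb{R} \cup i\mathbb{R}$.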



2. The set of all solitary waves as a global attractor

2.1. Introduction

The long time asymptotics for nonlinear wave equations have been the subject of intensive research, starting with the pioneering papers by Segal [Seg63b, Seg63a], Strauss [Str68], and Morawetz and Strauss [MS72], where the nonlinear scattering and local attraction to zero were considered. Global attraction (for large initial data) to zero cannot possibly hold if there are solitary wave solutions. The existing results suggest that the set of orbitally stable solitary waves typically forms a local attractor, that is, attracts any finite energy solutions that were initially close to it. Moreover, a natural hypothesis is that the set of all solitary waves forms a global attractor of all finite energy solutions.

2.2. Relation to Quantum theory

In 1911, Niels Bohr receives his doctorate from the University of Copenhagen (under the physicist Christian Christiansen) and leaves for Cambridge, where he works as a post-doctoral student under experimentalist J.J. Thomson, of Trinity College, Cambridge and Cavendish Laboratory, who earlier studied the deflection of cathode rays in magnetic and electric fields, measured the mass-to-charge ratio of the cathode rays, and suggested back in 1897 [Tho97] the existence of electrons. Let us mention that J.J. Thomson was a Ph.D. student of Lord Rayleigh, the author of the celebrated two volumes "The Theory of Sound" (1877, 1878), while one of Thomson's Ph.D. students was Ernest Rutherford; Thomson's son, George Paget Thomson, shared the Nobel Prize in 1937 with C.J. Davisson "for their experimental discovery of the diffraction of electrons by crystals". Two years later, Bohr formulates his famous postulates [Boh13].

Bohr's first postulate: quantum stationary states as Schrödinger's eigenstates. According to Bohr's first postulate, an unperturbed electron runs forever along a certain stationary orbit, which we denote $|E\rangle$ and call quantum stationary state. Once in such a state, the electron has a fixed value of energy $E$, with energy not being lost via emitted radiation. Under a perturbation, the electron can jump from one quantum stationary state to another,
$$|E_-\rangle \longrightarrow |E_+\rangle, \tag{2.1}$$
emitting or absorbing a quantum of light with energy equal to the difference of the energies $E_+$ and $E_-$. The old quantum theory was based on the Bohr–Sommerfeld quantization condition
$$\oint p \cdot dq = 2\pi\hbar n, \qquad n \in \mathbb{N}, \tag{2.2}$$
with $q$ and $p$ the position and the momentum of the electron. This condition leads to the values
$$E_n = -\frac{m e^4}{2\hbar^2 n^2}, \qquad n \in \mathbb{N}, \tag{2.3}$$
for the energy levels in Hydrogen, in good agreement with the experiment. In the above formula, $m > 0$ is the mass of the electron, $e < 0$ is its charge, $\hbar$ is Planck's constant, and we assume that the units are chosen so that the speed of light is equal to 1. Apparently, the Bohr–Sommerfeld quantization condition (2.2) does not explain the perpetual circular motion of the electron; according to classical Electrodynamics, such a motion would be accompanied by a loss of energy via radiation.

In terms of the wavelength $\lambda = \frac{2\pi\hbar}{|p|}$ of de Broglie's phase waves [Bro24], the Bohr–Sommerfeld condition states that the length of the classical orbit of the electron is an integer multiple of $\lambda$. Following de Broglie's ideas, Schrödinger [Sch26] identified Bohr's stationary orbits, or quantum stationary states $|E\rangle$, with wave functions that have the form
$$\psi(x,t) = \phi_\omega(x)e^{-i\omega t}, \qquad \omega = E/\hbar. \tag{2.4}$$
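The energy levels (2.3) are easy to evaluate numerically. The sketch below is an illustration rather than part of the text: it assumes SI units, in which the Gaussian-unit expression $m e^4/(2\hbar^2 n^2)$ becomes $m e^4/(8\varepsilon_0^2 h^2 n^2)$, and it uses CODATA values for the constants; the function name `bohr_energy_ev` is hypothetical.

```python
# CODATA 2018 values in SI units (assumed here; the text uses Gaussian units).
ELECTRON_MASS = 9.1093837015e-31     # kg
ELEMENTARY_CHARGE = 1.602176634e-19  # C
VACUUM_PERMITTIVITY = 8.8541878128e-12  # F/m
PLANCK_H = 6.62607015e-34            # J s (h, not h-bar)


def bohr_energy_ev(n: int) -> float:
    """Energy of the n-th Bohr level of Hydrogen in electron-volts.

    SI form of (2.3): E_n = -m e^4 / (8 eps0^2 h^2 n^2).
    """
    if n < 1:
        raise ValueError("the quantum number n must be a positive integer")
    joules = -ELECTRON_MASS * ELEMENTARY_CHARGE**4 / (
        8.0 * VACUUM_PERMITTIVITY**2 * PLANCK_H**2 * n**2
    )
    return joules / ELEMENTARY_CHARGE  # convert J -> eV


if __name__ == "__main__":
    for level in (1, 2, 3):
        print(level, bohr_energy_ev(level))
```

The $1/n^2$ scaling means that the energy of a quantum jump (2.1) between levels $n_-$ and $n_+$ is the difference `bohr_energy_ev(n_plus) - bohr_energy_ev(n_minus)`.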

Physically, the charge and current densities
$$\rho = e\bar\psi\psi, \qquad j = \frac{e}{2i}(\bar\psi \cdot \nabla\psi - \nabla\bar\psi \cdot \psi) \tag{2.5}$$
which correspond to the (quasi)stationary states of the form $\psi(x,t) = \phi_\omega(x)e^{-i\omega t}$ do not depend on time, and therefore the generated electromagnetic field is also stationary and does not carry the energy away from the system, allowing the electron cloud to flow forever around the nucleus.

Bohr's second postulate: quantum jumps as global attraction to solitary waves. Bohr's second postulate states that the electrons can jump from one quantum stationary state (Bohr's stationary orbit) to another. This postulate suggests the dynamical interpretation of Bohr's transitions as long-time attraction
$$\Psi(t) \longrightarrow |E_\pm\rangle, \qquad t \to \pm\infty \tag{2.6}$$
for any trajectory $\Psi(t)$ of the corresponding dynamical system, where the limiting states $|E_\pm\rangle$ depend on the trajectory. Then the quantum stationary states, denote them by $\mathcal{S}$, should be viewed as points of the global attractor. The attraction (2.6) takes the form of the long-time asymptotics
$$\psi(x,t) \sim \phi_{\omega_\pm}(x)e^{-i\omega_\pm t}, \qquad t \to \pm\infty, \tag{2.7}$$
which holds for each finite energy solution. See Figure 2.

[Figure 2]
0 and $q(s)$ is a polynomial with real coefficients of degree at least one. The quantity
$$F(\psi) := q(|\psi|^2)\psi, \qquad \psi \in \mathbb{C} \tag{3.2}$$
has the meaning of a force exerted at the string by a nonlinear oscillator located at the point $x = 0$. All derivatives and the equation are understood in the sense of distributions. Equation (3.1) is $\mathbf{U}(1)$-invariant, where $\mathbf{U}(1)$ stands for the unitary group $e^{i\theta}$, $\theta \in \mathbb{R} \bmod 2\pi$. If we identify a complex number $\psi = u + iv \in \mathbb{C}$ with the two-dimensional vector $(u, v) \in \mathbb{R}^2$, then, physically, equation (3.1) describes small crosswise oscillations of the infinite string in three-dimensional space $(x, u, v)$ stretched along the $x$-axis. The string is subject to the action of an "elastic force" $-m^2\psi(x,t)$ and coupled to a nonlinear oscillator of the force $F(\psi(0,t))$ attached at the point $x = 0$.

Solitary waves.

Definition 3.1.
(1) The solitary wave solutions of (3.1) are finite energy solutions to (3.1) of the form
$$\psi(x,t) = \phi_\omega(x)e^{-i\omega t}, \qquad \text{where } \omega \in \mathbb{R},\ \phi_\omega \in H^1(\mathbb{R}). \tag{3.3}$$
(2) The set of all solitary wave solutions is denoted by $\mathcal{S}$:
$$\mathcal{S} = \{\psi \in C(\mathbb{R}, H^1(\mathbb{R})) :\ \psi(x,t) = \phi_\omega(x)e^{-i\omega t},\ \omega \in \mathbb{R},\ \phi_\omega \in H^1(\mathbb{R})\}. \tag{3.4}$$
Note that $\mathcal{S}$ also contains the zero solution.
(3) The solitary manifold is the set of corresponding initial data:
$$\mathbf{S} = \{(\phi_\omega, -i\omega\phi_\omega) :\ \phi_\omega(x)e^{-i\omega t} \in \mathcal{S}\}. \tag{3.5}$$

Remark 3.2. Since the equation is $\mathbf{U}(1)$-invariant, the set $\mathbf{S}$ is invariant under multiplication by $e^{i\theta}$, $\theta \in \mathbb{R}$.

The following proposition provides a concise description of all solitary waves.

Proposition 3.3. The set of all nonzero solitary wave solutions (3.4) of equation (3.1) consists of functions $\psi(x,t) = \phi_\omega(x)e^{-i\omega t}$ with
$$\phi_\omega(x) = Ce^{-\kappa(\omega)|x|}, \qquad \kappa(\omega) = \sqrt{m^2 - \omega^2}, \tag{3.6}$$
where $\omega \in [-m, m]$ and $C \in \mathbb{C}$ satisfies the following relation:
$$2\kappa(\omega) = q(|C|^2). \tag{3.7}$$

Remark 3.4. The values $\omega = \pm m$ can only correspond to the zero solution.

Remark 3.5. We can state the following necessary and sufficient condition for the existence of nonzero solitary waves: $\exists\, C \in \mathbb{C}\setminus\{0\}$ such that $0 < q(|C|^2) \le 2m$. The case $q(|C|^2) = 2m$ corresponds to the solitary wave with $\omega = 0$, which is a time-independent solution to (3.1) given by $\psi(x,t) = Ce^{-m|x|}$.

Proof. Substituting $\phi_\omega(x)e^{-i\omega t}$ into (3.1), we get the following eigenvalue problem:
$$-\omega^2\phi_\omega = \partial_x^2\phi_\omega - m^2\phi_\omega + \delta(x)q(|\phi_\omega|^2)\phi_\omega, \qquad x \in \mathbb{R}. \tag{3.8}$$
We can assume that $\phi_\omega(0) \ne 0$. Indeed, if $\phi_\omega(0) = 0$, then (3.8) turns into a homogeneous second-order linear differential equation, which together with the inclusion $\phi_\omega \in H^1(\mathbb{R})$ results in $\phi_\omega(x) \equiv 0$. Equation (3.8) implies that away from the origin we have
$$\partial_x^2\phi_\omega = (m^2 - \omega^2)\phi_\omega, \qquad x \ne 0,$$
hence $\phi_\omega(x) = C_\pm e^{-\kappa_\pm|x|}$ for $\pm x > 0$, where $\kappa_\pm$ satisfy $\kappa_\pm^2 = m^2 - \omega^2$. Since we need $\phi_\omega \in H^1(\mathbb{R})$, it is imperative that $\kappa_\pm > 0$; we conclude that $|\omega| < m$ and that $\kappa_\pm = \sqrt{m^2 - \omega^2} > 0$. Moreover, since the function $\phi_\omega(x)$ is continuous, $C_- = C_+ = C \ne 0$ (since we are looking for nonzero solitary waves). We see that
$$\phi_\omega(x) = Ce^{-\kappa|x|}, \qquad C \ne 0, \qquad \kappa \equiv \sqrt{m^2 - \omega^2} > 0. \tag{3.9}$$
Equation (3.8) implies the following gluing condition at $x = 0$:
$$0 = \phi_\omega'(0+) - \phi_\omega'(0-) + q(|\phi_\omega(0)|^2)\phi_\omega(0).$$
This condition and (3.9) lead to the equation
$$2\kappa = q(|C|^2). \tag{3.10}$$
□
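The amplitude relation $2\kappa(\omega) = q(|C|^2)$ can be solved in closed form when $q$ has degree one. The following sketch assumes, purely for illustration, the polynomial $q(s) = q_0 + q_1 s$ with $q_1 < 0$ (the coefficient names and both function names are hypothetical); it returns $|C|^2$ and checks the gluing condition at $x = 0$ for the profile $Ce^{-\kappa|x|}$ with real $C > 0$.

```python
import math


def solitary_amplitude_sq(omega: float, m: float, q0: float, q1: float):
    """Return |C|^2 solving 2*kappa(omega) = q0 + q1*|C|^2, or None.

    Assumes the illustrative nonlinearity q(s) = q0 + q1*s with q1 < 0.
    None means there is no nonzero solitary wave at this frequency.
    """
    if abs(omega) >= m:
        return None
    kappa = math.sqrt(m * m - omega * omega)
    c_sq = (2.0 * kappa - q0) / q1
    return c_sq if c_sq > 0 else None


def gluing_residual(omega: float, m: float, q0: float, q1: float) -> float:
    """Residual of 0 = phi'(0+) - phi'(0-) + q(|phi(0)|^2) phi(0)."""
    c_sq = solitary_amplitude_sq(omega, m, q0, q1)
    if c_sq is None:
        raise ValueError("no nonzero solitary wave for these parameters")
    c = math.sqrt(c_sq)
    kappa = math.sqrt(m * m - omega * omega)
    # For phi(x) = c*exp(-kappa*|x|): phi'(0+) = -kappa*c, phi'(0-) = +kappa*c.
    return (-kappa * c) - (kappa * c) + (q0 + q1 * c_sq) * c


if __name__ == "__main__":
    print(solitary_amplitude_sq(0.6, 1.0, 3.0, -1.0))
    print(gluing_residual(0.6, 1.0, 3.0, -1.0))
```

With $m = 1$, $q_0 = 3$, $q_1 = -1$ a nonzero solitary wave exists for every $|\omega| < 1$, because $q_0 > 2m \ge 2\kappa(\omega)$; for $q_0 = 1$ it exists only when $2\kappa(\omega) < 1$.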

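Hamiltonian structure. We set $\Psi(t) = (\psi(x,t), \pi(x,t)) \in \mathbb{C}^2$ and rewrite equation (3.1) in the vector form:
$$\partial_t\Psi(t) = \begin{bmatrix} 0 & 1 \\ \Delta - m^2 & 0 \end{bmatrix}\Psi(t) + \delta(x)\begin{bmatrix} 0 \\ q(|\psi(0,t)|^2)\psi(0,t) \end{bmatrix}, \tag{3.11}$$
where $x \in \mathbb{R}$ and $t \in \mathbb{R}$. We write
$$U(\psi) = -\frac{1}{2}\int_0^{|\psi|^2} q(s)\,ds;$$
then $q(|\psi|^2)\psi = -\nabla U(\psi)$, where the gradient is taken with respect to $(\operatorname{Re}\psi, \operatorname{Im}\psi)$:
$$\nabla U(\psi) = \partial_u U + i\partial_v U, \qquad \psi = u + iv, \qquad u, v \in \mathbb{R}.$$
Then equation (3.11) can formally be written as a Hamiltonian system,
$$\partial_t\Psi(t) = J\,\mathcal{H}'(\Psi), \qquad J = \begin{bmatrix} 0 & 1 \\ -1 & 0 \end{bmatrix}, \tag{3.12}$$
where $\mathcal{H}'$ is the variational derivative of the Hamilton functional
$$\mathcal{H}(\Psi) = \frac{1}{2}\int_{\mathbb{R}} \big(|\pi|^2 + |\psi'|^2 + m^2|\psi|^2\big)\,dx + U(\psi(0)), \qquad \Psi = \begin{bmatrix} \psi \\ \pi \end{bmatrix}, \tag{3.13}$$
taken with respect to $(\operatorname{Re}\psi, \operatorname{Im}\psi)$ and $(\operatorname{Re}\pi, \operatorname{Im}\pi)$.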

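Equation (3.11) is formally a Hamiltonian system with the phase space $\mathcal{X}$ from Definition 3.6 (1) (with $n = 1$) and the Hamilton functional $\mathcal{H}$. Both $\mathcal{H}$ and $\mathcal{Q}$ are continuous functionals on $\mathcal{X}$.

Charge conservation. Since (3.1) is $\mathbf{U}(1)$-invariant, the Noether theorem formally implies that the charge functional
$$\mathcal{Q}(\psi, \pi) = \frac{i}{2}\int_{\mathbb{R}} \big(\bar\psi\pi - \bar\pi\psi\big)\,dx \tag{3.14}$$
is (formally) conserved for solutions $\Psi(t) = \begin{bmatrix} \psi(x,t) \\ \pi(x,t) \end{bmatrix}$ to (3.1).

The energy space. Denote by $\|\cdot\|_{L^2}$ the norm in $L^2(\mathbb{R}^n)$. Let $H^s(\mathbb{R}^n)$, $s \in \mathbb{R}$, be the Sobolev space with the norm
$$\|\psi\|_{H^s} = \|(m^2 - \Delta)^{s/2}\psi\|_{L^2}. \tag{3.15}$$
For $s \in \mathbb{R}$ and $R > 0$, denote by $H_0^s(\mathbb{B}_R^n)$ the space of distributions from $H^s(\mathbb{R}^n)$ supported in $\mathbb{B}_R^n$ (the ball of radius $R$ in $\mathbb{R}^n$). We denote by $\|\cdot\|_{H^s,R}$ the norm in the space $H^s(\mathbb{B}_R^n)$ which is defined as the dual to $H_0^{-s}(\mathbb{B}_R^n)$.

Definition 3.6. Let $n \ge 1$.
(1) $\mathcal{X} = H^1(\mathbb{R}^n) \times L^2(\mathbb{R}^n)$ is the Hilbert space of states $\Psi = (\psi, \pi)$, with the norm
$$\|\Psi\|_{\mathcal{X}}^2 = \|\pi\|_{L^2}^2 + \|\nabla\psi\|_{L^2}^2 + m^2\|\psi\|_{L^2}^2 = \|\pi\|_{L^2}^2 + \|\psi\|_{H^1}^2.$$
(2) For $\varepsilon \ge 0$, introduce the Banach spaces $\mathcal{X}^{-\varepsilon} = H^{1-\varepsilon}(\mathbb{R}^n) \times H^{-\varepsilon}(\mathbb{R}^n)$ with the norm
$$\|\Psi\|_{\mathcal{X}^{-\varepsilon}}^2 = \|(m^2 - \Delta)^{-\varepsilon/2}\Psi\|_{\mathcal{X}}^2 = \|\pi\|_{H^{-\varepsilon}}^2 + \|\psi\|_{H^{1-\varepsilon}}^2.$$
(3) Define the seminorms
$$\|\Psi\|_{\mathcal{X}^{-\varepsilon},R}^2 = \|\pi\|_{H^{-\varepsilon},R}^2 + \|\psi\|_{H^{1-\varepsilon},R}^2, \qquad R > 0,$$
and denote by $\mathcal{Y}^{-\varepsilon}$ the Banach space with the norm
$$\|\Psi\|_{\mathcal{Y}^{-\varepsilon}} = \sum_{R=1}^{\infty} 2^{-R}\|\Psi\|_{\mathcal{X}^{-\varepsilon},R} < \infty. \tag{3.16}$$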

3.2. Main result

Assume that the polynomial $q(\cdot)$ in (3.1) satisfies the following conditions:
$$q(s) = \sum_{j=0}^{p} q_j s^j, \qquad p \ge 1; \qquad q_j \in \mathbb{R}, \qquad q_p < 0. \tag{3.17}$$

Theorem 3.7 (Global attraction for Klein–Gordon equation with an oscillator). Assume that $q(s)$ satisfies (3.17). For any $(\psi_0, \pi_0) \in \mathcal{X}$, the solution $\psi(t)$ to (3.1) with $(\psi, \partial_t\psi)|_{t=0} = (\psi_0, \pi_0)$ converges to the solitary manifold $\mathbf{S}$ in the space $\mathcal{Y}^{-\varepsilon}$, for any $\varepsilon > 0$:
$$\lim_{t\to\pm\infty} \operatorname{dist}_{\mathcal{Y}^{-\varepsilon}}\big((\psi, \partial_t\psi)|_t, \mathbf{S}\big) = 0, \tag{3.18}$$
where $\mathbf{S}$ is introduced in (3.5) and $\operatorname{dist}_{\mathcal{Y}^{-\varepsilon}}(\Psi, \mathbf{S}) := \inf_{s\in\mathbf{S}} \|\Psi - s\|_{\mathcal{Y}^{-\varepsilon}}$, with $\|\cdot\|_{\mathcal{Y}^{-\varepsilon}}$ introduced in (3.16).

Remark 3.8.
(1) In (3.17), the assumption that $q_p < 0$ is needed for the global well-posedness of (3.1) in the energy space $H^1 \times L^2$ (cf. Theorem 3.9 below).
(2) By (3.17), the nonlinearity is of polynomial character and is strictly nonlinear. This condition is crucial in our argument: it will allow us to apply the Titchmarsh convolution theorem.
(3) It suffices to prove Theorem 3.7 for $t \to +\infty$.
(4) For real initial data, we obtain a real-valued solution $\psi(t)$ to (3.1). Therefore, the convergence (3.18) of $(\psi(t), \partial_t\psi(t))$ to the set of pairs $(\phi_\omega, -i\omega\phi_\omega)$ with $\omega \in \mathbb{R}$ implies that $\psi(t)$ locally converges to zero or a static solution.
(5) As a matter of fact, the convergence (3.18) also holds in the local energy seminorms. The proof based on the technique of quasimeasures is presented in [KK07].

We will give the proof of the global attraction to solitary waves for equation (3.1). We present the argument from [Kom03] and [KK07], slightly shortened since we prove the convergence to the attractor in the $\mathcal{Y}^{-\varepsilon}$-norm (as opposed to convergence in the local energy seminorms proved in [Kom03] and [KK07]).

3.3. Global well-posedness

The global well-posedness of (3.1) in the energy space is proved in [KK07]:

Theorem 3.9. Assume that $q(s)$ satisfies (3.17). Then:
(1) For every $(\psi_0, \pi_0) \in \mathcal{X}$, the Cauchy problem
$$\partial_t^2\psi = \partial_x^2\psi - m^2\psi + \delta(x)q(|\psi|^2)\psi, \quad x \in \mathbb{R}, \qquad (\psi, \partial_t\psi)|_{t=0} = (\psi_0, \pi_0),$$
where $m > 0$, has a unique solution $\psi(t)$, $t \in \mathbb{R}$, such that $(\psi, \partial_t\psi) \in C(\mathbb{R}, \mathcal{X})$.
(2) The map $W(t) : (\psi_0, \pi_0) \mapsto (\psi(t), \partial_t\psi(t))$ is continuous in $\mathcal{X}$ for each $t \in \mathbb{R}$.
(3) The values of the energy and charge functionals are conserved along the trajectory:
$$\mathcal{H}(\psi(t), \partial_t\psi(t)) = \text{const}, \qquad \mathcal{Q}(\psi(t), \partial_t\psi(t)) = \text{const}, \qquad t \in \mathbb{R}.$$
(4) The following a priori bound holds:
$$\|(\psi(t), \partial_t\psi(t))\|_{\mathcal{X}} \le C(\psi_0, \pi_0), \qquad t \in \mathbb{R}. \tag{3.19}$$

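(5) For any $0 \le \varepsilon < 1/2$, $\Lambda \in \mathbb{R}$, and $T > 0$, the map
$$W(t) : \mathcal{X}_\Lambda \to \mathcal{X}_\Lambda, \qquad W(t) : (\psi_0, \pi_0) \mapsto (\psi(t), \partial_t\psi(t)),$$
is continuous in the topology of $\mathcal{Y}^{-\varepsilon}$, uniformly in $t \in [-T, T]$. Above, $\mathcal{X}_\Lambda$ is defined by
$$\mathcal{X}_\Lambda = \{\Psi \in \mathcal{X} : \mathcal{H}(\Psi) \le \Lambda\}. \tag{3.20}$$

Remark 3.10. In Theorem 3.9 (5), we need $\varepsilon < 1/2$ so that $H^{1-\varepsilon}(\mathbb{R}) \subset C(\mathbb{R})$.

3.4. Omega-limit trajectories

Pick the initial data
$$(\psi_0, \pi_0) \in H^1(\mathbb{R}) \times L^2(\mathbb{R}). \tag{3.21}$$
According to Theorem 3.9 (1) there exists a global solution to (3.1), which we denote $\psi(x,t)$, with the initial data
$$(\psi, \partial_t\psi)|_{t=0} = (\psi_0, \pi_0). \tag{3.22}$$
By Theorem 3.9 (4), one has
$$(\psi, \partial_t\psi) \in C_b(\mathbb{R}, \mathcal{X}). \tag{3.23}$$

Lemma 3.11. For any $\varepsilon > 0$, the embedding $\mathcal{X} \subset \mathcal{Y}^{-\varepsilon}$ is compact.

Proof. Let $\Psi_j \in \mathcal{X}$, $j \in \mathbb{N}$, be a sequence such that
$$\|\Psi_j\|_{\mathcal{X}} \le C < \infty, \qquad j \in \mathbb{N}. \tag{3.24}$$
It suffices to specify a Cauchy subsequence in $\Psi_j$ considered in the space $\mathcal{Y}^{-\varepsilon}$. Since $\mathcal{X}$ is a Hilbert space, we can choose a subsequence of $\Psi_j$ which is weakly convergent in $\mathcal{X}$ to some $\Psi_0 \in \mathcal{X}$. Since for any $s > s'$ and $R > 0$ the inclusion $H_0^s(\mathbb{B}_R^n) \subset H^{s'}(\mathbb{R}^n)$ is compact (with $\mathbb{B}_R^n$ being a ball of radius $R$ in $\mathbb{R}^n$), we can choose a smaller subsequence of $\Psi_j$ which converges in the metric $\|\cdot\|_{\mathcal{X}^{-\varepsilon},R}$. By the diagonalization process, we can choose a yet smaller subsequence of $\Psi_j$, which we denote $\Psi_{j_r}$, $r \in \mathbb{N}$, which converges in the metric $\|\cdot\|_{\mathcal{X}^{-\varepsilon},R}$, for any $R > 0$.

Let us show that $\Psi_{j_r}$, $r \in \mathbb{N}$, is a Cauchy sequence in $\mathcal{Y}^{-\varepsilon}$. Pick $\delta > 0$. Choose $R_0 \in \mathbb{N}$ large enough so that $2^{-R_0}C < \delta/4$, where $C$ is from (3.24). Since $\Psi_{j_r}$ is convergent in $\|\cdot\|_{\mathcal{X}^{-\varepsilon},R}$ for any fixed $R > 0$, there is $r_0 \in \mathbb{N}$ such that $\|\Psi_{j_r} - \Psi_{j_{r'}}\|_{\mathcal{X}^{-\varepsilon},R_0} < \delta/2$ for all $r, r' > r_0$. Then, for all $r, r' > r_0$,
$$\|\Psi_{j_r} - \Psi_{j_{r'}}\|_{\mathcal{Y}^{-\varepsilon}} = \sum_{R=1}^{\infty} 2^{-R}\|\Psi_{j_r} - \Psi_{j_{r'}}\|_{\mathcal{X}^{-\varepsilon},R} \le \sum_{R=1}^{R_0} 2^{-R}\|\Psi_{j_r} - \Psi_{j_{r'}}\|_{\mathcal{X}^{-\varepsilon},R} + \sum_{R=R_0+1}^{\infty} 2^{-R}\|\Psi_{j_r} - \Psi_{j_{r'}}\|_{\mathcal{X}}$$
$$\le \|\Psi_{j_r} - \Psi_{j_{r'}}\|_{\mathcal{X}^{-\varepsilon},R_0} + 2^{-R_0} \cdot 2C < \frac{\delta}{2} + \frac{\delta}{2} = \delta.$$
This finishes the proof. □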
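The splitting used in the last estimate (a finite head controlled by a single seminorm, and a geometric tail controlled by the global bound) can be mimicked with plain numbers. The sketch below is only an arithmetic illustration: the list `seminorms` stands for hypothetical values of $\|\Psi\|_{\mathcal{X}^{-\varepsilon},R}$, $R = 1, 2, \dots$, assumed nondecreasing in $R$ and bounded by `bound_c`; in the proof the same bound is applied to a difference of two elements, which is why $2C$ appears there.

```python
def weighted_norm(seminorms):
    """Partial sum  sum_R 2^{-R} * seminorms[R-1]  of the series (3.16)."""
    return sum(s * 2.0 ** (-radius) for radius, s in enumerate(seminorms, start=1))


def geometric_tail(r0: int, bound: float) -> float:
    """Value of  sum_{R > r0} 2^{-R} * bound  =  2^{-r0} * bound."""
    return bound * 2.0 ** (-r0)


# Hypothetical seminorm values: nondecreasing in R and bounded by bound_c.
bound_c = 1.0
seminorms = [min(bound_c, 0.1 * radius) for radius in range(1, 61)]

r0 = 5
head = weighted_norm(seminorms[:r0])       # terms R = 1, ..., r0
tail = weighted_norm(seminorms) - head     # terms R > r0 (truncated at 60)

if __name__ == "__main__":
    print("head:", head, "<=", seminorms[r0 - 1])
    print("tail:", tail, "<=", geometric_tail(r0, bound_c))
```

The head is bounded by the single seminorm at radius $R_0$ because the weights $2^{-R}$ sum to less than one, and the tail is bounded by $2^{-R_0}$ times the global bound.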

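According to Lemma 3.11, for any sequence $t_j \to +\infty$ there exists a subsequence $t_{j_r}$, $r \in \mathbb{N}$, such that
$$(\psi, \partial_t\psi)|_{t_{j_r}} \xrightarrow[r\to\infty]{\ \mathcal{Y}^{-\varepsilon}\ } (\zeta_0, \theta_0) \in H^1 \times L^2.$$
Let $\zeta$ be a solution to (3.1) with the initial data $(\zeta, \partial_t\zeta)|_{t=0} = (\zeta_0, \theta_0)$:
$$\partial_t^2\zeta = \partial_x^2\zeta - m^2\zeta + \delta(x)q(|\zeta|^2)\zeta, \qquad \zeta(x,t) \in \mathbb{C}, \qquad x \in \mathbb{R}, \qquad t \in \mathbb{R}, \tag{3.25}$$
which is understood in the sense of distributions. Due to Theorem 3.9 (4), there is the bound
$$\sup_{t\in\mathbb{R}} \|(\zeta, \partial_t\zeta)|_t\|_{\mathcal{X}} < \infty. \tag{3.26}$$
Denote by $S_\tau$ the time shift operator defined on $C(\mathbb{R}, \mathscr{S}')$:
$$S_\tau\psi(t) = \psi(\tau + t), \qquad t \in \mathbb{R}. \tag{3.27}$$
We now fix $\varepsilon \in (0, 1/2)$. By the continuous dependence on the initial data (Theorem 3.9 (5)), it follows that, for any $T > 0$,
$$S_{t_{j_r}}(\psi, \partial_t\psi) \xrightarrow[r\to\infty]{\ C([-T,T],\,\mathcal{Y}^{-\varepsilon})\ } (\zeta, \partial_t\zeta). \tag{3.28}$$
Recall that the space $\mathcal{Y}^{-\varepsilon}$ is introduced in Definition 3.6 (3).

To conclude the proof of Theorem 3.7, it suffices to check that every omega-limit trajectory $\zeta(x,t)$ belongs to the set of solitary waves.

3.5. Local energy decay

Let $\chi$ be the solution to the linear Klein–Gordon equation with the initial data (3.21):
$$\partial_t^2\chi = \partial_x^2\chi - m^2\chi, \qquad (\chi, \partial_t\chi)|_{t=0} = (\psi_0, \pi_0). \tag{3.29}$$

Proposition 3.12 (Local energy decay). For any $n \in \mathbb{N}$ and $m > 0$, if $\chi$ solves
$$\partial_t^2\chi = \Delta\chi - m^2\chi, \qquad x \in \mathbb{R}^n, \qquad (\chi, \partial_t\chi)|_{t=0} = (\chi_0, \pi_0) \in H^1(\mathbb{R}^n) \times L^2(\mathbb{R}^n),$$
then, for any $\rho \in \mathscr{S}(\mathbb{R}^n)$,
$$\lim_{t\to\infty} \big(\|\rho(\cdot)\chi(\cdot,t)\|_{H^1} + \|\rho(\cdot)\partial_t\chi(\cdot,t)\|_{L^2}\big) = 0.$$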

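Proof. For the Fourier transform of $\chi(x,t)$ in $x$, we have:
$$\hat\chi(\xi,t) = \hat\chi_0(\xi)\cos(\omega(\xi)t) + \hat\pi_0(\xi)\frac{\sin(\omega(\xi)t)}{\omega(\xi)},$$
where $\omega(\xi) = \sqrt{m^2 + \xi^2}$. We will only prove that $\lim_{t\to\infty}\|\rho(\cdot)\chi(\cdot,t)\|_{H^1} = 0$; the limit $\lim_{t\to\infty}\|\rho(\cdot)\partial_t\chi(\cdot,t)\|_{L^2} = 0$ is computed similarly. Pick $\varepsilon > 0$. We split the initial data $\chi_0$ and $\pi_0$ into $\chi_0 = u_1 + u_2$, $\pi_0 = v_1 + v_2$, so that
$$\|u_1\|_{H^1} + \|v_1\|_{L^2} < \varepsilon/2 \tag{3.30}$$
and
$$\hat u_2, \hat v_2 \in \mathscr{S}(\mathbb{R}^n), \qquad \operatorname{supp}\hat u_2 \cup \operatorname{supp}\hat v_2 \subset \{\xi \in \mathbb{R}^n : |\xi| \ge \lambda\}, \tag{3.31}$$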

for some $\lambda > 0$. Let $\chi_1$ and $\chi_2$ be the solutions to the linear Klein–Gordon equation with the initial data $(\chi_1, \partial_t\chi_1)|_{t=0} = (u_1, v_1)$, $(\chi_2, \partial_t\chi_2)|_{t=0} = (u_2, v_2)$. Due to (3.30) and the energy conservation, $\|\chi_1(t)\|_{H^1} \le \varepsilon/2$ for $t \in \mathbb{R}$. It suffices to show that
$$\lim_{t\to\infty}\|\rho(\cdot)\chi_2(\cdot,t)\|_{H^1} = 0. \tag{3.32}$$
We have:
$$\|\rho\chi_2(\cdot,t)\|_{L^2}^2 \le \|\rho\|_{L^2}\,\|\chi_2(\cdot,t)\|_{L^2}\,\|\rho\chi_2(\cdot,t)\|_{L^\infty}. \tag{3.33}$$
The first two factors in the right-hand side of (3.33) are bounded uniformly in time. For the last factor in the right-hand side of (3.33), we have:
$$\|\rho(\cdot)\chi_2(\cdot,t)\|_{L^\infty} \le \left\|\hat\rho * \Big(\hat u_2(\cdot)\cos(\omega(\cdot)t) + \hat v_2(\cdot)\frac{\sin(\omega(\cdot)t)}{\omega(\cdot)}\Big)\right\|_{L^1}. \tag{3.34}$$

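Lemma 3.13. Let $f, g \in \mathscr{S}(\mathbb{R}^n)$, and $0 \notin \operatorname{supp} g$. Then, for any $N \in \mathbb{N}$, there is $C_N > 0$ so that
$$\big\|f * \big(g(\cdot)e^{i\omega(\cdot)t}\big)\big\|_{L^1} \le C_N(1 + |t|)^{-N}, \qquad t \in \mathbb{R}.$$

Proof. First of all, one has
$$\big\|f * \big(g(\cdot)e^{i\omega(\cdot)t}\big)\big\|_{L^1} = \int\left|\int f(\xi - \eta)g(\eta)e^{i\omega(\eta)t}\,d\eta\right|d\xi \le \|f\|_{L^1}\|g\|_{L^1}. \tag{3.35}$$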

Then, since $0 \notin \operatorname{supp} g$, $|\nabla_\eta\omega(\eta)|$ is bounded away from zero on the support of $g$. Therefore, for $|t| \ge 1$, the expression
$$\int\left|\int f(\xi - \eta)g(\eta)e^{i\omega(\eta)t}\,d\eta\right|d\xi \tag{3.36}$$
decays faster than any negative power of $t$ due to the stationary phase method. Namely, one can place the operator $L = \frac{1}{i|\nabla\omega(\eta)|^2 t}\nabla_\eta\omega \cdot \nabla_\eta$ in front of the exponential factor $e^{i\omega(\eta)t}$ under the inner integral in (3.36), and then integrate by parts in $\eta$. This gives a factor of $t^{-1}$. The procedure could be repeated arbitrarily many times $N \ge 1$, leading to $\|f * (g(\cdot)e^{i\omega(\cdot)t})\|_{L^1} \le C_N t^{-N}$, with some $C_N < \infty$. Together with (3.35), this concludes the proof of the lemma. □

From (3.34), applying Lemma 3.13 to the right-hand side, we conclude that $\lim_{t\to\infty}\|\rho\chi_2(\cdot,t)\|_{L^\infty} = 0$. This, together with (3.33), yields
$$\lim_{t\to\infty}\|\rho\chi_2\|_{L^2}^2 = 0. \tag{3.37}$$
Similarly, one proves that
$$\lim_{t\to\infty}\|\nabla_x(\rho\chi_2(\cdot,t))\|_{L^2}^2 = 0. \tag{3.38}$$
Each of the terms in the right-hand side of (3.33) could accommodate a derivative in $x$: $\|\nabla\rho\|_{L^2}$ is bounded, $\|\nabla\chi(\cdot,t)\|_{L^2}$ is bounded uniformly in time, while $\|\nabla(\rho\chi_2(\cdot,t))\|_{L^\infty}$ is bounded by the expression similar to the right-hand side of (3.36), which is dealt with by Lemma 3.13.
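The decay mechanism of Lemma 3.13 can be observed numerically in one dimension. The sketch below is an illustration under stated assumptions: $m = 1$, the amplitude $g$ is a smooth bump supported in $[1, 3]$ (so that $0 \notin \operatorname{supp} g$), the convolution with $f$ is dropped, and the integral $\int g(\eta)e^{i\omega(\eta)t}\,d\eta$ is approximated by the trapezoid rule; the function names are hypothetical.

```python
import cmath
import math


def bump(eta: float) -> float:
    """Smooth bump supported in [1, 3]; it vanishes near eta = 0."""
    u = eta - 2.0
    if abs(u) >= 1.0:
        return 0.0
    return math.exp(-1.0 / (1.0 - u * u))


def oscillatory_integral(t: float, m: float = 1.0, n: int = 4000) -> complex:
    """Trapezoid approximation of  int_1^3 g(eta) exp(i*omega(eta)*t) d eta,
    with the dispersion relation omega(eta) = sqrt(m^2 + eta^2)."""
    a, b = 1.0, 3.0
    h = (b - a) / n
    total = 0j
    for j in range(n + 1):
        eta = a + j * h
        weight = 0.5 if j in (0, n) else 1.0
        phase = math.sqrt(m * m + eta * eta) * t
        total += weight * bump(eta) * cmath.exp(1j * phase)
    return total * h


if __name__ == "__main__":
    for time in (0.0, 10.0, 40.0):
        print(time, abs(oscillatory_integral(time)))
```

Since $\omega'(\eta) = \eta/\sqrt{m^2 + \eta^2}$ stays away from zero on the support of the bump, one integration by parts already gives a bound of order $1/t$, and the printed absolute values shrink accordingly as `time` grows.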

Using (3.37) and (3.38), we obtain:
$$\lim_{t\to\infty}\|\rho(\cdot)\chi_2(\cdot,t)\|_{H^1} = 0.$$
As we mentioned before, the convergence $\lim_{t\to\infty}\|\rho(\cdot)\partial_t\chi_2(\cdot,t)\|_{L^2} = 0$ is proved similarly. This finishes the proof of Proposition 3.12. □

Proposition 3.12 yields the decay of the norm of $(\chi, \partial_t\chi)$ in the space $\mathcal{Y}^{-\varepsilon}$ introduced in Definition 3.6 (3):

Lemma 3.14 (Local energy decay of the dispersive component). There is a local energy decay for $\chi$:
$$\lim_{t\to\infty}\|(\chi, \partial_t\chi)|_t\|_{\mathcal{Y}^{-\varepsilon}} = 0. \tag{3.39}$$

Remark 3.15. Lemma 3.14 means that the dispersive component $\chi$ does not give any contribution to the omega-limit trajectories.

3.6. Absolute continuity for large frequencies

Define
$$\varphi(x,t) = \begin{cases} 0, & t < 0, \\ \psi(x,t) - \chi(x,t), & t \ge 0, \end{cases} \tag{3.40}$$

with $\psi(x,t)$ the solution to (3.1) with the initial data (3.22), and with $\chi(x,t)$ defined in (3.29). Then $\varphi(x,t)$ solves the following Cauchy problem:
$$\partial_t^2\varphi = \partial_x^2\varphi - m^2\varphi + \delta(x)f(t), \qquad (\varphi, \partial_t\varphi)|_{t\le 0} = (0, 0), \tag{3.41}$$
where
$$f(t) := \Theta(t)\,q(|\psi(0,t)|^2)\psi(0,t), \qquad t \in \mathbb{R}, \tag{3.42}$$
with $\Theta(t)$ the Heaviside step function. Recall that $(\psi, \partial_t\psi) \in C_b(\mathbb{R}, \mathcal{X})$ by (3.23). On the other hand, since $\chi(x,t)$ is a finite energy solution to the free Klein–Gordon equation, we also have $(\chi, \partial_t\chi) \in C_b(\mathbb{R}, \mathcal{X})$. It follows that $\varphi(x,t)$ defined by (3.40) is finite in the energy norm:
$$(\varphi, \partial_t\varphi) \in C_b(\mathbb{R}, \mathcal{X}). \tag{3.43}$$

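Let $k(\omega)$ be the analytic function with the domain $D := \mathbb{C}\setminus((-\infty, -m] \cup [m, +\infty))$ such that
$$k(\omega) = \sqrt{\omega^2 - m^2}, \qquad \operatorname{Im} k(\omega) > 0, \qquad \omega \in D. \tag{3.44}$$
Let us also denote its limit values at the real axis by
$$k_\pm(\omega) := k(\omega \pm i0), \qquad \omega \in \mathbb{R}. \tag{3.45}$$
As illustrated on Figure 4 (where all square roots take positive values), we have:
$$\begin{aligned} k_-(\omega) &= k_+(\omega) && \text{for } -m \le \omega \le m, \\ k_-(\omega) &= -k_+(\omega) && \text{for } \omega \in \mathbb{R}\setminus(-m, m), \\ \omega k_+(\omega) &> 0 && \text{for } \omega \in \mathbb{R}\setminus[-m, m]. \end{aligned} \tag{3.46}$$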
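The branch of the square root fixed by the condition $\operatorname{Im} k(\omega) > 0$ and the relations between its boundary values can be checked numerically. In the sketch below the limits $k(\omega \pm i0)$ are approximated by evaluating at $\omega \pm i\epsilon$ with a small `eps`; this approximation, the default $m = 1$, and the function names are assumptions made for illustration.

```python
import cmath


def k_branch(z: complex, m: float = 1.0) -> complex:
    """Square root of z^2 - m^2 chosen with nonnegative imaginary part."""
    k = cmath.sqrt(z * z - m * m)
    return -k if k.imag < 0 else k


def k_plus(omega: float, m: float = 1.0, eps: float = 1e-9) -> complex:
    """Approximation of k(omega + i0), evaluated at omega + i*eps."""
    return k_branch(complex(omega, eps), m)


def k_minus(omega: float, m: float = 1.0, eps: float = 1e-9) -> complex:
    """Approximation of k(omega - i0), evaluated at omega - i*eps."""
    return k_branch(complex(omega, -eps), m)


if __name__ == "__main__":
    for omega in (-2.0, 0.5, 2.0):
        print(omega, k_plus(omega), k_minus(omega))
```

Inside the gap $|\omega| < m$ both boundary values agree and are purely imaginary, while outside the gap they differ by a sign and $\omega k_+(\omega)$ is positive.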

Figure 4. The boundary values $k_\pm(\omega) := k(\omega \pm i0)$, $\omega \in \mathbb{R}$.

Let us consider the Fourier transform of $\varphi$ defined in (3.40):
$$\tilde\varphi(x,\omega) = \mathcal{F}_{t\to\omega}[\varphi(x,t)] = \int_0^\infty e^{i\omega t}\varphi(x,t)\,dt, \qquad (x,\omega) \in \mathbb{R}^2. \tag{3.47}$$
This is a continuous function of $x \in \mathbb{R}$ with values in tempered distributions of $\omega \in \mathbb{R}$, which satisfies the following equation (cf. (3.41)):
$$-\omega^2\tilde\varphi(x,\omega) = \partial_x^2\tilde\varphi(x,\omega) - m^2\tilde\varphi(x,\omega) + \delta(x)\tilde f(\omega), \qquad (x,\omega) \in \mathbb{R}^2, \tag{3.48}$$
where
$$\tilde f(\omega) = \mathcal{F}_{t\to\omega}[f(t)](\omega) = \int_0^\infty e^{i\omega t}f(t)\,dt, \qquad \omega \in \mathbb{R}. \tag{3.49}$$

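Proposition 3.16 (Spectral representation). There is the following relation:
$$\tilde\varphi(x,\omega) = -\frac{e^{ik_+(\omega)|x|}}{2ik_+(\omega)}\tilde f(\omega), \qquad x \in \mathbb{R}, \qquad \omega \in \mathbb{R}\setminus\{\pm m\}. \tag{3.50}$$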
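The representation (3.50) can be sanity-checked against (3.48): away from $x = 0$ the function $e^{ik|x|}$ solves $\partial_x^2 u + k^2 u = 0$, and the delta term in (3.48) forces the jump $\partial_x\tilde\varphi(0+) - \partial_x\tilde\varphi(0-) = -\tilde f$. The sketch below checks that jump with one-sided finite differences; the sample value of $k$, the step `h`, and the function names are assumptions chosen for illustration.

```python
import cmath


def phi_tilde(x: float, k: complex, f_tilde: complex = 1.0) -> complex:
    """Right-hand side of (3.50) for a given value k with Im k > 0."""
    return -cmath.exp(1j * k * abs(x)) / (2j * k) * f_tilde


def derivative_jump(k: complex, f_tilde: complex = 1.0, h: float = 1e-6) -> complex:
    """Finite-difference estimate of d/dx phi_tilde at 0+ minus that at 0-."""
    right_slope = (phi_tilde(2 * h, k, f_tilde) - phi_tilde(h, k, f_tilde)) / h
    left_slope = (phi_tilde(-h, k, f_tilde) - phi_tilde(-2 * h, k, f_tilde)) / h
    return right_slope - left_slope


if __name__ == "__main__":
    sample_k = 0.3 + 0.8j  # any value in the upper half-plane
    print(derivative_jump(sample_k))  # close to -f_tilde
```

The jump does not depend on $k$, which is why the factor $1/(2ik_+(\omega))$ appears in (3.50).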

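Proof. According to (3.40), $\varphi|_{t\le 0} \equiv 0$, hence the formula (3.47) could be extended to $\omega \in \mathbb{C}^+ := \{z \in \mathbb{C} : \operatorname{Im} z > 0\}$, defining the complex Fourier transform of $\varphi(x,t)$:
$$\tilde\varphi(x,\omega) = \int_0^\infty e^{i\omega t}\varphi(x,t)\,dt, \qquad x \in \mathbb{R}, \qquad \operatorname{Im}\omega \ge 0. \tag{3.51}$$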

Similarly, since f |t 0. (3.60) ϕ(·, ˜ ω + i ) 2L2 dω = 2π e−2t ϕ(·, t) 2L2 dt ≤ R

0

On the other hand, we can calculate the term in the left-hand side of (3.60) exactly. According to (3.54), ϕ(x, ˜ ω + i ) = −

eik(ω+i)|x| ˜ f (ω + i ), 2ik(ω + i )

hence (3.60) results in ) eik(ω+i)|x| 2L2 ˜ |f (ω + i )|2 dω ≤ const, |k(ω + i )|2 R

> 0.

(3.61)

Here is a crucial observation about the norm of eik(ω+i)|x| . Lemma 3.21. (1) For ω ∈ R\(−m, m), lim

→0+

eik(ω+i)|x| 2L2 1 . = 2 |ωk(ω + i )| ωk+ (ω)

(3.62)

(2) For any δ > 0 there exists δ > 0 such that for ω ∈ R\[−m − δ, m + δ] and ∈ (0, δ ), eik(ω+i)|x| 2L2 1 . (3.63) ≥ 2 |ωk(ω + i )| 2ωk+ (ω) Remark 3.22. The asymptotic behavior of the L2 -norm of eik(ω+i) stated in the lemma is easy to understand: for ω ∈ R\[−m, m], this norm is finite for > 0 due to the small positive imaginary part of k(ω + i ), but it becomes unboundedly large when → 0+. Let us also mention that the expression in the left-hand side of (3.62) is easy to evaluate in the momentum space. Since * ik(ω+i)|x| + 1 e 1 = 2 , Fx→ξ = 2 2ωk(ω + i ) ξ + m2 − (ω + i )2 ξ − k12


where k1 = k(ω + i ) ∈ C+ , we have: ) ) eik(ω+i)|x| 2L2 dξ dξ 1 1 = = . 4|ωk(ω + i )|2 2π R |ξ 2 − k12 |2 2π R (ξ + k1 )(ξ − k1 )(ξ + k1 )(ξ − k1 ) Closing the contour of integration at ξ → +i∞ and using the Cauchy residue theorem (note that k1 ∈ C+ and −k1 ∈ C+ ), one gets:

eik(ω+i)|x| 2L2 1 i 1 = + . 2 4|ωk(ω + i )|2 k1 2(k12 − k1 ) k1 2

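The relation (3.62) follows after we note that $k_1^2 - \bar k_1^2 = (\omega + i\varepsilon)^2 - (\omega - i\varepsilon)^2 = 4i\omega\varepsilon$.

Substituting (3.63) into the left-hand side of (3.61), we get:
$$\int_{|\omega|\ge m+\delta} |\tilde f(\omega + i\varepsilon)|^2\,\frac{d\omega}{\omega k_+(\omega)} \le 2C, \qquad 0 < \varepsilon < \varepsilon_\delta, \tag{3.64}$$
with the same $C$ as in (3.61). We conclude that for each $\delta > 0$ the set of functions
$$g_{\delta,\varepsilon}(\omega) = \frac{\tilde f(\omega + i\varepsilon)}{|\omega k_+(\omega)|^{1/2}}, \qquad 0 < \varepsilon < \varepsilon_\delta,$$
defined for $\omega \in \Omega_\delta$, is bounded in $L^2(\mathbb{R}\setminus[-m-\delta, m+\delta])$, and hence is weakly compact. The convergence of the distributions (3.56) implies the following weak convergence in $L^2(\mathbb{R}\setminus[-m-\delta, m+\delta])$:
$$g_{\delta,\varepsilon} \rightharpoonup g_\delta, \qquad \varepsilon \to 0+,$$
where the limit function $g_\delta(\omega)$ coincides with the distribution $\tilde f(\omega)|\omega k_+(\omega)|^{-1/2}$ restricted onto $\mathbb{R}\setminus[-m-\delta, m+\delta]$. It remains to note that, by (3.64), the norms of all functions $g_\delta$, $\delta > 0$, are bounded in $L^2(\mathbb{R}\setminus[-m-\delta, m+\delta])$ by a constant independent of $\delta$, hence (3.59) follows. □

3.7. Spectral analysis of omega-limit trajectories

By Lemma 3.14, as $t \to \infty$, the dispersive component $\chi(\cdot,t)$ converges to zero in $\mathcal{Y}^{-\varepsilon}$ defined in (3.16), where we need $0 < \varepsilon < 1/2$. On the other hand, according to (3.28), the functions $\psi(x, t_{j_r} + t)$ converge to $\zeta(x,t)$ as $r \to \infty$, in the topology of $C([-T,T], \mathcal{Y}^{-\varepsilon})$, for any $T > 0$. Hence, the functions $\varphi(x, t_{j_r} + t) = \Theta(t_{j_r} + t)\big(\psi(x, t_{j_r} + t) - \chi(x, t_{j_r} + t)\big)$ also converge to $\zeta(x,t)$:
$$\varphi(x, t_{j_r} + t) \xrightarrow[r\to\infty]{\ C([-T,T],\,\mathcal{Y}^{-\varepsilon})\ } \zeta(x,t), \tag{3.65}$$
for any $T > 0$. For brevity, we write
$$\beta(t) := \zeta(0,t), \tag{3.66}$$
$$g(t) := q(|\zeta(0,t)|^2)\zeta(0,t). \tag{3.67}$$
By (3.25), the function $\tilde\zeta(x,\omega)$, which is the Fourier transform of $\zeta(x,t)$, satisfies the equation
$$-\omega^2\tilde\zeta(x,\omega) = \partial_x^2\tilde\zeta(x,\omega) - m^2\tilde\zeta(x,\omega) + \delta(x)\tilde g(\omega), \qquad (x,\omega) \in \mathbb{R}^2, \tag{3.68}$$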

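valid in the sense of tempered distributions of $(x,\omega) \in \mathbb{R}^2$. Above, $\tilde g(\omega)$ is the Fourier transform of $g(t)$. According to (3.26), $\tilde\zeta(x,\omega)$ is a continuous function of $x \in \mathbb{R}$ with values in tempered distributions of $\omega \in \mathbb{R}$.

Lemma 3.23. Let $u \in \mathscr{S}'(\mathbb{R})$ and $\{t_j : j \in \mathbb{N}\}$ be such that $\lim_{j\to\infty} t_j = \infty$. If
$$e^{i\omega t_j}u \xrightarrow{\ \mathscr{S}'\ } v \in \mathscr{S}'(\mathbb{R}) \tag{3.69}$$
and $u|_I \in L^1_{\mathrm{loc}}(I)$ for some open set $I \subset \mathbb{R}$, then $v|_I = 0$.

Proof. Pick any $\varrho \in C_0^\infty(\mathbb{R})$ with $\operatorname{supp}\varrho \subset I$. Then, due to the convergence (3.69), $\langle \varrho, e^{i\omega t_j}u \rangle \to \langle \varrho, v \rangle$. On the other hand, $\langle \varrho, e^{i\omega t_j}u \rangle = \mathcal{F}_{\omega\to t}[\varrho(\omega)u(\omega)](t_j) \to 0$, as the Fourier transform of the $L^1$-function $\varrho u$. It follows that $\langle \varrho, v \rangle = 0$. Since $\varrho$ is an arbitrary smooth function with support in $I$, we are done. □

Lemma 3.24 (Compactness of the spectrum). $\operatorname{supp}\tilde\beta \subset [-m, m]$.

Proof. By (3.65), for any $x \in \mathbb{R}$, we have:
$$\varphi(x, t_{j_r} + t) \xrightarrow{\ \mathscr{S}'\ } \zeta(x,t), \qquad t \in \mathbb{R}. \tag{3.70}$$
Since $\varphi(x, t_j + t) = \frac{1}{2\pi}\int_{\mathbb{R}} e^{-i\omega t}e^{-i\omega t_j}\tilde\varphi(x,\omega)\,d\omega$, the relation (3.70) implies that, for any $x \in \mathbb{R}$,
$$e^{-i\omega t_{j_r}}\tilde\varphi(x,\omega) \xrightarrow{\ \mathscr{S}'\ } \tilde\zeta(x,\omega), \qquad r \to \infty. \tag{3.71}$$
By Proposition 3.20, $\tilde\varphi(0,\omega)$ is locally $L^2$ for $\omega \in \mathbb{R}\setminus[-m, m]$. Therefore, the convergence (3.71) and Lemma 3.23 show that $\tilde\beta(\omega) := \tilde\zeta(0,\omega)$ vanishes for $\omega \in \mathbb{R}\setminus[-m, m]$. □

Lemma 3.25 (Spectral representation for $\tilde\zeta$). The distribution $\tilde\zeta(x,\omega)$ admits the following representation:
$$\tilde\zeta(x,\omega) = -\frac{e^{ik_+(\omega)|x|}}{2ik_+(\omega)}\tilde g(\omega), \qquad x \in \mathbb{R}, \qquad \omega \in \mathbb{R}\setminus\{\pm m\}. \tag{3.72}$$

Proof. Due to (3.28), we also have
$$f(t_{j_r} + t) := F(\psi(0, t_{j_r} + t)) \xrightarrow{\ \mathscr{S}'\ } F(\zeta(0,t)) =: g(t),$$
where $F(\psi) = q(|\psi|^2)\psi$ (cf. (3.2)); hence, due to the continuity of the Fourier transform in $\mathscr{S}'$,
$$e^{-i\omega t_{j_r}}\tilde f(\omega) \xrightarrow{\ \mathscr{S}'\ } \tilde g(\omega), \qquad \omega \in \mathbb{R}. \tag{3.73}$$
Now the statement of the lemma can be proved by starting with the relation (3.50) proved in Proposition 3.16 and applying the limits (3.71) and (3.73). When taking the limits, we use the fact that $k(\omega)$ is smooth for $\omega \in \mathbb{R}\setminus\{\pm m\}$ and hence the expression $\frac{e^{ik(\omega)|x|}}{2ik(\omega)}$, $\omega \in \mathbb{R}\setminus\{\pm m\}$, is a multiplier in $\mathscr{S}'$ away from $\omega = \pm m$. □

Lemma 3.26. The points $\omega = \pm m$ cannot be isolated points of the support of $\tilde g(\omega)$.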


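Proof. Let us assume that, on the contrary, $\omega_0=m$ or $-m$ is an isolated point of the support of $\tilde g$. Pick an open neighborhood $U$ of $\omega_0$ such that $U\cap\operatorname{supp}\tilde g=\{\omega_0\}$. Pick $\rho\in C_0^\infty(\mathbb{R})$ such that $\operatorname{supp}\rho\subset U$, $\rho(\omega_0)=1$. Then

$$\rho(\omega)\tilde g(\omega)=M\delta(\omega-\omega_0),\qquad M\in\mathbb{C}\setminus\{0\}, \tag{3.74}$$

where the derivatives of $\delta(\omega-\omega_0)$ do not appear since $\check\rho*g(t)$ is bounded. By (3.72), we have, for any $x\in\mathbb{R}$, $U\cap\operatorname{supp}\tilde\zeta(x,\cdot)\subset\{\omega_0\}$, hence

$$\rho(\omega)\tilde\zeta(x,\omega)=\delta(\omega-\omega_0)\,b(x),\qquad b\in H^1(\mathbb{R}). \tag{3.75}$$

Again, the terms with the derivatives of $\delta(\omega-\omega_0)$ are prohibited since $\langle\alpha,\check\rho*\zeta(\cdot,t)\rangle$ are bounded for any $\alpha\in C_0^\infty(\mathbb{R})$. The inclusion $b\in H^1(\mathbb{R})$ is due to $\tilde\zeta\in\mathcal{S}'(\mathbb{R},H^1(\mathbb{R}))$. Multiplying (3.68) by $\rho(\omega)$ and taking into account (3.74), (3.75), and the relation $\omega_0^2=m^2$, we see that the distribution $b(x)$ satisfies the equation

$$0=b''(x)+M\delta(x).$$

$M\ne0$ would lead to $b\notin H^1(\mathbb{R})$, contradicting the inclusion $\tilde\zeta\in\mathcal{S}'(\mathbb{R},H^1(\mathbb{R}))$. This contradiction shows that $\omega=\pm m$ cannot be isolated points of the support of $\tilde g$, finishing the proof. □

Lemma 3.27. $\operatorname{supp}\tilde g(\cdot)\subset\operatorname{supp}\tilde\beta$.

Proof. By Lemma 3.25, $\operatorname{supp}\tilde g(\cdot)\subset\operatorname{supp}\tilde\beta\cup\{\pm m\}$. Now the statement of the lemma follows from Lemma 3.26. □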

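Lemma 3.28 (Reduction to the point spectrum). Either $\operatorname{supp}\tilde\beta=\{\omega_+\}$ for some $\omega_+\in[-m,m]$ or $\tilde\beta=0$.

Proof. By (3.17), the Fourier transform $\tilde g(\omega)$ of $g(t):=F(\zeta(0,t))$ is given by

$$\tilde g=\sum_{j=0}^{p}q_j\,\underbrace{(\tilde\beta*\tilde{\bar\beta})*\cdots*(\tilde\beta*\tilde{\bar\beta})}_{j}*\tilde\beta. \tag{3.76}$$

Now we will use the Titchmarsh convolution theorem [Tit26] which could be stated as follows: For any $u,v\in\mathcal{E}'(\mathbb{R})$,

$$\sup\operatorname{supp}(u*v)=\sup\operatorname{supp}u+\sup\operatorname{supp}v.$$

Above, $\mathcal{E}'(\mathbb{R})$ is the space of compactly supported distributions. For more details and a proof, see the appendix. Applying the Titchmarsh convolution theorem to the convolutions in (3.76), we obtain the following inequality:

$$\sup\operatorname{supp}\tilde g\ge\sup\operatorname{supp}\tilde\beta+(p-1)\bigl(\sup\operatorname{supp}\tilde\beta-\inf\operatorname{supp}\tilde\beta\bigr). \tag{3.77}$$

We used the relation

$$\sup\operatorname{supp}\tilde{\bar\beta}=-\inf\operatorname{supp}\tilde\beta.$$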


We wrote “≥” in (3.77) because of possible cancellations in the summation in the right-hand side of (3.76). Note that the Titchmarsh convolution theorem is applicable to each summand in the right-hand side of (3.76) since by Lemma 3.24 the function $\tilde\beta$ is compactly supported ($\operatorname{supp}\tilde\beta\subset[-m,m]$). Comparing (3.77) with the statement of Lemma 3.27, we conclude that

$$(p-1)\bigl(\sup\operatorname{supp}\tilde\beta-\inf\operatorname{supp}\tilde\beta\bigr)=0. \tag{3.78}$$

Since $p\ge2$ by (3.17) (which means that the oscillator at $x=0$ is nonlinear), we conclude that $\operatorname{supp}\tilde\beta$ consists of at most a single point $\omega_+\in[-m,m]$. □

Lemma 3.29. $\zeta(x,t)$ is a solitary wave: $\zeta(x,t)=\phi(x)e^{-i\omega_+t}$, where $\omega_+\in(-m,m)$ and $\phi\in H^1(\mathbb{R})$ satisfies

$$-\omega_+^2\phi=\partial_x^2\phi-m^2\phi+\delta(x)F(\phi(0)),\qquad x\in\mathbb{R}. \tag{3.79}$$
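For orientation, the stationary equation (3.79) can be solved by hand away from $x=0$. The following sketch is not part of the original argument; it writes $\omega$ for the frequency appearing in (3.79), assumes $|\omega|<m$, and introduces the auxiliary symbols $\kappa$ and $C$, which do not appear in the text.

$$
\begin{aligned}
&\kappa:=\sqrt{m^2-\omega^2}>0,\qquad \phi(x)=Ce^{-\kappa|x|},\qquad C:=\phi(0),\\
&\partial_x^2\phi=\kappa^2\phi-2\kappa C\,\delta(x),\\
&\partial_x^2\phi-m^2\phi+\delta(x)F(\phi(0))=(\kappa^2-m^2)\phi+\bigl(F(C)-2\kappa C\bigr)\delta(x)=-\omega^2\phi+\bigl(F(C)-2\kappa C\bigr)\delta(x).
\end{aligned}
$$

Hence, under these assumptions, the profile $Ce^{-\kappa|x|}$ satisfies (3.79) exactly when the amplitude and the frequency are linked by $F(C)=2\kappa C$. Since for $x\ne0$ the equation reduces to $\phi''=\kappa^2\phi$, a solution in $H^1(\mathbb{R})$ is necessarily of this exponentially decaying form.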

Proof. By Lemma 3.28, $\operatorname{supp}\tilde\beta\subset\{\omega_+\}$, with $\omega_+\in[-m,m]$. Therefore,

$$\tilde\beta(\omega)=a_1\delta(\omega-\omega_+),\qquad\text{with some }a_1\in\mathbb{C}. \tag{3.80}$$

Note that the derivatives $\delta^{(k)}(\omega-\omega_+)$, $k\ge1$, do not enter the expression for $\tilde\beta(\omega)$ since $\beta(t)=\zeta(0,t)$ is a bounded continuous function of $t$ due to the bound (3.26). The relation (3.80), together with (3.76), yields that

$$\tilde g(\omega)=g_1\delta(\omega-\omega_+),\qquad\text{with some }g_1\in\mathbb{C}. \tag{3.81}$$

Now Lemma 3.25 implies that the omega-limit trajectory $\zeta(x,t)$ is a solitary wave: $\zeta(x,t)=\phi(x)e^{-i\omega_+t}$. Since $\tilde\zeta(x,\omega)$ solves (3.68), $\phi(x)$ satisfies (3.79). □

Remark 3.30. By Lemma 3.26, $\omega_+=\pm m$ could only correspond to the zero solution.

Lemma 3.29 completes the proof of (3.18). Thus, Theorem 3.7 is proved. □

Appendix: The Titchmarsh convolution theorem

A.1. Statement of the theorem

The Titchmarsh convolution theorem was originally formulated as follows [Tit26]:

If $\phi(t)$ and $\psi(t)$ are integrable functions, such that

$$\int_0^x\phi(t)\psi(x-t)\,dt=0$$

almost everywhere in the interval $0<x<\kappa$, then $\phi(t)=0$ almost everywhere in $(0,\lambda)$, and $\psi(t)=0$ almost everywhere in $(0,\mu)$, where $\lambda+\mu\ge\kappa$.


The Titchmarsh convolution theorem could be restated as the equality

$$\sup\operatorname{supp}\phi*\psi=\sup\operatorname{supp}\phi+\sup\operatorname{supp}\psi, \tag{A.1}$$

which is satisfied if the quantity in its right-hand side is finite. Above, $\phi*\psi$ is the convolution $\phi*\psi(x)=\int_{\mathbb{R}}\phi(x-t)\psi(t)\,dt$. An equality similar to (A.1) holds for $\inf\operatorname{supp}\phi*\psi$. These equalities imply that the obvious inclusion $\operatorname{supp}\phi*\psi\subseteq\operatorname{supp}\phi+\operatorname{supp}\psi$ is sharp at the boundary if both $\operatorname{supp}\phi$ and $\operatorname{supp}\psi$ are compact. The Titchmarsh convolution theorem was originally proved in [Tit26] for functions from $L^1$, but the statement is easily generalized to compactly supported distributions. The generalization of the Titchmarsh convolution theorem to higher dimensions can be stated in terms of the convex hulls of the supports [Lio51]:

Theorem A.1 (Titchmarsh convolution theorem). For $f,g\in\mathcal{E}'(\mathbb{R}^n)$,

$$\operatorname{c.h.}\operatorname{supp}f*g=\operatorname{c.h.}\operatorname{supp}f+\operatorname{c.h.}\operatorname{supp}g. \tag{A.2}$$

Above, $\mathcal{E}'(\mathbb{R})$ is the space of distributions with compact support (dual to the space $\mathcal{E}(\mathbb{R})$ which is $C^\infty(\mathbb{R})$ with the seminorms $\sup_\omega|f^{(k)}(\omega)|$). c.h. denotes the convex hull of the set. Let us also note that we use the following conventions:

$$\text{For }X,Y\subseteq\mathbb{R}^n,\qquad X+Y=\{x+y,\ x\in X,\ y\in Y\}; \tag{A.3}$$

$$\text{For }X\subseteq\mathbb{R}^n,\ k\in\mathbb{R},\qquad kX=\{kx,\ x\in X\}. \tag{A.4}$$
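A discrete analogue may help build intuition for (A.1). The sketch below is an illustration only and plays no role in the proof: it replaces compactly supported distributions by finitely supported integer sequences, for which the statement reduces to the fact that the extreme coefficients of a product of two polynomials cannot cancel. The helper names `convolve`, `sup_supp` and `inf_supp` are hypothetical.

```python
# Discrete illustration of sup supp(u * v) = sup supp u + sup supp v.
# Assumption of this sketch: a "distribution" is a finitely supported
# sequence, stored as a dict {integer index: nonzero coefficient}.


def convolve(u, v):
    """Return the convolution (u * v)[n] = sum over i + j = n of u[i] * v[j]."""
    out = {}
    for i, a in u.items():
        for j, b in v.items():
            out[i + j] = out.get(i + j, 0) + a * b
    # Keep only the indices that survive cancellation.
    return {n: c for n, c in out.items() if c != 0}


def sup_supp(u):
    """Largest index in the support."""
    return max(u)


def inf_supp(u):
    """Smallest index in the support."""
    return min(u)


u = {-2: 1, 0: -3, 3: 2}
v = {1: 1, 2: 1, 5: -1}
w = convolve(u, v)

print(sup_supp(w), sup_supp(u) + sup_supp(v))  # 8 8
print(inf_supp(w), inf_supp(u) + inf_supp(v))  # -1 -1

# Interior points may cancel, but the endpoints of the support never do:
print(convolve({0: 1, 1: 1}, {0: 1, 1: -1}))  # {0: 1, 2: -1}
```

In the last line the coefficient at index 1 cancels, so the support of the convolution is strictly smaller than the sum of the supports, while its convex hull is still the sum of the convex hulls, in the spirit of (A.2).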

Different proofs of the Titchmarsh convolution theorem are contained in [Hör90, Theorem 4.3.3] (Harmonic Analysis style), [Yos80, Chapter VI] (Real Analysis style), and [Lev96, Lecture 16, Theorem 5] (Complex Analysis style).

A.2. Elementary proof via Paley–Wiener theorem

We will give an elementary proof based on the Paley–Wiener theorem. We will consider the one-dimensional case only. The higher-dimensional case is proved in the same way, with the higher-dimensional version of the Paley–Wiener theorem and utilizing the concept of the supporting function as in [Hör90].

Titchmarsh convolution theorem for $f*f$. Let us first show how to prove the Titchmarsh convolution theorem for $f*f$ using the Paley–Wiener theorem (see [Yos80, Chapter VI] or [Hör90, Theorem 7.3.1]) which relates the size of the support of a distribution $f$ with the growth properties of its Fourier transform,

$$\hat\varphi(\zeta)=\mathcal{F}_{x\to\zeta}[\varphi](\zeta)=\int_{\mathbb{R}}e^{-i\zeta x}\varphi(x)\,dx. \tag{A.5}$$

Theorem A.2 (Paley–Wiener). (1) Let $\varphi\in\mathcal{S}(\mathbb{R})$. If $\operatorname{supp}\varphi\subset[-R,R]$, $R>0$, then $\hat\varphi(\zeta)$ is an entire function of $\zeta\in\mathbb{C}$ (analytic function in the whole space $\mathbb{C}$) and for any $N\in\mathbb{N}$ there is $C_N<\infty$ such that

$$|\hat\varphi(\zeta)|\le C_N\langle\zeta\rangle^{-N}e^{R|\operatorname{Im}\zeta|}. \tag{A.6}$$


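(2) Conversely, if $\hat\varphi\in\mathcal{S}(\mathbb{R})$ has a holomorphic extension to $\mathbb{C}$ (also denoted $\hat\varphi$) which satisfies (A.6) with some $R<\infty$, for any $N\in\mathbb{N}$, then $\varphi\in C^\infty(\mathbb{R})$, $\operatorname{supp}\varphi\subset B_R$.

Remark A.3. The Paley–Wiener theorem for distributions states that $\varphi\in\mathcal{E}'(\mathbb{R})$, $\operatorname{supp}\varphi\subseteq[-A,A]$ if and only if $\hat\varphi(\zeta)$ is an entire function and there exist $C>0$ and $m\in\mathbb{R}$ so that $\hat\varphi(\zeta)$ satisfies

$$|\hat\varphi(\zeta)|\le C(1+|\zeta|)^m e^{A|\operatorname{Im}\zeta|},\qquad\zeta\in\mathbb{C}.$$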
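As a simple sanity check of the estimate in Remark A.3 (a sketch with an example chosen here, not taken from the text), let $\varphi$ be the indicator function of $[-A,A]$, which is a compactly supported distribution with $\operatorname{supp}\varphi=[-A,A]$. With the convention (A.5),

$$
\hat\varphi(\zeta)=\int_{-A}^{A}e^{-i\zeta x}\,dx=\frac{2\sin(A\zeta)}{\zeta}=2A\int_0^1\cos(sA\zeta)\,ds,
\qquad
|\hat\varphi(\zeta)|\le 2A\sup_{0\le s\le1}|\cos(sA\zeta)|\le 2A\,e^{A|\operatorname{Im}\zeta|}.
$$

So in this example the bound holds with $C=2A$ and $m=0$, and the exponential rate $A$ in the estimate is exactly the half-width of the support.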

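Proof. We follow [Tay11]. The first part is immediate: integrate by parts in $x$ in the integral $\hat\varphi(\xi)=\int e^{-ix\cdot\xi}\varphi(x)\,dx$. For the second part, we pick $x\ne0$ and define $\omega=x/|x|$. Then, due to analyticity of $\hat\varphi$,

$$\varphi(x)=\int_{\mathbb{R}}\hat\varphi(\xi)e^{i\xi\cdot x}\,\frac{d\xi}{2\pi}=\int_{\mathbb{R}}\hat\varphi(\xi+i\tau\omega)e^{i(\xi+i\tau\omega)\cdot x}\,\frac{d\xi}{2\pi},$$

$$|\varphi(x)|\le C_N\int_{\mathbb{R}}\langle\xi\rangle^{-N}e^{R\tau}e^{-\tau|x|}\,d\xi.$$

Taking $N=2$ and sending $\tau$ to $+\infty$, we see that for $|x|>R$ the integral is arbitrarily small, hence $\varphi(x)=0$ for $|x|>R$. □

Lemma A.4 (Titchmarsh convolution theorem for $f*f$). For any $f\in\mathcal{E}'(\mathbb{R})$,

$$\inf\operatorname{supp}(f*f)=2\inf\operatorname{supp}f,\qquad\sup\operatorname{supp}(f*f)=2\sup\operatorname{supp}f.$$

We will show that Lemma A.4 is a consequence of the following lemma.

Lemma A.5.

$$\max\bigl(\sup\operatorname{supp}(f*f),\,-\inf\operatorname{supp}(f*f)\bigr)=2\max\bigl(\sup\operatorname{supp}f,\,-\inf\operatorname{supp}f\bigr).$$

Proof. We write

$$a=\max\bigl(\sup\operatorname{supp}f,\,-\inf\operatorname{supp}f\bigr). \tag{A.7}$$

Assume that

$$\operatorname{supp}(f*f)\subseteq[-2a+\varepsilon,\,2a-\varepsilon]\qquad\text{for some }\varepsilon>0. \tag{A.8}$$

Then, by the Paley–Wiener theorem, there are $m\ge0$ and $C>0$ such that, with $F(\zeta)$ denoting the Fourier transform of $f$,

$$|F(\zeta)|^2=|\mathcal{F}_{\omega\to\zeta}[f*f](\zeta)|\le C(1+|\zeta|)^m e^{(2a-\varepsilon)|\operatorname{Im}\zeta|},\qquad\zeta\in\mathbb{C}. \tag{A.9}$$

It follows that

$$|F(\zeta)|=|F(\zeta)^2|^{\frac12}\le C^{\frac12}(1+|\zeta|)^{\frac m2}e^{(a-\frac\varepsilon2)|\operatorname{Im}\zeta|},\qquad\zeta\in\mathbb{C}. \tag{A.10}$$

By the Paley–Wiener theorem, $\operatorname{supp}f\subseteq[-a+\frac\varepsilon2,\,a-\frac\varepsilon2]$, $\varepsilon>0$, contradicting the definition (A.7) of $a$. Therefore, the inclusion (A.8) is impossible. We are done. □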

Proof of Lemma A.4. We can shift f so that inf supp f > 0 and apply Lemma A.5 to the shifted distribution. It follows that sup supp(f ∗ f ) = 2 sup supp f . Similarly for inf. 


Titchmarsh convolution theorem for $f*g$. We closely follow [Yos80, Chap. VI.5].

Lemma A.6. Let $f,g\in\mathcal{E}'(\mathbb{R})$. Then, for any polynomials $\alpha$, $\beta$,

$$\inf\operatorname{supp}(\alpha f)*(\beta g)\ge\inf\operatorname{supp}f*g,\qquad\sup\operatorname{supp}(\alpha f)*(\beta g)\le\sup\operatorname{supp}f*g.$$

Proof. The proofs of both inequalities are similar; we will only prove the second one. It suffices to prove it for the polynomials $\alpha(\omega)=\omega$, $\beta(\omega)=1$. We write

$$f_n(\omega)=\omega^nf(\omega),\qquad g_n(\omega)=\omega^ng(\omega),\qquad A_{mn}:=\sup\operatorname{supp}f_m*g_n. \tag{A.11}$$

Let us assume that, contrary to the statement of the lemma,

$$\sup\operatorname{supp}f_1*g>\sup\operatorname{supp}f*g. \tag{A.12}$$

This inequality can be rewritten as

$$A_{10}-A_{00}>0. \tag{A.13}$$

Due to the relation $\omega(f*g)(\omega)=(f_1*g)(\omega)+(f*g_1)(\omega)$, we have:

$$\sup\operatorname{supp}(f_1*g+f*g_1)=\sup\operatorname{supp}\omega(f*g)(\omega)\le\sup\operatorname{supp}f*g=A_{00}. \tag{A.14}$$

It follows that

$$\sup\operatorname{supp}(f_1*g*f_1*g+f_1*g*f*g_1)\le\sup\operatorname{supp}f_1*g+\sup\operatorname{supp}(f_1*g+f*g_1)\le A_{10}+A_{00}.$$

If we had $\sup\operatorname{supp}f_1*g*f_1*g\ne\sup\operatorname{supp}f_1*g*f*g_1$, then both these quantities would be smaller than or equal to $A_{10}+A_{00}$. By Lemma A.4 and (A.13), this would lead to $\sup\operatorname{supp}f_1*g\le(A_{10}+A_{00})/2<A_{10}$, contradicting (A.11). Thus, $\sup\operatorname{supp}f_1*g*f_1*g=\sup\operatorname{supp}f_1*g*f*g_1$, leading to

$$\sup\operatorname{supp}f_1*g*f_1*g=\sup\operatorname{supp}f_1*g*f*g_1\le\sup\operatorname{supp}f*g+\sup\operatorname{supp}f_1*g_1. \tag{A.15}$$

If we take into account that $\sup\operatorname{supp}f_1*g*f_1*g=2\sup\operatorname{supp}f_1*g$ by Lemma A.4, then (A.15) yields

$$2A_{10}=2\sup\operatorname{supp}f_1*g\le\sup\operatorname{supp}f*g+\sup\operatorname{supp}f_1*g_1=A_{00}+A_{11}. \tag{A.16}$$

This gives

$$A_{11}-A_{10}\ge A_{10}-A_{00}>0. \tag{A.17}$$

In the last inequality, we took into account (A.13). The inequalities (A.17) imply that

$$\sup\operatorname{supp}f_1*g_1>\sup\operatorname{supp}f_1*g. \tag{A.18}$$

Just as we derived (A.16) from (A.12), we could use (A.18) to derive

$$2\sup\operatorname{supp}f_1*g_1\le\sup\operatorname{supp}f_1*g+\sup\operatorname{supp}f_2*g_1. \tag{A.19}$$

The inequality (A.19) could be written as $A_{21}-A_{11}\ge A_{11}-A_{10}$, and, together with (A.17), this yields

$$A_{21}-A_{11}\ge A_{11}-A_{10}\ge A_{10}-A_{00}>0.$$


Proceeding by induction, we prove that

$$A_{32}-A_{22}\ge A_{22}-A_{21}\ge A_{21}-A_{11}\ge A_{11}-A_{10}\ge A_{10}-A_{00}>0,$$

hence

$$A_{nn}\ge A_{00}+2n(A_{10}-A_{00}). \tag{A.20}$$

At the same time, since $\sup\operatorname{supp}f_n\le\sup\operatorname{supp}f$, $\sup\operatorname{supp}g_n\le\sup\operatorname{supp}g$, we know that

$$\sup\operatorname{supp}f_n*g_n\le\sup\operatorname{supp}f_n+\sup\operatorname{supp}g_n\le\sup\operatorname{supp}f+\sup\operatorname{supp}g.$$

This would be in contradiction with (A.20). Hence, (A.12) is not true. This finishes the proof of the lemma. □

Let us show how to complete the proof of the Titchmarsh theorem for $f*g$. Assume that $\inf\operatorname{supp}f\ge0$, $\inf\operatorname{supp}g\ge0$, and that

$$f*g(t)=0,\qquad0\le t\le\kappa. \tag{A.21}$$

This implies that

$$\int_0^tf(t-s)g(s)\,ds=0,\qquad0\le t\le\kappa. \tag{A.22}$$

We may assume that both $f$ and $g$ are continuous. (If not, we consider their antiderivatives $F(t)=\int_{-\infty}^tf(s)\,ds$, $G(t)=\int_{-\infty}^tg(s)\,ds$, which also satisfy $\inf\operatorname{supp}F\ge0$, $\inf\operatorname{supp}G\ge0$; integrating (A.22) twice, we obtain $F*G(t)=0$, $0\le t\le\kappa$. We may repeat this process until we get functions continuous on $[0,\kappa]$.) By Lemma A.6, (A.22) leads to

$$\int_0^tf(t-s)g(s)s^n\,ds=0,\qquad n\in\mathbb{N}, \tag{A.23}$$

valid for all $0\le t\le\kappa$. Since $f$ and $g$ are continuous, Lerch’s theorem [Yos80, Chapter VI.5, Corollary 2] implies that

$$f(t-s)g(s)=0,\qquad0\le s\le t. \tag{A.24}$$

This in turn implies that there exists $\lambda\ge0$ such that $f(s)=0$ for $0\le s\le\lambda$ and $g(s)=0$ for $0\le s\le t-\lambda$.

Acknowledgment

The author is grateful to Dorothea Bahns, Wolfram Bauer, and Ingo Witt for the invitation to give a minicourse at the Summer School “Analysis – With Applications to Mathematical Physics,” Göttingen, August 29–September 2, 2011. Special thanks to Dorothea Bahns for many valuable suggestions and for pointing out typos.


References

[BL83a] H. Berestycki and P.-L. Lions, Nonlinear scalar field equations. I. Existence of a ground state, Arch. Rational Mech. Anal. 82 (1983), pp. 313–345.
[BL83b] H. Berestycki and P.-L. Lions, Nonlinear scalar field equations. II. Existence of infinitely many solutions, Arch. Rational Mech. Anal. 82 (1983), pp. 347–375.
[Boh13] N. Bohr, On the constitution of atoms and molecules, Phil. Mag. 26 (1913), pp. 1–25.
[BP92] V.S. Buslaev and G.S. Perel'man, Scattering for the nonlinear Schrödinger equation: states that are close to a soliton, Algebra i Analiz 4 (1992), pp. 63–102.
[BP95] V.S. Buslaev and G.S. Perel'man, On the stability of solitary waves for nonlinear Schrödinger equations, in Nonlinear evolution equations, vol. 164 of Amer. Math. Soc. Transl. Ser. 2, pp. 75–98, Amer. Math. Soc., Providence, RI, 1995.
[Bro24] L. de Broglie, Recherches sur la théorie des Quanta, Thèses, Paris, 1924.
[BS03] V.S. Buslaev and C. Sulem, On asymptotic stability of solitary waves for nonlinear Schrödinger equations, Ann. Inst. H. Poincaré Anal. Non Linéaire 20 (2003), pp. 419–475.
[BV92] A.V. Babin and M.I. Vishik, Attractors of evolution equations, vol. 25 of Studies in Mathematics and its Applications, North-Holland Publishing Co., Amsterdam, 1992, translated and revised from the 1989 Russian original by Babin.
[CL82] T. Cazenave and P.-L. Lions, Orbital stability of standing waves for some nonlinear Schrödinger equations, Comm. Math. Phys. 85 (1982), pp. 549–561.
[Com12] A. Comech, On global attraction to solitary waves. Klein–Gordon equation with mean field interaction at several points, J. Differential Equations 252 (2012), pp. 5390–5413.
[Com13] A. Comech, Weak attractor of the Klein–Gordon field in discrete space-time interacting with a nonlinear oscillator, Discrete Contin. Dyn. Syst. A 33 (2013), pp. 2711–2755.
[CP03] A. Comech and D. Pelinovsky, Purely nonlinear instability of standing waves with minimal energy, Comm. Pure Appl. Math. 56 (2003), pp. 1565–1607.
[Cuc01] S. Cuccagna, Stabilization of solutions to nonlinear Schrödinger equations, Comm. Pure Appl. Math. 54 (2001), pp. 1110–1145.
[Cuc03] S. Cuccagna, On asymptotic stability of ground states of NLS, Rev. Math. Phys. 15 (2003), pp. 877–903.
[Der64] G.H. Derrick, Comments on nonlinear wave equations as models for elementary particles, J. Mathematical Phys. 5 (1964), pp. 1252–1254.
[EGS96] M.J. Esteban, V. Georgiev, and E. Séré, Stationary solutions of the Maxwell–Dirac and the Klein–Gordon–Dirac equations, Calc. Var. Partial Differential Equations 4 (1996), pp. 265–281.
[GO12] V. Georgiev and M. Ohta, Nonlinear instability of linearly unstable standing waves for nonlinear Schrödinger equations, J. Math. Soc. Japan 64 (2012), pp. 533–548.
[GS79] R.T. Glassey and W.A. Strauss, Decay of a Yang–Mills field coupled to a scalar field, Comm. Math. Phys. 67 (1979), pp. 51–67.


[GSS87] M. Grillakis, J. Shatah, and W. Strauss, Stability theory of solitary waves in the presence of symmetry. I, J. Funct. Anal. 74 (1987), pp. 160–197.
[GV85] J. Ginibre and G. Velo, Time decay of finite energy solutions of the nonlinear Klein–Gordon and Schrödinger equations, Ann. Inst. H. Poincaré Phys. Théor. 43 (1985), pp. 399–442.
[Hen81] D. Henry, Geometric theory of semilinear parabolic equations, vol. 840 of Lecture Notes in Mathematics, Springer-Verlag, Berlin, 1981.
[Hör90] L. Hörmander, The analysis of linear partial differential operators. I, Springer Study Edition, Springer-Verlag, Berlin, 1990, second edn.
[Hör91] L. Hörmander, On the fully nonlinear Cauchy problem with small data. II, in Microlocal analysis and nonlinear waves (Minneapolis, MN, 1988–1989), vol. 30 of IMA Vol. Math. Appl., pp. 51–81, Springer, New York, 1991.
[Jör61] K. Jörgens, Das Anfangswertproblem im Grossen für eine Klasse nichtlinearer Wellengleichungen, Math. Z. 77 (1961), pp. 295–308.
[KK07] A. Komech and A. Komech, Global attractor for a nonlinear oscillator coupled to the Klein–Gordon field, Arch. Ration. Mech. Anal. 185 (2007), pp. 105–142.
[KK09] A. Komech and A. Komech, Global attraction to solitary waves for Klein–Gordon equation with mean field interaction, Ann. Inst. H. Poincaré Anal. Non Linéaire 26 (2009), pp. 855–868.
[KK10] A. Komech and A. Komech, On global attraction to solitary waves for the Klein–Gordon field coupled to several nonlinear oscillators, J. Math. Pures Appl. (9) 93 (2010), pp. 91–111.
[Kla82] S. Klainerman, Long-time behavior of solutions to nonlinear evolution equations, Arch. Rational Mech. Anal. 78 (1982), pp. 73–98.
[Kom91] A.I. Komech, Stabilization of the interaction of a string with a nonlinear oscillator, Vestnik Moskov. Univ. Ser. I Mat. Mekh. (1991), pp. 35–41, 103.
[Kom95] A.I. Komech, On stabilization of string-nonlinear oscillator interaction, J. Math. Anal. Appl. 196 (1995), pp. 384–409.
[Kom99] A. Komech, On transitions to stationary states in one-dimensional nonlinear wave equations, Arch. Ration. Mech. Anal. 149 (1999), pp. 213–228.
[Kom03] A.I. Komech, On attractor of a singular nonlinear U(1)-invariant Klein–Gordon equation, in Progress in analysis, Vol. I, II (Berlin, 2001), pp. 599–611, World Sci. Publ., River Edge, NJ, 2003.
[KS00] A. Komech and H. Spohn, Long-time asymptotics for the coupled Maxwell–Lorentz equations, Comm. Partial Differential Equations 25 (2000), pp. 559–584.
[KS07] P. Karageorgis and W.A. Strauss, Instability of steady states for nonlinear wave and heat equations, J. Differential Equations 241 (2007), pp. 184–205.
[KSK97] A. Komech, H. Spohn, and M. Kunze, Long-time asymptotics for a classical particle interacting with a scalar wave field, Comm. Partial Differential Equations 22 (1997), pp. 307–335.
[KV96] A. Komech and B. Vainberg, On asymptotic stability of stationary solutions to nonlinear wave and Klein–Gordon equations, Arch. Rational Mech. Anal. 134 (1996), pp. 227–248.


[Lev96] B.Y. Levin, Lectures on entire functions, vol. 150 of Translations of Mathematical Monographs, American Mathematical Society, Providence, RI, 1996, in collaboration with and with a preface by Yu. Lyubarskii, M. Sodin and V. Tkachenko, translated from the Russian manuscript by Tkachenko.
[Lio51] J.-L. Lions, Supports de produits de composition. I, C. R. Acad. Sci. Paris 232 (1951), pp. 1530–1532.
[MS72] C.S. Morawetz and W.A. Strauss, Decay and scattering of solutions of a nonlinear relativistic wave equation, Comm. Pure Appl. Math. 25 (1972), pp. 1–31.
[PS13] D. Pelinovsky and Y. Shimabukuro, Orbital stability of Dirac solitons, Letters in Mathematical Physics (2013), pp. 1–21.
[PW97] C.-A. Pillet and C.E. Wayne, Invariant manifolds for a class of dispersive, Hamiltonian, partial differential equations, J. Differential Equations 141 (1997), pp. 310–326.
[Sch26] E. Schrödinger, Quantisierung als Eigenwertproblem, Ann. Phys. 386 (1926), pp. 109–139.
[Sch51a] L.I. Schiff, Nonlinear meson theory of nuclear forces. I. Neutral scalar mesons with point-contact repulsion, Phys. Rev. 84 (1951), pp. 1–9.
[Sch51b] L.I. Schiff, Nonlinear meson theory of nuclear forces. II. Nonlinearity in the meson-nucleon coupling, Phys. Rev. 84 (1951), pp. 10–11.
[Seg63a] I. Segal, Non-linear semi-groups, Ann. of Math. (2) 78 (1963), pp. 339–364.
[Seg63b] I.E. Segal, The global Cauchy problem for a relativistic scalar field with power interaction, Bull. Soc. Math. France 91 (1963), pp. 129–135.
[Seg66] I. Segal, Quantization and dispersion for nonlinear relativistic equations, in Mathematical Theory of Elementary Particles (Proc. Conf., Dedham, Mass., 1965), pp. 79–108, M.I.T. Press, Cambridge, Mass., 1966.
[Sha83] J. Shatah, Stable standing waves of nonlinear Klein–Gordon equations, Comm. Math. Phys. 91 (1983), pp. 313–327.
[SS00] J. Shatah and W. Strauss, Spectral condition for instability, in Nonlinear PDE’s, dynamics and continuum physics (South Hadley, MA, 1998), vol. 255 of Contemp. Math., pp. 189–198, Amer. Math. Soc., Providence, RI, 2000.
[Str68] W.A. Strauss, Decay and asymptotics for □u = F(u), J. Functional Analysis 2 (1968), pp. 409–457.
[Str77] W.A. Strauss, Existence of solitary waves in higher dimensions, Comm. Math. Phys. 55 (1977), pp. 149–162.
[Str89] W.A. Strauss, Nonlinear wave equations, vol. 73 of CBMS Regional Conference Series in Mathematics, Published for the Conference Board of the Mathematical Sciences, Washington, DC, 1989.
[SW90] A. Soffer and M.I. Weinstein, Multichannel nonlinear scattering for nonintegrable equations, Comm. Math. Phys. 133 (1990), pp. 119–146.
[SW92] A. Soffer and M.I. Weinstein, Multichannel nonlinear scattering for nonintegrable equations. II. The case of anisotropic potentials and data, J. Differential Equations 98 (1992), pp. 376–390.
[SW99] A. Soffer and M.I. Weinstein, Resonances, radiation damping and instability in Hamiltonian nonlinear wave equations, Invent. Math. 136 (1999), pp. 9–74.


[Tao07] T. Tao, A (concentration-)compact attractor for high-dimensional non-linear Schrödinger equations, Dyn. Partial Differ. Equ. 4 (2007), pp. 1–53.
[Tay11] M.E. Taylor, Partial differential equations I. Basic theory, vol. 115 of Applied Mathematical Sciences, Springer, New York, 2011, second edn.
[Tem97] R. Temam, Infinite-dimensional dynamical systems in mechanics and physics, vol. 68 of Applied Mathematical Sciences, Springer-Verlag, New York, 1997, second edn.
[Tho97] J. Thomson, Cathode rays, Philosophical Magazine 44 (1897), p. 293.
[Tit26] E. Titchmarsh, The zeros of certain integral functions, Proc. of the London Math. Soc. 25 (1926), pp. 283–302.
[VK73] N.G. Vakhitov and A.A. Kolokolov, Stationary solutions of the wave equation in the medium with nonlinearity saturation, Radiophys. Quantum Electron. 16 (1973), pp. 783–789.
[Wei86] M.I. Weinstein, Lyapunov stability of ground states of nonlinear dispersive evolution equations, Comm. Pure Appl. Math. 39 (1986), pp. 51–67.
[Yos80] K. Yosida, Functional analysis, vol. 123 of Grundlehren der Mathematischen Wissenschaften [Fundamental Principles of Mathematical Sciences], Springer-Verlag, Berlin, 1980, sixth edn.

Andrew Comech
Texas A&M University, Mathematics Department, College Station, TX 77843, USA
and
Institute for Information Transmission Problems, Moscow 127994, Russia

Operator Theory: Advances and Applications, Vol. 251, 153–314
© Springer International Publishing Switzerland 2016

Geodesics in Geometry with Constraints and Applications

Irina Markina

Abstract. In this course we carefully define the notion of a non-holonomic manifold, which is a manifold with a certain non-integrable distribution. We describe such concepts as horizontal distribution, the Ehresmann connection, bracket generating condition for a distribution, sub-Riemannian structure and sub-Riemannian metric, Hamiltonian system, normal and abnormal geodesics, principal bundle and others.

Mathematics Subject Classification (2010). Primary 53C17; Secondary 37K05, 58B25.

Keywords. Sub-Riemannian geometry, non-holonomic constraints, smooth manifold, smooth sub-bundle, principal bundle, geodesic, length minimisers, Hamiltonian system, rolling system, controllability, Carnot–Carathéodory distance.

1. Introduction

These notes are based on the course of lectures presented at the summer school “Analysis – with Applications to Mathematical Physics” that took place at Georg-August-Universität, Göttingen, on August 29–September 2, 2011. The main purpose of these notes is to give a flavor of the subject that during the last decade received the name Sub-Riemannian Geometry and that studies the geometry of manifolds with non-holonomic constraints in the presence of a positive definite metric. This subject has attracted the attention of scientists since the 19th century. We will not describe the history of the development of this subject; we only mention that it was independently considered in several branches of mathematics such as non-holonomic mechanics, geometry of bundles, CR-manifolds, geometric control theory and others.

I acknowledge support through the Norwegian Research Council, project #204726/V30, and the Mittag-Leffler Institute, Stockholm, Sweden, during fall 2012.


It is supposed that the reader is familiar with basic notions of differential geometry, topology, and Lie groups. Nevertheless, in order to keep the notes self-contained, we present the main definitions and basic notions related to these topics in Appendix A. It is advisable to consult this appendix first whenever the reader meets an unfamiliar notion. The principal subject of sub-Riemannian geometry discussed in the notes is the notion of geodesics related to sub-Riemannian Hamiltonian functions produced by a sub-Riemannian metric. We present basic models where the sub-Riemannian geometry appears rather naturally. Based on these examples we show the main features and peculiarities of geodesics in geometry with non-holonomic constraints.

The structure of these notes is as follows. Section 2 collects the main definitions that reveal the similarities and differences between Riemannian and sub-Riemannian geometries. Carnot groups and their particular examples are presented in Section 3. We describe the sub-Riemannian structure of odd-dimensional spheres in Section 4. Section 5 deals with principal bundles. After presenting the main definitions we reconsider the examples of Sections 3 and 4 from the point of view of principal bundles. Section 6 is dedicated to the mechanical problem of rolling one manifold over another, where kinematic constraints are described in the language of a smooth sub-bundle of the tangent bundle of the configuration space. In the last Section 7 we generalize some results obtained for principal bundles to the infinite-dimensional Lie group of orientation preserving diffeomorphisms of the unit circle. Appendix A collects a vast number of definitions and concrete formulas used in the text. Some of them are well known, some of them are not widely presented in the literature. As noted above, we recommend that a less experienced reader start reading the notes from Appendix A. Appendix B is short and very technical; it contains some expressions that are useful, but not necessary for a first reading.

2. Main definitions

2.1. Smooth manifolds, vector fields, tangent map

It is supposed that the reader is familiar with the notion of smooth or $C^\infty$ manifolds. We set up the main definitions and notation. A smooth manifold is a Hausdorff, second countable topological space on which a complete smooth atlas is defined. We write $M$ for a smooth manifold, or rather $M^n$ if we want to emphasize the dimension $n$ of the manifold. Let $C^\infty(M)$ denote the space of smooth real-valued functions defined on $M$. The tangent space at a point $q\in M$ is denoted by $T_qM$. Recall that any element $v_q\in T_qM$ is a function $v_q: C^\infty(M)\to\mathbb{R}$ satisfying two properties:
1. $\mathbb{R}$-linearity: $v_q(af+bg)=av_q(f)+bv_q(g)$,
2. Leibnizian property: $v_q(fg)=v_q(f)g(q)+f(q)v_q(g)$
for all $a,b\in\mathbb{R}$, $f,g\in C^\infty(M)$, $q\in M$. The space $T_qM$, $q\in M$, is a real vector space and therefore $v_q$ is called a tangent vector.


Equivalently, a tangent vector $v_q$ at $q\in M$ can be defined as an equivalence class of parameterized curves through $q$, as follows. Let $\varphi: U\to V$, $U\subset M$, $V\subset\mathbb{R}^n$, be a coordinate chart with $q\in U$ and let $\gamma_1$, $\gamma_2$ be two smooth curves defined on an interval $I\subset\mathbb{R}$ containing $0$ such that $\gamma_i(0)=q$. We say that $\gamma_1$ and $\gamma_2$ are “equivalent” or have the same “velocity vector” at $t=0$ if the two smooth maps

$$I\ni t\mapsto\varphi(\gamma_i(t))\in V,\qquad i=1,2,$$

have the same first derivatives at $t=0$. The set of all equivalence classes of curves through $q$ is called the tangent vector space $T_qM$. Note that $\varphi^{-1}$ induces a one-to-one correspondence between the model space $\mathbb{R}^n$ and the tangent space $T_qM$. In fact, if $\varphi^{-1}(x)=q$, $x\in V\subset\mathbb{R}^n$, then each vector $v\in\mathbb{R}^n$ corresponds to the equivalence class of the curve

$$\bigl[t\mapsto\varphi^{-1}(x+tv)\bigr]\in T_qM.$$

The visualisation is presented in Figure 2.1.

Figure 2.1. The notion of the tangent vector.

After the previous definitions one can also say that the notion of a tangent vector $v$ is the generalization of the derivative of $C^\infty$-functions along the direction $v$. If the chart $\bigl(U,\varphi=(x^1,\dots,x^n)\bigr)$ is chosen, then the standard notation for the basis of $T_qM$ is $\bigl(\frac{\partial}{\partial x^1},\dots,\frac{\partial}{\partial x^n}\bigr)$ or shortly $(\partial_1,\dots,\partial_n)$. Any vector $v\in T_qM$ is written in coordinates as $v=\sum_{j=1}^nv^j\partial_j$. Notice the position of indices! The dual space to $T_qM$ is denoted by $T_q^*M$ and the pairing is written as $\langle\cdot\,,\cdot\rangle_q$, where we usually omit the subscript “q”, see Definition 42. The dual basis to $(\partial_1,\dots,\partial_n)$ with respect to the pairing is denoted by $(dx^1,\dots,dx^n)$ and, by


definition, it satisfies $\langle dx^i,\partial_j\rangle=\delta_{ij}$, where $\delta_{ij}$ is the Kronecker symbol. Then any co-vector $\lambda\in T_q^*M$ is written in coordinates as $\lambda=\sum_{k=1}^n\lambda_k\,dx^k$. Notice the position of indices. The elements of $T_q^*M$ are usually called co-vectors in geometry and momenta in physics. The tangent and co-tangent bundles are denoted by $TM$ and $T^*M$, correspondingly, consult Definition 59. Both vector bundles are $C^\infty$-smooth manifolds [23, 117]. The notations

$$\mathrm{pr}_M: TM\to M,\quad(q,v)\mapsto q\qquad\text{and}\qquad\mathrm{pr}^*_M: T^*M\to M,\quad(q,\lambda)\mapsto q$$

will be fixed for the canonical projections from the tangent and co-tangent bundles to the underlying manifold.

A vector field $X$ on a manifold $M$ is a function that assigns to each point $q\in M$ a tangent vector $X(q)\in T_qM$. We also write $X_q$ for the value of the vector field $X$ at the point $q\in M$. If $f\in C^\infty(M)$, then $Xf$ denotes a real-valued function on $M$ given by $(Xf)(q)=X(q)f$ for all $q\in M$. A vector field $X$ is called smooth if for any $f\in C^\infty(M)$ the function $Xf: M\to\mathbb{R}$ is an element of $C^\infty(M)$. If $\bigl(U,\varphi=(x^1,\dots,x^n)\bigr)$ is a coordinate chart, then any vector field $X$ can be written in terms of coordinates as $X(q)=\sum_{j=1}^nX^j(q)\partial_j$. Then the smoothness condition of the vector field $X$ on the neighborhood $U$ is equivalent to the requirement that all functions $X^j$, $j=1,\dots,n$, be of class $C^\infty(U)$. If the functions $X^j$, $j=1,\dots,n$, are analytic in $U$, then the corresponding vector field $X$ is called an analytic vector field.

Another way to define a vector field is to use the notion of a local section. Namely, a vector field $X$ on an open set $U\subset M$ is a smooth map $X: U\to TM$ such that $\mathrm{pr}_M\circ X=\mathrm{id}_U$. The section is global if $U$ can be taken to be the entire $M$. We write $\operatorname{Vect}M$ ($\operatorname{Vect}U$) for the collection of smooth vector fields defined on $M$ (on $U\subset M$). Algebraically, $\operatorname{Vect}M$ is a module over the ring $C^\infty(M)$ and a vector space over the field $\mathbb{R}$ (or $\mathbb{C}$ if the manifold $M$ is modelled over $\mathbb{C}^n$). Moreover, an operation of multiplication of two vector fields can be defined. The multiplication $[\cdot\,,\cdot]$ (that received the name commutator or the Lie product) is defined by

$$[X,Y]f=X(Yf)-Y(Xf). \tag{2.1}$$
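To see the commutator (2.1) at work, the following sketch evaluates it numerically on $\mathbb{R}^3$. It relies on the standard coordinate expression $[X,Y]^k=\sum_j\bigl(X^j\partial_jY^k-Y^j\partial_jX^k\bigr)$, which follows from (2.1) by applying both sides to the coordinate functions. The two vector fields and the helper names are assumptions chosen for this illustration; they are not taken from the text above.

```python
# Coordinate sketch of the commutator (2.1) on R^3.
# A vector field is modelled as a function: point -> tuple of components.
# Formula used: [X, Y]^k = sum_j (X^j d_j Y^k - Y^j d_j X^k).


def directional_derivative(field, direction, q, h=1e-5):
    """Central-difference derivative of the components of `field` at q along `direction`."""
    plus = [qi + h * di for qi, di in zip(q, direction)]
    minus = [qi - h * di for qi, di in zip(q, direction)]
    return [(a - b) / (2 * h) for a, b in zip(field(plus), field(minus))]


def lie_bracket(X, Y, q):
    """Approximate components of [X, Y] at the point q."""
    dY_along_X = directional_derivative(Y, X(q), q)
    dX_along_Y = directional_derivative(X, Y(q), q)
    return [a - b for a, b in zip(dY_along_X, dX_along_Y)]


# Hypothetical example fields:
#   X = d/dx - (y/2) d/dz,   Y = d/dy + (x/2) d/dz.
def X(q):
    x, y, z = q
    return (1.0, 0.0, -y / 2)


def Y(q):
    x, y, z = q
    return (0.0, 1.0, x / 2)


bracket = lie_bracket(X, Y, [0.3, -1.2, 2.0])
print([round(c, 6) for c in bracket])  # approximately [0, 0, 1], i.e. d/dz
```

For these two fields the commutator is $\partial_z$ at every point, a direction that is not a linear combination of $X$ and $Y$. Since the components of the fields are affine functions, the central differences are exact up to rounding.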

The Lie product is a map [· , ·] : Vect M × Vect M → Vect M satisfying the three axioms of Definition 51. The set of smooth vector fields considered as a real vector space endowed with the Lie multiplication forms a Lie algebra. Definition 1. Let M and N be two smooth manifolds and F : M → N be a map. The map F is smooth if the following holds. For any q ∈ M and for any local charts (U, ϕ) of q ∈ M and (V, ψ) of F (q) ∈ N , the composition ψ ◦ F ◦ ϕ−1 is a smooth map ψ ◦ F ◦ ϕ−1 : ϕ(U ) → ψ(V ) in the sense of smoothness defined in the Euclidean space Rn .


Figure 2.2. The smooth map $F$.

The definition is illustrated in Figure 2.2. A diffeomorphism between two manifolds is defined in a similar way.

Definition 2. Let $F: M\to N$ be a smooth map. The differential of $F$ at $q\in M$ is the linear map $d_qF: T_qM\to T_{F(q)}N$ such that

$$\bigl(d_qF(X_q)\bigr)f:=X_q(f\circ F)$$

for any $f\in C^\infty(N)$ and $X_q\in T_qM$.

If the local charts $\bigl(U,\varphi=(x^1,\dots,x^m)\bigr)$ of $q\in M$ and $\bigl(V,\psi=(y^1,\dots,y^n)\bigr)$ of $F(q)\in N$ are chosen, then

$$d_qF(\partial_{x^j})=\sum_{k=1}^n\frac{\partial(y^k\circ F)}{\partial x^j}(q)\,\partial_{y^k}\big|_{F(q)},\qquad j=1,\dots,m.$$

The matrix $\Bigl(\frac{\partial(y^k\circ F)}{\partial x^j}(q)\Bigr)_{k,j}$ is called the Jacobi matrix of the map $F$ with respect to the given coordinate charts.

2.2. Distributions and non-holonomic constraints

Definition 3. Let $M$ be a smooth manifold. A mapping $D$ that assigns to every point $q\in M$ a linear subspace $D_q$ of the tangent space $T_qM$ is called a singular distribution on $M$.

Definition 4. A distribution $D$ is called smooth on $M$ if for any $q\in M$ there is a neighborhood $U(q)$ and smooth linearly independent vector fields $X_1,\dots,X_k$ such that $D_x=\operatorname{span}\{X_1(x),\dots,X_k(x)\}$ for all $x\in U(q)$.

A distribution $D$ is called analytic if the vector fields $X_1,\dots,X_k$ in Definition 4 can be chosen to be analytic. The smooth (analytic) distribution $D$ on $M$ is a smooth (analytic) sub-bundle of the tangent bundle $TM$, and its rank is equal

158

I. Markina

to k for all q ∈ M . In the case of a singular distribution the set of vector fields X1 , . . . , Xk may not be necessarily linear independent, and therefore, the dimension of the linear subspace Dq can vary from point to point. From now on, we will work only with smooth distributions and smooth manifolds and therefore we omit the word “smooth”. A analogous definition can be given for a map D∗ that assigns to any point q ∈ M a linear subspace in the co-tangent space Tq∗ M and in this case it is called a co-distribution. The notion of a smooth distribution naturally leads to the following question. When does a smooth distribution or a smooth sub-bundle D ⊂ T M define a submanifold N inside of the original manifold M ? The answer was given by Frobenius [48]. Definition 5. A smooth distribution D on M is called involutive or integrable if [X, Y ] is a smooth section of D for any choice of smooth sections X and Y of D. Definition 6. A smooth submanifold N of a manifold M is the integral manifold of a distribution D if for any point q ∈ N there is an open neighborhood U (q) ⊂ N such that Tx N = Dx for any x ∈ U (q). Theorem 2.1 ([48, 131]). A submanifold N of a manifold M is the integral manifold of a distribution D, if and only if, D is involutive. In this case a foliation of the manifold M by integral manifolds N passing through different points q ∈ M is produced. Somehow, one cannot leave a chosen leaf of the foliation produced by the integral manifold N of D and being touched by the distribution D. A smooth curve c : I → M can be considered as a smooth map between two ∂ ∈ Tt I under the tangent manifolds. In this case the image of the tangent vector ∂r ∂ ˙ and is called the ˙ i.e., dt c ∂r = c(t), map dt c : Tt I → Tc(t) M is denoted by c(t), velocity vector of the curve c at t ∈ I. Definition 7. We say that a smooth curve c : I → M is tangent to the distribution D (or horizontal) if the tangent vector c(t) ˙ belongs to the vector space Dc(t) for any t ∈ I. 
One can relax the smoothness condition on the curve $c$ and require only that the curve have a derivative almost everywhere on the interval $I$. If a distribution $D$ is involutive, then starting from a point $q$ on its integral manifold $N$ and staying tangent to $D$ one can reach only points of $N$. Let us ask the opposite question: when can we reach any point (of the original manifold $M$) starting from a given one and always staying tangent to the prescribed distribution $D$? To answer this question we introduce a flag of distributions.

Let $X$ be a vector field such that $X_q \in D_q$ for all $q \in M$. We denote by $D + [X, D]$ the sub-bundle of $TM$ spanned by $D$ and all the vector fields $[X, Y]$, where the vector field $Y$ is such that $Y_q \in D_q$, $q \in M$. Thus
\[ D_q + [X, D]_q = \mathrm{span}\{D_q,\ [X, Y]_q \mid \forall\, Y_q \in D_q,\ q \in M\}. \]


We also drop the subscript $q$ and write $D + [X, D]$. We define the $k$-bracket $(k, X)$ inductively by
\[ \mathrm{bracket}(2, X) = D + [X, D], \quad \dots, \quad \mathrm{bracket}(k, X) = D + [\mathrm{bracket}(k-1, X), D]. \]
More generally, changing the vector field $X$ to the entire distribution $D$, we set
\[ \mathrm{bracket}(2, D) = D + [D, D], \quad \dots, \quad \mathrm{bracket}(k, D) = D + [\mathrm{bracket}(k-1, D), D]. \]
We get a flag of distributions
\[ D \subset \mathrm{bracket}(2, D) \subset \dots \subset \mathrm{bracket}(k, D) \subset \cdots. \]
A smooth section $X$ of $D$ is a $k$-step generator if $\mathrm{bracket}(k, X_q) = T_qM$ for any $q \in M$. Similarly, a distribution $D$ is said to be a $k$-step bracket generating (or completely non-holonomic) distribution if $\mathrm{bracket}(k, D_q) = T_qM$ for every $q \in M$. We say that a distribution $D$ is strongly bracket generating if $D_q + [X, D]_q = T_qM$ for all non-vanishing $X_q \in D_q$. If we do not emphasize the number of steps of a $k$-step bracket generating distribution, then we simply say bracket generating distribution.

Example 1. Consider the vector fields in $\mathbb{R}^4$ written in coordinates $(x, y, z, w)$:
\[ X_1 = \frac{\partial}{\partial x}, \qquad X_2 = \frac{\partial}{\partial y}, \qquad X_3 = \frac{\partial}{\partial z} + x\frac{\partial}{\partial w}. \]
The distribution $D = \mathrm{span}\{X_1, X_2, X_3\}$ is two-step bracket generating but not strongly bracket generating.

If a distribution $D$ is bracket generating and the dimension of $\mathrm{bracket}(k, D_q)$ does not depend on the point $q \in M$ for any $k$, then the distribution $D$ is called regular.

Now we are ready to formulate a sufficient condition for the connectivity problem. This condition was independently proved by P.K. Rashevskiĭ [119] and W.L. Chow [33].

Theorem 2.2 ([33, 119]). If a manifold $M$ is topologically connected and if a distribution $D$ on $M$ is bracket generating, then any two points on $M$ can be connected by a piecewise smooth curve tangent to $D$.

Necessary and sufficient conditions for the connectivity problem in the case of a $C^\infty$-manifold and a $C^\infty$-smooth distribution can be found in [128]. See also references therein.
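Returning to Example 1, the two claims made there can be checked numerically. The sketch below is only an illustration under stated assumptions: the helper names `lie_bracket` and `rank` are hypothetical, the bracket is approximated by central differences using the coordinate formula $[X,Y]^j = \sum_i (X^i\partial_i Y^j - Y^i\partial_i X^j)$, and the check is carried out at a single sample point rather than on all of $\mathbb{R}^4$.

```python
def lie_bracket(X, Y, q, h=1e-5):
    """Approximate [X, Y](q) via [X,Y]^j = sum_i (X^i d_i Y^j - Y^i d_i X^j)."""
    n = len(q)
    Xq, Yq = X(q), Y(q)
    out = [0.0] * n
    for i in range(n):
        qp = list(q); qp[i] += h
        qm = list(q); qm[i] -= h
        # central-difference approximations of d_i Y and d_i X at q
        dY = [(a - b) / (2 * h) for a, b in zip(Y(qp), Y(qm))]
        dX = [(a - b) / (2 * h) for a, b in zip(X(qp), X(qm))]
        for j in range(n):
            out[j] += Xq[i] * dY[j] - Yq[i] * dX[j]
    return out


def rank(vectors, tol=1e-6):
    """Rank of a list of vectors by Gaussian elimination with partial pivoting."""
    rows = [[float(c) for c in v] for v in vectors]
    r = 0
    for col in range(len(rows[0])):
        if r == len(rows):
            break
        p = max(range(r, len(rows)), key=lambda k: abs(rows[k][col]))
        if abs(rows[p][col]) < tol:
            continue
        rows[r], rows[p] = rows[p], rows[r]
        for k in range(r + 1, len(rows)):
            f = rows[k][col] / rows[r][col]
            rows[k] = [a - f * b for a, b in zip(rows[k], rows[r])]
        r += 1
    return r


# Vector fields of Example 1 in coordinates q = (x, y, z, w).
X1 = lambda q: [1.0, 0.0, 0.0, 0.0]
X2 = lambda q: [0.0, 1.0, 0.0, 0.0]
X3 = lambda q: [0.0, 0.0, 1.0, q[0]]
D = [X1, X2, X3]

q0 = [0.7, -1.2, 0.3, 2.0]          # arbitrary sample point (an assumption)
frame = [X(q0) for X in D]

rank_D = rank(frame)                 # rank of D at q0
rank_two_step = rank(frame + [lie_bracket(A, B, q0) for A in D for B in D])
rank_X2 = rank(frame + [lie_bracket(X2, B, q0) for B in D])

print(rank_D, rank_two_step, rank_X2)
```

If the statement of Example 1 holds, the three ranks should be 3, 4 and 3: the bracket $[X_1, X_3]$ supplies the missing direction $\partial/\partial w$, while $D + [X_2, D]$ adds nothing to $D$, so the non-vanishing section $X_2$ witnesses the failure of the strong bracket generating condition.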


Example 2. In the following example we show that the Chow–Rashevskiĭ condition is not necessary for connectivity. Let $M = \mathbb{R}^2$, $X_1 = \frac{\partial}{\partial x}$, $X_2 = \phi(x)\frac{\partial}{\partial y}$, where the $C^\infty$-function $\phi$ satisfies
\[ \phi(x) > 0 \ \text{ if } x > 0, \qquad \phi(x) = 0 \ \text{ if } x \le 0. \]
It is clear that one cannot move vertically in the left half-plane, but one can move horizontally to the right half-plane, displace arbitrarily in the right half-plane and return to the left half-plane, see Figure 2.3. In this example any two points in the plane can be connected by a curve tangent to the distribution $D = \mathrm{span}\{X_1, X_2\}$, but the vector fields do not span the entire tangent plane at points $q = (x, y)$ with $x \le 0$.

[Figure 2.3. $(\mathbb{R}^2, D)$ is horizontally connected, but $D$ is not bracket generating. The figure shows a horizontal path between points $A$ and $B$ in the $(x, y)$-plane.]

Example 3. Another example of a bracket generating distribution is the Grušin distribution spanned by the vector fields in $\mathbb{R}^2$
\[ X_1 = \frac{\partial}{\partial x}, \qquad X_2 = x\frac{\partial}{\partial y}, \]
studied by M.S. Baouendi in his PhD thesis in the early 70s, and then by numerous authors, see, for instance, [6, 25, 40, 52, 67, 129]. The latter two examples are based on non-smooth distributions.

Example 4. Historically, the integrability condition was given in terms of one-forms rather than vector fields. Let a manifold $M$ be of dimension $n$, and suppose we want to describe a distribution $D \subset TM$ of rank $k$, $k < n$. To achieve this we need to find $n - k$ one-forms $\Theta_1, \dots, \Theta_{n-k}$ such that the distribution $D$ belongs to their common kernel. The forms $\Theta_j$, $j = 1, \dots, n-k$, are called annihilators of $D$. This is equivalent to solving the system
\[ \begin{cases} \Theta_1(x^1, \dots, x^n) = 0 \\ \qquad\cdots \\ \Theta_{n-k}(x^1, \dots, x^n) = 0, \end{cases} \]
which received the name Pfaffian equations.


This system is integrable if the one-forms $\Theta_1, \dots, \Theta_{n-k}$ are exact forms:
\[ \begin{cases} \Theta_1(x^1, \dots, x^n) = d\theta_1(x^1, \dots, x^n) = 0 \\ \qquad\cdots \\ \Theta_{n-k}(x^1, \dots, x^n) = d\theta_{n-k}(x^1, \dots, x^n) = 0. \end{cases} \tag{2.2} \]
After integrating the latter system we get $n - k$ functions describing a $k$-dimensional integral submanifold of $M$ defined by the integrable system (2.2) or by the involutive distribution $D$.

For an analytic co-rank one distribution $D$, that is, for a single Pfaffian equation, the connectivity problem of the Chow–Rashevskiĭ Theorem 2.2 was solved by C. Carathéodory. The result reads as follows. Let $M$ be a connected manifold endowed with an analytic co-rank one distribution $D$. If there exist two points $A, B \in M$ that cannot be connected by a horizontal curve, then the distribution $D$ is integrable. Or, formulating the contrapositive: if for any points $A, B \in M$ there is a horizontal curve connecting these points, then the distribution $D$ is non-integrable (completely non-holonomic, bracket generating).

C. Carathéodory developed this theory in response to a question posed by M. Born: to derive the second law of thermodynamics and the existence of the entropy function. Translating the problem into geometric language, we work with a manifold $M$ that is the set of all possible thermodynamical states of some isolated system. The admissible or horizontal curves are adiabatic curves, that is, curves corresponding to slow processes during which (along the admissible curves) no heat $\Theta$ is exchanged. C. Carathéodory wrote the condition of an adiabatic process as a Pfaffian equation $\Theta = 0$ on $M$. It was known at that time from works by S. Carnot, J.P. Joule and others that there are thermodynamical states $A, B \in M$ which cannot be connected by an adiabatic process (by an admissible curve).
Carathéodory’s theorem states in this case that the distribution defined by the Pfaffian equation $\Theta = 0$ is integrable, which leads to the existence of two functions $T$ (temperature) and $S$ (entropy) that locally satisfy the relation $\Theta = T\,dS$. This proves the existence of the entropy function $S$, as well as the fact that an adiabatic process remains in the leaf (hypersurface) of the state space $M$ corresponding to the entropy function. The entropy function $S$ tends not to decrease, being constant or increasing, according to the second law of thermodynamics. Due to the names of S. Carnot and C. Carathéodory involved in this discovery, M. Gromov called sub-Riemannian geometry the Carnot–Carathéodory geometry.

Exercises

Decide whether the following distributions $D = \mathrm{span}\{X_1, X_2\}$ in $\mathbb{R}^3$ are bracket generating and regular. Find one-forms $\omega$ such that $D = \ker(\omega)$.
1. Heisenberg distribution: $X_1 = \frac{\partial}{\partial x}$, $X_2 = \frac{\partial}{\partial y} + x\frac{\partial}{\partial z}$.
2. Martinet distribution: $X_1 = \frac{\partial}{\partial x}$, $X_2 = \frac{\partial}{\partial y} + x^2\frac{\partial}{\partial z}$.


2.3. Riemannian and sub-Riemannian manifolds

Let us recall some notions from Riemannian geometry and compare basic definitions in the Riemannian and sub-Riemannian settings.

Definition 8. A Riemannian metric is a map $g \colon T_qM \times T_qM \to \mathbb{R}$ which is symmetric, bilinear, positive definite for any $q \in M$, and smoothly varying with respect to $q$.

If the coordinate chart $(U, \varphi = (x^1, \dots, x^n))$ is chosen and $(\partial_1, \dots, \partial_n)$ is the local basis of $T_qM$, $q \in U$, then $g_{ij} = g(\partial_i, \partial_j)$ is the matrix associated to the metric $g$. Smoothness of $g$ means that the matrix $g_{ij}(q) = g_{ij}(x^1, \dots, x^n)$ is a smooth function of $(x^1, \dots, x^n)$ in $\varphi(U)$. The couple $(M, g)$ is called a Riemannian manifold. It would be more correct to say that the triplet $(M, TM, g)$ is called a Riemannian manifold.

Definition 9. The distance $d(q_0, q_1)$ between two points $q_0, q_1 \in M$ related to the Riemannian metric $g$ is defined by the equality
\[ d(q_0, q_1) = \inf \int_0^1 g\bigl(\dot c(t), \dot c(t)\bigr)^{1/2}\, dt, \]

where the infimum is taken over all curves $c \colon [0, 1] \to M$ differentiable almost everywhere in $[0, 1]$ and such that $c(0) = q_0$, $c(1) = q_1$.

We are now ready to define a sub-Riemannian manifold. Let $M$ be a smooth manifold and let $D$ be a smooth distribution (a smooth sub-bundle) of the tangent bundle $TM$.

Definition 10. A map $g_D \colon D_q \times D_q \to \mathbb{R}$ which is symmetric, bilinear, positive definite for any $q \in M$ and smoothly varying with respect to $q$ is called a sub-Riemannian metric.

Definition 11. The couple $(D, g_D)$ is called a sub-Riemannian structure and the triplet $(M, D, g_D)$ is called a sub-Riemannian manifold.

If $D = TM$, then Definition 11 reduces to the definition of a Riemannian manifold. In this sense sub-Riemannian geometry is a generalization of Riemannian geometry. The distance function related to a sub-Riemannian metric $g_D$ is defined by
\[ d_{c-c}(q_0, q_1) = \inf \int_0^1 g_D\bigl(\dot c(t), \dot c(t)\bigr)^{1/2}\, dt, \tag{2.3} \]

where the infimum is taken over all horizontal curves $c \colon [0, 1] \to M$ differentiable almost everywhere in $[0, 1]$ and such that $c(0) = q_0$, $c(1) = q_1$. Thus, we have added the horizontality condition, $\dot c(t) \in D_{c(t)}$, for the set of admissible curves. The set of admissible curves is smaller; therefore, the $d_{c-c}$-distance is, in general, bigger than the Riemannian distance if both metrics are defined on the manifold and coincide on $D_q$, $q \in M$. For a bracket generating distribution on a connected manifold, Theorem 2.2 guarantees that the set of horizontal curves joining two given points is not empty, and therefore the function $d_{c-c}$ takes only finite values. The distance $d_{c-c}$ is called the Carnot–Carathéodory distance in recognition of the contributions of S. Carnot and C. Carathéodory described in Example 4.

Let us suppose that a Riemannian metric $g$ and a sub-Riemannian metric $g_D$ are defined on a smooth manifold $M$, and that they produce the Riemannian distance $d$ and the Carnot–Carathéodory distance $d_{c-c}$ on $M$, respectively. As a result, two metric spaces $(M, d)$ and $(M, d_{c-c})$ and two topological spaces $(M, \tau_d)$ and $(M, \tau_{c-c})$ are defined, where the topology $\tau_d$ is generated by open balls in the $d$-metric and $\tau_{c-c}$ is generated by $d_{c-c}$-balls. It is established that the topological spaces $(M, \tau_d)$ and $(M, \tau_{c-c})$ are equivalent, but the metric spaces $(M, d)$ and $(M, d_{c-c})$ are not in general Lipschitz equivalent, see [111, p. 27], [13, 55, 63, 114]. Example 5 shows non-equivalence of the metric spaces $(M, d)$ and $(M, d_{c-c})$ in some particular cases.

2.3.1. Riemannian and sub-Riemannian gradient. We close the subsection with a few words about the gradient vector field in sub-Riemannian geometry. Let us recall that the gradient on the Riemannian manifold $(M, g)$ is the vector field $\operatorname{grad} f$ determined by its action on smooth functions:
\[ g(\operatorname{grad} f, X) = Xf \quad \text{for any } X \in \mathrm{Vect}\, M \text{ and } f \in C^\infty(M). \]

If a coordinate chart is chosen, then the gradient can be written as
\[ \operatorname{grad} f = \sum_{i,j} g^{ij}\frac{\partial f}{\partial x^i}\,\partial_j, \tag{2.4} \]
where $\{g^{ij}\}_{i,j=1}^n$ is the inverse matrix to $g_{ij} = g(\partial_i, \partial_j)$, $i, j = 1, \dots, n$. More details about differential operators on Riemannian manifolds can be found in [117]. In the case of a sub-Riemannian manifold $(M, D, g_D)$ the definition is analogous. A sub-Riemannian gradient $\operatorname{grad}_D f$ is a horizontal vector field such that
\[ g_D(\operatorname{grad}_D f, X) = Xf, \tag{2.5} \]

for any smooth section $X$ of $D$ and $f \in C^\infty(M)$.

2.4. Hamiltonian formalism and geodesics

Let us compare the problem of finding a curve realizing the distance between two points in the Riemannian and sub-Riemannian geometries.

2.4.1. Geodesic on Riemannian manifolds. Historically, a geodesic was defined as a curve $\gamma$ that locally realizes the distance between two points on a Riemannian manifold. The corresponding equation is
\[ \nabla_{\dot\gamma(t)}\dot\gamma(t) = 0, \qquad \gamma \colon I \to M, \tag{2.6} \]
where $\nabla$ is the Levi-Civita connection, which is a generalization of the directional derivative of vector fields defined on a Riemannian manifold $(M, g)$, see Definitions 45, 46, and 47. The connection $\nabla$ is compatible with the Riemannian metric $g$, see Theorem 8.1 and [23, p. 53], [117, p. 59]. Geometrically, equation (2.6) also implies that the corresponding first geodesic curvature of the solution vanishes.


The physical interpretation asserts that solutions of equation (2.6) give trajectories of the motion of particles in the absence of any external force, that is, the motion of “free particles” or “free motion”.

Given a coordinate chart $(U, \varphi = (x^1, \dots, x^n))$, the Christoffel symbols of the Levi-Civita connection are introduced by $\nabla_{\partial_i}\partial_j = -\sum_{k=1}^n \Gamma^k_{ij}\partial_k$. Then equation (2.6) takes the form
\[ \ddot x^k(t) = \sum_{i,j=1}^n \Gamma^k_{ij}\,\dot x^i(t)\,\dot x^j(t), \qquad k = 1, \dots, n, \quad t \in I. \tag{2.7} \]

Given a Riemannian metric $g$, there is a predetermined choice of the dual space $T_q^*M$ to the tangent space $T_qM$, given as follows. If $v \in T_qM$, then $v^*(\cdot) = g(v, \cdot) \colon T_qM \to \mathbb{R}$ is a continuous linear functional, and therefore an element of $T_q^*M$. We write it in coordinates. Let $\{\partial_j\}_{j=1}^n$ be a basis of $T_qM$ and let $\{g_{ij}\}$ be the matrix associated with the metric $g$; then $dx^i = \sum_{j=1}^n g_{ij}\partial_j$, $i = 1, \dots, n$, represent the basis of the dual $T_q^*M$. If $v = \sum_{j=1}^n v^j\partial_j$, then $v^* = \sum_{i=1}^n v_i^*\,dx^i$, where $v_i^* = \sum_{j=1}^n g_{ij}v^j$. This process is called “lowering indices” in physics.

We can say now that the Riemannian metric $g$ defines a map $\tilde g \colon T_qM \to T_q^*M$, which is an isomorphism between two vector spaces. Therefore, the inverse map $\tilde g^{-1} \colon T_q^*M \to T_qM$ is defined. The map $\tilde g^{-1}$ defines a metric on $T_q^*M$, called a co-metric, which we denote by $g^{-1}$. Thus, the co-metric is the map $g^{-1} \colon T_q^*M \times T_q^*M \to \mathbb{R}$ defined by
\[ g^{-1}(v^*, w^*) = v^*\bigl(\tilde g^{-1}(w^*)\bigr) = g\bigl(\tilde g^{-1}(v^*), \tilde g^{-1}(w^*)\bigr). \]
We see that the maps $\tilde g$ and $\tilde g^{-1}$ become linear isometries between $T_qM$ and $T_q^*M$ for all $q \in M$. The matrix corresponding to $g^{-1}$ is the inverse matrix to $\{g_{ij}\}$, and it is usually written as $\{g^{ij}\}$. The process that associates a vector $v = (v^1, \dots, v^n)$ to a given co-vector $\lambda = (\lambda_1, \dots, \lambda_n)$ by making use of the map $\tilde g^{-1}$ is called “raising indices”:
\[ v^i = \sum_{j=1}^n g^{ij}\lambda_j, \qquad i = 1, \dots, n. \]
We conclude that the Riemannian metric $g$ defines a pairing $\langle\cdot\,, \cdot\rangle \colon T_qM \times T_q^*M \to \mathbb{R}$ by
\[ \langle v, \lambda\rangle = g\bigl(v, \tilde g^{-1}(\lambda)\bigr) = g^{-1}\bigl(\tilde g(v), \lambda\bigr), \qquad \lambda \in T_q^*M, \ v \in T_qM. \]
Having the co-metric and a chosen coordinate chart, we define the Riemannian Hamiltonian function $H \colon T^*M \to \mathbb{R}$ by
\[ H(q, \lambda) = \frac12\, g^{-1}(\lambda_q, \lambda_q) = \frac12 \sum_{i,j=1}^n g^{ij}\lambda_i\lambda_j. \]
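To make the passage from the metric to the Hamiltonian concrete, here is a small worked example. It is an illustration only: the choice of the punctured plane $\mathbb{R}^2 \setminus \{0\}$ with the Euclidean metric in polar coordinates $(r, \theta)$ is an assumption made for this sketch and does not come from the text.

```latex
\begin{align*}
g &= dr^2 + r^2\,d\theta^2, &
(g_{ij}) &= \begin{pmatrix} 1 & 0 \\ 0 & r^2 \end{pmatrix}, &
(g^{ij}) &= \begin{pmatrix} 1 & 0 \\ 0 & r^{-2} \end{pmatrix}, \\
v^*_r &= v^r, &
v^*_\theta &= r^2\,v^\theta &
&\text{(lowering indices)}, \\
v^r &= \lambda_r, &
v^\theta &= r^{-2}\lambda_\theta &
&\text{(raising indices)}, \\
H(q,\lambda) &= \tfrac12\Bigl(\lambda_r^2 + \frac{\lambda_\theta^2}{r^2}\Bigr).
\end{align*}
```

The co-metric matrix is simply the inverse of the metric matrix, and the Hamiltonian is the corresponding quadratic form in the momenta.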


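A solution of the Hamiltonian equations
\[ \begin{aligned} \dot x^i(s) &= \frac{\partial H(q(s), \lambda(s))}{\partial \lambda_i}, & q(s) &= (x^1(s), \dots, x^n(s)), \\ \dot\lambda_i(s) &= -\frac{\partial H(q(s), \lambda(s))}{\partial x^i}, & \lambda(s) &= (\lambda_1(s), \dots, \lambda_n(s)), \quad i = 1, \dots, n, \end{aligned} \tag{2.8} \]
$s \in I$, is called the bi-characteristic curve. The projection of the bi-characteristic curve to the manifold $M$ is called a geodesic. The vector field
\[ \vec H(q, \lambda) = \Bigl(\frac{\partial H(q, \lambda)}{\partial \lambda_i},\ -\frac{\partial H(q, \lambda)}{\partial x^i}\Bigr) \]
is called the Hamiltonian vector field. The Hamiltonian function is constant along the bi-characteristic since
\[ \frac{d H(q(s), \lambda(s))}{ds} = \sum_{i=1}^n \Bigl(\frac{\partial H(q, \lambda)}{\partial x^i}\,\dot x^i(s) + \frac{\partial H(q, \lambda)}{\partial \lambda_i}\,\dot\lambda_i(s)\Bigr) = \sum_{i=1}^n \bigl(-\dot\lambda_i\,\dot x^i(s) + \dot x^i\,\dot\lambda_i(s)\bigr) = 0. \]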

If a geodesic is parametrized by the arc length, then $H = 1/2$. Note that the notions of a local length minimizer and of a geodesic as the projection of a bi-characteristic to the manifold coincide in Riemannian geometry, see, for instance, [7].

Denote by $\gamma_{q,v}$ a geodesic starting from $q \in M$ with the initial velocity $v \in T_qM$. The notion of a geodesic leads to the construction of a map associating to vectors from $T_qM$ points in a neighborhood of $q \in M$. The domain of definition for this map is
\[ D(q) = \{v \in T_qM \mid \exists \text{ a geodesic } \gamma_{q,v} \colon [0, 1] \to M,\ \gamma(0) = q,\ \dot\gamma(0) = v\}. \tag{2.9} \]

Definition 12. The Riemannian exponential map $\exp_q \colon D(q) \to M$ is defined by $\exp_q(v) = \gamma_{q,v}(1)$ for all $v \in D(q)$.

Actually, the Riemannian exponential map is the composition of the following maps:
\[ T_qM \xrightarrow{\ \iota\ } TM \xrightarrow{\ \tilde g\ } T^*M \xrightarrow{\ \Phi\ } T^*M \xrightarrow{\ \mathrm{pr}^*_M\ } M, \qquad \exp = \mathrm{pr}^*_M \circ \Phi \circ \tilde g \circ \iota. \tag{2.10} \]

Here we denote by $\iota$ the inclusion of the tangent space $T_qM$ into the tangent bundle, by $\tilde g$ the identification of the tangent and co-tangent bundles by means of the metric $g$, by $\Phi$ the flow produced by the Hamiltonian vector field on the co-tangent bundle, see Definition 44, and by $\mathrm{pr}^*_M$ the canonical projection to the base manifold $M$. The concrete choice of the initial velocity $v$ at $q \in M$ gives the value of the dual momentum $\lambda_q \in T_q^*M$. In the following proposition we collect some basic properties of the Riemannian exponential map.


Proposition 1 ([23, 117]). Let $v \in D(q)$ be as defined in (2.9). Then
1. the exponential map $\exp_q$ carries lines through the origin of $T_qM$ to geodesics on $M$ through $q$ in the following sense: $\exp_q(tv) = \gamma_{q,tv}(1) = \gamma_{q,v}(t)$, $t \in [0, 1]$;
2. for each $q \in M$, there is a neighborhood $V$ of the origin in $T_qM$ such that the exponential map $\exp_q \colon V \to U$ is a diffeomorphism onto a neighborhood $U$ of $q \in M$;
3. if $U$ is a normal neighborhood of $q \in M$ ($U$ is the diffeomorphic image of a starlike neighborhood of the origin in $T_qM$), then for each point $x \in U$ there is a unique geodesic $\gamma_{q,v} \colon [0, 1] \to U$ joining $q$ and $x$ in $U$, and $\dot\gamma_{q,v}(0) = v = \exp_q^{-1}(x)$.

Exercises

1. Show that equations (2.7) and (2.8) are equivalent if we introduce the co-vectors $\lambda$ (called momenta in physics) and the Christoffel symbols for the Levi-Civita connection by
\[ \lambda_i = \sum_{j=1}^n g_{ij}\,\dot x^j, \qquad \Gamma^k_{ij} = \frac12 \sum_{m=1}^n g^{km}\Bigl(\frac{\partial g_{jm}}{\partial x^i} + \frac{\partial g_{im}}{\partial x^j} - \frac{\partial g_{ij}}{\partial x^m}\Bigr). \]
2. Suppose that a coordinate chart is chosen and $X_1(q), \dots, X_n(q)$ is an orthonormal basis of $T_qM$. If the collection $X_1(x), \dots, X_n(x)$ is smooth and orthonormal in a neighborhood $U$ of $q$, then the family of vector fields $X_1, \dots, X_n$ is called an orthonormal frame in $U$. Assume that an orthonormal frame is given. Show that the Hamiltonian function can be written as
\[ H(q, \lambda) = \frac12 \sum_{i=1}^n \langle X_i(q), \lambda_q\rangle^2. \tag{2.11} \]
Hint. Write co-vectors in the form $\lambda = \sum_{i=1}^n \lambda_i\omega^i$, where $\{\omega^i\}_{i=1}^n$ is the dual basis to $(X_1, \dots, X_n)$: $\langle X_j, \omega^i\rangle = \delta_{ij}$.
3. Calculate the exponential map $\exp_q \colon T_q\mathbb{R}^n \to \mathbb{R}^n$, where $\mathbb{R}^n$ is considered as a Riemannian manifold with the Euclidean metric. Let the Euclidean metric be also defined on $T_q\mathbb{R}^n$. Show that the exponential map is an isometry.

2.4.2. Geodesics on sub-Riemannian manifolds. Let $(M, D, g_D)$ be a sub-Riemannian manifold. Let us assume that we are interested in finding the local minimizer of the length functional (2.3) over all almost everywhere differentiable horizontal curves, or in other words, we look for a curve that locally realizes the Carnot–Carathéodory distance. We need to define an analogue of the Levi-Civita connection, but there is no metric defined on the entire tangent bundle. We will not enter this question deeply, since it requires some amount of knowledge of differential geometry, see, for instance, [37, 53]. Instead, we adopt the Hamiltonian approach, since it is more suitable for physical applications. We also distinguish length minimizers, which are curves realizing the $d_{c-c}$-distance, and geodesics, which are projections of bi-characteristic curves of the Hamiltonian system onto the underlying manifold.

To use the Hamiltonian approach we still have to overcome the absence of a metric defined on the entire tangent bundle, which was used for the definition of the dual to $TM$. Therefore, we assume that we are given a dual $T^*M$ (as the set of all continuous linear functionals) and a pairing $\langle\cdot\,, \cdot\rangle \colon T_qM \times T_q^*M \to \mathbb{R}$ that is the evaluation of a functional on a vector.

Definition 13. We define a linear map $\tilde g^D \colon T_q^*M \to T_qM$ by the following two conditions:
1. the image of $T_q^*M$ is the linear space $D_q \subset T_qM$;
2. for $\lambda_q \in T_q^*M$ the image $\tilde g^D(\lambda_q)$ is a vector $X_q \in D_q$ such that $\langle Y_q, \lambda_q\rangle = g_D(Y_q, X_q)$ for all $Y_q \in D_q$.

The map $\tilde g^D$ is an analogue of the map $\tilde g^{-1}$ in Riemannian geometry. The map $\tilde g^D$ defines the co-metric $g^D \colon T_q^*M \times T_q^*M \to \mathbb{R}$ by the following rule:
\[ g^D(\xi, \lambda) = \langle \tilde g^D(\lambda), \xi\rangle = g_D\bigl(\tilde g^D(\xi), \tilde g^D(\lambda)\bigr). \]
We can still write the matrix $g^{ij}$ for the co-metric $g^D$ in local coordinates, but we have no analogue of $g_{ij}$, since the matrix $g^{ij}$ is not invertible in this case.

Let us introduce the notation $D_q^\perp = \ker(\tilde g^D)$, $q \in M$, for the kernel of the linear map $\tilde g^D$. The elements of the smooth sub-bundle $D^\perp \subset T^*M$ are called annihilators of the distribution $D$, since
\[ \langle Y_q, \xi_q\rangle = g_D\bigl(Y_q, \tilde g^D(\xi_q)\bigr) = g_D(Y_q, \vec 0) = 0 \quad \text{for } Y_q \in D_q,\ \xi_q \in D_q^\perp. \]
Then also
\[ g^D(\lambda_q, \xi_q) = \langle \tilde g^D(\xi_q), \lambda_q\rangle = \langle \vec 0, \lambda_q\rangle = 0, \qquad \forall\, \xi_q \in D_q^\perp \text{ and } \forall\, \lambda_q \in T_q^*M. \]
As in the Riemannian case, having a co-metric, one can define the Hamiltonian function $H_{sR} \colon T^*M \to \mathbb{R}$ by
\[ H_{sR}(q, \lambda) = \frac12\, g^D(\lambda_q, \lambda_q) = \frac12 \sum_{i,j=1}^n g^{ij}\lambda_i\lambda_j, \qquad q \in M. \tag{2.12} \]
We call $H_{sR}$ the sub-Riemannian Hamiltonian function. Consider again the Hamiltonian equations (2.8). The first equation, written in the form
\[ \dot x^i(s) = \sum_{j=1}^n g^{ij}\lambda_j, \quad i = 1, \dots, n, \qquad \text{or} \qquad \dot x(s) = \tilde g^D(\lambda), \tag{2.13} \]

says that the velocity of the solution to (2.8) will be a horizontal vector field by Definition 13. Since the co-metric in the sub-Riemannian case is not strictly positive definite, it can happen that the sub-Riemannian Hamiltonian function vanishes. This leads to two different types of geodesics: normal and abnormal. Recall that the Hamiltonian function is constant along any bi-characteristic. If this constant is zero, then the projection to $M$ is called an abnormal geodesic. If the Hamiltonian function is not zero along the bi-characteristic, then the geodesic is called normal. If $X_1, \dots, X_k$ is an orthonormal frame of the distribution $D$, then the abnormal bi-characteristic is a solution of the Hamiltonian system for the $k$ Hamiltonian functions
\[ H_{X_i}(q, \lambda) = \langle \lambda_q, X_i(q)\rangle = 0, \qquad \lambda_q \in D_q^\perp \setminus \{0\}, \quad i = 1, \dots, k. \]
To find normal bi-characteristics we need to work with the Hamiltonian function
\[ H(q, \lambda) = \sum_{i=1}^k \langle \lambda_q, X_i(q)\rangle^2. \]
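As an illustration of the co-metric and of the annihilator bundle, the objects above can be written out for the Heisenberg distribution from the exercises of Subsection 2.2. The sketch assumes that $X_1 = \partial_x$ and $X_2 = \partial_y + x\,\partial_z$ are declared orthonormal (this choice of $g_D$ is an assumption of the example), uses coordinates $(x, y, z)$ on $\mathbb{R}^3$ with momenta $(\lambda_1, \lambda_2, \lambda_3)$, and takes the normalization of (2.12).

```latex
\begin{align*}
g^{ij} &= \sum_{a=1}^{2} X_a^i X_a^j, &
(g^{ij}) &= \begin{pmatrix} 1 & 0 & 0 \\ 0 & 1 & x \\ 0 & x & x^2 \end{pmatrix}, &
\det (g^{ij}) &= x^2 - x^2 = 0, \\
\langle \lambda, X_1\rangle &= \lambda_1, &
\langle \lambda, X_2\rangle &= \lambda_2 + x\lambda_3, &
D^\perp_q &= \operatorname{span}\{-x\,dy + dz\}, \\
H_{sR}(q,\lambda) &= \tfrac12\bigl(\lambda_1^2 + (\lambda_2 + x\lambda_3)^2\bigr).
\end{align*}
```

The matrix $(g^{ij})$ has rank two everywhere, so it cannot be inverted to produce a matrix $g_{ij}$, and $H_{sR}$ vanishes exactly on the co-vectors proportional to $dz - x\,dy$.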

We will mostly work with normal geodesics. The reader can find a lot of useful information about abnormal geodesics in [97, 110]. Here we only want to present a short description of $D^\perp$ and the cases when the abnormal geodesics are trivial.

Proposition 2 ([97]). Let $D$ be a smooth distribution of rank $k$ on an $n$-dimensional manifold $M$. Then $D^\perp$ is a smooth $(2n - k)$-dimensional sub-bundle of $T^*M$. Locally, it can be described as the set of $(q, \lambda) \in T^*M$ such that $H_i(q, \lambda) = \langle \lambda_q, X_i(q)\rangle = 0$, $i = 1, \dots, k$, where $\{X_i\}_{i=1}^k$ is a local basis for $D$.

We remark that the sub-bundle $D \subset TM$ defines the set of annihilators $D^\perp \subset T^*M$. The converse is also true. Given a smooth sub-bundle $D^\perp \subset T^*M$, the distribution $D \subset TM$ is defined by
\[ D_q = \{v \in T_qM \mid \langle v, \lambda_q\rangle = 0 \text{ for all } \lambda_q \in D_q^\perp\}, \qquad q \in M. \]

Theorem 2.3 ([97, 127]). Let $D$ be a smooth distribution on a smooth manifold $M$.
1. If $D = TM$, then there are no abnormal geodesics.
2. If $D$ is strongly bracket generating, but $D \ne TM$, then the abnormal geodesics are constant curves.

Proof. For some additional information about symplectic manifolds see Subsection 8.2. Let $\Gamma \colon I \to T^*M \setminus \{0\}$ be an abnormal bi-characteristic curve for the distribution $D$, which we write as $\Gamma(t) = \bigl(\gamma(t), \lambda(t)\bigr)$. If $X$ is a smooth section of $D$, then
\[ H_X(\Gamma(t)) = H_X\bigl(\gamma(t), \lambda(t)\bigr) = \bigl\langle \lambda(t), X(\gamma(t))\bigr\rangle = 0. \]
If $D = TM$, then all possible Hamiltonians vanish, and since the pairing is nondegenerate, we get $\lambda(t) = 0$ for all $t \in I$. This contradicts the assumption that $\lambda(t) \in D^\perp_{\gamma(t)} \setminus \{0\}$.

Let us assume now that $D$ is strongly bracket generating and $\Gamma$, $\gamma$ are as above. Then $H_Y(\Gamma(t)) = 0$ for any $Y \in D$. Differentiating the latter equality with respect to $t$, we obtain
\[ \frac{dH_Y}{dt}(\Gamma(t)) = dH_Y\bigl(\dot\Gamma(t)\bigr) = 0. \tag{2.14} \]
Suppose that the bi-characteristic $\Gamma$ is the solution of the Hamiltonian system
\[ \dot\Gamma(t) = \vec H_X(\Gamma(t)), \qquad t \in I, \]


for some smooth section $X$ of $D$. Then for any $Y \in D$ we get
\[ H_{[X,Y]}(\Gamma(t)) = \{H_X, H_Y\}(\Gamma(t)) = \Omega\bigl(\vec H_X(\Gamma(t)), \vec H_Y(\Gamma(t))\bigr) = dH_Y\bigl(\dot\Gamma(t)\bigr) = 0 \]
by Definition 49 and (2.14). We conclude that $\lambda(t)$ annihilates the space
\[ D_{\gamma(t)} + [X, D]_{\gamma(t)}, \qquad t \in I, \]
along $\gamma$. Since $D$ is strongly bracket generating, this space equals $T_{\gamma(t)}M$ whenever $X(\gamma(t)) \ne 0$, which would force $\lambda(t) = 0$. Since $\lambda(t) \ne 0$, we obtain $X(\gamma(t)) = 0$, i.e., $\dot\gamma(t) = 0$ by Corollary 9. We conclude that the curve $\gamma$ is constant. □

The relation between the length minimizing curves and the geodesics (projections of bi-characteristics of the Hamiltonian system) in sub-Riemannian geometry is expressed in the following theorem.

Theorem 2.4. Let $(M, D, g_D)$ be a sub-Riemannian manifold.
1. If $\gamma \colon [a, b] \to M$ is a length minimizer parametrized by the arc length, then $\gamma$ is a geodesic (normal or abnormal) [97].
2. Every normal geodesic is a local length minimizer [15, 95].
3. There are abnormal geodesics that are local length minimizers [109, 110].
4. There are abnormal geodesics that are not local length minimizers [97].
5. If $(M, d_{c-c})$ is a complete metric space for a Carnot–Carathéodory metric $d_{c-c}$, then any two points can be joined by a minimizing geodesic. In particular, this is true for a compact manifold $M$ [13, Theorem 2.7, p. 19 and Remark 2, p. 20].
6. On a sub-Riemannian manifold with a bracket generating distribution of step 2, any length minimizing curve is $C^\infty$-smooth, or in other words, there are no strictly abnormal minimizing geodesics in this case [112, Theorem 4].
7. For a generic (in the Whitney $C^\infty$ topology) bracket generating distribution of rank greater than or equal to three, there do not exist nontrivial minimizing singular curves [30].

At the end of this section we note that the sub-Riemannian exponential map is produced in the same form as in (2.10), where the initial velocity vector is horizontal. It is reflected in the following scheme:
\[ D_q \xrightarrow{\ \iota\ } D \xrightarrow{\ j\ } T^*M \xrightarrow{\ \Phi\ } T^*M \xrightarrow{\ \mathrm{pr}^*_M\ } M, \qquad \exp = \mathrm{pr}^*_M \circ \Phi \circ j \circ \iota, \tag{2.15} \]

where we have to replace the metric-dependent identification $\tilde g$ of $TM$ with $T^*M$ by some other map $j$ giving this identification. Unfortunately, not all good properties of the Riemannian exponential map are inherited. For instance, the sub-Riemannian exponential map is never a local diffeomorphism, since the map $j$ is not invertible for any $q \in M$.
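The flow $\Phi$ in the scheme above can be approximated numerically. The following sketch integrates the Hamiltonian equations (2.8) for the Heisenberg distribution $X_1 = \partial_x$, $X_2 = \partial_y + x\,\partial_z$ from the exercises of Subsection 2.2, with $X_1, X_2$ assumed orthonormal and the Hamiltonian normalized as in (2.12). The function names and the classical fourth-order Runge–Kutta scheme are choices made for this illustration and are not taken from the text.

```python
def hamiltonian(state):
    """H = 1/2 (<lambda, X1>^2 + <lambda, X2>^2) for the Heisenberg frame."""
    x, y, z, px, py, pz = state
    return 0.5 * (px ** 2 + (py + x * pz) ** 2)


def rhs(state):
    """Right-hand side of Hamilton's equations: (dH/dp, -dH/dq)."""
    x, y, z, px, py, pz = state
    u = px              # <lambda, X1>
    v = py + x * pz     # <lambda, X2>
    # velocity (u, v, x*v) = u*X1 + v*X2 is horizontal by construction
    return [u, v, x * v, -v * pz, 0.0, 0.0]


def rk4_step(state, h):
    """One classical Runge-Kutta step of size h."""
    def shifted(k, c):
        return [s + c * d for s, d in zip(state, k)]
    k1 = rhs(state)
    k2 = rhs(shifted(k1, h / 2))
    k3 = rhs(shifted(k2, h / 2))
    k4 = rhs(shifted(k3, h))
    return [s + h / 6 * (a + 2 * b + 2 * c + d)
            for s, a, b, c, d in zip(state, k1, k2, k3, k4)]


def bicharacteristic(state, t_end=1.0, steps=1000):
    """Approximate bi-characteristic on [0, t_end] starting from `state`."""
    h = t_end / steps
    path = [list(state)]
    for _ in range(steps):
        state = rk4_step(state, h)
        path.append(state)
    return path


# Start at the origin with co-vector (1, 0, 2); the values are arbitrary.
path = bicharacteristic([0.0, 0.0, 0.0, 1.0, 0.0, 2.0])
geodesic = [s[:3] for s in path]    # projection of the curve to M = R^3
print(geodesic[-1], hamiltonian(path[0]), hamiltonian(path[-1]))
```

The projection of the computed curve to the first three coordinates approximates a normal geodesic, and the value of the Hamiltonian should stay constant along the curve up to discretization error, in agreement with the conservation property recalled above.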


Exercises

1. Let $(M, D, g_D)$ be a sub-Riemannian manifold. Show that the co-metric $g^D$ is non-negative definite, symmetric, and smoothly varying with respect to the point $q \in M$.
2. Let $M = \mathbb{R}^3$ with coordinates $q = (x, y, z)$. Find a basis of the distribution $D = \ker\{\omega = x^2\,dy - (1 - x)\,dz\}$. (Check whether the basis $X = \frac{\partial}{\partial x}$, $Y = (1 - x)\frac{\partial}{\partial y} + x^2\frac{\partial}{\partial z}$ works.) Is $D$ bracket generating? Regular? Find the matrix of the sub-Riemannian metric $g_D$ making the vector fields $X$, $Y$ orthonormal. Find the sub-Riemannian Hamiltonian function $H$ generated by $g_D$ and the corresponding Hamiltonian system. It was shown in [97] that the curve $\gamma \colon [a, b] \to \mathbb{R}^3$, $\gamma(t) = (0, t, 0)$, is a length minimizer for the Carnot–Carathéodory distance if $(b - a)$ is small enough. Show that the curve $\gamma$ is not a bi-characteristic for the sub-Riemannian Hamiltonian function $H$. Conclude that the curve $\gamma$ is a length minimizer but not a normal geodesic.

3. Carnot groups

Let us consider a special example of smooth manifolds where the sub-Riemannian structure appears naturally.

3.1. Short introduction to Lie groups

The reader who is not familiar with Lie group theory is advised to start from Subsection 8.3 of Appendix A. A Lie group is an object that nicely combines algebraic, geometric, and analytic properties. Namely, a Lie group $G$ is a pair $(M, \rho)$, where
1. $M$ is a $C^\infty$-smooth manifold modeled on some (complete locally convex) vector space,
2. the map $\rho \colon M \times M \to M$ satisfies the axioms of the group product,
3. the map $\rho$ is compatible with the smooth manifold structure in the sense that the map $\rho \colon M \times M \to M$ is $C^\infty$-smooth as a map between the smooth manifold $M \times M$ and another smooth manifold $M$.

As usual in mathematics, we will write only $G$ instead of $(M, \rho)$ to denote the group and the underlying manifold $M$. Recall that a Lie algebra is a pair $(V, [\cdot\,, \cdot])$, where $V = (V, +)$ is a vector space over the field $\mathbb{R}$ or $\mathbb{C}$ and $[\cdot\,, \cdot]$ is the Lie product introduced in Definition 51, Appendix A. There is a close relation between Lie groups and Lie algebras.

From a Lie group to its Lie algebra. To define the Lie algebra $\mathfrak{g}$ of a Lie group $G$ we consider special vector fields on $G$. To describe this class of vector fields we introduce the action of the group on itself. We call the mappings
\[ l_\tau(q) := \rho(\tau, q) = \tau q, \qquad \tau \in G \text{ fixed}, \quad q \in G \text{ arbitrary}, \]


the left action of $G$ on itself and
\[ r_\tau(q) := \rho(q, \tau) = q\tau, \qquad \tau \in G \text{ fixed}, \quad q \in G \text{ arbitrary}, \]
the right action of $G$ on itself. Since the group multiplication and the inversion are smooth, the maps $l_\tau, r_\tau \colon G \to G$ are smooth diffeomorphisms of $G$. Their differentials $d_ql_\tau \colon T_qG \to T_{l_\tau(q)}G$ and $d_qr_\tau \colon T_qG \to T_{r_\tau(q)}G$ are linear maps of the respective tangent spaces.

Definition 14. A vector field $X$ on $G$ satisfying
\[ d_ql_\tau(X(q)) = X\bigl(l_\tau(q)\bigr) = X(\tau q) \qquad \Bigl(d_qr_\tau(X(q)) = X\bigl(r_\tau(q)\bigr) = X(q\tau)\Bigr) \]
for all $\tau, q \in G$ is called a left- (right-) invariant vector field.

The set of left invariant vector fields, considered as a vector space over the field $\mathbb{R}$ with the Lie product defined by the commutator of vector fields (2.1), forms a real Lie algebra $L$. Of course, one needs to verify that the commutator of left invariant vector fields is a left invariant vector field. Since any left invariant vector field is defined by its value at the identity $e \in G$ of the group, there is an isomorphism $\iota$ between the vector space $T_eG$ and $L$ defined by
\[ L \ni X \mapsto X(e) \in T_eG, \qquad T_eG \ni v \mapsto dl(v) \in L. \]

This isomorphism $\iota$ becomes an isomorphism of Lie algebras if we define the Lie bracket in $T_e G$ by $[X(e), Y(e)] := [X, Y](e)$. The Lie algebra $(T_e G, [\cdot\,,\cdot])$ is usually denoted by $\mathfrak{g}$ and is called the Lie algebra of the Lie group $G$. The Lie algebra $\mathcal{R}$ of right invariant vector fields is isomorphic to $\mathfrak{g}$ if we set $\mathcal{R} \ni [X, Y] \leftrightarrow -[X, Y](e) \in \mathfrak{g}$. The dual space to the space of left invariant vector fields consists of left invariant one-forms, and they satisfy the Maurer–Cartan equations, see [131].

The next question is to find a map between a given Lie group $G$ and its Lie algebra $\mathfrak{g}$. The answer is given in terms of the exponential map $\exp : \mathfrak{g} \to G$. There are essentially two ways to introduce the exponential map. The first one uses the property that any homomorphism of Lie algebras can be lifted to a homomorphism of the groups [87, 131] under some condition on the groups. The second one uses properties of solutions of ordinary differential equations [42].

The first way. Let $(\mathbb{R}, +)$ be the additive group of real numbers and $\mathfrak{r}$ be the corresponding Lie algebra with generator $\frac{d}{dr}$. Let $G$ be a Lie group, $\mathfrak{g}$ be its Lie algebra, and $X \in \mathfrak{g}$ be an arbitrary element. Then the map $h : \mathfrak{r} \to \mathfrak{g}$,
\[ \mathfrak{r} \ni t\,\frac{d}{dr} \mapsto tX \in \mathfrak{g}, \qquad t \in \mathbb{R}, \]
is a homomorphism from the Lie algebra $\mathfrak{r}$ into the Lie algebra $\mathfrak{g}$. Theorems of Lie group theory [87, 131] ensure that there is a unique Lie group homomorphism
\[ c_X : \mathbb{R} \to G \quad\text{such that}\quad dc_X = h, \quad\text{or}\quad dc_X\Big(t\,\frac{d}{dr}\Big) = tX. \]
In other words, the curve $c_X : \mathbb{R} \to G$ is a one-parameter subgroup of $G$ with $c_X(0) = e$ and $\dot c_X(0) = X$. Here we used that $\mathbb{R}$ is a simply connected group in order to construct the Lie group homomorphism $c_X$ from the Lie algebra homomorphism $h$.

The second way. Let $G$ be a Lie group, $\mathfrak{g}$ be its Lie algebra, and let $X \in \mathfrak{g}$ be an arbitrary left invariant vector field. Then the theory of ordinary differential equations guarantees that the solution of the Cauchy problem
\[ \begin{cases} \dfrac{dc_X(t)}{dt} = X(c_X(t)), \\[4pt] c_X(0) = e \end{cases} \]
is unique, is a one-parameter subgroup of $G$, and satisfies $\dot c_X(0) = X(e)$ [42].

Definition 15. The map $\mathfrak{g} \ni X \mapsto c_X(1) \in G$ is called the group exponential map and is denoted by $\exp$. Thus
\[ \exp : \mathfrak{g} \to G, \qquad X \mapsto c_X(1). \]
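For a matrix group the exponential curve can be written down explicitly. The following Python sketch is an illustration added here, not part of the original exposition. It assumes that the Lie algebra of the group of unipotent upper triangular $(3 \times 3)$ matrices is realized by strictly upper triangular matrices, for which the exponential series terminates after the quadratic term. The helper names `algebra_element`, `expm_nilpotent`, and `curve` are hypothetical.

```python
from fractions import Fraction

def matmul(a, b):
    """Product of two square matrices given as nested lists."""
    n = len(a)
    return [[sum(a[i][k] * b[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]

def algebra_element(x, y, t):
    """Strictly upper triangular 3x3 matrix, playing the role of X in the Lie algebra."""
    return [[Fraction(0), Fraction(x), Fraction(t)],
            [Fraction(0), Fraction(0), Fraction(y)],
            [Fraction(0), Fraction(0), Fraction(0)]]

def expm_nilpotent(a):
    """exp(A) = I + A + A^2/2, exact for 3x3 matrices with A^3 = 0."""
    a2 = matmul(a, a)
    return [[Fraction(int(i == j)) + a[i][j] + a2[i][j] / 2 for j in range(3)]
            for i in range(3)]

def curve(a, s):
    """Exponential curve c_X(s) = exp(sX)."""
    s = Fraction(s)
    return expm_nilpotent([[s * entry for entry in row] for row in a])

X = algebra_element(1, 2, 3)
identity = [[Fraction(int(i == j)) for j in range(3)] for i in range(3)]
s1, s2 = Fraction(1, 3), Fraction(5, 7)

print(curve(X, 0) == identity)                                  # c_X(0) = e
print(matmul(curve(X, s1), curve(X, s2)) == curve(X, s1 + s2))  # one-parameter subgroup
print(curve(X, 1))                                              # exp(X) = c_X(1)
```

Exact rational arithmetic is used so that the subgroup property $c_X(s_1) c_X(s_2) = c_X(s_1 + s_2)$ can be tested with equality rather than with a tolerance.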

We will call the curve $c_X(t)$, $t \in \mathbb{R}$, the exponential curve; it is customary to use also the notation $\exp(tX)$ instead of $c_X(t)$. The main properties of the exponential map are listed in Appendix A, Subsection 8.3. In this notation we write
\[ T_e G \ni \frac{d}{dt}\Big|_{t=0} \exp(tX) = \dot c_X(0) = X(e) \cong X \in \mathfrak{g}. \tag{3.1} \]

Let us assume now that the Lie algebra $\mathfrak{g}$ of a Lie group $G$ is endowed with an inner product $(\cdot\,,\cdot)$. Then, by making use of left translations, we can define a metric $g$ on the group. Namely, let $v_q, w_q \in T_q G$; then $d_q l_{q^{-1}}(v_q),\, d_q l_{q^{-1}}(w_q) \in T_e G$. We define
\[ g(v_q, w_q) := \big( d_q l_{q^{-1}}(v_q),\, d_q l_{q^{-1}}(w_q) \big) \tag{3.2} \]
for any $q \in G$. Using right translations we can also define a metric. Conversely, a Riemannian metric $g$ defined on a Lie group $G$, considered as a smooth manifold, is compatible with the Lie structure if it is invariant under the action of the group on itself.

Definition 16. A Riemannian metric $g$ on $G$ is called left invariant (right invariant) if for any $v_q, w_q \in T_q G$ and any $\tau \in G$ the following holds:
\[ g(v_q, w_q) = g\big(d_q l_\tau(v_q), d_q l_\tau(w_q)\big) = g(v_{\tau q}, w_{\tau q}) \]
\[ \Big( g(v_q, w_q) = g\big(d_q r_\tau(v_q), d_q r_\tau(w_q)\big) = g(v_{q\tau}, w_{q\tau}) \Big). \]


Exercises
1. Show that the following pairs are Lie groups.
   a. $(\mathbb{R}, +)$.
   b. $(\mathbb{R} \setminus \{0\}, \cdot)$, where "$\cdot$" is the product of real numbers.
   c. $(S^1, \cdot)$, where $S^1$ is the set of complex numbers of absolute value 1 and "$\cdot$" is the product of complex numbers. The group $(S^1, \cdot)$ is also denoted by $U(1)$ and is called the one-dimensional unitary group.
   d. $(M, \cdot)$, where $M$ is the set of $(3 \times 3)$ upper triangular real matrices
      \[ \begin{pmatrix} 1 & x & t \\ 0 & 1 & y \\ 0 & 0 & 1 \end{pmatrix}, \qquad x, y, t \in \mathbb{R}, \]
      and "$\cdot$" stands for the usual matrix product.
   A group is called compact if the underlying manifold is compact as a topological space. Which of the above-mentioned groups are compact?
2. Show that if $X, Y$ are left invariant (right invariant) vector fields on $G$, then the commutator $[X, Y]$ is also a left invariant (right invariant) vector field.
3. Find the Lie algebras corresponding to the Lie groups mentioned in the first exercise. Describe the left invariant vector fields.
4. Show that the metric from (3.2) is a left invariant metric on the group.

3.2. Heisenberg group

We start from the simplest example of a sub-Riemannian manifold, which is called the Heisenberg group.

3.2.1. The Heisenberg sub-Riemannian manifold. Consider the smooth manifold $\mathbb{R}^3$ with coordinates $q = (x, y, t)$. Then $T_q\mathbb{R}^3 = \operatorname{span}\{\partial_x, \partial_y, \partial_t\}$ and $T_q^*\mathbb{R}^3 = \operatorname{span}\{dx, dy, dt\}$. We define the smooth two-dimensional distribution $D$ as the span of the two vector fields
\[ X = \partial_x - \tfrac12 y\,\partial_t, \qquad Y = \partial_y + \tfrac12 x\,\partial_t. \tag{3.3} \]
See Figure 3.1. Let us find a Riemannian metric $g$ in coordinates $(x, y, t)$ making $X$, $Y$, and $T = [X, Y] = \partial_t$ orthonormal. So we have $g(X, X) = g(Y, Y) = g(T, T) = 1$ and the other values vanish. We express the basis $(\partial_x, \partial_y, \partial_t)$ in the form
\[ \partial_x = X + \tfrac12 y\,T, \qquad \partial_y = Y - \tfrac12 x\,T, \qquad \partial_t = T. \]
Then by making use of the bi-linearity of $g$ we get
\[ g_{11} = g(\partial_x, \partial_x) = 1 + \frac{y^2}{4}, \qquad g_{12} = g(\partial_x, \partial_y) = -\frac{xy}{4}, \qquad g_{13} = g(\partial_x, \partial_t) = \frac{y}{2}, \qquad \ldots \]
The matrix $\{g_{ij}\}$ takes the form
\[ g_{ij} = \begin{pmatrix} 1 + \frac{y^2}{4} & -\frac{xy}{4} & \frac{y}{2} \\[2pt] -\frac{xy}{4} & 1 + \frac{x^2}{4} & -\frac{x}{2} \\[2pt] \frac{y}{2} & -\frac{x}{2} & 1 \end{pmatrix}. \tag{3.4} \]
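The matrix (3.4) can be checked mechanically. The Python sketch below is an added illustration with hypothetical helper names (`metric`, `frame`, `pair`, `det3`). It evaluates the matrix at a sample point in exact rational arithmetic, confirms that $X, Y, T$ are orthonormal with respect to it, and computes its determinant.

```python
from fractions import Fraction

def metric(x, y):
    """Matrix (3.4) in the coordinates (x, y, t); it does not depend on t."""
    x, y = Fraction(x), Fraction(y)
    return [
        [1 + y * y / 4, -x * y / 4, y / 2],
        [-x * y / 4, 1 + x * x / 4, -x / 2],
        [y / 2, -x / 2, Fraction(1)],
    ]

def frame(x, y):
    """Components of X, Y, T from (3.3) in the basis (d/dx, d/dy, d/dt)."""
    x, y = Fraction(x), Fraction(y)
    return {"X": [1, 0, -y / 2], "Y": [0, 1, x / 2], "T": [0, 0, 1]}

def pair(g, v, w):
    """Value g(v, w) of the bilinear form with matrix g."""
    return sum(v[i] * g[i][j] * w[j] for i in range(3) for j in range(3))

def det3(m):
    """Determinant of a 3x3 matrix by cofactor expansion along the first row."""
    return (m[0][0] * (m[1][1] * m[2][2] - m[1][2] * m[2][1])
            - m[0][1] * (m[1][0] * m[2][2] - m[1][2] * m[2][0])
            + m[0][2] * (m[1][0] * m[2][1] - m[1][1] * m[2][0]))

g = metric(3, -5)
f = frame(3, -5)
gram = [[pair(g, f[a], f[b]) for b in "XYT"] for a in "XYT"]
print(gram)      # Gram matrix of (X, Y, T) at the sample point
print(det3(g))   # determinant of (3.4) at the sample point
```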


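Figure 3.1. The Heisenberg distribution.

Notice that $\det g = 1$. It implies that the volume form in $(\mathbb{R}^3, g)$ is given by the standard Lebesgue measure $dx \wedge dy \wedge dt$. The distribution $D$ is bracket generating of step 2, since $[X, Y] = \partial_t =: T$ and $T_q\mathbb{R}^3 = \operatorname{span}\{X, Y, T\}$. Moreover, the distribution is strongly bracket generating and we will be interested in normal geodesics only (by Theorem 2.3). The dual basis to $X, Y, T$ is
\[ dx, \qquad dy, \qquad \omega = dt - \tfrac12 x\,dy + \tfrac12 y\,dx. \]
(Verify it!) The form $\omega$ is the annihilator of the distribution $D$ and $D^\perp = \operatorname{span}\{\omega\}$; therefore, we can also define the distribution $D$ as
\[ D_q = \ker(\omega_q) = \{ v \in T_q\mathbb{R}^3 \mid \omega_q(v) = 0 \}. \]
Define the sub-Riemannian metric $g_D$ as the restriction of the metric $g$ to the planes $D_q$ for all $q \in \mathbb{R}^3$. Then $(\mathbb{R}^3, D, g_D)$ is the Heisenberg sub-Riemannian manifold.

To find the normal geodesics on the Heisenberg sub-Riemannian manifold we write $\lambda = \xi\,dx + \eta\,dy + \theta\,dt$ for any co-vector $\lambda$. Since the basis $X, Y$ is orthonormal, the Hamiltonian function is
\[ H(q, \lambda) = \tfrac12\big( \langle \lambda, X \rangle^2 + \langle \lambda, Y \rangle^2 \big) = \tfrac12\big(\xi - \tfrac12\theta y\big)^2 + \tfrac12\big(\eta + \tfrac12\theta x\big)^2. \]
The Hamiltonian system and the initial conditions are
\[ \begin{cases} \dot x = \xi - \tfrac12\theta y \\ \dot y = \eta + \tfrac12\theta x \\ \dot t = \tfrac12(\eta x - \xi y) + \tfrac14\theta(x^2 + y^2) \\ \dot\xi = -\tfrac12\eta\theta - \tfrac14\theta^2 x \\ \dot\eta = \tfrac12\xi\theta - \tfrac14\theta^2 y \\ \dot\theta = 0, \end{cases} \qquad \begin{aligned} &x(0) = y(0) = t(0) = 0, \\ &\xi(0) = \xi_0, \quad \eta(0) = \eta_0, \quad \theta(0) = \theta_0. \end{aligned} \tag{3.5} \]
Here the fifth equation is $\dot\eta = -\partial H/\partial y$, which carries the sign needed for the second equation of (3.6).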


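We need projections of the bi-characteristics of $H$ onto $\mathbb{R}^3$; therefore, we try to reduce the Hamiltonian system to a system containing only the variables $(x, y, t)$. We differentiate the first two equations and substitute $\dot\xi$ and $\dot\eta$ from the fourth and fifth equations. It gives
\[ \begin{cases} \ddot x = -\theta_0 \dot y \\ \ddot y = \theta_0 \dot x \end{cases} \qquad\text{or}\qquad \begin{pmatrix} \ddot x \\ \ddot y \end{pmatrix} = \theta_0 \begin{pmatrix} 0 & -1 \\ 1 & 0 \end{pmatrix} \begin{pmatrix} \dot x \\ \dot y \end{pmatrix}. \tag{3.6} \]
Then, multiplying the first equation of (3.5) by $y$ and the second by $x$, we notice that the third equation is equivalent to the condition
\[ \dot t(s) = \tfrac12\big( x(s)\dot y(s) - y(s)\dot x(s) \big). \tag{3.7} \]
The solution $\gamma(s) = \big(x(s), y(s), t(s)\big)$ satisfying $\dot x(0) = \xi_0$, $\dot y(0) = \eta_0$ is
\[ \begin{aligned} x(s) &= \frac{\xi_0}{\theta_0}\sin(\theta_0 s) + \frac{\eta_0}{\theta_0}\big(\cos(\theta_0 s) - 1\big), \\ y(s) &= -\frac{\xi_0}{\theta_0}\big(\cos(\theta_0 s) - 1\big) + \frac{\eta_0}{\theta_0}\sin(\theta_0 s), \\ t(s) &= \frac{\xi_0^2 + \eta_0^2}{2\theta_0^2}\big(\theta_0 s - \sin(\theta_0 s)\big), \end{aligned} \qquad\text{if } \theta_0 \neq 0, \tag{3.8} \]
and
\[ x(s) = \xi_0 s, \qquad y(s) = \eta_0 s, \qquad t(s) = 0, \qquad\text{if } \theta_0 = 0. \tag{3.9} \]
The graph is in Figure 3.2.

Figure 3.2. Geodesics on the Heisenberg group.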
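As a numerical cross-check, which is an added illustration and not part of the original text, one can integrate the Hamiltonian system (3.5) with a classical Runge–Kutta scheme and compare the projection to $(x, y, t)$ with a closed form. The sketch assumes $\dot\eta = -\partial H/\partial y = \tfrac12\xi\theta - \tfrac14\theta^2 y$ and writes the closed form with $\theta_0$ of either sign, obtained from $\dot x + i\dot y = (\xi_0 + i\eta_0)e^{i\theta_0 s}$. The function names are hypothetical.

```python
import math

def hamiltonian_rhs(state):
    """Right-hand side of the Hamiltonian system for H = (px^2 + py^2) / 2."""
    x, y, t, xi, eta, theta = state
    px = xi - 0.5 * theta * y    # <lambda, X>
    py = eta + 0.5 * theta * x   # <lambda, Y>
    return (
        px,                        # dx/ds  =  dH/dxi
        py,                        # dy/ds  =  dH/deta
        0.5 * (x * py - y * px),   # dt/ds  =  dH/dtheta
        -0.5 * theta * py,         # dxi/ds  = -dH/dx
        0.5 * theta * px,          # deta/ds = -dH/dy
        0.0,                       # dtheta/ds = -dH/dt
    )

def rk4(state, h, steps):
    """Classical fourth-order Runge-Kutta integration with fixed step h."""
    for _ in range(steps):
        k1 = hamiltonian_rhs(state)
        k2 = hamiltonian_rhs(tuple(s + 0.5 * h * k for s, k in zip(state, k1)))
        k3 = hamiltonian_rhs(tuple(s + 0.5 * h * k for s, k in zip(state, k2)))
        k4 = hamiltonian_rhs(tuple(s + h * k for s, k in zip(state, k3)))
        state = tuple(s + h / 6.0 * (a + 2 * b + 2 * c + d)
                      for s, a, b, c, d in zip(state, k1, k2, k3, k4))
    return state

def closed_form(xi0, eta0, theta0, s):
    """Geodesic from the origin with covector (xi0, eta0, theta0) at parameter s."""
    if theta0 == 0:
        return (xi0 * s, eta0 * s, 0.0)
    phi = theta0 * s
    x = (xi0 * math.sin(phi) + eta0 * (math.cos(phi) - 1.0)) / theta0
    y = (eta0 * math.sin(phi) - xi0 * (math.cos(phi) - 1.0)) / theta0
    t = (xi0 ** 2 + eta0 ** 2) * (phi - math.sin(phi)) / (2.0 * theta0 ** 2)
    return (x, y, t)

for theta0 in (2.0, -1.5, 0.0, 2.0 * math.pi):
    end = rk4((0.0, 0.0, 0.0, 0.3, -1.1, theta0), 1.0 / 2000, 2000)
    print(theta0, end[:3], closed_form(0.3, -1.1, theta0, 1.0))
```

For $\theta_0 = 2\pi$ the printed end point lies on the $t$-axis up to rounding, which is the situation of geodesics joining the origin to a point $(0, 0, t_1)$.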


Let us look at the condition (3.7) of the Hamiltonian equation. To understand this equation better, let us calculate the velocity vector of any curve in the basis $X, Y, T$. Let $c(s) = (x(s), y(s), t(s))$, $s \in I$, be a curve; then
\[ \begin{aligned} \dot c(s) &= \dot x(s)\partial_x + \dot y(s)\partial_y + \dot t(s)\partial_t \\ &= \dot x(s)\big(\partial_x - \tfrac12 y(s)\partial_t\big) + \dot y(s)\big(\partial_y + \tfrac12 x(s)\partial_t\big) + \big(\dot t(s) + \tfrac12\dot x(s)y(s) - \tfrac12\dot y(s)x(s)\big)\partial_t \\ &= \dot x(s)X(c(s)) + \dot y(s)Y(c(s)) + \big(\dot t(s) + \tfrac12\dot x(s)y(s) - \tfrac12\dot y(s)x(s)\big)T(c(s)). \end{aligned} \]
For $c$ to be a horizontal curve, the coordinate of $\dot c$ in front of the vector field $T$ has to vanish for all $s \in I$, which leads to equation (3.7). Thus, the third equation of the Hamiltonian system (3.5) is just the horizontality condition, which is not surprising due to the general fact (2.13).

Observe the following relations between the initial values for the Hamiltonian system and the initial velocity vector: $\xi_0 = \dot x(0)$, $\eta_0 = \dot y(0)$. The values $\dot x(0)$ and $\dot y(0)$ and the initial point completely define the initial velocity $\dot t(0)$. So, fixing the initial velocity, we still have countably many geodesics starting from the origin, parametrized by the parameter $\theta_0$, that connect the origin with points $(0, 0, t)$ of the $t$-axis. To see this, we parametrize all geodesics starting from the origin with a fixed initial velocity $(\xi_0, \eta_0)$ on the unit interval $[0, 1]$. Denote by $x_1 = x(1)$, $y_1 = y(1)$ the coordinates of the final point of these geodesics. We assume $x_1^2 + y_1^2 = 0$, which means that the geodesics connect the origin with a point $(0, 0, t_1)$ on the $t$-axis, as shown in Figure 3.2. Then
\[ 0 = x_1^2 + y_1^2 = \frac{4\sin^2(|\theta_0|/2)}{|\theta_0|^2}\,(\xi_0^2 + \eta_0^2) \quad\Longrightarrow\quad |\theta_0| = 2\pi n, \quad n \in \mathbb{N}, \]
by solutions (3.8). The corresponding value of $t_1$ satisfies $|t_1| = \frac{1}{4\pi n}(\xi_0^2 + \eta_0^2)$. The complete description of geodesics on the Heisenberg sub-Riemannian manifold can be found in [20, 77, 88]. The geodesics (3.8), (3.9) are local length minimizers by Theorem 2.3. The length of a sub-Riemannian geodesic $\gamma : I = [0, T] \to \mathbb{R}^3$ is
\[ \operatorname{length}(\gamma) = \int_0^T \big(\dot x(s)^2 + \dot y(s)^2\big)^{1/2}\,ds = (\xi_0^2 + \eta_0^2)^{1/2}\,T. \]

If there are several sub-Riemannian geodesics connecting the origin with a point $q \in \mathbb{R}^3$ and they are parametrized by the arc length, then the curve of the smallest length is the minimizing curve, realizing the Carnot–Carathéodory distance between the origin and $q$.

3.2.2. Heisenberg sub-Riemannian manifold as a Lie group. Let us consider the following non-commutative group structure on the smooth manifold $\mathbb{R}^3$. Define the product of $\tau = (x, y, t)$ and $q = (x_1, y_1, t_1)$ by
\[ \tau q = (x, y, t)(x_1, y_1, t_1) = \big(x + x_1,\; y + y_1,\; t + t_1 + \tfrac12(x y_1 - x_1 y)\big). \tag{3.10} \]


As a motivation for this law one can consider the product of $(4 \times 4)$ real matrices
\[ \begin{pmatrix} 1 & x & y & t \\ 0 & 1 & 0 & \frac{y}{2} \\ 0 & 0 & 1 & -\frac{x}{2} \\ 0 & 0 & 0 & 1 \end{pmatrix} \cdot \begin{pmatrix} 1 & x_1 & y_1 & t_1 \\ 0 & 1 & 0 & \frac{y_1}{2} \\ 0 & 0 & 1 & -\frac{x_1}{2} \\ 0 & 0 & 0 & 1 \end{pmatrix} = \begin{pmatrix} 1 & x + x_1 & y + y_1 & t + t_1 + \frac12(x y_1 - x_1 y) \\ 0 & 1 & 0 & \frac{y + y_1}{2} \\ 0 & 0 & 1 & -\frac{x + x_1}{2} \\ 0 & 0 & 0 & 1 \end{pmatrix}, \]
which leads to formula (3.10). It is an easy exercise to verify that the product (3.10) satisfies the group axioms. The identity $e$ of the obtained group has coordinates $(0, 0, 0)$ and the inverse element to $(x, y, t)$ is $(-x, -y, -t)$. The pair consisting of the smooth manifold $\mathbb{R}^3$ and the introduced group law is called the Heisenberg group and is denoted by $\mathbb{H}^1$.

This group law defines the left translation $l_\tau(q) = \tau q$. The left translation $l_\tau$ by $\tau = (x, y, t)$ has the differential at $e = (0, 0, 0)$, written in coordinates $(x, y, t)$ as
\[ d_e l_\tau = \begin{pmatrix} 1 & 0 & 0 \\ 0 & 1 & 0 \\ -\frac12 y & \frac12 x & 1 \end{pmatrix}. \]
The action of $d_e l_\tau$ on the basis $(\partial_x, \partial_y, \partial_t)$, which coincides with $(X, Y, T)$ at $e$, gives the basis $(X, Y, T)$ at $\tau$. We conclude that $(X, Y, T)$ is just the basis of left invariant vector fields on the group $\mathbb{H}^1$. They form the famous Heisenberg algebra $\mathfrak{h}^1$, which is by definition a three-dimensional Lie algebra spanned by $X, Y, T$ with only one non-trivial commutator, $[X, Y] = T$; all other commutators vanish. We use the identification of the Lie algebra of left invariant vector fields with $T_e\mathbb{H}^1$. The exponential map is a global diffeomorphism [50] in this case. The coordinates on the group $\mathbb{H}^1$ are of the first kind and are given by
\[ \mathbb{H}^1 \ni q = (x, y, t) = \exp(xX + yY + tT), \qquad xX + yY + tT \in \mathfrak{h}^1. \]
The inverse map to the exponential restores the group multiplication law from the commutation relations of the Heisenberg algebra in the following way. Let $V = xX + yY + tT \in \mathfrak{h}^1$, $V_1 = x_1 X + y_1 Y + t_1 T \in \mathfrak{h}^1$ and $\tau = \exp(V)$, $q = \exp(V_1)$; then by the Baker–Campbell–Hausdorff formula (8.1) (BCH-formula for short) we obtain
\[ \begin{aligned} \tau q = \exp(V)\exp(V_1) &= \exp\big(V + V_1 + \tfrac12[V, V_1] + \cdots\big) \\ &= \exp\big((x + x_1)X + (y + y_1)Y + (t + t_1)T + \tfrac12(x y_1 - x_1 y)T\big) \\ &= \big(x + x_1,\; y + y_1,\; t + t_1 + \tfrac12(x y_1 - x_1 y)\big), \end{aligned} \]
which coincides with (3.10). The series terminates because all commutators of order higher than two vanish in $\mathfrak{h}^1$.
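The relation between the group law (3.10) and the $(4 \times 4)$ matrix product is easy to test by machine. The following Python sketch is an added illustration; the names `mul`, `inv`, `to_matrix`, and `matmul` are hypothetical helpers and exact rational arithmetic is assumed so that equalities hold literally.

```python
from fractions import Fraction

HALF = Fraction(1, 2)

def mul(p, q):
    """Group law (3.10) on triples (x, y, t)."""
    x, y, t = p
    x1, y1, t1 = q
    return (x + x1, y + y1, t + t1 + HALF * (x * y1 - x1 * y))

def inv(p):
    """Inverse element (-x, -y, -t)."""
    return (-p[0], -p[1], -p[2])

def to_matrix(p):
    """The 4x4 matrix that motivates the law (3.10)."""
    x, y, t = p
    return [[1, x, y, t],
            [0, 1, 0, HALF * y],
            [0, 0, 1, -HALF * x],
            [0, 0, 0, 1]]

def matmul(a, b):
    """Product of two square matrices given as nested lists."""
    n = len(a)
    return [[sum(a[i][k] * b[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]

p, q, r = (1, 2, 3), (-4, 5, 7), (2, -1, 6)
print(to_matrix(mul(p, q)) == matmul(to_matrix(p), to_matrix(q)))  # matrix model
print(mul(mul(p, q), r) == mul(p, mul(q, r)))                      # associativity
print(mul(p, inv(p)) == (0, 0, 0))                                 # inverse
print(mul(p, q), mul(q, p))                                        # non-commutativity
```

Only the last coordinate distinguishes `mul(p, q)` from `mul(q, p)`, in agreement with the fact that the commutator $[X, Y] = T$ points in the $t$-direction.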


There is a norm $\|\cdot\|_{\mathbb{H}^1}$ on the group $\mathbb{H}^1$ which is a direct analogue of the Euclidean norm in $\mathbb{R}^3$. It is defined by
\[ \|\tau\|_{\mathbb{H}^1} = \big((x^2 + y^2)^2 + t^2\big)^{1/4}, \qquad \tau = (x, y, t). \tag{3.11} \]
If we stretch the basic elements $X$ and $Y$ of the Heisenberg algebra by a number $s > 0$, then the bi-linearity of the commutator implies $[sX, sY] = s^2 T$. Making use of the BCH-formula we get the dilation $\delta_s$ on the group:
\[ \delta_s(\tau) = \delta_s(x, y, t) = (sx, sy, s^2 t). \tag{3.12} \]

This dilation, which is called the homogeneous dilation, is compatible with the norm in the sense that the norm becomes a homogeneous function of order one:
\[ \|\delta_s(\tau)\|_{\mathbb{H}^1} = \|(sx, sy, s^2 t)\|_{\mathbb{H}^1} = s\,\|\tau\|_{\mathbb{H}^1}. \]
Compare this situation with the Euclidean norm and the usual dilation in $\mathbb{R}^3$! The Heisenberg distance function $d_{\mathbb{H}^1}$ is
\[ d_{\mathbb{H}^1}(\tau, q) = \|\tau^{-1} q\|_{\mathbb{H}^1}. \]
By Exercise 2, the Heisenberg distance $d_{\mathbb{H}^1}$ and the Carnot–Carathéodory distance $d_{c-c}$ are equivalent.

Example 5. Let us show that the Heisenberg distance and the Euclidean distance $d_E$ are not Lipschitz equivalent, even locally in $\mathbb{R}^3$. Take two points $e = (0, 0, 0)$ and $q = (0, 0, t)$. Then
\[ d_{\mathbb{H}^1}(e, q) = |t|^{1/2}, \qquad d_E(e, q) = |t|, \]
so the ratio $d_{\mathbb{H}^1}(e, q)/d_E(e, q) = |t|^{-1/2}$ is unbounded as $t \to 0$, which shows non-equivalence of the distance functions. This also proves that the metric spaces $(\mathbb{R}^3, d_E)$ and $(\mathbb{R}^3, d_{\mathbb{H}^1})$ are not equivalent. But the topological spaces $(\mathbb{R}^3, \tau_E)$ and $(\mathbb{R}^3, \tau_{\mathbb{H}^1})$ are equivalent, since any Heisenberg ball contains a Euclidean ball and vice versa. We also present the picture of balls in different metrics in Figure 3.4.

Figure 3.3. The Heisenberg and the Euclidean balls inside each other.
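The homogeneity of the norm (3.11) under the dilation (3.12) and the failure of Lipschitz equivalence along the $t$-axis can be observed numerically. The Python sketch below is an added illustration in floating-point arithmetic; all function names are hypothetical.

```python
import math

def mul(p, q):
    """Group law (3.10)."""
    x, y, t = p
    x1, y1, t1 = q
    return (x + x1, y + y1, t + t1 + 0.5 * (x * y1 - x1 * y))

def inv(p):
    """Inverse element (-x, -y, -t)."""
    return (-p[0], -p[1], -p[2])

def norm_h(p):
    """Homogeneous norm (3.11)."""
    x, y, t = p
    return ((x * x + y * y) ** 2 + t * t) ** 0.25

def dilate(s, p):
    """Homogeneous dilation (3.12)."""
    x, y, t = p
    return (s * x, s * y, s * s * t)

def dist_h(p, q):
    """Heisenberg distance, the norm of p^{-1} q."""
    return norm_h(mul(inv(p), q))

def dist_e(p, q):
    """Euclidean distance in R^3."""
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(p, q)))

e = (0.0, 0.0, 0.0)
p = (0.3, -1.2, 0.7)
print(norm_h(dilate(3.0, p)), 3.0 * norm_h(p))   # homogeneity of order one

for t in (1e-2, 1e-4, 1e-6):
    q = (0.0, 0.0, t)
    print(t, dist_h(e, q) / dist_e(e, q))         # ratio grows like t ** -0.5
```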


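Figure 3.4. The Euclidean, Heisenberg, and Carnot–Carathéodory balls.

The metric with the matrix (3.4) is a left invariant metric on $\mathbb{H}^1$. The distribution $D = \operatorname{span}\{X, Y\}$, where $X, Y$ are defined in (3.3), can itself be called left invariant, since it is completely defined by $X(e) = \partial_x$, $Y(e) = \partial_y$, and $D_\tau = dl_\tau D_e$. The differential operator
\[ \Delta_{sR} = X^2 + Y^2 = \frac{\partial^2}{\partial x^2} + \frac{\partial^2}{\partial y^2} + \frac14(x^2 + y^2)\frac{\partial^2}{\partial t^2} + \Big(x\frac{\partial}{\partial y} - y\frac{\partial}{\partial x}\Big)\frac{\partial}{\partial t} \tag{3.13} \]
is called the sub-Laplacian. It is an analogue of the Laplace–Beltrami operator $\Delta = \frac{\partial^2}{\partial x^2} + \frac{\partial^2}{\partial y^2} + \frac{\partial^2}{\partial t^2}$ in $\mathbb{R}^3$ with respect to the Euclidean metric. Observe that the homogeneous function $N(\tau) = \big((x^2 + y^2)^2 + t^2\big)^{1/4}$ for $\tau = (x, y, t) \in \mathbb{H}^1$ is connected to the fundamental solution $\Gamma(\tau)$ of the sub-Laplacian (3.13) as follows:
\[ \Gamma(\tau) = \frac{c(Q)}{N(\tau)^{Q-2}}. \]
The constant $c(Q) < 0$ can be calculated explicitly, and $Q = 4$ is the Hausdorff dimension of the metric space $(\mathbb{H}^1, d_{\mathbb{H}^1})$, see [49].

Let us present the formulas for the gradient on $\mathbb{H}^1$ in coordinates. In order to use formula (2.4), we calculate the inverse matrix to (3.4):
\[ g^{ij} = \begin{pmatrix} 1 & 0 & -\frac{y}{2} \\[2pt] 0 & 1 & \frac{x}{2} \\[2pt] -\frac{y}{2} & \frac{x}{2} & 1 + \frac{x^2 + y^2}{4} \end{pmatrix}. \]
Then
\[ g^{ij}\begin{pmatrix} \partial_x f \\ \partial_y f \\ \partial_t f \end{pmatrix} = \begin{pmatrix} \partial_x f - \frac{y}{2}\partial_t f \\[2pt] \partial_y f + \frac{x}{2}\partial_t f \\[2pt] \frac{x\,\partial_y f - y\,\partial_x f}{2} + \big(1 + \frac{x^2 + y^2}{4}\big)\partial_t f \end{pmatrix} = \begin{pmatrix} Xf \\ Yf \\ \frac{x}{2}Yf - \frac{y}{2}Xf + Tf \end{pmatrix}. \]
Thus
\[ \operatorname{grad} f = g^{ij}\begin{pmatrix} \partial_x f \\ \partial_y f \\ \partial_t f \end{pmatrix} \cdot \begin{pmatrix} \partial_x \\ \partial_y \\ \partial_t \end{pmatrix} = Xf\,X + Yf\,Y + Tf\,T. \]
The horizontal gradient "$\operatorname{grad}_D$" is the projection of "$\operatorname{grad}$" onto $D = \operatorname{span}\{X, Y\}$ and it is written as $\operatorname{grad}_D f = (Xf, Yf)$ in the left invariant basis $X, Y$ of $D$.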


3.2.3. Heisenberg group and isoperimetric problem. Let us recall the ancient story of Dido, or Elissa in the Greek version, the founder and the first Queen of Carthage (in modern-day Tunisia). She was the daughter of the king of Tyre and, after intrigues of her brother Pygmalion that endangered her life, she had to leave her land. Eventually Elissa and her followers arrived at the coast of North Africa, where Elissa asked the local inhabitants for a small piece of land for a temporary refuge until she could continue her journey. She was allowed to have only as much land as could be encompassed by an oxhide. Elissa cut the oxhide into thin strips so that she had enough to encircle an entire nearby hill. According to this legend, Elissa was the first person who solved the isoperimetric problem of enclosing the maximum area within a boundary of a fixed length. The dual problem is to find a curve of minimal length enclosing a fixed area.

Let us formulate this problem mathematically. Introduce the coordinates $(x, y)$ on the plane $\mathbb{R}^2$ and let $c(s) = (x(s), y(s))$, $s \in I$, be a closed curve in $\mathbb{R}^2$ that encloses a bounded domain $\Omega$. Then the area $A$ of $\Omega$ can be calculated as
\[ A = \int_\Omega dA = \int_c \tfrac12(x\,dy - y\,dx) \]
by the Stokes theorem. Here the area form $dA = dx \wedge dy$ is the differential of the one-form $\tfrac12(x\,dy - y\,dx)$. The variational problem with constraint is formulated as follows:

Find a closed curve $c : I \to \mathbb{R}^2$ of minimal length $\int_c \sqrt{\dot x^2(s) + \dot y^2(s)}\,ds$, such that the area $A = \tfrac12\int_c (x\,dy - y\,dx)$ enclosed by this curve is fixed.

Let us introduce the third coordinate $t$ that will reflect the change of the area swept by the curve $c(s) = (x(s), y(s))$, $s \in I$, i.e.,
\[ \dot t(s) = \tfrac12\big(x(s)\dot y(s) - y(s)\dot x(s)\big) \qquad\text{for all } s \in I. \tag{3.14} \]
We associate the family of curves $\gamma : I \to \mathbb{R}^3$, $\gamma(s) = \big(x(s), y(s), t(s)\big)$, to a single planar curve $c(s) = (x(s), y(s))$, $s \in I$, in such a way that we obey the constraint (3.14).
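The constraint (3.14) can be explored numerically. The Python sketch below is an added illustration with hypothetical names: it lifts a circle of radius $r$ passing through the origin, integrates (3.14) over one full turn by the midpoint rule, and compares the resulting height $t$ with the enclosed area $\pi r^2$ and the planar length with $2\pi r$.

```python
import math

def lift_height(curve, velocity, a, b, n=2000):
    """Integrate t' = (x y' - y x') / 2 along a planar curve (midpoint rule)."""
    h = (b - a) / n
    total = 0.0
    for k in range(n):
        s = a + (k + 0.5) * h
        x, y = curve(s)
        dx, dy = velocity(s)
        total += 0.5 * (x * dy - y * dx) * h
    return total

def planar_length(velocity, a, b, n=2000):
    """Euclidean length of the planar curve (midpoint rule)."""
    h = (b - a) / n
    return sum(math.hypot(*velocity(a + (k + 0.5) * h)) * h for k in range(n))

R = 1.5  # radius of the sample circle, an arbitrary choice

def circle(s):
    """Counter-clockwise circle of radius R through the origin."""
    return (R * math.sin(s), R * (1.0 - math.cos(s)))

def circle_velocity(s):
    """Derivative of `circle` with respect to s."""
    return (R * math.cos(s), R * math.sin(s))

t_end = lift_height(circle, circle_velocity, 0.0, 2.0 * math.pi)
length = planar_length(circle_velocity, 0.0, 2.0 * math.pi)
print(t_end, math.pi * R ** 2)    # height gained vs enclosed area
print(length, 2.0 * math.pi * R)  # length of the planar curve
```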
Integrating condition (3.14), we get
\[ t - t_0 = \frac12\int_c \big(x(s)\dot y(s) - y(s)\dot x(s)\big)\,ds, \]
which means that the area enclosed by the planar curve $c$ and the straight line connecting the end of $c$ with the origin is equal to the change of the vertical coordinate of $\gamma$ (here we assumed $t_0 = 0$), see Figure 3.5. Another desirable condition is to find a Riemannian metric $g$ in $\mathbb{R}^3$ such that the length of $\gamma : I \to \mathbb{R}^3$ is equal to the length of the planar curve $c$. In order to satisfy it, we find a distribution $D$ of planes in $\mathbb{R}^3$ such that $\gamma$ is tangent to $D$ and the length of the vector $\dot c(s) = (\dot x(s), \dot y(s))$ in $\mathbb{R}^2$ coincides with the length of the vector $\dot\gamma(s) = (\dot x(s), \dot y(s), \dot t(s)) \in D_{\gamma(s)} \subset \mathbb{R}^3$. In this case we only need the restriction $g_D$ of the Riemannian metric $g$ given in (3.4) to the planes $D_{\gamma(s)}$, which will be the sub-Riemannian metric. Thus the distribution $D$ has to be annihilated by the form dual to the vector field $T$. In this case the third coordinate $\dot t - \tfrac12(x\dot y - y\dot x)$ of the velocity vector $\dot\gamma$ written in the basis $X, Y, T$ will vanish. So
\[ D(x, y, t) = \ker(\omega) = \ker\big(dt - \tfrac12(x\,dy - y\,dx)\big), \]
and the sub-Riemannian metric $g_D$ is just the Euclidean metric on $D$ making the basis of $D$ orthonormal. The reader may recognize the Heisenberg manifold $(\mathbb{R}^3, D, g_D)$ described in the first part of Subsection 3.2. More information about the relation between the isoperimetric problems and the Heisenberg groups can be found in [6, 22].

Figure 3.5. The sub-Riemannian length of the curve is equal to the area of the projection.

3.2.4. Variational equation on the Heisenberg group. For the sake of completeness we would like to mention the variational equation for geodesics on $\mathbb{H}^1$ obtained in [122]. Let $g$ be a left invariant Riemannian metric on the Heisenberg group, such that the left invariant vector fields $X, Y, T$ are orthonormal at each point $q \in \mathbb{H}^1$. We emphasize that it is a Riemannian, but not a sub-Riemannian metric. Let $\nabla$ be the Levi-Civita connection associated with $g$. Let $J$ be an almost complex structure on $D$ defined as in (3.17). To formulate the result we also need to introduce the set of admissible curves for the variational problem. Recall that we are looking for a horizontal curve $c$ connecting $q_0 \in \mathbb{H}^1$ and $q_1 \in \mathbb{H}^1$ and minimizing the length functional
\[ l(c) = \int_c \big(g(\dot c(s), \dot c(s))\big)^{1/2}\,ds. \tag{3.15} \]

The set of admissible curves for variation is just the set of horizontal curves connecting the points q0 and q1 . Since all considered curves are horizontal we can


change the Riemannian metric $g$ in (3.15) to a sub-Riemannian metric $g_D$ obtained by restriction of $g$ to $D$ at each point of the Heisenberg group.

Theorem 3.1 ([122]). Let $\gamma : I \to \mathbb{H}^1$ be a smooth ($C^2$) horizontal curve parametrized by the arc length. The curve $\gamma$ is a critical point of the length functional (2.3) for any admissible variation if and only if there is $\kappa \in \mathbb{R}$ such that $\gamma$ satisfies the second-order ordinary differential equation
\[ \nabla_{\dot\gamma}\dot\gamma + 2\kappa J(\dot\gamma) = 0. \tag{3.16} \]

This result was recently extended to higher-dimensional Heisenberg groups [121]. The parameter $\kappa$ is called the curvature of the sub-Riemannian geodesic $\gamma$, since the projection of $\gamma$ to $\mathbb{R}^2$ is a curve with the curvature $\kappa$. The value $\kappa = 0$ corresponds to straight lines parallel to $\mathbb{R}^2$.

3.2.5. Heisenberg algebra and inner product. Finishing the discussion of the Heisenberg group, we want to show that the commutation relations of the Heisenberg algebra induce a natural inner product on the Heisenberg algebra under an additional condition. Let us use the notations: $X, Y, T$ for the basis of the Heisenberg algebra $\mathfrak{h}^1$, $U = \operatorname{span}\{X, Y\}$, $V = \operatorname{span}\{T\}$, and $\mathfrak{h}^1 = U \oplus V$. The one-dimensional vector space $V$ is naturally isomorphic to $\mathbb{R}^1$ by fixing the basis element $T$. Therefore, $V$ possesses the metric such that the length $|T| = 1$. Thus, $V$ becomes a normed vector space $(V, |\cdot|)$ and, therefore, a metric space. The commutator $[\cdot\,,\cdot]$ on the Heisenberg algebra $\mathfrak{h}^1$ produces a bi-linear skew symmetric form $\Omega(\cdot\,,\cdot) : U \times U \to V$ by $\Omega(u_1, u_2) := [u_1, u_2]$, $u_1, u_2 \in U$. Let $J$ be an almost complex structure on $U$, that is, a linear map $J : U \to U$ defined on the basis by
\[ J(X) = -Y, \qquad J(Y) = X. \tag{3.17} \]
Observe that $J^2 = -\operatorname{Id}_U$ and that $J$ is compatible with the commutator structure in the sense that the brackets are invariant under the transformation $J$. We have
\[ \Omega(JX, JY) = [JX, JY] = [-Y, X] = [X, Y] \]
for the basis of $U$ and thus, by bi-linearity of $\Omega$, for any vectors from $U$. Verify that $-J$ also possesses the same properties. The skew symmetric bi-linear form $\Omega$ and the compatible almost complex structure $J$ define a symmetric bi-linear form $\hat g(\cdot\,,\cdot) : U \times U \to V$ by
\[ \hat g(Z, W) = \Omega(JZ, W). \]
Indeed, linearity is obvious and symmetry follows from
\[ \hat g(Z, W) = \Omega(JZ, W) = \Omega(JJZ, JW) = -\Omega(Z, JW) = \Omega(JW, Z) = \hat g(W, Z). \]
On the basis elements of $U$ we get
\[ \hat g(X, X) = \Omega(JX, X) = T, \qquad \hat g(Y, Y) = \Omega(JY, Y) = T, \qquad \hat g(X, Y) = 0. \]
Recalling that we use the isomorphism of $V$ with $\mathbb{R}^1$, we conclude that $\hat g$ is an inner product making $X, Y$ orthonormal. Consider now the commutator as an adjoint map
\[ \operatorname{ad}_X : U \to V : \quad \operatorname{ad}_X(Z) = [X, Z], \]


(for the definition of the adjoint map and details see Appendix A, Subsection 8.3). Then $\operatorname{ad}_X : (\ker(\operatorname{ad}_X))^\perp \to V$ is an isomorphism, where the orthogonal complement is taken with respect to the inner product $\hat g$. Moreover, $\operatorname{ad}_X$ is an isometry from $(\ker(\operatorname{ad}_X))^\perp$ to $V$, since the length 1 basis element $Y$ is mapped to the length 1 basis element $T$. The same holds for $\operatorname{ad}_Y$. In the construction here we used the almost complex structure $J$. Let $\mathfrak{h}^1$ be the Heisenberg algebra. Left translations of the plane $U = \operatorname{span}\{X, Y\} \subset \mathfrak{h}^1$ define the distribution
\[ D_{(x,y,t)} = \operatorname{span}\Big\{ X = \partial_x - \tfrac12 y\,\partial_t, \quad Y = \partial_y + \tfrac12 x\,\partial_t \Big\} \]



on the Heisenberg group $\mathbb{H}^1$. Left translations of the restriction of the inner product $\hat g$ to $U$ become the left invariant sub-Riemannian metric $g_D$. We conclude that the commutation relations on the Heisenberg algebra and the presence of the compatible almost complex structure $J$ naturally lead to the left invariant sub-Riemannian structure $(D, g_D)$ on the group $\mathbb{H}^1$. For more details, see [77].

Exercises
1. Show that the Carnot–Carathéodory distance function on the Heisenberg sub-Riemannian manifold is homogeneous with respect to the dilation (3.12).
2. Show that any two distance functions $d_1$ and $d_2$ that are homogeneous with respect to the dilation (3.12) are equivalent on the Heisenberg group; that is, there are constants $C, \tilde C > 0$ such that
\[ C\,d_1(\tau, q) \le d_2(\tau, q) \le \tilde C\,d_1(\tau, q), \qquad \tau, q \in \mathbb{H}^1. \]

3. Verify that the Heisenberg distance function $d_{\mathbb{H}^1}$ is symmetric and satisfies the triangle inequality. If you do not succeed, see [89].
4. Show that geodesics (3.8) and (3.9) are invariant under the left translation defined by the multiplication (3.10).
5. Prove that all geodesics on the Heisenberg group can be obtained by left translations of geodesics (3.8) and (3.9) starting from $e = (0, 0, 0)$.

3.3. H-type groups

The Heisenberg type groups (H-type for short) were introduced by A. Kaplan [77] and have been studied extensively by many mathematicians, see for instance [21, 26, 36, 78, 88, 120]. The Heisenberg-type groups $\mathbb{H}$ are the groups whose Lie algebras $\mathfrak{h}$ are generalizations of the Heisenberg algebra in the following sense. Let a vector space $\mathfrak{h}$ be endowed with a commutator $[\cdot\,,\cdot]$ and an inner product $(\cdot\,,\cdot)$. We suppose that the commutator defines the decomposition
\[ \mathfrak{h} = U \oplus V, \qquad [U, U] \subseteq V, \qquad [U, V] = [V, V] = \{0\}, \]


and, moreover, this decomposition is orthogonal with respect to the inner product. The next assumption is the compatibility between $[\cdot\,,\cdot]$, $(\cdot\,,\cdot)$, and an almost complex structure. We define the map
\[ J : V \to \operatorname{End}(U), \qquad T \mapsto J_T, \]
by
\[ (J_T X, Y) = (T, [X, Y]) \qquad\text{for any } X, Y \in U \text{ and any } T \in V. \tag{3.18} \]
This immediately implies the skew-symmetry of $J_T$ for any $T \in V$:
\[ (J_T X, Y) = -(X, J_T Y). \tag{3.19} \]

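For any element $X \in U$ the adjoint map $\operatorname{ad}_X(\cdot) = [X, \cdot] : U \to V$ gives the decomposition
\[ U = \ker(\operatorname{ad}_X) \oplus \big(\ker(\operatorname{ad}_X)\big)^\perp, \]
where the orthogonal complement is taken with respect to the inner product $(\cdot\,,\cdot)$. We say that the Lie algebra $\mathfrak{h} = \big(U \oplus_\perp V, [\cdot\,,\cdot], (\cdot\,,\cdot)\big)$ is of H-type if for any $X \in U$ with $(X, X) = 1$ the map
\[ \operatorname{ad}_X : \big(\ker(\operatorname{ad}_X)\big)^\perp \to V \]
is an isometry onto $V$. The last condition is equivalent to
\[ J_T^2 = -|T|^2 \operatorname{Id}_U \qquad\text{for all } T \in V, \tag{3.20} \]
where $\operatorname{Id}_U$ denotes the identity mapping in $\operatorname{End}(U)$ [36]. The condition (3.20) is a consequence of
\[ J_T J_{T'} + J_{T'} J_T = -2(T, T') \operatorname{Id}_U \qquad\text{for all } T, T' \in V, \tag{3.21} \]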

see [36]. When there exists a linear mapping $J : V \to \operatorname{End}(U)$ satisfying (3.20) or (3.21), $U$ is called a Clifford module over $V$. The relation between H-type groups and Clifford modules was carefully studied in [36]. Some interesting generalizations, where the inner product is replaced by an arbitrary non-degenerate scalar product, can be found in [34, 35, 60]. In Theorem 3.2 a result on the classification of H-type algebras is presented. We need some definitions.

Definition 17. The algebra $\mathfrak{h}$ satisfies the $J^2$ condition if, whenever $X \in U$ and $T, T' \in V$ with $(T, T') = 0$, there exists $T'' \in V$ such that
\[ J_T J_{T'} X = J_{T''} X. \tag{3.22} \]

Denote by $\mathfrak{h}^n_0$ the Euclidean $n$-dimensional space, by $\mathfrak{h}^n_1$ the $n$-dimensional Heisenberg algebra, by $\mathfrak{h}^n_3$ the $n$-dimensional quaternion H-type algebra, and by $\mathfrak{h}^1_7$ the octonion H-type algebra. The lower index corresponds to the topological dimension of $V$ and the upper index reflects the real, complex, quaternion, and octonion topological dimensions of $U$.

Theorem 3.2 ([36]). Suppose that $\mathfrak{h}$ is an H-type algebra satisfying the $J^2$ condition. Then $\mathfrak{h}$ is isometrically isomorphic to $\mathfrak{h}^n_0$, $\mathfrak{h}^n_1$, $\mathfrak{h}^n_3$, or to $\mathfrak{h}^1_7$.


This classification is intimately related to Clifford algebras. The first three H-type algebras are also connected to the division algebras of real, complex, and quaternion numbers, since all of these algebras are isomorphic to some Clifford algebras. The last one, $\mathfrak{h}^1_7$, is not related to the division algebra of octonion numbers, since the algebra of octonion numbers is not isomorphic to a Clifford algebra due to the non-associative product of octonions. The groups related to division algebras were studied in [21], where the parametric formulas of geodesics were obtained and other questions were also treated. We present their construction in the following subsection.

3.3.1. Constructions of groups related to division algebras. Before we describe the general construction of the groups $\mathbb{H}^n_0$, $\mathbb{H}^n_1$, $\mathbb{H}^n_3$, and $\mathbb{H}^1_7$, we would like to recall the Cayley–Dickson construction of the division algebras $\mathbb{R}$ (real numbers), $\mathbb{C}$ (complex numbers), $\mathbb{Q}$ (quaternion numbers), and $\mathbb{O}$ (octonion numbers). The Cayley–Dickson construction explains why each algebra fits neatly inside the next one. Recall that in a division algebra each non-zero element has a unique inverse. The Cayley–Dickson construction is given nicely in [8].

A complex number, as is well known, can be thought of as a pair $(a, b)$ of real numbers $a, b \in \mathbb{R}$. We define the conjugate to a real number as $a^* = a$ and the conjugate to the pair as
\[ (a, b)^* = (a^*, -b). \tag{3.23} \]
Then the Cayley–Dickson product is defined by
\[ (a, b)(c, d) = (ac - d b^*,\; a^* d + c b). \tag{3.24} \]

Now we can think of a pair $(a, b)$ as a quaternion, where $a, b \in \mathbb{C}$. The conjugate is defined as in (3.23) and the product as in (3.24). We obtain the quaternion numbers $\mathbb{Q}$, which form a non-commutative algebra with respect to (3.24). Finally, we define an octonion as a pair $(a, b)$ with $a, b \in \mathbb{Q}$, the conjugate as in (3.23), and the product as in (3.24). The octonions with the multiplication (3.24) form a non-commutative, non-associative algebra. Actually, we can continue the Cayley–Dickson construction, doubling the dimension and getting slightly worse algebras at each step. First we lose the fact that every element is its own conjugate, then we lose commutativity, then associativity, and finally we lose the division algebra property. An algebra possesses the division property if
\[ xy = 0 \quad\text{implies}\quad x = 0 \ \text{ or } \ y = 0. \]
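The doubling procedure is short enough to be coded directly. The following Python sketch is an added illustration: elements are stored as flat tuples of length $2^k$ (an assumed representation), `conj` and `mul` follow (3.23) and (3.24), and the helper names are hypothetical. It exhibits the loss of commutativity for quaternions and of associativity for octonions.

```python
def neg(a):
    """Componentwise negation."""
    return tuple(-x for x in a)

def add(a, b):
    """Componentwise sum."""
    return tuple(x + y for x, y in zip(a, b))

def conj(a):
    """Conjugation (3.23): a* = a for reals, (a, b)* = (a*, -b) for pairs."""
    if len(a) == 1:
        return a
    h = len(a) // 2
    return conj(a[:h]) + neg(a[h:])

def mul(p, q):
    """Cayley-Dickson product (3.24): (a, b)(c, d) = (ac - d b*, a* d + c b)."""
    if len(p) == 1:
        return (p[0] * q[0],)
    h = len(p) // 2
    a, b = p[:h], p[h:]
    c, d = q[:h], q[h:]
    return add(mul(a, c), neg(mul(d, conj(b)))) + add(mul(conj(a), d), mul(c, b))

def unit(k, dim):
    """k-th basis element of the algebra of real dimension dim."""
    return tuple(1 if j == k else 0 for j in range(dim))

def norm2(a):
    """Squared Euclidean norm of the coefficient vector."""
    return sum(x * x for x in a)

i = unit(1, 2)
print(mul(i, i))                           # square of the imaginary unit

i1, i2 = unit(1, 4), unit(2, 4)
print(mul(i1, i2), mul(i2, i1))            # quaternion units do not commute

e1, e2, e4 = unit(1, 8), unit(2, 8), unit(4, 8)
print(mul(mul(e1, e2), e4), mul(e1, mul(e2, e4)))   # octonion units do not associate

p, q = (1, 2, 3, 4), (-2, 0, 5, 1)
print(norm2(mul(p, q)), norm2(p) * norm2(q))        # quaternion norm is multiplicative
```

Integer coefficients are used so that every comparison is exact.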

Using the Cayley–Dickson product, we first describe the following groups: the Euclidean $n$-dimensional space $\mathbb{H}^n_0 = \mathbb{R}^n$, the $n$-dimensional Heisenberg group $\mathbb{H}^n_1$, the $n$-dimensional quaternion H-type group $\mathbb{H}^n_3$, and the octonion group $\mathbb{H}^1_7$. The corresponding Lie algebras $\mathfrak{h}^n_0$, $\mathfrak{h}^n_1$, $\mathfrak{h}^n_3$, and $\mathfrak{h}^1_7$ are infinitesimal representations of these groups. We remark that the definitions of $\mathbb{H}^1_7$ and $\mathfrak{h}^1_7$ differ from the ones in [36]. In our construction we use the octonion product, which is not associative; therefore, it cannot give a Clifford algebra, where the product is associative by definition. The group corresponding to the classification of Theorem 3.2 is essentially the same as


we present, where the product of octonions has to be changed to an associative multiplication, presented in [36].

The Euclidean space. The group $\mathbb{H}^n_0 = (\mathbb{R}^n, +)$ endowed with the Euclidean inner product is a trivial example of an H-type group. We have the identifications $\mathfrak{h}^n_0 = T_e\mathbb{R}^n = \mathbb{R}^n = \operatorname{span}\{\partial_{x_1}, \ldots, \partial_{x_n}\}$. Left invariant vector fields are linear combinations of $(\partial_{x_1}, \ldots, \partial_{x_n})$ with constant coefficients. The exponential map is the identity map. Since all commutators $[\partial_{x_i}, \partial_{x_j}]$, $i, j = 1, \ldots, n$, vanish, we get
\[ \mathfrak{h}^n_0 = U \oplus V, \qquad U = \mathbb{R}^n, \qquad V = \{0\}. \]

The Heisenberg group $\mathbb{H}^n_1$. In the notation of the Heisenberg group $\mathbb{H}^n_1 = (\mathbb{C}^n \times \mathbb{R}, \circ)$ the upper index stands for the complex dimension of $\mathbb{C}^n$, which corresponds to the real dimension $2n$ of the space $U$ in $\mathfrak{h}^n_1 = U \oplus V$. The lower index reflects the real dimension of $\mathbb{R}$, which is isomorphic to the center $V$ of the algebra. We start from $n = 1$, and then generalize to an arbitrary $n \in \mathbb{N}$. Complex numbers considered as a vector space have 2 basis vectors that we call units, since their squares have absolute value 1:
\[ \text{real } 1 = (1, 0), \quad 1^2 = 1, \qquad\text{and imaginary } i = (0, 1), \quad i^2 = -1. \]

Take a complex number $z = (x_1, x_2)$, $x_1, x_2 \in \mathbb{R}$, and a real number $t$. Define a new non-commutative law between the elements $\tau = [z, t]$, $q = [z', t'] \in \mathbb{C} \times \mathbb{R}$ by
\[ \tau \circ q = \tau q = [z, t][z', t'] = \big[z + z',\; t + t' + \tfrac12 (zi) \cdot z'\big], \tag{3.25} \]
where first we take the Cayley–Dickson product $zi = (x_1, x_2)(0, 1) = (-x_2, x_1)$ and then the inner product "$\cdot$" of the vectors $zi, z' \in \mathbb{R}^2$. We write $\tau q$ instead of $\tau \circ q$ for the $\mathbb{H}^n_1$ group product to simplify the notation. If we use the representation of $i$ as the $(2 \times 2)$ matrix
\[ i = \begin{pmatrix} 0 & 1 \\ -1 & 0 \end{pmatrix}, \]
then the group law can be written as
\[ \tau q = [z, t][z', t'] = \big[z + z',\; t + t' + \tfrac12 (iz) \cdot z'\big], \tag{3.26} \]
where $iz$ is understood as the row vector $z = (x_1, x_2)$ multiplied by the matrix $i$ from the right, so that $iz = (-x_2, x_1)$ agrees with the Cayley–Dickson product $zi$. Using the algebraic form of a complex number $z = x_1 + i x_2 = \operatorname{Re} z + i \operatorname{Im} z$, we can write (3.25) in the form
\[ \tau q = [z, t][z', t'] = \big[z + z',\; t + t' + \tfrac12 \operatorname{Im}(z^* z')\big], \]
where $z^* z'$ is the Cayley–Dickson product of $z^*$ by $z'$. The non-commutativity of the new multiplication law in $\mathbb{C} \times \mathbb{R}$ is seen in the last variable $t \in \mathbb{R}$. The dimension one of the second slot of coordinates reflects the existence of only one imaginary unit. The reader easily recognizes in (3.26) the Heisenberg group $\mathbb{H}^1$ multiplication law (3.10).
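The agreement between the complex form of the law and the real form (3.10) amounts to the identity $\operatorname{Im}(z^* z') = x_1 x_2' - x_2 x_1'$. The Python sketch below is an added illustration using the built-in `complex` type; the names `mul_real` and `mul_complex` are hypothetical.

```python
def mul_real(p, q):
    """Group law (3.10) on triples (x, y, t)."""
    x, y, t = p
    x1, y1, t1 = q
    return (x + x1, y + y1, t + t1 + 0.5 * (x * y1 - x1 * y))

def mul_complex(p, q):
    """Group law on pairs [z, t] with z complex: t + t' + Im(conj(z) z') / 2."""
    z, t = p
    z1, t1 = q
    return (z + z1, t + t1 + 0.5 * (z.conjugate() * z1).imag)

def as_triple(p):
    """Identify [z, t] with (Re z, Im z, t)."""
    z, t = p
    return (z.real, z.imag, t)

tau = (complex(1, 2), 3.0)
q = (complex(-4, 5), 7.0)

print(as_triple(mul_complex(tau, q)))
print(mul_real(as_triple(tau), as_triple(q)))
```

The sample values are small integers, so the two printed triples can be compared without rounding issues.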


In order to present an $n$-dimensional analogue of the Heisenberg group we take two $n$-dimensional vectors of complex numbers $w = (z_1, \dots, z_n)$ and $w' = (z_1', \dots, z_n')$, where $z_l = x_{1l} + i x_{2l}$, $z_l' = x_{1l}' + i x_{2l}'$, $l = 1, \dots, n$. The matrix $i$ is changed to a block diagonal matrix $J = \operatorname{diag} i$ with $n$ matrices $i$ on the diagonal. The multiplication law between the elements $\tau = [w, t]$ and $q = [w', t'] \in \mathbb{C}^n \times \mathbb{R}$ is transformed into the following one:

$$\begin{aligned}
\tau q = [w, t][w', t'] &= \Big[w + w',\ t + t' + \tfrac{1}{2}\sum_{l=1}^{n} (z_l i) \cdot z_l'\Big] \\
&= \Big[w + w',\ t + t' + \tfrac{1}{2}(Jw) \cdot w'\Big] \\
&= \Big[w + w',\ t + t' + \tfrac{1}{2}\operatorname{Im}(w^* w')\Big],
\end{aligned}$$

where $w^* w' = \sum_{l=1}^{n} z_l^* z_l'$. The unit element is $e = (0, 0)$ and $(-w, -t) = (w, t)^{-1}$ is the inverse element to $(w, t)$. A left invariant basis of the Heisenberg algebra $\mathfrak{h}^n_1$, $n \in \mathbb{N}$, is obtained as in the one-dimensional case by translation of the basis vectors $\{\partial_{x_{1l}}, \partial_{x_{2l}}, \partial_t\}_{l=1}^{n}$ by $d_e l_\tau$. With $l = 1, \dots, n$ we get

$$X_{1l} = \partial_{x_{1l}} - \tfrac{1}{2} x_{2l}\,\partial_t, \qquad X_{2l} = \partial_{x_{2l}} + \tfrac{1}{2} x_{1l}\,\partial_t, \qquad \text{and} \qquad T = \partial_t. \tag{3.27}$$

Let us introduce the notations $U = \operatorname{span}\{X_{1l}, X_{2l}\}_{l=1}^{n}$, $V = \operatorname{span}\{T\}$. Since $[X_{1l}, X_{2l}] = T$ and other commutators vanish, we get $\mathfrak{h}^n_1 = U \oplus V$. Let the inner product $(\cdot\,, \cdot)$ in $\mathfrak{h}^n_1$ be such that the basis vectors become orthonormal. The condition (3.18) holds due to the commutation relations. The endomorphism $J_T$ is represented by the matrix $J$, which possesses properties (3.19) with respect to the Euclidean inner product in $\mathbb{R}^{2n}$ and (3.20). The $J^2$ condition holds trivially, since $\dim V = 1$.

Quaternion group $H^n_3$. In the notation of the quaternion group $H^n_3 = (Q^n \times \mathbb{R}^3, \circ)$ the upper index denotes the quaternion dimension of the space of quaternions $Q^n$, which corresponds to the real dimension $4n$ of the horizontal distribution. The lower index in this case reflects the real dimension 3 of the center of the Lie algebra $\mathfrak{h}^n_3$. As previously, we start from the one-dimensional case $n = 1$ and then consider its multidimensional analogue. Quaternion numbers $Q$, which we think of as pairs of complex numbers, have one real element $1 = (1, 0)$, $1^2 = 1$, and three imaginary basis elements

$$i_1 = (i, 0), \qquad i_2 = (0, 1), \qquad i_3 = (0, i), \qquad \text{such that} \qquad i_1^2 = i_2^2 = i_3^2 = i_1 i_2 i_3 = -1.$$

The Cayley–Dickson product is no longer commutative, for example,

$$i_1 i_2 = -i_2 i_1 = -i_3, \qquad i_2 i_3 = -i_3 i_2 = -i_1, \qquad i_3 i_1 = -i_1 i_3 = -i_2. \tag{3.28}$$


In order to construct the quaternion H-type group $H^1_3$, we take a quaternion $q = (z_1, z_2)$, $z_1, z_2 \in \mathbb{C}$, and three real numbers $t_1, t_2, t_3$ that reflect the three-dimensional nature of the space of imaginary quaternions. Define a new non-commutative law between the elements $h = [q, t_1, t_2, t_3] \in Q \times \mathbb{R}^3$ and $p = [q', t_1', t_2', t_3'] \in Q \times \mathbb{R}^3$ by

$$hp = [q, t_1, t_2, t_3][q', t_1', t_2', t_3'] = \Big[q + q',\ t_1 + t_1' + \tfrac{1}{2}(q i_1) \cdot q',\ t_2 + t_2' + \tfrac{1}{2}(q i_2) \cdot q',\ t_3 + t_3' + \tfrac{1}{2}(q i_3) \cdot q'\Big], \tag{3.29}$$

where $q i_k$, $k = 1, 2, 3$, is the Cayley–Dickson product for the quaternions and "$\cdot$" is the inner product in $\mathbb{R}^4$. As in the case of the Heisenberg group we can use the matrix representation of the imaginary units

$$i_1 = \begin{bmatrix} 0 & -1 & 0 & 0 \\ 1 & 0 & 0 & 0 \\ 0 & 0 & 0 & -1 \\ 0 & 0 & 1 & 0 \end{bmatrix}, \qquad
i_2 = \begin{bmatrix} 0 & 0 & -1 & 0 \\ 0 & 0 & 0 & 1 \\ 1 & 0 & 0 & 0 \\ 0 & -1 & 0 & 0 \end{bmatrix}, \qquad
i_3 = \begin{bmatrix} 0 & 0 & 0 & -1 \\ 0 & 0 & -1 & 0 \\ 0 & 1 & 0 & 0 \\ 1 & 0 & 0 & 0 \end{bmatrix}, \tag{3.30}$$

and rewrite the group law (3.29) in the form

$$hp = \Big[q + q',\ t_1 + t_1' + \tfrac{1}{2}(i_1 q) \cdot q',\ t_2 + t_2' + \tfrac{1}{2}(i_2 q) \cdot q',\ t_3 + t_3' + \tfrac{1}{2}(i_3 q) \cdot q'\Big].$$

We can represent a quaternion $q$ in the algebraic form as $q = \alpha + i_1 \beta + i_2 \gamma + i_3 \delta = \alpha + i_1 \operatorname{Im}_1 q + i_2 \operatorname{Im}_2 q + i_3 \operatorname{Im}_3 q$. Then the multiplication law (3.29) admits the form

$$hp = [q, t_1, t_2, t_3][q', t_1', t_2', t_3'] = \Big[q + q',\ t_1 + t_1' + \tfrac{1}{2}\operatorname{Im}_1(q^* q'),\ t_2 + t_2' + \tfrac{1}{2}\operatorname{Im}_2(q^* q'),\ t_3 + t_3' + \tfrac{1}{2}\operatorname{Im}_3(q^* q')\Big], \tag{3.31}$$

where $q^* q'$ is the Cayley–Dickson product of $q^*$ by $q'$.

To give an $n$-dimensional analogue of the quaternion H-type group, we take the $n$-dimensional vectors of quaternion numbers $w = (q_1, \dots, q_n)$, $w' = (q_1', \dots, q_n')$. Each of the matrices $i_m$, $m = 1, 2, 3$, is changed to the block diagonal matrix $J_m = \operatorname{diag} i_m$ with $n$ $(4 \times 4)$-matrices $i_m$ on the main diagonal. The multiplication law between the elements $h = [w, t_1, t_2, t_3]$, $p = [w', t_1', t_2', t_3'] \in Q^n \times \mathbb{R}^3$ is

$$\begin{aligned}
hp &= [w, t_1, t_2, t_3][w', t_1', t_2', t_3'] \\
&= \Big[w + w',\ t_1 + t_1' + \tfrac{1}{2}\sum_{l=1}^{n} (q_l i_1) \cdot q_l',\ t_2 + t_2' + \tfrac{1}{2}\sum_{l=1}^{n} (q_l i_2) \cdot q_l',\ t_3 + t_3' + \tfrac{1}{2}\sum_{l=1}^{n} (q_l i_3) \cdot q_l'\Big] \\
&= \Big[w + w',\ t_1 + t_1' + \tfrac{1}{2}(J_1 w) \cdot w',\ t_2 + t_2' + \tfrac{1}{2}(J_2 w) \cdot w',\ t_3 + t_3' + \tfrac{1}{2}(J_3 w) \cdot w'\Big] \\
&= \Big[w + w',\ t_1 + t_1' + \tfrac{1}{2}\operatorname{Im}_1(w^* w'),\ t_2 + t_2' + \tfrac{1}{2}\operatorname{Im}_2(w^* w'),\ t_3 + t_3' + \tfrac{1}{2}\operatorname{Im}_3(w^* w')\Big],
\end{aligned}$$

where $w^* w' = \sum_{l=1}^{n} q_l^* q_l'$. The unit of the group $H^n_3$ is $e = (0, 0) \in Q^n \times \mathbb{R}^3$ and the inverse element to $(w, t_1, t_2, t_3)$ is $(-w, -t_1, -t_2, -t_3)$, $w \in Q^n$, $t_1, t_2, t_3 \in \mathbb{R}$. The quaternion algebra $\mathfrak{h}^n_3$, $n \in \mathbb{N}$, is the direct sum $U \oplus V$, where $U = \operatorname{span}(X_{11}, X_{21}, X_{31}, X_{41}, \dots, X_{1n}, X_{2n}, X_{3n}, X_{4n})$ with the left invariant vector fields

$$\begin{aligned}
X_{1l}(w, t) &= \partial_{x_{1l}} + \tfrac{1}{2}\big(- x_{2l}\,\partial_{t_1} - x_{3l}\,\partial_{t_2} - x_{4l}\,\partial_{t_3}\big), \\
X_{2l}(w, t) &= \partial_{x_{2l}} + \tfrac{1}{2}\big(x_{1l}\,\partial_{t_1} + x_{4l}\,\partial_{t_2} - x_{3l}\,\partial_{t_3}\big), \\
X_{3l}(w, t) &= \partial_{x_{3l}} + \tfrac{1}{2}\big(- x_{4l}\,\partial_{t_1} + x_{1l}\,\partial_{t_2} + x_{2l}\,\partial_{t_3}\big), \\
X_{4l}(w, t) &= \partial_{x_{4l}} + \tfrac{1}{2}\big(x_{3l}\,\partial_{t_1} - x_{2l}\,\partial_{t_2} + x_{1l}\,\partial_{t_3}\big),
\end{aligned} \qquad l = 1, \dots, n, \tag{3.32}$$

and $w = (q_1, \dots, q_n) = (x_{11}, x_{21}, x_{31}, x_{41}, \dots, x_{1n}, x_{2n}, x_{3n}, x_{4n})$. The subspace $V$ is spanned by $\{T_1, T_2, T_3\}$ with $T_k = \partial_{t_k}$. The following commutator relations

$$\begin{aligned}
[X_{1l}, X_{2l}] &= T_1, & [X_{1l}, X_{3l}] &= T_2, & [X_{1l}, X_{4l}] &= T_3, \\
[X_{2l}, X_{3l}] &= T_3, & [X_{2l}, X_{4l}] &= -T_2, & [X_{3l}, X_{4l}] &= T_1,
\end{aligned}$$

hold for $l = 1, \dots, n$ and all other brackets vanish. Thus, the condition (3.18) is verified with respect to the inner product making the bases of $U$ and $V$ orthonormal. The endomorphisms $J_{T_m}$ are represented by matrices $J_m$, $m = 1, 2, 3$. The $J^2$ condition holds due to relation (3.28).

Remark 1. If we involve only two imaginary basis elements into the construction, then we obtain the quaternion H-type group with two-dimensional center $V$. Taking into consideration only one of the $i_k$, $k = 1, 2, 3$, we get a group isomorphic to the Heisenberg group $H^n_1$.

Octonion H-type group $H^1_7$. Octonions or Cayley numbers, which we think of as pairs of quaternion numbers, have one real basis element $1 = (1, 0)$, $1^2 = 1$


and 7 imaginary basis elements

$$j_1 = (i_1, 0), \quad j_2 = (i_2, 0), \quad j_3 = (i_3, 0), \quad j_4 = (0, 1), \quad j_5 = (0, i_1), \quad j_6 = (0, i_2), \quad j_7 = (0, i_3),$$

whose squares equal $(-1)$. The rule of multiplication is presented in Table 3 in Appendix B. The product of octonions is not associative, for example,

$$j_1 (j_2 j_4) = -j_7, \qquad (j_1 j_2) j_4 = j_7.$$

There is no matrix representation of $j_k$ since the multiplication between the $j_k$ is not associative, in contrast to matrix multiplication. Nevertheless, it is possible to associate a matrix $J_m$ with any imaginary unit $j_m$, which can be considered as a replacement of the endomorphism $J_{Z_m}$, $m = 1, \dots, 7$. The matrices $J_m$ are given in Appendix B. We take an octonion $w = (q_1, q_2)$, $q_1, q_2 \in Q$, and $t \in \mathbb{R}^7$, corresponding to the seven-dimensional space of imaginary octonions. Using $J_m$ we write the multiplication law on the group $H^1_7$ as follows:

$$hp = [w, t][w', t'] = \Big[w + w',\ t_1 + t_1' - \tfrac{1}{2}(w J_1) \cdot w',\ \dots,\ t_7 + t_7' - \tfrac{1}{2}(w J_7) \cdot w'\Big]. \tag{3.33}$$

Notice some properties of the matrices $J_m$:

$$J_m^2 = -U, \qquad J_m^T = -J_m, \qquad J_m^{-1} = -J_m, \qquad m = 1, \dots, 7, \tag{3.34}$$

where $U$ is the $(8 \times 8)$ identity matrix. The product of the matrices $J_m$ does not correspond to the product of the corresponding imaginary units $j_m$; for example,

$$j_1 j_2 = -j_3, \qquad \text{but} \qquad J_1 J_2 \neq -J_3.$$

The matrices $J_m$ do not represent the imaginary units in the octonion algebra $O$, but they can be used to write the group law and the left invariant basis of the corresponding algebra. The algebra $\mathfrak{h}^1_7$ is the direct sum $U \oplus V$, where $U = \operatorname{span}(X_1, \dots, X_8)$ with

$$X_l(w, t) = \partial_{x_l} + \frac{1}{2}\sum_{m=1}^{7} (x J_m)_l\,\partial_{t_m}, \qquad l = 1, \dots, 8, \tag{3.35}$$

where $w = (x_1, \dots, x_8)$ and $(x J_m)_l$ is the $l$th coordinate of the vector $x J_m$. We give the coefficients $(x J_m)_l$ in Table 1. For instance, to write the vector field $X_1$ we take the first line from Table 1 and get

$$X_1(w, t) = \partial_{x_1} + \tfrac{1}{2}\big(- x_2\,\partial_{t_1} - x_3\,\partial_{t_2} - x_4\,\partial_{t_3} - x_5\,\partial_{t_4} - x_6\,\partial_{t_5} - x_7\,\partial_{t_6} - x_8\,\partial_{t_7}\big).$$

The subspace $V$ is spanned by $\{T_1, \dots, T_7\}$ with $T_m = \partial_{t_m}$. The non-vanishing commutators are given in Table 2, showing that condition (3.18) still holds. Using the normal coordinates $(w, t)$ for the elements, we identify the elements of the group with the elements of the algebra via the exponential map $\exp : \mathfrak{h}^1_7 \to H^1_7$:

$$H^1_7 \ni (x_1, \dots, x_8, t_1, \dots, t_7) \xleftarrow{\ \exp\ } \sum_{k=1}^{8} x_k X_k + \sum_{m=1}^{7} t_m T_m \in \mathfrak{h}^1_7.$$


        ∂t1    ∂t2    ∂t3    ∂t4    ∂t5    ∂t6    ∂t7
X1      −x2    −x3    −x4    −x5    −x6    −x7    −x8
X2       x1     x4    −x3     x6    −x5    −x8     x7
X3      −x4     x1     x2     x7     x8    −x5    −x6
X4       x3    −x2     x1     x8    −x7     x6    −x5
X5      −x6    −x7    −x8     x1     x2     x3     x4
X6       x5    −x8     x7    −x2     x1    −x4     x3
X7       x8     x5    −x6    −x3     x4     x1    −x2
X8      −x7     x6     x5    −x4    −x3     x2     x1

Table 1. The product xJm.

        X1     X2     X3     X4     X5     X6     X7     X8
X1       0     T1     T2     T3     T4     T5     T6     T7
X2     −T1      0     T3    −T2     T5    −T4    −T7     T6
X3     −T2    −T3      0     T1     T6     T7    −T4    −T5
X4     −T3     T2    −T1      0     T7    −T6     T5    −T4
X5     −T4    −T5    −T6    −T7      0     T1     T2     T3
X6     −T5     T4    −T7     T6    −T1      0    −T3     T2
X7     −T6     T7     T4    −T5    −T2     T3      0    −T1
X8     −T7    −T6     T5     T4    −T3    −T2     T1      0

Table 2. Non-vanishing commutators.
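The two tables are linked by (3.35): since $X_l = \partial_{x_l} + \frac{1}{2}\sum_m (xJ_m)_l\,\partial_{t_m}$ has coefficients linear in $x$, one gets $[X_l, X_k] = \sum_m \frac{1}{2}\big((J_m)_{lk} - (J_m)_{kl}\big) T_m$. The following Python sketch checks this consistency. It is an illustration under stated assumptions: the integer encoding of the table entries ($\pm k$ for $\pm x_k$ or $\pm T_k$) and all function names are introduced here, and the matrices $J_m$ are read off Table 1 rather than taken from Appendix B.

```python
# Table 1: row l lists (x J_m)_l for m = 1..7; entry -k stands for -x_k, +k for +x_k.
TABLE1 = [
    [-2, -3, -4, -5, -6, -7, -8],
    [1, 4, -3, 6, -5, -8, 7],
    [-4, 1, 2, 7, 8, -5, -6],
    [3, -2, 1, 8, -7, 6, -5],
    [-6, -7, -8, 1, 2, 3, 4],
    [5, -8, 7, -2, 1, -4, 3],
    [8, 5, -6, -3, 4, 1, -2],
    [-7, 6, 5, -4, -3, 2, 1],
]

# Table 2: entry [l][k] encodes [X_l, X_k]; +m means T_m, -m means -T_m, 0 means zero.
TABLE2 = [
    [0, 1, 2, 3, 4, 5, 6, 7],
    [-1, 0, 3, -2, 5, -4, -7, 6],
    [-2, -3, 0, 1, 6, 7, -4, -5],
    [-3, 2, -1, 0, 7, -6, 5, -4],
    [-4, -5, -6, -7, 0, 1, 2, 3],
    [-5, 4, -7, 6, -1, 0, -3, 2],
    [-6, 7, 4, -5, -2, 3, 0, -1],
    [-7, -6, 5, 4, -3, -2, 1, 0],
]


def matrix_J(m):
    """8x8 matrix J_m with (x J_m)_l = sum_k x_k J[k][l], read off column m of Table 1."""
    J = [[0] * 8 for _ in range(8)]
    for l in range(8):
        entry = TABLE1[l][m - 1]
        J[abs(entry) - 1][l] = 1 if entry > 0 else -1
    return J


def matmul(A, B):
    n = len(A)
    return [[sum(A[i][k] * B[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]


def commutator_table():
    """Encode [X_l, X_k] = sum_m (J_m[l][k] - J_m[k][l]) / 2 * T_m like TABLE2."""
    table = [[0] * 8 for _ in range(8)]
    for m in range(1, 8):
        J = matrix_J(m)
        for l in range(8):
            for k in range(8):
                twice_coeff = J[l][k] - J[k][l]  # equals 2, -2 or 0
                if twice_coeff:
                    table[l][k] = m * (twice_coeff // 2)
    return table


Js = [matrix_J(m) for m in range(1, 8)]
NEG_ID = [[-1 if i == j else 0 for j in range(8)] for i in range(8)]

# Properties (3.34): J_m^2 = -U and J_m^T = -J_m.
assert all(matmul(J, J) == NEG_ID for J in Js)
assert all(J[i][j] == -J[j][i] for J in Js for i in range(8) for j in range(8))

# Table 2 follows from Table 1.
assert commutator_table() == TABLE2

# The matrices do not multiply like the octonion units: J_1 J_2 is neither J_3 nor -J_3.
J1J2 = matmul(Js[0], Js[1])
neg_J3 = [[-v for v in row] for row in Js[2]]
assert J1J2 != Js[2] and J1J2 != neg_J3
```

With this reading, each $J_m$ is a signed permutation matrix squaring to minus the identity, which is the matrix form of the $J^2$ condition for a seven-dimensional center.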

3.3.2. Groups related to division algebras considered as sub-Riemannian manifolds. The group $H^n_0$ is the usual Euclidean space, since $V = \{0\}$. The sub-Riemannian metric is the Riemannian metric given as a left translation of the Euclidean product by the abelian group law "+". Further,

$$H^n_1 = (\mathbb{R}^{2n+1}, D, g_D), \qquad H^n_3 = (\mathbb{R}^{4n+3}, D, g_D), \qquad H^1_7 = (\mathbb{R}^{15}, D, g_D),$$

where the left invariant distributions $D$ are such that $D_q = \operatorname{span}\{X_1(q), \dots, X_\alpha(q)\}$. Here the left invariant vector fields $X_j$, $j = 1, \dots, \alpha = \dim U$, are given by (3.27), (3.32), and (3.35). We use as the underlying smooth real manifolds for the groups the real vector spaces $\mathbb{R}^{2n+1}$, $\mathbb{R}^{4n+3}$, and $\mathbb{R}^{15}$, which are isomorphic to $\mathbb{C}^n \times \mathbb{R}$, $Q^n \times \mathbb{R}^3$, and $O \times \mathbb{R}^7$, respectively. The metric $g_D$ is such that the basis of $D_q$ becomes orthonormal in each case. The distributions $D$ are of step 2, strongly bracket generating and regular.


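In order to present normal geodesics on H-type groups related to division algebras, we denote all of them by $H$. We also use the notations $\alpha = \dim U_q$, $\beta = \dim V_q$ for all $q = (x, t) \in H \cong \mathbb{R}^\alpha \times \mathbb{R}^\beta$, $\lambda = (\xi, \theta) \in T_q^* H$. Then the Hamiltonian function is

$$H(q, \lambda) = H(x, t, \xi, \theta) = \frac{1}{2}\sum_{l=1}^{\alpha} \langle \lambda, X_l \rangle^2 = \sum_{l=1}^{\alpha} \xi_l^2 + \frac{1}{4}\sum_{l=1}^{\alpha} (x^l)^2 \sum_{m=1}^{\beta} \theta_m^2 + Mx \cdot \xi,$$

where "$\cdot$" denotes the usual inner product in $\mathbb{R}^\alpha$ and $M = \sum_{m=1}^{\beta} \theta_m J_m$. The matrix $M$ is skew symmetric. The corresponding Hamiltonian system is

$$\begin{cases}
\dot{x} = \dfrac{\partial H}{\partial \xi} = 2\xi + Mx, \\[4pt]
\dot{t}_m = \dfrac{\partial H}{\partial \theta_m} = \dfrac{\theta_m |x|^2}{2} + J_m x \cdot \xi, \quad m = 1, \dots, \beta, \\[4pt]
\dot{\xi} = -\dfrac{\partial H}{\partial x} = -\dfrac{1}{2}|\theta|^2 x - M\xi, \\[4pt]
\dot{\theta}_m = -\dfrac{\partial H}{\partial t_m} = 0,
\end{cases} \tag{3.36}$$

and the initial conditions are $x(0) = t(0) = 0$, $\xi(0) = \xi_0$, $\theta(0) = \theta_0$. We see from the last equation that the $\theta$-coordinates of the momentum are constants $\theta_0$. The system (3.36) is reduced to

$$\begin{cases}
\ddot{x} = 2M\dot{x}, \\
\dot{t}_m = -\tfrac{1}{2} J_m x \cdot \dot{x}, \quad m = 1, \dots, \beta.
\end{cases} \tag{3.37}$$

The solutions of the system (3.37) for $\theta_0 \neq 0$ are

$$\begin{cases}
x(s) = \dfrac{1 - \cos(2s|\theta_0|)}{2|\theta_0|^2}\, M\dot{x}(0) + \dfrac{\sin(2s|\theta_0|)}{2|\theta_0|}\, \operatorname{Id}\, \dot{x}(0), \\[6pt]
t_m(s) = \dfrac{(\theta_0)_m |\dot{x}(0)|^2}{4|\theta_0|^2}\Big(s - \dfrac{\sin(2s|\theta_0|)}{2|\theta_0|}\Big), \quad m = 1, \dots, \beta,
\end{cases} \tag{3.38}$$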

where $\operatorname{Id}$ is the identity matrix in $\mathbb{R}^\alpha$. If $\theta_0$ vanishes, then the geodesics are straight lines starting from $(0, 0)$ with the initial velocity $\xi_0$, $\theta_0 = 0$, remaining in the space $t = 0$. Analysing the solutions, we obtain information about the behaviour of geodesics similar to the Heisenberg group. Projections to any three-dimensional subspace containing two coupled $x$-coordinates and one $t$-coordinate give the Heisenberg picture, see Figure 3.2. A detailed study of this question can be found in [21].

In the work [59] the authors obtained an analogue of the variational equation (3.1) for the H-type group $H^1_3 = (\mathbb{R}^7, \rho)$, where the group law $\rho$ is given by (3.31).

Theorem 3.3. Let $\gamma : [a, b] \to H^1_3$ be a horizontal curve, parameterized by arc length. Then $\gamma$ is a critical point of the functional for the Carnot–Carathéodory distance if and only if there exist numbers $\kappa_1, \kappa_2, \kappa_3 \in \mathbb{R}$ such that $\gamma$ satisfies the second-order differential equation

$$\nabla_{\dot{\gamma}} \dot{\gamma} - 2\sum_{m=1,2,3} \kappa_m J_m(\dot{\gamma}) = 0. \tag{3.39}$$

Here $J_m$, $m = 1, 2, 3$, are the almost complex structures given by the endomorphisms $J_{T_m}$, $m = 1, 2, 3$, in the definition of $H^1_3$. The Levi-Civita connection $\nabla$ is compatible with the extension $g$ of the sub-Riemannian metric $g_D$ from $D$ onto the entire tangent bundle $TH^1_3$. The Riemannian metric $g$ is left invariant. The question whether there are similar equations for other H-type groups is still open. See an analogue of equation (3.39) in Subsection 5.5.

We also present here formulas for horizontal gradients on groups $H$, written in the left invariant bases $(X_1, \dots, X_\alpha)$ given in (3.27), (3.32), (3.35) and in the standard basis $\{\partial_{x_l}, \partial_{t_m}\}$, $l = 1, \dots, \alpha$, $m = 1, \dots, \beta$:

$$\operatorname{grad}_D = (X_1, \dots, X_\alpha) = \operatorname{grad}_x + \frac{1}{2}\sum_{m=1}^{\beta} (J_m x)\,\partial_{t_m}, \tag{3.40}$$

with $\operatorname{grad}_x = (\partial_{x_1}, \dots, \partial_{x_\alpha})$.

3.3.3. Action of groups related to division algebras on the Siegel upper half-spaces. Let $\mathbb{C}^{n+1}$ be the $(n+1)$-dimensional complex space. We use the notation $z = (z', z_{n+1})$, where $z' = (z_1, \dots, z_n) \in \mathbb{C}^n$. The set

$$U_n = \Big\{(z_1, \dots, z_{n+1}) \in \mathbb{C}^{n+1} : 4\operatorname{Re}(z_{n+1}) > |z'|^2 = \sum_{l=1}^{n} |z_l|^2\Big\}$$

defines the Siegel upper half-space in $\mathbb{C}^{n+1}$. Let $B_{\mathbb{C}}$ denote the unit ball in $\mathbb{C}^{n+1}$:

$$B_{\mathbb{C}} = \Big\{(w_1, \dots, w_{n+1}) \in \mathbb{C}^{n+1} : \sum_{l=1}^{n+1} |w_l|^2 < 1\Big\}.$$

Then the Cayley transformation

$$w_{n+1} = \frac{1 - z_{n+1}}{1 + z_{n+1}}, \qquad w_l = \frac{z_l}{1 + z_{n+1}}, \qquad l = 1, \dots, n,$$

and its inverse

$$z_{n+1} = \frac{1 - w_{n+1}}{1 + w_{n+1}}, \qquad z_l = \frac{2 w_l}{1 + w_{n+1}}, \qquad l = 1, \dots, n,$$

show that the unit ball $B_{\mathbb{C}}$ and the Siegel upper half-space $U_n$ are biholomorphically equivalent.

Let $Q^{n+1}$, $n + 1 \in \mathbb{N}$, be an $(n+1)$-dimensional quaternion vector space. The elements of $Q^{n+1}$ are $(n+1)$-tuples of quaternions that we denote by $q = (q', q_{n+1})$,


$q' = (q_1, \dots, q_n) \in Q^n$, with the norm $|q|^2 = \sum_{l=1}^{n+1} |q_l|^2$. The Siegel upper half-space in $Q^{n+1}$ can be defined by analogy with the complex case as

$$U_n = \Big\{(q_1, \dots, q_{n+1}) \in Q^{n+1} : 4\operatorname{Re}(q_{n+1}) > \sum_{l=1}^{n} |q_l|^2 = |q'|^2\Big\}.$$

The unit ball $B_Q$ in $Q^{n+1}$ is

$$B_Q = \Big\{(h_1, \dots, h_{n+1}) \in Q^{n+1} : \sum_{l=1}^{n+1} |h_l|^2 < 1\Big\}.$$

Since the multiplication of quaternions is not commutative, there are two forms of the Cayley transformation that give the symmetric geometry. The (left) Cayley transformation, mapping the Siegel upper half-space $U_n$ onto the unit ball $B_Q$, has the form

$$q_{n+1} = (1 + h_{n+1})^{-1}(1 - h_{n+1}) = \frac{(1 + h_{n+1}^*)(1 - h_{n+1})}{|1 + h_{n+1}|^2}, \qquad
q_l = 2 h_l (1 + h_{n+1})^{-1} = \frac{2 h_l (1 + h_{n+1}^*)}{|1 + h_{n+1}|^2}$$

for $l = 1, \dots, n$. The inverse transformation from $B_Q$ onto $U_n$ is

$$h_l = q_l (1 + q_{n+1})^{-1} = \frac{q_l (1 + q_{n+1}^*)}{|1 + q_{n+1}|^2}, \qquad
h_{n+1} = (1 - q_{n+1})(1 + q_{n+1})^{-1} = \frac{(1 - q_{n+1})(1 + q_{n+1}^*)}{|1 + q_{n+1}|^2}$$

for $l = 1, \dots, n$. The Cayley transformation is biholomorphic in the quaternion sense, where the notion of a quaternion holomorphic function is not a direct generalization of a holomorphic complex function; it requires some additional inputs, see, for instance, [39, 68].

Let us denote by $(q', q)$ an element of one of the above-mentioned Siegel upper half-spaces, and by $(h', h)$ a point from the corresponding unit ball. Then

$$|h'|^2 = h'(h')^* = |q'|^2\, |(1 + q)^{-1}|^2, \qquad |h|^2 = h h^* = |1 - q|^2\, |(1 + q)^{-1}|^2,$$

and

$$|h'|^2 + |h|^2 = \big(|q'|^2 + |1 - q|^2\big)\, |(1 + q)^{-1}|^2 < 1.$$

Since $|(1 + q)^{-1}|^2 = |1 + q|^{-2}$, we have

$$|q'|^2 + |1 - q|^2 = |q'|^2 + 1 + |q|^2 - 2\operatorname{Re}(q) < |1 + q|^2 = 1 + |q|^2 + 2\operatorname{Re}(q),$$

which yields $|q'|^2 < 4\operatorname{Re}(q)$.
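The inequality just derived can be checked numerically in the complex case, where Python's built-in complex numbers suffice. The sketch below is an illustration only; the function names are hypothetical, and a point of $\mathbb{C}^{n+1}$ is modelled as a list whose last entry is the distinguished coordinate $z_{n+1}$ or $w_{n+1}$.

```python
def cayley_to_ball(z):
    """w_l = z_l / (1 + z_{n+1}), w_{n+1} = (1 - z_{n+1}) / (1 + z_{n+1})."""
    *z_prime, z_last = z
    d = 1 + z_last
    return [zk / d for zk in z_prime] + [(1 - z_last) / d]


def cayley_to_siegel(w):
    """z_l = 2 w_l / (1 + w_{n+1}), z_{n+1} = (1 - w_{n+1}) / (1 + w_{n+1})."""
    *w_prime, w_last = w
    d = 1 + w_last
    return [2 * wk / d for wk in w_prime] + [(1 - w_last) / d]


def in_siegel(z):
    """Membership test 4 Re(z_{n+1}) > |z'|^2."""
    return 4 * z[-1].real > sum(abs(zk) ** 2 for zk in z[:-1])


def in_ball(w):
    """Membership test |w_1|^2 + ... + |w_{n+1}|^2 < 1."""
    return sum(abs(wk) ** 2 for wk in w) < 1


inside = [0.3 + 0.4j, -0.2 + 0.1j, 0.5 + 2j]   # 4 Re = 2 > |z'|^2 = 0.3
outside = [1 + 0j, 0j, 0.1 + 0j]                # 4 Re = 0.4 < |z'|^2 = 1

assert in_siegel(inside) and in_ball(cayley_to_ball(inside))
assert not in_siegel(outside) and not in_ball(cayley_to_ball(outside))

# The two maps are mutually inverse.
back = cayley_to_siegel(cayley_to_ball(inside))
assert all(abs(a - b) < 1e-12 for a, b in zip(back, inside))
```

A quaternionic version would need a quaternion type and care with the order of the factors, which is exactly the point of distinguishing left and right Cayley transformations.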


Let $K$ be one of the spaces $\mathbb{C}^{n+1}$ or $Q^{n+1}$. We denote by $p = (q', q)$ a point from the Siegel upper half-space $U_n$ of $K$. The boundary of $U_n$ is

$$\partial U_n = \Big\{(q', q) \in K : 4\operatorname{Re}(q) = |q'|^2 = \sum_{l=1}^{n} |q_l|^2\Big\}.$$

We mention here three automorphisms of the domain $U_n$: dilation, rotation and translation.

Dilation. Let $p = (q', q) \in U_n$. For every positive number $s$ we define a dilation $\delta_s(p)$ by $\delta_s(p) = \delta_s(q', q) = (s q', s^2 q)$. The non-isotropy of the dilation comes from the definition of $U_n$.

Rotation. For every unitary linear transformation $U$ that acts on $\mathbb{C}^n$ and any symplectic linear transformation acting on $Q^n$ we define the rotation $\operatorname{Rot}(p)$ on $U_n$ by $\operatorname{Rot}(p) = \operatorname{Rot}(q', q) = (U(q'), q)$. Both the dilation and the rotation are extended to mappings on the boundary $\partial U_n$.

Translation. We use the notation $H$ for the groups $H^n_1$, $H^n_3$. To every element $[w, t]$ of $H$ we associate the following affine self-map of $U_n$. Notice that it is a holomorphic map for the cases $\mathbb{C}^{n+1}$, $Q^{n+1}$. This map is the action on the left of the group $H$ on $U_n$:

$$[w, t].(q', q) \mapsto \Big(q' + w,\ q + \frac{|w|^2}{4} + \frac{1}{2} w^* q' + i \cdot t\Big). \tag{3.41}$$

Here $i \cdot t = \sum_{k=1}^{\beta} i_k t_k$. This mapping preserves the level sets given by the function

$$r(p) = 4\operatorname{Re}(q) - |q'|^2. \tag{3.42}$$

In fact, since $|q' + w|^2 = |q'|^2 + |w|^2 + 2\operatorname{Re}(w^* q')$, we obtain

$$4\operatorname{Re}\Big(q + \frac{|w|^2}{4} + \frac{1}{2} w^* q'\Big) - |q' + w|^2 = 4\operatorname{Re}(q) - |q'|^2.$$

Hence, the transformation (3.41) maps $U_n$ onto itself and preserves the boundary $\partial U_n$. Let us check that the mapping (3.41) defines an action of the group $H$ on the space $U_n$. If we compose the mappings (3.41) corresponding to elements $[w, t]$ and $[\omega, s] \in H$, we get

$$[w, t].\big([\omega, s].(q', q)\big) = \Big(w + \omega + q',\ q + \frac{|\omega|^2}{4} + \frac{|w|^2}{4} + \frac{1}{2}(w + \omega)^* q' + \frac{1}{2} w^* \omega + i \cdot (s + t)\Big). \tag{3.43}$$


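On the other hand, the transformation corresponding to the element $[w, t][\omega, s]$ is

$$[w, t][\omega, s].(q', q) = \Big(w + \omega + q',\ q + \frac{|w + \omega|^2}{4} + \frac{1}{2}(w + \omega)^* q' + \frac{1}{2}\, i \cdot \operatorname{Im} w^* \omega + i \cdot (s + t)\Big). \tag{3.44}$$

Observing that

$$\frac{|w + \omega|^2}{4} + \frac{1}{2}\, i \cdot \operatorname{Im} w^* \omega = \frac{|w|^2}{4} + \frac{|\omega|^2}{4} + \frac{1}{2}\operatorname{Re}(w^* \omega) + \frac{1}{2}\, i \cdot \operatorname{Im} w^* \omega = \frac{|w|^2}{4} + \frac{|\omega|^2}{4} + \frac{1}{2} w^* \omega,$$

we conclude that (3.43) and (3.44) give the same result. Thus, (3.41) gives us a realization of $H$ as a group of affine ($q$-holomorphic) bijections of $U_n$. We can identify the elements of $H$ with the boundary via the action at the origin: by (3.41),

$$h(0) = [w, t].(0, 0) = \Big(w,\ \frac{|w|^2}{4} + i \cdot t\Big), \qquad \text{where } h = [w, t].$$

Thus, $H \ni [w, t] \mapsto \big(w, \tfrac{|w|^2}{4} + i \cdot t\big) \in \partial U_n$. We may use the following coordinates $(q', t, r) = (q', t_1, \dots, t_{\dim V_2}, r)$ on $U_n$:

$$U_n \ni (q', q) = (q', t, r), \qquad t_k = \operatorname{Im}_k q,\ k = 1, \dots, \beta, \qquad r = r(q', q) = 4\operatorname{Re}(q) - |q'|^2.$$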
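The action property and the invariance of $r$ can be tested numerically for the Heisenberg group $H^1_1$ acting on $U_1 \subset \mathbb{C}^2$, where $i \cdot t$ reduces to $it$. The Python sketch below is an illustration with hypothetical names (`group_mul`, `act`, `level`); group elements $[w, t]$ and points $(q', q)$ are modelled as pairs.

```python
def group_mul(g, h):
    """[w, t][omega, s] = [w + omega, t + s + Im(conj(w) omega) / 2]."""
    (w, t), (omega, s) = g, h
    return (w + omega, t + s + 0.5 * (w.conjugate() * omega).imag)


def act(g, p):
    """Translation (3.41): (q', q) -> (q' + w, q + |w|^2/4 + conj(w) q'/2 + i t)."""
    (w, t), (q_prime, q) = g, p
    return (q_prime + w,
            q + abs(w) ** 2 / 4 + 0.5 * w.conjugate() * q_prime + 1j * t)


def level(p):
    """Level function (3.42): r(p) = 4 Re(q) - |q'|^2."""
    q_prime, q = p
    return 4 * q.real - abs(q_prime) ** 2


g = (1 + 2j, 0.5)
h = (-0.5 + 1j, -1.0)
p = (0.3 - 0.2j, 1 + 0.7j)          # r(p) = 4 - 0.13 > 0, so p lies in U_1

# (3.43) and (3.44) agree: acting twice equals acting by the product.
lhs = act(g, act(h, p))
rhs = act(group_mul(g, h), p)
assert abs(lhs[0] - rhs[0]) < 1e-12 and abs(lhs[1] - rhs[1]) < 1e-12

# The level sets of r are preserved, and the orbit of the origin lies in the boundary.
assert abs(level(act(g, p)) - level(p)) < 1e-12
assert abs(level(act(g, (0j, 0j)))) < 1e-12
```

The last assertion reflects that the orbit of the origin is $(w, |w|^2/4 + it)$, which satisfies $4\operatorname{Re}(q) = |q'|^2$.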

If $4\operatorname{Re}(q) = |q'|^2$, then we get coordinates on the boundary $\partial U_n$ of the Siegel upper half-space: $\partial U_n \ni (q', q) = (q', t_1, \dots, t_{\dim V_2})$, where $t_k$ are as above and $r = r(q', q) = 0$.

3.4. Carnot groups

The following example includes connected simply connected Lie groups $G$ whose Lie algebras are the direct sum of their subspaces $\mathfrak{g} = V_1 \oplus V_2 \oplus \dots \oplus V_m$, such that $[V_1, V_k] = V_{k+1}$, $k = 1, 2, \dots, m - 1$, and $[V_1, V_m] = 0$. Since the commutators have finite length, the algebras and the groups are nilpotent. The Lie algebras are also graded, $[V_l, V_k] \subseteq V_{l+k}$, and stratified, $0 \in V_1 \subset V_1 \oplus V_2 \subset \dots \subset \oplus_{k=1}^{m} V_k$. Groups of this kind are called Carnot groups in the literature.


3.4.1. Two-step Carnot groups. The two-step Carnot groups $G$ are those possessing Lie algebras $\mathfrak{g}$ which are nilpotent of step 2, graded, and stratified:

$$\mathfrak{g} = V_1 \oplus V_2, \qquad [V_1, V_1] \subseteq V_2, \qquad [V_1, V_2] = [V_2, V_2] = \{0\}.$$

The underlying manifold of the group is $\mathbb{R}^{\alpha + \beta}$, $\alpha = \dim V_1$, $\beta = \dim V_2$. The group multiplication law can be written by making use of an $\mathbb{R}^\beta$-valued skew symmetric form $\Omega : \mathbb{R}^\alpha \times \mathbb{R}^\alpha \to \mathbb{R}^\beta$. Namely, if we write $(v_1, v_2), (v_1', v_2') \in V_1 \oplus V_2$ for the Lie algebra elements, then

$$\big[(v_1, v_2), (v_1', v_2')\big] = \big(0, \Omega(v_1, v_1')\big).$$

If we write $\tau = (x, t)$, $q = (x_1, t_1)$ for the elements of $G$, then

$$\tau q = (x, t)(x_1, t_1) := \Big(x + x_1,\ t + t_1 + \frac{1}{2}\Omega(x, x_1)\Big) \tag{3.45}$$

by the BCH-formula. All H-type groups and groups related to division algebras are examples of two-step Carnot groups. For another treatment of two-step nilpotent groups, making use of a metric, see [43, 44].

3.4.2. Engel group. The Engel group is an example of a three-step Carnot group. The underlying manifold is $\mathbb{R}^4$. We use coordinates $q = (x, y, z, w)$. Let us calculate the Lie group multiplication law by making use of the BCH-formula for a nilpotent group of step 3:

$$\exp(F_1)\exp(F_2) = \exp\Big(F_1 + F_2 + \frac{1}{2}[F_1, F_2] + \frac{1}{12}[F_1, [F_1, F_2]] - \frac{1}{12}[F_2, [F_1, F_2]]\Big). \tag{3.46}$$

The Lie algebra for the Engel group has to satisfy the relations

$$[X, Y] = Z, \qquad [X, Z] = aW, \qquad [Y, Z] = bW, \qquad a, b \in \mathbb{R}.$$

For example, if we choose a slight modification of the Heisenberg vector fields

$$X = \partial_x - \frac{1}{2} y\,\partial_z + z\,\partial_w, \qquad Y = \partial_y + \frac{1}{2} x\,\partial_z - z\,\partial_w,$$

then we get

$$[X, Y] = \partial_z =: Z, \qquad [X, Z] = -\partial_w =: -W, \qquad [Y, Z] = \partial_w = W. \tag{3.47}$$

If we write $F_i = x_i X + y_i Y + z_i Z + w_i W$, $i = 1, 2$, then the BCH-formula (3.46) and the commutation relations (3.47) lead to the group law

$$\begin{aligned}
(x_1, y_1, z_1, w_1)(x_2, y_2, z_2, w_2) = \Big(& x_1 + x_2,\ y_1 + y_2,\ z_1 + z_2 + \frac{1}{2}(x_1 y_2 - x_2 y_1), \\
& w_1 + w_2 - \frac{1}{2}(x_1 z_2 - x_2 z_1) + \frac{1}{2}(y_1 z_2 - y_2 z_1) + \frac{1}{12}(y_1 - y_2)(x_1 y_2 - x_2 y_1)\Big).
\end{aligned} \tag{3.48}$$

Another coordinate representation of the Engel group can be found in [37].


Exercises

1. Find the matrices of left invariant Riemannian metrics for groups related to division algebras. These metrics should also make the left invariant basis $\{X_1, \dots, X_\alpha, T_1, \dots, T_\beta\}$, $\alpha = \dim U$, $\beta = \dim V$, orthonormal.
2. Find gradients and sub-Laplacian operators.

4. Sub-Riemannian spheres

In this section we will consider sub-Riemannian manifolds whose underlying smooth manifolds are odd-dimensional unit spheres. To begin, we pay special attention to the spheres $S^3$ and $S^7$. We will see how the same sub-Riemannian structure is defined by considering $S^3$ as a group, as a CR-manifold, and as a principal $U(1)$-bundle. We also compare the construction of sub-Riemannian structures on $S^3$ and $S^7$. The reason why we present the examples of $S^3$ and $S^7$ is that we can consider these spheres globally as manifolds endowed with a globally non-vanishing linearly independent basis. The structure of the basis on any sphere is given in the following theorem.

Theorem 4.1 ([1]). Let $S^{n-1} = \{x \in \mathbb{R}^n \mid \|x\|_E^2 = 1\}$ be the unit sphere in $\mathbb{R}^n$, with respect to the usual Euclidean norm $\|\cdot\|_E$. Then $S^{n-1}$ has precisely $\rho(n) - 1$ linearly independent, globally defined and non-vanishing vector fields, where $\rho(n)$ is defined in the following way: if $n = (2a + 1)2^b$ and $b = c + 4d$, where $0 \le c \le 3$, then $\rho(n) = 2^c + 8d$.

In particular, two classical consequences follow: $S^1$, $S^3$ and $S^7$ are the only spheres with a maximal number of linearly independent globally defined non-vanishing vector fields, and all even-dimensional spheres have no globally defined and non-vanishing vector fields. Rephrasing the property of a manifold $M$ to have the maximal number of linearly independent globally defined non-vanishing vector fields, one says that $M$ is parallelizable. The fact that $S^1$, $S^3$ and $S^7$ are the only parallelizable spheres was proved in [17]. That even-dimensional spheres have no globally defined and non-vanishing vector fields is a consequence of the Hopf index theorem, see [130].

4.1. Sub-Riemannian structures on $S^3$

4.1.1. $S^3$ as a Lie group. Consider the smooth manifold $S^3$:

$$S^3 = \{x = (x^0, x^1, x^2, x^3) \in \mathbb{R}^4 \mid \|x\|_E^2 = 1\}.$$
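As a side check on Theorem 4.1, the count $\rho(n) - 1$ can be tabulated directly from the definition of $\rho$. The following Python sketch is illustrative only; the function names `radon_hurwitz` and `max_vector_fields` are not from the text.

```python
def radon_hurwitz(n):
    """rho(n) = 2**c + 8*d for n = (2a + 1) * 2**b with b = c + 4d, 0 <= c <= 3."""
    if n < 1:
        raise ValueError("n must be a positive integer")
    b = 0
    while n % 2 == 0:   # strip the odd factor (2a + 1), counting powers of two
        n //= 2
        b += 1
    d, c = divmod(b, 4)
    return 2 ** c + 8 * d


def max_vector_fields(sphere_dim):
    """Number rho(n) - 1 of independent vector fields on S^(n-1), n = sphere_dim + 1."""
    return radon_hurwitz(sphere_dim + 1) - 1


for k in (1, 2, 3, 5, 7, 15):
    print(f"S^{k}: {max_vector_fields(k)} independent vector fields")

# Spheres of positive dimension whose field count equals the dimension.
parallelizable = [k for k in range(1, 200) if max_vector_fields(k) == k]
assert parallelizable == [1, 3, 7]

# Even-dimensional spheres: n is odd, so b = 0 and rho(n) - 1 = 0.
assert all(max_vector_fields(k) == 0 for k in range(2, 200, 2))
```

Within the tested range, the formula reproduces both consequences stated after the theorem.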
In order to introduce the multiplication between points of $S^3$, we consider the set $S^3$ as the subset of the quaternion numbers $Q$ of norm one. Recall that $Q = (\mathbb{R}^4, +, \cdot)$, where $+$ stands for the usual coordinate-wise addition in $\mathbb{R}^4$ and "$\cdot$" is a non-commutative product given by the formula

$$\begin{aligned}
\Big(x^0 + \sum_{k=1}^{3} x^k i_k\Big) \cdot \Big(y^0 + \sum_{k=1}^{3} y^k i_k\Big) = {} & (x^0 y^0 - x^1 y^1 - x^2 y^2 - x^3 y^3) \\
& + (x^1 y^0 + x^0 y^1 - x^3 y^2 + x^2 y^3)\, i_1 \\
& + (x^2 y^0 + x^3 y^1 + x^0 y^2 - x^1 y^3)\, i_2 \\
& + (x^3 y^0 - x^2 y^1 + x^1 y^2 + x^0 y^3)\, i_3.
\end{aligned} \tag{4.1}$$

The conjugate of $q = x^0 + \sum_{k=1}^{3} x^k i_k$ is given by $\bar{q} = x^0 - \sum_{k=1}^{3} x^k i_k$, and the norm $|q|$ of $q \in Q$ is defined by $|q|^2 = q\bar{q}$. The realization of the sphere $S^3$ as the set of unit quaternions with the multiplication (4.1) gives the Lie group $S^3 = (S^3, \cdot)$. The multiplication rule (4.1) induces a right translation $r_\tau(q)$ of an element $q = x^0 + \sum_{k=1}^{3} x^k i_k$ by the element $\tau = y^0 + \sum_{k=1}^{3} y^k i_k$. The matrix corresponding to the tangent map $dr_\tau(q)$, obtained by the multiplication rule, becomes

$$dr_\tau = \begin{pmatrix}
y^0 & y^1 & y^2 & y^3 \\
-y^1 & y^0 & -y^3 & y^2 \\
-y^2 & y^3 & y^0 & -y^1 \\
-y^3 & -y^2 & y^1 & y^0
\end{pmatrix}.$$

Calculating the action of $dr_\tau(q)$ on the basis of the unit vectors $(\partial_0, \partial_1, \partial_2, \partial_3)$, we get four vector fields

$$\begin{aligned}
N_\tau &= y^0 \partial_0 + y^1 \partial_1 + y^2 \partial_2 + y^3 \partial_3, \\
V_\tau &= -y^1 \partial_0 + y^0 \partial_1 - y^3 \partial_2 + y^2 \partial_3, \\
X_\tau &= -y^2 \partial_0 + y^3 \partial_1 + y^0 \partial_2 - y^1 \partial_3, \\
Y_\tau &= -y^3 \partial_0 - y^2 \partial_1 + y^1 \partial_2 + y^0 \partial_3.
\end{aligned} \tag{4.2}$$
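The fields (4.2) can be recovered numerically by multiplying the basis quaternions $1, i_1, i_2, i_3$ on the right by $\tau$ according to (4.1). The following Python sketch is an illustration; `qmul`, `frame` and `dot` are hypothetical helper names, and quaternions are modelled as 4-tuples $(x^0, x^1, x^2, x^3)$.

```python
def qmul(x, y):
    """Quaternion product (4.1) on 4-tuples (x0, x1, x2, x3)."""
    x0, x1, x2, x3 = x
    y0, y1, y2, y3 = y
    return (x0 * y0 - x1 * y1 - x2 * y2 - x3 * y3,
            x1 * y0 + x0 * y1 - x3 * y2 + x2 * y3,
            x2 * y0 + x3 * y1 + x0 * y2 - x1 * y3,
            x3 * y0 - x2 * y1 + x1 * y2 + x0 * y3)


def frame(tau):
    """The vector fields N, V, X, Y of (4.2) evaluated at tau."""
    y0, y1, y2, y3 = tau
    return ((y0, y1, y2, y3),
            (-y1, y0, -y3, y2),
            (-y2, y3, y0, -y1),
            (-y3, -y2, y1, y0))


def dot(u, v):
    return sum(a * b for a, b in zip(u, v))


tau = (0.1, 0.7, -0.5, 0.5)          # a unit quaternion: squares sum to 1
basis = [(1.0, 0.0, 0.0, 0.0), (0.0, 1.0, 0.0, 0.0),
         (0.0, 0.0, 1.0, 0.0), (0.0, 0.0, 0.0, 1.0)]

# Right translation of the basis quaternions gives exactly N, V, X, Y.
for e_k, field in zip(basis, frame(tau)):
    assert all(abs(a - b) < 1e-12 for a, b in zip(qmul(e_k, tau), field))

# The four fields are orthonormal with respect to the Euclidean inner product.
fields = frame(tau)
for i in range(4):
    for j in range(4):
        expected = 1.0 if i == j else 0.0
        assert abs(dot(fields[i], fields[j]) - expected) < 1e-12
```

Since $N_\tau = \tau$ is the outward normal, the remaining three fields span the tangent space of $S^3$ at $\tau$.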

It is easy to see that $N_\tau$ is the unit normal to $S^3$ at $\tau \in S^3$ with respect to the Euclidean inner product $(\cdot\,, \cdot)$ in $\mathbb{R}^4$. Moreover, for any $\tau \in S^3$,

$$(N_\tau, V_\tau) = (N_\tau, X_\tau) = (N_\tau, Y_\tau) = 0 \qquad \text{and} \qquad (N_\tau, N_\tau) = (V_\tau, V_\tau) = (X_\tau, X_\tau) = (Y_\tau, Y_\tau) = 1.$$

Since the matrix

$$\begin{pmatrix}
-y^1 & y^0 & -y^3 & y^2 \\
-y^2 & y^3 & y^0 & -y^1 \\
-y^3 & -y^2 & y^1 & y^0
\end{pmatrix}$$

has rank three, we conclude that the vector fields $\{V(\tau), X(\tau), Y(\tau)\}$ form an orthonormal basis of $T_\tau S^3$ with respect to $(\cdot\,, \cdot)_\tau$, for any $\tau \in S^3$.

Observing that $[X, Y] = 2V$, we see that the distribution $D = \operatorname{span}\{X, Y\}$ is bracket generating, strongly bracket generating, and regular, therefore it satisfies the hypotheses of Theorem 2.2 and Theorem 2.3. Notice that the distribution $D = \operatorname{span}\{X, Y\}$ can also be defined as the kernel of the contact one-form

$$\omega = -y^1\, dy^0 + y^0\, dy^1 - y^3\, dy^2 + y^2\, dy^3. \tag{4.3}$$


Remark 2. It is easy to see that $[V, Y] = 2X$ and $[X, V] = 2Y$, therefore the distributions $\operatorname{span}\{Y, V\}$ and $\operatorname{span}\{X, V\}$ are also bracket generating. The corresponding contact forms are

$$\theta = -y^2\, dy^0 + y^3\, dy^1 + y^0\, dy^2 - y^1\, dy^3, \qquad \eta = -y^3\, dy^0 - y^2\, dy^1 + y^1\, dy^2 + y^0\, dy^3,$$

respectively. This means that there is a priori no natural choice of a sub-Riemannian structure on $S^3$ generated by the Lie group action of multiplication of quaternions. Any choice that can be made will produce essentially the same geometry.

Exercises

1. Verify that the constructed group $(S^3, \cdot)$ coincides with the matrix group $Sp(1)$.
2. Show that the constructed group $(S^3, \cdot)$ is isomorphic to the group $SU(2)$ of matrices

$$\begin{pmatrix} z_1 & z_2 \\ -\bar{z}_2 & \bar{z}_1 \end{pmatrix}, \qquad |z_1|^2 + |z_2|^2 = 1, \quad z_1, z_2 \in \mathbb{C}.$$

Use the correspondence

$$q = x^0 + x^1 i_1 + x^2 i_2 + x^3 i_3 \ \longmapsto\ z_1 = x^0 + i x^1, \quad z_2 = x^2 + i x^3.$$

4.1.2. $S^3$ as a CR-manifold. Consider $S^3$ as the boundary of the unit ball $B_{\mathbb{C}} \subset \mathbb{C}^2$, or the hypersurface

$$S^3 := \{(z, w) \in \mathbb{C}^2 : z\bar{z} + w\bar{w} = 1\}.$$

The sphere $S^3$ cannot be endowed with a complex structure since it has a three-dimensional tangent space. Nevertheless it possesses a differentiable structure compatible with the natural complex structure of the ball $B_{\mathbb{C}} = \{(z, w) \in \mathbb{C}^2 : z\bar{z} + w\bar{w} < 1\}$ as an open set in $\mathbb{C}^2$. We show that this differentiable structure over the sphere $S^3$, called a CR-structure, is equivalent to the sub-Riemannian one considered in the previous subsection.

We begin by recalling the definition of a CR-structure, according to [14]. In the case $W = T_q \mathbb{R}^{2n}$, $q = (x^1, y^1, \dots, x^n, y^n) \in \mathbb{R}^{2n}$, we say that the standard almost complex structure for $W$ is defined by setting

$$J(\partial_{x^j}) = \partial_{y^j}, \qquad J(\partial_{y^j}) = -\partial_{x^j}, \qquad 1 \le j \le n.$$

For a smooth real submanifold $M$ of $\mathbb{C}^n$ and a point $q \in M$, in general, the tangent space $T_q M$ is not invariant under the standard almost complex structure map $J : T_q \mathbb{C}^n \to T_q \mathbb{C}^n$, $T_q \mathbb{C}^n \cong T_q \mathbb{R}^{2n}$. We are interested in the largest subspace invariant under the action of $J$.

Definition 18. The holomorphic tangent space $H_q M$ of $M$ at $q$ is the vector space $H_q M = T_q M \cap J(T_q M)$ for a point $q \in M$.


A real submanifold $M$ of $\mathbb{C}^n$ is said to have a CR-structure if $\dim_{\mathbb{R}} H_q M$ does not depend on $q \in M$. A result of [14] implies that every smooth real hypersurface $S$ embedded in $\mathbb{C}^n$ satisfies $\dim_{\mathbb{R}} H_q S = 2n - 2$; therefore, $S$ is a CR-manifold. This fact applies to every odd-dimensional sphere, considered as a manifold embedded in $\mathbb{C}^n$.

Let us describe the holomorphic tangent space $H_q S^3$. The space $H_q S^3$ can be seen as a complex vector space of complex dimension one. This description is achieved by considering the differential form $\omega = \bar{z}\, dz + \bar{w}\, dw$ and observing that $\ker(\omega)$ is precisely the set we are looking for. Straightforward calculations show that $\ker(\omega) = \operatorname{span}\{\bar{w}\,\partial_z - \bar{z}\,\partial_w\}$. In real coordinates this corresponds to

$$\bar{w}\,\partial_z - \bar{z}\,\partial_w = \frac{1}{2}(-X + iY),$$

where $X$ and $Y$ were defined in (4.2). It is important to remark that this is precisely the maximal 1-complex-dimensional $J$-invariant subspace of $T_q S^3$, namely

$$J(X) = Y, \qquad J(Y) = -X.$$

Then $J(\operatorname{span}\{X, Y\}) = \operatorname{span}\{X, Y\}$, but $J(V) = -N \notin T_q S^3$ for any point $q \in S^3$. Therefore, the right invariant distribution corresponding to the left action of $S^3$ over itself coincides with the 1-complex-dimensional holomorphic tangent space.

Remark 3. Essentially the same almost complex structure can be obtained by means of the Levi-Civita connection $\nabla$ on $S^3$ considered as a smooth Riemannian manifold embedded into $\mathbb{R}^4$. Namely, the mapping $J_V(W) = \nabla_W V$ for $W \in D$ and the vector field $V$ defined in (4.2) is introduced in [74].

Exercise 1. Show that the distribution $D = HS^3$ at $q \in S^3$ can also be defined as the set of complex two-dimensional vectors that are orthogonal to $n = z\,\partial_z + w\,\partial_w$ with respect to the standard Hermitian product $(v, n)_H = \bar{v}_1 n_1 + \bar{v}_2 n_2$ at each point $q \in S^3$:

$$D_q = \{v \in T_q \mathbb{C}^2 \mid (v, n)_H = 0,\ n = z\,\partial_z + w\,\partial_w,\ z, w \in \mathbb{C}\}.$$

4.1.3. $S^3$ as a principal $U(1)$-bundle. In this part we describe how the structure of a principal $U(1)$-bundle over $S^3$ induces a bracket generating distribution on $S^3$. More details about the relation between principal bundles and sub-Riemannian geometry will be given in Section 5. The group $U(1)$, consisting of complex numbers of absolute value 1, acts on the right on the manifold $S^3$ by

$$\mu : S^3 \times U(1) \to S^3, \qquad (z, w).\upsilon \mapsto (z\upsilon, w\upsilon).$$

Here $\upsilon \in U(1) = \{\upsilon \in \mathbb{C} : |\upsilon|^2 = 1\}$ and $(z, w) \in S^3 \subset \mathbb{C}^2$.

Consider the Hopf map $h : S^3 \to S^2$ [71, 98], given explicitly by

$$h(z, w) = (|z|^2 - |w|^2,\ 2z\bar{w}),$$

where $S^2 = \{(x, \zeta) \in \mathbb{R} \times \mathbb{C} : x^2 + |\zeta|^2 = 1\}$. Clearly, $h$ is a submersion of $S^3$ onto $S^2$, and it is a bijection between $S^3/U(1)$ and $S^2$, where $S^3/U(1)$ is understood as the orbit space of the $U(1)$-action over $S^3$. Let $p = (x_0, \zeta_0) \in S^2$. Consider the great circle

$$\gamma_p(s) = (z_0, w_0)\, e^{2\pi i s}, \qquad s \in [0, 1],$$

in $S^3$ that projects to $p$ under the Hopf map. Here $(z_0, w_0)$ is a point in the pre-image of $p$ under $h$. Consider the tangent vector field defined by $\dot{\gamma}_p(s) = 2\pi i (z_0, w_0)\, e^{2\pi i s} \in T_{\gamma_p(s)} S^3$. We write the curve $\gamma_p$ and the map $d_{\gamma_p(s)} h$ in real coordinates. Then

$$\gamma_p(s) = (z(s), w(s)) = \big(x^0(s) + i x^1(s),\ x^2(s) + i x^3(s)\big) = \big(x^0(s), x^1(s), x^2(s), x^3(s)\big)$$

and

$$d_{\gamma_p(s)} h = 2\begin{pmatrix}
x^0(s) & x^1(s) & -x^2(s) & -x^3(s) \\
x^2(s) & x^3(s) & x^0(s) & x^1(s) \\
-x^3(s) & x^2(s) & x^1(s) & -x^0(s)
\end{pmatrix}. \tag{4.4}$$

Thus, the Hopf map induces the following action over the vector field $\dot{\gamma}_p(s) = i\gamma_p(s)$:

$$d_{\gamma_p(s)} h\big(\dot{\gamma}_p(s)\big) = 2\begin{pmatrix}
x^0(s) & x^1(s) & -x^2(s) & -x^3(s) \\
x^2(s) & x^3(s) & x^0(s) & x^1(s) \\
-x^3(s) & x^2(s) & x^1(s) & -x^0(s)
\end{pmatrix}
\begin{pmatrix} \dot{x}^0(s) \\ \dot{x}^1(s) \\ \dot{x}^2(s) \\ \dot{x}^3(s) \end{pmatrix}
= \begin{pmatrix} 0 \\ 0 \\ 0 \end{pmatrix}.$$

Therefore, if $d_{\gamma_p(s)} h$ is a full rank matrix, then we would have characterized its kernel by

$$\ker\big(d_{\gamma_p(s)} h\big) = \operatorname{span}\{\dot{\gamma}_p(s)\} = \operatorname{span}\{V_{\gamma_p(s)}\}, \tag{4.5}$$

since $\dot{\gamma}_p(s) = 2\pi V_{\gamma_p(s)}$ by (4.2). In order to see that the matrix (4.4) is full rank we observe that $[d_{\gamma_p(s)} h][d_{\gamma_p(s)} h]^{\mathrm{tr}} = 4 I_3$, where $I_3$ denotes the identity $(3 \times 3)$-matrix. This implies that $d_{\gamma_p(s)} h$ is full rank.

Now we describe how the Hopf map induces the distribution $D$ constructed in the previous subsections. Define the distribution $D$ as the orthogonal complement to the kernel of $dh$ with respect to the Euclidean inner product $(\cdot\,, \cdot)$ in $\mathbb{R}^4$. More precisely, $D_q = \{v \in T_q S^3 \mid (v, w) = 0 \ \forall\, w \in \ker(d_q h)\}$. Since we know that $\ker(d_q h) = \operatorname{span}\{V(q)\}$, and moreover,

$$(X_q, V_q) = (Y_q, V_q) = (X_q, Y_q) = 0,$$

Geodesics in Geometry with Constraints

203

we see that Dq = span{Xq , Yq }.

(4.6)
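The identities used above can be checked numerically at a sample point. The following is a minimal sketch, not part of the original argument: it assumes the real-coordinate expressions $X = (-x_2, x_3, x_0, -x_1)$, $Y = (-x_3, -x_2, x_1, x_0)$, $V = (-x_1, x_0, -x_3, x_2)$ used in this section, and the helper names (`hopf_differential`, `frame`, `random_unit_vector`) are hypothetical.

```python
import math
import random

def dot(u, v):
    """Euclidean inner product in R^4."""
    return sum(a * b for a, b in zip(u, v))

def hopf_differential(q):
    """Rows of the matrix (4.4) at q = (x0, x1, x2, x3)."""
    x0, x1, x2, x3 = q
    return [
        [2 * x0, 2 * x1, -2 * x2, -2 * x3],
        [2 * x2, 2 * x3, 2 * x0, 2 * x1],
        [-2 * x3, 2 * x2, 2 * x1, -2 * x0],
    ]

def frame(q):
    """The vector fields X, Y, V at q, written as vectors in R^4 (assumed form)."""
    x0, x1, x2, x3 = q
    X = [-x2, x3, x0, -x1]
    Y = [-x3, -x2, x1, x0]
    V = [-x1, x0, -x3, x2]
    return X, Y, V

def random_unit_vector(dim, rng):
    """Hypothetical helper: a random point of the unit sphere in R^dim."""
    v = [rng.gauss(0.0, 1.0) for _ in range(dim)]
    norm = math.sqrt(dot(v, v))
    return [a / norm for a in v]

rng = random.Random(0)
q = random_unit_vector(4, rng)
dh = hopf_differential(q)
X, Y, V = frame(q)

gram = [[dot(r, s) for s in dh] for r in dh]   # [dh][dh]^tr, expected 4 * I_3
image_of_V = [dot(row, V) for row in dh]       # dh(V), expected to vanish
pairings = [dot(X, V), dot(Y, V), dot(X, Y)]   # expected to vanish
image_of_X = [dot(row, X) for row in dh]       # dh(X), a non-zero vector
```

In this sketch `image_of_V` vanishes while `image_of_X` does not, which is the numerical counterpart of $\ker(d_qh) = \operatorname{span}\{V(q)\}$ and of $X_q$ being horizontal.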

In the literature the distribution obtained in this way is called the Ehresmann connection and $\ker(d_qh)$ is called the vertical space. We give a general definition of the Ehresmann connection in Subsection 5.1. The action of the group $U(1)$ on the manifold $S^3$ satisfies the definition of the principal bundle, see Definition 21. We conclude that the Hopf fibration is a principal $U(1)$-bundle. Moreover, the distribution $D$ is invariant under the right action of $U(1)$:

\[ d_qr_\tau(D_q) = D_{r_\tau(q)} = D_{q.\tau}, \qquad \tau \in U(1),\ q \in S^3. \]

Thus, the Hopf map, written in coordinates, indicates, in a topological way, how one makes a natural choice of the horizontal distribution $D$ that was not obvious when we considered the left action of $S^3$ over itself. The sub-Riemannian metric $g_D$ is defined by restricting the usual Riemannian metric on $S^3$ to the distribution $D$.

Summarizing the last three subsections we conclude that all presented constructions lead to the sub-Riemannian manifold $(S^3, D, g_D)$, where $D_q = \operatorname{span}\{X_q, Y_q\}$, $q = (x_0, x_1, x_2, x_3)$,

\[ X_q = -x_2\partial_{x_0} + x_3\partial_{x_1} + x_0\partial_{x_2} - x_1\partial_{x_3}, \qquad Y_q = -x_3\partial_{x_0} - x_2\partial_{x_1} + x_1\partial_{x_2} + x_0\partial_{x_3}, \]

and the sub-Riemannian metric $g_D = (\cdot\,,\cdot)|_D$ is the restriction of the Euclidean inner product $(\cdot\,,\cdot)$ in $\mathbb{R}^4$ to the distribution $D$.

Let us find an analogue of the horizontality condition (3.7) for $S^3$. A smooth curve $c : I \to S^3$ is horizontal if $\dot c(s) \in D_{c(s)}$ for all $s \in I$, that is, if the third coefficient in the decomposition

\[ \dot c(s) = \alpha(s)X(c(s)) + \beta(s)Y(c(s)) + \delta(s)V(c(s)) \]

vanishes. Write $c(s) = \bigl(x_0(s), x_1(s), x_2(s), x_3(s)\bigr)$ and $\dot c(s) = \sum_{k=0}^{3}\dot x_k\,\partial_k$. Then, since

\[ \delta(s) = \bigl(V(c(s)), \dot c(s)\bigr) = -x_1\dot x_0 + x_0\dot x_1 - x_3\dot x_2 + x_2\dot x_3 \]

by (4.2), we conclude that the curve is horizontal if it satisfies the differential equation

\[ -x_1(s)\dot x_0(s) + x_0(s)\dot x_1(s) - x_3(s)\dot x_2(s) + x_2(s)\dot x_3(s) = 0, \qquad s \in I. \tag{4.7} \]
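As an illustration of condition (4.7), one can evaluate its left-hand side along two explicit curves on $S^3$. This is a sketch only: the two curves (a great circle that is an integral curve of $X$ through $(1,0,0,0)$, and the Hopf fiber through the same point) are examples chosen here and do not appear in the text.

```python
import math

def omega(c, c_dot):
    """Left-hand side of (4.7) for position c and velocity c_dot in R^4."""
    x0, x1, x2, x3 = c
    v0, v1, v2, v3 = c_dot
    return -x1 * v0 + x0 * v1 - x3 * v2 + x2 * v3

def great_circle_along_X(s):
    """Curve (cos s, 0, sin s, 0) with its velocity; an integral curve of X."""
    c = (math.cos(s), 0.0, math.sin(s), 0.0)
    c_dot = (-math.sin(s), 0.0, math.cos(s), 0.0)
    return c, c_dot

def hopf_fiber(s):
    """Curve (cos s, sin s, 0, 0) with its velocity; the U(1)-orbit of (1, 0, 0, 0)."""
    c = (math.cos(s), math.sin(s), 0.0, 0.0)
    c_dot = (-math.sin(s), math.cos(s), 0.0, 0.0)
    return c, c_dot

samples = [0.1 * k for k in range(63)]
horizontal_values = [omega(*great_circle_along_X(s)) for s in samples]
fiber_values = [omega(*hopf_fiber(s)) for s in samples]
```

The first list consists of zeros, so the great circle satisfies (4.7); the second list is identically $1$ up to rounding, reflecting that the fiber is everywhere tangent to $V$.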

It is a reformulation of the condition $\dot c \in \ker(\omega)$, where $\omega$ is the one-form from (4.3): $\omega = -x_1\,dx_0 + x_0\,dx_1 - x_3\,dx_2 + x_2\,dx_3$. This form can be written as $\omega = dA_{01} - dA_{32}$, where $dA_{01} = x_0\,dx_1 - x_1\,dx_0$ and $dA_{32} = x_3\,dx_2 - x_2\,dx_3$ are, up to the factor $1/2$, the area forms on the planes $(x_0, x_1)$ and $(x_3, x_2)$, respectively. Let us denote by $A_{01}$ the area swept by the projection of the curve $c$ onto the $(x_0, x_1)$-plane, and by $A_{32}$ the area swept by the projection of the curve $c$ onto the $(x_3, x_2)$-plane. Then the curve $c$ is horizontal if and only if $A_{01} = A_{32}$, see Figure 4.1. Compare this with the isoperimetric property of a horizontal curve on the Heisenberg group $\mathbb{H}^1$.

Figure 4.1. Projections of $c$ to the planes $(x_0, x_1)$ and $(x_3, x_2)$.

Observe that the considered vector fields $X, Y, V$ are right invariant vector fields produced by the left action of $S^3$ on itself. The right action of the group $S^3$ on itself leads to left invariant vector fields on $S^3$. This phenomenon is general for the action of groups, see Appendix A.

The geodesics can be found by making use of the Hamiltonian approach as in the case of the Heisenberg group, but the Hamiltonian equations in this case are much more difficult. This method was exploited in [28, 29], where the sub-Riemannian structure was defined by left invariant vector fields. The authors of [74] showed that the result of Theorem 3.1 remains true for the sub-Riemannian manifold $S^3$, where they used the complex structure described in Remark 3. They considered the horizontal distribution defined by the right invariant vector fields on $S^3$. We will present a different method to find sub-Riemannian geodesics on $S^3$. This method is valid for all odd-dimensional spheres and even for all principal bundles with the appropriate choice of metrics. It will be one of the main points of consideration in Section 5.

4.2. Sub-Riemannian structures on $S^7$

4.2.1. Tangent vector fields for $S^7$. In this section we obtain two structurally different types of horizontal distributions on $S^7$. One of them is of rank 6 and the other is of rank 4. We start from the construction of a convenient basis of tangent vector fields on the sphere $S^7$. The multiplication of unit octonions is not associative, therefore $S^7$ is not a group, in contrast with $S^3$. Nevertheless, we are still able to use the multiplication

law in order to find global non-vanishing tangent vector fields. In calculations we use a multiplication table of unit octonions that is slightly different from the one considered in Subsection 3.3, and that leads to a different product. It is more convenient for our purpose in this subsection. Both the multiplication table of unit octonions and the product of two arbitrary octonions are presented in Appendix B. The multiplication rule induces a matrix representation of the right octonion multiplication, given explicitly by

\[ dr_\tau = \begin{pmatrix}
y^0 & -y^1 & -y^2 & -y^3 & -y^4 & -y^5 & -y^6 & -y^7 \\
y^1 & y^0 & y^3 & -y^2 & y^5 & -y^4 & -y^7 & y^6 \\
y^2 & -y^3 & y^0 & y^1 & y^6 & y^7 & -y^4 & -y^5 \\
y^3 & y^2 & -y^1 & y^0 & y^7 & -y^6 & y^5 & -y^4 \\
y^4 & -y^5 & -y^6 & -y^7 & y^0 & y^1 & y^2 & y^3 \\
y^5 & y^4 & -y^7 & y^6 & -y^1 & y^0 & -y^3 & y^2 \\
y^6 & y^7 & y^4 & -y^5 & -y^2 & y^3 & y^0 & -y^1 \\
y^7 & -y^6 & y^5 & y^4 & -y^3 & -y^2 & y^1 & y^0
\end{pmatrix} \]

for $\tau = y^0 + \sum_{k=1}^{7} y^k j_k$. We are able to find globally defined tangent vector fields which are invariant under the right multiplication rule. We proceed by analogy with the constructions made for $S^3$. The explicit formulas of vector fields are given in Appendix B. The vector fields $\{Y_1, \dots, Y_7\}$ form a frame for $TS^7$ and $Y_0$ is the normal to $S^7$. More explicitly,

\[ \bigl(Y_i(\tau), Y_j(\tau)\bigr) = \delta_{ij}, \qquad \tau \in S^7, \quad i, j \in \{0, 1, \dots, 7\}, \]

where $(\cdot\,,\cdot)$ is the standard inner product in $\mathbb{R}^8$, and $\delta_{ij}$ stands for Kronecker's delta.

4.2.2. CR-structure and the Hopf map on $S^{2n+1}$. Before we go further in studying structures on $S^7$, we present general relations between the CR-structures on odd-dimensional spheres and the higher-dimensional Hopf fibration.
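Returning briefly to the matrix $dr_\tau$ of Subsection 4.2.1: its orthogonality for unit $\tau$ can be checked numerically. The sketch below assumes the entries of $dr_\tau$ as transcribed here from the display in Subsection 4.2.1; the explicit fields $Y_i$ of Appendix B are not reproduced, so the code only tests that the columns are orthonormal and that the first column is the position vector $\tau$ itself. Function names are hypothetical.

```python
import math
import random

def right_multiplication_matrix(y):
    """Rows of dr_tau for tau = (y0, ..., y7), transcribed from Subsection 4.2.1."""
    y0, y1, y2, y3, y4, y5, y6, y7 = y
    return [
        [y0, -y1, -y2, -y3, -y4, -y5, -y6, -y7],
        [y1,  y0,  y3, -y2,  y5, -y4, -y7,  y6],
        [y2, -y3,  y0,  y1,  y6,  y7, -y4, -y5],
        [y3,  y2, -y1,  y0,  y7, -y6,  y5, -y4],
        [y4, -y5, -y6, -y7,  y0,  y1,  y2,  y3],
        [y5,  y4, -y7,  y6, -y1,  y0, -y3,  y2],
        [y6,  y7,  y4, -y5, -y2,  y3,  y0, -y1],
        [y7, -y6,  y5,  y4, -y3, -y2,  y1,  y0],
    ]

def random_unit_vector(dim, rng):
    """Hypothetical helper: a random point of the unit sphere in R^dim."""
    v = [rng.gauss(0.0, 1.0) for _ in range(dim)]
    norm = math.sqrt(sum(a * a for a in v))
    return [a / norm for a in v]

rng = random.Random(7)
tau = random_unit_vector(8, rng)
matrix = right_multiplication_matrix(tau)

# Columns of the matrix: column 0 is tau, columns 1..7 are candidate tangent vectors.
columns = [[matrix[i][j] for i in range(8)] for j in range(8)]
column_gram = [[sum(a * b for a, b in zip(u, v)) for v in columns] for u in columns]
first_column_gap = max(abs(a - b) for a, b in zip(columns[0], tau))
```

Since `column_gram` is the identity matrix up to rounding, columns 1 to 7 are orthonormal and orthogonal to $\tau$, hence tangent to $S^7$ at $\tau$, in line with the relation $(Y_i(\tau), Y_j(\tau)) = \delta_{ij}$.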

Consider $S^{2n+1} = \{z \in \mathbb{C}^{n+1} \mid \|z\|^2 = 1\}$. Then the right $U(1)$-action on $S^{2n+1}$ given by

\[ (z_0, \dots, z_n).\upsilon = (z_0\upsilon, \dots, z_n\upsilon), \]

for $\upsilon \in U(1)$ and $(z_0, \dots, z_n) \in S^{2n+1}$, induces the principal $U(1)$-bundle $U(1) \to S^{2n+1} \xrightarrow{\ h\ } \mathbb{CP}^n$ given explicitly by

\[ S^{2n+1} \ni (z_0, \dots, z_n) \mapsto h(z_0, \dots, z_n) = [z_0 : \cdots : z_n] \in \mathbb{CP}^n, \]

where $[z_0 : \cdots : z_n]$ denotes homogeneous coordinates. This map is called the higher Hopf fibration. The kernel of the differential of the map $h : S^{2n+1} \to \mathbb{CP}^n$ gives the vertical space at each point of $S^{2n+1}$. The horizontal distribution or the Ehresmann connection $D$ is given by the orthogonal complement to the vertical distribution $V$ with respect to the inner product of $\mathbb{R}^{2n+2}$. We show that the vertical space is always given by the action of the standard almost complex structure in $\mathbb{C}^{n+1}$ on the normal vector field

to $S^{2n+1}$, and the Ehresmann connection coincides with the holomorphic tangent space at each point of $S^{2n+1}$.

Theorem 4.1 asserts that any odd-dimensional sphere has at least one globally defined non-vanishing tangent vector field. If the dimension of the sphere is of the form $4n + 1$, then it has only one linearly independent globally defined non-vanishing tangent vector field. If the dimension of the sphere is of the form $4n + 3$, then the sphere admits at least three linearly independent globally defined non-vanishing vector fields. Any sphere $S^{2n+1}$ possesses the vector field

\[ V = -y^1\partial_0 + y^0\partial_1 - y^3\partial_2 + y^2\partial_3 - \cdots - y^{2n+1}\partial_{2n} + y^{2n}\partial_{2n+1}. \tag{4.8} \]

Observe that this vector field has appeared already in two cases: as the vector field $V$ for $S^3$, and as the vector field $Y_1$ for $S^7$. The vector field $V$ encodes valuable information concerning the CR-structure of $S^{2n+1}$. A result of [14] states that the sphere $S^{2n+1}$, as a smooth hypersurface in $\mathbb{C}^{n+1}$, admits a holomorphic tangent space of dimension $\dim_{\mathbb{R}}(H_qS^{2n+1}) = 2n$ for any point $q \in S^{2n+1}$. The following lemma implies a description of the holomorphic tangent space $H_qS^{2n+1}$ as the orthogonal complement to $V$.

Lemma 1. Let $W$ be a Euclidean space of dimension $k + 2$, $k \geq 1$, with an inner product $(\cdot\,,\cdot)_W$ and let $X, Y$ be two vectors from $W$. Consider an orthogonal decomposition $W = \operatorname{span}\{X, Y\} \oplus_\perp \widetilde W$ with respect to $(\cdot\,,\cdot)_W$ and an orthogonal endomorphism $A : W \to W$ such that $A(\operatorname{span}\{X, Y\}) = \operatorname{span}\{X, Y\}$. Then $\widetilde W$ is an invariant space under the action of $A$, i.e., $A(\widetilde W) = \widetilde W$.

Proof. Let $v \in \widetilde W$. Then for any $\alpha, \beta \in \mathbb{R}$ it is clear that

\[ \bigl(Av, \alpha X + \beta Y\bigr)_W = \bigl(v, A^{tr}(\alpha X + \beta Y)\bigr)_W = \bigl(v, A^{-1}(\alpha X + \beta Y)\bigr)_W. \]

Since $A(\operatorname{span}\{X, Y\}) = \operatorname{span}\{X, Y\}$, there exist $a, b \in \mathbb{R}$ such that $A^{-1}(\alpha X + \beta Y) = aX + bY$, and therefore $\bigl(Av, \alpha X + \beta Y\bigr)_W = \bigl(v, aX + bY\bigr)_W = 0$, which implies $Av \in \widetilde W$. □

As an application of Lemma 1, it is possible to obtain an explicit characterization of the space $H_qS^{2n+1}$.

Lemma 2. The vector space $H_qS^{2n+1}$ is the orthogonal complement to the vector $V_q \in T_qS^{2n+1}$ from (4.8) for any $q \in S^{2n+1}$.

Proof. Consider the vector space $W_q = \operatorname{span}\{N(q)\} \oplus_\perp T_qS^{2n+1} \cong T_q\mathbb{R}^{2n+2}$, where $N(q)$ is the normal vector to $S^{2n+1}$ at the point $q$. The standard almost complex structure map $J : W_q \to W_q$ is orthogonal. Moreover, $J(V(q)) = -N(q)$ and $J(N(q)) = V(q)$. Using the decomposition $W_q = \widetilde W_q \oplus_\perp \operatorname{span}\{V(q), N(q)\}$, it is possible to apply Lemma 1 in order to conclude that $\widetilde W_q$, which is the orthogonal complement to $V(q)$ in $T_qS^{2n+1}$, is invariant under $J$. Since $\dim_{\mathbb{R}}\widetilde W_q = 2n$, we conclude that $\widetilde W_q = H_qS^{2n+1}$. □

The space $HS^{2n+1}$ can also be described as the kernel of the one-form $\theta = \bar z_0\,dz_0 + \cdots + \bar z_n\,dz_n$. Indeed, consider $X \in HS^{2n+1}$. Then by straightforward calculations we have

\[ \theta(X) = (X, N) + i(X, V) = 0. \tag{4.9} \]
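The straightforward calculation behind (4.9) can be written out as follows. This is a sketch in which the coordinates $z_k = x_k + i y_k$ and the components $a_k, b_k$ of $X$ are notation introduced here only for illustration; in these coordinates $N = \sum_k (x_k\partial_{x_k} + y_k\partial_{y_k})$ and $V = \sum_k (-y_k\partial_{x_k} + x_k\partial_{y_k})$, which is (4.8) after relabelling.

```latex
\begin{align*}
X &= \sum_{k=0}^{n}\bigl(a_k\,\partial_{x_k} + b_k\,\partial_{y_k}\bigr),
  \qquad dz_k(X) = a_k + i\,b_k, \\
\theta(X) &= \sum_{k=0}^{n} (x_k - i\,y_k)(a_k + i\,b_k) \\
  &= \sum_{k=0}^{n} (x_k a_k + y_k b_k) \;+\; i \sum_{k=0}^{n} (x_k b_k - y_k a_k) \\
  &= (X, N) + i\,(X, V).
\end{align*}
```

The real part vanishes because $X$ is tangent to the sphere, and the imaginary part vanishes by Lemma 2.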

Lemma 2 provides a horizontal distribution of rank $2n$ for the spheres $S^{2n+1}$, by considering the holomorphic tangent bundle: $D = HS^{2n+1}$. The bracket generating property follows from the following general result for an arbitrary contact manifold.

Definition 19. Let $M$ be a $(2n + 1)$-dimensional manifold. A smooth one-form $\omega$ is called contact if it satisfies the condition

\[ \omega_q \wedge (d\omega_q)^n \neq 0 \qquad \text{for any } q \in M. \]

The pair $(M, \omega)$ is called a contact manifold.

Lemma 3. Let $M$ be a $(2n + 1)$-dimensional contact manifold with contact form $\omega$. Then $D = \ker(\omega)$ is a bracket generating distribution of rank $2n$ and step 2.

Proof. Recall Cartan's formula for a differential one-form $\omega$, namely

\[ d\omega(X, Y) = X(\omega(Y)) - Y(\omega(X)) - \omega([X, Y]), \tag{4.10} \]

for all $X, Y \in TM$. It follows from (4.10) that $D$ is Frobenius integrable if and only if $d\omega(X, Y) = 0$ for all $X, Y \in D$. If $\omega$ is a contact form, then the restriction of $d\omega$ to $D$ is non-degenerate, so that $d\omega(X, Y) \neq 0$ for suitable $X, Y \in D$ and, therefore, $D$ is not integrable. This implies the bracket generating property for $D$, since if $[X, Y]_q \notin D_q$ at a point $q \in M$ for some $X_q, Y_q \in D_q$, then $\operatorname{span}\{[X, Y]_q\} \oplus D_q = T_qM$. □

By Lemma 3, in order to prove that $HS^{2n+1}$ is bracket generating, it is sufficient to find a contact one-form $\omega$ such that $HS^{2n+1} = \ker(\omega)$. To achieve this, let us consider

\[ \omega = \operatorname{Im}\theta = -y^1\,dy^0 + y^0\,dy^1 - \cdots - y^{2n+1}\,dy^{2n} + y^{2n}\,dy^{2n+1} \tag{4.11} \]

defined on $S^{2n+1}$. By (4.9), the relation $HS^{2n+1} = \ker(\omega)$ holds immediately.

Theorem 4.2 ([58]). The one-form $\omega$ defined in (4.11) is a contact form. More specifically, $\omega$ satisfies

\[ (d\omega)^n \wedge \omega = n! \cdot 2^n\, d\mathrm{vol}_{S^{2n+1}}, \]

where $d\mathrm{vol}_{S^{2n+1}}$ is the volume form for $S^{2n+1}$.

The following corollary holds by Lemma 3 and Theorem 4.2.

Corollary 1. The holomorphic tangent bundle $HS^{2n+1}$ is a bracket generating distribution of step 2 and rank $2n$.
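For $n = 1$ the statement of Theorem 4.2 can be checked by hand or numerically. The sketch below assumes the frame $V, X, Y$ on $S^3$ from Subsection 4.1 and uses $d\omega = 2(dx_0 \wedge dx_1 + dx_2 \wedge dx_3)$, obtained by differentiating (4.11) for $n = 1$; it evaluates $\omega \wedge d\omega$ on the orthonormal frame $(V, X, Y)$, where the volume form takes the value $1$ under the orientation convention $d\mathrm{vol}(u, v, w) = \det(N, u, v, w)$ (an assumption of this sketch).

```python
import math
import random

def frame(q):
    """Vector fields X, Y, V on S^3 at q = (x0, x1, x2, x3) (assumed form)."""
    x0, x1, x2, x3 = q
    return [-x2, x3, x0, -x1], [-x3, -x2, x1, x0], [-x1, x0, -x3, x2]

def omega(q, v):
    """The one-form (4.11) for n = 1, evaluated at q on the vector v."""
    x0, x1, x2, x3 = q
    return -x1 * v[0] + x0 * v[1] - x3 * v[2] + x2 * v[3]

def d_omega(u, v):
    """d(omega) = 2 (dx0 ^ dx1 + dx2 ^ dx3), evaluated on the pair (u, v)."""
    return 2 * ((u[0] * v[1] - u[1] * v[0]) + (u[2] * v[3] - u[3] * v[2]))

def omega_wedge_d_omega(q, u, v, w):
    """Value of the three-form omega ^ d(omega) on (u, v, w) at q."""
    return (omega(q, u) * d_omega(v, w)
            - omega(q, v) * d_omega(u, w)
            + omega(q, w) * d_omega(u, v))

def random_unit_vector(dim, rng):
    """Hypothetical helper: a random point of the unit sphere in R^dim."""
    v = [rng.gauss(0.0, 1.0) for _ in range(dim)]
    norm = math.sqrt(sum(a * a for a in v))
    return [a / norm for a in v]

rng = random.Random(3)
values = []
for _ in range(5):
    q = random_unit_vector(4, rng)
    X, Y, V = frame(q)
    values.append(omega_wedge_d_omega(q, V, X, Y))
```

Each entry of `values` equals $2 = 1! \cdot 2^1$ up to rounding, in agreement with the theorem for $n = 1$. By Cartan's formula (4.10) the same computation gives $\omega([X, Y]) = -d\omega(X, Y) = -2$, so $[X, Y]$ leaves the distribution, as Lemma 3 predicts.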

An important consequence of Theorem 4.2 follows by considering a classical result by G. Darboux, see [38]. In modern terms, this theorem asserts that every $(2n + 1)$-dimensional contact manifold is locally the $n$-dimensional Heisenberg group. This means precisely that the tangent cone of $S^{2n+1}$, as a sub-Riemannian manifold with distribution $HS^{2n+1}$ and metric induced by the usual Euclidean metric in $\mathbb{R}^{2n+2}$, is isomorphic to the $n$-dimensional Heisenberg group. See [13, 63] for the definition of the tangent cone to a sub-Riemannian manifold.

It is necessary to remark that, in general, there is no globally defined basis for $HS^{2n+1}$. By Theorem 4.1, this is only possible for $S^3$ and $S^7$. A basis for the distribution in the case of $S^3$ has already been discussed. An explicit proof that shows the bracket generating property of the basis of $HS^7$ can be found in [9, 10, 58].

We conclude this section by proving that the line $\operatorname{span}\{V\}$ from (4.8) forms the kernel of $dh$, where $h$ is the Hopf fibration $U(1) \to S^{2n+1} \xrightarrow{\ h\ } \mathbb{CP}^n$. The orthogonal complement to $V$ is the horizontal distribution $D = HS^{2n+1}$. To achieve this, we recall that the charts defining the holomorphic structure of $\mathbb{CP}^n$ are given by the open sets $U_k = \{[z_0 : \cdots : z_n] : z_k \neq 0\}$, together with the homeomorphisms

\[ \varphi_k : U_k \to \mathbb{C}^n, \qquad [z_0 : \cdots : z_n] \mapsto \Bigl(\frac{z_0}{z_k}, \dots, \frac{z_{k-1}}{z_k}, \frac{z_{k+1}}{z_k}, \dots, \frac{z_n}{z_k}\Bigr). \]

Then, without loss of generality, we assume that $n = 3$ and perform explicit calculations for $k = 0$. Other cases can be treated similarly. Using the chart $(U_0, \varphi_0)$ defined above, we have the map

\[ \varphi_0 \circ h : S^7 \to \mathbb{C}^3, \qquad (z_0, z_1, z_2, z_3) \mapsto \Bigl(\frac{z_1}{z_0}, \frac{z_2}{z_0}, \frac{z_3}{z_0}\Bigr), \]

which in real coordinates can be written as

\[ \varphi_0 \circ h(x_0, \dots, x_7) = \Bigl(\frac{x_0x_2 + x_1x_3}{x_0^2 + x_1^2},\ \frac{x_0x_3 - x_1x_2}{x_0^2 + x_1^2},\ \frac{x_0x_4 + x_1x_5}{x_0^2 + x_1^2},\ \frac{x_0x_5 - x_1x_4}{x_0^2 + x_1^2},\ \frac{x_0x_6 + x_1x_7}{x_0^2 + x_1^2},\ \frac{x_0x_7 - x_1x_6}{x_0^2 + x_1^2}\Bigr). \]

The differential of this mapping is given by the matrix

\[ d(\varphi_0 \circ h) = (A, B) \in \mathbb{R}^{6\times 8}, \]

where $A \in \mathbb{R}^{6\times 2}$ and $B \in \mathbb{R}^{6\times 6}$ have the following forms:

\[ A = \frac{1}{(x_0^2 + x_1^2)^2}\begin{pmatrix}
(x_1^2 - x_0^2)x_2 - 2x_0x_1x_3 & (x_0^2 - x_1^2)x_3 - 2x_0x_1x_2 \\
(x_1^2 - x_0^2)x_3 + 2x_0x_1x_2 & (x_1^2 - x_0^2)x_2 - 2x_0x_1x_3 \\
(x_1^2 - x_0^2)x_4 - 2x_0x_1x_5 & (x_0^2 - x_1^2)x_5 - 2x_0x_1x_4 \\
(x_1^2 - x_0^2)x_5 + 2x_0x_1x_4 & (x_1^2 - x_0^2)x_4 - 2x_0x_1x_5 \\
(x_1^2 - x_0^2)x_6 - 2x_0x_1x_7 & (x_0^2 - x_1^2)x_7 - 2x_0x_1x_6 \\
(x_1^2 - x_0^2)x_7 + 2x_0x_1x_6 & (x_1^2 - x_0^2)x_6 - 2x_0x_1x_7
\end{pmatrix} \]

and

\[ B = \frac{1}{x_0^2 + x_1^2}\begin{pmatrix}
x_0 & x_1 & 0 & 0 & 0 & 0 \\
-x_1 & x_0 & 0 & 0 & 0 & 0 \\
0 & 0 & x_0 & x_1 & 0 & 0 \\
0 & 0 & -x_1 & x_0 & 0 & 0 \\
0 & 0 & 0 & 0 & x_0 & x_1 \\
0 & 0 & 0 & 0 & -x_1 & x_0
\end{pmatrix}. \]
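The matrices $A$ and $B$ lend themselves to a direct numerical check of the quantities needed in the next step, namely the determinant of $[d(\varphi_0\circ h)][d(\varphi_0\circ h)]^{tr}$ and the images of $N$ and $V$. The following sketch evaluates them at one sample point of $S^7$; the sample point and the helper names are chosen here for illustration, and the determinant is computed by plain Gaussian elimination.

```python
import math

def dot(u, v):
    return sum(a * b for a, b in zip(u, v))

def jacobian(x):
    """Rows of d(phi_0 o h) = (A, B) at x = (x0, ..., x7), assuming x0^2 + x1^2 != 0."""
    x0, x1 = x[0], x[1]
    r = x0 * x0 + x1 * x1
    rows = []
    for k in (2, 4, 6):
        a, b = x[k], x[k + 1]      # real and imaginary parts of z_1, z_2, z_3
        row_re = [0.0] * 8         # gradient of (x0*a + x1*b) / r
        row_im = [0.0] * 8         # gradient of (x0*b - x1*a) / r
        row_re[0] = ((x1 * x1 - x0 * x0) * a - 2 * x0 * x1 * b) / r ** 2
        row_re[1] = ((x0 * x0 - x1 * x1) * b - 2 * x0 * x1 * a) / r ** 2
        row_im[0] = ((x1 * x1 - x0 * x0) * b + 2 * x0 * x1 * a) / r ** 2
        row_im[1] = ((x1 * x1 - x0 * x0) * a - 2 * x0 * x1 * b) / r ** 2
        row_re[k], row_re[k + 1] = x0 / r, x1 / r
        row_im[k], row_im[k + 1] = -x1 / r, x0 / r
        rows += [row_re, row_im]
    return rows

def determinant(m):
    """Determinant by Gaussian elimination with partial pivoting."""
    m = [row[:] for row in m]
    n = len(m)
    det = 1.0
    for i in range(n):
        pivot = max(range(i, n), key=lambda row_index: abs(m[row_index][i]))
        if m[pivot][i] == 0.0:
            return 0.0
        if pivot != i:
            m[i], m[pivot] = m[pivot], m[i]
            det = -det
        det *= m[i][i]
        for j in range(i + 1, n):
            factor = m[j][i] / m[i][i]
            for c in range(i, n):
                m[j][c] -= factor * m[i][c]
    return det

raw = [3.0, 2.0, 1.0, 1.0, 1.0, 1.0, 1.0, 1.0]
norm = math.sqrt(dot(raw, raw))
x = [a / norm for a in raw]                     # a sample point of S^7 with z_0 != 0
r0 = x[0] ** 2 + x[1] ** 2                      # |z_0|^2

J = jacobian(x)
N = x                                           # normal vector to S^7 at x
V = [-x[1], x[0], -x[3], x[2], -x[5], x[4], -x[7], x[6]]   # vector field (4.8)

gram = [[dot(u, v) for v in J] for u in J]      # the 6 x 6 matrix J J^tr
det_gram = determinant(gram)
J_N = [dot(row, N) for row in J]
J_V = [dot(row, V) for row in J]
```

At this point `det_gram` agrees with $(x_0^2 + x_1^2)^{-8}$, and both `J_N` and `J_V` vanish up to rounding.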

By straightforward calculations, we know that

\[ \det\bigl([d(\varphi_0 \circ h)][d(\varphi_0 \circ h)]^{tr}\bigr) = (x_0^2 + x_1^2)^{-8} = |z_0|^{-16} \neq 0, \]

therefore, the matrix $d(\varphi_0 \circ h)$ has rank 6 or, equivalently, $\dim_{\mathbb{R}}\ker\bigl(d(\varphi_0 \circ h)\bigr) = 2$. Moreover, since $d(\varphi_0 \circ h)(N) = d(\varphi_0 \circ h)(V) = 0$ by direct calculations, we conclude

\[ \ker\bigl(d(\varphi_0 \circ h)\bigr) = \operatorname{span}\{N, V\}. \]

This implies $\ker(dh) = \operatorname{span}\{V\}$.

4.2.3. Application of the first quaternionic Hopf map. With the help of the quaternionic Hopf bundle $S^3 \to S^7 \to S^4$ we wish to find a natural choice of horizontal distributions of rank 4 on $S^7$. The right action of $S^3$ on $S^7$ is defined in the following way. Represent any point of $S^7$ as a pair of quaternions $(q_1, q_2)$ of the norm $|(q_1, q_2)| = 1$. Any point in $S^3$ is also a quaternion $\upsilon \in S^3$ of unit norm. Then the right action is

\[ \mu : S^7 \times S^3 \to S^7, \qquad (q_1, q_2).\upsilon \mapsto (q_1\upsilon, q_2\upsilon), \]

where $q_1\upsilon$ and $q_2\upsilon$ are the usual products of quaternions. We consider the quaternionic Hopf map given by

\[ h : S^7 \to S^4, \qquad (z, w) \mapsto \bigl(|z|^2 - |w|^2,\ 2z\bar w\bigr), \tag{4.12} \]

which can be written in real coordinates as

\[ \begin{aligned} h(x_0, \dots, x_7) = \bigl(&x_0^2 + x_1^2 + x_2^2 + x_3^2 - x_4^2 - x_5^2 - x_6^2 - x_7^2, \\ &2(x_0x_4 + x_1x_5 + x_2x_6 + x_3x_7), \\ &2(-x_0x_5 + x_1x_4 - x_2x_7 + x_3x_6), \\ &2(-x_0x_6 + x_1x_7 + x_2x_4 - x_3x_5), \\ &2(-x_0x_7 - x_1x_6 + x_2x_5 + x_3x_4)\bigr). \end{aligned} \tag{4.13} \]

The differential map $dh$ is the following:

\[ dh = 2\begin{pmatrix}
x_0 & x_1 & x_2 & x_3 & -x_4 & -x_5 & -x_6 & -x_7 \\
x_4 & x_5 & x_6 & x_7 & x_0 & x_1 & x_2 & x_3 \\
-x_5 & x_4 & -x_7 & x_6 & x_1 & -x_0 & x_3 & -x_2 \\
-x_6 & x_7 & x_4 & -x_5 & x_2 & -x_3 & -x_0 & x_1 \\
-x_7 & -x_6 & x_5 & x_4 & x_3 & x_2 & -x_1 & -x_0
\end{pmatrix}. \]
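The formulas (4.12), (4.13) and the matrix $dh$ can be cross-checked numerically. The sketch below implements the Hamilton product directly, with quaternions stored as 4-tuples (real, $i$, $j$, $k$), a convention assumed here. It compares the quaternionic and the real-coordinate forms of $h$, verifies that $h$ lands in $S^4$ and is invariant under the right $S^3$-action, and tests the matrix $dh$ through Euler's relation $dh(x)\,x = 2h(x)$, which holds because every component of $h$ is a homogeneous quadratic polynomial.

```python
import math
import random

def qmul(p, q):
    """Hamilton product of quaternions given as tuples (real, i, j, k)."""
    a0, a1, a2, a3 = p
    b0, b1, b2, b3 = q
    return (
        a0 * b0 - a1 * b1 - a2 * b2 - a3 * b3,
        a0 * b1 + a1 * b0 + a2 * b3 - a3 * b2,
        a0 * b2 - a1 * b3 + a2 * b0 + a3 * b1,
        a0 * b3 + a1 * b2 - a2 * b1 + a3 * b0,
    )

def qconj(q):
    return (q[0], -q[1], -q[2], -q[3])

def hopf_quaternionic(x):
    """h(z, w) = (|z|^2 - |w|^2, 2 z conj(w)) as in (4.12), for x in R^8."""
    z, w = tuple(x[:4]), tuple(x[4:])
    first = sum(a * a for a in z) - sum(a * a for a in w)
    return (first,) + tuple(2 * c for c in qmul(z, qconj(w)))

def hopf_real(x):
    """The same map in real coordinates, as in (4.13)."""
    x0, x1, x2, x3, x4, x5, x6, x7 = x
    return (
        x0**2 + x1**2 + x2**2 + x3**2 - x4**2 - x5**2 - x6**2 - x7**2,
        2 * (x0 * x4 + x1 * x5 + x2 * x6 + x3 * x7),
        2 * (-x0 * x5 + x1 * x4 - x2 * x7 + x3 * x6),
        2 * (-x0 * x6 + x1 * x7 + x2 * x4 - x3 * x5),
        2 * (-x0 * x7 - x1 * x6 + x2 * x5 + x3 * x4),
    )

def hopf_differential(x):
    """Rows of the matrix dh displayed above."""
    x0, x1, x2, x3, x4, x5, x6, x7 = x
    rows = [
        [x0, x1, x2, x3, -x4, -x5, -x6, -x7],
        [x4, x5, x6, x7, x0, x1, x2, x3],
        [-x5, x4, -x7, x6, x1, -x0, x3, -x2],
        [-x6, x7, x4, -x5, x2, -x3, -x0, x1],
        [-x7, -x6, x5, x4, x3, x2, -x1, -x0],
    ]
    return [[2 * entry for entry in row] for row in rows]

def random_unit_vector(dim, rng):
    """Hypothetical helper: a random point of the unit sphere in R^dim."""
    v = [rng.gauss(0.0, 1.0) for _ in range(dim)]
    norm = math.sqrt(sum(a * a for a in v))
    return [a / norm for a in v]

rng = random.Random(1)
x = random_unit_vector(8, rng)
u = tuple(random_unit_vector(4, rng))           # a unit quaternion, i.e. a point of S^3
x_moved = list(qmul(tuple(x[:4]), u)) + list(qmul(tuple(x[4:]), u))   # (q1 u, q2 u)

h_x = hopf_quaternionic(x)
coordinate_gap = max(abs(a - b) for a, b in zip(h_x, hopf_real(x)))
norm_of_image = math.sqrt(sum(a * a for a in h_x))
invariance_gap = max(abs(a - b) for a, b in zip(h_x, hopf_quaternionic(x_moved)))
euler_gap = max(abs(sum(e * c for e, c in zip(row, x)) - 2 * h)
                for row, h in zip(hopf_differential(x), h_x))
```

All three gaps are zero up to rounding and `norm_of_image` equals $1$, so in particular the fibers of $h$ contain the orbits of the right $S^3$-action.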

Since none of the commutators $[Y_i, Y_j]$, $i, j = 1, \dots, 7$, coincides with $Y_k$ for $k = 1, \dots, 7$, we look for the kernel of $dh$ among the commutators $Y_{ij}$, $i, j = 1, \dots, 7$. The precise form of the commutators is given in Appendix B. After tedious calculations we find that

\[ dh(Y_{45}) = dh(Y_{46}) = dh(Y_{56}) = 0. \]

Define $V = \operatorname{span}\{Y_{45}, Y_{46}, Y_{56}\}$. Notice that the commutation relations between $Y_{45}, Y_{46}, Y_{56}$ are

\[ [Y_{45}, Y_{46}] = Y_{56}, \qquad [Y_{46}, Y_{56}] = Y_{45}, \qquad [Y_{56}, Y_{45}] = Y_{46}, \]

reflecting that they form a Lie algebra for the Lie group $S^3 \cong SU(2)$. As in the previous cases we define the horizontal distribution $D$, or the Ehresmann connection, as the orthogonal complement to $V$ with respect to the usual inner product in $\mathbb{R}^8$. The sub-Riemannian metric $g_D$ is the restriction of the inner product from $\mathbb{R}^8$ to $D$.

Remark 4. Another way to construct the horizontal distribution was proposed in [58]. The authors presented several bracket generating distributions transversal to $V$. None of them has a globally defined non-vanishing basis, whereas the vertical space $V$ has such a basis. The authors of the paper [9], see also [10], presented another horizontal distribution of rank 4 that was constructed by considering the Clifford algebra structure of $S^7$ and that possesses a globally defined non-vanishing basis. However, in this case a globally defined basis of the vertical space, different from $V$ constructed above, was not found. We would like to draw attention to the paper [10], where a complete description of trivializable sub-Riemannian structures on $S^{2n+1}$ induced by Clifford algebras is given.

5. Principal bundles

In this section we show the explicit formulas for geodesics on any odd-dimensional sphere. They were found by making use of a result from [111]. To present this result we give all necessary definitions, prove the theorem of existence of geodesics and apply it to odd-dimensional spheres. At the end, we will give some applications of the geometry of principal bundles to physics and will illustrate them by exploiting the results from previous sections. We recommend the book [76] as an introduction to the theory of smooth bundles.

5.1. Ehresmann connection

In this subsection we describe two possible ways to introduce sub-Riemannian structures on a smooth manifold $M$, provided that there exists a submersion $\pi : M \to B$ to another smooth manifold $B$. We call the map $\pi$ the projection and the manifold $B$ the base space. For $q \in M$ we call the pre-image $F_b = \pi^{-1}(b)$, $b = \pi(q)$, the fiber through the point $q \in M$. The set $F_b$ is a smooth submanifold of $M$. Since the differential map $d_q\pi$ is surjective for all $q \in M$, the kernel $\ker(d_q\pi)$ has dimension $\dim M - \dim B$; it is non-trivial whenever $\dim M > \dim B$. We denote it by $V_q$ and call it the vertical space at $q \in M$. The collection of all vertical spaces is called the vertical distribution or vertical sub-bundle $V \subset TM$. The vertical space is actually the tangent space $V_q = T_qF$ to the fiber $F$ passing through $q$.

Definition 20. An Ehresmann connection for a submersion $\pi : M \to B$ is a distribution $D \subset TM$ that is everywhere transverse and of complementary dimension to the vertical distribution $V$:

\[ T_qM = D_q \oplus V_q, \qquad q \in M. \]

Notice that $\oplus$ denotes only transversality of two vector spaces at $q \in M$, but not orthogonality, because no metric has been defined on $M$ so far. The vector space $D_q$ is an example of a horizontal vector space and the Ehresmann connection is an example of a horizontal distribution. Notice that given a submersion we always have a vertical distribution, whereas the construction of the horizontal distribution or the Ehresmann connection is a question of mathematical art.

There are two ways to introduce the sub-Riemannian structure on a manifold $M$ by making use of a given submersion $\pi : M \to B$.

Case 1. Sub-Riemannian structure by restriction. Suppose that we have a submersion $\pi : M \to B$ and that the manifold $M$ is endowed with a Riemannian metric $g_M$. Let $V_q = \ker(d_q\pi)$. Define the horizontal vector space $D_q$ as the orthogonal complement to the vertical space $V_q$ at each $q \in M$ with respect to the given metric $g_M$: $D_q \oplus_\perp V_q = T_qM$. The obtained horizontal distribution $D$ will be the Ehresmann connection, since it is orthogonally transversal to $V$ at each point. If we denote by $g_{MD}$ the restriction of the Riemannian metric $g_M$ to the distribution $D$, then $(M, D, g_{MD})$ is the sub-Riemannian manifold defined by the submersion $\pi$ and the Riemannian metric $g_M$ on $M$. In this case the sub-Riemannian length of a horizontal curve is equal to the Riemannian length, since the vertical components vanish.

Case 2. Sub-Riemannian structure by lifting. Suppose now that for the submersion $\pi : M \to B$ the Ehresmann connection $D$ is defined, and moreover, the base manifold $B$ is endowed with a Riemannian metric $g_B$. Since the restriction

$d_q\pi|_{D_q} : D_q \to T_{\pi(q)}B$ is an isomorphism, we can pull back the metric $g_B$ to the horizontal space $D_q$ at each $q \in M$. Denote the obtained metric by $g_{BD}$. Thus,

\[ g_{BD}(v, w) := g_B\bigl(d_q\pi(v), d_q\pi(w)\bigr), \qquad v, w \in D_q, \quad q \in M. \]

The obtained metric varies smoothly with $q \in M$. The triplet $(M, D, g_{BD})$ is called a sub-Riemannian manifold induced by the Ehresmann connection $D$ on $M$ and by the Riemannian metric $g_B$ on $B$. In this case we get the following properties.

1. If a horizontal curve $\gamma$ is given on $M$, then the sub-Riemannian length of $\gamma$ is equal to the Riemannian length of its projection to $B$. This is obvious, since the vertical component of the velocity vector of a horizontal curve $\gamma$ in $T_qM$ is absent, and moreover, the vertical space is projected to the 0-subspace in each tangent space $T_{\pi(q)}B$.

2. What happens if we are given a curve on the base space and we pull it back to the manifold $M$? Define the horizontal lift of a curve $c : I \to B$ to $M$. The horizontal lift of the curve $c$ is a curve $\gamma : I \to M$ such that

(1) $\dot\gamma(t) \in D_{\gamma(t)}$ and (2) $\pi(\gamma(t)) = c(t)$ for all $t \in I$.

The horizontal lift of a Riemannian geodesic in $B$ is a sub-Riemannian geodesic in $M$. If the Riemannian geodesic in $B$ is a length minimizer between its end points, then its horizontal lift is a sub-Riemannian length minimizer between the corresponding fibers.

Remark 5. It is natural to ask when the horizontal lift exists. We will not discuss it here. But if it exists, then given a point $q \in M$ and a curve $c$ starting at $\pi(q)$, the horizontal lift $\gamma$ of $c$ starting from $q \in M$ is unique.

Now, having these two ways of constructing sub-Riemannian structures on $M$, we can ask when these structures coincide. Suppose we are given a submersion $\pi$ of a Riemannian manifold $(M, g_M)$ to a Riemannian manifold $(B, g_B)$ and the Ehresmann connection $D$ is orthogonal to $\ker(d\pi)$ everywhere. If the restriction $d_q\pi|_{D_q} : (D_q, g_{MD}) \to (T_{\pi(q)}B, g_B)$ is a linear isometry of the corresponding vector spaces for all $q \in M$, then $(M, D, g_{MD}) = (M, D, g_{BD})$. In this case the submersion $\pi$ is actually a Riemannian submersion, see Definition 41.

5.2. Metrics on principal bundles

A definition of a fiber bundle is given in Appendix A, Definition 59. We present the definition of the principal bundle in the smooth setting. Let $M$ and $B$ be smooth manifolds and let $G$ be a Lie group. Recall that an action of a Lie group on a smooth manifold is a smooth map by definition.

Definition 21. Let $M$, $B$, and $G$ be as above. A fiber bundle $(F, M, B, \pi)$ is a smooth principal bundle if the typical fiber $F$ has the structure of a Lie group $G$ and, moreover, the group $G$ acts freely and transitively on each fiber, see Definitions 56 and 57.

In the case of a principal $G$-bundle we have the following properties.

Proposition 3 ([75, 125]).
1. The action $\mu_\tau : M \to M$ of the group $G$ is a proper map for any $\tau \in G$, that is, the pre-image of a compact set is compact.
2. The base space $B$ is diffeomorphic to the space $M/G$ of orbits of the group $G$. The space $M/G$ becomes a smooth homogeneous manifold.
3. The map $\pi$ is a natural projection $\pi : M \to M/G$ onto the quotient space and it is a smooth submersion.

We assume from now on that the group $G$ acts on the right: $q \mapsto q.\tau$ for $\tau \in G$, $q \in M$, and we will omit the word "smooth" in the notion of principal bundle.

Definition 22. Let $\pi : M \to B$ be a principal $G$-bundle, $D$ be the Ehresmann connection and let $g_D$ be a sub-Riemannian metric on $M$ associated with $D$. If $g_D$ is invariant under the right action of $G$ on $M$, that is,

\[ g_D(v_q, w_q) = g_D\bigl(dr_\tau(v_q), dr_\tau(w_q)\bigr) = g_D(v_{q.\tau}, w_{q.\tau}), \qquad \tau \in G, \quad q \in M, \]

then the metric $g_D$ is said to be of bundle type.

Example 6. Let us suppose that $\pi : M \to B$ is a principal $G$-bundle and suppose that $M$ is endowed with a Riemannian metric $g_M$ which is invariant with respect to the right action of $G$. Let $D$ be the distribution from Case 1. Then the restriction $g_{MD}$ of $g_M$ to $D$ gives a bundle type metric.

In what follows we want to present the special situation where geodesics can be calculated by making use of Riemannian and sub-Riemannian metrics. Assume that $\pi : M \to B$ is a principal $G$-bundle, $g_M$ is a Riemannian metric on $M$, and $D_q$ is the horizontal space orthogonal to the vertical space $V_q = \ker(d_q\pi)$ at each $q \in M$ with respect to $g_M$. We also suppose that the Riemannian metric $g_M$ is right $G$-invariant. In addition to the above-described structure we assume that the distribution $D$ is furnished with a sub-Riemannian metric $g_D$ of bundle type. We say that the metric $g_M$ is compatible with $g_D$ if the restriction $g_{MD} := g_M|_D$ coincides with $g_D$ on $M$, that is,

\[ g_{MD}(v_q, w_q) = g_D(v_q, w_q) \qquad \text{for all } v_q, w_q \in D_q \text{ and all } q \in M. \]

Now let us consider the restriction $g_{MV} := g_M|_V$ of $g_M$ to the vertical subspace $V_q \subset T_qM$, $q \in M$. The metric $g_{MV}$ is defined on the tangent space $V_q$ to the fiber $G$ at each $q$. Since there is an isomorphism between $V_q$ and the Lie algebra $\mathfrak{g}$ of the Lie group $G$, the metric $g_{MV}$ defines a bilinear symmetric form $I_q : \mathfrak{g} \times \mathfrak{g} \to \mathbb{R}$, which is called the moment of inertia tensor at $q \in M$. We can express it in the following way. By making use of the group exponential map $\exp_G : \mathfrak{g} \to G$, we introduce the infinitesimal generator map $\sigma_q$ for the right $G$-group action on $M$. Namely, $\sigma_q : \mathfrak{g} \to V_q \subset T_qM$ is such that

\[ \mathfrak{g} \ni \xi \mapsto \frac{d}{d\epsilon}\Big|_{\epsilon=0}\, q.\exp(\epsilon\xi). \tag{5.1} \]

Let us observe the following feature. Fixing a fiber, by choosing a point $q \in M$ and a local trivialization, we identify the fiber with the Lie group $G$. Now we can consider $q.\tau$ not only as a right action of the Lie group $G$ ($\tau \in G$) on $M$ along the fiber, but also as the action on the left of $q$ (considered as an element of $G$) on the Lie group $G$, or

\[ q.\tau = l_q(\tau). \tag{5.2} \]

So, it is convenient to think of the infinitesimal generator of the right action as a locally left invariant vector field, since

\[ \sigma_q(\xi) = \frac{d}{d\epsilon}\Big|_{\epsilon=0}\, q.\exp(\epsilon\xi) = \frac{d}{d\epsilon}\Big|_{\epsilon=0}\, l_q\bigl(\exp(\epsilon\xi)\bigr) = dl_q(\xi) \]

by property 1 in Theorem 8.2 or by the definition of the exponential curve in Subsection 3.1, see (3.1). Then we define the bilinear symmetric tensor $I_q : \mathfrak{g} \times \mathfrak{g} \to \mathbb{R}$ by

\[ I_q(\xi, \eta) = g_{MV}\bigl(\sigma_q(\xi), \sigma_q(\eta)\bigr), \qquad \xi, \eta \in \mathfrak{g}, \quad q \in M. \tag{5.3} \]

We conclude that if a right $G$-invariant Riemannian metric $g_M$ is given on a principal $G$-bundle $\pi : M \to B$, then we define the sub-Riemannian structure $(D, g_{MD})$, with $D = V^\perp$, and the moment of inertia tensor $I_q$ (5.3) by means of restrictions of $g_M$.

Conversely, if we have a sub-Riemannian structure $(D, g_D)$ with the metric $g_D$ of bundle type and a moment of inertia tensor $I_q : \mathfrak{g} \times \mathfrak{g} \to \mathbb{R}$, $q \in M$, then we can define a Riemannian metric $g_M$ as follows. Let us write any vector $v_q \in T_qM$ according to the transversal decomposition $D_q \oplus V_q$ as $v_q = v_{D_q} + v_{V_q}$. Then the Riemannian metric $g_M$ is defined by

\[ g_M(v_q, w_q) := g_D(v_{D_q}, w_{D_q}) + I_q\bigl(\sigma_q^{-1}(v_{V_q}), \sigma_q^{-1}(w_{V_q})\bigr), \]

where the inverse map $\sigma_q^{-1}$ is well defined. In order to check that the obtained Riemannian metric $g_M$ is compatible with the bundle type metric $g_D$ we observe that

• $D_q$ and $V_q$ become orthogonal with respect to $g_M$, since if $v_q \in D_q$ and $w_q \in V_q$, then $g_M(v_q, w_q) = g_D(v_q, 0) + I_q\bigl(0, \sigma_q^{-1}(w_q)\bigr) = 0$;
• the restriction of $g_M$ to the distribution $D$ coincides with $g_D$;
• to be right $G$-invariant the metric $g_M$ has to satisfy the relation

\[ g_M(v_q, w_q) = g_M\bigl(dr_\tau(v_q), dr_\tau(w_q)\bigr) = g_M(v_{q.\tau}, w_{q.\tau}). \tag{5.4} \]

Let us reformulate the last condition in terms of the symmetric bilinear tensor $I_q$. The left-hand side of (5.4) yields

\[ g_M(v_q, w_q) = g_D(v_{D_q}, w_{D_q}) + I_q(\xi, \eta), \tag{5.5} \]

where $\sigma_q(\xi) = v_{V_q}$ and $\sigma_q(\eta) = w_{V_q}$. The right-hand side of (5.4) leads to

\[ g_M(v_{q.\tau}, w_{q.\tau}) = g_D\bigl(dr_\tau(v_{D_q}), dr_\tau(w_{D_q})\bigr) + I_{q.\tau}(\zeta, \chi), \tag{5.6} \]

where $\sigma_{q.\tau}(\zeta) = dr_\tau\sigma_q(\xi)$ and $\sigma_{q.\tau}(\chi) = dr_\tau\sigma_q(\eta)$. Let us calculate $dr_\tau\sigma_q(\xi)$. We get

\[ \begin{aligned} dr_\tau\sigma_q(\xi) &= \frac{d}{d\epsilon}\Big|_{\epsilon=0}\, q\exp(\epsilon\xi)\tau = \frac{d}{d\epsilon}\Big|_{\epsilon=0}\, q\tau\,\tau^{-1}\exp(\epsilon\xi)\tau \\ &= \frac{d}{d\epsilon}\Big|_{\epsilon=0}\, q\tau\exp\bigl(\epsilon\operatorname{Ad}_{\tau^{-1}}(\xi)\bigr) = \sigma_{q.\tau}\bigl(\operatorname{Ad}_{\tau^{-1}}(\xi)\bigr), \end{aligned} \tag{5.7} \]

where $\operatorname{Ad}$ is the adjoint action of $G$ over $\mathfrak{g}$; for the definition see Example 13. For the relation between the adjoint map and the exponential map see Appendix A. Analogously, we get $dr_\tau\sigma_q(\eta) = \sigma_{q.\tau}\bigl(\operatorname{Ad}_{\tau^{-1}}(\eta)\bigr)$. Since $g_D$ is right invariant by the definition of a bundle type metric, the equalities (5.4), (5.5), and (5.6) imply

\[ I_q(\xi, \eta) = I_{q\tau}\bigl(\operatorname{Ad}_{\tau^{-1}}(\xi), \operatorname{Ad}_{\tau^{-1}}(\eta)\bigr) \qquad \text{or} \qquad I_{q\tau}(\xi, \eta) = I_q\bigl(\operatorname{Ad}_\tau(\xi), \operatorname{Ad}_\tau(\eta)\bigr), \]

where the second form is obtained by changing $q$ to $q\tau^{-1}$, and then $\tau^{-1}$ to $\tau$. We conclude that to make the constructed Riemannian metric $g_M$ invariant under the right action of the Lie group $G$, we have to require that the given inertia tensor $I_q$ be invariant with respect to the adjoint action of $G$ on its Lie algebra $\mathfrak{g}$.

After all these discussions, we define a metric that we will work with. We will use the terminology of [111].

Definition 23. A Riemannian metric $g_M$ on a smooth manifold $M$ is said to be of constant bi-invariant type if
1. $g_M$ is right $G$-invariant,
2. its inertia tensor $I_q$ is independent of $q \in M$.

The word "constant" refers to the independence of the moment of inertia tensor from the points of the manifold. The bi-invariance reflects the fact that the inertia tensor defines a bi-invariant metric along the fiber $G$. We discuss it in the following remark.

Remark 6 (Bi-invariant metrics on Lie groups). Since the Lie algebra $\mathfrak{g}$ is related to the tangent spaces $T_qG$ of its group by left translations, any tensor $\Theta$ on the algebra $\mathfrak{g}$ corresponds to a left invariant tensor $T$ on its group $G$, because $T$ is defined by making use of left translations to bring $\Theta$ to each point $q \in G$. If, moreover, the tensor $\Theta$ on $\mathfrak{g}$ is invariant under the adjoint action of $G$ on its Lie algebra $\mathfrak{g}$, then the left invariant tensor $T$ on $G$ becomes right invariant. Applying these considerations to the bilinear symmetric non-degenerate form $I_q$ on $\mathfrak{g}$, we obtain that it generates a bi-invariant metric on the fiber $G$ through the point $q$. We see that the adjoint invariance on $\mathfrak{g}$ means that $I_q$ is constant along the fiber through the point $q$.

5.3. Geodesics theorem

Before we state and prove one of the principal theorems, we formulate an auxiliary statement.

Let $G$ be a Lie group, $\mathfrak{g}$ be its Lie algebra and let $g_G$ be a bi-invariant Riemannian metric on the group, which may be given by an adjoint invariant bilinear symmetric form on $\mathfrak{g}$ as was noticed in Remark 6. Thus, the Lie group is also considered as a Riemannian manifold $(G, g_G)$. There are two exponential maps defined in this case: the group exponential $\exp_G : \mathfrak{g} \to G$ and the Riemannian exponential map $(\exp_R)_e : T_eG \cong \mathfrak{g} \to G$.

Proposition 4 ([106]). In the above-stated notations the two exponential maps coincide. In other words, the Riemannian geodesic through the identity of the group $G$ coincides with the one-parameter subgroup produced by the group exponential map.

Let $\pi : M \to B$ be a principal $G$-bundle and let $g_M$ be a Riemannian metric of constant bi-invariant type. Let $D$ be the Ehresmann connection which is orthogonal with respect to $g_M$ to the vertical space $V_q$ at each $q \in M$. Let $g_D$ and $g_V$ denote the restrictions of $g_M$ to $D$ and $V$, respectively. Recall that the infinitesimal action $\sigma_q : \mathfrak{g} \to V_q$ from (5.1) is an isomorphism. Denote by $\operatorname{proj}_q$ the projection from $T_qM$ to $V_q$ at each $q \in M$. The composition $A = \sigma_q^{-1} \circ \operatorname{proj}_q$ is called the $\mathfrak{g}$-valued connection form, see diagram (5.8):

\[ T_qM \xrightarrow{\ \operatorname{proj}_q\ } \ker(d_q\pi) = V_q \xrightarrow[\ \cong\ ]{\ \sigma_q^{-1}\ } \mathfrak{g}, \qquad A = \sigma_q^{-1} \circ \operatorname{proj}_q. \tag{5.8} \]

Let $\exp_R$ be the Riemannian exponential map generated by $g_M$ and let $\gamma_{v,R}(t) = \exp_R(tv)$ be the Riemannian geodesic passing through $q \in M$ with the initial velocity vector $v \in T_qM$. We project this Riemannian geodesic to the base manifold $B$, obtaining a curve $\pi(\gamma_{v,R})$. Then we lift $\pi(\gamma_{v,R})$ horizontally to $M$ and obtain a curve that we denote by $\gamma_{sR}$.

Theorem 5.1 ([111]). In the above-mentioned notations the curve $\gamma_{sR}$ is a normal sub-Riemannian geodesic starting at $q \in M$. It is given by the formula

\[ \gamma_{sR}(t) = \gamma_{v,R}(t)\,\exp_G\bigl(-tA_q(v)\bigr), \qquad v \in T_qM. \tag{5.9} \]

Proof. We follow the ideas in [111]. Since the decomposition $D \oplus_\perp V$ is orthogonal with respect to the Riemannian metric $g_M$, and $g_D$ and $g_V$ are defined by the restriction of $g_M$ to the corresponding distributions, we can define three Hamiltonian functions $H_R$, $H_{sR}$ and $H_V$. Here we denote by $H_R$ the Riemannian Hamiltonian function related to the Riemannian metric $g_M$, by $H_{sR}$ the sub-Riemannian Hamiltonian function related to the metric $g_D$, see (2.12), and by $H_V$ the vertical Hamiltonian function related to $g_V$ and constructed by the same rule as in (2.12). Then the orthogonality of the decomposition $D \oplus_\perp V$ implies that $H_{sR} = H_R - H_V$. Let us also use the notations

$$\exp_R\colon T_qM \to M, \qquad \exp_{sR}\colon D_q \to M, \qquad \exp_V\colon V_q \to M, \qquad q \in M.$$


The rough idea of the proof is to show that if these Hamiltonian functions Poisson commute, then the corresponding flows on $T^*M$ produced by their Hamiltonian vector fields also commute. Therefore, if $v = v_D + v_V$ is the initial velocity vector written according to the decomposition $D_q \oplus_\perp V_q = T_qM$, then the flow commutativity property leads to the commutativity of the exponential maps, that is, $\exp_{sR}(tv_D) = \exp_R(tv)\exp_V(-tv_V)$. In the last step of the proof we observe that $\exp_V(-tv_V)$ coincides with the exponential map of the group $G$, because the metric $g_V$ is bi-invariant along the fiber through $q \in M$.

The first step in the proof of the theorem is to show that the Hamiltonian functions $H_R$, $H_{sR}$, and $H_V$ Poisson commute. Actually we only need to show that $\{H_{sR}, H_V\} = 0$. We use the local trivialization for the bundle $\pi\colon M \to B$. Let $U \subset B$ be a neighborhood of $\pi(q)$; then $\pi^{-1}(U)$ is diffeomorphic to $G \times U$. At the level of cotangent bundles it leads to the diffeomorphism

$$T^*_{\pi^{-1}(U)}M = T^*(G \times U) \cong T^*G \times T^*U.$$

Let us use the coordinates $\pi^{-1}(U) \ni q = (\tau, b) = (\tau^1, \dots, \tau^l, b^1, \dots, b^k) \in G \times U$, $l + k = n = \dim M$, for points, and $T_q^*M \ni \lambda = (\mu, p) = (\mu_1, \dots, \mu_l, p_1, \dots, p_k) \in T_\tau^*G \times T_b^*U$ for momenta. Since the moment of inertia tensor $I$ is independent of $q \in M$ and of the horizontal part of any vector, the dual tensor $I^*\colon \mathfrak{g}^* \times \mathfrak{g}^* \to \mathbb{R}$ is also independent of $q \in M$ and $p \in T_b^*U$. This implies that $H_V(\tau, b, \mu, p)$ is a function of the $\mu$-variables only: $H_V = H_V(\mu)$. The right invariance of the metric $g_D$ leads to the independence of the corresponding Hamiltonian function $H_{sR}$ from the $\tau$-slot of variables: $H_{sR} = H_{sR}(b, \mu, p)$. The Poisson bracket is

$$\{H_{sR}, H_V\} = \left( \sum_{j=1}^{l} \frac{\partial H_{sR}}{\partial \mu_j}\frac{\partial H_V}{\partial \tau^j} + \sum_{j=1}^{k} \frac{\partial H_{sR}}{\partial p_j}\frac{\partial H_V}{\partial b^j} \right) - \left( \sum_{j=1}^{l} \frac{\partial H_{sR}}{\partial \tau^j}\frac{\partial H_V}{\partial \mu_j} + \sum_{j=1}^{k} \frac{\partial H_{sR}}{\partial b^j}\frac{\partial H_V}{\partial p_j} \right).$$

In these sums we have

$$\frac{\partial H_V(\mu)}{\partial \tau^j} = \frac{\partial H_V(\mu)}{\partial b^j} = \frac{\partial H_V(\mu)}{\partial p_j} = 0 \qquad\text{and}\qquad \frac{\partial H_{sR}(b, \mu, p)}{\partial \tau^j} = 0,$$

which implies $\{H_{sR}, H_V\} = 0$.

We conclude that the flows $\Phi_R$, $\Phi_{sR}$, and $\Phi_V$ on $T^*M$ corresponding to $\vec{H}_R$, $\vec{H}_{sR}$, and $\vec{H}_V$, respectively, commute. Recall that the exponential map is a composition of the following maps, see (2.10).

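$$T_qM \xrightarrow{\ \iota\ } TM \xrightarrow{\ \text{duality}\ } T^*M \xrightarrow{\ \Phi\ } T^*M \xrightarrow{\ \mathrm{pr}^*_M\ } M, \qquad \exp = \mathrm{pr}^*_M \circ \Phi \circ \text{duality} \circ \iota,$$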

where we have to change the corresponding flows and $T_qM$ to $D_q$ and $V_q$, respectively. We see that the commutation of the flows leads to the commutation of the exponential maps and we have

$$\exp_{sR}(tv) = \exp_R(tv)\,\exp_V(-tv), \qquad t \in I, \tag{5.10}$$

for $H_{sR} = H_R - H_V$, where $v \in T_qM$ is the initial velocity vector that corresponds to the choice of $\lambda$ in the flow. The curve $\exp_R(tv) = \gamma_{v,R}(t)$, $t \in I$, is the Riemannian geodesic on $M$ starting from $q \in M$ with the initial velocity $v \in T_qM$. Recall that the exponential curve $\exp_V(-tv)$ produced by the Hamiltonian function $H_V(\mu)$ is independent of the $p$ variables in the momentum slot. It gives $\exp_V(-tv) = \exp_V(-tv_V)$. Now we exploit the fact that the Riemannian geodesic $\exp_V(-tv_V)$ coincides with the geodesic (or one-parameter subgroup) given by the group exponential map $\exp_G(-t\sigma_q^{-1}(v_V))$. The composition of the projection of $v$ to the vertical space $V_q$ and of the map $\sigma_q^{-1}$ is called the $\mathfrak{g}$-valued connection form $A_q\colon T_qM \to \mathfrak{g}$, see (5.8). So equation (5.10) takes the form (5.9).

Let us make some observations. The vector $v$ is just an initial velocity vector at $q \in M$. The element $-tA_q(v)$ belongs to $\mathfrak{g}$, and therefore the vector $tv - \sigma(tA_q(v))$ is horizontal for any $t \in I$. So the velocity vector of the resulting curve on the right-hand side of (5.10) is horizontal at any moment $t$, and the resulting curve on the left-hand side is a horizontal geodesic. □

It is shown in [111] that, moreover, all normal sub-Riemannian geodesics are given by the formula (5.9).

5.3.1. Geodesics on odd-dimensional spheres. In the case of odd-dimensional spheres

$$S^{2n+1} = \Big\{ (z_0, z_1, \dots, z_n) \in \mathbb{C}^{n+1} \ \Big|\ \sum_{j=0}^{n} |z_j|^2 = 1 \Big\},$$

there is a natural action of $U(1)$ given by $q.\upsilon = (z_0\upsilon, z_1\upsilon, \dots, z_n\upsilon)$, where $\upsilon \in U(1)$. This action induces the Hopf fibration $U(1) \to S^{2n+1} \to \mathbb{C}P^n$, which forms a principal $U(1)$-bundle with connection $D$ given by the orthogonal complement to the vector field

$$V_q = -y^0\partial_{x^0} + x^0\partial_{y^0} - \cdots - y^n\partial_{x^n} + x^n\partial_{y^n} \tag{5.11}$$

at each $q = (x^0, y^0, \dots, x^n, y^n) \in S^{2n+1}$, $\{z_j = x^j + iy^j\}_{j=0}^{n}$, with respect to the usual inner product in $\mathbb{R}^{2n+2}$. As was shown, this distribution can also be given by $\ker\omega$ for the contact form $\omega = -y^0dx^0 + x^0dy^0 - \cdots - y^ndx^n + x^ndy^n$. Note that $V_q = qi$, $q \in S^{2n+1}$, where $q$ is thought of as a radial vector from the origin to the unit sphere, $i$ is the complex imaginary unit, and $q.i = qi$, $i \in \mathfrak{u}(1)$, is the $\mathfrak{u}(1)$ action.

Consider the sphere $(S^{2n+1}, D, g_D)$ as a sub-Riemannian manifold with the sub-Riemannian metric $g_D$ obtained by the restriction of the usual Riemannian metric $g$ on $TS^{2n+1}$ to the distribution $D$. As a direct application of Theorem 5.1, it is possible to describe all normal sub-Riemannian geodesics for $S^{2n+1}$. The Lie algebra $\mathfrak{u}(1)$ is one-dimensional; its typical elements are purely imaginary numbers $\xi = i\alpha$. The $\mathfrak{u}(1)$-valued connection form is

$$A_q(v) = i\,g(v, V_q), \qquad v \in T_qS^{2n+1},$$

and $g(v, V_q)$ is just the projection of $v$ to the vertical space $V_q$ by making use of the Riemannian metric $g$. The Riemannian metric $g$ on $S^{2n+1}$ is of constant bi-invariant type, because we have

$$\sigma_q(\xi) = \frac{d}{d\epsilon}\Big|_{\epsilon=0}\, q\exp_{U(1)}(\epsilon\xi) = q.i\alpha = \alpha V_q$$

for any $q \in S^{2n+1}$ and $\xi = i\alpha \in \mathfrak{u}(1)$. Therefore, the moment of inertia tensor is given by

$$I_q(\xi, \tilde\xi) = I_q(i\alpha, i\tilde\alpha) = g(\alpha V_q, \tilde\alpha V_q) = \alpha\tilde\alpha,$$

which does not depend on $q \in M$. By Theorem 5.1, we have the following result.

Proposition 5. Let $q \in S^{2n+1}$ and $v \in T_qS^{2n+1}$. If $\gamma_R(t) = (z_0(t), \dots, z_n(t))$ is the great circle satisfying $\gamma_R(0) = q$ and $\dot\gamma_R(0) = v$, then the corresponding sub-Riemannian geodesic $\gamma_{sR}$ is given by

$$\gamma_{sR}(t) = \big( z_0(t)e^{-itg(v,V_q)}, \dots, z_n(t)e^{-itg(v,V_q)} \big), \qquad t \in \mathbb{R}. \tag{5.12}$$

To analyze formula (5.12) we recall that the Riemannian geodesic starting at $q \in S^n$ with a velocity $v \in T_qS^n$, for any sphere $S^n$ regarded as a submanifold of $\mathbb{R}^{n+1}$, is given by

$$\gamma_R(t) = q\cos(\|v\|t) + \frac{v}{\|v\|}\sin(\|v\|t), \qquad\text{where } \|v\| = \sqrt{g(v,v)}. \tag{5.13}$$

The great circle $\gamma_R(t)$ on $S^{2n+1}$, considered as a submanifold of $\mathbb{R}^{2(n+1)} \cong \mathbb{C}^{n+1}$, will be written in complex notation as $\gamma_R(t) = (z_0(t), \dots, z_n(t))$. Observe that $V(\gamma(t)) = \gamma(t).i$ and $V_q = V(\gamma(0))$. The following corollary can be thought of as a sort of Pythagoras theorem for sub-Riemannian spheres.

Corollary 2. For a horizontal sub-Riemannian geodesic on $S^{2n+1}$ of the form (5.12) the following equation holds:

$$\|\dot\gamma_{sR}(t)\|^2 + g^2(v, V_q) = \|v\|^2.$$

Thus, its sub-Riemannian velocity is constant and its sub-Riemannian length for $t \in [a,b]$ is $\ell(\gamma) = (b-a)\sqrt{\|v\|^2 - g^2(v, V_q)}$.

Proof. Denote by $(\cdot\,,\cdot)_H$ the standard Hermitian product in $\mathbb{C}^{n+1}$, so that $\mathrm{Re}\,(\cdot\,,\cdot)_H = g(\cdot\,,\cdot)$. By straightforward calculations, we have

$$\begin{aligned}
(\dot\gamma_{sR}, \dot\gamma_{sR})_H &= \big( (-ig(v,V_q)\gamma_R + \dot\gamma_R)e^{-itg(v,V_q)},\ (-ig(v,V_q)\gamma_R + \dot\gamma_R)e^{-itg(v,V_q)} \big)_H \\
&= g^2(v,V_q)(\gamma_R, \gamma_R)_H + (\dot\gamma_R, \dot\gamma_R)_H + g(v,V_q)\big( i(\dot\gamma_R, \gamma_R)_H - i(\gamma_R, \dot\gamma_R)_H \big) \\
&= g^2(v,V_q) + \|v\|^2 - 2g^2(v,V_q).
\end{aligned}$$

The assertion follows. □

Corollary 3. If a curve $\gamma_{sR}(t) = \gamma_R(t)e^{-itg(v,V_q)}$ is parameterized by arc length, then $\|v\|^2 = 1 + g^2(v, V_q)$.
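Formula (5.12) and Corollaries 2 and 3 lend themselves to a quick numerical sanity check. The following is a minimal Python sketch for $S^3 \subset \mathbb{C}^2$, not part of the original text; the helper names `g`, `great_circle`, `sr_geodesic` and the sample point and velocity are assumptions chosen purely for illustration. For a few times $t$ it prints the vertical component of the velocity of the curve (5.12) and the defect in the identity of Corollary 2; both should vanish up to rounding.

```python
import cmath
import math

def g(a, b):
    """Real inner product on C^{n+1} = R^{2n+2}: real part of the Hermitian product."""
    return sum((x * y.conjugate()).real for x, y in zip(a, b))

def great_circle(q, v, t):
    """Riemannian geodesic (5.13) and its velocity at time t."""
    s = math.sqrt(g(v, v))
    pos = [x * math.cos(s * t) + (y / s) * math.sin(s * t) for x, y in zip(q, v)]
    vel = [-x * s * math.sin(s * t) + y * math.cos(s * t) for x, y in zip(q, v)]
    return pos, vel

def sr_geodesic(q, v, t):
    """Curve (5.12), its velocity at time t, and the constant c = g(v, V_q)."""
    c = g(v, [1j * x for x in q])            # V_q = q.i
    phase = cmath.exp(-1j * t * c)
    pos, vel = great_circle(q, v, t)
    sr_pos = [z * phase for z in pos]
    sr_vel = [(w - 1j * c * z) * phase for z, w in zip(pos, vel)]
    return sr_pos, sr_vel, c

# Illustrative data (assumed): a point of S^3 and a tangent vector at it.
q = [1 + 0j, 0j]
v = [0.75j, 1 + 0j]

for t in (0.0, 0.4, 1.3):
    pos, vel, c = sr_geodesic(q, v, t)
    vertical = [1j * z for z in pos]          # V at the current point
    print(t, g(vel, vertical), g(vel, vel) + c ** 2 - g(v, v))
```

The sample velocity satisfies $\|v\|^2 = 1 + g^2(v, V_q)$, so by Corollary 3 the resulting curve is parameterized by arc length.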


Corollary 4. The set of sub-Riemannian geodesics arising from the great circles $\gamma_R(t)$ such that $\dot\gamma_R(0) \in D$ is diffeomorphic to $\mathbb{C}P^n$.

Proof. In this case, any sub-Riemannian geodesic starting at $q \in S^{2n+1}$ with the initial velocity $v \in D \subset T_qS^{2n+1}$ coincides with the corresponding great circle, because the condition $\dot\gamma_R(0) \in D$ is equivalent to $g(v, V_q) = 0$. Thus

$$\gamma_{sR}(t) = q\cos(\|v\|t) + \frac{v}{\|v\|}\sin(\|v\|t),$$

whose locus is uniquely determined by the point $[v] \in \mathbb{C}P^n$. □



Observe that the manifold $\mathbb{C}P^n$ can be seen as a submanifold of $S^{2n+1}$ which is transversal to $V$ along the fiber containing $q$, and it can be thought of as a sophisticated analogue of the horizontal space at the identity in the $(2n+1)$-dimensional Heisenberg group.

5.3.2. Curvature or charge of sub-Riemannian geodesics on $S^3$. The following equation

$$\nabla_{\dot\gamma_{sR}}\dot\gamma_{sR} + 2\kappa J(\dot\gamma_{sR}) = 0, \tag{5.14}$$

obtained by a variational method, holds for length minimizers in $S^3$ [74]. Here $\nabla$ is the Levi-Civita connection associated with the Riemannian metric on $S^3$ and $J$ is an almost complex structure on $S^3$ satisfying $J(X) = -Y$, $J(Y) = X$. Geometers call the parameter $\kappa$ in (5.14) the curvature of $\gamma_{sR}$, since after projecting the curve $\gamma_{sR}$ via the Hopf fibration, $\kappa$ becomes precisely the curvature of the projected curve in $S^2$. Note that curves of zero curvature are the horizontal great circles. Physicists call the parameter $\kappa$ charge or phase and denote it by $\lambda$. We return to the notion of a charge later in Subsection 5.5. Since on $S^3$ all length minimizers are given by normal geodesics, we conclude that the solutions of (5.14) coincide with (5.9). Let us look more closely at this relation.

Proposition 6. The curvature of the normal sub-Riemannian geodesic $\gamma_{sR}(t) = \gamma_R(t)e^{-itg(v,V_q)}$ in $S^3$, starting from $q \in S^3$ with an initial velocity $v \in T_qS^3$ and parameterized by arc length, equals the value $g(v, V_q)$.

Proof. Recall that the Lie group structure of $S^3$, as the set of unit quaternions, induces the globally defined vector fields (4.2). Let $q = (x^0, x^1, x^2, x^3) = \gamma(0) \in S^3$ be an initial point of $\gamma$ and let $v = (v^0, v^1, v^2, v^3) = \dot\gamma_R(0) \in T_qS^3$ be an initial velocity of the corresponding great circle $\gamma_R$. By direct calculation, we have $\dot\gamma(t) = f_X(t)X(\gamma(t)) + f_Y(t)Y(\gamma(t))$, where, denoting $\alpha = g(v, X)$, $\beta = g(v, Y)$, we have

$$\begin{aligned}
f_X(t) &= \alpha\cos(2tg(v,V)) + \beta\sin(2tg(v,V)), \\
f_Y(t) &= \beta\cos(2tg(v,V)) - \alpha\sin(2tg(v,V)).
\end{aligned} \tag{5.15}$$


It follows from this decomposition that

$$J(\dot\gamma(t)) = -f_Y(t)X(\gamma(t)) + f_X(t)Y(\gamma(t)). \tag{5.16}$$

It remains to determine the term $\nabla_{\dot\gamma}\dot\gamma$. As is well known for submanifolds of $\mathbb{R}^n$, the vector field $\nabla_{\dot\gamma}\dot\gamma$ corresponds to the projection of the second derivative $\ddot\gamma$ to the tangent space of the submanifold. In this case, differentiating (5.15) we obtain

$$\nabla_{\dot\gamma}\dot\gamma = 2g(v,V)\big( f_Y(t)X(\gamma(t)) - f_X(t)Y(\gamma(t)) \big) = -2g(v,V)\,J(\dot\gamma(t)).$$

This finishes the proof. □

In [74] the problem of existence of closed sub-Riemannian geodesics is also discussed. Their result states that a complete geodesic $\gamma$ in $S^3$ parameterized by arc length, with curvature $\kappa$, is closed if and only if $\kappa/\sqrt{1+\kappa^2} \in \mathbb{Q}$. This result can be generalized to any odd-dimensional sphere.

Proposition 7. Let $\gamma_{sR}\colon \mathbb{R} \to S^{2n+1}$ be a complete sub-Riemannian geodesic parameterized by arc length, with an initial velocity $v \in T_qS^{2n+1}$. Then $\gamma_{sR}$ is closed if and only if

$$\frac{g(v, V_q)}{\sqrt{1 + g^2(v, V_q)}} \in \mathbb{Q}.$$

Proof. The curve $\gamma_{sR}\colon \mathbb{R} \to S^{2n+1}$ is closed if and only if

$$q = e^{-iTg(v,V_q)}\Big( q\cos(\|v\|T) + \frac{v}{\|v\|}\sin(\|v\|T) \Big)$$

for some $T > 0$. Since $v \in T_qS^{2n+1}$, we know that $v$ is orthogonal, with respect to $g$, to the vector joining $0 \in \mathbb{R}^{2n+2}$ and $q$. Thus $\sin(\|v\|T) = 0$, which forces $T = k\pi/\|v\|$, $k \in \mathbb{Z}$. To complete the argument, we only need to see that

$$\pm e^{-i\pi k\frac{g(v,V_q)}{\|v\|}}\, q = q$$

for some $k \in \mathbb{Z}$ if and only if

$$\frac{g(v, V_q)}{\|v\|} = \frac{g(v, V_q)}{\sqrt{1 + g^2(v, V_q)}} \in \mathbb{Q},$$

where we have used Corollary 3. □
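The closedness criterion can be illustrated numerically. The sketch below is an illustration only, with assumed sample data on $S^3$ and hypothetical helper names `sr_point` and `distance_to_start`. It evaluates the curve $\gamma_{sR}(t) = \gamma_R(t)e^{-itc}$, $c = g(v, V_q)$, in a case where $c/\|v\| = 3/5$ is rational and in a case where $c/\|v\| = 1/\sqrt{2}$ is irrational. In the first case the curve is back at $q$ at time $T = 4\pi$; in the second, none of the first 199 candidate times $T = k\pi/\|v\|$ brings it back.

```python
import cmath
import math

def sr_point(q, v, c, t):
    """Point gamma_R(t) * exp(-i t c) on S^{2n+1}, where c = g(v, V_q)."""
    s = math.sqrt(sum(abs(y) ** 2 for y in v))          # |v|
    phase = cmath.exp(-1j * t * c)
    return [(x * math.cos(s * t) + (y / s) * math.sin(s * t)) * phase
            for x, y in zip(q, v)]

def distance_to_start(q, v, c, t):
    """Euclidean distance between the point at time t and the starting point q."""
    p = sr_point(q, v, c, t)
    return math.sqrt(sum(abs(a - b) ** 2 for a, b in zip(p, q)))

q = [1 + 0j, 0j]                                        # assumed base point on S^3

# Rational case: c = 3/4, |v| = 5/4, so c/|v| = 3/5.
c_rat = 0.75
v_rat = [1j * c_rat, 1 + 0j]
print(distance_to_start(q, v_rat, c_rat, 4 * math.pi))

# Irrational case: c = 1, |v| = sqrt(2), so c/|v| = 1/sqrt(2).
c_irr = 1.0
v_irr = [1j * c_irr, 1 + 0j]
closest = min(distance_to_start(q, v_irr, c_irr, k * math.pi / math.sqrt(2))
              for k in range(1, 200))
print(closest)
```

Both sample velocities are chosen with $\|v\|^2 = 1 + c^2$, so the curves are parameterized by arc length as Proposition 7 requires.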



Exercises

1. Calculate directly that the curve (5.9) is horizontal.
2. Write the equation of geodesics starting from the point $q = (1, 0, 0, 0)$. What is the value of $\kappa$ at $q = (1, 0, 0, 0)$?


5.3.3. Sub-Riemannian geodesics on $S^{4n+3}$. Let us consider the sphere

$$S^{4n+3} = \Big\{ (q_0, \dots, q_n) \in \mathbb{H}^{n+1} \ \Big|\ \sum_{j=0}^{n} |q_j|^2 = 1 \Big\}.$$

The right action of the group $Sp(1) \cong SU(2) \cong S^3$ on $S^{4n+3}$ is defined by $q.\upsilon = (q_0, \dots, q_n).\upsilon = (q_0\upsilon, \dots, q_n\upsilon)$, $q \in S^{4n+3}$. This action induces a quaternionic Hopf fibration $S^3 \to S^{4n+3} \to \mathbb{H}P^n$, given by

$$h\colon S^{4n+3} \to \mathbb{H}P^n, \qquad (q_0, \dots, q_n) \mapsto [q_0 : \dots : q_n]. \tag{5.17}$$

This map forms a principal $S^3$-bundle with the Ehresmann connection given by the orthogonal complement to the vector fields

$$\begin{aligned}
V_q^1 &= -y^0\partial_{x^0} + x^0\partial_{y^0} + w^0\partial_{z^0} - z^0\partial_{w^0} - \cdots - y^n\partial_{x^n} + x^n\partial_{y^n} + w^n\partial_{z^n} - z^n\partial_{w^n}, \\
V_q^2 &= -z^0\partial_{x^0} - w^0\partial_{y^0} + x^0\partial_{z^0} + y^0\partial_{w^0} - \cdots - z^n\partial_{x^n} - w^n\partial_{y^n} + x^n\partial_{z^n} + y^n\partial_{w^n}, \\
V_q^3 &= -w^0\partial_{x^0} + z^0\partial_{y^0} - y^0\partial_{z^0} + x^0\partial_{w^0} - \cdots - w^n\partial_{x^n} + z^n\partial_{y^n} - y^n\partial_{z^n} + x^n\partial_{w^n},
\end{aligned}$$

at each $q = (x^0, y^0, z^0, w^0, \dots, x^n, y^n, z^n, w^n) \in S^{4n+3}$, with respect to the usual Riemannian metric $g$ on $S^{4n+3}$. Since $V^k_q = q.i_k$ and the bracket of two fields of the form $q \mapsto q.a$, $q \mapsto q.b$ is $q \mapsto q.(ab - ba)$, the following commutation relations hold for $V^1$, $V^2$, $V^3$:

$$[V^1, V^2] = 2V^3, \qquad [V^2, V^3] = 2V^1, \qquad [V^3, V^1] = 2V^2.$$

Thus one recovers the fact that $\mathrm{span}\{V_q^1, V_q^2, V_q^3\}$, considered as the Lie algebra $\mathfrak{sp}(1)$, is isomorphic to the Lie algebra of the Lie group $S^3$. All in all, the studied sub-Riemannian manifold is $(S^{4n+3}, D, g_D)$, where $D^\perp = V = \mathrm{span}\{V^1, V^2, V^3\}$ with respect to the usual Euclidean metric $g$ in $TS^{4n+3}$ and $g_D$ is the restriction of $g$ to $D$. Compare it with the sub-Riemannian manifold $S^{2n+1}$. It is an established fact that the distribution $D$ is bracket generating. The geometry of the spheres $S^{4n+3}$ is known to be a quaternionic analogue of CR-geometry, see [5]. Note that the vectors $V_q^1$, $V_q^2$, $V_q^3$ coincide with $q.i_1$, $q.i_2$, $q.i_3$, respectively. Here $q.i_k$ is the action of $\mathfrak{sp}(1)$.

To apply Theorem 5.1 in this situation, it is necessary to specify the $\mathfrak{sp}(1)$-valued connection form associated to the Hopf map $h$ from (5.17). In this case, the connection form is given by

$$A(v) = i_1\,g(v, V_q^1) + i_2\,g(v, V_q^2) + i_3\,g(v, V_q^3),$$

where $v \in T_qS^{4n+3}$. The Riemannian metric $g$ is of constant bi-invariant type, since for any $q \in S^{4n+3}$ and $\xi = i_1\alpha_1 + i_2\alpha_2 + i_3\alpha_3 \in \mathfrak{sp}(1)$, $\alpha_k \in \mathbb{R}$, $k = 1, 2, 3$ ($\xi$ is a pure imaginary quaternion), we have

$$\sigma_q(\xi) = \frac{d}{d\epsilon}\Big|_{\epsilon=0}\, q\exp_{Sp(1)}(\epsilon\xi) = \alpha_1 q.i_1 + \alpha_2 q.i_2 + \alpha_3 q.i_3 = \alpha_1V_q^1 + \alpha_2V_q^2 + \alpha_3V_q^3.$$
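The identification of the vertical fields with the quaternionic action $q.i_k$ can be checked by direct computation. The following minimal Python sketch is an illustration only: the helpers `qmul`, `vertical_fields`, `sigma`, `g` and the sample point of $S^7$ are assumptions, and quaternions are stored as tuples $(x, y, z, w) = x + y\,i_1 + z\,i_2 + w\,i_3$. It compares $\sigma_q(\xi) = q.\xi$ with $\alpha_1V^1_q + \alpha_2V^2_q + \alpha_3V^3_q$, where every coordinate block of $V^k$ follows the pattern of the block with index 0 displayed above, and prints the Gram matrix of $V^1_q, V^2_q, V^3_q$.

```python
def qmul(a, b):
    """Hamilton product of quaternions stored as (x, y, z, w)."""
    x1, y1, z1, w1 = a
    x2, y2, z2, w2 = b
    return (x1 * x2 - y1 * y2 - z1 * z2 - w1 * w2,
            x1 * y2 + y1 * x2 + z1 * w2 - w1 * z2,
            x1 * z2 - y1 * w2 + z1 * x2 + w1 * y2,
            x1 * w2 + y1 * z2 - z1 * y2 + w1 * x2)

def vertical_fields(q):
    """Coefficients of V^1, V^2, V^3 at q, one quaternionic block at a time."""
    V1 = [(-y, x, w, -z) for (x, y, z, w) in q]
    V2 = [(-z, -w, x, y) for (x, y, z, w) in q]
    V3 = [(-w, z, -y, x) for (x, y, z, w) in q]
    return V1, V2, V3

def sigma(q, alpha):
    """Infinitesimal generator q.xi for xi = alpha[0] i1 + alpha[1] i2 + alpha[2] i3."""
    xi = (0.0, alpha[0], alpha[1], alpha[2])
    return [qmul(block, xi) for block in q]

def g(u, v):
    """Euclidean inner product on R^{4n+4}, blockwise."""
    return sum(a * b for bu, bv in zip(u, v) for a, b in zip(bu, bv))

# Assumed sample point of S^7 (n = 1) and sample Lie algebra element.
q = [(0.5, -0.5, 0.1, 0.3), (0.2, 0.4, -0.4, 0.2)]
alpha = (0.3, -1.2, 0.7)

V = vertical_fields(q)
combo = [tuple(sum(alpha[k] * V[k][j][m] for k in range(3)) for m in range(4))
         for j in range(len(q))]
print(sigma(q, alpha))
print(combo)
print([[round(g(V[j], V[k]), 12) for k in range(3)] for j in range(3)])
```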


Therefore, the moment of inertia tensor, given by

$$I_q(\xi, \tilde\xi) = I_q\Big( \sum_{k=1}^{3} i_k\alpha_k,\ \sum_{k=1}^{3} i_k\tilde\alpha_k \Big) = g\Big( \sum_{k=1}^{3} \alpha_kV_q^k,\ \sum_{k=1}^{3} \tilde\alpha_kV_q^k \Big) = \sum_{k=1}^{3} \alpha_k\tilde\alpha_k,$$

does not depend on the point. As for Proposition 5, we have the following result.

Proposition 8. If $\gamma_R(t) = (q_0(t), \dots, q_n(t))$ is the great circle satisfying $\gamma_R(0) = q$ and $\dot\gamma_R(0) = v \in T_qS^{4n+3}$, then the corresponding sub-Riemannian geodesic is given by

$$\gamma_{sR}(t) = \big( q_0(t)\cdot e^{-tA(v)}, \dots, q_n(t)\cdot e^{-tA(v)} \big). \tag{5.18}$$

In Proposition 8, the quaternionic exponential is defined by

$$e^{ai_1 + bi_2 + ci_3} = \cos\sqrt{a^2+b^2+c^2} + \sin\sqrt{a^2+b^2+c^2}\cdot\frac{ai_1 + bi_2 + ci_3}{\sqrt{a^2+b^2+c^2}}$$

for $a, b, c \in \mathbb{R}$. Note that the curve $e^{-tA(v)}$ is simply the Riemannian geodesic in $S^3$ starting at the identity of the group $e = (1, 0, 0, 0) \in S^3$, with initial velocity vector $\big( 0, -g(v, V_q^1), -g(v, V_q^2), -g(v, V_q^3) \big)$.

Corollary 5. The set of sub-Riemannian geodesics in $S^{4n+3}$ arising from great circles $\gamma_R(t)$ such that $\dot\gamma_R(0)$ is orthogonal to $\mathrm{span}\{V_q^1, V_q^2, V_q^3\}$ is diffeomorphic to $\mathbb{H}P^n$.

Corollary 6. Let $\gamma_{sR}\colon \mathbb{R} \to S^{4n+3}$ be a complete sub-Riemannian geodesic parameterized by arc length, with the initial velocity $v \in T_qS^{4n+3}$. Then $\gamma_{sR}$ is closed if and only if

$$\frac{g(v, V_q^1)}{\|v\|},\ \frac{g(v, V_q^2)}{\|v\|},\ \frac{g(v, V_q^3)}{\|v\|} \in \mathbb{Q}.$$

Corollary 7. For the horizontal sub-Riemannian geodesic of the form (5.18) the equality $\|\dot\gamma_{sR}(t)\|^2 + \|A(v)\|^2 = \|v\|^2$ holds, where $\|A(v)\|^2 = g^2(v, V_q^1) + g^2(v, V_q^2) + g^2(v, V_q^3)$.

We leave the proofs of Corollaries 5–7 as an exercise.

5.4. Geodesics related to Yang–Mills fields

This subsection is aimed at a description of sub-Riemannian geodesics produced by a principal $G$-bundle $\pi\colon M \to B$, as was described in Case 2 of Subsection 5.1. Recall that in this case the Ehresmann connection, or the horizontal distribution $D$ transversal to the vertical distribution $V = \ker(d\pi)$, is given. Moreover, the sub-Riemannian metric $g_D$ is given as a pullback of the Riemannian metric $g_B$ from $TB$ to the distribution $D$. We also require that the sub-Riemannian structure $(D, g_D)$ is invariant under the right action of the structure group $G$.
We want to write geodesic equations for the sub-Riemannian manifold $(M, D, g_D)$. To describe sub-Riemannian geodesics and explain their physical meaning, we need to introduce more definitions related to the notion of a principal bundle. Let the base space $B$ be endowed with a Riemannian metric $g_B$. A Riemannian metric is a positive definite quadratic form, and in physics it represents the kinetic energy of a system in the space $B$. Consider electromagnetically charged particles (or color-charged particles, or particles with other characteristics) moving in $B$. The information about charges is encoded in a compact Lie group $G$, which is usually $SU(n)$ in physics. External forces can also be represented by their action on $B$. The motion of the particle is not free, since it must respect some symmetries, such as the isometry group of the base space and transformations of the structure group $G$. Let us avoid constraints as we did when solving the Dido problem. We add more variables that allow us to carry the information about the presence of charges encoded in $G$, so that to each point $b \in B$ we associate a copy $G_b$ of the group. In the enlarged space $M$ we assume that the structure group $G$ acts freely, and transitively on the fibers, so that $M$ receives the structure of a principal $G$-bundle. The choice of the horizontal distribution is dictated by the external forces acting on the particle on the base space $B$, and it is expressed through the curvature of the connection one-form annihilating the horizontal distribution. The motion on the total space $M$ is governed by a Lagrangian. If the Lagrangian has redundant degrees of freedom, or gauges, then the transformations between possible gauges, given by observed physical laws, are called gauge transformations, or gauge symmetries. So gauge transformations are automorphisms of the principal bundle, and they form a group with respect to the composition of bundle automorphisms. To the principal $G$-bundle we associate a vector bundle with the same base space, where the typical fiber is a representation of $G$. The obtained vector bundle is called an associated bundle, and the gauge group $\mathrm{Gau}(M)$ consists of all smooth sections of the associated bundle. The trivial section $b \mapsto \mathrm{id}_b$ of the associated bundle corresponds to the identity bundle automorphism, see [81, 108].
Electromagnetically charged particles are described by the theory of principal $U(1)$-bundles. We can think of charges as elements of the space $\mathfrak{g}^*$ dual to the Lie algebra $\mathfrak{g}$ corresponding to the one-dimensional structure groups $U(1)$ or $\mathbb{R}$. In the mathematical theory charges are elements of dual Lie algebras, because this fits better the situation in which the geodesics are produced by bi-characteristics on the co-tangent bundle $T^*M$ of $M$. Yang and Mills [105] proposed the theory generalizing the gauge theory from principal $U(1)$-bundles to principal $U(n)$-bundles. For instance, the $SU(2)$ symmetry group is used in the isospin model, the $SU(2) \times U(1)$ symmetry group describes the electroweak interaction, and the $SU(3)$ symmetry group is the subject of quantum chromodynamics. The action of the group $G$ on $M$, with a $G$-invariant Riemannian metric, produces an action on $T^*M$, and both the Hamiltonian function and the flow on $T^*M$ corresponding to the Hamiltonian vector field are invariant under this action. Thus, it seems natural to reduce the space $T^*M$ to the space of orbits $T^*M/G$ and consider the flow on the reduced space $T^*M/G$. This reduction is called the Poisson reduction. The idea to consider the reduction of spaces endowed with some structures (Poisson, symplectic, Kähler) comes from the works [100, 101]. The dynamics on the reduced space is related to sub-Riemannian geodesics on $M$, whose projections on the base space $B$ are trajectories of the motion of charged particles in Yang–Mills fields, and the corresponding equations are called the Wong equations.

In the case of a one-dimensional structure group the dynamics is quite well known, since the reduced space $T^*M/G$ is diffeomorphic to $T^*B \oplus \mathbb{R}$, and for each fixed value of the charge we get its level set in $T^*M/G$, which is diffeomorphic to $T^*B$. These level sets are glued together to form the entire reduced space $T^*M/G$. For a non-abelian group $G$ acting on $M$ the structure of the reduced space $T^*M/G$ is more complicated: it is isomorphic to $T^*B \oplus \mathrm{Ad}^*(M)$, where we have to replace the quite simple component $\mathbb{R}$ representing the abelian charge by the vector bundle $\mathrm{Ad}^*(M)$ over the same base space $B$ that is associated with the principal $G$-bundle $\pi\colon M \to B$. The motion of a "free" particle on the base space $B$ means the absence of forces acting there, and the trajectory of "free motion" is the geodesic given by the equation $\nabla_{\dot c}\dot c = 0$, where $\nabla$ is the Levi-Civita connection on $B$. If a force $F$ is present, then the equation changes to the Newton equation $\nabla_{\dot c}\dot c = F$ on the manifold $B$ for a particle of constant charge and unit mass. If the charge is non-abelian, then it is encoded in the bundle $\mathrm{Ad}^*(M)$ and the right-hand side of the last equation depends in a complicated way on the charge. Moreover, the condition on the level sets of an abelian charge changes to the requirement that the charge be "co-variantly constant". To formulate the Wong equation supplemented by the conservation condition for the charge, we start from the necessary definitions.

5.4.1. Structure of the reduced space.

Induced action of the group on tangent and co-tangent bundles. Let $\pi\colon M \to B$ be a principal $G$-bundle. The right action of $G$ on $M$ produces right actions on $TM$ and $T^*M$. They are defined by the following:

$$\mu\colon TM \times G \to TM, \qquad (q, v).\tau \mapsto \big( q.\tau,\ dr_\tau(v) \big), \tag{5.19}$$

and

$$\mu\colon T^*M \times G \to T^*M, \qquad (q, \omega).\tau \mapsto \big( q.\tau,\ (dr_\tau)^*(\omega) \big), \tag{5.20}$$

where $(dr_\tau)^*$ is the dual operator to the differential $dr_\tau$ of the right translation $r$ by $\tau \in G$. The factorization of $TM$ by the action (5.19) leads to the factor space $TM/G$ with elements $[q, v]$. Define the projection $\hat\pi\colon TM/G \to B$ by

$$\hat\pi([q, v]) := \pi(q) \in B, \qquad [q, v] \in TM/G.$$

We get a vector bundle over the base space $B$, where we will denote the projection $\hat\pi$ simply by $\pi$. Thus, we have $\pi\colon TM/G \to B$. Analogously, taking the factor of $T^*M$ by the action (5.20) of $G$, we get a vector bundle $\pi\colon T^*M/G \to B$.


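We aim to construct the bundle map $TM/G \to TB$. The principal $G$-bundle $\pi\colon M \to B$ after differentiating leads to the bundle map $d\pi\colon TM \to TB$. Let us take the factor by the action of $G$ on both sides. The action of $G$ on $TB$ is trivial: $TB/G = TB$. Thus we get a bundle map

$$\begin{array}{ccc}
W_b & \xrightarrow{\ d_q\pi\ } & T_bB \\
\cap & & \cap \\
TM/G & \xrightarrow{\ d\pi\ } & TB \\
\big\downarrow{\scriptstyle \mathrm{pr}_B} & & \big\downarrow{\scriptstyle \mathrm{pr}_B} \\
B & & B.
\end{array}$$

By making use of the dual map $(d\pi)^*$, we get an analogous bundle map

$$\begin{array}{ccc}
W_b^* & \xleftarrow{\ (d_q\pi)^*\ } & T_b^*B \\
\cap & & \cap \\
T^*M/G & \xleftarrow{\ (d\pi)^*\ } & T^*B \\
\big\downarrow{\scriptstyle \mathrm{pr}^*_B} & & \big\downarrow{\scriptstyle \mathrm{pr}^*_B} \\
B & & B.
\end{array}$$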

Actually, we need to verify that the maps $d\pi$ and $(d\pi)^*$ are equivariant with respect to the action of the group $G$:

$$d\pi\big( (q, v_q).\tau \big) = d\pi(q, v_q).\tau = (\pi(q), d\pi(v_q)) = (b, w_b),$$

where $(q, v_q) \in T_qM$ and $(b, w_b) \in T_{\pi(q)}B$. The group $G$ acts on $TM$ on the right, and since the action of $G$ on $TB$ is trivial, we can also suppose that it acts on the right. To show the equivariance we recall that the decomposition $D \oplus V = TM$ is preserved under the action of $G$. Therefore

$$d\pi((q, v_q).\tau) = d\pi\big( q.\tau,\ dr_\tau(v_q|_D + v_q|_V) \big) = \big( \pi(q.\tau),\ d\pi(v_{q.\tau}|_D) + d\pi(v_{q.\tau}|_V) \big) = (\pi(q), w_{\pi(q.\tau)}) = (b, w_b),$$

where $d\pi(v_{q.\tau}|_V)$ vanishes. The proof for the bundle map $(d\pi)^*\colon T^*B \to T^*M/G$ is similar.

Observe that $\ker(d_q\pi) = V_q$, $q \in M$, $b = \pi(q)$, and the typical fiber $W_b$ of the bundle $TM/G$ splits into parts isomorphic to the vertical space $V_q$ and the horizontal space $D_q$. Moreover, $D_q$ is isomorphic to the typical fiber $T_bB$ of the bundle $TB$. Sections of the bundle $\mathrm{pr}_B\colon TB \to B$ are vector fields on $B$. Sections of the bundle $\mathrm{pr}_B\colon TM/G \to B$ are right invariant vector fields with respect to the action of $G$. This finishes the construction of the bundle map $TM/G \to TB$.

Now we find the bundle map $TB \to TM/G$. We start by recalling that there exists a bundle map $h\colon TB \to TM$, which we called the horizontal lift, such that the image $h_q(T_bB)$ is $D_q \subset T_qM$, where $D_q \oplus V_q = T_qM$. To show that the map $h$ is equivariant under $G$ we take a point $(q, v_q|_D) \in T_qM$ and its pre-image $(b, w_b) = (\pi(q), w_{\pi(q)})$. Then on the one hand

$$h((b, w_b).\tau) = h((b, w_b)) = (q, v_q|_D), \tag{5.21}$$

since the action of $G$ on $TB$ is trivial. On the other hand

$$h(b, w_b).\tau = (q, v_q|_D).\tau = (q.\tau, dr_\tau(v_q|_D)) = h(\pi(q.\tau), w_{\pi(q.\tau)}) = h((b, w_b)) = (q, v_q|_D). \tag{5.22}$$

The chains of equalities (5.21) and (5.22) show that $h$ is an equivariant map. Roughly speaking, the horizontal lift $h$ is the inverse map for $d\pi|_D$:

$$V_q \oplus D_q = T_qM\ \underset{h_q}{\overset{d_q\pi}{\rightleftarrows}}\ T_{\pi(q)}B.$$

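Since the map $h$ is equivariant, we can take the factor of $h\colon TB \to TM$ by the action of $G$ and get the induced bundle map, which we again call $h$:

$$\begin{array}{ccc}
W_b \cong V_q \oplus D_q & \xleftarrow{\ h_q\ } & T_bB \\
\cap & & \cap \\
TM/G & \xleftarrow{\ h\ } & TB \\
\big\downarrow{\scriptstyle \mathrm{pr}_B} & & \big\downarrow{\scriptstyle \mathrm{pr}_B} \\
B & & B.
\end{array}$$

The bundle map

$$\begin{array}{ccc}
W_b^* \cong V_q^* \oplus D_q^* & \xrightarrow{\ h_q^*\ } & T_b^*B \\
\cap & & \cap \\
T^*M/G & \xrightarrow{\ h^*\ } & T^*B \\
\big\downarrow{\scriptstyle \mathrm{pr}^*_B} & & \big\downarrow{\scriptstyle \mathrm{pr}^*_B} \\
B & & B
\end{array}$$

is produced similarly. Here $D_q = \mathrm{Im}(h_q)$ is isomorphic to $T_bB$, $b = \pi(q)$, and $V_q = \ker(d_q\pi)$. At the level of co-tangent bundles we get $D_q^* = \mathrm{Im}(d_q\pi^*)$, $V_q^* = \ker(h_q^*)$, and $D_q^*$ is isomorphic to $T_b^*B$.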


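Summarizing the discussion of the last two parts, we conclude that we have constructed two maps between the bundles $TM/G$ and $TB$, and between $T^*M/G$ and $T^*B$:

$$TM/G\ \underset{h}{\overset{d\pi}{\rightleftarrows}}\ TB, \qquad\qquad T^*M/G\ \underset{d\pi^*}{\overset{h^*}{\rightleftarrows}}\ T^*B.$$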

In these maps the horizontal distribution $D$ is the image of some map, while the vertical part is the kernel of some other map. In the next step we interchange the roles of $D$ and $V$.

Adjoint and co-adjoint bundles associated with the principal bundle $\pi\colon M \to B$. Let us suppose that a principal $G$-bundle $\pi\colon M \to B$ is given. Recall that in this case
• the typical fiber $F$ is isomorphic to the group $G$,
• the group $G$ acts on $F$ by right (or left) translations.
Then it is possible to define the associated bundle, where
• the typical fiber $F$ is isomorphic to some vector space $E$,
• the action of $G$ on $E$ is defined by a representation.
In general this is achieved through a representation of $G$ on $E$. We will give the definition only in the particular case when $E = \mathfrak{g}$ (or $E = \mathfrak{g}^*$) and the action of the Lie group $G$ is the adjoint action on its Lie algebra $\mathfrak{g}$ (or the co-adjoint action on its dual Lie algebra $\mathfrak{g}^*$).

The adjoint fiber bundle $\mathrm{Ad}(M)$ to a principal $G$-bundle $\pi\colon M \to B$ is the vector bundle $\hat\pi\colon \mathrm{Ad}(M) \to B$ with a typical fiber isomorphic to $\mathfrak{g}$. The action of the group $G$ on the fiber $\mathfrak{g}$ is defined by the adjoint action $\mathfrak{g} \ni \xi \mapsto \mathrm{Ad}_\tau(\xi) \in \mathfrak{g}$ for all $\tau \in G$. To construct the adjoint bundle $\mathrm{Ad}(M)$ one starts from the direct product $M \times \mathfrak{g}$ and then takes the factor by the right action of $G$ defined on $M \times \mathfrak{g}$ by

$$\mu\colon (M \times \mathfrak{g}) \times G \to M \times \mathfrak{g}, \qquad (q, \xi).\tau \mapsto \big( q.\tau,\ \mathrm{Ad}_{\tau^{-1}}(\xi) \big). \tag{5.23}$$

Here we used that the group $G$ acts on the right on $M$ and that the adjoint action on $\mathfrak{g}$ is a left action, since it comes as the differential of the left action $a$ (by conjugation), see (8.2) and (8.5). This definition of the action is compatible with the definition of an equivariant (right-left) map. The adjoint bundle $\mathrm{Ad}(M)$ is produced by factoring $M \times \mathfrak{g}$ by the right action (5.23). The standard notations in the literature are $\mathrm{Ad}(M)$ or $M \times_{\mathrm{Ad}} \mathfrak{g}$. The equivalence class $[q, \xi] \in \mathrm{Ad}(M)$ of the representative $(q, \xi) \in M \times \mathfrak{g}$ is also often written as $q\xi$ due to the mnemonic cancelation rule $\big[ q.\tau,\ \mathrm{Ad}_{\tau^{-1}}(\xi) \big] = q\xi$, where $\tau$ is canceled. The projection map $\hat\pi$ from $\mathrm{Ad}(M)$ to the base space $B$ is defined by $\hat\pi([q, \xi]) := \pi(q)$.

The co-adjoint bundle $\mathrm{Ad}^*(M)$ is the vector bundle $\check\pi\colon \mathrm{Ad}^*(M) \to B$ with the typical fiber $\mathfrak{g}^*$. The bundle $\mathrm{Ad}^*(M)$ is obtained by division of $M \times \mathfrak{g}^*$ by the right action of $G$ on $M \times \mathfrak{g}^*$:

$$\mu\colon (M \times \mathfrak{g}^*) \times G \to M \times \mathfrak{g}^*, \qquad (q, \omega).\tau \mapsto \big( q.\tau,\ \mathrm{Ad}^*_{\tau^{-1}}(\omega) \big). \tag{5.24}$$


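The next step is to reveal relations between the adjoint bundle $\mathrm{Ad}(M)$ and the vector bundle $TM/G$.

Find the bundle map $\mathrm{Ad}(M) \to TM/G$. As usual, we start from the principal $G$-bundle, where the group $G$ acts on the right on $M$. The right translation $r$ generates the infinitesimal generator $\sigma_q\colon \mathfrak{g} \to V_q \subset T_qM$ (see Example 12). Letting $q \in M$ vary, we get a bundle map

$$\begin{array}{ccc}
\mathfrak{g} & \xrightarrow{\ \sigma_q\ } & V_q \subset T_qM \\
\cap & & \cap \\
M \times \mathfrak{g} & \xrightarrow{\ \sigma\ } & V \subset TM \\
\big\downarrow{\scriptstyle \mathrm{pr}_M} & & \big\downarrow{\scriptstyle \mathrm{pr}_M} \\
M & & M.
\end{array}$$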

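The map $\sigma$ is equivariant:

$$\sigma((q, \xi).\tau) := \sigma(q.\tau, \mathrm{Ad}_{\tau^{-1}}(\xi)) = dr_\tau\,\sigma(q, \xi) =: \sigma(q, \xi).\tau, \qquad \tau \in G.$$

Indeed,

$$\sigma(q.\tau, \mathrm{Ad}_{\tau^{-1}}(\xi)) = \frac{d}{d\varepsilon}\Big|_{\varepsilon=0}\, q\tau\,\mathrm{Ad}_{\tau^{-1}}(\exp(\varepsilon\xi)) = \frac{d}{d\varepsilon}\Big|_{\varepsilon=0}\, q\tau\,(\tau^{-1}\exp(\varepsilon\xi)\tau) = dr_\tau\,dl_q(\xi) = dr_\tau\,\sigma(q, \xi).$$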
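The equivariance of $\sigma$ can be tested numerically in a concrete model. The sketch below is an illustration under stated assumptions, not the general construction: it takes $M = S^7 \subset \mathbb{H}^2$ with the right action of $G = Sp(1)$ (unit quaternions), so that $\sigma(q, \xi) = q\xi$, $dr_\tau(w) = w\tau$ and $\mathrm{Ad}_{\tau^{-1}}(\xi) = \tau^{-1}\xi\tau$. The helper names and the sample data are hypothetical.

```python
def qmul(a, b):
    """Hamilton product of quaternions stored as (x, y, z, w)."""
    x1, y1, z1, w1 = a
    x2, y2, z2, w2 = b
    return (x1 * x2 - y1 * y2 - z1 * z2 - w1 * w2,
            x1 * y2 + y1 * x2 + z1 * w2 - w1 * z2,
            x1 * z2 - y1 * w2 + z1 * x2 + w1 * y2,
            x1 * w2 + y1 * z2 - z1 * y2 + w1 * x2)

def qconj(a):
    """Quaternionic conjugate; for a unit quaternion this is the inverse."""
    return (a[0], -a[1], -a[2], -a[3])

def act(q, tau):
    """Right action q.tau on M, componentwise."""
    return [qmul(block, tau) for block in q]

def ad_inv(tau, xi):
    """Ad_{tau^{-1}}(xi) = tau^{-1} xi tau for a unit quaternion tau."""
    return qmul(qmul(qconj(tau), xi), tau)

def sigma(q, xi):
    """Infinitesimal generator: derivative at 0 of q.exp(eps xi), equal to q.xi."""
    return [qmul(block, xi) for block in q]

def d_r(tau, w):
    """Differential of the right translation; it is linear, so again w.tau."""
    return [qmul(block, tau) for block in w]

# Assumed sample data: a point of S^7, a unit quaternion, an imaginary quaternion.
q = [(0.5, -0.5, 0.1, 0.3), (0.2, 0.4, -0.4, 0.2)]
tau = (0.5, 0.5, 0.5, 0.5)
xi = (0.0, 0.3, -1.2, 0.7)

lhs = sigma(act(q, tau), ad_inv(tau, xi))   # sigma((q, xi).tau)
rhs = d_r(tau, sigma(q, xi))                # sigma(q, xi).tau
defect = max(abs(a - b) for bl, br in zip(lhs, rhs) for a, b in zip(bl, br))
print(defect)
```

In this model the identity reduces to the associativity of quaternion multiplication, $q\tau(\tau^{-1}\xi\tau) = (q\xi)\tau$, which mirrors the chain of equalities above.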

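Dividing by the actions (5.23) and (5.24) we come to the bundle map

$$\begin{array}{ccc}
\mathfrak{g} & \xrightarrow{\ \sigma_q\ } & V_q \subset T_qM \\
\cap & & \cap \\
(M \times \mathfrak{g})/G & \xrightarrow{\ \sigma\ } & V/G \subset TM/G \\
\big\downarrow{\scriptstyle \mathrm{pr}_M} & & \big\downarrow{\scriptstyle \mathrm{pr}_M} \\
M/G & & M/G
\end{array}
\qquad\text{or}\qquad
\begin{array}{ccc}
\mathfrak{g} & \xrightarrow{\ \sigma_q\ } & V_q \subset T_qM \\
\cap & & \cap \\
\mathrm{Ad}(M) & \xrightarrow{\ \sigma\ } & V/G \subset TM/G \\
\big\downarrow{\scriptstyle \mathrm{pr}_B} & & \big\downarrow{\scriptstyle \mathrm{pr}_B} \\
B & & B.
\end{array}$$

Thus, the image $\mathrm{Im}(\sigma) = V/G$ is the collection of right invariant vertical vector spaces.

Construction of the bundle map $TM/G \to \mathrm{Ad}(M)$. The auxiliary map here is the $\mathfrak{g}$-valued connection one-form $A_q\colon T_qM \to \mathfrak{g}$ introduced in (5.8). The connection form is uniquely defined by two conditions:
1. $\ker(A_q) = D_q$ and
2. $A_q \circ \sigma_q = \mathrm{Id}_{\mathfrak{g}}$ for any $q \in M$.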
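The two defining conditions can be made concrete for the Hopf fibration of Subsection 5.3.1, where $A_q(v) = i\,g(v, V_q)$ and $\sigma_q(i\alpha) = \alpha V_q$. The following minimal Python sketch is an illustration only; the helper names and the sample point and vectors are assumptions. It splits a tangent vector into a horizontal and a vertical part and prints the value of $A_q$ on a vertical vector, the value of $A_q$ on the horizontal part, and the defect of the splitting.

```python
def g(a, b):
    """Real inner product on C^{n+1}: real part of the Hermitian product."""
    return sum((x * y.conjugate()).real for x, y in zip(a, b))

def V(q):
    """Vertical vector field of the Hopf fibration, V_q = q.i."""
    return [1j * z for z in q]

def sigma(q, xi):
    """Infinitesimal generator: xi = i*alpha in u(1) is sent to alpha * V_q."""
    alpha = xi.imag
    return [alpha * w for w in V(q)]

def A(q, v):
    """u(1)-valued connection form A_q(v) = i g(v, V_q)."""
    return 1j * g(v, V(q))

def horizontal_part(q, v):
    """Orthogonal projection of v onto D_q, the orthogonal complement of V_q."""
    c = g(v, V(q))
    return [x - c * w for x, w in zip(v, V(q))]

# Assumed sample data: a point of S^3 and an arbitrary vector projected to T_q S^3.
q = [0.6 + 0j, 0.8j]
w = [0.3 - 0.2j, 0.5 + 0.1j]
v = [x - g(w, q) * z for x, z in zip(w, q)]

hor = horizontal_part(q, v)
ver = sigma(q, A(q, v))
print(A(q, sigma(q, 0.37j)))                     # condition 2: A_q(sigma_q(xi)) = xi
print(A(q, hor))                                 # condition 1: D_q lies in ker(A_q)
print(max(abs(x - h - s) for x, h, s in zip(v, hor, ver)))
```

The last printed number measures the defect in the decomposition $v = v|_D + \sigma_q(A_q(v))$.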


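The second condition says that after the projection on the vertical space $V$ the connection form is the canonical identification between $V$ and the Lie algebra $\mathfrak{g}$. The map $A\colon TM \to M \times \mathfrak{g}$ is equivariant:

$$A\big( (q, v|_V).\tau \big) := A(q.\tau, dr_\tau(v|_V)) = A(q, v|_V).\tau = (q, \xi).\tau := \big( q.\tau,\ \mathrm{Ad}_{\tau^{-1}}(\xi) \big), \tag{5.25}$$

where we consider only the vertical part $v|_V$ of a vector $v \in T_qM$, since the horizontal part belongs to the kernel of $A_q$. To prove (5.25), we note that $v|_V = \sigma_q(\xi)$ and $A_q(v|_V) = \xi$ by Property 2. This and the equivariance of $\sigma$ imply the equivariance of $A$ as follows:

$$A(q.\tau, dr_\tau(v|_V)) = A\big( q.\tau,\ dr_\tau\,\sigma(q, \xi) \big) = A\big( q.\tau,\ \sigma(q.\tau, \mathrm{Ad}_{\tau^{-1}}(\xi)) \big) = \big( q.\tau,\ \mathrm{Ad}_{\tau^{-1}}(\xi) \big) = (q, \xi).\tau = A(q, v|_V).\tau.$$

The factorization by the action of $G$ leads to the bundle map

$$\begin{array}{ccc}
\mathfrak{g} & \xleftarrow{\ A_q\ } & T_qM \\
\cap & & \cap \\
\mathrm{Ad}(M) & \xleftarrow{\ A\ } & TM/G \\
\big\downarrow{\scriptstyle \mathrm{pr}_B} & & \big\downarrow{\scriptstyle \mathrm{pr}_B} \\
B & & B,
\end{array}$$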

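where $\ker(A) = D/G$. The construction of the dual bundle maps is straightforward.

The isomorphism of the bundles $TM/G$ and $TB \oplus \mathrm{Ad}(M)$. Summarizing everything for the constructed bundle maps, we get

$$\begin{array}{ccccc}
\mathfrak{g} & \underset{A_q}{\overset{\sigma_q}{\rightleftarrows}} & W_q & \underset{h_q}{\overset{d_q\pi}{\rightleftarrows}} & T_bB \\
\cap & & \cap & & \cap \\
\mathrm{Ad}(M) & \underset{A}{\overset{\sigma}{\rightleftarrows}} & TM/G & \underset{h}{\overset{d\pi}{\rightleftarrows}} & TB
\end{array}
\qquad\text{for } b = \pi(q).$$

Moreover,

$$\mathrm{Im}(h_q) = D_q = \ker(A_q), \qquad \mathrm{Im}(\sigma_q) = V_q = \ker(d_q\pi), \qquad A_q \circ \sigma_q = \mathrm{Id}_{\mathfrak{g}}, \qquad d_q\pi \circ h_q = \mathrm{Id}_{T_bB}.$$

So, we conclude that the typical fiber $W_q$ is isomorphic to the product $\mathfrak{g} \times T_bB$, which leads to the isomorphism of bundles

$$\begin{array}{rcl}
TM/G & \cong & TB \oplus \mathrm{Ad}(M) \\
\big[ q,\ h_q(w) + \sigma_q(\xi) \big] & \longleftarrow & (b, w) \oplus [q, \xi] \\
\big[ q,\ v \big] & \longrightarrow & \big( \pi(q),\ d_q\pi(v) \big) \oplus [q, A_q(v)].
\end{array}$$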


Here hq (w) ∈ Dq is the horizontal lift of w ∈ Tπ(q) B and σq (ξ) ∈ Vq is a vector field. The result dq π(v) represents the horizontal part of v ∈ Tq M and Aq (w) is the missing vertical part regarded as an element of g. Let us emphasize one more time that the constructed bundle isomorphism is induced by chosen Ehresmann connection D that is invariant under the action of the structural group G. Remark 7. Let us present the gauge group Gau(M ) acting on the principal Gbundle π : M → B. To do this, let us recall that a section of a fiber bundle ϕ : E → B is any function f : B → E satisfying ϕ ◦ f = IdB . The bundle isomorphism σ : Ad(M ) ↔ V /G ⊂ T M/G gives a correspondence between the set S of all sections of the adjoint bundle π : Ad(M ) → B and the set L of G invariant vertical vector fields on M , since the set L is the set of sections of the bundle π : V /G → B. Thus the bundle map σ induces the isomorphism σ  : S ↔ L. Another observation: if there is a G-invariant map Φ : M → M , then factor : M/G = B → M/G = B. ization by the action of the group G induces the map Φ Definition 24. The gauge group Gau(M ) of the principal G-bundle π : M → B is  is the the set of all G-invariant maps Φ : M → M such that the induced map Φ  = IdB . identity on B : Φ The Lie algebra gau(M ) for the gauge group Gau(M ) is the set of G-invariant vector fields on M , which is exactly the set L, which is σ -isomorphic to S. Conclusion: the sections of the adjoint bundle Ad(M ) form the Lie algebra of the gauge group. The physicists are actually interested in working with the dual isomorphism ∗ T ∗ M/G ∼ = T ∗ B ⊕ Ad (M ), where in the construction of the co-adjoint action on g∗ and dual maps σ ∗ , A∗ to σ, A are used. Roughly speaking, the bundle isomorphism T ∗ M/G ∼ = T ∗ B ⊕ Ad∗ (M ) is the splitting of the reduced phase space into the vertical part encoded in Ad∗ (M ) (generated by the action of G) and the complementary horizontal part isomorphic to T ∗ B. 5.4.2. The Wong equation. 
The main goal of this part is to introduce the geodesic equation on the reduced space $T^*M/G$. To do this we need to introduce the curvature form $\Omega$ of the Ehresmann connection $D$ and the covariant derivative for sections on the co-adjoint bundle $\mathrm{Ad}^*(M)$.

Curvature form. A curvature two-form $\Omega$ on $M$ associated with the Ehresmann connection $D$ measures the behavior of two horizontal vector fields $X, Y \in D$ with respect to each other, or more precisely, $\Omega$ maps a pair $X, Y \in \bigwedge^2 D$ to a vector $(-[X, Y] \bmod D) \in V$. Since the connection one-form $A$ gives an identification of $V$ with the Lie algebra $\mathfrak g$, we define the $\mathfrak g$-valued two-form $\Omega_q$ by
\[
\Omega_q(X, Y) =
\begin{cases}
-A_q([X, Y]) & \text{if } X, Y \in D_q,\\
0 & \text{otherwise,}
\end{cases}
\qquad \text{for all } q \in M.
\]


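Since the form $A$ is defined on $G$-invariant vector fields, we get $-A([X, Y]) = dA(X, Y) - [A(X), A(Y)]$ and we conclude that $\Omega = dA - [A, A]$. The form $\Omega$ is $G$-equivariant, since the form $A$ is so. The equivariance of $\Omega$ allows us to extend the definition of $\Omega$ to a bundle map by factoring by $G$:
\[
\Big(\Omega\colon \textstyle\bigwedge^2 D \to M \times \mathfrak g \ \text{ over } M\Big)
\ \xrightarrow{\ \text{factoring by } G\ }\
\Big(\Omega\colon \textstyle\bigwedge^2 D/G \to \mathrm{Ad}(M) \ \text{ over } B\Big),
\]
where on each fiber $\Omega_q\colon \bigwedge^2 D_q \to \mathfrak g$, and the bundle projections are $\mathrm{pr}_M$ on the left and $\mathrm{pr}_B$ on the right.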

The last property is reflected in the name of the curvature two-form $\Omega$ as the $\mathrm{Ad}(M)$-valued curvature form.

Covariant derivative for sections on the co-adjoint bundle $\mathrm{Ad}^*(M)$. Let $\psi : B \to \mathrm{Ad}(M)$ be a section, and $X \in \mathrm{Vect}\, B$. We would like to define the covariant derivative $D_X\psi$ that for any chosen $\psi$ and $X$ gives a section of the adjoint bundle $\mathrm{Ad}(M)$, that is, $D_X\psi : B \to \mathrm{Ad}(M)$. To define it we first relate a map $F : M \to \mathfrak g$ to $\psi$ by
\[
B \ni b \ \overset{\psi}{\longmapsto}\ (q, \xi) = (q, F(q)) \in \mathrm{Ad}(M),
\]
where $b = \pi(q)$, $q \in M$, and $\xi = F(q) \in \mathfrak g$. The map $F$ should be equivariant: $F(q.\tau) = F(q).\tau = \mathrm{Ad}_{\tau^{-1}}(F(q))$, since $\psi$ is equivariant by definition. The differential map
\[
d_qF\colon T_qM \to T_{F(q)}\mathfrak g \cong \mathfrak g, \qquad (q, v) \mapsto (q, d_qF(v)),
\]
leads to the bundle map $dF : TM \to M \times \mathfrak g$. If we show that $dF$ is equivariant, then by factoring by the action of the group, we get a bundle map $dF : TM/G \to \mathrm{Ad}(M)$ that sends any point $(q, Y)$, with $q \in M$ and $Y$ being a $G$-invariant vector field, to the point $(q, d_qF(Y)) \in \mathrm{Ad}(M)$, where $F : M \to \mathfrak g$ is equivariant. But $q \in M$ is such that $\pi(q) = b \in B$, and $Y$ is the horizontal lift by $h$ of some vector field $X \in \mathrm{Vect}\, B$. So given $b \in B$, $X \in \mathrm{Vect}\, B$, the composition $dF(h \circ X)$ is the desired section
\[
D_X\psi\colon B \xrightarrow{\ X\ } TB \xrightarrow{\ h\ } TM/G \xrightarrow{\ dF\ } \mathrm{Ad}(M).
\]
The following chain of equalities shows that $dF : TM \to M \times \mathfrak g$ is equivariant:
\[
dF(q.\tau, dr_\tau h(X)) = \big(q.\tau, dF_{q.\tau}(hX(q.\tau))\big) = dF(q, h(X)).\tau,
\]
due to the equivariance of $F$.


The Wong equation. Let $c : I \to B$ be a curve on the base space that represents the trajectory of the motion of some charged particle. Then $\dot c(t) \in \mathrm{Vect}\, B$ is the vector field along $c$ and the horizontal lift $h$ sends this vector field to $D_{\gamma(t)} \subset T_{\gamma(t)}M$, or
\[
T_{c(t)}B \ni (c(t), \dot c(t)) \ \overset{h}{\longmapsto}\ (\gamma(t), h(\dot c(t))) \in D_{\gamma(t)} \subset T_{\gamma(t)}M, \qquad \pi(\gamma(t)) = c(t).
\]
The contraction $i_{\dot c}\Omega$ of the $\mathrm{Ad}(M)$-valued two-form $\Omega$ on $h(\dot c)$ is an $\mathrm{Ad}(M)$-valued one-form along $c$:
\[
i_{\dot c}\Omega(\cdot) := \Omega(h(\dot c), \cdot)\colon TB \to \mathrm{Ad}(M).
\]
Take any section $\lambda(t) = \lambda(c(t))$ of the co-adjoint bundle $\mathrm{Ad}^*(M)$ along the curve $c$ on the base space $B$. The section $\lambda$ represents the charge (electromagnetic or color charge) of a particle moving in $B$. The duality of $\mathrm{Ad}(M)$ and $\mathrm{Ad}^*(M)$ produces a momentum in $T^*B$ by
\[
\Lambda_{\dot c,\Omega}(\cdot) := \langle \lambda(c(t)), i_{\dot c}\Omega(\cdot) \rangle.
\]
The pairing $\langle \cdot\,, \cdot \rangle$ associates to each vector field $X \in TB$ a real number given by the pairing between $\lambda(c(t)) \in \mathrm{Ad}^*(M)$ and $i_{\dot c(t)}\Omega(h(X)) \in \mathrm{Ad}(M)$ along the curve $c$. Thus $\Lambda_{\dot c,\Omega} \in T^*B$. Now we exploit the Riemannian metric $g_B$ given on the base space $B$ and find the metric dual $\Lambda^\#_{\dot c,\Omega} \in TB$ to $\Lambda_{\dot c,\Omega}$. The vector field $\Lambda^\#_{\dot c,\Omega}$ represents a force acting on the base space $B$ produced by the charge $\lambda$ of the particle. This force is called the Lorentz force. The presence of the non-abelian Lorentz force $\Lambda^\#_{\dot c,\Omega}$ leads to the equation called the Wong equation,
\[
\nabla_{\dot c}\dot c = \Lambda^\#_{\dot c,\Omega}, \tag{5.26}
\]
where the non-abelian charge $\lambda$ has to satisfy the condition
\[
D_{\dot c(t)}\lambda(c(t)) = 0 \quad \text{for any } t \in I, \tag{5.27}
\]
expressing the property that the charge has to be "co-variantly constant" along $c$. The solution of the second-order differential equations (5.26) and (5.27) is the curve $(c(t), \lambda(c(t))) \in \mathrm{Ad}^*(M)$, $t \in I$, in the co-adjoint bundle lying over the curve $c$ in the base space $B$. The system of equations can be rewritten as a first-order system on $T^*B \oplus \mathrm{Ad}^*(M)$ by introducing the momentum $p(t) = p(c(t)) \in T^*_{c(t)}B$ along the base curve $c$. We again use the metric tensor $g_B$ and define the co-vector $p(t)(\cdot) = g_B(\dot c(t), \cdot)$. Then $(p, \lambda) \in T^*B \oplus \mathrm{Ad}^*(M)$. Since we have the isomorphism $T^*M/G \cong T^*B \oplus \mathrm{Ad}^*(M)$, the solution $(p, \lambda)$ on $T^*B \oplus \mathrm{Ad}^*(M)$ is also a solution on $T^*M/G$. The following theorem is an analogue of Theorem 5.1 produced for Case 2 and it expresses the relation between sub-Riemannian geodesics in Case 2 and the solutions of the Wong equations.

Theorem 5.2 ([111]). Let $\Gamma = (\gamma, p)$ be a normal sub-Riemannian bi-characteristic for $(M, D, g_D)$, produced by the principal $G$-bundle $\pi : M \to B$, where the sub-Riemannian structure $(D, g_D)$ is $G$-invariant and $g_D$ is the pullback of the Riemannian metric $g_B$ on $B$. The projection of $\Gamma$ onto $T^*M/G \cong T^*B \oplus \mathrm{Ad}^*(M)$ is the solution of the Wong equations (5.26) and (5.27).


Conversely, let $c(t) \in B$, $t \in I$, be a solution of the Wong equation (5.26) complemented by (5.27) and $h(c(t))$ be its horizontal lift to $M$. Then $h(c(t))$ is a normal sub-Riemannian geodesic for $(M, D, g_D)$ described as above. The non-abelian charge $\lambda \in \mathrm{Ad}^*(M)$ corresponds to the co-vector $p \in T^*M$.

Instead of presenting the proof of the theorem, which can be found in [111], we show the relation between this theorem and the examples of Carnot groups considered in Section 3.

5.5. Examples of solutions to the Wong equations

Before we present the examples, let us make some observations about the Wong equation (5.26).

Conservation of energy. By definition of the metric dual, $g_B(\Lambda^\#_{\dot c,\Omega}, v) = \Lambda_{\dot c,\Omega}(v) = \langle \lambda(c(t)), \Omega(h(\dot c), h(v)) \rangle$. Since the form $\Omega$ is skew symmetric, for $v = \dot c$ we get $g_B(\Lambda^\#_{\dot c,\Omega}, \dot c) = 0$, which leads to $\langle \nabla_{\dot c}\dot c, \dot c \rangle = 0$ by the Wong equation (5.26). Since the derivative of the kinetic energy is
\[
\frac{d}{dt}\Big(\tfrac{1}{2}\langle \dot c, \dot c \rangle\Big) = \langle \nabla_{\dot c}\dot c, \dot c \rangle = 0,
\]
we conclude that the kinetic energy is constant along solutions of (5.26). This energy is equal to the value of the sub-Riemannian Hamiltonian function along the corresponding sub-Riemannian geodesic.
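The conservation law can be illustrated numerically in the simplest abelian setting. The sketch below is only an illustration under stated assumptions: the base is the flat plane, the field is constant, and the equation is taken in the form $\ddot c = -\lambda J \dot c$ with $J(v_x, v_y) = (-v_y, v_x)$. The function names (`lorentz_rhs`, `rk4_step`, `integrate`, `kinetic_energy`) are hypothetical helpers, not part of the text.

```python
import math

def lorentz_rhs(state, lam):
    """Right-hand side of c'' = -lam * J c' on the flat plane (assumed form),
    with J(vx, vy) = (-vy, vx). The state is (x, y, vx, vy)."""
    x, y, vx, vy = state
    return (vx, vy, lam * vy, -lam * vx)

def rk4_step(state, lam, dt):
    """One classical fourth-order Runge-Kutta step."""
    def shift(s, k, a):
        return tuple(si + a * ki for si, ki in zip(s, k))
    k1 = lorentz_rhs(state, lam)
    k2 = lorentz_rhs(shift(state, k1, dt / 2), lam)
    k3 = lorentz_rhs(shift(state, k2, dt / 2), lam)
    k4 = lorentz_rhs(shift(state, k3, dt), lam)
    return tuple(s + dt / 6 * (a + 2 * b + 2 * c + d)
                 for s, a, b, c, d in zip(state, k1, k2, k3, k4))

def integrate(lam, state, t_end, steps):
    """Integrate from t = 0 to t_end and return the list of visited states."""
    dt = t_end / steps
    path = [state]
    for _ in range(steps):
        state = rk4_step(state, lam, dt)
        path.append(state)
    return path

def kinetic_energy(state):
    """Kinetic energy (1/2)|c'|^2 in the Euclidean metric."""
    _, _, vx, vy = state
    return 0.5 * (vx * vx + vy * vy)

if __name__ == "__main__":
    # Unit-speed start at the origin with charge lam = 2, over one period pi.
    path = integrate(2.0, (0.0, 0.0, 1.0, 0.0), math.pi, 2000)
    drift = max(abs(kinetic_energy(s) - 0.5) for s in path)
    print("maximal energy drift:", drift)
```

Because the force is always orthogonal to the velocity, the energy stays at its initial value up to integration error, and with these initial data the trajectory is a circle of radius $1/|\lambda|$ traversed once over the chosen time interval.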

Relation to physics. If the group $G$ acting on $M$ is abelian, then the Wong equations are known as the Lorentz equations. In this case the co-adjoint bundle $\mathrm{Ad}^*(M)$ is the trivial bundle $M \times \mathfrak g^*$, where elements of the fiber $\mathfrak g^*$ represent charges. The condition of covariant constancy (5.27) asserts that the charge $\lambda(t)$ is constant. The remaining equation (5.26) is a family of equations parametrized by the charge. The sub-Riemannian geodesics corresponding to the bundle type structures are projected to the motion of a particle on the base manifold $B$ under the influence of the magnetic field $\Omega$, defined by the curvature of the horizontal distribution $D$. We get a family of curves parametrized by the charge $\lambda$. This observation is one of the main parts of the Kaluza–Klein theory. The Lorentz equations play an important role in classical electrodynamics, and both the classical and quantum versions of electromagnetism are highly successful physical theories. The quantum version of non-abelian gauge theory is a quite successful and actively developing subject, as we discussed at the beginning of the section. An interesting peculiarity is that, in contrast to the abelian electromagnetic case, the non-abelian quantum theory has no physically meaningful classical analogue; in other words, there is no such thing as a classical quark (non-abelian electron) or a classical gluon (non-abelian photon). It seems that the non-abelian Lorentz equations have no useful physical applications in high energy particle theory, but they have found applications to mechanical systems such as falling, swimming, orbiting, and rolling. In the next section we describe the rolling system of two bodies, leaving other applications aside. The reader can find interesting examples of principal bundles associated to mechanical systems in [111].


5.5.1. Heisenberg group.

The Heisenberg manifold. Let $B = \mathbb R^2$ be the base space, where an electrically charged particle will move. Assume that $B$ is endowed with the usual Euclidean metric and use the standard coordinate system $b = (x, y)$. Let $\pi : M = \mathbb R^3 \to B$ be the principal $\mathbb R$-bundle, where $\pi = \mathrm{pr}_{1,2}$ is the projection on the plane formed by the first two coordinates. The action of the structure group $G = (\mathbb R, +)$ is defined by
\[
\mu\colon \mathbb R^3 \times \mathbb R \to \mathbb R^3, \qquad (x, y, t).\tau \mapsto (x, y, t + \tau).
\]
The vertical space is $V_q = \ker(d_q\pi) = \mathrm{span}\{\partial_t\}$. Let us choose the horizontal distribution $D = \mathrm{span}\{\partial_x, \partial_y\}$. Then the connection $\mathbb R$-valued one-form is just $A = dt$ and the curvature form is $\Omega = -dA \equiv 0$. We see that there is no magnetic field and the motion is a free motion on the base space, or its copy. Geodesics are straight lines. The sub-Riemannian metric $g_D$, which is the pullback of the Euclidean metric, is the Euclidean metric on $D$. The distribution $D$ is orthogonal to $V$ with respect to the Euclidean metric in $\mathbb R^3$, it is not bracket generating, the sub-Riemannian manifold $(\mathbb R^3, D, g_D)$ is the foliation by planes $t = \text{constant}$, and the motion is possible only inside of a plane defined by an initial position of the particle.

We now choose the horizontal distribution in another, non-trivial way, for instance $D = \mathrm{span}\{X, Y\}$, where
\[
X = \partial_x, \qquad Y = \partial_y + x\,\partial_t,
\]
or in a more symmetric way
\[
X = \partial_x - \tfrac{1}{2} y\,\partial_t, \qquad Y = \partial_y + \tfrac{1}{2} x\,\partial_t.
\]
In this case the reader recognizes the Heisenberg distribution. Since
\[
d\pi(X) = \partial_x, \qquad d\pi(Y) = \partial_y,
\]
the pullback of the Euclidean metric from $\mathbb R^2$ to $D$ is represented by the identity matrix and makes the vector fields $X, Y$ into an orthonormal basis of $D$. The connection form is $A = dt - \tfrac{1}{2} x\,dy + \tfrac{1}{2} y\,dx$ and, as we remember, it is the dual form to the vertical vector field $T = \partial_t$. The vector fields $X, Y, T$ and their commutation relations define the Heisenberg group structure in $\mathbb R^3$ through the BCH-formula (8.1). The curvature form $\Omega = -dA = dx \wedge dy$ is constant, non-vanishing and is equal to the volume form on $\mathbb R^2$. Any form $\Omega = F(x, y)\,dx \wedge dy$ represents a magnetic field in the base space $B = \mathbb R^2$, which can also be thought of as a field $0\,dy \wedge dt + 0\,dt \wedge dx + F(x, y)\,dx \wedge dy$ orthogonal to the base space $B$. As is known, in order to be a magnetic field the form $\Omega$ has to satisfy the Maxwell equation $d\Omega = 0$, which is true in this case. Observe that in the presence of the abelian structure group the curvature form given by $\Omega = -dA$ (since $[A, A] = 0$ in this case) automatically satisfies the Maxwell equation. In the case $F(x, y) = \text{constant}$ the magnetic field coincides up to a constant with the Heisenberg curvature form.
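The claims about the symmetric choice of $X$ and $Y$ can be checked by a short computation. The following sketch is a hedged illustration, not part of the original exposition: it evaluates the Lie bracket $[X, Y]$ by central finite differences (exact up to rounding here, because the coefficients are affine) and applies the connection form $A = dt - \tfrac12 x\,dy + \tfrac12 y\,dx$. The helper names `directional_derivative`, `lie_bracket` and `connection_form` are hypothetical.

```python
def X(p):
    """Horizontal field X = d/dx - (y/2) d/dt in coordinates (x, y, t)."""
    x, y, t = p
    return (1.0, 0.0, -0.5 * y)

def Y(p):
    """Horizontal field Y = d/dy + (x/2) d/dt in coordinates (x, y, t)."""
    x, y, t = p
    return (0.0, 1.0, 0.5 * x)

def directional_derivative(field, p, v, h=1e-4):
    """Central difference of the components of `field` at p in direction v."""
    plus = field(tuple(pi + h * vi for pi, vi in zip(p, v)))
    minus = field(tuple(pi - h * vi for pi, vi in zip(p, v)))
    return tuple((a - b) / (2 * h) for a, b in zip(plus, minus))

def lie_bracket(U, V, p):
    """Coordinate formula [U, V] = (derivative of V along U) - (derivative of U along V)."""
    dV = directional_derivative(V, p, U(p))
    dU = directional_derivative(U, p, V(p))
    return tuple(a - b for a, b in zip(dV, dU))

def connection_form(p, v):
    """A = dt - (1/2) x dy + (1/2) y dx evaluated on the vector v at p."""
    x, y, t = p
    vx, vy, vt = v
    return vt - 0.5 * x * vy + 0.5 * y * vx

if __name__ == "__main__":
    p = (0.7, -1.3, 2.0)
    print("[X, Y] at p:", lie_bracket(X, Y, p))
    print("A(X), A(Y):", connection_form(p, X(p)), connection_form(p, Y(p)))
```

The bracket comes out as $\partial_t$, so $A([X, Y]) = 1 \neq 0$ while $A(X) = A(Y) = 0$: the distribution is the kernel of $A$, it is bracket generating, and its curvature does not vanish.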


Let us have a look at the Lorentz equation (5.26). The Lie algebra of the structure group is $\mathbb R$ and the dual to it is also $\mathbb R$. Thus, the co-adjoint bundle is a trivial bundle with the typical fiber $\mathbb R$. The covariant derivative is the usual derivative and we get that the charge $\lambda$ has to be constant by (5.27). The horizontal lift $h$ maps the velocity vector $\dot c = \dot x\,\partial_x + \dot y\,\partial_y \in T_c\mathbb R^2$ to $\dot x\,X(c) + \dot y\,Y(c) \in D_c$. Moreover,
\[
\nabla_{\dot c}\dot c = \ddot c, \qquad i_{\dot c}\Omega = \Omega(\dot c),
\]
and the Lorentz equation becomes
\[
\ddot c = -\lambda\,\Omega(\dot c).
\]
The last equation coincides with the Hamiltonian equation (3.6) with $\theta_0 = \lambda$. So, geodesics produced by these equations are the Heisenberg geodesics described in Subsection 3.2.

Heisenberg manifold and $S^3$ with the Kähler structure. Suppose that we are given as the base space a manifold $B$, endowed with the Kähler structure. Let us recall the definition of a Kähler manifold.

Definition 25. Let $M$ be a complex integrable manifold with corresponding complex structure $J$. We say that a Riemannian metric $g_M$ is compatible with $J$ if
\[
g_M(v, w) = g_M(Jv, Jw) \quad \text{for all } v, w \in T_qM,\ q \in M.
\]
The triplet $(M, J, g_M)$ is called a Hermitian manifold. The compatible $J$ and $g_M$ define a skew symmetric form $\omega$ by
\[
\omega(v, w) := g_M(Jv, w) \quad \text{for all } v, w \in T_qM,\ q \in M, \tag{5.28}
\]
and it is called the associated Kähler form. We can also retrieve $g_M$ from $\omega$.

Definition 26. A Kähler manifold is a complex integrable manifold $M$ endowed with the complex structure $J$, compatible Riemannian metric $g_M$, and associated Kähler form $\omega$, such that $d\omega = 0$.

It is important for us to know that the Kähler form $\omega$ and the Riemannian metric $g_M$ are related by (5.28). We suppose that the curvature form $\Omega$ is given by the Kähler form $\omega$. The Lorentz equation becomes
\[
\nabla_{\dot c}\dot c = -\lambda J_c(\dot c).
\]
We recognize the equations (3.16) and (5.14) obtained by variational methods for the Heisenberg group and for the Hopf fibration on $S^3$. Observe that the Levi-Civita connections in (3.16) and (5.14) are connections related to Riemannian metrics on the total space $M$ and the solutions $\gamma$ are the sub-Riemannian geodesics on $M$. Meanwhile, the Levi-Civita connection in the Lorentz equation is the connection related to the Riemannian metric on the base space $B$ and the solution is a curve $c$ in the base space, parameterized by the charge. However, projections of $\gamma$ coincide with $c$, as was asserted in Theorem 5.2.

Curvature of geodesics on surfaces. Let $B$ be an oriented surface (two-dimensional manifold) furnished with a Riemannian metric $g_B$. Let $\omega$ be its area


form. The equation $\omega(v, w) = g_B(Jv, w)$ for $v, w \in T_bB$ defines an almost complex structure $J_b : T_bB \to T_bB$ on $B$. Consider a principal $U(1)$-bundle $\pi : M \to B$ over $B$. The curvature form $\Omega$ can be written as $\Omega = F\omega$ for some $F \in C^\infty(B)$. The scalar field $F$ or the endomorphism $FJ : TB \to TB$ defines the magnetic field acting on the surface $B$. The Lorentz equation becomes
\[
\nabla_{\dot c}\dot c = -F J_c(\dot c)
\]
for a particle of unit charge. Suppose that the geodesic $c$ is parametrized by arc length: $g_B(\dot c, \dot c) = 1$. The geodesic curvature $k_{\mathrm{geod}}(t)$ along $c$ is defined by
\[
k_{\mathrm{geod}}(t) = g_B(J_c(\dot c), \nabla_{\dot c}\dot c)
\]
and the Lorentz equation is equivalent to
\[
k_{\mathrm{geod}}(t) = F(c(t)).
\]
Suppose now that the magnetic field $F$ is constant, which leads to constant geodesic curvature. Thus, if the base space $B$ is the plane then the geodesics are circles, which coincides with the Heisenberg case. If $B$ is the two-dimensional sphere $S^2$, then geodesics will be great circles and this is reflected in the picture of the Hopf fibration $S^3$ over $S^2$.

5.5.2. Quaternionic H-type group with the Lorentzian metric. In this example we take as the base space $B$ the Minkowski space $\mathbb R^{3,1} = (\mathbb R^4, g_B)$, where $g_B$ is a non-degenerate metric tensor of index 1 having the associated matrix
\[
\begin{pmatrix}
-1 & 0 & 0 & 0\\
0 & 1 & 0 & 0\\
0 & 0 & 1 & 0\\
0 & 0 & 0 & 1
\end{pmatrix}.
\]
Define the principal $\mathbb R^3$-bundle $\pi : \mathbb R^{4+3} \to \mathbb R^4$ with the standard projection onto the four-dimensional subspace, where the abelian group $(\mathbb R^3, +)$ acts by
\[
\mu\colon \mathbb R^7 \times \mathbb R^3 \to \mathbb R^7, \qquad (b, t_1, t_2, t_3).(\tau_1, \tau_2, \tau_3) \mapsto (b, t_1 + \tau_1, t_2 + \tau_2, t_3 + \tau_3), \quad b \in \mathbb R^4.
\]
Analogously to the Heisenberg manifold we find the vertical space $V_q$, $q = (b, t_1, t_2, t_3)$, by $V = \mathrm{span}\{\partial_{t_1}, \partial_{t_2}, \partial_{t_3}\}$. We choose the horizontal distribution $D$ as the span of X11, X21, X31, X41 in (3.32). With this we get the Ehresmann connection and recover the structure of the quaternionic H-type group of the lowest dimension. The choice of the basis for $D$ is completely defined by the choice of almost complex structures (3.30). The pullback of the Minkowski metric is now a non-degenerate metric of index 1, converting the vector fields X11, X21, X31, X41 into an orthonormal basis and defining the first one, X11, as a globally defined timelike vector field. The corresponding geometry can be called sub-Lorentzian geometry. Different types of sub-Lorentzian manifolds were studied in [27, 61, 62, 90, 91, 92].


The curvature form $\Omega$ defines an $\mathbb R^3$-valued constant magnetic field. The projections of sub-Lorentzian geodesics into the Minkowski base space are divided into three different causal types and were considered in detail in [91]. The change of the principal $\mathbb R^3$-bundle to the bundle $\pi : B \times S^1 \to B$, $B = \mathbb R^{3,1}$, where the abelian group $U(1)$ acts on the slot $S^1$ by the standard multiplication of complex numbers, leads to the classical Kaluza–Klein model.

Exercises

1. Consider a round version $\mathbb R^2 \times S^1$ of the Heisenberg group $H^1 \cong \mathbb R^3$ as a principal $U(1)$-bundle, where the action is defined by
\[
(\mathbb R^2 \times S^1) \times U(1) \to \mathbb R^2 \times S^1, \qquad (x, y, e^{i\varphi}).\upsilon \mapsto (x, y, e^{i\varphi}.\upsilon).
\]
Find a horizontal distribution with non-trivial curvature form. Write basic vector fields for the horizontal distribution and find the commutation relations between them. Do they form the Heisenberg algebra? If yes, find the group multiplication law in $\mathbb R^3$ by making use of the BCH-formula.

2. Construct the Kaluza–Klein model associated with two-step Carnot groups in Subsection 3.4.

6. Rolling manifolds

Rolling surfaces without slipping or twisting is one of the classical kinematic problems that in recent years has again attracted the attention of mathematicians due to its geometric and analytic richness. The kinematic conditions of rolling without slipping or twisting are described by means of motion on a configuration space being tangential to a smooth sub-bundle of the tangent bundle of the configuration space that we call, as before, the horizontal distribution. The precise definition of the mentioned motion in the case of two n-dimensional manifolds imbedded in $\mathbb R^N$, given for example in [125], involves studying the behavior of the tangent bundles of the manifolds and the normal bundles induced by the embeddings. This extrinsic point of view, which depends on the embeddings, has been successfully applied, for instance in [72, 73]. The drawback of the extrinsic approach is that the geometric descriptions depend strongly on the embedding under consideration. So far, few attempts have been made to formulate this problem intrinsically. An early enlightening formulation is given in [18]; it is achieved by means of an intrinsic version of the moving frame method of Élie Cartan, see [24, 126]. One of the important results established there is the bracket generating property of the rank two distribution corresponding to no-twisting and no-slipping restrictions, namely, if the two surfaces have different Gaussian curvature, then the distribution is bracket generating, see [18]. A control theoretic approach to the same problem, studied in [3], has the advantage that the kinematic restrictions are written explicitly as vector fields on the appropriate configuration space.


We present here a short description of a generalization of the kinematic problem for two n-dimensional abstract manifolds rolling without twisting or slipping via an intrinsic formulation. We define the configuration space of the system, present an extrinsic definition of rolling for manifolds imbedded into the Euclidean space, several equivalent definitions of rolling involving intrinsic characteristics, and discuss their relations. The intrinsic approach permits us to determine the embedding-independent information contained in the extrinsic definition.

6.1. Rolling of embedded manifolds

Let $M$ and $\hat M$ be oriented, connected, n-dimensional Riemannian manifolds isometrically imbedded into $\mathbb R^{n+\nu}$, equipped with the standard Euclidean metric and standard orientation. Isometric embeddings always exist due to a result of Nash [115] and we denote them by $\iota$ and $\hat\iota$, respectively. The corresponding Riemannian metrics on $M$ and $\hat M$ coincide with the restrictions of the Euclidean metric from $\mathbb R^{n+\nu}$ and they will be denoted by $g_M$ and $g_{\hat M}$.

Objects (points, curves, ...) related to the manifold $\hat M$ will be marked by a hat ($\hat{\ }$) on top, objects related to $M$ will be free of it, while those related to the ambient space $\mathbb R^N$, $N = n + \nu$, will carry a bar ($\bar{\ }$). Note that for any manifold $M$ imbedded in $\mathbb R^{n+\nu}$, there is a natural splitting of the tangent space of $\mathbb R^{n+\nu}$ into a direct sum:
\[
T_x\mathbb R^{n+\nu} = T_xM \oplus T_xM^\perp, \qquad x \in M, \tag{6.1}
\]
where $T_xM$ is the tangent space and $T_xM^\perp$ is the normal space to $M$ at $x$. According to the splitting (6.1), any vector $v \in T_x\mathbb R^{n+\nu}$, $x \in M$, can be written uniquely as the sum $v = v^\top + v^\perp$, where $v^\top \in T_xM$, $v^\perp \in T_xM^\perp$. Analogous projections can be defined for $\hat M$.

Let $\nabla$ denote the Levi-Civita connection on $M$ or on $\hat M$. The "ambient" Levi-Civita connection on $\mathbb R^{n+\nu}$ is denoted by $\bar\nabla$. Note that if $X$ and $Y$ are tangent vector fields on $M$, and $\Upsilon$ is a normal vector field to $M$, then
\[
\nabla_XY(x) = \big(\bar\nabla_{\bar X}\bar Y(x)\big)^\top, \qquad \nabla^\perp_X\Upsilon(x) := \big(\bar\nabla_{\bar X}\bar\Upsilon(x)\big)^\perp, \qquad x \in M,
\]
where $\bar X$, $\bar Y$ and $\bar\Upsilon$ are any local extensions to $\mathbb R^{n+\nu}$ of the vector fields $X$, $Y$ and $\Upsilon$, respectively. Equivalent statements hold for $\hat M$.

If $Z$ and $\Psi$ are vector fields along a smooth curve $x : I \to \mathbb R^{n+\nu}$, we use $\frac{D}{dt}Z(t)$ to denote the covariant derivative of $Z$ along the curve $x$ and $\frac{D^\perp}{dt}\Psi$ for the normal covariant derivative of $\Psi$ along $x$ (these notations follow [117, p. 119]), see also Appendix A, Subsection 8.1. Observe that an isometric embedding of $M$ into $\mathbb R^{n+\nu}$ induces the equalities
\[
\frac{D}{dt}Z = \Big(\frac{d}{dt}Z\Big)^\top, \qquad \frac{D^\perp}{dt}\Psi = \Big(\frac{d}{dt}\Psi\Big)^\perp.
\]


A tangent vector field $Z$ along a smooth curve $x$ is parallel if $\frac{D}{dt}Z(t) = 0$ for every $t \in I$. We say that a normal vector field $\Psi$ along $x$ is normal parallel if $\frac{D^\perp}{dt}\Psi(t) = 0$ for every $t$. Definition 27 is a reformulation of the definition of a rolling map contained in [125, Appendix B]. The group $SE(N)$ of orientation preserving Riemannian isometries of $\mathbb R^N$ will play an important role. For the definition of the group $SE(N)$ see Appendix A, Subsection 8.3.

Definition 27. A rolling of $M$ on $\hat M$ without slipping or twisting is a smooth curve $(x, R) : [0, \tau] \to M \times SE(n + \nu)$ satisfying the following conditions:
(i) $\hat x(t) := R(t)\,x(t) \in \hat M$,
(ii) $d_{x(t)}R(t)\,T_{x(t)}M = T_{\hat x(t)}\hat M$,
(iii) $d_{x(t)}R(t)|_{T_{x(t)}M} : T_{x(t)}M \to T_{\hat x(t)}\hat M$ is orientation preserving,
(iv) no slip condition: $\dot{\hat x}(t) = d_{x(t)}R(t)\,\dot x(t)$ for every $t$,
(v) no twist condition (tangential part):
\[
d_{x(t)}R(t)\,\frac{D}{dt}Z(t) = \frac{D}{dt}\,d_{x(t)}R(t)\,Z(t)
\]
for any tangent vector field $Z(t)$ along $x(t)$ and every $t$,
(vi) no twist condition (normal part):
\[
d_{x(t)}R(t)\,\frac{D^\perp}{dt}\Psi(t) = \frac{D^\perp}{dt}\,d_{x(t)}R(t)\,\Psi(t)
\]
for any normal vector field $\Psi(t)$ along $x(t)$ and every $t$.

From now on we omit the words "without slipping or twisting" and just write "a rolling". Condition (v) is equivalent to the requirement that any tangent vector field $Z$ is parallel along the curve $x$ if and only if $d_xR\,Z$ is parallel along $\hat x$. As a consequence, this condition is automatically satisfied in the case of manifolds of dimension one. Similarly, condition (vi) is equivalent to the statement that any normal vector field $\Psi$ is normal parallel along the curve $x$ if and only if $d_xR\,\Psi$ is a normal parallel vector field along $\hat x$. Thus, for embeddings of co-dimension one, condition (vi) holds automatically.

Example 7. Consider the submanifolds of $\mathbb R^3$ defined by
\[
M = \{(\bar x_1, \sin\theta, 1 - \cos\theta) \in \mathbb R^3 \mid \bar x_1 \in \mathbb R,\ \theta \in [0, 2\pi)\}, \qquad
\hat M = \{(\bar x_1, \bar x_2, 0) \in \mathbb R^3 \mid \bar x_1, \bar x_2 \in \mathbb R\}.
\]
These are a cylinder and a plane. The rolling map
\[
R(t)\colon \bar x = \begin{pmatrix} \bar x_1\\ \bar x_2\\ \bar x_3 \end{pmatrix} \mapsto
\begin{pmatrix} \bar x_1\\ \bar x_2\cos t + (\bar x_3 - 1)\sin t + t\\ -\bar x_2\sin t + (\bar x_3 - 1)\cos t + 1 \end{pmatrix}
\]
describes the rolling of the infinite cylinder $M$ on $\hat M$ along the $\bar x_2$-axis with constant speed 1. Any choice of a smooth curve $x \in M$, given by
\[
x(t) = (\bar x_1, \sin t, 1 - \cos t), \qquad \bar x_1 \in \mathbb R, \quad t \in I \subset \mathbb R,
\]
defines the rolling.
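The cylinder example can be checked numerically. The sketch below is an illustration under stated assumptions rather than part of the original text: it evaluates the map $R(t)$ of Example 7 on the curve $x(t) = (\bar x_1, \sin t, 1 - \cos t)$, confirms that the contact curve $\hat x(t) = R(t)x(t)$ lies in the plane $\bar x_3 = 0$, and tests the no slip condition with the linear part of $R(t)$. The names `A`, `R`, `x_curve`, `x_dot`, `mat_vec` and `contact_curve`, and the sample value $\bar x_1 = 0.3$, are hypothetical.

```python
import math

def A(t):
    """Linear part of the rolling map R(t) of Example 7 (a rotation about the x1-axis)."""
    c, s = math.cos(t), math.sin(t)
    return ((1.0, 0.0, 0.0),
            (0.0, c, s),
            (0.0, -s, c))

def R(t, p):
    """The rolling map R(t) applied to a point p = (x1, x2, x3) of R^3."""
    x1, x2, x3 = p
    c, s = math.cos(t), math.sin(t)
    return (x1, x2 * c + (x3 - 1.0) * s + t, -x2 * s + (x3 - 1.0) * c + 1.0)

def x_curve(t, x1=0.3):
    """Curve on the cylinder M; the value of x1 is an arbitrary sample."""
    return (x1, math.sin(t), 1.0 - math.cos(t))

def x_dot(t):
    """Velocity of x_curve (independent of x1)."""
    return (0.0, math.cos(t), math.sin(t))

def mat_vec(m, v):
    """Product of a 3x3 matrix (tuple of rows) with a vector."""
    return tuple(sum(mij * vj for mij, vj in zip(row, v)) for row in m)

def contact_curve(t, x1=0.3):
    """Contact curve on the plane: x_hat(t) = R(t) x(t)."""
    return R(t, x_curve(t, x1))

if __name__ == "__main__":
    for t in (0.0, 0.5, 1.0, 2.0):
        print(t, contact_curve(t), mat_vec(A(t), x_dot(t)))
```

For every $t$ the contact point is $(\bar x_1, t, 0)$ and $d_{x(t)}R(t)\,\dot x(t) = (0, 1, 0)$, which is the velocity of the contact curve, in agreement with conditions (i) and (iv) of Definition 27.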

Figure 6.1. Rolling of the cylinder over the plane.

Notice that Definition 27 ignores physical restrictions given by the actual shapes of the imbedded manifolds. If we think of $M$ and $\hat M$ as touching along the curves $x$ and $\hat x$ and rolling according to the isometry $R$, then we cannot rule out the possibility that there might be transverse intersections between the manifolds other than the contact points.

6.2. Intrinsic rolling

In this section we introduce a new object called intrinsic rolling.

6.2.1. Frame bundles and bundles of isometries. Let $V$ and $\hat V$ be two oriented inner product n-dimensional spaces. We denote by $SO(V, \hat V)$ the collection of all linear orientation-preserving isometries between $V$ and $\hat V$. When $V = \hat V$, we write $SO(V)$ instead of $SO(V, V)$. Note that $SO(V)$ is a group. Given any choice of the basis in $V$, we can write an element of $SO(V)$ as an $(n \times n)$-matrix. However, since there is no canonical choice of the basis on $V$, the group $SO(V)$ is not canonically isomorphic to $SO(n)$.

For any pair $M$ and $\hat M$, we introduce the space $Q$ of all relative positions in which $M$ can be tangent to $\hat M$:
\[
Q = \big\{ q \in SO(T_xM, T_{\hat x}\hat M) \ \big|\ x \in M,\ \hat x \in \hat M \big\}. \tag{6.2}
\]
This space is a manifold with a structure of an $SO(n)$-fiber bundle over $M \times \hat M$ and can be considered as the configuration space of the rolling. Its dimension is $\frac{n(n+3)}{2}$. Notice that it is not a principal $SO(n)$-bundle since the action of $SO(n)$ on the fiber depends on the choice of coordinates in $M$ and $\hat M$. To see this in more detail, we describe the space $Q$ in terms of frame bundles. Let $F$ and $\hat F$ be the oriented orthonormal frame bundles of $M$ and $\hat M$, respectively. As we know from Subsection 8.5.1 of Appendix A, $F$ and $\hat F$ are principal $SO(n)$-bundles.


Consider $F \times \hat F$ as a bundle over $M \times \hat M$. The group $SO(n)$ acts on the right on $F$ and $\hat F$ and we can divide by this action diagonally on fibers. Then, we can identify $Q$ with $(F \times \hat F)/SO(n)$ by the map assigning to each equivalence class $[f, \hat f]$ the mapping $q \in Q$ such that
\[
\hat f_j = q f_j \quad \text{for } j = 1, \dots, n. \tag{6.3}
\]
Clearly, this construction does not depend on the choice of a representative of an equivalence class of $(F \times \hat F)/SO(n)$. Conversely, given an isometry $q \in Q$, there exists a unique equivalence class of frames satisfying (6.3).

As we see, we can define the right action by $SO(T_xM)$ or the left action by $SO(T_{\hat x}\hat M)$ on $(F \times \hat F)/SO(n)$. Since both groups are not canonically isomorphic to $SO(n)$ (except for the case when $n = 2$), the configuration space $Q = (F \times \hat F)/SO(n)$ does not have the structure of a principal $SO(n)$-bundle. However, since $Q$ is an $SO(n)$-fiber bundle we can exploit its local properties and think that it looks locally like the product $M \times \hat M \times SO(n)$. Let $U$ be a neighborhood in $M$ such that $F|_U$ is trivial and let $v$ be a section of $F|_U$: $v(x) = (v_1(x), \dots, v_n(x))$, $x \in U$. Each section determines a left action of $SO(n)$ on $F|_U$. To see this, recall that for each $x \in U$, the frame $v(x)$ can be considered as an isometry $v(x) : \mathbb R^n \to T_xM$. The left action takes the following form: if $f \in F_x$ is any other frame at $x \in U$, written in terms of the frame $v$ as
\[
f_j = \sum_{i=1}^n f_{ij}\,v_i(x),
\]
then $\tau = (\tau_{ij})_{i,j=1}^n \in SO(n)$ acts on the left on $f$ via the equation
\[
\tau.f_j = \sum_{i,k=1}^n f_{ij}\,\tau_{ki}\,v_k, \qquad j = 1, \dots, n.
\]
Observe that this action depends on the choice of the frame $v$.

This defines local left and right actions of $SO(n)$ on $Q$ as follows. Let $U$ and $\hat U$ be neighborhoods in $M$ and $\hat M$ respectively, so that both frame bundles trivialize over these neighborhoods. Let $v : U \to F|_U$ and $\hat v : \hat U \to \hat F|_{\hat U}$ be sections. We define the left action of $\tau \in SO(n)$ on $Q$ with respect to $\hat v$ by
\[
\tau.\hat f_j = (\tau.q) f_j,
\]
where the left action of $\tau$ on $\hat f_j$ is defined with respect to $\hat v$ and $\hat f_j = q f_j$ for $j = 1, \dots, n$. Similarly, the right action of $SO(n)$ on $Q$ with respect to $v$ is defined by
\[
\hat f_j = (q.\tau)\big(\tau^{-1}.f_j\big).
\]
Remark that if we have a matrix representation $\tau_0 \in SO(n)$ of an element $q$ in coordinates of $M$ and $\hat M$ by $\tau_0 = \big(g_{\hat M}(\hat v_i, q v_j)\big)_{i,j=1}^n$, then we have
\[
\big(g_{\hat M}(\hat v_i, (\tau.q) v_j)\big)_{i,j=1}^n = \tau\tau_0 \quad \text{and} \quad \big(g_{\hat M}(\hat v_i, (q.\tau) v_j)\big)_{i,j=1}^n = \tau_0\tau, \qquad \tau \in SO(n).
\]


6.2.2. Reformulation of rolling in terms of bundles. Both formulations of rolling surfaces given in [3, 18] define the configuration space as a manifold of isometries of tangent spaces of $M$ and $\hat M$, as we did before, without taking into account the embedding into the ambient space. The condition (vi) imposed on a rolling $(x, R)$ by Definition 27 is non-trivial whenever the codimension $\nu$ of the imbedded manifolds is greater than 1. So, it is natural to suppose that the total configuration space of the rolling system will have a normal component which takes care of the action of $R$ on the normal bundle. Therefore, by analogy with the construction of $Q$, we define a fiber bundle over $M \times \hat M$ of isometries of the normal tangent space. Let $\iota : M \to \mathbb R^{n+\nu}$ and $\hat\iota : \hat M \to \mathbb R^{n+\nu}$ be two embeddings, given as initial data. Let $\Phi$ be the principal $SO(\nu)$-bundle over $M$, such that the fiber over a point $x \in M$ consists of all positively oriented orthonormal frames $\{\varepsilon_\lambda(x)\}_{\lambda=1}^{\nu}$ spanning $T_xM^\perp$. Let $\hat\Phi$ be the similarly defined principal $SO(\nu)$-bundle on $\hat M$. As was done previously, we identify the manifold $(\Phi \times \hat\Phi)/SO(\nu)$ with
\[
P_{\iota,\hat\iota} := \big\{ p \in SO(T_xM^\perp, T_{\hat x}\hat M^\perp) \ \big|\ x \in M,\ \hat x \in \hat M \big\}. \tag{6.4}
\]
The space $P_{\iota,\hat\iota}$ is not in general a principal $SO(\nu)$-bundle, but there are local left and right actions defined similarly as on $Q$. We notice, and reflect in the notation, that $Q$ is invariant of embeddings, while $P_{\iota,\hat\iota}$ is not. We obtain $\dim(P_{\iota,\hat\iota}) = 2n + \frac{\nu(\nu-1)}{2}$. We form the direct sum $Q \oplus P_{\iota,\hat\iota}$ for the fiber bundle over $M \times \hat M$, so that the fiber over $(x, \hat x) \in M \times \hat M$ is $Q_{(x,\hat x)} \times P_{\iota,\hat\iota\,(x,\hat x)}$. Thus $\dim(Q \oplus P_{\iota,\hat\iota}) = \frac{n(n+3) + \nu(\nu-1)}{2}$.

The following proposition allows us to reformulate Definition 27.

Proposition 9. If a curve $(x, R) : [0, \tau] \to M \times SE(n + \nu)$ satisfies the conditions (i)–(vi) in Definition 27, then the mapping
\[
t \mapsto \big(d_{x(t)}R(t)|_{T_{x(t)}M},\ d_{x(t)}R(t)|_{T_{x(t)}M^\perp}\big) =: (q(t), p(t))
\]
defines a curve in $Q \oplus P_{\iota,\hat\iota}$ with the following properties:
(I) no slip condition: $\dot{\hat x}(t) = q(t)\,\dot x(t)$ for every $t$.
(II) no twist condition, tangential part: $q(t)\frac{D}{dt}Z(t) = \frac{D}{dt}q(t)Z(t)$ for any tangent vector field $Z(t)$ along $x(t)$ and every $t$.
(III) no twist condition, normal part: $p(t)\frac{D^\perp}{dt}\Psi(t) = \frac{D^\perp}{dt}p(t)\Psi(t)$ for any normal vector field $\Psi(t)$ along $x(t)$ and every $t$.
Conversely, if $(q, p) : [0, \tau] \to Q \oplus P_{\iota,\hat\iota}$ is a smooth curve satisfying (I)–(III), then there exists a unique rolling $(x, R) : [0, \tau] \to M \times SE(n + \nu)$, such that $d_{x(t)}R(t)|_{T_{x(t)}M} = q(t)$ and $d_{x(t)}R(t)|_{T_{x(t)}M^\perp} = p(t)$.

Proof. Assume that $(x, R) : [0, \tau] \to M \times SE(n + \nu)$ is a rolling map satisfying (i)–(vi). The conditions (i) and (ii) assure that
\[
d_{x(t)}R(t)|_{T_{x(t)}M} \in SO(T_{x(t)}M, T_{\hat x(t)}\hat M) \quad \text{and} \quad d_{x(t)}R(t)|_{T_{x(t)}M^\perp} \in SO(T_{x(t)}M^\perp, T_{\hat x(t)}\hat M^\perp). \tag{6.5}
\]


Since $d_{x(t)}R(t)$ must be orientation preserving in $\mathbb R^{n+\nu}$ for any $t \in [0, \tau]$, we conclude that both of the mappings (6.5) are either orientation reversing or orientation preserving. The additional requirement (iii) implies that $(q, p)$ is orientation preserving. The conditions (I)–(III) correspond to the conditions (iv)–(vi).

Conversely, if we have a curve $(q, p)$ in $Q \oplus P_{\iota,\hat\iota}$ with projection $(x, \hat x)$ into $M \times \hat M$, then we obtain an isometry $R \in SE(n + \nu)$ in the following way: $R(t) : \bar x \mapsto \bar A(t)\bar x + \bar a(t)$, $\bar A(t) \in SO(n + \nu)$, where $\bar A(t) = d_{x(t)}R(t)$ is determined by the conditions
\[
d_{x(t)}R(t)|_{T_{x(t)}M} = q(t)|_{T_{x(t)}M}, \qquad d_{x(t)}R(t)|_{T_{x(t)}M^\perp} = p(t)|_{T_{x(t)}M^\perp}.
\]
Then for the images of $d_{x(t)}R(t)$ we have
\[
\operatorname{Im}\big(d_{x(t)}R(t)|_{T_{x(t)}M}\big) = T_{\hat x(t)}\hat M, \qquad \operatorname{Im}\big(d_{x(t)}R(t)|_{T_{x(t)}M^\perp}\big) = T_{\hat x(t)}\hat M^\perp.
\]
The vector $\bar a(t)$ is determined by $\bar a(t) = \hat x(t) - \bar A(t)x(t)$ for any $t \in [0, \tau]$.

The one-to-one correspondence between rolling maps and smooth curves in $Q \oplus P_{\iota,\hat\iota}$ satisfying (I)–(III) naturally leads to a definition of a rolling map in terms of these bundles.

Definition 28. A rolling of $M$ on $\hat M$ without slipping or twisting is a smooth curve $(q, p) : [0, \tau] \to Q \oplus P_{\iota,\hat\iota}$ such that $(q(t), p(t))$ satisfies
(I) no slip condition: $\dot{\hat x}(t) = q(t)\,\dot x(t)$ for every $t$,
(II) no twist condition, tangential part: $q(t)\frac{D}{dt}Z(t) = \frac{D}{dt}q(t)Z(t)$ for every $t \in [0, \tau]$ and for any tangent vector field $Z$ along $x$,
(III) no twist condition, normal part: $p(t)\frac{D^\perp}{dt}\Psi(t) = \frac{D^\perp}{dt}p(t)\Psi(t)$ for every $t \in [0, \tau]$ and for any normal vector field $\Psi$ along $x$.
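The dimension of the configuration space $Q \oplus P_{\iota,\hat\iota}$ can be recovered by adding the dimension $2n$ of the base $M \times \hat M$ to the fiber dimensions $\dim SO(n) = n(n-1)/2$ and $\dim SO(\nu) = \nu(\nu-1)/2$. The short sketch below only double-checks this bookkeeping against the closed formulas quoted in the text; the function names `dim_Q`, `dim_P` and `dim_Q_plus_P` are hypothetical.

```python
def dim_Q(n):
    """dim Q: base M x M_hat of dimension 2n plus the fiber SO(n)."""
    return 2 * n + n * (n - 1) // 2

def dim_P(n, nu):
    """dim P: base M x M_hat of dimension 2n plus the fiber SO(nu)."""
    return 2 * n + nu * (nu - 1) // 2

def dim_Q_plus_P(n, nu):
    """dim of the direct sum over M x M_hat: the base is counted once, the fibers add."""
    return 2 * n + n * (n - 1) // 2 + nu * (nu - 1) // 2

if __name__ == "__main__":
    for n, nu in [(1, 1), (2, 1), (2, 2), (3, 2)]:
        print(n, nu, dim_Q(n), dim_P(n, nu), dim_Q_plus_P(n, nu))
```

For two surfaces in $\mathbb R^3$ ($n = 2$, $\nu = 1$) this gives the five-dimensional configuration space, and in general the count agrees with $\frac{n(n+3) + \nu(\nu-1)}{2}$.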

Proposition 9 implies that the bundle Q⊕Pι,ι can be seen as the configuration space for a rolling of two isometrically embedded manifolds ι : M → Rn+ν and  → Rn+ν . According to [125], the dimension n(n+3)+ν(ν−1) corresponds to  ι: M 2 the degrees of freedom of the system. A purely intrinsic definition of a rolling is deduced from Definition 28, by restricting it to the bundle Q. This concept naturally generalizes the definition given in [3] for two-dimensional Riemannian manifolds imbedded into R3 and we use the term intrinsic rolling for this object. Definition 29. An intrinsic rolling of two n-dimensional oriented Riemannian man without slipping or twisting is a smooth curve q : [0, τ ] → Q, with ifolds M on M (t) = prM projections x(t) = prM q(t) and x  q(t), satisfying the following conditions: ˙ (t) = q(t)x(t) ˙ for all t, (I ) no slip condition: x  (II ) no twist condition: Z is a parallel tangent vector field along the curve x, if and only if q Z is parallel along x .

Remark 8. If $n$-dimensional manifolds $M$ and $\widehat M$ are imbedded into Euclidean space $\mathbb{R}^{n+1}$, then for each pair of points $(x,\hat x)\in M\times\widehat M$ there is a unique orientation preserving isometry $p\colon T_xM^{\perp}\to T_{\hat x}\widehat M^{\perp}$. Hence, since $P_{\iota,\hat\iota}$ is an $SO(1)$-bundle, it can be identified with $M\times\widehat M$, and so $Q\oplus P_{\iota,\hat\iota}\cong Q$. In this case we see that the notion of rolling in Definition 28 coincides with the intrinsic rolling in Definition 29.

6.2.3. Extrinsic and intrinsic rollings along the same curves. Let $(x,\hat x)\colon[0,\tau]\to M\times\widehat M$ be a given pair of curves. We aim to give an answer to the following questions:
• If $q_1$ and $q_2$ are two intrinsic rollings of $M$ on $\widehat M$ along curves $x$ and $\hat x$, how are they related? What properties of the rolling are defined by fixing the paths $x$ and $\hat x$?
• Suppose an intrinsic rolling $q$ and embeddings $\iota\colon M\to\mathbb{R}^{n+\nu}$ and $\hat\iota\colon\widehat M\to\mathbb{R}^{n+\nu}$ are given. Is it possible to extend $q$ to a rolling $(q,p)$? Is this extension unique?
The following example clarifies the situation for one-dimensional manifolds, where different embeddings are easy to describe.

Example 8. Consider $\widehat M=\mathbb{R}$ with the usual Euclidean structure, and $M=S^1$ with the usual round metric and positive orientation counterclockwise. Let $x\colon[0,\tau]\to S^1$ be written as $x(t)=e^{i\varphi(t)}$, where $\varphi\colon[0,\tau]\to\mathbb{R}$ is an absolutely continuous function. Since $SO(1)=\{1\}$, the configuration space $Q$ for the intrinsic rolling is just $M\times\widehat M$. The no-slipping condition implies that
\[ \hat x(t)=\hat x(0)+\varphi(t)-\varphi(0), \]
and we may assume $\hat x(0)=\varphi(0)=0$. We consider different rollings of $M$ on $\widehat M$ under various embeddings. Without loss of generality, we may assume that $R(0)=\mathrm{id}_{\mathbb{R}^{1+\nu}}$ is the identity map in $\mathbb{R}^{1+\nu}$. We will use $r=(r_1,\dots,r_{1+\nu})$ for coordinates of $\mathbb{R}^{1+\nu}$.

Case 1: Consider the embeddings
\[ \iota_1\colon M\to\mathbb{R}^2,\quad e^{i\varphi}\mapsto(\sin\varphi,\,1-\cos\varphi),\qquad \hat\iota_1\colon\widehat M\to\mathbb{R}^2,\quad \hat x\mapsto(\hat x,0). \]
Simple calculations show that there is only one possible rolling.

Case 2: Consider the embeddings
\[ \iota_2\colon M\to\mathbb{R}^3,\quad e^{i\varphi}\mapsto\bigl(\sin\varphi,\,(1-\cos\varphi)\cos\theta_0,\,(1-\cos\varphi)\sin\theta_0\bigr),\qquad \hat\iota_2\colon\widehat M\to\mathbb{R}^3,\quad \hat x\mapsto(\hat x,0,0), \]

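where $\theta_0$ is any fixed angle from $(0,\frac{\pi}{2})$. Conditions (ii), (iii) and (iv) of Definition 27 imply that the differential $d_{x(t)}R(t)$ of $R(t)$, $t\in[0,\tau]$, in matrix form can be written uniquely as
\[
\begin{pmatrix} 1 & 0 & 0\\ 0 & \cos\kappa(t) & \sin\kappa(t)\\ 0 & -\sin\kappa(t) & \cos\kappa(t)\end{pmatrix}
\begin{pmatrix} \cos\varphi(t) & \sin\varphi(t) & 0\\ -\sin\varphi(t) & \cos\varphi(t) & 0\\ 0 & 0 & 1\end{pmatrix}
\begin{pmatrix} 1 & 0 & 0\\ 0 & \cos\theta_0 & \sin\theta_0\\ 0 & -\sin\theta_0 & \cos\theta_0\end{pmatrix},
\]
for some smooth function $\kappa\colon[0,\tau]\to\mathbb{R}$. To satisfy the normal no-twist condition, $d_{x(t)}R(t)$ must map the normal parallel vector fields on $M$
\[ \epsilon_1=-\sin\varphi(t)\frac{\partial}{\partial r_1}+\cos\varphi(t)\cos\theta_0\frac{\partial}{\partial r_2}+\cos\varphi(t)\sin\theta_0\frac{\partial}{\partial r_3},\qquad \epsilon_2=-\sin\theta_0\frac{\partial}{\partial r_2}+\cos\theta_0\frac{\partial}{\partial r_3}, \]
to normal parallel vector fields on $\widehat M$. Calculating the covariant derivative of $d_{x(t)}R(t)\epsilon_1$ and $d_{x(t)}R(t)\epsilon_2$, we conclude that $\kappa(t)$ is constant, and the assumption $R(0)=\mathrm{id}_{\mathbb{R}^{1+\nu}}$ implies that the constant is 0. Hence, the circle will roll along the line with a constant tilt given by $\theta_0$, see Figure 6.2.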
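The role of $\kappa$ in Case 2 can be made concrete numerically. The following is a minimal sketch, not the author's computation: it multiplies the three rotation matrices in the order written above and applies the product to the unit tangent of the tilted circle and to the two normal fields. The function names are illustrative, and the unit tangent $(\cos\varphi,\sin\varphi\cos\theta_0,\sin\varphi\sin\theta_0)$ is obtained here by differentiating $\iota_2$ with respect to $\varphi$.

```python
import math

def matmul(a, b):
    """Product of two 3x3 matrices stored as nested lists."""
    return [[sum(a[i][k] * b[k][j] for k in range(3)) for j in range(3)]
            for i in range(3)]

def apply(m, v):
    """Apply a 3x3 matrix to a vector of length 3."""
    return [sum(m[i][k] * v[k] for k in range(3)) for i in range(3)]

def differential(phi, kappa, theta0):
    """The three-factor product from Case 2, multiplied in the order written there."""
    ck, sk = math.cos(kappa), math.sin(kappa)
    cp, sp = math.cos(phi), math.sin(phi)
    c0, s0 = math.cos(theta0), math.sin(theta0)
    normal_twist = [[1, 0, 0], [0, ck, sk], [0, -sk, ck]]
    unroll = [[cp, sp, 0], [-sp, cp, 0], [0, 0, 1]]
    untilt = [[1, 0, 0], [0, c0, s0], [0, -s0, c0]]
    return matmul(normal_twist, matmul(unroll, untilt))

def circle_frame(phi, theta0):
    """Unit tangent of the tilted circle and the two normal fields of Case 2."""
    cp, sp = math.cos(phi), math.sin(phi)
    c0, s0 = math.cos(theta0), math.sin(theta0)
    tangent = [cp, sp * c0, sp * s0]
    eps1 = [-sp, cp * c0, cp * s0]
    eps2 = [0.0, -s0, c0]
    return tangent, eps1, eps2

def images(phi, kappa, theta0):
    """Images of (tangent, eps1, eps2) under the differential."""
    d = differential(phi, kappa, theta0)
    return [apply(d, v) for v in circle_frame(phi, theta0)]
```

For every $\varphi$ the tangent is sent to $(1,0,0)$, the direction of the embedded line, while the images of the two normal fields are $(0,\cos\kappa,-\sin\kappa)$ and $(0,\sin\kappa,\cos\kappa)$. Along a straight line these are normal parallel precisely when they are constant, which is the statement that $\kappa$ is constant.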

Figure 6.2. Case 2: $S^1$ rolling on $\mathbb{R}$. Different tilting angles give different embeddings, but equivalent rollings.

Case 3: Consider the isometric embedding of $\widehat M$ as a spiral,
\[ \hat\iota_3\colon\widehat M\to\mathbb{R}^3,\quad \hat x\mapsto\tfrac{1}{\sqrt2}(\cos\hat x,\,\sin\hat x,\,\hat x), \]
and $\iota_2$ from the previous case. In this situation, the circle $M$ will rotate along the spiral $\widehat M$. Checking the normal no-twist condition, we come to the same conclusion that the path is uniquely determined by the initial angle $\theta_0$.

Note that in all the cases above, the intrinsic rolling $t\mapsto(e^{i\varphi(t)},\varphi(t))$ either uniquely induces a rolling, or the rolling is determined by an initial configuration of the normal tangent spaces, which corresponds to the initial tilting angle $\theta_0$. In fact, it is also possible to find a choice of basis, consisting of normal parallel vector fields, so that the normal component of the rolling $p$ is constant with respect to this basis. We show in Lemma 4 below that this holds generally.

Let $x\colon[0,\tau]\to M$ and $\hat x\colon[0,\tau]\to\widehat M$ be two fixed curves. We denote by $\{e_j(t)\}_{j=1}^{n}$ a collection of parallel tangent vector fields along $x(t)$ forming an orthonormal basis for $T_{x(t)}M$, and by $\{\epsilon_\lambda(t)\}_{\lambda=1}^{\nu}$ a collection of normal parallel vector fields along $x(t)$ forming an orthonormal basis for $T_{x(t)}M^{\perp}$. Such vector fields can be constructed by parallel transport and normal parallel transport along $x(t)$. Similarly, along $\hat x(t)$, we define parallel frames $\{\hat e_i\}_{i=1}^{n}$ and $\{\hat\epsilon_\kappa\}_{\kappa=1}^{\nu}$. Recall that Latin indices $i,j,\dots$ vary from 1 to $n$, while Greek ones $\kappa,\lambda,\dots$ vary from 1 to $\nu$.

The following lemma shows that the image of a parallel frame over $M$ has constant coordinates in a parallel frame over $\widehat M$. This reflects the fact that rolling preserves parallel vector fields.

Lemma 4. A curve $(q,p)\colon[0,\tau]\to Q\oplus P_{\iota,\hat\iota}$ satisfies (II) and (III) if and only if the matrices
\[ A(t)=(a_{ij}(t))=\bigl(g_{\widehat M}(\hat e_i,q(t)e_j)\bigr),\qquad B(t)=(b_{\kappa\lambda}(t))=\bigl(g_{\widehat M}(\hat\epsilon_\kappa(t),p(t)\epsilon_\lambda(t))\bigr), \]
are constant for any $t\in[0,\tau]$.

Proof. Check that the derivatives $\dot a_{ij}(t)$ and $\dot b_{\kappa\lambda}(t)$ vanish, see also [57].



The following theorem gives an answer to the first question raised at the beginning of Subsection 6.2.3.

Theorem 6.1 ([57]). Let $q\colon[0,\tau]\to Q$ be a given intrinsic rolling map without slipping or twisting with projection $\mathrm{pr}_{M\times\widehat M}\,q(t)=(x(t),\hat x(t))$, $t\in[0,\tau]$. Define the vector spaces
\[ V=\{v \text{ is a parallel vector field along } x, \text{ and } g_M(v,\dot x)=0 \text{ for all } t\}, \]
\[ \widehat V=\{\hat v \text{ is a parallel vector field along } \hat x, \text{ and } g_{\widehat M}(\hat v,\dot{\hat x})=0 \text{ for all } t\}. \]
Then $\dim V=\dim\widehat V$ and, if we denote this dimension by $k$, the following holds.
(a) The map $q$ is the unique intrinsic rolling of $M$ on $\widehat M$ along curves $x$ and $\hat x$ if and only if $k\le 1$.
(b) If $k\ge 2$, all the rollings along $x$ and $\hat x$ differ from $q$ by an element in $SO(V)$.

In particular, if the curve $x\colon[0,\tau]\to M$ is a geodesic, we have the following consequence of Theorem 6.1.

Corollary 8. Assume that the curve $x$ is a geodesic in $M$. Then there exists an intrinsic rolling of $M$ on $\widehat M$ along $(x,\hat x)$ if and only if $\hat x$ is a geodesic with the same speed as $x$. Moreover, if $n\ge 2$, and if $V$ is defined as in Theorem 6.1, then $\dim V=n-1$, and all the rollings along $x$ and $\hat x$ differ by an element in $SO(V)$.

Concerning the problem of extending intrinsic rollings to extrinsic ones, the following theorem gives a complete answer to the question posed at the beginning of Subsection 6.2.3.


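Theorem 6.2. Let $q\colon[0,\tau]\to Q$ be an intrinsic rolling and let $\iota\colon M\to\mathbb{R}^{n+\nu}$ and $\hat\iota\colon\widehat M\to\mathbb{R}^{n+\nu}$ be given embeddings. Then, given an initial normal configuration $p_0\in(P_{\iota,\hat\iota})_{(x_0,\hat x_0)}$, where $(x_0,\hat x_0)=\mathrm{pr}_{M\times\widehat M}\,q(0)$, there exists a unique rolling $(q,p)\colon[0,\tau]\to Q\oplus P_{\iota,\hat\iota}$ satisfying $p(0)=p_0$.

Proof. Let $\{\epsilon_\lambda\}_{\lambda=1}^{\nu}$ and $\{\hat\epsilon_\kappa\}_{\kappa=1}^{\nu}$ be normal parallel frames along curves $x$ and $\hat x$, respectively. Let $B_0\in SO(\nu)$ be defined by
\[ B_0=\{b_{\kappa\lambda}\}_{\kappa,\lambda=1}^{\nu}=\bigl\{\bar g\bigl(\hat\epsilon_\kappa(0),p_0\epsilon_\lambda(0)\bigr)\bigr\}_{\kappa,\lambda=1}^{\nu}, \]
where $\bar g$ is the Euclidean metric in $\mathbb{R}^{n+\nu}$. Then $p(t)$ must satisfy
\[ b_{\kappa\lambda}=\bar g\bigl(\hat\epsilon_\kappa(t),p(t)\epsilon_\lambda(t)\bigr) \]
for any $t\in[0,\tau]$, by Lemma 4, and it is uniquely determined by this.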



We already gave the answer about the uniqueness of the intrinsic rolling $q$ in Theorem 6.1. Then the extension $(q,p)$ was proposed in Theorem 6.2. Now the natural question arises whether the extrinsic part $p$ is unique. In order to answer this question we define the vector spaces
\[ E=\{\epsilon(t) \text{ is a normal parallel vector field along } x(t) \text{ and } \bar g(\dot x(t),\epsilon(t))=0\}, \]
\[ \widehat E=\{\hat\epsilon(t) \text{ is a normal parallel vector field along } \hat x(t) \text{ and } \bar g(\dot{\hat x}(t),\hat\epsilon(t))=0\}, \]
with inner product $\bar g$ and orientation induced on $TM^{\perp}$ and $T\widehat M^{\perp}$ from the embeddings. Both vector spaces have dimension $\nu$. An extrinsic rolling $(q,p)$ extending an intrinsic rolling $q$ is determined up to a left action of $SO(\widehat E)$ or, equivalently, up to a right action of $SO(E)$. Both $SO(E)$ and $SO(\widehat E)$ are isomorphic to $SO(\nu)$, but not canonically.

6.3. Distributions for extrinsic and intrinsic rolling

The aim of this subsection is to formulate the kinematic conditions of no-slipping and no-twisting in terms of a distribution. In this setting, a rolling will be a smooth curve almost everywhere tangent to this distribution.

6.3.1. Local trivializations of Q. Let $\pi\colon Q\oplus P_{\iota,\hat\iota}\to M\times\widehat M$ denote the canonical projection. Consider a rolling $R(t)=(q,p)\colon[0,\tau]\to Q\oplus P_{\iota,\hat\iota}$; then $\pi\circ R=(x,\hat x)$. Given an arbitrary $t_0$ in the domain of $R$, let $U$ and $\widehat U$ denote neighborhoods of $x(t_0)$ and $\hat x(t_0)$ in $M$ and $\widehat M$, respectively, such that both bundles $TM$ and $TM^{\perp}$ are trivialized when restricted to $U$. In the same way we choose $\widehat U$ such that both $T\widehat M$ and $T\widehat M^{\perp}$ are trivialized when they are restricted to $\widehat U$. This implies that the bundle $\pi\colon Q\oplus P_{\iota,\hat\iota}\to M\times\widehat M$ trivializes when it is restricted to $U\times\widehat U$.

Each of the requirements (I)–(III) can be written as restrictions on $\dot R$. We show that all admissible values of $\dot R$ form a distribution, that is, a smooth subbundle of $T(Q\oplus P_{\iota,\hat\iota})$. We will use the local trivializations to describe this distribution.

6.3.2. The tangent space of SO(n). Let $U$ and $\widehat U$ be as in Subsection 6.3.1. The tangent space $T\pi^{-1}(U\times\widehat U)$ is isomorphic to the following direct sum under the trivialization:
\[ T\pi^{-1}(U\times\widehat U)=TU\times T\widehat U\times T\,SO(n)\times T\,SO(\nu). \]
The decomposition requires a detailed description of the tangent spaces $T\,SO(n)$ and $T\,SO(\nu)$ in terms of left and right invariant vector fields. We start by considering the embedding of $SO(n)$ in $GL(n)$. Denote the matrix entries of a matrix $A$ by $(a_{ij})$ and the transpose matrix by $A^{tr}$. Then, differentiating the condition $A^{tr}A=\mathbf{1}$, we obtain
\[ T\,SO(n)=\bigcap_{i\le j}\ker\omega_{ij},\qquad \omega_{ij}=\sum_{r=1}^{n}\bigl(a_{rj}\,da_{ri}+a_{ri}\,da_{rj}\bigr). \]
It is clear that the tangent space at the identity $\mathbf{1}$ of $SO(n)$ is spanned by
\[ W_{ij}(\mathbf{1}):=\frac{\partial}{\partial a_{ij}}-\frac{\partial}{\partial a_{ji}},\qquad 1\le i<j\le n. \]
We denote $\mathfrak{so}(n)=\operatorname{span}\{W_{ij}(\mathbf{1})\}$ following the classical notation. We use left translations of these vectors to define
\[ W_{ij}(A):=dl_A\,W_{ij}(\mathbf{1})=\sum_{r=1}^{n}\Bigl(a_{ri}\frac{\partial}{\partial a_{rj}}-a_{rj}\frac{\partial}{\partial a_{ri}}\Bigr) \tag{6.6} \]
as a global left invariant basis of $T\,SO(n)$. Note that the left and right action in $T\,SO(n)$ is described by
\[ dl_A\frac{\partial}{\partial a_{ij}}=\sum_{r=1}^{n}a_{ri}\frac{\partial}{\partial a_{rj}},\qquad dr_A\frac{\partial}{\partial a_{ij}}=\sum_{s=1}^{n}a_{js}\frac{\partial}{\partial a_{is}}. \]
We have the following formulas to switch from left to right translation and the other way around:
\[ dl_A\frac{\partial}{\partial a_{ij}}=\sum_{r,s=1}^{n}a_{ri}a_{sj}\,dr_A\frac{\partial}{\partial a_{rs}},\qquad dr_A\frac{\partial}{\partial a_{ij}}=\sum_{r,s=1}^{n}a_{ir}a_{js}\,dl_A\frac{\partial}{\partial a_{rs}}. \]
Therefore, the right invariant basis of $T\,SO(n)$ can be written as
\[ dr_A\,W_{ij}(\mathbf{1})=\sum_{r<s}\bigl(a_{ir}a_{js}-a_{jr}a_{is}\bigr)W_{rs}(A)=\mathrm{Ad}_{A^{-1}}W_{ij}(A). \]
If we also define $W_{ij}$ for $i>j$ (so $W_{ij}(A)=-W_{ji}(A)$), then the bracket relations are given by
\[ [W_{ij},W_{kl}]=\delta_{j,k}W_{il}+\delta_{i,l}W_{jk}-\delta_{i,k}W_{jl}-\delta_{j,l}W_{ik}. \]
The detailed calculation presented in this subsection can be found in [57].
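The bracket relations can be verified mechanically on matrices. The sketch below rests on one assumption that is standard for matrix groups but not spelled out above: the bracket of the left invariant fields $W_{ij}$ corresponds to the matrix commutator of their values at the identity, where $W_{ij}(\mathbf{1})$ is represented by $E_{ij}-E_{ji}$. Indices are 0-based and all function names are illustrative.

```python
def unit(i, j, n):
    """Matrix unit E_ij: a single 1 in row i, column j (0-based indices)."""
    return [[1 if (r, s) == (i, j) else 0 for s in range(n)] for r in range(n)]

def W(i, j, n):
    """Matrix representing W_ij(1) = d/da_ij - d/da_ji, i.e. E_ij - E_ji."""
    e_ij, e_ji = unit(i, j, n), unit(j, i, n)
    return [[e_ij[r][s] - e_ji[r][s] for s in range(n)] for r in range(n)]

def matmul(a, b):
    """Product of two square integer matrices."""
    n = len(a)
    return [[sum(a[r][k] * b[k][s] for k in range(n)) for s in range(n)]
            for r in range(n)]

def commutator(a, b):
    """Matrix commutator ab - ba."""
    ab, ba = matmul(a, b), matmul(b, a)
    n = len(a)
    return [[ab[r][s] - ba[r][s] for s in range(n)] for r in range(n)]

def bracket_formula(i, j, k, l, n):
    """Right-hand side d_jk W_il + d_il W_jk - d_ik W_jl - d_jl W_ik."""
    terms = [(int(j == k), W(i, l, n)), (int(i == l), W(j, k, n)),
             (-int(i == k), W(j, l, n)), (-int(j == l), W(i, k, n))]
    return [[sum(c * m[r][s] for c, m in terms) for s in range(n)]
            for r in range(n)]

def brackets_hold(n):
    """True if the stated relation holds for all index choices in so(n)."""
    idx = range(n)
    return all(commutator(W(i, j, n), W(k, l, n)) == bracket_formula(i, j, k, l, n)
               for i in idx for j in idx for k in idx for l in idx)
```

Because all entries are integers, the comparison is exact; `brackets_hold(3)` and `brackets_hold(4)` cover the cases relevant to $SO(3)$ and $SO(4)$.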


6.3.3. Distributions. Now we are ready to rewrite the kinematic conditions (I)–(III) as a distribution. Let $R\colon[0,\tau]\to Q\oplus P_{\iota,\hat\iota}$ be a rolling satisfying the conditions (I)–(III). Consider its image under the trivializations. Then
\[ \dot R=\dot x+\dot{\hat x}+\sum_{i,j=1}^{n}\dot a_{ij}\frac{\partial}{\partial a_{ij}}+\sum_{\kappa,\lambda=1}^{\nu}\dot b_{\kappa\lambda}\frac{\partial}{\partial b_{\kappa\lambda}}. \tag{6.7} \]
Condition (I) holds if and only if $\dot{\hat x}(t)=q(t)\dot x(t)$, $t\in[0,\tau]$. We want to write the last two terms in (6.7) in the right invariant basis of the corresponding tangent spaces of $SO(n)$ and $SO(\nu)$, based on conditions (II) and (III). Satisfying (II), we obtain
\[ \sum_{i,j=1}^{n}\dot a_{ij}\frac{\partial}{\partial a_{ij}}=\sum_{i<j}\Bigl(g_M\bigl(\nabla_{\dot x(t)}q^{-1}\hat e_j,\,q^{-1}\hat e_i\bigr)-g_{\widehat M}\bigl(\widehat\nabla_{q\dot x(t)}\hat e_j,\,\hat e_i\bigr)\Bigr)\mathrm{Ad}_{A^{-1}}W_{ij}(A). \]

[...]

since $\langle C,C\rangle=\sum_{i,j=1}^{n+1}|c_{ij}|^2$, the metric $\langle\cdot,\cdot\rangle$ coincides with the Euclidean metric in $\mathbb{R}^{(n+1)^2}$. From this we get that $\bigl\{\frac{\partial}{\partial c_{ij}}\bigr\}_{i,j=1}^{n+1}$ is an orthonormal basis for the tangent bundle $T\mathbb{R}^{(n+1)^2}$ with respect to $\langle\cdot,\cdot\rangle$.

We define the embedding of $SE(n)$ into $\mathbb{R}^{(n+1)^2}$ by
\[ \iota\colon SE(n)\to\mathbb{R}^{(n+1)^2},\qquad x=(C,r)\mapsto\overline{C}=\begin{pmatrix} C & r\\ 0 & 1\end{pmatrix}. \]
This mapping is in fact an isometry of $SE(n)$ onto its image. To see this, notice that the metrics coincide at the identity, and that the metric of $\mathbb{R}^{(n+1)^2}$, restricted to the image $\operatorname{Im}(\iota)$ of $\iota$, is left invariant under the action of $SE(n)$. Hence, the metrics on $SE(n)$ and $\operatorname{Im}(\iota)$ coincide, and $\iota$ defines an isometric embedding.

Extrinsic rolling of SE(3) over se(3). We will use the constructed embedding to build an extrinsic rolling of $SE(3)$ over $\mathfrak{se}(3)$ in $\mathbb{R}^{16}$ and we write $M$ for the

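image of $SE(3)$ in $\mathbb{R}^{16}$ under this embedding. Denote $\partial_{ij}=\frac{\partial}{\partial c_{ij}}$; then the vector fields spanning $TM$ are
\[
\begin{aligned}
e_1&=Y_1=\tfrac{1}{\sqrt2}\sum_{i=1}^{3}\bigl(c_{i1}\partial_{i2}-c_{i2}\partial_{i1}\bigr), &
e_2&=Y_2=\tfrac{1}{\sqrt2}\sum_{i=1}^{3}\bigl(c_{i1}\partial_{i3}-c_{i3}\partial_{i1}\bigr),\\
e_3&=Y_3=\tfrac{1}{\sqrt2}\sum_{i=1}^{3}\bigl(c_{i2}\partial_{i3}-c_{i3}\partial_{i2}\bigr), &
e_{3+k}&=X_k=\sum_{i=1}^{3}c_{ik}\partial_{i4},\quad k=1,2,3,
\end{aligned}
\tag{6.15}
\]
where we suppressed $d\iota$ in the notation. We introduce an orthonormal basis of $TM^{\perp}$,
\[
\begin{aligned}
\Upsilon_1&=\tfrac{1}{\sqrt2}\sum_{j=1}^{3}\bigl(c_{j1}\partial_{j2}+c_{j2}\partial_{j1}\bigr), &
\Upsilon_2&=\tfrac{1}{\sqrt2}\sum_{j=1}^{3}\bigl(c_{j1}\partial_{j3}+c_{j3}\partial_{j1}\bigr), &
\Upsilon_3&=\tfrac{1}{\sqrt2}\sum_{j=1}^{3}\bigl(c_{j2}\partial_{j3}+c_{j3}\partial_{j2}\bigr),\\
\Psi_\lambda&=\sum_{j=1}^{3}c_{j\lambda}\partial_{j\lambda},\quad \lambda=1,2,3, &
\Xi_\mu&=\partial_{4\mu},\quad \mu=1,2,3,4.
\end{aligned}
\tag{6.16}
\]
We denote by $\widehat M$ the image of $\mathbb{R}^6$ in $\mathbb{R}^{16}$ under the embedding
\[
\hat\iota\colon(\hat x_1,\hat x_2,\hat x_3,\hat x_4,\hat x_5,\hat x_6)\mapsto
\begin{pmatrix}
0 & \tfrac{1}{\sqrt2}\hat x_1 & \tfrac{1}{\sqrt2}\hat x_2 & \hat x_4\\
-\tfrac{1}{\sqrt2}\hat x_1 & 0 & \tfrac{1}{\sqrt2}\hat x_3 & \hat x_5\\
-\tfrac{1}{\sqrt2}\hat x_2 & -\tfrac{1}{\sqrt2}\hat x_3 & 0 & \hat x_6\\
0 & 0 & 0 & 0
\end{pmatrix}.
\]
We have the following orthonormal basis of $T\widehat M$,
\[ \hat e_1=\tfrac{1}{\sqrt2}(\partial_{12}-\partial_{21}),\quad \hat e_2=\tfrac{1}{\sqrt2}(\partial_{13}-\partial_{31}),\quad \hat e_3=\tfrac{1}{\sqrt2}(\partial_{23}-\partial_{32}),\quad \hat e_{3+k}=\partial_{k4},\quad k=1,2,3, \]
while the vector fields spanning $T\widehat M^{\perp}$ are
\[ \hat\epsilon_1=\tfrac{1}{\sqrt2}(\partial_{12}+\partial_{21}),\quad \hat\epsilon_2=\tfrac{1}{\sqrt2}(\partial_{13}+\partial_{31}),\quad \hat\epsilon_3=\tfrac{1}{\sqrt2}(\partial_{23}+\partial_{32}), \]
\[ \hat\epsilon_{3+\kappa}=\partial_{\kappa\kappa},\quad \kappa=1,2,3,\qquad \hat\epsilon_{6+\kappa}=\partial_{4\kappa},\quad \kappa=1,2,3,4. \]
In order to extend an intrinsic rolling $q$ with $\pi(q)=(x,\hat x)$, we find an orthonormal frame of normal parallel vector fields along curves $x$ and $\hat x$. Along $\hat x$, we may use the restriction of $\{\hat\epsilon_\kappa\}_{\kappa=1}^{10}$. For the curve $x$ the answer is more complicated.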

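We first study the value of $\nabla^{\perp}$ for different choices of vector fields.
(1) $\nabla^{\perp}_X\Xi_\mu=0$ for any tangential vector field $X$, and $\Xi_\mu$ as in equation (6.16).
(2) $\nabla^{\perp}_{X_k}\Upsilon=0$ for any normal vector field $\Upsilon$, and $X_k$ as in equation (6.15).
(3) Otherwise, the results are presented in the following table.
\[
\begin{array}{c|cccccc}
 & \Upsilon_1 & \Upsilon_2 & \Upsilon_3 & \Psi_1 & \Psi_2 & \Psi_3\\ \hline
\nabla^{\perp}_{Y_1} & \tfrac12(\Psi_1-\Psi_2) & -\tfrac{1}{2\sqrt2}\Upsilon_3 & \tfrac{1}{2\sqrt2}\Upsilon_2 & -\tfrac12\Upsilon_1 & \tfrac12\Upsilon_1 & 0\\[2pt]
\nabla^{\perp}_{Y_2} & -\tfrac{1}{2\sqrt2}\Upsilon_3 & \tfrac12(\Psi_1-\Psi_3) & \tfrac{1}{2\sqrt2}\Upsilon_1 & -\tfrac12\Upsilon_2 & 0 & \tfrac12\Upsilon_2\\[2pt]
\nabla^{\perp}_{Y_3} & -\tfrac{1}{2\sqrt2}\Upsilon_2 & \tfrac{1}{2\sqrt2}\Upsilon_1 & \tfrac12(\Psi_2-\Psi_3) & 0 & -\tfrac12\Upsilon_3 & \tfrac12\Upsilon_3
\end{array}
\]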

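We use the relations above to construct an extrinsic rolling by making use of the curve (6.13). Since $\dot x(t)=\sqrt2\,\dot\theta(t)Y_1(x(t))+\dot\psi(t)X_3(x(t))$, the vector field
\[ \Psi(t)=\sum_{\lambda=1}^{3}\bigl(\upsilon_\lambda(t)\Upsilon_\lambda(x(t))+\upsilon_{3+\lambda}(t)\Psi_\lambda(x(t))\bigr) \]
is normal parallel along $x(t)$ if
\[
\Bigl(\dot\upsilon_1-\tfrac{\dot\theta}{\sqrt2}(\upsilon_4-\upsilon_5)\Bigr)\Upsilon_1
+\Bigl(\dot\upsilon_2+\tfrac{\dot\theta}{2}\upsilon_3\Bigr)\Upsilon_2
+\Bigl(\dot\upsilon_3-\tfrac{\dot\theta}{2}\upsilon_2\Bigr)\Upsilon_3
+\Bigl(\dot\upsilon_4+\tfrac{\dot\theta}{\sqrt2}\upsilon_1\Bigr)\Psi_1
+\Bigl(\dot\upsilon_5-\tfrac{\dot\theta}{\sqrt2}\upsilon_1\Bigr)\Psi_2
+\dot\upsilon_6\Psi_3=0.
\]
Hence we define a parallel orthonormal frame along $x(t)$ by
\[
\begin{aligned}
\epsilon_1(t)&=\cos\theta\,\Upsilon_1(x(t))-\tfrac{1}{\sqrt2}\sin\theta\,\Psi_1(x(t))+\tfrac{1}{\sqrt2}\sin\theta\,\Psi_2(x(t)),\\
\epsilon_2(t)&=\cos\tfrac{\theta}{2}\,\Upsilon_2(x(t))+\sin\tfrac{\theta}{2}\,\Upsilon_3(x(t)),\\
\epsilon_3(t)&=-\sin\tfrac{\theta}{2}\,\Upsilon_2(x(t))+\cos\tfrac{\theta}{2}\,\Upsilon_3(x(t)),\\
\epsilon_4(t)&=\tfrac{1}{\sqrt2}\sin\theta\,\Upsilon_1(x(t))+\tfrac{\cos\theta+1}{2}\Psi_1(x(t))+\tfrac{1-\cos\theta}{2}\Psi_2(x(t)),\\
\epsilon_5(t)&=-\tfrac{1}{\sqrt2}\sin\theta\,\Upsilon_1(x(t))+\tfrac{1-\cos\theta}{2}\Psi_1(x(t))+\tfrac{1+\cos\theta}{2}\Psi_2(x(t)),\\
\epsilon_6(t)&=\Psi_3(x(t))\quad\text{and}\quad \epsilon_{6+\lambda}(t)=\Xi_\lambda(x(t)),\quad \lambda=1,2,3,4.
\end{aligned}
\]
Thus $p(t)$ is represented by a constant matrix in the bases $\{\epsilon_\lambda(t)\}_{\lambda=1}^{10}$ and $\{\hat\epsilon_\kappa(t)\}_{\kappa=1}^{10}$. Let us choose $p(t)$ to be the identity in these bases, due to the given embedding.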
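The claim that the frame above is normal parallel and orthonormal can be checked numerically. The sketch below is an illustration under two stated assumptions: the fields $\Upsilon_1,\Upsilon_2,\Upsilon_3,\Psi_1,\Psi_2,\Psi_3$ are orthonormal (as asserted for (6.16)), and $\theta$ is used as the curve parameter, which is harmless because the transport equations are linear in $\dot\theta$. Each frame vector is stored by its six coefficients $(\upsilon_1,\dots,\upsilon_6)$; the function names are hypothetical.

```python
import math

SQRT2 = math.sqrt(2.0)

def frame(theta):
    """Coefficients of eps_1..eps_6 in the basis (Upsilon_1..3, Psi_1..3)."""
    c, s = math.cos(theta), math.sin(theta)
    ch, sh = math.cos(theta / 2), math.sin(theta / 2)
    return [
        [c, 0, 0, -s / SQRT2, s / SQRT2, 0],
        [0, ch, sh, 0, 0, 0],
        [0, -sh, ch, 0, 0, 0],
        [s / SQRT2, 0, 0, (c + 1) / 2, (1 - c) / 2, 0],
        [-s / SQRT2, 0, 0, (1 - c) / 2, (1 + c) / 2, 0],
        [0, 0, 0, 0, 0, 1],
    ]

def transport_rhs(u):
    """d(upsilon)/d(theta) demanded by the normal parallel condition."""
    u1, u2, u3, u4, u5, u6 = u
    return [(u4 - u5) / SQRT2, -u3 / 2, u2 / 2, -u1 / SQRT2, u1 / SQRT2, 0.0]

def max_residual(theta, h=1e-5):
    """Largest gap between a central-difference derivative and transport_rhs."""
    worst = 0.0
    for plus, minus, here in zip(frame(theta + h), frame(theta - h), frame(theta)):
        for p, m, r in zip(plus, minus, transport_rhs(here)):
            worst = max(worst, abs((p - m) / (2 * h) - r))
    return worst

def gram(theta):
    """Gram matrix of the six coefficient vectors (identity if orthonormal)."""
    f = frame(theta)
    return [[sum(a * b for a, b in zip(u, v)) for v in f] for u in f]
```

A residual at the level of the finite-difference error and a Gram matrix equal to the identity up to rounding are what one expects if the frame is transcribed correctly.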

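The curve $R(t)=(q(t),p(t))$ in $SE(16)$ is given by $R(t)x=A(t)x+r(t)$, where $A(t)$ equals
\[
\begin{pmatrix}
\begin{matrix}
\cos^2\frac{\theta}{2} & -\frac{\sin\theta}{2} & 0 & 0 & \frac{\sin\theta}{2} & \frac{\cos\theta-1}{2} & 0 & 0 & 0 & 0\\
\frac{\sin\theta}{2} & \cos^2\frac{\theta}{2} & 0 & 0 & \sin^2\frac{\theta}{2} & \frac{\sin\theta}{2} & 0 & 0 & 0 & 0\\
0 & 0 & \cos\frac{\theta}{2} & 0 & 0 & 0 & \sin\frac{\theta}{2} & 0 & 0 & 0\\
0 & 0 & 0 & 1 & 0 & 0 & 0 & 0 & 0 & 0\\
-\frac{\sin\theta}{2} & \sin^2\frac{\theta}{2} & 0 & 0 & \cos^2\frac{\theta}{2} & -\frac{\sin\theta}{2} & 0 & 0 & 0 & 0\\
\frac{\cos\theta-1}{2} & -\frac{\sin\theta}{2} & 0 & 0 & \frac{\sin\theta}{2} & \cos^2\frac{\theta}{2} & 0 & 0 & 0 & 0\\
0 & 0 & -\sin\frac{\theta}{2} & 0 & 0 & 0 & \cos\frac{\theta}{2} & 0 & 0 & 0\\
0 & 0 & 0 & 0 & 0 & 0 & 0 & 1 & 0 & 0\\
0 & 0 & 0 & 0 & 0 & 0 & 0 & 0 & \cos\frac{\theta}{2} & -\sin\frac{\theta}{2}\\
0 & 0 & 0 & 0 & 0 & 0 & 0 & 0 & \sin\frac{\theta}{2} & \cos\frac{\theta}{2}
\end{matrix} & 0_{10\times 6}\\
0_{6\times 10} & 1_{6}
\end{pmatrix},
\]
and
\[ r(t)=\Bigl(-1,\tfrac{\theta}{\sqrt2},0,0,\tfrac{\theta}{\sqrt2},-1,0,0,0,0,-1,0,0,0,0,0\Bigr)^{t}. \]
Here, $0_{m\times n}$ denotes the zero matrix of size $m\times n$ and $1_6$ is the identity matrix of size $6\times 6$.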

7. Group of diffeomorphisms of the circle

In this section we consider one of the simplest examples of an infinite-dimensional manifold possessing rather good structure and having various applications. We start from the definitions, then we give motivations, and we finish with some results concerning the sub-Riemannian structure of this example. The material of this section can be found in [81, 93, 99, 108, 116].

7.1. Manifold and group structure of Diff S^1

Let $S^1$ be the unit circle with the standard counterclockwise orientation, which we will think of as $\mathbb{R} \bmod 2\pi\mathbb{Z}$. Denote by $\mathrm{Diff}\,S^1$ the set of all $C^\infty$-diffeomorphisms of $S^1$ preserving the orientation. This is an open subset of the infinite-dimensional space $C^\infty(S^1,S^1)$ of all $C^\infty$-smooth maps of the circle, see Exercise 1. The topology defined on the space $C^\infty(S^1,S^1)$ is the $C^\infty$ topology, which corresponds to the uniform convergence of all derivatives of $h\in C^\infty(S^1,S^1)$, and it can be described by the countable system of semi-norms
\[ \|h\|_m=\sup_{\theta\in S^1}|h^{(m)}(\theta)|. \]
The set $\mathrm{Diff}\,S^1$, considered as an open set of $C^\infty(S^1,S^1)$, inherits the $C^\infty$-topology. Define the group operation in $\mathrm{Diff}\,S^1$ as the composition of two diffeomorphisms and the inverse element of the group as the inverse map. The identity map $\mathrm{id}\colon S^1\to S^1$ becomes the unit of the group. We would like to consider the defined group as an infinite-dimensional analogue of a Lie group. To do this we


need to define a manifold structure on the topological space $\mathrm{Diff}\,S^1$. We start from the choice of a model vector space $\mathcal{V}$, in which the coordinate charts $(U,\varphi)$, with $U$ being an open set in $\mathrm{Diff}\,S^1$ and $\varphi\colon U\to V\subset\mathcal{V}$ being a diffeomorphism, will be constructed. The space $\mathrm{Vect}(S^1)$ of all $C^\infty$ real vector fields on the circle is a good candidate for the model space $\mathcal{V}$, since it possesses the Fréchet topology and it is one of the examples of a locally convex complete topological vector space. The vector fields $S^1\ni\theta\mapsto v(\theta)\frac{d}{d\theta}=v(\theta)\partial_\theta\in\mathrm{Vect}(S^1)$ can be associated with the space $C^\infty(S^1,\mathbb{R})$ of real functions $v$.

A sketch of the construction of the chart is given below. Let us consider the following neighborhood $V_0$ of $0\in\mathrm{Vect}(S^1)$:
\[ V_0=\{v\in\mathrm{Vect}(S^1)\mid |v|<\pi\}, \]
where $|v|$ is the absolute value in $\mathbb{R}$. Then there is a homeomorphic map $\psi\colon V_0\to U_0\subset C^\infty(S^1,S^1)$ to a neighborhood
\[ U_0=\{f\in C^\infty(S^1,S^1)\mid f(\theta)\ne-\theta \text{ for all } \theta\in S^1\} \]
of the identity map $\mathrm{id}\in C^\infty(S^1,S^1)$. For the construction of $\psi$, see Exercise 2. Choose an open set $U\subset U_0$ consisting of diffeomorphisms. Then $U$ is a neighborhood of $\mathrm{id}\in\mathrm{Diff}\,S^1$, and the set $V=\psi^{-1}(U)$ is an open subset of $V_0\subset\mathrm{Vect}(S^1)$. Thus, we have constructed the chart $(U,\varphi)$, $\varphi=\psi^{-1}|_U$, in a neighborhood of the identity map $\mathrm{id}\in\mathrm{Diff}\,S^1$.

To construct a complete atlas we exploit the group structure of $\mathrm{Diff}\,S^1$. If $U$ is a neighborhood of $\mathrm{id}\in\mathrm{Diff}\,S^1$, then $f.U=\{\phi\in\mathrm{Diff}\,S^1\mid\phi=f\circ h,\ h\in U\}$ is a neighborhood of $f$. Having the map $\psi$, we define $\psi_f\colon V\to f.U$ for any $f\in\mathrm{Diff}\,S^1$ as the composition $f\circ\psi$. Then the chart $(U_f,\varphi_f)$, where $U_f=f.U$ and $\varphi_f=\psi_f^{-1}|_{f.U}$, is the corresponding local chart about any group element $f\in\mathrm{Diff}\,S^1$. The reader has to verify that any composition $\varphi_h\circ\varphi_f^{-1}$ is a smooth map in $\mathrm{Vect}(S^1)$.

As we know, the space $\mathrm{Vect}(S^1)$, endowed with the usual brackets for vector fields, forms a Lie algebra, which we shall denote by the same symbol $\mathrm{Vect}(S^1)$. On the other hand, the space $T_{\mathrm{id}}\mathrm{Diff}\,S^1$ furnished with brackets for tangent vectors is the Lie algebra $\mathfrak{diff}$ of left invariant vector fields. As we discussed in Subsection 2.1, a tangent vector is the equivalence class of all curves $f\colon\mathbb{R}\to\mathrm{Diff}\,S^1$ such that $f(0)=\mathrm{id}$ and that have the same initial velocity. Such curves can be seen as smooth functions $f\colon\mathbb{R}\times S^1\to S^1$, $f(0,\theta)=\theta$, where $f(t_0,\cdot)\in\mathrm{Diff}\,S^1$ for any fixed value $t_0\in\mathbb{R}$. The velocity vector at $t=0$, corresponding to the equivalence class $[f]$, is a vector field
\[ v(\theta)=\frac{\partial}{\partial t}\Big|_{t=0}f(t,\theta),\qquad f\in[f], \]
defined on $S^1$. By this we identify $\mathrm{Vect}(S^1)$ and $\mathfrak{diff}$ with the Lie product $[X,Y]=YX-XY$.

Now we explain why the sign in the latter commutation relation is opposite. The action of the group $\mathrm{Diff}\,S^1$ on the circle $S^1$, considered as a smooth compact oriented manifold, is defined as the natural left action
\[ \mu\colon\mathrm{Diff}\,S^1\times S^1\to S^1,\qquad f.\theta\mapsto f(\theta). \]


The differential $d\mu$ acts at the level of tangent spaces:
\[ d\mu\colon T(\mathrm{Diff}\,S^1\times S^1)\cong T\,\mathrm{Diff}\,S^1\times TS^1\to TS^1. \]
In order to make the following diagram commutative,
\[
\begin{array}{ccc}
T(\mathrm{Diff}\,S^1\times S^1)\ni w(f,\theta) & \xrightarrow{\ d\mu\ } & v(f(\theta))\in TS^1\\
{\scriptstyle w}\Big\uparrow & & \Big\uparrow{\scriptstyle v}\\
\mathrm{Diff}\,S^1\times S^1\ni(f,\theta) & \xrightarrow{\ \mu\ } & f(\theta)\in S^1
\end{array}
\]
or $d\mu(w(f,\theta))=v(\mu(f,\theta))=v(f(\theta))$, we have to extend the elements of $\mathfrak{diff}$ as right invariant vector fields on $\mathrm{Diff}\,S^1$, or $w(f,\cdot)=\frac{d}{d\varepsilon}\big|_{\varepsilon=0}(\exp\varepsilon w_0)f(\cdot)$, where $w_0\in\mathfrak{diff}=T_{\mathrm{id}}\mathrm{Diff}\,S^1\cong\mathrm{Vect}(S^1)$ and the exponential map is defined in (7.1). In this way we associate the elements of $\mathfrak{diff}$ with right invariant vector fields on $\mathrm{Diff}\,S^1$. The bracket in $\mathfrak{diff}$ is defined as in $\mathrm{Vect}(S^1)$, but with the opposite sign, see the discussion in Subsection 3.1. Thus
\[ \mathfrak{diff}=\bigl(\mathrm{Vect}(S^1),[\,\cdot\,,\cdot\,]\bigr)\quad\text{with}\quad [u\partial_\theta,v\partial_\theta]=(-uv'+u'v)\partial_\theta. \]
The group exponential map is defined in a standard way through one-parameter subgroups in $\mathrm{Diff}\,S^1$. The one-parameter subgroup $\mathbb{R}\ni t\mapsto f(t,\theta)\in\mathrm{Diff}\,S^1$ of diffeomorphisms, for any fixed value $\theta_0\in S^1$, is the solution of the Cauchy problem
\[ \frac{d\theta}{dt}=v(\theta)\in T_\theta S^1,\qquad \theta(0)=\theta_0,\qquad\text{where } v(\theta)=\frac{\partial}{\partial t}\Big|_{t=0}f(t,\theta). \]
Thus, the solution $\theta(t)=\exp_{\theta_0}(tv)$ is the exponential curve that carries each line $t\mapsto tv$ through the origin in $\mathrm{Vect}(S^1)$ to the one-parameter subgroup in $\mathrm{Diff}\,S^1$, and
\[ \exp\colon\mathrm{Vect}(S^1)\to\mathrm{Diff}\,S^1,\qquad v\mapsto\exp(v) \tag{7.1} \]
is the $\mathrm{Diff}\,S^1$-group exponential map.

The interesting feature is that, in contrast to finite-dimensional groups, there is no neighborhood of $\mathrm{id}\in\mathrm{Diff}\,S^1$ where the exponential map would be a diffeomorphism. The map is not locally surjective. No matter how small the neighborhood $U$ of the identity map $\mathrm{id}\in\mathrm{Diff}\,S^1$, there is $f\in U$ such that $f$ does not belong to any one-parameter subgroup, that is, $f\notin\operatorname{Im}(\exp)$, see Exercise 3.

Moreover, the map $\exp$ is not injective. Let $f_n$ be the rotation by the angle $\frac{2\pi}{n}$, $f_n\colon\theta\mapsto\theta+\frac{2\pi}{n}$. The map $f_n$ belongs to the closed subgroup $S^1\subset\mathrm{Diff}\,S^1$ of rotations in $\mathrm{Diff}\,S^1$. Any one-parameter element of $S^1$ is generated by a constant vector field $v$ in (7.1). Let
\[ H=\Bigl\{\phi\in\mathrm{Diff}\,S^1\ \Big|\ \phi\bigl(\theta+\tfrac{2\pi}{n}\bigr)=\phi(\theta)+\tfrac{2\pi}{n}\Bigr\} \]
be the subgroup of $\mathrm{Diff}\,S^1$ of all periodic diffeomorphisms with period $\frac{2\pi}{n}$. It is easy to see that $f_n$ commutes with $H$. Then since
\[ S^1\ni f_n=\phi f_n\phi^{-1}\in\phi S^1\phi^{-1},\qquad \phi\in H, \]

we conclude that $f_n$ belongs to all one-parameter subgroups from $\phi S^1\phi^{-1}$, $\phi\in H$.

Finally, we state a property of the group $\mathrm{Diff}\,S^1$ and its Lie algebra $\mathrm{Vect}(S^1)$ whose proof can be found in [70].

Proposition 12. The group $\mathrm{Diff}\,S^1$ is a simple group.

For practical purposes it is convenient to introduce a basis of $\mathrm{Vect}(S^1)$. Since we identify the vector fields $v(\theta)\partial_\theta$ with smooth real functions $v(\theta)$ on $S^1$, and the latter can be expanded into Fourier series, the Fourier basis $\cos(n\theta)$, $\sin(n\theta)$, $n=0,1,\dots$, is a natural choice of a basis for $\mathrm{Vect}(S^1)$. We observe also that all $f\in\mathrm{Diff}\,S^1$ are periodic functions with the period $2\pi$ in the following sense: $f(\theta+2\pi)=f(\theta)+2\pi$.

Exercises

1. Define
\[ \eta(f)=\inf_{\theta,\vartheta\in S^1,\ \theta\ne\vartheta}\frac{\|f(\theta)-f(\vartheta)\|}{|\theta-\vartheta|},\qquad\text{where } f\colon S^1\to S^1,\ f\in C^\infty(S^1). \]
Here $\|\cdot\|$ means the Euclidean distance in $\mathbb{R}^2$, and $S^1$ is considered as an embedded manifold in $\mathbb{R}^2$. Verify that $\eta$ is a continuous function of $f$. Conclude that, since $\eta(f)>0$ if and only if $f$ is a diffeomorphism, the set $\mathrm{Diff}\,S^1$ is an open set in $C^\infty(S^1,S^1)$.

2. Let $V_0=\{v\in\mathrm{Vect}(S^1)\mid |v|<\pi\}$, with $|\cdot|$ being the absolute value in $\mathbb{R}$, and $U_0=\{f\in C^\infty(S^1,S^1)\mid f(\theta)\ne-\theta,\ \forall\,\theta\in S^1\}$ be neighborhoods of $0\in\mathrm{Vect}(S^1)$ and the identity map $\mathrm{id}\in C^\infty(S^1,S^1)$, respectively. We construct a map $\psi\colon V_0\to U_0$ as follows. Define a map $\psi_v\colon S^1\to S^1$ for $v\in V_0$ as the map that sends a point $\theta\in S^1$ to the end point of the arc of length $|v|$ which starts at $\theta$ with the initial velocity $v(\theta)$. Show that $\psi\colon V_0\ni v\mapsto\psi_v\in U_0$ is a homeomorphism.

3. Let $n\in\mathbb{N}$ be big enough, and $\varepsilon\in(0,\frac{1}{n})$. We think of $S^1=\{\theta\in\mathbb{R}\bmod 2\pi\mathbb{Z}\}$. Consider the diffeomorphism
\[ f(\theta)=\theta+\frac{\pi}{n}+\varepsilon\sin^2(n\theta). \]
Show that by choosing $n$ and $\varepsilon$ we can make $f$ as close in the $C^\infty$-topology to the identity map as we want. Show that $f$ has only one periodic orbit of period $2n$. (Hint: start from $\theta=0$.) Show that starting from any other value $\theta\in(0,\pi/n)$ we get a non-periodic orbit. Conclude that $f$ cannot belong to

any one-parameter subgroup of $\mathrm{Diff}\,S^1$. Indeed, if it belonged to $\exp(v)$ for some $v\in\mathrm{Vect}(S^1)$, then it would be a rotation $f(\theta)=\theta+\frac{\pi}{n}$, since after $2n$ repetitions we get $f^{2n}(0)=2\pi=f(0+2\pi)$. But we know that this is not true, since all other points of $S^1$ do not belong to the same orbit of $f$. See also [81, 108, 116].

4. Calculate the commutator in $\mathrm{Vect}(S^1)$ for its basis $\cos(n\theta)$, $\sin(n\theta)$, $n=0,1,\dots$.

7.1.1. Central extensions of Diff S^1 and Vect(S^1). To introduce the central extensions of the Lie–Fréchet group $\mathrm{Diff}\,S^1$ and its Lie algebra $\mathrm{Vect}(S^1)$ we start from the linear object, i.e., from the Lie algebra.

Definition 31. The central extension $\hat{\mathfrak{g}}$ of a Lie algebra $\mathfrak{g}$ by the Lie algebra $\mathbb{R}$ of real numbers is the set $\mathfrak{g}\times\mathbb{R}$ with new Lie brackets $[(\xi,a),(\eta,b)]_{\hat{\mathfrak{g}}}$, $\xi,\eta\in\mathfrak{g}$, $a,b\in\mathbb{R}$, satisfying the axioms of Definition 51. In this case $\mathbb{R}$ becomes the center of the extended Lie algebra.

The simplest trivial example of a central extension is the direct product $\mathfrak{g}\times\mathbb{R}$ with the Lie brackets defined by
\[ [(\xi,a),(\eta,b)]_{\hat{\mathfrak{g}}}:=([\xi,\eta]_{\mathfrak{g}},\,ab-ba)=([\xi,\eta]_{\mathfrak{g}},\,0). \]
We are interested in a non-trivial extension. In order to get it we need to find an invariant skew symmetric bilinear form $\omega\colon\mathfrak{g}\times\mathfrak{g}\to\mathbb{R}$ such that the new Lie bracket $[(\xi,a),(\eta,b)]_{\hat{\mathfrak{g}}}:=([\xi,\eta]_{\mathfrak{g}},\omega(\xi,\eta))$ in the extended Lie algebra $\hat{\mathfrak{g}}$ satisfies the axioms of Definition 51. It leads to the condition on $\omega$, which is called the cocycle condition:
\[ \omega([\xi,\eta],\zeta)+\omega([\eta,\zeta],\xi)+\omega([\zeta,\xi],\eta)=0. \tag{7.2} \]
The form $\omega$ is called a 2-cocycle, and the terminology comes from cohomology theory. It was shown in [54] that there is an essentially unique non-trivial 2-cocycle for the Lie algebra $\mathrm{Vect}(S^1)$, which is called the Gelfand–Fuchs Lie algebra cocycle. It is given by the following form $\omega$:
\[
\omega\bigl(u(\theta)\partial_\theta,v(\theta)\partial_\theta\bigr)
=\frac{1}{2\pi}\int_{S^1}u'(\theta)v''(\theta)\,d\theta
=\frac{1}{2\pi}\int_{S^1}u'(\theta)\,dv'
=\frac{1}{4\pi}\int_{S^1}\det\begin{pmatrix}u'(\theta) & v'(\theta)\\ u''(\theta) & v''(\theta)\end{pmatrix}d\theta. \tag{7.3}
\]
The central extension of the Lie algebra $\mathrm{Vect}(S^1)$ by $\mathbb{R}$ is called the Virasoro algebra, is denoted by $\mathfrak{vir}$, and is unique up to an isomorphism. The algebra is named after the Argentinian physicist Miguel Ángel Virasoro, who introduced the idea of a central extension for $\mathrm{Vect}(S^1)$ in his physical calculations. In physics, actually, a more general 2-cocycle is used. Let us explain the mathematical background of this more general 2-cocycle.

Definition 32. A 2-cocycle $\omega\colon\mathfrak{g}\times\mathfrak{g}\to\mathbb{R}$ is called a 2-coboundary if there exists a linear map $\eta\colon\mathfrak{g}\to\mathbb{R}$ such that $\omega(\xi,\upsilon)=\eta([\xi,\upsilon])$ for all $\xi,\upsilon\in\mathfrak{g}$.
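The cocycle condition (7.2) for the Gelfand–Fuchs form (7.3) can be tested on trigonometric polynomials. The sketch below is an illustration, not part of the source: a real function $u=\sum_k u_k e^{ik\theta}$ is stored as a dictionary of Fourier coefficients (an encoding chosen here for convenience), so that differentiation multiplies $u_k$ by $ik$ and $\frac{1}{2\pi}\int_{S^1}$ picks out the zeroth coefficient. The bracket uses the sign convention $[u\partial_\theta,v\partial_\theta]=(u'v-uv')\partial_\theta$ of $\mathfrak{diff}$ stated earlier; all function names are hypothetical.

```python
import random

def random_field(max_mode, rng):
    """Fourier coefficients {k: u_k} of a random real trigonometric polynomial."""
    u = {0: complex(rng.uniform(-1, 1))}
    for k in range(1, max_mode + 1):
        z = complex(rng.uniform(-1, 1), rng.uniform(-1, 1))
        u[k], u[-k] = z, z.conjugate()  # reality condition u_{-k} = conj(u_k)
    return u

def bracket(u, v):
    """Coefficients of u'v - uv' (bracket convention of diff used in the text)."""
    w = {}
    for k, uk in u.items():
        for l, vl in v.items():
            w[k + l] = w.get(k + l, 0j) + 1j * (k - l) * uk * vl
    return w

def omega(u, v):
    """(1/2pi) * integral of u'v'' over the circle, as a zeroth Fourier coefficient."""
    return sum((1j * k) * (-(k ** 2)) * uk * v.get(-k, 0j) for k, uk in u.items())

def cocycle_defect(u, v, w):
    """Left-hand side of the cocycle condition (7.2); should vanish."""
    return (omega(bracket(u, v), w) + omega(bracket(v, w), u)
            + omega(bracket(w, u), v))

# Sanity example: u = cos(theta), v = sin(theta), for which u'v'' = sin^2(theta).
COS = {1: 0.5 + 0j, -1: 0.5 + 0j}
SIN = {1: -0.5j, -1: 0.5j}
```

With these conventions `omega(COS, SIN)` evaluates the mean of $\sin^2\theta$, and `cocycle_defect` vanishes up to rounding for random real fields, in line with (7.2).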


Let us assume that the central extension $\hat{\mathfrak{g}}$ of a Lie algebra $\mathfrak{g}$ is given by a 2-coboundary $\eta$, i.e., $(\xi,a)\in\hat{\mathfrak{g}}$ and $[(\xi,a),(\upsilon,b)]=([\xi,\upsilon],\eta([\xi,\upsilon]))$. Then the change of variables $(\xi,a)\mapsto(\xi,a-\eta(\xi))$ leads to $([\xi,\upsilon],\eta([\xi,\upsilon]))\mapsto([\xi,\upsilon],0)$. We again obtain the trivial extension. By this observation, in describing different central extensions one is interested only in the 2-cocycles modulo 2-coboundaries. Meanwhile, in physical applications the following general form of the 2-cocycle $\omega_{h,c}$ for some positive constants $h,c$ is used:
\[ \omega_{h,c}\bigl(u(\theta)\partial_\theta,v(\theta)\partial_\theta\bigr)=\frac{1}{2\pi}\int_{S^1}\Bigl(\bigl(h-\tfrac{c}{12}\bigr)v'(\theta)-\tfrac{c}{12}v'''(\theta)\Bigr)u(\theta)\,d\theta. \tag{7.4} \]
The constant $c$ received the name central charge in physics, and its value depends on the underlying physical theory. The cocycle $\omega$ from (7.3) is obtained, up to the normalization factor $\frac{c}{12}$, by setting $h=\frac{c}{12}$, and is often called the classical Gelfand–Fuchs 2-cocycle.

The following question arises: is there a group whose Lie algebra is $\mathfrak{vir}$? The answer is positive. The corresponding group is called the real Virasoro–Bott group and is denoted by $\mathrm{Vir}$. The Virasoro–Bott group as a set is the direct product of $\mathrm{Diff}\,S^1$ and $\mathbb{R}$: $\mathrm{Vir}=\mathrm{Diff}\,S^1\times\mathbb{R}$. In this case the group multiplication law in $\mathrm{Vir}$ can be defined as follows:
\[ (f,a)(h,b)=(f\circ h,\ a+b+\lambda(f,h)),\qquad f,h\in\mathrm{Diff}\,S^1,\quad a,b\in\mathbb{R}, \tag{7.5} \]
where $\lambda\colon\mathrm{Diff}\,S^1\times\mathrm{Diff}\,S^1\to\mathbb{R}$ is a smooth function that makes the multiplication law (7.5) associative. The associativity of (7.5) corresponds to the group cocycle identity:
\[ \lambda(f\circ h,g)+\lambda(f,h)=\lambda(f,h\circ g)+\lambda(h,g). \tag{7.6} \]
The infinitesimal version of the Lie group 2-cocycle is the Lie algebra 2-cocycle. The following proposition gives the group 2-cocycle for the classical Gelfand–Fuchs 2-cocycle and was obtained for the first time in [16].

Proposition 13. The map
\[ B\colon\mathrm{Diff}\,S^1\times\mathrm{Diff}\,S^1\to\mathbb{R},\qquad (f,h)\mapsto\frac{1}{4\pi}\int_{S^1}\log(f\circ h)'\,d\log h' \]
is a continuous 2-cocycle on the group $\mathrm{Diff}\,S^1$. Here $f,h\in\mathrm{Diff}\,S^1$ and $f'$, $h'$ are their derivatives with respect to $\theta\in S^1$.

Proof. For the proof we need to verify the group 2-cocycle condition (7.6) and then to check that the infinitesimal version coincides with the classical Gelfand–Fuchs 2-cocycle $\omega$. The details can be found in [81].

As in the case of the central extension of a Lie algebra, any central extension of a Lie group is defined up to a 2-coboundary.

Definition 33. A smooth 2-cocycle $\lambda\colon G\times G\to\mathbb{R}$ is called a 2-coboundary (on $G$) if there exists a smooth map $F\colon G\to\mathbb{R}$ such that $\lambda(f,h)=F(f)+F(h)-F(f\circ h)$.


Two group 2-cocycles define isomorphic extensions if they differ by a 2-coboundary. Let us construct a group 2-coboundary for the group $\mathrm{Diff}\,S^1$. First, we write the general Lie algebra cocycle $\omega_{h,c}$ in the following form:
\[
\omega_{h,c}\bigl(u(\theta)\partial_\theta,v(\theta)\partial_\theta\bigr)
=\frac{1}{2\pi}\int_{S^1}\Bigl(\bigl(h-\tfrac{c}{12}\bigr)v'(\theta)-\tfrac{c}{12}v'''(\theta)\Bigr)u\,d\theta
=\frac{1}{2\pi}\int_0^{2\pi}\bigl(\alpha uv'+\beta u'v''\bigr)\,d\theta=:\alpha a(u,v)+\beta b(u,v),
\]
where $\alpha=h-\frac{c}{12}$, $\beta=\frac{c}{12}$, and we used integration by parts of periodic functions $u$ and $v$ on $S^1$. Let us verify that $a(u,v)=\frac{1}{2\pi}\int_0^{2\pi}uv'\,d\theta$ is a Lie algebra 2-coboundary. Introduce the functional
\[ \eta(u\partial_\theta)=\frac{1}{2\pi}\int_0^{2\pi}u(\theta)\,d\theta, \]
expressing the mean value of $u$ on $S^1$. Observe that
\[ \eta([u,v])=\frac{1}{2\pi}\int_0^{2\pi}\bigl(u'(\theta)v(\theta)-v'(\theta)u(\theta)\bigr)\,d\theta=-2\,\frac{1}{2\pi}\int_0^{2\pi}u(\theta)v'(\theta)\,d\theta=-2a(u,v) \]
by integration by parts. Thus $a$ is a Lie algebra 2-coboundary. The second part $b(u,v)=\frac{1}{2\pi}\int_0^{2\pi}u'v''\,d\theta$ is the Gelfand–Fuchs algebra 2-cocycle. The multiplication law (7.5) takes the form
\[ (f,a)(h,b)=(f\circ h,\ a+b+\alpha A(f,h)+\beta B(f,h)),\qquad f,h\in\mathrm{Diff}\,S^1,\quad a,b\in\mathbb{R}. \]
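The integration by parts used above to pass from (7.4) to the form $\alpha a+\beta b$ can be spelled out as follows; the only ingredient is that the boundary term vanishes because $u$ and $v$ are $2\pi$-periodic.

```latex
\int_0^{2\pi} v'''(\theta)\,u(\theta)\,d\theta
  = \bigl[v''u\bigr]_0^{2\pi} - \int_0^{2\pi} v''(\theta)\,u'(\theta)\,d\theta
  = -\int_0^{2\pi} u'(\theta)\,v''(\theta)\,d\theta ,
\qquad\text{so}\qquad
\omega_{h,c}(u\partial_\theta, v\partial_\theta)
  = \frac{1}{2\pi}\int_0^{2\pi}\Bigl(\bigl(h-\tfrac{c}{12}\bigr)\,u\,v'
      + \tfrac{c}{12}\,u'\,v''\Bigr)\,d\theta .
```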

Here $B(f,h)$ is the Bott 2-cocycle given by Proposition 13 and $A(f,h)$ is the group 2-coboundary satisfying Definition 33. To find the smooth function $F$ from Definition 33 we verify the following two properties.
1. Since the identity element of $\mathrm{Vir}$ has the form $(\mathrm{id},0)$, we get $A(f,\mathrm{id})=0\implies F(\mathrm{id})=0$, $\mathrm{id}\in\mathrm{Diff}\,S^1$, by the multiplication law (7.5).
2. In order to obtain the inverse element to $(f,a)\in\mathrm{Vir}$ in the form $(f^{-1},-a)$ we require $A(f,f^{-1})=0\implies F(f)+F(f^{-1})=0$.
The function $F(f)=\int_0^{2\pi}\bigl(f(\theta)-\theta\bigr)\,d\theta$ obviously satisfies the first property. Let us show that it also satisfies the second property.

Step 1. First we assume that $f(0)=0$. Then $f(2\pi)=2\pi$ and $f$ is strictly increasing on $[0,2\pi]$. Thus
\[
F(f)+F(f^{-1})=\int_0^{2\pi}\bigl(f(\theta)+f^{-1}(\theta)-2\theta\bigr)\,d\theta
=\int_0^{2\pi}\!\!\int_{f^{-1}(y)}^{2\pi}d\theta\,dy+\int_0^{2\pi}\!\!\int_0^{f^{-1}(y)}d\theta\,dy-4\pi^2=0.
\]


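Step 2. Now we assume that $f$ is an arbitrary element of $\mathrm{Diff}\,S^1$. Define $\hat f(\theta)=f(\theta+f^{-1}(0))$. Then $\hat f$ satisfies three properties:
a) $\hat f(0)=0$,
b) $\hat f^{-1}(\theta)=f^{-1}(\theta)-f^{-1}(0)$,
c) $\int_0^{2\pi}\hat f(\theta+s)\,d\theta=\int_0^{2\pi}\big(\hat f(\theta)+s\big)\,d\theta$ for any $s\in\mathbb{R}$.

Indeed, the properties a), b) are obvious and to show c) we observe that $\hat f(\theta)-\theta$ is a periodic function with the period $2\pi$ by $\hat f(\theta+2\pi k)=\hat f(\theta)+2\pi k$, $k\in\mathbb{N}$. Then
\[
\int_0^{2\pi}\hat f(\theta+s)\,d\theta
=\int_s^{2\pi+s}\big(\hat f(\theta)-\theta\big)\,d\theta+\int_s^{2\pi+s}\theta\,d\theta
=\int_0^{2\pi}\big(\hat f(\theta)-\theta\big)\,d\theta+\tfrac12\big((2\pi+s)^2-s^2\big)
\]
\[
=\int_0^{2\pi}\hat f(\theta)\,d\theta+2\pi s=\int_0^{2\pi}\big(\hat f(\theta)+s\big)\,d\theta.
\]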

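We continue to prove the second property for an arbitrary $f\in\mathrm{Diff}\,S^1$ and deduce
\[
F(f)+F(f^{-1})=\int_0^{2\pi}\big(f(\theta)+f^{-1}(\theta)-2\theta\big)\,d\theta
=\int_0^{2\pi}\Big(\hat f\big(\theta-f^{-1}(0)\big)+\hat f^{-1}(\theta)+f^{-1}(0)-2\theta\Big)\,d\theta
\]
\[
=\int_0^{2\pi}\Big(\hat f(\theta)-f^{-1}(0)+\hat f^{-1}(\theta)+f^{-1}(0)-2\theta\Big)\,d\theta
=F(\hat f)+F(\hat f^{-1})=0
\]
by Step 1 and a), b), c). We conclude that
\[
A(f,h)=\int_0^{2\pi}\big(f(\theta)+h(\theta)-(f\circ h)(\theta)-\theta\big)\,d\theta. \tag{7.7}
\]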
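The identity $F(f)+F(f^{-1})=0$ and the coboundary (7.7) can be checked numerically. The following is a minimal sketch, not part of the original text: the sample diffeomorphism `f`, the bisection helper `inverse`, and the midpoint quadrature in `F` are all illustrative assumptions, and `f` is deliberately chosen with $f(0)\neq 0$ so that the situation of Step 2 is exercised.

```python
import math

TWO_PI = 2.0 * math.pi

def f(theta):
    # Hypothetical lift of a circle diffeomorphism (an assumption for illustration):
    # f(theta + 2*pi) = f(theta) + 2*pi and f' = 1 + 0.3*cos + 0.2*sin > 0.
    return theta + 0.5 + 0.3 * math.sin(theta) - 0.2 * math.cos(theta)

def inverse(g, y, tol=1e-13):
    # Bisection for g(x) = y; valid because g is increasing and |g(x) - x| < 2*pi.
    lo, hi = y - TWO_PI, y + TWO_PI
    while hi - lo > tol:
        mid = 0.5 * (lo + hi)
        if g(mid) < y:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)

def f_inv(y):
    return inverse(f, y)

def F(g, n=2000):
    # Midpoint rule for F(g) = int_0^{2 pi} (g(theta) - theta) d theta.
    h = TWO_PI / n
    return h * sum(g((k + 0.5) * h) - (k + 0.5) * h for k in range(n))

def A(g1, g2, n=2000):
    # Unnormalized coboundary (7.7), written as F(g1) + F(g2) - F(g1 o g2).
    return F(g1, n) + F(g2, n) - F(lambda t: g1(g2(t)), n)

print("F(f)            =", F(f))
print("F(f) + F(f^-1)  =", F(f) + F(f_inv))
print("A(f, f^-1)      =", A(f, f_inv))
```

For this sample the mean of $f(\theta)-\theta$ is $0.5$, so `F(f)` should be close to $\pi$, while the other two printed values should vanish up to quadrature and bisection error.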

The last step in defining the group 2-coboundary $A$ is to verify that the infinitesimal version of $A$ from (7.7) coincides with the algebra 2-coboundary $a$. We will use the following.

Proposition 14 ([81]). Let $A : G\times G\to\mathbb{R}$ be a group 2-cocycle defining a central extension by $\mathbb{R}$ of the group $G$. Then the algebra 2-cocycle $a:\mathfrak{g}\times\mathfrak{g}\to\mathbb{R}$ defining the corresponding central extension of the Lie algebra $\mathfrak{g}$ of $G$ is given by
\[
a(u,v)=\frac{d^2}{dt\,ds}\Big|_{t=0,s=0}A(f_t,h_s)-\frac{d^2}{dt\,ds}\Big|_{t=0,s=0}A(h_s,f_t),
\]
where $f_t$ and $h_s$ are smooth curves in $G$ such that
\[
\frac{d}{dt}\Big|_{t=0}f_t=u,\qquad \frac{d}{ds}\Big|_{s=0}h_s=v.
\]

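I. Markina

Now, differentiating (7.7) we obtain
\[
\int_0^{2\pi}\Big(-\frac{\partial}{\partial\theta}\frac{\partial}{\partial t}f_t\,\frac{\partial}{\partial s}h_s
+\frac{\partial}{\partial\theta}\frac{\partial}{\partial s}h_s\,\frac{\partial}{\partial t}f_t\Big)\Big|_{t=0,s=0}d\theta
=\int_0^{2\pi}\big(-v\,\partial_\theta u+u\,\partial_\theta v\big)\,d\theta=-2\pi\eta([u,v])=4\pi a(u,v).
\]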

Normalizing the 2-coboundary (7.7), we finally deduce
\[
A(f,h)=\frac{1}{4\pi}\int_0^{2\pi}\big(f(\theta)+h(\theta)-(f\circ h)(\theta)-\theta\big)\,d\theta.
\]

Example 9. Here we present the Heisenberg algebra from the central extension point of view. Take the abelian Lie algebra $\mathfrak{g}=\mathbb{R}^2$ and an arbitrary skew symmetric bilinear form $\omega:\mathbb{R}^2\times\mathbb{R}^2\to\mathbb{R}$. Since the Lie algebra $\mathfrak{g}$ is abelian, the 2-cocycle condition (7.2) is trivial and the form $\omega$ satisfies it. The resulting central extension of $\mathbb{R}^2$ by $\mathbb{R}$ is the set $\mathfrak{h}=\mathbb{R}^2\oplus\mathbb{R}$ endowed with the brackets
\[
[h_1,h_2]=[(v_1,r_1),(v_2,r_2)]=\big(0,\omega(v_1,v_2)\big).
\]
Note that the choice of any other skew-symmetric bilinear form leads to an isomorphic Lie algebra $\mathfrak{h}$. The Lie algebra $\mathfrak{h}$ with a non-degenerate form $\omega$ is a representative of this isomorphism class and it is called the three-dimensional Heisenberg algebra. The $n$-dimensional analogue can be obtained by taking $\mathfrak{g}=\mathbb{R}^{2n+1}$ and an arbitrary skew symmetric bilinear form $\omega:\mathbb{R}^{2n+1}\times\mathbb{R}^{2n+1}\to\mathbb{R}$. We can even continue and present an infinite-dimensional version of the Heisenberg algebra as a central extension of the space
\[
\mathfrak{g}=\Big\{f\in C(S^1,S^1)\ \Big|\ \eta(f)=\frac{1}{2\pi}\int_{S^1}f(\theta)\,d\theta=0\Big\}.
\]
The space $\mathfrak{g}$ is considered as an abelian algebra. Since a function with vanishing mean value can be written as a Fourier series
\[
f(\theta)=\sum_{n=1}^{\infty}x_n\cos(n\theta)+y_n\sin(n\theta),
\]
it can be interpreted as a point in an infinite-dimensional space. The 2-cocycle $\omega:\mathfrak{g}\times\mathfrak{g}\to\mathbb{R}$ is given by
\[
\omega(f,g)=\int_{S^1}f'(\theta)g(\theta)\,d\theta. \tag{7.8}
\]

Exercises 1. Prove the cocycle condition (7.2) for the Gelfand–Fuchs cocycle (7.3). 2. Calculate the Heisenberg group 2-cocycle that corresponds to the algebra 2-cocycle (7.8).


7.1.2. Complexification of $\mathrm{Vect}(S^1)$ and the Virasoro algebra. The next step is to consider complexifications of $\mathrm{Vect}(S^1)$ and $\mathfrak{vir}$ and their relations to $\mathrm{Diff}\,S^1$ and $\mathrm{Vir}$. The reader who is not familiar with the general construction of complexification can find all the necessary definitions in Appendix A, Subsection 8.4. The complexification $\mathrm{Vect}(S^1)\otimes\mathbb{C}$ consists of smooth complex-valued vector fields $v(\theta)\partial_\theta$ defined on $S^1$, and can be identified with the space $C^\infty(S^1,\mathbb{C})$. The natural basis is the complex-valued Fourier basis $e_k:=-ie^{ik\theta}\partial_\theta$, $k\in\mathbb{Z}$, produced from the real Fourier basis. The commutation relations for the basic vector fields on $\mathrm{Vect}(S^1)\otimes\mathbb{C}$ are
\[
[e_m,e_n]=(n-m)e_{m+n},\qquad m,n\in\mathbb{Z}.
\]
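The Witt relations can be verified mechanically on Fourier coefficients. The sketch below is an illustration under a stated assumption: it uses the bracket convention $[u\partial_\theta,v\partial_\theta]=(uv'-u'v)\partial_\theta$, which reproduces the factor $(n-m)$; with the opposite sign convention the structure constants become $(m-n)$. The dictionary representation and the helper names `bracket`, `e`, `scale` are hypothetical.

```python
def bracket(u, v):
    # u, v: dicts {k: coefficient of e^{i k theta}} for fields u d/dtheta, v d/dtheta.
    # Assumed convention: [u d/dtheta, v d/dtheta] = (u v' - u' v) d/dtheta.
    out = {}
    for m, a in u.items():
        for n, b in v.items():
            # e^{imt} (e^{int})' - (e^{imt})' e^{int} = i (n - m) e^{i(m+n)t}
            out[m + n] = out.get(m + n, 0) + a * b * 1j * (n - m)
    return {k: c for k, c in out.items() if c != 0}

def e(k):
    # Witt basis element e_k = -i e^{i k theta} d/dtheta.
    return {k: -1j}

def scale(c, u):
    # Scalar multiple c * u, dropping vanishing coefficients.
    return {k: c * a for k, a in u.items() if c * a != 0}

def witt_holds(m, n):
    # Checks [e_m, e_n] = (n - m) e_{m+n}.
    return bracket(e(m), e(n)) == scale(n - m, e(m + n))

print(all(witt_holds(m, n) for m in range(-4, 5) for n in range(-4, 5)))
```

All coefficients that occur are integer multiples of $i$, so the comparison is exact in floating-point arithmetic.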

These relations are known under the name of Witt, and the Lie algebra whose basis satisfies the Witt relations is called the Witt algebra. So the complexification of the Lie algebra $\mathrm{Vect}(S^1)$ is the Witt algebra. Actually, the complex-valued vector fields $e_k=-ie^{ik\theta}\partial_\theta$ can be extended to meromorphic vector fields $L_k=z^{k+1}\frac{d}{dz}$ on $\mathbb{C}\cup\{\infty\}$ that are holomorphic vector fields in $\mathbb{C}\setminus\{0\}$. The algebra of complex-valued vector fields $\{z^{k+1}\frac{d}{dz}\}_{k=-\infty}^{\infty}$, extended to the Riemann sphere $\hat{\mathbb{C}}=\mathbb{C}\cup\{\infty\}$, is also called the Witt algebra.

The complexification of the Lie algebra $\mathrm{Vect}(S^1)$ does not correspond to any Lie group. The explanation can be the following. First, we observe that the Lie algebra $\mathrm{Vect}(S^1)$ contains Lie sub-algebras $\mathfrak{g}_k=\mathrm{span}\{\partial_\theta,\sin(k\theta)\partial_\theta,\cos(k\theta)\partial_\theta\}$, $k=1,2,\ldots$. All these sub-algebras are isomorphic to the Lie algebra $\mathfrak{sl}(2,\mathbb{R})$. The algebras $\mathfrak{g}_k$ can be integrated to subgroups $G_k$ of $\mathrm{Diff}\,S^1$. The subgroup $G_1$ is the projective special linear group $PSL(2,\mathbb{R})$ and $G_2=SL(2,\mathbb{R})$. All other groups $G_k$ are $k$-fold coverings of $G_1$. It is known that to the algebra $\mathfrak{sl}(2,\mathbb{C})$, which is the complexification of $\mathfrak{sl}(2,\mathbb{R})$, there correspond only two complex groups, $G_1^{\mathbb{C}}=PSL(2,\mathbb{C})$ and $G_2^{\mathbb{C}}=SL(2,\mathbb{C})$. All other groups $G_k$, $k>2$, do not admit complexifications. The second observation is that the real Lie algebra $\mathrm{Vect}(S^1)$ is contained in the complex Lie algebra $\mathrm{Vect}(S^1)\otimes\mathbb{C}$ and therefore the real Lie sub-algebras $\mathfrak{g}_k$ are contained in the corresponding complex Lie sub-algebras $\mathfrak{g}_k\otimes\mathbb{C}$. If the complex group $(\mathrm{Diff}\,S^1)^{\mathbb{C}}$ existed, then it would contain the real group $\mathrm{Diff}\,S^1$ and the real subgroups $G_k$ would belong to the complex subgroups $G_k^{\mathbb{C}}$. It is known that $PSL(2,\mathbb{R})=G_1\subset G_1^{\mathbb{C}}=PSL(2,\mathbb{C})$, $SL(2,\mathbb{R})=G_2\subset G_2^{\mathbb{C}}=SL(2,\mathbb{C})$, and that there are no other complex subgroups of $(\mathrm{Diff}\,S^1)^{\mathbb{C}}$ containing $G_k$, $k>2$. The rigorous proof can be found in [118].
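Instead of a complex structure, the group $\mathrm{Diff}\,S^1$ admits a left invariant CR-structure according to [96], which is constructed as follows. Define a subalgebra
\[
\mathfrak{h}^{(1,0)}=\Big\{\sum_{n=1}^{\infty}a_ne^{in\theta}\partial_\theta\in\mathrm{Vect}(S^1)\otimes\mathbb{C},\ a_n\in\mathbb{C}\Big\} \tag{7.9}
\]
of $\mathrm{Vect}(S^1)\otimes\mathbb{C}$. The set $\mathfrak{h}^{(1,0)}$ is just the set of all vector fields having vanishing mean value on the circle. The sum $\mathfrak{h}^{(1,0)}\oplus\overline{\mathfrak{h}^{(1,0)}}$ is of complex co-rank 1 in $\mathrm{Vect}(S^1)\otimes\mathbb{C}$, whence we obtain a left invariant CR-structure on $\mathrm{Diff}\,S^1$. In addition,
\[
\Big[\sum_{n=1}^{\infty}a_ne^{in\theta}\partial_\theta,\ \sum_{n=1}^{\infty}\bar a_ne^{-in\theta}\partial_\theta\Big]
=-i\sum_{k,n=1}^{\infty}(k+n)a_n\bar a_k\,e^{i(n-k)\theta}\partial_\theta,
\]
which is not in $\mathfrak{h}^{(1,0)}\oplus\overline{\mathfrak{h}^{(1,0)}}$ unless all $a_n=0$. Thus $(\mathrm{Diff}\,S^1,\mathfrak{h}^{(1,0)})$ is strongly pseudoconvex.

We now move to the complexification $\mathfrak{vir}\otimes\mathbb{C}$ of the central extension of $\mathrm{Vect}(S^1)$, that is, of the Virasoro algebra $\mathfrak{vir}$. As a vector space, $\mathfrak{vir}\otimes\mathbb{C}$ is the complex vector space generated by $e_k$, $c$, where $e_k$ is the basis of $\mathrm{Vect}(S^1)\otimes\mathbb{C}$ and the generator $c$ is called the central element. To define the bracket on $\mathfrak{vir}\otimes\mathbb{C}$ we extend the real Lie algebra 2-cocycle
\[
\omega_{\alpha\beta}(u,v)=\frac{1}{2\pi}\int_0^{2\pi}\big(\alpha u(\theta)v'(\theta)+\beta u'(\theta)v''(\theta)\big)\,d\theta \tag{7.10}
\]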

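to the complex-valued cocycle
\[
\omega^{\mathbb{C}}_{\alpha\beta}:\mathrm{Vect}(S^1)\otimes\mathbb{C}\times\mathrm{Vect}(S^1)\otimes\mathbb{C}\to\mathbb{C}
\]
by the standard procedure of extension of the integral to complex-valued functions. Then the commutator becomes
\[
[(v,\mu c),(u,\nu c)]=\big([v,u],\ \omega^{\mathbb{C}}_{\alpha\beta}(v,u)c\big),\qquad [(v,0),(0,c)]=0,
\]
for $v,u\in\mathrm{Vect}(S^1)\otimes\mathbb{C}$, $\mu,\nu\in\mathbb{C}$. The value of $\omega^{\mathbb{C}}_{\alpha\beta}$ on the Witt basis $e_k=-ie^{ik\theta}\partial_\theta$, $k\in\mathbb{Z}$, is given by
\[
\omega^{\mathbb{C}}_{\alpha\beta}\big(-ie^{im\theta}\partial_\theta,\,-ie^{in\theta}\partial_\theta\big)=
\begin{cases}
-i(\alpha n+\beta n^3) & \text{if } n+m=0,\\
0 & \text{if } n+m\neq 0.
\end{cases}
\]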
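The values of the complexified cocycle on the Witt basis follow from (7.10) by direct integration, and they can be reproduced numerically. The following is a minimal sketch: the function name `omega` and the equispaced quadrature are illustrative assumptions; the equispaced mean is exact, up to rounding, for the exponentials that occur as long as $|m+n|$ is smaller than the number of samples.

```python
import cmath
import math

def omega(m, n, alpha, beta, samples=64):
    # (1/2pi) int_0^{2pi} (alpha u v' + beta u' v'') d theta
    # for u = -i e^{i m theta} and v = -i e^{i n theta}.
    total = 0j
    for k in range(samples):
        t = 2 * math.pi * k / samples
        u = -1j * cmath.exp(1j * m * t)
        du = 1j * m * u             # u'
        v = -1j * cmath.exp(1j * n * t)
        dv = 1j * n * v             # v'
        d2v = (1j * n) ** 2 * v     # v''
        total += alpha * u * dv + beta * du * d2v
    return total / samples

alpha, beta = 0.5, 1.5
print(omega(-2, 2, alpha, beta), -1j * (alpha * 2 + beta * 2 ** 3))
print(omega(1, 2, alpha, beta))
```

The first line prints the numerical value next to $-i(\alpha n+\beta n^3)$ for $n=2$, $m=-2$; the second corresponds to $n+m\neq 0$ and should vanish up to rounding.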

The complexification $\mathfrak{vir}\otimes\mathbb{C}$ of $\mathfrak{vir}$ is also called the Virasoro algebra, and in physics it is used more than the real Virasoro algebra. Are there complex groups that correspond to $\mathfrak{vir}\otimes\mathbb{C}$? In [96] it was proved that the Virasoro–Bott group $\mathrm{Vir}$ admits a left invariant complex structure. It means that the complexified Virasoro algebra $\mathfrak{vir}\otimes\mathbb{C}$ admits the splitting $\mathfrak{vir}\otimes\mathbb{C}=\mathfrak{vir}^{(1,0)}\oplus\mathfrak{vir}^{(0,1)}$ and the manifold $(\mathrm{Vir},\mathfrak{vir}^{(1,0)})$ can be considered as a complex manifold. To prove that $\mathrm{Vir}$ is a complex group one has to verify that the multiplication and inversion become holomorphic maps, but we leave it to the reader. L. Lempert shows in his work [96] that there is actually a family of complex structures, defined as follows. Fix any purely imaginary complex number $i\kappa$, $\kappa\in\mathbb{R}$, and define subalgebras
\[
\mathfrak{vir}^{(1,0)}=\Big\{\Big(\sum_{n=0}^{\infty}a_ne^{in\theta}\partial_\theta,\ i\kappa a_0c\Big)\in\mathfrak{vir}\otimes\mathbb{C},\ \{a_n\}_{n=0}^{\infty}\subset\mathbb{C},\ \kappa\in\mathbb{R}\Big\} \tag{7.11}
\]
and $\mathfrak{vir}^{(0,1)}=\overline{\mathfrak{vir}^{(1,0)}}$. Since $\mathfrak{vir}\otimes\mathbb{C}=\mathfrak{vir}^{(1,0)}\oplus\mathfrak{vir}^{(0,1)}$, we obtain a family of left invariant complex structures on $\mathrm{Vir}$ parametrized by $\kappa\in\mathbb{R}$.

7.1.3. Homogeneous manifold $\mathrm{Diff}\,S^1/S^1$. Let us denote by $S^1$ the closed subgroup of $\mathrm{Diff}\,S^1$ generated by rotations of the unit circle: $\tau(\theta)=\theta+b$, $b\in\mathbb{R}\ \mathrm{mod}\ 2\pi\mathbb{Z}$ if $\tau\in S^1$. Suppose that $S^1$ acts on $\mathrm{Diff}\,S^1$ on the right:
\[
\mu:\ \mathrm{Diff}\,S^1\times S^1\to\mathrm{Diff}\,S^1,\qquad f.\tau\mapsto f\circ\tau=f(\tau).
\]
Then the right quotient $\mathrm{Diff}\,S^1/S^1$ has a manifold structure. Since the group $S^1$ is not a normal subgroup, the manifold $\mathrm{Diff}\,S^1/S^1$ has no group structure. The CR-structure $\mathfrak{h}^{(1,0)}$ of the group $\mathrm{Diff}\,S^1$ presented in (7.9) is invariant under the right action of $S^1$. The action of $S^1$ is transversal to the CR-structure, therefore the quotient $\mathrm{Diff}\,S^1/S^1$ inherits the complex structure from the CR-structure. The space $\mathrm{Diff}\,S^1/S^1$ can be considered as a homogeneous space, where the group $\mathrm{Diff}\,S^1$ acts on the left by composition
\[
\mu:\ \mathrm{Diff}\,S^1\times\mathrm{Diff}\,S^1/S^1\to\mathrm{Diff}\,S^1/S^1,\qquad f.h\mapsto f\circ h=f(h).
\]

Indeed, if $h_1$, $h_2$ belong to the same equivalence class in $\mathrm{Diff}\,S^1/S^1$, then $h_1^{-1}\circ h_2$ is a rotation. Therefore the images $f(h_1)$, $f(h_2)$ also belong to the same equivalence class in $\mathrm{Diff}\,S^1/S^1$, because the composition $[f(h_1)]^{-1}\circ f(h_2)=h_1^{-1}\circ h_2$ is a rotation. The manifold
\[
(\mathrm{Diff}\,S^1/S^1,\ D^{(1,0)})\quad\text{with}\quad D^{(1,0)}_f=d_{\mathrm{id}}\mu_f(\mathfrak{h}^{(1,0)}), \tag{7.12}
\]
where $d\mu_f$ is the differential of the left action of the group $\mathrm{Diff}\,S^1$ on $\mathrm{Diff}\,S^1/S^1$, is a strongly pseudo-convex CR-manifold due to the properties of $\mathfrak{h}^{(1,0)}$.

The subalgebra $\mathfrak{h}^{(1,0)}$ can also be obtained in the following way. To the Lie–Fréchet group $\mathrm{Diff}\,S^1$ there corresponds the Lie algebra $\mathrm{Vect}(S^1)$. Let us denote the Lie algebra of the group of rotations $S^1$ by $\mathfrak{s}^1$. The space $\mathfrak{s}^1$ consists of constant vector fields on the circle. Then the tangent bundle of the quotient $\mathrm{Diff}\,S^1/S^1$ has sections that are vector fields invariant under the right action of $S^1$, or, in other words, $T_f(\mathrm{Diff}\,S^1/S^1)$ is isomorphic to $\mathrm{Vect}(S^1)/\mathfrak{s}^1$. This space is the space of vector fields with vanishing mean value on the circle. The almost complex structure on $\mathrm{Vect}(S^1)/\mathfrak{s}^1$ is given by the Hilbert transform $J$, which is easier to describe through the Fourier basis as
\[
Jv(\theta)=\sum_{n=1}^{\infty}-a_n\sin(n\theta)+b_n\cos(n\theta)\quad\text{for}\quad v(\theta)=\sum_{n=1}^{\infty}a_n\cos(n\theta)+b_n\sin(n\theta). \tag{7.13}
\]
Then
\[
\big(\mathrm{Vect}(S^1)/\mathfrak{s}^1\big)^{(1,0)}=\{v\partial_\theta-iJ(v\partial_\theta)\}=\Big\{\sum_{n=1}^{\infty}c_ne^{in\theta}\partial_\theta,\ c_n=a_n-ib_n\Big\}=\mathfrak{h}^{(1,0)}.
\]
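The two facts used here, that $J^2=-\mathrm{id}$ on mean-zero fields and that $v-iJv$ has only the positive Fourier modes $c_ne^{in\theta}$ with $c_n=a_n-ib_n$, can be checked on a finite Fourier sum. The sketch below is illustrative only; the coefficient dictionary and the helper names are assumptions made for this example.

```python
import cmath
import math

def J(coeffs):
    # coeffs: {n: (a_n, b_n)} for v = sum a_n cos(n t) + b_n sin(n t), n >= 1.
    # (7.13): Jv = sum -a_n sin(n t) + b_n cos(n t), i.e. (a_n, b_n) -> (b_n, -a_n).
    return {n: (b, -a) for n, (a, b) in coeffs.items()}

def evaluate(coeffs, t):
    return sum(a * math.cos(n * t) + b * math.sin(n * t)
               for n, (a, b) in coeffs.items())

def holomorphic_part(coeffs, t):
    # (v - i J v)(t)
    return evaluate(coeffs, t) - 1j * evaluate(J(coeffs), t)

def fourier_form(coeffs, t):
    # sum c_n e^{i n t} with c_n = a_n - i b_n
    return sum((a - 1j * b) * cmath.exp(1j * n * t)
               for n, (a, b) in coeffs.items())

v = {1: (0.5, -1.0), 3: (2.0, 0.25)}   # sample mean-zero field (hypothetical)
print(J(J(v)))                          # equals -v coefficientwise
print(abs(holomorphic_part(v, 0.7) - fourier_form(v, 0.7)))
```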


7.1.4. The groups Vir and Diff S 1 as principal bundles. The aim of this subsection is to explain the bundle structures of the groups Vir and Diff S 1 over the base space Diff S 1 /S 1 . Proposition 15. The following bundle structures exist. 1. The bundle π : Diff S 1 → Diff S 1 /S 1 is a principal U (1)-bundle. 2. The bundle Π : Vir → Diff S 1 /S 1 is a trivial C∗ -bundle. Here C∗ is the multiplicative group of complex numbers C \ {0}. To prove the proposition one can show that the manifolds Vir, Diff S 1 , and Diff S 1 /S 1 , considered with their complex and CR-structures, are bi-holomorphically equivalent to some spaces of univalent functions, where the bundle structure is more transparent. We start from the definitions of these spaces. Let Hol(BC ) denote the space of holomorphic functions in the unit disk BC ⊂ C. The subspaces A0 and A0 of Hol(BC ) are defined by A0 = {f ∈ C ∞ (B C ) | f ∈ Hol(BC ), f (0) = 0},

A0 = {f ∈ A0 | f  (0) = 0},

where B C is the closure of the unit disk BC . The classes A0 and A0 are complex Fr´echet vector spaces, where the topology is defined by the semi-norms f m = sup{|f (m) (z)| | z ∈ B C }. The topology is equivalent to the uniform convergence of all derivatives in B C . Notice that both A0 and A0 can be considered as complex manifolds, where the real tangent space is naturally isomorphic to the holomorphic part of the splitting under the induced almost complex structure from CN . Then we define F = {f ∈ A0 | f is univalent in BC and injective on the boundary ∂BC }. Geometrically, the class F defines all differentiable embeddings of the closed disk ∞ B C to C and analytically it is represented by functions f = cz(1 + n=1 cn z n ), c, cn ∈ C. As a subset of A0 the space of univalent functions F forms an open subset inheriting the Fr´echet topology of complex vector space A0 . Next we consider the class F1 = {f ∈ F | |f  (0)| = 1}, ∞ whose elements can be written as f = eiφ z(1 + n=1 cn z n ), φ ∈ R mod 2πZ. The set F1 is a pseudo- convex surface of real codimension 1 in the complex open set F ⊂ A0 . The last class of functions is F0 = {f ∈ F | f  (0) = 1}.

The elements of this class have the form $f=z\big(1+\sum_{n=1}^{\infty}c_nz^n\big)$. It is obvious that $\mathcal{F}_0$ can be considered both as the quotient $\mathcal{F}_1/S^1$ and as the quotient $\mathcal{F}/\mathbb{C}^*$, $\mathbb{C}^*=\mathbb{C}\setminus\{0\}$. In the latter case, $\mathcal{F}$ is the holomorphic trivial principal $\mathbb{C}^*$-bundle over the base space $\mathcal{F}_0$ (since the projection is just dividing by a non-zero complex number). The topological structure of the circle bundle $\mathcal{F}_1$ over the base space $\mathcal{F}_0$ is more complicated.


Since the set $\mathcal{F}_0$ can also be considered as an open subset of the affine space $v+A_0$, where $v(z)=z$, the tangent space $T_f\mathcal{F}_0$ inherits the natural complex structure of the complex vector space $A_0$ [4]. The real tangent space $T_f\mathcal{F}_0$ with the induced almost complex structure from $A_0$ is isomorphic to the complex vector space $T_f^{(1,0)}\mathcal{F}_0$ of the splitting $T\mathcal{F}_0\otimes\mathbb{C}=T^{(1,0)}\mathcal{F}_0\oplus T^{(0,1)}\mathcal{F}_0$. Moreover, affine coordinates can be introduced so that to every $f\in\mathcal{F}_0$, written in the form $f(z)=z\big(1+\sum_{n=1}^{\infty}c_nz^n\big)$, there corresponds the sequence $\{c_n\}_{n=1}^{\infty}$.

Theorem 7.1. The following statements are true.
1. The Virasoro–Bott group $\mathrm{Vir}$ with the left invariant complex structure $(\mathrm{Vir},\mathfrak{vir}^{(1,0)})$ defined by (7.11) is bi-holomorphic to $\mathcal{F}$ [96].
2. The group $\mathrm{Diff}\,S^1$ with its left invariant CR-structure $(\mathrm{Diff}\,S^1,\mathfrak{h}^{(1,0)})$ (7.9) is CR-isomorphic to the strongly convex hypersurface $\mathcal{F}_1\subset\mathcal{F}$ [96].
3. The homogeneous space $\mathrm{Diff}\,S^1/S^1$ with its complex structure $(\mathrm{Diff}\,S^1/S^1,D^{(1,0)})$ introduced by (7.12) is bi-holomorphic to $\mathcal{F}_0$ [96, 84].

It can be shown that $\mathrm{Diff}\,S^1/S^1$ admits not only a complex but even a Kählerian structure [4, 84]. Proposition 15 follows from Theorem 7.1 and the known bundles in the space of holomorphic functions: the principal $\mathbb{C}^*$-bundle $\mathcal{F}\to\mathcal{F}_0$ and the circle bundle $\mathcal{F}_1\to\mathcal{F}_0$. Recall that the group $\mathbb{C}^*$ is the multiplicative group of complex numbers $\mathbb{C}\setminus\{0\}$. The bundle maps of Theorem 7.1 are expressed in the following diagram.

[Diagram with three rows. Top row: $\mathbb{C}^*$, $\mathcal{F}$, $\mathrm{Vir}$, $\mathbb{C}^*$. Middle row: $S^1$, $\mathcal{F}_1$, $\mathrm{Diff}\,S^1$, $S^1$. Bottom row: $\mathcal{F}_0$, $\mathrm{Diff}\,S^1/S^1$. The maps $F$, $F_1$, $F_0$ connect the two central columns in the respective rows; vertical arrows labelled "pr by $\mathbb{R}$" join the top row to the middle row, and the projections $pr_{\mathcal{F}_0}$ and $pr_{\mathrm{Diff}\,S^1/S^1}$ map the middle row to the bottom row.]

Here $F$, $F_0$ are the corresponding bi-holomorphic maps from Theorem 7.1 and $F_1$ gives the isomorphism of CR-structures. The left- and right-hand side extremes represent the typical fibers and the central part shows the projections of the total spaces to the base spaces $\mathcal{F}_0$ and $\mathrm{Diff}\,S^1/S^1$.

Now we briefly describe the bijective map between $\mathcal{F}_0$ and $\mathrm{Diff}\,S^1/S^1$. Let $\overline{B}_{\mathbb{C}}^{\,c}$ be the complement to the closure $\overline{B}_{\mathbb{C}}$ of the unit disk $B_{\mathbb{C}}$. For any $f\in\mathcal{F}_0$, we define a matching function $g:\overline{B}_{\mathbb{C}}^{\,c}\to\mathbb{C}$, such that the image $g(\overline{B}_{\mathbb{C}}^{\,c})$ coincides with the complement to the closure of $f(B_{\mathbb{C}})$. Assume also that the map $g$ satisfies the normalization $g(\infty)=\infty$. Note that such $g$ exists by the Riemann mapping theorem. Since both sets $B_{\mathbb{C}}$ and $\overline{B}_{\mathbb{C}}^{\,c}$ have the common boundary $S^1$ and the functions $f$ and $g$ have smooth extensions through $S^1$, the images $g(S^1)$ and $f(S^1)$ are defined uniquely and represent the same smooth contour in $\mathbb{C}$. If $g$ and $\tilde g$ are two matching functions to $f$, then they are related by a rotation
\[
\tilde g(\zeta)=g(\zeta w),\qquad \zeta\in\overline{B}_{\mathbb{C}}^{\,c},\quad |w|=1.
\]
Thus for an arbitrary matching function $g$ to $f\in\mathcal{F}_0$ the diffeomorphism $h\in\mathrm{Diff}\,S^1$, given by
\[
e^{ih(\theta)}=(f^{-1}\circ g)(e^{i\theta}), \tag{7.14}
\]
is uniquely defined by $f$ up to the right superposition with a rotation. The relation
\[
\mathrm{Diff}\,S^1/S^1\ni[h]\ \longleftrightarrow\ f\in\mathcal{F}_0
\]
given by (7.14) defines a holomorphic bijection $\mathrm{Diff}\,S^1/S^1\cong\mathcal{F}_0$. The composition $f^{-1}\circ g$ is often called a conformal welding.

The left action of $\mathrm{Diff}\,S^1$ on $\mathrm{Diff}\,S^1/S^1$ is transferred to the left action over $\mathcal{F}_0$. It was shown in [4] that the action $\mu_f:\mathcal{F}_0\to\mathcal{F}_0$ is a holomorphic map for any fixed $f\in\mathrm{Diff}\,S^1$. The infinitesimal generator of this action $\sigma_f:\mathrm{Vect}(S^1)\to T_f\mathcal{F}_0$ is given by the variational formula of A.C. Schaeffer and D.C. Spencer [124, page 32]

\[
\sigma_f[u\partial_\theta](z):=\frac{f^2(z)}{2\pi}\int_{S^1}\Big(\frac{wf'(w)}{f(w)}\Big)^2\frac{u(w)\,dw}{w\big(f(w)-f(z)\big)}\in T_f\mathcal{F}_0
\]
defined for $f\in\mathcal{F}_0$, $u\partial_\theta\in\mathrm{Vect}(S^1)$. Note that here $\sigma_f$ is a map from the real vector space $\mathrm{Vect}(S^1)$ to the real vector space $T_f\mathcal{F}_0$. It extends by linearity to a map
\[
L[f,\cdot]:\mathrm{Vect}(S^1)\otimes\mathbb{C}\to T_f\mathcal{F}_0\otimes\mathbb{C}=T_f^{(1,0)}\mathcal{F}_0\oplus T_f^{(0,1)}\mathcal{F}_0.
\]
The variation $L[f,\cdot]$ also defines an isomorphism of complex vector spaces $\mathfrak{h}^{(1,0)}\leftrightarrow T_f^{(1,0)}\mathcal{F}_0$, which is given explicitly by
\[
L_k[f]=L[f,e_k](z)=z^{k+1}f'(z),\qquad e_k=-ie^{ik\theta}\partial_\theta\in\mathfrak{h}^{(1,0)},\quad L_k[f]\in T_f^{(1,0)}\mathcal{F}_0,\quad k=1,2,\ldots \tag{7.15}
\]
by making use of the residue calculus, see, e.g., [4, 83]. Taking the antiholomorphic part of the basis $e_{-k}=ie^{-ik\theta}\partial_\theta$, $k=1,2,\ldots$, we obtain expressions for $L_{-k}[f]\in T_f^{(1,0)}\mathcal{F}_0$ which are rather difficult. The first two of them are
\[
L_{-1}[f](z)=f'(z)-2c_1f(z)-1,\qquad
L_{-2}[f](z)=\frac{f'(z)}{z}-\frac{1}{f(z)}-3c_1+(c_1^2-4c_2)f(z),
\]
and others can be obtained by the Witt commutation relations [4, 84]
\[
[L_k,L_n]=(n-k)L_{k+n},\qquad k,n\in\mathbb{Z}. \tag{7.16}
\]
Here we use the affine coordinates $f=z\big(1+\sum_{n=1}^{\infty}c_nz^n\big)\leftrightarrow(c_1,c_2,\ldots)$. The constant vector $u_0=-i$ is mapped to $L_0[f](z)=zf'(z)-f(z)$. The vector fields $L_k$, $k\in\mathbb{Z}$, were obtained in [84] and received the name of Kirillov's vector fields, see also [4]. We have
\[
T^{(1,0)}_{\mathrm{id}}\mathcal{F}_0=\mathrm{span}\{L_0[\mathrm{id}],L_1[\mathrm{id}],L_2[\mathrm{id}],\ldots\}=\mathrm{span}\{z^2,z^3,\ldots\}.
\]

7.1.5. KdV and Virasoro–Bott group. Let us impose an $L^2$-inner product $(\cdot\,,\cdot)_{L^2}$ on the algebra $\mathfrak{vir}$ by
\[
\big((u(\theta)\partial_\theta,a),(v(\theta)\partial_\theta,b)\big)_{L^2}=\frac{1}{2\pi}\int_{S^1}u(\theta)v(\theta)\,d\theta+ab. \tag{7.17}
\]
Then by right translations we define a right invariant $L^2$-metric on the Virasoro–Bott group. We are interested in finding a geodesic equation on the group $\mathrm{Vir}$ with respect to the $L^2$-metric. First we present the Hamiltonian equation on Lie groups. In this case it can be rewritten as an equation on the dual Lie algebra $\mathfrak{g}^*$. We start from the Poisson structure on the group naturally defined by a Lie algebra structure, and then present the Hamiltonian equation on $\mathfrak{g}^*$ corresponding to this Poisson structure.

Definition 34. The natural Lie–Poisson (or Kirillov–Kostant–Poisson) structure $\{\cdot\,,\cdot\}$ defined on the dual Lie algebra $\mathfrak{g}^*$ is
\[
\{\cdot\,,\cdot\}:\ C^\infty(\mathfrak{g}^*)\times C^\infty(\mathfrak{g}^*)\to C^\infty(\mathfrak{g}^*),\qquad (f(w),g(w))\mapsto\langle[d_wf,d_wg],w\rangle.
\]
Here, as usual, $\langle\cdot\,,\cdot\rangle$ is the pairing between $\mathfrak{g}$ and $\mathfrak{g}^*$. The functions $f,g$ are from $C^\infty(\mathfrak{g}^*)$, $w\in\mathfrak{g}^*$ and $d_wf,d_wg\in T_w^*(\mathfrak{g}^*)$. We identify elements of $T_w^*(\mathfrak{g}^*)$ with $\mathfrak{g}$ and think of $df$, $dg$ as elements of the Lie algebra $\mathfrak{g}$. Now if $f\in C^\infty(\mathfrak{g}^*)$, then the Hamiltonian equation takes the form
\[
\frac{df}{dt}(w)=\{H,f\}(w)=\langle[d_wH,d_wf],w\rangle=\langle\mathrm{ad}_{d_wH}(d_wf),w\rangle=\langle d_wf,\mathrm{ad}^*_{d_wH}(w)\rangle.
\]
Since the left-hand side can be written as $\langle d_wf,\dot w\rangle$, we get that the Hamiltonian equation on $\mathfrak{g}^*$ takes the form
\[
\dot w(t)=\mathrm{ad}^*_{d_{w(t)}H}(w(t)). \tag{7.18}
\]
Let $A:\mathfrak{g}\to\mathfrak{g}^*$ be any invertible self-adjoint operator, i.e., $\langle A\xi,w\rangle=\langle\xi,Aw\rangle$, where $\langle\cdot\,,\cdot\rangle$ is the pairing between $\mathfrak{g}$ and $\mathfrak{g}^*$. Such an operator is often called an "inertia operator". Define the Hamiltonian function $H:\mathfrak{g}^*\to\mathbb{R}$ by
\[
H(w):=\tfrac12\langle w,A^{-1}w\rangle,\qquad w\in\mathfrak{g}^*.
\]
Then $d_wH(w)=A^{-1}w$, and the Hamiltonian equation takes the form
\[
\dot w(t)=\mathrm{ad}^*_{A^{-1}w(t)}(w(t)). \tag{7.19}
\]

Let us apply this calculus to the Virasoro–Bott group. As we know, the Lie algebra $\mathfrak{vir}$ consists of vector fields $(v(\theta)\partial_\theta,a)$. The dual space $\mathfrak{vir}^*$ of the infinite-dimensional space $\mathfrak{vir}$ is too large, therefore one usually considers its "smooth part" in the following sense: for every non-zero element $v\partial_\theta\in\mathfrak{vir}$ there is an element $w\in\mathfrak{vir}^*$ such that $\langle v\partial_\theta,w\rangle\neq0$, and the converse is also true. The dual space $\mathfrak{vir}^*$ can be identified with the so-called smooth quadratic differentials $(u(\theta)(d\theta)^2,a)$, $u\in C^\infty(S^1,\mathbb{R})$, see [82]. The pairing is defined by
\[
\big\langle(u(\theta)(d\theta)^2,a),(v(\theta)\partial_\theta,b)\big\rangle=\frac{1}{2\pi}\int_{S^1}v(\theta)u(\theta)\,d\theta+ab. \tag{7.20}
\]
The co-adjoint action of the Lie algebra $\mathfrak{vir}$ on its dual $\mathfrak{vir}^*$ is the following:
\[
\mathrm{ad}^*_{(v(\theta)\partial_\theta,b)}\big(u(\theta)(d\theta)^2,a\big)=\big((-2v'u-vu'-av''')(d\theta)^2,0\big). \tag{7.21}
\]
Generally, the presence of any inner product $(\cdot\,,\cdot)$ on a Lie algebra $\mathfrak{g}$ allows us to construct the inertia operator $A$ by $(u,v)=\langle A(u),v\rangle$ for all $u,v\in\mathfrak{g}$. It is analogous to the situation when a metric on a Riemannian manifold $M$ produces the identification of $T_qM$ with its dual $T_q^*M$, $q\in M$. The $L^2$ product (7.17) and the pairing (7.20) define the following inertia operator:
\[
A:\ \mathfrak{vir}\to\mathfrak{vir}^*,\qquad (u(\theta)\partial_\theta,a)\mapsto(u(\theta)(d\theta)^2,a).
\]
The Hamiltonian function defined by the product (7.17) and the pairing (7.20) is
\[
H\big(u(\theta)(d\theta)^2,a\big)=\frac12\Big(\int_{S^1}u^2(\theta)\,d\theta+a^2\Big).
\]
Then substituting $\big(u(\theta)(d\theta)^2,a\big)$ for $w$ in (7.19) and using $d_{(u(\theta)(d\theta)^2,a)}H=(u(\theta)\partial_\theta,a)$, we get
\[
\frac{d}{dt}\big(u(\theta)(d\theta)^2,a\big)=\big((-3uu'-au''')(d\theta)^2,0\big).
\]
The last equation is reduced to the system
\[
\dot u=-3uu'-au''',\qquad \dot a=0. \tag{7.22}
\]
The first equation is the Korteweg–de Vries (KdV) non-linear evolution equation that describes traveling waves in a shallow canal. The second equation just says that the parameter $a$ is a real constant. Note that the Euler equation for the $L^2$ metric on the group $\mathrm{Diff}\,S^1$ is called the Hopf or inviscid Burgers equation. We will obtain it in Subsection 7.2.

Exercise 1. Prove the formula for the co-adjoint action (7.21) using the pairing (7.20) and the definition $\langle\mathrm{ad}^*_\xi\omega,\eta\rangle=-\langle\omega,\mathrm{ad}_\xi\eta\rangle$ for $\xi,\eta$ from a Lie algebra and $\omega$ from the dual to the Lie algebra.

Other interesting equations. On the groups $\mathrm{Diff}\,S^1$ and $\mathrm{Vir}$ more metrics can be defined. Let us describe them and write the corresponding geodesic equations.


On the Virasoro algebra $\mathfrak{vir}$ and on $\mathrm{Vect}(S^1)$ the following weighted family of metrics $(\cdot\,,\cdot)_{H^1_{\alpha,\beta}}$ can be defined:
\[
\big((v\partial_\theta,a),(u\partial_\theta,b)\big)_{H^1_{\alpha,\beta}}=\int_{S^1}\big(\alpha vu+\beta v'u'\big)\,d\theta+ab. \tag{7.23}
\]

Theorem 7.2 ([80]). The Euler equations for the right invariant metric $(\cdot\,,\cdot)_{H^1_{\alpha,\beta}}$, $\alpha\neq0$, on the Virasoro–Bott group are given by the following system:
\[
\alpha(\dot u+3uu')-\beta\big(\dot u''+2u'u''+uu'''\big)+au'''=0,\qquad \dot a=0, \tag{7.24}
\]
for $(u(\theta,t)\partial_\theta,a(t))\in\mathfrak{vir}$ for each $t\in I$.

For $\alpha=1$, $\beta=0$ equation (7.24) is the KdV equation (7.22). For $\alpha=\beta=1$ one recovers the Camassa–Holm equation. For $\alpha=0$, $\beta=1$ equation (7.24) becomes the Hunter–Saxton equation. Note that in the case $\alpha=0$ the metric $(\cdot\,,\cdot)_{H^1_{\alpha,\beta}}$ becomes the homogeneous degenerate $(\cdot\,,\cdot)_{\dot H^1}$ metric and therefore to define the Euler equation one has to consider the homogeneous space $\mathrm{Diff}\,S^1/S^1$ and define the geodesic flow on it (for details see [80]).

Contour dynamics and Virasoro algebra. The relations between contour dynamics, stochastic evolution equations, conformal field theory and the groups $\mathrm{Vir}$ and $\mathrm{Diff}\,S^1$ were described in [11, 12, 46, 47, 99].

7.2. Sub-Riemannian geodesics on $\mathrm{Diff}\,S^1$

In this section we present some results for sub-Riemannian geodesics on the groups $\mathrm{Diff}\,S^1$ and $\mathrm{Vir}$. First we describe the horizontal sub-bundles and metrics on them. Then we present some formulas for normal geodesics and discuss the controllability of these groups with respect to the chosen horizontal sub-bundles.

7.2.1. Horizontal sub-bundles. Recall that the linear map $\eta:\mathrm{Vect}(S^1)\to\mathbb{R}$ given by
\[
\eta(u\partial_\theta)=\frac{1}{2\pi}\int_0^{2\pi}u(\theta)\,d\theta \tag{7.25}
\]
associates to each vector field from $\mathrm{Vect}(S^1)$ its mean value on the circle. The kernel of $\eta$, consisting of all vector fields with zero mean value, is isomorphic to $\mathrm{Vect}(S^1)/\mathfrak{s}^1$, where $\mathfrak{s}^1$ as before denotes the subalgebra of $\mathrm{Vect}(S^1)$ of constant vector fields, corresponding to the abelian group of rotations $S^1$. We use the notation $\mathrm{Vect}_0(S^1)=\mathrm{Vect}(S^1)/\mathfrak{s}^1$. Then $\mathrm{Vect}(S^1)=\mathrm{Vect}_0(S^1)\oplus\mathfrak{s}^1$.

Define a horizontal sub-bundle $\mathcal{H}$ of $T\,\mathrm{Diff}\,S^1$ by left translations of $\mathrm{Vect}_0(S^1)$. A horizontal sub-bundle $\mathcal{E}$ of $T\,\mathrm{Vir}$ is defined by left translations of $\big(\mathrm{Vect}_0(S^1),0\big)$ on $\mathrm{Vir}$. Then the complement of $\big(\mathrm{Vect}_0(S^1),0\big)$ in $\mathfrak{vir}$ is given by
\[
\mathfrak{s}_1=\{(a_0\partial_\theta,a)\in\mathfrak{vir}:\ a_0,a\in\mathbb{R}\}
\]
and we have $\mathfrak{vir}=\big(\mathrm{Vect}_0(S^1),0\big)\oplus\mathfrak{s}_1$. The algebra $\mathfrak{s}_1$ is an abelian sub-algebra of $\mathfrak{vir}$ corresponding to the abelian sub-group $S_1=\{(\theta\mapsto\theta+b_0,b)\in\mathrm{Vir}:\ b_0,b\in\mathbb{R}\}$.

Proposition 16. The sub-bundle $\mathcal{H}$ of $T\,\mathrm{Diff}\,S^1$ is invariant under the action of rotations $S^1$ and the sub-bundle $\mathcal{E}$ of $T\,\mathrm{Vir}$ is invariant under the action of $S_1$.

Proof. If $\rho:\theta\mapsto\theta+b_0$ is a rotation, then
\[
\mathrm{Ad}_\rho(u)(\theta)=\frac{d}{d\epsilon}\Big|_{\epsilon=0}\Big(b_0+\exp(\epsilon u)(\theta-b_0)\Big)=u(\theta-b_0).
\]
Therefore, $\eta(\mathrm{Ad}_\rho(u))=\eta(u)$, which means that $\mathcal{H}$ is invariant under the action of $S^1$. By similar arguments $\mathcal{E}$ is invariant under $S_1$.

7.2.2. Sub-Riemannian metrics and normal geodesics. Let us describe left-invariant metrics on $\mathcal{H}$ and $\mathcal{E}$. We start with $\mathcal{H}$. Let $(\cdot\,,\cdot)^{1,0}$ denote the standard $L^2$ inner product on $\mathrm{Vect}(S^1)$,
\[
\big(u\partial_\theta,v\partial_\theta\big)^{1,0}=\frac{1}{2\pi}\int_0^{2\pi}u(\theta)v(\theta)\,d\theta.
\]
Let $\mathbf{g}^{1,0}$ be the Riemannian metric obtained by left translation of $(\cdot\,,\cdot)^{1,0}$, and let $\mathbf{h}^{1,0}$ be its restriction to $\mathcal{H}$. Before we present the equations for sub-Riemannian normal geodesics with respect to the metric $\mathbf{h}^{1,0}$, we formulate a general result that can be found in [65, 66] and that defines the geodesic equation and the exact form of normal geodesics under some invariance conditions. We mention also regular infinite-dimensional Lie groups, see [108, 116]. On a first reading the reader need pay little attention to this condition, since up to now all known Lie groups are regular groups. Nevertheless, this condition ensures, in particular, the existence and smoothness of the group exponential map, which may be neither bijective nor injective. The assumption about the existence of the map $\mathrm{ad}^\top_\xi$ is also nontrivial in the infinite-dimensional case.

Theorem 7.3 ([65]). Let $G$ be an infinite-dimensional regular Lie group and $K$ be its connected subgroup. Denote by $\mathfrak{g}$ and $\mathfrak{k}$ their respective Lie algebras. Let $(\cdot\,,\cdot)$ be an inner product in $\mathfrak{g}$ for which $\mathrm{ad}^\top_\xi$ exists for any $\xi\in\mathfrak{g}$. Assume that $\mathfrak{h}=\mathfrak{k}^\perp$ and $\mathfrak{g}=\mathfrak{h}\oplus_\perp\mathfrak{k}$. Define the horizontal distribution $\mathcal{H}$ by left translations of $\mathfrak{h}$.
Let $\mathbf{g}$ be a Riemannian metric on $G$ obtained by left translation of $(\cdot\,,\cdot)$ and $\mathbf{h}=\mathbf{g}|_{\mathcal{H}}$. The following statements hold.
(a) If $(\cdot\,,\cdot)$ is $\mathrm{ad}(\mathfrak{k})$ invariant and if $\gamma_R:[0,1]\to G$ is a Riemannian geodesic with respect to $\mathbf{g}$, then
\[
\lambda(t)=\mathrm{pr}_{\mathfrak{k}}\,\kappa^\ell(\dot\gamma_R(t)),\qquad t\in[0,1],
\]
is constant. Here $\mathrm{pr}_{\mathfrak{k}}:\mathfrak{g}\to\mathfrak{k}$ is the orthogonal projection with respect to $(\cdot\,,\cdot)$ and $\kappa^\ell(\dot\gamma(t))=d\,l_{\gamma^{-1}(t)}(\dot\gamma(t))$ is the left logarithmic derivative.
(b) If $(\cdot\,,\cdot)$ is $\mathrm{Ad}(K)$ invariant and if $\gamma_{sR}:[0,1]\to G$ is a sub-Riemannian geodesic with respect to $\mathbf{h}$, then $\gamma_{sR}$ is a normal geodesic if and only if it is of the form
\[
\gamma_{sR}(t)=\gamma_R(t)\cdot\exp_G(-\lambda t),\qquad \lambda=\mathrm{pr}_{\mathfrak{k}}\,\kappa^\ell(\dot\gamma_R),\qquad t\in[0,1], \tag{7.26}
\]
where $\gamma_R:[0,1]\to G$ is a Riemannian geodesic with respect to $\mathbf{g}$.
(c) The left logarithmic derivative $u_{sR}(t)=\kappa^\ell(\dot\gamma_{sR}(t))$ of the curve $\gamma_{sR}:[0,1]\to G$ satisfies the equation $\dot u_{sR}(t)=\mathrm{ad}^\top_{u_{sR}(t)}\big(u_{sR}(t)+\lambda\big)$ with a constant $\lambda$ and $t\in[0,1]$.

To write the equation of an $\mathcal{H}$-horizontal normal geodesic on $\mathrm{Diff}\,S^1$ we have to check the conditions of Theorem 7.3 for $G=\mathrm{Diff}\,S^1$ and $K=S^1$. We start from the calculation of the adjoint operator $\mathrm{ad}^\top_{u\partial_\theta}$ with respect to the inner product $(\cdot\,,\cdot)^{1,0}$ on $\mathrm{Vect}(S^1)$, which is defined by
\[
(\mathrm{ad}^\top_{u\partial_\theta}v\partial_\theta,w\partial_\theta)^{1,0}=(v\partial_\theta,\mathrm{ad}_{u\partial_\theta}w\partial_\theta)^{1,0}=(v\partial_\theta,[u\partial_\theta,w\partial_\theta])^{1,0}.
\]
We drop the symbol $\partial_\theta$ to simplify the notation. We calculate
\[
(\mathrm{ad}^\top_uv,w)^{1,0}=(v,-uw'+u'w)^{1,0}=\frac{1}{2\pi}\int_0^{2\pi}\big(-uvw'+u'vw\big)\,d\theta
=\frac{1}{2\pi}\int_0^{2\pi}\big(uv'+2u'v\big)w\,d\theta=(uv'+2u'v,w)^{1,0}
\]
using integration by parts. Remember that all the functions $u,v,w$ are periodic on $[0,2\pi]$, and therefore the term outside of the integral vanishes. We conclude that
\[
\mathrm{ad}^\top_uv=uv'+2u'v,\qquad u\partial_\theta,v\partial_\theta\in\mathrm{Vect}(S^1). \tag{7.27}
\]
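Formula (7.27) can be sanity-checked numerically by comparing $(uv'+2u'v,w)^{1,0}$ with $(v,[u,w])^{1,0}$, where $[u,w]=u'w-uw'$ as in the computation above. The sketch below is illustrative: the three trigonometric polynomials and the helper `inner` are assumptions chosen for the example, and the equispaced mean used for the integral is exact, up to rounding, for trigonometric polynomials of low degree.

```python
import math

def inner(p, q, samples=256):
    # (p, q)^{1,0} = (1/2pi) int_0^{2pi} p q d theta, via an equispaced mean.
    pts = [2 * math.pi * k / samples for k in range(samples)]
    return sum(p(t) * q(t) for t in pts) / samples

# Sample periodic functions with their exact derivatives (hypothetical choices).
def u(t):  return math.sin(t) + 0.5 * math.cos(2 * t)
def du(t): return math.cos(t) - math.sin(2 * t)
def v(t):  return math.cos(t) + 0.3 * math.sin(3 * t)
def dv(t): return -math.sin(t) + 0.9 * math.cos(3 * t)
def w(t):  return 1.0 + math.sin(2 * t) - 0.7 * math.cos(t)
def dw(t): return 2 * math.cos(2 * t) + 0.7 * math.sin(t)

# Left side: (ad^T_u v, w) with ad^T_u v = u v' + 2 u' v, as in (7.27).
lhs = inner(lambda t: u(t) * dv(t) + 2 * du(t) * v(t), w)
# Right side: (v, [u, w]) with [u, w] = u' w - u w'.
rhs = inner(v, lambda t: du(t) * w(t) - u(t) * dw(t))

print(lhs, rhs)
```

Setting $v=u+\lambda$ with constant $\lambda$ in (7.27) gives $uu'+2u'(u+\lambda)=3uu'+2\lambda u'$, which is the right-hand side of the geodesic equation on $\mathcal{H}$.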

Proposition 17. The inner product $(\cdot\,,\cdot)^{1,0}$ is $\mathrm{Ad}(S^1)$- and $\mathrm{ad}(\mathfrak{s}^1)$-invariant.

Proof. Since, in general, invariance with respect to the adjoint action of the Lie group implies invariance with respect to the adjoint action of the Lie algebra, the $\mathrm{Ad}(S^1)$-invariance implies $\mathrm{ad}(\mathfrak{s}^1)$-invariance. Nevertheless, we present both proofs. It was shown in the proof of Proposition 16 that for the rotation $\rho(\theta)=\theta+b$ the adjoint action of $\rho\in S^1$ on $\mathrm{Vect}(S^1)$ is expressed as $\mathrm{Ad}_\rho u(\theta)=u(\theta-b)$. Then $(\mathrm{Ad}_\rho u,\mathrm{Ad}_\rho v)^{1,0}=(u,v)^{1,0}$, since $u,v$ are periodic with period $2\pi$. Denote by $p_0=\partial_\theta$ the basis vector for $\mathfrak{s}^1$. Then
\[
(\mathrm{ad}_{p_0}u,v)^{1,0}=([p_0,u],v)^{1,0}=-(u',v)^{1,0}=(u,v')^{1,0}=-(u,[p_0,v])^{1,0}=-(u,\mathrm{ad}_{p_0}v)^{1,0}.
\]
It implies that the inner product $(\cdot\,,\cdot)^{1,0}$ is invariant under the adjoint action of the algebra $\mathfrak{s}^1$.

Moreover, the subspaces $\mathrm{Vect}_0(S^1)$ and $\mathfrak{s}^1$ are orthogonal with respect to the inner product $(\cdot\,,\cdot)^{1,0}$, making the linear map $\eta$ from (7.25) an orthogonal projection onto $\mathfrak{s}^1$. We see that all the hypotheses of Theorem 7.3 are satisfied. Thus, a normal $\mathcal{H}$-horizontal geodesic $\gamma:I\to\mathrm{Diff}\,S^1$ is the solution to the equations
\[
\kappa^\ell(\dot\gamma)=u,\qquad \dot u=\mathrm{ad}^\top_u(u+\lambda)=3uu'+2\lambda u',\qquad u\in\mathrm{Vect}_0(S^1),\ \lambda\in\mathbb{R}.
\]

The Riemannian geodesics obtained for $\lambda=0$ are solutions to the inviscid Burgers equation $\dot u=3uu'$. For the map $\pi:\mathrm{Diff}\,S^1\to\mathrm{Diff}\,S^1/S^1$ we denote the base space $\mathrm{Diff}\,S^1/S^1$ by $B$. Let $\mathbf{b}^{1,0}$ be the Riemannian metric on $B=\mathrm{Diff}\,S^1/S^1$ obtained as a pushforward of $\mathbf{h}^{1,0}$ by $\pi$. Then the Riemannian geodesics in $B$ with respect to $\mathbf{b}^{1,0}$ are the projections $\pi(\gamma)$.

Now we consider a more general family of metrics than just the $L^2$-metric. Define a two-parameter family $(\cdot\,,\cdot)^{\alpha\beta}_0$ of scalar products on $\mathrm{Vect}_0(S^1)$ by the formula
\[
(u,v)^{\alpha\beta}_0=\frac{1}{2\pi}\int_0^{2\pi}\big(\alpha u(\theta)v(\theta)+\beta u'(\theta)v'(\theta)\big)\,d\theta,\qquad u,v\in\mathrm{Vect}_0(S^1).
\]
The scalar product is non-degenerate for $\alpha\neq-n^2\beta$, $n\in\mathbb{N}$, and is positive definite only if $\beta\geq0$ and $\alpha>-\beta$. We extend the inner product $(\cdot\,,\cdot)^{\alpha\beta}_0$ to the entire Lie algebra $\mathrm{Vect}(S^1)$ by the formula
\[
(u,v)^{\alpha\beta}=\big(u-\eta(u),v-\eta(v)\big)^{\alpha\beta}_0+\eta(u)\eta(v),\qquad u,v\in\mathrm{Vect}(S^1). \tag{7.28}
\]
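The conditions on $\alpha$ and $\beta$ become transparent on the Fourier basis: $\cos(n\theta)$ and $\sin(n\theta)$ are mutually orthogonal for $(\cdot\,,\cdot)^{\alpha\beta}_0$ and each has squared norm $(\alpha+n^2\beta)/2$. The sketch below, with the hypothetical helpers `mode_norm_sq` and `classify`, inspects only finitely many modes, so it is a truncated illustration rather than a proof.

```python
def mode_norm_sq(alpha, beta, n):
    # (cos(n t), cos(n t))_0^{alpha beta} = (alpha + n^2 beta) / 2; same for sin(n t).
    return 0.5 * (alpha + n * n * beta)

def classify(alpha, beta, max_mode=50):
    # Inspects only modes 1..max_mode (a truncation assumed for illustration).
    values = [mode_norm_sq(alpha, beta, n) for n in range(1, max_mode + 1)]
    if any(val == 0 for val in values):
        return "degenerate"
    if all(val > 0 for val in values):
        return "positive definite"
    return "indefinite"

print(classify(1.0, 1.0))    # beta >= 0 and alpha > -beta
print(classify(-4.0, 1.0))   # alpha = -n^2 beta with n = 2
print(classify(-2.0, 1.0))   # non-degenerate, but negative on the first mode
```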

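Let us define a Riemannian metric $\mathbf{g}^{\alpha\beta}$ by left translation of $(\cdot\,,\cdot)^{\alpha\beta}$, and let $\mathbf{h}^{\alpha\beta}$ be its restriction to $\mathcal{H}$. Theorem 7.3 can be applied also in this case and we deduce that an $\mathcal{H}$-horizontal normal geodesic $\gamma:I\to\mathrm{Diff}\,S^1$ with respect to the metric $\mathbf{h}^{\alpha\beta}$ is a solution to the equations
\[
\kappa^\ell(\dot\gamma)=u,\qquad \beta\dot u''-\alpha\dot u=\beta(uu'''+2u'u'')-3\alpha uu'+2\lambda u',\qquad u\in\mathrm{Vect}_0(S^1).
\]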

If bαβ is the Riemannian metric on B = Diff S 1 /S 1 induced by hαβ as a pushforward, then the Riemannian geodesics on B are given as projections π(γ) of solutions. The details can be found in [65]. Now we present metrics and normal geodesics for the Virasoro–Bott group Virμν , where the sub-index corresponds to the 2-cocycle ωμν . We extend the inner product (· , ·)αβ to the Virasoro algebra virμν . The extension is given by the formula 

αβ (u∂θ , a1 ), (v∂θ , a2 ) = (u, v)αβ + a1 a2 . μν

αβ be the Riemannian metric on Virμν obtained by left translations of (· , ·)αβ Let gμν μν , and let hαβ be its restriction to the sub-bundle E. μν

Let us calculate the adjoint ad (u,a) of ad(u,a) with respect to the metric (· , ·)1,0 . Notice that μν 1 ωμν (u, v) = 2π

) 0



  μu(θ)v  (θ) + νu (θ)v  (θ) dθ = −(u, Lμν v  )1,0 ,

Geodesics in Geometry with Constraints

279

  ∂2 ∂2 v the operator −μ + ν ∂θ where we used the notation Lμν v = − μ + ν ∂θ 2 2 is also known as the Hill operator. Then we calculate  1,0  1,0 ad = v, [u, w] − bωμν (w, u) (u,a) (v, b), (w, c) μν 1,0 1,0    (7.29) = adu v, w + w, bLμν u  1,0 = (uv  + 2u v + bLμν u , 0), (w, c) μν

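by formula (7.27). The conditions of Theorem 7.3 are satisfied. The left logarithmic derivative $(u(t), 0) \in (\operatorname{Vect}_0(S^1), 0) \subset \mathfrak{g}_{\mu\nu}$ of an $\mathcal{E}$-horizontal normal geodesic $(\gamma, b) : I \to \mathrm{Vir}_{\mu\nu}$ with respect to the metric $h^{1,0}_{\mu\nu}$ is a solution to the equation $(\dot u, 0) = \operatorname{ad}^{\top}_{(u,0)}(u + \lambda_1, \lambda_2)$, $\lambda_1, \lambda_2 \in \mathbb{R}$. This means that the curve $(\gamma, b)$ is a solution to

$$\kappa^{\ell}(\dot\gamma) = u \quad\text{with}\quad \dot u = 3uu' + (2\lambda_1 - \lambda_2\mu)u' + \lambda_2\nu u''', \qquad u \in \operatorname{Vect}_0(S^1). \tag{7.30}$$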
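The passage from (7.29) to (7.30) can be sanity-checked numerically. The sketch below is only an illustration under stated assumptions: it takes the cocycle in the form $\omega_{\mu\nu}(u,v) = \frac{1}{2\pi}\int(\mu uv' + \nu u'v'')\,d\theta$, the operator $L_{\mu\nu} = -\mu + \nu\,\partial_\theta^2$, and the inner product $(u,w)^{1,0} = \frac{1}{2\pi}\int uw\,d\theta$; the dictionary representation of trigonometric polynomials and all function names are hypothetical helpers, not part of the text.

```python
import math

# A trigonometric polynomial sum_n (a_n cos(n t) + b_n sin(n t)) is stored as {n: (a_n, b_n)}.

def deriv(f):
    """Termwise derivative with respect to t."""
    return {n: (n * b, -n * a) for n, (a, b) in f.items()}

def evaluate(f, t):
    return sum(a * math.cos(n * t) + b * math.sin(n * t) for n, (a, b) in f.items())

def mean(g, samples=256):
    """(1/2pi) * integral of g over [0, 2pi]; the uniform rule is exact for
    trigonometric polynomials of degree below the number of samples."""
    return sum(g(2 * math.pi * j / samples) for j in range(samples)) / samples

def hill(f, mu, nu):
    """Assumed operator L = -mu + nu * d^2/dt^2, applied mode by mode."""
    return {n: ((-mu - nu * n * n) * a, (-mu - nu * n * n) * b) for n, (a, b) in f.items()}

def omega(u, v, mu, nu):
    """Assumed form of the cocycle: mean of mu*u*v' + nu*u'*v''."""
    du, dv = deriv(u), deriv(v)
    ddv = deriv(dv)
    return mean(lambda t: mu * evaluate(u, t) * evaluate(dv, t)
                + nu * evaluate(du, t) * evaluate(ddv, t))

def inner_10(u, w):
    """Inner product (u, w)^{1,0}: mean of u*w."""
    return mean(lambda t: evaluate(u, t) * evaluate(w, t))

def ad_top(u, v, b, mu, nu, t):
    """Value at t of u v' + 2 u' v + b L u', the first component in (7.29)."""
    du = deriv(u)
    return (evaluate(u, t) * evaluate(deriv(v), t)
            + 2 * evaluate(du, t) * evaluate(v, t)
            + b * evaluate(hill(du, mu, nu), t))

def rhs_730(u, lam1, lam2, mu, nu, t):
    """Right-hand side of (7.30) evaluated at t."""
    du = deriv(u)
    d3u = deriv(deriv(du))
    return (3 * evaluate(u, t) * evaluate(du, t)
            + (2 * lam1 - lam2 * mu) * evaluate(du, t)
            + lam2 * nu * evaluate(d3u, t))

def shift(u, lam):
    """u + lam, where the constant lam is the n = 0 mode."""
    out = dict(u)
    a0, b0 = out.get(0, (0.0, 0.0))
    out[0] = (a0 + lam, b0)
    return out

u = {1: (1.0, 0.5), 3: (-0.25, 2.0)}      # mean-zero test field
v = {1: (0.3, -1.0), 2: (0.7, 0.1)}
mu, nu, lam1, lam2 = 0.7, 1.3, 0.4, -0.9

# omega(u, v) + (u, L v')^{1,0} should vanish.
cocycle_gap = abs(omega(u, v, mu, nu) + inner_10(u, hill(deriv(v), mu, nu)))
# ad^T_{(u,0)}(u + lam1, lam2) should agree pointwise with the right-hand side of (7.30).
geodesic_gap = max(abs(ad_top(u, shift(u, lam1), lam2, mu, nu, 0.1 * j)
                       - rhs_730(u, lam1, lam2, mu, nu, 0.1 * j)) for j in range(63))
print(cocycle_gap, geodesic_gap)
```

Both gaps should be at the level of rounding error; the check does not depend on the sign convention chosen for the Lie bracket of vector fields.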

The corresponding Riemannian geodesics with respect to $g^{1,0}_{0,1}$ satisfy the KdV equation, as was shown in Subsection 7.1.5 for an analogous right-invariant metric. The equations for a normal geodesic with respect to the general metric $h^{\alpha\beta}_{\mu\nu}$ can be found in [65].

7.2.3. Metrics on $\mathcal{H}$ corresponding to invariant Kählerian metrics. In this subsection we discuss metrics on the sub-bundle $\mathcal{H}$ of $T\operatorname{Diff} S^1$ obtained by the pullback of some Kählerian metrics defined on $B = \operatorname{Diff} S^1/S^1$, where we identify $B$ and $\mathcal{F}_0$ as was done in Subsection 7.1.4. Recall that the left action of $\operatorname{Diff} S^1$ on $\mathcal{F}_0$ is well defined. Let us choose a Hermitian metric on the base space $\mathcal{F}_0$, assuming that this metric is Kählerian and invariant under the action of $\operatorname{Diff} S^1$. All such pseudo-Hermitian metrics on $\mathcal{F}_0$ are included in the two-parameter family $b_{\alpha\beta}$, see [83, 85, 86]. It is sufficient to describe this metric only at $\mathrm{id} \in \mathcal{F}_0$, because at other points of $\mathcal{F}_0$ the metric $b_{\alpha\beta}$ is defined by the left action of $\operatorname{Diff} S^1$. Any smooth curve $f_t$ in $\mathcal{F}_0$ with $f_0 = \mathrm{id}_{\mathbb{D}}$ can be written as

$$f_t(z) = z + tzF(z) + o(t), \qquad F \in \mathcal{A}_0.$$

Hence, we can identify $T_{\mathrm{id}_{\mathbb{D}}}\mathcal{F}_0$ with $\mathcal{A}_0$ by relating the equivalence class $[t \mapsto f_t]$ to $F$. With this identification, the metric $b_{\alpha\beta}$ at $\mathrm{id}_{\mathbb{D}} \in \mathcal{F}_0$ can be written as

$$b_{\alpha\beta}|_{\mathrm{id}_{\mathbb{D}}}(F_1, F_2) = \frac{2}{\pi}\iint_{\mathbb{D}} \Big(\alpha F_1'\overline{F_2'} + \beta (zF_1')'\,\overline{(zF_2')'}\Big)\, d\sigma(z) = 2\sum_{n=1}^{\infty} (\alpha n + \beta n^3)\, a_n \bar b_n, \tag{7.31}$$

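where $d\sigma(z)$ is the area element and $F_1(z) = \sum_{n=1}^{\infty} a_n z^n$, $F_2(z) = \sum_{n=1}^{\infty} b_n z^n$. If $\alpha \neq -n^2\beta$, $n \in \mathbb{Z}$, then the metric $b_{\alpha\beta}$ is non-degenerate pseudo-Hermitian. Otherwise, $b_{\alpha\beta}$ degenerates along a distribution of complex dimension 1. Moreover, we require $\beta \geq 0$ and $-\alpha < \beta$ in order to obtain a positive definite Hermitian metric. Since it is impossible to write the left action of $\operatorname{Diff} S^1$ on $\mathcal{F}_0$ explicitly, it is not easy to describe $b_{\alpha\beta}$ globally on $\mathcal{F}_0$. However, these metrics can be pulled back to $\mathcal{H}$ by the projection $\pi : \operatorname{Diff} S^1 \to \mathcal{F}_0$. Consider the injective map

$$d_{\mathrm{id}}\pi : \operatorname{Vect}_0(S^1) \to T_{\mathrm{id}_{\mathbb{D}}}\mathcal{F}_0 \cong \mathcal{A}_0, \qquad u\partial_\theta \mapsto F.$$

Then the elements $F$ and $u$ are related by the formula, see [85],

$$F(e^{i\theta}) = \frac{i}{2}\big(u(\theta) - iJu(\theta)\big),$$

where $J$ is from (7.13). Observe that

$$\begin{aligned}
b_{\alpha\beta}|_{\mathrm{id}_{\mathbb{D}}}(F_1, F_2) &= \frac{2}{\pi}\iint_{\mathbb{D}} \Big(\alpha F_1'\overline{F_2'} + \beta (zF_1')'\,\overline{(zF_2')'}\Big)\, d\sigma(z) \\
&= \frac{i}{\pi}\iint_{\mathbb{D}} \Big(\alpha\, dF_1 \wedge d\overline{F_2} + \beta\, d(zF_1') \wedge d\overline{(zF_2')}\Big) \\
&= \frac{i}{\pi}\int_{S^1} \Big(\alpha F_1\, d\overline{F_2} + \beta (zF_1')\, d\overline{(zF_2')}\Big).
\end{aligned}$$

So we conclude that for $u, v \in \operatorname{Vect}_0(S^1)$, and $F_1 = \frac{i}{2}(u - iJu)$, $F_2 = \frac{i}{2}(v - iJv)$,

$$\begin{aligned}
b_{\alpha\beta}|_{\mathrm{id}_{\mathbb{D}}}\big(d_{\mathrm{id}}\pi u, d_{\mathrm{id}}\pi v\big)
&= \frac{i}{4\pi}\int_{S^1} \Big(\alpha (u - iJu)\, d(v + iJv) + \beta (u' - iJu')\, d(v' + iJv')\Big) \\
&= \frac{i}{4\pi}\int_{S^1} \Big(\alpha (u\, dv - iJu\, dv + iu\, dJv + Ju\, dJv) \\
&\qquad\qquad + \beta (u'\, dv' - iJu'\, dv' + iu'\, dJv' + Ju'\, dJv')\Big) \\
&= \frac{i}{4\pi}\int_{S^1} \Big(\alpha (u\, dv + Ju\, dJv) + \beta (u'\, dv' + Ju'\, dJv')\Big) \\
&\quad + \frac{1}{4\pi}\int_{S^1} \Big(\alpha (Ju\, dv - u\, dJv) + \beta (Ju'\, dv' - u'\, dJv')\Big) \\
&= i\omega_{\alpha\beta}(u, v) + \omega_{\alpha\beta}(Ju, v),
\end{aligned}$$

where $\omega_{\alpha\beta}$ is the 2-cocycle (7.10) and we used $\int_{S^1} u\, dv = \int_{S^1} Ju\, d(Jv)$ in the last equation, which can be shown by Fourier expansions. The inner product on $\operatorname{Vect}_0(S^1)$ corresponding to the cocycle $\omega_{\alpha\beta}$ is obtained by $\langle u, v\rangle_{\alpha\beta} = \omega_{\alpha\beta}(Ju, v)$. Observe that

$$\langle u, v\rangle_{\alpha\beta} = -(Ju', v)^{\alpha\beta}, \qquad u, v \in \operatorname{Vect}_0(S^1). \tag{7.32}$$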
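The identities used in this computation can be tested on trigonometric polynomials. The following sketch rests on assumptions that are not stated in this section: the operator $J$ from (7.13) is taken to act on Fourier modes by $J\cos n\theta = -\sin n\theta$, $J\sin n\theta = \cos n\theta$ (the choice for which $\frac{i}{2}(u - iJu)$ contains only the powers $e^{in\theta}$, $n \geq 1$), the cocycle is taken as $\omega_{\alpha\beta}(u,v) = \frac{1}{2\pi}\int(\alpha uv' + \beta u'v'')\,d\theta$, and $(u,v)^{\alpha\beta} = \frac{1}{2\pi}\int(\alpha uv + \beta u'v')\,d\theta$. All helper names are hypothetical.

```python
import math

# A mean-zero trigonometric polynomial sum_n (a_n cos(n t) + b_n sin(n t)) is stored as {n: (a_n, b_n)}.

def deriv(f):
    return {n: (n * b, -n * a) for n, (a, b) in f.items()}

def J(f):
    """Assumed action of J: cos(n t) -> -sin(n t), sin(n t) -> cos(n t)."""
    return {n: (b, -a) for n, (a, b) in f.items()}

def evaluate(f, t):
    return sum(a * math.cos(n * t) + b * math.sin(n * t) for n, (a, b) in f.items())

def mean(g, samples=256):
    """(1/2pi) * integral over the circle; exact here because all integrands have low degree."""
    return sum(g(2 * math.pi * j / samples) for j in range(samples)) / samples

def omega(u, v, alpha, beta):
    du, dv = deriv(u), deriv(v)
    ddv = deriv(dv)
    return mean(lambda t: alpha * evaluate(u, t) * evaluate(dv, t)
                + beta * evaluate(du, t) * evaluate(ddv, t))

def inner(u, v, alpha, beta):
    du, dv = deriv(u), deriv(v)
    return mean(lambda t: alpha * evaluate(u, t) * evaluate(v, t)
                + beta * evaluate(du, t) * evaluate(dv, t))

def series_731(u, alpha, beta):
    """Right-hand side of (7.31) for F1 = F2 = (i/2)(u - iJu), whose n-th coefficient is (i/2)(a_n - i b_n)."""
    return sum(2 * (alpha * n + beta * n ** 3) * abs(0.5j * (a - 1j * b)) ** 2
               for n, (a, b) in u.items())

u = {1: (1.0, 0.5), 3: (-0.25, 2.0)}
v = {1: (0.3, -1.0), 2: (0.7, 0.1)}
alpha, beta = 0.6, 1.1

# Integral of u dv equals integral of Ju d(Jv).
gap_fourier = abs(mean(lambda t: evaluate(u, t) * evaluate(deriv(v), t))
                  - mean(lambda t: evaluate(J(u), t) * evaluate(deriv(J(v)), t)))
# Identity (7.32): omega(Ju, v) = -(Ju', v)^{alpha beta}.
gap_732 = abs(omega(J(u), v, alpha, beta) + inner(deriv(J(u)), v, alpha, beta))
# omega(Ju, u) reproduces the series in (7.31) on the diagonal.
gap_731 = abs(omega(J(u), u, alpha, beta) - series_731(u, alpha, beta))
print(gap_fourier, gap_732, gap_731)
```

With the opposite sign convention for $J$ the first two checks still pass, while $\omega_{\alpha\beta}(Ju, u)$ changes sign; this is why the convention is flagged as an assumption.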

Extend $\langle\cdot\,,\cdot\rangle_{\alpha\beta}$ to an inner product on the whole algebra $\operatorname{Vect}(S^1)$ as in (7.28). Let $g_{\alpha\beta}$ be the Riemannian metric obtained by left translation of $\langle\cdot\,,\cdot\rangle_{\alpha\beta}$, and let $h_{\alpha\beta}$ be the metric restricted to $\mathcal{H}$. We apply Theorem 7.3 and deduce that a normal critical curve $\gamma : I \to \operatorname{Diff} S^1$ is the solution to $\kappa^{\ell}(\dot\gamma) = u$,

$$-\alpha J\dot u' + \beta J\dot u''' = -\alpha(uJu'' + 2u'Ju') + \beta(uJu'''' + 2u'Ju''') + 2\lambda u', \qquad \lambda \in \mathbb{R}.$$

Here we used the property (7.32), see also [65]. We conclude that the geodesics for $b_{\alpha\beta}$ can be found by solving the above equation for $\lambda = 0$ and then projecting them to $\mathcal{F}_0$. For $(\alpha, \beta) = (1, 0)$, this is a special case of the modified Constantin–Lax–Majda (CLM) equation. For more information, see [45], where the Riemannian geometry of $g_{1,0}$ is considered.

7.2.4. Controllability on $\operatorname{Diff} S^1$. Before we formulate the main result on controllability, we describe some special subgroups of $\operatorname{Diff} S^1$. We start from subalgebras of $\operatorname{Vect}(S^1)$. For each $n \in \mathbb{Z}$, let us define

$$p_n = \cos n\theta\, \partial_\theta, \qquad k_n = \sin n\theta\, \partial_\theta.$$

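The Lie brackets (computed with the convention $[u\partial_\theta, v\partial_\theta] = (uv' - u'v)\partial_\theta$, for which $[p_1, k_1] = p_0$) are given by

$$[p_m, p_n] = \frac{m+n}{2}\, k_{m-n} + \frac{m-n}{2}\, k_{m+n}, \tag{7.33}$$

$$[k_m, k_n] = \frac{m+n}{2}\, k_{m-n} - \frac{m-n}{2}\, k_{m+n}, \tag{7.34}$$

$$[p_m, k_n] = \frac{m+n}{2}\, p_{m-n} - \frac{m-n}{2}\, p_{m+n}. \tag{7.35}$$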
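The bracket relations for $p_n$ and $k_n$ are elementary trigonometric identities, so they are easy to check pointwise. The sketch below assumes the sign convention $[u\partial_\theta, v\partial_\theta] = (uv' - u'v)\partial_\theta$; with the opposite convention every right-hand side changes sign. The helper names are hypothetical.

```python
import math

def p(n):
    """Coefficient cos(n t) of p_n together with its derivative."""
    return (lambda t: math.cos(n * t), lambda t: -n * math.sin(n * t))

def k(n):
    """Coefficient sin(n t) of k_n together with its derivative."""
    return (lambda t: math.sin(n * t), lambda t: n * math.cos(n * t))

def bracket(u, v):
    """Coefficient of [u d/dt, v d/dt] under the assumed convention u v' - u' v."""
    (u0, u1), (v0, v1) = u, v
    return lambda t: u0(t) * v1(t) - u1(t) * v0(t)

def max_gap(f, g, samples=40):
    """Largest pointwise difference on a small sample of angles."""
    return max(abs(f(0.17 * j) - g(0.17 * j)) for j in range(samples))

def check(m, n):
    a, b = (m + n) / 2, (m - n) / 2
    gap_pp = max_gap(bracket(p(m), p(n)),
                     lambda t: a * k(m - n)[0](t) + b * k(m + n)[0](t))
    gap_kk = max_gap(bracket(k(m), k(n)),
                     lambda t: a * k(m - n)[0](t) - b * k(m + n)[0](t))
    gap_pk = max_gap(bracket(p(m), k(n)),
                     lambda t: a * p(m - n)[0](t) - b * p(m + n)[0](t))
    return max(gap_pp, gap_kk, gap_pk)

worst = max(check(m, n) for m in range(-3, 4) for n in range(-3, 4))
# Special case used in the controllability argument: [p_1, k_1] = p_0.
gap_p1k1 = max_gap(bracket(p(1), k(1)), p(0)[0])
print(worst, gap_p1k1)
```

In particular, for fixed $n$ the brackets of $p_0$, $p_n$, $k_n$ stay inside their span, which is the closure property needed for $\operatorname{span}\{p_0, p_n, k_n\}$ to be a subalgebra.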

It is easy to see from (7.33)–(7.35) that $\mathfrak{h}_n = \operatorname{span}\{p_0, p_n, k_n\}$ are subalgebras of $\operatorname{Vect}(S^1)$, and that $\mathfrak{h}_n$ is isomorphic to $\mathfrak{su}(1, 1)$ for each $n$. To each Lie subalgebra $\mathfrak{h}_n \subset \operatorname{Vect}(S^1)$ there corresponds a subgroup $H_n$ of $\operatorname{Diff} S^1$.

To show that any two points on the groups $\operatorname{Diff} S^1$ or $\mathrm{Vir}$ can be connected by an $\mathcal{H}$- or, respectively, $\mathcal{E}$-horizontal curve, we use the invariance of these horizontal sub-bundles under the corresponding group action. We start from a general result. Assume that a horizontal sub-bundle $\mathcal{H}$ is invariant under the action of some subgroup $K$ of a given group $G$. Then, if the tangent bundle $TK$ is transversal to $\mathcal{H}$, the question of controllability is reduced to the question whether elements of $K$ can be reached from the unity of $G$ by an $\mathcal{H}$-horizontal curve.

Lemma 7. Let $G$ be a Lie group with the Lie algebra $\mathfrak{g}$, and let a left- (or right-) invariant horizontal sub-bundle $\mathcal{H}$ be obtained by left (or right) translations of a subspace $\mathfrak{h} \subseteq \mathfrak{g}$. Assume that there is a subgroup $K$ of $G$ with the Lie algebra $\mathfrak{k}$ such that $\mathfrak{g} = \mathfrak{p} \oplus \mathfrak{k}$ for some $\mathfrak{p} \subseteq \mathfrak{h}$. Suppose also that $\mathfrak{h}$ is $\operatorname{Ad}(K)$-invariant. Then any pair of elements in $G$ can be connected by a smooth $\mathcal{H}$-horizontal curve if and only if for every $a \in K$ there is an $\mathcal{H}$-horizontal smooth curve connecting $1 \in K$ and $a$.

Proof. We present the proof for the case of a left-invariant sub-bundle $\mathcal{H}$. Let $c : [0, 1] \to G$ be any curve (not necessarily horizontal) connecting the points $a_0$ and $a_1$ and having left logarithmic derivative $u$. Using left translation of $c$ by $a_0^{-1}$, we can assume that $a_0 = 1$. Let $\operatorname{pr}_{\mathfrak{k}} : \mathfrak{g} \to \mathfrak{k}$ be the projection with the kernel $\mathfrak{p} \subset \mathfrak{h}$. Consider the projection $k(t) = \operatorname{pr}_{\mathfrak{k}} u(t)$, $t \in [0, 1]$. Let $\vartheta$ be a curve in $K$ with left logarithmic derivative $k$, starting at $1$. Then the left logarithmic derivative of the curve $\vartheta(t)^{-1}$ is $-\operatorname{Ad}_{\vartheta} k$. Let us show that the curve $\gamma_1(t) = c(t) \cdot \vartheta(t)^{-1}$, $t \in [0, 1]$, is $\mathcal{H}$-horizontal. We calculate the left logarithmic derivative of $\gamma_1$ and find

$$\kappa^{\ell}\big(\partial_t (c(t) \cdot \vartheta(t)^{-1})\big) = \operatorname{Ad}_{\vartheta(t)}\big(u(t) - k(t)\big) \in \mathfrak{h},$$

since $u(t) - k(t) \in \mathfrak{p} \subseteq \mathfrak{h}$ and $\mathfrak{h}$ is $\operatorname{Ad}(K)$-invariant. Hence, we have constructed an $\mathcal{H}$-horizontal curve $\gamma_1$ from $1$ to $a_1 \cdot \vartheta(1)^{-1}$. Applying the right translation by $\vartheta(1)$, which keeps the curve $\mathcal{H}$-horizontal because of the $\operatorname{Ad}(K)$-invariance of $\mathfrak{h}$, we get a curve from $\vartheta(1)$ to $a_1$. Moreover, by the hypothesis of the lemma, we can connect $1$ with $\vartheta(1)$ by a smooth horizontal curve $\gamma_2$. Finally, we glue the curves $\gamma_1$ and $\gamma_2$ into one smooth curve by slowing exponentially down to zero speed at the connecting point. □

Theorem 7.4. The following is true.
(a) Let $\mathcal{H}$ be a choice of a horizontal sub-bundle on $\operatorname{Diff} S^1$ defined as in Section 7.2.1. Then any pair of points can be connected by an $\mathcal{H}$-horizontal curve.
(b) Let $\mathcal{E}$ be a choice of a horizontal sub-bundle on $\mathrm{Vir}_{\mu\nu}$ defined as in Section 7.2.1. Then any two points on $\mathrm{Vir}_{\mu\nu}$ can be connected by an $\mathcal{E}$-horizontal curve.

Proof. To prove (a), we have to show that any two points in $\operatorname{Diff} S^1$ can be connected by an $\mathcal{H}$-horizontal curve. Due to Lemma 7, we only need to verify that $\mathrm{id} \in \operatorname{Diff} S^1$ can be connected with any element in $S^1$ by an $\mathcal{H}$-horizontal curve. The subgroup $S^1$ is contained in $H_n$ for any $n$; in particular, $S^1$ can be considered as a subgroup of $H_1$. Any $\mathcal{H}$-horizontal curve in $H_1$ will have left logarithmic derivative in $\mathfrak{h}_1 \cap \operatorname{Vect}_0(S^1) = \operatorname{span}\{k_1, p_1\}$. Since $[p_1, k_1] = p_0$, the horizontal distribution $\mathcal{H}$ restricted to $H_1$ is bracket generating. The group $H_1$ is finite-dimensional, therefore we can apply the Rashevskiĭ–Chow theorem to conclude that every point in $H_1$, including points in $S^1$, can be reached by an $\mathcal{H}$-horizontal curve.
To prove (b), we need to show that any point in $\widehat{S}^1 = \{(\theta \mapsto \theta + a,\, b) \in \mathrm{Vir}_{\mu\nu}\}$ can be connected to $(\mathrm{id}, 0) \in \mathrm{Vir}_{\mu\nu}$ by an $\mathcal{E}$-horizontal curve. Let

$$\widehat{H}_n = \{(\phi, a) \in \mathrm{Vir}_{\mu\nu} : \phi \in H_n,\ a \in \mathbb{R}\},$$

which has Lie algebra

$$\widehat{\mathfrak{h}}_n = \operatorname{span}\{(p_0, 0), (p_n, 0), (k_n, 0), (0, 1)\}.$$

Unfortunately, the sub-bundle $\mathcal{E}$ restricted to the group $\widehat{H}_n$ is not bracket generating. We need to find a smaller subgroup. The Lie algebras $\widehat{\mathfrak{h}}_n$ have special subalgebras

$$\widehat{\mathfrak{t}}_n = \operatorname{span}\{(p_0, n^2\nu - \mu), (p_n, 0), (k_n, 0)\}.$$

Denote the corresponding subgroups by $\widehat{T}_n$. Contrary to what holds on $\widehat{H}_n$, the distribution $\mathcal{E}$ restricted to any subgroup $\widehat{T}_n$ is bracket generating, and so all elements in such a subgroup $\widehat{T}_n$ can be reached by an $\mathcal{E}$-horizontal curve. It is clear that

$$\widehat{T}_n \cap \widehat{S}^1 = \big\{(\theta \mapsto \theta + r,\ r(n^2\nu - \mu)) : r \in \mathbb{R}\big\},$$

where $\widehat{S}^1$ is the subgroup of translations in $\widehat{H}_1$. Since $\widehat{S}^1$ is isomorphic to $\mathbb{R}^2$ as a group and $\nu \neq 0$, we can find unique elements $g_j \in \widehat{T}_j \cap \widehat{S}^1$, $j = 1, 2$, such that $g = g_1 \cdot g_2 = g_2 \cdot g_1$ for any $g \in \widehat{S}^1$. Denote by $c_1$ and $c_2$ curves that connect $(\mathrm{id}, 0) \in \mathrm{Vir}_{\mu\nu}$ with $g_1$ and $g_2$, respectively. We can reach $g$ by first following the curve $c_2$ and then moving to $g$ along the curve $g_2 \circ c_1$, that is, the translation of $c_1$ by $g_2$. This finishes the proof. □

The question of controllability on infinite-dimensional manifolds is very difficult and is not well studied. We mention the book [93], where the smooth calculus on most general complete topological locally convex vector spaces is presented and the theory of infinite-dimensional manifolds is also developed; see also [108, 116] for the study of infinite-dimensional Lie groups. Analogues of the Chow–Rashevskiĭ theorem can be found in [69] for Hilbert manifolds, in [94] for Banach manifolds, and in [79] for manifolds modelled on more general complete topological vector spaces.

8. Appendix A

8.1. Smooth manifolds

Definition 35. A topological space $S$ is second countable if its topology has a countable base, that is, a countable collection $\mathcal{B}$ of open sets such that every open set is a union of some sub-collection of $\mathcal{B}$.

Definition 36. A set $P$ is a submanifold of a smooth manifold $M$ if:
1. $P$ is a smooth manifold,
2. the inclusion map $j : P \to M$ is smooth and at each point $q \in P$ its differential $d_q j : T_q P \to T_{j(q)} M$ is injective.
Some authors require that $P$ is also a topological subspace of $M$.

Definition 37. An immersion $\varphi : M^m \to N^n$ is a smooth map such that the differential $d_q\varphi : T_q M^m \to T_{\varphi(q)} N^n$ is injective for all $q \in M$. It is equivalent to saying that the Jacobi matrix of $d_q\varphi$ has rank $m$ relative to one (hence every) choice of coordinate system.

Definition 38. An embedding $\phi : P \to M$ of a manifold $P$ into a manifold $M$ is
1. an injective immersion, such that
2. the induced map $\tilde\phi : P \to \phi(P) \subset M$ is a homeomorphism onto the subspace $\phi(P)$.

[Figure: three panels labelled "Immersion, but not a submanifold", "Submanifold, but not an embedding" and "Embedding"; the labels $\varphi$, $\phi$, $\alpha$ and $q = \lim_{t \to +\infty} \alpha(t)$ mark the maps and the limit point.]

Figure 8.1. Difference between immersion, embedding and submanifold.

If $P$ is a submanifold of $M$, then the inclusion map $j : P \to M$ is an embedding. Conversely, if $\phi : P \to M$ is an embedding, then this map induces a manifold structure on the image $\phi(P) \subset M$ and the induced map $\tilde\phi : P \to \phi(P)$ is a diffeomorphism. The map $\phi \circ \tilde\phi^{-1} : \phi(P) \to M$ is the inclusion $j : \phi(P) \to M$, which is smooth and whose differential is injective as a composition of two injective differentials $d\phi \circ d(\tilde\phi^{-1})$. We conclude that $\phi(P)$ is a submanifold of $M$. In Figure 8.1 one can see the difference between an embedding and an immersion of the manifold $P = \mathbb{R}$ into the manifold $M = \mathbb{R}^2$.

Definition 39. A submersion $\pi : M \to B$ is a smooth surjective map such that the differential $d_q\pi : T_q M \to T_{\pi(q)} B$ is surjective for all $q \in M$.

Definition 40. Let $(M, g_M)$ and $(N, g_N)$ be two Riemannian manifolds. A diffeomorphism $\iota : M \to N$ is called a Riemannian isometry if

$$g_M(v, w) = g_N\big(d_q\iota(v), d_q\iota(w)\big) \qquad \text{for all } v, w \in T_q M \text{ and all } q \in M.$$

Definition 41. Let $(M, g_M)$ and $(B, g_B)$ be two Riemannian manifolds and let $\pi : M \to B$ be a submersion. Let $T_q M = \ker(d_q\pi) \oplus_{\perp} \big(\ker(d_q\pi)\big)^{\perp}$ be the orthogonal decomposition with respect to $g_M$. If the restriction $d_q\pi|_{(\ker(d_q\pi))^{\perp}} : \big(\ker(d_q\pi)\big)^{\perp} \to T_{\pi(q)} B$ is a linear isometry for any $q \in M$, then the map $\pi$ is called a Riemannian submersion.

Definition 42. A pairing between the tangent and the co-tangent bundle is a map $\langle\cdot\,,\cdot\rangle : T_q M \times T_q^* M \to \mathbb{R}$ which is bi-linear, non-degenerate, and smoothly varying with respect to $q \in M$. It is non-degenerate in the sense that if $\lambda \in T_q^* M$ satisfies $\langle v, \lambda\rangle = 0$ for all $v \in T_q M$, then $\lambda \equiv 0$.

Definition 43. An absolutely continuous curve $c : I \to M$ is an integral curve of a vector field $X \in \operatorname{Vect}(M)$ if it satisfies the differential equation

$$\dot c(t) = X(c(t)) \qquad \text{for almost all } t \in I.$$

A vector field $X$ is called complete if each of its non-extendable integral curves (starting from different points $q \in M$) is defined on $I = \mathbb{R}$. Let us denote by $c_{q,X}$ the integral curve of a complete vector field $X$ starting at $q \in M$. Thus the curve $c_{q,X}$ is a solution of the Cauchy problem

$$\dot c(t) = X(c(t)), \qquad c(0) = q, \qquad t \in \mathbb{R}.$$

Definition 44. The flow of a complete vector field $X$ on a smooth manifold $M$ is the map $\widetilde{X} : \mathbb{R} \times M \to M$ given by $\widetilde{X}(t, q) = c_{q,X}(t)$, where $c_{q,X}$ is the non-extendable integral curve of $X$ starting at $q \in M$.

Proposition 18. The flow $\widetilde{X}$ of a complete vector field satisfies the conditions:
1. $\widetilde{X}(0, \cdot) : M \to M$ is the identity map of $M$,
2. $\widetilde{X}(s + t, \cdot) = \widetilde{X}\big(s, \widetilde{X}(t, \cdot)\big) = \widetilde{X}(s, \cdot) \circ \widetilde{X}(t, \cdot)$ for all $s, t \in \mathbb{R}$. As a corollary we conclude that flows commute for fixed times,
3. the map $\widetilde{X}(s, \cdot) : M \to M$ is a diffeomorphism for any $s \in \mathbb{R}$, where the inverse map is $\widetilde{X}^{-1}(s, \cdot) := \widetilde{X}(-s, \cdot)$.

We need the completeness assumption in order to work with the entire manifold $M$ and not only locally. For arbitrary vector fields one can define a local analogue of the flow.

Now we define a Levi-Civita connection. We start from the definition of the affine connection.

Definition 45. An affine connection $\nabla$ on a smooth manifold $M$ is a map

$$\nabla : \operatorname{Vect}(M) \times \operatorname{Vect}(M) \to \operatorname{Vect}(M), \qquad (X, Y) \mapsto \nabla_X Y,$$

satisfying the following properties:
1. $\nabla_{fX + gY} Z = f\nabla_X Z + g\nabla_Y Z$,
2. $\nabla_X (Y + Z) = \nabla_X Y + \nabla_X Z$,
3. $\nabla_X (fY) = f\nabla_X Y + X(f)Y$,
for all $X, Y, Z \in \operatorname{Vect}(M)$ and $f, g \in C^\infty(M)$.

The notion of the affine connection leads to the definition of the covariant derivative along a given curve. Namely, let an affine connection $\nabla$ be defined on a smooth manifold $M$. Suppose that $c : I \to M$ is a smooth curve and $X : I \to TM$ is a vector field along the curve $c$. Then there exists a unique correspondence which associates to a vector field $X$ another vector field $\frac{DX}{dt}$ along $c$ by the rule

$$\frac{DX}{dt} := \nabla_{\dot c} X(c(t)).$$

The covariant derivative $\frac{D}{dt}$ satisfies the properties

$$\frac{D}{dt}(X + Y) = \frac{DX}{dt} + \frac{DY}{dt}, \qquad \frac{D}{dt}(fX) = \frac{df}{dt}\, X + f\, \frac{D}{dt} X,$$

where $f$ is a smooth function along the curve $c$.

Definition 46. An affine connection $\nabla$ on a smooth manifold $M$ is symmetric if

$$\nabla_X Y - \nabla_Y X = [X, Y] \qquad \text{for all } X, Y \in \operatorname{Vect}(M).$$

Definition 47. Let $(M, g)$ be a Riemannian manifold with an affine connection $\nabla$. We say that the affine connection $\nabla$ is compatible with the Riemannian metric $g$ if

$$X\big(g(Y, Z)\big) = g(\nabla_X Y, Z) + g(Y, \nabla_X Z) \qquad \text{for all } X, Y, Z \in \operatorname{Vect}(M).$$

The following theorem asserts that the presence of a Riemannian metric guarantees the existence and uniqueness of an affine connection that is compatible with the Riemannian metric and symmetric.

Theorem 8.1 ([23, 117]). Given a Riemannian manifold $(M, g)$ there is a unique affine connection $\nabla$, called the Levi-Civita connection, such that $\nabla$ is symmetric and compatible with the metric $g$.

8.2. Symplectic manifolds

Definition 48. A non-degenerate skew-symmetric real-valued 2-form $\Omega$ is called a symplectic form. The pair $(N, \Omega)$, where $N$ is a smooth manifold and $\Omega$ is a symplectic form, is called a symplectic manifold. In some literature it is also required that $\Omega$ is a closed form.

Definition 49. Let $(N, \Omega)$ be a symplectic manifold and $H \in C^\infty(N)$ a function. Then the Hamiltonian vector field $\vec{H}$ associated with $H$ is defined by

$$\Omega(X, \vec{H}) := dH(X) \qquad \text{for all } X \in \operatorname{Vect}(N).$$

The Poisson bracket of functions $H, K \in C^\infty(N)$ is the directional derivative of one function in the direction of the Hamiltonian vector field associated with the other function. Namely,

$$\{H, K\} := dK(\vec{H}) = \Omega(\vec{H}, \vec{K}).$$

As an example, consider a smooth manifold $M$ and its co-tangent bundle $T^*M$. Recall that if $\operatorname{pr}^*_M : T^*M \to M$ is the canonical projection, then $d(\operatorname{pr}^*_M) : T(T^*M) \to TM$. We use the notation $\langle\cdot\,,\cdot\rangle$ to denote the pairing between $TM$ and $T^*M$. Define a real-valued one-form $\omega : T(T^*M) \to \mathbb{R}$ on the manifold $N = T^*M$ by

$$\omega_{(q,\lambda)}(v) := \big\langle d(\operatorname{pr}^*_M)(v), \lambda\big\rangle, \qquad v \in T_{(q,\lambda)}(T^*M).$$

Then the 2-form $\Omega = d\omega$ is a symplectic form. Verify it!

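If we choose a chart $\big(U, \varphi = (x^1, \dots, x^n)\big)$ on $M$, then it induces the chart $\big(T^*U, \Phi = (x^1, \dots, x^n, \lambda_1, \dots, \lambda_n)\big)$ on $T^*M$. The canonical projection $\operatorname{pr}^*_M$ and its differential take the matrix form

$$\operatorname{pr}^*_M = \begin{pmatrix} I_{n\times n} & 0 \\ 0 & 0 \end{pmatrix} \quad\Longrightarrow\quad d\operatorname{pr}^*_M = \begin{pmatrix} I_{n\times n} & 0 \\ 0 & 0 \end{pmatrix}.$$

Therefore, the one-form $\omega$ at a point $(q, \lambda) = (x^1, \dots, x^n, \lambda_1, \dots, \lambda_n)$ is written as $\omega_{(q,\lambda)} = \sum_{j=1}^n \lambda_j\, dx^j$. The symplectic form $\Omega$ becomes $\Omega = \sum_{j=1}^n d\lambda_j \wedge dx^j$. The Hamiltonian vector field is

$$\vec{H}(q, \lambda) = \sum_{j=1}^n \Big(\frac{\partial H}{\partial\lambda_j}\frac{\partial}{\partial x^j} - \frac{\partial H}{\partial x^j}\frac{\partial}{\partial\lambda_j}\Big)$$

and the Poisson bracket is

$$\{H, K\} = \sum_{j=1}^n \Big(\frac{\partial H}{\partial\lambda_j}\frac{\partial K}{\partial x^j} - \frac{\partial H}{\partial x^j}\frac{\partial K}{\partial\lambda_j}\Big).$$

To each vector field $X \in \operatorname{Vect}(M)$ we associate the function $H_X : T^*M \to \mathbb{R}$ by $H_X(q, \lambda) = \langle X(q), \lambda_q\rangle$. Then one can associate the Hamiltonian vector field $\vec{H}_X$ to the function $H_X$ in the following way. If $X(q) = \sum_{k=1}^n X^k(x^1, \dots, x^n)\,\partial_k$ and $\lambda_q = \sum_{j=1}^n \lambda_j(x^1, \dots, x^n)\, dx^j$, then $H_X(q, \lambda) = \sum_{j=1}^n \lambda_j X^j(q)$ and

$$\vec{H}_X(q, \lambda) = \sum_{j=1}^n \Big( X^j(q)\frac{\partial}{\partial x^j} - \Big(\frac{\partial}{\partial x^j}\sum_{k=1}^n \lambda_k X^k(q)\Big)\frac{\partial}{\partial\lambda_j}\Big).$$

Now it is obvious that

$$d(\operatorname{pr}^*_M)\big(\vec{H}_X(q, \lambda)\big) = X(q) \qquad \text{for all } (q, \lambda) \in T^*M.$$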

Corollary 9. Geodesics produced by the Hamiltonian function $H_X$ coincide with integral curves of the vector field $X$.

If $X \in \operatorname{Vect}(M)$, then the vector field $\vec{H}_X$ is called the Hamiltonian lift of $X$.

Exercises
1. Let $(N, \Omega)$ be a symplectic manifold. Verify that $(C^\infty(N), \{\cdot\,,\cdot\})$ is a Lie algebra and the map $H \mapsto \vec{H}$ is a Lie algebra homomorphism $(C^\infty(N), \{\cdot\,,\cdot\}) \longrightarrow (\operatorname{Vect}(N), [\cdot\,,\cdot])$.
2. Let $M$ be a smooth manifold. Show that $\{H_X, H_Y\} = H_{[X,Y]}$ for all $X, Y \in \operatorname{Vect}(M)$. Conclude that the map $X \mapsto H_X$ produces a homomorphism of Lie algebras $(\operatorname{Vect}(M), [\cdot\,,\cdot]) \longrightarrow (C^\infty(T^*M), \{\cdot\,,\cdot\})$.
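As a numerical sanity check of the identity in Exercise 2 (not a proof), one can take $M = \mathbb{R}^n$ and linear vector fields $X(q) = Aq$, $Y(q) = Bq$, for which $H_X(q, \lambda) = \langle Aq, \lambda\rangle$. The sketch below evaluates the coordinate formula for the Poisson bracket and compares it with $H_{[X,Y]}$. It assumes the convention $[X, Y] = DY \cdot X - DX \cdot Y$, which for linear fields gives the matrix $BA - AB$; the helper names are hypothetical.

```python
def matvec(A, x):
    return [sum(a * b for a, b in zip(row, x)) for row in A]

def transpose(A):
    return [list(col) for col in zip(*A)]

def matmul(A, B):
    return [[sum(A[i][k] * B[k][j] for k in range(len(B))) for j in range(len(B[0]))]
            for i in range(len(A))]

def dot(x, y):
    return sum(a * b for a, b in zip(x, y))

def H(A, q, lam):
    """Hamiltonian H_X(q, lam) = <X(q), lam> for the linear field X(q) = A q."""
    return dot(matvec(A, q), lam)

def poisson(A, B, q, lam):
    """{H_X, H_Y} from the coordinate formula, using
    dH_X/dlam = A q, dH_X/dx = A^T lam, and likewise for H_Y."""
    return (dot(matvec(A, q), matvec(transpose(B), lam))
            - dot(matvec(transpose(A), lam), matvec(B, q)))

def lie_bracket(A, B):
    """Matrix of [X, Y] = DY X - DX Y for X = A q, Y = B q (assumed convention): B A - A B."""
    BA, AB = matmul(B, A), matmul(A, B)
    return [[BA[i][j] - AB[i][j] for j in range(len(A))] for i in range(len(A))]

A = [[0.0, 1.0], [-2.0, 0.5]]
B = [[1.0, 3.0], [0.0, -1.0]]
q, lam = [0.3, -1.2], [2.0, 0.7]

gap = abs(poisson(A, B, q, lam) - H(lie_bracket(A, B), q, lam))
print(gap)
```

The linear case only illustrates the statement; the exercise itself asks for arbitrary smooth vector fields.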


8.3. Lie groups

The content of this subsection can be found in [42, 87, 108, 131].

Definition 50. A Lie group $G$ is a smooth (finite-dimensional) manifold $M$ and a group such that the operations of multiplication

$$\rho : M \times M \to M, \qquad (\tau, q) \mapsto \tau q,$$

and inversion

$$\mathrm{in} : M \to M, \qquad x \mapsto x^{-1},$$

are smooth maps between the corresponding smooth manifolds: $M \times M \to M$ and $M \to M$, respectively.

It is customary to use the letter $G$ to denote the underlying manifold $M$ and the pair $(M, \rho)$ in the case of Lie groups.

Definition 51. A Lie algebra $\mathfrak{g}$ over $\mathbb{R}$ ($\mathbb{C}$) is a real (complex) vector space $V$ together with an operation $[\cdot\,,\cdot] : V \times V \to V$ (called the bracket, commutator, or Lie product) satisfying the following three axioms:
1. skew symmetry: $[X, Y] = -[Y, X]$,
2. bi-linearity: $[aX + bY, Z] = a[X, Z] + b[Y, Z]$, $a, b \in \mathbb{R}$ ($\mathbb{C}$) (and the same with respect to the second argument),
3. Jacobi identity: $[[X, Y], Z] + [[Z, X], Y] + [[Y, Z], X] = 0$,
for any $X, Y, Z \in V$.

We will also use the letters $\mathfrak{g}, \mathfrak{h}, \dots$ to denote the underlying vector spaces $V, U, \dots$ in the case of Lie algebras.

Example 10.
1. The general linear group $GL(n, \mathbb{R}) = GL(n)$ is the Lie group of all $(n \times n)$-matrices $L$ with real entries such that $\det L \neq 0$. Since the determinant function $\det : \mathbb{R}^{n^2} \to \mathbb{R}$ is smooth, the underlying manifold of $GL(n)$ is an open subset of $\mathbb{R}^{n^2}$ defined as the complement of the inverse image under "det" of the value $0 \in \mathbb{R}$. The group multiplication is the multiplication of matrices. The corresponding Lie algebra $\mathfrak{gl}(n, \mathbb{R}) = \mathfrak{gl}(n)$ is formed by all $(n \times n)$-matrices and is isomorphic to $\mathbb{R}^{n^2}$ as a vector space. The commutator in $\mathfrak{gl}(n)$ is the commutator of two matrices. The group $GL(n)$ is the group of all non-degenerate linear transformations of $\mathbb{R}^n$. In a similar way the group $GL(n, \mathbb{C})$ can be defined.
2. The orthogonal group $O(n, \mathbb{R}) = O(n)$ is the subgroup of $GL(n)$ of matrices $L$ such that $L^{\mathrm{tr}} L = L L^{\mathrm{tr}} = \mathrm{Id}$, where $L^{\mathrm{tr}}$ is the transpose of $L \in O(n)$. Verify that in this case $\det^2 L = 1$. The smooth underlying manifold for $O(n)$ is the level set of the function "det" inside of $\mathbb{R}^{n^2}$ and it consists of two connected components corresponding to the values $1$ and $-1$ of "det". The special orthogonal

group $SO(n, \mathbb{R}) = SO(n)$ is the subset of $O(n)$ whose matrices have determinant 1 and the underlying manifold is the connected component containing the identity matrix. Both groups have the same Lie algebra $\mathfrak{o}(n)$ consisting of $(n \times n)$-matrices that are skew-symmetric: $L^{\mathrm{tr}} = -L$. The main feature of these groups is that under their transformations the Euclidean inner product in $\mathbb{R}^n$ is preserved. (Why?)
3. The unitary group $U(n)$ is the group of $(n \times n)$-matrices with complex entries such that $\bar{L}^{\mathrm{tr}} L = L \bar{L}^{\mathrm{tr}} = \mathrm{Id}$, where $\bar{L}^{\mathrm{tr}}$ is the transpose and conjugate matrix to $L \in U(n)$. The special unitary group $SU(n)$ is the subset of $U(n)$ whose matrices have determinant 1. The Lie algebra $\mathfrak{u}(n)$ is the set of $(n \times n)$-matrices that are skew-Hermitian: $\bar{L}^{\mathrm{tr}} = -L$. The Lie algebra $\mathfrak{su}(n)$, $n \geq 2$, is the subset of $\mathfrak{u}(n)$ of matrices with vanishing trace. The unitary and special unitary groups preserve the Hermitian product $(z, w) = \sum_{k=1}^n \bar z_k w_k$ in $\mathbb{C}^n$.
4. The symplectic group $Sp(n)$ is the group of $(n \times n)$-matrices with quaternion entries such that $\bar{L}^{\mathrm{tr}} L = L \bar{L}^{\mathrm{tr}} = \mathrm{Id}$, where $\bar{L}^{\mathrm{tr}}$ is the transpose and quaternion conjugate matrix to $L$. The Lie algebra $\mathfrak{sp}(n)$ is the set of $(n \times n)$-matrices that are skew-Hermitian: $\bar{L}^{\mathrm{tr}} = -L$. Symplectic groups preserve the Hermitian product in the $n$-dimensional quaternionic space $\mathbb{Q}^n$.
5. The special Euclidean group $SE(n)$, or the group of rigid motions in $\mathbb{R}^n$, is the group consisting of rotations and translations in $\mathbb{R}^n$. An element $\tau \in SE(n)$ is usually written as a pair $\tau = (A, a)$, where $A \in SO(n)$ and $a$ is an $n$-dimensional vector. The multiplication is given by $\tau\upsilon = (A, a)(B, b) := (AB, Ab + a)$. Thus the group $SE(n)$ is the group of all orientation-preserving isometries of the Euclidean space. Its dimension is $\frac{n(n+1)}{2}$, where $n$ stands for translations and $\frac{n(n-1)}{2}$ is the dimension of $SO(n)$.

We define the exponential map and list its properties. Let $(\mathbb{R}, +)$ be the additive group of real numbers and $\mathfrak{r}$ be the corresponding Lie algebra with generator $\frac{d}{dr}$. Let $G$ be a Lie group, $\mathfrak{g}$ be its Lie algebra, and $X \in \mathfrak{g}$ be an arbitrary element. Then the map

$$h : \mathfrak{r} \to \mathfrak{g}, \qquad t\frac{d}{dr} \mapsto tX,$$

is a homomorphism of the Lie algebra $\mathfrak{r}$ into the Lie algebra $\mathfrak{g}$. The theorems of Lie group theory [87, 131] ensure that, due to the simple connectedness of $\mathbb{R}$, there is a unique Lie group homomorphism $c_X$ such that

$$c_X : \mathbb{R} \to G, \quad\text{and}\quad dc_X = h, \quad\text{or}\quad dc_X\Big(t\frac{d}{dr}\Big) = tX.$$

In other words, the curve $c_X : \mathbb{R} \to G$ is a one-parametric subgroup of $G$ and it is such that $c_X(0) = e$ and $\dot c_X(0) = X$. The curve $c_X(t)$, $t \in \mathbb{R}$, is called the exponential curve and it is often denoted by $\exp(tX)$, $t \in \mathbb{R}$. The map

$$\exp : \mathfrak{g} \to G, \qquad X \mapsto \exp(X)$$

is called the exponential map. We list the properties of the exponential map in the following theorem.

Theorem 8.2 ([87, 131]). Let $X$ belong to the Lie algebra $\mathfrak{g}$ of a finite-dimensional Lie group $G$. Then the following properties hold.
1. The exponential curve $\exp(tX) = c_X(t)$ for each $t \in \mathbb{R}$ satisfies
$$\frac{d}{dt}\Big|_{t=0} \exp(tX) = \frac{d}{dt}\Big|_{t=0} c_X(t) = \dot c_X(0) = X$$
(see also the definition of the exponential curve in Subsection 3.1).
2. $\exp\big((t_1 + t_2)X\big) = \exp(t_1 X)\exp(t_2 X)$ for all $t_1, t_2 \in \mathbb{R}$.
3. $\exp(-tX) = \big(\exp(tX)\big)^{-1}$ for each $t \in \mathbb{R}$.
4. The map $\exp : \mathfrak{g} \to G$ is a $C^\infty$-map between two manifolds.
5. The differential at the zero vector of the exponential map $d_0\exp : T_0\mathfrak{g} \to T_e G$ is the identity map $\mathfrak{g} \to \mathfrak{g}$, where we identify elements of $\mathfrak{g}$ with $T_0\mathfrak{g}$ and with $T_e G$. An important corollary is that $\exp$ gives a diffeomorphism between a neighborhood of $0 \in \mathfrak{g}$ and a neighborhood of $e \in G$.
6. The left translation of $c_X$ by $\tau \in G$ given by $\tilde c(t) = l_\tau(c_X(t)) = \tau c_X(t)$ is the unique integral curve of the left invariant vector field $\widehat{X}$ ($\widehat{X}(e) = X$) such that it starts at the point $\tau$: $\tilde c(0) = \tau$. As a particular consequence, left invariant vector fields are always complete.
7. The one-parametric flow of diffeomorphisms $G \to G$ associated with a left invariant vector field $\widehat{X}$ ($\widehat{X}(e) = X$) is given by $(t, \tau) \mapsto \tau c_X(t) = r_{c_X(t)}\tau$, where $r_{c_X(t)}$ is the right translation by $c_X(t)$.
8. In the neighborhoods of $0 \in \mathfrak{g}$ and $e \in G$ where $\exp$ is a diffeomorphism, the inverse map is defined and is called the logarithm. It expresses the product of two exponentials through the Baker–Campbell–Hausdorff formula [123], whose first terms are given as follows:
$$\exp(X)\exp(Y) = \exp\Big(X + Y + \frac{1}{2}[X, Y] + \frac{1}{12}[X, [X, Y]] - \frac{1}{12}[Y, [X, Y]] + \cdots\Big). \tag{8.1}$$
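The Baker–Campbell–Hausdorff formula (8.1) can be seen at work in a case where the series terminates. The sketch below is an illustration under an assumption not made in the text: it uses strictly upper triangular $3 \times 3$ matrices (the Heisenberg algebra), where all brackets of depth two vanish, so $\exp(X)\exp(Y) = \exp\big(X + Y + \frac{1}{2}[X, Y]\big)$ holds exactly and the matrix exponential is the finite sum $I + N + N^2/2$. All function names are hypothetical helpers.

```python
def matmul(A, B):
    return [[sum(A[i][k] * B[k][j] for k in range(3)) for j in range(3)] for i in range(3)]

def lin(a, A, b, B):
    """Linear combination a*A + b*B of 3x3 matrices."""
    return [[a * A[i][j] + b * B[i][j] for j in range(3)] for i in range(3)]

def bracket(A, B):
    """Matrix commutator [A, B] = AB - BA."""
    return lin(1.0, matmul(A, B), -1.0, matmul(B, A))

def exp_nilpotent(N):
    """exp(N) = I + N + N^2/2, valid because N^3 = 0 for strictly upper triangular 3x3 N."""
    identity = [[1.0 if i == j else 0.0 for j in range(3)] for i in range(3)]
    return lin(1.0, lin(1.0, identity, 1.0, N), 0.5, matmul(N, N))

def heis(a, b, c):
    """Element of the Heisenberg Lie algebra."""
    return [[0.0, a, c], [0.0, 0.0, b], [0.0, 0.0, 0.0]]

def max_entry(A):
    return max(abs(x) for row in A for x in row)

X = heis(1.0, 2.0, -0.5)
Y = heis(-3.0, 0.25, 4.0)

lhs = matmul(exp_nilpotent(X), exp_nilpotent(Y))
Z = lin(1.0, lin(1.0, X, 1.0, Y), 0.5, bracket(X, Y))
rhs = exp_nilpotent(Z)

bch_gap = max_entry(lin(1.0, lhs, -1.0, rhs))
# The third-order terms of (8.1) vanish identically in this algebra.
nested = max(max_entry(bracket(X, bracket(X, Y))), max_entry(bracket(Y, bracket(X, Y))))
print(bch_gap, nested)
```

For a general Lie algebra the higher-order terms in (8.1) do not vanish, and the truncated formula is only an approximation near the identity.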

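It is useful to keep in mind the following diagram defining the exponential map as the time-one value of the exponential curve. For chosen $X \in \mathfrak{g}$,

$$\begin{array}{ccc}
t\frac{d}{dr} \in \mathfrak{r} & \xrightarrow{\;h = dc_X\;} & \mathfrak{g} \ni tX \\
\Big\downarrow{\scriptstyle \mathrm{Id}} & & \Big\downarrow{\scriptstyle \exp} \\
t \in \mathbb{R} & \xrightarrow{\;\;c_X\;\;} & G \ni c_X(t) = \exp(tX)
\end{array}
\qquad\Longrightarrow\qquad X \mapsto \exp X = c_X(1).$$

The straight line $tX \in \mathfrak{g}$ is mapped to the one-parametric subgroup $c_X(t) = \exp(tX) \in G$.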

Definition 52. A subgroup $N$ of a group $G$ is called a normal subgroup if it is invariant under conjugation; that is, for each element $n \in N$ and each $\tau \in G$ the element $\tau n\tau^{-1} \in N$.

Definition 53. A group $G$ is called simple if it is a non-trivial group and its only normal subgroups are the trivial subgroup and the group itself. A group that is not simple can be decomposed into two smaller groups, a normal subgroup and the corresponding quotient group, and the process can be repeated.

8.3.1. Action of Lie groups on manifolds. Let $M$ be a smooth manifold and let $G$ be a Lie group.

Definition 54. An action of $G$ on $M$ on the left is a smooth map $\mu : G \times M \to M$ such that

$$\mu(\varsigma\tau, q) = \mu(\varsigma, \mu(\tau, q)), \qquad \mu(e, q) = q,$$

for all $\varsigma, \tau \in G$ and $q \in M$.

If $\mu : G \times M \to M$ is an action of $G$ on $M$ on the left, then for a fixed $\varsigma \in G$ the map $q \mapsto \mu(\varsigma, q)$ is a diffeomorphism of $M$, which we will denote by $\mu_\varsigma$. Similarly we define a right action.

Definition 55. An action of $G$ on $M$ on the right is a smooth map $\mu : M \times G \to M$ such that

$$\mu(q, \varsigma\tau) = \mu(\mu(q, \varsigma), \tau), \qquad \mu(q, e) = q,$$

for all $\varsigma, \tau \in G$ and $q \in M$.

We also use the notation $\tau.q$ instead of $\mu(\tau, q)$ for the left action and $q.\tau$ instead of $\mu(q, \tau)$ for the right action.

Definition 56. We say that a Lie group $G$ acts freely on the right on a smooth manifold $M$ if for all $q \in M$, $q.\tau = q.\varsigma$ if and only if $\tau = \varsigma$. Equivalently: if there exists $q \in M$ such that $q.\tau = q$ (that is, if $\tau$ has at least one fixed point), then $\tau$ is the identity element $e \in G$.

Definition 57. We say that $G$ acts transitively on the right on $M$ if for any $q, p \in M$ there is $\tau \in G$ such that $q.\tau = p$.

The same definitions can be given for the left action of the group $G$ on a manifold $M$.

Example 11.
1. The flow on $M$ defined in Definition 44 is an example of the action of the group $(\mathbb{R}, +)$ on a smooth manifold $M$.
2. The action $\mu : GL(n) \times \mathbb{R}^n \to \mathbb{R}^n$ of the general linear group in $\mathbb{R}^n$ is defined as a product of an $(n \times n)$-matrix by an $n$-vector written as a column (or $(n \times 1)$-matrix).

3. Let $S^{n-1} = \{x \in \mathbb{R}^n \mid \|x\|_E = 1\}$. The action $\mu : O(n) \times S^{n-1} \to S^{n-1}$ is defined as a product of an $(n \times n)$-matrix from $O(n)$ by a vector from $\mathbb{R}^n$ of length one.
4. The multiplication law in any Lie group $G$ produces two actions on itself: left and right translations. Recall that the action
$$l : G \times G \to G, \qquad (\tau, \upsilon) \mapsto \tau\upsilon,$$
or $l(\tau, \upsilon) = l_\tau(\upsilon) = \tau.\upsilon := \tau\upsilon$, is the left translation and
$$r : G \times G \to G, \qquad (\upsilon, \tau) \mapsto \upsilon\tau,$$
or $r(\upsilon, \tau) = r_\tau(\upsilon) = \upsilon.\tau := \upsilon\tau$, is the right translation.
5. Define a left action $a$ of a Lie group $G$ on itself by
$$a : G \times G \to G, \qquad (\tau, \upsilon) \mapsto \tau\upsilon\tau^{-1}, \tag{8.2}$$
or $a(\tau, \upsilon) = a_\tau(\upsilon) = \tau.\upsilon := \tau\upsilon\tau^{-1}$. This action is called the action by conjugation or the inner automorphism. This action produces other very interesting actions of $G$ on its Lie algebra $\mathfrak{g}$ and even an action of the Lie algebra $\mathfrak{g}$ over itself, see Example 14.
6. Define a left action of a group $G$ on its tangent bundle $TG$ by
$$\mu : G \times TG \to TG, \qquad \tau.(q, v_q) \mapsto \big(\tau q, d_q l_\tau(v_q)\big). \tag{8.3}$$
The right action of $G$ on $TG$ is defined analogously.
7. Define the left action of a group $G$ on its co-tangent bundle $T^*G$ by
$$\mu : G \times T^*G \to T^*G, \qquad \tau.(q, \omega_q) \mapsto \big(\tau q, (d_q l_\tau)^*(\omega_q)\big), \tag{8.4}$$
where $(d_q l_\tau)^*$ is the dual map to the differential $d_q l_\tau$. The right action of $G$ on $T^*G$ is defined analogously.
8. The adjoint action of a Lie group $G$ on its Lie algebra $\mathfrak{g}$ is defined by
$$\mu : G \times \mathfrak{g} \to \mathfrak{g}, \qquad (\tau, \xi) \mapsto \operatorname{Ad}_\tau(\xi). \tag{8.5}$$
The definition of the adjoint map $\operatorname{Ad}_\tau : \mathfrak{g} \to \mathfrak{g}$ is given in Example 13. The adjoint action uses the notion of the action $a$ by conjugation (8.2), which is a left action. Therefore, the action $\operatorname{Ad}$ is an action on the left.
9. The co-adjoint action of a Lie group $G$ on the dual $\mathfrak{g}^*$ of its Lie algebra is defined by
$$\mu : G \times \mathfrak{g}^* \to \mathfrak{g}^*, \qquad (\tau, \omega) \mapsto \operatorname{Ad}^*_\tau(\omega). \tag{8.6}$$
See the definition of the co-adjoint map $\operatorname{Ad}^*_\tau : \mathfrak{g}^* \to \mathfrak{g}^*$ in (8.9).

10. An action on the left of the special Euclidean group $SE(n)$ on $\mathbb{R}^n$ is defined by
$$\mu : SE(n) \times \mathbb{R}^n \to \mathbb{R}^n, \qquad (A, a).x \mapsto Ax + a.$$

To proceed with examples of the action of a Lie group $G$ on its Lie algebra $\mathfrak{g}$ (the underlying manifold is just the vector space $T_e G$) we make some observations about the differential map at the identity of an action $\mu$ on $G$. Let $\mu : G \times G \to G$. Fix one variable $\tau \in G$ and consider $\mu_\tau : G \to G$ as a diffeomorphism of the group $G$. Then the differential at $e \in G$ is the linear map

$$d_e\mu_\tau : T_e G \to T_{\mu_\tau(e)} G = T_\tau G.$$

Let us consider three examples of an action $\mu$: the translations $r$, $l$, and the action by conjugation $a$.

Example 12. Let $\mu$ be the right translation $r : G \times G \to G$, $r_\tau(q) = q.\tau$. We want to understand how $q.\tau$ changes with respect to the variable $\tau$ near $\tau = e$. This variation is called the infinitesimal generator of the right action $r_\tau(q)$ at the point $q$ and is denoted by $\sigma_q$. To calculate $\sigma_q$ we observe that the map $q.\tau$ with fixed $q$ and variable $\tau$ is just the left translation $q.\tau = l_q(\tau)$, so

$$\sigma_q = d_e l_q : T_e G \to T_{l_q(e)} G = T_q G,$$

or, as it is customary to write, $\sigma_q : \mathfrak{g} \to T_q G$. We conclude that the map $\sigma_q$ generates a left invariant vector field on $G$, since it translates an element $\xi$ of the Lie algebra $\mathfrak{g}$ to a vector field $X_q = d_e l_q(\xi)$ that will be left invariant by definition. In practice the map $\sigma_q$ is calculated by making use of the exponential curve as follows:

$$\sigma_q(\xi) = \frac{d}{d\varepsilon}\Big|_{\varepsilon=0} q\exp(\varepsilon\xi) = \frac{d}{d\varepsilon}\Big|_{\varepsilon=0} l_q(\exp(\varepsilon\xi)) = dl_q\Big(\frac{d}{d\varepsilon}\Big|_{\varepsilon=0} \exp(\varepsilon\xi)\Big) = dl_q(\xi). \tag{8.7}$$

Analogously, the left translation $l$ has its infinitesimal generator, that is, the map $\mathfrak{g} \to T_q G$ generating right invariant vector fields. The formula corresponding to (8.7) is

$$\sigma_q(\xi) = \frac{d}{d\varepsilon}\Big|_{\varepsilon=0} \exp(\varepsilon\xi)q = \frac{d}{d\varepsilon}\Big|_{\varepsilon=0} r_q(\exp(\varepsilon\xi)) = dr_q(\xi). \tag{8.8}$$

Example 13. Now we consider $\mu = a$,

$$a(\tau, q) = a_\tau(q) = \tau q\tau^{-1} : G \to G,$$

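which is not only a diffeomorphic map of the underlying manifold of the group $G$, but it also preserves the group structure, so the map $a_\tau$ is a group automorphism and we write $a_\tau \in \operatorname{Aut}(G)$. The differential of $a_\tau$ at the identity is

$$d_e a_\tau : T_e G \to T_{a_\tau(e)} G = T_e G.$$

From general group theory [87, 123, 131] it is known that $d_e a_\tau$ preserves the Lie algebra structure of the vector space $T_e G$ (since $a_\tau$ preserves the Lie group structure). So we can write $d_e a_\tau : \mathfrak{g} \to \mathfrak{g}$ and conclude that it is an automorphism of Lie algebras. It is denoted by $\operatorname{Ad}_\tau := d_e a_\tau$ and is called the adjoint map at $\tau \in G$. Thus $\operatorname{Ad}_\tau(\xi) \in \mathfrak{g}$ for any $\xi \in \mathfrak{g}$, or

$$\operatorname{Ad}_\tau : \mathfrak{g} \to \mathfrak{g}, \qquad \operatorname{Ad}_\tau \in \operatorname{Aut}(\mathfrak{g}).$$

Now let the variable $\tau$ vary and consider the adjoint map $\operatorname{Ad} : G \to \operatorname{Aut}(\mathfrak{g})$ as a homomorphism of groups, where to the product in $G$ there corresponds the composition of linear maps in $\operatorname{Aut}(\mathfrak{g})$. This map is also called the adjoint representation of the group $G$ on its Lie algebra $\mathfrak{g}$, or the adjoint action of the group $G$ on its Lie algebra $\mathfrak{g}$.

Let the adjoint map $\operatorname{Ad}_\tau : \mathfrak{g} \to \mathfrak{g}$ at $\tau \in G$ be given, and let $\langle\cdot\,,\cdot\rangle$ be the pairing between the Lie algebra $\mathfrak{g}$ and its dual $\mathfrak{g}^*$. The co-adjoint map or the dual representation $\operatorname{Ad}^*_\tau : \mathfrak{g}^* \to \mathfrak{g}^*$ at $\tau \in G$ is defined by

$$\langle \operatorname{Ad}^*_\tau(\omega), \xi\rangle := \langle \omega, \operatorname{Ad}_{\tau^{-1}}(\xi)\rangle, \qquad \xi \in \mathfrak{g},\ \omega \in \mathfrak{g}^*. \tag{8.9}$$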

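Example 14. Since the map $\mathrm{Ad}$ sends $e\in G$ to $\mathrm{Id}\in\mathrm{Aut}(\mathfrak g)$, the differential $d_e\mathrm{Ad}$ at $e\in G$ is the linear map
$$d_e\mathrm{Ad}\colon T_eG\to T_{\mathrm{Id}}\mathrm{Aut}(\mathfrak g)$$
and, moreover, it is a homomorphism of the Lie algebra $\mathfrak g=(T_eG,[\cdot\,,\cdot])$ and the Lie algebra $\mathrm{End}(\mathfrak g)=(T_{\mathrm{Id}}\mathrm{Aut}(\mathfrak g),[\cdot\,,\cdot])$ of all linear transformations of $\mathfrak g$, where the Lie brackets are defined through the composition of linear maps from $\mathrm{End}(\mathfrak g)$. The map $d_e\mathrm{Ad}$ is denoted by $\mathrm{ad}$ and is called the adjoint representation of the Lie algebra $\mathfrak g$ over itself, or the adjoint action of the Lie algebra $\mathfrak g$ over itself. The construction is reflected in the following commutative diagram:
$$\begin{array}{ccc}
G & \xrightarrow{\ \mathrm{Ad}\ } & \mathrm{Aut}(\mathfrak g)\\
{\scriptstyle\exp_G}\big\uparrow & & \big\uparrow{\scriptstyle\exp}\\
\mathfrak g & \xrightarrow{\ \mathrm{ad}\ } & \mathrm{End}(\mathfrak g)
\end{array}$$
or, in other words,
$$\mathrm{Ad}_{\exp_G(\xi)}=\exp(\mathrm{ad}_\xi),\qquad \xi\in\mathfrak g,\quad \exp_G(\xi)\in G.$$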


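Notice that $\exp_G$ on the left-hand side is the exponential map from the Lie algebra $\mathfrak g$ to its Lie group $G$. The map $\exp$ on the right-hand side is the exponential map from the Lie algebra $\mathrm{End}(\mathfrak g)$ to its Lie group $\mathrm{Aut}(\mathfrak g)$. An interesting feature of the map $\mathrm{ad}\colon\mathfrak g\to\mathrm{End}(\mathfrak g)$ is the following. If we fix $\xi\in\mathfrak g$, then $\mathrm{ad}_\xi\colon\mathfrak g\to\mathfrak g$ is the map given by $\mathrm{ad}_\xi\eta=[\xi,\eta]$, where $[\cdot\,,\cdot]$ is the Lie bracket on $\mathfrak g$, see [87, 131].

Notice here also the relation between the adjoint action $\mathrm{Ad}_\tau$ of the group $G$ on its Lie algebra $\mathfrak g$, the action $a_\tau$ on the group $G$ by conjugation, and the exponential map $\exp_G$, reflected in the following commutative diagram:
$$\begin{array}{ccc}
G & \xrightarrow{\ a_\tau\ } & G\\
{\scriptstyle\exp_G}\big\uparrow & & \big\uparrow{\scriptstyle\exp_G}\\
\mathfrak g & \xrightarrow{\ \mathrm{Ad}_\tau\ } & \mathfrak g
\end{array}$$
or, in other words,
$$\tau\exp(\xi)\tau^{-1}=\exp(\mathrm{Ad}_\tau\xi),\qquad \xi\in\mathfrak g,\quad \tau\in G.$$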
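For matrix Lie groups both commutative diagrams can be checked numerically. The following is a minimal sketch, assuming $G=SO(3)$ realized by $3\times3$ matrices, so that $\mathrm{Ad}_\tau\xi=\tau\xi\tau^{-1}$ and $\mathrm{ad}_\xi\eta=\xi\eta-\eta\xi$. The helper names (`expm`, `bracket`, `hat`) and the truncated Taylor series used for the exponential are illustrative choices and do not come from the text.

```python
# Numerical check of  tau exp(xi) tau^{-1} = exp(Ad_tau xi)  and
# Ad_{exp(xi)} = exp(ad_xi)  for the matrix group SO(3) (an assumed example).

def matmul(a, b):
    return [[sum(a[i][k] * b[k][j] for k in range(len(b)))
             for j in range(len(b[0]))] for i in range(len(a))]

def add(a, b):
    return [[x + y for x, y in zip(ra, rb)] for ra, rb in zip(a, b)]

def scale(c, a):
    return [[c * x for x in row] for row in a]

def identity(n):
    return [[1.0 if i == j else 0.0 for j in range(n)] for i in range(n)]

def transpose(a):
    return [list(col) for col in zip(*a)]

def expm(a, terms=40):
    """Matrix exponential by a truncated Taylor series (adequate for small matrices)."""
    result, term = identity(len(a)), identity(len(a))
    for k in range(1, terms):
        term = scale(1.0 / k, matmul(term, a))
        result = add(result, term)
    return result

def bracket(a, b):
    """Commutator [a, b] = ab - ba, the Lie bracket of a matrix Lie algebra."""
    return add(matmul(a, b), scale(-1.0, matmul(b, a)))

def max_diff(a, b):
    return max(abs(x - y) for ra, rb in zip(a, b) for x, y in zip(ra, rb))

def hat(w):
    """Skew-symmetric matrix in so(3) built from a vector w in R^3."""
    return [[0.0, -w[2], w[1]], [w[2], 0.0, -w[0]], [-w[1], w[0], 0.0]]

xi, eta = hat([0.3, -0.5, 0.2]), hat([-0.4, 0.1, 0.6])
tau = expm(hat([0.7, 0.2, -0.3]))   # an element of SO(3)
tau_inv = transpose(tau)            # the inverse of a rotation is its transpose

# Conjugation on G corresponds to Ad_tau on the Lie algebra.
Ad_tau_xi = matmul(matmul(tau, xi), tau_inv)
err_conj = max_diff(matmul(matmul(tau, expm(xi)), tau_inv), expm(Ad_tau_xi))

# Ad_{exp(xi)} eta equals exp(ad_xi) eta = sum_k ad_xi^k(eta) / k!.
g = expm(xi)
lhs = matmul(matmul(g, eta), transpose(g))
rhs, term = eta, eta
for k in range(1, 40):
    term = scale(1.0 / k, bracket(xi, term))
    rhs = add(rhs, term)
err_ad = max_diff(lhs, rhs)

print(err_conj, err_ad)
```

Both printed errors should be at the level of floating-point round-off. The check is only numerical evidence for one group and a few sample elements, not a proof.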

Example 15 ([81]). The co-adjoint action of a diffeomorphism $f\in\mathrm{Diff}\,S^1$ on the dual $\mathfrak{vir}^*$ of the Virasoro algebra is defined by the following formula:
$$\mathrm{Ad}^*\colon \mathrm{Diff}\,S^1\times\mathfrak{vir}^*\to\mathfrak{vir}^*,\qquad f.\bigl(u(\theta)(d\theta)^2,\,a\bigr)\mapsto\bigl(u(f)\cdot(f')^2(d\theta)^2+a\,S(f)(d\theta)^2,\ a\bigr),$$
where
$$S(f)=\frac{f'f'''-\tfrac32(f'')^2}{(f')^2}$$
is the so-called Schwarzian derivative of the diffeomorphism $f$.
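The Schwarzian derivative is easy to evaluate once the first three derivatives of $f$ are known. The sketch below is an illustration only: the two sample maps (a Möbius map, on which the Schwarzian is known to vanish, and the circle map $\theta\mapsto\theta+\varepsilon\sin\theta$ with $|\varepsilon|<1$) and all function names are assumptions made for the example, not taken from the text.

```python
import math

def schwarzian(d1, d2, d3):
    """S(f) = (f' f''' - (3/2) (f'')^2) / (f')^2 from the first three derivatives."""
    return (d1 * d3 - 1.5 * d2 ** 2) / d1 ** 2

def mobius_derivatives(a, b, c, d, x):
    """First three derivatives of f(x) = (a x + b) / (c x + d)."""
    det, den = a * d - b * c, c * x + d
    return det / den ** 2, -2 * c * det / den ** 3, 6 * c ** 2 * det / den ** 4

def circle_map_derivatives(eps, theta):
    """First three derivatives of f(theta) = theta + eps * sin(theta), |eps| < 1."""
    return 1 + eps * math.cos(theta), -eps * math.sin(theta), -eps * math.cos(theta)

s_mobius = schwarzian(*mobius_derivatives(2.0, 1.0, 1.0, 3.0, 0.4))
s_circle = schwarzian(*circle_map_derivatives(0.5, 0.0))
print(s_mobius, s_circle)
```

For the Möbius map the result is zero up to round-off, while for the perturbed rotation the term $a\,S(f)$ in the co-adjoint action is genuinely non-trivial.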

The following interesting observation concerns the tangent bundle of a group $G$. The action, right or left, of a group $G$ on itself induces an action on the tangent bundle $TG$, making the latter into a group.

Proposition 19 ([108]). The product map induces a smooth associative map
$$\rho\colon T(G\times G)\cong TG\times TG\to TG$$
that makes the tangent bundle of a Lie group $G$ into a Lie group $TG$.

Proof. Let $(\tau,v_\tau)$ and $(g,v_g)$ be two points on $TG$. We define the multiplication law $\rho\colon TG\times TG\to TG$ by
$$\rho\bigl((\tau,v_\tau),(g,v_g)\bigr)=(\tau g,\,v_\tau\star v_g)=(\tau g,\,v_{\tau g}),$$
where the vector $v_{\tau g}\in T_{\tau g}G$ is obtained in the following way. Consider smooth curves
$$\gamma_{\tau,v_\tau}\colon[-1,1]\to G,\qquad \gamma_{\tau,v_\tau}(0)=\tau,\qquad \dot\gamma_{\tau,v_\tau}(0)=v_\tau,$$


and
$$\gamma_{g,v_g}\colon[-1,1]\to G,\qquad \gamma_{g,v_g}(0)=g,\qquad \dot\gamma_{g,v_g}(0)=v_g.$$
Then the product $\gamma_{\tau,v_\tau}(t)\star\gamma_{g,v_g}(t)$ in $G$ is defined for any $t\in[-1,1]$ and defines a curve
$$\gamma_{\tau g,v_{\tau g}}\colon[-1,1]\to G,\qquad \gamma_{\tau g,v_{\tau g}}(0)=\tau g,\qquad \dot\gamma_{\tau g,v_{\tau g}}(0)=v_{\tau g}:=v_\tau\star v_g.$$
So the product $v_\tau\star v_g$ is the initial velocity vector of the product curve $\gamma_{\tau g,v_{\tau g}}$ obtained by multiplication of $\gamma_{\tau,v_\tau}$ by $\gamma_{g,v_g}$. $\square$

The natural projection $\mathrm{pr}\colon TG\to G$ induces a group homomorphism $\mathrm{pr}\colon(TG,\rho)\to(G,\rho)$. The kernel of this homomorphism is isomorphic to the abelian additive group of tangent vectors at the identity of the group, $T_eG$. In other words, there is a short exact sequence
$$0\longrightarrow T_eG\xrightarrow{\ \sigma\ }TG\xrightarrow{\ \mathrm{pr}\ }G\longrightarrow1$$
of smooth group homomorphisms. Here $\sigma$ is the infinitesimal generator of the right action of the group $G$ on itself that associates a left invariant vector field to any vector $\xi\in T_eG$.

Let us define a map $z\colon G\to TG$ that to any element $\tau$ associates $(\tau,\vec0_\tau)\in T_\tau G$, i.e., the null section at the tangent space $T_\tau G$. Then the composition $G\xrightarrow{\ z\ }TG\xrightarrow{\ \mathrm{pr}\ }G$ is the identity map on $G$. This means that the group $TG$ can be decomposed as a semi-direct product $TG=T_eG\rtimes G$, where $T_eG$ is a normal subgroup of $TG$: $(e,\xi)\in T_eG\subset TG$, and $G$ is a subgroup of $TG$: $G\ni\tau\overset{z}{\mapsto}(\tau,\vec0_\tau)\in TG$. The semi-direct product can be written by making use of a homomorphism $h\colon G\to\mathrm{Aut}(T_eG)$ [87] by
$$(\xi,\tau)(\eta,g)\mapsto(\xi\,h_\tau(\eta),\ \tau g),$$
where $h_\tau\colon T_eG\to T_eG$ is the action of the subgroup $G\subset TG$ over the normal subgroup $T_eG\subset TG$ [87]. Since the elements of $T_eG$ can be considered as elements of a subgroup $T_eG$ of the group $TG$ and they are also elements of $\mathfrak g$, it can be shown that $\mathrm{Ad}$ is the suitable homomorphism. Thus the semi-direct product can be written as
$$(\xi,\tau)(\eta,g)\mapsto(\xi\,\mathrm{Ad}_\tau(\eta),\ \tau g),$$
where the product of $\xi$ by $\mathrm{Ad}_\tau(\eta)$ is considered as the product in the normal subgroup of the group $TG$.


Definition 58. Let $G$ be a group and $A$, $B$ be two sets on which the group $G$ acts. A map $F\colon A\to B$ is said to be equivariant, if
$$\begin{array}{rcll}
F(\tau.q) &=& \tau.F(q) & \text{for left-left action,}\\
F(q.\tau) &=& F(q).\tau & \text{for right-right action,}\\
F(q.\tau) &=& \tau^{-1}.F(q) & \text{for right-left action,}\\
F(\tau.q) &=& F(q).\tau^{-1} & \text{for left-right action,}
\end{array}$$
for all $\tau\in G$ and all $q\in A$.

The definition says that an equivariant map is a map that commutes with the action of the group in the domain of definition and on the target space. As we can see, the definition depends on whether right or left action is chosen on the domain of definition and on the target space of $F$. Equivariant maps are also known as $G$-maps or $G$-homomorphisms.

Exercises

1. We present here one more point of view on the map $\mathrm{ad}$. Let us fix any vector $\xi\in\mathfrak g$ and see the difference between $\xi$ and the result of the adjoint action $\mathrm{Ad}$ of $G$ on $\xi\in\mathfrak g$: $\mathrm{Ad}_\tau\xi-\xi$. Then the differential of the map $G\ni\tau\mapsto\mathrm{Ad}_\tau\xi-\xi\in\mathfrak g$ at the identity $\tau=e$ is denoted by $\mathrm{ad}_\xi$ and it is a linear map $\mathrm{ad}_\xi\colon T_eG\to T_e\mathfrak g$. After the identifications $T_eG\cong T_e\mathfrak g\cong\mathfrak g$ and proving the correspondence of Lie brackets, we come to the previous definition of $\mathrm{ad}_\xi\colon\mathfrak g\to\mathfrak g$. Show that $\mathrm{ad}_\xi(\eta)$ is bilinear and satisfies $\mathrm{ad}_\xi(\eta)=-\mathrm{ad}_\eta(\xi)$.
2. Verify that the pair $(TG,\rho)$ from Proposition 19 satisfies the definition of a Lie group.
3. Show that $T_eG$ is a normal subgroup of $TG$ and $G$ is the complementary subgroup.
4. Show that $\mathrm{ad}_\xi(\eta)=[\xi,\eta]$.
5. Check that the dual representation $\mathrm{Ad}^*\colon G\to\mathrm{Aut}(\mathfrak g^*)$ from (8.9) is a group homomorphism.
6. Define the co-adjoint map $\mathrm{ad}^*_\xi\colon\mathfrak g^*\to\mathfrak g^*$ by $\langle\mathrm{ad}^*_\xi\omega,\eta\rangle=-\langle\omega,\mathrm{ad}_\xi\eta\rangle$, $\xi,\eta\in\mathfrak g$, $\omega\in\mathfrak g^*$. Verify that the co-adjoint map $\mathrm{ad}^*\colon\mathfrak g\to\mathrm{End}(\mathfrak g^*)$ is an algebra homomorphism.

8.4. Complexifications

Here we present definitions of complexifications of real vector spaces, complex and CR-structures on manifolds and Lie groups, including the infinite-dimensional case.

8.4.1. Complexification of a real vector space. A complexification of a real vector space $V$ is the tensor product $V\otimes\mathbb C$ over $\mathbb R$, where the generators are $v\otimes1$ and $v\otimes i$, $v\in V$. So, $V\otimes\mathbb C$ consists of all possible linear combinations of $v\otimes1$ and $v\otimes i$,


$v\in V$, with real coefficients, modulo the equivalence relations
$$(v_1+v_2)\otimes z\sim v_1\otimes z+v_2\otimes z,\qquad v\otimes(z_1+z_2)\sim v\otimes z_1+v\otimes z_2,\qquad av\otimes z\sim v\otimes az,\quad a\in\mathbb R.$$
The real dimension of $V\otimes\mathbb C$ is $2\dim V$. The multiplication by complex numbers is defined by
$$\alpha(v\otimes z)=v\otimes\alpha z,\qquad\text{for }\alpha,z\in\mathbb C\text{ and }v\in V.$$
It makes the space $V\otimes\mathbb C$ into a complex vector space of complex dimension $\dim V$. The generators for the complex vector space $V\otimes\mathbb C$ are $v\otimes1$ and $v\otimes i$. The real space $V$ is naturally embedded into $V\otimes\mathbb C$ by identifying $V$ with the space $V\otimes1$ (any element $v\in V$ is identified with the element $v\otimes1\in V\otimes\mathbb C$). The conjugation for $V\otimes\mathbb C$ is defined by $\overline{v\otimes z}:=v\otimes\bar z$.

As an application we consider a complexification of a smooth real manifold $M$ of real dimension $n$. For any $q\in M$, the complex vector space $T_qM\otimes\mathbb C$ is called the complexified tangent space and $T_q^*M\otimes\mathbb C$ is called the complexified co-tangent space. The complex space $T_q^*M\otimes\mathbb C$ can also be regarded as the complex dual space of $T_qM\otimes\mathbb C$ by defining the pairing
$$\langle v\otimes z,\ \xi\otimes w\rangle:=\langle v,\xi\rangle\,zw,\qquad\text{for }v\in T_qM,\ \xi\in T_q^*M,\ z,w\in\mathbb C,$$
for any point $q\in M$. The complexified tangent bundle is $T^{\mathbb C}M=\bigcup_{q\in M}\bigl(T_qM\otimes\mathbb C\bigr)$ and the complexified co-tangent bundle is $T^{*\mathbb C}M=\bigcup_{q\in M}\bigl(T_q^*M\otimes\mathbb C\bigr)$. A complexified vector field $L$ on $M$ is a smooth section of $T^{\mathbb C}M$, which means that $L$ assigns to each $q\in M$ a vector $L_q\in T_qM\otimes\mathbb C$. In any smooth coordinate system $\bigl(U,\varphi=(x^1,\dots,x^n)\bigr)$ we can express $L$ as
$$L_q=\sum_{j=1}^nL^j(q)\,\partial_{x^j},$$
where $L^j$, $j=1,\dots,n$, are smooth, complex-valued functions defined on $U\subset M$.

If $M$ is a complex manifold of complex dimension $n$, then it is important to distinguish between the real tangent bundle and the complexified tangent bundle. The real tangent bundle $TM$ corresponds to a smooth manifold $M$ of real dimension $2n$. Its fiber $T_qM$ is a real vector space and has real dimension $2n$. The fiber $T_qM\otimes\mathbb C$ of the complexified tangent bundle is a complex space of complex dimension $2n$.

8.4.2. Complex structures. If the real vector space $V$ is of even dimension, then it is possible to define an almost complex structure $J$, that is, a map $J\colon V\to V$ such that $J^2=-\mathrm{Id}_V$.

Example 16. Let $V=T_q\mathbb R^{2n}\cong\mathbb C^n$. Take the coordinates $q=(x^1,y^1,\dots,x^n,y^n)$. The standard almost complex structure for $T_q\mathbb R^{2n}$ is defined by setting
$$J(\partial_{x^j})=\partial_{y^j},\qquad J(\partial_{y^j})=-\partial_{x^j},\qquad j=1,\dots,n,\tag{8.10}$$


on the standard basis. Then $J$ extends by linearity to all of $T_q\mathbb R^{2n}$. This almost complex structure is designed to simulate the multiplication by $i=\sqrt{-1}$. The standard almost complex structure $J^*$ on the co-tangent space $T_q^*\mathbb R^{2n}$ is the following:
$$J^*(dx^j)=-dy^j,\qquad J^*(dy^j)=dx^j,\qquad j=1,\dots,n.$$
An almost complex structure can be defined on a real tangent space of a complex manifold $M$ by pushing forward the complex structure from $\mathbb C^n$ to $M$ via a coordinate chart. For $q\in M$ and a holomorphic chart $(U,\zeta)$, $\zeta\colon U\to\mathbb C^n$, we define $J_q\colon T_qM\to T_qM$ by
$$J_q(L):=d_{\zeta(q)}\zeta^{-1}\bigl(J(d_q\zeta(L))\bigr),\tag{8.11}$$
where $J$ on the right-hand side is the standard almost complex structure in $\mathbb C^n$. The definition implies that if $\zeta=(z_1,\dots,z_n)$, $z_j=x^j+iy^j$, then $J_q(\partial_{x^j})=\partial_{y^j}$ and $J_q(\partial_{y^j})=-\partial_{x^j}$.

If $J$ is an almost complex structure on a real vector space $V$, then we can extend it to an almost complex structure $J_{\mathbb C}$ on the complexification $V\otimes\mathbb C$ by setting $J_{\mathbb C}(v\otimes z):=J(v)\otimes z$, $v\in V$, $z\in\mathbb C$. Then
$$J_{\mathbb C}(\bar w)=\overline{J_{\mathbb C}w},\qquad\text{for }w\in V\otimes\mathbb C.\tag{8.12}$$
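The standard structure (8.10) and the property (8.12) can be tested on concrete vectors. The following is a small sketch, assuming $n=2$ and the coordinate order $(x^1,y^1,x^2,y^2)$; the function names are illustrative. Since $J_{\mathbb C}$ is the complex-linear extension of $J$, the same function serves for real and for complex components. The last check uses the elementary computation $J_{\mathbb C}(v-iJv)=Jv+iv=i(v-iJv)$.

```python
def apply_J(v):
    """Standard almost complex structure in coordinates (x1, y1, ..., xn, yn):
    J(d/dx_j) = d/dy_j and J(d/dy_j) = -d/dx_j.  With complex components the
    same formula gives the complex-linear extension J_C."""
    out = []
    for j in range(0, len(v), 2):
        a, b = v[j], v[j + 1]   # components along d/dx_j and d/dy_j
        out += [-b, a]
    return out

def conj(w):
    return [complex(c).conjugate() for c in w]

def close(u, w, tol=1e-12):
    return all(abs(a - b) < tol for a, b in zip(u, w))

v = [1.0, 2.0, -3.0, 0.5]          # a real tangent vector in R^4
w = [1 + 2j, -1j, 0.5, 3 - 1j]     # an element of the complexification

squares_to_minus_id = close(apply_J(apply_J(v)), [-c for c in v])
commutes_with_conjugation = close(apply_J(conj(w)), conj(apply_J(w)))

h = [a - 1j * b for a, b in zip(v, apply_J(v))]   # h = v - iJv
h_is_i_eigenvector = close(apply_J(h), [1j * c for c in h])

print(squares_to_minus_id, commutes_with_conjugation, h_is_i_eigenvector)
```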

The linear map $J_{\mathbb C}$ has two eigenvalues $i$ and $-i$, since $J_{\mathbb C}^2=-\mathrm{Id}_{V\otimes\mathbb C}$. The corresponding eigenspaces are denoted by $V^{(1,0)}$ and $V^{(0,1)}$. Thus we have
$$V\otimes\mathbb C=V^{(1,0)}\oplus V^{(0,1)}$$
from linear algebra. The property (8.12) implies $\overline{V^{(1,0)}}=V^{(0,1)}$. Let us construct bases for $V^{(1,0)}$ and $V^{(0,1)}$. First we observe that $v$ and $Jv$ are linearly independent over $\mathbb R$ in $V$, since $J$ has no real eigenvalues. Then
$$\{v_1-iJv_1,\dots,v_n-iJv_n\}\tag{8.13}$$
is a basis for the complex $n$-dimensional vector space $V^{(1,0)}$ and
$$\{v_1+iJv_1,\dots,v_n+iJv_n\}\tag{8.14}$$
is a basis for the complex $n$-dimensional vector space $V^{(0,1)}$. Recall that $\dim V=2n$.

Let us see how it works for a complex $n$-dimensional manifold $M$. Let $(z_1,\dots,z_n)$ with $z_j=x^j+iy^j$ be a set of local holomorphic coordinates, and let the almost complex structure on $T_qM$, $q\in M$, be given by (8.11). Define the vector fields
$$\partial_{z_j}=\frac12\bigl(\partial_{x^j}-i\partial_{y^j}\bigr),\qquad \partial_{\bar z_j}=\frac12\bigl(\partial_{x^j}+i\partial_{y^j}\bigr),\qquad j=1,\dots,n.$$

Then in view of the above discussions, a basis for $T_q^{(1,0)}M$ is given by $\{\partial_{z_1},\dots,\partial_{z_n}\}$ and a basis for $T_q^{(0,1)}M$ is given by $\{\partial_{\bar z_1},\dots,\partial_{\bar z_n}\}$. Due to the form of the bases the spaces $T_q^{(1,0)}M$ and $T_q^{(0,1)}M$ received the names holomorphic and antiholomorphic tangent vector spaces. The Hermitian inner product on $T_qM\otimes\mathbb C$ is defined by declaring that $\{\partial_{z_1},\dots,\partial_{z_n},\partial_{\bar z_1},\dots,\partial_{\bar z_n}\}$ is an orthonormal basis.

Let $M$ now be a real manifold such that at each $q\in M$ the tangent space $T_qM$ admits an almost complex structure $J_q\colon T_qM\to T_qM$. Then it leads to the splitting $T^{\mathbb C}M=T^{(1,0)}M\oplus T^{(0,1)}M$ into the holomorphic and antiholomorphic bundles, and each of them is naturally isomorphic to the real tangent bundle of $M$, but now they are equipped with an additional structure $J_q$. If $T^{(1,0)}M$ is integrable, that is, $[T^{(1,0)}M,T^{(1,0)}M]\subset T^{(1,0)}M$, then the pair $(M,T^{(1,0)}M)$ is called a complex manifold.

8.4.3. Lie groups, Lie algebras and complexification. Let us impose a Lie algebra structure on $V$ and see how one can define a complexification $\mathfrak g\otimes\mathbb C$ of the Lie algebra $\mathfrak g=(V,[\cdot\,,\cdot])$. All that we need is to define the Lie bracket
$$[v\otimes\alpha,\ u\otimes\beta]:=[v,u]\otimes\alpha\beta,\qquad v,u\in\mathfrak g,\ \alpha,\beta\in\mathbb C.\tag{8.15}$$
Next we consider the relation between the almost complex structure and the Lie algebra structure. Let $G$ be a Lie group and $\mathfrak g$ be its Lie algebra. Let $J\colon T_eG\to T_eG$ be an almost complex structure. It determines the splitting $T_eG\otimes\mathbb C=\mathfrak g\otimes\mathbb C=\mathfrak g^{(1,0)}\oplus\mathfrak g^{(0,1)}$. If the subspace $\mathfrak g^{(1,0)}$ is a Lie subalgebra of $\mathfrak g\otimes\mathbb C$, then the pair $(G,\mathfrak g^{(1,0)})$ is called a left invariant complex structure. This structure is also right invariant if $\mathfrak g^{(1,0)}$ is invariant with respect to the adjoint action of the group $G$, that is, $\mathrm{Ad}_\tau\xi\in\mathfrak g^{(1,0)}$ for all $\xi\in\mathfrak g^{(1,0)}$ and $\tau\in G$.

We define now a CR-structure on a real manifold $N$. We follow a scheme that is suitable for finite- and infinite-dimensional manifolds. Let $TN$ be the tangent bundle and $D$ be a co-rank 1 smooth sub-bundle, where an almost complex structure $J_q\colon D_q\to D_q$ is defined. Let $TN\otimes\mathbb C$ be the complexified tangent bundle; then $D\otimes\mathbb C$ is a complex co-rank 1 smooth sub-bundle, where the splitting $D\otimes\mathbb C=D^{(1,0)}\oplus D^{(0,1)}$ is defined. If $[D^{(1,0)},D^{(1,0)}]\subset D^{(1,0)}$, then $D^{(1,0)}$ is called an integrable CR-structure and the pair $(N,D^{(1,0)})$ is an integrable CR-manifold.

Example 17. Let $N$ be a real hypersurface in a complex manifold $(M,T^{(1,0)}M)$. Define $D^{(1,0)}=T^{(1,0)}M|_N\cap(TN\otimes\mathbb C)$. Then $(N,D^{(1,0)})$ is a CR-manifold with the CR-structure $D^{(1,0)}$ inherited from the complex manifold $(M,T^{(1,0)}M)$. The manifolds $S^3\subset\mathbb C^2$ and the boundary of the Siegel upper half-space in $\mathbb C^2$ have CR-structures induced from $\mathbb C^2$.

A CR-manifold is strongly pseudo-convex if $[L,\bar L]_q\notin D_q^{(1,0)}\oplus D_q^{(0,1)}$ for any local non-vanishing section $L$ of $D^{(1,0)}$.

If we have a Lie group $G$ with a Lie algebra $\mathfrak g$, then a left invariant CR-structure is defined by a splitting $\mathfrak h^{(1,0)}\oplus\mathfrak h^{(0,1)}$ of a complex co-rank 1 subspace of $\mathfrak g\otimes\mathbb C$ with subalgebras $\mathfrak h^{(1,0)}$ and $\mathfrak h^{(0,1)}=\overline{\mathfrak h^{(1,0)}}$. This structure is strongly pseudo-convex if $[\xi,\bar\xi]\notin\mathfrak h^{(1,0)}\oplus\mathfrak h^{(0,1)}$ holds for any non-zero $\xi\in\mathfrak h^{(1,0)}$.


Exercises

1. Show that the standard almost complex structure (8.10) in $T_q\mathbb R^{2n}$ is an isometry in $\mathbb R^{2n}$.
2. Show that the description of $J$ given in (8.11) does not depend on the choice of coordinate chart. Conclude that the push forward of the standard almost complex structure $J$ from $\mathbb C^n$ to a complex manifold is well defined.
3. Prove (8.12).
4. Show that $v-iJv\in V^{(1,0)}$ and $v+iJv\in V^{(0,1)}$ for any $v\in V$.
5. Prove that (8.13) and (8.14) are linearly independent systems.
6. Find the dual basis for $\{\partial_{z_1},\dots,\partial_{z_n},\partial_{\bar z_1},\dots,\partial_{\bar z_n}\}$ with respect to the standard Hermitian product.
7. Verify that the Lie bracket defined by (8.15) is $\mathbb C$-linear, skew symmetric and satisfies the Jacobi identity.

8.5. Fiber bundles

Definition 59. A fiber bundle is a collection $(F,E,B,\pi)$, where in general $F$, $E$, $B$ are topological spaces and $\pi$ is a continuous map $\pi\colon E\to B$. It can also be written as
$$F\longrightarrow E\xrightarrow{\ \pi\ }B\qquad\text{or shortly}\qquad E\xrightarrow{\ \pi\ }B.$$
It is required that for any $x\in E$ there is an open neighborhood $U\subset B$ of $\pi(x)$ (which will be called a trivializing neighborhood) such that $\pi^{-1}(U)$ is homeomorphic to the product space $F\times U$, in such a way that $\pi$ carries over to the projection onto the second factor. In other words, the following diagram should commute:
$$\begin{array}{ccc}
\pi^{-1}(U) & \xrightarrow{\ \phi\ } & F\times U\\
 & {\scriptstyle\pi}\searrow\quad\swarrow{\scriptstyle\mathrm{pr}_2} & \\
 & U &
\end{array}$$
where the map $\mathrm{pr}_2\colon F\times U\to U$ is the natural projection on the second coordinate and $\phi\colon\pi^{-1}(U)\to F\times U$ is a homeomorphism. The set of all $(U_i,\phi_i)$ is called a local trivialization of the bundle. Thus for any $b\in B$, the pre-image $\pi^{-1}(b)$ is homeomorphic to $F$ and is called the fiber over $b$. The set $B$ is called the base space, $E$ is the total space and the map $\pi$ is the projection map. The projection $\pi\colon E\to B$ of every fiber bundle is an open map, since the projection $\mathrm{pr}_2$ is an open map. Therefore, $B$ carries the quotient topology determined by the map $\pi$.

Definition 60. A bundle $\pi'\colon E'\to B'$ is a sub-bundle of the bundle $\pi\colon E\to B$, provided $E'$ is a subspace of $E$, $B'$ is a subspace of $B$ and $\pi'=\pi|_{E'}\colon E'\to B'$.

If we speak about a smooth fiber bundle, we require that $F$, $E$, $B$ are smooth manifolds and all other maps are smooth. A smooth sub-bundle $(F',E',B,\pi|_{E'})$ of a bundle $(F,E,B,\pi)$ is a smooth bundle such that the inclusions
$$j_F\colon F'\to F,\qquad j_E\colon E'\to E$$
are smooth.


Definition 61. Let $\pi'\colon E'\to B$ and $\pi\colon E\to B$ be two bundles over the same base space $B$. A bundle map (bundle morphism) $u\colon\bigl(\pi'\colon E'\to B\bigr)\to\bigl(\pi\colon E\to B\bigr)$ is a map $u\colon E'\to E$ such that $\pi'=\pi u$. The last equality is the requirement that the following diagram commutes:
$$\begin{array}{ccc}
E' & \xrightarrow{\ u\ } & E\\
 & {\scriptstyle\pi'}\searrow\quad\swarrow{\scriptstyle\pi} & \\
 & B &
\end{array}$$

Definition 62. The fiber product (direct sum or Whitney sum) of two bundles $\pi'\colon E'\to B$ and $\pi\colon E\to B$ over $B$ is the bundle $\Pi\colon E'\oplus E\to B$, where
$$E'\oplus E=\{(q',q)\in E'\times E\mid\pi'(q')=\pi(q)\}\qquad\text{and}\qquad\Pi(q',q)=\pi'(q')=\pi(q).$$
The fiber $\Pi^{-1}(b)$ of $\Pi\colon E'\oplus E\to B$ over $b\in B$ is $\pi'^{-1}(b)\times\pi^{-1}(b)\subset E'\times E$.

8.5.1. Frame bundle. Let $(M,g_M)$ be an $n$-dimensional oriented Riemannian manifold. The frame bundle $\pi\colon F\to M$ is a fiber bundle whose total space $F$ consists of collections $(q,v_1,\dots,v_n)\in M\times(T_qM)^n$ such that $g_M(v_i,v_j)=\delta_{ij}$. In an element $(q,v_1,\dots,v_n)$ the vectors $(v_1,\dots,v_n)$ form an orthonormal basis of $T_qM$. Sections of the frame bundle are called orthonormal frame fields, and they are just assignments to any point $q\in M$ of some orthonormal basis $(v_1,\dots,v_n)$ of $T_qM$. If $SO(T_qM)$ is the group of orientation preserving isometries of the vector space $T_qM$, then there is a natural left action of $SO(T_qM)$ on the frame bundle $F$ given by
$$\mu\colon SO(T_qM)\times F\to F,\qquad \tau.(q,v_1,\dots,v_n)\mapsto(q,\tau.(v_1,\dots,v_n)),$$
where $\tau.(v_1,\dots,v_n)$ is just an isometric transformation of the basis $(v_1,\dots,v_n)$. If $\tau$ is written as a matrix in the basis $(v_1,\dots,v_n)$, then it is a product of an $(n\times n)$-matrix by an $(n\times1)$-column.

It is possible to think of the frame field (or just a frame) as a linear isomorphism $f_q\colon\mathbb R^n\to T_qM$ that assigns to any standard basis element $e_j=(0,\dots,0,1,0,\dots,0)$ of $\mathbb R^n$, with $1$ in the $j$th place, the component $v_j$ of $(v_1,\dots,v_n)$. The map $f_q$ belongs to the space $SO(\mathbb R^n,T_qM)$ of all isometric transformations from $\mathbb R^n$, with the standard Euclidean metric, to the vector space $T_qM$, endowed with some inner product. In this case it is possible to define the right action of the group $SO(n)$ on $F$ by
$$\mu\colon F\times SO(n)\to F,\qquad (q,f).\tau\mapsto(q,f.\tau),$$
where $f.\tau=f\circ\tau=f(\tau)$ is the composition of the isometry $\tau$ of $\mathbb R^n$ followed by the map $f_q$. It gives the principal $SO(n)$-bundle structure for the frame bundle $F$. Notice that there is no natural left action of $SO(n)$ on the fiber over $q\in M$, but only the action of $SO(T_qM)$. The group $SO(T_qM)$ is not canonically isomorphic to $SO(n)$ when $n\ge3$.
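The right action $f.\tau=f\circ\tau$ can be illustrated in coordinates. The sketch below is a toy model under stated assumptions: $T_qM$ is identified with $\mathbb R^3$ carrying the Euclidean inner product, a frame is stored as the matrix whose columns are $v_1,v_2,v_3$, and all function names are hypothetical. It checks that $f.\tau$ is again an orthonormal frame and that $(f.\tau).\sigma=f.(\tau\sigma)$, as required of a right action.

```python
import math

def matmul(a, b):
    return [[sum(a[i][k] * b[k][j] for k in range(len(b)))
             for j in range(len(b[0]))] for i in range(len(a))]

def transpose(a):
    return [list(col) for col in zip(*a)]

def rot_z(t):
    return [[math.cos(t), -math.sin(t), 0.0],
            [math.sin(t), math.cos(t), 0.0],
            [0.0, 0.0, 1.0]]

def rot_x(t):
    return [[1.0, 0.0, 0.0],
            [0.0, math.cos(t), -math.sin(t)],
            [0.0, math.sin(t), math.cos(t)]]

def is_orthonormal(frame, tol=1e-12):
    """Columns v_1, ..., v_n satisfy <v_i, v_j> = delta_ij."""
    gram = matmul(transpose(frame), frame)
    n = len(gram)
    return all(abs(gram[i][j] - (1.0 if i == j else 0.0)) < tol
               for i in range(n) for j in range(n))

def act_right(frame, tau):
    """Right action f.tau = f o tau: first the isometry tau of R^n, then f."""
    return matmul(frame, tau)

def max_diff(a, b):
    return max(abs(x - y) for ra, rb in zip(a, b) for x, y in zip(ra, rb))

f = matmul(rot_x(0.4), rot_z(1.1))    # some orthonormal frame at q
tau, sigma = rot_z(0.3), rot_x(-0.8)  # elements of SO(3)

frame_stays_orthonormal = is_orthonormal(act_right(f, tau))
action_defect = max_diff(act_right(act_right(f, tau), sigma),
                         act_right(f, matmul(tau, sigma)))
print(frame_stays_orthonormal, action_defect)
```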


9. Appendix B

Table 3 represents products of unit octonions that were used in the construction of the octonion H-type group H17.

$$\begin{array}{c|ccccccc}
 & j_1 & j_2 & j_3 & j_4 & j_5 & j_6 & j_7\\ \hline
j_1 & -1 & -j_3 & j_2 & -j_5 & j_4 & j_7 & -j_6\\
j_2 & j_3 & -1 & -j_1 & -j_6 & -j_7 & j_4 & j_5\\
j_3 & -j_2 & j_1 & -1 & -j_7 & j_6 & -j_5 & j_4\\
j_4 & j_5 & j_6 & j_7 & -1 & -j_1 & -j_2 & -j_3\\
j_5 & -j_4 & j_7 & -j_6 & j_1 & -1 & j_7 & -j_6\\
j_6 & -j_7 & -j_4 & j_5 & j_2 & -j_7 & -1 & j_5\\
j_7 & j_6 & -j_5 & -j_4 & j_3 & j_6 & -j_5 & -1
\end{array}$$

Table 3. Multiplication table of unit octonions $j_m$.

The precise forms of the matrices $J_m$ for the product in the octonion H-type group H17 are given below.

$$J_1=\begin{bmatrix}
0&1&0&0&0&0&0&0\\
-1&0&0&0&0&0&0&0\\
0&0&0&1&0&0&0&0\\
0&0&-1&0&0&0&0&0\\
0&0&0&0&0&1&0&0\\
0&0&0&0&-1&0&0&0\\
0&0&0&0&0&0&0&-1\\
0&0&0&0&0&0&1&0
\end{bmatrix},\qquad
J_2=\begin{bmatrix}
0&0&1&0&0&0&0&0\\
0&0&0&-1&0&0&0&0\\
-1&0&0&0&0&0&0&0\\
0&1&0&0&0&0&0&0\\
0&0&0&0&0&0&1&0\\
0&0&0&0&0&0&0&1\\
0&0&0&0&-1&0&0&0\\
0&0&0&0&0&-1&0&0
\end{bmatrix},$$

$$J_3=\begin{bmatrix}
0&0&0&1&0&0&0&0\\
0&0&1&0&0&0&0&0\\
0&-1&0&0&0&0&0&0\\
-1&0&0&0&0&0&0&0\\
0&0&0&0&0&0&0&1\\
0&0&0&0&0&0&-1&0\\
0&0&0&0&0&1&0&0\\
0&0&0&0&-1&0&0&0
\end{bmatrix},\qquad
J_4=\begin{bmatrix}
0&0&0&0&1&0&0&0\\
0&0&0&0&0&-1&0&0\\
0&0&0&0&0&0&-1&0\\
0&0&0&0&0&0&0&-1\\
-1&0&0&0&0&0&0&0\\
0&1&0&0&0&0&0&0\\
0&0&1&0&0&0&0&0\\
0&0&0&1&0&0&0&0
\end{bmatrix},$$

$$J_5=\begin{bmatrix}
0&0&0&0&0&1&0&0\\
0&0&0&0&1&0&0&0\\
0&0&0&0&0&0&0&-1\\
0&0&0&0&0&0&1&0\\
0&-1&0&0&0&0&0&0\\
-1&0&0&0&0&0&0&0\\
0&0&0&-1&0&0&0&0\\
0&0&1&0&0&0&0&0
\end{bmatrix},\qquad
J_6=\begin{bmatrix}
0&0&0&0&0&0&1&0\\
0&0&0&0&0&0&0&1\\
0&0&0&0&1&0&0&0\\
0&0&0&0&0&-1&0&0\\
0&0&-1&0&0&0&0&0\\
0&0&0&1&0&0&0&0\\
-1&0&0&0&0&0&0&0\\
0&-1&0&0&0&0&0&0
\end{bmatrix},$$

$$J_7=\begin{bmatrix}
0&0&0&0&0&0&0&1\\
0&0&0&0&0&0&-1&0\\
0&0&0&0&0&1&0&0\\
0&0&0&0&1&0&0&0\\
0&0&0&-1&0&0&0&0\\
0&0&-1&0&0&0&0&0\\
0&1&0&0&0&0&0&0\\
-1&0&0&0&0&0&0&0
\end{bmatrix}.$$
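Since each $J_m$ is a signed permutation matrix, its basic algebraic properties are easy to check by machine. The following is a minimal sketch, assuming that the seven matrices of this appendix are the ones encoded below: for row $a$ of $J_m$ the list stores the 1-based column of the single nonzero entry, with its sign. The encoding and all names are illustrative. The script checks that every $J_m$ is skew-symmetric, that $J_m^2=-\mathrm{Id}$, and that $J_iJ_j=-J_jJ_i$ for $i\ne j$.

```python
# SIGNED_COLS[m][a] = +-(1-based column of the nonzero entry in row a of J_m).
SIGNED_COLS = {
    1: [2, -1, 4, -3, 6, -5, -8, 7],
    2: [3, -4, -1, 2, 7, 8, -5, -6],
    3: [4, 3, -2, -1, 8, -7, 6, -5],
    4: [5, -6, -7, -8, -1, 2, 3, 4],
    5: [6, 5, -8, 7, -2, -1, -4, 3],
    6: [7, 8, 5, -6, -3, 4, -1, -2],
    7: [8, -7, 6, 5, -4, -3, 2, -1],
}

def build(signed_cols):
    """Turn the compact encoding into a full 8x8 integer matrix."""
    n = len(signed_cols)
    m = [[0] * n for _ in range(n)]
    for row, sc in enumerate(signed_cols):
        m[row][abs(sc) - 1] = 1 if sc > 0 else -1
    return m

def matmul(a, b):
    n = len(a)
    return [[sum(a[i][k] * b[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]

def negate(a):
    return [[-x for x in row] for row in a]

def transpose(a):
    return [list(col) for col in zip(*a)]

IDENTITY = [[1 if i == j else 0 for j in range(8)] for i in range(8)]
J = {m: build(cols) for m, cols in SIGNED_COLS.items()}

all_skew = all(transpose(J[m]) == negate(J[m]) for m in J)
all_square_to_minus_id = all(matmul(J[m], J[m]) == negate(IDENTITY) for m in J)
all_anticommute = all(matmul(J[i], J[k]) == negate(matmul(J[k], J[i]))
                      for i in J for k in J if i < k)
print(all_skew, all_square_to_minus_id, all_anticommute)
```

The relations $J_m^2=-\mathrm{Id}$ and $J_iJ_j+J_jJ_i=0$ are the Clifford-type relations one expects of the complex structures of an H-type group.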

Vector fields on $S^7$. Table 4 gives products of unit octonions and allows us to calculate the product of two arbitrary octonions. It was used to calculate the vector fields on $S^7$.

$$\begin{array}{c|cccccccc}
 & j_0 & j_1 & j_2 & j_3 & j_4 & j_5 & j_6 & j_7\\ \hline
j_0 & j_0 & j_1 & j_2 & j_3 & j_4 & j_5 & j_6 & j_7\\
j_1 & j_1 & -j_0 & j_3 & -j_2 & j_5 & -j_4 & -j_7 & j_6\\
j_2 & j_2 & -j_3 & -j_0 & j_1 & j_6 & j_7 & -j_4 & -j_5\\
j_3 & j_3 & j_2 & -j_1 & -j_0 & j_7 & -j_6 & j_5 & -j_4\\
j_4 & j_4 & -j_5 & -j_6 & -j_7 & -j_0 & j_1 & j_2 & j_3\\
j_5 & j_5 & j_4 & -j_7 & j_6 & -j_1 & -j_0 & -j_3 & j_2\\
j_6 & j_6 & j_7 & j_4 & -j_5 & -j_2 & j_3 & -j_0 & -j_1\\
j_7 & j_7 & -j_6 & j_5 & j_4 & -j_3 & -j_2 & j_1 & -j_0
\end{array}$$

Table 4. Multiplication table for the basis elements of $\mathbb O$.

Let $o_1=(x^0j_0+x^1j_1+x^2j_2+x^3j_3+x^4j_4+x^5j_5+x^6j_6+x^7j_7)$ and $o_2=(y^0j_0+y^1j_1+y^2j_2+y^3j_3+y^4j_4+y^5j_5+y^6j_6+y^7j_7)$ be two octonions. Then we have according to Table 4:
$$\begin{aligned}
o_1\cdot o_2={}&(x^0j_0+x^1j_1+x^2j_2+x^3j_3+x^4j_4+x^5j_5+x^6j_6+x^7j_7)\cdot(y^0j_0+y^1j_1+y^2j_2+y^3j_3+y^4j_4+y^5j_5+y^6j_6+y^7j_7)\\
={}&(x^0y^0-x^1y^1-x^2y^2-x^3y^3-x^4y^4-x^5y^5-x^6y^6-x^7y^7)\,j_0\\
&+(x^1y^0+x^0y^1-x^3y^2+x^2y^3-x^5y^4+x^4y^5+x^7y^6-x^6y^7)\,j_1\\
&+(x^2y^0+x^3y^1+x^0y^2-x^1y^3-x^6y^4-x^7y^5+x^4y^6+x^5y^7)\,j_2\\
&+(x^3y^0-x^2y^1+x^1y^2+x^0y^3-x^7y^4+x^6y^5-x^5y^6+x^4y^7)\,j_3\\
&+(x^4y^0+x^5y^1+x^6y^2+x^7y^3+x^0y^4-x^1y^5-x^2y^6-x^3y^7)\,j_4\\
&+(x^5y^0-x^4y^1+x^7y^2-x^6y^3+x^1y^4+x^0y^5+x^3y^6-x^2y^7)\,j_5\\
&+(x^6y^0-x^7y^1-x^4y^2+x^5y^3+x^2y^4-x^3y^5+x^0y^6+x^1y^7)\,j_6\\
&+(x^7y^0+x^6y^1-x^5y^2-x^4y^3+x^3y^4+x^2y^5-x^1y^6+x^0y^7)\,j_7.
\end{aligned}$$
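The passage from Table 4 to the product of two arbitrary octonions is purely mechanical, which makes it a natural candidate for a short program. The sketch below is an illustration, not the author's code: the table is transcribed as signed indices (the token `-3` in row $a$, column $b$ means $j_aj_b=-j_3$), and the names `TABLE`, `octonion_product` and `norm_sq` are assumptions of the example. Exact integer arithmetic is used so that the multiplicativity of the norm and the failure of associativity can be tested without rounding.

```python
# Table 4 as rows of signed indices: entry "s k" in row a, column b encodes
# j_a * j_b = s * j_k  (the token "-0" stands for -j_0).
ROWS = [
    "0 1 2 3 4 5 6 7",
    "1 -0 3 -2 5 -4 -7 6",
    "2 -3 -0 1 6 7 -4 -5",
    "3 2 -1 -0 7 -6 5 -4",
    "4 -5 -6 -7 -0 1 2 3",
    "5 4 -7 6 -1 -0 -3 2",
    "6 7 4 -5 -2 3 -0 -1",
    "7 -6 5 4 -3 -2 1 -0",
]
TABLE = [[(-1 if tok.startswith("-") else 1, int(tok.lstrip("-")))
          for tok in row.split()] for row in ROWS]

def octonion_product(x, y):
    """Bilinear extension of Table 4 to x = sum x^a j_a and y = sum y^b j_b."""
    out = [0] * 8
    for a in range(8):
        for b in range(8):
            sign, k = TABLE[a][b]
            out[k] += sign * x[a] * y[b]
    return out

def norm_sq(x):
    return sum(c * c for c in x)

def unit(k):
    return [1 if i == k else 0 for i in range(8)]

x = [1, 2, 3, 4, 5, 6, 7, 8]
y = [2, -1, 0, 3, 1, -2, 4, 1]
p = octonion_product(x, y)

# The norm is multiplicative, but the product is not associative.
left = octonion_product(octonion_product(unit(1), unit(2)), unit(4))
right = octonion_product(unit(1), octonion_product(unit(2), unit(4)))
print(p, norm_sq(p) == norm_sq(x) * norm_sq(y), left, right)
```

In the example $(j_1j_2)j_4=j_7$ while $j_1(j_2j_4)=-j_7$, so the order of bracketing matters.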


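According to the multiplication table for octonions, we have the following unit vector fields in $\mathbb R^8$ arising as right translations of $\partial_{y^j}$, $j=0,\dots,7$, under the octonion product. If $q=(y^0,\dots,y^7)\in S^7$, then
$$\begin{aligned}
Y_0(q)&=y^0\partial_{y^0}+y^1\partial_{y^1}+y^2\partial_{y^2}+y^3\partial_{y^3}+y^4\partial_{y^4}+y^5\partial_{y^5}+y^6\partial_{y^6}+y^7\partial_{y^7}\\
Y_1(q)&=-y^1\partial_{y^0}+y^0\partial_{y^1}-y^3\partial_{y^2}+y^2\partial_{y^3}-y^5\partial_{y^4}+y^4\partial_{y^5}-y^7\partial_{y^6}+y^6\partial_{y^7}\\
Y_2(q)&=-y^2\partial_{y^0}+y^3\partial_{y^1}+y^0\partial_{y^2}-y^1\partial_{y^3}-y^6\partial_{y^4}+y^7\partial_{y^5}+y^4\partial_{y^6}-y^5\partial_{y^7}\\
Y_3(q)&=-y^3\partial_{y^0}-y^2\partial_{y^1}+y^1\partial_{y^2}+y^0\partial_{y^3}+y^7\partial_{y^4}+y^6\partial_{y^5}-y^5\partial_{y^6}-y^4\partial_{y^7}\\
Y_4(q)&=-y^4\partial_{y^0}+y^5\partial_{y^1}+y^6\partial_{y^2}-y^7\partial_{y^3}+y^0\partial_{y^4}-y^1\partial_{y^5}-y^2\partial_{y^6}+y^3\partial_{y^7}\\
Y_5(q)&=-y^5\partial_{y^0}-y^4\partial_{y^1}-y^7\partial_{y^2}-y^6\partial_{y^3}+y^1\partial_{y^4}+y^0\partial_{y^5}+y^3\partial_{y^6}+y^2\partial_{y^7}\\
Y_6(q)&=-y^6\partial_{y^0}+y^7\partial_{y^1}-y^4\partial_{y^2}+y^5\partial_{y^3}+y^2\partial_{y^4}-y^3\partial_{y^5}+y^0\partial_{y^6}-y^1\partial_{y^7}\\
Y_7(q)&=-y^7\partial_{y^0}-y^6\partial_{y^1}+y^5\partial_{y^2}+y^4\partial_{y^3}-y^3\partial_{y^4}-y^2\partial_{y^5}+y^1\partial_{y^6}+y^0\partial_{y^7}.
\end{aligned}$$
The vector fields $Y_i$, $i=1,\dots,7$, form an orthonormal frame of $T_qS^7$, $q\in S^7$, with respect to the restriction of the inner product $\langle\cdot\,,\cdot\rangle$ from $\mathbb R^8$ to the tangent space $T_qS^7$ at each $q\in S^7$.

Commutators between vector fields

Let us denote by $Y_{ij}(q)=\frac12[Y_i(q),Y_j(q)]$ the commutators between the vector fields $Y_j$, $j=0,\dots,7$, constructed above. We have the following list:
$$\begin{aligned}
Y_{12}(q)&=y^3\partial_{y^0}+y^2\partial_{y^1}-y^1\partial_{y^2}-y^0\partial_{y^3}+y^7\partial_{y^4}+y^6\partial_{y^5}-y^5\partial_{y^6}-y^4\partial_{y^7}\\
Y_{13}(q)&=-y^2\partial_{y^0}+y^3\partial_{y^1}+y^0\partial_{y^2}-y^1\partial_{y^3}+y^6\partial_{y^4}-y^7\partial_{y^5}-y^4\partial_{y^6}+y^5\partial_{y^7}\\
Y_{14}(q)&=y^5\partial_{y^0}+y^4\partial_{y^1}-y^7\partial_{y^2}-y^6\partial_{y^3}-y^1\partial_{y^4}-y^0\partial_{y^5}+y^3\partial_{y^6}+y^2\partial_{y^7}\\
Y_{15}(q)&=-y^4\partial_{y^0}+y^5\partial_{y^1}-y^6\partial_{y^2}+y^7\partial_{y^3}+y^0\partial_{y^4}-y^1\partial_{y^5}+y^2\partial_{y^6}-y^3\partial_{y^7}\\
Y_{16}(q)&=y^7\partial_{y^0}+y^6\partial_{y^1}+y^5\partial_{y^2}+y^4\partial_{y^3}-y^3\partial_{y^4}-y^2\partial_{y^5}-y^1\partial_{y^6}-y^0\partial_{y^7}\\
Y_{17}(q)&=-y^6\partial_{y^0}+y^7\partial_{y^1}+y^4\partial_{y^2}-y^5\partial_{y^3}-y^2\partial_{y^4}+y^3\partial_{y^5}+y^0\partial_{y^6}-y^1\partial_{y^7}\\
Y_{23}(q)&=y^1\partial_{y^0}-y^0\partial_{y^1}+y^3\partial_{y^2}-y^2\partial_{y^3}-y^5\partial_{y^4}+y^4\partial_{y^5}-y^7\partial_{y^6}+y^6\partial_{y^7}\\
Y_{24}(q)&=y^6\partial_{y^0}+y^7\partial_{y^1}+y^4\partial_{y^2}+y^5\partial_{y^3}-y^2\partial_{y^4}-y^3\partial_{y^5}-y^0\partial_{y^6}-y^1\partial_{y^7}\\
Y_{25}(q)&=-y^7\partial_{y^0}+y^6\partial_{y^1}+y^5\partial_{y^2}-y^4\partial_{y^3}+y^3\partial_{y^4}-y^2\partial_{y^5}-y^1\partial_{y^6}+y^0\partial_{y^7}\\
Y_{26}(q)&=-y^4\partial_{y^0}-y^5\partial_{y^1}+y^6\partial_{y^2}+y^7\partial_{y^3}+y^0\partial_{y^4}+y^1\partial_{y^5}-y^2\partial_{y^6}-y^3\partial_{y^7}\\
Y_{27}(q)&=y^5\partial_{y^0}-y^4\partial_{y^1}+y^7\partial_{y^2}-y^6\partial_{y^3}+y^1\partial_{y^4}-y^0\partial_{y^5}+y^3\partial_{y^6}-y^2\partial_{y^7}\\
Y_{34}(q)&=-y^7\partial_{y^0}+y^6\partial_{y^1}-y^5\partial_{y^2}+y^4\partial_{y^3}-y^3\partial_{y^4}+y^2\partial_{y^5}-y^1\partial_{y^6}+y^0\partial_{y^7}\\
Y_{35}(q)&=-y^6\partial_{y^0}-y^7\partial_{y^1}+y^4\partial_{y^2}+y^5\partial_{y^3}-y^2\partial_{y^4}-y^3\partial_{y^5}+y^0\partial_{y^6}+y^1\partial_{y^7}\\
Y_{36}(q)&=y^5\partial_{y^0}-y^4\partial_{y^1}-y^7\partial_{y^2}+y^6\partial_{y^3}+y^1\partial_{y^4}-y^0\partial_{y^5}-y^3\partial_{y^6}+y^2\partial_{y^7}\\
Y_{37}(q)&=y^4\partial_{y^0}+y^5\partial_{y^1}+y^6\partial_{y^2}+y^7\partial_{y^3}-y^0\partial_{y^4}-y^1\partial_{y^5}-y^2\partial_{y^6}-y^3\partial_{y^7}
\end{aligned}$$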


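$$\begin{aligned}
Y_{45}(q)&=y^1\partial_{y^0}-y^0\partial_{y^1}-y^3\partial_{y^2}+y^2\partial_{y^3}+y^5\partial_{y^4}-y^4\partial_{y^5}-y^7\partial_{y^6}+y^6\partial_{y^7}\\
Y_{46}(q)&=y^2\partial_{y^0}+y^3\partial_{y^1}-y^0\partial_{y^2}-y^1\partial_{y^3}+y^6\partial_{y^4}+y^7\partial_{y^5}-y^4\partial_{y^6}-y^5\partial_{y^7}\\
Y_{47}(q)&=-y^3\partial_{y^0}+y^2\partial_{y^1}-y^1\partial_{y^2}+y^0\partial_{y^3}+y^7\partial_{y^4}-y^6\partial_{y^5}+y^5\partial_{y^6}-y^4\partial_{y^7}\\
Y_{56}(q)&=-y^3\partial_{y^0}+y^2\partial_{y^1}-y^1\partial_{y^2}+y^0\partial_{y^3}-y^7\partial_{y^4}+y^6\partial_{y^5}-y^5\partial_{y^6}+y^4\partial_{y^7}\\
Y_{57}(q)&=-y^2\partial_{y^0}-y^3\partial_{y^1}+y^0\partial_{y^2}+y^4\partial_{y^6}-y^5\partial_{y^7}\\
Y_{67}(q)&=y^1\partial_{y^0}-y^0\partial_{y^1}-y^3\partial_{y^2}+y^2\partial_{y^3}-y^5\partial_{y^4}+y^4\partial_{y^5}+y^7\partial_{y^6}-y^6\partial_{y^7}.
\end{aligned}$$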

Basic notations

$\mathbb N$   is the set of positive integer numbers
$\mathbb R$   is the set of real numbers
$\mathbb C$   is the set of complex numbers
$\mathbb Q$   is the set of quaternion numbers
$\mathbb O$   is the set of the Cayley numbers (octonions)
$M$, $N$   are smooth manifolds
$T_qM$   is the tangent space at the point $q\in M$
$T_q^*M$   is the cotangent space at the point $q\in M$
$TM$   is the tangent bundle for a manifold $M$
$T^*M$   is the cotangent bundle for a manifold $M$
$\mathrm{pr}_M$, $\mathrm{pr}^*_M$   are the canonical projections from $TM$, $T^*M$ to $M$
$\mathrm{Vect}(M)$   is the set of smooth vector fields on $M$
$X$, $Y$, $Z$   are smooth vector fields, elements of $\mathrm{Vect}(M)$
$S^1$   is the unit circle
$\mathrm{Diff}\,S^1$   is the group of orientation preserving diffeomorphisms of $S^1$
$\mathrm{Vir}$   is the Virasoro–Bott group
$\mathfrak{vir}$   is the Virasoro algebra
$D$   is a smooth distribution on $M$, horizontal sub-bundle of $TM$
$D^*$   is a smooth co-distribution on $M$, smooth sub-bundle of $T^*M$
$D^\perp$   is the set of annihilators of $D$, horizontal sub-bundle of $TM$
$g$   is a Riemannian metric
$g_D$   is a sub-Riemannian metric related to the distribution $D$
$g^D$   is a sub-Riemannian co-metric related to the distribution $D$
$\Omega$   is a curvature form of a horizontal distribution
$A$   is a connection form of a horizontal distribution
$\sigma$   is the infinitesimal generator of a group acting on a manifold
$\nabla$   is the Levi-Civita connection
$H$   is a Hamiltonian function
$\vec H$   is the Hamiltonian vector field associated to a function $H$
$G$, $H$   are Lie groups
$\mathfrak g$, $\mathfrak h$   are Lie algebras
$\mathrm{Aut}(E)$   is the group of automorphisms of $E$
$\mathrm{End}(E)$   is the Lie algebra of endomorphisms of $E$
$W$, $W^*$   is a vector space and its dual
$\cdot$   is the Euclidean inner product
$(\cdot\,,\cdot)$   is an inner product
$\langle\cdot\,,\cdot\rangle$   is a pairing between $W$ and $W^*$
$\{\cdot\,,\cdot\}$   are Poisson brackets
$[\cdot\,,\cdot]$   is the commutator
$\|\cdot\|_E$   is the Euclidean norm
$\|\cdot\|_{\mathbb H^1}$   is the Heisenberg norm
$d_E$   is the Euclidean distance function
$d_{\mathbb H^1}$   is the Heisenberg distance function
$d_{c-c}$   is the Carnot–Carathéodory distance function
$f|_S$   is the restriction of the function $f$ to the set $S$
$\mathrm{grad}\,f$   is the Riemannian gradient of a function $f$
$\mathrm{grad}_D f$   is the sub-Riemannian gradient of a function $f$
$R$   is the rolling map

Acknowledgment

It is not a duty, but a pleasure to thank the organizers of the school “Analysis – with Applications to Mathematical Physics” in Göttingen, and in particular, Wolfram Bauer, for the invitation, warm hospitality, an extraordinarily interesting school, and the indispensable help in preparation of these notes. I am grateful to the Analysis Group at the University of Bergen for the creative friendly atmosphere and the infinite source of ideas, questions and emotions. Particularly, I would like to express deep gratitude to my co-authors with whom I shared hard hours of work and lovely time of mathematical conversations. Special sincere thanks to my lovely husband Alexander Vasiliev for his support of all my initiatives.

References

[1] J.F. Adams, Vector fields on spheres. Ann. of Math. (2) 75 (1962), 603–632.
[2] A. Agrachev, M. Caponigro, Controllability on the group of diffeomorphisms. Ann. Inst. H. Poincaré. Anal. Non Linéaire 26 (2009), 2503–2509.
[3] A. Agrachev, Y. Sachkov, Control theory from the geometric viewpoint. Encyclopedia of Mathematical Sciences, 87. Control Theory and Optimization, II. Springer-Verlag, Berlin, 2004. 412 pp.
[4] H. Airault, P. Malliavin, Unitarizing probability measures for representations of Virasoro algebra. J. Math. Pures Appl. 80 (2001), no. 6, 627–667.
[5] D. Alekseevsky, Y. Kamishima, Pseudo-conformal quaternionic CR structure on (4n + 3)-dimensional manifolds. Ann. Mat. Pura Appl. (4) 187 (2008), no. 3, 487–529.
[6] N. Arcozzi, A. Baldi, From Grushin to Heisenberg via an isoperimetric problem. J. Math. Anal. Appl. 340 (2008), no. 1, 165–174.


[7] V.I. Arnold, Mathematical methods of classical mechanics. Translated from the 1974 Russian original by K. Vogtmann and A. Weinstein. Corrected reprint of the second (1989) edition. Graduate Texts in Mathematics, 60. Springer-Verlag, New York, 516 pp.
[8] J.C. Baez, The octonions. Bull. Amer. Math. Soc. (N.S.) 39 (2002), no. 2, 145–205.
[9] W. Bauer, K. Furutani, Spectral analysis and geometry of a sub-Riemannian structure on S^3 and S^7. J. Geom. Phys. 58 (2008), 1693–1738.
[10] W. Bauer, K. Furutani, C. Iwasaki, Trivializable sub-Riemannian structures on spheres. Bull. Sci. Math. 137 (2013), no. 3, 361–385.
[11] M. Bauer, D. Bernard, SLE martingales and the Virasoro algebra. Phys. Lett. B 557 (2003), no. 3-4, 309–316.
[12] M. Bauer, D. Bernard, Conformal field theories of stochastic Loewner evolutions. Comm. Math. Phys. 239 (2003), no. 3, 493–521.
[13] A. Bellaïche, The tangent space in Sub-Riemannian geometry. In Sub-Riemannian geometry, edited by André Bellaïche and Jean-Jacques Risler. Progress in Mathematics, 144. Birkhäuser Verlag, Basel, 1996. 393 pp.
[14] A. Boggess, CR manifolds and the tangential Cauchy–Riemann complex. Studies in Advanced Mathematics. CRC Press, Boca Raton, FL, 1991. 364 pp.
[15] V.G. Boltyanskii, Sufficient conditions for optimality and the justification of the dynamic programming method. SIAM J. Control 4 (1966), 326–361.
[16] R. Bott, On the characteristic classes of groups of diffeomorphisms. Enseignment Math. (2) 23 (1977), no. 3-4, 209–220.
[17] R. Bott, J. Milnor, On the parallelizability of the spheres. Bull. Amer. Math. Soc. 64 (1958), 87–89.
[18] R. Bryant, L. Hsu, Rigidity of integral curves of rank 2 distributions. Invent. Math. 114 (1993), no. 2, 435–461.
[19] F. Bullo, A.D. Lewis, Geometric control of mechanical systems. Modeling, analysis, and design for simple mechanical control systems. Texts in Applied Mathematics, 49. Springer-Verlag, New York, 2005. 726 pp.
[20] O. Calin, D.C. Chang, P. Greiner, Geometric analysis on the Heisenberg group and its generalizations. AMS/IP Studies in Advanced Mathematics, 40. American Mathematical Society, Providence, RI; International Press, Somerville, MA, 2007. 244 pp.
[21] O. Calin, D.C. Chang, I. Markina, Geometric analysis on H-type groups related to division algebras. Math. Nachr. 282 (2009), no. 1, 44–68.
[22] L. Capogna, D. Danielli, S.D. Pauls, J.T. Tyson, An introduction to the Heisenberg group and the sub-Riemannian isoperimetric problem. Progress in Mathematics, 259. Birkhäuser Verlag, Basel, 2007. 223 pp.
[23] M.P. do Carmo, Riemannian geometry. Mathematics: Theory & Applications. Birkhäuser Boston, Inc., Boston, MA, 1992. 300 pp.
[24] É. Cartan, Les systèmes de Pfaff, à cinq variables et les équations aux dérivées partielles du second ordre. Ann. Sci. École Norm. Sup. 27 (1910), no. 3, 109–192.


[25] C.H. Chang, D.C. Chang, B. Gaveau, P. Greiner, H.P. Lee, Geometric analysis on a step 2 Grushin operator. Bull. Inst. Math. Acad. Sin. (N.S.) 4 (2009), no. 2, 119–188.
[26] D.C. Chang, I. Markina, Geometric analysis on quaternion H-type groups. J. Geom. Anal. 16 (2006), no. 2, 266–294.
[27] D.C. Chang, I. Markina, A. Vasil’ev, Sub-Lorentzian geometry on anti-de Sitter space. J. Math. Pures Appl. 90 (9) (2008), no. 1, 82–110.
[28] D.C. Chang, I. Markina, A. Vasil’ev, Sub-Riemannian geodesics on the 3D sphere. Complex Anal. Oper. Theory 3 (2009), no. 1, 44–68.
[29] D.C. Chang, I. Markina, A. Vasil’ev, Modified action and differential operators on the 3D sub-Riemannian sphere. Asian J. Math. 14 (3) (2010), no. 4, 439–474.
[30] Y. Chitour, F. Jean, E. Trélat, Genericity results for singular curves. J. Diff. Geom. 73 (2006), no. 1, 45–73.
[31] Y. Chitour, P. Kokkonen, Rolling Manifolds: Intrinsic Formulation and Controllability. arXiv:1011.2925
[32] Y. Chitour, P. Kokkonen, Rolling manifolds on space forms. Ann. Inst. H. Poincaré Anal. Non Linéaire 29 (2012), no. 6, 927–954.
[33] W.L. Chow, Über Systeme von linearen partiellen Differentialgleichungen erster Ordnung. Math. Ann. 117 (1939), 98–105.
[34] P. Ciatti, Scalar products on Clifford modules and pseudo-H-type Lie algebras. Ann. Mat. Pura Appl. 178 (2000), no. 4, 41–31.
[35] L.A. Cordero, P.E. Parker, Isometry groups of pseudoriemannian 2-step nilpotent Lie groups. Houston J. Math. 35 (2009), no. 1, 49–72.
[36] M. Cowling, A.H. Dooley, A. Korányi, F. Ricci, H-type groups and Iwasawa decompositions. Adv. Math. 87 (1991), no. 1, 1–41.
[37] D. Danielli, N. Garofalo, D.M. Nhieu, Sub-Riemannian calculus on hypersurfaces in Carnot groups. Adv. Math. 215 (2007), no. 1, 292–378.
[38] G. Darboux, Sur le problème de Pfaff. Bull. Sci. Math. 6 (1882), 14–36, 49–68.
[39] C.A. Deavours, The quaternion calculus. Amer. Math. Monthly 80 (1973), 995–1008.
[40] J. Dou, P. Niu, J. Han, Polar coordinates for the generalized Baouendi–Grushin operator and applications. J. Partial Differential Equations 20 (2007), no. 4, 322–336.
[41] P.I. Dubnikov, S.N. Samborskii, Controllability criterion for systems in a Banach space (Generalization of Chow’s theorem). Ukraine Math. J. 32 (1979), no. 5, 649–653.
[42] J.J. Duistermaat, J.A.C. Kolk, Lie groups. Universitext. Springer-Verlag, Berlin, 2000. 344 pp.
[43] P. Eberlein, Riemannian submersion and lattices in 2-step nilpotent Lie groups. Comm. Anal. Geom. 11 (2003), no. 3, 441–488.
[44] P. Eberlein, Geometry of 2-step nilpotent Lie groups. Modern dynamical systems and applications, 67–101, Cambridge Univ. Press, Cambridge, 2004.
[45] J. Escher, B. Kolev, M. Wunsch, The geometry of a vorticity model equation. Commun. Pure Appl. Anal. 11 (2012), no. 4, 1407–1419.


[46] R. Friedrich, W. Werner, Conformal restriction, highest-weight representations and SLE. Comm. Math. Phys. 243 (2003), no. 1, 105–122.
[47] R.M. Friedrich, The global geometry of stochastic Löwner evolutions. Probabilistic approach to geometry, 79–117, Adv. Stud. Pure Math., 57, Math. Soc. Japan, Tokyo, 2010.
[48] G. Frobenius, Über das Pfaffsche Problem. J. reine angew. Math. 82 (1877), 230–315.
[49] G.B. Folland, A fundamental solution for a subelliptic operator. Bull. Amer. Math. Soc. 79 (1973), 373–376.
[50] G.B. Folland, E.M. Stein, Hardy spaces on homogeneous groups. Mathematical Notes, 28. Princeton University Press, Princeton, N.J.; University of Tokyo Press, Tokyo, 1982. 285 pp.
[51] W. Fulton, J. Harris, Representation theory. A first course. Graduate Texts in Mathematics, 129. Readings in Mathematics. Springer-Verlag, New York, 1991. 551 pp.
[52] N. Garofalo, D. Vassilev, Strong unique continuation properties of generalized Baouendi–Grushin operators. Comm. Partial Differential Equations 32 (2007), no. 4-6, 643–663.
[53] Z. Ge, Betti numbers, characteristic classes and sub-Riemannian geometry. Illinois J. Math. 36 (1992), no. 3, 372–403.
[54] I.M. Gel’fand, D.B. Fuchs, Cohomology of the Lie algebra of vector fields on the circle. Functional Anal. Appl. 2 (1968), no. 4, 342–343.
[55] V. Gershkovich, A. Vershik, Nonholonomic manifolds and nilpotent analysis. J. Geom. Phys. 5 (1988), no. 3, 407–452.
[56] M. Godoy Molina, E. Grong, Geometric conditions for the existence of a rolling without twisting or slipping. Commun. Pure Appl. Anal. 13 (2014), no. 1, 435–452.
[57] M. Godoy Molina, E. Grong, I. Markina, S. Leite, An intrinsic formulation of the rolling manifolds problem. J. Dyn. Control Syst. 18 (2012), no. 2, 181–214.
[58] M. Godoy, I. Markina, Sub-Riemannian geometry of parallelizable spheres. Rev. Mat. Iberoam. 27 (2011), no. 3, 997–1022.
[59] M. Godoy, I. Markina, Sub-Riemannian geodesics and heat operator on odd-dimensional spheres. Anal. Math. Phys. 2 (2012), no. 2, 123–147.
[60] M. Godoy, A. Korolko, I. Markina, Sub-semi-Riemannian geometry of general H-type groups. Bull. Sci. Math. 137 (2013), no. 6, 805–835.
[61] M. Grochowski, Geodesics in the sub-Lorentzian geometry. Bull. Polish Acad. Sci. Math. 50 (2002), no. 2, 161–178.
[62] M. Grochowski, Reachable sets for the Heisenberg sub-Lorentzian structure on $\mathbb{R}^3$. An estimate for the distance function. J. Dyn. Control Syst. 12 (2006), no. 2, 145–160.
[63] M. Gromov, Carnot–Carathéodory spaces seen from within. In Sub-Riemannian geometry, edited by André Bellaïche and Jean-Jacques Risler. Progress in Mathematics, 144. Birkhäuser Verlag, Basel, 1996. 393 pp.
[64] E. Grong, Controllability of rolling without twisting or slipping in higher dimensions. SIAM J. Control Optim. 50 (2012), no. 4, 2462–2485.
[65] E. Grong, I. Markina, A. Vasil’ev, Sub-Riemannian geometry on infinite-dimensional manifolds. To appear in J. Geom. Anal. DOI 10.1007/s12220-014-9523-0.


[66] E. Grong, I. Markina, A. Vasil’ev, Sub-Riemannian structures corresponding to Kählerian metrics on the universal Teichmüller space and curve. “60 years of analytic functions in Lublin” – in memory of our professors and friends Jan G. Krzyż, Zdzisław Lewandowski and Wojciech Szapiel, 97–116, Monogr. Univ. Econ. Innov. Lublin, Innovatio Press Sci. Publ. House Univ. Econ. Innov. Lublin, Lublin, 2012.
[67] V.V. Grušin, A certain class of hypoelliptic operators. (Russian) Mat. Sb. (N.S.) 83 (125) (1970), 456–473.
[68] K. Gürlebeck, W. Sprössig, Quaternionic and Clifford Calculus for Physicists and Engineers. Chichester: John Wiley and Sons, 1997. 371 pp.
[69] E. Heintze, X. Liu, Homogeneity of infinite-dimensional isoparametric submanifolds. Ann. of Math. (2), 149 (1999), 149–181.
[70] M.R. Herman, Simplicité du groupe des difféomorphismes de classe $C^\infty$, isotopes à l’identité, du tore de dimension n. (French) C. R. Acad. Sci. Paris Sér. A-B 273 (1971), A232–A234.
[71] H. Hopf, Über die Abbildungen von Sphären auf Sphären niedrigerer Dimension. Math. Ann. 104 (1931), 637–665.
[72] K. Hüper, F. Silva Leite, On the geometry of rolling and interpolation curves on $S^n$, $SO_n$, and Grassmann manifolds. J. Dyn. Control Syst. 13 (2007), no. 4, 467–502.
[73] K. Hüper, M. Kleinsteuber, F. Silva Leite, Rolling Stiefel manifolds. Internat. J. Systems Sci. 39 (2008), no. 9, 881–887.
[74] A. Hurtado, C. Rosales, Area-stationary surfaces inside the sub-Riemannian three-sphere. Math. Ann. 340 (2008), no. 3, 675–708.
[75] D. Husemoller, Fibre bundles. Third edition. Graduate Texts in Mathematics, 20. Springer-Verlag, New York, 1994. 353 pp.
[76] D.D. Joyce, Riemannian holonomy groups and calibrated geometry. Oxford Graduate Texts in Mathematics, 12. Oxford University Press, Oxford, 2007. 303 pp.
[77] A. Kaplan, Fundamental solutions for a class of hypoelliptic PDE generated by composition of quadratic forms. Trans. Amer. Math. Soc. 258 (1980), no. 1, 147–153.
[78] A. Kaplan, On the geometry of groups of Heisenberg type. Bull. London Math. Soc. 15 (1983), no. 1, 35–42.
[79] M. Khajeh Salehani, I. Markina, Controllability on infinite-dimensional manifolds: a Chow–Rashevsky theorem. Acta Appl. Math. 134 (2014), 229–246.
[80] B. Khesin, G. Misiolek, Euler equations on homogeneous spaces and Virasoro orbits. Adv. Math. 176 (2003), no. 1, 116–144.
[81] B. Khesin, R. Wendt, The geometry of infinite-dimensional groups. A Series of Modern Surveys in Mathematics [Results in Mathematics and Related Areas. 3rd Series.], 51. Springer-Verlag, Berlin, 2009. 304 pp.
[82] A.A. Kirillov, The orbits of the group of diffeomorphisms of the circle, and local Lie superalgebras. (Russian) Funktsional. Anal. i Prilozhen. 15 (1981), no. 2, 75–76.
[83] A.A. Kirillov, Geometric approach to discrete series of unirreps for vir. J. Math. Pures Appl. 77 (1998), 735–746.
[84] A.A. Kirillov, D.V. Yuriev, Representations of the Virasoro algebra by the orbit method. J. Geom. Phys. 5 (1988), no. 3, 351–363.


[85] A.A. Kirillov, Kähler structures on K-orbits of the group of diffeomorphisms of a circle. Funct. Anal. Appl. 21 (1987), no. 2, 42–45.
[86] A.A. Kirillov, D.V. Yur’ev, Kähler geometry and the infinite-dimensional homogeneous space $M = \mathrm{Diff}_+(S^1)/\mathrm{Rot}(S^1)$. Funct. Anal. Appl. 21 (1987), no. 4, 284–294.
[87] A.W. Knapp, Lie groups beyond an introduction. Second edition. Progress in Mathematics, 140. Birkhäuser Boston, Inc., Boston, MA, 2002. 812 pp.
[88] A. Korányi, Geometric properties of Heisenberg-type groups. Adv. in Math. 56 (1985), no. 1, 28–38.
[89] A. Korányi, H.M. Reimann, Quasiconformal mappings on the Heisenberg group. Invent. Math. 80 (1985), no. 2, 309–338.
[90] A. Korolko, I. Markina, Nonholonomic Lorentzian geometry on some H-type groups. J. Geom. Anal. 19 (2009), no. 4, 864–889.
[91] A. Korolko, I. Markina, Geodesics on H-type groups with sub-Lorentzian metric and their physical interpretation. Complex Anal. Oper. Theory 4 (2010), no. 3, 589–618.
[92] A. Korolko, I. Markina, Semi-Riemannian geometry with nonholonomic constraints. Taiwanese J. of Math. 15 (2011), no. 4, 1581–1616.
[93] A. Kriegl, P.W. Michor, The convenient setting of global analysis. Mathematical Surveys and Monographs, 53. American Mathematical Society, Providence, RI, 1997. 618 pp.
[94] Yu.S. Ledyaev, On an infinite-dimensional variant of the Rashevski–Chow theorem. Dokl. Akad. Nauk 398 (2004), no. 6, 735–737.
[95] E.B. Lee, L. Markus, Foundations of optimal control theory. John Wiley & Sons, Inc., New York-London-Sydney 1967. 576 pp.
[96] L. Lempert, The Virasoro group as a complex manifold. Math. Res. Lett. 2 (1995), no. 4, 479–495.
[97] W. Liu, H.J. Sussmann, Shortest paths for sub-Riemannian metrics on rank-two distributions. Mem. Amer. Math. Soc. 118 (1995), no. 564, 104 pp.
[98] D.W. Lyons, An elementary introduction to the Hopf fibration. Math. Mag. 76 (2003), no. 2, 87–98.
[99] I. Markina, A. Vasil’ev, Virasoro algebra and dynamics in the space of univalent functions. Five lectures in complex analysis, 85–116, Contemp. Math., 525, Amer. Math. Soc., Providence, RI, 2010.
[100] J. Marsden, A. Weinstein, Reduction of symplectic manifolds with symmetry. Rep. Mathematical Phys. 5 (1974), no. 1, 121–130.
[101] K.R. Meyer, Symmetries and integrals in mechanics. Dynamical systems (Proc. Sympos., Univ. Bahia, Salvador, 1971), pp. 259–272. Academic Press, New York, 1973.
[102] P.W. Michor, D. Mumford, Riemannian geometries on spaces of plane curves. J. Eur. Math. Soc. (JEMS) 8 (2006), no. 1, 1–48.
[103] P.W. Michor, D. Mumford, An overview of the Riemannian metrics on spaces of curves using the Hamiltonian approach. Appl. Comput. Harmon. Anal. 23 (2007), no. 1, 74–113.


[104] P.W. Michor, D. Mumford, J. Shah, L. Younes, A metric on shape space with explicit geodesics. Atti Accad. Naz. Lincei Cl. Sci. Fis. Mat. Natur. Rend. Lincei (9) Mat. Appl. 19 (2008), no. 1, 25–57.
[105] R.L. Mills, C.N. Yang, Conservation of isotopic spin and isotopic gauge invariance. Phys. Rev. 96 (1954), p. 191.
[106] J. Milnor, Morse theory. Annals of Mathematics Studies, no. 51. Princeton University Press, Princeton, N.J. 1963. 153 pp.
[107] J. Milnor, D. Husemoller, Symmetric bilinear forms. Ergebnisse der Mathematik und ihrer Grenzgebiete, Band 73. Springer-Verlag, New York-Heidelberg, 1973. 147 pp.
[108] J. Milnor, Remarks on infinite-dimensional Lie groups. Relativity, groups and topology, II (Les Houches, 1983), 1007–1057, North-Holland, Amsterdam, 1984.
[109] R. Montgomery, Abnormal minimizers. SIAM J. Control Optim. 32 (1994), no. 6, 1605–1620.
[110] R. Montgomery, A survey of singular curves in sub-Riemannian geometry. J. Dynam. Control Systems 1 (1995), no. 1, 49–90.
[111] R. Montgomery, A tour of subriemannian geometries, their geodesics and applications. Mathematical Surveys and Monographs, 91. American Mathematical Society, Providence, RI, 2002. 259 pp.
[112] R. Monti, The regularity problem for sub-Riemannian geodesics. Springer, INDAM Series, 2013.
[113] T. Nagano, Linear differential systems with singularities and an application to transitive Lie algebras. J. Math. Soc. Japan 18 (1966), 398–404.
[114] A. Nagel, E.M. Stein, S. Wainger, Balls and metrics defined by vector fields. I. Basic properties. Acta Math. 155 (1985), no. 1-2, 103–147.
[115] J. Nash, The imbedding problem for Riemannian manifolds. Ann. of Math. 63 (1956), no. 2, 20–63.
[116] H. Omori, Infinite-dimensional Lie groups. Translated from the 1979 Japanese original and revised by the author. Translations of Mathematical Monographs, 158. American Mathematical Society, Providence, RI, 1997. 415 pp.
[117] B. O’Neill, Semi-Riemannian geometry. With applications to relativity. Pure and Applied Mathematics, 103. Academic Press, [Harcourt Brace Jovanovich, Publishers], New York, 1983. 468 pp.
[118] A. Pressley, G. Segal, Loop groups. Oxford Mathematical Monographs. Oxford Science Publications. The Clarendon Press, Oxford University Press, New York, 1986. 318 pp.
[119] P.K. Rashevskiĭ, About connecting two points of complete nonholonomic space by admissible curve. Uch. Zapiski Ped. Inst. K. Liebknecht 2 (1938), 83–94.
[120] F. Ricci, The spherical transform on harmonic extensions of H-type groups. Differential geometry (Turin, 1992). Rend. Sem. Mat. Univ. Politec. Torino 50 (1992), no. 4, 381–392.
[121] M. Ritoré, A proof by calibration of an isoperimetric inequality in the Heisenberg group $\mathbb{H}^n$. Calc. Var. Partial Differential Equations 44 (2012), no. 1–2, 47–60.


[122] M. Ritoré, C. Rosales, Area-stationary and stable surfaces in the sub-Riemannian Heisenberg group $\mathbb{H}^1$. Mat. Contemp. 35 (2008), 185–203.
[123] A.A. Sagle, R.E. Walde, Introduction to Lie Groups and Lie Algebras. Pure and Appl. Mathematics, 51. Academic Press, New York-London, 1973. 361 pp.
[124] A.C. Schaeffer, D.C. Spencer, Coefficient Regions for Schlicht Functions (with a Chapter on the Region of the Derivative of a Schlicht Function by Arthur Grad). American Mathematical Society Colloquium Publications, vol. 35. American Mathematical Society, New York, 1950.
[125] R.W. Sharpe, Differential geometry. Cartan’s generalization of Klein’s Erlangen program. Graduate Texts in Mathematics, 166. Springer-Verlag, New York, 1997. 421 pp.
[126] I.M. Singer, J.A. Thorpe, Lecture notes on elementary topology and geometry. Undergraduate Texts in Mathematics. Springer-Verlag, New York-Heidelberg, 1976. 232 pp.
[127] R.S. Strichartz, Sub-Riemannian geometry. J. Diff. Geom. 24 (1986), 221–263; Correction, ibid. 30 (1989), 595–596.
[128] H.J. Sussmann, Orbits of families of vector fields and integrability of distributions. Trans. Amer. Math. Soc. 180 (1973), 171–188.
[129] J.T. Tyson, Sharp weighted Young’s inequalities and Moser–Trudinger inequalities on Heisenberg type groups and Grushin spaces. Potential Anal. 24 (2006), no. 4, 357–384.
[130] J. Vick, Homology Theory: An Introduction to Algebraic Topology. Graduate Texts in Mathematics, vol. 145. Springer-Verlag, New York, 1994.
[131] F.W. Warner, Foundations of differentiable manifolds and Lie groups. Graduate Texts in Mathematics, 94. Springer-Verlag, New York-Berlin, 1983. 272 pp.
[132] J.A. Zimmerman, Optimal control of the sphere $S^n$ rolling on $E^n$. Math. Control Signals Systems 17 (2005), no. 1, 14–37.

Irina Markina
Department of Mathematics
University of Bergen
Postbox 7803
N-5020 Bergen, Norway
e-mail: [email protected]