Lecture Notes on Functional Analysis: With Applications to Linear Partial Differential Equations 0821887718, 978-0-8218-8771-4

This textbook is addressed to graduate students in mathematics or other disciplines who wish to understand the essential

1,032 58 1MB

English Pages 250 [265] Year 2013

Report DMCA / Copyright

DOWNLOAD FILE

Polecaj historie

Applied Differential Equations Lecture Notes

https://www.ams.org/open-math-notes/omn-view-listing?listingId=111348 These are a series of lecture slides from a course

380 39 11MB Read more

Analysis of Partial Differential Equations

405 158 507KB Read more

Introduction to Partial Differential Equations

256 64 38MB Read more

Partial Differential Equations: Theory, Analysis and Applications : Theory, Analysis and Applications [1 ed.] 9781613247433, 9781611228588

Partial differential equations are used to formulate and thus aid the solution of problems involving functions of severa

595 97 13MB Read more

Space-time methods: applications to partial differential equations 9783110547870, 9783110548488

556 57 3MB Read more

Contributions to partial differential equations and applications 9783319783246, 9783319783253, 3319783254

Preface -- Career papers -- The career of Prof. W. Fitzgibbon, by Jeff Morgan, Jacques Periaux -- The career of Prof. Yu

897 99 25MB Read more

Linear Algebra and Partial Differential Equations 9789353161644, 9353161649

This book seeks to build fundamental concepts on the subject of Linear Algebra and Partial Differential Equations. Each

1,282 47 26MB Read more

Linear and Semilinear Partial Differential Equations: An Introduction 9783110269055, 9783110269048

The text is intended for students who wish a concise and rapid introduction to some main topics in PDEs, necessary for u

506 42 1MB Read more

Variational Methods in Nonlinear Analysis: With Applications in Optimization and Partial Differential Equations 9783110647389, 9783110647365

This well-thought-out book covers the fundamentals of nonlinear analysis, with a particular focus on variational methods

369 88 5MB Read more

Space-Time Methods: Applications to Partial Differential Equations [25] 9783110547870

This volume contains seven chapters that provide some recent developments in the formulation, analysis, and implementati

735 172 7MB Read more

Lecture Notes on Functional Analysis: With Applications to Linear Partial Differential Equations
0821887718, 978-0-8218-8771-4

Author / Uploaded
Alberto Bressan

Citation preview

Lecture Notes on Functional Analysis With Applications to Linear Partial Differential Equations !LBERTO "RESSAN

'RADUATE 3TUDIES IN -ATHEMATICS 6OLUME

!MERICAN -ATHEMATICAL 3OCIETY

Lecture Notes on Functional Analysis With Applications to Linear Partial Differential Equations

Lecture Notes on Functional Analysis With Applications to Linear Partial Differential Equations

Alberto Bressan

Graduate Studies in Mathematics Volume 143

American Mathematical Society Providence, Rhode Island

EDITORIAL COMMITTEE David Cox (Chair) Daniel S. Freed Rafe Mazzeo Gigliola Staffilani 2010 Mathematics Subject Classification. Primary 46–01; Secondary 35–01.

For additional information and updates on this book, visit www.ams.org/bookpages/gsm-143

Library of Congress Cataloging-in-Publication Data Bressan, Alberto, 1956– [Lectures. Selections] Lecture notes on functional analysis with applications to linear partial differential equations / Alberto Bressan. pages cm. — (Graduate studies in mathematics ; volume 143) Includes bibliographical references and index. ISBN 978-0-8218-8771-4 (alk. paper) 1. Functional analysis. 2. Differential equations, Linear. I. Title. QA321.B74 2012 515.7—dc23 2012030200

Copying and reprinting. Individual readers of this publication, and nonprofit libraries acting for them, are permitted to make fair use of the material, such as to copy a chapter for use in teaching or research. Permission is granted to quote brief passages from this publication in reviews, provided the customary acknowledgment of the source is given. Republication, systematic copying, or multiple reproduction of any material in this publication is permitted only under license from the American Mathematical Society. Requests for such permission should be addressed to the Acquisitions Department, American Mathematical Society, 201 Charles Street, Providence, Rhode Island 02904-2294 USA. Requests can also be made by e-mail to [email protected]. c 2013 by the American Mathematical Society. All rights reserved. The American Mathematical Society retains all rights except those granted to the United States Government. Printed in the United States of America. ∞ The paper used in this book is acid-free and falls within the guidelines

established to ensure permanence and durability. Visit the AMS home page at http://www.ams.org/ 10 9 8 7 6 5 4 3 2 1

18 17 16 15 14 13

To Wen, Luisa Mei, and Maria Lan

Contents

Preface

xi

Chapter 1. Introduction

1

§1.1. Linear equations

1

§1.2. Evolution equations

4

§1.3. Function spaces

7

§1.4. Compactness

7

Chapter 2. Banach Spaces

11

§2.1. Basic definitions

11

§2.2. Linear operators

16

§2.3. Finite-dimensional spaces

20

§2.4. Seminorms and Fréchet spaces

23

§2.5. Extension theorems

26

§2.6. Separation of convex sets

30

§2.7. Dual spaces and weak convergence

32

§2.8. Problems

35

Chapter 3. Spaces of Continuous Functions

45

§3.1. Bounded continuous functions

45

§3.2. The Stone-Weierstrass approximation theorem

47

§3.3. Ascoli’s compactness theorem

53

§3.4. Spaces of Hölder continuous functions

56

§3.5. Problems

57 vii

viii

Contents

Chapter §4.1. §4.2. §4.3. §4.4. §4.5. §4.6.

4. Bounded Linear Operators The uniform boundedness principle The open mapping theorem The closed graph theorem Adjoint operators Compact operators Problems

61 61 63 64 66 68 71

Chapter §5.1. §5.2. §5.3. §5.4. §5.5. §5.6. §5.7. §5.8.

5. Hilbert Spaces Spaces with an inner product Orthogonal projections Linear functionals on a Hilbert space Gram-Schmidt orthogonalization Orthonormal sets Positive definite operators Weak convergence Problems

77 78 79 82 84 85 89 92 95

Chapter §6.1. §6.2. §6.3. §6.4.

6. Compact Operators on a Hilbert Space Fredholm theory Spectrum of a compact operator Selfadjoint operators Problems

101 101 106 107 111

Chapter §7.1. §7.2. §7.3. §7.4. §7.5.

7. Semigroups of Linear Operators Ordinary differential equations in a Banach space Semigroups of linear operators Resolvents Generation of a semigroup Problems

115 115 120 124 128 134

Chapter §8.1. §8.2. §8.3. §8.4. §8.5. §8.6.

8. Sobolev Spaces Distributions and weak derivatives Mollifications Sobolev spaces Approximations of Sobolev functions Extension operators Embedding theorems

139 139 146 151 157 161 163

Contents

ix

§8.7. Compact embeddings

175

§8.8. Differentiability properties

179

§8.9. Problems

180

Chapter 9. Linear Partial Differential Equations

185

§9.1. Elliptic equations

185

§9.2. Parabolic equations

200

§9.3. Hyperbolic equations

207

§9.4. Problems

212

Appendix. Background Material

217

§A.1. Partially ordered sets

217

§A.2. Metric and topological spaces

217

§A.3. Review of Lebesgue measure theory

222

§A.4. Integrals of functions taking values in a Banach space

226

§A.5. Mollifications

228

§A.6. Inequalities

233

§A.7. Problems

237

Summary of Notation

241

Bibliography

245

Index

247

Preface

The first version of these lecture notes was drafted in 2010 for a course at the Pennsylvania State University. The book is addressed to graduate students in mathematics or other disciplines, who wish to understand the essential concepts of functional analysis and their application to partial differential equations. Most of its content can be covered in a one-semester course at the first-year graduate level. In writing this textbook, I followed a number of guidelines: - Keep it short, presenting all the fundamental concepts and results, but not more than that. - Explain clearly the connections between theorems in functional analysis and familiar results of finite-dimensional linear algebra. - Cover enough of the theory of Sobolev spaces and semigroups of linear operators as needed to develop significant applications to elliptic, parabolic, and hyperbolic PDEs. - Include a large number of homework problems and illustrate the main ideas with figures, whenever possible. In functional analysis one finds a wealth of beautiful results that could be included in a monograph. However, for a textbook of this nature one should resist such a temptation. After the Introduction, Chapters 2 to 6 cover classical topics in linear functional analysis: Banach spaces, Hilbert spaces, and linear operators. Chapter 4 is devoted to spaces of continuous functions, including the StoneWeierstrass approximation theorem and Ascoli’s compactness theorem. In xi

xii

Preface

view of applications to linear PDEs, in Chapter 6 we prove some basic results on Fredholm operators and the Hilbert-Schmidt theorem on compact symmetric operators in a Hilbert space. Chapter 7 provides an introduction to the theory of semigroups, extending the definition of the exponential function etA to a suitable class of (possibly unbounded) linear operators. We stress the connection with finite-dimensional ODEs and the close relation between the resolvent operators and backward Euler approximations. After an introduction explaining the concepts of distribution and weak derivative, Chapter 8 develops the theory of Sobolev spaces. These spaces provide the most convenient abstract framework where techniques of functional analysis can be applied toward the solution of ordinary and partial differential equations. The first three sections in Chapter 9 describe applications of the previous theory to elliptic, parabolic, and hyperbolic PDEs. Since differential operators are unbounded, it is often convenient to recast a linear PDE in a “weak form”, involving only bounded operators on a Hilbert-Sobolev space. This new equation can then be studied using techniques of abstract functional analysis, such as the Lax-Milgram theorem, Fredholm’s theory, or the representation of the solution in terms of a series of eigenfunctions. The last chapter consists of an Appendix, collecting background material. This includes: definition and properties of metric spaces, the contraction mapping theorem, the Baire category theorem, a review of Lebesgue measure theory, mollification techniques and partitions of unity, integrals of functions taking values in a Banach space, a collection of inequalities, and a version of Gronwall’s lemma. These notes are illustrated by 41 figures. Nearly 180 homework problems are collected at the end of the various chapters. A complete set of solutions to the exercises is available to instructors. To obtain a PDF file of the solutions, please contact the author, including a link to your department’s web page listing you as an instructor or professor. It is a pleasure to acknowledge the help I received from colleagues, students, and friends, while preparing these notes. To L. Berlyand, G. Crasta, D. Wei, and others, who spotted a large number of misprints and provided many useful suggestions, I wish to express my gratitude. Alberto Bressan State College, July 2012

Chapter 1

Introduction

This book provides an introduction to linear functional analysis, extending techniques and results of classical linear algebra to infinite-dimensional spaces. With the development of a theory of function spaces, functional analysis yields a powerful tool for the study of linear ordinary and partial differential equations. It provides fundamental insights on the existence and uniqueness of solutions, their continuous dependence on initial or boundary data, the convergence of approximations, and on various other properties. The following remarks highlight some key results of linear algebra and their infinite-dimensional counterparts.

1.1. Linear equations Let A be an n × n matrix. Given a vector b ∈ Rn , a basic problem in linear algebra is to find a vector x ∈ Rn such that (1.1)

Ax = b .

In the theory of linear PDEs, an analogous problem is the following. Consider a bounded open set Ω ⊂ Rn and a linear partial differential operator of the form (1.2)

n n . ij Lu = − (a (x)uxi )xj + bi (x)uxi + c(x)u . i,j=1

i=1

Given a function f : Ω → R, find a function u, vanishing on the boundary of Ω, such that (1.3)

Lu = f . 1

2

1. Introduction

There are fundamental differences between the problems (1.1) and (1.3). The matrix A yields a continuous linear transformation on the finite-dimensional space Rn . On the other hand, the differential operator L can be regarded as an unbounded (hence discontinuous) linear operator on the infinite-dimensional space L2 (Ω). In particular, the domain of L is not the entire space L2 (Ω) but only a suitable subspace. In spite of these differences, since both problems (1.1) and (1.3) are linear, there are a number of techniques from linear algebra that can be applied to (1.3) as well. (I): Positivity Assume that the matrix A is strictly positive definite, i.e., there exists a constant β > 0 such that Ax, x ≥ β |x|2

for all x ∈ Rn .

Then A is invertible and the equation (1.1) has a unique solution for every b ∈ Rn . This result has a direct counterpart for elliptic PDEs. Namely, assume that the operator L is strictly positive definite, in the sense that (after a formal integration by parts) (1.4) ⎛ ⎞ n n ⎝ Lu, uL2 = aij (x)uxi uxj + bi (x)uxi u + c(x)u2 ⎠ dx Ω

i,j=1

i=1

≥ β u 2H 1 (Ω) 0

for some constant β > 0 and all u ∈ H01 (Ω). Here H01 (Ω) is a space of functions which vanish on the boundary of Ω and such that u H01 (Ω)

. =

1/2

|u| dx +

|∇u| dx

2

Ω

2

< ∞;

Ω

see Chapter 8 for precise definitions. If (1.4) holds, one can then prove that the problem (1.3) has a unique solution u ∈ H01 (Ω), for every given f ∈ L2 (Ω). A key assumption, in order for the inequality (1.4) to hold, is that the operator L should be elliptic. Namely, at each point x ∈ Ω the n × n matrix (aij (x)) should be strictly positive definite.

1.1. Linear equations

3

(II): Fredholm alternative A well-known criterion in linear algebra states that the equation (1.1) has a unique solution for every given b ∈ Rn if and only if the homogeneous equation Ax = 0 has only the solution x = 0. Of course, this holds if and only if the matrix A is invertible. In general, continuous linear operators on an infinite-dimensional space X do not share this property. Indeed, one can construct a bounded linear operator Λ : X → X which is one-to-one but not onto, or conversely. Yet, the finite-dimensional theory carries over to an important class of operators, namely, those of the form Λ = I − K, where I is the identity and K is a compact operator. If Λ is in this class, then one can still prove the equivalence Λ is one-to-one

⇐⇒

Λ is onto.

By an application of this theory it follows that, for a linear elliptic operator, the equation (1.3) has a unique solution u ∈ H01 (Ω) for every f ∈ L2 (Ω) if and only if the homogeneous equation Lu = 0 has only the zero solution. (III): Diagonalization If one can find a basis {v1 , . . . , vn } of Rn consisting of eigenvectors of A, then with respect to this basis the system (1.1) takes a diagonal form and is thus easy to solve. For a general matrix A with multiple eigenvalues, it is well known that such a basis of eigenvectors need not exists. A positive result in this direction is the following. If the n × n matrix A is symmetric, then one can find an orthonormal basis {v1 , . . . , vn } of the Euclidean space Rn consisting of eigenvectors of A. Namely,

1 if i = j , vi , vj = Avk = λk vk . 0 if i = j , Here λ1 , . . . , λn ∈ R are the corresponding eigenvalues. The solution x of (1.1) can now be found by computing its coefficients c1 , . . . , cn with respect

4

1. Introduction

to the orthonormal basis: n x = ck vk ,

Ax =

k=1

n

λk ck vk = b =

k=1

n

b, vk vk .

k=1

Notice that, thanks to the basis of eigenvectors, the problem becomes decoupled. Instead of a large system of n equations in n variables, we only need to solve n scalar equations, one for each coefficient ck . If all eigenvalues λk are nonzero, we thus have the explicit formula (1.5)

n 1 x = b, vk vk . λk k=1

One can adopt the same approach in the analysis of the elliptic operator L in (1.2), provided that aij = aji and bi (x) = 0. Indeed, these conditions make the operator “symmetric”. One can then find a countable orthonormal basis {φ1 , φ2 , . . . } of the space L2 (Ω) consisting of functions φk ∈ H01 (Ω) such that

1 if i = j , (1.6) φi , φj L2 = Lφk = λk φk , 0 if i = j , for a suitable sequence of real eigenvalues λk → +∞. Assuming that λk = 0 for all k, the unique solution of (1.3) can now be written explicitly as (1.7)

u =

∞ 1 f, φk L2 φk . λk k=1

Notice the close resemblance between the formulas (1.5) and (1.7). In essence, one only needs to replace the Euclidean inner product on Rn by the inner product on L2 (Ω).

1.2. Evolution equations Let A be an n × n matrix. For a given initial state b ∈ Rn , consider the Cauchy problem d x(t) = Ax(t), x(0) = b . dt According to linear ODE theory, this problem has a unique solution: (1.8)

(1.9)

x(t) = etA b ,

where (1.10)

etA =

∞ tk Ak . k! k=0

1.2. Evolution equations

5

Notice that the right-hand side of (1.10) is defined as a convergent series of n × n matrices. Here A0 = I is the identity matrix. The family of matrices {etA ; t ∈ R} has the “group property”, namely e0 A = I ,

etA esA = e(t+s)A

for all t, s ∈ R .

If A is symmetric, then it admits an orthonormal basis of eigenvectors {v1 , . . . , vn }, with corresponding eigenvalues λ1 , . . . , λn . In this case, the solution (1.9) can be written more explicitly as etA b =

n

etλk b, vk vk .

k=1

The theory of linear semigroups provides an extension of these results to unbounded linear operators in infinite-dimensional spaces. In particular, it applies to parabolic evolution equations of the form d u = 0 on ∂Ω , u(t) = −Lu(t), u(0) = g ∈ L2 (Ω) , dt where L is the partial differential operator in (1.2) and ∂Ω denotes the boundary of Ω. When aij = aji and bi (x) = 0, the elliptic operator L is symmetric and the solution can be decomposed along the orthonormal basis {φ1 , φ2 , . . . } of the space L2 (Ω) considered in (1.6). This yields the representation (1.11)

(1.12)

∞ . −tλk u(t) = St g = e g, φk L2 φk ,

t ≥ 0.

k=1

Notice that the operator L is unbounded (its eigenvalues satisfy λk → +∞ as k → ∞). However, the operators St in (1.12) are bounded for every t ≥ 0 (but not for t < 0). The family of linear operators {St ; t ≥ 0} is called a linear semigroup, since it has the semigroup properties S0 = I ,

St ◦ Ss = St+s

for all s, t ≥ 0 .

. Intuitively, we could think of St as an exponential operator: St = e−Lt . However, since L is unbounded, one should be aware that an exponential formula such as (1.10) is no longer valid. When the explicit formula (1.12) is not available, the operators St must be constructed using some different approximation method. In the finite-dimensional case, the exponential of a matrix A can be recovered by −n t tA (1.13) e = lim I − A n→∞ n and also by (1.14)

etA = lim etAλ , λ→∞

. Aλ = A(I − λ−1 A)−1 .

6

1. Introduction

Remarkably, the two formulas (1.13)–(1.14) retain their validity also for a wide class of unbounded operators on infinite-dimensional spaces. The hyperbolic initial value problem

u(0) = g , (1.15) utt + Lu = 0, ut (0) = h ,

u = 0 on ∂Ω ,

can also be treated by similar methods. The finite-dimensional counterpart of (1.15) is the system of secondorder linear equations d2 d x(t) + Ax(t) = 0, x(0) = a , x(0) = b . 2 dt dt Here x , a, b ∈ Rn and A is an n × n matrix. Denoting time derivatives . ˙ (1.16) can be written as a first-order by an upper dot and setting y = x, system: 0 I x x(0) a x˙ (1.17) = , = . y˙ −A 0 y y(0) b (1.16)

The same results valid for first-order linear ODEs can thus be applied here. If A is symmetric, then it has an orthonormal basis of eigenvectors {v1 , . . . , vn } with corresponding eigenvalues λ1 , . . . , λn . In this case, the solution of (1.16) can be written as n (1.18) x(t) = ck (t)vk . k=1

Each coefficient ck (·) can be independently computed, by solving the secondorder scalar ODE d2 ck (t) + λk ck (t) = 0 , dt2

ck (0) = a, vk ,

d ck (0) = b, vk . dt

Returning to the problem (1.15), if the elliptic operator L is symmetric, then the solution can again be decomposed along the orthonormal basis {φ1 , φ2 , . . . } of the space L2 (Ω) considered in (1.6). This yields the entirely similar representation (1.19)

u(t) =

∞

ck (t) φk

t ≥ 0,

k=1

where each function ck is determined by the equations d2 ck (t) + λk ck (t) = 0 , dt2

ck (0) = g, φk L2 ,

d ck (0) = h, φk L2 . dt

1.4. Compactness

7

1.3. Function spaces In functional analysis, a key idea is to regard functions f : Rn → R as points in an abstract vector space. All the information about a function is condensed in one single number f , which we call the norm of f . Typically, the norm measures the “size” of f and of its partial derivatives up to some order k. It is remarkable that so many results can be achieved in such an economical way, relying only on this single concept, coupled with the structure of vector space. This accounts for the wide success of functional analytic methods. Toward all applications of functional analysis to integral or differential equations, one needs to develop a theory of function spaces. In this direction, it is natural to consider the spaces C k of functions with bounded continuous partial derivatives up to order k. The “size” of a function f ∈ C k (Rn ) is here measured by the norm . f C k = maxα1 +···+αn ≤k sup ∂xα11 · · · ∂xαnn f (x) . x∈Rn

The spaces C k , however, are not always appropriate for the study of PDEs. Indeed, from physical or geometrical considerations one can often provide estimates not on the maximum value of a solution and its derivatives, but on their Lp norm, for some p ≥ 1. This motivates the introduction of the Sobolev spaces W k,p , containing all functions whose derivatives up to order k lie in Lp . The “size” of a function f ∈ W k,p (Rn ) is now measured by the norm ⎛ ⎞1/p p . α1 f W k,p = ⎝ ∂x1 · · · ∂xαnn f (x) dx⎠ . α1 +···+αn ≤k

Rn

Because of their fundamental role in PDE theory, all of Chapter 8 will be devoted to the study of Sobolev spaces.

1.4. Compactness When solving an equation, if an explicit formula for the solution is not available, a common procedure relies on three steps: (i) Construct a sequence of approximate solutions (un )n≥1 . (ii) Extract a convergent subsequence unj → u ¯. (iii) Prove that the limit u ¯ is a solution. RN

When we reach step (ii), a major difference between the Euclidean space and abstract function spaces is encountered. Namely, in RN all closed

8

1. Introduction

bounded sets are compact. Otherwise stated, in RN the Bolzano-Weierstrass theorem holds: • From every bounded sequence (un )n≥1 one can extract a convergent subsequence. As proved in Chapter 2 (see Theorem 2.22), this crucial property is valid in every finite-dimensional normed space but fails in every infinite-dimensional one. In a space of functions, showing that a sequence of approximate solutions is bounded, i.e., un ≤ C for some constant C and all n ≥ 1, does not guarantee the existence of a convergent subsequence. To overcome this fundamental difficulty, two main approaches can be adopted. (i) Introduce a weaker notion of convergence. Prove that every bounded sequence (also in an infinite-dimensional space) has a subsequence which converges in this weaker sense. A key result in this direction, the Banach-Alaoglu theorem, will be proved at the end of Chapter 2. Weak convergence in Hilbert spaces is discussed in Chapter 5. (ii) Consider two distinct norms, say u weak ≤ u strong , with the following property. If a sequence (un )n≥1 is bounded in the strong norm, i.e., un strong ≤ C, then there exists a subsequence that converges in the weak norm: unj − u ¯ weak → 0, for some limit u ¯. Ascoli’s theorem, proved in Chapter 3, and the Rellich-Kondrahov compact embedding theorem, proved in Chapter 8, yield different settings where this approach can be implemented. A large portion of the analysis of partial differential equations ultimately relies on the derivation of a priori estimates. It is the nature of the problem at hand that dictates what kind of a priori bounds one can expect, and hence in which function spaces the solution can be found. This motivates the variety of function spaces which are currently encountered in literature. While the techniques of functional analysis are very general and yield results of fundamental nature in an intuitive and economical way, one should be aware that only some aspects of PDE theory can be approached by functional analytic methods alone. Typically, the solutions constructed by these abstract methods lie in a Sobolev space of functions that possess just the minimum amount of regularity needed to make sense of the equations. For several elliptic and parabolic equations, it is known that solutions enjoy a much higher regularity. However, this regularity can only be established by a more detailed analysis. Further properties, such as the maximum principle

1.4. Compactness

9

for elliptic or parabolic equations and the finite propagation speed for hyperbolic equations, also require additional techniques, specifically designed for PDEs. For these issues, which are not within the scope of the present lecture notes, we refer to the monographs [E, GT, McO, P, PW, RR, S, T].

Chapter 2

Banach Spaces

Given a vector space X over the real or complex numbers, we wish to introduce a distance d(·, ·) between points of X. This will allow us to define limits, convergent sequences and series, and continuous mappings. Since X has the algebraic structure of a vector space, the distance d should be consistent with this structure. It is thus natural to require the following properties: (p1) The distance d is invariant under translations. Namely: d(x, y) = d(x + z, y + z) for every x, y, z ∈ X. In particular, d(x, y) = d(x − y, 0). (p2) The distance d is positively homogeneous. Namely: d(λx, λy) = |λ| d(x, y), for any scalar number λ and any x, y ∈ X. . (p3) Every open ball B(x0 , r) = {x ∈ X ; d(x, x0 ) < r} is a convex set. The invariance under translations implies that the distance d(·, ·) is entirely determined as soon as we specify the function x → x = d(x, 0), i.e., the distance of a point x from the origin. This is what we call the norm of a vector x ∈ X. It can be taken as the starting point for the entire theory.

2.1. Basic definitions Let X be a (possibly infinite-dimensional) vector space over the field K of numbers. We shall always assume that K is either the field of real numbers R or the field of complex numbers C . A norm on X is a map x → x from X into R, with the following properties. 11

12

2. Banach Spaces

(N1) For every x ∈ X one has x ≥ 0, with equality holding if and only if x = 0. (N2) For every x ∈ X and λ ∈ K one has λx = |λ| x . (N3) For every x, y ∈ X one has x + y ≤ x + y . A vector space X with a norm · satisfying (N1)–(N3) is called a normed space. In turn, a norm determines a distance between elements of X. Lemma 2.1 (Distance defined by a norm). Let · be a norm on the vector space X. Then . (2.1) d(x, y) = x − y defines a distance between points of X. Moreover, this distance has the additional properties of translation invariance, positive homogeneity, and convexity stated in (p1)–(p3) above. Proof. 1. We check that, for all x, y, z ∈ X, the three basic properties of a distance are satisfied: (D1) d(x, y) ≥ 0 for all x, y ∈ X with equality holding if and only if x = y; (D2) d(x, y) = d(y, x); (D3) d(x, z) ≤ d(x, y) + d(y, z). Indeed, (D1) is an immediate consequence of (N1). To prove (D2) we write d(y, x) = y − x = (−1)(x − y) = | − 1| x − y = x − y = d(x, y). The triangle inequality follows from (N3), replacing x, y with x−y and y −z, respectively: d(x, z) = x−z = (x−y)+(y−z) ≤ x−y + y−z = d(x, y)+d(y, z). 2. The property (p1) of translation invariance follows immediately from the definition. The homogeneity property (p2) follows from d(λx, λy) = λ(x − y) = |λ| x − y = |λ| d(x, y) . Finally, to check that every open ball is convex, by translation invariance it suffices to prove that every ball centered at the origin is convex. If x, y ∈ B(0, r) and 0 ≤ θ ≤ 1, then the convex combination satisfies θx + (1 − θ)y ≤ |θ| x + |1 − θ| y < θ r + (1 − θ)r = r .

2.1. Basic definitions

13

Hence θx + (1 − θ)y ∈ B(0, r), which proves that the open ball B(0, r) is convex. The distance d(x, y) = x−y determines a topology on the vector space X. We can thus talk about open sets, closed sets, convergent sequences, and continuous mappings. Throughout the following, the open and the closed balls centered at a point x with radius r > 0 are denoted respectively by

. . B(x, r) = y ∈ X ; y − x < r , B(x, r) = y ∈ X ; y − x ≤ r . We recall that a set V ⊆ X is open if, for every x ∈ V , there exists r > 0 such that B(x, r) ⊆ V . A set U ⊆ X is closed if its complement X \ U is open. A sequence (xn )n≥1 converges to a point x ¯ ∈ X if lim xn − x ¯ = 0.

n→∞

Given a series of elements of X, we say that the series converges to x ¯, and write ∞ yk = x ¯, k=1

if the sequence of partial sums converges to x ¯, namely n lim x ¯− yk = 0 . n→∞ k=1

Given two normed spaces X, Y , we say that a map f : X → Y is continuous if, for every x ∈ X and ε > 0, there exists δ > 0 such that f (x ) − f (x) < ε

whenever x ∈ X, x − x < δ.

A sequence (xn )n≥1 is a Cauchy sequence if, for every ε > 0, one can find an integer N large enough so that xm − xn ≤ ε

whenever m, n ≥ N.

A normed space X is complete if every Cauchy sequence converges to some limit point x ¯ ∈ X. A complete normed space is called a Banach space.

14

2. Banach Spaces

Figure 2.1.1. The unit balls in R2 with norms · 1 (left), · 2

(center), and · ∞ (right), described in Example 2.3.

2.1.1. Examples of normed and Banach spaces. Example 2.2. The finite-dimensional space Rn = {x = (x1 , . . . , xn ), xi ∈ R} with Euclidean norm . (2.2) x 2 = x21 + · · · + x2n is a Banach space over the real numbers. In particular, the field R of all real numbers can be regarded as a 1-dimensional Banach space. In this case, the norm of a number x ∈ R is provided by its absolute value. Example 2.3. On the space Rn one can consider the alternative norms 1/p . . x p = |x1 |p + · · · + |xn |p , x ∞ = max1≤i≤n |xi |. Here 1 ≤ p < ∞. Each of these norms also makes Rn into a finite-dimensional Banach space. Example 2.4. For any closed bounded interval [a, b], the space C 0 ([a, b]) of all continuous functions f : [a, b] → R, with norm . (2.3) f C 0 = maxx∈[a,b] |f (x)| , is a Banach space. Example 2.5. Let Ω be an open subset of Rn . For every 1 ≤ p < ∞, p consider the space pL (Ω) of all Lebesgue measurable functions f : Ω → R such that Ω |f (x)| dx < ∞. This is a Banach space, equipped with the norm 1/p . p (2.4) f Lp = |f (x)| dx . Ω

Two functions f, f˜ are regarded here asthe same element of Lp(Ω) if they coincide almost everywhere, i.e., if meas {x ∈ Ω ; f (x) = f˜(x)} = 0.

2.1. Basic definitions

15

Similarly, the space L∞ (Ω) of all essentially bounded, measurable functions f : Ω → R is a Banach space with norm . (2.5) f L∞ = ess sup |f (x)|. x∈Ω

Example 2.6. For a fixed p ≥ 1, consider the space of all sequences of real numbers whose p-th powers are summable: ∞ p . (2.6) = x = (x1 , x2 , . . .) ; |xk |p < ∞ . k=1

This is a Banach space with norm (2.7)

. x p =

∞

1/p |xk |p

.

k=1

Example 2.7. The space ∞ of all bounded sequences of real numbers, with norm . (2.8) x ∞ = sup |xk |, k≥1

is a Banach space. Within this space, one can consider the subspace c0 of all sequences (xk )k≥1 that converge to zero as k → ∞. This is also a Banach space, for the same norm (2.8). Remark 2.8. Within the space p , 1 ≤ p < ∞, consider the family of unit vectors (2.9) e1 = (1, 0, 0, 0, . . .), e2 = (0, 1, 0, 0, . . .), e3 = (0, 0, 1, 0, . . .), . . . . These are linearly independent. The set of all linear combinations1 N (2.10) span{ek ; k ≥ 1} = θk ek ; N ≥ 1, θk ∈ R k=1

does not coincide with the entire space p , but is a dense subset of p . Indeed, it consists of all sequences of the form x = (x1 , x2 , . . . , xN , 0, 0, 0, . . .), having finitely many nonzero entries. The set {ek ; k ≥ 1} is not an algebraic basis for p , but it provides a topological basis. Namely, every element x = (x1 , x2 , . . .) ∈ p can be obtained as the sum of the convergent series ∞ x = xk ek . k=1 1 Notice

that by a linear combination one always means a finite sum, not a series.

16

2. Banach Spaces

Remark 2.9. On the field of real or complex numbers, the absolute value |α| measures how big the number α is. Similarly, on a vector space X one can think of the norm f as measuring the size of an element f ∈ X. We observe, however, that it is possible to adopt different norms (and hence different distances) on the same space of functions, obtaining quite different convergence results. This is illustrated by the next example. Example 2.10. Let X be the space of all polynomial functions on the interval [0, 1]. On X we can consider the two norms 1 . . (2.11) f C 0 = maxx∈[0,1] |f (x)|, f L1 = |f (x)| dx . 0

These norms yield different convergence results (see Figure 2.1.2). For example, consider the sequence of monomials fn (x) = xn . Letting n → ∞ one has fn L1 → 0. Therefore the sequence (fn )n≥1 converges to zero (i.e., to the identically zero function) in the L1 norm. On the other hand, fn C 0 = 1 for all n ≥ 1. In terms of the C 0 norm, this same sequence is not a Cauchy sequence and does not have any limit.

Figure 2.1.2. Left: the L1 distance between the two functions f, g,

measured by the area of the shaded region, is small. However, their x) − g(¯ x)| is large. Right: C 0 distance, measured by f − g C 0 = |f (¯ the sequence of polynomials fn (x) = xn converges to zero in the space L1 ([0, 1]) but does not have any limit in the space C 0 ([0, 1]).

2.2. Linear operators Let X, Y be normed spaces on the same field K of scalar numbers. A linear operator is a mapping Λ from a subspace Dom(Λ) ⊆ X into Y such that Λ(c1 x1 + c2 x2 ) = c1 Λx1 + c2 Λx2

for all x1 , x2 ∈ Dom(Λ), c1 , c2 ∈ K.

Here Dom(Λ) is the domain of Λ. The range of Λ is the subspace . Range(Λ) = {Λx ; x ∈ Dom(Λ)} ⊂ Y.

2.2. Linear operators

17

The null space or kernel of Λ is the subspace . Ker(Λ) = {x ∈ X ; Λx = 0} ⊆ X. Notice that Λ is one-to-one if and only if Ker(Λ) = {0}. Next, consider a linear operator Λ : X → Y defined on the entire space X, i.e., with Dom(Λ) = X. We say that Λ is bounded if . (2.12) Λ = sup Λx < ∞. x≤1

Theorem 2.11 (Continuity of bounded operators). A linear operator Λ : X → Y is bounded if and only if it is continuous. Proof. 1. If Λ is continuous, then in particular it is continuous at the origin. Hence there exists δ > 0 such that x ≤ δ implies Λx ≤ 1. By linearity, this implies that 1 Λx ≤ whenever x ≤ 1 . δ Hence Λ is bounded. 2. Conversely, let Λ be bounded, so that (2.12) holds. By linearity, we obtain x1 − x2 ≤ x1 −x2 Λ . Λx1 −Λx2 = Λ(x1 −x2 ) = x1 −x2 Λ x1 − x2 Hence Λ is uniformly Lipschitz continuous with Lipschitz constant Λ .

If X, Y are normed spaces over the same field of scalar numbers, we denote by B(X; Y ) the space of bounded linear operators from X into Y . Notice that here we require that the domain of these operators should be the entire space X. Theorem 2.12 (The space of bounded linear operators). The space B(X; Y ) of all bounded linear operators from X into Y is a normed space, with norm defined at (2.12). If Y is a Banach space, then B(X; Y ) is a Banach space. Proof. If Λ1 , Λ2 are linear operators, and c1 , c2 ∈ K, then their linear combination is defined as . (c1 Λ1 + c2 Λ2 )(x) = c1 Λ1 x + c2 Λ2 x . We now check that the properties (N1)–(N3) of a norm are satisfied. 1. If Λ = 0 is the zero operator, then Λx = 0 for all x ∈ X, and Λ = 0. On the other hand, if Λ is not the zero operator, then Λx = 0 for some

18

2. Banach Spaces

x 1 x = 0. Hence Λ( x ) = x Λx = 0 and the supremum in (2.12) is strictly positive. This proves (N1).

2. If α ∈ K, then (N2) follows by the identities αΛ = sup αΛx = |α| sup Λx = |α| Λ . x≤1

x≤1

3. To check the triangle inequality (N3), for every x ∈ X with x ≤ 1 we write (Λ1 + Λ2 )x = Λ1 x + Λ2 x ≤ Λ1 x + Λ2 x ≤ Λ1 + Λ2 . Taking the supremum over all x ∈ X with x ≤ 1 we obtain (N3). 4. Next, assume that Y is a Banach space. We need to show that B(X; Y ) is complete. Let (Λn )n≥1 be a Cauchy sequence of bounded linear operators. For each x ∈ X, this implies lim sup Λm x − Λn x ≤ lim sup Λm − Λn x = 0. m,n→∞

m,n→∞

Therefore the sequence of points (Λn x)n≥1 is Cauchy in Y . Since Y is complete, this sequence has a unique limit, which we call Λx. Since every Λn is a linear operator, it is clear that the Λ is linear as well. We claim that Λ is also bounded (and hence continuous). By assumption, we can choose N large enough such that Λk − ΛN ≤ 1

for all k ≥ N.

Therefore, for any x ∈ X with x ≤ 1 one has Λx = lim Λk x ≤ ΛN x + lim sup Λk − ΛN x ≤ ΛN + 1 . k→∞

k→∞

Since x was an arbitrary point in the unit ball, this proves that the limit operator Λ is bounded, and hence Λ ∈ B(X; Y ). 2.2.1. Examples of linear operators. Example 2.13 (Matrices as linear operators). Every n × m matrix A = (aij ) determines a bounded linear operator Λ : Rm → Rn defined by Λ(x1 , . . . , xm ) = (y1 , . . . , yn ),

with

m . yi = aij xj . j=1

Example 2.14 (Diagonal operators on a space of sequences). Let 1 ≤ p ≤ ∞, and consider the space X = p of all sequences x = (x1 , x2 , . . .)

2.2. Linear operators

19

of real numbers, with norm defined as (2.7) or (2.8). Let (λ1 , λ2 , . . .) be an arbitrary sequence of real numbers, and define the operator Λ : X → X as . (2.13) Λ(x1 , x2 , x3 , . . .) = (λ1 x1 , λ2 x2 , λ3 x3 , . . . ). With reference to the basis of unit vectors {e1 , e2 , . . .} in (2.9), we can think of Λ as an infinite matrix: ⎛ ⎞ λ1 0 ⎜ 0 λ2 ⎟ ⎜ ⎟ ⎜ ⎟ λ 3 ⎝ ⎠ .. . with λ1 , λ2 , . . . along the diagonal and 0 everywhere else. We now have two cases. (i) If the sequence (λk )k≥1 is bounded, then the operator Λ is bounded. Its norm is Λ = sup |λk | .

(2.14)

k

(ii) If the sequence (λk )k≥1 is unbounded, then the operator Λ is not bounded. Its domain Dom(Λ) = x ∈ p ; Λx ∈ p is a vector subspace, strictly contained in p . Example 2.15 (Differentiation operator). Consider the open interval I = ]0, π[ and let X = BC(I) be the space of bounded continuous real-valued functions on I, with norm f =

sup 0 0 . More generally, any seminorm also satisfies (2.25). Observe that (2.25) implies that the function p is convex. Indeed p(θx + (1 − θ)y) ≤ p(θx) + p((1 − θ)y) = θp(x) + (1 − θ)p(y) for all x, y ∈ X, θ ∈ [0, 1]. However, compared with a seminorm, here we also allow for p(x) < 0. Moreover, we do not require p to be symmetric with respect to the origin. In other words, one may well have p(x) = p(−x). Example 2.28. Let X be a normed space and let Ω ⊂ X be a bounded, open, convex set containing the origin. Then the functional . (2.26) p(x) = inf {λ ≥ 0 ; x ∈ λΩ} satisfies the assumptions (2.25). Theorem 2.29 (Hahn-Banach extension theorem). Let X be a vector space over the reals and let p : X → R be a map with the properties (2.25). Consider a subspace V ⊆ X and let f : V → R be a linear functional such that (2.27)

f (x) ≤ p(x)

for all x ∈ V .

Then there exists a linear functional F : X → R such that F (x) = f (x) for all x ∈ V and (2.28)

−p(−x) ≤ F (x) ≤ p(x)

for all x ∈ X .

Proof. 1. If V = X, observing that f (x) = −f (−x) ≥ −p(−x), the conclusion is clear. If V = X, choose any vector x0 ∈ / V and consider the strictly larger subspace . V0 = {x + tx0 ; x ∈ V, t ∈ R}. For every x, y ∈ V , the bound on f yields f (x) + f (y) = f (x + y) ≤ p(x + y) ≤ p(x − x0 ) + p(x0 + y).

28

2. Banach Spaces

Therefore f (x) − p(x − x0 ) ≤ p(y + x0 ) − f (y) for all x, y ∈ V. . Choosing β = supx∈V f (x) − p(x − x0 ) , we have (2.29) f (x) − p(x − x0 ) ≤ β ≤ p(y + x0 ) − f (y)

for all x, y ∈ V .

2. We now extend f to a linear functional defined on the larger space V0 , by setting . f (x + tx0 ) = f (x) + β t, x ∈ V, t ∈ R . We claim that this extension still satisfies (2.30)

f (x + tx0 ) ≤ p(x + tx0 )

for all x ∈ V, t ∈ R .

Indeed, if t = 0, the above inequality follows from our initial assumptions. If t > 0, replacing both x and y by x/t in (2.29) we obtain x x x x t f ≤ tβ ≤ t p −p − x0 + x0 − f , t t t t f (x) − p(x − tx0 ) ≤ tβ ≤ p(x + tx0 ) − f (x) . Therefore, for x ∈ V and t ≥ 0 we have f (x − tx0 ) = f (x) − βt ≤ p(x − tx0 ) , f (x + tx0 ) = f (x) + βt ≤ p(x + tx0 ), proving (2.30). 3. The previous two steps show that every bounded linear functional f defined on a proper subspace V ⊂ X can be extended to a strictly larger subspace, still satisfying the inequality (2.27). To complete the proof, we shall use a maximality principle. Let F be the family of all couples (V, φ), where V is a subspace of X and φ : V → R is a linear functional such that φ(x) ≤ p(x)

for all x ∈ V.

This family can be partially ordered by setting ˜ if and only if V ⊆ V and φ coincides with the (V, φ) (V , φ) restriction of φ˜ to V . By the Hausdorff Maximality Principle (see Theorem A.1 in the Appendix), the partially ordered family F contains a maximal element, say (V max , F ). If V max = X, then by the previous step the linear functional F

2.5. Extension theorems

29

could be extended to a strictly larger subspace, in contradiction to the maximality assumption. Hence V max = X and F : X → R is a linear functional such that F (x) ≤ p(x) for all x ∈ X. By linearity, this implies that −p(−x) ≤ −F (−x) = F (x) ,

completing the proof.

Figure 2.5.1. By the Hahn-Banach theorem, a linear functional f : V → R defined on a subspace V ⊂ X and such that f (x) ≤ p(x) for all x ∈ V can be extended to a linear functional F : X → R satisfying F (y) ≤ p(y) for all y ∈ X.

The previous theorem has a natural application to the case where p(x) = x is a norm. Theorem 2.30 (Extension theorem for bounded linear functionals). Let X be a normed space over the field K of real or complex numbers. Let f : V → K be a bounded linear functional defined on a subspace V ⊆ X. Then f can be extended to a bounded linear functional F : X → K having the same norm: . . F = sup |F (x)| = sup |f (x)| = f . x∈X,x≤1

x∈V,x≤1

. Proof. 1. First assume that K = R, so f is real-valued. Set κ = f , and . define p(x) = κ x . Then the result follows immediately from the previous theorem.

30

2. Banach Spaces

2. Next, consider the case where K = C is the field of complex numbers. Notice that in this case, V and X can also be regarded as vector spaces over the real numbers. The functional F : X → C will be obtained by constructing separately its real and imaginary parts. . For x ∈ V , define u(x) = Re f (x). This is a real-valued linear functional . on V with norm u ≤ κ = f . Hence by the previous steps it admits an extension U : X → R with norm U ≤ κ. We claim that the map . F (x) = U (x) − iU (ix) satisfies all requirements. Indeed, for x ∈ V , F (x) = Re f (x) − i Re f (ix) = Re f (x) + i Im f (x) = f (x). Moreover, let α ∈ C be such that |α| = 1, αF (x) = |F (x)|. Then |F (x)| = αF (x) = U (αx) ≤ κ αx = κ x . . Hence F ≤ κ = f .

Corollary 2.31. Let X be a Banach space. For any two vectors x, y ∈ X with x = y, there exists a continuous linear functional φ : X → K such that φ(x) = φ(y). Corollary 2.32. Let X be a Banach space. For every vector x ∈ X, there exists a continuous linear functional φ : X → K such that φ(x) = x and φ = 1.

Figure 2.5.2. Left: if A is open, the disjoint convex sets A, B can

be separated by a hyperplane. Right: If A is compact and B is closed, the disjoint convex sets A, B can be strictly separated.

2.6. Separation of convex sets Consider the following problem. Given two disjoint convex sets A, B in a normed space X, can one find a bounded linear functional φ : X → R such that the images φ(A) and φ(B) are disjoint? The following theorem provides a positive answer, relying on the Hahn-Banach extension theorem.

2.6. Separation of convex sets

31

Theorem 2.33 (Separation of convex sets). Let X be a normed space over the reals, and let A, B be nonempty, disjoint convex subsets of X. (i) If A is open, then there exists a bounded linear functional φ : X → R and a number c ∈ R such that (2.31)

φ(a) < c ≤ φ(b)

for all a ∈ A , b ∈ B .

(ii) If A is compact and B is closed, then there exists a bounded linear functional φ : X → R and numbers c1 , c2 ∈ R such that (2.32)

φ(a) ≤ c1 < c2 ≤ φ(b)

for all a ∈ A , b ∈ B .

. Proof. 1. Choose points a0 ∈ A and b0 ∈ B and set x0 = b0 − a0 . Consider the open set . . Ω = A − B + x0 = (a − a0 ) + (b0 − b) ; a ∈ A, b ∈ B . Since A, B are convex and A is open, it is clear that Ω is an open, convex neighborhood of the origin. Moreover, x0 ∈ / Ω, because otherwise x0 = a − b + x0 ,

a − b = 0,

for some a ∈ A, b ∈ B.

This is a contradiction because A ∩ B = ∅. 2. Consider the functional (2.33)

. p(x) = inf {λ ≥ 0 ; x ∈ λΩ}.

Since Ω is a neighborhood of the origin, we have B(0, ρ) ⊆ Ω for some ρ > 0. Hence x (2.34) p(x) ≤ for all x ∈ X. ρ Moreover, the convexity of Ω implies that (2.35) p(x + y) ≤ p(x) + p(y) , p(tx) = t p(x) for all x, y ∈ X , t ≥ 0. Notice that p(x0 ) ≥ 1 because x0 ∈ / Ω. . 3. On the one-dimensional subspace V = {tx0 ; t ∈ R}, define the linear . functional f by setting f (tx0 ) = t. Observe that f (x0 ) = 1 ,

f (tx0 ) = t ≤ tp(x0 ) ≤ p(tx0 ) .

By the Hahn-Banach extension theorem, there exists a linear functional φ : X → R such that −p(−x) ≤ φ(x) ≤ p(x)

for all x ∈ X .

By (2.34) it is clear that φ is bounded. Indeed, φ ≤ ρ−1 .

32

2. Banach Spaces

4. If now a ∈ A and b ∈ B, we have φ(a) − φ(b) + 1 = φ(a − b + x0 ) ≤ p(a − b + x0 ) < 1 because φ(x0 ) = f (x0 ) = 1 while a − b + x0 ∈ Ω and Ω is open. Therefore φ(a) < φ(b)

for all a ∈ A , b ∈ B .

The sets φ(A) and φ(B) are nonempty, disjoint convex sets of R, with φ(A) . open. Taking c = supa∈A φ(a), the conclusion (2.31) is satisfied. This proves (i). 5. To prove (ii), observe that the assumptions on A, B imply . d(A, B) = inf a − b ; a ∈ A, b ∈ B > 0 . . . If we choose ρ = d(A, B), then the open neighborhood Aρ = {x ∈ X; d(x, A) < ρ} of radius ρ around A does not intersect B. We can thus apply part (i) to the disjoint convex sets Aρ and B. This yields a continuous linear functional φ : X → R and a constant c2 such that φ(x) < c2 ≤ φ(y)

for all x ∈ Aρ , y ∈ B .

We now observe that the set φ(A) is compact, being the image of a compact . set by a continuous map. Hence c1 = supx∈A φ(x) < c2 . This achieves the proof of (ii).

Figure 2.6.1. The construction used in the proof of the separation theorem.

2.7. Dual spaces and weak convergence Let X be a Banach space over the field K of real or complex numbers. The set of all continuous linear functionals ϕ : X → K is called the dual space of X and denoted by X ∗ . Observe that a linear functional ϕ : X → K can be regarded as a linear operator ϕ : X → Y , in the special case where Y = K. We can thus use

2.7. Dual spaces and weak convergence

33

Theorem 2.12 and conclude that X ∗ = B(X, K) is a Banach space, with norm . (2.36) ϕ ∗ = sup |ϕ(x)| . x≤1

2.7.1. Weak convergence. A sequence x1 , x2 , . . . in a Banach space X is weakly convergent if there exists x ∈ X such that lim ϕ(xn ) = ϕ(x)

n→∞

for all ϕ ∈ X ∗ .

In this case, we say that x is the weak limit of the sequence xn and write xn x. We recall that the sequence xn converges strongly to x if xn −x → 0. Since by definition every ϕ ∈ X ∗ is continuous, the strong convergence xn → x clearly implies the weak convergence xn x. If a weak limit exists, then it is necessarily unique: xn x and xn y imply x = y. Indeed, assume y = x. Then by Corollary 2.31 there exists a continuous linear functional φ ∈ X ∗ such that φ(x) = φ(y). This leads to a contradiction, because φ(x) = lim φ(xn ) = φ(y) . n→∞

2.7.2. Weak-star convergence. One can take a different perspective, and observe that each vector x ∈ X determines a linear functional on X ∗ , namely (2.37)

ϕ → ϕ(x),

ϕ ∈ X ∗.

By Corollary 2.32, the norm of this functional is . (2.38) x ∗∗ = sup |ϕ(x)| = x . ϕ∗ ≤1

We thus have a canonical embedding ι : X → (X ∗ )∗ . To each element x ∈ X there corresponds a bounded linear functional ι(x) on the dual space X ∗ , namely, the map ϕ → ϕ(x). By (2.38), this embedding is isometric, i.e., it preserves the norm. If every bounded linear functional on X ∗ is of the form (2.37) for some x ∈ X, then ι(X) = (X ∗ )∗ and we say that the space X is reflexive. Examples of reflexive spaces include all finite-dimensional spaces, and the spaces Lp (Ω), p in Examples 2.5 and 2.6, with 1 < p < ∞. One should be aware that, in general, the space (X ∗ )∗ can be strictly larger than ι(X). Examples of spaces which are not reflexive include the spaces L1 (Ω), L∞ (Ω), 1 , and ∞ .

34

2. Banach Spaces

The embedding ι : X → (X ∗ )∗ can be used to introduce a weak topology on the dual space X ∗ . We say that a sequence of bounded linear functionals ∗ ϕn ∈ X ∗ weak-star converges to ϕ ∈ X ∗ , and write ϕn ϕ, if (2.39)

for every x ∈ X .

lim ϕn (x) = ϕ(x)

n→∞

This is a much weaker property than the convergence ϕn → ϕ in norm. Indeed ∗

ϕn ϕ

means that

ϕn → ϕ

means that

|ϕn (x) − ϕ(x)| → 0 sup x∈X, x≤1

for each x ∈ X ,

|ϕn (x) − ϕ(x)| → 0 .

By Theorem 2.22, if the space X ∗ is infinite-dimensional, then the closed unit ball B1 ⊂ X ∗ is not compact: one can find a sequence of linear functionals ϕn ∈ X ∗ , with ϕn ∗ ≤ 1 for every n ≥ 1 but without any convergent subsequence (with respect to the norm topology of X ∗ ). On the other hand, if instead of strong convergence we only ask for weak-star convergence, a positive result can be achieved. Theorem 2.34 (Banach-Alaoglu). Let X be a separable Banach space. Then every bounded sequence of linear functionals ϕn ∈ X ∗ admits a weakstar convergent subsequence.2 Proof. 1. Consider a bounded sequence of linear functionals ϕn ∈ X ∗ , say, with ϕn ∗ ≤ C for all n ≥ 1. Since X is separable, there exists a dense countable set S = {x1 , x2 , . . .} ⊂ X. 2. We claim that there exists a subsequence (ϕnj )j≥1 which converges pointwise at each point xk , so that (2.40)

lim ϕnj (xk ) = ϕ(xk )

j→∞

for some values ϕ(xk ) and all k ≥ 1. This subsequence will be constructed by a standard diagonalization procedure. Since the sequence of numbers (ϕn (x1 ))n≥1 is bounded, there exists an infinite set of indices I1 ⊂ N such that the subsequence (ϕn (x1 ))n∈I1 converges to some limit ϕ(x1 ). Similarly, the sequence of numbers (ϕn (x2 ))n≥1 is bounded. Hence there exists an infinite set of indices I2 ⊂ I1 such that the subsequence (ϕn (x2 ))n∈I2 converges to some limit ϕ(x2 ). 2 The Banach-Alaoglu theorem remains valid even without the assumption that X is separable. For a proof in this more general case we refer to [C, R].

2.8. Problems

35

By induction, for each k we can find an infinite set of indices Ik ⊂ Ik−1 such that the subsequence (ϕn (xk ))n∈Ik converges to some limit ϕ(xk ). We now choose a subsequence n1 < n2 < n3 < · · · , with nj ∈ Ij for every j. As j → ∞, this yields the convergence ϕnj (xk ) → ϕ(xk ) for every k ≥ 1. 3. We claim that the limit function ϕ : S → K is Lipschitz continuous with constant C = supn ϕn ∗ . Indeed, for every xh , xk ∈ S we have |ϕ(xh ) − ϕ(xk )| = lim |ϕnj (xh ) − ϕnj (xk )| j→∞

≤ lim sup ϕnj ∗ xh − xk ≤ C xh − xk . j→∞

Therefore, the map ϕ can be uniquely extended by continuity to the closure of S, i.e., to the entire space X. This continuous extension will still be denoted by ϕ : X → K. 4. It remains to show that the subsequence ϕnj weak-star converges to ϕ. Let any x ∈ X and ε > 0 be given. Since S is dense, we can choose a point xk ∈ S such that xk − x < ε. Recalling that all functions ϕn and ϕ are Lipschitz continuous with constant C, we obtain lim sup |ϕnj (x) − ϕ(x)| j→∞

≤ lim sup |ϕnj (x) − ϕnj (xk )| j→∞

+ lim sup |ϕnj (xk ) − ϕ(xk )| + |ϕ(xk ) − ϕ(x)| j→∞

≤ C x − xk + 0 + C xk − x ≤ 2Cε . Since ε > 0 was arbitrary, this implies that |ϕnj (x) − ϕ(x)| → 0 as j → ∞, ∗ proving the weak-star convergence ϕnj ϕ. Being the pointwise limit of a uniformly bounded sequence of linear functionals, it is clear that the map ϕ is bounded and linear as well.

2.8. Problems 1. Check if the following are normed spaces. In the negative case, identify which of the properties (N1)–(N3) fails. In the positive case, decide if they are Banach spaces.

36

2. Banach Spaces

(i) Let X = R, with

x =

x −2x

if x ≥ 0, if x < 0.

(ii) Let X be the vector space whose elements are the sequences of real numbers x = (x1 , x2 , x3 , . . .) such that xk = 0 for all except finitely many k. On X consider the norm (2.8). (iii) Let X be the space of all polynomials (of any degree), with norm p = maxx∈[0,1] |p(x)|. (iv) Let X be the space of all polynomials of degree ≤ 2, with norm . p = |p(0)| + |p (0)| + |p (0)|. (v) Let X be the space of all continuous functions on the interval [0, 1], with 1 . f = |f (x)| dx. 0

(vi) Fix κ ∈ R and let X be the space of all continuous functions f : [0, ∞[ → R such that . f = sup eκt |f (t)| < ∞. t≥0

(vii) Let X = R2 . Given x = (x1 , x2 ), for a fixed 0 < p < 1 define 1/p . x = |x1 |p + |x2 |p . Note: in connection with (vii), it is interesting to check whether . d(x, y) = |x1 − y1 |1/2 + |x2 − y2 |1/2 is a distance on R2 . Are the open balls B(x, r) = {y ; d(y, x) < r} convex? Is d(·, ·) translation invariant? 2. Let X, Y be Banach spaces. Prove that the Cartesian product X × Y = (x, y) ; x ∈ X, y ∈ Y is also a Banach space, with norm (2.41)

. (x, y) = max{ x , y }.

3. Let X be a Banach space over the field K of real or complex numbers. Notice that X × X and K × X are Banach spaces, with product norms as in (2.41). Prove: (i) The mapping (x, y) → x + y from X × X into X is continuous. (ii) The mapping (α, x) → αx from K × X into X is continuous.

2.8. Problems

37

4. Let X be a normed space with norm · . Prove that every subspace V ⊂ X is also a normed space, with the same norm. If X is a Banach space and V is closed, then V is also a Banach space. 5. Prove that a normed space X is complete if and only if every absolutely convergent series has a sum:

xk < ∞

k≥1

implies that

n . xk = lim xk

k≥1

n→∞

exists.

k=1

6. Let X be a vector space. The convex hull of a set A ⊂ X is defined as the set of all convex combinations of elements of A, namely N N . co A = θk ak ; N ≥ 1, ak ∈ A , θk ∈ [0, 1], θk = 1 . k=1

k=1

Prove that (i) co A is convex, and (ii) co A is the intersection of all convex sets that contain A. 7. Let X be a Banach space and let A, B ⊂ X. Prove the following statements. (i) If A is open, then co A is open as well. (ii) If A is bounded, then co A is bounded. . (iii) If A, B are bounded, then the set A + B = {a + b ; a ∈ A, b ∈ B} is bounded as well. (iv) If A is closed and B is compact, then A + B is closed. (v) The sum of two closed sets may not be closed. . . (vi) If A is convex, then A+A = {x+y ; x ∈ A , y ∈ A} = 2A = {2x ; x ∈ A}. (vii) If A is closed and A + A = 2A, then A must be convex. 8. Let X be a normed space. We say that a set S ⊆ X is symmetric if a ∈ S implies −a ∈ S. Prove that (i) If S is convex, then its closure is convex as well. (ii) If S is symmetric, then its closure is symmetric as well. 9. Let X, Y be Banach spaces over the real numbers, and let Λ : X → Y be a bounded linear operator. Prove that (i) If S is convex, then its image Λ(S) is convex. (ii) If S is symmetric, then its image Λ(S) is symmetric. 10. Let Λ : X → Y be a linear operator. Assume that, for every sequence xn → 0, the sequence (Λxn )n≥1 is bounded. Prove that Λ is continuous.

38

2. Banach Spaces

11. Let X be an infinite-dimensional Banach space, and let S be a set of linearly independent vectors. By span(S) we denote the set of all (finite) linear combinations of elements of S. Prove that (i) If S = {v1 , . . . , vN } is a finite set, then span(S) is a closed subspace of X. (ii) If S = {vk ; k ≥ 1} is an infinite sequence, then the vector space span(S) cannot be closed in X. 12 (Spaces over the real and over the complex numbers). Let X be a vector space over the complex numbers. Then X is also a vector space over the real numbers. If Φ : X → C is a complex linear functional and φ(x) = Re Φ(x) is its real part, prove that φ is a real linear functional, and Φ(x) can be reconstructed from φ as Φ(x) = φ(x) − iφ(ix). 13. Show that a normed space X is finite-dimensional if and only if its dual X ∗ is finite-dimensional. 14. Consider the spaces p of sequences of real numbers, as in Examples 2.6 and 2.7. If 1 ≤ p ≤ q ≤ ∞, prove that p ⊆ q , with equality holding only if p = q. Moreover, prove that the identity operator Λ : p → q defined as Λx = x is continuous, for every p ≤ q. . 15. For 1 ≤ p < ∞, prove that the subspace V = span{ek ; k ≥ 1} introduced in p (2.10) is dense in the space . For p = ∞, show that the closure of V coincides with the subspace c0 of all sequences that converge to zero. 16. (Properties of diagonal operators) For 1 ≤ p ≤ ∞, consider the operator Λ : p → p defined in (2.13). Prove the claims (i)–(ii) in Example 2.14. 17. Work out the details of Example 2.18. Namely, fix 1 ≤ p ≤ ∞ and let g : Ω → R be any measurable function. (i) If g ∈ L∞ (Ω), prove that the multiplication operator Mg : f → gf from Lp (Ω) into itself is bounded and has norm given by (2.15). (ii) If g ∈ / L∞ (Ω), prove that the linear operator f → gf from Lp (Ω) into itself is unbounded. Give a direct proof that Dom(Mg ) = Lp (Ω). 18. Fix 1 ≤ p ≤ ∞. Let (fn )n≥1 be a sequence of functions in Lp (R), converging weakly to a function f ∈ Lp (R). Prove that b b lim fn (x) dx = f (x) dx for all a < b. n→∞

a

19. Consider the sequence of functions

1/n fn (x) = 0

a

if x ∈ [0, n] , otherwise.

2.8. Problems

39

(i) Prove that fn → 0 strongly (i.e., fn → 0) in every space Lp (R) with 1 < p ≤ ∞. (ii) On the other hand, show that in the space L1 (R) this sequence is not strongly convergent. In fact, this sequence does not even admit any weakly convergent subsequence. 20. Let Y be a subspace of a Banach space X and let Λ : Y → Rn be a bounded linear operator from Y into the Euclidean space Rn . Show that Λ can be extended : X → Rn with norm Λ ≤ √n Λ . to a linear operator Λ 21. Let φ : R2 → R be a linear functional, say φ(x1 , x2 ) = ax1 + bx2 . Give a direct proof that (i) If R2 is endowed with the norm x 1 = |x1 |+|x2 |, then the corresponding operator norm (2.36) is φ ∞ = max{|a|, |b|}. (ii) If R2 is endowed with the norm x ∞ = max{|x1 |, |x2 |}, then the corresponding norm (2.36) is φ 1 = |a| + |b|. (iii) If R2 is endowed with the norm x p = (|x1 |p + |x2 |p )1/p , with 1 < p < ∞, then the corresponding norm (2.36) is φ q = (|a|q + |b|q )1/q , with 1 1 p + q = 1. 22. Let X be a vector space. Let B ⊂ X be a convex subset such that, for every nonzero vector x ∈ X, there exists a positive number θx > 0 such that αx ∈ B

if and only if

|α| ≤ θx .

. For r ≥ 0, call rB = {rb ; b ∈ B}. Prove that . x = min {r ≥ 0 ; x ∈ rB} is a norm on X, and B = {x ∈ X ; x ≤ 1} is the unit ball in this norm. 23. Let Ω ⊆ Rn be an open set. On the space C(Ω) of (possibly unbounded) continuous functions f : Ω → R, consider the seminorms pk (·) and the distance d(·, ·) defined in (2.22), (2.20). Next, consider any sequence of open sets Ak compactly contained in Ω and such that k≥1 Ak = Ω. Define the seminorms pk (·) and the distance d (·, ·) as before, but replacing the sets Ak in (2.21) with the sets Ak . Prove that the two distances d, d are equivalent. Namely, given any sequence of continuous functions (fj )j≥1 , the following statements are equivalent: (i) the sequence is Cauchy for the distance d, (ii) the sequence is Cauchy for the distance d , (iii) the functions fj converge uniformly on each compact subset of Ω.

40

2. Banach Spaces

24. On the space C(Ω), consider the seminorms pk (·) defined in (2.22) and the distance d(·, ·) constructed in (2.20). . ∞ (i) Explain why f = k=1 2−k pk (f ) does not yield a norm on C(Ω). . (ii) Consider the open unit ball B = {f ∈ C(Ω) ; d(f, 0) < 1}, where 0 stands for the identically zero function. Explain why . f ♦ = inf λ > 0 ; λ−1 f ∈ B does not yield a norm on C(Ω). 25. Let X be a Banach space. Consider any set S ⊂ X and assume x ∈ span(S). Prove that there exist points xj ∈ S and coefficients cj ∈ K, j = 1, . . . , N , such that N k (2.42) x = c j xj , c x for all k = 1, . . . , N . j j ≤ 2 x j=1 j=1 26. Let S be a subset of a Banach space X. Prove that the following statements are equivalent: (i) x ∈ span(S). n ∞ . (ii) x = j=1 cj xj = limn→∞ j=1 cj xj , for some points xj ∈ S and numbers cj ∈ K. 27. Consider the Banach space ∞ consisting of all bounded sequences x = (x1 , x2 , x3 , . . .) of real numbers. (i) Prove that there exists a bounded linear functional F : ∞ → R such that . (2.43) |F (x)| ≤ x ∞ = sup |xn | , n≥1

(2.44)

F (x) = lim xn n→∞

if the limit exists.

(ii) Show that, if F : ∞ → R satisfies the above properties (2.43)–(2.44), then lim inf xn ≤ F (x) ≤ lim sup xn n→∞

n→∞

for all x ∈ ∞ .

(iii) Using (ii) prove that, if there exists an integer N such that xn ≤ yn for all n ≥ N , then F (x) ≤ F (y). (iv) Show that, for any sequence a = (a1 , a2 , . . .) ∈ 1 , the continuous linear functional . Fa (x) = a n xn n≥1

cannot satisfy (2.44). Hence the space of all continuous linear functionals on ∞ cannot be identified with 1 .

2.8. Problems

41

(v) For every bounded sequence of real numbers x = (x1 , x2 , . . .), define 1 . 1 F (x) = lim sup xn . lim inf xn + 2 n→∞ 2 n→∞ Does F satisfy (2.43)–(2.44)? Is F a bounded linear functional on the Banach space ∞ ? 28. In the Banach space X = L∞ (R), consider the subspace V consisting of all bounded continuous functions. Prove that there exists a bounded linear functional Λ : L∞ (R) → R with . Λ = 1 such that Λf = f (0) for every bounded continuous function f . However, f g dx for every show that there exists no function g ∈ L1 (R) such that Λf = f ∈ L∞ (R). Conclude that the dual space of L∞ (R) cannot be identified with L1 (R). 29. Let X be a normed space and let Ω ⊂ X be an open, convex set containing the origin. Consider the functional (2.45)

. p(x) = inf {λ ≥ 0 ; x ∈ λΩ}. (i) Prove that p(·) satisfies the conditions p(x + y) ≤ p(x) + p(y) ,

p(tx) = t p(x) for all x, y ∈ X, t ≥ 0 .

. (ii) Assuming that Br = {x ∈ X ; x < r} ⊆ Ω, prove that p(x) ≤ x /r. (iii) Assuming that Ω = {x ∈ X ; x < 1} is the open unit ball, prove that p(x) = x . 30. Let C 0 ([0, 1]) be the Banach space of all real-valued continuous functions f : [0, 1] → R, with norm f = maxx∈[0,1] |f (x)|. (i) Show that X = {f ∈ C 0 ([0, 1]) ; f (0) = 0} is a closed subspace of C 0 , hence a Banach space. . 1 (ii) Prove that the map f → Λf = 0 f (x) dx is a continuous linear functional . on X. Compute its norm Λ = sup f ≤1 |Λf |. Is this supremum over the closed unit ball actually attained as a maximum? 31. Let X be a Banach space over the reals and let X ∗ be its dual. Let Ω ⊂ X be a convex set containing the origin. Define . Ω∗ = {φ ∈ X ∗ ; φ(x) ≤ 1 for all x ∈ Ω} , . Ω∗∗ = {x ∈ X ; φ(x) ≤ 1 for all φ ∈ Ω∗ } . Prove that Ω∗∗ is the closure of Ω.

42

2. Banach Spaces

32. Let ξ be a nonzero vector in a Banach space X over the reals. Call U = span{ξ} = {tξ ; t ∈ R}. Prove that there exists a closed subspace V ⊂ X such that X = U ⊕V . Namely, every element x ∈ X can be written uniquely as a sum x = u+v

with u ∈ U , v ∈ V .

Moreover, the projections x → u = πU x and x → v = πV x are continuous linear operators. 33. On the Banach space X = L1 ([−1, 1]), prove that there exists a continuous linear functional ϕ : X → R with norm ϕ = 1, having the following property: If f is a polynomial of degree 1, then ϕ(f ) = f (0). 34. Given an open set Ω ⊂ Rn , we denote by C m (Ω) the space of all continuous functions f : Ω → R with continuous partial derivatives up to order m. Let Ak be the sets defined in (2.21). Show that the sequence of seminorms (2.46)

. pk (f ) =

maxx∈Ak |Dα f (x)|

|α|≤m

makes C m (Ω) into a Fréchet space. α1α = (α1 , .. α. ,nαn ) is a multi-index of length Here . . ∂ α |α| = α1 + · · · + αn , and D f = ∂x1 · · · ∂x∂n f. 35. Let Y be a closed subspace of a Banach space X and assume x0 ∈ X \ Y . Show that there exists a bounded linear functional ϕ ∈ X ∗ such that ϕ ≤ 1 and ϕ(x0 ) = d(x0 , Y ) = inf x0 − y , y∈Y

ϕ(y) = 0 for all y ∈ Y .

36. In the space p of sequences of real numbers, consider the unit vectors ek as in (2.9). Prove that (i) If 1 < p < ∞, the sequence (ek )k≥1 converges weakly to zero in p . (ii) In the space 1 , the sequence (ek )k≥1 does not admit any weakly convergent subsequence. 37. Let φ : R → R be a smooth, increasing function, and consider the operator . (Λf )(x) = f (φ(x)). Derive a condition on φ which implies that Λ is a bounded linear operator from Lp (R) into itself. Consider the cases 1 ≤ p < ∞ and p = ∞ separately. 38. Show that the concept of Lebesgue measure cannot be extended to infinitedimensional spaces. More precisely, let X be an infinite-dimensional Banach space.

2.8. Problems

43

Prove that there there cannot be a measure μ, defined on the sigma-algebra of Borel subsets of X, with the following properties: (i) μ(Ω) > 0 for every nonempty open set Ω ⊂ X; (ii) μ is translation-invariant: μ(x + S) = μ(S) for every x ∈ X and S ⊂ X; (iii) there exists a nonempty open set Ω0 such that μ(Ω0 ) < ∞. 39. Let X be a Banach space over the reals. (i) Let S ⊂ X be a closed convex set and assume y ∈ / S. Prove that there exists a bounded linear functional ϕ ∈ X ∗ such that ϕ(y) < inf ϕ(x) . x∈S

. (ii) Consider a weakly convergent sequence: xn y. Let S = co{xn ; n ≥ 1} be the smallest closed convex set containing all points xn . Prove that y ∈ S. 40. Given a function f ∈ L∞ (R), we say that ess lim f (x) = λ x→0

if there exists a function f˜ such that f˜(x) = f (x) for a.e. x ∈ R, and moreover limx→0 f˜(x) = λ. (i) Prove that there exists a bounded linear functional Φ : L∞ (R) → R such that Φ(f ) = ess lim f (x) x→0

whenever the limit exists. (ii) Prove that the above conclusion fails if the space L∞ (R) is replaced by L1 (R).

Chapter 3

Spaces of Continuous Functions

3.1. Bounded continuous functions Let E be a metric space. By C(E) we denote the space of all continuous real-valued (possibly unbounded) functions f : E → R. In general, this space does not have a natural norm. For this reason, we shall also consider the space BC(E) of all bounded continuous functions f : E → R, with norm . (3.1) f = sup |f (x)| . x∈E

Most of this chapter will be concerned with the case where E is compact. In this case every continuous function f : E → R is necessarily bounded, hence C(E) = BC(E). Lemma 3.1. BC(E) is a Banach space. Proof. 1. Let (fn )n≥1 be a Cauchy sequence in BC(E). Then for every fixed x ∈ E the sequence of numbers fn (x) is Cauchy and hence converges to some limit, which we call f (x). 2. By assumption, for every ε > 0 there exists N large enough so that sup |fn (x) − fm (x)| ≤ ε

x∈E

for all n, m ≥ N .

Letting m → ∞, since fm (x) → f (x) we obtain sup sup |fn (x) − f (x)| ≤ ε .

n≥N x∈E

45

46

3. Spaces of Continuous Functions

In turn, this implies sup fn − f ≤ ε ,

sup |f (x)| ≤ ε + sup |fN (x)| < ∞ .

n≥N

x∈E

x∈E

Since ε > 0 was arbitrary, the first inequality shows the convergence fn − f → 0. The second inequality shows that f is bounded. 3. Finally, we prove that f is continuous. Let any x ∈ E and ε > 0 be given. By uniform convergence, there exists an integer N such that |fN (x) − f (x)| < ε/3 for every x ∈ E. Since fN is continuous, there exists δ > 0 such that |fN (y) − fN (x)| < ε/3 whenever d(y, x) < δ. Putting together the above inequalities, when d(y, x) < δ we have |f (y) − f (x)| ≤ |f (y) − fN (y)| + |fN (y) − fN (x)| + |fN (x) − f (x)|
0. By the assumption of pointwise convergence, for every x ∈ E there exists an integer N (x) such that |fN (x) (x) − f (x)| < ε. Since fN (x) and f are continuous, there exists an open neighborhood Vx of x such that |fN (x) (y) − fN (x) (x)| < ε and |f (y) − f (x)| < ε for every y ∈ Vx . Since E is compact, we can cover E with finitely many of these neighborhoods, say E ⊆ Vx1 ∪ · · · ∪ Vxm .

3.2. The Stone-Weierstrass approximation theorem

47

Choose the integer N = max {N (x1 ), . . . , N (xm )}. For every n ≥ N and y ∈ E, assuming that y ∈ Vxi we have fN (xi ) (y) ≤ fN (y) ≤ fn (y) ≤ f (y) , because the sequence is increasing. Therefore |fn (y) − f (y)| ≤ |fN (xi ) (y) − f (y)| ≤ |fN (xi ) (y) − fN (xi ) (xi )| + |fN (xi ) (xi ) − f (xi )| + |f (xi ) − f (y)| < ε + ε + ε. Since y ∈ E and ε > 0 were arbitrary, this establishes the uniform convergence fn → f .

3.2. The Stone-Weierstrass approximation theorem Given a domain E ⊂ Rn , for computational purposes it can be useful to approximate a continuous, real-valued function f ∈ BC(E) with special functions: say, polynomials, exponential functions, or trigonometric polynomials. It is thus important to understand whether every function f ∈ BC(E) can be uniformly approximated by such functions. In this section we will prove a key result in this direction. As a preliminary, observe that the space BC(E) is an algebra. Namely, it is closed under multiplication: if f, g ∈ BC(E), then also f g ∈ BC(E). Moreover, the norm of the product satisfies f g ≤ f g . We say that a subspace A ⊆ BC(E) is a subalgebra if f, g ∈ A implies f g ∈ A. Lemma 3.4 (Closure of a subalgebra). If A ⊆ BC(E) is a subalgebra, then its closure A is a subalgebra as well. Proof. Indeed, assume f, g ∈ A. Then there exist uniformly convergent sequences fn , gn ∈ A with fn → f and gn → g. One has f g − fn gn ≤ f g − fn g + fn g − fn gn ≤ f − fn g + fn g − gn . Since the sequence (fn )n≥1 is uniformly bounded, the right-hand side approaches zero as n → ∞. This shows the convergence fn gn → f g. Since A is an algebra, fn gn ∈ A for every n. Hence f g ∈ A.

48

3. Spaces of Continuous Functions

We say that a subset A ⊆ BC(E) separates points if, for every couple of distinct points x, y ∈ E, there exists a function f ∈ A such that f (x) = f (y). Theorem 3.5 (Stone-Weierstrass). Let E be a compact metric space. If A is a subalgebra of C(E) that separates points and contains the constant functions, then A = C(E). Otherwise stated, let A be a family of continuous, real-valued functions f : E → R with the following properties: (i) If f, g ∈ A and a, b ∈ R, then the linear combination af + bg lies in A. (ii) If f, g ∈ A, then the product f g lies in A as well. (iii) The constant function f (x) ≡ 1 lies in A. (iv) For every two distinct points x, y ∈ E, there exists f ∈ A such that f (x) = f (y). Then every continuous function f : E → R on the compact domain E can be uniformly approximated by functions in A. Proof. 1. There exists a sequence of polynomials (pn )n≥1 such that pn (t) → √ t uniformly for t ∈ [0, 1]. To prove the above claim, the underlying idea is to construct approximate solutions to the equation t − p2 (t) = 0 by iteration. We thus set p0 (t) ≡ 0 and, by induction on n = 0, 1, 2, . . ., 1 pn+1 (t) = pn (t) + (t − p2n (t)) 2

(3.2)

(see Figure 3.2.1). By induction, one checks that pn (t) ≤ pn+1 (t) ≤ every t ∈ [0, 1]. Indeed, √

√ t for

√

1 t − pn (t) − (t − p2n (t)) 2 √ 1 √ = ( t − pn (t)) 1 − ( t + pn (t)) ≥ 0. 2

t − pn+1 (t) =

For every fixed t ∈ [0, 1], the sequence pn (t) is increasing and bounded above. Hence it has a unique limit, say g(t). By (3.2), √ this limit satisfies 2 t − g (t) = 0. Since g(t) ≥ 0 we conclude that g(t) = t. Finally, by Dini’s theorem, the convergence is uniform for t ∈ [0, 1].

3.2. The Stone-Weierstrass approximation theorem

49

Figure 3.2.1. The first few polynomials in the sequence defined in (3.2).

2. For every function f ∈ A, one has |f | ∈ A. . Indeed, let κ = maxx∈E |f (x)|. We can assume κ = 0. Then all functions fn (x) = pn (f 2 (x)/κ2 ) lie in A, because A is an algebra. Since f 2 (x)/κ2 ∈ [0, 1], the previous step yields the convergence ! f 2 (x) |f (x)| fn (x) → = , κ2 κ uniformly for x ∈ E. Therefore

|f | κ

∈ A, and hence |f | ∈ A as well.

3. We now apply the previous argument to the subalgebra A and conclude that, if f, g ∈ A, then the functions 1 1 max{f, g} = (f + g + |f − g|) , min{f, g} = (f + g − |f − g|) 2 2 also lie in A. 4. For any two distinct points y1 , y2 ∈ E and any couple of real numbers a1 , a2 , there exists a function f ∈ A such that f (y1 ) = a1 and f (y2 ) = a2 . Indeed, by assumption there exists a continuous function g ∈ A such that g(y1 ) = g(y2 ). Since A is an algebra and contains all constant functions, the function g(x) − g(y1 ) . f (x) = a1 + (a2 − a1 ) g(y2 ) − g(y1 ) lies in A and satisfies our requirements. 5. Given any continuous function f , a point y ∈ E, and ε > 0, there exists a function gy ∈ A such that (3.3)

gy (y) = f (y),

gy (x) ≤ f (x) + ε

for every x ∈ E .

50

3. Spaces of Continuous Functions

Figure 3.2.2. For a fixed y, taking the infimum of finitely many functions hyz (here drawn as affine functions) we obtain a continuous function gy ≤ f + ε, with gy (y) = f (y).

Indeed (see Figure 3.2.2), by the previous step, for every point z ∈ E, there exists a function hyz ∈ A such that hyz (y) = f (y) and hyz (z) = f (z). Since f and hyz are both continuous, there exists an open neighborhood Vz of z such that hyz (x) < f (x) + ε for every x ∈ Vz . We can cover the compact set E with finitely many such neighborhoods: E ⊆ Vz1 ∪ · · · ∪ Vzm . Then the function . gy (x) = min hyz1 (x), . . . , hyzm (x) lies in A and satisfies the conditions in (3.3).

Taking the supremum of finitely many functions gy we obtain a continuous function g such that f − ε ≤ g ≤ f + ε. Figure 3.2.3.

6. The closure of A is the entire space C(E). Indeed, let f ∈ C(E) be any continuous function and let ε > 0 be given. For each y ∈ E, by the previous step there exists a function gy ∈ A (see

3.2. The Stone-Weierstrass approximation theorem

51

Figure 3.2.3) such that gy (y) = f (y),

gy (x) ≤ f (x) + ε

for every x ∈ E .

By the continuity of f and gy , there exists a neighborhood Uy of y such that gy (x) ≥ f (x) − ε

for all x ∈ Uy .

We now cover the compact set E with finitely many neighborhoods: E ⊆ Uy1 ∪ · · · ∪ Uyν . Then the function . g(x) = max gy1 (x), . . . , gyν (x) lies in A and satisfies f (x) − ε ≤ g(x) ≤ f (x) + ε

for all x ∈ E .

A natural example of an algebra that satisfies all the assumptions in the Stone-Weiertrass theorem is provided by the polynomial functions. Corollary 3.6 (Uniform approximation by polynomials). Let E be a compact subset of Rn . Let A be the family of all real-valued polynomials in the variables (x1 , . . . , xn ). Then A is dense in C(E). Indeed, the family of all real-valued polynomials in (x1 , . . . , xn ) is an algebra that contains the constant functions and separates points in Rn . Hence, by Theorem 3.5 every continuous function f : E → R can be uniformly approximated by polynomials. 3.2.1. Complex-valued functions. A key ingredient in the proof of Theorem 3.5 was the fact that, if two real-valued functions f, g lie in a subalgebra A, then max{f, g} and min{f, g} lie in A. Clearly, such a statement would be meaningless for complex-valued functions. In fact, in its original form the Stone-Weierstrass theorem is NOT valid for complex-valued functions. In order to obtain an approximation result valid for functions f : E → C, the main idea is to regard C = R ⊕ iR as a two-dimensional space over the reals. In the following, given a compact metric space E, we shall denote by CR (E; C) the space of all continuous complex-valued functions on E, regarded as a vector space over the real numbers. Theorem 3.7. Let E be a compact metric space. Let A be a subalgebra of CR (E; C) that separates points and contains the constant functions. Moreover, assume that whenever f ∈ A, then also the complex conjugate function f¯ lies in A. Then A is dense in CR (E; C).

52

3. Spaces of Continuous Functions

Proof. 1. By the assumptions, if f ∈ A, then its real and imaginary parts f + f¯ f − f¯ Re(f ) = , Im(f ) = 2 2 also lie in A. Let A0 be the subalgebra of A (over the real numbers), consisting of all functions f ∈ A with real values. Applying the Stone-Weierstrass theorem to A0 , we conclude that A0 is dense in C(E) = CR (E; R). 2. Given any f ∈ CR (E; C), we write f as a sum of its real and imaginary parts f = Re(f ) + i Im(f ). By the previous step, there exist two sequences of real-valued functions gn , hn ∈ A0 such that (3.4)

gn → Re(f ),

hn → Im(f )

as n → ∞, uniformly on E.

. Consider the sequence fn = gn + ihn ∈ A0 + iA0 = A. By (3.4), we have the uniform convergence fn → f . Hence A = CR (E; C). Example 3.8 (Complex trigonometric polynomials). Let E be the unit circumference {x2 + y 2 = 1} in R2 . Points on E will be parameterized by the angle θ ∈ [0, 2π]. Let A be the algebra of all complex trigonometric polynomials: (3.5)

p(θ) =

N

cn einθ ,

n=−N

where N ≥ 0 is any integer and the coefficients cn are complex numbers. It is clear that A is an algebra, contains the constant functions, and separates points. Moreover, p ∈ A implies p¯ ∈ A as well. By Theorem 3.7, the family of all these complex trigonometric polynomials is dense in CR (E; C). Relying on the previous example, we now show that a real-valued, continuous periodic function can be uniformly approximated with trigonometric polynomials of the form (3.6)

q(x) =

N

αk cos kx +

k=0

N

βk sin kx .

k=1

Here N ≥ 1 is any integer, while αn , βn are real numbers. Corollary 3.9 (Approximation of periodic functions by trigonometric polynomials). Let f : R → R be a continuous function, periodic of period 2π. Then for any ε > 0 there exists a trigonometric polynomial q as in (3.6) such that (3.7)

|q(x) − f (x)| ≤ ε

for all x ∈ R .

3.3. Ascoli’s compactness theorem

53

Proof. By assumption, f (x + 2π) = f (x) for every x ∈ R. As shown in Example 3.8, there exists a complex trigonometric polynomial p of the form (3.5) such that |p(x) − f (x)| ≤ ε

(3.8)

for all x ∈ R .

Consider the complex coefficients cn = an + ibn , with an , bn ∈ R. Calling . q(x) = Re p(x) the real part of p, we compute q(x) =

N

an cos nx −

n=−N

N

bn sin nx =

N

αn cos nx +

n=0

n=−N

N

βn sin nx ,

n=1

with α0 = a0 ,

αn = a−n + an ,

βn = b−n − bn

for n ≥ 1.

Since f is real-valued, by (3.8) we have |q(x) − f (x)| = | Re p(x) − Re f (x)| ≤ |p(x) − f (x)| ≤ ε

for all x ∈ R .

Remark 3.10. If the periodic function f is even, i.e., f (x) = f (−x), then it can be approximated with a trigonometric polynomial of the form (3.6) with βk = 0 for every k ≥ 1, that is, with a finite sum of cosine functions. If f is odd, i.e., f (x) = −f (−x), then in (3.6) one can take αk = 0 for every k ≥ 0. In other words, f can be approximated by a finite sum of sine functions.

3.3. Ascoli’s compactness theorem In a finite-dimensional space, by the Bolzano-Weierstrass theorem every bounded sequence has a convergent subsequence. On the other hand, as shown in Theorem 2.22 of Chapter 2, this compactness property fails in every infinite-dimensional normed space. For example, in the space C([0, 1]) the sequences of continuous functions fn (x) = xn or fn (x) = sin nx are bounded but do not admit any uniformly convergent subsequence. It is thus natural to ask: what additional property of the functions fn can guarantee the existence of a uniformly convergent subsequence? An answer is provided by Ascoli’s theorem, relying on the concept of equicontinuity. Let E be a metric space. A family of continuous functions F ⊂ C(E) is called equicontinuous if, for every x ∈ E and ε > 0, there exists δ > 0 such that (3.9)

d(y, x) < δ

implies |f (y) − f (x)| < ε

54

3. Spaces of Continuous Functions

for all functions f ∈ F . Notice that here δ > 0 can depend on x and ε, but not on the particular function f ∈ F . Lemma 3.11. Let E be a compact metric space and let F ⊂ C(E) be equicontinuous. Then F is uniformly equicontinuous. Namely, for every ε > 0 there exists δ > 0 such that (3.10) d(x, y) < δ

implies

|f (x) − f (y)| < ε

for all x, y ∈ E, f ∈ F .

Proof. Let ε > 0 be given. For each x ∈ E, choose δ = δ(x) > 0 such that (3.9) holds simultaneously for all functions f ∈ F . By compactness, we can cover the space E with finitely many balls: E ⊆ B(x1 , δ1 ) ∪ · · · ∪ B(xn , δn ), where δi = δ(xi ). Choose ρ > 0 so small that, for every x ∈ E, the ball B(x, ρ) is entirely contained inside one of the balls B(xi , δi ). Now assume d(x, y) < ρ. Then there exists an index k ∈ {1, . . . , n} such that x, y ∈ B(xk , δk ). This implies |f (x) − f (y)| ≤ |f (x) − f (xk )| + |f (y) − f (xk )| ≤ ε + ε for every function f ∈ F . Since ε > 0 was arbitrary, this proves the lemma. If E is a compact metric space, then C(E) = BC(E) is a Banach space. By completeness, it follows that for a subset F ⊂ C(E) the following properties are equivalent: (i) F is relatively compact, i.e., the closure F is compact. (ii) F is precompact, i.e., for every ε > 0 it can be covered by finitely many balls with radius ε. (iii) From every sequence of continuous functions fk ∈ F one can extract a subsequence converging to some function f , uniformly on E. Theorem 3.12 (Ascoli). Let E be a compact metric space. Let F ⊂ C(E) be an equicontinuous family of functions, such that (3.11)

sup |f (x)| < ∞

f ∈F

for every x ∈ E.

Then F is a relatively compact subset of C(E). Proof. It suffices to prove that F is precompact. 1. Let ε > 0 be given. By Lemma 3.11, there exists δ > 0 such that (3.10) holds. Since E is compact, it can be covered by finitely many balls of radius

3.3. Ascoli’s compactness theorem

55

δ, say E ⊆

n "

B(xi , δ).

i=1

By the assumption (3.11), . M = maxi∈{1,...,n} sup |f (xi )| < ∞. f ∈F

We now choose finitely many numbers α1 , . . . , αm such that the balls B(αj , ε) cover the compact interval [−M, M ]. 2. Consider the set Θ of all maps θ : {x1 , . . . , xn } → {α1 , . . . , αm }. This is a finite set. Indeed, there are exactly mn such maps. For every θ ∈ Θ, define the family of continuous functions . Fθ = f ∈ F ; f (xi ) ∈ B(θ(xi ), ε) for all i = 1, . . . , n . Since the interval [−M, M ] is covered by the balls B(αj , ε), we clearly have θ∈Θ Fθ = F . 3. We claim that each set Fθ has diameter ≤ 4ε. Indeed, assume f, g ∈ Fθ . For any x ∈ E, choose an index i such that x ∈ B(xi , δ). We then have |f (x) − g(x)| ≤ |f (x) − f (xi )| + |f (xi ) − θ(xi )| + |θ(xi ) − g(xi )| + |g(xi ) − g(x)| ≤ ε + ε + ε + ε. Therefore . f − g C = maxx∈E |f (x) − g(x)| ≤ 4ε . The above arguments show that, for any ε > 0, the set F can be covered with finitely many sets having diameter ≤ 4ε. Hence it can also be covered by finitely many closed balls of radius 4ε. Since ε is arbitrary, this proves that F is precompact. The above theorem can be easily extended to functions taking values in the complex domain C or in a Euclidean space Rn . Corollary 3.13. Let E be a compact metric space. Let (fk )k≥1 be a sequence of continuous functions from E into Rn such that (i) the family {fk } is equicontinuous; (ii) supk≥1 |fk (x)| < ∞ for every x ∈ E. Then the sequence (fk )k≥1 admits a uniformly convergent subsequence.

56

3. Spaces of Continuous Functions

Proof. Let us write out the components: fk (x) = (fk,1 (x), fk,2 (x), . . . , fk,n (x)) ∈ Rn . Applying Theorem 3.12 to the sequence of scalar functions (fk,1 )k≥1 , we can extract a subsequence such that these first components converge uniformly on E. From this subsequence we can extract a further subsequence such that the second components converge uniformly, etc. After n steps we obtain a subsequence where all components converge, uniformly on E.

3.4. Spaces of H¨ older continuous functions Let Ω ⊂ Rn be an open set, and 0 < γ ≤ 1. We say that a function f : Ω → R is H¨ older continuous with exponent γ if there exists a constant C such that |f (x) − f (y)| ≤ C |x − y|γ for all x, y ∈ Ω . 0,γ We denote by C (Ω) the space of all bounded H¨ older continuous functions on Ω, with norm (3.12)

. f C 0,γ (Ω) = sup |f (x)| + x∈Ω

sup x,y∈Ω, x =y

|f (x) − f (y)| . |x − y|γ

More generally, given an integer k ≥ 0, we denote by C k,γ (Ω) the space of all continuous functions with Hölder continuous partial derivatives up to order k. This space is endowed with the norm1 (3.13) |D α f (x) − D α f (y)| . α f C k,γ (Ω) = sup . sup |D f (x)| + |x − y|γ x∈Ω x,y∈Ω, x =y |α|≤k

|α|=k

Theorem 3.14 (H¨ older spaces are complete). Let Ω ⊆ Rn be an open set. For every integer k ≥ 0 and any 0 < γ ≤ 1, the space C k,γ (Ω) is a Banach space. Proof. The fact that (3.13) defines a norm is clear. To prove that the space C k,γ (Ω) is complete, let (fm )m≥1 be a Cauchy sequence with respect to the norm (3.13). Then, for every x ∈ Ω, the sequence fm (x) is Cauchy and converges to some value f (x) uniformly on Ω. The assumption also imply that, for every |α| ≤ k, the sequence of partial derivatives D α fm is Cauchy; hence it converges to some continuous function vα (x) = D α f (x) uniformly on Ω. 1 Given

a multi-index α = (α1 , . . . , αn ), we use the notation D α f =

to denote a partial derivative of f of order |α| = α1 + · · · + αn .

∂ ∂ x1

α1

···

∂ ∂ xn

αn

f

3.5. Problems

57

It remains to prove that the convergence fm → f takes place also with respect to the norm of C m,γ (Ω). In other words, for |α| = k we need to show that α D (fm − f )(x) − D α (fm − f )(y) (3.14) lim sup = 0. m→∞ x,y∈Ω, x =y |x − y|γ By assumption, lim

sup

α D (fm − fn )(x) − D α (fm − fn )(y)

m,n→∞ x,y∈Ω, x =y

|x − y|γ

= 0.

Hence, for any ε > 0, there exists N large enough so that α α D (fm − fn )(x) − D (fm − fn )(y) sup ≤ ε for all m, n ≥ N . |x − y|γ x,y∈Ω, x =y Keeping m fixed and letting n → ∞ we obtain α α D (fm − f )(x) − D (fm − f )(y) sup ≤ ε |x − y|γ x,y∈Ω, x =y Since ε > 0 was arbitrary, this proves (3.14).

for all m ≥ N .

3.5. Problems 1. Let E be a compact metric space. Assume that a family of real-valued continuous functions F ⊂ C(E) satisfies the following two conditions: (i) For every x, y ∈ E and a, b ∈ R, there exists a function f ∈ F such that f (x) = a and f (y) = b. (ii) If f, g ∈ F, then the functions max{f, g} and min{f, g} lie in the closure F. Prove that F is dense in C(E). 2. Prove or disprove the following statements. (i) Given any continuous function f : R → R (possibly unbounded), there exists a sequence of polynomials (pn )n≥1 such that pn (x) → f (x) uniformly on every bounded interval [a, b]. (Note: Here the sequence of polynomials should be independent of the interval [a, b].) (ii) There exists a sequence of polynomials pn that converges to the function 2 f (x) = e−x uniformly on R.

58

3. Spaces of Continuous Functions

3. Let g : [0, π] → R be a continuous function. (i) Prove that, for any ε > 0, there exists a trigonometric polynomial of the form p(x) =

N

bk cos kx

k=0

such that |p(x) − g(x)| < ε for every x ∈ [0, π]. (ii) If g(0) = g(π) = 0, prove that, for any ε > 0, there exists a trigonometric polynomial of the form p(x) =

N

ak sin kx

k=1

such that |p(x) − g(x)| < ε for every x ∈ [0, π]. 4. Let f : R → R be continuously differentiable. Show that, for every bounded interval [a, b], there exists a sequence of polynomials pn such that pn → f and pn → f , uniformly on [a, b]. 5. Let E be the unit circle {x2 + y 2 = 1} in R2 . Points on E will be parameterized with the angle θ ∈ [0, 2π]. Let A be the family of all complex trigonometric polynomials of the form p(θ) =

N

cn einθ ,

n=0

where N ≥ 0 is any integer and the cn are complex-valued coefficients. (i) Is A an algebra? (ii) Does A contain the constant functions and separate points? (iii) Is A dense on the space of all continuous, complex-valued functions f : E → C? 6. Let Ω ⊂ Rn be a bounded open set. Prove that the family of all polynomials p = p(x1 , x2 , . . . , xn ) in n variables is dense on the space Lp (Ω), for every 1 ≤ p < ∞. 7. Consider the rectangle Q = {(x, y) ; x ∈ [0, a] , y ∈ [0, b]}. (i) Given any continuous function f = f (x, y) on Q and any ε > 0, construct a finite number of continuous functions g1 , . . . , gN : [0, a] → R and h1 , . . . , hn : [0, b] → R such that N gi (x)hi (y) ≤ ε for all (x, y) ∈ Q. f (x, y) − i=1

3.5. Problems

59

(ii) Given any continuous function f : Q → R that vanishes on the boundary of Q, show that one can choose an integer M ≥ 1 and finitely many coefficients cmn , 1 ≤ m, n ≤ M , such that M πn y πm x sin cmn sin ≤ ε for all (x, y) ∈ Q. f (x, y) − a b m,n=1

8. Let Ω ⊂ Rn be a bounded open set, and 0 < γ ≤ 1. Prove that the embedding C 0,γ (Ω) ⊂⊂ C 0 (Ω) is compact. In other words, if (fn )n≥1 is a bounded sequence in C 0,γ (Ω), then it admits a subsequence that converges in C 0 (Ω). 9. Decide for which values of 0 < γ ≤ 1 the following functions lie in the Hölder space C 0,γ ( ]0, 1[ ): √ 1 f3 (x) = x | ln x| . f1 (x) = x1/3 , f2 (x) = x · sin , x 10. Explain what is wrong with the following argument. “Let 0 < γ ≤ 1 and let (fn )n≥1 be a sequence of functions in the Hölder space C 0,γ ([0, 1]), with fn C 0,γ ≤ 1 for every n. Since the functions fn are uniformly bounded and equicontinuous, by Ascoli’s theorem we can extract a subsequence converging to some limit function f , uniformly for x ∈ [0, 1]. It is now easy to check that f C 0,γ ≤ 1. This shows that the closed unit ball in C 0,γ is compact, and hence by Theorem 2.22 in Chapter 2 the space C 0,γ is finite-dimensional.” 11. Let ϕ : R+ → R+ be a smooth function such that ϕ(0) = 0 ,

ϕ (s) > 0 ,

ϕ (s) ≤ 0 for all s > 0 .

Given an open set Ω ⊆ Rn , consider the space of continuous functions |f (x) − f (y)| . . ϕ < ∞ . C (Ω) = f : Ω → R ; f ϕ = sup |f (x)| + sup x x=y ϕ(|x − y|) Prove that C ϕ (Ω) is a Banach space. (Note: Here the function ϕ can be regarded as a modulus of continuity. In the case ϕ(s) = sγ with 0 < γ ≤ 1 one has C ϕ (Ω) = C 0,γ (Ω).) 12. Prove the following more general version of Ascoli’s theorem. Let E be a compact metric space and let K be a compact subset of a normed space X. Assume that a sequence of maps φn : E → K is equicontinuous. Namely, for every x ∈ E and ε > 0, there exists δ > 0 such that d(y, x) < δ

implies

φn (y) − φn (x) < ε

for all n ≥ 1 .

Then the sequence (φn )n≥1 admits a uniformly convergent subsequence.

Chapter 4

Bounded Linear Operators

In this chapter we look in more detail at linear operators in Banach spaces. As in Chapter 2, B(X, Y ) will denote the space of bounded linear operators Λ : X → Y , with norm . Λ = sup Λx . x≤1

We begin with some results based on the Baire category theorem.

4.1. The uniform boundedness principle Theorem 4.1 (Banach-Steinhaus uniform boundedness principle). Let X, Y be Banach spaces. Let F ⊂ B(X, Y ) be any family of bounded linear operators. Then either F is uniformly bounded, so that sup Λ < ∞,

Λ∈F

or else there exists a dense set S ⊂ X such that (4.1)

sup Λx = ∞

Λ∈F

for all x ∈ S.

Proof. For every integer n, consider the open set . (4.2) Sn = {x ∈ X ; Λx > n for some Λ ∈ F }. If one of these sets, say Sk , is not dense in X, then there exists x0 ∈ X and a radius r > 0 such that the closed ball B(x0 , r) does not intersect Sk . In 61

62

4. Bounded Linear Operators

other words, Λx ≤ k

for all Λ ∈ F ,

x ∈ B(x0 , r).

If now x ≤ r, then Λx = Λ(x0 + x) − Λx0 ≤ Λ(x0 + x) + Λx0 ≤ 2k Therefore

for all Λ ∈ F .

1 2k . Λ = sup Λx = sup Λx ≤ r r x≤1 x≤r

for every Λ ∈ F . In this case, the family of operators F is uniformly bounded. The other possibility is that the open sets Sn in (4.2) are all dense in X. . # By Baire’s category theorem, the intersection S = n≥1 Sn is dense in X. By construction, for each x ∈ S and n ≥ 1 there exists an operator Λ ∈ F such that Λx > n. Hence (4.1) holds. Remark 4.2. From the above theorem it follows that, if a family of operators Λ ∈ B(X, Y ) is pointwise bounded on the unit ball, then it is uniformly bounded. In other words, the condition sup Λx < ∞

Λ∈F

for each x ∈ X with x ≤ 1

implies sup sup Λx < ∞ .

Λ∈F x≤1

This justifies the name “uniform boundedness principle”. Corollary 4.3 (Continuity of the pointwise limit). Let X, Y be Banach spaces. Let (Λn )n≥1 be a sequence of bounded linear operators in B(X; Y ). Assume that the pointwise limit . (4.3) Λx = lim Λn x n→∞

exists for every x ∈ X. Then the map Λ defined by (4.3) is a bounded linear operator. Proof. For every x ∈ X, the sequence (Λn x)n≥1 is bounded. Hence by the previous theorem the sequence of operators Λn is uniformly bounded. This implies . Λ = sup Λx = sup lim Λn x ≤ sup Λn < ∞ , x≤1

x≤1

n→∞

showing that the linear operator Λ is bounded.

n≥1

4.2. The open mapping theorem

63

4.2. The open mapping theorem If X, Y are metric spaces, we say that a map f : X → Y is open if, for every open subset U ⊆ X, the image f (U ) is an open subset of Y . This is the case if and only if, for every x ∈ X and r > 0, the image f (B(x, r)) of the open ball centered at x with radius r > 0 contains a ball centered at f (x). Theorem 4.4 (Open mapping). Let X, Y be Banach spaces. Let Λ : X → Y be a bounded, surjective linear operator. Then Λ is open. Proof. 1. By linearity, the image of an open ball B(x, r) can be written as Λ(B(x, r)) = Λ(x) + Λ(B(0, r)) = Λ(x) + r Λ(B(0, 1)). . Calling Br = B(0, r) the open ball centered at the origin with radius r, to prove the theorem it thus suffices to show that the image Λ(B1 ) contains an open ball around the origin in Y . 2. Since Λ is surjective, Y = ∞ n=1 Λ(Bn ). Recalling that Y is a complete metric space, by Baire’s theorem at least one of the closures Λ(Bn ) ⊂ Y has nonempty interior. By a rescaling argument, Λ(B1 ) = n−1 Λ(Bn ) must also have nonempty interior. Namely, there exist y0 ∈ Y and r > 0 such that B(y0 , r) ⊂ Λ(B1 ). Since the open unit ball B1 is convex and symmetric, the same is true of its image Λ(B1 ) and of the closure Λ(B1 ). In particular, by symmetry B(−y0 , r) ⊆ Λ(B1 ), while convexity implies that the ball B(0, r) ⊂ Y satisfies 1 1 B(0, r) = B(y0 , r) + B(−y0 , r) ⊆ Λ(B1 ). 2 2 Using again the linearity of Λ, by rescaling we obtain (4.4)

B(0, 2−n r) ⊆ Λ(B2−n )

for all n ≥ 1 .

3. We conclude the proof by showing that B(0, r/2) ⊆ Λ(B1 ). Indeed, consider any point y ∈ B(0, r/2). We proceed by induction. - By density, we can find x1 ∈ B2−1 such that y − Λx1 < 2−2 r. - In turn, we can find x2 ∈ B2−2 such that (y − Λx1 ) − Λx2 < 2−3 r. - Continuing by induction, for each n we have y−

n−1 j=1

Λxj ∈ B(0, 2−n r) ⊆ Λ(B2−n ).

64

4. Bounded Linear Operators

Therefore we can select a point xn ∈ B2−n such that n−1 −n−1 y− Λxj − Λxn r. < 2 j=1 Since Xis a Banach space and n xn < ∞, the series n xn converges, say ∞ n=1 xn = x. We observe that x ≤

∞ n=1

xn
0, the set Λ(B1 ) can be covered by finitely many balls of radius ε. Under the assumptions (ii), let ε > 0 be given. Choose k such that Λ − Λk < ε/2. Since Λk is compact, we can select finitely many elements y1 , . . . , yN ∈ Y such that (4.7)

Λk (B1 ) ⊆

N " i=1

ε B yi , . 2

4.5. Compact operators

69

If x ≤ 1, then Λx − Λk x < ε/2. By (4.7) there exists a point yi with Λk x − yi < ε/2. By the triangle inequality, Λx − yi < ε. This proves that the finitely many balls B(yi , ε) cover Λ(B1 ). Theorem 4.11 (Adjoint of a compact operator). Let X, Y be Banach spaces, and let Λ : X → Y be a bounded linear operator. Then Λ is compact if and only if its adjoint Λ∗ : Y ∗ → X ∗ is compact. Proof. 1. Assume that Λ is compact. Let (yn∗ )n≥1 be a sequence in Y ∗ , with yn∗ ≤ 1 for every n. To prove that Λ∗ is compact we need to show that the sequence (Λ∗ yn∗ )n≥1 admits a convergent subsequence. . Let B1 = {x ∈ X ; x ≤ 1} be the closed unit ball in X. By assumption, the image Λ(B1 ) has compact closure, which we denote by . E = Λ(B1 ) ⊂ Y . 2. By definition, each yn∗ ∈ Y ∗ is a linear map from Y into the field K. Let fn : E → K be the restriction of yn∗ to the compact set E. We claim that the family of functions {fn ; n ≥ 1} satisfies the assumptions of Ascoli’s theorem. Indeed, all these functions are uniformly Lipschitz continous, because |fn (y) − fn (y )| ≤ yn∗ y − y ≤ y − y

for all y, y ∈ E .

Moreover, observing that sup y = sup Λx = Λ , y∈E

x≤1

we obtain |fn (y)| ≤ yn∗ y ≤ 1 · Λ

for all y ∈ E .

Hence all functions fn : E → K are uniformly bounded. By Theorem 3.12, there exists a subsequence (fnj )j≥1 which converges to a function f uniformly on the compact set E = Λ(B1 ). 3. We now observe that Λ∗ yn∗ i − Λ∗ yn∗ j = sup |Λ∗ yn∗ i − Λ∗ yn∗ j , x| x≤1

= sup |yn∗ i − yn∗ j , Λx| = sup |fni (Λx) − fnj (Λx)| , x≤1

x≤1

where the right-hand side approaches zero as i, j → ∞. This shows that the subsequence (Λ∗ yn∗ j )j≥1 is Cauchy, hence it converges to a limit x∗ ∈ X ∗ . Therefore Λ∗ is compact. The converse implication can be proved by the same arguments.

70

4. Bounded Linear Operators

4.5.1. Integral operators. Compact operators often arise in the form of integral operators. To see an example, consider the Banach space C([a, b]) of all continuous, real-valued functions defined on the closed interval [a, b]. Theorem 4.12 (Compactness of an integral operator). Let K : [a, b]× [a, b] → R be a continuous map. Then the integral operator b . (4.8) (Λf )(x) = K(x, y) f (y) dy a

is a compact linear operator from C([a, b]) into itself. Proof. Consider a bounded sequence of continuous functions fn ∈ C([a, b]). We need to prove that the sequence Λfn admits a uniformly convergent subsequence. By Ascoli’s compactness theorem, it suffices to show that the functions Λfn are uniformly bounded and equicontinuous. 1. Since K is continuous on the compact set [a, b] × [a, b], it is bounded and uniformly continuous. Namely, there exists a constant κ such that |K(x, y)| ≤ κ

for all x, y.

Moreover, for every ε > 0 there exists δ > 0 such that (4.9) |K(x, y) − K(˜ x, y)| ≤ ε

whenever |x − x ˜| ≤ δ , x, x ˜, y ∈ [a, b] .

2. By assumption, there exists a constant M such that fn = maxx∈[a,b] |fn (x)| ≤ M

for all n ≥ 1 .

This implies b |K(x, y)| |fn (y)| dy ≤ (b − a)κM , (Λfn )(x) ≤ a

proving that the functions Λfn are uniformly bounded. 3. Next, let ε > 0 be given. Choose δ > 0 such that (4.9) holds. If |x − x ˜| ≤ δ, then for any n ≥ 1 we have b x) ≤ |K(x, y) − K(˜ x, y)| |fn (y)| dy ≤ (b − a)ε M . (Λfn )(x) − (Λfn )(˜ a

Since ε > 0 was arbitrary, this proves the equicontinuity of the sequence (Λfn )n≥1 . An application of Ascoli’s theorem completes the proof.

4.6. Problems

71

Figure 4.5.1. The kernel function K in (4.11).

Example 4.13. For any continuous function f ∈ C([−1, 1]), let u = Λf be the solution to the two-point boundary value problem (4.10)

u (x) + f (x) = 0 ,

u(−1) = u(1) = 0 .

Observe that the solution must be unique. Indeed, if u1 , u2 are solutions, then the function w = u1 − u2 satisfies w (x) = 0 ,

w(−1) = w(1) = 0,

hence w(x) ≡ 0. A direct computation shows that x 1 (1 + y)(1 − x) (1 − y)(1 + x) u(x) = f (y) dy + f (y) dy 2 2 −1 x provides a solution to (4.10). The solution operator Λ : f → u = Λf is thus a linear, compact operator on C([−1, 1]). It can be written in the form (4.9), with ⎧ (1−y)(1+x) ⎪ if − 1 ≤ x ≤ y , ⎨ 2 . (4.11) K(x, y) = ⎪ ⎩ (1+y)(1−x) if y ≤ x ≤ 1 . 2 Referring to Figure 4.5.1, the piecewise affine map x → K(x, y) for a fixed y can be uniquely determined by the three equations K(−1, y) = K(1, y) = 0,

Kx (y+, y) − Kx (y−, y) = −1 .

4.6. Problems 1. On the Banach space X = C([0, 1]), decide whether the following operators are (i) linear, (ii) bounded, (iii) compact: (1) (Λf )(x) = f (sin x). (2) (Λf )(x) = sin(f (x)). (3) (Λf )(x) = xf (x).

72

4. Bounded Linear Operators

(4) (Λf )(x) = x f (0) +

1 0

f (s) ds.

(5) (Λf )(x) = y(x), where y(·) is the solution to the Cauchy problem y (x) + y(x) = f (x) ,

y(0) = 0 .

2. Prove that L2 ([0, 1]) is a vector subspace of L1 ([0, 1]) of first category. Indeed, 1 . for each n ≥ 1, the set En = {f : [0, 1] → R ; 0 |f |2 dx ≤ n} is a closed subset of L1 with empty interior. 3. Let X be a Banach space and let Λ : X → ∞ be a linear operator, so that Λ(x) = (Λ1 (x), Λ2 (x), . . .) is a bounded sequence of real numbers, for every x ∈ X. Prove that the operator Λ is bounded if and only if each linear functional Λn is bounded. 4. Let 1 ≤ p ≤ ∞. Consider a linear operator Λ : Lp ([0, 1]) → Lp ([0, 1]) (defined on the entire space Lp ) which has the following property. If a sequence of functions fn ∈ Lp converges pointwise a.e. to some f ∈ Lp , then the sequence (Λfn )(x) converges to (Λf )(x) for a.e. x ∈ [0, 1]. Prove that Λ is continuous. 5. Let X be an infinite-dimensional Banach space. Let K : X → X be a compact linear operator. Show that, if K is one-to-one, then the range of K cannot be closed. 6. Let X, Y, Z be Banach spaces and consider two linear operators Λ1 : X → Y , Λ2 : Y → Z. Assume that one of the two operators is continuous while the other is compact. Prove that the composition Λ2 ◦ Λ1 : X → Z is a compact linear operator. 7. Let X be a Banach space and let Λ : X → X be a compact linear operator such that Λ = Λ2 . Namely, Λ(x) = Λ(Λ(x)) for all x ∈ X. Prove that Range(Λ) is finite-dimensional. 8. Let X be a Banach space. Let U ⊂ X be a closed, convex set such that n≥1 nU = X. Prove that U contains a neighborhood of the origin. 9. Let X, Y be infinite-dimensional Banach spaces. If K : X → Y is a compact linear operator, prove that K(X) = Y , i.e., K cannot be surjective. As an example, consider the map K : 1 → 1 defined as follows. If x = . (x1 , x2 , x3 , . . .) with x 1 = n≥1 |xn | < ∞, then Kx = x11 , x22 , x33 , . . . , xnn , . . .). Find a point y ∈ 1 \ K( 1 ). 10. Let f : R → R be a bounded function with closed graph. Prove that f is continuous. On the other hand, construct a function g : R → R which is one-to-one and onto, has closed graph, but is not continuous.

4.6. Problems

73

11. Let X, Y be Banach spaces. Prove that every continuous function f : X → Y (not necessarily linear) has closed graph. Construct a bounded function g : R → L1 (R) which has closed graph but is not continuous. 12. Let 1 ≤ p < ∞, and consider the Banach space p of all sequences x = 1/p . p (x1 , x2 , x3 , . . .) of real numbers such that x p = < ∞. Let k≥1 |xk | (λ1 , λ2 , λ3 , . . .) be a bounded sequence of real numbers, and define the bounded linear operator Λ : p → p by setting . Λ(x1 , x2 , x3 , . . .) = (λ1 x1 , λ2 x2 , λ3 x3 , . . . ). Prove that Λ is compact if and only if limk→∞ λk = 0. 13. Let X, Y, Z be Banach spaces. Let B : X × Y → Z be a bilinear map.1 Assume that B is continuous at the origin. Prove that B is bounded; namely there exists a constant C such that B(x, y) ≤ C x y

for all x ∈ X , y ∈ Y.

14. Let X be the vector space of all polynomials in one real variable, with norm 1 . |p(t)| dt . p = 0

Consider the bilinear functional B : X × X → R defined as 1 p(t)q(t) dt. B(p, q) = 0

Show that, for each fixed p ∈ X, the map q → B(p, q) is a continuous linear functional on X. Similarly, for q ∈ X fixed, the map p → B(p, q) is a continuous linear functional. However, prove that B is not continuous from X the product space X × . 1 1 into R. We recall that X × X has norm (p, q) = max 0 |p(x)|dx , 0 |q(x)|dx . 15. Let S be a closed bounded subset of a Banach space X. Assume that for every ε > 0 there exists a finite-dimensional subspace Yε ⊆ X such that . d(S, Yε ) = sup d(x, Yε ) ≤ ε. x∈S

Prove that the set S is compact. 16. Let A be a bounded linear operator on the Banach space X. Assuming that AK = KA for every compact operator K, prove that A is a scalar multiple of the identity, i.e., there exists a number λ such that A = λI. 1 Saying that B is bilinear means that x → B(x, y) is a linear map for every given y ∈ Y , and y → B(x, y) is a linear map for every fixed x ∈ X.

74

4. Bounded Linear Operators

17. Let X be a Banach space and let Λ : X → X be a bounded linear operator. Assume that there exist ε > 0 and an infinite-dimensional subspace V ⊆ X such that Λx ≥ ε x for every x ∈ V . Prove that the operator Λ cannot be compact. 18. Let X be a Banach space and let X ∗ be its dual space. Consider a sequence of elements xk ∈ X with the property that the series k≥1 x∗ , xk converges for every x∗ ∈ X ∗ . Prove that . ∗ x , xk x∗ → ϕ(x∗ ) = k≥1 ∗

is a bounded linear functional on X . 19. On the Banach space X = C([0, 1]) consider the linear operator Λ : X → X defined by 1 t (Λf )(0) = f (0) , (Λf )(t) = f (s) ds for t > 0 . t 0 (i) Prove that Λ is continuous. (ii) Prove that Λ is one-to-one but not onto. (iii) Show that Λ is not compact. 20. Let Ω ⊂ Rn be an open set and let g : Ω → R be a bounded, measurable function. As in Example 2.18, for any 1 ≤ p ≤ ∞, on the space Lp (Ω) consider the . multiplication operator (Mg f )(x) = g(x) f (x). (i) Determine for which functions g the operator Mg is one-to-one. (ii) Determine for which functions g the operator Mg has closed range. (iii) Determine for which functions g the operator Mg is compact. 21. Find an example showing that, for an infinite-dimensional Banach space X, the set of bounded linear operators Λ : X → X which are one-to-one may not be an open subset of the space B(X; X) of all bounded linear operators. Similarly, show that the set of bounded linear operators whose range is dense may not be open in B(X; X). 22. Let X, Y be Banach spaces, and let Λ : X → Y be a linear continuous bijection. (i) Prove that there exists β > 0 such that Λx ≥ β x for all x ∈ X . (ii) Let Ψ ∈ B(X; Y ) be any bounded linear operator with norm Ψ < β. Using the contraction mapping theorem, prove that, for any f ∈ Y , the equation u = Λ−1 (f − Ψu) has a unique solution. (iii) Prove that, within the Banach space B(X; Y ), the set of all bijective operators is open.

4.6. Problems

75

23. Recalling the spaces introduced in Examples 2.6 and 2.7 of Chapter 2, define the operator Λ : ∞ → ∞ by setting Λx = y = (y1 , y2 , y3 , . . .), where n . 1 xi . (4.12) yn = n i=1 (i) Prove that Λ is a bounded linear operator and compute its norm. Is Λ a compact operator? (ii) Consider the operator Λ defined as in (4.12), but on the space 1 of absolutely summable sequences. Is Λ a bounded linear operator? Is it compact? 24. Let X, Y be vector spaces, with Y ⊆ X. Assume that there exist norms such that (X ; · X ) and (Y ; · Y ) are both Banach spaces, with y X ≤ C · y Y

for all y ∈ Y .

(i) If Y = X, prove that the two norms · X and · Y are equivalent, namely x Y ≤ C · x X for some constant C . (ii) If Y = X, prove that Y is a subset of first category in X. Namely, each . set Sn = {y ∈ Y ; y Y ≤ n} is a closed, nowhere dense subset of X.

Chapter 5

Hilbert Spaces

The Euclidean space Rn is equipped with a natural inner product ·, ·. This inner product is useful in many ways: . • It defines the Euclidean norm x = x, x. • It determines perpendicular spaces and perpendicular projections, and it allows us to construct bases of mutually orthogonal vectors {v1 , . . . , vn }. Thanks to the inner product, computing the components of a vector v ∈ Rn with respect to an orthogonal basis is an easy matter. • Every linear functional ϕ : Rn → R can be represented as an inner product: ϕ(x) = w, x for a suitable vector w ∈ Rn . • Having an inner product, one can define a class of symmetric operators, with many useful properties. We recall that A : Rn → Rn is symmetric if Ax, y = x, Ay for all x, y ∈ Rn . In the standard basis of Rn , symmetric operators correspond to symmetric n × n matrices. • Always relying on the inner product, one can define a class of positive operators. We recall that A : Rn → Rn is strictly positive definite if Ax, x > 0 for all x ∈ Rn , x = 0. In this case, the map x → Ax, x is a positive definite quadratic form. The goal of this chapter is to show how the definition and properties of the Euclidean inner product can be extended to infinite-dimensional vector spaces. 77

78

5. Hilbert Spaces

5.1. Spaces with an inner product Let H be a vector space over the field K of real or complex numbers. An inner product on H is a map (·, ·) that, to each couple of elements x, y ∈ H, associates a number (x, y) ∈ K with the following properties. For every x, y, z ∈ H and α ∈ K one has (i) (x, y) = (y, x), where the upper bar denotes complex conjugation; (ii) (x + y, z) = (x, z) + (y, z); (iii) (αx, z) = α(x, z); (iv) (x, x) ≥ 0, with equality holding if and only if x = 0. Notice that the above properties also imply (5.1)

(x, y + z) = (x, y) + (x, z),

(x, αy) = α ¯ (x, y).

In the case where K = R, the properties (i)–(iii) simply say that an inner product on a real vector space is a symmetric bilinear mapping. In connection with the above inner product, we also define . (5.2) x = (x, x) . The Minkowski inequality, proved below, shows that this is indeed a norm on the vector space H. Theorem 5.1 (Two basic inequalities). Let H be a vector space with inner product (·, ·). Then (i) |(x, y)| ≤ x y

(Cauchy-Schwarz inequality);

(ii) x + y ≤ x + y

(Minkowski inequality).

Proof. (i) If y = 0, the first inequality is trivial. To cover the general case, set . . . a = (x, x), b = (x, y), c = (y, y). For every scalar λ ∈ K one has ¯ + ¯bλ + cλλ. ¯ 0 ≤ (x + λy, x + λy) = a + bλ Choosing λ = −b/c, we obtain b¯b . c Since c = (y, y) > 0, multiplying both sides by c we obtain 0 ≤ a−

0 ≤ ac − |b|2 = x 2 y 2 − |(x, y)|2 , proving (i).

5.2. Orthogonal projections

79

(ii) By the Schwarz inequality, Re(x, y) ≤ |(x, y)| ≤ x y . Therefore x + y 2 = (x + y, x + y) = x 2 + y 2 + 2 Re(x, y) 2 ≤ x 2 + y 2 + 2 x y = x + y .

Taking square roots we obtain (ii).

A vector space H with an inner product (·, ·), which is complete with . respect to the norm x = (x, x), is called a Hilbert space. Example 5.2. The Euclidean space Rn with inner product (x, y) = x1 y1 + · · · + xn yn is a Hilbert space over the real numbers. Example 5.3. The space 2 of all sequences of complex numbers x = (x1 , x2 , . . .) such that ∞ 1/2 x = |xk |2 0 we can find a continuous function fε : [−π, π] → C such that fε − f L2 ([−π,π]) < ε ,

(5.14)

fε (−π) = fε (π) .

In turn, as shown by Example 3.8 in Chapter 3, we can find a complex trigonometric polynomial of the form p(x) =

N

αk eikx

k=−N

such that (5.15)

. fε − p C 0 ([−π, π]) = maxx∈[−π,π] |fε (x) − p(x)| < ε .

Observing that fε − p L2 =

π

1/2 |fε (x) − p(x)| dx 2

−π

≤

√

2π fε − p C 0 ,

5.6. Positive definite operators

89

it is clear that fε can be approximated by trigonometric polynomials p(·) also with respect to the L2 norm. Using (5.14) together with (5.15) we conclude that span(S) = L2 ([−π, π]). Now consider the complex trigonometric series π ∞ eikx e−ikx . (5.16) ak √ , f (x) √ ak = (f, ϕk ) = dx . 2π 2π −π k=−∞ By the previous theorems, this series converges to f in L2 ([−π, π]), namely N (5.17) lim f − ak ϕ k = 0. N →∞ k=−N

L2 ([−π, π])

This result can be restated as Corollary 5.11. Let f ∈ L2 ([−π, π] ; C) be a complex-valued, square summable function. Defining the coefficients π . 1 ck = f (y) e−iky dy , 2π −π one has the convergence lim N →∞

2 N ck eikx dx = 0. f (x) − −π π

k=−N

5.6. Positive definite operators A basic problem of linear algebra is to solve the system of linear equations Ax = b , where A is an n × n matrix and b is a vector in Rn . One condition which guarantees the existence and uniqueness of solutions is that A be strictly positive definite. Indeed if the inner product satisfies Ax, x > 0 for every x = 0, then A must have full rank. Hence the above system of linear equations will have a unique solution. In this section we show that this result remains valid also in an infinite-dimensional Hilbert space. Let H be a Hilbert space over the real numbers. We say that a linear operator A : H → H is strictly positive definite1 if there exists β > 0 such that (5.18)

(Au, u) ≥ β u 2

for all u ∈ H.

1 In the literature, operators satisfying the inequality (5.18) are usually called strictly monotone. Here we prefer to call them strictly positive definite, to stress the relationship with positive definite matrices in linear algebra.

90

5. Hilbert Spaces

Theorem 5.12 (Inverse of a positive definite operator). Let H be a real Hilbert space. Let A : H → H be a bounded linear operator which is strictly positive definite, so that (5.18) holds. Then, for every f ∈ H, there . exists a unique u = A−1 f ∈ H such that (5.19)

Au = f .

The inverse operator A−1 satisfies A−1 ≤

(5.20)

1 . β

Proof. We need to show that, under the assumption (5.18), the continuous map A is one-to-one and onto. 1. From (5.18) it follows that β u 2 ≤ (Au, u) ≤ Au u . Hence β u ≤ Au .

(5.21)

If Au = 0, then u = 0, proving that Ker(A) = {0} and A is one-to-one. 2. Next, we claim that Range(A) is closed. Consider any sequence of points vn ∈ Range(A), such that vn → v. We need to show that v = Au for some u ∈ H. By assumption, vn = Aun for some un ∈ H. Using (5.21) we obtain 1 1 lim sup um −un ≤ lim sup Aum −Aun = lim sup vm −vn = 0. m,n→∞ m,n→∞ β m,n→∞ β Hence the sequence (un )n≥1 is Cauchy and converges to some limit u ∈ H. By continuity, Au = v, proving our claim. 3. We now claim that Range(A) = H. If not, since Range(A) is closed, we could find a nonzero vector w ∈ [Range(A)]⊥ . But this would imply β w 2 ≤ (Aw, w) = 0, reaching a contradiction. 4. By the previous steps, A : H → H is a bijection, hence the equation . (5.19) has a unique solution u = A−1 f . By (5.21) it follows that A−1 f = u ≤ proving (5.20).

f β

for all f ∈ H ,

5.6. Positive definite operators

91

The previous result can also be conveniently formulated in terms of bilinear forms. Theorem 5.13 (Lax-Milgram). Let H be a Hilbert space over the reals and let B : H × H → R be a continuous bilinear functional. This means that B[au + bu , v] = aB[u, v] + bB[u , v], B[u , av + bv ] = aB[u, v] + bB[u, v ] , |B[u, v]| ≤ C u v , for some constant C and all u, u , v, v ∈ H, a, b ∈ R. In addition, assume that B is strictly positive definite, i.e., there exists a constant β > 0 such that B[u, u] ≥ β u 2

for all u ∈ H .

Then, for every f ∈ H, there exists a unique u ∈ H such that (5.22)

B[u, v] = (f, v)

for all v ∈ H .

Moreover, u ≤ β −1 f .

(5.23)

Proof. For every fixed u ∈ H the map v → B[u, v] is a continuous linear functional on H. By the Riesz representation theorem, there exists a unique vector, which we call Au ∈ H, such that B[u, v] = (Au, v)

for all v ∈ H .

We claim that A is a bounded, positive definite linear operator. The linearity of A is easy to check. To prove that A is bounded we observe that, for every u ∈ H, . Au = sup |(Au, v)| = sup |B[u, v]| ≤ C u . v=1

v=1

Hence A ≤ C. Moreover, . (Au, u) = B[u, u] ≥ β u 2 , proving that A is strictly positive definite. We can now apply Theorem 5.12 and conclude that the equation Au = f has a unique solution u = A−1 f , satisfying u ≤ β −1 f . By the definition of A, this provides a solution to (5.22).

92

5. Hilbert Spaces

5.7. Weak convergence Let H be a Hilbert space. We say that a sequence of points xn ∈ H converges weakly to a point x ∈ H, and write xn x, if (5.24)

lim (y, xn ) = (y, x)

n→∞

for every y ∈ H.

If a weak limit exists, then it is necessarily unique. Indeed, assume that xn x and xn x ˜. Choosing y = x − x ˜ in (5.24), one obtains 0 = lim (x− x ˜, xn )− lim (x− x ˜, xn ) = (x− x ˜, x)−(x− x ˜, x ˜) = x− x ˜ 2 . n→∞

n→∞

Hence x = x ˜. We recall that, if H is infinite-dimensional, then the closed unit ball . B1 = {x ∈ H ; x ≤ 1} is not compact (with respect to the topology determined by the norm). In particular, B1 contains an orthonormal sequence {e1 , e2 , . . .}, which does not admit any convergent subsequence. On the other hand, replacing strong convergence by weak convergence, one can still prove a useful compactness property. Theorem 5.14 (Weakly convergent sequences). Let H be a Hilbert space. (i) Every weakly convergent sequence is bounded. Namely, if xn x, then xn ≤ C for some constant C and all n ≥ 1. (ii) Every bounded sequence of points xn ∈ H admits a weakly convergent subsequence: xnj x for some x ∈ H. Proof. 1. Every xn can be regarded as a continuous linear map, namely y → (y, xn ) from H into K (the field of reals, or of complex numbers). The Hilbert norm of xn coincides with the norm of xn as a linear functional on H, namely xn = sup |(y, xn )| . y≤1

By assumption, for every y ∈ H, the set {(y, xn ) ; n ≥ 1} is bounded. Hence by Remark 4.2 in Chapter 4 (the uniform boundedness principle), the countable set of linear functionals {xn ; n ≥ 1} is uniformly bounded. This establishes (i). . 2. To prove (ii), consider the vector space X = span{xn ; n ≥ 1}, i.e., the closure of the set of all linear combinations of the points x1 , x2 , . . .. We observe that X is separable. Indeed, consider the set of all finite linear

5.7. Weak convergence

93

combinations

N

θi xi

i=1

with N ≥ 1 and θ1 , . . . , θN rational. This is a countable set dense in X. Since X ⊆ H is itself a Hilbert space, by Riesz’s theorem each xn can be identified with a bounded linear functional on X. By Theorem 2.34 in Chapter 2 (Banach-Alaoglu), the sequence (xn )n≥1 admits a weak-star convergent subsequence, say (xnj )j≥1 . This means that lim (y, xnj ) = ϕ(y)

j→∞

for all y ∈ X ,

for some bounded linear functional ϕ : X → K. 3. By Riesz’s representation theorem there exists a unique element x ∈ X such that ϕ(y) = (y, x) for all y ∈ X. We conclude the proof by showing that the subsequence xnj converges weakly to x in the entire Hilbert space H. Indeed, consider any y ∈ H. Denoting by PX : H → X the perpendicular projection, one has lim (y, xnj ) = lim (PX y, xnj ) = (y, x) .

j→∞

j→∞

We conclude this section by showing that a compact operator maps weakly convergent sequences into strongly convergent ones. Theorem 5.15. In a Hilbert space H, consider a weakly convergent sequence xn x. Let Λ : H → H be a compact operator. Then one has the strong convergence (5.25)

Λxn − Λx → 0 .

Proof. To prove (5.25), it suffices to show that, from any subsequence (xn )n∈I1 one can extract a further subsequence (xn )n∈I2 , with I2 ⊆ I1 , such that lim Λxn − Λx = 0 . n→∞, n∈I2

Let a subsequence (xn )n∈I1 be given. Since this sequence is weakly convergent, by the previous theorem it is globally bounded, say xn ≤ C for every n ∈ I1 . Since Λ is compact, from this bounded sequence we can extract a further subsequence (xn )n∈I2 , with I2 ⊆ I1 , such that the images converge strongly: lim

n→∞, n∈I2

Λxn − y = 0

for some y ∈ H. It remains to show that y = Λx.

94

5. Hilbert Spaces

Calling Λ∗ the adjoint operator, one has (v , Λxn − Λx) = (Λ∗ v , xn − x) → 0

for all v ∈ H,

proving the weak convergence Λxn Λx. Since the weak limit is unique, this implies Λx = y, completing the proof.

Figure 5.7.1. A plot of the functions fn (x) = sin2 nx, for n =

1, 5, 25. As n → ∞, the functions fn do not converge pointwise, or in the L2 norm. However, we have the weak convergence fn f ≡ 1/2, because the average value of fn on every interval [a, b] converges to the constant 1/2.

Example 5.16. In the space L2 ([0, π]), consider the sequence of functions fn (x) = sin2 nx (see Figure 5.7.1). We claim that this sequence converges weakly to the constant function f (x) ≡ 1/2. Indeed, consider any g ∈ L2 ([0, π]). We need to prove that π π g(x) 2 (5.26) lim g(x) sin nx dx = dx . n→∞ 0 2 0 In the special case where (5.27)

g(x) =

1 0

if x ∈ [0, b] , if x ∈ ]b, π] ,

the result is clear, because b π b sin 2nb g(x) b 2 lim sin nx dx = lim − = = dx . n→∞ 0 n→∞ 2 4n 2 2 0

5.8. Problems

95

By linearity, (5.26) remains true for every linear combination of functions of the form (5.27), i.e., for every piecewise constant function. Now consider any function g ∈ L2 . Given any ε > 0, one can find a piecewise constant function g˜ such that g − g˜ L2 < ε. This yields π 1 2 g(x) sin nx − dx 2 0

π

=

g˜(x) 0

1 sin nx − 2 2

π

1 [˜ g (x) − g(x)] · sin nx − 2 2

dx + 0

dx

. = An + Bn . Since g˜ is piecewise constant, we already know that An → 0 as n → ∞. On the other hand, by Cauchy’s inequality, 2 1/2 π 1/2 π 1 |Bn | ≤ ˜ sin2 nx − g − g L2 · dx < ε· . 2 4 0 Since ε can be taken arbitrarily small, this proves the weak convergence fn f . Notice that strong convergence does not hold. Indeed, π 1 2 1 2 2 lim fn − f L2 = lim sin nx − dx = . n→∞ n→∞ 0 2 8 Next, consider the compact operator Λ from L2 ([0, π]) into itself defined by . (Λg)(x) =

x

g(y) dy . 0

This integral operator maps the sequence fn (x) = sin2 nx into the sequence x x sin 2nx (Λfn )(x) = sin2 ny dy = − . 2 4n 0 Moreover, (Λf )(x) = x/2. Observe that Λfn converges to Λf uniformly on [0, π]. In particular, we have the strong convergence Λfn −Λf L2 → 0. This provides an illustration of Theorem 5.15, stating that a compact operator maps weakly convergent sequences into strongly convergent ones.

5.8. Problems 1. Let H be a Hilbert space. Using the definition of inner product, prove the two identities in (5.1). Moreover, prove Pythagoras’ theorem: if two vectors x, y ∈ H are orthogonal to each other, then x 2 + y 2 = x + y 2 .

96

5. Hilbert Spaces

2. Let H be a vector space with inner product (·, ·). Prove that the inner product is a continuous map from the product space H × H into K. In other words, if xk → x and yk → y in the norm (5.2), then the sequence of inner products (xk , yk ) . converges to (x, y). On H × H, consider the norm (x, y) = ( x 2 + y 2 )1/2 . 3. Let X be a Banach space over the reals,with norm · . Does there exist an inner product (·, ·) on X such that x = (x, x) for every x ∈ X? To answer, follow the two steps below. (i) Let H be a real Hilbert space. Prove that its norm x = (x, x) satisfies the parallelogram identity (5.28)

x + y 2 + x − y 2 = 2 x 2 + 2 y 2

for all x, y ∈ X .

(ii) Conversely, let X be a Banach space over the real numbers, whose norm satisfies the parallelogram identity (5.28). Show that 1 . 1 x + y 2 − x 2 − y 2 = x + y 2 − x − y 2 (x, y) = 2 4 is an inner product on X, which yields exactly the same norm · . Hints: To prove that (x + x , y) = (x, y) + (x , y), first establish the identity 1 1 x+x +y 2 = x 2 + x 2 + x+y 2 + x +y 2 − x+y −x 2 − x +y −x 2 . 2 2 To prove that (λx, y) = λ(x, y), first show that this identity holds when λ is an integer, then for λ rational, and finally by continuity for every λ ∈ R. 4. Using the above problem, prove that: . (i) On the vector space R2 , for all p ≥ 1 with p = 2, the norm x p = (|x1 |p + |x2 |p )1/p is not generated by an inner product. . (ii) On the space of real-valued continuous functions C([0, 1]), the norm f = maxx∈[0,1] |f (x)| is not generated by an inner product. 5. Consider the Hilbert space H = L2 ([−1, 1]), with inner product (f, g) = 1 f g dx. Show that the sequence of polynomials {1, x, x2 , x3 , . . .} is linearly inde−1 pendent, and its span is dense in H. Applying the Gram-Schmidt orthogonalization procedure, construct an orthonormal sequence of polynomials {p0 , p1 , p2 , p3 , . . .} (proportional to the Legendre polynomials). Explicitly compute the first three polynomials. 6. Let (en )n≥1 be an orthonormal sequence in a Hilbert space H. Prove the weak convergence en 0. In other words show that, for every x ∈ H, one has limn→∞ (x, en ) = 0. 7. Let H be an infinite-dimensional Hilbert space and let any vector x ∈ H be given, with x ≤ 1. Construct a sequence of vectors xn with xn = 1 for every n ≥ 1, such that the weak convergence holds: xn x.

5.8. Problems

97

8. Let (vk )k≥1 be a sequence of unit vectors in a real Hilbert space H. Let (αk )k≥1 be a sequence of real numbers. ∞ (i) If ∞ k=1 |αk | < ∞, prove that the series k=1 αk vk converges. (ii) Assume that the sequence (vk )k≥1 is orthonormal. In this case, prove ∞ ∞ that the series k=1 αk vk converges if and only if k=1 |αk |2 < ∞. 9. Let φ : Rn → Rn be a smooth bijection. Assume that the determinant of the Jacobian matrix satisfies det Dφ(x) = 1 for all x ∈ Rn , so that φ is volume. preserving. On the space L2 (Rn ) consider the linear operator (Λf )(x) = f (φ(x)). Prove that Λ = 1 ,

Λ∗ = Λ−1 .

Observe that Λ can be regarded as a rotation in the space L2 (Rn ), because the adjoint operator coincides with the inverse. x = (x1 , x2 , . . .) 10. Consider space 2 of all sequences of real numbers ∞the Hilbert . ∞ 2 such that k=1 |xk | < ∞, with inner product (x, y) = k=1 xk yk . Define the . operator Λ : H → H by setting Λ(x1 , x2 , x3 , . . .) = (x2 , x3 , x4 , . . .). Compute the ∗ ∗ adjoint operator Λ . Are the operators Λ, Λ surjective? One-to-one? 11. Let S be a convex set. We say that x ∈ S is an extreme point of S if x cannot be expressed as a convex combination of distinct points of S. In other words, x = θx1 + (1 − θ)x2

whenever 0 < θ < 1,

x1 , x2 ∈ S ,

x1 = x2 .

Prove the following. (i) If S is the closed unit ball in a Hilbert space H, then every point x ∈ S with x = 1 is an extreme point of S. This is true, in particular, for the space H = L2 ([0, 1]). (ii) On the other hand, consider the unit ball in L1 ([0, 1]), i.e., 1 . B = f : [0, 1] → R ; |f (t)| dt ≤ 1 . 0

Prove that B does not contain any extreme point. . 12. Let (xn )n≥1 be a sequence of points in a Hilbert space H such that C = lim inf n→∞ xn < ∞. Prove that there exists a weakly convergent subsequence xnj x, for some point x ∈ H satisfying x ≤ C. 13. Let H be an infinite-dimensional, separable Hilbert space over the reals, and let . 2 be the space of all sequences of real numbers a = (a1 , a2 , . . .) such that a 2 = (∞ 2 )1/2 < ∞. Construct a linear bijection Λ : H → 2 which preserves k=1 ak distances, i.e., such that Λx 2 = x H

for all x ∈ H .

98

5. Hilbert Spaces

14. Given n vectors v1 , v2 , . . . , vn in a real Hilbert space H, define their Gram determinant as the determinant of the n × n symmetric matrix in (5.7): ⎞ ⎛ (v1 , v1 ) · · · (vn , v1 ) . ⎜ .. ⎟ . .. G(v1 , . . . , vm ) = det ⎝ ... . . ⎠ (v1 , vn ) · · ·

(vn , vn )

(i) Using (5.7) with x = 0, prove that the vectors v1 , . . . , vn are linearly independent if and only if G(v1 , . . . , vn ) = 0. (ii) Assuming that the vectors v1 , . . . , vn are linearly independent, consider . the subspace V = span{v1 , . . . , vn }. For any vector x ∈ H, show that the distance of x to V is * G(x, v1 , v2 , . . . , vn ) . d(x, V ) = x − PV (x) = G(v1 , v2 , . . . , vn ) (iii) As shown in Figure 5.8.1, prove that the n-dimensional volume of the parallelepiped with edges v1 , . . . , vn can be expressed as v1 · d v2 ; span{v1 } · d v3 ; span{v1 , v2 } · · · d vn ; span{v1 , . . . , vn−1 } . Using (ii), show that the volume of this parallelepiped is G(v1 , v2 , . . . , vn ).

Figure 5.8.1. Computing the area of a parallelogram and the volume of a parallelepiped. Here h2 = d(v2 , span{v1 }), while h3 = d(v3 , span{v1 , v2 }).

15. Let f ∈ L2 (R). Prove that there exists a unique even2 function g0 such that f − g0 L2 =

min

g∈L2 (R), g even

f − g L2 .

Explicitly determine the function g0 . 16. Let K ∈ B(X; H) be a bounded linear operator from a Banach space X into a Hilbert space H. Prove that the following conditions are equivalent. (i) K is compact. (ii) For every ε > 0 there exists an operator Kε : X → H with finitedimensional range, such that Kε − K < ε. 2 We

recall that g is an even function if g(x) = g(−x) for all x ∈ R.

5.8. Problems

99

17. Let un n≥1 and vn n≥1 be two orthonormal sets in a Hilbert space H. Assume that ∞ un − vn < 1. n=1

Show that if one set is complete, then the other set is also complete. 18. Let {un ; n ≥ 1} be a countable set of linearly independent unit vectors in a . −n un . Hilbert space H. Consider the vector v = ∞ n=1 2 (i) Assuming that all vectors un are mutually orthogonal, prove that the set . S = {v, u1 , u2 , u3 , . . .} is linearly independent. (ii) Show by an example that, if the vectors un are not mutually orthogonal, the above set S can be linearly dependent. 19. Given a sequence (xn )n≥1 in a Hilbert space H, show that the strong convergence xn − x → 0 holds if and only if xn → x

and

xn x (weak convergence).

20. Let Q = [0, 1] × [0, 1] be the unit square. Within the space L2 (Q), consider the subspace of all functions depending only on the variable y: U = u ∈ L2 (Q) ; u(x, y) = ϕ(y) for some function ϕ : [0, 1] → R and a.e. (x, y) ∈ Q . (i) Find the orthogonal subspace W = U ⊥ . (ii) Given any f ∈ L2 (Q), determine the function g ∈ U such that f − g L2 (Q) = min f − u L2 (Q) . u∈U

21. Let H be a Hilbert space, and let Ω ⊂ H be a closed, convex subset. (i) Prove that, for any x ∈ H, there exists a unique point y ∈ Ω such that y − x = minω∈Ω ω − x . This point of minimum distance y = πΩ (x) is called the perpendicular projection of x into Ω. (ii) Show that y = πΩ (x) if and only if ω − y , y − x ≥ 0 for all ω ∈ Ω. 22. On the Hilbert space H = L2 (R), consider the subset . Ω = f ; f (x) ≤ ex for a.e. x ∈ R . (i) Prove that Ω is a closed, convex subset of H. (ii) Prove that the perpendicular projection π : H → Ω is the map defined as (πf )(x) = min{f (x), ex }.

100

5. Hilbert Spaces

23. On the Hilbert space H = L2 (R), consider the linear operator Λ : H → H defined by (Λf )(x) = f (|x|). (i) Find the operator norm of Λ. (ii) Find the kernel and the range of Λ. (iii) Compute the adjoint operator Λ∗ . 24. On the Hilbert space H = L2 ([0, ∞[), consider the operator (Λf )(x) = f (ex ). (i) Compute the norm of linear operator Λ. (ii) Find the kernel and the range of Λ. (iii) Compute the adjoint operator Λ∗ . 25. Consider a bounded sequence of functions fn ∈ L2 ([0, T ]). As n → ∞, show that the weak convergence fn f holds if and only if b b fn (x) dx = f (x) dx for every b ∈ [0, T ] . lim n→∞

0

0

26. On the space L2 ([0, 1]), consider the two sequences of functions

2/3 √ n if x ∈ [0, n−1 ] , fn (x) = n · cos nx , fn (x) = 0 if x > n−1 . b (i) In both cases, prove that limn→∞ 0 fn (x) dx = 0 for every b ∈ [0, 1]. 1 (ii) By taking linear combinations, show that limn→∞ 0 fn g dx = 0 for every piecewise constant function g. (iii) Is it true that fn 0 ? 27. Given a sequence (xn )n≥1 in a Hilbert space H, prove that the following statements are equivalent. (i) The weak convergence holds: xn x. (ii) The sequence (xn ) is bounded and (y, xn ) → (y, x) for all y in a subset S ⊆ H whose closure has nonempty interior. 28. Let Ω ⊆ RN be an open set. Prove that a sequence of functions fn ∈ L2 (Ω) converges weakly to f if and only if there exists a constant C such that fn L2 ≤ C for every n ≥ 1 and moreover lim fn dx = f dx n→∞

Q

Q

for every box Q = [a1 , b1 ] × [a2 , b2 ] × · · · × [aN , bN ] entirely contained in Ω. 29. In a real Hilbert space H, consider a weakly convergent sequence: xn y. . Let S = co{xn ; n ≥ 1} be the smallest closed convex set containing all points xn . Prove that y ∈ S.

Chapter 6

Compact Operators on a Hilbert Space

For a linear operator A : Rn → Rn in a finite-dimensional space, several results are known concerning its kernel, range, eigenvalues, and eigenvectors. In particular: (i) A is one-to-one if and only if A is onto. Indeed, the subspaces Ker(A) and [Range(A)]⊥ have the same dimension. (ii) If A is symmetric, then its eigenvalues are real. Moreover, the space Rn admits an orthonormal basis consisting of eigenvectors of A. In this chapter we shall prove similar results, valid for operators Λ on an infinite-dimensional Hilbert space H. Indeed, (i) remains valid for operators of the form Λ = I − K, where I is the identity and K is a compact operator. Moreover, the statements in (ii) can be extended to any compact, selfadjoint operator Λ : H → H.

6.1. Fredholm theory Let H be a Hilbert space. We recall that a bounded linear operator K : H → H is compact if for every bounded sequence of points un ∈ H one can extract a subsequence (unj )j≥1 such that the images converge: Kunj → v for some v ∈ H. The next theorem describes various relations between the kernel and range of an operator having the form I − K and of its adjoint. 101

102

6. Compact Operators on a Hilbert Space

Theorem 6.1 (Fredholm). Let H be a Hilbert space over the reals and let K : H → H be a compact linear operator. Then (i) Ker(I − K) is finite-dimensional; (ii) Range(I − K) is closed; (iii) Range(I − K) = Ker(I − K ∗ )⊥ ; (iv) Ker(I − K) = {0} if and only if Range(I − K) = H; (v) Ker(I − K) and Ker(I − K ∗ ) have the same dimension. Proof. 1. If the kernel of (I − K) is infinite-dimensional, one can find an orthonormal sequence (en )n≥1 contained in Ker(I − K). In this case Ken = en for every n. Moreover, by Pithagoras’ theorem, for m = n one has em − en 2 = em 2 + en 2 = 2. √ Therefore Kem −Ken = em −en = 2 for every m = n. Hence from the sequence (Ken )n≥1 one cannot extract any convergent subsequence. This contradiction establishes (i). 2. Toward the proof of (ii), we first show that there exists β > 0 such that (6.1)

u − Ku ≥ β u

for all u ∈ Ker(I − K)⊥ .

Indeed, if (6.1) fails, we could find a sequence of points un ∈ Ker(I − K)⊥ such that un = 1 and un − Kun < 1/n. Since the sequence (un )n≥1 is bounded, by extracting a subsequence and relabeling, we can assume that this sequence converges weakly, say un u for some u ∈ H. Since K is a compact operator, by Theorem 5.15 this implies the strong convergence Kun → Ku. We now have un − Ku ≤ un − Kun + Kun − Ku → 0

as n → ∞.

This yields the strong convergence un → Ku. Recalling the weak convergence un u, we conclude that u = Ku and un → u strongly. By construction, we now have u = lim un = 1, n→∞

u ∈ Ker(I − K)⊥ ,

while, at the same time, u − Ku = 0, hence u ∈ Ker(I − K). We thus reached a contradiction, proving (6.1). 3. We now prove that Range(I −K) is closed. Consider a sequence of points vn ∈ Range(I − K), with vn → v as n → ∞. We need to find some u such that v = u − Ku. By assumption, for each n ≥ 1, there exists un such that

6.1. Fredholm theory

103

vn = un − Kun . Notice that, if un → u for some u ∈ H, then by continuity we could immediately conclude that v = u − Ku. In general, however, there is no guarantee that the sequence (un )n≥1 converges. To overcome this difficulty, let u ñ be the perpendicular projection of un . on Ker(I − K), and let zn = un − u ñ . These definitions yield . z n = un − u ñ ∈ Ker(I − K)⊥ , vn = un − Kun = zn − Kzn . Using (6.1), for every pair of indices m, n we obtain vm − vn ≥ β zm − zn . Since the sequence (vn )n≥1 is Cauchy, this proves that the sequence (zn )n≥1 is a Cauchy sequence as well. Therefore there exists u ∈ H such that zn → u, and hence u − Ku = lim zn − Kzn = lim vn = v. n→∞

n→∞

Figure 6.1.1. Left: by choosing zn ∈ Ker(I − K)⊥ we achieve

the inequality β zn ≤ vn , used in the proof of (ii). Right: if . I − K is one-to-one but not onto, the sequence of subspaces Hn = n (I − K) (H) is strictly decreasing. This leads to a contradiction, used in the proof of (iv).

4. Since Range(I −K) and Ker(I −K ∗ )⊥ are closed subspaces, the assertion (iii) holds if and only if (6.2)

Range(I − K)⊥ = Ker(I − K ∗ ).

This is proved by observing that the following statements are all equivalent: x ∈ Ker(I − K ∗ ), (I − K ∗ )x = 0, (y , (I − K ∗ )x) = 0 ((I − K)y , x) = 0 x ∈ Range(I − K)⊥ .

for all y ∈ H, for all y ∈ H,

104

6. Compact Operators on a Hilbert Space

5. Toward a proof of (iv), assume that Ker(I − K) = {0}, so that the operator I − K is one-to-one. If Range(I − K) = H, we shall derive a contradiction. . Indeed, assume that this range is not the entire space: H1 = (I − K)(H) = H. By (ii), H1 is a closed subspace of H. Since I − K is one-toone, we must have . H2 = (I − K)(H1 ) ⊂ H1 . We can continue this process by induction: for every n, we set . Hn = (I − K)n (H). By induction, we see that each Hn is a closed subspace of H, and H ⊃ H1 ⊃ H2 ⊃ · · · . ⊥ For each n ≥ 1 we now choose a vector en ∈ Hn ∩ Hn+1 with en = 1. Observe that, if m < n, then

Kem − Ken = −(em − Kem ) + (en − Ken ) + (em − en ) = em + zm . with zm = −(em − Kem ) + (en − Ken ) − en ∈ Hm+1 . ⊥ , by Pythagoras’ theorem this implies Since em ∈ Hm+1

Kem − Ken ≥ em = 1. Therefore, the sequence (Ken )n≥1 cannot have any strongly convergent subsequence, contradicting the compactness of K. 6. To prove the converse implication in (iv), we use a duality argument. Assume that Range(I − K) = H. By Theorem 4.8, Ker(I − K ∗ ) = Range(I − K)⊥ = H ⊥ = {0}. Since K ∗ is compact, by the previous step we have Range(I − K ∗ ) = H. Using again Theorem 4.8, we obtain Ker(I − K) = Range(I − K ∗ )⊥ = H ⊥ = {0}, as claimed. 7. Toward a proof of (v), we first show that the dimension of Ker(I − K) is greater than or equal to the dimension of Range(I − K)⊥ . Indeed, suppose on the contrary that (6.3)

dim Ker(I − K) < dim Range(I − K)⊥ .

Then there exists a linear map A : Ker(I − K) → Range(I − K)⊥ which is one-to-one but not onto. We extend A to a linear map A : H → Range(I − K)⊥ defined on the whole space H, by requiring that Au = 0 if u ∈ Ker(I − K)⊥ . Since the range of A is finite-dimensional, the operator A is compact, and so is K + A.

6.1. Fredholm theory

105

We claim that Ker(I − (K + A)) = {0}. Indeed, consider any vector u ∈ H and write u = u 1 + u2 ,

u1 ∈ Ker(I − K),

u2 ∈ Ker(I − K)⊥ .

Then (6.4) (I −K −A)(u1 +u2 ) = (I −K)u2 −Au1 ∈ Range(I −K)⊕Range(I −K)⊥ . Since (I − K)u2 is orthogonal to Au1 , the sum (I − K)u2 + Au1 can vanish only if (I − K)u2 = 0 and Au1 = 0. Recalling that the operator I − K is one-to-one on Ker(I − K)⊥ and A is one-to-one on Ker(I − K), we conclude that u1 = u2 = 0. Applying (iv) to the compact operator K + A, we obtain Range(I − (K + A)) = H. However, this is impossible: by construction, there exists a vector v ∈ Range(I − K)⊥ with v ∈ / Range(A). By (6.4), the equation u − Ku − Au = v has no solution. This contradiction shows that (6.3) cannot hold. 8. Recalling that Range(I − K ∗ )⊥ = Ker(I − K), from the previous step we deduce that dim Ker(I − K ∗ ) ≥ dim Range(I − K ∗ )⊥ = dim Ker(I − K). Interchanging the roles of K and K ∗ we obtain the opposite inequality. Remark 6.2. When K is a compact operator, the above theorem provides information about the existence and uniqueness of solutions to the linear equation u − Ku = f.

(6.5) Namely, two cases can arise.

CASE 1: Ker(I − K) = {0}. Then the operator I − K is one-to-one and onto. For every f ∈ H the equation (6.5) has exactly one solution. CASE 2: Ker(I −K) = {0}. This means that the homogeneous equation u − Ku = 0 has a nontrivial solution. In this case, the equation (6.5) has solutions if and only if f ∈ Ker(I − K ∗ )⊥ , i.e., if and only if (6.6)

(f, u) = 0

for every u ∈ H such that u − K ∗ u = 0.

The above dichotomy is known as the Fredholm alternative.

106

6. Compact Operators on a Hilbert Space

6.2. Spectrum of a compact operator Let H be a Hilbert space over the reals, and let Λ : H → H be a bounded linear operator. • The resolvent set of Λ, denoted as ρ(Λ), is the set of numbers η ∈ R such that ηI − Λ is a bijection (i.e., one-to-one and onto). Notice that in this case, by the open mapping theorem, the inverse operator (ηI − Λ)−1 is continuous. . • The complement of the resolvent set: σ(Λ) = R \ ρ(Λ) is called the spectrum of Λ. • The point spectrum of Λ, denoted as σp (Λ), is the set of numbers η ∈ R such that ηI − Λ is not one-to-one. Equivalently, η ∈ σp (Λ) if there exists a nonzero vector w ∈ H such that Λw = ηw. In this case, η is called an eigenvalue of Λ and w is an associated eigenvector. • The essential spectrum of Λ, denoted as σe (Λ) = σ(Λ) \ σp (Λ), is the set of numbers η ∈ R such that ηI − Λ is one-to-one but not onto. Theorem 6.3 (Spectrum of a compact operator). Let H be an infinitedimensional Hilbert space, and let K : H → H be a compact linear operator. Then (i) 0 ∈ σ(K). (ii) σ(K) = σp (K) ∪ {0}. (iii) Either σp (K) is finite, or else σp (K) = {λk ; k ≥ 1}, where the eigenvalues satisfy limk→∞ λk = 0. Proof. 1. To prove (i) we argue by contradiction. If 0 ∈ / σ(K), then K has a continuous inverse K −1 : H → H. We thus have I = K ◦ K −1 . Since the composition of a continuous operator with a compact operator is compact, this implies that the identity is a compact operator. But this is false, because H is an infinite-dimensional space and the closed unit ball in H is not compact. 2. To prove (ii), assume that λ ∈ σ(K), with λ = 0. If Ker(λI − K) = {0}, the Fredholm alternative would imply Range(λI − K) = H. By the open mapping theorem, (λI − K) would have a bounded inverse, against the assumptions. This contradiction proves that λ ∈ σp (K).

6.3. Selfadjoint operators

107

3. To prove (iii), assume that (λn )n≥1 is a sequence of distinct eigenvalues of K, with λn → λ. We claim that λ = 0. Indeed, since λn ∈ σp (K), for each n ≥ 1 there exists an eigenvector wn . such that Kwn = λn wn . Call Hn = span{w1 , . . . , wn }. Since eigenvectors corresponding to distinct eigenvalues are linearly independent, Hn ⊂ Hn+1 . Observe that, for every n ≥ 2, one has (K − λn I)Hn ⊆ Hn−1 . For each ⊥ n we can thus choose an element en ∈ Hn ∩ Hn−1 with en = 1. If m < n, then (Ken − λn en ) ∈ Hn−1 ,

(Kem − λm em ) ∈ Hm−1 ⊂ Hn−1 ,

⊥ . Hence and em ∈ Hm ⊆ Hn−1 while en ∈ Hn−1 Ken − Kem = (Ken − λn en ) − (Kem − λm em ) + λn en − λm em

≥ λn en = |λn | . Therefore lim inf Ken − Kem ≥ lim |λn | = |λ| .

m,n→∞

n→∞

If |λ| > 0, then the sequence (Ken )n≥1 cannot have any convergent subsequence, contradicting the assumption that K is compact.

6.3. Selfadjoint operators Let Λ : H → H be a bounded linear operator on a real Hilbert space H. We say that Λ is symmetric if (Λx, y) = (x, Λy)

for all x, y ∈ H .

Notice that this is equivalent to saying that Λ is selfadjoint. Example 6.4. Let A = (aij )i,j=1,...,n be a symmetric n × n matrix. Then A determines a symmetric linear operator x → Ax from Rn into Rn . It also determines the quadratic form x → x, Ax =

n

aij xi xj .

i,j=1

The quantities . m = min x, Ax, |x|=1

. M = max|x|=1 x, Ax

provide the smallest and the largest eigenvalue of A, respectively. The theory of symmetric linear operators on a Hilbert space extends many well-known properties of symmetric matrices to an infinite-dimensional setting.

108

6. Compact Operators on a Hilbert Space

Lemma 6.5 (Bounds on the spectrum of a symmetric operator). Let Λ : H → H be a bounded linear selfadjoint operator on a real Hilbert space H. Define the upper and lower bounds . . m = inf (Λu, u), M = sup (Λu, u). u∈H , u=1

u∈H , u=1

Then (i) The spectrum σ(Λ) is contained in the interval [m, M ]. (ii) m, M ∈ σ(Λ). (iii) Λ = max{−m, M }. Proof. 1. Let η > M . Then (ηu − Λu , u) ≥ (η − M ) u 2

for all u ∈ H.

By the Lax-Milgram theorem, the linear continuous operator ηI − Λ is oneto-one and onto. By the open mapping theorem, it has a continuous inverse. This proves that every η > M is in the resolvent set of Λ. Similarly, replacing Λ with −Λ, we see that every η < m lies in the resolvent set. This proves (i). 2. From now on, to fix the ideas, we assume |m| ≤ M . The opposite case, where M < −m, can be handled by entirely similar arguments, replacing Λ by −Λ. For every u, v ∈ H we have 4(Λu, v) = (Λ(u + v), u + v) − (Λ(u− v), u − v) ≤ M u + v 2 + u − v 2 = 2M ( u 2 + v 2 ). . If Λu = 0, setting v = ( u / Λu )Λu we obtain 2 u Λu = 2(Λu, v) ≤ M ( u 2 + v 2 ) = 2M u 2 . Therefore (6.7)

Λu ≤ M u

for all u ∈ H .

Indeed, (6.7) trivially holds also if Λu = 0. Since Λ ≥ supu=1 (Λu, u) = M , from (6.7) it follows that Λ = M , proving (iii). 3. Next, we claim that M ∈ σ(Λ). Choose a sequence (un )n≥1 with (Λun , un ) → M ,

un = 1

for all n.

6.3. Selfadjoint operators

109

Then Λun − M un 2 = Λun 2 − 2M (Λun , un ) + M 2 un 2 ≤ 2M 2 − 2M (Λun , un ) → 0 . Therefore the operator Λ − M I cannot have a bounded inverse.

According to a classical theorem in linear algebra, every symmetric n×n matrix A can be reduced to diagonal form by an orthogonal transformation. This can be achieved by choosing an orthonormal basis of Rn consisting of eigenvectors of A. The following theorem shows that the result remains valid for compact symmetric operators. Theorem 6.6 (Hilbert-Schmidt; eigenvectors of a compact symmetric operator). Let H be a separable real Hilbert space, and let K : H → H be a compact symmetric linear operator. Then there exists a countable orthonormal basis of H consisting of eigenvectors of K. Proof. 1. If H = Rn , this is a classical result in linear algebra. We thus assume that H is infinite-dimensional. Let η0 = 0 and let {η1 , η2 , . . .} be the set of all nonzero eigenvalues of K. Consider the eigenspaces . . . H0 = Ker(K), H1 = Ker(K − η1 I), H2 = Ker(K − η2 I), . . . . Observe that 0 ≤ dim(H0 ) ≤ ∞, while 0 < dim(Hk ) < ∞ for every k ≥ 1. 2. We claim that, for m = n, the subspaces Hm and Hn are orthogonal. Indeed, assume u ∈ Hm , v ∈ Hn . Then ηm (u, v) = (Ku, v) = (u, Kv) = ηn (u, v). Since ηm = ηn , this implies (u, v) = 0. 3. Next, we show that the subspaces Hk generate the entire space H. More precisely, consider the set of all linear combinations N . H = αk uk ; N ≥ 1, uk ∈ Hk , αk ∈ R . k=1

We claim that . ⊥ ⊆ Ker(K) = H H0 . ⊆ H. Moreover if u ∈ H ⊥ and v ∈ H, then Kv ∈ H and Indeed, K(H) hence (Ku, v) = (u, Kv) = 0. ⊥ ⊥. This shows that K(H ) ⊆ H (6.8)

110

6. Compact Operators on a Hilbert Space

be the restriction of K to the subspace H ⊥ . Clearly K is a Let K compact, symmetric operator. By Lemma 6.5 we have . = u)| = K sup |(Ku, M. ⊥ , u=1 u∈H

If M = 0, then by the lemma either λ = M or λ = −M is in the spectrum In this case, since K is compact, λ is in the point spectrum, and there of K. ⊥ such that exists an eigenvector w ∈ H = Kw = λw . Kw But this is impossible, because all eigenvectors of K corresponding to a nonzero eigenvalue are already contained in the union of the subspaces Hk , = 0, proving (6.8). k ≥ 1. We thus conclude that K In turn, (6.8) yields ⊥ ⊆ H ⊥ ∩ H0 = {0}, H 0 is dense in H. proving that H 4. For each k ≥ 1, the finite-dimensional subspace Hk admits an orthonormal basis Bk = {ek,1 , ek,2 , . . . , ek,N (k) }. Moreover, since H is separable, the closed subspace H0 = Ker(K) admits a countable orthonormal basis . B0 = {e0,1 , e0,2 , . . .}. Hence the union B = k≥0 Bk is an orthonormal basis of H. Remark 6.7. Let {w1 , w2 , . . .} be an orthonormal basis of a real Hilbert space H, consisting of eigenvectors of a linear, compact, selfadjoint operator K. Let λ1 , λ2 , . . . be the corresponding eigenvalues. For a given f ∈ H, consider the equation u − Ku = f .

(6.9)

If 1 ∈ / σ(K), then (6.9) admits a unique solution. Writing u =

∞

c k wk ,

f =

k=1

∞

bk w k ,

k=1

this solution can be computed as (6.10)

u =

∞ k=1

∞ bk (f, wk ) wk = wk . 1 − λk 1 − λk k=1

Remark 6.8. Given a countable set S = {u1 , u2 , . . .} in a Banach space X over the reals, a basic problem is to decide whether span(S) is dense on X. Positive answers can be provided in two important cases.

6.4. Problems

111

(I) X = C(E) is the space of all continuous real-valued functions on a compact metric space E. In this case, if span(S) is an algebra that contains the constant functions and separates points, then the Stone-Weierstrass theorem yields span(S) = X. (II) X is a separable Hilbert space, and there exists a compact selfadjoint operator Λ : X → X such that (i) span(S) contains all eigenvectors of Λ, and (ii) span(S) contains the kernel of Λ. In this case, the Hilbert-Schmidt theorem yields span(S) = X.

6.4. Problems 1. (i) Find two bounded linear operators Λ1 , Λ2 from L2 ([0, ∞[) into itself such that Λ1 ◦ Λ2 = I (the identity operator), but Λ2 ◦ Λ1 = I. (ii) Let H be a real Hilbert space, and let Λ, K : H → H be bounded linear operators, with K compact. Show that in this case Λ(I − K) = I

if and only if

(I − K)Λ = I .

2. On the Hilbert space L2 ([0, ∞[), consider the linear operator defined by (Λf )(x) = 2f (x + 1),

x ≥ 0.

(i) Is Λ a bounded operator? Is it compact? (ii) Explicitly determine the adjoint operator Λ∗ . (iii) Describe Ker(Λ) and Ker(Λ∗ ). 3. Let H be a real Hilbert space, and let Λ : H → H be a bounded linear operator, with norm Λ = M . Give a direct proof that σ(Λ) ⊆ [−M, M ]. 4. On the Hilbert space H = L2 ([0, 1]), consider the linear operator 1 (6.11) (Λf )(x) = K(x, y) f (y) dy , 0

where K : [0, 1] × [0, 1] → R is a continuous map, satisfying (6.12)

K(x, y) = K(y, x) ,

for all x, y ∈ [0, 1].

(i) Prove that Λ is a compact selfadjoint linear operator. (ii) As a special case, note that the assumption (6.12) is satisfied by the integral kernel

(1 − y)x if 0 ≤ x ≤ y , . K(x, y) = (1 − x)y if y ≤ x ≤ 1 .

112

6. Compact Operators on a Hilbert Space

Show that, when f is continuous, the function u(x) = (Λf )(x) provides a solution to the boundary value problem u (x) + f (x) = 0 ,

u(0) = u(1) = 0 .

. 5. On the Hilbert space H = L2 ([0, π]), consider the multiplication operator . (Λf )(x) = f (x) · sin x. (i) Check whether Λ is selfadjoint. (ii) Compute the operator norm Λ and check whether Λ ∈ σp (Λ). (iii) Is Λ compact? . t 6. On the space L2 ([0, 1]), consider the integral operator (Λu)(t) = 0 u(s) ds. (i) Prove that for every u ∈ L2 the function Λu is Hölder continuous, namely Λu ∈ C 0,1/2 ([0, 1]). (ii) Prove that the operator Λ is compact. (iii) Compute the adjoint operator Λ∗ . (iv) Given a function g ∈ L2 , does the equation u − Ku = g have a unique solution? Assuming that g is continuously differentiable, write the ODE satisfied by this solution. 7. As in Remark 6.7, let {w1 , w2 , . . .} be an orthonormal basis of the Hilbert space H, consisting of eigenvectors of the linear, compact, selfadjoint operator K. Assume that 1 ∈ σ(K). Give a necessary and sufficient condition for the equation (6.9) to have solutions. Write a formula, similar to (6.10), describing all such solutions. 8. Let A, B be continuous, selfadjoint linear operators on a Hilbert space H. Prove that the composition AB is selfadjoint if and only if AB = BA. 9. Let Λ : H → H be a bounded linear operator. Prove that the following are equivalent. (i) Λ is compact. (ii) limn→∞ Λvn = 0 for every orthonormal sequence (vn )n≥1 of vectors in H. On the other hand, let H be a separable Hilbert space with orthonormal basis {e1 , e2 , . . .}. Show that there exists a bounded linear operator Λ ∈ B(H) which satisfies limn→∞ Λen = 0 but is not compact. 10. Let K be a compact linear operator on the Hilbert space H. Prove that, for every closed subspace V ⊂ H, the image (I − K)(V ) = {x − Kx ; x ∈ V } is a closed subspace of H.

6.4. Problems

113

11. In the setting of Remark 6.7, consider the linear ODE on the Hilbert space H: d u(t) = Ku(t) , u(0) = f , (6.13) dt for some f ∈ H. Write the solution in the form ∞ (6.14) u(t) = ck (t)wk , k=1

computing the time-dependent coefficients ck (·). 12. Extend the result of Problem 11 to equations of the more general form d u(t) = Ku(t) + g(t) , u(0) = f , dt where f ∈ H and t → g(t) ∈ H is a continuous function. 13. In the setting of Remark 6.7, consider the second-order linear ODE on the Hilbert space H: d2 d u(0) = g , u(t) = Ku(t) , u(0) = f , dt2 dt for some f, g ∈ H. Write the solution in the form (6.14), computing the coefficients ck (·).

(6.15)

14. Let Λ : H → H be a bounded linear operator. Using Problem 22 in Chapter 4, . show that the resolvent set ρ(Λ) = {η ∈ R ; ηI − Λ is a bijection} is open.

Chapter 7

Semigroups of Linear Operators

7.1. Ordinary differential equations in a Banach space The classical existence-uniqueness theory for ODEs with Lipschitz continuous right-hand side can be extended to Banach spaces without any substantial change. Let X be a Banach space, and let F : X → X be a Lipschitz continuous map, so that (7.1)

F (x) − F (y) ≤ L x − y

for some Lipschitz constant L and every x, y ∈ X. Given an initial point x ¯ ∈ X, consider the Cauchy problem (7.2)

x(t) ˙ = F (x(t)),

x(0) = x ¯.

Here and throughout the sequel, the upper dot denotes a derivative with respect to time. As in the finite-dimensional case, the global existence and uniqueness of a solution can be proved using the contraction mapping theorem. Theorem 7.1 (Existence-uniqueness of solutions to a Cauchy problem for an ODE with Lipschitz continuous right-hand side). Let X be a Banach space and assume that the vector field F : X → X satisfies the Lipschitz condition (7.1). Then for every x ¯ ∈ X the Cauchy problem (7.2) has a unique solution t → x(t), defined for all t ∈ R.

115

116

7. Semigroups of Linear Operators

Proof. Fix any T > 0 and consider the Banach space C([0, T ]; X) of all continuous mappings w : [0, T ] → X, with the equivalent norm . (7.3) w † = maxt∈[0,T ] e−2Lt w(t) . Observe that a function x : [0, T ] → X provides a solution to the Cauchy problem (7.2) if and only if x(·) is a fixed point of the Picard operator t . ¯+ (7.4) Φ(w)(t) = x F (w(s)) ds, t ∈ [0, T ]. 0

We claim that Φ is a strict contraction, with respect to the equivalent . norm (7.3). Indeed, given any u, v ∈ C([0, T ]; X), set δ = u − v † . By the definition (7.3), this implies u(s) − v(s) ≤ δe2Ls

for all s ∈ [0, T ] .

For every t ∈ [0, T ], the assumption of Lipschitz continuity in (7.1) implies t −2Lt −2Lt e Φ(u)(t) − Φ(v)(t) = e (F (u(s)) − F (v(s))) ds 0

≤ e−2Lt

t t −2Lt F (u(s)) − F (v(s)) ds ≤ e L u(s) − v(s) ds 0

≤ e−2Lt

0

t

Lδe2Ls ds < 0

δ . 2

Therefore (7.5)

Φ(u) − Φ(v) † ≤

1 u − v † . 2

We can now apply the contraction mapping theorem, obtaining the existence of a unique fixed point for Φ, i.e., a continuous mapping x(·) such that t x(t) = x ¯+ F (x(s)) ds for all t ∈ [0, T ]. 0

This function x(·) provides the unique solution to the Cauchy problem. By reversing time, we can construct a unique solution on any time interval of the form [−T, 0].

There are two well-known methods for constructing approximate solutions to the Cauchy problem (7.2), shown in Figure 7.1.1. In the following we fix a time step h > 0 and define the times tj = j · h, j = 0, 1, 2, . . ..

7.1. Ordinary differential equations in a Banach space

x2

117

y(t) x(t)

z(t)

–x

x1

Figure 7.1.1. Euler approximations to the system x˙ 1 = x2 , x˙ 2 = −x1 . Here x(·) is an exact solution, y(·) is a piecewise affine approximation obtained by the forward Euler scheme, while z(·) is obtained by the backward Euler scheme.

• Forward Euler approximations. The values of the approximate solution at the times tj are defined by induction, according to the formula (7.6)

x(tj+1 ) = x(tj ) + h F (x(tj )). • Backward Euler approximations. The values of the approximate solution at the times tj are determined as

(7.7)

x(tj+1 ) = x(tj ) + h F (x(tj+1 )).

In both cases, after the values x(tj ) have been computed on the discrete set of times tj , one can extend the approximate solution to all real values of t ≥ 0, letting t → x(t) be an affine function on each interval [tj−1 , tj ]. Forward Euler approximations are easy to construct: during the whole time interval [tj , tj+1 ] we let the derivative x(t) ˙ be constant, equal to the value of F at the initial point: x(t) ˙ = F (x(tj )),

t ∈ [tj , tj+1 ].

On the other hand, in a backward Euler approximation, for t ∈ [tj , tj+1 ] we let the derivative x(t) ˙ be constant, equal to the value of F at the terminal point: x(t) ˙ = F (x(tj+1 )), t ∈ [tj , tj+1 ]. In this case, given x(tj ), to find x(tj+1 ) one needs to solve the implicit equation (7.7). Backward Euler approximations thus require more computational work. On the other hand, they often have much better stability and convergence properties.

118

7. Semigroups of Linear Operators

7.1.1. Linear homogeneous ODEs. Let A : Rn → Rn be a linear operator. Then the solution to the linear Cauchy problem (7.8)

x˙ = Ax ,

x(0) = x ¯

is the map t → etA x ¯, where (7.9)

e

tA

∞ . tk Ak = . k! k=0

This same formula remains valid for any bounded linear operator A on a Banach space X. We observe that the series in (7.9) is absolutely convergent for every t ∈ R. Moreover, the exponential map has the following properties: (i) e0A = I, the identity map. (ii) esA etA = e(s+t)A (semigroup property). (iii) For every x ¯ ∈ X, the map t → etA x ¯ is continuous. According to (i)–(ii), the family {etA ; t ≥ 0} is a group of linear operators. More generally, the theory of linear semigroups studies the correspondence A ←→ {etA ; t ≥ 0} between a linear operator and its exponential. When A is a bounded linear operator, its exponential function is computed by the convergent series (7.9). Conversely, given the family of operators etA , one can recover A as the limit etA − I . t→0+ t

A = lim

There are important cases where the operators etA are bounded for every t ≥ 0, while A is an unbounded operator. These are indeed the most interesting applications of semigroup theory, useful in the analysis of PDEs of parabolic or hyperbolic type. Example 7.2. If A : Rn → Rn is a diagonal matrix, can be readily computed. Indeed ⎛ ⎞ ⎛ tλ λ1 · · · 0 e 1 ⎜ ⎟ ⎜ A = ⎝ ... . . . ... ⎠ , etA = ⎝ ... 0 · · · λn 0

then its exponential ··· .. . ···

⎞ 0 .. ⎟ . . ⎠ etλn

Observe that the corresponding operator norms are A = maxk |λk | ,

etA = maxk |etλk |.

7.1. Ordinary differential equations in a Banach space

119

Example 7.3. Let 1 be the Banach space of all sequences of complex numbers x = (x1 , x2 , x3 , . . .) with norm . x 1 = |xk | . k

Given any sequence of complex numbers (λk )k≥1 , consider the linear operator . (7.10) Ax = (λ1 x1 , λ2 x2 , λ3 x3 , . . .). The corresponding exponential operators are etA x = (etλ1 x1 , etλ2 x2 , etλ3 x3 , . . .) . It is important to observe that the quantity A = sup |λk | k

may well be infinite, but at the same time the norm etA = sup |etλk | k

can be bounded, for every t ≥ 0. Indeed, as shown in Figure 7.1.2, assume that the real part of all eigenvalues λk is uniformly bounded above, say λk = αk + iβk ,

αk ≤ ω

for all k ≥ 1 ,

for some constant ω ∈ R. Then (7.11)

|etλk | = |etαk | ≤ etω

for all t ≥ 0.

Therefore, for each t ≥ 0 the operator etA is bounded. Namely etA ≤ etω . Two further cases are worth exploring. CASE 1: Assume that all real parts of the eigenvalues λk are bounded above and below, say −ω ≤ αk ≤ ω for some constant ω > 0 and all k ≥ 1. Then, for all t ∈ R we have the estimate (7.12)

|etλk | = |etαk | ≤ e|t|ω .

Hence the operators etA are all bounded, also for t < 0. This shows that the differential equation (7.8) can be solved both forward and backward in time. The family of operators {etA ; t ∈ R} forms a group of bounded linear operators. CASE 2: Assume that all eigenvalues λk = αk + iβk are contained in a sector of the complex plane. More precisely, assume that there exists ω ∈ R

120

7. Semigroups of Linear Operators

and an angle 0 < θ¯ < π/2 such that, for every k, we can write αk = ω − rk cos θk , ¯ θ]. ¯ for some rk ≥ 0 and θk ∈ [−θ,

(7.13)

βk = −rk sin θk

In this case, since αk ≤ ω, it is clear that (7.11) holds and the operator etA is bounded, for every t ≥ 0. However, much more is true: for every t > 0, the composed operator AetA is bounded. Indeed AetA = sup |λk etλk | ≤ sup (ω + rk )e(ω−rk cos θk )t k

≤

k

sup

¯

(ω + r)e(ω−r cos θ)t ≤ sup (ω + r)e(ω−r cos θ)t < +∞,

r≥0, |θ|≤θ¯

r≥0

because cos θ¯ > 0. In this case, A is called a sectorial operator. Even if the initial datum x ¯ does not lie in the domain of the operator A, for every t > 0 we have x(t) = etA x ¯ ∈ Dom(A). iR

iR

iR

λk 0

ω

R

−ω

ω

0 λk

−

θ

R λ

θ

ω

R

r

Figure 7.1.2. Left: when all the eigenvalues λk of the operator A in (7.10) have real part Re(λk ) ≤ ω, then for every t ≥ 0 the exponential operator is bounded: etA ≤ etω . Center: if all eigenvalues λk satisfy Re (λk ) ∈ [−ω, ω], then for every t ∈ R the exponential operator is bounded: etA ≤ e|t|ω . Right: if all eigenvalues λk lie in a sector of the complex plane with angle θ¯ < π/2, then for each t > 0 the operator AetA is bounded as well.

7.2. Semigroups of linear operators Consider a linear evolution equation in a Banach space X, say d (7.14) u(t) = Au(t) , u(0) = u ¯ ∈ X. dt For t ≥ 0, one would like to express the solution as u(t) = etA u ¯, for some tA family of linear operators {e ; t ≥ 0}.

7.2. Semigroups of linear operators

121

Example 7.4. Consider the partial differential equation ut − ux = 0. This ∂ can be recast in the form (7.14), where A = ∂x is a differential operator. p As function space, we can take X = L (R), for some p ∈ [1, ∞[ . Clearly, the differential operator A is unbounded. Its domain is the set of absolutely continuous functions u ∈ Lp (R) with derivative ux ∈ Lp (R). On the other hand, for every initial data u ¯ ∈ Lp , the solution of the initial value problem ut = ux ,

u(0, x) = u ¯(x)

can be explicitly computed: u(t, x) = u ¯(x + t),

t ∈ R.

∂ This indicates that, although the differential operator A = ∂x is unbounded, tA the corresponding exponential operators e are uniformly bounded:

(etA u ¯)(x) = u ¯(x + t) and hence etA = 1 for every t ∈ R. In the theory of linear semigroups one considers two basic problems: (1) Given a semigroup of linear operators {St ; t ≥ 0}, find its generator, i.e., the operator A such that St = etA . (2) Given a linear operator A, decide whether it generates a semigroup {etA ; t ≥ 0} and establish the properties of this semigroup. For applications, (2) is clearly more important. However, for the development of the theory it is convenient to begin with (1). 7.2.1. Definition and basic properties of semigroups. Definition 7.5. Let X be a Banach space. A strongly continuous semigroup of linear operators on X is a family of linear maps {St ; t ≥ 0} with the following properties. (i) Each St : X → X is a bounded linear operator. (ii) For every s, t ≥ 0, the composition satisfies St Ss = St+s (semigroup property). Moreover S0 = I (the identity operator). (iii) For every u ∈ X, the map t → St u is continuous from [0, ∞[ into X. We say that {St ; t ≥ 0} is a semigroup of type ω if, in addition, the linear operators St satisfy the bounds (7.15)

St ≤ etω

for all t ≥ 0.

122

7. Semigroups of Linear Operators

A semigroup of type ω = 0 is called a contractive semigroup. Indeed, in this case St ≤ 1 for every t ≥ 0, hence St u − St v ≤ u − v

for all u, v ∈ X, t ≥ 0.

The linear operator St u − u . Au = lim t→0+ t is called the generator of the semigroup {St ; t ≥ 0}. Its domain Dom(A) is the set of all u ∈ X for which the limit in (7.16) exists.

(7.16)

For a given u ¯ ∈ X, we regard the map t → St u ¯ as the solution to the linear ODE (7.14). Notice that, in a sense, here we are approaching the problem backwards: given the solution u(t) = St u ¯, we seek to reconstruct the evolution equation, finding the operator A. Some elementary properties of the semigroup S and its generator A are now derived. Theorem 7.6 (Properties of semigroups). Let {St ; t ≥ 0} be a strongly continuous semigroup and let A be its generator. Assume u ¯ ∈ Dom(A). Then (i) For every t ≥ 0 one has St u ¯ ∈ Dom(A) and ASt u ¯ = St A¯ u. . (ii) The map t → u(t) = St u ¯ is continuously differentiable and provides a solution to the Cauchy problem (7.14). Proof. 1. Let u ¯ ∈ Dom(A), so that the limit in (7.16) exists. Then Ss St u ¯ − St u ¯ St Ss u ¯ − St u ¯ Ss u ¯−u ¯ u. = lim = St lim = St A¯ s→0+ s→0+ s→0+ s s s Therefore St u ¯ ∈ Dom(A) and ASt u ¯ = St A¯ u, proving (i). lim

2. Next, assume u ¯ ∈ Dom(A), t > 0. Then, by the semigroup property,

+ + St u St−h u ¯ − St−h u ¯ ¯−u ¯ lim u = lim St−h u − St A¯ − St A¯ h→0+ h→0+ h h

=

lim

h→0+

St−h

+ Sh u ¯−u ¯ u − St A¯ u) = 0. − A¯ u + (St−h A¯ h

Indeed, Sh u¯u−¯u → A¯ u, while St−h ≤ etω . Moreover, since A¯ u ∈ X is well defined, the map s → Ss A¯ u is continuous. The above computation shows that the map t → u(t) = St u ¯ has a left derivative: lim

h→0+

St u ¯ − St−h u ¯ u. = St A¯ h

7.2. Semigroups of linear operators

123

The right derivative is computed as lim

h→0+

St+h u ¯ − St u ¯ Sh u ¯−u ¯ u. = St lim = St A¯ h→0+ h h

Therefore, for every t > 0, the map t → St u ¯ is differentiable, with derivative d S u ¯ = S A¯ u = AS u ¯ . Since A¯ u ∈ X, by the definition of semigroup the t t dt t map t → St (A¯ u) must be continuous. Hence the map t → St u ¯ is continuously differentiable. The following theorem collects some properties of the generator A. We recall that a linear operator A : X → X is closed if its graph Graph(A) = (x, y) ∈ X × X ; x ∈ Dom(A), y = Ax is a closed subset of the product space X × X. Theorem 7.7 (Properties of generators). Let {St ; t ≥ 0} be a strongly continuous semigroup on the Banach space X, and let A be its generator. Then (i) The domain of A is dense in X. (ii) The operator A is closed. ε . Proof. 1. Fix any u ∈ X and consider the approximation Uε = ε−1 0 Ss u ds. Since the map t → St u is continuous, we have Uε → u as ε → 0+. To prove (i), we now show that Uε ∈ Dom(A) for every ε > 0. Since Dom(A) is a vector subspace, it suffices to show that ε . uε = εUε = Ss u ds ∈ Dom(A). 0

For 0 < h < ε we have

ε ε + Sh uε − uε 1 Ss u ds − Ss u ds = Sh h h 0 0 1 ε = (Ss+h u − Ss u)ds h 0 1 ε+h 1 h = Ss u ds − Ss u ds → Sε u − u h ε h 0

as h → 0 + .

This shows that uε ∈ Dom(A) for every ε > 0, proving (i). 2. Next, we prove that the graph of A is closed. Let (uk , vk ) be a sequence of points on the graph of A. More precisely, let uk ∈ DomA, vk = Auk , and assume the convergence uk → u, vk → v, for some u, v ∈ X. We need to

124

7. Semigroups of Linear Operators

show that u ∈ Dom(A) and Au = v. For each k ≥ 1, since uk ∈ Dom(A), the previous theorem implies h h d Sh uk − uk = St Auk dt. St uk dt = dt 0 0 Letting k → ∞, we obtain

h

Sh u − u =

St v dt. 0

Therefore,

Sh u − u 1 h lim St v dt = v. = lim h→0+ h→0+ h 0 h By definition, this means that u ∈ Dom(A) and Au = v.

Figure 7.2.1. Left: the intervals of integration, in the proof of Theorem 7.7 (i). Right: according to (7.20) or (7.28), the backward Euler approximation with step h > 0 is the weighted average of points on the trajectory t → St u, with exponentially decreasing weight w(t) = e−t/h /h.

7.3. Resolvents The crucial link between a semigroup {St ; t ≥ 0} and its generator A is provided by the so-called “resolvent”. This can be best understood in terms of backward Euler approximations. Assume that we want to solve (7.14) approximately, by backward Euler approximations. We thus fix a time step h > 0 and iteratively solve u(t + h) = u(t) + h Au(t + h). At each step, given a value u(t) ∈ X, we thus need to compute u(t + h) = (I − hA)−1 u(t). The backward Euler operator . (7.17) Eh− = (I − hA)−1 ,

7.3. Resolvents

125

with h > 0 small, plays a crucial role in semigroup theory. Given A, there are two main ways to recover the solution of the Cauchy problem (7.14), in terms of this operator. (I): As a limit of backward Euler approximations. For a fixed τ > 0, we consider the time step h = τ /n. After n steps, the backward Euler approximation scheme yields τ −n u(τ ) ≈ Eτ−/n ◦ · · · ◦ Eτ−/n = I − A u ¯. n Keeping τ fixed and letting n → ∞, we expect to recover the exact solution of (7.14) in the limit: τ −n . (7.18) u(τ ) = Sτ u ¯ = lim I − A u ¯. n→∞ n (II): As a limit of solutions to approximate evolution equations. Fix again a time step h > 0 and set λ = 1/h. We then define the operator Aλ : X → X as . Aλ u = A Eh− u = A(I − hA)−1 u. In other words, Aλ u is the value of A computed not at u but at the nearby point Eh− u, i.e., at the first step in a backward Euler approximation, with h = 1/λ. It turns out that, for h > 0 small enough, Aλ = A1/h is a well defined, bounded linear operator. We can thus consider the exponential tA λ operators e = k≥0 (tAλ )k /k! and define . (7.19) u(t) = St u ¯ = lim etAλ u ¯. λ→∞

Example 7.8. Consider the scalar ODE x˙ = ax,

x(0) = x ¯.

Its solution is x(t) = eta x ¯. In this case we trivially have a aλ = a1/h = ¯ = lim eta1/h x ¯ = lim eta/(1−ha) x ¯. , eat x h→0 h→0+ 1 − ha This corresponds to the limit (7.19). On the other hand, the limit (7.18) is related to the identity τ −n eτ a = lim 1 − a . n→∞ n Notice that, for 0 < h < a−1 , one also has the identities ∞ −t/h ∞ −t/h e e −1 (7.20) ¯ = ¯ dt . dt = 1, (1 − ha) x eta x h h 0 0

126

7. Semigroups of Linear Operators

The second identity shows that the backward Euler operator Eh− x ¯ = −1 (1 − ha) x ¯ can be obtained by taking an average of points along the trajectory t → eta x ¯, with the exponentially decreasing weight w(t) = h−1 e−t/h . Motivated by the previous analysis, we introduce some definitions. Definition 7.9. Let A be a linear operator on a Banach space X. The resolvent set of A is the set ρ(A) of all real numbers λ such that the operator λI − A : Dom(A) → X is one-to-one and onto. If λ ∈ ρ(A), the resolvent operator Rλ : X → X is defined by . Rλ u = (λI − A)−1 u. We remark that, if A is a closed operator, then by the closed graph theorem the operator Rλ : X → Dom(A) ⊆ X is a bounded linear operator. Moreover ARλ u = Rλ Au

if u ∈ Dom(A).

There is a close connection between resolvents and backward Euler operators. Namely − λRλ = E1/λ .

Theorem 7.10 (Resolvent identities). Let A be a closed linear operator. If λ, μ ∈ ρ(A), then (7.21)

Rλ − Rμ = (μ − λ)Rλ Rμ ,

(7.22)

Rλ Rμ = Rμ Rλ .

Proof. For any u ∈ X one has . v = Rλ u − Rμ u = (λI − A)−1 u − (μI − A)−1 u ∈ Dom(A), (λI − A)v = u − (λI − μI + μI − A)(μI − A)−1 u = (μ − λ)(μI − A)−1 u . Applying the operator (λI − A)−1 to both sides of the above identity, one obtains (7.21). Next, using (7.21), for any λ = μ in the resolvent set ρ(A) we obtain Rλ Rμ =

Rλ − Rμ R μ − Rλ = = R μ Rλ . λ−μ μ−λ

7.3. Resolvents

127

Theorem 7.11 (Integral formula for the resolvent). Let {St ; t ≥ 0} be a semigroup of type ω, and let A be its generator. Then for every λ > ω one has λ ∈ ρ(A) and ∞ (7.23) Rλ u = e−tλ St u dt . 0

Moreover, Rλ ≤

(7.24)

1 . λ−ω

Proof. 1. By assumption, the semigroup satisfies the bounds St ≤ etω . Therefore the integral in (7.23) is absolutely convergent: (7.25) ∞ ∞ ∞ 1 −tλ tλ e St u dt ≤ e St u dt ≤ e(ω−λ)t u dt ≤ u . λ−ω 0

0

0

Define . λ u = R

∞

e−tλ St u dt .

0

λ is a bounded linear operator, with norm The above estimate shows that R 1 λ ≤ R . λ−ω 2. We now show that (7.26)

λ u = u (λI − A)R

for all u ∈ X .

Indeed, for any h > 0 we have λ u − R λ u Sh R 1 ∞ −λt e (St+h u − St u) dt = h h 0 1 = h

∞

e

−λ(t−h)

−e

0

eλh − 1

∞

1 St u dt − h

eλh St u dt − h

h

h

e−λ(t−h) St u dt

0

e−λt St u dt . h 0 0 Taking the limit as h → 0+ we obtain λ u − R λ u Sh R λ u − u . lim = λR h→0+ h By the definition of generator, this means that λ u ∈ Dom(A), λ u = λR λ u − u , R AR =

proving (7.26).

e

−λt

−λt

128

7. Semigroups of Linear Operators

3. The identity (7.26) already shows that the map v → (λI − A)v from Dom(A) into X is surjective. We now prove that this map is one-to-one. If u ∈ Dom(A), then ∞ ∞ λ u = A AR e−λt St u dt = e−λt ASt u dt 0

=

∞

0

λ Au . e−λt St Au dt = R

0

This proves the commutativity relation (7.27)

λ (λI − A)u = (λI − A)R λ u R

for all u ∈ Dom(A).

If now (λI − A)u = (λI − A)v, using (7.27) and (7.26) we obtain λ (λI − A)u = R λ (λI − A)v = v, u = R proving that the map (λI − A) is one-to-one. λ = (λI − A)−1 = Rλ . We thus conclude that λ ∈ ρ(A) and R

Remark 7.12. According to the integral representation formula (7.23), the resolvent operators Rλ provide the Laplace transform of the semigroup S. Taking 0 < h = λ−1 , this same formula shows that the backward Euler approximations can be obtained as ∞ −t/h e . − −1 (7.28) Eh u = (I − hA) u = St u dt . h 0 Here the integral is convergent, provided that h is sufficiently small: if S is a semigroup of type ω, we need h < ω −1 . Generalizing the identity (7.20), we see that the first step in an Euler backward approximation coincides with an averaged value of the entire trajectory t → St u ¯, with the exponentially decreasing weight w(t) = h−1 e−t/h .

7.4. Generation of a semigroup In this section we tackle the most important question. Namely, given an operator A from Dom(A) ⊂ X into X, under which conditions does there exist a semigroup {St ; t ≥ 0} generated by A? Theorem 7.13 (Existence of the semigroup generated by a linear operator). Let A be a linear operator on a Banach space X. Then the following are equivalent. (i) A is the generator of a semigroup of linear operators {St ; t ≥ 0}, of type ω.

7.4. Generation of a semigroup

129

(ii) A is a closed, densely defined operator. Moreover, every real number λ > ω is in the resolvent set of A, and 1 −1 (7.29) for all λ > ω . (λI − A) ≤ λ−ω Proof. The fact that (i) implies (ii) has already been proved. We shall thus assume that (ii) holds and prove that A generates a semigroup of type ω. The proof will be achieved in several steps. 1. By assumption, for each λ > ω the resolvent operator Rλ = (λI − A)−1 is well defined. We can thus consider the bounded linear operator . (7.30) Aλ = −λI + λ2 Rλ = λARλ . Notice that, setting h = 1/λ, we can write Aλ u = A(I − hA)−1 u = A(Eh− u) . In other words, Aλ u is the value of A computed not at the point u but at the point Eh− u obtained by a backward Euler step of size h = 1/λ, starting at u. It is thus expected that Aλ should be a good approximation for the operator A, at least when λ is large (so that h = 1/λ is small). 2. Since each Aλ is bounded, we can construct its exponential as ∞ ∞ (λ2 t)k Rλk 2 . (tAλ )k etAλ = = e−λt eλ tRλ = e−λt . k! k! k=0

k=0

We observe that, if A is unbounded, then Aλ → ∞ as λ → ∞. However, for t ≥ 0 the norms of the exponential operators remain uniformly bounded as λ → ∞. Indeed, the assumption (7.29) together with the definition (7.30) yields etAλ ≤ e−λt

∞ (λ2 t)k Rλ k k=0

k!

≤ e−λt eλ

2 t/(λ−ω)

= eλωt/(λ−ω) .

In particular, as soon as λ ≥ 2ω, we have (7.31)

etAλ ≤ e2ωt

for all t ≥ 0 .

3. In this step we establish the limit (7.32)

lim Aλ v = Av

λ→∞

for all v ∈ Dom(A).

To prove (7.32), we start with the identity λRλ u − u = ARλ u = Rλ Au ,

130

7. Semigroups of Linear Operators

which is valid for all u ∈ Dom(A). This yields λRλ u − u = Rλ Au ≤ Rλ Au ≤

1 Au → 0 λ−ω

as λ → ∞.

Hence lim λRλ u = u

λ→∞

for all u ∈ Dom(A).

Using the fact that Dom(A) is dense in X, we now prove that the above limit holds more generally for all u ∈ X. Indeed, for every u ∈ X and ε > 0, there exists v ∈ Dom(A) such that u − v < ε. Using the triangle inequality, we obtain lim sup λRλ u − u λ→∞

≤ lim sup λRλ u − λRλ v + lim sup λRλ v − v + v − u λ→∞

λ→∞

≤ lim sup λRλ u − v + 0 + v − u λ→∞

≤ lim sup λ→∞

λ ε + ε = 2ε . λ−ω

Since ε > 0 was arbitrary, this proves that (7.33)

lim λRλ u = u

λ→∞

for all u ∈ X.

For the backward Euler approximations with time step h = 1/λ, (7.33) yields lim Eh− u = u

h→0+

for all u ∈ X .

If v ∈ Dom(A), then in (7.33) we can take u = Av and conclude that lim Aλ v = lim λARλ v = lim λRλ Av = lim λRλ u = u = Av .

λ→∞

λ→∞

λ→∞

λ→∞

This proves (7.32). 4. Fix any t ≥ 0. We claim that, as λ → ∞, the family of uniformly bounded operators etAλ converges to some linear operator, which we call St .

7.4. Generation of a semigroup

131

To prove this, for λ, μ > 2ω we shall estimate the difference etAλ − etAμ . The commutativity relation Rλ Rμ = Rμ Rλ implies Aλ Aμ = Aμ Aλ . Therefore ∞ (tAλ )k Aμ etAλ = Aμ = etAλ Aμ . k! k=0

For every u ∈ X we thus have t d (t−s)Aμ sAλ tAλ tAμ e u−e u = e u ds e 0 ds =

t

e

(t−s)Aμ

(Aλ − Aμ )e

0

sAλ

t

u ds =

e(t−s)Aμ esAλ (Aλ u − Aμ u) ds .

0

By (7.31), this implies t tAλ tAμ e u − e u ≤ e2(t−s)ω e2sω Aλ u − Aμ u ds = t e2ωt Aλ u − Aμ u . 0

If u ∈ Dom(A), then (7.32) yields Aλ u → Au and Aμ u → Au as λ, μ → ∞. Therefore lim sup etAλ u − etAμ u ≤ t e2ωt lim sup Aλ u − Aμ u = 0. λ,μ→∞

λ,μ→∞

Figure 7.4.1. Proving the convergence of the approximations etAλ

as λ → ∞. The difference etAλ u − etAμ u can be estimated by the length of the curve s → e(t−s)Aμ esAλ u, for s ∈ [0, t].

Using the triangle inequality, we now show that the same limit is valid more generally for every v ∈ X, uniformly as t ranges in bounded intervals.

132

7. Semigroups of Linear Operators

Indeed, given ε > 0, choose u ∈ Dom(A) with u − v ≤ ε. Then lim sup etAλ v − etAμ v λ,μ→∞

≤ lim sup etAλ v − etAλ u + lim sup etAλ u − etAμ u λ→∞

λ,μ→∞

+ lim supμ→∞ etAμ u − etAμ v ≤ 2 lim sup etAλ v − u ≤ 2e2ωt ε . λ→∞

Since ε > 0 was arbitrary, our claim is proved. 5. By the previous step, for every t ≥ 0 and u ∈ X the following limit is well defined: . St u = lim etAλ u . λ→∞

We claim that {St ; t ≥ 0} is a strongly continuous semigroup of type ω. Indeed, the semigroup property follows from St Ss u = lim etAλ esAλ u = lim e(t+s)Aλ u = St+s u . λ→∞

λ→∞

For a fixed u ∈ X, the map t → St u is continuous, being the limit of the continuous maps t → etAλ u, uniformly for t in bounded intervals. Finally, for every t ≥ 0 and u ∈ X with u ≤ 1 we have the estimate St u = lim etAλ u ≤ lim sup etAλ u λ→∞

≤ lim e

λ→∞ tλω/(λ−ω)

λ→∞

u = etω u .

This proves that St ≤ etω , i.e., that the semigroup is of type ω. 6. It remains to prove that the linear operator A is indeed the generator of the semigroup. Toward this goal, call B the generator of the semigroup {St ; t ≥ 0}. By our earlier analysis, we know that B is a linear, closed operator, with domain Dom(B) dense in X. We need to show that B = A. Since Aλ is the generator of the semigroup {etAλ ; t ≥ 0}, for every λ > ω we have t tAλ (7.34) e u−u = esAλ Aλ u ds . 0

Moreover, for u ∈ Dom(A), the triangle inequality yields (7.35)

esAλ Aλ u − Ss Au ≤ esAλ Aλ u − Au + esAλ Au − Ss Au .

7.4. Generation of a semigroup

133

As λ → ∞, the right-hand side of (7.35) goes to zero, uniformly for s in bounded intervals. Taking the limit as λ → ∞ in (7.34) we thus obtain t (7.36) St u − u = Ss Au ds for all t ≥ 0, u ∈ Dom(A) . 0

As a consequence, Dom(B) ⊇ Dom(A) and St u − u 1 t Bu = lim Ss Au ds = Au = lim t→0+ t 0 h→0+ t

for all u ∈ Dom(A) .

To prove that A = B, it remains to show that Dom(B) ⊆ Dom(A). For this purpose, choose any λ > ω. We know that the operators λI − A : Dom(A) → X

and

λI − B : Dom(B) → X

are both one-to-one and surjective. In particular, the restriction of λI −B to Dom(A) coincides with λI − A and is thus surjective. Hence this operator λI − B cannot be extended to any domain strictly larger than Dom(A), preserving the one-to-one property. This shows that Dom(B) = Dom(A), completing the proof. We now investigate uniqueness. Given a semigroup of linear operators {St ; t ≥ 0}, its generator is uniquely defined by the limit in (7.16). Conversely, given a linear operator A satisfying conditions (ii) in Theorem 7.13, we now show that the semigroup generated by A is uniquely determined. Theorem 7.14 (Uniqueness of the semigroup). Let {St }, {St } be two strongly continuous semigroups of linear operators, having the same generator A. Then St = St for every t ≥ 0. Proof. Let u ∈ Dom(A). Then Ss u ∈ Dom(A) and St−s Ss u ∈ Dom(A) for every 0 ≤ s ≤ t. We can thus estimate t t d St u − St u = St−s ASs u − ASt−s Ss u ds = 0. St−s Ss u ds = 0 ds 0 Indeed, d St−s−h (Ss+h u) − St−s Ss u St−s Ss u = lim h→0 ds h = lim

h→0

St−s−h (Ss+h u − Ss u) St−s−h (Ss u) − St−s (Ss u) + lim h→0 h h

= St−s (ASs u) − ASt−s (Ss u) = 0 .

134

7. Semigroups of Linear Operators

This shows that, for every fixed t ≥ 0, the bounded linear operators St and St coincide on the dense subset Dom(A). Hence St = St .

Applications of the theory of semigroups to the solution of partial differential equations of evolutionary type will be illustrated in Chapter 9.

7.5. Problems 1. Assume that the operator A in (7.10) is sectorial, so that (7.13) holds, for some ω ∈ R and some angle θ¯ < π/2. (i) For t > 0, give an a priori estimate on the norm of the operator AetA , ¯ depending only on ω, θ. (ii) More generally, for t > 0 and k ≥ 1, prove that the operator Ak etA is also bounded. Estimate the norm Ak etA . 2. Let A be the generator of a contractive semigroup. Show that, for every λ > 0, the bounded linear operator Aλ in (7.30) also generates a contractive semigroup. 3. Fix λ > 0 and consider the weight function

λ e−λt w(t) = 0

if t ≥ 0, if t < 0.

. By induction on n ≥ 1, check that the n-fold convolution wn = w ∗ w ∗ · · · ∗ w is given by ⎧ λn ⎨ tn−1 e−λt if t ≥ 0, (7.37) wn (t) = (n − 1)! ⎩ 0 if t < 0. Show that w(·) is the density of a probability measure with mean value =

∞

t w(t) dt = 0

1 , λ

2 ∞ ∞ 1 1 1 w(t) dt = t− t2 − 2 w(t) dt = 2 . λ λ λ 0 0

variance =

Therefore, wn (·) is the density of a probability measure with mean n/λ and variance n/λ2 . . Now fix T > 0 and let λn = n/T . For every n ≥ 1, set ⎧ ⎨ (n/T )n n−1 −(n/T )t t e if t ≥ 0, (7.38) Wn (t) = (n − 1)! ⎩ 0 if t < 0.

7.5. Problems

135

Show that Wn (·), plotted in Figure 7.5.1, is the density of a probability measure with mean value n/λn = T and variance n/(n/T )2 = T 2 /n. In particular, for every ε > 0, one has (7.39) , ∞ T −ε T +ε Wn (t) dt + Wn (t) dt = 0 , lim Wn (t) dt = 1 . lim n→∞

0

n→∞

T +ε

T −ε

Figure 7.5.1. A plot of the weight functions Wn defined in (7.38).

Here T = 4, while n = 1, 2, 10, 100. All these probability distributions have average value 4, while their variance (= 16/n) decreases to zero as n → ∞. 4. (Convergence of backward Euler approximations) Let {St ; t ≥ 0} be a contractive semigroup on a Banach space X, generated by the linear operator A. Fix h > 0 and set λ = 1/h. By induction on n ≥ 1, extend formula (7.28) for backward Euler approximations. Namely, prove that ∞ −n wn (t)St u dt for all u ∈ X , (I − hA) u = 0

where wn is the weight function defined in (7.37). Set h = T /n, so that λn = n/T . Using (7.39) prove that, for every u ∈ X, one has the convergence −n ∞ T u = Wn (t)St u dt → ST u as n → ∞. I− A n 0

136

7. Semigroups of Linear Operators

5. (Positively invariant sets) Let A be the generator of a contractive semigroup {St ; t ≥ 0} on a Banach space X. Let Ω ⊂ X be a closed, convex subset. Prove that the following are equivalent. (i) Ω is positively invariant: if u ∈ Ω then St u ∈ Ω for all t ≥ 0. (ii) Ω is invariant with respect to the backward Euler operator: if u ∈ Ω, then for every h > 0 one has (I − hA)−1 u ∈ Ω. 6. Let {St ; t ≥ 0} be a semigroup of type ω, and let A be its generator. Prove that, for every γ ∈ R, the family of bounded linear operators {eγt St ; t ≥ 0} is a semigroup of type ω + γ, having A + γI as its generator. 7. On the space Lp (R), consider the linear operators St defined by (St f )(x) = e−2t f (x + t) . (i) Prove that the family of linear operators {St ; t ≥ 0} is a strongly continuous, contractive semigroup on Lp (R), for all 1 ≤ p < ∞. Find the generator A of this semigroup. What is Dom(A)? (ii) Show that the family of operators {St ; t ≥ 0} is NOT a strongly continuous semigroup on L∞ (R). 8. Let {St ; t ≥ 0} be a strongly continuous semigroup of linear operators on Rn . Prove that there exists an n × n matrix A such that St = etA for every t ≥ 0. 9. (Semilinear equations) Let A be a linear operator on a Banach space X, generating the contractive semigroup {St ; t ≥ 0}. We say that a map u : [0, T ] → X is a mild solution of the semilinear Cauchy problem (7.40)

d u(t) = Au(t) + f (t, u(t)) , dt

if

(7.41)

u(t) = St u ¯+

u(0) = u ¯

t

St−s f (s, u(s)) ds

for all t ∈ [0, T ] .

0

(i) Assuming that f : [0, T ] × X → X is continuous and satisfies the bounds f (t, x) ≤ M ,

f (t, x) − f (t, y) ≤ L x − y

for all t, x, y,

prove that the Cauchy problem (7.40) admits a unique mild solution. (ii) In the special case where A is a bounded operator, so that St = etA , prove that u(·) satisfies (7.41) if and only if u is a continuously differentiable solution to the Cauchy problem (7.40). 10. Fix a time T > 0. On the space X = L1 ([0, 1]), construct a strongly continuous, contractive semigroup of linear operators {St ; t ≥ 0} such that St = 1 for 0 ≤ t < T but St = 0 for t ≥ T .

7.5. Problems

137

11. Let {St ; t ≥ 0} be a strongly continuous, contractive semigroup of linear operators on a Banach space X. Assume that, for a given time τ > 0, the operator Sτ is compact. Prove that, for every t > τ , the operator St compact as well. Construct an example showing that the operators St may not be compact for 0 ≤ t < τ. ∂ 12. On the space L1 (R), consider the operator Au = ∂x u with domain Dom(A) = u ∈ L1 (R) ; u is absolutely continuous, ux ∈ L1 (R) .

(i) Describe the semigroup {St ; t ≥ 0} generated by A. (ii) For any u ∈ L1 (R) and h > 0, construct the backward Euler approximation Eh− u. (iii) Let u ∈ Cc∞ (R) be a smooth function with compact support, say with u(x) = 0 for x ∈ / [a, b]. Given any time step h > 0, show that the forward Euler approximations (Eh+ )n u = (I + hA)n u are well defined and have support contained in [a, b]. (iv) Using (iii) )nshow that, for every time τ > 0 sufficiently large, the functions ( I + nτ A u cannot converge to Sτ u as n → ∞.

Chapter 8

Sobolev Spaces

The present chapter covers the basic theory of Sobolev spaces, which provide a very useful abstract framework for the analysis of both linear and nonlinear PDEs. We begin with an introduction to the theory of distributions and some motivating examples.

8.1. Distributions and weak derivatives In the following, L1loc (R) denotes the space of locally summable functions f : R → R. These are the Lebesgue measurable functions which are summable over every bounded interval. The support of a function φ, denoted by Supp(φ), is the closure of the set {x ; φ(x) = 0} where φ does not vanish. By Cc∞ (R) we denote the space of continuous functions with compact support, having continuous derivatives of every order. Every locally summable function f ∈ L1loc (R) determines a linear functional Λf : Cc∞ (R) → R, namely . (8.1) Λf (φ) = f (x)φ(x) dx . R

Notice that this integral is well defined for every φ ∈ Cc∞ (R) . Moreover, if Supp(φ) ⊆ [a, b], we have the estimate b (8.2) |Λf (φ)| ≤ |f (x)| dx φ C 0 . a

Next, assume that f is continuously differentiable. Then its derivative f (x) = lim

h→0

f (x + h) − f (x) h 139

140

8. Sobolev Spaces

is continuous, hence locally summable. In turn, f also determines a linear functional on Cc∞ (R), namely . (8.3) Λf (φ) = f (x) φ(x) dx = − f (x) φ (x) dx . R

R

At this stage, a key observation is that the first integral in (8.3) is defined only if f (x) exists for a.e. x and is locally summable. However, the second integral is well defined for every locally summable function f , even if f does not have a pointwise derivative at any point. Moreover, if Supp(φ) ⊆ [a, b], we have the estimate b |Λf (φ)| ≤ |f (x)| dx φ C 1 . a

This construction can also be performed for higher-order derivatives. Definition 8.1. Given an integer k ≥ 1, the distributional derivative of order k of f ∈ L1loc (R) is the linear functional . k ΛDk f (φ) = (−1) f (x)D k φ(x) dx . R

If there exists a locally summable function g such that Λg = ΛDk f , namely k g(x)φ(x) dx = (−1) f (x)D k φ(x) dx for all φ ∈ Cc∞ (R), R

R

then we say that g is the weak derivative of order k of f . Remark 8.2. Classical derivatives are defined pointwise, as limits of difference quotients. On the other hand, weak derivatives are defined only in an integral sense, up to a set of measure zero. By arbitrarily changing the function f on a set of measure zero we do not affect its weak derivatives in any way. Example 8.3. Consider the function

0 . f (x) = x

if x ≤ 0, if x > 0 .

Its distributional derivative is the map ∞ Λ(φ) = − x · φ (x) dx = H(x) φ(x) dx , R

0

where (8.4)

H(x) =

0 1

if x ≤ 0, if x > 0 .

In this case, the Heaviside function H in (8.4) is the weak derivative of f .

8.1. Distributions and weak derivatives

141

Example 8.4. The function H in (8.4) is locally summable. Its distributional derivative is the linear functional ∞ . Λ(φ) = − H(x) φ (x) dx = − φ (x) dx = φ(0) . R

0

This corresponds to the Dirac measure, concentrating a unit mass at the origin. We claim that the function H does not have any weak derivative, i.e., there cannot exist any locally summable function g such that (8.5) g(x) φ(x) dx = φ(0) for all φ ∈ Cc∞ . Indeed, if (8.5) holds, then by the Lebesgue dominated convergence theorem h lim |g(x)| dx = 0. h→0 −h

δ Hence we can choose δ > 0 so that −δ |g(x)| dx ≤ 1/2. Let φ : R → [0, 1] be a smooth function, with φ(0) = 1 and with support contained in the interval [−δ, δ]. We now reach a contradiction by writing δ 1 = φ(0) = Λ(φ) = g(x)φ(x) dx = g(x)φ(x) dx −δ

R

≤ maxx |φ(x)| · Example 8.5. Consider the function

0 . f (x) = 2 + sin x

δ −δ

|g(x)| dx ≤

1 . 2

if x is rational, if x is irrational .

Being discontinuous at every point, the function f is nowhere differentiable. On the other hand, the function g(x) = cos x provides a weak derivative for f . Indeed, the behavior of f on the set of rational points (having measure zero) is irrelevant. We thus have − f (x)φ (x) dx = − (2 + sin x) φ (x) dx = (cos x) φ(x) dx. Example 8.6. Consider the Cantor ⎧ 0 ⎪ ⎪ ⎪ ⎪ 1 ⎪ ⎪ ⎨ 1/2 (8.6) f (x) = 1/4 ⎪ ⎪ ⎪ ⎪ 3/4 ⎪ ⎪ ⎩

function f : R → [0, 1], defined by if if if if if ···

x ≤ 0, x ≥ 1, x ∈ [1/3, 2/3] , x ∈ [1/9, 2/9] , x ∈ [7/9, 8/9] ,

142

8. Sobolev Spaces

Figure 8.1.1. The Cantor function f and a test function φ showing that g(x) ≡ 0 cannot be the weak derivative of f .

(see Figure 8.1.1). This provides a standard example of a continuous function which is not absolutely continuous. We claim that f does not have a weak derivative. Indeed, assume on the contrary that g ∈ L1loc is a weak derivative of f . Since f is constant on each of the open sets 1 2 1 2 7 8 ] − ∞, 0[ , ]1, +∞[ , , , , , , , ..., 3 3 9 9 9 9 we must have g(x) = f (x) = 0 on the union of these open intervals. Hence g(x) = 0 for a.e. x ∈ R. To obtain a contradiction, it remains to show that the function g ≡ 0 is NOT the weak derivative of f . As shown in Figure 8.1.1, let φ ∈ Cc∞ be a test function such that φ(x) = 1 for x ∈ [0, 1] while φ(x) = 0 for x ≥ b. Then g(x)φ(x) dx = 0 = 1 = − f (x)φ (x) dx . 8.1.1. Distributions. The construction described in the previous section can be extended to any open domain in a multi-dimensional space. Let Ω ⊆ Rn be an open set. By L1loc (Ω) we denote the space of locally summable functions on Ω. These are the measurable functions f : Ω → R which are summable restricted to every compact subset K ⊂ Ω. Example 8.7. The functions ex and ln |x| are in L1loc (R), while x−1 ∈ / 1 γ 1 Lloc (R). On the other hand, the function f (x) = x is in Lloc (]0, ∞[) for every (positive or negative) exponent γ ∈ R. In several space dimensions, the function f (x) = |x|−γ is in L1loc (Rn ) provided that γ < n. One should keep in mind that the pointwise values of a function f ∈ L1loc on a set of measure zero are irrelevant. By Cc∞ (Ω) we denote the space of continuous functions φ : Ω → R having continuous partial derivatives of all orders and whose support is a compact subset of Ω. Functions φ ∈ Cc∞ (Ω) are usually called “test functions”. We

8.1. Distributions and weak derivatives

143

recall that the support of a function φ is the closure of the set where φ does not vanish: . Supp (φ) = {x ∈ Ω ; φ(x) = 0}. We shall need an efficient way to denote higher-order derivatives of a function f . A multi-index α = (α1 , α2 , . . . , αn ) is an n-tuple of nonnegative integer numbers. Its length is defined as . |α| = α1 + α2 + · · · + αn . Each multi-index α determines a partial differential operator of order |α|, namely ∂ α2 ∂ α1 ∂ αn α D f = ··· f. ∂x1 ∂x2 ∂xn Definition 8.8. A distribution on the open set Ω ⊆ Rn is a linear functional Λ : Cc∞ (Ω) → R such that the following boundedness property holds: For every compact K ⊂ Ω there exists an integer N ≥ 0 and a constant C such that (8.7)

|Λ(φ)| ≤ C φ C N

for every φ ∈ C ∞ with Supp(φ) ⊆ K .

In other words, for all test functions φ which vanish outside a given compact set K, the value Λ(φ) should be bounded in terms of the maximum value of derivatives of φ, up to a certain order N . Notice that here both N and C depend on the compact subset K. If there exists an integer N ≥ 0 independent of K such that (8.7) holds (with C = CK possibly still depending on K), we say that the distribution has finite order. The smallest such integer N is called the order of the distribution. Example 8.9. Let Ω be an open subset of Rn and consider a function f ∈ L1loc (Ω). Then the linear map Λf : Cc∞ (Ω) → R defined by . (8.8) Λf (φ) = f φ dx Ω

is a distribution. Indeed, it is clear that Λf is well defined and linear. Given a compact subset K ⊂ Ω, for every test function φ with Supp(φ) ⊆ K we have the estimate |Λf (φ)| = f φ dx ≤ |f (x)| dx · maxx∈K |φ(x)| ≤ C φ C 0 . K K Hence the bound (8.7) holds with C = K |f | dx and N = 0. This provides an example of a distribution of order zero.

144

8. Sobolev Spaces

The family of all distributions on Ω is clearly a vector space. A remarkable fact is that, while a function f may not admit any partial derivative (in the classical sense), for a distribution Λ an appropriate notion of derivative can always be defined. Definition 8.10. Given a distribution Λ and a multi-index α, we define the distribution D α Λ by setting . (8.9) D α Λ(φ) = (−1)|α| Λ(D α φ) . It is easy to check that D α Λ is itself a distribution. Indeed, the linearity of the map φ → D α Λ(φ) is clear. Next, let K be a compact subset of Ω and let φ be a test function with support contained in K. By assumption, there exists a constant C and an integer N ≥ 0 such that (8.7) holds. In turn, this implies |D α Λ(φ)| = |Λ(D α φ)| ≤ C D α φ C N ≤ C φ C N +|α| . Hence D α Λ also satisfies (8.7), with N replaced by N + |α|. Notice that, if Λf is the distribution in (8.8) corresponding to a function f which is |α|-times continuously differentiable, then we can integrate by parts and obtain |α| α |α| α D Λf (φ) = (−1) Λf (D φ) = (−1) f (x)D α φ(x) dx = D α f (x)φ(x) dx = ΛDα f (φ) . This justifies the formula (8.9). 8.1.2. Weak derivatives. For every locally summable function f and every multi-index α = (α1 , . . . , αn ), the distribution Λf always admits a distributional derivative D α Λf , defined according to (8.9). In some cases, one can find a locally summable function g such that the distribution D α Λf coincides with the distribution Λg . This leads to the concept of weak derivative. Definition 8.11. Let f ∈ L1loc (Ω) be a locally summable function on the open set Ω ⊆ Rn and let Λf be the corresponding distribution, as in (8.8). Given a multi-index α = (α1 , . . . , αn ), if there exists a locally summable function g ∈ L1loc (Ω) such that D α Λf = Λg , i.e., (8.10) f D α φ dx = (−1)|α|

g φ dx

for all test functions φ ∈ Cc∞ (Ω),

then we say that g is the weak α-th derivative of f and write g = D α f .

8.1. Distributions and weak derivatives

145

In general, a weak derivative may not exist. In particular, Example 8.4 shows that the Heaviside function does not admit a weak derivative. Indeed, its distributional derivative is a Dirac measure (concentrating a unit mass at the origin), not a locally summable function. On the other hand, if a weak derivative does exist, then it is unique (up to a set of measure zero). Lemma 8.12 (Uniqueness of weak derivatives). Assume f ∈ L1loc (Ω) and let g, g˜ ∈ L1loc (Ω) be the weak α-th derivatives of f , so that α |α| |α| f D φ dx = (−1) g φ dx = (−1) g˜ φ dx for all test functions φ ∈ Cc∞ (Ω). Then g(x) = g˜(x) for a.e. x ∈ Ω. Proof. By the assumptions, the function (g − g˜) ∈ L1loc (Ω) satisfies (g − g˜)φ dx = 0 for all test functions φ ∈ Cc∞ (Ω). By Corollary A.17 in the Appendix, we thus have g(x) − g˜(x) = 0 for a.e. x ∈ Ω. If a function f is twice continuously differentiable, a basic theorem of calculus states that partial derivatives commute: fxj xk = fxk xj . This property remains valid for weak derivatives. To state this result in full generality, we recall that the sum of two multi-indices α = (α1 , . . . , αn ) and β = (β1 , . . . , βn ) is defined as α + β = (α1 + β1 , . . . , αn + βn ). Lemma 8.13 (Weak derivatives commute). Assume that f ∈ L1loc (Ω) has weak derivatives D α f for every |α| ≤ k. Then, for every pair of multiindices α, β with |α| + |β| ≤ k one has D α (D β f ) = D β (D α f ) = D α+β f .

(8.11)

Proof. Consider any test function φ ∈ Cc∞ (Ω). Using the fact that D β φ ∈ Cc∞ (Ω) is a test function as well, we obtain α β |α| D f D φ dx = (−1) f (D α+β φ) dx Ω

Ω |α|

= (−1) (−1)

|α+β|

(D α+β f ) φ dx Ω

= (−1)|β|

(D α+β f ) φ dx . Ω

146

8. Sobolev Spaces

By definition, this means that D α+β f = D β (D α f ). Exchanging the roles of the multi-indices α and β in the previous computation, one obtains D α+β f = D α (D β f ), completing the proof. The next lemma extends another familiar result, stating that the weak derivative of a limit coincides with the limit of the weak derivatives. Lemma 8.14 (Convergence of weak derivatives). Consider a sequence of functions fn ∈ L1loc (Ω). For a fixed multi-index α, assume that each fn admits the weak derivative gn = D α fn . If fn → f and gn → g in L1loc (Ω), then g = D α f . Proof. For every test function φ ∈ Cc∞ (Ω), a direct computation yields g φ dx = lim gn φ dx = lim (−1)|α| fn D α φ dx Ω

n→∞ Ω

= (−1)|α|

n→∞

Ω

f D α φ dx . Ω

By definition, this means that g is the α-th weak derivative of f .

8.2. Mollifications As usual, let Ω ⊆ Rn be an open set. For a given ε > 0, define the open subset . (8.12) Ωε = {x ∈ Rn ; B(x, ε) ⊂ Ω} . For every u ∈ L1loc (Ω) the mollification . uε (x) = (Jε ∗ u)(x) =

Jε (x − y)u(y) dy B(x,ε)

is well defined for every x ∈ Ωε . Moreover, uε ∈ C ∞ (Ωε ). A very useful property of the mollification operator is that it commutes with weak differentiation. Lemma 8.15 (Mollifications). Let Ωε ⊂ Ω be as in (8.12). Assume that a function u ∈ L1loc (Ω) admits a weak derivative D α u, for some multi-index α. Then the derivative of the mollification (which exists in the classical sense) coincides with the mollification of the weak derivative: (8.13)

D α (Jε ∗ u) = Jε ∗ D α u

for all x ∈ Ωε .

8.2. Mollifications

147

. Proof. Observe that, for each fixed x ∈ Ωε , the function φ(y) = Jε (x − y) is in Cc∞ (Ω). Hence we can apply the definition of weak derivative D α u using φ as a test function. Writing Dxα and Dyα to distinguish differentiation with respect to the variables x or y, we thus obtain D α uε (x) = Dxα Jε (x − y) u(y) dy Ω

Dxα Jε (x − y) u(y) dy

= Ω

= (−1)

|α|

Dyα Jε (x − y) u(y) dy Ω

= (−1)

|α|+|α|

Jε (x − y) Dyα u(y) dy Ω

=

Jε ∗ D α u (x).

Figure 8.2.1. Left: the open subset Ωε ⊂ Ω of points having dis-

tance > ε from the boundary. Right: the domain Ω can be covered by countably many open subdomains Vj = Ω1/(j−1) \ Ω1/(j+1) .

This property of mollifications stated in Lemma 8.15 provides the key tool to relate weak derivatives with partial derivatives in the classical sense. As a first application, we prove Corollary 8.16 (Constant functions). Let Ω ⊆ Rn be an open, connected set, and assume u ∈ L1loc (Ω). If the first-order weak derivatives of u satisfy Dxi u(x) = 0

for i = 1, 2, . . . , n and a.e. x ∈ Ω ,

then u coincides a.e. with a constant function.

148

8. Sobolev Spaces

Proof. 1. For ε > 0, consider the mollified function uε = Jε ∗ u. By the previous analysis, uε : Ωε → R is a smooth function whose derivatives Dxi uε vanish identically on Ωε . Therefore, uε must be constant on each connected component of Ωε . 2. Now consider any two points x, y ∈ Ω. Since the open set Ω is connected, there exists a polygonal path Γ joining x with y and remaining inside Ω. . Let δ = minz∈Γ d(z, ∂Ω) > 0 be the minimum distance of points in Γ to the boundary of Ω. Then for every ε < δ the whole polygonal curve Γ is in Ωε . Hence x, y lie in the same connected component of Ωε . In particular, uε (x) = uε (y). . 3. Let u ˜(x) = limε→0 uε (x). By the previous step, u ˜ is a constant function on Ω. Moreover, u ˜(x) = u(x) at every Lebesgue point of u, hence almost everywhere on Ω. This concludes the proof.

.

Figure 8.2.2. Left: even if Ω is connected, the subdomain Ωε =

{x ∈ Ω ; B(x, ε) ⊆ Ω} may not be connected. Right: any two points x, y ∈ Ω can be connected by a polygonal path Γ remaining inside Ω. Hence, if ε > 0 is sufficiently small, x and y belong to the same connected component of Ωε .

In the one-dimensional case, relying again on Lemma 8.15, we now characterize the set of functions having a weak derivative in L1 . Corollary 8.17 (Absolutely continuous Consider an open ( functions). ) 1 interval ( )]a, b[ and assume that u ∈ Lloc ]a, b[ has a weak derivative v ∈ L1 ]a, b[ . Then there exists an absolutely continuous function u ˜ such that (8.14)

(8.15)

u ˜(x) = u(x)

v(x) = lim

h→0

for a.e. x ∈ ]a, b[ ,

u ˜(x + h) − u ˜(x) h

for a.e. x ∈ ]a, b[ .

8.2. Mollifications

149

Proof. Let x0 ∈ ]a, b[ be a Lebesgue point of u, and define x . u ˜(x) = u(x0 ) + v(y) dy . x0

Clearly u ˜ is absolutely continuous and satisfies (8.15). . In order to prove (8.14), let Jε be the standard mollifier and let uε = . ∞ Jε ∗ u, vε = Jε ∗ v. Then uε , vε ∈ C (]a + ε , b − ε[), while Lemma 8.15 yields x (8.16) uε (x) = uε (x0 ) + vε (y) dy for all x ∈ ]a + ε , b − ε[ . x0

Letting ε → 0, we obtain uε (x0 ) → u(x0 ) because x0 is a Lebesgue point. Moreover, the right-hand side of (8.16) converges to u ˜(x) for every x ∈ ]a, b[ , while the left-hand side converges to u(x) at every Lebesgue point of u (and hence almost everywhere). Therefore (8.14) holds. If f, g ∈ L1loc (Ω) are weakly differentiable functions, for any constants a, b ∈ R it is clear that the linear combination af + bg is also weakly differentiable. Indeed, (8.17)

Dxi (af + bg) = a Dxi f + b Dxi g .

We now consider products and compositions of weakly differentiable functions. One should be aware that, in general, the product of two functions f, g ∈ L1loc may not be locally summable. Similarly, the product of two weakly differentiable functions on Rn may not be weakly differentiable (see problem 20). For this reason, in the next lemma we shall assume that one of the two functions is continuously differentiable with uniformly bounded derivatives. Given two multi-indices α = (α1 , . . . , αn ) and β = (β1 , . . . , βn ), we recall that the notation β ≤ α means βi ≤ αi for every i = 1, . . . , n. Moreover, α! α1 ! α2 ! α2 ! α . . = = · ··· . β β! (α − β)! β1 ! (α1 − β1 )! β2 ! (α2 − β2 )! β2 ! (α2 − β2 )! Lemma 8.18 (Products and compositions of weakly differentiable functions). Let Ω ⊆ Rn be any open set and consider a function u ∈ L1loc (Ω) having weak derivatives D α u of every order |α| ≤ k. (i) If η ∈ C k (Ω), then the product ηu admits weak derivatives up to order k. These are given by the Leibniz formula α α (8.18) D (ηu) = D β η D α−β u . β β≤α

150

8. Sobolev Spaces

(ii) Let Ω ⊆ Rn be an open set and let ϕ : Ω → Ω be a C k bijection whose Jacobian matrix has a uniformly bounded inverse. Then the composition u ◦ ϕ is a function in L1loc (Ω ) which admits weak derivatives up to order k. . Proof. 1. To prove (i), let Jε be the standard mollifier and set uε = Jε ∗ u. Since the Leibniz formula holds for the product of smooth functions, for every ε > 0 we obtain (8.19)

α D (ηuε ) = D β η D α−β uε . β α

β≤α

For every test function φ ∈ Cc∞ (Ω), we thus have |α| α (−1) (ηuε )D φ dx = D α (ηuε )φ dx Ω Ω α ( ) = D β η D α−β uε φ dx . β Ω β≤α

Notice that, if ε > 0 is small enough so that Supp(φ) ⊂ Ωε , then the above integrals are well defined. Letting ε → 0, we obtain ⎛ ⎞ α ⎝ (−1)|α| (ηu)D α φ dx = D β η D α−β u⎠ φ dx . β Ω Ω β≤α

By the definition of weak derivative, (8.18) holds. 2. We prove (ii) by induction on k. Call y the variable in Ω and x = ϕ(y) the variable in Ω, as) shown in Figure 8.2.3. By assumption, the n × n i Jacobian matrix ∂ϕ ∂yj i,j=1,...,n has a uniformly bounded inverse. Hence the composition u ◦ ϕ lies in L1loc (Ω ), proving the theorem in the case k = 0.

Next, assume that the result is true for all weak derivatives of order |α| ≤ k − 1. Consider any test function φ ∈ Cc∞ (Ω ) and define the mollification . uε = Jε ∗ u. For any ε > 0 small enough so that ϕ Supp(φ) ⊂ Ωε , we have −

Ω

(uε ◦ ϕ) · Dyi φ dy = =

Ω

Ω

Dyi (uε ◦ ϕ) · φ dy ⎛ ⎞ n ⎝ Dxj uε (ϕ(y)) · Dyi ϕj (y)⎠ · φ(y) dy . j=1

8.3. Sobolev spaces

151

Letting ε → 0, we conclude that the composition u ◦ ϕ admits a first-order weak derivative, given by (8.20)

Dyi (u ◦ ϕ)(y) =

n

Dxj u(ϕ(y)) · Dyi ϕj (y).

j=1

By the inductive assumption, each function Dxj (u ◦ ϕ) admits weak derivatives up to order k − 1, while Dyi ϕj ∈ C k−1 (Ω ). By part (i) of the lemma, all the products on the right-hand side of (8.20) have weak derivatives up to order k − 1. Using Lemma 8.13, we conclude that the composition u ◦ ϕ admits weak derivatives up to order k. By induction, this concludes the proof.

Figure 8.2.3. The mappings considered in part (ii) of Lemma 8.18.

8.3. Sobolev spaces Consider an open set Ω ⊆ Rn , fix p ∈ [1, ∞], and let k be a nonnegative integer. Definition 8.19. (i) The Sobolev space W k,p (Ω) is the space of all locally summable functions u : Ω → R such that, for every multi-index α with |α| ≤ k, the weak derivative D α u exists and belongs to Lp (Ω). On W k,p we shall use the norm ⎛ (8.21)

. u W k,p = ⎝

|α|≤k

(8.22)

⎞1/p

|D α u|p dx⎠

. u W k,∞ = ess sup |D α u| |α|≤k

if 1 ≤ p < ∞,

Ω

x∈Ω

if p = ∞ .

152

8. Sobolev Spaces

(ii) The subspace W0k,p (Ω) ⊆ W k,p (Ω) is defined as the closure of Cc∞ (Ω) in W k,p (Ω). More precisely, u ∈ W0k,p (Ω) if and only if there exists a sequence of functions un ∈ Cc∞ (Ω) such that u − un W k,p → 0. k,p (iii) In addition, Wloc (Ω) denotes the space of functions which are locally k,p in W . These are the functions u : Ω → R satisfying the following property: If Ω is an open set compactly contained1 in Ω, then the restriction of u to Ω is in W k,p (Ω ).

Intuitively, one can think of the closed subspace W01,p (Ω) as the space of all functions u ∈ W 1,p (Ω) which vanish along the boundary of Ω. More generally, W0k,p (Ω) is a space of functions whose derivatives D α u vanish along ∂Ω, for |α| ≤ k − 1. Definition 8.20. In the special case where p = 2, we define the Hilbert. Sobolev space H k (Ω) = W k,2 (Ω). The space H k (Ω) is endowed with the inner product . (8.23) u, vH k = D α u D α v dx . |α|≤k

Ω

. Similarly, we define H0k (Ω) = W0k,2 (Ω). Theorem 8.21 (Basic properties of Sobolev spaces). (i) Each Sobolev space W k,p (Ω) is a Banach space. (ii) The space W0k,p (Ω) is a closed subspace of W k,p (Ω). Hence it is a Banach space, with the same norm. (iii) The spaces H k (Ω) and H0k (Ω) are Hilbert spaces. Proof. 1. Let u, v ∈ W k,p (Ω). For |α| ≤ k, call D α u, D α v their weak derivatives. Then, for any λ, μ ∈ R, the linear combination λu + μv is a locally summable function. One easily checks that its weak derivatives are (8.24)

D α (λu + μv) = λD α u + μD α v .

Therefore, D α (λu+μv) ∈ Lp (Ω) for every |α| ≤ k. This proves that W k,p (Ω) is a vector space. 1 By

of Ω.

definition, a set Ω is compactly contained in Ω if the closure Ω is a compact subset

8.3. Sobolev spaces

153

2. Next, we check that (8.21) and (8.22) satisfy the conditions (N1)–(N3) defining a norm. For λ ∈ R and u ∈ W k,p one has λu W k,p = |λ| u W k,p , u W k,p ≥ u Lp ≥ 0 , with equality holding if and only if u = 0. Moreover, if u, v ∈ W k,p (Ω), then for 1 ≤ p < ∞ Minkowski’s inequality yields ⎞1/p ⎛ u + v W k,p = ⎝ D α u + D α v pLp ⎠ |α|≤k

⎛ ≤ ⎝

D α u Lp + D α v Lp

p

⎞1/p ⎠

|α|≤k

⎛ ≤ ⎝

⎞1/p D α u pLp ⎠

|α|≤k

⎛

+⎝

⎞1/p D α v pLp ⎠

|α|≤k

= u W k,p + v W k,p . In the case p = ∞, the above computation is replaced by u + v W k,∞ = D α u L∞ + D α v L∞ D α u + D α v L∞ ≤ |α|≤k

|α|≤k

= u W k,∞ + v W k,∞ . 3. To conclude the proof of (i), we need to show that the space W k,p (Ω) is complete; hence it is a Banach space. Let (un )n≥1 be a Cauchy sequence in W k,p (Ω). For any multi-index α with |α| ≤ k, the sequence of weak derivatives D α un is Cauchy in Lp (Ω). Since the space Lp (Ω) is complete, there exist functions u and uα , such that (8.25)

un − u Lp → 0,

D α un − uα Lp → 0

for all |α| ≤ k .

By Lemma 8.14, the limit function uα is precisely the weak derivative D α u. Since this holds for every multi-index α with |α| ≤ k, the convergence un → u holds in W k,p (Ω). This completes the proof of (i).

154

8. Sobolev Spaces

4. The fact that W0k,p (Ω) is a closed subspace of W k,p (Ω) follows immediately from the definition. It is also straightforward to check that (8.23) is an inner product, yielding the norm · W k,2 . Example 8.22. Let Ω = ]a, b[ be an open interval. By Corollary 8.17, each element of the space W 1,p (]a, b[) coincides a.e. with an absolutely continuous function f : ]a, b[→ R having derivative f ∈ Lp (]a, b[).

Figure 8.3.1. For certain values of p, n a function u ∈ W 1,p (Rn )

may not be continuous, or bounded.

Example 8.23. Let Ω = B(0, 1) ⊂ Rn be the open ball centered at the origin with radius one. Fix γ > 0 and consider the radially symmetric function −γ/2 n . −γ 2 u(x) = |x| = xi , 0 < |x| < 1. i=1

Observe that u ∈

C 1 (Ω \

{0}). We claim that, for 1 ≤ p < ∞, n−p (8.26) u ∈ W 1,p (Ω) if and only if γ < . p Outside the origin, the partial derivatives are computed as −(γ/2)−1 n γ 2 −γxi (8.27) uxi = − xi 2xi = . 2 |x|γ+2 i=1

8.3. Sobolev spaces

155

Hence, the gradient ∇u = (ux1 , . . . , uxn ) has Euclidean norm n 1/2 γ 2 |∇u(x)| = |uxi (x)| = . |x|γ+1 i=1

Calling σn the (n−1)-dimensional measure of the unit sphere {x ∈ Rn ; |x| = 1} ⊂ Rn , we compute p 1 dx = |x|−p(γ+1) dx γ+1 Ω |x| x∈Rn , |x| −1, p i.e., γ < n−p p . After a similar computation for |u| , we conclude that n−p |u|p + |∇u|p dx < ∞ if and only if γ < . p Ω To complete the proof of (8.26), we need to show that, if 0 < γ < n−p p (and hence n ≥ 2), then the functions in (8.27) are indeed the weak derivatives of u on the entire ball Ω (and not only on the set Ω \ {0}). For this purpose, consider any test function φ ∈ Cc∞ (Ω). Fix i ∈ {1, . . . , n}. For convenience, we extend φ to the entire space Rn by setting φ(x) = 0 for x ∈ / Ω. Since n ≥ 2, the xi -axis has n-dimensional Lebesgue measure zero. An integration by part yields −γxi −γxi φ dx = φ dxi dx1 · · · dxi−1 dxi+1 · · · dxn γ+2 γ+2 Ω |x| Rn−1 \{0} R |x|

= −

Rn−1 \{0}

R

|x|

−γ

φxi dxi dx1 · · · dxi−1 dxi+1 · · · dxn

|x|−γ φxi dx ,

= − Ω

completing the proof. Observe that the previous computation relied on the fact that u is absolutely continuous (in fact, smooth) on a.e. line parallel to one of the coordinate axes. However, there is no way to change u on a set of measure zero, so that it becomes continuous on the whole domain Ω. An important question in the theory of Sobolev spaces is whether one can estimate the norm of a function in terms of the norm of its first derivatives.

156

8. Sobolev Spaces

The following result provides an elementary estimate in this direction. It is valid for domains Ω which are contained in a slab, say Ω ⊆ {x = (x1 , x2 , . . . , xn ) ; a < x1 < b} .

(8.29)

Theorem 8.24 (Poincar´ e’s inequality. I). Let Ω ⊂ Rn be an open set which satisfies (8.29) for some a, b ∈ R. Then every u ∈ H01 (Ω) satisfies u L2 (Ω) ≤ 2(b − a) Dx1 u L2 (Ω) .

(8.30)

Proof. 1. Assume first that u ∈ Cc∞ (Ω). We extend u to the whole space Rn by setting u(x) = 0 for x ∈ / Ω. Using the variables x = (x1 , x ) with x = (x2 , . . . , xn ), we compute x1 2 u (x1 , x ) = 2uux1 (t, x ) dt . a

An integration by parts yields 2 2 u L2 = u (x) dx = Rn

b

= Rn−1

Rn−1

b a

x1

1·

2uux1 (t, x ) dt dx1 dx

a

(b − x1 ) 2uux1 (x1 , x ) dx1 dx ≤ 2(b − a)

a

Rn

|u| |ux1 | dx

≤ 2(b − a) u L2 ux1 L2 . Dividing both sides by u L2 , we obtain (8.30) for every u ∈ Cc∞ (Ω). 2. Now consider any u ∈ H01 (Ω). By assumption there exists a sequence of functions un ∈ Cc∞ (Ω) such that un − u H 1 → 0. By the previous step, this implies u L2 = lim un L2 ≤ lim 2(b − a) Dx1 un L2 = 2(b − a) Dx1 u L2 . n→∞

n→∞

To proceed in the analysis of Sobolev spaces, we need to establish some additional facts about weak derivatives. Theorem 8.25 (Properties of weak derivatives). Let Ω ⊆ Rn be an open set, let p ∈ [1, ∞], and let |α| ≤ k. If u, v ∈ W k,p (Ω), then: ⊂ Ω is in the space W k,p (Ω). (i) The restriction of u to any open set Ω α k−|α|,p (ii) D u ∈ W (Ω).

8.4. Approximations of Sobolev functions

157

(iii) If η ∈ C k (Ω), then the product satisfies η u ∈ W k,p (Ω). Moreover there exists a constant C, depending on Ω and on η C k but not on u, such that (8.31)

ηu W k,p (Ω) ≤ C u W k,p (Ω) .

(iv) Let Ω ⊆ Rn be an open set and let ϕ : Ω → Ω be a C k diffeomorphism whose Jacobian matrix has a uniformly bounded inverse. Then the composition satisfies u ◦ ϕ ∈ W k,p (Ω ). Moreover there exists a constant C, depending on Ω and on ϕ C k but not on u, such that (8.32)

u ◦ ϕ W k,p (Ω ) ≤ C u W k,p (Ω) .

Proof. Statement (i) is an obvious consequence of the definitions, while (ii) follows from Lemma 8.13. To prove (iii), we observe that by assumption all derivatives of η are bounded, namely D β η L∞ ≤ η C k

for all |β| ≤ k .

Hence the bound (8.31) follows from the representation formula (8.18). Recalling part (ii) of Lemma 8.18, we prove (iv) by induction on k. By assumption, the n × n Jacobian matrix (Dxi ϕj )i,j=1,...,n has a uniformly bounded inverse. Hence the case k = 0 is clear. Next, assume that the result is true for k = 0, 1, . . . , m − 1. If u ∈ W m,p (Ω), we have Dxi (u ◦ ϕ) W m−1,p (Ω ) ≤ C ∇u W m−1,p (Ω) ϕ C m (Ω ) ≤ C u W m,p (Ω) , showing that the result is also true for k = m. By induction, this achieves the proof.

8.4. Approximations of Sobolev functions If u ∈ W k,p (Rn ) is a function defined on the entire space Rn , it can be approximated by smooth functions simply by taking mollifications: uε = Jε ∗ u. However, if u is only defined on some open subset Ω ⊂ Rn , a more careful construction is needed. Theorem 8.26 (Approximation with smooth functions). Let Ω ⊆ Rn be an open set. Let u ∈ W k,p (Ω) with 1 ≤ p < ∞. Then for any ε > 0 there exists a function U ∈ C ∞ (Ω) such that U − u W k,p (Ω) < ε.

158

8. Sobolev Spaces

Proof. 1. Let ε > 0 be given. Consider the following locally finite open covering of the set Ω, shown in Figure 8.2.1: 1 . V1 = x ∈ Ω ; d(x, ∂Ω) > , 2 1 1 . Vj = x ∈ Ω ; < d(x, ∂Ω) < , j = 2, 3, . . . . j+1 j−1 Using Theorem A.18 in the Appendix, let η1 , η2 , . . . be a smooth partition of unity subordinate to the above covering. By Theorem 8.25, for every j ≥ 0 the product ηj u is in W k,p (Ω). By construction, it has support contained in Vj . 2. Consider the mollifications Jε ∗ (ηj u). By Lemma 8.15 and by Theorem A.16 in the Appendix, for every |α| ≤ k we have D α (Jε ∗ (ηj u)) = J ∗ (D α (ηj u)) → D α (ηj u) as ε → 0. Since each ηj has compact support, here the convergence takes place in Lp (Ω). Therefore, for each j ≥ 0 we can find 0 < εj < 2−j small enough so that ηj u − Jεj ∗ (ηj u) W k,p (Ω) ≤ ε 2−j . 3. Consider the function ∞ . U = Jεj ∗ (ηj u). j=1

Notice that the above series may not converge in W k,p . However, it is certainly pointwise convergent because every compact set K ⊂ Ω intersects finitely many of the sets Vj . Restricted to K, the above sum contains only finitely many nonzero terms. Since each term is smooth, this implies U ∈ C ∞ (Ω). 4. Consider the subdomains 1 . Ω1/n = x ∈ Ω ; d(x, ∂Ω) > . n Recalling that j ηj (x) ≡ 1, for every n ≥ 1 we obtain U − u W k,p (Ω1/n ) ≤

n+2

ηj u − Jεj ∗ (ηj u) W k,p (Ω1/n ) ≤

j=1

n+2 j=1

This yields U − u W k,p (Ω) = sup U − u W k,p (Ω1/n ) ≤ ε . n≥1

ε 2−j ≤ ε.

8.4. Approximations of Sobolev functions

159

Since ε > 0 was arbitrary, this proves that the set of C ∞ function is dense on W k,p (Ω). Using the above approximation result, we obtain a first regularity theorem for Sobolev functions (see Figure 8.4.1). Theorem 8.27 (Relation between weak and strong derivatives). Let u ∈ W 1,1 (Ω), where Ω ⊆ Rn is an open set having the form . (8.33) Ω = x = (x, x ) ; x = (x2 , . . . , xn ) ∈ Ω , α(x ) < x1 < β(x ) (possibly with α ≡ −∞ or β ≡ +∞). Then there exists a function u ˜ with u ˜(x) = u(x) for a.e. x ∈ Ω, such that the following holds. For a.e. x = (x2 , . . . , xn ) ∈ Ω ⊂ Rn−1 (with respect to the (n − 1)-dimensional Lebesgue measure), the map x1 → u ˜(x1 , x ) is absolutely continuous. Its derivative coincides a.e. with the weak derivative Dx1 u.

Figure 8.4.1. The domain Ω at (8.33). If u has a weak deriv-

ative Dx1 u ∈ L1 (Ω), then (by possibly changing its values on a set of measure zero) the function u is absolutely continuous on almost every segment parallel to the x1 -axis and its partial derivative ∂u/∂x1 coincides a.e. with the weak derivative.

Proof. 1. By the previous theorem, there exists a sequence of functions uk ∈ C ∞ (Ω) such that (8.34)

uk − u W 1,1 < 2−k .

160

8. Sobolev Spaces

We claim that this implies the pointwise convergence uk (x) → u(x),

Dx1 uk (x) → Dx1 u(x)

for a.e. x ∈ Ω .

Indeed, consider the functions ∞

. f (x) = |u1 (x)| + |uk+1 (x) − uk (x)| , (8.35)

k=1

∞ . g(x) = |Dx1 u1 (x)| + D u (x) − D u (x) x1 k+1 . x1 k k=1

By (8.34), uk − uk+1 W 1,1 ≤ 21−k ; hence f, g ∈ L1 (Ω) and the series in (8.35) are absolutely convergent for a.e. x ∈ Ω. Therefore, they converge pointwise almost everywhere. Moreover, we have the bounds (8.36) |uk (x)| ≤ f (x) ,

|Dx1 uk (x)| ≤ g(x)

for all n ≥ 1, x ∈ Ω .

2. Since f, g ∈ L1 (Ω), by Fubini’s theorem there exists a null set N ⊂ Ω (with respect to the (n − 1)-dimensional Lebesgue measure) such that, for every x ∈ Ω \ N , one has β(x ) β(x ) (8.37) f (x1 , x ) dx1 < ∞ , g(x1 , x ) dx1 < ∞. α(x )

α(x )

Fix such a point x ∈ Ω \ N . Choose a point y1 ∈ ]α(x ) , β(x )[ where the pointwise convergence uk (y1 , x ) → u(y1 , x ) holds. For every α(x ) < x1 < β(x ), since uk is smooth, we have x1 (8.38) uk (x1 , x ) = uk (y1 , x ) + Dx1 uk (s, x ) ds . y1

We now let n → ∞ in (8.38). By (8.36) and (8.37), the functions Dx1 uk (·, x ) are all bounded by the integrable function g(·, x ) ∈ L1 . By the dominated convergence theorem, the right-hand side of (8.38) thus converges to x1 . (8.39) u ˜(x1 , x ) = u(y1 , x ) + Dx1 u(s, x ) ds . y1

Clearly, the right-hand side of (8.39) is an absolutely continuous function of the scalar variable x1 . On the other hand, the left-hand side satisfies . u ˜(x1 , x ) = lim uk (x1 , x ) = u(x1 , x ) for a.e. x1 ∈ [α(x ), β(x )]. n→∞

This achieves the proof.

8.5. Extension operators

161

Remark 8.28. (i) It is clear that a similar result holds for any other derivative Dxi u, with i = 1, 2, . . . , n. and Ω ⊂ Ω, then the restriction of u to Ω lies in (ii) If u ∈ W k,p (Ω) has a complicated topology, the the space W k,p (Ω). Even if the open set Ω result of Theorem 8.27 can be applied to any cylindrical subdomain Ω ⊂ Ω admitting the representation (8.33). (iii) If Ω ⊂ Rn is a bounded open set and u ∈ W k,p (Ω), then u ∈ W k,q (Ω) for every q ∈ [1, p].

8.5. Extension operators Let Ω ⊂ Rn be a bounded open domain with C 1 boundary.2 Given a function u defined on Ω, the next theorem provides a way to extend it to the entire space Rn , retaining some control on the W 1,p -norm. ⊂ Rn be open sets, Theorem 8.29 (Extension operators). Let Ω ⊂⊂ Ω Moreover, assume such that the closure of Ω is a compact subset of Ω. 1 that Ω has C boundary. Then there exists a bounded linear operator E : W 1,p (Ω) → W 1,p (Rn ) and a constant C such that (i) Eu(x) = u(x) for a.e. x ∈ Ω, (ii) Eu(x) = 0 for x ∈ / Ω, (iii) one has the bound Eu W 1,p (Rn ) ≤ C u W 1,p (Ω) . Proof. 1. We first prove that the same conclusion holds in the case where = Rn . the domain is a half-space: Ω = {x = (x1 , x2 , . . . , xn ) ; x1 > 0}, and Ω 1,p In this case, any function u ∈ W (Ω) can be extended to the whole space Rn by reflection, i.e., by setting . (8.40) (E u)(x1 , x2 , . . . , xn ) = u ( |x1 |, x2 , x3 , . . . , xn ) . By Theorem 8.27, for every i ∈ {1, . . . , n} the function u is absolutely continuous along a.e. line parallel to the xi -axis. Hence the same is true of the extension E u. A straightforward computation involving integration by parts shows that the first-order weak derivatives of E u exist on the entire space Rn and satisfy

Dx1 E u(−x1 , x2 , . . . , xn ) = −Dx1 u(x1 , x2 , . . . , xn ), Dxj E u(−x1 , x2 , . . . , xn ) = Dxj u(x1 , x2 , . . . , xn ) (j = 2, . . . , n), 2 Saying that Ω has C 1 boundary means the following. For every boundary point x ∈ ∂Ω there exists a neighborhood V x of x and a C 1 map ϕ : Rn → R such that ∇ϕ(x) = 0 and Ω ∩ V x = {y ∈ V x ; ϕ(y) > 0}.

162

8. Sobolev Spaces

for all x1 > 0, x2 , . . . , xn ∈ R. The extension operator E : W 1,p (Ω) → W 1,p (Rn ) defined at (8.40) is clearly linear and bounded because E u W 1,p (Rn ) ≤ 2 u W 1,p (Ω) .

2

Figure 8.5.1. The open covering of the set Ω. For every ball Bi = B(xi , ri ) there is a C 1 bijection ϕi mapping the open unit ball B ⊂ Rn onto Bi . For those balls Bi having center on the boundary Ω, the positive half-ball B + = B ∩ {y1 > 0} is mapped onto the intersection Bi ∩ Ω.

2. To handle the general case, we use a partition of unity. For every x ∈ Ω (the closure of Ω), choose a radius rx > 0 such that the open ball B(x, rx ) centered at x with radius rx 0 satisfies the following conditions: • If x ∈ Ω, then B(x, rx ) ⊂ Ω. . Moreover, calling B = • If x ∈ ∂Ω, then B(x, rx ) ⊂ Ω. B(0, 1) the open unit ball in Rn , there exists a C 1 bijection ϕx : B → B(x, rx ), whose inverse is also C 1 , which maps the half-ball B

+

. =

+ n 2 y = (y1 , y2 , . . . , yn ) ; yi < 1 , y1 > 0 i=1

onto the set B(x, rx ) ∩ Ω. Choosing rx > 0 sufficiently small, the existence of such a bijection follows from the assumption that Ω has a C 1 boundary. Since Ω is bounded, its closure Ω is compact. Hence it can be covered with finitely many balls Bi = B(xi , ri ), i = 1, . . . , N . Let ϕi : B → Bi be the corresponding bijections. Recall that ϕi maps B + onto Bi ∩ Ω.

8.6. Embedding theorems

163

3. Let η1 , . . . , ηN be a smooth partition of unity subordinate to the above covering. For every x ∈ Ω we thus have (8.41)

u(x) =

N

ηi (x)u(x) .

i=1

We split the set of indices as {1, 2, . . . , N } = I ∪ J , where I contains the indices with xi ∈ Ω while J contains the indices with xi ∈ ∂Ω. For every i ∈ J , we have ηi u ∈ W 1,p (Bi ∩ Ω). Hence by Theorem 8.25 (iv), one has (ηi u) ◦ ϕi ∈ W 1,p (B + ). Applying the extension operator E defined at (8.40), one obtains E (ηi u) ◦ ϕi ∈ W 1,p (B + ) , E (ηi u) ◦ ϕi ◦ ϕ−1 ∈ W 1,p (Bi ) . i Summing all these extensions together, we define . Eu = ηi u + E (ηi u) ◦ ϕi ◦ ϕ−1 i . i∈I

i∈J

It is now clear that the extension operator E satisfies all requirements. Indeed, (i) follows from (8.41). Property (ii) stems from the fact that, for every u ∈ W 1,p (Ω), the support of Eu is contained in N i=1 Bi ⊆ Ω. Finally, since E is defined as the sum of finitely many bounded linear operators, the bound (iii) holds, for some constant C.

8.6. Embedding theorems In one space dimension, a function u : R → R which admits a weak derivative Du ∈ L1 (R) is absolutely continuous (after changing its values on a set of measure zero). On the other hand, if Ω ⊆ Rn with n ≥ 2, there exist functions u ∈ W 1,p (Ω) which are not continuous and not even bounded. This is indeed the case of the function u(x) = |x|−γ , for 0 < γ < n−p p . In several applications to PDEs or to the calculus of variations, it is important to understand the degree of regularity enjoyed by functions u ∈ W k,p (Rn ). We shall prove two basic results in this direction. 1. (Morrey) If p > n, then every function u ∈ W 1,p (Rn ) is Hölder continuous (after a modification on a set of measure zero). 2. (Gagliardo-Nirenberg) If p < n, then every function u ∈ W 1,p (Rn ) ∗ p2 lies in the space Lp (Rn ), with the larger exponent p∗ = p + n−p .

164

8. Sobolev Spaces

In both cases, the result can be stated as an embedding theorem: after a modification on a set of measure zero, each function u ∈ W 1,p (Ω) also lies in some other Banach space X. Here X = C 0,γ or X = Lq for some q > p. The basic approach is as follows: (I): Prove an a priori bound valid for all smooth functions. Given any function u ∈ C ∞ (Ω) ∩ W k,p (Ω), one proves that u also lies in another Banach space X and that there exists a constant C depending on k, p, Ω but not on u, such that (8.42)

u X ≤ C u W k,p

for all u ∈ C ∞ (Ω) ∩ W k,p (Ω) .

(II): Extend the embedding to the entire space, by continuity. Since C ∞ is dense in W k,p , for every u ∈ W k,p (Ω) we can find a sequence of functions un ∈ C ∞ (Ω) such that u − un W k,p → 0. By (8.42), lim sup um − un X ≤ lim sup C um − un W k,p = 0 . m,n→∞

m,n→∞

Therefore the functions un also form a Cauchy sequence in the space X. By completeness, un → u ˜ for some u ˜ ∈ X. Observing that u ˜(x) = u(x) for a.e. x ∈ Ω, we conclude that, up to a modification on a set N ⊂ Ω of measure zero, each function u ∈ W k,p (Ω) also lies in the space X. 8.6.1. Morrey’s inequality. In this section we prove that, if u ∈ W 1,p (Rn ), where the exponent p is bigger than the dimension n of the space, then u coincides a.e. with a Hölder continuous function. Theorem 8.30 (Morrey’s inequality). Assume n < p < ∞ and set . γ = 1 − np > 0. Then there exists a constant C, depending only on p and n, such that (8.43)

u C 0,γ (Rn ) ≤ C u W 1,p (Rn )

for every u ∈ C 1 (Rn ) ∩ W 1,p (Rn ). Proof. Before giving the actual proof, we outline the underlying idea. From an integral estimate on the gradient of the function u, say (8.44) |∇u(x)|p dx ≤ C0 , Rn

we seek a pointwise estimate of the form (8.45)

|u(y) − u(y )| ≤ C1 |y − y |γ

for all y, y ∈ Rn .

8.6. Embedding theorems

165

Figure 8.6.1. Proving Morrey’s inequality. Left: the values u(y) and u(y ) are compared with the average value of u on an (n − 1)dimensional ball (the shaded region) centered at the midpoint z and contained in the hyperplane H perpendicular to the vector y − y . Center and right: a point x in the cone Γ is described in terms of the coordinates (r, ξ) ∈ [0, ρ] × B1 .

To achieve (8.45), a natural attempt is to write 1 . / ) d ( |u(y) − u(y )| = dθ u θy + (1 − θ)y dθ 0 (8.46) 1 ≤ ∇u(θy + (1 − θ)y ) |y − y | dθ . 0

However, the integral on the right-hand side of (8.46) only involves values of ∇u over the segment joining y with y . If the dimension of the space is n > 1, this segment has zero measure. Hence the integral in (8.46) can be arbitrarily large, even if the integral in (8.44) is small. To address this difficulty, we shall compare both values u(y), u(y ) with the average value uA of the function u over an (n − 1)-dimensional ball centered at the midpoint z = y+y 2 , as shown in Figure 8.6.1, left. Notice that the difference |u(y)−uA | can be estimated by an integral of |∇u| ranging over a cone of dimension n. In this way the bound (8.44) can thus be brought into play. 1. We now begin the proof, with a preliminary computation. On Rn , consider the cone ⎧ ⎫ n ⎨ ⎬ . Γ = x = (x1 , x2 , . . . , xn ) ; x2j ≤ x21 , 0 < x1 < ρ ⎩ ⎭ j=2

166

8. Sobolev Spaces

and the function (8.47)

ψ(x) =

1 xn−1 1

.

p p−1

be the conjugate exponent of p, so that 1p + 1q = 1. We compute q q ρ 1 1 q n−1 ψ Lq (Γ) = dx = cn−1 x1 dx1 n−1 xn−1 0 Γ x1 1

Let q =

ρ

s(n−1)(1−q) ds ,

= cn−1 0

where the constant cn−1 gives the volume of the unit ball in Rn−1 . Therefore, ψ ∈ Lq (Γ) if and only if n < p. In this case, 1 ρ p−n p−1 q p−n p − n−1 p−1 (8.48) ψ Lq (Γ) = cn−1 s ds = c ρ p−1 = cρ p , 0

for some constant c depending only on n and p. . 2. Consider any two distinct points y, y ∈ Rn . Let ρ = 12 |y − y |. The . hyperplane passing through the midpoint z = y+y and perpendicular to 2 the vector y − y has equation H = x ∈ Rn ; x − z , y − y = 0 . Inside H, consider the (n − 1)-dimensional ball centered at z with radius ρ, . Bρ = x ∈ H ; |x − z| < ρ . Calling uA the average value of u on the ball Bρ , the difference |u(y) − u(y )| will be estimated as (8.49) |u(y) − u(y )| ≤ |u(y) − uA | + uA − u(y ) . 3. By a translation and a rotation of coordinates, we can assume n n y = (0, . . . , 0) ∈ R , Bρ = x = (x1 , x2 , . . . , xn ) ; x1 = ρ , x2i ≤ ρ2 . i=2

To compute the average value uA , let B1 be the unit ball in Rn−1 , and let cn−1 be its (n − 1)-dimensional measure. Points in the cone Γ will be described using an alternative system of coordinates. To the point with coordinates (r, ξ) = (r, ξ2 , . . . , ξn ) ∈ [0, ρ] × B1 we associate the point x(r, ξ) ∈ Γ with components (8.50)

(x1 , x2 , . . . , xn ) = (r, rξ) = (r, rξ2 , . . . , rξn ).

Define U (r, ξ) = u(r, rξ), and observe that U (0, ξ) = u(0) for every ξ.

8.6. Embedding theorems

167

Therefore

ρ.

U (ρ, ξ) = U (0, ξ) + 0

(8.51)

uA − u(0) =

1

cn−1

/ ∂ U (r, ξ) dr , ∂r

ρ 0

B1

∂ U (r, ξ) dr dξ . ∂r

We now change variables, transforming the integral (8.51) over [0, ρ] × B1 into an integral over the cone Γ. Computing the Jacobian matrix of the transformation (8.50), we find that its determinant is rn−1 ; hence dx1 dx2 · · · dxn = rn−1 dr dξ2 · · · dξn . Moreover, since |ξ| ≤ 1, the directional derivative of u in the direction of the vector (1, ξ2 , . . . , ξn ) is estimated by n ∂ = ux + (8.52) ξ u U (r, ξ) i x 1 i ≤ 2|∇u(r, ξ)| . ∂r i=2

Using (8.52) in (8.51) and using the estimate (8.48) on the Lq norm of the . function ψ(x) = x11−n , we obtain 1 2 |∇u(x)| dx uA − u(0) ≤ n−1 cn−1 Γ x1 2 p (8.53) ≤ q= ψ Lq (Γ) ∇u Lp (Γ) cn−1 p−1 ≤Cρ

p−n p

u W 1,p (Rn )

for some constant C. Notice that the last two steps follow from Hölder’s inequality and (8.48). 4. Using (8.53) to estimate each term on the right-hand side of (8.49) and recalling that ρ = 12 |y − y |, we conclude that (8.54)

|u(y) − u(y )| ≤ 2C

|y − y | 2

p−n p

u W 1,p (Rn ) .

This shows that u is Hölder continuous with exponent γ =

p−n p .

5. To estimate supy |u(y)|, we observe that, by (8.54), for some constant C1 one has |u(y)| ≤ |u(x)| + C1 u W 1,p (Rn )

for all x ∈ B(y, 1) .

168

8. Sobolev Spaces

Taking the average of the right-hand side over the ball centered at y with radius one, we obtain (8.55) |u(y)| ≤ −

|u(x)| dx + C1 u W 1,p (Rn ) ≤ C2 u Lp (Rn ) + C1 u W 1,p (Rn ) .

B(y,1)

6. Together, (8.54)–(8.55) yield |u(y) − u(y )| . u C 0,γ (Rn ) = sup |u(y)| + sup ≤ C u W 1,p (Rn ) , |y − y |γ y y =y for some constant C depending only on p and n.

Since C ∞ is dense in W 1,p , Morrey’s inequality yields Corollary 8.31 (Embedding). Let Ω ⊂ Rn be a bounded open set with C 1 . boundary. Assume n < p < ∞ and set γ = 1 − np > 0. Then every function f ∈ W 1,p (Ω) coincides a.e. with a function f˜ ∈ C 0,γ (Ω). Moreover, there exists a constant C such that f˜ C 0,γ ≤ C f W 1,p

(8.56)

for all f ∈ W 1,p (Ω).

. = Proof. 1. Let Ω {x ∈ Rn ; d(x, Ω) < 1} be the open neighborhood of radius one around the set Ω. By Theorem 8.29 there exists a bounded extension operator E, which extends each function f ∈ W 1,p (Ω) to a function Ef ∈ W 1,p (Rn ) with support contained inside Ω. 2. Since C 1 (Rn ) is dense in W 1,p (Rn ), we can find a sequence of functions gn ∈ C 1 (Rn ) converging to Ef in W 1,p (Rn ). By Morrey’s inequality lim sup gm − gn C 0,γ (Rn ) ≤ C lim sup gm − gn W 1,p (Rn ) = 0. m,n→∞

m,n→∞

This proves that the sequence (gn )n≥1 is also a Cauchy sequence in the space C 0,γ . Therefore it converges to a limit function g ∈ C 0,γ (Rn ), uniformly for x ∈ Rn . 3. Since gn → Ef in W 1,p (Rn ), we also have g(x) = (Ef )(x) for a.e. x ∈ Rn . In particular, g(x) = f (x) for a.e. x ∈ Ω. Since the extension operator E is bounded, from the bound (8.43) we deduce (8.56). Remark 8.32. The conclusion of Corollary 8.31 remains valid if Ω = Rn .

8.6. Embedding theorems

169

8.6.2. The Gagliardo-Nirenberg inequality. Next, we study the case 1 ≤ p < n. We define the Sobolev conjugate of p as np . (8.57) p∗ = > p. n−p Notice that p∗ depends not only on p but also on the dimension n of the space: 1 1 1 = − . ∗ p p n

(8.58)

As a preliminary, we describe a useful application of the generalized Hölder inequality; see (A.26) in the Appendix. Let n − 1 nonnegative func1

tions g1 , g2 , . . . , gn−1 ∈ L1 (Ω) be given. Since gin−1 ∈ Ln−1 (Ω) for each i, using the generalized Hölder inequality, one obtains n−1 n−1 1 1 1 1 1 3 3 n−1 (8.59) g1n−1 g2n−1 · · · gn−1 ds ≤ gin−1 Ln−1 = gi Ln−1 . 1 Ω

i=1

i=1

Theorem 8.33 (Gagliardo-Nirenberg inequality). Assume 1 ≤ p < n. Then there exists a constant C, depending only on p and n, such that (8.60)

f Lp∗ (Rn ) ≤ C ∇f Lp (Rn )

for all f ∈ Cc1 (Rn ) .

Figure 8.6.2. ∞ Proving the Gagliardo-Nirenberg inequality. The integral −∞ |Dx1 f (s1 , x2 , x3 )| ds1 depends on x2 , x3 but not on ∞ x1 . Similarly, the integral −∞ |Dx2 f (x1 , s2 , x3 )| ds2 depends on x1 , x3 but not on x2 .

Proof. 1. For each i ∈ {1, . . . , n} and every point x = (x1 , . . . , xi , . . . , xn ) ∈ Rn , since f has compact support, we can write xi f (x1 , . . . , xi , . . . , xn ) = Dxi f (x1 , . . . , si , . . . , xn ) dsi . −∞

170

8. Sobolev Spaces

In turn, this yields |f (x1 , . . . , xn )| ≤

|f (x)|

(8.61)

∞ −∞

n n−1

≤

Dxi f (x1 , . . . , si , . . . , xn ) dsi ,

n 3

∞ −∞

i=1

1 ≤ i ≤ n,

|Dxi f (x1 , . . . , si , . . . , xn )| dsi

1 n−1

.

We now integrate (8.61) with respect to x1 . Observe that the first factor on the right-hand side does not depend on x1 . This factor behaves like a constant and can be taken out of the integral. The product of the remaining n − 1 factors is handled using (8.59). This yields (8.62) ∞

−∞

n

|f | n−1 dx1 ≤

∞ −∞

≤

|Dx1 f | ds1

−∞

∞

n 3

−∞ i=2

∞

1 n−1

|Dx1 f | ds1

1 n−1

n 3 i=2

∞ −∞

∞ −∞

|Dxi f | dsi

1 n−1

dx1

∞ −∞

|Dxi f | dsi dx1

1 n−1

.

Notice that the second inequality was obtained ∞by applying the generalized H¨ older inequality to the n − 1 functions gi = −∞ |Dxi f |dsi , i = 2, . . . , n. We now integrate both sides of (8.62) with respect to x2 . Observe that one of the factors appearing in the product on the right-hand side of (8.62) does not depend on the variable x2 (namely, the one involving integration with respect to s2 ). This factor behaves like a constant and can be taken out of the integral. The product of the remaining n − 2 factors is again estimated using Hölder’s inequality. This yields (8.63) ∞

∞

−∞

−∞

n

|f | n−1 dx1 dx2

≤

∞ −∞

×

∞ −∞

n 3 i=3

∞ −∞

|Dx1 f | dx1 dx2

∞ −∞

∞ −∞

1 n−1

∞

−∞

∞ −∞

|Dxi f | dsi dx1 dx2

|Dx2 f | dx1 dx2

1 n−1

.

1 n−1

8.6. Embedding theorems

171

Proceeding in the same way, after n integrations we obtain (8.64) ∞ ∞ n ··· |f | n−1 dx1 · · · dxn −∞

≤

−∞

n 3 i=1

∞ −∞

···

∞ −∞

|Dxi f | dx1 · · · dxn

1 n−1

≤

|∇f | dx

Rn

n n−1

.

This already implies f Ln/(n−1) =

(8.65)

Rn

|f |

n n−1

n−1 n

dx

≤

|∇f | dx ,

Rn

proving the theorem in the case where p = 1 and p∗ =

n n−1 .

2. To cover the general case where 1 < p < n, we apply (8.65) to the function . g = |f |β

(8.66)

. p(n − 1) β = . n−p

with

Using the standard Hölder inequality, one obtains (8.67) n−1 n βn |f | n−1 dx ≤ β|f |β−1|∇f | dx Rn

Rn

≤ β

Rn

|f |

(β−1)p p−1

p−1 p

1

p

|∇f | dx p

dx Rn

.

Our choice of β in (8.66) yields (β − 1)p βn np = = = p∗ . p−1 n−1 n−p Therefore, from (8.67) it follows that Rn

n−1

p∗

n

|f | dx

Observing that

n−1 n

−

≤ β

p−1 p p∗

Rn

=

Rn

n−p np

|f | dx

p∗

1 p∗

=

p−1

|f | dx

1 p∗ ,

1

p

Rn

p

.

we conclude that

≤ C

|∇f | dx p

1 |∇f | dx p

Rn

p

.

172

8. Sobolev Spaces ∗

If the domain Ω ⊂ Rn is bounded, then Lq (Ω) ⊆ Lp (Ω) for every q ∈ [1, p∗ ]. Using the Gagliardo-Nirenberg inequality, we obtain Corollary 8.34 (Embedding). Let Ω ⊂ Rn be a bounded open domain with C 1 boundary, and assume 1 ≤ p < n. Then, for every q ∈ [1, p∗ ] with . np p∗ = n−p , there exists a constant C such that (8.68)

f Lq (Ω) ≤ C f W 1,p (Ω)

for all f ∈ W 1,p (Ω) .

. = Proof. Let Ω {x ∈ Rn ; d(x, Ω) < 1} be the open neighborhood of radius one around the set Ω. By Theorem 8.29 there exists a bounded extension operator E : W 1,p (Ω) → W 1,p (Rn ), with the property that Ef for every f ∈ W 1,p (Ω). Applying the Gagliardois supported inside Ω, Nirenberg inequality to Ef , for suitable constants C1 , C2 , C3 we obtain f Lq (Ω) ≤ C1 f Lp∗ (Ω) ≤ C2 Ef Lp∗ (Rn ) ≤ C3 f W 1,p (Ω) .

8.6.3. High-order Sobolev estimates. Let Ω ⊂ Rn be a bounded open set with C 1 boundary, and let u ∈ W k,p (Ω). The number n k− p will be called the net smoothness of u. As in Figure 8.6.3, let m be the integer part and let 0 ≤ γ < 1 be the fractional part of this number, so that n (8.69) k− = m+γ. p In the following, we say that a Banach space X is continuously embedded in a Banach space Y if X ⊆ Y and there exists a constant C such that u Y ≤ C u X

for all u ∈ X .

Figure 8.6.3. Computing the “net smoothness” of a function f ∈

W k,p ⊂ C m,γ .

Theorem 8.35 (General Sobolev embeddings). Let Ω ⊂ Rn be a bounded open set with C 1 boundary, and consider the space W k,p (Ω). Let m, γ be as in (8.69). Then the following continuous embeddings hold.

8.6. Embedding theorems

173

(i) If k − np < 0, then W k,p (Ω) ⊆ Lq (Ω), with 1 n − k . n p (ii) If k −

n p

1 q

=

1 p

−

k n

=

= 0, then W k,p (Ω) ⊆ Lq (Ω) for every 1 ≤ q < ∞.

(iii) If m ≥ 0 and γ > 0, then W k,p (Ω) ⊆ C m,γ (Ω). (iv) If m ≥ 1 and γ = 0, then for every 0 ≤ γ < 1 one has W k,p (Ω) ⊆ C m−1,γ (Ω). Remark 8.36. Functions in a Sobolev space are only defined up to a set of measure zero. By saying that W k,p (Ω) ⊆ C m,γ (Ω) we mean the following. For every u ∈ W k,p (Ω) there exists a function u ˜ ∈ C m,γ (Ω) such that u ˜(x) = u(x) for a.e. x ∈ Ω. Moreover, there exists a constant C, depending on k, p, m, γ but not on u, such that u C m,γ (Ω) ≤ C ˜ u W k,p (Ω) . Proof of the theorem. 1. We start by proving (i). Assume k − np < 0 and let u ∈ W k,p (Ω). Since D α u ∈ W 1,p (Ω) for every |α| ≤ k − 1, the Gagliardo-Nirenberg inequality yields D α u Lp∗ (Ω) ≤ C u W k,p (Ω) ,

|α| ≤ k − 1 .

k−1,p∗

(Ω), where p∗ is the Sobolev conjugate of p. . . This argument can be iterated. Set p1 = p∗ , p2 = p∗1 , . . . , pj = p∗j−1 . By (8.58) this means 1 1 1 1 1 j = = − , ..., − , p1 p n pj p n Therefore u ∈ W

provided that jp < n. Using the Gagliardo-Nirenberg inequality several times, we obtain (8.70)

W k,p (Ω) ⊆ W k−1,p1 (Ω) ⊆ W k−2,p2 (Ω) ⊆ · · · ⊆ W k−j,pj (Ω) .

After k steps we find that u ∈ W 0,pk (Ω) = Lpk (Ω), with Hence pk = q and (i) is proved.

1 pk

=

1 p

− nk = 1q .

2. In the special case kp = n, repeating the above argument, after k − 1 steps we find 1 1 k−1 1 = − = . pk−1 p n n Therefore pk−1 = n and W k,p (Ω) ⊂ W 1,n (Ω) ⊆ W 1,n−ε (Ω)

174

8. Sobolev Spaces

for every ε > 0. Using the Gagliardo-Nirenberg inequality once again, we obtain n(n − ε) n2 − εn u ∈ W 1,n−ε (Ω) ⊆ Lq (Ω), q = = . n − (n − ε) ε Since ε > 0 was arbitrary, this proves (ii). 3. To prove (iii), assume that m ≥ 0 and γ > 0 and let u ∈ W k,p (Ω). We use the embeddings (8.70), choosing j to be the smallest integer such that pj > n. We thus have 1 1 j 1 1 j−1 < − = < − , u ∈ W k−j,pj (Ω) . p n pj n p n Hence, for every multi-index α with |α| ≤ k − j − 1, Morrey’s inequality yields n n D α u ∈ W 1,pj (Ω) ⊆ C 0,γ (Ω) with γ = 1 − = 1− +j. pj p Since α was any multi-index with length ≤ k − j − 1, the above implies u ∈ C k−j−1 ,γ (Ω). To conclude the proof of (iii), it suffices to check that n n k− = (k − j − 1) + 1 − + j , p p so that m = k − j − 1 is the integer part of the number k − np , while γ is its fractional part. . 4. To prove (iv), assume that m ≥ 1 and that j = np is an integer. Let u ∈ W k,p (Ω), and fix any multi-index with |α| ≤ j − 1. Using the GagliardoNirenberg inequality as in step 2, we obtain D α u ∈ W k−j,p (Ω) ⊆ W 1,q (Ω) for every 1 < q < ∞. Hence, by Morrey’s inequality D α u ∈ W 1,q (Ω) ⊆ C

0,1− n q

(Ω).

Since q can be chosen arbitrarily large, this proves (iv).

Example 8.37. Let Ω be the open unit ball in R5 , and assume u ∈ W 4,2 (Ω). Applying the Gagliardo-Nirenberg inequality two times and then Morrey’s inequality, we obtain 10

1

u ∈ W 4,2 (Ω) ⊂ W 3, 3 (Ω) ⊂ W 2,10 (Ω) ⊂ C 1, 2 (Ω) . Observe that the net smoothness of u is k −

n p

=4−

5 2

= 1 + 12 .

8.7. Compact embeddings

175

8.7. Compact embeddings Let Ω ⊂ Rn be a bounded open set with C 1 boundary. In this section we study the embedding W 1,p (Ω) ⊂ Lq (Ω) in greater detail and show that, when 1q > p1 − n1 , this embedding is compact. Namely, from any sequence (um )m≥1 which is bounded in W 1,p one can extract a subsequence which converges in Lq . As a preliminary we observe that, if p > n, then every function u ∈ W 1,p (Ω) is Hölder continuous. In particular, if (um )m≥1 is a bounded sequence in W 1,p (Ω), then the functions um are equicontinuous and uniformly bounded. By Ascoli’s compactness theorem we can extract a subsequence (umj )j≥1 which converges to a continuous function u uniformly on Ω. Since Ω is bounded, this implies umj − u Lq (Ω) → 0 for every q ∈ [1, ∞]. This already shows that the embedding W 1,p (Ω) ⊂ Lq (Ω) is compact whenever p > n and 1 ≤ q ≤ ∞. In the remainder of this section we thus focus on the case p < n. By the Gagliardo-Nirenberg inequality, the space W 1,p (Ω) is continuously em∗ np bedded in Lp (Ω), where p∗ = n−p . In turn, since Ω is bounded, for every ∗ ∗ 1 ≤ q ≤ p we have the continuous embedding Lp (Ω) ⊆ Lq (Ω). Theorem 8.38 (Rellich-Kondrachov compactness theorem). Let Ω ⊂ Rn be a bounded open set with C 1 boundary. Assume 1 ≤ p < n. Then for . np each 1 ≤ q < p∗ = n−p one has the compact embedding (8.71)

W 1,p (Ω) ⊂⊂ Lq (Ω).

Proof. 1. Let (um )m≥1 be a bounded sequence in W 1,p (Ω). Using Theorem 8.29 on the extension of Sobolev functions, we can assume that all functions um are defined on the entire space Rn and vanish outside a compact set K: (8.72)

. = Supp(um ) ⊆ K ⊂ Ω B(Ω, 1) .

Here the right-hand side denotes the open neighborhood of radius one around the set Ω. is bounded, we have Since q < p∗ and Ω um Lq (Rn ) = um Lq (Ω) ≤ C um Lp∗ (Ω) ≤ C um W 1,p (Ω)

for some constants C, C . Hence the sequence um is uniformly bounded in Lq (Rn ).

176

8. Sobolev Spaces

. 2. Consider the mollified functions uεm = Jε ∗ um . By (8.72) we can assume We claim that that all these functions are supported inside Ω. (8.73) uεm − um Lq (Ω) → 0 as

ε → 0,

uniformly with respect to m.

Indeed, if um is smooth, then (performing the changes of variable y = εy and z = x − εty) ε um (x) − um (x) = Jε (y ) [um (x − y ) − um (x)] dy |y | 2 the conclusion is a special case of Theorem 8.38 with p = 2. When n = 1, every function in H 1 (Ω) is Hölder continuous and the result follows from Ascoli’s theorem. When n = 2, we can apply the previous theorem with p = 3/2, p∗ = 6, q = 2 and obtain W 1,2 (Ω) ⊂ W 1, 3/2 (Ω) ⊂⊂ L2 (Ω) .

As an application of the compact embedding theorem, we now prove an estimate on the difference between a function u and its average value on a domain Ω. Theorem 8.40 (Poincar´ e’s inequality. II). Let Ω ⊂ Rn be a bounded, connected open set with C 1 boundary, and let p ∈ [1, ∞]. Then there exists a constant C depending only on p and Ω such that u − − u dx (8.77) ≤ C ∇u Lp (Ω) , p Ω

L (Ω)

for every u ∈ W 1,p (Ω) . Proof. If the conclusion were false, one could find a sequence of functions uk ∈ W 1,p (Ω) with uk − − uk dx > k ∇uk Lp (Ω) p Ω

L (Ω)

for every k = 1, 2, . . .. Then the renormalized functions uk − − uk dx . Ω vk = uk − − uk dx p Ω

L (Ω)

satisfy (8.78) − vk dx = 0 , Ω

vk Lp (Ω) = 1,

Dvk Lp (Ω)
n, then by (8.56) the functions vk are uniformly bounded and Hölder continuous. Using Ascoli’s compactness

8.8. Differentiability properties

179

theorem, we can thus find a subsequence that converges in L∞ (Ω) to some function v. By (8.78), the sequence of weak gradients also converges, namely ∇vk → 0 in Lp (Ω). By Lemma 8.14, the zero function is the weak gradient of the limit function v. We now have

− v dx = lim − vk dx = 0. k→∞

Ω

Ω

Moreover, since ∇v = 0 ∈ by Corollary 8.16 the function v must be constant on the connected set Ω; hence v(x) = 0 for a.e. x ∈ Ω. But this is in contradiction to Lp (Ω),

v Lp (Ω) = lim vk Lp (Ω) = 1.

k→∞

8.8. Differentiability properties By Morrey’s inequality, if Ω ⊂ Rn and w ∈ W 1,p (Ω) with p > n, then w coincides a.e. with a Hölder continuous function. Indeed, after a modification on a set of measure zero, we have 1/p (8.79)

|w(x) − w(y)| ≤ C|x − y|

1− n p

B(x, |y−x|)

|∇w(z)|p dz

.

This by itself does not imply that u should be differentiable in a classical sense. Indeed, there exist Hölder continuous functions that are nowhere differentiable. However, for functions in a Sobolev space a better regularity result can be proved. Theorem 8.41 (Almost everywhere differentiability). Let Ω ⊆ Rn 1,p and let u ∈ Wloc (Ω) for some p > n. Then u is differentiable at a.e. point x ∈ Ω, and its gradient coincides with the weak gradient. 1,p Proof. Let u ∈ Wloc (Ω). Since the weak derivatives are in Lploc , the same . is true of the weak gradient ∇u = (Dx1 u, . . . , Dxn u). By the Lebesgue differentiation theorem, for a.e. x ∈ Ω we have (8.80) − |∇u(x) − ∇u(z)|p dz → 0 as r → 0 . B(x,r)

Fix a point x for which (8.80) holds, and define (8.81)

. w(y) = u(y) − u(x) − ∇u(x) · (y − x).

180

8. Sobolev Spaces

1,p Observing that w ∈ Wloc (Ω), we can apply the estimate (8.79) and obtain

|w(y) − w(x)| = |w(y)| = |u(y) − u(x) − ∇u(x) · (y − x)| 1/p 1− n ≤ C |y − x| p |∇u(x) − ∇u(z)|p dz ≤ C |y − x| −

B(x, |y−x|)

B(x, |y−x|)

1/p

|∇u(x) − ∇u(z)|p dz

for suitable constants C, C . Therefore |w(y) − w(x)| → 0 |y − x|

as |y − x| → 0 .

By the definition of w in (8.81), this means that u is differentiable at x in the classical sense and its gradient coincides with its weak gradient.

8.9. Problems 1. Determine which of the following functionals define a distribution on Ω ⊆ R. ∞ k! Dk φ(k), with Ω = R. (i) Λ(φ) = k=1 ∞

(ii) Λ(φ) =

(iii) Λ(φ) =

2−k Dk φ(1/k), with Ω = R.

k=1 ∞

φ(1/k) , with Ω = R. k

k=1 ∞

(iv) Λ(φ) =

0

φ(x) dx, with Ω = ]0, ∞[ . x2

2. Give a direct proof that, if f ∈ W 1,p ( ]a, b[ ) for some a < b and 1 < p < ∞, then, by possibly changing f on a set of measure zero, one has 1

|f (x) − f (y)| ≤ C |x − y|1− p

for all x, y ∈ ]a, b[ .

Compute the best possible constant C. 3. Consider the open square . Q = {(x1 , x2 ) ; 0 < x1 < 1, 0 < x2 < 1} ⊂ R2 . Let f ∈ W 1,1 (Q) be a function whose weak derivative satisfies Dx1 f (x) = 0 for a.e. x ∈ Q. Prove that there exists a function g ∈ L1 ([0, 1]) such that f (x1 , x2 ) = g(x2 )

for a.e. (x1 , x2 ) ∈ Q .

8.9. Problems

181

4. Let Ω ⊆ Rn be an open set and assume f ∈ L1loc (Ω). Let g = Dx1 f be the weak derivative of f with respect to x1 . If f is C 1 restricted to an open subset Ω ⊆ Ω, prove that g coincides with the partial derivative ∂f /∂x1 at a.e. point x ∈ Ω . 5. (i) Prove that, if u ∈ W 1,∞ (Ω) for some open, convex set Ω ⊆ Rn , then u coincides a.e. with a Lipschitz continuous function. (ii) Show that there exists a (nonconvex), connected open set Ω ⊂ Rn and a function u ∈ W 1,∞ (Ω) that does not coincide a.e. with a Lipschitz continuous function. 6. (Rademacher’s theorem) Let Ω ⊂ Rn be an open set and let u : Ω → R be a bounded, Lipschitz continuous function. (i) Prove that u ∈ W 1,∞ (Ω). (ii) Prove that u is differentiable at a.e. point x ∈ Ω. Hint for (i): Consider first the case where Ω is convex. To construct the weak derivative Dxi , for any fixed x1 , . . . , xi−1 , xi+1 , . . . , xn , consider the absolutely continuous function s → u(x1 , . . . , xi−1 , s, xi+1 , . . . , xn ). n 7. Let Ω = B(0, 1) be the open in R , with n ≥ 2. Prove that the unit ball 1 is in W 1,n (Ω). unbounded function f (x) = ln ln 1 + |x|

8. (i) Let Ω = ] − 1, 1[. Consider the linear map T : C 1 (Ω) → R defined by T f = f (0). Show that this map can be continuously extended, in a unique way, to a bounded linear functional T : W 1,1 (Ω) → R. (ii) Let Ω = B(0, 1) ⊂ R2 be the open unit disc. Consider again the linear map T : C 1 (Ω) → R defined by T f = f (0). For which values of p can this map be continuously extended to a bounded linear functional T : W 1,p (Ω) → R? 9. Determine for which values of p ≥ 1 a generic function f ∈ W 1,p (R3 ) admits a . trace along the x1 -axis. In other words, set Γ = {(t, 0, 0) ; t ∈ R} ⊂ R3 and consider the map T : Cc1 (R3 ) → Lp (Γ), where T f = f|Γ is the restriction of f to Γ. Find values of p such that this map admits a continuous extension T : W 1,p (R3 ) → Lp (Γ). 10. Let V ⊂ Rn be a subspace of dimension m and let V ⊥ be the perpendicular subspace of dimension n − m. Let u ∈ W 1,p (Rn ) with m < p < n. Show that, after a modification on a set of measure zero, the following hold. (i) For a.e. y ∈ V ⊥ (with respect to the (n − m)-dimensional measure), the restriction of u to the affine subspace y + V is Hölder continuous with exponent γ = 1 − m p. (ii) The pointwise value u(y) is well defined for a.e. y ∈ V ⊥ . Moreover u(y) Lp (V ⊥ ) ≤ C · u W 1,p (Rn ) for some constant C depending on m, n, p but not on u.

182

8. Sobolev Spaces

11. Let Ω ⊂ Rn be an arbitrary open set (without assuming any regularity of the boundary) and assume p > n. Show that every function f ∈ W01,p (Ω) coincides a.e. with a continuous function f˜. Moreover, there exists a constant C, depending on p, n but not on Ω or f , such that n f˜ C 0,γ (Ω) ≤ C f W 1,p (Ω) with γ = 1 − . p 12. When k = 0, by definition W 0,p (Ω) = Lp (Ω). If 1 ≤ p < ∞, prove that W00,p (Ω) = Lp (Ω) as well. What is W00,∞ (Ω)? 13. Let ϕ : R → [0, 1] be a smooth function such that

1 if r ≤ 0 , ϕ(r) = 0 if r ≥ 1 . Given any f ∈ W k,p (Rn ), prove that the functions fk (x) = f (x) ϕ(|x| − k) converge to f in W k,p (Rn ), for every k ≥ 0 and 1 ≤ p < ∞. As a consequence, show that W0k,p (Rn ) = W k,p (Rn ). . 14. Let R+ = {x ∈ R ; x > 0} and assume u ∈ W 2,p (R+ ). Define the symmetric . / extension of u by setting Eu(x) = u(|x|). Prove that Eu ∈ W 1,p (R) but Eu ∈ 2,p W (R), in general. 15. Let u ∈ Cc1 (Rn ) and fix p, q ∈ [1, ∞[ . For a given λ > 0, consider the rescaled . function uλ (x) = u(λx). (i) Show that there exists an exponent α, depending on n, q, such that uλ Lq (Rn ) = λα u Lq (Rn ) . (ii) Show that there exists an exponent β, depending on n, p, such that ∇uλ Lp (Rn ) = λβ ∇u Lp (Rn ) . (iii) Determine for which values of n, p, q one has α = β. Compare with (8.57). 16. Let Ω ⊂ Rn be a bounded open set with C 1 boundary. Let (um )m≥1 be a sequence of functions which are uniformly bounded in H 1 (Ω). Assuming that um − u L2 → 0, prove that u ∈ H 1 (Ω) and u H 1 ≤ lim inf um H 1 . m→∞

. . 17. Let Ω = {(x1 , x2 ) ; x21 + x22 < 1} be the open unit disc in R2 , and let Ω0 = . Ω \ {(0, 0)} be the unit disc minus the origin. Consider the function f (x) = 1 − |x|. Prove that (see Figure 8.9.1) ⎧ 1,p for 1 ≤ p < ∞ , ⎨ f ∈ W0 (Ω) 1,p f ∈ W0 (Ω0 ) for 1 ≤ p ≤ 2 , ⎩ f ∈ / W01,p (Ω0 ) for 2 < p ≤ ∞ .

8.9. Problems

183

1

1 1

1

Figure 8.9.1. Left: the function f can be approximated in W 1,p

with functions fn having compact support in Ω. Right: the function f can be approximated in W 1,2 with functions gn having compact support in Ω0 . 18. Let Ω = {(x1 , x2 ) ; 0 < x1 < 1, 0 < x2 < 1} be the open unit square in R2 . (i) If u ∈ H 1 (Ω) satisfies meas {x ∈ Ω ; u(x) = 0} > 0 ,

∇u(x) = 0 for a.e. x ∈ Ω ,

prove that u(x) = 0 for a.e. x ∈ Ω. (ii) For every α > 0, prove that there exists a constant Cα (with the following property. If u ∈ H 1 (Ω) is a function such that meas {x ∈ Ω ; u(x) = ) 0} ≥ α, then (8.82)

u L2 (Ω) ≤ Cα ∇u L2 (Ω) .

19. Let Ω ⊆ Rn be an open set, and let K ⊂ Ω be a closed set which is “small”, in the sense that its (n − 1)-dimensional measure is mn−1 (K) = 0. More precisely, assume that the projection of K on every (n − 1)-dimensional hyperplane has zero (n − 1)-dimensional measure. Let u be continuously differentiable on the open set Ω \ K, and assume u ∈ Lp (Ω \ K), ∇u ∈ Lp (Ω \ K). Prove that u ∈ W 1,p (Ω). 20. (i) Find two functions f, g ∈ L1loc (Rn ) such that the product f · g is not locally summable. (ii) Show that, if f, g ∈ L1loc (R) are both bounded and weakly differentiable, then the product f · g is also weakly differentiable and satisfies the usual product rule: Dx (f g) = (Dx f ) · g + f · (Dx g). (iii) Find two functions f, g ∈ L1loc (Rn ) (with n ≥ 2) with the following properties. For every i = 1, . . . , n the first-order weak derivatives Dxi f , Dxi g are well defined. However, the product f ·g does not have any weak derivative (on the entire space Rn ). 21. Let Ω ⊂ Rn be a bounded, connected open set with C 1 boundary, let Ω ⊂ Ω be a nonempty open subset, and let p ∈ [1, ∞]. Prove that there exists a constant

184

8. Sobolev Spaces

C such that

u Lp (Ω) ≤ C u Lp (Ω ) + ∇u Lp (Ω)

for every u ∈ W 1,p (Ω) .

22. Consider the Banach space X = C 0 (]0, 1[) of all bounded continuous functions on the open interval ]0, 1[ , with norm f C 0 = sup0 0 such that n aij (x)ξi ξj ≥ θ|ξ|2 for all x ∈ Ω, ξ ∈ Rn . i,j=1

Remark 9.1. By definition, the uniform ellipticity of the operator L depends only on the coefficients aij . In the symmetric case where aij = aji , the above condition means that for every x ∈ Ω the n × n symmetric matrix A(x) = (aij (x)) is strictly positive definite and its smallest eigenvalue is ≥ θ. 9.1.1. Physical interpretation. As an example, consider a fluid moving with velocity b(x) = (b1 , b2 , b3 )(x) in a domain Ω ⊂ R3 . Let u = u(t, x) describe the density of a chemical dispersed within the fluid. Given any subdomain V ⊂ Ω, assume that the total amount of chemical contained in V changes only due to the inward or outward flux through the boundary ∂V . Namely, d (9.5) u dx = n · (a ∇u) dS − n · (b u) dS . dt V ∂V ∂V Here n(x) denotes the unit outer normal to the set V at a boundary point x, while a > 0 is a constant diffusion coefficient. The first integral on the righthand side of (9.5) describes how much chemical enters through the boundary by diffusion. Notice that this is positive at points where n·∇u > 0. Roughly speaking, this is the case if the concentration of chemical outside the domain V is greater than inside. The second integral (with the minus sign in front) denotes the amount of chemical that moves out across the boundary of V by advection, being transported by the fluid in motion (Figure 9.1.1). Using the divergence theorem, from (9.5) we obtain (9.6) ut dx = a Δu dx − div(b u) dx . V

V

V

9.1. Elliptic equations

187

Figure 9.1.1. As the chemical is transported across the boundary of V , the total amount contained inside V changes in time.

Since the above identity holds on every subdomain V ⊂ Ω, we conclude that u = u(t, x) satisfies the parabolic PDE ut −

a Δu + div(b u) = 0. [diffusion] [advection]

A more general model can describe the following situations: • The diffusion is not uniform throughout the domain. In other words, the coefficient a is not a constant but depends on the location x ∈ Ω. Moreover, the diffusion is not isotropic: in some directions it is faster than in others. All this can be modeled by replacing the constant diffusion matrix A(x) ≡ aI with a more general symmetric matrix A(x) = (aij (x)). • The total amount of u is not conserved. Additional terms are present, accounting for linear decay and for an external source. In n space dimensions, this leads to a linear evolution equation of the form (9.7) n n ( ij ) ( i ) ut − a (x)uxi x + b (x)u x = −c(x) u + f (x) . j

i

i,j=1

i=1

[diffusion]

[advection]

[decay]

[source]

Equation (9.7) can be used to model a variety of phenomena, such as mass transport, heat propagation, etc. In many situations, one is interested in steady states, i.e., in solutions which are independent of time. Setting ut = 0 in (9.7), we obtain the linear elliptic equation (9.8)

−

n n ( ij ) ( i ) a (x)uxi x + b (x)u x + c(x) u = f (x) . j

i,j=1

i

i=1

9.1.2. Classical and weak solutions. By a classical solution of the boundary value problem (9.2) we mean a function u ∈ C 2 (Ω) which satisfies

188

9. Linear Partial Differential Equations

the equation and the boundary conditions at every point. In general, due to the lack of regularity in the coefficients of the equation, problem (9.2) may not have classical solutions. A weaker concept of solution is thus needed. Definition 9.2. A weak solution of (9.2) is a function u ∈ H01 (Ω) such that (9.9) n n ij i a uxi vxj − b u vxi +cuv dx = f v dx for all v ∈ H01 (Ω) . Ω

i,j=1

Ω

i=1

Remark 9.3 (On the concept of weak solution). The boundary condition u = 0 on ∂Ω is taken into account by requiring that u ∈ H01 (Ω). The equality (9.9) is formally obtained by writing (9.10) (Lu) v dx = f v dx for all v ∈ Cc∞ (Ω) Ω

Ω

and integrating by parts. Notice that, if (9.9) holds for every test function v ∈ Cc∞ (Ω), then by an approximation argument the same integral identity remains valid for every v ∈ H01 (Ω). It is important to observe that a function u ∈ H01 may not have weak derivatives of second-order. However, the integral in (9.9) is always well defined, for all u, v ∈ H01 . A convenient way to reformulate the concept of weak solution is the following. On the Hilbert space H01 (Ω), consider the bilinear form ⎛ ⎞ n n . ⎝ (9.11) B[u, v] = aij uxi vxj − bi uvxi + c uv ⎠ dx . Ω

i,j=1

i=1

A function u ∈ H01 is a weak solution of (9.2) provided that (9.12)

for all v ∈ H01 .

B[u, v] = (f, v)L2

Here and in the sequel we use the notation . (9.13) (f, g)L2 = f g dx Ω

for the inner product in H 1 (Ω) (9.14)

L2 (Ω),

. (f, g)H 1 =

to distinguish it from the inner product in

f g dx + Ω

n

fxi gxi dx .

Ω i=1

Remark 9.4 (Choice of the sign). The minus sign in front of the secondorder terms in (9.1) disappears in (9.11), after a formal integration by parts.

9.1. Elliptic equations

189

As will become apparent later, this sign is chosen so that the corresponding quadratic form B[u, v] can be positive definite. Remark 9.5 (General boundary conditions). Given a function g ∈ H 1 (Ω), one can consider the nonhomogeneous boundary value problem

Lu = f, x ∈ Ω, (9.15) u = g, x ∈ ∂Ω . . This can be rewritten as a homogeneous problem for the function u ˜ = u − g, namely

L˜ u = f − Lg, x ∈ Ω, (9.16) u ˜ = 0, x ∈ ∂Ω . Assuming that Lg ∈ L2 (Ω), problem (9.16) is exactly of the same type as (9.2). Remark 9.6 (Operators not in divergence form). A differential operator of the form n n . ij Lu = − a (x)uxi xj + bi (x)uxi + c(x)u i,j=1

i=1

can be rewritten as Lu = −

n (

ij

a (x)uxi

i,j=1

) xj

+

n

i

b (x) +

i=1

n

aijxj(x) uxi + c(x)u .

j=1

Assuming that aij , aijxj , bi , c ∈ L∞ (Ω), a weak solution of the corresponding problem (9.2) can again be obtained by solving (9.12), where the bilinear form B is now defined by ⎛ ⎞ n n n . ⎝ B[u, v] = bi + aij uxi vxj + aijxj uxi v + c uv ⎠ dx . Ω

i,j=1

i=1

j=1

As a first example, consider the boundary value problem

−Δu + u = f, x ∈ Ω, (9.17) u = 0, x ∈ ∂Ω . . Clearly, the operator −Δu = − i uxi xi is uniformly elliptic, because in this case the n×n matrix A(x) = (aij (x)) is the identity matrix, for every x ∈ Ω. The existence of a weak solution to (9.17) can be proved by a remarkably concise argument, based on the Riesz representation theorem.

190

9. Linear Partial Differential Equations

Lemma 9.7. Let Ω ⊂ Rn be a bounded open set. Then for every f ∈ L2 (Ω) the boundary value problem (9.17) has a unique weak solution u ∈ H01 (Ω). The corresponding map f → u is a compact linear operator from L2 (Ω) into H01 (Ω). Proof. By the Rellich-Kondrachov theorem, the canonical embedding ι : H01 (Ω) → L2 (Ω) is compact. Hence its dual operator ι∗ is also compact. Since H01 and L2 are Hilbert spaces, they can be identified with their duals. We thus obtain the following diagram: ι

(9.18)

H01 (Ω) −→ L2 (Ω), ι∗

H01 (Ω) = [H01 (Ω)]∗ ←− [L2 (Ω)]∗ = L2 (Ω) .

For each f ∈ L2 (Ω), the definition of dual operator yields (ι∗ f, v)H 1 = (f, ιv)L2 = (f, v)L2

for all f ∈ L2 (Ω), v ∈ H01 (Ω) .

By (9.14) this means that ι∗ f is precisely the weak solution to (9.17).

9.1.3. Homogeneous second-order elliptic operators. We begin by studying solutions to the elliptic boundary value problem

Lu = f, x ∈ Ω, (9.19) u = 0, x ∈ ∂Ω , assuming that the differential operator L contains only second-order terms: (9.20)

n . Lu = − (aij (x)uxi )xj . i,j=1

We recall that a weak solution of (9.19) is a function u ∈ H01 (Ω) such that (9.21)

B[u, v] = (f, v)L2

for all v ∈ H01 (Ω) ,

where B : H01 × H01 → R is the continuous bilinear form n . (9.22) B[u, v] = aij uxi vxj dx . Ω i,j=1

Theorem 9.8 (Unique solution of the elliptic boundary value problem). Let Ω ⊂ Rn be a bounded open set. Let the operator L in (9.20) be uniformly elliptic, with coefficients aij ∈ L∞ (Ω). Then, for every f ∈ L2 (Ω), the boundary value problem (9.19) has a unique weak solution u ∈ H01 (Ω). The corresponding solution operator, which we denote as L−1 : f → u, is a compact linear operator from L2 (Ω) into H01 (Ω).

9.1. Elliptic equations

191

Proof. The existence and uniqueness of a weak solution to the elliptic boundary value problem (9.19) will be achieved by checking that the bilinear form B in (9.22) satisfies all the assumptions of the Lax-Milgram theorem. 1. The continuity of B is clear. Indeed, n n ij ij B[u, v] ≤ i,j=1 Ω |a uxi vxj | dx ≤ i,j=1 a L∞ uxi L2 vxj L2 ≤ C u H 1 v H 1 . 2. We claim that B is strictly positive definite, i.e., there exists β > 0 such that (9.23)

B[u, u] ≥ β u 2H 1 (Ω)

for all u ∈ H01 (Ω) .

Indeed, since Ω is bounded, Poincaré’s inequality yields the existence of a constant κ such that u 2L2 (Ω) ≤ κ |∇u|2 dx for all u ∈ H01 (Ω) . Ω

On the other hand, the uniform ellipticity condition implies n n ij 2 B[u, u] = a uxi uxj dx ≥ θ uxi dx = θ |∇u|2 dx . Ω i,j=1

Ω

Ω

i=1

Together, the two above inequalities yield u 2H 1 = u 2L2 + ∇u 2L2 ≤ (κ + 1) ∇u 2L2 ≤

κ+1 B[u, u] . θ

This proves (9.23) with β = θ/(κ + 1). 3. By the Lax-Milgram theorem, for every f˜ ∈ H01 (Ω) there exists a unique element u ∈ H01 such that (9.24)

B[u, v] = (f˜, v)H 1

for all v ∈ H01 (Ω) .

Moreover, the map Λ : f˜ → u is continuous, namely u H 1 ≤ β −1 f˜ H 1 . . Choosing f˜ = ι∗ f ∈ H01 (Ω), defined in (9.18), we thus achieve (9.25)

B[u, v] = (ι∗ f, v)H 1 = (f, v)L2

By definition, u is a weak solution of (9.19).

for all v ∈ H01 (Ω) .

192

9. Linear Partial Differential Equations

4. To prove that the solution operator L−1 : f → u is compact, consider the the diagram ι∗

Λ

L2 (Ω) −→ H01 (Ω) −→ H01 (Ω) . By Lemma 9.7, the linear operator ι∗ is compact. Moreover, Λ is continuous. Therefore the composition L−1 = Λ ◦ ι∗ is compact. 9.1.4. Representation of solutions in terms of eigenfunctions. Since H01 (Ω) ⊂ L2 (Ω), the solution operator L−1 described in Theorem 9.8 can also be regarded as a compact operator from L2 (Ω) into itself. In the symmetric case where aij = aji , the operator L−1 is selfadjoint. Hence, by the HilbertSchmidt theorem, it admits a representation in terms of a countable basis of eigenfunctions. Theorem 9.9 (Representation of solutions as a series of eigenfunctions). Assume aij = aji ∈ L∞ (Ω). Then, in the setting of Theorem 9.8, the linear operator L−1 : L2 (Ω) → L2 (Ω) is compact, one-to-one, and selfadjoint. The space L2 (Ω) admits a countable orthonormal basis {φk ; k ≥ 1} consisting of eigenfunctions of L−1 , and one has the representation L−1 f =

(9.26)

∞

λk (f, φk )L2 φk .

k=1

The corresponding eigenvalues λk satisfy (9.27)

lim λk = 0 ,

k→∞

λk > 0 for all k ≥ 1.

Proof. 1. By Theorem 9.8, L−1 is a compact linear operator from L2 (Ω) into itself. To show that L−1 is one-to-one, assume u = L−1 f = 0. Then 0 = B[u, v] = (f, v)L2 for every v ∈ H01 (Ω). In particular, for every test function φ ∈ Cc∞ (Ω) we have f φ dx = 0 . Ω

This implies f (x) = 0 for a.e. x ∈ Ω. Hence Ker(L−1 ) = {0} and the operator L−1 is one-to-one. 2. To prove that L−1 is selfadjoint, assume f, g ∈ L2 (Ω) ,

u = L−1 f ,

v = L−1 g .

9.1. Elliptic equations

193

By the definition of weak solution in (9.9), this implies u, v ∈ H01 (Ω) and (L

−1

f, g)L2 =

u g dx = Ω

f v dx = (f, L−1 g)L2 .

ij

a uxi vxj dx =

Ω i,j

Ω

Note that the second equality follows from the fact that v = L−1 g, using u ∈ H01 (Ω) as a test function. The third equality follows from the fact that u = L−1 f , using v ∈ H01 (Ω) as a test function. 3. Since L−1 is a compact and selfadjoint operator on the separable space L2 (Ω), by the theorem of Hilbert-Schmidt in Chapter 6, there exists of a countable orthonormal basis consisting of eigenfunctions of L−1 . This yields (9.26). By the compactness of the operator L−1 , the eigenvalues satisfy λk → 0. Finally, choosing v = φk as the test function in (9.21), one obtains B[λk φk , φk ] = (φk , φk )L2 = 1. Since the quadratic form B is strictly positive definite, we conclude that λk =

1 > 0. B[φk , φk ]

. Example 9.10. Let Ω = ]0, π[ ⊂ R and Lu = −uxx . Given f ∈ L2 (]0, π[), consider the elliptic boundary value problem

(9.28)

−uxx = f, 0 < x < π, u(0) = u(π) = 0 .

As a first step, we compute the eigenfunctions of the operator Lu = −uxx . Solving the boundary value problem −uxx = μu ,

u(0) = u(π) = 0 ,

we find the eigenvalues and the normalized eigenfunctions ! 2

μk = k ,

φk (x) =

2 sin kx . π

Of course, the inverse operator L−1 has the same eigenfunctions φk , with eigenvalues λk = 1/k 2 .

194

9. Linear Partial Differential Equations

In this special case, formula (9.26) yields the well-known representation of solutions of (9.28) in terms of a Fourier sine series: u(x) = L−1 f =

∞

λk (f, φk )L2 φk

k=1 ∞ 1 = k2

!

π

f (y) 0

k=1

2 sin ky dy π

!

2 sin kx π

π ∞ 2 = f (y) sin ky dy sin kx . πk 2 0 k=1

9.1.5. More general linear elliptic operators. The existence and uniqueness result stated in Theorem 9.8 relied on the fact that the bilinear form B in (9.21) is strictly positive definite on the space H01 (Ω). This property no longer holds for the more general bilinear form B in (9.11), where additional lower-order terms are present. For example, if the function c(x) is large and negative, one may find some u ∈ H01 (Ω) such that B[u, u] < 0. . Example 9.11. Consider the open interval Ω = ]0, π[ . Observe that the operator . Lu = −uxx − 4u is uniformly elliptic on Ω. However, the corresponding bilinear form π B[u, v] = ux vx − 4uv dx 0

is not positive definite on

H01 (Ω). π

For example, taking u(x) = sin x, we find

cos2 x − 4 sin2 x dx = −

B[u, u] = 0

3π . 2

If f (x) = sin 2x, then the boundary value problem

−uxx − 4u = sin 2x, x ∈ ]0, π[ , (9.29) u(0) = u(π) = 0 has no solutions. Indeed, choosing v(x) = sin 2x, for every u ∈ H01 (Ω) an integration by parts yields π π B[u, v] = 2ux cos 2x − 4u sin 2x dx ux vx − 4uv dx = 0

=

0 π

2u sin 2x

0

x

π

dx = 0 =

sin2 2x dx = (f, v)L2 . 0

9.1. Elliptic equations

195

Therefore there is no function u ∈ H01 (Ω) that satisfies (9.12). In this connection one should also notice that the corresponding homogeneous problem

−uxx − 4u = 0, x ∈ ]0, π[ , (9.30) u(0) = u(π) = 0 admits infinitely many solutions: u(x) = κ sin 2x, for any constant κ. We now study the existence and uniqueness of weak solutions to the more general boundary value problem (9.1)–(9.2). Our approach is based on two steps. STEP 1: By choosing a constant γ > 0 sufficiently large, the operator . (9.31) Lγ u = Lu + γu is strictly positive definite. More precisely, the corresponding bilinear form (9.32) ⎛ ⎞ n n . ⎝ Bγ [u, v] = aij (x) uxi vxj − bi (x) uvxi + c(x) uv + γ uv ⎠ dx Ω

i,j=1

i=1

satisfies (9.33)

Bγ [u, u] ≥ β u 2H 1

for all u ∈ H01 (Ω) ,

for some constant β > 0. Using the Lax-Milgram theorem, we conclude that for every f ∈ L2 (Ω) the equation Lγ u = f has a unique weak solution u ∈ H01 (Ω). Moreover, the map f → u = L−1 γ f 2 1 2 is a linear compact operator from L (Ω) into H0 (Ω) ⊂ L (Ω). We regard 2 L−1 γ as a compact operator from L (Ω) into itself. STEP 2: The original problem (9.2) can now be written as Lu = Lγ u − γu = f . Applying the operator L−1 γ to both sides, one obtains −1 u − L−1 γ γu = Lγ f .

(9.34) Introducing the notation (9.35)

. K = γL−1 γ ,

. h = L−1 γ f,

we are led to the equation (9.36)

(I − K)u = h .

196

9. Linear Partial Differential Equations

Since K is a compact operator from L2 (Ω) into itself, Fredholm’s theory can be applied. In particular, one has Fredholm’s alternative: either (i) for every h ∈ L2 (Ω) the equation u − Ku = h has a unique solution u ∈ L2 (Ω), or else (ii) the equation u − Ku = 0 has a nontrivial solution u ∈ L2 (Ω). Calling K ∗ : L2 → L2 the adjoint operator, case (ii) occurs if and only if the adjoint equation v − K ∗ v = 0 has a nontrivial solution v ∈ L2 (Ω). Information about the existence and uniqueness of weak solutions to (9.2) can thus also be obtained by studying the adjoint operator (9.37)

n n . L∗ v = − (aij (x) vxj )xi − bi (x)vxi + c(x) v . i,j=1

i=1

The remainder of this section will provide detailed proofs of the above claims. Lemma 9.12 (Estimates on elliptic operators). Let the operator L in (9.1) be uniformly elliptic, with coefficients aij , bi , c ∈ L∞ (Ω). Then there exist constants α, β, γ > 0 such that (9.38)

|B[u, v]| ≤ α u H 1 v H 1 ,

(9.39)

β u 2H 1 ≤ B[u, u] + γ u 2L2 ,

for all u, v ∈ H01 (Ω). Proof. 1. The boundedness of the bilinear form B : H01 × H01 → R follows from n n aij uxi vxj − bi uvxi + c uv dx B[u, v] = Ω i,j=1 i=1 ≤

n

aij L∞ uxi L2 vxj L2 +

i,j=1

+ c L∞ u L2 v L2 ≤ α u H 1 v H 1 .

n i=1

bi L∞ u L2 vxi L2

9.1. Elliptic equations

197

2. Concerning the second estimate, using the ellipticity condition (9.4) and 1 2 the elementary inequality ab ≤ 2θ a + θ2 b2 , we obtain n n n 2 2 θ uxi L2 = θ uxi dx ≤ aij uxi uxj dx Ω i=1

i=1

= B[u, u] +

n Ω

≤ B[u, u] +

Ω i,j=1

b uuxi − cu i

2

dx

i=1

n

bi L∞ u L2 uxi L2 + c L∞ u 2L2

i=1

≤ B[u, u] +

n n 1 i 2 θ 2 2 b L∞ u L2 + uxi L2 + c L∞ u 2L2 . 2θ 2 i=1

i=1

Therefore θ uxi 2L2 − C u 2L2 2 n

B[u, u] ≥

for all u ∈ H01 (Ω),

i=1

for a suitable constant C. Taking β = θ/2 and γ = C + θ/2, the inequality (9.39) is satisfied. Remark 9.13. By the above lemma, if the constant γ > 0 is large enough, then the bilinear form . Bγ [u, v] = B[u, v] + γ (u, v)L2 in (9.32) is strictly positive definite. Notice that, for γ > 0 large, it would be very easy to show that the bilinear form . γ = B B[u, v] + γ (u, v)H 1 is strictly positive definite on H01 (Ω). However, Lemma 9.12 shows that we can achieve strict positivity by adding the much weaker term γ (u, v)L2 . Let γ be as in (9.39) and define the bilinear form Bγ according to (9.32). Since Bγ is strictly positive definite, we can apply the Lax-Milgram theorem and conclude that, for every f ∈ L2 (Ω), there exists a unique u ∈ H01 (Ω) such that (9.40)

Bγ [u, v] = (f, v)L2 = (ι∗ f, v)H 1

for all v ∈ H01 (Ω).

Since the map ι∗ is compact, the solution operator f → u = L−1 γ f is a linear compact operator from L2 (Ω) into H01 (Ω). Therefore, it is also a compact operator from L2 (Ω) into itself.

198

9. Linear Partial Differential Equations

An entirely similar result holds for the adjoint problem

∗ Lγ v = g, x ∈ Ω, (9.41) v = 0, x ∈ ∂Ω , . where L∗ is the adjoint operator introduced in (9.37) and L∗γ u = L∗ u + γu. Given g ∈ L2 (Ω), a weak solution of (9.41) is defined to be a function v ∈ H01 (Ω) such that . (9.42) Bγ∗ [v, u] = Bγ [u, v] = (u, g)L2 for all u ∈ H01 (Ω) . Since Bγ∗ is strictly positive definite, for every g ∈ L2 the Lax-Milgram theorem yields a unique weak solution v of (9.41). The map g → v = (L∗γ )−1 g is a linear, compact operator from L2 (Ω) into itself. Lemma 9.14 (Adjoint operator). In the above setting, the operator (L∗γ )−1 is the adjoint of the operator L−1 γ . Proof. By definition, for every f, g ∈ L2 (Ω) and u, v ∈ H01 (Ω) one has 4 4 5 5 ∗ (f, v)L2 = Bγ L−1 (u, g)L2 = Bγ u, (L−1 γ f, v , γ ) g . In particular, choosing v = (L∗γ )−1 g and u = L−1 γ f , we obtain ( 4 ) 5 ( −1 ) −1 ∗ f, (L∗γ )−1 g L2 = Bγ L−1 γ f , (Lγ ) g = Lγ f, g L2 .

Lemma 9.15 (Representation of weak solutions). Given any f ∈ L2 (Ω), a function u ∈ L2 (Ω) is a weak solution of (9.2) if and only if (9.43)

(I − K)u = h ,

with

K = γL−1 γ ,

h = L−1 γ f.

Proof. 1. Let u be a weak solution of (9.2). By definition, this means that u ∈ H01 (Ω) and Bγ [u, v] = B[u, v] + γ(u, v)L2 = (f + γu, v)L2

for all v ∈ H01 (Ω) .

Therefore u = L−1 γ (f + γu) = h + Ku . 2. To prove the converse, let (9.43) hold. Then −1 1 u = γL−1 γ u + Lγ f ∈ H0 (Ω) .

Moreover, for every v ∈ H01 (Ω) we have B[u, v] = Bγ [u, v] − γ(u, v)L2 = (f + γu, v)L2 − γ(u, v)L2 = (f, v)L2 .

9.1. Elliptic equations

199

In order to apply Fredholm’s theory (Theorem 6.1), together with (9.2) we consider the homogeneous problem

Lu = 0, x ∈ Ω, (9.44) u = 0, x ∈ ∂Ω , and the adjoint problem

∗ L v = 0, (9.45) v = 0,

x ∈ Ω, x ∈ ∂Ω ,

where L∗ is the adjoint linear operator defined at (9.37). Theorem 9.16 (Unique solutions to the elliptic problem). Under the basic hypotheses (H), the following statements are equivalent: (i) For every f ∈ L2 (Ω), the elliptic boundary value problem (9.2) has a unique weak solution. (ii) The homogeneous boundary value problem (9.44) has the only solution u(x) ≡ 0. (iii) The adjoint homogeneous problem (9.45) has the only solution v(x) ≡ 0. 2 Proof. 1. Since K = γL−1 γ is a compact operator from L (Ω) into itself, Fredholm’s theorem can be applied. As a consequence, the linear operator I − K is surjective if and only if it is one-to-one, i.e., if and only if Ker(I − K) = {0}.

2. By Lemma 9.15, u − Ku = 0 if and only if u is a weak solution of (9.44). An entirely similar argument shows that v − K ∗ v = 0 if and only if v is a weak solution of (9.45). By Fredholm’s theorem, Ker(I − K) and Ker(I − K ∗ ) have the same dimension. We thus obtain a chain of equivalent statements: I − K is surjective, Ker(I − K) = {0}, Ker(I − K ∗ ) = {0}, u ≡ 0 is the unique solution of (9.44), v ≡ 0 is the unique solution of (9.45).

Theorem 9.16 covers the situation where I − K is one-to-one and Fredholm’s first alternative holds. In the case where I − K is not necessarily one-to-one, the existence of solutions to u − Ku = L−1 γ f

200

9. Linear Partial Differential Equations

can be determined using the identity (9.46)

Range(I − K) = [Ker(I − K ∗ )]⊥ .

Theorem 9.17 (Existence of solutions to the elliptic problem). Under the hypotheses (H), problem (9.2) has at least one weak solution if and only if (9.47)

(f, v)L2 = 0

for every weak solution v ∈ H01 (Ω) of the adjoint problem (9.45). Proof. The boundary value problem (9.2) has a weak solution provided −1 that L−1 γ f ∈ Range(I − K). By (9.46), this holds if and only if Lγ f is ∗ orthogonal to every v ∈ Ker(I − K ), i.e., to every solution v of the adjoint problem (9.45). We claim that this holds if and only if f itself is orthogonal to every solution v of (9.45). Indeed, if v − K ∗ v = 0, one has (f, v)L2 = (f, K ∗ v)L2 = (Kf, v)L2 = γ (L−1 γ f, v)L2 . Since γ > 0, this proves our claim.

9.2. Parabolic equations Let Ω ⊂ Rn be a bounded open set and let L be the operator in (9.1). In addition to the standard hypotheses (H) stated at the beginning of the chapter, we now assume that the coefficients aij satisfy the stronger regularity condition (9.48)

aij ∈ W 1,∞ (Ω) .

In this section we study the parabolic initial-boundary value problem ⎧ t > 0, x ∈ Ω, ⎨ ut + Lu = 0, (9.49) u(t, x) = 0, t > 0 , x ∈ ∂Ω , ⎩ u(0, x) = g(x), x ∈ Ω. It is convenient to reformulate (9.49) as a Cauchy problem in the Hilbert space X = L2 (Ω), namely d (9.50) u = Au, u(0) = g , dt for a suitable (unbounded) linear operator A : L2 (Ω) → L2 (Ω). More precisely . (9.51) A = −L , Dom(A) = u ∈ H01 (Ω) ; Lu ∈ L2 (Ω) . In other words, u ∈ Dom(A) if u is the solution to the elliptic boundary value problem (9.2), for some f ∈ L2 (Ω). In this case, Au = −f .

9.2. Parabolic equations

201

Our eventual goal is to construct solutions of (9.50) using semigroup theory. We first consider the case where the operator L is strictly positive definite on H01 (Ω). More precisely, we assume that there exists β > 0 such that the bilinear form B : H01 (Ω) × H01 (Ω) → R defined at (9.11) is strictly positive definite: there exists β > 0 such that (9.52)

B[u, u] ≥ β u 2H 1

for all u ∈ H01 (Ω) .

Notice that this is certainly true if bi ≡ 0 and c ≥ 0. Theorem 9.18 (Semigroup of solutions of a parabolic equation. I). Let the standard hypotheses (H) hold, together with (9.48). Moreover, assume that the corresponding bilinear form B in (9.11) is strictly positive definite, so that (9.52) holds. Then the operator A = −L generates a contractive semigroup {St ; t ≥ 0} of linear operators on L2 (Ω). Proof. To prove that A generates a contraction semigroup on X = L2 (Ω), we need to check the following: (i) Dom(A) is dense in L2 (Ω). (ii) The graph of A is closed. (iii) Every real number λ > 0 is in the resolvent set of A, and (λI − A)−1 ≤ λ−1 . 1. To prove (i), we observe that, if ϕ ∈ Cc2 (Ω), then the regularity assump. tions (9.48) imply f = Lϕ ∈ L2 (Ω). This proves that Dom(A) contains the subspace Cc2 (Ω) and therefore it is dense in L2 (Ω). 2. If (9.52) holds, then, by the Lax-Milgram theorem, for every f ∈ L2 (Ω) there exists a unique u ∈ H01 (Ω) such that B[u, v] = (f, v)L2

for all v ∈ H01 (Ω) .

. The map f → u = L−1 f is a bounded linear operator from L2 (Ω) into L2 (Ω). We observe that the pair of functions (u, f ) lies in the graph of A if and only if (−f, u) is in the graph L−1 . Since L−1 is a continuous operator, its graph is closed. Hence the graph of A is closed as well. 3. According to the definition of A, to prove (iii) we need to show that, for every λ > 0, the operator λI − A has a bounded inverse with operator norm (λI − A)−1 ≤ λ−1 . Equivalently, for every f ∈ L2 (Ω), we need to show

202

9. Linear Partial Differential Equations

that the problem (9.53)

λu + Lu = f, u = 0,

x ∈ Ω, x ∈ ∂Ω ,

has a weak solution satisfying u L2 ≤

(9.54)

1 f L2 . λ

By the Lax-Milgram theorem, there exists a unique u ∈ H01 (Ω) such that (9.55)

for all v ∈ H01 (Ω).

(λu, v)L2 + B[u, v] = (f, v)L2

Taking v = u in (9.55), we obtain λ u 2L2 + B[u, u] = (f, u)L2 ≤ f L2 · u L2 . Since we are assuming B[u, u] ≥ 0, this yields λ u L2 ≤ f L2 , proving (9.54). 4. We can now use Theorem 7.13 and conclude that the linear operator A generates a contractive semigroup. 9.2.1. Representation of solutions in terms of eigenfunctions. Consider the special case where aij = aji and L is the operator in (9.20), containing only second-order terms. In this case, by Poincaré’s inequality, the bilinear form B in (9.22) is strictly positive definite and Theorem 9.18 can be applied. Using Theorem 9.9, one obtains a representation of the semigroup trajectories in terms of an orthonormal basis {φk ; k ≥ 1} of eigenfunctions of the compact selfadjoint operator L−1 . By construction, for every k ≥ 1 one has L−1 φk = λk φk , where λk > 0 is the corresponding eigenvalue. Therefore (9.56)

φk ∈ Dom(L) ,

Lφk = μk φk ,

μk =

1 . λk

Notice that λk → 0 and μk → ∞, as k → ∞. For every k ≥ 1, the function . u(t) = e−μk t φk provides a C 1 solution to the Cauchy problem d u(t) = −Lu(t) , dt

u(0) = φk .

9.2. Parabolic equations

203

Hence, by the uniqueness of semigroup trajectories, one must have St φk = e−μk t φk . By linearity, for any coefficients c1 , . . . , cN ∈ R one has N N St ck φk = ck e−μk t φk . k=1

k=1

Since St is a bounded linear operator, decomposing an arbitrary function g ∈ L2 (Ω) along the orthonormal basis {φk ; k ≥ 1}, we thus obtain (9.57)

∞

St g =

e−μk t (g, φk )L2 φk .

k=1

The above representation of semigroup trajectories is valid for every g ∈ L2 (Ω) and every t ≥ 0. Lemma 9.19. Let L be the operator in (9.20). Then for every g ∈ L2 (Ω) the formula (9.57) defines a map t → ut = St g from [0, ∞[ into L2 (Ω). This map is continuous for t ∈ [0, ∞[ and continuously differentiable for t > 0. Moreover, u(t) ∈ Dom(L) ⊆ H01 (Ω) for every t > 0 and d u(t) = Lu(t) dt

(9.58)

for all t > 0 .

Proof. 1. Let g ∈ L2 (Ω). Since μk > 0 for every k, it is clear that 2 −μk t (g, φk )L2 ≤ (g, φk )2L2 . e Therefore,

2 (g, φk )2L2 = g 2L2 < ∞ . e−μk t (g, φk )L2 ≤ k≥1

k≥1

Hence the series in (9.57) is convergent, uniformly for t ≥ 0. In particular, since the partial sums are continuous functions of time, the map t → St g is continuous as well. 2. We claim that, even if g ∈ / H01 (Ω), one always has St g ∈ Dom(L) ⊆ H01 (Ω) for all t > 0 . Indeed, a function u = k ck φk lies in Dom(L) if and only if the coefficients ck satisfy (ck μk )2 < ∞ .

(9.59)

k

204

9. Linear Partial Differential Equations

. In the case where ck (t) = e−μk t (g, φk )L2 , we have the estimate 2 (9.60) (μk ck (t))2 ≤ sup μk e−μk t · (g, φk )2L2 . k

k

k

An elementary calculation now shows that, for ξ ≥ 0 and t fixed, the function ξ → ξ e−tξ attains its global maximum at ξ = 1/t. Therefore, μk e−μk t ≤ maxξ≥0 ξe−tξ = Using this bound in (9.60), we obtain ∞ 2 μk ck (t) ≤ k=1

1 t2 e2

1 . te

g 2L2 .

Hence the series defining Lu(t) is convergent. This implies u(t) ∈ Dom(L), for each t > 0. 3. Differentiating the series (9.57) term by term and observing that the series of derivatives is also convergent, we achieve (9.58). . Example 9.20. As in Example 9.10, let Ω = ]0, π[ ⊂ R and Lu = −uxx . 2 Given g ∈ L (]0, π[), consider the parabolic initial-boundary value problem ⎧ ut = uxx , t > 0, 0 < x < π, ⎨ u(0, x) = g(x), 0 < x < π, ⎩ u(t, 0) = u(t, π) = 0 . In this special case, the formula (9.57) yields the solution as the sum of a Fourier sine series: π ∞ 2 −k2 t u(t, x) = g(y) sin ky dy sin kx . e π 0 k=1

9.2.2. More general operators. To motivate the following construction, we begin with a finite-dimensional example. Let A be an n × n matrix and consider the linear ODE on Rn d (9.61) x(t) = −Ax(t) . dt If A is positive definite, i.e., if Ax, x ≥ 0 for all x ∈ Rn , then −A generates a contractive semigroup. Indeed 6 7 d d |x(t)|2 = 2 x(t) , x(t) = 2 −Ax(t), x(t) ≤ 0 , dt dt showing that the Euclidean norm of a solution does not increase in time.

9.2. Parabolic equations

205

Next, let A be an arbitrary matrix. We can then find a number γ ≥ 0 large enough so that A + γI is positive definite, and hence the matrix −(A+γI) generates a contractive semigroup. In this case, if x(t) = e−tA x(0) is a solution to (9.61), writing −A = γI − (A + γI), one obtains |x(t)| = |e−At x(0)| (9.62)

= |e{γI−(A+γI)}t x(0)| = eγt |e−(A+γI)t x(0)| ≤ eγt |x(0)|.

According to (9.62), the operator −A generates a semigroup of type γ. We shall work out a similar construction in the case where L is a general elliptic operator, as in (9.1), and the corresponding bilinear form B[u, v] in (9.11) is not necessarily positive definite. According to Lemma 9.12, there exists a constant γ > 0 large enough so that the bilinear form (9.63)

Bγ [u, v] = B[u, v] + γ(u, v)L2

is strictly positive definite on H01 (Ω). We can thus define . . Lγ u = Lu + γu , Bγ [u, v] = B[u, v] + γ(u, v)L2 . The parabolic equation in (9.49) can now be written as ut = −Lγ u + γu . . By the previous analysis, the operator Aγ = −(L + γI) generates a con(γ) tractive semigroup of linear operators, say {St ; t ≥ 0}. Therefore, the . operator A = −L = γI − Lγ defined as in (9.51) generates a semigroup of type γ. Namely {St ; t ≥ 0}, with (γ)

St = eγt St ,

t ≥ 0.

Summarizing the above analysis, we have Theorem 9.21 (Semigroup of solutions of a parabolic equation. II). Let Ω ⊂ Rn be a bounded open set. Assume that the operator L in (9.1) satisfies the regularity conditions (9.48) and the uniform ellipticity condition (9.4). Then the operator A = −L defined at (9.51) generates a semigroup {St ; t ≥ 0} of linear operators on L2 (Ω). Having constructed a semigroup {St ; t ≥ 0} generated by the operator A, one needs to understand in which sense a trajectory of the semigroup t → u(t) = St f provides a solution to the parabolic equation (9.49). In the case where L is the symmetric operator defined in (9.20), the representation

206

9. Linear Partial Differential Equations

(9.57) yields all the needed information. Indeed, according to Lemma 9.19, for every initial data g ∈ L2 (Ω) the solution t → u(t) = St g is a C 1 map, which takes values in Dom(L) and satisfies (9.58) for every t > 0. A similar result can be proved for general elliptic operators of the form (9.1). However, this analysis is beyond the scope of the present notes. Here we shall only make a few remarks: . (1) Initial condition. The map t → u(t) = St g is continuous from [0, ∞[ into L2 (Ω) and satisfies u(0) = g. The initial condition in (9.49) is thus satisfied as an identity between functions in L2 (Ω). (2) Regular solutions. If g ∈ Dom(A), then u(t) = St g ∈ Dom(A) for all t ≥ 0. Moreover, the map t → u(t) is continuously differentiable and satisfies the ODE (9.50) at every time t > 0. Since Dom(A) ⊂ H01 (Ω), this also implies that u(t) satisfies the correct boundary conditions, for all t ≥ 0. (3) Distributional solutions. Given a general initial condition f ∈ L2 (Ω), one can construct a sequence of initial data fm ∈ Dom(A) such that fm − f L2 → 0 as m → ∞. In this case, if the semigroup is of type γ, we have St fm − St f L2 ≤ eγt fm − f L2 . Therefore the trajectory t → u(t) = St f is the limit of a sequence of C 1 . solutions t → um (t) = St fm . The convergence is uniform for t in bounded sets. Relying on these approximations, we now show that the function u = u(t, x) provides a solution to the parabolic equation (9.64)

ut =

n

(a (x)uxi )xj − ij

i,j=1

n

bi (x)uxi − c(x)u

i=1

in the distributional sense. Namely, for every test function ϕ ∈ Cc∞ (Ω× ]0, ∞[), one has

(9.65)

uϕt +

n i,j=1

ij

u(a ϕxj )xi +

n

+ u(b ϕ)xi − c uϕ dx dt = 0 . i

i=1

To prove (9.65), consider a sequence of initial data fm ∈ Dom(A) such that fm − f L2 → 0. Then, for any fixed time interval [0, T ], the trajectories . t → um (t) = St fm converge to the continuous trajectory t → u(t) = St f in C 0 ([0, T ] ; L2 (Ω)). Since each um is clearly a solution in the distributional

9.3. Hyperbolic equations

207

sense, writing +

n n ij i um ϕ t + um (a ϕxj )xi + um (b ϕ)xi − c um ϕ dx dt = 0 i,j=1

i=1

and letting m → ∞, we obtain (9.65).

9.3. Hyperbolic equations In this last section we problem ⎧ ⎨ utt + Lu u(t, x) (9.66) ⎩ u(0, x)

consider the linear hyperbolic initial-boundary value t ∈ R, x ∈ Ω, t ∈ R , x ∈ ∂Ω , x ∈ Ω.

= 0, = 0, = f (x) , ut (0, x) = g(x),

Compared with (9.49), notice that here we are taking two derivatives with respect to time. The system (9.66) can thus be regarded as a second-order evolution equation in the space L2 (Ω). For simplicity, we shall only treat the case where L is the homogeneous second-order elliptic operator n (9.67) Lu = − (aij (x)uxi )xi , i,j=1

aij

assuming that the coefficients satisfy (9.68) n ij ji 1,∞ a =a ∈W (Ω), aij (x)ξi ξj ≥ θ|ξ|2

for all x ∈ Ω, ξ ∈ Rn .

i,j=1

According to Theorem 9.9, the space L2 (Ω) admits an orthonormal basis {φk ; k ≥ 1} consisting of eigenfunctions of L, so that (9.69)

φk ∈ Dom(L) ,

Lφk = μk φk ,

for a sequence of strictly positive eigenvalues μk → +∞ as k → ∞. It is thus natural to construct a solution of (9.66) in the form ∞ (9.70) u(t, x) = ck (t) φk (x) . k=1

Taking the inner product of both sides of (9.70) with φk , we see that each coefficient ck (·) should satisfy the linear second-order ODE

ck (0) = (φk , g)L2 , (9.71) ck + μk ck = 0 , ck (0) = (φk , h)L2 . The following analysis will show that the formal expansion (9.70)–(9.71) is indeed valid, provided that the initial data satisfy (9.72)

g ∈ H01 (Ω) ,

h ∈ L2 (Ω) .

208

9. Linear Partial Differential Equations

. As a first step, let us rewrite (9.66) as a first-order system, setting v = ut . On the product space . (9.73) X = H01 (Ω) × L2 (Ω) we thus consider the evolution problem d u 0 I u (9.74) = , −L 0 v dt v

u f (0) = . v g

In the special case where (9.75)

f = ak φk ,

g = bk φk ,

for a given k ≥ 1 and ak , bk ∈ R, an explicit solution of (9.74) is found in the form u(t) = ck (t)φk , v(t) = ut (t) = ck (t)φk , where the coefficient ck (t) satisfies ck (t) + μk ck (t) = 0 ,

ck (0) = ak ,

ck (0) = bk .

An elementary computation yields bk √ √ ck (t) = ak cos( μk t) + √ sin( μk t) . μk Hence

√ √ √1 sin( μk t) cos( μk t) u(t) ak φk μk (9.76) = . √ √ √ v(t) bk φk − μk sin( μk t) cos( μk t) ( ) Observe that t → u(t), v(t) is a continuously differentiable map from R into H01 (Ω) × L2 (Ω) which satisfies the initial conditions and the differential equation in (9.74). By taking linear combinations of solutions of the form (9.76), we now obtain a group of linear operators: (9.77) √ √ ∞ √1 sin( μk t) cos( μk t) f (f, φk )L2 φk . μk St = . √ √ √ g (g, φk )L2 φk − μk sin( μk t) cos( μk t) k=1

The next theorem shows that {St ; t ∈ R} is actually a group of linear isometries on the product space X = H01 (Ω) × L2 (Ω), with the equivalent norm 1/2 . (9.78) (u, v) X = B[u, u] + v 2L2 .

9.3. Hyperbolic equations

209

Figure 9.3.1. An elastic membrane, clamped along the boundary of the domain Ω, whose points vibrate in the vertical direction.

Remark 9.22. Consider an elastic membrane which occupies a region Ω in the plane, is clamped along the boundary ∂Ω, and is subject to small vertical vibrations (Figure 9.3.1). Let u(t, x) denote the vertical displacement of a point x on this membrane, at time t. Then the quantity (u, ut ) 2X can be regarded as the total energy of the vibrating membrane. Indeed, the term B[u, u] describes an elastic potential energy, while ut 2L2 yields the kinetic energy. Theorem 9.23 (Solutions of a linear hyperbolic problem). In the above setting, the formula (9.77) defines a strongly continuous group of . bounded linear operators {St ; t ∈ R} on the space X = H01 (Ω) × L2 (Ω). Each operator St : X → X is an isometry with respect to the equivalent norm (9.78). Proof. 1. The equivalence between the norm (9.78) and the standard product norm 1/2 . (u, v) H 1 ×L2 = u 2H 1 + v 2L2 is an immediate consequence of (9.23). 2. Let the functions f, g be given by f =

∞

ak φk ,

k=1

g =

∞

bk φk

with ak = (f, φk )L2 , bk = (g, φk )L2 .

k=1

Then B[f, f ] =

∞ k=1

μk ak φk , ak φk

L2

=

∞ k=1

μk a2k ,

g 2L2 =

∞ k=1

b2k .

210

9. Linear Partial Differential Equations

f Therefore ∈ X if and only if g (9.79) ∞ ∞ μk a2k = μk (f, φk )2L2 < ∞ , k=1

k=1

∞

b2k =

k=1

∞

(g, φk )2L2 < ∞ .

k=1

f f If ∈ X, then for every t ∈ R the series defining St in (9.77) is g g convergent. Introducing the coefficients ⎧ 1 √ √ ⎪ ⎪ ak (t) = cos( μk t)ak + √ sin( μk t) bk , ⎨ μk (9.80) ⎪ ⎪ √ √ √ ⎩ bk (t) = ak (t) = − μk sin( μk t) ak + cos( μk t) bk , at any time t we have 2 2 ∞ ∞ f f 2 2 (9.81) S = μ |a (t)| + |b (t)| = t k k k g . g X X k=1

k=1

This shows that each linear operator St is an isometry with respect to the equivalent norm · X . 3. We claim that the family of linear operators {St ; t ∈ R} satisfies the group properties f f f f (9.82) S0 = , St Ss = St+s , t, s ∈ R . g g g g Indeed, (9.82) is clearly satisfied for initial data f, g of the special form (9.75). By linearity and continuity, it must hold for all initial data. To complete the proof, we need to show that, for f, g fixed, the map f t → St is continuous from R into X. But this is clear, because the g above map is the uniform limit of the continuous maps m m f . . (9.83) t → St m , fm = (f, φk )L2 φk , gm = (g, φk )L2 φk , gm k=1

as m → ∞.

k=1

Having constructed the group of linear operators {St ; t ∈ R}, we still need to explain in which sense the trajectories u(t) f (9.84) t → = St v(t) g provide a solution to the hyperbolic initial-boundary value problem (9.66).

9.3. Hyperbolic equations

211

(1) The initial andboundary conditions are satisfied. Consider an f arbitrary initial data ∈ X = H01 (Ω) × L2 (Ω). By the continuity of the g map in (9.84), it follows that u(t) − f H 1 → 0 ,

v(t) − g L2 → 0

as t → 0 .

Hence the initial conditions in (9.66) are satisfied. Moreover, by the definition of the space X, we have u(t) ∈ H01 (Ω) for all t ≥ 0. Hence the boundary condition u = 0 on ∂Ω is also satisfied. (2) The hyperbolic equation is satisfied in the distributional sense. If both functions f and g are finite linear combinations of the eigenfunctions φk , then the corresponding trajectory (9.84) is a continuously differentiable map from R into X, and it satisfies (9.74) at all times t > 0. More generally, given initial data f ∈ H01 (Ω) and g ∈ L2 (Ω), one can construct a sequence of approximations fm , gm as in (9.83), so that fm − f H 1 → 0, gm −g L2 → 0 as m → ∞. The corresponding semigroup um (t) fm trajectories t → converge to the trajectory (9.84), = St vm (t) gm uniformly for t ∈ R. Relying on these approximations, we now show that the function u = u(t, x) provides a solution to the hyperbolic equation utt =

n

(aij (x)uxi )xj

i,j=1

in the distributional sense. Indeed, consider any test function ϕ ∈ Cc∞ ( ]0, ∞[ ×Ω ). Since each um = um (t, x) is a distributional solution of (9.66), +

n um ϕtt + aij (x) (um )xi ϕxj dx dt = 0 . i,j=1

Letting m → ∞ and using the convergence um (t) − u(t) H 1 → 0, uniformly for t ∈ R, we obtain +

n ij u ϕtt + a (x) uxi ϕxj dx = 0 . i,j=1

. Example 9.24. Let Ω = ]0, π[ ⊂ R and Lu = −uxx . Given f ∈ H01 (Ω) and g ∈ L2 (]0, π[), consider the hyperbolic initial-boundary value problem ⎧ utt = uxx , t > 0, 0 < x < π, ⎨ u(0, x) = f (x) , ut (0, x) = g(x), 0 < x < π, ⎩ u(t, 0) = u(t, π) = 0 .

212

9. Linear Partial Differential Equations

In this special case, the formula (9.77) yields the solution in terms of a Fourier sine series: . π ∞ 2 u(t, x) = f (y) sin ky dy cos kt π 0 k=1

sin kt + k

/

π

g(y) sin ky dy

sin kx .

0

9.4. Problems . 1. Let Ω = {(x, y) ; x2 + y 2 < 1} be the open unit disc in R2 . Prove that, for every bounded measurable function f = f (x, y), the problem

uxx + x uxy + uyy = f on Ω , u = 0 on ∂Ω has a unique weak solution. . 2. Let Ω = {(x, y) ; x2 + y 2 < 1} be the unit disc in R2 . On the space X = H01 (Ω), consider the inner product ux vx + 2uy vy + y(ux vy + uy vx ) dxdy . u, v♦ = Ω

(i) Prove that ·, ·♦ is indeed an inner product on X, which makes X a Hilbert space. (ii) Given f ∈ L2 (Ω), show that there exists a unique u ∈ X such that u, v♦ = f v dx for all v ∈ X = H01 (Ω) . Ω

What elliptic equation does u solve ? 3. Consider the differential operator on R2 Lu = −(xux )x − (yuy )y + 2uxy + 3(ux + uy ) − 6u. Determine for which bounded open sets Ω ⊂ R2 it is true that the operator L is uniformly elliptic on Ω. 4. Let Ω ⊂ Rn be a bounded open set. Let β > 0 be the best constant in Poincaré’s inequality, namely . β = sup u L2 (Ω) ; u ∈ H01 (Ω), ∇u L2 (Ω) ≤ 1 .

9.4. Problems

213

(i) Establish a lower bound on the eigenvalues of the operator Lu = −Δu. More precisely, if φ ∈ H01 (Ω) provides a nontrivial weak solution to

−Δφ = μ φ, x ∈ Ω, φ = 0, x ∈ ∂Ω , prove that μ ≥ 1/β 2 . (ii) Prove that the solution of the parabolic initial-boundary value problem ⎧ ut = Δu, t > 0, x ∈ Ω, ⎨ u(t, x) = 0, t > 0 , x ∈ ∂Ω , ⎩ u(0, x) = g(x), x ∈ Ω, decays to zero as t → ∞. Indeed, u(t) L2 ≤ e−t/β g L2 for every t ≥ 0. 2

⊂ Rn be bounded open sets. Let 0 < μ1 ≤ μ2 ≤ · · · be the eigenvalues 5. Let Ω ⊆ Ω ˜1 ≤ μ ˜2 ≤ · · · be the eigenvalues of the of the operator −Δ on H01 Ω, and let 0 < μ Prove that μ operator −Δ on H01 (Ω). ˜ 1 ≤ μ1 . 6. On the interval [0, T ], consider the Sturm-Liouville eigenvalue problem

(p(t)u ) + q(t)u = μ u, 0 0 the bilinear form . Bγ [u, v] = (∇u · ∇v + γuv) dx Ω 1

is strictly positive definite on H (Ω). Express the weak solution of (9.90) in terms of the inverse operator L−1 γ , where Lγ u = −Δu + γu. 16. (Biharmonic equation) Let Ω ⊂ Rn be a bounded open set with smooth boundary. A function u ∈ H02 (Ω) is a weak solution of the biharmonic equation ⎧ Δ2 u = f, x ∈ Ω, ⎪ ⎨ (9.92) ⎪ ⎩ u = ∂u = 0, x ∈ ∂Ω , ∂ν if (9.93) Δu Δv dx = f v dx for all v ∈ H02 (Ω). Ω

Ω

Given f ∈ L2 (Ω), prove that the boundary value problem (9.92) has a unique weak solution. Hint: Show that the bilinear form B[u, v] on H02 (Ω) defined by the left-hand side of (9.93) is strictly positive definite on H02 (Ω).

Appendix

Background Material

A.1. Partially ordered sets A set S is partially ordered by a binary relation if, for every a, b, c ∈ S, one has (i) a a, (ii) a b and b a implies a = b, (iii) a b and b c implies a c. A subset S ⊆ S of a partially ordered set S is said to be totally ordered if, for every a, b ∈ S , one has either a b or b a. We say that the subset S is maximal (with respect to the property of being totally ordered) if S is not strictly contained in any other totally ordered set. Using Zorn’s lemma, or the axiom of choice, one can prove Theorem A.1 (Hausdorff Maximality Principle). If S is any partially ordered set, every totally ordered subset S ⊆ S is contained in a maximal totally ordered subset.

A.2. Metric and topological spaces A distance on a set X is a map d : X × X → R+ satisfying the following three properties: (1) positivity: d(x, y) ≥ 0, d(x, y) = 0 if and only if x = y, (2) symmetry: d(x, y) = d(y, x), 217

218

Appendix. Background Material

(3) triangle inequality: d(x, z) ≤ d(x, y) + d(y, z). A set X endowed with a distance function d(·, ·) is called a metric space. The open ball centered at a point x with radius r > 0 is the set . B(x, r) = {y ∈ X ; d(y, x) < r}. In turn, a metric d(·, ·) determines a topology on X. A subset A ⊆ X is open if, for every x ∈ A, there exists a radius r > 0 such that B(x, r) ⊆ A. A subset C ⊆ X is closed if its complement X \ C = {x ∈ X ; x ∈ / C} is open. If B(x, r) ⊆ A for some r > 0, we say that A is a neighborhood of the point x, or equivalently that x is an interior point of A. • The union of any family of open sets is open. The intersection of finitely many open sets is open. • The intersection of any family of closed sets is closed. The union of finitely many closed sets is closed. The entire space X and the empty set ∅ are always both open and closed. If there exists no other subset S ⊂ X which is at the same time open and closed, we say that the space X is connected. A sequence (xn )n≥1 converges to a point x ∈ X if lim d(xn , x) = 0 .

n→∞

In this case, we write limn→∞ xn = x or simply xn → x. The closure of a set A, denoted by A, is the smallest closed set containing A. This is obtained as the intersection of all closed sets containing A. A point x lies in the closure of the set A if and only if there exists a sequence of points xn ∈ A that converges to x. A subset S ⊆ X is dense in X if S = X. This is the case if and only if S intersects every nonempty open subset of X. The space X is separable if it contains a countable dense subset. A sequence (xn )n≥1 is a Cauchy sequence if for every ε > 0 one can find an integer N large enough so that d(xm , xn ) ≤ ε

whenever m, n ≥ N.

The metric space X is complete if every Cauchy sequence converges to some limit point x ∈ X.

A.2. Metric and topological spaces

219

If X, Y are two metric spaces, a map f : X → Y is continuous if, for . every open set A ⊆ Y , the pre-image f −1 (A) = {x ∈ X ; f (x) ∈ A} is an open subset of X. • A map f : X → Y is continuous if and only if, for every x0 ∈ X and ε > 0, there exists δ > 0 such that d(x, x0 ) < δ

implies

d(f (x), f (x0 )) < ε.

We say that a function f : X → Y is Lipschitz continuous if there exists a constant C ≥ 0 such that ( ) d f (x), f (x ) ≤ C · d(x, x ) for all x, x ∈ X . More generally, we say that f : X → Y is H¨ older continuous of exponent 0 < α ≤ 1 if there exists a constant C such that ( ) 4 5α d f (x), f (x ) ≤ C · d(x, x ) for all x, x ∈ X . A collection of open sets {Ai ; i ∈ I} such that K ⊆ i∈I Ai is called an open covering of the set K. Here the set of indices I may well be infinite. A set K ⊆ X is compact if, from every open covering of K, one can extract a finite subcovering. A set S is relatively compact if its closure S is compact. A set S is precompact if, for every ε > 0, it can be covered by finitely many balls with radius ε. Theorem A.2 (Compact subsets of Rn ). A subset S ⊂ Rn is compact if and only if it is closed and bounded. Theorem A.3 (Equivalent characterizations of compactness). Let S be a metric space. The following are equivalent: (i) S is compact. (ii) S is precompact and complete. (iii) From every sequence (xk )k≥1 of points in S one can extract a subsequence converging to some limit point x ∈ S. A.2.1. Fixed points of contractive maps. Let φ : X → X be a map from a complete metric space X into itself. A point x∗ such that φ(x∗ ) = x∗ is called a fixed point of φ. For a strictly contractive map, the fixed point is unique and can be obtained by a simple iterative procedure. Theorem A.4 (Contraction Mapping Theorem). Let X be a complete metric space, and let φ : X → X be a continuous mapping such that, for some κ < 1, (A.1)

d(φ(x), φ(y)) ≤ κd(x, y)

for all x, y ∈ X.

220

Appendix. Background Material

Then there exists a unique point x∗ ∈ X such that x∗ = φ(x∗ ).

(A.2)

Moreover, for any y ∈ X one has d(y, x∗ ) ≤

(A.3)

1 d(y, φ(y)). 1−κ

Proof. Fix any point y ∈ X and consider the sequence y0 = y,

y1 = φ(y0 ),

...,

yn+1 = φ(yn ), . . . .

By induction, we have d(y2 , y1 ) ≤ κ d(y1 , y0 ), d(y3 , y2 ) ≤ κd(y2 , y1 ) ≤ κ2 d(y1 , y0 ), ··· d(yn+1 , yn ) ≤ κ d(yn , yn−1 ) ≤ κn d(y1 , y0 ) = κn d(φ(y), y). For m < n we have (A.4) n−1 n−1 d(yn , ym ) ≤ d(yj+1 , yj ) ≤ κj d(y, φ(y)) ≤ j=m

j=m

κm d(y, φ(y)). 1−κ

Since κ < 1, the right-hand side of (A.4) approaches zero as m → ∞. Hence the sequence (yn )n≥1 is Cauchy. Since X is complete, this sequence converges to some limit point x∗ . By the continuity of φ one has x∗ = lim yn = lim φ(yn−1 ) = φ lim yn−1 = φ(x∗ ); n→∞

n→∞

n→∞

hence (A.2) holds. The uniqueness of the fixed point x∗ is proved observing that, if x1 = φ(x1 ),

x2 = φ(x2 ),

then by (A.1) it follows that ( ) d(x1 , x2 ) = d φ(x1 ), φ(x2 ) ≤ κd(x1 , x2 ). The assumption κ < 1 thus implies d(x1 , x2 ) = 0 and hence x1 = x2 . Finally, using (A.4) with m = 0, for every n ≥ 1 we obtain d(yn , y) ≤ Letting n → ∞, we obtain (A.3).

1 d(y, φ(y)). 1−κ

A.2. Metric and topological spaces

221

A.2.2. The Baire category theorem. Let X be a metric space. Among all subsets of X we would like to define a family of “large sets” and a family of “small sets” with the following natural properties: (i) A set S ⊆ X is large if and only if its complement X \ S is small. (ii) The intersection of countably many large sets is large. (iii) A large set is nonempty. If a probability measure μ on X is given, one can call “large sets” the sets having probability one, and “small sets” those with probability zero. With such definition, all properties (i)–(iii) are clearly satisfied. If the metric space X is complete, relying on Baire’s category theory, one can still introduce a concept of “large sets” and “small sets”, based exclusively on the topological structure. Namely, we say that a set S ⊆ X is of second category (i.e., “topologically large”) if S is the intersection of countably many open dense sets. On the other hand, a set S ⊆ X is said to be of first category, or equivalently meager (i.e., “topologically small”), if S is the union of countably many closed sets with empty interior. From the definition, it is clear that these topologically large or small sets satisfy the above properties (i) and (ii). The fact that (iii) also holds is an important consequence of the following theorem. Theorem A.5 (Baire). Let (Vk )k≥1 be a sequence of open, dense subsets . # of a complete metric space X. Then the intersection V = ∞ V k=1 k is a nonempty, dense subset of X. # ∞ Proof. Let Ω ⊂ X be any open set. We need to show that ∩Ω V k k=1 is nonempty. Choose a point x0 and a radius r0 < 1 such that B(x0 , 3r0 ) ⊂ Ω. By induction, for every k ≥ 1 we choose a point xk and a radius rk such that B(xk , 3rk ) ⊂ Vk ∩ B(xk−1 , rk−1 ). This is possible because Vk is open and dense. For each k ≥ 1, the above choice implies rk 1 rk+1 ≤ , d(xk+1 , xk ) ≤ rk ≤ k . 3 3 Therefore, the sequence (xk )k≥1 is Cauchy. Since X is complete, this sequence has a limit: xk → x∗ for some x∗ ∈ X. We now observe that d(x∗ , xk ) ≤

∞

d(xj+1 , xj ) ≤

j=k

∞ j=k

rj ≤

∞ j=k

Therefore, for every k ≥ 1, x∗ ∈ B(xk , 3rk ) ⊂ Vk .

3k−j rk =

3 rk . 2

222

Appendix. Background Material

When k = 0, this same argument yields x∗ ∈ B(x0 , 3r0 ) ⊂ Ω. Hence #∞ x∗ ∈ k=1 Vk ∩ Ω.

A.3. Review of Lebesgue measure theory A.3.1. Measurable sets. A family F of subsets of Rn is called a σalgebra if (i) ∅ ∈ F and Rn ∈ F , (ii) if A ∈ F , then Rn \ A ∈ F , (iii) if Ak ∈ F for every k ≥ 1, then

∞ k=1 Ak

∈ F and

#∞

k=1 Ak

∈ F.

Theorem A.6 (Existence of Lebesgue measure on Rn ). There exists a σ-algebra F of subsets of Rn and a mapping A → mn (A), from F into [0, +∞], with the following properties. (i) F contains every open subset of Rn and hence also every closed subset of Rn . (ii) If B is a ball in Rn , then mn (B) equals the n-dimensional volume of B. (iii) If Ak ∈ F for every k ≥ 1 and if the sets Ak are pairwise disjoint, then ∞ ∞ " mn = mn (Ak ) (countable additivity). k=1

k=1

(iv) If A ⊆ B with B ∈ F and mn (B) = 0, then also A ∈ F and mn (A) = 0. The sets contained in the σ-algebra F are called Lebesgue measurable sets, while mn (A) is the n-dimensional Lebesgue measure of the set A ∈ F. If a property P (x) is true for all points x ∈ Rn , except for those in a measurable set N with mn (N ) = 0, we say that the property P holds almost everywhere (a.e.). A function f : Rn → R is measurable if

. f −1 (U ) = x ∈ Rn ; f (x) ∈ U ∈ F for every open set U ⊆ R. Every continuous function is measurable. If f, g are measurable, then f + g and f · g are measurable. Given a uniformly bounded sequence of

A.3. Review of Lebesgue measure theory

223

measurable functions (fk )k≥1 , the functions defined as . . f ∗ (x) = lim sup fk (x) , f∗ (x) = lim inf fk (x) k→∞

k→∞

are both measurable. The essential supremum of a measurable function f is defined as . ess sup f = inf α ∈ R ; f (x) < α for a.e. x ∈ Rn . Theorem A.7 (Egoroff ). Let (fk )k≥1 be a sequence of measurable functions, and assume the pointwise convergence fk (x) → f (x)

for a.e. x ∈ A,

for some measurable function f and a measurable set A ⊂ Rn with mn (A) < ∞. Then for each ε > 0 there exists a subset E ⊆ A such that (i) mn (A \ E) ≤ ε, (ii) fk → f uniformly on E.

A.3.2. Lebesgue integration. In order to define the Lebesgue integral, one begins with a special class of functions. The characteristic function of a set A is

1 if x ∈ A, χ (x) = A 0 if x ∈ / A. A function taking finitely many values, i.e., having the form (A.5)

g(x) =

N

ci χ (x),

i=1

Ai

for some disjoint measurable sets A1 , . . . , AN ⊂ Rn and constants c1 , . . . , cN ∈ R, is called a simple function. If the function g in (A.5) is nonnegative, its Lebesgue integral is defined by N . g dx = ci mn (Ai ). Rn

i=1

As before, mn (Ai ) denotes the n-dimensional Lebesgue measure of the set Ai . More generally, if f : Rn → R is a nonnegative measurable function, its Lebesgue integral is defined as

+ . f dx = sup g dx ; g is simple, g ≤ f . Rn

Rn

For any measurable function f : Rn → R, its positive and negative parts are denoted as . . f+ = max{f, 0}, f− = max{−f, 0}.

224

Appendix. Background Material

We then define the Lebesgue integral of f as . (A.6) f dx = f+ dx − f− dx Rn

Rn

Rn

provided that at least one of the terms on the right-hand side is finite. In this case we say that f is integrable. Notice that the integral in (A.6) may well be +∞ or −∞. A measurable function f : Rn → R is summable if |f | dx < ∞. Rn

We say that f is locally summable if the product f · χ every compact set K ⊂

K

is summable for

Rn .

The Lebesgue integral has useful convergence properties. Theorem A.8 (Fatou’s lemma). Let (fk )k≥1 be a sequence of functions which are nonnegative and summable. Then lim inf fk dx ≤ lim inf fk dx . k→∞

Rn

k→∞

Rn

Theorem A.9 (Monotone convergence). Let (fk )k≥1 be a sequence of measurable functions such that f1 is summable and f1 ≤ f2 ≤ · · · ≤ fk ≤ fk+1 ≤ · · · . Then lim fk dx = lim fk dx . Rn

k→∞

k→∞ Rn

Theorem A.10 (Lebesgue dominated convergence). Let (fk )k≥1 be a sequence of measurable functions such that fk (x) → f (x)

for a.e. x ∈ Rn .

Moreover, assume that there exists a summable function g such that |fk (x)| ≤ g(x)

for every k ≥ 1 and a.e. x ∈ Rn .

Then lim

k→∞ Rn

fk dx =

f dx . Rn

Given a Lebesgue measurable set U ⊂ Rn , the integral of function f : U → R with respect to Lebesgue measure can be

f (x) if . . ˜ ˜ f dx , where f (x) = f dx = 0 if U Rn

a measurable defined as x∈U, x∈ /U.

A.3. Review of Lebesgue measure theory

225

For 1 ≤ p < ∞, Lp (U ) denotes the space of all Lebesgue measurable functions f : U → R such that f Lp (U )

(A.7)

. =

1/p |f | dx p

< ∞.

U

Moreover, L∞ (U ) denotes the space of all measurable functions f : U → R which are essentially bounded, i.e., such that . f L∞ (U ) = ess- sup |f (x)| < ∞ .

(A.8)

x∈U

Two functions whose values coincide outside a set of measure zero are regarded to be the same element of Lp , or L∞ . Given an open set Ω ⊆ Rn and 1 ≤ p ≤ ∞, by Lploc (Ω) we denote the space of all measurable functions f : Ω → R such that f ∈ Lp (U ) for every bounded open set U whose closure is contained in Ω. If 0 < mn (Ω) < ∞, the average value of f on the set Ω is defined as 1 . − f dx = f dx . mn (Ω) Ω Ω Observe that a function f is continuous at the point x0 if and only if f (x) − f (x0 ) = 0. (A.9) lim sup r→0+ x∈B(x ,r) 0

Replacing the supremum with an average, we say that f is quasi-continuous at the point x0 if f (x) − f (x0 ) dx = 0. (A.10) lim − r→0+

B(x0 ,r)

Theorem A.11 (Lebesgue). Let f : Rn → R be locally summable. Then f is quasi-continuous at a.e. point x0 ∈ Rn . A point x0 where (A.10) holds is called a Lebesgue point of f . Given an interval [a, b] ⊂ R, a function F : [a, b] → R is absolutely continuous if, for every ε > 0, there exists δ > 0 such that, for any finite family of disjoint intervals [a1 , b1 ], . . . , [an , bn ] contained in [a, b] one has n i=1

(bi − ai ) < δ

implies

n i=1

|F (bi ) − F (ai )| < ε.

226

Appendix. Background Material

Theorem A.12 (Absolutely continuous functions). The following are equivalent. (i) F : [a, b] → R is absolutely continuous. (ii) There exists a function f ∈ L1 ([a, b]) such that F (x) = F (a) +

f (x) dx

for all x ∈ [a, b] .

[a,x]

If (i) and (ii) hold, then F is a.e. differentiable, with derivative F (x) = f (x) for a.e. x ∈ [a, b]. Next, consider two measurable subsets X ⊆ Rm and Y ⊆ Rn , so that the Cartesian product . X ×Y =

(x, y) ; x ∈ X, y ∈ Y

is a measurable subset of Rm+n . Given a function of two variables f : . X × Y → R, for each fixed x we consider the function y → f x (y) = f (x, y) of the variable y alone. Similarly, for each fixed y we consider the function . x → f y (x) = f (x, y) of the variable x alone. Theorem A.13 (Fubini). Let X ⊆ Rm , Y ⊆ Rn be measurable sets, and assume f ∈ L1 (X × Y ). Then • f y ∈ L1 (X) for a.e. y ∈ Y , • f x ∈ L1 (Y ) for a.e. x ∈ X, . • the integral function F (y) = X f y (x)dx is in L1 (Y ), . • the integral function G(x) = Y f x (y)dy is in L1 (X). Moreover, one has the identity .

/ / . f y (x)dx dy = f x (y) dy dx .

f (x, y) dxdy = X×Y

Y

X

X

Y

For detailed proofs of all the above theorems we refer to [F].

A.4. Integrals of functions taking values in a Banach space Let X be a Banach space with norm · . Let X ∗ be its dual space. Every x∗ ∈ X ∗ thus determines a linear continuous mapping x → x∗ , x from X into R. In this section we show how to construct the integral of a function f : [0, T ] → X.

A.4. Integrals of functions taking values in a Banach space

227

A function g : [0, T ] → X is simple if it has the form (A.11)

g(t) =

N

t ∈ [0, T ],

ui χ (t),

i=1

Ai

where, for each i = 1, . . . , N , ui ∈ X and Ai ⊆ [0, T ] is a measurable set. A function f : [0, T ] → X is strongly measurable if there exists a sequence of simple functions gk : [0, T ] → X such that gk (t) → f (t)

for a.e. t ∈ [0, T ].

Moreover, we say that f is summable if there exists a sequence of simple functions gk such that T gk (t) − f (t) dt → 0. (A.12) 0

A function f : [0, T ] → X is weakly measurable if, for each x∗ ∈ X ∗ , the scalar function t → x∗ , f (t) is measurable. A function f : [0, T ] → X is almost separably valued if there exists a subset N ⊂ [0, T ] with zero measure such that the set of images {f (t) ; t ∈ [0, T ] \ N } is separable (i.e., it admits a countable dense subset). Theorem A.14 (Pettis). A map f : [0, T ] → X is strongly measurable if and only if f is weakly measurable and almost separably valued. The integral of a function f with values in a Banach space is defined in two steps. If g is the simple function in (A.11), we define T N . g dt = ui m1 (Ai ). 0

i=1

Here m1 is the one-dimensional Lebesgue measure on the interval [0, T ]. If f is summable, we define T . f dt = lim 0

k→∞ 0

T

gk (t) dt

where (gk )k≥1 is any sequence of simple functions for which (A.12) holds. Theorem A.15 (Bochner). A strongly measurable function f : [0, T ] → X is summable if and only if the scalar function t → f (t) is summable. In

228

Appendix. Background Material

this case one has

T 0

f (t) dt ≤

T

Moreover, for every x∗ ∈ X ∗ , 6 7 T x∗ , f (t) dt = 0

f (t) dt .

0

T

8

9 x∗ , f (t) dt .

0

Proofs of the above theorems can be found in [Y].

A.5. Mollifications In the analysis of Sobolev spaces, an important technique is the approximation of a general function with smooth functions. This can be done by means of mollifications. The standard mollifier on Rn is defined as ⎧ ⎪ if |x| < 1 , ⎨ Cn exp |x|21−1 . (A.13) J(x) = ⎪ ⎩ 0 if |x| ≥ 1 , where the constant Cn is chosen so that Rn J(x) dx = 1 . For each ε > 0 we also define the rescaled function x . 1 (A.14) Jε (x) = n J . ε ε Notice that Jε ∈ Cc∞ (Rn ) and that Jε = 1 , Supp(Jε ) = {|x| ≤ ε}. Rn

Figure A.5.1. A standard mollifier J and its rescaling Jε .

A.5. Mollifications

229

Now let Ω ⊆ Rn be an open set and let f ∈ L1loc (Ω) be a locally integrable function. We then define the mollification of f in terms of a convolution: . fε = Jε ∗ f. Since f is defined on Ω and Jε has support in the ball centered at the origin with radius ε, this convolution is well defined at all points in the subset . Ωε = x ∈ Ω ; B(x, ε) ⊆ Ω . Indeed . fε (x) =

Jε (x − y)f (y) dy = B(x,ε)

Jε (z)f (x − z) dz . B(0,ε)

Theorem A.16 (Properties of mollifiers). Let Ω ⊆ Rn be an open set and let f ∈ L1loc (Ω). Then: (i) For every ε > 0 one has fε ∈ C ∞ (Ωε ) . (ii) As ε → 0, one has the pointwise convergence fε (x) → f (x) for a.e. x ∈ Ω . (iii) If f is continuous, then fε → f uniformly on compact subsets of Ω. (iv) If 1 ≤ p < ∞ and f ∈ Lploc (Ω), then fε → f in Lploc (Ω) . Proof. 1. Notice that each Ωε is open. Indeed, if the closed ball B(x0 , ε) is entirely contained in the open set Ω, the same holds for the ball B(x, ε), whenever |x − x0 | is sufficiently small. Let {e1 , . . . , en } be the standard orthonormal basis of Rn . Fix a point x ∈ Ωε and i ∈ {1, . . . , n}. If h is so small that x + hei ∈ Ωε , then the difference quotient is computed as

(A.15)

fε (x + hei ) − fε (x) h . / 1 1 x + hei − y x−y = n J −J f (y) dy . ε B(x,ε) h ε ε

Since the closed ball B(x, ε) is entirely contained in Ω, we have f ∈ L1 (B(x, ε)). Moreover, since J ∈ C ∞ , we have the uniform convergence . / 1 x−y x + hei − y x−y 1 ∂ lim J J −J = . h→0 h ε ε ε ∂xi ε Letting h → 0 in (A.15), we obtain the existence of the partial derivative ∂ ∂ fε (x) = Jε (x − y) f (y) dy . ∂xi B(x,ε) ∂xi

230

Appendix. Background Material

A similar argument shows that, for every multi-index α, the derivative D α fε exists and D α Jε (x − y) f (y) dy .

D α fε (x) = B(x,ε)

This proves (i). 2. By the Lebesgue differentiation theorem, for a.e. x ∈ Ω we have (A.16) lim − |f (y) − f (x)| dy = 0 . r→0

B(x,r)

If x is a point for which (A.16) holds, then 4 5 |fε (x) − f (x)| = Jε (x − y) f (y) − f (x) dy B(x,ε) 1 x − y ≤ n J f (y) − f (x) dy ε B(x,ε) ε f (y) − f (x) dy . ≤C − B(x,ε)

As ε → 0, the right-hand side goes to zero because of (A.16). This proves (ii).

Figure A.5.2. The sets Ωε ⊂ Ω and the compact sets K ⊂⊂ Kρ ⊂⊂ Ω, used in the analysis of mollifiers.

3. Assume that f is continuous, and let K ⊂ Ω be a compact subset. Then we can choose δ > 0 small enough so that the compact neighborhood . (A.17) Kρ = {x ∈ Rn ; d(x, K) ≤ ρ} is still contained inside Ω. Since f is uniformly continuous on the compact set Kρ , the previous calculations show that fε (x) → f (x) uniformly for x ∈ K. This proves (iii).

A.5. Mollifications

231

4. To prove (iv), assume 1 ≤ p < ∞ and f ∈ Lploc (Ω). Let K ⊂ Ω be a compact subset and choose ρ > 0 so that the compact neighborhood Kρ in (A.17) is still contained in Ω. We claim that, for every 0 < ε < ρ, fε Lp (K) ≤ f Lp (Kρ ) .

(A.18)

Indeed, for x ∈ K an application of Hölder’s inequality with q = |fε (x)| = Jε (x − y) f (y) dy B(x,ε)

≤

Jε (x − y)

p p−1

yields

p−1 1 p p Jε (x − y) |f (y)| dy

B(x,ε)

p−1 p

≤

K

p

Jε (x − y) dy B(x,ε)

Recalling that yields

1

B(x,ε)

Jε (x − y)|f (y)| dy p

Jε (x − y) dy = 1, for 0 < ε < ρ the above inequality

|fε (x)|p dx ≤

Jε (x − y)|f (y)| dy p

K

|f (y)| Kρ

dx

B(x,ε)

≤

.

B(x,ε)

Jε (x − y) dx dy =

p B(y,ε)

|f (y)|p dy . Kρ

This proves (A.18). Next, for any δ > 0, we choose g ∈ C(Kρ ) such that f − g Lp (Kρ ) < δ . Together with (A.18), this yields fε − f Lp (K) ≤ fε − gε Lp (K) + gε − g Lp (K) + g − f Lp (K) ≤ f − g Lp (Kρ ) + gε − g Lp (K) + g − f Lp (K) ≤ δ + gε − g Lp (K) + δ . Since g is continuous, by (ii) it follows that |gε − g| → 0 uniformly on the compact set Kρ . Hence lim supε→0 fε − f Lp (K) ≤ 2δ. Since δ > 0 was arbitrary, this proves (iv).

232

Appendix. Background Material

Corollary A.17. Let f ∈ L1loc (Ω) and assume f φ dx = 0 for every φ ∈ Cc∞ (Ω). Ω

Then f (x) = 0 for a.e. x ∈ Ω. Indeed, let x ∈ Ω be a Lebesgue point of f and let Jε be the standard mollifier. Taking φ(y) = Jε (x − y) and letting ε → 0, one obtains 0 = Jε (x − y)f (y) dy = J(x − y)f (y) dy → f (x). Ω

B(x,ε)

Hence f (x) = 0 for a.e. x ∈ Ω. A.5.1. Partitions of unity. Let S ⊆ Rn and let V1 , V2 , . . . be open sets that cover S, so that " S ⊆ Vk . k≥1

We say that a family of functions {φk ; k ≥ 1} is a smooth partition of unity subordinate to the sets Vk if the following hold. (i) For every k ≥ 1, φk : Rn → [0, 1] is a C ∞ function with support contained inside the open set Vk . (ii) Each point x ∈ S has a neighborhood which intersects the support of finitely many functions φk . Moreover (A.19) φk (x) = 1 for all x ∈ S . k

Note that, by assumption (ii), at each point x ∈ S the summation in (A.19) contains only finitely many nonzero terms.

Figure A.5.3. A partition of unity subordinate to the sets V1 , V2 , V3 .

A.6. Inequalities

233

Theorem A.18 (Existence of a smooth partition of unity). Let S ⊆ Rn and let {Vk ; k ≥ 1} be a family of open sets covering S. Then there exists a smooth partition of unity subordinate to the sets Vk .

A.6. Inequalities A.6.1. Convex sets and convex functions. A set Ω ⊆ Rn is convex if x, y ∈ Ω , θ ∈ [0, 1]

implies

θx + (1 − θ)y ∈ Ω.

In other words, if Ω contains two points x, y, then it also contains the entire segment joining x with y. Let Ω ⊂ Rn be a convex set. We say that a function f : Ω → R is convex if f (θx + (1 − θ)y) ≤ θ f (x) + (1 − θ)f (y)

(A.20)

whenever x, y ∈ Ω and θ ∈ [0, 1]. Notice that (A.20) holds if and only if the epigraph of f , i.e., the set (x, z) ∈ Ω × R ; z ≥ f (x) , is a convex subset of Rn × R. A twice differentiable function f : Rn → R is uniformly convex if its Hessian matrix of second derivatives is uniformly positive definite. This means that, for some constant κ > 0, n

fxi xj (x) ξi ξj ≥ κ

i,j=1

n

ξi2

for all x, ξ ∈ Rn .

i=1

Figure A.6.1. A convex function f and its epigraph. A support

hyperplane at the point x0 .

234

Appendix. Background Material

Theorem A.19 (Supporting hyperplanes). Let f : Rn → Rn be a convex function. Then for every x0 ∈ Rn there exists a (subgradient) vector ω such that f (x) ≥ f (x0 ) + ω, x − x0 for all x ∈ Rn . The hyperplane {(x, z) ; z = ω, x − x0 } ⊂ Rn+1 touches the graph of f at the point x0 and remains below this graph at all points x ∈ Rn . It is thus called a supporting hyperplane to f at x0 . Theorem A.20 (Jensen’s inequality). Let f : R → R be a convex function, and let Ω ⊂ Rn be a bounded open set. If u ∈ L1 (Ω) is any integrable function, then (A.21) f − u dx ≤ − f (u) dx . Ω

Ω

. 1 Here −Ω u dx = meas(Ω) udx is the average value of u on the set Ω, Ω . 1 while −Ω f (u) dx = meas(Ω) Ω f (u)dx is the average value of f (u). Notice that we do not require that the set Ω be convex. . Proof. Set u0 = −Ω u(y) dy. Then there exists a support hyperplane to the graph of f at the point u0 , say f (u) ≥ f (u0 ) + ω, u − u0 for some constant ω ∈ R and all u ∈ R. Hence 6 7 f (u(x)) ≥ f − u(y) dy dx + ω, u(x) − − u(y) dy . Ω

Ω

Taking the average value of both sides on the set Ω, we obtain (A.21). A.6.2. Basic inequalities. 1. Cauchy’s inequality. (A.22)

ab ≤

a 2 + b2 2

for all a, b ∈ R .

Indeed, 0 ≤ (a − b)2 = a2 + b2 − 2ab. √ √ For any ε > 0, replacing a with 2ε a and b with b/ 2ε, from (A.22) we obtain the slightly more general inequality (A.23)

ab ≤ εa2 +

2. Young’s inequality. (A.24)

ap bq ab ≤ + p q

b2 4ε

(a, b ∈ R , ε > 0) .

1 1 a, b > 0 , 1 < p < q < ∞ , + = 1 p q

.

A.6. Inequalities

235

u

Indeed, since the map f (u) = eu is convex, one has exp{ up + vq } ≤ ep + v e q . Therefore +

1 1 1 1 ap bq p q ln a+ln b p q ab = e ≤ eln a + eln b = = exp ln a + ln b + . p q p q p q 3. H¨ older’s inequality. If f ∈ Lp (Ω), g ∈ Lq (Ω) with 1 ≤ p, q ≤ ∞ 1 1 and p + q = 1, then (A.25) |f g| dx ≤ f Lp (Ω) g Lq (Ω) . Ω

Indeed, in the special case where f Lp (Ω) = g Lq (Ω) = 1, Young’s inequality yields 1 1 1 1 |f g| dx ≤ |f |p dx + |g|q dx = + = 1 = f Lp g Lq . p q p q Ω Ω Ω To cover the general case, we simply replace the functions f, g with f˜ = f / f Lp and g˜ = g/ g Lq , respectively. By induction, one can establish the following more general version of Hölder’s inequality. Let 1 ≤ p1 , . . . , pm ≤ ∞, with p11 + · · · + p1m = 1. Assume fk ∈ Lpk (Ω) for k = 1, . . . , m. Then m 3 (A.26) |f1 f2 · · · fm | dx ≤ fk Lpk (Ω) . Ω

k=1

4. Minkowski’s inequality. For any 1 ≤ p ≤ ∞ and f, g ∈ Lp (Ω), one has f + g Lp (Ω) ≤ f Lp (Ω) + g Lp (Ω) .

(A.27)

Indeed, in the case p = 1 the result is trivial. If p > 1, applying Hölder’s p inequality with exponents p and q = p−1 , one obtains f + g pLp (Ω) = |f + g|p dx = |f + g|p−1 (|f | + |g|) dx Ω

≤

|f + g| Ω

p (p−1)· p−1

Ω

p−1 ,

1

p

|f | dx p

dx

1 -

p

|g| dx p

+

Ω

p−1 p p = f + g L f + g p (Ω) L (Ω) L (Ω) . p−1 Dividing both sides by f + g L p (Ω) , we obtain (A.27).

Ω

p

236

Appendix. Background Material

5. Interpolation inequality. Let 1 ≤ p ≤ r ≤ q ≤ ∞, with 1 θ 1−θ = + r p q

for some θ ∈ [0, 1] .

If f ∈ Lp (Ω) ∩ Lq (Ω), then we also have f ∈ Lr (Ω) and 1−θ f Lr (Ω) ≤ f θLp (Ω) f L q (Ω) .

(A.28)

(1−θ)r Indeed, observing that θr = 1, by Hölder’s inequality one obtains p + q r |f | dx = |f |θr |f |(1−θ)r dx Ω

Ω

≤

|f |

p θr· θr

θr p

|f |

dx

Ω

q (1−θ)r· (1−θ)r

(1−θ)r q

dx

.

Ω

6. Discrete H¨ older and Minkowski inequalities. The inequalities (A.25) and (A.27) hold, more generally, when Ω is any measure space. In particular, one can take Ω = {1, . . . , n} with the counting measure. For any collection of numbers a1 , . . . , an and b1 , . . . , bn and for 1 ≤ p, q < ∞ with 1 1 older inequality p + q = 1, one has the discrete H¨ n

(A.29)

|ak bk | ≤

k=1

n

1 p

|ak |p

k=1

n

1 q

|bk |q

k=1

and the discrete Minkowski inequality n 1 1 n 1 n p p p (A.30) |ak + bk |p ≤ |ak |p + |bk |p . k=1

k=1

k=1

Given two vectors x = (x1 , . . . , xn ) and y = (y1 , . . . , yn ), using the discrete H¨ older inequality (A.29) with p = q = 2, one obtains the CauchySchwarz inequality for the inner product on Rn : |x, y| ≤ |x| |y| .

(A.31) Indeed,

1 n 1 n n n 2 2 |x, y| = xk yk ≤ |xk yk | ≤ |xk |2 |yk |2 = |x| |y| . k=1

k=1

k=1

k=1

A.7. Problems

237

A.6.3. A differential inequality. Theorem A.21 (Gronwall’s inequality). Let t → z(t) be a nonnegative, absolutely continuous function defined for t ∈ [0, T ]. Assume that the time d derivative z = dt z satisfies z (t) ≤ φ(t) z(t) + ψ(t)

for a.e. t ∈ [0, T ] ,

where φ, ψ ∈ L1 ([0, T ]) are nonnegative functions. Then t t t (A.32) z(t) ≤ e 0 φ(s) ds z(0) + e s φ(σ) dσ ψ(s) ds for all t ∈ [0, T ]. 0

Notice that the right-hand side of (A.32) is precisely the solution to the Cauchy problem Z (t) = φ(t) Z(t) + ψ(t) ,

Z(0) = z(0).

To prove (A.32), we write s s d − s φ(σ) dσ z(s) = e− 0 φ(σ) dσ z (s) − φ(s)z(s) ≤ e− 0 φ(σ) dσ ψ(s). e 0 ds Integrating over the interval [0, t], one finds t t s e− 0 φ(σ) dσ z(t) − z(0) ≤ e− 0 φ(σ) dσ ψ(s) ds . Multiplying by e

t 0

0 φ(σ) dσ

, we thus obtain (A.32).

A.7. Problems 1. Let {Ai ; i ∈ I} be an open covering of a compact metric space K. Prove that there exists ρ > 0 such that, for every x ∈ K, the ball B(x, ρ) is entirely contained in one of the sets Ai . 2. Let (xn )n≥1 be a sequence of points in a metric space E. Prove the following statements. (i) The sequence converges to a point x ¯ if and only if from every subsequence ¯. (xnj )j≥1 one can extract a further subsequence converging to x (ii) If d(xm , xn ) ≥ δ > 0 for all m = n, then no convergent subsequence can exist. (iii) Let E be complete and assume that, for every ε > 0, from any sequence one can extract a further subsequence (xnj )j≥1 such that lim sup d(xnj , xnk ) < ε. j,k→∞

Then the sequence admits a convergent subsequence.

238

Appendix. Background Material

3. Consider the function ⎧ ⎪ ⎨ f (x) =

1 x(ln x)2

if 0 < x