Contents
List of Figures
Basic Notation
1 Choice Principles
1.1 Axiom of Choice
1.2 Some Applications
1.3 Problems
2 Hilbert Spaces
2.1 Norms
2.2 Inner Products and Hilbert Spaces
2.3 Some Geometric Properties
2.4 Orthogonality
2.5 Orthogonal Sequences
2.6 Problems
3 Completeness, Completion and Dimension
3.1 Banach Spaces
3.2 Completion and Dimension
3.3 Separability
3.4 Problems
4 Linear Operators
4.1 Linear Transformations
4.2 Back to Matrices
4.3 Boundedness
4.4 Problems
5 Functionals and Dual Spaces
5.1 A Special Type of Linear Operators
5.2 Dual Spaces
5.3 The Bra-ket Notation
5.4 Problems
6 Fourier Series
6.1 The Space L2[−π, π]
6.2 Convergence Conditions for the Fourier Series
6.2.1 Sufficient Convergence Conditions for the Fourier Series in a Point
6.2.2 Conditions for Uniform Convergence for the Fourier Series
6.3 Problems
7 Fourier Transform
7.1 Convolution
7.2 L1 Theory
7.3 L2 Theory
7.4 Schwartz Class
7.5 Problems
8 Fixed Point Theorem
8.1 Some Applications
8.1.1 Neumann Series
8.1.2 Differential Equations
8.1.3 Integral Equations
8.1.4 Fractals
8.2 Problems
9 Baire Category Theorem
9.1 Baire Categories
9.2 Baire Category Theorem
9.3 Problems
10 Uniform Boundedness Principle
10.1 Problems
11 Open Mapping Theorem
11.1 Problems
12 Closed Graph Theorem
12.1 Problems
13 Hahn–Banach Theorem
13.1 Extension Theorems
13.2 Minkowski Functional
13.3 Separation Theorem
13.4 Applications of the Hahn–Banach Theorem
13.5 Problems
14 The Adjoint Operator
14.1 Hilbert Spaces
14.2 Banach Spaces
14.3 Problems
15 Weak Topologies and Reflexivity
15.1 Weak* Topology
15.2 Reflexive Spaces
15.3 Problems
16 Operators in Hilbert Spaces
16.1 Compact Operators
16.2 Normal and Self-Adjoint Operators
16.3 Problems
17 Spectral Theory of Operators on Hilbert Spaces
17.1 A Quick Review of Spectral Theory in Finite Dimensions
17.2 The Spectral Theorem for Compact Self-Adjoint Operators
17.3 Problems
18 Compactness
18.1 Metric Spaces
18.2 Compactness in Some Function Spaces
18.2.1 Space l2
18.2.2 Space of Continuous Functions
18.2.3 Lebesgue Spaces
18.3 Problems
Bibliography
Index


Gerardo R. Chacón, Humberto Rafeiro, Juan C. Vallejo
Functional Analysis
De Gruyter Graduate

Also of interest

Probability Theory and Statistical Applications. A Profound Treatise for Self-Study
Peter Zörnig, 2016
ISBN 978-3-11-036319-7, e-ISBN (PDF) 978-3-11-040271-1, e-ISBN (EPUB) 978-3-11-040283-4

Multivariable Calculus and Differential Geometry
Gerard Walschap, 2015
ISBN 978-3-11-036949-6, e-ISBN (PDF) 978-3-11-036954-0

Mathematics for the Physical Sciences Leslie Copley, 2014 ISBN 978-3-11-040945-1, e-ISBN (PDF) 978-3-11-040947-5, e-ISBN (EPUB) 978-3-11-042624-3

Abstract Algebra. An Introduction with Applications Derek J.S. Robinson, 2015 ISBN 978-3-11-034086-0, e-ISBN (PDF) 978-3-11-034087-7, e-ISBN (EPUB) 978-3-11-038560-1

Distribution Theory. Convolution, Fourier Transform, and Laplace Transform Gerrit van Dijk, 2013 ISBN 978-3-11-029591-7, e-ISBN (PDF) 978-3-11-029851-2

Gerardo R. Chacón, Humberto Rafeiro, Juan C. Vallejo

Functional Analysis

A Terse Introduction

Mathematics Subject Classification 2010
35-02, 65-02, 65C30, 65C05, 65N35, 65N75, 65N80

Authors
Gerardo Chacón
Gallaudet University
Department of Science Technology & Mathematics
800 Florida Ave. NE
Washington DC 20002
USA
[email protected]

Humberto Rafeiro
Pontificia Universidad Javeriana
Departamento de Matemáticas
Bogotá
Colombia
[email protected]

Juan Camilo Vallejo
Pontificia Universidad Javeriana
Departamento de Matemáticas
Bogotá
Colombia
[email protected]

ISBN 978-3-11-044191-8
e-ISBN (PDF) 978-3-11-044192-5
e-ISBN (EPUB) 978-3-11-043364-7

Library of Congress Cataloging-in-Publication Data
A CIP catalog record for this book has been applied for at the Library of Congress.

Bibliographic information published by the Deutsche Nationalbibliothek
The Deutsche Nationalbibliothek lists this publication in the Deutsche Nationalbibliografie; detailed bibliographic data are available on the Internet at http://dnb.dnb.de.

© 2017 Walter de Gruyter GmbH, Berlin/Boston
Typesetting: Integra Software Services Pvt. Ltd.
Printing and binding: CPI books GmbH, Leck
Cover image: provided by the authors using TikZ code
Printed on acid-free paper
Printed in Germany
www.degruyter.com

To Yinzú, thank you for living with my ravings. G.C. To Daniela, for her love and patience. H.R.

Preface

This book is intended to make a smooth transition from linear algebra to the basics of functional analysis. When possible, we present concepts starting with a finite-dimensional example using matrix calculations and finite bases, and we build up from that point to more complicated examples in infinite-dimensional spaces, emphasizing the main differences between the two cases. Although we rely upon the reader's knowledge of basic linear algebra and vector spaces, we go back and review some relevant concepts and techniques that will be needed in several parts of the book. Throughout the book, there are several examples and proofs that are left for the reader to complete. This is done with the objective of requiring active participation from the reader, which we expect will result in better insight into the problems and techniques involved in the theory.

The main prerequisite for this book is a proof-based course in linear algebra. A basic course in topology is preferred but not strictly necessary. Although measure-theoretic examples are presented in several parts of the book, these can be safely disregarded in a first reading; for the necessary background we recommend Refs [8, 18].

In Chapter 1 we give a brief introduction to the axiom of choice and its equivalent formulations, emphasizing the usage of the principle. We give some examples which highlight the rationale in the handling of the axiom of choice. Among these examples are the proofs that every vector space has a Hamel basis, the existence of sets which are not Lebesgue measurable, and a whimsical version of the Banach–Tarski paradox.

Chapter 2 consists of an introduction to the theory of Hilbert spaces. It starts by defining the notion of norm as a way of measuring vectors and, consequently, distances between vectors. Then ℓp spaces are introduced and the Minkowski and Hölder inequalities are proven in order to define the ‖ ⋅ ‖p norm.
We then notice that in the finite-dimensional context, the case p = 2 coincides with the Euclidean way of measuring vectors, which comes from the Pythagorean Theorem. We then exhibit the necessity of having a way of measuring angles in a vector space, and inner products are introduced along with several examples and geometric properties such as the parallelogram identity and the polarization identities. We finish the chapter by studying orthogonal sequences and introducing the notion of Fourier coefficients and their properties.

In Chapter 3 we start by presenting examples of Banach spaces and show that in finite dimensions all norms are equivalent. We also show that compactness of the closed unit ball is a property that characterizes finite-dimensional vector spaces. We describe some separable spaces and their relation to Schauder bases.

Chapter 4 contains the definition and examples of linear transformations on several vector spaces. Then we go back to the study of matrices as linear transformations


acting on finite-dimensional spaces. We show that in this case continuity is an automatic property, and that this is a main difference from the infinite-dimensional case. We calculate the norm of some concrete bounded linear operators and show that the space of linear operators with range in a Banach space is itself a Banach space. We then finish the chapter with an extension theorem.

After studying general linear operators, Chapter 5 is devoted to studying the special case of linear functionals: first by showing examples and then, in the case of Hilbert spaces, by identifying a way to represent all linear functionals. Next, we look for a similar representation in Banach spaces, motivating the notion of dual spaces and considering examples. The chapter finishes with a section presenting the bra-ket notation as a way of distinguishing between vectors in a Hilbert space and vectors in its dual.

Chapter 6 is an optional chapter focused on the theory of Fourier coefficients of functions in the space L2[−π, π]. It starts by studying such coefficients from the point of view of a Hilbert space and then passes to the more general space L1[−π, π], in which the question of convergence of the Fourier series arises. Sufficient conditions on a function f are given to ensure the pointwise and the uniform convergence of its Fourier series. This chapter uses tools from measure theory and the topology of metric spaces, and it can be disregarded if the reader does not have the necessary background.

Chapter 7 is also optional and relies heavily on measure theory and integration. We introduce the notion of the convolution operator and obtain some immediate properties of this operator. The Young inequality for the convolution is also given. The notions of Dirac sequences and Friedrichs mollifiers are introduced, and it is shown that convolution with a Friedrichs mollifier generates an approximate identity in the framework of Lebesgue spaces.
The Fourier transform is introduced in the case of L1 functions and then, using the Plancherel Theorem, we extend the notion of the Fourier transform to the space of square-integrable functions. Many properties of the Fourier transform are derived, e.g., its behavior under translation, modulation and convolution, and the uncertainty principle, among others. We end the chapter with a brief introduction to the Schwartz class of functions, which is the natural environment for the Fourier transform.

In Chapter 8 we study the Fixed Point Theorem in the realm of metric spaces. We give classical applications, e.g., the Babylonian algorithm, Newton's method, and applications in the framework of differential and integral equations, to name a few. We also include a section touching on the subject of fractals, where we prove the Hutchinson Theorem.

Chapter 9 is devoted to the Baire Category Theorem. Although this theorem belongs to topology, it has plenty of applications in many branches of mathematics, since it is an existence theorem. We give Weierstrass's classical example of a continuous nowhere differentiable function and then give a very short proof of the existence of such functions using the Baire Category Theorem, in fact showing that the set of such functions is generic.


Chapters 10, 11, 12 and 13 are very important, since they contain the four pillars of functional analysis: the uniform boundedness principle, the Open Mapping Theorem, the Closed Graph Theorem and the Hahn–Banach Extension Theorem. All of these results are existence results and have many applications in analysis and elsewhere. In Chapter 10 we give two different proofs of the uniform boundedness principle, one of which does not rely on the Baire Category Theorem. As a particular case we obtain the Banach–Steinhaus Theorem. We introduce the notion of a bilinear operator and show that a bilinear operator which is continuous in each coordinate is continuous. In Chapter 11 we define the notion of an open mapping and prove the Open Mapping Theorem, which states that a surjective bounded linear operator between Banach spaces is open. The so-called Banach Isomorphism Theorem is also derived. In Chapter 12 we study the notion of a closed operator and prove the Closed Graph Theorem, which loosely states that a closed operator is continuous under certain conditions.

Chapter 13 is devoted to the study of Hahn–Banach-type theorems. We give the real and complex extension versions of the Hahn–Banach Theorem based on the Banach functional and show the classical corollaries of this fact, e.g., the Hahn–Banach extension in norm form and the construction of a continuous linear functional with prescribed conditions, to name a few. The Minkowski functional is studied in some detail, and it is pointed out that with it a seminorm can be introduced in any vector space. Using the notion of the Minkowski functional we study the separation theorems. We end the chapter with some applications of the Hahn–Banach Theorem, e.g., the result that every separable space is isometrically isomorphic to a subspace of ℓ∞, and with the Lax–Milgram Theorem.
In Chapter 14 we consider the concept of adjoint operators, starting again from the knowledge of matrices and motivating the definition with the relation between a real matrix and its transpose. The rigorous definition for Hilbert spaces is then presented and some properties are considered. In the case of Banach spaces, a new definition is given based on the modifications that are needed relative to the Hilbert space case. Properties and examples are also shown.

The weak topology is introduced in Chapter 15 as the smallest topology that makes all linear functionals continuous. Comparisons between the weak and norm topologies are presented, and the weak∗ topology is also presented. The chapter finishes with a section devoted to reflexive spaces, showing the classical example of the ℓp spaces and the counterexample of the space ℓ1.

Chapter 16 is dedicated to describing compact, normal and self-adjoint operators acting on Hilbert spaces, showing examples and properties in preparation for Chapter 17. There, a brief introduction to spectral theory is given, starting from a review of the situation in finite-dimensional complex spaces, where a linear transformation can be represented by a diagonal matrix if, and only if, it is normal. This serves as motivation to look for a similar result in general Hilbert spaces. The chapter ends by proving the spectral theorem for compact self-adjoint operators. In Chapter 18


we study in some detail the notion of compactness in metric spaces, give several related notions, and provide some compactness criteria for certain function spaces.

In the preparation of this book we made extensive use of some works, namely Refs [20, 26, 24, 23, 30, 5, 21]. Many of the results, examples and proofs in the text are also taken from our personal notebooks, and the exact references were lost.

Washington, USA
Bogotá, Colombia

G. Chacón, H. Rafeiro and J.C. Vallejo
December 2016


List of Figures
2.1 Parallelogram identity
2.2 Law of cosines
3.1 Unit ball with respect to the norm ‖ ⋅ ‖1
3.2 Unit ball with respect to the norm ‖ ⋅ ‖2
3.3 Unit ball with respect to the norm ‖ ⋅ ‖∞
6.1 ε-net
7.1 Example of a Dirac sequence
8.1 Sierpinski triangle
8.2 Examples of iteration for fixed point
8.3 Sierpinski triangle obtained as a fixed point of a map
8.4 ε-collars of two sets
8.5 The Cantor ternary set C
8.6 First step of the self-similar Cantor set
13.1 Projection of a point onto the unit sphere
13.2 Two sets that cannot be separated by a hyperplane
13.3 Strictly convex normed space
18.1 Totally bounded set
18.2 ε-net
18.3 Lebesgue number
18.4 Sequence of continuous functions converging pointwise but not uniformly
18.5 Grid

Basic Notation

Here we review the basic notation that will be used throughout the book. In general, we write 𝔽 for a field which is either ℝ, the set of all real numbers, or ℂ, the set of complex numbers. Given a natural number n, 𝔽n denotes the vector space of all n-tuples (α1, . . . , αn), where the αj belong to either ℝ or ℂ. The symbol [αij]ni,j=1 denotes an n × n matrix with entries αij. The set of natural numbers will be denoted by ℕ. The absolute value of a real number α will be denoted by |α|. Similarly, if z is a complex number, then we will use the symbol |z| to denote the absolute value, or modulus, of z. If z = a + ib, then the complex conjugate will be denoted by z̄ = a – ib. Given a topological space X and a set E ⊆ X, we denote by Ē the closure of the set E. If there is a metric d defined on X, then Br(x) will denote the open ball centered at x and with radius r; similarly, B̄r(x) denotes the closed ball centered at x and with radius r. Sequences of elements will be denoted by (xn)∞n=1, and with the symbol xn → x we indicate that the limit of xn is equal to x when n tends to infinity; the topology in which such limits are considered should be clear from the context. The symbols sup(A) and inf(A) will denote, respectively, the supremum and the infimum of a set A ⊆ ℝ. We use the symbol ⊘ to indicate the end of an example or a remark. Similarly, the symbol ◻ will denote the end of a proof. Finally, references to the books in the bibliography will be given in square brackets.

1 Choice Principles

Learning Targets
✓ An introduction to the axiom of choice.
✓ Learn some choice principles which are equivalent to the axiom of choice.
✓ Get acquainted with these choice principles and understand how to use them in concrete situations.

1.1 Axiom of Choice

The axiom of choice is a device used when we need to iterate some process infinitely, e.g., when we need to choose infinitely many elements from a set. One of the formulations of the axiom uses the notion of a choice function and the others rely on the Cartesian product.

Definition 1.1 (Choice Function). Let X be a nonempty set. A function f : 2^X → X is said to be a choice function on the set X if f(A) ∈ A whenever ∅ ≠ A ⊆ X.

With the notion of a choice function we can state the axiom of choice.

Axiom 1.2 (Axiom of Choice). For every nonempty set there exists a choice function.

The axiom of choice can also be phrased as: For every family A of disjoint nonempty sets there exists a set B which has exactly one element in common with each set belonging to A.
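When every set of the family carries a canonical rule for singling out an element (e.g., finite sets of numbers and the rule "take the minimum"), a choice function can be written down explicitly and no axiom is needed; the axiom of choice is required precisely when no such uniform rule is available. A minimal illustrative sketch in Python (the rule min is our own choice, not part of the text):

```python
def f(A):
    """Explicit choice function on nonempty finite subsets of the integers.

    f(A) is an element of A for every nonempty A, as Definition 1.1 requires;
    here the uniform rule "take the minimum" plays the role of the choice.
    """
    if not A:
        raise ValueError("choice functions are only constrained on nonempty sets")
    return min(A)

family = [{3, 7}, {42}, {-1, 0, 5}]
picks = [f(A) for A in family]          # one element chosen from each set
assert all(f(A) in A for A in family)   # the defining property of Definition 1.1
```

No analogous explicit rule exists, for instance, for arbitrary subsets of ℝ; that is the situation in which the axiom is invoked.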

Although this axiom seems to be true, at least in the case of finitely many or countably many choices, in the case of uncountably many choices the situation is quite different. Even for countably many choices this is somewhat an illusion, since we have no way to guarantee that we can make all the countably many choices without resorting to some type of choice axiom. For example, to guarantee the existence of the natural numbers ℕ it is necessary to assume some type of axiom of infinity, as is done in Zermelo–Fraenkel set theory. The power of the axiom of choice lies in the fact that it permits us to choose an infinite number of things instantaneously. In Ref. [4] we have the following remark: Several mathematicians claimed that proofs involving the axiom of choice have a different nature from proofs not involving it, because the axiom of choice is a unique set theoretical principle


which states the existence of a set without giving a method of defining (“constructing”) it, i.e. is not effective.

At the beginning of the twentieth century there were disputes among several renowned mathematicians regarding the acceptance of this axiom. One of the serious obstacles is the so-called Banach–Tarski paradox.

Banach–Tarski Paradox: The unit ball B := {(x, y, z) ∈ ℝ³ : x² + y² + z² ≤ 1} in three dimensions can be disassembled into a finite number of pieces (in fact, just five pieces suffice), which can then be reassembled (after translating and rotating each of the pieces) to form two disjoint copies of the ball B.

For more information regarding the Banach–Tarski paradox, see Ref. [41]. In Theorem 1.10 we give a whimsical version of the Banach–Tarski paradox. Nowadays the axiom of choice is accepted by the majority of the mathematical community and we will add it to our mathematical toolbox without further philosophical discourse. For an account of the history of the axiom of choice see Ref. [29]. From now on, throughout the whole book, when we use the tribar symbol before a result, it means that the result relies on the axiom of choice or one of its equivalent formulations.

Definition 1.3. Let (Xj)j∈J be a family of sets. The Cartesian product, denoted by ∏j∈J Xj, is the set of all maps x : J → ∪j∈J Xj such that x(j) ∈ Xj for every j ∈ J.

In the next theorem we collect several choice principles that are equivalent to the axiom of choice, but before doing that we will need a couple of definitions.

Definition 1.4. Let (X, ≼) be a poset (partially ordered set). We say that a subset S, with ∅ ≠ S ⊆ X, is a chain in X if all the elements of S are related by the partial order of X, i.e. if for all x, y ∈ S we have either x ≼ y or y ≼ x.

Definition 1.5. A poset X is well ordered if any nonempty subset of X has a minimum element, viz. if min(A) exists whenever ∅ ≠ A ⊆ X.

The following theorem will be given without proof; for the proof and further results, cf. Refs [9, 16, 17].

Theorem 1.6. The following principles are equivalent:


Axiom of Choice: For every nonempty set there exists a choice function.
Axiom of Choice for the Cartesian Product: The Cartesian product of nonempty sets is nonempty.
Hausdorff's Maximal Chain Condition: Each partially ordered set contains a maximal chain.
Kuratowski–Zorn Lemma: If in a poset X each chain has an upper bound, then X has a maximal element.
Zermelo Theorem: Every nonempty set can be well ordered.

1.2 Some Applications

As a first application of the axiom of choice we show that there are sets which are not Lebesgue measurable. In fact we need to be careful in phrasing the existence of non-Lebesgue-measurable sets, since this depends on the model of set theory that we are using. The celebrated result of Solovay [37] affirms that there are models of set theory (devoid of the axiom of choice) in which every subset of the real numbers is Lebesgue measurable! In this book we always use the standard Zermelo–Fraenkel set theory with the axiom of choice. We now use the well-known Vitali construction to obtain the desired nonmeasurable set.

Theorem 1.7 (Vitali's Set). There exists a subset of ℝ which is not Lebesgue measurable.

Proof. We define an equivalence relation ∼ on ℝ by setting x ∼ y if x – y is a rational number. This splits ℝ into an uncountable aggregate C of pairwise disjoint classes. The sets of C are of the form x + ℚ for some x ∈ ℝ. Invoking the axiom of choice, there is a set V ⊆ (0, 1) which has one and only one point in common with each C ∈ C. The constructed set V, designated the Vitali set, is not Lebesgue measurable. The nonmeasurability of V follows from the translation invariance of the Lebesgue measure and the fact that bounded sets have finite measure. Let (qn) be an enumeration of the rational numbers in (–1, 1), from which it follows that the sets Vn := qn + V are pairwise disjoint and that

(0, 1) ⊆ ⨆n Vn ⊆ (–1, 2),   (1.1)

where ⨆ stands for the union of pairwise disjoint sets. To obtain a contradiction, we assume that V is Lebesgue measurable. Since a translate of a Lebesgue measurable set is measurable with the same measure, the sets Vn are measurable. From eq. (1.1) and the equality m(⊔n Vn) = ∑n m(Vn), where m stands for the Lebesgue measure, we obtain a contradiction in either case: if m(V) = 0, then m(⊔n Vn) = 0, which is impossible since ⊔n Vn contains (0, 1); if m(V) > 0, then m(⊔n Vn) = ∑n m(V) = ∞, which is impossible since ⊔n Vn is contained in the bounded set (–1, 2). ◻


The previous proof relies on the axiom of choice, but the role of the choice principle there is not so transparent. The next example will be clearer. We will show that a vector space always has a basis, in the sense of a Hamel basis. Before stating and proving such a statement, we give a naive heuristic approach to the result. First let us choose an arbitrary element x1 of the vector space V and take I1 = {x1}. If the span of I1 generates V, we have found our basis. If not, we choose some element x2 of V not spanned by I1 and define I2 = I1 ∪ {x2}. Again we check whether the span of I2 generates V. In the affirmative case we have found our basis; otherwise we continue the process. Using this algorithm we produce bigger and bigger nested linearly independent sets I1 ⊆ I2 ⊆ ⋅ ⋅ ⋅ ⊆ In ⊆ ⋅ ⋅ ⋅. If the process stops at some n, then the vector space V has finite dimension. But what happens if the algorithm runs ad infinitum? Here we need to invoke some type of choice principle, which enables us to choose an infinite number of elements instantaneously.

Theorem 1.8. Every vector space has a Hamel basis.

Proof. We will show a somewhat stronger statement, namely: given any linearly independent subset I0 of a vector space V, there exists a basis B of V which contains I0. Let

T := {I ⊆ V : I0 ⊆ I and I is linearly independent in V}.

Note that T ≠ ∅ since I0 ∈ T. By the Hausdorff maximal chain condition, there exists a maximal chain Cmax in T with respect to inclusion; set Bmax := ∪Cmax. We first show that the union of any chain C in T is linearly independent, so that in particular Bmax ∈ T. Take n elements vk in ∪C∈C C and n scalars αk ∈ ℝ such that α1v1 + α2v2 + ⋅ ⋅ ⋅ + αnvn = 0; we want to show that α1 = α2 = ⋅ ⋅ ⋅ = αn = 0. Each vk lies in some Ck ∈ C and, since C is a chain, we can find an element C0 ∈ C which contains all the Ck, k = 1, . . . , n. As C0 is linearly independent, all the coefficients αk must be zero. Moreover, Bmax is a maximal element of T: if some I ∈ T properly contained Bmax, then Cmax ∪ {I} would be a chain strictly larger than Cmax.

To finish the proof, we need to show that Bmax is indeed a basis for the vector space V. Suppose it is not; then there exists v ∈ V such that

v ≠ α1v1 + α2v2 + ⋅ ⋅ ⋅ + αnvn   (1.2)

for all vk ∈ Bmax and αk ∈ ℝ. Choose vk ∈ Bmax and scalars αk in such a way that α0v + α1v1 + α2v2 + ⋅ ⋅ ⋅ + αnvn = 0. If α0 = 0, then αk = 0 for all k, since Bmax is linearly independent. If α0 ≠ 0, we can write v in terms of the elements of Bmax, which contradicts eq. (1.2). It follows that Bmax ∪ {v} is linearly independent, contradicting the maximality of Bmax, which ends the proof. ◻

The fact that every vector space has a Hamel basis is not very useful per se, since the proof is not constructive and the cardinality of the basis can be uncountable. In spite of that, we can again show that there are nonmeasurable sets, following an approach different from that of Theorem 1.7.

Theorem 1.9 (Nonmeasurable Set). There exists a subset of ℝ which is not Lebesgue measurable.

Proof. Let us take (ℝ, ℚ, +, ⋅), the vector space ℝ over the field ℚ. By Theorem 1.8 this vector space has a Hamel basis B, which means that for every v ∈ ℝ there exist finitely many elements of B, say v1, v2, . . . , vn, and q1, . . . , qn ∈ ℚ such that

v = q1v1 + q2v2 + ⋅ ⋅ ⋅ + qnvn,

where the qk and vk are uniquely determined. Let us fix v0 ∈ B and define

V0 = {x ∈ ℝ : the coefficient of v0 is null when x is written in the basis B},

which means that

ℝ = ⨆r∈ℚ (V0 + rv0),   where   A + rv0 := {rv0 + a : a ∈ A}.

By (ℝ, L, λ) we mean the measure space with the Lebesgue measure λ. Suppose that V0 ∈ L. Since the Lebesgue measure preserves the measure of translated sets, we obtain that V0 + rv0 ∈ L for all r ∈ ℚ. If λ(V0) = 0, we have that λ(V0 + rv0) = 0 for all r ∈ ℚ, which gives a contradiction, since it would imply λ(ℝ) = λ(⊔r∈ℚ (V0 + rv0)) = 0. Now let us take λ(V0) ≠ 0. Since ℝ = ∪k∈ℤ [k/2, (k + 1)/2], there exists k0 such that λ(V0 ∩ [k0/2, (k0 + 1)/2]) = c ∈ (0, 1/2]. Let

C = {r ∈ ℚ : rv0 ∈ [–k0/2, (1 – k0)/2]},

which is an infinite set, and note that r ∈ C implies rv0 + [k0/2, (k0 + 1)/2] ⊆ [0, 1]. Since λ((V0 ∩ [k0/2, (k0 + 1)/2]) + rv0) = c for all r ∈ C, due to the invariance of the Lebesgue measure under translation, and since

⨆r∈C ((V0 ∩ [k0/2, (k0 + 1)/2]) + rv0) ⊆ [0, 1],

we obtain a contradiction: +∞ = ∑r∈C c ≤ λ([0, 1]) = 1. ◻
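In finite dimensions, the naive extension algorithm sketched before Theorem 1.8 needs no choice principle and can be run mechanically: keep adjoining vectors that are not in the span of those already chosen. A minimal Python sketch (the helper names is_independent and extend_to_basis are our own; independence is tested by plain Gaussian elimination):

```python
def is_independent(vectors, tol=1e-9):
    """Check linear independence of a list of vectors via Gaussian elimination."""
    rows = [list(map(float, v)) for v in vectors]
    rank, n = 0, (len(rows[0]) if rows else 0)
    for col in range(n):
        # find a pivot for this column among the not-yet-used rows
        pivot = next((i for i in range(rank, len(rows)) if abs(rows[i][col]) > tol), None)
        if pivot is None:
            continue
        rows[rank], rows[pivot] = rows[pivot], rows[rank]
        for i in range(len(rows)):
            if i != rank and abs(rows[i][col]) > tol:
                factor = rows[i][col] / rows[rank][col]
                rows[i] = [a - factor * b for a, b in zip(rows[i], rows[rank])]
        rank += 1
    return rank == len(rows)  # full row rank <=> linearly independent

def extend_to_basis(independent, candidates):
    """Greedily extend an independent set, scanning `candidates` for new directions."""
    basis = list(independent)
    for v in candidates:
        if is_independent(basis + [v]):
            basis.append(v)
    return basis

I0 = [(1, 1, 0)]                                   # the given independent set
e = [(1, 0, 0), (0, 1, 0), (0, 0, 1)]              # standard basis of R^3 as candidates
B = extend_to_basis(I0, e)                         # a basis of R^3 containing I0
```

In the infinite-dimensional case this loop never terminates, which is exactly where the Hausdorff maximal chain condition replaces the iteration.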

We end this digression on the axiom of choice by proving a result that is, in the words of T. Tao, an extremely whimsical version of the Banach–Tarski paradox.

Theorem 1.10. There is a subset of the interval [0, 2] which can be disassembled into a countable number of disjoint pieces, which can be reassembled to form the real line ℝ just by translating the pieces.

Proof. The proof is based on the Vitali set construction. We define an equivalence relation ∼ on [0, 1] by setting x ∼ y if x – y is a rational number. This splits [0, 1] into an uncountable collection C of pairwise disjoint countable classes. Invoking the axiom of choice, there is a set V ⊆ [0, 1] which has one and only one point in common with each C ∈ C. We now take

W = ⨆q∈ℚ∩[0,1] (q + V),

from which we have that W ⊆ [0, 2]. Since there is a bijection φ : ℚ ∩ [0, 1] → ℚ, we obtain the real line ℝ by translating each piece q + V by φ(q) – q, namely

ℝ = ⨆q∈ℚ∩[0,1] (φ(q) + V). ◻
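The last step of the proof uses only the existence of a bijection φ between ℚ ∩ [0, 1] and ℚ, i.e., that both sets are countably infinite. Countability of the rationals can be made concrete by the classical diagonal enumeration; a small Python sketch (illustrative only, not part of the proof):

```python
from fractions import Fraction
from itertools import islice

def positive_rationals():
    """Diagonal (Cantor-style) enumeration of the positive rationals, no repeats.

    Traverse p/q along the anti-diagonals p + q = n, skipping values seen before.
    Every positive rational appears exactly once, so the map k -> k-th output
    is a bijection from the natural numbers onto the positive rationals.
    """
    seen = set()
    n = 2
    while True:
        for p in range(1, n):
            f = Fraction(p, n - p)
            if f not in seen:
                seen.add(f)
                yield f
        n += 1

first = list(islice(positive_rationals(), 6))  # 1, 1/2, 2, 1/3, 3, 1/4
```

Composing two such enumerations (one of ℚ ∩ [0, 1], one of ℚ) yields the bijection φ used in the theorem.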

1.3 Problems

1.1. A function f : [a, b] → ℝ is said to be Heine continuous (also known as sequentially continuous) at the point x = ξ if for every sequence (xn)n∈ℕ with xn ∈ [a, b] and limn→∞ xn = ξ we have limn→∞ f(xn) = f(ξ). A function f : [a, b] → ℝ is said to be Cauchy continuous (also known as ε-δ continuous) at the point x = ξ if for all ε > 0 there exists δ > 0 such that |f(x) – f(ξ)| < ε whenever |x – ξ| < δ. Prove that
(a) Heine continuity at a point implies Cauchy continuity.
(b) Cauchy continuity at a point implies Heine continuity.
1.2. Find a function f : ℝ → ℝ which is linear and discontinuous!
1.3. We say that a function F : X → ℝ has the Darboux property if F(X) = I, where I is some interval in ℝ. Let (X, Σ, μ) be a measure space. An atom is a set A ∈ Σ such that μ(A) > 0 and such that for any B ∈ Σ with B ⊆ A we have the dichotomy μ(B) = 0 or μ(B) = μ(A). A measure is called nonatomic when it has no atoms. Prove Sierpiński's Theorem: nonatomic finite measures have the Darboux property.
1.4. A family F of nonempty subsets of X is said to be a filter in X if
(a) A, B ∈ F implies that A ∩ B ∈ F;
(b) A ∈ F and A ⊆ B ⊆ X implies that B ∈ F.
A maximal filter is said to be an ultrafilter. Prove the Ultrafilter Lemma: for every filter F there exists an ultrafilter Fmax ⊃ F.

2 Hilbert Spaces

Learning Targets
✓ Familiarize with the concept of norm and some examples.
✓ Understand inner products as a tool to study the geometry of some vector spaces.
✓ Use orthogonal bases to obtain a simple representation of vectors in Hilbert spaces.

2.1 Norms

In this chapter we will introduce a concept that will be used throughout the rest of the book, and which is probably one of the most important in functional analysis. The idea is to find a way to generalize the concept of the absolute value from the setting of the real numbers. The reader may be familiar with this concept from calculus and/or linear algebra courses. We want to “synthesize” the good properties of the absolute value so that an abstract definition can be given in the generality of a vector space.

Definition 2.1. Let V be a vector space over a field 𝔽. A norm on V is a function ‖ ⋅ ‖ : V → ℝ satisfying the following properties:
(a) For all v ∈ V, ‖v‖ ≥ 0.
(b) For all v ∈ V, ‖v‖ = 0 if, and only if, v = 0.
(c) For all λ ∈ 𝔽 and all v ∈ V, ‖λv‖ = |λ| ‖v‖.
(d) (Triangle Inequality) For all v, w ∈ V, ‖v + w‖ ≤ ‖v‖ + ‖w‖.
The pair (V, ‖ ⋅ ‖) is called a normed space.

Remark 2.2. If we replace property (b) above by
(b)′ For all v ∈ V, if v = 0 then ‖v‖ = 0,
we say that ‖ ⋅ ‖ is a seminorm. In this case, the pair (V, ‖ ⋅ ‖) is called a seminormed space. ⊘

As a consequence of the triangle inequality, it can be seen that, for every v, w ∈ V, the inequality |‖v‖ − ‖w‖| ≤ ‖v − w‖ holds. This fact implies that the norm is a continuous function. Having a way to generalize the absolute value gives us a way of measuring distances. Concretely, every normed space (V, ‖ ⋅ ‖) induces a metric space (V, d) by defining d : V × V → ℝ as d(v, w) = ‖v − w‖ for every v, w ∈ V. With this metric we can define a topology on V by means of open balls.

Given a positive number r and an element v of the vector space V, recall that the open ball centered at v with radius r is defined as Br(v) = {w ∈ V : ‖v − w‖ < r}. We give some examples of normed spaces in what follows, leaving most of the details as exercises.

Example 2.3. Let V = ℝ, equipped with the norm ‖x‖ = |x|. This is the normed space the reader is probably most familiar with, even if the concept was not yet known by this name. This example conveys the idea that a norm measures distances, generalizing this simple notion to more complicated vector spaces. ⊘

Example 2.4. Let V = ℝⁿ, equipped with the norm

‖(x₁, . . . , xₙ)‖₂ = (∑_{j=1}^n xⱼ²)^{1/2}.

This is the usual norm in ℝⁿ. We will show in detail that this is indeed a norm. Properties (a), (b) and (c) of Definition 2.1 are straightforward, so we will focus on the triangle inequality. We would like to show that for any pair of vectors (x₁, . . . , xₙ) and (y₁, . . . , yₙ) in ℝⁿ we have

(∑_{j=1}^n (xⱼ + yⱼ)²)^{1/2} ≤ (∑_{j=1}^n xⱼ²)^{1/2} + (∑_{j=1}^n yⱼ²)^{1/2}.

But notice that

∑_{j=1}^n (xⱼ + yⱼ)² = ∑_{j=1}^n (xⱼ² + yⱼ² + 2xⱼyⱼ).        (2.1)

Thus, we need to find a way to estimate the term ∑_{j=1}^n 2xⱼyⱼ. For this, notice that since for any two real numbers a and b we have (a − b)² ≥ 0, it follows that ab ≤ (a² + b²)/2. Now we use this inequality as follows:

(∑_{j=1}^n xⱼyⱼ)² = ∑_{i,j=1}^n xᵢyᵢxⱼyⱼ ≤ ∑_{i,j=1}^n (xᵢ²yⱼ² + xⱼ²yᵢ²)/2 = ∑_{j=1}^n xⱼ² ∑_{j=1}^n yⱼ².

With this in hand, we can go back to eq. (2.1) and write

∑_{j=1}^n (xⱼ + yⱼ)² ≤ ∑_{j=1}^n xⱼ² + ∑_{j=1}^n yⱼ² + 2 (∑_{j=1}^n xⱼ² ∑_{j=1}^n yⱼ²)^{1/2} = ((∑_{j=1}^n xⱼ²)^{1/2} + (∑_{j=1}^n yⱼ²)^{1/2})²,

and the triangle inequality follows. ⊘
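The computation above is easy to probe numerically. The following sketch (plain Python, not from the text; the helper name `norm2` is ours) computes the ‖ ⋅ ‖₂ norm and tests the triangle inequality on random vectors.

```python
import math
import random

def norm2(x):
    """Euclidean norm ‖x‖₂ = (sum of squares)^(1/2)."""
    return math.sqrt(sum(t * t for t in x))

random.seed(0)
for _ in range(1000):
    n = random.randint(1, 10)
    x = [random.uniform(-5, 5) for _ in range(n)]
    y = [random.uniform(-5, 5) for _ in range(n)]
    s = [a + b for a, b in zip(x, y)]
    # Triangle inequality: ‖x + y‖₂ ≤ ‖x‖₂ + ‖y‖₂ (up to rounding).
    assert norm2(s) <= norm2(x) + norm2(y) + 1e-12
print("triangle inequality verified on 1000 random pairs")
```

Of course, a numerical check is no substitute for the proof; it only illustrates the inequality on sampled data.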

Example 2.5. As a way to generalize the previous example, we can define the vector space ℓ2 as the space of all real sequences (xₙ)_{n=1}^∞ such that ∑_{j=1}^∞ xⱼ² < ∞. On this space define the norm

‖(xₙ)_{n=1}^∞‖₂ = (∑_{j=1}^∞ xⱼ²)^{1/2}.

We leave it as an exercise to verify that the previous proof works, with the corresponding modifications, in this case. ⊘

Example 2.6. The previous example can be further generalized. In fact, with a small modification we can produce an infinite family of normed vector spaces. Let p ≥ 1, and define

ℓp = {(xₙ)_{n=1}^∞ : xₙ ∈ ℝ, ∑_{j=1}^∞ |xⱼ|^p < ∞}.

The following expression defines a norm on ℓp:

‖(xₙ)_{n=1}^∞‖ₚ = (∑_{j=1}^∞ |xⱼ|^p)^{1/p}.

Again, the only difficult property to prove is the triangle inequality. We will do so by resorting to a series of previous inequalities that are interesting in themselves. First, we try to follow the same calculations as in Example 2.4. Notice that for estimating the “mixed” terms, we used the simple inequality ab ≤ (a² + b²)/2, which holds for any real numbers a and b. In the general case, we will need a similar inequality. ⊘

Theorem 2.7 (Young’s Inequality). Let p, q > 1 satisfy

1/p + 1/q = 1.

Then, for any nonnegative real numbers a, b,

ab ≤ a^p/p + b^q/q.

We say that p and q are conjugate exponents.

Proof. There are several ways to prove this inequality. Here we will use the fact that the exponential function f(x) = eˣ satisfies f″(x) > 0 and consequently is a convex function. Hence, for all real numbers s and u, the segment through the points (s, eˢ) and (u, eᵘ) lies “above” the graph of the exponential function on the interval between s and u. In symbols, for every t ∈ [0, 1] we have

e^{ts + (1−t)u} ≤ t eˢ + (1 − t) eᵘ.

If a = 0 or b = 0 the inequality is trivial, so assume a, b > 0. Taking t = 1/p, s = log a^p and u = log b^q, we have

ab = e^{(1/p) log a^p + (1/q) log b^q} ≤ (1/p) e^{log a^p} + (1/q) e^{log b^q} = a^p/p + b^q/q,

and the inequality follows. ◻
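Young's inequality can likewise be spot-checked on random data; a minimal sketch in plain Python (illustrative only):

```python
import random

random.seed(0)
for _ in range(1000):
    p = random.uniform(1.01, 10.0)
    q = p / (p - 1)            # conjugate exponent: 1/p + 1/q = 1
    a = random.uniform(0.0, 10.0)
    b = random.uniform(0.0, 10.0)
    # Young's inequality: ab ≤ a^p/p + b^q/q (up to rounding).
    assert a * b <= a ** p / p + b ** q / q + 1e-9
print("Young's inequality holds on all sampled cases")
```

Equality occurs precisely when a^p = b^q, which the sampling almost never hits; the bound itself always holds.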

With this inequality in hand, we need to estimate the mixed terms ∑ xⱼyⱼ. This will be given by the following.

Theorem 2.8 (Hölder’s Inequality). Let p, q > 1 be conjugate exponents and let (xₙ) and (yₙ) be sequences of real numbers. Then

∑_{j=1}^∞ |xⱼyⱼ| ≤ (∑_{j=1}^∞ |xⱼ|^p)^{1/p} (∑_{j=1}^∞ |yⱼ|^q)^{1/q}.

Proof. If either of the series on the right of the inequality is divergent, then the inequality holds trivially. Suppose that both series converge; in other words, suppose that (xₙ) ∈ ℓp and (yₙ) ∈ ℓq. Moreover, we can assume that both series are nonzero, since otherwise the inequality would trivially hold. Define, for every i,

x̃ᵢ = xᵢ / (∑_{j=1}^∞ |xⱼ|^p)^{1/p},    ỹᵢ = yᵢ / (∑_{j=1}^∞ |yⱼ|^q)^{1/q}.

Then, by Young’s inequality,

∑_{i=1}^∞ |x̃ᵢ ỹᵢ| ≤ (1/p) ∑_{i=1}^∞ |x̃ᵢ|^p + (1/q) ∑_{i=1}^∞ |ỹᵢ|^q.

Writing this inequality in terms of xᵢ and yᵢ and using the fact that p and q are conjugate exponents, we prove the result. ◻

Now we are ready to prove the triangle inequality in ℓp.

Theorem 2.9 (Minkowski’s Inequality). Let (xₙ) and (yₙ) be two sequences in ℓp, p ≥ 1. Then

(∑_{j=1}^∞ |xⱼ + yⱼ|^p)^{1/p} ≤ (∑_{j=1}^∞ |xⱼ|^p)^{1/p} + (∑_{j=1}^∞ |yⱼ|^p)^{1/p}.

Proof. Notice that the case p = 1 is trivial. Suppose p > 1, and let q > 1 be such that 1/p + 1/q = 1 (which implies that p = (p − 1)q). Then, by Hölder’s inequality,

∑_{j=1}^∞ |xⱼ + yⱼ|^p ≤ ∑_{j=1}^∞ |xⱼ||xⱼ + yⱼ|^{p−1} + ∑_{j=1}^∞ |yⱼ||xⱼ + yⱼ|^{p−1}
  ≤ ((∑_{j=1}^∞ |xⱼ|^p)^{1/p} + (∑_{j=1}^∞ |yⱼ|^p)^{1/p}) (∑_{j=1}^∞ |xⱼ + yⱼ|^{(p−1)q})^{1/q}
  = ((∑_{j=1}^∞ |xⱼ|^p)^{1/p} + (∑_{j=1}^∞ |yⱼ|^p)^{1/p}) (∑_{j=1}^∞ |xⱼ + yⱼ|^p)^{1/q},

and the inequality follows after dividing both sides by (∑_{j=1}^∞ |xⱼ + yⱼ|^p)^{1/q}, since 1 − 1/q = 1/p. ◻

Example 2.10. We can complete the previous scale of examples by considering the endpoint cases. For p = 1, the space ℓ1 is the vector space of all real sequences (xⱼ)_{j=1}^∞ such that

‖(xⱼ)_{j=1}^∞‖₁ = ∑_{j=1}^∞ |xⱼ| < ∞.

It is not hard to see that ‖ ⋅ ‖₁ actually defines a norm. On the other hand, we can define ℓ∞ as the vector space of all bounded real sequences (xⱼ)_{j=1}^∞, with the norm

‖(xⱼ)_{j=1}^∞‖∞ = sup{|xⱼ| : j ∈ ℕ}.

We leave it as an exercise to verify that this is also a norm. ⊘

Example 2.11. This example requires some knowledge of measure theory. Let (X, M, μ) be a measure space and let 1 ≤ p < ∞. On the space of all measurable functions f : X → ℝ define the equivalence relation

f ∼ g   if and only if   f(x) = g(x) for almost every x ∈ X.

It is usual to identify a measurable function f with its equivalence class, and we will do so for the rest of the book. Define the space Lp = Lp(X, M, μ) as the vector space of measurable functions (equivalence classes) f such that

∫_X |f(x)|^p dμ < ∞,

equipped with the norm

‖f‖_{Lp} = (∫_X |f(x)|^p dμ)^{1/p}.

We leave it as an exercise to verify that this is actually a norm. Notice that the equivalence relation previously defined is necessary in order to have that ‖f‖ = 0 if and only if f ≡ 0. Notice also that this example actually contains Example 2.6; we just need to choose the appropriate measure space. In the case of ℓp, the appropriate space is (ℕ, 2^ℕ, μ), where 2^ℕ denotes the family of all subsets of ℕ and μ denotes the counting measure: for A ⊆ ℕ, μ(A) equals the number of elements of A if A is finite, and μ(A) = ∞ if A is infinite. ⊘

Example 2.12. We can extend the definition of Lp to the case p = ∞ by considering L∞ as the space of all measurable and essentially bounded functions f : X → ℝ, i.e., the space of measurable functions for which there exists M > 0 with the property that

|f(x)| < M   for almost every x ∈ X.

The norm in L∞ is defined as

‖f‖∞ = inf{M > 0 : |f(x)| < M for almost every x ∈ X}. ⊘

Example 2.13. The norms ‖ ⋅ ‖p, with 1 ≤ p ≤ ∞, can also be defined on other function spaces; for example, we will use the space C([0, 1]) of all continuous real-valued functions on [0, 1]. Notice that every function in C([0, 1]) is bounded and, consequently, ‖f‖p < ∞ for every 1 ≤ p ≤ ∞. ⊘
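As an illustration of these norms on C([0, 1]), the sketch below (plain Python; `lp_norm` is a hypothetical helper of ours using a midpoint-rule quadrature, not from the text) approximates ‖f‖_p for f(x) = x, whose exact value is (1/(p + 1))^{1/p}.

```python
def lp_norm(f, p, n=100000):
    """Approximate (∫₀¹ |f(x)|^p dx)^(1/p) by the midpoint rule."""
    h = 1.0 / n
    s = sum(abs(f((k + 0.5) * h)) ** p for k in range(n)) * h
    return s ** (1.0 / p)

f = lambda x: x  # a continuous function on [0, 1]
# Exact values: ‖f‖₁ = 1/2 and ‖f‖₂ = 1/√3 ≈ 0.5774; ‖f‖∞ would be 1.
print(round(lp_norm(f, 1), 4))   # → 0.5
print(round(lp_norm(f, 2), 4))   # → 0.5774
```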

2.2 Inner Products and Hilbert Spaces

In the previous section we started with some examples of normed spaces that were already well known, and then we proceeded to their generalization. Let us go back to those examples for a minute: we defined a norm on ℝⁿ in Example 2.4. This is a norm that we have seen before (maybe not with the same name), but remembering that a norm is nothing but a way of measuring distances, we know from linear algebra how to measure the length of a vector in ℝⁿ by using the Pythagorean Theorem. The length that we reach using this reasoning is exactly the ‖ ⋅ ‖₂ norm. This says that there is something special about p = 2; whatever it is, it is related to the Pythagorean Theorem and hence to the possibility of measuring angles and deciding when two vectors are orthogonal in the vector space. In this section we will discuss a structure that allows us to measure angles and define the notion of orthogonality in more general spaces: the inner product space.

Definition 2.14. Let V be a vector space over a field 𝔽. An inner product on V is a function that assigns to every ordered pair of vectors v and w in V a scalar in 𝔽, denoted by ⟨v, w⟩, satisfying the following properties for all v, w and z in V and all α in 𝔽:
(a) ⟨v + z, w⟩ = ⟨v, w⟩ + ⟨z, w⟩.
(b) ⟨αv, w⟩ = α⟨v, w⟩.
(c) ⟨v, w⟩ is the complex conjugate of ⟨w, v⟩ (conjugate symmetry).
(d) ⟨v, v⟩ > 0 if v ≠ 0.

With this definition in hand, we can deduce the following properties of an inner product. We leave the proofs as exercises.

Theorem 2.15. Let (V, ⟨⋅, ⋅⟩) be an inner product space. Then for v, w and z in V and α in 𝔽 the following properties hold.
(a) ⟨v, w + z⟩ = ⟨v, w⟩ + ⟨v, z⟩.
(b) ⟨v, αw⟩ = ᾱ⟨v, w⟩, where ᾱ denotes the complex conjugate of α.
(c) ⟨v, 0⟩ = ⟨0, v⟩ = 0.
(d) ⟨v, v⟩ = 0 if, and only if, v = 0.
(e) If ⟨v, w⟩ = ⟨v, z⟩ for all v ∈ V, then w = z.

Let’s see some examples of inner product vector spaces. Some of them might look familiar.

Example 2.16. In this first example, we will define an inner product on ℝⁿ. For v = (x₁, . . . , xₙ) ∈ ℝⁿ and w = (y₁, . . . , yₙ) ∈ ℝⁿ, define

⟨v, w⟩ = ∑_{i=1}^n xᵢyᵢ.

We leave it as an exercise to the reader to verify that the conditions for an inner product are satisfied in this case. This is called the standard inner product, and you might recognize it from calculus as the “dot” product between two vectors. Notice that the complex conjugation in condition (c) of the definition of inner product plays no role in this context, since the entries are real. However, we can extend this definition to ℂⁿ: for v = (x₁, . . . , xₙ) ∈ ℂⁿ and w = (y₁, . . . , yₙ) ∈ ℂⁿ, define

⟨v, w⟩ = ∑_{i=1}^n xᵢȳᵢ,

where ȳᵢ denotes the complex conjugate of yᵢ. ⊘

Example 2.17. Once we have an inner product ⟨v, w⟩ defined on a vector space V, we can define a family of inner products depending on a parameter r > 0, by using the equation ⟨v, w⟩ᵣ = r⟨v, w⟩. ⊘

Example 2.18. There are several other possibilities for defining an inner product on ℝⁿ or ℂⁿ. Consider, for example, the following inner product on ℝ²: for v = (x₁, x₂) and w = (y₁, y₂) in ℝ², define

⟨v, w⟩ = x₁y₁ − x₂y₁ − x₁y₂ + 4x₂y₂.

Notice that ⟨v, v⟩ = (x₁ − x₂)² + 3x₂², which implies that ⟨v, v⟩ > 0 if v ≠ 0 and that ⟨v, v⟩ = 0 if and only if v = 0. The other conditions in the definition of the inner product are left to the reader to verify. ⊘

Before working on more examples, let us focus on the standard inner product on ℝⁿ for a moment and recall that it follows from the Pythagorean Theorem that the length of a vector v = (x₁, . . . , xₙ) ∈ ℝⁿ is given by √⟨v, v⟩. So it makes sense, in the general context of inner product vector spaces in which the notion of length may not be as clear as in ℝⁿ, to consider the following definition.

Definition 2.19. Let (V, ⟨⋅, ⋅⟩) be an inner product space. For v ∈ V we define the norm of v by

‖v‖ = √⟨v, v⟩.

We will verify that this actually defines a norm. Properties (a), (b) and (c) are straightforward. We will see property (d) in the next theorem.

Theorem 2.20. Let V be an inner product space over a field 𝔽. Then, for every v, w ∈ V, we have the following:
(a) Cauchy–Schwarz Inequality: |⟨v, w⟩| ≤ ‖v‖ ‖w‖, with equality if, and only if, the vectors v and w are linearly dependent.
(b) Triangle Inequality: ‖v + w‖ ≤ ‖v‖ + ‖w‖, with equality if, and only if, v = 0 or w = αv for some α ≥ 0.

Proof.
(a) If ⟨v, w⟩ = 0, then the result follows immediately. We suppose that ⟨v, w⟩ ≠ 0, and consequently that v ≠ 0 and w ≠ 0. We can also suppose that v and w are linearly independent, since if w = λv for some λ ∈ 𝔽, then the equality holds easily. We will use the familiar Gram–Schmidt process studied in linear algebra to construct two vectors v̂ and ŵ such that ⟨v̂, ŵ⟩ = 0 and such that the space of all linear combinations of v and w, span{v, w}, coincides with span{v̂, ŵ}. Such vectors are given by

v̂ = v/‖v‖,    ŵ = w − ⟨w, v̂⟩v̂.

Notice that

0 ≤ ‖ŵ‖² = ‖w‖² − 2|⟨v̂, w⟩|² + |⟨v̂, w⟩|² = ‖w‖² − |⟨v, w⟩|²/‖v‖²,

and the inequality follows. Notice that if equality holds, then ‖ŵ‖ = 0, which implies that w is a multiple of v. This finishes the proof of (a).
(b) Using the Cauchy–Schwarz inequality and the properties of the inner product, we obtain

‖v + w‖² = ⟨v + w, v + w⟩
         = ‖v‖² + ⟨v, w⟩ + ⟨w, v⟩ + ‖w‖²
         ≤ ‖v‖² + 2|⟨v, w⟩| + ‖w‖²
         ≤ ‖v‖² + 2‖v‖‖w‖ + ‖w‖²
         = (‖v‖ + ‖w‖)²,

which proves the triangle inequality. Equality occurs if, and only if, 2‖v‖‖w‖ = ⟨v, w⟩ + ⟨w, v⟩ = 2 Re⟨v, w⟩. Assuming this, we see from the Cauchy–Schwarz inequality that

|⟨v, w⟩| ≥ Re⟨v, w⟩ = ‖v‖ ‖w‖ ≥ |⟨v, w⟩|,

that is, |⟨v, w⟩| = ‖v‖ ‖w‖ = Re⟨v, w⟩ = ⟨v, w⟩ ≥ 0. These conditions include the equality condition of the Cauchy–Schwarz inequality; hence either v = 0 or there is λ ∈ 𝔽 such that w = λv. Using again the conditions above, we have 0 ≤ ⟨v, w⟩ = ⟨v, λv⟩ = λ̄‖v‖², which implies that λ ≥ 0. ◻

The Cauchy–Schwarz inequality is probably one of the most used inequalities in the theory of inner product spaces. As an example of its use, we prove the following important result about the inner product.

Theorem 2.21 (Continuity of the Inner Product). Let (V, ⟨⋅, ⋅⟩) be an inner product space and let ‖ ⋅ ‖ be its associated norm as defined in Definition 2.19. Suppose that (vₙ)_{n=1}^∞ ⊆ V, (wₙ)_{n=1}^∞ ⊆ V, v ∈ V and w ∈ V. If ‖vₙ − v‖ → 0 and ‖wₙ − w‖ → 0, then ⟨vₙ, wₙ⟩ → ⟨v, w⟩.

Proof. Using the triangle inequality and then the Cauchy–Schwarz inequality, we obtain

|⟨vₙ, wₙ⟩ − ⟨v, w⟩| = |⟨vₙ, wₙ⟩ − ⟨vₙ, w⟩ + ⟨vₙ, w⟩ − ⟨v, w⟩|
                   ≤ |⟨vₙ, wₙ − w⟩| + |⟨vₙ − v, w⟩|
                   ≤ ‖vₙ‖ ‖wₙ − w‖ + ‖vₙ − v‖ ‖w‖,

which tends to 0 since ‖wₙ − w‖ → 0, ‖vₙ − v‖ → 0, and the convergent sequence (‖vₙ‖) is bounded. ◻

In what follows, we will show some other examples of inner product vector spaces whose elements are functions.

Example 2.22. Let V = C[0, 1] be the vector space of real-valued continuous functions on the interval [0, 1]. For f, g ∈ V, define

⟨f, g⟩ = ∫₀¹ f(t)g(t) dt.

From the linearity of the integral we get that the first and second properties of the inner product hold. The third property is trivial since the functions are real valued. Finally, if f ≠ 0, then using the continuity of f we can conclude that f² is bounded away from zero on some subinterval of [0, 1] and hence

⟨f, f⟩ = ∫₀¹ [f(t)]² dt > 0. ⊘

Example 2.23. The previous example can be extended to the case of a general measure space (X, M, μ). We already defined the space L² and considered a norm on such spaces. It turns out that this norm is related, in the sense of Definition 2.19, to the inner product

⟨f, g⟩ = ∫_X f(x)g(x) dμ(x).

We need to take the conjugate of g in the definition if the functions are complex valued. ⊘

Example 2.24. Consider the vector space C¹[0, 1] of all real-valued continuously differentiable functions and define the inner product as

⟨f, g⟩ = ∫₀¹ f(x)g(x) dx + ∫₀¹ f′(x)g′(x) dx.

We leave it as an exercise to the reader to verify that this expression actually defines an inner product. ⊘

After going over several examples of inner products, we will study one special type of inner product space.

Remark 2.25. Given an inner product space (V, ⟨⋅, ⋅⟩), we will say that a sequence (vₙ) ⊆ V is a Cauchy sequence if for every ε > 0 there exists N ∈ ℕ such that for every n, m ≥ N, ‖vₙ − vₘ‖ < ε. ⊘

Definition 2.26. Let (V, ⟨⋅, ⋅⟩) be an inner product space. We say that V is a Hilbert space if V is complete, i.e., if every Cauchy sequence in V is convergent in V.

Example 2.27. Going back to Example 2.5, in which the space ℓ2 was defined, we can easily see that the norm is associated to the following inner product:

⟨(αⱼ)_{j=1}^∞, (βⱼ)_{j=1}^∞⟩ = ∑_{j=1}^∞ αⱼβ̄ⱼ

for any (αⱼ)_{j=1}^∞, (βⱼ)_{j=1}^∞ ∈ ℓ2 (the conjugate may be omitted for real sequences). We will see that ℓ2 is a Hilbert space. Suppose that (vₙ) ⊆ ℓ2 is a Cauchy sequence. It is important to notice that in this case each element vₙ is itself a sequence of scalars. We will denote this as

vₙ = (αⱼⁿ)_{j=1}^∞.

Let ε > 0; there exists N ∈ ℕ such that for n, m ≥ N, ‖vₙ − vₘ‖ < ε. Notice that for every j and M, if n, m ≥ N,

|αⱼⁿ − αⱼᵐ|² ≤ ∑_{j=1}^M |αⱼⁿ − αⱼᵐ|² ≤ ‖vₙ − vₘ‖² < ε²        (2.2)

and consequently, every sequence of the form (αⱼⁿ)_{n=1}^∞ is a Cauchy sequence in ℂ. Hence, for every j there exists αⱼ ∈ ℂ such that

αⱼⁿ → αⱼ   as n → ∞.

Define v = (αⱼ). We will show that v ∈ ℓ2 and that vₙ → v in the ℓ2 norm. Letting m → ∞ in eq. (2.2), we get that for any M and any n ≥ N,

∑_{j=1}^M |αⱼⁿ − αⱼ|² ≤ ε²        (2.3)

and therefore

∑_{j=1}^M |αⱼ|² ≤ ∑_{j=1}^M (|αⱼ − αⱼⁿ| + |αⱼⁿ|)² ≤ 4 ∑_{j=1}^M |αⱼ − αⱼⁿ|² + 4 ∑_{j=1}^M |αⱼⁿ|² ≤ 4ε² + 4‖vₙ‖² < ∞.

This shows that v ∈ ℓ2. Finally, taking the limit as M → ∞ in eq. (2.3), we get ‖vₙ − v‖ ≤ ε for every n ≥ N, which finishes the proof. ⊘

In other words, C is the vector space of all sequences that are null except for a finite number of points. Clearly C ⊆ ℓ2 and we can define the same inner product as in ℓ2 . ∞ For each m ∈ ℕ, consider the sequence vm = (!m j )j=1 ∈ C defined as

!m j

1 { { , if j ≤ m = {j { 0, otherwise. {

We leave it as an exercise to show that (vm )∞ m=1 is a Cauchy sequence in C and that the ∞ sequence v = ( m1 )m=1 is its limit in ℓ2 . However, v ∈ ̸ C and consequently C is not a Hilbert space. ⊘
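A small numerical illustration of this example (plain Python, truncated sums only, names are ours): the distance ‖vₘ − v‖₂ is the tail (∑_{j>m} 1/j²)^{1/2}, which tends to 0, even though the limit v has infinitely many nonzero entries and so lies outside C.

```python
import math

def dist_vm_to_v(m, terms=10**6):
    """‖v_m − v‖₂ where v = (1/j) and v_m keeps only the first m entries:
    only the tail j > m contributes, so this is (∑_{j>m} 1/j²)^(1/2),
    approximated here with a large finite cutoff."""
    return math.sqrt(sum(1.0 / (j * j) for j in range(m + 1, terms)))

# The distances shrink toward 0, so (v_m) is Cauchy in C, but its ℓ²
# limit v = (1/j) is not finitely supported, hence v ∉ C.
for m in (10, 100, 1000):
    print(m, dist_vm_to_v(m))
```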

Remark 2.29. In the previous example, C is a linear subspace of ℓ2 . Moreover, its inner product is the restriction of the inner product of ℓ2 . When such a situation holds, we will simply say that C is a subspace of ℓ2 . ⊘

2.3 Some Geometric Properties

In this section we will focus on geometric properties of norms and inner product spaces. Most of these properties are probably known to the reader in the setting of the spaces ℝ² or ℝ³. We will show that, although the geometry sometimes cannot be visualized as easily, some results can be generalized to abstract inner product spaces. As a first property, we will show the parallelogram identity. It receives this name because in every parallelogram the sum of the squares of the lengths of the diagonals equals the sum of the squares of the lengths of the four sides (see Fig. 2.1).

Theorem 2.30 (Parallelogram Identity). Let (V, ⟨⋅, ⋅⟩) be an inner product vector space, and let ‖ ⋅ ‖ be its corresponding norm given as in Definition 2.19. Then, for all v, w ∈ V,

‖v + w‖² + ‖v − w‖² = 2(‖v‖² + ‖w‖²).        (2.4)

Proof. Using the definition of the norm, we have

‖v + w‖² + ‖v − w‖² = ⟨v + w, v + w⟩ + ⟨v − w, v − w⟩
                    = ‖v‖² + ‖w‖² + ⟨v, w⟩ + ⟨w, v⟩ + ‖v‖² + ‖w‖² − ⟨v, w⟩ − ⟨w, v⟩
                    = 2(‖v‖² + ‖w‖²),

and the proof is complete. ◻
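The identity is easy to check numerically for the Euclidean norm; a minimal sketch in plain Python (illustrative only, names are ours):

```python
import math
import random

def norm2(x):
    return math.sqrt(sum(t * t for t in x))

random.seed(0)
for _ in range(500):
    v = [random.uniform(-3, 3) for _ in range(4)]
    w = [random.uniform(-3, 3) for _ in range(4)]
    lhs = norm2([a + b for a, b in zip(v, w)]) ** 2 \
        + norm2([a - b for a, b in zip(v, w)]) ** 2
    rhs = 2 * (norm2(v) ** 2 + norm2(w) ** 2)
    # Parallelogram identity (2.4), up to floating-point rounding.
    assert abs(lhs - rhs) < 1e-9
print("parallelogram identity verified")
```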

The parallelogram identity not only gives an equality that a norm must satisfy if it is induced as in Definition 2.19; the converse is also true, i.e., if a norm satisfies

Figure 2.1: Parallelogram identity (sides v, w; diagonals v + w, v − w).

eq. (2.4), then there exists an inner product related to the norm by means of Definition 2.19. First, we present the polarization identities. Their importance lies in the fact that they show how to recover the inner product from the norm.

Theorem 2.31 (Polarization Identities). Let (V, ⟨⋅, ⋅⟩) be an inner product vector space over a field 𝔽 and let ‖ ⋅ ‖ be its corresponding norm given as in Definition 2.19. For v, w ∈ V it holds that
(a) If 𝔽 = ℝ, then

⟨v, w⟩ = (1/4)(‖v + w‖² − ‖v − w‖²).        (2.5)

(b) If 𝔽 = ℂ, then

⟨v, w⟩ = (1/4)(‖v + w‖² − ‖v − w‖² + i‖v + iw‖² − i‖v − iw‖²).        (2.6)

Proof.
(a) If 𝔽 = ℝ, then

‖v + w‖² − ‖v − w‖² = ⟨v + w, v + w⟩ − ⟨v − w, v − w⟩
  = ⟨v, v⟩ + ⟨v, w⟩ + ⟨w, v⟩ + ⟨w, w⟩ − (⟨v, v⟩ − ⟨v, w⟩ − ⟨w, v⟩ + ⟨w, w⟩)
  = 2⟨v, w⟩ + 2⟨w, v⟩
  = 4⟨v, w⟩.

(b) If 𝔽 = ℂ, using the same calculation as in the previous item we have

‖v + w‖² − ‖v − w‖² = 2⟨v, w⟩ + 2⟨w, v⟩

and

‖v + iw‖² − ‖v − iw‖² = 2⟨v, iw⟩ + 2⟨iw, v⟩.

Then

‖v + w‖² − ‖v − w‖² + i‖v + iw‖² − i‖v − iw‖²
  = ‖v + w‖² − ‖v − w‖² + i(‖v + iw‖² − ‖v − iw‖²)
  = 2⟨v, w⟩ + 2⟨w, v⟩ + 2i⟨v, iw⟩ + 2i⟨iw, v⟩
  = 2⟨v, w⟩ + 2⟨w, v⟩ + 2(−i²)⟨v, w⟩ + 2i²⟨w, v⟩
  = 2⟨v, w⟩ + 2⟨w, v⟩ + 2⟨v, w⟩ − 2⟨w, v⟩
  = 4⟨v, w⟩.

This completes the proof. ◻
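The complex polarization identity can be verified numerically. The sketch below (plain Python with built-in complex numbers, names are ours) uses the inner product ⟨v, w⟩ = ∑ vⱼw̄ⱼ, linear in the first argument as in Definition 2.14, and recovers it from the norm alone.

```python
import random

def inner(v, w):
    """⟨v, w⟩ = ∑ v_j · conj(w_j): linear in v, conjugate-linear in w."""
    return sum(a * b.conjugate() for a, b in zip(v, w))

def nsq(v):
    return inner(v, v).real  # ‖v‖² = ⟨v, v⟩ is real

def polarize(v, w):
    """Recover ⟨v, w⟩ from the norm via the polarization identity (2.6)."""
    s = lambda c: nsq([a + c * b for a, b in zip(v, w)])
    return (s(1) - s(-1) + 1j * s(1j) - 1j * s(-1j)) / 4

random.seed(0)
v = [complex(random.uniform(-1, 1), random.uniform(-1, 1)) for _ in range(5)]
w = [complex(random.uniform(-1, 1), random.uniform(-1, 1)) for _ in range(5)]
assert abs(polarize(v, w) - inner(v, w)) < 1e-12
print("polarization identity recovers the inner product")
```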

We are now ready to prove that the parallelogram identity actually characterizes inner product spaces.

Theorem 2.32. Let (V, ‖ ⋅ ‖) be a normed space such that for every v, w ∈ V the identity in eq. (2.4) holds. Then there exists an inner product on V such that for every v ∈ V it holds that ⟨v, v⟩ = ‖v‖².

Proof. The polarization identities just proven actually give us a way of defining the inner product in terms of the norm. Let us start by considering the case of a real vector space and define the inner product as in eq. (2.5). From the definition it is clear that ⟨v, v⟩ = ‖v‖² and that ⟨v, w⟩ = ⟨w, v⟩. We will show that for every v, w, z ∈ V, ⟨v + w, z⟩ = ⟨v, z⟩ + ⟨w, z⟩. By the parallelogram identity we get

‖v + w + z‖² + ‖v − w + z‖² = 2‖v + z‖² + 2‖w‖²,

and similarly

‖v + w + z‖² + ‖w − v + z‖² = 2‖w + z‖² + 2‖v‖².

Adding these two equations we get

2‖v + w + z‖² = 2‖v + z‖² + 2‖w‖² + 2‖w + z‖² + 2‖v‖² − ‖v − w + z‖² − ‖w − v + z‖²,

and replacing z by −z we get

2‖v + w − z‖² = 2‖v − z‖² + 2‖w‖² + 2‖w − z‖² + 2‖v‖² − ‖v − w − z‖² − ‖w − v − z‖².

Since ‖w − v − z‖ = ‖v − w + z‖ and ‖v − w − z‖ = ‖w − v + z‖, the last terms cancel when we subtract the two identities. By the way in which the inner product is defined, we then get

⟨v + w, z⟩ = (1/4)(‖v + w + z‖² − ‖v + w − z‖²)
           = (1/4)(‖v + z‖² − ‖v − z‖²) + (1/4)(‖w + z‖² − ‖w − z‖²)
           = ⟨v, z⟩ + ⟨w, z⟩.

From this identity, using induction we can conclude that for any natural number n we have ⟨nv, w⟩ = n⟨v, w⟩. On the other hand, it is also clear that ⟨−v, w⟩ = −⟨v, w⟩. This proves the homogeneity for scalars in ℤ.

For the case of a rational scalar p/q, with p, q ∈ ℤ, q ≠ 0, notice that

q ⟨(p/q)v, w⟩ = qp ⟨(1/q)v, w⟩ = p ⟨v, w⟩,

and the identity ⟨(p/q)v, w⟩ = (p/q)⟨v, w⟩ follows. Finally, if λ ∈ ℝ, choose a sequence of rational numbers λₙ such that λₙ → λ. Then ‖λₙv − λv‖ → 0 and, by the continuity of the inner product, we have

⟨λv, w⟩ = limₙ ⟨λₙv, w⟩ = limₙ λₙ⟨v, w⟩ = λ⟨v, w⟩.

This finishes the proof for the case of a real vector space. For the case of a complex vector space, the inner product is defined as in eq. (2.6). Notice that ⟨iv, w⟩ = i⟨v, w⟩ and that ⟨v, w⟩ is the complex conjugate of ⟨w, v⟩. Let us denote by ⟨v, w⟩_ℝ the inner product defined as in the real case and observe the relation

⟨v, w⟩ = ⟨v, w⟩_ℝ + i ⟨v, iw⟩_ℝ.

Then, using the additivity just proven, we have

⟨v + w, z⟩ = ⟨v + w, z⟩_ℝ + i ⟨v + w, iz⟩_ℝ = ⟨v, z⟩_ℝ + ⟨w, z⟩_ℝ + i ⟨v, iz⟩_ℝ + i ⟨w, iz⟩_ℝ = ⟨v, z⟩ + ⟨w, z⟩.

Finally, if λ ∈ ℂ, then λ = a + ib with a, b ∈ ℝ, and

⟨λv, w⟩ = ⟨av, w⟩ + ⟨ibv, w⟩ = a ⟨v, w⟩ + ib ⟨v, w⟩ = λ ⟨v, w⟩.

This finishes the proof. ◻
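Theorem 2.32 also gives a quick way to see that a norm is not induced by any inner product: exhibit two vectors violating eq. (2.4). A minimal sketch in plain Python (illustrative only, names are ours) for the ‖ ⋅ ‖₁ norm on ℝ²:

```python
def norm1(x):
    """‖x‖₁ = ∑ |x_j|."""
    return sum(abs(t) for t in x)

v, w = (1.0, 0.0), (0.0, 1.0)
vp = tuple(a + b for a, b in zip(v, w))    # v + w
vm = tuple(a - b for a, b in zip(v, w))    # v − w
lhs = norm1(vp) ** 2 + norm1(vm) ** 2      # left side of (2.4)
rhs = 2 * (norm1(v) ** 2 + norm1(w) ** 2)  # right side of (2.4)
# The parallelogram identity fails, so ‖·‖₁ comes from no inner product.
assert lhs != rhs
print(lhs, rhs)   # → 8.0 4.0
```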

The next geometric property that we are going to study comes from the law of cosines, which can be thought of as a generalization of the Pythagorean Theorem to an arbitrary triangle. Consider the triangle △OAB in Fig. 2.2. The angle θ may not be a right angle, but the law of cosines still gives us a relation between the lengths of the sides of the triangle:

c² = a² + b² − 2ab cos(θ).

Let us extrapolate this relation to the case of an inner product vector space (V, ⟨⋅, ⋅⟩) and suppose v, w ∈ V. Notice that

Figure 2.2: Law of cosines (triangle OAB with sides a, b, c and angle θ at O).

‖v − w‖² = ⟨v − w, v − w⟩ = ‖v‖² + ‖w‖² − 2 Re⟨v, w⟩.

Due to the similarities in the geometry, it makes sense to define the angle θ between the vectors v and w by the following equation:

cos θ = Re⟨v, w⟩ / (‖v‖ ‖w‖).        (2.7)

Notice that by the Cauchy–Schwarz inequality, −1 ≤ cos θ ≤ 1. Also, notice that cos θ = 0 if and only if Re⟨v, w⟩ = 0. We will say that two vectors v, w ∈ V are orthogonal if ⟨v, w⟩ = 0. Orthogonality will be the main subject of the next section.

2.4 Orthogonality

Definition 2.33. Let (V, ⟨⋅, ⋅⟩) be an inner product space.
(a) We say two vectors v, w ∈ V are orthogonal if ⟨v, w⟩ = 0, and we denote this by v ⊥ w.
(b) If X, Y are subsets of V, we write X ⊥ Y if for every x ∈ X and y ∈ Y we have x ⊥ y. We will denote by X⊥ the set of vectors in V which are orthogonal to X:

X⊥ = {y ∈ V : for every x ∈ X, ⟨x, y⟩ = 0}.

As a first property of orthogonality, we see that the Pythagorean Theorem holds. The proof is left to the reader.

Theorem 2.34. Let (V, ⟨⋅, ⋅⟩) be an inner product space and let v, w ∈ V. If v ⊥ w, then ‖v + w‖² = ‖v‖² + ‖w‖².

The following proposition gives more geometric insight into what orthogonality means. Suppose v and w are two orthogonal vectors in an inner product vector space

over a field 𝔽. Consider the “line” L = {v + tw : t ∈ 𝔽}; then, among all elements of L, v is the one with minimal norm.

Theorem 2.35. Let (V, ⟨⋅, ⋅⟩) be an inner product space. We have v ⊥ w if, and only if,

‖v‖ ≤ ‖v + tw‖   for every t ∈ 𝔽.

Proof. If w = 0 the result follows trivially, so let us consider the case w ≠ 0. Notice that

‖v + tw‖² = ⟨v + tw, v + tw⟩ = ‖v‖² + 2 Re(t̄⟨v, w⟩) + |t|²‖w‖².

If v ⊥ w then, for every t ∈ 𝔽,

‖v + tw‖² = ‖v‖² + |t|²‖w‖² ≥ ‖v‖².

Conversely, if ‖v‖ ≤ ‖v + tw‖ for every t ∈ 𝔽, we pick t = −⟨v, w⟩/‖w‖² to obtain

0 ≤ ‖v + tw‖² − ‖v‖² = −|⟨v, w⟩|²/‖w‖²,

and hence v ⊥ w. ◻

We can generalize the previous property of orthogonality to the setting of the distance between a vector and a subspace. Recall that a subspace of an inner product vector space was defined in Remark 2.29.

Definition 2.36. A closed subspace S of an inner product vector space (V, ⟨⋅, ⋅⟩) is a subspace S which is closed with respect to the topology induced by the inner product of V.

Theorem 2.37. Let (H, ⟨⋅, ⋅⟩) be a Hilbert space and let S be a closed subspace of H. If v ∈ H, then v has a unique representation of the form v = w + w⊥, where w ∈ S and w⊥ ∈ S⊥.

Proof. The candidate for w should be the vector in S that is closest to v. Define the distance from v to S as

d = inf{‖v − w‖ : w ∈ S}.

Notice that since the norm is always nonnegative, the infimum makes sense. Moreover, for every n there exists vₙ ∈ S such that ‖v − vₙ‖² < d² + 1/n. We will show that the sequence (vₙ)_{n=1}^∞ ⊆ S is a Cauchy sequence. This will follow from the

parallelogram law. Let n and m be two natural numbers; then

‖(vₙ − v) + (vₘ − v)‖² + ‖(vₙ − v) − (vₘ − v)‖² = 2(‖vₙ − v‖² + ‖vₘ − v‖²),

and consequently

‖vₙ − vₘ‖² = −4‖(vₙ + vₘ)/2 − v‖² + 2(‖vₙ − v‖² + ‖vₘ − v‖²)
           ≤ −4d² + 2d² + 2/n + 2d² + 2/m = 2/n + 2/m,

where we have used the fact that (vₙ + vₘ)/2 ∈ S, and hence its distance to v is greater than or equal to d. This shows that the sequence (vₙ)_{n=1}^∞ is a Cauchy sequence, and since H is a Hilbert space, it converges to some x ∈ H. Moreover, since S is closed, x ∈ S, and by construction ‖x − v‖ = d. Now, if y ∈ S and t > 0, then x + ty ∈ S and consequently

d² ≤ ‖v − (x + ty)‖² = d² − 2t Re⟨v − x, y⟩ + t²‖y‖².

Hence 2 Re⟨v − x, y⟩ ≤ t‖y‖², and letting t → 0 we get Re⟨v − x, y⟩ ≤ 0. Applying this with y replaced by −y (and by ±iy in the complex case) gives ⟨v − x, y⟩ = 0, that is, v − x ⊥ y. Thus, taking w = x and w⊥ = v − x, we have the desired representation.

To prove uniqueness, suppose that v = y + z with y ∈ S and z ∈ S⊥. Then w − y = z − w⊥; since S is a subspace, w − y ∈ S, and since S⊥ is also a subspace, z − w⊥ ∈ S⊥. Hence w − y = z − w⊥ lies in S ∩ S⊥ = {0} (if u ∈ S ∩ S⊥ then ⟨u, u⟩ = 0, so u = 0), and therefore y = w and z = w⊥. This finishes the proof. ◻

Remark 2.38. Theorem 2.37 can be written as H = S ⊕ S⊥, where the symbol ⊕ means that the decomposition exists and that it is unique. ⊘

With the previous theorem in hand, we are ready to give the following definition.

Definition 2.39. Let (H, ⟨⋅, ⋅⟩) be a Hilbert space and let S be a closed subspace of H. The orthogonal projection P onto S is the linear transformation P : H → S that maps every vector v ∈ H to its corresponding vector w ∈ S such that v = w + w⊥, with w⊥ ∈ S⊥.

Notice that if I denotes the identity operator, then I − P : H → S⊥ is the orthogonal projection onto S⊥. Notice also that such a decomposition can be made as long as the subspace

S is closed. As a matter of fact, there is a way of characterizing closed subspaces in terms of their behavior with respect to ⊥.

Theorem 2.40. A subspace S of a Hilbert space H is closed if and only if S = S⊥⊥.

Proof. Recall that L⊥ is a closed subspace of H for any L ⊆ H (a consequence of the continuity of the inner product). Thus, if S = S⊥⊥, then S is closed. For the converse, notice that S ⊆ S⊥⊥. Suppose v ∈ S⊥⊥; then, since S is closed, v = w + w⊥ with w ∈ S and w⊥ ∈ S⊥. Now, w ∈ S⊥⊥ and, since S⊥⊥ is a vector space, w⊥ = v − w ∈ S⊥⊥. But w⊥ ∈ S⊥ as well, so ⟨w⊥, w⊥⟩ = 0. Thus w⊥ = 0 and consequently v = w ∈ S. ◻

Example 2.41. In linear algebra we use the orthogonal decomposition without even thinking about it. For example, in ℝ², when we write a vector v = (a, b), we can think of this as v = ae₁ + be₂, where e₁ = (1, 0) and e₂ = (0, 1) are the canonical vectors. In other words, we are writing v = w + w⊥, where w ∈ {λe₁ : λ ∈ ℝ} and w⊥ ∈ {λe₂ : λ ∈ ℝ}. The reader should verify that these two subspaces are orthogonal. ⊘
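When S is spanned by finitely many orthonormal vectors, the decomposition of Theorem 2.37 is directly computable; the sketch below (plain Python, restricted to ℝⁿ, names are ours) projects onto such a subspace and checks that the residual is orthogonal to it and that the Pythagorean relation holds.

```python
def dot(x, y):
    return sum(a * b for a, b in zip(x, y))

def project(v, basis):
    """Orthogonal projection of v onto span(basis), basis orthonormal in ℝⁿ:
    w = ∑ ⟨v, e⟩ e."""
    w = [0.0] * len(v)
    for e in basis:
        c = dot(v, e)
        w = [wi + c * ei for wi, ei in zip(w, e)]
    return w

# S = span{e1, e2} ⊂ ℝ³ with e1, e2 orthonormal.
e1 = (1.0, 0.0, 0.0)
e2 = (0.0, 1.0, 0.0)
v = (3.0, -2.0, 5.0)
w = project(v, [e1, e2])                  # component in S
w_perp = [a - b for a, b in zip(v, w)]    # component in S⊥
assert abs(dot(w_perp, e1)) < 1e-12 and abs(dot(w_perp, e2)) < 1e-12
# Pythagoras: ‖v‖² = ‖w‖² + ‖w⊥‖²
assert abs(dot(v, v) - (dot(w, w) + dot(w_perp, w_perp))) < 1e-9
print(w, w_perp)   # → [3.0, -2.0, 0.0] [0.0, 0.0, 5.0]
```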

2.5 Orthogonal Sequences

Canonical vectors have a special place in linear algebra, so it is natural to extend their "nice" properties to more general settings. We will deal with that in what follows.

Definition 2.42. Consider a (possibly infinite) collection of vectors e1, e2, . . . in an inner product vector space (V, ⟨⋅, ⋅⟩). We say this collection is orthogonal if ⟨ei, ej⟩ = 0 for i ≠ j. If, in addition, ⟨ei, ei⟩ = 1 for every i, then the collection of vectors is said to be orthonormal.

Example 2.43. The canonical basis for ℝⁿ is clearly an orthonormal collection of vectors. ⊘

Example 2.44. Going back to Example 2.27, if we consider on the space ℓ² the sequences e1 = (1, 0, 0, . . . ), e2 = (0, 1, 0, 0, . . . ), . . . , then it is clear that (ej)∞j=1 is an orthonormal set. ⊘

Example 2.45. Consider the space C[0, 2π] of all real-valued continuous functions with the ‖ ⋅ ‖2 norm. This norm can be induced by the inner product


⟨f, g⟩ = ∫₀^{2π} f(x)g(x) dx.

Now define, for every positive integer n,

en(x) = (1/√π) sin(nx).

We leave it as an exercise to verify that the family {en } is orthonormal.
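As a quick sanity check of Example 2.45 (not a proof), the inner products ⟨en, em⟩ can be approximated with a midpoint rule; the grid size and tolerance below are our own choices:

```python
import numpy as np

# Midpoint-rule check that e_n(x) = sin(n x)/sqrt(pi) is orthonormal
# in C[0, 2pi] with <f, g> = integral of f*g over [0, 2pi].
N = 200_000
x = (np.arange(N) + 0.5) * (2 * np.pi / N)   # midpoints of the grid
dx = 2 * np.pi / N

def e(n):
    return np.sin(n * x) / np.sqrt(np.pi)

def inner(f, g):
    return np.sum(f * g) * dx

gram = np.array([[inner(e(n), e(m)) for m in (1, 2, 3)] for n in (1, 2, 3)])
assert np.allclose(gram, np.eye(3), atol=1e-6)   # approximately the identity
```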

Definition 2.46. Let (V, ⟨⋅, ⋅⟩) be an inner product space, and let (vj) ⊆ V be a sequence of vectors.
(a) We denote by span(vj) the vector subspace of all finite linear combinations of the vectors vj. That is,

span(vj) = { ∑_{i=1}^n α_{ji} v_{ji} : α_{ji} ∈ 𝔽 and n ∈ ℕ }.

(b) We say that the sequence (vj) is linearly independent if for every finite set {j1, . . . , jn} ⊆ ℕ, α1 v_{j1} + ⋯ + αn v_{jn} = 0 implies α1 = ⋯ = αn = 0.

Linear independence is a well-known concept from linear algebra. It allows us to define the dimension of a vector space as the maximal number of linearly independent vectors in the space. If such a number does not exist, we say that the space is infinite dimensional.

Lemma 2.47. Let (ej) be an orthonormal sequence of vectors in an inner product vector space (V, ⟨⋅, ⋅⟩). Then (ej) is linearly independent, and for any finite set {j1, . . . , jn} ⊆ ℕ, if v ∈ span{e_{j1}, . . . , e_{jn}}, then

v = ⟨v, e_{j1}⟩ e_{j1} + ⋯ + ⟨v, e_{jn}⟩ e_{jn}.

Proof. Since v is in span{e_{j1}, . . . , e_{jn}}, we can write v = α1 e_{j1} + ⋯ + αn e_{jn}. Then, for 1 ≤ k ≤ n,

⟨v, e_{jk}⟩ = ⟨α1 e_{j1} + ⋯ + αn e_{jn}, e_{jk}⟩ = α1 ⟨e_{j1}, e_{jk}⟩ + ⋯ + αn ⟨e_{jn}, e_{jk}⟩ = α1 ⋅ 0 + ⋯ + α_{k−1} ⋅ 0 + αk ⋅ 1 + α_{k+1} ⋅ 0 + ⋯ + αn ⋅ 0 = αk,


which implies both the representation of v and the linear independence of the vectors e_{j1}, . . . , e_{jn} (taking v = 0 forces every αk = 0). Since the finite set was arbitrary, we get the result. ◻

The previous lemma can be extended to expressions of the form v = ∑_{j=1}^∞ αj ej, where we assume the convergence of the series with respect to the metric induced by the inner product. Since we showed the continuity of the inner product in Theorem 2.21, it is not hard to see that

αk = ⟨∑_{j=1}^∞ αj ej, ek⟩

for any k. Similarly,

⟨∑_{j=1}^∞ αj ej, ∑_{j=1}^∞ βj ej⟩ = ∑_{j=1}^∞ αj β̄j.  (2.8)

This shows that the numbers resulting from taking the inner product of a vector with the elements of an orthonormal sequence deserve special attention, which motivates the following definition.

Definition 2.48. Let (ej) be an orthonormal sequence of vectors in an inner product vector space (V, ⟨⋅, ⋅⟩). Given a vector x ∈ V, its Fourier coefficients are defined as the numbers αj = ⟨x, ej⟩.

Theorem 2.49. Let (ej) be an orthonormal sequence of vectors in an inner product vector space (V, ⟨⋅, ⋅⟩).
(a) (Parseval's Identity) If v = ∑_{j=1}^∞ αj ej, then

‖v‖² = ∑_{j=1}^∞ |αj|².

Moreover, if V is a Hilbert space and ∑_{j=1}^∞ |αj|² < ∞, then the series ∑_{j=1}^∞ αj ej converges in V.
(b) (Bessel's Inequality) For every v ∈ V,

∑_{j=1}^∞ |⟨v, ej⟩|² ≤ ‖v‖².

Proof. Parseval's identity follows from eq. (2.8) by taking αj = βj. For the second claim of (a), suppose V is a Hilbert space and ∑_{j=1}^∞ |αj|² < ∞. If Sn denotes the partial sum ∑_{j=1}^n αj ej, we have


‖Sn − Sm‖² = ‖∑_{j=m+1}^n αj ej‖²  (2.9)
= ∑_{j=m+1}^n |αj|².  (2.10)

This proves that {Sn} is a Cauchy sequence in V, and consequently we have the convergence. In order to prove Bessel's inequality, fix N ∈ ℕ and notice that

0 ≤ ‖v − ∑_{j=1}^N ⟨v, ej⟩ ej‖² = ‖v‖² − ⟨v, ∑_{j=1}^N ⟨v, ej⟩ ej⟩ − ⟨∑_{j=1}^N ⟨v, ej⟩ ej, v⟩ + ∑_{j=1}^N |⟨v, ej⟩|² = ‖v‖² − ∑_{j=1}^N |⟨v, ej⟩|².

Hence ‖v‖² ≥ ∑_{j=1}^N |⟨v, ej⟩|², and the result follows by letting N → ∞. ◻
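Bessel's inequality and Parseval's identity can be observed numerically. In the sketch below (our own example, using a QR factorization to manufacture an orthonormal family in ℝ¹⁰), the coefficient mass never exceeds ‖v‖², and equality holds for vectors inside the span:

```python
import numpy as np

rng = np.random.default_rng(0)

# Five orthonormal vectors in R^10 (an orthonormal family that does
# not span the whole space), obtained from a QR factorization.
Q, _ = np.linalg.qr(rng.standard_normal((10, 5)))
e = Q.T                      # rows are the orthonormal vectors e_1, ..., e_5

v = rng.standard_normal(10)
coeffs = e @ v               # Fourier coefficients <v, e_j>

# Bessel: sum of squared coefficients never exceeds ||v||^2
assert np.sum(coeffs**2) <= np.dot(v, v) + 1e-12

# Parseval: equality holds when the vector lies in the span of the e_j
w = coeffs @ e               # w = sum_j <v, e_j> e_j
assert np.isclose(np.sum((e @ w)**2), np.dot(w, w))
```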

One more piece of evidence that orthonormal sets can be thought of as a generalization of the canonical vectors of finite-dimensional spaces is the following result. It says that finite-dimensional spaces spanned by orthonormal sets are indistinguishable from ℝⁿ or ℂⁿ.

Definition 2.50. We say two inner product spaces (V, ⟨⋅, ⋅⟩V) and (W, ⟨⋅, ⋅⟩W) over a field 𝔽 are isometric if there is a bijective linear operator F : V → W such that for every v, w ∈ V it holds that ⟨v, w⟩V = ⟨F(v), F(w)⟩W.

Theorem 2.51. If an inner product vector space (V, ⟨⋅, ⋅⟩V) has an orthonormal basis {e1, . . . , en}, then V is isometric to (𝔽ⁿ, ⟨⋅, ⋅⟩), where ⟨⋅, ⋅⟩ is the usual inner product in 𝔽ⁿ.

Proof. Take the isomorphism F : 𝔽ⁿ → V defined as F(α1, . . . , αn) = α1 e1 + ⋯ + αn en. Denote a = (α1, . . . , αn) and b = (β1, . . . , βn). Then

⟨F(a), F(b)⟩V = ⟨F(a), β1 e1 + ⋯ + βn en⟩V = β̄1 ⟨F(a), e1⟩V + ⋯ + β̄n ⟨F(a), en⟩V = β̄1 ⟨α1 e1 + ⋯ + αn en, e1⟩V + ⋯ + β̄n ⟨α1 e1 + ⋯ + αn en, en⟩V = α1 β̄1 + ⋯ + αn β̄n = ⟨a, b⟩,

as we wanted. ◻


The following theorem gives us a way of finding an orthonormal family out of any linearly independent set of vectors. This is known as the Gram–Schmidt orthonormalization process.

Theorem 2.52. Let (V, ⟨⋅, ⋅⟩) be an inner product space and let v1, v2, . . . be a (possibly finite) collection of linearly independent vectors in V. There exists a collection of orthonormal vectors e1, e2, . . . in V such that for every k,

span{e1, . . . , ek} = span{v1, . . . , vk}.

Proof. First let e1 = v1/‖v1‖. We obtain the remaining vectors inductively. Suppose e1, . . . , em have been chosen in such a way that for every k with 1 ≤ k ≤ m, the set {e1, . . . , ek} is orthonormal and span{e1, . . . , ek} = span{v1, . . . , vk}. To construct the next vector, let

w_{m+1} = v_{m+1} − ∑_{k=1}^m ⟨v_{m+1}, ek⟩ ek

and define e_{m+1} = w_{m+1}/‖w_{m+1}‖. Here w_{m+1} ≠ 0, since otherwise v_{m+1} would be a linear combination of e1, . . . , em and hence a linear combination of v1, . . . , vm. Furthermore, if 1 ≤ j ≤ m, then

⟨w_{m+1}, ej⟩ = ⟨v_{m+1}, ej⟩ − ∑_{k=1}^m ⟨v_{m+1}, ek⟩⟨ek, ej⟩ = ⟨v_{m+1}, ej⟩ − ⟨v_{m+1}, ej⟩ = 0.

Therefore ⟨e_{m+1}, ej⟩ = 0, and consequently {e1, . . . , e_{m+1}} is an orthonormal set consisting of m + 1 nonzero vectors in the subspace span{v1, . . . , v_{m+1}}. Thus, span{v1, . . . , v_{m+1}} = span{e1, . . . , e_{m+1}}. ◻

The reader is probably familiar with the concept of a (Hamel) basis in finite-dimensional spaces, has probably seen the Gram–Schmidt process as a tool to obtain orthonormal bases out of other bases, and has probably noticed similarities between the properties of bases learned in linear algebra and the properties of orthogonal sequences. However, one important property is missing in the context of infinite-dimensional inner product spaces. A basis of a vector space V is defined as a linearly independent set of vectors (vj) such that every vector in the space can be written as a finite linear combination of the vj's. We have seen that orthonormal sequences are linearly independent; however, the second condition is never satisfied by an infinite orthonormal sequence. As a matter of fact, suppose that (V, ⟨⋅, ⋅⟩) is an inner product space and that (ej) is an (infinite) orthonormal sequence. Define


v = ∑_{j=1}^∞ ej/j.

By Theorem 2.49 we know that the series converges in V. However, if v had a finite representation v = α1 e_{j1} + ⋯ + αn e_{jn}, then for k ∉ {j1, . . . , jn} we would have

1/k = ⟨v, ek⟩ = ⟨α1 e_{j1} + ⋯ + αn e_{jn}, ek⟩ = 0,

which clearly is a contradiction. Not all is lost, however; we can still use orthonormal sequences in a manner similar to bases if we consider series instead of finite linear combinations. The countable nature of the series makes us focus on one special type of inner product space.

Definition 2.53. An inner product space is separable if it has a countable dense subset.

Example 2.54. The space ℓ² considered before is a separable Hilbert space. Consider the set D = {(xn)∞n=1 : ∃k, xn = 0 if n > k, and xn ∈ ℚ if n ≤ k}. Then D is dense in ℓ². We leave the details to the reader. ⊘
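The Gram–Schmidt process of Theorem 2.52 is easy to sketch in code. The function below (a minimal implementation, ours, for vectors of ℝⁿ) orthonormalizes a list of linearly independent vectors and checks the resulting Gram matrix:

```python
import numpy as np

def gram_schmidt(vectors):
    """Gram-Schmidt orthonormalization (Theorem 2.52): from linearly
    independent v_1, v_2, ... build orthonormal e_1, e_2, ... with
    span{e_1,...,e_k} = span{v_1,...,v_k} for every k."""
    es = []
    for v in vectors:
        # subtract the projection onto the span of the previous e_j
        w = v - sum(np.dot(v, e) * e for e in es)
        es.append(w / np.linalg.norm(w))
    return es

vs = [np.array([1.0, 1.0, 0.0]),
      np.array([1.0, 0.0, 1.0]),
      np.array([0.0, 1.0, 1.0])]
es = gram_schmidt(vs)

# The Gram matrix of the output is the identity: the family is orthonormal
G = np.array([[np.dot(a, b) for b in es] for a in es])
assert np.allclose(G, np.eye(3))
```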

Example 2.55. We will give an example of a nonseparable inner product space; the details will be left to the reader. Define N as the vector space of all functions f : ℝ → ℝ such that f(x) = 0 outside a countable set Nf and ∑_{n∈Nf} |f(n)|² < ∞. Given f, g ∈ N, define

⟨f, g⟩N = ∑_{n∈Nf∩Ng} f(n)g(n).

For every s ∈ ℝ define χs : ℝ → ℝ by χs(t) = 1 if t = s and χs(t) = 0 otherwise. Then notice that if s ≠ t, then ‖χs − χt‖² = 2.


So any dense subset of N must contain a point of each ball B_{√2/2}(χs); since these balls are mutually disjoint and there are uncountably many of them, no countable subset of N can be dense. Hence, N is not separable. ⊘

Now we are ready to introduce the concept of an orthonormal basis for separable Hilbert spaces.

Definition 2.56. Let (H, ⟨⋅, ⋅⟩) be a separable Hilbert space. A sequence of orthonormal vectors (ej)∞j=1 is an orthonormal basis for H if ⟨v, ej⟩ = 0 for all j ∈ ℕ implies that v = 0. In other words, it is not possible to find a nonzero vector orthogonal to every ej.

As noticed before, the term "basis" is not used here in the sense of a Hamel basis. The following theorem gives a justification for its use.

Theorem 2.57. Let (H, ⟨⋅, ⋅⟩) be a separable Hilbert space and let (ej)∞j=1 be an orthonormal basis for H. Then:

(a) If v ∈ H, then

v = ∑_{j=1}^∞ ⟨v, ej⟩ ej  and  ‖v‖² = ∑_{j=1}^∞ |⟨v, ej⟩|².

(b) The closure of span{ej}∞j=1 is H.

Proof. Denote by S the closure of span(ej)∞j=1.
(a) By Theorem 2.49 (together with Bessel's inequality) the series ∑_{j=1}^∞ ⟨v, ej⟩ ej converges to some vector w ∈ H. Moreover, since S is closed, w ∈ S. On the other hand, notice that for every k,

⟨v − w, ek⟩ = ⟨v, ek⟩ − ⟨∑_{j=1}^∞ ⟨v, ej⟩ ej, ek⟩ = ⟨v, ek⟩ − ⟨v, ek⟩ = 0,

but since (ej)∞j=1 is an orthonormal basis, this forces v − w = 0. This proves the representation of v; the expression for its norm follows from Parseval's identity.
(b) It follows from the representation proved in part (a) that if v ∈ H, then v ∈ S. ◻

Remark 2.58. During this chapter, we have introduced the concept of inner product vector spaces and Hilbert spaces starting from the basic knowledge of linear algebra.


When possible, we presented the first example in finite-dimensional spaces and then built up from there to more complicated spaces. However, the knowledge of abstract Hilbert spaces and their techniques can also answer concrete questions about finite-dimensional spaces. Should the reader want to go deeper into this subject, we recommend the very well-written article [11]. ⊘

2.6 Problems

2.1. Define on ℝ² the norm ‖(x1, x2)‖p = (|x1|^p + |x2|^p)^{1/p}, 1 ≤ p < ∞. Draw the boundary of the ball B1(0) with respect to the p-norm, for p = 1, p = 2 and p = 3. Make another drawing for the case of ‖(x1, x2)‖∞ = sup{|x1|, |x2|}.
2.2. A set A in a vector space is convex if for every a, b ∈ A, we have that {a(1 − t) + bt : t ∈ [0, 1]} ⊆ A. Show that B1(0) is convex in any normed space.
2.3. Verify that ‖⋅‖1 and ‖⋅‖∞ are actually norms in the spaces ℓ¹ and ℓ∞, respectively. Is ‖ ⋅ ‖1 a norm in ℓ∞?
2.4. Show an example of a sequence that belongs to ℓp but does not belong to ℓ¹, for p > 1.
2.5. Show an example of a sequence that belongs to ℓp but does not belong to ℓq, for p > q.
2.6. Use the same ideas as in Theorem 2.8 to show Hölder's inequality in its general form: let 1 ≤ p, q < ∞ be conjugate exponents, and let f ∈ Lp and g ∈ Lq. Show that fg ∈ L¹ and

|∫_X fg dμ| ≤ ‖f‖p ‖g‖q.

2.7. Show that equality in Minkowski's inequality holds if and only if one vector is a multiple of the other.
2.8. Use the previous problem and the same ideas as in Theorem 2.9 to show that ‖ ⋅ ‖p is an actual norm in Lp.
2.9. Use Hölder's inequality to show that if μ(X) < ∞ and 1 ≤ p < q < ∞, then Lq ⊆ Lp.
2.10. Prove that ‖ ⋅ ‖∞ defines a norm on L∞.
2.11. Prove Theorem 2.15.
2.12. Show that ⟨⋅, ⋅⟩r as defined in Example 2.17 satisfies the conditions for an inner product for every r > 0. What happens if r = 0? And if r < 0?
2.13. Verify that the expression defined in Example 2.24 is an inner product. Obtain the expression for the norm induced by such a product.
2.14. In Example 2.27 we show an inner product that induces the ℓ² norm. Is it possible to find an inner product inducing the ℓp norm, p ≠ 2?
2.15. Show that the sequence (xm)∞m=1 defined in Example 2.28 is a Cauchy sequence in ℓ² and find its limit.
2.16. Prove Theorem 2.34.
2.17. Given an inner product vector space V and a subset S of V, prove that S⊥ is a closed subspace of V.
2.18. Consider the Hilbert space ℝ³ with the usual dot product. Let S be the subspace {(x, y, z) ∈ ℝ³ : x + y + z = 0} and v = (1, 1, 1). Find the unique representation v = w + w⊥ with w ∈ S and w⊥ ∈ S⊥.
2.19. Let (H, ⟨⋅, ⋅⟩) be a Hilbert space and let A ⊆ H be a closed, convex set. Let v ∈ H∖A. Follow the proof of Theorem 2.37 to show that there exists a unique a ∈ A such that ‖v − a‖ = inf{‖v − x‖ : x ∈ A}.
2.20. Verify that the family {en} defined in Example 2.45 is orthonormal.
2.21. Provide the details in Example 2.54.
2.22. Provide the details in Example 2.55.
2.23. Use Zorn's Lemma to show that every separable Hilbert space contains an orthonormal basis.
2.24. Let (H, ⟨⋅, ⋅⟩) be a Hilbert space and let (vj)∞j=1 be an orthonormal sequence of vectors in H such that the closure of span(vj)∞j=1 is H. Prove that (vj)∞j=1 is an orthonormal basis for H.

3 Completeness, Completion and Dimension

Learning Targets
✓ Understand the definition of Banach spaces and notice some examples and non-examples.
✓ Discover that compactness of the closed unit ball is a property exclusive to finite-dimensional spaces.
✓ Discuss some properties and examples of separable spaces.

3.1 Banach Spaces

In the previous chapter, we introduced the concept of a normed space and showed some of its properties. We also showed several examples of normed spaces and studied some particular cases in which normed spaces were also inner product spaces or even Hilbert spaces. In this section we introduce the notion of a Banach space. Remember that a normed space (V, ‖ ⋅ ‖) is complete if every Cauchy sequence in V converges to some element of V.

Definition 3.1. Let (V, ‖ ⋅ ‖) be a normed space. If V is complete, we say V is a Banach space.

Example 3.2. Let X be a compact topological space and C(X) the vector space of the continuous functions f : X → 𝔽. We can define a norm ‖ ⋅ ‖∞ on C(X) by

‖f‖∞ = sup_{t∈X} |f(t)|,

for every f ∈ C(X). Since X is compact and f is continuous, f(X) ⊆ 𝔽 is also compact and, hence, bounded. Moreover, f attains its supremum, so we can rewrite the equality above as

‖f‖∞ = max_{t∈X} |f(t)|.

This norm is called the uniform norm. We will show that (C(X), ‖ ⋅ ‖∞) is complete and, thus, a Banach space. Suppose (fn)∞n=1 ⊆ C(X) is a Cauchy sequence; then given ε > 0 there exists N ∈ ℕ such that for every n, m ≥ N, ‖fn − fm‖∞ < ε. In particular, for every t ∈ X, |fn(t) − fm(t)| < ε. Thus for every t ∈ X, (fn(t))∞n=1 is a Cauchy sequence in 𝔽 and, hence, convergent.


Thus, we define a function f : X → 𝔽 as

f(t) = lim_{n→∞} fn(t).

We will show that f is actually the limit of the sequence (fn)∞n=1. Notice that for every t ∈ X, if m ≥ N, then

|fm(t) − f(t)| = lim_{n→∞} |fm(t) − fn(t)| ≤ ε,

since |fm(t) − fn(t)| ≤ ‖fn − fm‖∞ < ε whenever n ≥ N. Consequently fm converges uniformly to f. This implies that f ∈ C(X), since a uniform limit of continuous functions is continuous, and this finishes the proof. ⊘

Example 3.3. In the previous chapter, we defined the spaces ℓp, 1 ≤ p ≤ ∞, and we showed that they have a normed space structure. We leave it as an exercise to show that these are actually Banach spaces; the proof is similar to Example 2.27. ⊘

Example 3.4. This example shows that there exist normed spaces that are not complete. Take the vector space C([0, 1]) of all continuous real-valued functions but, instead of defining its norm as in Example 3.2, take the L2 norm

‖f‖2 = (∫₀¹ |f(t)|² dt)^{1/2}.

For each n define fn : [0, 1] → ℝ as

fn(t) = (2t)ⁿ if 0 ≤ t ≤ 1/2,  and  fn(t) = 1 if 1/2 < t ≤ 1.

Then fn ∈ C([0, 1]) and

‖fn − fm‖2² = ∫₀^{1/2} |(2t)ⁿ − (2t)ᵐ|² dt = (1/2) ⋅ 1/(2n + 1) − 1/(n + m + 1) + (1/2) ⋅ 1/(2m + 1),

which tends to 0 as n, m → ∞. Hence (fn)∞n=1 is a Cauchy sequence. However, considering the function χ_{[1/2,1]} : [0, 1] → {0, 1} defined by χ_{[1/2,1]}(t) = 1 if 1/2 ≤ t ≤ 1 and χ_{[1/2,1]}(t) = 0 if 0 ≤ t < 1/2, we have that

‖fn − χ_{[1/2,1]}‖2² = ∫₀^{1/2} (2t)^{2n} dt = 1/(2(2n + 1)) → 0.

But χ_{[1/2,1]} ∉ C([0, 1]). This shows that (C([0, 1]), ‖ ⋅ ‖2) is not a Banach space. ⊘
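Using the closed-form integrals from Example 3.4 (the helper names below are ours), one can watch the sequence (fn) become Cauchy in the ‖ ⋅ ‖2 norm while its limit is the discontinuous step function:

```python
def dist_fn_fm(n, m):
    # ||f_n - f_m||_2^2, from the closed form computed in Example 3.4
    return 0.5 / (2 * n + 1) - 1.0 / (n + m + 1) + 0.5 / (2 * m + 1)

def dist_fn_chi(n):
    # ||f_n - chi_[1/2,1]||_2^2 = 1 / (2 (2n + 1))
    return 1.0 / (2 * (2 * n + 1))

# The sequence is Cauchy in the L2 norm ...
assert dist_fn_fm(100, 200) < dist_fn_fm(10, 20) < dist_fn_fm(1, 2)
# ... and it converges to the discontinuous step function chi_[1/2,1]
assert dist_fn_chi(10_000) < 1e-4
```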

In the previous examples, it can be seen that choosing a different norm makes an important difference in the properties of the space; this of course depends on the topology induced by the given norm. This leads us to define equivalence of norms.

Definition 3.5. Let V be a vector space on which two norms ‖ ⋅ ‖1 and ‖ ⋅ ‖2 are defined. We say that these norms are equivalent if there exist constants a, b > 0 such that, for all v ∈ V, a‖v‖1 ≤ ‖v‖2 ≤ b‖v‖1; in this case we write ‖v‖1 ≍ ‖v‖2.

Theorem 3.6. Let V be a finite-dimensional vector space. Then all norms on V are equivalent.

Proof. Let us take a basis {e1, . . . , en} for V, so every v ∈ V can be written as v = ∑_{j=1}^n αj ej, where αj ∈ 𝔽 for 1 ≤ j ≤ n. We will show that every norm ‖ ⋅ ‖ on V is equivalent to the norm ||v|| = ∑_{j=1}^n |αj|. On one side we have the inequality

‖v‖ = ‖∑_{j=1}^n αj ej‖ ≤ ∑_{j=1}^n |αj| ‖ej‖ ≤ (max_{1≤j≤n} ‖ej‖) ||v|| = b ||v||,

where b = max_{1≤j≤n} ‖ej‖. To get the other inequality, assume there is no a > 0 such that a||v|| ≤ ‖v‖ for all v ∈ V. Then for every N ∈ ℕ there is a vector vN ∈ V with ||vN|| = 1 and N‖vN‖ ≤ ||vN||, that is, ‖vN‖ ≤ 1/N. Since the unit sphere of (V, || ⋅ ||) is compact, there is a convergent subsequence (vNj) of (vN) with limit v0 in the space (V, || ⋅ ||). Since the norm is continuous, we conclude that ||v0|| = 1. We then have the inequality

‖v0‖ ≤ ‖v0 − vNj‖ + ‖vNj‖ ≤ b ||v0 − vNj|| + 1/Nj,  (3.1)

which tends to zero as j → ∞. Hence v0 = 0, which contradicts the fact that ||v0|| = 1. ◻

Corollary 3.7. Let V be a finite-dimensional normed space. Then V is a Banach space.


Proof. Suppose the dimension of V is n. Since all norms are equivalent, we need only consider the norm || ⋅ || defined in the proof of the previous theorem. Let vl = ∑_{j=1}^n αlj ej be a Cauchy sequence in the space (V, || ⋅ ||). Noticing that

||vl − vm|| = ∑_{j=1}^n |αlj − αmj|,  (3.2)

we can conclude that, for 1 ≤ j ≤ n, (αlj)∞l=1 is a Cauchy sequence in 𝔽 and converges to some α0j in 𝔽. Taking v0 = ∑_{j=1}^n α0j ej in V, then

lim_{l→∞} ||vl − v0|| = lim_{l→∞} ∑_{j=1}^n |αlj − α0j| = 0.  (3.3)

That is, vl → v0, as desired. ◻
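Theorem 3.6 can be illustrated on ℝⁿ, where explicit equivalence constants between ‖ ⋅ ‖1 and ‖ ⋅ ‖2 are classical (this is an illustration of the statement, not its proof; the sample size is our own choice):

```python
import numpy as np

rng = np.random.default_rng(1)

# On a finite-dimensional space all norms are equivalent (Theorem 3.6).
# For ||.||_2 versus ||.||_1 on R^n the sharp constants are known:
#   (1/sqrt(n)) ||v||_1 <= ||v||_2 <= ||v||_1.
n = 8
for _ in range(1000):
    v = rng.standard_normal(n)
    one = np.linalg.norm(v, 1)
    two = np.linalg.norm(v, 2)
    assert one / np.sqrt(n) - 1e-12 <= two <= one + 1e-12
```

The left inequality follows from Cauchy–Schwarz; the right one from the triangle inequality, exactly the kind of two-sided bound Definition 3.5 requires.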

The previous corollary completely settles the problem of identifying Banach spaces in finite dimensions. For infinite-dimensional normed spaces the situation is subtler, as seen in Example 3.4. In what follows we present a useful criterion for identifying completeness of a normed space. The criterion depends on the following definition.

Definition 3.8. Let (V, ‖ ⋅ ‖) be a normed space and let (vn)∞n=1 be a sequence in V. We say that the sequence is absolutely summable if

∑_{n=1}^∞ ‖vn‖ < ∞.

From real analysis, we know that if a sequence of real numbers is absolutely summable, then it is also summable. This is not necessarily the case in general normed spaces. As a matter of fact, the property characterizes Banach spaces. Theorem 3.9. Let (V, ‖ ⋅ ‖) be a normed space. V is a Banach space if and only if every absolutely summable sequence in V is summable in V. Proof. Suppose that every absolutely summable sequence is summable in V. We will show that V is a Banach space. Let (vn )∞ n=1 be a Cauchy sequence in V. Then for every j ∈ ℕ, there exists Nj ∈ ℕ such that if n, m ≥ N, then ‖vn – vm ‖
0, from Problem 3.7 we get the result. ⊘


3.2 Completion and Dimension

In this section we will prove two things. First, the closed unit ball in a normed space Y is compact if, and only if, Y has finite dimension; this property marks a main difference between finite- and infinite-dimensional spaces. We recommend reading Ref. [33] to gain deep insight into the notion of compactness and its history; see Chapter 18 for a study of compactness. Second, every normed space can be completed (in some sense) to a Banach space. To prove the first assertion, we will need a tool that helps us construct bounded sequences without convergent subsequences in infinite-dimensional spaces.

Lemma 3.14 (Riesz's Lemma). Let (V, ‖ ⋅ ‖) be a normed space and let S ≠ V be a closed vector subspace of V. Then, for every 0 < α < 1, there is v ∈ V∖S such that ‖v‖ = 1 and inf_{w∈S} ‖v − w‖ ≥ α.

Proof. Let us take z ∈ V∖S (which is nonempty) and a = inf_{w∈S} ‖z − w‖; note that a > 0 since S is closed. Also, for every b > a, there is x ∈ S with a ≤ ‖z − x‖ ≤ b. If we define v = (z − x)/‖z − x‖, then v ∈ V∖S and ‖v‖ = 1. Moreover, for every w ∈ S we have the inequality ‖v − w‖ =

(1/‖z − x‖) ‖z − (x + ‖z − x‖w)‖ ≥ a/‖z − x‖ ≥ a/b,

since x + ‖z − x‖w ∈ S. Hence, for a given 0 < α < 1, we choose b = a/α and the lemma is proven. ◻

Now we are ready to prove the following:

Theorem 3.15. Let V be a normed vector space. Then the closed unit ball B1(0) in V is compact if, and only if, V is a finite-dimensional vector space.

Proof. If V is a finite-dimensional space, then it is isomorphic to 𝔽^{dim V}. Since B1(0) is closed and bounded, it is compact by the Heine–Borel Theorem (see, for example, Refs [34, 3]). Conversely, suppose that dim V = ∞. Take v1 ∈ V with ‖v1‖ = 1. Using α = 1/2 in Riesz's Lemma, there is v2 ∈ V with ‖v2‖ = 1 and ‖v1 − v2‖ ≥ 1/2. Using Riesz's Lemma once again (with the closed subspace S = span{v1, v2}), there is v3 ∈ V with ‖v3‖ = 1, ‖v1 − v3‖ ≥ 1/2 and ‖v2 − v3‖ ≥ 1/2. Continuing with this procedure we construct a sequence (vn)∞n=1 such that ‖vn‖ = 1 for every n, and ‖vj − vl‖ ≥ 1/2 for every j ≠ l. Since this sequence does not have a convergent subsequence, we conclude that the closed unit ball is not compact. ◻

Definition 3.16. Let (X, d) and (Y, d′) be metric spaces. We say that X and Y are isometric if there is a bijective mapping I : X → Y such that, for all x, y ∈ X, we have d(x, y) = d′(I(x), I(y)).


Remark 3.17. A mapping between metric spaces with the distance-preserving property defined above is called an isometry, and every isometry is injective. So, in other words, the previous definition says that two metric spaces are isometric if there is a surjective isometry between them. ⊘

Definition 3.18. Two normed spaces V1 and V2 are isomorphic if there is a linear bijective isometry I : V1 → V2 between them (i.e., if there is a bijective isometry which is also a linear mapping). The function I is called an isomorphism between the normed spaces.

Theorem 3.19. Let (X, d) be a metric space. Then there exists a complete metric space (X′, d′) and a dense set S ⊆ X′ such that X is isometric with S. The complete metric space X′ is called the completion of X. Moreover, if X′ and X′′ are both completions of X, then they are isometric.

Proof. We begin by defining an equivalence relation on the set of all Cauchy sequences in X as follows: two such sequences (xn) and (yn) are equivalent if lim_{n→∞} d(xn, yn) = 0. Now take X′ as the set of all equivalence classes given by this relation. Using the triangle inequality we have

d(xn, yn) ≤ d(xn, xm) + d(xm, ym) + d(ym, yn),

which implies

|d(xn, yn) − d(xm, ym)| ≤ d(xn, xm) + d(yn, ym).

Since (xn) and (yn) are Cauchy sequences, (d(xn, yn))n is a Cauchy sequence in ℝ, and hence it is convergent. This shows that if x′, y′ ∈ X′, then the limit lim_{n→∞} d(xn, yn) exists and does not depend on the representatives used (in this case, (xn) and (yn) for x′ and y′, respectively). We can then define a mapping

d′ : X′ × X′ → ℝ, (x′, y′) ↦ lim_{n→∞} d(xn, yn),

which turns out to be a metric on X′, as is easily shown. Now define I : X → X′ in such a way that (x, x, x, . . . ) is a representative of I(x). Then I is an isometry and its image is dense in (X′, d′): if (xn) is a representative of x′ ∈ X′ and ε > 0, then, since (xn) is a Cauchy sequence, there is an integer m large enough that d′(x′, I(xm)) = lim_{n→∞} d(xn, xm) < ε.

It is easy to show that (X 󸀠 , d󸀠 ) is complete using I, the density of I (X) and the triangle inequality. This completes the proof of the main part of the theorem. To prove uniqueness, assume (X 󸀠󸀠 , d󸀠󸀠 ) is another completion of (X, d) . Then there is a dense subset U ⊆ X 󸀠󸀠 and an isometry I 󸀠 : X → U. Then the composition I 󸀠 ∘ I –1 :


I(X) → U is a bijective isometry, which has a unique extension to an isometry between the closures of these spaces, which are precisely X′ and X′′. This completes the proof. ◻

We now give a result analogous to the previous theorem for normed spaces.

Theorem 3.20. Let (V, ‖ ⋅ ‖) be a normed space. Then there exists a Banach space (W, || ⋅ ||) and a dense set S ⊆ W such that V is isomorphic with S. The Banach space W is called the completion of V. Moreover, if W and W′ are both completions of V, then they are isomorphic.

Proof. We will use the notation of the previous theorem. It suffices to show that (X′, d′) is a vector space, with the distance d′ generated by a norm compatible with I(V). To define a linear structure on X′, recall that the elements of X′ are equivalence classes of Cauchy sequences, and define the sum and product by scalars in the natural way; we leave the details to the reader. A norm [⋅] on I(V) can be defined from I as [I(x)] = ‖x‖, and it follows that the induced metric is the restriction of d′ to I(V). Now define (W, || ⋅ ||) = (X′, || ⋅ ||) by extending this norm to the whole space, setting ||x′|| = d′(0′, x′). Hence (W, || ⋅ ||) is a normed space which is also complete; this makes it a Banach space. The second part of the proof is left as an exercise. ◻
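The construction in Theorems 3.19 and 3.20 is concrete enough to imitate in code. Below is a toy sketch (entirely our own) for X = ℚ with the usual distance: an element of the completion is represented by a rational Cauchy sequence, here Newton iterates converging to √2, and d′ is approximated by a late term of the sequence:

```python
from fractions import Fraction

def sqrt2_rep(k):
    """First k terms of a rational Cauchy sequence converging to sqrt(2):
    Newton's iteration x <- (x + 2/x)/2 starting from 2.  This sequence
    represents an element of the completion of Q that is not in Q."""
    x = Fraction(2)
    terms = []
    for _ in range(k):
        x = (x + 2 / x) / 2
        terms.append(x)
    return terms

# d'(x', I(q)) = lim_n d(x_n, q); a late term gives a good approximation
xs = sqrt2_rep(6)
q = Fraction(3, 2)
dist = abs(float(xs[-1]) - float(q))
assert abs(dist - (1.5 - 2**0.5)) < 1e-9
```

Note how the limit defining d′ is independent of which late term we evaluate, which is exactly the representative-independence argument of the proof.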

3.3 Separability

In the previous chapter, we introduced the notion of separability in the context of Hilbert spaces. Separable Hilbert spaces are the "nicest" type of inner product spaces, since it is always possible to obtain an orthonormal basis (see Problem 2.23) and consequently to use all the tools and techniques of that theory. In this section we introduce the concept in the setting of Banach spaces and study some of its properties. Separability is an important property of a Banach space, since it allows one to approximate any element by elements that are somehow easier to handle.

Definition 3.21. A normed space is separable if it has a countable dense subset. Example 3.22. Maybe the easiest example is the case of the normed space (ℝ, | ⋅ |). It contains the countable dense subset ℚ. This example can be generalized to the case of the space (ℝn , ‖ ⋅ ‖) with respect to any norm (remember, they are all equivalent). In this case, consider the set ℚn of n-tuples of rational numbers. We leave the details to the reader. ⊘


For Example 3.29 we need the Weierstrass approximation theorem. For completeness we will give the proof based on the Bernstein polynomials.

Theorem 3.23 (Weierstrass Approximation Theorem). Let f ∈ C[a, b]. For arbitrary ε > 0, there exists a polynomial P such that

sup_{x∈[a,b]} |P(x) − f(x)| < ε.

The Weierstrass theorem will follow from Theorem 3.28 after rescaling. We need some preliminary lemmas.

Lemma 3.24. The following identities are true:

∑_{k=0}^{n} \binom{n}{k} x^k (1 − x)^{n−k} = 1,  (3.4)

∑_{k=0}^{n} (k − nx)² \binom{n}{k} x^k (1 − x)^{n−k} = nx(1 − x).  (3.5)

(3.5)

Proof. The identity (3.4) follows from the well-known binomial formula n n n (a + b) = ∑ ( )ak bn–k , k k=0

(3.6)

taking a = x and b = 1 – x. To prove the second identity, let us choose a = z and b = 1 and replacing in eq. (3.6) we obtain n n ∑ ( )zk = (z + 1)n . k k=0

(3.7)

After differentiating eq. (3.7) and multiplying by z, we get n n ∑ k( )zk = nz (z + 1)n–1 . k k=0

(3.8)

Differentiating once again the equality (3.8) and multiplying by z, we obtain n n ∑ k2 ( )zk = nz (nz + 1) (z + 1)n–2 . k k=0

Replacing z =

x 1–x

(3.9)

in eqs (3.7), (3.8) and (3.9) and multiplying by (1 – x)n , we obtain

3.3 Separability

47

n n ∑ ( )xk (1 – x)n–k = 1, k k=0

(3.10)

n n ∑ k( )xk (1 – x)n–k = nx, k k=0

(3.11)

n n ∑ k2 ( )xk (1 – x)n–k = nx (nx + 1 – x) . k k=0

(3.12)

The identity (3.5) now follows from multiplying eq. (3.10) by n2 x2 , multiplying eq. (3.11) by –2nx and summing the resulting equalities with eq. (3.12). ◻ Corollary 3.25. For all values of x n n 2 n ∑ (k – nx) ( )xk (1 – x)n–k ≤ . 4 k k=0

(3.13)

Proof. It follows from Lemma 3.24.(b) and the fact that x (1 – x) ≤ 41 , which follows from noticing that this inequality is equivalent to 4x2 – 4x + 1 = (2x – 1)2 ≥ 0. ◻ Lemma 3.26. Let x ∈ [0, 1] and \$ > 0 be arbitrary. Denoting Bn (x) by 󵄨󵄨 󵄨󵄨󵄨 󵄨k Bn (x) = {k ∈ {0, ⋅ ⋅ ⋅ , n} : 󵄨󵄨󵄨󵄨 – x󵄨󵄨󵄨󵄨 ≥ \$, } , 󵄨󵄨 n 󵄨󵄨

(3.14)

n 1 . ∑ ( )xk (1 – x)n–k ≤ 2 k 4n\$ k∈B (x)

(3.15)

we have that

n

Proof. Taking k ∈ Bn (x) , we have, from eq. (3.14), that 2

(k – nx) ≥1 n 2 \$2 from which it follows that n 1 ∑ ( )xk (1 – x)n–k ≤ 2 2 k n \$ k∈B (x) n

2 n ∑ (k – nx) ( )xk (1 – x)n–k , k k∈B (x) n

and now the result follows from eq. (3.13), since the summands are all nonnegative. ◻ We now introduce the notion of Bernstein polynomials.

48

3 Completeness, Completion and Dimension

Definition 3.27. Let f : [0, 1] 󳨀→ ℝ be a function. Then every polynomial of the form n n k Bn (x) = ∑ ( )xk (1 – x)n–k f ( ) n k k=0

is denoted as a Bernstein polynomial of the function f .

From the above lemmas, we see that if f is a continuous function then the approximation n n Bn (x) ≈ ∑ f (x) ( )xk (1 – x)n–k k k=0

is good enough for high values of n. More precisely: Theorem 3.28. Let f : [0, 1] 󳨀→ ℝ be a continuous function. Then for x ∈ [0, 1] we have lim Bn (x) = f (x) .

n→∞

(3.16)

Moreover, we have that Bn converges uniformly to f . Proof. Let M = maxx∈[0,1] |f (x)|. Due to the uniform continuity of f , for a given : > 0 󵄨 󵄨 there exists a \$ > 0 such that when 󵄨󵄨󵄨. – & 󵄨󵄨󵄨 < \$, we have 󵄨󵄨 󵄨 : 󵄨󵄨f (. ) – f (& )󵄨󵄨󵄨 < . 󵄨 󵄨 2

(3.17)

Now, taking an x ∈ [0, 1], and from the fact that ∑nk=0 (nk)xk (1 – x)n–k = 1 we have n n f (x) = ∑ f (x) ( )xk (1 – x)n–k , k k=0

from which we get n n k Bn (x) – f (x) = ∑ {f ( ) – f (x)} ( )xk (1 – x)n–k n k k=0

k n = ( ∑ + ∑ ) {f ( ) – f (x)} ( )xk (1 – x)n–k , n k A (x) B (x) n

n

where An (x) and Bn (x) are defined as

(3.18)

3.3 Separability

49

󵄨󵄨 󵄨󵄨 󵄨 󵄨k An (x) = {k ∈ {0, 1, ⋅ ⋅ ⋅ , n} : 󵄨󵄨󵄨󵄨 – x󵄨󵄨󵄨󵄨 < \$} , 󵄨󵄨 󵄨󵄨 n and 󵄨󵄨 󵄨󵄨 󵄨 󵄨k Bn (x) = {k ∈ {0, 1, ⋅ ⋅ ⋅ , n} : 󵄨󵄨󵄨󵄨 – x󵄨󵄨󵄨󵄨 ≥ \$} . 󵄨󵄨 󵄨󵄨 n We have k % n % n ∑ {f ( ) – f (x)} ( )xk (1 – x)n–k ≤ ∑ ( )xk (1 – x)n–k ≤ , n 2 2 k k A (x) A (x) n

n

due to the uniform continuity of f . On the other hand k n M n ∑ {f ( ) – f (x)} ( )xk (1 – x)n–k ≤ 2M ∑ ( )xk (1 – x)n–k ≤ 2 n k k 2n\$ B (x) B (x) n

n

where we used the boundedness of the function f and Lemma 3.26. Gathering the estimates we finish the proof. ◻ Example 3.29. The space (C([a, b]), ‖ ⋅ ‖∞ ) is separable. This is a consequence of Weierstrass approximation theorem (for a proof of this result, different from the one in Theorem 3.23, we recommend Refs [14, 34] and a simpler proof in Ref. [25]). Since the set of polynomials is dense in C([0, 1]), then a suitable countable dense subset will be the set of P (ℚ) of polynomials with rational coefficients. It can be proven that given any polynomial p and % > 0, there exists q ∈ P (ℚ) such that ‖p – q‖∞ < %. ⊘ Example 3.30. (Lp ([0, 1]), ‖⋅‖p ) is separable if 1 ≤ p < ∞. From measure theory, we know that the set of simple functions is dense in Lp ([0, 1]). The subset of all simple functions having rational coefficients and rational endpoints in each interval is a countable dense subset. ⊘ Example 3.31. In the previous chapter we showed an example of an inner product space that is not separable. This of course is also an example of a nonseparable normed space. ⊘ Example 3.32. Another example of a nonseparable normed space is given by the space (L∞ ([0, 1]), ‖ ⋅ ‖∞ ). Given a t ∈ [0, 1] define 7[0,t] : [0, 1] 󳨀→ {0, 1} {1, if 0 ≤ x ≤ t x 󳨃󳨀→ { 0, if t < x. {


Notice that if t, s ∈ [0, 1], t ≠ s, then ‖χ_{[0,t]} − χ_{[0,s]}‖∞ = 1. Consequently, any dense subset N ⊆ L∞([0, 1]) must contain an element in each ball B_{1/2}(χ_{[0,t]}); but this family of balls is mutually disjoint and uncountable, so N cannot be countable. ⊘

The following theorem shows some properties of separable normed spaces.

Theorem 3.33. Let (V, ‖·‖) be a separable normed space. Then:

(a) If a normed space (W, ‖·‖) is isomorphic to (V, ‖·‖), then W is separable.
(b) Every subspace S of V is separable.
(c) Any completion of V is separable.

Proof. (a) Suppose f : V → W is a continuous bijection and let D be a countable dense subset of V. From the continuity of f we get that f(D) is dense in W, and it is clearly countable.
(b) We will prove a stronger result: any subset S of V is separable. Suppose that D = {d1, d2, ...} is a countable dense subset of V. Then, for every n, k ∈ ℕ with S ∩ B_{1/k}(dn) ≠ ∅, pick s_{n,k} ∈ S ∩ B_{1/k}(dn). The set {s_{n,k} : n, k ∈ ℕ} is countable and dense in S (why?).
(c) Suppose that (W, ‖·‖) is a completion of V and let f : V → f(V) ⊆ W be an isomorphism whose image f(V) is dense in W. By part (a), f(V) is separable, and the result follows from the density of f(V) and Problem 3.15. ◻

In order to show further properties of separable normed spaces, we need a notion of basis for a vector space which involves infinite series.

Definition 3.34. Let (V, ‖·‖) be a normed vector space. A Schauder basis of V is a sequence (vn) ⊆ V such that, for all v ∈ V, there is a unique sequence (αn) ⊆ 𝔽 with

v = ∑_{j=1}^{∞} αj vj := lim_{n→∞} ∑_{j=1}^{n} αj vj.

Remark 3.35. Notice that requiring the sequence of scalars (αn) to be unique implies the linear independence of the Schauder basis. Moreover, a Schauder basis can be seen as a generalization of the notion of basis for finite-dimensional spaces: if the space V is finite dimensional, one can simply take αn = 0 for n > dim V. ⊘


The following theorem relates the separability of a normed vector space with the existence of a Schauder basis. We will need the following definition.

Definition 3.36. Let (V, ‖ ⋅ ‖) be a normed vector space. A set M ⊆ V is total if the set of all linear combinations of vectors in M is dense in V.

Theorem 3.37. Let (V, ‖·‖) be a normed vector space.

(a) If V possesses a Schauder basis, then it is separable.
(b) V is separable if, and only if, there is a countable total subset of V which is linearly independent.

Proof. (a) Let (xn) be a Schauder basis for V. Then the set of all linear combinations α1 x1 + ⋅⋅⋅ + αn xn with rational scalars is countable and dense in V. Hence V is separable.
(b) (⇐) Suppose that there is a countable total subset {xn} of V which is linearly independent. The same reasoning as in the previous item shows that the set of all linear combinations α1 x1 + ⋅⋅⋅ + αn xn with rational scalars is countable and dense in V (why?). This implies that V is separable.
(⇒) Let us now suppose that V is separable. Then there is a dense sequence (xn)_{n=1}^{∞} in V. We define a sequence (yn) as follows: take y1 as the first nonzero element of (xn)_{n=1}^{∞}, then take y2 as the first element among the remaining terms that makes {y1, y2} linearly independent. Continuing this inductive procedure (which may stop after finitely many steps if no further element keeps the family linearly independent), we obtain a sequence (yn) that generates the same vector space as (xn)_{n=1}^{∞}, but is countable and linearly independent. ◻

Remark 3.38. The converse of the first item in the previous theorem is not true: it was shown by Per Enflo in 1973 that there are separable Banach spaces that do not have a Schauder basis. ⊘

Example 3.39. The space 𝔽n is separable. We already showed this for the case 𝔽 = ℝ. A different proof comes from showing that the canonical basis for this space is also a Schauder basis. ⊘

Example 3.40. The space ℓp(ℕ) is separable if 1 ≤ p < ∞, for its canonical basis is a Schauder basis. ⊘

Example 3.41. We give one more example of a nonseparable Banach space. Consider the space ℓ∞(ℕ) and a given sequence of vectors vn = (α_j^n)_{j=1}^{∞} in it. Define the vector v = (α_j)_{j=1}^{∞} whose coordinates are


α_j = 0, if |α_j^j| ≥ 1;  α_j = α_j^j + 1, if |α_j^j| < 1.

By this construction we have ‖v‖∞ ≤ 2 and ‖v – vn ‖∞ ≥ 1, for every n. Hence, there is no dense sequence in ℓ∞ (ℕ). ⊘
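The diagonal construction can be checked concretely on finite truncations. Below is a small sketch (the helper names and sample sequences are ours, chosen only for illustration) that builds v from the diagonal entries and verifies that ‖v‖∞ ≤ 2 while v differs from each vn by at least 1:

```python
# Finite truncation of the diagonal argument in Example 3.41:
# given sequences v_1, ..., v_m, set a_j = v_j[j] (the diagonal
# entry alpha_j^j) and define the j-th coordinate of v accordingly.
def diagonal_vector(seqs):
    v = []
    for j, s in enumerate(seqs):
        a = s[j]
        v.append(0.0 if abs(a) >= 1 else a + 1.0)
    return v

def sup_dist(u, w):
    return max(abs(x - y) for x, y in zip(u, w))

seqs = [[0.3, 5.0, -2.0], [1.0, -0.5, 0.25], [2.0, 0.0, 0.99]]
v = diagonal_vector(seqs)
assert max(abs(x) for x in v) <= 2.0          # ||v||_inf <= 2
for j, s in enumerate(seqs):
    assert abs(v[j] - s[j]) >= 1.0            # j-th coordinates differ by >= 1
    assert sup_dist(v, s) >= 1.0              # hence ||v - v_n||_inf >= 1
```

The check only handles finitely many sequences, which is exactly the point of the argument: no countable family can come within 1 of every such v.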

3.4 Problems

3.1. Let V be a vector space equipped with two equivalent norms. Prove that if one of them makes V a Banach space, so does the other.
3.2. Show that the normed space (ℓp, ‖·‖p) considered in Example 2.6 is a Banach space.
3.3. Prove that a closed subspace of a Banach space is also a Banach space.
3.4. Prove the converse in Theorem 3.9.
3.5. Use Theorem 3.9 to give an alternative proof that the space considered in Example 3.4 is not a Banach space.
3.6. Let (V, ‖·‖) be a normed space and E1 and E2 two subspaces of V such that V = E1 + E2. Prove that V = E1 ⊕ E2 if and only if E1 ∩ E2 = {0}. Is the result true for more than two subspaces?
3.7. Show that a normed space is strictly convex if and only if

‖(x + y)/2‖ < 1

whenever ‖x‖ = ‖y‖ = 1 and x ≠ y.
3.8. Use Problem 2.7 to show that the spaces ℓp, 1 < p < ∞, are strictly convex.
3.9. Show that the function d′ defined in the proof of Theorem 3.19 is a metric.
3.10. Provide the remaining details of the proof of Theorem 3.20.
3.11. Prove that any two completions of the same normed space are isomorphic.
3.12. Provide the details in Example 3.22.
3.13. Provide the details in Example 3.30.
3.14. Provide the details in Theorem 3.37.
3.15. Let (V, ‖·‖) be a separable normed space and let S ⊆ V. Prove that S is also separable.
3.16. Consider the set c0 consisting of all sequences in 𝔽 converging to zero. Prove that (c0, ‖·‖∞) is a Banach space.
3.17. Prove that (c0, ‖·‖∞) is a separable space.

4 Linear Operators

Learning Targets
✓ Go over examples of linear transformations on vector spaces and their relations with matrices.
✓ Recognize boundedness and continuity as equivalent properties of linear operators.

4.1 Linear Transformations

In this chapter, we will introduce one of the most important concepts in functional analysis: the linear transformation. Linear transformations are a special type of function defined on vector spaces. They are so “nicely behaved” that they deserve special attention. Moreover, although natural phenomena are usually not linear, it is often possible to approximate them by linear models.

Definition 4.1. Let V and W be two normed spaces. A linear transformation or linear operator T : V → W is a function that satisfies the following properties:
(a) T(u + v) = T(u) + T(v) for all u, v ∈ V.
(b) T(λv) = λT(v) for all λ ∈ 𝔽 and v ∈ V.

Notice that the two conditions in the definition can be replaced by the single condition

T(λu + v) = λT(u) + T(v) for all u, v ∈ V and λ ∈ 𝔽.

Remark 4.2. Some simple observations about linear operators:
(a) For every linear operator T we have T(0) = 0 (why?).
(b) The set of all linear operators sharing the same domain and codomain can be made a vector space by defining addition and scalar multiplication naturally. ⊘

Example 4.3. The simplest example is the linear transformation T : V → W defined as T(v) = 0 for all v ∈ V. Here, V and W are arbitrary vector spaces. ⊘

Example 4.4. The identity function I : V → V defined as I(v) = v is clearly a linear transformation. ⊘

Example 4.5. If F : [0, 1] → [0, 1] is a continuous map, then the function

TF : C([0, 1]) → C([0, 1]), f ↦ f ∘ F,


is a linear operator since, if λ ∈ ℝ and f, g ∈ C([0, 1]), then

TF(λf + g)(t) = (λf + g)(F(t)) = λf(F(t)) + g(F(t)) = λTF(f)(t) + TF(g)(t). ⊘
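The pointwise identity above is easy to test numerically. A minimal sketch (the particular F, f, g and the sample points are arbitrary choices of ours):

```python
# Pointwise check of T_F(l*f + g) = l*T_F(f) + T_F(g) for the
# composition operator (T_F f)(t) = f(F(t)) of Example 4.5.
def T(F, f):
    return lambda t: f(F(t))

F = lambda t: t * t             # a continuous map [0, 1] -> [0, 1]
f = lambda t: 3.0 * t + 1.0
g = lambda t: t - 0.5
lam = 2.5

lhs = T(F, lambda t: lam * f(t) + g(t))
for t in [0.0, 0.25, 0.5, 1.0]:
    rhs = lam * T(F, f)(t) + T(F, g)(t)
    assert abs(lhs(t) - rhs) < 1e-12
```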

Example 4.6. Consider the vector space C∞(0, 1) consisting of all real infinitely differentiable functions, and define T : C∞(0, 1) → C∞(0, 1) as Tf = f′. Notice that properties (a) and (b) of Definition 4.1 follow from the properties of derivatives. ⊘

Example 4.7. Define the operator T : L1[0, 1] → ℝ as

Tf = ∫_0^1 f(x) dx.

We leave it to the reader to verify that T is a linear operator. ⊘

Example 4.8. Given any n × m matrix A, define the linear transformation T : 𝔽m → 𝔽n as T(v) = Av. Clearly T is a linear transformation. We will study this type of transformation in more detail in the next section. ⊘

Example 4.9. Given a measure space (K, μ), K ⊆ ℂ, and f ∈ L∞_μ(K), define Mf : Lp_μ(K) → 𝔽 as

Mf(h) = ∫_K h(x) f(x) dμ(x).

Then Mf is a linear operator. ⊘

4.2 Back to Matrices

The reader is probably familiar with linear transformations acting on finite-dimensional spaces. We will briefly study this type of transformation to motivate the concepts and techniques in infinite-dimensional spaces. An example of such a linear transformation was given in Example 4.8. It turns out that this is actually the


only type of example. We will prove this fact as a way of showing how linear transformations work.

Given a basis B of a vector space V, we will denote by vB the representation of the vector v in terms of the basis B.

Theorem 4.10. Let V and W be finite-dimensional vector spaces. Suppose that B = {v1, ..., vm} is a basis for V and D = {w1, ..., wn} is a basis for W. Let T : V → W be a linear transformation. Then there exists an n × m matrix A such that for all v ∈ V, T(v) = (AvB)D.

Proof. Notice that for each j ∈ {1, ..., m}, there exist α_{1j}, ..., α_{nj} ∈ 𝔽 such that

T(vj) = ∑_{i=1}^{n} α_{ij} wi.

Hence if v ∈ V, then v = λ1 v1 + ⋅⋅⋅ + λm vm and, by the linearity of T, we have that

T(v) = λ1 ∑_{i=1}^{n} α_{i1} wi + ⋅⋅⋅ + λm ∑_{i=1}^{n} α_{im} wi = ∑_{j=1}^{m} ∑_{i=1}^{n} λj α_{ij} wi.

The proposition holds by considering the matrix A = [α_{ij}], i = 1, ..., n, j = 1, ..., m. ◻
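The construction in the proof can be carried out numerically: the j-th column of A is the coordinate vector of T(vj). A sketch in the standard bases of ℝ² and ℝ³ (the map T is an arbitrary choice of ours):

```python
# Build the matrix A of a linear map T : R^2 -> R^3 from the images
# of the basis vectors (Theorem 4.10), then check T(v) = A v.
def T(v):
    x, y = v
    return [x + 2 * y, 3 * x, x - y]    # an arbitrary linear map

def matvec(A, v):
    return [sum(a * x for a, x in zip(row, v)) for row in A]

# the j-th column of A is the coordinate vector of T(v_j)
cols = [T([1, 0]), T([0, 1])]
A = [[cols[j][i] for j in range(2)] for i in range(3)]

for v in ([1, 1], [2, -3], [0, 5]):
    assert matvec(A, v) == T(v)
```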

The previous proposition allows us to use the versatility of matrices and vector representations to study linear transformations on finite-dimensional vector spaces. For example, we will prove that every linear transformation on a finite-dimensional space is continuous. That will be a corollary of the following theorem.

Theorem 4.11. Let A be an n × m matrix with entries in 𝔽. There exists a constant C > 0 such that for every vector v = (x1, ..., xm) ∈ 𝔽m we have that

‖Av‖ ≤ C‖v‖.    (4.1)

Notice that since all norms on 𝔽m are equivalent, the choice of the norm only changes the value of the constant C; the inequality itself remains valid.

Proof. Suppose that A = [a1 a2 ⋅⋅⋅ am], where each aj is a column vector in 𝔽n, and notice that Av = x1 a1 + ⋅⋅⋅ + xm am. Now, it is easy to see that there exists a constant K > 0 such that |xj| ≤ K‖v‖ for every j; then


‖Av‖ = ‖x1 a1 + ⋅⋅⋅ + xm am‖ ≤ K‖v‖(‖a1‖ + ⋅⋅⋅ + ‖am‖).

Take C = K(‖a1‖ + ⋅⋅⋅ + ‖am‖). This proves the result. ◻
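The explicit constant C = K(‖a1‖ + ⋅⋅⋅ + ‖am‖) can be tested on random vectors. A sketch with the Euclidean norm, where one may take K = 1 since |xj| ≤ ‖v‖2 (the matrix is an arbitrary choice of ours):

```python
import math
import random

# Verify ||Av||_2 <= C ||v||_2 with C = ||a_1||_2 + ... + ||a_m||_2,
# where a_j are the columns of A; K = 1 works for the Euclidean
# norm since |x_j| <= ||v||_2.
A = [[1.0, -2.0, 0.5],
     [3.0,  0.0, 4.0]]

def norm2(v):
    return math.sqrt(sum(x * x for x in v))

def matvec(A, v):
    return [sum(a * x for a, x in zip(row, v)) for row in A]

cols = [[row[j] for row in A] for j in range(3)]
C = sum(norm2(c) for c in cols)

random.seed(0)
for _ in range(100):
    v = [random.uniform(-5.0, 5.0) for _ in range(3)]
    assert norm2(matvec(A, v)) <= C * norm2(v) + 1e-9
```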

Remark 4.12. Notice that in the case of the previously defined norms ‖·‖1, ‖·‖2 or ‖·‖∞, we can take K = 1 in inequality (4.1) to obtain ‖Av‖ ≤ (‖a1‖ + ⋅⋅⋅ + ‖am‖)‖v‖, whichever of these norms is used on the finite-dimensional space. This reasoning implies the following corollary. ⊘

Corollary 4.13. Let V and W be finite-dimensional vector spaces. Every linear transformation T : V → W is continuous.

There are two relevant subspaces associated to each linear transformation in finite dimensions. They are described in the following definition.

Definition 4.14. Let V and W be finite-dimensional vector spaces. Given a linear transformation T : V → W, the kernel of T is defined as the space ker(T) = {v ∈ V : Tv = 0} . The range space T(V) is defined as T(V) = {T(v) : v ∈ V}.

Remark 4.15. The set ker(T) is a vector subspace of V: if v, w ∈ ker(T) and α ∈ 𝔽, then it follows from the linearity of T that T(αv + w) = αT(v) + T(w) = 0, and consequently αv + w ∈ ker(T). We leave it as an exercise to the reader to prove that T(V) is also a vector space. Moreover, by the linearity of T, we have that T is injective if and only if ker(T) = {0}. Also, T is surjective if and only if T(V) = W. There is a nice relation between the dimensions of these two spaces that we give here without proof:

dim T(V) + dim ker(T) = dim(V).    (4.2) ⊘
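Relation (4.2) is easy to verify for a concrete matrix, since rank(A) = dim T(V). The sketch below uses a small hand-rolled Gaussian elimination (the matrix is an arbitrary choice of ours with an obvious row dependency):

```python
# Check eq. (4.2): rank(A) = dim T(V), and dim ker(T) = dim V - rank(A).
def rank(M, eps=1e-10):
    A = [row[:] for row in M]           # work on a copy
    rows, cols = len(A), len(A[0])
    r = 0
    for c in range(cols):
        pivot = max(range(r, rows), key=lambda i: abs(A[i][c]), default=None)
        if pivot is None or abs(A[pivot][c]) < eps:
            continue
        A[r], A[pivot] = A[pivot], A[r]
        for i in range(r + 1, rows):
            f = A[i][c] / A[r][c]
            for j in range(c, cols):
                A[i][j] -= f * A[r][j]
        r += 1
    return r

# T : R^4 -> R^3; the third row is the sum of the first two,
# so dim T(V) = 2 and, by eq. (4.2), dim ker(T) = 4 - 2 = 2
M = [[1.0, 0.0, 2.0, 1.0],
     [0.0, 1.0, 1.0, 3.0],
     [1.0, 1.0, 3.0, 4.0]]
nullity = 4 - rank(M)
assert (rank(M), nullity) == (2, 2)
```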


With this equation in hand we will be able to prove one useful property of linear transformations on finite-dimensional spaces. Recall that we say that a function is invertible if it is injective and surjective.

Theorem 4.16. Let V and W be two finite-dimensional vector spaces with dim V = dim W, and let T : V → W be a linear transformation. The following statements are equivalent:

(a) T is invertible.
(b) T is injective.
(c) T is surjective.

Proof. First, notice that if T is invertible, it is clearly injective. Assume now that T is injective; then ker(T) = {0} and, using eq. (4.2), we have that dim T(V) = dim V = dim W. This implies that T is surjective. It only remains to prove that (c) implies (a), but this follows again by eq. (4.2): if T is surjective, then T(V) = W, so dim T(V) = dim V and hence dim ker(T) = 0; thus T is also injective, and therefore invertible. ◻

1 × n Matrices. Now let’s focus on linear transformations T : 𝔽n → 𝔽. Notice that in this case, the associated matrix is a vector of n entries. Suppose that v = (v1, ..., vn) is the representation of v with respect to the canonical basis {e1, ..., en}. Then

T(v) = v1 T(e1) + ⋅⋅⋅ + vn T(en) = ⟨v, (T(e1), ..., T(en))⟩

(in the real case; when 𝔽 = ℂ one conjugates the entries T(ej) in the second argument). This equation shows that the action of the linear transformation T can be obtained as the inner product with the vector (T(e1), ..., T(en)). This is a particular case of the Riesz representation theorem that will be studied later.

Remark 4.17. Although the reader has probably already taken a linear algebra course, it may be useful to refresh some of the tools and basic concepts from linear algebra and matrix theory. We recommend the classical books [22, 6] and the modern approach considered in Ref. [2]. We make a special mention of the beautiful expository article [40]. ⊘
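The identity T(v) = ⟨v, (T(e1), ..., T(en))⟩ can be checked directly over ℝ. A sketch (the functional T is an arbitrary choice of ours):

```python
# A linear functional T : R^3 -> R acts as the inner product with
# w = (T(e_1), T(e_2), T(e_3)) (real scalars, canonical basis).
def T(v):
    x, y, z = v
    return 2 * x - y + 4 * z    # an arbitrary linear functional

e = [[1, 0, 0], [0, 1, 0], [0, 0, 1]]
w = [T(ei) for ei in e]         # w = (2, -1, 4)

def dot(u, v):
    return sum(a * b for a, b in zip(u, v))

for v in ([1, 2, 3], [-1, 0, 5], [0.5, 0.5, 0.5]):
    assert T(v) == dot(v, w)
```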

4.3 Boundedness In this section we will focus on linear transformations acting on general normed vector spaces. We will not restrict ourselves to the case of finite-dimensional spaces anymore unless stated otherwise. However, we will try to use some of the ideas of the previous section to guide us through. First, notice that eq. (4.1) allowed us to show the continuity of a linear transformation acting on finite-dimensional normed spaces. This equation plays a central


role in the study of linear transformations and gives us a relationship between linear operators and the topology induced by the norms.

Before going over the main results of the section, we need to agree on some notation. When it is clear from the context, we will write Tv instead of T(v); this is done in order to simplify the writing. Also, we will be talking about the norms of several (possibly different) normed spaces; all norms will be denoted by the symbol ‖·‖ unless stated otherwise. The reader should keep track of which norm is used.

Theorem 4.18. Let (V, ‖·‖) and (W, ‖·‖) be normed spaces. Let T : V → W be a linear operator. The following statements are equivalent:

(a) T(B1(0)) is bounded: sup{‖Tv‖ : ‖v‖ < 1} < ∞.
(b) There is a constant C > 0 such that, for all v ∈ V, ‖Tv‖ ≤ C‖v‖.
(c) T is uniformly continuous on V.
(d) T is continuous on V.
(e) T is continuous at 0 ∈ V.

Proof. First, notice that (c) ⇒ (d) ⇒ (e) follows directly from the definitions.
To prove that (a) ⇒ (b), let α = sup{‖Tv‖ : ‖v‖ < 1} (by homogeneity this equals sup{‖Tv‖ : ‖v‖ ≤ 1}). If v = 0, then (b) holds trivially. Suppose v ≠ 0 and let u = v/‖v‖; then ‖u‖ = 1, hence ‖Tu‖ ≤ α, and (b) follows from the linearity of T.
Now suppose that u, v ∈ V; then if (b) holds we have that ‖Tu − Tv‖ = ‖T(u − v)‖ ≤ C‖u − v‖. This proves (b) ⇒ (c).
We will now prove that (e) ⇒ (a). Suppose T is continuous at 0 ∈ V; then there is some δ > 0 such that ‖Tv‖ ≤ 1 whenever ‖v‖ ≤ δ. Thus if u ∈ V with ‖u‖ ≤ 1, then ‖δu‖ ≤ δ and ‖T(δu)‖ ≤ 1. Therefore ‖Tu‖ ≤ 1/δ, and (a) follows. ◻

The previous theorem inspires the following definition.

Definition 4.19. A linear operator T : V → W is called bounded if there exists a constant C > 0 such that ‖Tv‖ ≤ C‖v‖ for every v ∈ V. We denote by B(V, W) the set of all bounded linear operators T : V → W.
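For a matrix operator on (ℝⁿ, ‖·‖∞), the supremum in (a) is computable in closed form: it equals the maximum absolute row sum, which therefore serves as the constant C in (b). A sketch of both facts (the matrix is an arbitrary choice of ours; the row-sum formula is the standard induced ∞-norm):

```python
import random

# For T(v) = Av on (R^3, sup-norm), the supremum in (a),
# sup{||Tv||_inf : ||v||_inf <= 1}, equals the maximum absolute
# row sum of A, which also serves as the constant C in (b).
A = [[1.0, -2.0, 0.5],
     [0.0,  3.0, -1.0]]

def norm_inf(v):
    return max(abs(x) for x in v)

def matvec(A, v):
    return [sum(a * x for a, x in zip(row, v)) for row in A]

C = max(sum(abs(a) for a in row) for row in A)    # C = 4.0 here

# (b): ||Tv|| <= C ||v|| for random vectors v
random.seed(1)
for _ in range(200):
    v = [random.uniform(-3.0, 3.0) for _ in range(3)]
    assert norm_inf(matvec(A, v)) <= C * norm_inf(v) + 1e-12

# (a): the bound is attained at a sign vector of the maximal row,
# so the supremum over the unit ball equals C exactly
v_star = [1.0, 1.0, -1.0]
assert norm_inf(matvec(A, v_star)) == C
```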


Example 4.20. The operator TF defined in Example 4.5 is a continuous operator. To see this, take any f ∈ C([0, 1]); then

‖TF f‖∞ = sup_{t∈[0,1]} |f(F(t))| ≤ sup_{t∈[0,1]} |f(t)| = ‖f‖∞. ⊘

Example 4.21. Consider the linear operator T defined in Example 4.6. For every n, define the function fn(x) = x^n; then fn ∈ C∞(0, 1) and ‖fn‖∞ = 1. On the other hand, notice that Tfn = n f_{n−1}, and consequently ‖Tfn‖∞ = n. This shows that T is not a bounded operator on C∞(0, 1). ⊘

In the previous section, we saw that if T : V → W is a linear operator and V, W are finite-dimensional normed spaces, then T is continuous. This result can be generalized; we actually only need the domain space to be finite dimensional.

Theorem 4.22. Let (V, ‖·‖1) and (W, ‖·‖2) be normed spaces, where ‖·‖1 and ‖·‖2 denote two arbitrary norms defined on the spaces V and W, respectively. Suppose that V is finite dimensional. If T : V → W is a linear operator, then T is bounded.

Proof. Define a norm on V as |||v||| = ‖v‖1 + ‖Tv‖2 for all v ∈ V. Since all norms on V are equivalent, there is a constant C > 0 such that |||v||| ≤ C‖v‖1 for every v ∈ V. But since ‖Tv‖2 ≤ |||v||| ≤ C‖v‖1, T is bounded. ◻

In the previous section, we mentioned two relevant subspaces related to a linear transformation T. We saw that ker(T) = {0} if and only if T is injective, and that T(V) = W if and only if T : V → W is surjective. This is a result that is independent of the dimensions of V and W. However, when we dig further into the properties of these spaces, we see that ker(T) and T(V) are not necessarily closed subspaces of V and W, respectively, something that was automatic in the case of finite dimensions.

Example 4.23. Consider the vector space ℓ0 ⊆ ℓ1 defined as

ℓ0 = {(xn) : xn ∈ ℝ, ∃N ∈ ℕ such that xk = 0 for all k > N},

and define the linear transformation T : ℓ0 → ℝ as

T((xn)) = ∑_{n=1}^{∞} n xn.

T is clearly a linear transformation and

ker(T) = {(xn) ∈ ℓ0 : ∑_{n=1}^{∞} n xn = 0}.


We will show that this is not a closed subspace of ℓ0. In fact, consider the sequence s1 = (1, 0, 0, ...). Then clearly s1 ∉ ker(T); however, if we define the sequences

t1 = (1, −1/2, 0, 0, ...),
t2 = (1, 0, −1/3, 0, 0, ...),
...
tk = (1, 0, ..., 0, −1/(k+1), 0, ...),  with the entry −1/(k+1) in the (k+1)-th position,
...

we have that T(tk) = 0 for every k, while ‖s1 − tk‖1 = 1/(k+1) → 0. This shows that ker(T) is not closed. ⊘

The operator T of this example is not bounded. We will see that for bounded operators no such example exists. However, even for bounded operators it is possible that the range is not closed (see Problem 4.10).

Theorem 4.24. Let (V, ‖·‖) and (W, ‖·‖) be two normed spaces and suppose that T : V → W is a bounded linear operator. Then ker(T) is closed.

Proof. Notice that ker(T) = T⁻¹({0}) is the inverse image of a closed subset of W. Since T is continuous, the result follows. ◻

We now focus on the space B(V, W) mentioned in Definition 4.19. We will endow this vector space with a norm by considering the function ‖·‖ : B(V, W) → ℝ+ defined by

‖T‖ = sup{‖Tv‖ : ‖v‖ ≤ 1}.

We will prove that this function is, in fact, a norm on B(V, W).

(a) It follows directly from the definition that ‖T‖ ≥ 0 for every T ∈ B(V, W).
(b) Let T ∈ B(V, W). Then ‖T‖ = 0 if, and only if, Tv = 0 for every v ∈ V, which occurs if, and only if, T = 0.
(c) Let λ ∈ 𝔽 and T ∈ B(V, W). Then

‖λT‖ = sup{‖λTv‖ : ‖v‖ ≤ 1} = |λ| sup{‖Tv‖ : ‖v‖ ≤ 1} = |λ|‖T‖.

(d) Let T1, T2 ∈ B(V, W). Then

‖T1 + T2‖ = sup{‖(T1 + T2)v‖ : ‖v‖ ≤ 1} ≤ sup{‖T1 v‖ + ‖T2 v‖ : ‖v‖ ≤ 1} ≤ ‖T1‖ + ‖T2‖.

This shows that B(V, W) is a normed space, and its norm will be assumed to be the one previously defined unless stated otherwise.

Example 4.25. Denote by I : V → V the identity operator on a normed space V, V ≠ {0}; then it is easy to see that ‖I‖ = 1. ⊘

Example 4.26. The multiplication operator Mf, with f ∈ L∞_μ(K), defined as

Mf : Lp_μ(K) → Lp_μ(K), h ↦ f h,

is continuous for 1 ≤ p < ∞, and ‖Mf‖ = ‖f‖∞.

Proof. If ‖f‖∞ = 0, the proof is straightforward. Let us assume then that ‖f‖∞ ≠ 0 and 1 ≤ p < ∞. If ‖h‖p = 1, it follows from

‖Mf h‖_p^p = ∫_K |f(x)|^p |h(x)|^p dμ(x) ≤ ‖f‖∞^p ‖h‖_p^p

that Mf is continuous and ‖Mf‖ ≤ ‖f‖∞.
Now, let 0 < γ < ‖f‖∞. Then there is a measurable set A, with 0 < μ(A) < ∞, such that γ < |f(x)| ≤ ‖f‖∞ for every x ∈ A. Also, the characteristic function χA of A is an element of Lp_μ(K) and

‖Mf χA‖_p^p = ∫_A |f(x)|^p |χA(x)|^p dμ(x) ≥ γ^p ‖χA‖_p^p.

This implies that γ ≤ ‖Mf‖. Hence ‖Mf‖ = ‖f‖∞. ◻

Example 4.27. For every 1 ≤ p < ∞, define the linear operator

S : ℓp → ℓp, (x1, x2, ...) ↦ (x2, x3, ...).

It is clear that ‖S(xn)‖p ≤ ‖(xn)‖p for every sequence (xn) ∈ ℓp. This implies that ‖S‖ ≤ 1. Moreover, taking the sequence (yn) with

yn = 1, if n = 2;  yn = 0, if n ≠ 2,

we have that ‖S(yn)‖p = ‖(yn)‖p = 1. Thus, ‖S‖ = 1. ⊘
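The norm computation for the shift can be mirrored on finitely supported sequences. A sketch with p = 2 (any 1 ≤ p < ∞ works; the sample data are ours):

```python
# Left shift S(x_1, x_2, ...) = (x_2, x_3, ...) on finitely
# supported sequences; here p = 2 is chosen for illustration.
def shift(x):
    return x[1:]

def norm_p(x, p):
    return sum(abs(t) ** p for t in x) ** (1.0 / p)

p = 2
x = [3.0, -1.0, 2.0, 0.5]
assert norm_p(shift(x), p) <= norm_p(x, p)     # ||Sx|| <= ||x||, so ||S|| <= 1

y = [0.0, 1.0]        # the sequence (y_n) from the example
assert norm_p(shift(y), p) == norm_p(y, p) == 1.0   # norm attained: ||S|| = 1
```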

Example 4.28. Suppose that K ⊆ ℂ and that (K, A, μ) is a measure space. Let K : K × K → 𝔽 be a measurable function and suppose there is a positive constant C such that

∫_K |K(x, y)| dμ(x) ≤ C for μ-almost every y ∈ K.

Consider the operator TK : L1_μ(K) → L1_μ(K) defined by

(TK f)(x) = ∫_K K(x, y) f(y) dμ(y), with f ∈ L1_μ(K).

Then TK is continuous and ‖TK‖ ≤ C.

Proof. If f ∈ L1_μ(K), then

|(TK f)(x)| ≤ ∫_K |K(x, y) f(y)| dμ(y).

We also have that

‖TK f‖1 = ∫_K |(TK f)(x)| dμ(x) ≤ ∬_{K×K} |K(x, y)| |f(y)| dμ(y) dμ(x).

Using Fubini’s theorem, we get

‖TK f‖1 ≤ ∫_K ( ∫_K |K(x, y)| dμ(x) ) |f(y)| dμ(y) ≤ C‖f‖1.

From this the conclusion follows. ◻
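A discrete analogue of this example takes μ to be counting measure on finitely many points, so TK becomes a matrix and the hypothesis becomes a bound C on the absolute column sums. A sketch (matrix and test data are arbitrary choices of ours):

```python
# Discrete analogue: counting measure on {0,1,2}, the kernel is a
# matrix, and C bounds the absolute column sums sum_x |K[x][y]|.
K = [[0.5, -1.0, 0.0],
     [0.25, 2.0, 1.0],
     [-0.25, 0.5, -3.0]]

def apply_K(f):
    return [sum(K[x][y] * f[y] for y in range(3)) for x in range(3)]

def norm1(v):
    return sum(abs(t) for t in v)

C = max(sum(abs(K[x][y]) for x in range(3)) for y in range(3))

for f in ([1.0, 2.0, 3.0], [-1.0, 0.0, 4.0], [0.1, -0.2, 0.3]):
    assert norm1(apply_K(f)) <= C * norm1(f) + 1e-12   # ||T_K f||_1 <= C ||f||_1
```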

Having shown some examples of bounded operators and computed their norms, a natural question to ask is whether the normed space B(V, W) is a Banach space. The following theorem gives an answer to this question.

Theorem 4.29. Let (W, ‖·‖) be a Banach space and (V, ‖·‖) be a normed space. Then B(V, W) is a Banach space.


Proof. Let (Tn)_{n=1}^{∞} be a Cauchy sequence in B(V, W). That is, each Tn is a bounded linear operator Tn : V → W and, for every ε > 0, there exists N ∈ ℕ such that if n, m ≥ N, then ‖Tn − Tm‖ < ε. Notice that for every v ∈ V, it holds that ‖Tn v − Tm v‖ ≤ ‖Tn − Tm‖‖v‖. It follows that for every v ∈ V the sequence (Tn v) is a Cauchy sequence in W. But since W is a Banach space, we conclude that it converges to some w ∈ W. Now define the operator T : V → W as

Tv = lim_{n→∞} Tn v.

It follows from the properties of limits that T is a linear transformation. Moreover, from the continuity of the norm in W, we have that if n ≥ N, then

‖Tn v − Tv‖ = lim_{m→∞} ‖Tn v − Tm v‖ ≤ ε‖v‖.

Thus, (Tn − T) ∈ B(V, W) with ‖Tn − T‖ ≤ ε and, since B(V, W) is a vector space, T = Tn + (T − Tn) ∈ B(V, W). Finally, since ‖Tn − T‖ ≤ ε for every n ≥ N, we have that Tn → T in B(V, W), and hence the space is complete. ◻

The following theorem will allow us to extend a linear operator. First, recall the definition of an extension of a function.

Definition 4.30. Let f : X → Z and g : Y → Z be two functions. f is an extension of g, or g is a restriction of f, if Y ⊆ X and f(t) = g(t) for all t ∈ Y. We use the notation f|_Y = g.

Theorem 4.31. Let (W, ‖·‖) be a Banach space and let (V, ‖·‖) be a normed space. Suppose that D ⊆ V is a dense subspace of V. If T : D → W is a bounded linear operator, then T has a unique extension T̃ ∈ B(V, W). Moreover, we have ‖T̃‖ = ‖T‖.

Proof. Let v ∈ V and let (vn) be a sequence in D that converges to v. Since ‖Tvn − Tvm‖ ≤ ‖T‖‖vn − vm‖, we have that the sequence (T(vn)) is a Cauchy sequence in W and, hence, it is convergent, say to w ∈ W. Now define T̃ : V → W as

T̃v = lim_{n→∞} Tvn.


Clearly, T̃ is linear and an extension of T. Notice that if (v′n) ⊆ D is another sequence that converges to v, with Tv′n → w′, then the sequence (v1, v′1, v2, v′2, ...) also converges to v and, using the same argument as before, the sequence (Tv1, Tv′1, Tv2, Tv′2, ...) is convergent in W. Hence its two convergent subsequences (Tvn) and (Tv′n) must converge to the same vector. Thus, w = w′, and this proves that T̃ is well defined. Now, if v ∈ V then, by the continuity of the norm,

‖T̃v‖ = lim_{n→∞} ‖Tvn‖ ≤ lim_{n→∞} ‖T‖‖vn‖ = ‖T‖‖v‖,

which implies that ‖T̃‖ ≤ ‖T‖. On the other hand,

‖T‖ = sup{‖Tv‖ : v ∈ D, ‖v‖ ≤ 1} = sup{‖T̃v‖ : v ∈ D, ‖v‖ ≤ 1} ≤ sup{‖T̃v‖ : v ∈ V, ‖v‖ ≤ 1} = ‖T̃‖.

Then, ‖T̃‖ = ‖T‖. Lastly, we prove the uniqueness of the extension. Suppose that S ∈ B(V, W) is an extension of T and v ∈ V. Then every sequence (vn) in D converging to v satisfies Tvn = Svn and, since both T̃ and S are continuous, T̃v = Sv. Hence, S = T̃. ◻

4.4 Problems

4.1. Let V1 and V2 be vector spaces, T : V1 → V2 a linear operator, and v1, ..., vn ∈ V1 a set of linearly dependent elements. Prove that the set T(v1), ..., T(vn) is linearly dependent.
4.2. Verify that the function Mf defined in Example 5.8 is a linear transformation.
4.3. Show that matrix multiplication corresponds to composition of linear transformations in the sense of Theorem 4.10.
4.4. Prove Corollary 4.13.
4.5. Let (H1, ⟨·,·⟩1) and (H2, ⟨·,·⟩2) be two Hilbert spaces and let T : H1 → H2 be a linear operator. Let w ∈ H1 and u ∈ H2, and suppose that for every v ∈ H2,

⟨v, Tw⟩2 = ⟨v, u⟩2.

Prove that Tw = u.
4.6. Let (V, ‖·‖) and (W, ‖·‖) be normed spaces and let T : V → W be a bounded linear operator. Show that

sup{‖Tv‖ : ‖v‖ ≤ 1} = sup{‖Tv‖ : ‖v‖ = 1}
                    = sup{‖T(v/‖v‖)‖ : v ∈ V, v ≠ 0}
                    = inf{C > 0 : ‖Tv‖ ≤ C‖v‖ for all v ∈ V}.

4.7. Verify that the linear operators defined in Examples 4.3 and 4.4 are bounded.
4.8. Verify that the linear operator defined in Example 4.7 is bounded.
4.9. Suppose (V, ‖·‖) and (W, ‖·‖) are normed spaces and T : V → W is an isometric isomorphism. Prove that if V is a Banach space, then W is also a Banach space.
4.10. Find an example of a bounded linear operator T : V → W such that T(V) is not closed.
4.11. Let (V, ‖·‖) be a normed space and let T : V → 𝔽 be a nonzero linear transformation. Prove that ker(T) is dense in V if and only if T is not bounded.
4.12. Let (V, ‖·‖), (W, ‖·‖) and (X, ‖·‖) be normed spaces and let T1 : V → W and T2 : W → X be bounded linear operators. Denote by T2T1 : V → X the composition of T2 and T1. Show that ‖T2T1‖ ≤ ‖T2‖‖T1‖.
4.13. Let H be a Hilbert space. Suppose that the field of scalars is ℂ and let T : H → H be a linear operator such that ⟨Tv, v⟩ = 0 for every v ∈ H. Prove that Tv = 0 for every v ∈ H. Is this true if the field of scalars is ℝ?

5 Functionals and Dual Spaces

Learning Targets
✓ Comprehend the notion of linear functionals and their representation in Hilbert spaces.
✓ Understand some examples of dual spaces and their representation by means of isometric isomorphisms.

5.1 A Special Type of Linear Operators

In the second section of the previous chapter, we briefly considered the special case of linear transformations T : 𝔽n → 𝔽. We noticed that each such transformation is associated with a vector in 𝔽n. The idea was to give the reader a taste of the special nature of this type of linear operator. In this chapter, we will focus on the study of such operators.

Definition 5.1. Let (V, ‖·‖) be a normed space over a field 𝔽. A linear functional on V is a bounded linear operator T : V → 𝔽.

We highlight that some authors use a different definition of linear functional, viz. they define linear functionals simply as linear operators T : V → 𝔽, without requiring boundedness.

Example 5.2. Let’s start with an example we mentioned before. Let v ∈ 𝔽n be fixed and define the operator Tv : 𝔽n → 𝔽 as Tv u = ⟨u, v⟩, where ⟨·,·⟩ is the natural inner product in 𝔽n. It is easy to verify that Tv is a linear operator. Moreover, by the Cauchy–Schwarz inequality we have that |Tv u| ≤ ‖u‖‖v‖, and consequently Tv is bounded and ‖Tv‖ ≤ ‖v‖. ⊘

Example 5.3. The definite integral operator on the Banach space (C([a, b]), ‖·‖∞) is the mapping

T : C([a, b]) → ℝ, f ↦ ∫_a^b f(x) dx.


Such an operator is linear and continuous, since

| ∫_a^b f(x) dx | ≤ ‖f‖∞ (b − a). ⊘

Example 5.4. Consider the same Banach space (C([a, b]), ‖·‖∞) as in the previous example and, for each t ∈ [a, b], define the evaluation operator

δt : C([a, b]) → ℝ, f ↦ f(t).

The operators δt are bounded for every t ∈ [a, b], since |f(t)| ≤ ‖f‖∞ for every f ∈ C[a, b]. ⊘

Example 5.5. Now let’s consider a similar example with a different norm. Consider the normed space (C([−1, 1]), ‖·‖1), where ‖f‖1 = ∫_{−1}^{1} |f(t)| dt. Define, as before, the linear operator

δ0 : C([−1, 1]) → ℝ, f ↦ f(0).

Now choose a function f1 ∈ C([−1, 1]) such that f1(−1) = f1(1) = 0 and f1(0) ≠ 0. For every n ≥ 2, define fn : [−1, 1] → ℝ by

fn(x) = f1(nx), if |x| ≤ 1/n;  fn(x) = 0, otherwise.

Notice that

‖fn‖1 = ∫_{−1}^{1} |fn(x)| dx ≤ 2‖f1‖∞ / n → 0.

On the other hand, notice that δ0(fn) = fn(0) = f1(0) ≠ 0 for every n, and consequently δ0 is not a bounded linear transformation. ⊘

Example 5.6. If (V, ‖·‖) is a normed space, then ‖·‖ : V → ℝ+ defines what will be called a nonlinear functional. ⊘

Example 5.7. Let 1 < p < ∞ and consider the Banach space (ℓp, ‖·‖p). Fix a sequence x = (xn)_{n=1}^{∞} ∈ ℓq, where 1/p + 1/q = 1, and define the linear operator

Tx : ℓp → 𝔽, (yn)_{n=1}^{∞} ↦ ∑_{n=1}^{∞} yn xn.


By Hölder’s inequality we have that |Tx((yn))| ≤ ‖(yn)‖p ‖(xn)‖q. This shows that Tx is well defined and bounded. ⊘

Example 5.8. The previous example can be generalized as follows. Let 1 < p, q < ∞ be such that 1/p + 1/q = 1, and let (K, A, μ) be a measure space. For each f ∈ Lq_μ(K), define the linear operator

Tf : Lp_μ(K) → 𝔽, h ↦ ∫_K f h dμ.

By Hölder’s inequality, Tf is bounded and its norm is at most ‖f‖q. ⊘

Example 5.9. Example 5.2 is very similar to what happens in general Hilbert spaces. Suppose (H, ⟨·,·⟩) is a Hilbert space over a field 𝔽 and let v ∈ H. Define the linear operator

Tv : H → 𝔽, u ↦ ⟨u, v⟩.

Since by the Cauchy–Schwarz inequality |Tv u| ≤ ‖u‖‖v‖, Tv is a bounded operator for every v ∈ H. ⊘

It turns out that every linear functional on a Hilbert space can be written in such a way. In order to prove this statement, we will need the following lemma. Lemma 5.10. Suppose (H, ⟨⋅, ⋅⟩) is a Hilbert space over a field 𝔽 and let T : H → 𝔽 be a linear functional, T ≠ 0. Then dim ker(T)⊥ = 1. Proof. Since T ≠ 0, we can always assure the existence of a vector v ∈ ker(T)⊥ such that T(v) = 1 (why?). Suppose that u ∈ ker(T)⊥ . Then u – vTu also belongs to ker(T)⊥ since ker(T)⊥ is a vector space. But notice that T(u – vTu) = 0 and consequently u – vTu ∈ ker(T). Thus u = vTu, and this proves that the space ker(T)⊥ consists of only multiples of v. ◻


Theorem 5.11 (Riesz Representation Theorem for Hilbert spaces). Suppose (H, ⟨·,·⟩) is a Hilbert space over a field 𝔽 and let T : H → 𝔽 be a linear functional. Then there exists a vector v ∈ H such that for every u ∈ H,

Tu = ⟨u, v⟩.

Moreover, ‖T‖ = ‖v‖.

Proof. If T = 0, take v = 0. Otherwise, we use Lemma 5.10 to find a vector w ∈ ker(T)⊥, ‖w‖ = 1, with the property that for every u ∈ H there exist u1 ∈ ker(T) and α ∈ 𝔽 such that u = u1 + αw. Now notice that Tu = αTw and

⟨u, w⟩ = ⟨u1 + αw, w⟩ = ⟨u1, w⟩ + α⟨w, w⟩ = 0 + α‖w‖² = α.

Hence, if we define v = \overline{Tw} w (the conjugate is needed when 𝔽 = ℂ), we have that

⟨u, v⟩ = Tw ⟨u, w⟩ = α Tw = Tu.

Moreover, since |Tu| ≤ ‖v‖‖u‖ and Tv = ‖v‖², we conclude that ‖T‖ = ‖v‖. ◻
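In ℝⁿ the representer can be produced exactly as in the finite-dimensional discussion of the previous chapter: its coordinates are the values of T on the canonical basis. A sketch for an arbitrarily chosen functional on ℝ³:

```python
import math

# Riesz representation on R^3: the representer of a linear
# functional T is v = (T(e_1), T(e_2), T(e_3)), and ||T|| = ||v||_2.
def T(u):
    x, y, z = u
    return x - 2 * y + 2 * z      # an arbitrary linear functional

def dot(u, v):
    return sum(a * b for a, b in zip(u, v))

v = [T(e) for e in ([1, 0, 0], [0, 1, 0], [0, 0, 1])]   # v = (1, -2, 2)

for u in ([1, 1, 1], [3, 0, -2], [0.5, 2.0, -1.5]):
    assert abs(T(u) - dot(u, v)) < 1e-12    # Tu = <u, v>

norm_v = math.sqrt(dot(v, v))               # ||v|| = 3
assert abs(T(v) - norm_v ** 2) < 1e-12      # Tv = ||v||^2, so ||T|| = ||v||
```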

5.2 Dual Spaces

In the previous section we went over several examples of linear functionals acting on different normed spaces. In this section we will focus on the normed space of all continuous linear functionals.

Definition 5.12. Let (V, ‖⋅‖) be a normed space. The Banach space V^* consisting of all continuous linear functionals on V will be called the (topological) dual space of V.

Notice that, in the notation of previous chapters, V^* = B(V, 𝔽). Theorem 5.11 can be rephrased in the following sense: Given a Hilbert space H, there is an isomorphism T : H → H^* mapping each vector v ∈ H to its associated linear


functional T_v in such a way that ‖v‖ = ‖T_v‖. Hence, we say that the dual space H^* of a Hilbert space H is isomorphic to H. This statement completely characterizes the dual space of a Hilbert space. In what follows, we will show similar characterizations for other examples of normed spaces.

Example 5.13. We saw in Example 5.2 that linear functionals on 𝔽^n are associated with vectors in 𝔽^n. Moreover, since 𝔽^n is a Hilbert space, (𝔽^n)^* is isomorphic to 𝔽^n. ⊘

Example 5.14. Let 1 < p, q < ∞ be conjugate exponents. In Example 5.7 we saw that any element of ℓ^q generates a linear functional on ℓ^p. In fact, we will show that these are the only linear functionals on ℓ^p.

First, notice that if (x_n)_{n=1}^∞ ∈ ℓ^p, then

(x_n)_{n=1}^∞ = ∑_{n=1}^∞ x_n e_n,

where e_n denotes the sequence whose entries are all equal to zero except for the n-th entry, which is equal to 1. Define the linear operator

T : (ℓ^p)^* → ℓ^q,  f ↦ (f(e_n))_{n=1}^∞.

It is left to the reader to prove the linearity of T. We will show that for every f ∈ (ℓ^p)^* the sequence (f(e_n))_{n=1}^∞ actually belongs to ℓ^q and that T is surjective.

Fix f ∈ (ℓ^p)^* and for each K ∈ ℕ consider the sequence (x_n^K)_{n=1}^∞ ∈ ℓ^p defined as

x_n^K = |f(e_n)|^q / f(e_n)  if n ≤ K and f(e_n) ≠ 0,
x_n^K = 0  otherwise.

Notice that

f((x_n^K)_{n=1}^∞) = ∑_{n=1}^K x_n^K f(e_n) = ∑_{n=1}^K |f(e_n)|^q,

and on the other hand,

|f((x_n^K)_{n=1}^∞)| ≤ ‖f‖ ‖(x_n^K)_{n=1}^∞‖_p = ‖f‖ (∑_{n=1}^K |f(e_n)|^{(q−1)p})^{1/p} = ‖f‖ (∑_{n=1}^K |f(e_n)|^q)^{1/p}.

Putting both equations together we have that

(∑_{n=1}^K |f(e_n)|^q)^{1/q} ≤ ‖f‖,

and letting K tend to infinity we conclude that the sequence (f(e_n))_{n=1}^∞ belongs to ℓ^q with ‖(f(e_n))_{n=1}^∞‖_q ≤ ‖f‖; the opposite inequality follows from Example 5.7. The surjectivity of T also follows from Example 5.7. ⊘

Example 5.15. Following a reasoning similar to that of the previous example, it is possible to show that if [a, b] ⊆ ℝ and 1 < p < ∞, then (L^p[a, b])^* is isomorphic to L^q[a, b], where p and q are conjugate exponents and the measure considered is the Lebesgue measure on the interval [a, b]. We leave the proof as an exercise for the readers who have experience in measure theory. We remark that this fact, together with Theorem 4.29 and Problem 4.9, implies that the space L^q[a, b] is complete. ⊘

Example 5.16. The following theorem gives us an isomorphism between the dual space C(X)^* and the space of all finite complex Borel measures on X. We mention it here but its proof is outside the objectives of this book. One proof can be found in Ref. [13].

Theorem 5.17 (Riesz–Markov Theorem). Let X be a compact Hausdorff topological space and M(X) the set of all finite complex Borel measures on X, with the norm ‖μ‖ = |μ|(X), μ ∈ M(X). Then the mapping M(X) → C(X)^* defined as μ ↦ G_μ, where

G_μ(f) = ∫_X f dμ  for every f ∈ C(X),

is a linear surjective isometry.

One useful result about the dual space is that the norm of a vector v in a normed space V can be recovered from the action of all the linear functionals on v. The following theorem makes this statement precise. Its proof depends on the Hahn–Banach Theorem, which will be studied later; see Corollary 13.9.

Theorem 5.18. Let (V, ‖⋅‖) be a normed space and let v ∈ V. Then

‖v‖ = sup {|f(v)| : f ∈ V^*, ‖f‖ = 1}.

At this point it is natural to wonder what happens with the dual space of a dual space. From the previous examples of Hilbert spaces and ℓ^p spaces, it seems that when you


take the dual twice you get the original space back (up to isomorphisms). However, let us think about it for a moment. Given a normed space V, its dual is defined as V^* = B(V, 𝔽). Consequently V^* must be complete. Hence, any incomplete normed space V is an example of a space such that (V^*)^* is not isomorphic to V. One question remains, however: are there any such examples among Banach spaces? We can give an answer with the help of several problems from this and previous chapters. In Problem 3.16 the reader is asked to show that the space (c_0, ‖⋅‖_∞) is a Banach space. In Problem 5.13 it is necessary to prove that the dual space of (c_0, ‖⋅‖_∞) is isometrically isomorphic to (ℓ^1, ‖⋅‖_1), and in Problem 5.4 it must be shown that the dual space of (ℓ^1, ‖⋅‖_1) is isometrically isomorphic to (ℓ^∞, ‖⋅‖_∞). Putting all of this together, and knowing (from Problem 3.17 and Example 3.41) that (c_0, ‖⋅‖_∞) is separable and (ℓ^∞, ‖⋅‖_∞) is not, we conclude that there is no isometric isomorphism between these spaces. We will study more about the double dual space in Chapter 15. From the previous reasoning, we see that separability is a useful tool for comparing spaces. We finish the chapter by proving that the separability of the dual space implies the separability of the space itself.

Theorem 5.19. Let (V, ‖⋅‖) be a normed space. If V^* is separable then V is separable.

Proof. Suppose V^* is separable. Then there is a sequence (f_n)_{n=1}^∞ which is dense in V^*. Choose v_n ∈ V with ‖v_n‖ = 1 such that for every n,

|f_n(v_n)| ≥ ‖f_n‖/2,

and define S to be the closed linear span of {v_n}. This space is separable since it is generated by a countable set. We will show that V = S by showing that if f ∈ V^* and f(w) = 0 for every w ∈ S, then we must have f ≡ 0.

Using the density of (f_n)_{n=1}^∞, it is possible to take a subsequence (f_{n_j})_{j=1}^∞ ⊆ (f_n)_{n=1}^∞ such that f_{n_j} → f. For every n_j we have (recall that f vanishes on S, so f(v_{n_j}) = 0)

‖f − f_{n_j}‖ ≥ |(f − f_{n_j})(v_{n_j})| = |f_{n_j}(v_{n_j})| ≥ ‖f_{n_j}‖/2,

and hence

‖f‖ ≤ ‖f − f_{n_j}‖ + ‖f_{n_j}‖ ≤ 3‖f − f_{n_j}‖ → 0 as j → ∞.

This shows that f ≡ 0. We conclude V = S, and consequently V is separable.


5.3 The Bra-ket Notation

In this section we want to emphasize the importance of Hilbert spaces in modern physics, namely in quantum mechanics. We will also talk about the widely used Dirac notation, the so-called bra-ket notation. Quantum mechanics is ruled by five postulates, one of which says that the set of all states of a physical system is a Hilbert space; for more information about the other postulates cf. Ref. [32]. Therefore the theory of Hilbert spaces is foundational for the study of quantum mechanics. In physics, following Dirac, one uses the bra-ket notation.

Definition 5.20 (Ket Vector). Let (H, ⟨⋅, ⋅⟩) be a Hilbert space. Any element of H is called a ket vector and is denoted by |v⟩ instead of v.

At first sight the ket notation for a vector in H seems somewhat cumbersome, but the reason behind it becomes apparent when we look at the notation for bra vectors. From the Riesz Representation Theorem we know that any linear functional in H^* is given by an element of H; in fact, H^* is isometrically isomorphic to H.

Definition 5.21 (Bra Vector). Let (H, ⟨⋅, ⋅⟩) be a Hilbert space. For any element v of H, the bra vector ⟨v| stands for the linear functional belonging to H^* given by the vector v.

The notation is now clear. The bra-ket notation is just a clever way to distinguish between a vector itself, |v⟩, and the linear functional generated by the vector, ⟨v|. Moreover, this strange-looking notation was devised so that operating a bra vector on a ket vector produces an inner product, ⟨w||v⟩ = ⟨w|v⟩, and the notation ⟨⋅|⋅⟩ is also used to denote the inner product. We know that if a Hilbert space H has a countable basis {e_1, e_2, . . . , e_n, . . .}, then each v ∈ H can be written as

v = ∑_{k=1}^∞ c_k e_k,

where the c_k are the Fourier coefficients. With ket notation and the definition of the Fourier coefficients we can write


|v⟩ = ∑_{k=1}^∞ ⟨e_k|v⟩ |e_k⟩.

In a similar vein we can obtain a series expansion for a bra vector using a basis for the space H^*, namely {⟨e_1|, ⟨e_2|, . . . , ⟨e_n|, . . .}: if ⟨w| ∈ H^*, then

⟨w| = ∑_{k=1}^∞ ⟨w|e_k⟩ ⟨e_k|.

Taking both series expansions for bra and ket vectors, we arrive at

⟨w|v⟩ = ∑_{k=1}^∞ ⟨w|e_k⟩⟨e_k|v⟩.

Linear Operators and the Bra-ket Notation. Suppose that (H, ⟨⋅, ⋅⟩) is a Hilbert space over a field 𝔽 and T_1 : H → H is a linear operator. Then we denote the action of T_1 on the ket vector |v⟩ ∈ H as T_1|v⟩ ≡ |T_1 v⟩. Similarly, if T_2 : H^* → H^* is a linear operator, then its action on a bra vector ⟨w| ∈ H^* is denoted by ⟨w|T_2 ≡ ⟨wT_2|. Notice that ⟨wT_2| ∈ H^* and consequently it is a functional acting on H. That is,

⟨wT_2| : H → 𝔽,  |v⟩ ↦ ⟨wT_2|v⟩.

For example, if T^* : H^* → H^* is the adjoint operator of T, then for every |v⟩ ∈ H and ⟨w| ∈ H^* we have that ⟨wT^*|v⟩ = ⟨w|Tv⟩. Another variety of adjoint operator appears when the bra vector ⟨w| ∈ H^* is itself seen as a linear functional. Its adjoint is the operator

⟨w|^* : 𝔽^* → H^*,  ⟨α| ↦ ⟨α|⟨w|^*,

where (⟨α|⟨w|^*)(|v⟩) = α⟨w|v⟩.


Other types of operators, acting both on H and on H^*, are defined as follows. Fix a ket |x⟩ ∈ H and a bra ⟨y| ∈ H^*, and define the linear operator

|x⟩⟨y| : H → H,  |v⟩ ↦ |x⟩⟨y|v⟩.

However, it is also possible, using the same notation, to define an operator on H^*:

|x⟩⟨y| : H^* → H^*,  ⟨w| ↦ ⟨w|x⟩⟨y|.

In both instances the operators have rank one, since in the former case the range is the subspace of H generated by |x⟩, and in the latter it is the subspace of H^* generated by ⟨y|. As an example, suppose that {|e_1⟩, |e_2⟩, . . .} ⊆ H is a countable orthonormal basis for H. Then {⟨e_1|, ⟨e_2|, . . .} is a basis for H^* (see Problem 5.14). Fix m ∈ ℕ and define the operator

|e_m⟩⟨e_m| : H → H,  |v⟩ ↦ |e_m⟩⟨e_m|v⟩.

Notice that for |v⟩ ∈ H,

(|e_m⟩⟨e_m|)(|e_m⟩⟨e_m|)(|v⟩) = (|e_m⟩⟨e_m|)(|e_m⟩⟨e_m|v⟩) = (|e_m⟩⟨e_m|e_m⟩)(⟨e_m|v⟩) = |e_m⟩(⟨e_m|v⟩) = (|e_m⟩⟨e_m|)(|v⟩).

Thus, |e_m⟩⟨e_m| is the orthogonal projection onto the space generated by |e_m⟩. A similar result is obtained when |e_m⟩⟨e_m| is seen as an operator on H^*; the details are left to the reader. With this in hand, we can rewrite the projections onto any finite-dimensional subspace. For example, for the space S = span{e_{n_1}, . . . , e_{n_m}} we define

P : H → H,  |v⟩ ↦ (∑_{j=1}^m |e_{n_j}⟩⟨e_{n_j}|)(|v⟩).
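In coordinates, these rank-one operators are just outer products. The following finite-dimensional sketch (an illustration of ours, not from the text) checks that |e_m⟩⟨e_m| is idempotent, self-adjoint and of rank one, and that it acts as |v⟩ ↦ ⟨e_m|v⟩|e_m⟩:

```python
import numpy as np

# |e_m><e_m| in coordinates: the ket |e_m> is a column vector, the bra <e_m|
# its conjugate transpose, and their outer product is a rank-one orthogonal
# projection.
n, m = 4, 1
e = np.eye(n)
P = np.outer(e[m], e[m].conj())         # |e_m><e_m|

assert np.allclose(P @ P, P)            # idempotent
assert np.allclose(P, P.conj().T)       # self-adjoint => orthogonal projection
assert np.linalg.matrix_rank(P) == 1    # rank one

v = np.array([1.0, 2.0, 3.0, 4.0])
assert np.allclose(P @ v, e[m] * v[m])  # P|v> = <e_m|v> |e_m>
```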

We leave it as an exercise to prove that P is also an orthogonal projection operator.

Remark 5.22. In this section we have focused only on notational issues that are useful for distinguishing vectors in a Hilbert space H from vectors in its dual H^*. We will not


go over the quantum mechanics setting in this book; however, we refer the interested readers to Ref. [27] for an expository article addressed to mathematicians who want to get a general idea of the subject. ⊘

5.4 Problems

5.1. Prove that two continuous linear functionals defined on the same normed space and with a common kernel are proportional.

5.2. Consider the linear operator T_v defined in Example 5.2. Prove that ‖T_v‖ = ‖v‖.

5.3. Consider the linear operator T_f defined in Example 5.8. Prove that ‖T_f‖ = ‖f‖_q.

5.4. Prove that (ℓ^1)^* is isometrically isomorphic to ℓ^∞.

5.5. Let (V, ‖⋅‖) be a Banach space and let v ∈ V. Show that ‖v‖ ≥ sup {|f(v)| : f ∈ V^*, ‖f‖ ≤ 1}.

5.6. Let (V, ‖⋅‖) be a normed space and let A ⊆ V. Define A^⊥ = {f ∈ V^* : f(a) = 0 ∀a ∈ A}. The set A^⊥ is called the annihilator of A in V^*. Prove that A^⊥ is closed in V^*.

5.7. Let (V, ‖⋅‖) be a normed space and let B ⊆ V^*. Define ^⊥B = {v ∈ V : f(v) = 0 ∀f ∈ B}. The set ^⊥B is called the annihilator of B in V. Prove that ^⊥B is closed in V.

5.8. What name do annihilators receive in the case of Hilbert spaces?

5.9. Show that the completeness of normed spaces is a property that is invariant under isometric isomorphisms. Is separability also invariant? How about duality?

5.10. Let (V, ‖⋅‖) be a normed space. Prove that V is finite dimensional if and only if V^* is finite dimensional.

5.11. Prove Example 5.15.

5.12. Let (H, ⟨⋅, ⋅⟩) be a Hilbert space and let A : H^* → H be such that for every f ∈ H^* and v ∈ H, f(v) = ⟨v, Af⟩. Use the polarization identities to prove that the inner product in the Hilbert space H^* is given by ⟨f, g⟩_{H^*} = ⟨Ag, Af⟩.

5.13. Recall the Banach space (c_0, ‖⋅‖_∞) considered in Problem 3.16. Use a reasoning similar to Example 5.7 to show that c_0^* is isometrically isomorphic to ℓ^1.

5.14. Let (H, ⟨⋅, ⋅⟩) be a Hilbert space and let {|e_1⟩, |e_2⟩, . . .} be a countable basis for H. Prove that {⟨e_1|, ⟨e_2|, . . .} is a countable basis for H^*.

6 Fourier Series

Learning Targets ✓ Learn the notion of Fourier series. ✓ Get acquainted with convergence conditions for the Fourier series.

In Chapter 2 we gave an introduction to the vast theory of Hilbert spaces, and we pointed out that orthonormal bases play a special role in the theory. This concept, with the help of Parseval’s identity and Bessel’s inequality, allows us to obtain information about any vector in a Hilbert space from the knowledge of its Fourier coefficients. In this chapter, we will study several aspects of Fourier coefficients in the specific case of the space L^2[−π, π].

6.1 The Space L^2[−π, π]

In this section, we consider the Lebesgue measure defined on the interval [−π, π] ⊆ ℝ and study the space L^2[−π, π]. Recall that this is a real Hilbert space with respect to the inner product

⟨f, g⟩ = ∫_{−π}^{π} f(x)g(x) dx,

which induces the norm given by

‖f‖_2^2 = ∫_{−π}^{π} |f(x)|^2 dx.

Consequently, there exists an orthogonal basis for this space. Such a basis is given by the set of trigonometric functions

{1, cos(nx), sin(nx) : n ∈ ℤ^+}.

Notice that if n, m ∈ ℤ^+,

∫_{−π}^{π} cos(nx) cos(mx) dx = π if n = m,  and 0 if n ≠ m,
∫_{−π}^{π} sin(nx) sin(mx) dx = π if n = m,  and 0 if n ≠ m.


Also,

∫_{−π}^{π} cos(nx) sin(mx) dx = 0,
∫_{−π}^{π} cos(nx) dx = 0,
∫_{−π}^{π} sin(mx) dx = 0.

This shows that {1, cos(nx), sin(nx) : n ∈ ℤ^+} is an orthogonal system. Moreover, as a consequence of the Weierstrass Approximation Theorem (see Theorem 3.23 and also Refs [31, 14]), the system is actually an orthogonal basis for the space L^2[−π, π]. Consequently, we get that the set

{1/√(2π), cos(nx)/√π, sin(nx)/√π : n ∈ ℤ^+}

is an orthonormal basis for L^2[−π, π]. Thus, given f ∈ L^2[−π, π], we have by Theorem 2.57 that

f(x) = a_0/2 + ∑_{n=1}^∞ (a_n cos(nx) + b_n sin(nx)),   (6.1)

where

a_0 = (1/π) ∫_{−π}^{π} f(x) dx,
a_n = (1/π) ∫_{−π}^{π} f(x) cos(nx) dx,
b_n = (1/π) ∫_{−π}^{π} f(x) sin(nx) dx.

The right-hand side of eq. (6.1) is called the Fourier series associated to the function f. Notice that the actual Fourier coefficients of f have one of the following forms:

⟨f, 1/√(2π)⟩ = √(2π) a_0/2,
⟨f, cos(nx)/√π⟩ = √π a_n,
⟨f, sin(nx)/√π⟩ = √π b_n.


However, for historical reasons, we will refer to the numbers {a_0/2, a_n, b_n : n ∈ ℤ^+} as the Fourier coefficients of f. The convergence of the series is with respect to the norm ‖⋅‖_2 of the space L^2[−π, π]. The norm of f can be calculated by

‖f‖_2^2 = π (a_0^2/2 + ∑_{n=1}^∞ (a_n^2 + b_n^2)).
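The coefficient formulas and the norm identity can be checked numerically. The sketch below (the test function x² and the grid sizes are illustrative choices of ours) computes the coefficients by quadrature and verifies Parseval's identity up to truncation error:

```python
import numpy as np

# Fourier coefficients of f(x) = x^2 on [-pi, pi] via the trapezoid rule,
# then the norm identity ||f||_2^2 = pi*(a0^2/2 + sum(an^2 + bn^2)).
x = np.linspace(-np.pi, np.pi, 20001)
dx = x[1] - x[0]
f = x ** 2

def integral(values):
    return float(np.sum((values[:-1] + values[1:]) / 2) * dx)  # trapezoid rule

a0 = integral(f) / np.pi
an = np.array([integral(f * np.cos(n * x)) / np.pi for n in range(1, 200)])
bn = np.array([integral(f * np.sin(n * x)) / np.pi for n in range(1, 200)])

lhs = integral(f ** 2)                                    # ||f||_2^2
rhs = np.pi * (a0 ** 2 / 2 + np.sum(an ** 2 + bn ** 2))   # truncated Parseval sum
assert abs(lhs - rhs) / lhs < 1e-3
```

For this f the exact values are a_0 = 2π²/3, a_n = 4(−1)^n/n² and b_n = 0, so the truncated sum converges quickly.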

6.2 Convergence Conditions for the Fourier Series

The convergence result of the previous section can be written as ‖f − S_n‖_2 → 0, where

S_n(x) = a_0/2 + ∑_{k=1}^n (a_k cos(kx) + b_k sin(kx)).   (6.2)

This means that the sequence of functions (S_n)_{n=1}^∞ converges in norm to f. It is natural to ask under which conditions the convergence is pointwise or uniform.

6.2.1 Sufficient Convergence Conditions for the Fourier Series in a Point

We will now determine sufficient conditions for the convergence of the trigonometric series at a given point. We will need the following preliminary remarks. First notice that all the trigonometric functions belonging to the orthonormal basis previously discussed are bounded on the interval [−π, π]. Consequently, if φ represents any of those functions, we have that for f ∈ L^1[−π, π],

|∫_{−π}^{π} f(x)φ(x) dx| ≤ ∫_{−π}^{π} |f(x)φ(x)| dx ≤ ‖φ‖_∞ ∫_{−π}^{π} |f(x)| dx < ∞.

This implies that the Fourier coefficients a_0, a_n and b_n are well defined even if f merely belongs to L^1[−π, π]. However, the convergence of the Fourier series is a different problem. So far, we can only say that there is a formal relation

f(x) ∼ a_0/2 + ∑_{n=1}^∞ (a_n cos(nx) + b_n sin(nx)).


We now examine the problem of the convergence of this series, at a given point x, to the value of the function f at that point. First we rewrite S_n(x) using the integral expressions of the coefficients a_n and b_n. We obtain

S_n(x) = (1/π) ∫_{−π}^{π} f(t) (1/2 + ∑_{k=1}^n (cos(kx) cos(kt) + sin(kx) sin(kt))) dt
       = (1/π) ∫_{−π}^{π} f(t) (1/2 + ∑_{k=1}^n cos(k(t − x))) dt.

We now use the following trigonometric formula (verify this):

1/2 + cos(u) + cos(2u) + ⋅⋅⋅ + cos(nu) = sin((2n+1)u/2) / (2 sin(u/2)),   (6.3)

to obtain that

S_n(x) = (1/π) ∫_{−π}^{π} f(t) sin((2n+1)(t − x)/2) / (2 sin((t − x)/2)) dt.   (6.4)

Let us make the change of variables z = t − x and use the periodicity of the integrand to conclude that we may preserve the integration limits −π and π when integrating with respect to z. Hence,

S_n(x) = (1/π) ∫_{−π}^{π} f(x + z) sin((2n+1)z/2) / (2 sin(z/2)) dz.

The function

D_n(z) = (1/(2π)) sin((2n+1)z/2) / sin(z/2)

is called the Dirichlet kernel. From identity (6.3) it follows that, for every n,

∫_{−π}^{π} D_n(z) dz = 1.   (6.5)
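Both identity (6.3) and the normalization (6.5) lend themselves to a quick numerical check (the grid resolution and the choice n = 7 below are arbitrary):

```python
import numpy as np

# Check the closed form (6.3) for the Dirichlet kernel sum, and the
# normalization (6.5) of D_n over [-pi, pi].
n = 7
u = np.linspace(-np.pi, np.pi, 40000)   # even count, so u = 0 is never hit
lhs = 0.5 + sum(np.cos(k * u) for k in range(1, n + 1))
rhs = np.sin((2 * n + 1) * u / 2) / (2 * np.sin(u / 2))
assert np.allclose(lhs, rhs)            # identity (6.3)

Dn = lhs / np.pi                        # D_n(z) = (1/(2 pi)) sin((2n+1)z/2)/sin(z/2)
integral = np.sum((Dn[:-1] + Dn[1:]) / 2) * (u[1] - u[0])
assert abs(integral - 1.0) < 1e-3       # normalization (6.5)
```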


Thus,

S_n(x) − f(x) = (1/(2π)) ∫_{−π}^{π} [f(x + z) − f(x)] sin((2n+1)z/2) / sin(z/2) dz.   (6.6)

Consequently, to show that S_n(x) converges to f(x) is equivalent to showing that the integral on the right-hand side of eq. (6.6) converges to zero. In order to study this integral, we will need the following lemma. For a gist of eq. (6.7), see the discussion after Lemma 7.15.

Lemma 6.1. Suppose g ∈ L^1[−π, π]. Then

lim_{m→∞} ∫_{−π}^{π} g(x) sin(mx) dx = 0.   (6.7)

Proof. Suppose first that g is a differentiable function such that g′ is continuous on [−π, π]. Then, integrating by parts, we have

∫_{−π}^{π} g(x) sin(mx) dx = −g(x) cos(mx)/m |_{−π}^{π} + (1/m) ∫_{−π}^{π} g′(x) cos(mx) dx → 0   (6.8)

as m → ∞, since g and g′ are bounded on [−π, π].

Now, for the general case, suppose that g ∈ L^1[−π, π]. From Example 3.30 we know that the space L^1[−π, π] is separable, since it is possible to find a countable dense subset of simple functions. Such functions can be modified to convert them into differentiable functions with continuous derivatives; this is a measure-theoretic exercise that we leave to the reader. We use this in order to get, for every ε > 0, a continuously differentiable function g_ε such that

∫_{−π}^{π} |g(x) − g_ε(x)| dx < ε/2.   (6.9)

We now have that

|∫_{−π}^{π} g(x) sin(mx) dx| ≤ |∫_{−π}^{π} [g(x) − g_ε(x)] sin(mx) dx| + |∫_{−π}^{π} g_ε(x) sin(mx) dx| < ε/2 + |∫_{−π}^{π} g_ε(x) sin(mx) dx|,

and since the last term tends to zero by eq. (6.8), the whole expression is smaller than ε for m large. ◻

Now we are ready to prove a sufficiency result for convergence of the Fourier series.


Theorem 6.2. Suppose f ∈ L^1[−π, π] and fix x ∈ [−π, π]. Suppose that there exists δ > 0 such that

∫_{−δ}^{δ} |f(x + t) − f(x)| / |t| dt < ∞.   (6.10)

Then S_n(x) → f(x).

Remark 6.3. The condition in the previous theorem is known as the Dini condition. ⊘

Proof. First, notice that if

∫_{−δ}^{δ} |f(x + t) − f(x)| / |t| dt = M < ∞,

then

∫_{−π}^{π} |f(x + t) − f(x)| / |t| dt ≤ M + ∫_{[−π,π]\[−δ,δ]} |f(x + t) − f(x)| / |t| dt ≤ M + (‖f‖_1 + 2π|f(x)|)/δ < ∞.

Now recall that to study the convergence of S_n(x) it is enough to study the integral in eq. (6.6). Such integral can be written as

(1/π) ∫_{−π}^{π} [f(x + z) − f(x)]/z ⋅ z/(2 sin(z/2)) ⋅ sin((2n+1)z/2) dz.

But since

∫_{−π}^{π} |f(x + t) − f(x)| / |t| dt < ∞,

then (why?)

∫_{−π}^{π} |[f(x + z) − f(x)]/z ⋅ z/(2 sin(z/2))| dz < ∞,   (6.11)


so if we take

g(z) = [f(x + z) − f(x)]/z ⋅ z/(2 sin(z/2)),

then we can apply the above lemma to obtain the result. ◻
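As a concrete illustration (the square wave and the evaluation point are choices of ours, not from the text), the partial sums of the Fourier series of f(t) = sign(t), whose nonzero coefficients are b_n = 4/(πn) for odd n, converge at x = π/2, a point where f is smooth and the Dini condition clearly holds:

```python
import numpy as np

# Partial Fourier sums of the square wave sign(t) on [-pi, pi], evaluated at
# x = pi/2 where the Dini condition holds; S_N(pi/2) -> 1.
def S(N, x):
    n = np.arange(1, N + 1)
    bn = np.where(n % 2 == 1, 4.0 / (np.pi * n), 0.0)   # sine coefficients of sign(t)
    return float(np.sum(bn * np.sin(n * x)))

x0 = np.pi / 2
errs = [abs(S(N, x0) - 1.0) for N in (10, 100, 1000)]
assert errs[2] < errs[0]   # the error decreases with N
assert errs[2] < 1e-2
```

At x = π/2 the sum reduces to the Leibniz series (4/π)(1 − 1/3 + 1/5 − ⋯) = 1.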

6.2.2 Conditions for Uniform Convergence for the Fourier Series

In the previous section, we presented sufficient conditions for the Fourier series of a function f ∈ L^1[−π, π] to converge at a given point of [−π, π]. There is a broad class of functions satisfying such conditions; for instance, not even continuity of the function is necessary for it to be represented as the sum of a trigonometric series converging at every point. However, a different situation occurs when considering the uniform convergence of the Fourier series. For example, if the function f has at least one discontinuity, its Fourier series cannot converge uniformly to it, since the sum of a uniformly convergent series of continuous functions is a continuous function. Hence, continuity is a necessary condition for uniform convergence of the Fourier series. Actually we will need a stronger form of continuity (and more) to ensure uniform convergence.

Definition 6.4. Let f : [a, b] → ℝ be a function. We say that f is absolutely continuous if, for every ε > 0, there exists δ > 0 such that if x_1, . . . , x_n ∈ [a, b] and y_1, . . . , y_n ∈ [a, b] are finite sequences (with the intervals (x_k, y_k) pairwise disjoint) such that ∑_{k=1}^n |x_k − y_k| < δ, then

∑_{k=1}^n |f(x_k) − f(y_k)| < ε.

Remark 6.5. It can be shown, see, e.g., Ref. [20], that every absolutely continuous function f is differentiable almost everywhere. Consequently, we will speak freely of f′. ⊘

Theorem 6.6. Suppose f is an absolutely continuous function on [−π, π] whose derivative belongs to L^2[−π, π]. Then the Fourier series of f converges uniformly to f.

Proof. Since f′ ∈ L^2[−π, π], the Fourier coefficients of f′ are well defined; they will be denoted by a′_n and b′_n. Since f is absolutely continuous, we can integrate by parts to get


a_n = (1/π) ∫_{−π}^{π} f(x) cos(nx) dx = (1/(nπ)) f(x) sin(nx) |_{−π}^{π} − (1/(nπ)) ∫_{−π}^{π} f′(x) sin(nx) dx = −b′_n/n,

and similarly

b_n = (1/π) ∫_{−π}^{π} f(x) sin(nx) dx = a′_n/n

(the boundary term vanishes in the first case since sin(±nπ) = 0, and in the second case because f(−π) = f(π), which we assume when discussing convergence of the Fourier series). Then,

∑_{n=1}^∞ (|a_n| + |b_n|) = ∑_{n=1}^∞ (1/n)(|b′_n| + |a′_n|).   (6.12)

But since for every n, (|a′_n| − 1/n)^2 ≥ 0 and (|b′_n| − 1/n)^2 ≥ 0, we have

∑_{n=1}^∞ (1/n)(|b′_n| + |a′_n|) ≤ (1/2) ∑_{n=1}^∞ ((a′_n)^2 + (b′_n)^2 + 2/n^2) < ∞,

which follows from Bessel’s inequality. By eq. (6.12) we have that

|a_0/2 + ∑_{n=1}^∞ (a_n cos(nx) + b_n sin(nx))| ≤ |a_0/2| + ∑_{n=1}^∞ (|a_n| + |b_n|) < ∞,

and consequently the Fourier series of the function f converges uniformly (by the Weierstrass M-test); however, we still do not know whether it actually converges to f. Suppose g is the limit of the Fourier series; then by Theorem 2.49 we have that g ∈ L^2[−π, π]. But f is continuous, so f also belongs to L^2[−π, π], and since f and g have the same Fourier coefficients, f ≡ g. ◻

A condition similar to Dini’s condition can also be stated for uniform convergence of the Fourier series of a function f. The theorem will make use of a compactness result that will be discussed later, in Chapter 18.

Definition 6.7 (ε-net). Let X ⊆ M be a subset of a metric space (M, d). We say that a set N_ε is an ε-net for the set X if for all x ∈ X there exists x_ε ∈ N_ε such that d(x, x_ε) < ε.


Figure 6.1: An ε-net.

It can be shown that if X is a compact set and ε > 0, then a finite ε-net N_ε always exists. A graphical representation of the situation can be seen in Fig. 6.1, where N_ε is represented as a set of black dots. We will also need the following lemma, which can be seen as a stronger version of Lemma 6.1.

Lemma 6.8. Suppose that X ⊆ L^1[−π, π] is a compact set. Then for any ε > 0 there is a positive number r such that

|∫_{−π}^{π} f(t) sin(λt) dt| < ε

for every λ ≥ r and for every function f ∈ X.

Proof. We use the previous result to choose a finite ε/2-net {g_1, . . . , g_k} ⊆ X with respect to the metric induced by the norm of L^1[−π, π]. By Lemma 6.1 it is possible to choose r > 0 such that

|∫_{−π}^{π} g_i(t) sin(λt) dt| < ε/2,  i = 1, 2, . . . , k,  for λ ≥ r.

Now if f is any function in X, there exists i such that ‖f − g_i‖_1 < ε/2, and consequently,

|∫_{−π}^{π} f(t) sin(λt) dt| ≤ |∫_{−π}^{π} g_i(t) sin(λt) dt| + |∫_{−π}^{π} (f − g_i)(t) sin(λt) dt| < ε.   (6.13)

This proves the lemma. ◻


Theorem 6.9. Suppose f ∈ L^1[−π, π] is bounded on a set E ⊆ [−π, π] and that for every ε > 0 there exists δ > 0 such that, for every x ∈ E,

∫_{−δ}^{δ} |f(x + z) − f(x)| / |z| dz < ε.

Then the Fourier series of the function f converges uniformly on E to f.

The proof of this theorem is similar to the proof of Theorem 6.2, making the modifications necessary for the limit to be uniform, based on the fact that the set of functions

g_x(t) = (f(x + t) − f(x))/t,  x ∈ E,

is compact in L^1[−π, π]. We leave the details as an exercise to the reader.

Remark 6.10. Up to this point we have treated functions defined on the segment [−π, π]. All the results mentioned here can be extended, with slight modifications, to functions defined on a segment of arbitrary length 2l. ⊘

6.3 Problems

6.1. Calculate the Fourier series of the following function:

f : [−π, π] → ℝ,  t ↦ 0 if −π ≤ t < 0,  and  t ↦ 1 if 0 ≤ t ≤ π.

6.2. Consider the Fourier series of the previous problem. Does it converge to f pointwise? At which points?

6.3. Calculate the Fourier series of the following function:

g : [−π, π] → ℝ,  t ↦ −1 if −π ≤ t < 0,  and  t ↦ 1 if 0 ≤ t ≤ π.

6.4. Consider the Fourier series of the previous problem. Does it converge to g pointwise? At which points?

6.5. Suppose f, g ∈ L^2[−π, π] and α, β ∈ ℝ. Find an expression for the Fourier coefficients of the function αf + βg in terms of the Fourier coefficients of f and g.

6.6. Let r > 0 and suppose f ∈ L^2[−r, r]. Find expressions for the Fourier coefficients of f analogous to those of the case r = π.

6.7. Given a function f ∈ L^2[−π, π] and a number α > 0, find expressions for the Fourier coefficients of the function g(t) = f(αt).

6.8. Find the Fourier series of the function f : [−π, π] → ℝ, t ↦ t.

6.9. Find the Fourier series of the function f : [−π, π] → ℝ, t ↦ t^2.

6.10. Provide the details of the proof of Theorem 6.9.

7 Fourier Transform

Learning Targets
✓ Learn the notion of Fourier transform.
✓ Understand the difference between the L^1 and L^2 theory.
✓ Understand the definition and usefulness of the convolution.

The content of this chapter relies heavily on measure theory and integration. If the reader is not well acquainted with measure and integration theory, it is advisable to read this chapter with a measure and integration theory book nearby. For simplicity, in this chapter we work with the Lebesgue measure on ℝ^n.

7.1 Convolution

We now introduce a very important operator, the so-called convolution operator.

Definition 7.1 (Convolution Operator). Let f : ℝ^n → ℝ and g : ℝ^n → ℝ. The convolution operator is defined formally as

(f ∗ g)(x) = ∫_{ℝ^n} f(x − y) g(y) dy.   (7.1)

The following properties are almost immediate from the definition of convolution.

Theorem 7.2. Let f, g and h be functions such that the integrals defining f ∗ g, g ∗ h and (f ∗ g) ∗ h are well defined. Then the convolution operator is commutative, associative and distributive, as given below:
(a) f ∗ g = g ∗ f;
(b) (f ∗ g) ∗ h = f ∗ (g ∗ h);
(c) f ∗ (αg + βh) = α(f ∗ g) + β(f ∗ h).
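For finite sequences, np.convolve computes the discrete analogue of (7.1), so the three properties can be checked directly (a sanity check of ours, not a proof):

```python
import numpy as np

# Discrete check of Theorem 7.2: commutativity, associativity, distributivity
# of convolution, with np.convolve standing in for the integral.
rng = np.random.default_rng(2)
f = rng.standard_normal(5)
g = rng.standard_normal(6)
h = rng.standard_normal(6)          # same length as g so alpha*g + beta*h makes sense
alpha, beta = 2.0, -3.0

assert np.allclose(np.convolve(f, g), np.convolve(g, f))                  # (a)
assert np.allclose(np.convolve(np.convolve(f, g), h),
                   np.convolve(f, np.convolve(g, h)))                     # (b)
assert np.allclose(np.convolve(f, alpha * g + beta * h),
                   alpha * np.convolve(f, g) + beta * np.convolve(f, h))  # (c)
```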

Definition 7.3 (Lebesgue Spaces L^p(ℝ^n)). Let 1 ≤ p < ∞. The Lebesgue space L^p(ℝ^n) is defined as

L^p(ℝ^n) = {f measurable : ‖f‖_p < ∞},

where ‖f‖_p is the Lebesgue norm given by

‖f‖_p = (∫_{ℝ^n} |f(x)|^p dx)^{1/p}.

The space L^∞(ℝ^n) is defined as

L^∞(ℝ^n) = {f measurable : ‖f‖_∞ < ∞},

where ‖f‖_∞ = ess sup_{x∈ℝ^n} |f(x)|.

The study of Lebesgue spaces is outside the scope of this book, but for our purposes we need the following results regarding Lebesgue spaces. For the proofs see Ref. [38].

Theorem 7.4. Let 1 ≤ p < ∞, f ∈ L^p(ℝ^n) and g ∈ L^q(ℝ^n), where 1/p + 1/q = 1. Then
(a) ‖τ_h f − f‖_p → 0 as |h| → 0, where τ_h f(x) = f(x − h);
(b) ∫_{ℝ^n} |fg| dx ≤ ‖f‖_p ‖g‖_q, which is called the Hölder inequality;
(c) ‖∫_{ℝ^n} f(⋅, y) dy‖_p ≤ ∫_{ℝ^n} ‖f(⋅, y)‖_p dy, which is called the Minkowski integral inequality;
(d) the space C_c(ℝ^n) is dense in L^p(ℝ^n).

We now show that if the functions in the convolution belong to conjugate Lebesgue spaces, then the convolution is well defined.

Theorem 7.5. Let 1 ≤ p, q ≤ ∞ be conjugate exponents (i.e., 1/p + 1/q = 1). If f ∈ L^p(ℝ^n) and g ∈ L^q(ℝ^n), then f ∗ g(x) exists for all x ∈ ℝ^n, and f ∗ g is bounded and uniformly continuous. If 1 < p, q < ∞, then we also have that f ∗ g ∈ C_0(ℝ^n).

Proof. An application of the Hölder inequality gives

|f ∗ g(x)| = |∫_{ℝ^n} f(x − y)g(y) dy| ≤ ‖f‖_p ‖g‖_q,

which implies the existence and boundedness result. Let us prove the continuity of the convolution when 1 ≤ p < ∞. We have

|f ∗ g(x − h) − f ∗ g(x)| = |∫_{ℝ^n} (f(x − h − y) − f(x − y)) g(y) dy| ≤ ‖τ_h f − f‖_p ‖g‖_q.   (7.2)


The continuity of the convolution now follows from the continuity of the translation operator, ‖τ_h f − f‖_p → 0 as |h| → 0 (by Theorem 7.4(a)). The estimate (7.2) is independent of x, which entails the uniform continuity.

We now prove that under the restrictions 1 < p, q < ∞ the function f ∗ g belongs to C_0(ℝ^n). Since C_c(ℝ^n) is dense in L^p(ℝ^n) (by Theorem 7.4(d)), given ε > 0 let f_ε, g_ε ∈ C_c(ℝ^n) be such that

‖f − f_ε‖_p ≤ ε  and  ‖g − g_ε‖_q ≤ ε.

For sufficiently big m, we have that f_ε ∗ g_ε(x) = 0 whenever |x| > m. Writing f ∗ g − f_ε ∗ g_ε = (f − f_ε) ∗ g + f_ε ∗ (g − g_ε), we obtain, for |x| > m,

|f ∗ g(x)| ≤ ‖f − f_ε‖_p ‖g‖_q + ‖f_ε‖_p ‖g − g_ε‖_q,

which entails that lim_{|x|→∞} f ∗ g(x) = 0. ◻

We now prove Young’s inequality, which connects the convolution with the Lebesgue norms of the functions.

Theorem 7.6 (Young’s Inequality for Convolution). Let p, q and r be real numbers satisfying the conditions p > 1, q > 1 and 1/p + 1/q − 1 = 1/r > 0. Let f ∈ L^p(ℝ^n) and g ∈ L^q(ℝ^n). Then f ∗ g ∈ L^r(ℝ^n) and

‖f ∗ g‖_r ≤ ‖f‖_p ‖g‖_q.

Proof. We have

|f(x − y)g(y)| = |f(x − y)||g(y)| = (|f(x − y)|^{p/r} |g(y)|^{q/r}) (|f(x − y)|^{p(1/p − 1/r)}) (|g(y)|^{q(1/q − 1/r)}).

Applying the Hölder inequality twice (equivalently, the generalized Hölder inequality with the three exponents r, (1/p − 1/r)^{−1} and (1/q − 1/r)^{−1}, whose reciprocals add up to 1), we get

∫_{ℝ^n} |f(x − y)g(y)| dy ≤ (∫_{ℝ^n} |f(x − y)|^p |g(y)|^q dy)^{1/r} (∫_{ℝ^n} |f(x − y)|^p dy)^{1/p − 1/r} (∫_{ℝ^n} |g(y)|^q dy)^{1/q − 1/r},

or, using the definition of the Lebesgue norm,

∫_{ℝ^n} |f(x − y)g(y)| dy ≤ (∫_{ℝ^n} |f(x − y)|^p |g(y)|^q dy)^{1/r} ‖f‖_p^{1 − p/r} ‖g‖_q^{1 − q/r}.


Figure 7.1: Example of a Dirac sequence.

Raising this to the power r, integrating in x and applying Fubini’s theorem to the first factor, we obtain after simplifications that

‖f ∗ g‖_r ≤ ‖f‖_p ‖g‖_q,

ending the proof. ◻
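Young's inequality also holds on ℤ with the counting measure, which makes a quick discrete check possible (the exponents, lengths and seed below are arbitrary choices of ours):

```python
import numpy as np

# Discrete analogue of Young's inequality: ||f*g||_r <= ||f||_p ||g||_q
# with 1/p + 1/q - 1 = 1/r, for sequences under the counting measure.
p, q = 2.0, 1.5
r = 1.0 / (1.0 / p + 1.0 / q - 1.0)   # here r = 6

rng = np.random.default_rng(3)
f = rng.standard_normal(40)
g = rng.standard_normal(60)
conv = np.convolve(f, g)

def norm(a, s):
    return float(np.sum(np.abs(a) ** s) ** (1.0 / s))

assert norm(conv, r) <= norm(f, p) * norm(g, q) + 1e-9
```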

The importance of the convolution operator stems from the fact that it is a very useful tool in the construction of so-called approximate identity operators.

Definition 7.7. A Dirac sequence is a sequence of real-valued functions (f_j)_{j∈ℕ} which satisfy the following properties:
(a) f_j ≥ 0 for all j;
(b) the equality ∫_{ℝ^n} f_j(x) dx = 1 is valid for all j ∈ ℕ;
(c) for arbitrary ε, δ > 0 there exists an order J such that

∫_{|x| ≥ δ} f_j(x) dx < ε  for all j ≥ J.

Dirac sequences (Figure 7.1) are important since they generate an approximate identity operator, as given below.

Theorem 7.8. Let g be a real-valued, measurable, bounded function on ℝ^n and K a compact set on which g is continuous. If (f_j) is a Dirac sequence, then f_j ∗ g converges to g uniformly on the set K.


Proof. Let us take $x \in K$. We have
$$|(f_j * g)(x) - g(x)| = \left|\int_{\mathbb{R}^n} f_j(y)\,g(x-y)\,dy - \int_{\mathbb{R}^n} g(x)\,f_j(y)\,dy\right| \le \int_{\mathbb{R}^n} f_j(y)\,|g(x-y) - g(x)|\,dy$$
$$\le \left(\int_{|y|<\delta} + \int_{|y|\ge\delta}\right) f_j(y)\,|g(x-y) - g(x)|\,dy =: I_{1,\delta} + I_{2,\delta}.$$
Given $\varepsilon > 0$, let us pick $\delta > 0$ such that $|y| < \delta$ implies that $|g(x-y) - g(x)| < \varepsilon$ for all $x \in K$. From this choice it follows that $I_{1,\delta} < \varepsilon$. To estimate the integral $I_{2,\delta}$, from property (c) we have that $I_{2,\delta} < 2\|g\|_\infty\,\varepsilon$ for $j$ sufficiently large. ◻

One of the problems in constructing Dirac sequences is that we need to guarantee that the sequence satisfies properties (a), (b) and (c). We can use the notion of Friedrich mollifier to construct a Dirac sequence using only one function.

Definition 7.9 (Friedrich Mollifier). Let $\varphi: \mathbb{R}^n \to \mathbb{R}_+$ be a function satisfying the following conditions: (a) $\varphi \in C_0^\infty(\mathbb{R}^n)$; (b) $\varphi(x) = 0$ when $|x| > 1$; and (c) $\int_{\mathbb{R}^n} \varphi(x)\,dx = 1$. We define the Friedrich mollifier $\varphi_\varepsilon$ as
$$\varphi_\varepsilon(x) := \varepsilon^{-n}\,\varphi\!\left(\frac{x}{\varepsilon}\right)$$
for all $\varepsilon > 0$ and $x \in \mathbb{R}^n$.

Convolving a Friedrich mollifier with a function, we obtain an approximate identity operator with "smooth" properties.

Theorem 7.10. Let $1 \le p < \infty$, $f \in L^p(\mathbb{R}^n)$ and let $\varphi_\varepsilon$ be a Friedrich mollifier. Then:
(a) $\varphi_\varepsilon * f \in C^\infty(\mathbb{R}^n)$;
(b) $\|\varphi_\varepsilon * f\|_p \le \|f\|_p$;
(c) $\lim_{\varepsilon\to 0} \|\varphi_\varepsilon * f - f\|_p = 0$.

Proof. Item (a) follows from Leibniz's rule for differentiation under the integral sign. Item (b) is just an application of the Minkowski integral inequality. To prove (c) we write
$$\varphi_\varepsilon * f(x) - f(x) = \int_{\mathbb{R}^n} \varphi(y)\,\big[f(x - \varepsilon y) - f(x)\big]\,dy.$$
For given $\eta > 0$ there exists a $\delta > 0$ such that
$$\|f(\cdot + h) - f(\cdot)\|_p \le \frac{\eta}{2\|\varphi\|_1} \quad \text{when } |h| < \delta.$$
With $\delta > 0$ fixed, we can choose $\varepsilon > 0$ sufficiently small to have
$$\int_{|y|\ge\delta/\varepsilon} |\varphi(y)|\,dy \le \frac{\eta}{4\|f\|_p}.$$
Using again the Minkowski integral inequality we have
$$\|\varphi_\varepsilon * f - f\|_p \le \left(\int_{|y|<\delta/\varepsilon} + \int_{|y|\ge\delta/\varepsilon}\right) |\varphi(y)|\,\|f(\cdot - \varepsilon y) - f(\cdot)\|_p\,dy,$$
and now, using the previous estimates, we arrive at (c). ◻
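Theorem 7.10(c) can be observed numerically. The sketch below (assuming NumPy is available; the grid, the bump profile and the discontinuous test function are arbitrary choices, not taken from the text) convolves the indicator of $[-1,1]$ with the standard bump mollifier and checks that the discrete $L^1$ error shrinks as $\varepsilon \to 0$:

```python
import numpy as np

# Grid and a discontinuous test function: f = indicator of [-1, 1].
x = np.linspace(-4.0, 4.0, 8001)
dx = x[1] - x[0]
f = np.where(np.abs(x) <= 1.0, 1.0, 0.0)

def mollifier(y, eps):
    """phi_eps(y) = eps^{-1} phi(y/eps): smooth bump, support [-eps, eps], integral 1."""
    u = y / eps
    inside = np.abs(u) < 1.0
    phi = np.where(inside, np.exp(-1.0 / (1.0 - np.minimum(u * u, 1.0 - 1e-12))), 0.0)
    return phi / (phi.sum() * dx)  # normalize so property (b) holds on the grid

errors = []
for eps in (0.5, 0.25, 0.125):
    phi_eps = mollifier(x, eps)
    smooth = np.convolve(f, phi_eps, mode="same") * dx  # (phi_eps * f)(x)
    errors.append(np.sum(np.abs(smooth - f)) * dx)      # discrete L^1 distance
assert errors[0] > errors[1] > errors[2]  # the L^1 error shrinks with eps
```

The smoothing is visible near the jump points $x = \pm 1$: the convolution differs from $f$ only on bands of width about $2\varepsilon$, which is exactly why the error decays linearly in $\varepsilon$.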

7.2 L1 Theory

We start with the definition of the key concept of this chapter.

Definition 7.11. Let $f \in L^1(\mathbb{R}^n)$. The Fourier transform of the function $f$ at the point $\xi$, denoted by $\mathcal{F}(f)(\xi)$ or $\hat f(\xi)$, is defined by
$$\hat f(\xi) = \int_{\mathbb{R}^n} f(x)\, e^{-2\pi i x\cdot\xi}\,dx,$$
where $x\cdot\xi = \sum_{j=1}^n x_j\xi_j$.

Be aware that the definition of the Fourier transform can be given in several different ways. We immediately see that the Fourier transform is well defined, since $f(x)e^{-2\pi i x\cdot\xi} \in L^1(\mathbb{R}^n)$ whenever $f \in L^1(\mathbb{R}^n)$. Let us give some examples of Fourier transforms.


Example 7.12. Let $f(x) = \chi_{[-1,1]}(x)$. Then
$$\hat f(\xi) = \int_{-1}^{1} e^{-2\pi i \xi x}\,dx = \frac{\sin(2\pi\xi)}{\pi\xi}. \qquad (7.3)$$
⊘
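Equation (7.3) can be sanity-checked numerically. A minimal sketch, assuming NumPy is available (the sample frequencies $\xi$ and the grid size are arbitrary choices):

```python
import numpy as np

# Approximate f_hat(xi) = \int_{-1}^{1} e^{-2 pi i xi x} dx with the trapezoid rule
x = np.linspace(-1.0, 1.0, 200001)
dx = x[1] - x[0]
for xi in (0.3, 1.7, -2.4):
    vals = np.exp(-2j * np.pi * xi * x)
    numeric = (vals.sum() - 0.5 * (vals[0] + vals[-1])) * dx
    closed = np.sin(2 * np.pi * xi) / (np.pi * xi)  # eq. (7.3)
    assert abs(numeric - closed) < 1e-7
```

Note that the transform of this compactly supported function decays only like $1/|\xi|$; this slow decay is revisited in Example 7.25.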

Example 7.13. Let $f_\lambda(x) = e^{-\lambda\pi\|x\|^2}$. To obtain the Fourier transform of $f_\lambda$ we will first show the one-dimensional case and then we get the multidimensional one.

One-dimensional case. We have
$$\hat f_\lambda(\xi) = \int_{-\infty}^{\infty} e^{-\pi\lambda x^2} e^{-2\pi i \xi x}\,dx = 2\int_0^\infty e^{-\pi\lambda x^2}\cos(2\pi\xi x)\,dx.$$
The last integral is of the form $I(\xi) = \int_0^\infty e^{-x^2}\cos(\xi x)\,dx$ (after a change of variables), which is a known integral. To derive the value of this integral we rely on the Leibniz rule for differentiation under the integral sign. We have
$$I'(\xi) = -\int_0^\infty x e^{-x^2}\sin(x\xi)\,dx = -\frac{\xi}{2}\int_0^\infty e^{-x^2}\cos(\xi x)\,dx,$$
where the first equality follows from Leibniz's rule and the last from integration by parts. From the above we get that $I(\xi)$ satisfies the differential equation
$$I'(\xi) + \frac{\xi}{2}\,I(\xi) = 0.$$
From the differential equation we have that $\big(e^{\xi^2/4} I(\xi)\big)' = 0$, which implies that $I(\xi) = C e^{-\xi^2/4}$. Since $I(0) = C$ we have, by the Euler–Poisson integral, that $C = \sqrt{\pi}/2$. Taking all the above into account we arrive at
$$\int_0^\infty e^{-x^2}\cos(\xi x)\,dx = \frac{\sqrt{\pi}}{2}\, e^{-\frac{\xi^2}{4}}.$$
Going back to the calculation of $\hat f_\lambda$ we arrive, after some calculations, at
$$\hat f_\lambda(\xi) = \frac{1}{\sqrt{\lambda}}\, e^{-\frac{\pi|\xi|^2}{\lambda}}. \qquad (7.4)$$
In particular $\hat f_1(\xi) = f_1(\xi)$. Therefore we found a fixed point of the one-dimensional Fourier transform.
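The closed form (7.4) is easy to verify numerically. A sketch assuming NumPy (the value $\lambda = 2.5$ and the test frequencies are arbitrary choices):

```python
import numpy as np

def ft(f, x, xi):
    """Trapezoid approximation of \\int f(x) e^{-2 pi i x xi} dx on a uniform grid."""
    dx = x[1] - x[0]
    vals = f * np.exp(-2j * np.pi * np.outer(xi, x))
    return (vals.sum(axis=1) - 0.5 * (vals[:, 0] + vals[:, -1])) * dx

lam = 2.5
x = np.linspace(-10.0, 10.0, 40001)
xi = np.array([0.0, 0.4, 1.1])
numeric = ft(np.exp(-lam * np.pi * x**2), x, xi)
closed = np.exp(-np.pi * xi**2 / lam) / np.sqrt(lam)  # eq. (7.4)
assert np.allclose(numeric, closed, atol=1e-8)
```

Because the Gaussian decays with all its derivatives, the trapezoid rule here is accurate far beyond what the generic error bound suggests.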


Multidimensional case. Since
$$\hat f_\lambda(\xi) = \int_{\mathbb{R}^n} e^{-\lambda\pi\|x\|^2} e^{-2\pi i \xi\cdot x}\,dx = \int_{\mathbb{R}} dx_1 \int_{\mathbb{R}} dx_2 \cdots \int_{\mathbb{R}} e^{-\lambda\pi(x_1^2+x_2^2+\cdots+x_n^2)}\, e^{-2\pi i(\xi_1 x_1+\xi_2 x_2+\cdots+\xi_n x_n)}\,dx_n,$$
we can use the one-dimensional case to obtain that
$$\mathcal{F}\left(e^{-\lambda\pi\|\cdot\|^2}\right)(\xi) = \frac{1}{\lambda^{n/2}}\, e^{-\frac{\pi\|\xi\|^2}{\lambda}}. \qquad (7.5)$$
Once again, taking $\lambda = 1$ we arrive at a fixed point of the $n$-dimensional Fourier transform. ⊘

Let us give another example.

Example 7.14. Let $f(x) = e^{-\|x\|}$. Although this function looks similar to the one from the previous example, the calculation of the Fourier transform is more difficult, since it is not possible to use the trick of separation of variables as done in the previous example. The proof will rely on an integral representation of the exponential function and the Fubini theorem. From the fact that
$$\frac{\sqrt{\pi}}{2}\, e^{-\xi} = e^{-\xi}\int_0^\infty e^{-\left(x-\frac{\xi}{2x}\right)^2}\,dx = \int_0^\infty e^{-x^2-\frac{\xi^2}{4x^2}}\,dx, \qquad (7.6)$$
we obtain that
$$e^{-\xi} = \frac{2}{\sqrt{\pi}}\int_0^\infty e^{-x^2-\frac{\xi^2}{4x^2}}\,dx.$$
To show this somewhat unusual relation, we notice that the first equality in eq. (7.6) can be shown by the change of variables $x - \frac{\xi}{2x} \mapsto s$ and then using the Euler–Poisson integral; the second equality is straightforward. Now, to calculate the Fourier transform, we take the integral representation of the function $f$, as given below:
$$\hat f(\xi) = \frac{2}{\sqrt{\pi}}\int_{\mathbb{R}^n} e^{-2\pi i \xi\cdot x}\,dx \int_0^\infty e^{-y^2-\frac{\|x\|^2}{4y^2}}\,dy = \frac{2}{\sqrt{\pi}}\int_0^\infty e^{-y^2}\,dy \int_{\mathbb{R}^n} e^{-2\pi i \xi\cdot x}\, e^{-\frac{\|x\|^2}{4y^2}}\,dx$$
$$= \frac{2}{\sqrt{\pi}}\int_0^\infty e^{-y^2}\,(4y^2\pi)^{\frac{n}{2}}\, e^{-4\pi^2 y^2\|\xi\|^2}\,dy = 2^{n+1}\pi^{\frac{n-1}{2}}\int_0^\infty y^n\, e^{-y^2(1+4\pi^2\|\xi\|^2)}\,dy = 2^n\pi^{\frac{n-1}{2}}\,\frac{\Gamma\!\left(\frac{n+1}{2}\right)}{\left(1+4\pi^2\|\xi\|^2\right)^{\frac{n+1}{2}}},$$
where we used the Fubini theorem to interchange the order of integration in the second equality, eq. (7.5) in the third, and, in the last, the change of variables $y^2(1+4\pi^2\|\xi\|^2) \mapsto s$ together with the Gamma function. Therefore,
$$\mathcal{F}\left(e^{-\|\cdot\|}\right)(\xi) = 2^n\pi^{\frac{n-1}{2}}\,\frac{\Gamma\!\left(\frac{n+1}{2}\right)}{\left(1+4\pi^2\|\xi\|^2\right)^{\frac{n+1}{2}}}. \qquad ⊘$$

From the above examples we see that calculating the Fourier transform of a function can be a demanding task. We now state a phenomenon that describes the behavior of the Fourier transform at high frequencies $|\xi| > L$ for $L$ sufficiently large, as given below.

Lemma 7.15 (Riemann–Lebesgue Lemma). Let $f \in L^1(\mathbb{R}^n)$. Then
$$\lim_{|\xi|\to\infty} \hat f(\xi) = 0.$$

Before going into the proof of this result, let us analyze a similar phenomenon which justifies, in some sense, the rationale of the Riemann–Lebesgue lemma. Let us define
$$S(\lambda) := \int_0^1 f(x)\sin(\lambda x)\,dx,$$
where $f$ is a continuous function (which we fix). It can be proved that $S(\lambda) \to 0$ when $\lambda \to \infty$. The reason behind this fact is that when $\lambda > 0$ the function $x \mapsto \sin(\lambda x)$ is alternately negative and positive on intervals of length $\pi/\lambda$. For very large values of $\lambda$, the contributions of adjacent intervals cancel, due to the continuity of the function $f$: on very small intervals it behaves like a step function.

The proof of the Riemann–Lebesgue lemma that we will give is short and relies on two crucial facts: (a) the shift operator is continuous in the $L^1(\mathbb{R}^n)$ norm, and (b) the periodicity of the exponential function. There is a more direct, albeit more tedious, proof in which one first obtains the result for a step function, based on Example 7.3, and then approximates an $L^1$ function by step functions.
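The cancellation phenomenon behind $S(\lambda) \to 0$ can be watched numerically. A sketch assuming NumPy, with the hypothetical choice $f(x) = x(1-x)$ (any fixed continuous function would do):

```python
import numpy as np

# S(lam) = \int_0^1 f(x) sin(lam x) dx decays as lam grows, here for f(x) = x (1 - x)
x = np.linspace(0.0, 1.0, 100001)
dx = x[1] - x[0]
f = x * (1.0 - x)

def S(lam):
    vals = f * np.sin(lam * x)
    return (vals.sum() - 0.5 * (vals[0] + vals[-1])) * dx  # trapezoid rule

s = [abs(S(lam)) for lam in (10.0, 100.0, 1000.0)]
assert s[0] > s[1] > s[2]  # cancellation between adjacent half-periods
```

For this particular smooth $f$ the decay is in fact of order $1/\lambda^2$ (integration by parts twice), much faster than the lemma guarantees for a general $L^1$ function.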


Proof of Lemma 7.15. We have, since $e^{-\pi i} = -1$, that
$$\hat f(\xi) = \int_{\mathbb{R}^n} e^{-2\pi i x\cdot\xi} f(x)\,dx = -\int_{\mathbb{R}^n} e^{-2\pi i \xi\cdot\left(x+\frac{1}{2n\xi}\right)} f(x)\,dx = -\int_{\mathbb{R}^n} e^{-2\pi i \xi\cdot\lambda}\, f\!\left(\lambda - \frac{1}{2n\xi}\right) d\lambda,$$
where $\frac{1}{2n\xi} := \left(\frac{1}{2n\xi_1}, \frac{1}{2n\xi_2}, \cdots, \frac{1}{2n\xi_n}\right)$. From the above we get
$$\hat f(\xi) = \frac{1}{2}\left(\int_{\mathbb{R}^n} \left[f(x) - f\!\left(x - \frac{1}{2n\xi}\right)\right] e^{-2\pi i \xi\cdot x}\,dx\right),$$
and if $f$ is a continuous function, by the Lebesgue dominated convergence theorem, we obtain that $\hat f(\xi) \to 0$ when $|\xi| \to \infty$. For the general case, we use the fact that
$$|\hat f(\xi)| \le |\mathcal{F}(f-g)(\xi)| + |\mathcal{F}(g)(\xi)| \le \|f - g\|_1 + |\mathcal{F}(g)(\xi)|,$$
and by Luzin's Theorem, for each $\varepsilon > 0$, we can choose a continuous function $g$ such that $\|f - g\|_1 \le \varepsilon$; the general result now follows from the case for continuous functions. ◻

Definition 7.16 (Translation Operator). By $\tau_h$ we denote the translation operator given by $(\tau_h f)(x) = f(x-h)$.

Be aware that the translation operator is sometimes given in a different way, i.e., $\tau_h f(x) = f(x+h)$. We will not follow that notation. We now continue with some immediate properties of the Fourier transform.

Theorem 7.17. Let $f, g \in L^1(\mathbb{R}^n)$. Then we have:
(a) $\mathcal{F}(\alpha f + \beta g)(\xi) = \alpha\mathcal{F}(f)(\xi) + \beta\mathcal{F}(g)(\xi)$;
(b) $\mathcal{F}(\tau_h f)(\xi) = \hat f(\xi)\, e^{-2\pi i h\cdot\xi}$;
(c) $\mathcal{F}\big(f(x)e^{-2\pi i h\cdot x}\big)(\xi) = \hat f(\xi + h) = (\tau_{-h}\hat f)(\xi)$;
(d) $\mathcal{F}\big(\lambda^{-n} f(\tfrac{x}{\lambda})\big)(\xi) = \hat f(\lambda\xi)$;
(e) $\mathcal{F}(f * g)(\xi) = \hat f(\xi)\,\hat g(\xi)$;
(f) if $\rho$ is an orthogonal transformation, then $\mathcal{F}(f \circ \rho)(\xi) = \hat f(\rho\xi)$.

Proof. (a) It is straightforward.

(b) Taking the definition of the translation operator we have
$$\mathcal{F}(\tau_h f)(\xi) = \int_{\mathbb{R}^n} f(x-h)\, e^{-2\pi i x\cdot\xi}\,dx = \int_{\mathbb{R}^n} f(x)\, e^{-2\pi i (x+h)\cdot\xi}\,dx = e^{-2\pi i h\cdot\xi}\int_{\mathbb{R}^n} f(x)\, e^{-2\pi i x\cdot\xi}\,dx = e^{-2\pi i h\cdot\xi}\,\hat f(\xi).$$
(c) The calculations are similar to the ones in item (b).
(d) Writing $f_\lambda(x) = \lambda^{-n} f(\tfrac{x}{\lambda})$ and substituting $y = x/\lambda$ we have
$$\hat f_\lambda(\xi) = \int_{\mathbb{R}^n} \lambda^{-n} f\!\left(\frac{x}{\lambda}\right) e^{-2\pi i x\cdot\xi}\,dx = \int_{\mathbb{R}^n} f(y)\, e^{-2\pi i (\lambda y)\cdot\xi}\,dy = \int_{\mathbb{R}^n} f(y)\, e^{-2\pi i y\cdot(\lambda\xi)}\,dy = \hat f(\lambda\xi).$$
(e) The result is just a standard application of Fubini's theorem.
(f) Since $\rho$ is an orthogonal transformation we have
$$\mathcal{F}(f \circ \rho)(\xi) = \int_{\mathbb{R}^n} f(\rho x)\, e^{-2\pi i x\cdot\xi}\,dx = \int_{\mathbb{R}^n} f(y)\, e^{-2\pi i (\rho^{-1}y)\cdot\xi}\,dy = \int_{\mathbb{R}^n} f(y)\, e^{-2\pi i y\cdot(\rho\xi)}\,dy = \hat f(\rho\xi).$$
The relation $(\rho^{-1}y)\cdot\xi = y\cdot(\rho\xi)$ is a consequence of the fact that the map $x \mapsto \rho x$ is an isometry. ◻

Corollary 7.18 (Fourier Transform of Radial Functions). The Fourier transform of a radial function is a radial function.


Proof. Let $w, z \in \mathbb{R}^n$ with $|w| = |z|$. Then there exists an orthogonal transformation $\rho$ such that $\rho w = z$. Using the fact that $f$ is radial, i.e., $f = f \circ \rho$, we have
$$\mathcal{F}f(z) = \mathcal{F}f(\rho w) = \mathcal{F}(f \circ \rho)(w) = \mathcal{F}f(w),$$
which ends the proof. ◻

For a formulation of the Fourier transform of radial functions using Bessel functions, see, e.g., Ref. [42].

Theorem 7.19. Let $f \in L^1(\mathbb{R}^n)$. Then:
(a) $\mathcal{F}: L^1(\mathbb{R}^n) \to C_0(\mathbb{R}^n)$ is a bounded linear mapping with the bound $\|\mathcal{F}(f)\|_\infty \le \|f\|_1$;
(b) $\mathcal{F}f$ is uniformly continuous.

Proof. (a) The fact that $\mathcal{F}: L^1(\mathbb{R}^n) \to L^\infty(\mathbb{R}^n)$ is bounded linear with the bound $\|\hat f\|_\infty \le \|f\|_1$ is immediate, since
$$|\hat f(\xi)| \le \int_{\mathbb{R}^n} \left|f(x)\, e^{-2\pi i x\cdot\xi}\right| dx = \|f\|_1.$$
The continuity of $\hat f$ follows from the dominated convergence theorem. The decay of the Fourier transform, namely the fact that $\hat f(\xi) \to 0$ as $|\xi| \to \infty$, is just the content of the Riemann–Lebesgue Lemma 7.15.
(b) To prove the uniform continuity, we note that
$$|\mathcal{F}f(\xi+h) - \mathcal{F}f(\xi)| = \left|\int_{\mathbb{R}^n} e^{-2\pi i x\cdot\xi}\,\big[e^{-2\pi i x\cdot h} - 1\big]\, f(x)\,dx\right| \le \int_{\mathbb{R}^n} \left|e^{-2\pi i x\cdot h} - 1\right| |f(x)|\,dx$$
$$\le \int_{|x|\le r} 2\pi r|h|\,|f(x)|\,dx + 2\int_{|x|>r} |f(x)|\,dx =: I + II,$$
where we used the estimate $|e^{i\theta} - 1| \le |\theta|$. Choosing $r$ sufficiently large, we get that $II < \varepsilon/2$. With the $r$ fixed, we can choose $|h|$ sufficiently small such that $I < \varepsilon/2$. Gathering the estimates we have the uniform continuity. ◻


In the following results we will highlight a very important fact about the Fourier transform: it converts multiplication of a function $f$ by a polynomial into differentiation of $\hat f$. We now introduce the multi-index notation.

Definition 7.20 (Multi-Index Notation). For a multi-index $\alpha = (\alpha_1, \ldots, \alpha_n)$, a vector of nonnegative integers, we define
$$|\alpha| = \alpha_1 + \cdots + \alpha_n; \qquad x^\alpha = x_1^{\alpha_1} \cdots x_n^{\alpha_n}; \qquad D^\alpha f = \frac{\partial^{|\alpha|} f}{\partial x_1^{\alpha_1} \cdots \partial x_n^{\alpha_n}}.$$

Theorem 7.21. Let $f \in L^1(\mathbb{R}^n)$ and $x_k f \in L^1(\mathbb{R}^n)$. Then $\hat f$ has a partial derivative with respect to $\xi_k$ and it satisfies the relation
$$\mathcal{F}\big({-2\pi i x_k} f(x)\big)(\xi) = \frac{\partial \hat f}{\partial \xi_k}(\xi). \qquad (7.7)$$
Moreover, if $x^\alpha f \in L^1(\mathbb{R}^n)$ for all $|\alpha| \le k$, then $\hat f \in C^k(\mathbb{R}^n)$ and
$$D^\alpha \hat f(\xi) = \mathcal{F}\big[(-2\pi i x)^\alpha f\big](\xi). \qquad (7.8)$$

Proof. We will prove only eq. (7.7). Using Theorem 7.17 we have
$$\frac{\hat f(\xi + e_k h) - \hat f(\xi)}{h} = \mathcal{F}\left(f(x)\,\frac{e^{-2\pi i (h e_k)\cdot x} - 1}{h}\right)(\xi). \qquad (7.9)$$

Letting $h \to 0$ and using the continuity of the Fourier transform, from eq. (7.9) we obtain eq. (7.7). ◻

We now study the effect of the Fourier transform on the derivative of a function. Under suitable conditions, the Fourier transform of the derivative of a function $f$ is just the product of the Fourier transform of $f$ with a polynomial. This property is exploited in differential equations, since in some instances it permits converting a differential equation into an algebraic equation.

Theorem 7.22. Let $f \in L^1(\mathbb{R}^n)$ and $\frac{\partial f}{\partial x_k} \in L^1(\mathbb{R}^n)$. Then
$$\mathcal{F}\left(\frac{\partial f}{\partial x_k}\right)(\xi) = 2\pi i \xi_k\, \hat f(\xi). \qquad (7.10)$$


Moreover, if $f \in C^k(\mathbb{R}^n)$ and $D^\alpha f \in L^1(\mathbb{R}^n)$ for $|\alpha| \le k$, then
$$\mathcal{F}(D^\alpha f)(\xi) = (2\pi i \xi)^\alpha\, \hat f(\xi). \qquad (7.11)$$

Proof. For the proof of eq. (7.10) it suffices to prove that the difference
$$\mathcal{F}\left(\frac{\partial f}{\partial x_k}\right)(\xi) - \hat f(\xi)\,\frac{e^{2\pi i (h e_k)\cdot\xi} - 1}{h}$$
goes to zero as $h \to 0$, since
$$\lim_{h\to 0} \hat f(\xi)\,\frac{e^{2\pi i (h e_k)\cdot\xi} - 1}{h} = 2\pi i \xi_k\, \hat f(\xi).$$
From Theorem 7.17(b) we have
$$\mathcal{F}\left(\frac{\partial f}{\partial x_k}\right)(\xi) - \hat f(\xi)\,\frac{e^{2\pi i (h e_k)\cdot\xi} - 1}{h} = \mathcal{F}\left(\frac{\partial f}{\partial x_k} - \frac{f(x + h e_k) - f(x)}{h}\right)(\xi) \to 0$$
when $h \to 0$, due to the continuity of the Fourier transform. ◻
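Relation (7.10) can be checked numerically for a concrete function. A sketch assuming NumPy, using the Gaussian $f(x) = e^{-\pi x^2}$, whose derivative we write down by hand (the grid and test frequencies are arbitrary choices):

```python
import numpy as np

def ft(vals, x, xi):
    """Trapezoid approximation of \\int vals(x) e^{-2 pi i x xi} dx."""
    dx = x[1] - x[0]
    e = np.exp(-2j * np.pi * np.outer(xi, x)) * vals
    return (e.sum(axis=1) - 0.5 * (e[:, 0] + e[:, -1])) * dx

x = np.linspace(-10.0, 10.0, 40001)
xi = np.array([0.25, 0.5, 1.0])
f = np.exp(-np.pi * x**2)
fprime = -2.0 * np.pi * x * f            # f'(x), computed by hand
lhs = ft(fprime, x, xi)                  # F(f')(xi)
rhs = 2j * np.pi * xi * ft(f, x, xi)     # 2 pi i xi_k f_hat(xi), eq. (7.10)
assert np.allclose(lhs, rhs, atol=1e-8)
```

This is the identity that turns constant-coefficient differential equations into algebraic ones, as exploited in Example 7.41 below.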

We now give a formula of paramount importance in the theory of the Fourier transform, the so-called multiplication formula, sometimes called the shifting-hat formula.

Theorem 7.23. Let $f, g \in L^1(\mathbb{R}^n)$. Then
$$\int_{\mathbb{R}^n} \hat f(x)\,g(x)\,dx = \int_{\mathbb{R}^n} f(x)\,\hat g(x)\,dx. \qquad (7.12)$$

Proof. The proof is an immediate consequence of Fubini's theorem, namely
$$\int_{\mathbb{R}^n} \hat f(x)\,g(x)\,dx = \int_{\mathbb{R}^n} g(x)\,dx \int_{\mathbb{R}^n} e^{-2\pi i y\cdot x} f(y)\,dy = \int_{\mathbb{R}^n} f(y)\,dy \int_{\mathbb{R}^n} e^{-2\pi i y\cdot x} g(x)\,dx = \int_{\mathbb{R}^n} f(x)\,\hat g(x)\,dx.$$
It should be noted that under the required conditions we can indeed apply the Fubini theorem, which ends the proof. ◻

We now introduce the inverse Fourier transform.


Definition 7.24. Let $f \in L^1(\mathbb{R}^n)$. The inverse Fourier transform of the function $f$ at the point $x$, denoted by $\mathcal{F}^{-1}(f)(x)$ or $\check f(x)$, is defined by
$$\check f(x) = \int_{\mathbb{R}^n} f(\xi)\, e^{2\pi i x\cdot\xi}\,d\xi,$$
where $x\cdot\xi = \sum_{j=1}^n x_j\xi_j$.

We want to see whether the inversion formula indeed holds, or at least on what space it holds. We need to be careful in this respect: the Fourier transform of an $L^1(\mathbb{R}^n)$ function lies in $C_0(\mathbb{R}^n)$, but this does not guarantee that the Fourier transform is integrable. The problem is even more delicate, since functions with compact support can be transformed into functions supported on an unbounded set, as given in Example 7.12.

Example 7.25. The function $f(x) = \frac{\sin x}{x} \notin L^1(\mathbb{R})$. To see the statement we will show that $I(x) \to \infty$ when $x \to \infty$, where
$$I(x) = \int_0^x \frac{|\sin t|}{t}\,dt.$$
We have
$$I(N\pi) = \int_0^{N\pi} \frac{|\sin x|}{x}\,dx = \sum_{j=1}^N \int_0^\pi \frac{\sin x}{x + \pi(j-1)}\,dx \ge \frac{2}{\pi}\sum_{j=1}^N \frac{1}{j},$$
from which the result follows, due to the divergence of the harmonic series. ⊘
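The divergence of $I$ is visible numerically; the sketch below (assuming NumPy; the grid resolution is an arbitrary choice) also checks the logarithmic growth rate suggested by the harmonic-series bound:

```python
import numpy as np

def I(N, pts_per_period=2000):
    """Trapezoid approximation of \\int_0^{N pi} |sin x| / x dx."""
    x = np.linspace(1e-9, N * np.pi, N * pts_per_period)
    vals = np.abs(np.sin(x)) / x
    return (vals.sum() - 0.5 * (vals[0] + vals[-1])) * (x[1] - x[0])

i10, i100, i1000 = I(10), I(100), I(1000)
assert i10 < i100 < i1000  # the integral keeps growing: sin(x)/x is not in L^1
# growth per decade is close to (2/pi) ln 10, as the harmonic-series bound suggests
assert abs((i1000 - i100) - (2 / np.pi) * np.log(10)) < 0.1
```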

From this commentary we see that we need to require, at least, that $f$ agree almost everywhere with a $C_0$ function for the Fourier inversion formula to work. To show the formula, we impose some restrictions on $f$ and $\hat f$.

Theorem 7.26 (Fourier Inversion Formula). Let $f, \hat f \in L^1(\mathbb{R}^n)$. Then
$$f(x) = \int_{\mathbb{R}^n} e^{2\pi i x\cdot\xi}\, \hat f(\xi)\,d\xi, \qquad (7.13)$$
the formula (7.13) being valid almost everywhere.

It would be tempting to expand $\hat f(\xi)$ in eq. (7.13) and apply the Fubini theorem to obtain the result. Unfortunately we cannot use that approach, since the conditions of Fubini's theorem are not satisfied. Instead we will use the Gauss summability method to overcome this difficulty.

Definition 7.27 (Gauss Summability). We say that a function $f$ is Gauss summable to $L$ if the Gauss means of $f$, defined by
$$G_\varepsilon(f) = \int_{\mathbb{R}^n} f(x)\, e^{-\varepsilon\|x\|^2}\,dx,$$
converge to a limit $L$, i.e., $\lim_{\varepsilon\to 0} G_\varepsilon(f) = L$.

If $f$ is integrable then $L = \int_{\mathbb{R}^n} f(x)\,dx$. Taking a constant function in $\mathbb{R}^n$ we immediately see that the Gauss summability method extends the notion of integrability.

Proof of Theorem 7.26. Let us define
$$A(t, x) = \int_{\mathbb{R}^n} e^{2\pi i x\cdot\xi}\, e^{-t^2\pi\|\xi\|^2}\, \hat f(\xi)\,d\xi.$$
If $g(\xi) = e^{2\pi i x\cdot\xi}\, e^{-t^2\pi\|\xi\|^2}$, then its Fourier transform is given by
$$\hat g(y) = \frac{1}{t^n}\, e^{-\frac{\pi\|x-y\|^2}{t^2}}.$$
Defining $\varphi(x) = e^{-\pi\|x\|^2}$ and recalling that $f_\lambda(x) = \frac{1}{\lambda^n} f\!\left(\frac{x}{\lambda}\right)$, we can write $\hat g(y) = \varphi_t(x - y)$. Using the multiplication formula we can write
$$A(t, x) = (\varphi_t * f)(x), \qquad (7.14)$$
and now, taking the limit in eq. (7.14) when $t \to 0$, we obtain the Fourier inversion formula, taking into account that $(\varphi_t)$ is a Dirac family as $t \to 0$. ◻

Let us see another corollary of the multiplication formula and the Gauss summability method.


Corollary 7.28. Let $f, \hat f \in L^1(\mathbb{R}^n)$, let $f$ be continuous at $x = 0$ and $\hat f \ge 0$. Then
$$\int_{\mathbb{R}^n} \hat f(\xi)\,d\xi = f(0).$$

Proof. Let us write
$$\int_{\mathbb{R}^n} e^{-t^2\pi\|\xi\|^2}\, \hat f(\xi)\,d\xi = \int_{\mathbb{R}^n} \frac{1}{t^n}\, e^{-\frac{\pi\|y\|^2}{t^2}}\, f(y)\,dy.$$
Taking limits in the above formula ($t \to 0$) and using the continuity of $f$ at the origin, we obtain the result. ◻

7.3 L2 Theory

We now want to extend the definition of the Fourier transform to the space $L^2(\mathbb{R}^n)$. The approach will be based on the following idea: we first prove that the Fourier transform is an isometry on a dense subset of $L^2$ and then we extend the Fourier transform to the entire space. The following theorem is of paramount importance in this task.

Theorem 7.29 (Plancherel Theorem). Let $f \in L^1(\mathbb{R}^n) \cap L^2(\mathbb{R}^n)$. Then $\hat f \in L^2(\mathbb{R}^n)$ and moreover
$$\|\hat f\|_2 = \|f\|_2. \qquad (7.15)$$

Proof. We take $g(x) = \overline{f(-x)}$, where the bar denotes complex conjugation. We have
$$\hat g(\xi) = \int_{\mathbb{R}^n} e^{-2\pi i x\cdot\xi}\, \overline{f(-x)}\,dx = \overline{\int_{\mathbb{R}^n} e^{-2\pi i x\cdot\xi}\, f(x)\,dx} = \overline{\hat f(\xi)}.$$
Taking $\varphi(x) = (f * g)(x)$ we have that $\varphi \in L^1(\mathbb{R}^n)$. Since $\varphi$ is a continuous function with $\hat\varphi = |\hat f|^2 \ge 0$, using Corollary 7.28 we get
$$\varphi(0) = \int_{\mathbb{R}^n} \hat\varphi(\xi)\,d\xi.$$
Since
$$\varphi(0) = (f * g)(0) = \int_{\mathbb{R}^n} f(x)\,g(-x)\,dx = \int_{\mathbb{R}^n} |f(x)|^2\,dx$$
and
$$\int_{\mathbb{R}^n} \hat\varphi(\xi)\,d\xi = \int_{\mathbb{R}^n} |\hat f(\xi)|^2\,d\xi,$$
the theorem is proved. ◻
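Plancherel's identity (7.15) can be verified on a grid. A sketch assuming NumPy (the test function and the grids are arbitrary choices, not from the text):

```python
import numpy as np

# Discrete check of Plancherel: ||f||_2 = ||f_hat||_2 for f in L^1 ∩ L^2
x = np.linspace(-8.0, 8.0, 4001)
dx = x[1] - x[0]
f = np.exp(-np.pi * x**2) * (1.0 + np.cos(3.0 * x))  # some rapidly decaying function

xi = np.linspace(-4.0, 4.0, 1601)
fhat = np.array([(f * np.exp(-2j * np.pi * s * x)).sum() * dx for s in xi])

norm_f = np.sqrt((np.abs(f) ** 2).sum() * dx)
norm_fhat = np.sqrt((np.abs(fhat) ** 2).sum() * (xi[1] - xi[0]))
assert abs(norm_f - norm_fhat) < 1e-6
```

Both windows are chosen wide enough that the tails of $f$ and $\hat f$ are negligible; otherwise the truncated $\xi$-integral would undercount $\|\hat f\|_2$.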

We now prove a convergence theorem: if a sequence in $L^1 \cap L^2$ converges in $L^2$, then the Fourier transforms of its terms also converge in $L^2$, as given below.

Theorem 7.30. Let $f \in L^2(\mathbb{R}^n)$. If $(f_j)_{j\in\mathbb{N}}$ is a sequence of $L^1(\mathbb{R}^n) \cap L^2(\mathbb{R}^n)$ functions which converges to $f$ in $L^2(\mathbb{R}^n)$, then the sequence $(\hat f_j)_{j\in\mathbb{N}}$ converges in $L^2(\mathbb{R}^n)$. The limit is almost everywhere unique and is independent of the sequence $(f_j)$.

Proof. Since $\|\hat f_k - \hat f_j\|_2 = \|\widehat{f_k - f_j}\|_2 = \|f_k - f_j\|_2$, the sequence $(\hat f_j)$ is a Cauchy sequence, and since $L^2(\mathbb{R}^n)$ is a Banach space, the sequence has an $L^2(\mathbb{R}^n)$ limit. The uniqueness follows from similar arguments using Cauchy sequences. ◻

Definition 7.31 (Fourier Transform in $L^2(\mathbb{R}^n)$). Let $f \in L^2(\mathbb{R}^n)$. Then its Fourier transform $\hat f$ is understood as the limit function which we obtain from any sequence of $L^1(\mathbb{R}^n) \cap L^2(\mathbb{R}^n)$ functions which converges to $f$ in $L^2(\mathbb{R}^n)$.

Taking the previous definition into account, the reader is invited to fill in the details of the following properties (do not forget that now the Fourier transform is obtained by a limiting process).

Theorem 7.32. Let $f \in L^2(\mathbb{R}^n)$. Then:
(a) $\|f\|_2 = \|\hat f\|_2$;
(b) $\mathcal{F}(\mathcal{F}^{-1}f) = f = \mathcal{F}^{-1}(\mathcal{F}f)$;
(c) for every $g \in L^2(\mathbb{R}^n)$ we have $\langle f, g\rangle = \langle \hat f, \hat g\rangle$.

Item (c) follows from (a) and the polarization identity
$$f\bar g = \frac{1}{4}\left(|f+g|^2 - |f-g|^2 + i|f+ig|^2 - i|f-ig|^2\right).$$
With the above properties we have that $\mathcal{F}(L^2(\mathbb{R}^n)) = L^2(\mathbb{R}^n)$.


Theorem 7.33. The Fourier transform is a linear surjective isometry on the space $L^2(\mathbb{R}^n)$.

Proof. The isometry is just Theorem 7.32(a). It is clear that $\mathcal{F}(L^2(\mathbb{R}^n)) \subseteq L^2(\mathbb{R}^n)$ is a closed subspace, since $\mathcal{F}$ is an isometry. The surjectivity follows from Theorem 7.32(c): if there existed $0 \ne g \perp \mathcal{F}(L^2(\mathbb{R}^n))$, then $0 = (\mathcal{F}f, g) = (f, \mathcal{F}g)$ for all $f \in L^2(\mathbb{R}^n)$, which implies that $\mathcal{F}g = 0$, since it is orthogonal to all $f \in L^2(\mathbb{R}^n)$. Since $\mathcal{F}$ is an isometry, it follows that $g = 0$, entailing a contradiction. ◻

We end this section with the so-called uncertainty principle. We will give only a one-dimensional version. The physical interpretation is that it is not possible to localize simultaneously the position and the momentum.

Theorem 7.34 (Uncertainty Principle). Let $f \in C_0^\infty(\mathbb{R})$ with $\|f\|_2 = 1$. Then
$$\frac{1}{16\pi^2} \le \int_{\mathbb{R}} x^2|f(x)|^2\,dx \int_{\mathbb{R}} \xi^2|\hat f(\xi)|^2\,d\xi. \qquad (7.16)$$

Proof. On the one hand, integration by parts gives us that
$$\int_{\mathbb{R}} x\,\big(|f(x)|^2\big)'\,dx = -1.$$
On the other hand, the Hölder inequality gives the bound
$$1 = \left|\int_{\mathbb{R}} x\,\big(|f(x)|^2\big)'\,dx\right| \le 2\int_{\mathbb{R}} |x f(x)|\,|f'(x)|\,dx \le 2\,\|x f(x)\|_2\,\|f'\|_2. \qquad (7.17)$$
Using the fact that the Fourier transform converts derivatives into products with polynomials (Theorem 7.22) and the Plancherel Theorem, we have
$$\|f'\|_2 = \|\widehat{f'}\|_2 = 2\pi\|\xi \hat f(\xi)\|_2,$$
from which, together with eq. (7.17), we obtain eq. (7.16). ◻
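For the $L^2$-normalized Gaussian $f(x) = 2^{1/4} e^{-\pi x^2}$, which is its own Fourier transform by Example 7.13 and linearity, inequality (7.16) in fact holds with equality; the following sketch (assuming NumPy; the grid is an arbitrary choice) checks this numerically:

```python
import numpy as np

x = np.linspace(-10.0, 10.0, 200001)
dx = x[1] - x[0]
f = 2**0.25 * np.exp(-np.pi * x**2)          # L^2-normalized, equals its own transform
assert abs((f**2).sum() * dx - 1.0) < 1e-10  # ||f||_2 = 1

position = (x**2 * f**2).sum() * dx          # \int x^2 |f(x)|^2 dx
momentum = position                          # f_hat = f, so the xi-integral is the same
product = position * momentum
assert abs(product - 1.0 / (16 * np.pi**2)) < 1e-9
```

This is consistent with the proof: equality in (7.17) forces $f'$ to be proportional to $x f$, an ODE whose solutions are exactly the Gaussians.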

108

7 Fourier Transform

7.4 Schwartz Class

In this section we will introduce the Schwartz class, prove some immediate results, and show that it is a complete metric space. We will not touch upon deeper results, namely the notion of tempered distributions. The reader is invited to see Ref. [39] for further properties of the Schwartz class. We now introduce a class of functions which is the natural environment for the Fourier transform.

Definition 7.35. A function $f$ is said to belong to the Schwartz class $\mathcal{S}(\mathbb{R}^n)$ if $f \in C^\infty(\mathbb{R}^n)$ and for all multi-indices $\alpha, \beta$ we have
$$\sup_x \left|x^\alpha D^\beta f(x)\right| = \varkappa_{\alpha,\beta}(f) < \infty.$$

If a function $f$ belongs to the Schwartz class, then all the derivatives of $f$ go to zero, as $|x| \to \infty$, faster than the reciprocal of any polynomial. Clearly $C_0^\infty(\mathbb{R}^n) \subseteq \mathcal{S}(\mathbb{R}^n)$, but also $e^{-|x|^2} \in \mathcal{S}(\mathbb{R}^n)$, which means that there are Schwartz functions with noncompact support. One of the reasons why the Schwartz space is the natural framework for the Fourier transform is that the space is invariant under multiplication by polynomials and under differentiation; these two operations are connected via the Fourier transform.

Theorem 7.36. Let $f, g \in \mathcal{S}(\mathbb{R}^n)$. Then:
(a) $p(x)f(x) \in \mathcal{S}(\mathbb{R}^n)$ for all polynomials $p(x)$;
(b) $D^\alpha f \in \mathcal{S}(\mathbb{R}^n)$ for all multi-indices $\alpha$;
(c) $\hat f \in \mathcal{S}(\mathbb{R}^n)$;
(d) $f * g \in \mathcal{S}(\mathbb{R}^n)$.

Proof. The proofs of (a) and (b) are immediate from the definition of the Schwartz class. Item (c) follows from eqs (7.8) and (7.11). Property (d) follows from Theorem 7.17(e) and the fact that if $f, g \in \mathcal{S}(\mathbb{R}^n)$ then $\hat f, \hat g \in \mathcal{S}(\mathbb{R}^n)$. ◻

We introduce convergence in $\mathcal{S}$ in the following way.

Definition 7.37 (Convergence in the Schwartz Space). We say that the sequence $(f_j)_{j\in\mathbb{N}}$ converges to $f$ if
$$\lim_{j\to\infty} \varkappa_{\alpha,\beta}(f_j - f) = 0$$
for all multi-indices $\alpha$ and $\beta$. We denote this fact by the notation $\mathcal{S}\text{-}\lim_{j\to\infty} f_j = f$.

The space $\mathcal{S}(\mathbb{R}^n)$ fails to be normed, but it is a metric space.

for all multi-indices ! and ". We denote this fact by the notation S - limj→∞ fj = f . The space S (ℝn ) fails to be normed, but it is a metric space.

7.4 Schwartz Class

109

Definition 7.38 (Metric in the Schwartz Space). In the space S (ℝn ) we introduce the metric d given by ∑ 2–|!|–|"|

d(f , g) =

!,"∈ℕn0

𝜘!," (f – g) 1 + 𝜘!," (f – g)

.

Theorem 7.39. The space (S (ℝn ), d) is a complete metric space. Proof. Let (fj )j∈ℕ be a Cauchy sequence in S (ℝn ). Let % > 0 be given, and take # ∈ ℕn0 . –|y|

Defining ' = 21+%% , there exists an order N(') such that when j, k ≥ N(') we have d(fj , fk ) < '. In particular we have 𝜘0,# (fj – fk ) 1 + 𝜘0,# (fj – fk )

% , 1+ %

from which we get sup |D# (fj – fk )| < %, x∈K

for any compact set K ⊆ ℝn . In other words, the sequence (fj )j∈ℕ is a Cauchy sequence in the complete space C|#| (K). This implies that lim fj = f

j→∞

in the space C|#| (K). From the above consideration, we see that f ∈ C∞ (ℝn ). To finish the proof we will show that for all K compact we have the estimate supx∈K |x! D" f | < ∞ and then see that we can lift up the restriction on K; i.e., we have sup |x! D" f | ≤ sup |x! D" (fj – f )| + sup |x! D" fj |, x∈K

x∈K

x∈K

"

≤ C(K, !) sup |D (fj – f )| + sup |x! D" fj |. x∈K

x∈K

From the above estimate, we obtain sup |x! D" f | ≤ lim sup 𝜘!," (fj ) < ∞. x∈K

(7.18)

j→∞

The result follows taking into account that the upper bound for supx∈K |x! D" f | in eq. (7.18) does not depend on K. ◻ We now collect some more properties of the Fourier transform in the Schwartz class. The proofs are not difficult and are left as an exercise.


Theorem 7.40. Let $f, g \in \mathcal{S}(\mathbb{R}^n)$. Then:
(a) $\widehat{D^\alpha f}(\xi) = (2\pi i \xi)^\alpha\, \hat f(\xi)$;
(b) $\mathcal{F}\big((-2\pi i x)^\alpha f(x)\big)(\xi) = D^\alpha \hat f(\xi)$;
(c) $\mathcal{F}(\mathcal{F}^{-1}f) = f = \mathcal{F}^{-1}(\mathcal{F}f)$;
(d) $\langle f, g\rangle = \langle \hat f, \hat g\rangle$;
(e) $\widehat{f * g} = \hat f\,\hat g$;
(f) $\widehat{f \cdot g} = \hat f * \hat g$.

We now finish with two applications. We will proceed using the Fourier transform without going into the justification of the "Fourier rules."

Example 7.41 (Heat Equation). Let us try to solve the following problem: find a function $u(x, t): \mathbb{R}^n \times \mathbb{R}_+ \to \mathbb{R}$ which satisfies the conditions
$$\begin{cases} \frac{\partial u}{\partial t} = a^2 \Delta u; \\ u(x, 0) = f(x), & x \in \mathbb{R}^n, \end{cases}$$
where $\Delta$ is the Laplace operator $\Delta g = \frac{\partial^2 g}{\partial x_1^2} + \frac{\partial^2 g}{\partial x_2^2} + \cdots + \frac{\partial^2 g}{\partial x_n^2}$. Applying the Fourier transform (in the $x$ variable) to the first condition we obtain
$$\frac{\partial \hat u}{\partial t}(\xi, t) = -a^2(2\pi)^2|\xi|^2\, \hat u(\xi, t).$$
Solving the differential equation we get
$$\hat u(\xi, t) = C(\xi)\, e^{-a^2 4\pi^2|\xi|^2 t},$$
and from the condition $\hat u(\xi, 0) = \hat f(\xi)$ we finally get
$$\hat u(\xi, t) = \hat f(\xi)\, e^{-a^2 4\pi^2|\xi|^2 t}.$$
Using the fact that $\widehat{f * g} = \hat f\,\hat g$ we can obtain the solution $u(x, t)$ as a convolution, namely
$$u(x, t) = \left(f * \mathcal{F}^{-1}\left(e^{-a^2 4\pi^2|\cdot|^2 t}\right)\right)(x).$$
The calculation of $\mathcal{F}^{-1}\left(e^{-a^2 4\pi^2|x|^2 t}\right)(x)$ is left as an exercise; see Problem 7.2. ⊘
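The Fourier recipe of Example 7.41 is exactly what a spectral solver implements. A one-dimensional sketch assuming NumPy, whose FFT uses the same $e^{-2\pi i x\xi}$ sign convention as Definition 7.11; the parameters $a$, $t$ and the Gaussian initial datum are arbitrary choices:

```python
import numpy as np

n, L = 4096, 40.0
x = np.linspace(-L / 2, L / 2, n, endpoint=False)
dx = x[1] - x[0]
a, t = 0.7, 1.3
f = np.exp(-np.pi * x**2)                    # initial data: a normalized Gaussian

xi = np.fft.fftfreq(n, d=dx)                 # frequencies matching exp(-2 pi i x xi)
u = np.fft.ifft(np.fft.fft(f) * np.exp(-a**2 * 4 * np.pi**2 * xi**2 * t)).real

# Exact solution: convolution of two Gaussians, so the variances add
var = 1.0 / (2.0 * np.pi) + 2.0 * a**2 * t
exact = np.exp(-x**2 / (2 * var)) / np.sqrt(2 * np.pi * var)
assert np.max(np.abs(u - exact)) < 1e-8
```

The FFT works on a periodic box, so the domain must be wide enough that the spreading Gaussian never reaches the boundary; otherwise wrap-around contaminates the answer.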


In practice, we use the Fourier rules without taking too much care about the justification. At the end we check whether the solution obtained by this careless approach satisfies the conditions. Let us see the following example.

Example 7.42. Let us try to calculate the Fourier transform $\mathcal{F}(|x|^{-\alpha})$, for $\alpha < n$. We first use a scaling argument, namely
$$\mathcal{F}(|x|^{-\alpha})(t\xi) = |t|^{\alpha-n}\, \mathcal{F}(|x|^{-\alpha})(\xi),$$
from which we get a rough estimate, namely
$$\mathcal{F}(|x|^{-\alpha})(\xi) = |\xi|^{\alpha-n}\, \mathcal{F}(|x|^{-\alpha})\!\left(\xi/|\xi|\right) = |\xi|^{\alpha-n}\, C(n, \alpha).$$
We now want to calculate the constant $C(n, \alpha)$. We use the multiplication formula and the fact that $e^{-\pi|x|^2}$ is a fixed point of the Fourier transform and obtain
$$\int_{\mathbb{R}^n} e^{-\pi|x|^2}\,|x|^{-\alpha}\,dx = \int_{\mathbb{R}^n} e^{-\pi|x|^2}\, C(n, \alpha)\,|x|^{\alpha-n}\,dx.$$
Passing to polar coordinates we get
$$\int_0^\infty e^{-\pi r^2}\, r^{n-\alpha-1}\,dr = C(n, \alpha)\int_0^\infty e^{-\pi r^2}\, r^{\alpha-1}\,dr,$$
from which, using the Gamma function, we obtain
$$C(n, \alpha) = \pi^{\alpha - \frac{n}{2}}\, \frac{\Gamma\!\left(\frac{n-\alpha}{2}\right)}{\Gamma\!\left(\frac{\alpha}{2}\right)}. \qquad ⊘$$

It should be pointed out that the previous reasoning has a flaw: we used the multiplication formula with functions that do not belong to $L^1(\mathbb{R}^n)$. On the other hand, we found a possible candidate for the constant $C(n, \alpha)$, and in fact it is possible to justify that we found the correct constant; see, e.g., Ref. [38].

7.5 Problems

7.1. Prove the validity of commutativity, associativity and distributivity of the convolution operator; see Theorem 7.2.

7.2. Calculate the following inverse Fourier transform:
$$\mathcal{F}^{-1}\left(e^{-a^2|x|^2 t}\right)(x).$$
Hint: Take into account Example 7.13.

7.3. Let $f: \mathbb{R} \to \mathbb{C}$ be a function of class $C^k$ such that all the functions $f, f', f'', \ldots, f^{(k)}$ are absolutely integrable on $\mathbb{R}$. Prove that
$$\hat f(\xi) = o\!\left(\frac{1}{|\xi|^k}\right)$$
as $|\xi| \to \infty$.

7.4. Prove that $e^{-|x|^2} \in \mathcal{S}(\mathbb{R}^n)$ but $e^{-|x|} \notin \mathcal{S}(\mathbb{R}^n)$.

8 Fixed Point Theorem

Learning Targets
✓ Learn the Fixed Point Theorem.
✓ Understand the applicability of the Fixed Point Theorem in applications.

The Fixed Point Theorem is a very powerful and robust technique used in many branches of mathematics. The usefulness of the Fixed Point Theorem relies on two distinct features: (1) it is an existence result, and (2) its proof gives an iterative algorithm which can be used to approximate the fixed point with any degree of accuracy. It is interesting to point out that some form of fixed point iteration was already known to the Babylonians, in the form of the so-called Babylonian algorithm.

Example 8.1 (Babylonian Algorithm of Successive Approximation of Square Roots). Let us prove that $\sqrt{r}$ exists, where $r > 0$. Taking any sequence of the type
$$\begin{cases} x_1 = c > \sqrt{r}, \\ x_{n+1} = \frac{1}{2}\left(x_n + \frac{r}{x_n}\right), \end{cases}$$
we will show that $x_n \to \sqrt{r}$. If the sequence $(x_n)_{n\in\mathbb{N}}$ converges to a strictly positive value $L$, then $L = \frac{1}{2}\left(L + \frac{r}{L}\right)$, which is nothing else than $L^2 = r$. It now suffices to show that $(x_n)$ is a bounded decreasing sequence. ⊘

Definition 8.2. Let $X$ be a set and $f: X \to X$ be a map. We say that an element $x \in X$ is a fixed point of $f$ if $f(x) = x$.

Example 8.3. One of the simplest examples of maps with fixed points is given by the continuous functions $f: [0, 1] \to [0, 1]$: all such maps have at least one fixed point. To see this, consider the function $h(x) = f(x) - x$. Let us assume that $f(0) \ne 0$ and $f(1) \ne 1$, for otherwise the result follows immediately. Then $h(0) > 0$ and $h(1) < 0$. By the Intermediate Value Theorem of elementary calculus, there is some $x_0 \in (0, 1)$ such that $h(x_0) = 0$; that is, $f(x_0) = x_0$, and $f$ has a fixed point. ⊘

From the previous example we note the following fact: the existence of a zero of the function $h(x) := f(x) - x$ is equivalent to the existence of a fixed point of $f$. This remark will be fully exploited in further examples.

Example 8.4. Another example of a fixed point is given by some fractal constructions. A nice example is the Sierpinski triangle (Figure 8.1): if we take three copies of the triangle, contract them by half and assemble them in a triangle shape, we obtain the initial set. For more details, see Section 8.1.4. ⊘

Figure 8.1: Sierpinski triangle.
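The iteration of Example 8.1 can be sketched directly (plain Python; the tolerance, iteration cap and starting points are arbitrary choices):

```python
def babylonian_sqrt(r, c, tol=1e-12, max_iter=100):
    """Iterate x_{n+1} = (x_n + r / x_n) / 2 starting from c > sqrt(r)."""
    x = c
    for _ in range(max_iter):
        nxt = 0.5 * (x + r / x)
        if abs(nxt - x) < tol:  # the decreasing sequence has stabilized
            return nxt
        x = nxt
    return x

assert abs(babylonian_sqrt(2.0, 2.0) - 2.0**0.5) < 1e-10
assert abs(babylonian_sqrt(10.0, 10.0) - 10.0**0.5) < 1e-10
```

The convergence is in fact quadratic: the number of correct digits roughly doubles at each step, which is why so few iterations are needed.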

Definition 8.5. Let $(X, d_X)$ and $(Y, d_Y)$ be metric spaces. We say that a map $f: X \to Y$ is a contraction if there is a constant $0 \le \alpha < 1$ such that, for every $u, v \in X$, we have
$$d_Y(f(u), f(v)) \le \alpha\, d_X(u, v).$$
The constant $\alpha$ is called the Lipschitz constant of the contraction.

With the notion of a contraction map at hand, we now state and prove the central theorem of this chapter, the so-called Banach Fixed Point Theorem, sometimes also called the Banach Contraction Principle.

Theorem 8.6 (Banach's Fixed Point Theorem). Let $F$ be a closed subset of a complete metric space $(X, d)$. If the map $f: F \to F$ is a contraction, then $f$ has one, and only one, fixed point in $F$.

Proof. We will show first that there exists a fixed point for $f$. To do this we show that, given $x \in F$, the sequence $(x_n)_{n=0}^{\infty}$ defined by $x_n = f(f^{n-1}(x))$, where $f^0(x) = x$, is a Cauchy sequence; since every contraction is continuous, its limit will then be a fixed point. By induction, we obtain the inequalities $d(x_1, x_2) \le \alpha\, d(x, x_1)$ and $d(x_2, x_3) \le \alpha\, d(x_1, x_2) \le \alpha^2 d(x, x_1)$; more generally we have
$$d(x_n, x_{n+1}) \le \alpha^n d(x, x_1)$$
for every $n \in \mathbb{N}$. Like this, for every $n, m \in \mathbb{N}$,
$$d(x_n, x_{n+m}) \le d(x_n, x_{n+1}) + d(x_{n+1}, x_{n+2}) + \cdots + d(x_{n+m-1}, x_{n+m}) \le \alpha^n\left(1 + \alpha + \alpha^2 + \cdots + \alpha^{m-1}\right) d(x, x_1) \le \frac{\alpha^n}{1-\alpha}\, d(x, x_1).$$
$\delta > 0$ such that the Newton method converges to $\xi$ for any $x_0 \in [\xi - \delta, \xi + \delta] \subset [a, b]$. For a proof, see Problem 8.2.

8.1 Some Applications We give some applications of the Banach Fixed Point Theorem, regarding the existence of solution for problems in differential equations and integral equations.

8.1.1 Neumann Series We will start with this problem. Let K : X 󳨀→ X be a linear continuous surjective map. Given y ∈ X let us find a vector x ∈ X such that x – Kx = y. Defining Ax := y + Kx, we can write the previous equation in the following way: x = Ax,

(8.1)

from which the solution exists if x is a fixed point of A. The solution of eq. (8.1) is equivalent to the existence of the inverse for the operator (I–K)–1 where I is the identity operator. We have d(Ax1 , Ax2 ) = ‖Ax1 – Ax2 ‖X , = ‖K(x1 – x2 )‖X , ≤ ‖K‖‖x1 – x2 ‖X , = ‖K‖d(x1 , x2 ), from which it follows that our map A is a contraction if ‖K‖ < 1. From the Banach Fixed Point Theorem 8.6 there exists a unique solution x ∈ X for eq. (8.1) when ‖K‖ < 1. The Banach Fixed Point Theorem allows us also to obtain the solution; i.e., taking any

8.1 Some Applications

117

x_0 ∈ X and defining x_n := A x_{n−1}, we obtain that x_n → x. For the particular choice x_0 = y, the iterate x_n is given by

x_n = y + Ky + K^2 y + ⋯ + K^n y → x,

from which we get

x = ∑_{n=0}^{∞} K^n y = (I − K)^{−1} y.   (8.2)

The series in eq. (8.2) is called the Neumann series. It is possible to prove that the series converges not only pointwise but in the operator norm. Thus we can prove the following theorem. Theorem 8.8. Let X be a Banach space and K : X → X be a linear continuous surjective map such that ‖K‖ < 1. Then the operator (I − K)^{−1} is a linear continuous operator in X with the bound

‖(I − K)^{−1}‖ ≤ 1/(1 − ‖K‖).

The operator (I − K)^{−1} is given by

(I − K)^{−1} = ∑_{n=0}^{∞} K^n.
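For a concrete finite-dimensional illustration (my own example, not from the text), take X = ℝ² and K a matrix whose operator norm is below 1; the partial sums of the Neumann series then converge to (I − K)^{−1}. A minimal sketch without external libraries:

```python
def mat_mul(A, B):
    n = len(A)
    return [[sum(A[i][k] * B[k][j] for k in range(n)) for j in range(n)] for i in range(n)]

def mat_add(A, B):
    return [[a + b for a, b in zip(ra, rb)] for ra, rb in zip(A, B)]

I = [[1.0, 0.0], [0.0, 1.0]]
K = [[0.2, 0.1], [0.0, 0.3]]   # entries small enough that ||K|| < 1

# Partial sums S_N = I + K + K^2 + ... + K^N of the Neumann series
S, P = I, I
for _ in range(60):
    P = mat_mul(P, K)          # P = K^{k}
    S = mat_add(S, P)

# S should now satisfy (I - K) S = I, i.e. S approximates (I - K)^{-1}
IK = [[0.8, -0.1], [0.0, 0.7]]   # I - K, written out
check = mat_mul(IK, S)
```

After the loop, `check` is the identity matrix up to rounding, which is the numerical content of eq. (8.2).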

8.1.2 Differential Equations

Maybe the most well-known application of the Banach Fixed Point Theorem is the so-called Picard Theorem, sometimes also called the Cauchy–Picard Theorem. Theorem 8.9 (Picard Theorem). Let f : U → ℝ be a continuous function in U = [t_0 − ε, t_0 + ε] × [y_0 − r, y_0 + r] ⊆ ℝ², with r, ε > 0, which is also Lipschitz in the second variable, viz.

|f(t, α) − f(t, β)| ≤ K |α − β|, K > 0,

for (t, α), (t, β) ∈ U, and denote M = max_{(t,α)∈U} |f(t, α)|. Then the initial value problem

dy/dt (t) = f(t, y(t)),  y(t_0) = y_0

has a unique differentiable solution y : I → ℝ, where I = [t_0 − δ, t_0 + δ], with 0 < δ < min(r/M, 1/K, ε).


Proof. We see that the initial value problem is equivalent to finding a fixed point of the map

(Fy)(t) = y_0 + ∫_{t_0}^{t} f(s, y(s)) ds,  t ∈ I.

If we can show that F is a contraction on an appropriate closed subset of a complete metric space, then the Banach Fixed Point Theorem guarantees the existence of the solution. We now define the set

Q = {y ∈ C(I) : |y(t) − y_0| ≤ r for every t ∈ I},

endowed with the metric d(f, g) = max_{x∈I} |f(x) − g(x)|. The set Q is a closed subset of the complete metric space (C(I), d) (see Example 3.2), and F maps Q into Q since |(Fy)(t) − y_0| ≤ δM ≤ r by the choice of δ. We now show that F is indeed a contraction. For y, w ∈ Q we have

|(Fy)(t) − (Fw)(t)| ≤ |∫_{t_0}^{t} |f(s, y(s)) − f(s, w(s))| ds| ≤ |t − t_0| sup_{s∈I} |f(s, y(s)) − f(s, w(s))| ≤ δK sup_{s∈I} |y(s) − w(s)|,

and so d(Fy, Fw) ≤ δK d(y, w). Since δK < 1 by the choice of δ, the map F is a contraction on Q, and the Banach Fixed Point Theorem yields a unique fixed point of F in Q, that is, a unique solution of the initial value problem. ◻
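The operator F from the proof can also be iterated numerically. In the sketch below, the equation y′ = y with y(0) = 1, the grid on I = [0, 0.5], and the trapezoidal quadrature are my own illustrative choices; the Picard iterates converge to the exact solution e^t.

```python
N = 200                      # grid points on I = [0, 0.5]
h = 0.5 / (N - 1)
ts = [i * h for i in range(N)]
f = lambda t, y: y           # right-hand side of y' = y, Lipschitz with K = 1

def picard_step(y):
    """Apply (Fy)(t) = y0 + integral_0^t f(s, y(s)) ds, via the trapezoidal rule."""
    out = [1.0]              # y0 = 1
    for i in range(1, N):
        out.append(out[-1] + 0.5 * h * (f(ts[i - 1], y[i - 1]) + f(ts[i], y[i])))
    return out

y = [1.0] * N                # initial guess y_0(t) = 1
for _ in range(30):
    y = picard_step(y)
# y now approximates exp(t) on the grid; y[-1] ~ exp(0.5)
```

Each pass through `picard_step` is one application of F; thirty passes are far more than the contraction estimate requires on this short interval.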
8.1.3 Integral Equations

Theorem 8.10. Let K(t, s) be a continuous kernel on [a, b] × [a, b] with M = max_{t,s} |K(t, s)|, and let φ(x) be a continuous function in a ≤ x ≤ b. Then the integral equation

f(t) = φ(t) + λ ∫_a^b K(t, s) f(s) ds

has a unique solution f ∈ C[a, b] when |λ| M (b − a) < 1. Proof. On C[a, b] with the metric d(f, g) = max_t |f(t) − g(t)|, define the operator

T(f)(t) = φ(t) + λ ∫_a^b K(t, s) f(s) ds,

and we see that

d(T(f_1), T(f_2)) = max_t |T(f_1)(t) − T(f_2)(t)| ≤ |λ| M (b − a) d(f_1, f_2),

which shows that T is a contraction when |λ| M (b − a) < 1. The result now follows by the Banach Fixed Point Theorem. ◻
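The same fixed-point iteration computes the solution of the integral equation. In the sketch below the kernel K(t, s) = ts, the data φ ≡ 1, λ = 1/2, and [a, b] = [0, 1] are my own example choices, for which |λ|M(b − a) = 1/2 < 1; for this separable kernel the exact solution works out to f(t) = 1 + (3/10)t.

```python
N = 201
h = 1.0 / (N - 1)
ts = [i * h for i in range(N)]
lam = 0.5
phi = lambda t: 1.0
# kernel K(t, s) = t*s, so M = max |K| = 1 and |lam| * M * (b - a) = 0.5 < 1

def T(f):
    """One application of T(f)(t) = phi(t) + lam * integral_0^1 t*s*f(s) ds."""
    w = [h] * N              # trapezoidal weights
    w[0] = w[-1] = h / 2
    integral_sf = sum(wi * s * fi for wi, s, fi in zip(w, ts, f))
    return [phi(t) + lam * t * integral_sf for t in ts]

f = [0.0] * N
for _ in range(40):
    f = T(f)
# fixed point approximates the exact solution f(t) = 1 + 0.3*t
```

The contraction factor here is about 1/6 per pass, so forty passes converge to quadrature accuracy.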

8.1.4 Fractals

In this section we will talk about some fractal constructions. From Fig. 8.3 we can see that, iterating any initial figure, we obtain the Sierpinski triangle if we use the construction stated in Example 8.4; see Theorem 8.18 for the analytic justification. We now introduce some definitions that will play an important role in this section. Definition 8.11. Let M be a metric space. Given A ⊆ M and δ > 0 we define the δ-collar of A to be

A_δ = {x ∈ M : inf_{y∈A} d(x, y) ≤ δ}.

Figure 8.3: Sierpinski triangle obtained as a fixed point of a map.


Note that if the set A is compact, then the infimum in the definition is attained. We now define the notion of Hausdorff metric in the space of compact subsets of a metric space. Definition 8.12 (Hausdorff Metric). Let K_M denote the class of all nonempty compact subsets of a given metric space M. For K, F ∈ K_M we define

d_H(K, F) = inf{δ > 0 : K ⊆ F_δ, F ⊆ K_δ}.

The Hausdorff distance between two sets is thus the infimum of all δ > 0 such that each set is contained in the δ-collar of the other; see Fig. 8.4. Example 8.13. If X = ℝ, then

d_H({a}, {b}) = |a − b|.
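For finite subsets of ℝ the infima in Definition 8.12 are attained, and d_H reduces to the larger of the two one-sided "farthest nearest-neighbour" distances. A minimal sketch (the example sets are my own):

```python
def hausdorff(K, F):
    """Hausdorff distance between finite nonempty subsets of R (Definition 8.12).

    The smallest delta with K inside the delta-collar of F, and F inside the
    delta-collar of K, is the max of the two one-sided distances below."""
    one_sided = lambda A, B: max(min(abs(a - b) for b in B) for a in A)
    return max(one_sided(K, F), one_sided(F, K))

print(hausdorff([0.0], [3.0]))        # Example 8.13: d_H({a}, {b}) = |a - b| -> 3.0
print(hausdorff([0.0, 1.0], [0.0]))   # every point of {0} is near {0,1}, but not conversely -> 1.0
```

The second call shows why both one-sided distances are needed: {0} sits inside every collar of {0, 1}, yet the point 1 is at distance 1 from {0}.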

We now justify the fact that d_H is a metric. Theorem 8.14. The function d_H is a metric on K_M, called the Hausdorff metric. Proof. The facts that d_H(K, F) ≥ 0 and that d_H(K, F) = d_H(F, K) are self-evident from the definition of d_H. Another immediate property is that d_H(K, K) = 0. Now, let us assume that d_H(K, F) = 0. Then we have that

K ⊆ ⋂_{δ>0} F_δ

Figure 8.4: δ-Collars of two sets.


and now, since the set F is closed, it follows that ⋂_{δ>0} F_δ = F, from which we obtain that K ⊆ F. Similarly we obtain that F ⊆ K, entailing that K = F. Let us show the validity of the triangle inequality, i.e., d_H(K, F) ≤ d_H(K, E) + d_H(E, F). Let us take δ_1, δ_2 > 0 such that

E ⊆ K_{δ1},  K ⊆ E_{δ1},  F ⊆ E_{δ2},  E ⊆ F_{δ2},

from which we obtain that

K ⊆ F_{δ1+δ2},  F ⊆ K_{δ1+δ2},

and by the definition of d_H we have

d_H(K, F) ≤ δ_1 + δ_2.   (8.3)

Taking the infimum over δ_1 and δ_2 in eq. (8.3), we obtain d_H(K, F) ≤ d_H(K, E) + d_H(E, F), which ends the proof. ◻

The following theorem is given without proof; the interested reader can consult, e.g., Ref. [35]. Theorem 8.15. Let M be a complete metric space. Then (K_M, d_H) is a complete metric space. One can even show that if M is compact, then (K_M, d_H) is compact. Let us give two more results that will be important in the proof of Hutchinson's Theorem. Lemma 8.16. Let K_1, K_2, F_1, F_2 ∈ K_M. Then

d_H(K_1 ∪ K_2, F_1 ∪ F_2) ≤ max{d_H(K_1, F_1), d_H(K_2, F_2)}.   (8.4)

Proof. Let us take δ > 0 such that δ > max{d_H(K_1, F_1), d_H(K_2, F_2)}. Then

F_1 ⊆ (K_1)_δ,  F_2 ⊆ (K_2)_δ,  K_1 ⊆ (F_1)_δ,  K_2 ⊆ (F_2)_δ,

from which we obtain

F_1 ∪ F_2 ⊆ (K_1 ∪ K_2)_δ,  K_1 ∪ K_2 ⊆ (F_1 ∪ F_2)_δ.   (8.5)

From the above considerations we get d_H(K_1 ∪ K_2, F_1 ∪ F_2) ≤ δ. Taking the infimum over all δ > 0 satisfying relation (8.5) completes the proof. ◻

The next lemma shows a very interesting property of the space (K_M, d_H): a contraction on (M, d_M) induces a contraction on (K_M, d_H) with the same Lipschitz constant. Lemma 8.17. Let f be a contraction in (M, d_M) with Lipschitz constant α < 1. Then for any E, F ∈ K_M we have

d_H(f(E), f(F)) ≤ α d_H(E, F).

Proof. Let us take δ > 0 such that E ⊆ F_δ and F ⊆ E_δ. Then

f(E) ⊆ f(F_δ) ⊆ (f(F))_{αδ}  and  f(F) ⊆ f(E_δ) ⊆ (f(E))_{αδ}.

Taking the above into account we have d_H(f(E), f(F)) ≤ αδ. Taking the infimum over such δ > 0 completes the proof. ◻

We now formulate and prove the Hutchinson Theorem. Theorem 8.18 (Hutchinson's Theorem). Let (M, d) be a complete metric space and let f_i : M → M be a contraction for i = 1, 2, . . . , k. Define

f : K_M → K_M,  E ↦ ⋃_{j=1}^{k} f_j(E).

Then f is a contraction in (K_M, d_H). Moreover, f has a unique fixed point K ∈ K_M and, for any F ∈ K_M, f^n(F) converges to K as n → ∞. Proof. Let K, F ∈ K_M. Applying Lemma 8.16 repeatedly we have

d_H(f(K), f(F)) = d_H(⋃_{j=1}^{k} f_j(K), ⋃_{j=1}^{k} f_j(F)) ≤ max_{1≤j≤k} d_H(f_j(K), f_j(F)).

Now using Lemma 8.17 it follows that d_H(f_j(K), f_j(F)) ≤ α_j d_H(K, F), where α_j < 1 is the Lipschitz constant of f_j. From the above we obtain that

d_H(f(K), f(F)) ≤ (max_{1≤j≤k} α_j) d_H(K, F).

Since max_{1≤j≤k} α_j < 1, the map f is a contraction; the remaining claims follow from the Banach Fixed Point Theorem applied in the complete metric space (K_M, d_H) (Theorem 8.15), which ends the proof. ◻

The Hutchinson Theorem states that, under suitable conditions, we can write a certain compact set K ∈ K_M as

K = f_1(K) ∪ f_2(K) ∪ ⋯ ∪ f_N(K),

where the f_j are contractions. The set K is called the attractor of the iterated function system (f_j)_{j=1,...,N}. Example 8.19. Let us give the inductive definition of the Cantor set: we start with the collection K_0 = {[0, 1]}, next we take K_1 = {[0, 1/3], [2/3, 1]}, and in general we obtain K_n as the collection of all closed intervals which are the left or right third of an interval from the collection K_{n−1}. Using this iterative approach, the Cantor set is defined as

C = ⋂_{n∈ℕ} ⋃ K_n,

where ⋃ K_n denotes the union of the intervals in the collection K_n.

A graphical representation of each K_n is given in Fig. 8.5, starting from K_0. In fact the Cantor set is the attractor of the iterated function system {f_1, f_2} given by

f_1(x) = (1/3) x,  f_2(x) = (1/3)(x − 1) + 1,

which can be seen in Fig. 8.6. ⊘

Figure 8.5: The Cantor ternary set C.


Figure 8.6: First step of the self-similar Cantor set.
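Hutchinson's Theorem also yields an algorithm: iterating E ↦ f_1(E) ∪ f_2(E) from any starting set converges to the attractor. The sketch below (my own finite stand-in for compact sets) tracks the interval endpoints of the sets K_n for the Cantor iterated function system above.

```python
f1 = lambda x: x / 3.0
f2 = lambda x: (x - 1.0) / 3.0 + 1.0   # = x/3 + 2/3

def hutchinson(E):
    """One application of E -> f1(E) U f2(E) from Theorem 8.18."""
    return sorted({f1(x) for x in E} | {f2(x) for x in E})

E = [0.0, 1.0]                  # endpoints of K0 = [0, 1]
for _ in range(3):
    E = hutchinson(E)
# after n steps, E holds the 2^{n+1} endpoints of the 2^n intervals of K_n
print(len(E))   # 16
```

Since f_1 maps into [0, 1/3] and f_2 into [2/3, 1], the two images never collide, so the point count doubles at every step, mirroring the doubling of intervals in K_n.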

For more famous examples of fractals, the reader is invited to consult Ref. [7].

8.2 Problems

8.1. Give all the details of the proof of the Babylonian algorithm given in Example 8.1.
8.2. Prove Newton's Theorem 8.7.
8.3. Prove the inequality d(x_n, x_m) ≤ ((α^n + α^m)/(1 − α)) d(x_0, x_1) using the following steps:
(a) Show the fundamental contraction inequality: (1 − α) d(x, y) ≤ d(x, T(x)) + d(y, T(y)).
(b) Obtain d(T^n(x_0), T^m(x_0)) ≤ (1/(1 − α)) [d(T^n(x_0), T^n(x_1)) + d(T^m(x_0), T^m(x_1))].
8.4. Prove Theorem 8.8.
8.5. Let

2 t e^t = 1.   (8.6)

Prove that eq. (8.6) has a unique solution in the interval (0, 1).
8.6. Prove that the equation x(t) = t + ε x(t^k), where 0 < ε < 1 and k > 1, has a unique solution x(t) ∈ C[0, 1].
8.7. Show that the following sequence

2;  2 + 1/2;  2 + 1/(2 + 1/2);  2 + 1/(2 + 1/(2 + 1/2));  ⋯

converges. Find its value.

8.8. Show that in the Euclidean space ℝ^n, a linear map T : ℝ^n → ℝ^n with matrix [α_{ij}]_{i,j=1}^{n} is a contraction if

∑_{i,j=1}^{n} |α_{ij}|^2 < 1.

8.9.

Give an example of an operator T : V → V, where V is a Banach space, which satisfies ‖T(x) – T(y)‖ < ‖x – y‖ for all x, y ∈ X, but does not have a fixed point.

9 Baire Category Theorem Learning Targets ✓ Get introduced to the Baire category classiﬁcation of sets in a topological space. ✓ Learn the deﬁnition of Baire space. ✓ Learn Cantor’s Nested Set Theorem and its usage in some real analysis problems. ✓ Get acquainted with the Baire Category Theorem. ✓ Understand how to handle the Baire Category Theorem to prove some results in analysis which belong to the folklore.

9.1 Baire Categories

In the context of measure theory, we know that some sets are negligible in the sense that their measure is zero (the Cantor set has Lebesgue measure zero but nonetheless has cardinality 𝔠, showing that the notion of negligible is unrelated to cardinality, a very striking fact). In practice this means that, whenever we are working with properties that rely on the notion of strictly positive measure, e.g., the Lebesgue integral, we can discard those negligible sets. We now want to obtain a similar notion of negligible set for metric spaces or, more generally, in the topological sense. This notion of negligible set should be such that subsets of negligible sets and countable unions of negligible sets are again negligible. With this in mind we introduce the following definition.

Definition 9.1. Let U be a subspace of a topological space X. Then: (a) U is said to be nowhere dense in X if int(Ū) = ∅, i.e., the closure of U has empty interior. (b) U is said to be a meager set, or a set of the first category in X, if it is contained in a countable union of nowhere dense sets. (c) U is said to be a nonmeager set, or a set of the second category in X, if it is not meager in X.

The terminologies first and second category were used by Baire himself. Since this nomenclature does not convey real meaning, the words meager and nonmeager came into use and are the de facto standard for the aforementioned sets; we will use the latter. Example 9.2. Consider the set of real numbers ℝ with its usual topology. Then the set of rational numbers ℚ is meager in ℝ, since each singleton {q}, q ∈ ℚ, is nowhere dense in ℝ and ℚ is a countable union of such singletons. ⊘


Notice that any subset of a nowhere dense set is also nowhere dense, and the union of a finite number of nowhere dense sets is, again, nowhere dense. However, it is not true in general that a meager set is nowhere dense (see Problem 9.9). Theorem 9.3. Let X be a topological space. Then the following statements are equivalent:
(a) The countable union of closed nowhere dense sets in X is a set with empty interior.
(b) The countable intersection of open dense sets in X is a dense set in X.
(c) Every meager set in X has empty interior.
(d) The complement of every meager set in X is dense in X.
(e) Every nonempty open set in X is not meager in X.

Proof. (a ⇔ b) We prove only the ⇒ direction, since the other direction uses an identical argument. Suppose int(⋃_{n=1}^{∞} F_n) = ∅ whenever every F_n is a closed nowhere dense set, and let G = ⋂_{n=1}^{∞} G_n, where each G_n is an open dense set in X. Recall that a set is dense in X if and only if the interior of its complement is empty, so we want to prove that int(G^∁) = ∅. Consider then

G^∁ = (⋂_{n=1}^{∞} G_n)^∁ = ⋃_{n=1}^{∞} G_n^∁.

Since G_n is open and dense in X, the set G_n^∁ is closed (hence equal to its own closure) and has empty interior, so G_n^∁ is nowhere dense for every n ∈ ℕ. By hypothesis we conclude that int(G^∁) = ∅. We leave the rest of the proof as an exercise. ◻ We are now in a position to introduce the notion of a Baire space. Definition 9.4. Let X be a topological space. If X satisfies the conditions in Theorem 9.3, then X is called a Baire space.

9.2 Baire Category Theorem Before stating and proving the main result in this chapter, we want to recall a very powerful method widely used in analysis, the so-called Cantor’s Nested Set Theorem, which will be used to prove Baire Category Theorem and will also give some insight into the usage of the aforementioned category theorem. Theorem 9.5 (Cantor’s Nested Set Theorem). Let F1 , F2 , F3 , . . . be nonempty closed subsets of a complete metric space X, which satisfy the nesting property F1 ⊃ F2 ⊃


F3 ⊃ ⋅ ⋅ ⋅ and limn→∞ diam (Fn ) = 0. Then there is one, and only one, point in the intersection ∩∞ n=1 Fn . Proof. First we prove that the intersection is nonempty. From each of the sets Fn we select a point xn . Notice that if m > n, then xm ∈ Fn and hence d (xm , xn ) ≤ diam (Fn ), ∞ from which we get that (xn )n=1 is a Cauchy sequence in X since diam (Fn ) → 0. Due ∞ to the fact that the underlying space X is complete, (xn )n=1 converges to some element x∞ ∈ X. From the closedness of the set Fn and the nesting property F1 ⊃ F2 ⊃ ⋅ ⋅ ⋅, we conclude that x∞ ∈ Fn for every n, which implies that x∞ ∈ ∩∞ n=1 Fn . Let us prove uniqueness by contradiction. Suppose that x and y are distinct elements of ∩∞ n=1 Fn . Then, for every n ≥ 1, d (x, y) ≤ diam (Fn ) , which implies that d (x, y) = 0 and hence x = y, which is a contradiction. ◻ This theorem is a powerful tool to show existence and can be cataloged as an existence theorem. We will give some applications in real analysis of the Cantor Nested Set Theorem. We choose some well-known results which are given in most textbooks with a proof not relying directly on Cantor’s Theorem.

Theorem 9.6. Let f be a differentiable real-valued function defined on an open interval I. If, for every x ∈ I, we have f′(x) = 0, then f is constant. Proof. Let us assume that the function f is not constant, so there exist x, y ∈ I, with x < y, such that |f(x) − f(y)| = k ≠ 0. Then

k = |f(x) − f(y)| ≤ |f(x) − f((x + y)/2)| + |f((x + y)/2) − f(y)|.

By the pigeonhole principle there exist x_1, y_1 ∈ I with x_1 < y_1, where x_1 ∈ {x, (x + y)/2} and y_1 ∈ {(x + y)/2, y}, such that |f(x_1) − f(y_1)| ≥ k/2; moreover, [x, y] ⊃ [x_1, y_1]. Considering again the midpoint of this new interval, we obtain two points x_2, y_2 ∈ I such that |f(x_2) − f(y_2)| ≥ k/2^2, from which we get the nesting [x, y] ⊃ [x_1, y_1] ⊃ [x_2, y_2]. Continuing this process, we have at the n-th iteration a pair of points x_n, y_n ∈ I such that |f(x_n) − f(y_n)| ≥ k/2^n and

[x, y] ⊃ [x_1, y_1] ⊃ [x_2, y_2] ⊃ ⋯ ⊃ [x_n, y_n].

Moreover, |y_n − x_n| = (y − x)/2^n. By Cantor's Nested Set Theorem,

⋂_{n=1}^{∞} [x_n, y_n] = {r}.


Also,

|f(y_n) − f(x_n)| / (y_n − x_n) ≥ (k/2^n) · (2^n/(y − x)) = k/(y − x) > 0,

which contradicts the fact that f′(x) = 0 for all x ∈ I. ◻

The standard textbook proof uses Lagrange's Mean Value Theorem, and that proof is straightforward. We now show another interesting application of the Cantor Nested Set Theorem. Theorem 9.7 (Dini's Theorem). Let f_n be a sequence of continuous functions defined on a compact space X such that f_n → f pointwise, where f is also continuous. If the sequence f_n is monotone, then f_n → f uniformly. Proof. Let ε > 0 be arbitrary and

F_n = {x ∈ X : |f_n(x) − f(x)| ≥ ε}.

From the monotonicity of the f_n we have that F_1 ⊃ F_2 ⊃ ⋯, i.e., (F_n) is a nested sequence. Since the functions f_n and f are continuous, the sets F_n are closed subspaces of X and hence compact. Since ⋂_{n=1}^{∞} F_n = ∅ (why?), by Cantor's Nested Set Theorem there exists some k such that F_k = ∅. The result now follows since ε is arbitrary. ◻ The two previous examples show the usage of Cantor's Theorem, and we are tempted to use it whenever we need to show existence. This can be somewhat difficult, since we need to construct a nested sequence of closed sets. One way to circumvent this difficulty is to use the so-called Baire Theorem, which is the main theorem of this chapter. Its statement may look simple at first, but it is a very powerful tool with plenty of applications, as we will see later in this chapter. Theorem 9.8 (Baire's Theorem). Every complete metric space is a Baire space. Proof. Suppose X is a complete metric space. We will prove that it satisfies the second condition in Theorem 9.3, so let G = ⋂_{n=1}^{∞} G_n, where the G_n are open dense sets in X. We need to show that any open ball B_1 in X has a common point with G. Clearly G_1 ∩ B_1 ≠ ∅ and is open, so it contains the closure B̄_2 of an open ball B_2 of radius less than 1/2. Again, there is an open ball B_3 with radius less than 1/3 such that its closure B̄_3 is contained in G_2 ∩ B_2. Continuing this process we obtain a sequence of open balls B_n with radius less than 1/n such that

(G_1 ∩ B_1) ⊃ B̄_2 ⊃ B̄_3 ⊃ ⋯,


with B̄_{n+1} ⊆ G_n ∩ B_n. The sequence formed by the centers of these balls is a Cauchy sequence and, since X is complete, there exists a unique α ∈ X such that {α} = ⋂_{n=1}^{∞} B̄_n. Hence α ∈ G ∩ B_1 ≠ ∅. ◻ Sometimes the Baire Theorem is given in the following way. Theorem 9.9. The intersection of a countable family of open and dense sets in a complete metric space is a dense set. The completeness hypothesis in the Baire Theorem cannot be removed. To see this, consider the metric space (ℚ, d), where d is the usual distance in ℝ. This space is not complete, and each set containing a single point of ℚ is nowhere dense; hence the whole space is a countable union of nowhere dense sets. Definition 9.10. Let X be a topological space. (a) If U ⊆ X and U = ⋂_{n=1}^{∞} U_n, where U_n is an open set in X for every n ∈ ℕ, then U is a set of type G_δ. (b) If U ⊆ X is a set of type G_δ and it is dense in X, then it is called generic. (c) If V ⊆ X and V = ⋃_{n=1}^{∞} V_n, where V_n is a closed set in X for every n ∈ ℕ, then V is a set of type F_σ. Theorem 9.11. Let X be a Baire space. Then every nonempty open set in X and every generic set in X are Baire spaces. Proof. Let G ⊆ X be an open set. If G is not a Baire space, there is a sequence of open dense sets G_n ⊆ G such that its intersection is not dense in G. If Ḡ denotes the closure of G, then the sets G̃_n = G_n ∪ (X \ Ḡ) are open and dense in X, but their intersection is not dense in X. This is a contradiction, since we assumed X is a Baire space. Then G is a Baire space. Now consider a sequence (W_n) of open dense sets in X whose intersection W = ⋂_n W_n is dense in X, so that W is a generic set in X. If (Z_m) is a sequence of open dense subsets of W, then there are open dense sets S_m in X such that Z_m = S_m ∩ W. Also,

⋂_m Z_m = ⋂_m (S_m ∩ W) = ⋂_{m,n} (S_m ∩ W_n)

is dense in X and, hence, dense in W. Then W is a Baire space. ◻

Theorem 9.12. If S is a meager subset of a Baire space X, then its complement X\S is a Baire space. Proof. Since S is meager in X, the set X\S contains a generic subset U in X. If (Gn ) is a sequence of open dense sets in X\S, then Gn ∩ U is open and dense in U for every n.


The intersection ⋂n (Gn ∩ U) = (⋂n Gn ) ∩ U is dense in U and X\S, for U is dense in X. This shows that X\S is a Baire space. ◻ Remark 9.13. Notice that the Baire Theorem is, contrary to what its statement might suggest, more a topological result than a metric result. Consider ℝ which is a complete metric space and, hence, a Baire space. It is homeomorphic to (–1, 1) , so this last space is also a Baire space. However, (–1, 1) is not a complete metric space. ⊘ It belongs to the mathematical folklore that there are continuous functions that do not have derivative at every point in its domain, e.g., the Weierstrass function, cf. Example 9.14. The question of size of the set of such pathological functions can be answered using the Baire Theorem, showing that “most” continuous functions are in fact nowhere differentiable. We first show that the Weierstrass function (9.1) is a continuous function with no derivative at any point of its domain. After that we show that the set of such functions is generic in C[0, 1]. Example 9.14. The Weierstrass function W is a continuous function that is not differentiable at any point of its domain. It is defined by W : ℝ 󳨀→ ℝ

x 󳨃󳨀→ ∑ bn cos(an x0)

,

(9.1)

n=0

where a is a positive odd integer and b is a positive number less than 1, chosen in such 0 a way that ab > 1 and 32 > ab–1 . By the Weierstrass M-test, we know that the series defining W is uniformly convergent and since each term is a continuous function we have that W is indeed a continuous function. Now, we will see how Weierstrass showed that this function does not have derivative at any point. Denote by x0 any value of x, m a positive integer and !m an integer such that 1 1 – < am x0 – !m ≤ . 2 2 If we define xm+1 = am x0 – !m and x󸀠 =

!m – 1 , am

x 󸀠 – x0 = –

1 + xm+1 , am

x󸀠󸀠 =

!m + 1 , am

we see that x󸀠󸀠 – x0 =

1 – xm+1 . am


From this we conclude that x_0 is between x′ and x″, and that m may be taken large enough so that the differences of x′ and x″ with x_0 tend to zero. On the other hand, we have

(f(x′) − f(x_0))/(x′ − x_0) = ∑_{n=0}^{∞} b^n (cos(a^n π x′) − cos(a^n π x_0))/(x′ − x_0) = A + B,

where

A = ∑_{n=0}^{m−1} a^n b^n (cos(a^n π x′) − cos(a^n π x_0))/(a^n (x′ − x_0)),

B = ∑_{n=0}^{∞} b^{m+n} (cos(a^{m+n} π x′) − cos(a^{m+n} π x_0))/(x′ − x_0).

Since we have, by the product-to-sum formula, that

(cos(a^n π x′) − cos(a^n π x_0))/(a^n (x′ − x_0)) = −π sin(a^n π (x′ + x_0)/2) · sin(a^n π (x′ − x_0)/2) / (a^n π (x′ − x_0)/2),

and

|sin(a^n π (x′ + x_0)/2)| < 1,  |sin(a^n π (x′ − x_0)/2) / (a^n π (x′ − x_0)/2)| < 1,

it follows that

|A| < π ∑_{n=0}^{m−1} a^n b^n < (π/(ab − 1)) (ab)^m.

Since we also have

cos(a^{m+n} π x′) = cos(a^n π (α_m − 1)) = −(−1)^{α_m},
cos(a^{m+n} π x_0) = cos((a^n α_m + a^n x_{m+1}) π) = (−1)^{α_m} cos(a^n π x_{m+1}),

we conclude that

B = (−1)^{α_m} (ab)^m ∑_{n=0}^{∞} b^n (1 + cos(a^n π x_{m+1}))/(1 + x_{m+1});

and, since every term in the series is positive and the first term is not less than 2/3 (because x_{m+1} ∈ (−1/2, 1/2]), we have

(−1)^{α_m} B > (2/3)(ab)^m.


Then

B = (−1)^{α_m} (2/3) η (ab)^m,  A = (−1)^{α_m} (ε π/(ab − 1)) (ab)^m,

where η is a positive number greater than 1 and ε is a number between −1 and 1. We then have

(f(x′) − f(x_0))/(x′ − x_0) = (−1)^{α_m} (ab)^m η (2/3 + (ε/η) · π/(ab − 1)),

and the sign of this quotient depends on (−1)^{α_m}. Similarly, we get that there exist a positive number η′ > 1 and −1 < ε′ < 1 such that

(f(x″) − f(x_0))/(x″ − x_0) = −(−1)^{α_m} (ab)^m η′ (2/3 + (ε′/η′) · π/(ab − 1)),

whose sign depends on −(−1)^{α_m}. Notice that when m tends to ∞, both x′ and x″ tend to x_0. So if we take a and b in such a way that 2/3 > π/(ab − 1), we conclude from the previous inequalities that the two difference quotients tend to ∞ and −∞ as m tends to ∞; hence the function f does not have a derivative at any point x_0. ⊘

Other famous continuous functions without derivative at any point are the following. Van der Waerden's function:

∑_{k=0}^{∞} ⟨⟨4^k x⟩⟩ / 4^k,

where ⟨⟨x⟩⟩ is the absolute value of the difference between the point x and the nearest integer; Riemann's function:

∑_{n=1}^{∞} (1/n^2) sin(n^2 π x).

We now study the question of the size of the set of continuous nowhere differentiable functions. We need the following density lemma, whose proof we leave as an exercise. Lemma 9.15. The set of all piecewise linear continuous functions defined on [0, 1] is dense in C[0, 1]. Theorem 9.16. Let S = C[0, 1]. Then the set of those functions in S that do not have a finite derivative at any point of [0, 1] is generic in S.


Proof. Let I = [0, 1] and define, for every g ∈ S,

F_g^n(x, y) = |g(x + y) − g(x)| − n|y|, with x + y ∈ I.

Now consider the set U_n = {g ∈ S : for every x ∈ I there is y with F_g^n(x, y) > 0}. If g ∈ ⋂_{n=1}^{∞} U_n, then g does not have a derivative at any point of [0, 1]. Since S is a complete metric space, using Baire's Theorem we just have to prove that each U_n is open and dense in S. If g ∈ U_n, then there exists an ε > 0 such that, for every x ∈ I, there is y = y(x) with F_g^n(x, y) > ε (why?). Now, if h ∈ S, then

ε + n|y| < |g(x + y) − g(x)| ≤ |g(x + y) − h(x + y)| + |h(x + y) − h(x)| + |h(x) − g(x)|,

so if ‖g − h‖_∞ < ε/2, then F_h^n(x, y) > 0 for every x ∈ [0, 1], i.e., h ∈ U_n. Hence U_n is open. We now prove that U_n is dense in S. Let g ∈ S and ε > 0. We will show that there exists h ∈ U_n such that ‖g − h‖_∞ < ε. Notice that by Lemma 9.15 we may assume that g is a piecewise linear function. Let M > 0 be such that |g′(x)| ≤ M for every x ∈ [0, 1] at which g is differentiable. Consider the function φ(x) = inf_{k∈ℤ} |x − k| and choose a natural number m such that mε − M > n. Define the function h on [0, 1] as h(x) = g(x) + ε φ(mx), and notice that h is continuous. Moreover, if 0 ≤ x < 1, then

lim sup_{y→0⁺} |(h(x + y) − h(x))/y| ≥ εm − M > n,

and consequently h ∈ U_n. On the other hand, notice that ‖g − h‖_∞ = ‖ε φ(m·)‖_∞ < ε. This finishes the proof. ◻
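The perturbation used in the density argument can be checked numerically. In the sketch below the choices g(x) = x, ε = 0.1, n = 5, and m = 100 are my own; they satisfy mε − M > n with M = 1, and the difference quotient of h = g + εφ(m·) at 0 indeed exceeds n while h stays within ε of g.

```python
phi = lambda x: abs(x - round(x))        # distance from x to the nearest integer

g = lambda x: x                          # piecewise linear with |g'| <= M = 1
eps, n = 0.1, 5
m = 100                                  # chosen so that eps*m - M = 9 > n
h = lambda x: g(x) + eps * phi(m * x)

# sup-distance between g and h is at most eps (phi <= 1/2, so in fact eps/2)
xs = [i / 1000.0 for i in range(1001)]
assert max(abs(g(x) - h(x)) for x in xs) <= eps / 2 + 1e-12

# at x = 0 the right difference quotient is large for small y
x, y = 0.0, 1e-4                         # m*y = 0.01 is still below the first kink
q = (h(x + y) - h(x)) / y                # equals 1 + eps*m = 11 > n
```

On the short initial segment the sawtooth φ(mx) has slope m, so the quotient is exactly g′(0) + εm, matching the εm − M lower bound of the proof.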

For another approach to the density of the sets Un , see Problem 9.14. Corollary 9.17. The set of the continuous real functions having derivative at some point is meager in C [a, b] . Another interesting application of the Baire Theorem is the following result.


Theorem 9.18. Let X be an infinite-dimensional Banach space. Then any (Hamel) basis for X is uncountable. Proof. Suppose, to the contrary, that (e_j)_{j∈ℕ} is a countable basis for X with ‖e_j‖ = 1 for every j ∈ ℕ. Take n, m ∈ ℕ and define

F_{n,m} = { ∑_{j=1}^{n} k_j e_j ∈ X : ∑_{j=1}^{n} |k_j| ≤ m },

which is closed, and X = ⋃_{n,m} F_{n,m}. Using Baire's Theorem, there is a set F_{n,m} containing an open ball B(α; r), with r > 0. This is, however, a contradiction, since the vector α + (r/2) e_{n+1} belongs to B(α; r) but not to F_{n,m}. This proves that a basis for X cannot be countable. ◻

9.3 Problems

9.1. Show that in order for a set U to be meager in X it is necessary and sufficient that U ⊆ ⋃_{n=1}^{∞} F_n, where the F_n are closed sets with empty interior in X.
9.2. Without using the Lagrange Mean Value Theorem, prove that a differentiable function f : I → ℝ such that |f′(x)| ≤ K for all x ∈ I is a Lipschitz function, viz. it satisfies

|f(y) − f(x)| ≤ K · |x − y|

for all x, y ∈ I.
9.3. Show that the condition lim_{n→∞} diam F_n = 0 in Theorem 9.5 is necessary. Hint: Study the sequence F_n = [n, +∞).
9.4. Prove that the interval [0, 1] cannot be enumerated, using the Cantor Nested Set Theorem.
9.5. Prove Theorem 9.11.
9.6. Prove Lemma 9.15.
9.7. If f ∈ C^∞(ℝ) and for every x ∈ ℝ there is a nonnegative integer n such that f^{(n)}(x) = 0, then f is a polynomial.
9.8. Is it possible to find a Baire metric space which is not complete?
9.9. Show that we can have a meager set which is not nowhere dense.
9.10. Show that a singleton {p} in a metric space (X, d) has empty interior if and only if p is not an isolated point.
9.11. Show that the boundary of an open set G ⊆ X, where X is a metric space, is a closed set with empty interior.
9.12. Let X be a complete metric space and define B as a set of continuous functions on X which is pointwise bounded, in the following sense:

B := {f : X → ℝ : f is continuous}

with the additional property that for all x ∈ X there exists a C_x > 0 such that |f(x)| ≤ C_x for all f ∈ B. Show that there exists an open nonempty set U ⊆ X such that B is uniformly bounded in U.
9.13. Show that there is no function f : ℝ → ℝ which is continuous at each rational point and discontinuous at each irrational point. (Be aware that there is a function φ : ℝ → ℝ which is continuous at the irrational points and discontinuous at the rational points, viz. φ(x) = 0 if x is irrational, φ(0) = 1 and φ(p/q) = 1/q if p/q is an irreducible fraction.)
9.14. Try to prove the density result for the sets U_n in Theorem 9.16 based on the fact that continuous functions on compact sets are in fact uniformly continuous, and on the geometric idea of a piecewise linear function f staying within ε of g on each subinterval I_j.

10 Uniform Boundedness Principle Learning Targets ✓ Learn the uniform boundedness principle in its various formulations. ✓ Get acquainted with different proofs of the uniform boundedness principle. ✓ Understand the applicability of the uniform boundedness principle.

One of the well-known results in analysis states that a continuous function on a compact set is uniformly continuous. This result gives global information relying on local pointwise information. In the same vein, the uniform boundedness theorem, also known as the Banach–Steinhaus Theorem, allows us to obtain the uniform boundedness of a family of operators from its pointwise boundedness. Definition 10.1. Let T_j : E → F, j ∈ J, be a family of bounded operators. We say that the family of operators is: (a) pointwise bounded if for every x ∈ E there exists a constant C_x > 0 such that for every j ∈ J we have ‖T_j(x)‖ ≤ C_x; (b) uniformly bounded if there exists a constant C such that for every j ∈ J we have ‖T_j‖ ≤ C. With the previous definitions at hand, we can state the central theorem of this chapter. Theorem 10.2 (Uniform Boundedness Theorem). Let X be a Banach space and Y a normed space. If {T_j}_{j∈J} is a family of pointwise bounded operators in B(X, Y), then the family is uniformly bounded, i.e., sup_{j∈J} ‖T_j‖ < ∞. We will provide two different proofs of Theorem 10.2, one relying on Baire's Theorem, the other more elementary in a sense. We start with the elementary approach taken from Ref. [36], but first we need an auxiliary result. Lemma 10.3. Let T : X → Y be a bounded linear operator, where X and Y are normed linear spaces. Then for any x ∈ X and r > 0 we have

sup_{ξ ∈ B_X(x, r)} ‖Tξ‖ ≥ ‖T‖ r.

Proof. We notice that

max{‖T(x − ξ)‖, ‖T(x + ξ)‖} ≥ (1/2)(‖T(x − ξ)‖ + ‖T(x + ξ)‖) ≥ ‖T(ξ)‖,   (10.1)


which is valid for all ξ with ‖ξ‖ < r, since then x − ξ and x + ξ belong to B_X(x, r). The result now follows by taking the supremum over ξ ∈ B_X(0, r) in inequality (10.1) and noting that sup_{ξ∈B_X(0,r)} ‖T(ξ)‖ = ‖T‖ r. ◻

Proof of Theorem 10.2 (Elementary Version). Let us suppose that sup_{j∈J} ‖T_j‖ = +∞. Then we can choose a sequence (T_n)_{n∈ℕ} from the family (T_j)_{j∈J} such that ‖T_n‖ ≥ 4^n. We now construct a sequence (x_n) inductively in the following way: x_0 = 0 and, for n ≥ 1, we use Lemma 10.3 to pick x_n ∈ X such that ‖x_n − x_{n−1}‖ ≤ 3^{−n} and ‖T_n(x_n)‖ ≥ (2/3) 3^{−n} ‖T_n‖. Since (x_n)_{n∈ℕ} is a Cauchy sequence, it converges to a limit x ∈ X. Writing x − x_n as a telescoping series, we obtain that ‖x − x_n‖ ≤ (1/2) 3^{−n}, from which we get

‖T_n x‖ ≥ ‖T_n x_n‖ − ‖T_n(x − x_n)‖ ≥ (2/3) 3^{−n} ‖T_n‖ − (1/2) 3^{−n} ‖T_n‖ = (1/6) 3^{−n} ‖T_n‖ ≥ (1/6)(4/3)^n → ∞,

contradicting the pointwise boundedness of the family at x. ◻

The following proof of Theorem 10.2 is the standard textbook proof relying on Baire's Theorem. The proof is, as simply put by C. Villani, particularly opaque to intuition.

Proof of Theorem 10.2 (Standard Version). We will write X as a union of closed sets in order to use Baire's Theorem. Taking F_{j,k} := {α ∈ X : ‖T_j(α)‖ ≤ k}, we define

F_k := ⋂_{j∈J} F_{j,k} = {α ∈ X : ‖T_j(α)‖ ≤ k for every j ∈ J}.

The sets F_k are closed, being intersections of the closed sets F_{j,k} (each F_{j,k} = Tj^{–1}(B̄_Y(0; k)) is the inverse image of the closed ball B̄_Y(0; k) under the continuous map Tj). Moreover, X = ⋃_{k=1}^∞ F_k, and now, invoking Baire's Theorem, there exists some F_m with nonempty interior. Let B_X(x0; r), r > 0, be an open ball contained in F_m. From the definition of F_m it follows that, for all j ∈ J and every x ∈ B_X(x0; r), we have ‖Tj(x)‖ ≤ m. If y ∈ X with ‖y‖ = 1, then z = x0 + (r/2)y belongs to B_X(x0; r) and

‖Tj(y)‖ = (2/r) ‖Tj(z) – Tj(x0)‖ ≤ (2/r)(‖Tj(z)‖ + ‖Tj(x0)‖) ≤ 4m/r,

which is valid for every j ∈ J and every y with ‖y‖ = 1. From this it follows that sup_{j∈J} ‖Tj‖ ≤ 4m/r < ∞. ◻

The hypothesis in the Banach–Steinhaus Theorem that the domain of the family of operators be complete cannot be dropped, as can be seen from the sequence of complex functionals fn : x ∈ (c00, ‖·‖∞) ↦ n x_n; see Problem 10.1. The next corollary is often also called the Banach–Steinhaus Theorem.

Corollary 10.4 (Banach–Steinhaus Theorem). Let X be a Banach space and Y a normed space. If (Tn)_{n=1}^∞ is a sequence in B(X, Y) such that, for all x ∈ X, the limit T(x) := lim_{n→∞} Tn(x) exists, then sup_n ‖Tn‖ < ∞ and T is an operator in B(X, Y).


Proof. The linearity of T is clear. Now, since lim_{n→∞} Tn(x) exists for every x ∈ X, we have sup_n ‖Tn(x)‖ < ∞ and, by the Uniform Boundedness Theorem 10.2, C := sup_n ‖Tn‖ < ∞. From the definition of T it follows that, for x ≠ 0,

‖T(x)‖/‖x‖ = lim_{n→∞} ‖Tn(x)‖/‖x‖ ≤ sup_n ‖Tn‖ = C,

which yields ‖T‖ ≤ C for the positive constant C. ◻
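The counterexample fn(x) = n x_n on the incomplete space (c00, ‖·‖∞), mentioned before Corollary 10.4, can be explored numerically. The following Python sketch is an illustration only (elements of c00 are modelled as finite lists, which is our own modelling choice, not part of the text):

```python
# Pointwise vs. uniform boundedness of f_n(x) = n * x_n on (c00, ||.||_inf).
# Elements of c00 are finitely supported sequences, here plain Python lists.

def f(n, x):
    """f_n(x) = n * x_n (1-based index); zero once n exceeds the support."""
    return n * x[n - 1] if n <= len(x) else 0.0

# Pointwise boundedness: for a FIXED x, |f_n(x)| <= C_x for all n,
# since x_n = 0 once n exceeds the (finite) support of x.
x = [1.0, -2.0, 0.5]                      # a fixed element of c00
C_x = max(abs(f(n, x)) for n in range(1, 100))
print(C_x)                                # 4.0 (finite bound for this x)

# No uniform bound: |f_n(e_n)| = n although ||e_n||_inf = 1.
norms = []
for n in range(1, 6):
    e_n = [0.0] * n
    e_n[-1] = 1.0                         # n-th standard basis vector
    norms.append(abs(f(n, e_n)))
print(norms)                              # [1.0, 2.0, 3.0, 4.0, 5.0]
```

This does not contradict Theorem 10.2 because (c00, ‖·‖∞) is not complete.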

Corollary 10.5. Let X be a Banach space and U ⊆ X* = B(X, ℝ). Then U is bounded if, and only if, for every x ∈ X we have sup_{φ∈U} |φ(x)| < ∞.

Proof. If U is bounded, then sup_{φ∈U} ‖φ‖ = C < ∞, from which it follows that for every x ∈ X, sup_{φ∈U} |φ(x)| ≤ C‖x‖ < ∞. For the other implication, let us regard U as a family (Tj)_{j∈J} of operators in B(X, ℝ) defined on the Banach space X; invoking the Banach–Steinhaus Theorem 10.2 ends the proof. ◻

We now characterize the size of the set on which a family of operators remains pointwise bounded whenever the family is not uniformly bounded.

Theorem 10.6. Let X be a Banach space and Y a normed space. If {Tj}_{j∈J} is a family in B(X, Y) such that sup_{j∈J} ‖Tj‖ = ∞, then the set A = {x ∈ X : sup_j ‖Tj(x)‖ < ∞} is meager in X.

Proof. We can write A as a countable union of closed sets,

A = ⋃_{k∈ℕ} F_k = ⋃_{k∈ℕ} {x ∈ X : ‖Tj(x)‖ ≤ k for every j ∈ J}.

From the Banach–Steinhaus Theorem 10.2 it follows that all of the sets F_k have empty interior (why?), and now the result follows. ◻

Let us recall a well-known example from analysis. Taking f : ℝ² → ℝ as (x, y) ↦ xy/(x² + y²) whenever (x, y) ≠ (0, 0) and f(0, 0) = 0, we see that for each fixed x0 the map y ↦ f(x0, y) is continuous and for each fixed y0 the map x ↦ f(x, y0) is continuous, but the function f itself is not continuous at the origin, which can be seen from the representation f(x, y) = cos(θ)·sin(θ), where θ is the polar angle of the point (x, y). On the other hand, it is also known from mathematical analysis that a bilinear map B : ℝ^d × ℝ^n → ℝ^m is continuous. We now extend this result to the case of bilinear operators, i.e., we will present a result about the continuity of bilinear operators as an application of the Banach–Steinhaus Theorem, which loosely states that a bilinear operator which is continuous separately in each variable is continuous. Let us introduce the definition of a bilinear operator.
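Before moving on, the function f above, continuous in each variable separately but discontinuous at the origin, can be probed numerically. The following Python sketch is illustrative only:

```python
import math

def f(x, y):
    """f(x, y) = xy / (x^2 + y^2), with f(0, 0) = 0."""
    return 0.0 if x == 0 and y == 0 else x * y / (x**2 + y**2)

# Separate continuity: along each axis the partial maps tend to f(0, 0) = 0.
print([f(t, 0.0) for t in (1e-1, 1e-3, 1e-6)])   # [0.0, 0.0, 0.0]
print([f(0.0, t) for t in (1e-1, 1e-3, 1e-6)])   # [0.0, 0.0, 0.0]

# Joint discontinuity: along the diagonal y = x the value is constant 1/2,
# matching f = cos(theta) sin(theta) with theta = pi/4.
print([f(t, t) for t in (1e-1, 1e-3, 1e-6)])     # [0.5, 0.5, 0.5]
print(math.cos(math.pi / 4) * math.sin(math.pi / 4))
```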


Definition 10.7 (Bilinear Operator). Let X1, X2 and Y be vector spaces. We say that the map B : X1 × X2 → Y is a bilinear operator if, for all fixed x1 ∈ X1 and x2 ∈ X2, the maps

ξ2 ↦ B(x1, ξ2) and ξ1 ↦ B(ξ1, x2)

are linear operators.

Corollary 10.8 (Boundedness of Bilinear Operators). Let X1 and X2 be Banach spaces. If Φ : X1 × X2 → 𝔽 is a bilinear operator which is continuous in each coordinate, then Φ is continuous, i.e., if xn → x and yn → y, then Φ(xn, yn) → Φ(x, y).

Proof. Using the coordinate-wise linearity, it suffices to consider the case xn → 0 and yn → 0 and to prove simply that Φ(xn, yn) → 0. For every xn ∈ X1 define Tn : X2 → 𝔽 by Tn(y) = Φ(xn, y), which is linear and continuous; moreover, Tn(y) → 0 as n → ∞, by the continuity of Φ in the first coordinate. Since (Tn)_{n∈ℕ} satisfies the conditions of the Banach–Steinhaus Theorem, the family is uniformly bounded, i.e., there is some C > 0 such that |Tn(y)| ≤ C‖y‖ for every y ∈ X2. In particular, for yn we have

|Φ(xn, yn)| = |Tn(yn)| ≤ C‖yn‖,

which converges to zero as n → ∞. ◻

We cannot drop the hypothesis of completeness of the normed space, as can be seen in the following example (incidentally, the example can be used to prove that (C[0, π], ‖·‖1) is not complete, invoking Corollary 10.8).

Example 10.9. Let Y = (C[0, π], ‖·‖1), where ‖f‖1 = ∫_0^π |f(x)| dx for f ∈ Y. Denote by Φ(·,·) the bilinear map

Φ : Y × Y → ℂ, (f, g) ↦ ∫_0^π f(x) g(x) dx,

which is continuous in each variable. Now let us take (fn)_{n∈ℕ} in the following way:

fn(x) = √n sin(nx), if 0 ≤ x ≤ π/n; and fn(x) = 0, otherwise.

It follows that ‖fn‖1 = 2/√n converges to zero as n → ∞, but Φ(fn, fn) = π/2 for every n, so Φ is not continuous! ⊘
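The computations in Example 10.9 can be checked numerically. The following Python sketch is illustrative only (the composite midpoint rule and the grid size are ad hoc choices):

```python
import math

def midpoint(g, a, b, m=20000):
    """Composite midpoint rule for the integral of g over [a, b]."""
    h = (b - a) / m
    return h * sum(g(a + (k + 0.5) * h) for k in range(m))

def f_n(n):
    """f_n from Example 10.9: sqrt(n) sin(n x) on [0, pi/n], zero elsewhere."""
    return lambda x: math.sqrt(n) * math.sin(n * x) if x <= math.pi / n else 0.0

for n in (1, 4, 16, 64):
    fn = f_n(n)
    l1 = midpoint(lambda x: abs(fn(x)), 0.0, math.pi / n)   # ||f_n||_1
    phi = midpoint(lambda x: fn(x) ** 2, 0.0, math.pi / n)  # Phi(f_n, f_n)
    print(n, round(l1, 6), round(phi, 6))
# ||f_n||_1 = 2 / sqrt(n) -> 0, while Phi(f_n, f_n) stays at pi/2 ~ 1.570796.
```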


Example 10.10. Let 1 < p < ∞ and 1/p + 1/q = 1. If ∑_{k=1}^∞ x_k y_k converges for all (y_k)_{k∈ℕ} ∈ ℓ^p, then (x_k)_{k∈ℕ} ∈ ℓ^q. Since we want to obtain a norm estimate from a pointwise estimate, everything points to using the Banach–Steinhaus Theorem of uniform boundedness. Let us define

φ_n(y) = ∑_{k=1}^n x_k y_k and φ(y) = ∑_{k=1}^∞ x_k y_k,

which are linear functionals on ℓ^p. From the Hölder inequality,

|φ_n(y)| ≤ (∑_{k=1}^n |x_k|^q)^{1/q} (∑_{k=1}^n |y_k|^p)^{1/p} ≤ (∑_{k=1}^n |x_k|^q)^{1/q} ‖y‖_{ℓ^p},

from which it follows that φ_n is continuous with the bound

‖φ_n‖ ≤ (∑_{k=1}^n |x_k|^q)^{1/q}.

Since φ_n converges pointwise to φ, from the Banach–Steinhaus Theorem (Corollary 10.4) we have that φ is continuous; therefore, for every sequence y = (y_k)_{k∈ℕ} ∈ ℓ^p, we have

|∑_{k=1}^∞ x_k y_k| ≤ ‖φ‖ ‖y‖_{ℓ^p}.   (10.2)

Now, for fixed n ∈ ℕ, let us define (y_k) in the following way:

y_k = x_k |x_k|^{q–2}, if 1 ≤ k ≤ n and x_k ≠ 0; and y_k = 0, otherwise.

The constructed sequence (y_k) belongs to ℓ^p (indeed |y_k|^p = |x_k|^{(q–1)p} = |x_k|^q). From eq. (10.2) we obtain

∑_{k=1}^n |x_k|^q ≤ ‖φ‖ (∑_{k=1}^n |x_k|^q)^{1/p},

from which it follows that

(∑_{k=1}^n |x_k|^q)^{1/q} ≤ ‖φ‖.   (10.3)

Since the bound (10.3) is uniform in n, we get ‖x‖_{ℓ^q} ≤ ‖φ‖, which entails that x = (x_k)_{k∈ℕ} ∈ ℓ^q.
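A numerical illustration of Example 10.10 in the contrapositive direction (a hedged sketch; the choices x_k = 1/√k and p = q = 2 are ours): when x ∉ ℓ^q, the norms ‖φ_n‖ computed from the bound above must blow up, and the maximizing sequence y from the text attains them.

```python
import math

def phi_n_norm(n):
    """||phi_n|| = (sum_{k<=n} x_k^2)^(1/2) for x_k = 1/sqrt(k), p = q = 2."""
    return math.sqrt(sum(1.0 / k for k in range(1, n + 1)))

# x is NOT in l^2, so ||phi_n|| = sqrt(harmonic sum) grows without bound:
print([round(phi_n_norm(10 ** j), 3) for j in range(1, 5)])

# The maximizing y from the text, y_k = x_k |x_k|^{q-2} = x_k for q = 2,
# truncated at n, turns inequality (10.3) into an equality:
n = 1000
x = [1.0 / math.sqrt(k) for k in range(1, n + 1)]
y = x[:]                                    # y_k = x_k |x_k|^{q-2} with q = 2
phi_ny = sum(xk * yk for xk, yk in zip(x, y))
y_norm = math.sqrt(sum(yk ** 2 for yk in y))
print(abs(phi_ny - y_norm * phi_n_norm(n)) < 1e-9)   # True: the bound is sharp
```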

We end this section with the so-called Szegő Theorem, which is an application of the Banach–Steinhaus Theorem. This theorem gives us sufficient conditions guaranteeing that numerical integration formulas (sometimes also called quadrature formulas) are convergent. By an approximate quadrature of the integral ∫_a^b f(t) dt we mean a sum of the form ∑_{k=0}^n w_k f(t_k), where the t_k are elements of a partition of [a, b], viz. a = t0 < t1 < t2 < ⋯ < tn = b, and the w_k are weights, which are independent of f. A natural question is to know under what conditions the sequence of approximate quadratures (qn(f))_{n∈ℕ} converges to ∫_a^b f(t) dt, with the approximate quadratures given by

qn(f) = ∑_{k=0}^n w_k^{(n)} f(t_k^{(n)}),  n ∈ ℕ.   (10.4)

Theorem 10.11 (Szegő's Convergence Theorem). For all f ∈ C[a, b] the formula of approximate quadratures qn(f), given in eq. (10.4), converges to ∫_a^b f(t) dt if
(a) there exists M > 0 such that ∑_{k=0}^n |w_k^{(n)}| ≤ M, for all n ∈ ℕ, and
(b) qn(p) → ∫_a^b p(t) dt for every polynomial p.
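As a hedged illustration of Theorem 10.11, the composite trapezoid rule satisfies both hypotheses: its weights are positive and sum to b – a, giving (a) with M = b – a, and it converges on polynomials, giving (b). The sketch below (our own example, with f = exp on [0, 1]) shows the quadrature error shrinking:

```python
import math

def trapezoid_rule(f, a, b, n):
    """q_n(f): composite trapezoid quadrature with n + 1 equally spaced nodes."""
    h = (b - a) / n
    w = [h] * (n + 1)
    w[0] = w[-1] = h / 2            # positive weights, sum_k |w_k| = b - a
    t = [a + k * h for k in range(n + 1)]
    return sum(wk * f(tk) for wk, tk in zip(w, t))

a, b = 0.0, 1.0
exact = math.e - 1.0                 # integral of exp over [0, 1]
errs = []
for n in (4, 16, 64, 256):
    errs.append(abs(trapezoid_rule(math.exp, a, b, n) - exact))
    print(n, errs[-1])
# The errors decrease toward 0, as Theorem 10.11 predicts for continuous f.
```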

10.1 Problems

10.1. Prove that the sequence of complex functionals fn : (c00, ‖·‖∞) → ℂ given by fn(x) = n x_n is pointwise bounded but nonetheless not uniformly bounded. Explain why this does not contradict the Banach–Steinhaus Theorem 10.2.
10.2. Prove the following result: if X is a normed space and (xj)_{j∈J} is a family of elements of X such that for all F ∈ X* the set {F(xj) : j ∈ J} is bounded, then the family (xj)_{j∈J} is bounded in the norm of X. Hint: Study the family Q(xj) of functionals in X**, where Q is the canonical embedding Q : X → X** given in §15.2.
10.3. Let X be a Banach space. Prove that if (fn)_{n∈ℕ} ⊆ X* is a sequence such that lim_{n→∞} fn(x) = f(x) for all x ∈ X, then f ∈ X* and ‖f‖ ≤ lim inf_{n→∞} ‖fn‖.
10.4. Prove Szegő's Theorem 10.11.

11 Open Mapping Theorem

Learning Targets
✓ Learn the Open Mapping Theorem.
✓ Get to know about the Banach Isomorphism Theorem.
✓ Understand the applicability of the Open Mapping Theorem as well as the Banach Isomorphism Theorem.

The notion of an open mapping, at least in the framework of Euclidean space, is well known. A mapping f : X → Y, where X ⊆ ℝ^n and Y ⊆ ℝ^m, is said to be an open mapping if the image of any open set in X under the mapping f is open in Y. A moment's reflection shows that if f is a homeomorphism then f is an open mapping. A much more profound result, due to Brouwer and commonly designated invariance of domain, states that if f : U → ℝ^n is a continuous injective mapping defined on an open set U ⊆ ℝ^n, then f(U) is open; see Ref. [43] for an elementary proof. Let us give some concrete examples.

Example 11.1. Define X = [–1, 0] ∪ (1, 2] and f : X → [0, 4] with f(x) = x². Then f is a continuous bijection, but its inverse f⁻¹ : [0, 4] → X given by

f⁻¹(x) = –√x, if 0 ≤ x ≤ 1; and f⁻¹(x) = √x, if 1 < x ≤ 4,

is not continuous. This shows that although f is a continuous mapping it is not a homeomorphism; therefore, it is not an open mapping, see Problem 11.1. ⊘

At first sight we could think that continuity is necessary for a mapping to be open. This is not true, as the following example shows.

Example 11.2. Let us take the mapping

f : ℝ² → ℝ, (x, y) ↦ sign(y) + x.

This is a discontinuous mapping, but it is nonetheless open. ⊘

Let us now give the general (topological) definition of an open mapping.

Definition 11.3. Let X and Y be topological spaces and f : X → Y a map. Then f is an open mapping if, for every open set U in X, its image under f is open in Y.
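Example 11.1 can be made concrete with a short Python sketch (illustrative only): the inverse jumps between the two pieces of X at x = 1, so f⁻¹ is not continuous there.

```python
import math

def f_inv(x):
    """Inverse of f(x) = x^2 on X = [-1, 0] U (1, 2], as in Example 11.1."""
    return -math.sqrt(x) if 0.0 <= x <= 1.0 else math.sqrt(x)

# f_inv jumps at x = 1: values just below and just above 1 land in the
# two different pieces of X, although f itself is a continuous bijection.
print(f_inv(1.0))         # -1.0  (left piece [-1, 0])
print(f_inv(1.0 + 1e-9))  # just above 1 (right piece (1, 2])
```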


We now show that for linear continuous maps between Banach spaces the question of the openness of the mapping is easier; this is the content of the celebrated Open Mapping Theorem, also known as the Banach–Schauder Theorem.

Theorem 11.4 (Open Mapping Theorem). Let X1 and X2 be Banach spaces. If T ∈ B(X1, X2) and T is surjective, then T is an open mapping.

Before proving the Open Mapping Theorem, we will need some auxiliary results. In what follows, cl S denotes the closure of a set S.

Lemma 11.5. Let T ∈ B(V, W) be a surjective map, where V and W are Banach spaces. Then the following properties hold:
(a) for every r, s > 0 we have T(B(0; r)) = (r/s) T(B(0; s));
(b) for every x ∈ V and r > 0, we have T(B(x; r)) = Tx + T(B(0; r));
(c) if B(0; ε) ⊆ cl T(B(0; r)), then B(0; aε) ⊆ cl T(B(0; ar)) for every a > 0 (and likewise without the closures);
(d) if B(y0; ε) ⊆ cl T(B(0; r)), then there exists δ > 0 such that B(0; δ) ⊆ cl T(B(0; r)).

Proof. Items (a), (b) and (c) are almost immediate. Let us prove (d). Choose x1 ∈ B(0; r) such that ‖y1 – y0‖ < ε/2, where y1 = Tx1. Since

B(0; ε/2) = B(y1; ε/2) – y1 ⊆ B(y0; ε) – Tx1 ⊆ cl T(B(0; r)) – Tx1 ⊆ cl T(B(0; r) – x1) ⊆ cl T(B(0; 2r)),   (11.1)

which follows from the fact that

B(y1; ε/2) ⊆ B(y0; ε) ⊆ cl T(B(0; r)),

we get B(0; ε/2) ⊆ cl T(B(0; 2r)). From (c) we get that B(0; δ) ⊆ cl T(B(0; r)) with δ = ε/4. ◻

Lemma 11.6. Let Y1 and Y2 be normed spaces. If T ∈ B(Y1, Y2) and there exists r > 0 such that the interior of T(B(0; r)) is nonempty, then T is an open mapping.

Proof. To show that T is an open mapping, it suffices to show that for all v ∈ Y1 and all ρ > 0 there exists a δ > 0 such that T(B(v, ρ)) ⊇ B(T(v), δ). Since

T(B(v, ρ)) = T(v) + T(B(0, ρ)) = T(v) + ρ T(B(0, 1))

and

B(T(v), δ) = T(v) + B(0, δ),

the result follows if there exists a δ > 0 such that B(0, δ) ⊆ T(B(0, 1)). From the hypothesis, int(T(B(0, r))) ≠ ∅, which implies that there exists λ > 0 such that B(0, λ) ⊆ T(B(0, r)) (the set T(B(0, r)) is convex and symmetric, so having interior points it contains a ball centered at the origin), from which, together with Lemma 11.5(c), we get

B(0, λ/r) ⊆ T(B(0, 1)),

which ends the proof. ◻

We are now in a position to prove the Open Mapping Theorem.

Proof of Theorem 11.4. Write V = X1 and W = X2. Since T is surjective we can write

W = ⋃_{k=1}^∞ cl T(B(0, k)),

from which, invoking Baire's Theorem, we know that at least one of the closed sets cl T(B(0, k)) has nonempty interior, i.e., there are w ∈ W and ε > 0 such that B(w, ε) ⊆ cl T(B(0, k)). From Lemma 11.5(d), there exists a δ > 0 such that B(0, δ) ⊆ cl T(B(0, k)), and from Lemma 11.5(c) we can even suppose (with a different δ) that B(0, δ) ⊆ cl T(B(0, 1)). If we can show that cl T(B(0, 1)) ⊆ T(B(0, 2)), then T(B(0, 2)) has nonempty interior and, from Lemma 11.6, we get that T is an open mapping.

Let us take an arbitrary w ∈ cl T(B(0, 1)) and show that w ∈ T(B(0, 2)). Let us choose v1 ∈ B(0, 1) such that

(w – T(v1)) ∈ B(0, δ/2) ⊆ cl T(B(0, 1/2)),

where we used Lemma 11.5(c). In the same way, we choose v2 ∈ B(0, 1/2) such that

(w – T(v1) – T(v2)) ∈ B(0, δ/2²) ⊆ cl T(B(0, 1/2²)),

and, more generally, by induction we choose vn ∈ B(0, 1/2^{n–1}) for which

(w – ∑_{j=1}^n T(vj)) ∈ B(0, δ/2^n) ⊆ cl T(B(0, 1/2^n)).

Since the partial sums sn := ∑_{j=1}^n vj form a Cauchy sequence and V is a Banach space, the limit lim_{n→∞} sn =: v exists. Due to the fact that T is continuous we have that w = T(v), and since ‖v‖ ≤ ∑_{j=1}^∞ ‖vj‖ < 2 it follows that cl T(B(0, 1)) ⊆ T(B(0, 2)), ending the proof. ◻

An immediate corollary of the Open Mapping Theorem is the so-called Banach Isomorphism Theorem, which shows that for a linear bijective continuous map acting between Banach spaces the inverse map exists and is also a linear continuous map.

Corollary 11.7 (Banach Isomorphism Theorem). Let X1 and X2 be Banach spaces. If T ∈ B(X1, X2) is a bijection, then T⁻¹ ∈ B(X2, X1).

11.1 Problems

11.1. Prove that if f : X ⊆ ℝ^n → Y ⊆ ℝ^m is a bijection, then the following statements are equivalent:
1. f is a homeomorphism.
2. f is continuous and open.
11.2. Prove the following result: if X is a normed space and (xj)_{j∈J} is a family of elements of X such that for all F ∈ X* the set {F(xj) : j ∈ J} is bounded, then the family (xj)_{j∈J} is bounded in the norm of X. Hint: Study the family Q(xj) of functionals in X**, where Q is the canonical embedding Q : X → X** given in §15.2.

12 Closed Graph Theorem

Learning Targets
✓ Learn the Closed Graph Theorem.

Let us recall that a map T : X → Y between metric spaces X and Y is continuous if, for all xn → x, we have (1) the limit of T(xn) exists; and (2) T(xn) → T(x). From the above, an easy way to construct discontinuous operators is to impose that there exists some sequence (xn)_{n∈ℕ} for which xn → x but T(xn) ↛ T(x). The notion of a closed operator avoids this nuisance, in the sense that if the limit of T(xn) exists it must converge to T(x). This notion can be given in a somewhat different form, in terms of the closedness of the graph of the mapping, thus justifying the nomenclature; see Definition 12.1. In a sense, among discontinuous operators the closed operators are the ones that most resemble continuous ones. Let us give the formal definition.

Definition 12.1. A linear operator T : X → Y is a closed operator if, whenever xn → x with (xn)_{n∈ℕ} ⊆ dom(T) and Txn → y, we have x ∈ dom(T) and Tx = y.

It is instructive to ponder the difference between continuous and closed operators. The image of a convergent sequence under a continuous map always converges, i.e., if xn → x then we always have Txn → Tx. For a closed operator the situation is milder: the image of a convergent sequence under a closed map may or may not converge, but if it converges, Txn → y, then it must satisfy y = Tx.

Example 12.2 (Non-Closed and Bounded Operator). Let id : ℚ → ℝ be the identity operator id(x) = x. Take xn = trunc(√2, n), where trunc(x, n) is x truncated to n decimal digits. Then xn → √2 and id(xn) → √2, but √2 ∉ ℚ = dom(id), from which it follows that the identity operator is not a closed operator albeit a bounded one. ⊘

The following theorem is an almost immediate consequence of the definition.

Theorem 12.3. Let X1 and X2 be Banach spaces. Then every operator T ∈ B(X1, X2) is closed.

Proof. Let xn → x with Txn → y. Since T is continuous, then Txn → Tx, and hence y = Tx. This shows that the operator T is indeed closed. ◻

Let X and Y be normed spaces. Then the Cartesian product X × Y is a vector space with the operations

(a) (x1, y1) + (x2, y2) = (x1 + x2, y1 + y2), for every (x1, y1), (x2, y2) ∈ X × Y, and
(b) k(x, y) = (kx, ky), for every (x, y) ∈ X × Y and every k ∈ 𝔽.

Moreover, this space is a normed space with the norm ‖(x, y)‖ := ‖x‖X + ‖y‖Y, see Problem 12.1.

Definition 12.4. Let X and Y be normed spaces. The graph of a linear operator T : X → Y is the vector subspace

graph(T) = {(x, Tx) ∈ X × Y : x ∈ dom T}.

With the notion of the graph of a map, we can show the Closed Graph Theorem.

Theorem 12.5 (Closed Graph Theorem). Let X1 and X2 be Banach spaces. If T : X1 → X2 is a linear operator, then T is continuous if, and only if, T is closed.

Proof. Suppose first that T is continuous, and define the function Φ : X1 × X2 → ℝ given by Φ(x1, x2) = ‖T(x1) – x2‖, which is a continuous map. Therefore the set graph(T) = Φ⁻¹({0}) is closed, being the inverse image of the closed set {0} under a continuous map, i.e., T is closed (see Problem 12.2).

Conversely, let us suppose that the set graph(T) is closed. The space (X1 × X2, ‖·‖_{X1×X2}), where ‖·‖_{X1×X2} := ‖·‖X1 + ‖·‖X2, is a Banach space, and graph(T), being a closed subspace, is itself a Banach space. The map

F1 : graph(T) → X1, (x1, T(x1)) ↦ x1,

is a linear continuous bijection. The continuity follows from the following inequality:

‖F1(x1, T(x1))‖X1 = ‖x1‖X1 ≤ ‖(x1, T(x1))‖_{X1×X2}.

Using the Banach Isomorphism Theorem (Corollary 11.7), it follows that F1 has a continuous inverse operator, from which we have

‖(x1, Tx1)‖_{X1×X2} ≤ C‖x1‖X1.


Finally, we have

‖T(x1)‖X2 ≤ ‖(x1, Tx1)‖_{X1×X2} ≤ C‖x1‖X1,

which was the desired result. ◻

One of the main strengths of the Closed Graph Theorem resides in the following fact. To check the continuity of an operator we need to verify two conditions: (i) if xn → x then (Txn) converges to some y; and (ii) y = Tx. Condition (i) can be difficult to verify. On the other hand, for closed operators we only need to check (ii); therefore, by the Closed Graph Theorem, for linear operators between Banach spaces we only need to check condition (ii), since (i) comes gratis for closed operators. Let us give an example of this approach.

Definition 12.6 (Symmetric Operator). Let T : H → H be an operator in a Hilbert space H. We say that T is symmetric if ⟨Tv, w⟩ = ⟨v, Tw⟩ for all v, w ∈ H.

Theorem 12.7 (Hellinger–Toeplitz Theorem). Let T : H → H be a linear symmetric operator defined on the whole Hilbert space H. Then T is continuous.

Proof. By the Closed Graph Theorem it suffices to show that T is closed, so let xn → x and Txn → y. For every z ∈ H we have

⟨y, z⟩ = lim_{n→∞} ⟨Txn, z⟩ = lim_{n→∞} ⟨xn, Tz⟩ = ⟨x, Tz⟩ = ⟨Tx, z⟩,

from which it follows that y = Tx, since taking z = Tx – y we get ‖Tx – y‖² = 0, i.e., Tx – y = 0. ◻
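As a companion to the discussion above, the differentiation operator is the standard example of a closed but unbounded operator. The Python sketch below (illustrative only; the grid and the choice x_n(t) = sin(nt)/n are ours) shows condition (i) failing while closedness is untouched: x_n → 0 uniformly, yet Tx_n = cos(nt) does not converge.

```python
import math

ts = [k / 1000 for k in range(1001)]           # grid on [0, 1]

def sup_norm(g):
    """Approximate sup norm on C[0, 1] via the grid."""
    return max(abs(g(t)) for t in ts)

results = []
for n in (1, 10, 100):
    x_n = lambda t, n=n: math.sin(n * t) / n   # x_n -> 0 uniformly
    Tx_n = lambda t, n=n: math.cos(n * t)      # derivative of x_n
    results.append((n, sup_norm(x_n), sup_norm(Tx_n)))
    print(results[-1])
# ||x_n|| -> 0 while ||T x_n|| = 1 for every n: (T x_n) has no limit, so the
# closedness of T = d/dt is never tested; T is unbounded since
# ||T x_n|| / ||x_n|| = n -> infinity.
```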

12.1 Problems

12.1. Prove that the space (X × Y, ‖·‖), where ‖(x, y)‖ := ‖x‖X + ‖y‖Y, is a normed space.
12.2. Prove that for a linear operator T : X → Y the following assertions are equivalent:
(a) The graph of the operator T is closed, i.e., graph(T) is a closed set.
(b) For every convergent sequence (xn) ⊆ dom T with xn → x ∈ X such that the sequence (Txn) ⊆ Y converges, Txn → y, we have x ∈ dom T and y = Tx.
12.3. Let u ∈ L∞[0, 1] be such that u(x) > 0 for all x ∈ [0, 1]. Let us define the multiplication operator Mu : Lp[0, 1] → Lp[0, 1] given by f ↦ uf. Prove that Mu is a bounded linear operator.
12.4. Let A : C([0, 1]) → C([0, 1]), x(t) ↦ x(t)/t, where D(A) = {x ∈ C([0, 1]) : lim_{t→0⁺} t⁻¹x(t) exists}. Prove that A is a closed operator.

13 Hahn–Banach Theorem

Learning Targets
✓ Learn the Hahn–Banach Extension Theorem.
✓ Understand the definition and usefulness of the Minkowski functional.
✓ Get to know about the Hahn–Banach Separation Theorems.

13.1 Extension Theorems

The ability to extend linear functionals is a very useful tool. We recall the Lebesgue integral, which is first defined on the set of all step functions and afterwards extended to the set of all Lebesgue measurable functions. In many situations we need to construct certain linear functionals satisfying some a priori conditions. If we can construct such a linear functional with the prescribed conditions on a small space, then the Hahn–Banach extension theorem guarantees that there exists an extension of that linear functional to the entire space.

Definition 13.1. Let X be a real vector space and p : X → ℝ. If, for every x1, x2 ∈ X and every α ≥ 0, we have
(a) p(x1 + x2) ≤ p(x1) + p(x2), and
(b) p(αx1) = αp(x1),
then the functional p is called a Banach functional.

We now show an extension result that will be used in the proof of the Hahn–Banach Theorem 13.3.

Lemma 13.2. Let X be a real vector space and p a Banach functional on X. If φ : Y → ℝ is a linear functional, where Y is a subspace of X, such that φ(y) ≤ p(y) for every y ∈ Y, then there exists a linear extension Φ : E → ℝ of φ (where E := Y ⊕ span{ξ} and ξ ∈ X\Y) such that Φ(x) ≤ p(x) for every x ∈ E.

Proof. Let us denote by φ̃ an arbitrary linear extension of φ to E. Then for every w ∈ E, which can be written as w = y + λξ, for λ ∈ ℝ and y ∈ Y, we have

φ̃(w) = φ̃(y + λξ) = φ(y) + λφ̃(ξ), for every y ∈ Y, λ ∈ ℝ.

Now we must choose φ̃(ξ) in such a way that the inequality φ̃ ≤ p holds. For every u1, u2 ∈ Y we have

φ(u1) + φ(u2) = φ(u1 + u2) ≤ p(u1 + u2) ≤ p(u1 – ξ) + p(u2 + ξ),

from where we get

φ(u1) – p(u1 – ξ) ≤ p(u2 + ξ) – φ(u2).

Then, there is λ0 ∈ ℝ such that

sup_{u1∈Y} {φ(u1) – p(u1 – ξ)} ≤ λ0 ≤ inf_{u2∈Y} {p(u2 + ξ) – φ(u2)}.

We then define the linear extension φ̃ of φ by φ̃(ξ) = λ0. We now verify the inequality φ̃(w) ≤ p(w) for every w = y + λξ ∈ E (the case λ = 0 is immediate). If λ > 0, then

φ̃(w) = φ(y) + λλ0 ≤ φ(y) + λ[p(y/λ + ξ) – φ(y/λ)] = p(y + λξ) = p(w).

In the case λ < 0 we have

φ̃(y + λξ) = φ̃(y – |λ|ξ) = φ(y) – |λ|λ0
 ≤ φ(y) – |λ|[φ(y/|λ|) – p(y/|λ| – ξ)]
 = |λ| p(y/|λ| – ξ) = p(y + λξ),

which ends the proof. ◻
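The interval of admissible values λ0 in the proof of Lemma 13.2 can be computed numerically in a toy case. The sketch below is hedged: the space ℝ² with the Euclidean norm as p, the subspace Y = span{(1, 0)}, the functional φ(t, 0) = ct with |c| ≤ 1 (so that φ ≤ p on Y), the vector ξ = (0, 1) and the grid are all our own choices.

```python
import math

c = 0.6
p = lambda t, s: math.hypot(t, s)                  # Euclidean Banach functional

ts = [-10.0 + k * 1e-4 for k in range(200001)]     # grid for u = (t, 0) in Y
lo = max(c * t - p(t, -1.0) for t in ts)           # sup_{u1} phi(u1) - p(u1 - xi)
hi = min(p(t, 1.0) - c * t for t in ts)            # inf_{u2} p(u2 + xi) - phi(u2)

print(round(lo, 4), round(hi, 4))
# Any lambda_0 in [lo, hi], here [-sqrt(1 - c^2), sqrt(1 - c^2)] = [-0.8, 0.8],
# yields an extension phi_tilde with phi_tilde(xi) = lambda_0 dominated by p.
```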

In the case of finite dimensions, the previous lemma states that we can always extend a linear functional, enlarging its domain by one dimension. Applying Lemma 13.2 a finite number of times, we can always enlarge the domain by a finite number of dimensions. But what about infinite-dimensional spaces? The answer is given by the Hahn–Banach Theorem.

Theorem 13.3 (Hahn–Banach Theorem—Real Version). Let X be a real vector space and p a Banach functional on X. If φ : Y → ℝ is a linear functional, where Y is a subspace of X, such that φ(y) ≤ p(y) for every y ∈ Y, then there exists a linear extension Φ : X → ℝ of φ such that

Φ(x) ≤ p(x), for every x ∈ X.

The map Φ is called the Hahn–Banach extension of φ.

Proof. Consider the set

S = {φt : Yt → ℝ : φt is a linear extension of φ, Y ⊆ Yt ⊆ X, and φt(y) ≤ p(y), ∀y ∈ Yt}.

This set is not empty, for φ ∈ S, and is partially ordered by the relation

φt ≺ φs ⇔ Yt ⊆ Ys and φt(y) = φs(y), for every y ∈ Yt.

If {φt : t ∈ J} is a totally ordered subset of S and Γ = ⋃_{t∈J} Yt, then the map

ψ : Γ → ℝ, ψ(y) := φt(y), if y ∈ Yt,

is well defined and ψ(y) ≤ p(y) for every y ∈ Γ. Notice that for every t ∈ J we have φt ≺ ψ, which implies that every totally ordered collection in S has an upper bound. By Zorn's Lemma, S has a maximal element Φ defined on a set X0 ⊆ X which satisfies the inequality Φ(x) ≤ p(x) if x ∈ X0. The goal now is to prove that X0 = X, which proves the theorem. Suppose X0 ≠ X and take a nonzero element x1 ∈ X\X0. By Lemma 13.2 we can extend the linear functional Φ to X0 ⊕ span{x1}, still dominated by p. This contradicts the maximality of Φ, which ends the proof. ◻

We now want to extend the Hahn–Banach Theorem to complex-valued functionals. In what follows we need the definition of a real linear functional on a complex vector space.

Definition 13.4 (Real Linear Functional). We say that f : V → ℝ is a real linear functional on a complex vector space V when it satisfies
(a) f(v + w) = f(v) + f(w), for all v, w ∈ V, and
(b) f(αv) = αf(v), for all α ∈ ℝ.

For the proof of the complex version of the Hahn–Banach Theorem we will also need the following lemma. The proof is left to the reader.

Lemma 13.5. Let X be a complex vector space.
(a) If f : X → ℝ is a real linear functional, then the map g : X → ℂ defined as g(x) = f(x) – i f(ix), for x ∈ X, is a complex linear functional.


(b) If g : X → ℂ is a complex linear functional, then there exists a real linear functional f : X → ℝ such that g(x) = f(x) – i f(ix), for x ∈ X.

The proof of the complex version of the Hahn–Banach Theorem will rely on the real version and Lemma 13.5.

Theorem 13.6 (Hahn–Banach Theorem—Complex Version). Let X be a complex vector space and p : X → [0, ∞) a map satisfying:
(a) p(x1 + x2) ≤ p(x1) + p(x2), and
(b) p(λx1) = |λ| p(x1),
for every x1, x2 ∈ X and λ ∈ ℂ. If φ : Y → ℂ is a linear functional, where Y is a subspace of X, such that

|φ(y)| ≤ p(y), for every y ∈ Y,

then there is a linear extension Φ : X → ℂ of φ such that

|Φ(x)| ≤ p(x), for every x ∈ X.

The map Φ is called the Hahn–Banach extension of φ.

Proof. Consider the real linear functional on Y defined by ψ = Re(φ). Since ψ(y) ≤ |φ(y)| ≤ p(y) for every y ∈ Y, we have, using the real version of the Hahn–Banach Theorem (restricting X and Y to real scalar multiplication given the case), that there is a real linear extension Ψ : X → ℝ of ψ with Ψ(x) ≤ p(x) for every x ∈ X. Define Φ : X → ℂ as Φ(x) = Ψ(x) – i Ψ(ix), which, by Lemma 13.5, is a complex linear extension of φ. The last step is to show that Φ is dominated by p. If Φ(x) ≠ 0, then there is 0 ≤ θ < 2π such that Φ(x) = e^{iθ} |Φ(x)|, so, using the fact that Φ is linear, we have

|Φ(x)| = Φ(e^{–iθ} x) = Re(Φ(e^{–iθ} x)) = Ψ(e^{–iθ} x) ≤ p(e^{–iθ} x) = p(x).

Since in the case Φ(x) = 0 the result follows trivially, the proof of the theorem is complete. ◻

The following corollary is in many instances also called the Hahn–Banach Theorem.

Corollary 13.7. Let Y be a subspace of a normed space (X, ‖·‖) and let φ : Y → ℝ be a continuous linear functional. Then there exists a continuous linear functional Φ : X → ℝ which is an extension of φ such that ‖Φ‖ = ‖φ‖.

Proof. It suffices to use Theorem 13.3 with the Banach functional given by p(x) = ‖φ‖‖x‖. ◻


Another interesting corollary of the Hahn–Banach extension theorem given in Corollary 13.7 is the fact that we can construct continuous linear functionals with prescribed conditions, as given below.

Corollary 13.8. Let X be a normed space. For all 0 ≠ x ∈ X we can construct a continuous linear functional φ : X → ℝ such that (a) ‖φ‖ = 1; and (b) φ(x) = ‖x‖.

Proof. Taking Y = [x], φ(αx) = α‖x‖, and using Corollary 13.7, we obtain the result. ◻

With the previous corollary at hand we can show that the norm of an element x of a normed space X can be obtained by taking the maximum of |φ(x)| as φ runs over the unit sphere of the dual space X*.

Corollary 13.9. Let X be a non-null normed space and x ∈ X. Then

‖x‖ = sup_{‖φ‖≤1} {|φ(x)| : φ ∈ X*} = max_{‖φ‖=1} {|φ(x)| : φ ∈ X*}.

Proof. From the definition of the norm of a functional we immediately have |φ(x)| ≤ ‖φ‖‖x‖, from which it follows that sup_{‖φ‖≤1} {|φ(x)| : φ ∈ X*} ≤ ‖x‖. Now the result follows invoking Corollary 13.8. ◻
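A small numerical illustration of Corollary 13.9, in our own toy example (ℝ², ‖·‖1), whose dual norm is ‖·‖∞: no functional in the dual unit ball exceeds ‖x‖1, and the sign functional attains it.

```python
import random

# Corollary 13.9 in (R^2, ||.||_1): functionals phi(u) = a*u1 + b*u2 with
# ||phi|| = max(|a|, |b|) <= 1 never exceed ||x||_1, and phi = sign(x) attains it.

x = (3.0, -4.0)
norm1 = abs(x[0]) + abs(x[1])                       # ||x||_1 = 7

rng = random.Random(0)
best = max(a * x[0] + b * x[1]
           for a, b in ((rng.uniform(-1, 1), rng.uniform(-1, 1))
                        for _ in range(10000)))
print(best <= norm1)                                # True: random phi never win

sign = tuple(1.0 if t >= 0 else -1.0 for t in x)    # the extremal functional
print(sign[0] * x[0] + sign[1] * x[1] == norm1)     # True: the max is attained
```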

13.2 Minkowski Functional

In this section we will study the Minkowski functional, which is a very efficient tool in questions related to norms in vector spaces. We start with some definitions.

Definition 13.10. Let X be a vector space and S a subset of X. Then
(a) We say S is convex if (αx + βy) ∈ S, for every x, y ∈ S and every α, β ∈ ℝ₊ such that α + β = 1.
(b) We say S is balanced if λS = {λx : x ∈ S} ⊆ S, for every λ ∈ 𝔽 with |λ| ≤ 1.
(c) We say S is absolutely convex if it is convex and balanced.
(d) We say S is absorbing if, for every x ∈ X, there is α0 > 0 such that x ∈ λS, for every λ ∈ 𝔽 with |λ| ≥ α0.

If S is balanced, and for every x ∈ X there is μ ∈ 𝔽, μ ≠ 0, such that x ∈ μS, then S is absorbing. The following example is enlightening regarding the definition of the Minkowski functional.


Figure 13.1: Projection of a point x into the unit sphere, x ↦ x/‖x‖.

Example 13.11. Let p : ℝ² → ℝ be given by p(x) = ‖x‖. It is immediate from the properties of the Euclidean norm that p(x + y) ≤ p(x) + p(y) and p(αx) = |α|p(x). Taking the convex set B := {x ∈ ℝ² : p(x) < 1}, we have that when r > ‖x‖ the vector (1/r)x ∈ B, and when r < ‖x‖ the vector (1/r)x ∉ B. From this example we see that the set B can be used to calculate the norm ‖x‖ by the formula

‖x‖ = inf {λ > 0 : (1/λ) x ∈ B}. ⊘

We now define the Minkowski functional formally.

Definition 13.12. Let X be a vector space and S a subset of X. The Minkowski functional of the set S is the functional pS : X → [0, ∞] defined as

pS(x) = inf {ρ > 0 : x ∈ ρS}, if x ∈ ρS for some ρ > 0; and pS(x) = ∞, if x ∉ ρS for every ρ > 0.   (13.1)
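Definition (13.1) can be evaluated numerically for a concrete convex, balanced and absorbing set. The Python sketch below is illustrative only (the ellipse S and the bisection tolerance are our own choices); it recovers the closed form pS(x, y) = √(x²/4 + y²).

```python
import math

def in_S(x, y):
    """Membership in S = {(x, y) : x^2/4 + y^2 <= 1}, a convex balanced set."""
    return x * x / 4.0 + y * y <= 1.0

def minkowski(x, y, hi=1e6, tol=1e-12):
    """Smallest rho > 0 with (x, y) in rho*S, by bisection on rho.

    Bisection applies because membership of (x, y)/rho in S is monotone in rho.
    """
    lo = 0.0
    while hi - lo > tol:
        mid = (lo + hi) / 2
        if in_S(x / mid, y / mid):
            hi = mid
        else:
            lo = mid
    return hi

for (x, y) in [(2.0, 0.0), (0.0, 3.0), (1.0, 1.0)]:
    print(round(minkowski(x, y), 6), round(math.hypot(x / 2.0, y), 6))
# The two columns agree: p_S is exactly the norm whose unit ball is S.
```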

We collect some useful properties of the Minkowski functional.

Theorem 13.13. Let X be a vector space and S a subset of X. Then the Minkowski functional of S has the following properties:
(a) If S is absorbing, then pS(x) < ∞, for every x ∈ X.
(b) For every x ∈ X and every λ ∈ ℝ with λ > 0, we have pS(λx) = λpS(x).
(c) If S is convex, then for every x, y ∈ X, we have pS(x + y) ≤ pS(x) + pS(y).
(d) If S is balanced, then for every x ∈ X and λ ∈ 𝔽, we have pS(λx) = |λ| pS(x).
(e) If S is convex and 0 ∈ S, then we have

{x ∈ X : pS(x) < 1} ⊆ S ⊆ {x ∈ X : pS(x) ≤ 1}.

Proof. (a) Follows directly from the definitions.
(b) Take x ∈ X and λ > 0. Then we have

{ρ > 0 : λx ∈ ρS} = {ρ > 0 : x ∈ (ρ/λ)S} = {λ(ρ/λ) > 0 : x ∈ (ρ/λ)S} = λ{γ > 0 : x ∈ γS},

where the last equality follows by taking γ = ρ/λ. This implies pS(λx) = λpS(x).
(c) Suppose S is convex. Take ρ = α + β, where α, β > 0, x ∈ αS and y ∈ βS. Then x/α, y/β ∈ S, α/(α+β) + β/(α+β) = 1 and, by the convexity of S, (α/(α+β))(x/α) + (β/(α+β))(y/β) ∈ S. Hence,

x + y = (α + β) [(α/(α+β))(x/α) + (β/(α+β))(y/β)] ∈ (α + β)S = ρS.

Thus, for every x, y ∈ X, we proved the validity of the inclusion

{α > 0 : x ∈ αS} + {β > 0 : y ∈ βS} ⊆ {ρ > 0 : (x + y) ∈ ρS}.

From this, it follows that pS(x + y) ≤ pS(x) + pS(y) for every x, y ∈ X.
(d) Suppose S is balanced. Take x ∈ X and λ ∈ 𝔽 such that λ ≠ 0 (notice that if λ = 0, then the result is trivial). We will prove that

{ρ > 0 : λx ∈ ρS} = |λ| {α > 0 : x ∈ αS},

from which it follows immediately that pS(λx) = |λ| pS(x).
(⊆) Let ρ > 0 be such that λx ∈ ρS. Then

x ∈ (ρ/λ)S = (ρ/|λ|)(|λ|/λ)S.

Now, we have that (|λ|/λ)S = S since S is balanced. Hence, if α = ρ/|λ|, then ρ = |λ|α and x ∈ αS.
(⊇) Let α > 0 be such that x ∈ αS. We will show that λx ∈ (|λ|α)S. In fact,

λx ∈ λαS = |λ|α (λ/|λ|)S = |λ|αS,

since (λ/|λ|)S = S due to the fact that S is balanced.
(e) Suppose S is convex and 0 ∈ S. We first prove {x ∈ X : pS(x) < 1} ⊆ S. If pS(x) < 1, then there is 0 < ρ < 1 such that x ∈ ρS or, in other words, x/ρ ∈ S. Since S is convex and contains the origin, then

x = ρ(x/ρ) + (1 – ρ)0 ∈ S.

Now we will prove S ⊆ {x ∈ X : pS(x) ≤ 1}. This follows immediately since if x ∈ S, then 1 ∈ {ρ > 0 : x ∈ ρS}, which implies that pS(x) ≤ 1. ◻

We now show that in any vector space we can define a seminorm.

Theorem 13.14. Let S be a convex, balanced and absorbing set in a vector space X. Then the Minkowski functional defined by eq. (13.1) defines a seminorm on X.

Proof. Since S is absorbing, pS(x) < ∞ for all x ∈ X by Theorem 13.13(a). The homogeneity is just Theorem 13.13(d), and the triangle inequality is Theorem 13.13(c). ◻

13.3 Separation Theorem

There is a connection between linear functionals and hyperplanes (see Theorem 13.16) which at first sight is not immediate. We need to introduce some auxiliary notions.

Definition 13.15 (Hyperplane). Let V be a non-null vector space. A hyperplane of V is a maximal subspace W ≠ V of V in the following sense: if W1 is a subspace of V and W ⊆ W1 , then either W1 = V or W1 = W.

We can characterize the hyperplanes as the kernels of non-null linear functionals.


Theorem 13.16. Let V be a non-null vector space and W a subspace of V. Then W is a hyperplane of V if, and only if, there exists a non-null linear functional φ : V → ℝ such that ker(φ) = W.

Proof. Let W be a hyperplane of V and take v ∈ V\W. From the definition of hyperplane, we know that span(W ∪ {v}) = V. Hence, every x ∈ V can be written as x = w + αv, where w ∈ W and α ∈ ℝ; this decomposition is unique since v ∉ W. Let us now define the following functional:

    φ : V → ℝ,  w + αv ↦ α,

which is a non-null linear functional with ker(φ) = W.

For the other implication, let us take a non-null linear functional φ such that ker(φ) = W ≠ V. Let now W1 be a subspace of V such that ker(φ) ⊆ W1. If ker(φ) ≠ W1, then let us choose ξ ∈ W1\ker(φ). Taking v ∈ V and defining

    w := v – (φ(v)/φ(ξ)) ξ,

we obtain that w ∈ ker(φ) ⊆ W1 since φ(w) = 0. We can now write v = w + (φ(v)/φ(ξ)) ξ, entailing that v ∈ W1; therefore, V = W1, from which we get that W is a hyperplane. ◻

Definition 13.17. Let X be a vector space and S a subset of X. We say that S is an affine subspace of X if S = x0 + Y, where x0 ∈ X and Y is a vector subspace of X.

With the notion of affine subspace we can introduce, mutatis mutandis, the notion of affine hyperplane. In the same way that we characterized the hyperplanes of a vector space via the kernel of a functional, we can characterize the affine hyperplanes by a similar condition, as stated in Theorem 13.18. From Theorems 13.16 and 13.18 we see a connection between hyperplanes and linear functionals. This fact will be fully exploited in the Hahn–Banach Theorem 13.20.

Theorem 13.18. Let V be a non-null vector space and W a subset of V. Then W is an affine hyperplane of V if, and only if, there exist a non-null linear functional φ : V → ℝ and α ∈ ℝ such that {v ∈ V : φ(v) = α} = W.

The proof is left as an exercise. Before stating and proving the geometric version of the Hahn–Banach Theorem we need one more result.


Lemma 13.19. Let H = {x ∈ X : φ(x) = α} be a hyperplane of a normed space X, where φ is a linear functional on X and α ∈ ℝ. Then the hyperplane H is closed if, and only if, the linear functional φ is continuous.

Proof. The necessity is a consequence of the fact that H = φ⁻¹({α}) and the definition of continuity. To show the continuity of φ we rely on the fact that a bounded linear operator is continuous (see Theorem 4.17). Let H be closed, which implies that X\H is a nonempty open set. Due to the openness of X\H, for all x ∈ X\H there exists r_x > 0 such that B(x, r_x) ⊆ X\H. Since φ(x) ≠ α, we can suppose that φ(x) < α. Our goal now is to show that indeed φ(ξ) < α for all ξ ∈ B(x, r_x). Assuming that this is not the case, there exists a point ξ ∈ B(x, r_x) such that φ(ξ) > α. Using the linearity of φ and after some calculation we get φ(ϰ) = α for

    ϰ := [(φ(ξ) – α)/(φ(ξ) – φ(x))] x + [1 – (φ(ξ) – α)/(φ(ξ) – φ(x))] ξ,

and ϰ ∈ B(x, r_x) since it is a convex combination of x and ξ; this contradicts the fact that B(x, r_x) ⊆ X\H. We now obtain, for all w ∈ X with ‖w‖ < 1, that

    (φ(x) – α)/r_x < φ(w) < (α – φ(x))/r_x,    (13.2)

using the fact that φ(x + r_x w) ≤ α for all w ∈ X with ‖w‖ < 1 and the linearity of φ. The boundedness of φ follows from the estimate (13.2) with ‖φ‖ ≤ (α – φ(x))/r_x, which ends the proof. ◻

With all the above preliminary results we will state and prove the geometric version of the Hahn–Banach Theorem, which is also referred to as the Separation Theorem or even the Minkowski–Ascoli–Mazur Theorem.

Theorem 13.20 (Hahn–Banach Theorem—Geometric Version). Let X be a normed space, S an absorbing convex open subset of X, and T an affine subspace of X such that S ∩ T = ∅. Then, there exists a closed affine hyperplane H such that T ⊆ H and H ∩ S = ∅.

Proof. Making a translation if necessary, we may suppose that 0 ∈ S. Let our affine space T be given in the following way: T = x0 + Y, where Y is a vector subspace of X, x0 ∈ X and x0 ∉ Y. Since S is an absorbing and convex set, the Minkowski functional of S is a positively homogeneous and subadditive function which satisfies the following relation:

    {x ∈ X : pS(x) < 1} ⊆ S ⊆ {x ∈ X : pS(x) ≤ 1}.    (13.3)

Let us consider the subspace span(Y ∪ {x0}) = {λx0 + y : λ ∈ ℝ and y ∈ Y} and the linear functional

    F : span(Y ∪ {x0}) → ℝ,  λx0 + y ↦ λ.

Then T = F⁻¹(1), and we now show, using a contradiction argument, that F(x) ≤ pS(x) for all x ∈ span(Y ∪ {x0}). Let us suppose that there exist λ1 ∈ ℝ and y1 ∈ Y such that F(λ1x0 + y1) > pS(λ1x0 + y1). Taking

    x = (λ1x0 + y1)/F(λ1x0 + y1),

we have that F(x) = 1, which implies that x ∈ T = F⁻¹(1). Moreover, x ∈ S by eq. (13.3) and the fact that pS(x) < 1. The contradiction S ∩ T ≠ ∅ proves the inequality.

By the Hahn–Banach Theorem 13.3 and the fact that F(x) ≤ pS(x) for every x ∈ span(Y ∪ {x0}), there exists a linear extension F̃ : X → ℝ of F such that F̃(x) ≤ pS(x) for every x ∈ X. If we define H = {x ∈ X : F̃(x) = 1}, then H is an affine hyperplane such that T ⊆ H. On the other hand, we have that S = {x ∈ X : pS(x) < 1}, since S is an open set; then S ∩ H = ∅. The functional F̃ is continuous due to Corollary 13.7, and the closedness of H now follows from Lemma 13.19. This completes the proof if X is a real normed space. ◻

We now prove another version of the Hahn–Banach Theorem, the so-called Hahn–Banach Separation Theorem, which is sometimes called the Eidelheit Separation Theorem.

Theorem 13.21 (Hahn–Banach Separation Theorem). Let X be a normed space and U, V ⊆ X convex sets such that U, V ≠ ∅, U ∩ V = ∅ and U is open. Then there is a linear functional f ∈ X* such that f(U) ∩ f(V) = ∅.

Proof. For every x ∈ X, the set U – x = {u – x : u ∈ U} is open in X. Then U – V = ⋃_{v∈V} (U – v) is open in X. Moreover, the convexity of U and V implies that U – V is also convex. On the other hand, since U and V are disjoint, we have 0 ∉ U – V. By the geometric form of the Hahn–Banach Theorem, there is a closed hyperplane H, containing the element 0, such that H ∩ (U – V) = ∅. Then, if f ∈ X* is a continuous linear functional with ker(f) = H, we get f(U) ∩ f(V) = ∅, which concludes the proof. ◻

The necessity of having convex sets in the separation theorem stems from the fact that two sets can be disjoint but intertwined in such a way that no hyperplane separates them, as is clear in Fig. 13.2.


Figure 13.2: Two sets that cannot be separated by a hyperplane.

Figure 13.3: Strictly convex normed space (labels in the figure: x0, x1 and their midpoint x2 = (x0 + x1)/2).

We end this section with a theorem regarding the uniqueness of the extension provided by the Hahn–Banach Extension Theorem, but first we need to recall the concept of a strictly convex normed space.

Definition 13.22 (Strictly Convex Normed Space). Let X be a normed space. We say that X is strictly convex if

    ‖(x0 + x1)/2‖ < 1

for all x0, x1 ∈ X such that x0 ≠ x1 and ‖x0‖ = ‖x1‖ = 1.

We can see the notion of strict convexity geometrically in Fig. 13.3.

Theorem 13.23. Let X be a normed vector space. If X* is strictly convex, then the norm-preserving linear extension given by the Hahn–Banach Theorem is unique.

Proof. Let M be a subspace of X and let us take a continuous linear functional F ∈ M* with F ≠ 0 (if F = 0, the only norm-preserving extension is the null functional). Let F̃1 ∈ X* and F̃2 ∈ X* be two norm-preserving linear extensions of F and suppose, towards a contradiction, that F̃1 ≠ F̃2. The average (F̃1 + F̃2)/2 ∈ X* is again an extension of F, hence

    ‖F‖ ≤ ‖(F̃1 + F̃2)/2‖,    (13.4)

but on the other hand we have

    ‖(F̃1 + F̃2)/2‖ ≤ (‖F̃1‖ + ‖F̃2‖)/2 = ‖F‖.    (13.5)

From eqs. (13.4) and (13.5) we get ‖(F̃1 + F̃2)/2‖ = ‖F‖, so the distinct unit vectors F̃1/‖F‖ and F̃2/‖F‖ have a midpoint of norm 1, contradicting the strict convexity of X*. Hence F̃1 = F̃2. ◻

13.4 Applications of the Hahn–Banach Theorem

In this section we obtain various results using the Hahn–Banach Theorem. The following theorem is an extension, in a sense, of Corollary 13.8. The proof will follow in the same vein; i.e., we construct a continuous linear functional with some prescribed conditions on a natural subspace and then extend it to the entire space, resorting to the Hahn–Banach Extension Theorem.

Theorem 13.24. Let X be a normed space and Y a proper closed vector subspace. If ξ ∈ X\Y and δ = d(ξ, Y) := inf_{y∈Y} ‖ξ – y‖, then there exists φ ∈ X* such that
(a) ‖φ‖ = 1,
(b) φ(ξ) = δ, and
(c) φ(y) = 0 for all y ∈ Y.

Proof. Let E := span(Y ∪ {ξ}). Let us define φ : E → ℝ given by φ(w) = αδ, where w := y + αξ and δ = d(ξ, Y). The conditions (b) and (c) are immediate to check. Let us now check that (a) is satisfied. Firstly, we have that ‖φ‖ ≤ 1 since, taking y + αξ =: w ∈ E with α ≠ 0, we get

    ‖w‖ = ‖y + αξ‖ = |α| ‖y/(–α) – ξ‖ ≥ δ|α| = |φ(w)|,

and in the case α = 0 the inequality |φ(w)| ≤ ‖w‖ is immediate. To prove that the norm attains the value 1 we use a routine argument. Given ε > 0 we know that there exists y_ε ∈ Y such that δ ≤ ‖ξ – y_ε‖ ≤ δ + ε. Defining

    w_ε := (ξ – y_ε)/‖ξ – y_ε‖,

we have that w_ε ∈ E and moreover ‖w_ε‖ = 1. Since

    φ(w_ε) = δ/‖ξ – y_ε‖ ≥ δ/(δ + ε)

and ε > 0 is arbitrary, we get that ‖φ‖ ≥ 1, which, combined with the previous estimate ‖φ‖ ≤ 1, gives ‖φ‖ = 1. Using now the Hahn–Banach Extension

Theorem, we can guarantee that there exists an extension of φ to the whole space X which still satisfies (a), (b) and (c). ◻

Theorem 13.24 can be used to characterize the elements of a normed space V which can be approximated by linear combinations of vectors {v1, v2, . . . }.

Theorem 13.25. Let V be a normed space. An element v ∈ V is a limit point of the set span{(vi)_{i∈ℕ}} if, and only if, f(v) = 0 for all continuous linear functionals f satisfying f(vi) = 0 for every i ∈ ℕ.

Proof. Suppose that f(v) = 0 for every continuous linear functional f with f(vi) = 0 for all i ∈ ℕ. By Theorem 13.24 it follows that δ = d(v, span{(vi)_{i∈ℕ}}) = 0; otherwise there would be a continuous functional f such that f(vi) = 0 for all i ∈ ℕ while f(v) = δ > 0, a contradiction. Since δ = 0, v is a limit point of span{(vi)_{i∈ℕ}}, which implies the result.

For the other implication, let us take v = lim_{n→∞} wn, where wn = Σ_{i=1}^{n} αi vi. If f is a continuous functional such that f(vi) = 0 for all i ∈ ℕ, then

    f(v) = f(lim_{n→∞} wn) = lim_{n→∞} f(wn) = lim_{n→∞} Σ_{i=1}^{n} αi f(vi) = 0,

from which the result follows. ◻

We now end this chapter by proving a very interesting and somewhat counterintuitive result: there exists a universal space in which all separable normed spaces are contained.

Theorem 13.26. Every separable normed space is isometrically isomorphic to a subspace of ℓ∞.

Proof. Let us fix a nonempty separable normed space X and let E = {xn : n ∈ ℕ} be a countable dense subset of X. Invoking Corollary 13.8 we know that, for each n ∈ ℕ, there exists φn ∈ X* such that ‖φn‖ = 1 and moreover φn(xn) = ‖xn‖. Let us take F given by

    F : X → ℓ∞,  x ↦ (φn(x))_{n∈ℕ}.

The map F is well defined, since |φn(x)| ≤ ‖x‖ for all n ∈ ℕ. Moreover, F is a bounded linear operator, since

    ‖F(x)‖_{ℓ∞} = sup_{n∈ℕ} |φn(x)| ≤ ‖x‖.    (13.6)

On the other hand, ‖F(xk)‖_{ℓ∞} ≥ |φk(xk)| = ‖xk‖, which, together with eq. (13.6), gives ‖F(xk)‖_{ℓ∞} = ‖xk‖_X. Now, taking into account the density of E together with the continuity of x ↦ ‖F(x)‖_{ℓ∞} and of x ↦ ‖x‖_X, we obtain that ‖F(x)‖_{ℓ∞} = ‖x‖_X for all x ∈ X, which ends the proof. ◻


13.5 Problems

13.1. Prove Theorem 13.18.

13.2. Let V be a Banach space and (fj)_{j∈ℕ} ⊆ V*. If for all v ∈ V the limit lim_{j→∞} fj(v) = f(v) exists, then f ∈ V*.

13.3. Let V be a normed space and x ≠ y ∈ V. Show that there exists an f ∈ V* such that f(x) ≠ f(y).

13.4. Let V be a normed space and v ∈ V. If for all f ∈ V* with ‖f‖ = 1 we have |f(v)| ≤ 1, then prove that ‖v‖ ≤ 1.

13.5. Let V = {x ∈ ℝ² : 2x1 – x2 = 0} be a subspace of ℝ², where x = (x1, x2), and define the linear functional f(x) = x1 over V. Find explicitly the extension of f to the entire space ℝ².

13.6. Prove the following result:
Theorem. Let X be a normed space and Y a vector subspace of X. Then Y is dense in X if, and only if, the only element of X* that is zero on Y is the null functional.

14 The Adjoint Operator

Learning Targets
✓ Understand the concept of adjoint operators acting on Hilbert spaces and on Banach spaces.
✓ Calculate some examples of adjoint operators.

We begin this chapter by recalling the concept of the adjoint of a matrix. Given an n×m matrix M, its adjoint is defined as the transpose of its conjugate matrix; that is, the m×n matrix M* whose entries are the conjugates of the entries of the transposed matrix Mᵀ. In order to simplify the exposition, let us assume for now that the matrix M has real entries. In this case M* = Mᵀ. Suppose v = (v1, . . . , vm) ∈ ℝᵐ and w = (w1, . . . , wn) ∈ ℝⁿ are two arbitrary vectors and denote by C1, . . . , Cm ∈ ℝⁿ the columns of the matrix M. Notice that

    Mv = v1 C1 + ⋅⋅⋅ + vm Cm,

and consequently

    ⟨Mv, w⟩_{ℝⁿ} = v1 ⟨C1, w⟩_{ℝⁿ} + ⋅⋅⋅ + vm ⟨Cm, w⟩_{ℝⁿ}.

On the other hand, notice that

    M*w = (⟨C1, w⟩_{ℝⁿ}, . . . , ⟨Cm, w⟩_{ℝⁿ}),

and therefore

    ⟨v, M*w⟩_{ℝᵐ} = v1 ⟨C1, w⟩_{ℝⁿ} + ⋅⋅⋅ + vm ⟨Cm, w⟩_{ℝⁿ}.

Thus, we reach the following equation:

    ⟨Mv, w⟩_{ℝⁿ} = ⟨v, M*w⟩_{ℝᵐ}.    (14.1)

This chapter may be understood as a way of generalizing the previous equation to other contexts. In the first section we will work under the setting of Hilbert spaces. The second section will be devoted to the study of adjoint operators on Banach spaces.
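Identity (14.1) is easy to check numerically for random matrices and vectors. The sketch below is not from the book; it assumes numpy and also checks the complex case, where the adjoint is the conjugate transpose. (np.vdot conjugates its first argument, so it computes an inner product that is conjugate-linear in the first slot; the identity to be verified is the same number in either convention.)

```python
import numpy as np

rng = np.random.default_rng(0)
n, m = 3, 5
M = rng.standard_normal((n, m))          # real n x m matrix, so M* = M^T
v = rng.standard_normal(m)
w = rng.standard_normal(n)

lhs = np.dot(M @ v, w)                   # <Mv, w> in R^n
rhs = np.dot(v, M.T @ w)                 # <v, M^T w> in R^m
assert np.isclose(lhs, rhs)

# complex case: the adjoint is the conjugate transpose M* = conj(M)^T
A = rng.standard_normal((n, m)) + 1j * rng.standard_normal((n, m))
z = rng.standard_normal(m) + 1j * rng.standard_normal(m)
u = rng.standard_normal(n) + 1j * rng.standard_normal(n)
lhs_c = np.vdot(u, A @ z)                # np.vdot conjugates its first argument
rhs_c = np.vdot(A.conj().T @ u, z)
assert np.isclose(lhs_c, rhs_c)
print("adjoint identity verified")
```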

14.1 Hilbert Spaces

We start this section with the proof of the existence of the adjoint of any bounded linear operator acting between Hilbert spaces. We will make use of Problem 4.5 in several parts of this section without giving the exact reference.


Theorem 14.1. Let (H1, ⟨·,·⟩1) and (H2, ⟨·,·⟩2) be two Hilbert spaces and let T : H1 → H2 be a bounded linear operator. Then there exists a unique bounded linear operator T* : H2 → H1 such that for every v ∈ H1 and w ∈ H2,

    ⟨Tv, w⟩2 = ⟨v, T*w⟩1.    (14.2)

Moreover, ‖T‖ = ‖T*‖.

Proof. We will use the Riesz Representation Theorem for Hilbert spaces. Given w ∈ H2, define the linear functional

    f_{w,T} : H1 → 𝔽,  v ↦ ⟨Tv, w⟩2.

Denote by ‖·‖1 and ‖·‖2 the norms of the Hilbert spaces H1 and H2. The functional f_{w,T} is bounded since

    |f_{w,T}(v)| = |⟨Tv, w⟩2| ≤ ‖Tv‖2 ‖w‖2 ≤ ‖T‖ ‖v‖1 ‖w‖2.

Then there exists a unique vector u_f ∈ H1, depending on w and T, such that for every v ∈ H1, f_{w,T}(v) = ⟨v, u_f⟩1. With this equation in hand, define

    T* : H2 → H1,  w ↦ u_f.

Notice that T* is well defined as a consequence of the uniqueness part of the Riesz Representation Theorem for Hilbert spaces 5.11. Moreover, T* satisfies eq. (14.2) by definition. We will show that T* is linear. Suppose α ∈ 𝔽 and w1, w2 ∈ H2. Then for every v ∈ H1,

    ⟨v, T*(αw1 + w2)⟩1 = ⟨Tv, αw1 + w2⟩2 = ᾱ⟨Tv, w1⟩2 + ⟨Tv, w2⟩2 = ᾱ⟨v, T*w1⟩1 + ⟨v, T*w2⟩1 = ⟨v, αT*w1 + T*w2⟩1.

This shows the linearity of T*. To prove the boundedness, we use Theorem 5.18 to calculate its norm. Notice that


    ‖T*w‖1 = sup{|⟨v, T*w⟩1| : ‖v‖1 ≤ 1} = sup{|⟨Tv, w⟩2| : ‖v‖1 ≤ 1} ≤ ‖T‖ ‖w‖2.

Hence T* is bounded and ‖T*‖ ≤ ‖T‖. On the other hand, the same reasoning applied to T* shows that ‖T‖ ≤ ‖T*‖. Finally, to prove the uniqueness, suppose S : H2 → H1 is such that for every v ∈ H1 and w ∈ H2, ⟨Tv, w⟩2 = ⟨v, Sw⟩1. Then ⟨v, Sw – T*w⟩1 = 0 for every v ∈ H1, and consequently Sw = T*w for every w ∈ H2. ◻

Definition 14.2. Let (H1, ⟨·,·⟩1) and (H2, ⟨·,·⟩2) be two Hilbert spaces and let T : H1 → H2 be a bounded linear operator. The operator T* : H2 → H1 that satisfies eq. (14.2) is called the adjoint operator of T.

Example 14.3. Consider the Hilbert space ℓ² defined over a field 𝔽. Given a sequence (xn)_{n=1}^{∞} ∈ ℓ∞, define the linear operator

    T : ℓ² → ℓ²,  (yn)_{n=1}^{∞} ↦ (xn yn)_{n=1}^{∞}.

Since for (yn)_{n=1}^{∞} ∈ ℓ² we have that

    Σ_{n=1}^{∞} |xn yn|² ≤ ‖(xn)‖∞² Σ_{n=1}^{∞} |yn|² < ∞,

T is well defined. It is bounded since the previous inequality implies that

    ‖T((yn)_{n=1}^{∞})‖2 ≤ ‖(xn)_{n=1}^{∞}‖∞ ‖(yn)_{n=1}^{∞}‖2.

On the other hand, suppose (wn)_{n=1}^{∞} ∈ ℓ² is an arbitrary sequence and denote (un)_{n=1}^{∞} = T*((wn)_{n=1}^{∞}). Then for every (yn)_{n=1}^{∞} ∈ ℓ² we have that ⟨T((yn)_{n=1}^{∞}), (wn)_{n=1}^{∞}⟩ = ⟨(yn)_{n=1}^{∞}, T*((wn)_{n=1}^{∞})⟩, that is,

    Σ_{n=1}^{∞} xn yn w̄n = Σ_{n=1}^{∞} yn ūn,

and by taking specific values of yn (which ones?) we have that un = x̄n wn.


Thus, an explicit formula for the operator T* is given by T*((wn)_{n=1}^{∞}) = (x̄n wn)_{n=1}^{∞}; over the real scalars this is again multiplication by (xn).
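The adjoint of the multiplication operator can be sanity-checked numerically on truncated sequences. The sketch below is not from the book; it assumes numpy, and the truncation length N and the random data are illustrative. It verifies ⟨T(y), w⟩ = ⟨y, T*(w)⟩ with T* given by multiplication by the conjugate multiplier.

```python
import numpy as np

rng = np.random.default_rng(1)
N = 200                                  # truncate l^2 sequences to length N
x = rng.standard_normal(N) + 1j * rng.standard_normal(N)   # bounded multiplier
y = rng.standard_normal(N) + 1j * rng.standard_normal(N)
w = rng.standard_normal(N) + 1j * rng.standard_normal(N)

T  = lambda s: x * s                     # (T y)_n   = x_n y_n
Ts = lambda s: np.conj(x) * s            # (T* w)_n  = conj(x_n) w_n

# inner product linear in the first slot, conjugate-linear in the second
inner = lambda a, b: np.sum(a * np.conj(b))
assert np.isclose(inner(T(y), w), inner(y, Ts(w)))
print("T* is multiplication by the conjugate sequence")
```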

In the following theorem, we collect some properties of adjoint operators.

Theorem 14.4. Let H1 and H2 be Hilbert spaces. Suppose T : H1 → H2 and S : H1 → H2 are bounded linear operators and let α ∈ 𝔽. Then
(a) (S + T)* = S* + T*,
(b) (αT)* = ᾱT*,
(c) (T*)* = T,
(d) ‖T*T‖ = ‖TT*‖ = ‖T‖².

Proof. We will denote by ⟨·,·⟩1 and ⟨·,·⟩2 the inner products of H1 and H2, respectively. Suppose v ∈ H1 and w ∈ H2.
(a) ⟨v, (S + T)*w⟩1 = ⟨(S + T)v, w⟩2 = ⟨Sv, w⟩2 + ⟨Tv, w⟩2 = ⟨v, S*w⟩1 + ⟨v, T*w⟩1 = ⟨v, (S* + T*)w⟩1.
(b) ⟨v, (αT)*w⟩1 = ⟨αTv, w⟩2 = α⟨Tv, w⟩2 = α⟨v, T*w⟩1 = ⟨v, ᾱT*w⟩1.
(c) ⟨(T*)*v, w⟩2 = ⟨v, T*w⟩1 = ⟨Tv, w⟩2, so (T*)* = T.
(d) Notice that ‖Tv‖2² = ⟨Tv, Tv⟩2 = ⟨v, T*Tv⟩1 ≤ ‖v‖1² ‖T*T‖, hence ‖Tv‖2 ≤ ‖T*T‖^{1/2} ‖v‖1, which implies that ‖T‖² ≤ ‖T*T‖. On the other hand, since by Problem 4.12 it holds that ‖T*T‖ ≤ ‖T*‖ ‖T‖ = ‖T‖², we have the equality. A similar reasoning shows that ‖TT*‖ = ‖T‖². ◻


14.2 Banach Spaces

In this section, we introduce the concept of the adjoint operator acting between Banach spaces. We will try to extrapolate the notions learned for the case of Hilbert spaces. The main difficulty in making such an extrapolation is, of course, the lack of an inner product, so we need to find a way to make sense of identity (14.2) in this new setting. Let us analyze heuristically the case of Hilbert spaces for a moment. Suppose (H1, ⟨·,·⟩1) and (H2, ⟨·,·⟩2) are two Hilbert spaces. We showed in previous chapters that, due to the Riesz Representation Theorem, H1* can be identified with H1 by means of the isometric isomorphism D : H1* → H1 defined as D(f) = v, where for every u ∈ H1, f(u) = ⟨u, v⟩1. Similarly, there exists an isometric isomorphism A : H2* → H2 defined as A(g) = w, where for every x ∈ H2, g(x) = ⟨x, w⟩2. Thus, if T : H1 → H2 is a bounded linear operator and w ∈ H2, then ⟨Tv, w⟩2 = g(Tv) with g = A⁻¹(w), whereas since T*w ∈ H1, there exists f ∈ H1* such that T*w = D(f). Therefore, the identity in eq. (14.2) becomes

    g(Tv) = f(v).

This shows the existence of an operator S : H2* → H1* such that S(g) = f. Notice that with the previous reasoning we got rid of the inner product, but we paid a price: we need to consider the operator S acting between the dual spaces. We will use this idea to define the adjoint of an operator acting between Banach spaces.

Definition 14.5. Let V1 and V2 be two Banach spaces over a field 𝔽, and suppose that T : V1 → V2 is a bounded linear operator. The adjoint operator T* : V2* → V1* is defined as

    T*g(v) = g(Tv)  for all g ∈ V2* and v ∈ V1.

Theorem 14.6. Let (V1, ‖·‖1) and (V2, ‖·‖2) be two Banach spaces and suppose that T : V1 → V2 is a bounded linear operator. The operator T* defined above is linear and bounded. Moreover,

    ‖T*‖ = ‖T‖.    (14.3)


Proof. To show the linearity, suppose that f, g ∈ V2* and α ∈ 𝔽; then for every v ∈ V1,

    T*(αf + g)(v) = (αf + g)(Tv) = αf(Tv) + g(Tv) = αT*f(v) + T*g(v).

Moreover, from the definition we have

    ‖T*g‖ = sup{|T*g(v)| : ‖v‖1 ≤ 1} = sup{|g(Tv)| : ‖v‖1 ≤ 1} ≤ ‖g‖ sup{‖Tv‖2 : ‖v‖1 ≤ 1} = ‖g‖ ‖T‖,

which implies that T* is bounded and ‖T*‖ ≤ ‖T‖. On the other hand, for each u ∈ V1, as a consequence of the Hahn–Banach Theorem (see Corollary 13.8), there exists a linear functional g ∈ V2* such that ‖g‖ = 1 and g(Tu) = ‖Tu‖2. Consequently,

    ‖Tu‖2 = g(Tu) = T*g(u) ≤ ‖T*‖ ‖g‖ ‖u‖1 = ‖T*‖ ‖u‖1,

and we conclude that ‖T*‖ = ‖T‖. ◻

Example 14.7. Let 1 < p < ∞ and consider the shift operator

    S : ℓᵖ → ℓᵖ,  (x1, x2, x3, . . . ) ↦ (x2, x3, x4, . . . ).

We showed in Example 4.27 that S is a bounded linear operator and ‖S‖ = 1. Let us try to describe S*. Suppose g ∈ (ℓᵖ)*; then by Example 5.14, g is associated with a sequence (yn)_{n=1}^{∞} ∈ ℓ^q in such a way that for every sequence v = (xn)_{n=1}^{∞} ∈ ℓᵖ,

    S*g(v) = g(Sv) = Σ_{n=1}^{∞} x_{n+1} yn.


Therefore, the operator S* : (ℓᵖ)* → (ℓᵖ)* maps g to S*g, which is the linear functional associated with the sequence (0, y1, y2, . . . ) ∈ ℓ^q, where 1/p + 1/q = 1. ⊘

Remark 14.8. In Theorem 14.4 we showed some properties of adjoint operators acting on Hilbert spaces. The first two properties also hold in the case of Banach spaces (in the Banach setting (αT)* = αT*, with no conjugation, since the duality pairing involves no conjugate), and their proofs are left to the reader. Notice that properties (c) and (d) do not make sense in the setting of general Banach spaces (why?). ⊘

Remark 14.9. Suppose that we have a bounded linear operator T : H1 → H2, where H1 and H2 are Hilbert spaces. Then we have two different options for defining T*. On the one hand, we can define the operator T* : H2 → H1 in the sense given in the previous section. On the other hand, since Hilbert spaces are also Banach spaces, it is possible to define T* : H2* → H1*. Some authors prefer to make a distinction and give the name Hilbert adjoint to the former operator and Banach adjoint to the latter. In this book, when working with Hilbert spaces, we will always refer to the Hilbert adjoint operator, and we will just call it the "adjoint operator". ⊘
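In truncation, the computation of Example 14.7 can be replayed numerically: the left shift pairs with the right shift on the dual side. This sketch is not from the book; it assumes numpy, and the bilinear pairing below stands in for the ℓᵖ–ℓ^q duality g(v) = Σ xₙyₙ.

```python
import numpy as np

rng = np.random.default_rng(3)
N = 100                                   # truncation of the sequence spaces
x = rng.standard_normal(N)                # v in l^p (truncated)
y = rng.standard_normal(N)                # representer of g in l^q

pair = lambda a, b: np.sum(a * b)         # duality pairing g(v) = sum x_n y_n

Sx  = np.append(x[1:], 0.0)               # left shift  S : (x1,x2,...) -> (x2,x3,...)
S_y = np.append(0.0, y[:-1])              # right shift, candidate representer of S*g

# g(Sv) = sum_n x_{n+1} y_n equals the pairing of v with (0, y1, y2, ...)
assert np.isclose(pair(Sx, y), pair(x, S_y))
print("S* is represented by the right-shifted sequence")
```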

14.3 Problems

14.1. Let H be a Hilbert space and let v, w ∈ H be nonzero vectors. Define the linear operator
    T : H → H,  x ↦ ⟨x, v⟩w.
Prove that T ∈ B(H, H) and find the operator T*.

14.2. Define the shift operator
    S : ℓᵖ → ℓᵖ,  (x1, x2, x3, . . . ) ↦ (x2, x3, x4, . . . ).
Find an explicit formula for S*.

14.3. Given a function f ∈ L∞[0, 1], define the multiplication operator
    Mf : L²[0, 1] → L²[0, 1],  g ↦ fg.
Find an explicit formula for Mf*.

14.4. Consider ℂ as a Hilbert space with respect to the usual product of complex numbers. Suppose H is any Hilbert space and f ∈ H*. Find a formula for f*.

14.5. Suppose H1 and H2 are two Hilbert spaces. Let T : H1 → H2 be a bounded linear operator. Prove that T*T = 0 if and only if T = 0.

14.6. Suppose H is a Hilbert space and S, T : H → H are bounded linear operators. Prove that (ST)* = T*S*.

14.7. Let H be a Hilbert space and T : H → H a bounded, linear and invertible transformation. Prove that T* is also invertible and that (T⁻¹)* = (T*)⁻¹.

14.8. Let V1 and V2 be two Banach spaces and let T, S : V1 → V2 be two bounded linear operators. Prove that
(a) (S + T)* = S* + T*.
(b) For every scalar α, (αT)* = αT*.
(c) The mapping T ↦ T* is an isometric isomorphism from B(V1, V2) to B(V2*, V1*).

14.9. Let V1, V2 and V3 be Banach spaces and suppose that T : V1 → V2 and S : V2 → V3 are bounded linear operators. Prove that (ST)* = T*S*.

14.10. Let 1 < p < ∞ and define the linear operator
    T : ℓᵖ → ℓᵖ,  (x1, x2, x3, . . . ) ↦ (0, x1, x2, x3, . . . ).
Prove that T is a bounded operator and describe T*.

14.11. Let 1 < p < ∞ and suppose (yn) ∈ ℓ∞. Define the linear operator
    M : ℓᵖ → ℓᵖ,  (xn)_{n=1}^{∞} ↦ (xn yn)_{n=1}^{∞}.
Describe M*.

15 Weak Topologies and Reflexivity

Learning Targets
✓ Study the notion of weak topology on a Banach space and weak* topology on dual spaces.
✓ Go over examples of reflexive spaces.

Suppose that (V, ‖·‖) is a normed space. Then there is a topology on V induced by its norm: a basis for this topology is the set of all open balls in the space. This is the natural topology that we have used so far in the book. For example, we showed that a linear transformation T : V → V is bounded if and only if it is continuous, where continuity is defined with respect to this topology. Our goal in this chapter is to give the reader an overview of the subject, which is why we will sometimes be sketchy and will assume that the reader has previous knowledge of general topology. For a more detailed treatment of the subject we recommend chapter 3 in Ref. [44]. On the other hand, we saw that, as a consequence of the Hahn–Banach Theorem, if v ∈ V, we can write

    ‖v‖ = sup{|f(v)| : f ∈ V*, ‖f‖ ≤ 1}.

In a sense, this says that the information about V can be obtained from the knowledge of how the linear functionals act on V. However, the norm topology on V is not the only topology that makes all linear functionals continuous. In this chapter, we will study the so-called weak topology on V: the smallest topology such that every element of V* is continuous.

Definition 15.1. Let (V, ‖·‖) be a normed space over a field 𝔽. Then the weak topology on V is defined by the subbase

    {f⁻¹(O) : f ∈ V* and O ⊆ 𝔽 open in the topology of 𝔽}.

That is, a set belongs to the weak topology of a normed space if it can be written as a union of finite intersections of sets of the form f⁻¹(O), where f ∈ V* and O ⊆ 𝔽 is open in the field 𝔽. Since the weak topology on V is the smallest one making every element of V* continuous, it must be contained in the norm topology of V. In other words, every element of the weak topology is an open set with respect to the norm topology. A clear consequence of this is the following proposition.


Theorem 15.2. Let (V, ‖·‖) be a normed space and suppose f : V → 𝔽 is a linear transformation. Then f ∈ V* if and only if f is continuous with respect to the weak topology.

Remark 15.3. We will say that a linear functional is weakly continuous on V if it is continuous with respect to the weak topology. Similarly, an element of the weak topology will be called weakly open, and we will say that a sequence (vn)_{n=1}^{∞} ⊆ V weakly converges to v ∈ V (denoted as vn ⇀ v) if it converges with respect to the weak topology. Notice that this means that for every f ∈ V*, f(vn) → f(v). In general, any topological property that holds with respect to the weak topology will be called a weak property or will be said to hold weakly. ⊘

Example 15.4. Consider the Banach space 𝔽ⁿ. Suppose that (vk)_{k=1}^{∞} is a sequence in 𝔽ⁿ such that vk ⇀ v. Consider for every j = 1, . . . , n the functionals

    fj : 𝔽ⁿ → 𝔽,  (x1, . . . , xn) ↦ xj.

Then, since fj(vk) → fj(v), weak convergence implies coordinate-wise convergence, which is actually norm convergence. Finally, since norm convergence always implies weak convergence, we see that in the finite-dimensional case both topologies coincide. ⊘

Example 15.5. Consider the Hilbert space (ℓ², ‖·‖2) and take the sequence (en)_{n=1}^{∞} of canonical vectors in ℓ². Given f ∈ (ℓ²)*, there exists a sequence vf = (xk)_{k=1}^{∞} ∈ ℓ² such that

    f(en) = ⟨en, vf⟩ → 0,

since the entries of an ℓ² sequence tend to 0. Consequently, en ⇀ 0. However, ‖en‖ = 1 for every n. This shows that the weak topology and the norm topology are not necessarily the same, even in the case of Hilbert spaces. ⊘

The following proposition says something about the "size" of weakly open sets.

Theorem 15.6. Let (V, ‖·‖) be an infinite-dimensional normed space. Then every nonempty weakly open set in V is unbounded.

Proof. Suppose O is a basic weakly open set in V. It is enough to show that the proposition holds if 0 ∈ O (why?). Then there exist neighborhoods G1, . . . , Gk ⊆ 𝔽 of 0 and functionals f1, . . . , fk ∈ V* such that

    O = f1⁻¹(G1) ∩ ⋅⋅⋅ ∩ fk⁻¹(Gk).


Then S = f1⁻¹(0) ∩ ⋅⋅⋅ ∩ fk⁻¹(0) ⊆ O. But S is a vector subspace of V, so it is enough to show that S contains a nonzero element. This holds because otherwise the linear transformation defined as

    T : V → 𝔽ᵏ,  v ↦ (f1(v), . . . , fk(v))

would be injective, and so the dimension of V would be finite. ◻
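The behavior in Example 15.5 — weak convergence of the canonical vectors without norm convergence — can be observed numerically on truncations. This sketch is not from the book; it assumes numpy, and the representer v_f is an arbitrary illustrative ℓ² vector.

```python
import numpy as np

# Example 15.5 in truncation: for the canonical vectors e_n in l^2,
# f(e_n) = <e_n, v_f> -> 0 for every representer v_f in l^2 (weak convergence),
# while ||e_n|| = 1 for all n (no norm convergence).
N = 10_000
rng = np.random.default_rng(4)
v_f = rng.standard_normal(N) / np.arange(1, N + 1)   # some vector in l^2

def e(n, N=N):
    out = np.zeros(N)
    out[n] = 1.0
    return out

vals = [abs(np.dot(e(n), v_f)) for n in (10, 100, 1000, 9999)]
print(vals)                # |f(e_n)| = |v_f[n]| decays towards 0
print([np.linalg.norm(e(n)) for n in (10, 100, 1000, 9999)])  # all equal to 1
```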

Even though the weak topology and the norm topology differ, it turns out that the bounded sets coincide. A subset A of a normed space V is said to be weakly bounded if, for each weakly open set O such that 0 ∈ O, there exists a positive number s > 0 such that A ⊆ tO for every t > s. Here, tO = {tv ∈ V : v ∈ O}. We leave it as an exercise to prove that this is equivalent to requiring that f(A) be bounded in 𝔽 for each f ∈ V*.

Theorem 15.7. A subset of a normed space is bounded if and only if it is weakly bounded.

Proof. Let (V, ‖·‖) be a normed space. Since the weak topology is contained in the norm topology, it is clear from the definition that every bounded subset of V is also weakly bounded. For the converse, suppose that A ⊆ V is a nonempty weakly bounded set (the empty case is trivial). Then for each f ∈ V* there exists C_f > 0 such that |f(v)| ≤ C_f for every v ∈ A. Consider the linear transformation Q : V → V** defined as

    Qv(f) = f(v),  for all f ∈ V*,

and notice that Q(A) = {Qv : v ∈ A} is a nonempty collection of linear functionals acting on the Banach space V*. Moreover, if f ∈ V* is fixed, then

    sup{|Qv(f)| : v ∈ A} = sup{|f(v)| : v ∈ A} ≤ C_f.

We can now use the uniform boundedness principle to conclude that the family of linear functionals {Qv : v ∈ A} is bounded in norm. But notice that, by the Hahn–Banach Theorem, for v ∈ A,

    ‖v‖ = sup{|f(v)| : f ∈ V*, ‖f‖ ≤ 1} = sup{|Qv(f)| : f ∈ V*, ‖f‖ ≤ 1} = ‖Qv‖.

Consequently, A is bounded in norm. ◻


Remark 15.8. The mapping Q : V → V** considered in the previous proof defines an isometric isomorphism from V to Q(V). We will study this mapping in greater detail in the next sections of this chapter. ⊘

In what follows, we will study several consequences of the previous theorem. We start with a direct one.

Corollary 15.9. Let (V, ‖·‖) be a normed space and let A ⊆ V. Then A is bounded if and only if the set {|f(v)| : v ∈ A} is bounded for each f ∈ V*.

Corollary 15.10. Let V and W be two normed spaces. Suppose that T : V → W is a linear operator. Then T is bounded if and only if for every f ∈ W*, the composition f ∘ T belongs to V*.

Proof. Notice that T is bounded if and only if the set T(B1(0)) is bounded in W. By Corollary 15.9 this occurs if and only if f(T(B1(0))) is a bounded set in 𝔽 for each f ∈ W*; equivalently, f ∘ T is a bounded linear transformation. ◻

The previous corollary can be written in terms of continuity. We say that a linear operator between two normed spaces V and W is weak-to-weak continuous if for every weakly open set O ⊆ W, the set T⁻¹(O) is weakly open in V.

Corollary 15.11. Let V and W be two normed spaces and let T : V → W be a linear operator. Then T is continuous if and only if it is weak-to-weak continuous.

Proof. First, notice that for a linear functional, continuity and weak continuity coincide. Now, by Corollary 15.10 we have that T is continuous if and only if f ∘ T is weakly continuous for every f ∈ W*. It remains to show that this implies the weak continuity of T. It is enough to show that T⁻¹(O) is weakly open for every subbasic element O of the weak topology of W. Suppose O = f⁻¹(G) for some open set G ⊆ 𝔽 and some f ∈ W*. Then T⁻¹(O) = (f ∘ T)⁻¹(G), which is weakly open in V since f ∘ T is a weakly continuous functional. ◻

The following corollary states that when two normed spaces are isomorphic, their weak topologies are preserved by the isomorphism. We leave its proof as an exercise.

Corollary 15.12. Given two normed spaces V and W, and a linear operator T : V → W, T is an isomorphism if and only if it is a weak-to-weak homeomorphism.

15.1 Weak∗ Topology

In the previous section we defined the weak topology on a normed space V as the smallest topology such that all the elements of V∗ are continuous. We can keep doing


that now for the dual space (since this is also a normed space) and consider the weak topology on V∗. This would be the smallest topology such that all the elements of V∗∗ are continuous. This reasoning can be repeated ad infinitum, and in each step we obtain information about a space from the knowledge of its dual. However, it is also possible to obtain information about the dual space from the knowledge of its action on the initial space. This section is devoted to the study of the weak∗ topology, which is a topology defined on the dual space V∗ by using the behavior of the elements of the normed space V. Recall the operator Q : V → V∗∗ defined in the previous section. It maps each vector v ∈ V to the linear functional Qv : V∗ → 𝔽 defined by the equation

Qv(f) = f(v). (15.1)

We proved that Q is an isometric isomorphism between V and Q(V) ⊆ V∗∗. The weak∗ topology on V∗ is defined as the smallest topology such that all the elements of Q(V) are continuous on V∗.

Definition 15.13. Let (V, ‖ ⋅ ‖) be a normed space and let Q : V → V∗∗ be defined as in eq. (15.1). The weak∗ topology on V∗ is the topology generated by the subbase

{(Qv)⁻¹(O) : v ∈ V and O ⊆ 𝔽 is open}.

Remark 15.14. Notice that, in the definition of the weak∗ topology, we only require the functionals from a subset of V∗∗, namely Q(V), to be continuous. In the next section, we will consider the spaces for which Q(V) = V∗∗. These will be called reflexive spaces. ⊘

Remark 15.15. Given a normed vector space V and a sequence (fn) ⊆ V∗, we say that (fn) weakly∗ converges to f ∈ V∗ if it converges in the weak∗ topology. We leave it as an exercise to show that

fn → f in the weak∗ topology ⇔ fn(v) → f(v) for all v ∈ V. (15.2) ⊘

Example 15.16. By Problem 5.13, we know that c0∗ is isometrically isomorphic to ℓ1. Moreover, the isomorphism A : ℓ1 → c0∗ is given by

A((xk))((yk)) = ∑∞k=1 xk yk,

where (xk) ∈ ℓ1 and (yk) ∈ c0.


Let's consider the sequence of functionals (fn) ⊆ c0∗ where, for every (yk) ∈ c0,

fn((yk)) = yn.

Notice that fn = A((enk)k), where enk = 1 if k = n and enk = 0 otherwise. Moreover,

Q((yk))(fn) = fn((yk)) = yn → 0,

and by eq. (15.2) we have that fn → 0 in the weak∗ topology. On the other hand, take a vector u = (uk) ∈ ℓ∞ such that uk ↛ 0 and define the linear operator gu : c0∗ → 𝔽 as

gu(A((xk))) = ∑∞k=1 xk uk,

where (xk) ∈ ℓ1 (here, we are using the fact that A is a surjective function). We leave it as an exercise to prove that gu is bounded and consequently gu ∈ c0∗∗. Now,

gu(fn) = gu(A((enk)k)) = ∑∞k=1 enk uk = un ↛ 0,

so (fn) does not converge weakly to 0. Therefore, the weak∗ convergence and the weak convergence differ. ⊘

15.2 Reflexive Spaces

Let V be a Banach space. In the previous section we considered the space V∗∗ and defined the isometric linear operator Q : V → V∗∗ as

Qv(f) = f(v), where v ∈ V and f ∈ V∗. (15.3)

The linear transformation Q is known as the canonical embedding of V into V∗∗. This section will be devoted to the study of the spaces for which Q is surjective.

Definition 15.17. A Banach space V is reflexive if the linear operator Q : V → V∗∗ defined as in eq. (15.3) is an isometric isomorphism.


Remark 15.18. Notice that the definition of reflexive spaces only includes Banach spaces. We showed before that a dual space is always complete, and therefore V∗∗ is complete. But since Q is an isometric isomorphism, if V is reflexive, then it must also be complete. ⊘

Example 15.19. Usually one starts with the example of the finite-dimensional space 𝔽n. However, the ideas are exactly the same as in the case of any Hilbert space (H, ⟨⋅, ⋅⟩), so let's consider this more general example. First remember that there exists an antilinear isometry A : H∗ → H such that for every f ∈ H∗ and u ∈ H,

f(u) = ⟨u, Af⟩.

We say that A is antilinear since for f1, f2 ∈ H∗, A(f1 + f2) = A(f1) + A(f2), while if λ ∈ 𝔽, then A(λf1) = λ̄A(f1). Moreover, if g ∈ H∗∗, then by the Riesz Representation Theorem there exists f ∈ H∗ such that for all h ∈ H∗, g(h) = ⟨h, f⟩H∗. Now, we use Problem 5.12 to obtain

g(h) = ⟨Af, Ah⟩ = h(Af) = Q(Af)(h).

This shows that Q is surjective and consequently that H is reflexive. ⊘
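In finite dimensions, the isometry underlying this argument can be checked directly. The sketch below models Example 15.19 with H = ℂ³ (the dimension, random seed, and sampling scheme are hypothetical choices): every functional has the form fa(u) = ⟨u, a⟩, so ‖Qv‖ = sup{|⟨v, a⟩| : ‖a‖ = 1} = ‖v‖.

```python
import numpy as np

rng = np.random.default_rng(0)

# Model of Example 15.19 with H = C^3.  By Riesz representation, every
# functional on H is f_a(u) = <u, a>, so the canonical embedding satisfies
# ||Qv|| = sup{ |<v, a>| : ||a|| = 1 }.
v = rng.standard_normal(3) + 1j * rng.standard_normal(3)

samples = []
for _ in range(2000):
    a = rng.standard_normal(3) + 1j * rng.standard_normal(3)
    a /= np.linalg.norm(a)
    samples.append(abs(np.vdot(a, v)))   # |f_a(v)| for a random unit a

# No sample exceeds ||v||, and a = v/||v|| attains it, so ||Qv|| = ||v||.
a_opt = v / np.linalg.norm(v)
print(max(samples) <= np.linalg.norm(v) + 1e-12)          # True
print(abs(np.vdot(a_opt, v)), np.linalg.norm(v))          # the two values agree
```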

Example 15.20. The previous ideas can be applied to some non-Hilbert spaces. We will consider the case of the ℓp spaces for 1 < p < ∞. Recall that if 1 < q < ∞ is such that 1/p + 1/q = 1, then there exist antilinear bijective isometries A1 : ℓp → (ℓq)∗ and A2 : ℓq → (ℓp)∗ defined as

A1((xn))((yn)) = ∑∞n=1 yn xn

and

A2((yn))((xn)) = ∑∞n=1 xn yn,

for every (xn) ∈ ℓp and (yn) ∈ ℓq.

We will show that the canonical embedding Q : ℓp → (ℓp)∗∗ is surjective. Let g ∈ (ℓp)∗∗ and define the function

g ∘ A2 : ℓq → 𝔽, (yn) ↦ g(A2((yn))).

Then g ∘ A2 ∈ (ℓq)∗ and consequently there exists (xn) ∈ ℓp such that

g ∘ A2 = A1((xn)).

We will show that Q((xn)) = g. Suppose that h ∈ (ℓp)∗; then h = A2((yn)) for some (yn) ∈ ℓq and hence

g(h) = g(A2((yn))) = (g ∘ A2)((yn)) = A1((xn))((yn)) = ∑∞n=1 xn yn.

On the other side,

Q((xn))(h) = h((xn)) = A2((yn))((xn)) = ∑∞n=1 xn yn.

Since this holds for any h ∈ (ℓp)∗, we have Q((xn)) = g and consequently the ℓp spaces are reflexive for 1 < p < ∞. ⊘
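The duality ℓp ↔ ℓq underlying this example can be checked numerically in a truncated setting. The sketch below uses hypothetical choices (p = 3, q = 3/2, a random real vector of length 50) and verifies that the functional A1((xn)) attains its norm ‖(xn)‖q on the unit sphere of ℓp, as Hölder's inequality predicts:

```python
import numpy as np

rng = np.random.default_rng(1)
p, q = 3.0, 1.5                 # conjugate exponents: 1/p + 1/q = 1
x = rng.standard_normal(50)     # truncated element of l^p, representing A_1(x)

lp = lambda z, r: np.sum(np.abs(z) ** r) ** (1.0 / r)

# The extremal unit vector on which the pairing attains ||x||_q:
y_star = np.sign(x) * np.abs(x) ** (q - 1) / lp(x, q) ** (q - 1)

print(abs(lp(y_star, p) - 1.0) < 1e-9)             # True: ||y*||_p = 1
print(abs(np.dot(x, y_star) - lp(x, q)) < 1e-9)    # True: pairing = ||x||_q

# Random unit vectors never beat it (Hölder's inequality):
for _ in range(1000):
    y = rng.standard_normal(50)
    y /= lp(y, p)
    assert abs(np.dot(x, y)) <= lp(x, q) + 1e-9
```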

Before going over the next example, let's recall Theorem 5.19: if the dual space V∗ is separable, then V is separable. On the other hand, if V is reflexive, then V and V∗∗ are isomorphic, and consequently if V is separable then so is V∗∗. Now we use Theorem 5.19 (applied to V∗) to conclude that if V is reflexive and separable, then V∗ is also separable. This reasoning allows us to find examples of nonreflexive spaces.

Example 15.21. The space ℓ1 is not reflexive since it is separable (see Example 3.40), but by Problem 5.4 its dual space is identified with ℓ∞, which is not separable by Example 3.41. ⊘


Remark 15.22. Notice that a Banach space V is reflexive if Q(V) = V∗∗, and in that case V and V∗∗ are isomorphic. It is possible, however, for a Banach space to be isomorphic to its double dual and yet fail to be reflexive. R.C. James was the first to find such an example in Ref. [15]. The details are beyond the scope of this book and will not be covered; however, it is important to remember that such an example exists. For further reading on weak topologies and reflexive spaces, we recommend Ref. [28]. ⊘

15.3 Problems

15.1. Let 1 ≤ p < ∞, p ≠ 2. Find an example of a sequence in ℓp that is weakly convergent but not convergent in norm.
15.2. Show that a subset A of a normed space V is weakly bounded if and only if f(A) is bounded in 𝔽 for every f ∈ V∗.
15.3. Prove Corollary 15.12.
15.4. Given a normed vector space V and a sequence (xn) ⊆ V, prove that its weak limit is unique.
15.5. Consider the Banach space (C[0, 1], ‖ ⋅ ‖∞) and suppose that (fn) ⊆ C[0, 1] is a sequence of functions such that fn → f ∈ C[0, 1] weakly. Prove that fn(t) → f(t) for every t ∈ [0, 1].
15.6. Given a normed vector space (V, ‖ ⋅ ‖) and a sequence (xn) ⊆ V, suppose that xn → v ∈ V weakly. Prove that ‖v‖ ≤ lim inf ‖xn‖.
15.7. Prove the statement in eq. (15.2).
15.8. Use Problem 5.4 to show that the operator gu defined in Example 15.16 is bounded.
15.9. Let (V, ‖ ⋅ ‖) be a normed space and let B ⊆ V∗. Prove that B is bounded if and only if B is weakly∗ bounded.
15.10. Given a normed vector space V and a sequence (xn) ⊆ V∗, suppose that xn → x ∈ V∗ in the weak∗ topology. Prove that ‖x‖ ≤ lim inf ‖xn‖. Notice that ‖x‖ and ‖xn‖ are understood to be the norms in V∗.
15.11. Let H be a Hilbert space. Compare the weak and the weak∗ topologies on H∗.
15.12. Try to extend the reasoning in Example 15.20 to the case p = 1. What fails?
15.13. Let (V, ‖ ⋅ ‖) be a reflexive space. Prove that V∗ is also reflexive.

16 Operators in Hilbert Spaces

Learning Targets
✓ Understand compactness as a property of operators on Hilbert spaces.
✓ Recognize normal and self-adjoint operators on Hilbert spaces.

This chapter can be considered as a preparation for the Spectral Theorem in Hilbert spaces. We will focus on three types of operators whose properties will be useful in the next chapter.

16.1 Compact Operators

In this section we will study compact operators acting on Hilbert spaces. The word "compact" in mathematics usually refers to "small" in some sense. In this context, a compact operator is a linear operator that shares several properties with operators acting on a finite-dimensional space.

Definition 16.1. Let H and K be Hilbert spaces and denote the closed unit ball in H by BH = {x ∈ H : ‖x‖ ≤ 1}. A linear operator T : H → K is compact if the closure of T(BH) is compact in K.

Theorem 16.2. Let H and K be Hilbert spaces and let T : H → K be a linear operator.
(a) If T is compact, then T is bounded.
(b) Suppose that T is bounded and Tn : H → K is a sequence of compact operators such that ‖Tn – T‖ → 0. Then T is compact.
(c) If L : K → K and R : H → H are bounded linear operators and T is compact, then LT and TR are compact.

Proof. (a) Since the closure of T(BH) is compact in K, T(BH) is a bounded set and consequently there exists C > 0 such that

‖Tv‖ ≤ C for all v ∈ H, ‖v‖ ≤ 1.

This implies that T is bounded.

(b) Since K is a complete metric space, it is enough to show that T(BH) is totally bounded. Let ε > 0; then there exists N ∈ ℕ such that ‖Tn – T‖ < ε/3 for all n ≥ N. Now, since TN is compact, there are vectors v1, . . . , vm ∈ BH such that

TN(BH) ⊆ ⋃mj=1 Bε/3(TN(vj)).

Consequently, for every v ∈ BH, there exists j ∈ {1, . . . , m} such that ‖TNv – TNvj‖ < ε/3. Thus,

‖Tv – Tvj‖ ≤ ‖Tv – TNv‖ + ‖TNv – TNvj‖ + ‖TNvj – Tvj‖ < 2‖T – TN‖ + ε/3 < ε.

Therefore, T(BH) ⊆ ⋃mj=1 Bε(Tvj). This shows that T(BH) is totally bounded.

(c) If R = 0, then TR = 0 and the proposition holds. Suppose that R ≠ 0; then, since R is bounded, 0 < ‖R‖ < ∞. Therefore,

TR(BH) ⊆ T(B‖R‖(0)) = ‖R‖ T(BH),

and the result follows from the compactness of the closure of T(BH). The second part of the proof is left as an exercise. ◻

Example 16.3. Any bounded linear transformation T : 𝔽n → 𝔽n is compact. More generally, if H is a Hilbert space and T : H → 𝔽n is a bounded linear operator, then T is compact because the closure of T(BH) is a closed and bounded set in a finite-dimensional space; the Heine–Borel Theorem gives us the result. ⊘

Example 16.4. The previous example considers the case of matrices acting on 𝔽n. A generalization of such matrices is obtained by considering infinite matrices acting on the Hilbert space ℓ2. Consider a sequence (aij), where the aij ∈ 𝔽 are such that ∑∞i,j=1 |aij|2 < ∞. The sequence can be thought of as an infinite matrix acting on ℓ2 by means of the linear operator

T : ℓ2 → ℓ2, (xn) ↦ (∑∞j=1 aij xj)∞i=1.


Notice that, by the Cauchy–Schwarz inequality, for every canonical vector ek ∈ ℓ2,

|⟨T((xn)), ek⟩|2 = |∑∞j=1 xj⟨T(ej), ek⟩|2 ≤ ‖(xn)‖22 ∑∞j=1 |⟨T(ej), ek⟩|2.

Consequently,

‖T((xn))‖22 = ∑∞k=1 |⟨T((xn)), ek⟩|2 ≤ ‖(xn)‖22 ∑∞j=1 ∑∞k=1 |⟨T(ej), ek⟩|2 = ‖(xn)‖22 ∑∞j,k=1 |ajk|2,

which implies that T is well defined and bounded, with ‖T‖2 ≤ ∑∞j,k=1 |ajk|2.

Now, for every N ∈ ℕ, consider the sequence (aNij) defined as aNij = aij if i ≤ N and j ≤ N, and aNij = 0 otherwise, and define the operator

TN : ℓ2 → ℓ2, (xn) ↦ (∑∞j=1 aNij xj)∞i=1.

Thinking of our infinite matrix representation, the operators TN can be seen as truncated matrices: matrices having one N × N block whose coefficients coincide with the aij and zeroes everywhere else. Notice that for every N, the space TN(ℓ2) is isomorphic to a subspace of 𝔽N and consequently each TN is compact. Moreover, using the previous calculations, we have that

‖T – TN‖2 ≤ ∑max(j,k)>N |ajk|2 → 0 as N → ∞.

Thus, by Theorem 16.2 we conclude that T is a compact operator. ⊘
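Example 16.4 can be reproduced numerically on a finite window of such an infinite matrix. In the sketch below the entries aij = 1/(i + j)² and the window sizes are hypothetical choices; the Hilbert–Schmidt bound from the example controls the operator norm of the truncation error:

```python
import numpy as np

# A finite window of the infinite matrix a_ij = 1/(i+j)^2, which satisfies
# sum |a_ij|^2 < infinity (a concrete illustrative choice).
n = 200
i, j = np.indices((n, n)) + 1
M = 1.0 / (i + j) ** 2

def truncation(N):
    """Keep the leading N x N block, zero everywhere else (the operator T_N)."""
    MN = np.zeros_like(M)
    MN[:N, :N] = M[:N, :N]
    return MN

errs = []
for N in (5, 20, 80):
    D = M - truncation(N)
    err = np.linalg.norm(D, 2)        # operator norm of T - T_N
    tail = np.sqrt(np.sum(D ** 2))    # Hilbert-Schmidt bound from the example
    errs.append((err, tail))
    print(N, err, tail)   # err <= tail, and both shrink as N grows
```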

Example 16.5. Let H be an infinite-dimensional Hilbert space. Then the identity operator I : H → H is not compact because I(BH ) = BH and BH is not compact (why?). ⊘


Notice that in Example 16.4 we showed that T is a compact operator by approximating it with the operators TN. Such operators are not only compact, but they also have finite rank.

Definition 16.6. Let H and K be Hilbert spaces. A linear operator T : H → K has finite rank if T(H) is a finite-dimensional space.

Clearly, every finite rank operator is compact. The following theorem generalizes the method we used in Example 16.4.

Theorem 16.7. Let H and K be Hilbert spaces and let T : H → K be a bounded linear operator. The following statements are equivalent:
(a) T is compact.
(b) T∗ is compact.
(c) There exists a sequence of finite rank operators Tn : H → K such that ‖Tn – T‖ → 0.

Proof. Notice that (c) ⇒ (a) follows directly from the fact that every finite rank operator is compact and from part (b) of Theorem 16.2. Similarly, from Problem 16.3 and the fact that ‖Tn∗ – T∗‖ = ‖Tn – T‖, it follows that (c) ⇒ (b).

Now, we will show that (a) ⇒ (c). Suppose that T is compact; then the closure of T(BH) is compact, and therefore T(BH), and consequently T(H), are separable (why?). By Problem 2.23 there exists an orthonormal basis {en}∞n=1 for the closure of T(H). Define, for each N ∈ ℕ, the subspace KN spanned by the vectors {e1, . . . , eN}, and let PN : K → KN be the orthogonal projection. Notice that for every w in the closure of T(H), ‖PNw – w‖ → 0. Now set TN = PNT; then for every v ∈ H, we have that ‖TNv – Tv‖ → 0. We will show that ‖TN – T‖ → 0. Let ε > 0; since T is compact, there exist v1, . . . , vm ∈ BH such that

T(BH) ⊆ ⋃mj=1 Bε/3(Tvj).

Hence, if v ∈ BH, there exists j ∈ {1, . . . , m} such that ‖Tv – Tvj‖ < ε/3, and consequently for N ∈ ℕ such that ‖TNvj – Tvj‖ < ε/3 for all j, we have

‖Tv – TNv‖ ≤ ‖Tv – Tvj‖ + ‖Tvj – TNvj‖ + ‖TNvj – TNv‖ ≤ 2‖Tv – Tvj‖ + ‖Tvj – TNvj‖ < ε.

This estimate does not depend on v ∈ BH and hence ‖T – TN‖ ≤ ε.


In order to show that (b) ⇒ (a), we will use the characterization of compact operators from Problem 16.4. Let (vn) ⊆ H be a sequence such that vn → v weakly. If T∗ is compact, then by part (c) of Theorem 16.2 the product T∗T is also compact, and hence, by Problem 16.4, ‖T∗T(vn) – T∗T(v)‖ → 0. Consequently,

‖Tvn – Tv‖2 = ⟨T(vn – v), T(vn – v)⟩ = ⟨vn – v, T∗T(vn – v)⟩ ≤ ‖vn – v‖ ‖T∗T(vn – v)‖ → 0,

since (vn) is bounded. By Problem 16.4 again, T is compact. This completes the proof. ◻

16.2 Normal and Self-Adjoint Operators

As seen in the previous section, there exists an equivalence between the compactness of a linear operator and the compactness of its adjoint. In general, a "nice" interaction between a linear operator and its adjoint yields a deeper knowledge of the operator. In this section, we will define two classes of operators in terms of their behavior with respect to the adjoint. These definitions will gain more relevance in the next chapter.

Definition 16.8. Let H be a Hilbert space and suppose that T : H → H is a bounded linear operator. We say that (a) T is self-adjoint if T = T ∗ and (b) T is normal if TT ∗ = T ∗ T.

Example 16.9. Let's consider the case of the Hilbert space 𝔽n and a linear operator T : 𝔽n → 𝔽n. Suppose that the matrix A = [aij] represents the operator T with respect to the canonical basis, and recall that the representation of T∗ with respect to the same basis is given by the conjugate transpose A∗ = [a̅ji]. Thus, in order for T to be self-adjoint, we should have

aij = a̅ji. (16.1)

For example, any symmetric real matrix is associated to a self-adjoint operator, but those are not the only examples (what else?). Notice that eq. (16.1) implies that all the entries on the diagonal must be real numbers. Also, notice that if T is self-adjoint, then T is normal; this is a general fact, not depending on the Hilbert space. The converse is not necessarily true; for example, consider the diagonal matrix

A = [ i 0
      0 0 ].

It is clear that A is associated to a normal operator that is not self-adjoint. ⊘

Example 16.10. In the previous chapters, we defined, for a given sequence (xn) ∈ ℓ∞, the multiplication operator

T : ℓ2 → ℓ2, (yn) ↦ (xn yn).

We showed that its adjoint is given by

T∗ : ℓ2 → ℓ2, (wn) ↦ (x̅n wn).

Notice that T is self-adjoint if and only if every xn is a real number. On the other hand, it is clear that T is a normal operator for every (xn) ∈ ℓ∞. ⊘

The previous examples show that there seems to be a relation between real numbers and self-adjoint operators. This is made more evident in the next proposition.

Theorem 16.11. Let (H, ⟨⋅, ⋅⟩) be a Hilbert space. Assume that the field of scalars is ℂ. Let T : H → H be a bounded linear operator. Then T is self-adjoint if and only if ⟨Tv, v⟩ is real for all v ∈ H.

Proof. Suppose first that T is self-adjoint; then for every v ∈ H,

⟨Tv, v⟩ = ⟨v, T∗v⟩ = ⟨v, Tv⟩,

which is the complex conjugate of ⟨Tv, v⟩, and consequently ⟨Tv, v⟩ is real. Conversely, if for every v ∈ H the number ⟨Tv, v⟩ equals its conjugate ⟨v, Tv⟩, then ⟨v, Tv⟩ = ⟨v, T∗v⟩ and consequently ⟨v, (T – T∗)v⟩ = 0; the result follows from Problem 4.13. ◻
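Theorem 16.11, together with the diagonal matrix of Example 16.9, can be checked numerically. The sketch below is a hypothetical 4-dimensional model (random seed, dimension, and tolerance are arbitrary choices):

```python
import numpy as np

rng = np.random.default_rng(2)

def rayleigh_all_real(T, trials=500):
    """Check whether <Tv, v> is (numerically) real for random complex v."""
    for _ in range(trials):
        v = rng.standard_normal(4) + 1j * rng.standard_normal(4)
        if abs(np.vdot(v, T @ v).imag) > 1e-10:   # np.vdot(v, Tv) = <Tv, v>
            return False
    return True

B = rng.standard_normal((4, 4)) + 1j * rng.standard_normal((4, 4))
H = B + B.conj().T            # self-adjoint: H = H*
print(rayleigh_all_real(H))   # True, as Theorem 16.11 predicts

N = np.diag([1j, 0, 0, 0])    # normal (diagonal) but not self-adjoint
print(np.allclose(N @ N.conj().T, N.conj().T @ N))  # True: N is normal
print(rayleigh_all_real(N))   # False: <Nv, v> = i|v_1|^2 is not real
```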

Remark 16.12. Notice that in the previous proposition it was assumed that the field of scalars is ℂ. This is because the result in Problem 4.13 does not hold for real vector spaces. In the case of real scalars we can only claim that if T is self-adjoint, then ⟨Tv, v⟩ is a real number for all v ∈ H. The converse is clearly not true since in any real Hilbert space we have that ⟨Tv, v⟩ is a real number independently of the operator T. ⊘


Theorem 16.13. Let (H, ⟨⋅, ⋅⟩) be a Hilbert space and let T : H → H be a self-adjoint operator. Then

‖T‖ = sup{|⟨Tv, v⟩| : ‖v‖ = 1}. (16.2)

Proof. Suppose v ∈ H is such that ‖v‖ = 1; then by the Cauchy–Schwarz inequality we have that |⟨Tv, v⟩| ≤ ‖Tv‖ ≤ ‖T‖, and consequently sup{|⟨Tv, v⟩| : ‖v‖ = 1} ≤ ‖T‖.

On the other hand, let α = sup{|⟨Tv, v⟩| : ‖v‖ = 1}. Since T is self-adjoint, for v, w ∈ H such that ‖v‖ = ‖w‖ = 1 we have

⟨T(v + w), v + w⟩ – ⟨T(v – w), v – w⟩ = 2⟨Tv, w⟩ + 2⟨Tw, v⟩ = 4 Re⟨Tv, w⟩.

Now notice that

|⟨T(v ± w), v ± w⟩| ≤ ‖v ± w‖2 α,

and by the parallelogram law,

4 Re⟨Tv, w⟩ ≤ α(‖v + w‖2 + ‖v – w‖2) = 2α(‖v‖2 + ‖w‖2) = 4α.

Now, let θ ∈ [0, 2π] be such that |⟨Tv, w⟩| = eiθ⟨Tv, w⟩. Then

|⟨Tv, w⟩| = Re⟨Tv, e–iθw⟩ ≤ α.

Taking the supremum over all w ∈ H such that ‖w‖ = 1, we have that ‖Tv‖ ≤ α, which implies the result. ◻

The following proposition gives us a characterization of normal operators.

Theorem 16.14. Let (H, ⟨⋅, ⋅⟩) be a Hilbert space and let T : H → H be a bounded linear operator. Then T is normal if and only if for every v ∈ H, ‖Tv‖ = ‖T∗v‖.


Proof. As in the previous proposition, we will use Problem 4.13. Notice that T is normal if and only if TT∗ – T∗T = 0, and this is equivalent to ⟨(TT∗ – T∗T)v, v⟩ = 0 for every v ∈ H. But this can be written as ⟨TT∗v, v⟩ = ⟨T∗Tv, v⟩, that is, ‖T∗v‖2 = ‖Tv‖2, which gives the result. ◻
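Theorems 16.13 and 16.14 are easy to test in finite dimensions, where the supremum in eq. (16.2) equals the largest eigenvalue in modulus. The sketch below uses hypothetical random matrices (seed and sizes are arbitrary choices):

```python
import numpy as np

rng = np.random.default_rng(3)
B = rng.standard_normal((5, 5)) + 1j * rng.standard_normal((5, 5))
H = (B + B.conj().T) / 2                       # a random self-adjoint matrix

# Theorem 16.13: for self-adjoint H, ||H|| = sup |<Hv, v>| over unit vectors;
# in finite dimensions both sides equal the largest |eigenvalue|.
eigs = np.linalg.eigvalsh(H)
print(np.isclose(np.linalg.norm(H, 2), np.max(np.abs(eigs))))  # True

# The supremum is attained at a corresponding unit eigenvector:
idx = np.argmax(np.abs(eigs))
w = np.linalg.eigh(H)[1][:, idx]
print(np.isclose(abs(np.vdot(w, H @ w)), np.linalg.norm(H, 2)))  # True

# Theorem 16.14: T is normal iff ||Tv|| = ||T*v|| for all v.  The shift-like
# matrix below is not normal, and a random v witnesses the failure.
T = np.diag(np.ones(4), k=1)                   # T T* != T* T
v = rng.standard_normal(5)
print(np.isclose(np.linalg.norm(T @ v), np.linalg.norm(T.conj().T @ v)))  # False
```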

16.3 Problems

16.1. Given a Hilbert space H, denote by B0(H) the set of all compact operators T : H → H. Show that B0(H) is a vector space.
16.2. Prove the remaining case in Theorem 16.2.
16.3. Let H and K be Hilbert spaces and let T : H → K be a finite rank operator. Show that T∗ is also a finite rank operator.
16.4. Let H and K be Hilbert spaces and let T : H → K be a bounded linear operator. Prove that T is compact if and only if for every sequence (vn) such that vn → v weakly, we have that ‖Tvn – Tv‖ → 0.
16.5. Let H be a separable Hilbert space and let {en}∞n=1 be a basis for H. Given a bounded sequence {λn}∞n=1, define the linear operator T : H → H by T(en) = λnen and extend linearly to H.
(a) Prove that ‖T‖ = sup{|λn|}.
(b) Prove that T is compact if and only if |λn| → 0. Hint: Use the projections PN defined in the proof of Theorem 16.7.
16.6. Consider the shift operator S : ℓ2 → ℓ2 defined as S(x1, x2, . . . ) = (x2, x3, . . . ).
(a) Prove that S∗ : ℓ2 → ℓ2 is given by S∗(y1, y2, . . . ) = (0, y1, y2, . . . ). Is S self-adjoint? Normal?
(b) Let ℓ20 = {(xn) ∈ ℓ2 : x1 = 0}. Prove that ℓ20 is a Hilbert space.
(c) Consider the operator T : ℓ20 → ℓ2 defined as the restriction of S to ℓ20. Prove that T∗T = TT∗.

16.7. Let H be a Hilbert space over a field 𝔽 and let T : H → H be a normal operator. Suppose λ ∈ 𝔽 and let I : H → H denote the identity operator.
(a) Prove that ker(T – λI) = ker((T – λI)∗).
(b) Prove that T(ker(T – λI)⊥) ⊆ ker(T – λI)⊥.
16.8. Let H be a Hilbert space and let (Tn) be a sequence of self-adjoint operators Tn : H → H such that ‖Tn – T‖ → 0 for some bounded linear operator T : H → H. Prove that T is a self-adjoint operator.
16.9. Let H be a Hilbert space and let T, S : H → H be two self-adjoint operators. Prove that ST is also self-adjoint if and only if ST = TS.

17 Spectral Theory of Operators on Hilbert Spaces

Learning Targets
✓ Review spectral theorems for normal and self-adjoint operators in finite-dimensional spaces.
✓ Comprehend the spectral theorem for compact self-adjoint operators in Hilbert spaces.

This chapter will be dedicated to the study of spectral properties of linear operators acting on Hilbert spaces. As a motivation, in the next section we will give a brief review of the situation in finite-dimensional complex spaces. In such spaces, for a given linear transformation, it is possible to find a basis that makes the associated matrix "as simple as possible," and thus a classification of linear operators can be given in terms of such canonical matrices. This is the main idea of spectral theory, and the objective of this chapter is to give the reader a "feeling" for the situation through a very special case. We do not intend to cover all the cases or the details of such an extensive theory. Instead, we would like to awaken the reader's curiosity in the subject.

17.1 A Quick Review of Spectral Theory in Finite Dimensions

In this section we plan to recall some facts from the theory of linear transformations on finite-dimensional spaces. We might be sketchy at times, since we expect the reader to be familiar with these concepts and techniques from previous linear algebra courses. Suppose T : ℂn → ℂn is a nonzero linear operator; then there exist some "preferred directions" associated to T. Such directions are given by specific vectors v such that the subspace S = {αv : α ∈ ℂ} is invariant for T; that is, T(S) ⊆ S. Such vectors are called eigenvectors. We give the following definition in the general setting since it will be used in the next sections of this chapter.

Definition 17.1. Let V be a vector space and let T : V → V be a linear operator. If there exists a nonzero vector v ∈ V and λ ∈ ℂ such that Tv = λv, then we say that λ is an eigenvalue corresponding to the eigenvector v.

Remark 17.2. Notice that for the same eigenvalue there are infinitely many eigenvectors, since if Tv = λv, then for every nonzero α ∈ ℂ we have that T(αv) = λ(αv). ⊘


Going back to the case of finite dimensions, if M denotes an n × n matrix associated to the linear transformation T : ℂn → ℂn, then finding an eigenvalue is equivalent to finding λ ∈ ℂ and v ∈ ℂn\{0} such that the following equation holds:

(M – λI)v = 0,

where I denotes the n × n identity matrix. Such a λ exists since, by the fundamental theorem of algebra, the characteristic equation det(M – λI) = 0 always has a solution. With this result in hand, we will be able to prove the following proposition.

Theorem 17.3. Let T : ℂn → ℂn be a linear transformation. Then there exists a basis B = {b1, . . . , bn} for ℂn such that the matrix associated to T with respect to B is upper triangular.

Proof. We will use induction over n. If n = 1, then the result is trivial. Suppose that the proposition holds for every positive integer k < n. Let λ be an eigenvalue of T, and let Uλ = (T – λI)(ℂn). Then Uλ is a subspace of ℂn and Uλ ≠ ℂn since T – λI is not invertible. Hence k = dim(Uλ) < n. On the other hand, Uλ is invariant for T, since if u ∈ Uλ, then Tu = (T – λI)u + λu ∈ Uλ. Thus, we can use the induction hypothesis to conclude that there exists a basis {u1, . . . , uk} for Uλ with respect to which the matrix representation of the restriction T|Uλ : Uλ → Uλ is upper triangular. Now notice that ℂn = Uλ ⊕ Uλ⊥ and suppose that {v1, . . . , vn–k} is a basis for Uλ⊥. Then for every j ∈ {1, . . . , n – k} we have that Tvj = (T – λI)vj + λvj is a linear combination of {u1, . . . , uk} ∪ {vj}. The result follows taking B = {u1, . . . , uk, v1, . . . , vn–k}. ◻

With this proposition in hand, we are now ready to show the spectral theorem for finite-dimensional complex spaces. It characterizes the linear transformations that have a diagonal matrix representation with respect to an orthonormal basis.

Theorem 17.4. Let T : ℂn → ℂn be a linear transformation. The following statements are equivalent:

(a) T is normal.
(b) There exists a diagonal matrix representation of T with respect to an orthonormal basis for ℂn.

Proof. Suppose that T is normal, and let M = [aij] be an upper-triangular representation of T with respect to some basis B = {b1, . . . , bn}. Notice that aij = 0 for i > j and that the representation of T∗ is [a̅ji]. Using Problem 17.2 we may assume that B is an orthonormal basis. Moreover, since T is normal, ‖Tb1‖2 = ‖T∗b1‖2 and consequently

|a11|2 = ∑nj=1 |a1j|2,

which implies that a1j = 0 for j = 2, . . . , n. Hence, ‖Tb2‖2 = |a22|2 = ‖T∗b2‖2 and consequently

|a22|2 = ∑nj=2 |a2j|2,

which implies that a2j = 0 for j = 3, . . . , n. Continuing this process, we get that M is diagonal. Conversely, if the representation for T is diagonal, then so is the representation for T∗, and the result follows since any two diagonal matrices commute. ◻

This theorem finishes this section, which is by no means complete. There are many fascinating results in the spectral theory of operators in finite dimensions. For example, here we just dealt with the case of a complex field and, as the reader has probably noticed, the theory cannot be transferred verbatim to the case of a real field. We refer the interested reader to Ref. [1] for a simple proof in the case of a self-adjoint operator acting on a finite-dimensional real space. Here, we wanted to show one instance of the spectral theorem before dealing with it in infinite-dimensional Hilbert spaces.

17.2 The Spectral Theorem for Compact Self-Adjoint Operators

In the previous section we studied conditions under which a linear transformation is "diagonalizable," in the sense that there exists an orthonormal basis inducing a diagonal matrix representation. In this section we will study a similar result in the context of infinite-dimensional Hilbert spaces. We want to make clear that we will only touch a very shallow part of the very deep ocean that spectral theory represents, with the hope of awakening the reader's curiosity toward more advanced courses. We saw that a central role was played by the eigenvalues and eigenvectors of a linear transformation. We will start by studying some of their properties in this new setting.


Theorem 17.5. Let H be a complex Hilbert space and let T : H → H be a self-adjoint operator. Then every eigenvalue of T is a real number.

Proof. Suppose that λ ∈ ℂ is an eigenvalue for T and let v ∈ H, v ≠ 0, be such that (T – λI)v = 0. Since T is self-adjoint, (T – λI)∗ = T – λ̄I, and by Problem 16.7(a) we also have (T – λ̄I)v = 0; consequently λ̄v = Tv = λv. Hence (λ – λ̄)v = 0, which implies that λ = λ̄. ◻

We can actually say a little more about the eigenvalues of self-adjoint operators in the case of compact operators.

Theorem 17.6. Let H be a Hilbert space and let T : H → H be a compact self-adjoint operator. Then either ‖T‖ or –‖T‖ is an eigenvalue for T.

Proof. If T = 0, then ‖T‖ = 0 and the result follows trivially, so we suppose that T ≠ 0. We use Theorem 16.13 to choose a sequence (vn) ⊆ H, ‖vn‖ = 1, such that |⟨Tvn, vn⟩| → ‖T‖. Since each ⟨Tvn, vn⟩ is real, there exists a subsequence (renamed (vn)) such that the sequence (⟨Tvn, vn⟩) converges to a real number λ with |λ| = ‖T‖. It remains to show that λ is an eigenvalue for T. Notice that

‖(T – λI)vn‖2 = ⟨(T – λI)vn, (T – λI)vn⟩ = ‖Tvn‖2 – 2λ⟨Tvn, vn⟩ + λ2‖vn‖2 ≤ |λ|2 – 2λ⟨Tvn, vn⟩ + λ2 → 0.

Now, since T is compact and ‖Tvn‖ ≤ ‖T‖, there exists a further subsequence (vnk) ⊆ (vn) such that ‖Tvnk – v‖ → 0 for some v ∈ H. Thus ‖λvnk – v‖ → 0 (note that v ≠ 0, since ‖v‖ = lim ‖λvnk‖ = |λ| > 0), and so lim Tvnk = lim λvnk = v, which, by the continuity of T, implies that

Tv = λ lim Tvnk = λv. ◻

Theorem 17.7. Let H be a Hilbert space over a field 𝔽 and let T : H → H be a compact linear operator. Then the set of eigenvalues of T is countable, and its only possible accumulation point is λ = 0.

Proof. We will show that for every real number r > 0, the set of eigenvalues of T that belong to 𝔽\Br(0) is finite. Suppose, on the contrary, that it is possible to find r0 > 0 and an infinite sequence of distinct eigenvalues (λj) ⊆ 𝔽\Br0(0). Then for each j there exists a corresponding eigenvector vj ∈ H. Let Hk be the vector space generated by {v1, . . . , vk}. By Problem 17.1 we know that the vj's


are linearly independent, and so if v ∈ Hk we have that v = ∑kj=1 αjvj for some unique αj ∈ 𝔽. Notice that

(T – λkI)v = ∑k–1j=1 αj(λj – λk)vj ∈ Hk–1.

Since each Hk–1 is closed in Hk, we can use Lemma 3.14 to find vectors wk ∈ Hk, ‖wk‖ = 1, such that

‖wk – w‖ ≥ 1/2 for all w ∈ Hk–1. (17.1)

We will show that the sequence (Twk) has no convergent subsequence, which contradicts the hypothesis that T is compact. Let n and m be positive integers with m < n; then wm ∈ Hn–1 and hence (T – λmI)wm ∈ Hn–1. Consequently Twm ∈ Hn–1 and, since (T – λnI)wn ∈ Hn–1, we have that Twm – (T – λnI)wn ∈ Hn–1 and, by eq. (17.1),

‖Twn – Twm‖ = ‖λnwn – [Twm – (T – λnI)wn]‖ = |λn| ‖wn – λn⁻¹[Twm – (T – λnI)wn]‖ ≥ |λn|/2 ≥ r0/2.

This finishes the proof. ◻

Now we are ready to prove the main theorem of this section.

Theorem 17.8. Let H be a Hilbert space and let T : H → H be a compact self-adjoint operator. Then

T = ∑∞n=1 λnPn,

where {λ1, λ2, . . . } are the distinct nonzero eigenvalues of T and each Pn : H → ker(T – λnI) denotes the orthogonal projection.

Proof. We may assume, using Theorem 17.6 and after reordering if necessary, that |λ1| = ‖T‖. We write N1 = ker(T – λ1I) and notice that H = N1 ⊕ N1⊥; from Problem 16.7


we have that T(N_1^⊥) ⊆ N_1^⊥. Let T_2 be the restriction of T to the space N_1^⊥, that is, T_2 = T|_{N_1^⊥}. We leave it as an exercise to show that T_2 is also compact and self-adjoint. Now apply the same reasoning to the operator T_2 to find an eigenvalue λ_2 such that |λ_2| = ‖T_2‖. Let N_2 = ker(T_2 − λ_2 I) and notice that N_2 = ker(T − λ_2 I), which implies that λ_2 ≠ λ_1. Moreover, since clearly ‖T_2‖ ≤ ‖T‖, we have |λ_2| ≤ |λ_1|. Let T_3 be the restriction of T to (N_1 ⊕ N_2)^⊥ and continue the process. This way we construct a sequence of (real) eigenvalues of T such that |λ_1| ≥ |λ_2| ≥ ⋅⋅⋅. Moreover, since the sequence (λ_n)_{n=1}^∞ is bounded, we conclude from Theorem 17.7 that λ_n → 0. Now fix n ∈ ℕ and let 1 ≤ k ≤ n. If v ∈ N_k, we have that

Tv − ∑_{j=1}^{n} λ_j P_j v = Tv − λ_k v = 0.

On the other hand, if v ∈ (N_1 ⊕ ⋅⋅⋅ ⊕ N_n)^⊥, then P_k(v) = 0 for every 1 ≤ k ≤ n, and consequently

Tv − ∑_{j=1}^{n} λ_j P_j v = Tv.

Therefore,

‖T − ∑_{j=1}^{n} λ_j P_j‖ = ‖T_{n+1}‖ = |λ_{n+1}| → 0,

which finishes the proof.
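In finite dimensions the conclusion of Theorem 17.8 can be verified by hand. The following sketch (plain Python; the 2×2 symmetric matrix and its entries are arbitrary illustrative choices, and the eigenvector formula assumes b ≠ 0) computes the eigenvalues from the characteristic polynomial, builds the two orthogonal projections P_n, and checks that T = λ_1 P_1 + λ_2 P_2:

```python
import math

# A 2x2 real symmetric matrix [[a, b], [b, c]] is a self-adjoint operator
# on R^2.  (Illustrative values; the eigenvector formula assumes b != 0.)
a, b, c = 2.0, 1.0, 3.0

# Eigenvalues: roots of t^2 - (a + c) t + (a c - b^2).
disc = math.sqrt((a - c) ** 2 + 4 * b * b)
lam1 = (a + c + disc) / 2
lam2 = (a + c - disc) / 2

# Unit eigenvector for lam1; the second one is orthogonal to it.
v1 = (b, lam1 - a)
n1 = math.hypot(*v1)
v1 = (v1[0] / n1, v1[1] / n1)
v2 = (-v1[1], v1[0])

def proj(v):
    """Orthogonal projection onto span{v} as the rank-one matrix v v^T."""
    return [[v[0] * v[0], v[0] * v[1]], [v[1] * v[0], v[1] * v[1]]]

P1, P2 = proj(v1), proj(v2)

# Reassemble T = lam1 * P1 + lam2 * P2 and compare entrywise with [[a, b], [b, c]].
T = [[lam1 * P1[i][j] + lam2 * P2[i][j] for j in range(2)] for i in range(2)]
expected = [[a, b], [b, c]]
for i in range(2):
    for j in range(2):
        assert abs(T[i][j] - expected[i][j]) < 1e-12
```

The same reassembly works for any Hermitian matrix; for a compact self-adjoint operator on an infinite-dimensional space, the finite partial sums play the role of T above.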

Remark 17.9. As previously said, we intend only to give a brief introduction to the broad and deep subject of spectral theory. Any list of references we could give here would surely not be enough to cover it, so we only cite the article [10], which will lead interested readers to pursue their own search in their favorite topics. ⊘

17.3 Problems

17.1. Let V be a vector space and let T : V → V be a bounded linear transformation with at least two different eigenvalues. Prove that eigenvectors corresponding to different eigenvalues are linearly independent.

17.2. Suppose that T : ℂⁿ → ℂⁿ is a linear transformation and suppose that B = {b_1, . . . , b_n} is a basis for ℂⁿ such that the matrix representation of T with respect to B is upper triangular. Let C = {c_1, . . . , c_n} be the orthonormal basis resulting from applying the Gram–Schmidt process to B. Show that the matrix representation of T with respect to C is also upper triangular.

17.3. Let H be a Hilbert space and let T : H → H be a compact operator. Suppose that λ ∈ ℂ is a nonzero eigenvalue of T. Show that dim(ker(T − λI)) < ∞.

17.4. Prove that the operators T_n defined in the proof of Theorem 17.8 are self-adjoint and compact.

17.5. Let (V, ‖⋅‖) be a Banach space over the field ℂ and let T : V → V be a bounded linear operator. Define the spectrum of T as σ(T) = {λ ∈ ℂ : T − λI is not invertible}. Show that
(a) σ(T) ≠ ∅;
(b) σ(T) is a compact set;
(c) σ(T) ⊆ B_{‖T‖}(0).

17.6. Prove that, in a Hilbert space, the eigenvectors of a normal operator corresponding to different eigenvalues are orthogonal.

17.7. Suppose H is a Hilbert space and P : H → H is a bounded linear operator satisfying P² = P. Prove that the following are equivalent:
(a) P is an orthogonal projection onto its range;
(b) P is self-adjoint;
(c) P is normal.

18 Compactness

Learning Targets
✓ Learn the notion of compactness in metric spaces.
✓ Understand the several equivalent conditions of compactness.
✓ Get to know some criteria for compactness in some function spaces.

The notion of compactness is essential throughout analysis. For example, the Lindelöf Theorem states that for an arbitrary set X ⊆ ℝⁿ, every open covering ⋃_{j∈J} G_j of X admits a countable subcovering, namely X ⊆ ⋃_{n∈ℕ} A_n, where each A_n belongs to the family {G_j}_{j∈J}. For some sets K ⊆ ℝⁿ this statement can be substantially improved, as the Borel–Lebesgue Theorem (also known as the Heine–Borel Theorem) affirms: if K is a closed and bounded set, then every open covering of K admits a finite subcovering. This property, that every open covering admits a finite subcovering, is called compactness. In the realm of the real line, the question of compactness can be summarized in the following theorem (cf. Ref. [34]).

Theorem 18.1. Let K ⊆ ℝ. The following assertions are equivalent:
(a) K is closed and bounded;
(b) every open covering of K admits a finite subcovering;
(c) every infinite subset of K has an accumulation point belonging to K;
(d) every sequence of points of K has a subsequence converging to some point of K.

18.1 Metric Spaces

In this section we study the question of compactness in the framework of metric spaces. We start with the notion of a sequentially compact set in a metric space, which is the analogue of condition (d) in Theorem 18.1.

Definition 18.2 (Sequentially Compact Set). Let X ⊆ M be a subset of a metric space (M, d). We say that X is sequentially compact if every sequence in X has a subsequence converging in X.

We know that in the case of the real line ℝ, the Bolzano–Weierstrass Theorem asserts that when X ⊆ ℝ is closed and bounded, then it is sequentially compact. In fact, for the real line these two notions are equivalent, see Theorem 18.1. An immediate consequence of the definition of a sequentially compact set is that the set must be bounded.


Lemma 18.3. A sequentially compact set K of a metric space is bounded.

For a direct proof of Lemma 18.3 see Problem 18.10. For a different proof, based on the Hausdorff compactness criterion, see Corollary 18.12. The following theorem should be well known to the reader in the case of subsets of the real line, but since it is highly instructive we give its formulation in the metric space framework (see §9.2 for some applications of this theorem to prove existence results).

Theorem 18.4 (Cantor Nested Set Theorem). Given a nested sequence K_1 ⊃ K_2 ⊃ ⋅⋅⋅ ⊃ K_n ⊃ ⋅⋅⋅ of nonempty closed sequentially compact sets in a metric space X, the intersection K = ⋂_{i=1}^∞ K_i is nonempty.

Proof. Select a point x_i in each set K_i to form the sequence (x_i)_{i∈ℕ}. Since K_1 is sequentially compact, a convergent subsequence (x_{i_k}) can be extracted from (x_i). Let x_0 = lim_k x_{i_k}. For any fixed n, all terms x_{i_k} with i_k > n belong to K_n; since K_n is closed, it follows that x_0 ∈ K_n. Therefore x_0 ∈ ⋂_{i=1}^∞ K_i, and we are through. ◻

The next theorem states that the Weierstrass Theorem is also valid in the metric space framework.

Theorem 18.5. Let K be a sequentially compact set of the space X and f be a continuous functional defined on this set. Then
(a) the functional f is bounded on K; and
(b) the functional f attains its supremum and infimum on K.

Proof. (a) It is required to show that the functional f is bounded above (boundedness below can be demonstrated analogously). Assume the contrary. Then there exists a sequence (x_n)_{n∈ℕ} of points of K such that f(x_n) > n. Since the set K is sequentially compact, (x_n)_{n∈ℕ} contains a subsequence (x_{n_k})_{k∈ℕ} which converges to a point x_0 ∈ K. However, then f(x_{n_k}) > n_k and, consequently, f(x_{n_k}) → ∞ as k → ∞. On the other hand,

f(x_{n_k}) → f(x_0) as k → ∞,  (18.1)

since the functional is continuous everywhere on K and, in particular, at the point x_0. This is a contradiction, and hence f is bounded.

(b) Let β = sup_{x∈K} f(x), implying that f(x) ≤ β for all x ∈ K and that for every ε > 0 there is a point x_ε ∈ K such that f(x_ε) > β − ε. Hence there exists a sequence (x_n)_{n∈ℕ} such that

β − 1/n < f(x_n) ≤ β.  (18.2)


Since K is sequentially compact, (x_n)_{n∈ℕ} contains a subsequence (x_{n_k})_{k∈ℕ} convergent to a point x_0 ∈ K. Then

β − 1/n_k < f(x_{n_k}) ≤ β, and, therefore, lim_{k→∞} f(x_{n_k}) = β.

On the other hand, lim_{k→∞} f(x_{n_k}) = f(x_0), since f is continuous at all points of the set K and, in particular, at the point x_0. Hence f(x_0) = β, yielding the desired proof. Analogously, it can be proved that if α = inf_{x∈K} f(x), then there is a point ξ_0 ∈ K such that f(ξ_0) = α. ◻

We now introduce a notion related to sequential compactness.

Definition 18.6 (Totally Bounded Set). Let X ⊆ M be a subset of a metric space (M, d). We say that X is totally bounded if, for all ε > 0, there is a finite number of sets A_1, . . . , A_n such that X ⊆ A_1 ∪ A_2 ∪ ⋅⋅⋅ ∪ A_n and moreover each set A_k has diameter less than ε (Fig. 18.1).

At first sight it seems that the notion of totally bounded is the same as bounded, but the former concept is stronger than the latter, as the following example shows.

Example 18.7. Let us take the points

x_1 = (1, 0, 0, ⋅⋅⋅),
x_2 = (0, 1, 0, ⋅⋅⋅),
⋅⋅⋅
x_n = (0, ⋅⋅⋅, 0, 1, 0, ⋅⋅⋅)  (with 1 in the n-th coordinate),

Figure 18.1: Totally bounded set.


from the space ℓ^∞. The set of these points is bounded, but not totally bounded. ⊘

Another related notion is the ε-net.

Definition 18.8 (ε-net). Let X ⊆ M be a subset of a metric space (M, d). We say that N_ε is an ε-net for the set X if for all x ∈ X there exists x_ε ∈ N_ε such that d(x, x_ε) < ε (Fig. 18.2).

Based on the notion of ε-net we can formulate an important compactness criterion.

Theorem 18.9 (Hausdorff Compactness Criterion). For a closed set K in a metric space X to be sequentially compact, it is necessary and, in the case of completeness of X, sufficient that there exists a finite ε-net for the set K for every ε > 0.

Proof. Assume that K is sequentially compact and let us take any x_1 from K. If N_1 := {x_1} is an ε-net of K, we stop. Otherwise, we pick an x_2 such that d(x_1, x_2) ≥ ε. If N_2 := {x_1, x_2} is an ε-net, we stop. Otherwise, we pick an x_3 such that d(x_1, x_3) ≥ ε and d(x_2, x_3) ≥ ε. We continue with this algorithm until we obtain a finite ε-net. The algorithm must indeed terminate: if we could continue ad infinitum, we would construct a sequence (w_j)_{j∈ℕ} with the property d(w_i, w_j) ≥ ε for all i ≠ j, which has no convergent subsequence, contradicting the sequential compactness of K.

Conversely, let us assume that the space X is complete and that K has a finite ε-net for every ε > 0. We take the sequence (ε_n)_{n∈ℕ}, where ε_n = 1/n, and construct the respective finite ε_n-nets N_n := {x_1^{(n)}, . . . , x_{k_n}^{(n)}}.

Let now S be any infinite subset of K. We apply a greedy algorithm in the following way: since K can be covered by a finite number of balls with centers in (x_j^{(1)})_{j=1}^{k_1}, at least one of these balls must contain an infinite number of elements of S; let us call this ball B_1. Again, since K can be covered by a finite number of balls with

Figure 18.2: ε-net.


centers in (x_j^{(2)})_{j=1}^{k_2}, at least one of these balls must contain an infinite number of elements of B_1; let us call this ball B_2. We now iterate this algorithm and obtain a nested sequence of balls

B_1 ⊃ B_2 ⊃ B_3 ⊃ ⋅⋅⋅ ⊃ B_n ⊃ ⋅⋅⋅,

each of which contains an infinite number of elements, and the elements of each ball are near each other, in the sense that if v, w ∈ B_k then d(v, w) ≤ 2/k.

We now construct the sequence (b_j)_{j=1}^∞ choosing b_j ∈ B_j in such a way that b_j ≠ b_i if i ≠ j. The sequence (b_j)_{j=1}^∞ is a Cauchy sequence, which is convergent by the fact that the space X is complete; since K is closed, the limit belongs to K, which entails the result. ◻

The next corollary is based on a two-step approximation.

Corollary 18.10. Let X be a complete metric space. A set K is sequentially compact if, and only if, there exists a sequentially compact ε-net for K for every ε > 0.

The proof of Corollary 18.10 is left as Problem 18.4. The Hausdorff compactness criterion immediately implies the separability of the space.

Corollary 18.11. A compact space X is separable.

Proof. Let us take a sequence (ε_n) such that ε_n → 0 and construct, for every ε_n, a finite ε_n-net N_n = (x_i^{(n)})_{i=1,...,k_n}. Taking N = ⋃_{n=1}^∞ N_n we obtain a countable dense set in X. ◻
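The greedy construction in the first half of the proof of Theorem 18.9 can be carried out numerically. The sketch below (plain Python; the random point cloud and the value of ε are arbitrary illustrative choices) keeps only points at distance at least ε from all previously chosen centers; the resulting finite set is an ε-net:

```python
import math
import random

def greedy_eps_net(points, eps):
    """Greedily choose centers pairwise at least eps apart; they form an eps-net."""
    net = []
    for p in points:
        # If p is eps-far from every chosen center, it becomes a new center;
        # otherwise it is already eps-covered by an existing one.
        if all(math.dist(p, c) >= eps for c in net):
            net.append(p)
    return net

random.seed(0)
cloud = [(random.random(), random.random()) for _ in range(500)]
net = greedy_eps_net(cloud, eps=0.2)

# Every point is within eps of some center ...
assert all(any(math.dist(p, c) < 0.2 for c in net) for p in cloud)
# ... and the centers are pairwise eps-separated, which is what forces the
# net to be finite for a totally bounded set.
assert all(math.dist(u, v) >= 0.2 for i, u in enumerate(net) for v in net[i + 1:])
```

The same two assertions mirror the two halves of the argument: covering (the net property) and separation (the reason the algorithm terminates).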

Another corollary of the fact that a sequentially compact set is characterized by the existence of a finite ε-net for all ε > 0 is the boundedness of the set, which can also be proved directly from the definition, see Problem 18.10.

Corollary 18.12. A sequentially compact set K of a metric space is bounded.

Proof. Let N_1 = {x_1, . . . , x_n} be a 1-net for K and let a be a fixed element of the space X. Further, let d = max_i d(a, x_i). Then, evidently, d(x, a) ≤ 1 + d for every point x ∈ K. ◻

Let us now introduce the concept of an open covering.


Definition 18.13 (Open Covering). Let X ⊆ M be a subset of a metric space (M, d). We say that a family (G_j)_{j∈J} of open sets of M is an open cover or an open covering of the set X if

X ⊆ ⋃_{j∈J} G_j.

We now introduce the notion of a compact set based on the Borel–Lebesgue Theorem.

Definition 18.14 (Compact set). Let K ⊆ M be a subset of a metric space (M, d). We say that K is a compact set if every open covering of K,

K ⊆ ⋃_{j∈J} G_j,

admits a finite subcovering of K, viz.

K ⊆ ⋃_{k=1}^{n} G_{j_k}.

Theorem 18.15. Let K be a compact set of a metric space X. Then K is closed.

Proof. We argue by contradiction. Suppose that K is not closed; then there exists an element v ∈ K̄\K. Defining

G_n = X\ B̄_{1/n}(v)

(the complement of the closed ball, so that each G_n is open), we see that (G_n)_{n∈ℕ} is an open covering of K (since ⋃_{n∈ℕ} G_n = X\{v}). On the other hand, we cannot extract a finite subcovering from (G_n)_{n∈ℕ}, since B̄_{1/n}(v) ∩ K ≠ ∅ for all n ∈ ℕ. ◻

Using the previous result we can prove that closed subsets of compact metric spaces are compact. Our aim is to show the relation between compact sets and sequentially compact sets.

Theorem 18.16. Let F be a closed set of a metric space M. Then F is sequentially compact if, and only if, F is compact.

Proof. Let F be sequentially compact and suppose that it fails to be compact, i.e., there exists an open covering from which it is not possible to extract a finite subcovering.


To fix ideas, let {G_α} be an open covering of F which does not admit a finite subcovering of F. Let us take a positive sequence (ε_n)_{n∈ℕ} which converges monotonically to zero, i.e., ε_n ↓ 0. Let x_1^{(1)}, . . . , x_{k_1}^{(1)} be a finite ε_1-net for F. We can write F = ⋃_{j=1}^{k_1} F_j, where F_j = B̄_{ε_1}(x_j^{(1)}) ∩ F. Each F_j is a closed sequentially compact set of diameter at most 2ε_1. Since F cannot be covered by a finite subfamily of {G_α}, at least one of the F_j cannot be covered by a finite subfamily of {G_α}; let us denote that set by F_{j_1}. Applying the same algorithm to F_{j_1} with an ε_2-net, we obtain a closed sequentially compact set F_{j_2} ⊆ F_{j_1} of diameter at most 2ε_2 that cannot be covered by a finite subfamily of {G_α}. Iterating this process we obtain a nested sequence of closed sequentially compact sets

F_{j_1} ⊃ F_{j_2} ⊃ ⋅⋅⋅ ⊃ F_{j_n} ⊃ ⋅⋅⋅

whose diameters tend to zero. By the Cantor Nested Set Theorem there exists an element x contained in all F_{j_n}, n ∈ ℕ. Since {G_α} is a covering of F, there exists a G_{α_0} which contains the point x, and due to the openness of G_{α_0} there exists a ball B_r(x) ⊆ G_{α_0}. Since the diameters of F_{j_n} tend to zero, there exists an order k such that F_{j_k} ⊆ B_r(x), but this entails a contradiction, since the sets F_{j_n} are exactly those that cannot be covered by a finite subfamily of {G_α}. Therefore, every open covering of F admits a finite subcovering, which is exactly what we wanted to prove: F is compact.

Let us now suppose that F is compact but not sequentially compact, i.e., that there exists an infinite subset S of F with no accumulation point in F. Then, for each x ∈ F, there exists an r_x > 0 such that B_{r_x}(x) contains no element of S, except possibly x itself. The family (B_{r_x}(x))_{x∈F} is an open covering of F from which we can extract a finite subcovering, say B_{r_1}(x_1), . . . , B_{r_k}(x_k). Since S is contained in the union of these finitely many balls, and each ball contains at most one element of S, the set S must be finite, a contradiction. Therefore, every infinite subset of F has an accumulation point in F, i.e., F is sequentially compact. ◻

The previous theorem is the reason why, in the framework of metric spaces, we speak about compactness and sequential compactness interchangeably, dropping the adjective "sequential".

We now show that compact sets behave well under a continuous mapping, i.e., the image is itself a compact set.

Theorem 18.17 (Weierstrass Theorem). The image of a compact set by a continuous mapping is compact.

Proof. We will use sequential compactness. Let f : X → Y be continuous, with X compact, and let (y_n)_{n∈ℕ} be an arbitrary sequence of f(X) ⊆ Y. For each element y_n we choose one of its pre-images x_n. Due to the compactness of X, from (x_n)_{n∈ℕ} we can extract a subsequence (x_{n_k}) converging to some x ∈ X. Since f is continuous, it follows that y_{n_k} → y = f(x), since


y_{n_k} = f(x_{n_k}) → f(x) =: y. Therefore, f(X) is compact. ◻

We now show that the Cartesian (countable) product of compact spaces is again a compact space. This result is known in the literature as the Tikhonov Theorem, though the Russian name is sometimes also transliterated as Tychonoff, Tychonov and even Tichonov.

Definition 18.18 (Cartesian Product of Metric Spaces). Let (M_1, d_1) and (M_2, d_2) be two metric spaces. The Cartesian product M_1 × M_2 can be given the structure of a metric space, namely (M_1 × M_2, d_1 × d_2), where the metric d_1 × d_2 can be taken in several ways, e.g.,
(a) d_1 × d_2 (z, w) = d_1(x_1, y_1) + d_2(x_2, y_2),
(b) d_1 × d_2 (z, w) = max {d_1(x_1, y_1), d_2(x_2, y_2)},
(c) d_1 × d_2 (z, w) = √((d_1(x_1, y_1))² + (d_2(x_2, y_2))²),
where z = (x_1, x_2) and w = (y_1, y_2).

The next result shows that the Cartesian product of compact spaces is compact. We will use the sequential compactness approach and, to avoid cumbersome notation in the proof, we define a subsequence in a more appropriate way.

Definition 18.19 (Subsequence). Let (x_n)_{n∈ℕ} be a sequence. A subsequence of (x_n)_{n∈ℕ} is any (x_n)_{n∈ℕ_1}, where the set ℕ_1 satisfies the following properties:
(a) ℕ_1 ⊆ ℕ;
(b) ℕ_1 is infinite;
(c) the elements of ℕ_1 are listed in increasing order, i.e., ℕ_1 = {n_1 < n_2 < ⋅⋅⋅}.
We will denote these properties by ℕ_1 ⊑ ℕ. By lim_{n∈ℕ_1} x_n we denote the limit of the subsequence (x_n)_{n∈ℕ_1}.
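A quick numerical sanity check of Definition 18.18: the three product metrics are equivalent, since max ≤ √(sum of squares) ≤ sum ≤ 2·max pointwise. The sketch below (plain Python; the sample points are arbitrary) verifies these inequalities for the product of two copies of ℝ with d(x, y) = |x − y|:

```python
import math
import random

# Product metrics on R x R built from d(x, y) = |x - y| on each factor.
def d_sum(z, w):
    return abs(z[0] - w[0]) + abs(z[1] - w[1])

def d_max(z, w):
    return max(abs(z[0] - w[0]), abs(z[1] - w[1]))

def d_euclid(z, w):
    return math.sqrt((z[0] - w[0]) ** 2 + (z[1] - w[1]) ** 2)

random.seed(1)
for _ in range(1000):
    z = (random.uniform(-5, 5), random.uniform(-5, 5))
    w = (random.uniform(-5, 5), random.uniform(-5, 5))
    # max <= euclid <= sum <= 2 * max, so all three induce the same topology.
    assert d_max(z, w) <= d_euclid(z, w) + 1e-12
    assert d_euclid(z, w) <= d_sum(z, w) + 1e-12
    assert d_sum(z, w) <= 2 * d_max(z, w) + 1e-12
```

Because the three metrics are equivalent, the choice among (a), (b), (c) does not affect which subsets of the product are (sequentially) compact.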

Theorem 18.20 (Tikhonov Theorem). The Cartesian product

M = ∏_{n∈ℕ} M_n

is sequentially compact if and only if each M_n is sequentially compact.


Proof. To get the gist of the proof, we start with the result that if (M_1, d_1) and (M_2, d_2) are sequentially compact, then the metric space (M_1 × M_2, d_1 × d_2) is also sequentially compact. Let (w_n)_{n∈ℕ} be a sequence in M_1 × M_2 of the form w_n = (x_n, y_n), where x_n ∈ M_1 and y_n ∈ M_2. Since M_1 is sequentially compact, there exists a convergent subsequence (x_n)_{n∈ℕ_1}. Taking now the sequence (y_n)_{n∈ℕ_1}, which belongs to M_2, and using the fact that M_2 is sequentially compact, we obtain a convergent subsequence (y_n)_{n∈ℕ_2}, where ℕ_2 ⊑ ℕ_1. It follows that (w_n)_{n∈ℕ_2} is convergent in M_1 × M_2, yielding the result.

To prove the general result, it suffices to show that if x_n = (x_n^{(1)}, x_n^{(2)}, x_n^{(3)}, ⋅⋅⋅, x_n^{(k)}, ⋅⋅⋅) is a sequence in M, then there exists a set ℕ_∞ ⊑ ℕ such that, for all k, the limit lim_{n∈ℕ_∞} x_n^{(k)} = x^{(k)} exists. In the following scheme, on the left-hand side we have the sequences of coordinates and on the right-hand side the index set along which each of them has a convergent subsequence:

(x_1^{(1)}, x_2^{(1)}, x_3^{(1)}, ⋅⋅⋅, x_k^{(1)}, ⋅⋅⋅),   ℕ_1 ⊑ ℕ
(x_1^{(2)}, x_2^{(2)}, x_3^{(2)}, ⋅⋅⋅, x_k^{(2)}, ⋅⋅⋅),   ℕ_2 ⊑ ℕ_1
(x_1^{(3)}, x_2^{(3)}, x_3^{(3)}, ⋅⋅⋅, x_k^{(3)}, ⋅⋅⋅),   ℕ_3 ⊑ ℕ_2        (18.3)
⋅⋅⋅
(x_1^{(k)}, x_2^{(k)}, x_3^{(k)}, ⋅⋅⋅, x_k^{(k)}, ⋅⋅⋅),   ℕ_k ⊑ ℕ_{k−1}
⋅⋅⋅

From eq. (18.3) we know that

lim_{n∈ℕ_i} x_n^{(i)} = x^{(i)}.

We now define ℕ_∞ by the diagonal procedure

ℕ_∞ := {n_i : n_i is the i-th element of ℕ_i},

so that, apart from at most its first i − 1 elements, ℕ_∞ is contained in ℕ_i for every i. It follows that

lim_{n∈ℕ_∞} x_n^{(i)} = x^{(i)}

for all i ∈ ℕ. ◻
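The diagonal procedure of the proof can be made concrete when each coordinate takes only finitely many values, so that "extract a convergent subsequence" becomes "keep the largest bucket of indices sharing a common value" (a pigeonhole step). The sketch below (plain Python; the coordinate functions x(n, k) are an arbitrary illustrative choice) refines the index sets ℕ ⊒ ℕ_1 ⊒ ℕ_2 ⊒ ⋅⋅⋅ and then takes the diagonal:

```python
from collections import defaultdict

def x(n, k):
    """k-th coordinate of the n-th term; each coordinate takes finitely many values."""
    return (n % (k + 2)) / (k + 2)

N0 = list(range(10000))      # indices of a long finite prefix of the sequence
n_sets = [N0]
for k in range(5):           # refine along the first five coordinates
    prev = n_sets[-1]
    # Pigeonhole step: keep the largest bucket of indices sharing one value of
    # coordinate k; along it, that coordinate is constant (hence convergent).
    buckets = defaultdict(list)
    for n in prev:
        buckets[x(n, k)].append(n)
    n_sets.append(max(buckets.values(), key=len))

# Diagonal: the i-th element of the i-th refined index set.
diag = [n_sets[i + 1][i] for i in range(5)]
assert diag == sorted(diag)  # the diagonal is again an increasing index set
for k in range(5):
    # Along the diagonal, every coordinate is eventually constant.
    assert len({x(n, k) for n in diag[k:]}) == 1
```

The two assertions are exactly the two facts the proof needs: the diagonal is a legitimate index set ℕ_∞ ⊑ ℕ, and beyond its k-th element it lies inside ℕ_k, so every coordinate converges along it.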

It should be pointed out that the full version of Tikhonov's Theorem (for arbitrary products) is equivalent to the Axiom of Choice; see the original proof in Ref. [19]. We end this section with the notion of the Lebesgue number of a covering and its relation with compact metric spaces.


Definition 18.21 (Lebesgue Number of a Covering). A number ε > 0 is said to be a Lebesgue number of a covering M = ⋃_{j∈J} S_j if every subset X ⊆ M with diam(X) < ε is contained in some S_j of the covering.

If some number ε_0 is a Lebesgue number for a covering, then it follows immediately that every ε with 0 < ε < ε_0 is still a Lebesgue number. We now show that even finite coverings of a metric space can fail to have a Lebesgue number.

Example 18.22. Let ℝ² = X ∪ Y, where X = {(x, y) ∈ ℝ² : x ≠ 0} and Y = {(x, y) ∈ ℝ² : xy ≠ 1}. For ε > 0, let us choose x such that 0 < x < ε. Then, taking the points w = (0, 1/x) and z = (x, 1/x), we have that d(w, z) < ε, but w ∉ X and z ∉ Y, so the set {w, z} has diameter less than ε and is contained in neither X nor Y. Since this is valid for all ε > 0, this covering does not have any Lebesgue number (Fig. 18.3). ⊘


Figure 18.3: Lebesgue number.


One of the reasons that the previous example fails is the fact that the set ℝ² is unbounded. For a result in the vein of the previous example, see Problem 18.12.

Theorem 18.23. Every open covering of a compact metric space (M, d) has a Lebesgue number.

Proof. We argue by contradiction. Suppose that the covering M = ⋃_{j∈J} G_j fails to have a Lebesgue number. Then, for all n ∈ ℕ, there exists a set S_n ⊆ M with diam(S_n) < 1/n such that S_n is not contained in any G_j. Let us choose in each S_n an element x_n. Passing to a subsequence if necessary, we may assume that lim_{n→∞} x_n = a ∈ M. We have that a ∈ G_j for some j ∈ J. Since the sets G_j are open, there exists an ε > 0 such that B_ε(a) ⊆ G_j. Let us take n ∈ ℕ such that 1/n < ε/2 and d(a, x_n) < ε/2. Then

y ∈ S_n ⇒ d(a, y) ≤ d(a, x_n) + d(x_n, y) < ε/2 + ε/2 = ε,

so that S_n ⊆ B_ε(a) ⊆ G_j, contradicting the fact that S_n is not contained in any G_j. ◻

18.2 Compactness in Some Function Spaces

18.2.1 Space ℓ2

For x = (x_1, x_2, ⋅⋅⋅, x_k, ⋅⋅⋅) ∈ ℓ2, let F_n(x) = (x_n, x_{n+1}, ⋅⋅⋅) denote the n-th tail of x.

Definition 18.25 (Equiconvergent Set). We say that a set X ⊆ ℓ2 is equiconvergent if, for all ε > 0, there exists an n ∈ ℕ such that for all x = (x_1, ⋅⋅⋅, x_k, ⋅⋅⋅) ∈ X we have ‖F_n(x)‖_{ℓ2} < ε.

Example 18.26 (Hilbert Cube). By the Hilbert cube H we mean the following set:

H := {x ∈ ℝ^ℕ : 0 ≤ x_n ≤ 1/n},

where x_n is the n-th component of the vector x = (x_1, x_2, ⋅⋅⋅, x_n, ⋅⋅⋅). By the hyperharmonic series criterion (∑ 1/n² < ∞) we have that H ⊆ ℓ2. We now show that H is equiconvergent. Let us fix ε > 0 and choose n ∈ ℕ such that ∑_{k=n}^∞ 1/k² ≤ ε² (which is possible since the series converges). For all x ∈ H we then have ‖F_n(x)‖_{ℓ2} ≤ (∑_{k=n}^∞ 1/k²)^{1/2} ≤ ε. Since ε > 0 was arbitrary, we obtain the result. ⊘
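The uniformity in Example 18.26 can be observed numerically: the tail norm of every point of H is dominated by the tail norm of the "corner" point x_k = 1/k. The sketch below (plain Python; the truncation length, tail index, and random sample are arbitrary illustrative choices) checks this domination together with the integral comparison bound ∑_{k≥n} 1/k² ≤ 1/(n−1):

```python
import math
import random

K = 20000  # finite truncation length for the sequences

def tail_norm(x, n):
    """l2 norm of the tail (x_n, x_{n+1}, ...) of a truncated sequence (1-based n)."""
    return math.sqrt(sum(v * v for v in x[n - 1:]))

# The "corner" of the Hilbert cube dominates every other point coordinatewise.
corner = [1.0 / k for k in range(1, K + 1)]

random.seed(2)
samples = [[random.uniform(0.0, 1.0 / k) for k in range(1, K + 1)] for _ in range(20)]

n = 1000
bound = tail_norm(corner, n)
for x in samples:
    # 0 <= x_k <= 1/k implies the tail of x is dominated by the corner's tail.
    assert tail_norm(x, n) <= bound + 1e-12
# Integral comparison: sum_{k >= n} 1/k^2 <= 1/(n - 1), so the bound is small.
assert bound <= math.sqrt(1.0 / (n - 1))
```

One choice of n thus controls the tails of all points of the cube at once, which is exactly the definition of equiconvergence.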

We now state and prove the ℓ2 compactness criterion.

Theorem 18.27. Let X ⊆ ℓ2. The set X is compact if, and only if, X is bounded, closed and equiconvergent.

Proof. Let X be compact. Given an ε > 0, for every x ∈ X there exists an order k_x ∈ ℕ such that ‖F_{k_x}(x)‖_{ℓ2} < ε. By continuity there exists a ball B_x = B_{r_x}(x) such that ‖F_{k_x}(y)‖_{ℓ2} < ε for all y ∈ B_x. We now form the open covering X ⊆ ⋃_{x∈X} B_x, which admits a finite subcovering due to the compactness of X, say X ⊆ ⋃_{j=1}^N B_{x_j}. Taking k = max{k_{x_1}, ⋅⋅⋅, k_{x_N}}, we obtain that, for all x ∈ X, ‖F_k(x)‖_{ℓ2} < ε, proving the equiconvergence of X. The closedness and boundedness are clear.

The proof of the converse will be based on the following observation: suppose that there are closed intervals I_j ⊆ ℝ such that the inclusion map

ı : X → ∏_{j=1}^∞ I_j,  x ↦ x,  (18.4)

is a homeomorphism of X with some closed subset of ∏_{j=1}^∞ I_j. Since ∏_{j=1}^∞ I_j is compact by the Tikhonov Theorem, and closed subsets of compact sets are compact, we obtain the result.


We now suppose that X is equiconvergent, closed and bounded. For all j ∈ ℕ, the set π_j(X) is bounded, where π_j is the j-th projection, namely

π_j : ℓ2 → ℝ, (x_1, x_2, ⋅⋅⋅, x_j, ⋅⋅⋅) ↦ x_j.

Therefore π_j(X) ⊆ I_j, where I_j = [a_j, b_j] ⊆ ℝ. It only remains to prove that the inclusion map ı defined in eq. (18.4) is a homeomorphism and that its range is a closed subset of ∏_{j=1}^∞ I_j.

For this, take a sequence of points x_n ∈ X such that lim_n x_{ni} = a_i for each i ∈ ℕ, i.e., ı(x_n) → a coordinatewise. We claim that a = (a_1, a_2, . . . , a_i, . . .) ∈ ℓ2 and that lim_n x_n = a in ℓ2. Write P_k(x) = (x_1, . . . , x_{k−1}) for the head of x, so that ‖x‖² = ‖P_k(x)‖² + ‖F_k(x)‖². First, (x_n) is Cauchy in ℓ2: given ε > 0, take k ∈ ℕ such that ‖F_k(x)‖_{ℓ2} < ε/4 for every x ∈ X. Then

x, y ∈ X ⇒ ‖F_k(x − y)‖ = ‖F_k(x) − F_k(y)‖ ≤ ‖F_k(x)‖ + ‖F_k(y)‖ < ε/2.

The sequence of heads P_k(x_n) converges coordinatewise, hence is Cauchy in ℝ^{k−1}; thus there is n_0 ∈ ℕ such that

m, n > n_0 ⇒ ‖P_k(x_m) − P_k(x_n)‖ = ‖P_k(x_m − x_n)‖ < ε/2.

Like this,

m, n > n_0 ⇒ ‖x_m − x_n‖ ≤ ‖P_k(x_m − x_n)‖ + ‖F_k(x_m − x_n)‖ < ε.

Since ℓ2 is complete, there exists lim_n x_n = b in ℓ2. For each i, we have b_i = lim_n x_{ni} = a_i; hence b = a = lim_n x_n, and a ∈ X because X is closed.

What we have just seen means that, given a sequence of points x_n ∈ X, lim_n ı(x_n) = a in ∏ I_j if, and only if, a ∈ X and lim_n x_n = a in ℓ2. This means that ı is a homeomorphism of X onto a closed subset of ∏ I_j. Since ∏ I_j is compact, it follows that X is also compact. ◻

18.2.2 Space of Continuous Functions

In this section we work, for simplicity, with the set of continuous functions with domain [0, 1]. We want to find under what conditions we can guarantee that a subset X of C([0, 1], ℝ) is compact.
Since C([0, 1], ℝ) is a metric space endowed with the sup norm, we want to characterize the sets of continuous functions in which every sequence has a uniformly convergent subsequence. The following example shows that boundedness of the functions is not enough.


Example 18.28. Let us take the sequence of functions f_n : [0, 1] → ℝ, x ↦ xⁿ(1 − xⁿ). On the one hand, the sequence (f_n) converges pointwise to the null function; on the other hand, a direct calculation shows that f_n(ⁿ√(1/2)) = 1/4 for all n. Therefore, the sequence does not converge uniformly to the null function. Figure 18.4 depicts the graph of some elements of the sequence. ⊘

In order to characterize the compact sets of continuous functions we need to introduce the notion of equicontinuity.

Definition 18.29 (Equicontinuity). Let F be a collection of functions f : X ⊆ ℝ → ℝ. We say that the set F is equicontinuous at the point x if, for all ε > 0, there exists δ > 0 such that, for all y ∈ X satisfying |x − y| < δ, we have

|f(x) − f(y)| < ε  (18.5)

for all f ∈ F. We say that the set F is equicontinuous when it is equicontinuous at all points of X, the common domain of all the functions in F.

This definition requires a moment's reflection, since it is very similar to the notions of continuity and uniform continuity. The gist of the definition relies on the fact that the inequality (18.5) is valid, with the same δ, for all functions belonging to the set F, whereas the notions of continuity and uniform continuity are intrinsic to a single function, not to some aggregate. In the same vein we can define the notion of an equicontinuous sequence, as given below.

Figure 18.4: Sequence of continuous functions converging pointwise but not uniformly.
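The behavior in Example 18.28 is easy to check numerically. The sketch below (plain Python; the grid resolution is an arbitrary choice) evaluates f_n(x) = xⁿ(1 − xⁿ) on a fine grid: the supremum stays near 1/4 for every n, while the value at any fixed x < 1 tends to zero:

```python
def f(n, x):
    return x ** n * (1 - x ** n)

grid = [i / 10000 for i in range(10001)]
for n in (1, 5, 50, 500):
    # sup f_n is attained at x = (1/2)^(1/n) and equals exactly 1/4.
    sup = max(f(n, x) for x in grid)
    assert abs(sup - 0.25) < 1e-2
# Pointwise convergence at any fixed x < 1 is fast: f_n(0.9) -> 0.
assert f(500, 0.9) < 1e-20
```

The sup norm of f_n does not decay even though every pointwise value does, which is precisely the failure of uniform convergence.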


Definition 18.30 (Equicontinuous Sequence). We say that a sequence of functions f_n : X → ℝ is equicontinuous at the point x_0 when the aggregate F = {f_1, f_2, ⋅⋅⋅} is equicontinuous at x_0.

We need one more definition.

Definition 18.31 (Uniform Equicontinuity). Let F be a collection of functions f : X ⊆ ℝ → ℝ. We say that the set F is uniformly equicontinuous if, for all ε > 0, there exists δ > 0 such that, for all x, y ∈ X satisfying |x − y| < δ, we have

|f(x) − f(y)| < ε  (18.6)

for all f ∈ F.

Example 18.32. Let f be a continuous function which is not uniformly continuous. Then the aggregate F := {f} is equicontinuous but not uniformly equicontinuous. ⊘

We now give the Arzelà–Ascoli Theorem, which is the criterion for compactness in the set of continuous functions.

Theorem 18.33 (Arzelà–Ascoli Theorem). For a closed set K ⊆ C[0, 1] to be compact, it is necessary and sufficient that the functions f ∈ K are uniformly bounded and equicontinuous.

Proof. Let K be compact. The uniform boundedness of the functions is a consequence of Corollary 18.12. We now show the equicontinuity of the functions in K. Taking a fixed ε > 0, we assemble an ε/3-net {f_1, f_2, ⋅⋅⋅, f_n} for the set K. Let us recall that the functions in the net are continuous on [0, 1], which entails that they are all the more uniformly continuous on [0, 1]. For each function f_j let us take a δ_j > 0 such that |f_j(x) − f_j(y)| < ε/3 whenever |x − y| < δ_j. We now define δ = min{δ_j : j = 1, ⋅⋅⋅, n}. Given f ∈ K, choose f_j in the net such that d(f, f_j) < ε/3. Taking |x − y| < δ we have

|f(x) − f(y)| ≤ 2 max_{0≤t≤1} |f(t) − f_j(t)| + |f_j(x) − f_j(y)| ≤ 2d(f, f_j) + ε/3 < ε.  (18.7)

Since this holds for every f ∈ K, we obtain the equicontinuity (indeed, the uniform equicontinuity) of the aggregate of functions in K.

Let us now suppose that the functions in K are equicontinuous and uniformly bounded. To show compactness we will use the Hausdorff Compactness Criterion (Theorem 18.9);


i.e., we will show that for each ε > 0 it is possible to construct a finite ε-net for K. We know that there exists C such that |f(x)| ≤ C for all f ∈ K. By the equicontinuity, for ε > 0, there exists a δ > 0 such that |f(x) − f(y)| < ε whenever |x − y| < δ, for all f ∈ K. Choose points 0 = x_0 < x_1 < ⋅⋅⋅ < x_m = 1 with x_{i+1} − x_i < δ, and values −C = y_0 < y_1 < ⋅⋅⋅ < y_p = C with y_{j+1} − y_j < ε. The polygonal functions that at each node x_i take one of the values y_j form a finite family, and for each f ∈ K we can pick such a polygonal function g with |g(x_i) − f(x_i)| < ε for every i; a straightforward estimate then gives d(f, g) < 5ε. Hence these polygonal functions form a finite 5ε-net for K. Since ε > 0 was arbitrary, K is totally bounded and, by Theorem 18.9, compact. ◻

18.2.3 Lebesgue Spaces

The compactness criterion in the Lebesgue spaces L^p rests on the so-called Steklov mean. Given an integrable function φ : [a, b] → ℝ and h > 0, the Steklov function (or Steklov mean) of φ is

φ_h(x) = (1/2h) ∫_{x−h}^{x+h} φ(t) dt,

where the integrand is extended by zero outside [a, b].


From the definition it is almost immediate that the Steklov function is continuous, since

φ_h(x) = (1/2h)[F(x + h) − F(x − h)],

where

F(x) = ∫_{a−h}^{x} φ(t) dt,

and F is a continuous function.

Lemma 18.36. Let φ : [a, b] → ℝ be an integrable function and φ_h its Steklov mean. Then ‖φ_h‖_{L1} ≤ ‖φ‖_{L1}.

Proof. Let us first suppose that φ ≥ 0. We have

2h ∫_a^b φ_h(z) dz = ∫_a^b dz ∫_{−h}^{+h} φ(z + t) dt
                  = ∫_{−h}^{+h} dt ∫_a^b φ(z + t) dz
                  = ∫_{−h}^{+h} dt ∫_{a+t}^{b+t} φ(x) dx
                  ≤ 2h ∫_a^b φ(x) dx,

where we used Fubini's Theorem and the fact that φ is extended by zero outside [a, b]. For the general case, we note that we have the pointwise estimate |φ_h(x)| ≤ |φ|_h(x), since

|φ_h(x)| ≤ (1/2h) ∫_{x−h}^{x+h} |φ(t)| dt = |φ|_h(x).

Now the result follows since ‖φ_h‖_{L1} ≤ ‖ |φ|_h ‖_{L1} ≤ ‖ |φ| ‖_{L1} = ‖φ‖_{L1}. ◻
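The contraction property of Lemma 18.36 is easy to observe numerically. The sketch below (plain Python; the test function, interval, and discretization are arbitrary illustrative choices) computes a discrete Steklov mean, with the function extended by zero outside [a, b], and checks ‖φ_h‖_{L¹} ≤ ‖φ‖_{L¹} as well as the convergence φ_h → φ in L¹ as h → 0:

```python
import math

a, b = 0.0, 1.0
N = 500                       # grid resolution for the discrete L1 norm
dx = (b - a) / N
xs = [a + (i + 0.5) * dx for i in range(N)]

def phi(x):
    # An integrable (here even continuous) test function on [a, b].
    return math.sin(8 * x) + 0.5

def phi_ext(x):
    # The integrand is extended by zero outside [a, b], as in the definition.
    return phi(x) if a <= x <= b else 0.0

def steklov(x, h, m=100):
    # (1 / 2h) * integral_{x-h}^{x+h} phi(t) dt, via the midpoint rule.
    dt = 2 * h / m
    return sum(phi_ext(x - h + (j + 0.5) * dt) for j in range(m)) * dt / (2 * h)

def l1_norm(f):
    return sum(abs(f(x)) for x in xs) * dx

# The Steklov mean does not increase the L1 norm (Lemma 18.36).
for h in (0.2, 0.05, 0.01):
    assert l1_norm(lambda x: steklov(x, h)) <= l1_norm(phi) + 1e-6

# phi_h -> phi in L1 as h -> 0 (the approximate identity property below).
errs = [l1_norm(lambda x, h=h: steklov(x, h) - phi(x)) for h in (0.2, 0.05, 0.01)]
assert errs[0] > errs[1] > errs[2]
```

The same discrete experiment, with the p-th power inside the integral, illustrates Lemmas 18.37 and 18.38.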


In the next result we show that a similar phenomenon occurs: when φ belongs to a p-summable Lebesgue space, the Steklov mean also belongs to the same space and its norm is dominated by the norm of φ.

Lemma 18.37. Let φ ∈ L^p[a, b] for p ≥ 1. Then

‖φ_h‖_{L^p} ≤ ‖φ‖_{L^p}.  (18.8)

Proof. It is necessary to prove only the case p > 1. The Steklov function φ_h belongs to the space L^p since it is continuous. Using the Hölder inequality, we have

|∫_{x−h}^{x+h} φ(t) dt| ≤ (∫_{x−h}^{x+h} |φ(t)|^p dt)^{1/p} (∫_{x−h}^{x+h} dt)^{1/q},   1/p + 1/q = 1,

from which we obtain the pointwise inequality

|φ_h(x)|^p ≤ (1/2h) ∫_{x−h}^{x+h} |φ(t)|^p dt = (|φ|^p)_h(x).  (18.9)

It follows from eq. (18.9) and the previous lemma that

∫_a^b |φ_h(x)|^p dx ≤ ∫_a^b (|φ|^p)_h(x) dx ≤ ∫_a^b |φ(x)|^p dx,

which ends the proof. ◻

The importance of the Steklov mean stems from the fact that it is an approximate identity in the Lebesgue norm, as given below.

Lemma 18.38. Let φ ∈ L^p[a, b] with p ≥ 1. Then

lim_{h→0} ∫_a^b |φ_h(x) − φ(x)|^p dx = 0.  (18.10)

Proof. The proof will rely on Luzin's Theorem. We first show that relation (18.10) is valid for a continuous function φ, and after that we invoke Luzin's Theorem to approximate measurable functions by appropriate continuous ones.


Let us take a continuous function φ. If a < x < b and h is such that [x − h, x + h] ⊆ [a, b], then, by the mean value theorem, we have

φ_h(x) = (1/2h) ∫_{x−h}^{x+h} φ(t) dt = φ(ξ),  (18.11)

for some ξ ∈ [x − h, x + h]. From eq. (18.11) it follows that if x ∈ (a, b) then

lim_{h→0} φ_h(x) = φ(x),

which entails that the integrand in eq. (18.10) tends to zero almost everywhere in [a, b]. Since the integrand is also dominated by a constant (|φ(x)| ≤ M due to continuity, and hence |φ_h(x)| ≤ M), we can use the Lebesgue dominated convergence theorem to interchange the limit with the integral, showing the validity of eq. (18.10) in the case of φ continuous.

Let now φ be an arbitrary measurable function in L^p. Using Luzin's Theorem, for fixed ε > 0, we can find a continuous function ψ such that

‖ψ − φ‖_{L^p} < ε.  (18.12)

We notice that (ψ − φ)_h(x) = ψ_h(x) − φ_h(x), from which we obtain ‖ψ_h − φ_h‖_{L^p} < ε by eq. (18.12) and Lemma 18.37. We now have

‖φ_h − φ‖_{L^p} ≤ ‖φ_h − ψ_h‖_{L^p} + ‖ψ_h − ψ‖_{L^p} + ‖ψ − φ‖_{L^p};

therefore,

‖φ_h − φ‖_{L^p} < 2ε + ‖ψ_h − ψ‖_{L^p}.

Since ψ is a continuous function, we have already proved that ‖ψ_h − ψ‖_{L^p} → 0 as h → 0. Thus for sufficiently small h > 0 we have ‖ψ_h − ψ‖_{L^p} < ε, from which it follows that ‖φ_h − φ‖_{L^p} < 3ε. ◻

We are now ready to prove the Kolmogorov Theorem.

Theorem 18.39 (Kolmogorov Theorem). A set K ⊆ L^p[a, b] is compact if, and only if, the following conditions are satisfied:

the set K is bounded in Lp ; and 󵄩 󵄩 the difference 󵄩󵄩󵄩fh – f 󵄩󵄩󵄩Lp tends to zero uniformly, as h → 0.
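Lemma 18.38, and hence condition (b) for a single function, can be illustrated numerically. The sketch below is our own illustration, not part of the text: the test function, grid sizes, and step values are arbitrary choices. It discretizes the Steklov mean and checks that the $L^p$ distance $\|\varphi_h - \varphi\|_{L^p}$ shrinks as $h \to 0$, even for a discontinuous $\varphi$:

```python
import numpy as np

def steklov_mean(f, x, h, a, b, n=201):
    """Approximate the Steklov mean f_h(x) = (1/2h) * integral of f over
    [x-h, x+h] by averaging f over a uniform grid; clipping the grid to
    [a, b] amounts to extending f by its boundary values."""
    t = np.clip(np.linspace(x - h, x + h, n), a, b)
    return f(t).mean()

a, b, p = 0.0, 1.0, 2
f = lambda t: np.sign(t - 0.5)       # a discontinuous function in L^p[0, 1]

xs = np.linspace(a, b, 401)
errs = []
for h in (0.2, 0.1, 0.05, 0.025):
    fh = np.array([steklov_mean(f, x, h, a, b) for x in xs])
    # discrete L^p distance ||f_h - f||_{L^p[a, b]}
    errs.append((np.mean(np.abs(fh - f(xs)) ** p) * (b - a)) ** (1 / p))

# the L^p error shrinks as h -> 0, as Lemma 18.38 asserts
assert all(e2 < e1 for e1, e2 in zip(errs, errs[1:]))
```

The decay here comes from the same mechanism as in the proof: the integrand in (18.10) vanishes pointwise away from the jump, and the exceptional set near the jump has measure comparable to $h$.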


Proof. Let us assume that $K$ is a compact set. Then (a) is immediate, since compact sets are bounded. To prove (b) we will argue by contradiction. Suppose that there exist $\varepsilon_0 > 0$ and two sequences, $(h_n)_{n\in\mathbb{N}}$ and $(f^{(n)})_{n\in\mathbb{N}}$, with $f^{(n)} \in K$ and $h_n > 0$, for which
$$\lim_{n\to\infty} h_n = 0 \quad\text{and}\quad \|(f^{(n)})_{h_n} - f^{(n)}\|_{L^p} \geq \varepsilon_0.$$
We now show that $(f^{(n)})_{n\in\mathbb{N}}$ does not have a convergent subsequence, which yields a contradiction. Suppose that $(f^{(n)})_{n\in\mathbb{N}}$ has a convergent subsequence, i.e., there exists $\mathbb{N}_1 \subseteq \mathbb{N}$ such that $\lim_{n\in\mathbb{N}_1} f^{(n)} = g$ in $L^p$. Now, for all $n \in \mathbb{N}_1$ we have
$$\varepsilon_0 \leq \left\|f^{(n)}_{h_n} - f^{(n)}\right\|_{L^p} \leq \left\|f^{(n)}_{h_n} - g_{h_n}\right\|_{L^p} + \left\|g_{h_n} - g\right\|_{L^p} + \left\|g - f^{(n)}\right\|_{L^p}.$$
Using Lemma 18.37 we obtain
$$\varepsilon_0 \leq 2\left\|g - f^{(n)}\right\|_{L^p} + \left\|g_{h_n} - g\right\|_{L^p}.$$
Taking Lemma 18.38 into account, we know that there exists an $m \in \mathbb{N}$ such that for all $n > m$ we have $\|g_{h_n} - g\|_{L^p} < \frac{\varepsilon_0}{2}$, which yields a contradiction, since for all such $n$ we have
$$\left\|g - f^{(n)}\right\|_{L^p} > \frac{\varepsilon_0}{4},$$
preventing $(f^{(n)})_{n\in\mathbb{N}_1}$ from converging to $g$.

Let us now suppose that the conditions (a) and (b) are satisfied (our aim is to use the Hausdorff Criterion, Theorem 18.9). Then for arbitrary $f \in K$, we have $\|f\|_{L^p} < M$, where the constant $M$ does not depend on the choice of $f$. Let us fix $h > 0$ and define the set
$$K_h = \{f_h : f \in K\},$$
where $f_h$ is the Steklov mean. By Hölder's inequality (with $\frac{1}{p} + \frac{1}{q} = 1$) we have
$$|f_h(x)| \leq \frac{1}{2h}\left(\int_{x-h}^{x+h} |f(t)|^p \,\mathrm{d}t\right)^{\frac{1}{p}} \left(\int_{x-h}^{x+h} \mathrm{d}t\right)^{\frac{1}{q}} < \frac{M}{(2h)^{\frac{1}{p}}},$$

which means that all the functions in $K_h$ are uniformly bounded. We now show that they are also equicontinuous. Let us pick a function in $K_h$, say $f_h$; for $x, y \in [a,b]$ we have
$$f_h(x) - f_h(y) = \frac{1}{2h}\left(\int_{x-h}^{x+h} f(t)\,\mathrm{d}t - \int_{y-h}^{y+h} f(t)\,\mathrm{d}t\right). \qquad (18.13)$$


We have the estimate (say, for $y < x$)
$$\left|\int_{y+h}^{x+h} f(t)\,\mathrm{d}t\right| \leq \left(\int_{y+h}^{x+h} |f(t)|^p \,\mathrm{d}t\right)^{\frac{1}{p}} \left(\int_{y+h}^{x+h} \mathrm{d}t\right)^{\frac{1}{q}} < M (x-y)^{\frac{1}{q}},$$
and an analogous estimate is valid for the other integral. Thus
$$|f_h(x) - f_h(y)| < \frac{M}{h}(x-y)^{\frac{1}{q}},$$
from which the equicontinuity of the functions in $K_h$ follows. Now, for a fixed $h > 0$, the set $K_h$ is relatively compact in $C[a,b]$ by the Arzelà–Ascoli Theorem, hence also relatively compact in $L^p[a,b]$. Since, by condition (b), $\|f_h - f\|_{L^p} < \varepsilon$ for all $f \in K$ whenever $h$ is small enough, the set $K_h$ is an $\varepsilon$-net for $K$; therefore $K$ is totally bounded, and the Hausdorff criterion yields its compactness. ◻
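The uniform bound and the equicontinuity estimate for $K_h$ can be sanity-checked numerically. The sketch below is our own illustration with arbitrary choices (a random sample function, $p = 2$, $h = 0.05$); it verifies the estimate $|f_h(x) - f_h(y)| \leq \frac{M}{h}|x - y|^{1/q}$ over a grid of points:

```python
import numpy as np

rng = np.random.default_rng(0)
a, b, p = 0.0, 1.0, 2.0
q = p / (p - 1)                      # conjugate exponent, 1/p + 1/q = 1

# a rough, highly oscillatory sample function given on a fine grid
grid = np.linspace(a, b, 20001)
vals = rng.uniform(-1.0, 1.0, grid.size)
dx = grid[1] - grid[0]

def steklov(x, h):
    """Discrete Steklov mean: (1/2h) * integral of the sampled function
    over [x-h, x+h], via a Riemann sum."""
    mask = (grid >= x - h) & (grid <= x + h)
    return vals[mask].sum() * dx / (2 * h)

# an upper bound M for the L^p norm of the sample function
M = (np.mean(np.abs(vals) ** p) * (b - a)) ** (1 / p) + 1e-6

h = 0.05
xs = np.linspace(a + h, b - h, 50)   # stay away from the endpoints
sv = np.array([steklov(x, h) for x in xs])

# check the equicontinuity estimate |f_h(x) - f_h(y)| <= (M/h) |x - y|^(1/q)
ok = all(
    abs(sv[i] - sv[j]) <= (M / h) * abs(xs[i] - xs[j]) ** (1 / q) + 1e-9
    for i in range(len(xs)) for j in range(len(xs)) if i != j
)
assert ok
```

Note that the bound degrades as $h \to 0$, since the factor $M/h$ blows up: equicontinuity is only claimed for each fixed $h$, which is why the proof works with the family $K_h$ rather than with $K$ itself.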

18.3 Problems

18.1. Prove that the set
$$V = \{x_n = \sin(nt) \in L^2[-\pi, \pi] : n \in \mathbb{N}\}$$
is closed and bounded, but it is not compact.
18.2. Let $V$ be a sequentially compact set and $W$ a closed set in a Banach space $X$ with $V \cap W = \emptyset$. Prove that $d(V, W) > 0$.
18.3. Let $V$ and $W$ be sequentially compact sets in a Banach space $X$. Prove that there exist $v \in V$ and $w \in W$ such that $d(V, W) = \|v - w\|$.
18.4. Prove Corollary 18.10.
18.5. Let $X$ be a Banach space and $V \subseteq X$ a sequentially compact set. Suppose that there exists a mapping $T : V \longrightarrow V$ such that $\|T(v) - T(w)\| \geq \|v - w\|$ for all $v, w \in V$. Prove that $T$ is an isometric mapping of $V$ into itself.
18.6. Let $X$ be a Banach space and $(K_j)_{j\in\mathbb{N}}$ nested sequentially compact sets, $K_1 \supset K_2 \supset \cdots \supset K_j \supset \cdots$. Demonstrate that $\bigcap_{j\in\mathbb{N}} K_j \neq \emptyset$.
18.7. Let $V$ be a set of equicontinuous functions from the space $C([a,b])$. Show that the set
$$W = \left\{f : f(t) = \int_0^t \varphi(s)\,\mathrm{d}s,\ \varphi \in V\right\}$$
is a compact set.
18.8. Demonstrate that every compact set of the space $C'[a,b]$ is also compact in the space $C[a,b]$.
18.9. Prove the Weierstrass Theorem 18.17 using the definition of compactness via open coverings.
18.10. Prove Lemma 18.3 directly from the definition, without resorting to the Hausdorff compactness criterion as done in Corollary 18.12.
18.11. Prove that the set defined in Example 18.7 is bounded but not totally bounded.
18.12. Let $(M, d)$ be a metric space. Show that if there are closed subsets $F, D \subseteq M$ with $F \cap D = \emptyset$ and $d(F, D) = 0$, then the open cover $M = (M \setminus F) \cup (M \setminus D)$ does not have a Lebesgue number.


Index

Algorithm
– Babylonian 113
Application
– Graph 148
Atom 6
Attractor 123
Axiom
– Choice 1, 3
– of Choice for the Cartesian product 3
Basis
– Orthonormal 33
– Schauder 50
Chain 2
Constant
– Lipschitz 114
Continuity
– ε-δ 6
– Cauchy 6
– Heine 6
– Sequential 6
Contraction 114
Convergence
– Uniform 92
Convex
– Strictly normed space 41
Criterion
– Hausdorff Compactness 202
Eigenvalue 192
Eigenvector 192
Equiconvergence 210
Exponent
– Conjugate 11, 90
Extension
– Hahn–Banach 153
Filter 7
Formula
– Quadrature 142
Fourier
– Coefficients 29
Function
– Choice 1
– Gauss Mean 104
Functional
– Banach 151
– Linear 66
Gauss
– Summable 104
Gram–Schmidt orthonormalization 31
Graph 148
Hilbert
– Cube 210
Hyperplane 158
– Affine 159
Identity
– Parallelogram 20
– Parseval 29
– Polarization 21
Inequality
– Bessel 29
– Fundamental Contraction 124
– Hölder 11, 90
– Minkowski 12
– Minkowski Integral 90
– Young 10, 91
– Young (for convolution) 91
Isometric
– Metric Spaces 43
– Spaces 30
Kernel 56
Lemma
– Kuratowski–Zorn 3
– Riemann–Lebesgue 97
– Riesz 43
– Ultrafilter 7
Measure
– Counting 13
Method
– Newton 116
Metric
– Hausdorff 120
Mollifier
– Friedrichs 93
Norm
– Definition 8
– Equivalent 38
– Uniform 36
Notation
– Bra-Ket 73
– Dirac 73
Number
– Lebesgue 208
Operator
– Approximate Identity 92
– Banach Adjoint 170
– Bilinear 140
– Bounded 58
– Closed 147
– Compact 183
– Convolution 89
– Finite rank 186
– Hilbert Adjoint 168
– Kernel 56
– Linear 53
– Multiplication 150
– Normal 187
– Range 56
– Self-adjoint 187
– Symmetric 149
– Translation 98
Orthogonal Projection 26
Poset 2
Principle
– Banach Contraction 114
– Uncertainty 107
– Well-Ordering 3
Product
– Cartesian 2
Property
– Darboux 6
Sequence
– Cauchy 18
Series
– Absolutely summable 39
– Neumann 117
Set
– Absolutely Convex 155
– Absorbing 155
– Balanced 155
– Compact 199, 204
– Convex 155
– Equicontinuous 212
– Nonmeasurable 5
– Orthonormal 27
– Partially Ordered 2
– Sequentially Compact 199
– Total 51
– Totally Bounded 201
– Uniformly Equicontinuous 213
– Well-Ordered 2
Space
– Affine 159
– Banach 36
– Dual 69
– Hilbert 18
– Reflexive 179
– Separable 32, 45
– Strictly Convex 162
Sum
– Direct 40
Summable
– Gauss 104
System
– Iterated Function 123
Theorem
– Arzelà–Ascoli 213
– Banach Contraction 114
– Banach Fixed Point 114
– Banach Isomorphism 146
– Banach–Schauder 144
– Banach–Steinhaus 137, 138
– Bolzano–Weierstrass 199
– Borel–Lebesgue 199
– Brouwer 143
– Cantor Nested Set 200
– Cauchy–Picard 117
– Closed Graph 148
– Eidelheit Separation 161
– Hahn–Banach
– – Geometric 160
– – Separation 160, 161
– Hausdorff Compactness Criterion 202
– Heine–Borel 199
– Hellinger–Toeplitz 149
– Hutchinson 122
– Kolmogorov 219
– Lindelöf 199
– Minkowski–Ascoli–Mazur 160
– Newton 116
– Open Mapping 144
– Picard 117
– Plancherel 105
– Riesz Representation Theorem for Hilbert Spaces 69
– Riesz–Markov 71
– Sierpiński 7
– Szegö 142
– Tikhonov 206
– Uniform Boundedness 137
– Weierstrass 200, 205
– Weierstrass Approximation 46
– Zermelo 3
Topology
– Weak 174
– Weak* 178
Vector
– Bra 73
– Ket 73