A Primer of Real Functions [4 ed.] 9780883850299, 9781614440130, 088385029X

This is a revised, updated, and augmented edition of a classic Carus monograph with a new chapter on integration and its


English Pages 319 [323] Year 1997


Table of contents:
Front Cover
A Primer of Real Functions
Copyright Page
Preface to the Fourth Edition
Preface to the Third Edition
Contents
1 Sets
2 Sets of real numbers
3 Countable and uncountable sets
4 Metric spaces
5 Open and closed sets
6 Dense and nowhere dense sets
7 Compactness
8 Convergence and completeness
9 Nested sets and Baire's theorem
10 Some applications of Baire's Theorem
11 Sets of measure zero
12 Functions
13 Continuous functions
14 Properties of continuous functions
15 Upper and lower limits
16 Sequences of functions
17 Uniform convergence
18 Pointwise limits of continuous functions
19 Approximations to continuous functions
20 Linear functions
21 Derivatives
22 Monotonic functions
23 Convex functions
24 Infinitely differentiable functions
25 Lebesgue measure
26 Measurable functions
27 Definition of the Lebesgue integral
28 Properties of Lebesgue integrals
29 Applications of the Lebesgue integral
30 Stieltjes integrals
31 Applications of the Stieltjes integral
32 Partial sums of infinite series
Answers to Exercises
Index
Back Cover


A PRIMER OF REAL FUNCTIONS

Fourth edition ©1996 by The Mathematical Association of America, Inc. Library of Congress Number 96-77785 Print ISBN 978-0-88385-029-9 Electronic ISBN 978-1-61444-013-0 Printed in the United States of America Current Printing (last digit): 10 9 8 7 6 5 4 3 2 1

The Carus Mathematical Monographs, NUMBER THIRTEEN

RALPH P. BOAS, revised and updated by HAROLD P. BOAS

THE CARUS MATHEMATICAL MONOGRAPHS Published by THE MATHEMATICAL ASSOCIATION OF AMERICA

Committee on Publications
JAMES W. DANIEL, Chair

Carus Mathematical Monographs Editorial Board
MARJORIE L. SENECHAL, Editor
KATALIN A. BENCSÁTH
ROBERT BURCKEL
GIULIANA P. DAVIDOFF
JOSEPH A. GALLIAN

The following monographs have been published:

1. Calculus of Variations, by G. A. Bliss (out of print)
2. Analytic Functions of a Complex Variable, by D. R. Curtiss (out of print)
3. Mathematical Statistics, by H. L. Rietz (out of print)
4. Projective Geometry, by J. W. Young (out of print)
5. A History of Mathematics in America before 1900, by D. E. Smith and Jekuthiel Ginsburg (out of print)
6. Fourier Series and Orthogonal Polynomials, by Dunham Jackson (out of print)
7. Vectors and Matrices, by C. C. MacDuffee (out of print)
8. Rings and Ideals, by N. H. McCoy (out of print)
9. The Theory of Algebraic Numbers, second edition, by Harry Pollard and Harold G. Diamond
10. The Arithmetic Theory of Quadratic Forms, by B. W. Jones (out of print)
11. Irrational Numbers, by Ivan Niven
12. Statistical Independence in Probability, Analysis and Number Theory, by Mark Kac
13. A Primer of Real Functions, fourth edition, by Ralph P. Boas, Jr., revised and updated by Harold P. Boas
14. Combinatorial Mathematics, by Herbert J. Ryser
15. Noncommutative Rings, by I. N. Herstein (out of print)
16. Dedekind Sums, by Hans Rademacher and Emil Grosswald
17. The Schwarz Function and its Applications, by Philip J. Davis
18. Celestial Mechanics, by Harry Pollard
19. Field Theory and its Classical Problems, by Charles Robert Hadlock
20. The Generalized Riemann Integral, by Robert M. McLeod
21. From Error-Correcting Codes through Sphere Packings to Simple Groups, by Thomas M. Thompson
22. Random Walks and Electric Networks, by Peter G. Doyle and J. Laurie Snell
23. Complex Analysis: The Geometric Viewpoint, by Steven G. Krantz
24. Knot Theory, by Charles Livingston
25. Algebra and Tiling: Homomorphisms in the Service of Geometry, by Sherman Stein and Sandor Szabo

MAA Service Center P.O. Box 91112 Washington, DC 20090-1112 1-800-331-1 MAA FAX: 1-301-206-9789

Preface to the Fourth Edition

This book is an heirloom and a memorial. The first edition, dedicated "To My Epsilons," appeared in the year that I (the youngest of my father's "little ones") turned six years old. Now, many years later, my siblings being of a nonmathematical bent, the familial interest in the book resides in me. The most fitting tribute I can pay to my father's memory is to pass on a new edition of this monograph to the next generation of epsilons.

The principal change in this edition is the addition of a chapter on integration and some of its applications. This topic was deliberately omitted from previous editions, "reluctantly, because of the many technical details that are needed before one gets to the interesting results." My father eventually decided that it would be acceptable and enjoyable to present some of the interesting results without all the technical details, on the principle that one need not understand the inner workings of the motor to appreciate a drive in the country.

Chapter 3 is my reworking of a draft that my father left at his death. Besides revising this material, I have added most of the notes, exercises, and solutions for Chapter 3. Resetting the book afforded the opportunity to renumber the exercises and to relocate the notes from the end of the volume to the ends of individual sections. I have also made minor revisions throughout Chapters 1 and 2.

I thank Robert B. Burckel for reading the manuscript with great care and for suggesting many improvements.

Harold P. Boas
Texas A&M University
July 1995

Preface to the Third Edition

I. To the beginner. In this little book I have presented some of the concepts and methods of "real variables" and used them to obtain some interesting results. I have not sought great generality or great completeness. My idea is to go reasonably far in a few directions with a minimum amount of special terminology. I hope that in this way I have been able to preserve some of the sense of wonder that was associated with the subject in its early days but has now largely been lost. I hope also that someone who has read this book will be able to go on to one of the many more forbidding systematic treatises, of which there is no lack.

No previous knowledge of the subject is assumed, but the reader should have had at least a course in calculus. In general, each topic is developed slowly but rises to a moderately high peak; a reader who finds the slope too steep may skip to the beginning of the next section.

Since this is not a handbook, but more in the nature of a course of informal lectures, I have not been at all consistent either about the proportion of detailed proof to general discussion or about strict logical arrangement of material. All phrases like "it is clear," "plainly," "it is trivial" are intended as abbreviations for a statement something like "It should seem reasonable, you should be able to supply the proof, and you are invited to do so." On the other hand, "It can be shown . . ." is usually to suggest that the proof is too complicated to give here, or depends on notions that are not discussed here, and that you are not expected to try to supply the proof yourself.

In stating definitions, I have frequently used "if" where I should really have used "if and only if." For example, "If a set is both bounded above and bounded below, it is called bounded." This definition is to be understood to carry an additional clause, "and if it is not both bounded above and bounded below, it is not called bounded."

There are a number of exercises, some of which merely supply illustrative material, and some of which are essential parts of the book. An exercise that merely states a proposition is to be interpreted as a demand for a proof of the proposition. Answers to all exercises are given at the end of the book.

Paragraphs in small type deal either with peripheral material or with more difficult questions.

I apologize in advance for whatever mistakes the alert reader may be able to detect. None were intentionally included; nevertheless, the detection and rectification of mistakes is a good exercise, and fosters a healthy skepticism about the printed word.

II. To the expert. Experts are not supposed to read this book at all; since this statement will doubtless be taken as an invitation for them to do so, I must explain what I have tried (and not tried) to do. I have set out to tell readers with no previous experience of the subject some of the results that I find particularly interesting. I have therefore tried to present the material that seemed essential for the results I had in mind, together with as much related material as seemed interesting and not too complicated. Since this is not a systematic treatise, I have deliberately tried not to introduce any concepts or notations, however significant or convenient, that I did not really need to use. I have omitted integration, reluctantly, because of the many technical details that are needed before one gets to the interesting results.

Since this is not a treatise it has not been written like one. The style is deliberately wordy. The axiom of choice is frequently used but never mentioned; this book is not the place to discuss philosophical questions, and, in any case, after Gödel's results, the assumption of the axiom of choice can do no mathematical harm that has not already been done. On the other hand, according to the more recent work of P. J. Cohen, by assuming the axiom of choice rather than its negative we are selecting one kind of mathematics rather than another, say Zermelonian rather than non-Zermelonian. With this selection, there is no point in avoiding the axiom of choice whenever it seems natural to use it, even in cases where it is known to be avoidable.

III. Acknowledgments. I am indebted to my teachers, J. L. Walsh and D. V. Widder, for introducing me to this kind of mathematics; to M. L. Boas and to E. F. and R. C. Buck for criticizing early drafts of the book; and to H. M. Clark and H. M. Gehman for help with the proofreading. I am grateful to several people who have pointed out oversights or suggested improvements, and especially to Richard L. Baker, E. M. Beesley, H. P. Boas, A. M. Bruckner, G. T. Cargo, S. Haber, A. P. Morse, C. C. Oehring, J. M. H. Olmsted, J. C. Oxtoby, A. C. Segal, A. Shuchat, and H. A. Thurston.

In preparing this revision, I have tried to resist the temptation to insert additional material; but a considerable number of references have been added to the notes.

Ralph P. Boas, Jr.
Northwestern University
March 1960
June 1965
March 1972
October 1979

Contents

Preface to the Fourth Edition
Preface to the Third Edition

1 Sets
   1 Sets
   2 Sets of real numbers
   3 Countable and uncountable sets
   4 Metric spaces
   5 Open and closed sets
   6 Dense and nowhere dense sets
   7 Compactness
   8 Convergence and completeness
   9 Nested sets and Baire's theorem
   10 Some applications of Baire's Theorem
   11 Sets of measure zero

2 Functions
   12 Functions
   13 Continuous functions
   14 Properties of continuous functions
   15 Upper and lower limits
   16 Sequences of functions
   17 Uniform convergence
   18 Pointwise limits of continuous functions
   19 Approximations to continuous functions
   20 Linear functions
   21 Derivatives
   22 Monotonic functions
   23 Convex functions
   24 Infinitely differentiable functions

3 Integration
   25 Lebesgue measure
   26 Measurable functions
   27 Definition of the Lebesgue integral
   28 Properties of Lebesgue integrals
   29 Applications of the Lebesgue integral
   30 Stieltjes integrals
   31 Applications of the Stieltjes integral
   32 Partial sums of infinite series

Answers to Exercises

Index

Chapter 1

Sets

1. Sets. In order to read anything about our subject, you will have to learn the language that is used in it. I have tried to keep the number of technical terms as small as possible, but there is a certain minimum vocabulary that is essential. Much of it consists of ordinary words used in special senses; this practice has both advantages and disadvantages, but has in any case to be endured since it is now too late to change the language completely.

Much of the standard language is taken from the theory of sets, a subject with which we are not concerned for its own sake. The theory of sets is, indeed, an independent branch of mathematics. It has its own basic undefined concepts, subject to various axioms; one of these undefined concepts is the notion of "set" itself. From an intuitive point of view, however, we may think of a set as being a collection of objects of some kind, called its elements, or members, or points. We say that a set contains its elements, or that the elements belong to the set or simply are in the set. The normal usage of set, as in "a set of dishes" or "a set of the works of Bourbaki," is fairly close to what we should have in mind, although the second phrase suggests some sort of arrangement of the elements which is irrelevant to the mathematical concept. Sets may, for example, consist of ordinary geometrical points, or of functions, or indeed of other sets. We shall use the words class, aggregate, and collection interchangeably with set, especially to make complicated situations clearer: thus we may speak of a collection of aggregates of sets rather than a set of sets of sets.

If E is a set,* a set H is called a subset of E if every element of H is also an element of E. For example, if E is the set whose elements are the numbers 1, 2, and 3, then there are eight subsets of E. Three of them contain one element each; three contain two elements each; one is the set E itself (a subset does not have to be, in any sense, "smaller" than the original set); the eighth subset of E is, by convention, the empty set, which is the set that has no elements at all. If H is a subset of E, we write H ⊂ E or E ⊃ H; sometimes we say that E contains H or that E covers H. If H is a subset of E but is not all of E, we call H a proper subset of E.

We write x ∈ E to mean that x is an element of E. We often say that x is in E, or that x belongs to E, or that E contains x, meaning the same thing. Since the elements of sets are usually things of a different kind from the sets themselves, we should distinguish between the element x and the set whose only element is x. It is often convenient to denote the latter set by {x}. The notations x ∈ E and {x} ⊂ E mean the same thing.

* It is traditional to call sets "E," presumably because the French for "set" is "ensemble."

A space is a set that is being thought of as a universe from which sets can be extracted. If Ω is a space and E ⊂ Ω, then the complement of E (with respect to Ω) is the set consisting of all the elements of Ω that are not elements of E. The complement of E is denoted by C(E). For example, if Ω consists of the letters of the alphabet and E of the consonants (including y as a consonant), then C(E) consists of the vowels. If, however,* E consists of the single letter a, then C(E) consists of the letters b, c, ..., z. If E consists of the entire alphabet, then C(E) is empty. If E is empty, then C(E) = Ω.

Exercise 1.1. Show that C(C(E)) = E.
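The definitions of subset and complement can be tried out on small finite sets. The following Python sketch is only an illustration: the helper names subsets and complement are chosen here and do not come from the text. It lists the eight subsets of the set {1, 2, 3} and checks the alphabet example; checking one example is of course not a proof of Exercise 1.1.

```python
from itertools import combinations
import string

def subsets(s):
    """Return every subset of the finite set s, as a list of frozensets."""
    items = sorted(s)
    return [frozenset(c)
            for size in range(len(items) + 1)
            for c in combinations(items, size)]

def complement(e, omega):
    """C(E): the elements of the space omega that are not elements of e."""
    return omega - e

E = {1, 2, 3}
all_subsets = subsets(E)
print(len(all_subsets))                        # 8
# Proper subsets: every subset except E itself.
proper_subsets = [h for h in all_subsets if h != E]
print(len(proper_subsets))                     # 7

# The alphabet example: the space is the 26 letters, E is the consonants.
omega = set(string.ascii_lowercase)
vowels = set("aeiou")
consonants = omega - vowels                    # y counts as a consonant
print(complement(consonants, omega) == vowels) # True
print(complement(complement(consonants, omega), omega) == consonants)  # True
```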

If E and F are two sets, there are two other sets that can be formed by using them, and that occur so frequently that they have special names. One of these sets is the union of the two sets, written E ∪ F (sometimes called their sum, and written E + F); it consists of all elements that are in E or in F (or in both; an element that is in both is counted only once). The other is the intersection of the two sets, written E ∩ F (sometimes called their product, and written E · F or EF); it consists of all elements that are in both E and F. If E ∩ F is empty, then E and F are called disjoint; that is, E and F are disjoint if they have no element in common.

Exercise 1.2. Let Ω consist of the 26 letters of the alphabet. Let E consist of all the consonants (including y), and F of all the letters that occur in the words real functions (the n is counted only once). Show that (a) E ∪ F = Ω; (b) F ⊃ C(E); (c) C(F) ⊂ E; (d) F ∩ E and C(E) are disjoint.

* For simplicity of notation we frequently use, as here, a letter that has just been used as the name of a set, that we are now through with, to denote a different set.
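Union, intersection, and disjointness correspond directly to operations on Python's built-in sets, so the four statements of Exercise 1.2 can be checked mechanically. This is only a sketch: the function name C and the dictionary checks are chosen here for illustration, and a machine check of one instance is no substitute for the reasoning the exercise asks for.

```python
import string

omega = set(string.ascii_lowercase)   # the space: the 26 letters
E = omega - set("aeiou")              # the consonants, y included
F = set("realfunctions")              # letters occurring in "real functions"

def C(s):
    """Complement with respect to omega."""
    return omega - s

checks = {
    "(a) E ∪ F = Ω": (E | F) == omega,
    "(b) F ⊃ C(E)": F >= C(E),
    "(c) C(F) ⊂ E": C(F) <= E,
    "(d) F ∩ E and C(E) are disjoint": (F & E).isdisjoint(C(E)),
}
for statement, holds in checks.items():
    print(statement, holds)           # each line ends with True
```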


There are various logical difficulties inherent in the uncritical use of the terminology of the theory of sets, and they have given rise to a great deal of discussion. Fortunately, however, they arise only at a higher level of abstraction than we shall attain in the rest of this book, and in contexts that we should consider rather artificial, so that we may safely ignore them hereafter. Some forms of words which appear to define sets may not actually do so, somewhat as some combinations of letters which might well represent English words (for instance, "frong") do not actually do so. For example, although we can safely speak of sets whose elements are sets, we cannot safely talk about the set of all sets whatsoever. Supposing that we could, the set of all sets would necessarily have itself as one of its elements. This is a peculiar property, although there are other ostensible sets that have it, for example, the set of all objects definable in fewer than thirteen words (since this "set" is itself defined in fewer than thirteen words). We might well decide to exclude from consideration those sets that are elements of themselves. The remaining sets do not have themselves as elements; form the aggregate A of all such acceptable sets. Now is A one of the sets that we accept, or one of the sets that we exclude? If we accept A, it does not have itself as an element and so must be included in the aggregate of all sets with this property; that is, A belongs to A, and therefore we do not accept A. On the other hand, if we do not accept A, then A is an element of itself; then since all elements of A are sets that are not elements of themselves, and so are acceptable, we must accept A. Thus if A is a set at all, we are involved in a logical contradiction. The only way out seems to be to declare that the words that seem to define A do not actually define a set. Another paradoxical property of "the set of all sets" will turn up in §3. Exercise 1.3. 
A librarian proposes to compile a bibliography listing those, and only those, bibliographies that do not list themselves. Comment on this proposal.


2. Sets of real numbers. Since we have to start somewhere, I assume that you are familiar with the system of real numbers. I will take completely for granted its algebraic properties: those connected with addition, subtraction, multiplication, and division, and with inequalities. However, there is one property of the real numbers that is less familiar to most people, even though it underlies concepts, such as limit and convergence, that are fundamental in calculus. This property can be stated in many equivalent forms, and the particular one that we select is a matter of taste. I shall take as fundamental the so-called least upper bound property. Before we can state what this property is, we need some more terminology.

Let E be a nonempty set of real numbers. We say that E is bounded above if there is a number M such that every x in E satisfies the inequality x ≤ M. For example, the set of all real numbers less than 2 is bounded above, and we can take M = 2, or M = π, or M = 100. On the other hand, the set of all positive integers is not bounded above. If E is bounded above, then its least upper bound, or supremum, is B if B is the smallest M that can be used in the preceding definition. In our example, where E is the set of all real numbers less than 2, the supremum of E is 2. Another way of stating the definition of the supremum of E is to say that it is a number B such that every x in E satisfies x ≤ B, while if A < B there is at least one x in E satisfying x > A. The supremum of E may or may not belong to E. In the example just given, it does not. However, if we change the example so that E consists of all numbers not greater than 2, the supremum of E is still 2, and now it belongs to E.

So far, although we have talked about the supremum of a set, we have not known (except in our illustrative examples) whether there is any such thing. The least upper bound property, which we take as one of the axioms about real numbers, is just that every nonempty set E that is bounded above does in fact have a supremum. In other words, if we form the collection of all upper bounds of E, this collection has a smallest element (hence the name).

We denote the supremum of E by sup E or sup_{x∈E} x. When sup E belongs to E, we sometimes write max E instead. Thus max E is the largest element of E if E has a largest element. The greatest lower bound, or infimum, of E, denoted by inf E, and the minimum min E, are defined similarly. (See Exercise 2.2.)

An interval is a set consisting of all the real numbers between two given numbers, or of all the real numbers on one side or the other of a given number. More precisely, an interval consists of all real numbers x that satisfy an inequality of one of the forms a < x < b, a ≤ x < b, a < x ≤ b, a ≤ x ≤ b (where a < b), x > a, x ≥ a, x < a, or x ≤ a. Using a square bracket to suggest ≤ or ≥ and a parenthesis to suggest < or >, we shall often use the following notations for the corresponding intervals: (a, b), [a, b), (a, b], [a, b], (a, ∞), [a, ∞), (−∞, a), (−∞, a]. Thus (0, 1] means the set of all real numbers x such that 0 < x ≤ 1. (The use of the symbol ∞ in the notation for intervals is simply a matter of convenience and is not to be taken as suggesting that there is a number ∞.)

Exercise 2.1. For each of the sets E indicated below, describe the set of all upper bounds, the set of all lower bounds, sup E, and inf E.
(a) E is the interval (0, 1).
(b) E is the interval [0, 1).
(c) E is the interval [0, 1].
(d) E is the interval (0, 1].
(e) E consists of the numbers 1, 1/2, 1/3, ....
(f) E is the set containing the single point 0.
(g) E = { n + (−1)ⁿ : n = 1, 2, ... }.
(h) E = { n + (−1)ⁿ/n : n = 1, 2, ... }.
(i) E is the set of real numbers x between 0 and π such that sin x > 1/2.

Exercise 2.2. Give a detailed definition of inf E, formulate a greatest lower bound property, and prove that it is equivalent to the least upper bound property.

If E is not bounded above, we write sup E = +∞; if E is not bounded below, we write inf E = −∞. These are convenient abbreviations, but are not to be interpreted as implying that there are real numbers +∞ and −∞; there are not. We can, if we like, create such infinite numbers and adjoin them to the real number system, but for most purposes it is undesirable to do so. No matter how we introduce infinite numbers, we are bound to make arithmetic worse than it already is: there is one impossible operation to begin with (division by zero), but if we make this operation possible we render even more operations impossible.

Exercise 2.3. Explore the consequences of introducing numbers +∞ and −∞ such that a/0 = +∞ if a > 0, and a/0 = −∞ if a < 0. Can a reasonable meaning be given to +∞ + (−∞)? To 0 · (+∞)?

If a set is both bounded above and bounded below, it is called bounded. A bounded nonempty set E is characterized by having inf E and sup E both finite, or equivalently by being contained in some finite interval (a, b).

We have supposed in our discussion of upper and lower bounds that we have been considering nonempty sets. So that we shall not have to make reservations about whether or not our sets have elements, we make the (rather peculiar) convention that if E is empty, then sup E = −∞ and inf E = +∞. This convention allows us to say, for example, that sup(E ∪ F) is the larger of sup E and sup F, without having to consider whether E or F may be empty.
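The convenience of the empty-set convention shows up clearly in code. The sketch below is restricted to finite sets of numbers, for which the supremum is simply the largest element; the function names sup and inf are chosen here for illustration, and the floating-point values math.inf and -math.inf merely stand in for the symbols +∞ and −∞.

```python
import math

def sup(e):
    """Supremum of a finite set of reals, with sup of the empty set = -infinity."""
    return max(e, default=-math.inf)

def inf(e):
    """Infimum of a finite set of reals, with inf of the empty set = +infinity."""
    return min(e, default=math.inf)

E = {0.5, 2.0, 1.25}
F = set()                      # the empty set

# sup(E ∪ F) is the larger of sup E and sup F, with no special case for F.
print(sup(E | F) == max(sup(E), sup(F)))   # True
print(inf(E | F) == min(inf(E), inf(F)))   # True
print(sup(F), inf(F))                      # -inf inf
```

Without the convention, the rule for sup(E ∪ F) would need a separate branch whenever one of the two sets might be empty.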

Exercise 2.4. If E is not empty, then inf E ≤ sup E; there is strict inequality if E contains at least two points.*

* An exercise that simply makes a statement calls for a proof of that statement.

3. Countable and uncountable sets. If we have a set E with (say) five elements, for instance, the fingers on a hand, we can count these elements (or, for short, count E). This means exactly what we might expect: we can point to the elements of E, one by one, naming the integers 1, 2, 3, 4, 5, successively, as we do so. In slightly more formal language, we label all the elements of E by using the integers 1, 2, 3, 4, 5 once each. In still more formal language, we put the elements of E into one-to-one correspondence with the integers 1, 2, 3, 4, 5.

It turns out to be useful to extend the concept of counting to sets that have infinitely many elements. Suppose, for example, that E is now the set of all even positive integers. We can no longer point to all the elements of E, one after another, naming successive integers, because E has too many elements. However, we can still imagine labeling all the elements of E with all the positive integers, so that each element of E has a different integer attached to it: we just have to label each even integer with the integer whose double it is, that is, we label the integer 2n with the label n. Since the even integers can be put into one-to-one correspondence with all the integers in this way, we may reasonably say that there are the same number of each, without, however, committing ourselves as to what the number of integers may be.

It is interesting that there actually are sets that cannot be counted (as we shall soon prove), so that we can classify sets according to whether they can be counted or not. The ones that cannot be counted can be thought of as "bigger" than those that can be counted. (As we have just seen, a set is not necessarily bigger, in this sense, than one of its proper subsets.) Before establishing the existence of uncountable sets, we shall introduce some more terminology and give additional examples of sets that can be counted.

To count a set means to put its elements into one-to-one correspondence with some set of consecutive positive integers starting from 1; this set is not necessarily to be a proper subset of the set of all positive integers. If a set can be counted we call it countable. (The empty set is also called countable: here the relevant subset of the positive integers can be thought of, with a little effort, as the empty set.) We can write the elements of a nonempty countable set in some such form as x₁, x₂, x₃, ..., where a typical element of the set would be denoted by xₙ and the subscripts are the consecutive integers that are used as labels in counting the set. If we start out counting a countable set, either eventually we find a last element, or else the counting process continues indefinitely. In the first case, the set is called finite; in the second, countably infinite.

The set of all the positive and negative integers together is countable, since we can use the odd positive integers to label all the positive integers and the even positive integers to label all the negative integers, thus:

elements   ...   −3   −2   −1    1    2    3   ...
labels     ...    6    4    2    1    3    5   ...
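The labeling in the table can be written as an explicit rule together with its inverse, which makes the one-to-one correspondence visible. In this Python sketch the function names label and element are chosen here for illustration; the formulas are read off from the table (odd labels for the positive integers, even labels for the negative ones).

```python
def label(n):
    """Label attached to the nonzero integer n."""
    if n == 0:
        raise ValueError("0 is not among the positive and negative integers")
    return 2 * n - 1 if n > 0 else -2 * n

def element(k):
    """Inverse of label: the nonzero integer that carries the label k = 1, 2, 3, ..."""
    if k < 1:
        raise ValueError("labels are positive integers")
    return (k + 1) // 2 if k % 2 == 1 else -(k // 2)

print([label(n) for n in (-3, -2, -1, 1, 2, 3)])   # [6, 4, 2, 1, 3, 5]
print([element(k) for k in range(1, 7)])           # [1, -1, 2, -2, 3, -3]
```

Every positive integer is used as a label exactly once, which is what counting the set requires.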

Exercise 3.1. Show similarly that the union of any two countably infinite sets is countable.

A less obviously countable set is the set of lattice points in the plane: these are the points with both coordinates integral, for example (1, 2) or (−5, 18). It is easy to see how to count them from the diagram (which, for convenience, shows only the lattice points in the first quadrant; we can count all the lattice points by using Exercise 3.1 repeatedly). Without drawing the picture, we can think of first grouping all the lattice points (m, n) for which m + n is 0, 1, 2, 3, ..., and then counting the groups, one after another. We can alternatively view the diagram as representing the lattice points in the first quadrant as the union of a countable collection of countable sets: the sets are the successive horizontal rows.

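(0,2)   (1,2)   (2,2)   ...
(0,1)   (1,1)   (2,1)   ...
(0,0)   (1,0)   (2,0)   ...

[Diagram: the lattice points of the first quadrant, with arrows along the diagonals indicating the order in which they are counted.]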
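The grouping by m + n = 0, 1, 2, 3, ... translates into a short generator. This Python sketch is an illustration only: the names lattice_points and label are chosen here, and the direction of travel within each diagonal (increasing m) is an assumption, since any fixed order within the finite groups works equally well.

```python
from itertools import count, islice

def lattice_points():
    """Yield the first-quadrant lattice points (m, n), one group m + n = total at a time."""
    for total in count(0):
        for m in range(total + 1):
            yield (m, total - m)

def label(m, n):
    """Positive integer attached to (m, n) by the enumeration above."""
    total = m + n
    # total * (total + 1) // 2 points lie in the earlier groups.
    return total * (total + 1) // 2 + m + 1

first = list(islice(lattice_points(), 6))
print(first)    # [(0, 0), (0, 1), (1, 0), (0, 2), (1, 1), (2, 0)]
print([label(m, n) for (m, n) in first])   # [1, 2, 3, 4, 5, 6]
```

Each group is finite, so every lattice point is reached after finitely many steps.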

Some rather more complicated sets can be discussed by using the fact that every subset of a countable set is countable. Expressed in another way, this theorem says that a set whose elements can be labeled with some of the positive integers, using each only once, can equally well be labeled with consecutive positive integers (possibly all of them). To see this, observe that each element of the given subset of a countable set is labeled with some positive integer. Take the element with the smallest label and relabel it 1; then take the element with the smallest label from the rest of the subset (if there is any more of the subset) and relabel it 2; and so on.

It is now easy to see that the positive rational numbers form a countable set. A positive rational number can be represented as a fraction p/q in lowest terms, where p and q are positive integers. If we associate the fraction 3/11 with the lattice point (3, 11), and generally p/q with (p, q), we have the rational numbers in one-to-one correspondence with a subset of the lattice points, that is, with a subset of a countable set. Hence the positive rational numbers form a countable set.

Exercise 3.2. Show that the union of any countable collection of countable sets is countable.

Exercise 3.3. The set of all points inside a circle is called a disk. Let S be a set of nonoverlapping disks in the plane; that is, no disk in S has a nonempty intersection with any other disk in S. Show that S must be countable.

A still less obvious example is furnished by the algebraic numbers: these are the numbers (real or complex) that can be roots of polynomials with integral coefficients (for example, all rational numbers, √2, i, 2 · [...] + √7). To see that the set of algebraic numbers is countable, we notice first that there are only countably many linear polynomials with integral coefficients, countably many quadratic polynomials with integral coefficients, and so on.


Exercise 3.4. Why is this?

The polynomials of a given degree n with integral coefficients have at most n roots each, thus countably many altogether. The aggregate of roots of polynomials of every degree with integral coefficients is accordingly a countable collection of countable sets, and so countable.

A more abstract example is the class of all finite subsets of a given countable set. For, the class of subsets with one element each is countable, the class of subsets with two elements each is countable, and so on. We again have a countable collection of countable sets. (As we shall see later (page 16), the set of all subsets of a countably infinite set is not countable.)¹
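The relabeling argument for the positive rationals can be tried out in a few lines of Python. This is only a sketch under our own assumptions: the lattice points are visited diagonal by diagonal (one convenient ordering, not necessarily the one used in the text), and the names `lattice_points` and `positive_rationals` are invented for the illustration.

```python
from fractions import Fraction
from itertools import count, islice
from math import gcd

def lattice_points():
    """Yield every lattice point (p, q) with p, q >= 1, one diagonal p + q = s at a time."""
    for s in count(2):
        for p in range(1, s):
            yield p, s - p

def positive_rationals():
    """Relabel the lowest-terms lattice points with consecutive integers 1, 2, 3, ..."""
    label = 0
    for p, q in lattice_points():
        if gcd(p, q) == 1:          # keep only fractions p/q in lowest terms
            label += 1
            yield label, Fraction(p, q)

first = list(islice(positive_rationals(), 6))
# labels 1..6 go to 1, 1/2, 2, 1/3, 3, 1/4
```

Every positive rational receives exactly one label, because it has exactly one representation in lowest terms and its lattice point is reached after finitely many steps.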

The notion of countability can sometimes be used to show the existence of things of a particular kind. For a simple illustration, we prove that not all real numbers are algebraic. (A number that is not algebraic is called transcendental.) The real algebraic numbers are, as we know, countable; our first step is to suppose that they have been counted, and represented as decimals. It will simplify the notation somewhat, and do no harm, if we consider only real numbers between 0 and 1. Every real number between 0 and 1 has an ordinary decimal expansion, for example,

1/7 = 0.14285714285714285714…,
π − 3 = 0.14159265358979323846….

Conversely, every such expansion defines a real number between 0 and 1; if we write down, for example,

x = 0.123456789101112131415…,

then x is certainly a real number between 0 and 1, although we cannot connect it in any simple way with more familiar numbers.

Now suppose that the real algebraic numbers between 0 and 1 have been counted, so that there are a first, a second, a third, and so on; call them $a_1$, $a_2$, $a_3$, and so on. We then have a column of decimals, which might start like this:

$a_1$ = 0.215367…
$a_2$ = 0.652489…
$a_3$ = 0.061259…
$a_4$ = 0.300921…

This list is supposed to contain all the real algebraic numbers between 0 and 1; that is, any such number will appear in the list if we go far enough. It is now easy to construct a decimal that does not appear anywhere in this list and so cannot be an algebraic number. For example, if we write down 0.5655, then we have the start of a decimal that differs from $a_1$ in the first decimal place, from $a_2$ in the second, from $a_3$ in the third, and from $a_4$ in the fourth; obviously it is not going to be any of these four numbers. We can go on in the same way, putting 5 in the nth decimal place if $a_n$ has anything except 5 there, and putting 6 in the nth place if $a_n$ does have 5 there. The resulting decimal differs from each $a_n$ in the nth decimal place and so cannot appear in our hypothetical list of all algebraic numbers, so it is not algebraic.

We can describe the construction more concisely if we let the digits of the nth algebraic number be $a_{n,1}$, $a_{n,2}$, $a_{n,3}$, …, and construct a new number $b = 0.b_1 b_2 b_3 \cdots$ by taking $b_n = 5$ if $a_{n,n} \ne 5$, and $b_n = 6$ if $a_{n,n} = 5$. (There is no particular significance to the numbers 5 and 6.)
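The diagonal rule is mechanical enough to run. The following Python sketch applies it to the four sample decimals 0.215367…, 0.652489…, 0.061259…, 0.300921… of the illustrative column; the function name `diagonal_digits` and the representation of each expansion as a string of digits are assumptions made for the example.

```python
def diagonal_digits(expansions):
    """expansions[n] holds the digits after the decimal point of the (n+1)-th number.
    Return b_1 b_2 ... with b_n = 5 if a_{n,n} != 5 and b_n = 6 if a_{n,n} = 5."""
    digits = []
    for n, a in enumerate(expansions):
        digits.append("6" if a[n] == "5" else "5")
    return "".join(digits)

listed = ["215367", "652489", "061259", "300921"]   # the four sample decimals
b = diagonal_digits(listed)
print("0." + b)   # 0.5655
```

By construction the nth digit of the result differs from the nth digit of the nth listed number, which is all the argument needs.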

It is sometimes alleged that a proof of this kind is only a "pure existence proof" and furnishes no explicit example of a transcendental number. This is not the case. At least in principle, it is possible to count the algebraic numbers explicitly, find their decimal expansions, and so write down as many digits as we like of at least one transcendental number. The reason that the number π, say, seems more concrete is that π occurs in more contexts than the number we have just been talking about, so that more is known about it; in particular, people have been interested enough to compute billions of decimal places of π already.²

We have shown how to find a transcendental number; to show that some given number is transcendental is much harder: it is a problem in number theory and requires deeper methods than the simple argument used here. The transcendence of e is fairly difficult, that of π considerably harder, $e^{\pi}$ and $2^{\sqrt{2}}$ much harder again; and it is not known whether $\pi^{e}$ is transcendental or not, or even whether it is irrational. Another transcendental number is 0.10100100000010…, where there are n! zeros after the nth 1. This transcendental number is in a sense simpler than π or e, since we could say without much trouble what any particular digit, say the quadrillionth, is, whereas we cannot, at least at present, do this for π or e.³
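The remark that any particular digit of 0.10100100000010… can be found without much trouble can be checked directly. Here is a minimal Python sketch, with the function name `digit` chosen by us: it walks from one 1 to the next, skipping n! zeros after the nth 1, until it reaches or passes the requested position.

```python
from math import factorial

def digit(k):
    """k-th decimal digit (k >= 1) of 0.10100100000010..., with n! zeros after the n-th 1."""
    position, n = 1, 1                  # the n-th 1 sits at `position`
    while position < k:
        position += factorial(n) + 1    # skip n! zeros, land on the next 1
        n += 1
    return 1 if position == k else 0

prefix = "".join(str(digit(k)) for k in range(1, 15))
# prefix reproduces the digits quoted in the text: 10100100000010
```

Because the positions of the 1's grow factorially, a call such as `digit(10**15)` needs only a handful of loop steps.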

If we look over the proof of the existence of transcendental numbers, we see that no use was made of the fact that $a_1$, $a_2$, …, were algebraic numbers beyond the fact that the algebraic numbers form a countable set. The same argument applies, word for word, to show that if E is any given countable set of real numbers between 0 and 1, then there is a real number between 0 and 1 that is not in E. Hence no countable set can exhaust the set of real numbers between 0 and 1, or in other words the set of real numbers between 0 and 1 cannot be countable.

Exercise 3.5. The particular interval (0, 1) is not significant; modify the preceding argument, or use the result, to show that the set of real numbers in any interval, however short, is not countable.

Exercise 3.6. Show that the set of real numbers in (0, 1) whose decimal expansions do not contain any 3's is not countable. (The number 3 has no particular significance here.)

Exercise 3.7. Criticize the following "proof" that the set of real numbers between 0 and 1 is countable: First count the decimals that have only one nonzero digit, then those with at most two nonzero digits, and so on; we have then broken the set into a countable collection of countable sets.

We defined a set to be finite if it is countable but not countably infinite. We naturally call a set infinite, whether it is countable or not, as long as it is not finite.

Every infinite set contains a countably infinite subset. To see this, choose a first element $x_1$, quite arbitrarily. The set with $x_1$ removed is still infinite (why?); choose an element $x_2$ from this reduced set; and so on. This process cannot terminate (again, why?), so our set contains the countably infinite subset $x_1$, $x_2$, ….

Exercise 3.8. Supply answers to the two questions in the preceding paragraph.

Exercise 3.9. Show that if E is any infinite set and F is E with one point deleted, then E and F can be put into one-to-one correspondence with each other. Thus every infinite set can be put into one-to-one correspondence with a proper subset of itself.


Exercise 3.10. Establish a one-to-one correspondence between a finite interval and the set of all real numbers.

As another application of the kind of reasoning that we have been using, we prove that the aggregate A of subsets of any given nonempty set E is "larger" than E, in the sense that A cannot be put into one-to-one correspondence with E, or indeed with any subset H of E. We shall not make any use of this fact, but it helps to justify the remark on page 4 about the paradoxical character of "the set of all sets."

Exercise 3.11. Use the theorem just stated to show that the notion of the set of all sets is paradoxical.

For finite sets we could show without much trouble that there are two subsets of a set with one element (the set itself and the empty set); four subsets of a set with two elements (the whole set, two sets containing one element each, and the empty set); eight subsets of a set with three elements; and generally $2^n$ subsets of a set with n elements. So our statement is true for finite sets; the general proof will, in fact, cover all sets with at least one element.

Suppose, then, that E is a set with at least one element, and suppose that the collection of all the subsets of E can be put into one-to-one correspondence with a subset H of E. In other words, suppose that the subsets of E can be labeled, say as $F_x$, where x runs through the elements of H, so that every subset is labeled and no element of H is used twice. We are now going to deduce a contradiction, and this contradiction will show that the alleged one-to-one correspondence cannot exist.

We form a subset G of E in the following way. For each x in H, we look at $F_x$ and see whether $F_x$ contains x. If $F_x$ does not contain x, put x in G. (In particular, the x for which $F_x$ is empty is put in G; the x for which $F_x$ is E is not put in G.) Then G is a proper subset of E, and so by assumption corresponds to some z in H, that is, G is $F_z$. However, by construction, if z is in $F_z$, then we did not put the element z into G, so G is not $F_z$; if, on the other hand, z is not in $F_z$, then G contains z and $F_z$ does not, so again G is not $F_z$. We have thus deduced from our initial assumption the contradictory statements that G is $F_z$ and that G is not $F_z$, so that the initial assumption is untenable.

The preceding theorem shows, for example, that there are more sets of real numbers than there are real numbers. In a rather similar way we could show that there are more real-valued functions, with domain the real numbers, than there are real numbers.

Exercise 3.12. Prove the preceding statement.

We now establish the fact, which seems surprising at first sight, that there are just as many points in a line segment as in a square area: that is, the real numbers between 0 and 1 can be put into one-to-one correspondence with the points in a square. (The points in a square are ordered pairs of real numbers, their coordinates; see page 22.) The general idea of the correspondence is easy to grasp: if we have two real numbers, represented as decimals, we can interlace their digits to get a single real number; conversely, given a real number, we can dissect its decimal expansion to get a pair of real numbers. The details are not quite obvious, however. Suppose, to make decimal representations unique, that we select the nonterminating decimal when there is a choice: thus we take 0.243999… instead of 0.244000…. The obvious procedure of making (p, q) with $p = 0.p_1 p_2 p_3 \cdots$ and q = 0. […] $|x_n - y_n|$. For example, if x = {1, ±, −±, −±, ±, …} and y = {1, −1, 0, 0, …}, then d(x, y) = | .
24

Chap. 1

SETS

The space m is the space of bounded sequences. Its elements are again sequences of numbers, but now required only to be bounded. The distance is the same as for $c_0$. Thus we could have x = {1, 0, 1, 0, …} and y = {1, 4, 1, 5, 9, 2, 6, 5, …}, where no $y_n$ is to be negative or exceed 9. Here the distance d(x, y) cannot be computed until we know more about the law of formation of y. Given that $y_9 = 3$, $y_{10} = 5$, $y_{11} = 8$, and $y_{12} = 9$, however, we see that d(x, y) = 9, the largest possible value.

The space $l^2$ is the space of sequences for which the sum of the squares of the components converges. Its elements are sequences $x = \{x_1, x_2, \ldots\}$ with $x_1^2 + x_2^2 + x_3^2 + \cdots < \infty$. We take $d(x, y) = \{\sum_{n=1}^{\infty} (x_n - y_n)^2\}^{1/2}$. To verify the triangle inequality in this case, we need Minkowski's inequality (see page 184):

$$\Big\{\sum_{n=1}^{\infty} (x_n - z_n)^2\Big\}^{1/2} \le \Big\{\sum_{n=1}^{\infty} (x_n - y_n)^2\Big\}^{1/2} + \Big\{\sum_{n=1}^{\infty} (y_n - z_n)^2\Big\}^{1/2} = d(x, y) + d(y, z).$$
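The two distances just described can be computed on finite initial segments of sequences. The Python sketch below is only an illustration under stated assumptions: it works with the first 12 terms of the sequences x and y from the example for the space m (taking the ninth through twelfth terms of y to be 3, 5, 8, 9), the helper names are ours, and a finite prefix can of course only give a lower bound for a supremum in general.

```python
import math

def sup_distance(x, y):
    """Largest |x_n - y_n| over the terms supplied."""
    return max(abs(a - b) for a, b in zip(x, y))

def l2_distance(x, y):
    """Square root of the sum of (x_n - y_n)^2 over the terms supplied."""
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(x, y)))

x = [1, 0] * 6                                # 1, 0, 1, 0, ... (12 terms)
y = [1, 4, 1, 5, 9, 2, 6, 5, 3, 5, 8, 9]      # first 12 terms of y
d_m = sup_distance(x, y)                      # 9, attained at n = 12

# Triangle inequality for the l^2 distance on three short sample vectors
u, v, w = [1.0, 2.0, 3.0], [0.0, -1.0, 4.0], [2.0, 2.0, 2.0]
triangle_holds = l2_distance(u, w) <= l2_distance(u, v) + l2_distance(v, w)
```

Since no term of y exceeds 9 and every term of x is 0 or 1, no later term can produce a difference larger than 9, so the value found on the prefix is already the full distance in m.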

Next we have some examples of metric spaces whose points are functions. The space C is the space of continuous functions defined on the closed interval [0, 1]. Its elements are continuous functions x = x(t), where 0 ≤ t ≤ 1, with distance d(x, y) = max |x […] $\sqrt{2}$ and x < $\sqrt{2}$.

On the other hand, a set such as the one indicated in the figure, consisting of an oscillating curve condensing toward a line segment, together with the line segment, is connected. (The graph of y = sin(1/x), together with the segment −1 ≤ y ≤ 1, has a similar character.)

[Figure: an oscillating curve condensing toward the line segment L at the left.]

We might think that the line segment L at the left could be separated from the rest of the set, just as two abutting open intervals can be separated from each other. However, an open set that covers any point of L would have to contain a neighborhood of a point of L, and the oscillating part of the graph enters any such neighborhood. Indeed, for the same reason, the set is still connected if we include only the rational points on L, or only the irrational points on L. By combining two sets of this kind it is possible to construct two connected sets inside a square, one of which joins two opposite corners of the square, and the other of which joins the other two corners, although the two sets have no point in common. It is also possible to have a set that is connected but has the property that after one particular point is removed, the remainder has no connected subsets containing more than one point. (A construction is outlined on page 42.)

A set may be open or not according to the space in which it is considered. However, we must not suppose, merely because connectedness was defined by using open sets, that a set can change from being connected to not being connected if it is considered as a subset of a different space. In fact, the property


of being connected, unlike the property of being open, is an intrinsic property of the set. That is, if a set is connected when considered in one space, it is still connected when considered in another space, as long as the metric remains the same on points of the set. We shall show (equivalently) that the property of not being connected depends only on the set.

Suppose that E is a set in a metric space S, that E is not connected, and that $S_1$ is a subspace of S and S is a subspace of $S_2$, with $E \subset S_1$. We have to show that, whether we have added points to S (to get $S_2$) or taken points away from S (to get $S_1$), the set E is still disconnected. We start with the assumption that $E \subset A \cup B$, where A and B are open (in S), and the intersections $A \cap E$ and $B \cap E$ are disjoint and nonempty.

To show that E is disconnected in the space $S_1$, we replace the sets A and B by subsets $A_1$ and $B_1$ of $S_1$, where $A_1$ contains all the points of A that are in $S_1$, and $B_1$ contains all the points of B that are in $S_1$. These sets still cover E (since $E \subset S_1$), and their intersections with E are still nonempty and disjoint. We need to show that $A_1$ and $B_1$ are open subsets of $S_1$. If p is a point of $A_1$, then p is certainly a point of A, and since A is open in S, all points of S at distance less than some r from p belong to A. Consequently, all points of $S_1$ at distance less than r from p belong to $A_1$. Thus every point of $A_1$ is an interior point of $A_1$ (with respect to $S_1$). Similarly $B_1$ is open with respect to $S_1$. Hence E is still disconnected in $S_1$.

To show that E is disconnected in the space $S_2$, we inflate the covering sets A and B into subsets $A_2$ and $B_2$ of $S_2$ as follows. If p is a point of A, then since A is open in S, all points of S at distance less than some r from p belong to A. We adjoin to A all points of $S_2$ at distance less than r from p. Do this for each point p in A, and denote the enlarged set by $A_2$. Notice that $A_2$ is an open set relative to $S_2$ by Exercise 5.18, because each point of $A_2$ belongs to a neighborhood in $S_2$ that is contained in $A_2$. We obtain the open set $B_2$ by similarly enlarging B. The sets $A_2$ and $B_2$ still cover E and have nonempty


intersections with E. It may happen that $A_2$ and $B_2$ have some points in common, but their intersections with E are still disjoint, because the points we added to A and B are points that lie outside S, and in particular lie outside E. Hence E is still disconnected in $S_2$.

It is easy to show that $R_1$ is connected. If it were not, it would be the union of two disjoint nonempty open sets. These sets would also be closed (Exercise 5.20), since each is the complement of the other. Hence it is enough to prove that $R_1$ has no nonempty subset that is both open and closed and is not all of $R_1$. Suppose there is such a subset; call it G. Then G is made up of countably many open intervals whose endpoints are not in G, as we saw on page 30. On the other hand, these endpoints are boundary points (also limit points) of G, and since G is also closed, they belong to G. This contradiction shows that G cannot exist.

To show that $R_2$ is connected, it is sufficient to show that it contains no set, other than itself and the empty set, that is both open and closed. Let E be such a set; let $P \in E$ and $Q \in C(E)$. Consider the infinite straight line through P and Q, regarded as a space L, with the distance between two points in L equal to their distance in $R_2$. Then L is a copy of $R_1$. The sets $E \cap L$ and $C(E) \cap L$ are both open and closed in L, and neither of them is empty (since one contains P and the other contains Q). This contradicts the connectedness of $R_1$.

Exercise 5.27. Show that every nonempty set in $R_2$, except for $R_2$ itself, has a nonempty boundary.

Exercise 5.28. The closure of a set E is the union of E and the set of all limit points of E. Show that it is also the union of E and the set of all boundary points of E, and that it is closed.

Exercise 5.29. If E is not finite, must some point of E be an interior point of the closure of E?

Exercise 5.30. What are the closures of the sets in Exercise 5.25?

Exercise 5.31. A neighborhood of x consists of the points y such that d(x, y) < r. Show that in $R_1$ or $R_2$ the closure of this neighborhood is the set of points y such that d(x, y) ≤ r. Is this true in every metric space?

An important fact about closed sets is that the union of two closed sets is still closed, and the intersection of two closed sets is still closed. Since a closed set is characterized by containing all its limit points, to prove the first statement we consider two closed sets $E_1$ and $E_2$, and any limit point p of $E_1 \cup E_2$; we are to show that $p \in E_1 \cup E_2$. I claim that p is a limit point of either $E_1$ or $E_2$. For if p is not a limit point of $E_1$, then there is a neighborhood of p containing no point of $E_1$ (other than p), and if p is not a limit point of $E_2$, then there is a neighborhood of p containing no point of $E_2$ (other than p); the smaller of these neighborhoods contains no point of either $E_1$ or $E_2$ (other than p), contradicting that p is a limit point of $E_1 \cup E_2$. Thus p is a limit point of at least one of $E_1$ and $E_2$. Since $E_1$ and $E_2$ are both closed, $p \in E_1$ if p is a limit point of $E_1$, and $p \in E_2$ if p is a limit point of $E_2$. Hence p belongs to at least one of $E_1$ and $E_2$, that is, p belongs to $E_1 \cup E_2$. Therefore $E_1 \cup E_2$ contains all its limit points and so is closed.

To see that the intersection of two closed sets is closed, consider a limit point q of $E_1 \cap E_2$. Every neighborhood of q contains points, other than q, belonging both to $E_1$ and to $E_2$. This property makes q a limit point of $E_1$ and a limit point of $E_2$. Since $E_1$ and $E_2$ are both closed, q belongs to both and so to $E_1 \cap E_2$. Therefore $E_1 \cap E_2$ contains all its limit points and so is closed.

Exercise 5.32. Prove by induction that the union of any finite number of closed sets is closed.

Exercise 5.33. Show that the intersection of any collection (possibly uncountable) of closed sets is closed.

Exercise 5.34. Find an example to show that the union of a countably infinite collection of closed sets is not necessarily closed.

Exercise 5.35. By arguing in a similar way, or by considering the complements of the sets concerned, show that the union of any collection of open sets is open; that the intersection of finitely many open sets is open; and that the intersection of a countably infinite collection of open sets need not be open.

Exercise 5.36. Let x be a given point of a metric space. If $N_1$ is the neighborhood of x consisting of all y such that d(x, y) < r, and $N_2$ is the neighborhood of the same x consisting of all y such that d(x, y) < r/2, show that the closure of $N_2$ is a subset of $N_1$. Must it be a proper subset?

A set E is called perfect if either it is empty, or it is closed and every point of E is a limit point of E. A closed interval in $R_1$ is perfect; so is the union of a finite number of closed intervals. In the next section, we shall meet examples of more general perfect sets.

NOTES

¹Two references for the concepts in §§5-7 are K. Kuratowski, Topology, English translation, Academic Press, New York, 1966; and A. Wilansky, Topology for Analysis, Krieger Publishing, Melbourne, FL, 1983.

²Many people use the word "neighborhood" to mean any set that contains one of our neighborhoods.


6. Dense and nowhere dense sets. A set E is everywhere dense (or, for short, just dense) if its closure is the whole space. In particular (in $R_n$, equivalently), E is dense when every point of the space is a limit point of E. A set is nowhere dense if its closure contains no neighborhoods. In other words, E is nowhere dense either if E is empty, or if every neighborhood in the space contains a subneighborhood that is disjoint from E. The rational points in $R_1$ form a dense set. A set consisting of a finite number of points of $R_1$ is nowhere dense.

Note that "nowhere dense" is not the negative of "everywhere dense." A set that fails to be everywhere dense must have the property that its closure fails to fill some neighborhood (possibly small). If a set fails to be nowhere dense, its closure must fill some neighborhood, but not necessarily the whole space.

Occasionally we need to say that a set E is dense in an interval, or in some other set, or is nowhere dense in an interval. Such phrases are self-explanatory.

Exercise 6.1. Consider the space Ω whose elements are the integral points 1, 2, 3, … of $R_1$ with the $R_1$ metric. Describe the neighborhoods in Ω. Is the set containing the single point 1 a nowhere dense set in Ω?

Exercise 6.2. If a closed set contains no neighborhoods, it is nowhere dense.

Since every point of a perfect set is a limit point of the set, it would appear that a nonempty perfect set must have a great many points. It is therefore somewhat surprising that a nonempty set can be both nowhere dense and perfect.


Exercise 6.3. $R_1$, considered as a subset of $R_2$, is nowhere dense and perfect.

In $R_1$, there are no nowhere dense perfect sets of such a simple structure. An example of a nowhere dense perfect set in $R_1$ is the Cantor set, which may be used as the basis for constructing many examples of sets and functions with remarkable properties. This set is constructed as follows. Consider the closed interval [0, 1] in $R_1$. Remove the open middle third, that is, the interval (1/3, 2/3). Next remove the open middle thirds of the two remaining intervals, that is, remove (1/9, 2/9) and (7/9, 8/9). Then remove the open middle thirds of the four remaining intervals; and so on indefinitely.

[Figure: the interval from 0 to 1, with the points 1/9, 2/9, 1/3, 2/3, 7/9, 8/9 marked.]

What remains? In the first place, what has been removed is a union of open sets (indeed, of open intervals), and so is open; what remains is its complement (with respect to [0, 1]), and so is a closed set. The endpoints of the various middle thirds were not removed, so they remain; and since the remaining set is closed, every limit point of endpoints remains. For example, if we start from 1/3 and take the closest endpoint in the second step (1/3 − 1/9 = 2/9), then the closest endpoint in the third step (1/3 − 1/9 + 1/27), and so on, the (only) limit point of this set of points is 1/3 − 1/9 + 1/27 − ⋯ = 1/4. Thus there are, in fact, limit points of endpoints which are not endpoints.¹ The Cantor set is the set that remains after we have removed all the middle thirds: it consists of all the endpoints and of their limit points.
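The first few steps of the middle-thirds construction are easy to carry out exactly. The Python sketch below uses exact fractions; the function names are our own, and the code is only an illustration of the construction, not a substitute for the argument. It lists the closed intervals that remain after a given number of steps and checks that the alternating sums 1/3, 1/3 − 1/9, 1/3 − 1/9 + 1/27, … are endpoints that approach 1/4.

```python
from fractions import Fraction

def remaining_intervals(steps):
    """Closed intervals left in [0, 1] after removing open middle thirds `steps` times."""
    intervals = [(Fraction(0), Fraction(1))]
    for _ in range(steps):
        next_intervals = []
        for a, b in intervals:
            third = (b - a) / 3
            next_intervals.append((a, a + third))    # left closed third stays
            next_intervals.append((b - third, b))    # right closed third stays
        intervals = next_intervals
    return intervals

def endpoints(steps):
    """All endpoints of the intervals remaining after `steps` removals."""
    return sorted({p for interval in remaining_intervals(steps) for p in interval})

# Partial sums of 1/3 - 1/9 + 1/27 - ... : each is an endpoint, and they approach 1/4.
partial, sums = Fraction(0), []
for k in range(1, 7):
    partial += Fraction((-1) ** (k + 1), 3 ** k)
    sums.append(partial)
```

After six terms the partial sum differs from 1/4 by exactly 1/(4·3⁶), while 1/4 itself never appears among the endpoints, since every endpoint has a power of 3 as denominator.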

Exercise 6.4. Does the Cantor set contain the point $\sqrt{\pi} - 1 = 0.77245\ldots$?


We have just observed that the Cantor set is closed. It also contains no interval, since the total length of the intervals removed is 1/3 + 2/9 + 4/27 + ⋯ = 1. Therefore the Cantor set is nowhere dense (Exercise 6.2). To show that it is perfect, we have to show that each of its points is a limit point. Since the limit points of endpoints are naturally limit points of the set, it is merely a question of showing that the endpoints are limit points. Consider, for example, the point 1/3. To the left of it there is an interval of length 1/3 from which we remove the middle third, leaving an interval of length 1/9 adjacent to the point 1/3; then we remove the interval (7/27, 8/27), leaving an interval of length 1/27 adjacent to 1/3; and so on. In any neighborhood of 1/3 there will always be a short interval that is not removed at some step, and this interval will contain an endpoint belonging to a subsequent step. Hence 1/3 is a limit point of endpoints. A similar argument applies to any other endpoint.

It will be useful later on to have a purely arithmetical construction for the Cantor set. We use the expansion of real numbers in "decimals" in bases 2 and 3 (binary and ternary expansions), instead of in base 10. For example, 0.10010110… (base 2) means

$$\frac{1}{2} + \frac{0}{2^2} + \frac{0}{2^3} + \frac{1}{2^4} + \frac{0}{2^5} + \frac{1}{2^6} + \frac{1}{2^7} + \frac{0}{2^8} + \cdots,$$

while 0.10010110… (base 3) means

$$\frac{1}{3} + \frac{0}{3^2} + \frac{0}{3^3} + \frac{1}{3^4} + \frac{0}{3^5} + \frac{1}{3^6} + \frac{1}{3^7} + \frac{0}{3^8} + \cdots.$$

In base 2 we have only the digits 0 and 1, whereas in base 3 we have the digits 0, 1, and 2. Thus 0.020202… (base 3) equals 1/4. (The reasoning is the same that is used in summing a repeating decimal in base 10: if x = 0.0202… (base 3), then multiplying by $3^2$ moves the ternary point two places to the right, so 9x = 2.0202… = 2 + x, whence 8x = 2.) Note that this is not the expansion of 1/4 that we used above in showing that 1/4 belongs to the Cantor set; but it is equivalent since

$$\frac{2}{3^2} + \frac{2}{3^4} + \cdots = \frac{3}{3^2} - \frac{1}{3^2} + \frac{3}{3^4} - \frac{1}{3^4} + \cdots = \frac{1}{3} - \frac{1}{3^2} + \frac{1}{3^3} - \frac{1}{3^4} + \cdots.$$
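The arithmetic of binary and ternary expansions can be checked with exact fractions. In the following Python sketch the function names are hypothetical, and only finitely many digits (or one repeating block) are handled: the first function sums the displayed digits of 0.10010110 in either base, and the second applies the "multiply and subtract" reasoning to a repeating block such as 02 in base 3.

```python
from fractions import Fraction

def expansion_value(digits, base):
    """Value of 0.d1 d2 d3 ... (finitely many digits) in the given base."""
    return sum(Fraction(d, base ** k) for k, d in enumerate(digits, start=1))

def repeating_value(block, base):
    """Value of 0.(block)(block)...: if the block has n digits, base**n * x = block + x."""
    n = len(block)
    block_as_integer = sum(d * base ** (n - 1 - i) for i, d in enumerate(block))
    return Fraction(block_as_integer, base ** n - 1)

binary = expansion_value([1, 0, 0, 1, 0, 1, 1, 0], 2)    # 75/128
ternary = expansion_value([1, 0, 0, 1, 0, 1, 1, 0], 3)   # 760/2187
quarter = repeating_value([0, 2], 3)                      # 0.020202... in base 3 is 1/4
```

For the block 02 the second function computes 2/(9 − 1), which is the relation 8x = 2 in another guise.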

Now let us express all the numbers between 0 and 1 in base 3. The numbers whose first digit is 1 are between 1/3 and 2/3 (inclusive), so they fill the first interval whose interior was discarded in forming the Cantor set. The numbers whose first digit is 0 fill the interval [0, 1/3], and the subset of these whose second digit is 1 are the numbers between 1/9 and 2/9, that is, they fill one of the intervals whose interior was excluded in the second step of the construction. Every number excluded so far has a 1 in the first or second place in its ternary expansion; so do the endpoints, but they also have expansions that have no 1's. For example, 1/3 = 0.10000… = 0.02222… (just as 0.1000… = 0.09999… in base 10), and 2/3 = 0.12222… = 0.2000…. By continuing in this way, we see that the Cantor set can be described as consisting precisely of those numbers that have expansions in base 3 containing no 1's. The endpoints are the numbers of this kind whose expansions in base 3 end in all 0's or all 2's.

Exercise 6.5. Show that the Cantor set is uncountable.

Actually we can say more: the Cantor set can be put into one-to-one correspondence with the set of all real numbers between 0 and 1. Recall that the points of the Cantor set are just the numbers that can be written in base 3 using only 0's and 2's. With each such number x associate the number obtained by halving each digit in the ternary expansion of x and interpreting the result in base 2. In this way we get every number between 0 and 1 from some point of the Cantor set, and the endpoints of excluded intervals give rise to two different representations of the same number, for example, 1/3 = 0.022… in base 3 and 2/3 = 0.2000… in base 3 both yield 1/2 = 0.0111… = 0.1000… in base 2. The correspondence is one-to-one between the Cantor set, less the countable set of endpoints of excluded intervals, and the set of all real numbers, less the countable set of numbers that have double representations in base 2. By making these countable sets correspond to each other, we obtain a one-to-one correspondence between the Cantor set and the set of real numbers between 0 and 1.

Exercise 6.6. Does the Cantor set contain any irrational points? If so, find one explicitly.

We can use the Cantor set to construct the set mentioned on page 33, which becomes totally disconnected after the removal of a single point. Let P be the point in the plane with coordinates ( 1 ), and join P to the points of the Cantor set by straight line segments. Now delete the points with irrational ordinates on the lines going to endpoints of complementary intervals of the Cantor set, and delete the points with rational ordinates on the lines going to the other points of the Cantor set. The resulting set is connected, but when P is removed it contains no connected subsets except for single points. It is known as the Cantor teepee.²

A metric space is called separable if it contains a countable set that is everywhere dense. For example, $R_1$ is separable because the rational numbers form a countable dense set.
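Returning for a moment to the ternary description of the Cantor set given earlier in this section, here is a minimal Python sketch of two of the ideas involved. It is restricted to rational numbers, and the function names are our own assumptions. The first function generates ternary digits one at a time, always choosing an expansion without 1's when one exists, and stops when the remainder repeats; the second halves the digits of a terminating expansion and reads the result in base 2.

```python
from fractions import Fraction

def in_cantor_set(x):
    """Decide whether a rational x in [0, 1] has a base-3 expansion with no 1's."""
    seen = set()
    while x not in seen:
        seen.add(x)
        if x <= Fraction(1, 3):
            x = 3 * x            # next ternary digit 0
        elif x >= Fraction(2, 3):
            x = 3 * x - 2        # next ternary digit 2
        else:
            return False         # strictly inside a removed middle third
    return True                  # the digits now repeat forever, never using 1

def halve_digits(ternary_digits):
    """Map a terminating expansion 0.d1 d2 ... (digits 0 or 2, base 3) to its base-2 image."""
    return sum(Fraction(d // 2, 2 ** k) for k, d in enumerate(ternary_digits, start=1))

checks = {q: in_cantor_set(q) for q in (Fraction(1, 4), Fraction(1, 3), Fraction(1, 2))}
image_of_two_thirds = halve_digits([2])      # 0.2 (base 3) goes to 0.1 (base 2) = 1/2
```

The loop must stop because the successive remainders of a rational number with denominator q all have denominators dividing q, so only finitely many values can occur.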


Exercise 6.7. Show that $R_2$ is separable.

The space $c_0$ (sequences tending to zero; page 23) is separable. As a countable dense set we may choose the set consisting of all sequences of rational numbers in which only a finite number of elements are different from 0 (for instance, (1, 0, 0, …), or a sequence of three rational fractions followed by zeros). This set is countable for the same reason that the set of all finite subsets of the rational numbers is countable (page 12). To show that it is dense in $c_0$, we recall that the distance between two points $x = (x_1, x_2, \ldots)$ and $r = (r_1, r_2, \ldots)$ in $c_0$ is $\sup |x_k - r_k|$. Let x be any point in $c_0$. We want to select a point r, where all $r_k$ are rational and only finitely many of them are not zero, so that $\sup |x_k - r_k|$ is small. Taking an arbitrary positive ε to measure the desired degree of smallness, choose N so large that $|x_n| < \epsilon$ for n > N (compare the more detailed discussion of convergence in §8). Let $r_1$, $r_2$, …, $r_N$ be rational numbers such that $|x_n - r_n| < \epsilon$ for n = 1, 2, …, N. Then $r = (r_1, r_2, \ldots, r_N, 0, 0, \ldots)$ is the required point.
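The approximation step in the density argument for the space of sequences tending to zero can be illustrated numerically. The sketch below is written under our own assumptions: the sample sequence has nth term 1/√n, the tolerance is 0.1, the cutoff N = 100 is chosen by hand so that every later term is smaller than the tolerance, and rationals are produced by rounding to a fixed denominator (one convenient choice among many).

```python
from fractions import Fraction
import math

def rational_approximation(x_terms, eps):
    """Given x_1..x_N (floats), return rationals r_1..r_N with |x_n - r_n| < eps.
    The approximating point is r_1, ..., r_N followed by zeros."""
    d = math.floor(1 / eps) + 1            # a denominator with 1/d < eps
    return [Fraction(round(x * d), d) for x in x_terms]

eps = 0.1
N = 100                                    # 1/sqrt(n) < eps for every n > 100
x = [1 / math.sqrt(n) for n in range(1, N + 1)]
r = rational_approximation(x, eps)
worst = max(abs(a - float(b)) for a, b in zip(x, r))   # sup over the first N terms
```

Rounding to the nearest multiple of 1/d changes each term by at most 1/(2d), which is less than the tolerance; beyond the Nth term the approximating sequence is zero and the original terms are already smaller than the tolerance.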

In the space C of continuous functions, the set of all polynomials is everywhere dense, as we shall show in §19.

Exercise 6.8. Show that there are uncountably many polynomials.

Exercise 6.9. Show that there are countably many polynomials all of whose coefficients are rational.

Exercise 6.10. Show that if the set of all polynomials is dense in C, so is the set of all polynomials all of whose coefficients are rational. Deduce that C is separable.

The space m of bounded sequences of real numbers (page 24) is an example of a space that is not separable. We can see this as follows.


Exercise 6.11. Show that there is an uncountable set S in m, all the points of S being represented by sequences containing only 0's and 1's.

Exercise 6.12. What is the distance in m between any two of the points of the set S of Exercise 6.11?

Take any everywhere dense set E in m. We shall put the set S (described in Exercise 6.11) into one-to-one correspondence with a subset of E. This will show that E contains an uncountable subset and so is uncountable. Therefore m can contain no countable dense set.

To put S into one-to-one correspondence with a subset of E, we proceed as follows. Take any point p of S. There is a point q of E at distance less than 1/2 from p, since E is everywhere dense. The distance from q to any other point s of S is more than 1/2, since $1 = d(p, s) \le d(p, q) + d(q, s) < \frac{1}{2} + d(q, s)$. In this way we have associated a different point of E with each point of S, so that E has at least as many points as S, and hence uncountably many.
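The inequality used in this argument can be seen at work on a small numerical example. In the Python sketch below, p and s stand for two points of S and q for a nearby point of E; all three are shown only through their first five terms (we assume they agree with p from there on), and the particular numbers are made up for the illustration.

```python
def sup_distance(x, y):
    """Largest |x_n - y_n| over the terms supplied."""
    return max(abs(a - b) for a, b in zip(x, y))

p = [0, 1, 1, 0, 1]                  # a point of S (only 0's and 1's)
s = [0, 1, 0, 0, 1]                  # another point of S, differing from p in one place
q = [0.2, 0.7, 1.3, -0.4, 1.1]       # hypothetical point of E within 1/2 of p

d_ps = sup_distance(p, s)
d_pq = sup_distance(p, q)
d_qs = sup_distance(q, s)
```

Here q lies within 1/2 of p and is therefore farther than 1/2 from s, so q cannot also have been chosen as the point of E associated with s.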

Exercise 6.13. Show similarly that the space $B$ of bounded functions (page 25) is not separable.

NOTES

$^{1}$ When I was a sophomore, an older student showed me the Cantor set, but admitted that he had never been able to find a limit point that was not an endpoint. He later left mathematics for biology.

$^{2}$ See, for example, Lynn Arthur Steen and J. Arthur Seebach, Jr., Counterexamples in Topology, second edition, Springer-Verlag, New York, 1978, pp. 145-147; J. Cobb and W. Voxman, Dispersion points and fixed points, American Mathematical Monthly 87 (1980), 278-281.


7. Compactness. It is frequently desirable to be able to assert that a set possesses a limit point, even though we may not be able to say just where that limit point is. Suppose, for example, we are trying to prove that a bounded, continuous, real-valued function $f$, defined on a given set $E$ in $\mathbf{R}_1$, has a maximum. That is, we want to show that there is a point $x$ of $E$ such that $f(x)$ is actually equal to the supremum of all numbers $f(y)$ for $y$ in $E$. (You are supposed to have at least a rough idea already of what a continuous function is; the formal definition will be discussed in §13.) We are then trying to show that there is an $x$ in $E$ such that $f(x) \ge f(y)$ for all $y$ in $E$. We supposed that $f$ is bounded, that is, the values $f(y)$ for $y$ in $E$ form a bounded set of real numbers. Such a set has a finite least upper bound $M$, so $f(y) \le M$ for all $y$ in $E$. There must then be points $x_n$ in $E$ such that $f(x_n) > M - 1/n$, since otherwise some smaller number than $M$ would also be an upper bound. Moreover, we can suppose for the same reason that the $x_n$ are all different, if $E$ is not a finite set (and if $E$ is finite there is nothing to prove). If the $x_n$ have a limit point $x$ in $E$, the continuity of $f$ makes $f(x) \ge M$ because $f(x_n) > M - 1/n$. Since $f(x) \le M$ in any case, by the definition of $M$, it follows that $f(x) = M$.

Now under what circumstances can we assert that a set $\{x_n\}$ of infinitely many different points of $E$ has a limit point in $E$? This cannot always be true: for example, in $\mathbf{R}_1$ the set $E_1 = \{\tfrac12, \tfrac14, \tfrac18, \tfrac1{16}, \dots\}$ has the limit point 0, but this limit point is not in $E_1$. The set $E_2 = \{1, 2, 3, 4, \dots\}$ has no limit point at all in $\mathbf{R}_1$. The Bolzano-Weierstrass theorem exists for the purpose of furnishing simple conditions that ensure that a set contains a limit point. It says that an infinite bounded set in $\mathbf{R}_1$ has a limit point in $\mathbf{R}_1$; if the set is also closed, it then contains a limit point. Assuming the truth of this theorem, for the moment, we see that a bounded, continuous, real-valued function attains a maximum on any closed bounded set in $\mathbf{R}_1$. The examples (i) $f(x) = x$, with $E$ the interval $(0,1)$; and (ii) $f(x) = 1 - x^{-1}$, with $E$ the interval $[1, \infty)$; show that both "closed" and "bounded" are essential conditions on the set $E$.

Exercise 7.1. Deduce from the Bolzano-Weierstrass theorem that the boundedness of the function is redundant: it follows from the other hypotheses.

We may prove the Bolzano-Weierstrass theorem by a process that has been suggested as a method for catching a lion in the Sahara Desert.$^{1}$ We surround the desert by a fence, and then bisect the desert by a fence running (say) north and south. The lion is in one half or the other; bisect this half by a fence running east and west. The lion is now in one of two quarters; bisect this by a fence; and so on: the lion ultimately becomes enclosed by an arbitrarily small enclosure. The idea is actually employed in the Heligoland bird trap.$^{2}$

The essential point in applying this idea to our problem is that if a set $E$ has infinitely many points and lies in some finite interval $I$, then at least one half of $I$ must contain infinitely many points of $E$. Let $I_2$ be one of the halves, containing infinitely many points of $E$, and bisect $I_2$. Again, one of the halves of $I_2$ contains infinitely many points of $E$; call such a half $I_3$. Continue this process. We obtain a nested sequence of intervals $I_2, I_3, \dots$, each containing infinitely many points of $E$. The left-hand endpoints of the intervals $I_n$ form a set which is bounded above (since it is in $I$) and so has a least upper bound $x$. Every neighborhood of $x$ contains some $I_n$, since the length of $I_n$ tends to 0, and so contains infinitely many points of $E$. That is, $x$ is a limit point of $E$.

Exercise 7.2. Prove the Bolzano-Weierstrass theorem for $\mathbf{R}_2$.

As an application of the Bolzano-Weierstrass theorem together with the uncountability of the set of real numbers, we prove a theorem about the approximation of a function by the partial sums of its power series.

Let $f(x) = \sum_{k=0}^{\infty} a_k x^k$, with the series converging in $|x| < 1$. Suppose that, for each $x$ in $[0,1)$, $f(x)$ coincides with some partial sum of its power series, that is, for each $x$ there is an $n$ such that $\sum_{k=n+1}^{\infty} a_k x^k = 0$. Then $f$ is a polynomial.$^{3}$

Let $E_n$ be the set of points $x$ in $[0, \tfrac12]$ for which the partial sum $\sum_{k=0}^{n} a_k x^k = f(x)$. Since there are countably many integers and uncountably many points in $[0, \tfrac12]$, some $E_n$ must be uncountable, and so infinite. Then $E_n$ has a limit point in $[0, \tfrac12]$, and $f$ coincides on $E_n$ with a polynomial, the same at every point of $E_n$. But an analytic function cannot coincide with a polynomial on a set having a limit point inside the interval of convergence without being itself a polynomial. (See §24 for more about analytic functions.)

The statement that every infinite bounded set has a limit point makes sense in any metric space, although it may fail to be true. For example, it fails for the space consisting of the rational points of $\mathbf{R}_1$ with the $\mathbf{R}_1$ metric. We can see this by considering the set consisting of the rational approximations $1, 1.4, 1.41, 1.414, 1.4142, \dots$ to $\sqrt{2}$. This set is bounded; it is closed (since $\sqrt{2}$ is not in the space); but it contains no limit point. The Bolzano-Weierstrass property fails here because the space has, so to speak, too few points. It can also fail for a space that has too many points. For example, take the space $B$ whose points are bounded functions on $[0,1]$. We have seen (Exercise 6.13) how to construct infinitely many such functions, each at distance 1 from the others; such a set of points of $B$ cannot have a limit point.

A set $E$ with the property that every infinite subset of $E$ has a limit point in $E$ used to be called compact; we have just seen that bounded closed sets in $\mathbf{R}_1$ or $\mathbf{R}_2$ have this property. However, the term compact is now usually applied to sets with a less intuitive property (formerly called bicompactness). A set is called compact if, whenever it is covered by a collection of open sets, it is also covered by a finite subcollection of these sets. (To say that $E$ is covered by a collection of sets $\{G\}$ means that each point of $E$ is in at least one $G$.)

To see how the property of compactness can be applied, we shall use it to show that a continuous real-valued function $f$ defined on a compact set $E$ in a metric space has a maximum on $E$. First we show that the function is bounded. To each $x$ in $E$ we assign a neighborhood $N$ with center at $x$ such that $f(y) < f(x) + 1$ for all $y$ in $N$. We can do this because $f$, being continuous, does not change much if we change $x$ only a little. These neighborhoods are open sets, and every $x$ is in at least one of them, so since $E$ is compact, a finite number of them cover $E$. Let these be $N_1, N_2, \dots, N_n$. If $x_k$ is the center of $N_k$, then $f(x)$ cannot exceed the largest of the finite class of numbers $f(x_k) + 1$; so $f$ is bounded above. Similarly $f$ is bounded below.

Now we suppose that $f$ does not attain a maximum on $E$ and deduce a contradiction. The values $f(x)$ for $x$ in $E$ form a bounded set, as we have just seen, so they have a supremum $M$, which we are supposing is not attained. To each $x$ we can then assign a neighborhood $N$ such that $f(y) < f(x) + \tfrac12 (M - f(x))$ for all $y$ in $N$ (by the continuity of the function $f$). Again finitely many of these neighborhoods, $N_1, N_2, \dots, N_n$ (not the same neighborhoods as before), cover $E$. Let $M'$ $(< M)$ be the largest of the numbers $f(x_k)$, where $x_k$ is the center of $N_k$. Then for every $y$ in $E$ we find, by taking an $x_k$ that is in the same $N_k$ as $y$, that

$$f(y) < f(x_k) + \tfrac12 (M - f(x_k)) = \tfrac12 f(x_k) + \tfrac12 M \le \tfrac12 (M' + M).$$

Thus the values of $f(y)$, for $y$ in $E$, have the upper bound $\tfrac12 (M' + M)$, which is less than $M$, contradicting the definition of $M$ as the supremum of $f$.

Exercise 7.3. If $E$ is a set in $\mathbf{R}_1$, and $E$ is covered by a finite number of open intervals, then we can reduce the number of intervals to arrange that no point of $E$ is in more than two of them, and the reduced set still covers $E$.

Exercise 7.4. Show that a closed subset of a compact set is compact.

The preceding proof indicates that it is desirable to be able to recognize a compact set when we meet one. In $\mathbf{R}_1$ this is easy: the Heine-Borel theorem states that a set in $\mathbf{R}_1$ is compact if it is closed and bounded. The proof is almost the same as the proof of the Bolzano-Weierstrass theorem. Suppose that the conclusion of the Heine-Borel theorem is false. Then we have a set $E$ which is closed and bounded, and a collection $\{G\}$ of open sets that covers $E$, whereas no finite subcollection of sets $G$ covers $E$. The set $E$ lies in some finite interval $I$; bisect $I$. The part of $E$ in one of the halves of $I$ must fail to be covered by a finite subcollection of the sets $G$; for, if both parts of $E$ could be so covered, so could the whole set $E$. Let this half of $I$ be $I_2$. Now bisect $I_2$ and continue the process as before to define a limiting point $x$. As with the Bolzano-Weierstrass theorem, we see that every neighborhood of $x$ contains an interval $I_n$ that in turn contains a part of $E$ that cannot be covered by a finite subcollection of the sets $G$ (and hence is infinite). On the other hand, $x$ is in $E$ since $E$ is closed, and so $x$ can be covered by one of the sets $G$. Since $G$ is open it contains a neighborhood of $x$, and this neighborhood contains an interval $I_n$ if $n$ is large enough. The part of $E$ in this $I_n$ is covered by a finite number (namely, one) of the sets $G$. We have thus arrived at a contradiction by supposing the Heine-Borel theorem false.

Exercise 7.5. Prove the Heine-Borel theorem in $\mathbf{R}_2$.

Exercise 7.6. Let $S$ be a compact set in $\mathbf{R}_2$ containing at least three noncollinear points. (a) Show that there is a triangle of largest area with vertices in $S$. (b) Does the diameter of $S$ have to equal the length of one of the sides of this triangle?

You should have noticed that the hypotheses on the set in the Heine-Borel and Bolzano-Weierstrass theorems are the same. The similarity of both hypotheses and proofs should suggest a close relationship between the theorems. As a matter of fact, the theorems stand or fall together: for a given metric space, if either of them is true, so is the other. However, we shall omit the proof.$^{4}$

Exercise 7.7. Show directly the noncompactness of the two examples given on page 45 of infinite sets that do not contain limit points.

Exercise 7.8. In $\mathbf{R}_1$, let $E$ be the interval $(0,1]$. With each $x$ associate the open interval $(\tfrac12 x, 2x)$. These intervals cover $E$. Show that no finite number can cover $E$, and explain why this fact does not contradict the Heine-Borel theorem.


Exercise 7.9. The set $[0, \infty)$ in $\mathbf{R}_1$ is covered by the open intervals $(n - 1, n + 1)$, $n = 0, 1, 2, \dots$. No finite number of these intervals can cover the set. Explain why this fact does not contradict the Heine-Borel theorem.

Exercise 7.10. The set $E$ in $\mathbf{R}_1$ consisting of the rational numbers between 0 and 1 is not closed. It is covered by open sets in the following way: let $x$ be covered by an open interval of length $\tfrac1{10}$ centered at $x$. A finite number of these open intervals do cover $E$. Explain why this fact does not contradict the Heine-Borel theorem.

Exercise 7.11. Let the set $E$ in $\mathbf{R}_1$ be the closed interval $[0,1]$. Let each $x \ne 0$ in $E$ be covered by the interval $[\tfrac12 x, 2x]$, and let 0 be covered by $[0, \tfrac1{10}]$. The covering intervals are not open, yet a finite number of them do cover $E$. Explain why this fact does not contradict the Heine-Borel theorem.

It is also worth observing that not only is a set in $\mathbf{R}_1$ (or in any $\mathbf{R}_n$) compact if it is closed and bounded, but if it is compact it is necessarily closed and bounded. To see this, suppose that $E$ is a nonempty compact set in $\mathbf{R}_1$. It cannot be all of $\mathbf{R}_1$, so its complement contains a point $x$. Consider all open finite intervals $G$ such that the closure of $G$ does not contain $x$. Among these intervals $G$ there are certainly neighborhoods of each point of $E$, since if $y \in E$, a neighborhood of $y$ that reaches only halfway to $x$ will not contain $x$ in its closure. Therefore $E$ is covered by the open sets $G$. Since $E$ is compact, a finite collection of sets $G$ covers $E$; let these sets be $G_1, \dots, G_n$. Hence $E$ is, in the first place, bounded, since it is contained in the union of a finite number of finite intervals. Since the closures of $G_1, \dots, G_n$ do not contain $x$, the complements of these closures all do contain $x$, and therefore so does their intersection. The closures of the $G_k$ are closed, their complements are open, so the intersection of the complements is open. Accordingly, $x$ is in an open subset of $C(E)$; since $x$ can be any point of $C(E)$, it follows that every point of $C(E)$ is in an open subset of $C(E)$. Therefore $C(E)$ is open (Exercise 5.18). Hence $E$ is closed (Exercise 5.21).

NOTES

$^{1}$ H. Pétard, A contribution to the mathematical theory of big game hunting, American Mathematical Monthly 45 (1938), 446-447.

$^{2}$ See John Sack, Report from Practically Nowhere, Harper, New York, 1959, p. 23.

$^{3}$ Proof suggested by W. C. Fox and R. R. Goldberg.

$^{4}$ See, for example, M. E. Munroe, Introduction to Measure and Integration, Addison-Wesley, Cambridge, MA, 1953, p. 31; E. T. Copson, Metric Spaces, Cambridge University Press, 1968, p. 79.

8. Convergence and completeness. A great deal of analysis is concerned with infinite series or sequences of functions. The idea of an infinite series of numbers is more intuitive: we write, for instance, $\tfrac12 + \tfrac14 + \tfrac18 + \cdots$, meaning that we add up the terms one by one, thus forming the successive "partial sums"

$$\frac12, \quad \frac12 + \frac14, \quad \frac12 + \frac14 + \frac18, \quad \dots$$

and call the limit of these (if there is one) the sum of the infinite series. (It is assumed that you already have some idea of the meaning of "limit"; see, however, page 54.) In this case the partial sums are $\tfrac12, \tfrac34, \tfrac78, \dots$, and it is clear that the sum of the series must be 1. It is even clearer if we use the formula for the sum of a geometric progression:

$$\frac12 + \frac14 + \cdots + \frac1{2^n} = \frac{\frac12 - \frac1{2^{n+1}}}{1 - \frac12} = 1 - \frac1{2^n}.$$
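The agreement between the partial sums and the closed form $1 - 1/2^n$ can be checked in exact rational arithmetic. The sketch below is ours; the function name `partial_sums` is purely illustrative.

```python
from fractions import Fraction


def partial_sums(n_terms):
    """Exact partial sums of 1/2 + 1/4 + 1/8 + ... (the first n_terms of them)."""
    sums, total = [], Fraction(0)
    for k in range(1, n_terms + 1):
        total += Fraction(1, 2**k)
        sums.append(total)
    return sums


sums = partial_sums(10)

# Each partial sum should equal 1 - 1/2^n, the value given by the formula
# for the sum of a geometric progression.
for n, s in enumerate(sums, start=1):
    assert s == 1 - Fraction(1, 2**n)

print(", ".join(str(s) for s in sums[:3]))
```

Because `Fraction` arithmetic is exact, the comparison is an identity and not a floating-point approximation; the distance from the $n$th partial sum to 1 is exactly $1/2^n$.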


In general, if we write the infinite series

$$a_1 + a_2 + a_3 + \cdots,$$

we expect to calculate first $a_1$, then $a_1 + a_2$, then $a_1 + a_2 + a_3$, and so on, and call the limit (if any) of these partial sums the sum of the series. Note that, for example, $1 - 1 + 1 - 1 + \cdots$ and $(1 - 1) + (1 - 1) + (1 - 1) + \cdots$ must be different infinite series, since the first has successive partial sums $1, 0, 1, 0, \dots$, whereas all the partial sums of the second are 0.

However, we have not actually defined "an infinite series" or even a particular infinite series: we have merely suggested a way of attaching a numerical value to a formula that has no meaning a priori. In order to see how we could actually give a definition, we notice that what was really used in the suggested calculation is the sequence of partial sums. Technically speaking, a sequence is a function from the positive integers to some space: see §12. However, in less formal language, we may think of a sequence of numbers as being a collection of numbers that have been labeled with the positive integers, preserving their order, and are not necessarily all different: thus $1, 0, 1, 0, \dots$ is a sequence (the labeling being implicit); generally a sequence can be written in some such way as $a_1, a_2, a_3, \dots$, or $\{a_n\}_{n=1}^{\infty}$, or simply $\{a_n\}$ if it is clear from the context that we are talking about a sequence and not a set. We must always distinguish a sequence from the set of numbers that appear in it. Any countably infinite set can be arranged as a sequence (in many ways), but a sequence need have only a finite number of different elements in it. (Note that a "finite sequence" such as $\{5, 12, 13\}$ is not a sequence according to our definition.)

We can now define the infinite series $a_1 + a_2 + a_3 + \cdots$ as meaning no more and no less than the sequence whose elements are the partial sums $s_1 = a_1$, $s_2 = a_1 + a_2$, $s_3 = a_1 + a_2 + a_3$, and so on. Conversely, any sequence of numbers defines a corresponding infinite series of which it is the sequence of partial sums. For example, the sequence $1, 0, 1, 0, \dots$ is the sequence of partial sums of the series $1 - 1 + 1 - 1 + \cdots$. Generally, the sequence $s_1, s_2, s_3, \dots$ is the sequence of partial sums of the series $s_1 + (s_2 - s_1) + (s_3 - s_2) + \cdots$. The notion of sequence is more general than the notion of series since we can have a sequence whose elements are sets, or, indeed, points of any space we like; but there is no associated series unless there is an operation of addition for points of the space.

We shall naturally say that an infinite series converges if its associated sequence of partial sums converges; otherwise the series is said to diverge. To make this definition precise, we must therefore define what we are to mean by the convergence of a sequence. If $\{s_n\}$ is a sequence of real numbers, we say that it converges to the limit $L$ if $|s_n - L|$ eventually becomes and remains as small as we please. We then write $s_n \to L$. In more formal language, $s_n \to L$ if, given any positive number $\epsilon$ (with emphasis on its possible smallness), there is an integer $N$ (usually rather large) such that $|s_n - L| < \epsilon$ provided that $n > N$. This definition extends immediately to sequences whose elements are points of any metric space: we have only to replace $|s_n - L|$ by $d(s_n, L)$. Thus the sequence of points $\{(\cos(1/n), \sin(1/n))\}$ of $\mathbf{R}_2$ converges to $(1,0)$; if elements $x_n$ of the space $C$ of continuous functions are defined by $x_n(t) = t^n (1 - t)^n$, $0 \le t \le 1$, then the sequence $\{x_n\}$ converges to the element 0 of $C$ (since $t^n (1 - t)^n \le 4^{-n}$).

Although defining the infinite series $a_1 + a_2 + \cdots$ to mean the sequence $\{s_1, s_2, s_3, \dots\}$ of its partial sums seems natural enough, we are under no compulsion to use this definition, and indeed it is not always the most reasonable definition to use. For some purposes it is better to define $a_1 + a_2 + \cdots$ to mean some other sequence, for example,

$$\frac{s_1}{1}, \quad \frac{s_1 + s_2}{2}, \quad \frac{s_1 + s_2 + s_3}{3}, \quad \dots$$

or

$$\frac{s_1 + s_2}{2}, \quad \frac{s_2 + s_3}{2}, \quad \frac{s_3 + s_4}{2}, \quad \dots$$
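Both averaging rules are easy to try on the series $1 - 1 + 1 - 1 + \cdots$, whose partial sums are $1, 0, 1, 0, \dots$. The following sketch uses exact rationals; the names `running_means` and `pair_means` are our own labels for the two displayed sequences, not standard terminology from the text.

```python
from fractions import Fraction
from itertools import accumulate


def partial_sums(terms):
    """s_1, s_2, s_3, ... for the series with the given terms."""
    return list(accumulate(Fraction(t) for t in terms))


def running_means(s):
    """First rule: s_1/1, (s_1+s_2)/2, (s_1+s_2+s_3)/3, ..."""
    return [total / n for n, total in enumerate(accumulate(s), start=1)]


def pair_means(s):
    """Second rule: (s_1+s_2)/2, (s_2+s_3)/2, (s_3+s_4)/2, ..."""
    return [(a + b) / 2 for a, b in zip(s, s[1:])]


terms = [(-1) ** k for k in range(1000)]   # 1, -1, 1, -1, ...
s = partial_sums(terms)                     # 1, 0, 1, 0, ...

first = running_means(s)
second = pair_means(s)

# The second rule gives 1/2 at every step; the first rule gives values
# that differ from 1/2 by less and less as more terms are averaged.
print(set(second), first[9], first[99], first[999])
```

The partial sums themselves keep oscillating between 1 and 0, so the series diverges in the ordinary sense, while both averaged sequences settle on the same value.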

It can be shown that either of these definitions preserves the sum of any convergent series. In addition, either one makes some divergent series converge; for example, the divergent series $1 - 1 + 1 - 1 + \cdots$ has $s_1 = 1$, $s_2 = 0$, $s_3 = 1$, and so on, so that either of the suggested definitions would give it the sum $\tfrac12$.$^{1}$

We are now going to discuss some properties of sequences of points in a metric space; the idea of infinite series has served to motivate the introduction of sequences, but we have no further use for infinite series just now. (Some theorems are more conveniently formulated in terms of series than in terms of sequences; we shall have an example on page 171.)

If a sequence converges to a limit $L$, its elements eventually become and remain close to each other. Indeed, let $N$ be so large that, for $n > N$, we have $d(s_n, L) < \epsilon/2$; let $m > N$; then $d(s_m, L) < \epsilon/2$ also. By the triangle inequality, $d(s_m, s_n) \le d(s_m, L) + d(s_n, L) < \epsilon$. In other words, $d(s_m, s_n)$ can be made as small as we please by taking $m$ and $n$ simultaneously sufficiently large.

A sequence $\{s_n\}$ with the property that its elements eventually become and remain close to each other, in the sense just described, is called a Cauchy sequence. It may or may not converge to a limit in the space. For example, in the metric space of rational numbers with the $\mathbf{R}_1$ distance, the sequence $\{1, 1.4, 1.41, 1.414, 1.4142, \dots\}$ of decimal approximations to $\sqrt{2}$ is a Cauchy sequence. In fact, if $m > N$ and $n > N$, then $s_m$ and $s_n$ agree at least through the $N$th decimal place, and so $|s_m - s_n| < 10^{-N}$. However, the sequence does not converge to a point of the space, because $\sqrt{2}$ is an irrational number.
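The Cauchy estimate for the decimal approximations to $\sqrt{2}$ can be verified exactly, since every term is a rational number. In this sketch the function name `s` and the tested ranges of $N$, $m$, $n$ are our own choices; the terms are produced with the integer square root so that no floating-point rounding enters.

```python
from fractions import Fraction
from math import isqrt


def s(n):
    """n-th term of 1, 1.4, 1.41, 1.414, ...: sqrt(2) cut off after n-1 decimals."""
    scale = 10 ** (n - 1)
    # isqrt(2 * scale^2) is the integer part of sqrt(2) * scale.
    return Fraction(isqrt(2 * scale * scale), scale)


print([str(s(n)) for n in range(1, 6)])

# If m > N and n > N, the terms agree through the N-th decimal place,
# so they differ by less than 10^(-N).
for N in range(1, 8):
    for m in range(N + 1, N + 6):
        for n in range(N + 1, N + 6):
            assert abs(s(m) - s(n)) < Fraction(1, 10**N)

# No term squares to 2: each is a rational number strictly below sqrt(2).
assert all(s(n) ** 2 < 2 for n in range(1, 30))
```

The last check reflects why the sequence has no limit among the rationals: its terms crowd together, yet the only candidate for the limit is not a point of the space.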

Exercise 8.1. A sequence $\{s_n\}$ of real numbers whose consecutive terms eventually become close together (that is to say, $|s_n - s_{n+1}| \to 0$) need not be a Cauchy sequence. Give an example.

A metric space for which every Cauchy sequence converges to a point of the space is called complete. The metric space of rational numbers is not complete. However, $\mathbf{R}_1$ is complete, as we shall see shortly. It is, in fact, always possible to make a metric space complete by adding new points to it to make a larger space, somewhat as the real numbers can be constructed from the rational numbers.$^{2}$ We shall not discuss this construction here, however.

We shall now show that the completeness of $\mathbf{R}_1$ follows from the least upper bound property that we took as fundamental in §2. Let $\{s_n\}$ be a Cauchy sequence. Then if $\epsilon$ is a given positive number, there is an $N$ such that $|s_n - s_m| < \epsilon$ if $n > N$ and $m > N$. In the first place, the different numbers $s_n$ must form a bounded set. To see this, take $\epsilon = 1$, find the corresponding $N$, and take a convenient $m > N$. Then $s_n = s_m + (s_n - s_m)$, whence $|s_n| \le |s_m| + 1$ if $n > N$. Since the finite set

$$\{s_1, s_2, s_3, \dots, s_N\}$$

is bounded, so is the whole set of numbers $s_n$. Now let $L_k$ be the supremum of the set consisting of all the different $s_n$ for $n \ge k$. Since taking a larger $k$ means that we are considering the supremum of a smaller set of numbers, we have $L_k \ge L_{k+1} \ge \cdots$. Let $L$ be the infimum of the set of all $L_k$. We shall show that $s_n \to L$. Let $\epsilon$ be any positive number, and take the $N$ corresponding to this $\epsilon$, so that if $n > N$ and $m > N$, we have $|s_n - s_m| < \epsilon$. By the definition of infimum, there must be an $L_k$, with $k > N$, between $L$ and $L + \epsilon$. Since $L_k$ is the supremum of the set of $s_n$ with $n \ge k$, there is an $s_m$ (with $m \ge k$) between $L_k - \epsilon$ and $L_k$; hence $L - \epsilon < s_m < L + \epsilon$. Since $s_n$ differs from $s_m$ by at most $\epsilon$, we have $L - 2\epsilon < s_n < L + 2\epsilon$. Since $2\epsilon$ is just as arbitrary as $\epsilon$, we now know that $s_n$ can be made arbitrarily close to $L$ by taking $n$ large enough; that is, $s_n \to L$.

Exercise 8.2. Show, conversely, that if the completeness of $\mathbf{R}_1$ is assumed, then the least upper bound property follows.

Exercise 8.3. If $\{s_n\}$ is a sequence of points of $\mathbf{R}_1$ such that $s_n \le s_{n+1}$, and $s_n \le M$ for all $n$, then $\{s_n\}$ converges. In other words, an increasing bounded sequence has a limit. (This is often taken as the fundamental property of $\mathbf{R}_1$, in place of the least upper bound property.)

Exercise 8.4. Prove that $\mathbf{R}_2$ is complete.

We shall see in §17 that the space $C$ is complete; and, more generally, the space of continuous functions on any given compact set is complete. We noticed in Exercise 4.3 that a subset of a metric space is again a metric space (if the same metric is used).

Exercise 8.5. A subset of a complete metric space is not necessarily a complete metric space. Give an example of this.


However, we can show that a nonempty closed subset $E$ of a complete metric space $S$ is also a complete metric space (again, using the original distance). Let $\{x_k\}$ be a Cauchy sequence of points of $E$. It is also a Cauchy sequence of points of $S$, since the distances in $E$ and $S$ are the same. Since $S$ is complete, $x_k \to x_0$, where $x_0 \in S$. We have only to show that $x_0 \in E$. There are two cases to consider. In the first case, all but a finite number of the $x_k$ coincide. Clearly they must then coincide with $x_0$, which accordingly belongs to $E$. In the second case, there are infinitely many different $x_k$'s. The condition $x_k \to x_0$ then shows that $x_0$ is a limit point (in $S$) of the set consisting of these $x_k$, and hence $x_0$ is a limit point of the set $E$ (considered as a subset of $S$). Since $E$ is closed, it contains its limit points (Exercise 5.22); so $x_0 \in E$.

We need to distinguish carefully between the limit of a sequence and a limit point of the set consisting of the (different) elements of the sequence. For example, the sequence $\{0, 0, 0, \dots\}$ in $\mathbf{R}_1$ has the limit 0, but the set of elements of the sequence has only one point and so has no limit point. On the other hand, the set of elements of the sequence $\{0, 1, \tfrac12, \tfrac12, \tfrac23, \tfrac13, \tfrac34, \tfrac14, \dots\}$ in $\mathbf{R}_1$ has two limit points (0 and 1), but the sequence has no limit. The necessity of this distinction was what forced us to consider two cases in the preceding proof. However, there is a close connection between the limit of a sequence and the limit points of the set consisting of the different elements of the sequence.

Exercise 8.6. Show that if a sequence converges to $L$ and has infinitely many different elements, then the set consisting of all the different elements of the sequence has $L$ as a limit point, indeed as its only limit point.


Exercise 8.7. Hence show that if a sequence converges to a limit and its elements belong to a closed set, the limit of the sequence belongs to the same set.

Exercise 8.8. Hence show that if $E$ is a nonempty compact set in $\mathbf{R}_1$, then $E$ has a largest element.

We may conclude from Exercise 8.6 that if a sequence converges, the set of its elements does its best to have the limit of the sequence as a limit point. In the other direction, we can easily see that if a set $E$ has a limit point $L$, there is a sequence of points of $E$ having $L$ as its limit. In fact, there is a point $x_1$ in $E$ at distance less than 1 from $L$; then there is a point $x_2$ in $E$ at distance less than $\tfrac12$ from $L$; and so on. In applications, the set $E$ often consists of the elements of a sequence; if these elements have a limit point, then there is a subsequence converging to the limit point. We shall refer to this as the subsequence principle; it has many uses.

For example, one of the ways to prove the fundamental theorem of algebra is to show (A) the absolute value of a nonconstant polynomial $P(z)$ has infimum zero in the complex plane, so that there is a bounded sequence $\{z_n\}$ on which $P(z_n) \to 0$; (B) by the subsequence principle, a subsequence of $\{z_n\}$ has a limit $z_0$, and hence $P(z_0) = 0$. Some nineteenth-century proofs stop after step (A), apparently regarding (B) as self-evident.

Exercise 8.9. If $E$ is any bounded sequence in $\mathbf{R}_1$ ("bounded" means that the elements of $E$ form a bounded set), then $E$ contains at least one convergent subsequence. (This is an analogue for sequences of the Bolzano-Weierstrass theorem for sets.)

As an illustration of the use of the subsequence principle, we discuss the diameter of a set $E$. The diameter is defined to be the supremum of the distances between points of $E$; in symbols, $\operatorname{diam} E = \sup d(x,y)$ for $x$ and $y$ in $E$. For example, in $\mathbf{R}_2$ the diameter of a circle of radius 1 is 2; so is the diameter of the (open) area inside the circle, and so is the diameter of the closed area. The diameter of the set that consists of the three points $(0,0)$, $(0,1)$, and $(1,0)$ is $\sqrt{2}$. It may happen that there are no points $x$ and $y$ in $E$ for which $d(x,y) = \operatorname{diam} E$, even when $E$ is bounded and so has a finite diameter. For example, when $E$ is a neighborhood in $\mathbf{R}_2$, there are no two points of $E$ that are $\operatorname{diam} E$ apart.

However, there must be two points of $E$ that are $\operatorname{diam} E$ apart if $E$ is a compact nonempty set in $\mathbf{R}_1$ or $\mathbf{R}_2$. Since $E$ is bounded, $\operatorname{diam} E$ is finite. We must be able, by the definition of diameter, to find pairs of points $x_1, y_1$; $x_2, y_2$; $\dots$ such that $d(x_n, y_n) > \operatorname{diam} E - 1/n$. Infinitely many of the $x_n$ or of the $y_n$ will be different (otherwise we have already found $x$ and $y$ such that $d(x,y) = \operatorname{diam} E$). If there are infinitely many different $x_n$, then they have a limit point (by the Bolzano-Weierstrass theorem), and we can select a subsequence having this limit point as a limit. If there are only a finite number of different $x_n$, one of them, say $x_1$, will occur infinitely often, and then a sequence all of whose elements are equal to $x_1$ will have $x_1$ as a limit. We can proceed similarly with the $y_n$ that correspond to the $x_n$ already selected. The result is that we have sequences, which for simplicity of notation we may call $\{x_n\}$ and $\{y_n\}$ again, such that $x_n \to x_0$, and $y_n \to y_0$, and $d(x_n, y_n) \to \operatorname{diam} E$. Then we must have $d(x_0, y_0) = \operatorname{diam} E$. For, on the one hand, $d(x_0, y_0)$ cannot exceed $\operatorname{diam} E$, since $E$ is closed and so $x_0$ and $y_0$ are in $E$. On the other hand, the triangle inequality shows that

$$d(x_n, y_n) \le d(x_n, x_0) + d(y_n, y_0) + d(x_0, y_0),$$

so that $\operatorname{diam} E \le d(x_0, y_0)$.

Exercise 8.10. Define the distance between two sets $F$ and $G$ to be $\inf d(x, y)$ for $x$ in $F$ and $y$ in $G$. Show that if $F$ and $G$ are closed and nonempty sets in $\mathbf{R}_2$, and $F$ is bounded, then there are points $x$ in $F$ and $y$ in $G$ such that $d(x, y)$ is the distance between $F$ and $G$.

Exercise 8.11. If $N$ is a neighborhood of $y$, consisting of all $x$ such that $d(x, y) < r$, is $\operatorname{diam} N = 2r$? (Consider (a) $\mathbf{R}_1$ or $\mathbf{R}_2$; (b) general metric spaces.)

Exercise 8.12. Show that $E$ and its closure have the same diameter.
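For a finite set the supremum in the definition of diameter is simply a maximum over pairs, so the three-point example from the discussion of the diameter can be checked directly. This is a sketch of ours: the function name `diameter` and the sample points chosen inside the open unit disk are illustrative assumptions.

```python
from itertools import combinations
from math import dist, isclose, sqrt


def diameter(points):
    """Largest distance between two points of a finite set in the plane."""
    return max((dist(p, q) for p, q in combinations(points, 2)), default=0.0)


triangle = [(0, 0), (0, 1), (1, 0)]
print(isclose(diameter(triangle), sqrt(2)))

# The points (-r, 0) and (r, 0) with r = 1 - 1/n lie inside the open unit
# disk. Their distance 2r increases toward 2 but stays below it, which
# suggests why no two points of the open disk are exactly diam E apart.
gaps = [2 - diameter([(-(1 - 1 / n), 0), (1 - 1 / n, 0)]) for n in (10, 100, 1000)]
print(gaps)
```

The second computation only samples the open disk; it illustrates, and does not prove, that the supremum 2 is approached without being attained.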

NOTES

$^{1}$ Any method, other than convergence, for attaching a sum to an infinite series is called a method of summability. See O. Szasz, Introduction to the Theory of Divergent Series, Hafner, New York, 1948; G. H. Hardy, Divergent Series, Clarendon Press, Oxford, 1949; K. Zeller and W. Beekmann, Theorie der Limitierungsverfahren, second edition, Springer, New York, 1970.

$^{2}$ See, for example, E. T. Copson, Metric Spaces, Cambridge University Press, 1968, §35; Irving Kaplansky, Set Theory and Metric Spaces, second edition, Chelsea, New York, 1977, pp. 90-93.

9. Nested sets and Baire's theorem. Suppose that we have two sets $E_1$ and $E_2$, that $E_1 \supset E_2$, and that $E_2$ is not empty. Then there is at least one point that is simultaneously in both sets, since $E_1 \cap E_2 = E_2$. Similarly, if we have a finite number of sets that are nested: $E_1 \supset E_2 \supset E_3 \supset \cdots \supset E_n$, and if the last set $E_n$ is not empty, then there is at least one point that is simultaneously in all the sets. There is nothing corresponding to this when we have infinitely many nested sets, none of which is empty; the intersection of all the sets may quite well be empty nevertheless. Consider the following three examples: (i) $E_n$ is the open interval $(0, 1/n)$ in $\mathbf{R}_1$; (ii) $E_n$ is the closed interval $[n, \infty)$ in $\mathbf{R}_1$; (iii) $E_n$ is the set of those points $x$ in the metric space of rational points in $\mathbf{R}_1$ that are subject to the inequality $|x - \sqrt{2}| < 1/n$. In each of these cases, the intersection of all the sets $E_n$ is empty.

We now state conditions that prevent a nested collection of sets from having an empty intersection.

Cantor's nested set theorem: If $E_1 \supset E_2 \supset E_3 \supset \cdots$; if each $E_n$ is closed and not empty; if the underlying space is complete; and if $\operatorname{diam} E_n \to 0$; then there is exactly one point in the intersection of all the $E_n$.

Cantor's theorem involves three conditions besides the condition that the sets are nested and not empty: closedness and small diameter of the sets, and completeness of the space. In each of our three examples of nested sets with empty intersection, a different one of these conditions fails.

To prove Cantor's theorem, let $x_n$ be any point belonging to $E_n$. The sequence $\{x_n\}$ is a Cauchy sequence since if $m > n$, then $x_m \in E_n$, and so $d(x_m, x_n) \le \operatorname{diam} E_n$, which approaches zero. Since the space is complete, $\{x_n\}$ has a limit in the space. If we select any $E_n$, then the $x_k$ belong to $E_n$ when $k > n$, so the limit belongs to $E_n$ because $E_n$ is closed. That is, the limit is in every $E_n$. Finally, there cannot be two points that are in every $E_n$, since $\operatorname{diam} E_n$ is at least as large as the distance between any two of the points of $E_n$.
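The hypotheses of Cantor's theorem can be watched at work in a small computation. The sketch below is our own illustration, not the author's: it builds nested closed intervals with rational endpoints by bisection, each containing $\sqrt{2}$ and each half as long as the one before. The name `nested_intervals` is hypothetical.

```python
from fractions import Fraction


def nested_intervals(steps):
    """Closed intervals [a, b] with rational endpoints, a*a <= 2 <= b*b,
    each obtained by halving the previous one."""
    a, b = Fraction(1), Fraction(2)
    intervals = [(a, b)]
    for _ in range(steps):
        mid = (a + b) / 2
        if mid * mid <= 2:
            a = mid          # sqrt(2) lies in the right half
        else:
            b = mid          # sqrt(2) lies in the left half
        intervals.append((a, b))
    return intervals


for a, b in nested_intervals(5):
    print(float(a), float(b), float(b - a))
```

Regarded as subsets of $\mathbf{R}_1$, these intervals satisfy every hypothesis of the theorem, and their only common point is $\sqrt{2}$. Regarded as subsets of the space of rational points, they are still closed, nested, nonempty, and of diameter tending to zero, but they have no common point, because that space is not complete; this is the situation of example (iii).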

It is sometimes useful to have the following weaker theorem: if we keep all the hypotheses of Cantor's theorem except that we no longer require that $\operatorname{diam} E_n \to 0$, but require instead that the $E_n$ are compact, we still can say that the intersection of the $E_n$ is not empty (although it may now contain more than one point). Since we have kept the hypothesis that $E_n$ is closed, in any $\mathbf{R}_n$ our new hypothesis amounts to supposing that $E_n$ is also bounded. In $\mathbf{R}_n$, the generalized theorem is an easy application of the subsequence principle: a sequence $\{x_n\}$ consisting of one point from each set has a subsequence that has a limit, and this limit is a point of the required kind.

In the general case we have to proceed differently. Let us cover $E_1$ by neighborhoods of all its points, each of diameter at most 1. Because $E_1$ is compact, a finite number of these, say $N_1, \ldots, N_p$, cover $E_1$. One of the $N_k$ must contain points of all the $E_n$ (from $n = 1$ onward). For if $N_1$ is disjoint from $E_{m_1}$, and $N_2$ is disjoint from $E_{m_2}$, and so on, then, because the $E_n$ are nested, no $N_k$ contains points of $E_n$ for $n = \max(m_1, \ldots, m_p)$, contradicting the fact that the $N$'s cover $E_1$ and hence cover the nonempty set $E_n$. Thus it must be true, as asserted, that some $N$ contains points of all the $E_n$ for $n = 1$, 2, and so on. We now replace each $E_n$ by the closure of $N \cap E_n$ to get closed nested sets of diameter at most 1. Repeat this reasoning with neighborhoods of diameter at most $\tfrac12$ covering $E_2$; then with neighborhoods of diameter at most $\tfrac13$ covering $E_3$; and so on. We obtain nested subsets of the original $E_n$, with diameters approaching zero, and we can then apply Cantor's theorem in its original form.

We can sometimes use Cantor's theorem to show that a set in which we are interested cannot be empty. If we want to know that there are things with a certain property, we can be sure that there are some if we can exhibit them as the intersection of a nested collection of sets satisfying the hypotheses of Cantor's theorem. However, it is often more efficient not to use the nested sets directly, but to use instead another theorem that is a consequence of Cantor's theorem. To state this new theorem we need to introduce a new class of sets: namely, sets that can be represented as unions of countably many sets, each nowhere dense. ("Countably many" includes none, or one, or a finite number, as well as a countable infinity.)

A set that can be represented as the union of countably many nowhere dense sets is called a set of first category. (Since the name is not at all descriptive, the name meager set has been suggested as an alternative; the reason for using this name will appear shortly.) In $\mathbf{R}_1$, any set consisting of a finite number of points is of first category. So is any countable set, for example, the set of all rational numbers, since although this set is everywhere dense, it is the union of countably many sets, each consisting of a single point. The Cantor set, being nowhere dense, is of first category but uncountable. If we form the union of the Cantor set with the set of all rational points, we obtain a set of first category which is both everywhere dense and uncountable.

Sets that are not of first category are said to be of second category. Since the empty set is nowhere dense, it is of first category; hence a set of second category cannot be empty. This fact is the basis for the principal use of the notion of category: if we can show that a set is of second category, then it must contain points. We can sometimes exhibit the aggregate of things of a particular kind as a set of second category; there must then be things of this kind. Some examples will be given in §10.

The technique for applying this idea depends on Baire's theorem, which states that a complete metric space is of second category. Before proving Baire's theorem we make a few remarks. First, the completeness of the metric space is an essential part of the theorem. The metric space whose points are the rational points of $\mathbf{R}_1$, with the $\mathbf{R}_1$ distance, is not complete; each point of the space, regarded as a set, is nowhere dense; the whole space is therefore the union of countably many nowhere dense sets.

We cannot state without reservation that a countable set is of first category, although the preceding example may make this seem plausible. As we noted in Exercise 6.1, a single point need not form a nowhere dense set. In particular, single points are neighborhoods in any space that contains only a finite number of points. The next exercise considers the other extreme.

Exercise 9.1. Show that if all the points of a space are limit points, any set containing only a single point is nowhere dense.

Exercise 9.2. Hence show that Baire's theorem implies that both $\mathbf{R}_1$ and the Cantor set are uncountable.

We now prove Baire's theorem. Let $\{E_n\}$ be a sequence of nowhere dense sets in a complete metric space. We are to prove that there is at least one point of the space that is in none of the $E_n$. The idea of the proof is that since $E_1$ is nowhere dense, its complement contains a neighborhood $N_1$; then $N_1$, in turn, contains a subneighborhood $N_2$ that is in the complement of $E_2$ as well as in the complement of $E_1$; and so on. In this way we obtain a nested sequence of neighborhoods that are disjoint from more and more of the $E_k$, and a common point of all the neighborhoods cannot be in any $E_k$.

To show that there actually is a common point, we must take a certain amount of care so that we can apply Cantor's theorem. First select a neighborhood $N_1$ in $C(E_1)$ whose radius is less than 1. Let $M_1$ be the closure of the concentric subneighborhood of $N_1$ whose radius is half the radius of $N_1$. Then $M_1$ is a subset of $N_1$ (by Exercise 5.36), and so $M_1$ is disjoint from $E_1$. Next, since $E_2$ is nowhere dense, $M_1$ contains a neighborhood $N_2$ that is in $C(E_2)$ (as well as in $C(E_1)$). Let $M_2$ be the closure of the concentric subneighborhood of $N_2$ whose radius is half the radius of $N_2$. Continuing in this way, we obtain nested closed sets $M_k$, whose diameters tend to 0, which are not empty, and which have the property that $M_k$ is disjoint from $E_1, E_2, \ldots, E_k$. The common point of all the $M_k$ is a point of the required kind, since it cannot be in any $E_k$.
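The nested-set construction in this proof can be imitated in the simplest case, where the space is $[0,1]$ and each nowhere dense set $E_k$ is a single point $x_k$. The sketch below is an illustration under our own simplifying assumptions: instead of concentric neighborhoods of half the radius, it keeps a closed third of the current interval that misses the next point. The function name `avoid` is hypothetical.

```python
from fractions import Fraction


def avoid(points, lo=Fraction(0), hi=Fraction(1)):
    """Yield nested closed intervals [lo, hi]; the k-th one contains
    none of the first k points and has length 3**(-k)."""
    for p in points:
        third = (hi - lo) / 3
        if lo <= p <= lo + third:
            lo = hi - third      # p is in the left third: keep the right third
        else:
            hi = lo + third      # the left third misses p: keep it
        yield lo, hi


# A finite stretch of some proposed enumeration of points of [0, 1].
listed = [Fraction(k, 7) for k in range(8)] + [Fraction(1, 2), Fraction(1, 3)]
for lo, hi in avoid(listed):
    print(lo, hi)
```

Each interval is closed, nonempty, and contained in its predecessor, and the lengths tend to zero, so Cantor's theorem supplies a point common to all of them; that point is none of the listed $x_k$. This is the mechanism behind Exercise 9.2.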

10. Some applications of Baire's Theorem. In this section, I assume that you are familiar with elementary notions of derivatives and integrals (antiderivatives) as discussed in a first calculus course.

(i) A PROPERTY OF REPEATED INTEGRALS. Let $f$ be a continuous real-valued function on a real interval, say $[0,1]$. Let $f_1$ be any integral of $f$ (that is, the constant of integration may be chosen arbitrarily), $f_2$ any integral of $f_1$, and so on. If some $f_k$ vanishes identically, so does $f$: we have only to differentiate $f_k$ repeatedly. The following proposition generalizes this simple fact: if for each $x$ there is an integer $k$, possibly differing from one $x$ to another, such that $f_k(x) = 0$, then $f$ vanishes identically.

To prove this theorem, let $E_k$ be the set of points $x$ for which $f_k(x) = 0$; then our hypothesis says that every $x$ in $[0,1]$ is in some $E_k$. By Baire's theorem, not every $E_k$ is nowhere dense. Hence there is some $k$ for which the closure of $E_k$ fills an interval $I_k$. For this particular $k$, since $f_k$ is continuous and vanishes on $E_k$, we must have $f_k(x) = 0$ for every $x$ in $I_k$; and as we observed above, this implies that $f$ vanishes identically on $I_k$. If $I_k$ is not all of $[0,1]$, we repeat this argument with any remaining part of $[0,1]$, and so on. In this way we have $f(x) = 0$ for all points $x$ of an everywhere dense set; and since $f$ is continuous, it then follows that $f(x) = 0$ for every $x$ in $[0,1]$.

Thus if $f(x) \not\equiv 0$, then no matter how the integrals $f_k$ are selected, there must be some $x$ (indeed, a somewhere dense set of $x$) such that $f_k(x) \ne 0$ for every $k$.

(ii) A CHARACTERIZATION OF POLYNOMIALS. Again consider a continuous real-valued function $f$ on $[0,1]$. If $f$ has an $n$th derivative that is identically zero, it is easily proved, for example by Taylor's theorem with remainder (see page 187), that $f$ coincides on $[0,1]$ with a polynomial (of degree at most $n - 1$). The following theorem generalizes this in the spirit of example (i). Let $f$ have derivatives of all orders on $[0,1]$, and suppose that at each point some derivative of $f$ is zero. That is, for each $x$ there is an integer $n(x)$ such that $f^{(n(x))}(x) = 0$. Then $f$ coincides on $[0,1]$ with some polynomial.¹

We can start the proof just as in (i). Let $E_n$ be the set of points $x$ for which $f^{(n)}(x) = 0$. By hypothesis, every $x$ is in at least one $E_n$. By Baire's theorem, there is an interval $I$ in which some $E_n$ is everywhere dense. Since $f^{(n)}$ is a continuous function, $f^{(n)}(x) = 0$ in $I$, and $f$ coincides in $I$ with a polynomial. We may expand $I$ as much as possible, so that there is no larger interval containing $I$ in which $f^{(n)}$ is identically zero. If $I$ is not all of $[0,1]$, repeat the reasoning in any remaining part of $[0,1]$, and so on. In this way, we see that there is a set of intervals whose union is everywhere dense and in each of which $f$ coincides with a polynomial. We still have to show that $f$ coincides with the same polynomial in all the intervals.

To do this, we are going to apply Baire's theorem again to the nowhere dense set $H$ that is left when we remove the interiors of our dense set of intervals from $[0,1]$. (Here "interior" means interior with respect to the metric space $[0,1]$. Thus, the points 0 and 1 could possibly get removed.) We first need to show that $H$ is perfect. In the first place $H$ is closed, since it is obtained by removing a collection of open sets from a metric space. If $H$ is not perfect, then $H$ must have a point $y$ that is not a limit point. This point is the common endpoint of two maximal intervals in each of which $f$ coincides with some polynomial. Then if $n$ equals the larger of the degrees of the two polynomials, $f^{(n+1)}(x) = 0$ for $x$ in both intervals, and at the endpoint by the continuity of $f^{(n+1)}$. Therefore the intervals were not maximal, and the point $y$ does not belong to $H$ after all.

The preceding discussion shows that $H$ is perfect, and we may suppose it to be not empty (otherwise there was only one interval to begin with, and there is nothing to prove). Consider $H$ as a complete metric space. By Baire's theorem for $H$, some $E_n$ is everywhere dense in some neighborhood in $H$, that is, in the part of $H$ that is in some open interval $J$. In other words, there is an open interval $J$ that contains points of $H$, and $f^{(n)}(x) = 0$ for every $x$ in $J \cap H$ (with the same $n$). Since every point of $J \cap H$ is a limit of other points of $J \cap H$, it follows from the definition of the derivative as a limit of difference quotients that all derivatives of $f$ of order greater than $n$ are zero at all points of $J \cap H$.

Now $J$ also contains intervals complementary to $H$, and in each such interval $K$ we have $f^{(m)}(x) = 0$ for some $m$ (depending on $K$). If $m \le n$, we have $f^{(n)}(x) = 0$ in $K$, by differentiating. If $m > n$, we have $f^{(n)}(x) = f^{(n+1)}(x) = \cdots = f^{(m)}(x) = 0$ at the endpoints of $K$, since these are points of $J \cap H$; so by integrating $f^{(m)}$ repeatedly, we get $f^{(n)}(x) = 0$ throughout $K$. This reasoning applies to every interval $K$ that is complementary to $H$ and is in $J$; so $f^{(n)}(x) = 0$ throughout $J$. Thus $J$ contains no points of $H$ after all. But we arrived at $J$ as an interval containing points of $H$ on the assumption that $H$ was not empty. This contradiction means that $H$ must be empty, there was only one interval $I = [0,1]$ to begin with, and $f$ coincides with a single polynomial throughout this interval.


Exercise 10.1. Prove the proposition on page 47 without appealing to properties of analytic functions.

(iii) CONTINUOUS EVERYWHERE OSCILLATING FUNCTIONS. In our next application of Baire's theorem, the underlying metric space will be the space $C$ of continuous functions on a real interval. It will be shown later (page 112) that this space is complete. Let us seek first to construct a continuous function that is not monotonic in any interval. This can actually be done more directly,² but it is a good illustration of the use of Baire's theorem in a fairly uncomplicated situation.

The intervals with two rational endpoints form a countable set. Let them be $I_1, I_2, I_3, \ldots$, in some order, and let $E_n$ be the set of elements of the space $C$ that are monotonic on $I_n$. We are going to show that each set $E_n$ is nowhere dense in $C$; it then will follow from Baire's theorem that there is an element of $C$ that is not in any $E_n$. In other words, there is a continuous function that is not monotonic on any $I_n$, and hence not monotonic on any interval (since every interval in $\mathbf{R}_1$ contains an interval with rational endpoints).

The technique for showing that $E_n$ is nowhere dense is one that is useful in many applications: we show that the complement $C(E_n)$ is open and everywhere dense.

Exercise 10.2. Show that a closed set with an everywhere dense complement is nowhere dense.

We first show that $C(E_n)$ is open. If $f$ is in $C(E_n)$, then $f$ is not monotonic in $I_n$. This means that there are three points $x$, $y$, and $z$ in $I_n$, with $x < y < z$, and $f(x) < f(y)$ and $f(z) < f(y)$ (or else $f(x) > f(y)$ and $f(z) > f(y)$). Recalling that the distance between elements $f$ and $g$ of $C$ is $\max |f(x) - g(x)|$, we see that if $g$ is closer to $f$ than half the smaller of the numbers $f(y) - f(x)$ and $f(y) - f(z)$, we also have $g(x) < g(y)$ and $g(z) < g(y)$, so that $g$ is not monotonic in $I_n$. That is, all elements $g$ sufficiently close to $f$ are not in $E_n$, which is to say that $C(E_n)$ is open.

To say that the complement of $E_n$ is everywhere dense is to say that in every neighborhood in $C$ there exists a function $f$ that is not monotonic in $I_n$. It is rather intuitive that there is a very wiggly function close to any given continuous function. To justify this intuition, we can (for example) let $g$ be the center of the given neighborhood in $C$; we can approximate $g$ in the metric of $C$, as closely as we like, by a polynomial $p$ (§19). Now $p$ has a bounded derivative, so if we add to $p$ a small saw-tooth function with very steep teeth we obtain a function $f$ that is as close as we like to $g$ and is not monotonic in $I_n$.
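The openness argument is a statement about a margin: a perturbation smaller than half the smaller of $f(y) - f(x)$ and $f(y) - f(z)$ cannot destroy the peak at $y$. The sketch below checks this numerically for one sample function; the choice of $f$, the three points, and the family of perturbations are all our own assumptions, made only for illustration.

```python
import math


def peak_margin(f, x, y, z):
    """Half the smaller of f(y) - f(x) and f(y) - f(z)."""
    return min(f(y) - f(x), f(y) - f(z)) / 2


def still_peaks(g, x, y, z):
    """True if g(x) < g(y) and g(z) < g(y), so that g cannot be
    monotonic on any interval containing x, y, and z."""
    return g(x) < g(y) and g(z) < g(y)


f = lambda t: math.sin(math.pi * t)   # sample f: rises, then falls, on [0, 1]
x, y, z = 0.1, 0.5, 0.9
delta = peak_margin(f, x, y, z)


def perturbed(c, k):
    """g = f + c*cos(k*t); the distance from g to f is at most |c|."""
    return lambda t: f(t) + c * math.cos(k * t)


# Every perturbation with |c| < delta keeps the peak at y.
print(all(still_peaks(perturbed(0.99 * delta, k), x, y, z) for k in range(1, 50)))
```

A function far from $f$, such as $g(t) = t$, need not keep the peak, which is why the margin matters.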

(iv) EXISTENCE OF CONTINUOUS NOWHERE DIFFERENTIABLE FUNCTIONS. The fact that a continuous function may fail to have a derivative at any point came as a shocking surprise to the mathematicians of the nineteenth century. However, it turns out that "most" continuous functions have this property, and we should rather be surprised that any continuous functions are differentiable at all.³ Still more surprising is the fact that a function may be everywhere oscillating and still have a finite derivative at every point. Unfortunately all known examples of this last phenomenon are too complicated to give here.⁴

What we are going to show is that the elements of the space $C$ that have, even at one point, a finite derivative, even on one side (see page 140) form a set of first category in $C$.⁵ This theorem shows that all the functions that are ordinarily encountered in calculus form only a set of first category in $C$.⁶ We do not exclude the possibility that "most" elements of $C$ might have infinite one-sided derivatives at most points (geometrically, their graphs might have cusps). We shall see later (page 152) that a continuous function actually cannot have a vertical tangent at all the points of an interval. Although there really are nowhere differentiable functions that do not even have one-sided infinite derivatives, they are much harder to construct;⁷ the greater difficulty may be connected with the fact (which we cannot prove here) that these functions form only a set of first category.⁸ The "typical" continuous function has an everywhere dense set of cusps (like the graph of $y = |x|^{1/2}$ at the origin). A nowhere differentiable function must be everywhere oscillating, since a monotonic function has a derivative at most points (page 165).

We now prove that the nowhere differentiable functions form a set of second category in $C$. Consider the set $E_n$ formed of all those elements $f$ of $C$ such that, for some point $x$ in the interval $[0, 1 - 1/n]$,

$$|f(x + h) - f(x)| \le nh$$

if $0 < h < 1/n$. Clearly a function $f$ that has a finite right-hand derivative at some $x$ belongs to $E_n$ for some $n$. Hence the union of all the $E_n$ contains all the elements of $C$ that have a finite right-hand derivative at some point. We shall show that each $E_n$ is nowhere dense; then the union of the $E_n$ is of first category (and so cannot be all of $C$). As in the preceding example, we do this by showing that $E_n$ is closed and has an everywhere dense complement. That $E_n$ is closed follows as in example (iii), or from the fact that the inequality defining $E_n$ is preserved under convergence in $C$. That the complement of $E_n$ is everywhere dense follows just as in (iii): if $f$ is an arbitrary continuous function, we find a function close to $f$ with a bounded derivative, and then add to this function a small continuous function the slope of whose graph has large absolute value.
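The density step can be made concrete. In the sketch below, which is ours and not the author's, the function with bounded derivative is taken to be $p(t) = t^2$ (so $|p'| \le 2$ on $[0,1]$), and the small steep function is a triangular wave of amplitude `amplitude` and slopes $\pm(n+3)$. Exact rational arithmetic is used so that the inequalities are checked without rounding. All names and the particular constants are assumptions made for illustration, and only finitely many sample points $x$ are tested.

```python
from fractions import Fraction


def sawtooth(t, amplitude, slope):
    """Triangular wave with values in [0, amplitude] and slopes +slope, -slope."""
    period = 2 * amplitude / slope
    u = t % period
    return slope * min(u, period - u)


def leaves_E_n(n, amplitude=Fraction(1, 100), samples=200):
    """At sample points x of [0, 1 - 1/n], look for h in (0, 1/n) with
    |f(x+h) - f(x)| > n*h, where f = p + sawtooth and p(t) = t*t."""
    slope = Fraction(n + 3)
    half = amplitude / slope              # spacing of the teeth's corners
    f = lambda t: t * t + sawtooth(t, amplitude, slope)
    for i in range(samples + 1):
        x = (1 - Fraction(1, n)) * Fraction(i, samples)
        h = (half - x % half) / 2         # x and x + h lie on one straight piece
        if not (0 < h < Fraction(1, n) and abs(f(x + h) - f(x)) > n * h):
            return False
    return True


print(leaves_E_n(5), leaves_E_n(50))
```

On a straight piece of the wave the increment of the saw-tooth is $(n+3)h$ in absolute value, while the increment of $p$ is less than $2h$, so the increment of $f$ exceeds $nh$; at the same time $f$ differs from $p$ by at most `amplitude`, which can be taken as small as we please.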

(v) DECOMPOSITION OF A CLOSED INTERVAL.

Exercise 10.3. Show that a closed interval cannot be written as the union of a countably infinite number of disjoint, closed, nonempty sets.⁹

NOTES

¹ Ernest Corominas and Ferran Sunyer i Balaguer, Condiciones para que una funcion infinitamente derivable sea un polinomio, Revista Matematica Hispano-Americana (4) 14 (1954), 26-43. Several extensions of the theorem are given in this paper. For a generalization to functions of several variables, see A. B. Boghossian and P. D. Johnson, Jr., A pointwise condition for an infinitely differentiable function of several variables to be a polynomial, Journal of Mathematical Analysis and Applications 151 (1990), no. 1, 17-19.

² See Bernard R. Gelbaum and John M. H. Olmsted, Counterexamples in Analysis, Holden-Day, San Francisco, 1964, p. 29.

³ Nowhere differentiable functions arise naturally in the mathematical theory of "Brownian motion." With probability 1, a Brownian path is nowhere differentiable.

⁴ C. E. Weil (On nowhere monotone functions, Proceedings of the American Mathematical Society 56 (1976), 388-389) gave a short proof of the existence of such functions via Baire's theorem, and references to other constructions. See also A. Denjoy, Sur les fonctions dérivées sommables, Bulletin de la Société Mathématique de France 43 (1915), 161-248, especially pp. 228 ff.

⁵ S. Banach, Über die Baire'sche Kategorie gewisser Funktionenmengen, Studia Mathematica 3 (1931), 174-180.

⁶ Our existence proof does not produce any specific example of a continuous nowhere differentiable function. For explicit constructions, see John McCarthy, An everywhere continuous nowhere differentiable function, American Mathematical Monthly 60 (1953), 709; T. H. Hildebrandt, A simple continuous function with a finite derivative at no point, American Mathematical Monthly 40 (1933), 547-548; Bernard R. Gelbaum and John M. H. Olmsted, Counterexamples in Analysis, Holden-Day, San Francisco, 1964, pp. 38-39.

⁷ The simplest example was constructed by A. P. Morse, A continuous function with no unilateral derivatives, Transactions of the American Mathematical Society 44 (1938), 496-507.

⁸ S. Saks, On the functions of Besicovitch in the space of continuous functions, Fundamenta Mathematicae 19 (1932), 211-219. On the other hand, we might consider the continuous, nowhere differentiable functions that lack one-sided derivatives (even infinite ones) at all points except for a set of measure zero: these functions form a set of second category. See S. A. Shkarin, On a class of continuous nowhere differentiable functions, Russian Academy of Sciences. Izvestiya. Mathematics 45 (1995), no. 2, 423-432; translated from Rossiiskaya Akademiya Nauk. Izvestiya. Seriya Matematicheskaya 58 (1994), no. 5, 195-205.

⁹ This is a special case of Sierpinski's theorem that a compact connected set containing more than one point is never a countable union of disjoint closed sets; for some references see the solution to problem E 2613, American Mathematical Monthly 84 (1977), 827-828.

11. Sets of measure zero. A set of first category may be thought of as having relatively few points, principally because it cannot exhaust a complete metric space. It may, on the other hand, seem rather large if looked at from other points of view. It may be everywhere dense, like the set of rational points in $\mathbf{R}_1$. It may be uncountable, as the Cantor set is. It may be both everywhere dense and uncountable, as we observed on page 64.

There is another, quite different, kind of "sparse" set that has many uses. Suppose that a subset $E$ of $\mathbf{R}_1$ has the property that it can be covered by a countable collection of open intervals whose total length is arbitrarily small. Then $E$ is called a set of measure zero. There is a similar definition in any $\mathbf{R}_n$. Sets of measure zero are just the sets that are negligible in the theory of Lebesgue integration (see Chapter 3), and the name comes from that theory. If something happens except on a set of measure zero, it is said to happen almost everywhere, or for almost all points.

The union of two, or of finitely many, or even of a countable infinity of sets of measure zero is again of measure zero. Obviously a countable subset of $\mathbf{R}_1$ is a set of measure zero, so a set of measure zero can be everywhere dense. However, a set of measure zero need not be countable, or even of first category; and, on the other hand, a set of first category may fail to be of measure zero. Let us justify these statements by examples.

First, the Cantor set $E$ is uncountable but of measure zero. For, if $\varepsilon$ is a small positive number, take enough complementary intervals of $E$ so that their total length exceeds $1 - \varepsilon/2$. The rest of the unit interval containing $E$ can be covered by a finite set of intervals of total length not exceeding $\varepsilon$, and so $E$ is certainly of measure zero.¹

To construct a set that is of first category, but not of measure zero, modify the construction of the Cantor set as follows. Let $\{a_n\}$ be a sequence of positive numbers such that $\sum a_n = \varepsilon < 1$. Remove from the center of the unit interval an open interval of length $a_1$; then from the center of each remaining interval remove an open interval of length $\tfrac12 a_2$; and so on. After the $n$th step, the remaining set $E_n$ consists of $2^n$ closed intervals each of length less than $2^{-n}$. The set $E = \bigcap E_n$ is therefore a closed set containing no interval. Thus $E$ is nowhere dense, hence of first category. On the other hand, $E$ cannot be of measure zero, since if it were covered by a countable set of intervals of total length less than $1 - \varepsilon$, we should have a unit interval covered by a set of intervals of total length less than 1.
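The bookkeeping in this construction is easy to confirm by machine. The following sketch is our own; it assumes that at the $k$th step an open interval of length $a_k / 2^{k-1}$ is removed from the center of each of the $2^{k-1}$ remaining pieces, which is how we read "and so on," so that the total length removed at that step is $a_k$. The name `fat_cantor_stage` is hypothetical.

```python
from fractions import Fraction


def fat_cantor_stage(a):
    """Closed intervals remaining after len(a) steps, where step k removes a
    centered open interval of length a[k-1] / 2**(k-1) from each piece."""
    pieces = [(Fraction(0), Fraction(1))]
    for k, a_k in enumerate(a):           # k = 0 corresponds to the first step
        gap = Fraction(a_k) / 2 ** k
        remaining = []
        for lo, hi in pieces:
            mid = (lo + hi) / 2
            remaining.append((lo, mid - gap / 2))
            remaining.append((mid + gap / 2, hi))
        pieces = remaining
    return pieces


# a_k = 1/4, 1/8, 1/16, ..., so that the full series sums to 1/2.
a = [Fraction(1, 2 ** (k + 2)) for k in range(8)]
pieces = fat_cantor_stage(a)
print(len(pieces), sum(hi - lo for lo, hi in pieces))
```

After $n$ steps there are $2^n$ pieces whose total length is $1 - (a_1 + \cdots + a_n)$, which stays above $1 - \varepsilon$, while each single piece is shorter than $2^{-n}$.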

To construct a set that is both of second category and of measure zero, construct a generalized Cantor set $E$ of the kind just described with $\varepsilon = \tfrac12$. Then construct similar sets with $\varepsilon = \tfrac13$, $\varepsilon = \tfrac14$, and so on. The complement of the union of all the generalized Cantor sets has measure zero; but since each Cantor set is nowhere dense, this complement is of second category.²

Since an interval in $\mathbf{R}_1$ is not of measure zero, another way to show that a set of points is not empty is to exhibit it as the complement of a set of measure zero. For instance, every countable subset of $\mathbf{R}_1$ is of measure zero, and so $\mathbf{R}_1$ cannot be countable. A set of measure zero, like a set of first category, is "small" in the sense that its complement cannot be empty.³

Exercise 11.1. If $E$ is a subset of $\mathbf{R}_1$ that covers at most a fixed fraction of every interval, then $E$ covers almost none of every interval. In other words, if there is a positive $q$ less than 1 such that, for every interval $(a, b)$, the set $E \cap (a, b)$ can be covered by countably many intervals whose total length is at most $q(b - a)$, then $E$ is a set of measure zero.⁴

NOTES

¹ Another example of an uncountable set of measure zero is the set of zeroes of a typical (in the sense of Baire category) continuous function that has zeroes: see Tomas Dominguez Benavides, How many zeroes does a continuous function have?, American Mathematical Monthly 93 (1986), 464-466.

² Another interesting example of a measure-zero, second-category set is the set of so-called Liouville numbers (roughly speaking, transcendental numbers that can be rapidly approximated by rational numbers). See W. M. Priestley, Sets thick and thin, American Mathematical Monthly 83 (1976), 648-650.

³ For some other notions of size of sets and examples of sets that are small in one sense and large in another sense, see Harvey Diamond and Gregory Gelles, Relations among some classes of subsets of R, American Mathematical Monthly 91 (1984), 19-22. There is a sort of "duality" between the notions of first category and measure zero that is discussed in John C. Oxtoby, Measure and Category, second edition, Springer-Verlag, New York, 1980.

⁴ However, it is possible for a set and its complement both to cover a positive fraction of every interval if this fraction is not fixed. See Walter Rudin, Well-distributed measurable sets, American Mathematical Monthly 90 (1983), 41-42; F. S. Cater, A partition of the unit interval, American Mathematical Monthly 91 (1984), 564-566.

Chapter 2

Functions

12. Functions. In elementary mathematics, it is customary to say that $y$ is a function of $x$ if, when $x$ is given, $y$ is determined (uniquely; we are not concerned with "multiple-valued functions"). This is a good working definition and one that suffices for most practical purposes. However, we should realize that it does not define "function," although it does give a definite meaning to some phrases containing this word. (In a somewhat similar way, we are accustomed to attaching a definite meaning to the phrase "$y \to \infty$" even though $\infty$ by itself has no meaning.)

However, it is interesting, and sometimes helpful, actually to define a function as a genuine mathematical entity. Consider two sets $E$ and $F$ of real numbers, neither of which is empty, and form a class of ordered pairs $(x, y)$ with $x \in E$ and $y \in F$, where each $x$ occurs exactly once and each $y$ occurs at least once. Such a class of ordered pairs is called a function with domain $E$ and range $F$, or a function from $E$ to $F$; or, on occasions when it is unnecessary to say precisely what $F$ is, a function with domain $E$ and values in $\mathbf{R}_1$, or a function from $E$ into $\mathbf{R}_1$, or a real-valued function with domain $E$, etc.

For example, let $E$ be all of $\mathbf{R}_1$, let $F$ be the closed interval $[-1, 1]$, and let the ordered pairs be $(x, \sin x)$ for each $x$ in $\mathbf{R}_1$. This is what is usually spoken of as "the function $\sin x$." Notice that if we take $E$ to be, for instance, the interval $[0, 2\pi]$ instead of all of $\mathbf{R}_1$, the set of ordered pairs $(x, \sin x)$ for $x \in E$ is a different function, the restriction of the original function to $[0, 2\pi]$.

For another example, let $E$ consist of the positive integers 1, 2, 3, ..., and let $F$ be the set $\{1, 4, 9, 16, \ldots\}$. The function whose ordered pairs are $(1,1)$, $(2,4)$, $(3,9)$, ... is an example of the special kind of function that we usually call a sequence, specifically a sequence of real numbers.

Exercise 12.1. An equation in $x$ and $y$ determines a set of ordered pairs $(x, y)$ whose components satisfy the equation. Accordingly, it may (or may not) determine a function. Which of the following equations do determine functions?

(a) $x^2 + y^2 = 25$, (b) $x^2 + y^2 = 0$,
(c) $y = |x|$, (d) $|x| + |y| = -2$,
(e) $y = \cos x$, (f) $x = \cos y$.
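For finite sets the ordered-pair definition can be tested mechanically. The sketch below is only an illustration; the helper `is_function` and the sample sets are our own, and the relation used as a non-example is deliberately not one of those in Exercise 12.1.

```python
def is_function(pairs, domain, range_):
    """True if every x in `domain` occurs exactly once as a first component
    and every y in `range_` occurs at least once as a second component."""
    pairs = set(pairs)
    firsts = [x for x, _ in pairs]
    seconds = {y for _, y in pairs}
    each_x_once = len(firsts) == len(set(firsts)) and set(firsts) == set(domain)
    each_y_hit = seconds == set(range_)
    return each_x_once and each_y_hit


# The beginning of the sequence of squares from the text.
squares = {(n, n * n) for n in range(1, 5)}
print(is_function(squares, {1, 2, 3, 4}, {1, 4, 9, 16}))

# The pairs (y*y, y): the first component 4 occurs with both 2 and -2.
flipped = {(y * y, y) for y in range(-3, 4)}
print(is_function(flipped, {0, 1, 4, 9}, set(range(-3, 4))))
```

The second relation fails because some first components occur more than once, which is the only way a set of pairs with the right components can fail to be a function in this sense.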

If F consists of a single point, say the point 3, t h e function whose ordered pairs are (x, 3) is a constant function; ideally, it should b e distinguished notationally from the number 3. T h e definition of function is easily extended t o a more general setting. Let Ε be a nonempty set of points in some space S (which might be R „ , or C , or anything else; it need not be a metric space). Let F be another n o n e m p t y set of points belonging t o some space T , in general quite different from S. T h e class of all ordered pairs (x, y) with χ in S


and $y$ in $T$ is called the Cartesian product of $S$ and $T$. A function from $E$ to $F$ is a subset of this Cartesian product consisting of pairs $(x, y)$, where each $x$ in $E$ occurs exactly once and each $y$ in $F$ occurs at least once. It is also called a function from $E$ into $T$. We speak of the points of $F$ as the values of the function, and of $F$ as the image of $E$. It is often convenient to call the function a mapping of $E$ onto $F$. The Cartesian product of two $R_1$'s is the space $R_2$ (if we introduce the right metric in the product). A function whose domain and range are in $R_1$ is just the set of points of $R_2$ that are in what we ordinarily call the graph of the function. "A real function of a real variable" ordinarily means a function with domain and range in $R_1$. For the purposes of this book it is unnecessary to attempt to define a "variable." The advantage of the abstract definition of a function as a class of ordered pairs is that it gives us a mathematical object defined in terms of concepts we have already had. Every proof about functions can be phrased in terms of it. Its disadvantage is that it loses most of the intuitive content of the notion of function. For many purposes, and especially for a student's first introduction to the concept, it is better to think of a function as a mapping or transformation or operator (emphasizing the process by which values are obtained from points of the domain); as a rule (emphasizing the correspondence between the values and the points of the domain); or in physicists' language as a field (emphasizing the domain, and the association of a value with each point of the domain; usually in physics the domain is in $R_3$ and the range is in $R_1$ (scalar field) or in $R_3$ (vector field)). We should, perhaps, think of a class of ordered pairs as being a model of a function rather than the function itself.
On the other hand, if "function" is taken as a primitive notion, then "ordered pair" can be defined, or modeled, in terms of it (as a function on the set {1, 2}).¹
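The ordered-pair definition can be tried out on finite sets, where both conditions can be checked by enumeration. The following is only a minimal sketch under that finiteness assumption; the name `is_function` and the sample sets are illustrative and are not taken from the text.

```python
def is_function(pairs, E, F):
    """Check the ordered-pair definition for finite sets E and F.

    `pairs` is a function from E to F when every pair (x, y) has x in E
    and y in F, each x in E occurs exactly once as a first component,
    and each y in F occurs at least once as a second component.
    """
    pairs, E, F = set(pairs), set(E), set(F)
    firsts = [x for x, _ in pairs]
    seconds = {y for _, y in pairs}
    in_product = all(x in E and y in F for x, y in pairs)
    each_x_once = len(firsts) == len(set(firsts)) and set(firsts) == E
    each_y_hit = seconds == F
    return in_product and each_x_once and each_y_hit


# The pairs (1,1), (2,4), (3,9), (4,16): a function from E to F.
E = {1, 2, 3, 4}
F = {1, 4, 9, 16}
squares = {(n, n * n) for n in E}

# Integer solutions of x^2 + y^2 = 25 with x in {3, 4}: x = 3 is paired
# with both 4 and -4, so these pairs do not form a function.
circle = {(x, y) for x in (3, 4) for y in range(-5, 6) if x * x + y * y == 25}
```

Here `is_function(squares, E, F)` holds, while the pairs in `circle` fail the "each $x$ exactly once" condition. Enlarging `F` by a point that is never a value also makes the check fail, which reflects the requirement that each $y$ in $F$ occur at least once.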


A sequence is a function whose domain consists of all the positive integers. If we are talking about sequences, the domain is understood to be fixed, and so we can specify the sequence by listing the points of the range in the order imposed by the points of the domain. We then often speak of the points of the range (repeated as often as necessary) as the elements of the sequence, rather than as the values of the function. This brings us back to the informal definition of a sequence that we have already used (page 53). Thus "the sequence $\{2, 4, 8, 16, \dots\}$" or "the sequence $\{2^n\}$" means the set of ordered pairs (1,2), (2,4), (3,8), .... "The sequence $\{1\}$" means the set of ordered pairs (1,1), (2,1), (3,1), .... A subsequence of a given sequence is a restriction of the sequence to a subset of the integers; it can be identified with the sequence obtained by renumbering its elements. For instance, $\{2^n\}$ is a subsequence of $\{n\}$. (As noted on page 53, a "finite sequence" is not a sequence in our sense of the word.)
In careful work, it is usually helpful to distinguish notationally between (say) $f$, the name of a function, and $f(x)$, "the value of the function $f$ at the point $x$." In other words, $f$ is to denote a set of ordered pairs, while $f(x)$ stands for the point of the range that is paired with the point $x$ of the domain. For example, the logarithmic function consists of the pairs $(x, \log x)$; one of these pairs is $(e, 1)$, and $\log(e) = 1$. The distinction between a function and its values is not made consistently in most books.⁵ It looks a little odd, although it is both correct and unambiguous, to speak of "the function log" (of course, we must specify the domain of the function; here it would presumably be $(0, \infty)$). This would usually be called "the function $\log x$." We often want to talk about more complicated functions, such as the function whose value at $x$ is $\log(\sin x)$. Here we are forced into a certain amount of awkwardness if we are to avoid the possibly misleading "the function $\log(\sin x)$." The same problem arises when we want to talk about functions that are so simple that they have no generally accepted names. The identity function, whose ordered pairs are $(x, x)$ (for $x$ in a specified domain) should not be called "the function $x$," but may correctly be described as the function $I$ such that $I(x) = x$. The gain in clarity is well worth the loss of brevity. If the domain of a function $f$ is in $R_1$, its values are usually written $f(x)$. If the domain is in $R_2$, the values are usually written $f(x, y)$, although $f((x, y))$ would be more in accordance with our general principles. The elements of a sequence are conventionally written as $s_n$, rather than as $s(n)$, and we ought to refer to the sequence as $s$. However, as we have already remarked, it is usually more convenient to call the sequence $\{s_n\}$, that is, to specify what its elements are (since its domain is understood to be the set of positive integers). Thus the sequence whose ordered pairs are (1,1), (2,4), (3,9), ... is usually written $\{n^2\}$, and restrictions of it are clearly indicated by subscripts, as in $\{n^2\}_{n\ \mathrm{odd}}$, and the like. Similarly we could describe the function $f$ such that $f(x) = \sin 2x$ (with domain $R_1$) as $\{\sin 2x\}$, reserving the notation $\sin 2x$ for the particular point of $R_1$ (as range) that is paired with the point $x$ of the domain. The restriction of this function to $(0, 2\pi)$ would be $\{\sin 2x\}_{0 < x < 2\pi}$.

⁵ This one is no exception!

[...]

[...] if $f(x) = 1$ for $x > 0$, $f(0) = 0$, and $f(x) = -1$ for $x < 0$, then the inverse image of the open interval $(-\tfrac12, \tfrac12)$ contains the single point 0, and so is not open.

Exercise 13.5. Give an example to show that the image of an open set under a continuous function is not necessarily open.

To verify the equivalence of the two definitions of continuity in a metric space, suppose first that $f$ is continuous

Sec. 13

CONTINUOUS FUNCTIONS


under the original definition. Let $E$ be an open set in the range space, and let $x_0$ be a point in the inverse image of $E$. Then $f(x_0) \in E$, and if $\epsilon$ is small enough, every $y$ such that $d(f(x_0), y) < \epsilon$ belongs to $E$ (since $E$ is open). Since $f$ is continuous, there is a positive $\delta$ such that $d(x, x_0) < \delta$ implies $d(f(x), f(x_0)) < \epsilon$. Hence all points $x$ sufficiently near $x_0$ have their images in $E$; that is to say, the inverse image of $E$ contains a neighborhood of each of its points, so it is open. Conversely, suppose that the inverse image of every open set is open. In particular, the inverse image of any neighborhood in the range space, defined by an inequality $d(y, f(x_0)) < \epsilon$, is open and so contains a neighborhood defined by an inequality $d(x, x_0) < \delta$; this is an appropriate neighborhood to associate with $\epsilon$ in the original definition. We have proved a little more than we set out to do, namely that $f$ is continuous at $x_0$ if and only if the inverse image of every neighborhood of $f(x_0)$ contains a neighborhood of $x_0$.

Exercise 13.6. Show that if the range of $f$ is in $R_1$, and $f$ is continuous at $x_0$, and $f(x_0) \ne 0$, then there is a neighborhood of $x_0$ in which $|f(x)|$ has a positive lower bound, that is, $|f(x)| > m > 0$.

Exercise 13.7. Show that if $f$ has an interval in $R_1$ as its domain, has its range in $R_1$, and is continuous at $x_0$, then $f$ is bounded in some neighborhood of $x_0$.

Exercise 13.8. If a function $f$ with range in $R_1$ is discontinuous at $x_0$, then there are a positive $\epsilon$ and a sequence $\{x_n\}$ with limit $x_0$ such that $|f(x_n) - f(x_0)| > \epsilon$.

NOTES

¹ The distinction between continuity and the intermediate value property was first clarified by G. Darboux, Mémoire sur les fonctions discontinues, Annali della Scuola Normale Superiore di Pisa, Classe di Scienze 4 (1875), 57-112. Therefore functions possessing the intermediate value property are sometimes called Darboux functions.

² H. Lebesgue, Leçons sur l'intégration et la recherche des fonctions primitives, 2nd ed., Gauthier-Villars, Paris, 1928, p. 97.

³ H. Fast, Une remarque sur la propriété de Weierstrass, Colloquium Mathematicum 7 (1959), 75-77.

⁴ V. Pambuccian, Another example of an exotic function, American Mathematical Monthly 96 (1989), 913-914. For more information about functions with the intermediate value property, see Chapter 1 of Differentiation of Real Functions by Andrew Bruckner, Providence, American Mathematical Society, 1994, and its references.

⁵ H. Blumberg, New properties of all real functions, Transactions of the American Mathematical Society 24 (1922), 113-128.

⁶ W. Sierpinski and A. Zygmund, Sur une fonction qui est discontinue sur tout ensemble de puissance du continu, Fundamenta Mathematicae 4 (1923), 316-318.

14. Properties of continuous functions. It is a commonplace that sums, products, and quotients of continuous functions are continuous (provided that the divisor in the quotient is not zero). More precisely, we should suppose that the two functions $f$ and $g$ have the same domain, and that their values are in $R_1$. Then we can define $f + g$, $fg$, and $f/g$ in the usual (and natural) way, provided in the last case that $g$ is not zero anywhere on the common domain of the functions. Then if $f$ and $g$ are both continuous at $x_0$, so are $f + g$, $fg$, and $f/g$ if it is defined. However, it is usually not necessary to be so meticulous. When $f$ and $g$ have different domains, we are likely to write $f + g$ for the sum of their restrictions to the intersection of the domains of $f$ and $g$, and similarly $f/g$ for the quotient of the restrictions of $f$ and $g$ to the part of the intersection of their domains where $g$ does not take the value 0. Note in this connection that the function $f_1$ defined by $f_1(x) = x/x$ and the function $f_2$ defined by $f_2(x) = 1$ are different functions


since their domains are different; we cannot say that $f_1$ is continuous at 0. In discussing the continuity of a product, it is convenient to use Exercise 13.7.

Exercise 14.1. Show that if $f$ is continuous at $x_0$ and $g$ is not, then $f + g$ is not. Can $f + g$ be continuous at $x_0$ if neither $f$ nor $g$ is continuous at $x_0$?

Exercise 14.2. Carry out the detailed proofs for the continuity of $f + g$, $fg$, and $f/g$ when $f$ and $g$ are continuous.

A function $f$ is said to be univalent, or one-to-one, if in the set of ordered pairs that constitute the function, not only does no $x$ in the domain occur twice, but also no $y$ in the range occurs twice. In this case the ordered pairs $(y, x)$ with $y$ in the range and $x$ in the domain also constitute a function, the inverse of $f$, often denoted by $f^{-1}$ (not to be confused with $1/f$, which is the reciprocal of $f$); its domain is the range of $f$ and its range is the domain of $f$. It is often useful to know that under certain circumstances the inverse of a continuous univalent function is continuous. This statement is true if the domain of the function is a compact set in some $R_n$, or, more generally, whenever the domain has the property that every infinite subset of it has a limit point (the conclusion of the Bolzano-Weierstrass theorem).

To verify the preceding assertion, we have to show that the images of open sets are open, since these will be the inverse images of open sets under $f^{-1}$. It is an equivalent statement that the images of closed sets are closed, a statement which is somewhat more convenient for us to use.

Exercise 14.3. Establish the equivalence asserted in the preceding sentence.


Let, then, $E$ be a closed set in the domain of $f$, let $F$ be the image of $E$, and let $y_0$ be a limit point of $F$; we have to show that $y_0 \in F$. Let $\{y_n\}$ be a sequence of distinct points of $F$ with $y_n \to y_0$, and let $y_n = f(x_n)$. There is, by univalence, just one $x_n$ for each $y_n$, and the $x_n$ are all different since the $y_n$ are all different. Then the set whose points are the $x_n$ has a limit point $x_0$, and a subsequence $\{x_{n_k}\}$ has $x_0$ as its limit. (See page 59.) We have $x_0 \in E$ since $E$ is closed. Since $f$ is continuous, $y_{n_k} = f(x_{n_k}) \to f(x_0)$; but $y_{n_k} \to y_0$, so $y_0 = f(x_0) \in F$.

Although the intermediate value property that was discussed in §13 does not characterize continuous functions, it is a property that (under some conditions) continuous functions do have. It is possessed by a continuous function with values in $R_1$ if its domain is a connected set in a metric space $S$. To see this, let $f(a) = A$ and $f(b) = B$, and suppose $A < C < B$. Consider the sets $E_1$ and $E_2$ in $S$ consisting, respectively, of the points $x$ of $S$ for which $f(x) < C$ and of the points $x$ of $S$ for which $f(x) > C$. These sets are disjoint, since $f(x)$ cannot (for the same $x$) be simultaneously less than $C$ and greater than $C$. They are not empty, since $a \in E_1$ and $b \in E_2$. They are open, since they are inverse images of open sets. Since the domain of $f$ was assumed connected, it cannot be the union of two disjoint, nonempty, open sets. Hence the domain of $f$ must contain at least one point $c$ that is in neither $E_1$ nor $E_2$. The only possible value for $f(c)$ is $C$.
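The argument above is an existence proof for a connected domain in a general metric space. In the special case where the domain is an interval $[a, b]$ in $R_1$, a point $c$ with $f(c) = C$ can also be approximated numerically by repeated halving. The following is a sketch of that special case only; the name `intermediate_point`, the tolerance, and the sample function are assumptions made for illustration.

```python
def intermediate_point(f, a, b, C, tol=1e-12):
    """Approximate c in [a, b] with f(c) = C, assuming f is continuous
    on [a, b] and f(a) < C < f(b)."""
    if not (a < b and f(a) < C < f(b)):
        raise ValueError("need a < b and f(a) < C < f(b)")
    lo, hi = a, b  # invariant: f(lo) < C <= f(hi)
    while hi - lo > tol:
        mid = (lo + hi) / 2
        if f(mid) < C:
            lo = mid   # mid lies in the set where f < C
        else:
            hi = mid   # mid lies in the set where f >= C
    return (lo + hi) / 2


# Sample use: f(0) = 0 < 5 < 10 = f(2), so the value 5 is taken in (0, 2).
f = lambda x: x ** 3 + x
c = intermediate_point(f, 0.0, 2.0, 5.0)
```

The two ends of the shrinking interval play the roles of points of $E_1$ and of the complement of $E_1$; the limit point they squeeze toward is a point where $f$ takes the value $C$.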

A continuous function whose domain is compact and whose range is in $R_1$ has a largest and a smallest value (page 48).

Exercise 14.4. A more elegant proof of the preceding statement can be given by supposing that the least upper bound $M$ of the values of $f$ is not a value of $f$, and then considering the


function $1/(M - f)$.

Exercise 14.5. Deduce that if the domain is both compact and connected, then the range of a continuous function with values in $R_1$ is a closed bounded interval or a point.

Compactness is of course not necessary for the existence of a maximum.

Exercise 14.6. Let $f$ be a nonnegative continuous function with domain $[a, \infty)$ and range in $R_1$, and suppose that $f(x) \to 0$ as $x \to \infty$. Then $f$ has a maximum on $[a, \infty)$.

Exercise 14.7. If $f$ is as in the preceding exercise, but strictly positive, then there is a sequence $\{x_n\}$ such that $x_n \to \infty$, and $f(x) < f(x_n)$ for all $x > x_n$. That is, as $x \to \infty$ the function $f$ never again takes as large a value as it had at $x_n$.

We now give some interesting applications of the intermediate value property. Consider a function $f$, from an interval in $R_1$ into $R_1$, that has the intermediate value property on every interval in its domain and has a discontinuity at the point $c$. Then (Exercise 13.8) there is a sequence $\{x_n\}$ with limit $c$ such that for some positive $\epsilon$, either $f(x_n) > f(c) + \epsilon$ for all $n$ or $f(x_n) < f(c) - \epsilon$ for all $n$; say the former. Since $f$ has the intermediate value property on every interval, it takes every value between $f(c)$ and $f(c) + \epsilon$. Furthermore it does this infinitely often, since we can always consider a smaller neighborhood of $c$. Hence if a function from an interval into $R_1$ has the intermediate value property on every subinterval and is discontinuous, it must take some values infinitely often. (Thus the discontinuous function with the intermediate value property, constructed on page 84, illustrates the typical situation better than one might have


expected.) As a corollary, we have that if a function has the intermediate value property on every interval and takes no value more than once, then it is continuous. In particular, a function must be continuous if it takes each value between $f(a)$ and $f(b)$ exactly once, in every interval $[a, b]$ in its domain.¹ A strictly monotonic function (page 158) is an example of a function that takes no value more than once, but not every function with this property is strictly monotonic. (Consider $f(x) = x + 1$ for $-1 < x < 0$, and $f(x) = x - 1$ for $0 < x < 1$.) However, a continuous function that takes no value more than once is indeed strictly monotonic. For, if it were not strictly monotonic, then there would have to be points $x_1 < x_2 < x_3$ with $f(x_1) < f(x_2)$ and $f(x_3) < f(x_2)$ (or the corresponding inequalities reversed); and so $f$ would have (in the first case) a maximum at some point $c$ between $x_1$ and $x_3$ (because $f$ is continuous), and the maximum would be proper (since $f$ takes no value more than once). Then $f(x) < f(c)$ for values of $x$ on both sides of $c$, and so by the intermediate value property $f$ would take some value near $f(c)$ twice, contradicting the hypothesis. (The same reasoning shows that a continuous function is monotonic between its consecutive local maxima and minima: that is, if $f$ has a maximum at $x_1$ and a minimum at $x_2 > x_1$, and neither a maximum nor a minimum in $(x_1, x_2)$, then $f$ is monotonic in $(x_1, x_2)$.)

We have just seen that the continuous functions (from an interval in $R_1$ into $R_1$) that take each value exactly once are strictly monotonic. What sort of continuous functions take each value exactly twice? The answer is that there are no such functions.² In fact, it is just as easy to show that no continuous function on a closed bounded interval can take each of its


values exactly $n$ times, where $n > 1$. For suppose that there is such a function $f$. Then $f$ has an absolute maximum and an absolute minimum, and each must be attained at interior points except in the case $n = 2$, when the minimum (say) might be attained at both endpoints. Hence we may suppose that the maximum is attained at an interior point $c_1$. Let $c_2, c_3, \dots, c_n$ be the other points where $f$ takes its maximum value. Then $f(x) < f(c_1)$ in a deleted neighborhood of each $c_k$ (a one-sided neighborhood if $c_k$ is an endpoint). For a sufficiently small positive $\epsilon$, the line $y = f(c_1) - \epsilon$ intersects the graph of $f$ twice near $c_1$ and at least once near each other $c_k$, by the intermediate value theorem, so there is a value that $f$ takes at least $n + 1$ times, a contradiction.

Since $f$ cannot be continuous and exactly two-to-one, let us suppose now that $f$ takes each value at most twice. Then we may conclude that the graph of $f$ falls into at most three monotonic pieces (like the graph on page 99, with a small piece removed at one end).³ It adds interest to this theorem to observe that nothing similar holds for continuous functions that take each value at most three times: the graph of such a function does not have to be composed of a finite number of monotonic pieces, as is indicated by the sketch, where the $n$th square from the right has side $a_n$, and $\sum a_n < \infty$.⁴

Let us suppose, then, that $f$ is continuous on $[a, b]$ and at most two-to-one. By the principle mentioned above, $f$ is monotonic between any two consecutive local maxima and minima. Thus if maxima and minima of $f$ are attained only at the endpoints, then $f$ is itself monotonic. If it is not monotonic, then it has at least one maximum or minimum inside the interval, say a maximum at $c$. If $f$ has no other interior maxima or minima, it must be monotonic on $(a, c)$ and on $(c, b)$. The next case is the one where $f$ has


precisely one interior maximum at $c_1$ and one interior minimum at $c_2$, say with $a < c_1 < c_2 < b$. Then $f$ is monotonic on each of the three intervals $(a, c_1)$, $(c_1, c_2)$, $(c_2, b)$. Finally suppose that $f$ has more than a total of two interior maxima and minima. Let a maximum, a minimum, and a maximum (not necessarily consecutive) occur at $c_1$, $c_2$, $c_3$, where $c_1 < c_2 < c_3$, and $f(c_1) > f(c_2)$, $f(c_3) > f(c_2)$; suppose for definiteness that $f(c_1) > f(c_3)$. Then $f$ takes some value slightly less than $f(c_3)$ at least twice near $c_3$; taking this value to be greater than $f(c_2)$, it will be less than $f(c_1)$, and by the intermediate value property it will be taken again between $c_1$ and $c_2$, for a total of three times, contradicting the hypothesis.

We turn now to a different application of the intermediate value property. Let $f$ be a continuous function from an interval in $R_1$ into $R_1$. We say that $f$ has a horizontal chord of length $a$ (where $a > 0$) if there is a point $x$ such that $x$ and $x + a$ are both in the domain of $f$ and $f(x) = f(x + a)$. This means that there is a horizontal line segment of length $a$ having both ends on the graph of the function; we do not care whether or not the segment has other points in common with the graph. For example, if $f(x) = 1$ for every $x$, then $f$ has horizontal chords of all lengths; the segment from $(-1, 1)$ to $(1, 1)$ is a horizontal chord of length 2 of the function $f$ defined by $f(x) = x^3 - x + 1$; the function $f$ defined by $f(x) = x^3$ has no horizontal chords.

A function $f$ whose domain is all of $R_1$ is said to be periodic with period $p$ if $f(x + p) = f(x)$ for all $x$. (Of course $p \ne 0$; normally $p$ is understood to be positive.)

Exercise 14.8. A continuous periodic function is bounded.

Exercise 14.9. A continuous periodic function has a maximum.

Exercise 14.10. If $f$ is continuous and has period $p$, then the integral $\int_a^{a+p} f(x)\,dx$ has a value independent of $a$.

We first notice that a continuous periodic function $f$ has horizontal chords of all lengths. That is, if $f$ has period $p$, and $a$ is any real number, then there is an $x$ such that $f(x + a) - f(x) = 0$. To see this, consider $\int_0^p [f(x + a) - f(x)]\,dx$, which is zero (by Exercise 14.10). Hence the integrand cannot have a fixed sign on the interval $(0, p)$: it must be equal to zero at some point. The periodic integrand must return to the same value at $p$ as it has at 0, so it must actually be equal to zero at least twice on the interval $[0, p)$. Accordingly, there are at least two points $x$ in $[0, p)$ for which $f(x + a) = f(x)$, and we see that a continuous function of period $p$ has two


horizontal chords of any given length, with their left-hand endpoints at different points of $[0, p)$.⁵

Exercise 14.11. Show that a periodic continuous function always has a chord (not necessarily horizontal), of prescribed span, with its midpoint on the graph: that is, for each positive $a$ there is an $x$ such that $f(x + a) - f(x) = f(x) - f(x - a)$.

For functions that are not periodic, the situation is quite different. A given continuous function $f$, say with domain $[0, 1]$, may have no horizontal chords at all. (For instance, $f$ could be strictly increasing.) However, let us suppose to begin with that $f$ does have one horizontal chord. More specifically, suppose that $f(0) = f(1)$, so that $f$ has a horizontal chord of length 1. The universal chord theorem states that there are then horizontal chords of lengths $\tfrac12, \tfrac13, \tfrac14, \dots$, but not necessarily a horizontal chord of any given length that is not the reciprocal of an integer.⁶

To prove the positive half of this theorem, let $k$ be a positive integer and consider the continuous function $g$, defined by $g(x) = f(x + 1/k) - f(x)$, whose domain is $[0, 1 - 1/k]$. I assert that 0 is in the range of $g$. If not, then $g$ would be either positive for all $x$ in its domain or else negative for all $x$ in its domain (by the intermediate value property), and so $g(0) + g(1/k) + g(2/k) + \cdots + g(1 - 1/k)$ would be either positive or negative; on the other hand, this sum "telescopes" and is equal to $f(1) - f(0) = 0$. Alternatively, if $g(x) > 0$ for all $x$, that is, if $f(x) < f(x + 1/k)$ for all $x$, then $f(0) < f(1/k) < f(2/k) < \cdots < f((k - 1)/k) < f(k/k) = f(1) = f(0)$, a contradiction. The figure indicates a function that has a horizontal chord of length 1 but no horizontal chord of length $a$ for $\tfrac12 < a < 1$. Naturally a function of this kind has horizontal chords of some lengths that are not reciprocals of integers;


the negative part of the universal chord theorem asserts that, for each number $b$ that is not the reciprocal of an integer, there is a continuous function that has a horizontal chord of length 1 but does not have a horizontal chord of the particular length $b$.

Exercise 14.12. Show how to construct an example for the negative part of the universal chord theorem for every $b \ne 1/k$.

An interesting complement to the universal chord theorem is that a continuous function $f$ with a horizontal chord of length 1 always has either a horizontal chord of length $a$ or two different ones of length $1 - a$ (if $0 < a < 1$). To see this, suppose that $f(0) = f(1)$, and form a new function $g$ by repeating $f$ with period 1. As a periodic continuous function, $g$ necessarily has two horizontal chords of length $a$ with left-hand endpoints in $[0, 1)$ (see page 97). A horizontal chord of length $a$ for $g$, starting at $x$, is a horizontal chord for $f$ unless $x + a > 1$. If $x + a > 1$, then $0 < x + a - 1 < 1$, and so a horizontal chord of length $1 - a$ for $g$ (also for $f$) starts at $x + a - 1$ and ends at $x$.⁷

Exercise 14.13. Prove the universal chord theorem by induction, starting from the fact that if $f(0) = f(1)$, then $f$ has either a horizontal chord of length $a$ or one of length $1 - a$.
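The telescoping argument in the proof of the positive half of the universal chord theorem also suggests a way to locate a horizontal chord of length $1/k$ numerically: the values $g(0), g(1/k), \dots, g(1 - 1/k)$ sum to zero, so either one of them vanishes or two adjacent ones have opposite signs, and a zero of $g$ lies between those two nodes. The following is a rough sketch under the assumption that $f$ is continuous on $[0, 1]$ with $f(0) = f(1)$; the name `horizontal_chord`, the tolerance, and the sample function are illustrative choices, and floating-point rounding is ignored.

```python
def horizontal_chord(f, k, tol=1e-12):
    """Return x in [0, 1 - 1/k] with f(x + 1/k) approximately equal to f(x),
    assuming f is continuous on [0, 1] and f(0) == f(1)."""
    h = 1.0 / k
    g = lambda x: f(x + h) - f(x)
    nodes = [j / k for j in range(k)]          # 0, 1/k, ..., 1 - 1/k
    vals = [g(x) for x in nodes]               # these sum to f(1) - f(0) = 0
    for x, v in zip(nodes, vals):
        if v == 0:
            return x                           # a chord starts at a node
    for j in range(k - 1):
        if vals[j] * vals[j + 1] < 0:          # sign change between two nodes
            lo, hi, g_lo = nodes[j], nodes[j + 1], vals[j]
            while hi - lo > tol:
                mid = (lo + hi) / 2
                g_mid = g(mid)
                if g_mid == 0:
                    return mid
                if (g_mid > 0) == (g_lo > 0):
                    lo = mid                   # same sign as the left end
                else:
                    hi = mid
            return (lo + hi) / 2
    raise ValueError("no sign change found; is f(0) equal to f(1)?")


# Sample function with f(0) = f(1) = 0 (an assumption chosen for illustration).
f = lambda x: x * (1 - x) * (x + 0.5)
starts = {k: horizontal_chord(f, k) for k in range(1, 7)}
```

For each $k$ from 1 to 6, `starts[k]` is the left-hand endpoint of a chord of length $1/k$ of the sample function. Nothing comparable is attempted for lengths that are not reciprocals of integers, in keeping with the negative half of the theorem.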


In the same way, we can show that if $f$ has a derivative $f'$ in $[0, 1]$, and $f'(0) = f'(1)$, then for any positive integer $n$ there are points $x$ and $x + n^{-1}$ such that $f$ has the same slope at both points. This depends on the fact (page 143) that a derivative always has the intermediate value property; this property (for $g$, not merely for $f$) was all that was used in the proof of the universal chord theorem. Another application of the same idea yields a simple fixed-point theorem. This says that a continuous mapping of a compact interval into (part or all of) itself has at least one fixed point; that is, there is at least one point that coincides with its image. Another way of saying this is that if $f$ is a continuous function on $[0, 1]$ such that $0 \le f(x) \le 1$, then the graph of the function must cross the line $y = x$.

Exercise 14.14. Prove the preceding statement; that is, prove that if $f$ is continuous in $[a, b]$ and all the values of $f$ are in $[a, b]$, then there is an $x$ in $[a, b]$ for which $f(x) = x$.

Exercise 14.15. Suppose that a clock runs irregularly, but at the end of 24 hours it has neither gained nor lost overall. Is there some hour for which this clock shows an elapsed time of exactly one hour? Is there some continuous 576 minutes during which it shows an elapsed time of 576 minutes? (Assume that the indicated time is a continuous increasing function.)⁸
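The fixed-point statement can be illustrated numerically without replacing the proof asked for in Exercise 14.14. The sketch below assumes that $f$ is continuous and maps $[a, b]$ into itself, so that $h(x) = f(x) - x$ satisfies $h(a) \ge 0 \ge h(b)$, and it halves the interval while keeping that sign pattern. The name `fixed_point`, the tolerance, and the choice of the cosine on $[0, 1]$ are assumptions for illustration.

```python
import math


def fixed_point(f, a, b, tol=1e-12):
    """Approximate x in [a, b] with f(x) = x, assuming f is continuous
    and maps [a, b] into itself."""
    h = lambda x: f(x) - x
    if h(a) < 0 or h(b) > 0:
        raise ValueError("f does not map [a, b] into itself")
    lo, hi = a, b  # invariant: h(lo) >= 0 >= h(hi)
    while hi - lo > tol:
        mid = (lo + hi) / 2
        if h(mid) >= 0:
            lo = mid
        else:
            hi = mid
    return (lo + hi) / 2


# cos maps [0, 1] into [cos 1, 1], which is contained in [0, 1].
x = fixed_point(math.cos, 0.0, 1.0)
```

The returned `x` is a point where the graph of the cosine meets the line $y = x$, to within the stated tolerance.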

Another closely related theorem states that if a continuous periodic function, of period $2p$, has the property that $f(x) = -f(x + p)$ for every $x$ (like the graph of $y = \sin x$, with $p = \pi$), then $f(x) = 0$ for at least one $x$. This is obvious from the intermediate value theorem. If formulated in a different way, it becomes a simple case of Borsuk's antipodal-point theorem, which is much harder to prove


in higher-dimensional situations. In this formulation we consider a continuous function $f$ whose domain is the circumference of a circle in $R_2$ and whose range is in $R_1$. Suppose that the image of every pair of antipodal points (points at opposite ends of a diameter) is a pair of points that are symmetric with respect to the origin. Then some point of the circumference maps into the origin. Still another application of the intermediate value theorem shows that a (two-dimensional) pancake of arbitrary shape can be bisected by a knife cut in any specified direction. At least if the boundary of the pancake is sufficiently simple, it is easy to see that the part of the pancake lying on one side of a line in the given direction has an area that varies continuously as the line is moved parallel to itself. Since this area can be either 0 or the total area of the pancake, it must at some time be exactly half the total area.⁹
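The pancake argument can be imitated numerically, with the caveat that the following is only a crude sketch: the pancake is replaced by a finite cloud of sample points (an assumption, so "area" becomes "fraction of points" and varies in small steps rather than continuously), and the knife line in the direction $\theta$ is slid parallel to itself until half of the points lie on each side. The name `bisecting_offset` and the sampled disk are hypothetical.

```python
import math


def bisecting_offset(points, theta, iterations=60):
    """Return c such that the line {p : p . n = c}, with unit normal
    n = (-sin theta, cos theta), splits the sample points roughly in half."""
    nx, ny = -math.sin(theta), math.cos(theta)
    proj = [x * nx + y * ny for x, y in points]  # signed distances along n
    lo, hi = min(proj), max(proj)  # fraction below lo is 0; below hi is near 1
    for _ in range(iterations):
        mid = (lo + hi) / 2
        below = sum(p < mid for p in proj) / len(proj)
        if below < 0.5:
            lo = mid   # too little of the pancake on the lower side
        else:
            hi = mid   # at least half on the lower side
    return (lo + hi) / 2


# A disk of radius 1 centered at (1, 2), sampled on a grid of spacing 0.05.
step = 0.05
disk = [(1 + i * step, 2 + j * step)
        for i in range(-20, 21) for j in range(-20, 21)
        if (i * step) ** 2 + (j * step) ** 2 <= 1.0]
```

For the sampled disk, symmetry suggests that the bisecting line in every direction should pass close to the center $(1, 2)$, which gives a simple sanity check on the sketch.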

We now indicate how two pancakes in a plane can be bisected simultaneously by some line. Let the two pancakes be $A$ and $B$. As we have just seen, there is a line in any given direction bisecting $B$. For the direction determined by any point $P$ on a given circle in $R_2$, with center $O$, find a line bisecting $B$, and let $f(P)$ be the signed difference between the area of the part of $A$ on the left of this line and the area of the part of $A$ on the right. We have a function whose domain is a circumference; whose range is in $R_1$; and which maps antipodal points into points symmetric about the origin, since replacing $P$ by its antipode interchanges left and right. If we can show that $f$ is continuous, then the special case of Borsuk's theorem, noted above, shows that $f(P) = 0$ for some $P$, which is to say that a line in the direction of $OP$ bisects both $A$ and $B$. The continuity of $f$ is plausible, but not quite obvious. It is enough to show that a small change in the position


of $P$ produces not only a small change in the direction of the line bisecting $B$, but also a small change in its position, for example, a small change in its intercept on one of the coordinate axes. For, if this statement is true, $f(P)$ will change by only a small amount. Now the preceding statement about lines bisecting $B$ is true without reservation only if we impose some restriction on the admissible shape of a pancake; for example, a "pancake" shaped like ○-○ can be bisected by many vertical lines. If we restrict ourselves to convex pancakes there is no difficulty, as is clear from a figure.¹⁰

Exercise 14.16. Given a convex closed curve in the plane, there is a line that simultaneously bisects the curve and the area that it surrounds.¹¹

There are similar theorems in more dimensions, but they are harder to prove. The three-dimensional fixed-point theorem can be stated in picturesque, if misleading, language: if a cup of coffee is stirred in any continuous fashion, there is at least one molecule that ends up in its original position. (This is correct only if the coffee occupies every point inside the cup and "molecule" is interpreted as "point.") The three-dimensional antipodal-point theorem states that a continuous mapping of the surface of a sphere into the space $R_2$ that carries every pair of antipodal points into a pair of points symmetric with respect to the origin must carry some point of the surface of the sphere into the origin. This can be used to show that any three volumes in space can be bisected simultaneously by some plane (the "ham sandwich" theorem).¹²

NOTES

¹ This also follows from the definition of continuity on page 88, since the property in question requires the inverse image of every open interval to be an open interval (and hence the inverse image of every open set to be an open set). For a more thorough treatment see J. B. Diaz, Discussion and extension of a theorem of Tricomi concerning functions which assume all intermediate values, Journal of Mathematics and Mechanics 18 (1968/69), 617–628; also the review of this paper in Mathematical Reviews 39 #370. For a localized version of the property just discussed see E. W. Chittenden, Note on functions which approach a limit at every point of an interval, American Mathematical Monthly 25 (1918), 249–250.

² More generally there is no continuous transformation of an interval such that each image point has exactly two inverses. See O. G. Harrold, The non-existence of a certain type of continuous transformation, Duke Mathematical Journal 5 (1939), 789–793; and for extensions, J. H. Roberts, Two-to-one transformations, Duke Mathematical Journal 6 (1940), 256–262; P. Civin, Two-to-one mappings of manifolds, Duke Mathematical Journal 10 (1943), 49–57.

There is a substantial amount of literature on related topics. See, for example, Mathematical Reviews 94b:54094, 94b:54093, 90e:54021, 48#472, 36#3320, 32#434, 30#2476, 28#2547, 26#4310, 26#3021, 24#A2954, 24#A1709, 2,324d; and Problems E 1094 and E 1715, American Mathematical Monthly (1954, 425; 1965, 784).

³ D. C. Gillespie, A property of continuity, Bulletin of the American Mathematical Society 28 (1922), 245–250.

⁴ In the paper cited in the preceding note, Gillespie gives a formula for such a function: f(x) = « + i sin(π/x), 0 < x < 1.

⁵ J. B. Diaz and F. T. Metcalf, A continuous periodic function has every chord twice, American Mathematical Monthly 74 (1967), 833–835, give a different proof. For extensions to almost periodic and more general functions, see J. C. Oxtoby, Horizontal chord theorems, American Mathematical Monthly 79 (1972), 468–475.

⁶ Although T. M. Flett (Bulletin of the Institute of Mathematics and its Applications 11 (1975), 34) has discovered that the positive part of the universal chord theorem was proved by A. M. Ampère in 1806 (see Mathematical Reviews 56 #5805), the modern history of the theorem begins with P. Lévy, Sur une généralisation du théorème de Rolle, Comptes Rendus de l'Académie des Sciences Paris 198 (1934), 424–425; it has been repeatedly rediscovered. See H. Hopf, Über die Sehnen ebener Kontinuen und die Schleifen geschlossener Wege, Commentarii Mathematici Helvetici 9 (1937), 303–319, for extensions. For a discussion of the possible lengths of horizontal chords for a given function, see Hopf's paper; also Oxtoby's paper cited in the preceding note, and R. J. Levit, The finite difference


extension of Rolle's theorem, American Mathematical Monthly 70 (1963), 26–30. For further information and some applications see J. T. Rosenbaum, Some consequences of the universal chord theorem, American Mathematical Monthly 78 (1971), 509–513. Rosenbaum also gave a picturesque interpretation of the theorem: How to build a picnic table for use on a mountain range of known period, Notices of the American Mathematical Society 16 (1969), 94. Another application was shown to me by J. C. Oxtoby: if $f$ is continuous and $f(x + y) = g(f(x), y)$ for all $x$ and $y$, then $f$ is either strictly monotonic or constant. (This is problem E 2246, American Mathematical Monthly 78 (1971), 676–677.) For if $f$ is not strictly monotonic, then $f(x_0 + h) = f(x_0)$ for some $x_0$ and some $h > 0$. Then for all $x$,
\[
f(x + h) = f((x_0 + h) + (x - x_0)) = g(f(x_0 + h), x - x_0) = g(f(x_0), x - x_0) = f(x_0 + (x - x_0)) = f(x).
\]

Since $f$ has arbitrarily short horizontal chords (by the universal chord theorem), this formula shows that $f$ has arbitrarily short periods, and so is constant. Oxtoby remarks that conversely, if $f$ is continuous and either strictly monotonic or constant, then there is a function $g$ such that $f(x + y) = g(f(x), y)$.

⁷ Hopf, preceding note.

⁸ Suggested by J. D. Memory, Kinematics problem for joggers, American Journal of Physics 41 (1973), 1205–1206.

⁹ For an elementary and detailed exposition of this and related theorems, see A. W. Tucker, Some topological properties of disk and sphere, Proceedings of the First Canadian Mathematical Congress 1945, University of Toronto Press, 1946, pp. 285–309.

¹⁰ See W. G. Chinn and N. E. Steenrod, First Concepts of Topology (New Mathematical Library, no. 18), Random House, New York, 1966, p. 65.

¹¹ J. G. Brennan, A property of a plane convex region, Mathematical Gazette 42 (1958), 301–302; A. C. Zitronenbaum, Bisecting an area and its boundary, Mathematical Gazette 43 (1959), 130–131.

¹² For a proof, references, and extensions, see A. H. Stone and J. W. Tukey, Generalized "sandwich" theorems, Duke Mathematical Journal 9 (1942), 356–359; Chinn and Steenrod, book cited above, p. 120.


15. Upper and lower limits. We shall need to use a generalization of the notion of limit of a sequence of real numbers. If $\{s_n\}$ is such a sequence, and the numbers $s_n$ form a bounded set, then (Exercise 15.1) there is always a number $L$ with the following property: Given any positive $\varepsilon$, we have $s_n < L + \varepsilon$ whenever $n$ is sufficiently large, and in addition an infinite number of $s_n$ satisfy $s_n > L - \varepsilon$. This number $L$ is called the upper limit of $\{s_n\}$, and is written $\limsup s_n$ or $\overline{\lim}\, s_n$. If $\{s_n\}$ is not bounded above, then no such number $L$ exists, and we write $\limsup s_n = +\infty$. If $\{s_n\}$ is bounded above but unbounded below, $L$ may or may not exist; in the case that it fails to exist, we write $\limsup s_n = -\infty$.

Consider some examples. (i) Let $s_n = (-1)^n$, so that our sequence is $-1, +1, -1, +1, \ldots$. Then $\limsup s_n = 1$. (ii) Let $s_n = n$; the sequence is $1, 2, 3, \ldots$. Then we have $\limsup s_n = +\infty$. (iii) Let $\{s_n\} = \{-1, -2, \ldots\}$. Then $\limsup s_n = -\infty$. (iv) Let $s_n = 1/n$, $n = 1, 2, 3, \ldots$. Then $\limsup s_n = \lim s_n = 0$. (v) Let [displayed definition missing]; then $\limsup s_n = 0$.

Exercise 15.1. By considering the numbers $L_n$ defined by $L_n = \sup s_k$, for $k \ge n$, show that $\limsup s_n$ exists (finite) if $\{s_n\}$ is bounded. Thus $\limsup s_n = \lim_{n \to \infty} (\sup_{k \ge n} s_k)$.

Exercise 15.2. Define the lower limit, $\liminf$ or $\underline{\lim}$, similarly, and determine it for the six examples given above for $\limsup$.
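The tail suprema of Exercise 15.1 can be explored numerically. The following is only a sketch: the names `tail_extrema`, `alternating`, and `reciprocal` are illustrative, and the cutoff `horizon` is an assumption that replaces the infinite tail by a finite one, so the output can suggest an upper or lower limit but cannot prove it.

```python
def tail_extrema(s, n_max, horizon, pick=max):
    """Return [pick of s(k) for n <= k <= horizon] for n = 1, ..., n_max."""
    values = [s(k) for k in range(1, horizon + 1)]
    out = []
    running = values[-1]
    for k in range(horizon, 0, -1):      # sweep the tail from right to left
        running = pick(running, values[k - 1])
        if k <= n_max:
            out.append(running)
    return out[::-1]                     # reorder as n = 1, 2, ..., n_max


alternating = lambda n: (-1) ** n        # example (i)
reciprocal = lambda n: 1 / n             # example (iv)

print(tail_extrema(alternating, 5, 1000))        # truncated tail suprema
print(tail_extrema(alternating, 5, 1000, min))   # truncated tail infima
print(tail_extrema(reciprocal, 4, 1000))
```

For example (i) the truncated tail suprema stay at 1 and the tail infima at -1, while for example (iv) they decrease like $1/n$, in line with the values of $\limsup$ stated in the text.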


The definition of upper limit resembles that of least upper bound, from which it differs in that finite subsets of the elements $\{s_n\}$ are disregarded. For example, the value of $\limsup s_n$ is unchanged if the first thousand $s_n$ are replaced by other numbers. We could also define $\limsup s_n$ as the largest limit obtainable by picking convergent subsequences out of $\{s_n\}$.

We have $\limsup(s_n + t_n) \le \limsup s_n + \limsup t_n$ when both the quantities on the right are finite; but strict inequality may occur here. However, $\limsup(s_n + t_n) = \limsup s_n + \lim t_n$ if $\lim t_n$ exists.

Exercise 15.3. Prove the statements in the preceding paragraph.

The inequality extends to finite sums, but not to infinite sums. For example, if we let $s^{(k)}$ denote the sequence $\{0, 0, \ldots, 0, 1, 0, \ldots\}$ with elements $s_n^{(k)}$ equal to 0, except that the $k$th element $s_k^{(k)}$ of the $k$th sequence is 1, we have (as $n \to \infty$) that $\limsup(s_n^{(1)} + s_n^{(2)} + \cdots) = \limsup 1 = 1$, whereas $\limsup s_n^{(k)} = 0$ for each $k$.
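The counterexample for infinite sums can be checked on a finite truncation. This is a sketch only; the helper names `s` and `column_sum` and the cutoff `k_max` are assumptions made for illustration, and the truncation agrees with the infinite sum only for indices $n \le$ `k_max`.

```python
def s(k, n):
    """n-th element of the k-th sequence: 1 when n == k, otherwise 0."""
    return 1 if n == k else 0


def column_sum(n, k_max):
    """s(1, n) + s(2, n) + ... + s(k_max, n)."""
    return sum(s(k, n) for k in range(1, k_max + 1))


# Every sequence is eventually 0, yet the termwise sum is the constant 1.
print([s(3, n) for n in range(1, 8)])
print([column_sum(n, 100) for n in range(1, 8)])
```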

Upper and lower limits can also be defined for functions whose ranges are in $R_1$ but whose domains are more general. If the domain of $f$ is a set $S$ in a metric space, and $x_0$ is a limit point of $S$, by $\limsup_{x \to x_0} f(x) = L$ we mean that given any positive $\varepsilon$ we have $f(x) < L + \varepsilon$ whenever $x$ is in a sufficiently small neighborhood of $x_0$, and in addition there is a sequence of points $\{x_n\}$ with limit $x_0$ such that $f(x_n) > L - \varepsilon$. Slight modifications have to be made if $L = \pm\infty$. In a similar way we can define $\limsup_{x \to +\infty} f(x)$ if the domain of $f$ is a subset of $R_1$ that is unbounded above. Examples:


(i) If $f(x) = \sin x$ for $x \in R_1$, then $\limsup_{x \to +\infty} f(x) = +1$ and $\liminf_{x \to +\infty} f(x) = -1$.

(ii) If $f(x) = e^x \sin x$ for $x > 0$, then $\limsup_{x \to +\infty} f(x) = +\infty$ and $\liminf_{x \to +\infty} f(x) = -\infty$.

(iii) If $f(x) = e^{1/x}$ for $x \ne 0$, then $\limsup_{x \to 0} f(x) = +\infty$ and $\liminf_{x \to 0} f(x) = 0$.

Exercise 15.4. Show that if $\limsup s_n = \liminf s_n = L$, and $L$ is finite, then $\lim s_n = L$ according to the definition given on page 54.

Exercise 15.5. If $\limsup s_n \le L$ and $\liminf s_n \ge L$, then $\lim s_n$ exists (and equals $L$).

Up to this point, although we have considered limits of sequences, we have not had to consider limits of more general functions. For functions with values in $R_1$, it is simplest to define $\lim_{x \to x_0} f(x)$ to be the common value (if there is one) of $\limsup_{x \to x_0} f(x)$ and $\liminf_{x \to x_0} f(x)$. If the domain of $f$ includes a right-hand neighborhood of $x_0$, we define as lim sup [...] 0.

Whether or not a sequence of functions converges pointwise, we can always define the pointwise upper and lower limits $\limsup s_n(x)$ and $\liminf s_n(x)$ if the functions have a common domain, and range in $R_1$. If these upper and lower limits are finite, they define two functions, which we can call $\limsup s_n$ and $\liminf s_n$.

It is often convenient to generalize the space $C$ of continuous functions by considering functions whose domains are sets more general than intervals in $R_1$. In defining $C$, we used the existence of the maximum of a continuous function whose domain is a compact interval. Since every continuous function whose domain is a compact set has a


maximum, we can use just the same definition: let $E$ be a compact set in a metric space; then the space $C_E$ consists of continuous functions $f$ with domain $E$ and range in $R_1$, and metric $d(f, g)$ defined by $\max_{x \in E} |f(x) - g(x)|$. Similarly, we can define the space $B_E$ of bounded functions with domain $E$ by taking the metric $d(f, g)$ to be $\sup_{x \in E} |f(x) - g(x)|$; here $E$ does not have to be compact. If a sequence of continuous functions, regarded as a sequence of elements of $C_E$, converges, it is customary to say that it converges uniformly on $E$. More generally, suppose that we have a sequence of functions (bounded or not, continuous or not); if the difference of each two of the functions is bounded, we can form the distance between them in the metric of $B_E$; and if the sequence of functions is a Cauchy sequence in this metric, we say that it converges uniformly on $E$. For example, if $f(x) = 1/x$, then the sequence $\{f, f, f, \ldots\}$ converges uniformly on $0 < x < 1$. The sequence $\{s_n\}$ with $s_n(x) = x^n$ converges uniformly on any specified interval $[0, a]$ with $0 < a < 1$, but not on $[0, 1]$. Notice that it does not converge uniformly on the half-closed interval $[0, 1)$ either. It is important to realize that "uniform convergence on each closed subinterval of an open interval" is not the same as "uniform convergence on the open interval."
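The contrast between $[0, a]$ and $[0, 1)$ for $s_n(x) = x^n$ can be made concrete. The sketch below uses two illustrative helpers (their names are assumptions): on $[0, a]$ the supremum of $x^n$ is $a^n$ because $x^n$ is increasing there, while on $[0, 1)$ the point $x = 1 - 1/n$ serves as a witness that the supremum of $|s_n(x) - 0|$ stays at least $1/4$ for $n \ge 2$.

```python
def sup_on_closed_interval(n, a):
    """sup of x**n over [0, a]; x**n is increasing there, so it equals a**n."""
    return a ** n


def witness_below_one(n):
    """Value of x**n at the point x = 1 - 1/n, which lies in [0, 1)."""
    x = 1 - 1 / n
    return x ** n


for n in (10, 100, 1000):
    print(n, sup_on_closed_interval(n, 0.9), witness_below_one(n))
```

The first column of values tends to zero, which is uniform convergence on $[0, 0.9]$; the second stays near $1/e$, so the distance from $s_n$ to the pointwise limit 0 in the metric of $B_E$ with $E = [0, 1)$ does not tend to zero.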

Another, more conventional, way of saying that a sequence $\{s_n\}$ is uniformly convergent on $E$ is to say that, given a positive $\varepsilon$, there is an $N$ such that, for $n$ and $m$ exceeding $N$, we have $|s_n(x) - s_m(x)| < \varepsilon$ simultaneously for every $x$ in $E$; we emphasize that $N$ is to be independent of which $x$ in $E$ is being considered. If $N$ is allowed to depend on $x$, we have the definition of pointwise convergence again. Suppose we can find numbers $M_n$ with the property that $|s_n(x) - s_{n+1}(x)| \le M_n$ for all $x$ in $E$; that is, we have


$\sup_{x \in E} |s_n(x) - s_{n+1}(x)| \le M_n$. Then if $\sum M_n$ converges, the sequence $\{s_n\}$ converges uniformly on $E$. For, if $m > n$ we have $|s_n(x) - s_m(x)| \le M_n + M_{n+1} + \cdots + M_{m-1}$, and the sum on the right is small when $m$ and $n$ are large. This is known as the Weierstrass M-test² and is usually stated in the equivalent form that it assumes when the $s_n$ are the partial sums of a series. This reads as follows: If $c_n$ are functions with domain $E$, and $|c_n(x)| \le M_n$ for all $x$ in $E$ (where we emphasize that $M_n$ is independent of $x$), then $\sum c_n$ converges uniformly if $\sum M_n$ converges.

It is sometimes useful to consider another kind of convergence: pointwise convergence together with uniform boundedness, that is, boundedness in the metric of $B_E$. We call this bounded convergence. For example, the sequence $\{s_n\}$ with $s_n(x) = x^n$ is boundedly convergent on $[0, 1]$, although it is uniformly convergent only on each $[0, a]$, $0 < a < 1$. If $s_n(x) = nx^n$, we still have $\{s_n\}$ uniformly convergent on each $[0, a]$, $0 < a < 1$, but $\{s_n\}$ is not boundedly convergent on $[0, 1)$.
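As an illustration of the M-test in its series form, consider the series with terms $c_k(x) = \cos(kx)/k^2$ and majorants $M_k = 1/k^2$; this particular series is an assumption chosen for the sketch, not an example from the text, and the function names and the sampling grid are likewise illustrative. The gap between two partial sums is bounded, uniformly in $x$, by the corresponding piece of $\sum M_k$.

```python
import math


def partial_sum(x, n):
    """Sum of cos(k x) / k^2 for k = 1, ..., n."""
    return sum(math.cos(k * x) / k ** 2 for k in range(1, n + 1))


def majorant_tail(n, m):
    """M_{n+1} + ... + M_m with M_k = 1 / k^2."""
    return sum(1 / k ** 2 for k in range(n + 1, m + 1))


GRID = [2 * math.pi * i / 200 for i in range(201)]   # sample points in [0, 2*pi]


def largest_gap(n, m):
    """Largest sampled value of |s_m(x) - s_n(x)|."""
    return max(abs(partial_sum(x, m) - partial_sum(x, n)) for x in GRID)


print(largest_gap(10, 50), majorant_tail(10, 50))
```

At $x = 0$ every cosine equals 1, so the bound is attained there up to rounding; a grid can only sample the supremum, which is why the check is stated as an inequality.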

Exercise 16.2. Show that the limit of a boundedly convergent sequence of functions is a bounded function.

NOTES

¹ For a counterexample, see page 121. The error appears in various places in Cauchy's earlier writings. See, for example, his 1821 Cours d'analyse reprinted in Oeuvres Complètes, II Série, Tome III, p. 120. (These collected works of Cauchy were published in 27 volumes by Gauthier-Villars, Paris, starting in 1882.) Also see the reprint of the first part of Cauchy's Cours d'analyse with extensive notes by Umberto Bottazzini, Cooperativa Libraria Universitaria Editrice Bologna, Bologna, 1992.

² "M" for "majorant."


17. Uniform convergence. One of the useful properties of uniform convergence is that the limit of a uniformly convergent sequence of continuous functions is continuous. The same fact can be stated more concisely by saying that the space $C_E$ is complete. This is easily proved. In fact, we prove a slightly more general statement that is sometimes useful: if each $s_n$ is continuous at $x_1$, and if $\{s_n\}$ converges uniformly in a neighborhood of $x_1$, then the function $L$ defined by $L(x) = \lim s_n(x)$ is continuous at $x_1$. In the first place, uniform convergence implies pointwise convergence, so there is a function $L$. Let $D$ be the distance function in the space $B_E$ of bounded functions. Then
\[
|L(x_1) - L(x_2)| \le |L(x_1) - s_n(x_1)| + |L(x_2) - s_n(x_2)| + |s_n(x_1) - s_n(x_2)| \le 2D(L, s_n) + |s_n(x_1) - s_n(x_2)|.
\]

If a positive $\varepsilon$ is given, then $D(L, s_n)$ can be made less than $\varepsilon/3$ by choosing $n$ large enough, since $\{s_n\}$ converges uniformly to $L$. After choosing $n$ this large, fix $n$. Then, since $s_n$ is continuous at $x_1$, there is a positive $\delta$ such that $|s_n(x_1) - s_n(x_2)| < \varepsilon/3$ when $d(x_1, x_2) < \delta$. Consequently, $d(x_1, x_2) < \delta$ implies that $|L(x_1) - L(x_2)| < \varepsilon$, and thus $L$ is continuous at $x_1$.

Of course, the limit of a nonuniformly convergent sequence of continuous functions is not necessarily discontinuous. For instance, the sequence $\{s_n\}$, where $s_n(x) = nx^n(1 - x)$, is not uniformly convergent on $[0, 1]$ (Exercise 16.1), but its limit is the continuous function 0. However, under additional restrictions it is possible to conclude that if the limit function is continuous, then the convergence is uniform. For example, if a sequence of continuous functions converges monotonically to a continuous function on a compact interval in $R_1$, the convergence is necessarily uniform.¹ The hypothesis of monotonic convergence means that either $s_n(x) \ge s_{n+1}(x)$ for every $n$ and every $x$ in


the interval, or else $s_n(x) \le s_{n+1}(x)$ for every $n$ and every $x$ in the interval. Let us take the hypothesis in the form $s_n(x) \ge s_{n+1}(x)$. Call the limit function $L$; then $s_n(x) - L(x) \ge 0$. If the convergence is not uniform, then $\max_x [s_n(x) - L(x)]$ does not approach zero, so there is a sequence of values of $n$ for which $\max_x [s_n(x) - L(x)] \ge b > 0$. Since $s_n - L$ is a continuous function, it attains its maximum at a point $x_n$; from the set $\{x_n\}$ we can, by Exercise 8.9, select a sequence $\{y_k\}$ with a limit $z$. Then we have $s_k(y_k) - L(y_k) \ge b$, and consequently $s_n(y_k) - L(y_k) \ge b$ for each $n \le k$ (it is only here that we make essential use of the inequality $s_n(x) \ge s_{n+1}(x)$). Letting $k \to \infty$, with fixed $n$, we infer (by the continuity of $s_n - L$) that $s_n(z) - L(z) \ge b$ for every $n$. On the other hand, $s_n(z) - L(z) \to 0$, since we have convergence at the point $z$. Thus the assumption of nonuniform convergence leads to a contradiction.

Another condition that produces the same result is that the functions $s_n$ are monotonic (even if they are not necessarily continuous). More precisely, let $s_n \to L$ pointwise on a compact interval $[a, b]$, let $L$ be continuous, and let all the $s_n$ be nondecreasing functions. Then $s_n \to L$ uniformly. The proof of this theorem requires a fact from §19, but since the theorem fits in well here, I prove it now.

Fix a positive number $\varepsilon$, and choose a finite set of points $x_k$ in $[a, b]$ such that $a = x_1 < x_2 < \cdots < x_m = b$, and $0 \le L(x_k) - L(x_{k-1}) < \varepsilon$ for $k = 2, 3, \ldots, m$. Since $L$ is uniformly continuous (see page 127) and nondecreasing, these inequalities will certainly hold if the distances between consecutive $x_k$'s are small enough. Moreover, since $L$ is nondecreasing, we have $0 \le L(x_k) - L(x) < \varepsilon$ for $x_{k-1} \le x \le x_k$. Now since the sequence of functions $\{s_n\}$ converges pointwise, and there are only a finite number of points $x_k$, we can choose $n$ so large that $|s_n(x_k) - L(x_k)| < \varepsilon$ for all $k$. Since each $x$ is in some $[x_{k-1}, x_k]$, and $L$ is nondecreasing, we have
\[
s_n(x) \le s_n(x_k) < L(x_k) + \varepsilon < L(x) + 2\varepsilon,
\]
where we use successively the hypothesis that $s_n$ is nondecreasing, the inequality $s_n(x_k) < L(x_k) + \varepsilon$, and the inequality $L(x_k) < L(x) + \varepsilon$. Similarly,
\[
L(x) - 2\varepsilon < L(x_{k-1}) - \varepsilon < s_n(x_{k-1}) \le s_n(x).
\]

These two sets of inequalities together imply that, if $n$ is large enough, $|s_n(x) - L(x)| < 2\varepsilon$ for every $x$ in $[a, b]$; this is the statement of the uniform convergence of $\{s_n\}$.

Exercise 17.1. The limit of a sequence of discontinuous functions may be either continuous or discontinuous, whether or not the convergence is uniform.

One reason for the importance of the idea of uniform convergence is that a good way to construct a function with some special property is often to exhibit it as the uniform limit of functions that do not quite have the property. As an illustration of this principle (sometimes called the condensation of singularities), we shall exhibit a continuous curve that passes through every point of a plane area. (Such space-filling curves are known as Peano curves.) We must of course decide in advance what the phrase "continuous curve" is to mean; the lesson of our construction is that a plausible definition of continuous curve may lead to an object that does not fit the intuitive idea of what a continuous curve should look like.² One natural way of defining a continuous curve in $R_2$ is to say that it is a continuous image of a line segment, that is, the set of values of a continuous function from a convenient closed interval (say $[0, 1]$) into $R_2$. Of course, different functions may lead to the same image, but this is irrelevant here; we are going to show that there is at least one function for which the image of the interval covers every point of a whole square, indeed, covers some points more than once. We may represent the points $p$ in the image by their coordinates, letting the image point $(x(t), y(t))$


correspond to the point $t$ of the domain. This amounts to saying that we are thinking of a continuous curve as defined by a pair of parametric equations, $x = x(t)$, $y = y(t)$, with continuous functions $x$ and $y$. We are now going to construct a continuous curve, in this sense, that passes through every point of the square where $0 \le x \le 1$ and $0 \le y \le 1$. As a matter of fact, the curve that we shall construct passes through some points of the square five times. It is possible to refine the construction so that the curve has nothing worse than triple points, but farther than this we cannot go, as is shown in topology.³ We shall verify presently that the curve must have double points at least, that is, there is no continuous one-to-one mapping of a line segment onto a square.

We base our construction⁴ on the properties of a continuous function $f$ that is even, is periodic with period 2, has the value 0 on $[0, \tfrac13]$, has the value 1 on $[\tfrac23, 1]$, and is linear in $(\tfrac13, \tfrac23)$. Define two functions $x$ and $y$ by
\[
x(t) = \tfrac12 f(t) + \tfrac14 f(3^2 t) + \tfrac18 f(3^4 t) + \cdots,
\]
\[
y(t) = \tfrac12 f(3t) + \tfrac14 f(3^3 t) + \tfrac18 f(3^5 t) + \cdots.
\]

Both series are uniformly convergent (by the M-test), and hence $x$ and $y$ are continuous functions.⁵
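The construction can be tried out with exact rational arithmetic. The sketch below is illustrative only: the function names are assumptions, the series are truncated to finitely many terms, and only finitely many binary digits of the target point are used. It builds $f$, the partial sums of $x(t)$ and $y(t)$, and a parameter $t_0$ obtained by interlacing the binary digits of a target point, doubling them, and reading the result in base 3.

```python
from fractions import Fraction


def f(t):
    """Even, period-2 function: 0 on [0, 1/3], 1 on [2/3, 1], linear between."""
    t = t % 2                  # reduce by the period
    if t > 1:
        t = 2 - t              # f(t) = f(-t) = f(2 - t)
    if t <= Fraction(1, 3):
        return Fraction(0)
    if t >= Fraction(2, 3):
        return Fraction(1)
    return 3 * t - 1


def curve_point(t, terms):
    """Partial sums, with `terms` terms each, of the series for x(t) and y(t)."""
    x = sum(f(3 ** (2 * k) * t) / 2 ** (k + 1) for k in range(terms))
    y = sum(f(3 ** (2 * k + 1) * t) / 2 ** (k + 1) for k in range(terms))
    return x, y


def parameter_for(x_bits, y_bits):
    """Interlace the binary digits, double them, and read the result in base 3."""
    digits = [b for pair in zip(x_bits, y_bits) for b in pair]
    return sum(Fraction(2 * a, 3 ** (j + 1)) for j, a in enumerate(digits))


# Target point x0 = 0.101 and y0 = 0.011 in base 2, i.e. (5/8, 3/8).
t0 = parameter_for([1, 0, 1], [0, 1, 1])
print(curve_point(t0, 3))
```

Because every base-3 digit of $t_0$ is 0 or 2, the integer part of $3^k t_0$ is even, so the period-2 reduction leaves exactly the fractional expansion that starts at the $k$th digit; this is the step on which the digit argument for $f(3^k t_0)$ relies.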


Let $0 \le x_0 \le 1$ and $0 \le y_0 \le 1$, and represent $x_0$ and $y_0$ by "decimals" in base 2 (compare page 40):
\[
x_0 = 0.a_0 a_2 a_4 \ldots \ \text{(base 2)}, \qquad y_0 = 0.a_1 a_3 a_5 \ldots \ \text{(base 2)}.
\]
Now define a number $t_0$ by taking its expansion in base 3 to be $t_0 = 0.(2a_0)(2a_1)(2a_2)\ldots$ (base 3). That is, $t_0$ is constructed by doubling the binary digits of $x_0$ and $y_0$, interlacing them, and interpreting the result in base 3. We shall now show that $x(t_0) = x_0$ and $y(t_0) = y_0$, so that the curve whose parametric equations are $x = x(t)$, $y = y(t)$ passes through $(x_0, y_0)$.

To do this, we show that $f(3^k t_0) = a_k$ for $k = 0, 1, 2, \ldots$; it will then be obvious from the definition of $x(t_0)$ and $y(t_0)$ that $x(t_0) = x_0$ and $y(t_0) = y_0$. Now $a_k$ is either 0 or 1. If $a_k = 0$, the number represented by $0.(2a_k)(2a_{k+1})\ldots$ (base 3) is between 0 and $\tfrac13$, and so
\[
f(3^k t_0) = f(0.(2a_k)(2a_{k+1})\ldots) = 0;
\]
if $a_k = 1$, the number represented by $0.(2a_k)(2a_{k+1})\ldots$ (base 3) is between $\tfrac23$ and 1, and so $f(3^k t_0) = 1$.

We now show that there cannot be a continuous curve which passes through every point of a square precisely once. If there were such a curve, it would be the image of a continuous univalent function with domain an interval in $R_1$, say $[0, 1]$, and range a square in $R_2$. By the theorem on page 91, this function has a continuous inverse. Since both the function and its inverse have the property that the inverse images of open sets are open, it is equally true that the images of open sets are open. Consider the sets $[0, \tfrac12)$ and $(\tfrac12, 1]$, which are open relative to the domain $[0, 1]$. Their images are two sets $E_1$ and $E_2$ which are open relative to the range square and which fill the square except


for one point. If we take a neighborhood in $E_1$ and a neighborhood in $E_2$, we can obviously draw a line segment connecting some point of one neighborhood with some point of the other neighborhood, remaining in the square and not going through the missing point. Thus we obtain two disjoint sets, both open, and neither empty, covering a line segment; this contradicts the connectedness of $R_1$. Hence our hypothetical continuous curve cannot exist. On the other hand, we showed on page 17 that there is a one-to-one correspondence between an interval and a square; in the light of what we have just proved, it cannot be continuous.

A useful theorem that uses the idea of uniform convergence can be stated concisely in the following form: A uniformly convergent sequence can be integrated term by term. More precisely, if $f_n$ are functions from a finite real interval $I$ into $R_1$, if $f_n \to f$ uniformly on $I$, and if each $f_n$ is integrable in the ordinary (Riemann) sense over $I$, then
\[
\int_I f_n(x)\,dx \to \int_I f(x)\,dx.
\]
The proof is immediate if we know that $f$ is integrable: if $I = [a, b]$, then
\[
\left| \int_a^b f_n(x)\,dx - \int_a^b f(x)\,dx \right| \le (b - a) \sup_x |f_n(x) - f(x)|.
\]
The same proof shows that the sequence of indefinite integrals $\int_a^x f_n(t)\,dt$ converges uniformly itself. If each $f_n$ is,


for example, continuous, then $f$ is continuous (page 112) and so integrable. We will not complete the proof by showing that the uniform limit of (Riemann) integrable functions is (Riemann) integrable, for later on (page 212) we shall see a much more powerful convergence theorem for the more general Lebesgue integral.

However, even the simplest case, that in which we have a uniformly convergent sequence of continuous functions, has many interesting applications. Here is one of them. Let $f$ be a function with derivatives of all orders, all of which are then necessarily continuous functions (Exercise 21.1). Suppose that the sequence $\{f^{(n)}\}_{n=1}^{\infty}$ of derivatives converges uniformly to a function $L$; we might call $L$ a derivative of $f$ of infinite order. As $n \to \infty$, we have on the one hand that
\[
\int_a^x f^{(n)}(t)\,dt = f^{(n-1)}(x) - f^{(n-1)}(a) \to L(x) - L(a),
\]
and on the other hand, by our theorem on integrating uniformly convergent sequences,
\[
\int_a^x f^{(n)}(t)\,dt \to \int_a^x L(t)\,dt.
\]
Consequently $\int_a^x L(t)\,dt = L(x) - L(a)$. This implies that $L(x) = L'(x)$, and therefore $L(x) = ce^x$. Thus any derivative of infinite order is necessarily a simple exponential⁶ (including the case where it is identically zero), no matter what function gave rise to it.

The same theorem on uniform convergence can be used to establish a theorem on termwise differentiation of a sequence of functions. If the functions $f_n$ have continuous derivatives on a bounded interval $I$, if $\{f_n(a)\}$ converges


for some $a$ in $I$, and if $\{f_n'\}$ converges uniformly, then $f_n$ converges uniformly to a limit $f$ which is differentiable, and $\lim f_n' = f'$. For suppose $f_n' \to g$ uniformly. Then
\[
f_n(x) - f_n(a) = \int_a^x f_n'(t)\,dt \to \int_a^x g(t)\,dt.
\]
Since $f_n(a) \to f(a)$, it follows that $f_n(x)$ converges for each $x$. Moreover, $|f_n(x) - f_m(x) + f_m(a) - f_n(a)|$ equals
\[
\left| \int_a^x (f_n'(t) - g(t))\,dt - \int_a^x (f_m'(t) - g(t))\,dt \right| \le (\text{length } I) \left( \sup_{t \in I} |f_n'(t) - g(t)| + \sup_{t \in I} |f_m'(t) - g(t)| \right),
\]
and since the right-hand side is independent of $x$ and tends to zero, $\{f_n\}$ converges uniformly, to some limit $f$. Now
\[
f_n(x) - f_n(a) \to f(x) - f(a),
\]
whence
\[
f(x) - f(a) = \int_a^x g(t)\,dt.
\]
Hence $f$ is differentiable, and $f'(x) = g(x)$; we take for granted the fact that the indefinite integral of a continuous function has the integrand as derivative (compare page 214). Later (page 146) we shall prove a more general theorem in which no continuity is assumed for $f_n'$. There is some point in doing so, since a function can have a derivative at every point with the derivative not integrable, either in the Riemann or the Lebesgue sense (page 215).


Exercise 17.2. If $f$ is a function from $R_1$ into $R_1$, having derivatives of all orders, and the series
\[
\cdots + \int_0^x dt \int_0^t f(u)\,du + \int_0^x f(t)\,dt + f(x) + f'(x) + \cdots
\]
converges uniformly, what is its sum?⁷

Our applications of uniform convergence have been precise formulations under various circumstances of the principle that we can take limits term by term in a uniformly convergent sequence. Here is another one.

Exercise 17.3. Tannery's theorem:⁸ if $f_n(k) \to L_n$ for each $n$ as $k \to \infty$; and $|f_n(k)| \le M_n$ for all $k$, where $\sum M_n$ converges; and if $p = p(k) \to \infty$ when $k \to \infty$; then
\[
\lim_{k \to \infty} \{ f_1(k) + f_2(k) + \cdots + f_p(k) \} = \sum_{n=1}^{\infty} L_n.
\]

Exercise 17.4. As an application of Tannery's theorem, prove that $\lim_{k \to \infty} (1 + x/k)^k = \sum_{n=0}^{\infty} x^n/n!$.

There are fairly simple situations in which the theorem on term-by-term integration of a uniformly convergent sequence is inadequate. For example, $1 - x + x^2 - \cdots + (-x)^n = [1 - (-x)^{n+1}]/(1 + x)$, and hence $1 - x + x^2 - \cdots = 1/(1 + x)$ if $|x| < 1$. Formal integration suggests that
\[
\log 2 = \int_0^1 \frac{dx}{1 + x} = \int_0^1 dx - \int_0^1 x\,dx + \int_0^1 x^2\,dx - \cdots = 1 - \tfrac12 + \tfrac13 - \cdots.
\]
We cannot justify this calculation by the theorem on uniform convergence, since the sequence $\{f_n\}$ with $f_n(x) =$


$[1 - (-x)^n]/(1 + x)$ does not converge uniformly on $[0, 1]$ (since it does not converge at 1), and does not converge uniformly even on $[0, 1)$. Indeed, if it did, the supremum on $[0, 1)$ of $|x|^n/(1 + x)$ would approach zero as $n \to \infty$; but since $|x|^n/(1 + x) \ge \tfrac12 |x|^n$, the supremum in question is at least $\tfrac12$ and so cannot approach zero. In this case it is easy to verify the result of the formal calculation:
\[
\left| \int_0^1 \frac{dx}{1 + x} - \int_0^1 \frac{1 - (-x)^n}{1 + x}\,dx \right| = \int_0^1 \frac{x^n}{1 + x}\,dx \le \int_0^1 x^n\,dx = \frac{1}{n + 1} \to 0.
\]

This calculation shows simultaneously that the series $1 - \tfrac12 + \tfrac13 - \cdots$ converges and that its sum is $\log 2$.

Similarly, it is not very difficult to show (although we shall not do it) that
\[
\frac{x}{2} = \sum_{n=1}^{\infty} (-1)^{n+1} \frac{\sin nx}{n}, \qquad -\pi < x < \pi.
\]
Niels Abel gave this example⁹ to refute Cauchy's mistaken early belief that a convergent series of continuous functions must have a continuous sum: the series converges at $x = \pi$, but to 0, not to $\pi/2$. Formal integration yields
\[
\frac{\pi^2}{4} = \int_0^{\pi} \frac{x}{2}\,dx = \sum_{n=1}^{\infty} \frac{(-1)^{n+1}}{n} \int_0^{\pi} \sin nx\,dx = \sum_{n=1}^{\infty} \frac{(-1)^{n+1}(1 - \cos n\pi)}{n^2} = \sum_{n=1}^{\infty} \frac{2}{(2n - 1)^2},
\]
so that we have a simple way of summing the numerical series
\[
\frac{1}{1^2} + \frac{1}{3^2} + \frac{1}{5^2} + \cdots.
\]
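Both formal calculations can be compared with numerical partial sums. This is a sketch for orientation, not a proof, and the function names are illustrative assumptions. The bound $1/(n + 1)$ is the one obtained above for the alternating series, and the value $\pi^2/8$ for the sum of reciprocal odd squares is what the identity $\pi^2/4 = \sum 2/(2n - 1)^2$ gives.

```python
import math


def alternating_harmonic(n):
    """1 - 1/2 + 1/3 - ... with n terms."""
    return sum((-1) ** (k + 1) / k for k in range(1, n + 1))


def odd_square_sum(n):
    """1/1^2 + 1/3^2 + ... + 1/(2n - 1)^2."""
    return sum(1 / (2 * k - 1) ** 2 for k in range(1, n + 1))


for n in (10, 100, 1000):
    error = abs(math.log(2) - alternating_harmonic(n))
    print(n, error, 1 / (n + 1))         # error compared with the bound 1/(n + 1)

print(odd_square_sum(100000), math.pi ** 2 / 8)
```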

Since t h e original series is not uniformly convergent o n t h e interval [Ο, π] (because t h e s u m is discontinuous a t π ) , t h e justification of our evaluation of t h e numerical series eludes t h e elementary theory. We will consider some more sophisticated convergence theorems for integrals in §28. NOTES ' T h i s is known as Dini's t h e o r e m . For it a n d t h e following t h e o rem, see G. Polya a n d G. S z e g ö , Problems and Theorems in Analysis, vol. I, Springer, N e w York, 1 9 7 2 , p p . 81 a n d 2 6 9 - 2 7 0 , p r o b l e m s 126 and 127 of part II. For a n a l y s e s of t h e notion of c o n t i n u o u s curve from different p o i n t s of v i e w , see G. T . W h y b u r n , W h a t is a curve?, American Mathematical Monthly 49 ( 1 9 4 2 ) , 4 9 3 - 4 9 7 ; J. W . T . Y o u n g s , Curves and surfaces, American Mathematical Monthly 51 ( 1 9 4 4 ) , 1 - 1 1 . ^ W . Hurewicz, U b e r dimensionserhöhender stetige A b b i l d u n g e n , Journal für die Reine und Angewandte Mathematik 169 ( 1 9 3 3 ) , 7 1 78. I . J. S c h o e n b e r g , O n t h e P e a n o curve of Lebesgue, Bulletin of the American Mathematical Society 44 ( 1 9 3 8 ) , 5 1 9 . A n o t h e r s i m p l e c o n s t r u c t i o n w a s given by Liu W e n , A space filling curve, American Mathematical Monthly 90 ( 1 9 8 3 ) , 2 8 3 . It c a n b e s h o w n t h a t there is n o point where χ a n d y are simult a n e o u s l y differentiable, t h a t is, t h e curve h a s a t a n g e n t at n o point. J a m e s A l s i n a , T h e P e a n o curve of Schoenberg is nowhere differentiable, Journal of Approximation Theory 3 3 ( 1 9 8 1 ) , 2 8 - 4 2 . F o r e x t e n s i o n s , s e e Lee Lorch, Derivatives of infinite order, Pacific Journal of Mathematics 3 (1953), 773-778. P o l y a a n d S z e g ö , b o o k cited above, p p . 37 a n d 2 1 4 , problem 165 of part I. 2


⁸ See T. J. I'A. Bromwich, An Introduction to the Theory of Infinite Series, second edition, Macmillan, London, 1926, reprinted 1965, pp. 136-137.

⁹ N. Abel, Untersuchungen über die Reihe: $1 + \frac{m}{1}x + \frac{m(m-1)}{1\cdot 2}x^2 + \frac{m(m-1)(m-2)}{1\cdot 2\cdot 3}x^3 + \cdots$, Journal für die Reine und Angewandte Mathematik 1 (1826), 311-339; a French version appears in Abel's Oeuvres Complètes, Grøndahl, Christiania, 1881, pp. 219-250.

18. Pointwise limits of continuous functions. We consider functions from an interval in $R_1$ into $R_1$. Although a pointwise limit of continuous functions is not necessarily continuous, it cannot be extremely discontinuous, as we shall now show: its points of continuity must at least form an everywhere dense set.¹ (Hence, for example, the everywhere discontinuous function mentioned on page 82, which was obtained from continuous functions by two successive limiting processes, cannot be obtained by one limiting process involving only continuous functions.)

We begin by observing that if a function $f$ is discontinuous at a point $x$, then the images of arbitrarily small neighborhoods of $x$ do not have arbitrarily small diameters. That is, there is an integer $n$ such that the diameter of the image of each neighborhood of $x$ is at least $1/n$. (The images of larger neighborhoods have, if anything, larger diameters, so we can say "every neighborhood" instead of "small neighborhoods.")

Now suppose that $f$ is discontinuous at every point of an interval, and let $E_n$ be the set of points $x$ in this interval for which the diameter of the image of every neighborhood of $x$ is at least $1/n$. As we have just seen, every $x$ belongs to some $E_n$. Moreover, $E_n$ is a closed set. For, if $y$ is a limit point of $E_n$, every neighborhood of $y$ contains some point $x$ of $E_n$, and hence contains a neighborhood of $x$, and so the diameter of the image of every neighborhood of $y$ is also at least $1/n$. We now use Baire's theorem, which tells us that some $E_M$ is dense in some subinterval $J$. Since $E_M$ is closed, $E_M$ contains $J$. That is, we have found an interval $J$ with the property that the image of every subinterval of $J$ has diameter at least $1/M$.

The existence of such an interval $J$ is, then, a consequence of the property of being discontinuous at every point of some interval. We shall now show that a pointwise limit of continuous functions cannot have such intervals $J$, and hence cannot be discontinuous at every point of an interval.

The range of $f$, being some subset of $R_1$, can be covered by countably many intervals $I_n = (a_n, b_n)$, each of length less than $1/M$. Let us look at the inverse images $H_n$ of these $I_n$. The sets $H_n$ collectively cover the interval $J$, but none of them can contain a subinterval of $J$, since the images of subintervals of $J$ all have diameters at least $1/M$. On the other hand, by Baire's theorem one of the $H_n$ is dense in a subinterval of $J$. If we knew that the $H_n$ were closed, we should have a contradiction, since a closed set that is dense in an interval contains that interval.

Even when $f$ is a pointwise limit of continuous functions, there is no reason to suppose that the $H_n$ are closed. However, we can show that each $H_n$ is a countable union of closed sets, and this property will do just as well. For, if the $H_n$ have this property, then, by still another application of Baire's theorem, one of the closed sets is dense in a subinterval of $J$, and so contains that subinterval. Since the closed sets are subsets of $H_n$, it follows that $H_n$ will also contain a subinterval of $J$.

We have thus reduced the proof of our theorem to showing that if $f$ is a pointwise limit of continuous functions $f_k$, then the sets $H_n$ are countable unions of closed sets. We recall that $H_n$ is the inverse image under $f$ of the interval $(a_n, b_n)$; that is, $H_n$ is the set of points $x$ such that $a_n < f(x) < b_n$. Consider an $x$ in $H_n$. Then if $j$ is large enough, we have $a_n + 2/j < f(x) < b_n - 2/j$, since $a_n < f(x) < b_n$. Since $f_k(x) \to f(x)$, we then have $a_n + 1/j \le f_k(x) \le b_n - 1/j$ for all sufficiently large $k$.
Let $E_{k,j}$ be the set on which this last inequality holds, and let $F_{m,j}$ be the intersection of $E_{m,j}$, $E_{m+1,j}$, $E_{m+2,j}$, and so on. Because $f_k$ is continuous, the sets $E_{k,j}$ are closed (they are inverse images of closed intervals), and the sets $F_{m,j}$ are closed (they are intersections of closed sets). We have just seen that if $x$ is in $H_n$, then $x$ is in some $F_{m,j}$. That is, $H_n$ is a subset of the union of all the $F_{m,j}$. On the other hand, if $x$ is in some $F_{m,j}$, we have $a_n + 1/j \le f_k(x) \le b_n - 1/j$ for some $j$ and all sufficiently large $k$; since $f_k(x) \to f(x)$, this implies that $a_n + 1/j \le f(x) \le b_n - 1/j$, and so $x \in H_n$. Thus $H_n$ is precisely the union of all the $F_{m,j}$ for all positive integers $m$ and $j$, so we have succeeded in representing $H_n$ as a countable union of closed sets. This is what was required to complete the proof.

Only slight modifications of the proof show that a limit of continuous functions has points of continuity everywhere dense in every nonempty perfect set. In this form the condition can be shown to be reversible: a function whose points of continuity are everywhere dense in every nonempty perfect set can be represented as a limit of continuous functions.

Baire described discontinuous limits of continuous functions as of class 1; limits of functions of class 1, not themselves of class 1, as of class 2; and so on.² There are functions that do not belong to any Baire class. An interesting example of a function of Baire class 1 is any discontinuous function on $R_2$ that is continuous on each line parallel to a coordinate axis.³

Exercise 18.1. Construct an example of such a function.

It is interesting that although we asserted only that a pointwise limit of continuous functions has an everywhere dense set of points of continuity, more than this is true: its set of points of discontinuity must form only a set of the first category. This property is really independent of the fact that the function in question is a pointwise limit of continuous functions. In fact, we shall show that a real-valued function $f$ whose domain is an interval in $R_1$, if continuous at the points of an everywhere dense set, is continuous except on a set of first category.

Consider the sets $E_n$ of points $x$ such that there exists a sequence $\{y_k\}$ whose limit is $x$ and which has the property that $|f(y_k) - f(x)| > 1/n$. Every point of discontinuity is in some $E_n$; that is, the set of points of discontinuity is a subset of the union of the $E_n$. Since there are countably many $E_n$, our theorem is proved if we show that each $E_n$ is nowhere dense. If some $E_n$ failed to be nowhere dense, some point $x$ where $f$ is continuous would be a limit point of that $E_n$. If we choose a positive $\delta$ such that $|y - x| < \delta$ implies that $|f(y) - f(x)| < 1/(2n)$, and then choose a point $w$ of $E_n$ so that $|w - x| < \tfrac12\delta$, and let $y_k$ be the points occurring in the definition of $E_n$ (for the point $w$), we have

$$|f(y_k) - f(w)| \le |f(y_k) - f(x)| + |f(x) - f(w)| < 1/n$$

for sufficiently large $k$, contradicting the definition of $E_n$.

NOTES

¹ W. F. Osgood. For more on the subject of this section, see E. R. Lorch, Continuity and Baire functions, American Mathematical Monthly 78 (1971), 748-762; A. C. M. van Rooij and W. H. Schikhof, A Second Course on Real Functions, Cambridge University Press, 1982, §§10-11 and 16-17; C. de La Vallée Poussin, Intégrales de Lebesgue, Fonctions d'ensemble, Classes de Baire, second edition, Gauthier-Villars, Paris, 1934, chap. VII-VIII.

² René Baire, Leçons sur les Fonctions Discontinues, Gauthier-Villars, Paris, 1905.

³ H. Lebesgue. For a simple proof, see F. W. Carroll, Separately continuous functions are Baire functions, American Mathematical Monthly 78 (1971), 175.

19. Approximations to continuous functions. We have seen (page 69) that a continuous function from $R_1$ into $R_1$ may have a rather irregular graph, at least to the extent of having oscillations in every interval. On the other hand, there is always a quite smooth function whose graph is very close to the graph of a given continuous function. More precisely, if the domain of a continuous function is a compact interval, then we can find, as close as we please to the function, each of the following: a step function, a continuous polygonal function, and a polynomial. A step function has a graph made up of a finite number of horizontal line segments; a polygonal function has a graph made up of a finite number of line segments of any orientation (not vertical). Here "as close as we please" is to be interpreted in the metric of the space B. In other words, if $f$ is the continuous function, and $\varepsilon$ is a given positive number, there are a step function $f_1$, a continuous polygonal function $f_2$, and a polynomial $f_3$ such that $|f(x) - f_k(x)| < \varepsilon$ (for $k$ = 1, 2, and 3) for all $x$ in the interval in question.

The property that makes such approximations possible is called uniform continuity. A function $f$ is continuous at $x$ if, given a positive $\varepsilon$, there is a positive $\delta$ such that $|f(x) - f(y)| < \varepsilon$ if $|x - y| < \delta$. Here we shall in general have to take smaller and smaller […]

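$$\int_{-1}^{1} (1 - t^2)^n \, dt > \int_{-\delta/2}^{\delta/2} (1 - t^2)^n \, dt > \delta \left(1 - \tfrac14 \delta^2\right)^n.$$

Hence we have a bound for $|I_3|$ involving $\int (1 - s^2)^n \, ds$ [the displayed estimate is not recoverable from the source], and $|I_3| \to 0$ as $n \to \infty$. Exactly similar reasoning applies to $I_1$. By taking $n$ large enough, we then have $|I_1|$ and $|I_3|$ each less than $\tfrac13\varepsilon$, so

$$|I(x) - f(x)| \le |I_1| + |I_2| + |I_3| < \varepsilon,$$

provided that $n$ is large enough. Therefore the polynomial $I$ furnishes the approximation $f_3$ if $n$ is large enough.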
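The estimate above can be illustrated numerically. The sketch below assumes one common form of Landau's construction, $I(x) = \int_0^1 f(t)\,[1-(t-x)^2]^n\,dt \,\big/ \int_{-1}^{1}(1-t^2)^n\,dt$, for a continuous $f$ on $[0,1]$ that vanishes at both endpoints; the exact formula used in the text is not visible in this excerpt, so this form, the helper names `trapezoid` and `landau_approx`, and the test function are all assumptions made for illustration. For fixed $n$ the quantity is a polynomial in $x$ of degree at most $2n$; here the integrals are evaluated by a simple trapezoid rule.

```python
def trapezoid(g, lo, hi, steps=2000):
    """Composite trapezoid rule for the integral of g over [lo, hi]."""
    h = (hi - lo) / steps
    total = 0.5 * (g(lo) + g(hi))
    for i in range(1, steps):
        total += g(lo + i * h)
    return total * h


def landau_approx(f, x, n, steps=2000):
    """Assumed Landau-type approximation I(x) to f(x) for 0 < x < 1."""
    norm = trapezoid(lambda t: (1.0 - t * t) ** n, -1.0, 1.0, steps)
    numer = trapezoid(lambda t: f(t) * (1.0 - (t - x) ** 2) ** n, 0.0, 1.0, steps)
    return numer / norm


if __name__ == "__main__":
    f = lambda t: t * (1.0 - t)  # continuous on [0, 1], zero at both endpoints
    for n in (20, 200):
        print(n, abs(landau_approx(f, 0.5, n) - f(0.5)))
```

In this example the error at the midpoint shrinks as $n$ grows, in line with the statement that the polynomial approximates $f$ once $n$ is large enough; the sketch says nothing about uniformity near the endpoints.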

[Figure: graph of $g$ over the interval $[0, 1]$, with the interval where $f(x) > 0$ marked.]

The possibility of approximating a continuous function by polynomials has many applications. It was used in §10 in the proof of the existence of continuous nowhere differentiable functions. Here is another application. Let $f$ be continuous on the interval $[a, b]$. The quantities $\int_a^b f(x)\,x^n\,dx$ (for $n$ = 0, 1, 2, ...) are called the moments of $f$. We shall show that a continuous function with a compact domain in $R_1$ is determined by its moments; that is, two continuous functions with the same sequence of moments are identical.² (We do not say anything about how a continuous function can actually be calculated from its moments.) It is an equivalent statement that a continuous function all of whose moments are zero must vanish identically, and this we now prove.

Suppose that all the moments of $f$ are zero. Without loss of generality, we suppose that $a = 0$ and $b = 1$. If $f$ is not identically zero, it is positive (or negative) in some interval, and we can construct a continuous function $g$ that is zero outside this interval and that makes $\int_0^1 fg\,dx = 2h > 0$. (See the figure.) Let $M$ be an upper bound for $|f|$ on the interval $[0, 1]$. Construct a polynomial $P$ such that $|g(x) - P(x)| < h/M$ for all $x$ in $[0, 1]$. Then

$$\int_0^1 fP\,dx = \int_0^1 fg\,dx - \int_0^1 f\cdot(g - P)\,dx \ge 2h - M\cdot\max|g(x) - P(x)| > h.$$


But $\int_0^1 fP\,dx = 0$, since $P$ is a polynomial and all the moments of $f$ vanish. The contradiction can be avoided only by having $f$ identically zero.

As a corollary of the theorem about moments, we see that the set of all continuous functions can be put into one-to-one correspondence with a class of sequences of real numbers, since different continuous functions have different sequences of moments. Since there are just as many sequences of real numbers as there are real numbers (Exercise 3.13), it follows that there are just as many continuous functions as there are real numbers. (A more direct way of seeing this is to observe that a continuous function is determined by its values at the rational points, that is, by a sequence of real numbers.)

NOTES

¹ The theorem that a continuous function can be uniformly approximated by polynomials on an interval is known as the Weierstrass approximation theorem. The proof given here was invented by E. Landau.

² For a more general result, see David Vernon Widder, The Laplace Transform, Princeton University Press, eighth printing, 1972, pp. 60-61 and Chapter III.

20. Linear functions. A function $f$ whose domain is $R_1$ is said to be linear if $f(x) + f(y) = f(x + y)$ for all $x$ and $y$. (This is a more special use of the term "linear" than is often made: the function $f$ defined by $f(x) = ax + b$ is not linear in our sense when $b \ne 0$.) Clearly if $f(x) = ax$ then $f$ is linear, and we might expect that all linear functions would be of this form; but not all of them are. To exhibit one that is not, we should have to appeal to one of the more abstruse properties of the real number system, which depends on ideas that have not been introduced in this book.¹ What is also not immediately obvious, but will be established shortly, is that a discontinuous linear function is necessarily wildly discontinuous: it is, for example, unbounded in every interval, and indeed its graph must be everywhere dense in $R_2$. It is only to be expected that no very simple construction will produce a function of this kind.

Let us consider a linear function $f$. For every $x$ we have $f(2x) = f(x + x) = 2f(x)$, and so by induction $f(nx) = nf(x)$ for every positive integer $n$. Since $f(x) = f(x + 0) = f(x) + f(0)$, we have $f(0) = 0$. Since $0 = f(0) = f(x - x) = f(x) + f(-x)$, we have $f(-x) = -f(x)$. Therefore $f(nx) = nf(x)$ for every integer $n$, positive or negative. Replacing $x$ by $x/n$, we obtain $f(x) = nf(x/n)$, or $f(x/n) = n^{-1}f(x)$. Now replace $x$ by $mx$, and we have $f(mx/n) = n^{-1}f(mx) = (m/n)f(x)$. In other words, $f(rx) = rf(x)$ for every rational number $r$. In particular (put $x = 1$), $f(r) = rf(1)$ for every rational number $r$. It follows that if $f$ is continuous for all $x$, then $f(x) = xf(1)$ for all $x$.

We can easily strengthen this result by showing that $f(x) = xf(1)$ for all $x$ provided merely that $f$ is continuous at some point $c$. For then $f(c + \delta) - f(c) \to 0$ as $\delta \to 0$; but $f(c + \delta) - f(c) = f(\delta)$, so $f(\delta) \to 0$ as $\delta \to 0$. That is, $f$ is continuous at 0. Now if $x$ is any real number, then $f(x + \delta) - f(x) = f(\delta) \to 0$ as $\delta \to 0$, so $f$ is continuous at $x$. Thus $f$ is continuous throughout $R_1$, and we know this implies that $f(x) = xf(1)$.

We can go still further in weakening the hypothesis and nevertheless being able to prove that a linear function is continuous. Suppose that $f$ is merely bounded on some interval, or even on some set $E$ having the following property: the set of all differences $x - y$ between points $x$ and $y$ of $E$ contains a neighborhood of 0. That is, there is a positive $\delta$ such that if $|t| < \delta$, then there are points $x$ and $y$ in $E$ such that $x - y = t$. Then we can still conclude that $f$ is continuous if it is linear, and so a linear $f$ that is bounded on a set of the kind just described must be of the form $f(x) = cx$.²

Suppose $|f(x)| < M$ on $E$. For numbers $t$ that are differences between points of $E$, we can write $|f(t)| = |f(x - y)| = |f(x) - f(y)| < 2M$. Therefore if $|u| < \delta/n$, we have $|f(u)| = n^{-1}|f(nu)| < 2M/n$. Now let $s$ be any real number, and let $r$ be a rational number such that $|r - s| < \delta/n$. Then we have

$$|f(s) - sf(1)| = |f(s - r) + (r - s)f(1)| \le \frac{2M + \delta\,|f(1)|}{n}.$$

Since $n$ can be as large as we please, $f(s) - sf(1) = 0$.

There are many sets $E$, other than intervals, having the property used in this proof. They include the sets of positive Lebesgue measure (see §25), and some sets of measure zero. For example, the Cantor set (page 39) has the property.³ The proof of this fact can be given a very intuitive geometrical form. Take the usual coordinate system in $R_2$, and construct Cantor sets on the intervals $[0, 1]$ of both the $x$ and $y$ axes, removing from the plane not only the middle thirds of intervals, but also all the points of the unit square that have one coordinate (at least) in a deleted interval, so that at each step we remove some cross-shaped sets. (See the figure.) Consider a line with equation $y = x - c$, where $0 < c < 1$. At each step, the line meets at least one of the squares that is not deleted in this step; these squares are closed, and nested, so their intersection contains some point $(x, y)$ with $y = x - c$, and both $x$ and $y$ are points of the Cantor set.
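The geometric argument for the Cantor set can be traced step by step. The sketch below is one possible reading of it, under the assumption that $0 \le c \le 1$ and that at each stage we keep a corner square (side one third of the previous) that the line $y = x - c$ still meets. The names `cantor_difference` and `has_ternary_digits_0_2` are hypothetical helpers, and exact rational arithmetic is used so that no rounding enters.

```python
from fractions import Fraction


def cantor_difference(c, depth):
    """Return (x, y, s): lower-left corner and side s = 3**-depth of a
    retained square met by the line y = x - c, so that |x - y - c| <= s."""
    x = Fraction(0)
    y = Fraction(0)
    s = Fraction(1)
    for _ in range(depth):
        d = x - y - c            # the line meets the current square iff |d| <= s
        step = 2 * s / 3         # offset between corner squares at the next stage
        if d > s / 3:
            y += step            # move to an upper corner square
        elif d < -s / 3:
            x += step            # move to a right-hand corner square
        s /= 3                   # in every case |x - y - c| <= s still holds
    return x, y, s


def has_ternary_digits_0_2(x, depth):
    """True if x in [0, 1) has a terminating base-3 expansion of at most
    `depth` digits, all equal to 0 or 2 (an endpoint in the Cantor set)."""
    for _ in range(depth):
        x *= 3
        digit = x.numerator // x.denominator
        if digit not in (0, 2):
            return False
        x -= digit
    return x == 0


if __name__ == "__main__":
    c = Fraction(1, 2)
    x, y, s = cantor_difference(c, 20)
    print(float(x), float(y), float(abs(x - y - c)), float(s))
```

The corners produced at each depth are points of the Cantor set whose difference is within $3^{-\text{depth}}$ of $c$; passing to the limit, as in the nested-squares argument, gives points whose difference is exactly $c$.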


To show that $f(x) = cx$ for a linear function $f$ whose graph is not everywhere dense in $R_2$,⁴ we may appeal to an elementary fact from the theory of numbers. Let $(x_1, x_2)$ and $(y_1, y_2)$ be two pairs of real numbers that are not proportional; this means that $x_1 y_2 \ne x_2 y_1$, or in geometrical language that the points $(x_1, x_2)$ and $(y_1, y_2)$ of $R_2$ are not on the same straight line through the origin. Then if $a$ and $b$ are any two real numbers whatsoever, we can find rational multipliers $r_1$ and $r_2$ so that $r_1 x_1 + r_2 x_2$ is as close as we like to $a$ and simultaneously $r_1 y_1 + r_2 y_2$ is as close as we like to $b$. To prove this, we solve the equations $u x_1 + v x_2 = a$ and $u y_1 + v y_2 = b$ (which we can do since their determinant is not zero), and then choose $r_1$ and $r_2$ close to $u$ and $v$, respectively.

Now suppose that $f$ is linear and not of the form $f(x) = cx$. The second hypothesis implies that we must be able to find points $x_1$ and $x_2$ such that $f(x_1)/x_1 \ne f(x_2)/x_2$. Then for any point $(a, b)$ in $R_2$, we can find rational numbers $r_1$ and $r_2$ such that $f(r_1 x_1 + r_2 x_2) = r_1 f(x_1) + r_2 f(x_2)$ differs arbitrarily little from $b$, and at the same time $r_1 x_1 + r_2 x_2$ differs arbitrarily little from $a$. Thus there is a point of the graph of $f$ as close as we please to the point $(a, b)$ of $R_2$.
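The density argument can be tried on a small explicit model. The sketch below does not construct a discontinuous linear function on all of $R_1$ (that needs a Hamel basis); it assumes instead a hypothetical additive function defined only on numbers of the form $r_1 + r_2\sqrt{2}$ with rational $r_1, r_2$, given by $f(r_1 + r_2\sqrt{2}) = r_1$. Then $x_1 = 1$, $x_2 = \sqrt{2}$ have $f(x_1)/x_1 = 1 \ne 0 = f(x_2)/x_2$, and the recipe of the text (solve for $u, v$, then round to rationals) produces a graph point near any target $(a, b)$. The names `value`, `f`, and `graph_point_near` are illustrative.

```python
import math
from fractions import Fraction

SQRT2 = math.sqrt(2.0)


def value(r1, r2):
    """The real number r1 + r2*sqrt(2), as a float."""
    return float(r1) + float(r2) * SQRT2


def f(r1, r2):
    """Hypothetical additive function on {r1 + r2*sqrt(2)}: f(1) = 1, f(sqrt(2)) = 0."""
    return r1


def graph_point_near(a, b, max_den=10 ** 6):
    """Rational r1, r2 with r1 + r2*sqrt(2) close to a and f-value r1 close to b.

    Solves u*1 + v*sqrt(2) = a and u*1 + v*0 = b, then rounds u and v to
    nearby rationals, as in the argument of the text.
    """
    u = b
    v = (a - b) / SQRT2
    r1 = Fraction(u).limit_denominator(max_den)
    r2 = Fraction(v).limit_denominator(max_den)
    return r1, r2


if __name__ == "__main__":
    r1, r2 = graph_point_near(0.3, 5.0)
    print(r1, r2, value(r1, r2), float(f(r1, r2)))
```

Here the point $r_1 + r_2\sqrt{2}$ lies near 0.3 while the function value lies near 5, which is the kind of wild behavior the theorem describes; raising `max_den` tightens both approximations at once.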

Another proof that yields some additional insight can be given as follows. Suppose that $f$ is linear but not of the form $f(x) = cx$, and we wish to show that the graph of $f$ is dense in $R_2$. Since $f(t + r) = f(t) + rf(1)$ for rational $r$, it is enough to show that $f$ takes, arbitrarily close to the origin, values arbitrarily close to every real number, or, what is the same thing, arbitrarily close to every rational number. Let $A$ be a positive rational number and $\varepsilon < 1$ a small positive number that we use to specify closeness. We know that $f$ is unbounded in $(0, \varepsilon)$; suppose, for definiteness, that $f$ takes arbitrarily large positive values there. Then there is an integer $n$ exceeding $A/\varepsilon$ such that for some $s$ in $(0, \varepsilon)$ we have $n + 1 > f(s) > n$. Since $f(rx) = rf(x)$ for all rational $r$ and all $x$, we have $f(As/n) = (A/n)f(s)$, so $A + \varepsilon > A(n + 1)/n > f(As/n) > A$, and $As/n$ is a point of the interval $(0, \varepsilon)$. We have therefore constructed a point arbitrarily close to 0 where $f$ takes a value arbitrarily close to $A$, if $A > 0$. If $A < 0$, there is a point close to 0 where $f$ takes a value close to $-A$, and since $f(-x) = -f(x)$, the same conclusion follows.

Here is an application in calculus of our theorems about linear functions.⁵ Suppose that the limit

[displayed formula not recoverable from the source]

exists for every real $x$; denote it by $\varphi(x)$. We shall now show that $\varphi(x)$ is necessarily of the form $ax + b$. By the definition of $\varphi(x)$,

[displayed formulas, including the relation labeled (*), not recoverable from the source]

Consequently, $\varphi(x - h) + \varphi(x + h) = 2\varphi(x)$. If we replace $x - h$ and $x + h$ by $x$ and $y$, this says that $\psi(x) + \psi(y) = 2\psi(\tfrac12(x + y))$ (here $\psi(x) = \varphi(x) - \varphi(0)$, as the final formula below indicates). However, (*) above says that $2\psi(\tfrac12(x + y)) = \psi(x + y)$. Therefore $\psi(x) + \psi(y) = \psi(x + y)$; that is, $\psi$ is linear. Now $\varphi$ is a limit of continuous functions, and so must have points of continuity (page 123). But we know that a linear function $\psi$ that has a point of continuity has the form $\psi(x) = x\psi(1)$, which is to say that $\varphi(x) = \varphi(0) + x\psi(1)$, as asserted.

Exercise 20.1.⁶ Let $\varphi(x)$ denote

$$\lim_{R \to \infty} \int_{R}^{R + x} f(u)\,du,$$

supposed to exist for every real $x$. Then $\varphi(x) = ax$.

NOTES

¹ What is needed is the existence of a Hamel basis for the real numbers: see K. Hrbáček and T. Jech, Introduction to Set Theory, Marcel Dekker, New York, 1978, pp. 144-147 (a second edition appeared in 1984). A discontinuous linear function is constructed in G. H. Hardy, J. E. Littlewood, and G. Pólya, Inequalities, second edition, Cambridge University Press, 1952, p. 96. Hamel's original paper is cited in note 4 below. See also G. S. Young, The linear functional equation, American Mathematical Monthly 65 (1958), 37-38. Additional history and references are given by J. W. Green and W. Gustin, Quasi-convex sets, Canadian Journal of Mathematics 2 (1950), 489-507.

² For this proof, and extensions, see H. Kestelman, On the functional equation $f(x + y) = f(x) + f(y)$, Fundamenta Mathematicae 34 (1947), 144-147. See also A. Wilansky, Additive functions, in Lectures on Calculus, K. O. May, ed., Holden-Day, San Francisco, 1967, pp. 97-124; and L. Reich, Über Approximation durch Werte additiver Funktionen, Österreichische Akademie der Wissenschaften Mathematisch-Naturwissenschaftliche Klasse. Sitzungsberichte. Abteilung II. Mathematische, Physikalische und Technische Wissenschaften 201 (1992), no. 1-10, 169-181.

³ A. Zygmund, Trigonometric Series, second edition, vol. 1, Cambridge University Press, reprinted 1977, p. 235. The property was discovered by H. Steinhaus. For a simple arithmetical proof see J. F. Randolph, Distances between points of the Cantor set, American Mathematical Monthly 47 (1940), 549-551. For related properties of Cantor sets, see N. C. Bose Majumder, On the distance set of the Cantor middle third set, III, American Mathematical Monthly 72 (1965), 725-729; J. M. Brown and K. W. Lee, The distance set of $C_\lambda \times C_\lambda$, Journal of the London Mathematical Society (2) 15 (1977), 551-556; Roger L. Kraft, What's the difference between Cantor sets?, American Mathematical Monthly 101 (1994), 640-650; Stephen Silverman, Intervals contained in arithmetic combinations of sets, American Mathematical Monthly 102 (1995), 351-353; Rodrigo Bamón, Sergio Plaza, and Jaime Vera, On central Cantor sets with self-arithmetic difference of positive Lebesgue measure, Journal of the London Mathematical Society (2) 52 (1995), 137-146.

⁴ G. Hamel, Eine Basis aller Zahlen und die unstetigen Lösungen der Funktionalgleichung $f(x + y) = f(x) + f(y)$, Mathematische Annalen 60 (1905), 459-462.

⁵ M. Plancherel and G. Pólya, Sur les valeurs moyennes des fonctions réelles définies pour toutes les valeurs de la variable, Commentarii Mathematici Helvetici 3 (1931), 114-121; reprinted in George Pólya: Collected Works, Vol. III, Joseph Hersch and Gian-Carlo Rota, eds., MIT Press, Cambridge, MA, 1984, pp. 134-141.

⁶ See the following papers by R. P. Agnew: Limits of integrals, Duke Mathematical Journal 9 (1942), 10-19; Mean values and Frullani integrals, Proceedings of the American Mathematical Society 2 (1951), 237-241; Frullani integrals and variants of the Egoroff theorem on essentially uniform convergence, Acad. Serbe Sci. Publ. Inst. Math. 6 (1954), 12-16.

21. Derivatives. We consider only functions whose domains are intervals in $R_1$ and whose ranges are in $R_1$. Along with the derivative of a function $f$, which can be defined in the usual way, we shall consider some generalizations that have the advantage of applying to functions that are not necessarily differentiable in the usual sense. These are the four Dini derivates,¹ for which we shall use the following notations and definitions:

$$f^{+}(x) = \limsup_{h \to 0+} \frac{f(x + h) - f(x)}{h}, \qquad f_{+}(x) = \liminf_{h \to 0+} \frac{f(x + h) - f(x)}{h},$$

$$f^{-}(x) = \limsup_{h \to 0-} \frac{f(x + h) - f(x)}{h}, \qquad f_{-}(x) = \liminf_{h \to 0-} \frac{f(x + h) - f(x)}{h};$$


the $+$ and $-$ refer to right and left, respectively, and their (upper or lower) positions refer to upper and lower limits. For each $x$, the four derivates exist, finite or infinite, for any function $f$ at all.

It is common practice to use the phrase "the derivative of $f$" to mean, according to context, either the number $f'(x)$, that is, the derivative of $f$ at the particular point $x$, or the function $f'$ whose value at $x$ is the number $f'(x)$. We shall use the same ambiguous terminology for the Dini derivates. If we are going to talk about them as functions, we have to extend our usual notion of function by considering functions whose values may include $+\infty$ or $-\infty$. We must be careful about such generalized functions; there are difficulties about forming sums or products, or (for example) about trying to differentiate them. It will be found that we do not in fact do anything ambiguous with derivates.

If $f^{+}(x) = f_{+}(x)$, we say that there is a right-hand derivative at $x$, and we denote it by $f'_{+}(x)$; similarly for the left-hand derivative $f'_{-}(x)$. Finally, the ordinary derivative $f'(x)$ exists (finite or infinite) if and only if all four derivates are equal. Even when $f^{+}(x)$ and $g^{+}(x)$ are both finite, we do not necessarily have $(f + g)^{+}(x) = f^{+}(x) + g^{+}(x)$; but if $f'(x)$ exists (finite), we do have $(f + g)^{+}(x) = f'(x) + g^{+}(x)$ (compare page 106).
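To make the four derivates concrete, here is a rough numerical illustration for the function $f(x) = x\sin(1/x)$ with $f(0) = 0$, which is an example chosen here and not one taken from the text. At $x = 0$ the difference quotient equals $\sin(1/h)$, so the upper derivates are $1$ and the lower derivates are $-1$ on both sides. The helper `quotient_range` is hypothetical, and it only samples the quotient over one window of small $h$, so it suggests the upper and lower limits rather than computing them.

```python
import math


def f(x):
    """x*sin(1/x), extended by f(0) = 0; continuous but not differentiable at 0."""
    return x * math.sin(1.0 / x) if x != 0 else 0.0


def quotient_range(func, x, side, samples=20000):
    """Min and max of the difference quotient over sampled h = side/t,
    with t running over [100, 120), i.e. several periods of sin(1/h)."""
    quotients = []
    for i in range(samples):
        t = 100.0 + i * 0.001
        h = side / t             # side = +1: h -> 0 from the right; -1: from the left
        quotients.append((func(x + h) - func(x)) / h)
    return min(quotients), max(quotients)


if __name__ == "__main__":
    print("right:", quotient_range(f, 0.0, +1))
    print("left: ", quotient_range(f, 0.0, -1))
```

Both sampled ranges come out close to $[-1, 1]$, consistent with the two upper derivates being $1$ and the two lower derivates being $-1$, so that neither one-sided derivative exists at 0.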

Exercise 21.1. If $f'_{+}(x)$ exists and is finite, then $f$ is continuous on the right at $x$; if $f'(x)$ exists and is finite, then $f$ is continuous at $x$.

Exercise 21.2. Show that $f$ may be discontinuous at $x$ when $f'(x)$ exists, but is infinite.

On the other hand, we have already seen (page 71) that a continuous function does not have to have a derivative anywhere (finite or infinite).

Exercise 21.3. Show that if $f'(a)$ exists (finite), we can write $f(x) - f(a) = (x -$ […]

We say that $f$ has a maximum at $x$ (an interior point of the domain of $f$) if there is a neighborhood $N$ of $x$ such that $f(y) \le f(x)$ for all $y$ in $N$; the maximum is proper if there is a neighborhood $N'$ of $x$ such that $f(y) < f(x)$ for $y$ in $N'$ and $y \ne x$.

Exercise 21.8. Show that if $f$ has a maximum at $x$, then $f^{+}(x) \le 0$ and $f_{-}(x) \ge 0$.

In particular, if $f$ has a maximum at $x$ and $f'(x)$ exists, we must have $f'(x) = 0$. There are of course similar results for minima.

Every function has only countably many proper maxima.³ To see this, assign to a proper maximum of $f$, occurring, say, at $x$, an interval $(r_1, r_2)$ with rational endpoints, such that $r_1$ and $r_2$ are on opposite sides of $x$, and $f(y) < f(x)$ provided that $y \ne x$ and $r_1 < y < r_2$. This interval cannot also be assigned to some other proper maximum, occurring at $z$, since it would contain both $z$ and $x$, and we should have to have both $f(z) < f(x)$ and $f(x) < f(z)$. Consequently, different proper maxima have different rational intervals assigned to them. Since there are only countably many intervals with rational endpoints, there are at most countably many proper maxima.

There can, however, be uncountably many improper maxima for a continuous function; for example, a constant function has improper maxima at all points. It can be shown that the values of any function at the points where its derivative is zero (or even where one of its Dini derivates is zero) form a set of measure zero.⁴ In addition, the ordinates of all maxima, proper or not, form a countable set.⁵ For we can assign to each such ordinate $v$ an interval with rational endpoints on which the maximum value of $f$ is $v$. Different values of $v$ correspond to different intervals, and there are only countably many such intervals.

We have observed that when a derivative exists at a maximum or minimum inside the domain of the function (supposed here to be an interval), the derivative must be equal to 0 at that point; hence, if we want to show that a derivative takes the value 0, we frequently proceed by showing that the function from which it was derived has a maximum or minimum, not at an endpoint of its domain. If we want to show that $f'$ assumes some other value $c$, we consider $g(x) = f(x) - cx$ and look for its maxima and minima. Any hypothesis that forces $g$ to have a maximum or minimum in an interval $(a, b)$ then guarantees that $c$ is in the range of $f'$. Two hypotheses that do this for differentiable functions $f$ are (A) that $f'(a) > c$ and $f'(b) < c$ (since then $g(x) > g(a)$ in a right-hand neighborhood of $a$, and so the largest value of $g$ between $a$ and $b$ is not attained at $a$; similarly at $b$); or (B) that $g(a) = g(b)$ (since then $g$ either is constant or has a proper maximum or minimum between $a$ and $b$). Hypothesis (A) leads to the observation that derivatives have the intermediate value property:⁶ if a derivative takes two values, it takes every value between them.

Exercise 21.9. Let $f$ be a periodic differentiable function; let $a$ be a given positive number; then there is a point $x$ such that the tangent at $x$ meets the graph again at a point $a$ units farther along the $x$-axis (that is, $f(x + a) - f(x) = af'(x)$).

Hypothesis (B) leads to the mean-value theorem (also known as the law of the mean), which states that every difference quotient $[f(x) - f(y)]/(x - y)$ of a differentiable function $f$ is in the range of $f'$ (the usual formulation is, superficially, different).

A proof is suggested by the diagram.

The function $g(x) = f(x) - \frac{x - a}{b - a}[f(b) - f(a)]$ takes the same value $f(a)$ at $b$ and at $a$, so it has a maximum or a minimum at some point $c$ between $a$ and $b$; at this point, the derivative of $g$ is 0, and so $f'(c) = [f(b) - f(a)]/(b - a)$. A less conventional procedure is to start from $g(a) = g(b)$ and infer from the universal chord theorem (page 98) that there are intervals $(x_n, y_n)$ in $(a, b)$, each half as long as its predecessor, with $g(x_n) = g(y_n)$. These intervals are nested and hence converge to a point $c$, which will be in the open interval $(a, b)$ if we pick the first two intervals to avoid $a$ and $b$. Since we have a sequence of horizontal chords of $g$ whose endpoints approach $c$, the tangent at $c$ (assumed to exist) must be horizontal, that is, $g'(c) = 0$.

One should not overemphasize the existence of the intermediate point $c$, whose location is usually unknown; what is generally wanted in practice is that the difference quotient $[f(b) - f(a)]/(b - a)$ is between $\sup f'$ and $\inf f'$; this is actually an equivalent property because $f'$ has the intermediate value property. Another way of stating all this is to say that the range of $f'$ is an interval which contains the range of the difference quotients $[f(x) - f(y)]/(x - y)$. The range of the difference quotients, however, does not necessarily contain the range of $f'$; in other words, the converse of the mean-value theorem may fail. An example is given by $f(x) = x^3$, where $f'$ takes the value 0, whereas no difference quotient is 0. However, the least upper and greatest lower bounds of the set of values of difference quotients are the same as the corresponding bounds of $f'$: for each value of $f'$ is a limit of difference quotients, and therefore so are the least upper and greatest lower bounds of $f'$.
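The failure of the converse for $f(x) = x^3$ can be checked by hand, since $(x^3 - y^3)/(x - y) = x^2 + xy + y^2 > 0$ for $x \ne y$. The following is only an illustrative sketch in Python with exact rational arithmetic; the grid and the helper names are assumptions made for the example, not part of the text.

```python
from fractions import Fraction
from itertools import combinations

def cube(x):
    return x ** 3

def cube_prime(x):
    return 3 * x ** 2

def difference_quotient(f, x, y):
    """Slope of the chord of f between two distinct points x and y."""
    return (f(x) - f(y)) / (x - y)

# Hypothetical sample grid on [-2, 2] with step 1/10 (exact rationals).
grid = [Fraction(k, 10) for k in range(-20, 21)]
quotients = [difference_quotient(cube, x, y) for x, y in combinations(grid, 2)]

# Every chord has positive slope, although the derivative vanishes at 0.
smallest = min(quotients)
derivative_at_zero = cube_prime(Fraction(0))
```

On a finer grid the smallest chord slope shrinks toward 0 without reaching it, in agreement with the remark that the bounds of the difference quotients coincide with those of $f'$.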

In the mean-value theorem we supposed that $f$ is continuous in the closed interval $[a, b]$. We can, as a matter of fact, drop the continuity of $f$ at the endpoints provided that we require continuity on the right at $a$ and on the left at $b$ in the case where the limits $f(a^+)$ and $f(b^-)$ exist, and otherwise require nothing at all at the endpoints. However, the greater generality so obtained is illusory, since if $f(a^+)$ does not exist and $f'$ is finite near $a$, then $f'$ assumes every finite value in every right-hand neighborhood of $a$, so that $(b - a)f'(c)$ can have any finite value we please. For, if $k$ is any number, then $f(x) - kx$ does not have a right-hand limit at $a$, and so cannot be monotonic in a right-hand neighborhood of $a$. It therefore has maxima and minima in every right-hand neighborhood of $a$, and its derivative is zero at such points $x$; then $f'(x) = k$.

As an application of the mean-value theorem, we now prove a theorem on the termwise differentiation of a sequence of functions. The elementary theorem on page 119 demands integrability of the derivatives, and its proof uses

146

Chap. 2

FUNCTIONS

the theorem on the integration of a uniformly convergent sequence of functions; but it is possible to prove a more general theorem without using any integration at all. This is: let the functions $f_n$ have (finite) derivatives $f_n'$ in an interval $I$; let the sequence $\{f_n(a)\}$ converge for some $a$ in $I$, and let $\{f_n'\}$ converge uniformly, say to $g$. Then $f_n$ converges to a limit $f$, uniformly on $I$ if $I$ is compact, otherwise uniformly on each compact subset of $I$; and $f'(x) = g(x)$ for all $x$ in $I$.

Exercise 21.10. Show, without using any integration, that a continuous function on an interval $[a, b]$ has an antiderivative.

To prove the theorem on termwise differentiation, first apply the mean-value theorem to the difference $f_n - f_m$:

$$(f_n(x) - f_m(x)) - (f_n(a) - f_m(a)) = (x - a)(f_n'(c) - f_m'(c)),$$

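where $c$ is between $x$ and $a$ (and, of course, may depend on $m$ and $n$). The uniform convergence of $\{f_n'\}$ and the convergence of $\{f_n(a)\}$ thus make $\{f_n\}$ converge uniformly as long as $|x - a|$ is bounded. Let $f$ be the limit of the $f_n$, and let $\epsilon$ be an arbitrary positive number. We have

$$|(f_n(x) - f_m(x)) - (f_n(a) - f_m(a))| < \epsilon |x - a|$$

if $m$ and $n$ both exceed some $n_0$. Letting $m \to \infty$, we obtain

$$|(f_n(x) - f(x)) - (f_n(a) - f(a))| \le \epsilon |x - a|, \qquad n > n_0.$$

That is,

$$\left| \frac{f_n(x) - f_n(a)}{x - a} - \frac{f(x) - f(a)}{x - a} \right| \le \epsilon, \qquad n > n_0.$$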

Sec. 21

147

DERIVATIVES

Also, $|f_n'(a) - g(a)| < \epsilon$ if $n > n_1$ because of the convergence of the derivatives. Now fix an $n$ that exceeds both $n_0$ and $n_1$. Then if $|x - a|$ is small enough,

$$\left| \frac{f_n(x) - f_n(a)}{x - a} - f_n'(a) \right| < \epsilon,$$

and so

$$\left| \frac{f(x) - f(a)}{x - a} - f_n'(a) \right| < 2\epsilon$$

if $|x - a|$ is small enough. But $|f_n'(a) - g(a)| < \epsilon$, so

$$\left| \frac{f(x) - f(a)}{x - a} - g(a) \right| < 3\epsilon.$$

This inequality shows that $f'(a)$ exists and is equal to $g(a)$. Since we now know that $\{f_n\}$ converges everywhere, we can take $a$ to be any point whatever in $I$, and the theorem follows.

Another application of the mean-value theorem yields the following theorem, which has a transparent geometrical interpretation. Suppose that $f$ is differentiable in $[a, b]$ and that $f'(a) = f'(b)$; then there is a point $c$ in $(a, b)$ such that

$$\frac{f(c) - f(a)}{c - a} = f'(c).$$

This says that if the graph of $f$ has the same slope at $a$ and at $b$, there must be a point $c$ at which the tangent passes through the initial point $(a, f(a))$; a sketch will make this geometrically plausible. In proving this, we may suppose that $f'(a) = f'(b) = 0$, since otherwise we would consider the function defined by $f(x) - x f'(a)$. Consider the function $g$ defined by

$$g(x) = \frac{f(x) - f(a)}{x - a}, \quad a < x \le b; \qquad g(a) = 0.$$

The function $g$ is continuous in $[a, b]$ and differentiable in $(a, b]$. We have $g'(b) = -g(b)/(b - a)$. If $g(b) > 0$, then we have $g'(b) < 0$ and hence $g$ decreasing at $b$ (Exercise 21.5), while $g(a) = 0$, so $g$ attains its maximum at a point $c$ between $a$ and $b$. Hence, there is a point $c$ at which $g'(c) = 0$. A similar argument applies if $g(b) < 0$. If $g(b) = 0$, we have $g(a) = g(b) = 0$, and again $g'(c) = 0$ for some intermediate $c$. Since

$$g'(c) = \frac{f'(c)}{c - a} - \frac{f(c) - f(a)}{(c - a)^2},$$

our conclusion follows.

Still another application of the mean-value theorem justifies the intuitive idea that derivatives tend to behave worse than the functions from which they are derived.
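The theorem on equal endpoint slopes can be tried on a concrete function. The sketch below is only an illustration: the choice $f = \sin$ on $[0, 2\pi]$ (where $f'(0) = f'(2\pi) = 1$), the bracketing interval $[4.0, 4.6]$, and the helper names are all assumptions made for the example. It locates a point $c$ where the tangent passes through $(a, f(a))$ by bisection.

```python
import math

def bisect(h, lo, hi, tol=1e-12):
    """Find a zero of h in [lo, hi], assuming h(lo) and h(hi) have opposite signs."""
    h_lo = h(lo)
    while hi - lo > tol:
        mid = (lo + hi) / 2
        h_mid = h(mid)
        if (h_mid > 0) == (h_lo > 0):
            lo, h_lo = mid, h_mid   # the sign change lies in the right half
        else:
            hi = mid                # the sign change lies in the left half
    return (lo + hi) / 2

a, b = 0.0, 2 * math.pi
f, f_prime = math.sin, math.cos     # f'(a) = f'(b) = 1

def gap(c):
    """Chord slope from (a, f(a)) to (c, f(c)) minus the tangent slope at c."""
    return (f(c) - f(a)) / (c - a) - f_prime(c)

# gap changes sign on [4.0, 4.6], so bisection finds a point c with gap(c) = 0.
c = bisect(gap, 4.0, 4.6)
```

For this particular $f$ the condition reduces to $\tan c = c$, which gives an independent check on the computed point.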

Exercise 21.11. If $f(x) > 0$, $f(0) = 0$, $f$ is differentiable in $(0, 1]$, $h(x) > 0$, and $\int h(x)\,dx$ diverges at 0, then $h(f(x))f'(x)$ is unbounded as $x \to 0^+$; for example, $f'(x)/f(x)$ is unbounded, and so is $f'(x)/\{f(x) \log f(x)\}$.

We might hope to extend the mean-value theorem to cases where the derivative does not necessarily exist, but the most obvious generalization is certainly false. For example, if $f(x) = |x|$ we have $f'_+(x) = -1$ for $x < 0$ and $f'_+(x) = 1$ for $x \ge 0$, so that although $f(1) = f(-1)$, we do not have $f'_+(x) = 0$ for any $x$ at all. Still less can we expect a mean-value theorem to hold for one of the Dini derivates. However, something analogous to the mean-value theorem does hold for Dini derivates and can substitute for the mean-value theorem in some applications.

We shall establish the following result. Let $f$ be continuous in $[a, b]$. If $C$ is any number that is larger than $[f(b) - f(a)]/(b - a)$, then at uncountably many points $x$ in $(a, b)$ we have $f^+(x) \le C$. Similarly $f_+(x) \ge c$ at uncountably many points $x$ if $c < [f(b) - f(a)]/(b - a)$; not, in general, the same points in both cases. The left-hand derivates have the same property.

The proof of this proposition is much like the conventional proof of the mean-value theorem. Let $C$ be any number larger than $[f(b) - f(a)]/(b - a)$, and consider the function $g$ defined by

$$g(x) = f(x) - f(a) - C \cdot (x - a).$$

Then $g(a) = 0$, and $g(b) = f(b) - f(a) - C \cdot (b - a) < 0$. Let $s$ be any number such that $0 = g(a) > s > g(b)$. The set of points $x$ in $[a, b]$ such that $g(x) \ge s$ is the inverse image of a closed set and so is closed, since $g$ is continuous. This set is also bounded, so it has a largest point, say $x_s$. Since $g$ is continuous, we must have $g(x_s) = s$, while $g(x_s + h) < s$ when $0 < h \le b - x_s$. Therefore $g^+(x_s) \le 0$, whence $f^+(x_s) = g^+(x_s) + C \le C$. Different values of $s$ give different points $x_s$, and there are uncountably many values of $s$, so $f^+(x) \le C$ at uncountably many points.

From the ordinary mean-value theorem, if $f'(x) > 0$, we infer that $f(y) > f(x)$ whenever $y > x$. We can use the theorem that we have just proved to establish a result that is stronger in two directions: we do not need to suppose that $f'$ exists, and we can omit countably many points. More precisely, if $f$ is continuous, and one Dini derivate is nonnegative except perhaps for countably many points, it follows that $f$ is nondecreasing. For suppose that $f^+(x) \ge 0$ for $a < x < b$ except for countably many points. (The hypothesis $f_+(x) \ge 0$ implies this, and the proof for $f^-(x)$ is similar.) If $f$ fails to be nondecreasing, there must be two points $x$ and $y$ such that $y > x$ and $f(y) < f(x)$. Our generalization of the mean-value theorem, with $[f(y) - f(x)]/(y - x) < C < 0$, then says that there are uncountably many points between $x$ and $y$ at which $f^+$ is negative, contradicting our hypothesis.

Exercise 21.12. The continuity of $f$ is essential for the preceding theorem: construct a discontinuous function $f$ for which $f^+(x) \ge 0$ for all $x$, yet $f(1) < f(-1)$.

We can now show that if $f$ is continuous, then all four Dini derivates have the same upper and lower bounds in an interval; more generally, the collection of difference quotients $[f(x + h) - f(x)]/h$ has the same upper and lower bounds as the Dini derivates, provided, of course, that both $x$ and $x + h$ belong to the interval in question. Suppose, for example, that $m$ is a lower bound for $f^+$. Let $g(x) = f(x) - mx$. Since $g^+(x) \ge 0$, the function $g$ is nondecreasing. If $h > 0$, we therefore have $g(x + h) - g(x) \ge 0$, or in other words

$$f(x + h) - f(x) - m(x + h) + mx \ge 0;$$

that is,

$$\frac{f(x + h) - f(x)}{h} \ge m \quad \text{for } h > 0, \quad \text{so } f^+(x) \ge f_+(x) \ge m.$$

Similarly, $f(x) - f(x - h) - mx + m(x - h) \ge 0$, that is,

$$\frac{f(x - h) - f(x)}{-h} \ge m \quad \text{for } h > 0, \quad \text{so } f^-(x) \ge f_-(x) \ge m.$$


Next, suppose that one derivate, say $f^+$, is continuous at $x$. This means that its upper and lower bounds are arbitrarily close to $f^+(x)$ in a sufficiently small neighborhood of $x$; the preceding theorem tells us that the same is true for the upper and lower bounds of the other three derivates. This means that all four derivates coincide (with $f^+(x)$) at the point $x$. That is, if one derivate of a continuous function is continuous at a point, then there is a derivative at that point.

A common error of students of calculus is to suppose that $f'(y)$ cannot exist if $\lim_{x \to y} f'(x)$ does not exist.

Exercise 21.13. The function $f$ given by $f(x) = x^2 \sin(1/x)$ for $x \ne 0$ and $f(0) = 0$ is everywhere differentiable, but the limit $\lim_{x \to 0} f'(x)$ does not exist.

Exercise 21.14. Show that $f'(y)$ does exist if $\lim_{x \to y} f'(x)$ exists.
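The behavior described in Exercise 21.13 can be observed numerically. The sketch below is not a proof and not a solution of the exercise; the step sizes and the sample points $x_k = 1/(k\pi)$ are assumptions chosen for illustration. It shows the difference quotients at 0 shrinking while $f'$ keeps oscillating near $\pm 1$ arbitrarily close to 0.

```python
import math

def f(x):
    return x * x * math.sin(1 / x) if x != 0 else 0.0

def f_prime(x):
    """Derivative of f at a point x != 0, from the product and chain rules."""
    return 2 * x * math.sin(1 / x) - math.cos(1 / x)

# Difference quotients at 0: |f(h)/h| = |h sin(1/h)| <= h, so they tend to 0.
steps = [10.0 ** -k for k in range(1, 9)]
quotients_at_zero = [(h, (f(h) - f(0)) / h) for h in steps]

# At x_k = 1/(k*pi) the derivative is close to -cos(k*pi), that is, to +1 or -1.
slopes = [f_prime(1 / (k * math.pi)) for k in range(1000, 1006)]
```

The first list supports $f'(0) = 0$; the second shows why $\lim_{x \to 0} f'(x)$ cannot exist.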

One reason for this misunderstanding, perhaps, is that a derivative, if discontinuous, is very discontinuous, so that functions with discontinuous derivatives are not commonly encountered in calculus. More precisely, a derivative cannot have a simple jump. This is to be interpreted in the following sense: if $f'(x)$ exists at every point $x$ of an interval, and (for a point $y$ of this interval) the limits $f'(y^+)$ and $f'(y^-)$ both exist, then both these limits are equal to $f'(y)$. On the other hand, the example $f(x) = |x|$ shows that the limits of $f'$ from both sides can exist and be different at $y$ if $f'(y)$ does not exist. The impossibility of a simple jump for a derivative is an immediate consequence of the intermediate value property of derivatives.

A continuous function cannot have a derivative that is everywhere infinite. Indeed, we can say much more: a continuous function must have $f^+(x) < +\infty$ on an uncountable set, a fact that follows at once from the generalized mean-value theorem on page 149. Indeed, the generalized mean-value theorem says that $f^+(x) \le C$ on an uncountable set if $C > [f(b) - f(a)]/(b - a)$. It follows from a general theorem that we shall quote later (page 155) that whether $f$ is continuous or not, it can have an infinite right-hand derivative $f'_+$ at most on a set of measure zero. On the other hand, if we do not require $f$ to be continuous, we can have $f^+(x) = +\infty$ at every point $x$.

An example of this phenomenon can be constructed as follows. Let real numbers $x$ in $[0, 1]$ be represented in base 3 as $0.a_1 a_2 \ldots$, where each $a_n$ is 0, 1, or 2. If $x$ has two representations, we choose the one that terminates. Then we put $f(x) = 0.b_1 b_2 \ldots$ (base 2), where $b_n = 1$ if $a_n = 2$ and otherwise $b_n = 0$. Now, since we excluded ternary representations ending in repeated 2's, the ternary representation of every $x$ contains an infinite sequence of digits that are 0 or 1. Let one of these 0's or 1's occur at the $r$th ternary place. Let $x'$ differ from $x$ only by having 2 as its $r$th ternary digit; then $x' > x$, and in fact $x' - x = 3^{-r}$ or $2 \cdot 3^{-r}$. In either case, $f(x') - f(x) = 2^{-r}$. Hence

$$\frac{f(x') - f(x)}{x' - x} \ge \frac{3^r}{2^{r+1}}.$$
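The digit construction above lends itself to an exact check. The following sketch is an illustration under stated assumptions: it handles only points with terminating ternary expansions, given as finite digit lists followed implicitly by zeros, and the function names are invented for the example. It computes the difference quotient from $x$ to the point $x'$ obtained by changing the $r$th digit to 2.

```python
from fractions import Fraction

def ternary_value(digits):
    """The number 0.a1 a2 ... (base 3) for a finite list of digits."""
    return sum(Fraction(d, 3 ** (i + 1)) for i, d in enumerate(digits))

def f_value(digits):
    """The number 0.b1 b2 ... (base 2), where b_n = 1 exactly when a_n = 2."""
    return sum(Fraction(1, 2 ** (i + 1)) for i, d in enumerate(digits) if d == 2)

def right_quotient(digits, r):
    """Difference quotient from x to x', where x' has 2 at ternary place r (1-based).

    Place r may lie beyond the listed digits, where x has implicit zeros.
    """
    padded = list(digits) + [0] * max(0, r - len(digits))
    if padded[r - 1] == 2:
        raise ValueError("place r must hold a digit 0 or 1")
    bumped = padded.copy()
    bumped[r - 1] = 2
    dx = ternary_value(bumped) - ternary_value(padded)
    df = f_value(bumped) - f_value(padded)
    return df / dx

digits = [1, 2, 0, 2]   # hypothetical sample point x = 0.1202 (base 3)
quotients = {r: right_quotient(digits, r) for r in range(5, 30)}
```

For places $r$ beyond the listed digits the digit is 0, so $x' - x = 2 \cdot 3^{-r}$ and the quotient equals the bound $3^r/2^{r+1}$ exactly.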

Since $r$ can be arbitrarily large, it follows that $f^+(x) = +\infty$. It can be shown that this function $f$ is continuous except at the points that have terminating ternary expansions, and in fact is continuous on the right at these points, but discontinuous on the left.

Another interesting result about possible values of derivatives (of not necessarily continuous functions) is that if one level set of $f'$ is dense, then every other level set of $f'$ is of first category. That is to say, if $f'(x) = A$ (possibly infinite) on a dense set, then $f'(x)$ can exist and be different from $A$ at most on a set of first category. It is enough to consider the set $S$ where $f'(x) < A$, since the set where $f'(x) > A$ is the set where $(-f)'(x) < -A$. When $A$ is finite, $S$ is contained in the union of the sets $E_{n,m}$, where $x \in E_{n,m}$ provided that $|y - x| < 1/n$ implies

/ ( » ) ~ / ( * ) f(x) whenever $y > x$ in $I$, or else $f(y) \le f(x)$ whenever $y > x$ in $I$. If one of these conditions holds with strict inequality throughout, we say that $f$ is strictly monotonic. The familiar functions used in calculus are, if not monotonic, at least piecewise monotonic. Thus if $f(x) = x^2$, then $f$ is decreasing when $x < 0$ and increasing when $x > 0$; if $f(x) = \cos x$, then $f$ is alternately increasing and decreasing in the intervals $(-\pi, 0)$, $(0, \pi)$, and so on; if $f(x) = e^x$, then $f$ is increasing throughout $R_1$. All these functions are continuous. On the other hand, the function $f$ defined by $f(x) = [x]$ (the greatest integer not

Sec. 22

159

MONOTONIC FUNCTIONS

exceeding $x$) is nondecreasing and is discontinuous at each integer $x$.

Exercise 22.1. A monotonic function is bounded on each compact subinterval of its domain.

Exercise 22.2. A monotonic function approaches a (finite) limit from each side at every interior point of its domain.

Exercise 22.3. The limit of a pointwise convergent sequence of monotonic functions is monotonic.

A function $f$ is said to have a jump at a point $x$ of its domain if $f$ has limits from both sides at $x$, but $f$ is not continuous at $x$. After Exercise 22.2, we can say that the only discontinuities of a monotonic function are jumps. The easiest monotonic functions to visualize are those with only a finite number of jumps, but a monotonic function can have a much more complicated structure than this. For example, if $f(x) = 2^{-n}$ in the interval $[1/(n + 1), 1/n)$, then $f$ is a nondecreasing function with jumps that have a limit point at 0.

A monotonic function can have at most countably many jumps, since the intervals from $f(x^-)$ to $f(x^+)$, if not empty, form a set of disjoint intervals in $R_1$ (disjoint because $f$ is monotonic), and such a set of intervals is countable (page 30). However, we shall show that the set of jumps of a monotonic function can be any countable set at all, even an everywhere dense one, for example, all the rational points in an interval.

Let $\{x_n\}$ be a given countable set, and let $j_n$ be positive numbers such that $\sum j_n < \infty$. We define functions $f_n$ by putting $f_n(x) = 0$ for $x < x_n$ and $f_n(x) = j_n$ for $x \ge x_n$. Of course, the $x_n$ will not in general be numbered in order of increasing magnitude. The series $\sum f_n$ converges uniformly, say to $f$ (by the M-test, page 111), since $|f_n(x)| \le j_n$ and $\sum j_n$ converges. If $x_0$ is not any of the $x_n$, then it is a point of continuity for all the $f_n$ and hence is a point of continuity for $f$ (page 112). On the other hand, if $x_m$ is one of the $x_n$, then precisely one function $f_n$, namely $f_m$, is discontinuous at $x_m$. Consequently, $\sum_{n \ne m} f_n = f - f_m$ is continuous at $x_m$. Hence $f$, as the sum of a function that is continuous at $x_m$ and a function that is discontinuous at $x_m$, is itself discontinuous at $x_m$. Indeed, $f$ has a jump of amount $j_m$ at $x_m$.

We may reasonably call such an $f$ a pure jump function. More generally, we call $f$ a pure jump function if it is constructed similarly, but possibly with both right-hand and left-hand jumps, so that $f(x_m^-) \ne f(x_m) \ne f(x_m^+)$. If we construct a pure jump function whose right-hand and left-hand jumps are just those of a given nondecreasing function $g$, then $g - f$ will still be nondecreasing, and also continuous.
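The construction of a pure jump function with jumps on a dense set can be imitated with a finite partial sum. The sketch below is illustrative only: the particular enumeration of the rationals in $(0, 1)$, the choice $j_n = 2^{-n}$, and the convention that the jump is taken at $x \ge x_n$ are assumptions made for the example.

```python
from fractions import Fraction

def rationals_in_unit_interval(count):
    """Enumerate distinct rationals p/q in (0, 1) by increasing denominator."""
    seen, out, q = set(), [], 2
    while len(out) < count:
        for p in range(1, q):
            r = Fraction(p, q)
            if r not in seen:
                seen.add(r)
                out.append(r)
                if len(out) == count:
                    break
        q += 1
    return out

def pure_jump_partial_sum(points, x):
    """Sum of the jumps j_n = 2**-n over those points x_n (n = 1, 2, ...) with x_n <= x."""
    return sum(Fraction(1, 2 ** n) for n, x_n in enumerate(points, start=1) if x >= x_n)

points = rationals_in_unit_interval(20)   # x_1 = 1/2, x_2 = 1/3, x_3 = 2/3, ...

# The jump at x_1 = 1/2: compare the value at 1/2 with the value just to its left.
eps = Fraction(1, 1000)
jump_at_half = (pure_jump_partial_sum(points, Fraction(1, 2))
                - pure_jump_partial_sum(points, Fraction(1, 2) - eps))
```

With more terms the jumps accumulate at every rational point, while the total variation stays below $\sum 2^{-n} = 1$.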

It may seem plausible that a pure jump function should have its derivative zero except at its jumps. This conjecture is almost, but not quite, true: the derivative of a pure jump function is zero except on a set of measure zero, but this set of measure zero may contain more points than the jumps. We can get a better idea of the kind of thing that can happen by considering some special cases. Let $f$ be the pure jump function that has jumps of amount $2^{-n}$ at the points $3^{-n}$ for $n = 1, 2, \ldots$; let $g$ be the pure jump function that has jumps of amount $3^{-n}$ at the points $2^{-n}$; let $f(0) = g(0) = 0$. Both $f$ and $g$ are continuous (on the right) at 0. However, we can easily show that $f'_+(0) = +\infty$ while $g'_+(0) = 0$. In fact, if $h > 0$, we have $[f(0 + h) - f(0)]/h = f(h)/h$; and if $3^{-m-1} \le h < 3^{-m}$, then $f(h) = \sum_{n=m+1}^{\infty} 2^{-n} = 2^{-m}$, so that $f(h)/h > 3^m/2^m \to \infty$. Similarly, if 2~ < 3 h 3 _ m h η and $h$ is between $3^{-m-1}$ and $3^{-m}$, then $f(x + h)$ differs from $f(x)$ by something between $2^{-m-1}$ and $2^{-m}$, and consequently $[f(x + h) - f(x)]/h$ must be between $2^{-m-1}/3^{-m}$ and $2^{-m}/3^{-m-1}$. Hence as $h \to 0$ (and so $m \to \infty$), this difference quotient becomes positively infinite. That is, at a right-hand endpoint $x$, we have $f'_+(x) = +\infty$ and $f'_-(x) = 0$. Similarly, $f'_-(x) = +\infty$ and $f'_+(x) = 0$ at a left-hand endpoint. At limit points that are not endpoints, it can be shown that $f^+ = +\infty$, whereas $f_+$ can have any value between 0 and $+\infty$.

As a first application, we construct another singular function, one which is not constant on any interval. Let $C$ be the function just constructed, but extended to all of $R_1$ by putting $C(x) = 0$ for $x < 0$ and $C(x) = 1$ for $x > 1$. Let $\{r_n\}$ be an enumeration of the rational numbers. Define

$$f(x) = \sum_{n=1}^{\infty} 2^{-n} C(2^n (x - r_n)).$$

Since the series is uniformly convergent (by the M-test), $f$ is continuous. Each term in the sum is nondecreasing (since $C$ is), so $f$ is nondecreasing. In fact, $f$ is strictly increasing: if $x < y$, then there is some rational number $r_n$ between $x$ and $y$, and $C(2^n(x - r_n)) = 0 < C(2^n(y - r_n))$, so $f(x) < f(y)$. Finally, by Fubini's theorem on differentiation of series with nondecreasing terms (see page 171, below),

$$f'(x) = \sum_{n=1}^{\infty} C'(2^n (x - r_n)) = 0$$

for almost all $x$.

As another application of the singular function $C$, we can construct the example, mentioned on page 154, of two functions that have the same derivative (infinite at some points) throughout an interval, but do not differ by a constant. We need a function $g$ that is continuous and nondecreasing with a finite derivative at each point not in the Cantor set and derivative $+\infty$ at each point of the Cantor set. Once we have such a $g$, we can put $h(x) = C(x) + g(x)$; then $h'(x) = g'(x) = +\infty$ at all points of the Cantor set (since all derivates of $C$ are nonnegative); and $h'(x) = g'(x)$ at all points not in the Cantor set, since $C'(x) = 0$ at such points. However, $g$ and $h$ differ by $C$, which is not constant.

We proceed to construct $g$. Let us enumerate the complementary intervals $(a_n, b_n)$ of the Cantor set in order of decreasing length (the order among the finitely many intervals of each length is irrelevant). Let $\varphi_n$ be a continuous nondecreasing function of the general character indicated in the figure: $\varphi_n(x) = 0$ for $x \le a_n$, $\varphi_n(x) = 1$ for $x \ge b_n$, $(\varphi_n)'_+(a_n) = (\varphi_n)'_-(b_n) = +\infty$. (An explicit example is $\varphi_n(x) = (2/\pi) \tan^{-1}\{(x - a_n)^{1/2} (b_n - x)^{-1/2}\}$.)
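The singular function $C$ used in these constructions can be evaluated digit by digit. The following is a minimal sketch, assuming that $C$ is the usual Cantor function on $[0, 1]$ extended by 0 to the left and 1 to the right; the truncation depth, the helper names, and the three-term enumeration of rationals are hypothetical choices for illustration. It also forms a partial sum of the series $\sum 2^{-n} C(2^n(x - r_n))$.

```python
from fractions import Fraction

def cantor(x, depth=40):
    """Cantor function on [0, 1], extended by 0 to the left and 1 to the right.

    Reads ternary digits of x; the result is exact when a digit 1 is reached or
    the expansion terminates, and otherwise accurate to within 2**-depth.
    """
    x = Fraction(x)
    if x <= 0:
        return Fraction(0)
    if x >= 1:
        return Fraction(1)
    value, scale = Fraction(0), Fraction(1, 2)
    for _ in range(depth):
        x *= 3
        digit = int(x)            # next ternary digit: 0, 1, or 2
        x -= digit
        if digit == 1:
            return value + scale  # plateau value reached at the first digit 1
        if digit == 2:
            value += scale
        scale /= 2
    return value

def singular_partial_sum(x, rationals):
    """Partial sum of sum_n 2**-n * C(2**n * (x - r_n)) over the listed r_n."""
    return sum(Fraction(1, 2 ** n) * cantor(2 ** n * (Fraction(x) - r))
               for n, r in enumerate(rationals, start=1))

rationals = [Fraction(1, 2), Fraction(1, 3), Fraction(2, 3)]  # hypothetical r_1, r_2, r_3
left = singular_partial_sum(Fraction(2, 5), rationals)
right = singular_partial_sum(Fraction(3, 5), rationals)
```

Here $r_1 = 1/2$ lies between $2/5$ and $3/5$, so the first term alone already forces the value at $3/5$ to exceed the value at $2/5$, as in the argument for strict monotonicity.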

Observe that the lengths of the intervals $(a_n, b_n)$ are negative integral powers of 3, and let $h_n = (2/5)^m$ whenever $(a_n, b_n)$ has length $3^{-m}$. Define $g(x) = \sum h_n \varphi_n(x)$ with the (infinite) sum extended over all $n$. In other words, the value $g(x)$ is the sum of the $h_n$ over all intervals $(a_n, b_n)$ to the left of $x$, plus $h_k \varphi_k(x)$ if $x$ is in $(a_k, b_k)$. Now there are $2^{m-1}$ intervals $(a_n, b_n)$ of length $3^{-m}$, and $h_n = (2/5)^m$ on each, so $\sum h_n$ converges. Therefore the series defining $g$ converges uniformly, and so $g$ is continuous. By construction, $g$ is nondecreasing. Let $x$ be a point of the Cantor set, other than an $a_n$, and let $\delta > 0$. The numerator of the difference quotient

$$\Delta = \frac{g(x + \delta) - g(x)}{\delta} = \sum_n h_n \frac{\varphi_n(x + \delta) - \varphi_n(x)}{\delta}$$

will exceed $h_k$ if the interval $(x, x + \delta)$ contains the complementary interval $(a_k, b_k)$. It is easy to see that $(x, x + \delta)$ always contains a complementary interval of length $3^{-m} > \delta/9$. Then $\Delta > (2/5)^m/\delta > (6/5)^m/9 \to \infty$, so $g'_+(x) = +\infty$. If, however, $x$ is an $a_n$, then $g'_+(x) = +\infty$ by inspection. Similarly $g'_-(x) = +\infty$ at all $x$ in the Cantor set. That is, $g'(x) = +\infty$ at all $x$ in the Cantor set.

We now turn to the rather difficult proof that a monotonic function has a finite derivative almost everywhere.

The reader who is interested in seeing some applications of the theorem first may skip to page 170.

The proof depends on a lemma by F. Riesz, known as the "flowing water" or "rising sun" lemma. If $g$ is a continuous function from an interval $I$ into $R_1$, if the graph of $g$ is the cross section of the bed of a stream, and we consider the set $E$ of points where water is flowing, it is intuitively clear that $E$ consists of open intervals at whose ends $g$ has the same value; if the graph is the profile of a mountain range, if the sun rises in the direction of the positive $x$-axis, and if $E$ is the set of points that are in the shade, it is again intuitively clear that $E$ consists of open intervals at whose ends $g$ has the same value. (In both cases, there may be an exceptional interval at the extreme left, as in the picture.)

We now state the lemma in abstract terms and for a more general situation. Let $g$ be continuous on an interval $I$, except for jumps, and let $G$ be defined by

$$G(x) = \max(g(x^-), g(x), g(x^+)).$$

The set $E$ of points $x$ such that there is a $y > x$ with $g(y) > G(x)$ is an open set; and if $(a, b)$ is any one of the intervals making up $E$, then $g(a^+) \le G(b)$.
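A discrete analogue of the rising sun lemma can be explored on sampled values. The sketch below is an illustration only, not the lemma itself: the sample function $\sin 3x - x/2$ on $[0, 6]$, the grid, and the function name are assumptions. An index is called shaded when some later sample is strictly larger, and the shaded indices are grouped into maximal runs.

```python
import math

def shadow_runs(values):
    """Return maximal runs (start, end) of indices i for which some later value exceeds values[i]."""
    n = len(values)
    suffix_max = [-math.inf] * n          # suffix_max[i] = max(values[i+1:])
    for i in range(n - 2, -1, -1):
        suffix_max[i] = max(values[i + 1], suffix_max[i + 1])
    shaded = [values[i] < suffix_max[i] for i in range(n)]
    runs, start = [], None
    for i, flag in enumerate(shaded):
        if flag and start is None:
            start = i                      # a new run of shaded points begins
        elif not flag and start is not None:
            runs.append((start, i - 1))    # the run ended just before index i
            start = None
    return runs                            # the last index is never shaded

# Hypothetical profile: a wavy descending "mountain range" sampled on [0, 6].
xs = [k / 200 for k in range(1201)]
values = [math.sin(3 * x) - x / 2 for x in xs]
runs = shadow_runs(values)
```

In this discrete setting every value inside a run is strictly below the value at the first unshaded index after the run, which mirrors the inequality $g(a^+) \le G(b)$.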


If we exchange left and right, and define $E'$ as the set of points $x$ such that there is a $y < x$ with $g(y) > G(x)$, then, similarly, if $E'$ is the union of open intervals $(a', b')$, we have $G(a') \ge g(b'^-)$.

We first prove that $E$ is open. Let $x_0 \in E$; then there is a $y > x_0$ with $g(y) > G(x_0)$. We have to show that this property holds for all $x$ near $x_0$. If $x$ varies slightly to the left of $x_0$, then $g(x)$ is near $g(x_0^-)$; if $x$ varies slightly to the right of $x_0$, then $g(x)$ is near $g(x_0^+)$; in either case $G(x)$ can exceed $G(x_0)$ only slightly. Since $g(y) > G(x_0)$, we also have $g(y) > G(x)$ as long as $G(x)$ is not much greater than $G(x_0)$, which, as we have just seen, is the case when $x$ is near $x_0$.

To prove the second statement of the lemma, let $(a, b)$ be one of the intervals that make up $E$; then $b$ is not a point of $E$. If it were the case that $g(a^+) > G(b)$, then there would exist a positive number $\epsilon$ and a point $x$ between $a$ and $b$ such that $g(x) > G(b) + \epsilon$. Fix this $x$ and this $\epsilon$, and let $z_1$ denote the least upper bound of the numbers $z$ in the interval $[x, b]$ such that $g(z) \ge G(b) + \epsilon$. Then $G(z_1) \ge G(b) + \epsilon$; in particular, $z_1 \ne b$. Thus, $z_1 \in E$, and so there is some $y > z_1$ for which $g(y) > G(z_1) \ge G(b) + \epsilon$. Since $b \notin E$, we must have $y < b$, but this contradicts the definition of $z_1$. This contradiction proves that $g(a^+)$ cannot exceed $G(b)$.

We are going to deduce the differentiability of a nondecreasing function from two consequences of Riesz's lemma. We lead up to these by showing first that our theorem will follow if we can prove that $f^+(x) < +\infty$ almost everywhere and $f^+(x) \le f_-(x)$ almost everywhere. It will simplify the notation somewhat, and will do no harm, if we suppose that our interval $(a, b)$ has its midpoint at 0, and that $f(0) = 0$. We can then reflect the graph of $f$ around the origin to obtain the graph of a function $f_0$ whose value at a negative $x$ is $-f(-x)$. This function $f_0$ is also nondecreasing. Its right-hand derivates at a negative $x$ equal the left-hand derivates of $f$ at $-x$, since

$$\frac{f_0(x + h) - f_0(x)}{h} = \frac{-f(-x - h) - (-f(-x))}{h} = \frac{f(-x + (-h)) - f(-x)}{-h}.$$

If $f^+(x) \le f_-(x)$ holds almost everywhere for every nondecreasing $f$, then it also holds with $f$ replaced by $f_0$ and $x$ replaced by $-x$. That is, $f^-(x) \le f_+(x)$ for almost all $x$, so for almost all $x$ we have

$$f^+(x) \le f_-(x) \le f^-(x) \le f_+(x) \le f^+(x)$$


$f_-(x)$, then there are rational numbers $r$ and $R$ such that $f_-(x) < r < R < f^+(x)$. Since there are only countably many pairs of rational numbers, the set where $f^+(x) > f_-(x)$ is contained in the union of countably many sets of measure zero, and so is itself of measure zero.

The consequences of Riesz's lemma on which our proof depends are as follows:

(1) Let $f$ be nondecreasing on $[a, b]$, and let $E_R$ be the set of points $x$ where $f$ is continuous and $f^+(x) > R$. Then $E_R$ can be covered by a countable set of disjoint intervals $(a_k, b_k)$ whose total length $\sum (b_k - a_k)$ is at most

$$\sum_k [f(b_k^+) - f(a_k^+)]/R \le [f(b^-) - f(a^+)]/R.$$


(2) Let $f$ be nondecreasing on $[a, b]$, and let $E_r$ be the set of points $x$ where $f$ is continuous and $f_-(x) < r$. Then $E_r$ can be covered by a countable set of disjoint intervals $(a_k, b_k)$ such that $\sum [f(b_k^-) - f(a_k^+)] \le r \sum (b_k - a_k) \le r(b - a)$.

We defer the proofs of these two statements until we have shown how the theorem follows from them. As a first step, we observe that (1) implies $f^+(x) < +\infty$ almost everywhere. For the set of jumps of $f$ is countable, and hence of measure zero. Now if $E$ is the set of points $x$ where $f$ is continuous and $f^+(x) = +\infty$, then we have the hypothesis of (1) for every positive $R$, so that $E$ is covered by intervals $(a_k, b_k)$ whose total length is at most $[f(b^-) - f(a^+)]/R$ for every $R$. That is, $E$ is covered by a collection of intervals of arbitrarily small total length, whence $E$ is of measure zero.

Next consider a set $E$ of points $x$ where $f$ is continuous and $f_-(x) < r < R < f^+(x)$, so that the hypotheses of (1) and (2) are both satisfied. Apply (1) to each of the intervals $(a_k, b_k)$ in (2). We find that the part of $E$ in $(a_k, b_k)$ is covered by intervals whose total length, $L_k$ say, is at most $[f(b_k^-) - f(a_k^+)]/R$. Adding these inequalities for all $(a_k, b_k)$, and applying (2), we find that

$$\sum_k L_k \le (1/R) \sum_k [f(b_k^-) - f(a_k^+)] \le (r/R)(b - a).$$


(1) If $x \in E_R$, then $f^+(x) > R$, so there is a $y > x$ such that $[f(y) - f(x)]/(y - x) > R$, that is, $f(y) - Ry > f(x) - Rx$. Apply Riesz's lemma to the function $g$ such that $g(t) = f(t) - Rt$. Since $f$ is continuous at $x$, so is $g$, and $G(x) = g(x)$. We conclude that $E_R$ is covered by a countable set of disjoint intervals $(a_k, b_k)$ such that $G(b_k) \ge g(a_k^+)$; in other words (since $f$ is nondecreasing),

$$f(b_k^+) - R b_k \ge f(a_k^+) - R a_k,$$

that is,

$$f(b_k^+) - f(a_k^+) \ge R(b_k - a_k).$$

Adding these inequalities for the various values of $k$ gives

$$R \sum_k (b_k - a_k) \le \sum_k [f(b_k^+) - f(a_k^+)].$$

(2) If $x \in E_r$, then $f_-(x) < r$, so there is a $y < x$ such that $[f(y) - f(x)]/(y - x) < r$, that is (since $y < x$),

$$f(y) - f(x) > (y - x)r, \qquad f(y) - ry > f(x) - rx.$$

Riesz's lemma, in the form with the inequality reversed in its hypothesis, can be applied to the function $g$ defined by $g(t) = f(t) - rt$. We find that $E_r$ is covered by a countable set of disjoint intervals $(a_k, b_k)$ such that $g(b_k^-) \le G(a_k)$; since $f$ is nondecreasing, $G(a_k) = g(a_k^+)$, so

$$f(b_k^-) - r b_k \le f(a_k^+) - r a_k,$$

that is, $f(b_k^-) - f(a_k^+) \le r(b_k - a_k)$; adding these inequalities over $k$ gives (2).


$q_1 > 0$, $q_2 > 0$, and $q_1 + q_2 = 1$. We are to show that when $f$ is continuous, the first inequality (for all $x_1$ and $x_2$) implies the second. If we write $\frac{1}{4}(x_1 + x_2 + x_3 + x_4)$ as the average of the numbers $\frac{1}{2}(x_1 + x_2)$ and $\frac{1}{2}(x_3 + x_4)$ and then apply the midpoint inequality three times, we obtain

$$f\left(\tfrac{1}{4}(x_1 + x_2 + x_3 + x_4)\right) \le \tfrac{1}{4}\left(f(x_1) + f(x_2) + f(x_3) + f(x_4)\right).$$

By repeating this argument, we find that

$$f\left(\tfrac{1}{n}(x_1 + \cdots + x_n)\right) \le \tfrac{1}{n}\left(f(x_1) + \cdots + f(x_n)\right)$$

whenever the integer $n$ is a power of 2. If we now take $r$ of the $x$'s equal to $x_1$ and the remaining $s = n - r$ of the $x$'s equal to $x_2$, we get $f\left(\frac{r}{n} x_1 + \frac{s}{n} x_2\right) \le \frac{r}{n} f(x_1) + \frac{s}{n} f(x_2)$. f+(x + g)- Since the right-hand derivative is the limit of difference quotients, we will have $[f(x + h) - f(x)]/h > [f(x + g + h) - f(x + g)]/h$ for all sufficiently small positive $h$. Equivalently, $f(x + h) + f(x + g) > f(x + g + h) + f(x)$. Now (*) implies

x

+

+

+

fix

+ h)
0. 3

+ 2h) -I- f(x)). Now if — x\)/2 t o see t h a t / is definition. quotients t h a t we noted

h ) h

~

f { X )

for every positive $h$. The right-hand side is the slope of an arbitrary chord of the graph of $y = f(x)$ with left-hand end at $(x, f(x))$; and the left-hand side is the slope of the right-hand tangent to the graph at $x$. Thus every chord through $(x, f(x))$ and going to the right lies above (or on) the (right-hand) tangent, which means that the entire curve, from $x$ onward to the right, lies above (or on) the right-hand tangent. Similarly on the left; and since the right-hand tangent has larger slope than the left-hand tangent, the entire curve lies above (or on) the right-hand tangent line at $x$.

A straight line touching the curve and such that the entire curve lies above or on it is called a supporting line; the existence of a supporting line at each point can be (and often is) taken as the definition of convexity. We say that the function is strictly convex if each supporting line has just one point of contact with the graph.

We can now generalize the original definition of convex functions: not only is every chord of the graph above or on its arc, but if we put arbitrary positive weights at $n$ points of the arc, their center of gravity will also be above or on the arc; and for a strictly convex function (and $n > 1$) the center of gravity will be strictly above the arc. Algebraically, this means that if $w_1, w_2, \ldots, w_n$ are positive weights whose sum is 1, then

$$f(w_1 x_1 + \cdots + w_n x_n) \le w_1 f(x_1) + \cdots + w_n f(x_n), \qquad (\dagger)$$

with strict inequality if $f$ is strictly convex and at least two $x_k$ are different. If $f$ is concave (which means that $-f$ is convex), then the inequality is reversed. Inequality $(\dagger)$ is known as Jensen's inequality.

To prove it, let $M = w_1 x_1 + \cdots + w_n x_n$ and $w_1 + \cdots + w_n = 1$, so that $M$ is the $x$-coordinate of the center of gravity of $n$ weights $w_k$ placed at the points $(x_k, f(x_k))$. Let us consider the supporting line determined by the right-hand tangent to the graph of $f$ at $x = M$; the curve lies above this tangent. Let the equation of the tangent be $y = g(x) = ax + b$, where $a$ and $b$ are numbers that could be calculated but whose values are irrelevant. Since the curve is above (or on) this supporting line through $(M, f(M))$, we have $ax + b \le f(x)$. Write this inequality for $x = x_1, x_2, \ldots, x_n$, multiply the $k$th inequality by $w_k$, and add the results:

$$w_1 g(x_1) + \cdots + w_n g(x_n) \le w_1 f(x_1) + \cdots + w_n f(x_n),$$

or, since the sum of the $w$'s is 1,

$$a \sum_{k=1}^{n} w_k x_k + b \le \sum_{k=1}^{n} w_k f(x_k).$$

Sec. 23

181

CONVEX FUNCTIONS

B u t YJl^WkXk = M, so g(M) < E L i Finally, t h e supporting line y = g(x) goes through t h e point

(M, f(M)),

s o ( M ) = f(M), and f(M) < E L i «>*/(**); 5

this is (f). Furthermore, if / is strictly convex and some Xk is different from Μ, t h e n there was strict inequality in one of t h e inequalities t h a t we added. Hence there is strict inequality in (f) except when all t h e Xk are equal t o M. In a similar way, we can get analogous inequalities for integrals instead of sums. Let w a n d χ be continuous positive functions for a < t < b, with J^w(t)dt — 1. If we replace t h e number Μ by w(t)x(t) dt a n d use t h e inequality g(y) < f(y) for every y between a a n d b, we get

provided that $f$ is convex, and the reversed inequality when $f$ is concave. This is the integral form of Jensen's inequality.

The ordinary average, or more formally the arithmetic mean, of $n$ numbers $x_1, \ldots, x_n$ is $(x_1 + \cdots + x_n)/n$; the geometric mean is $(x_1 x_2 \cdots x_n)^{1/n}$. For many purposes it is better to use weighted means,

$$w_1 x_1 + w_2 x_2 + \cdots + w_n x_n \qquad \text{and} \qquad x_1^{w_1} x_2^{w_2} \cdots x_n^{w_n},$$

where the $w_k$ add to 1; the ordinary case has $w_k = 1/n$ for each $k$. A famous (and very useful) theorem states that the geometric mean of $n$ positive numbers does not exceed their arithmetic mean, and it is actually smaller except when all the numbers are equal to each other. This is just a consequence of the convexity of $-\log t$ for positive $t$. In fact, by (†) applied to this function,

$$-\log(w_1 x_1 + \cdots + w_n x_n) \le -(w_1 \log x_1 + \cdots + w_n \log x_n),$$

that is,

$$\log(w_1 x_1 + \cdots + w_n x_n) \ge \log(x_1^{w_1}) + \cdots + \log(x_n^{w_n}).$$

If we exponentiate both sides, we get the inequality between the means; and there is equality if and only if all the $x_k$ are equal.

There are many applications of Jensen's inequality in general and of the inequality between the geometric and arithmetic means in particular. We can solve many maximum and minimum problems for polynomials without using calculus, for example, to find the largest box that can be made from a square of paper $a$ units on a side by cutting $x$ by $x$ squares out of the corners and folding up the resulting rectangles. This problem asks us to maximize $x(a - 2x)^2$, but it is just as satisfactory to maximize $\{4x(a - 2x)^2\}^{1/3}$. Consider the numbers $4x$ and $a - 2x$ with weights $\frac{1}{3}$ and $\frac{2}{3}$. Their weighted geometric mean $\{4x(a - 2x)^2\}^{1/3}$ does not exceed the weighted arithmetic mean, namely $\frac{4}{3}x + \frac{2}{3}(a - 2x) = \frac{2}{3}a$, and there is equality if and only if $4x = a - 2x$, that is, $x = a/6$. This says that the largest possible value for $\{4x(a - 2x)^2\}^{1/3}$ is attained at $x = a/6$; and for this $x$ we have $x(a - 2x)^2 = 2a^3/27$. Notice that it was important to know when equality is attained in the inequality between the means.
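The box problem can also be checked numerically. The following Python sketch is only an illustration: the side length `a = 6` and the search grid are arbitrary choices made here, not part of the argument. It compares the weighted geometric and arithmetic means of $4x$ and $a - 2x$ and locates the maximum of $x(a - 2x)^2$ on the grid.

```python
def box_volume(x, a):
    """Volume x(a - 2x)^2 of the open box cut from an a-by-a square."""
    return x * (a - 2 * x) ** 2


def weighted_means(x, a):
    """Weighted geometric and arithmetic means of 4x and a - 2x (weights 1/3, 2/3)."""
    geometric = (4 * x) ** (1 / 3) * (a - 2 * x) ** (2 / 3)
    arithmetic = (4 * x) / 3 + 2 * (a - 2 * x) / 3
    return geometric, arithmetic


a = 6.0  # illustrative side length (an assumption of this sketch)

# Grid of cut sizes with 0 < x < a/2; chosen so that a/6 is a grid point.
grid = [a / 2 * i / 1200 for i in range(1, 1200)]

best_x = max(grid, key=lambda x: box_volume(x, a))
best_volume = box_volume(best_x, a)

print("maximizing x on the grid:", best_x, " predicted a/6:", a / 6)
print("maximum volume:", best_volume, " predicted 2a^3/27:", 2 * a ** 3 / 27)
```

On the grid, the geometric mean never exceeds the arithmetic mean $\frac{2}{3}a$, and the two agree only at $x = a/6$, which is where the search finds the largest volume.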

Exercise 23.2. Show that $(\sin x)^{\sin x} < (\cos x)^{\cos x}$ when $0 < x < \pi/4$.


As another application, we deduce a necessary condition for the convergence of a series $\sum a_n$ of positive terms. Of course $a_n \to 0$ is a necessary condition, but $n a_n \to 0$ is (in general) not. However, if the series is convergent, then $n$ times the geometric mean of the first $n$ terms must approach zero; that is, $n(a_1 a_2 \cdots a_n)^{1/n} \to 0$. To see this, let $s_n = a_1 + a_2 + \cdots + a_n$, and $G_n = (a_1 a_2 \cdots a_n)^{1/n}$. We are assuming $s_n \to s$. It follows that the arithmetic mean of $s_1, s_2, \ldots, s_n$ approaches $s$.

Exercise 23.3. Prove the preceding statement.

Written in terms of the $a_k$, this says that

$$n^{-1}\{a_1 + (a_1 + a_2) + \cdots + (a_1 + a_2 + \cdots + a_n)\} \to s,$$

that is,

$$n^{-1}\{n a_1 + (n - 1) a_2 + \cdots + a_n\} \to s,$$

or equivalently

$$n^{-1}\{(n + 1) s_n - a_1 - 2a_2 - 3a_3 - \cdots - n a_n\} \to s.$$

Since $s_n \to s$, the preceding formula shows that

$$\frac{a_1 + 2a_2 + \cdots + n a_n}{n} \to 0.$$

The left-hand side is the arithmetic mean of the $n$ numbers displayed in the numerator, so their geometric mean also approaches 0. But this geometric mean is

$$\{n!\, a_1 a_2 \cdots a_n\}^{1/n} = (n!)^{1/n} G_n.$$

Since $(n!)^{1/n}/n \to 1/e$ (see Exercise 32.4), $nG_n \to 0$.

Important and useful inequalities arise from the observation that the function $x \mapsto x^r$ defined on $(0, \infty)$ is convex when $r > 1$. (It is also convex when $r < 0$, but the resulting inequalities do not seem to be of much use; when $0 < r < 1$ it is concave.) We start by observing that if $u$ and $v$ are positive numbers, and $r > 1$, then the minimum value of the expression $q^{1-r} u^r + (1 - q)^{1-r} v^r$ for $0 < q < 1$ equals $(u + v)^r$. This can be verified by calculus: the indicated function of $q$ blows up at both endpoints, so its minimum occurs at the interior point where the derivative is equal to zero, namely at $q = u/(u + v)$. However, it is more in the spirit of this section to argue, by convexity, that $\{qx + (1 - q)y\}^r \le q x^r + (1 - q) y^r$ for all positive numbers $x$ and $y$, with equality if and only if $x = y$; putting $x = u/q$ and $y = v/(1 - q)$ gives $(u + v)^r \le q^{1-r} u^r + (1 - q)^{1-r} v^r$ for every value of $q$, with equality if and only if $u/q = v/(1 - q)$, which is to say $q = u/(u + v)$.

To apply the observation, let $a_k$ and $b_k$ be positive numbers for $1 \le k \le n$. For each value of $k$ and $q$ we have $(a_k + b_k)^r \le q^{1-r} a_k^r + (1 - q)^{1-r} b_k^r$, so (with all sums from 1 to $n$)

$$\sum (a_k + b_k)^r \le q^{1-r} \sum a_k^r + (1 - q)^{1-r} \sum b_k^r.$$

Taking the minimum of the right-hand side in $q$ with $u = (\sum a_k^r)^{1/r}$ and $v = (\sum b_k^r)^{1/r}$ gives, after taking $r$th roots of both sides,

$$\left(\sum (a_k + b_k)^r\right)^{1/r} \le \left(\sum a_k^r\right)^{1/r} + \left(\sum b_k^r\right)^{1/r}.$$

This is Minkowski's inequality for sums, which we used on page 24. The same argument, with sums replaced by integrals, yields Minkowski's inequality for integrals:

$$\left(\int |f + g|^r\right)^{1/r} \le \left(\int |f|^r\right)^{1/r} + \left(\int |g|^r\right)^{1/r}.$$

This is valid also when $r = 1$, as follows immediately from the triangle inequality.
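As a numerical sanity check of the two steps above, the following Python sketch evaluates the expression $q^{1-r}u^r + (1-q)^{1-r}v^r$ at $q = u/(u+v)$ and on a grid of other values of $q$, and then tests Minkowski's inequality for sums on one pair of sequences. The particular numbers `u`, `v`, `r`, `a`, `b` are arbitrary sample values chosen for this illustration.

```python
def rhs(q, u, v, r):
    """The expression q^(1-r) u^r + (1-q)^(1-r) v^r minimized in the text."""
    return q ** (1 - r) * u ** r + (1 - q) ** (1 - r) * v ** r


def r_norm(values, r):
    """(sum |t|^r)^(1/r) for a finite list of numbers."""
    return sum(abs(t) ** r for t in values) ** (1 / r)


# Sample values (assumptions of this sketch, not taken from the text).
u, v, r = 2.0, 3.0, 2.5

q_star = u / (u + v)               # the minimizing q
min_value = rhs(q_star, u, v, r)   # should equal (u + v)^r
grid_min = min(rhs(i / 1000, u, v, r) for i in range(1, 1000))

a = [1.0, 2.0, 0.5, 4.0]
b = [3.0, 0.25, 2.0, 1.0]
lhs = r_norm([x + y for x, y in zip(a, b)], r)
bound = r_norm(a, r) + r_norm(b, r)

print("value at q*:", min_value, " (u+v)^r:", (u + v) ** r)
print("Minkowski:", lhs, "<=", bound)
```

No grid value of $q$ gives anything smaller than $(u+v)^r$, in agreement with the convexity argument.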


By Jensen's inequality,

[...] $\ge 0$, or equivalently, $T(d) - T(c) \ge f(d) - f(c)$. This last inequality is true because $T(d) - T(c)$ represents the total variation of $f$ on the interval $[c, d]$, and hence it is at least as big as the net change $|f(d) - f(c)|$.

Exercise 26.5. A function of bounded variation is differentiable almost everywhere.

Exercise 26.6. If $f$ is a continuous function on a finite interval $[a, b]$, and the derivative $f'$ exists at all points of $(a, b)$ and is bounded, then $f$ has bounded variation.

Exercise 26.7. Give an example of a continuous function on the interval $[0, 1]$ that does not have bounded variation.

An important subclass of the functions of bounded variation consists of functions that have small variation on collections of intervals of small total length. We say that a function $f$ on an interval $[a, b]$ is absolutely continuous if, given any positive $\varepsilon$, we can find a positive number $\delta$ such that for every finite collection of disjoint subintervals $\{(x_k, x_k + h_k)\}$ of total length less than $\delta$, we have $\sum_k |f(x_k + h_k) - f(x_k)| < \varepsilon$.

[...] $M$ approaches 0 as $M \to \infty$. Informally, we may say that an integrable function is large only on a small set.

Exercise 27.5. If $f$ is integrable on the measurable set $S$, and if $\varepsilon > 0$, then there exists $\delta > 0$ such that $\int_E |f| < \varepsilon$ for every subset $E$ with measure less than $\delta$.

NOTES

¹See, for example, Michael W. Botsko, An elementary proof that a bounded a.e. continuous function is Riemann integrable, American Mathematical Monthly 95 (1988), 249-252.

²Assuming the continuum hypothesis, S. Ulam showed that the only finite measure defined on all subsets of the real numbers that assigns measure zero to each point is identically zero (Zur Masstheorie in der allgemeinen Mengenlehre, Fundamenta Mathematicae 16 (1930), 141-150).

³It was discovered in the 1960's that the Lebesgue integral can actually be constructed by a procedure entirely analogous to Riemann's. See Carus Mathematical Monograph number 20, The Generalized Riemann Integral, by Robert M. McLeod.

⁴By making suitable conventions about how to handle the symbol $\infty$, one can define the Lebesgue integral of a (perhaps) unbounded measurable function over a (perhaps) unbounded measurable set directly (without taking limits) by the same procedure used for bounded functions on bounded domains. This is often considered to be one of the advantages of the Lebesgue integral.

28. Properties of Lebesgue integrals. As you would guess, the Lebesgue integral shares many properties with the Riemann integral. For instance, integration is a linear operation: $\int (cf + g) = c \int f + \int g$ for all integrable functions $f$ and $g$ and every constant $c$. The integral is additive over sets: if $S_1$ and $S_2$ are disjoint measurable sets, then $\int_{S_1 \cup S_2} f = \int_{S_1} f + \int_{S_2} f$. Integration is a positive operation, in the sense that if $f(x) \ge 0$ for almost all $x$, then $\int f \ge 0$; and if in addition $f > 0$ on a set of positive measure, then $\int f > 0$. More generally, if $f$ and $g$ are integrable functions, and $f(x) \le g(x)$ almost everywhere, then $\int f \le \int g$. A common way to show that an unbounded measurable function $f$ is integrable is to find an integrable function $g$ that dominates $f$ (this means that $|f(x)| \le g(x)$ almost everywhere).

Exercise 28.1. If $f$ denotes the function whose value at $x$ is $x^{-1/2} \sin(\log x)$, then $f$ is integrable on the interval $(0, 1)$.

Now we turn to some properties of Lebesgue integrals that Riemann integrals might have, but don't, or that are not meaningful for Riemann integrals.


(i) INTEGRATION OF SEQUENCES AND SERIES. We cannot necessarily integrate a convergent sequence of functions term by term. For example, if $f_n$ is the function defined by $f_n(x) = n$ for $0 < x < 1/n$, and $f_n(x) = 0$ elsewhere, we have $f_n(x) \to 0$ for each $x$, while $\int_0^1 f_n(x)\,dx = 1$, so that $\lim_n \{\int_0^1 f_n(x)\,dx\}$ and $\int_0^1 \{\lim_n f_n(x)\}\,dx$ both exist but are different.
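The example can be tabulated directly. The Python sketch below is an illustration only: the midpoint-rule helper and the sample point `x0 = 0.3` are choices made here. It shows that each $f_n$ has integral 1 over $(0, 1)$ even though the values $f_n(x)$ at a fixed $x$ are eventually 0.

```python
def f(n, x):
    """f_n(x) = n for 0 < x < 1/n, and 0 elsewhere."""
    return float(n) if 0.0 < x < 1.0 / n else 0.0


def midpoint_integral(g, lo, hi, steps):
    """Midpoint-rule approximation to the integral of g over [lo, hi]."""
    h = (hi - lo) / steps
    return sum(g(lo + (i + 0.5) * h) for i in range(steps)) * h


# The step count is a multiple of n, so the jump at x = 1/n falls on a cell
# boundary and the midpoint rule reproduces the area of the rectangle.
integrals = {
    n: midpoint_integral(lambda x, n=n: f(n, x), 0.0, 1.0, 1000 * n)
    for n in (1, 10, 100)
}

x0 = 0.3  # a fixed sample point (an arbitrary choice)
values_at_x0 = [f(n, x0) for n in range(1, 20)]

print("integrals of f_n:", integrals)
print("f_n(0.3) for n = 1..19:", values_at_x0)
```

No integrable function dominates every $f_n$ here, which is why the dominated convergence theorem discussed next does not apply to this sequence.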

Exercise 28.2. Find an example of the same phenomenon with continuous functions $f_n$.

We do know (page 117) that with the Riemann integral we can integrate a uniformly convergent sequence of functions term by term: if $f_n \to f$ uniformly, then $\int f_n \to \int f$. With Lebesgue integration, we get the much more powerful "dominated convergence theorem." It says that if $\{f_n\}$ is a sequence of Lebesgue integrable functions converging pointwise almost everywhere to a limit function $f$, and if there exists a Lebesgue integrable function $g$ such that $|f_n(x)| \le g(x)$ almost everywhere for each $n$, then $f$ is Lebesgue integrable, and $\int f_n \to \int f$. In terms of series, the theorem says that if $\sum_{n=1}^\infty f_n(x)$ converges for almost all $x$, and if there exists an integrable function $g$ such that $|\sum_{n=1}^N f_n(x)| \le g(x)$ almost everywhere for all $N$, then the sum of the series is automatically integrable, and we can correctly integrate term by term.

We can deduce the dominated convergence theorem by applying Egoroff's theorem (page 202). Suppose $\varepsilon > 0$. Since $g$ is integrable, there is a set $S$ of finite measure such that the integral of $g$ over the complement $C(S)$ is less than $\varepsilon$. By Exercise 27.5, there is a positive $\delta$ such that $\int_E g < \varepsilon$ for every subset $E$ of $S$ with measure less than $\delta$. By Egoroff's theorem, we can choose $E$ so that $f_n \to f$ uniformly on the complement of $E$. This means that $|f_n(x) - f(x)| < \varepsilon/m(S)$ for sufficiently large $n$ and for $x \notin E$. Accordingly, $\int_S |f_n - f| < \int_E |f_n - f| + \varepsilon$ for such $n$. Since $g$ dominates each $f_n$, it also dominates the limit function $f$, and so $\int_E |f_n - f| \le \int_E 2g < 2\varepsilon$. Similarly, $\int_{C(S)} |f_n - f| < 2\varepsilon$. Hence $\int |f_n - f| < 5\varepsilon$ when $n$ is large enough, which means that $\int f_n \to \int f$.
[...] 0. But $h^{-1} \int_c^{c+h} f = h^{-1}\{F(c + h) - F(c)\}$, so $F'(c) = f(c)$.

In the general situation, we may assume, by Exercise 26.4, that $f$ is the limit almost everywhere of a nondecreasing sequence $\{f_n\}$ of continuous functions. Set $F_n(x) = \int_a^x (f_{n+1} - f_n)$. From the special case just considered, we know that $F_n'(x) = f_{n+1}(x) - f_n(x)$ for all $x$. Because the integrand is nonnegative, $F_n$ is a nondecreasing function. Because of the telescoping sum in the integrand, $\sum_{j=1}^n F_j(x) = \int_a^x (f_{n+1} - f_1)$. By the monotone convergence theorem, $\sum_{j=1}^\infty F_j(x)$ converges for all $x$ to $F(x) - \int_a^x f_1$. By Fubini's theorem on derivatives (page 171), $\sum_{j=1}^\infty F_j'(x)$ converges to $F'(x) - f_1(x)$ for almost all $x$. But

$$\sum_{j=1}^n F_j'(x) = \sum_{j=1}^n (f_{j+1}(x) - f_j(x)) = f_{n+1}(x) - f_1(x),$$

which converges to $f(x) - f_1(x)$ for almost all $x$. Thus $F'(x) = f(x)$ for almost all $x$.

Now let us consider the related question of whether the integral of a derivative gives back the original function. Again, we cannot expect this to be true in general. Indeed, even if / has a finite derivative everywhere, / ' may fail to be integrable. To visualize an example of such an / , imagine the line segment (0,1) partitioned into infinitely many intervals that condense at 0, with each interval carrying a pair of teeth of height 1, one positive and the other


negative. The integral of $|f'|$ over each interval equals 4 (the total variation of $f$ on the interval), and since there are infinitely many intervals, $f'$ is not Lebesgue integrable on $(0, 1)$. The corners of the graph of $f$ can be rounded off to make $f$ differentiable at all points of $(0, 1)$.¹

If $f'$ exists almost everywhere and is integrable, it need not be the case that $f$ is an indefinite integral of $f'$. For example, the derivative of the Cantor function (page 162) is zero almost everywhere, and so the integral of the derivative is identically zero.²

For a function $f$ to have any chance of being the integral of its derivative, $f$ must possess any properties that indefinite integrals have in general. One such property is continuity. If $g$ is integrable on $[a, b]$, then the function $G$ such that $G(x) = \int_a^x g$ is a continuous function of $x$; this follows from Exercise 27.5. For the same reason, an indefinite integral has the stronger property of absolute continuity.

This necessary property is also sufficient: if $f$ is absolutely continuous on $[a, b]$, then $f$ has a derivative almost everywhere, and $f$ is an indefinite integral of its derivative. Indeed, $f$ is, in particular, of bounded variation (Exercise 26.9), and hence is the difference $f_1 - f_2$ of two nondecreasing functions $f_1$ and $f_2$; so $f'(x)$ exists for almost all $x$. Moreover, $f_1'$ and $f_2'$ are integrable by Exercise 28.4, so $f'$ is too because $|f'(x)| \le f_1'(x) + f_2'(x)$. Let $F(x) = \int_a^x f'$. We know that $F$ is absolutely continuous, and $F'(x) = f'(x)$ for almost all $x$. Thus $F - f$ is an absolutely continuous function whose derivative is zero almost everywhere. From page 204, we know that $F - f$ is constant; that is, $f$ is an indefinite integral of its derivative.

NOTES

¹There is an explicit formula in section 10.7 of E. C. Titchmarsh, Theory of Functions, second edition, Oxford University Press, 1968; but a verbal description seems to be more informative than a formula.

²On the other hand, if $f'$ exists and is finite everywhere and is integrable, then $f$ is an indefinite integral of $f'$. See, for example, Casper Goffman, On functions with summable derivative, American Mathematical Monthly 78 (1971), 874-875; Walter Rudin, Real and Complex Analysis, third edition, McGraw-Hill, New York, 1987, Theorem 7.21, p. 149. Also, if $f$ is continuous, the right-hand derivative of $f$ exists and is finite except at countably many points, and the right-hand derivative is integrable, then $f$ is an indefinite integral of its right-hand derivative (and hence of $f'$). See P. L. Walker, On Lebesgue integrable derivatives, American Mathematical Monthly 84 (1977), 287-288; Edwin Hewitt and Karl Stromberg, Real and Abstract Analysis, second printing corrected, Springer-Verlag, New York, 1969, (18.41)(d), p. 299.

29. Applications of the Lebesgue integral.

(i) COMPUTING A MAXIMUM BY INTEGRATION. When $p > 0$, the space of (equivalence classes of) measurable functions $f$ for which $\int |f|^p$ is finite is denoted by $L^p$ (or $L^p(S)$ if it is necessary to specify the domain of the functions). If $S$ is a finite interval, we can show that $f \in L^p(S)$ implies $f \in L^q(S)$ if $q < p$, but not necessarily for any $q > p$. Indeed, if $q < p$, then $|f(x)|^q \le \max(1, |f(x)|^p)$ for all $x$, so $|f|^q$ is integrable if $|f|^p$ is, and in fact $\int_S |f|^q \le m(S) + \int_S |f|^p$. On the other hand, if $q > p$, think of $S = (0, 1/2)$, and let $f(x)$ be $1/\{x (\log x)^2\}^{1/p}$; then $f^p$ is integrable but $f^q$ is not.

Exercise 29.1. Give an example of a measurable function $f$ such that $f \in L^q(0, 1)$ if $q < p$, but $f \notin L^p(0, 1)$.


When $p \ge 1$, the quantity $(\int |f|^p)^{1/p}$ is denoted by $\|f\|_p$ and is called the $L^p$ norm of $f$. Minkowski's inequality (page 184) shows that it satisfies the triangle inequality. Consequently, we can view the function space $L^p$ as a metric space, the distance between functions $f$ and $g$ being $\|f - g\|_p$.

Exercise 29.2. Compute $\|f\|_p$ for the function $f$ on the interval $[0, 1]$ such that $f(x) = x$. What happens when $p \to \infty$?

Now we consider the question of finding the maximum of a nonnegative function $f$ that is integrable on a finite interval $[a, b]$. If $f$ is bounded, there is, of course, no reason to suppose that $f$ attains a maximum. Even the notion of its supremum, or least upper bound, makes no sense in the context of Lebesgue integrable functions, since we could change the value of $f$ at a single point and get a different value for the supremum. If we are going to think about a supremum at all, we shall need some property that is not affected by how the function behaves on sets of measure zero. We therefore introduce a property that acts as much as possible like a supremum. We define the essential supremum of a nonnegative function $f$ (denoted by $\|f\|_\infty$, or, in older works, by ess sup $f$ or vrai max $f$) to be the number $M$ with the property that, for every positive $\varepsilon$, the inequality $f(x) < M + \varepsilon$ holds for almost all $x$, whereas $f(x) > M - \varepsilon$ on some set of positive measure. If there is no such number $M$, we define $\|f\|_\infty$ to be $\infty$.

Exercise 29.3. Determine $\|f\|_\infty$ when $f$ is the ruler function (Exercise 13.1).

We are now going to prove that $\lim_{p \to \infty} \|f\|_p = \|f\|_\infty$, whether $\|f\|_\infty$ is finite (case 1) or infinite (case 2).


Case 1. Let $M = \|f\|_\infty$. If $M = 0$, then $f$ is zero almost everywhere, so $\|f\|_p = 0$ for all $p$, and there is nothing to prove. Suppose, then, that $M > 0$, and let $\varepsilon$ be an arbitrary positive number less than $M$. By the definition of essential supremum, $f(x) < M + \varepsilon$ for almost all $x$. Then

$$\int_a^b f^p \le (M + \varepsilon)^p (b - a).$$

In particular, $f \in L^p$ for every $p$. Moreover

$$\limsup_{p \to \infty} \|f\|_p \le \limsup_{p \to \infty} (M + \varepsilon)(b - a)^{1/p} = M + \varepsilon.$$

On the other hand, $f(x) > M - \varepsilon$ on a set $S$ of positive measure. Then

$$\left(\int_a^b f^p\right)^{1/p} \ge \left(\int_S f^p\right)^{1/p} \ge \{(M - \varepsilon)^p\, m(S)\}^{1/p} = (M - \varepsilon)(m(S))^{1/p}.$$

Hence $\liminf_{p \to \infty} \|f\|_p \ge M - \varepsilon$. Since $\varepsilon$ is arbitrary, we deduce that the limit $\lim_{p \to \infty} \|f\|_p$ exists and equals $M$.

Case 2. If there were a positive number $p$ such that $f$ does not belong to $L^p$, then $f$ would not belong to $L^q$ for any $q > p$, and the limit $\lim_{p \to \infty} \|f\|_p$ would be infinite. We may therefore suppose that $f$ belongs to every $L^p$. Our assumption now is that there is no $M$ such that $f(x) \le M$ almost everywhere. Consequently, for each positive $n$, we have $f(x) > n$ on a set $S_n$ of positive measure. Then

$$\left(\int_a^b f^p\right)^{1/p} \ge \left(\int_{S_n} f^p\right)^{1/p} \ge \{n^p\, m(S_n)\}^{1/p} = n\,(m(S_n))^{1/p}.$$

Letting $p \to \infty$, we obtain $\liminf_{p \to \infty} \|f\|_p \ge n$. Since $n$ is arbitrary, this means that $\lim_{p \to \infty} \|f\|_p = \infty$.
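The convergence of $\|f\|_p$ to the essential supremum can be watched numerically. In the Python sketch below, the test function $f(x) = x(1 - x)$ on $[0, 1]$ (whose supremum is $1/4$), the midpoint rule, and the list of exponents are all choices made for this illustration; they are not taken from the text.

```python
def p_norm(g, lo, hi, p, steps=2000):
    """Midpoint-rule approximation to (integral of |g|^p over [lo, hi])^(1/p)."""
    h = (hi - lo) / steps
    total = sum(abs(g(lo + (i + 0.5) * h)) ** p for i in range(steps)) * h
    return total ** (1.0 / p)


def f(x):
    """Sample function (an assumption of this sketch); its supremum on [0, 1] is 1/4."""
    return x * (1.0 - x)


exponents = (1, 2, 8, 32, 128)
norms = {p: p_norm(f, 0.0, 1.0, p) for p in exponents}

for p in exponents:
    print("p =", p, " ||f||_p ~", norms[p])
```

The computed norms increase with $p$ and creep up toward $1/4$ without exceeding it, as Case 1 predicts. Much larger exponents would need rescaling, since $(1/4)^p$ eventually underflows in floating-point arithmetic.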


(ii) A CONNECTION BETWEEN SUMS AND INTEGRALS. The integral test for convergence of infinite series says that if $f$ is a positive decreasing function on $(0, \infty)$, then $\int_1^\infty f(x)\,dx$ and $\sum_{k=1}^\infty f(kx)$ (for a given positive number $x$) either both converge or both diverge.

Exercise 29.4. Prove this integral test.

Exercise 29.5. If $f$ is not monotonic, the conclusion may fail.

If $f$ is merely nonnegative and integrable, nothing can be said about the convergence of $\sum_k f(kx)$ in general. There is, however, a sort of average connection between $\int f$ and $\sum_k f(kx)$: if the integral converges, the sum converges for almost all $x$. To prove this, we show that

$$\int_1^\infty x^{-1} \sum_{k=1}^\infty f(kx)\,dx \le \int_1^\infty f(x)\,dx.$$

We saw in §28 that sums and integrals of positive functions can be rearranged however we like, so

$$\begin{aligned}
\int_1^\infty x^{-1} \sum_{k=1}^\infty f(kx)\,dx
&= \sum_{k=1}^\infty \int_1^\infty x^{-1} f(kx)\,dx
= \sum_{k=1}^\infty \int_k^\infty x^{-1} f(x)\,dx \\
&= \sum_{k=1}^\infty \sum_{n=k}^\infty \int_n^{n+1} x^{-1} f(x)\,dx
= \sum_{n=1}^\infty \sum_{k=1}^n \int_n^{n+1} x^{-1} f(x)\,dx \\
&= \sum_{n=1}^\infty n \int_n^{n+1} x^{-1} f(x)\,dx
\le \sum_{n=1}^\infty \int_n^{n+1} f(x)\,dx
= \int_1^\infty f(x)\,dx.
\end{aligned}$$


Hence if $\int_1^\infty f$ is finite, then $\sum_{k=1}^\infty f(kx)$ is finite for almost all $x$ greater than 1. If we know that $\int_\delta^\infty f$ is finite for every positive $\delta$, then we get by rescaling that $\sum_{k=1}^\infty f(kx)$ is finite for almost all $x$ greater than 0.

(iii) SERIES OF ORTHOGONAL FUNCTIONS. The space $L^2(a, b)$ of real-valued functions with integrable square on an interval $[a, b]$ is a metric space, the distance between functions $f$ and $g$ being $\|f - g\|_2$, the norm of their difference. We can introduce a notion of inner product (or dot product, or scalar product) in this space by defining $(f, g)$ to be $\int_a^b f \cdot g$; the integral is finite by Schwarz's inequality (page 185). Notice that $\|f\|_2^2 = (f, f)$. We say that two functions are orthogonal if their inner product is zero. A sequence of functions [...] $\{\sqrt{2/\pi}\,\sin(nx)\}_{n=1}^\infty$ is orthonormal on the interval $[0, \pi]$.

Exercise 29.6. Verify the orthonormality of this sequence.
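A numerical check is no substitute for the verification asked for in the exercise, but it shows what the claim means. The Python sketch below approximates the inner products $(\varphi_m, \varphi_n) = \int_0^\pi \varphi_m \varphi_n$ for $\varphi_n(x) = \sqrt{2/\pi}\,\sin(nx)$ by the midpoint rule; the helper names `phi` and `inner`, the step count, and the range $1 \le m, n \le 4$ are assumptions of this illustration.

```python
import math


def phi(n, x):
    """The n-th function sqrt(2/pi) * sin(n x) of the sequence."""
    return math.sqrt(2.0 / math.pi) * math.sin(n * x)


def inner(g, h, lo=0.0, hi=math.pi, steps=4000):
    """Midpoint-rule approximation to the inner product (g, h) on [lo, hi]."""
    step = (hi - lo) / steps
    total = 0.0
    for i in range(steps):
        x = lo + (i + 0.5) * step
        total += g(x) * h(x)
    return total * step


# Matrix of inner products (phi_m, phi_n) for 1 <= m, n <= 4.
size = 4
gram = [
    [inner(lambda x, m=m: phi(m, x), lambda x, n=n: phi(n, x))
     for n in range(1, size + 1)]
    for m in range(1, size + 1)
]

for row in gram:
    print(["%.6f" % entry for entry in row])
```

Up to rounding, the matrix of inner products is the identity: each function has norm 1, and distinct functions are orthogonal.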

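The formal expansion of a function $f$ as a series in the orthonormal functions $\varphi_n$ is $\sum c_n \varphi_n$, where the coefficients $c_n$ are given by $c_n = (f, \varphi_n)$. When the orthonormal functions are sines and cosines, this is usually called a Fourier series. If we calculate the square of the distance between $f$ and a partial sum of its formal expansion, we obtain

$$\begin{aligned}
\Big\| f - \sum_{n=1}^N c_n \varphi_n \Big\|_2^2
&= \Big( f - \sum_{n=1}^N c_n \varphi_n,\; f - \sum_{n=1}^N c_n \varphi_n \Big) \\
&= \|f\|_2^2 - 2 \sum_{n=1}^N c_n (f, \varphi_n) + \sum_{m,n=1}^N c_m c_n (\varphi_m, \varphi_n) \\
&= \|f\|_2^2 - \sum_{n=1}^N c_n^2.
\end{aligned}$$

The left-hand side is nonnegative for every value of $N$, so we must have

$$\sum_{n=1}^\infty c_n^2 \le \int_a^b f^2.$$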

This is Bessel's inequality. It suggests the question of whether it has a converse, that is, whether numbers $c_n$ such that $\sum c_n^2$ converges are necessarily the coefficients in the expansion of some function in a series of whatever set of orthonormal functions is being considered. If integration is taken to be Riemann integration, then the answer is "no." It is one of the successes of Lebesgue integration that it does let us find a function of integrable square whose coefficients are a given sequence $\{c_n\}$, provided that $\sum c_n^2$ converges. This is the Riesz-Fischer theorem (proved independently by F. Riesz and E. Fischer).

The most natural way to construct a function with prescribed coefficients $c_n$ is to write the series $\sum_n c_n \varphi_n(x)$. Such a series does not necessarily converge for all $x$, but we shall see that it does converge in the $L^2$ distance. In other words, the partial sums of the series form a Cauchy sequence in the $L^2$ metric: if we let $s_n$ denote $\sum_{k=1}^n c_k \varphi_k$, then $\|s_m - s_n\|_2 \to 0$ as $m$ and $n \to \infty$. In fact, the orthonormality of the $\varphi_n$ implies that if $m > n$, then

$\|s_m - s_n\|_2^2 = \sum_{k=n+1}^m c_k^2$, and this goes to 0 as $m$ and $n \to \infty$ because $\sum_{k=1}^\infty c_k^2$ converges. We might hope that if a series is Cauchy in $L^2$, then there will be an $L^2$ function to which it converges; in other words, $L^2$ is complete as a metric space. We would expect that the $L^2$ limit of the partial sums $s_n$ will be a function that has the given coefficients $c_n$.

We are going to show that this is actually what happens. To see that $L^2$ is a complete space, consider any sequence $\{s_n\}$ that is a Cauchy sequence in the $L^2$ metric. Since $\|s_m - s_n\|_2 \to 0$, we can find an increasing sequence of numbers $n_k$ for which $n_k \to \infty$ and $\|s_m - s_{n_k}\|_2 < 2^{-k}$ when $m > n_k$, and in particular $\|s_{n_{k+1}} - s_{n_k}\|_2 < 2^{-k}$. If we apply Schwarz's inequality to the function $|s_{n_{j+1}} - s_{n_j}|$ and the constant function with value 1, we obtain

$$\int_a^b |s_{n_{j+1}} - s_{n_j}| \le \left\{\int_a^b |s_{n_{j+1}} - s_{n_j}|^2\right\}^{1/2} (b - a)^{1/2} < 2^{-j} (b - a)^{1/2},$$

and so the telescoping sum $\sum_{j=1}^\infty (s_{n_{j+1}}(x) - s_{n_j}(x))$ converges almost everywhere (page 214). This is equivalent to saying that the sequence $\{s_{n_j}(x)\}$ converges almost everywhere to some $f(x)$. This $f$ is the function we are looking for.

We still have to show that $f$ is the limit, in $L^2$, of the original sequence $\{s_n\}$. By Fatou's lemma,

$$\|s_{n_k} - f\|_2 \le \liminf_{j \to \infty} \|s_{n_k} - s_{n_j}\|_2 \le 2^{-k}.$$

Therefore $f$ is in $L^2$, since $s_{n_k}$ is, and the subsequence $\{s_{n_k}\} \to f$ in $L^2$ since $\lim_{k \to \infty} \|s_{n_k} - f\|_2 = 0$. Now Minkowski's inequality (page 184) gives us $\|f - s_n\|_2 \le \|f - s_{n_k}\|_2 + \|s_{n_k} - s_n\|_2$, so $s_n \to f$ in $L^2$. Thus $L^2$ is a complete metric space.

Returning to the situation that $s_n$ is a partial sum of the series $\sum c_k \varphi_k$, write $(f,$