Integration
 9781400877812

Table of contents :
Preface
Contents
I. SOME THEOREMS ON REAL-VALUED FUNCTIONS
1. Sets and characteristic functions
2. Neighborhoods, openness, closure
3. Denumerable sets
4. Functions and limits
5. Bounds
6. Upper and lower limits
7. Semi-continuous functions
8. Functions of bounded variation
9. Absolutely continuous functions
II. THE LEBESGUE INTEGRAL
10. Step-functions
11. Riemann integrals
12. U-functions and L-functions
13. Integrals of U-functions and L-functions
14. Upper and lower integrals
15. The Lebesgue integral
16. Consistency of Riemann and Lebesgue integrals
17. Integrals over bounded sets
18. Integrals over unbounded sets
III. MEASURABLE SETS AND MEASURABLE FUNCTIONS
19. Arithmetic of measurable sets
20. Exterior measure and interior measure
21. Measurable functions
22. Measurable functions and summable functions
23. Equivalence of functions
24. Summable products. Inequalities
IV. THE INTEGRAL AS A FUNCTION OF SETS; CONVERGENCE THEOREMS
25. Multiple integrals and iterated integrals
26. Set functions
27. The integral as a set function
28. Modes of convergence
29. Convergence theorems
30. Metric spaces; spaces L_p
V. DIFFERENTIATION
31. Dini derivates
32. Derivates of monotonic functions
33. Derivatives of indefinite Lebesgue integrals
34. Derivatives of functions of bounded variation
35. Derivatives of absolutely continuous functions
36. Integration by parts
37. Mean-value theorems
38. Substitution theorems
39. Differentiation under the integral sign
VI. CONTINUITY PROPERTIES OF MEASURABLE FUNCTIONS
40. The classes of Baire
41. Metric density and approximate continuity
42. Density of continuous functions in L_p. Riemann-Lebesgue theorem
43. Lusin's theorem
44. Non-measurable sets and non-measurable functions
VII. THE LEBESGUE-STIELTJES INTEGRAL
45. The difference-function
46. Monotonic functions and functions of bounded variation
47. Integrals and measure with respect to monotonic functions
48. Examples
49. Borel sets
50. Dependence of integral on integrator
51. Integrals with respect to functions of bounded variation
52. Properties of the Lebesgue-Stieltjes integral
53. Measure functions
54. Measure functions and Lebesgue-Stieltjes measure
55. Integrals with respect to a measure function
56. Measure functions defined by integrals
VIII. THE PERRON INTEGRAL
57. Definition of the Perron integral
58. Elementary properties
59. Relation to the Lebesgue integral
60. Perron integral of a derivative
61. Derivative of the indefinite Perron integral
62. Summability of non-negative integrable functions
63. Convergence theorems
64. Substitution
65. Integration by parts
66. Second theorem of mean value
IX. DIFFERENTIAL EQUATIONS
67. Ascoli's theorem
68. Existence and uniqueness of solutions
69. The solutions as functions of parameters
X. DIFFERENTIATION OF MULTIPLE INTEGRALS
70. Vitali's theorem
71. Derivates of set functions
72. Derivatives of indefinite integrals
73. Derivatives of functions of bounded variation
APPENDIX
List of special symbols and abbreviations
INDEX

INTEGRATION

PRINCETON MATHEMATICAL SERIES

Editors: MARSTON MORSE and A. W. TUCKER

1. The Classical Groups, Their Invariants and Representations. By HERMANN WEYL.
2. Topological Groups. By L. PONTRJAGIN. Translated by EMMA LEHMER.
3. An Introduction to Differential Geometry with Use of the Tensor Calculus. By LUTHER PFAHLER EISENHART.
4. Dimension Theory. By WITOLD HUREWICZ and HENRY WALLMAN.
5. The Analytical Foundations of Celestial Mechanics. By AUREL WINTNER.
6. The Laplace Transform. By DAVID VERNON WIDDER.
7. Integration. By EDWARD JAMES MCSHANE.
8. Theory of Lie Groups: I. By CLAUDE CHEVALLEY.
9. Mathematical Methods of Statistics. By HARALD CRAMER.
10. Several Complex Variables. By SALOMON BOCHNER and WILLIAM TED MARTIN.
11. Introduction to Topology. By SOLOMON LEFSCHETZ.
12. Algebraic Geometry and Topology. Edited by R. H. FOX, D. C. SPENCER, and A. W. TUCKER.
13. Algebraic Curves. By ROBERT J. WALKER.
14. The Topology of Fibre Bundles. By NORMAN STEENROD.
15. Foundations of Algebraic Topology. By SAMUEL EILENBERG and NORMAN STEENROD.
16. Functionals of Finite Riemann Surfaces. By MENAHEM SCHIFFER and DONALD C. SPENCER.
17. Introduction to Mathematical Logic, Vol. I. By ALONZO CHURCH.
18. Algebraic Geometry. By SOLOMON LEFSCHETZ.
19. Homological Algebra. By HENRI CARTAN and SAMUEL EILENBERG.
20. The Convolution Transform. By I. I. HIRSCHMAN and D. V. WIDDER.
21. Geometric Integration Theory. By HASSLER WHITNEY.

INTEGRATION By

Edward James McShane

PRINCETON

PRINCETON UNIVERSITY PRESS 1944

Copyright © 1944, by Princeton University Press London: Oxford University Press ALL RIGHTS RESERVED

Second Printing 1947 Third Printing 1950 Fourth Printing 1957

Printed in the United States of America

Preface

The swift development of analysis in the twentieth century, beginning with the theory of the Lebesgue integral, has been of tremendous mathematical importance. No mathematician today can afford to be ignorant of the modern theories of integration, and it is to the profit of a student of mathematics that he become acquainted with these ideas early in his graduate studies. On the other hand, most of the writings on integration are written by mature mathematicians for mature mathematicians, often in an admirably concise form which is not appreciated by a beginner.

This book is written with the hope that it will open a path to the Lebesgue theory which can be travelled by students of little maturity. It is for the sake of such readers that details are explicitly presented which could ordinarily be regarded as obvious. An experienced mathematician may regard many proofs as verbose. Probably some of them are unnecessarily wordy, even for the veriest beginners; equally probably there are details omitted as obvious which will not be obvious to all readers. In view of the audience to whom this is addressed, the latter must be considered the graver fault.

The scheme of introducing the Lebesgue and Lebesgue-Stieltjes integral here adopted is a modification of that of Daniell, the integral appearing as the result of a two-stage generalization of the Cauchy (or Stieltjes) integral. Perhaps this manifestation of a connection between continuous functions and summable functions may help the beginner to feel at home in the newer theory.

There are few historical remarks on the theorems and methods here used and there is practically no bibliography. These are not usually of great interest to a beginner, and a student who wishes to continue further into the subject will necessarily read treatises, above all Saks' Theory of the Integral, which will furnish bibliographical and historical references.

In only a few features can this book make claims to novelty. An expert will usually recognize known proofs used in assorted combinations and modifications. One acknowledgement must however be made. The latter part of the chapter on differential equations owes much to a mimeographed set of lecture notes on differential equations by Professor G. A. Bliss.

Part of the material in this book has been used in teaching graduate classes at the University of Virginia, and in several respects the choice of subject matter and of forms of proof has been guided by the comments of the students, especially by those of Dr. B. J. Pettis.

Shortly after the manuscript reached the Editors of the Princeton Mathematical Series I was called to the Aberdeen Proving Ground to help with the work in exterior ballistics. As a result, I lacked the time to perform the usual final tasks. I am most grateful to the Editors for their kindness in taking over duties which properly should have devolved upon the author, and thereby advancing the date of publication by many months. In particular, I owe thanks to Dr. Paco Lagerstrom, who worked long and efficiently over the manuscript.

In the correction of proof I have been greatly assisted by Miss Mary Jane Cox, who not only read all proof-sheets but pointed out a number of places in which rewording was needed for the sake of clarity. Finally, I wish to thank Princeton University Press for its cooperativeness and efficiency.

E. J. MCSHANE.
CHARLOTTESVILLE, VIRGINIA, November 21, 1943.

CHAPTER I

Some Theorems on Real-valued Functions

The entire subject-matter of this book rests upon the properties of real numbers, with which we assume the reader to be familiar. No appeal is made to geometric intuition. Nevertheless, it is often convenient to use the language of geometry, and this is permissible if we define all our geometric expressions in terms of number.

If $q$ is a positive integer, we shall say that each ordered $q$-tuple $(x^{(1)}, \dots, x^{(q)})$ of real numbers is a "point in $q$-dimensional space," or a "point in $R_q$." For convenience in notation we prefer to put the indices $(1), \dots, (q)$ up instead of down; the lower position will be reserved for subscripts distinguishing different points from each other. Usually we abbreviate by writing $x$ for $(x^{(1)}, \dots, x^{(q)})$.

A standing notational convention will be the following. If any letter, with or without affixes, is used to denote a point in $q$-dimensional space, the $q$ numbers defining the point will be denoted by the same symbol (with the same affixes if any) with superscripts $(1), \dots, (q)$. Thus if we speak of a point $y_0$ in $q$-dimensional space $R_q$ we mean the ordered $q$-tuple $(y_0^{(1)}, \dots, y_0^{(q)})$.

Two points of a space $R_q$ are identical if and only if corresponding numbers in the two $q$-tuples are equal; that is, if $x$ and $y$ are both in $R_q$ the equation $x = y$ has the same meaning as the $q$ equations
\[ x^{(i)} = y^{(i)}, \qquad i = 1, \dots, q. \]
An ordered $q$-tuple and an ordered $p$-tuple ($p \ne q$) will never be regarded as identical.

Having this system of abbreviation, it is reasonable to proceed a step further and define sums $x + y$ and products $cx$, where $x$ and $y$ are points in $R_q$ and $c$ is a real number. The definitions are
\[ x + y = (x^{(1)} + y^{(1)}, \dots, x^{(q)} + y^{(q)}), \qquad cx = (cx^{(1)}, \dots, cx^{(q)}). \]
We have little use for these symbols until the later chapters.

In order that a collection $E$ of points of $R_q$ shall be called a point set in $R_q$ we require only that, given any point $x$ of $R_q$, it must be possible to determine whether or not $x$ belongs to the collection $E$.

However, there is a great deal of trouble concealed in this statement. The difficulty lies in giving a precise meaning to the word "determine." Clearly it is not possible to list all the infinitely many points of $R_q$ one by one, marking each as belonging to $E$ or not belonging to $E$. Some rule must be given. This leads to a further question. What is a rule? Now we have begun to enter the domain of foundations of mathematics; and, without denying the importance of such studies, we shall turn back again to the narrower study of the points of our spaces $R_q$. We shall assume that the reader has some reasonably adequate concept of a rule; if he has doubts of this, as we all may well have, we can only refer him to the various publications on mathematical logic and the foundations of mathematics.

A simple example of a point-set in $R_q$ is $R_q$ itself; for, given any $x$ in $R_q$, we know at once that it belongs to $R_q$. The "empty" set $\Lambda$, which contains no points whatever, is also a point-set in $R_q$; for, given any $x$ in $R_q$, we know that it does not belong to $\Lambda$. Two point-sets $E_1$, $E_2$ in $R_q$ are identical if and only if each point $x$ which belongs to $E_1$ belongs also to $E_2$ and each point $x$ which belongs to $E_2$ belongs to $E_1$.

The set of all $x$ for which the statement $S$ holds will sometimes be denoted by $\{x \mid S\}$; thus in $R_1$ the set $\{x \mid 0 \le x \le 1\}$ would be the closed interval consisting of all real numbers between 0 and 1. (Cf. also …)

Given any set $E$ in $R_q$, the set of all points of $R_q$ which do not belong to $E$ is called the complement of $E$, and is denoted by $CE$. Thus the complement of the whole space $R_q$ is the empty set $\Lambda$, and conversely; in symbols, $CR_q = \Lambda$ and $C\Lambda = R_q$. It is easy to see that for every point-set $E$ in $R_q$ the equation $C(CE) = E$ holds. For if $x$ is in $E$, it is not in $CE$, and is therefore in $C(CE)$; and if $x$ is in $C(CE)$, it is not in $CE$, and is therefore in $E$.

If $E_1$ and $E_2$ are point sets in $R_q$, we say that $E_1$ is contained in $E_2$ (in symbols, $E_1 \subset E_2$) or that $E_2$ contains $E_1$ (in symbols, $E_2 \supset E_1$) in case every point $x$ which belongs to $E_1$ also belongs to $E_2$. Thus in particular $E \subset E$ and $\Lambda \subset E$ for every set $E$. Further, we define $E_1 \cup E_2$ to be the set of all points $x$ belonging to one or both of $E_1$, $E_2$; we define* $E_1 \cap E_2$ or $E_1E_2$ to be the set of all points $x$ belonging to both $E_1$ and $E_2$; and we define $E_1 - E_2$ to be the set of all points $x$ which belong to $E_1$ but not to $E_2$.†

* $E_1 \cap E_2$ is sometimes called the product, sometimes the intersection of $E_1$ and $E_2$; $E_1 \cup E_2$ is called the sum or the union of $E_1$ and $E_2$.
† In defining the set theoretical operations we might of course have considered any collection of elements instead of $R_q$.

EXAMPLE. In $R_1$, put $E_1 = \dots$ and $E_2 = \dots$. Then $E_1 \cup E_2$ is the set … and $E_1E_2$ is the set ….

If $\{E_\alpha\}$ is any (finite or infinite) collection of sets in $R_q$, we define the sum (union) $\bigcup E_\alpha$ to be the set of all $x$ contained in at least one of the sets $E_\alpha$, and we define the product (intersection) $\bigcap E_\alpha$ to be the set of all $x$ belonging to all the sets $E_\alpha$.*

EXAMPLE. In one dimensional space $R_1$, let $E_n$ be … $(n = 1, 2, 3, \dots)$. Then … $E_n$ is …, and $\bigcap E_n$ is the single point 0. If $E_n$ is …, then … $E_n$ is $\{x \mid 0 < x < 1\}$, while … $E_n$ is $\{x \mid 0 \dots\}$.

The following relationships are easily verified:
\[ C(E_1 \cup E_2) = CE_1 \cap CE_2, \qquad C\Big(\bigcup E_\alpha\Big) = \bigcap (CE_\alpha), \]
\[ C(E_1 \cap E_2) = CE_1 \cup CE_2, \qquad C\Big(\bigcap E_\alpha\Big) = \bigcup (CE_\alpha), \]
\[ E_1 - E_2 = E_1 \cap CE_2, \qquad C(E_1 - E_2) = CE_1 \cup E_2. \]
For instance, a point $x$ is in $C(\bigcup E_\alpha)$ if and only if it is not in $\bigcup E_\alpha$, which is true if and only if it is in no one of the sets $E_\alpha$, which is true if and only if it is in every set $CE_\alpha$ and therefore in $\bigcap (CE_\alpha)$. Again, applying this equality to the sets $CE_\alpha$, we have
\[ C\Big(\bigcup CE_\alpha\Big) = \bigcap (CCE_\alpha) = \bigcap E_\alpha, \]
whence by taking complements
\[ \bigcup CE_\alpha = C\Big(\bigcap E_\alpha\Big). \]
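The complement identities above can be checked mechanically on a small finite example. The following is only a sketch under stated assumptions: a finite set `UNIVERSE` stands in for the whole space, the family of sets is an arbitrary illustration that does not come from the text, and the helper names (`complement`, `union_all`, `intersection_all`) are hypothetical.

```python
# Finite stand-in for the whole space; complements are taken inside UNIVERSE.
UNIVERSE = frozenset(range(10))


def complement(e):
    """CE: the points of UNIVERSE that do not belong to e."""
    return UNIVERSE - e


def union_all(sets):
    """Sum (union): the points lying in at least one of the sets."""
    result = frozenset()
    for s in sets:
        result = result | s
    return result


def intersection_all(sets):
    """Product (intersection): the points lying in every one of the sets."""
    result = UNIVERSE
    for s in sets:
        result = result & s
    return result


# An arbitrary illustrative family (an assumption, not taken from the text).
family = [frozenset({0, 1, 2, 3}), frozenset({2, 3, 4, 5}), frozenset({3, 5, 7})]
e1, e2 = family[0], family[1]

# C(union of the sets) = intersection of the complements
law_union = complement(union_all(family)) == intersection_all(
    [complement(s) for s in family]
)
# C(intersection of the sets) = union of the complements
law_intersection = complement(intersection_all(family)) == union_all(
    [complement(s) for s in family]
)
# E1 - E2 = E1 ∩ CE2, and C(E1 - E2) = CE1 ∪ E2
law_difference = (e1 - e2) == (e1 & complement(e2))
law_difference_complement = complement(e1 - e2) == (complement(e1) | e2)
# C(CE) = E
law_double = complement(complement(e1)) == e1

print(law_union, law_intersection, law_difference,
      law_difference_complement, law_double)
```

A finite check of this kind proves nothing about arbitrary collections in the space itself; it only makes the element-chasing argument of the text concrete.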

* Some authors write $E_1 + E_2$, $E_1 \cdot E_2$, $\Sigma E_\alpha$, $\Pi E_\alpha$ for $E_1 \cup E_2$, $E_1 \cap E_2$, $\bigcup E_\alpha$, $\bigcap E_\alpha$.

If $\{E_\alpha\}$ is a collection of sets, the sets of the collection $\{E_\alpha\}$ are disjoint if no point $x$ belongs to more than one of the sets of the collection.

A useful tool in studying properties of point sets $E$ is the characteristic function.

1.1. The characteristic function $K_E(x)$ of the set $E$ is that function whose value is 1 if $x$ is in $E$ and whose value is 0 if $x$ is not in $E$.

In the next theorem we assemble some simple properties of characteristic functions. However, it is desirable first to define sums and products of characteristic functions. This is trivial for finite sums and products. Given an infinite aggregate of symbols $\alpha$, and corresponding to each $\alpha$ a number $t_\alpha$ which is either 0 or 1, we define the product of all $t_\alpha$ to be 0 if any one of them is 0 and to be 1 if all the $t_\alpha$ are 1. The sum of all $t_\alpha$ is the number $n$ if exactly $n$ of the $t_\alpha$ have the value 1, and is $\infty$ if an infinite number of the $t_\alpha$ have the value 1. Concerning this symbol we shall have more to say shortly.

1.2. (a) For any collection of sets, $K_{\bigcap E_\alpha}(x) = \prod K_{E_\alpha}(x)$.
(b) For any collection of sets, $K_{\bigcup E_\alpha}(x)$ is the smaller of the numbers 1 and $\sum K_{E_\alpha}(x)$.
(c) The sets $E_\alpha$ are disjoint if and only if $\sum K_{E_\alpha}(x) \le 1$.
(d) If the sets $E_\alpha$ are disjoint, then $K_{\bigcup E_\alpha}(x) = \sum K_{E_\alpha}(x)$.
(e) $K_{CE}(x) = 1 - K_E(x)$.
(f) If $E_1 \subset E_2$, then $K_{E_1}(x) \le K_{E_2}(x)$.

To prove (a), we observe that if the left member has the value 1, then $x$ is in $\bigcap E_\alpha$, so it is in every $E_\alpha$, so $K_{E_\alpha}(x) = 1$ for every $\alpha$, and the product of the characteristic functions is 1. Otherwise the left member has the value zero, $x$ is not in $\bigcap E_\alpha$, it is therefore lacking from some $E_\alpha$; for this $E_\alpha$ we have $K_{E_\alpha}(x) = 0$, and the product of the characteristic functions is 0.

To prove (b), if $x$ is in $\bigcup E_\alpha$ it is in at least one $E_\alpha$, so at least one term of the sum $\sum K_{E_\alpha}(x)$ is 1. Hence $\sum K_{E_\alpha}(x) \ge 1$, and the smaller of 1 and $\sum K_{E_\alpha}(x)$ is also 1. If $x$ is not in $\bigcup E_\alpha$, then $K_{\bigcup E_\alpha}(x)$ is 0 and so is every term of the sum $\sum K_{E_\alpha}(x)$. So the sum is 0, which is the smaller of 0 and 1.

In (c), if the sets $E_\alpha$ are disjoint, each $x$ belongs to at most one set $E_\alpha$, so at most one term in the sum is 1, the others all being 0. So the sum is 0 or 1. Conversely, if the sum is never more than 1, there is no $x$ for which two or more of the characteristic functions have the value 1. That is, no $x$ belongs to more than one of the sets $E_\alpha$, and the $E_\alpha$ are disjoint.

Statement (d) follows at once from (b) and (c). Or we can prove it directly. If $x$ is in $\bigcup E_\alpha$, it is in exactly one of the sets $E_\alpha$, so both members of the equation have the value 1. If $x$ is not in $\bigcup E_\alpha$, it is not in any $E_\alpha$, so both members of the equation have the value 0.

If $x$ is in $CE$, it is not in $E$, so $K_E(x) = 0$ and $1 = K_{CE}(x) = 1 - K_E(x)$. If $x$ is not in $CE$ it is in $E$, so $K_E(x) = 1$ and $0 = K_{CE}(x) = 1 - K_E(x)$. This proves (e).

For (f) we observe that if $x$ is not in $E_1$, then $0 = K_{E_1}(x) \le K_{E_2}(x)$, while if $x$ is in $E_1$ it is also in $E_2$, so $1 = K_{E_1}(x) = K_{E_2}(x)$.
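As a rough numerical companion to the properties of characteristic functions, the sketch below evaluates them on a finite universe. It is an illustration only: `UNIVERSE`, the two sample families and all function names are assumptions introduced here, and the infinite sums and products of the text are replaced by finite ones.

```python
from math import prod

# Hypothetical finite universe standing in for the whole space.
UNIVERSE = range(8)


def char(e, x):
    """Characteristic function of e at x: 1 if x is in e, otherwise 0."""
    return 1 if x in e else 0


def char_sum(family, x):
    """Sum of the characteristic functions of the sets of the family at x."""
    return sum(char(e, x) for e in family)


def is_disjoint(family):
    """The sets are disjoint iff the sum of characteristic functions never exceeds 1."""
    return all(char_sum(family, x) <= 1 for x in UNIVERSE)


def check_product_rule(family):
    """Characteristic function of the intersection equals the product."""
    meet = set(UNIVERSE).intersection(*family)
    return all(char(meet, x) == prod(char(e, x) for e in family) for x in UNIVERSE)


def check_sum_rule(family):
    """Characteristic function of the union is the smaller of 1 and the sum."""
    join = set().union(*family)
    return all(char(join, x) == min(1, char_sum(family, x)) for x in UNIVERSE)


def check_disjoint_sum_rule(family):
    """For disjoint sets the characteristic function of the union equals the sum."""
    join = set().union(*family)
    return all(char(join, x) == char_sum(family, x) for x in UNIVERSE)


def check_complement_rule(e):
    """Characteristic function of the complement equals 1 minus that of the set."""
    comp = set(UNIVERSE) - set(e)
    return all(char(comp, x) == 1 - char(e, x) for x in UNIVERSE)


# Illustrative families (assumptions, not taken from the text).
overlapping = [{0, 1, 2}, {2, 3}, {2, 5, 6}]
disjoint = [{0, 1}, {2, 3}, {6}]

print(check_product_rule(overlapping), check_sum_rule(overlapping))
print(is_disjoint(overlapping), is_disjoint(disjoint))
print(check_disjoint_sum_rule(disjoint), check_complement_rule({1, 4}))
```

The overlapping family shows why the union rule needs "the smaller of 1 and the sum": at the common point 2 the sum is 3, while the characteristic function of the union is 1.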

2. Next we proceed to investigate some properties of sets in $R_q$ which depend, at least in part, on the concept of distance.

If $x$ is a point in $R_q$ (or, as an alternative name, a vector in $R_q$) we define its distance from the origin (or, alternatively, the length of the vector) to be the quantity $|x|$ defined by the equation
\[ |x| = \Big[\sum_{i=1}^{q} (x^{(i)})^2\Big]^{1/2}. \]
If $x$ and $y$ are points of $R_q$, we define their distance $\|x, y\|$ by the equation
\[ \|x, y\| = \Big[\sum_{i=1}^{q} (x^{(i)} - y^{(i)})^2\Big]^{1/2}. \]
We now establish the four fundamental properties of this distance, which are the following.

(1) For all points $x$, $y$ of $R_q$, $\|x, y\| \ge 0$.
(2) If $x$ and $y$ are points of $R_q$, $\|x, y\| = 0$ if and only if $x = y$.
(3) For all points $x$, $y$ of $R_q$, $\|x, y\| = \|y, x\|$.
(4) For all points $x$, $y$, $z$ of $R_q$, $\|x, z\| \le \|x, y\| + \|y, z\|$.

Properties (1) and (3) are evident from the definition. Also, $\|x, y\| = 0$ if and only if each difference $x^{(i)} - y^{(i)}$ has the value zero, which establishes (2). Property (4) is called the "triangle inequality"; in geometric language, it states that the sum of two sides of a triangle is at least equal to the third side. In order to prove it, it is convenient first to establish the highly useful Cauchy inequality. If $a_1, \dots, a_q$, $b_1, \dots, b_q$ are real numbers, then
\[ \Big[\sum_{i=1}^{q} a_i^2\Big]^{1/2} \Big[\sum_{i=1}^{q} b_i^2\Big]^{1/2} \ge \sum_{i=1}^{q} a_i b_i. \]
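The distance and the Cauchy inequality lend themselves to a quick numerical check. The sketch below is illustrative only: points are represented as tuples of floats, the names `norm`, `dist` and `cauchy_gap` are hypothetical, and a small tolerance is assumed to absorb floating-point rounding.

```python
import math


def norm(x):
    """Length of the vector x: square root of the sum of squared coordinates."""
    return math.sqrt(sum(c * c for c in x))


def dist(x, y):
    """Distance between two points with the same number of coordinates."""
    if len(x) != len(y):
        raise ValueError("points must have the same number of coordinates")
    return norm([a - b for a, b in zip(x, y)])


def cauchy_gap(a, b):
    """Product of the lengths minus the sum of products a_i * b_i.

    The Cauchy inequality asserts that this quantity is never negative.
    """
    return norm(a) * norm(b) - sum(p * q for p, q in zip(a, b))


# Illustrative points (assumptions, not taken from the text).
x, y, z = (0.0, 0.0, 1.0), (3.0, 4.0, 1.0), (-2.0, 5.0, 0.5)
TOL = 1e-12  # assumed tolerance for rounding error

print(dist(x, y))                                  # non-negative
print(dist(x, y) == dist(y, x))                    # symmetric
print(dist(x, z) <= dist(x, y) + dist(y, z) + TOL) # triangle inequality
print(cauchy_gap((1.0, 2.0, 3.0), (4.0, -5.0, 6.0)) >= 0.0)
```

For proportional tuples such as (1, 2) and (2, 4) the gap returned by `cauchy_gap` is zero up to rounding, which matches the case of equality in the inequality.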

It is evident that
\[ \sum_{i=1}^{q} \sum_{j=1}^{q} (a_i b_j - a_j b_i)^2 \ge 0, \]
that is,
\[ \sum_{i}\sum_{j} a_i^2 b_j^2 - 2 \sum_{i}\sum_{j} a_i b_i a_j b_j + \sum_{i}\sum_{j} a_j^2 b_i^2 \ge 0. \]
In the first double sum we first collect all the terms containing $a_1$, then those containing $a_2$, and so on. We find
\[ \sum_{i}\sum_{j} a_i^2 b_j^2 = a_1^2 \sum_{j} b_j^2 + a_2^2 \sum_{j} b_j^2 + \cdots + a_q^2 \sum_{j} b_j^2 = \Big(\sum_{i} a_i^2\Big)\Big(\sum_{j} b_j^2\Big). \]
A similar process can be applied to each of the other two double sums; we thus find
\[ \Big(\sum a_i^2\Big)\Big(\sum b_i^2\Big) - 2\Big(\sum a_i b_i\Big)^2 + \Big(\sum a_i^2\Big)\Big(\sum b_i^2\Big) \ge 0. \]
If we transpose the middle term and divide by 2, we obtain
\[ \Big(\sum a_i^2\Big)\Big(\sum b_i^2\Big) \ge \Big(\sum a_i b_i\Big)^2. \]
The left member of the Cauchy inequality is non-negative. If the right member is also non-negative, the Cauchy inequality follows from the preceding inequality by taking the square roots of both members; if the right member is negative, the inequality is evidently satisfied.

EXERCISE. Let us say that the $q$-tuples $(a_1, \dots, a_q)$ and $(b_1, \dots, b_q)$ are proportional if there are numbers $h$, $k$ not both zero such that $ha_i = kb_i$, $i = 1, \dots, q$. Show that the absolute values of the two members of the Cauchy inequality are equal if and only if the $q$-tuples are proportional. (If they are proportional and, say, $h \ne 0$, we can substitute $kb_i/h$ for $a_i$ and verify equality. If equality holds, show that it also holds in the first inequality in the proof. From this deduce proportionality of the $q$-tuples.)

Returning to the proof of the triangle inequality, we first observe that by the Cauchy inequality
\[ \sum a_i b_i \le \Big[\sum a_i^2\Big]^{1/2} \Big[\sum b_i^2\Big]^{1/2}. \]
We add the same quantity $\tfrac12 \sum a_i^2 + \tfrac12 \sum b_i^2$ to both members of this inequality to obtain
\[ \tfrac12 \sum (a_i + b_i)^2 \le \tfrac12 \Big\{ \Big[\sum a_i^2\Big]^{1/2} + \Big[\sum b_i^2\Big]^{1/2} \Big\}^2. \]
Multiplying both members by 2 and changing notation (putting $a_i = x^{(i)} - y^{(i)}$ and $b_i = y^{(i)} - z^{(i)}$), we find
\[ \|x, z\|^2 \le \{\|x, y\| + \|y, z\|\}^2, \]
whence the triangle inequality follows at once.

If $x_0$ is any point of the space $R_q$, and $\epsilon$ is any positive number, we define the $\epsilon$-neighborhood $N_\epsilon(x_0)$ of the point $x_0$ to be the set of all points $x$ whose distance $\|x, x_0\|$ from $x_0$ is less than $\epsilon$:
\[ N_\epsilon(x_0) = \{x \mid \|x, x_0\| < \epsilon\}. \]
Thus in space of one dimension the $\epsilon$-neighborhood of $x_0$ consists of the open interval $x_0 - \epsilon < x < x_0 + \epsilon$; in three-space, $N_\epsilon(x_0)$ consists of the points inside of the sphere of radius $\epsilon$ with center at $x_0$.

EXERCISE. If $x$ is in $R_1$ and $h$ and $k$ are both in $N_\epsilon(x)$, so is every number $y$ between $h$ and $k$.

EXERCISE. Given two points $x_1$, $x_2$ of $R_q$, we say that a point $x$ is on the line-segment joining $x_1$ and $x_2$ if there is a number $t$ between 0 and 1 such that $x = tx_1 + (1 - t)x_2$. Show that if $x_1$ and $x_2$ are both in $N_\epsilon(y)$, so is every point on the line-segment joining $x_1$ and $x_2$. (Use the equation $x - y = t(x_1 - y) + (1 - t)(x_2 - y)$ and the triangle inequality.)

A point $x$ is interior to a set $E$ if it is possible to find a neighborhood* $N_\epsilon(x)$ every point of which belongs to $E$. A point-set $E$ is open if every point $x$ which belongs to $E$ is interior to $E$. For example, let us give the name "open interval" to a point set consisting of all $x$ such that
\[ a^{(i)} < x^{(i)} < b^{(i)}, \qquad i = 1, \dots, q, \]
where the $a^{(i)}$ and $b^{(i)}$ are finite constants for which $a^{(i)} < b^{(i)}$. Then every open interval is an open set. For if $x$ belongs to the interval, each of the numbers $x^{(i)} - a^{(i)}$ and $b^{(i)} - x^{(i)}$ is positive. Denote by $\epsilon$ the smallest of them, and consider the neighborhood $N_\epsilon(x)$. If $x_0$ is in this neighborhood, then
\[ |x_0^{(i)} - x^{(i)}| \le \|x_0, x\| < \epsilon, \qquad i = 1, \dots, q, \]
so $x_0$ is also in the open interval. Hence every open interval is an open set.

* In such a case as this, in which we merely wish to state that there is some neighborhood $N_\epsilon(x)$ with a given property, and the size of $\epsilon$ is of no importance, we shall sometimes write merely $N(x)$ instead of $N_\epsilon(x)$.

If we give the name "closed interval" to $\{x \mid a^{(i)} \le x^{(i)} \le b^{(i)},\ i = 1, \dots, q\}$, where the $a^{(i)}$ and $b^{(i)}$ are finite constants such that $a^{(i)} \le b^{(i)}$, we see that a closed interval is not an open set. For the point $a$ belongs to the closed interval; but every neighborhood of $a$ contains points $x_0$ with $x_0^{(1)} < a^{(1)}$, so that $x_0$ can not belong to the closed interval.

A point $x$ is called an accumulation point of a set $E$ if every neighborhood of $x$ contains infinitely many points of $E$. The point $x$ itself may or may not belong to $E$. For example, every point of an open interval is an accumulation point of the interval. If in one-space we take $E$ to be the set of points $1, \tfrac12, \tfrac13, \dots, 1/n, \dots$, then 0 is an accumulation point of $E$; and in fact we easily verify that it is the only accumulation point of $E$.

The set of all points $x$ which are accumulation points of a set $E$ is called the derived set of $E$, and is denoted by $E'$. The set $E \cup E'$ is the closure of $E$, and is denoted by $\bar{E}$. A set $E$ is closed in case every accumulation point of $E$ is itself a point of $E$; in symbols, if $E' \subset E$. Thus the open interval
\[ \{x \mid a^{(i)} < x^{(i)} < b^{(i)},\ i = 1, \dots, q\} \]
is not a closed set; for the point $a$ is an accumulation point of the interval, but does not belong to the interval. The closed interval is a closed set. For suppose that a point $x$ is not in the interval. Then the above inequalities do not all hold. Suppose, to be specific, that $x^{(1)} < a^{(1)}$. Define $\epsilon = a^{(1)} - x^{(1)}$. For every point $x_0$ in $N_\epsilon(x)$ we have $x_0^{(1)} < a^{(1)}$, so $x_0$ is not in the interval. Hence $x$ can not be an accumulation point of the interval; and since no point outside of the interval is an accumulation point of the interval, the interval is a closed set.
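The argument that every point of an open interval is interior to it can be traced numerically: take as radius the smallest of the numbers x_i - a_i and b_i - x_i, and every point of the resulting neighborhood stays inside the interval. The sketch below is only an illustration in the plane; the interval, the point, the sampling of the neighborhood on a ring, and all function names are assumptions made here.

```python
import math


def dist(x, y):
    """Distance between two points given as tuples of coordinates."""
    return math.sqrt(sum((xi - yi) ** 2 for xi, yi in zip(x, y)))


def in_open_interval(x, a, b):
    """True if a_i < x_i < b_i for every coordinate."""
    return all(ai < xi < bi for xi, ai, bi in zip(x, a, b))


def in_closed_interval(x, a, b):
    """True if a_i <= x_i <= b_i for every coordinate."""
    return all(ai <= xi <= bi for xi, ai, bi in zip(x, a, b))


def radius_inside(x, a, b):
    """Smallest of the numbers x_i - a_i and b_i - x_i."""
    return min(min(xi - ai, bi - xi) for xi, ai, bi in zip(x, a, b))


# Illustrative interval and point in the plane (assumptions, not from the text).
a, b = (0.0, 0.0), (1.0, 2.0)
x = (0.25, 1.5)
eps = radius_inside(x, a, b)

# Sample points of the eps-neighborhood of x: a ring at distance 0.99 * eps.
samples = [
    (x[0] + 0.99 * eps * math.cos(2 * math.pi * k / 16),
     x[1] + 0.99 * eps * math.sin(2 * math.pi * k / 16))
    for k in range(16)
]
all_inside = all(dist(p, x) < eps and in_open_interval(p, a, b) for p in samples)

# The corner a of the closed interval is not interior to it: this point of the
# eps-neighborhood of a has first coordinate below a[0].
escape = (a[0] - eps / 2, a[1])
corner_not_interior = dist(escape, a) < eps and not in_closed_interval(escape, a, b)

print(eps, all_inside, corner_not_interior)
```

Sampling finitely many points is of course no proof; the inequality between a single coordinate difference and the distance is what guarantees the result for every point of the neighborhood.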

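It is by no means true that every set is either open or closed. In one-space, the "half-open interval" $\{x \mid a \le x < b\}$ is neither open nor closed; for no neighborhood of $a$ lies in the set, while $b$ is an accumulation point which does not belong to the set.

We here introduce a notation for intervals in $R_1$. We define
\[ [a, b] = \{x \mid a \le x \le b\}, \qquad [a, b) = \{x \mid a \le x < b\}, \]
\[ (a, b] = \{x \mid a < x \le b\}, \qquad (a, b) = \{x \mid a < x < b\}. \]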

Thus a square bracket at either end connotes that that end is included, a round bracket connotes that it is excluded.

By an interval in $R_q$ we shall mean any non-empty set defined either by inequalities $a^{(i)} \le x^{(i)} \le b^{(i)}$ $(i = 1, \dots, q)$ or by the inequalities obtained by replacing some or all of the signs "$\le$" by the sign "$<$." In $R_q$ there will be $4^q$ different types of intervals. Sometimes it is convenient to use special symbols for four of these, in analogy with the preceding; thus an interval $\{x \mid a^{(i)} \le x^{(i)} < b^{(i)},\ i = 1, \dots, q\}$ can be designated by the abbreviation $[a, b)$.

Let $I$ be an interval defined either by inequalities $a^{(i)} \le x^{(i)} \le b^{(i)}$ or by inequalities obtained by replacing some or all of the signs $\le$ by $<$. It is easy to verify that
\[ \bar{I} = \{x \mid a^{(i)} \le x^{(i)} \le b^{(i)},\ i = 1, \dots, q\}, \]
which is a closed interval; and that the set of interior points of $I$ is the set
\[ \{x \mid a^{(i)} < x^{(i)} < b^{(i)},\ i = 1, \dots, q\}, \]
which is either an open interval or the empty set (the latter in case $a^{(i)} = b^{(i)}$ for some $i$).

If we consider $R_q$ as a subset of itself, we easily see that it is both open and closed, and the same is true of the empty set $\Lambda$. We could show that no other sets in $R_q$ are both open and closed.

It sometimes happens that we are interested only in subsets of some given set $S$, and wish to neglect the rest of the space $R_q$. For instance, let us anticipate slightly and consider a function $f(x)$ defined and continuous on the set $S$ of all rational numbers. Let $E$ be $\{x \mid f(x) > 0,\ x \text{ in } S\}$. This set is surely not open; it can not contain any neighborhood of any point, because it contains no irrational numbers. Yet for every $x_0$ in $E$ there is an $\epsilon > 0$ such that if $x$ is any rational for which $|x - x_0| < \epsilon$, then $|f(x) - f(x_0)| < f(x_0)$. So for such $x$ we have $f(x) > 0$. Thus $N_\epsilon(x_0)$ is not contained in $E$, but all points of $S$ in $N_\epsilon(x_0)$ are in $E$. This suggests the following definition.

If a set $E$ is contained in a set $S$, and for each $x$ in $E$ there is an $\epsilon > 0$ such that $N_\epsilon(x)S \subset E$, then $E$ is said to be open relative to $S$.

The corresponding closure concept is this. If a set $E$ is contained in a set $S$, and all accumulation points of $E$ which belong to $S$ also belong to $E$, then $E$ is closed relative to $S$. That is, $E \subset S$ is closed relative to $S$ if $E'S \subset E$. In particular, if $S$ is $R_q$ itself, these properties take the form of ordinary openness and closedness, since the factor $S$ in $N_\epsilon(x_0)S$ and $E'S$ may be omitted if it is the whole space under consideration.

EXAMPLE. Let $E$ be the set $(0, 1]$; that is, $\{x \mid 0 < x \le 1\}$. This is neither open nor closed. But if $S$ is the set $(0, 2)$, then $E$ is closed relative to $S$. For $E' = [0, 1]$, and $E'S = (0, 1] \subset E$. Also, if $S_0$ is the set $[-1, 1]$, $E$ is open relative to $S_0$. For let $x$ be any point of $E$. If $x < 1$, there is a neighborhood $N_\epsilon(x)$ contained in $E$, so $N_\epsilon(x)S_0 \subset E$. If $x = 1$, take $\epsilon = 1$. The set $N_\epsilon(x)$ is then $(0, 2)$, and $N_\epsilon(x)S_0 = (0, 1] \subset E$.

…

We then have a simple corollary of the Bolzano-Weierstrass theorem, as follows.

2.12. If $x_1, x_2, x_3, \dots$ is an infinite sequence of points of $R_q$, and the set $\{x_1, x_2, \dots\}$ is bounded, then there exists a subsequence $x_{n_1}, x_{n_2}, \dots$ which converges to a limit $x_0$.

Suppose first that some point occurs infinitely many times among the $x_n$. Call the point $x_0$, and let $x_{n_1}, x_{n_2}, \dots$ be the terms of the sequence which coincide with $x_0$. Then clearly $\lim x_{n_i} = x_0$. If no infinite repetitions occur, there must be infinitely many different points among the $x_n$. By 2.11, these have an accumulation point $x_0$. In $N_1(x_0)$ there are infinitely many $x_n$; choose one, and call it $x_{n_1}$. In $N_{1/2}(x_0)$ there are infinitely many $x_n$; some of these must have subscripts greater than $n_1$. Choose one such; this is $x_{n_2}$. In $N_{1/3}(x_0)$ there are infinitely many $x_n$; choose one, $x_{n_3}$, with subscript greater than $n_2$. We continue the process; by an obvious induction, we obtain a sequence $x_{n_1}, x_{n_2}, \dots$ such that $n_1 < n_2 < \cdots$ and $\|x_{n_i}, x_0\| < 1/i$. This is the subsequence sought.

2.13. Let $E$ be a closed set. If $x_1, x_2, \dots$ is a sequence of points of $E$ converging to a limit $x_0$, then $x_0$ is in $E$.

If $x_0$ is not in $E$, it is not in $\bar{E}$, by 2.4. By 2.2, for some positive $\epsilon$ the neighborhood $N_\epsilon(x_0)$ contains no point of $E$, hence contains none of the $x_n$. This is impossible if $\lim x_n = x_0$.

We now establish a very useful property of sequences of closed sets.

2.14. Let $E_1, E_2, \dots$ be a shrinking sequence of closed sets (that is, $E_1 \supset E_2 \supset E_3 \supset \cdots$), and let $E_1$ be bounded. Then either there is a $k$ such that the sets $E_k, E_{k+1}, \dots$ are all empty, or else there is a point $x_0$ contained in all the sets $E_i$, that is, $x_0$ is in $\bigcap E_i$.

If any one set $E_k$ is empty, then from the hypothesis $E_k \supset E_{k+1} \supset \cdots$ we have all succeeding sets empty. Otherwise, from each $E_n$ we choose a point $x_n$. The points $\{x_1, x_2, \dots\}$ form a bounded set, for they are all in $E_1$, which is bounded by hypothesis. So by 2.12 there is a subsequence $x_{n_1}, x_{n_2}, \dots$ which converges to a limit point $x_0$. Choose any integer $n$. Only a finite number of the subscripts $n_i$ are less than $n$; say $n \le n_k < n_{k+1} < \cdots$. But if $n_i \ge n$, then $E_{n_i} \subset E_n$, and in particular $x_{n_i}$ is in $E_n$. Hence the sequence $x_{n_k}, x_{n_{k+1}}, \dots$ is a sequence of points of the closed set $E_n$, and by 2.13 the point $x_0$ is in $E_n$. Since this is true for each $n$, the theorem is proved.

3. A class $E$ of objects $\{P\}$ is denumerable if its elements can be put into one-to-one correspondence with the positive integers; that is, if its elements can be arranged in an infinite sequence $P_1, P_2, P_3, \dots$, in such a way that every element of $E$ occurs in just one place in the sequence.
Thus the set of positive integers is denumerable, being automatically in one-to-one correspondence with itself. Likewise the set of numbers 1, i, i, • • • l/n, • • • is denumerable. The next three theorems state important properties of denumerable sets. 3.1. Every subset of a denumerable set is finite or denumerable. Let E be the set consisting of Pi, P 2 , • • • , and let E* be a subset. Beginning with Ph we examine each P; until we come to an element P ,


belonging to $E^*$. We re-name this element $P_1^*$. Continuing from this element, we proceed until we reach another which belongs to $E^*$; this we re-name $P_2^*$. By repeating the process indefinitely, each element of $E^*$ will be reached after a finite number of steps. If the process terminates, $E^*$ is finite. If not, the elements of $E^*$ are exhibited in a sequence $P_1^*, P_2^*, \dots$, and $E^*$ is denumerable.

3.2. The sum of a finite or denumerable collection of sets, each of which is finite or denumerable, is itself finite or denumerable.

We consider first the case of a denumerable collection $E_1, E_2, \dots$ of sets, each of which is denumerable. Let the elements of $E_i$ be $P_{i,1}, P_{i,2}, P_{i,3}, \dots$. We now form the sequence

(*) $P_{1,1};\ P_{2,1}, P_{1,2};\ P_{3,1}, P_{2,2}, P_{1,3};\ \dots;\ P_{n,1}, P_{n-1,2}, P_{n-2,3}, \dots, P_{1,n};\ \dots$.

Schematically we can exhibit this in the form

[Diagram: the array of elements $P_{i,j}$, traversed along its successive diagonals.]

Clearly each element $P_{n,k}$ of $\bigcup E_i$ will be reached after a finite number of steps. However, it is possible that the same element $P$ may have occurred in two different sequences, so that repetitions may occur in the sequence (*). We eliminate this by examining the elements of (*) in order, and removing any element which has previously occurred. Thus all the elements of $\bigcup E_i$ are exhibited without repetition in a sequence. This sequence can not be finite, for it contains all the elements of $E_1$, so by 3.1 $\bigcup E_i$ is denumerable.

This scheme can be applied with simple modifications to the cases of a finite set of denumerable sets and of a denumerable set of finite sets (the case of finite sets of finite sets is trivial). However, no additional proof is really needed, for we can fill out each finite sequence $P_{i,1}, P_{i,2}, \dots, P_{i,k}$ to an infinite sequence by repeating the last term infinitely


many times, and if there are only a finite number of sets we repeat the last set infinitely many times. The extra terms thus introduced are later removed when we strike repetitions out of the sequence (*).

3.3. Let $E_1, \dots, E_k$ be sets each of which is finite or denumerable. Then the collection of ordered $k$-tuples $(x_1, \dots, x_k)$ with $x_1$ in $E_1$, $\dots$, $x_k$ in $E_k$, is finite or denumerable.

We prove this first for $k = 2$. For each fixed $x_1$, the pairs $(x_1, x_2)$ are in one-to-one correspondence with the elements $x_2$ of $E_2$; that is, to each $x_1$ corresponds a finite or denumerable collection of pairs $(x_1, x_2)$. There are as many such collections of pairs as there are elements $x_1$ of $E_1$; that is, there are finitely or denumerably many collections. By 3.2, the total collection of pairs is finite or denumerable.

For general $k$, we proceed by induction. Suppose the theorem proved for $k - 1$; the $(k-1)$-tuples $(x_1, \dots, x_{k-1})$ form a finite or denumerable collection. The $k$-tuples $(x_1, \dots, x_k)$ can be regarded as pairs $((x_1, \dots, x_{k-1}), x_k)$ in which the first element is a $(k-1)$-tuple and the second element is $x_k$. By the preceding paragraph, the collection of such pairs (i.e., of $k$-tuples $(x_1, \dots, x_k)$) is finite or denumerable.

3.4. The set of all rational numbers is denumerable.

For each positive integer $n$, let $E_n$ be the set consisting of the numbers $0, 1/n, -1/n, 2/n, -2/n, \dots$. Each set $E_n$ is denumerable, being exhibited as an infinite sequence without repetitions. Hence by 3.2 the set $\bigcup E_n$ is finite or denumerable. The set of all rational numbers is contained in $\bigcup E_n$, so by 3.1 it is finite or denumerable. It is not finite, since it contains all the integers; so it is denumerable.

EXAMPLE 1. In $q$-dimensional space $R_q$ let us give the name "rational points" to the points $(x^{(1)}, \dots, x^{(q)})$ such that each coordinate $x^{(i)}$ is a rational number. By 3.4 and 3.3, the rational points form a denumerable subset of $R_q$.

EXAMPLE 2. In $R_q$ the spheres $N_r(x_0)$ with rational points $x_0$ as centers and rational radii $r$ form a denumerable set. For they are in one-to-one correspondence with the $(q+1)$-tuples $(x_0^{(1)}, \dots, x_0^{(q)}, r)$, where the first $q$ numbers are the (rational) coordinates of $x_0$ and $r$ is the (positive rational) radius.

We can use the result in this last example to help in proving the Borel theorem. Given a collection of open sets $\{U\}$ and a set $E$, we say that the collection $\{U\}$ covers $E$ if each point of $E$ belongs to at least one set $U$ of the collection: $E \subset \bigcup U$. The Borel theorem is the following.


3.5. Let $\{U\}$ be a collection of open sets, and let $E$ be a bounded closed set. If the collection $\{U\}$ covers $E$, it is possible to select a finite number $U_1, \dots, U_n$ of sets of the collection $\{U\}$ in such a way that $\{U_1, \dots, U_n\}$ covers $E$.

We begin with the collection of all spheres with rational centers and radii; this is denumerable, by Example 2 above. A sphere of this collection will be selected if it is entirely within some single set of the collection $\{U\}$. By 3.1, the selected spheres form a finite or denumerable collection, and it is easily seen that the collection is not finite. We can then arrange the selected spheres in a sequence $S_1, S_2, \dots$.

Our next step is to show that $\{S_1, S_2, \dots\}$ covers $E$. Let $x_0$ be a point of $E$. By hypothesis $x_0$ is in a set $U_0$ of the collection $\{U\}$; since $U_0$ is open, there is a positive $\epsilon$ such that $N_\epsilon(x_0)$ is contained in $U_0$. There is a rational point $\bar{x}$ such that $\|\bar{x}, x_0\| < \epsilon/3$, and there is a rational number $r$ such that $\epsilon/3 < r < \epsilon/2$. Clearly $N_r(\bar{x}) \subset U_0$, since for each point $x$ in $N_r(\bar{x})$ we have $\|x, x_0\| \le \|x, \bar{x}\| + \|\bar{x}, x_0\| < r + \epsilon/3 < \epsilon$. Hence $N_r(\bar{x})$ is a selected sphere, and is one of the $S_n$. But $x_0$ is in this $S_n$, for $\|x_0, \bar{x}\| < \epsilon/3 < r$, so that $x_0$ is in $N_r(\bar{x})$. Thus $\{S_1, S_2, \dots\}$ covers $E$.

Now we define

$E_1 = E$, $E_2 = E \cap CS_1$, $\dots$, $E_n = E \cap C[S_1 \cup \cdots \cup S_{n-1}]$, $\dots$.

Each set $S_1 \cup \cdots \cup S_{n-1}$ is open by 2.10, hence $E_n$ is closed by 2.6 and 2.10. It is easily seen that $E_1 \supset E_2 \supset \cdots$. No point $x_0$ is in all the $E_n$, for then it would be in $E$ but not in any $S_n$, which we have seen to be impossible. By 2.14, some set $E_{k+1}$ is empty. That is, no point in $E$ is in $C[S_1 \cup \cdots \cup S_k]$, whence every point of $E$ is in $S_1 \cup \cdots \cup S_k$. By the manner of selecting the $S_n$, each selected sphere $S_i$ is completely contained in some set (we denote it by $U_i$) of the collection $\{U\}$. Hence

$E \subset S_1 \cup \cdots \cup S_k \subset U_1 \cup \cdots \cup U_k$,

establishing the theorem.

Our definition of a closed interval was such as to permit each $a^{(i)}$ to be equal to the corresponding $b^{(i)}$, in which case the interval shrinks to a single point. With


the exception of this special case, every interval (open or closed) contains a non-denumerable infinity of points. It is enough to prove this for closed intervals, since every open interval $a^{(i)} < x^{(i)} < b^{(i)}$ contains the closed interval $a^{(i)} + \epsilon \le x^{(i)} \le b^{(i)} - \epsilon$ whenever $\epsilon$ is positive and less than half the smallest of the numbers $b^{(i)} - a^{(i)}$. It is enough to prove it for one dimension, for every such closed interval contains a set of points in which one coordinate ranges over a closed interval $[a, b]$ with $a < b$ while the remaining coordinates are held fixed. It is enough to prove it for the single interval $[0, 1]$, for every interval $[a, b]$ can be mapped on this by the one-to-one mapping $x \to (x - a)/(b - a)$.

Suppose now that all the numbers in $[0, 1]$ were a denumerable set $a_1, a_2, \dots$. Expand each $a_n$ in a decimal fraction $a_n = .a_{n,1}a_{n,2}a_{n,3}\cdots$. For each $n$, define $b_n = 5$ if $a_{n,n} \ne 5$, $b_n = 6$ if $a_{n,n} = 5$. Thus $b_n \ne a_{n,n}$. Then the number $.b_1b_2\cdots$ does not occur among the $a_n$; it is different from each $a_n$, since it has a different digit in the $n$-th decimal place.* Hence the assumption that all the numbers of the interval $[0, 1]$ can be arranged in a sequence leads to a contradiction, and the set is not denumerable.

* We must remember that numbers with different digits may be equal, for $.a_1a_2 \cdots a_n9999\cdots = .a_1a_2 \cdots (a_n + 1)0000\cdots$. But the decimal expression $.b_1b_2\cdots$ can not contain any 9's or 0's, so this exceptional possibility for equality between $.b_1b_2\cdots$ and $.a_{n,1}a_{n,2}\cdots$ is ruled out.

We now return to the subject of point sets, and prove that in a certain sense open sets are the next simplest sets after intervals. It is convenient to define a new term now. A collection of intervals is non-overlapping if no point which belongs to any interval of the collection is interior to any other interval of the collection. Thus in $R_1$ the intervals $[0, 1]$ and $[1, 2]$ are non-overlapping, although they are not disjoint because they have the point 1 in common.

3.6. If $G$ is an open set, it is the sum of a denumerable collection $I_1, I_2, \dots$ of disjoint intervals $I_n$: $G = \bigcup I_n$. It is also the sum of a denumerable number of non-overlapping closed intervals $\bar{I}_n$.

Consider first all intervals of the form $n^{(i)} \le x^{(i)} < n^{(i)} + 1$ $(i = 1, \dots, q)$, where the numbers $n^{(i)}$ are integers. From these we select the ones whose closures are contained in $G$; these selected intervals we will call "the intervals selected at the first stage." Next consider all those intervals of the form $n^{(i)}/2 \le x^{(i)} < (n^{(i)} + 1)/2$, the $n^{(i)}$ being integers. We select those whose closures are contained in $G$, but which are not contained in any interval selected at the first stage. These new intervals we will call "the intervals selected at the second stage." Proceeding by induction, for each integer $p$ we consider the intervals $n^{(i)}/2^p \le x^{(i)} < (n^{(i)} + 1)/2^p$, the $n^{(i)}$ being integers. From these


we select intervals ("the intervals selected at the $(p + 1)$st stage") whose closures are contained in $G$ but which are not contained in any interval selected at a previous stage. All the intervals constructed at each stage form a denumerable collection by 3.3, so the intervals selected at each stage form a finite or denumerable set by 3.1, and the aggregate of all intervals selected at any stage whatever is finite or denumerable. We shall shortly see that it cannot be finite.

Let the collection of all selected intervals be denoted by $I_1, I_2, \dots$. From the construction it is clear that they are disjoint; for two intervals of the type constructed are disjoint unless one contains the other. Moreover, each $I_n$ is contained in $G$, so $\bigcup I_n$ is contained in $G$. It remains to show that $G$ is contained in $\bigcup I_n$; that is, if $x_0$ is not in $\bigcup I_n$ it is not in $G$.

Let $x_0$ not belong to $\bigcup I_n$. For each positive integer $p$, it is in some interval (call it $I_p^*$) of the $p$-th stage of subdivision, and this is not contained in any selected interval, since $x_0$ is not in $\bigcup I_n$. Since $I_p^*$ is not in any interval selected at a previous stage and is not itself selected, its closure fails to be contained in $G$. That is, its closure must contain a point $x_p$ of $CG$. By the definition of distance in §2 and the assumption made about the size of $I_p^*$,

$\|x_p, x_0\| \le [q \cdot (2^{-p+1})^2]^{1/2}$,

which tends to 0 as $p$ increases. Thus $x_p$ tends to $x_0$ by the definition preceding 2.12. But $CG$ is closed by 2.6, so by 2.13 the point $x_0$ is in $CG$. Therefore $\bigcup I_n$ contains $G$, and being also contained in $G$ must be the same as $G$.

If we define $\bar{I}_n$ to be the closure of $I_n$, it is evident from the construction that no point of any $\bar{I}_n$ is interior to any other $\bar{I}_m$, and so the intervals $\bar{I}_n$ are non-overlapping. Since $\bar{I}_n$ contains $I_n$, we have $\bigcup \bar{I}_n \supset \bigcup I_n = G$. But the $I_n$ were so chosen that their closures $\bar{I}_n$ were contained in $G$, so $\bigcup \bar{I}_n$ is contained in $G$, and must be identical with $G$.

Incidentally, this shows that the $\bar{I}_n$ cannot be finite in number. For suppose the $\bar{I}_n$ finite in number. Their sum contains a point for which $x^{(1)}$ has the least value on $\bigcup \bar{I}_n$. This point cannot be interior to $\bigcup \bar{I}_n$, and yet lies in the open set $G = \bigcup \bar{I}_n$.

As a corollary, we have the following theorem.

3.7. If $G$ is open relative to a closed interval $I$, it is the sum of a finite or denumerable set of non-overlapping closed intervals $I_1, I_2, \dots$.

The set $I - G$ is closed by 2.5 and 2.7. So by 2.6 its complement $C(I - G)$ is open. But $C(I - G) = CI \cup G$. By 3.6, the open set $G \cup CI$ is the sum of denumerably many non-overlapping closed

intervals $\bar{I}_n$. Then $G = (G \cup CI) \cap I = \bigcup (\bar{I}_n \cap I)$.

From the sum in the last member we omit all sets $\bar{I}_n \cap I$ which are empty; the rest form a finite or denumerable collection of non-overlapping closed intervals whose sum is $G$.

EXERCISE. If $E$ is contained in $S$ and is open relative to $S$, there is a sequence $I_1, I_2, \dots$ of intervals such that $E = S \cap (\bigcup I_n)$.

4. According to the definition of E. H. Moore, a function is a system consisting of (a) a class $\mathfrak{X}$ of elements $x$; (b) a class $\mathfrak{Y}$ of elements $y$; (c) a law of correspondence by which to each element $x$ of $\mathfrak{X}$ there is assigned exactly one element $y$ of the class $\mathfrak{Y}$. (Such a function is sometimes also called a mapping of $\mathfrak{X}$ into $\mathfrak{Y}$.) The element $y$ corresponding to the element $x$ is denoted by $f(x)$, or $g(x)$, or some other similar symbol.

Properly, the symbol $f(x)$ denotes the functional value $y$ corresponding to the particular element $x$. However, it is common usage to use the symbol $f(x)$ also to denote the function itself. Thus $\sin x$ may mean either the function in which $\mathfrak{X}$ is the class of real numbers and $\mathfrak{Y}$ is also the class of real numbers, and to each $x$ there is assigned the number $\sin x$; or else it may mean the particular number $\sin x$ which corresponds to some specific number $x$. However, this ambiguity does not often lead to confusion.

The set of all values of $f(x)$ when $x$ varies over $E \subset \mathfrak{X}$ will be denoted by $f(E)$. In symbols, $f(E) = \{y \mid y = f(x),\ x \text{ in } E\}$. In particular, if $E$ is the set of points for which statement $S$ holds, the set $f(E)$ will be alternatively denoted by $\{f(x) \mid S\}$. Thus, for example, if $f(x) = x^2$ we have […]. The set of all $x$ such that $f(x)$ lies in $E$ (the "inverse image of $E$") will be denoted by $f^{-1}(E)$; thus, $f^{-1}(E) = \{x \mid f(x) \text{ in } E\}$. We write $f \le g$ if $f(x) \le g(x)$ for all $x$ in $\mathfrak{X}$. Also $c \cdot f$ is the function whose value at the point $x$ is $c \cdot f(x)$, and so on (cf. the similar definitions for points in $R_q$ at the beginning of §1).

We shall be chiefly (though not exclusively) concerned with functions in which the class $\mathfrak{X}$ is a point set in $R_q$. For the class $\mathfrak{Y}$ we shall always use a slight enlargement (which we denote by $R^*$) of the class of real numbers; $R^*$ shall consist of all real numbers plus two elements

which we denote by $+\infty$ and $-\infty$. We shall use the following calculation rules for these new symbols:

[The displayed table of rules is not recoverable from the source.]

The last rule may seem strange, but functions with values $\pm\infty$ will usually occur in connection with limiting processes of such a nature that the rule will prove convenient.

4.1. For all numbers $a$, $b$ in $R^*$ the sum $a + b$ is defined (finite or infinite) unless $a = +\infty$ and $b = -\infty$, or $a = -\infty$ and $b = +\infty$.

A simple and frequently used result is

4.2. If $a$, $b$, $c$, $d$ are in $R^*$, and $a \le b$ and $c \le d$, then $a + c \le b + d$, provided that the sums are defined.

This is clear if $a$, $b$, $c$, $d$ are all finite. Otherwise either $a$ or $c$ is $-\infty$, in which case $a + c = -\infty \le b + d$; or else $b$ or $d$ is $+\infty$, in which case $a + c \le +\infty = b + d$.

In order to avoid verbosity it is convenient to define neighborhoods $N_\epsilon(\infty)$ and $N_\epsilon(-\infty)$. We define $N_\epsilon(\infty) = \{x \mid x > 1/\epsilon\}$ and $N_\epsilon(-\infty) = \{x \mid x < -1/\epsilon\}$. With this definition we see that

4.3. If x, y, z are any numbers (finite or infinite) in R*. then (a) if x is in N,(y), then x is in Nt if x is in (y) and y is in V Np(z), and a and then x is in

To prove (a), if y is finite then x so y x - y If y then x so x x and y y, and x is in N, ! y then x so if y and x is in N t i y N To prove (b), we observe that it is obvious if any two of the numbers x, y, z are equal. Suppose therefore that they are different. If z is finite, then y - 2 and y is finite; so x - y Therefore \ x —z and x is in If X y y - z then y and Therefore x A simple calculation shows that and /3 are


positive numbers not greater than $\tfrac12$. Hence x , and x is in A like discussion applies to the case $z = -\infty$.

In accordance with this definition, $\infty$ is an accumulation point of a set $E$ if $E$ contains arbitrarily large numbers $x$; that is, if for every $\epsilon > 0$ there are real numbers $x$ in $E$ such that $x > 1/\epsilon$.

Suppose now that $f(x)$ is defined on a point set $E$ in $R_q$ or in $R^*$, and that $x_0$ is an accumulation point of $E$. Then as usual we say that $f(x)$ approaches the number $k$ as $x$ tends to $x_0$ if for every $\epsilon > 0$ there is a $\delta > 0$ such that $f(x)$ is in $N_\epsilon(k)$ for all points $x$ of $E$ in $N_\delta(x_0)$, with the possible exception of $x_0$ itself. More compactly,

4.4. Let $f(x)$ be defined on a set $E$, and let $x_0$ be a point of $E'$. Then $\lim_{x \to x_0} f(x) = k$ is defined to mean that to every $\epsilon > 0$ there corresponds a $\delta > 0$ such that $f(x)$ is in $N_\epsilon(k)$ for all $x$ in $E N_\delta(x_0) - (x_0)$.*

EXAMPLE.

Let $f(x)$ … for all $x \ne 0$. Then … $0$; for $f(x)$ is in $N_\epsilon(0)$ (i.e., …) whenever $x$ is in the neighborhood …. Likewise, $\lim f(x)$ ….

In particular, if $E$ is the set of positive integers, it is customary to denote the independent variable by $n$ instead of $x$ and to write $f_n$ instead of $f(n)$. For this case definition 4.4 takes the following form. In order that $\lim_{n \to \infty} f_n = k$ it is necessary and sufficient that to every $\epsilon > 0$ there shall correspond an integer $n_\epsilon$ such that $f_n$ is in $N_\epsilon(k)$ whenever $n > n_\epsilon$. If we write $n_\epsilon = 1/\delta$, this differs from 4.4 only in that $\delta$ is required to be the reciprocal of an integer; and this restriction on $\delta$ is easily seen to be immaterial. Thus our definition includes the familiar definition of the limit of a sequence.

Both 4.4 and the re-phrasing for sequences can be applied at once to functions in which the functional values $f(x)$ and the limit $k$ are in the $q$-dimensional space $R_q$. Thus, for instance, we have incidentally defined limits such as $\lim_{n \to \infty} x_n = x_0$, where $x_n$ and $x_0$ are in $R_q$; the definition reduces exactly to that in §2 (the paragraph preceding 2.12).

* In the notation used here, $E N_\delta(x_0) - (x_0)$ is the set of all points $x$ which belong both to $E$ and to $N_\delta(x_0)$, but not to the set $(x_0)$ consisting of the single point $x_0$.
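As an illustration of the sequence form of 4.4 (a minimal sketch, not part of the original text), take $f_n = 1/n$ and $k = 0$. The Python names `n_eps`, `in_neighborhood` and `tail_stays_in_neighborhood` are hypothetical helpers introduced only for this example, and exact rational arithmetic is assumed so that the strict inequalities are tested without rounding. A program can inspect only finitely many indices, so the check supports the definition but does not replace the proof.

```python
from fractions import Fraction
import math

def n_eps(eps):
    """Return an integer n_eps such that 1/n < eps for every integer n > n_eps."""
    # n > floor(1/eps) forces n > 1/eps, hence 1/n < eps.
    return math.floor(1 / eps)

def in_neighborhood(value, center, eps):
    """Membership in N_eps(center) for a finite center: |value - center| < eps."""
    return abs(value - center) < eps

def tail_stays_in_neighborhood(eps, span=1000):
    """Check f_n = 1/n against N_eps(0) for the first `span` indices beyond n_eps."""
    start = n_eps(eps) + 1
    return all(in_neighborhood(Fraction(1, n), 0, eps)
               for n in range(start, start + span))

for eps in (Fraction(1, 2), Fraction(1, 10), Fraction(3, 1000)):
    print(eps, n_eps(eps), tail_stays_in_neighborhood(eps))
```

For $\epsilon = 1/10$ the sketch gives $n_\epsilon = 10$, and the term $f_{10} = 1/10$ itself is not in $N_\epsilon(0)$, which is why the definition asks only for $n > n_\epsilon$.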


We can use the notion of convergent sequences to give another formulation to 4.4 which is of frequent usefulness.

4.5. Let $f(x)$ be defined on $E$ and let $x_0$ be in $E'$. Then $\lim_{x \to x_0} f(x)$ exists and has the value $k$ if and only if for every sequence $x_1, x_2, \dots$ of points of $E$ distinct from $x_0$ and tending to $x_0$ the limit $\lim_{n \to \infty} f(x_n)$ exists and has the value $k$.

Suppose $\lim f(x) = k$, and let $\epsilon$ be a positive number. There is a $\delta > 0$ such that $f(x)$ is in $N_\epsilon(k)$ if $x$ is in $E N_\delta(x_0) - (x_0)$. Let $x_1, x_2, \dots$ be a sequence of the type described. Since $x_n \to x_0$, there is an $n_\delta$ such that $x_n$ is in $N_\delta(x_0)$ if $n > n_\delta$. But $x_n \ne x_0$ and $x_n$ is in $E$, so for $n > n_\delta$ the point $x_n$ is in $E N_\delta(x_0) - (x_0)$ and $f(x_n)$ is in $N_\epsilon(k)$. Therefore $\lim f(x_n) = k$.

Suppose it is not true that $\lim f(x) = k$. Then there is an $\epsilon > 0$ such that no $\delta$ serves; that is, in every set $E N_\delta(x_0) - (x_0)$, no matter what $\delta$ is, there is a point $x$ for which $f(x)$ is not in $N_\epsilon(k)$. Let $\delta$ take on the values $1, \tfrac12, \tfrac13, \dots$. For each of these values $\delta = 1/n$ we choose an $x_n$ in $E N_{1/n}(x_0) - (x_0)$ such that $f(x_n)$ is not in $N_\epsilon(k)$. Then $x_n \to x_0$, and $f(x_n)$ does not converge to $k$.

5. Let $S$ be a set (not the empty set) of real numbers. A number* $M$ is an upper bound for the numbers $x$ of $S$ in case $x \le M$ for every $x$ in $S$. $M$ is the supremum or least upper bound (abbreviated sup)† of the numbers of $S$ if: (a) for every $x$ in $S$ the inequality $x \le M$ holds; (b) for every number $m < M$, there exists a number $x$ in $S$ such that $x > m$. Property (a) asserts that $M$ is an upper bound, property (b) that no smaller number is an upper bound. Analogously, $N$ is a lower bound for the numbers of $S$ in case $x \ge N$ for all $x$ in $S$; and $N$ is the infimum or greatest lower bound (abbreviated inf) if (a) $x \ge N$ for every $x$ in $S$; (b) for every number $n > N$, there exists a number $x$ in $S$ such that $x < n$.

A fundamental property, "Dedekind continuity," of the system of real numbers is that every non-empty set of numbers which has a (finite) upper bound has a (finite) sup. By change of sign, a similar statement holds for lower bounds. Since we allow $\infty$ and $-\infty$ as numbers, this becomes: Every non-empty set of numbers in $R^*$ has a sup and an inf. If the collection $S$ is finite the terms maximum and minimum (abbreviated max and min) are sometimes used.

* We remember that $\infty$ and $-\infty$ are allowed as numbers.

† Many authors use the notation least upper bound and greatest lower bound (abbreviated l.u.b. and g.l.b.) for supremum and infimum.
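The two defining properties of the supremum can be tried out on small sets (a minimal sketch, not part of the original text). It assumes that Python floats, with `math.inf` standing in for $\infty$, model the numbers of $R^*$; the function names and the list `probes` are hypothetical. Property (b) quantifies over every $m < M$, so a program can test it only at the finitely many probes supplied.

```python
import math

def is_upper_bound(M, S):
    """Property (a): every x in S satisfies x <= M."""
    return all(x <= M for x in S)

def no_smaller_bound(M, S, probes):
    """Property (b), tested only at the given probes m < M: some x in S exceeds m."""
    return all(any(x > m for x in S) for m in probes if m < M)

def is_sup(M, S, probes):
    """True when M passes (a) and passes (b) at every probe below M."""
    return is_upper_bound(M, S) and no_smaller_bound(M, S, probes)

S = [1.5, 2.0, 2.75]
T = [1.0, math.inf]          # a set in R* containing the element +infinity

print(is_sup(2.75, S, [0.0, 2.5, 2.7]))     # 2.75 is an upper bound and no probe below it is
print(is_sup(3.0, S, [2.9]))                # 3.0 is an upper bound, but so is the probe 2.9
print(is_sup(math.inf, T, [10.0, 1e300]))   # the sup of T is +infinity
```

For a finite set the supremum is simply its maximum, in agreement with the remark above on the terms max and min.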


From the definitions it is clear that if $M_0$ is an upper bound for $S$, then $M_0 \ge \sup S$; likewise, if $m_0$ is a lower bound for $S$, then $m_0 \le \inf S$. A simple but important property of these bounds is

5.1. Let $S$, $S^*$ be two non-empty sets of numbers in $R^*$ and $S^* \subset S$. Put $m = \inf S$, $m^* = \inf S^*$, $M^* = \sup S^*$, $M = \sup S$. Then $m \le m^* \le M^* \le M$.

(Observe that this contains the statement that for the arbitrary non-empty set $S$ the relation $M \ge m$ holds.) We prove first $M^* \ge m^*$. Let $x$ be any number in $S^*$; then $M^* \ge x$ and $x \ge m^*$. For all $x$ in $S$, and a fortiori for all $x$ in $S^*$, the inequalities $x \le M$ and $x \ge m$ hold. Hence $M$, $m$ are upper and lower bounds for the $x$ in $S^*$. Therefore the least upper bound $M^*$ can not exceed $M$, and likewise $m^* \ge m$.

EXAMPLES. If $S$ consists of all rational numbers $x$ such that $1 < x < 3$, then $\inf S$ and $\sup S$ are 1 and 3 respectively. If $S$ is the class of all positive integers, then $\inf S$ is 1 and $\sup S$ is $\infty$. If $S$ consists of all integers, then $\inf S$ is $-\infty$ and $\sup S$ is $\infty$. If $S$ consists of the single number $a$, then $\inf S$ and $\sup S$ are both $a$.

5.2. If of two numbers $a$, $b$ we know that $a \le x$ for all numbers $x > b$, then $a \le b$. Likewise, if $a \ge x$ for all $x < b$, then $a \ge b$.

We prove the first statement; the second is similar. This is really a special case of 5.1; for the set $S^*$ of numbers $x > b$ is contained in the set $S$ of numbers $x \ge a$, so $\inf S^* = b$ is $\ge \inf S = a$. However, we need not refer to 5.1. For if the theorem were false and $a > b$, there would be an $h$ such that $a > h > b$. Then $h > b$ but $a > h$, contrary to hypothesis.

Let $f(x)$ be a function defined on a set $E$. By a lower (upper) bound for $f$ on $E$ we shall mean a lower (upper) bound of the set $f(E)$. We shall also sometimes write $\sup f$ (on $E$) or $\sup f(x)$ for $\sup f(E) = \sup \{f(x) \mid x \text{ in } E\}$, and $\inf f$ for $\inf f(E)$. When there are several variables involved, for instance when we are studying a set of functions $f_1(x), f_2(x), \dots$, we shall occasionally write a subscript on "sup" and "inf" to indicate the collection whose bounds we are seeking. Thus $\sup_i f_i(x)$ denotes the supremum of the numbers $f_1(x), f_2(x), \dots$ at a particular point $x$, while $\sup_x f_i(x)$ denotes $\sup \{f_i(x) \mid x \text{ in } E\}$ (for a fixed $i$). The symbol $\sup (f_1, \dots, f_n)$ denotes that function whose value at each $x$ is the greatest of the numbers $f_1(x), \dots, f_n(x)$.

5.3. If $f(x)$ is defined on $E$ and $E^* \subset E$, then $\inf f(E) \le \inf f(E^*) \le \sup f(E^*) \le \sup f(E)$.


Since $f(E^*) \subset f(E)$ our theorem follows at once from 5.1.

5.4. If $f$ and $g$ are both defined on $E$ and $f \le g$, then $\sup f \le \sup g$ and $\inf f \le \inf g$.

For all $x$ in $E$ we have

$\inf f \le f(x) \le g(x) \le \sup g$.

So $\inf f$ is a lower bound for $g$ and hence $\inf f \le \inf g$. Similarly $\sup g$ is an upper bound for $f$ and hence $\sup g \ge \sup f$.

5.5. If $f(x)$ is defined on $E$, and $0 \le c < \infty$, then: (a) $\sup cf = c \sup f$; (b) $\inf cf = c \inf f$; while if $-\infty < c < 0$, then (c) $\sup cf = c \inf f$; (d) $\inf cf = c \sup f$.

For $c = 0$ both (a) and (b) reduce to $0 = 0$. If $c < 0$, then for all $x$ we have $f(x) \le \sup f$, so $cf(x) \ge c \sup f$. That is, $c \sup f$ is a lower bound for $cf(x)$. If $h > c \sup f$, then $c^{-1}h < \sup f$. Therefore there is an $x$ in $E$ such that $f(x) > c^{-1}h$, whence $cf(x) < h$. Hence if $h > c \sup f$, then $h$ is not a lower bound for $cf$, and so (d) is proved. To prove (c) we use (d):

$c \inf f = c \inf (c^{-1} \cdot cf) = c \cdot c^{-1} \sup cf = \sup cf$.

Suppose $c > 0$. We write $c = (-1)(-c)$; then by (c) and (d)

$\sup cf = \sup [(-c)(-1)f] = -c \inf (-1)f = (-c)(-1) \sup f$,
$\inf cf = \inf [(-c)(-1)f] = -c \sup (-1)f = (-c)(-1) \inf f$,

establishing (a) and (b).

5.6. If the functions $f_1(x)$ and $f_2(x)$ and their sum $f_1(x) + f_2(x)$ are defined (cf. 4.1) on a set $E$, then

$\inf [f_1 + f_2] \ge \inf f_1 + \inf f_2$,
$\sup [f_1 + f_2] \le \sup f_1 + \sup f_2$,

provided that the right members of these inequalities are defined (cf. 4.1).

For each $x$ in $E$ we have

$f_1(x) \le \sup f_1$, $f_2(x) \le \sup f_2$;

hence by 4.2

$f_1(x) + f_2(x) \le \sup f_1 + \sup f_2$.

Thus $\sup f_1 + \sup f_2$ is an upper bound for $f_1(x) + f_2(x)$, and can not be less than the least upper bound. This establishes the second


inequality. The first is obtained from the second by replacing $f_1$ and $f_2$ by $-f_1$ and $-f_2$ and using 5.5.

EXAMPLE. Let $E$ be any (non-empty) set. For each $x$ the distance $\|x, \bar{x}\|$ is a function of $\bar{x}$ defined on $E$. Define $d(x)$ (the distance from $x$ to the set $E$) to be $\inf \{\|x, \bar{x}\| \mid \bar{x} \text{ in } E\}$. Since 0 is a lower bound for $\|x, \bar{x}\|$, we have at once $d(x) \ge 0$. This function is continuous. For if $x_1$, $x_2$ are any two points, we have by the triangle inequality

$\|x_1, \bar{x}\| \le \|x_1, x_2\| + \|x_2, \bar{x}\|$

for all $\bar{x}$ in $E$. Then by 5.4 $d(x_1) \le \|x_1, x_2\| + d(x_2)$. But we may interchange $x_1$ and $x_2$ and thus get $d(x_2) \le \|x_1, x_2\| + d(x_1)$; that is, $|d(x_1) - d(x_2)| \le \|x_1, x_2\|$. So $|d(x_1) - d(x_2)| < \epsilon$ whenever $\|x_1, x_2\| < \epsilon$. This proves in fact that $d(x)$ is uniformly continuous.*

This function $d(x)$ has the value 0 if and only if $x$ is in $\bar{E}$. If $x$ is in $\bar{E}$, for every $\epsilon > 0$ there is a point $\bar{x}$ of $E$ in $N_\epsilon(x)$, by 2.2. Then $\|x, \bar{x}\| < \epsilon$, so $d(x) \le \|x, \bar{x}\| < \epsilon$. Since this is true for every $\epsilon > 0$, by 5.2 we have $d(x) \le 0$. But $d(x)$ is not negative, so $d(x) = 0$. Conversely, if $d(x) = 0$, for every $\epsilon > 0$ there is a point $\bar{x}$ of $E$ such that $\|x, \bar{x}\| < \epsilon$. That is, for every positive $\epsilon$ the neighborhood $N_\epsilon(x)$ contains a point of $E$, so by 2.2 the point $x$ is in $\bar{E}$.

6. One of the chief concerns of analysis is the behavior of a function at the points which are near a given point. If we are interested in the upper bound of a function from this point of view, we shall first of all not be particularly concerned with the value of the function $f(x)$ at the given point $x_0$, and second we shall not care about the supremum of $f(x)$ for distant points. Suppose to be specific that $f(x)$ is defined on a set $E$, and that $x_0$ is an accumulation point of $E$. Whether or not $x_0$ is in $E$ is immaterial; if it is, we simply disregard the number $f(x_0)$. For each $\epsilon$ we find $\sup \{f(x) \mid x \text{ in } E N_\epsilon(x_0) - (x_0)\}$; we denote this by $M_\epsilon(f; x_0)$. If we choose any one $\epsilon$, this still is not adequate to describe the behavior of $f$ near $x_0$, for comparatively distant points may enter into the definition of $M_\epsilon(f; x_0)$. But if we reduce $\epsilon$, we observe that the set of points $x$ in $E N_\epsilon(x_0) - (x_0)$ is reduced, so by 5.3 $M_\epsilon(f; x_0)$ is reduced. Since we are interested only in $M_\epsilon(f; x_0)$ for small $\epsilon$, it follows that only the smaller values of $M_\epsilon(f; x_0)$ are of importance. This suggests to us that the number which will be the local substitute for $\sup f$ will be $\inf_\epsilon M_\epsilon(f; x_0)$. This number will be named the upper limit of $f(x)$ as $x$ tends to $x_0$; it is denoted by $\limsup_{x \to x_0} f(x)$. Summing up (and defining the lower limit analogously):

* Cf. Courant, Differential and Integral Calculus, vol. I, p. 51.


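6.1. If $f(x)$ is defined on a set $E$ and $x_0$ is an accumulation point of $E$, then

$\limsup_{x \to x_0} f(x) = \inf_\epsilon M_\epsilon(f; x_0)$, where $M_\epsilon(f; x_0) = \sup \{f(x) \mid x \text{ in } E N_\epsilon(x_0) - (x_0)\}$,

and

$\liminf_{x \to x_0} f(x) = \sup_\epsilon m_\epsilon(f; x_0)$, where $m_\epsilon(f; x_0) = \inf \{f(x) \mid x \text{ in } E N_\epsilon(x_0) - (x_0)\}$.

These limits are always defined (finite, $+\infty$ or $-\infty$), but they are not always equal.

EXAMPLE. Let $f(x) = 1$ if $x$ is rational and $f(x) = 0$ if $x$ is irrational. Take $E$ to be $R_1$ and let $x_0$ be any point. Then $M_\epsilon = 1$ and $m_\epsilon = 0$ for all $\epsilon > 0$, and so $\limsup f(x) = 1$, $\liminf f(x) = 0$.

Since we frequently have to use sequences of numbers, we shall now observe the special form of these definitions which applies to sequences $a_1, a_2, \dots$ of numbers. A sequence is a function in which the independent variable ranges over the set $E$ of the positive integers, and we are interested in the behavior of the function as $x \to \infty$. So in 6.1 we set $x_0 = \infty$. Then $N_\epsilon(x_0)E$ consists of all integers greater than $1/\epsilon$, so $M_\epsilon(f; x_0) = \sup \{a_i \mid i > 1/\epsilon\}$. Hence

$\limsup a_i = \inf_\epsilon \sup \{a_i \mid i > 1/\epsilon\}$.

It is easily seen that this is the same as

$\limsup a_i = \inf_n \sup \{a_i \mid i > n\}$.

Likewise,

$\liminf a_i = \sup_n \inf \{a_i \mid i > n\}$.

Before we start to prove any theorems, we shall collect several obvious consequences of the definitions:

6.2. Let $f(x)$ be defined on a set $E$, and let $x_0$ be an accumulation point of $E$.

(a) If $h > \limsup f(x)$, there exists an $\epsilon > 0$ such that for all positive numbers $\delta \le \epsilon$ the inequalities $\limsup f(x) \le M_\delta(f; x_0) < h$ hold.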


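(b) If $h < \liminf f(x)$, there exists an $\epsilon > 0$ such that for all positive numbers $\delta \le \epsilon$ the inequalities $\liminf f(x) \ge m_\delta(f; x_0) > h$ hold.

(c) If $h < M_\epsilon(f; x_0)$, there exists a point $x$ in $E N_\epsilon(x_0) - (x_0)$ such that $f(x) > h$.

(d) If $h > m_\epsilon(f; x_0)$, there exists a point $x$ in $E N_\epsilon(x_0) - (x_0)$ such that $f(x) < h$.

To prove (a), we observe that by 6.1 the number $h$ is greater than the inf of $M_\epsilon(f; x_0)$ for all $\epsilon > 0$, so by the definition of inf there is an $\epsilon > 0$ such that $M_\epsilon(f; x_0) < h$. If $\delta \le \epsilon$, then $E N_\delta(x_0) - (x_0)$ is contained in $E N_\epsilon(x_0) - (x_0)$. So the sup of $f(x)$ on the former set is not greater than its sup on the latter set, by 5.3. Hence $h > M_\epsilon(f; x_0) \ge M_\delta(f; x_0)$. Finally, by 6.1 the upper limit of $f(x)$ as $x \to x_0$ is the inf of all $M_\epsilon(f; x_0)$, so it is not greater than any one of them. This proves (a). We can prove (b) analogously, or by replacing $f(x)$ by $-f(x)$ and using 5.5. Statements (c) and (d) are immediate consequences of the definitions of $M_\epsilon(f; x_0)$ and $m_\epsilon(f; x_0)$.

From 5.5 we readily obtain

6.3. Let $f(x)$ be defined on $E$, and let $x_0$ be an accumulation point of $E$. If $0 \le c < \infty$, then (a) $\limsup cf(x) = c \limsup f(x)$; (b) $\liminf cf(x) = c \liminf f(x)$; while if $-\infty < c < 0$, then (c) $\limsup cf(x) = c \liminf f(x)$; (d) $\liminf cf(x) = c \limsup f(x)$.

If $c = 0$, (a) and (b) reduce to $0 = 0$. If $c < 0$, by 5.5 we have for every $\epsilon > 0$

$m_\epsilon(cf; x_0) = c\,M_\epsilon(f; x_0)$,

and again by 5.5

$\sup_\epsilon m_\epsilon(cf; x_0) = \sup_\epsilon [c\,M_\epsilon(f; x_0)] = c \inf_\epsilon M_\epsilon(f; x_0)$.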

Hence (d) is established. By (d),

$c \liminf f(x) = c \liminf [c^{-1} \cdot cf(x)] = c \cdot c^{-1} \limsup cf(x) = \limsup cf(x)$,

which is (c). If $c > 0$, then by (c) and (d)

$\limsup cf(x) = \limsup [(-c)(-1)f(x)] = -c \liminf (-1)f(x) = (-c)(-1) \limsup f(x)$,

and (a) is established. Interchanging sup and inf in the proof of (a) gives the proof of (b).

The next four theorems are designed to set forth the relationship between upper and lower limits as defined in 6.1 and the (unique) limit as defined in 4.4. The first of these, Theorem 6.4, is the analogue of the sequential test for convergence established in 4.5. Theorem 6.6 shows that the equality of the upper and the lower limit is a necessary and sufficient condition for the existence of a limit in the ordinary sense.

6.4. Let $f(x)$ be defined on a set $E$, and let $x_0$ be an accumulation point of $E$. If $h = \limsup_{x \to x_0} f(x)$ or if $h = \liminf_{x \to x_0} f(x)$, then there exists a sequence $x_1, x_2, \dots$ of points of $E$ distinct from $x_0$ and tending to $x_0$ such that $\lim f(x_n) = h$. But if $h > \limsup f(x)$ or $h < \liminf f(x)$, no such sequence $x_1, x_2, \dots$ can exist.

We shall prove the statements about the upper limit; the statements about the lower limit can be similarly proved, or they can be obtained by use of 6.3. Suppose that $h$ is equal to the upper limit of $f(x)$ as $x \to x_0$. We first show that to every positive integer $n$ there corresponds a positive number $\alpha \le 1/n$ such that $M_\alpha(f; x_0)$ is in the neighborhood $N_{1/n}(h)$. If $h = \infty$ this is evident; for then $M_\alpha(f; x_0) \ge h$, so $M_\alpha(f; x_0) = h$, and is in $N_{1/n}(h)$ no matter what $n$ and $\alpha$ we choose. If $h < \infty$, the neighborhood $N_{1/n}(h)$ contains a number $k > h$. Since $h = \inf_\epsilon M_\epsilon(f; x_0)$, by 6.2 there is a positive number $\alpha \le 1/n$ such that

$h \le M_\alpha(f; x_0) < k$.

Therefore $M_\alpha(f; x_0)$ is between two numbers of $N_{1/n}(h)$ and must itself belong to $N_{1/n}(h)$.

Next we show that there is a point $x_n$ in $E N_\alpha(x_0) - (x_0)$ for which $f(x_n)$ is in $N_{1/n}(M_\alpha(f; x_0))$. If $M_\alpha(f; x_0) = -\infty$, this is evident; for then $f(x) = -\infty$ for all $x$ in $E N_\alpha(x_0) - (x_0)$. Thus if $x_n$


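is any point of $E N_\alpha(x_0) - (x_0)$ we have $f(x_n) = M_\alpha(f; x_0)$, which is in $N_{1/n}(M_\alpha(f; x_0))$. If $M_\alpha(f; x_0) > -\infty$, there are numbers $k < M_\alpha(f; x_0)$ in $N_{1/n}(M_\alpha(f; x_0))$. By 6.2(c), there is then a point $x_n$ in $E N_\alpha(x_0) - (x_0)$ for which $k < f(x_n) \le M_\alpha(f; x_0)$. Thus $f(x_n)$ is between two numbers of $N_{1/n}(M_\alpha(f; x_0))$ and is therefore in that neighborhood.

Summing up, $x_n$ is in $E N_\alpha(x_0) - (x_0)$ with $\alpha \le 1/n$. So $x_n$ is in $N_{1/n}(x_0)$, whence $x_n \to x_0$. Moreover $f(x_n)$ is in $N_{1/n}(M_\alpha(f; x_0))$ and $M_\alpha(f; x_0)$ is in $N_{1/n}(h)$, so by 4.3(b) (if $n \ge 2$) $f(x_n)$ is in a correspondingly small neighborhood of $h$. Therefore $\lim f(x_n) = h$. Thus we have shown that it is always possible to pick a sequence $x_1, x_2, \dots$ of points of $E$ distinct from $x_0$, tending to $x_0$, and such that $f(x_n) \to \limsup f(x)$.

Suppose now that $h > \limsup f(x)$; we must show that no such sequence exists. Let $k$ be a number such that

$h > k > \limsup f(x)$.

If there is a sequence of the type specified with $f(x_n)$ tending to $h$, then for all but a finite number of these $x_n$ we have $f(x_n) > k$. For every positive $\epsilon$ the neighborhood $N_\epsilon(x_0)$ contains infinitely many $x_n$, so it contains some $x_n$ with $f(x_n) > k$. Hence $M_\epsilon(f; x_0) = \sup \{f(x) \mid x \text{ in } E N_\epsilon(x_0) - (x_0)\}$ is greater than $k$ for all $\epsilon$, and $\limsup f(x) = \inf_\epsilon M_\epsilon(f; x_0) \ge k$. But $k$ was a number greater than the upper limit of $f(x)$ as $x \to x_0$. This contradiction proves that no sequence with $x_n \to x_0$ and $\lim f(x_n) = h > \limsup f(x)$ can exist.

This theorem justifies the names "upper limit" and "lower limit"; for the upper and lower limits of $f(x)$, as $x \to x_0$, are respectively the greatest and the least numbers which are limits of sequences $f(x_n)$, where $x_n$ is in $E$, $x_n \ne x_0$ and $x_n \to x_0$.

A direct consequence of 6.4 is

6.5. Corollary. Let $f(x)$ be defined on $E$, and let $x_0$ be an accumulation point of $E$. Then $\limsup f(x) \ge \liminf f(x)$.

Let $h$ be the upper limit of $f(x)$ as $x \to x_0$. By 6.4, there is a sequence $x_1, x_2, \dots$ such that $x_n$ is in $E$, $x_n \ne x_0$, $x_n \to x_0$, and $f(x_n) \to h$. Again by 6.4, this makes it impossible that $h < \liminf f(x)$, which establishes our theorem.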
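The sequence form of the upper and lower limits given after 6.1, $\limsup a_i = \inf_n \sup \{a_i \mid i > n\}$, can be explored numerically (a minimal sketch, not part of the original text). The sequence $a_i = (-1)^i(1 + 1/i)$ and the helper names `tail_sup` and `tail_inf` are assumptions chosen for illustration. A program can examine only a finite window $n < i \le$ `horizon`; for this particular sequence the even-indexed terms decrease and the odd-indexed terms increase, so the window values agree with the true tail bounds whenever the window contains at least two indices.

```python
from fractions import Fraction

def a(i):
    """Illustrative sequence a_i = (-1)^i (1 + 1/i), i = 1, 2, ..."""
    return (-1) ** i * (1 + Fraction(1, i))

def tail_sup(n, horizon):
    """sup {a_i | n < i <= horizon}: a finite stand-in for sup {a_i | i > n}."""
    return max(a(i) for i in range(n + 1, horizon + 1))

def tail_inf(n, horizon):
    """inf {a_i | n < i <= horizon}: a finite stand-in for inf {a_i | i > n}."""
    return min(a(i) for i in range(n + 1, horizon + 1))

# As n grows the tail suprema decrease toward 1 and the tail infima
# increase toward -1, the upper and lower limits of this sequence.
for n in (0, 10, 100):
    print(n, tail_sup(n, 1000), tail_inf(n, 1000))
```

Here the upper limit 1 exceeds the lower limit $-1$, in agreement with 6.5, and the subsequences of even and of odd index are sequences of the kind whose existence 6.4 asserts.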


6.6. Let the function $f(x)$ be defined on a set $E$, and let $x_0$ be an accumulation point of $E$. Then the limit $\lim_{x \to x_0} f(x)$ exists and has the value $k$ if and only if $\limsup f(x) = \liminf f(x) = k$.

Suppose that the limit of $f(x)$ as $x \to x_0$ exists and is equal to $k$. By 4.5, if $x_1, x_2, \dots$ is any sequence of points of $E$ distinct from $x_0$ and tending to $x_0$, then $\lim f(x_n) = k$. But by 6.4, the points $x_1, x_2, \dots$ can be so chosen that

$\lim f(x_n) = \limsup f(x)$.

This is only possible if the upper limit of $f(x)$ is $k$. Replacing sup by inf, we find that $\liminf f(x)$ must also be equal to $k$.

Conversely, suppose that the upper and lower limits both have the value $k$. Let $\epsilon$ be an arbitrary positive number. If $k$ is finite, by 6.2 there are positive numbers $\alpha$, $\beta$ such that $k - \epsilon < m_\alpha(f; x_0)$ and $M_\beta(f; x_0) < k + \epsilon$. We define $\gamma$ to be the smaller of $\alpha$ and $\beta$; then

$k - \epsilon < m_\gamma(f; x_0) \le M_\gamma(f; x_0) < k + \epsilon$.

By definition of $m_\gamma$ and $M_\gamma$, this implies that $f(x)$ is in $N_\epsilon(k)$ whenever $x$ is in $E N_\gamma(x_0) - (x_0)$. Hence in this case $\lim f(x)$ exists and is equal to $k$.

If $k = \infty$, by 6.2 there is a positive number $\alpha$ such that $m_\alpha(f; x_0) > 1/\epsilon$. That is, $f(x)$ is in $N_\epsilon(\infty)$ whenever $x$ is in $E N_\alpha(x_0) - (x_0)$, so in this case also the limit exists and has the value $\infty$. The case $k = -\infty$ can be discussed in a like manner.

An immediate consequence of 6.6 which happens to be in a form convenient for applications is the following.

6.7. Let $f(x)$ be defined on $E$; let $x_0$ be an accumulation point of $E$, and let $k$ be a number in $R^*$. In order that $\lim f(x)$ shall exist and have the value $k$, it is necessary and sufficient that the inequalities $\limsup f(x) \le k$ and $\liminf f(x) \ge k$ both hold.

Suppose that both inequalities hold. By 6.5

$k \ge \limsup f(x) \ge \liminf f(x) \ge k$.

32

I N T E G R A T I O N

[CHAP. II

Hence both the upper limit and the lower limit of $f(x)$ have the value $k$; so by 6.6, the limit exists and is equal to $k$. The converse is obvious from 6.6.

An important special class of functions for which we can be sure that limits exist (finite, or $\pm\infty$) is the class of monotonic functions. A function $f(x)$ of a single real variable $x$, defined on a set $E$, is monotonic increasing if $f(x_2) \ge f(x_1)$ whenever $x_2 > x_1$ and both $x_1$ and $x_2$ are in $E$; it is monotonic decreasing if $f(x_2) \le f(x_1)$ whenever $x_2 > x_1$ and both $x_1$ and $x_2$ are in $E$. Then

6.8. If $f(x)$ is defined and monotonic increasing on the interval $a < x < b$, then $\lim_{x \to a} f(x) = \inf f$ and $\lim_{x \to b} f(x) = \sup f$. If $f(x)$ is defined and monotonic decreasing on $a < x < b$, then $\lim_{x \to a} f(x) = \sup f$ and $\lim_{x \to b} f(x) = \inf f$.

(Observe that $a$ and $b$ are permitted to have the values $-\infty$, $+\infty$ respectively.) We prove the first statement; the second follows from the first if we replace $f$ by $-f$. Also we prove only that $\lim_{x \to a} f(x) = \inf f$; the statement about the limit as $x \to b$ can be proved analogously, or else can be derived by replacing $f(x)$ by $-f(-x)$. If we give $\epsilon$ any particular value, say $\epsilon = 1$, by 5.1 we have $m_1(f; a) \ge \inf f$. Hence

(A) $\liminf_{x \to a} f(x) \ge \inf f$.

On the other hand, let $h$ be any number greater than $\inf f$. For some number $\bar{x}$ in $a < x < b$ we have $f(\bar{x}) < h$, by the definition of inf. Whether $a$ is finite or $-\infty$, we see that for sufficiently small $\epsilon$ the neighborhood $N_\epsilon(a)$ is entirely to the left of $\bar{x}$; that is, $x < \bar{x}$ for all $x$ in $N_\epsilon(a)$. For all such $x$ in the interval we have $f(x) \le f(\bar{x}) < h$, so

$$M_\epsilon(f; a) \le f(\bar{x}) < h.$$

It follows that $\limsup_{x \to a} f(x) \le h$. This holds for all $h > \inf f$, so by 5.2 we deduce

(B) $\limsup_{x \to a} f(x) \le \inf f$.


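By 6.7, inequalities (A) and (B) imply the desired conclusion.

EXERCISE. If $a_1, a_2, \ldots$ is a monotonic increasing sequence of numbers, $\lim a_n$ exists and is equal to $\sup a_n$. An analogous statement holds for decreasing sequences.

We have already had an example of a monotonic function in the definition of upper and lower limits; for we saw that if $0 < \epsilon' < \epsilon$ then $M_{\epsilon'} \le M_\epsilon$ and $m_{\epsilon'} \ge m_\epsilon$. Hence $M_\epsilon$ is a monotonic increasing function on the range $\epsilon > 0$, and so by 5.6 we have the following theorem.

6.9. If $f(x)$ is defined on $E$, and $x_0$ is in $E'$, then

$$\limsup_{x \to x_0} f(x) = \lim_{\epsilon \to 0} M_\epsilon(f; x_0), \qquad \liminf_{x \to x_0} f(x) = \lim_{\epsilon \to 0} m_\epsilon(f; x_0).$$

From 6.8 we can deduce a corollary concerning the right and left limits of monotone functions. Let $f(x)$ be defined on a subset $E$ of one-dimensional space $R_1$, and let $x_0$ be a point of $E$. We let $E_+$ be the set of points $x$ of $E$ with $x > x_0$, and $E_-$ the set of points $x$ of $E$ with $x < x_0$. Then

6.10. If $f(x)$, considered as defined only on $E_+$, has a limit as $x \to x_0$, we call this limit the right limit of $f(x)$ at $x_0$, and denote it by $f(x_0+)$. If $f(x)$, considered as defined only on $E_-$, has a limit as $x \to x_0$, we call this the left limit of $f(x)$ at $x_0$, and denote it by $f(x_0-)$. Also, the upper and lower limits of $f(x)$, considered as defined only on $E_+$, as $x \to x_0$ are denoted by $\limsup_{x \to x_0+} f(x)$ and $\liminf_{x \to x_0+} f(x)$ respectively; and the corresponding upper and lower limits when $f(x)$ is regarded as defined only on $E_-$ are denoted by $\limsup_{x \to x_0-} f(x)$ and $\liminf_{x \to x_0-} f(x)$ respectively.

We now prove

6.11. Let $f(x)$ be monotonic on an interval $I$. If $x_0$ is in $I$ and is not the left end-point of $I$, $f(x_0-)$ exists, and if $x_0$ is in $I$ and is not the right end-point of $I$, $f(x_0+)$ exists. Moreover, if $f(x)$ is monotonic increasing [monotonic decreasing] the inequalities

$$f(x_0-) \le f(x_0) \le f(x_0+) \qquad [f(x_0-) \ge f(x_0) \ge f(x_0+)]$$

are valid whenever their terms are defined.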


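Suppose, for example, that $f(x)$ is monotonic increasing, and that $x_0$ is not the left end-point of $I$. The set $E_-$ is either an interval $a < x < x_0$ or an interval $a \le x < x_0$, and if $a$ happens to belong to $E_-$ we can throw it out without changing $f(x_0-)$ or $\sup f(x)$ for $x$ on $E_-$. Then by 6.8, $f(x_0-)$ exists and

$$f(x_0-) = \sup \{ f(x) : x \text{ in } E_- \}.$$

If furthermore $x_0$ is in $I$, then $f(x) \le f(x_0)$ for all $x$ in $E_-$, so

$$\sup \{ f(x) : x \text{ in } E_- \} \le f(x_0).$$

Hence $f(x_0-) \le f(x_0)$. The other parts of the theorem can be established in a like manner.

EXERCISE. If $f(x_0+)$ and $f(x_0-)$ exist and are equal, then $\lim_{x \to x_0} f(x)$ exists and is equal to both of the one-sided limits.

EXERCISE. If $f(x)$ is bounded and monotonic on an interval $I$, its discontinuities are at most denumerable. (Suggestion: Show that for each $n$ there can be only a finite number of points at which $|f(x+) - f(x-)| > 1/n$.) It is actually enough to assume that $f(x)$ is monotonic, without assuming it bounded.

The next theorem is of considerable importance.

6.12. If $f_1(x)$ and $f_2(x)$ and their sum are defined on a set $E$, and $x_0$ is an accumulation point of $E$, then

$$\limsup [f_1(x) + f_2(x)] \le \limsup f_1(x) + \limsup f_2(x),$$

provided that the sum on the right is defined (cf. 4.1).

If either of the terms on the right is $+\infty$, the theorem is obviously true. We assume then that neither is $+\infty$. Remembering that $M_\epsilon$ is a sup, by 5.6

$$M_\epsilon(f_1 + f_2; x_0) \le M_\epsilon(f_1; x_0) + M_\epsilon(f_2; x_0).$$

Now let $h$, $k$ be any pair of numbers which are greater than $\limsup f_1$, $\limsup f_2$ respectively. By 6.2a we can find $\epsilon > 0$ small enough so that $M_\epsilon(f_1; x_0) < h$ and $M_\epsilon(f_2; x_0) < k$. Then for this $\epsilon$ we have $M_\epsilon(f_1 + f_2; x_0) < h + k$, and a fortiori

$$\limsup [f_1(x) + f_2(x)] < h + k.$$

But here $h + k$ can be any number greater than $\limsup f_1 + \limsup f_2$, so by 5.2 our inequality is true.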
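The inequality of 6.12 can be strict. As a rough numerical illustration (the functions, the point $x_0 = 0$, the set $E$ of positive numbers and the sampling are all assumptions made here, not taken from the text), take $f_1(x) = \sin(1/x)$ and $f_2(x) = -\sin(1/x)$: the sum is identically 0, while each upper limit is 1.

```python
import math

def M(g, eps, samples=20_000):
    # Sampled stand-in for M_eps(g; 0), the sup of g over 0 < x < eps.
    return max(g(eps * (k + 0.5) / samples) for k in range(samples))

def f1(x):
    return math.sin(1.0 / x)

def f2(x):
    return -math.sin(1.0 / x)

def total(x):
    return f1(x) + f2(x)

eps_values = [0.1, 0.01, 0.001]
# Each row holds (M_eps(f1 + f2), M_eps(f1), M_eps(f2)) for one eps.
rows = [(M(total, e), M(f1, e), M(f2, e)) for e in eps_values]

if __name__ == "__main__":
    for e, row in zip(eps_values, rows):
        print(e, row)
```

In every row the first entry stays at 0 while the other two stay near 1, so the left member of 6.12 is 0 and the right member is 2.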


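The inequality established in 6.12 can be written in many different forms. In the next theorem we collect some of these variant forms of 6.12. (To save printing, the symbol $x \to x_0$ is understood, instead of printed, under each lim sup, lim inf and lim.)

6.13. Let $f_1(x)$ and $f_2(x)$ be defined on a set $E$, and let $x_0$ be an accumulation point of $E$. Then any of the following statements which do not involve undefined expressions (i.e. sums of the form $\infty - \infty$) are true:

($\alpha$) $\limsup [f_1 + f_2] \le \limsup f_1 + \limsup f_2$,
($\beta$) $\liminf [f_1 + f_2] \ge \liminf f_1 + \liminf f_2$,
($\gamma$) $\limsup [f_1 + f_2] \ge \limsup f_1 + \liminf f_2$,
($\delta$) $\liminf [f_1 + f_2] \le \liminf f_1 + \limsup f_2$,
($\epsilon$) $\limsup [f_1 - f_2] \le \limsup f_1 - \liminf f_2$,
($\zeta$) $\limsup [f_1 - f_2] \ge \limsup f_1 - \limsup f_2 \ge \liminf [f_1 - f_2]$,
($\eta$) $\liminf [f_1 - f_2] \ge \liminf f_1 - \limsup f_2$,
($\theta$) $\liminf [f_1 - f_2] \le \liminf f_1 - \liminf f_2 \le \limsup [f_1 - f_2]$.

Here ($\alpha$) is the inequality of 6.12. ($\beta$) follows from ($\alpha$) if we replace $f_i$ by $-f_i$ and use 6.3. ($\epsilon$) follows from ($\alpha$) and ($\eta$) from ($\beta$), if we replace $f_2$ by $-f_2$.

To establish ($\gamma$), we first notice that if $f_2(x)$ takes on the value $-\infty$ on every set $EN_\epsilon(x_0) - (x_0)$, then $m_\epsilon(f_2; x_0) = -\infty$ for every $\epsilon$, and the right member of ($\gamma$) is $-\infty$. If $f_2$ takes on the value $+\infty$ on every such set, then so does $f_1 + f_2$, and $M_\epsilon(f_1 + f_2; x_0)$ is always $+\infty$. The left member of ($\gamma$) is then $+\infty$. There remains the principal case, in which there is a positive $\epsilon$ such that $f_2(x)$ is finite on $EN_\epsilon(x_0) - (x_0)$. We may then consider $f_2$ as always finite, since its values outside of $N_\epsilon(x_0)$ do not affect either member of ($\gamma$). Now to prove ($\gamma$) we need only replace $f_1$ by $f_1 + f_2$ in ($\epsilon$), noting that if $\liminf f_2 = +\infty$, by hypothesis $\limsup f_1 > -\infty$, which by ($\epsilon$) implies $\limsup [f_1 + f_2] = +\infty$.

From ($\gamma$) we obtain ($\delta$) by replacing $f_1$ and $f_2$ by $-f_1$ and $-f_2$. If we replace $f_2$ by $-f_2$ in ($\gamma$) and ($\delta$), we obtain the first inequalities in ($\zeta$) and ($\theta$) respectively. If we replace $f_1$ by $f_2$ and $f_2$ by $f_1 - f_2$ in ($\gamma$) and ($\delta$), we obtain the second inequalities in ($\zeta$) and ($\theta$) respectively.

From theorem 6.13 we readily obtain

6.14. Let the functions $f_1(x)$ and $f_2(x)$ be defined on a set $E$, and let $x_0$ be an accumulation point of $E$. If $\lim f_2(x)$ exists, then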

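(a) $\limsup [f_1(x) + f_2(x)] = \limsup f_1(x) + \lim f_2(x)$,

(b) $\liminf [f_1(x) + f_2(x)] = \liminf f_1(x) + \lim f_2(x)$,

provided that the sums are defined.

For by 6.13 ($\alpha$), 6.6 and 6.13 ($\gamma$),

$$\limsup [f_1 + f_2] \le \limsup f_1 + \limsup f_2 = \limsup f_1 + \lim f_2 = \limsup f_1 + \liminf f_2 \le \limsup [f_1 + f_2].$$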

Thus (a) is proved. If we replace $\le$ by $\ge$ and interchange inf and sup, we obtain (by 6.13) the proof of (b).

EXERCISE. Prove the following theorem. Let $f(x)$ be defined on $E$, and let $x_0$ be an accumulation point of $E$. The equation $\limsup f(x) = k$ holds if and only if:

(a) for every number* $h > k$ there is an $\epsilon > 0$ such that $f(x) < h$ on $EN_\epsilon(x_0) - (x_0)$, and

(b) for every number* $h < k$ and every neighborhood $N_\epsilon(x_0)$ there is a point $x$ in $EN_\epsilon(x_0) - (x_0)$ such that $f(x) > h$.

Likewise, $\liminf f(x) = k$ if and only if:

(c) for every number* $h < k$ there is an $\epsilon > 0$ such that $f(x) > h$ on $EN_\epsilon(x_0) - (x_0)$, and

(d) for every number* $h > k$ and every neighborhood $N_\epsilon(x_0)$ there is a point $x$ in $EN_\epsilon(x_0) - (x_0)$ such that $f(x) < h$.

* If no such number exists, this condition is considered to be automatically satisfied. This happens in (a) and (d) if $k = +\infty$, and in (b) and (c) if $k = -\infty$.

It is now easy to establish a generalized form of the Cauchy convergence criterion.

6.15. Let $f(x)$ be defined and finite on $E$, and let $x_0$ be an accumulation point of $E$. In order that $\lim f(x)$ shall exist and be finite, it is necessary and sufficient that the following condition be satisfied. For every positive number $\epsilon$, there is a $\delta > 0$ such that $|f(x') - f(x'')| < \epsilon$ for every pair of points $x'$, $x''$ of $EN_\delta(x_0) - (x_0)$.

If the limit exists and has the finite value $k$, for every positive $\epsilon$ there exists by definition a positive $\delta$ such that $f(x)$ is in $N_{\epsilon/2}(k)$ whenever $x$ is in $EN_\delta(x_0) - (x_0)$. It follows that for every pair $x'$, $x''$ of points of this set we have

$$|f(x') - f(x'')| \le |f(x') - k| + |k - f(x'')| < \epsilon.$$

Conversely, let the condition be satisfied. Let $\epsilon$ be an arbitrary positive number, and let $\delta$ be the number specified in the condition. Choose a point $x''$ in $EN_\delta(x_0) - (x_0)$; such a point exists, by 2.1. Then by hypothesis, for every $x$ in $EN_\delta(x_0) - (x_0)$ we have

$$|f(x) - f(x'')| < \epsilon,$$

whence

$$f(x'') - \epsilon < f(x) < f(x'') + \epsilon.$$

Thus $f(x'') - \epsilon$ and $f(x'') + \epsilon$ are respectively a lower bound and an upper bound for $f(x)$ on $EN_\delta(x_0) - (x_0)$. By definition,

$$f(x'') - \epsilon \le m_\delta(f; x_0) \le M_\delta(f; x_0) \le f(x'') + \epsilon.$$

Hence the upper and lower limits of $f(x)$ are both finite, and differ by at most $2\epsilon$. Since $2\epsilon$ is an arbitrary positive number, the upper and lower limits are equal, and by 6.6 $\lim f(x)$ exists and is finite.

Suppose that $f(x, y)$ is defined for all $x$ in a set $E$ contained in the space $R_q$ or $R^*$, and for all $y$ in a set $Y$, concerning which we make no assumptions. Let $f_0(y)$ be a function defined on $Y$. If $x_0$ is an accumulation point of $E$, the statement that

$$\lim_{x \to x_0} f(x, y) = f_0(y)$$

for all $y$ in $Y$ has by 4.4 the following meaning. To each $y$ in $Y$ and each positive $\epsilon$ there corresponds a positive $\delta$ depending on $y$ and $\epsilon$ such that $f(x, y)$ is in $N_\epsilon(f_0(y))$ whenever $x$ is in $EN_\delta(x_0) - (x_0)$. In general we cannot hope that given $\epsilon$, one single $\delta$ can be chosen which will serve for all $y$ in the set $Y$. Whenever this happens to be true, the convergence is called uniform.

6.16. If $f(x, y)$ is defined for all $x$ in a set $E$ in $R_q$ or in $R^*$ and for all $y$ in a set $Y$, and $f_0(y)$ is defined on $Y$, and $x_0$ is an accumulation point of $E$, then the statement

$$\lim_{x \to x_0} f(x, y) = f_0(y) \text{ uniformly on } Y$$

is defined to mean that to each positive $\epsilon$ there corresponds a positive $\delta$ such that $f(x, y)$ is in $N_\epsilon(f_0(y))$ whenever $x$ is in $EN_\delta(x_0) - (x_0)$ and $y$ is in $Y$. (The $\delta$ is independent of $y$.)

For uniform convergence there is a criterion quite analogous to Cauchy's criterion (6.15). This is the following.

6.17. Let $f(x, y)$ be defined and finite for all $x$ in a set $E$ in $R_q$ or in $R^*$ and all $y$ in a set $Y$. Let $x_0$ be an accumulation point of $E$. In order that $f(x, y)$ shall converge uniformly on $Y$ to some finite limit as $x \to x_0$, it is necessary and sufficient that the following condition be satisfied. To each positive number $\epsilon$ there corresponds a positive number $\delta$ such that $|f(x', y) - f(x'', y)| < \epsilon$ for every pair of points $x'$, $x''$ of $EN_\delta(x_0) - (x_0)$ and every $y$ in $Y$.

If $f(x, y)$ converges uniformly on $Y$ to a finite limit $f_0(y)$ and $\epsilon$ is positive, there is a $\delta$ such that $|f(x, y) - f_0(y)| < \epsilon/2$ whenever $x$ is in $EN_\delta(x_0) - (x_0)$. So if $x'$ and $x''$ are both in this set we have

$$|f(x', y) - f(x'', y)| \le |f(x', y) - f_0(y)| + |f_0(y) - f(x'', y)| < \epsilon.$$

Conversely, let the condition be satisfied. Then for each individual $y$ in $Y$ the condition in 6.15 is satisfied, so for each such $y$ the function $f(x, y)$ approaches a finite limit, $f_0(y)$. Let $\epsilon$ be a positive number. By hypothesis, there is a $\delta$ such that

$$|f(x', y) - f(x'', y)| < \epsilon/2$$

whenever $x'$ and $x''$ are in $EN_\delta(x_0) - (x_0)$. For each particular $y$ there is a $\delta_y$ such that

$$|f(x'', y) - f_0(y)| < \epsilon/2$$

whenever $x''$ is in $EN_{\delta_y}(x_0) - (x_0)$. Let $\gamma$ be the smaller of $\delta$ and $\delta_y$. If $x'$ is in $EN_\delta(x_0) - (x_0)$, we choose an $x''$ in $EN_\gamma(x_0) - (x_0)$. By the two inequalities above,

$$|f(x', y) - f_0(y)| < \epsilon$$

for all $x'$ in $EN_\delta(x_0) - (x_0)$. But $\delta$ is independent of $y$, so $f(x, y)$ converges uniformly to $f_0(y)$ as $x$ tends to $x_0$.

7. From the concept of limit we can proceed in the usual way to the concept of continuity. Let $f(x)$ be defined on a set $E$, and let $x_0$ be a point of $E$ which is also an accumulation point of $E$, so that $x_0$ is in $EE'$. We say that $f(x)$ is continuous at $x_0$ if $f(x_0)$ is finite, and $\lim f(x)$ exists and is equal to $f(x_0)$. Furthermore, $f(x)$ is continuous on $E$ if it is finite-valued on $E$ and is continuous at each point of $EE'$. (Notice that nothing other than finiteness is required of $f(x)$ at points of $E$ which are not accumulation points of $E$.)

EXERCISE. If $f(x)$ is continuous on a bounded closed set $E$, it is uniformly continuous; that is, to each positive $\epsilon$ there corresponds a positive $\delta$ such that $|f(x_1) - f(x_2)| < \epsilon$ whenever $x_1$ and $x_2$ are in $E$ and $\|x_1 - x_2\| < \delta$. (To each $\bar{x}$ in $E$ corresponds a positive $\delta(\bar{x})$ such that $|f(x) - f(\bar{x})| < \epsilon/2$ whenever $x$ is in $EN_{2\delta(\bar{x})}(\bar{x})$. A finite number of the neighborhoods $N_{\delta(\bar{x})}(\bar{x})$ cover $E$, say those with $\bar{x} = \bar{x}_1, \ldots, \bar{x}_n$. Then the smallest of the $\delta(\bar{x}_i)$ serves.)

In a similar way, from the ideas of upper and lower limits we can form two concepts related to continuity: Let $f(x)$ be defined on a set $E$, and let the point $x_0$ of $E$ be an accumulation point of $E$. We define $f(x)$ to be lower semi-continuous at $x_0$ if $\liminf f(x) \ge f(x_0)$, and we define $f(x)$ to be upper semi-continuous at $x_0$ if $\limsup f(x) \le f(x_0)$.

We shall say that $f(x)$ is upper [lower] semi-continuous on $E$ in case it is upper [lower] semi-continuous at each point $x_0$ of $EE'$. In these last definitions there is no requirement that $f(x)$ be finite-valued.

EXERCISE. If $f(x)$ is lower semi-continuous on a bounded closed set $E$, it attains its greatest lower bound on $E$; that is, there is an $x_0$ in $E$ such that $f(x_0) = \inf f$. If in addition $f(x) > -\infty$ on $E$, it has a finite lower bound on $E$. Also, if $f(x)$ is upper semi-continuous on $E$, it attains its least upper bound on $E$. (By definition of inf, there is a sequence $x_n$ in $E$ for which $f(x_n) \to \inf f$. Use 2.12, 2.13, 6.4.)

7.1. If $f(x)$ is defined and finite-valued on $E$, then $f(x)$ is continuous on $E$ if and only if it is both upper and lower semi-continuous on $E$.

The function $f(x)$ is both upper and lower semi-continuous on $E$ if and only if $\limsup f(x) \le f(x_0) \le \liminf f(x)$ for every $x_0$ in $EE'$. By 6.7, this is true if and only if $\lim f(x) = f(x_0)$.

By definition, this last is true if and only if $f(x)$ is continuous at $x_0$.

7.2. If $f(x)$ is upper semi-continuous on $E$ and $g(x)$ is lower semi-continuous on $E$, then if $c \ge 0$ the function $cf(x)$ is upper semi-continuous on $E$ and $cg(x)$ is lower semi-continuous on $E$; while if $c \le 0$, the function $cf(x)$ is lower semi-continuous on $E$ and $cg(x)$ is upper semi-continuous on $E$.

For $c = 0$ this is trivial, as 7.1 shows. Suppose $c > 0$ and let $x_0$ be any point of $EE'$. Then by 5.4

$$\limsup cf(x) = c \limsup f(x) \le cf(x_0),$$

so $cf(x)$ is upper semi-continuous. By replacing $f$ by $g$ and interchanging inf and sup we prove that $cg(x)$ is lower semi-continuous. If $c < 0$, then by the proof just completed $-cf(x)$ is upper semi-continuous, and therefore $cf(x)$ is lower semi-continuous. Likewise $-cg(x)$ is lower semi-continuous, so $cg(x)$ is upper semi-continuous.

7.3. Let $f(x)$ be defined on $E$, and let $x_0$ be a point of $E$. Then $f(x)$ is lower semi-continuous at $x_0$ if and only if for every number $h < f(x_0)$ there is a positive number $\epsilon$ such that $f(x) > h$ for all $x$ in $EN_\epsilon(x_0)$. Also, $f(x)$ is upper semi-continuous at $x_0$ if and only if for every number $h > f(x_0)$ there is a positive number $\epsilon$ such that $f(x) < h$ for all $x$ in $EN_\epsilon(x_0)$.

We shall prove the first statement; the second follows by change of sign. Suppose first that $x_0$ is in $E$ but not in $E'$. Then $f(x)$ is lower semi-continuous at $x_0$, since our definition of lower semi-continuity requires nothing of $f$ at points of $E - E'$. Also, by 2.1 there is an $\epsilon$ such that $EN_\epsilon(x_0)$ consists of $x_0$ alone, so that if $h < f(x_0)$ the inequality $f(x) > h$ holds at every point (i.e., $x_0$ alone!) of $EN_\epsilon(x_0)$. So the two statements are always true, and are therefore equivalent to each other.

We still have to consider the principal case, in which $x_0$ is in $EE'$. Suppose $f(x)$ lower semi-continuous at $x_0$, and let $h$ be a number less than $f(x_0)$. Then $\liminf f(x) \ge f(x_0) > h$, so by 6.2 there is a positive $\epsilon$ for which $m_\epsilon(f; x_0) > h$. Therefore by definition of $m_\epsilon$ we have $f(x) > h$ for all $x$ in $EN_\epsilon(x_0) - (x_0)$. But we do not need to except the point $x_0$, for the relation $f(x) > h$ holds at $x_0$ too by choice of $h$. Hence $f(x) > h$ at all points of $EN_\epsilon(x_0)$.

To prove the converse, let $h$ be a number less than $f(x_0)$. By hypothesis, there is a positive $\epsilon$ such that $f(x) > h$ on $EN_\epsilon(x_0)$. Hence $m_\epsilon(f; x_0) \ge h$, and by 6.1

$$\liminf f(x) \ge m_\epsilon(f; x_0) \ge h.$$

Here $h$ is any number less than $f(x_0)$, so by 5.2

$$\liminf f(x) \ge f(x_0).$$

This completes the proof.

7.4. If $f(x)$ is lower semi-continuous on $E$ and $g(x)$ is upper semi-continuous on $E$, then the set $G$ consisting of those points $x$ of $E$ at which $f(x) > g(x)$ is open relative to $E$, and the set $C$ consisting of those points $x$ of $E$ at which $f(x) \le g(x)$ is closed relative to $E$.

We shall prove $G$ open relative to $E$; then, since $C = E - G$, it will follow by 2.6 that $C$ is closed relative to $E$. Let $x_0$ be a point of $G$; we must show that there is a positive number $\gamma$ such that $EN_\gamma(x_0)$ is contained in $G$.

Since $x_0$ is in $G$, we have $f(x_0) > g(x_0)$, so there is a number $h$ such that

$$f(x_0) > h > g(x_0).$$

By 7.3, there are positive numbers $\epsilon$, $\delta$ such that $f(x) > h$ for all $x$ in $EN_\epsilon(x_0)$ and $g(x) < h$ for all $x$ in $EN_\delta(x_0)$. Let $\gamma$ be the smaller of $\epsilon$ and $\delta$. For every $x$ in $EN_\gamma(x_0)$ we then have $f(x) > h > g(x)$, so all such $x$ are in $G$. That is, $EN_\gamma(x_0)$ is contained in $G$, completing the proof.

7.5. If $K$ is a collection of functions defined and lower semi-continuous on $E$, and for each $x$ in $E$ we define $g(x)$ to be $\sup \{ f(x) : f \text{ in } K \}$, then $g(x)$ is lower semi-continuous on $E$. Likewise, if $K$ is a collection of functions defined and upper semi-continuous on $E$, and for each $x$ in $E$ we define $g(x)$ to be $\inf \{ f(x) : f \text{ in } K \}$, then $g(x)$ is upper semi-continuous on $E$.

We prove the first part; the second part is obtained by changing signs and using 7.2. Let $x_0$ be in $E$, and let $h$ be any number less than $g(x_0)$. By the definition of $g(x_0)$ there is a function $f(x)$ in the collection $K$ such that $f(x_0) > h$. By 7.3 there is a positive $\epsilon$ such that $f(x) > h$ for all $x$ in $EN_\epsilon(x_0)$. Since $g(x) \ge f(x)$ by definition, it is also true that $g(x) > h$ for all $x$ in $EN_\epsilon(x_0)$. By 7.3, this proves that $g(x)$ is lower semi-continuous at $x_0$. Since $x_0$ is an arbitrary point of $E$, this implies that $g(x)$ is lower semi-continuous on $E$.

EXERCISE. In order that $f(x)$ shall be lower semi-continuous on $E$, it is necessary and sufficient that for every constant $c$, the set of points of $E$ at which $f(x) \le c$ be closed relative to $E$.

7.6. If $f_1(x), \ldots, f_n(x)$ are defined and lower semi-continuous on $E$, and $g(x) = \min \{ f_1(x), \ldots, f_n(x) \}$, then $g(x)$ is lower semi-continuous on $E$. If $f_1(x), \ldots, f_n(x)$ are upper semi-continuous on $E$, and $g(x) = \max \{ f_1(x), \ldots, f_n(x) \}$, then $g(x)$ is upper semi-continuous on $E$.

We prove the first statement; the second follows by change of sign. Let $x_0$ be a point of $E$, and let $h$ be a number less than $g(x_0)$. Then $f_i(x_0) > h$ for $i = 1, \ldots, n$, so for each $i$ there is (by 7.3) an $\epsilon_i > 0$ such that $f_i(x) > h$ on $EN_{\epsilon_i}(x_0)$. If $\epsilon$ is the smallest of the $\epsilon_i$, then $f_i(x) > h$ for all $x$ in $EN_\epsilon(x_0)$ and all $i$. One of these (the least) is equal to $g(x)$, so $g(x) > h$ for $x$ in $EN_\epsilon(x_0)$. This is true for all $h < g(x_0)$, so by 7.3 $g(x)$ is lower semi-continuous.

EXERCISE. There is a strong resemblance between Theorems 7.5 and 7.6 on semi-continuous functions and Theorems 2.8 and 2.9 on relatively open and closed sets. With the help of the preceding exercise, show that 7.5 and 7.6 can be deduced from 2.8 and 2.9.

7.7. If $f_1(x)$ and $f_2(x)$ are both upper semi-continuous on $E$, and neither one takes on the value $+\infty$, then $f_1(x) + f_2(x)$ is upper semi-continuous on $E$. If $f_1(x)$ and $f_2(x)$ are both lower semi-continuous on $E$, and neither one takes on the value $-\infty$, then $f_1(x) + f_2(x)$ is lower semi-continuous on $E$.

We prove the first statement; the second follows by change of sign. Let $x_0$ be any point in $EE'$. We can not have $\limsup f_1(x) = +\infty$, for this would imply $f_1(x_0) = +\infty$, contrary to hypothesis. A similar statement applies to $f_2$. Therefore we may use 4.2 and 6.12 to obtain

$$\limsup [f_1(x) + f_2(x)] \le \limsup f_1(x) + \limsup f_2(x) \le f_1(x_0) + f_2(x_0).$$

This proves that $f_1(x) + f_2(x)$ is upper semi-continuous.

7.8. If $f_1(x)$ and $f_2(x)$ are both lower semi-continuous and non-negative, then $f_1(x) f_2(x)$ is also lower semi-continuous. If $f_1(x)$ and $f_2(x)$ are both upper semi-continuous and non-negative, and there is no point $x$ in $E$ at which one of them is $+\infty$ and the other 0, then $f_1(x) f_2(x)$ is upper semi-continuous.

Suppose them both lower semi-continuous and $\ge 0$, and let $x_0$ be a point of $EE'$. If either $f_1(x_0)$ or $f_2(x_0)$ is 0, then $f_1(x_0) f_2(x_0) = 0$. Then $f_1(x) f_2(x)$ is necessarily lower semi-continuous at $x_0$, for $f_1(x) f_2(x) \ge 0$, so that

$$\liminf f_1(x) f_2(x) \ge 0 = f_1(x_0) f_2(x_0).$$

If neither one is 0, then $f_1(x_0) f_2(x_0) > 0$. Let $h$ be any number less than $f_1(x_0) f_2(x_0)$. We can find positive numbers $h_1$ and $h_2$ such that $h_1 < f_1(x_0)$, $h_2 < f_2(x_0)$ and $h_1 h_2 > h$. By 7.3, there are positive numbers $\epsilon$, $\delta$ such that $f_1(x) > h_1$ for all $x$ in $EN_\epsilon(x_0)$ and $f_2(x) > h_2$ for all $x$ in $EN_\delta(x_0)$. If $\gamma$ is the smaller of $\epsilon$ and $\delta$, then for all $x$ in $EN_\gamma(x_0)$ both inequalities hold, so that

$$f_1(x) f_2(x) > h_1 h_2 > h$$

for all such $x$. By 7.3, this implies that $f_1(x) f_2(x)$ is lower semi-continuous.

Suppose now that $f_1(x)$ and $f_2(x)$ are upper semi-continuous, and that we never have $f_1(x) = +\infty$ where $f_2(x) = 0$ or vice versa. Let $x_0$ be any point of $EE'$. If one of the factors in $f_1(x_0) f_2(x_0)$ is $+\infty$ the other is positive, so

$$f_1(x_0) f_2(x_0) = +\infty \ge \limsup f_1(x) f_2(x).$$

Otherwise, both $f_1(x_0)$ and $f_2(x_0)$ are finite. We can now repeat the proof of the first part of the theorem, reversing all inequalities.

EXERCISE. A function is sometimes called "continuous in the generalized sense" if it satisfies the definition at the beginning of §7 except for the requirement of finiteness. Let $\varphi(t)$ be continuous in the generalized sense and monotonic increasing on an interval $a \le t \le b$ which contains all the values of $f(x)$. If $f(x)$ is defined and lower [upper] semi-continuous on $E$, so is $\varphi(f(x))$. If $\varphi(t)$ is continuous and monotonic decreasing on $a \le t \le b$, then if $f(x)$ is defined and lower [upper] semi-continuous on $E$, $\varphi(f(x))$ is defined and upper [lower] semi-continuous on $E$. From this we can deduce 7.2. Also, if we let $\varphi(t) = \log t$ we can deduce 7.8, except for the case in which one factor is 0 and the other is $+\infty$.

7.9. If $f(x)$ is defined and lower semi-continuous on $E$, and there is a constant $M$ such that $f(x) \ge M$ for all $x$ in $E$, then it is possible to find a sequence of functions $\varphi_1(x), \varphi_2(x), \ldots$, each defined and continuous on the whole space $R_q$, such that $M \le \varphi_1(x) \le \varphi_2(x) \le \cdots$ for all $x$ in $R_q$ and $\lim \varphi_n(x) = f(x)$ for all $x$ in $E$. Likewise, if $f(x)$ is upper semi-continuous and $f(x) \le M$, there are functions $\varphi_n(x)$ continuous on $R_q$ such that $M \ge \varphi_1(x) \ge \varphi_2(x) \ge \cdots$ for all $x$ in $R_q$ and $\lim \varphi_n(x) = f(x)$ for all $x$ in $E$.

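Again the second statement follows from the first by change of sign. For $n = 1, 2, \ldots$ and all $x$ in $R_q$ we define

$$\varphi_n(x) = \inf \{ f(\bar{x}) + n \|x - \bar{x}\| : \bar{x} \text{ in } E \}.$$

Clearly $\varphi_n(x) \le \varphi_{n+1}(x)$. Also, for every $\bar{x}$ in $E$ we have $f(\bar{x}) + n \|x - \bar{x}\| \ge M$ for all $x$, so by 5.4 $\varphi_n(x) \ge M$. Let $x_1$ and $x_2$ be any points of $R_q$. For every $\bar{x}$ in $E$,

$$f(\bar{x}) + n \|x_1 - \bar{x}\| \le f(\bar{x}) + n \|x_2 - \bar{x}\| + n \|x_1 - x_2\|.$$

So by 5.4 $\varphi_n(x_1) \le \varphi_n(x_2) + n \|x_1 - x_2\|$. Interchanging $x_1$ and $x_2$, we get $\varphi_n(x_2) \le \varphi_n(x_1) + n \|x_1 - x_2\|$. Therefore $|\varphi_n(x_1) - \varphi_n(x_2)| \le n \|x_1 - x_2\|$, which proves that $\varphi_n(x)$ is continuous on all of $R_q$.

Finally, let $x_0$ be a point in $E$; we must show that $\lim \varphi_n(x_0) = f(x_0)$. Let $h$ be any number less than $f(x_0)$. By 7.3 there is an $\epsilon > 0$ such that $f(x) > h$ for all $x$ in $EN_\epsilon(x_0)$. Now choose a number $n_0$ large enough so that $M + n_0 \epsilon > h$, and let $n$ be greater than $n_0$; then in the expression $f(\bar{x}) + n \|x_0 - \bar{x}\|$ either $\bar{x}$ is in $N_\epsilon(x_0)$ or $\|x_0 - \bar{x}\| \ge \epsilon$. In the first case, $f(\bar{x}) + n \|x_0 - \bar{x}\| \ge f(\bar{x}) > h$. In the second case, $f(\bar{x}) + n \|x_0 - \bar{x}\| \ge M + n\epsilon > h$ by the choice of $n_0$. Thus $h$ is a lower bound for the values of the expression, and by definition $\varphi_n(x_0) \ge h$. Since $n$ was any number greater than $n_0$, this implies* $\lim \varphi_n(x_0) \ge h$. But $h$ was any number less than $f(x_0)$, so by 5.2

$$\lim \varphi_n(x_0) \ge f(x_0).$$

* The limit exists; see the exercise after 6.8.

On the other hand, in the definition of $\varphi_n(x_0)$ we can take $\bar{x} = x_0$ because $x_0$ is in $E$. So one possible value of $f(\bar{x}) + n \|x_0 - \bar{x}\|$ is $f(x_0)$, and $\varphi_n(x_0) \le f(x_0)$. Since this is true for all $n$, we have $\lim \varphi_n(x_0) \le f(x_0)$. This, with the preceding inequality, proves $\lim \varphi_n(x_0) = f(x_0)$, and completes the proof of our theorem.

7.10. Corollary. If $f(x)$ is defined and lower semi-continuous on $E$, and there is a constant $M$ such that $f(x) \ge M$ for all $x$ in $E$, there is a function $g(x)$ defined and lower semi-continuous on the whole space $R_q$ such that $g(x) = f(x)$ for all $x$ in $E$.

Let $\varphi_n(x)$ be the sequence of functions of 7.9, and define $g(x) = \lim \varphi_n(x)$. The limit exists, because $\varphi_1(x) \le \varphi_2(x) \le \cdots$; it is lower semi-continuous, by 7.5; and it coincides with $f(x)$ for all $x$ in $E$, by 7.9.

8. There are several classes of functions which distinguish themselves by their usefulness in one branch or another of analysis. One such class, important in the present connection, is the class of monotonic functions (cf. §6). A peculiarly important property of these functions will later turn out to be that (in a sense which can be made quite precise) they are almost everywhere continuous and almost everywhere have a derivative, the latter property not being shared by all continuous functions. We often wish to add and subtract functions; but the difference of two monotonic functions (e.g., $x^2$ and $x$) may not be monotonic. Hence another class assumes importance: the class of functions which are sums or differences of monotonic functions. The principal object of this section is to show that this class is the same as the class of functions of bounded variation which we now define.

Let us suppose that $f(x)$ is defined and finite-valued on an interval $I$: $a \le x \le b$. We subdivide $I$ by means of a finite number of points $\alpha_0, \alpha_1, \ldots, \alpha_n$, where $a = \alpha_0 < \alpha_1 < \cdots < \alpha_n = b$, and form the sum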

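$$\sum_{i=0}^{n-1} |f(\alpha_{i+1}) - f(\alpha_i)|.$$

The sup of this sum for all subdivisions of $I$ is called the total variation of $f$ over $I$, and is denoted by $T_f[I]$. If $T_f[I]$ is finite we say that $f(x)$ is of bounded variation (abbreviated "$f(x)$ is BV") on $I$.

Before proceeding with the theory we shall give an example of a function which is continuous but has not limited total variation. Such a function is obtained by letting the graph oscillate more and more rapidly near one end of the interval; for if the points $\alpha_i$ are suitably chosen between 0 and 1, the sum above is greater than a partial sum of a divergent series, and so the sup is $\infty$.

We verify at once that every function which is finite-valued and monotonic on an interval $I$ is of BV. Suppose to be specific that $f(x)$ is monotonic increasing; then for every subdivision $a = \alpha_0 < \alpha_1 < \cdots < \alpha_n = b$ we have

$$\sum_{i=0}^{n-1} |f(\alpha_{i+1}) - f(\alpha_i)| = \sum_{i=0}^{n-1} [f(\alpha_{i+1}) - f(\alpha_i)] = f(b) - f(a).$$

Hence $f(b) - f(a)$ is the common value of all the sums, and is therefore their sup, $T_f[I]$.

An easy consequence of the definition is

8.1. If $f(x)$ and $g(x)$ are of BV on $I$, and $\mu$ and $\nu$ are finite constants, then $\mu f(x) + \nu g(x)$ is of BV on $I$.

For if $a = \alpha_0 < \alpha_1 < \cdots < \alpha_n = b$ is a subdivision of $I$, then

$$\sum |\mu f(\alpha_{i+1}) + \nu g(\alpha_{i+1}) - \mu f(\alpha_i) - \nu g(\alpha_i)| \le |\mu| \sum |f(\alpha_{i+1}) - f(\alpha_i)| + |\nu| \sum |g(\alpha_{i+1}) - g(\alpha_i)| \le |\mu| T_f[I] + |\nu| T_g[I].$$

In particular, the difference of two monotonic functions is of BV.

8.2. If $\alpha_0 < \alpha_1 < \cdots < \alpha_n$ is a subdivision of the interval $I$: $a \le x \le b$, and $\beta_0 < \beta_1 < \cdots < \beta_m$ is a finer subdivision (that is, a subdivision which includes all the points $\alpha_i$ among the $\beta_j$), then

$$\sum_{i=0}^{n-1} |f(\alpha_{i+1}) - f(\alpha_i)| \le \sum_{j=0}^{m-1} |f(\beta_{j+1}) - f(\beta_j)|.$$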

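For each interval $[\alpha_i, \alpha_{i+1}]$ is subdivided into a number of intervals $[\beta_j, \beta_{j+1}]$, and from the equation

$$f(\alpha_{i+1}) - f(\alpha_i) = \sum [f(\beta_{j+1}) - f(\beta_j)],$$

the sum being over those intervals, we have the inequality

$$|f(\alpha_{i+1}) - f(\alpha_i)| \le \sum |f(\beta_{j+1}) - f(\beta_j)|.$$

Adding the inequalities for $i = 0, 1, \ldots, n - 1$, we obtain the desired result.

If $I$ is the interval $[\alpha, \beta]$, it is sometimes convenient to use the symbol $T_f[\alpha, \beta]$ to denote $T_f[I]$. For completeness, we define $T_f[\alpha, \alpha] = 0$. With this notation, we state

8.3. If $a \le b \le c$, then $T_f[a, c] = T_f[a, b] + T_f[b, c]$.

If $a = b$ or $b = c$, the result reduces to an obvious identity. Otherwise, let $h$, $k$, $l$ be numbers less than $T_f[a, c]$, $T_f[a, b]$, $T_f[b, c]$ respectively. Then we can subdivide $[a, c]$ by points $\beta_i$, and $[a, b]$ by points $\gamma_i$, and $[b, c]$ by points $\delta_i$ in such a way that

(a) $\sum |f(\beta_{i+1}) - f(\beta_i)| > h$,
(b) $\sum |f(\gamma_{i+1}) - f(\gamma_i)| > k$,
(c) $\sum |f(\delta_{i+1}) - f(\delta_i)| > l$.

Now take all the points $\beta_i$, $\gamma_i$, $\delta_i$ together, and re-name them $\xi_0, \xi_1, \ldots, \xi_p$ from left to right. Then these points provide a finer subdivision than those in (a), (b), and (c); and so by 8.2

(e) $\sum |f(\xi_{i+1}) - f(\xi_i)| > k$, summed over the intervals $[\xi_i, \xi_{i+1}]$ in $[a, b]$,
(f) $\sum |f(\xi_{i+1}) - f(\xi_i)| > l$, summed over the intervals $[\xi_i, \xi_{i+1}]$ in $[b, c]$,
(g) $\sum |f(\xi_{i+1}) - f(\xi_i)| > h$, summed over all the intervals $[\xi_i, \xi_{i+1}]$ in $[a, c]$.

But the sums in (e) and (f) add to give the sum in (g). Hence

$$T_f[a, c] > k + l \quad\text{and}\quad T_f[a, b] + T_f[b, c] > h.$$

Replacing $h$, $k$, $l$ by their upper bounds, as we may by 5.2, we find $T_f[a, c] \ge T_f[a, b] + T_f[b, c]$ and $T_f[a, b] + T_f[b, c] \ge T_f[a, c]$, establishing the equality.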
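The two mechanisms used in 8.2 and 8.3 (refining a subdivision never decreases the sum, and a sum over a subdivision containing the middle point splits into two pieces) can be tried out on a small example. The following is a sketch only: the helper names, the choice of $\sin x$ on $[0, 2\pi]$ and the uniform subdivisions are assumptions made for illustration.

```python
import math

def variation_sum(f, points):
    # Sum of |f(next) - f(previous)| over consecutive points of a subdivision.
    return sum(abs(f(b) - f(a)) for a, b in zip(points, points[1:]))

def uniform_subdivision(a, b, n):
    # n + 1 equally spaced points from a to b (hypothetical helper).
    return [a + (b - a) * i / n for i in range(n + 1)]

f = math.sin
coarse = uniform_subdivision(0.0, 2 * math.pi, 4)
# A finer subdivision: all coarse points plus some additional ones.
fine = sorted(set(coarse) | set(uniform_subdivision(0.0, 2 * math.pi, 8)))

coarse_sum = variation_sum(f, coarse)
fine_sum = variation_sum(f, fine)

# Split the fine subdivision at the division point pi, as in the proof of 8.3.
middle = coarse[2]
left_sum = variation_sum(f, [p for p in fine if p <= middle])
right_sum = variation_sum(f, [p for p in fine if p >= middle])

if __name__ == "__main__":
    print(coarse_sum, fine_sum, left_sum + right_sum)
```

Here the coarse subdivision already contains the maxima and minima of $\sin x$, so both sums are (up to rounding) equal to 4, the total variation of $\sin x$ on $[0, 2\pi]$.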

This theorem can be extended readily, by induction, to any finite number of intervals. One of its consequences is

8.4. If $f(x)$ is of BV on $I$, it is also of BV on every interval contained in $I$.

For let $I$ be the interval $[a, b]$, and let $[\alpha, \beta]$ be an interval contained in $I$. Then

$$T_f[\alpha, \beta] \le T_f[a, \alpha] + T_f[\alpha, \beta] + T_f[\beta, b] = T_f[a, b].$$

A trivial consequence of the definition of total variation is the inequality

8.5. $|f(\beta) - f(\alpha)| \le T_f[\alpha, \beta]$.

For $T_f[\alpha, \beta]$ is by definition not less than $\sum |f(\alpha_{i+1}) - f(\alpha_i)|$ for any subdivision of the interval $[\alpha, \beta]$, and in particular for the subdivision in which $\alpha_0 = \alpha$ and $\alpha_1 = \beta$ and there are no other $\alpha_i$.

We can now prove

8.6. If $f(x)$ is of BV on $I$, then there exist monotonic increasing functions $p(x)$, $n(x)$ such that $f(x) = p(x) - n(x)$.

Define $p(x) = \frac{1}{2} \{ T_f[a, x] + f(x) \}$ and $n(x) = \frac{1}{2} \{ T_f[a, x] - f(x) \}$. That $f(x) = p(x) - n(x)$ is obvious. We must show that $p$ and $n$ are monotonic increasing. Consider any two numbers $x_1$ and $x_2$ in $I$ with $x_1 < x_2$. Then by 8.3 and 8.5

$$p(x_2) - p(x_1) = \tfrac{1}{2} \{ T_f[x_1, x_2] + f(x_2) - f(x_1) \} \ge 0.$$

Also,

$$n(x_2) - n(x_1) = \tfrac{1}{2} \{ T_f[x_1, x_2] - [f(x_2) - f(x_1)] \} \ge 0.$$

Hence both $p$ and $n$ are monotonic increasing, and the theorem is proved. Taking this in conjunction with the remark after 8.1, we see that a function is of BV on $I$ if and only if it is the difference of two monotonic functions.

9. As we mentioned in the first paragraph of §8, the function of BV will later be shown to have a derivative at all points with relatively few exceptions. But it does not follow that such a function is the integral of its derivative. For this to be the case, we shall later find that the function must satisfy a certain condition called absolute continuity. A function $f(x)$, defined and finite on an interval $I$, is absolutely continuous (henceforth abbreviated AC) if to every positive number $\epsilon$ there corresponds a $\delta > 0$ such that $\sum |f(\beta_i) - f(\alpha_i)| < \epsilon$ for all finite

collections of non-overlapping subintervals $(\alpha_i, \beta_i)$ of $I$ with total length less than $\delta$. (Intervals in $R_1$ are non-overlapping if they have at most end points in common.) It follows at once that if the collection consists of a single interval $I_1 = (\alpha_1, \beta_1)$ of length $\beta_1 - \alpha_1 < \delta$, then $|f(\beta_1) - f(\alpha_1)| < \epsilon$, so that an AC function is also uniformly continuous, and therefore is continuous. The converse is not true; for in a moment we shall prove that every function which is AC is of BV, while in §8 we gave an example of a continuous function which was not of BV. Let us begin by proving

9.1. If $f(x)$ is AC on an interval $[a, b]$, then it is of BV.

By hypothesis, $f(x)$ is AC on the interval $[a, b]$. So by definition, there is a $\delta$ such that $\sum |f(\beta_i) - f(\alpha_i)| < 1$ whenever $\sum (\beta_i - \alpha_i) < \delta$, the intervals $(\alpha_i, \beta_i)$ being non-overlapping. We insert a finite number of points $a = c_0 < c_1 < c_2 < \cdots < c_n = b$ between $a$ and $b$ in such a way that $c_{i+1} - c_i < \delta$. We notice that if the interval $[c_i, c_{i+1}]$ be subdivided in any way by points $\alpha_0 = c_i < \alpha_1 < \cdots < \alpha_k = c_{i+1}$, then $\sum (\alpha_{j+1} - \alpha_j) = c_{i+1} - c_i < \delta$, so by the choice of $\delta$ we have $\sum |f(\alpha_{j+1}) - f(\alpha_j)| < 1$. Thus $T_f[c_i, c_{i+1}] \le 1$. Applying 8.3,

$$T_f[a, b] = T_f[c_0, c_1] + \cdots + T_f[c_{n-1}, c_n] \le n.$$

Hence $T_f[a, b]$ is finite, as was to be proved.

Obviously the converse of this is false; a function can have BV and not be continuous at all. But it might be suspected that a continuous function which has BV is necessarily absolutely continuous. This too is false. For we can give an example of a function which is finite, continuous and monotonic increasing (and therefore certainly of BV), and yet is not AC. This example is not quite trivially easy to construct, but it is important enough to justify some effort.

We begin with the interval $[0, 1]$, and imagine that we blacken the middle third, $(\frac{1}{3}, \frac{2}{3})$. This interval we call

[...] and $\liminf E_i$ are identical, we denote them both by $\lim E_i$. From 19.11 and 6.7 it is obvious that the following is true.

19.12s. If $E_1, E_2, \ldots$ are measurable subsets of a set $E$ of finite measure, and $\lim E_i$ exists, then $\lim mE_i$ also exists, and $m(\lim E_i) = \lim mE_i$.

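19.13s. If a set $E$ has measure 0 and the set $E_1$ is contained in $E$, then $mE_1 = 0$. If $E_1, E_2, \ldots$ is a finite or denumerable collection of sets of measure 0, then $\bigcup E_i$ has measure 0.

If $E_1 \subset E$, then $0 \le K_{E_1}(x) \le K_E(x)$. Then for each integer $n$ and each interval $I$ containing $W_n$ in its interior, the upper and lower integrals of $K_{E_1 W_n}(x)$ over $I$ lie between 0 and the integral of $K_{E W_n}(x)$ over $I$, which is $mEW_n = 0$. So the characteristic function of $E_1 W_n$ is summable and its integral is 0 for every $n$, and by definition, $mE_1 = 0$. The second part of the theorem follows from 19.8, for $\bigcup E_i$ is measurable, and $m \bigcup E_i \le \sum mE_i = 0$.

The next three theorems are not "s" theorems.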

S EC. 19.15]

SETS AND FUNCTIONS

107

19.14. If $E$ is a denumerable set, $mE = 0$.

By 16.1, the measure of a set containing a single point (which can be regarded as a degenerate closed interval) is zero. If $E$ is denumerable, it is the sum of denumerably many sets consisting of one point each, so by 19.13 it has measure zero.

The concept of a set of measure zero is so important and useful that it is frequently used in discussions in which no other concept out of the theory of Lebesgue integration is needed. Thus, for example, in the study of the Riemann integral it is shown (though we shall not prove it) that in order for a function $f(x)$ defined and bounded in an interval $I$ to be Riemann integrable over $I$, it is necessary and sufficient that the set of discontinuities of $f(x)$ have measure zero. When sets of measure zero enter in such discussions, it is customary to say that a set $E$ has measure zero if for every positive $\epsilon$ it is possible to cover $E$ with a finite or denumerable set of open intervals $I_1, I_2, \ldots$ such that $\sum \Delta I_n < \epsilon$. We now proceed to show that this definition is in fact equivalent to ours. (It is not an "s" theorem.)

19.15. In order that a set $E$ have measure 0, it is necessary and sufficient that for every positive $\epsilon$ there exist a finite or denumerable set of open intervals $I_1, I_2, \ldots$ such that $\bigcup I_n \supset E$ and $\sum \Delta I_n < \epsilon$.

The condition is sufficient. Suppose first that $E$ is bounded, interior say to a closed interval $I$, which we use as basic interval. With the intervals $I_n$ of the hypothesis,

$$K_E(x) \le K_{\bigcup I_n}(x) \le \sum K_{I_n}(x),$$

so

$$\overline{\int_I} K_E(x)\,dx \le \int_I \sum K_{I_n}(x)\,dx = \sum \int_I K_{I_n}(x)\,dx \le \sum \Delta I_n < \epsilon.$$

Since this holds for every positive $\epsilon$, and $K_E(x)$ is non-negative, by 5.2, 14.4 and 14.2 we have

$$0 \le \underline{\int_I} K_E(x)\,dx \le \overline{\int_I} K_E(x)\,dx \le 0.$$

Hence, by 19.1, $mE = 0$. If $E$ is unbounded, by the preceding paragraph $mEW_n = 0$ for every $n$, so by 19.1 we again have $mE = 0$.

The condition is necessary. Suppose first that $E$ is bounded, interior say to a closed interval $I$, which we use as basic interval. Let $\epsilon$

108 be a positive number.

INTEGRATION

[CHAP. ILL

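Since the integral of the characteristic function $K_E(x)$ of E is zero, by 15.1 there is a U-function $u(x) \ge K_E(x)$ such that

$$\int_I u(x)\,dx < \frac{\varepsilon}{2^{q+1}}.$$

(Recall that our set E is in q-dimensional space.) Let V be the subset of I on which $u(x) > \tfrac12$. This is open relative to I by 7.4, and contains E. It is clear that

$$K_V(x) \le 2u(x);$$

for if x is in V the left member is 1 and the right member exceeds 1, and otherwise the left member is zero and the right is at least $2K_E(x)$, hence non-negative. Since $K_V(x)$ is a bounded U-function by 12.9, it is summable, and

$$mV = \int_I K_V(x)\,dx \le 2\int_I u(x)\,dx < \frac{\varepsilon}{2^q}.$$

The set V is the sum of a finite or denumerable collection of closed intervals $I_1, I_2, \ldots$ by 3.7. If we denote the interior of $I_n$ by $I_n^0$, the sets $I_n^0$ are disjoint, and their sum is contained in V. By 16.1, $mI_n^0 = \Delta I_n$. Now using 19.8 and 19.3

$$\sum_n \Delta I_n = \sum_n mI_n^0 = m\Big(\bigcup_n I_n^0\Big) \le mV,$$

whence $\sum_n \Delta I_n < \varepsilon/2^q$. The $I_n$ are still not the intervals desired, since they are closed. Suppose that $(c_n^{(1)}, \ldots, c_n^{(q)})$ is the center of $I_n$, so that $I_n$ is defined by inequalities $|x^{(i)} - c_n^{(i)}| \le a_n^{(i)}$ (i = 1, ..., q). Let $J_n$ be the interval

$$|x^{(i)} - c_n^{(i)}| < 2a_n^{(i)} \qquad (i = 1, \ldots, q).$$

Each edge of $J_n$ is twice the corresponding edge of $I_n$. Hence $\Delta J_n = 2^q \Delta I_n$ and $\sum \Delta J_n < \varepsilon$, and the $J_n$ are the intervals desired.

If E is of measure zero but unbounded, for each n the subset $EW_n$ can be enclosed in open intervals $I_{n,1}, I_{n,2}, \ldots$ with $\sum_j \Delta I_{n,j} < \varepsilon\,2^{-n}$. Then the aggregate of all intervals $I_{n,j}$ (n = 1, 2, ...; j = 1, 2, ...) is denumerable, and

$$\sum_{n,j} \Delta I_{n,j} < \varepsilon \sum_n 2^{-n} = \varepsilon.$$

Clearly E is covered by these intervals. This completes the proof.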
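The covering condition of 19.15 can be tried out on a denumerable set. The following is a minimal Python sketch, not a proof: it takes the first 200 rationals of [0, 1] in an enumeration of our own choosing (the helper names `rationals_in_unit_interval` and `cover` are hypothetical) and gives the n-th point an open interval of length $\varepsilon/2^{n+1}$, so that the total length stays below ε however many points are taken.

```python
from fractions import Fraction
from math import gcd


def rationals_in_unit_interval(count):
    """First `count` distinct rationals in [0, 1], listed by increasing denominator."""
    found = []
    denominator = 1
    while len(found) < count:
        for numerator in range(denominator + 1):
            # Keep only fractions in lowest terms, so no rational is listed twice.
            if gcd(numerator, denominator) == 1:
                found.append(Fraction(numerator, denominator))
                if len(found) == count:
                    break
        denominator += 1
    return found


def cover(points, eps):
    """Give the n-th point (n = 1, 2, ...) an open interval of length eps / 2**(n + 1)."""
    intervals = []
    for n, p in enumerate(points, start=1):
        half = eps / 2 ** (n + 2)
        intervals.append((p - half, p + half))
    return intervals


eps = Fraction(1, 100)
points = rationals_in_unit_interval(200)
intervals = cover(points, eps)
total_length = sum(b - a for a, b in intervals)
print(float(total_length), "<", float(eps))
```

Exact rational arithmetic is used so that the comparison of the total length with ε is not blurred by rounding.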

19.16. Let E be a set, and $(h^{(1)}, \ldots, h^{(q)})$ a q-tuple of real numbers. If E is measurable so is its translation $E^{(h)}$, and $mE^{(h)} = mE$.

If mE is finite, this is merely 18.7 applied to $K_E(x)$. Otherwise, for each n we consider $EW_n$. This has finite measure, so its translation has the same finite measure: $m(EW_n)^{(h)} = mEW_n$. As n increases the sets $EW_n$ expand, and so do their translations. So by 19.8 the sum $E^{(h)}$ of the translations $(EW_n)^{(h)}$ is measurable, and

$$mE^{(h)} = \lim_{n \to \infty} m(EW_n)^{(h)} = \lim_{n \to \infty} mEW_n = mE.$$

20. The simplest sets with which we have had to deal are the intervals. For these the measure is easily found; by 16.1, $mI = \Delta I$. (This is not an "s" theorem, but a rather more complicated device permits us to obtain the measures of intervals from the interval-function $\Delta I$ without using any properties of $\Delta I$ except those in 10.1. The proof occurs later, in 47.3.) We may regard open sets as next in simplicity, since by 3.6 each open set G is the sum of a denumerable set of disjoint intervals $I_1, I_2, \ldots$. Thus for an open set G we have $mG = \sum mI_n$. That is, on the basis of the theory which we have developed it is now possible to calculate the measure of an open set directly from the values of the interval-function $\Delta I$, without explicitly computing any integrals. If F is a bounded closed set, we can enclose it in an open interval G; then mF is the difference between the measures of the open sets G and $G - F$. If F is an unbounded closed set, it is the same as $\lim FW_n$, and each set $FW_n$ is a bounded closed set. So the measures of closed sets are also obtainable without explicitly performing an integration.

We now proceed to show how the measures of arbitrary sets can be studied through a knowledge of measures of open and closed sets. It will follow that the whole theory of measure, and with it the whole theory of integration, could be built on this basis, although we do not choose to develop it in that way. However, a word of caution may not be amiss. We have already used 19.8 in obtaining the equation $mG = \sum mI_n$ in the preceding paragraph. This informs us in particular that the sum $\sum mI_n$ has the same value, no matter how we choose to decompose the set G into non-overlapping closed intervals. If we had chosen to base our theory of integration on the measures of open sets, putting the definition $mG = \sum mI_n$ at the beginning of the theory, we would have had to devise some other proof of this independence. This would have called for a proof somewhat like that of 10.2, but more complicated because it would involve infinite collections of intervals. And this would be only the first of several troubles, for the theorems to which we shall refer in the proofs in this section would be no longer available, and consequently the proofs would be much more difficult.

It is convenient to define first two numbers, the exterior measure $m_eE$ and the interior measure $m_iE$, which are defined for all sets and which are related to the measure of E (when it exists) in much the same way that the upper and lower integrals of f(x) are related to the integral of f(x) (when it exists).

20.1s. Let E be an arbitrary set in the space $R_q$. Then its exterior measure $m_eE$ is the inf of the measures of all open sets containing E, and its interior measure $m_iE$ is the sup of the measures of all closed sets contained in E.

As an immediate consequence of the definition we have the following theorem.

20.2s. For every set E the inequalities $0 \le m_iE \le m_eE$ are satisfied.

Let E be a set, and let F and G be respectively a closed set and an open set such that $F \subset E \subset G$. By 19.3, $0 \le mF \le mG$. For each closed set F contained in E this inequality holds for all open sets G containing E, so mF is a lower bound for mG, and cannot exceed the inf. That is, $mF \le m_eE$. This holds for all closed sets F contained in E, so $m_eE$ is an upper bound for the measures of such sets, and cannot be less than their sup, which is $m_iE$.

Another immediate consequence of 20.1 is the following.

20.3s. If $E_1 \subset E_2$, then $m_iE_1 \le m_iE_2$ and $m_eE_1 \le m_eE_2$.

The class of closed sets $F \subset E_1$ is a subset of the class of closed sets $F \subset E_2$, and by 5.3 this implies the first inequality. The second is proved analogously.
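For an open set on the line made up of finitely many open intervals, the computation $mG = \sum mI_n$ described at the beginning of this section can be carried out mechanically. The following is a minimal Python sketch under that restriction (one dimension, finitely many intervals); the helper names `components` and `measure` are our own. It also shows the measure of a bounded closed set obtained as the difference $mG - m(G - F)$, and the monotonicity one expects from 20.3.

```python
def components(intervals):
    """Merge open intervals (a, b) on the line into disjoint open components."""
    merged = []
    for a, b in sorted(i for i in intervals if i[0] < i[1]):
        # Two open intervals are joined only if they genuinely overlap;
        # (0, 1) and (1, 2) stay separate because the point 1 is in neither.
        if merged and a < merged[-1][1]:
            merged[-1] = (merged[-1][0], max(merged[-1][1], b))
        else:
            merged.append((a, b))
    return merged


def measure(intervals):
    """Sum of the lengths of the disjoint components."""
    return sum(b - a for a, b in components(intervals))


G = [(0, 2), (1, 3), (5, 6)]          # an open set given with overlaps
bigger = G + [(2, 6)]                 # an open set containing G
print(components(G), measure(G))      # disjoint components and their total length

# The closed interval F = [1, 2] lies in the open interval (0, 3);
# its measure is the difference of the measures of two open sets.
outer = [(0, 3)]
outer_minus_F = [(0, 1), (2, 3)]
m_F = measure(outer) - measure(outer_minus_F)
print(m_F)
```

The decomposition into components is one particular choice; by the remark on 19.8 above, any other decomposition into non-overlapping intervals gives the same sum.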

Still another corollary of definition 20.1 is this theorem.

20.4s. If $E_1, E_2, \ldots$ is a finite or denumerable collection of sets in the space $R_q$, then

$$m_e\Big(\bigcup E_n\Big) \le \sum m_eE_n.$$

If the right member of the inequality is infinite this is trivial. Otherwise, let ε be an arbitrary positive number. By 20.1 we can enclose $E_n$ in an open set $G_n$ such that

$$mG_n < m_eE_n + \varepsilon/2^n.$$

Then $\bigcup G_n$ is an open set, and by 19.8

$$m\Big(\bigcup G_n\Big) \le \sum mG_n < \sum m_eE_n + \varepsilon.$$

So by 20.1 we have $m_e(\bigcup E_n) < \sum m_eE_n + \varepsilon$, and by 5.2 this implies the conclusion of the theorem.

Now we establish a connection between the exterior measure and upper integral, and between interior measure and lower integral.

20.5s. Let E be a bounded set, interior to a closed interval I. Then

$$m_eE = \overline{\int_I} K_E(x)\,dx, \qquad m_iE = \underline{\int_I} K_E(x)\,dx.$$

Let J be the interior of I. If G is any open set containing E, then JG is open by 2.10, contains E because J and G both contain E, and is interior to I. Hence we have by 19.3, 19.2, the remark after 18.1, 1.2 and 14.4

[…]

Thus by definition 20.1

(A) $m_eE \ge \overline{\int_I} K_E(x)\,dx.$

Likewise, if F is closed and contained in E,

[…]

whence by 20.1

(B) $m_iE \le \underline{\int_I} K_E(x)\,dx.$

Let ε be an arbitrary positive number. By definition 14.1, there is a U-function […] such that

[…]

Define

[…]

This is a U-function by 12.4, and is clearly positive-valued. We define G to be the subset of I on which […]. This set is open relative to I by 7.4. So is JG, since J is the interior of I; and by 2.7 the set JG is open. Also, JG contains E, for if x is in E it is in J by hypothesis, and

[…]

so that x is in G. Also the characteristic function of JG is less than […]; for if x is in JG the characteristic function is 1 while […] by definition of G, and otherwise the characteristic function is 0 while […] is positive-valued. Hence by 20.1, 19.2, 18.3 and 11.6

[…]

Since ε is arbitrary, by 5.2 this implies

$$m_eE \le \overline{\int_I} K_E(x)\,dx.$$

This, with (A), establishes the first of the two equations in the conclusion of our theorem.

Again, let ε be an arbitrary positive number. There is an L-function […] such that

[…]

Define

[…]

and let F be the set on which […]. This is closed relative to I by 7.4, hence is closed by 2.7. It is contained in E, for at each point x of F we have

[…]

so $K_E(x)$ must be 1. Also, we have […] for all x; for if x is in F […], and otherwise […] is zero and […] is negative. By 20.1, 19.2, 18.3 and 11.6

[…]

Since ε is arbitrary, by 5.2 this implies $m_iE \ge \underline{\int_I} K_E(x)\,dx$, and this with (B) completes the proof.

From this theorem we deduce several corollaries.

20.6s. If E is bounded, it is measurable if and only if $m_iE = m_eE$; and in that case $mE = m_eE = m_iE$.

Let I be a closed interval containing E in its interior. Since E is bounded, by 19.2 it is measurable if and only if it has finite measure. Again by 19.2, this is true if and only if $K_E(x)$ is summable over the space $R_q$. By the remark after 18.1, this is true if and only if $K_E(x)$ is summable over I. By 15.1, this is true if and only if the upper and lower integrals of $K_E(x)$ are finite and equal. Since $K_E(x)$ is bounded, its upper and lower integrals over I cannot be infinite; so E is measurable if and only if the upper and lower integrals of $K_E(x)$ over I are equal. By 20.5, this is true if and only if $m_iE = m_eE$. Furthermore, if $m_iE = m_eE$, by 20.5 their common value is the integral of $K_E(x)$ over I, which by 17.4 and 18.1 is its integral over $R_q$, which by 19.2 is mE.

Theorem 20.6 permits us to distinguish bounded measurable sets from bounded non-measurable sets, and to compute the measures of the former. From this we could proceed as in §19 to define measurable sets to be those sets E such that $EW_n$ is measurable for each n, and to obtain their measures as in 19.1. An alternative procedure will be mentioned after 20.10.

20.7s. If E is a measurable set and ε is a positive number, there exists an open set G containing E such that $m(G - E) < \varepsilon$, and there exists a closed set F contained in E such that $m(E - F) < \varepsilon$.

If E is measurable and bounded, by 20.6 we have $m_eE = mE$, so by 20.1 the set E is contained in an open set G such that $mG < mE + \varepsilon$. But by 19.8 we have $mG = mE + m(G - E)$, so $m(G - E) < \varepsilon$. If E is measurable but unbounded, for each positive integer n the set $EW_n$ is measurable and bounded, and by the preceding sentence is contained in an open set $G_n$ such that $m(G_n - EW_n) < \varepsilon/2^n$. The set $G = \bigcup G_n$ is open by 2.10 and contains $\bigcup EW_n$, which is E. Also, $G - E$ is contained in $\bigcup (G_n - EW_n)$; for if x is in $G - E$, it is in some $G_n$ but not in any $EW_n$. Hence by 19.3 and 19.8

$$m(G - E) \le m\bigcup (G_n - EW_n) \le \sum m(G_n - EW_n) < \sum \varepsilon/2^n = \varepsilon.$$

Thus the first conclusion is established. If E is measurable, so is its complement CE. We enclose CE in an open set G such that $m(G - CE) < \varepsilon$, and we define $F = CG$. This is closed by 2.6, and is contained in E. Moreover,

$$E - F = EG = G - CE,$$

so $m(E - F) < \varepsilon$. This completes the proof.

20.8s. If E is measurable, then $m_iE = m_eE = mE$.

If $mE = \infty$, then by 20.7 with ε = 1 there is a closed set $F \subset E$ with $m(E - F) < 1$, so that $mF = \infty$. By 20.1 and 20.2 this implies $m_iE = m_eE = \infty$. If mE is finite, the sets F and G of 20.7 show that $m_iE \ge mF > mE - \varepsilon$ and $m_eE \le mG < mE + \varepsilon$. Since ε is arbitrary, by 5.2 and 20.2 we have

$$mE \le m_iE \le m_eE \le mE,$$

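establishing the theorem.

20.9s. Let M be a set having finite measure, and let E be a subset of M. Then

(A) $m_eE + m_i(M - E) = mM$

and

(B) $m_iE + m_e(M - E) = mM.$

Let ε be an arbitrary positive number. The numbers in equations (A) and (B) are clearly finite. By 20.7 there is an open set G containing M for which $mG < mM + \varepsilon$, and by 20.1 there is a closed set F contained in E for which $mF > m_iE - \varepsilon$. The set $G - F$ is open by 2.6 and 2.10, and contains $M - E$. Hence by 20.1 and 19.8

$$m_e(M - E) \le m(G - F) = mG - mF < mM - m_iE + 2\varepsilon.$$

By 5.2 this implies

(C) $m_iE + m_e(M - E) \le mM.$

On the other hand, by 20.7 there is a closed subset $F_1$ of M such that $mF_1 > mM - \varepsilon$, and by 20.1 there is an open set $G_1$ containing $M - E$ such that $mG_1 < m_e(M - E) + \varepsilon$. Then the set $F_1 - G_1$ is closed by 2.6 and 2.10, and is contained in E. So by 19.3 and 19.8

$$m(F_1 - G_1) \ge mF_1 - mG_1 > mM - \varepsilon - mG_1.$$

But by 20.1 and the choice of $G_1$ this yields

$$m_iE \ge m(F_1 - G_1) > mM - m_e(M - E) - 2\varepsilon,$$

and by 5.2 we find

$$m_iE + m_e(M - E) \ge mM.$$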

This and (C) together imply (B). From (B) we obtain (A) by interchanging E and $M - E$.

20.10s. Let E be a set in the space $R_q$. In order that E be measurable it is necessary and sufficient that the equation

(A) $m_eE_1 = m_eE_1E + m_eE_1CE$

be satisfied for every set $E_1$.

Suppose that E is measurable; then CE is also measurable. Let ε be an arbitrary positive number. By 20.7, there are open sets G′ and G″ such that G′ contains E, G″ contains CE, and […]. But […], so by 19.3 and 19.8

[…]

We now wish to establish the inequality

(B) $m_eE_1 \ge m_eE_1E + m_eE_1CE$

for all sets $E_1$. If $m_eE_1 = \infty$ this is trivial. Otherwise, by 20.1 there is an open set G containing $E_1$ for which $mG < m_eE_1 + \varepsilon$. Then GG′ and GG″ are open sets by 2.10, and contain $E_1E$ and $E_1CE$ respectively. Hence by 20.1

[…]

The three disjoint sets […], […] and […] fill the set […], so by 19.3 and 19.8

[…]

It follows that

[…]

and by 5.2 this implies inequality (B). But […], so by 20.4 we have

$$m_eE_1 \le m_eE_1E + m_eE_1CE.$$

This and (B) together establish equation (A).

Conversely, suppose equation (A) satisfied for a set E. For $E_1$ we choose a bounded measurable set. By 20.9,

[…]

Since $E_1$ is measurable, by 20.8 equation (A) implies

[…]

Hence the exterior and interior measures of the bounded set $E_1E$ are equal, and by 20.6 $E_1E$ is measurable. In particular, if $E_1$ is the interval $W_n$ this shows that $EW_n$ is measurable, so by 19.1 the set E is measurable. This completes the proof of the theorem.
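The splitting equation of 20.10 can be checked by hand on simple sets. The following is a minimal Python sketch, assuming everything is a finite union of half-open intervals [a, b) on the line inside a fixed window (so that every set involved is measurable and exterior measure coincides with measure); the helpers `normalize`, `measure`, `intersect` and `complement` are hypothetical names of our own.

```python
def normalize(intervals):
    """Sort half-open intervals [a, b) and merge them into a disjoint list."""
    merged = []
    for a, b in sorted(i for i in intervals if i[0] < i[1]):
        if merged and a <= merged[-1][1]:
            merged[-1] = (merged[-1][0], max(merged[-1][1], b))
        else:
            merged.append((a, b))
    return merged


def measure(intervals):
    return sum(b - a for a, b in normalize(intervals))


def intersect(xs, ys):
    """Product (intersection) of two finite unions of intervals."""
    return normalize([(max(a, c), min(b, d)) for a, b in xs for c, d in ys])


def complement(xs, lo, hi):
    """Complement of xs relative to the window [lo, hi)."""
    out, cursor = [], lo
    for a, b in normalize(xs):
        a, b = max(a, lo), min(b, hi)
        if a >= b:
            continue
        if cursor < a:
            out.append((cursor, a))
        cursor = max(cursor, b)
    if cursor < hi:
        out.append((cursor, hi))
    return out


E = [(0, 2), (3, 5)]                  # the set being tested
E1 = [(1, 4), (6, 7)]                 # an arbitrary test set inside the window
CE = complement(E, -10, 10)

inside = intersect(E1, E)             # the part of E1 in E
outside = intersect(E1, CE)           # the part of E1 in the complement of E
print(measure(E1), measure(inside) + measure(outside))
```

For sets this simple the equation is of course automatic; the sketch only makes concrete what the two terms on the right of (A) measure.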


If we had a more complete theory of the exterior measure of sets, developed without reference to measurable sets, we could use the property expressed in 20.10 to define the property of measurability. That is, a set E would by definition be measurable if the equation

$$m_eE_1 = m_eE_1E + m_eE_1CE$$

held for every set $E_1$. Having selected the measurable sets by this test, we could define the measure of a measurable set to be the same as its exterior measure. From the properties of exterior measure already established we could then deduce the properties of measurable sets, and erect a theory of integration on this foundation. This is the method originated by Carathéodory. An exposition can be found in Carathéodory's Vorlesungen über reelle Funktionen, or Kestelman's Modern Theories of Integration.

EXERCISE. Show that if E is a bounded set, there are sets P and S with the following properties. P is an intersection of open sets and contains E. S is a sum of closed sets and is contained in E. The equations $mP = m_eE$ and $mS = m_iE$ hold. Furthermore, for every measurable set M we have $mPM = m_eEM$ and $mSM = m_iEM$. Extend this to unbounded sets E.

21. If we recall even a very little about the Riemann integral and the mild generalization of it mentioned in §16, we see that a function f(x) which is not Riemann integrable may lack integrability for either of two very different reasons. It may have too many discontinuities, as in the example after 18.6. Or it may simply have too many large functional values; for example the function f(x) which is $x^{-1}$ for $0 < x \le 1$ and has the value 0 at x = 0. This function is not integrable, although it has only a single discontinuity. This latter type of function may clearly have many of the desirable properties of Riemann-integrable functions, even though it is not integrable.

In the case of the Lebesgue integral the situation is similar. A function which is not summable may be of so intricate a structure that its upper and lower integrals have different values. Such complicated functions we shall avoid. On the other hand, it may be non-summable merely because it has too many large functional values. In this case its structure may be as simple as that of the summable functions.

In order to be able to discuss the functions of relatively simple structure we introduce the notion of a measurable function. The class of measurable functions is closely related to the class of summable functions, as theorems 21.4 and 22.1 will show.
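The second kind of failure can be seen numerically. The following is a minimal Python sketch, assuming the example is f(x) = 1/x on (0, 1] (the exponent is our reading of the text) and using a plain midpoint rule of our own choosing: the integrals of the bounded functions min(1/x, n) over [0, 1] equal 1 + log n, and so grow without bound as n increases.

```python
import math


def truncated_integral(n, cells=100_000):
    """Midpoint-rule approximation of the integral over [0, 1] of min(1/x, n)."""
    h = 1.0 / cells
    return sum(min(1.0 / ((k + 0.5) * h), n) for k in range(cells)) * h


for n in (10, 100, 1000):
    # The exact value is n * (1/n) + log(n) = 1 + log(n).
    print(n, truncated_integral(n), 1 + math.log(n))
```

Each truncated function is bounded and has a single discontinuity-free graph, so the only obstacle to integrability of 1/x itself is the size of its values near 0.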


21.1s. A function f(x) defined on a set E in $R_q$ is measurable (on E) if for each constant a (finite or infinite) the set of points x in E at which $f(x) \ge a$ is a measurable set.

We use the symbol $E[f \ge a]$ to mean the set of all x in E such that $f(x) \ge a$ (in our previous notation: […]); and we also use analogous symbols for sets on which f(x) satisfies other conditions. The meaning will be evident in each case.

A first remark is that if f(x) is measurable on E, then E is a measurable set. For $E[f \ge -\infty]$ is then measurable, and this set is E itself.

A second simple remark is that a set E is measurable if and only if $K_E(x)$ is a measurable function on the space $R_q$. For if E is measurable the set $R_q[K_E \ge a]$ is one of the measurable sets […] according as […]; and if $K_E(x)$ is a measurable function the set […] is measurable.

A third remark is that if f(x) is measurable over E and […] are arbitrary constants, then (with the notation 10.8) […] is measurable over […]. This follows from 19.16 if we observe that x is in […] if and only if […] is in E and […]. It is not an s-theorem.

A fourth remark almost as obvious is

21.2s. If f(x) is defined and measurable on a set E, and $E_0$ is a measurable subset of E, then f(x) is measurable on $E_0$.

For if a is any number, the set $E_0[f \ge a]$ is the product of the measurable sets $E_0$ and $E[f \ge a]$, and so it is measurable by 19.6.

We now list five conditions each of which could serve as a definition of measurability.

21.3s. If f(x) is defined on a measurable set E, then each of the following conditions is necessary and sufficient in order that f(x) be measurable on E:
(a) $E[f \le a]$ is measurable for all a;
(b) $E[f < a]$ is measurable for all a;
(c) $E[f > a]$ is measurable for all a;
(d) $E[b \ge f > a]$ is measurable for all a and all b;
(e) $E[b \ge f \ge a]$ is measurable for all a and all b.
(We could state two more necessary and sufficient conditions by replacing ≥ by > after the number b in (d) and (e).)

Suppose f(x) is measurable. The set $E[f > a]$ is the sum of the sets $E[f \ge r]$ for all rational numbers r > a. For if $f(x) > a$ and $a < \infty$, there is a rational r such that $f(x) \ge r > a$, and so x is in $E[f \ge r]$; and if x is in some one of the sets $E[f \ge r]$, then $f(x) \ge r > a$, so x is in $E[f > a]$. If $a = \infty$, the set $E[f > a]$ is empty and there are no r > a, so the equality of $E[f > a]$ and $\bigcup E[f \ge r]$, r > a, is still valid. Each of the sets $E[f \ge r]$ is measurable by hypothesis, and there are at most denumerably many of them because the rationals are denumerable. So their sum $E[f > a]$ is measurable by 19.8. Thus if f(x) is measurable, (c) is satisfied.

Suppose that (c) is satisfied. For every a we have $E[f \le a] = E - E[f > a]$. But $E[f > a]$ is measurable for every a, so by 19.6 so is $E[f \le a]$. Therefore if (c) is satisfied, so is (a).

Suppose (a) is satisfied. If in the first paragraph of this proof we replace the signs ≥, > by ≤, < respectively and replace ∞ by −∞, we have a proof that (b) is satisfied.

Suppose that (b) is satisfied. Then for every a the set $E[f < a]$ is measurable. But $E[f \ge a] = E - E[f < a]$, so by 19.6 $E[f \ge a]$ is measurable for every a, and so f(x) is measurable.

Summing up, we have shown that the measurability of f(x) implies (c), which implies (a), which implies (b), which implies the measurability of f(x). Hence the measurability of f(x) is equivalent to (a), to (b) and to (c).

If f(x) is measurable, then for every a and b the sets $E[f \le b]$ and $E[f > a]$ are measurable, by condition (a) and condition (c) respectively. Hence their product $E[b \ge f > a]$ is measurable. Conversely, if $E[b \ge f > a]$ is measurable for all a and b, then by setting $b = \infty$ we find that $E[f > a]$ is measurable for each a. This is condition (c), equivalent to measurability, so condition (d) is equivalent to measurability. If in the preceding paragraph we replace > by ≥ and replace the words "condition (c)" by "the definition of measurability of f(x)," we have a proof that (e) is equivalent to the measurability of f(x).

EXERCISE. Theorem 21.3 remains valid even if a and b are restricted to lie in a set M whose closure is R*.

The next theorem partly exhibits the relationship between measurability and summability.
This relationship will appear even closer after the demonstration of theorems 22.1 and 22.2, which are partial converses of 21.4.

21.4s. If f(x) is defined and summable over a measurable set E, it is measurable on E.

We define f(x) to be −∞ on the complement of E; this does not affect its summability over E. Let a be an arbitrary real number. For each positive integer n we define

$$f_n(x) = \inf\{\,n \cdot \sup\{f(x) - a,\ 0\},\ 1\,\}.$$

It is clear that $0 \le f_n(x) \le f_{n+1}(x) \le 1$, so the limit of $f_n(x)$ as $n \to \infty$ exists. Call this limit g(x). We shall now show that g(x) is the characteristic function of $E[f > a]$. If x is in $E[f > a]$, then $f(x) - a > 0$ and $\sup\{f(x) - a, 0\} = f(x) - a > 0$. Hence for all n greater than $1/(f(x) - a)$ we have $n \cdot \sup\{f(x) - a, 0\} > 1$, so that $f_n(x) = 1$. Thus $f_n(x) = 1$ for all large n, and the limit g(x) of $f_n(x)$ is 1. If x is not in $E[f > a]$, then $f(x) - a \le 0$ and $\sup\{f(x) - a, 0\} = 0$. Then for all n we have $n \cdot \sup\{f(x) - a, 0\} = 0$, so that $f_n(x) = 0$. Therefore g(x) = 0. Thus g(x) is 1 if x is in $E[f > a]$ and is 0 otherwise.

For each of the intervals $W_p$ the intersection $EW_p$ is measurable by 19.1, so 1 is summable over $EW_p$ by 19.2. Also f(x) is summable over $EW_p$ by 18.1; so by 18.3 $f(x) - a$ is summable over $EW_p$, and so also is $\sup\{f(x) - a, 0\}$. But this vanishes on CE, as shown above; so by the definition 17.1 it is summable over the entire interval $W_p$. Again by 18.3, $n \sup\{f(x) - a, 0\}$ and $f_n(x)$ are summable over $W_p$. For all n we have (by 15.2)

$$\int_{W_p} f_n(x)\,dx \le \int_{W_p} 1\,dx = \Delta W_p,$$

so the integrals do not tend to ∞ as $n \to \infty$. Therefore by 18.4 the limit g(x) of the $f_n(x)$ is summable over $W_p$. Since g(x) is the characteristic function of $E[f > a]$, this implies that $W_pE[f > a]$ is measurable for all p, so for every finite number a the set $E[f > a]$ is measurable by definition 19.1. For […] is empty, and for […] is the whole set E. In either case, it is measurable. Therefore […] is measurable for all a, and by 21.3 the function f(x) is measurable.

21.5s. Corollary. If f(x) is summable over a set E, then the subset $E_0$ of E on which $f(x) \ne 0$ is measurable.

Let f(x) = 0 if x is in CE. Then f(x) is summable over the whole space $R_q$, and by 21.4 it is measurable. By 21.3, the sets $R_q[f > 0]$ and $R_q[f < 0]$ are measurable. But these sets lie entirely in E, and their sum is $E_0$.

Theorem 21.4 would lead us to suspect that theorems analogous to those of §18 should hold for measurable functions. We now establish several such theorems.
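The functions $f_n(x) = \inf\{n \cdot \sup\{f(x) - a, 0\}, 1\}$ used in the proof of 21.4 are easy to tabulate. The following is a minimal Python sketch with an illustrative choice of our own ($f(x) = x^2$, a = 1/4, and a few sample points); it shows the values increasing with n and reaching the characteristic function of the set where f > a as soon as n exceeds $1/(f(x) - a)$ at every sample point where f(x) > a.

```python
def f_n(f, a, n):
    """The n-th function of the proof: inf{ n * sup{ f(x) - a, 0 }, 1 }."""
    return lambda x: min(n * max(f(x) - a, 0.0), 1.0)


def indicator(f, a):
    """Characteristic function of the set on which f > a."""
    return lambda x: 1.0 if f(x) > a else 0.0


def square(x):
    return x * x


a = 0.25
xs = [k / 8 for k in range(-8, 9)]            # sample points in [-1, 1]
limit = [indicator(square, a)(x) for x in xs]

for n in (1, 4, 8):
    values = [f_n(square, a, n)(x) for x in xs]
    print(n, values == limit)
```

At the sample points the smallest positive value of f(x) − a is 0.140625, so agreement with the characteristic function first occurs at n = 8.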

21.6. If f(x) is measurable on E and c is any finite constant, then cf(x) is measurable on E.

For […] this is obvious. Suppose […]. Then for every a the set […] is the same as the set […], which is measurable by 21.3(a); so […] is measurable on E. […] then as we have just seen […] is measurable, and therefore so is […].

21.7s. If $f_1(x), f_2(x), \ldots$ is a finite or denumerable collection of functions measurable on E, then $\sup_i f_i(x)$ and $\inf_i f_i(x)$ are measurable on E.

Let $g(x) = \sup_i f_i(x)$, and let a be any number. We first show

$$E[g > a] = \bigcup_i E[f_i > a].$$

If x is in $E[g > a]$, then $\sup_i f_i(x) > a$, so for some value j of i we have $f_j(x) > a$. Thus x is in $E[f_j > a]$, and therefore is in $\bigcup_i E[f_i > a]$. Conversely, if x is in $\bigcup_i E[f_i > a]$ it is in some set $E[f_j > a]$. Hence $f_j(x) > a$ and $g(x) \ge f_j(x) > a$. Therefore x is in $E[g > a]$. Now each set $E[f_i > a]$ is measurable by 21.3(c). Hence $E[g > a]$ is the sum of a finite or denumerable collection of measurable sets, and is measurable by 19.8. So by 21.3(c), the function g(x) is measurable.

The functions $-f_i(x)$ are measurable by 21.6. Hence, by the preceding part of the proof, $\sup_i (-f_i(x))$ is measurable. But this is the same as $-\inf_i f_i(x)$ by 5.5; and, applying 21.6 again, $\inf_i f_i(x)$ is measurable.

21.8s. Corollary. If f(x) is measurable on E, then […] and […] are also measurable on E.

For then −f(x) is measurable on E, by 21.6; so by 21.7 the functions

[…]

are measurable on E.

21.9s. If $f_1(x), f_2(x), \ldots$ is a sequence of functions all of which are measurable on E, then

$$\limsup_{n \to \infty} f_n(x) \quad \text{and} \quad \liminf_{n \to \infty} f_n(x)$$

are measurable on E.

If we define $g_n(x) = \sup\{f_n(x), f_{n+1}(x), \ldots\}$, then $g_n(x)$ is measurable by 21.7, and also $\inf_n g_n(x)$ is measurable by 21.7. This last is the upper limit of $f_n(x)$ by the paragraph following 6.1. Interchanging sup and inf proves that the lower limit is measurable.

* The theorem remains true if

21.10s. If f(x) is defined on a set E, and E is the sum of a finite or denumerable collection of measurable sets on each of which f(x) is measurable, then f(x) is measurable on E.

For then the set $E[f \ge a]$ is the sum of the various sets […], each of which is measurable by hypothesis. So $E[f \ge a]$ is measurable by 19.8, that is, f(x) is measurable on E.

In particular, if f(x) is defined on E and is measurable on […], where […], then f(x) is measurable on E. For f(x) is measurable on […], since for each a the set […] has measure 0 by 19.13.

21.11s. If […] are functions defined, finite and measurable on a measurable set $E_0$ in q-dimensional space $R_q$; S is a set of points in h-dimensional space $R_h$, and S is either open, or closed, or the sum of a denumerable set of closed sets; […] is defined on S and is continuous or upper semi-continuous or lower semi-continuous on S; then the subset […] of $E_0$ on which the function […] is defined, is measurable and […] is measurable on […].

Let F be the mapping of $E_0$ into $R_h$ defined by

[…]

Even if the […] happen to be defined at values of x not in $E_0$, we shall ignore such values, so that F(x) is defined only on $E_0$. The inverse mapping, which may be many-valued, will be denoted by […]. Thus for each z in $R_h$ we define […] to be the set of all points x in $E_0$ such that F(x) = z. Also, if […] is a point-set in $R_h$, […] is the collection of all x in $E_0$ such that F(x) is in […].

Let J be any interval in $R_h$ of the type described in 3.6, namely […]. We now prove that […] is measurable. Clearly it consists of all x in $E_0$ such that F(x) is in J; so it consists of all x in $E_0$ which satisfy all of the inequalities

[…]

Otherwise stated, if […] is the set of all x satisfying […], the set […] is the intersection […]. Each of these sets is measurable by 21.3, so by 19.8 their intersection is also measurable. Every open set is the sum of denumerably many closed sets, by 3.6.


Now suppose that […] is an arbitrary open set in $R_h$, the union of intervals of the above mentioned type. Since each […] is measurable, the union is measurable by 19.8. If […] is closed, then […] is open. Hence […] is measurable, and since […], it too is measurable by 19.6. If […] is the sum of denumerably many closed sets […], then […], and hence is measurable by the preceding paragraph and 19.8.

From the preceding proof, for each of the three types of set S under consideration the set […] on which […] is defined is a measurable subset of $E_0$, because obviously […].

Now let us suppose that […] is lower semi-continuous. Suppose first that S is open. For every number a in […] the set […] is open relative to S by 7.4, and so it is open, by 2.7. The set […] is the same as the set […], and since […] is open this set is measurable. By 21.3(c), this implies that […] is measurable on […].

Suppose next that S is closed. The set […] is closed relative to S for every a, by 7.4; so by 2.7 it is closed. The set […] is the same as the set […], and by the first part of the proof this is measurable because […] is closed. Hence by 21.3(a) the function […] is measurable on […].

Finally, suppose that S is the union of a denumerable set of closed sets […]. The set […] is the sum of the sets […]. Each set […] is closed relative to the closed set […] by 7.4, so is closed by 2.7. So each set […] is measurable, by the first part of this proof. The sum of these sets, for n = 1, 2, ..., is […], which is the same as […]. This last set is therefore measurable. By 21.3(a), the function […] is measurable.

Finally, if […] is upper semi-continuous, then […] is lower semi-continuous. Therefore […] is measurable on […] by the proof above, so by 21.6 the function […] is also measurable on […].

21.12s. Corollary. If the functions […] are finite valued and measurable on a set E, then the functions […] and […] are also measurable. The quotient […] is measurable if […]. […] the function […] is measurable on E. If […] the function […] is measurable if […].

These all follow from 21.11 by special choices of […]. For […] we use […], continuous for all z; for […] we use […], continuous for all z. In both cases the set on which […] is defined is all of E. For the quotient we use […], defined and continuous on the open set where […]; then if […], […] is defined for all x in E and is measurable. If […] the function […] is continuous for all z, so […] is defined for all x in E and is measurable. If […] the function […] is defined and continuous on the open set […], so if […] the power […] is defined and measurable on E.

REMARK. It follows readily that the product of measurable functions is measurable, even though the functions are not finite-valued. Let […] and […] be measurable on E. Then so are

by the second remark after 21.1, 21.6 and 21.7. Also […] and […] for all x. By 21.12, […] is measurable for each n; so by 21.9 the limit […] is also measurable.

22. We saw in 21.4 that every function which is summable over a measurable set E is measurable. Of course the converse is false; the function 1 is measurable over the whole space $R_q$ but is not summable over $R_q$. However, two partial converses to theorem 21.4 can be established.

22.1s. If f(x) is bounded and measurable on a set E of finite measure, it is summable over E.

Suppose that […], where M is finite. We consider first the case in which E is bounded, and therefore interior to some closed interval I, which we use as basic interval. If ε is a positive number, we can subdivide the interval […] by points […] into subintervals of length less than ε. Let $E_i$ be the set […]. Every point x of E belongs to exactly one set $E_i$, since each […] lies in exactly one interval […].

Now define

[…]

Then […]. For if x is in CE all three are equal to 0. If x is in E it belongs to one set […]. Then in the sums defining […] and […] all terms except the term with […] have value 0, so […]. Also […] for all x. By 21.3(d), each set $E_i$ is measurable, so each characteristic function $K_{E_i}(x)$ is summable over I. By 15.6, the functions […] and […] are summable over I. Therefore by 14.4 and 14.2

[…]

This tells us, first, that the upper and lower integrals of […] over I are finite, and second, that

[…]

Hence the difference between the upper and lower integrals of […] does not exceed the arbitrarily small positive number $\varepsilon\,\Delta I$; so the two are equal, and […] is summable. That is, f(x) is summable over E.

If E is not bounded, by the proof above f(x) is summable over […] for every n. Also,

[…]

So […] is summable over E.

EXERCISE. Let […] be continuous, strictly monotonic increasing (that is, […]) and bounded for […]; for instance, […] or arc tan t. A function […] defined and finite on a measurable set E in the space […] is measurable if and only if […] is summable over […] for every n.

We now generalize 22.1 so as to allow both the set E and the function f(x) to be unbounded.

22.2s. If f(x) is measurable on E and g(x) is summable over E and $|f(x)| \le g(x)$, then f(x) is summable over E.

Suppose first that E is bounded and that f(x) is non-negative. For each positive integer n we define $f_n(x) = \inf\{f(x), n\}$. Clearly we have $0 \le f_n(x) \le f(x) \le g(x)$. Each $f_n(x)$ is measurable by 21.7, and is bounded; so by 22.1 each is summable. Also, by 18.2

[…]

As $n \to \infty$ the functions $f_n(x)$ tend to f(x). For if f(x) is finite, then $f_n(x) = f(x)$ for all n > f(x); and if $f(x) = \infty$, then $f_n(x) = n$ and $\lim f_n(x) = \infty$. Therefore by 18.4 the limit f(x) of the sequence $f_n(x)$ is summable over E.

Now we remove the restriction $f(x) \ge 0$, still supposing E bounded. The functions $f^+(x)$ and $f^-(x)$ are measurable by 21.8, and they satisfy the inequalities

[…]

So by the preceding paragraph they are both summable over E. But by 15.5 f(x) is the difference of $f^+(x)$ and $f^-(x)$, and is therefore summable by 18.3.

Finally suppose that E is unbounded. For each p, the function f(x) is summable over $EW_p$ by the preceding proof. Also, for all p […]. So f(x) is summable.

EXERCISE. Let […] be as described in the preceding exercise. A function f(x) defined and finite on a measurable set E is measurable if and only if

[…]

is summable over E.

22.3s. Corollary. If f(x) is measurable and |f(x)| is summable, then f(x) is summable.

For then the hypotheses of 22.2 are satisfied, with g(x) = |f(x)|.

From theorem 22.2 we obtain two important corollaries.

22.4s. If f(x) is summable over a measurable set E and g(x) is bounded and measurable over E, then f(x)g(x) is summable over E.

Suppose $|g(x)| \le M$. By 21.4 and the remark after 21.12, f(x)g(x) is measurable on E. Also, $|f(x)g(x)| \le M|f(x)|$, which is summable by 18.3. So f(x)g(x) is summable over E, by 22.2.
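The truncation device in the proof of 22.2 can be followed numerically. The following is a minimal Python sketch with an illustrative choice of our own: $f(x) = x^{-1/2}$ on (0, 1], which is unbounded yet summable, with g = f as the dominating function, and a plain midpoint rule standing in for the integral. The integrals of the bounded functions $f_n = \inf\{f, n\}$ equal 2 − 1/n; they increase with n and stay below the integral 2 of g.

```python
def truncated_integral(n, cells=100_000):
    """Midpoint-rule approximation of the integral over [0, 1] of min(x**-0.5, n)."""
    h = 1.0 / cells
    return sum(min(((k + 0.5) * h) ** -0.5, n) for k in range(cells)) * h


previous = 0.0
for n in (1, 10, 100):
    value = truncated_integral(n)
    # Exact value: n * (1/n**2) + (2 - 2/n) = 2 - 1/n, which is below 2.
    print(n, value, 2 - 1 / n, value > previous)
    previous = value
```

Boundedness of these integrals is exactly what allows 18.4 to be applied in the proof; for a function such as 1/x the corresponding integrals would grow without bound.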


22.5s. If f(x) is summable over a measurable* set $E_0$ and E is a measurable subset of $E_0$, then f(x) is summable over E.

For $K_E(x)$ is bounded and measurable over the whole space $R_q$, so it is measurable over $E_0$ by 21.2. By 22.4, the product $f(x)K_E(x)$ is summable over $E_0$; that is, f(x) is summable over E.

* By use of 21.5 it can be shown that the hypothesis that $E_0$ is measurable can be omitted.

23. We shall shortly see that sets of measure zero are usually unimportant in the theory of Lebesgue integration. Consequently, it frequently happens that a statement about a function which is true except on a set of measure zero carries the same consequences as if it were true everywhere. This situation arises often enough to justify the introduction of a special name. A statement will be said to hold "almost everywhere" or "for almost all x" provided that it holds for all x except those belonging to a set of measure zero. (The set of measure zero may be the empty set Λ; that is, if the statement holds everywhere it holds almost everywhere.)

23.1s. If f(x) is summable over E, then f(x) is finite almost everywhere in E.

Let […] be equal to f(x) on E and to zero elsewhere. Then […] is summable over the whole space $R_q$, and so is […] by 18.3. By 19.5 and 21.4 this function is measurable, so by 21.1 the set $E_0$ on which it is […] is measurable. This is the same as the set […]. For every n we have […] for all x in […]. Therefore for each interval […]

[…]

since […] is summable by 18.3. This holds for all n, which is possible only if […]. Since […] for all p, by definition […].

The next theorem is merely a lemma to be used in the proof of 23.3.

23.2s. If E is a set of measure 0 and f(x) is defined on E, then f(x) is summable over E, and its integral is 0.

The function f(x) is measurable on E, for $E[f \ge a]$ is contained in E and has measure 0 by 19.13. By 18.3, for each n, the integral […] exists and has the value […]. Hence by 18.4 the function […] is summable over E, and […]. Since f(x) is measurable on E and […], […] is summable. Also […].

23.3s. If f(x) is summable over a set E, and g(x) is defined on E and equal to f(x) for almost all x in E, then g(x) is also summable over E, and

$$\int_E g(x)\,dx = \int_E f(x)\,dx.$$

Define f_0(x) to be equal to f(x) on E and equal to zero elsewhere, and define g_0(x) analogously. Let N be the set of measure zero on which f_0(x) ≠ g_0(x). Since N has measure 0, both f_0(x) and g_0(x) are summable over N, and their integrals over N are zero. In other words,

$$\int_N g_0(x)\,dx = \int_N f_0(x)\,dx,$$

both integrals existing and being equal to 0. The set CN is measurable by 19.6, so f_0(x) is summable over CN by 22.5. But on CN the functions f_0(x) and g_0(x) are equal, so

$$\int_{CN} g_0(x)\,dx = \int_{CN} f_0(x)\,dx,$$

both integrals existing. Adding these equations member by member, we find that g_0(x) is summable over the whole space and has the same integral as f_0(x), which is equivalent to the conclusion of our theorem.

Two functions f(x), g(x) which are both defined on the same set E are called equivalent if f(x) = g(x) for almost all x in E. Thus 23.3 can be stated in the form that if one of a pair of equivalent functions is summable, so is the other, and the integrals of the two functions are equal.

23.4s. Let f(x) be defined on the sum of two sets E_1 and E_2, and let m[(E_1 − E_2) ∪ (E_2 − E_1)] = 0. Then f(x) is summable over E_1 if and only if it is summable over E_2; and if it is summable, then

$$\int_{E_1} f(x)\,dx = \int_{E_2} f(x)\,dx.$$

Suppose f(x) defined in any manner on C(E_1 ∪ E_2). The functions f(x)K_{E_1}(x) and f(x)K_{E_2}(x) are equivalent on the whole space. For they differ at most on the set (E_1 − E_2) ∪ (E_2 − E_1), which has measure 0. The first of these functions is summable over the whole space if and only if f(x) is summable over E_1, and the second is summable over the whole space if and only if f(x) is summable over E_2. So by 23.3 f(x) is summable over E_1 if and only if it is summable over E_2. Moreover, if it is summable then by 23.3

$$\int_{E_1} f(x)\,dx = \int_{R_q} f(x)K_{E_1}(x)\,dx = \int_{R_q} f(x)K_{E_2}(x)\,dx = \int_{E_2} f(x)\,dx.$$

REMARK. We can now show that the hypotheses concerning the signs of the functions f_n(x) in 18.4 and 18.5 are superfluous. We consider only 18.4, since this yields 18.5 as a corollary. Let E_0 be the subset of E on which |f_1(x)| = ∞; by 23.1, this has measure 0. Define D = E − E_0, and on D define g_n(x) = f_n(x) − f_1(x). These functions are summable over D, by 23.4 and 18.3, and

$$0 = g_1(x) \le g_2(x) \le \cdots.$$

They tend to f(x) − f_1(x) as n → ∞. So by 18.4 f(x) − f_1(x) is summable over D (that is, f(x) is summable over D) if and only if

$$\lim_{n\to\infty} \int_D g_n(x)\,dx < \infty,$$

and this is true if and only if

$$\lim_{n\to\infty} \int_D f_n(x)\,dx < \infty.$$

In this case, by 18.4

$$\int_D [f(x) - f_1(x)]\,dx = \lim_{n\to\infty} \int_D [f_n(x) - f_1(x)]\,dx,$$

that is,

$$\int_D f(x)\,dx = \lim_{n\to\infty} \int_D f_n(x)\,dx.$$

Since D and E differ only on the set E_0 of measure 0, by 23.4 these statements remain true when D is replaced by E.

23.5s. If f(x) ≥ 0 on a set E, then the integral of f(x) over E exists and is zero if and only if f(x) = 0 for almost all x in E.

Suppose that f(x) = 0 for almost all x in E. By 23.3, f(x) is summable and its integral is 0. Conversely, suppose that f(x) is summable and that its integral is 0. Let E_1 be the set on which f(x) > 0; we must show that mE_1 = 0.


As n → ∞, the function n f(x) tends to ∞ if x is in E_1, and is always 0 if x is not in E_1. Hence lim n f(x) = ∞·K_{E_1}(x). Also, f(x) ≤ 2f(x) ≤ 3f(x) ≤ ···, and

$$\int_E n f(x)\,dx = n \int_E f(x)\,dx = 0.$$

So by 18.4 the limit function ∞·K_{E_1}(x) is summable. It is equal to ∞ on E_1. But by 23.1 the set of points on which a summable function has the value ∞ must have measure 0; so mE_1 = 0.

A simple generalization of 23.5 is

23.6s. If f(x) and g(x) are both summable over a set E, and f(x) ≥ g(x), and

$$\int_E f(x)\,dx = \int_E g(x)\,dx,$$

then f(x) = g(x) almost everywhere in E.

Let E_0 be the subset of E on which |g(x)| = ∞; by 23.1, mE_0 = 0. On E − E_0 the function f(x) − g(x) is defined and non-negative, and by 23.4 and 18.3

$$\int_{E - E_0} [f(x) - g(x)]\,dx = \int_E f(x)\,dx - \int_E g(x)\,dx = 0.$$

So by 23.5 f(x) = g(x) for all x in E − E_0 except those x belonging to a set E_1 of measure 0. Then E_0 ∪ E_1 has measure 0, and f(x) = g(x) on E − (E_0 ∪ E_1); that is, f(x) = g(x) almost everywhere in E.

24. The mere fact that two functions f(x) and g(x) are both summable is of course insufficient to assure us that their product is summable. For instance, let f(x) = g(x) = x^{-1/2} for 0 < x ≤ 1. Each of these is summable, but their product x^{-1} is not. In 22.4 we obtained one set of conditions under which we could be sure that f(x)g(x) is summable. Here we shall obtain a different set of conditions for the integrability of f(x)g(x), and in the process we shall establish one of the most important inequalities of analysis, the Schwarz-Hölder inequality. Let us say that two functions $( 1 the two members are equal if and only if there are numbers λ, μ not both zero such that λf(x) = μg(x) for almost all x.

EXERCISE. Let E be a set of finite measure. If 1 ≤ a < b, and f(x) is a function of class L_b on E, it is also of class L_a on E. (Define p = b/a. Then |f(x)|^a is of class L_p on E. Apply 24.4 with f, g replaced by |f(x)|^a, 1 respectively.)
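The example at the start of §24 (f(x) = g(x) = x^{-1/2} on 0 < x ≤ 1) can be checked with elementary antiderivatives. The sketch below is only an illustration: the function names and the cut-off `eps` are introduced here and are not part of the text. It evaluates the integrals over [eps, 1] in closed form and shows that the first stays below 2 as eps shrinks while the second grows without bound.

```python
import math


def integral_inv_sqrt(eps):
    """Integral of x**(-1/2) over [eps, 1], from the antiderivative 2*sqrt(x)."""
    return 2.0 - 2.0 * math.sqrt(eps)


def integral_inv(eps):
    """Integral of x**(-1) over [eps, 1], from the antiderivative log(x)."""
    return -math.log(eps)


# Shrink the cut-off: the first column approaches 2, the second keeps growing.
for k in (2, 4, 8, 16):
    eps = 10.0 ** (-k)
    print(k, integral_inv_sqrt(eps), integral_inv(eps))
```

The truncated integrals only suggest the behavior; the summability statements themselves rest on the limit theorems of the text.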

CHAPTER IV

The Integral as a Function of Sets; Convergence Theorems

25. Our treatment of integration has permitted us to discuss multiple integrals along with single integrals, without regard to the number of independent variables x^{(i)}. We shall now discuss the reduction of a multiple integral to an iterated integral. In order to avoid complexity of notation we shall suppose that all functions mentioned are defined over the whole space R_q. As we know, the integral of f(x) over E can be considered as the integral of f(x)K_E(x) over R_q, so this agreement does not involve any loss of generality. Let s, t be positive integers such that s + t = q. The point x = (x^{(1)}, ···, x^{(q)}) can be written in the form ( nt,j, the set E_k is contained in E_ε; so on E − E_ε we have

This proves our theorem.

28.8s. Let the functions f_1(x), f_2(x), ··· be defined and measurable on a measurable set E. If f_n(x) converges to f(x) in any one of our three modes, then f(x) is measurable.

If f_n tends almost everywhere to f(x), then f(x) is almost everywhere equal to lim inf f_n(x), which is measurable by 21.9. If f_n tends to f almost uniformly, then f_n tends to f almost everywhere by 28.6, so f is measurable. If f_n converges in measure to f(x), a subsequence converges almost everywhere to f(x) by 28.7, so f(x) is measurable.

Before proceeding to the proof of the next theorem we establish a lemma.

28.9s. Lemma. If f(x) and g(x) are defined and measurable on a set E, then for every δ > 0 the subset E_δ of E on which g(x) is not in N_δ(f(x)) is a measurable set.

Let us subdivide E into the set E_1 on which f(x) and g(x) are both finite, the set E_2 on which f(x) is finite and |g(x)| = ∞, the set E_3 on which f(x) = +∞, and the set E_4 on which f(x) = −∞. By 21.3 each of these sets is measurable. On E_1, the function f(x) − g(x) is defined and is measurable by 21.12 and 21.2. So |f(x) − g(x)| is measurable by 21.8, and the set E_{1,δ} on which |f(x) − g(x)| ≥ δ is measurable by 21.1. On all of E_2, g(x) fails to lie in N_δ(f(x)). On E_3, g(x) is measurable by 21.2, and so the subset E_{3,δ} on which g(x) ≤ 1/δ is measurable. Likewise g(x) is measurable on E_4, so the subset E_{4,δ} on which g(x) ≥ −1/δ is measurable. But by the definition of neighborhood, the set E_δ on which g(x) is not in N_δ(f(x)) consists of E_{1,δ} ∪ E_2 ∪ E_{3,δ} ∪ E_{4,δ}, so it is measurable.

28.10s. Let the functions f(x), f_1(x), f_2(x), ··· be defined and measurable on a set E of finite measure. If f_n(x) converges almost everywhere to f(x), then f_n(x) converges to f(x) almost uniformly (hence in measure).

Let ε and δ be positive numbers, and let E_{n,δ} be the subset of E on which f_n(x) is not in N_δ(f(x)). By 28.9, each E_{n,δ} is measurable. So if we define

$$E_\delta^k = \bigcup_{n \ge k} E_{n,\delta} \quad\text{and}\quad E_\delta = \bigcap_{k=1}^{\infty} E_\delta^k,$$

we see by 19.8 and 19.9 or 19.11 that E_δ^k and E_δ are measurable sets.


But if x belongs to E_δ, it belongs to all E_δ^k. If it belongs to E_δ^k, it belongs to some E_{n,δ} with n ≥ k. So if it is in E_δ it belongs to infinitely many E_{n,δ}, and f_n(x) does not converge to f(x); since f_n(x) converges to f(x) almost everywhere, mE_δ = 0. Moreover, E_δ^1 ⊃ E_δ^2 ⊃ ···. Also they are all contained in the set E, which has finite measure. Hence by 19.9 lim mE_δ^k = mE_δ = 0 as k → ∞, and for some k we have mE_δ^k < ε. Now define E_ε = E_δ^k, n_{ε,δ} = k. Then mE_ε < ε. By the definition of E_δ^k, if n > n_{ε,δ} = k then E_{n,δ} ⊂ E_δ^k = E_ε, so for every x in E − E_ε we know that x is not in E_{n,δ}, and therefore f_n(x) is in N_δ(f(x)). This proves that f_n(x) tends to f(x) almost uniformly.

An important property of any limiting process is the uniqueness of the resulting limit. In the present case, we have no true uniqueness, for if f_n tends in any one of our three modes to f(x), it tends equally well to any function g(x) which differs from f(x) only on a set of measure zero. But we can prove that no greater arbitrariness than this is allowed:

28.11s. Let f(x), g(x), f_1(x), ··· be defined on a set E. If f_n(x) converges in any one of our three modes to f(x) and also converges in any one of our three modes to g(x), then f(x) = g(x) for almost all x.

If f_n(x) converges in measure to f(x), by 28.7 it is possible to select a subsequence {f_α(x)} (α = n_1, n_2, ···) which converges almost everywhere to f(x). If f_n(x) tends to f(x) almost everywhere, this is still true even if we take the whole sequence, and if f_n(x) tends almost uniformly to f(x) it tends almost everywhere to f(x) by 28.6. So in any case we can choose a subsequence {f_α(x)} converging almost everywhere to f(x). The sequence {f_n(x)} converged in some one of the three modes to g(x), so the subsequence {f_α(x)} converges in the same mode to g(x), by 28.4. As in the preceding paragraph, it is possible to select a subsequence {f_β(x)} out of the sequence {f_α(x)} in such a way that f_β(x) tends to g(x) almost everywhere. The sequence f_β(x) continues to converge almost everywhere to f(x), by 28.4. Then the equations

$$\lim_{\beta\to\infty} f_\beta(x) = f(x) \quad\text{and}\quad \lim_{\beta\to\infty} f_\beta(x) = g(x)$$

are respectively true for all x except those belonging to two sets E_0, E_1 of measure 0. Therefore except on the set E_0 ∪ E_1 of measure 0 we have f(x) = g(x), as was to be proved.


It might be thought that 28.7 could be improved to read that if f_n converges in measure to f, then f_n converges almost everywhere to f. An example shows that this is false. Let E_1 be the interval [0, 1], E_2 and E_3 the intervals [0, 1/2] and [1/2, 1] respectively, E_4, E_5, E_6, E_7 the intervals [0, 1/4], [1/4, 1/2], [1/2, 3/4], [3/4, 1] respectively, and so on, proceeding by successive bisections. Let f_n = K_{E_n}. Then f_n converges in measure to 0 on the interval [0, 1]. For if δ > 0, then f_n(x) is in N_δ(0) (in fact, is equal to 0) except at most on E_n; so we can take E_{n,δ} = E_n, and lim mE_{n,δ} = lim mE_n = 0. But at no point is lim f_n(x) = f(x); in fact, lim f_n(x) does not exist. For each x in [0, 1] is contained in infinitely many E_n, so infinitely many f_n(x) have value 1; while there are infinitely many E_n which do not contain x, so f_n(x) = 0 for infinitely many n. Hence lim f_n(x) does not exist.

29. It is at once apparent that the mere convergence, in any one of our three modes, of a sequence of functions f_n(x) to a limit function f(x) is not enough to guarantee that the integrals of the f_n(x) will converge to the integral of f(x). For example, let f_n(x) be defined thus:

(n = 1, 2, ···). These functions are all continuous and tend everywhere (hence in all three of our modes) to f_0(x) = 0. But

so the integrals of the f_n(x) do not converge to the integral of f_0(x). This example makes it clear that we must make other assumptions besides mere convergence of the f_n(x) in order to obtain convergence of the integrals. In this section we set forth several such sets of assumptions.

First, however, we shall make a general observation. If a sequence of summable functions {f_n(x)} converges in any of our modes to a function f_0(x), we know by 23.1 that each function f_n(x) is finite except on a set E_n of measure 0. If we re-define f_n(x) by assigning it the value 0 on E_n, then the new functions f_n(x) have the same integrals as the old and still converge in the same mode as before to f_0(x); for the change of values on the set ∪E_n of measure 0 does not affect any of our types of convergence. Hence in the theorems of this section there is no loss of generality in assuming that the functions f_n(x) have finite values. This assumption simplifies the statements of some of our conclusions.
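The bisection example given at the end of §28 (E_1 = [0, 1], then the two halves, then the four quarters, and so on) lends itself to a small computation. The following is a sketch under assumptions made here: the indexing formula in `interval` and the choice of the sample point x = 1/3 are illustrative and not taken from the text. It shows the measures mE_n shrinking while f_n(x) = K_{E_n}(x) keeps taking both values 0 and 1.

```python
from fractions import Fraction


def interval(n):
    """Endpoints (left, right) of E_n, n >= 1, in the bisection enumeration:
    E_1 = [0, 1]; E_2, E_3 the halves; E_4, ..., E_7 the quarters; and so on."""
    level = n.bit_length() - 1      # n lies in [2**level, 2**(level + 1))
    index = n - (1 << level)        # position of E_n within its level
    width = Fraction(1, 1 << level)
    return index * width, (index + 1) * width


def f(n, x):
    """Characteristic function of E_n evaluated at x."""
    left, right = interval(n)
    return 1 if left <= x <= right else 0


x = Fraction(1, 3)
values = [f(n, x) for n in range(1, 1025)]
measure_last = interval(1024)[1] - interval(1024)[0]
print(sum(values), len(values) - sum(values), measure_last)
```

Each bisection level contributes at least one index n with f_n(x) = 1, so the value 1 recurs indefinitely even though mE_n tends to 0; this is the behavior the text describes.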


The first theorem which we shall establish is a lemma which not only is useful for later proofs, but is of considerable importance in itself.

29.1s. (Fatou's Lemma). If the functions f_n(x) are all summable over a measurable set E, and the lower limit of their integrals is not +∞, and there is a summable function g(x) such that f_n(x) ≥ g(x) on E, then lim inf f_n(x) is summable, and

$$\int_E \liminf_{n\to\infty} f_n(x)\,dx \le \liminf_{n\to\infty} \int_E f_n(x)\,dx.$$

Define g_n(x) = inf {f_i(x) | i ≥ n}. Then g(x) ≤ g_1(x) ≤ g_2(x) ≤ ···; and by 6.9 the limit of g_n(x) is lim inf f_n(x). Moreover, g_n(x) is measurable by 21.7, and g(x) ≤ g_n(x) ≤ f_n(x), so |g_n(x)| ≤ |g(x)| + |f_n(x)|, which is summable. So by 22.2 g_n(x) is summable over E, and by 18.2

$$\int_E g_n(x)\,dx \le \int_E f_n(x)\,dx.$$

It follows that

$$\lim_{n\to\infty} \int_E g_n(x)\,dx \le \liminf_{n\to\infty} \int_E f_n(x)\,dx.$$

Now the g_n(x) satisfy the hypotheses of 18.4, so by that theorem

$$\int_E \lim_{n\to\infty} g_n(x)\,dx = \lim_{n\to\infty} \int_E g_n(x)\,dx.$$

Together with the preceding inequality, this establishes the theorem.

29.2s. Corollary. If the functions f_n(x) are summable over a measurable set E, and on E they converge in measure or almost everywhere to a function f(x), and there is a summable function g(x) such that f_n(x) ≥ g(x) for all n and all x, then f(x) is summable on E and

$$\int_E f(x)\,dx \le \liminf_{n\to\infty} \int_E f_n(x)\,dx,$$

provided that the right member of this inequality is finite.

Let us suppose that f_n(x) tends to f(x) almost everywhere in E. Then lim inf f_n(x) = f(x) for almost all x in E. But lim inf f_n(x) is summable by 29.1, so f(x) is summable by 23.3. The inequality follows at once from 29.1 and 23.3. Now let f_n(x) tend to f(x) in measure, and define


From the sequence {f_n(x)} we can (by 6.4) select a subsequence {/ ho, and there is a summable function g(x) such that |f(x, h)| ≤ g(x) for all x in E and all h in H, then f(x) is summable over E, and (a) (b)

Let h_1, h_2, ··· be a sequence of points of H converging to h_0 and ≠ h_0. The sequence f(x, h_n) (n = 1, 2, ···) satisfies the hypotheses of 29.3, so f(x) is summable over E and

This holds for every sequence {h_n} of points of H converging to h_0 and distinct from h_0, so by 4.5 we obtain the desired conclusions. Analogous extensions of theorems and definitions 29.4-29.8 can be made with equal ease. We shall not state them in detail.


Let us return again to the example with which we began this section. We notice that the convergence troubles arose because there were arbitrarily small intervals [0, h] on which the integrals of the f_n(x) were not arbitrarily small; in fact, these integrals were equal to 1 for all large n. This suggests that we might arrive at a convergence theorem by excluding this type of behavior. Now we know from 27.1 that for each n and for every ε > 0 there is a δ > 0 such that the integral of f_n(x) over any set E of measure mE < δ has a value less than ε. We exclude the type of difficulty shown in our example by requiring that for each ε > 0 there shall be a δ > 0 which serves uniformly for all n. That is, in the definition of absolute continuity of ∫_E f_n(x) dx, we ask that a single δ(ε) shall serve uniformly for all n:

29.4s. Let the functions f_1(x), f_2(x), ··· be all defined and all summable over a set E*. The integrals

$$F_n(E) = \int_E f_n(x)\,dx,$$

regarded as functions of measurable subsets E of E*, are uniformly absolutely continuous if to each ε > 0 there corresponds a δ > 0 such that for every measurable subset E of E* with mE < δ the inequality

$$|F_n(E)| \equiv \left| \int_E f_n(x)\,dx \right| < \varepsilon$$

holds for all n.
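To see what definition 29.4 rules out, consider a family that is not taken from the text but is assumed here purely for illustration: f_n(x) = n on [0, 1/n] and 0 elsewhere on E* = [0, 1]. Its integral over the initial interval [0, h] is min(nh, 1). The sketch below produces, for any given δ, a set E = [0, 1/n] with mE < δ on which the integral of f_n is still 1, so no single δ serves for all n.

```python
from fractions import Fraction


def integral_on_initial_interval(n, h):
    """Integral over [0, h] (0 < h <= 1) of f_n, where f_n = n on [0, 1/n]
    and 0 elsewhere. This family is a hypothetical illustration."""
    return min(n * h, Fraction(1))


def witness(delta):
    """Return (n, h, integral) with h = 1/n < delta and integral equal to 1,
    showing that delta fails in definition 29.4 for every epsilon <= 1."""
    n = int(1 / delta) + 1          # guarantees 1/n < delta
    h = Fraction(1, n)
    return n, h, integral_on_initial_interval(n, h)


for delta in (Fraction(1, 10), Fraction(1, 1000)):
    print(delta, witness(delta))
```

Each individual f_n is still absolutely continuous in the sense of 27.1; only the uniformity in n fails.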

A direct consequence of the definition is

29.5s. Let the functions f_1(x), f_2(x), ··· be defined and summable over a set E*. The set-functions F_n(E) defined as in 29.4 are uniformly absolutely continuous if and only if to every ε > 0 there corresponds a δ > 0 such that

$$\int_E |f_n(x)|\,dx < \varepsilon \qquad (n = 1, 2, \cdots)$$

for every measurable subset E of E* with mE < δ.

If the condition above is satisfied, the functions F_n(E) are uniformly absolutely continuous. For if E is any measurable subset of E* with mE < δ, then

$$|F_n(E)| = \left| \int_E f_n(x)\,dx \right| \le \int_E |f_n(x)|\,dx < \varepsilon.$$

Conversely, if the F_n(E) are uniformly absolutely continuous and ε is a positive number, there is a δ > 0 such that |F_n(E)| < ε/2 for every measurable subset E of E* with mE < δ. Let E be such a set.


It can be divided into the subset E_{n,1} on which f_n(x) ≥ 0 and the subset E_{n,2} on which f_n(x) < 0. These subsets are measurable, by 21.2 and 21.3. Each has measure less than δ, being contained in the set E whose measure is less than δ. So

$$\int_E |f_n(x)|\,dx = \int_{E_{n,1}} f_n(x)\,dx - \int_{E_{n,2}} f_n(x)\,dx = |F_n(E_{n,1})| + |F_n(E_{n,2})| < \varepsilon.$$

This is the essential hypothesis in

29.6s. Let the functions f_1(x), f_2(x), ··· be defined, finite valued and summable on a set E* of finite measure. If
(a) on E* the functions f_n(x) converge in measure (or, more particularly, almost everywhere or almost uniformly) to a function f(x);
(b) the set-functions F_n(E) = ∫_E f_n(x) dx are uniformly absolutely continuous on the class of all measurable subsets of E*; and either
(c) the integrals ∫_{E*} |f_n(x)| dx are bounded, or
(c') the function f(x) is finite for almost all x in E*;
then f(x) is summable over E*, and

$$\text{(d)}\quad \lim_{n\to\infty} \int_{E^*} f_n(x)\,dx = \int_{E^*} f(x)\,dx, \qquad \text{(e)}\quad \lim_{n\to\infty} \int_{E^*} |f_n(x) - f(x)|\,dx = 0.$$

As was remarked after 29.3, it is only necessary to prove (e), since (d) is a consequence of (e).

We first assume hypotheses (a), (b) and (c). Since the functions |f_n(x)| converge in measure to |f(x)|, by 29.2 and hypothesis (c) the function |f(x)| is summable. But by 28.8 f(x) is measurable; so by 22.3 f(x) is summable. Let ε be an arbitrary positive number. Since f(x) is summable, by 27.1 there is a δ_1 > 0 such that

$$\text{(A)}\quad \int_E |f(x)|\,dx < \varepsilon/3$$

if E is a measurable subset of E* with mE < δ_1. By 29.5 and hypothesis (b), there is a δ_2 > 0 such that

$$\text{(B)}\quad \int_E |f_n(x)|\,dx < \varepsilon/3$$


for all n if E is a measurable subset of E* with mE < δ_2. Let δ be the smallest of the numbers δ_1, δ_2 and ε/(3mE*). By the definition of convergence in measure, there is an n_0 such that for all n > n_0 the set of all x for which f_n(x) is not in N_δ(f(x)) can be enclosed in a set E_{n,δ} with measure mE_{n,δ} < δ. Since f(x) is summable, it is almost everywhere finite, and we can include in E_{n,δ} the points x for which |f(x)| = ∞ without increasing mE_{n,δ}. Now we write

$$\int_{E^*} |f_n(x) - f(x)|\,dx \le \int_{E_{n,\delta}} |f_n(x)|\,dx + \int_{E_{n,\delta}} |f(x)|\,dx + \int_{E^* - E_{n,\delta}} |f_n(x) - f(x)|\,dx.$$

If n > n_0, then mE_{n,δ} < δ. So the first integral on the right is less than ε/3 by (B), and the second is less than ε/3 by (A). On E* − E_{n,δ} the functional value f_n(x) is in N_δ(f(x)), and f(x) is finite; therefore |f_n(x) − f(x)| < δ. So the integrand in the third integral on the right is less than δ, and the measure of E* − E_{n,δ} is at most equal to mE*; therefore the third integral on the right is at most δ·mE* ≤ ε/3. Adding, we find that the integral on the left is less than ε for all n > n_0, which establishes conclusion (e). This completes the proof of our theorem under hypotheses (a), (b) and (c).

Suppose now that hypotheses (a), (b) and (c') are satisfied. Let δ be a number such that

$$\int_E |f_n(x)|\,dx < 1 \qquad (n = 1, 2, \cdots)$$

if E is a measurable subset of E* with mE < δ; such a δ exists by hypothesis (b) and 29.5. Since the f_n(x) converge in measure to f(x), there is an integer p such that if n ≥ p, the set of points x in E* for which f_n(x) is not in N_1(f(x)) can be enclosed in a set E_n with mE_n < δ/2. By hypothesis the set on which |f(x)| = ∞ has measure 0, so we can include it in E_n without increasing mE_n. Then on E* − E_n the function f(x) is finite and f_n(x) is in N_1(f(x)); that is, |f_n(x) − f(x)| < 1 for x in E* − E_n. Now if n ≥ p

$$\int_{E^*} |f_n(x) - f_p(x)|\,dx \le \int_{E^* - (E_n \cup E_p)} |f_n(x) - f_p(x)|\,dx + \int_{E_n \cup E_p} |f_n(x)|\,dx + \int_{E_n \cup E_p} |f_p(x)|\,dx.$$

Since m(E_n ∪ E_p) < δ, the last two integrals are each less than 1. On E* − (E_n ∪ E_p) both |f_n(x) − f(x)| and |f_p(x) − f(x)| are less


than 1, so |f_n(x) − f_p(x)| < 2. So the first integral on the right is at most 2mE*. We therefore have for n ≥ p

$$\int_{E^*} |f_n(x)|\,dx \le \int_{E^*} |f_p(x)|\,dx + \int_{E^*} |f_n(x) - f_p(x)|\,dx < \int_{E^*} |f_p(x)|\,dx + 2mE^* + 2.$$

Consequently hypothesis (c) is satisfied, the integrals of the |f_n(x)| being not greater than the largest of the numbers

$$\int_{E^*} |f_1(x)|\,dx,\ \cdots,\ \int_{E^*} |f_{p-1}(x)|\,dx,\ \int_{E^*} |f_p(x)|\,dx + 2mE^* + 2.$$

Therefore if the hypotheses (a), (b) and (c') are satisfied, so are (a), (b) and (c); and the proof of our theorem is complete.

EXERCISE. Let f(x) be defined on a set E of finite measure, and let f_1(x), f_2(x), ··· be summable over E. In order that the conclusions of 29.6 shall hold it is necessary that hypotheses (a), (b), (c) and (c') be satisfied.

Neither of the theorems 29.3 and 29.6 contains the other as a special case. For in 29.3 the set E* could be of infinite measure, while we can show by a simple example that the finiteness of the measure of E* is essential in 29.6. Let f(x) = 0, and let f_n(x) = 1/n if n ≤ x ≤ 2n, f_n(x) = 0 otherwise. Then f_n(x) converges uniformly (hence almost uniformly, in measure and everywhere) to f(x). But for the integrals over the whole space R_1 we have

$$\int_{R_1} f_n(x)\,dx = 1, \qquad \int_{R_1} f(x)\,dx = 0.$$

On the other hand, the hypotheses of 29.6 can be satisfied without the existence of the summable function g(x) such that |f_n(x)| ≤ g(x). For example, let E* = [0, 1], and let f(x) = 0. For each positive integer n let f_n(x) = x^{-1} for (n + 1)^{-1} ≤ x ≤ n^{-1}, and let f_n(x) = 0 elsewhere. Then

$$\int_{E^*} f_n(x)\,dx = \log \frac{n+1}{n} \to 0,$$

and the hypotheses of 29.6 are satisfied. But the intervals [(n + 1)^{-1}, n^{-1}] cover (0, 1], and any function greater than all the f_n(x) would have to exceed x^{-1} on (0, 1], and could not be summable.

However, it is not difficult now to establish a theorem which is more general than either 29.3 or 29.6.
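The two examples just given can be checked by direct computation. The sketch below uses hypothetical function names introduced here. For f_n = 1/n on [n, 2n] the integral is the height times the length of the interval. For f_n = x^{-1} on [(n + 1)^{-1}, n^{-1}] the integral is log((n + 1)/n), and the sum of the first N of these integrals is a lower bound for the integral of any function dominating f_1, ..., f_N.

```python
import math
from fractions import Fraction


def integral_a(n):
    """Integral of f_n = 1/n on [n, 2n], zero elsewhere: height times length."""
    return Fraction(1, n) * (2 * n - n)


def integral_b(n):
    """Integral of f_n = 1/x on [1/(n+1), 1/n], zero elsewhere: log((n+1)/n)."""
    return math.log1p(1.0 / n)


def dominating_lower_bound(N):
    """Sum of the first N integrals of the second example; any g >= f_1, ..., f_N
    on (0, 1] has an integral at least this large (it equals log(N + 1))."""
    return sum(integral_b(n) for n in range(1, N + 1))


print([integral_a(n) for n in (1, 10, 100)])
print(integral_b(1000), dominating_lower_bound(1000))
```

The first list stays at 1 although sup f_n = 1/n tends to 0; in the second example the individual integrals tend to 0 while the lower bound log(N + 1) for a dominating function grows without limit.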


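29.7s. Let the functions f_1(x), f_2(x), ··· be defined, finite-valued and summable over a set E*. If
(a) on E* the functions f_n(x) converge in measure, or almost everywhere, or almost uniformly, to a function f(x);
(b) for every ε > 0 there is a set E_ε of finite measure contained in E* and such that
(i) the set-functions F_n(E) = ∫_E f_n(x) dx are uniformly absolutely continuous on the class of all measurable subsets of E_ε;
(ii) ∫_{E* − E_ε} |f_n(x)| dx < ε for all n;
(c) the integrals ∫_{E*} |f_n(x)| dx are bounded; or
(c') f(x) is finite for almost all x;
then f(x) is summable over E*, and

$$\text{(d)}\quad \lim_{n\to\infty} \int_{E^*} f_n(x)\,dx = \int_{E^*} f(x)\,dx, \qquad \text{(e)}\quad \lim_{n\to\infty} \int_{E^*} |f_n(x) - f(x)|\,dx = 0.$$

We first assume that (a), (b) and (c) hold. As in the proof of 29.6, f(x) is summable over E*. Let ε be any positive number, and let E_ε be the set described in (b). On E_ε, the hypotheses of 29.6 are satisfied, so

$$\lim_{n\to\infty} \int_{E_\varepsilon} |f_n(x) - f(x)|\,dx = 0.$$

Since |f_n(x)| tends in measure, or almost everywhere, or almost uniformly to |f(x)| on E* − E_ε, by 29.2 we have

$$\int_{E^* - E_\varepsilon} |f(x)|\,dx \le \liminf_{n\to\infty} \int_{E^* - E_\varepsilon} |f_n(x)|\,dx \le \varepsilon.$$

Therefore, using 6.5, 6.12 and 6.14,

$$0 \le \liminf_{n\to\infty} \int_{E^*} |f_n(x) - f(x)|\,dx \le \limsup_{n\to\infty} \int_{E^*} |f_n(x) - f(x)|\,dx \le 2\varepsilon.$$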


This holds for all ε > 0, so the upper and lower limits of the integrals of the |f_n(x) − f(x)| are both 0. So conclusion (e) is established. As before, (d) follows from (e).

If hypotheses (a), (b) and (c') hold, take ε = 1. By hypothesis (b) there is a set E_1 of finite measure such that

$$\int_{E^* - E_1} |f_n(x)|\,dx < 1$$

for all n. But since E_1 is of finite measure, we can show exactly as in the last part of the proof of 29.6 that the integrals

$$\int_{E_1} |f_n(x)|\,dx$$

are bounded. Hence, adding, the integrals of the |f_n(x)| over E* are bounded, and hypothesis (c) is satisfied.

It is clear that 29.7 is more general than 29.6; for if the hypotheses of 29.6 hold we can take E_ε = E* for all ε > 0, and the hypotheses of 29.7 are satisfied. Furthermore, 29.7 includes 29.3. For suppose that the hypotheses of 29.3 hold. Then hypotheses (a) and (c) of 29.7 clearly are satisfied. Since g(x) is summable over E*, for every ε > 0 there is an interval W_n: −n ≤ x^{(i)} ≤ n such that

$$\int_{E^* W_n} g(x)\,dx > \int_{E^*} g(x)\,dx - \varepsilon,$$

that is,

$$\int_{E^* - E^* W_n} g(x)\,dx < \varepsilon.$$

Since |f_n(x)| ≤ g(x) for all n, part (ii) of hypothesis (b) is satisfied if we take E_ε = E*W_n. By 27.1, for every number γ > 0 there is a δ > 0 such that

$$\int_E g(x)\,dx < \gamma$$

if E is a measurable subset of E_ε with mE < δ. Hence for all such E

$$\int_E |f_n(x)|\,dx \le \int_E g(x)\,dx < \gamma,$$

and part (i) of (b) is also satisfied.

If we are willing to drop the "s" from theorem 29.6 we can omit hypotheses (c) and (c') completely:

29.8. Let the functions f_1(x), f_2(x), ··· be defined and summable on a set E* of finite measure. If


(a) on E* the functions f_n(x) converge in measure (or almost everywhere, or almost uniformly) to a function f(x); and
(b) the set-functions F_n(E) = ∫_E f_n(x) dx are uniformly absolutely continuous on the class of all measurable subsets of E*;
then the conclusions of theorem 29.6 hold.

By theorem 29.5, there is a δ > 0 such that ∫_E |f_n(x)| dx < 1 for all measurable subsets E of E* with mE < δ. For some interval W_p we have m(W_p E*) > mE* − δ; therefore m(E* − W_p) < δ. Let t be a positive number less than δ^{1/q}; then every interval I whose sides are all equal to t has measure less than δ. We can cover the interval W_p with a finite number of intervals I_1, ···, I_h of this type. Then E* is contained in (E* − W_p) ∪ E*I_1 ∪ ··· ∪ E*I_h, and each of these sets has measure less than δ. So

$$\int_{E^*} |f_n(x)|\,dx \le \int_{E^* - W_p} |f_n(x)|\,dx + \sum_{j=1}^{h} \int_{E^* I_j} |f_n(x)|\,dx < h + 1.$$

Therefore hypothesis (c) of 29.6 is satisfied. The other hypotheses of 29.6 have been assumed as hypotheses here also, so the conclusions of 29.6 hold.

Since hypotheses (b) and (c) are vital in theorem 29.6, it is interesting to have criteria which will guarantee their satisfaction. We have already seen (after 29.7) that these conditions are satisfied if |f_n(x)| ≤ g(x), where g(x) is summable. But this merely brings us back to a special case of theorem 29.3. A criterion of an essentially different nature is

29.9s. (Nagumo). Let $( po (by which we of course mean ρ(p_n, p_0) → 0) is equivalent to

(A) lim p_n(y) = p_0(y) as n → ∞, uniformly on Y.

For let ε be a positive number. If p_n → p_0, then for all n greater than a certain n_ε we have ρ(p_n, p_0) < ε, whence

$$|p_n(y) - p_0(y)| < \varepsilon \quad\text{for all } y \text{ in } Y.$$

This is the definition 6.16 of uniform convergence. Conversely, if (A) holds, there is an n_ε such that if n > n_ε then

$$|p_n(y) - p_0(y)| < \varepsilon/2 \quad\text{for all } y \text{ in } Y,$$

so that ρ(p_n, p_0) ≤ ε/2 < ε.

Again, let S be the class of all functions f(x) defined, finite and measurable on a measurable set E. For any two such functions f(x), g(x) let ρ(f, g) be the inf of all numbers a such that |f(x) − g(x)| < a except on a set of measure less than a. Properties (1) and (3) are obvious. For property (4), let f, g, h belong to S, and let ε be an arbitrary positive number. Then

$$|f(x) - g(x)| < \rho(f, g) + \varepsilon$$

except on a set E_1 of measure less than ρ(f, g) + ε, and

$$|g(x) - h(x)| < \rho(g, h) + \varepsilon$$

except on a set E_2 of measure less than ρ(g, h) + ε. So except on the set E_1 ∪ E_2, whose measure is less than ρ(f, g) + ρ(g, h) + 2ε, we have

$$|f(x) - h(x)| < \rho(f, g) + \rho(g, h) + 2\varepsilon.$$

That is,

$$\rho(f, h) \le \rho(f, g) + \rho(g, h) + 2\varepsilon.$$

Since ε is an arbitrary positive number, by 5.2 this implies property (4) of 30.1. But with property (2) it is different; for if f(x) and g(x) differ on a set of measure zero we have ρ(f, g) = 0 without having f = g. This difficulty can be removed in either of two ways. We can alter the concept of equality by regarding two functions as identical if they are equivalent. This however has certain disadvantages; for example, everywhere else the meaning of "=" has been identity, not a conventional relationship. The alternative is to lump together in a single class all functions equivalent to each other and use these equivalence-classes as the points of our space S. Properties (1), (3) and (4) are undisturbed, for if they hold for f, g, h they hold for all functions respectively equivalent to f, g, h. If two points (equivalence-classes) of S are coincident, then functions f, g representing these points are equivalent, and ρ(f, g) = 0; that is, the distance from a point to itself is 0. Conversely, if ρ(f, g) = 0, let n be any positive integer. If a < 1/n, the set E[|f − g| ≥ 1/n] is contained in the set E[|f − g| ≥ a], whose measure is less than a. This holds for all a less than 1/n, so

$$mE[\,|f - g| \ge 1/n\,] = 0.$$

Adding these sets for n = 1, 2, ···, we find by 19.3

$$mE[\,|f - g| > 0\,] = 0,$$

so f(x) and g(x) are equivalent. That is, if ρ(f, g) = 0 the functions f(x) and g(x) represent the same point of S.

Once this is understood, there is little danger of misunderstanding if we speak of functions f(x) as belonging to S, instead of using the longer and more accurate statement that f(x) is a member of an equivalence class which in turn is a point of S.

With the distance defined above, convergence of f_n to f_0 in S is equivalent to convergence of f_n(x) to f_0(x) in measure on E. For let ρ(f_n, f_0) tend to zero. If ε and δ are positive numbers, there is an n_0 such that ρ(f_n, f_0) < inf {ε, δ} when n exceeds n_0. By definition of ρ, this implies that |f_n(x) − f_0(x)| < ε except on a set of measure less


than , and we readily verify that x_n → x_0. So R_q is complete.

If Y is an arbitrary set, the space of bounded functions on Y with the metric ρ(f, g) = sup |f(y) − g(y)| is complete; this is merely a re-wording of 6.17.

Our next two proofs both use a device which we state in the next lemma.

30.4s. If {p_n} is a regular sequence in a metric space D, and a subsequence {p_α} (α = n_1, n_2, ···) converges to a limit p_0, then the whole sequence converges to p_0.

Let ε be a positive number. For all m and n greater than a certain n_ε we have

$$\rho(p_m, p_n) < \varepsilon/2,$$

since the sequence is regular. For all α greater than a certain α_0 we have

$$\rho(p_\alpha, p_0) < \varepsilon/2,$$

since p_α → p_0. Choose an α = n_i larger than the greater of n_ε and α_0. The two inequalities above both hold with m = α, and by (4) of 30.1

$$\rho(p_n, p_0) \le \rho(p_n, p_\alpha) + \rho(p_\alpha, p_0) < \varepsilon$$

for all n greater than n_ε. Therefore p_n → p_0.

We now proceed to show the completeness of the other spaces mentioned above.

30.5s. Let E be a measurable set. The space S of functions finite and measurable on E, with the metric defined above, is complete.

Let {f_n} be a regular sequence. We define a sequence of integers as follows. The integer n_1 is the least one such that ρ(f_m, f_n) < 2^{-1} whenever m and n are at least equal to n_1. The integer n_2 is the least integer greater than n_1 such that ρ(f_m, f_n) < 2^{-2} whenever m and n are at least equal to n_2; and so on. For compactness we write g_i(x) = f_{n_i}(x); then by the choice of the n_i we have

$$\text{(A)}\quad \rho(g_i, g_j) < 2^{-i} \quad\text{if } j \ge i.$$

We need only show that the g_i form a convergent sequence, since by 30.4 the whole sequence will then converge.


Let E_i be the subset of E on which |g_i(x) − g_{i+1}(x)| ≥ 2^{-i}. By (A) and the definition of distance, this set has measure less than 2^{-i}. Define

$$M_i = E_i \cup E_{i+1} \cup E_{i+2} \cup \cdots.$$

Then by 19.8 we find that

$$\text{(B)}\quad mM_i \le \sum_{j \ge i} mE_j < 2^{-i+1}.$$

On E − M_i all the inequalities

$$|g_j(x) - g_{j+1}(x)| < 2^{-j} \qquad (j = i, i+1, \cdots)$$

are satisfied. So if j and k are both at least equal to i (we choose the notation so that j ≤ k) we find

$$|g_j(x) - g_k(x)| < 2^{-j+1}.$$

That is, for each x in E − M_i, the numbers oo. Therefore the left member is 0, and the first set of equations (E) is satisfied. Repeating the argument with sin nx in place of cos nx proves that the second set of equations (E) is also satisfied, and the theorem is established.

REMARK. In proving the statement below it is convenient to observe that the Hölder inequality (conclusion of 24.4) can be written in the form

EXERCISE. Let E be a measurable set, and let p, q be numbers greater than 1 such that 1/p + 1/q = 1. If {f_n} is a sequence of functions converging to f in L_p(E), and {g_n} is a sequence of functions converging to g in L_q(E), then

$$\lim_{n\to\infty} \int_E f_n(x) g_n(x)\,dx = \int_E f(x) g(x)\,dx.$$

(By the Hölder inequality

The factor ρ_q(g_n, 0) is less than ρ_q(g, 0) + 1 if n is large.)


EXERCISE. If {f_n} is a regular sequence in L_p(E) (p ≥ 1) it is a regular sequence in S; if it converges to f in L_p(E), it converges to f in S.

EXERCISE. Let E have finite measure, and let p ≥ p' ≥ 1. If {f_n} is a regular sequence on L_p(E), it is regular on L_{p'}(E); if it converges to f on L_p(E), it converges to f on L_{p'}(E).

CHAPTER V

Differentiation

31. So far we have dealt exclusively with integrals, making no mention of derivatives. But the interrelations between the processes of integration and differentiation are of fundamental importance in analysis. In this chapter we investigate these relations, restricting ourselves for the sake of simplicity to functions of a single real variable.

We shall of course define the derivative f'(x_0) as the limit of the difference-quotient [f(x) − f(x_0)]/(x − x_0) as x approaches x_0; but since this limit may fail to exist, it is desirable to have related expressions which may serve us where there is no derivative. These expressions are obtained by using the upper and lower one-sided limits (see 6.10) of the difference-quotient, and are called the "Dini derivates." Their definitions are as follows.

31.1. Let f(x) be defined and finite on an interval [α, β], and let x_0 be a point in [α, β]. Then
(a) the upper derivate of f(x) at x_0 is

$$\overline{D}f(x_0) = \limsup_{x \to x_0} \frac{f(x) - f(x_0)}{x - x_0};$$

(b) the lower derivate of f(x) at x_0 is

$$\underline{D}f(x_0) = \liminf_{x \to x_0} \frac{f(x) - f(x_0)}{x - x_0}.$$

If x_0 < β, then
(c) the upper right derivate of f(x) at x_0 is

$$D^+f(x_0) = \limsup_{x \to x_0+} \frac{f(x) - f(x_0)}{x - x_0};$$

(d) the lower right derivate of f(x) at x_0 is

$$D_+f(x_0) = \liminf_{x \to x_0+} \frac{f(x) - f(x_0)}{x - x_0}.$$

If α < x_0, then
(e) the upper left derivate of f(x) at x_0 is

$$D^-f(x_0) = \limsup_{x \to x_0-} \frac{f(x) - f(x_0)}{x - x_0};$$

(f) the lower left derivate of f(x) at x_0 is

$$D_-f(x_0) = \liminf_{x \to x_0-} \frac{f(x) - f(x_0)}{x - x_0}.$$

EXERCISE. If $f(x)$ is finite in $[a, b]$ and $x_0$ is in $[a, b]$, then $\overline{D}f(x_0)$ is the greatest number which is the limit of a sequence of difference-quotients $[f(\beta_n) - f(\alpha_n)]/(\beta_n - \alpha_n)$, where $\alpha_n$ and $\beta_n$ tend to $x_0$ subject to the conditions $\alpha_n < \beta_n$, $\alpha_n \le x_0 \le \beta_n$. An analogous statement holds for the lower derivate.

31.2. Let $f(x)$ be defined and finite on the interval $[\alpha, \beta]$, and let $x_0$ be a point in $[\alpha, \beta]$. Then
(a) if $x_0 < \beta$, and $D^+f(x_0) = D_+f(x_0)$, we call their common value the right derivative of $f(x)$ at $x_0$, and denote it by $f'(x_0+)$;
(b) if $\alpha < x_0$, and $D^-f(x_0) = D_-f(x_0)$, we call their common value the left derivative of $f(x)$ at $x_0$, and denote it by $f'(x_0-)$;
(c) if $\overline{D}f(x_0) = \underline{D}f(x_0)$, we call their common value the derivative of $f(x)$ at $x_0$, and denote it by $Df(x_0)$, or $f'(x_0)$, or $\left.\dfrac{df}{dx}\right|_{x = x_0}$.
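As a worked illustration of these definitions (an added example, not taken from the original text), consider two functions on $[-1, 1]$ at the point $x_0 = 0$. The symbols $\overline{D}$, $\underline{D}$, $D^+$, $D_+$, $D^-$, $D_-$ are assumed to denote the upper, lower, upper right, lower right, upper left and lower left derivates respectively; the function names $f$ and $h$ are chosen only for this sketch.

```latex
% Example 1: f(x) = |x| at x_0 = 0.
% The difference-quotient |x|/x equals +1 for x > 0 and -1 for x < 0.
\[
D^+f(0) = D_+f(0) = 1, \qquad D^-f(0) = D_-f(0) = -1,
\]
\[
f'(0+) = 1, \qquad f'(0-) = -1, \qquad
\overline{D}f(0) = 1 \neq -1 = \underline{D}f(0).
\]
% Both one-sided derivatives exist, but f has no derivative at 0.

% Example 2: h(x) = x \sin(1/x) for x \neq 0, h(0) = 0, at x_0 = 0.
% The difference-quotient is \sin(1/x), which takes the values +1 and -1
% arbitrarily close to 0 on either side.
\[
D^+h(0) = D^-h(0) = \overline{D}h(0) = 1, \qquad
D_+h(0) = D_-h(0) = \underline{D}h(0) = -1.
\]
% Here not even a one-sided derivative exists, yet all six derivates are finite.
```

In both examples the upper derivate is the larger of the two upper one-sided derivates and the lower derivate is the smaller of the two lower one-sided derivates.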

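In the rest of this section we shall investigate the simpler properties of these derivates. From the definition it is obvious that
$$\overline{D}f(\alpha) = D^+f(\alpha) \quad\text{and}\quad \underline{D}f(\alpha) = D_+f(\alpha).$$
For at $x = \alpha$ it makes no difference whether we write $x \to \alpha$ or $x \to \alpha+$ under the symbol lim sup or lim inf; the condition $x \ge \alpha$ is necessarily satisfied in either case. Likewise
$$\overline{D}f(\beta) = D^-f(\beta) \quad\text{and}\quad \underline{D}f(\beta) = D_-f(\beta).$$
Slightly less trivial is

31.3. If $\alpha < x_0 < \beta$, then $\overline{D}f(x_0) = \sup\{D^+f(x_0), D^-f(x_0)\}$ and $\underline{D}f(x_0) = \inf\{D_+f(x_0), D_-f(x_0)\}$.

By 6.4, there is a sequence of numbers $x_n > x_0$ tending to $x_0$ such that
$$\lim_{n \to \infty} \frac{f(x_n) - f(x_0)}{x_n - x_0} = D^+f(x_0).$$

But by the second part of 6.4, this proves that
$$\overline{D}f(x_0) \ge D^+f(x_0).$$

Likewise, $\overline{D}f(x_0) \ge D^-f(x_0)$, so we have

(A) $$\overline{D}f(x_0) \ge \sup\{D^+f(x_0), D^-f(x_0)\}.$$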


On the other hand, by 6.4 there is a sequence $x_n$ of numbers different from $x_0$ and tending to $x_0$ for which
$$\lim_{n \to \infty} \frac{f(x_n) - f(x_0)}{x_n - x_0} = \overline{D}f(x_0).$$

Either there are infinitely many $x_n > x_0$, or there are infinitely many $x_n < x_0$. In the first case, we select the $x_n > x_0$ and denote them by $x'_n$. Then $x'_n > x_0$, and
$$\lim_{n \to \infty} \frac{f(x'_n) - f(x_0)}{x'_n - x_0} = \overline{D}f(x_0).$$

By the second part of 6.4, this yields $D^+f(x_0) \ge \overline{D}f(x_0)$; hence

(B) $$\sup\{D^+f(x_0), D^-f(x_0)\} \ge \overline{D}f(x_0).$$

In the second case, we prove similarly that $D^-f(x_0) \ge \overline{D}f(x_0)$, so that (B) holds. From (A) and (B) we obtain the first part of our conclusion. The second part is established similarly, or can be obtained from the first after theorem 31.7 is proved.

The derivates being upper and lower limits, all the theorems of §6 are immediately available for use. If in any of the theorems of §6 we replace the pair of symbols lim sup, lim inf respectively by $\overline{D}$, $\underline{D}$, or by $D^+$, $D_+$, or by $D^-$, $D_-$, we obtain a theorem on derivates. Particularly useful are the consequences of 6.12 and 6.13:

31.4. Let $f_1(x)$ and $f_2(x)$ be defined and finite on the interval $[\alpha, \beta]$. Then for each $x$ in $[\alpha, \beta]$ the inequalities

(α) (β) (γ) (δ)

(ε) (ζ) (η) (θ)

hold, provided that the additions are possible. Moreover, if $x < \beta$, we may replace $\overline{D}$, $\underline{D}$ by $D^+$, $D_+$ throughout; if $\alpha < x$, we may replace $\overline{D}$, $\underline{D}$ by $D^-$, $D_-$ throughout. Each of these twenty-four relationships follows at once from the corresponding part of 6.13.
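A short worked example may help to show that inequalities of this kind can be strict. The chain written below is the standard one for upper and lower limits of a sum; it is offered as an illustration under that assumption and is not claimed to reproduce the list of inequalities in 31.4 exactly. The functions are chosen only for this sketch: $f_1(x) = x \sin(1/x)$ for $x \neq 0$, $f_1(0) = 0$, and $f_2(x) = -f_1(x)$, considered at the point $x = 0$.

```latex
% Standard chain for upper and lower limits of a sum, written for derivates
% (valid whenever the additions are possible).
\[
\underline{D}f_1 + \underline{D}f_2
\;\le\; \underline{D}(f_1 + f_2)
\;\le\; \underline{D}f_1 + \overline{D}f_2
\;\le\; \overline{D}(f_1 + f_2)
\;\le\; \overline{D}f_1 + \overline{D}f_2 .
\]
% At x = 0 the difference-quotient of f_1 is \sin(1/x), that of f_2 is
% -\sin(1/x), and that of f_1 + f_2 is identically 0.
\[
\overline{D}f_1(0) = \overline{D}f_2(0) = 1, \qquad
\underline{D}f_1(0) = \underline{D}f_2(0) = -1, \qquad
\overline{D}(f_1 + f_2)(0) = \underline{D}(f_1 + f_2)(0) = 0 .
\]
% Substituting these values into the chain:
\[
-2 \;\le\; 0 \;\le\; 0 \;\le\; 0 \;\le\; 2 .
\]
```

The two outer inequalities are strict here, so the derivate of a sum is in general not the sum of the derivates.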


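The next theorem is an immediate corollary of 31.4.

31.5. If $f(x)$ and $g(x)$ are defined and finite on the interval $[\alpha, \beta]$, and $f(x) - g(x)$ is monotonic increasing, then
$$\overline{D}f(x) \ge \overline{D}g(x) \quad\text{and}\quad \underline{D}f(x) \ge \underline{D}g(x)$$
for all $x$ in $[\alpha, \beta]$. If $\alpha < x \le \beta$ we can replace $\overline{D}$, $\underline{D}$ by $D^-$, $D_-$ respectively in these inequalities, and if $\alpha \le x < \beta$ we can replace $\overline{D}$, $\underline{D}$ by $D^+$, $D_+$ respectively.

Since $f(x) - g(x)$ is monotonic increasing, its difference-quotient is non-negative, so its lower derivate is non-negative. By 31.4γ,
$$\overline{D}f(x) = \overline{D}\bigl([f(x) - g(x)] + g(x)\bigr) \ge \underline{D}\bigl(f(x) - g(x)\bigr) + \overline{D}g(x) \ge \overline{D}g(x).$$

The second inequality follows by a similar argument, with use of 31.4β. By 31.4, the statements about $D^+$, etc., are provable in exactly the same way.

31.6. If $f(x)$ and $g(x)$ are defined and finite on the interval $[\alpha, \beta]$, then:
(a) if $g(x)$ has a right derivative at $x$,
$$D^+\bigl(f(x) + g(x)\bigr) = D^+f(x) + g'(x+), \qquad D_+\bigl(f(x) + g(x)\bigr) = D_+f(x) + g'(x+);$$

(b) if $g(x)$ has a left derivative at $x$,
$$D^-\bigl(f(x) + g(x)\bigr) = D^-f(x) + g'(x-), \qquad D_-\bigl(f(x) + g(x)\bigr) = D_-f(x) + g'(x-);$$

(c) if $g(x)$ has a derivative at $x$, all four of the preceding equations hold, where $g'(x+) = g'(x-) = g'(x)$.

All these conclusions follow from 6.14.

Given a function $f(x)$ defined on $[\alpha, \beta]$, the function $f(-x)$ is defined on $[-\beta, -\alpha]$; that is, if we write $\xi = -x$, $g(\xi) = f(x)$, then $g(\xi)$ is defined for $-\beta \le \xi \le -\alpha$. We then have

31.7. Let $f(x)$ be defined and finite on $[\alpha, \beta]$. Between the derivates of $f(x)$, $-f(x)$, $f(-x)$ and $-f(-x)$ the following relationships hold:
(a) $D^+(-f(x)) = -D_+f(x)$, $\quad D_+(-f(x)) = -D^+f(x)$;
(b) $D^+(f(-x)) = -D_-f(x)$, $\quad D_+(f(-x)) = -D^-f(x)$;
(c) $D^+(-f(-x)) = D^-f(x)$, $\quad D_+(-f(-x)) = D_-f(x)$.
In (a), (b), (c) we may everywhere interchange the affixes + and − on the letter $D$.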


Also, (d) (e) Here we understand, for example, that $D^+(f(-x))$ denotes the upper right derivate of $g(\xi) = f(-\xi)$ with respect to $\xi$ at the place $\xi = -x$.

To prove (a), we have by 6.3
$$D^+(-f(x_0)) = \limsup_{x \to x_0+} \frac{-f(x) + f(x_0)}{x - x_0} = -\liminf_{x \to x_0+} \frac{f(x) - f(x_0)}{x - x_0} = -D_+f(x_0).$$

This gives the first part of (a). Replacing $f$ by $-f$ gives the second. The analogues for the upper left and lower left derivates are similarly obtained. For (b), we use $g(\xi) = f(x)$, where $x = -\xi$, and note that $\xi > \xi_0$ if and only if $x < x_0 = -\xi_0$. Then
$$D^+g(\xi_0) = \limsup_{\xi \to \xi_0+} \frac{g(\xi) - g(\xi_0)}{\xi - \xi_0} = \limsup_{x \to x_0-} \left(-\frac{f(x) - f(x_0)}{x - x_0}\right) = -D_-f(x_0).$$

The second part of (b) can be similarly established; or it can be derived from the first and (a) by replacing $f$ by $-f$. If in (b) we replace $x$ by $-x$ we obtain the formulas with the affixes + and − interchanged. Formulas (c) follow from (a) and (b), for
$$D^+(-f(-x)) = -D_+(f(-x)) = D^-f(x).$$

The interchange of $f(x)$ and $-f(-x)$ gives the formulas with the affixes + and − interchanged. Formulas (d) and (e) follow readily from (a), (b) and (c). For example, by (a) and (5.5)

It is not our intention to make a detailed study of the differential calculus. But there are a few simple theorems which we shall need in the study of integrals, and these we shall now establish.


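31.8. Let $f(x)$ be defined and finite on $[a, b]$, and let $x_0$ be a point of $[a, b]$. If $\overline{D}f(x_0)$ and $\underline{D}f(x_0)$ are both finite, then $f(x)$ is continuous at $x_0$.

Let $M - 1$ be the greater of $|\overline{D}f(x_0)|$ and $|\underline{D}f(x_0)|$. By the definition of these derivates, there is a $\delta > 0$ such that
$$\left|\frac{f(x) - f(x_0)}{x - x_0}\right| < M$$
if $x \ne x_0$ is in $[a, b]$ and in $N_\delta(x_0)$. Therefore
$$|f(x) - f(x_0)| \le M\,|x - x_0|$$
for such $x$. If $\varepsilon$ is positive, and $\gamma$ is the smaller of $\delta$ and $\varepsilon/M$, then
$$|f(x) - f(x_0)| < \varepsilon$$
if $x$ is in $[a, b]$ and $|x - x_0| < \gamma$. This completes the proof.

31.9. Let $f(x)$ and $g(x)$ be defined and finite on $[a, b]$, and let $x_0$ be a point of $[a, b]$. If the derivatives $f'(x_0)$ and $g'(x_0)$ both exist and are finite, then the derivative $\dfrac{d}{dx}\bigl(f(x)g(x)\bigr)$ also exists at $x_0$, and
$$\left.\frac{d}{dx}\bigl(f(x)g(x)\bigr)\right|_{x = x_0} = f'(x_0)\,g(x_0) + f(x_0)\,g'(x_0).$$

By 31.8, both $f(x)$ and $g(x)$ are continuous at $x = x_0$. We now write
$$\frac{f(x)g(x) - f(x_0)g(x_0)}{x - x_0} = \frac{f(x) - f(x_0)}{x - x_0}\,g(x) + f(x_0)\,\frac{g(x) - g(x_0)}{x - x_0}.$$

As $x \to x_0$, the factors in the first term approach $f'(x_0)$ and $g(x_0)$ respectively, those in the second term approach $f(x_0)$ and $g'(x_0)$ respectively. Hence
$$\lim_{x \to x_0} \frac{f(x)g(x) - f(x_0)g(x_0)}{x - x_0} = f'(x_0)\,g(x_0) + f(x_0)\,g'(x_0),$$
establishing the theorem.

31.10. Let $f(y)$ be defined and finite on the interval $\alpha \le y \le \beta$, and let $y_0$ be in $[\alpha, \beta]$. Let $g(x)$ be defined on the interval $a \le x \le b$ and have its values in the interval $[\alpha, \beta]$, and let $x_0$ be in $[a, b]$. If $g(x_0) = y_0$, and the derivatives $f'(y_0)$, $g'(x_0)$ exist and are finite, then the function $f(g(x))$ has a derivative at $x_0$, and
$$\left.\frac{d}{dx} f(g(x))\right|_{x = x_0} = f'(g(x_0))\,g'(x_0).$$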

For all $y \ne y_0$ in $[\alpha, \beta]$ we define
$$\mu(y) = \frac{f(y) - f(y_0)}{y - y_0} - f'(y_0),$$
and we set $\mu(y_0) = 0$. Then for all $y$ in $[\alpha, \beta]$

(A) $$f(y) - f(y_0) = (y - y_0)\,[f'(y_0) + \mu(y)].$$

By the definition of $f'(y_0)$, for every $\varepsilon > 0$ there is a $\gamma > 0$ such that $|\mu(y)| < \varepsilon$ if $|y - y_0| < \gamma$ and $y$ is in $[\alpha, \beta]$. But $g(x)$ is continuous at $x_0$, by 31.8. Hence there is a $\delta > 0$ such that $|g(x) - g(x_0)| < \gamma$ if $x$ is in $[a, b]$ and $|x - x_0| < \delta$. That is, if $x$ is in $N_\delta(x_0) \cap [a, b] - (x_0)$, then $\mu(g(x))$ is in $N_\varepsilon(0)$; so
$$\lim_{x \to x_0} \mu(g(x)) = 0.$$

Now by (A) we can write
$$\frac{f(g(x)) - f(g(x_0))}{x - x_0} = \bigl[f'(g(x_0)) + \mu(g(x))\bigr]\,\frac{g(x) - g(x_0)}{x - x_0}.$$

As $x$ tends to $x_0$, the factor $f'(g(x_0)) + \mu(g(x))$ tends to $f'(g(x_0))$, the other factor on the right tends to $g'(x_0)$. This establishes our theorem.

32. Now we investigate some properties of derivates which are not so purely formal as those in the preceding section.

32.1. Let $f(x)$ be defined and finite on the interval $[a, b]$. If $f(x)$ is monotonic increasing, or monotonic decreasing, or continuous, then $D^+f$, $D^-f$, $D_+f$, $D_-f$, $\overline{D}f$, $\underline{D}f$ are all measurable on $[a, b]$.

If we can prove that the first four of these are measurable, then by 21.7 the last two also are measurable, for by 31.3 we know that $\overline{D}f = \sup\{D^+f, D^-f\}$ and $\underline{D}f = \inf\{D_+f, D_-f\}$. If $f(x)$ satisfies our hypotheses, so does $-f(x)$; so by 31.7 and 21.6 it is enough to prove $D^+f(x)$ and $D^-f(x)$ measurable. We discuss only the first of these; the proof of the measurability of the upper left derivate is the same except for trivial alterations. If we define

where h ranges over all real numbers such that 0 < h < a and p over all rational numbers such that 0 < p < a, then by 31.1 and 6.9


we have $D^+f(x) = \lim$ [...] $D^+f = \lim \psi(x, a)$; in fact, that $\psi(x, a) =$ [...] it is obvious (5.3) that $\psi(x, a) \ge$ [...] (which we may assume less than $\varepsilon$) such that whenever $(a_1, b_1), (a_2, b_2), \ldots, (a_n, b_n)$ are non-overlapping subintervals of $[a, b]$ with total length less than $\gamma$, the inequality $\sum |$ [...] the first has a limit which is

So for all $h$ near zero it is less than $\varepsilon/2$. That is, if $h$ is in a neighborhood of zero

which completes the proof. Theorem 42.1 is a special case of the following theorem.

42.4s. Let $E$ be a measurable set. If $f(x)$ belongs to the class $L_p$ on $E$ ($p \ge 1$), that is, if $f(x)$ is measurable on $E$ and $|f(x)|^p$ is summable over $E$, then for every positive number $\varepsilon$ there is a function [...], remaining less than the summable function $|f(x)|^p$. By 29.3, its integral approaches zero, and we can choose an $n$ for which the integral is less than $(\varepsilon/2)^p$; that is,

(B)

On the interval $[0, 2n]$ the function $t^{p-1}$ is increasing, and has its greatest value $(2n)^{p-1}$ at $t = 2n$. That is,

(C)

if [...] for which

(D)

By 42.1, there is a function [...] (or $K_{HE}(x) > \tfrac{1}{2}$) consists of the set $H$ (or $HE$) itself, and this is non-measurable.

In §9 we constructed a set $P$ contained in the interval $[0, 1]$ and a function $f(x)$, defined, continuous and monotonic increasing on $[0, 1]$, such that
(i) the set $P$ is closed, and its complement is a set of open intervals $I_1, I_2, \ldots$ such that $\sum_j mI_j = 1$;
(ii) $f(x)$ is constant on each interval $I_j$;
(iii) $f(0) = 0$ and $f(1) = 1$.

Define now $g(x) = f(x) + x$. This function is continuous, and $g(0) = 0$, $g(1) = 2$. It is strictly increasing, for if $x_2 > x_1$ then $g(x_2) - g(x_1) = f(x_2) - f(x_1) + x_2 - x_1 \ge x_2 - x_1 > 0$. Hence it has a continuous inverse function $g^{-1}(t)$, $0 \le t \le 2$. Consider now the image of an interval $I_j = (\alpha_j, \beta_j)$ under the mapping $t = g(x)$. On $I_j$ the function $f(x)$ has a constant value $c_j$, by (ii). So on this interval $g(x) = c_j + x$, and the interval $(\alpha_j, \beta_j)$ is mapped on $(c_j + \alpha_j, c_j + \beta_j)$. That is, the image of each $I_j$ is an open interval $T_j$ of the same length as $I_j$. The intervals $T_j$ do not overlap, since $g(x)$ is monotonic. Hence
$$m\Bigl(\bigcup_j T_j\Bigr) = \sum_j mT_j = \sum_j mI_j = 1.$$

But the interval $[0, 2]$ has measure 2, and all points not in $\bigcup T_j$ are necessarily in the image $Q$ of the set $P$. This set is closed, being the complement of the open set $\bigcup T_j$; so it is measurable, and
$$mQ = 2 - 1 = 1.$$

With the set $H$ of the preceding example, we now define the function $f(t)$ to be the characteristic function of the set $HQ$. This we have already seen to be non-measurable. Define [...] $\le S(x)$ for all $x$ in $I$. If $I_j$ is one of the intervals on which $s(x)$ is a constant $c_j$, by hypothesis $c_j \Delta_m I_j$ is an underestimate for the moment of inertia of the part of the mass in $I_j$. Adding, the integral $\int_{I^*} s(x)\,dm(x)$ in the sense of 10.4 is an estimate from below for the moment of inertia. This holds for every step function $s(x)$ satisfying the inequality above, hence the moment of inertia $\mu$ is at least equal to the sup of such

integrals. But $(x^{(1)})^2 + (x^{(2)})^2$ is continuous, so this sup is its integral on $I^*$:

Using the step-functions $S(x)$, we find that their integrals in the sense of 10.4 are overestimates for $\mu$, so

Comparing this with the preceding inequality, we find that the moment of inertia is given by the integral

The extension to unbounded distributions is obvious.

Even in this example we recognize an advantage of the Lebesgue-Stieltjes integral. The single formula above gives the moment of inertia of the matter irrespective of the distribution. The mass may all be concentrated in a finite or denumerable set of points or may be continuously distributed, and if continuously distributed, may lack the derivative called density; in any of these cases or in any combination of them the moment of inertia is still represented by the same Lebesgue-Stieltjes integral. Thus in recent work in analysis it has sometimes been found advantageous to use the Lebesgue-Stieltjes integral in order that a single proof may cover both the ordinary Lebesgue integral and the case of infinite sums.

For another example, let $P$ be a finite or denumerable set of points $p_1, p_2, \ldots$ in $R_q$ with no accumulation point. We could for example let $P$ consist of all the points $(x^{(1)}, \ldots, x^{(q)})$ in which each $x^{(i)}$ is a positive integer. If $I$ is an interval whose closure is $\{x \mid \alpha^{(i)} \le x^{(i)} \le \beta^{(i)},\ i = 1, \ldots, q\}$, we define $\varphi[I]$ to be the number of points of $P$ belonging to the interval $\{x \mid \alpha^{(i)} \le x^{(i)} < \beta^{(i)},\ i = 1, \ldots, q\}$. This function of intervals is obviously additive, so there is a function $g(x)$ such that $\Delta_g I = \varphi[I]$. [...] We suppose that the notation is so chosen that the points of $P$ in the interval $a^{(i)} \le x^{(i)} < b^{(i)}$ are $p_1, \ldots, p_m$. If $s(x)$ is a step-function constant on each of a set of intervals

whose closures cover $I^*$, we may suppose that the $I_j$ are small enough so that each contains at most one point of $P$. Then in the sum
$$\sum_j c_j\,\Delta_g I_j,$$
where $c_j$ is the value of $s(x)$ on $I_j$, the factor $\Delta_g I_j$ vanishes unless $I_j$ contains a point of $P$ and is 1 if $I_j$ contains such a point. Hence

If $f(p_j)$ is finite, $j = 1, \ldots, m$, we set $u(x) = f(x)$ for $x = p_1, \ldots, p_m$, $u(x) = \infty$ elsewhere, and we set $l(x) = f(x)$ for $x = p_1, \ldots, p_m$ and $l(x) = -\infty$ elsewhere. Then $u(x)$ is a $U$-function and $l(x)$ is an $L$-function, and $l(x) \le f(x) \le u(x)$. Therefore

This shows that every function finite on $PI^*$ is $g$-summable. If $f(x)$ is defined on the whole space, we extend the integral to the whole space by 18.1; we find that $f(x)$ is $g$-summable over the space if and only if $\sum f(p_i)$ is absolutely convergent, in which case the integral has the value $\sum f(p_i)$. In particular, by 19.1, every set $E$ is $g$-measurable, and its $g$-measure is the number of points of $P$ which it contains.

In §§10 to 43 several of the theorems were not marked with an "s." The preceding example serves to show that they state properties

which do not hold for all integrals $\int f(x)\,dg(x)$. Thus, for instance, it is not necessarily true (19.14) that a finite or denumerable set should have $g$-measure zero; a set consisting of a single point of $P$ has $g$-measure 1.

At the end of §17 we commented that the symbol $\int_I f(x)\,dg(x)$ (which we then were calling $\int_I f(x)\,dx$) could have two interpretations. It could mean $\int_{R_q} f(x)K_I(x)\,dg(x)$, and this was the meaning we chose to give it. But it could also be interpreted as in 15.1, with $I$ as basic interval. When we choose $\Delta I$ to be the product of the edges of $I$ this distinction was seen in §17 to be only conceptual; the two interpretations led to one and the same value for the integral. For other interval-functions $\Delta_g I$ the two interpretations can lead to different values of the integral. We use the preceding example to show this. Let $I^*$ be an interval [...] such that $\beta$ is in the set $P$ but no other [...] we find $m_g I_h = 1$.

Although §25 is not an "s"-section, it needs only the special property of $g(x)$ that if $I$ has the projections $I_s$ and $I_t$ on the spaces $R_s$ and $R_t$ respectively, there shall be interval functions $\Delta_s I_s$ and $\Delta_t I_t$ whose product is $\Delta_g I$. The reader will be able to verify that this holds true whenever $g(x)$ is of the form [...], and likewise for the other three functions. It is permissible to restrict our attention to those $n$ large enough so that $x_n^{(i)} > 0$ if $x^{(i)} > 0$. As before, we define $\alpha_n^{(i)} = \inf\{x_n^{(i)}, 0\}$, $\beta_n^{(i)} = \sup\{x_n^{(i)}, 0\}$, $I_n = [\alpha_n, \beta_n)$. The number of superscripts $i$ for which $x_n^{(i)} < 0$ is then the same as the number $\nu(x)$ of superscripts for which $x^{(i)} \le 0$. The interval $I$ is the limit of the $I_n$, so by the remark after 53.12 we have $\mu(I_n) \to \mu(I)$. That is,

A like proof holds for the other three functions. If any of the coordinates $x^{(i)}$ is 0, the proof in the preceding paragraph still holds if $I$ is replaced by the empty set $\Lambda$.


From the definitions we have, for every interval $I = [\alpha, \beta)$,

Hence by 50.2 every set of finite $a$-measure also has finite $p$-measure and finite $n$-measure, and by 52.4 it has finite $g$-measure. That is,

(B) the class $\mathfrak{M}_g$ of sets of finite $g$-measure contains the class $\mathfrak{M}_a$ of sets of finite $a$-measure.

Since $g(x)$ is left continuous, by 52.10 the equation $\Delta_g I = m_g I$ holds for all intervals of the type $[\alpha, \beta)$. But we already know that $\Delta_g I = \mu(I)$ for all such intervals, so $\mu(I) = m_g I$ for all intervals $[\alpha, \beta)$. By 53.8, the set-functions $\mu(E)$ and $m_g E$ are equal for all bounded Borel sets. In exactly the same way, the equations $\pi(E) = m_p E$, $\sigma(E) = m_a E$ and $\nu(E) = m_n E$ hold for all bounded Borel sets.

The function $a(x)$ is positively monotonic, so by 54.1 the family $\mathfrak{M}_a$ of sets of finite $a$-measure is the natural range of the measure function $m_a E$. By 53.7, every set of the family $\mathfrak{F}$ belongs to $\mathfrak{M}_a$, and by (B) belongs to $\mathfrak{M}_g$. That is, if $E$ is in $\mathfrak{F}$ it has finite $g$-measure, and by 53.7 $m_g E = \mu(E)$.

The natural range of a measure function is uniquely determined by its values on bounded Borel sets, as was remarked after 53.6. On such sets the functions $m_a E$ and $\sigma(E)$ coincide. Since $\mathfrak{M}_a$ is the natural range of $m_a E$, it is also the natural range of $\sigma(E)$. The functions $\mu(E)$ and $m_g E$ coincide for all bounded Borel sets $E$, and they are regular, so by 53.6 we can calculate their total variations by using only bounded Borel sets. That is, for every set $E_0$ the total variation of $m_g E$ over $E_0$ is the same as the total variation $\sigma(E_0)$ of $\mu(E)$ over $E_0$. The family $\mathfrak{M}_a$ satisfies 53.2(a, b, c), and since we have seen that it is contained in $\mathfrak{M}_g$, the function $m_g E$ is defined on it. Clearly $m_g E$ continues to satisfy conditions 53.2(d, e, f) on $\mathfrak{M}_a$. Since $\sigma(E)$ is the total variation of $m_g E$, and $\mathfrak{M}_a$ is the natural range of $\sigma(E)$, it is also the natural range of $m_g E$, by 53.12. Applying 53.7 to $m_g E$ as on $\mathfrak{M}_g$ and on $\mathfrak{M}_a$, we see that $\mathfrak{M}_g$ is contained in $\mathfrak{M}_a$. But by (B), $\mathfrak{M}_a$ is contained in $\mathfrak{M}_g$, so the two coincide. Hence $\mathfrak{M}_g$ is the natural range of $m_g E$.

If $\mathfrak{F}$ is the natural range of $\mu(E)$, by 53.7 (applied to $\mu(E)$ and $m_g E$) it contains $\mathfrak{M}_g$. We have already seen that $\mathfrak{F}$ is contained in $\mathfrak{M}_g$, so the two coincide.


It remains to prove the uniqueness of the function $g(x)$, subject to (i), (ii), (iii) and the equation $\Delta_g I = \mu(I)$ for all intervals $I = [\alpha, \beta)$. Let $\gamma$ be a function which satisfies (i), (ii), (iii) and is such that $\Delta_\gamma I = \mu(I)$ for all intervals $I$ of type $[\alpha, \beta)$. Let $x$ be any point of the space; we shall show that $\gamma(x) = g(x)$. If any one of the $x^{(i)}$ is zero, this follows from (ii). Otherwise, we define as before $\alpha^{(i)} = \inf\{x^{(i)}, 0\}$, $\beta^{(i)} = \sup\{x^{(i)}, 0\}$, $I = [\alpha, \beta)$, $\nu(x)$ = number of superscripts $i$ such that $x^{(i)} < 0$. If we form the sum which defines $\Delta_\gamma I$, all terms but one vanish because of (ii), and we find

But by 52.10,

By hypothesis,

Hence

This is exactly the definition of $g(x)$. Hence $\gamma(x) = g(x)$, and the function $g(x)$ is uniquely determined by the values of the measure function $\mu(E)$ on intervals. The proof of the theorem is complete.

One immediate consequence of this theorem is that we have an affirmative answer to the question raised in §53: if $\mu(E)$ is regular on a family $\mathfrak{F}$, can it be extended uniquely to its natural range? For $\mu(E)$ determines $g(x)$, which determines the measure function $m_g E$ on the natural range $\mathfrak{M}_g$; and $m_g E = \mu(E)$ for all sets $E$ of $\mathfrak{F}$, so $m_g E$ is an extension of $\mu(E)$ to its natural range. If there were another such extension, say to a function $\mu_1(E)$ on a natural range $\mathfrak{F}_1$, then $\mu_1(E)$, $\mu(E)$ and $m_g E$ all coincide on all sets of $\mathfrak{F}$ and in particular on all bounded Borel sets; so $\mathfrak{F}_1 = \mathfrak{M}_g$ and $m_g E = \mu_1(E)$, by 53.7.

Another consequence is

54.3. If $g(x)$ is of BV on every interval and is left continuous, the family $\mathfrak{M}_g$ of sets of finite $g$-measure is the natural range of $m_g E$. Furthermore, there is a function $\gamma(x)$ satisfying (i), (ii), and (iii) of theorem 54.2 such that if either one of the integrals
$$\int f(x)\,dg(x), \qquad \int f(x)\,d\gamma(x)$$
exists so does the other, and the two are equal.

For every interval $I = [\alpha, \beta)$ we have
$$\Delta_g I = m_g I$$
by 52.10. Since $m_g E$ is a regular measure function on the range $\mathfrak{M}_g$, by 54.2 there is a function $\gamma(x)$ satisfying conditions (i), (ii) and (iii)


of 54.2 and such that $m_\gamma E = m_g E$ for all sets $E$ of $\mathfrak{M}_g$ and the class $\mathfrak{M}_\gamma$ of sets of finite $\gamma$-measure is the natural range of $m_\gamma E$. Let $I$ be any interval containing an interior point. If we denote the closure of $I$ by $[\alpha, \beta]$, and write $I_0$ for the corresponding right-open interval $[\alpha, \beta)$, then
$$\Delta_g I = \Delta_g I_0.$$
For in defining $\Delta_g I$ only the values of $g(x)$ at the vertices of $I$ were used, and $I$ and $I_0$ have the same vertices. A like statement holds for $\Delta_\gamma I$. But by 52.10, for the interval $I_0$ we have
$$\Delta_g I_0 = m_g I_0 = m_\gamma I_0 = \Delta_\gamma I_0,$$
and from this and the preceding equation we conclude that $\Delta_g I = \Delta_\gamma I$. If $I$ contains no interior point, we must have $\alpha^{(i)} = \beta^{(i)}$ for some $i$, and therefore
$$\Delta_g I = 0 = \Delta_\gamma I.$$
Thus the equation $\Delta_g I = \Delta_\gamma I$ holds for all intervals $I$.

Now in defining integral and measure the function $g(x)$ never entered except in the expression $\Delta_g I$. The functions $g(x)$ and $\gamma(x)$, though possibly different, have the same difference-function: $\Delta_g I = \Delta_\gamma I$. Hence they lead to the same integral and measure. This establishes the last statement of our theorem. Also, the classes $\mathfrak{M}_g$ and $\mathfrak{M}_\gamma$ are identical, and $m_g E = m_\gamma E$ for all sets $E$ of $\mathfrak{M}_g$. Since $\mathfrak{M}_\gamma$ is the natural range of $m_\gamma E$ by 54.2, it follows that $\mathfrak{M}_g$ is the natural range of $m_g E$. This completes the proof.

Another corollary, similar to the last, informs us that we would have lost no generality if we had restricted our attention to left continuous functions from the start; for every $g(x)$ which is of BV on every interval can be replaced by another such function which is left continuous and gives the same value to the integral as $g(x)$ did.

54.4. If $g(x)$ is of BV on every interval, there is a function $\gamma(x)$ which satisfies (i), (ii), (iii) of 54.2 and is such that
(a) every $g$-measurable set $E$ is $\gamma$-measurable, and if $m_g E$ is finite the equation $m_\gamma E = m_g E$ holds;
(b) every $g$-summable function $f(x)$ is also $\gamma$-summable, and
$$\int f(x)\,d\gamma(x) = \int f(x)\,dg(x).$$

Here we shall only prove conclusion (a); we could go on to prove (b) also, but this will follow from (a) after 55.6 is established.


The set function $m_g E$ is a regular measure function on the family $\mathfrak{M}_g$ of sets of finite $g$-measure. By 54.2, there is a function $\gamma(x)$ satisfying (i), (ii), (iii) of 54.2 and such that every set $E$ of $\mathfrak{M}_g$ is $\gamma$-measurable, and $m_\gamma E = m_g E$. This establishes (a) for sets of finite $g$-measure. If $E$ is $g$-measurable, but has not finite $g$-measure, we regard it as the sum for all $n$ of the sets $EW_n$. Each of these has finite $g$-measure, and therefore is $\gamma$-measurable. Hence $E = \bigcup_n EW_n$ is $\gamma$-measurable.

It is interesting to observe that, with the same notation as in 54.2, the following theorem holds.

54.5. Let $\mu(E)$ be a regular measure function on a range $\mathfrak{F}$, and let the functions $g(x)$, $p(x)$, $n(x)$, $a(x)$ be defined as in the proof of 54.2. Then $p(x)$ and $n(x)$ form a minimum decomposition of $g(x)$, and $a(x)$ is the same as $\tau(x)$.

Let $I$ be an interval of the type $[\alpha, \beta)$. Since

we know by 46.3 and 46.5 that

To establish the reverse inequality let $\varepsilon$ be a positive number, and let $P$ be the set exhibited in 53.11. As stated in that theorem, we may suppose $P$ to be a Borel set. In the proof of 54.2 it was shown that the equations $m_p E = \pi(E)$ and $m_n E = \nu(E)$ hold for all bounded Borel sets. Thus for every bounded Borel set $E$ we have

In particular,

so by 20.1 there is an open set $U$ containing $IP$ such that

(A)

The set $UI$ can by 3.6 be represented as the sum of intervals

where $I_j$ has the form

Hence

Let n be chosen large enough so that

(B)

The remainder

can be represented as the sum

of disjoint intervals $I_{n+1}, \ldots, I_k$, each of the type $[\alpha, \beta)$. Then

(C)

(D)

Since $I_1 \cup \cdots \cup I_n$ is contained in $U$, the first term in (D) is less than $\varepsilon$:

(E)

Since $UI$ contains $IP$, we have $m_p I = m_p IP \le m_p UI$, and by (B) it follows that

(F)

These inequalities, with (C) and (D), yield

Now

Since $\varepsilon$ is arbitrarily small, this implies that

Therefore $g(x) = p(x) - n(x)$ is a minimum decomposition of $g(x)$. It follows that for every interval $I$, the total variation $\Delta_\tau I$ is the sum of $\Delta_p I$ and $\Delta_n I$. But

so

Hence $\Delta_\tau I = \Delta_a I$ for every interval $I$. The functions $a(x)$, $\tau(x)$ were both so defined that they vanish whenever any one of the $x^{(i)}$ is zero. Hence, just as in proving the uniqueness of $g(x)$ in the last paragraph


of the proof of 54.2, we find that $\tau(x) = a(x)$. This completes the proof of the theorem.

55. If we are given a measure function $\mu(E)$ on a range $\mathfrak{F}$ we can at once utilize a process due to Lebesgue to define an integral. If $f(x)$ is defined on a set $E$ of $\mathfrak{F}$ we say that $f(x)$ is $\mathfrak{F}$-measurable if the set $E[f \ge a]$ belongs to $\mathfrak{F}$ for every number $a$. Suppose now that $f(x)$ is defined, bounded and $\mathfrak{F}$-measurable on a set $E_0$ of $\mathfrak{F}$, and that $\mu(E)$ is non-negative. Let $a$, $b$ be numbers such that $a < f(x) < b$, and let $l_0, \ldots, l_k$ be numbers such that $a = l_0 < l_1 < \cdots < l_k = b$. This collection of numbers $l_j$ we call a "ladder $L$." We now form the sums

(A) $$s(L) = \sum_{j=1}^{k} l_{j-1}\,\mu(E_j), \qquad S(L) = \sum_{j=1}^{k} l_j\,\mu(E_j),$$

where $E_j$ is the set $E_0[l_{j-1} < f \le l_j]$. Clearly $s(L) \le S(L)$. Let $L'$ be a ladder $\{l'_0, \ldots, l'_h\}$ which is "finer" than $L$; that is, each $l_j$ is one of the numbers $l'_i$. In passing from $L$ to $L'$, the term $l_j\,\mu(E_j)$ in $S(L)$ is replaced by the sum

where

Then

Therefore, adding for $j = 1, \ldots, k$, we find that $S(L') \le S(L)$. Similarly $s(L') \ge s(L)$. It follows that
$$s(L_1) \le S(L_2)$$
for all ladders $L_1$, $L_2$. For if $L'$ is the ladder consisting of all the numbers both of $L_1$ and of $L_2$, then $s(L_1) \le s(L') \le S(L') \le S(L_2)$. Therefore if we write $m = \sup s(L)$ and $M = \inf S(L)$ for all ladders $L$, we have $m \le M$. On the other hand, if all the steps $l_j - l_{j-1}$ of the ladder $L$ have length less than $\varepsilon$, then
$$S(L) - s(L) = \sum_{j=1}^{k} (l_j - l_{j-1})\,\mu(E_j) \le \varepsilon\,\mu(E_0),$$
whence
$$0 \le M - m \le \varepsilon\,\mu(E_0).$$
The positive number $\varepsilon$ is arbitrarily small, so $m = M$. Incidentally, this proves that by making the length of the largest step of $L$ arbitrarily close to zero we oblige both $s(L)$ and $S(L)$ to approach the common value $m = M$ as limit. We now define
$$\int_{E_0} f(x)\,d\mu(x) = m = M.$$


Next, let $f(x)$ be non-negative and $\mathfrak{F}$-measurable on $E_0$. Define $f_n(x)$ to be $\inf\{f(x), n\}$. This function is also $\mathfrak{F}$-measurable and is bounded, so the integral
$$\int_{E_0} f_n(x)\,d\mu(x)$$
exists. It is rather easy to show that this integral increases monotonically with $n$. If it has a finite limit as $n \to \infty$, we say that $f(x)$ is $\mu$-integrable on $E_0$, and we define
$$\int_{E_0} f(x)\,d\mu(x) = \lim_{n \to \infty} \int_{E_0} f_n(x)\,d\mu(x).$$

Let $W_n$ be the interval

If $f(x)$ is non-negative and is $\mu$-integrable on each $W_n$, it is easy to show that
$$\int_{W_n} f(x)\,d\mu(x)$$
increases monotonically with $n$. If its limit is finite, we say that $f(x)$ is $\mu$-integrable over the space $R_q$, and we define
$$\int_{R_q} f(x)\,d\mu(x) = \lim_{n \to \infty} \int_{W_n} f(x)\,d\mu(x).$$

If $f(x)$ is defined over $R_q$, we define as usual $f^+(x) = \sup\{f(x), 0\}$, $f^-(x) = \sup\{-f(x), 0\}$. Then $f(x)$ is said to be $\mu$-integrable over $R_q$ if $f^+$ and $f^-$ are both $\mu$-integrable over $R_q$, and in that case we define
$$\int_{R_q} f(x)\,d\mu(x) = \int_{R_q} f^+(x)\,d\mu(x) - \int_{R_q} f^-(x)\,d\mu(x).$$

Finally we remove the restriction that $\mu(E)$ be non-negative. With the notation of the preceding section, we have $\mu(E) = \pi(E) - \nu(E)$. Then we say that a function $f(x)$ defined on $R_q$ is $\mu$-integrable over $R_q$ if it is both $\pi$-integrable and $\nu$-integrable, and in that case we define
$$\int_{R_q} f(x)\,d\mu(x) = \int_{R_q} f(x)\,d\pi(x) - \int_{R_q} f(x)\,d\nu(x).$$

From these definitions it is easily seen that if $f(x)$ is defined on $R_q$ and is $\mathfrak{F}$-measurable on every interval, a necessary and sufficient condition for $f(x)$ to be $\mu$-integrable over $R_q$ is that $|f(x)|$ be $\sigma$-integrable over $R_q$.

We now prove the analogues of 50.1 and 50.2.

55.1. Let $\mu_i(E)$ ($i = 0, 1, 2$) be non-negative measure functions, all defined on the same family $\mathfrak{F}$, and such that $\mu_0(E) = \mu_1(E) + \mu_2(E)$ for all sets $E$ of $\mathfrak{F}$. Let $f(x)$ be defined on $R_q$. Then $f(x)$ is $\mu_0$-integrable if


and only if it is both $\mu_1$-integrable and $\mu_2$-integrable, and in that case
$$\int_{R_q} f(x)\,d\mu_0(x) = \int_{R_q} f(x)\,d\mu_1(x) + \int_{R_q} f(x)\,d\mu_2(x).$$

Suppose $f(x)$ bounded and $\mathfrak{F}$-measurable on a set $E_0$ of $\mathfrak{F}$. If we define $s_i(L)$ and $S_i(L)$ ($i = 0, 1, 2$) as in equation (A) at the beginning of the section, we easily see that
$$s_0(L) = s_1(L) + s_2(L), \qquad S_0(L) = S_1(L) + S_2(L)$$
for every ladder $L$. Hence we readily deduce
$$\int_{E_0} f(x)\,d\mu_0(x) = \int_{E_0} f(x)\,d\mu_1(x) + \int_{E_0} f(x)\,d\mu_2(x).$$

The extension to functions $f(x)$ which are defined and $\mu_0$-integrable (or both $\mu_1$-integrable and $\mu_2$-integrable) over $R_q$ needs no detailed presentation; it follows by straightforward use of the definition.

Two corollaries of 55.1 can now be stated.

55.2. If $\mu_0(E)$ and $\mu_1(E)$ are measure functions on $\mathfrak{F}$ and $0 \le \mu_1(E) \le \mu_0(E)$ for all $E$ in $\mathfrak{F}$, then every function $f(x)$ which is $\mu_0$-integrable over $R_q$ is also $\mu_1$-integrable.

For all $E$ in $\mathfrak{F}$ define
$$\mu_2(E) = \mu_0(E) - \mu_1(E).$$
The requirements in 53.2 are easily verified, and so $\mu_2(E)$ is a non-negative measure function on $\mathfrak{F}$. So by 55.1, if $f(x)$ is $\mu_0$-integrable it is both $\mu_1$-integrable and $\mu_2$-integrable.

55.3. If $\mu_i(E)$ ($i = 0, 1, 2$) are measure-functions on $\mathfrak{F}$ such that $\mu_0(E) = \mu_1(E) + \mu_2(E)$ for all sets $E$ of the family $\mathfrak{F}$, then every function $f(x)$ which is both $\mu_1$-integrable and $\mu_2$-integrable over $R_q$ is also $\mu_0$-integrable, and
$$\int_{R_q} f(x)\,d\mu_0(x) = \int_{R_q} f(x)\,d\mu_1(x) + \int_{R_q} f(x)\,d\mu_2(x).$$

Let $\pi_i(E)$, $\nu_i(E)$ be the positive and negative variations of $\mu_i(E)$ ($i = 0, 1, 2$) as defined in 53.9. If $E$ is an arbitrary set and $E_0$ is a subset of $E$ which belongs to $\mathfrak{F}$, then by the definition 53.9 we find

This holds for all subsets $E_0$ of $E$ which belong to $\mathfrak{F}$, so

(A)


In the same way we prove

(B)

By 53.12, the set-functions $\pi_i(E)$ and $\nu_i(E)$ ($i = 0, 1, 2$) are measure functions on $\mathfrak{F}$. If $f(x)$ is defined and both $\mu_1$-integrable and $\mu_2$-integrable over $R_q$, by definition it is integrable with respect to $\pi_1$, $\pi_2$, $\nu_1$ and $\nu_2$. By 55.2 with inequalities (A) and (B) it is $\pi_0$-integrable and $\nu_0$-integrable; so by definition it is $\mu_0$-integrable. Furthermore, for every set $E$ of $\mathfrak{F}$ we have

Transposing, we obtain

for all sets $E$ of $\mathfrak{F}$. If $f(x)$ is both $\mu_1$-integrable and $\mu_2$-integrable, we already know that it is $\pi_i$-integrable and $\nu_i$-integrable ($i = 0, 1, 2$). By 55.1,

Transposing yields

Next we show that if the family $\mathfrak{F}$ is enlarged in such a way that the measure function remains regular, all functions $f(x)$ which were originally integrable remain integrable, and the values of their integrals do not change.

55.4. Let $\mu_1(E)$ and $\mu_2(E)$ be measure functions defined and either non-negative or regular on the respective ranges $\mathfrak{F}_1$, $\mathfrak{F}_2$. Suppose that $\mathfrak{F}_1$ is contained in $\mathfrak{F}_2$, and that for each set $E$ in $\mathfrak{F}_1$ the functions $\mu_1(E)$ and

$\mu_2(E)$ are equal. Then every $\mu_1$-integrable function $f(x)$ is also $\mu_2$-integrable, and
$$\int_{R_q} f(x)\,d\mu_2(x) = \int_{R_q} f(x)\,d\mu_1(x).$$

Consider first the case of non-negative measure functions $\mu_1$ and $\mu_2$. Evidently every $\mathfrak{F}_1$-measurable function is $\mathfrak{F}_2$-measurable, since $\mathfrak{F}_1$ is contained in $\mathfrak{F}_2$. If $f(x)$ is bounded and $\mathfrak{F}_1$-measurable on a set $E_0$ of $\mathfrak{F}_1$, for every ladder $L$ the sums $s(L)$, $S(L)$ have the same value whether $\mu_1$ or $\mu_2$ is used as measure function. Therefore

(A) $$\int_{E_0} f(x)\,d\mu_1(x) = \int_{E_0} f(x)\,d\mu_2(x).$$

If $f(x)$ is non-negative and $\mu_1$-integrable on $E_0$, and $f_n(x) = \inf\{f(x), n\}$, then the limit
$$\lim_{n \to \infty} \int_{E_0} f_n(x)\,d\mu_1(x) = \lim_{n \to \infty} \int_{E_0} f_n(x)\,d\mu_2(x)$$
exists. The limits of the members of this equation are the respective members of (A), so (A) holds for such $f(x)$. The extension from integrals over $E_0$ to integrals over $R_q$ is similarly made, and the restriction to non-negative $f(x)$ is removed by applying the result already proved to the functions $f^+$ and $f^-$.

If $\mu_1$ and $\mu_2$ are regular, though not necessarily non-negative, on the respective families $\mathfrak{F}_1$, $\mathfrak{F}_2$, they have the same positive and negative variations $\pi(E)$, $\nu(E)$; for by 53.6 we can compute $\pi(E)$ and $\nu(E)$ on bounded Borel sets, on which the two measure functions are surely equal. By the part of the proof already completed, if $f(x)$ is $\pi$- and $\nu$-integrable when $\pi$ and $\nu$ are regarded as measure functions on $\mathfrak{F}_1$, it is still $\pi$- and $\nu$-integrable when $\pi$ and $\nu$ are regarded as measure functions on $\mathfrak{F}_2$, and the integrals retain the same values. By subtracting, the conclusion of the theorem is established.

The integral defined in this section is very closely related to the integral defined in §51, and there called the Lebesgue-Stieltjes integral. (Historically, the form of definition used in this section has the better right to the name.) The next two theorems show this relationship.

55.5. Let $g(x)$ be of BV on every interval. Let $m_g E$ be the corresponding measure-function, and $\mathfrak{M}_g$ the family of sets of finite $g$-measure. Let $f(x)$ be defined over the space $R_q$. Then if $f(x)$ is $g$-summable in the sense of §51 over $R_q$ it is also $m_g$-integrable in the sense of §55 over $R_q$, and
$$\int_{R_q} f(x)\,dm_g(x) = \int_{R_q} f(x)\,dg(x).$$


If gix) is positively monotonic or left continuous the converse is also true: if fix) is m„-integrable over Rg it is also g-summable over Rq. Evidently /(x) is (/-measurable if and only if it is fflVmeasurable, for there is only one concept of measure present. We first suppose that gix) is positively monotonic, so that mgE is non-negative. Let fix) be gf-measurable and bounded on a set E 0 of finite measure. Given a ladder L: we define and for all sets E of the equation m0E = m(E) holds. Thus if fix) is defined on Iiq, it is pi-intcgrable if and only if it is m^-integrable. Since gix) is left continuous, fix) is m„-integrable if and only if it is gr-summable by 55.5. Also, if the integrals exist,

If 𝔉 is the natural range of μ(E), so that 𝔉 = 𝔉_1 and μ_1 = μ, this completes the proof. Otherwise, by 55.4, if f(x) is μ-integrable it is μ_1-integrable, and

This, with the preceding paragraph, completes the proof. The postponed proof of conclusion (6) of 54.4 is now easily made. With the notation and hypotheses of that theorem, if f(x) is to be constant, and by the initial conditions this constant value is u^(r). This new system satisfies all the hypotheses of 69.4 if we replace V by the (open) set of all (x, y, u) with (x, y) in V and u in P. Since we have already established conclusions (C1) and (C3) we know that the solutions are continuous in all variables and so are their partial derivatives of all orders ≤ m with respect to the η^(i) and u^(r) for all (x, ξ, η, u) such that x and ξ are in [a − ε, b + ε] and (ξ, η, u) is in the δ-neighborhood of the set of points

In particular, the solutions have the desired continuity and differentiability properties if (ξ, η) is in the δ/2-neighborhood of E and u is in the δ/2-neighborhood of u_0, which is conclusion (C2) except for the trivial replacing of δ by δ/2. Moreover, by (C3) the derivatives

also have these same differentiability properties if the functions f^(i)(x, y, u) are continuous in all variables. This is conclusion (C4).
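The device used in the lemma, treating each parameter u^(r) as an extra unknown function whose derivative is zero, can be tried numerically. The following is only a minimal sketch under stated assumptions: the right member f(x, y, u) = u·y, the hand-written Runge-Kutta integrator `rk4`, and the names `augmented` and `solve_with_parameter` are hypothetical illustrations and do not come from the text.

```python
import math


def rk4(f, x0, state0, x1, steps):
    """Classical fourth-order Runge-Kutta for state' = f(x, state)."""
    h = (x1 - x0) / steps
    x, s = x0, list(state0)
    for _ in range(steps):
        k1 = f(x, s)
        k2 = f(x + h / 2, [si + h / 2 * ki for si, ki in zip(s, k1)])
        k3 = f(x + h / 2, [si + h / 2 * ki for si, ki in zip(s, k2)])
        k4 = f(x + h, [si + h * ki for si, ki in zip(s, k3)])
        s = [si + h / 6 * (a + 2 * b + 2 * c + d)
             for si, a, b, c, d in zip(s, k1, k2, k3, k4)]
        x += h
    return s


def augmented(x, state):
    """Hypothetical system y' = u*y with the parameter adjoined as u' = 0."""
    y, u = state
    return [u * y, 0.0]


def solve_with_parameter(eta, u, x_end=1.0, steps=200):
    """Integrate from x = 0 with initial values y = eta and u = u."""
    return rk4(augmented, 0.0, [eta, u], x_end, steps)


def exact(eta, u, x_end=1.0):
    """Closed-form solution of the assumed example, for comparison."""
    return eta * math.exp(u * x_end)
```

In this sketch the second component of the state stays at its initial value, so differentiating the solution with respect to the parameter is the same operation as differentiating with respect to an initial value of the enlarged system, which is the reduction the lemma relies on.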


The lemma is therefore established, and we return to the proof of the main theorem. If (x, y) is in V and u is in P, and β is a positive number so small that the β-neighborhood of (x, y) is in V, we compute (A)

provided that ||η|| < β. From this, with hypothesis (iv) and the Cauchy inequality, we deduce

whence

Thus hypothesis (v) of 69.3 is satisfied. The other hypotheses of 69.3 have been assumed satisfied, so there are positive numbers ε, δ such that equations D[u] have unique solutions y^(i)(x; ξ, η, u) (i = 1, ···, q; a − ε ≤ x ≤ b + ε) for all (ξ, η, u) in the set U: {(ξ, η, u) | (ξ, η) in the δ-neighborhood of E, u in the δ-neighborhood of u_0}, and these solutions are continuous for all (x, ξ, η, u) with x in [a − ε, b + ε] and (ξ, η, u) in U.

Let j be any particular one of the numbers 1, ···, q. If u is in P and the functions y^(i)(x), a − ε ≤ x ≤ b + ε, are continuous and (x, y(x)) is in V, the functions f^(i)(x, y(x), u) and

are measurable if |t| is small enough, by the remark after 68.2. If we subtract the first of these from the second and divide by t we obtain a measurable function. If we now let t tend to zero through a sequence of values, we obtain a sequence of measurable functions converging to

which is therefore measurable on [a − ε, b + ε].


If (ξ, η) is in the δ-neighborhood of E, u is in the δ-neighborhood of u_0 and x is in [a − ε, b + ε], the point (x, y(x; ξ, η, u)) is in V. Because of the continuity of the solutions, the line-segment joining this point to the point (x, y(x; ξ, η + h, u)) will also lie in V if h is any q-tuple lying in a certain neighborhood of (0, ···, 0). Henceforth we suppose that h is in that neighborhood. If h is in the given neighborhood of (0, ···, 0) and t is in the interval [0, 1] the functions

are continuous in t and h for fixed x. By the preceding paragraph they are measurable in x for fixed (t, h), since the arguments are continuous functions of x on [a − ε, b + ε]. So for each x in [a − ε, b + ε] we have

For each n all the summands in the right member are measurable functions of x on [a − ε, b + ε], so A_j^(i)(x, ξ, η, h, u) is measurable on that interval for each fixed h. Since the first partials of f^(i) are continuous functions of the y^(j), and y^(j)(x; ξ, η + h, u) and y^(j)(x; ξ, η, u) are continuous functions of η and h, it is easily seen that A_j^(i)(x, ξ, η, h, u) is also continuous in η and h for fixed x. Next let k be any one of the numbers 1, ···, q, and let h be such that all its components except h^(k) are zero. We define

Using equation (A), we find that for almost all x in [a − ε, b + ε] this satisfies the equation

Also, since y^(i)(x; ξ, η, u) has the value η^(i) at x = ξ, we see that

where δ_ik is 1 if i = k and is 0 otherwise. Consider the differential equations (L) By 69.2, for each q-tuple η̄ these equations have unique solutions v^(i)(x; ξ, η, η̄, h, u) (i = 1, ···, q; a − ε ≤ x ≤ b + ε) reducing to η̄^(i) at x = ξ and continuous in all variables. Hence in particular the limit (B)

exists and is continuous in x, η, η̄ and u. But for h^(k) ≠ 0 the functions z^(i)(x; ξ, η, h, u) satisfy these same differential equations (L). If η̄ is the vector whose k-th component is 1 and whose other components are 0, by the uniqueness of the solutions of (L) we have

Thus equation (B) informs us that the partial derivative of y^(i) with respect to η^(k) exists, and (C) which is a continuous function of x, η and u. We have thus obtained conclusion (C1) of our theorem subject to the restriction that m = 1. By the lemma, conclusions (C2), (C3) and (C4) also hold for m = 1. To remove the restriction m = 1 we use induction. Suppose that the hypotheses of 69.4 imply (C1) (and by the lemma imply (C2), (C3) and (C4)) whenever m is less than a certain m′ greater than 1; we show that the hypotheses still imply conclusion (C1) when m = m′. The equations


are identities for a − ε ≤ x ≤ b + ε and (ξ, η) in the δ-neighborhood of E and u in the δ-neighborhood of u_0, and the y^(i) have continuous first-order partial derivatives with respect to the variables η^(k). If for some k we denote the partial derivative of y^(i) with respect to η^(k) by v^(i), we see by 39.2 that the equations

are satisfied, where δ_ik is 1 if i = k and is 0 otherwise. These equations are equivalent to the differential equations (P) with the initial conditions v^(i)(ξ) = δ_ik. Let us consider u fixed somewhere in the δ-neighborhood of u_0 and V1'. We have assumed (C1), (C2), (C3) and (C4) all valid if m is less than m′, so by (C2) we find the equations (P) have solutions

assuming the initial values v_0^(i) at x = ξ and having continuous partial derivatives of all orders ≤ m′ − 1 with respect to the variables η^(i) and v_0^(i). In particular, if we fix v_0^(i) at the value δ_ik this solution is

as we saw in deriving equations (P); so these functions have continuous partial derivatives of all orders ≤ m′ − 1 with respect to the variables η^(i). That is, the functions y^(i)(x; ξ, η, u) have continuous partial derivatives of all orders ≤ m′ with respect to the variables η^(i), and conclusion (C1) holds for m = m′. By the lemma, the same is true of (C2), (C3) and (C4), so by induction these conclusions all hold for all integers m.

All that remains is to establish conclusion (C5). We therefore assume that the functions f^(i)(x, y, u) have continuous partial derivatives of all orders ≤ m with respect to all variables on the range (x, y) in V, u in P. If (ξ, η) is in the δ-neighborhood of E, so is (ξ + h, η) for all sufficiently small values of h. We assume h to be such that (ξ + h, η) is in the δ-neighborhood of E and ξ + h is in [a − ε, b + ε] and we define Y^(i)(h) to be the value of y^(i)(x; ξ + h, η, u) at x = ξ; that is,

The functions y^(i)(x; ξ + h, η, u) satisfy equations D[u] and have the values Y^(i)(h) at x = ξ. But the only solutions of equations D[u] with those initial values are the functions y^(i)(x; ξ, Y(h), u). That is,

for all x in [a − ε, b + ε], and all h near 0 such that ξ + h is in [a − ε, b + ε]. Since the functions y^(i)(x; ξ + h, η, u) satisfy equations D[u] and have the values η^(i) at x = ξ + h, by integration we find that

or

We divide by h and use the theorem of mean value, obtaining

where x is between ξ and ξ + h inclusive. The right member of this equation is continuous in all variables, so

The first-order partial derivatives of the y^(i) with respect to the η^(j) have already been shown to exist and to be continuous. So by Taylor's theorem and the preceding equations,

where θ is between 0 and 1. As h tends to zero Y(h) tends to Y(0), which is η; so the first factor in the right member tends to a limit. The second factor has already been shown to approach a limit. Hence the left member approaches a limit. That is, y^(i) has a partial derivative with respect to ξ, and (C) This proves also that the left member is a continuous function of all variables. From equations D[u] themselves we see that the partial derivatives of the y^(i) with respect to x are continuous functions of all variables, being identically equal to f^(i)(x, y(x; ξ, η, u), u). So we have shown that the functions y^(i)(x; ξ, η, u) have continuous first-order partial derivatives with respect to all variables on the range indicated, and conclusion (C5) is established for m = 1.

Suppose next that m is greater than 1. The functions y^(i)(x; ξ, η, u) have been shown in (C4) to have continuous partial derivatives of all orders ≤ m with respect to the η^(i) and u^(r), and the f^(i)(x, y, u) have been assumed to have continuous partial derivatives of all orders ≤ m with respect to all variables. From equations (C) it now follows that the first partial derivatives of the y^(i) with respect to ξ have continuous partial derivatives of all orders ≤ m − 1 with respect to the η^(i) and u^(r), for the differentiation of the right member will introduce only derivatives known to exist and be continuous. In a like manner, from equations D[u] we see that the partial derivatives of the y^(i) with respect to x have continuous partial derivatives of all orders ≤ m − 1 with respect to the variables η^(i) and u^(r). Returning to (C), we now see that if m ≥ 2 the partial differentiation of the right member with respect to x or ξ will involve only partial derivatives now known to exist and be continuous, and likewise for the right member of D[u]. That is, if m ≥ 2 the partial derivatives

y_ξξ^(i), y_ξx^(i), y_xx^(i)

all exist, are continuous and have continuous partial derivatives of all orders ≤ m − 2 with respect to the η^(i) and u^(r). Returning with this information to (C) and D[u], we find that if m ≥ 3 the third-order partial derivatives of y^(i) with respect to x and ξ possess continuous partial derivatives of all orders ≤ m − 3 with respect to the η^(i) and u^(r). We continue the process; the outcome is that all partial derivatives of y N, or it has not. We shall show that each of these leads to a contradiction. This will prove that the assumption m_e(E − ∪F_n) > 0 leads to a contradiction, and will complete the proof of the theorem (with the additional hypotheses (I) and (II)).

CASE I. The set F has no point in common with any F_n. For each integer p, the set F is a set of 𝔉 containing no point of F_1 ∪ ··· ∪ F_p. The supremum e_{p+1} of the function e(F*) for all such sets F* is at least equal to e(F). By the law of formation of the sequence, e(F_{p+1}) is at least e(F)/2. Therefore the edge of the cube Q_{p+1} is at least e(F), the edge of Q*_{p+1} is at least 5e(F), and the measure of Q*_{p+1} is at least [5e(F)]^q, which is a positive number independent of p. But this is impossible; we have already seen that the series Σ mQ*_n converges.

CASE II. There is an n such that F_n has a point in common with F. Let p + 1 be the least integer such that F_{p+1} has a point in common with F, and let x be a point belonging to both F_{p+1} and F. By (B), the number p + 1 cannot be one of the numbers 1, ···, N, so p ≥ N. Since F is a set of the family 𝔉 having no point in common with F_1 ∪ ··· ∪ F_p, the number e_{p+1} is at least equal to e(F), so by the law of formation of the sequence we know that

The points x and x_0 both belong to F, which can be enclosed in a cube of side 2e(F). Hence,

The point x also belongs to F_{p+1}, which is contained in Q_{p+1}. So by definition of Q_{p+1} there is a point x_{p+1} such that

Combining the last three inequalities, we obtain

That is, x_0 belongs to the cube Q*_{p+1}. But p + 1 > N, so this contradicts statement (A). Thus our theorem has been established subject to the additional hypotheses (I) and (II). It is not difficult to replace (II) by the weaker hypothesis

(II′). There is a positive number α such that for each x_0 of E there is a sequence of sets of 𝔉 converging to x_0 with modulus of regularity α.

Let 𝔉_0 be the family of sets each consisting of a set of 𝔉 plus a single point of E. If {F_j} is a sequence of sets of 𝔉 converging to x_0 with modulus of regularity α, we define F′_j to be F_j plus the single point x )/A7„ tends to I. But in the light of the preceding section, we see that in space of two or more dimensions there is a distinction which is absent from the one-dimensional case. A sequence {I_n} of intervals containing x_0 may have edges tending to zero, and yet not converge regularly to x_0 in the sense of definition 70.1; for instance, in two-space the intervals I_n: {x | x_0^(1) ≤ x^(1) ≤ x_0^(1) + n^(−1), x_0^(2) ≤ x^(2) ≤ x_0^(2) + n^(−2)} fail to converge regularly to x_0. We therefore have a choice in the above definition. Either we can demand merely that the I_n contain x_0 and have edges tending to zero; the derivate thus defined is called the strong upper derivate of Φ at x_0, and we shall not discuss it further. Or else we can demand that the I_n converge regularly to x_0; the derivate thus defined is the ordinary upper derivate of Φ at x_0. Precisely,

71.1. Let Φ be a function of sets which is defined for all closed intervals contained in an interval I_0. Let x_0 be a point of I_0. The ordinary upper derivate D̄[Φ_1 + Φ_2]. For a properly chosen subsequence {I_α}, α = n_1, n_2, ···, the sequences Φ_1(I_α)/ΔI_α and Φ_2(I_α)/ΔI_α converge to limits, finite or infinite. Then (A) provided that the sum on the right is defined. By the definition of ordinary upper derivate, we have

(B) So if the sum on the right in (A) is defined, (A) and (B) imply (C) If the sum on the right in (A) is undefined, one of the summands is +∞. By (B), one of the terms on the right in (C) is also +∞, so if the right member of (C) has a meaning it means + h − ε. By the Vitali theorem (70.2) a finite or denumerable set of these intervals cover almost all of E and are disjoint. That is, there are disjoint subintervals I′_1, I′_2, ··· of I_0, each satisfying the inequality Φ(I′) > (h − ε)ΔI′, and together covering almost all of E, so that

We select a finite number of these I′, say I′_1, ···, I′_k, such that

Hence, using a preceding inequality,


The remainder of I_0 can be subdivided into non-overlapping subintervals I″_1, ···, I″_r. By hypothesis,

Since ε is an arbitrary positive number, this implies h − ε. This yields

Since ε is an arbitrary positive number, the theorem is proved.

72.4. If f(x) is summable over R_q, and for all measurable sets E we define F(E) = ∫_E f(x) dx, then for almost all x the ordinary derivative and the general derivative of F(E) at x are defined, finite and equal to f(x).

We need prove only the statement about the general derivative; as we saw in §71, if this is defined so is the ordinary derivative, and the two derivatives are then equal. Let I_0 be a closed interval, and let ε and k be positive numbers. Since f(x) is summable over I_0, there is a U-function u(x) ≥ f(x) on I_0 such that


For each measurable set E contained in I_0 we define

Let E_k be the subset of I_0 on which D*G(x) ≥ k. By 72.3,

so E_k has exterior measure less than ε. Except on E_k we have by 71.3 and 72.1

Thus for each positive k the set on which the inequality

fails to hold has exterior measure less than ε, which is an arbitrary positive number. So its exterior measure is zero; that is, its measure is zero. In particular, if k is the reciprocal of a positive integer n the set on which the inequality

fails to hold is a set N_n of measure zero. Except on ∪N_n, which has measure zero, the above inequality holds for every n. Hence by 5.2

almost everywhere in I_0. This holds true for −f(x) as well as for f(x), so for almost all x we have that is,

The two inequalities together prove that the general derivative of F(E) exists almost everywhere in I_0 and has the value f(x). If we reject the set of measure zero on which f(x) is infinite, we find that D*F(x) = f(x) ≠ ±∞ almost everywhere in I_0. To complete the proof, we use the cubes W_n: {x | −n ≤ x^(i) ≤ n, i = 1, ···, q}.

Let E_0 be the set of x at which it is false that D*F(x) exists, is finite, and equals f(x). The part of E_0 interior to W_n has measure zero, by the preceding proof. This holds for every n; so on adding we find mE_0 = 0, and the proof is complete.

73. Although we have not defined the concept of total variation for a function of intervals, the definition is self-suggesting; we merely paraphrase 46.2. Another way of stating it is this. If Φ(I) is a function of intervals defined and additive for all subintervals of I_0, there is by 45.5 a finite-valued function g(x) on I_0 such that Δ_g I = Φ(I) for every subinterval I of I_0. We then say that Φ(I) is of BV on I_0 if g(x) is of BV on I_0, as defined in 46.2. If this is so, by 46.9 there are non-negative interval functions Δ_P I and Δ_N I whose difference is Φ(I). We now prove a theorem analogous to 34.3.

73.1. Let Φ(I) be an additive function of intervals, defined for all subintervals of an interval I_0, and of BV on I_0. Then at almost all points x of I_0 the ordinary derivative DΦ(x) exists, and it is summable over the set E on which it exists. Moreover, if Φ(I) is non-negative the inequality

is satisfied.
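A one-dimensional case may make the statement concrete. The sketch below is a hedged illustration: the integrator `g` (the identity plus a unit jump at 1/2) and the helper names are hypothetical, and the inequality is assumed to have the familiar form, integral of DΦ over E not exceeding Φ(I_0), by analogy with 34.3, since the displayed formula is not reproduced here.

```python
def g(x):
    """Hypothetical left-continuous integrator: x plus a unit jump at 1/2."""
    return x + (1.0 if x > 0.5 else 0.0)


def phi(a, b):
    """Additive interval function Phi([a, b]) = g(b) - g(a)."""
    return g(b) - g(a)


def difference_quotient(x, n):
    """Phi(I_n) / length(I_n) for the interval I_n = [x - 1/n, x + 1/n]."""
    a, b = x - 1.0 / n, x + 1.0 / n
    return phi(a, b) / (b - a)


# Away from the jump the quotients settle at 1; at the jump they grow without bound.
quotient_regular_point = difference_quotient(0.25, 1000)
quotient_at_jump = difference_quotient(0.5, 1000)
total_phi = phi(0.0, 1.0)
```

For this choice the derivative equals 1 at every point other than 1/2, so its integral over [0, 1] is 1, while Φ([0, 1]) = 2; the inequality is strict because the jump contributes to Φ but not to the derivative.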

By the preceding paragraph, there is no loss of generality in supposing Φ(I) non-negative; for Φ(I) is the difference of two non-negative functions Δ_P I and Δ_N I, and if the theorem holds for each of these it holds for their difference Φ(I). For each pair h, k of positive integers we define E_{h,k} to be the set of all points x of I_0 at which

D̲Φ(x) < h/k < (h + 1)/k ≤ D̄Φ(x).

At each point of E_{h,k} the derivative DΦ(x) fails to exist. Conversely, if the derivative fails to exist at x, the difference D̄Φ(x) − D̲Φ(x) is positive. If k is an integer large enough so that 2/k is less than that difference, and h is the smallest integer such that h/k exceeds D̲Φ(x), then (h + 1)/k cannot exceed D̄Φ(x), and so x is in E_{h,k}. Therefore

(A) the set of points at which the derivative fails to exist is ∪_{h,k} E_{h,k}.
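The choice of h and k described above is constructive and can be checked with exact rational arithmetic. This is a minimal sketch, assuming both derivates are finite and given as rational numbers; the function name `find_h_k` is a hypothetical label and not part of the text.

```python
from fractions import Fraction
import math


def find_h_k(lower, upper):
    """Return integers (h, k) with lower < h/k and (h + 1)/k <= upper.

    lower and upper stand for the lower and upper derivates at a point,
    assumed finite with lower < upper.
    """
    lower, upper = Fraction(lower), Fraction(upper)
    if not lower < upper:
        raise ValueError("need lower < upper")
    gap = upper - lower
    k = math.floor(2 / gap) + 1    # least integer k with 2/k < gap
    h = math.floor(lower * k) + 1  # least integer h with h/k > lower
    return h, k
```

Since h − 1 does not exceed k times the lower derivate, (h + 1)/k is at most the lower derivate plus 2/k, which is less than the upper derivate; this is the step the proof summarizes with "cannot exceed".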

Consider any one of these sets E_{h,k}. For an arbitrary positive number ε it is possible by 20.7 to find an open set U containing E_{h,k} and having measure mU < m_e E_{h,k} + ε. At each point x_0 of E_{h,k} the lower derivate is less than h/k, so there exists a sequence of intervals I′_n converging regularly to x_0 and having Φ(I′_n)/ΔI′_n < h/k. By the Vitali theorem (70.2), there is a finite or denumerable set I_1, I_2, ··· of such intervals which are disjoint, lie in U and cover almost all of E_{h,k}. Hence (B)

Consider now the part of E_{h,k} in an interval I_n. On this set the upper derivate of Φ is at least (h + 1)/k, by hypothesis; so by 72.2 we have

By 20.4, this yields (C)

since almost all of E_{h,k} is in ∪I_n.

Comparing (B) and (C) yields

or

Since ε is arbitrary, E_{h,k} has measure zero. Returning to statement (A), we now see that the set on which the derivative fails to exist is the sum of a denumerable collection of sets of measure zero, so it has measure zero. If we denote by E the set on which DΦ(x) exists, we have shown that E constitutes almost all of I_0. It remains to show that DΦ(x) is summable over I_0. We first effect a partition of I_0 into 2^q disjoint subintervals I_{1,j}, j = 1, ···, 2^q, by bisecting each side of I_0. Again, by bisecting each side of each I_{1,j} we split I_0 into 22