The Calculus Integral 1442180951, 9781442180956

An elementary introduction to integration theory on the real line. This is at the level of an honor's course in cal

896 124 1MB

English Pages 304 [303] Year 2013

Polecaj historie

Calculus Basics vol 3 : The Integral Calculus

This book is the third volume of Calculus Basics, which is composed of The Limits, The Differential Calculus, and The In

325 81 31MB Read more

Integral Calculus

189 48 136MB Read more

Differential And Integral Calculus

629 177 24MB Read more

CLP-2 Integral Calculus [2]

151 84 17MB Read more

Differential and Integral Calculus Vol. II [2]

926 206 21MB Read more

Differential and Integral Calculus Vol. I [1]

643 165 18MB Read more

CLP-2 Integral Calculus [2, Exercises ed.]

142 29 3MB Read more

CLP-2 Integral Calculus [2, Text ed.]

137 43 2MB Read more

Calculus Illustrated. Volume 3: Integral Calculus [3, 1 ed.] 9798657180145

This is the third volume of the series Calculus Illustrated, a textbook for undergraduate students.Mathematical thinking

1,361 356 19MB Read more

Integral calculus for beginners; with an introduction to the study of differential equations.

638 135 5MB Read more

The Calculus Integral
1442180951, 9781442180956

Author / Uploaded
Brian S Thomson

Citation preview

THE CALCULUS INTEGRAL

Brian S. Thomson Simon Fraser University

C LASSICAL R EAL A NALYSIS . COM

This text is intended as an outline for a rigorous course introducing the basic elements of integration theory to honors calculus students or for an undergraduate course in elementary real analysis. Since “all” exercises are worked through in the appendix, the text is particularly well suited to self-study. For further information on this title and others in the series visit our website. www.classicalrealanalysis.com There are PDF files of all of our texts available for download as well as instructions on how to order trade paperback copies. We always allow access to the full content of our books on G OOGLE B OOKS and on the A MAZON Search Inside the Book feature. Cover Image: Sir Isaac Newton And from my pillow, looking forth by light Of moon or favouring stars, I could behold The antechapel where the statue stood Of Newton with his prism and silent face, The marble index of a mind for ever Voyaging through strange seas of Thought, alone. . . . William Wordsworth, The Prelude.

Citation: The Calculus Integral, Brian S. Thomson, ClassicalRealAnalysis.com (2010), [ISBN 1442180951] Date PDF file compiled: June 19, 2011

BETA VERSION β 1.0 The file or paperback that you are reading should be considered a work in progress. In a classroom setting make sure all participants are using the same beta version. We will add and amend, depending on feedback from our users, until the text appears to be in a stable condition.

ISBN: 1442180951 EAN-13: 9781442180956

C LASSICAL R EAL A NALYSIS . COM

PREFACE There are plenty of calculus books available, many free or at least cheap, that discuss integrals. Why add another one? Our purpose is to present integration theory at an honors calculus level and in an easier manner by defining the definite integral in a very traditional way, but a way that avoids the equally traditional Riemann sums definition. Riemann sums enter the picture, to be sure, but the integral is defined in the way that Newton himself would surely endorse. Thus the fundamental theorem of the calculus starts off as the definition and the relation with Riemann sums becomes a theorem (not the definition of the definite integral as has, most unfortunately, been the case for many years). As usual in mathematical presentations we all end up in the same place. It is just that we have taken a different route to get there. It is only a pedagogical issue of which route offers the clearest perspective. The common route of starting with the definition of the Riemann integral, providing the then necessary detour into improper integrals, and ultimately heading towards the Lebesgue integral is arguably not the best path although it has at least the merit of historical fidelity.

Acknowledgments I have used without comment material that has appeared in the textbook [TBB] Elementary Real Analysis, 2nd Edition, B. S. Thomson, J. B. Bruckner, A. M. Bruckner, ClassicalRealAnalyis.com (2008). I wish to express my thanks to my co-authors for permission to recycle that material into the idiosyncratic form that appears here and their encouragement (or at least lack of discouragement) in this project. I would also like to thank the following individuals who have offered feedback on the material, or who have supplied interesting exercises or solutions to our exercises: [your name here], . . .

i

ii

Note to the instructor Since it is possible that some brave mathematicians will undertake to present integration theory to undergraduates students using the presentation in this text, it would be appropriate for us to address some comments to them.

What should I teach the weak calculus students? Let me dispense with this question first. Don’t teach them this material, which is aimed much more at the level of an honor’s calculus course. I also wouldn’t teach them the Riemann integral. I think a reasonable outline for these students would be this: 1. An informal account of the indefinite integral formula Z

F ′ (x) dx = F(x) +C

just as an antiderivative notation with a justification provided by the mean-value theorem. 2. An account of what it means for a function to be continuous on an interval [a, b]. 3. The definition

Z b a

F ′ (x) dx = F(b) − F(a)

for continuous functions F : [a, b] → R that are differentiable at all1 points in (a, b). The mean-value theorem again justifies the definition. You won’t need improper integrals, e.g., Z 1 Z 1 1 d √ √ dx = 2 x dx = 2 − 0. x 0 0 dx

4. Any properties of integrals that are direct translations of derivative properties. 5. The Riemann sums identity Z b a

where the points rem. 1 . . . or

ξ∗i

n

f (x) dx = ∑ f (ξ∗i )(xi − xi−1 ) i=1

that make this precise are selected by the mean-value theo-

all but finitely many points

iii

iv 6. The Riemann sums approximation Z b a

n

f (x) dx ≈ ∑ f (ξi )(xi − xi−1 ) i=1

where the points ξi can be freely selected inside the interval. Continuity of f justifies this since f (ξi ) ≈ f (ξ∗i ) if the points xi and xi−1 are close together. [It is assumed that any application of this approximation would be restricted to continuous functions.] That’s all! No other elements of theory would be essential and the students can then focus largely on the standard calculus problems. Integration theory, presented in this skeletal form, is much less mysterious than any account of the Riemann integral would be. On the other hand, for students that are not considered marginal, the presentation in the text should lead to a full theory of integration on the real line provided at first that the student is sophisticated enough to handle ε, δ arguments and simple compactness proofs (notably Bolzano-Weierstrass and Cousin lemma proofs).

Why the calculus integral? Perhaps the correct question is “Why not the Lebesgue integral?” After all, integration theory on the real line is not adequately described by either the calculus integral or the Riemann integral. The answer that we all seem to have agreed upon is that Lebesgue’s theory is too difficult for beginning students of integration theory. Thus we need a “teaching integral,” one that will present all the usual rudiments of the theory in way that prepares the student for the later introduction of measure and integration. Using the Riemann integral as a teaching integral requires starting with summations and a difficult and awkward limit formulation. Eventually one reaches the fundamental theorem of the calculus. The fastest and most efficient way of teaching integration theory on the real line is, instead, at the outset to interpret the calculus integral Z b a

F ′ (x) dx = F(b) − F(a)

as a definition. The primary tool is the very familiar mean-value theorem. That theorem leads quickly back to Riemann sums in any case. The instructor must then drop the habit of calling this the fundamental theorem of the calculus. Within a few lectures the main properties of integrals are available and all of the computational exercises are accessible. This is because everything is merely an immediate application of differentiation theorems. There is no need for an “improper” theory of the integral since integration of unbounded functions requires no additional ideas or lectures. There is a long and distinguished historical precedent for this kind of definition. For all of the 18th century the integral was understood only in this sense2 The descriptive definition of the Lebesgue integral, which too can be taken as a starting point, is exactly 2 Certainly

Newton and his followers saw it in this sense. For Leibnitz and his advocates the integral was a sum of infinitesimals, but that only explained the connection with the derivative. For a lucid account of the thinking of the mathematicians to whom we owe all this theory see Judith V. Grabiner, Who gave

v the same: but now requires F to be absolutely continuous and F ′ is defined only almost everywhere. The Denjoy-Perron integral has the same descriptive definition but relaxes the condition on F to that of generalized absolute continuity. Thus the narrative of integration theory on the real line can told simply as an interpretation of the integral as meaning merely Z b a

F ′ (x) dx = F(b) − F(a).

Why not the Riemann integral? Or you may prefer to persist in teaching to your calculus students the Riemann integral and its ugly step-sister, the improper Riemann integral. There are many reasons for ceasing to use this as a teaching integral; the web page, “Top ten reasons for dumping the Riemann integral” which you can find on our site www.classicalrealanalysis.com has a tongue-in-cheek account of some of these. The Riemann integral does not do a particularly good job of introducing integration theory to students. That is not to say that students should be sheltered from the notion of Riemann sums. It is just that a whole course confined to the Riemann integral wastes considerable time on a topic and on methods that are not worthy of such devotion. In this presentation the Riemann sums approximation to integrals enters into the discussion naturally by way of the mean-value theorem of the differential calculus. It does not require several lectures on approximations of areas and other motivating stories.

The calculus integral For all of the 18th century and a good bit of the 19th century integration theory, as we understand it, was simply the subject of antidifferentiation. Thus what we would call the fundamental theorem of the calculus would have been considered a tautology: that is how an integral is defined. Both the differential and integral calculus are, then, the study of derivatives with the integral calculus largely focused on the inverse problem. This is often expressed by modern analysts by claiming that the Newton integral of a function f : [a, b] → R is defined as Z b a

f (x) dx = F(b) − F(a)

where F : [a, b] → R is any continuous function whose derivative F ′ (x) is identical with f (x) at all points a < x < b. While Newton would have used no such notation or terminology, he would doubtless agree with us that this is precisely the integral he intended. The technical justification for this definition of the Newton integral is nothing more than the mean-value theorem of the calculus. Thus it is ideally suited for teaching you the epsilon? Cauchy and the origins of rigorous calculus, American Mathematical Monthly 90 (3), 1983, 185–194.

vi integration theory to beginning students of the calculus. Indeed, it would be a reasonable bet that most students of the calculus drift eventually into a hazy world of little-remembered lectures and eventually think that this is exactly what an integral is anyway. Certainly it is the only method that they have used to compute integrals. For these reasons we have called it the calculus integral3 . But none of us teach the calculus integral. Instead we teach the Riemann integral. Then, when the necessity of integrating unbounded functions arise, we teach the improper Riemann integral. When the student is more advanced we sheepishly let them know that the integration theory that they have learned is just a moldy 19th century concept that was replaced in all serious studies a full century ago. We do not apologize for the fact that we have misled them; indeed we likely will not even mention the fact that the improper Riemann integral and the Lebesgue integral are quite distinct; most students accept the mantra that the Lebesgue integral is better and they take it for granted that it includes what they learned. We also do not point out just how awkward and misleading the Riemann theory is: we just drop the subject entirely. Why is the Riemann integral the “teaching integral” of choice when the calculus integral offers a better and easier approach to integration theory? The transition from the Riemann integral to the Lebesgue integral requires abandoning Riemann sums in favor of measure theory. The transition from the improper Riemann integral to the Lebesgue integral is usually flubbed. The transition from the calculus integral to the Lebesgue integral (and beyond) can be made quite logically. Introduce, first, sets of measure zero and some simple related concepts. Then an integral which completely includes the calculus integral and yet is as general as one requires can be obtained by repeating Newton’s definition above: the integral of a function f : [a, b] → R is defined as Z b a

f (x) dx = F(b) − F(a)

where F : [a, b] → R is any continuous function whose derivative F ′ (x) is identical with f (x) at all points a < x < b with the exception of a set of points N that is of measure zero and on which F has zero variation. We are employing here the usual conjurer’s trick that mathematicians often use. We take some late characterization of a concept and reverse the presentation by taking that as a definition. One will see all the familiar theory gets presented along the way but that, because the order is turned on its head, quite a different perspective emerges. Give it a try and see if it works for your students. By the end of this textbook the student will have learned the calculus integral, seen all of the familiar integration theorems of the integral calculus, worked with Riemann sums, functions of bounded variation, studied countable sets and sets of measure zero, and given a working definition of the Lebesgue integral.

3 The

play on the usual term “integral calculus” is intentional.

Contents Preface

i

Note to the instructor

iii

Table of Contents

vii

1 What you should know first 1.1 What is the calculus about? . . . . . . . . . . . . . . . 1.2 What is an interval? . . . . . . . . . . . . . . . . . . . 1.2.1 What do open and closed mean? . . . . . . . . 1.2.2 Open and closed intervals . . . . . . . . . . . 1.3 Sequences and series . . . . . . . . . . . . . . . . . . 1.3.1 Sequences . . . . . . . . . . . . . . . . . . . . 1.3.2 Exercises . . . . . . . . . . . . . . . . . . . . 1.3.3 Series . . . . . . . . . . . . . . . . . . . . . . 1.4 Partitions . . . . . . . . . . . . . . . . . . . . . . . . 1.4.1 Cousin’s partitioning argument . . . . . . . . . 1.5 Continuous functions . . . . . . . . . . . . . . . . . . 1.5.1 What is a function? . . . . . . . . . . . . . . . 1.5.2 Uniformly continuous functions . . . . . . . . 1.5.3 Pointwise continuous functions . . . . . . . . 1.5.4 Exercises . . . . . . . . . . . . . . . . . . . . 1.5.5 Oscillation of a function . . . . . . . . . . . . 1.5.6 Endpoint limits . . . . . . . . . . . . . . . . . 1.5.7 Boundedness properties . . . . . . . . . . . . 1.6 Existence of maximum and minimum . . . . . . . . . 1.6.1 The Darboux property of continuous functions 1.7 Derivatives . . . . . . . . . . . . . . . . . . . . . . . 1.8 Differentiation rules . . . . . . . . . . . . . . . . . . . 1.9 Mean-value theorem . . . . . . . . . . . . . . . . . . 1.9.1 Rolle’s theorem . . . . . . . . . . . . . . . . . 1.9.2 Mean-Value theorem . . . . . . . . . . . . . . 1.9.3 The Darboux property of the derivative . . . . 1.9.4 Vanishing derivatives and constant functions . 1.9.5 Vanishing derivatives with exceptional sets . . 1.10 Lipschitz functions . . . . . . . . . . . . . . . . . . .

1 1 2 2 3 4 5 6 7 8 9 10 10 11 12 12 15 16 19 21 22 23 25 26 26 27 31 31 32 33

vii

. . . . . . . . . . . . . . . . . . . . . . . . . . . . .

. . . . . . . . . . . . . . . . . . . . . . . . . . . . .

. . . . . . . . . . . . . . . . . . . . . . . . . . . . .

. . . . . . . . . . . . . . . . . . . . . . . . . . . . .

. . . . . . . . . . . . . . . . . . . . . . . . . . . . .

. . . . . . . . . . . . . . . . . . . . . . . . . . . . .

. . . . . . . . . . . . . . . . . . . . . . . . . . . . .

. . . . . . . . . . . . . . . . . . . . . . . . . . . . .

. . . . . . . . . . . . . . . . . . . . . . . . . . . . .

CONTENTS

viii 2 The Indefinite Integral 2.1 An indefinite integral on an interval . . . . . . . . . . . . . 2.1.1 Role of the finite exceptional set . . . . . . . . . . . 2.1.2 Features of the indefinite integral . . . . . . . . . . R 2.1.3 The notation f (x) dx . . . . . . . . . . . . . . . . 2.2 Existence of indefinite integrals . . . . . . . . . . . . . . . . 2.2.1 Upper functions . . . . . . . . . . . . . . . . . . . . 2.2.2 The main existence theorem for bounded functions . 2.2.3 The main existence theorem for unbounded functions 2.3 Basic properties of indefinite integrals . . . . . . . . . . . . 2.3.1 Linear combinations . . . . . . . . . . . . . . . . . 2.3.2 Integration by parts . . . . . . . . . . . . . . . . . . 2.3.3 Change of variable . . . . . . . . . . . . . . . . . . 2.3.4 What is the derivative of the indefinite integral? . . . 2.3.5 Partial fractions . . . . . . . . . . . . . . . . . . . . 2.3.6 Tables of integrals . . . . . . . . . . . . . . . . . . 3 The Definite Integral 3.1 Definition of the calculus integral . . . . . . . . . . . . . . 3.1.1 Alternative definition of the integral . . . . . . . . 3.1.2 Infinite integrals . . . . R. . . . . . . . . . . . . . Ra 3.1.3 Notation: a f (x) dx and ba f (x) dx . . . . . . . . R 3.1.4 The dummy variable: what is the “x” in ab f (x) dx? 3.1.5 Definite vs. indefinite integrals . . . . . . . . . . . 3.1.6 The calculus student’s notation . . . . . . . . . . . 3.2 Integrability . . . . . . . . . . . . . . . . . . . . . . . . . 3.2.1 Integrability of bounded, continuous functions . . 3.2.2 Integrability of unbounded continuous functions . 3.2.3 Comparison test for integrability . . . . . . . . . . 3.2.4 Comparison test for infinite integrals . . . . . . . . 3.2.5 The integral test . . . . . . . . . . . . . . . . . . . 3.2.6 Products of integrable functions . . . . . . . . . . 3.3 Properties of the integral . . . . . . . . . . . . . . . . . . 3.3.1 Integrability on all subintervals . . . . . . . . . . . 3.3.2 Additivity of the integral . . . . . . . . . . . . . . 3.3.3 Inequalities for integrals . . . . . . . . . . . . . . 3.3.4 Linear combinations . . . . . . . . . . . . . . . . 3.3.5 Integration by parts . . . . . . . . . . . . . . . . . 3.3.6 Change of variable . . . . . . . . . . . . . . . . . 3.3.7 What is the derivative of the definite integral? . . . 3.4 Mean-value theorems for integrals . . . . . . . . . . . . . 3.5 Riemann sums . . . . . . . . . . . . . . . . . . . . . . . . 3.5.1 Mean-value theorem and Riemann sums . . . . . . 3.5.2 Exact computation by Riemann sums . . . . . . . 3.5.3 Uniform Approximation by Riemann sums . . . . 3.5.4 Cauchy’s theorem . . . . . . . . . . . . . . . . . .

. . . . . . . . . . . . . . . . . . . . . . . . . . . .

. . . . . . . . . . . . . . .

. . . . . . . . . . . . . . . . . . . . . . . . . . . .

. . . . . . . . . . . . . . .

. . . . . . . . . . . . . . . . . . . . . . . . . . . .

. . . . . . . . . . . . . . .

. . . . . . . . . . . . . . . . . . . . . . . . . . . .

. . . . . . . . . . . . . . .

. . . . . . . . . . . . . . . . . . . . . . . . . . . .

. . . . . . . . . . . . . . .

. . . . . . . . . . . . . . . . . . . . . . . . . . . .

. . . . . . . . . . . . . . .

35 35 36 37 38 39 40 41 42 42 43 43 44 45 45 47

. . . . . . . . . . . . . . . . . . . . . . . . . . . .

49 50 50 51 52 53 53 54 55 55 55 56 56 57 57 58 58 58 58 59 59 60 61 63 64 65 66 68 68

CONTENTS 3.5.5 Riemann’s integral . . . . . . . . . . . . . 3.5.6 Robbins’s theorem . . . . . . . . . . . . . 3.5.7 Theorem of G. A. Bliss . . . . . . . . . . . 3.5.8 Pointwise approximation by Riemann sums 3.5.9 Characterization of derivatives . . . . . . . 3.5.10 Unstraddled Riemann sums . . . . . . . . 3.6 Absolute integrability . . . . . . . . . . . . . . . . 3.6.1 Functions of bounded variation . . . . . . 3.6.2 Indefinite integrals and bounded variation . 3.7 Sequences and series of integrals . . . . . . . . . . 3.7.1 The counterexamples . . . . . . . . . . . . 3.7.2 Uniform convergence . . . . . . . . . . . . 3.7.3 Uniform convergence and integrals . . . . 3.7.4 A defect of the calculus integral . . . . . . 3.7.5 Uniform limits of continuous derivatives . . 3.7.6 Uniform limits of discontinuous derivatives 3.8 The monotone convergence theorem . . . . . . . . 3.8.1 Summing inside the integral . . . . . . . . 3.8.2 Monotone convergence theorem . . . . . . 3.9 Integration of power series . . . . . . . . . . . . . 3.10 Applications of the integral . . . . . . . . . . . . . 3.10.1 Area and the method of exhaustion . . . . 3.10.2 Volume . . . . . . . . . . . . . . . . . . . 3.10.3 Length of a curve . . . . . . . . . . . . . . 3.11 Numerical methods . . . . . . . . . . . . . . . . . 3.11.1 Maple methods . . . . . . . . . . . . . . . 3.11.2 Maple and infinite integrals . . . . . . . . 3.12 More Exercises . . . . . . . . . . . . . . . . . . . 4 Beyond the calculus integral 4.1 Countable sets . . . . . . . . . . . . . . . . . . . . 4.1.1 Cantor’s theorem . . . . . . . . . . . . . . 4.2 Derivatives which vanish outside of countable sets . 4.2.1 Calculus integral [countable set version] . . 4.3 Sets of measure zero . . . . . . . . . . . . . . . . 4.3.1 The Cantor dust . . . . . . . . . . . . . . . 4.4 The Devil’s staircase . . . . . . . . . . . . . . . . 4.4.1 Construction of Cantor’s function . . . . . 4.5 Functions with zero variation . . . . . . . . . . . . 4.5.1 Zero variation lemma . . . . . . . . . . . . 4.5.2 Zero derivatives imply zero variation . . . 4.5.3 Continuity and zero variation . . . . . . . . 4.5.4 Lipschitz functions and zero variation . . . 4.5.5 Absolute continuity [variational sense] . . 4.5.6 Absolute continuity [Vitali’s sense] . . . . 4.6 The integral . . . . . . . . . . . . . . . . . . . . .

ix . . . . . . . . . . . . . . . . . . . . . . . . . . . .

. . . . . . . . . . . . . . . .

. . . . . . . . . . . . . . . . . . . . . . . . . . . .

. . . . . . . . . . . . . . . .

. . . . . . . . . . . . . . . . . . . . . . . . . . . .

. . . . . . . . . . . . . . . .

. . . . . . . . . . . . . . . . . . . . . . . . . . . .

. . . . . . . . . . . . . . . .

. . . . . . . . . . . . . . . . . . . . . . . . . . . .

. . . . . . . . . . . . . . . .

. . . . . . . . . . . . . . . . . . . . . . . . . . . .

. . . . . . . . . . . . . . . .

. . . . . . . . . . . . . . . . . . . . . . . . . . . .

. . . . . . . . . . . . . . . .

. . . . . . . . . . . . . . . . . . . . . . . . . . . .

. . . . . . . . . . . . . . . .

. . . . . . . . . . . . . . . . . . . . . . . . . . . .

. . . . . . . . . . . . . . . .

. . . . . . . . . . . . . . . . . . . . . . . . . . . .

. . . . . . . . . . . . . . . .

. . . . . . . . . . . . . . . . . . . . . . . . . . . .

70 71 73 74 76 78 79 80 83 84 84 89 93 94 95 97 97 98 99 99 106 107 110 112 114 117 119 119

. . . . . . . . . . . . . . . .

121 121 122 123 123 125 127 130 130 132 134 134 134 135 135 137 138

CONTENTS 4.6.1 The Lebesgue integral of bounded functions . 4.6.2 The Lebesgue integral in general . . . . . . . 4.6.3 The integral in general . . . . . . . . . . . . 4.6.4 The integral in general (alternative definition) 4.6.5 Infinite integrals . . . . . . . . . . . . . . . 4.7 Approximation by Riemann sums . . . . . . . . . . 4.8 Properties of the integral . . . . . . . . . . . . . . . 4.8.1 Inequalities . . . . . . . . . . . . . . . . . . 4.8.2 Linear combinations . . . . . . . . . . . . . 4.8.3 Subintervals . . . . . . . . . . . . . . . . . . 4.8.4 Integration by parts . . . . . . . . . . . . . . 4.8.5 Change of variable . . . . . . . . . . . . . . 4.8.6 What is the derivative of the definite integral? 4.8.7 Monotone convergence theorem . . . . . . . 4.8.8 Summation of series theorem . . . . . . . . . 4.8.9 Null functions . . . . . . . . . . . . . . . . 4.9 The Henstock-Kurweil integral . . . . . . . . . . . . 4.10 The Riemann integral . . . . . . . . . . . . . . . . . 4.10.1 Constructive definition . . . . . . . . . . . .

1 . . . . . . . . . . . . . . . . . . .

. . . . . . . . . . . . . . . . . . .

. . . . . . . . . . . . . . . . . . .

. . . . . . . . . . . . . . . . . . .

. . . . . . . . . . . . . . . . . . .

. . . . . . . . . . . . . . . . . . .

. . . . . . . . . . . . . . . . . . .

. . . . . . . . . . . . . . . . . . .

. . . . . . . . . . . . . . . . . . .

. . . . . . . . . . . . . . . . . . .

138 139 140 141 141 142 143 143 144 144 144 145 146 146 147 147 148 149 150

5 ANSWERS 153 5.1 Answers to problems . . . . . . . . . . . . . . . . . . . . . . . . . . . 153 Index

288

Chapter 1

What you should know first This chapter begins a review of the differential calculus. We go, perhaps, deeper than the reader has gone before because we need to justify and prove everything we shall do. If your calculus courses so far have left the proofs of certain theorems (most notably the existence of maxima and minima of continuous functions) to a “more advanced course” then this will be, indeed, deeper. If your courses proved such theorems then there is nothing here in Chapters 1–3 that is essentially harder. The text is about the integral calculus. The entire theory of integration can be presented as an attempt to solve the equation dy = f (x) dx for a suitable function y = F(x). Certainly we cannot approach such a problem until we have some considerable expertise in the study of derivatives. So that is where we begin. Well-informed (or smug) students, may skip over this chapter and begin immediately with the integration theory. The indefinite integral starts in Chapter 2. The definite integral continues in Chapter 3. The material in Chapter 4 takes the integration theory, which up to this point has been at an elementary level, to the next stage. We assume the reader knows the rudiments of the calculus and can answer the majority of the exercises in this chapter without much trouble. Later chapters will introduce topics in a very careful order. Here we assume in advance that you know basic facts about functions, limits, continuity, derivatives, sequences and series and need only a careful review.

1.1 What is the calculus about? The calculus is the study of the derivative and the integral. In fact, the integral is so closely related to the derivative that the study of the integral is an essential part of studying derivatives. Thus there is really one topic only: the derivative. Most university courses are divided, however, into the separate topics of Differential Calculus and Integral Calculus, to use the old-fashioned names. Your main objective in studying the calculus is to understand (thoroughly) what the concepts of derivative and integral are and to comprehend the many relations among the concepts. 1

CHAPTER 1. WHAT YOU SHOULD KNOW FIRST

2

It may seem to a typical calculus student that the subject is mostly all about computations and algebraic manipulations. While that may appear to be the main feature of the courses it is, by no means, the main objective. If you can remember yourself as a child learning arithmetic perhaps you can put this in the right perspective. A child’s point of view on the study of arithmetic centers on remembering the numbers, memorizing addition and multiplication tables, and performing feats of mental arithmetic. The goal is actually, though, what some people have called numeracy: familiarity and proficiency in the world of numbers. We all know that the computations themselves can be trivially performed on a calculator and that the mental arithmetic skills of the early grades are not an end in themselves. You should think the same way about your calculus problems. In the end you need to understand what all these ideas mean and what the structure of the subject is. Ultimately you are seeking mathematical literacy, the ability to think in terms of the concepts of the calculus. In your later life you will most certainly not be called upon to differentiate a polynomial or integrate a trigonometric expression (unless you end up as a drudge teaching calculus to others). But, if we are successful in our teaching of the subject, you will able to understand and use many of the concepts of economics, finance, biology, physics, statistics, etc. that are expressible in the language of derivatives and integrals.

1.2 What is an interval? We should really begin with a discussion of the real numbers themselves, but that would add a level of complexity to the text that is not completely necessary. If you need a full treatment of the real numbers see our text [TBB]1 . Make sure especially to understand the use of suprema and infima in working with real numbers. We begin by defining what we mean by those sets of real numbers called intervals. All of the functions of the elementary calculus are defined on intervals or on sets that are unions of intervals. This language, while simple, should be clear. An interval is the collection of all the points on the real line that lie between two given points [the endpoints], or the collection of all points that lie on the right or left side of some point. The endpoints are included for closed intervals and not included for open intervals.

1.2.1 What do open and closed mean? The terminology here, the words open and closed, have a technical meaning in the calculus that the student should most likely learn. Take any real numbers a and b with a < b. We say that (a, b) is an open interval and we say that [a, b] is a closed interval. The interval (a, b) contains only points between a and b; the interval [a, b] contains all those points and in addition contains the two points a and b as well. 1 Thomson,

Bruckner, Bruckner, Elementary Real Analysis, 2nd Edition (2008). The relevant chapters are available for free download at classicalrealanalysis.com .

1.2. WHAT IS AN INTERVAL?

3

Open The notion of an open set is built up by using the idea of an open interval. A set G is said to be open if for every point x ∈ G it is possible to find an open interval (c, d) that contains the point x and is itself contained entirely inside the set G. It is possible to give an ε, δ type of definition for open set. (In this case just the δ is required.) A set G is open if for each point x ∈ G it is possible to find a positive number δ(x) so that (x − δ(x), x + δ(x)) ⊂ G. Closed A set is said to be closed if the complement of that set is open. Specifically, we need to think about the definition of an open set just given. According to that definition, for every point x that is not in a closed set F it is possible to find a positive number δ(x) so that the interval (x − δ(x), x + δ(x))

contains no point in F. This means that points that are not in a closed set F are at some positive distance away from every point that is in F. Certainly there is no point outside of F that is any closer than δ(x).

1.2.2 Open and closed intervals Here is the notation and language: Take any real numbers a and b with a < b. Then the following symbols describe intervals on the real line: • (open bounded interval) (a, b) is the set of all real numbers between (but not including) the points a and b, i.e., all x ∈ R for which a < x < b. • (closed, bounded interval) [a, b] is the set of all real numbers between (and including) the points a and b, i.e., all x ∈ R for which a ≤ x ≤ b. • (half-open bounded interval) [a, b) is the set of all real numbers between (but not including b) the points a and b, i.e., all x ∈ R for which a ≤ x < b. • (half-open bounded interval) (a, b] is the set of all real numbers between (but not including a) the points a and b, i.e., all x ∈ R for which a < x ≤ b. • (open unbounded interval) (a, ∞) is the set of all real numbers greater than (but not including) the point a, i.e., all x ∈ R for which a < x. • (open unbounded interval) (−∞, b) is the set of all real numbers lesser than (but not including) the point b, i.e., all x ∈ R for which x < b. • (closed unbounded interval) [a, ∞) is the set of all real numbers greater than (and including) the point a, i.e., all x ∈ R for which a ≤ x. • (closed unbounded interval) (−∞, b] is the set of all real numbers lesser than (and including) the point b, i.e., all x ∈ R for which x ≤ b. • (the entire real line) (−∞, ∞) is the set of all real numbers. This can be reasonably written as all x for which −∞ < x < ∞.

4

CHAPTER 1. WHAT YOU SHOULD KNOW FIRST

Exercise 1 Do the symbols −∞ and ∞ stand for real numbers? What are they then? Answer Exercise 2 (bounded sets) A general set E is said to be bounded if there is a real Answer number M so that |x| ≤ M for all x ∈ E. Which intervals are bounded? Exercise 3 (open sets) Show that an open interval (a, b) or (a, ∞) or (−∞, b) is an Answer open set. Exercise 4 (closed sets) Show that a closed interval [a, b] or [a, ∞) or (−∞, b] is an Answer closed set. Exercise 5 Show that the intervals [a, b) and (a, b] are neither closed nor open. Answer Exercise 6 (intersection of two open intervals) Is the intersection of two open intervals an open interval? Answer Exercise 7 (intersection of two closed intervals) Is the intersection of two closed intervals a closed interval? Answer Exercise 8 Is the intersection of two unbounded intervals an unbounded interval? Answer Exercise 9 When is the union of two open intervals an open interval?

Answer

Exercise 10 When is the union of two closed intervals an open interval?

Answer

Exercise 11 Is the union of two bounded intervals a bounded set?

Answer

Exercise 12 If I is an open interval and C is a finite set what kind of set might be I \ E? Answer Exercise 13 If I is a closed interval and C is a finite set what kind of set might be I \C? Answer

1.3 Sequences and series We will need the method of sequences and series in our studies of the integral. In this section we present a brief review. By a sequence we mean an infinite list of real numbers s1 , s2 , s3 , s4 , . . . and by a series we mean that we intend to sum the terms in some sequence a1 + a2 + a3 + a4 + . . . . The notation for such a sequence would be {sn } and for such a series ∑∞ k=1 ak .

1.3. SEQUENCES AND SERIES

5

1.3.1 Sequences Convergent sequence A sequence converges to a number L if the terms of the sequence eventually get close to (and remain close to) the number L. Definition 1.1 (convergent sequence) A sequence of real numbers {sn } is said to converge to a real number L if, for every ε > 0, there is an integer N so that L − ε < sn < L + ε

for all integers n ≥ N. In that case we write

lim sn = L.

n→∞

Cauchy sequence A sequence is Cauchy if the terms of the sequence eventually get close together (and remain close together). The two notions of convergent sequence and Cauchy sequence are very intimately related. Definition 1.2 (Cauchy sequence) A sequence of real numbers {sn } is said to be a Cauchy sequence if, for every ε > 0 there is an integer N so that for all pairs of integers n, m ≥ N.

|sn − sm | < ε

Divergent sequence When a sequence fails to be convergent it is said to be divergent. A special case occurs if the sequence does not converge in a very special way: the terms just get too big. Definition 1.3 (divergent sequence) If a sequence fails to converge it is said to diverge. Definition 1.4 (divergent to ∞) A sequence of real numbers {sn } is said to diverge to ∞ if, for every real number M, there is an integer N so that sn > M for all integers n ≥ N. In that case we write lim sn = ∞. n→∞

[We do not say the sequence “converges to ∞.”] Subsequences Given a sequence {sn } and a sequence of integers construct the new sequence

1 ≤ n1 < n2 < n3 < n4 < . . .

{snk } = sn1 , sn2 , sn3 , sn4 , sn5 , . . . .

The new sequence is said to be a subsequence of the original sequence. Studying the convergence behavior of a sequence is sometimes clarified by considering what is happening with subsequences. Bounded sequence A sequence {sn } is said to be bounded if there is a number M so that |sn | ≤ M for all n. It is an important part of the theoretical development to check that convergent sequences are always bounded.

CHAPTER 1. WHAT YOU SHOULD KNOW FIRST

6

1.3.2 Exercises In the exercises you will show that every convergent sequence is a Cauchy sequence and, conversely, that every Cauchy sequence is a convergent sequence. We will also need to review the behavior of monotone sequences and of subsequences. All of the exercises should be looked at as the techniques discussed here are used freely throughout the rest of the material of the text. Boundedness and convergence Exercise 14 Show that every convergent sequence is bounded. Give an example of a Answer bounded sequence that is not convergent. Exercise 15 Show that every convergent sequence is bounded. Give an example of a Answer bounded sequence that is not convergent. Exercise 16 Show that every Cauchy sequence is bounded. Give an example of a bounded sequence that is not Cauchy. Answer Theory of sequence limits Exercise 17 (sequence limits) Suppose that {sn } and {tn } are convergent sequences. 1. What can you say about the sequence xn = asn + btn for real numbers a and b? 2. What can you say about the sequence yn = sn tn ? 3. What can you say about the sequence yn =

sn tn ?

4. What can you say if sn ≤ tn for all n?

Answer

Monotone sequences Exercise 18 A sequence {sn } is said to be nondecreasing [or monotone nondecreasing] if s1 ≤ s2 ≤ s3 ≤ s4 ≤ . . . .

Show that such a sequence is convergent if and only if it is bounded, and in fact that lim sn = sup{sn : n = 1, 2, 3, . . . }.

n→∞

Answer Exercise 19 Show that every sequence {sn } has a subsequence that is monotone, i.e., either monotone nondecreasing sn1 ≤ sn2 ≤ sn3 ≤ sn4 ≤ . . .

or else monotone nonincreasing

sn1 ≥ sn2 ≥ sn3 ≥ sn4 ≥ . . . .

Answer

1.3. SEQUENCES AND SERIES

7

Nested interval argument Exercise 20 (nested interval argument) A sequence {[an , bn ]} of closed, bounded intervals is said to be a nested sequence of intervals shrinking to a point if and

[a1 , b1 ] ⊃ [a2 , b2 ] ⊃ [a3 , b3 ] ⊃ [a4 , b4 ] ⊃ . . . lim (bn − an ) = 0.

n→∞

Show that there is a unique point in all of the intervals.

Answer

Bolzano-Weierstrass property Exercise 21 (Bolzano-Weierstrass property) Show that every bounded sequence has Answer a convergent subsequence. Convergent equals Cauchy Exercise 22 Show that every convergent sequence is Cauchy. [The converse is proved below after we have looked for convergent subsequences.] Answer Exercise 23 Show that every Cauchy sequence is convergent. [The converse was proved Answer earlier.] Closed sets and convergent sequences Exercise 24 Let E be a closed set and {xn } a convergent sequence of points in E. Show that x = limn→∞ xn must also belong to E. Answer

1.3.3 Series The theory of series reduces to the theory of sequence limits by interpreting the sum of the series to be the sequence limit ∞

n

k=1

k=1

lim ∑ ak . ∑ ak = n→∞

Convergent series The formal definition of a convergent series depends on the definition of a convergent sequence. Definition 1.5 (convergent series) A series ∞

∑ ak = a1 + a2 + a3 + a4 + . . . .

k=1

is said to be convergent and to have a sum equal to L if the sequence of partial sums n

Sn =

∑ ak = a1 + a2 + a3 + a4 + · · · + an

k=1

converges to the number L. If a series fails to converge it is said to diverge.

CHAPTER 1. WHAT YOU SHOULD KNOW FIRST

8

Absolutely convergent series A series may converge in a strong sense. We say that a series is absolutely convergent if it is convergent, and moreover the series obtained by replacing all terms with their absolute values is also convergent. The theory of absolutely convergent series is rather more robust than the theory for series that converge, but do not converge absolutely. Definition 1.6 (absolutely convergent series) A series ∞

∑ ak = a1 + a2 + a3 + a4 + . . . .

k=1

is said to be absolutely convergent if both of the sequences of partial sums n

Sn =

∑ ak = a1 + a2 + a3 + a4 + · · · + an

k=1

and

n

Tn =

∑ |ak | = |a1 | + |a2| + |a3| + |a4| + · · · + |an|

k=1

are convergent. Cauchy criterion Exercise 25 Let

n

Sn =

∑ ak = a1 + a2 + a3 + a4 + · · · + an

k=1

be the sequence of partial sums of a series ∞

∑ ak = a1 + a2 + a3 + a4 + . . . .

k=1

Show that Sn is Cauchy if and only if for every ε > 0 there is an integer N so that n a ∑ k < ε k=m for all n ≥ m ≥ N.

Answer

Exercise 26 Let

n

Sn =

∑ ak = a1 + a2 + a3 + a4 + · · · + an

k=1

and

n

Tn =

∑ |ak | = |a1 | + |a2| + |a3| + |a4| + · · · + |an|.

k=1

Show that if {Tn } is a Cauchy sequence then so too is the sequence {Sn }. What can you Answer conclude from this?

1.4 Partitions When working with an interval and functions defined on intervals we shall frequently find that we must subdivide the interval at a finite number of points. For example if

1.4. PARTITIONS

9

[a, b] is a closed, bounded interval then any finite selection of points a = x0 < x1 < x2 < · · · < xn−1 < xn = b

breaks the interval into a collection of subintervals

{[xi−1 , xi ] : i = 1, 2, 3, . . . , n}

that are nonoverlapping and whose union is all of the original interval [a, b]. Most often when we do this we would need to focus attention on certain points chosen from each of the intervals. If ξi is a point in [xi−1 , xi ] then the collection {([xi−1 , xi ], ξi ) : i = 1, 2, 3, . . . , n}

will be called a partition of the interval [a, b]. In sequel we shall see many occasions when splitting up an interval this way is useful. In fact our integration theory for a function f defined on the interval [a, b] can often be expressed by considering the sum n

∑ f (ξk )(xk − xk−1)

k=1

over a partition. This is known as a Riemann sum for f .

1.4.1 Cousin’s partitioning argument The simple lemma we need for many proofs was first formulated by Pierre Cousin. Lemma 1.7 (Cousin) For every point x in a closed, bounded interval [a, b] let there be given a positive number δ(x). Then there must exist at least one partition {([xi−1 , xi ], ξi ) : i = 1, 2, 3, . . . , n} of the interval [a, b] with the property that each interval [xi−1 , xi ] has length smaller than δ(ξi ).

Exercise 27 Show that this lemma is particularly easy if δ(x) = δ is constant for all x in [a, b]. Answer Exercise 28 Prove Cousin’s lemma using a nested interval argument.

Answer

Exercise 29 Prove Cousin’s lemma using a “last point” argument.

Answer

Exercise 30 Use Cousin’s lemma to prove this version of the Heine-Borel theorem: Let C be a collection of open intervals covering a closed, bounded interval [a, b]. Then there is a finite subcollection {(ci , di ) : i = 1, 2, 3, . . . , n} from C that also covers [a, b]. Answer Exercise 31 (connected sets) A set of real numbers E is disconnected if it is possible to find two disjoint open sets G1 and G2 so that both sets contain at least one point of E and together they include all of E. Otherwise a set is connected. Show that the interval Answer [a, b] is connected using a Cousin partitioning argument.

CHAPTER 1. WHAT YOU SHOULD KNOW FIRST

10

Exercise 32 (connected sets) Show that the interval [a, b] is connected using a last Answer point argument.

Exercise 33 Show that a set E that contains at least two points is connected if and only if it is an interval. Answer

1.5 Continuous functions The integral calculus depends on two fundamentally important concepts, that of a continuous function and that of the derivative of a continuous function. We need some expertise in both of these ideas. Most novice calculus students learn much about derivatives, but remain a bit shaky on the subject of continuity.

1.5.1 What is a function? For most calculus students a function is a formula. We use the symbol f :E →R

to indicate a function (whose name is “ f ”) that must be defined at every point x in the set E (E must be, for this course, a subset of R) and to which some real number value f (x) is assigned. The way in which f (x) is assigned need not, of course, be some algebraic formula. Any method of assignment is possible as long as it is clear what is the domain of the function [i.e., the set E] and what is the value [i.e., f (x)] that this function assumes at each point x in E. More important is the concept itself. When we see “Let f : [0, 1] → R be the function defined by f (x) = x2 for all x in the interval [0, 1] . . . ” or just simply “Let g : [0, 1] → R . . . ” we should be equally comfortable. In the former case we know and can compute every value of the function f and we can sketch its graph. In the latter case we are just asked to consider that some function g is under consideration: we know that it has a value g(x) at every point in its domain (i.e., the interval [0, 1]) and we know that it has a graph and we can discuss that function g as freely as we can the function f . Even so calculus students will spend, unfortunately for their future understanding, undue time with formulas. For this remember one rule: if a function is specified by a formula it is also essential to know what is the domain of the function. The convention is usually to specify exactly what the domain intended should be, or else to take the √ largest possible domain that the formula given would permit. Thus f (x) = x does not specify a function until we reveal what the domain of the function should be; since √ f (x) = x (0 ≤ x < ∞) is the best we could do, we would normally claim that the domain is [0, ∞).

1.5. CONTINUOUS FUNCTIONS

11

Exercise 34 In a calculus course what are the assumed domains of the trigonometric functions sin x, cos x, and tan x? Answer Exercise 35 In a calculus course what are the assumed domains of the inverse trigonometric functions arcsin x and arctan x? Answer Exercise 36 In a calculus course what are the assumed domains of the exponential and natural logarithm functions ex and log x? Answer Exercise 37 In a calculus course what might be the assumed domains of the functions given by the formulas p 1 x2 − x − 1, and h(x) = arcsin(x2 − x − 1)? , g(x) = f (x) = 2 (x − x − 1)2 Answer

1.5.2 Uniformly continuous functions Most of the functions that one encounters in the calculus are continuous. Continuity refers to the idea that a function f should have small increments f (d) − f (c) on small intervals [c, d]. That is, however, a horribly imprecise statement of it; what we wish is that the increment f (d)− f (c) should be as small as we please provided that the interval [c, d] is sufficiently small. The interpretation of . . . as small as . . . provided . . . is sufficiently small . . . is invariably expressed in the language of ε, δ, definitions that you will encounter in all of your mathematical studies and which it is essential to master. Nearly everything in this course is expressed in ε, δ language. Continuity is expressed by two closely related notions. We need to make a distinction between the concepts, even though both of them use the same fundamentally simple idea that a function should have small increments on small intervals. Uniformly continuous functions The notion of uniform continuity below is a global condition: it is a condition which holds throughout the whole of some interval. Often we will encounter a more local variant where the continuity condition holds only close to some particular point in the interval where the function is defined. We fix a particular point x0 in the interval and then repeat the definition of uniform continuity but with the extra requirement that it need hold only near the point x0 . Definition 1.8 (uniform continuity) Let f : I → R be a function defined on an interval I. We say that f is uniformly continuous if for every ε > 0 there is a δ > 0 so that | f (d) − f (c)| < ε whenever c, d are points in I for which |d − c| < δ.

The definition can be used with reference to any kind of interval—closed, open, bounded, or unbounded.

CHAPTER 1. WHAT YOU SHOULD KNOW FIRST

12

1.5.3 Pointwise continuous functions The local version of continuity uses the same idea but with the required measure of smallness (the delta δ) adjustable at each point. Definition 1.9 (pointwise continuity) Let f : I → R be a function defined on an open interval I and let x0 be a point in that interval. We say that f is [pointwise] continuous at x0 if for every ε > 0 there is a δ(x0 ) > 0 so that | f (x) − f (x0 )| < ε

whenever x is a point in I for which |x − x0 | < δ(x0 ). We say f is continuous on the open interval I provided f is continuous at each point of I. Note that continuity at a point requires that the function is defined on both sides of the point as well as at the point. Thus we would be very cautious about asserting √ continuity of the function f (x) = x at 0. Uniform continuity on an interval [a, b] does not require that the function is defined on the right of a or the left of b. We are √ comfortable asserting that f (x) = x is uniformly continuous on [0, 1]. (It is.) A comment on the language: For most textbooks the language is simply “continuous on a set” vs. “uniformly continuous on a set” and the word “pointwise” is dropped. For teaching purposes it is important to grasp the distinction between these two definitions; we use here the pointwise/uniform language to emphasize this very important distinction. We will see this same idea and similar language in other places. A sequence of functions can converge pointwise or uniformly. A Riemann sum approximation to an integral can be pointwise or uniform.

1.5.4 Exercises The most important elements of the theory of continuity are these, all verified in the exercises. 1. If f : (a, b) → R is uniformly continuous on (a, b) then f is pointwise continuous at each point of (a, b). 2. If f : (a, b) → R is pointwise continuous at each point of (a, b) then f may or may not be uniformly continuous on (a, b). 3. If two functions f , g : (a, b) → R are pointwise continuous at a point x0 of (a, b) then most combinations of these functions [e.g., sum, linear combination, product, and quotient] are also pointwise continuous at the point x0 . 4. If two functions f , g : (a, b) → R are uniformly continuous on an interval I then most combinations of these functions [e.g., sum, linear combination, product, quotient] are also uniformly continuous on the interval I. Exercise 38 Show that uniform continuity is stronger than pointwise continuity, i.e., show that a function f (x) that is uniformly continuous on an open interval I is necessarily continuous on that interval. Answer

1.5. CONTINUOUS FUNCTIONS

13

Exercise 39 Show that uniform continuity is strictly stronger than pointwise continuity, i.e., show that a function f (x) that is continuous on an open interval I is not necessarily Answer uniformly continuous on that interval. Exercise 40 Construct a function that is defined on the interval (−1, 1) and is continuous only at the point x0 = 0. Answer Exercise 41 Show that the function f (x) = x is uniformly continuous on the interval (−∞, ∞). Answer Exercise 42 Show that the function f (x) = x2 is not uniformly continuous on the interval (−∞, ∞). Answer Exercise 43 Show that the function f (x) = x2 is uniformly continuous on any bounded Answer interval. Exercise 44 Show that the function f (x) = x2 is not uniformly continuous on the interval (−∞, ∞) but is continuous at every real number x0 . Answer Exercise 45 Show that the function f (x) = 1x is not uniformly continuous on the interval (0, ∞) or on the interval (−∞, 0) but is continuous at every real number x0 6= 0. Answer Exercise 46 (linear combinations) Suppose that F and G are functions on an open interval I and that both of them are continuous at a point x0 in that interval. Show that any linear combination H(x) = rF(x) + sG(x) must also be continuous at the point x0 . Does the same statement apply to uniform continuity? Answer

Exercise 47 (products) Suppose that F and G are functions on an open interval I and that both of them are continuous at a point x0 in that interval. Show that the product H(x) = F(x)G(x) must also be continuous at the point x0 . Does the same statement apply to uniform continuity? Answer Exercise 48 (quotients) Suppose that F and G are functions on an open interval I and that both of them are continuous at a point x0 in that interval. Must the quotient H(x) = F(x)/G(x) must also be pointwise continuous at the point x0 . Is there a version for uniform continuity? Answer Exercise 49 (compositions) Suppose that F is a function on an open interval I and that F is continuous at a point x0 in that interval. Suppose that every value of F is contained in an interval J. Now suppose that G is a function on the interval J that is continuous at the point z0 = f (x0 ). Show that the composition function H(x) = G(F(x)) Answer must also be continuous at the point x0 .

CHAPTER 1. WHAT YOU SHOULD KNOW FIRST

14

b5 b4 b3 b2 b1

Figure 1.1: Graph of a step function.

Exercise 50 Show that the absolute value function f (x) = |x| is uniformly continuous on every interval. Exercise 51 Show that the function ( 1 if x is irrational, D(x) = 1 if x = mn in lowest terms, n where m, n are integers expressing the rational number x = mn , is continuous at every irrational number but discontinuous at every rational number. Exercise 52 (Heaviside’s function) Step functions play an important role in integration theory. They offer a crude way of approximating functions. The function 0 if x < 0 H(x) = 1 if x ≥ 0

is a simple step function that assumes just two values, 0 and 1, where 0 is assumed on the interval (−∞, 0) and 1 is assumed on [0, ∞). Find all points of continuity of H. Answer Exercise 53 (step Functions) A function f defined on a bounded interval is a step function if it assumes finitely many values, say b1 , b2 , . . . , bN and for each 1 ≤ i ≤ N the set f −1 (bi ) = {x : f (x) = bi },

which represents the set of points at which f assumes the value bi , is a finite union of intervals and singleton point sets. (See Figure 1.1 for an illustration.) Find all points Answer of continuity of a step function.

Exercise 54 (characteristic function of the rationals) Show that function defined by the formula R(x) = lim lim | cos(m!πx)|n m→∞ n→∞

is discontinuous at every point.

Answer

Exercise 55 (distance of a closed set to a point) Let C be a closed set and define a function by writing d(x,C) = inf{|x − y| : y ∈ C}.

1.5. CONTINUOUS FUNCTIONS

15

This function gives a meaning to the distance between a set C and a point x. If x0 ∈ C, then d(x0 ,C) = 0, and if x0 6∈ C, then d(x0 ,C) > 0. Show that function is continuous at every point. How might you interpret the fact that the distance function is continuous? Answer Exercise 56 (sequence definition of continuity) Prove that a function f defined on an open interval is continuous at a point x0 if and only if limn→∞ f (xn ) = f (x0 ) for every sequence {xn } → x0 . Answer Exercise 57 (mapping definition of continuity) Let f : (a, b) → R be defined on an open interval. Then f is continuous on (a, b) if and only if for every open set V ⊂ R, the set f −1 (V ) = {x ∈ A : f (x) ∈ V } Answer

is open.

1.5.5 Oscillation of a function Continuity of a function f asserts that the increment of f on an interval (c, d), i.e., the value f (d) − f (c), must be small if the interval [c, d] is small. This can often be expressed more conveniently by the oscillation of the function on the interval [c, d]. Definition 1.10 Let f be a function defined on an interval I. We write ω f (I) = sup{| f (x) − f (y)| : x, y ∈ I}

and call this the oscillation of the function f on the interval I. Exercise 58 Establish these properties of the oscillation: 1. ω f ([c, d]) ≤ ω f ([a, b]) if [c, d] ⊂ [a, b]. 2. ω f ([a, c]) ≤ ω f ([a, b]) + ω f ([b, c]) if a 0, there is a δ > 0 so that ω f ([c, d]) < ε whenever [c, d] is a subinterval of I for which |d − c| < δ. [Thus uniformly continuous functions have small increments f (d) − f (c) or equivalently small oscillations ω f ([c, d]) on sufficiently small intervals.] Answer Exercise 60 (uniform continuity and oscillations) Show that f is a uniformly continuous function on a closed, bounded interval [a, b] if and only if, for every ε > 0, there are points a = x0 < x1 < x2 < x3 < · · · < xn−1 < xn = b

CHAPTER 1. WHAT YOU SHOULD KNOW FIRST

16 so that each of

ω f ([x0 , x1 ]), ω f ([x1 , x2 ]), . . . , and ω f ([xn−1 , xn ]) is smaller than ε. (Is there a similar statement for uniform continuity on open intervals?) Answer Exercise 61 (continuity and oscillations) Show that f is continuous at a point x0 in an open interval I if and only if for every ε > 0 there is a δ(x0 ) > 0 so that ω f ([x0 − δ(x0 ), x0 + δ(x0 )]) ≤ ε.

Answer

Exercise 62 (continuity and oscillations) Let f : I → R be a function defined on an open interval I. Show that f is continuous at a point x0 in I if and only if for every ε > 0 there is a δ > 0 so that ω f ([c, d]) < ε whenever [c, d] is a subinterval of I that contains the point x0 and for which |d − c| < δ. Answer Exercise 63 (limits and oscillations) Suppose that f is defined on a bounded open interval (a, b). Show that a necessary and sufficient condition in order that F(a+) = limx→a+ F(x) should exist is that for all ε > 0 there should exist a positive number δ(a) so that ω f ((a, a + δ(a)) < ε. Answer Exercise 64 (infinite limits and oscillations) Suppose that F is defined on (∞, ∞). Show that a necessary and sufficient condition in order that F(∞) = limx→∞ F(x) should exist is that for all ε > 0 there should exist a positive number T so that ω f ((T, ∞)) < ε. Show that the same statement is true for F(−∞) = limx→−∞ F(x) with the requirement that ω f ((−∞, −T )) < ε. Answer

1.5.6 Endpoint limits We are interested in computing, if possible the one-sided limits F(a+) = lim F(x) and F(b−) = lim F(x) x→a+

x→b−

for a function defined on a bounded, open interval (a, b). The definition is a usual ε, δ definition and so far familiar to us since continuity is defined the same way. That means there is a close connection between these limits and continuity.

1.5. CONTINUOUS FUNCTIONS

17

Definition 1.11 Let F : (a, b) → R. Then the one-sided limits F(a+) = lim F(x) x→a+

exists if, for every ε > 0 there is a δ > 0 so that whenever 0 < x − a < δ.

|F(a+) − F(x)| < ε

The other one-sided limit F(b−) is defined similarly. Two-sided limits are defined by requiring that both one-sided limits exist. Thus, if f is defined on both sides at a point x0 we write L = lim F(x) x→x0

if both exist and are equal.

L = F(x0 +) = F(x0 −)

Fundamental theorem for uniformly continuous functions This is an important fundamental theorem for the elementary calculus. How can we be assured that a function defined on a bounded open interval (a, b) is uniformly continuous? Check merely that it is pointwise continuous on (a, b) and that the one-sided limits at the endpoints exist. Similarly, how can we be assured that a function defined on a bounded closed interval [a, b] is uniformly continuous? Check merely that it is pointwise continuous on (a, b) and that the one-sided limits at the endpoints exist and agree with the values f (a) and f (b). Theorem 1.12 (endpoint limits) Let F : (a, b) → R be a function that is continuous on the bounded, open interval (a, b). Then the two limits F(a+) = lim F(x) and F(b−) = lim F(x) x→a+

x→b−

exist if and only if F is uniformly continuous on (a, b). This theorem should be attributed to Cauchy but cannot be, for he failed to notice the difference between the two concepts of pointwise and uniform continuity and simply took it for granted that they were equivalent. Corollary 1.13 (extension property) Let F : (a, b) → R be a function that is continuous on the bounded, open interval (a, b). Then F can be extended to a uniformly continuous function on all of the closed, bounded interval [a, b] if and only if F is uniformly continuous on (a, b). That extension is obtained by defining F(a) = F(a+) = lim F(x) and F(b) = F(b−) = lim F(x) x→a+

x→b−

both of which limits exist if F is uniformly continuous on (a, b). Corollary 1.14 (subinterval property) Let F : (a, b) → R be a function that is continuous on the bounded, open interval (a, b). Then F is uniformly continuous on every closed, bounded subinterval [c, d] ⊂ (a, b), but may or may not be a uniformly continuous function on all of (a, b).

18

CHAPTER 1. WHAT YOU SHOULD KNOW FIRST

Corollary 1.15 (monotone property) Let F : (a, b) → R be a function that is continuous on the bounded, open interval (a, b) and is either monotone nondecreasing or monotone nonincreasing. Then F is uniformly continuous on (a, b) if and only if F is bounded on (a, b). Exercise 65 Prove one direction of the endpoint limit theorem [Theorem 1.12]: Show that if F is uniformly continuous on (a, b) then the two limits F(a+) = lim F(x) and F(b−) = lim F(x) x→a+

x→b−

exist.

Answer

Exercise 66 Prove the other direction of the endpoint limit theorem [Theorem 1.12] using Exercise 63 and a Cousin partitioning argument: Suppose that F : (a, b) → R is continuous on the bounded, open interval (a, b) and that the two limits F(a+) = lim F(x) and F(b−) = lim F(x) x→a+

x→b−

exist. Show that F is uniformly continuous on (a, b).

Answer

Exercise 67 Prove the extension property [Corollary 1.13].

Answer

Exercise 68 Prove the subinterval property [Corollary 1.14].

Answer

Exercise 69 Prove the monotone property [Corollary 1.15].

Answer

Exercise 70 Prove the other direction of the endpoint limit theorem using a BolzanoWeierstrass compactness argument: Suppose that F : (a, b) → R is continuous on the bounded, open interval (a, b) and that the two limits F(a+) = lim F(x) and F(b−) = lim F(x) x→a+

x→b−

exist. Show that F is uniformly continuous on (a, b).

Answer

Exercise 71 Prove the other direction of the endpoint limit theorem using a HeineBorel argument: Suppose that F : (a, b) → R is continuous on the bounded, open interval (a, b) and that the two limits F(a+) = lim F(x) and F(b−) = lim F(x) x→a+

exist. Show that F is uniformly continuous on (a, b).

x→b−

Answer

Exercise 72 Show that the theorem fails if we drop the requirement that the interval is Answer bounded. Exercise 73 Show that the theorem fails if we drop the requirement that the interval is Answer closed. Exercise 74 Criticize this proof of the false theorem that if f is continuous on an interval (a, b) then f must be uniformly continuous on (a, b).

1.5. CONTINUOUS FUNCTIONS

19

Suppose if f is continuous on (a, b). Let ε > 0 and for any x0 in (a, b) choose a δ > 0 so that | f (x) − f (x0 )| < ε if |x − x0 | < δ. Then if c and d are any points that satisfy |c − d| < δ just set c = x and d = x0 to get | f (d) − f (c)| < ε. Thus f must be uniformly continuous on (a, b). Answer Exercise 75 Suppose that G : (a, b) → R is continuous at every point of an open interval (a, b). Then show that G is uniformly continuous on every closed, bounded subinterval [c, d] ⊂ (a, b). Answer Exercise 76 Show that, if F : (a, b) → R is a function that is continuous on the bounded, open interval (a, b) but not uniformly continuous, then one of the two limits F(a+) = lim F(x) or F(b−) = lim F(x) x→a+

x→b−

must fail to exist.

Answer

Exercise 77 Show that, if F : (a, b) → R is a function that is continuous on the bounded, open interval (a, b) and both of the two limits F(a+) = lim F(x) and F(b−) = lim F(x) x→a+

x→b−

exist then F is in fact uniformly continuous on (a, b).

Answer

Exercise 78 Suppose that F : (a, b) → R is a function defined on an open interval (a, b) and that c is a point in that interval. Show that F is continuous at c if and only if both of the two one-sided limits F(c+) = lim F(x) and F(c−) = lim F(x) x→c+

x→c−

exist and F(c) = F(c+) = F(c−).

Answer

1.5.7 Boundedness properties Continuity has boundedness implications. Pointwise continuity supplies local boundedness; uniform continuity supplies global boundedness, but only on bounded intervals. Definition 1.16 (bounded function) Let f : I → R be a function defined on an interval I. We say that f is bounded on I if there is a number M so that for all x in the interval I.

| f (x)| ≤ M

Definition 1.17 (locally bounded function) A function f defined on an interval I is said to be locally bounded at a point x0 if there is a δ(x0 ) > 0 so that f is bounded on the set (x0 − δ(x0 ), x0 + δ(x0 )) ∩ I. Theorem 1.18 Let f : I → R be a function defined on a bounded interval I and suppose that f is uniformly continuous on I. Then f is a bounded function on I.

20

CHAPTER 1. WHAT YOU SHOULD KNOW FIRST

Theorem 1.19 Let f : I → R be a function defined on an open interval I and suppose that f is continuous at a point x0 in I. Then f is locally bounded at x0 . Remember that, if f is continuous on an open interval (a, b), then f is uniformly continuous on each closed subinterval [c, d] ⊂ (a, b). Thus, in order for f to be unbounded on (a, b) the large values are occurring only at the endpoints. Let us say that f is locally bounded on the right at a if there is at least one interval (a, a + δa ) on which f is bounded. Similarly we can define locally bounded on the left at b. This corollary is then immediate. Corollary 1.20 Let f : (a, b) → R be a function defined on an open interval (a, b). Suppose that 1. f is continuous at every point in (a, b). 2. f is locally bounded on the right at a. 3. f is locally bounded on the left at b. Then f is bounded on the interval (a, b). Exercise 79 Use Exercise 60 to prove Theorem 1.18.

Answer

Exercise 80 Prove Theorem 1.19 by proving that all continuous functions are locally Answer bounded.

Exercise 81 It follows from Theorem 1.18 that a continuous, unbounded function on a bounded open interval (a, b) cannot be uniformly continuous. Can you prove that a continuous, bounded function on a bounded open interval (a, b) must be uniformly continuous? Answer Exercise 82 Show that f is not bounded on an interval I if and only if there must exist Answer a sequence of points {xn } for which f |(xn )| → ∞. Exercise 83 Using Exercise 82 and the Bolzano-Weierstrass argument, show that if a function f is locally bounded at each point of a closed, bounded interval [a, b] then f must be bounded on [a, b]. Exercise 84 Using Cousin’s lemma, show that if a function f is locally bounded at each point of a closed, bounded interval [a, b] then f must be bounded on [a, b]. Exercise 85 If a function is uniformly continuous on an unbounded interval must the Answer function be unbounded? Could it be bounded? Exercise 86 Suppose f , g : I → R are two bounded functions on I. Is the sum function f + g necessarily bounded on I? Is the product function f g necessarily bounded on Answer I?

1.6. EXISTENCE OF MAXIMUM AND MINIMUM

21

Exercise 87 Suppose f , g : I → R are two bounded functions on I and suppose that the function g does not assume the value zero. Is the quotient function f /g necessarily Answer bounded on I? Exercise 88 Suppose f , g : R → R are two bounded functions. Is the composite function h(x) = f (g(x)) necessarily bounded? Answer Exercise 89 Show that the function f (x) = sin x is uniformly continuous on the interval Answer (−∞, ∞). Exercise 90 A function defined on an interval I is said to satisfy a Lipschitz condition there if there is a number M with the property that |F(x) − F(y)| ≤ M|x − y|

for all x, y ∈ I. Show that a function that satisfies a Lipschitz condition on an interval Answer is uniformly continuous on that interval. Exercise 91 Show that f is not uniformly continuous on an interval I if and only if there must exist two sequences of points {xn } and {xn } from that interval for which xn − yn → 0 but f (xn ) − f (yn ) does not converge to zero. Answer

1.6 Existence of maximum and minimum Uniformly continuous function are bounded on bounded intervals. Must they have a maximum and a minimum value? We know that continuous functions need not be bounded so our focus will be on uniformly continuous functions on closed, bounded intervals. Theorem 1.21 Let F : [a, b] → R be a function defined on a closed, bounded interval [a, b] and suppose that F is uniformly continuous on [a, b]. Then F attains both a maximum value and a minimum value in that interval. Exercise 92 Prove Theorem 1.21 using a least upper bound argument.

Answer

Exercise 93 Prove Theorem 1.21 using a Bolzano-Weierstrass argument.

Answer

Exercise 94 Give an example of a uniformly continuous function on the interval (0, 1) Answer that attains a maximum but does not attain a minimum. Exercise 95 Give an example of a uniformly continuous function on the interval (0, 1) that attains a minimum but does not attain a maximum. Answer Exercise 96 Give an example of a uniformly continuous function on the interval (0, 1) that attains neither a minimum nor a maximum. Answer

22

CHAPTER 1. WHAT YOU SHOULD KNOW FIRST

Exercise 97 Give an example of a uniformly continuous function on the interval (−∞, ∞) Answer that attains neither a minimum nor a maximum. Exercise 98 Give an example of a uniformly continuous, bounded function on the interval (−∞, ∞) that attains neither a minimum nor a maximum. Answer Exercise 99 Let f : R → R be an everywhere continuous function with the property that lim f (x) = lim f (x) = 0. x→∞

x→−∞

Show that f has either an absolute maximum or an absolute minimum but not necessarily both. Answer Exercise 100 Let f : R → R be an everywhere continuous function that is periodic in the sense that for some number p, f (x + p) = f (x) for all x ∈ R. Show that f has an Answer absolute maximum and an absolute minimum.

1.6.1 The Darboux property of continuous functions We define the Darboux property of a function and show that all continuous functions have this property. Definition 1.22 (Darboux Property) Let f be defined on an interval I. Suppose that for each a, b ∈ I with f (a) 6= f (b), and for each d between f (a) and f (b), there exists c between a and b for which f (c) = d. We then say that f has the Darboux property [intermediate value property] on I. Functions with this property are called Darboux functions after Jean Gaston Darboux (1842–1917), who showed in 1875 that for every differentiable function F on an interval I, the derivative F ′ has the intermediate value property on I. Theorem 1.23 (Darboux property of continuous functions) Let f : (a, b) → R be a continuous function on an open interval (a, b). Then f has the Darboux property on that interval. Exercise 101 Prove Theorem 1.23 using a Cousin covering argument.

Answer

Exercise 102 Prove Theorem 1.23 using a Bolzano-Weierstrass argument. Answer

Exercise 103 Prove Theorem 1.23 using the Heine-Borel property.

Answer

Exercise 104 Prove Theorem 1.23 using the least upper bound property.

Answer

1.7. DERIVATIVES

23

Exercise 105 Suppose that f : (a, b) → R is a continuous function on an open interval (a, b). Show that f maps (a, b) onto an interval. Show that this interval need not be Answer open, need not be closed, and need not be bounded. Exercise 106 Suppose that f : [a, b] → R is a uniformly continuous function on a closed, bounded interval [a, b]. Show that f maps [a, b] onto an interval. Show that this interval must be closed and bounded. Answer Exercise 107 Define the function F(x) =

sin x−1 if x 6= 0 0 if x = 0.

Show that F has the Darboux property on every interval but that F is not continuous on every interval. Show, too, that F assumes every value in the interval [−1, 1] infinitely Answer often. Exercise 108 (fixed points) A function f : [a, b] → [a, b] is said to have a fixed point c ∈ [a, b] if f (c) = c. Show that every uniformly continuous function f mapping [a, b] Answer into itself has at least one fixed point. Exercise 109 (fixed points) Let f : [a, b] → [a, b] be continuous. Define a sequence recursively by z1 = x1 , z2 = f (z1 ), . . . , zn = f (zn−1 ) where x1 ∈ [a, b]. Show that if the sequence {zn } is convergent, then it must converge to a fixed point of f . Answer Exercise 110 Is there a continuous function f : I → R defined on an interval I such that for every real y there are precisely either zero or two solutions to the equation Answer f (x) = y? Exercise 111 Is there a continuous function f : R → R such that for every real y there Answer are precisely either zero or three solutions to the equation f (x) = y? Exercise 112 Suppose that the function f : R → R is monotone nondecreasing and has Answer the Darboux property. Show that f must be continuous at every point.

1.7 Derivatives A derivative2 of a function is another function “derived” from the first function by a procedure (which we do not have to review here): F(x) − F(x0 ) . F ′ (x0 ) = lim x→x0 x − x0 Thus, for example, we remember that, if F(x) = x2 + x + 1 then the derived function is F ′ (x) = 2x + 1. 2 The

word derivative in mathematics almost always refers to this concept. In finance, you might have noticed, derivatives are financial instrument whose values are “derived” from some underlying security. Observe that the use of the word “derived” is the same.

24

CHAPTER 1. WHAT YOU SHOULD KNOW FIRST

The values of the derived function, 2x + 1, represent (geometrically) the slope of the tangent line at the points (x, x2 + x+ 1) that are on the graph of the function F. There are numerous other interpretations (other than the geometric) for the values of the derivative function. Recall the usual notations for derivatives: d sin x = cos x. dx F(x) = sin x, F ′ (x) = cos x. dy = cos x. y = sin x, dx The connection between a function and its derivative is straightforward: the values of the function F(x) are used, along with a limiting process, to determine the values of the derivative function F ′ (x). That’s the definition. We need to know the definition to understand what the derivative signifies, but we do not revert to the definition for computations except very rarely. The following facts should be familiar: • A function may or may not have a derivative at a point. • In order for a function f to have a derivative at a point x0 the function must be defined at least in some open interval that contains that point. • A function that has a derivative at a point x0 is said to be differentiable at x0 . If it fails to have a derivative there then it is said to be nondifferentiable at that point. • There are many calculus tables that can be consulted for derivatives of functions for which familiar formulas are given. • There are many rules for computation of derivatives for functions that do not appear in the tables explicitly, but for which the tables are nonetheless useful after some further manipulation. • Information about the derivative function offers deep insight into the nature of the function itself. For example a zero derivative means the function is constant; a nonnegative derivative means the function is increasing. A change in the derivative from positive to negative indicates that a local maximum point in the function was reached. Exercise 113 (ε, δ(x) version of derivative) Suppose that F is a differentiable function on an open interval I. Show that for every x ∈ I and every ε > 0 there is a δ(x) > 0 so that F(y) − F(x) − F ′ (x)(y − x) ≤ ε|y − x| whenever y and x are points in I for which |y − x| < δ(x).

Answer

Exercise 114 (differentiable implies continuous) Prove that a function that has a derivative at a point x0 must also be continuous at that point. Answer

1.8. DIFFERENTIATION RULES

25

Exercise 115 (ε, δ(x) straddled version of derivative) Suppose that F is a differentiable function on an interval I. Show that for every x ∈ I and every ε > 0 there is a δ(x) > 0 so that F(z) − F(y) − F ′ (x)(z − y) ≤ ε|z − y| whenever y and z are points in I for which |y−z| < δ(x) and either y ≤ x ≤ z or z ≤ x ≤ y. Answer

Exercise 116 (ε, δ(x) unstraddled version of derivative) Suppose that F is a differentiable function on an open interval I. Suppose that for every x ∈ I and every ε > 0 there is a δ(x) > 0 so that F(z) − F(y) − F ′ (x)(z − y) ≤ ε|z − y| whenever y and z are points in I for which |y − z| < δ(x) [and we do not require either y ≤ x ≤ z or z ≤ x ≤ y]. Show that not all differentiable functions would have this Answer property but that if F ′ is continuous then this property does hold.

Exercise 117 (locally strictly increasing functions) Suppose that F is a function on an open interval I. Then F is said to be locally strictly increasing at a point x0 in the interval if there is a δ > 0 so that F(y) < F(x0 ) < F(z) for all F ′ (x0 ) >

x0 − δ < y < x0 < z < x0 + δ.

Show that, if 0, then F must be locally strictly increasing at x0 . Show that the converse does not quite hold: if F is differentiable at a point x0 in the interval and is also locally strictly increasing at x0 , then necessarily F ′ (x0 ) ≥ 0 but that F ′ (x0 ) = 0 is Answer possible.

Exercise 118 Suppose that a function F is locally strictly increasing at every point of an open interval (a, b). Use the Cousin partitioning argument to show that F is strictly increasing on (a, b). [In particular, notice that this means that a function with a positive derivative is increasing. This is usually proved using the mean-value theorem that is stated in Section 1.9 below.] Answer

1.8 Differentiation rules We remind the reader of the usual calculus formulas by presenting the following slogans. Of course each should be given a precise statement and the proper assumptions clearly made. Constant rule: If f (x) is constant, then f ′ = 0.

CHAPTER 1. WHAT YOU SHOULD KNOW FIRST

26

Linear combination rule: (r f + sg)′ = r f ′ + sg′ for functions f and g and all real numbers r and s. Product rule: ( f g)′ = f ′ g + f g′ for functions f and g. Quotient rule:

′ f ′ g − f g′ f = g g2 for functions f and g at points where g does not vanish.

Chain rule: If f (x) = h(g(x)), then f ′ (x) = h′ (g(x)) · g′ (x).

1.9 Mean-value theorem There is a close connection between the values of a function and the values of its derivative. In one direction this is trivial since the derivative is defined in terms of the values of the function. The other direction is more subtle. How does information about the derivative provide us with information about the function? One of the keys to providing that information is the mean-value theorem. The usual proof presented in calculus texts requires proving a weak version of the mean-value theorem first (Rolle’s theorem) and then using that to prove the full version.

1.9.1 Rolle’s theorem Theorem 1.24 (Rolle’s Theorem) Let f be uniformly continuous on [a, b] and differentiable on (a, b). If f (a) = f (b) then there must exist at least one point ξ in (a, b) such that f ′ (ξ) = 0. Exercise 119 Prove the theorem.

Answer

Exercise 120 Interpret the theorem geometrically.

Answer

Exercise 121 Can we claim that the point ξ whose existence is claimed by the theorem, Answer is unique?. How many points might there be? Exercise 122 Define a function f (x) = x sin x−1 , f (0) = 0, on the whole real line. Can Answer Rolle’s theorem be applied on the interval [0, 1/π]? √ Exercise 123 Is it possible to apply Rolle’s theorem to the function f (x) = 1 − x2 on Answer [−1, 1]. p Exercise 124 Is it possible to apply Rolle’s theorem to the function f (x) = |x| on [−1, 1]. Answer

1.9. MEAN-VALUE THEOREM

27

Exercise 125 Use Rolle’s theorem to explain why the cubic equation x3 + αx2 + β = 0 cannot have more than one solution whenever α > 0.

Answer

Exercise 126 If the nth-degree equation p(x) = a0 + a1 x + a2 x2 + · · · + an xn = 0

has n distinct real roots, then how many distinct real roots does the (n − 1)st degree Answer equation p′ (x) = 0 have? Exercise 127 Suppose that f ′ (x) > c > 0 for all x ∈ [0, ∞). Show that limx→∞ f (x) = ∞. Answer Exercise 128 Suppose that f : R → R and both f ′ and f ′′ exist everywhere. Show that if f has three zeros, then there must be some point ξ so that f ′′ (ξ) = 0. Answer Exercise 129 Let f be continuous on an interval [a, b] and differentiable on (a, b) with a derivative that never is zero. Show that f maps [a, b] one-to-one onto some other Answer interval. Exercise 130 Let f be continuous on an interval [a, b] and twice differentiable on (a, b) with a second derivative that never is zero. Show that f maps [a, b] two-one onto some other interval; that is, there are at most two points in [a, b] mapping into any one value Answer in the range of f .

1.9.2 Mean-Value theorem If we drop the requirement in Rolle’s theorem that f (a) = f (b), we now obtain the result that there is a point c ∈ (a, b) such that f (b) − f (a) f ′ (c) = . b−a Geometrically, this states that there exists a point c ∈ (a, b) for which the tangent to the graph of the function at (c, f (c)) is parallel to the chord determined by the points (a, f (a)) and (b, f (b)). (See Figure 1.2.) This is the mean-value theorem, also known as the law of the mean or the first mean-value theorem (because there are other mean-value theorems). Theorem 1.25 (Mean-Value Theorem) Suppose that f is a continuous function on the closed interval [a,b] and differentiable on (a,b) . Then there exists a point ξ ∈ (a, b) such that f (b) − f (a) . f ′ (ξ) = b−a Exercise 131 Prove the theorem.

Answer

CHAPTER 1. WHAT YOU SHOULD KNOW FIRST

28

a

c

b

Figure 1.2: Mean value theorem [ f ′ (c) is slope of the chord].

Exercise 132 Suppose f satisfies the hypotheses of the mean-value theorem on [a,b]. Let S be the set of all slopes of chords determined by pairs of points on the graph of f and let D = { f ′ (x) : x ∈ (a, b)}. 1. Prove that S ⊂ D. 2. Give an example to show that D can contain numbers not in S. Answer Exercise 133 Interpreting the slope of a chord as an average rate of change and the derivative as an instantaneous rate of change, what does the mean-value theorem say? If a car travels 100 miles in 2 hours, and the position s(t) of the car at time t, measured in hours satisfies the hypotheses of the mean-value theorem, can we be sure that there Answer is at least one instant at which the velocity is 50 mph? Exercise 134 Give an example to show that the conclusion of the mean-value theorem can fail if we drop the requirement that f be differentiable at every point in (a,b) . Answer Exercise 135 Give an example to show that the conclusion of the mean-value theorem can fail if we drop the requirement of continuity at the endpoints of the interval. Answer Exercise 136 Suppose that f is differentiable on [0, ∞) and that lim f ′ (x) = C.

x→∞

Determine lim [ f (x + a) − f (x)].

x→∞

Answer Exercise 137 Suppose that f is continuous on [a, b] and differentiable on (a, b). If lim f ′ (x) = C

x→a+

what can you conclude about the right-hand derivative of f at a?

Answer

1.9. MEAN-VALUE THEOREM

29

Exercise 138 Suppose that f is continuous and that lim f ′ (x)

x→x0

exists. What can you conclude about the differentiability of f ? What can you conclude about the continuity of f ′ ? Answer Exercise 139 Let f : [0, ∞) → R so that f ′ is decreasing and positive. Show that the series ∞

∑ f ′ (i)

i=1

is convergent if and only if f is bounded.

Answer

Exercise 140 Prove this second-order version of the mean-value theorem. Theorem 1.26 (Second order mean-value theorem) Let f be continuous on [a,b] and twice differentiable on (a,b) . Then there exists c ∈ (a, b) such that f ′′ (c) f (b) = f (a) + (b − a) f ′ (a) + (b − a)2 . 2! Answer

Exercise 141 Determine all functions f : R → R that have the property that f (x) − f (y) ′ x+y f = 2 x−y for every x 6= y. Answer Exercise 142 A function is said to be smooth at a point x if f (x + h) + f (x − h) − 2 f (x) = 0. lim h→0 h2 Show that a smooth function need not be continuous. Show that if f ′′ is continuous at Answer x, then f is smooth at x.

Exercise 143 Prove this version of the mean-value theorem due to Cauchy. Theorem 1.27 (Cauchy mean-value theorem) Let f and g be uniformly continuous on [a, b] and differentiable on (a, b). Then there exists ξ ∈ (a, b) such that [ f (b) − f (a)]g′ (ξ) = [g(b) − g(a)] f ′ (ξ).

(1.1) Answer

Exercise 144 Interpret the Cauchy mean-value theorem geometrically.

Answer

CHAPTER 1. WHAT YOU SHOULD KNOW FIRST

30

Exercise 145 Use Cauchy’s mean-value theorem to prove any simple version of L’Hôpital’s Answer rule that you can remember from calculus. Exercise 146 Show that the conclusion of Cauchy’s mean-value can be put into determinant form as f (a) g(a) 1 f (b) g(b) 1 = 0. f ′ (c) g′ (c) 0 Answer

Exercise 147 Formulate and prove a generalized version of Cauchy’s mean-value whose conclusion is the existence of a point c such that f (a) g(a) h(a) f (b) g(b) h(b) = 0. f ′ (c) g′ (c) h′ (c) Answer

Exercise 148 Suppose that f : [a, c] → R is uniformly continuous and that it has a derivative f ′ (x) that is monotone increasing on the interval (a, c). Show that for any a < b < c.

(b − a) f (c) + (c − b) f (a) ≥ (c − a) f (b)

Answer

Exercise 149 (avoiding the mean-value theorem) The primary use [but not the only use] of the mean-value theorem in a calculus class is to establish that a function with a positive derivative on an open interval (a, b) would have to be increasing. Prove this Answer directly without the easy mean-value proof.

Exercise 150 Prove the “converse” to the mean-value theorem: Let F, f : [a, b] → R and suppose that f is continuous there. Suppose that for every pair of points a < x < y < b there is a point x < ξ < y so that F(y) − F(x) = f (ξ). y−x Then F is differentiable on (a, b) and f is its derivative. Answer Exercise 151 Let f : [a, b] → R be a uniformly continuous function that is differentiable at all points of the interval (a, b) with possibly finitely many exceptions. Show that there is a point a < ξ < b so that f (b) − f (a) ′ b − a ≤ | f (ξ)|. Answer

1.9. MEAN-VALUE THEOREM

31

Exercise 152 (Flett’s theorem) Given a function differentiable at every point of an interval [a, b] and with f ′ (a) = f ′ (b), show that there is a point ξ in the interval for which f (ξ) − f (a) = f (ξ). ξ−a Answer

1.9.3 The Darboux property of the derivative We have proved that all continuous functions have the Darboux property. We now prove that all derivatives have the Darboux property. This was proved by Darboux in 1875; one of the conclusions he intended was that there must be an abundance of functions that have the Darboux property and are yet not continuous, since all derivatives have this property and not all derivatives are continuous. Theorem 1.28 (Darboux property of the derivative) Let F be differentiable on an open interval I. Suppose a, b ∈ I, a < b, and F ′ (a) 6= F ′ (b). Let γ be any number between F ′ (a) and F ′ (b). Then there must exist a point ξ ∈ (a, b) such that F ′ (ξ) = γ. Exercise 153 Compare Rolle’s theorem to Darboux’s theorem. Suppose G is everywhere differentiable, that a < b and G(a) = G(b). Then Rolle’s theorem asserts the existence of a point ξ in the open interval (a, b) for which G′ (ξ) = 0. Give a proof of the same thing if the hypothesis G(a) = G(b) is replaced by G′ (a) < 0 < G′ (b) or Answer G′ (b) < 0 < G′ (a). Use that to prove Theorem 1.28. Exercise 154 Let F : R → R be a differentiable function. Show that F ′ is continuous if and only if the set Eα = {x : F ′ (x) = α} is closed for each real number α.

Answer

Exercise 155 A function defined on an interval is piecewise monotone if the interval can be subdivided into a finite number of subintervals on each of which the function is nondecreasing or nonincreasing. Show that every polynomial is piecewise monotone. Answer

1.9.4 Vanishing derivatives and constant functions When the derivative is zero we sometimes use colorful language by saying that the derivative vanishes! When the derivative of a function vanishes we expect the function to be constant. But how is that really proved? Theorem 1.29 (vanishing derivatives) Let F : [a, b] → R be uniformly continuous on the closed, bounded interval [a, b] and suppose that F ′ (x) = 0 for every a < x < b. Then F is a constant function on [a, b].

32

CHAPTER 1. WHAT YOU SHOULD KNOW FIRST

Corollary 1.30 Let F : (a, b) → R and suppose that F ′ (x) = 0 for every a < x < b. Then F is a constant function on (a, b). Exercise 156 Prove the theorem using the mean-value theorem.

Answer

Exercise 157 Prove the theorem without using the mean-value theorem.

Answer

Exercise 158 Deduce the corollary from the theorem.

Answer

1.9.5 Vanishing derivatives with exceptional sets When a function has a vanishing derivative then that function must be constant. What if there is a small set of points at which we are unable to determine that the derivative is zero? Theorem 1.31 (vanishing derivatives with a few exceptions) Let F : [a, b] → R be uniformly continuous on the closed, bounded interval [a, b] and suppose that F ′ (x) = 0 for every a < x < b with finitely many possible exceptions. Then F is a constant function on [a, b].

Corollary 1.32 Let F : (a, b) → R be continuous on the open interval (a, b) and suppose that F ′ (x) = 0 for every a < x < b with finitely many possible exceptions. Then F is a constant function on (a, b). Exercise 159 Prove the theorem by subdividing the interval at the exceptional points. Answer Exercise 160 Prove the theorem by applying Exercise 151. Exercise 161 Prove the corollary.

Answer

Exercise 162 Let F, G : [a, b] → R be uniformly continuous functions on the closed, bounded interval [a, b] and suppose that F ′ (x) = f (x) for every a < x < b with finitely many possible exceptions, and that G′ (x) = f (x) for every a < x < b with finitely many possible exceptions. Show that F and G differ by a constant [a, b]. Answer Exercise 163 Construct a non-constant function which has a zero derivative at all but Answer finitely many points. Exercise 164 Prove the following major improvement of Theorem 1.31. Here, by many exceptions, we include the possibility of infinitely many exceptions provided, only, that it is possible to arrange the exceptional points into a sequence. Theorem 1.33 (vanishing derivatives with many exceptions) Let F : [a, b] → R be uniformly continuous on the closed, bounded interval [a, b] and suppose that F ′ (x) = 0 for every a < x < b with the possible exception of the points c1 , c2 , c3 , . . . forming an infinite sequence. Show that F is a constant function on [a, b].

1.10. LIPSCHITZ FUNCTIONS

33

[The argument that was successful for Theorem 1.31 will not work for infinitely many Answer exceptional points. A Cousin partitioning argument does work.] Exercise 165 Suppose that F is a function continuous at every point of the real line and such that F ′ (x) = 0 for every x that is irrational. Show that F is constant. Answer Exercise 166 Suppose that G is a function continuous at every point of the real line and such that G′ (x) = x for every x that is irrational. What functions G have such a property? Answer Exercise 167 Let F, G : [a, b] → R be uniformly continuous functions on the closed, bounded interval [a, b] and suppose that F ′ (x) = f (x) for every a < x < b with the possible exception of points in a sequence {c1 , c2 , c3 , . . . }, and that G′ (x) = f (x) for every a < x < b with the possible exception of points in a sequence {d1 , d2 , d3 , . . . }. Answer Show that F and G differ by a constant [a, b].

1.10 Lipschitz functions A function satisfies a Lipschitz condition if there is some limitation on the possible slopes of secant lines, lines joining points (x, f (x)) and (y, f (x). Since the slope of such a line would be f (y) − f (x) y−x any bounds put on this fraction is called a Lipschitz condition. Definition 1.34 A function f is said to satisfy a Lipschitz condition on an interval I if | f (x) − f (y)| ≤ M|x − y| for all x, y in the interval.

Functions that satisfy such a condition are called Lipschitz functions and play a key role in many parts of analysis. Exercise 168 Show that a function that satisfies a Lipschitz condition on an interval must be uniformly continuous on that interval. Exercise 169 Show that if f is assumed to be continuous on [a, b] and differentiable on (a, b) then f is a Lipschitz function if and only if the derivative f ′ is bounded on (a, b). Answer √ Exercise 170 Show that the function f (x) = x is uniformly continuous on the interval [0, ∞) but that it does not satisfy a Lipschitz condition on that interval. Answer Exercise 171 A function F on an interval I is said to have bounded derived numbers if there is a number M so that, for each x ∈ I one can choose δ > 0 so that F(x + h) − F(x) ≤M h

34

CHAPTER 1. WHAT YOU SHOULD KNOW FIRST

whenever x + h ∈ I and |h| < δ. Using a Cousin partitioning argument, show that F is Answer Lipschitz if and only if F has bounded derived numbers. Exercise 172 Is a linear combination of Lipschitz functions also Lipschitz? Answer Exercise 173 Is a product of Lipschitz functions also Lipschitz?

Answer

Exercise 174 Is f (x) = log x a Lipschitz function?

Answer

Exercise 175 Is f (x) = |x| a Lipschitz function?

Answer

Exercise 176 If F : [a, b] → R is a Lipschitz function show that the function G(x) = F(x) + kx is increasing for some value k and decreasing for some other value of k. Is the converse true? Exercise 177 Show that every polynomial is a Lipschitz function on any bounded interval. What about unbounded intervals? Exercise 178 In an idle moment a careless student proposed to study a kind of super Lipschitz condition: he supposed that | f (x) − f (y)| ≤ M|x − y|2

for all x, y in an interval. What functions would have this property?

Answer

Exercise 179 A function f is said to be bi-Lipschitz on an interval I if there is an M > 0 so that 1 |x − y| ≤ | f (x) − f (y)| ≤ M|x − y| M for all x, y in the interval. What can you say about such functions? Can you give examples of such functions? Exercise 180 Is there a difference between the following two statements: and

| f (x) − f (y)| < |x − y| for all x, y in an interval | f (x) − f (y)| ≤ K|x − y| for all x, y in an interval, for some K < 1?

Answer

Exercise 181 If Fn : [a, b] → R is a Lipschitz function for each n = 1, 2, 3, . . . and F(x) = limn→∞ Fn (x) for each a ≤ x ≤ b, does it follow that F must also be a Lipschitz function. Answer

Chapter 2

The Indefinite Integral You will, no doubt, remember the formula Z

x3 +C 3 from your first calculus classes. This assertion includes the following observations. d x3 • +C = x2 . dx 3 x2 dx =

• Any other function F for which the identity F ′ (x) = x2 holds is of the form F(x) = x3 /3 +C for some constant C. • C is called the constant of integration and is intended as a completely arbitrary constant. R

• The expression x2 dx is intended to be ambiguous and is to include any and all functions whose derivative is x2 . In this chapter we will make this rather more precise and we will generalize by allowing a finite exceptional set where the derivative need not exist. Since the indefinite integral is defined directly in terms of the derivative, there are no new elements of theory required to be developed. We take advantage of the theory of continuous functions and their derivatives as outlined in Chapter 1.

2.1 An indefinite integral on an interval We shall assume that indefinite integrals are continuous and we require them to be differentiable everywhere except possibly at a finite set. The definition is stated for open intervals only. Definition 2.1 (The indefinite integral) Let (a, b) be an open interval (bounded or unbounded) and let f be a function defined on that interval except possibly at finitely many points. Then any continuous function F : (a, b) → R for which F ′ (x) = f (x) for all a < x < b except possibly at finitely many points is said to be an indefinite integral for f on (a, b). 35

CHAPTER 2. THE INDEFINITE INTEGRAL

36

Warning An indefinite integral is always defined relative to some open interval. Confusions can easily arise if this is forgotten. Notation

The familiar notationZ

F ′ (x) dx = F(x) +C

will frequently be used along (one hopes) with some allusion to the interval under considerabtion, This notation is justified by the fact that, all indefinite integrals for the function F ′ can be written in this form for some choice of constant C. Use of the notation, however, requires the user to be alert to the underlying interval (a, b) on which the statement depends. Continuous functions differentiable mostly everywhere Our indefinite integration theory is essentially the study of continuous functions F : (a, b) → R defined on an open interval, for which there is only a finite number of points of nondifferentiability. Note that, if there are no exceptional points, then we do not have to check that the function is continuous: every differentiable function is continuous. The indefinite integration theory is, consequently, all about derivatives of continuous functions. We shall see, in the next chapter, that the definite integration theory is all about derivatives of uniformly continuous functions. Exercise 182 Suppose that F : (a, b) → R is differentiable at every point of the open Answer interval (a, b). Is F an indefinite integral for F ′ ? Exercise 183 If F is an indefinite integral for a function f on an open interval (a, b) Answer and a < x < b, is it necessarily true that F ′ (x) = f (x). Exercise 184 Let F, G : (a, b) → R be two continuous functions for which F ′ (x) = f (x) for all a < x < b except possibly at finitely many points and G′ (x) = f (x) for all a < x < b except possibly at finitely many points. Then F and G must differ by a constant. In particular, on the interval (a, b) the statements

and

Z

f (x) dx = F(x) +C1

Z

f (x) dx = G(x) +C2

are both valid (where C1 and C2 represent arbitrary constants of integration). Answer

2.1.1 Role of the finite exceptional set The simplest kind of antiderivative is expressed in the situation F ′ (x) = f (x) for all a < x < b [no exceptions]. Our theory is slightly more general in that we allow a finite set of failures and compensate for this by insisting that the function F is continuous at those points. There is a language that is often adopted to allow exceptions in mathematical statements. We do not use this language in Chapter 2 or Chapter 3 but, for classroom presentation, it might be useful. We will use this language in Chapter 4.

2.1. AN INDEFINITE INTEGRAL ON AN INTERVAL

37

mostly everywhere A statement holds mostly everywhere if it holds everywhere with the exception of a finite set of points c1 , c2 , c3 , . . . , cn . nearly everywhere A statement holds nearly everywhere if it holds everywhere with the exception of a sequence of points c1 , c2 , c3 , . . . . almost everywhere A statement holds almost everywhere if it holds everywhere with the exception of a set of measure zero1 . The mostly everywhere version Thus our indefinite integral is the study of continuous functions that are differentiable mostly everywhere. It is only a little bit more ambitious to allow a sequence of points of nondifferentiability. But for Chapters 2 and 3 we do only this version. The nearly everywhere version The point of view taken in the elementary analysis text by Elias Zakon2 is that the “nearly everywhere” version of integration theory is the one best taught to undergraduate students. Thus, in his text, all integrals concern continuous functions that are differentiable except possibly at the points of some sequence of exceptional points. The mostly everywhere case is the easiest since it needs an appeal only to the meanvalue theorem for justification. The nearly everywhere case is rather harder, but if you have worked through the proof of Theorem 1.33 you have seen all the difficulties handled fairly easily. The almost everywhere version The more advanced integration theory sketched in Chapter 4 allows sets of measure zero for exceptional sets; the theory is more difficult since one must then, at the same time, strengthen the hypothesis of continuity. Thus the final step in the program of improving integration theory is to allow sets of measure zero and study certain kinds of functions that are differentiable almost everywhere. This presents new technical challenges and we shall not attempt it until Chapter 4. Our goal is to get there using Chapters 2 and 3 as elementary warmups.

2.1.2 Features of the indefinite integral We shall often in the sequel distinguish among the following four cases for an indefinite integral. 1 This

notion of a set of measure zero will be defined in Chapter 4. For now understand that a set of measure zero is small in a certain sense of measurement. 2 E. Zakon, Mathematical Analysis I, ISBN 1-931705-02-X, published by The Trillia Group, 2004.

CHAPTER 2. THE INDEFINITE INTEGRAL

38

Theorem 2.2 Let F be an indefinite integral for a function f on an open interval (a, b). 1. F is continuous on (a, b) but may or not be uniformly continuous there. 2. If f is bounded then F is Lipschitz on (a, b) and hence uniformly continuous there. 3. If f unbounded then F is not Lipschitz on (a, b) and may or not be uniformly continuous there. 4. If f is nonnegative and unbounded then F is uniformly continuous on (a, b) if and only if F is bounded. It will be important for our theory of the definite integral in Chapter 3 that we know which of the situations holds. It would be good practice and disciple in this chapter, then, to spot in any particular example whether the function F is Lipschitz, or uniformly continuous, or simply continuous but not uniformly continuous on the interval given. Exercise 185 Give an example of two functions f and g possessing indefinite integrals on the interval (0, 1) so that, of the two indefinite integrals F and G, one is uniformly Answer continuous and the other is not. Exercise 186 Prove this part of Theorem 2.2: If a function f is bounded and possesses an indefinite integral F on (a, b) then F is Lipschitz on (a, b). Deduce that F is uniformly continuous on (a, b). Answer

2.1.3 The notation

R

f (x) dx

Since we cannot avoid its use in elementary calculus classes, we define the symbol Z

f (x) dx

to mean the collection of all possible functions that are indefinite integrals of f on an appropriately specified interval. Because of Exercise 184 we know that we can always write this as Z f (x) dx = F(x) +C where F is any one choice of indefinite integral for f and C is an arbitrary constant called the constant of integration. In more advanced mathematical discussions this notation seldom appears, although there are frequent discussions of indefinite integrals (meaning a function whose derivative is the function being integrated). Exercise 187 Why exactly is this statement incorrect: Z

x2 dx = x3 /3 + 1? Answer

2.2. EXISTENCE OF INDEFINITE INTEGRALS

39

Exercise 188 Check the identities d (x + 1)2 = 2(x + 1) dx and d 2 (x + 2x) = 2x + 2 = 2(x + 1). dx Thus, on (−∞, ∞), Z (2x + 2) dx = (x + 1)2 +C

and

Z

(2x + 2) dx = (x2 + 2x) +C.

Does it follow that (x + 1)2 = (x2 + 2x)?

Answer

Exercise 189 Suppose that we drop continuity from the requirement of an indefinite integral and allow only one point at which the derivative may fail (instead of a finite set of points). Illustrate the situation by finding all possible indefinite integrals [in this new sense] of f (x) = x2 on (0, 1). Answer Exercise 190 Show that the function f (x) = 1/x has an indefinite integral on any open interval that does not include zero and does not have an indefinite integral on any open interval containing zero. Is the difficulty here because f (0) is undefined? Answer Exercise 191 Show that

and

Z Z

p 1 p dx = 2 |x| +C |x|

p 1 p dx = −2 |x| +C |x| are both true in a certain sense. How is this possible?

Answer

Exercise 192 Show that the function 1 f (x) = p |x| has an indefinite integral on any open interval, even if that interval does include zero. Is there any difficulty that arises here because f (0) is undefined? Answer Exercise 193 Which is correct Z Z Z 1 1 1 dx = log x +C or dx = log(−x) +C or dx = log |x| +C? x x x Answer

2.2 Existence of indefinite integrals We cannot be sure in advance that any particular function f has an indefinite integral on a given interval, unless we happen to find one. Thus every calculus students knows

CHAPTER 2. THE INDEFINITE INTEGRAL

40 the existence of the indefinite integral

Z

sin x dx

on (−∞, ∞) merely because of the fact that d [− cos x] = sin x dx is true at every value of x. If we did not happen to remember that fact, then what properties might we spot in the function sin x that would guarantee that an indefinite integral exists, even given that we couldn’t explicitly find one? We turn now to the problem of finding sufficient conditions under which we can be assured that one exists. This is a rather subtle point. Many beginning students might feel that we are seeking to ensure ourselves that an indefinite integral can be found. We are, instead, seeking for assurances that an indefinite integral does indeed exist. We might still remain completely unable to write down some formula for that indefinite integral because there is no “formula” possible. We shall show now that, with appropriate continuity assumptions on f , we can be assured that an indefinite integral exists without any requirement that we should find it. Our methods will show that we can also describe a procedure that would, in theory, produce the indefinite integral as the limit of a sequence of simpler functions. This procedure would work only for functions that are mostly continuous. We will still have a theory for indefinite integrals of discontinuous functions but we will have to be content with the fact that much of the theory is formal, and describes objects which are not necessarily constructible3 .

2.2.1 Upper functions We will illustrate our method by introducing the notion of an upper function. This is a piecewise linear function whose slopes dominate the function. Let f be defined at all but finitely many points of an open interval (a, b) and bounded on (a, b) and let us choose points a = x0 < x1 < x2 < x3 < · · · < xn−1 < xn = b.

Suppose that F is a uniformly continuous function on [a, b] that is linear on each interval [xi−1 , xi ] and such that F(xi ) − F(xi−1 ) ≥ f (ξ) xi − xi−1 for all points ξ at which f is defined and for which xi−1 ≤ ξ ≤ xi (i = 1, 2, . . . , n). Then we can call F an upper function for f on [a, b]. The method of upper functions is to approximate the indefinite integral that we require by suitable upper functions. Upper functions are piecewise linear functions with the break points (where the corners are) at x1 , x2 , . . . , xn−1 . The slopes of these line segments exceed the values of the function f in the corresponding intervals. See Figure 2.1 for an illustration of such a function. 3 Note to the instructor: Just how unconstructible are indefinite integrals in general? See Chris Freiling, How to compute antiderivatives, Bull. Symbolic Logic 1 (1995), no. 3, 279–316. This is by no means an elementary question.

2.2. EXISTENCE OF INDEFINITE INTEGRALS

41

Figure 2.1: A piecewise linear function on [−3, 3]. Exercise 194 Let f (x) = x2 be defined on the interval [0, 1]. Define an upper function for f using the points 0, 14 , 21 , 43 , 1. Sketch the graph of that upper function. Answer Exercise 195 (step functions) Let a function f be defined by requiring that, for any integer n (positive, negative, or zero), f (x) = n if n − 1 < x < n. (Values at the integers are omitted.) This is a simple example of a step function. Find a formula for an Answer indefinite integral and show that this is an upper function for f .

2.2.2 The main existence theorem for bounded functions For bounded, continuous functions we can always determine the existence of an indefinite integral by a limiting process using appropriate upper functions. The lemma is a technical computation that justifies this statement. Lemma 2.3 Suppose that f : (a, b) → R is a bounded function on an open interval (a, b) [bounded or unbounded]. Then there exists a Lipschitz function F : (a, b) → R so that F ′ (x) = f (x) for every point a < x < b at which f is continuous.

Existence of indefinite integral of continuous functions If we apply this theorem to a bounded, continuous function we immediately obtain an indefinite integral. The indefinite integral is necessarily Lipschitz. Thus this corollary will answer our question as to what conditions guarantee the existence of an indefinite integral. We shall use it repeatedly. Theorem 2.4 Suppose that f : (a, b) → R is a bounded function on an open interval (a, b) [bounded or unbounded] and that there are only a finite number of discontinuity points of f in (a, b). Then f has an indefinite integral on (a, b), which must be Lipschitz on (a, b).

42

CHAPTER 2. THE INDEFINITE INTEGRAL

2.2.3 The main existence theorem for unbounded functions Our theorem applies only to bounded functions, but we remember that if f is continuous on (a, b) then it is uniformly continuous, and hence bounded, on any subinterval [c, d] ⊂ (a, b). This allows the following version of our existence theorem. Note that we will not get an indefinite integral that is Lipschitz on all of (a, b) unless f is bounded. Theorem 2.5 Suppose that f : (a, b) → R is a function on an open interval (a, b) [bounded or unbounded] and that there are no discontinuity points of f in (a, b). Then f has an indefinite integral on (a, b). The exercises establish the lemma and the theorem. This is an important technical tool in the theory and it is essential that the reader understands how it works. Exercise 196 Use the method of upper functions to prove Lemma 2.3. It will be enough to assume that f : (0, 1) → R and that f is nonnegative and bounded. (Exercises 197 and 198 ask for the justifications for this assumption.) Answer Exercise 197 Suppose that f : (a, b) → R and set g(t) = f (a + t(b − a)) for all 0 ≤ t ≤ 1. If G is an indefinite integral for g on (0, 1) show how to find an indefinite integral for f on (a, b). Answer Exercise 198 Suppose that f : (a, b) → R is a bounded function and that K = inf{ f (x) : a < x < b}.

Set g(t) = f (t) − K for all a < t < b. Show that g is nonnegative and bounded. Suppose that G is an indefinite integral for g on (a, b); show how to find an indefinite integral Answer for f on (a, b). Exercise 199 Show how to deduce Theorem 2.5 from the lemma.

Answer

2.3 Basic properties of indefinite integrals We conclude our chapter on the indefinite integral by discussing some typical calculus topics. We have developed a precise theory of indefinite integrals and we are beginning to understand the nature of the concept. There are a number of techniques that have traditionally been taught in calculus courses for the purpose of evaluating or manipulating integrals. Many courses you will take (e.g., physics, applied mathematics, differential equations) will assume that you have mastered these techniques and have little difficulty in applying them. The reason that you are asked to study these techniques is that they are required for working with integrals or developing theory, not merely for computations. If a course in calculus seems to be overly devoted to evaluating indefinite integrals it is only that you are being drilled in the methods. The skill in finding an exact expression for an indefinite integral is of little use: it won’t help in all cases anyway. Besides, any integral that can be handled by these methods can be handled in seconds in by computer software packages such as Maple or Mathematica (see Section 2.3.5).

2.3. BASIC PROPERTIES OF INDEFINITE INTEGRALS

43

2.3.1 Linear combinations There is a familiar formula for the derivative of a linear combination: d {rF(x) + sG(x)} = rF ′ (x) + sG′ (x). dx This immediately provides a corresponding formula for the indefinite integral of a linear combination: Z Z Z (r f (x) + sg(x)) dx = r f (x) dx + s g(x) dx As usual with statements about indefinite integrals this is only accurate if some mention of an open interval is made. To interpret this formula correctly, let us make it very precise. We assume that both f and g have indefinite integrals F and G on the same interval I. Then the formula claims, merely, that the function H(x) = rF(x) + sG(x) is an indefinite integral of the function h(x) = r f (x) + sg(x) on that interval I. Exercise 200 (linear combinations) Prove this formula by showing that H(x) = rF(x) + sG(x) is an indefinite integral of the function h(x) = r f (x) + sg(x) on any interval I, assuming that both f and g have indefinite integrals F and G on the Answer interval I.

2.3.2 Integration by parts There is a familiar formula for the derivative of a product: d {F(x)G(x)} = F ′ (x)G(x) + F(x)G′ (x). dx This immediately provides a corresponding formula for the indefinite integral of a product: Z Z F(x)G′ (x) dx = F(x)G(x) − F ′ (x)G(x) dx. Again we remember that statements about indefinite integrals are only accurate if some mention of an open interval is made. To interpret this formula correctly, let us make it very precise. We assume that F ′ G has an indefinite integral H on an open interval I. Then the formula claims, merely, that the function K(x) = F(x)G(x) − H(x) is an indefinite integral of the function F(x)G′ (x) on that interval I. Exercise 201 (integration by parts) Explain and verify the formula.

Answer

Exercise 202 (calculus student notation) If u = f (x), v = g(x), and we denote du = f ′ (x)dx and dv = g′ (x)dx then in its simplest form the product rule is often described as Z Z u dv = uv − v du. Explain how this version is used.

Answer

CHAPTER 2. THE INDEFINITE INTEGRAL

44

Exercise 203 (extra practice) If you need extra practice on integration by parts as a calculus technique here is a standard collection of examples all cooked in advance so that an integration by parts technique will successfully determine an exact formula for the integral. This is not the case except for very selected examples. [The interval on which the integration is performed is not specified but it should be obvious Z Z which Zpoints, if any,Zto avoid.] Z Z ln x dx , arcsin 3x dx , xex dx , x sin x dx , x ln x dx , x cos 3x dx , Z Z Z Z Z x5 Z √ ln x dx , 2x arctan x dx , x2 e3x dx , x3 ln 5x dx , (ln x)2 dx , x x + 3 dx , Z Z Z Z Z p 3 ln x 2 dx , x5 ex dx , x3 cos (x2 ) dx , x7 5 + 3x4 dx , x sin x cos x dx , x 2 Z Z Z Z Z x3 x3 ex 6x 3x x dx , e sin (e ) dx , dx , e cos x dx and sin 3x cos 5x dx. (x2 + 5)2 (x2 + 1)2 Answer

2.3.3 Change of variable The chain rule for the derivative of a composition of functions is the formula: d F(G(x)) = F ′ (G(x))G′ (x). dx This immediately provides a corresponding formula for the indefinite integral of a product: Z Z F ′ (G(x))G′ (x) dx =

F ′ (u) du = F(u) +C = F(G(x)) +C

[u = G(x)]

where we have used the familiar device u = G(x), du = G′ (x) dx to make the formula more transparent. This is called the change of variable rule, although it is usually called integration by substitution is most calculus presentations. Again we remember that statements about indefinite integrals are only accurate if some mention of an open interval is made. To interpret this formula correctly, let us make it very precise. We assume that F is a differentiable function on an open interval I. We assume too that G′ has an indefinite integral G on an interval J and assumes all of its values in the interval I. Then the formula claims, merely, that the function F(G(x)) is an indefinite integral of the function F ′ (G(x))G′ (x) on that interval J [not on the interval I please]. Note that we have not addressed the question of allowing exceptional points in this formula. If G is continuous and differentiable mostly everywhere in (a, b) and F is continuous and differentiable mostly everywhere in an appropriate interval, then what can be said? Exercise 204 In the argument for the change of variable rule we did not address the possibility that F might have finitely many points of nondifferentiability. Discuss. Answer Exercise 205 Verify that this argument is correct: Z Z Z 1 1 1 1 x cos(x2 +1) dx = 2x cos(x2 +1) dx = cos u du = sin u+C = sin(x2 +1)+C. 2 2 2 2

2.3. BASIC PROPERTIES OF INDEFINITE INTEGRALS

45 Answer

Exercise 206 Here is a completely typical calculus exercise (or exam question). You R 2 are asked to determine an explicit formula for xex dx. What is expected and how do Answer you proceed? Exercise 207 Given that numbers r and s.

R

f (t) dt = F(t) + C determine

R

f (rx + s) dx for any real Answer

2.3.4 What is the derivative of the indefinite integral? What is

Z

d f (x) dx? dx By definition this indefinite integral is the family of all functions whose derivative is f (x) [on some pre-specified open interval] but with a possibly finite set of exceptions. So the answer trivially is that Z d f (x) dx = f (x) dx at most points inside the interval of integration. (But not necessarily at all points.) The following theorem will do in many situations, but it does not fully answer our question. There are exact derivatives that have very large sets of points at which they are discontinuous. Theorem 2.6 Suppose that f : (a, b) → R has an indefinite integral F on the interval (a, b). Then F ′ (x) = f (x) at every point in (a, b) at which f is continuous. Answer

Exercise 208 Prove the theorem.

2.3.5 Partial fractions Many calculus texts will teach, as an integration tool, the method of partial fractions. It is, actually, an important algebraic technique with applicability in numerous situations, not merely in certain integration problems. It is best to learn this in detail outside of a calculus presentation since it invariably consumes a great deal of student time as the algebraic techniques are tedious at best and, often, reveal a weakness in the background preparation of many of the students. It will suffice for us to recount the method that will permit the explicit integration of Z x+3 dx. 2 x − 3x − 40 The following passage is a direct quotation from the Wikipedia site entry for partial fractions. “Suppose it is desired to decompose the rational function x+3 x2 − 3x − 40

CHAPTER 2. THE INDEFINITE INTEGRAL

46

into partial fractions. The denominator factors as (x − 8)(x + 5) and so we seek scalars A and B such that x+3 x2 − 3x − 40

=

x+3 A B = + . (x − 8)(x + 5) x − 8 x + 5

One way of finding A and B begins by "clearing fractions", i.e., multiplying both sides by the common denominator (x − 8)(x + 5). This yields x + 3 = A(x + 5) + B(x − 8). Collecting like terms gives x + 3 = (A + B)x + (5A − 8B). Equating coefficients of like terms then yields: A 5A

+ B − 8B

= 1 = 3

The solution is A = 11/13, B = 2/13. Thus we have the partial fraction decomposition x+3 x2 − 3x − 40

=

11/13 2/13 11 2 + = + . x−8 x + 5 13(x − 8) 13(x + 5)

Alternatively, take the original equation A B x+3 = + . (x − 8)(x + 5) x − 8 x + 5 multiply by (x − 8) to get x+3 B(x − 8) = A+ . x+5 x+5 Evaluate at x = 8 to solve for A as 11 = A. 13 Multiply the original equation by (x + 5) to get x + 3 A(x + 5) = + B. x−8 x−8 Evaluate at x = −5 to solve for B as 2 −2 = = B. −13 13

As a result of this algebraic identity we can quickly determine that Z x+3 dx = [11/13] log(x − 8) + [2/13] log(x + 5) +C. 2 x − 3x − 40

2.3. BASIC PROPERTIES OF INDEFINITE INTEGRALS

47

This example is typical and entirely representative of the easier examples that would be expected in a calculus course. The method is, however, much more extensive than this simple computation would suggest. But it is not part of integration theory even if your instructor chooses to drill on it. Partial fraction method in Maple Computer algebra packages can easily perform indefinite integration using the partial fraction method without a need for the student to master all the details. Here is a short Maple session illustrating that all the partial fraction details given above are handled easily without resorting to hand calculation. That is not to say that the student should entirely avoid the method itself since it has many theoretical applications beyond its use here. [32]dogwood% maple |\^/| Maple 12 (SUN SPARC SOLARIS) ._|\| |/|_. Copyright (c) Maplesoft, a division of Waterloo Maple Inc. 2008 \ MAPLE / All rights reserved. Maple is a trademark of Waterloo Maple Inc.

> int( (x+3) /

( x^2-3*x-40), x); 11 2/13 ln(x + 5) + -- ln(x - 8) 13

# No constant of integration appears in the result for indefinite integrals.

Exercise 209 In determining that Z x+3 dx = [11/13] log(x − 8) + [2/13] log(x + 5) +C 2 x − 3x − 40 we did not mention an open interval in which this would be valid. Discuss. Answer

2.3.6 Tables of integrals Prior to the availability of computer software packages like Maple4 , serious users of the calculus often required access to tables of integrals. If an indefinite integral did have an expression in terms of some formula then it could be found in the tables [if they were extensive enough] or else some transformations using our techniques above (integration by parts, change of variable, etc.) could be applied to find an equivalent integral that did appear in the tables. Most calculus books (not this one) still have small tables of integrals. Much more efficient, nowadays, is simply to rely on a computer application such as Maple or Mathematica to search for an explicit formula for an indefinite integral. These packages will even tell you if no explicit formula exists. It is probably a waste of lecture time to teach for long any method that uses tables and it is a waste of paper to write about them. The interested reader should just Google “tables of integrals” to see what can be done. It has the same historical interest that 4 See

especially Section 3.11.1.

48

CHAPTER 2. THE INDEFINITE INTEGRAL

logarithms as devices for computation have. Store your old tables of integrals in the same drawer with your grandparent’s slide rules.

Chapter 3

The Definite Integral We have defined already the notion of an indefinite integral Z

F ′ (x) dx = F(x) +C.

on an open interval (a, b). The theory of the indefinite integral is described best as the study of continous functions on open intervals that are mostly everywhere differentiable. The definite integral Z b a

F ′ (x) dx = F(b) − F(a)

on a bounded, closed interval [a, b] is defined as a special case of that and the connection between the two concepts is immediate. We can describe the theory of the definite integral as the study of uniformly continous functions on closed and bounded intervals that are mostly everywhere differentiable. In other calculus courses one might be introduced to a different (also very limited) version of the integral introduced in the middle of the 19th century by Riemann. Then the connection with the indefinite integral is established by means of a deep theorem known as the fundamental theorem of the calculus. Here we run this program backwards. We take the simpler approach of starting with the fundamental theorem as a definition and then recover the Riemann integration methods later. There are numerous advantages in this. We can immediately start doing some very interesting integration theory and computing integrals. Since we have already learned indefinite integration we have an immediate grasp of the new theory. We are not confined to the limited Riemann integral and we have no need to introduce the improper integral. We can make, eventually, a seamless transition to the Lebesgue integral and beyond. This calculus integral (also known as “Newton’s integral”) is a limited version of the full integration theory on the real line. It is intended as a teaching method for introducing integration theory. Later, in Chapter 4, we will present an introduction to the full modern version of integration theory on the real line.

49

CHAPTER 3. THE DEFINITE INTEGRAL

50

3.1 Definition of the calculus integral The definite integral is defined directly by means of the indefinite integral and uses a similar notation. Definition 3.1 (The definite integral) Let f be a function defined at every point of a closed, bounded interval [a, b] with possibly finitely many exceptions. Then f is said to be integrable [calculus sense] if there exists a uniformly continuous function F : [a, b] → R that is an indefinite integral for f on the open interval (a, b). In that case the number Z b

f (x) dx = F(b) − F(a),

a

is called the definite integral of f on [a, b]. To make this perfectly clear let us specify what this statement would mean: We require: 1. f is defined on [a, b] except possibly at points of a finite set. [In particular f (a) and f (b) need not be defined.] 2. There is a uniformly continuous function F on [a, b]. 3. F ′ (x) = f (x) at every point a < x < b except possibly at points of a finite set. 4. We compute F(b) − F(a) and call this number the definite integral of f on [a, b]. Thus our integration focuses on the study of uniformly continuous functions F : [a, b] → R for which there is at most a finite number of points of nondifferentiability in (a, b). For these functions we can write Z b a

F ′ (x) dx = F(b) − F(a).

(3.1)

The integration theory is, consequently, all about derivatives, just as was the indefinite integration theory. The statement (3.1) is here a definition not (as it would be in many other textbooks) a theorem.

3.1.1 Alternative definition of the integral In many applications it is more convenient to work with a definition that expresses everything within the corresponding open interval (a, b). Definition 3.2 (The definite integral) Let f be a function defined at every point of a bounded, open interval (a, b) with possibly finitely many exceptions. Then f is said to be integrable [calculus sense] on the closed interval [a, b] if there exists a uniformly continuous indefinite integral F for f on (a, b). In that case the number Z b a

f (x) dx = F(b−) − F(a+),

is called the definite integral of f on [a, b]. This statement would mean.

3.1. DEFINITION OF THE CALCULUS INTEGRAL

51

1. f is defined at least on (a, b) except possibly at points of a finite set. 2. There is a uniformly continuous function F on (a, b), with F ′ (x) = f (x) at every point a < x < b except possibly at points of a finite set. 3. Because F is uniformly continuous on (a, b), the two one-sided limits lim F(x) = F(a+) and lim F(x) = F(b−)

x→a+

x→b−

will exist. 4. The number F(b−) − F(a+) is the definite integral of f on [a, b]. Exercise 210 To be sure that a function f is integrable on a closed, bounded interval [a, b] you need to find an indefinite integral F on (a, b) and then check one of the following: 1. F is uniformly continuous on (a, b), or 2. F is uniformly continuous on [a, b], or 3. F is continuous on (a, b) and the one-sided limits, lim F(x) = F(a+) and lim F(x) = F(b−)

x→a+

x→b−

exist. Answer

Show that these are equivalent.

3.1.2 Infinite integrals Exactly the same definition for the infinite integrals Z ∞

−∞

f (x) dx,

Z ∞

f (x) dx, and

a

Z b

−∞

f (x) dx

can be given as for the integral over a closed bounded interval. Definition 3.3 (Infinite integral) Let f be a function defined at every point of (∞, ∞) with possibly finitely many exceptions. Then f is said to be integrable in the calculus sense on (∞, ∞) if there exists an indefinite integral F : (∞, ∞) → R for f for which both limits F(∞) = lim F(x) and F(−∞) = lim F(x) x→∞

exist. In that case the number Z

x→−∞

∞

−∞

f (x) dx = F(∞) − F(−∞),

is called the definite integral of f on (∞, ∞) . This statement would mean. 1. f is defined at all real numbers except possibly at points of a finite set. 2. There is a continuous function F on (∞, ∞), with F ′ (x) = f (x) at every point except possibly at points of a finite set.

CHAPTER 3. THE DEFINITE INTEGRAL

52 3. The two infinite limits

F(∞) = lim F(x) and F(−∞) = lim F(x) x→∞

x→−∞

exist. This must be checked. For this either compute the limits or else use Exercise 64: for all ε > 0 there should exist a positive number T so that ωF((T, ∞)) < ε and ωF((−∞, T )) < ε. 4. The number F(∞) − F(−∞) is the definite integral of f on [a, b]. Similar assertions define

Z b

f (x) dx = F(b) − F(−∞)

−∞

and

Z ∞ a

f (x) dx = F(∞) − F(a).

In analogy with the terminology of an infinite series ∞

∑ ak

k=1

we often say that the integral

Z ∞

f (x) dx

a

converges when the integral exists. That suggests language asserting that the integral converges absolutely if both integrals Z ∞

f (x) dx and

a

Z ∞ a

| f (x)| dx

exist.

3.1.3 Notation: The expressions

Ra a

f (x) dx and Z a

Ra b

f (x) dx

f (x) dx and

a

Z a

f (x) dx

b

for b > a do not yet make sense since integration is required to hold on a closed, bounded interval. But these notations are extremely convenient. Thus we will agree that Z a

f (x) dx = 0

a

R

and, if a < b and the integral ab f (x) dx exists as a calculus integral, then we assign this meaning to the “backwards” integral: Z a b

f (x) dx = −

Z b

f (x) dx.

a

R

Exercise 211 Suppose that the integral ab f (x) dx exists as a calculus integral and that F is an indefinite integral for f on that interval. Does the formula Z t s

f (x) dx = F(t) − F(s)

(s,t ∈ [a, b])

3.1. DEFINITION OF THE CALCULUS INTEGRAL

53

work even if s = t or if s > t?

Answer

Exercise 212 Check that the formula Z b

f (x) dx +

a

Z c

f (x) dx =

b

Z c

f (x) dx

a

works for all real numbers a, b and c.

Answer

3.1.4 The dummy variable: what is the “x” in If you examine the two statements Z

2

3

x dx = x /3 +C and

Z 2 1

Rb a

f (x) dx?

x2 dx = 23 /3 − 13 /3 = 7/3

you might notice an odd feature. The first integral [the indefinite integral] requires the symbol x to express the functions on both sides. But in the second integral [the definite integral] the symbol x plays no role except to signify the function being integrated. If we had given the function a name, say g(x) = x2 then the first identity could be written Z

3

g(x) dx = x /3 or

Z

g(t) dt = t 3 /3 + c

while the second one might be more simply written as Z 2

g = 7/3.

1

In definite integrals the symbols x and dx are considered as dummy variables, useful for notational purposes and helpful as aids to computation, but carrying no significance. Thus you should feel free [and are encouraged] to use any other letters you like to represent the dummy variable. But do not use a letter that serves some other purpose elsewhere in your discussion. Here are some bad and even terrible abuses of this: Exercise 213 What is wrong with this? Let x = 2 and let y=

Z 2

x2 dx.

1

Exercise 214 What is wrong with this? Show that Z x 1

x2 dx = x3 /3 − 1/3.

Exercise 215 Do you know of any other bad uses of dummy variables?

Answer

3.1.5 Definite vs. indefinite integrals The connection between the definite and indefinite integrals is immediate; we have simply defined one in terms of the other.

CHAPTER 3. THE DEFINITE INTEGRAL

54

If F is an indefinite integral of an integrable function f on an interval (a, b) then Z b a

f (x) dx = F(b−) − F(a+)

provided that these two one-sided limits do exist. In the other direction if f is integrable on an interval [a, b] then, on the open interval (a, b), the indefinite integral can be expressed as Z

f (x) dx =

Z x

f (t) dt +C.

a

Both statements are tautologies; this is a matter of definition not of computation or argument. Exercise 216 A student is asked to find the indefinite integral of e2x and he writes Z

2x

e dx =

Z x

e2t dt +C.

0

Answer

How would you grade? 2

Exercise 217 A student is asked to find the indefinite integral of ex and she writes Z

2

ex dx =

Z x

2

et dt +C.

0

Answer

How would you grade?

3.1.6 The calculus student’s notation The procedure that we have learned in order to compute a definite integral is actually just the definition. For example, if we wish to evaluate Z 6

we first determine that

Z

x2 dx

−5

x2 dx = x3 /3 +C

on any interval. So that, using the function F(x) = x3 /3 as an indefinite integral, Z 6

−5

x2 dx = F(6) − F(−5) = 63 /3 − (−5)3 /3 = (63 + 53 )/3.

Calculus students often use a shortened notation for this computation: x=6 Z 6 x3 2 x dx = = 63 /3 − (−5)3 /3. 3 x=−5 −5 Exercise 218 Which of these is correct: x=6 x=6 Z 6 Z 6 x3 x3 2 2 +1 ? x dx = or x dx = 3 x=−5 3 −5 −5 x=−5 Exercise 219 Would you accept this notation: Z ∞ dx 2 x=∞ √ = −√ = 0 − (−2)? x x=1 1 x3

Answer

3.2. INTEGRABILITY

55 Answer

3.2 Integrability What functions are integrable on an interval [a, b]? According to the definition we need to find an indefinite integral on (a, b) and then determine whether it is uniformly continuous. If we cannot explicitly obtain an indefinite integral we can still take advantage of what we know about continuous functions to decide whether a given function is integrable or not. This focus on continuity, however, will not answer the problem in general. But it does give us a useful and interesting theory. Continuity will be our most used tool in this chapter. For a more advanced theory we would need to find some other ideas.

3.2.1 Integrability of bounded, continuous functions If there does exist an indefinite integral of a bounded function, we know that it would have to be Lipschitz and so must be uniformly continuous. Thus integrability of bounded functions on bounded intervals reduces simply to ensuring that there is an indefinite integral. Theorem 3.4 If f : (a, b) → R is a bounded function that is continuous at all but finitely many points of an open bounded interval (a, b) then f is integrable on [a, b]. Corollary 3.5 If f : [a, b] → R is a uniformly continuous function then f is integrable on [a, b]. Exercise 220 Show that all step functions are integrable.

Answer

Exercise 221 Show that all differentiable functions are integrable.

Answer

3.2.2 Integrability of unbounded continuous functions What unbounded functions are integrable on an interval [a, b]? We know that a bounded function f : (a, b) → R that is continuous at each point of the open interval would be integrable. The unbounded case is covered in this theorem. Theorem 3.6 Suppose that f : (a, b) → R is a function that is continuous at all points of a bounded open interval (a, b). Then f is integrable on every closed, bounded subinterval [c, d] ⊂ (a, b). Moreover f is integrable on [a, b] itself if and only if the one-sided limits lim

Z c

t→a+ t

f (x) dx and lim

Z t

t→b− c

f (x) dx

exist for some a < c < b. In that case Z b a

f (x) dx = lim

Z c

t→a+ t

Exercise 222 Prove Theorem 3.6.

f (x) dx + lim

Z t

t→b− c

f (x) dx. Answer

CHAPTER 3. THE DEFINITE INTEGRAL

56

3.2.3 Comparison test for integrability What unbounded functions are integrable on an interval [a, b]? What functions are integrable on an unbounded interval (−∞, ∞)? Sometimes the most convenient way of checking for integrability is to compare an unknown case to the case of a known integrable function. The following simple theorem is sometimes called a comparison test for integrals. Theorem 3.7 (comparison test I) Suppose that f , g : (a, b) → R are functions on (a, b), both of which have an indefinite integral on (a, b). Suppose that | f (x)| ≤ g(x) for all a < x < b. If g is integrable on [a, b] then so too is f . We recall that we know already: If f : (a, b) → R is an unbounded function that is continuous at all points of (a, b) then f has an indefinite integral on (a, b). That indefinite integral may or may not be uniformly continuous. That provides a quick corollary of our theorems. Corollary 3.8 Suppose that f is an unbounded function on (a, b) that is continuous at all but a finite number of points, and suppose that g : (a, b) → R with | f (x)| ≤ g(x) for all a < x < b. If g is integrable on [a, b] then so too is f .

3.2.4 Comparison test for infinite integrals For infinite integrals there are similar statements available. Theorem 3.9 (comparison test II) Suppose that f , g : (a, ∞) → R are functions on (a, ∞), both of which have an indefinite integral on (a, ∞). Suppose that | f (x)| ≤ g(x) for all a < x. If g is integrable on [a, ∞) then so too is f . Corollary 3.10 Suppose that f is function on (a, ∞) that is continuous at all but a finite number of points, and suppose that g : (a, ∞) → R with | f (x)| ≤ g(x) for all a < x. If g is integrable on [a, ∞) then so too is f . Exercise 223 Prove the two comparison tests [Theorems 3.7 and 3.9].

Answer

Exercise 224 Prove Corollary 3.8.

Answer

Exercise 225 Prove Corollary 3.10.

Answer

Exercise 226 Which, if any, of these integrals exist: Z π/2 r Z π/2 r Z π/2 r sin x sin x sin x dx , dx , and dx? 2 x x x3 0 0 0

Answer

Exercise 227 Apply the comparison test to each of these integrals: Z ∞ Z ∞ Z ∞ sin x sin x sin x √ dx , dx , and dx. x x x2 1 1 1 Answer

3.2. INTEGRABILITY

57

Exercise 228 (nonnegative functions) Show that a nonnegative function f : (a, b) → R is integrable on [a, b] if and only if it has a bounded indefinite integral on (a, b). Answer Exercise 229 Give an example of a function f : (a, b) → R that is not integrable on Answer [a, b] and yet it does have a bounded indefinite integral on (a, b). Exercise 230 Discuss the existence of the definite integral Z b p(x) dx q(x) a where p(x) and q(x) are both polynomials.

Answer

Exercise 231 Discuss the existence of the integral Z ∞ p(x) dx q(x) a where p(x) and q(x) are polynomials.

Answer

3.2.5 The integral test It is useful to have a way of comparing infinite integrals to series. When one converges so too does the other. Theorem 3.11 (The integral test) Let f be a continuous, nonnegative, decreasing R function on [1, ∞). Then the definite integral 1∞ f (x) dx exists if and only if the series ∑∞ n=1 f (n) converges. Exercise 232 Prove the integral test.

Answer

Exercise 233 Give an example of a function f that is continuous and nonnegative on R [1, ∞) so that the integral 1∞ f (x) dx exists but the series ∑∞ n=1 f (n) diverges. Answer Exercise 234 Give an example of a function f that is continuous and nonnegative on R [1, ∞) so that the integral 1∞ f (x) dx does not exist but the series ∑∞ n=1 f (n) converges. Answer

3.2.6 Products of integrable functions When is the product of a pair of integrable functions integrable? When both functions are bounded and defined on a closed, bounded interval we shall likely be successful. When both functions are unbounded, or the interval is unbounded simple examples exist to show that products of integrable functions need not be integrable. Exercise 235 Suppose we are given a pair of functions f and g such that each is uniformly continuous on [a, b]. Show that each of f , g and the product f g is integrable on [a, b]. Answer

CHAPTER 3. THE DEFINITE INTEGRAL

58

Exercise 236 Suppose we are given a pair of functions f and g such that each is bounded and has at most a finite number of discontinuities in (a, b). Show that each of Answer f , g and the product f g is integrable on [a, b]. Exercise 237 Find a pair of functions f and g, integrable on [0, 1] and continuous on (0, 1) but such that the product f g is not. Answer Exercise 238 Find a pair of continuous functions f and g, integrable on [1, ∞) but such Answer that the product f g is not. Exercise 239 Suppose that F, G : [a, b] → R are uniformly continuous functions that are differentiable at all but a finite number of points in (a, b). Show that F ′ G is integrable on [a, b] if and only if FG′ is integrable on [a, b]. Answer

3.3 Properties of the integral The basic properties of integrals are easily obtained for us because the integral is defined directly by differentiation. Thus we can apply all the rules we know about derivatives to obtain corresponding facts about integrals.

3.3.1 Integrability on all subintervals When a function has a calculus integral on an interval it must also have a calculus integral on all subintervals. Theorem 3.12 (integrability on subintervals) If f is integrable on a closed, bounded interval [a, b] then f is integrable on any subinterval [c, d] ⊂ [a, b].

3.3.2 Additivity of the integral When a function has a calculus integral on two adjacent intervals it must also have a calculus integral on the union of the two intervals. Moreover the integral on the large interval is the sum of the other two integrals. Theorem 3.13 (additivity of the integral) If f is integrable on the closed, bounded intervals [a, b] and [b, c] then f is integrable on the interval [a, c] and, moreover, Z b

f (x) dx +

Z c

f (x) dx =

f (x) dx.

a

b

a

Z c

3.3.3 Inequalities for integrals Larger functions have larger integrals. The formula for inequalities: Z b a

f (x) dx ≤

Z b

g(x) dx

a

if f (x) ≤ g(x) for all but finitely many points x in (a, b).

3.3. PROPERTIES OF THE INTEGRAL

59

Theorem 3.14 (integral inequalities) Suppose that the two functions f , g are both integrable on a closed, bounded interval [a, b] and that f (x) ≤ g(x) for all x ∈ [a, b] with possibly finitely many exceptions. Then Z b a

f (x) dx ≤

Z b

g(x) dx.

a

The proof is an easy exercise in derivatives. We know that if H is uniformly continuous on [a, b] and if d H(x) ≥ 0 dx for all but finitely many points x in (a, b) then H(x) must be nondecreasing on [a, b]. Exercise 240 Complete the details needed to prove the inequality formula of Theorem thm:intineqal. Answer

3.3.4 Linear combinations Formula for linear combinations: Z b

[r f (x) + sg(x)] dx = r

Z b

f (x) dx + s

a

a

Z b a

g(x) dx

(r, s ∈ R).

Here is a precise statement of what we intend by this formula: If both functions f (x) and g(x) have a calculus integral on the interval [a, b] then any linear combination r f (x) + sg(x) (r, s ∈ R) also has a calculus integral on the interval [a, b] and, moreover, the identity must hold. The proof is an easy exercise in derivatives. We know that d (rF(x) + sG(x)) = rF ′ (x) + sG′ (x) dx at any point x at which both F and G are differentiable. Exercise 241 Complete the details needed to prove the linear combination formula.

3.3.5 Integration by parts Integration by parts formula: Z b a

′

F(x)G (x) dx = F(x)G(x) −

Z b a

F ′ (x)G(x) dx

The intention of the formula is contained in the product rule for derivatives: d (F(x)G(x)) = F(x)G′ (x) + F ′ (x)G(x) dx which holds at any point where both functions are differentiable. One must then give strong enough hypotheses that the function F(x)G(x) is an indefinite integral for the function F(x)G′ (x) + F ′ (x)G(x) in the sense needed for our integral.

CHAPTER 3. THE DEFINITE INTEGRAL

60

Exercise 242 Supply the details needed to prove the integration by parts formula in the special case where F and G are continuously differentiable everywhere. Exercise 243 Supply the details needed to state and prove an integration by parts formula that is stronger than the one in the preceding exercise.

3.3.6 Change of variable The change of variable formula (i.e., integration by substitution): Z b a

f (g(t))g′ (t) dt =

Z g(b)

f (x) dx.

g(a)

The intention of the formula is contained in the following statement which contains a sufficient condition that allows this formula to be proved: Let I be an interval and g : [a, b] → I a continuously differentiable function. Suppose that F : I → R is an integrable function. Then the function F(g(t))g′ (t) is integrable on [a, b] and the function f is integrable on the interval [g(a), g(b)] (or rather on [g(a), g(b)] if g(b) < g(a)) and the identity holds. There are various assumptions under which this might be valid. The proof is an application of the chain rule for the derivative of a composite function: d F(G(x)) = F ′ (G(x))G′ (x). dx Exercise 244 Supply the details needed to prove the change of variable formula in the Answer special case where F and G are differentiable everywhere. Exercise 245 (a failed change of variables) Let F(x) = |x| and G(x) = x2 sin x−1 , G(0) = 0. Does Z 1

0

F ′ (G(x))G′ (x) dx = F(G(1)) − F(G(0)) = | sin 1|?

Answer Exercise 246 (calculus student notation) Explain the procedure being used by this calculus student: In the integral obtain Z 2

R2 0

x cos(x2 + 1) dx we substitute u = x2 + 1, du = 2x dx and

x cos(x2 + 1) dx =

0

1 2

Z u=5 u=1

1 cos u du = (sin(5) − sin(1)). 2

Exercise 247 (calculus student notation) Explain the procedure being used by this calculus student: p The substitution x = sin u, dx = cos u du is useful, because 1 − sin2 u = cos u. Therefore Z πp Z π Z 1p 2 2 cos2 u du. 1 − x2 dx = 1 − sin2 u cos u du = 0

0

0

3.3. PROPERTIES OF THE INTEGRAL

61

Exercise 248 Supply the details needed to prove the change of variable formula in the special case where G is strictly increasing and differentiable everywhere. Answer Exercise 249 Show that the integral

√ Z π2 cos x √

dx x exists and use a change of variable to determine the exact value. 0

Answer

3.3.7 What is the derivative of the definite integral? What is

Z

d x f (t) dt? dx a Rx We know that a f (t) dt is an indefinite integral of f and so, by definition, Z d x f (t) dt = f (x) dx a at all but finitely many points in the interval (a, b) if f is integrable on [a, b]. If we need to know more than that then there is the following version which we have already proved: Z d x f (t) dt = f (x) dx a at all points a < x < b at which f is continuous. We should keep in mind, though, that there may also be many points where f is discontinuous and yet the derivative formula holds. Advanced note. If we go beyond the calculus interval, as we do in Chapter 4, then the same formula is valid Z d x f (t) dt = f (x) dx a but there may be many more than finitely many exceptions possible. For “most” values of t this is true but there may even be infinitely many exceptions possible. It will still be true at points of continuity but it must also be true at most points when an integrable function is badly discontinuous (as it may well be). Exercise 250 Prove Theorem 3.12 both for integrals on [a, b] or (−∞, ∞). Answer Exercise 251 Prove Theorem 3.13 both for integrals on [a, b] or (−∞, ∞). Answer Exercise 252 Prove Theorem 3.14 both for integrals on [a, b] or (−∞, ∞). Answer Exercise 253 Show that the function f (x) = x2 is integrable on [−1, 2] and compute its definite integral there. Answer

CHAPTER 3. THE DEFINITE INTEGRAL

62

Exercise 254 Show that the function f (x) = x−1 is not integrable on [−1, 0], [0, 1], nor on any closed bounded interval that contains the point x = 0. Did the fact that f (0) is undefined influence your argument? Is this function integrable on (∞, −1] or on [1, ∞)? Answer Exercise 255 Show that the function f (x) = x−1/2 is integrable on [0, 2] and compute its definite integral there. Did the fact that f (0) is undefined interfere with your argument? Is this function integrable on [0, ∞)? Answer p Exercise 256 Show that the function f (x) = 1/ |x| is integrable on any interval [a, b] and determine the value of the integral. Answer Exercise 257 (why the finite exceptional set?) In the definition of the calculus integral we permit a finite exceptional set. Why not just skip the exceptional set and just Answer split the interval into pieces? Exercise 258 (limitations of the calculus integral) Define a function F : [0, 1] → R in such a way that F(0) = 0, and for each odd integer n = 1, 3, 5 . . . , F(1/n) = 1/n and each even integer n = 2, 4, 6 . . . , F(1/n) = 0. On the intervals [1/(n + 1), 1/n] for R n = 1, 2, 3, the function is linear. Show that ab F ′ (x) dx exists as a calculus integral for R1 ′ all 0 < a < b ≤ 1 but that 0 F (x) dx does not. Hint: too many exceptional points. Answer Exercise 259 Show that each of the following functions is not integrable on the interval stated: 1. f (x) = 1 for all x irrational and f (x) = 0 if x is rational, on any interval [a, b]. 2. f (x) = 1 for all x irrational and f (x) is undefined if x is rational, on any interval [a, b]. 3. f (x) = 1 for all x 6= 1, 1/2, 1/3, 1/4, . . . and f (1/n) = cn for some sequence of positive numbers c1 , c2 , c3 , . . . , on the interval [0, 1]. Answer Exercise 260 Determine all values of p for which the integrals Z 1

x p dx or

Z ∞

x p dx

1

0

exist.

Answer

Exercise 261 Are the following additivity formulas for infinite integrals valid: 1.

Z ∞

f (x) dx =

Z ∞

f (x) dx =

Z ∞

f (x) dx =

−∞

2.

−∞

−∞

∞

∑

f (x) dx +

Z n

Z b

f (x) dx +

a

Z ∞

f (x) dx?

b

f (x) dx?

n=1 n−1

0

3.

Z a

∞

∑

Z n

f (x) dx?

n=−∞ n−1

Answer

3.4. MEAN-VALUE THEOREMS FOR INTEGRALS

63

3.4 Mean-value theorems for integrals In general the expression

Z

b 1 f (x) dx b−a a is thought of as an averaging operation on the function f , determining its “average value” throughout the whole interval [a, b]. The first mean-value theorem for integrals says that the function actually attains this average value at some point inside the interval, i.e., under appropriate hypotheses there is a point a < ξ < b at which Z b 1 f (x) dx = f (ξ). b−a a But this is nothing new to us. Since the integral is defined by using an indefinite integral F for f this is just the observation that Z b F(b) − F(a) 1 = f (ξ), f (x) dx = b−a a b−a the very familiar mean-value theorem for derivatives.

Theorem 3.15 Let f : (a, b) → R be integrable on [a, b] and suppose that F is an indefinite integral. Suppose further that F ′ (x) = f (x) for all a < x < b with no exceptional points. Then there must exist a point ξ ∈ (a, b) so that Z b a

f (x) dx = f (ξ)(b − a).

Corollary 3.16 Let f : (a, b) → R be integrable on [a, b] and suppose that f is continuous at each point of (a, b). Then there must exist a point ξ ∈ (a, b) so that Z b a

f (x) dx = f (ξ)(b − a).

Exercise 262 Give an example of an integrable function for which the first mean-value Answer theorem for integrals fails. Exercise 263 (another mean-value theorem) Suppose that G : [a, b] → R is a continuous function and ϕ : [a, b] → R is an integrable, nonnegative function. If G(t)ϕ(t) is integrable, show that there exists a number ξ ∈ (a, b) such that Z b a

G(t)ϕ(t) dt = G(ξ)

Z b a

ϕ(t) dt. Answer

Exercise 264 (and another) Suppose that G : [a, b] → R is a positive, monotonically decreasing function and ϕ : [a, b] → R is an integrable function. Suppose that Gϕ is integrable. Then there exists a number ξ ∈ (a, b] such that Z b a

G(t)ϕ(t) dt = G(a + 0)

Z ξ a

ϕ(t) dt.

Note: Here, as usual, G(a + 0) stands for limx→a+ G(x) , the existence of which follows from the monotonicity of the function G. Note that ξ in the exercise might possibly be b.

CHAPTER 3. THE DEFINITE INTEGRAL

64

Exercise 265 (. . . and another) Suppose that G : [a, b] → R is a monotonic (not necessarily decreasing and positive) function and ϕ : [a, b] → R is an integrable function. Suppose that Gϕ is integrable. Then there exists a number ξ ∈ (a, b) such that Z b

G(t)ϕ(t) dt = G(a + 0)

Z ξ a

a

ϕ(t) dt + G(b − 0)

Z b ξ

ϕ(t) dt.

Exercise 266 (Dirichelet integral) As an application of mean-value theorems, show that the integral Z ∞ sin x dx x 0 Answer is convergent but is not absolutely convergent.

3.5 Riemann sums The expression of an integral by its definition Z b a

f (x) dx = F(b) − F(a)

requires finding a function F to serve as an antiderivative. It would be more convenient, both for theory and practice, if we can relate the value of the integral directly to the actual values of the function f . Approximations of the form Z b a

n

f (x) dx ≈ ∑ f (ξi )(xi − xi−1 ) i=1

have long been used. Here the points xi are chosen so as to begin at the left endpoint a and end at the right endpoint b, a = x0 , x1 , x2 , x3 , . . . , xn−1 , xn = b and the points ξi (called the associated points) are required to be chosen at or between the corresponding points xi−1 and xi . Most readers would have encountered such sums under the stricter conditions that a = x0 < x1 < x2 < x3 < · · · < xn−1 < xn = b and xi−1 ≤ ξi ≤ xi

so that the points are arranged in increasing order. This need not always be the case, but it is most frequently so. We have used this notion before as a partition and we write partitions in the form {([xi , xi−1 ], ξi ) : i = 1, 2, . . . n}.

Moreover, in most settings, one is interested also in choosing points close together so that an inequality of the form |xi−1 − xi 0 there is a δ > 0 so that n Z xi f (x) dx − f (ξ )(x − x ) i i i−1 < ε ∑ x i=1

and

i−1

Z b n f (x) dx − ∑ f (ξi )(xi − xi−1 ) < ε a i=1

whenever points are given

a = x0 < x1 < x2 < x3 < · · · < xn−1 < xn = b

for which

xi − xi−1 < δ

with associated points ξi ∈ [xi−1 , xi ] chosen at any such point where f is defined. In the special case where f is defined and continuous at every point of the interval [a, b] we get the original version of Cauchy. Corollary 3.20 (Cauchy) Let f : [a, b] → R be a uniformly continuous function. Then, f is integrable on [a, b] and moreover the integral may be uniformly approximated by Riemann sums. Exercise 274 Prove Theorem 3.19 in the case when f is uniformly continuous on [a, b] Answer by using the error estimate in Exercise 273. Exercise 275 Prove Theorem 3.19 in the case when f is continuous on (a, b). Answer Exercise 276 Complete the proof of Theorem 3.19.

Answer

Exercise 277 Let f : [a, b] → R be an integrable function on [a, b] and suppose, moreover, the integral may be uniformly approximated by Riemann sums. Show that f would Answer have to be bounded. Exercise 278 Show that the integral Z 1

12 + 22 + 32 + 42 + 52 + 62 + · · · + n2 . n→∞ n3

x2 dx = lim

0

Answer Exercise 279 Show that the integral Z 1 x2 dx = lim (1 − r) + r(r − r2 ) + r2 (r2 − r3 ) + r3 (r3 − r4 ) + . . . . 0

r→1−

Answer

CHAPTER 3. THE DEFINITE INTEGRAL

70 R

Exercise 280 Show that the integral 01 x5 dx can be exactly computed by the method of Riemann sums provided one has the formula 15 + 25 + 35 + 45 + 55 + 65 + + · · · + N 5 =

N 6 N 5 5N 4 N 2 + + − . 6 2 12 12

3.5.5 Riemann’s integral By the middle of the nineteenth century Riemann, clearly inspired by Cauchy’s clarification of integration theory, was teaching a more general integration theory to his students. He took the observations of the preceding section and gave a definition of an integral based on it. This is a standard and time-honored tradition in mathematics. What an earlier mathematician proposes as a theorem, you will propose as a definition. Thus Theorem 3.19 turned into this. Definition 3.21 Let f be a function that is defined at every point of [a, b]. Then, f is said to be Riemann integrable on [a, b] if it satisfies the following “uniform integrability” criterion: there is a number I so that, for every ε > 0 there is a δ > 0, with the property that n I − ∑ f (ξi )(xi − xi−1 ) < ε i=1

whenever points are given for which

a = x0 < x1 < x2 < x3 < · · · < xn−1 < xn = b xi − xi−1 < δ

with associated points ξi ∈ [xi−1 , xi ].

The number I in the definition would then be written in integral notation as I = (R)

Z b

f (x) dx.

a

Most bounded functions (but not all) that are integrable in the calculus sense that we are using are Riemann integrable and the values of the integrals agree. Thus it is safe to write Z Z b

(R)

b

f (x) dx =

a

f (x) dx

a

when we are sure that the function f is integrable in both senses. Confused? There are no unbounded functions that are Riemann integrable, although many unbounded functions are integrable in the calculus sense. Some highly discontinuous functions are Riemann integrable, but not integrable as we understand it. So that is rather confusing. Should we incorporate Riemann’s ideas into our integration study or not? Mathematicians of the late nineteenth century did, but there was some considerable difficulties that arose as a result. We find it best for our purposes to leave the Riemann integral as an historical curiosity until we have developed the correct integration theory on the real line.

3.5. RIEMANN SUMS

71

Neither the calculus integral nor the Riemann integral is the correct theory for all purposes. When we return to the Riemann integral we can place it within the appropriate theoretical framework. As a “teaching integral” the calculus integral is arguably more appropriate since it is easier to develop and does lead eventually to the correct theory in any case. Exercise 281 Can you find a function that is Riemann integrable but not integrable in Answer the calculus sense taken in the text?

3.5.6 Robbins’s theorem This section introduces some basic ideas from integration theory. Most students learn such ideas studying the Riemann integral. Here everything remains in the context of the calculus integral. There is another approach possible to Cauchy’s theorem. While Riemann took the idea as a definition of a different kind of integration theory, we can find a generalization of that theorem by refining the Riemann definition. For Cauchy’s theorem the points of the subdivision a = x0 , x1 , . . . , xn = b were arranged in increasing order. In Robbins’s theorem3 we drop the insistence that the points in the Riemann sum must form an increasing sequence. This allows us to characterize the calculus integral of uniformly continuous functions entirely by a statement using Riemann sums. Theorem 3.22 (Robbins) A real-valued function f is uniformly continuous on an interval [a, b] if and only if it satisfies the following strong uniform integrability criterion: there is a number I so that, for every ε > 0 and C > 0, there is a δ > 0 with the property that n I − ∑ f (ξi )(xi − xi−1 ) < ε i=1 for any choice of points x0 , x1 , . . . , xn and ξ1 , ξ2 , . . . , ξn from [a, b] satisfying n

∑ |xi − xi−1| ≤ C

i=1

where a = x0 , b = xn , 0 < |xi − xi−1 | < δ and each ξi belongs to the interval with endpoints xi and xi−1 for i = 1, 2, . . . , n. In that case, necessarily, I=

Z b

f (x) dx.

a

This theorem gives us some insight into integration theory. Instead of basing the calculus integral on the concept of an antiderivative, it could instead be obtained from a definition of an integral based on the concept of Riemann sums. This gives us two equivalent formulations of the calculus integral of uniformly continuous functions: one uses an antiderivative and one uses Riemann sums. 3 Only one direction in the theorem is due to Robbins and a proof can be found in Herbert E. Robbins, Note on the Riemann integral, American Math. Monthly, Vol. 50, No. 10 (Dec., 1943), 617–618.. The other direction is proved in B. S. Thomson, On Riemann sums, Real Analysis Exchange 37, no. 1 (2010).

CHAPTER 3. THE DEFINITE INTEGRAL

72

Exercise 282 Prove the easy direction in Robbins’s theorem, i.e, assume that f is uniformly continuous and prove that the statement holds with I=

Z b

f (x) dx.

a

Answer Exercise 283 Show that the number I that appears in the statement of Theorem 3.22 is unique, i.e., that there cannot be two different numbers I and I ′ possessing the same Answer property. Exercise 284 Show that a function satisfies the hypotheses of Robbins theorem, Theorem 3.22, on an interval [a, b] if and only if it satisfies the following equivalent strong integrability criterion: for every ε > 0 and C > 0, there is a δ > 0 with the property that m n ′ ′ ′ ∑ f (ξ j )(x j − x j−1 ) − ∑ f (ξi )(xi − xi−1 ) < ε j=1 i=1 for any choice of points

x0 , x1 , . . . , xn and ξ1 , ξ2 , . . . , ξn and x′0 , x′1 , . . . , x′m and ξ′1 , ξ′2 , . . . , ξ′m

from [a, b] satisfying m

n

∑ |xi − xi−1| ≤ C

i=1

and

∑ |x′j − x′j−1| ≤ C

j=1

0 < |xi − xi−1 | < δ, 0 < |x′j − x′j−1 | < δ, each ξi belongs where to the interval with endpoints xi and xi−1 for i = 1, 2, . . . , n, and each ξ′j belongs to the interval with endpoints x′j and x′j−1 for j = 1, 2, . . . , m, Answer a = x0 = x′0 ,

b = xn = x′m ,

Exercise 285 Show that, if a function satisfies the hypotheses of Robbins theorem (Theorem 3.22) on an interval [a, b] then it satisfies this same strong uniform integrability criterion on every subinterval [c, d] ⊂ [a, b]: there is a number I(c, d) so that, for every ε > 0 and C > 0, there is a δ > 0 with the property that n I(c, d) − ∑ f (ξi )(xi − xi−1 ) < ε i=1 for any choice of points x0 , x1 , . . . , xn and ξ1 , ξ2 , . . . , ξn from [c, d] satisfying n

∑ |xi − xi−1| ≤ C

i=1

where c = x0 , d = xn , 0 < |xi − xi−1 | < δ and each ξi belongs to the interval with endpoints xi and xi−1 for i = 1, 2, . . . , n. Answer Exercise 286 Further to Exercise 285 show that I(x, z) = I(x, y) + I(y, z) for all a ≤ x < y < z ≤ b. Exercise 287 Prove the harder direction in Robbins’s theorem, i.e, assume that f satisfies the strong “integrability” criterion and prove that it must be uniformly continuous

3.5. RIEMANN SUMS

73

and that I=

Z b

f (x) dx.

a

Answer

3.5.7 Theorem of G. A. Bliss Students of the calculus and physics are often required to “set up” integrals by which is meant interpreting a problem as an integral. Basically this amounts to interpreting the problem as a limit of Riemann sums Z b a

n

f (x) dx = lim ∑ f (ξi )(xi − xi−1 ). i=1

In this way the student shows that the integral captures all the computations of the problem. In simple cases this is easy enough, but complications can arise. For example if f and g are two continuous functions, sometimes the correct set up would involve a sum of the form n

lim ∑ f (ξi )g(ηi )(xi − xi−1 ) i=1

and not the more convenient n

lim ∑ f (ξi )g(ξi )(xi − xi−1 ). i=1

Here, rather than a single point ξi associated with the interval [xi , xi−1 ], two different points ξi and ηi must be used. Nineteenth century students had been taught a rather murky method for handling this case known as the Duhamel principle; it involved an argument using infinitesimals that, at bottom, was simply manipulations of Riemann sums. Bliss4 felt that this should be clarified and so produced an elementary theorem of which Theorem 3.23 is a special case. It is just a minor adjustment to our Theorem 3.19. Theorem 3.23 (Bliss) Let f and g be bounded functions that are defined and continuous at every point of (a, b) with at most finitely many exceptions: Then, f g is integrable on [a, b] and moreover the integral may be uniformly approximated by Riemann sums in this alternative sense: for every ε > 0 there is a δ > 0 so that Z b n f (x)g(x) dx − ∑ f (ξi ) g (ξ∗i ) (xi − xi−1 ) < ε a i=1

whenever points

a = x0 < x1 < x2 < x3 < · · · < xn−1 < xn = b

are given with each

0 < xi − xi−1 < δ

and where ξi and ξ∗i are any points in [xi−1 , xi ] where f g is defined. Exercise 288 Prove the Bliss theorem. Answer 4 G.

A. Bliss, A substitute for Duhamel’s theorem, Annals of Mathematics, Ser. 2, Vol. 16, (1914).

CHAPTER 3. THE DEFINITE INTEGRAL

74

Exercise 289 Prove this further variant of Theorem 3.19. Theorem 3.24 (Bliss) Let f1 , f2 , . . . , f p bounded functions that are defined and continuous at every point of (a, b) with at most finitely many exceptions: Then, the product f1 f2 f3 . . . f p is integrable on [a, b] and moreover the integral may be uniformly approximated by Riemann sums in this alternative sense: for every ε > 0 there is a δ > 0 so that Z b a f1 (x) f2 (x) f3 (x) . . . f p (x) dx n (2) (3) (p) f 3 ξi . . . f p ξi (xi − xi−1 ) < ε − ∑ f1 (ξi ) f2 ξi i=1

whenever {([xi , xi−1 ], ξi ) : i = 1, 2, . . . n} is a partition of [a, b] with each (2)

(3)

(p)

and ξi , ξi , ξi , . . . , ξi functions are defined.

xi − xi−1 < δ

∈ [xi−1 , xi ] with these being points in (a, b) where the Answer

Exercise 290 Prove one more variant of Theorem 3.19. Theorem 3.25 Suppose that the function H(s,t) satisfies |H(s,t)| ≤ M(|s| + |t|)

for some real number M and all real numbers s and t. Let f and g be bounded functions that are defined and continuous at every point of (a, b) with at most finitely many exceptions: Then, H( f (x), g(x)) is integrable on [a, b] and moreover the integral may be uniformly approximated by Riemann sums in this sense: for every ε > 0 there is a δ > 0 so that Z b n H( f (x), g(x)) dx − ∑ H ( f (ξi ) , g (ξ∗i )) (xi − xi−1 ) < ε a i=1 whenever {([xi , xi−1 ], ξi ) : i = 1, 2, . . . n} is a partition of [a, b], xi − xi−1 < δ

and ξi , ξ∗i , ∈ [xi−1 , xi ] with both ξi and ξ∗i points in (a, b) where f and g are defined. Answer

3.5.8 Pointwise approximation by Riemann sums For unbounded, but integrable, functions there cannot be a uniform approximation by Riemann sums. Even for bounded functions there will be no uniform approximation by Riemann sums unless the function is almost everywhere continuous, which is rather a strong condition. If we are permitted to adjust the smallness of the partition in a pointwise manner, however, then such an approximation by Riemann sums is available. This is less convenient, of course, since for each ε we need find not merely a single positive δ but a

3.5. RIEMANN SUMS

75

positive δ(x) at each point x of the interval. While this appears, at the outset, to be a deep property of calculus integrals it is an entirely trivial property. Much more remarkable is that Henstock,5 who first noted the property, was able also to recognize that all Lebesgue integrable functions have the same property and that this property characterized the much more general integral of Denjoy and Perron. Thus we will see this property again, but next time it will appear as a condition that is both necessary and sufficient. Theorem 3.26 (Henstock property) Let f : [a, b] → R be defined and integrable on [a, b]. Then, for every ε > 0 and for each point x in [a, b] there is a δ(x) > 0 so that n Z xi ∑ x f (x) dx − f (ξi )(xi − xi−1) < ε i=1

and

i−1

Z b n f (x) dx − ∑ f (ξi )(xi − xi−1 ) < ε a i=1

whenever whenever points

a = x0 < x1 < x2 < x3 < · · · < xn−1 < xn = b

are given with each

xi − xi−1 < δ(ξi ) and ξi ∈ [xi−1 , xi ]. Note that our statement requires that the function f being integrated is defined at all points of the interval [a, b]. This is not really an inconvenience since we could simply set f (x) = 0 (or any other value) at points where the given function f is not defined. The resulting integral is indifferent to changing the value of a function at finitely many points. Note also that, if there are no such partitions having the property of the statements in Theorem 3.26, then the statement is certainly valid, but has no content. This is not the case, i.e., no matter what choice of a function δ(x) occurs in this situation there must be at least one partition having this property. This is precisely the Cousin covering argument. Exercise 291 In the statement of the theorem show that if the first inequality n Z xi ∑ x f (x) dx − f (ξi )(xi − xi−1) < ε i=1

i−1

holds then the second inequality Z b n f (x) dx − ∑ f (ξi )(xi − xi−1 ) < ε a i=1 must follow by simple arithmetic. 5 Ralph

Answer

Henstock (1923-2007) first worked with this concept in the 1950s while studying nonabsolute integration theory. The characterization of the Denjoy-Perron integral as a pointwise limit of Riemann sums was at the same time discovered by the Czech mathematician Jaroslav Kurweil and today that integral is called the Henstock-Kurzweil integral by most users.

CHAPTER 3. THE DEFINITE INTEGRAL

76

Exercise 292 Prove Theorem 3.26 in the case when f is the exact derivative of an Answer everywhere differentiable function F. Exercise 293 Prove Theorem 3.19 in the case where F is an everywhere differentiable Answer function except at one point c inside (a, b) at which F is continuous. Exercise 294 Complete the proof of Theorem 3.19.

Answer

3.5.9 Characterization of derivatives This section continues some basic ideas from integration theory, continued from Section 3.5.6. Most students learn such ideas studying the Riemann integral. Here everything remains, as before, in the context of the calculus integral. It was an old problem of W. H. Young to determine, if possible, necessary and sufficient conditions on a function f in order that it should be the derivative of some other function. Elementary students know only one sufficient condition (that f might be continuous) and perhaps one necessary condition (that f should have the intermediate value property). We can use a pointwise version of Robbins’s theorem to give an answer to this problem in terms of Riemann sums. We begin with the easy direction. Theorem 3.27 Let F : [c, d] → R be a differentiable function and let a, b ∈ [c, d], ε > 0, and C > 0 be given. Then there is a positive function δ : [c, d] → R+ with the property that Z b n F ′ (x) dx − ∑ F ′ (ξi )(xi − xi−1 ) < ε a i=1

for any choice of points x0 , x1 , . . . , xn and ξ1 , ξ2 , . . . , ξn from [c, d] with these four properties: 1. a = x0 and b = xn . 2. 0 < |xi − xi−1 | < δ(ξi ) for all i = 1, 2, . . . , n. 3. ξi belongs to the interval with endpoints xi and xi−1 for i = 1, 2, . . . , n. 4. ∑ni=1 |xi − xi−1 | ≤ C.

Characterization of derivatives What properties should a function have in order that we would know it to be the derivative of some other function? One answer is that it must have a strong integrability property expressed in terms of Riemann sums.

3.5. RIEMANN SUMS

77

Theorem 3.28 A function f : [a, b] → R is an exact derivative if and only if it has the following strong pointwise integrability property: there is a number I so that, for any choice of ε > 0 and C > 0, there must exist a positive function δ : [a, b] → R+ with the property that n I − ∑ f (ξi )(xi − xi−1 ) < ε i=1

for any choice of points x0 , x1 , . . . , xn and ξ1 , ξ2 , . . . , ξn from [c, d] with these four properties: 1. a = x0 and b = xn . 2. 0 < |xi − xi−1 | < δ(ξi ) for all i = 1, 2, . . . , n. 3. ξi belongs to the interval with endpoints xi and xi−1 for i = 1, 2, . . . , n. 4. ∑ni=1 |xi − xi−1 | ≤ C. Necessarily then, I=

Z b

f (x) dx.

a

This theorem too gives us some insight into integration theory. Instead of basing the calculus integral on the concept of an antiderivative it could instead be based on a definition of an integral centered on the concept of Riemann sums. This gives us two equivalent formulations of the calculus integral of derivative functions: one uses an antiderivative and one uses Riemann sums. The latter has some theoretical advantages since it is hard to examine a function and conclude that it is a derivative without actually finding the antiderivative itself. Exercise 295 Prove Theorem 3.27.

Answer

Exercise 296 Show that the number I that appears in the statement of Theorem 3.28 is unique, i.e., that there cannot be two different numbers I and I ′ possessing the same property. Exercise 297 Show that a function satisfies the hypotheses of Theorem 3.28 on an interval [a, b] if and only if it satisfies the following equivalent strong integrability criterion: for every ε > 0 and C > 0, there is a positive function δ on [a, b] with the property that m n ∑ f (ξ′j )(x′j − x′j−1 ) − ∑ f (ξi )(xi − xi−1 ) < ε j=1 i=1

for any choice of points

x0 , x1 , . . . , xn and ξ1 , ξ2 , . . . , ξn and x′0 , x′1 , . . . , x′m and ξ′1 , ξ′2 , . . . , ξ′m

from [a, b] satisfying a = x0 = x′0 , b = xn = x′m , and m

n

∑ |xi − xi−1 | ≤ C

i=1

and

∑ |x′j − x′j−1| ≤ C

j=1

CHAPTER 3. THE DEFINITE INTEGRAL

78

where a = x0 = x′0 , b = xn = x′n , 0 < |xi − xi−1 | < δ(ξi ), 0 < |x′j − x′j−1 | < δ(ξ′j ), each ξi belongs to the interval with endpoints xi and xi−1 for i = 1, 2, . . . , n, and each ξ′j belongs Answer to the interval with endpoints x′j and x′j−1 for j = 1, 2, . . . , m, Exercise 298 Show that, if a function satisfies the hypotheses of theorem 3.28 on an interval [a, b], then it satisfies this same strong “integrability” criterion on every subinterval [c, d] ⊂ [a, b]: there is a number I(c, d) so that, for every ε > 0 and C > 0, there is a positive function δ on [a, b] with the property that n I(c, d) − ∑ f (ξi )(xi − xi−1 ) < ε i=1 for any choice of points x0 , x1 , . . . , xn and ξ1 , ξ2 , . . . , ξn from [c, d] satisfying n

∑ |xi − xi−1| ≤ C

i=1

where c = x0 , d = xn , 0 < |xi − xi−1 | < δ(ξi ) and each ξi belongs to the interval with Answer endpoints xi and xi−1 for i = 1, 2, . . . , n. Exercise 299 Further to Exercise 298 show that I(x, z) = I(x, y) + I(y, z) for all a ≤ x < y, z ≤ b. Exercise 300 Prove the harder direction in Theorem 3.28.

Answer

3.5.10 Unstraddled Riemann sums Perhaps the reader can tolerate yet one more discussion of Riemann sums, albeit one of marginal interest. When we have considered a Riemann sum n

∑ f (ξi )(xi − xi−1)

i=1

to this point, we have always insisted that the associated points ξi should be selected between the corresponding points xi−1 and xi . We can relax this. We require ξi to be close to the two points xi−1 and xi , but we do not require that it appear between them. E. J. McShane was likely the first to exploit this idea to find a new characterization of the Lebesgue integral in terms of Riemann sums (i.e., unstraddled sums). Here we simply show that the integral of continuous functions can be obtained by such sums.

3.6. ABSOLUTE INTEGRABILITY

79

Theorem 3.29 Let f : [a, b] → R be a uniformly continuous function. Then the integral may be uniformly approximated by unstraddled Riemann sums: for every ε > 0 there is a δ > 0 so that n Z xi ∑ x f (x) dx − f (ξi )(xi − xi−1) < ε i−1

i=1

and

Z b n f (x) dx − ∑ f (ξi )(xi − xi−1 ) < ε a i=1

whenever points are given

a = x0 < x1 < x2 < x3 < · · · < xn−1 < xn = b

with associated points satisfying

ξi − δ < xi−1 < xi < ξi + δ

for each i = 1, 2, . . . , n.

The proof is sufficiently similar to that for Theorem 3.19 that the reader need not trouble over it. The only moral here is that one should remain alert to other formulations of technical ideas and be prepared to exploit them (as did McShane) in other contexts.

3.6 Absolute integrability If a function f is integrable, does it necessarily follow that the absolute value of that function, | f |, is also integrable? This is important in many applications. Since a solution to this problem rests on the concept of the total variation of a function, we will give that definition below in Section 3.6.1. Definition 3.30 (absolutely integrable) A function f is absolutely integrable on an interval [a, b] if both f and | f | are integrable there. Exercise 301 Show that, if f is absolutely integrable on an interval [a, b] then Z b Z b ≤ f (x) dx | f (x)| dx. a a

Answer

Exercise 302 (preview of bounded variation) Show that if a function f is absolutely integrable on a closed, bounded interval [a, b] and F is its indefinite integral then, for all choices of points a = x0 < x1 < x2 < · · · < xn−1 < xn = b, n

∑ |F(xi ) − F(xi−1 )| ≤

i=1

Z b a

| f (x)| dx. Answer

Exercise 303 (calculus integral is a nonabsolute integral) An integration method is an absolute integration method if whenever a function f is integrable on an interval

CHAPTER 3. THE DEFINITE INTEGRAL

80

[a, b] then the absolute value | f | is also integrable there. Show that the calculus integral is a nonabsolute integration method 6 . Hint: Consider π d . x cos dx x Answer

Exercise 304 Repeat Exercise 303 but using 1 d 2 x sin 2 . dx x Show that this derivative exists at every point. Thus there is an exact derivative which is integrable on every interval but not absolutely integrable. Answer Exercise 305 Let f be continuous at every point of (a, b) with at most finitely many exceptions and suppose that f is bounded. Show that f is absolutely integrable on [a, b]. Answer

3.6.1 Functions of bounded variation The clue to the property that expresses absolute integrability is in Exercise 302. The notion is due to Jordan and the language is that of variation, meaning here a measurement of how much the function is fluctuating. Definition 3.31 (total variation) A function F : [a, b] → R is said to be of bounded variation if there is a number M so that n

∑ |F(xi ) − F(xi−1 )| ≤ M

i=1

for all choices of points a = x0 < x1 < x2 < · · · < xn−1 < xn = b.

The least such number M is called the total variation of F on [a, b] and is written V (F, [a, b]). If F is not of bounded variation then we set V (F, [a, b]) = ∞.

Definition 3.32 (total variation function) let F : [a, b] → R be a function of bounded variation. Then the function T (x) = V (F, [a, x])

(a < x ≤ b), T (a) = 0

is called the total variation function for F on [a, b].

Our main theorem in this section establishes the properties of the total variation function and gives, at least for continuous functions, the connection this concept has with absolute integrability. 6 Both

the Riemann integral and the Lebesgue integral are absolute integration methods.

3.6. ABSOLUTE INTEGRABILITY

81

Theorem 3.33 (Properties of the total variation) Let F : [a, b] → R be a function of bounded variation and let T (x) = V (F, [a, x]) be its total variation. Then 1. for all a ≤ c < d ≤ b,

|F(d) − F(c)| ≤ V (F, [c, d]) = T (d) − T (c).

2. T is monotonic, nondecreasing on [a, b]. 3. If F is continuous at a point a < x0 < b then so too is T . 4. If F is uniformly continuous on [a, b] then so too is T . 5. If F is continuously differentiable at a point a < x0 < b then so too is T and, moreover T ′ (x0 ) = |F ′ (x0 )|. 6. If F is uniformly continuous on [a, b] and continuously differentiable at all but finitely many points in (a, b) then F ′ is absolutely integrable and F(x) − F(a) =

Z x a

′

F (t) dt and T (x) =

Z x a

|F ′ (t)| dt.

As we see here in assertion (6.) of the theorem and will discover further in the exercises, the two notions of total variation and absolute integrability are closely interrelated. The notion of total variation plays such a significant role in the study of real functions in general and in integration theory in particular that it is worthwhile spending some considerable time on it, even at an elementary calculus level. Since the ideas are closely related to other ideas which we are studying this topic should seem a natural development of the theory. Indeed we will find that our discussion of arc length in Section 3.10.3 will require a use of this same language. Exercise 306 Show directly from the definition that if F : [a, b] → R is a function of bounded variation then F is a bounded function on [a, b]. Answer Exercise 307 Compute the total variation for a function F that is monotonic on [a, b]. Answer Exercise 308 Compute the total variation function for the function F(x) = sin x on Answer [−π, π]. Exercise 309 Let F(x) have the value zero everywhere except at the point x = 0 where F(0) = 1. Choose points −1 = x0 < x1 < x2 < · · · < xn−1 < xn = 1.

What are all the possible values of

n

∑ |F(xi ) − F(xi−1 )|?

i=1

What is V (F, [−1, 1])?

Answer

Exercise 310 Give an example of a function F defined everywhere and with the property that V (F, [a, b]) = ∞ for every interval [a, b]. Answer

CHAPTER 3. THE DEFINITE INTEGRAL

82

Exercise 311 Show that if F : [a, b] → R is Lipschitz then F is a function of bounded Answer variation. Is the converse true? Exercise 312 Show that V (F + G, [a, b]) ≤ V (F, [a, b]) +V (G, [a, b]).

Answer

Exercise 313 Does V (F + G, [a, b]) = V (F, [a, b]) +V (G, [a, b]). Answer Exercise 314 Prove Theorem 3.33.

Answer

Exercise 315 (Jordan decomposition) Show that a function F has bounded variation on an interval [a, b] if and only if it can expressed as the difference of two monotonic, nondecreasing functions. Answer Exercise 316 Show that the function F(x) = x cos πx , F(0) = 0 is continuous everywhere but does not have bounded variation on the interval [0, 1], i.e., that V (F, [0, 1]) = ∞. Answer Exercise 317 (derivative of the variation) Suppose that F(x) = xr cos x−1 for x > 0, F(x) = −(−x)r cos x−1 for x < 0, and finally F(0) = 0. Show that if r > 1 then F has bounded variation on [−1, 1] and that F ′ (0) = 0. Let T be the total variation function of F. Show that T ′ (0) = 0 if r > 2, that T ′ (0) = 2/π if r = 2, and that T ′ (0) = ∞ if 1 < r < 2. Note: In particular, at points where F is differentiable, the total variation T need not be. Theorem 3.33 said, in contrast, that at points where F is continuously differentiable, Answer the total variation T must also be continuously differentiable. Exercise 318 (uniformly approximating the variation) Suppose that F is uniformly continuous on [a, b]. Show that for any v < V (F, [a, b] there is a δ > 0 so that so that n

v < ∑ |F(xi ) − F(xi−1 )| ≤ V (F, [a, b] i=1

for all choices of points a = x0 < x1 < x2 < · · · < xn−1 < xn = b

provided that each xi − xi−1 < δ. Is it possible to drop or relax the assumption that F is continuous? Note: This means the variation of a continuous function can be computed much like our Riemann sums approximation to the integral. Answer Exercise 319 Let Fk : [a, b] → R (k = 1, 2, 3, . . . ) be a sequence of functions of bounded variation, suppose that F(x) = lim Fk (x) k→∞

3.6. ABSOLUTE INTEGRABILITY

83

for every k = 1, 2, 3, . . . and suppose that there is a number M so that V (Fk , [a, b]) ≤ M.

k = 1, 2, 3, . . . .

Show that F must also have bounded variation. Does this prove that every limit of a sequence of functions of bounded variation must also have bounded variation? Exercise 320 (locally of bounded variation) Let F : R → R be a function. We say that F is locally of bounded variation at a point x if there is some positive δ so that V (F, [x − δ, x + δ]) < ∞. Show that F has bounded variation on every compact interval [a, b] if and only if F is locally of bounded variation at every point x ∈ R. Answer Exercise 321 (comparison test for variations) Suppose that F, G : [a, b] → R and that F is uniformly continuous on [a, b]. 1. If |F ′ (x)| ≤ |G′ (x)| for mostly every point x in (a, b) show that |F(b) − F(a)| ≤ V (F, [a, b]) ≤ V (G, [a, b]).

2. If F ′ (x) ≤ |G′ (x)| for mostly every point x in (a, b) show that F(b) − F(a) ≤ V (G, [a, b]).

(If you are feeling more ambitious replace “mostly everywhere” with “nearly everywhere.”) Answer Exercise 322 Here is a stronger version of bounded variation for a function F : [a, b] → R. For every C > 0 there is a number M so that n

∑ |F(xi ) − F(xi−1 )| ≤ M

i=1

for all choices of points a = x0 , x1 , x2 , . . . , xn−1 , xn = b for which n

∑ |xi − xi−1| ≤ C.

i=1

Show that this property is equivalent to the statement that F is Lipschitz on [a, b]. Answer

3.6.2 Indefinite integrals and bounded variation In the preceding section we spent some time mastering the important concept of total variation. We now see that it precisely describes the absolute integrability of a function. Indefinite integrals of nonabsolutely integrable functions will not be of bounded variation; indefinite integrals of absolutely integrable functions must be of bounded variation.

CHAPTER 3. THE DEFINITE INTEGRAL

84

Theorem 3.34 Suppose that a function f : (a, b) → R is absolutely integrable on a closed, bounded interval [a, b]. Then its indefinite integral F must be a function of bounded variation there and, moreover, V (F, [a, b]) =

Z b a

| f (x)| dx.

This theorem states only a necessary condition for absolute integrability. If we add in a continuity assumption we can get a complete picture of what happens. Continuity is needed for the calculus integral, but is not needed for more advanced theories of integration. Theorem 3.35 Let F : [a, b] → R be a uniformly continuous function that is continuously differentiable at every point in a bounded, open interval (a, b) with possibly finitely many exceptions. Then F ′ is integrable on [a, b] and will be, moreover, absolutely integrable on [a, b] if and only if F has bounded variation on that interval. Exercise 323 Prove Theorem 3.34.

Answer

Exercise 324 Prove Theorem 3.35.

Answer

3.7 Sequences and series of integrals Throughout the 18th century much progress in applications of the calculus was made through quite liberal use of the formulas Z bn Z b o lim fn (x) dx fn (x) dx = lim

and

n→∞ a

a

Z b

Z b

∞

∑

k=1 a

gk (x) dx =

a

n→∞

(

∞

)

∑ gk (x)

k=1

dx.

These are vitally important tools but they require careful application and justification. That justification did not come until the middle of the 19th century. We introduce two definitions of convergence allowing us to interpret what the limit and sum of a sequence, ∞

lim fn (x) and

n→∞

∑ gk (x)

k=1

should mean. We will find that uniform convergence allows an easy justification for the basic formulas above. Pointwise convergence is equally important but more delicate. At the level of a calculus course we will find that uniform convergence is the concept we shall use most frequently.

3.7.1 The counterexamples We begin by asking, naively, whether there is any difficulty in taking limits in the calculus. Suppose that f1 , f2 , f3 , . . . is a sequence of functions defined on an open interval I = (a, b). We suppose that this sequence converges pointwise to a function f , i.e., that for each x ∈ I the sequence of numbers { fn (x)} converges to the value f (x).

3.7. SEQUENCES AND SERIES OF INTEGRALS

85

1

1

Figure 3.1: Graphs of xn on [0, 1] for n = 1, 3, 5, 7, 9, and 50. Is it true that 1. If fn is bounded on I for all n, then is f also bounded on I? 2. If fn is continuous on I for all n, then is f also continuous on I? 3. If fn is uniformly continuous on I for all n, then is f also uniformly continuous on I? 4. If fn is differentiable on I for all n, then is f also differentiable on I and, if so, does f ′ = lim fn′ ? n→∞

5. If fn is integrable on a subinterval [c, d] of I for all n, then is f also integrable on [c, d] and, if so, does Z dn Z d o lim fn (x) dx? fn (x) dx = lim n→∞ c

c

n→∞

These five questions have negative answers in general, as the examples that follow show.

Exercise 325 (An unbounded limit of bounded functions) On the interval (0, ∞) and for each integer n let fn (x) = 1/x for x > 1/n and fn (x) = n for each 0 < x ≤ 1/n. Show that each function fn is both continuous and bounded on (0, ∞). Is the limit function f (x) = limn→∞ fn (x) also continuous ? Is the limit function bounded? Answer Exercise 326 (A discontinuous limit of continuous functions) For each integer n and −1 < x ≤ 1, let fn (x) = xn . For x > 1 let fn (x) = 1. Show that each fn is a continuous function on (−1, ∞) and that the sequence converges pointwise to a function f on (−1, ∞) that has a single point of discontinuity. Answer Exercise 327 (A limit of uniformly continuous functions) Show that the previous exercise supplies a pointwise convergence sequence of uniformly continuous functions on the interval [0, 1] that does not converge to a uniformly continuous function.

CHAPTER 3. THE DEFINITE INTEGRAL

86 2n

6

E E

E

E

E

E E

E

1 2n

E

-

1 n

1

Figure 3.2: Graph of fn (x) on [0, 1] in Exercise 329. Exercise 328 (The derivative of the limit is not the limit of the derivative) Let fn (x) = xn /n for −1 < x ≤ 1 and let fn (x) = x − (n − 1)/n for x > 1. Show that each fn is differentiable at every point of the interval (−1, ∞) but that the limit function has a point of nondifferentiability. Answer

Exercise 329 (The integral of the limit is not the limit of the integrals) In this example we consider a sequence of continuous functions, each of which has the same integral over the interval. For each n let fn be defined on [0, 1] as follows: fn (0) = 0, fn (1/(2n)) = 2n, fn (1/n) = 0, fn is linear on [0, 1/(2n)] and on [1/(2n), 1/n], and fn = 0 on [1/n, 1]. (See Figure 3.2.) It is easy to verify that fn → 0 on [0, 1]. Now, for each n, Z 1 0

But

Z 1 0

fn x = 1.

( lim n fn (x)) dx = n→∞

Thus lim n

n→∞

Z 1 0

fn x 6=

Z 1

Z 1

0 dx = 0.

0

lim n fn (x) dx

0 n→∞

so that the limit of the integrals is not the integral of the limit.

Exercise 330 (interchange of limit operations) To prove the (false) theorem that the pointwise limit of a sequence of continuous functions is continuous, why cannot we simply write lim lim fn (x) = lim fn (x0 ) = lim lim fn (x) x→x0

n→∞

n→∞

n→∞

x→x0

and deduce that

lim f (x) = f (x0 )?

x→x0

This assumes fn is continuous at x0 and “proves” that f is continuous at x0 . that Answer

3.7. SEQUENCES AND SERIES OF INTEGRALS

87

Exercise 331 Is there anything wrong with this “proof” that a limit of bounded functions is bounded? If each fn is bounded on an interval I then there must be, by definition, a number M so that | fn (x)| ≤ M for all x in I. By properties of sequence limits | f (x)| = | lim fn (x)| ≤ M n→∞

Answer

also, so f is bounded. Exercise 332 (interchange of limit operations) Let 0, if m ≤ n Smn = 1, if m > n.

Viewed as a matrix,



  [Smn ] =  

0 1 1 .. .

0 0 1 .. .

··· ··· ··· .. .

0 0 0 .. .

    

where we are placing the entry Smn in the mth row and nth column. Show that lim lim Smn 6= lim lim Smn . m→∞ n→∞

n→∞ m→∞

Answer

Exercise 333 Examine the pointwise limiting behavior of the sequence of functions xn . fn (x) = 1 + xn Exercise 334 Show that the natural logarithm function can be expressed as the pointwise limit of a sequence of “simpler” functions, √ log x = lim n n x − 1 n→∞

for every point in the interval. If the answer to our initial five questions for this particular limit is affirmative, what can you deduce about the continuity of the logarithm R function? What would be its derivative? What would be 12 log x dx? Exercise 335 Let x1 , x2 , . . . be a sequence that contains every rational number, let 1, if x is rational 1, if x ∈ {x1 , . . . , xn } and f (x) = fn (x) = 0, otherwise. 0, otherwise, 1. Show that fn → f pointwise on any interval.

2. Show that fn has only finitely many points of discontinuity while f has no points of continuity. 3. Show that each fn has a calculus integral on any interval [c, d] while f has a calculus integral on no interval. 4. Show that, for any interval [c, d], lim

Z d

n→∞ c

fn (x) dx 6=

Z dn c

o lim fn (x) dx.

n→∞

CHAPTER 3. THE DEFINITE INTEGRAL

88

Figure 3.3: Construction in Exercise 338.

Answer √ Exercise 336 Let fn (x) = sin nx/ n. Show that limn→∞ n fn = 0 but limn→∞ n fn′ (0) = ∞. Exercise 337 Let fn → f pointwise at every point in the interval [a, b]. We have seen that even if each fn is continuous it does not follow that f is continuous. Which of the following statements are true? 1. If each fn is increasing on [a, b], then so is f . 2. If each fn is nondecreasing on [a, b], then so is f . 3. If each fn is bounded on [a, b], then so is f . 4. If each fn is everywhere discontinuous on [a, b], then so is f . 5. If each fn is constant on [a, b], then so is f . 6. If each fn is positive on [a, b], then so is f . 7. If each fn is linear on [a, b], then so is f . 8. If each fn is convex on [a, b], then so is f . Answer Exercise 338 A careless student7 once argued as follows: “It seems to me that one can construct a curve without a tangent in a very elementary way. We divide the diagonal of a square into n equal parts and construct on each subdivision as base a right isosceles triangle. In this way we get a kind of delicate little saw. Now I put n = ∞. The saw becomes a continuous curve that is infinitesimally different from the diagonal. But it is perfectly clear that its tangent is alternately parallel now to the x-axis, now to the y-axis.” What is the error? (Figure 3.3 illustrates the construction.) Answer 7 In

this case the “careless student” was the great Russian analyst N. N. Luzin (1883–1950), who recounted in a letter [reproduced in Amer. Math. Monthly, 107, (2000), pp. 64–82] how he offered this argument to his professor after a lecture on the Weierstrass continuous nowhere differentiable function.

3.7. SEQUENCES AND SERIES OF INTEGRALS

89

Exercise 339 Consider again the sequence { fn } of functions fn (x) = xn on the interval (0, 1). We saw that fn → 0 pointwise on (0, 1), and we proved this by establishing that, for every fixed x0 ∈ (0, 1) and ε > 0, |x0 |n < ε if and only if n > log ε/ log x0 .

Is it possible to find an integer N so that, for all x ∈ (0, 1), |x|n < ε if f n > N?

Discuss.

Answer

3.7.2 Uniform convergence The most immediate of the conditions which allows an interchange of limits in the calculus is the notion of uniform convergence. This is a very much stronger condition than pointwise convergence. Definition 3.36 Let { fn } be a sequence of functions defined on an interval I. We say that { fn } converges uniformly to a function f on I if, for every ε > 0, there exists an integer N such that | fn (x) − f (x)| < ε for all n ≥ N and all x ∈ I. Exercise 340 Show that the sequence of functions fn (x) = xn converges uniformly on Answer any interval [0, η] provided that 0 < η < 1. Exercise 341 Using this definition of the Cauchy Criterion Definition 3.37 (Cauchy Criterion) Let { fn } be a sequence of functions defined on an interval set I. The sequence is said to be uniformly Cauchy on I if for every ε > 0 there exists an integer N such that if n ≥ N and m ≥ N, then | fm (x) − fn (x)| < ε for all x ∈ I. prove the following theorem: Theorem 3.38 Let { fn } be a sequence of functions defined on an interval I. Then there exists a function f defined on the interval I such that the sequence uniformly on I if and only if { fn } is uniformly Cauchy on I. Answer Exercise 342 In Exercise 340 we showed that the sequence fn (x) = xn converges uniformly on any interval [0, η], for 0 < η < 1. Prove this again, but using the Cauchy Answer criterion. Exercise 343 (Cauchy criterion for series) The Cauchy criterion can be expressed for uniformly convergent series too. We say that a series ∑∞ k=1 gk converges uniformly to the function f on an interval I if the sequence of partial sums {Sn } where n

Sn (x) =

∑ gk (x)

k=1

converges uniformly to f on I. Prove this theorem:

CHAPTER 3. THE DEFINITE INTEGRAL

90

Theorem 3.39 Let {gk } be a sequence of functions defined on an interval I. Then the series ∑∞ k=1 fk converges uniformly to some function f on the interval I if and only if for every ε > 0 there is an integer N so that n ∑ f j (x) < ε j=m for all n ≥ m ≥ N and all x ∈ I.

Answer

Exercise 344 Show that the series 1 + x + x2 + x3 + x4 + . . . converges pointwise on [0, 1), converges uniformly on any interval [0, η] for 0 < η < 1, Answer but that the series does not converge uniformly on [0, 1). Exercise 345 (Weierstrass M-Test) Prove the following theorem, which is usually known as the Weierstrass M-test for uniform convergence of series. Theorem 3.40 (M-Test) Let { fk } be a sequence of functions defined on an interval I and let {Mk } be a sequence of positive constants. If ∞

∑ Mk < ∞

k=1

and | fk (x)| ≤ Mk for each x ∈ I and k = 0, 1, 2, . . . ,

then the series ∑∞ k=1 fk converges uniformly on the interval I. Answer Exercise 346 Consider again the geometric series 1 + x + x2 + . . . (as we did in Exercise 344). Use the Weierstrass M-test to prove uniform convergence on the interval Answer [−a, a], for any 0 < a < 1. Exercise 347 Use the Weierstrass M-test to investigate the uniform convergence of the series ∞ sin kθ ∑ kp k=1 on an interval for values of p > 0.

Answer

Exercise 348 (Abel’s Test for Uniform Convergence) Prove Abel’s test for uniform convergence: Theorem 3.41 (Abel) Let {ak } and {bk } be sequences of functions on an interval I. Suppose that there is a number M so that N

−M ≤ sN (x) =

∑ ak (x) ≤ M

k=1

for all x ∈ I and every integer N. Suppose that the sequence of functions {bk } → 0 converges monotonically to zero at each point and that this convergence is uniform on I. Then the series ∑∞ k=1 ak (x)bk (x) converges uniformly on I.

3.7. SEQUENCES AND SERIES OF INTEGRALS

91 Answer

Exercise 349 Apply Theorem 3.41, to the following series that often arises in Fourier analysis: ∞ sin kθ ∑ k . k=1 Answer Exercise 350 Examine the uniform limiting behavior of the sequence of functions xn fn (x) = . 1 + xn On what sets can you determine uniform convergence? Exercise 351 Examine the uniform limiting behavior of the sequence of functions fn (x) = x2 e−nx . On what sets can you determine uniform convergence? On what sets can you determine uniform convergence for the sequence of functions n2 fn (x)? Exercise 352 Prove that if { fn } and {gn } both converge uniformly on an interval I, then so too does the sequence { fn + gn }. Exercise 353 Prove or disprove that if { fn } and {gn } both converge uniformly on an interval I, then so too does the sequence { fn gn }. Exercise 354 Prove or disprove that if f is a continuous function on (−∞, ∞), then f (x + 1/n) → f (x)

uniformly on (−∞, ∞). (What extra condition, stronger than continuity, would work if not?) Exercise 355 Prove that fn → f converges uniformly on an interval I, if and only if lim sup | fn (x) − f (x)| = 0. n

x∈I

Exercise 356 Show that a sequence of functions { fn } fails to converge to a function f uniformly on an interval I if and only if there is some positive ε0 so that a sequence {xk } of points in I and a subsequence { fnk } can be found such that | fnk (xk ) − f (xk )| ≥ ε0 .

Exercise 357 Apply the criterion in the preceding exercise to show that the sequence fn (x) = xn does not converge uniformly to zero on (0, 1). Exercise 358 Prove Theorem 3.38.

Answer

k Exercise 359 Verify that the geometric series ∑∞ k=0 x , which converges pointwise on (−1, 1), does not converge uniformly there.

CHAPTER 3. THE DEFINITE INTEGRAL

92

Exercise 360 Do the same for the series obtained by differentiating the series in Exk−1 converges pointwise but not uniformly on ercise 359; that is, show that ∑∞ k=1 kx (−1, 1). Show that this series does converge uniformly on every closed interval [a, b] contained in (−1, 1). Exercise 361 Verify that the series ∞

cos kx 2 k=1 k

∑

converges uniformly on (−∞, ∞). Exercise 362 If { fn } is a sequence of functions converging uniformly on an interval I to a function f , what conditions on the function g would allow you to conclude that g ◦ fn converges uniformly on I to g ◦ f ? ∞

xk ∑ converges uniformly on [0, b] for every b ∈ k=0 k [0, 1) but does not converge uniformly on [0, 1).

Exercise 363 Prove that the series

Exercise 364 Prove that if ∑∞ k=1 fk converges uniformly on an interval I, then the sequence of terms { fk } converges uniformly on I. Exercise 365 A sequence of functions { fn } is said to be uniformly bounded on an interval [a, b] if there is a number M so that | fn (x)| ≤ M

for every n and also for every x ∈ [a, b]. Show that a uniformly convergent sequence { fn } of continuous functions on [a, b] must be uniformly bounded. Show that the same statement would not be true for pointwise convergence. Exercise 366 Suppose that fn → f on (−∞, +∞). What conditions would allow you to compute that lim fn (x + 1/n) = f (x)? n→∞

Exercise 367 Suppose that { fn } is a sequence of continuous functions on the interval [0, 1] and that you know that { fn } converges uniformly on the set of rational numbers inside [0, 1]. Can you conclude that { fn } uniformly on [0, 1]? (Would this be true without the continuity assertion?) Exercise 368 Prove the following variant of the Weierstrass M-test: Let { fk } and {gk } be sequences of functions on an interval I. Suppose that | fk (x)| ≤ gk (x) for all k and ∞ x ∈ I and that ∑∞ k=1 gk converges uniformly on I. Then the series ∑k=1 fk converges uniformly on I. Exercise 369 Prove the following variant on Theorem 3.41: Let {ak } and {bk } be sequences of functions on an interval I. Suppose that ∑∞ k=1 ak (x) converges uniformly on I. Suppose that {bk } is monotone for each x ∈ I and uniformly bounded on E. Then the series ∑∞ k=1 ak bk converges uniformly on I.

3.7. SEQUENCES AND SERIES OF INTEGRALS

93

Exercise 370 Prove the following variant on Theorem 3.41: Let {ak } and {bk } be sequences of functions on an interval I. Suppose that there is a number M so that N ∑ ak (x) ≤ M k=1

for all x ∈ I and every integer N. Suppose that ∞

∑ |bk − bk+1|

k=1

converges uniformly on I and that bk → 0 uniformly on I. Then the series ∑∞ k=1 ak bk converges uniformly on I. Exercise 371 Prove the following variant on Abel’s test (Theorem 3.41): Let {ak (x)} and {bk (x)} be sequences of functions on an interval I. Suppose that ∑∞ k=1 ak (x) converges uniformly on I. Suppose that the series ∞

∑ |bk (x) − bk+1 (x)|

k=1

has uniformly bounded partial sums on I. Suppose that the sequence of functions {bk (x)} is uniformly bounded on I. Then the series ∑∞ k=1 ak (x)bk (x) converges uniformly on I. Exercise 372 Suppose that { fn (x)} is a sequence of continuous functions on an interval [a, b] converging uniformly to a function f on the open interval (a, b). If f is also continuous on [a, b], show that the convergence is uniform on [a, b]. Exercise 373 Suppose that { fn } is a sequence of functions converging uniformly to zero on an interval [a, b]. Show that limn→∞ fn (xn ) = 0 for every convergent sequence {xn } of points in [a, b]. Give an example to show that this statement may be false if fn → 0 merely pointwise. Exercise 374 Suppose that { fn } is a sequence of functions on an interval [a, b] with the property that limn→∞ fn (xn ) = 0 for every convergent sequence {xn } of points in [a, b]. Show that { fn } converges uniformly to zero on [a, b].

3.7.3 Uniform convergence and integrals We state our main theorem for continuous functions. We know that bounded, continuous functions are integrable and we have several tools that handle unbounded continuous functions.

CHAPTER 3. THE DEFINITE INTEGRAL

94

Theorem 3.42 (uniform convergence of sequences of continuous functions) Let f1 , f2 , f3 , . . . be a sequence of functions defined and continuous on an open interval (a, b). Suppose that { fn } converges uniformly on (a, b) to a function f . Then 1. f is continuous on (a, b). 2. If each fn is bounded on the interval (a, b) then so too is f . 3. For each closed, bounded interval [c, d] ⊂ (a, b) Z Z dn Z d o lim fn (x) dx = fn (x) dx = lim n→∞ c

c

n→∞

d

f (x) dx. c

4. If each fn is integrable on the interval [a, b] then so too is f and Z b Z bn Z b o f (x) dx. lim fn (x) dx = fn (x) dx = lim n→∞ a

a

n→∞

a

We have defined uniform convergence of series in a simple way, merely by requiring that the sequence of partial sums converges uniformly. Thus the Corollary follows immediately from the theorem applied to these partial sums. Corollary 3.43 (uniform convergence of series of continuous functions) Let g1 , g2 , g3 , . . . be a sequence of functions defined and continuous on an open interval (a, b). Suppose that the series ∑∞ k=1 gk converges uniformly on (a, b) to a function f . Then 1. f is continuous on (a, b). 2. For each closed, bounded interval [c, d] ⊂ (a, b) ) ( Z Z ∞

∑

d

d

gk (x) dx =

k=1 c

c

∞

∑ gk (x)

dx =

k=1

Z d

f (x) dx.

c

3. If each gk is integrable on the interval [a, b] then so too is f and ) ( Z Z Z ∞

∑

b

b

k=1 a

gk (x) dx =

a

∞

∑ gk (x)

k=1

b

f (x) dx.

dx =

a

Exercise 375 To prove Theorem 3.42 and its corollary is just a matter of putting together facts that we already know. Do this.

3.7.4 A defect of the calculus integral In the preceding section we have seen that uniform convergence of continuous functions allows for us to interchange the order of integration and limit to obtain the important formula Z b Z bn o f (x) dx = lim lim fn (x) dx. n n→∞ a

a

n→∞

Is this still true if we drop the assumption that the functions fn are continuous?

3.7. SEQUENCES AND SERIES OF INTEGRALS

95

We will prove one very weak theorem and give one counterexample to show that the class of integrable functions in the calculus sense is not closed under uniform limits8 . We will work on this problem again in Section 3.7.6 but we cannot completely handle the defect. We will remedy this defect of the calculus integral in Chapter 4. Theorem 3.44 Let f1 , f2 , f3 , . . . be a sequence of functions defined and integrable on a closed, bounded interval [a, b]. Suppose that { fn } converges uniformly on [a, b] to a function f . Then, provided we assume that f is integrable on [a, b], Z b

f (x) dx = lim

a

Exercise 376 Let

Z b

n→∞ a

( 0 gk (x) = 2−k

fn (x) dx.

if 0 ≤ x ≤ 1 − 1k if 1 − 1k < x ≤ 1.

Show that the series ∑∞ k=2 gk (x) of integrable functions converges uniformly on [0, 1] to Answer a function f that is not integrable in the calculus sense. Exercise 377 Prove Theorem 3.44.

Answer

3.7.5 Uniform limits of continuous derivatives We saw in Section 3.7.3 that a uniformly convergent sequence (or series) of continuous functions can be integrated term-by-term . As an application of our integration theorem we obtain a theorem on term-by-term differentiation. We write this in a form suggesting that the order of differentiation and limit is being reversed. Theorem 3.45 Let {Fn } be a sequence of uniformly continuous functions on an interval [a, b], suppose that each function has a continuous derivative Fn′ on (a, b), and suppose that 1. The sequence {Fn′ } of derivatives converges uniformly to a function on (a, b). 2. The sequence {Fn } converges pointwise to a function F. Then F is differentiable on (a, b) and, for all a < x < b, d d d F ′ (x) = F(x) = lim Fn (x) = lim Fn (x) = lim Fn′ (x). n→∞ dx n→∞ dx dx n→∞ For series, the theorem takes the following form: 8 Had

we chosen back in Section 2.1.1 to accept sequences of exceptional points rather than finite exceptional sets we would not have had this problem here.

CHAPTER 3. THE DEFINITE INTEGRAL

96

Corollary 3.46 Let {Gk } be a sequence of uniformly continuous functions on an interval [a,b], suppose that each function has a continuous derivative Fn′ on (a, b), and suppose that 1. F(x) = ∑∞ k=1 Gk (x) pointwise on [a, b]. ′ 2. ∑∞ k=0 Gk (x) converges uniformly on (a, b).

Then, for all a < x < b, ∞ ∞ d d ∞ d G (x) = G (x) = F ′ (x) = F(x) = ∑ k ∑ dx k ∑ G′k (x). dx dx k=1 k=1 k=1 Exercise 378 Using Theorem 3.42, prove Theorem 3.45. Exercise 379 Starting with the geometric series ∞ 1 = ∑ xk on (−1, 1), 1 − x k=0 show how to obtain

∞ 1 = ∑ kxk−1 (1 − x)2 k=1

on (−1, 1).

Answer

(3.4)

(3.5)

k−1 does not converge uniformly on (−1, 1). Is this trou[Note that the series ∑∞ k=1 kx blesome?] Answer

Exercise 380 Starting with the definition ex =

∞

xk

∑ k!

on (−∞, ∞),

(3.6)

k=0

show how to obtain

∞ k d x x e =∑ = ex dx k! k=0

on (−∞, ∞).

(3.7)

x [Note that the series ∑∞ k=1 k! does not converge uniformly on (−∞, ∞). Is this troublesome?] Answer k

sin nx Exercise 381 Can the sequence of functions fn (x) = 3 be differentiated term-byn term? ∞

Exercise 382 Can the series of functions

sin kx be differentiated term-by-term? 3 k=1 k

∑

Exercise 383 Verify that the function x2 x4 x6 x8 + + + + ... 1! 2! 3! 4! is a solution of the differential equation y′ = 2xy on (−∞, ∞) without first finding an explicit formula for y(x). y(x) = 1 +

3.8. THE MONOTONE CONVERGENCE THEOREM

97

3.7.6 Uniform limits of discontinuous derivatives The following theorem reduces the hypotheses of Theorem 3.45 and, accordingly is much more difficult to prove. Here we have dropped the continuity of the derivatives as an assumption. Theorem 3.47 Let { fn } be a sequence of uniformly continuous functions defined on an interval [a, b]. Suppose that fn′ (x) exists for each n and each x ∈ (a, b) except possibly for x in some finite set C. Suppose that the sequence { fn′ } of derivatives converges uniformly on (a, b) \C and that there exists at least one point x0 ∈ [a, b] such that the sequence of numbers { fn (x0 )} converges. Then the sequence { fn } converges uniformly to a function f on the interval [a, b], f is differentiable with, at each point x ∈ (a, b) \C, ′

f (x) =

lim f ′ (x) n→∞ n

and

lim

Z b

n→∞ a

fn′ (x) dx

=

Z b a

f ′ (x) dx.

Exercise 384 Prove Theorem 3.47.

Answer

Exercise 385 For infinite series, how can Theorem 3.47 be rewritten?

Answer

Exercise 386 (uniform limits of integrable functions) At first sight Theorem 3.47 seems to supply the following observation: If {gn } is a sequence of functions integrable in the calculus sense on an interval [a, b] and gn converges uniformly to a function g on [a, b] Answer then g must also be integrable. Is this true? Exercise 387 In the statement of Theorem 3.47 we hypothesized the existence of a single point x0 at which the sequence { fn (x0 )} converges. It then followed that the sequence { fn } converges on all of the interval I. If we drop that requirement but retain the requirement that the sequence { fn′ } converges uniformly to a function g on I, show that we cannot conclude that { fn } converges on I, but we can still conclude that there Answer exists f such that f ′ = g = limn→∞ fn′ on I.

3.8 The monotone convergence theorem Two of the most important computations with integrals are taking a limit inside an integral, Z b Z b fn (x) dx = lim lim fn (x) dx n→∞ a

a

n→∞

and summing a series inside an integral, ∞

∑

Z b

k=1 a

gk (x) dx =

Z b a

∞

!

∑ gk (x)

k=1

dx.

The counterexamples in Section 3.7.1, however, have made us very wary of doing this. The uniform convergence results of Section 3.7.5, on the other hand, have encouraged us to check for uniform convergence as a guarantee that these operations will be successful.

CHAPTER 3. THE DEFINITE INTEGRAL

98

But uniform convergence is not a necessary requirement. There are important weaker assumptions that will allow us to use sequence and series techniques on integrals. For sequences an assumption that the sequence is monotone will work. For series an assumption that the terms are nonnegative will work.

3.8.1 Summing inside the integral We establish that the summation formula ! Z b

a

∞

∞

∑ gk (x)

∑

dx =

n=1

n=1

Z

b

a

gk (x) dx

is possible for nonnegative functions. We need also to assume that the sum function f (x) = ∑∞ n=1 gk (x) is itself integrable since that cannot be deduced otherwise. This is just a defect in the calculus integral; in a more general theory of integration we would be able to conclude both that the sum is indeed integrable and also that the sum formula is correct. This defect is more serious than it might appear. In most applications the only thing we might know about the function ∞

∑ gk (x)

f (x) =

n=1

is that it is the sum of this series. We may not be able to check continuity and we certainly are unlikely to be able to find an indefinite integral. We split the statement into two lemmas for ease of proof. Together they supply the integration formula for the sum of nonnegative integrable functions. Lemma 3.48 Suppose that f , g1 , g2 , g3 ,. . . is a sequence of nonnegative functions, each one integrable on a closed bounded interval [a, b]. If, for all but finitely many x in (a, b) ∞

f (x) ≥ then

Z b a

∑ gk (x),

k=1

∞

f (x) dx ≥

∑

k=1

Z

b a

gk (x) dx .

(3.8)

Lemma 3.49 Suppose that f , g1 , g2 , g3 ,. . . is a sequence of nonnegative functions, each one integrable on a closed bounded interval [a, b]. If, for all but finitely many x in (a, b), ∞

f (x) ≤ then

Z b a

∑ gk (x),

k=1

∞

f (x) dx ≤

∑

k=1

Z

b a

gk (x) dx .

(3.9)

Exercise 388 In each of the lemmas show that we may assume, without loss of generality, that the inequalities ∞

f (x) ≤

∑ gk (x),

k=1

∞

or f (x) ≥

∑ gk (x),

k=1

3.9. INTEGRATION OF POWER SERIES

99

hold for all values of x in the entire interval [a, b].

Answer

Exercise 389 Prove the easier of the two lemmas.

Answer

Exercise 390 Prove Lemma 3.49, or rather give it a try and then consult the write up in the answer section. This is just an argument manipulating Riemann sums so it is not Answer particularly deep; even so it requires some care. Exercise 391 Construct an example of a convergent series of continuous functions that converges pointwise to a function that is not integrable in the calculus sense.

3.8.2 Monotone convergence theorem The series formula immediately supplies the monotone convergence theorem. Theorem 3.50 (Monotone convergence theorem) Let fn : [a, b] → R (n = 1, 2, 3, . . . ) be a nondecreasing sequence of functions, each integrable on the interval [a, b] and suppose that f (x) = lim fn (x) n→∞

for every x in [a, b] with possibly finitely many exceptions. Then, provided f is also integrable on [a, b], Z b

f (x) dx = lim

a

Z b

n→∞ a

fn (x) dx.

Exercise 392 Deduce Theorem 3.50 from Lemmas 3.48 and 3.49.

Answer

Exercise 393 Prove Theorem 3.50 directly by a suitable Riemann sums argument. Answer Exercise 394 Construct an example of a convergent, monotonic sequence of continuous functions that converges pointwise to a function that is not integrable in the calculus sense.

3.9 Integration of power series A power series is an infinite series of the form ∞

f (x) =

∑ an (x − c)n = a0 + a1(x − c)1 + a2(x − c)2 + a3(x − c)3 + · · ·

n=0

where an is called the coefficient of the nth term and c is a constant. One usually says that the series is centered at c. By a simple change of variables any power series can be centered at zero and so all of the theory is usually stated for such a power series ∞

f (x) =

∑ an xn = a0 + a1x + a2 x2 + a3x3 + . . . .

n=0

The set of points x where the series converges is called the interval of convergence. (We could call it a set of convergence, but we are anticipating that it will turn out to be an interval.)

CHAPTER 3. THE DEFINITE INTEGRAL

100

The main concern we shall have in this chapter is the integration of such series. The topic of power series in general is huge and central to much of mathematics. We can present a fairly narrow picture but one that is complete only insofar as applications of integration theory are concerned. Theorem 3.51 (convergence of power series) Let ∞

f (x) =

∑ anxn = a0 + a1x + a2x2 + a3x3 + . . . .

n=0

be a power series. Then there is a number R, 0 ≤ R ≤ ∞, called the radius of convergence of the series, so that 1. If R = 0 then the series converges only for x = 0. 2. If R > 0 the series converges absolutely for all x in the interval (−R, R). 3. If 0 < R < ∞ the interval of convergence for the series is one of the intervals (−R, R), (−R, R], [−R, R) or [−R, R] and at the endpoints the series may converge absolutely or nonabsolutely. The next theorem establishes the continuity of a power series within its interval of convergence. Theorem 3.52 (continuity of power series) Let ∞

f (x) =

∑ anxn = a0 + a1x + a2x2 + a3x3 + . . . .

n=0

be a power series with a radius of convergence R, 0 < R ≤ ∞. Then 1. f is a continuous function on its interval of convergence [i.e., continuous at all interior points and continuous on the right or left at an endpoint if that endpoint is included]. 2. If 0 < R < ∞ and the interval of convergence for the series is [−R, R] then f is uniformly continuous on [−R, R]. Finally we are in position to show that term-by-term integration of power series is possible in nearly all situations.

3.9. INTEGRATION OF POWER SERIES

101

Theorem 3.53 (integration of power series) Let ∞

∑ an xn = a0 + a1x + a2x2 + a3x3 + . . . .

f (x) =

n=0

be a power series and let ∞

F(x) =

an xn+1 ∑ n + 1 = a0 x + a1x2 /2 + a2 x3 /3 + a3x4 /4 + . . . . n=0

be its formally integrated series. Then 1. Both series have the same radius of convergence R, but not necessarily the same interval of convergence. 2. If R > 0 then F ′ (x) = f (x) for every x in (−R, R) and so F is an indefinite integral for f on the interval (−R, R). 3. f is integrable on any closed, bounded interval [a, b] ⊂ (−R, R) and Z b a

f (x) dx = F(b) − F(a).

4. If the interval of convergence of the integrated series for F is [−R, R] then f is integrable on [−R, R] and Z R

f (x) dx = F(R) − F(−R).

−R

5. If the interval of convergence of the integrated series for F is (−R, R] then f is integrable on [0, R] and Z R

f (x) dx = F(R) − F(0).

0

6. If the interval of convergence of the integrated series for F is [−R, R) then f is integrable on [−R, 0] and Z 0

f (x) dx = F(0) − F(−R).

−R

Note that the integration theorem uses the interval of convergence of the integrated series. It is not a concern whether the original series for f converges at the endpoints of the interval of convergence, but it is essential to look at these endpoints for the integrated series. The proofs of the separate statements in the two theorems appear in various of the exercises. Note that, while we are interested in integration problems here the proofs are all about derivatives; this is not surprising since the calculus integral itself is simply about derivatives. Exercise 395 Compute, if possible, the integrals ! Z Z 1

0

∞

∑x

n=0

n

0

dx and

−1

∞

∑x

n=0

n

!

dx. Answer

CHAPTER 3. THE DEFINITE INTEGRAL

102

Exercise 396 Repeat the previous exercise but use only the fact that ∞ 1 ∑ xn = 1 + x + x2 + x3 + x4 + · · · + = 1 − x . n=0

Answer

Is the answer the same?

Exercise 397 (careless student) “But,” says the careless student, “both of Exercises 395 and 396 are wrong surely. After all, the series f (x) = 1 + x + x2 + x3 + x4 + · · · +

converges only on the interval (−1, 1) and diverges at the endpoints x = 1 and x = −1 since 1 − 1 + 1 − 1 + 1 − 1 + 1 − 1 =? and

1 + 1 + 1 + 1 + · · ·+ = ∞.

You cannot expect to integrate on either of the intervals [−1, 0] or [0, 1].” What is your response? Answer

Exercise 398 (calculus student notation) For most calculus students it is tempting to write Z Z Z Z Z 2 3 2 a0 + a1 x + a2 x + a3 x + . . . dx = a0 dx+ a1 x dx+ a2 x dx+ a3 x3 dx+. . . . Answer

Is this a legitimate interpretation of this indefinite integral?

Exercise 399 (calculus student notation) For most calculus students it is tempting to write Z b Z b Z b Z b Z b a0 + a1 x + a2 x2 + a3 x3 + . . . dx = a0 dx+ a1 x dx+ a2 x2 dx+ a3 x3 dx+. . . . a

a

a

a

a

Answer

Is this a legitimate interpretation of this definite integral?

Exercise 400 Show that the series f (x) = 1 + 2x + 3x2 + 4x3 + . . . has a radius of convergence 1 and an interval of convergence exactly equal to (−1, 1). Show that f is not integrable on [0, 1], but that it is integrable [−1, 0] and yet the computation Z 0 Z 0 Z 0 Z 0 Z 0 1 + 2x + 3x2 + 4x3 + . . . dx = dx + 2x dx + 3x2 dx + 4x3 dx + . . . −1

−1

−1

= −1 + 1 − 1 + 1 − 1 + 1 − . . .

−1

−1

cannot be used to evaluate the integral.

Note: Since the interval of convergence of the integrated series is also (−1, 1), Theorem 3.53 has nothing to say about whether f is integrable on [0, 1] or [−1, 0]. Answer

3.9. INTEGRATION OF POWER SERIES

103

Exercise 401 Determine the radius of convergence of the series ∞

∑ kk xk = x + 4x2 + 27x3 + . . . .

k=1

Answer Exercise 402 Show that, for every 0 ≤ s ≤ ∞, there is a power series whose radius of Answer convergence R is exactly s. Exercise 403 Show that the radius of convergence of a series a0 + a1 x + a2 x2 + a3 x3 + . . . can be described as

∞

R = sup{r : 0 < r and

∑ ak rk converges}.

k=0

Exercise 404 (root test for power series) Show that the radius of convergence of a series a0 + a1 x + a2 x2 + a3 x3 + . . . is given by the formula R=

1 lim supk→∞

p k

|ak |

.

Exercise 405 Show that the radius of convergence of the series a0 + a1 x + a2 x2 + a3 x3 + . . . is the same as the radius of convergence of the formally differentiated series a1 + 2a2 x + 3a3 x2 + 4a4 x3 + . . . .

Exercise 406 Show that the radius of convergence of the series a0 + a1 x + a2 x2 + a3 x3 + . . . is the same as the radius of convergence of the formally integrated series a0 x + a1 x2 /2 + a2 x3 /3 + a3 x4 /4 + . . . . Answer Exercise 407 (ratio test for power series) Show that the radius of convergence of the series a0 + a1 x + a2 x2 + a3 x3 + . . . is given by the formula ak , R = lim k→∞ ak+1

assuming that this limit exists or equals ∞.

Answer

CHAPTER 3. THE DEFINITE INTEGRAL

104

Exercise 408 (ratio/root test for power series) Give an example of a power series for which the radius of convergence R satisfies 1 p R= limk→∞ k |ak | but ak lim k→∞ ak+1 does not exist. Answer Exercise 409 (ratio test for power series) Give an example of a power series for which the radius of convergence R satisfies ak+1 ak+1 . lim inf < R < lim sup k→∞ ak ak k→∞

Note: for such a series the ratio test cannot give a satisfactory estimate of the radius of Answer convergence. Exercise 410 If the coefficients {ak } of a power series

a0 + a1 x + a2 x2 + a3 x3 + . . .

form a bounded sequence show that the radius of convergence is at least 1.

Answer

Exercise 411 If the coefficients {ak } of a power series

a0 + a1 x + a2 x2 + a3 x3 + . . .

form an unbounded sequence show that the radius of convergence is no more than 1. Answer Exercise 412 If the power series a0 + a1 x + a2 x2 + a3 x3 + . . . has a radius of convergence Ra and the power series b0 + b1 x + b2 x2 + b3 x3 + . . . has a radius of convergence Rb and |ak | ≤ |bk | for all k sufficiently large, what relation Answer must hold between Ra and Rb ? Exercise 413 If the power series a0 + a1 x + a2 x2 + a3 x3 + . . . has a radius of convergence R, what must be the radius of convergence of the series a0 + a1 x2 + a2 x4 + a3 x6 + . . . Answer Exercise 414 Suppose that the series a0 + a1 x + a2 x2 + a3 x3 + . . . has a finite radius of convergence R and suppose that |x0 | > R. Show that, not only does a0 + a1 x0 + a2 x20 + a3 x30 + . . .

diverge but that limn→∞ |an xn0 | = ∞.

3.9. INTEGRATION OF POWER SERIES

105

Exercise 415 Suppose that the series a0 + a1 x + a2 x2 + a3 x3 + . . . has a positive radius of convergence R. Use the Weierstrass M-test to show that the series converges uniformly on any closed, bounded subinterval [a, b] ⊂ (−R, R). Exercise 416 Suppose that the series f (x) = a0 + a1 x + a2 x2 + a3 x3 + . . . has a positive radius of convergence R. Use Exercise 415 to show that f is differentiable on (−R, R) and that, for all x in that interval, f ′ (x) = a1 + 2a2 x + 3a3 x2 + 4a4 x3 + . . . .

Exercise 417 Suppose that the series f (x) = a0 + a1 x + a2 x2 + a3 x3 + . . . has a positive radius of convergence R. Use Exercise 416 to show that f has an indefinite integral on (−R, R) given by the function F(x) = a0 x + a1 x2 /2 + a2 x3 /3 + a3 x4 /4 + . . . .

Exercise 418 Suppose that the series f (x) = a0 + a1 x + a2 x2 + a3 x3 + . . . has a positive, finite radius of convergence R and that the series converges absolutely at one of the two endpoints R or −R of the interval of convergence. Use the Weierstrass M-test to show that the series converges uniformly on [−R, R]. Deduce from this that f ′ is integrable on [−R, R]. Note: this is the best that the Weierstrass M-test can do applied to power series. If the series converges nonabsolutely at one of the two endpoints R or −R of the interval then Answer the test does not help. Exercise 419 Suppose that the series f (x) = a0 + a1 x + a2 x2 + a3 x3 + . . . has a positive, finite radius of convergence R and that the series converges nonabsolutely at one of the two endpoints R or −R of the interval of convergence. Use a variant of the Abel test for uniform convergence to show that the series converges uniformly on any closed subinterval [a, b] of the interval of convergence. Deduce from this that f ′ is integrable on any such interval [a, b]. Note: this completes the picture for the integrability problem of this section. Answer Exercise 420 What power series will converge uniformly on (−∞, ∞)?

CHAPTER 3. THE DEFINITE INTEGRAL

106

k Exercise 421 Show that if ∑∞ k=0 ak x converges uniformly on an interval (−r, r), then it must in fact converge uniformly on [−r, r]. Deduce that if the interval of convergence is exactly of the form (−R, R), or [−R, R) or [−R, R), then the series cannot converge uniformly on the entire interval of convergence. Answer

Exercise 422 Suppose that a function f (x) has two power series representations f (x) = a0 + a1 x + a2 x2 + a3 x3 + . . . and f (x) = b0 + b1 x + b2 x2 + b3 x3 + . . . both valid at least in some interval (−r, r) for r > 0. What can you conclude? Exercise 423 Suppose that a function f (x) has a power series representations f (x) = a0 + a1 x + a2 x2 + a3 x3 + . . . valid at least in some interval (−r, r) for r > 0. Show that, for each k = 0, 1, 2, 3, . . . , f (k) (0) . k!

ak =

Exercise 424 In view of Exercise 423 it would seem that we must have the formula ∞

f (x) =

∑

k=0

f (k) (0) k x k!

provided only that the function f is infinitely often differentiable at x = 0. Is this a Answer correct observation?

3.10 Applications of the integral It would be presumptuous to try to teach here applications of the integral, since those applications are nearly unlimited. But here are a few that follow a simple theme and are traditionally taught in all calculus courses. The theme takes advantage of the fact that an integral can (under certain hypotheses) be approximated by a Riemann sum Z b a

n

f (x) dx ≈ ∑ f (ξi )(xi − xi−1 ). i=1

If there is an application where some concept can be expressed as a limiting version of sums of this type, then that concept can be captured by an integral. Whatever the concept is, it must be necessarily “additive” and expressible as sums of products that can be interpreted as f (ξi ) × (xi − xi−1 ). The simplest illustration is area. We normally think of area as additive. We can interpret the product f (ξi ) × (xi − xi−1 ).

3.10. APPLICATIONS OF THE INTEGRAL

107

as the area of a rectangle with length (xi − xi−1 ) and height f (ξi ). The Riemann sum itself then is a sum of areas of rectangles. If we can determine that the area of some figure is approximated by such a sum, then the area can be described completely by an integral. For applications in physics one might use t as a time variable and then interpret Z b a

n

f (t) dt ≈ ∑ f (τi )(ti − ti−1 ) i=1

thinking of f (τi ) as some measurement (e.g., velocity, acceleration, force) that is occurring throughout the time interval [ti−1 ,ti ]. An accumulation point of view For many applications of the calculus the Riemann sum approach is an attractive way of expressing the concepts that arise as a definite integral. There is another way which bypasses Riemann sums and goes directly back to the definition of the integral as an antiderivative. We can write this method using the slogan Z x+h a

f (t) dt −

Z x a

f (t) dt ≈ f (ξ) × h.

(3.10)

Suppose that a concept we are trying to measure can be captured by a function A(x) on some interval [a, b]. We suppose that we have already measured A(x) and now wish to add on a bit more to get to A(x + h) where h is small. We imagine the new amount that we must add on can be expressed as f (ξ) × h

thinking of f (ξ) as some measurement that is occurring throughout the interval [x, x + R h]. In that case our model for the concept is the integral ab f (t) dt. This is because (3.10) suggests that A′ (x) = f (x).

3.10.1 Area and the method of exhaustion There is a long historical and cultural connection between the theory of integration and the geometrical theory of area. Usually one takes the following as the primary definition of area. Definition 3.54 Let f : [a, b] → R be an integrable, nonnegative function and suppose that R( f , a, b) denotes the region in the plane bounded on the left by the line x = a, on the right by the line x = b, on the bottom by the line y = 0 and on the top by the graph of the function f (i.e., by y = f (x)). Then this region is said to have an area and value of that area is assigned to be Z b

f (x) dx.

a

The region can also be described by writing it as a set of points: R( f , a, b) = {(x, y) : a ≤ x ≤ b, 0 ≤ y ≤ f (x)}.

We can justify this definition by the method of Riemann sums combined with a method of the ancient Greeks known as the method of exhaustion of areas.

CHAPTER 3. THE DEFINITE INTEGRAL

108

Let us suppose that f : [a, b] → R is a uniformly continuous, nonnegative function and suppose that R( f , a, b) is the region as described above. Take any subdivision a = x0 < x1 < x2 < · · · < xn−1 < xn = b

Then there must exist points ξi , ηi ∈ [xi−1 , xi ] for i = 1, 2, . . . , n so that f (ξi ) is the maximum value of f in the interval [xi−1 , xi ] and f (ηi ) is the minimum value of f in that interval. We consider the two partitions {([xi , xi−1 ], ξi ) : i = 1, 2, . . . n} and {([xi , xi−1 ], ηi ) : i = 1, 2, . . . n}

and the two corresponding Riemann sums

n

n

∑ f (ξi )(xi − xi−1)

i=1

and Rb

∑ f (ηi )(xi − xi−1).

i=1

The larger sum is greater than the integral a f (x) dx and the smaller sum is lesser than that number. This is because there is a choice of points ξ∗i that is exactly equal to the integral, Z b a

n

f (x) dx = ∑ f (ξ∗i )(xi − xi−1 ) i=1

f (ξ∗i ) ≤

f (ξi ). (See Section 3.5.2.) and here we have f (ηi ) ≤ But if the region were to have an “area” we would expect that area is also between these two sums. That is because the larger sum represents the area of a collection of n rectangles that include our region and the smaller sum represents the area of a collection of n rectangles that are included inside our region. If we consider all possible subdivisions then the same situation holds: the area of the region (if it has one) must lie between the upper sums and the lower sums. But according to Theorem 3.19 the only Rb number with this property is the integral a f (x) dx itself. Certainly then, for continuous functions anyway, this definition of the area of such a region would be compatible with any other theory of area. Exercise 425 (an accumulation argument) Here is another way to argue that integration theory and area theory must be closely related. Imagine that area has some (at the moment) vague meaning to you. Let f : [a, b] → R be a uniformly continuous, nonnegative function. For any a ≤ s < t ≤ b let A( f , s,t) denote the area of the region in the plane bounded on the left by the line x = s, on the right by the line x = t, on the bottom by the line y = 0 and on the top by the curve y = f (x). Argue for each of the following statements: 1. A( f , a, s) + A( f , s,t) = A( f , a,t). 2. If m ≤ f (x) ≤ M for all s ≤ x ≤ t then m(t − s) ≤ A( f , s,t) ≤ M(t − s). 3. At any point a < x < b, d A( f , a, x) = f (x). dx 4. At any point a < x < b, A( f , a, x) =

Z x a

f (t) dt. Answer

3.10. APPLICATIONS OF THE INTEGRAL

109

Exercise 426 Show that the area of the triangle {(x, y) : a ≤ x ≤ b, 0 ≤ y ≤ m(x − a)}.

is exactly as you would normally have computed it precalculus. Exercise 427 Show that the area of the trapezium {(x, y) : a ≤ x ≤ b, 0 ≤ y ≤ c + m(x − a)}.

is exactly as you would normally have computed it precalculus. Exercise 428 Show that the area of the half-circle {(x, y) : −1 ≤ x ≤ 1, 0 ≤ y ≤

p

1 − x2 }.

is exactly as you would normally have computed it precalculus.

Answer

Exercise 429 One usually takes this definition for the area between two curves: Definition 3.55 Let f , g : [a, b] → R be integrable functions and suppose that f (x) ≥ g(x) for all a ≤ x ≤ b. Let R( f , g, a, b) denote the region in the plane bounded on the left by the line x = a, on the right by the line x = b, on the bottom by the curve y = g(x) and on the top by the curve by y = f (x). Then this region is said to have an area and value of that area is assigned to be Z b a

[ f (x) − g(x)] dx.

Use this definition to find the area inside the circle x2 + y2 = r2 .

Answer

Exercise 430 Using Definition 3.55 compute the area between the graphs of the functions g(x) = 1 + x2 and h(x) = 2x2 on [0, 1]. Explain why the Riemann sum n

∑ [g(ξi ) − h(ξi)](xi − xi−1)

i=1

R

and the corresponding integral 01 [g(x)−h(x)] dx cannot be interpreted using the method of exhaustion to be computing both upper and lower bounds for this area. Discuss. Answer Exercise 431 In Figure 3.4 we show graphically how to interpret the area that is repR ∞ −2 resented by 1 x dx. Note that Z 2 1

x−2 dx = 1/2,

and so we would expect

Z ∞ 1

Check that this is true.

Z 4 2

x−2 dx = 1/4,

Z 8 4

x−2 dx = 1/8

x−2 dx = 1/2 + 1/4 + 1/8 + . . . . Answer

CHAPTER 3. THE DEFINITE INTEGRAL

110 1

1 2

4

8 b

Figure 3.4: Computation of an area by

Z ∞ 1

x−2 dx.

3.10.2 Volume A full treatment of the problem of defining and calculating volumes is outside the scope of a calculus course that focuses only on integrals of this type: Z b

f (x) dx.

a

But if the problem addresses a very special type of volume, those volumes obtained by rotating a curve about some line, then often the formula π

Z b

[ f (x)]2 dx

a

can be interpreted as providing the correct volume interpretation and computation. Once again the justification is the method of exhaustion. We assume that volumes, like areas, are additive. We assume that a correct computation of the volume of cylinder that has radius r and height h is πr2 h. In particular the volume of a cylinder that has radius f (ξi ) and height (xi − xi−1 ) is π[ f (ξi )]2 (xi − xi−1 ).

The total volume for a collection of such cylinders would be (since we assume volume is additive) n

π ∑ [ f (ξi )]2 (xi − xi−1 ). i=1

We then have a connection with the formula π

Z b

[ f (x)]2 dx.

a

One example with suitable pictures illustrates the method. Take the graph of the function f (x) = sin x on the interval [0, π] and rotate it (into three dimensional space) around the x-axis. Figure 3.5 shows the football (i.e., American football) shaped object. Subdivide the interval [0, π], 0 = x0 < x1 < x2 < · · · < xn−1 < xn = π

Then there must exist points ξi , ηi ∈ [xi−1 , xi ] for i = 1, 2, . . . , n so that sin(ξi ) is the maximum value of sin x in the interval [xi−1 , xi ] and sin(ηi ) is the minimum value of sin x in that interval. We consider the two partitions {([xi , xi−1 ], ξi ) : i = 1, 2, . . . n} and {([xi , xi−1 ], ηi ) : i = 1, 2, . . . n}

3.10. APPLICATIONS OF THE INTEGRAL

111

Figure 3.5: sin x rotated around the x-axis. and the two corresponding Riemann sums n

n

i=1

i=1

π ∑ sin2 (ξi )(xi − xi−1 ) and π ∑ sin2 (ηi )(xi − xi−1 ). The “football” is entirely contained inside the cylinders representing the first sum and the cylinders representing the second sum are entirely inside the football. There is only one value that lies between these sums for all possible choice of partition, namely the number Z π

π

[sin x]2 dx.

0

We know this because this integral can be uniformly approximated by Riemann sums. The method of exhaustion then claims that the volume of the football must be this number. In general this argument justifies the following working definition. This is the analogue for volumes of revolution of Definition 3.55 . Definition 3.56 Let f and g be continuous, nonnegative functions on an interval [a, b] and suppose that g(x) ≤ f (x) for all a ≤ x ≤ b. Then the volume of the solid obtained by rotating the region between the two curves y = f (x) and y = g(x) about the x-axis is given by Z b π [ f (x)]2 − [g(x)]2 dx. a

Exercise 432 (shell method) There is a similar formula for a volume of revolution when the curve y = f (x) on [a, b] (with a < 0) is rotated about the y-axis. One can R either readjust by interchanging x and y to get a formula of the form π cd [g(y)]2 dy or use the so-called shell method that has a formula 2π

Z b a

x × h dx

where h is a height measurement in the shell method. Investigate. Exercise 433 (surface area) If a nonnegative function y = f (x) is continuously differentiable throughout the interval [a, b], then the formula for the area of the surface generated by revolving the curve about the x-axis is generally claimed to be Z b q 2 f (x) 1 + [ f ′ (x)]2 dx. π a

CHAPTER 3. THE DEFINITE INTEGRAL

112

Using the same football studied in this section how could you justify this formula.

3.10.3 Length of a curve In mathematics a curve [sometimes called a parametric curve] is a pair of uniformly continuous functions F, G defined on an interval [a, b]. The points (F(t), G(t)) in the plane are considered to trace out the curve as t moves from the endpoint a to the endpoint b. The curve is thought of as a mapping taking points in the interval [a, b] to corresponding points in the plane. Elementary courses often express the curve this way, x = F(t), y = G(t) a ≤ t ≤ b,

referring to the two equations as parametric equations for the curve and to the variable t as a parameter. The set of points {(x, y) : x = F(t), y = G(t), a ≤ t ≤ b}

is called the graph of the curve. It is not the curve itself but, for novices, it may be difficult to make this distinction. The curve is thought to be oriented in the sense that as t moves in its positive direction [i.e., from a to b] the curve is traced out in that order. Any point on the curve may be covered many times by the curve itself; the curve can cross itself or be very complicated indeed, even though the graph might be simple. For example, take any continuous function F on [0, 1] with F(0) = 0 and F(1) = 1 and 0 ≤ F(x) ≤ 1 for 0 ≤ x ≤ 1. Then the curve (F(t), F(t)) traces out the points on the line connecting (0, 0) to (1, 1). But the points can be traced and retraced many times and the “trip” itself may have infinite √ length. All this even though the line segment itself is simple and short (it has length 2). The length of a curve is defined by estimating the length of the route taken by the curve by approximating its length by a polygonal path. Subdivide the interval a = t0 < t1 < t2 < · · · < tn−1 < tn = b

and then just compute the length of a trip to visit each of the points (F(a), G(a)), (F(t1 ), G(t1 )), (F(t2 ), G(t2 )), . . . , (F(b), G(a)) in that order. The definition should resemble our definition of a function of bounded variation and, indeed, the two ideas are very closely related. Definition 3.57 (rectifiable curve) A curve give by a pair of functions F, G : [a, b] → R is said to be rectifiable if there is a number M so that n q ∑ [F(ti ) − F(ti−1 )]2 + [G(ti) − G(ti−1)]2 ≤ M i=1

for all choices of points

a = t0 < t1 < t2 < · · · < tn−1 < tn = b.

The least such number M is called the length of the curve.

Exercise 434 Show that a curve given by a pair of uniformly continuous functions F, G : [a, b] → R is rectifiable if and only if both functions F and G have bounded variation

3.10. APPLICATIONS OF THE INTEGRAL

113

on [a, b]. Obtain, moreover, that the length L of the curve must satisfy max{V (F, [a, b]),V (G, [a, b])} ≤ L ≤ V (F, [a, b]) +V (G, [a, b]).

Answer

Exercise 435 Prove the following theorem which supplies the familiar integral formula for the length of a curve. Theorem 3.58 Suppose that a curve is given by a pair of uniformly continuous functions F, G : [a, b] → R and suppose that both F and G have bounded, continuous derivatives at every point of (a, b) with possibly finitely many exceptions. Then the curve is rectifiable and, moreover, the length L of the curve must satisfy Z bq [F ′ (t)]2 + [G′ (t)]2 dt. L= a

Answer

Exercise 436 Take any continuous function F on [0, 1] with F(0) = 0 and F(1) = 1 and 0 ≤ F(x)1 ≤ for 0 ≤ x ≤ 1. Then the curve (F(t), F (t)) traces out the points on the line segment connecting (0, 0) to (1, 1). Why does the graph of the curve contain all Answer points on the line segment? Exercise 437 Find an example of a continuous function F on [0, 1] with F(0) = 0 and F(1) = 1 and 0 ≤ F(x)1 ≤ for 0 ≤ x ≤ 1 such that the curve (F(t), F(t)) has infinite length. Can you find an example where the length is√ 2? Can you find one where the length is 1?. Which choices will have length equal to 2 which is, after all, the actual length of the graph of the curve? Exercise 438 A curve in three dimensional space is a triple of uniformly continuous functions (F(t), G(t), H(t)) defined on an interval [a, b]. Generalize to the theory of such curves the notions presented in this section for curves in the plane. Exercise 439 The graph of a uniformly continuous function f : [a, b] → R may be considered a curve in this sense using the pair of functions F(t) = t, G(t) = f (t) for a ≤ t ≤ b. This curve has for its graph precisely the graph of the function, i.e., the set {(x, y) : y = f (x) a ≤ x ≤ b}.

Under this interpretation the graph of the function has a length if this curve has a length. Discuss. Answer Exercise 440 Find the length of the graph of the function 1 f (x) = (ex + e−x ), 0 ≤ x ≤ 2. 2 [The answer is 21 (e2 − e−2 ). This is a typical question in a calculus course, chosen not because the curve is of great interest, but because it is one of the very few examples that Answer can be computed by hand.]

CHAPTER 3. THE DEFINITE INTEGRAL

114

3.11 Numerical methods This is a big subject with many ideas and many pitfalls. As a calculus student you are mainly [but check with your instructor] responsible for learning a few standard methods, eg., the trapezoidal rule and Simpson’s rule. In any practical situation where numbers are needed how might we compute Z b

f (x) dx?

a

The computation of any integral would seem (judging by the definition) to require first obtaining an indefinite integral F [checking to see it is continuous, of course, and that F ′ (x) = f (x) at all but finitely many points in (a, b)]. Then the formula Z b a

f (x) dx = F(b) − F(a)

would give the precise value. But finding an indefinite integral may be impractical. There must be an indefinite integral if the integral exists, but that does not mean that it must be given by an accessible formula or that we would have the skills to find it. The history of our subject is very long so many problems have already been solved but finding antiderivatives is most often not the best method even when it is possible to carry it out. R Finding a close enough value for ab f (x) dx may be considerably easier and less time consuming than finding an indefinite integral. The former is just a number, the latter is a function, possibly mysterious. Just use Riemann sums? If we have no knowledge whatever about the function f beyond the fact that it is bounded and continuous mostly everywhere then to estimate Rb a f (x) dx we could simply use Riemann sums. Divide the interval [a, b] into pieces of equal length h a < a + h < a + 2h < a + 3h < a + (n − 1)h < b.

Here there are n−1 pieces of equal length and the last piece, the nth piece, has (perhaps) smaller length b − (a + (n − 1)h ≤ h. Then

Z b a

f (x) dx ≈ [ f (ξ1 ) + f (ξ2 ) + . . . f (ξn−1 )]h + f (ξn )[b − (a + (n − 1)h].

We do know that, for small enough h, the approximation is as close as we please to the actual value. And we can estimate the error if we know the oscillation of the function in each of these intervals. If we were to use this in practice then the computation is simpler if we choose always ξi as an endpoint of the corresponding interval and we choose for h only lengths (b − a)/n so that all the pieces have equal length. The methods that follow are better for functions that arise in real applications, but if we want a method that works for all continuous functions, there is no guarantee that any other method would surpass this very naive method.

3.11. NUMERICAL METHODS

115

Trapezoidal rule Here is the (current) Wikipedia statement of the rule: In mathematics, the trapezoidal rule (also known as the trapezoid rule, or the trapezium rule in British English) is a way to approximately calculate the definite integral Z b

f (x) dx.

a

The trapezoidal rule works by approximating the region under the graph of the function f(x) by a trapezoid and calculating its area. It follows that Z b f (a) + f (b) . f (x) dx ≈ (b − a) 2 a To calculate this integral more accurately, one first splits the interval of integration [a,b] into n smaller subintervals, and then applies the trapezoidal rule on each of them. One obtains the composite trapezoidal rule: " # Z b b − a f (a) + f (b) n−1 b−a f (x) dx ≈ + ∑ f a+k . n 2 n a k=1 This can alternatively be written as: Z b b−a ( f (x0 ) + 2 f (x1 ) + 2 f (x2 ) + · · · + 2 f (xn−1 ) + f (xn )) f (x) dx ≈ 2n a where b−a xk = a + k , for k = 0, 1, . . . , n n The error of the composite trapezoidal rule is the difference between the value of the integral and the numerical result: " # Z b b−a b − a f (a) + f (b) n−1 + ∑ f a+k . f (x) dx − error = n 2 n a k=1 This error can be written as (b − a)3 ′′ f (ξ), 12n2 where ξ is some number between a and b. error = −

It follows that if the integrand is concave up (and thus has a positive second derivative), then the error is negative and the trapezoidal rule overestimates the true value. This can also been seen from the geometric picture: the trapezoids include all of the area under the curve and extend over it. Similarly, a concave-down function yields an underestimate because area is unaccounted for under the curve, but none is counted above. If the interval of the integral being approximated includes an inflection point, then the error is harder to identify.

Simpson’s rule Simpson’s rule is another method for numerical approximation of definite integrals. The approximation on a single interval uses the endpoints and the midpoint. In place of a trapezoidal approximation, an approximation using quadratics

CHAPTER 3. THE DEFINITE INTEGRAL

116 produces:

Z b

a+b b−a f (a) + 4 f + f (b) . f (x) dx ≈ 6 2 a It is named after the English mathematician Thomas Simpson (1710–1761). An extended version of the rule for f (x) tabulated at 2n evenly spaced points a distance h apart, a = x0 < x1 < · · · < x2n = b is

Z x2n

h f (x) dx = [ f0 + 4( f1 + f3 + ... + f2n−1 ) + 2( f2 + f4 + ... + f2n−2 ) + f2n ] − Rn , 3 x0 where fi = f (xi ) and where the remainder term is Rn =

nh5 f ′′′′ (ξ) 90

for some ξ ∈ [x0 , x2n ]. Exercise 441 Show that the trapezoidal rule can be interpreted as asserting that a reasonable computation of the mean value of a function on an interval, Z b 1 f (x) dx, b−a a is simply to average the values of the function at the two endpoints. Answer Exercise 442 Establish the identity Z b Z f (a) + f (b) 1 b f (x) dx = (b − a) − (x − a)(b − x) f ′′ (x) dx 2 2 a a Answer under suitable hypotheses on f . Exercise 443 Establish the identity Z b

f (a) + f (b) (b − a)3 f ′′ (ξ) (b − a) = − 2 12 a for some point a < ξ < b, under suitable hypotheses on f . f (x) dx −

Exercise 444 Establish the inequality Z b (b − a)2 Z b ′′ f (a) + f (b) ≤ f (x) dx − (b − a) | f (x)| dx. a 2 8 a under suitable hypotheses on f .

Answer

Answer

Exercise 445 Prove the following theorem and use it to provide the estimate for the error given in the text for an application of the trapezoidal rule.

3.11. NUMERICAL METHODS

117

Theorem 3.59 Suppose that f is twice continuously differentiable at all points of the interval [a, b]. Let " # b−a b − a f (a) + f (b) n−1 + ∑ f a+k Tn = n 2 n k=1 denote the usual trapezoidal sum for f . Then Z b a

n

(b − a)3 ′′ f (ξi ) 3 k=1 12n

f (x) dx − Tn = − ∑

for appropriately chosen points ξi in each interval i(b − a) (i − 1)(b − a) ,a+ [xi−1 , xi ] = a + n n

(i = 1, 2, 3, . . . , n) Answer

Exercise 446 Prove the following theorem which elaborates on the error in the trapezoidal rule. Theorem 3.60 Suppose that f is twice continuously differentiable at all points of the interval [a, b]. Let " # b − a f (a) + f (b) n−1 b−a Tn = + ∑ f a+k n 2 n k=1 denote the usual trapezoidal sum for f . Show that the error term for using Tn to R estimate ab f (x) dx is approximately −

(b − a)2 ′ [ f (b) − f ′ (a)]. 12n2

Answer Exercise 447 The integral

Z 1

2

ex dx = 1.462651746

0

is correct to nine decimal places. The trapezoidal rule, for n = 1, 2 would give Z 1 0

and

Z 1

2

ex dx ≈

e0 + e1 = 1.859140914 2

e0 + 2e1/2 + e1 = 1.753931093. 4 0 At what stage in the trapezoidal rule would the approximation be correct to nine decimal places? Answer 2

ex dx ≈

3.11.1 Maple methods With the advent of computer algebra packages like Maple and Mathematica one does not need to gain any expertise in computation to perform definite and indefinite integration. The reason, then, why we still drill our students on these methods is to produce

CHAPTER 3. THE DEFINITE INTEGRAL

118

an intelligent and informed user of mathematics. To illustrate here is a short Maple session on a unix computer named dogwood. After giving the maple command we are in Maple and have asked it to do some calculus questions for us. Specifically we are seeking Z

x2 dx,

Z 2

x2 /dx,

0

Z

sin(4x) dx, and

Z

x[3x2 + 2]5/3 dx.

All of these can be determined by hand using the standard methods taught for generations in calculus courses. Note that Maple is indifferent to our requirement that constants of integration should always be specified or that the interval of indefinite integration should be acknowledged. [31]dogwood% |\^/| ._|\| |/|_. \ MAPLE /

| > int(x^2,x);

maple Maple 12 (SUN SPARC SOLARIS) Copyright (c) Maplesoft, a division of Waterloo Maple Inc. 2008 All rights reserved. Maple is a trademark of Waterloo Maple Inc. Type ? for help. 3 x ---3

> int(x^2,x=0..2); 8/3 >

int(sin(4*x),x); -1/4 cos(4 x)

>

int(x*(3*x^2+2)^(5/3),x); 2 8/3 (3 x + 2) ------------16

If we go on to ask problems that would not normally be asked on a calculus examination then the answer may be more surprising. There is no simple expression of the R indefinite integral cos x3 dx and consequently Maple will not find a method. The first R try to obtain a precise value for 01 cos x3 dx produces > int(cos(x^3),x=0..1); memory used=3.8MB, alloc=3.0MB, time=0.36 memory used=7.6MB, alloc=5.4MB, time=0.77

1/2 1/6 Pi

/ 2/3 2/3 (1/3) | 2 sin(1) 2 2 (-3/2 cos(1) + 3/2 sin(1)) 2 |30/7 ----------- - --------------------------------| 1/2 1/2 \ Pi Pi

2/3 2/3 \ 2 sin(1) LommelS1(11/6, 3/2, 1) 3 2 (cos(1) - sin(1)) LommelS1(5/6, 1/2, 1)| - 9/7 ---------------------------------- - ----------------------------------------------|

3.12. MORE EXERCISES

119

1/2 Pi

1/2 Pi

The second try asks Maple to give a numerical approximation. Maple uses a numerical integration routine with automatic error control to evaluate definite integrals that it cannot do analytically. > evalf(int(cos(x^3),x=0..1)); 0.9317044407

R

Thus we can be assured that 01 cos x3 dx = 0.9317044407 correct to 10 decimal places. In short, with access to such computer methods, we can be sure that our time in studying integration theory is best spent on learning the theory so that we will understand what we are doing when we ask a computer to make calculations for us.

3.11.2 Maple and infinite integrals For numerical computations of infinite integrals one can again turn to computer algebra packages. Here is a short Maple session that computes the infinite integrals Z ∞ 0

−x

e

dx,

Z ∞

−x

xe

0

dx,

Z ∞

3 −x

x e

0

dx, and

Z ∞ 0

x10 e−x dx.

We have all the tools to do these by hand, but computer methods are rather faster. [32]dogwood% maple |\^/| Maple 12 (SUN SPARC SOLARIS) ._|\| |/|_. Copyright (c) Maplesoft, a division of Waterloo Maple Inc. 2008 \ MAPLE / All rights reserved. Maple is a trademark of Waterloo Maple Inc. | Type ? for help. > int( exp(-x), x=0..infinity ); 1 > int(x* exp(-x), x=0..infinity ); 1 > int(x^3* exp(-x), x=0..infinity ); 6

> int(x^10* exp(-x), x=0..infinity ); 3628800

Exercise 448 Show that

Z ∞ 0

xn e−x dx = n!.

Answer

3.12 More Exercises Exercise 449 If f is continuous on an interval [a, b] and Z b

f (x)g(x) dx = 0

a

for every continuous function g on [a, b] show that f is identically equal to zero there.

| /

CHAPTER 3. THE DEFINITE INTEGRAL

120

Exercise 450 ( (Cauchy-Schwarz inequality)) If f and g are continuous on an interval [a, b] show that Z b 2 Z b Z b 2 2 [g(x)] dx . [ f (x)] dx f (x)g(x) dx ≤ a

a

a

Answer

Exercise 451 In elementary calculus classes it is sometimes convenient to define the natural logarithm by using the integration theory, log x =

Z x

dx.

1

Taking this as a definition, not a computation, use the properties of integrals to develop Answer the properties of the logarithm function. Exercise 452 Let f be a continuous function on [1, ∞) such that limx→∞ f (x) = α. R Show that if the integral 1∞ f (x) dx converges, then α must be 0. Exercise 453 Let f be a continuous function on [1, ∞) such that the integral converges. Can you conclude that limx→∞ f (x) = 0?

R∞ 1

f (x) dx

Chapter 4

Beyond the calculus integral Our goal in this final chapter is to develop the modern integral by allowing more functions to be integrated. We still insist on the viewpoint that Z b a

F ′ (x) dx = F(b) − F(a),

but we wish to relax our assumptions to allow this formula to hold even when there are infinitely many points of nondifferentiability of F. There may, at first sight, seem not to be much point in allowing more functions to be integrated, except perhaps when one encounters a function without an integral where one seems to be needed. But the theory itself demands it. Many processes of analysis lead from integrable functions [in the calculus sense] to functions for which a broader theory of integration is required. The modern theory is an indispensable tool of analysis and the theory is elegant and complete. Remember that, for the (naive) calculus integral, an integrable function f must have an indefinite integral F for which F ′ (x) = f (x) at every point of an interval with finitely many exceptions. The path to generalization is to allow infinitely many exceptional points where the derivative F ′ (x) may not exist or may not agree with f (x). Although we will allow an infinite set, we cannot allow too large a set of exceptions. In addition, as we will find, we must impose some restrictions on the function F if we do allow an infinite set of exceptions. Those two ideas will drive the theory.

4.1 Countable sets The first notion, historically, of a concept that captures the smallness of an infinite set is due to Cantor. If all of the elements of a set can be written in a list, then the set is said to be countable. This idea only becomes startling and interesting when one discovers that there are sets whose elements cannot be written in a list. Definition 4.1 A set of real numbers is countable if there is a sequence of real numbers r1 , r2 , r3 , . . . that contains every element of the set. Exercise 454 Prove that the empty set is countable.

Answer

Exercise 455 Prove that every finite set is countable.

Answer

121

122

CHAPTER 4. BEYOND THE CALCULUS INTEGRAL

Exercise 456 Prove that every subset of a countable set is countable.

Answer

Exercise 457 Prove that the set of all integers (positive, negative or zero) is countable. Answer Exercise 458 Prove that the set of all rational numbers is countable.

Answer

Exercise 459 Prove that the union of two countable sets is countable.

Answer

Exercise 460 Prove that the union of a sequence of countable sets is countable. Answer Exercise 461 Suppose that F : (a, b) → R is a monotonic, nondecreasing function. Show that such a function may have many points of discontinuity but that the collection Answer of all points where F is not continuous is countable. Exercise 462 If a function F : (a, b) → R has a right-hand derivative and a left-hand derivative at a point x0 and the derivatives on the two sides are different, then that point is said to be a corner. Show that a function may have many corners but that the Answer collection of all corners is countable.

4.1.1 Cantor’s theorem Your first impression might be that few sets would be able to be the range of a sequence. But having seen in Exercise 458 that even the set of rational numbers that is seemingly so large can be listed, it might then appear that all sets can be so listed. After all, can you conceive of a set that is “larger” than the rationals in some way that would stop it being listed? The remarkable fact that there are sets that cannot be arranged to form the elements of some sequence was proved by Georg Cantor (1845–1918). Theorem 4.2 (Cantor) No interval of real numbers is countable. The proof is given in the next few exercises. Exercise 463 Prove that there would exist a countable interval if and only if the open interval (0, 1) is itself countable. Answer Exercise 464 Prove that the open interval (0, 1) is not countable, using (as Cantor himself did) properties of infinite decimal expansions to construct a proof. Answer Exercise 465 Some novices, on reading the proof of Cantor’s theorem, say “Why can’t you just put the number c that you found at the front of the list.” What is your rejoinder? Answer Exercise 466 Give a proof that the interval (a, b) is not countable using the nested sequence of intervals argument. Answer

4.2. DERIVATIVES WHICH VANISH OUTSIDE OF COUNTABLE SETS

123

Exercise 467 We define a real number to be algebraic if it is a solution of some polynomial equation an xn + an−1 xn−1 + · · · + a1 x + a0 = 0, √ where all the coefficients are integers. Thus 2 is algebraic because it is a solution of x2 − 2 = 0. The number π is not algebraic because no such polynomial equation can ever be found (although this is hard to prove). Show that the set of algebraic numbers is countable. Answer Exercise 468 A real number that is not algebraic is said to be transcendental. For example, it is known that e and π are transcendental. What can you say about the existence of other transcendental numbers? Answer

4.2 Derivatives which vanish outside of countable sets Our first attempt to extend the indefinite and definite integral to handle a broader class of functions is to introduce a countable exceptional set into the definitions. We have used finite exceptional sets up to this point. Using countable sets will produce a much, more general integral. The principle is the following: if F is a continuous function on an interval I and if ′ F (x) = 0 for all but countably many points in I then F must be constant. We repeat the statement of the theorem here; the proof has already appeared in Section 1.9.5. Theorem 4.3 Let F : (a, b) → R be a function that is continuous at every point in an open interval (a, b) and suppose that F ′ (x) = 0 for all x ∈ (a, b) with possibly countably many exceptions. Then F is a constant function. Exercise 469 Suppose that F : [a, b] → R and G : [a, b] → R are uniformly continuous and that f is a function for which each of the statements F ′ (x) = f (x) and G′ (x) = f (x) holds for all x ∈ (a, b) with possibly countably many exceptions. Show that F and G differ by a constant. Answer

4.2.1 Calculus integral [countable set version] Our original calculus integral was defined in way that was entirely dependent on the simple fact that continuous functions that have a zero derivative at all but a finite number of points must be constant. We now know that that continuous functions that have a zero derivative at all but a countable number of points must also be constant. Thus there is no reason not to extend the calculus integral to allow a countable exceptional set.

124

CHAPTER 4. BEYOND THE CALCULUS INTEGRAL

Definition 4.4 The following describes an extension of our integration theory: • f is a function defined at each point of a bounded open interval (a, b) with possibly countably many exceptions. • f is the derivative of some function in this sense: there exists a uniformly continuous function F : (a, b) → R with the property that F ′ (x) = f (x) for all a < x < b with at most a countable number of exceptions. • Then the function f is said to be integrable [in the new sense] and the value of the integral is determined by Z b a

f (x) dx = F(b−) − F(a+).

Zakon’s Analysis text. There is currently at least one analysis textbook available1 that follows exactly this program, replacing the Riemann integral by the Newton integral (with countably many exceptions): Mathematical Analysis I, by Elias Zakon, ISBN 1-931705-02-X, published by The Trillia Group, 2004. 355+xii pages, 554 exercises, 26 figures, hypertextual cross-references, hyperlinked index of terms. Download size: 2088 to 2298 KB, depending on format. This can be downloaded freely from the web site www.trillia.com/zakon-analysisI.html Inexpensive site licenses are available for instructors wishing to adopt the text. Zakon’s text offers a serious analysis course at the pre-measure theory level, and commits itself to the Newton integral. There are rigorous proofs and the presentation is carried far enough to establish that all regulated2 functions are integrable in this sense. Exercise 470 Show that the countable set version of the calculus integral determines a unique value for the integral, i.e., does not depend on the particular antiderivative F chosen. Answer Exercise 471 In Exercise 258 we asked the following: Define a function F : [0, 1] → R in such a way that F(0) = 0, and for each odd integer n = 1, 3, 5 . . . , F(1/n) = 1/n and each even integer n = 2, 4, 6 . . . , F(1/n) = 0. On the intervals [1/(n + 1), 1/n] for n = 1, 2, 3, the R function is linear. Show that ab F ′ (x) dx exists as a calculus integral for all R1 ′ 0 < a < b ≤ b but that 0 F (x) dx does not.

1 I am indebted to Bradley Lucier, the founder of the Trillia Group, for this reference. The text has been used by him successfully for beginning graduate students at Purdue. 2 Regulated functions are uniform limits of step functions. Consequently regulated functions are bounded and have only a countable set of points of discontinuity. So our methods would establish integrability in this sense. One could easily rewrite this text to use the Zakon integral instead of the calculus integral. We won’t.

4.3. SETS OF MEASURE ZERO

125

Show that the new version of the calculus integral would handle this easily.

Answer

Exercise 472 Show that all bounded functions with a countable number of discontinuities must be integrable in this new sense. Answer

Exercise 473 Show that the new version of the integral has a property that the finite set version of the integral did not have: if fn is a sequence of functions converging uniformly to a function f on [a, b] and if each fn is integrable on [a, b] then the function f must be integrable there too. Exercise 474 Rewrite the text to use the countable set version of the integral rather Answer than the more restrictive finite set version. Exercise 475 (limitations of the calculus integrals) Find an example of a sequence of nonnegative, integrable functions gk (x) on the interval [0, 1] such that such that ∞ Z b gk (x) dx ∑ a k=1 ∞ ∑k=1 gk is not integrable

is convergent, and yet f = finite set or countable set version.

in the calculus sense for either the

[Note: this function should be integrable and the value of this integral should be the sum of the series. The only difficulty is that we cannot integrate enough functions. The Riemann integral has the same defect; the integral introduced later on does not. Answer

4.3 Sets of measure zero We shall go beyond countable sets in our search for a suitable class of small sets. A set is countable if it is small in the sense of counting. This is because we have defined a set to be countable if we can list off the elements of the set in the same way we list off all the counting numbers (i.e., 1, 2, 3, 4, . . . ). We introduce a larger class of sets that is small in the sense of measuring ; here we mean measuring the same way that we measure the length of an interval [a, b] by the number b − a. Our sets of measure zero are defined using subpartitions and very simple Riemann sums. Later on in our more advanced course we will find several other characterizations of this important class of sets.

CHAPTER 4. BEYOND THE CALCULUS INTEGRAL

126

Definition 4.5 A set of real numbers N is said to have measure zero if for every ε > 0 and every point ξ ∈ N there is a δ(ξ) > 0 with the following property: whenever a subpartition {([ci , di ], ξi ) : i = 1, 2, . . . , n} is given with each ξi ∈ N and so that

then

0 < di − ci < δ(ξi ) (i = 1, 2, . . . , n) n

∑ (di − ci) < ε.

i=1

Recall that in order for the subset {([ai , bi ], ξi ) : i = 1, 2, . . . , n}

to be a subpartition, we require merely that the intervals {[ai , bi ]} do not overlap and we always require that the associated point ξi belong to the interval [xi−1 , xi ] with which it is paired. The collection here is not necessarily a partition. Our choice of language, calling it a subpartition, indicates that it could be (but won’t be) expanded to be a partition. Exercise 476 Show that every finite set has measure zero.

Answer

Exercise 477 Show that every countable set has measure zero.

Answer

Exercise 478 Show that no interval has measure zero.

Answer

Exercise 479 Show that every subset of a set of measure zero must have measure zero. Answer Exercise 480 Show that the union of two sets of measure zero must have measure zero. Answer Exercise 481 Show that the union of a sequence of sets of measure zero must have Answer measure zero. Exercise 482 Suppose that {(ak , bk )} is a sequence of open intervals and that ∞

∑ (bk − ak ) < ∞.

k=1

If E is a set and every point in E belongs to infinitely many of the intervals {(ak , bk )}, show that E must have measure zero. Answer

4.3. SETS OF MEASURE ZERO

127 K1 K2 K3

0

1 9

2 9

1 3

2 3

7 9

8 9

1

Figure 4.1: The third stage in the construction of the Cantor ternary set.

4.3.1 The Cantor dust In order to appreciate exactly what we intend by a set of measure zero we shall introduce a classically important example of such a set: the Cantor ternary set. Mathematicians who are fond of the fractal language call this set the Cantor dust. This suggestive phrase captures the fact that the Cantor set is indeed truly small even though it is large in the sense of counting; it is measure zero but uncountable. We begin with the closed interval [0, 1]. From this interval we shall remove a dense open set G. It is easiest to understand the set G if we construct it in stages. Let G1 = 1 2 3 , 3 , and let K1 = [0, 1] \ G1 . Thus 1 2 K1 = 0, ∪ ,1 3 3 is what remains when the middle third of the interval [0,1] is removed. This is the first stage of our construction. We repeat this construction on each of the two component intervals of K1 . Let 7 8 1 2 G2 = 9 , 9 ∪ 9 , 9 and let K2 = [0, 1] \ (G1 ∪ G2 ). Thus 1 2 1 2 7 8 K2 = 0, ∪ , ∪ , ∪ ,1 . 9 9 3 3 9 9 This completes the second stage. We continue inductively, obtaining two sequences of sets, {Kn } and {Gn }. The set K obtained by removing from [0, 1] all of the open sets Gn is called the Cantor set. Because of its construction, it is often called the Cantor middle third set. In an exercise we shall present a purely arithmetic description of the Cantor set that suggests another common name for K, the Cantor ternary set. Figure 6.1 shows K1 , K2 , and K3 . We might mention here that variations in the constructions of K can lead to interesting situations. For example, by changing the construction slightly, we can remove intervals in such a way that G′ =

∞ [

(a′k , b′k )

k=1

with

∞

∑ (b′k − a′k ) = 1/2

k=1 keeping K ′

(instead of 1), while still = [0, 1] \ G′ nowhere dense and perfect. The re′ sulting set K created problems for late nineteenth-century mathematicians trying to de-

128

CHAPTER 4. BEYOND THE CALCULUS INTEGRAL

velop a theory of measure. The “measure” of G′ should be 1/2; the “measure” of [0,1] should be 1. Intuition requires that the measure of the nowhere dense set K ′ should be 1 − 12 = 12 . How can this be when K ′ is so “small?” Exercise 483 We have given explicit statements for K1 and K2, 2 1 ∪ ,1 K1 = 0, 3 3 and 1 2 1 2 7 8 K2 = 0, ∪ , ∪ , ∪ ,1 . 9 9 3 3 9 9 What is K3 ?

Answer

Exercise 484 Show that if this process is continued inductively, we obtain two sequences of sets, {Kn } and {Gn } with the following properties: For each natural number n 1. Gn is a union of 2n−1 pairwise disjoint open intervals. 2. Kn is a union of 2n pairwise disjoint closed intervals. 3. Kn = [0, 1] \ (G1 ∪ G2 ∪ · · · ∪ Gn ). 4. Each component of Gn+1 is the “middle third” of some component of Kn . 5. The length of each component of Kn is 1/3n . Exercise 485 Establish the following observations: 1. G is an open dense set in [0, 1]. 2. Describe the intervals complementary to the Cantor set. 3. Describe the endpoints of the complementary intervals. 4. Show that the remaining set K = [0, 1] \ G is closed and nowhere dense in [0,1]. 5. Show that K has no isolated points and is nonempty. 6. Show that K is a nonempty, nowhere dense perfect subset of [0, 1]. Answer Exercise 486 Show that each component interval of the set Gn has length 1/3n . Using this, determine that the sum of the lengths of all component intervals of G, the set removed from [0, 1], is 1. Thus it appears that all of the length inside the interval [0, 1] has been removed leaving “nothing” remaining. Answer Exercise 487 Show that the Cantor set is a set of measure zero.

Answer

Exercise 488 Let E be the set of endpoints of intervals complementary to the Cantor set K. Prove that the closure of the set E is the set K.

4.3. SETS OF MEASURE ZERO

129

Exercise 489 Let G be a dense open subset of real numbers and let {(ak , bk )} be its set of component intervals. Prove that H = R \ G is perfect if and only if no two of these intervals have common endpoints. Exercise 490 Let K be the Cantor set and let {(ak , bk )} be the sequence of intervals complementary to K in [0, 1]. For each integer k let ck = (ak + bk )/2 (the midpoint of the interval (ak , bk )) and let N be the set of points ck for integers k. Prove each of the following: 1. Every point of N is isolated. 2. If ci 6= c j , there exists an integer k such that ck is between ci and c j (i.e., no point in N has an immediate “neighbor” in N).

Exercise 491 Show that the Cantor dust K can be described arithmetically as the set {x = .a1 a2 a3 . . . (base three) : ai = 0 or 2 for each i = 1, 2, 3, . . . }.

Answer

Exercise 492 Show that the Cantor dust is an uncountable set.

Answer

Exercise 493 Find a specific irrational number in the Cantor ternary set.

Answer

Exercise 494 Show that the Cantor ternary set can be defined as ( ) ∞ in K = x ∈ [0, 1] : x = ∑ n for in = 0 or 2 . n=1 3

Exercise 495 Let D=

∞

) jn x ∈ [0, 1] : x = ∑ n for jn = 0 or 1 . n=1 3

(

Show that D + D = {x + y : x, y ∈ D} = [0, 1]. From this deduce, for the Cantor ternary set K, that K + K = [0, 2]. Exercise 496 A careless student makes the following argument. Explain the error. S

“If G = (a, b), then G = [a, b]. Similarly, if G = ∞ i=1 (ai , bi ) is an open set, S∞ then G = i=1 [ai , bi ]. It follows that an open set G and its closure G differ by at most a countable set. The closure just adds in all the endpoints.” Answer

CHAPTER 4. BEYOND THE CALCULUS INTEGRAL

130

4.4 The Devil’s staircase The Cantor set allows the construction of a rather bizarre function that is continuous and nondecreasing on the interval [0, 1]. It has the property that it is constant on every interval complementary to the Cantor set and yet manages to increase from f (0) = 0 to f (1) = 1 by doing all of its increasing on the Cantor set itself. It has sometimes been called “the devil’s staircase” or simply the Cantor function. Thus this is an example of a continuous function on the interval [0, 1] which has a zero derivative everywhere outside of the Cantor set. If we were to try to develop a theory of indefinite integration that allows exceptional sets of measure zero we would have to impose some condition that excludes such functions. We will see that condition in Section 4.5.5.

4.4.1 Construction of Cantor’s function Define the function f in the following way. On the open interval ( 13 , 23 ), let f = 21 ; on the interval ( 19 , 92 ), let f = 14 ; on ( 79 , 98 ), let f = 34 . Proceed inductively. On the 2n−1 open intervals appearing at the nth stage of our construction of the Cantor set, define f to satisfy the following conditions: 1. f is constant on each of these intervals. 2. f takes the values

1 3 2n − 1 , , . . . , 2n 2n 2n

on these intervals. 3. If x and y are members of different nth-stage intervals with x < y, then f (x) < f (y). This description defines f on G = [0, 1]\K. Extend f to all of [0, 1] by defining f (0) = 0 and, for 0 < x ≤ 1, f (x) = sup{ f (t) : t ∈ G, t < x}. Figure 4.2 illustrates the initial stages of the construction. The function f is called the Cantor function. Observe that f “does all its rising” on the set K. The Cantor function allows a negative answer to many questions that might be asked about functions and derivatives and, hence, has become a popular counterexample. For example, let us follow this kind of reasoning. If f is a continuous function on [0, 1] and f ′ (x) = 0 for every x ∈ (0, 1) then f is constant. (This is proved in most calculus courses by using the mean value theorem.) Now suppose that we know less, that f ′ (x) = 0 for every x ∈ (0, 1) excepting a “small” set E of points at which we know nothing. If E is finite it is still easy to show that f must be constant. If E is countable it is possible, but a bit more difficult, to show that it is still true that f must be constant. The question then arises, just how small a set E can appear here; that is, what would we have to know about a set E so that we could say f ′ (x) = 0 for every x ∈ (0, 1) \ E implies that f is constant?

4.4. THE DEVIL’S STAIRCASE

131

y

16 7 8 3 4 5 8 1 2 3 8 1 4 1 8 1 9

2 9

1 3

2 3

7 9

8 9

- x

1

Figure 4.2: The third stage in the construction of the Cantor function. The Cantor function is an example of a function constant on every interval complementary to the Cantor set K (and so with a zero derivative at those points) and yet is not constant. The Cantor set, since it is both measure zero and nowhere dense, might be viewed as extremely small, but even so it is not insignificant for this problem. Exercise 497 In the construction of the Cantor function complete the verification of details. 1. Show that f (G) is dense in [0, 1]. 2. Show that f is nondecreasing on [0, 1]. 3. Infer from (a) and (b) that f is continuous on [0, 1]. 4. Show that f (K) = [0, 1] and thus (again) conclude that K is uncountable.

Exercise 498 Show that the Cantor function has a zero derivative everywhere on the open set complementary to the Cantor set in the interval [0, 1]. [In more colorful language, we say that this function has a zero derivative almost everywhere.] Exercise 499 Each number x in the Cantor set can be written in the form ∞

x = ∑ 23−ni i=1

for some increasing sequence of integers n1 < n2 < n3 < . . . . Show that the Cantor −ni at each such point. function assumes the value F(x) = ∑∞ i=1 2 Exercise 500 Show that the Cantor function is a monotone, nondecreasing function on [0, 1] that has these properties: 1. F(0) = 0, 2. F(x/3) = F(x)/2„

CHAPTER 4. BEYOND THE CALCULUS INTEGRAL

132 3. F(1 − x) = 1 − F(x).

[In fact the Cantor function is the only monotone, nondecreasing function on [0, 1] that Answer has these three properties.]

4.5 Functions with zero variation Sets of measure zero were defined by requiring certain small sums n

∑ (bi − ai )

i=1

whenever a subpartition {([ai , bi ], ξi ) : i = 1, 2, . . . , n}

is controlled by a function δ(x). We are interested in other variants on this same theme, involving sums of the form n

∑ |F(bi ) − F(ai )| or

i=1

n

∑ | f (ξi )|(bi − ai ) or even

i=1

n

∑ |F(bi ) − F(ai ) − f (ξi)|(bi − ai)|.

i=1

A measurement of the sums n

∑ |F(bi ) − F(ai )|

i=1

taken over nonoverlapping subintervals is considered to compute the variation of the function F. This notion appears in the early literature and was formalized by Camile Jordan (1838–1922) in the late 19th century under the terminology “variation of a function.” We have already studied this concept in Section 3.6.1. Here we focus on a narrower notion, that of zero variation on specified subsets. Zero variation We do not need the actual measurement of variation. What we do need is the notion that a function has zero variation. This is a function that has only a small change on a set, or whose growth on the set is insubstantial. Definition 4.6 A function F : (a, b) → R is said to have zero variation on a set E ⊂ (a, b) if for every ε > 0 and every x ∈ E there is a δ(x) > 0 n

∑ |F(bi ) − F(ai )| < ε

i=1

whenever a subpartition {([ai , bi ], ξi ) : i = 1, 2, . . . , n} is chosen for which ξi ∈ E ∩ [ai , bi ] and bi − ai < δ(ξi ).

We saw a definition very similar to this when we defined a set of measure zero. In fact the formal nature of the definition is exactly the same as the requirement that a set E should have measure zero. Exercise 501 makes this explicit. As we shall discover, all of the familiar functions of the calculus turn out to have zero variation on sets of measure zero. Only rather pathological examples (notably the Cantor function) do not have this property.

4.5. FUNCTIONS WITH ZERO VARIATION

133

Exercise 501 Show that a set E has measure zero if and only if the function F(x) = x has zero variation on E. Answer Exercise 502 Suppose that F : R → R has zero variation on a set E1 and that E2 ⊂ E1 . Answer Show that then F has zero variation on E2 . Exercise 503 Suppose that F : R → R has zero variation on the sets E1 and E2 . Show Answer that then F has zero variation on the union E1 ∪ E2 . Exercise 504 Suppose that F : R → R has zero variation on each member of a seS quence of sets E1 , E2 , E3 , . . . . Show that then F has zero variation on the union ∞ n=1 En . Answer Exercise 505 Prove the following theorem that shows another important version of zero variation. We could also describe this as showing a function has small Riemann sums over sets of measure zero. Theorem 4.7 Let f be defined at every point of a measure zero set N and let ε > 0. Then for every x ∈ N there is a δ(x) > 0 so that n

∑ | f (ξi )|(bi − ai) < ε

i=1

whenever a subpartition {([ai , bi ], ξi ) : i = 1, 2, . . . , n} is chosen for which ξi ∈ N ∩ [ai , bi ] and bi − ai < δ(ξi ).

Answer

Exercise 506 Let F be defined on an open interval (a, b) and let f be defined at every point of a measure zero set N ⊂ (a, b). Suppose that F has zero variation on N. Let ε > 0. Show for every x ∈ N there is a δ(x) > 0 such that n

∑ |F(bi ) − F(ai ) − f (ξi)(bi − ai)| < ε

i=1

whenever a subpartition {([ai , bi ], ξi ) : i = 1, 2, . . . , n} is chosen for which Answer Exercise 507 Let F be defined on an open interval (a, b) and let f be defined at every point of a set E. Suppose that F ′ (x) = f (x) for every x ∈ E. Let ε > 0. Show for every x ∈ E there is a δ(x) > 0 such that n

∑ |F(bi ) − F(ai ) − f (ξi)(bi − ai)| < ε

i=1

whenever a subpartition {([ai , bi ], ξi ) : i = 1, 2, . . . , n} is chosen for which ξi ∈ E ∩ [ai , bi ] and bi − ai < δ(ξi ).

Answer

Exercise 508 Show that the Cantor function has zero variation on the open set complementary to the Cantor set in the interval [0, 1]. Answer

134

CHAPTER 4. BEYOND THE CALCULUS INTEGRAL

4.5.1 Zero variation lemma The fundamental growth theorem that we need shows that only constant functions have zero variation on an interval. Theorem 4.8 Suppose that a function F : (a, b) → R has zero variation on the entire interval (a, b). Then F is constant on that interval. Exercise 509 Use a Cousin covering argument to prove the theorem.

Answer

Exercise 510 Show that the Cantor function does not have zero variation on the Cantor Answer set.

4.5.2 Zero derivatives imply zero variation There is an immediate connection between the derivative and its variation in a set. In the simplest case we see that a function has zero variation on a set on which it has everywhere a zero derivative. There is a partial converse that would be studied in a more advanced course: if a function F : (a, b) → R has zero variation on E then F ′ (x) = 0 at almost every point x of E. For this chapter we need only the one direction. Theorem 4.9 Suppose that a function F : (a, b) → R has a zero derivative F ′ (x) at every point x of a set E ⊂ (a, b). Then F has zero variation on E. Exercise 511 Prove Theorem 4.9 by applying Exercise 507. Exercise 512 Give a direct proof of Theorem 4.9.

Answer

Exercise 513 (comparison test for variations) Suppose that F, G : R → R. 1. If |F ′ (x)| ≤ |G′ (x)| for every point x in a compact interval [a, b] except for x in a set on which F has variation zero, show that |F(b) − F(a)| ≤ V (F, [a, b]) ≤ V (G, [a, b]). 2. If F ′ (x) ≤ |G′ (x)| for every point x in a compact interval [a, b] except for x in a set on which F has variation zero, show that F(b) − F(a) ≤ V (G, [a, b]).

Answer

4.5.3 Continuity and zero variation There is an intimate and immediate relation between continuity and zero variation. Theorem 4.10 Suppose F : (a, b) → R. Then F is continuous at a point x0 ∈ (a, b) if and only if F has zero variation on the singleton set E = {x0 }.

4.5. FUNCTIONS WITH ZERO VARIATION

135

Corollary 4.11 Suppose F : (a, b) → R. Then F is continuous at each point c1 , c2 , c3 , . . . ck ∈ (a, b) if and only if F has zero variation on the finite set E = {c1 , c2 , c3 , . . . ck }. Corollary 4.12 Suppose F : (a, b) → R. Then F is continuous at each point c1 , c2 , c3 , . . . from a sequence of points in (a, b) if and only if F has zero variation on the countable set E = {c1 , c2 , c3 , . . . }. Exercise 514 Suppose F : (a, b) → R. Show that F is continuous at every point in a set E if and only F has zero variation in every countable subset of E.

4.5.4 Lipschitz functions and zero variation For us, one of the key properties of Lipschitz functions is that they must always have zero variation on sets of measure zero. In the next section we shall describe such functions as being absolutely continuous. Theorem 4.13 Suppose that F : [a, b] → R is a Lipschitz function. Then F has zero variation on every subset of (a, b) that has measure zero. As a consequence of this theorem we can show that Lipschitz functions behave in a way that is useful for integration theory. We took much advantage of the fact that two continuous functions whose derivatives agree mostly everywhere or nearly everywhere differ by a constant. For Lipschitz functions we can use “almost everywhere” and have the same conclusion. Theorem 4.14 Suppose that F : [a, b] → R is a Lipschitz function and that F has a zero derivative at almost every point of the interval (a, b). Then F is a constant. Corollary 4.15 Suppose that F, G : [a, b] → R are Lipschitz functions and that F ′ (x) = G′ (x) at almost every point x of the interval (a, b). Then F and G differ by a constant. Exercise 515 Prove Theorem 4.13. Exercise 516 Prove Theorem 4.14 and its corollary.

4.5.5 Absolute continuity [variational sense] We have seen that the function F(x) = x has zero variation on a set N precisely when that set N is a set of measure zero. We see, then, that F(x) = x has zero variation on all sets of measure zero. Most functions that we have encountered in the calculus also have this property. All Lipschitz functions have this property (as we have seen in Theorem 4.13). We shall see, too, that all differentiable functions have this property. It plays a vital role in the theory; such functions are said to be absolutely continuous in the variational sense3 . 3 The

idea is due essentially to Arnaud Denjoy (1884–1974) as a generalized version of the absolute continuity as defined by Vitali. In the treatise Theory of the Integral by Stanislaw Saks this concept appears under the terminology ACG∗ . The definition here is easier and much more accessible.

136

CHAPTER 4. BEYOND THE CALCULUS INTEGRAL

Definition 4.16 A uniformly continuous function F : [a, b[→ R is said to be absolutely continuous in the variational sense on [a, b] if F has zero variation on every subset N of the interval (a, b) that has measure zero. The exercises show that most continuous functions we encounter in the calculus will be absolutely continuous. In fact the only continuous function we have seen so far that is not absolutely continuous is the Cantor function. We would likely drop the phrase “in the variational sense” when it is clear that this is what we intend4 . Exercise 517 Show that the function F(x) = x is absolutely continuous on every open interval. Exercise 518 Show that a linear combination of absolutely continuous functions is absolutely continuous. Exercise 519 Show that a Lipschitz function is absolutely continuous. Exercise 520 Give an example of an absolutely continuous function that is not Lipschitz. Exercise 521 Show that the Cantor function is not absolutely continuous on [0, 1]. Exercise 522 Suppose that a uniformly continuous function F : [a, b] → R is differentiable at each point of the open interval (a, b). Show that F is absolutely continuous on [a, b]. Exercise 523 Suppose that that a uniformly continuous function F : [a, b] → R is differentiable at each point of the open interval (a, b) with finitely many exceptions. Show that F is absolutely continuous on [a, b]. Exercise 524 Suppose that that a uniformly continuous function F : [a, b] → R is differentiable at each point of the open interval (a, b) with countably many exceptions. Show that F is absolutely continuous on [a, b]. Exercise 525 Suppose that that a uniformly continuous function F : [a, b] → R is differentiable at each point of the open interval (a, b) with the exception of a set N ⊂ (a, b). Suppose further that F has zero variation on N. Show that F is absolutely continuous on [a, b]. Exercise 526 Suppose that F : [a, b] → R is absolutely continuous on the interval [a, b]. Then by definition F has zero variation on every subset of measure zero. Is it possible that F has zero variation on subsets that are not measure zero? 4 This

is, however, a bit dangerous. The concept of absolute continuity is used in the literature in two different ways. The first, in classical real analysis, uses the Vitali sense of the next section. The second uses a measure-theoretic version which is closer to the intent meant in the variational senses.

4.5. FUNCTIONS WITH ZERO VARIATION

137

Exercise 527 A function F : [a, b] → R is said to have finite derived numbers on a set E ⊂ (a, b) if, for each x ∈ E, there is a number Mx and one can choose δ > 0 so that F(x + h) − F(x) ≤ Mx h

whenever x + h ∈ I and |h| < δ. Suppose that that a uniformly continuous function F : [a, b] → R has finite derived numbers at every point of (a, b). Show that F is absolutely continuous on [a, b]. [cf. Exercise 171.]

4.5.6 Absolute continuity [Vitali’s sense] There is a type of absolute continuity, due to Vitali, that is very similar to the ε–δ definition of uniform continuity. This the first version of absolute continuity in the literature. The concept is due to Giuseppe Vitali (1875–1932) who introduced it is as the correct characterization of the property of indefinite integrals in the Lebesgue theory of integration. Definition 4.17 A function F : [a, b] → R is absolutely continuous in Vitali’s sense on [a, b] provided that for every ε > 0 there is a δ > 0 so that n

∑ |F(xi ) − F(yi )| < ε

i=1

whenever {[xi , yi ]} are nonoverlapping subintervals of [a, b] for which n

∑ (yi − xi) < δ.

i=1

This condition is strictly stronger than absolute continuity in the variational sense: there are absolutely continuous functions that are not absolutely continuous in Vitali’s sense. In fact we should remember these implications: Lipschitz =⇒ AC [Vitali sense] =⇒ AC [variational sense]. The arrows cannot be reversed. The full story of the connection between the two concepts is explained by the notion of bounded variation: a function is absolutely continuous in Vitali’s sense if and only if it is absolutely continuous in the variational sense and it also has bounded variation. This is left for a more advanced course as it is not needed for our exposition. Exercise 528 Prove that if F is absolutely continuous in Vitali’s sense on [a, b] then F is uniformly continuous there. Exercise 529 Prove that if F is absolutely continuous in Vitali’s sense on [a, b] then F is absolutely continuous in the variational sense on [a, b]. Exercise 530 Prove that ifF is absolutely continuous in Vitali’s sense on [a, b] then F has bounded variation on [a, b]. Exercise 531 Prove that if F is Lipschitz then F is absolutely continuous in Vitali’s sense.

CHAPTER 4. BEYOND THE CALCULUS INTEGRAL

138

Exercise 532 Show that an everywhere differentiable function must be absolutely continuous in the variational sense on any interval [a, b] but need not be absolutely continuous in Vitali’s sense on [a, b].

4.6 The integral Our theory so far in Chapters 2 and 3 has introduced and studied the calculus integral, both as an indefinite and a definite integral. The key point in that theory was simply this observation: ⋆ Continuous functions whose derivatives are determined at all but finitely many points are unique up to an additive constant. The whole theory of the calculus integral was based on this simple concept. We can consider that this simple phrase is enough to explain the elementary theory of integration. The exceptional set that we allowed was always finite. To go beyond that and provide a more comprehensive integration theory we must allow infinite sets. We have seen that sets of measure zero offer a useful class of exceptional sets. But we also saw the Cantor function whose derivative is zero everywhere except on the measure zero Cantor set, and yet the Cantor function is not constant. Thus some further restriction must be made on the functions that are allowed as our indefinite integral; continuity is not enough. The modern theory is essentially the same as the calculus integral, except that the observation ⋆ above is replaced by this one: ⋆ ⋆ Absolutely continuous functions whose derivatives are determined at all but a set of measure zero are unique up to an additive constant. Here we can use either version of absolute continuity, either the variational version or the Vitali version. Moreover, since Lipschitz functions are absolutely continuous, we could use those.

4.6.1 The Lebesgue integral of bounded functions Lebesgue gave a number of definitions for his integral; the most famous is the constructive definition using his measure theory. He also gave a descriptive definition similar to the calculus definitions that we are using in this text. For bounded functions his definition5 is exactly as given below. 5 Here is a remark on this fact from Functional Analysis, by Frigyes Riesz, Bela Szökefalvi-Nagy, and Leo F. Boron: “Finally, we discuss a definition of the Lebesgue integral based on differentiation, just as the classical integral was formerly defined in many textbooks of analysis. A similar definition, if only for bounded functions, was already formulated in the first edition of Lebesgue’s Leçons sur l’intégration, but without being followed up: ‘A bounded function f (x) is said to be summable if there exists a function F(x) with bounded derived numbers [i.e., Lipschitz] such that F(x) has f (x) for derivative, except for a set of values of x of measure zero. The integral in (a, b) is then, by definition, F(b) − F(a).’

4.6. THE INTEGRAL

139

Definition 4.18 (Lebesgue integral of bounded functions) Let f be a bounded function that is defined at almost every point of [a, b]. Then, f is said to be Lebesgue integrable on [a, b] if there is a Lipschitz function F : [a, b] → R such that F ′ (x) = f (x) at every point of (a, b) with the exception of points in a set of measure zero. In that case we define Z b a

f (x) dx = F(b) − F(a)

and this number is called the integral of f on [a, b]. This integral (the Lebesgue integral) applied to bounded functions does go quite a bit “beyond the calculus integral.” For bounded functions, the Lebesgue integral includes the calculus integral and integrates many important classes of functions that the calculus integral cannot manage. Further study of the Lebesgue integral requires learning the measure theory. The traditional approach is to start with the measure theory and arrive at these descriptive descriptions of his integral only after many weeks. There is an abundance of good texts for this. Try to remember when you are going through such a study that eventually, after much detail, you will indeed arrive back at this point of seeing the integral as an antiderivative.

4.6.2 The Lebesgue integral in general The use of Lipschitz functions in our definition of the definite integral was motivated by the fact that two Lipschitz functions whose derivatives agree almost everywhere must differ by a constant. This allows us to define an integral entirely similar to the calculus integral of Chapter 3. This definition however can apply only to bounded functions. For unbounded functions we need a more general definition that goes beyond the scope of Lipschitz functions. For this the concept of absolute continuity in the Vitali sense will replace the requirement that we have a Lipschitz function. The Lebesgue integral We choose now to define our integral based on the notion of absolute continuity in the Vitali sense. A more general kind of integral is obtained in Section 4.6.3 below when we choose to use absolute continuity in the broader variational sense. For both historical and technical reasons it is important to distinguish between the two theories. Definition 4.19 (Definite Lebesgue integral) Let f : (a, b) → R be a function defined at all points of the open interval (a, b) with the possible exception of a set of measure zero. Then f is said to be Lebesgue integrable on the closed, bounded interval [a, b] provided there is a function F : (a, b) → R so that 1. F is absolutely continuous in the Vitali sense on [a, b]. 2. F ′ (x) = f (x) at all points x of (a, b) with the possible exception of a set of measure zero. In that case we define

Z b a

f (x) dx = F(b) − F(a).

CHAPTER 4. BEYOND THE CALCULUS INTEGRAL

140

For bounded functions, the requirement that the indefinite integral F is Lipschitz is equivalent to the requirement that it be absolutely continuous in the Vitali sense. Thus either of the two definitions may be used.

4.6.3 The integral in general Finally we choose now to define an integral based on the notion of absolute continuity in the variational sense. This offers the most general version of integration theory, one that includes the two definitions for the Lebesgue integral given above. Definition 4.20 (Definite integral) Let f : (a, b) → R be a function defined at all points of the open interval (a, b) with the possible exception of a set of measure zero. Then f is said to be integrable on the closed, bounded interval [a, b] provided there is a function F : (a, b) → R so that 1. F is absolutely continuous in the variational sense on [a, b]. 2. F ′ (x) = f (x) at all points x of (a, b) with the possible exception of a set of measure zero. In that case we define

Z b a

f (x) dx = F(b) − F(a).

Recall that we have the following relation: Lipschitz =⇒ AC (Vitali sense) =⇒ AC (variational sense). From this we deduce these two facts: For unbounded functions: Lebesgue integrable =⇒ integrable. and For bounded functions: Lebesgue integrable ⇐⇒ integrable. If a function is integrable in the sense of this definition, but not integrable in the sense of the two previous definitions (i.e., is not Lebesgue integrable) then we would need some appropriate terminology. The simplest for the purposes of our limited Chapter 4 would be to use integrable, but not Lebesgue integrable or nonabsolutely integrable. This can occur if and only if f is unbounded and the function F(x) =

Z x a

f (u) du

(a ≤ x ≤ b)

is absolutely continuous in the variational sense, but is not absolutely continuous in the Vitali sense. It can be shown that this would occur if and only it F does not have bounded variation on [a, b]. Another equivalent condition would be that, while f is integrable, the absolute value | f | is not integrable. All of these remarks would be part of a more advanced course than this chapter allows.

4.6. THE INTEGRAL

141

4.6.4 The integral in general (alternative definition) Sometimes it is more convenient to state the conditions for the integral with direct attention to the set of exceptional points where the derivative F ′ (x) = f (x) may fail. Definition 4.21 (Definite integral) Let f : (a, b) → R be a function defined at all points of the open interval (a, b) with the possible exception of a set of measure zero. Then f is said to be integrable on the closed, bounded interval [a, b] provided there is a function F : (a, b) → R and there is a set N ⊂ (a, b) so that 1. F is uniformly continuous on (a, b). 2. N has measure zero. 3. F ′ (x) = f (x) at all points x of (a, b) with the possible exception of points in N. 4. F has zero variation on N. In that case we define

Z b a

f (x) dx = F(b−) − F(a+).

Exercise 533 Show that Definition 4.20 and Definition 4.21 are equivalent. Exercise 534 Under what hypotheses is Z b a

F ′ (x) dx = F(b) − F(a) Answer

a correct statement?

Exercise 535 Show that the new definition of definite integral (either Definition 4.20 or Definition 4.21) includes the notion of definite integral from Chapter 3. Exercise 536 Show that the new definition of definite integral (either Definition 4.20 or Definition 4.21) includes, as integrable, functions that would not be considered integrable in Chapter 3.

4.6.5 Infinite integrals Exactly the same definition for the infinite integrals Z ∞

−∞

f (x) dx,

Z ∞ a

f (x) dx, and

Z b

−∞

f (x) dx

can be given as for the integral over a closed bounded interval.

CHAPTER 4. BEYOND THE CALCULUS INTEGRAL

142

Definition 4.22 Let f be a function defined at every point of (∞, ∞) with the possible exception of a set of measure zero. Then f is said to be integrable on (∞, ∞) provided there is a function F : (−∞, ∞) → R so that 1. F is absolutely continuous in the variational sense on every closed bounded interval. 2. F ′ (x) = f (x) at all points x with the possible exception of a set of measure zero. 3. Both limits F(∞) = limx→∞ F(x) and F(−∞) = limx→−∞ F(x) exist. In that case the number

Z ∞

−∞

f (x) dx = F(∞)−F(−∞), is called the definite integral

of f on the interval (∞, ∞) . Here the statement that F is absolutely continuous in the variational sense on on every closed bounded interval is equivalent to the simple assertion that F is continuous and has zero variation on every set of measure zero. Similar definitions are available for Z b

and

−∞

f (x) dx = F(b) − F(−∞)

Z ∞ a

f (x) dx = F(∞) − F(a).

In analogy with the terminology of an infinite series ∑∞ k=1 ak we often say that the R integral a∞ f (x) dx converges when the integral exists. That suggests language asserting that the integral converges absolutely if both integrals Z ∞ a

f (x) dx and

Z ∞ a

| f (x)| dx

exist.

4.7 Approximation by Riemann sums We have seen that all calculus integrals can be approximated by Riemann sums. We have two modes of approximation, a uniform approximation and a pointwise approximation. The same is true for the advanced integration theory. In Theorem 4.23 below we see that the property of being an integral (which is a property expressed in the language of derivatives, zero measure sets and zero variation) can be completely described by a property expressed by partitions and Riemann sums. This theorem was first observed by the Irish mathematician Ralph Henstock. Since then it has become the basis for a definition of the modern integral. The proof is elementary. Even so, it is remarkable and was not discovered until the 1950s, in spite of intense research into integration theory in the preceding half-century.

4.8. PROPERTIES OF THE INTEGRAL

143

Theorem 4.23 (Henstock’s criterion) Suppose that f is an integrable function defined at every point of a closed, bounded interval [a, b]. Then for every ε > 0 and every point x ∈ [a, b] there is a δ(x) > 0 so that n Z bi ∑ a f (x) dx − f (ξi )(bi − ai) < ε i=1

and

i

Z b n f (x) dx − ∑ f (ξi )(bi − ai ) < ε a i=1

whenever a partition of the interval [a, b] is chosen for which

{([ai , bi ], ξi ) : i = 1, 2, . . . , n}

ξi ∈ [ai , bi ] and bi − ai < δ(ξi ). This theorem is stated in only one direction: if f is integrable then the integral has a pointwise approximation using Riemann sums. The converse direction is true too and can be used to define the integral by means of Riemann sums. Of course, one is then obliged to develop the full theory of zero measure sets, zero variation and absolute continuity in order to connect the two theories and show that they are equivalent. The theorem provides only for a pointwise approximation by Riemann sums. It is only under rather severe conditions that it is possible to find a uniform approximation by Riemann sums. Exercises 551, ??, and ?? provide that information. Exercise 537 Prove Theorem 4.23.

Answer

4.8 Properties of the integral The basic properties of integrals are easily studied for the most part since they are natural extensions of properties we have already investigated for the calculus integral. There are some surprises and some deep properties which were either false for the calculus integral or were hidden too deep for us to find without the tools we have now developed. We know these formulas for the narrow calculus integral and we are interested now in extending them to full generality.

4.8.1 Inequalities Formula for inequalities:

Z b a

f (x) dx ≤

Z b

g(x) dx

a

if f (x) ≤ g(x) for all points x in (a, b) except possibly points of a set of measure zero. We have seen this statement before for the calculus integral in Section ?? where we allowed only a finite number of exceptions for the inequality. Here is a precise statement of what we intend here by this statement: If both functions f (x) and g(x) have an integral on the interval [a, b] and, if f (x) ≤ g(x) for all points x in (a, b) except possibly points of a set of measure zero. then the stated inequality must hold.

CHAPTER 4. BEYOND THE CALCULUS INTEGRAL

144

Exercise 538 Complete the details needed to prove the inequality formula. Answer

4.8.2 Linear combinations Formula for linear combinations: Z b

[r f (x) + sg(x)] dx = r

Z b

f (x) dx + s

g(x) dx

a

a

a

Z b

(r, s ∈ R).

We have seen this statement before for the calculus integral in Section 3.3.4 Here is a precise statement of what we intend now by this formula: If both functions f (x) and g(x) have an integral on the interval [a, b] then any linear combination r f (x) + sg(x) (r, s ∈ R) also has an integral on the interval [a, b] and, moreover, the identity must hold. The proof is an exercise in derivatives, taking proper care of the exceptional sets of measure zero. We know, as usual, that d (rF(x) + sG(x)) = rF ′ (x) + sG′ (x) dx at any point x at which both F and G are differentiable. Exercise 539 Complete the details needed to prove the linear combination formula. Answer

4.8.3 Subintervals Formula for subintervals: If a < c < b then Z b

f (x) dx =

a

Z c a

f (x) dx +

Z b

f (x) dx

c

The intention of the formula is contained in two statements in this case: If the function f (x) has an integral on the interval [a, b] then f (x) must also have an integral on any closed subinterval of the interval [a, b] and, moreover, the identity must hold. and If the function f (x) has an integral on the interval [a, c] and also on the interval [c, b] then f (x) must also have an integral on the interval [a, b] and, moreover, the identity must hold. Exercise 540 Supply the details needed to prove the subinterval formula.

Answer

4.8.4 Integration by parts Integration by parts formula: Z b a

F(x)G′ (x) dx = F(b)G(b) − F(a)G(b) −

Z b a

F ′ (x)G(x) dx

The intention of the formula is contained in the product rule for derivatives: d (F(x)G(x)) = F(x)G′ (x) + F ′ (x)G(x) dx

4.8. PROPERTIES OF THE INTEGRAL

145

which holds at any point where both functions are differentiable. One must then give strong enough hypotheses that the function F(x)G(x) is an indefinite integral for the function F(x)G′ (x) + F ′ (x)G(x) in the sense needed for our integral. Exercise 541 Supply the details needed to state and prove an integration by parts formula for this integral. Answer

4.8.5 Change of variable The change of variable formula (i.e., integration by substitution) that we would expect to find is this, under some hypotheses: Z b a

f (G(t))G′ (t) dt =

Z G(b)

f (x) dx.

G(a)

The proof for the calculus integral was merely an application of the chain rule for the derivative of a composite function: d F(G(x)) = F ′ (G(x))G′ (x). dx Since our extended integral includes the calculus integral we still have this formula for all the old familiar cases. It is possible to extend the formula to handle much more general situations. Assume, as usual, that g is integrable on an interval [a, b] with Z x

G(x) =

g(s) ds

a

(a ≤ x ≤ b)

and that f is integrable on an interval [c, d] that includes all the values of G(x) for x ∈ [a, b]. Assume that F(t) =

Z t

f (u) du

a

(c ≤ t ≤ d)

Then we would like to be able to assert the change of variable formula: F(G(x)) − F(G(a)) =

Z G(x) G(a)

f (u) du =

Z x a

f (G(t))g(t) dt.

(a ≤ x ≤ b).

There is an obvious necessary condition, namely that the composed function F ◦G must be absolutely continuous in the variational sense. Moreover, in order that the function ( f ◦G)g be not only integrable, but Lebesgue integrable, we would have a stricter necessary condition, namely that the composed function F ◦ G must be absolutely continuous in the Vitali sense. Quite remarkably these necessary conditions are also sufficient. We must, however, leave the proofs of these facts to our later, more advanced course. Exercise 542 Supply the details needed to state and prove at least one change of variables formula for this integral. Answer

CHAPTER 4. BEYOND THE CALCULUS INTEGRAL

146

Exercise 543 (no longer failed change of variables) In Exercise 245 we discovered that the calculus integral did not permit the change of variables, F(x) = |x| and G(x) = x2 sin x−1 , G(0) = 0 in the integral Z 1 0

F ′ (G(x))G′ (x) dx = F(G(1)) − F(G(0)) = | sin 1|.

Is this valid now?

Answer

4.8.6 What is the derivative of the definite integral? What is

Z

d x f (t) dt? dx a Rx We know that a f (t) dt is an indefinite integral of f and so, by definition, Z d x f (t) dt = f (x) dx a at all points in the interval (a, b) except possibly at the points of a set of measure zero. We can still make the same observation that we did in Section 3.3.7 for the calculus integral: Z d x f (t) dt = f (x) dx a at all points a < x < b at which f is continuous. But this is quite misleading here. The function may be discontinuous everywhere, and yet the differentiation formula always holds for most points x.

4.8.7 Monotone convergence theorem For this integral we can integrate a limit of a monotone sequence by interchanging the limit and the integral. Theorem 4.24 (Monotone convergence theorem) Let fn : [a, b] → R (n = 1, 2, 3, . . . ) be a nondecreasing sequence of functions, each integrable on the interval [a, b]. Suppose that, for all x in (a, b) except possibly a set of measure zero, f (x) = lim fn (x). n→∞

Then f is integrable on [a, b] and Z b a

f (x) dx = lim

Z b

n→∞ a

fn (x) dx

provided this limit exists. The exciting part of this statement has been underlined. Unfortunately it is more convenient for us to leave the proof of this fact to a more advanced course. Thus in the exercise you are asked to prove only a weaker version in which the integrability of the function f is assumed (not proved). Exercise 544 Prove the formula without the underlined statement, i.e., assume that f Answer is integrable and then prove the identity.

4.8. PROPERTIES OF THE INTEGRAL

147

Exercise 545 State and prove a version of the formula Z b Z b fn (x) dx. lim fn (x) dx = lim n→∞

a

n→∞ a

using uniform convergence as your main hypothesis.

4.8.8 Summation of series theorem For this integral we can sum series of nonnegative terms and integrate term-by-term. Theorem 4.25 (summation of series) Suppose that g1 , g2 , g3 ,. . . is a sequence of nonnegative functions, each one integrable on a closed bounded interval [a, b]. Suppose that, for all x in (a, b) except possibly a set of measure zero, ∞

f (x) =

∑ gk (x).

k=1

Then f is integrable on [a, b] and Z b

∞

f (x) dx =

a

∑

k=1

Z

a

b

gk (x) dx

(4.1)

provided the series converges. The exciting part of this statement, again, has been underlined. Unfortunately it is more convenient for us to leave the proof of this fact to a more advanced course. Thus in the exercise you are asked to prove only a weaker version. Exercise 546 Prove the formula without the underlined statement, i.e., assume that f Answer is integrable and then prove the identity.

4.8.9 Null functions A function f : [a, b] → R is said to be a null function on [a, b] if it is defined at almost every point of [a, b] and is zero at almost every point of [a, b]. Thus these functions are, for all practical purposes, just the zero function. They are particularly easy to handle in this theory for that reason. Exercise 547 Let f : [a, b] → R be a null function on [a, b]. Then f is integrable on [a, b] and Z b

f (x) dx = 0.

a

Answer Exercise 548 Suppose that f : [a, b] → R is an integrable function on [a, b] and that Z d c

f (x) dx = 0 for all a ≤ c < d ≤ b.

Then f is a null function on [a, b].

Answer

148

CHAPTER 4. BEYOND THE CALCULUS INTEGRAL

Exercise 549 Suppose that f : [a, b] → R is a nonnegative, integrable function on [a, b] and that Z b

f (x) dx = 0.

a

Then f is a null function on [a, b].

Answer

4.9 The Henstock-Kurweil integral We leave our study of integration theory with two final sections included for historical perspective. The Riemann sums property expressed in Theorem 4.23 for all integrable functions can be turned into a definition. That definition defines an integral. At first sight it might appear to be yet more general than the integration theory that we have already developed. In fact this definition gives an equivalent theory. The advantage (which will require further study) is that we have then two powerful methods for establishing the theory of the integral, one as an antiderivative and another as a limiting property of Riemann sums. Definition 4.26 (Henstock-Kurzweil integral) Suppose that f is defined at every point of a closed, bounded interval [a, b]. Then f is said to be Henstock-Kurzweil integrable on [a, b] if there is a number I with the property that, for every ε > 0 and every point x ∈ [a, b] there is a δ(x) > 0 so that n I − ∑ f (ξi )(bi − ai ) < ε i=1 whenever a partition of [a, b] {([ai , bi ], ξi ) : i = 1, 2, . . . , n} is chosen for which ξi ∈ [ai , bi ] and bi − ai < δ(ξi ).

The number I is set equal to Kurzweil integral of f on [a, b].

Rb a

f (x) dx and the latter is called the Henstock-

Defined everywhere? The definition of the Henstock-Kurzweil integral requires that the function to be integrated must be defined at every point of the interval [a, b]. Our descriptive definition of the equivalent integral requires only that the function is defined at almost every point. In practice, users of the definition just given usually agree to replace a function f that is defined almost everywhere with an equivalent function g that is defined everywhere. Perspectives

Here are some remarks that you should be able to prove or research.

1. The Henstock-Kurzweil integral not only includes, but is equivalent to the integral defined in this chapter. 2. There are bounded, Henstock-Kurzweil integrable functions that are not integrable [naive calculus sense]. 3. There are unbounded, Henstock-Kurzweil integrable functions that are not integrable [naive calculus sense].

4.10. THE RIEMANN INTEGRAL

149

4. The Henstock-Kurzweil integral is a nonabsolute integral, i.e., there are integrable functions f for which | f | is not integrable. 5. A function is Lebesgue integrable if and only both functions f and | f | are HenstockKurzweil integrable. 6. The Henstock-Kurzweil integral is often considered to be the correct version of integration theory on the line, but one that only specialists would care to learn. There are now a number of texts that start with Definition 4.26 and develop the theory of integration on the real line in a systematic way. Too much time, however, working with the technical details of Riemann sums may not be entirely profitable since most advanced textbooks will use measure theory exclusively. Our text [TBB] B. S. Thomson, J. B. Bruckner, A. M.Bruckner, Elementary Real Analysis: Dripped Version, ClassicalRealAnalysis.com (2008). available for free at our website contains a brief account of the calculus integral and several chapters devoted to the Henstock-Kurweil integral. After that integration theory is developed we then can give a fairly rapid and intuitive account of the measure theory that most of us are expected to know by a graduate level.

4.10 The Riemann integral The last word in our elementary text goes to the unfortunate Riemann integral, long taught to freshman calculus students in spite of the clamor against it. We can, however, define this integral in a natural way that fits closely into the perspective of our current chapter. This is not the way that most students would first encounter this integral. But, having started our integration theory by the descriptive method of antidifferentiation, it is a natural development for us. We ask, naively, what bounded functions are integrable by our methods of this chapter? This is naive because the correct answer to that question requires some sophisticated tools develop by Lebesgue in his 1901 thesis. Even so, we can find a limited answer to this problem by using a tool which has helped us considerably in the earlier chapters—continuity. We recall that one of our fundamental tools was this: if f : [a, b] → R is a bounded function then there is a Lipschitz function F : [a, b] → R so that F ′ (x) = f (x) at every point x in [a, b] at which f is continuous. Consequently we know immediately that f is integrable if any one of the following is true: 1. f is uniformly continuous on [a, b]. 2. f is bounded and is continuous mostly everywhere in [a.b]. 3. f is bounded and is continuous nearly everywhere in [a, b]. 4. f is bounded and is continuous almost everywhere in [a, b]. We take the most general answer (the last one) as our definition of a Riemann integrable function.

150

CHAPTER 4. BEYOND THE CALCULUS INTEGRAL

Definition 4.27 (Descriptive definition) A function f : [a, b] → R is said to be Riemann integrable if f is bounded and is continuous almost everywhere in [a, b]. This definition would be used simply to specify an historically important subclass of the family of integrable functions. There is great interest to us in knowing if a function is integrable. Occasionally we might also like to know if that function is, in addition, also Riemann integrable. Exercise 550 What properties can you determine are possessed by the class of Riemann integrable functions on an interval [a, b]? Answer

4.10.1 Constructive definition The formal constructive definition is due to Riemann sometime in the middle of the nineteenth century. The definition just given for Riemann integrable functions dates to Lebesgue in the early years of the twentieth century. By taking the latter as our definition we are reversing the history. This is a natural mathematical technique, taking a later characterization as a starting point for a theory. The earlier characterization is constructive and is familiar, of course, since we have already studied the notion of uniform approximation by Riemann sums in Section 3.5.3. Definition 4.28 (Constructive definition) Let f be a bounded function that is defined at every point of [a, b]. Then, f is said to be Riemann integrable on [a, b] if there is a number I so that for every ε > 0 there is a δ > 0 so that n I − ∑ f (ξi )(xi − xi−1 ) < ε i=1 whenever {([xi , xi−1 ], ξi ) : i = 1, 2, . . . n} is a partition of [a, b] with each xi − xi−1 < δ and ξi ∈ [xi−1 , xi ].

The number I is set equal to (R) − integral of f on [a, b].

Rb a

f (x) dx and the latter is called the Riemann

The equivalence of Definition 4.27 and Definition 4.28 can be established by the reader as a research project. The identity (R) −

Z b a

f (x) dx =

Z b

f (x) dx

a

also follows. This means that one can develop the theory of the Riemann integral in two equivalent ways: • Start with the constructive definition of the Riemann integral and develop the properties of such an “integral.” Relate those properties to the calculus integral of this text. [Not recommended, but commonly done this way.] • Start with the calculus integrala and use the descriptive definition of Riemann integrable function. This immediately places the Riemann integral in the correct theoretical framework for understanding elementary integration theory. [Recommended, but hardly ever done this way.]

4.10. THE RIEMANN INTEGRAL

151

Defined everywhere? In our integration theory of this chapter we have required of the function being integrated that it be defined at almost every point of the interval [a.b]. For the Riemann integral the function is very much required to be defined everywhere. This is an unfortunate feature of the theory and must be kept in mind by the user. Perspectives The Riemann integral does not go “beyond the calculus integral.” The Riemann integral will handle no unbounded functions and we have been successful with the calculus integral in handling many such functions. Even for bounded functions the relation between the calculus integral and the Riemann integral is confused: there are functions integrable in either of these senses, but not in the other. Here are some remarks that you should be able to prove or research. 1. There are Riemann integrable functions that are not integrable [naive calculus sense]. 2. There are bounded, integrable functions [naive calculus sense] that are not Riemann integrable. 3. All Riemann integrable functions are integrable in the sense of Lebesgue. 4. A bounded function is Riemann integrable if and only if it is continuous at every point, excepting possibly at points in a set of measure zero. 5. The Riemann integral is considered to be a completely inadequate theory of integration and yet is the theory that is taught to most undergraduate mathematics students. We do not believe that you need to know more about the Riemann integral than these bare facts. Certainly any study that starts with Definition 4.28 and attempts to build and prove a theory of integration is a waste of time; few of the techniques generalize to other settings. Exercise 551 (Riemann criterion) Show that a function f would be Riemann integrable if and only if, for any ε > 0, there is a partition of the interval [a, b] for which

{([ai , bi ], ξi ) : i = 1, 2, . . . , n} n

∑ ω( f , [ai , bi ])(bi − ai) < ε.

i=1

152

CHAPTER 4. BEYOND THE CALCULUS INTEGRAL

Chapter 5

ANSWERS 5.1 Answers to problems Exercise 1, page 3 The symbols −∞ and ∞ do not stand for real numbers; they are used in various contexts n2 n = 1 and limn→∞ n+1 = ∞ have meanto describe a situation. For example limn→∞ n+1 ings that do not depend on there being a real number called ∞. Thus stating a < x < ∞ simply means that x is a real number larger than a. [It does not mean that x is a real number smaller than ∞, because there is no such number .]

Exercise 2, page 4 Well, we have labeled some intervals as “bounded” and some as unbounded. But the definition of a bounded set E requires that we produce a real number M so that |x| ≤ M for all x ∈ E or, equivalently that −M ≤ x ≤ M for all x ∈ E Show that the labels are correct in terms of this definition of what bounded means.

Exercise 3, page 4 Well, we have labeled some intervals as “open” and some as not. But the definition of an open set G requires that we produce, for each x ∈ G at least one interval (c, d) that contains x and is contained inside the set G. Show that the labels are correct in terms of the definition of what open means here. (It is almost immediate from the definition but make sure that you understand the logic and can write it down.)

Exercise 4, page 4 Again, we have labeled some intervals as “closed” and some as not. Show that the labels are correct in terms of the definition of what closed means here. Remember that the definition of closed is given in terms of the complementary set. A set E is closed if the set R \ E is an open set. So, for these intervals, write down explicitly what that complementary set is.

153

CHAPTER 5. ANSWERS

154

Exercise 5, page 4 For example [a, b) is not open because the point a is in the set but we cannot find an open interval that contains a and is also a subset of [a, b). Thus the definition fails at one point of the set. For not closed, work with the complement of [a, b), i.e., the set (−∞, a) ∪ [b, ∞) and find a point that illustrates that this set cannot be open. A glib (and incorrect) answer would be to say that [a, b) is not open because we have defined “open interval” to mean something different. The point here is that the interval [a, b) would include only points between a and b as well as the point a itself. Is that set open? No, because of the argument just given.

Exercise 6, page 4 Yes, if the two open intervals have a point in common [i.e., are not disjoint]. Otherwise / for this reason some authors [not us] call the the intersection would be the empty set 0; empty set a degenerate open interval.

Exercise 7, page 4 Not in general. If the two intervals have only one point in common or no points in common the intersection is not an interval. Yes, if the two closed intervals have at least two points in common. If we have agreed (as in the discussion to the preceding exercise) to call the empty set a degenerate open interval we would be obliged also to call it a degenerate closed interval.

Exercise 8, page 4 Not necessarily. The intersection could, of course, be the empty set which we do not interpret as an interval, and we must consider the empty set as bounded. Even if it is not empty it need not be unbounded. Consider (∞, 1) ∩ (0, ∞) = (0, 1).

Exercise 9, page 4 The only possibility would be (a, b) ∪ (c, d) = (s,t)

where the two intervals (a, b) and (c, d) have a point in common. In that case s = min{a, c} and t = max{b, d}. If (a, b) and (c, d) are disjoint then (a, b) ∪ (c, d) is not an interval, but a disjoint union of two open intervals.

Exercise 10, page 4 The only possibility would be (−∞, c] ∪ [c, ∞) = (−∞, , ∞).

Exercise 11, page 4 Yes. Prove, in fact, that the union of a finite number of bounded sets is a bounded set.

5.1. ANSWERS TO PROBLEMS

155

Exercise 12, page 4 Remember that A \ B is the set of all points that are in the set A but are not in the set B. If I is open you should discover that I \C is a union of a finite number of disjoint open intervals, and is an open set itself. For example if I = (a, b) and C = {c1 , c2 , . . . , cm } where these are points inside (a, b) then (a, b) \C = (a, c1 ) ∪ (c1 , c2 ) ∪ (c2 , c3 ) ∪ · · · ∪ (cm , b).

Exercise 13, page 4 Remember that A \ B is the set of all points that are in the set A but are not in the set B. The set I \C must be a union of intervals. There are a number of possibilities and so, to answer the exercise, it is best to just catalog them. For example if I = [a, b] and C = {a, b} then [a, b] \C is the open interval (a, b). If C = {c1 , c2 , . . . , cm } where these are points inside (a, b) then [a, b] \C = [a, c1 ) ∪ (c1 , c2 ) ∪ (c2 , c3 ) ∪ · · · ∪ (cm , b].

After handling all the possibilities it should be clear that I \ C is a union of a finite number of disjoint intervals. The intervals need not all be open or closed.

Exercise 14, page 6 If a sequence of real numbers {sn } converges to a real number L then, for any choice of ε0 > 0 there is an integer N so that L − ε0 < sn < L + ε0

for all integers n = N, N + 1, N + 2, N + 3, . . . . Thus to find a number M larger than all the values of |sn | we can select the maximum of these numbers: |s1 |, |s2 |, |s3 |, . . . , |sN−2 |, |sN−1 |, |L| + ε0 .

Exercise 15, page 6 The simplest bounded sequence that is not convergent would be sn = (−1)n . It is clearly bounded and obviously violates the definition of convergent.

Exercise 16, page 6 If a sequence of real numbers {sn } is Cauchy then, for any choice of ε0 > 0 there is an integer N so that |sn − sN | < ε0 for all integers n = N, N + 1, N + 2, N + 3, . . . . Thus to find a number M larger than all the values of |sn | we can select the maximum of these numbers: |s1 |, |s2 |, |s3 |, . . . , |sN−2 |, |sN−1 |, |sN | + ε0 .

CHAPTER 5. ANSWERS

156

The simplest bounded sequence that is not Cauchy would be sn = (−1)n . It is clearly bounded and obviously violates the definition of a Cauchy sequence.

Exercise 17, page 6 The easiest of these is the formula lim (asn + btn ) = a lim sn + b lim tn .

n→∞

n→∞

n→∞

You should certainly review your studies of sequence limits if it does not immediately occur to you how to prove this using the definition of limit. The product sn tn and quotient stnn formulas are a little harder to prove and require a bit of thinking about the inequalities. Make that when you state and try to prove the quotient formula sn limn→∞ sn lim = n→∞ tn limn→∞ tn you include an hypothesis to exclude division by zero on either side of the identity.

Exercise 18, page 6 We already know by an earlier exercise that a convergent sequence would have to be bounded, so it is enough for us to prove that on the assumption that this sequence is bounded it must converge. Since the sequence is bounded L = sup{sn : n = 1, 2, 3, . . . }

is a real number. It has the property (as do all suprema) that sn ≤ L for all n and, if ε > 0, then sn > L − ε for some n. Choose any integer N such that sN > L − ε. Then for all integers n ≥ N, By definition, then,

L − ε < sN ≤ sn ≤ L < L + ε. lim sn = L.

n→∞

Notice that if the sequence is unbounded then sup{sn : n = 1, 2, 3, . . . } = ∞

and the sequence diverges to ∞,

lim sn = ∞.

n→∞

You should be able to give a precise proof of this that refers to Definition 1.4.

Exercise 19, page 6 We construct first a nonincreasing subsequence if possible. We call the mth element sm of the sequence {sn } a turn-back point if all later elements are less than or equal to it, in symbols if sm ≥ sn for all n > m. If there is an infinite subsequence of turn-back points sm1 , sm2 , sm3 , sm4 , . . . then we have found our nonincreasing subsequence since sm1 ≥ sm2 ≥ sm3 ≥ sm4 ≥ . . . .

5.1. ANSWERS TO PROBLEMS

157

This would not be possible if there are only finitely many turn-back points. Let us suppose that sM is the last turn-back point so that any element sn for n > M is not a turn-back point. Since it is not there must be an element further on in the sequence greater than it, in symbols sm > sn for some m > n. Thus we can choose sm1 > sM+1 with m1 > M + 1, then sm2 > sm1 with m2 > m1 , and then sm3 > sm2 with m3 > m2 , and so on to obtain an increasing subsequence sM+1 < sm1 < sm2 < sm3 < sm4 < . . . as required.

Exercise 20, page 7 The condition on the intervals immediately shows that the two sequences {an } and {bn } are bounded and monotone. The sequence {an } is monotone nondecreasing and bounded above by b1 ; the sequence {bn } is monotone nonincreasing and bounded below by a1 . By Exercise 18 these sequences converge. Take either z = limn→∞ an or z = limn→∞ bn . This point is in all of the intervals. The assumption that lim (bn − an ) = 0

n→∞

makes it clear that only one point can be in all of the intervals.

Exercise 21, page 7 This follows immediately from Exercises 18 and 19. Take any monotone subsequence. Any one of them converges by Exercise 18 since the sequence and the subsequence must be bounded.

Exercise 22, page 7 If a sequence of real numbers {sn } converges to a real number L then, for every ε > 0 there is an integer N so that L − ε/2 < sn < L + ε/2

for all integers n ≥ N. Now consider pairs of integers n, m ≥ N. We compute that

|sn − sm | = |sn − L + L − sm| ≤ |sn − L| + |L − sm| < ε.

By definition then {sn } is a Cauchy sequence.

Exercise 23, page 7 By the way, before seeing a hint you might want to ask for a reason for the terminology. If every Cauchy sequence is convergent and every convergent sequence is Cauchy why bother with two words for the same idea. The answer is that this same language is used in other parts of mathematics where every convergent sequence is Cauchy, but not every Cauchy sequence is convergent. Since we are on the real line in this course we don’t have to worry about such unhappy possibilities. But we retain the language anyway.

CHAPTER 5. ANSWERS

158

What is most important for you to remember is the logic of this exercise so we will sketch that and leave the details for you to write out: 1. Every Cauchy sequence is bounded. 2. Every sequence has a monotone subsequence. 3. Every bounded, monotone sequence converges. 4. Therefore every Cauchy sequence has a convergent subsequence. 5. When a Cauchy sequence has a subsequence converging to a number L the sequence itself must converge to the number L. [Using an ε, N argument.]

Exercise 24, page 7 If x does not belong to E then it belongs to a component interval (a, b) of R \ E that contains no points of E. Thus there is a δ > 0 so that (x − δ, x + δ) does not contain any points of E. Since all points in the sequence {xn } belong to E this would contradict the statement that x = limn→∞ xn .

Exercise 25, page 8 Just notice that

n

Sn − Sm =

∑

ak

k=m−1

provided that n ≥ m.

Exercise 26, page 8 First observe, by the triangle inequality that n n ∑ ak ≤ ∑ |ak | . k=m k=m Then if we can choose an integer N so that n

∑ |ak | < ε

k=m

for all n ≥ m ≥ N, we can deduce immediately that n a ∑ k < ε k=m

for all n ≥ m ≥ N. What can we conclude? If the series of absolute values n

∑ |ak | = |a1 | + |a2| + |a3| + |a4| + . . .

k=1

converges then it must follow, without further checking, that the original series n

∑ ak = a1 + a2 + a3 + a4 + . . .

k=1

5.1. ANSWERS TO PROBLEMS

159

is also convergent. Thus to determine whether a series ∑nk=1 ak is absolutely convergent we need only check the corresponding series of absolute values.

Exercise 27, page 9 Just choose any finite number of points a = x0 < x1 < x2 < · · · < xn−1 < xn = b

so that the points are closer together than δ. Then no matter what points ξi in [xi−1 , xi ] we choose the partition {([xi−1 , xi ], ξi ) : i = 1, 2, 3, . . . , n}

of the interval [a, b] has the property that each interval [xi−1 , xi ] has length smaller than δ(ξi ) = δ. Note that this construction reveals just how hard it might seem to arrange for a partition if the values of δ(x) are allowed to vary.

Exercise 28, page 9 For every point x in a closed, bounded interval [a, b] let there be given a positive number δ(x). Let us call an interval [c, d] ⊂ [a, b] a black interval if there exists at least one partition {([xi−1 , xi ], ξi ) : i = 1, 2, 3, . . . , n} of the interval [c, d] with the property that each interval [xi−1 , xi ] has length smaller than δ(ξi ). If an interval is not black let us say it is white. Observe these facts about black intervals. 1. If [c, d] and [d, e] are black then [c, e] is black. 2. If [c, d] contains a point z for which d − c < δ(z) then [c, d] is black. The first statement follows from the fact that any partitions for [c, d] and [d, e] can be joined together to form a partition of [c, e]. The second statement follows from the fact that {[c, d], z)} alone makes up a partition satisfying the required condition in the Cousin lemma. Now here is the nested interval argument. We wish to prove that [a, b] is black. If it is not black then one of the two intervals [a, 21 (a + b)] or [ 12 (a + b), b] is white. If both were black then statement (1) makes [a, b] black. Choose that interval (the white one) as [a1 , b1 ]. Divide that interval into half again and produce another white interval of half the length. This produces [a1 , b1 ] ⊃ [a2 , b2 ] ⊃ [a3 , b3 ] . . . , a shrinking sequence of white intervals with lengths decreasing to zero, lim (bn − an ) = 0.

n→∞

By the nested interval argument then there is a unique point z that belongs to each of the intervals and there must be an integer N so that (bN − aN ) < δ(z). By statement (2) above that makes [aN , bN ] black which is a contradiction.

CHAPTER 5. ANSWERS

160

Exercise 29, page 9 For every point x in a closed, bounded interval [a, b] let there be given a positive number δ(x). Let us say that a number a < r ≤ b can be reached if there exists at least one partition {([xi−1 , xi ], ξi ) : i = 1, 2, 3, . . . , n}

of the interval [a, r] with the property that each interval [xi−1 , xi ] has length smaller than δ(ξi ). Define R as the last point that can be reached, i.e., R = sup{r : a < r ≤ b and r can be reached}.

This set is not empty since all points in (a, a + δ(a)) can be reached. Thus R is a real number no larger than b. Check that R itself can be reached. Indeed there must be points r in the interval (R − δ(R), R] that can be reached (by the definition of sups). If R − δ(R) < r < R and r can be reached, then R also can be reached by simply adding the element ([r, R], R) to the partition for [a, r]. Is R < b? No since if it were then we could reach a bigger point by adding a suitable pair ([R, s], R) to a partition for [a, R]. Consequently R = b and b can be reached, i.e., it is the last point that can be reached.

Exercise 30, page 9 For each x in [a, b] select a positive number δ(x) so that the open interval (x − δ(x), x + δ(x))

is inside some open interval of the family C . By Cousin’s lemma there exists at least one partition {([xi−1 , xi ], ξi ) : i = 1, 2, 3, . . . , n}

of the interval [a, b] with the property that each interval [xi−1 , xi ] has length smaller than δ(ξi ). For each i = 1, 2, 3, . . . select from C some open interval (ci , di ) that contains (ξi − δ(ξi ), ξi + δ(ξi )).

This finite list of intervals from C

(c1 , d1 ), (c2 , c3 ), , . . . , (cn , dn ) contains every point of [a, b] since every interval [xi−1 , xi ] is contained in one of these open intervals.

Exercise 31, page 9 / then for Use a proof by contradiction. For example if (a, b) ⊂ G1 ∪ G2 , G1 ∪ G2 = 0, every x ∈ G1 ∩ [a, b] there is a δ(x) > 0 so that (x − δ(x), x + δ(x)) ⊂ G1

and for every x ∈ G2 ∩ [a, b] there is a δ(x) > 0 so that

(x − δ(x), x + δ(x)) ⊂ G2 .

5.1. ANSWERS TO PROBLEMS

161

By Cousin’s lemma there exists at least one partition {([xi−1 , xi ], ξi ) : i = 1, 2, 3, . . . , n}

of the interval [a, b] with the property that each interval [xi−1 , xi ] has length smaller than δ(ξi ). Consequently each interval [xi−1 , xi ] belongs entirely either to G1 or belongs entirely either to G2 . This is impossible. For if a ∈ G1 , then [x0 , x1 ] ⊂ G1 . But that means [x1 , x2 ] ⊂ G1 , and [x2 , x3 ] ⊂ G1 , . . . , and indeed all of the intervals are subsets of G1 .

Exercise 32, page 9 / Use a proof by contradiction. Suppose, for example that [a, b] ⊂ G1 ∪ G2 , G1 ∪ G2 = 0. Then a is in one of these two open sets, say a ∈ G1 . Take the last point t for which [a,t) ⊂ G1 , i.e., t = sup{r : a < r ≤ b, [a, r) ⊂ G1 }.

That number cannot be b, otherwise G2 contains no point of the interval. And that number cannot be in the open set G1 , otherwise we failed to pick the last such number. Thus t ∈ G2 . But the situation t ∈ G2 requires there to be some interval (c, d) containing t and entirely contained inside G2 . That gives us (c,t) ⊂ G1 and (c,t) ⊂ G2 . This is a / contradiction to the requirement that G1 ∪ G2 = 0.

Exercise 33, page 9 Take any two points in the set, s < t. If there is a point s < c < t that is not in the set E then E ⊂ (−∞, c) ∪ (c, ∞) exhibits, by definition, that the set is disconnected. So the set E contains all points between any two of its elements. Consequently E is either (a, b) or [a, b) or (a, b] or [a, b] where for a take inf E and for b take sup E.

Exercise 34, page 10 You should remember that these functions are defined for all real numbers, with the exception that tan(±π/2) = ±∞. So, since we do not consider functions to have infinite values, the function tan x is considered to be defined at all reals that are not of the form (n + 1/2)π/2 for some integer n.

Exercise 35, page 10 The value of arcsin x is defined to be the number −π/2 ≤ y ≤ π/2 such that sin y = x. The only numbers x that permit a solution to this equation are from the interval [−1, 1]. The value of arctan x is defined to be the number −π/2 ≤ y ≤ π/2 such that tan y = x. This equation can be solved for all real numbers x so the assumed domain of arctan x is the entire real line.

Exercise 36, page 11 The exponential function ex is defined for all values of x so its domain is the whole real line. The logarithm function is the inverse defined by requiring log x = y to mean

CHAPTER 5. ANSWERS

162

ey = x. Since ey is always positive the logarithm function cannot be defined at zero or at any negative number. In fact the domain of log x is the open unbounded interval (0, ∞).

Exercise 37, page 11 √ √ Check that x2 − x − 1 only at the points c1 = 1/2 + 5/2 and c2 = 1/2 − 5/2. Thus the domain of the first function would be assumed to be (−∞, c2 ) ∪ (c2 , c1 ) ∪ (c1 , ∞)> Also x2 − x − 1 ≥ 0 only on the intervals (∞, c2 ] and [c1 , ∞) so the the domain of the second function would be assumed to be (∞, c2 ] ∪ [c1 , ∞). Finally, the third function is a composition. We cannot write arcsin t unless −1 ≤ t ≤ 1, consequently we cannot write arcsin(x2 − x − 1) unless −1 ≤ x2 − x − 1 ≤ 1, or equivalently 0 ≤ x2 − x ≤ 2. But −1 ≤ x2 − x − 1 on the intervals (−∞, 0) and (1, ∞), while x2 − x − 1 ≤ 1 on the interval [−1, 2]. So finally arcsin(x2 − x − 1) can only be written for x in the intervals [−1, 0] and [1, 2]. So the the domain of the third function would be assumed to be [−1, 0] ∪ [1, 2].

Exercise 38, page 12 This is trivial. Just state the definition of uniform continuity and notice that it applies immediately to every point.

Exercise 39, page 12 Find a counterexample, i.e., find a function that is continuous on some open interval I and that is not necessarily uniformly continuous on that interval.

Exercise 40, page 13 There are lots of choices. Our favorite might be to set f (x) = x if x is rational and f (x) = −x if x is irrational. Then just check the definition at each point.

Exercise 41, page 13 To work with the definition one must know it precisely and also have an intuitive grasp. Usually we think that uniform continuity of f means . . . if d − c is small enough f (d) − f (c) should be small. For the function f (x) = x this becomes . . . if d − c is small enough d − c should be small. That alone is enough to indicate that the exercise must be trivial. Just write out the definition using δ = ε.

5.1. ANSWERS TO PROBLEMS

163

Exercise 42, page 13 Obtain a contradiction by assuming [falsely] that f (x) = x2 is uniformly continuous on the interval (−∞, ∞). Usually we think that uniform continuity of f means . . . if d − c is small enough f (d) − f (c) should be small. That means that the failure of uniform continuity should be thought of this way: . . . even though d − c is small f (d) − f (c) might not be small.

For the function f (x) = x2 this becomes

. . . even though d − c is small d 2 − c2 might not be small. A similar way of thinking is . . . even though t is small (x + t)2 − x2 might not be small. That should be enough to indicate a method of answering the exercise. Thus, take any particular ε > 0 and suppose [wrongly] that |d 2 − c2 )| < ε

whenever c, d are points for which |d −c| < δ. Take any large integer N so that 1/N < δ. Then |(N + 1/N)2 − 1/N 2 )| = N 2 + 2 < ε. This cannot be true for all large integers N so we have a contradiction. By the way, this method of finding two sequences xN = N and yN = N + 1/N to show that uniform continuity fails is turned into a general method in Exercise 91.

Exercise 43, page 13 The key is to factor x2 − y2 = (x + y)(x − y).

Then, we think that uniform continuity of f means

. . . if x − y is small enough f (x) − f (y) should be small.

For the function f (x) = x2 this becomes

. . . if x − y is small enough [x + y](x − y) should be small. In any bounded interval we can control the size of [x + y]. Here is a formal proof using this thinking. Let I be a bounded interval and suppose that |x| ≤ M for all x ∈ I. Let ε > 0 and choose δ = ε/(2M). Then, if |d − c| < δ

| f (d) − f (c)| = |d 2 − c2 | = |[d + c](d − c)| ≤ [|d| + |c|]|d − c| ≤ [2M]|d − c| < 2Mδ = ε. By definition, f is uniformly continuous on I.

Exercise 44, page 13 We have already proved that this function is uniformly continuous on any bounded interval. Use that fact on the interval (x0 − 1, x0 + 1).

CHAPTER 5. ANSWERS

164

Exercise 45, page 13 Suppose [falsely] that f (x) = 1x is uniformly continuous on the interval (0, ∞); then it must also be uniformly continuous on the bounded interval (0, 1). Using ε = 1 choose δ > 0 so that 1 1 − 0 and let ε > 0. We must choose a δ > 0 so that |1/x − 1/x0 | < ε

whenever x is a point in (0, ∞) for which |x − x0 | < δ. This an exercise in inequalities. Write x − x0 . |1/x − 1/x0 | = xx0

Note that if x > x0 /2 then xx0 /2 > x20 so that 1/[xx0 ] ≤ 1/[2x20 ]. These inequalities reveal the correct choice of δ and reveal where we should place the argument. We need not work in the entire interval (0, ∞) but can restrict the argument to the subinterval (x0 /2, 3x0 /2). Let x0 be a point in the interval (0, ∞). Work entirely inside the interval (x0 /2, 3x0 /2). Let ε > 0. Choose δ = εx20 and suppose that |x − x0 | < δ = εx20 and that x is a point in the interval (x0 /2, 3x0 /2). Then since x0 /2 < x, x − x0 εx20 ≤ | f (x) − f (x0 )| = < ε. xx0 2x20

By definition f (x) = 1x is continuous at the point x0 . Note this device used here: since pointwise continuity at x0 is a local property at a point we can restrict the argument to any open interval that contains x0 . If, by doing so, you can make the inequality work easier then, certainly, do so. Note: We have gone into some great detail in the exercise since this is at an early stage in our theory and it is an opportunity for instruction. You should be able to write up this proof in a shorter, more compelling presentation.

5.1. ANSWERS TO PROBLEMS

165

Exercise 46, page 13 It is easy to check that both rF(x) and sG(x) must be continuous at the point x0 . Thus it is enough to prove the result for r = s = 1, i.e., to prove that F(x) + G(x) must be continuous at the point x0 . The inequality |F(x) + G(x) − [F(x0 ) + G(x0 )| ≤ |F(x) − F(x0 )| + |G(x) − G(x0 )|

suggests an easy proof. Using the same method you should be successful in proving the following statement: Let F and G be functions that are uniformly continuous on an interval I. Then any linear combination H(x) = rF(x)+ sG(x) must also be uniformly continuous on I.

Exercise 47, page 13 The key is to use the simple inequality |F(x)G(x) − F(x0 )G(x0 )| = |F(x)G(x) − F(x0 )G(x) + F(x0 )G(x0 ) − F(x0 )G(x0 )| ≤ |F(x)G(x)−F(x0 )G(x)| + |F(x0 )G(x)−F(x0 )G(x0 )| = |G(x)| · |F(x)−F(x0 )| + |F(x0 )| · |G(x)−G(x0 )|

Since G is continuous at the point x0 there must be at least one interval (c, d) ⊂ I containing the point x0 so that G is bounded on (c, d). In fact we can use the definition of continuity to find a η > 0 so that and so, also

|G(x) − G(x0 )| < 1 for all x in (x0 − η, x0 + η) |G(x)| < |G(x0 )| + 1 for all x in (x0 − η, x0 + η).

Thus we can select such an interval (c, d) and a positive number M that is larger than |G(x)| + |F(x0 )| for all x in the interval (c, d). Let ε > 0. The assumptions imply the existence of the positive numbers δ1 and δ2 , such that ε |F(x) − F(x0 )| < 2M if |x − x0 | < δ1 and ε |G(x) − G(x0 )| < 2M if |x − x0 | < δ2 . Then, using any δ smaller than both δ1 and δ2 , and arguing inside the interval (c, d) we observe that |F(x)G(x) − F(x0 )G(x0 )| ≤ M|F(x) − F(x0 )| + M|G(x) − G(x0 )| < 2Mε/(2M) = ε.

if |x − x0 | < δ. This is immediate from the inequalities above. This proves that the product H(x) = F(x)G(x) must be continuous at the point x0 . Does the same statement apply to uniform continuity? In view of Exercise 46 you might be tempted to prove the following false theorem:

CHAPTER 5. ANSWERS

166

FALSE: Let F and G be functions that are uniformly continuous on an interval I. Then the product H(x) = F(x)G(x) must also be uniformly continuous on I. But note that F(x) = G(x) = x are both uniformly continuous on (−∞, ∞) while FG(x) = F(x)G(x) = x2 is not. The key is contained in your proof of this exercise. You needed boundedness to make the inequalities work. Here is a true version that you can prove using the methods that we used for the pointwise case: TRUE: Let F and G be functions that are uniformly continuous on an interval I. Suppose that G is bounded on the interval I. Then the product H(x) = F(x)G(x) must also be uniformly continuous on I. Later on we will find that, when working on bounded intervals, all uniformly continuous functions must be bounded. If you use this fact now and repeat your arguments you can prove the following version: Let F and G be functions that are uniformly continuous on a bounded interval I. Then the product H(x) = F(x)G(x) must also be uniformly continuous on I.

Exercise 48, page 13 Yes, if G(x0 ) 6= 0. The identity F(x) F(x0 ) F(x)G(x0 ) − F(x)G(x) + F (x)G(x) − F(x0 )G(x) . G(x) − G(x0 ) = G(x)G(x0 should help. You can also prove the following version for uniform continuity.

Let F and G be functions that are uniformly continuous on an interval I. Then the quotient H(x) = F(x)/G(x) must also be uniformly continuous on I provided that the functions F and 1/G are also defined and bounded on I.

Exercise 49, page 13 Let ε > 0 and determine η > 0 so that |G(z) − G(z0 )| < ε

whenever z is a point in J for which |z − z0 | < η. Now use the continuity of F at the point x0 to determine a δ > 0 so that |F(x) − F(x0 )| < η

whenever x is a point in I for which |x − x0 | < δ. Note that if x is a point in I for which |x − x0 | < δ, then z = F(x) is a point in J for which |z − z0 | < η. Thus |G(F(x)) − G(F(x0 ))| = |G(z) − G(z0 )| < ε.

5.1. ANSWERS TO PROBLEMS

167

Exercise 52, page 14 Too simple for a hint.

Exercise 53, page 14 Another way to think about this is that a function that is a sum of characteristic functions M

f (x) = ∑ ai χAi (x) i=1

is a step function if all the Ai are intervals or singleton sets. [Here χE (x), for a set E, is equal to 1 for points x in E and equal to 0 otherwise.] Let f : [a, b] → R be a step function. Show first that there is a partition a = x0 < x1 < x2 < · · · < xn−1 < xn = b

so that f is constant on each interval (xi−1 , xi ), i = 1, 2, . . . , n. This will display all possible discontinuities.

Exercise 54, page 14 This is not so hard and the title gives it away. Show first that R(x) = 1 if x is rational and is otherwise 0.

Exercise 55, page 14 We can interpret this statement, that the distance function is continuous, geometrically this way: if two points x1 and x2 are close together, then they are at roughly the same distance from the closed set C.

Exercise 56, page 15 This just requires connecting two definitions: the definition of continuity at a point and the definition of sequential limit at a point. Let ε > 0 and choose δ > 0 so that | f (x) − f (x0 )| < ε if |x − x0 | < δ. Now choose N so that |xn − x0 | < δ for n > N. Combining the two we get that | f (xn ) − f (x0 )| < ε if n > N. By definition that means that limn→∞ f (xn ) = f (x0 ). That proves one direction. To prove the other direction we can use a contrapositive argument: assume that continuity fails and then deduce that the sequence property also fails. Suppose that f is not continuous at x0 . Then, for some value of ε there cannot be a δ for which | f (x) − f (x0 )| < ε if |x − x0 | < δ. Consequently, for every integer n there must be at least one point xn in the interval so that |x0 − xn | < 1/n and yet | f (x) − f (x0 )| > ε. In other words we have produced a sequence {xn } → x0 for which limn→∞ f (xn ) = f (x0 ) fails.

CHAPTER 5. ANSWERS

168

Exercise 57, page 15 Suppose first that f is continuous. Let V be open, let x0 ∈ f −1 (V ) and choose α < β so that (α, β) ⊂ V and so that x0 ∈ f −1 ((α, β)). Then α < f (x0 ) < β. We will find a neighborhood U of x0 such that α < f (x) < β for all x ∈ U . Let ε = min(β − f (x0 ), f (x0 ) − α).

Since f is continuous at x0 , there exists a δ > 0 such that if x ∈ (x0 − δ, x0 + δ),

then

| f (x) − f (x0 )| < ε.

Thus and so f (x) < β. Similarly,

f (x) − f (x0 ) < β − f (x0 ), f (x) − f (x0 ) > α − f (x0 ),

and so f (x) > α. Thus the neighborhood U = (x0 − δ, x0 + δ) is a subset of f −1 ((α, β)) and hence also a subset of f −1 (V ). We have shown that each member of f −1 (V ) has a neighborhood in f −1 (V ). That is, f −1 (V ) is open. To prove the converse, suppose f satisfies the condition that for each open interval (α, β) with α < β, the set f −1 ((α, β)) is open. Take a point x0 . We must show that f is continuous at x0 . Let ε > 0, β = f (x0 ) + ε, α = f (x0 ) − ε. Our hypothesis implies that f −1 ((α, β)) is open. Thus there is at least open interval, (c, d) say, that is contained in this open set and contains the point x0 . Let δ = min(x0 − c, d − x0 ). For |x − x0 | < δ we find

α < f (x) < β.

Because β = f (x0 ) + ε and α = f (x0 ) − ε we must have | f (x) − f (x0 )| < ε.

This shows that f is continuous at x0 .

Exercise 59, page 15 In preparation . . .

Exercise 60, page 15 Because of Exercise 59 we already know that if ε > 0 then there is δ > 0 so that ω f ([c, d]) < ε whenever [c, d] is a subinterval of I for which |d − c| < δ. If the points of the subdivision a = x0 < x1 < x2 < x3 < · · · < xn−1 < xn = b

5.1. ANSWERS TO PROBLEMS

169

are chosen with gaps smaller than δ then, certainly, each of ω f ([x0 , x1 ]), ω f ([x1 , x2 ]), . . . , and ω f ([xn−1 , xn ]) is smaller than ε. Conversely suppose that there is such a subdivision. Let δ be one-half of the minimum of the lengths of the intervals [x0 , x1 ], [x1 , x2 ], . . . , [xn−1 , xn ]. Note that if we take any interval [c, d] with length less than δ that interval can meet no more than two of the intervals above. For example if [c, d] meets both [x0 , x1 ] and [x1 , x2 ], then ω f ([c, d]) ≤ ω f ([x1 , x2 ]) + ω f ([x1 , x2 ]) < 2ε.

In fact then any interval [c, d] with length less than δ must have ω f ([c, d]) < 2ε.

It follows that f is uniformly continuous on [a, b]. Is there a similar statement for uniform continuity on open intervals? Yes. Just check that f is a uniformly continuous function on an open, bounded interval (a, b) if and only if, for every ε > 0, there are points so that each of

a = x0 < x1 < x2 < x3 < · · · < xn−1 < xn = b ω f ((x0 , x1 ]), ω f ([x1 , x2 ]), . . . , and ω f ([xn−1 , xn ))

is smaller than ε.

Exercise 61, page 16 If f is continuous at a point x0 and ε > 0 there is a δ(x0 ) > 0 so that | f (x) − f (x0 )| < ε/2

for all |x−x0 | ≤ δ(x0 ). Take any two points u and v in the interval [x0 −δ(x0 ), x0 +δ(x0 )] and check that | f (v) − f (u)| ≤ | f (v) − f (x0 )| + | f (v) − f (x0 )| < ε/2 + ε/2 = ε.

It follows that

ω f ([x0 − δ(x0 ), x0 + δ(x0 )]) ≤ ε.

The other direction is easier since

| f (x) − f (x0 )| ≤ ω f ([x0 − δ(x0 ), x0 + δ(x0 )])

for all |x − x0 | ≤ δ(x0 ).

Exercise 62, page 16 This is just a rephrasing of the previous exercise.

Exercise 63, page 16 We use the fact that one-sided limits and sequential limits are equivalent in this sense: A necessary and sufficient condition in order that L = lim F(x) x→a+

CHAPTER 5. ANSWERS

170

should exist is that for all decreasing sequence of points {xn } convergent to a, the sequence { f (xn )} converges to L. Let us prove the easy direction first. Suppose that F(a+) = limx→a+ F(x) exists and let ε > 0. Choose δ(a) > 0 so that |F(a+) − F(x)| ≤ ε/3

for all a < x < a + δ(a). Then, for all c, d ∈ (a, a + δ(a),

|F(d) − F(c)| ≤ |F(a+) − F(d)| + |F(a+) − F(c)| ≤ 2ε/3.

It follows that

ωF((a, a + δ(a)) ≤ 2ε/3 < ε.

In the other direction consider a decreasing sequence of points {xn } convergent to a. Let ε > 0 and choose δ(a) so that ωF((a, a + δ(a)) < ε. Then there is an integer N so that |xn − a| < δ(a) for all n ≥ N. Thus |F(xn ) − F(xm )| ≤ ωF((a, a + δ)) < ε

for all m, n ≥ N. It follows from the Cauchy criterion for sequences that every such sequence {F(xn )} converges. The limit is evidently f (a+).

Exercise 64, page 16 We use the fact that infinite limits and sequential limits are equivalent in this sense: A necessary and sufficient condition in order that L = lim F(x) x→∞

should exist is that for all sequence of points {xn } divergent to ∞, the sequence {F(xn )} converges to L. Let us prove the easy direction first. Suppose that F(∞) = limx→∞ F(x) exists and let ε > 0. Choose T > 0 so that |F(∞) − F(x)| ≤ ε/3

for all T < x. Then, for all c, d ∈ (T, ∞), It follows that

|F(d) − F(c)| ≤ |F(∞) − F(d)| + |F(∞) − F(c)| ≤ 2ε/3. ωF((T, ∞)) ≤ 2ε/3 < ε.

In the other direction consider a sequence of points {xn } divergent to ∞. Let ε > 0 and choose T so that ωF((T, ∞)) < ε. Then there is an integer N so that xn > T for all n ≥ N. Thus

|F(xn ) − F(xm )| ≤ ωF((T, ∞)) < ε

for all m, n ≥ N. It follows from the Cauchy criterion for sequences that every such sequence {F(xn )} converges. The limit is evidently F(∞).

5.1. ANSWERS TO PROBLEMS

171

Exercise 65, page 18 This is a direct consequence of Exercise 63. Let ε > 0 and choose δ > 0 so that ωF((c, d)) < ε for all subintervals [c, d] of (a, b) for which d − c < δ. Then, certainly, ωF((a, a + δ)) < ε and ωF((b, b − δ)) < ε.

From this, Exercise 63 supplies the existence of the two one-sided limits F(a+) = lim F(x) and F(b−) = lim F(x). x→a+

x→b−

Exercise 66, page 18 Let ε > 0 and, using Exercise 63, choose positive numbers δ(a) and δ(a) so that ωF((a, a + δ(a))) < ε and ωF((b − δ(b), b)) < ε/2.

Now choose, for any point ξ ∈ (a, b), a positive number and δ(ξ) so that ωF([ξ − δ(ξ), ξ + δ(ξ)]) < ε.

This just uses the continuity of the function f at the point ξ in the oscillation version of that property that we studied in Section 1.5.5. By the Cousin partitioning argument there must exist points and a partition

a = x0 < x1 < x2 < · · · < xn−1 < xn = b {([xi−1 , xi ], ξi ) : i = 1, 2, 3, . . . , n}

of the whole interval [a, b] such that

ξi ∈ [xi−1 , xi ] and xi − xi−1 < δ(ξi ).

Just observe that this means that each of the following oscillations is smaller than ε: ωF((a, x1 ]), ωF([x1 , x2 ]), ωF([x2 , x3 ]), . . . , ωF([xn−1 , b)). It follows from Exercise 60 that f is uniformly continuous on (a, b).

Exercise 67, page 18 In one direction this is trivial. If F is defined on (a, b) but can be extended to a uniformly continuous function on [a, b] then F is already uniformly continuous on (a, b). The other direction is supplied by the theorem, in fact in the proof of the theorem. That proof supplied points a = x0 < x1 < x2 < · · · < xn−1 < xn = b

so that each of the following oscillations is smaller than ε:

ωF((a, x1 ])), ωF([x1 , x2 ]), ωF([x2 , x3 ]), . . . , ωF([xn−1 , b)).

CHAPTER 5. ANSWERS

172

Now define G = F on (a, b) and G(a) = F(a+), G(b) = F(b−). Then ωF((a, x1 ])) = ωG([a, x1 ])) and ωF([xn−1 , b)) = ωG([xn−1 , b]). With this rather subtle change we have now produced points a = x0 < x1 < x2 < · · · < xn−1 < xn = b

so that each of the following oscillations is smaller than ε:

ωG([a, x1 ])), ωG([x1 , x2 ])), ωG([x2 , x3 ])), . . . , ωG([xn−1 , b]). It follows from Exercise 60 that G is uniformly continuous on [a, b].

Exercise 68, page 18 If F is continuous on (a, b) and [c, d] ⊂ (a, b) note that F is continuous on [c, d] and that F(c) = F(c+) and F(d) = F(d−). Applying the theorem we see that F is uniformly continuous on [c, d].

Exercise 69, page 18 If F is continuous on (a, b) and monotone nondecreasing then we know that either F(a+) = limx→a+ F(x) exists as a finite real number or else F(a+) = −∞. Similarly know that either F(b−) = limx→b− F(x) exists as a finite real number or else F(b−) = +∞. Thus, by Theorem 1.12 the function is uniformly continuous on (a, b) provided only that the function is bounded. Conversely, in order for the function F to be uniformly continuous on (a, b), it must be bounded since all uniformly continuous functions are bounded on bounded intervals.

Exercise 70, page 18 This proof invokes a Bolzano-Weierstrass compactness argument. We use an indirect proof. If F is not uniformly continuous, then there are sequences {xn } and {yn } so that xn − yn → 0 but |F(xn ) − F(yn )| > c

for some positive c. (The verification of this step is left out, but you should supply it. This can be obtained merely by negating the formal statement that f is uniformly continuous on [a, b].) Now apply the Bolzano-Weierstrass property to obtain a convergent subsequence {xnk }. Write z as the limit of this new sequence {xnk }. Observe that xnk − ynk → 0 since xn − yn → 0. Thus {xnk } and the corresponding subsequence {ynk } of the sequence {yn } both converge to the same limit z, which must be a point in the interval [a, b]. If a < z c for all n, this means from our study of sequence limits that |F(z) − F(z)| ≥ c > 0 and this is impossible.

5.1. ANSWERS TO PROBLEMS

173

Now suppose that z = a. Since F(a+) = lim F(x) x→a+

exists it also follows that F(xnk ) → F(a+) and F(ynk ) → F(a+). Again this is impossible. The remaining case, z = b is similarly handled.

Exercise 71, page 18 Choose open intervals (a, a + δ(a)), (b − δ(b), b) so that

ωF((a, a + δ(a))) < ε/2 and ωF((b − δ(b), b)) < ε/2

At the endpoints a and b this is possible because the one-sided limits exist (i.e., by Exercise 63). For each point x ∈ (a, b) we may choose intervals (x − δ(x), x + δ(x) in such a way that ωF((x − δ(x)x + δ(x)) < ε/2. At the points x ∈ (a, b) this is possible because of our assumption that F is continuous at all such points. Pick points s and t with a < s < a + δ(a) and b − δ(b) < t < b. Now apply the Heine-Borel property to this covering of the closed interval [s,t]. There are now a finite number of open intervals (xi − δ(xi ), xi + δ(xi ) with i = 1, 2, 3, . . . , k covering [s,t]. Let δ be half the minimum length of all the intervals (a, a + δ(a)), (b − δ(b), b), (xi − δ(xi ), xi + δ(xi ) (i = 1, 2, 3, . . . , k).

Use this to show that ωF([c, d]) < ε if [c, d] ⊂ (a, b) and d − c < δ.

Exercise 72, page 18 There are functions that are continuous at every point of (−∞, ∞) and yet are not uniformly continuous. Find one.

Exercise 73, page 18 There are functions that are continuous at every point of (0, 1) and yet are not uniformly continuous. Find one.

Exercise 74, page 18 Slight of hand. Choose δ. Wait a minute. Our choice of δ depended on x0 so, to make the trick more transparent, call it δ(x0 ). Then if d is some other point you will need a different value of δ.

Exercise 75, page 19 If G is continuous at every point of an interval [c, d] then the theorem (Theorem 1.12) applies to show that G is uniformly continuous on that interval.

CHAPTER 5. ANSWERS

174

Exercise 76, page 19 Just read this from the theorem.

Exercise 77, page 19 Just read this from the theorem.

Exercise 78, page 19 In preparation . . .

Exercise 79, page 20 Let f be a uniformly continuous function on a closed, bounded interval [a, b]. Take any value of ε0 > 0. Then Exercise 60 supplies points so that each of

a = x0 < x1 < x2 < x3 < · · · < xn−1 < xn = b ω f ([), ω f ((x1 , x2 ]), . . . , ω f ((xn−1 , xn ])

is smaller than ε0 . In particular, f is bounded on each of these intervals. Consequently f is bounded on all of [a, b]. The same proof could be used if we had started with a uniformly continuous function on an open bounded interval (a, b). Note that if the interval is unbounded then such a finite collection of subintervals would not exist.

Exercise 80, page 20 The condition of pointwise continuity at a point x0 gives us an inequality | f (x) − f (x0 )| < ε

that must hold for some interval (x0 − δ, x0 + δ). This immediately provides the inequality | f (x)| = | f (x) − f (x0 ) + f (x0 )| ≤ | f (x) − f (x0 )| + | f (x0 )| < ε + | f (x0 )|

which provides an upper bound for f in the interval (x0 − δ, x0 + δ).

Exercise 81, page 20 No. It doesn’t follow. For a counterexample, the function f (x) = sin(1/x) is a continuous, bounded function on the bounded open interval (0, 1). This cannot be uniformly continuous because ω f ((0,t)) = 2 for every t > 0. This function appears again in Exercise 107 The graph of F is shown in Figure 5.1 and helps to illustrate that the continuity cannot be uniform on (0, 1). Later on we will see that if a function f is uniformly continuous on a bounded open interval (a, b) then the one-sided limits at the endpoints a and b must exist. For the

5.1. ANSWERS TO PROBLEMS

175

example f (x) = sin(1/x) on the bounded open interval (0, 1) we can check that f (0+) does not exist, which it must if f were to be uniformly continuous on (0, 1).

Exercise 82, page 20 We assume here that you have studied sequences and convergence of sequences. If f is not bounded then there must be a point x1 in the interval for which | f (x1 )| > 1. If not then | f (x)| ≤ 1 and we have found an upper bound for the values of the function. Similarly there must be a point x2 in the interval for which | f (x2 )| > 2, and a point x3 in the interval for which | f (x3 )| > 3. Continue choosing points and then check that | f (xn )| → ∞.

Exercise 85, page 20 The function f (x) = x is uniformly continuous on the unbounded interval (∞, ∞) and yet it is not bounded. On the other hand the function f (x) = sin x is uniformly continuous on the unbounded interval (∞, ∞) and it is bounded, since | sin x| ≤ 1.

Exercise 86, page 20 Yes and yes. If | f (x)| ≤ M and |g(x)| ≤ N for all x in an interval I then and

| f (x) + g(x)| ≤ | f (x)| + |g(x)| ≤ M + N | f (x)g(x)| ≤ MN.

Exercise 87, page 20 No. On the interval (0, 1) the functions f (x) = x and g(x) = x2 are bounded functions that do not assume the value zero. The quotient function g/ f is bounded but the quotient function f /g is unbounded.

Exercise 88, page 20 If the values of f (t) are bounded then the values of f (g(x)) are bounded since they include only the same values. Thus there is no need for the extra hypothesis that g is bounded.

Exercise 89, page 21 Use the mean-value theorem to assist in this. If c < d then sin d − sin c = (d − c) cos ξ

for some point ξ between c and d. Don’t remember the mean-value theorem? Well use these basic facts instead: sin d − sin c = sin[(d − c)/2] cos[(d + c)/2]

CHAPTER 5. ANSWERS

176 and | sin x| ≤ |x|.

Stop reading and try the problem now . . .

If you followed either of these hints then you have arrived at an inequality of the form | sin d − sin c| ≤ M|d − c|.

Functions that satisfy this so-called Lipschitz condition are easily shown to be uniformly continuous. For δ you will find that δ = ε/M works.

Exercise 90, page 21 For δ you will find that δ = ε/M works.

Exercise 91, page 21 If f is uniformly continuous then, by definition, for every ε > 0 there is a δ > 0 so that | f (d) − f (c)| < ε

whenever c, d are points in I for which |d − c| < δ. If there did exist two sequences of points {xn } and {xn } from that interval for which xn − yn → 0 then there would be an integer N so that |xn − yn | < δ for n > N. Consequently | f (xn ) − f (yn )| < ε

for n > N. By that means, by the usual sequence definitions that f (xn ) − f (yn ) does indeed converge to zero. Conversely if f is not uniformly continuous on the interval I then for some value ε0 > 0 and every integer n the statement that the inequality | f (d) − f (c)| < ε0

holds whenever c, d are points in I for which |d − c| < 1/n. must fail. Thus it is possible to select points {xn } and {xn } from that interval for which |xn − yn | < 1/n but | f (xn ) − f (yn )| ≥ ε0 . Consequently we have exhibited two sequences of points {xn } and {xn } from that interval for which xn − yn → 0 but f (xn ) − f (yn ) does not converge to zero.

Exercise 92, page 21 By Theorem 1.18, F is bounded and so we may suppose that M is the least upper bound for the values of F, i.e., M = sup{ f (x) : a ≤ x ≤ b}.

If there exists x0 such that F(x0 ) = M, then F achieves a maximum value M. Suppose, then, that F(x) < M for all x ∈ [a, b]. We show this is impossible.

5.1. ANSWERS TO PROBLEMS

177

Let g(x) = 1/(M − F(x)). For each x ∈ [a, b], F(x) 6= M; as a consequence, g is uniformly continuous and g(x) > 0 for all x ∈ [a, b]. From the definition of M we see that inf{M − f (x) : x ∈ [a, b]} = 0, so

1 : x ∈ [a, b] = ∞. sup M − f (x) This means that g is not bounded on [a, b]. This is impossible because uniformly continuous are bounded on bounded intervals. A similar proof would show that F has an absolute minimum.

Exercise 93, page 21 We can also prove Theorem 1.21 using a Bolzano-Weierstrass argument. Let M = sup{F(x) : a ≤ x ≤ b}.

That means that for any integer n the smaller number M − 1/n cannot be an upper bound for the values of the function F on this interval. Consequently we can choose a sequence of points {xn } from [a, b] so that F(xn ) > M − 1/n.

Now apply the Bolzano-Weierstrass theorem to find a subsequence {xnk } that converges to some point z0 in [a, b]. Use the continuity of F to deduce that lim F(xnk ) = F(z0 ).

k→∞

Since M ≥ F(xnk ) > M − 1/nk

it must follow that F(z0 ) = M. Thus the function F attains its maximum value at z0 .

Exercise 94, page 21 How about sin 2πx? This example is particularly easy to think about since the minimum value could only occur at an endpoint and we have excluded both endpoints by working only on the interval (0, 1). In fact we should notice that this feature is general: if f is a uniformly continuous function on the interval (0, 1) then there is an extension of f to a uniformly continuous function on the interval [0, 1] and the maximum and minimum values are attained on [0, 1] [but not necessarily on (0, 1)].

Exercise 95, page 21 How about 1 − sin 2πx?

Exercise 96, page 21 Simplest is f (x) = x.

178

CHAPTER 5. ANSWERS

Exercise 97, page 21 Simplest is f (x) = x.

Exercise 98, page 21 Try f (x) = arctan x.

Exercise 99, page 22 If there is a point f (x0 ) = c > 0, then there is an interval [−N, N] so that x0 ∈ [−N, N] and | f (x)| < c/2 for all x > N and x < −N. Now since f is uniformly continuous on [−N, N] we may select a maximum point. That maximum will be the maximum also on (−∞, ∞). If there is no such point x0 then f assumes only values negative or zero. Apply the same argument but to the function − f . For a suitable example of a function that has an absolute maximum but not an absolute minimum you may take f (x) = (1 + x2 )−1 .

Exercise 100, page 22 All values of f (x) are assumed in the interval [0, p] and f is uniformly continuous on [0, p]. It would not be correct to argue that f is uniformly continuous on (−∞, ∞) [it is] and “hence” that f must have a maximum and minimum [it would not follow].

Exercise 101, page 22 This is a deeper theorem than you might imagine and will require a use of one of our more sophisticated arguments. Try using the Cousin covering argument. Let f be continuous at points a and b and at all points in between, and let c ∈ R. If for every x ∈ [a, b], f (x) 6= c, then either f (x) > c for all x ∈ [a, b] or f (x) < c for all x ∈ [a, b]. Let C denote the collection of closed intervals J such that f (x) < c for all x ∈ J or f (x) > c for all x ∈ J. We verify that C forms a Cousin cover of [a, b]. If x ∈ [a, b], then | f (x) − c| = ε > 0, so there exists δ > 0 such that | f (t) − f (x)| < ε whenever |t − x| < δ > and t ∈ [a, b]. Thus, if f (x) < c, then f (t) < c for all t ∈ [x − δ/2, x + δ/2], while if f (x) > c, then f (t) > c for all t ∈ [x − δ/2, x + δ/2]. By Cousin’s lemma there exists a partition of [a, b], a = x0 < x1 < · · · < xn = b such that for i = 1, . . . , n, [xi−1 , xi ] ∈ C . Suppose now that f (a) < c. The argument is similar if f (a) > c. Since [a, x1 ] = [x0 , x1 ] ∈ C , f (x) < c for all x ∈ [x0 , x1 ]. Analogously, since [x1 , x2 ] ∈ C , and f (x1 ) < c, f (x) < c for x ∈ [x1 , x2 ]. Proceeding in this way, we see that f (x) < c for all x ∈ [a, b].

Exercise 102, page 22 We can prove Theorem 1.23 using the Bolzano-Weierstrass property of sequences rather than Cousin’s lemma. Suppose that the theorem is false and explain, then, why there should exist sequences {xn } and {yn } from [a, b] so that f (xn ) > c, f (yn ) < c and |xn − yn | < 1/n.

5.1. ANSWERS TO PROBLEMS

179

Exercise 103, page 22 We can prove Theorem 1.23 using the Heine-Borel property. Suppose that the theorem is false and explain, then, why there should exist at each point x ∈ [a, b] an open interval Ix centered at x so that either f (t) > c for all t ∈ Ix ∩ [a, b] or else f (t) < c for all t ∈ Ix ∩ [a, b].

Exercise 104, page 22 We can prove Theorem 1.23 using the following “last point” argument: suppose that f (a) < c < f (b) and let z be the last point in [a, b] where f (z) stays below c, that is, let z = sup{x ∈ [a, b] : f (t) ≤ c for all a ≤ t ≤ x}.

Show that f (z) = c. You may take c = 0. Show that if f (z) > 0, then there is an interval [z − δ, z] on which f is positive. Show that if f (z) < 0, then there is an interval [z, z + δ] on which f is negative. Explain why each of these two cases is impossible.

Exercise 105, page 22 For any such function f the Darboux property implies that the image set is connected. In an earlier exercise we determined that all connected sets on the real line are intervals. For the examples we will need three functions F, G, H : (0, 1) → R so that the image under F is not open, the image under G is not closed, and the image under H is not bounded. You can check that F(x) = G(x) = x(1 − x) maps (0, 1) onto (0, 1/4], and that H(x) = 1/x maps (0, 1) onto (0, ∞).

Exercise 106, page 22 As in the preceding exercise we know that the image set is a connected set [by the Darboux property] and hence that it is an interval. This interval must be bounded since a uniformly continuous function on a closed, bounded interval [a, b] is bounded. This interval must be then either (A, B) or [A, b) or (A, B] or [A, B]. The possibilities (A, B) and (A, B] are impossible, for then the function would not have a minimum. The possibilities (A, B) and [A, B) are impossible, for then the function would not have a maximum.

Exercise 107, page 23 The graph of F, shown in Figure 5.1, will help in thinking about this function. Let us check that F is continuous everywhere, except at x0 = 0. If we are at a point x0 6= 0 then this function is the composition of two functions G(x) = sin x and H(x) = 1/x, both suitably continuous. So on any interval [s,t] that does not contain x0 = 0 the function is continuous and continuous functions satisfy the Darboux property. Let t > 0. On the interval [0,t], F(0) = 0 and F assumes every value between 1 and −1 infinitely often. On the interval [−t, 0], F(0) = 0 and F assumes every value between 1 and −1 infinitely often.

CHAPTER 5. ANSWERS

180

Figure 5.1: Graph of the function F(x) = sin x−1 on [−π/8, π/8].

Exercise 108, page 23 Consider the function g(x) = f (x) − x which must also be uniformly continuous. Now g(a) = f (a) − a ≥ 0 and g(b) = f (b) − b ≤ 0. By the Darboux property there must be a point where g(c) = 0. At that point f (c) − c = 0.

Exercise 109, page 23 If zn → c then zn = f (zn−1 ) → f (c). Consequently c = f (c).

Exercise 110, page 23 This is a puzzle. Use the fact that such functions will have maxima and minima in any interval [c, d] ⊂ I and that continuous functions have the Darboux property.

Exercise 111, page 23 This is again a puzzle. Use the fact that such functions will have maxima and minima in any interval [c, d] ⊂ I and that continuous functions have the Darboux property. You shouldn’t have too much trouble finding an example if I is a closed, bounded interval. What about if I is open?

Exercise 112, page 23 For such functions the one-sided limits f (x0 +) f (x0 −) exist at every point x0 and f (x0 −) ≤ f (x0 +). The function is discontinuous at x0 if and only if f (x0 −) < f (x0 +). Show that the Darboux property would not allow f (x0 −) < f (x0 +) at any point.

Exercise 113, page 24 In its usual definition

F(y) − F(x) y→x y−x

F ′ (x) = lim or, equivalently, lim

y→x

F(y) − F(x) ′ − F (x) = 0. y−x

5.1. ANSWERS TO PROBLEMS

181

But limits are defined exactly by ε, δ(x) methods. So that, in fact, this statement about the limit is equivalent to the statement that, for every ε > 0 there is a δ(x) > 0 so that F(y) − F(x) ′ − F (x) ≤ ε y−x

whenever 0 < |y − x| < δ. [Note the exclusion of y = x here.] The statement that is equivalent to the statement that

0 < |y − x| < δ(x)

0 < y − x < δ(x) or 0 < x − y < δ(x).

This, in turn, is identical to the statement that F(y) − F(x) − F ′ (x)(y − x) ≤ ε|y − x|

whenever 0 < |y − x| < δ. The case y = x which is formally excluded from such statements about limits can be accommodated here because the expression is zero for y = x. Consequently, a very small [hardly noticeable] cosmetic change shows that the limit derivative statement is exactly equivalent to the statement that, for every ε > 0, there is a δ(x) > 0 so that F(y) − F(x) − F ′ (x)(y − x) ≤ ε|y − x| whenever y is a points in I for which |y − x| < δ(x).

Exercise 114, page 24 As we just proved, if F ′ (x0 ) exists and F is defined on an interval I containing that point then, there is a δ(x0 ) > 0 so that F(y) − F(x0 ) − F ′ (x0 )(y − x) ≤ |y − x0 |

whenever y is a point in the interval I for which |y − x0 | < δ(x0 ). That translates quickly to the statement that |F(y) − F(x0 )| ≤ (|F ′ (x0 )| + 1)|y − x0 |

whenever y is a point in the interval I for which |y − x0 | < δ(x0 ). This gives the clue needed to write up this proof: If ε > 0 then we choose δ1 < δ(x0 ) so that δ1 < ε/(|F ′ (x0 )| + 1). Then |F(y) − F(x0 )| ≤ (|F ′ (x0 )| + 1)|y − x0 | < ε

whenever y is a point in the interval I for which |y − x0 | < δ1 . This is exactly the requirement for continuity at the point x0 .

Exercise 115, page 25 Note that F(z) − F(y) − F ′ (x)(z − y) = [F(z) − F(x)] − [F(x) − F(y)] − F ′ (x)([z − x] − [y − x]) ≤ F(z) − F(x) − F ′ (x)(z − x) + F(y) − F(x) − F ′ (x)(y − x) .

CHAPTER 5. ANSWERS

182

Thus the usual version of this statement quickly leads to the straddled version. Note that the straddled version includes the usual one. The word “straddled” refers to the fact that, instead of estimating [F(y)−F(x)]/[y− x] to obtain F ′ (x) we can straddle the point by taking z ≤ x ≤ y and estimating [F(y) − F(z)]/[y − z], still obtaining F ′ (x). If we neglect to straddle the point [it would be unstraddled if y and z were on the same side of x] then we would be talking about a much stronger notion of derivative.

Exercise 116, page 25 In preparation . . .

Exercise 117, page 25 If F ′ (x0 ) > 0 then, using ε = F ′ (x0 )/2, there must be a δ > 0 so that F(z) − F(x0 ) − F ′ (x0 )(z − x0 ) ≤ ε|z − x0 |

whenever z is a point in I for which |z − x0 | < δ. Suppose x0 < z < x0 + δ; then it follows from this inequality that and so

F(z) − F(x0 ) − F ′ (x0 )(z − x0 ) ≥ −ε(z − x0 ) F(z) ≥ F(x0 ) + ε(z − x0 )/2 > F(x0 ).

The argument is similar on the left at x0 . It is easy to use the definition of locally strictly increasing at a point to show that the derivative is nonnegative. Must it be positive? For a counterexample simply note that f (x) = x3 is locally strictly increasing at every point but that the derivative is not positive everywhere but has a zero at x0 = 0.

Exercise 118, page 25 Take any [c, d] ⊂ (a, b). We will show that F(d) > F(c) and this will complete the proof that F is strictly increasing on (a, b). For each x0 in c, d] there is a δ(x0 ) > 0 so that F(z) − F(x0 ) − F ′ (x0 )(z − x0 ) ≤ ε|z − x0 | whenever z is a point in I for which |z − x0 | < δ(x0 ). We apply the Cousin partitioning argument. There must exist at least one partition {([xi−1 , xi ], ξi ) : i = 1, 2, 3, . . . , n}

of the interval [c, d] with the property that each interval [xi−1 , xi ] has length smaller than δ(ξi ). Thus n

F(d) − F(c) = ∑ [F(xi − F(xi−1 )] > 0 i=1

since each of these terms must satisfy [F(xi − F(xi−1 )] = [F(xi − F(ξi )] + [F(ξi ) − F(xi−1 )] > 0.

5.1. ANSWERS TO PROBLEMS

183 f

a

c1

c2

b

Figure 5.2: Rolle’s theorem [note that f (a) = f (b)].

Exercise 119, page 26 The strategy, quite simply, is to argue that there is a point inside the interval where a maximum or minimum occurs. Accordingly the derivative is zero at that point. First, if f is constant on the interval, then f ′ (x) = 0 for all x ∈ (a, b), so ξ can be taken to be any point of the interval. Suppose then that f is not constant. Because f is uniformly continuous on the closed, bounded interval [a, b] , f achieves a maximum value M and a minimum value m on [a, b]. Because f is not constant, one of the values M or m is different from f (a) and f (b), say M > f (a). Choose f (ξ) = M. Since M > f (a) = f (b), c ∈ (a, b). Check that f ′ (c) = 0. If f ′ (c) > 0 then, by Exercise 117, the function f must be locally strictly increasing at x0 . But this is impossible if x0 is at a maximum for f . If f ′ (c) < 0 then, again by Exercise 117, the function − f must be locally strictly increasing at x0 . But this is impossible if x0 is at a maximum for f . It follows that f ′ (c) = 0.

Exercise 120, page 26 Rolle’s theorem asserts that, under our hypotheses, there is a point at which the tangent to the graph of the function is horizontal, and therefore has the same slope as the chord determined by the points (a, f (a)) and (b, f (b)). (See Figure 5.2.)

Exercise 121, page 26 There may, of course, be many such points; Rolle’s theorem just guarantees the existence of at least one such point. You should be able to construct a function, under these hypotheses, with an entire subinterval where the derivative vanishes.

Exercise 122, page 26 First check continuity at every point. This function is not differentiable at zero, but Rolle’s theorem requires differentiability only inside the interval, not at the endpoints. Continuity at the point zero is easily checked by using the inequality | f (x)| = |x sin x−1 | ≤ |x|. Continuity elsewhere follows from the fact that function f is differentiable (by the usual rules) and so continuous. Finally, in order to apply Rolle’s theorem, just check that 0 = f (0) = f (1/π).

184

CHAPTER 5. ANSWERS

There are an infinite number of points between 0 and 1/π where the derivative is zero. Rolle’s theorem guarantees that there is at least one.

Exercise 123, page 26 Yes. Notice that f fails to be differentiable at the endpoints of the interval, but Rolle’s theorem does not demand differentiability at either endpoint.

Exercise 124, page 26 No. Notice that f fails to be differentiable only at the midpoint of this interval, but Rolle’s theorem demands differentiability at all interior points, permitting nondifferentiability only at either endpoint. In this case, even though f (−1) = f (1), there is no point inside the interval where the derivative vanishes.

Exercise 125, page 26 In preparation . . .

Exercise 126, page 27 Use Rolle’s theorem to show that if x1 and x2 are distinct solutions of p(x) = 0, then between them is a solution of p′ (x) = 0.

Exercise 127, page 27 In preparation . . .

Exercise 128, page 27 Use Rolle’s theorem twice. See Exercise 130 for another variant on the same theme.

Exercise 129, page 27 Since f is continuous we already know (look it up) that f maps [a, b] to some closed, bounded interval [c, d]. Use Rolle’s theorem to show that there cannot be two values in [a, b] mapping to the same point.

Exercise 130, page 27 cf. Exercise 128.

Exercise 131, page 27 We prove this theorem by subtracting from f a function whose graph is the straight line determined by the chord in question and then applying Rolle’s theorem. Let f (b) − f (a) (x − a). L(x) = f (a) + b−a

5.1. ANSWERS TO PROBLEMS

185

We see that L(a) = f (a) and L(b) = f (b). Now let g(x) = f (x) − L(x).

(5.1)

Then g is continuous on [a,b], differentiable on (a,b) , and satisfies the condition g(a) = g(b) = 0. By Rolle’s theorem, there exists c ∈ (a, b) such that g′ (c) = 0. Differentiating (5.1), we see that f ′ (c) = L′ (c). But f (b) − f (a) , L′ (c) = b−a so f (b) − f (a) f ′ (c) = , b−a as was to be proved.

Exercise 132, page 28 The first statement is just the mean-value theorem applied to every subinterval. For the second statement, note that an increasing function f would allow only positive numbers in S. But increasing functions may have zero derivatives (e.g., f (x) = x3 ).

Exercise 133, page 28 If t, measured in hours, starts at time t = 0 and advances to time t = 2 then s(2) − s(1) s′ (τ) = = 100/2 2 at some point τ in time between starting and finishing.

Exercise 134, page 28 The mean-value p theorem includes Rolle’s theorem as a special case. So our previous example f (x) = |x| which fails to have a derivative at the point x0 = 0 does not satisfy the hypotheses of the mean-value theorem and the conclusion, as we noted earlier, fails.

Exercise 135, page 28 Take any example where the mean-value theorem can be applied and then just change the values of the function at the endpoints.

Exercise 136, page 28 Apply the mean-value theorem to f on the interval [x, x + a] to obtain a point ξ in [x, x + a] with f (x + a) − f (x) = a f ′ (ξ).

CHAPTER 5. ANSWERS

186

Exercise 137, page 28 Use the mean-value theorem to compute f (x) − f (a) . lim x→a+ x−a

Exercise 138, page 28 This is just a variant on Exercise 137. Show that under these assumptions f ′ is continuous at x0 .

Exercise 139, page 29 Use the mean-value theorem to relate ∞

∑ ( f (i + 1) − f (i))

i=1

to

∞

∑ f ′(i).

i=1

Note that f is increasing and treat the former series as a telescoping series.

Exercise 140, page 29 The proof of the mean-value theorem was obtained by applying Rolle’s theorem to the function f (b) − f (a) g(x) = f (x) − f (a) − (x − a). b−a For this mean-value theorem apply Rolle’s theorem twice to a function of the form h(x) = f (x) − f (a) − f ′ (a)(x − a) − α(x − a)2

for an appropriate number α.

Exercise 141, page 29 In preparation . . .

Exercise 142, page 29 Write f (x + h) + f (x − h) − 2 f (x) =

[ f (x + h) − f (x)] + [ f (x − h) − f (x)]

and apply the mean-value theorem to each term.

5.1. ANSWERS TO PROBLEMS

187

Exercise 143, page 29 Let φ(x) = [ f (b) − f (a)]g(x) − [g(b) − g(a)] f (x).

Then φ is continuous on [a,b] and differentiable on (a,b) . Furthermore, φ(a) = f (b)g(a) − f (a)g(b) = φ(b).

By Rolle’s theorem, there exists ξ ∈ (a, b) for which φ′ (ξ) = 0. It is clear that this point ξ satisfies [ f (b) − f (a)]g′ (ξ) = [g(b) − g(a)] f ′ (ξ).

Exercise 144, page 29 We can interpret the mean-value theorem as applied to curves given parametrically. Suppose f and g are uniformly continuous on [a, b] and differentiable on (a, b). Consider the curve given parametrically by x = g(t) ,

y = f (t)

(t ∈ [a, b]).

As t varies over the interval [a,b], the point (x, y) traces out a curve C joining the points (g(a), f (a)) and (g(b), f (b)). If g(a) 6= g(b), the slope of the chord determined by these points is f (b) − f (a) . g(b) − g(a) Cauchy’s form of the mean-value theorem asserts that there is a point (x, y) on C at which the tangent is parallel to the chord in question.

Exercise 145, page 29 In its simplest form, l’Hï£¡pital’s rule states that for functions f and g, if lim f (x) = lim g(x) = 0

x→c

x→c

and lim f ′ (x)/g′ (x)

x→c

exists, then

f (x) f ′ (x) = lim ′ . x→c g(x) x→c g (x) You can use Cauchy’s mean-value theorem to prove this simple version. Make sure to state your assumptions to match up to the situation in the statement of Cauchy’s mean-value theorem. lim

Exercise 146, page 30 Just expand the determinant.

CHAPTER 5. ANSWERS

188

Exercise 147, page 30 Let φ(x) be f (a) g(a) h(a) f (b) g(b) h(b) f (x) g(x) h(x)

and imitate the proof of Theorem 143.

Exercise 148, page 30 By the mean-value theorem f (b) − f (a) f (c) − f (b) = f ′ (ξ1 ) ≥ f ′ (ξ2 ) = c−b b−a for some points a < ξ1 f ′ (ξ2 ) = . c−b b−a

Exercise 149, page 30 From Lipman Bers, Classroom Notes: On Avoiding the Mean Value Theorem, Amer. Math. Monthly 74 (1967), no. 5, 583. It is hard to agree with this eminent mathematician that students should avoid the mean-value theorem, but (perhaps) for some elementary classes this is reasonable. Here is his proof: “This is intuitively obvious and easy to prove. Indeed, assume that there is a p, a 0 we have a < q < p. If f (q) ≥ f (p), then since f ′ (q) > 0, there are points of S to the right of q. If f (q) < f (p), then q is not in S and, by continuity, there are no points of S near and to the left of q. Contradiction. . . . The “full" mean value theorem, for differentiable but not continuously differentiable functions is a curiosity. It may be discussed together with another curiosity, Darboux’ theorem that every derivative obeys the intermediate value theorem.”

Exercise 150, page 30 From Howard Levi, Classroom Notes: Integration, Anti-Differentiation and a Converse to the Mean Value Theorem, Amer. Math. Monthly 74 (1967), no. 5, 585–586.

5.1. ANSWERS TO PROBLEMS

189

Exercise 151, page 30 If there are no exceptional points then the usual mean-value theorem does the job. If, say, there is only one point c inside where f ′ (c) does not exist then apply the meanvalue theorem on both of the intervals [a, c] and [c, b] to get two points ξ1 and ξ2 so that f (c) − f (a) = f ′ (ξ1 ) c−a and f (b) − f (c) = f ′ (ξ2 ). b−c Then | f (b) − f (a)| ≤ (c − a)| f ′ (ξ1 )| + (b − c)| f ′ (ξ2 )| ≤ M(b − a)

where for M we just choose whichever is larger, | f ′ (ξ1 )| or | f ′ (ξ1 )|. A similar proof will handle more exceptional points. For a method of proof that does not invoke the mean-value theorem see Israel Halperin, Classroom Notes: A Fundamental Theorem of the Calculus, Amer. Math. Monthly 61 (1954), no. 2, 122–123.

Exercise 152, page 30 This simple theorem first appears in T. Flett, A mean value theorem, Math. Gazette (1958), 42, 38–39. We can assume that f ′ (a) = f ′ (b) = 0 [otherwise work with f (x) − f ′ (a)x]. Consider the function g(x) defined to be [ f (x) − f (a)]/[x − a] for x 6= a and f ′ (a) at x = a. We compute that g(x) f ′ (x) f (x) − f (a) f ′ (x) + = − + . g′ (x) = − (x − a))2 x−a x−a x−a Evidently to prove the theorem is to prove that g′ has a zero in (a, b). Check that such a zero will solve the problem. To get the zero of g′ first consider whether g(a) = g(b). If so then Rolle’s theorem does the job. If, instead, g(b) > g(a) then g(b) g′ (b) = − < 0. b−a Thus g is locally decreasing at b. There would then have to be at least one point x1 for which g(x1 ) > g(b) > g(a). The Darboux property of the continuous function g will supply a point x0 at which g(x0 ) = g(b). Apply Rolle’s theorem to find that g′ has a zero in (x0 , b). Finally, if g(b) < g(a), then an identical argument should produce the same result.

Exercise 153, page 31 Repeat the arguments for Rolle’s theorem with these new hypotheses. Then just take any γ between F ′ (a) and F ′ (b) and write G(x) = F(x) − γx. If F ′ (a) < γ < F ′ (b), then G′ (a) = F ′ (a) − γ < 0 and G′ (b) = F ′ (b) − γ > 0. This shows then that there is a point

CHAPTER 5. ANSWERS

190 ξ ∈ (a, b) such that G′ (ξ) = 0. For this ξ we have

F ′ (ξ) = G′ (ξ) + γ = γ,

completing the proof for the case F ′ (a) < F ′ (b). The proof when F ′ (a) > F ′ (b) is similar.

Exercise 154, page 31 If F ′ is continuous, then it is easy to check that Eα is closed. In the opposite direction suppose that every Eα is closed and F ′ is not continuous. Then show that there must be a number β and a sequence of points {xn } converging to a point z and yet f ′ (xn ) ≥ β and f ′ (z) < β. Apply the Darboux property of the derivative to show that this cannot happen if Eβ is closed. Deduce that F ′ is continuous.

Exercise 155, page 31 Polynomials have continuous derivatives and only finitely many points where the value is zero. Let p(x) be a polynomial. Then p′ (x) is also a polynomial. Collect all the points c1 , c2 , . . . , c p where p′ (x) = 0. In between these points, the value of the derivative is either always positive or always negative otherwise the Darboux property of p′ would be violated. On those intervals the function is decreasing or increasing.

Exercise 156, page 31 Take any point a < x ≤ b and, applying the mean-value theorem on the interval [a, x], we obtain that |F(x) − F(a)| = F ′ (ξ)(x − a) = 0(x − a) = 0. Consequently F(x) = F(a) for all a < x ≤ b. Thus F is constant.

Exercise 157, page 32 In Exercise 149 we established (without the mean-value theorem) that a function with a positive derivative is increasing. Now we assume that F ′ (x) = 0 everywhere in the interval (a, b). Consequently, for any integer n, the functions G(x) = F(x) + x/n and H(x) = x/n − F(x) both have a positive derivative and are therefore increasing. In particular, if x < y, then H(x) < H(y) and G(x) < G(y) so that −(y − x)/n < F(y) − F(x) < (y − x)/n

would be true for all n = 1, 2, 3, . . . . This is only possible if F(y) = F(x).

Exercise 158, page 32 We wish to prove that, if F : I → R is defined at each point of an open interval I and F ′ (x) = 0 for every x ∈ I, then F is a constant function on I. On every closed subinterval [a, b] ⊂ I the theorem can be applied. Thus f is a constant on the whole interval I. If

5.1. ANSWERS TO PROBLEMS

191

not then we could find at least two different points x1 and x2 with f (x1 ) 6= f (x2 ). But then we already know that f is constant on the interval [x1 , x2 ] (or, rather on the interval [x2 , x1 ] if x2 < x1 ).

Exercise 159, page 32 Take any points c < d from the interval [a, b] in such a way that (c, d) contains no one of these exceptional points. Consider the closed, bounded interval [c, d] ⊂ [a, b]. An application of the mean-value theorem to this smaller interval shows that F(d) − F(c) = F ′ (ξ)(d − c) = 0

for some point c < ξ < d. Thus F(c) = F(d). Now take any two points a ≤ x1 < x2 ≤ b and find all the exceptional points between them: say x1 < c1 < c2 < · · · < cn < x2 . On each interval [x1 , c1 ], [c1 , c2 ], . . . , [cn , x2 ] we have (by what we just proved) that F(x1 ) = F(c1 ], F(c1 ) = F(c2 ), . . . , F(cn ) = F(x2 ). Thus F(x1 ) = F(x2 ). This is true for any pair of points from the interval [a, b] and so the function is constant.

Exercise 161, page 32 Take any points c and x inside the interval and consider the intervals [x, c] or [c, x]. Apply the theorem to determine that F must be constant on any such interval. Consequently F(x) = F(c) for all a < x < b.

Exercise 162, page 32 This looks obvious but be a little bit careful with the exceptional set of points x where F ′ (x) 6= G′ (x). If F ′ (x) = f (x) for every a < x < b except for points in the finite set C1 and G′ (x) = f (x) for every a < x < b except for points in the finite set C2 , then the function H(x) = F(x)− G(x) is uniformly continuous on [a.b] and H ′ (x) = 0 for every a < x < b with the possible exception of points in the finite set C1 ∪C2 . By the theorem H is a constant.

Exercise 163, page 32 According to the theorem such a function would have to be discontinuous. Any step function will do here.

Exercise 164, page 32 Let ε > 0. At every point x0 in the interval (a, b) at which F ′ (x0 ) = 0 we can choose a δ(x0 ) > 0 so that |F(y) − F(z)| ≤ ε|y − z|

CHAPTER 5. ANSWERS

192

for x0 − δ(x0 ) < z ≤ x0 ≤ y < x0 + δ(x0 ). At the remaining points a, b, c1 , c2 , c3 , . . . we choose δ(·) so that: ε ω(F, [a, a + δ(a)]) < , 2 ε ω(F, [b − δ(b), b]) < , 4 and ε ω(F, [c j − δ(c j ), c j − δ(c j )]) < j+2 2 for j = 1, 2, 3, . . . . This merely uses the continuity of f at each of these points. Take any subinterval [c, d] ⊂ [a, b]. By the Cousin covering argument there is a partition {([xi−1 , xi ], ξi ) : i = 1, 2, 3, . . . , n} of the whole interval [c, d] such that

ξi ∈ [xi−1 , xi ] and xi − xi−1 < δ(ξi ).

For this partition

n

∞

ε = ε(d − c + 1). j 2 j=1

|F(d) − F(c)| ≤ ∑ |F(xi ) − F(xi−1 )| ≤ ε(d − c) + ∑ i=1

This is possible only if |F(d)− F(c)| = 0. Since this applies to any such interval [c, d] ⊂ [a, b] the function must be constant.

Exercise 165, page 32 According to Theorem 1.33 this will be proved if it is possible to write the rational numbers (where F ′ (x) is not known) as a sequence. This is well-known. To try it on your own. Start off 1/1, −1/1, 1/2, −1/2, 2/1, −2/1, 3/1, −3/1, 1/3, −1/3, . . .

and describe a listing process that will ultimately include all rational numbers m/n.

Exercise 166, page 33 Apply Exercise 165 to the function F(x) = G(x) − x2 /2. Since F is constant, G(x) = x2 /2 +C for some constant C.

Exercise 167, page 33 This looks like an immediate consequence of Theorem 1.33, but we need to be slightly careful about the exceptional sequence of points. If F ′ (x) = f (x) for every a < x < b except for points in the a sequence {c1 , c2 , c3 , . . . } and G′ (x) = f (x) for every a < x < b except for points in the sequence {d1 , d2 , d3 , . . . }, then the function H(x) = F(x) − G(x) is uniformly continuous on [a.b] and H ′ (x) = 0 for every a < x < b with the possible exception of points in the combined sequence {c1 , d1 , c2 , d2 , c3 , d3 , . . . }. By Theorem 1.33, the function H is a constant.

5.1. ANSWERS TO PROBLEMS

193

Exercise 169, page 33 First show directly from the definition that the Lipschitz condition will imply a bounded derivative. Then use the mean-value theorem to get the converse, that is, apply the mean-value theorem to f on the interval [x, y] for any a ≤ x < y ≤ b.

Exercise 170, page 33 The derivative of f (x) = not bounded on (0, 1).

√ x = x1/2 is the function f ′ (x) = x1/2 /2 which exists but is

Exercise 171, page 33 One direction is easy. If F is Lipschitz then, for some number M, | f (x) − f (y)| ≤ M|x − y|

for all x, y in the interval. In particular F(x + h) − F(x) M|(x + h) − x| ≤ = M. h h The other direction will take a more sophisticated argument. At each point x0 choose a δ(x0 ) > 0 so that F(x0 + h) − F(x) ≤M h whenever x0 + h ∈ I and |h| < δ(x0 ). Note that, then, F(y) − F(z) ≤M y−z

for x0 − δ(x0 ) < z ≤ x0 ≤ y < x0 + δ(x0 ). Take any subinterval [c, d] ⊂ [a, b]. By the Cousin partitioning argument there is a partition {([xi−1 , xi ], ξi ) : i = 1, 2, 3, . . . , n}

of the whole interval [c, d] such that For this partition

ξi ∈ [xi−1 , xi ] and xi − xi−1 < δ(ξi ). n

n

i=1

i=1

|F(d) − F(c)| ≤ ∑ |F(xi ) − F(xi−1 )| ≤ ∑ M|xi − xi−1 | = M(d − c). Thus F is Lipschitz.

Exercise 172, page 33 In preparation . . .

Exercise 173, page 34 In preparation . . .

CHAPTER 5. ANSWERS

194

Exercise 174, page 34 Yes on any interval (a, ∞) if a > 0 but not on (0, ∞).

Exercise 175, page 34 Yes. This is a simple example of a nondifferentiable Lipschitz function, but note that there is only one point of nondifferentiability.

Exercise 178, page 34 From the inequality F(y) − F(x) ≤ M|y − x| y−x deduce that F ′ (x) = 0 everywhere.

Exercise 180, page 34 Find an example illustrating that the first condition can hold without the second condition holding for any value of K < 1.

Exercise 181, page 34 Yes if all the functions F1 , F2 , F3 , . . . have the same Lipschitz constant. But, in general, not otherwise. This is just a simple consequence of the theory of sequence limits and how they behave with inequalities. If we suppose that x < y and that −M(y − x) ≤ Fn (y) − Fn (x) ≤ M(y − x)

for all n = 1, 2, 3, . . . , then

−M(y − x) ≤ lim [Fn (y) − Fn (x)] ≤ M(y − x) n→∞

must be true.

Exercise 182, page 36 By the definition, it is indeed an indefinite integral for F ′ except that we require all indefinite integrals to be continuous. But then we recall that a function is continuous at all points where the derivative exists. So, finally, yes.

Exercise 183, page 36 No. There may be finitely many points where f (x) is not defined, and even if f (x) has been assigned a value there may still be finitely many points where F ′ (x) = f (x) fails.

5.1. ANSWERS TO PROBLEMS

195

Exercise 184, page 36 Exercise 162 is almost identical except that it is stated for two uniformly continuous functions F and G on closed, bounded intervals [a, b]. Here (a, b) is open and need not be bounded. But you can apply Exercise 162 to any closed, bounded subinterval of (a, b).

Exercise 185, page 38 The two functions F(x) = x and G(x) = 1/x are continuous on (0, 1) but G is not uniformly continuous. [It is unbounded and any uniformly continuous function on (0, 1) would have to be bounded.] Thus the two functions f (x) = F ′ (x) = 1 and g(x) = G′ (x) = −1/x2 both possess indefinite integrals on the interval (0, 1) so that, of the two indefinite integrals F is uniformly continuous and the other G is not.

Exercise 186, page 38 The mean-value theorem supplies this on any subinterval [c, d] on which F is differentiable; the proof thus requires handling the finite exceptional set. Let M be larger than the values of | f (x)| at points where F ′ (x) = f (x). Fix x < y in the interval and split the interval at all the points where the derivative F ′ might not exist: a < x < c1 < c2 < · · · < cn < y < b.

The mean-value theorem supplies that

|F(t) − F(s)| ≤ M|s − t|

on any interval [s,t] for which (s,t) misses all the points of the subdivision. But adding these together we find that |F(t) − F(s)| ≤ M|s − t|

on any interval [s,t] ⊂ [x, y]. But x and y are completely arbitrary so that |F(t) − F(s)| ≤ M|s − t|

on any interval [s,t] ⊂ (a, b). We already know that if a function is Lipschitz on (a, b) then it is uniformly continuous on (a, b).

Exercise 187, page 38 It is true that the derivative of x3 /3 + 1 is indeed x2 at every point x. So, provided you also specify the interval in question [here (−∞, ∞) will do] then the function F(x) = x3 /3 + 1 is one possible indefinite integral of f (x) = x2 . But there are others and the R symbol x2 dx is intended to represent all of them.

Exercise 188, page 39 As we know (x + 1)2 = x2 + 2x + 1.

CHAPTER 5. ANSWERS

196

The two functions (x + 1)2 and x2 + 2x + 1 differ by a constant (in this case the constant 1). In situations like this it is far better to write

and

Z

(2x + 1) dx = (x + 1)2 +C1

Z

(2x + 1) dx = (x2 + 2x) +C2

where C1 and C2 represent arbitrary constants. Then one won’t be using the same letter to represent two different objects.

Exercise 189, page 39 Show that F is an indefinite integral of f (x) = x2 on (0, 1) in this stupid sense if and only if there are three numbers, C1 , C2 , and C3 with 0 < C1 < 1 such that F(x) = x3 /3 +C2 on (0,C1 ) and F(x) = x3 /3 +C3 on (C1 , 1).

Exercise 190, page 39 By a direct computation

Z

1 dx = log |x| +C x on any open interval I for which 0 6∈ I. The function F(x) = log |x| is a continuous indefinite integral on such an interval I. It cannot be extended to a continuous function on [0, 1] [say] because it is not uniformly continuous. (This is easy to see because uniformly continuous functions are bounded. The fact that f (0) = 1/0 is undefined is entirely irrelevant. In order for a function to have an indefinite integral [in the calculus sense of this chapter] it is permitted to p have finitely many points where it is undefined. See the next exercise where f (x) = 1/ |x| which is also undefined at x = 0 but does have an indefinite integral.

Exercise 191, page 39 √ By a direct computation the function F(x) = 2 x for x > 0 has a derivative 1 1 F ′ (x) = √ = p . x |x|

Thus on the interval (0, ∞) it is true that Z p 1 p dx = 2 |x| +C. |x| p On the other hand G(x) = −2 |x| for x < 0 has a derivative 1 1 =p . G′ (x) = √ −x |x| Thus on the interval (−∞, 0) it is true that Z p 1 p dx = −2 |x| +C. |x|

5.1. ANSWERS TO PROBLEMS

197

While this might look mysterious, the mystery disappears once the correct interval is specified. The two integrals are not in conflict since one must be stated on the interval (0, ∞) and the other on the interval (−∞, 0).

Exercise 192, page 39 √ √ By a direct computation the function F(x) = 2 x for x ≥ 0 and F(x) = −2 −x for x < 0 has F ′ (x) = f (x) at every point with the single exception of x = 0. Check that this function is continuous everywhere. This is immediate at points where F is differentiable, so it is only at x = 0 that one needs to check continuity. Once again, the fact that f (0) is undefined plays no role in the discussion since this function is defined everywhere else.

Exercise 193, page 39 None of them are correct because no interval is specified. The correct versions would be Z 1 dx = log x +C on (0, ∞) x or Z 1 dx = log(−x) +C on (−∞, 0) x or Z 1 dx = log |x| +C on (0, ∞) or on (0, ∞). x You may use subintervals, but we know by now that there are no larger intervals possible.

Exercise 194, page 40 The maximum value of f in each of the intervals [0, 41 ], [ 14 , 21 ], [ 12 , 43 ], and [ 34 , 1] is 1/8, 1/4, 9/16, and 1 respectively. Thus define F to be x/8 in the first interval, 1/32 + 1/4(x − 1/1/4) in the second interval, 1/32 + 1/16 + 9/16(x − 1/2) in the third interval, and to be 1/32 + 1/16 + 9/64 + (x − 3/4) in the final interval. This should be (if the arithmetic was correct) a continuous, piecewise linear function whose slope in each segment exceeds the value of the function f .

Exercise 195, page 41 Start at 0 and first of all work to the right. On the interval (0, 1) the function f has the constant value 1. So define F(x) = x on [0, 1]. Then on the the interval (1, 2) the function f has the constant value 2. So define F(x) = 1 + 2(x − 1) on [1, 2]. Continue until you see how to describe F in general. This is the same construction we used for upper functions.

CHAPTER 5. ANSWERS

198

Exercise 196, page 42 Let F0 denote the function on [0, 1] that has F0 (0) = 0 and has constant slope equal to c01 = sup{ f (t) : 0 < t < 1}. Subdivide [0, 1] into [0, 12 ] and [ 21 , 1] and let F1 denote the continuous, piecewise linear function on [0, 1] that has F0 (0) = 0 and has constant slope equal to c11 = sup{ f (t) : 0 < t ≤ 12 }

on [0, 21 ] and constant slope equal to

c12 = sup{ f (t) :

1 2

≤ t < 1}

on [0, 12 ]. This construction is continued. For example, at the next stage, Subdivide [0, 1] further into [0, 14 ], [ 41 , 21 ], [ 12 , 43 ], and [ 34 , 1]. Let F2 denote the continuous, piecewise linear function on [0, 1] that has F0 (0) = 0 and has constant slope equal to c11 = sup{ f (t) : 0 < t ≤ 14 }

on [0, 41 ], constant slope equal to

c12 = sup{ f (t) :

1 4

≤ t ≤ 21 }

1 2

≤ t ≤ 43 }

3 4

≤ t < 1}

on [ 14 , 21 ], constant slope equal to c13 = sup{ f (t) : on [ 12 , 43 ], and constant slope equal to c14 = sup{ f (t) : on

[ 34 , 1].

In this way we construct a sequence of such functions {Fn }. Note that each Fn is continuous and nondecreasing. Moreover a look at the geometry reveals that Fn (x) ≥ Fn+1 (x)

for all 0 ≤ x ≤ 1 and all n = 0, 1, 2, . . . . In particular {Fn (x)} is a nonincreasing sequence of nonnegative numbers and consequently F(x) = lim Fn (x) n→∞ ′ F (x) =

exists for all 0 ≤ x ≤ 1. We prove that f (x) at all points x in (0, 1) at which the function f is continuous. Fix a point x in (0, 1) at which f is assumed to be continuous and let ε > 0. Choose δ > 0 so that the oscillation1 ω f ([x − 2δ, x + 2δ])

of f on the interval [x − 2δ, x + 2δ] does not exceed ε. Let h be fixed so that 0 < h < δ. Choose an integer N sufficiently large that |FN (x) − F(x)| < εh and |FN (x + h) − F(x + h)| < εh.

From the geometry of our construction notice that the inequality

|FN (x + h) − FN (x) − f (x)h| ≤ hω f ([x − 2h, x + 2h]),

must hold for large enough N. (Simply observe that the graph of FN will be composed 1 See

Exercise 62.

5.1. ANSWERS TO PROBLEMS

199

of line segments, each of whose slopes differ from f (x) by no more than the number ω f ([x − 2h, x + 2h]).) Putting these inequalities together we find that |F(x + h) − F(x) − f (x)h| ≤ |FN (x + h) − Fn (x) − f (x)h| + |FN (x) − F(x)| + |FN (x + h) − F(x + h)| < 3εh.

This shows that the right-hand derivative of F at x must be exactly f (x). A similar argument will handle the left-hand derivative and we have verified the statement in the theorem about the derivative. The reader should now check that the function F defined here is Lipschitz on [0, 1]. Let M be an upper bound for the function f . Check, first, that 0 ≤ Fn (y) − Fn (x) ≤ M(y − x)

for all x < y in [0, 1]. Deduce that F is in fact Lipschitz on [0, 1].

Exercise 197, page 42 If H(t) = G(a + t(b − a)) then, by the chain rule,

H ′ (t) = G′ (a + t(b − a)) × (b − a) = f (a + t(b − a)) × (b − a).

Substitute x = a + t(b − a) for each 0 ≤ t ≤ 1.

Exercise 198, page 42 If G′ (t) = g(t) then

d dt (G(t) + Kt) =

g(t) + K = f (t).

Exercise 199, page 42 The assumption that f is continuous on an interval (a, b) means that f must be uniformly continuous on any closed subinterval [c, d] ⊂ (a, b). Such functions are bounded. Applying the theorem gives a continuous function with F ′ (x) = f (x) everywhere on that interval. This will construct our indefinite integral on (a, b). Note that F will be Lipschitz on every subinterval [c, d] ⊂ (a, b) but need not be Lipschitz on (a, b), because we have not assumed that f is bounded on (a, b).

Exercise 200, page 43 You need merely to show that H is continuous on the interval and that H ′ (x) = r f (x) + sg(x) at all but finitely many points in the interval. But both F and G are continuous on that interval and so we need to recall that the sum of continuous functions is again continuous. Finally we know that F ′ (x) = f (x) for all x in I except for a finite set C1 , and we know that G′ (x) = g(x) for all x in I except for a finite set C2 . It follows, by properties of derivatives, that H ′ (x) = rF ′ (x) + sG′ (x) = r f (x) + sg(x) at all points x in I that are not in the finite set C1 ∪C2 .

CHAPTER 5. ANSWERS

200

Exercise 201, page 43 If F and G both have a derivative at a point x then we know, from the product rule for derivatives, that d [F(x)G(x)] = F ′ (x)G(x) + F(x)G′ (x) dx Thus, let us suppose that F ′ (x)G(x) has an indefinite integral H(x) on some interval I. Then H is continuous on I and H ′ (x) = F ′ (x)G(x) at all but finitely many points of I. Notice then that K(x) = F(x)G(x) − H(x) satisfies (at points where derivatives exist) Thus Z

K ′ (x) = F ′ (x)G(x) + F(x)G′ (x) − F ′ (x)G(x) = F(x)G′ (x). ′

F(x)G (x) dx = K(x) +C = F(x)G(x) − H(x) +C = F(x)G(x) −

Z

F ′ (x)G(x) dx.

Exercise 202, page 43 One memorizes (as a calculus student) the formula Z

u dv = uv −

Z

v du. R

and makes appropriate substitutions. For example to determine x cos x dx use u = x, v = cos x dx, determine du = dx and determine v = sin x [or v = sin x + 1 for example]. ThenZ substitute in the memorized Zformula to obtain Z Z x cos x dx =

u dv = uv −

v du = x sin x −

sin x dx = x sin x + cos x +C.

or [if you had used v = sin x + 1 instead] Z

x cos x dx = x(sin x + 1) −

Z

(sin x + 1) dx = x sin x + cos x +C.

Exercise 203, page 44 If you need more [you are a masochist] then you can find them on the web where we found these. The only reason to spend much further time is if you are shortly to face a calculus exam where some such computation will be required. If you are an advanced student it is enough to remember that “integration by parts” is merely the product rule for derivatives applied to indefinite or definite integration. There is one thing to keep in mind as a calculus student preparing for questions that exploit integration by parts. An integral can be often split up into many different ways using the substitutions of integration by parts: u = f (x), dv = g′ (x)dx, v = g(x) and du = f ′ (x)dx. You can do any such problem by trial-and-error and just abandon any unpromising direction. If you care to think in advance about how best to choose the substitution, choose u = f (x) only for functions f (x) that you would care to differentiate and choose dv = g′ (x)dx only for functions you would care to integrate.

5.1. ANSWERS TO PROBLEMS

201

Exercise 204, page 44 In most calculus courses the rule would be applied only in situations where both functions F and G are everywhere differentiable. For our calculus integral we have been encouraged to permit finitely many exceptional points and to insist then that our indefinite integrals are continuous. That does not work here: let F(x) = |x| and G(x) = x2 sin x−1 , G(0) = 0. Then G is differentiable everywhere and F is continuous with only one point of nondifferentiability. But F(G(x) = |x2 sin x−1 | is not differentiable at any point x = ±1/π, ±1/2π, ±1/3π, . . . . Thus F(G(x) is not an indefinite integral in the calculus sense for F ′ (G(x)) on any open interval that contains zero and indeed F ′ (G(x)) would have infinitely many points where it is undefined. This function is integrable on any open interval that avoids zero. This should be considered a limitation of the calculus integral. This is basically an 18th century integral that we are using for teaching purposes. If we allow infinite exceptional sets [as we do in the later integration chapters] then the change of variable rule will hold in great generality.

Exercise 205, page 44 To be precise we should specify an open interval; (−∞, ∞) will do. To verify the answer itself, just compute d 1 2 sin(x + 1) = cos(x2 + 1). dx 2

To verify the steps of the procedure just notice that the substitution u = x2 + 1, du = 2x dx is legitimate on this interval.

Exercise 206, page 45 It would be expected that you have had sufficient experience solving similar problems to realize that integration by parts or other methods will fail but that a change of variable will succeed. The only choices likely in such a simple integral would be u = x2 or 2 perhaps v = ex . The former leads to Z Z 2 1 2 1 xex dx = eu du = eu +C = ex +C [u = x2 ] 2 2 2 since if u = x then du = 2x dx; the latter leads to Z Z p 2 x2 xe dx = 2 v log v dv = ? [v = ex ] √ √ 2 2 since if v = ex then x2 = log v, x = log v and dv = 2xex dx = 2v log v. Note that a wrong choice of substitution may lead to an entirely correct result which does not accord with the instructors “expectation.” Usually calculus instructors will select examples that are sufficiently transparent that the correct choice of substitution is immediate. Better yet, they might provide the substitution that they require and ask you to carry it out. Finally all these computations are valid everywhere so we should state our final result on the interval (−∞, ∞). Most calculus instructors, however, would not mark you

CHAPTER 5. ANSWERS

202 incorrect if you failed to notice this.

Exercise 207, page 45 Assuming that r 6= 0 (in which case we are integrating a constant function), use the substitution u = rx + s, du = r dx to obtain Z Z 1 f (u) du = F(u) +C = F(rs + x) +C. f (rx + s) dx = r This is a linear change of variables and is the most common change of variable in numerous situations. R This can be justified in more detail this way. Suppose that f (t) dx = F(t) +C on an open interval I, meaning that F is continuous and F ′ (t) = f (t) on I with possibly finitely many exceptions. Then find an open interval J so that rx + s ∈ I for all x ∈ J. d F(rx + s) = f (rx + s) again It follows that F(rx + s) is continuous on J and that dx with possibly finitely many exceptions. On J then F(rx + s) is an indefinite integral for f (rx + s).

Exercise 208, page 45 This is an exercise in derivatives. Suppose that f : (a, b) → R has an indefinite integral F on the interval (a, b). Let ξ be a point of continuity of f . We can suppose that ξ in contained in a subinterval (c, d) ⊂ (a, b) inside which F ′ (x) = f (x) for all points, except possibly at the point ξ in question. Let ε > 0. Then, since f is continuous at ξ, there is an interval [ξ − δ(ξ), ξ + δ(ξ)] so that f (ξ) − ε < f (x) < f (ξ) + ε

on that interval. For any u < ξ < v in this smaller interval and

( f (ξ) − ε)(ξ − u) ≤ F(ξ) − F(u) ( f (ξ) + ε)(v − ξ) ≥ F(v) − F(ξ).

This is because the function F is continuous on [u, ξ] and [ξ, v] and has a derivative larger than ( f (ξ) − ε) on (u, ξ) and a derivative smaller than ( f (ξ) + ε) on (ξ, v). Together these inequalities prove that, for any u ≤ ξ ≤ v, u 6= v on the interval (ξ−δ(ξ), ξ+ δ(ξ)) the inequality F(v) − F(u) − f (ξ) ≤ ε v−u must be valid. But this says that F ′ (ξ) = f (ξ).

Exercise 209, page 47 In fact this is particularly sloppy. The function log(x − 8) is defined only for x > 8 while log(x + 5) is defined only for x > −5. Thus an appropriate interval for the expression given here would be (8, ∞). But it is also true that Z x+3 dx = [11/13] log(|x − 8|) + [2/13] log(|x + 5|) +C 2 x − 3x − 40

5.1. ANSWERS TO PROBLEMS

203

on any open interval that does not contain either of the points x = 8 or x = −5. For example, on the interval (−5, 8), the following is valid: Z x+3 dx = [11/13] log(8 − x) + [2/13] log(x + 5) +C. 2 x − 3x − 40 You might even prefer to write Z x+3 1 11 2 dx = log (8 − x) (x + 5) +C. x2 − 3x − 40 13

Exercise 210, page 51 Find the necessary statements from Chapter 1 from which this can be concluded.

Exercise 211, page 52 Yes. Just check the two cases.

Exercise 212, page 53 We know this for a < b < c. Make sure to state the assumptions and formulate the thing you want to prove correctly. For example, if b < a = c does it work?

Exercise 215, page 53 R

For a function x(t) = t 2 compute the integral 01 x( f ) d f . That is perfectly legitimate but will make most mathematicians nauseous. R How about using d as a dummy variable: compute 01 d 2 dd? Or use the Greek letter R π as a dummy variable: what is 01 sin π dπ? Most calculus students are mildly amused by this computation: Z 2 d[cabin] cabin=2 = {log cabin}cabin=1 = log 2. 1 [cabin]

Exercise 216, page 54 That is correct but he is being a jerk. More informative is Z

e2x dx = e2x /2 +C,

which is valid on any interval. More serious, though, is that the student didn’t find an indefinite integral so would be obliged to give some argument about the function f (x) = e2x to convince us that it is indeed integrable. An appeal to continuity would be enough.

Exercise 217, page 54 That is correct but she is not being a jerk. There is no simple formula for any indefinite 2 integral of ex other than defining it as an integral as she did here (or perhaps an infinite

CHAPTER 5. ANSWERS

204

series). Once again the student would be obliged to give some argument about the 2 function f (x) = ex to convince us that it is indeed integrable. An appeal to continuity would be enough.

Exercise 218, page 54 One can use any indefinite integral in the computation, so both of those methods are correct.

Exercise 219, page 54 Probably, but the student using the notation should remember that the computation at the ∞ end here is really a limit: 2 lim − √ = 0. x→∞ x

Exercise 220, page 55 Step functions are bounded in every interval [a, b] and have only a finite number of steps, so only a finite number of discontinuities.

Exercise 221, page 55 Differentiable functions are continuous at every point and consequently uniformly continuous on any closed, bounded interval.

Exercise 222, page 55 If f : (a, b) → R is bounded we already know that f is integrable and that the statements here must all be valid. If f is an unbounded function that is continuous at all points of (a, b) then there is a continuous function F on (a, b) for which F ′ (x) = f (x) for all points there. In particular, F is uniformly continuous on any interval [c, d] ⊂ (a, b) and so serves as an indefinite integral proving that f is integrable on these subintervals. In order to claim that f is actually integrable on [a, b] we need to be assured that F can be extended to a uniformly continuous function on all of [a, b]. But that is precisely what the conditions lim

Z c

t→a+ t

f (x) dx and lim

Z t

t→b− c

f (x) dx

allow, since they verify that the limits lim F(t) and lim F(t)

t→a+

t→b−

must both exist. We know that this is both necessary and sufficient in order that F should be extendable to a uniformly continuous function on all of [a, b].

5.1. ANSWERS TO PROBLEMS

205

Exercise 223, page 56 We suppose that f has an indefinite integral F on (a, b). We know that f is integrable on any subinterval [c, d] ⊂ (a, b) but we cannot claim that f is integrable on all of [a, b] until we check uniform continuity of F. We assume that g is integrable on [a, b] and construct a proof that f is also integrable on [a, b]. Let G be an indefinite integral for g on the open interval (a, b). We know that G is uniformly continuous because g is integrable. We check, for a < s < t < b that Z t Zt f (x) dx ≤ |g(x)| dx = G(t) − G(s). s s from which we get that

|F(t) − F(s)| ≤ G(t) − G(s) for all a < s < t < c.

It follows from an easy ε, δ argument that the uniform continuity of F follows from the uniform continuity of G. Consequently f is integrable on [a, b]. R For the infinite integral, a∞ f (x) dx the same argument give us the uniform continuity but does not offer the existence of F(∞). For that we can use Exercise 64. Since R G(∞) must exist in order for the integral a∞ g(x) dx to exist, that Exercise 64 shows us, that for all ε > 0 there should exist a positive number T so that ωG((T, ∞)) < ε. But we already know that ωF((T, ∞)) ≤ ωG((T, ∞)) < ε.

A further application of that same exercise shows us that F(∞) does exist.

Exercise 224, page 56 If f is continuous at all points of (a, b) with the exception of points a < c1 < c2 < · · · < cm < b then we can argue on each interval [a, c1 ], [c1 , c2 ], . . . , [cm , b]. If f is integrable on each of these subintervals of [a, b] then, by the additive property, f must be integrable on [a, b] itself. We know that f has an indefinite integral on (a, c1 ) because f is continuous at each point of that interval. By Theorem 3.7 it follows that f is integrable on [a, c1 ]. The same argument supplies that f is integrable on on each interval [c1 , c2 ], . . . , [cm , b].

Exercise 225, page 56 The method used in the preceding exercise will work. If f is continuous at all points of (a, ∞) with the exception of points a < c1 < c2 < · · · < cm then we can argue on each interval [a, c1 ], [c1 , c2 ], . . . , [cm , ∞). If f is integrable on each of these subintervals of [a, ∞) then, by the additive property, f must be integrable on [a, ∞) itself. We know that f has an indefinite integral on (a, c1 ) because f is continuous at each point of that interval. By Theorem 3.7 it follows that f is integrable on [a, c1 ]. The same argument supplies that f is integrable on on each interval [c1 , c2 ], . . . , [cm−1 , cm ]. For the final interval [cm , ∞) note that f is continuous at every point and so has an indefinite integral on (cm , ∞). Now invoke Theorem 3.9 to conclude integrability on [cm , ∞).

CHAPTER 5. ANSWERS

206

Exercise 226, page 56 Note, first, that all of the integrands are continuous on the interval (0, π/2). Using the simple inequality x/2 < sin x < x (0 < x < π/2) we can check that, on the interval (0, π/2), r sin x 1 √ ≤ ≤ 1, x 2 so that the first integral exists because the integrand is continuous and bounded. For the next two integrals we observe that r 1 1 sin x √ ≤ ≤√ , 2 x x 2x and r sin x 1 1 √ ≤ ≤ . x3 x 2x Thus, by the comparison test, one integral exists and the other does not. It is only the integral Z π/2 r sin x dx x3 0 that fails to exist by comparison with the integral 1 √ 2

Z π/2 1 0

x

dx.

Exercise 227, page 56 The comparison test will handle only the third of these integrals proving that it is integrable. We know that the integrands are continuous on (0, ∞) and so there is an indefinite integral in all cases. The inequality | sin x| ≤ 1 shows that sin x 1 x2 ≤ x2 and we know that the integral Z ∞ 1 dx 2 1 x converges. That proves, by the comparison test that Z ∞ sin x dx x2 1 converges. To handle the other two cases we would have to compute limits at ∞ to determine convergence. The comparison test does not help.

Exercise 228, page 56 If a nonnegative function f : (a, b) → R is has a bounded indefinite integral F on (a, b), then that function F is evidently nondecreasing. We can claim that f is integrable if

5.1. ANSWERS TO PROBLEMS

207

and only if we can claim that the limits F(a+) and F(b−) exist. For a nondecreasing function F this is equivalent merely to the observation that F is bounded.

Exercise 229, page 57 In view of the previous exercise we should search for a counterexample that is not nonnegative. Find a bounded function F : (0, 1) that is differentiable but is not uniformly continuous. Try F(x) = sin x−1 and take f = F ′ .

Exercise 230, page 57 The focus of your discussion would have to be on points where the denominator q(x) has a zero. If [a, b] contains no points at which q is zero then the integrand is continuous everywhere (even differentiable) at all points of [a, b] so the function is integrable there. You will need this distinction. A point x0 is a zero of q(x) if q(x0 ) = 0. A point x0 is a zero of q(x) of order k (k = 1, 2, 3, . . . ) if q(x) = (x − x0 )k h(x)

for some polynomial h(x) that does not have a zero at x0 . Work on an interval [x0 , c] that contains only the one zero x0 . For example, if x0 is a zero of the first order for q, p(x0 ) 6= 0 and the interval contains no other zeros for p and q then there are positive numbers m and M for which M p(x) m ≤ ≤ x − x0 q(x) x − x0 on the interval [x0 , c]. The comparison test supplies the nonintegrability of the function on this interval. Do the same at higher order zeros (where you will find the opposite conclusion).

Exercise 231, page 57 Compare with Exercise 230. First consider only the case where [a, ∞] contains no zeros of either p(x) or q(x). Then the integral Z X p(x) dx a q(x) exists and it is only the limiting behavior as X → ∞ that needs to be investigated. The key idea is that if m p(x) ≤ x q(x) for some m > 0 and all a ≤ x ≤ ∞ then the integral must diverge. Similarly if M p(x) ≥ x2 q(x)

for some M > 0 and all a ≤ x ≤ ∞ then the integral must converge. For a further hint, if you need one, consider the following argument used on the integral Z ∞ 1+x dx. 1 + x + x2 + x3 1

CHAPTER 5. ANSWERS

208 Since

2 1 + x =1 lim x × 2 3 x→∞ 1+x+x +x

it follows that, for all sufficiently large values of x, 2 1+x x × a. Then H is continuous and must therefore be an indefinite integral for f on (−∞, ∞). We need to know that the limiting values H(−∞) and H(∞) both exist. But H(−∞) = F(−∞) and H(∞) = G(∞). Thus the integral Z ∞

−∞

must exist.

f (x) dx

Exercise 252, page 61 We suppose that the two functions f , g are both integrable on a closed, bounded interval [a, b] and that f (x) ≤ g(x) for all x ∈ [a, b]. We can If F is an indefinite integral for f on [a, b] and G is an indefinite integral for g on [a, b] then set H(x) = G(x) − F(x) and notice that d d H(x) = [G(x) − F(x)] ≥ g(x) − f (x) ≥ 0 dx dx except possibly at the finitely many points where the derivative does not have to agree with the function. But we know that continuous functions with nonnegative derivatives are nondecreasing; the finite number of exceptions does not matter for this statement. Thus H(b) − H(a) ≥ 0 and so [G(b) − G(a)] − [F(b) − F(a)] ≥ 0. Consequently Z b a

f (x) dx = [F(b) − F(a)] ≤ [G(b) − G(a)] =

Z b

g(x) dx.

a

The details are similar for infinite integrals.

Exercise 253, page 61 We know that

Z

x2 dx = x3 /3 +C

on any interval. So that, in fact, using [for example] the function F(x) = x3 /3 + 1 as an indefinite integral, Z 2

−1

x2 dx = F(2) − F(−1) = [23 /3 + 1] − [(−1)3 /3 + 1] = 3.

Exercise 254, page 61 We know that on (0, ∞) and on (−∞, 0).

Z

dx = log |x| +C x

5.1. ANSWERS TO PROBLEMS

213

In particular we do have a continuous indefinite integral on both of the open intervals (−1, 0) and (0, 1). But this indefinite integral is not uniformly continuous. The easiest clue to this is that the function log |x| is unbounded on both intervals (−1, 0) and (0, 1). As for integrability on, say the interval [−1, 1]. This is even clearer: there is no antiderivative at all, so the function cannot be integrable by definition. As to the fact that f (0) is undefined: an integrable function may be undefined at any finite number of points. So this was not an issue and did not need to be discussed. Finally is this function integrable on the unbounded interval (∞, −1] or on the unbounded interval [1, ∞)? No. Simply check that neither limit lim log x or

x→∞

lim log(−x)

x→−∞

exists.

Exercise 255, page 61 √ We know that the function F(x) = 2 x is uniformly continuous on [0, 2] and that 1 d √ 2 x= √ dx x for all x > 0. Thus this function is integrable on [0, 2] and Z 2 √ 1 √ dx = F(2) − F(0) = 2 2. x 0 The fact that f is undefined at an endpoint [or any one point for that matter] is no concern to us. Some calculus course instructors may object here, insisting that the ritual known as “improper integration” needs to be invoked. It does not! We have defined the integral in such a way that this procedure is simply part of the definition. For courses that start with the Riemann integral this procedure would not be allowed since unbounded functions are not Riemann integrable. The function f (x) = x−1/2 is unbounded on (0, 2) but this causes us no concern since the definition is only about antiderivatives. Finally, is this function integrable on [0, ∞)? No. The endpoint 0 is no problem but √ limx→∞ 2 x does not exist.

Exercise 256, page 62 The simplest method to handle this is to split the problem at 0. If a < 0 < b then Z 0 p Z b p Z 0 √ Z b Z b p √ 1/ |x| dx + 1/ |x| dx = 1/ −x dx + 1/ x dx 1/ |x| dx = a

a

0

a

0

√ if these two integrals exist. For an indefinite integral of f on (0, ∞) use F(x) = 2 x √ and for an indefinite integral of f on (−∞, 0) use F(x) = −2 −x. Thus Z 0 p √ 1/ |x| dx = F(0) − F(a) = 2 a a

and

Z b 0

p √ 1/ |x| dx = F(b) − F(0) = 2 b.

CHAPTER 5. ANSWERS

214

Exercise 257, page 62 In defining an integral on [a, b] as Z b a

f (x) dx = F(b) − F(a)

we have allowed F ′ (x) = f (x) (a < x < b) to fail at a finite number of points, say at c1 < c2 < · · · < cn provided we know that F is continuous at each of these points. We could merely take the separate integrals Z c1

f (x) dx,

a

Z c1 c1

f (x) dx, . . . ,

Z b

f (x) dx

cn

and add them together whenever we need to. Thus the integral could be defined with no exceptional set and, for applications, . . . well add up the pieces that you need. The calculus integral is only a teaching integral. The modern theory requires a much more general integral and that integral can be obtained by allowing an infinite exceptional set. Thus the training that you are getting by handling the finite exceptional set is really preparing you for the infinite exceptional set. Besides we do get a much better integration theory with our definition, a theory that generalizes quite well to the modern theory. Another thing to keep in mind: when we pass to an infinite exceptional set we maybe unable to “split the interval in pieces.” Indeed, we will eventually allow all of the rational numbers as exceptional points where the derivative may not exist.

Exercise 258, page 62 The derivative of F exists at all points in (0, 1) except at these corners 1/n, n = 2, 3, 4, 5, . . . . If a > 0 then the interval [a, 1] contains only finitely many corners. But the interval (0, 1) contains infinitely many corners! Thus F ′ undefined at infinitely many points of [0, 1] and F(x) is not differentiable at these points. It is clear that F is continuous at all points inside, since it is piecewise linear. At the endpoint 0 we have F(0) and we have to check that |F(x) − F(0)| is small if x is close to zero. This is easy. So F is uniformly continuous on [0, 1] and the identity Z b a

F ′ (x) dx = F(b) − F(a)

is true for the calculus integral if a > 0. It fails for a = 1 only because there are too many points where the derivative fails. What should we do? 1. Accept that F ′ is not integrable and not worry about such functions? 2. Wait for a slightly more advanced course where an infinite set of exceptional points is allowed? 3. Immediately demand that the calculus integral accommodate a sequence of exceptional points, not merely a finite set?

5.1. ANSWERS TO PROBLEMS

215

We have the resources to do the third of these suggestions. We would have to prove this fact though: If F, G : [a, b] are uniformly continuous functions, if F ′ (x) = f (x) for all points in (a, b) except points in some sequence {cn } and if G′ (x) = f (x) for all points in (a, b) except points in some sequence {dn }, then F and G must differ by a constant. If we prove that then, immediately, the definition of the calculus integral can be extended to handle this troublesome example. This fact is not too hard to prove, but it is nonetheless much harder than the finite case. Remember the latter uses only the meanvalue theorem to find a proof. Accepting sequences of exceptional points will make our simple calculus course just a little bit tougher. So we stay with the finite case for this chapter and then introduce the infinite case in the next. After all, the calculus integral is just a warm-up integral and is not intended to be the final say in integration theory on the real line.

Exercise 259, page 62 (1). There is no function F ′ (x) = 1 for all x irrational and F ′ (x) = 0 if x is rational, on any interval [c, d]. To be an indefinite integral in the calculus sense on an interval [a, b] there must subintervals where F is differentiable. Why is there not? Well derivatives have the Darboux property. (2) There are two many points where f is not defined. Every interval contains infinitely many rationals. (3) The only possible indefinite integral is F(x) = x + k for some constant. But then F ′ (x) = f (x) has too many exceptions: at all the points xn = 1/n, 1 = F ′ (xn ) 6= f (xn ) = cn unless we had insisted that cn = 1 for all but finitely many of the {cn }.

Note: The first two are Lebesgue integrable, but not Riemann integrable. The third is Lebesgue integrable and might be Riemann integrable, depending on whether the sequence {cn } is bounded or not. Thus the calculus integral is quite distinct from these other theories.

Exercise 260, page 62 The function F(x) = x p+1 /(p + 1) is an antiderivative on (0, ∞) if p 6= 1. If p = 1 then F(x) = log x is an antiderivative on (0, ∞). Thus for your answer you will need to check F(0+), F(1), and F(∞) in all possible cases.

Exercise 261, page 62 Yes. Take an indefinite integral F for f and write Z ∞

−∞

f (x) dx = F(∞) − F(−∞) = F(∞) − F(b) + F(b) − F(a) + F(a) − F(−∞) =

Z a ∞

f (x) dx +

Z b a

f (x) dx +

Z ∞ b

f (x) dx.

CHAPTER 5. ANSWERS

216 For the second formula write Z ∞

0

f (x) dx = F(∞) − F(0) =

F(∞) − F(N) + [F(N) − F(N − 1)] + . . . [F(2) − F(1)] + [F(1) − F(0)] N

= F(∞) − F(N) + ∑

Z n

f (x) dx.

n=1 n−1

Then, since limN→∞ [F(∞) − F(N)] = 0, it follows that Z ∞

N

f (x) dx = lim

N→∞

0

∑

Z n

∑

f (x) dx =

n=1 n−1

∞

Z n

f (x) dx.

n=1 n−1

The third formula is similar.

Exercise 262, page 63 Let F(x) = 0 for x ≤ 0 and let F(x) = x for x > 0. Then F is continuous everywhere and is differentiable everywhere except at x = 0. Consider Z 1

−1

F ′ (x) dx = F(1) − F(−1) = 1

and try to find a point ξ where F ′ (ξ)(1 − (−1)) = 1.

Exercise 263, page 63 Let m and M be the minimum and maximum values of the function G. It follows that m

Z b a

ϕ(t) dt ≤

Z b a

G(t)ϕ(t) dt ≤ M

Z b

Rb

a

ϕ(t) dt

by monotonicity of the integral. Dividing through by a ϕ(t) dt (which we can assume is not zero), we have that Rb G(t)ϕ(t) dt m ≤ aR b ≤ M. ϕ(t) dt a Since G(t) is continuous, the Darboux property of continuous functions (i.e., the intermediate value theorem) implies that there exists ξ ∈ [a, b] such that G(x) =

Rb a

G(t)ϕ(t) dt

Rb a

which completes the proof.

ϕ(t) dt

Exercise 266, page 64 In Exercise 227 we avoided looking closely at this important integral but let us do so now. We need to consider the indefinite integral Z x sin t dt Si(x) = t 0 which is known as the sin integral function and plays a role in many investigations. Since the functions x−1 and sin x are both continuous on (0, ∞) there is an indefinite

5.1. ANSWERS TO PROBLEMS

217

integral on (0, ∞). There is no trouble at the left-hand endpoint because the integrand is bounded. Hence the function Si(x) is defined for all 0 ≤ x < ∞. Our job is simply to show that the limit Si(∞) exists. It is possible using more advanced methods to evaluate the integral and obtain Z ∞ sin x π dx = . Si(∞) = x 2 0 To obtain that the limit Si(∞) exists let us apply the mean-value theorem given as Exercise 265. On any interval [a, b] ⊂ (0, ∞) Z b a

x−1 sin x dx =

cos a − cos ξ cos ξ − cos b + a b

for some ξ. Consequently

Z b sin x 2 2 |Si(b) − Si(a)| = dx ≤ + . x a b a From this we deduce that the oscillation of Si on intervals [T, ∞) is small if T is large, i.e., that 4 ω Si([T, ∞]) ≤ → 0 T as T → ∞. It follows that Si(∞) must exist. This proves that the integral is convergent. Finally let us show that the function Z x sint F(x) = t dt 0

is unbounded. Then we can conclude that the integral diverges and that the Dirichelet integral is convergent but not absolutely convergent. To see this take any interval [2nπ, (2n + 1)π] on which sin x is nonnegative. Let us apply the mean-value theorem given as Exercise 264. This will show that Z (2n+1)π sint dt ≥ 1 . t 2nπ 2nπ It follows that for x greater than (2N + 1)π Z x N Z (2n+1)π N sint dt ≥ ∑ 1 . sint dt ≥ ∑ t t 0 n=1 2nπ n=1 2nπ Consequently F is unbounded.

Exercise 267, page 66 The choice of midpoint xi + xi−1 = ξi 2 for the Riemann sum gives a sum 1 1 n = ∑ (x2i − x2i−1 ) = b2 − x2n−1 + x2n−1 − x2n−2 + · · · − a2 = (b2 − a2 )/2. 2 i=1 2

To explain why this works you might take the indefinite integral F(x) = x2 /2 and check that F(d) − F(c) c + d = d −c 2

CHAPTER 5. ANSWERS

218

so that the mean-value always picks out the midpoint of the interval [c, d] for this very simple function.

Exercise 273, page 67 Just take, first, the points ξ∗i at which we have the exact identity Z xi

xi−1

f (x) dx − f (ξ∗i )(xi − xi−1 ) = 0

Then for any other point ξi , Z x i ∗ x f (x) dx − f (ξi )(xi − xi−1 ) = | f (ξi )− f (ξ )|(xi − xi−1 ) ≤ ω f ([xi , xi−1 ])(xi − xi−1 ). i−1

The final comparison with

n

∑ ω f ([xi , xi−1 ])(xi − xi−1)

i=1

follows from this. To get a good approximation of the integral by Riemann sums it seems that we might need n

∑ ω f ([xi , xi−1 ])(xi − xi−1)

i=1

to be small. Observe that the pieces in the sum here can be made small if (a) the function is continuous so that the oscillations are small, or (b) points where the function is not continuous occur in intervals [xi , xi−1 ] that are small. Loosely then we can make these sums small if the function is mostly continuous, i.e., where it is not continuous can be covered by some small intervals that don’t add up to much. The modern statement of this is “the function needs to be continuous almost everywhere.”

Exercise 274, page 69 This is the simplest case to prove since we do not have to fuss at the endpoints or at exceptional points where f is discontinuous. Let ε > 0 and choose δ > 0 so that ε ω f ([c, d]) < (b − a) whenever [c, d] is a subinterval of [a, b] for which d − c < δ. Note then that if {([xi , xi−1 ], ξi ) : i = 1, 2, . . . n}

is a partition of [a, b] with intervals shorter than δ then n

n

i=1

i=1

∑ ω f ([xi , xi−1 ])(xi − xi−1) < ∑ [ε/(b − a)](xi − xi−1) = ε.

Consequently, by Exercise 273, Z b n f (x) dx − ∑ f (ξi )(xi − xi−1 ) a i=1

5.1. ANSWERS TO PROBLEMS n Z xi ≤ ∑ f (x) dx − f (ξi )(xi − xi−1 ) < ε. i=1

219

xi−1

Exercise 275, page 69 You can still use the error estimate in Exercise 273, but will have to handle the endpoints differently than you did in Exercise 274.

Exercise 276, page 69 Add one point c of discontinuity of f in (a, b) and prove that case. [Do for c what you did for the endpoints a and b in Exercise 275.]

Exercise 277, page 69 Once we have selected {([xi , xi−1 ], ξi ) : i = 1, 2, . . . n},

a partition of [a, b] with intervals shorter than δ we would be free to move the points ξi anywhere within the interval. Thus write the inequality and hold everything fixed except, for one value of i, let xi−1 ≤ ξi ≤ xi vary. That can be used to obtain an upper bound for | f (ξ)| for xi−1 ≤ ξ ≤ xi .

Exercise 278, page 69 R

Of course we can more easily use the definition of the integral and compute that 01 x2 dx = 1/3 − 0. This exercise shows that, under certain simple conditions, not merely can we approximate the value of the integral by Riemann sums, we can produce a sequence of numbers which converges to the value of the integral. Simply divide the interval at the points 0, 1/n, 2/n, . . . , n − 1)/n, and 1. Take ξ = i/n [the right hand endpoint of the interval]. Then the Riemann sum for this partition is n 2 i 1 12 + 22 + 32 + 42 + 52 + 62 + · · · + n2 = . ∑ n n3 i=1 n As n → ∞ this must converge to the value of the integral by Theorem 3.19. The student is advised to find the needed formula for 12 + 22 + 32 + 42 + 52 + 62 + · · · + N 2 .

and determine whether the limit is indeed the correct value 1/3.

Exercise 279, page 69 Determine the value of the integral

Z 1

x2 dx

0

in the following way. Let 0 < r < 1 be fixed. Subdivide the interval [0, 1] by defining the points x0 = 0, x1 = rn−1 , x2 = rn−2 , . . . , xn−1 = rn−(n−1) = r, and xn = rn−(−n) = 1.

CHAPTER 5. ANSWERS

220

Choose the points ξi ∈ [xi−1 , xi ] as the right-hand endpoint of the interval. Then n n 2 ∑ ξ2i (xi − xi−1) = ∑ rn−i (rn−i − rn−i+1). i=1

i=1

Note that for every value of n this is a Riemann sum over subintervals whose length is smaller than 1 − r. As r → 1− this must converge to the value of the integral by Theorem 3.19. The student is advised to carry out the evaluation of this limit to determine whether the limit is indeed the correct value 1/3.

Exercise 281, page 71 Let f (x) = 1/n if x = 1/n for integers n = 1, 2, 3, . . . and let f (x) = 0 for all other values of x. Show that f is Riemann integrable on the interval [0, 1] and that (R)

Z b

f (x) dx = 0

a

but that f is not integrable in the calculus sense on [0, 1].

Exercise 282, page 72 Suppose first that f is uniformly continuous on [a, b]. Then f is integrable on [a, b] in the calculus sense. We prove (using a different method than that chosen by Robbins) that f satisfies also this strong integrability condition. Let ε > 0 and C > 0 be given. Take δ sufficiently small that | f (x) − f (y)| < ε/C if x and y are points of [a, b] for which |x − y| < δ. R Write F(x) = ax f (t) dt. Suppose that a ≤ x ≤ ξ ≤ y ≤ b and that 0 < y − x < δ. Then, by the mean-value theorem, there is a point ξ∗ between x and y for which Thus we also have

F(y) − F(x) = f (ξ∗ )(y − x).

ε (y − x). C Then, for any choice of points x0 , x1 , . . . , xn and ξ1 , ξ2 , . . . , ξn from [a, b] with the properties in the statement of the definition, |F(y) − F(x) − f (ξ)(y − x)| = |[ f (ξ∗ ) − f (ξ)](y − x)|
0 so that, if we choose any points z1 < z < z2 , the interval [z1 , z2 ] must contain points c1 and c2 for which | f (c1 ) − f (c2 )| > η. Now we apply the strong integrability hypothesis using I = F(b) − F(a), ε = η/4, and C = b − a + 4

to obtain a choice of δ with 0 < δ < 1 that meets the conditions of the definition on [a, b]. Choose points z1 < z < z2 so that z2 − z1 < δ and then select points c1 and c2 in the interval [z1 , z2 ] for which f (c1 ) − f (c2 ) > η. Construct a sequence a = x0 < x1 < · · · < x p = z1

along with associated points {ξi } so that 0 < xi − xi−1 < δ and so that p F(z1 ) − F(a) − ∑ f (ξi )(xi − xi−1 ) < η/4. i=1

This just uses the integrability hypotheses of the function f on the interval [a, z1 ]. Choose the least integer r so that Note that

r(z2 − z1 ) > 1.

1 < r(z2 − z1 ) = (r − 1)(z2 − z1 ) + (z2 − z1 ) ≤ 1 + (z2 − z1 ) < 1 + δ < 2.

CHAPTER 5. ANSWERS

222 Using r continue the sequence {xi } by defining points

x p = x p+2 = x p+4 = · · · = x p+2r = z1

and

x p+1 = x p+3 = x p+5 = · · · = x p+2r−1 = z2 .

Write ξ p+2 j = c2 and ξ p+2 j−1 = c1 for j = 1, 2, . . . , r. Finally complete the sequence {xi } by selecting points

z1 = x p+2r < x p+2r+1 < · · · < xn−1 < xn = b

along with associated points {ξi } so that n F(b) − F(z1 ) − ∑ f (ξi )(xi − xi−1 ) < η/4. i=p+2r+1 This just uses the integrability hypotheses for f on [z1 , b]. Consider now the sum n

∑ f (ξi )(xi − xi−1)

i=1

taken over the entire sequence thus constructed. Observe that p

n

∑ |xi − xi−1| = ∑ (xi − xi−1) + i=1

i=1

p+2r

∑

i=p+1

n

|xi − xi−1 | +

∑

i=p+2r+1

(xi − xi−1 )

= (z1 − a) + 2r(z2 − z1 ) + (b − z1 ) =

(b − a) + 2r(z2 − z1 ) ≤ (b − a) + 4 = C.

Thus the points chosen satisfy the conditions of the definition for the δ selected and we must have n F(b) − F(a) − ∑ f (ξi )(xi − xi−1 ) < ε < η/4. i=1 On the other hand "

#

n

F(z1 ) − F(a) + F(b) − F(z1 ) − ∑ f (ξi )(xi − xi−1 ) i=1

"

#

p

= F(z1 ) − F(a) − ∑ f (xi )(xi − xi−1 ) i=1

"

#

n

+ F(b) − F(z1 ) − −

"

p+2r

∑

i=p+1

∑

i=p+2r+1

f (ξi )(xi − xi−1 ) #

f (ξi )(xi − xi−1 ).

From this we deduce that p+2r ∑ f (ξi )(xi − xi−1 ) < 3η/4. i=p+1

5.1. ANSWERS TO PROBLEMS

223

But a direct computation of this sum shows that p+2r

∑

i=p+1

f (ξi )(xi − xi−1 ) = [ f (c1 ) − f (c2 )]r(z2 − z1 ) > ηr(z2 − z1 ) > η.

This contradiction completes the proof.

Exercise 288, page 73 The same methods will work for this theorem with a little effort. Obtain, first, an inequality of the form | f (ξi )g(ξi ) − f (ξi )g(ξ∗i )| ≤ M (ω( f , [xi , xi−1 ]) + ω(g, [xi , xi−1 ])) .

To obtain this use the simple identity

a1 a2 − b1 b2 = (a1 − b1 )a2 + (a2 − b2 )b1

and use for M an upper bound of the sum function | f | + |g| which is evidently bounded, since both f and g are bounded.

Exercise 289, page 74 Obtain, first, an inequality of the form (2) (3) (p) f1 (ξi ) f2 (ξi ) f3 (ξi ) . . . f p (ξi ) − f1 (ξi ) f2 (ξi ) f3 (ξi ) . . . f p (ξi )

≤ M [ω( f1 , [xi , xi−1 ]) + ω( f2 , [xi , xi−1 ]) + ω( f3 , [xi , xi−1 ]) + · · · + ω( f p , [xi , xi−1 ])] .

To obtain this use the simple identity

a1 a2 a3 . . . a p − b1 b2 b3 . . . b p = (a1 − b1 )a2 a3 . . . a p + (a2 − b2 )b1 a3 . . . a p + (a3 − b3 )b1 b2 a4 . . . a p + . . . +(a p − b p )b1 b2 b3 . . . b p−1

and use an appropriate M.

Exercise 290, page 74 First note that the function H( f (x), g(x)) is defined and bounded. To see this just write |H( f (x), g(x))| ≤ M(| f (x))| + |g(x)|)

and remember that both f and g are bounded. It is also true that this function is continuous at every point of (a, b) with at most finitely many exceptions. To see this, use the inequality |H( f (x), g(x)) − H( f (x0 ), g(x0 ))| ≤ M(| f (x) − f (x0 )| + |g(x) − g(x0 )|)

and the definition of continuity. R Thus the integral ab F( f (x), g(x)) dx exists as a calculus integral and can be approximated by Riemann sums n

∑ H ( f (ξi ) , g (ξi )) (xi − xi−1).

i=1

CHAPTER 5. ANSWERS

224

To complete the proof just make sure that these sums do not differ much from these other similar sums: n

∑ H ( f (ξi ) , g (ξ∗i )) (xi − xi−1).

i=1

That will follow from the inequality |H( f (ξi ) , g (ξi )) − H( f (ξi ) , g (ξ∗i ))|

≤ M|g (ξi ) − g (ξ∗i ) | ≤ Mω(g, [xi−1 , xi ]).

Exercise 291, page 75 Notice, first, that

Z b a

n

f (x) dx = ∑

Z xi

f (x) dx.

i=1 xi−1

Thus n Z xi Z b n f (x) dx − f (ξi )(xi − xi−1 ) f (x) dx − ∑ f (ξi )(xi − xi−1 ) = ∑ a x i−1 i=1 i=1 n Z xi ≤ ∑ f (x) dx − f (ξi )(xi − xi−1 ) i=1

xi−1

merely by the triangle inequality.

Exercise 292, page 75 This is the simplest case to prove since we do not have to fuss at the endpoints or at exceptional points where F ′ may fail to exist. Simply let ε > 0 and choose at each point x a number δ(x) > 0 sufficiently small so that ε(z − y) |F(z) − F(y) − f (x)(z − y)|| < b−a when 0 < z − y < δ(x) and y ≤ x ≤ z. This is merely the statement F ′ (x) = f (x) translated into ε, δ language. Now suppose that we have a partition {([xi , xi−1 ], ξi ) : i = 1, 2, . . . n}

of the interval [a, b] with each

xi − xi−1 < δ(ξi ) and ξi ∈ [xi−1 , xi ].

Then, using our estimate on each of the intervals [xi−1 , xi−1 ], Z b n f (x) dx − ∑ f (ξi )(xi − xi−1 ) a i=1 n Z xi ≤ ∑ f (x) dx − f (ξi )(xi − xi−1 ) x i=1

n

i−1

= ∑ |[F(xi ) − F(xi−1 )] − f (ξi )(xi − xi−1 )| < i=1

ε n ∑ (xi − xi−1) = ε. b − a i=1

5.1. ANSWERS TO PROBLEMS

225

Exercise 293, page 76 This is still a simpler case to prove since we do not have to fuss at the endpoints and there is only one exceptional point to worry about, not a finite set of such points. Let ε > 0 and, at each point x 6= c, choose a number δ(x) > 0 sufficiently small so that ε(z − y) |F(z) − F(y) − f (x)(z − y)|| < 2(b − a) when 0 < z − y < δ(x) and y ≤ x ≤ z. This is merely the statement F ′ (x) = f (x) translated into ε, δ language. At x = c select a positive number δ(c) > 0 so that |F(z) − F(y)| + | f (c)|(z − y) < ε/2

when 0 < z − y < δ(x) and y ≤ x ≤ z. This is possible because F is continuous at c so that |F(z) − F(y)| is small if z and y are sufficiently close to c; the second part is small since | f (c)| is simply a nonnegative number. Now suppose that we have a partition {([xi , xi−1 ], ξi ) : i = 1, 2, . . . n}

of the interval [a, b] with each

xi − xi−1 < δ(ξi ) and ξi ∈ [xi−1 , xi ].

Then, using our estimate on each of the intervals [xi−1 , xi−1 ], n Z xi ∑ x f (x) dx − f (ξi )(xi − xi−1) i=1

n

i−1

= ∑ |[F(xi ) − F(xi−1 )] − f (ξi )(xi − xi−1 )| < ε/2 + i=1

ε n ∑ |(xi − xi−1)+| = ε. b − a i=1

Note that we have had to add the ε/2 in case it happens that one of the ξi = c. Otherwise we do not need it.

Exercise 294, page 76 Exercise 274 and Exercise 275 illustrate the method. Just add more points, including the endpoints a and b into the argument. Let c1 , c2 , . . . , cM be a finite list containing the endpoints a and b and each of the points in the interval where F ′ (x) = f (x) fails. Let ε > 0 and, at each point x 6= ci , choose a number δ(x) > 0 sufficiently small so that ε(z − y) |F(z) − F(y) − f (x)(z − y)|| < 2(b − a) when 0 < z − y < δ(x) and y ≤ x ≤ z. This is merely the statement F ′ (x) = f (x) translated into ε, δ language. At x = c j ( j = 1, 2, 3, . . . , M) select a positive number δ(c j ) > 0 so that ω(F, [a, b] ∩ [c j − δ(c j ), c j + δ(c j )]) + δ(c j )| f (c)| < ε/2M

CHAPTER 5. ANSWERS

226

when 0 < z − y < δ(x) and y ≤ x ≤ z. Thus just uses the continuity of F. Now suppose that we have a partition {([xi , xi−1 ], ξi ) : i = 1, 2, . . . n}

of the interval [a, b] with each

xi − xi−1 < δ(ξi ) and ξi ∈ [xi−1 , xi ].

Note, first, that if ξi = c j for some i and j (which might occur at most M times), then |[F(xi ) − F(xi−1 )] − f (ξi )(xi − xi−1 )|

≤ ω(F, [a, b] ∩ [c j − δ(c j ), c j + δ(c j )]) + δ(c j )| f (c)| < ε/2M.

At any other point ξi 6= c j

|F(z) − F(y) − f (x)(z − y)|| < Consequently

ε(z − y) . 2(b − a)

Z x i ∑ x f (x) dx − f (ξi )(xi − xi−1) n

i=1 n

i−1

= ∑ |[F(xi ) − F(xi−1 )] − f (ξi )(xi − xi−1 )| < ε/2 + i=1

ε n ∑ |(xi − xi−1)+| = ε. b − a i=1

Exercise 295, page 77 Note that the calculus integral

Z b a

F ′ (x) dx = F(b) − F(a)

exists For each point ξ in [c, d] take δ(ξ) sufficiently small that F(y) − F(x) ε ′ − F (ξ) < y−x C whenever x and y are points in [c, d] for which x ≤ ξ ≤ y and 0 < y − x < δ(ξ). This gives us F(y) − F(x) − F ′ (ξ)(y − x) < ε (y − x). C Then, for any choice of points x0 , x1 , . . . , xn and ξ1 , ξ2 , . . . , ξn from [c, d] with the four properties of the statement of the theorem, Z b n F ′ (x) dx − ∑ F ′ (ξi )(xi − xi−1 ) a i=1 n = ∑ F(xi ) − F(xi−1 ) − F ′ (ξi )(xi − xi−1 ) i=1 n ε n ≤ ∑ F(xi ) − F(xi−1 ) − F ′ (ξi )(xi − xi−1 ) < ∑ |xi − xi−1 | ≤ ε. C i=1 i=1

5.1. ANSWERS TO PROBLEMS

227

Exercise 297, page 77 This is a standard “Cauchy” version of the integrability condition. Such a statement is equivalent to the other version. It is an essential element of general integration theory to prove the equivalence of such statements.

Exercise 298, page 78 Use Exercise ??.

Exercise 300, page 78 The easy direction is already contained in Theorem 3.27. Theorem 3.27 shows that every derivative does have this strong version of the integrability property. The proof is structured so as to be similar in many details to the proof of Theorem ??. Let us then suppose that f is a function possessing this property on an interval [a, b]. Under the hypotheses here, we need to establish two facts using fairly standard methods of integration theory. Our methods are similar to those used in Section 3.5.6. Because of Exercises 298 and ?? we know that such a function f with these properties would have to have the same properties on each subinterval. Moreover Exercise 299 shows that there must be a function F : [a, b] → R with I(x, y) = F(y) − F(x) for each a ≤ x < y ≤ b. We claim now that F ′ (x) = f (x) at every point x in the interval [a, b]. Suppose that there is a point z in the interval at which it is not true that F ′ (z) = f (z). One possibility is that this is because the upper right-hand (Dini) derivative at z exceeds f (z) by some positive value η > 0. Another is that the value f (z) exceeds the upper right-hand (Dini) derivative at z by some positive value η > 0. There are six other possibilities, corresponding to the other three Dini derivatives under which F ′ (z) = f (z) might fail. It is sufficient for a proof that we show that this first possibility cannot occur. From this we will obtain a contradiction to the statement in the theorem. Thus we will assume that there must be a positive number η > 0 so that we can choose an arbitrarily small positive number t so that the interval [z, z + t] has this property: F(z + t) − F(z) > f (z) + η t and hence so that F(z + t) − F(z) > f (z)t + ηt.

We give the details assuming this and that a < z < b. Now we apply the theorem using ε < η/4, and C = b − a + 6 to obtain a choice of positive function δ that meets the conditions of the theorem. Choose a number 0 < t < 1 for which t < δ(z) and z + t f (z)t + ηt.

Let s be the least integer so that st > 2. Note that, consequently, 2 < st = (s − 1)t + t ≤ 2 + t < 3.

CHAPTER 5. ANSWERS

228 We first select a sequence of points

z = u0 < u1 < u2 < · · · < uk−1 = z + t

and points υi from [xi−1 , xi ] so that 0 < ui − ui−1 < δ(υi ) and k−1 F(z + t) − F(z) − ∑ f (υi )(ui − ui−1 ) < ηt/2 i=1

This is possible simply because f possesses the strong integrability property on the interval [z, z + t]. Now we add in the point uk = z and υk = z. We compute that k

∑

i=1

k−1

f (υi )(ui − ui−1 ) = − f (z)t + ∑ f (υi )(ui − ui−1 ) i=1

k−1

> −[F(z + t) − F(z) − ηt] + ∑ f (υi )(ui − ui−1 ) > ηt/2. i=1

while at the same time k

∑ |xi − xi−1| = 2t.

i=1

Repeat this sequence z = u0 < u1 < · · · < uk−1 > uk = z

exactly s times so as to produce a sequence

z = u0 , u1 , . . . ur−1 , ur = z with the property that r

∑ f (υi )(ui − ui−1) > ηst/2 > η

i=1

while at the same time

r

∑ |ui − ui−1| = 2st < 6.

i=1

Now construct a sequence a = z0 < z1 < · · · < z p = z

along with associated points ζi so that 0 < zi − zi−1 < δ(ζi ) and so that Z z p f (x) dx − ∑ f (ζi )(zi − zi−1 ) < η/4. a i=1

We also need a sequence

z = w0 < w1 < . . . wq = b

along with associated points ωi so that 0 < wi − wi−1 < δ(ωi ) and so that Z b q f (x) dx − ∑ f (ωi )(wi − wi−1 ) < η/4. z i=1

Both of these just use the strong integrability property of f on the subintervals [a, z] and [z, b]

5.1. ANSWERS TO PROBLEMS

229

Now we put these three sequences together in this way a = z0 < z1 < · · · < z p = z = u0 , u1 , . . . , ur = z = w0 < w1 < . . . wq = b

to form a new sequence a = x0 , x1 , . . . , xN = b for which |xi − xi−1 | < δ(ξi ) and for which N

∑ |xi − xi−1| = (z − a) + 2st + (b − z) = b − a + 2st < b − a + 6 = C.

i=1

We use ξi in each case as the appropriate intermediate point used earlier: thus associated with an interval [zi−1 , zi ] we had used ζi ; associated with an interval [wi−1 , wi ] we had used ωi ; while associated with a pair (ui−1 , ui ) we use υi . Consider the sum N

∑ f (ξi )(xi − xi−1)

i=1

taken over the entire sequence thus constructed. Because the points satisfy the conditions of the theorem for the δ function selected we must have Z b N f (x) dx − ∑ f (ξi )(xi − xi−1 ) < ε < η/4. a i=1

On the other hand

"Z

z

f (x) dx +

Z b z

a

=

+

"Z

"Z

z a

#

N

f (x) dx − ∑ f (ξi )(xi − xi−1 ) i=1

#

p

f (x) dx − ∑ f (ζi )(zi − zi−1 ) i=1

#

q

b

f (x) dx − ∑ f (ωi )(wi − wi−1 )

z

i=1

+

"

r

#

∑ f (υi )(ui − ui−1)

i=1

.

From this we deduce that r

∑ f (ξi )(ui − ui−1) < 3η/4

i=1

and yet we recall that r

∑ f (ξi )(ui − ui−1) > ηst/2 > η.

i=1

This contradiction completes the proof.

Exercise 301, page 79 The inequalities − | f (x)| ≤ f (x) ≤ | f (x)|

CHAPTER 5. ANSWERS

230

hold at every point x at which f is defined. Since these functions are assumed to be integrable on the interval [a, b], −

Z b a

| f (x)| dx ≤

Z b a

f (x) dx ≤

Z b a

| f (x)| dx

which is exactly what the inequality in the exercise asserts.

Exercise 302, page 79 Observe that

Z x Z b n Z xk k ≤ | f (x)| dx | f (x)| dx = |F(x ) − F(x )| = f (x) dx i−1 ∑ x ∑ i ∑ x a n

n

i=1

i=1

i=1

k−1

k−1

for all choices of points

a = x0 < x1 < x2 < · · · < xn−1 < xn = b.

Exercise 303, page 80 Define the function F(x) = x cos

π x

, F(0) = 0 and compute

F ′ (x) = cos(π/x) + (π/x) sin(π/x), x 6= 0.

Thus F is differentiable everywhere except at x = 0 and F is continuous at x = 0. To see the latter note that −|x| ≤ F(x) ≤ |x|. Thus F ′ has a calculus integral on every interval. Note that F ′ (x) is continuous everywhere except at x = 0 and that it is unbounded on (0, 1). We show that |F ′ | is not integrable on [0, 1]. It is, however, integrable on any subinterval [c, d] for which 0 < c < d ≤ 1 since F ′ and hence |F ′ | are continuous at every point in such an interval. Take any integer k and consider the points ak = 2/(2k + 1), bk = 1/k and check that F(ak ) = 0 while F(bk ) = (−1)k /k. Observe that and that

0 < ak < bk < ak−1 < bk−1 < · · · < 1 Z bk

Z b k ′ 1 |F (x)| dx ≥ F (x) dx = |F(bk ) − F(ak )| = . k ak ak ′ If |F | were, in fact, integrable on [0, 1] then, summing n of these pieces, we would have ′

n

n 1 ∑k≤∑ k=1 k=1

Z bk ak

′

|F (x)| dx ≤

Z 1 0

|F ′ (x)| dx.

1 This is impossible since ∑∞ k=1 k = ∞.

Note: In the language introduced later, you may wish to observe that F is not a function of bounded variation on [0, 1]. There is a close connection between this concept and absolute integrability.

5.1. ANSWERS TO PROBLEMS

231

Exercise 304, page 80 You can use the same argument but with different arithmetic. This is the traditional example that illustrates that the calculus integral, which integrates all derivatives, is not contained in the Lebesgue integral. Indefinite Lebesgue integrals, since Lebesgue’s integral is an absolute integration method, must be of bounded variation on any interval. In contrast, the function 1 2 F(x) = x sin 2 x is everywhere differentiable but fails to have bounded variation on [0, 1].

Exercise 305, page 80 Since f is continuous on (a, b) with at most finitely many exceptions and is bounded it is integrable. But the same is true for | f |, since it too has the same properties. Hence both f and | f | are integrable.

Exercise 306, page 81 Subdivide at any one point x inside (a, b), a = x0 < x1 = x < x2 = b. Then |F(x) − F(a)| + |F(x) − F(b)| ≤ V (F, [a, b]).

Consequently

|F(x)| ≤ |F(a)| + |F(b)| +V (F, [a, b])

offers an upper bound for F on [a, b].

Exercise 307, page 81 If F : [a, b] → R is nondecreasing then T (x) = F(x) − F(a). This is because n

n

i=1

i=1

∑ |F(xi ) − F(xi−1 )| = ∑ [F(xi ) − F(xi−1 )] = F(x) − F(a)

for all choices of points a = x0 < x1 < x2 < · · · < xn−1 < xn = x.

If F : [a, b] → R is nonincreasing then T (x) = F(a)−F (x). Putting these together yields that T (x) = |F(x) − F(a)| in both cases.

Exercise 308, page 81 Work on the separate subintervals of [−π, π] on which sin x is monotonic. For example, it is nondecreasing on [−π/2, π/2].

CHAPTER 5. ANSWERS

232

Exercise 309, page 81 You should be able to show that n

∑ |F(xi ) − F(xi−1 )|

i=1

is either 0 (if none of the points chosen was 0) and is 2 (if one of the points chosen was 0). It follows that V (F, [−1, 1]) = 2. Note that this example illustrates that the computation of the sum n

∑ |F(xi ) − F(xi−1 )|

i=1

doesn’t depend merely on making the points close together, but may depend also on which points get chosen. Later on in Exercise 318 we will see that for continuous functions the sum n

∑ |F(xi ) − F(xi−1 )|

i=1

will be very close to the variation value V (F, [a, b]) if we can choose points very close together. For discontinuous functions, as we see here, we had better consider all points and not miss even one.

Exercise 310, page 81 Simplest to state would be F(x) = 0 if x is an irrational number and F(x) = 1 if x is a rational number. Explain how to choose points a = x0 < x1 < x2 < · · · < xn−1 < xn = b

so that the sum

n

∑ |F(xi ) − F(xi−1 )| ≥ n.

i=1

Exercise 311, page 81 Suppose that F : [a, b] → R is Lipschitz with a Lipschitz constant K. Then n

n

i=1

i=1

∑ |F(xi ) − F(xi−1 )| ≤ ∑ K(xi − xi−1) = K(b − a)

for all choices of points a = x0 < x1 < x2 < · · · < xn−1 < xn = b.

Thus V (F, [a, b]) ≤ K(b − a). The converse is not true and it is easy to invent a counterexample. Every monotonic function is of bounded variation, and monotonic functions need not be Lipschitz, nor even continuous.

5.1. ANSWERS TO PROBLEMS

233

Exercise 312, page 82 To estimate V (F + G, [a, b]) consider n

∑ |[F(xi ) + G(xi)] − [F(xi−1 ) + G(xi−1)]|

i=1

for all choices of points a = x0 < x1 < x2 < · · · < xn−1 < xn = b.

By the triangle inequality, n

∑ |[F(xi ) + G(xi)] − [F(xi−1 ) + G(xi−1)]|

i=1 n

n

i=1

i=1

≤ ∑ |F(xi ) − F(xi−1 )| + ∑ |G(xi ) − G(xi−1 )| ≤ V (F, [a, b]) +V (G, [a, b]).

Exercise 313, page 82 Use F = −G and then F + G is a constant and so V (F + G, [a, b]) = 0. Thus it is easy to supply an example for which V (F + G, [a, b]) < V (F, [a, b]) +V (G, [a, b]). For exact conditions on when equality might be possible see F. S. Cater, When total variation is additive, Proceedings of the American Mathematical Society, Volume 84, No. 4, April 1982.

Exercise 314, page 82 This is a substantial theorem and it is worthwhile making sure to master the methods of proof. Mostly it is just a matter of using the definition and working carefully with inequalities. (2). T is monotonic, nondecreasing on [a, b]. Take a ≤ x < y ≤ b and consider computing V (F, [a, x]). Take any points a = x0 < x1 < x2 < · · · < xn−1 < xn = x

Observe that the sum n

∑ |F(xi ) − F(xi−1 )| + |F(y) − F(x)| ≤ V (F, [a, y]).

i=1

This this would be true for any such choice of points, it follows that V (F, [a, x]) + |F (y) − F(x)| ≤ V (F, [a, y]).

Thus T (x) = V (F, [a, x]) ≤ V (F, [a, y]) = T (y). (1). for all a ≤ c < d ≤ b,

|F(d) − F(c)| ≤ V (F, [c, d]) = T (d) − T (c).

The first inequality, |F(d) − F(c)| ≤ V (F, [c, d])

CHAPTER 5. ANSWERS

234

follows immediately from the definition of what V (F, [c, d]) means. The second inequality says this: V (F, [a, d]) = V (F, [a, c]) +V (F, [c, d]) (5.2) and it is this that we must prove. To prove (5.2) we show first that V (F, [a, d]) ≥ V (F, [a, c]) +V (F, [c, d])

(5.3)

Let ε > 0 and choose points so that

a = x0 < x1 < x2 < · · · < xn−1 < xn = c n

∑ |F(xi ) − F(xi−1 )| > V (F, [a, c]) − ε.

i=1

Then choose points so that

c = xn < xn+1 < xn+2 < · · · < xm−1 < xm = d m

∑

i=n+1

Observe that

|F(xi ) − F(xi−1 )| > V (F, [c, d]) − ε.

m

∑ |F(xi ) − F(xi−1 )| ≤ V (F, [a, d]).

i=1

Putting this together now you can conclude that V (F, [a, d]) ≥ V (F, [a, c]) +V (F, [c, d]) − 2ε.

Since ε is arbitrary the inequality (5.3) follows. Now we prove that Choose points so that

V (F, [a, d]) ≤ V (F, [a, c]) +V (F, [c, d])

(5.4)

a = x0 < x1 < x2 < · · · < xn−1 < xn = d n

∑ |F(xi ) − F(xi−1 )| > V (F, [a, d]) − ε.

i=1

We can insist that among the points selected is the point c itself [since that does not make the sum any smaller]. So let us claim that xk = c. Then k

∑ |F(xi ) − F(xi−1 )| ≤ V (F, [a, c])

i=1

and

n

∑

i=k+1

|F(xi ) − F(xi−1 )| ≤ V (F, [c, d]).

Putting this together now you can conclude that V (F, [a, d]) − ε < V (F, [a, c]) +V (F, [c, d]).

Since ε is arbitrary the inequality (5.4) follows. Finally, then, the inequalities (5.3) and (5.4) verify (5.2).

5.1. ANSWERS TO PROBLEMS

235

(3). If F is continuous at a point then so too is T . We argue just on the right at the point a to claim that if F is continuous at a then T (a+) = T (a) = 0. The same argument can be repeated at any point and on either side. The value T (a+) exists since T is monotonic, but it might be positive. Let ε > 0 and choose δ1 so that |T (x) − T (a+)| < ε if a < x < a + δ1 . Choose δ2 , using the continuity of F at a, so that |F(x) − F(a)| < ε

if a < x < a + δ2 . Now take any a < x < min{δ1 , δ2 }. Choose points a = x0 < x1 < x2 < · · · < xn−1 < xn = x

so that

n

∑ |F(xi ) − F(xi−1 )| > T (x) − ε.

i=1

Observe that and that

|F(x1 ) − F(x0 )| = |F(x1 ) − F(a)| < ε

n

∑ |F(xi ) − F(xi−1 )| ≤ T (x) − T (x1 ) ≤ T (x) − T (a+) − [T (x1 ) − T (a+)] < 2ε.

i=2

Putting these together we can conclude that T (x) < 3ε for all a < x < min{δ1 , δ2 }. Thus T (a+) = 0.

(4). If F is uniformly continuous on [a, b] then so too is T . This follows from (3). (5). If F is continuously differentiable at a point then so too is T and, moreover T ′ (x0 ) = |F ′ (x0 )|. This statement is not true without the continuity assumption so your proof will have to make use of that assumption. We will assume that F is continuously differentiable at a and conclude that the derivative of T on the right at a exists and is equal to |F ′ (a)|. This means that F is differentiable in some interval containing a and that this derivative is continuous at a. Let ε > 0 and choose δ so that |F ′ (a) − F ′ (x)| < ε

if a < x < a + δ. Now choose points

a = x0 < x1 < x2 < · · · < xn−1 < xn = x

so that

n

T (x) ≥ ∑ |F(xi ) − F(xi−1 )| > T (x) − ε(x − a). i=1

Apply the mean-value theorem on each of the intervals to obtain n

n

i=1

i=1

∑ |F(xi ) − F(xi−1 )| = ∑ |F ′ (ξi )|(xi − xi−1) = |F ′(a)|(x − a) ± ε(x − a).

CHAPTER 5. ANSWERS

236 We can interpret this to yield that

|T (x) − T (a) − |F ′ (a)|(x − a)| ≤ 2ε(x − a)

for all a < x < a + δ. This says precisely that the right-hand derivative of T at a is |F ′ (a)|. (6). If F is uniformly continuous on [a, b] and continuously differentiable at all but finitely many points in (a, b) then F ′ is absolutely integrable and F(x) − F(a) =

Z x a

F ′ (t) dt and T (x) =

Z x a

|F ′ (t)| dt.

For F ′ to be absolutely integrable both F ′ and |F ′ | must be integrable. Certainly F ′ is integrable. The reason that |F ′ | is integrable is that is continuous at all but finitely many points in (a, b) and has for an indefinite integral the uniformly continuous function T . This uses (5).

Exercise 315, page 82 The natural way to do this is to write F(x) F(x) F(x) = V (F, [a, x]) + − V (F, [a, x]) − 2 2 in which case this expression is called the Jordan decomposition. It is then just a matter of checking that the two parts do in fact express f as the difference of two monotonic, nondecreasing functions. Theorem 3.33 contains all the necessary information.

Exercise 316, page 82 The methods in Exercise 303 can be repeated here. First establish continuity. The only troublesome point is at x = 0 and, for that, just notice that −|x| ≤ F(x) ≤ |x| which can be used to show that F is continuous at x = 0. Then to compute the total variation of take any integer k and consider the points ak = 2/(2k + 1), bk = 1/k and check that F(ak ) = 0 while F(bk ) = (−1)k /k. Observe that 0 < ak < bk < ak−1 < bk−1 < · · · < 1.

Consequently

n

∑ |F(bk ) − F(ak )| ≤ V (F, [0, 1]).

k=1

But

n

n 1 = ∑ ∑ |F(bk ) − F(ak )| k=1 k k=1

1 and ∑∞ k=1 k = ∞. It follows that V (F, [0, 1]) = ∞.

Exercise 317, page 82 See Gerald A. Heuer, The derivative of the total variation function, American Mathematical Monthly, Vol. 78, No. 10 (1971), pp. 1110–1112. For the statement about the

5.1. ANSWERS TO PROBLEMS

237

variation it is enough to work on [0, 1] since the values on [−1, 0] are symmetrical. For the statement about the derivatives, it is enough to work on the right-hand side at 0, since F(−x) = −F(x). Here is the argument for r = 2 from Heuer’s article. Note that, on each interval [2/(2n + 1)π, 2/(2n − 1)π], the function F vanishes at the endpoints and has a single extreme point xn where 1/nπ < xn < 2/(2n − 1)π.

Thus the variation on this interval is 2|F(xn )|, and

(1/nπ)2 = |F(1/nπ)| < |F(xn )| < x2n < {2/[(2n − 1)π]}2 .

By the integral test for series (page ??) 1/n =

Z ∞

dx/x2
v + ε.

(5.5)

j=1

Since F is uniformly continuous on [a, b] there is a η > 0 so that ε |F(x) − F(x′ )| < 2(k + 1) ′ whenever |x − x | < η. We are now ready to specify our δ: we choose this smaller than η and also smaller than all the lengths y j − y j−1 for j = 1, 2, 3, . . . , k. Now suppose that we have made a choice of points a = x0 < x1 < x2 < · · · < xn−1 < xn = b such that each xi − xi−1 < δ. We shall show that n

v < ∑ |F(xi ) − F(xi−1 )| ≤ V (F, [a, b] i=1

and this will prove the statement in the exercise.

(5.6)

CHAPTER 5. ANSWERS

238

We can split this sum up into two parts: if an interval (xi−1 , xi ) contains any one of the points from the collection a = y0 < y1 < y2 < · · · < yn−1 < yk = b

that we started with, then we will call that interval a black interval. Note that, by our choice of δ, a black interval can contain only one of the y j points. In fact, if y j ∈ (xi−1 , xi ) we can make use of the fact that ε |F(xi ) − F(xi−1 )| ≤ |F(xi ) − F(y j )| + |F(y j ) − F(xi−1 )| ≤ . (5.7) (k + 1) If (xi−1 , xi ) contains none of these points we will call it a white interval. The sum in (5.6) is handled by thinking separately about the white intervals and the black intervals. Let combine all the xi ’s and all the y j ’s: Note that

a = z0 < z1 < z2 < · · · < zn−1 < zm = b. m

∑ |F(z p ) − F(z p−1 )| > v + ε.

(5.8)

p=1

This is because the addition of further points always enlarges the sum or leaves it the same. The inequality (5.6) now follows by comparing it to (5.8). There are extra white intervals perhaps where a new point has been added, but each of these has been enlarged by adding a single point and the total extra contribution is no more than ε because of (5.7).

Exercise 320, page 83 If F is locally of bounded variation at every point x ∈ R then the collection β = {([u, v], w) : w ∈ [u, v], V (F, [u, v]) < ∞}

is a full cover of the real line. Take any interval [a, b] and choose a partition π of the interval [a, b] so that π ⊂ β. Then V (F, [a, b]) ≤

∑

V (F, [u, v]) < ∞.

([u,v],w)∈π

The converse is immediate.

Exercise 321, page 83 Recall that “mostly everywhere” indicates a finite exceptional set is possible while “nearly everywhere” allows a sequence of exceptional points where the derivative inequality may not hold. This comparison test is presented in the expository paper J. J. Koliha, Mean, Meaner and the Meanest Mean-Value Theorem, The American Mathematical Monthly, 116, No. 4, (2009) 356-361. as a preferred tool in elementary analysis to the mean-value theorem. It allows of numerous extended versions (more than the three in that paper) and, indeed, the student

5.1. ANSWERS TO PROBLEMS

239

would be better prepared for numerous problems thinking of an application of this principle before trying to fit the mean-value problem to the solution. As a starting point to constructing a proof of the first statement in the exercise, consider a point a < x0 0 and choose δ0 > 0 so that F(x) − F(x0 ) ′ − F (x0 ) < ε x − x0 and G(x) − G(x0 ) ′ − G (x0 ) < ε x − x0 if 0 < |x − x0 | < δ0 . Note that for such x, |F(x) − F(x0 )| ≤ |G(x) − G(x0 )| + 2ε|x − x0 |

because |F ′ (x0 )| ≤ |G′ (x0 )|. For a point x0 that is equal to a or b or for which the inequality |F ′ (x0 )| ≤ |G′ (x0 )| fails just use the continuity of F to select δ0 > 0 so that |F(x) − F(x0 )| < ε

if |x − x0 | < δ0 . Our standard Cousin partitioning argument can now be used; see Section 1.9.5 for a number of worked-out examples. The paper of Koliha can also be consulted for similar details if this hint doesn’t get you started. (Note that his paper assumes that G is nondecreasing so he does not have to work with the variations.)

Exercise 322, page 83 It is easy to check that a Lipschitz function would have this property. Notice that the property stated is similar to the statement of Robbins’s theorem and, accordingly, similar methods will help here. Suppose that the function F : [a, b] → R has the stated property. Let C = 3(b − a) and choose M so that m

∑ |F(xi ) − F(xi−1 )| ≤ M

i=1

for all choices of points a = x0 , x1 , x2 , . . . , xm−1 , xm = b for which

m

∑ |xi − xi−1| ≤ C = 3(b − a).

i=1

We claim that F is Lipschitz, in fact that F(y) − F(x) ≤ 2M b−a y−x for all a ≤ x < y ≤ b. Suppose not. We obtain a contradiction by supposing that 2M(y − x) |F(y) − F(x)| > b−a for some particular choice of a ≤ x < y ≤ b. [We can suppose that a < x < y < b as the

CHAPTER 5. ANSWERS

240 other cases are similarly handled.] Let n be the largest integer for which n(y − x) ≤ 2(b − a).

Choose points x0 = a, x1 = x, x2 = y, x3 = x, x4 = y, . . . x2n+1 = x, x2n+2 = b. Note that 2n+2

∑

i=1

|xi − xi−1 | = b − a + n(y − x) ≤ 3(b − a) = C.

Consequently, by our choice of M, 2n+1

∑

i=1

|F(xi ) − F(xi−1 )| ≤ M.

We can estimate this sum as 2n+1

∑

i=1

|F(xi ) − F(xi−1 )| > |F(x) − F(a)| + |F(b) − F(x)| +

Thus

2Mn(y − x) . b−a

2Mn(y − x) ≤M b−a

or

n(y − x) ≤ (b − a)/2.

This contradicts our choice of the integer n since that would mean that [n + 1](y − x) = n(y − x) + (y − x) ≤ (b − a)/2 + (b − a) ≤ 2(b − a).

This contradiction completes the proof.

Exercise 323, page 84 We suppose that f is absolutely integrable on [a, b]. Thus | f | is integrable here. Observe, then, that Z b n Z xk n n Z xk ∑ |F(xi ) − F(xi−1 )| = ∑ x f (x) dx ≤ ∑ x | f (x)| dx = a | f (x)| dx i=1

i=1

i=1

k−1

k−1

for all choices of points

It follows that

a = x0 < x1 < x2 < · · · < xn−1 < xn = b. V (F, [a, b]) ≤

Z b a

| f (x)| dx.

Consequently F must be a function of bounded variation and we have established an inequality in one direction for the identity V (F, [a, b]) =

Z b a

| f (x)| dx.

Let us prove the opposite direction. Since f and | f | are integrable we may apply the Henstock property (Theorem 3.26) to each of them. Write G for an indefinite integral of | f | and recall that F is an indefinite integral of f .

5.1. ANSWERS TO PROBLEMS

241

For every ε > 0 and for each point x in [a, b] there is a δ(x) > 0 so that n

∑ |F(xi ) − F(xi−1 ) − f (ξi)(xi − xi−1)| < ε

i=1 n

∑ |Gxi) − G(xi−1) − | f (ξi)|(xi − xi−1)| < ε

i=1

whenever {([xi , xi−1 ], ξi ) : i = 1, 2, . . . n} is a partition of [a, b] with each xi − xi−1 < δ(ξi ) and ξi ∈ [xi−1 , xi ].

There must exist one such partition and for that partition n

n

i=1 n

i=1

G(b) − G(a) = ∑ G(xi ) − G(xi−1 ) ≤ ε + ∑ | f (ξi )|(xi − xi−1 ) ≤ 2ε ∑ |Fxi ) − F(xi−1 ) ≤ V (F, [a, b] + 2ε. i=1

It follows, since ε can be any positive number, that Z b a

| f (x)| dx = G(b) − G(a) ≤ V (F, [a, b]).

This completes the proof.

Exercise 324, page 84 This is a limited theorem but useful to state and fairly easy to prove given what we now know. We know that F ′ is integrable on [a, b]; indeed, it is integrable by definition even without the assumption about the continuity of F ′ . We also know that, if F ′ is absolutely integrable, then F would have to be of bounded variation on [a, b]. So one direction is clear. To prove the other direction we suppose that F has bounded variation. Let T be the total variation function of F on [a, b]. Then, by Theorem 3.33, T is uniformly continuous on [a, b] and T is differentiable at every point at which F is continuously differentiable, with moreover T ′ (x) = |F ′ (x)| at such points. Wherever F ′ is continuous so too is |F ′ |. Consequently we have this situation: T : [a, b] → R is a uniformly continuous function that is continuously differentiable at every point in a bounded, open interval (a, b) with possibly finitely many exceptions. Thus T ′ = |F ′ | is integrable.

Exercise 325, page 85 The limit function is f (x) = 1/x which is continuous on (0, ∞) but certainly not bounded there.

Exercise 326, page 85 Each of the functions is continuous. Notice that for each x ∈ (−1, 1), limn→∞ fn (x) = 0 and yet, for x ≥ 1, limn→∞ fn (x) = 1. This is easy to see, but it is instructive to check

CHAPTER 5. ANSWERS

242

the details since we can use them later to see what is going wrong in this example. At the right-hand side on the interval [1, ∞) it is clear that limn→∞ fn (x) = 1. At the other side, on the interval (−1, 1) the limit is zero. For if −1 < x0 < 1 and ε > 0, let N ≥ log ε/ log |x0 |. Then |x0 |N ≤ ε, so for n ≥ N | fn (x0 ) − 0| = |x0 |n < |x0 |N ≤ ε.

Thus

f (x) = lim fn (x) = n→∞

0 if −1 < x < 1 1 if x ≥ 1.

The pointwise limit f of the sequence of continuous functions { fn } is discontinuous at x = 1. (Figure 3.1 shows the graphs of several of the functions in the sequence just on the interval [0, 1].)

Exercise 328, page 85 The sequence of functions fn (x) converges to zero on (−1, 1) and to x − 1 on [1, ∞) . Now fn′ (x) = xn−1 on (−1, 1), so by the previous exercise (Exercise 326), lim f ′ (x) n→∞ n

=

0 if −1 < x < 1 1 if x ≥ 1,

while the derivative of the limit function, fails to exist at the point x = 1.. Thus d d lim fn (x) lim ( fn (x)) 6= n→∞ dx dx n→∞ at x = 1.

Exercise 330, page 86 The concept of uniform convergence would allow this argument. But interchanging two limiting operations cannot be justified with pointwise convergence. Just because this argument looks plausible does not mean that we are under no obligation to use ε, δ type of arguments to try to justify it. Apparently, though, to verify the continuity of f at x0 we do need to use two limit operations and be assured that the order of passing to the limits is immaterial.

Exercise 331, page 87 If all the functions fn had the same upper bound this argument would be valid. But each may have a different upper bound so that the first statement should have been If each fn is bounded on an interval I then there must be, by definition, a number Mn so that | fn (x)| ≤ Mn for all x in I.

Exercise 332, page 87 In this exercise we illustrate that an interchange of limit operations may not give a correct result.

5.1. ANSWERS TO PROBLEMS

243

For each row m, we have limn→∞ Smn = 0. Do the same thing holding n fixed and letting m → ∞.

Exercise 335, page 87 We have discussed, briefly, the possibility that there is a sequence that contains every rational numbers. This topic appears in greater detail in Chapter 4. If f did have a calculus integral there would be a function F such that F ′ = f at all but finitely many points. There would be at least one interval where f is an exact derivative and yet f does not have the Darboux property since it assumes only the values 0 and 1 (and no values in between).

Exercise 337, page 88 The statements that are defined by inequalities (e.g., bounded, convex) or by equalities (e.g., constant, linear) will not lead to an interchange of two limit operations, and you should expect that they are likely true.

Exercise 338, page 88 As the footnote to the exercise explains, this was Luzin’s unfortunate attempt as a young student to understand limits. The professor began by saying “What you say is nonsense.” He gave him the example of the double sequence m/(m + n) where the limits as m → ∞ and n → ∞ cannot be interchanged and continued by insisting that “permuting two passages to the limit must not be done.” He concluded with “Give it some thought; you won’t get it immediately.” As yet another illustration that some properties are not preserved in the limit, compute the length of the curves in Exercise 338 (Fig. 3.3) and compare with the length of the limiting curve [i.e., the straight line y = x].

Exercise 339, page 89 The purpose of the exercise is to lead to the notion of uniform convergence as a stronger alternative to pointwise convergence. Fix ε but let the point x0 vary. Observe that, when x0 is relatively small in comparison with ε, the number log x0 is large in absolute value compared with log ε, so relatively small values of n suffice for the inequality |x0 |n < ε. On the other hand, when x0 is near 1, log x0 is small in absolute value, so log ε/ log x0 will be large. In fact, log ε = ∞. (5.9) lim x0 →1− log x0 The following table illustrates how large n must be before |xn0 | < ε for ε = .1. Note that for ε = .1, there is no single value of N such that |x0 |n < ε for every value of x0 ∈ (0, 1) and n > N. (Figure 5.3 illustrates this.)

CHAPTER 5. ANSWERS

244 1

1

Figure 5.3: The sequence {xn } converges infinitely slowly on [0, 1]. The functions y = xn are shown with n = 2, 4, 22, and 100, with x0 = .1, .5, .9, and .99, and with ε = .1. x0 .1 .5 .9 .99 .999 .9999

n 2 4 22 230 2,302 23,025

Some nineteenth-century mathematicians would have described the varying rates of convergence in the example by saying that “the sequence {xn } converges infinitely slowly on (0, 1).” Today we would say that this sequence, which does converge pointwise, does not converge uniformly. The formulation of the notion of uniform convergence in the next section is designed precisely to avoid this possibility of infinitely slow convergence.

Exercise 340, page 89 We observed that the sequence { fn } converges pointwise, but not uniformly, on (0, 1). We realized that the difficulty arises from the fact that the convergence near 1 is very “slow.” But for any fixed η with 0 < η < 1, the convergence is uniform on [0, η]. To see this, observe that for 0 ≤ x0 < η, 0 ≤ (x0 )n < ηn . Let ε > 0. Since limn→∞ ηn = 0, there exists N such that if n ≥ N, then 0 < ηn < ε. Thus, if n ≥ N, we have 0 ≤ xn0 < ηn < ε,

so the same N that works for x = η, also works for all x ∈ [0, η). (See Figure 5.4.)

Exercise 341, page 89 Use the Cauchy criterion for convergence of sequences of real numbers to obtain a candidate for the limit function f . Note that if { fn } is uniformly Cauchy on the interval I, then for each x ∈ I, the sequence of real numbers { fn (x)} is a Cauchy sequence and hence convergent.

5.1. ANSWERS TO PROBLEMS

245

f fn

Figure 5.4: Uniform convergence on the whole interval.

Exercise 342, page 89 Fix n ≥ m and compute

sup |xn − xm | ≤ ηm .

(5.10)

x∈[0,η]

Let ε > 0 and choose an integer N so that ηN < ε. Equivalently we require that N > log ε/ log η. Then it follows from (5.10) for all n ≥ m ≥ N and all x ∈ [0, η] that |xn − xm | ≤ ηm < ε.

We conclude, by the Cauchy criterion, that the sequence fn (x) = xn converges uniformly on any interval [0, η], for 0 < η < 1. Here there was no computational advantage over the argument in Example 340. Frequently, though, we do not know the limit function and must use the Cauchy criterion rather than the definition.

Exercise 343, page 90 This follows immediately from Theorem 3.38. Just check that the translation from series language to sequence language works out in all of the details.

Exercise 344, page 90 Our computations could be based on the fact that the sum of this series is known to us; it is (1 − x)−1 . We could prove the uniform convergence directly from the definition. Instead let us use the Cauchy criterion. Fix n ≥ m and compute m n m x j ≤ η . sup ∑ x ≤ sup (5.11) 1−η x∈[0,η] j=m x∈[0,η] 1 − x

Let ε > 0. Since

ηm (1 − η)−1 → 0

as m → ∞ we may choose an integer N so that

ηN (1 − η)−1 < ε.

Then it follows from (5.11) for all n ≥ m ≥ N and all x ∈ [0, η] that m m x + xm+1 + · · · + xn ≤ η < ε. 1−η

CHAPTER 5. ANSWERS

246

It follows now, by the Cauchy criterion, that the series converges uniformly on any interval [0, η], for 0 < η < 1. Observe, however, that the series does not converge uniformly on (0, 1), though it does converge pointwise there. (See Exercise 359.)

Exercise 345, page 90 It is not always easy to determine whether a sequence of functions is uniformly convergent. In the settings of series of functions, this simple test is often useful. This will certainly become one of the most frequently used tools in your study of uniform convergence. Let Sn (x) = ∑nk=0 fk (x). We show that {Sn } is uniformly Cauchy on I. Let ε > 0. For m < n we have Sn (x) − Sm (x) = fm+1 (x) + · · · + fn (x), so

|Sn (x) − Sm (x)| ≤ Mm+1 + · · · + Mn .

Since the series of constants ∑∞ k=0 Mk converges by hypothesis, there exists an integer N such that if n > m ≥ N, Mm+1 + · · · + Mn < ε.

This implies that for n > m ≥ N,

|Sn (x) − Sm (x)| < ε

for all x ∈ D. Thus the sequence {Sn } is uniformly convergent on D; that is, the series ∑∞ k=1 fk is uniformly convergent on I.

Exercise 346, page 90 k Then |xk | ≤ ak for every k = 0, 1, 2 . . . and x ∈ [−a, a]. Since ∑∞ k=0 a converges, by the ∞ k M-test the series ∑k=0 x converges uniformly on [−a, a].

Exercise 347, page 90 The crudest estimate on the size of the terms in this series is obtained just by using the fact that the sine function never 1 in absolute value. Thus exceeds sin kθ 1 k p ≤ k p for all θ ∈ R. p Since the series ∑∞ k=1 1/k converges for p > 1, we obtain immediately by the M-test that our series converges uniformly (and absolutely) on the interval (−∞, ∞) [or any interval in fact]. for all real θ provided p > 1. p For 0 < p < 1 the series ∑∞ k=1 1/k diverges and the M-test supplies us with no information in these cases. We seem to have been particularly successful here, but a closer look also reveals a limitation in the method. The series is also pointwise convergent for 0 < p ≤ 1 (use the Dirichlet test) for all values of θ, but it converges nonabsolutely. The M-test cannot be of any help in this situation since it can address only absolutely convergent series. Thus we have obtained only a partial answer because of the limitations of the test.

5.1. ANSWERS TO PROBLEMS

247

Because of this observation, it is perhaps best to conclude, when using the Mtest, that the series tested “converges absolutely and uniformly” on the set given. This serves, too, to remind us to use a different method for checking uniform convergence of nonabsolutely convergent series. See the next exercise (Exercise 348).

Exercise 348, page 90 We will use the Cauchy criterion applied to the series to obtain uniform convergence. We may assume that the bk (x) are nonnegative and decrease to zero. Let ε > 0. We need to estimate the sum n (5.12) ∑ ak (x)bk (x) k=m

for large n and m and all x ∈ I. Since the sequence of functions {bk } converges uniformly to zero on I, we can find an integer N so that for all k ≥ N and all x ∈ I ε 0 ≤ bk (x) ≤ . 2M The key to estimating the sum (5.12), now, is the summation by parts formula. This is just the elementary identity n

n

∑

k=m

ak bk =

∑ (sk − sk−1)bk

k=m

= sm (bm − bm+1 ) + sm+1 (bm+1 − bm+2 ) · · · + sn−1 (bn−1 − bn ) + sn bn .

This provides us with n ∑ ak (x)bk (x) ≤ 2M sup |bm (x)| < ε k=m x∈E

for all n ≥ m ≥ N and all x ∈ I which is exactly the Cauchy criterion for the series and proves the theorem. Commentary: The M-test is a highly useful tool for checking the uniform convergence of a series. By its nature, though, it clearly applies only to absolutely convergent series. Abel’s test clearly shines in this regard. It is worth pointing out that in many applications of this theorem the sequence {bk } can be taken as a sequence of numbers, in which case the statement and the conditions that need to be checked are simpler. For reference we can state this as a corollary. Corollary 5.1 Let {ak } be a sequence of functions on a set E ⊂ R. Suppose that there is a number M so that N a (x) ∑ k ≤ M k=1

for all x ∈ E and every integer N. Suppose that the sequence of real numbers {bk } converges monotonically to zero. Then the series ∞

∑ bk ak

k=1

converges uniformly on E.

248

CHAPTER 5. ANSWERS

Figure 5.5: Graph of ∑nk=1 (sin kθ)/k on [0, 2π] for, clockwise from upper left, n = 1, 4, 7, and 10.

Exercise 349, page 91 It is possible to prove that this series converges for all θ. Questions about the uniform convergence of this series are intriguing. In Figure 5.5 we have given a graph of some of the partial sums of the series. The behavior near θ = 0 is most curious. Apparently, if we can avoid that point (more precisely if we can stay a small distance away from that point) we should be able to obtain uniform convergence. Theorem 3.41 will provide a proof. We apply that theorem with bk (θ) = 1/k and ak (θ) = sin kθ. All that is required is to obtain an estimate for the sums n sin kθ ∑ k=1

for all n and all θ in an appropriate set. Let 0 < η < π/2 and consider making this estimate on the interval [η, 2π − η]. From familiar trigonometric identities we can produce the formula cos θ/2 − cos(2n + 1)θ/2 sin θ + sin 2θ + sin 3θ + sin 4θ + · · · + sin nθ = 2 sin θ/2 and using this we can see that n 1 sin kθ . ≤ ∑ sin(η/2) k=1 Now Theorem 3.41 immediately shows that ∞ sin kθ ∑ k k=1

converges uniformly on [η, 2π − η]. Figure 5.5 illustrates graphically why the convergence cannot be expected to be uniform near to 0. A computation here is instructive. To check the Cauchy criterion on [0, π] we need to show that the sums n sin kθ sup ∑ k θ∈[0,π] k=m

5.1. ANSWERS TO PROBLEMS

249

are small for large m, n. But in fact 2m sin kθ 2m sin(k/2m) 2m sin 1/2 sin 1/2 ≥ ∑ > , sup ∑ ≥ ∑ k k=m k 2 θ∈[0,π] k=m k=m 2m

obtained by checking the value at points θ = 1/2m. Since this is not arbitrarily small, the series cannot converge uniformly on [0, π].

Exercise 358, page 91 Use the Cauchy criterion for convergence of sequences of real numbers to obtain a candidate for the limit function f . Note that if { fn } is uniformly Cauchy on a set D, then for each x ∈ D, the sequence of real numbers { fn (x)} is a Cauchy sequence and hence convergent.

Exercise 376, page 95 R

Let Gk (x) = 01 gk (x) dx be the indefinite integrals of the gk . Observe that, for k = 2, 3, 4, . . . , the function Gk is continuous on [0, 1], piecewise linear and that it is differentiable everywhere except at the point 1 − 1k ; it has a right-hand derivative O there but a left-hand derivative 2−k . That means that the partial sum n

Fn (x) =

∑ gk (x)

k=2

is also continuous on [0, 1], piecewise linear and that it is differentiable everywhere except at all the points 1 − 1k for k = 2, 3, 4, . . . . Both ∞

f (x) =

∑ gk (x) and F(x) =

n=2

∞

∑ Gk (x)

n=2

converge uniformly on [0, 1] and F ′ (x) = f (x) at every point with the exception of all the points in the sequence 21 , 32 , 43 , 45 , 65 , . . . . That is too many points for F to be an indefinite integral. Note that the functions in the sequence f1 , f2 , f3 , . . . are continuous with only finitely many exceptions. But the number of exceptions increase with n. That is the clue that we are heading to a function that may not be integrable in the very severe sense of the calculus integral.

Exercise 377, page 95 Let ε > 0 and choose N so that | fn (x) − f (x)| < ε/(b − a) for all n ≥ N and all x ∈ [a, b]. Then, since f and each function fn is integrable, Z b Z b Z b Z b ε f (x) dx − f (x) dx dx = ε ≤ | f (x) − f (x)| dx ≤ n n a a a a b−a for all n ≥ N. This proves that Z b a

f (x) dx = lim

Z b

n→∞ a

fn (x) dx.

Note that we had to assume that f was integrable in order to make this argument work.

CHAPTER 5. ANSWERS

250

Exercise 378, page 96 Let g(x) = limn Fn′ (x). Since each of the functions Fn′ is assumed continuous and the convergence is uniform, the function g is also continuous on the interval (a, b). From Theorem Z3.42 we infer that Z x

x

g(t) dt = lim

a

n→∞ a

Fn′ (t) dt = lim [Fn (x) − Fn (a)] = F(x) − F(a) for all x ∈ [a, b].

Thus we obtain or

n→∞

(5.13)

Z x a

g(t) dt = F(x) − F(a)

F(x) =

Z x

g(t) dt + F(a).

a

It follows from the continuity of g that F is differentiable and that f ′ (x) = g(x) for all x ∈ (a, b).

Exercise 379, page 96 To justify

we observe first that the series

∞ 1 = ∑ kxk−1 (1 − x)2 k=1 ∞

∑ xk

k=0

(3.4) converges pointwise on (−1, 1). Next we note (Exercise 360) that the series ∞

∑ kxk−1

k=1

converges pointwise on (−1, 1) and uniformly on any closed interval [a, b] ⊂ (−1, 1). Thus, if x ∈ (−1, 1) and −1 < a < x < b < 1, then this series converges uniformly on [a, b]. Now apply Corollary 3.46. Indeed there was a bit of trouble on the interval (−1, 1), but trouble that was easily handled by working on a closed, bounded subinterval [a, b] inside.

Exercise 380, page 96 Indeed there is a small bit of trouble on the interval (−∞, ∞), but trouble that was easily handled by working on a closed, bounded subinterval [−t,t] inside. The Weierstrass M-test can be used to verify uniform convergence since k x tk ≤ k! k!

for all −t < x < t.

Exercise 384, page 97 The hypotheses of Theorem 3.45 are somewhat more restrictive than necessary for the conclusion to hold and we have relaxed them here by dropping the continuity assump-

5.1. ANSWERS TO PROBLEMS

251

tion. That means, though, that we have to work somewhat harder. We also need not assume that { fn } converges on all of [a, b]; convergence at a single point suffices. (We cannot, however, replace uniform convergence of the sequence { fn′ } with pointwise convergence, as Example 328 shows.) Theorem 3.47 applies in a number of cases in which Theorem 3.45 does not. For the purposes of the proof we can assume that the set of exceptions C is empty. For simply work on subintervals (c, d) ⊂ (a, b) that miss the set C. After obtaining the proof on each subinterval (c, d) the full statement of the theorem follows by piecing these intervals together. Let ε > 0. Since the sequence of derivatives converges uniformly on (a, b), there is an integer N1 so that | fn′ (x) − fm′ (x)| < ε

for all n, m ≥ N1 and all x ∈ (a, b). Also, since the sequence of numbers { fn (x0 )} converges, there is an integer N > N1 so that | fn (x0 ) − fm (x0 )| < ε

for all n, m ≥ N. Let us, for any x ∈ [a, b], x 6= x0 , apply the mean value theorem to the function fn − fm on the interval [x0 , x] (or on the interval [x, x0 ] if x < x0 ). This gives us the existence of some point ξ strictly between x and x0 so that fn (x) − fm (x) − [ fn (x0 ) − fm (x0 )] = (x − x0 )[ fn′ (ξ) − fm′ (ξ)].

(5.14)

From this we deduce that

| fn (x) − fm (x)| ≤ | fn (x0 ) − fm (x0 )| + |(x − x0 )( fn′ (ξ) − fm′ (ξ)| < ε(1 + (b − a))

for any n, m ≥ N. Since this N depends only on ε this assertion is true for all x ∈ [a, b] and we have verified that the sequence of continuous functions { fn } is uniformly Cauchy on [a, b] and hence converges uniformly to a continuous function f on the closed, bounded interval [a, b]. We now know that the one point x0 where we assumed convergence is any point. Suppose that a < x0 < b. We show that f ′ (x0 ) is the limit of the derivatives fn′ (x0 ). Again, for any ε > 0, equation (5.14) implies that | fn (x) − fm (x) − [ fn (x0 ) − fm (x0 )]| ≤ |x − x0 |ε

(5.15)

| fn (x) − fn (x0 ) − [ f (x) − f (x0 )]| ≤ |x − x0 |ε

(5.16)

for all n, m ≥ N and any x 6= x0 in the interval (a, b). In this inequality let m → ∞ and, remembering that fm (x) → f (x) and fm (x0 ) → f (x0 ), we obtain if n ≥ N. Let C be the limit of the sequence of numbers M > N such that | fM′ (x0 ) −C| < ε.

{ fn′ (x0 )}.

Thus there exists (5.17)

Since the function fM is differentiable at x0 , there exists δ > 0 such that if 0 < |x − x0 | < δ, then fM (x) − fM (x0 ) ′ < ε. − f (x ) (5.18) 0 M x − x0

CHAPTER 5. ANSWERS

252

From Equation (5.16) and the fact that M > N, we have fM (x) − fM (x0 ) f (x) − f (x0 ) < ε. − x − x0 x − x0 This, together with the inequalities (5.17) and (5.18), shows that f (x) − f (x0 ) −C < 3ε x − x0

for 0 < |x − x0 | < δ. This proves that f ′ (x0 ) exists and is the number C, which we recall is limn→∞ fn′ (x0 ). The final statement of the theorem, lim

Z b

n→∞ a

fn′ (x) dx =

Z b a

f ′ (x) dx,

now follows too. We know that f ′ is the exact derivative on (a, b) of a uniformly continuous function f on [a, b] and so the calculus integral f (b) − f (a) =

Z b a

But we also know that fn (b) − fn (a) =

Z b a

f ′ (x) dx. fn′ (x) dx.

and lim [ fn (b) − fn (a)] = f (b) − f (a).

n→∞

Exercise 385, page 97 Let { fk } be a sequence of differentiable functions on an interval [a, b]. Suppose that ′ the series ∑∞ k=0 fk converges uniformly on [a, b]. Suppose also that there exists x0 ∈ ∞ [a, b] such that the series ∑∞ k=0 fk (x0 ) converges. Then the series ∑k=0 fk (x) converges uniformly on [a, b] to a function F, F is differentiable, and F ′ (x) =

∞

∑ fk′ (x)

k=0

for all a ≤ x ≤ b.

Exercise 386, page 97 It is not true. We have already seen a counterexample in Exercise 376. Rx Here is an analysis of the situation: Let Gn (x) = a gn (t) dt. Theorem 3.47 demands a single finite set C of exceptional points where G′n (x) = gn (x) might fail. In general, however, this set should depend on n. Thus, for each n select a finite set Cn so that G′n (x) = gn (x) is true for all x ∈ [a, b] \Cn . S If C = ∞ n=1 Cn is finite then we could conclude that the limit function g is integrable. But C might be infinite.

5.1. ANSWERS TO PROBLEMS

253

Exercise 387, page 97 A simple counterexample, showing that we cannot conclude that { fn } converges on I, is fn (x) = n for all n. To see there must exist a function f such that f ′ = g = limn→∞ fn′ on I: Fix x0 ∈ I, let Fn = fn − fn (x0 ) and apply Theorem 3.47 to the sequence {Fn } . Thus, the uniform limit of a sequence of derivatives { fn′ } is a derivative even if the sequence of primitives { fn } does not converge.

Exercise 388, page 98 If there is a finite set of points where one of the inequalities fails redefine all the functions to have value zero there. That cannot change the values of any of the integrals but it makes the inequality valid.

Exercise 389, page 99 Lemma 3.48 is certainly the easier of the two lemmas. For that just notice that, for any integer N, if the inequality N

f (x) ≥

holds for all x in (a, b) then, since Z b a

f (x) dx ≥

Z b a

∑ gk (x),

k=1 N ∑k=1 gk (x) is

!

N

∑ gk (x)

k=1

integrable, N Z b gk (x) dx . dx = ∑ k=1

a

But if this inequality in turn is true for all N then Z b ∞ Z b f (x) dx ≥ ∑ gk (x) dx a

k=1

a

is also true.

Exercise 390, page 99 This lemma requires a bit of bookkeeping and to make this transparent we will use some language and notation. Because the proof is a bit tricky we will also expand the steps rather more than we usually do. 1. Instead of writing a partition or subpartition out in detail in the form {([ai , bi ], ξi ) : i = 1, 2, . . . , n}

we will use the Greek letter2 π to denote a partition, so saves a lot of writing.

π = {([ai , bi ], ξi ) : i = 1, 2, . . . , n}

2. For the Riemann sum over a partition π, in place of writing the cumbersome n

∑ f (ξi )(bi − ai)

i=1 2π

is the letter in the Greek alphabet corresponding to “p” so that explains the choice. It shouldn’t interfere with your usual use of this symbol.

CHAPTER 5. ANSWERS

254 we write merely

∑

.

([u,v],w)∈π

f (w)(v − u) or

∑ f (w)(v − u) π

3. Instead of saying that a partition satisfies the usual condition π = {([ai , bi ], ξi ) : i = 1, 2, . . . , n}

with we just say π is δ-fine.

ξi ∈ [ai , bi ] and bi − ai < δ(ξi ).

This notation will make the arguments transparent and is generally convenient. Remember that our first step in the proof of Lemma 3.49 is to assume that the inequality ∞

f (x) ≤

∑ gk (x),

k=1

is valid at every point of the interval [a, b]. Let ε > 0. Since f itself is assumed to be integrable the interval [a, b], the integral can be approximated (pointwise, not uniformly) by Riemann sums. Thus we can choose, for each x ∈ [a, b], a δ0 (x) > 0 so that

∑ f (w)(v − u) ≥ π

Z b a

f (x) dx − ε

whenever π is a partition of the interval [a, b] that is δ0 -fine. This applies Theorem 3.26. Since g1 is integrable and, again, the integral can be approximated by Riemann sums we can choose, for each x ∈ [a, b], a δ0 (x) > δ1 (x) > 0 so that

∑ g1 (w)(v − u) ≤ π

Z b a

g1 (x) dx + ε2−1

whenever π is a partition of the interval [a, b] that is δ1 -fine. Since g2 is integrable and (yet again) the integral can be approximated by Riemann sums we can choose, for each x ∈ [a, b], a δ1 (x) > δ2 (x) > 0 so that

∑ g2 (w)(v − u) ≤ π

Z b a

g2 (x) dx + ε2−2

whenever π is a partition of the interval [a, b] that is δ2 -fine. Continuing in this way we find, for each integer k = 1, 2, 3, . . . a δk−1 (x) > δk (x) > 0 so that

∑ gk (w)(v − u) ≤ π

Z b a

gk (x) dx + ε2−k

whenever π is a partition of the interval [a, b] that is δk -fine. Let t < 1 and choose for each x ∈ [a, b] the first integer N(x) so that N(x)

t f (x) ≤

∑

fn (x).

n=1

Let En = {x ∈ [a, b] : N(x) = n}.

5.1. ANSWERS TO PROBLEMS

255

We use these sets to carve up the δk and create a new δ(x). Simply set δ(x) = δk (x) whenever x belongs to the corresponding set Ek . Take any partition π of the interval [a, b] that is δ-fine (i.e., it must be a fine partition relative to this newly constructed δ.) The existence of such a partition is guaranteed by the Cousin covering argument. Note that this partition is also δ0 -fine since δ(x) < δ0 (x) for all x. We work carefully with this partition to get our estimates. Let N be the largest value of N(w) for the finite collection of pairs ([u, v], w) ∈ π. We need to carve the partition π into a finite number of disjoint subsets by writing, for j = 1, 2, 3, . . . , N, π j = {([u, v], w) ∈ π : w ∈ E j } and

σ j = π j ∪ π j+1 ∪ · · · ∪ πN .

for integers j = 1, 2, 3, . . . , N. Note that σ j is itself a subpartition that is δ j -fine. Putting these together we have π = π1 ∪ π2 ∪ · · · ∪ πN .

By the way we chose δ0 and since the new δ is smaller than that we know, for this partition π that Z b a

so t

Z b a

f (x) dx − ε ≤ ∑ f (w)(v − u) π

f (x) dx − tε ≤ ∑ t f (w)(v − u). π

We also will remember that for x ∈ Ei ,

t f (x) ≤ g1 (x) + g2 (x) + · · · + gi (x).

Now we are ready for the crucial computations, each step of which is justified by our observations above: Z b

f (x) dx − tε ≤

t

a

N

∑ t f (w)(v − u) = ∑ ∑ t f (w)(v − u) i=1 πi

π

N

≤ ∑ ∑ (g1 (w) + g2 (w) + · · · + gi (w)) (v − u) i=1 πi

!

N

=

∑ ∑ g j (w)(v − u)

j=1 N

∑

j=1

Z

b a

σj

g j (x) dx + ε2− j

Since ε is arbitrary, this shows that t

Z b a

∞

≤

∞

f (x) dx ≤

∑

k=1

∑

j=1

Z

b a

Z

b a

≤ g j (x) dx + ε.

gk (x) dx .

As this is true for all t < 1 the inequality of the lemma must follow too.

CHAPTER 5. ANSWERS

256

Exercise 392, page 99 This follows from these lemmas and the identity ∞

f (x) = f1 (x) + ∑ ( fn (x) − fn−1 (x)) . n=1

Since fn is a nondecreasing sequence of functions, the sequence of functions fn (x) − fn−1 (x) is nonnegative. As usual, ignore the finite set of exceptional points or assume that all functions are set equal to zero at those points.

Exercise 393, page 99 We use the same technique and the same language as used in the solution of Exercise 390. Let gn = f − fn and let Gn denote the indefinite integral of the function gn . The sequence of functions {gn } is nonnegative and monotone decreasing with limn→∞ gn (x) = 0 at each x. Let ε > 0. Choose a sequence of functions {δk } so that

∑

([u,v],w)∈π

|Gk (v) − Gk (u) − gk (w)(v − u)| < ε2−k

whenever π is a partition of the interval [a, b] that is δk -fine. Choose, for each x ∈ [a, b], the first integer N(x) so that gk (x) < ε for all k ≥ N(x).

Let

En = {x ∈ [a, b] : N(x) = n}.

We use these sets to carve up the δk and create a new δ(x). Simply set δ(x) = δk (x) whenever x belongs to the corresponding set Ek . Take any partition π of the interval [a, b] that is δ-fine (i.e., it must be a fine partition relative to this newly constructed δ.) The existence of such a partition is guaranteed by the Cousin covering argument. Let N be the largest value of N(w) for the finite collection of pairs ([u, v], w) ∈ π. We need to carve the partition π into a finite number of disjoint subsets by writing π j = {([u, v], w) ∈ π : w ∈ E j }

for integers j = 1, 2, 3, . . . , N. Note that

π = π1 ∪ π2 ∪ · · · ∪ πN

and that these collections are pairwise disjoint. Now let m be any integer greater than N. We compute 0≤ N

∑

j=1

Z b a

gm (x) dx = Gm (b) − Gm (a) =

∑

([u,v],w)∈π j

(Gm (v) − Gm (u))

!

N

≤

∑

j=1

∑

([u,v],w)∈π

∑

(Gm (v) − Gm (u)) =

([u,v],w)∈π j

(G j (v) − G j (u))

!

≤

5.1. ANSWERS TO PROBLEMS " N

257

∑

∑

j=1 ([u,v],w)∈π j

N

∑

"

∑

j=1 ([u,v],w)∈π j

This shows that 0≤ for all m ≥ N. The identity Z b a

f (x) dx − lim

g j (w)(v − u) + ε2− j < #

ε(v − u) + ε2− j < ε(b − a + 1).

Z b a

#

gm (x) dx < ε(b − a + 1)

Z b

n→∞ a

fn (x) dx = lim

Z b

n→∞ a

gn (x) dx = 0.

follows.

Exercise 395, page 101 Just apply the theorems. We need, first, to determine that the interval of convergence of the integrated series ∞

F(x) =

∑ xn+1/(n + 1) = x + x2 /2 + x3 /3 + x4/4 + . . . + .

n=0

is [−1, 1). Consequently, of the two integrals here, only one exists. Note that F(0) − F(−1) = −F(−1) = 1 − 1/2 + 1/3 − 1/4 − 1/5 + 1/6 − . . .

is a convergent alternating series and provides the value of the integral ! Z 0

−1

∞

∑ xn

dx.

n=0

Note that the interval of convergence of the original series is (−1, 1) but that is not what we need to know. We needed very much to know what the interval of convergence of the integrated series was.

Exercise 396, page 102 The formula

1 (−1 < x < 1) 1−x is just the elementary formula for the sum of a geometric series. Thus we do not need to use series methods to solve the problem; we just need to integrate the function 1 f (x) = . 1−x We happen to know that Z 1 dx = − log(1 − x) +C 1−x on (−∞, 1) so this integral is easy to work with without resorting to series methods. 1 + x + x2 + x3 + x4 + · · · + =

CHAPTER 5. ANSWERS

258 The integral

Z 0

1 dx = − log(1 − 0) − (− log(1 − (−1)) = log 2. 1 − x −1 For the first exercise you should have found a series that we now know adds up to log 2.

Exercise 397, page 102 No, you are wrong. And don’t call me ‘Shirley.’ The condition we need concerns the integrated series, not the original series. The integrated series is F(x) = x + x2 /2 + x3 /3 + x4 /4 + . . . + and, while this diverges at x = 1, it converges at x = −1 since

1 − 1/2 + 1/3 − 1/4 − 1/5 + 1/6 − . . .

is an alternating harmonic series, known to be convergent by the convergent alternating series test. Theorem 3.53 then guarantees that the integral exists on [−1, 0] and predicts that it might not exist on [0, 1]. The mistake here can also be explained by the nature of the calculus integral. Remember that in order for a function to be integrable on an interval [a, b] it does not have to be defined at the endpoints or even bounded near them. The careless student is fussing too much about the function being integrated and not paying close enough attention to the integrated series. We know that F(x) is an antiderivative for f on (−1, 1) R0 so the only extra fact that we need for the integral −1 f (x) dx is that F is continuous on [−1, 0]. It is.

Exercise 398, page 102 Yes. Inside the interval (−R, R) this formula must be valid.

Exercise 399, page 102 Yes. If we are sure that the closed, bounded interval [a, b] is inside the interval of convergence (i.e., either (−R, R) or (−R, R] or [−R, R) or [−R, R]) then this formula must be valid.

Exercise 400, page 102 Both the series f (x) = 1 + 2x + 3x2 + 4x3 + . . . and the formally integrated series F(x) = x + x2 + x3 + . . . have a radius of convergence 1 and an interval of convergence exactly equal to (−1, 1). Theorem 3.53 assures us, only, that F is an indefinite integral for f on (−1, 1). But x (−1 < x < 1). F(x) = x + x2 + x3 + · · · = 1−x

5.1. ANSWERS TO PROBLEMS

259

If we define

x (−1 ≤ x < 1). 1−x then G is continuous on [−1, 0] and G′ (x) = F ′ (x) = f (x) on (−1, 1). Consequently G(x) =

Z 0

We were not able to write Z 0

−1

−1

f (x) dx = G(0) − G(−1) = 1/2.

f (x) dx = F(0) − F(−1) = −1 + 1 − 1 + 1 − 1 + 1 − . . .

because F(−1) is not defined (the series for F diverges at x = −1. Since G(x) is unbounded near x = 1 there is no hope of finding an integral for f on [0, 1].

Exercise 401, page 102 Here R = 0. Show that lim kk rk = 0

k→∞

for every r > 0. Conclude that the series must diverge for every x 6= 0.

Exercise 402, page 103 Do R = 0, R = ∞, and R = 1. Then for any 0 < s < ∞ take your power series for R = 1 and make a suitable change, replacing x by sx.

Exercise 406, page 103 This follows immediately from Exercise 405 without any further computation.

Exercise 407, page 103 This follows immediately from the inequalities p p ak+1 ak+1 k k , ≤ lim inf |ak | ≤ lim sup |ak | ≤ lim sup lim inf k→∞ k→∞ ak ak k→∞

k→∞

which can be established by comparing ratios and roots, together with Exercise 404.

Exercise 408, page 104 Exercise 409, page 104 Exercise 410, page 104 Exercise 411, page 104

CHAPTER 5. ANSWERS

260

Exercise 412, page 104 Exercise 413, page 104 Exercise 418, page 105 If the series converges absolutely at an endpoint ±R of the interval of convergence then |a0 | + |a1 |R + |a2 |R2 + |a3 |R3 + . . .

converges. For each x in the interval [−R, R],

|ak (x)xk | ≤ |ak |Rk .

By the Weierstrass M-test the series converges uniformly on [−R, R]. The conclusion is now that f (x) = a0 + a1 x + a2 x2 + a3 x3 + . . . is a uniformly convergent power series on the interval [−R, R] and so f is continuous. We know that f ′ (x) = a1 + 2a2 x + 3a3 x2 + 4a4 x3 + . . . is convergent at least on (−R, R) and that this is indeed the derivative of f there. It follows that f ′ is integrable on [−R, R] and that f is an indefinite integral on that interval.

Exercise 419, page 105 For the proof we can assume that the series f (x) = a0 + a1 x + a2 x2 + a3 x3 + . . . has a radius of convergence 1 and that the series converges nonabsolutely atx = 1. We can assume that the interval of convergence is (−1, 1]. Any other case can be transformed into this case. Set sn = a0 + a1 + a2 + a3 + · · · + an−1

and note that, by our hypothesis that the power series converges at x = 1, this is a convergent series. The sequence bk (x) = xk is nonnegative and decreasing on the interval [0, 1]. one of the versions of Abel’s theorem (Exercise 348) applies in exactly this situation and so we can claim that the series ∞

∑ ak bk (x)

k=0

converges uniformly on [0, 1]. This is what we wanted. The conclusion is now that f (x) = a0 + a1 x + a2 x2 + a3 x3 + . . . is a uniformly convergent power series on the interval [0, 1] and so f is continuous. We know that f ′ (x) = a1 + 2a2 x + 3a3 x2 + 4a4 x3 + . . .

5.1. ANSWERS TO PROBLEMS

261

is convergent at least on (−1, 1) and that this is indeed the derivative of f there. It follows that f ′ is integrable on [0, 1]. We already know that f ′ is integrable on any interval [a, 0] for −1 < a < 0. Thus f ′ is integrable on any interval [a, 1] for −1 < a < 0, and thus integrable on any interval [a, b] ⊂ (−1, 1]. To finish let us remark on the transformations needed to justify the first paragraph. If f (x) = a0 + a1 x + a2 x2 + a3 x3 + . . . converges at x = 1 then converges at x = −1 and

g(x) = a0 − a1 x + a2 x2 − a3 x3 + . . .

h(x) = a0 + R−1 a1 x + R−2 a2 x2 + R−3 a3 x3 + . . .

converges at x = R.

Exercise 421, page 105 Write out the Cauchy criterion for uniform convergence on (−r, r) and deduce that the Cauchy criterion for uniform convergence on [−r, r] must then also hold.

Exercise 424, page 106 The best that can be concluded is that if there is any series representation for f valid at least in some interval (−r, r) for r > 0, then ∞

f (x) =

∑

k=0

f (k) (0) k x k!

must be that series. But it is possible that there simply is no power series representation of a function, even assuming that it is f is infinitely often differentiable at x = 0.

Exercise 425, page 108 Each of these steps, carried out, will lead to the conclusion that the area is expressible as an integral. The first step is the assumption that area is additive. The second step assumes that area can be estimated above and below in this way. The last two steps then follow mathematically from the first two. The loosest version of this argument requires taking the concept for granted and simply assuming that an accumulation argument will work for it. Thus A(x) accumulates all of the area of the region between a and x. Now add on a small bit more to get A(x + h). The bit more that we have added on is close to f (ξ) × h for some [or any] choice of ξ inside (x, x + h). R We “conclude” immediately that ax f (t), dt expresses completely the measurement A(x) that we require. You should be aware here of where you are making an additive assumption and where you are making an assumption of continuity.

Exercise 428, page 109 Well in fact you merely memorized that the area of a circle of radius R is πR2 . Then the

CHAPTER 5. ANSWERS

262

area of a half-circle (assuming that it has an area) would be half of that (assuming that areas add up). Notice that by basing area on integration theory we are on firmer ground for all such statements.

Exercise 429, page 109 √ √ The top half of the circle is the curve y = r2 − x2 and the bottom half is y = r2 − x2 both on the interval −r ≤ x ≤ r. Just apply Definition 3.55 (and hope that you have the skills to determine exactly what the integral is).

Exercise 430, page 109 “The difficulty that occurs with this test of integrands is somewhat subtle. If a quantity Q is equal to the integral of a function f , then every upper sum of f is larger than Q and every lower sum of f is smaller than Q. On the other hand, even with some applications occurring at the most elementary level, it is not possible to know a priori that upper and lower sums bound Q. One knows this only after showing in some other way that the integral of f equals Q. Consider, for example, the area between the graphs of the functions g(x) = 1 + x2 and h(x) = 2x2 on [0, 1]. While for a small ∆x > 0, the maximum of g(x) − h(x) on [0, ∆x] occurs at 0, no rectangle of height 1 and width ∆x contains the region between the graphs over [0, , ∆x], so it is not clear a priori that 1 · ∆x is larger than the area of that region. Of course there are several methods to justify the integral needed here . . . , but even for this simple example the ‘universal’ method of upper and lower sums fails, and Bliss’s theorem also fails, as a test for the integrand.” . . . from Peter A. Loeb, A lost theorem of the calculus, The Mathematical Intelligencer, Volume 24, Number 2 (June, 2002). One can just ignore the difficulty and accept Definition 3.55 as a correct interpretation of area. Or, we could use Definition 3.54 and insist that areas can be added and subtracted. In that way Z 1 0

[g(x) − h(x)] dx =

Z 1 0

g(x) dx −

Z 1

h(x) dx

0

gets around the problem, since both of these areas and integrals allow an interpretation using the method of exhaustion. Yet again, we could consider, instead, adjusted Riemann sums n

∑ [g(ξi ) − h(ξ∗i )](xi − xi−1)

i=1

R

that also approximate the same integral 01 [g(x) − h(x)] dx. Then, judicious choices of ξi and ξ∗i can be made to return to an argument that follows the principles of the method of exhaustion.

Exercise 431, page 109 The geometric series certainly sums to the value 1. Now use the definition of the integral

5.1. ANSWERS TO PROBLEMS

263

R ∞ −2 dx to compute its value. 1 x

Exercise 434, page 113 First note that max{|p|, |q|} ≤

p

p2 + q2 ≤ |p| + |q|

for all real numbers p and q. Consequently, if we make any choice of points a = t0 < t1 < t2 < · · · < tn−1 < tn = b,

the sum

n

q

∑

i=1

has, as an upper bound,

[F(ti ) − F(ti−1 )]2 + [G(ti ) − G(ti−1 )]2

n

∑ |F(ti ) − F(ti−1 )| + |G(ti) − G(ti−1)| ≤ V (F, [a, b]) +V (G, [a, b]).

i=1

Consequently, for the length L of the curve, In the other direction

L ≤ V (F, [a, b]) +V (G, [a, b]).

n

n

i=1

i=1

∑ |F(ti ) − F(ti−1)| ≤ ∑

q

[F(ti ) − F(ti−1 )]2 + [G(ti ) − G(ti−1 )]2 ≤ L.

Thus V (F, [a, b]) ≤ L. The inequality V (G, [a, b]) ≤ L is similarly proved.

Exercise 435, page 113 We know that F ′ (t) and |F ′ (t)| are integrable on [a, b]. We also know that V (F, [a, b]) ≤

Z b a

|F ′ (t)| dt.

Consequently F has bounded variation on [a, b]. Similarly G has bounded variation on [a, b]. It follows from Exercise 434 that the curve is rectifiable. Let ε > 0 and choose points so that

a = t0 < t1 < t2 < · · · < tn−1 < tn = b n

L−ε < ∑

i=1

q

[F(ti ) − F(ti−1 )]2 + [G(ti ) − G(ti−1 )]2 ≤ L.

The sum increases if we add points, so we will add all points at which the derivatives F ′ (t) or G′ (t) do not exist. In between the points in the subdivision we can use the mean-value theorem to select ti−1 < τi < ti and ti−1 < τ∗i < ti so that [F(ti ) − F(ti−1 )] = F ′ (τi ) and [G(ti ) − G(ti−1 )] = G′ (τ∗i ).

CHAPTER 5. ANSWERS

264 Consequently n

L−ε < ∑

i=1

But the sums

n

∑

i=1

q

q

[F ′ (τi )]2 + [G′ (τ∗i )]2 ≤ L.

[F ′ (τi )]2 + [G′ (τ∗i )]2

are approximating sums for the integral Z bq [F ′ (t)]2 + [G′ (t)]2 dt. a

Here we are applying Theorem 3.25 since we have selected two points τi and τ∗i from each interval, rather than one point as the simplest version of approximating Riemann sums would demand. We easily check that the function p H(p, q) = p2 + q2 ≤ |p| + |q|

satisfies the hypotheses of that theorem.

Exercise 436, page 113 Use the Darboux property of continuous functions.

Exercise 439, page 113 Translating from the language of curves to the language of functions and their graphs: The length of the graph would be the least number L so that n

∑ [(xi − xi−1 )]2 + [ f (xi ) − f (xi−1)]2 ≤ L

i=1

for all choices of points a = x0 < x1 < x2 < · · · < xn−1 < xn = b.

This would be finite if and only if f has bounded variation on [a, b] and would be smaller than (b − a) +V (F, [a, b]). A formula for this length, in the case when f is continuously differentiable on (a, b) with a bounded derivative, would be Z bq L= 1 + [ f ′ (x)]2 dx. a

Exercise 440, page 113 The function is continuously differentiable, Lipschitz and so certainly of bounded variation. Hence the curve x = t, y = f (t) (0 ≤ t ≤ 2) is rectifiable. The formula

L=

Z 2r 0

1 1 + (e2x − 2 + e−2x ) dx 4

5.1. ANSWERS TO PROBLEMS

265

is immediate. Calculus students would be expected to have the necessary algebraic skills to continue. “Completing the square” will lead to an integral that can be done by hand.

Exercise 441, page 116 On the interval [a, b] with no additional points inserted this is exactly the trapezoidal rule. The general formula just uses the same idea on each subinterval.

Exercise 442, page 116 We can assume that f is twice continuously differentiable on [a, b] and then apply integration by parts [twice] to the integral Z b a

(x − a)(b − x) f ′′ (x) dx.

One integration by parts will give Z b a

x=b Z b (x − a)(b − x) f (x) dx = (x − a)(b − x) f (x) x=a − [a + b − 2x] f ′ (x) dx ′′

′

a

and a second integration by parts on this integral will give [2x − (a + b)]) f (x)] x=b x=a − 2

Z b a

f (x) dx = (b − a)( f (a) + f (b)) − 2

Z b

f (x) dx.

a

Exercise 443, page 116 Again we can assume that f is twice continuously differentiable on [a, b]. Then the preceding exercise supplies Z b Z f (a) + f (b) 1 b f (x) dx − (b − a) = − (x − a)(b − x) f ′′ (x) dx 2 2 a a Z b

(b − a)3 , 12 a making sure to apply the appropriate mean-value theorem for the integral above. = − f ′′ (ξ)

(x − a)(b − x) dx = − f ′′ (ξ)

Exercise 444, page 116 Again we can assume that f is twice continuously differentiable on [a, b]. Then the preceding exercises supply Z b Z f (a) + f (b) 1 b f (x) dx − (b − a) = − (x − a)(b − x) f ′′ (x) dx. 2 2 a a Now just use the fact that max (x − a)(b − x) =

x∈[a,b]

to estimate

Z b a

(b − a)2 4

(x − a)(b − x)| f ′′ (x)| dx.

CHAPTER 5. ANSWERS

266

Exercise 445, page 117 The preceding exercises should help.

Exercise 446, page 117 This is from Edward Rozema, Estimating the error in the trapezoidal rule, The American Mathematical Monthly, Vol. 87 (2), (1980), pages 124–128. The observation just uses the fact that the usual error is exactly equal to n

∑−

k=1

(b − a)3 ′′ f (ξi ). 12n3

where here we are required to take appropriate points ξi in each interval i(b − a) (i − 1)(b − a) ,a+ (i = 1, 2, 3, . . . , n) [xi−1 , xi ] = a + n n If we rewrite this sum in a more suggestive way the theorem is transparent. Just check that this is exactly the same sum: −

(b − a)2 n ′′ ∑ f (ξi )(xi − xi−1. 12n2 k=1

R

We recognize the sum as a Riemann sum for the integral ab f ′′ (x) dx and that integral can be evaluated as f ′ (b) − f ′ (a). [For large enough n the sum is close to the integral; this is all that is intended here.] Rozema goes on to note that, since we have an explicit (if approximate) error, we may as well use it. Thus an improved trapezoidal rule is Z b

(b − a)2 ′ [ f (b) − f ′ (a)] 12n2 a and the error estimate when using the improvement can be shown to be f (x) dx ≈ Tn −

f ′′′′ (ξ)(b − a)5 720n4 which is rather better than the error for the original trapezoidal rule.

Exercise 447, page 117 We can see (since the correct value of the integral is provided) that n = 1 or n = 2 is nowhere large enough. A simple trial-and-error approach might work. Look for a large value of n, compute the trapezoidal rule approximation and see if we are close enough. Apart from being tedious, this isn’t much of a “method.” For one thing we do not expect normally to be asked such a question when the value is already guaranteed. More importantly, even if we could determine that n = 50, 000 is large enough, how would we know that larger values of n are equally accurate. The trapezoidal rule eventually converges to the correct value, but it does not (in general) work out that the values get closer and closer to the correct value.

5.1. ANSWERS TO PROBLEMS

267 2

In the case here the situation is really quite simpler. Since the function f (x) = ex is convex [sometimes called concave up] on the interval [0, 1] the trapezoidal rule always overestimates the integral. Each successive application for larger n will get closer as it will be smaller. So you could solve the problem using trial-and-error in this way. If you know how to program then this is reasonable. On the web you can also find Java Applets that will do the job for you. For example, at the time of writing, a nice one is here www.math.ucla.edu/ . . . ronmiech/Java Applets/Riemann/index.html that allows you to input {f(x)= exp(x^2)} and select the number of subdivisions. It is perhaps more instructive to do some experimental play with such applets than to spend an equal time with published calculus problems. A more sensible method, which will be useful in more situations, is to use the published error estimate for the trapezoidal rule to find how large n must be so that the error is small enough to guarantee nine decimal place accuracy. 2 The second derivative of f (x) = ex is 2

2

f ′′ (x) = 2 ex + 4 x2 ex . A simple estimate on the interval [0, 1] shows that 2 ≤ f ′′ (ξ) ≤ 6e = 16.30969097 for all 0 ≤ ξ ≤ 1. We know that the use of the trapezoidal rule at the nth stage produces an error 1 ′′ f (ξ), error = − 12n2 where ξ is some number between 0 and 1. Consequently if we want an error less than 10−9 /2 [guaranteeing a nine decimal accuracy] we could require 1 1 ′′ f (ξ) ≤ (16.30969097) < 10−9 /2. 12n2 12n2 So 1 n2 > (16.30969097)(2 × 109 ) 12 or n > 52138 will do the trick. Evidently a trial-and-error approach might have been somewhat lengthy. Notice that this method, using the crude error estimate for the trapezoidal rule, guarantees that for all n > 52138 the answer provided by that rule will be correct to a nine decimal accuracy. It does not at all say that we must use n this large. Smaller n will doubtless suffice too, but we would have to use a different method to find them. What we could do is use a lower estimate on the error. We have 2 1 ′′ f (ξ) ≥ − , error = − 2 12n 12n2 where ξ is some number between 0 and 1. Thus we could look for values of n for which 2 > −10−9 − 12n2

CHAPTER 5. ANSWERS

268

which occurs for n2 < 61 109 , or n < 12909.9. Thus, before the step n = 12, 909 there must be an error in the trapezoidal rule which affects at least the ninth decimal place.

Exercise 448, page 119 To show that

R ∞ n −x 0 x e dx = n! first find a recursion formula for Z ∞

In =

0

xn e−x dx (n = 0, 1, 2, 3, . . . )

by integration by parts. A direct computation shows that I0 = 1 and an integration by parts shows that In = nIn−1 . It follows, by induction, that In = n!. In fact Maple is entirely capable of finding the answer to this too. Input the same command: > int(x^n* exp(-x), x=0..infinity ); memory used=3.8MB, alloc=3.1MB, time=0.38 GAMMA(n + 1)

The Gamma function is defined as Γ(n + 1) = n! at integers, but is defined at nonintegers too.

Exercise 450, page 120 This is called the Cauchy-Schwarz inequality and is the analog for integrals of that same inequality in elementary courses. It can be proved the same way and does not involve any deep properties of integrals.

Exercise 451, page 120 For example, prove the following: 1. log 1 = 0. 2. log x < log y if 0 < x < y. 3. limx→∞ log x = ∞ and limx→0+ log x = 0. 4. the domain and range are both (0, ∞). 5. log xy = log x + log y if 0 < x, y. 6. log x/y = log x − log y if 0 < x, y. 7. log xr = r log x for x > 0 and r = 1, 2, 3, . . . . 8. log e = 1 where e = limn→∞ (1 + 1/n)n . 9.

d dx

log x = 1/x for all x > 0.

10. log 2 = 0.69 . . . . 2

3

4

11. log(1 + x) = −x − x2 − x3 − x4 − . . . for −1 < x < 1.

5.1. ANSWERS TO PROBLEMS

269

Exercise 454, page 121 Take any sequence. It must contain every element of the empty set 0/ since there is nothing to check.

Exercise 455, page 121 If the finite set is {c1 , c2 , c3 , . . . , cm } then the sequence

c1 , c2 , c3 , . . . , cm , cm , cm , cm , . . .

contains every element of the set.

Exercise 456, page 121 If the sequence c1 , c2 , c3 , . . . , cm , . . . contains every element of some set it must certainly contain every element of any subset of that set.

Exercise 457, page 122 The set of natural numbers is already arranged into a list in its natural order. The set of integers (including 0 and the negative integers) is not usually presented in the form of a list but can easily be so presented, as the following scheme suggests: 0, 1, −1, 2, −2, 3, −3, 4, −4, 5, −5, 6, −6, 7, −7, . . . .

Exercise 458, page 122 The rational numbers can also be listed but this is quite remarkable, for (at first sight) no reasonable way of ordering them into a sequence seems likely to be possible. The usual order of the rationals in the reals is of little help. To find such a scheme define the “rank” of a rational number m/n in its lowest terms (with n ≥ 1) to be |m| + n. Now begin making a finite list of all the rational numbers at each rank; list these from smallest to largest. For example, at rank 1 we would have only the rational number 0/1. At rank 2 we would have only the rational numbers −1/1, 1/1. At rank 3 we would have only the rational numbers −2/1, −1/2, 1/2, 2/1. Carry on in this fashion through all the ranks. Now construct the final list by concatenating these shorter lists in order of the ranks: 0/1, −1/1, 1/1, −2/1, −1/2, 1/2, 2/1, . . . .

This sequence will include every rational number.

Exercise 459, page 122 If the sequence c1 , c2 , c3 , . . . , cm , . . .

CHAPTER 5. ANSWERS

270 contains every element of a set A and the sequence d1 , d2 , d3 , . . . , dm , . . .

contains every element of a set B, then the combined sequence c1 , d1 , c2 , d2 , c3 , d3 . . . , cm , dm , . . . contains every element of the union A ∪ B. By induction, then the union of any finite number of countable sets is countable. That is not so remarkable in view of the next exercise (Exercise 460).

Exercise 460, page 122 We show that the following property holds for countable sets: If S1 , S2 , S3 , . . . is a sequence of countable sets of real numbers, then the set S formed by taking all elements that belong to at least one of the sets Si is also a countable set. We can consider that the elements of each of the sets Si can be listed, say, S1 = {x11 , x12 , x13 , x14 , . . . }

S2 = {x21 , x22 , x23 , x24 , . . . } S3 = {x31 , x32 , x33 , x34 , . . . }

S4 = {x41 , x42 , x43 , x44 , . . . }

and so on. Now try to think of a way of listing all of these items, that is, making one big list that contains them all. Describe in a systematic way a sequence that starts like this: x11 , x12 , x21 , x13 , x22 , x23 , x14 , x23 , x32 , x41 , . . .

Exercise 461, page 122 It is easy enough to construct such a function that has finitely many discontinuities. With some persistence you can find such a function that is discontinuous, say, at every rational number. You cannot find such a function that is discontinuous at every irrational because the collection of all points where such a function F is not continuous is countable. First of all establish that for such a function and at every point a < x ≤ b, the one-sided limit F(x−) = limx→x− F(x) exists and that, at every point a ≤ x < b, the one-sided limit F(x+) = limx→x+ F(x) exists. Note that, again because the function is monotonic, nondecreasing, F(x−) ≤ F(x) ≤ F(x+)

at all a < x < b. Consequently F is continuous at a point x if and only if the one-sided limits at that point have the same value as F(x). For each integer n let Cn be the set of points x such that F(x+) − F(x−) > 1/n. Because F is nondecreasing, and because F(b)− F(a) is finite there can only be finitely many points in any set Cn . To see this take, if possible, any points a < c1 < c2 < · · ·
0 then the interval [a, 1] contains only finitely many corners. But the interval (0, 1) contains countably many corners! Thus the calculus integral in both the finite set version and in the countable set version will provide Z b a

F ′ (x) dx = F(b) − F(a)

for all 0 < a < b ≤ 1. The claim that Z b 0

F ′ (x) dx = F(b) − F(0)

for all 0 < b ≤ 1 can be made only for the new extended integral.

Exercise 472, page 124 The same proof that worked for the calculus integral will work here. We know that, for any bounded function f on an interval (a, b), there is a uniformly continuous function on [a, b] whose derivative is f (x) at every point of continuity of f .

Exercise 474, page 125 Just kidding. But if some instructor has a need for such a text we could rewrite Chapters 2 and 3 without great difficulty to accommodate the more general integral. The discussion of countable sets in Chapter 4 moves to Chapter 1. The definition of the indefinite integral in Chapter 2 and the definite integral in Chapter 3 change to allow countable exceptional sets.

CHAPTER 5. ANSWERS

274

Most things can stay unchanged but one would have to try for better versions of many statements. Since this new integral is also merely a teaching integral we would need to strike some balance between finding the best version possible and simply presenting a workable theory that the students can eventually replace later on with the correct integration theory on the real line.

Exercise 475, page 125 We will leave the reader to search for an example of such a sequence. The exercise should leave you with the impression that the countable set version of the calculus integral is sufficiently general to integrate just about any example you could imagine creating. It is not hard to find a function that is not integrable by any reasonable method. But if it is possible (as this exercise demands) to write ∞

f (x) =

∑ gk (x)

k=1

and if

∞

∑

k=1

Z

b a

gk (x) dx

converges then, certainly, f should be integrable. Any method that fails to handle f is inadequate. With some work and luck you might consider the series ∞ ak ∑ p|x − r | k k=1

where ∑∞ k=1 ak converges and {rk } is an enumeration of the rationals in [0, 1]. This is routinely handled by modern methods of integration but the Riemann integral and these two weak versions of the calculus integral collapse with such an example.

Exercise 476, page 126 Start with a set N that contains a single element c and show that that set has measure zero according to the definition. Let ε > 0 and choose δ(c) = ε/2. Then if a subpartition {([ci , di ], c) : i = 1, 2}

is given so that then

0 < di − ci < δ(c) (i = 1, 2) 2

∑ (di − ci) < ε/2 + ε/2 = ε.

i=1

Note that we have used only two elements in the subpartition since we cannot have more intervals in a subpartition with one associated point c. Now consider a set N = {c1 , c2 , c3 , . . . , cn } that contains a finite number of elements. We show that that set has measure zero according to the definition. Let ε > 0 and choose δ(ci ) = ε/(2n) for each i = 1, 2, 3, . . . , n. Use the same argument but now with a few more items to keep track of.

5.1. ANSWERS TO PROBLEMS

275

Exercise 477, page 126 Now consider a countable set N = {c1 , c2 , c3 , . . . , } that contains an finite number of elements. We show that that set has measure zero according to the definition. Let ε > 0 and choose δ(ck ) = ε2−k−1 for each i = 1, 2, 3, . . . , n. Use the same argument as in the preceding exercise but now with a quite a few more items to keep track of. Suppose that we now have a subpartition {([ci , di ], ξi ) : i = 1, 2, . . . , n}

with each ξi = ck ∈ N for some k, and so that Then to estimate the sum

0 < di − ci < δ(ξi ) (i = 1, 2, . . . , n). n

∑ (di − ci)

i=1

just check the possibilities where ([ci , di ], ξi ) = ([ci , di ], ck ) for some k. Each of these adds no more than 2ε2−k−1 to the value of the sum. But ∞

∑ ε2−k = ε.

k=1

Exercise 478, page 126 Prove this by contradiction. If an interval [a, b] does indeed have measure zero then, for any ε > 0, and every point ξ ∈ [a, b] we should be able to find a δ(ξ) > 0 with the following property: whenever a subpartition {([ci , di ], ξi ) : i = 1, 2, . . . , n}

is given with each ξi ∈ [a, b] and so that then

0 < di − ci < δ(ξi ) (i = 1, 2, . . . , n) n

∑ (di − ci) < ε.

i=1

By the Cousin covering argument there is indeed such a partition {([ci , di ], ξi ) : i = 1, 2, . . . , n}

with this property that is itself a full partition of the interval [a, b]. For that partition n

∑ (di − ci ) = b − a.

i=1

This is impossible.

Exercise 479, page 126 Too easy for a hint.

CHAPTER 5. ANSWERS

276

Exercise 480, page 126 We know that subsets of sets of measure zero have themselves measure zero. Thus if N1 and N2 are the two sets of measure zero, write N1 ∪ N2 = N1 ∪ [N2 \ N1 ].

The sets on the right are disjoint sets of measure zero. So it is enough if we prove the statement, assuming always that the two sets are disjoint and have measure zero. Let ε > 0. To every point ξ ∈ N1 or ξ ∈ N2 , there is a δ(ξ) > 0 with the following property: whenever a subpartition {([ci , di ], ξi ) : i = 1, 2, . . . , n}

is given with each ξi ∈ N1 or else with ξi ∈ N2 and so that then

0 < di − ci < δ(ξi ) (i = 1, 2, . . . , n) n

∑ (di − ci ) < ε/2.

i=1

Together that means that whenever a subpartition {([ci , di ], ξi ) : i = 1, 2, . . . , n}

is given with each ξi ∈ N1 ∪ N2 and so that then

0 < di − ci < δ(ξi ) (i = 1, 2, . . . , n) n

∑ (di − ci) < ε/2 + ε/2 = ε,

i=1

since we can easily split the last sum into two parts depending on whether the associated points ξi belong to N1 or belong to N2 .

Exercise 481, page 126 We repeat our argument for the two set case but taking a little extra care. We know that subsets of sets of measure zero have themselves measure zero. Thus if N1 , N2 , N3 , . . . is a sequence of sets of measure zero, write N1 ∪ N2 ∪ N3 · · · = N1 ∪ [N2 \ N1 ] ∪ (N3 \ [N1 ∪ N2 ]) ∪ . . . .

The sets on the right are disjoint sets of measure zero. So it is enough if we prove the statement, assuming always that the sets in the sequence are disjoint and have measure zero. Let ε > 0. To every point ξ ∈ Nk there is a δ(ξ) > 0 with the following property: whenever a subpartition {([ci , di ], ξi ) : i = 1, 2, . . . , n}

is given with each ξi ∈ Nk and so that then

0 < di − ci < δ(ξi ) (i = 1, 2, . . . , n) n

∑ (di − ci) < ε2−k .

i=1

5.1. ANSWERS TO PROBLEMS

277

Together that means that whenever a subpartition {([ci , di ], ξi ) : i = 1, 2, . . . , n}

is given with each ξi ∈ N1 ∪ N2 ∪ N3 ∪ . . . and so that then

0 < di − ci < δ(ξi ) (i = 1, 2, . . . , n) n

∑ (di − ci)
0. Since the series ∑∞ k=1 (bk − ak ) converges there must be an integer N such that ∞

∑ (bk − ak ) < ε.

k=N

Note that every point of E is contained in one of the intervals (ak , bk ) for k = N, N + 1, N + 2, . . . . For each x ∈ E select the first one of these intervals (ak , bk ) that contains x. Choose δ(x) < (bk − ak )/2. This defines δ(x) for all x in E. Whenever a subpartition {([ci , di ], ξi ) : i = 1, 2, . . . , n}

is given with each ξi ∈ E and so that

0 < di − ci < δ(ξi ) (i = 1, 2, . . . , n)

then note that the interval [ci , di ] belongs to one at least of the intervals (ak , bk ). Hence the sum n

∑ (di − ci)

i=1

can be split into a finite number of subsums each adding up to no more that (bk − ak ) for some k = N, N + 1, N + 2, . . . . . It follows that n

∑ (di − ci) < ε.

i=1

Exercise 483, page 128 From each of the four closed intervals that make up the set K2 remove the middle third open interval. This will lead to 2 3 1 , ∪ ∪ .... K3 = 0, 27 27 27 There should be eight intervals in all at this stage.

CHAPTER 5. ANSWERS

278

Exercise 485, page 128 S

First note that G is an open dense set in [0, 1]. Write G = ∞ k=1 (ak , bk ). (The component intervals (ak , bk ) of G can be called the intervals complementary to K in (0, 1). Each is a middle third of a component interval of some Kn .) Observe that no two of these component intervals can have a common endpoint. If, for example, bm = an , then this point would be an isolated point of K, and K has no isolated points. Next observe that for each integer k the points ak and bk are points of K. But there are other points of K as well. In fact, we shall see presently that K is uncountable. These other points are all limit points of the endpoints of the complementary intervals. The set of endpoints is countable, but the closure of this set is uncountable as we shall see. Thus, in the sense of cardinality, “most” points of the Cantor set are not endpoints of intervals complementary to K. Show that the remaining set K = [0, 1] \ G is closed and nowhere dense in [0,1]. Show that K has no isolated points and is nonempty. Show that K is a nonempty, nowhere dense perfect subset of [0,1]. Now let G=

∞ [

Gn

n=1

and let K = [0, 1] \ G =

∞ \

Kn .

n=1

Then G is open and the set K (our Cantor set) is closed. To see that K is nowhere dense, it is enough, since K is closed, to show that K contains no open intervals. Let J be an open interval in [0, 1] and let λ be its length. Choose a natural number n such that 1/3n < λ. By property 5, each component of Kn has length 1/3n < λ, and by property 2 the components of Kn are pairwise disjoint. T Thus Kn cannot contain J, so neither can K = ∞ 1 Kn . We have shown that the closed set K contains no intervals and is therefore nowhere dense. It remains to show that K has no isolated points. Let x0 ∈ K. We show that x0 is a limit point of K. To do this we show that for every ε > 0 there exists x1 ∈ K such that 0 < |x1 − x0 | < ε. Choose n such that 1/3n < ε. There is a component L of Kn that contains x0 . This component is a closed interval of length 1/3n < ε. The set Kn+1 ∩ L has two components L0 and L1 , each of which contains points of K. The point x0 is in one of the components, say L0 . Let x1 be any point of K ∩ L1 . Then 0 < |x0 − x1 | < ε. This verifies that x0 is a limit point of K. Thus K has no isolated points.

Exercise 486, page 128 Each component interval of the set Gn has length 1/3n ; thus the sum of the lengths of these component intervals is 2n−1 1 2 n = . 3n 2 3

5.1. ANSWERS TO PROBLEMS

279

It follows that the lengths of all component intervals of G forms a geometric series with sum ∞ 1 2 n = 1. ∑ n=1 2 3 (This also gives us a clue as to why K cannot contain an interval: After removing from the unit interval a sequence of pairwise disjoint intervals with length-sum one, no room exists for any intervals in the set K that remains.)

Exercise 487, page 128 Here is a hint that you can use to make into a proof. Let E be the set of all points in the Cantor set that are not endpoints of a complementary interval. Then the Cantor set is the union of E and a countable set. If E has measure zero, so too has the Cantor set. Let ε > 0 and choose N so large that N 1 2 n > 1 − ε. ∑ n=1 2 3 i.e., so that

∞

1 ∑ 2 n=N+1

n 2 < ε. 3

Here is how to define a δ(ξ) for every point in the set E. Just make sure that δ(ξ) is small enough that the open interval (ξ − δ(ξ), ξ, +δ(ξ)) does not contain any of the open intervals complementary to the Cantor set that are counted in the sum N 1 2 n > 1 − ε. ∑ n=1 2 3 Now check the definition to see that E satisfies the required condition to check that it is a set of measure zero. Using this δ guarantees that the intervals you will sum do not meet these open intervals that we have decided make up most of [0, 1] (i.e., all but ε).

Exercise 491, page 129 This exercise shows that there is a purely arithmetical construction for the Cantor set. You will need some familiarity with ternary (base 3) arithmetic here. Each x ∈ [0, 1] can be expressed in base 3 as x = .a1 a2 a3 . . . ,

where ai = 0, 1 or 2, i = 1, 2, 3, . . . . Certain points have two representations, one ending with a string of zeros, the other in a string of twos. For example, .1000 · · · = .0222 . . . both represent the number 1/3 (base ten). Now, if x ∈ (1/3, 2/3), a1 = 1, thus each x ∈ G1 must have ‘1’ in the first position of its ternary expansion. Similarly, if 7 8 1 2 , , ∪ , x ∈ G2 = 9 9 9 9 it must have a 1 in the second position of its ternary expansion (i.e., a2 = 1). In general, S each point in Gn must have an = 1. It follows that every point of G = ∞ 1 Gn must have

CHAPTER 5. ANSWERS

280

a 1 someplace in its ternary expansion. Now endpoints of intervals complementary to K have two representations, one of which involves no 1’s. The remaining points of K never fall in the middle third of a component of one of the sets Kn , and so have ternary expansions of the form x = .a1 a2 . . .

ai = 0 or 2.

We can therefore describe K arithmetically as the set {x = .a1 a2 a3 . . . t (base three) : ai = 0thereexists or 2 for each i ∈ N}.

Exercise 492, page 129 In fact, K can be put into 1-1 correspondence with [0,1]: For each x = .a1 a2 a3 . . . (base 3), ai = 0, 2, in the set K, let there correspond the number y = .b1 b2 b3 . . . (base 2), bi = ai /2. This provides a 1-1 correspondence between K (minus endpoints of complementary intervals) and [0, 1] (minus the countable set of numbers with two base 2 representations). By allowing these two countable sets to correspond to each other, we obtain a 1-1 correspondence between K and [0, 1].

Exercise 493, page 129 “When I was a freshman, a graduate student showed me the Cantor set, and remarked that although there were supposed to be points in the set other than the endpoints, he had never been able to find any. I regret to say that it was several years before I found any for myself.”

Ralph P. Boas, Jr, from Lion Hunting & Other Mathematical Pursuits (1995). It is clear that there must be many irrational numbers in the Cantor ternary set, since that set is uncountable and the rationals are countable. Your job is to find just one.

Exercise 496, page 129 This is certainly true for some open sets, but not for all open sets. Consider G = (0, 1) \C where C is the Cantor ternary set. The closure of G is all of the interval [0, 1] so that G and its closure do not differ by a countable set and contain many more points than the endpoints as the student falsely claims.

Exercise 500, page 131 See Donald R. Chalice, "A Characterization of the Cantor Function." Amer. Math. Monthly 98, 255–258, 1991 for a proof of the more difficult direction here, namely

5.1. ANSWERS TO PROBLEMS

281

Figure 5.6: The Cantor function. that the only monotone, nondecreasing function on [0, 1] that has these three properties is the Cantor function. Figure 5.6 should be of assistance is seeing that each of the three properties holds. To verify them use the characterization of the function in the preceding exercise.

Exercise 501, page 132 There is nothing to prove. Write the two definitions and observe that they are identical.

Exercise 502, page 133 There is immediate. If the definition holds for the larger set then it holds without change for the smaller set.

Exercise 503, page 133 Let ε > 0. Then for every x ∈ E1 there is a δ1 (x) > 0 n

∑ |F(bi ) − F(ai )| < ε/2

i=1

whenever a subpartition {([ai , bi ], ξi ) : i = 1, 2, . . . , n} is chosen for which ξi ∈ E1 ∩ [ai , bi ] and bi − ai < δ(ξi ).

Similarly for every x ∈ E2 there is a δ2 (x) > 0 n

∑ |F(bi ) − F(ai )| < ε

i=1

whenever a subpartition {([ai , bi ], ξi ) : i = 1, 2, . . . , n} is chosen for which ξi ∈ E2 ∩ [ai , bi ] and bi − ai < δ(ξi ).

Take δ(x) in such a way, that if a point x happens to belong to both sets then δ(x) is the minimum of δ1 (x) and δ2 (x). For points that are not in both take δ(x) either δ1 (x)

CHAPTER 5. ANSWERS

282 or δ2 (x). Whenever a subpartition {([ai , bi ], ξi ) : i = 1, 2, . . . , n}

is chosen for which the sum

ξi ∈ (E1 ∪ E2 ) ∩ [ai , bi ] and bi − ai < δ(ξi ) n

∑ |F(bi ) − F(ai )|

i=1

splits into two parts, depending on whether the ξi are in the first set E1 or the second set E2 . It follows that n

∑ |F(bi ) − F(ai )| < ε/2 + ε/2.

i=1

We have given all the details here since the next exercise requires the same logic but rather more detail.

Exercise 504, page 133 We can simplify the argument by supposing, without loss of generality, that the sets are S disjoint. This can be arranged by using subsets of the E j so that the union E = ∞j=1 E j is the same. Let ε > 0 and let j = 1, 2, 3, . . . . Then for every x ∈ E j there is a δ j (x) > 0 n

∑ |F(bi ) − F(ai )| < ε2− j

i=1

whenever a subpartition {([ai , bi ], ξi ) : i = 1, 2, . . . , n} is chosen for which ξi ∈ E j ∩ [ai , bi ] and bi − ai < δ j (ξi ).

Simply define δ(x) = δ j (x) if x ∈ E j . Whenever a subpartition {([ai , bi ], ξi ) : i = 1, 2, . . . , n}

is chosen for which the sum

ξi ∈ E ∩ [ai , bi ] and bi − ai < δ(ξi ) n

∑ |F(bi ) − F(ai )|

i=1

splits into finitely many parts, depending on whether the ξi are in the first set E1 , or the second set E2 , or the third set E3 , etc. It follows that n

∑ |F(bi ) − F(ai )|
0 so that

ε(v − u) b−a for all 0 < v − u < δ(x) for which u ≤ x ≤ v. Then just check the inequality works since, if ξi ∈ E ∩ [ai , bi ] and bi − ai < δ(ξi ), |F(v) − F(u) − f (x)(v − u)|
0 such that n

∑ |F(bi ) − F(ai )| < ε

i=1

whenever a subpartition {([ai , bi ], ξi ) : i = 1, 2, . . . , n} is chosen for which ξi ∈ (a, b) ∩ [ai , bi ] and bi − ai < δ(ξi ).

Consider any interval [c, d] ⊂ (a, b). By the Cousin covering lemma there is a partition of the whole interval [c, d], {([ai , bi ], ξi ) : i = 1, 2, . . . , n}, for which Consequently

ξi ∈ [ai , bi ] and bi − ai < δ(ξi ). n

n

i=1

i=1

|F(d) − F(c)| = | ∑ F(bi ) − F(ai )| ≤ ∑ |F(bi ) − F(ai )| < ε. This is true for any such interval and all positive ε. This is only possible if F is constant on (a, b).

Exercise 510, page 134 We have already checked that the Cantor function has zero on the set complementary to the Cantor set in [0, 1]. This is because the Cantor function is constant on all of the component intervals. If the Cantor function also had zero variation on the Cantor set then we could conclude that it has zero variation on the entire interval [0, 1]. It would have to be constant.

Exercise 512, page 134 Just mimic (and simplify) the proof for Exercise 507.

Exercise 513, page 134 This exercise is a generalization of Exercise 321. Essentially the same method will work here, although you should find that it is easier to prove the generalization.

Exercise 534, page 141 If F is differentiable at all points of [a, b] this is certainly a true statement. If we allow exceptional points then the hypotheses have to be adjusted. Assume F is uniformly continuous on [a, b] and differentiable at all but a countable set of points. Then this statement is true. Assume F is Lipschitz on [a, b] and differentiable at all but a set of points of measure zero. Then this statement is true. Remarkably enough this is true without assuming any differentiability. Lipschitz functions are always differentiable at all but a set of points of measure zero. But that observation belongs in a more advanced course than this.

5.1. ANSWERS TO PROBLEMS

285

There are more conditions that you can assume to guarantee that Z b a

F ′ (x) dx = F(b) − F(a).

Exercise 537, page 143 Exercises 505, 506, and 507 contain all the pieces required for a very easy proof. Make sure to write Z bi

ai

f (x) dx = F(bi ) − F(ai )

using the indefinite integral F and to observe that only the first inequality of the theorem need be proved, since the second one follows immediately from the first.

Exercise 538, page 144 The proof is an exercise in derivatives taking care to handle the sets of measure zero. Use F and G for the indefinite integrals of f and g. Let N0 be the set of points x in (a, b) where f (x) ≤ g(x) might fail. Suppose that F ′ (x) = f (x) except on a set N1 with N1 measure zero and such that F has zero variation on N1 . Suppose that G′ (x) = g(x) except on a set N2 with N2 measure zero and such that F has zero variation on N2 . Then H = G − F has H ′ (x) = g(x) − f (x) ≥ except on the set N0 ∪ N1 ∪ N2 . This set is measure zero and, since F and G are absolutely continuous inside the interval, so too is H. The proof then rests on the following fact which you should prove: If H is uniformly continuous on [a, b], absolutely continuous inside the interval, and if d H(x) ≥ 0 dx for all points x in (a, b) except possibly points of a set of measure zero then H(x) must be nondecreasing on [a, b]. Finally then H(a) ≤ H(b) shows that F(a) − F(b) ≤ G(b) − G(a) and hence that Z b a

f (x) dx ≤

Z b

g(x) dx.

a

Exercise 539, page 144 Study the proof for Exercise 538 and just use those techniques here.

Exercise 540, page 144 In preparation . . .

CHAPTER 5. ANSWERS

286

Exercise 541, page 145 Here is a version that is not particularly ambitious and is easy to prove. It is also sufficiently useful for most calculus classes. Suppose that F and G are uniformly continuous on [a, b] and that each function is differentiable except at a countable number of points. Then the function F(x)G′ (x) + F ′ (x)G(x) is integrable on [a, b] and Z b F(x)G′ (x) + F ′ (x)G(x) dx = F(b)G(b) − F(a)G(b). a

In particular F(x)G′ (x) is integrable on [a, b] if and only if F ′ (x)G(x) is integrable on [a, b]. In the event that either is integrable then the formula Z b a

F(x)G′ (x) dx = F(b)G(b) − F(a)G(b) −

Z b a

F ′ (x)G(x) dx

must hold. To prove it, just check that H(x) = F(x)G(x) is uniformly continuous on [a, b] and has a derivative at all but a countable number of points equal to the function F(x)G′ (x)+ F ′ (x)G(x). But you can do better.

Exercise 542, page 145 Here are a number of versions that you might prove. Suppose G is uniformly continuous on [a, b], and that F is uniformly continuous on an interval [c, d] that includes every value of G(x) for a ≤ x ≤ b. Suppose that each function is differentiable except at a countable number of points. Suppose that, for each a ≤ x ≤ b the set G−1 (G(x)) = {t ∈ [a, b] : G(t) = G(x)}

is at most countable. Then the function F ′ (G(x))G′ (x) is integrable on [a, b] and Z b F ′ (G(x))G′ (x) dx = F(G(b)) − F(G(a)). a

To prove it, just check that H(x) = F((G(x)) is uniformly continuous on [a, b] and has a derivative at all but a countable number of points equal to the function F ′ (G(x))G′ (x). Again you can do better. Try working with F and G as Lipschitz functions. Or take F everywhere differentiable and G as Lipschitz.

Exercise 543, page 146 There were an infinite number of points in the interval [0, 1] at which we could not claim that d F(G(x)) = F ′ (G(x))G′ (x). dx But that set is countable and countable sets are no trouble to us now. So this function is integrable and the formula is valid.

Exercise 544, page 146 The proofs in Section 3.8.1 can be repeated with hardly any alterations. This is because both the calculus integral and the integral of this chapter can be given a pointwise

5.1. ANSWERS TO PROBLEMS

287

approximation by Riemann sums. Just read through the proof and observe that the same arguments apply in this setting.

Exercise 546, page 147 The proofs in Section 3.8.1 can be repeated with hardly any alterations. This is because both the calculus integral and the integral of this chapter can be given a pointwise approximation by Riemann sums.

Exercise 547, page 147 Any constant function F(x) = C will be, by definition, an indefinite integral for f .

Exercise 548, page 147 Any function F(x) that is an indefinite integral for f will satisfy F(d) − F(c) = 0 for all a ≤ c < d ≤ b. Thus F is constant and 0 = F ′ (x) = f (x) for all x in the interval except possibly at points of a measure zero set.

Exercise 549, page 147 Any function F(x) that is an indefinite integral for f will be monotonic, nondecreasing and satisfy F(b) − F(a) = 0. Thus F is constant and 0 = F ′ (x) = f (x) for all x in the interval except possibly at points of a measure zero set.

Exercise 550, page 150 You should be able to prove each of these statements: • A linear combination of Riemann integrable functions is Riemann integrable. • A product of finitely many Riemann integrable functions is Riemann integrable. • A uniform limit of a sequence of Riemann integrable functions is Riemann integrable.

Index Abel test for uniform convergence, 90, 105, 261 absolute continuity in Vitali’s sense, 138 absolute continuity, 135 absolutely continuous function, 135 absolutely convergent integral, 52, 142 absolutely integrable, 79 absolutely integrable function, 83 accumulation point of view, 107 algebraic number, 123 applications of the integral, 106 approximation by Riemann sums, 142 area, 107 arithmetic Construction of Cantor set, 129 avoiding the mean-value theorem, 30

theorem, 122 Cantor dust, 127 Cantor function, 130 Cantor function not absolutely continuous, 136 Cantor set, 129, 130 Cantor ternary set, 127 Cantor, G., 122, 130 careless student, 34, 88, 102, 129 Cauchy criterion for uniform convergence, 89 Cauchy criterion for series, 90 Cauchy mean-value theorem, 29 Cauchy-Schwarz inequality, 120, 268 change of a function, 132 change of variable, 44, 60, 145 backwards integral, 52 characteristic function of the rationals, 14 Bers, Lipman, 188 classicalrealanalysis.com, 2 Bliss theorem, 74 closed interval, 3 Bliss, G. A., 74 closed set, 4 Bolzano-Weierstrass argument, 21, 22 comparison test for variations, 83, 134 Bolzano-Weierstrass compactness argument, complementary intervals of Cantor set, 128 18 connected set, 9, 10 bounded derived numbers, 34, 193 constant functions, 31 bounded function, 20 constant of integration, 35 bounded interval, 3 construction of Cantor function, 130 bounded set, 4 continuity and zero variation, 134 bounded variation, 80, 113 convergence boundedness properties of uniformly conuniform convergence, 89 tinuous functions, 19 convergent integral, 52, 142 converges infinitely slowly, 244 calculus integral, 50 converse to the mean-value theorem, 30 calculus integral is a nonabsolute integral, countable number of discontinuities, 125 80 countable set, 122 calculus student notation, 43, 54, 60, 61, countable set has measure zero, 126 102 counterexamples, 85 Cantor Cousin covering argument, 18 set, 127, 129 curve, 112 288

INDEX Darboux property, 22 Darboux property of continuous functions, 22 Darboux property of derivative, 31 Darboux, J. G., 22, 31 definite integral, 50, 140 definite vs. indefinite integrals, 54 degenerate interval, 154 Denjoy-Perron integral, 75 derivative, 24 Darboux property of, 31 not the limit of derivatives, 86 derivative of the definite integral, 61, 146 derivative of the indefinite integral, 45 derivative of the limit is not the limit of the derivative, 86 derivative of the variation, 82 derivatives and uniform convergence, 95 devil’s staircase, 130 differentiable function is absolutely continuous, 136 differentiable functions are integrable, 55 differentiable implies continuous, 25 differentiation rules, 26 Dirichelet integral, 64, 217 discontinuities of monotonic functions, 122 discontinuity of a limit of continuous functions, 85 discontinuous limit of continuous functions, 85 distance between a point and a set, 14 dummy variable, 53 dx, 53 enumeration, 122 epsilon, delta version of derivative, 24 error estimate for Riemann sums, 68 error estimate for Simpson’s rule, 116 error estimate for trapezoidal rule, 115 exact computation by Riemann sums, 66 existence of indefinite integrals, 40 finite derived numbers, 137 first mean-value theorem for integrals, 63 fixed point, 23

289 Flett’s theorem, 31 Flett, T., 189 formula for the length of a curve, 113 Freiling, Chris, 40 function, 10 bounded variation, 80 Cantor function, 130 distance between set and point, 14 fixed point of, 23 Lipschitz, 33 smooth, 29 step function, 14 function of Cantor, 130 function with zero variation, 132 growth of a function, 132 Heaviside’s function, 14 Heine-Borel argument, 18 Heine-Borel property, 22 Henstock property, 75 Henstock’s criterion, 142 Henstock, Ralph, 75, 142 Henstock-Kurzweil integral, 148 Heuer, Gerald A. , 236 inadequate theory of integration, 151 indefinite integral, 35 indefinite integrals and bounded variation, 83 inequalities, 143 inequality Cauchy-Schwarz, 268 inequality properties of the integral, 58 infinitely slowly, 244 integrable [calculus sense], 50 integrable function, 140 integral not the limit of the integral, 86 integral inequalities, 61 integral of the limit is not the limit of the integrals, 86 integration by parts, 43, 59, 145 integration by substitution, 44, 60, 145 interchange of limit operations, 86 intermediate value property, 22 of derivative, 31

INDEX

290 intersection of two closed intervals, 4 intersection of two open intervals, 4 intervals, 2 irrational number in the Cantor ternary set, 129 Israel Halperin, 189 Java Applets, 267 Jordan decomposition, 82 Koliha, J. J., 239 Kurweil, Jaroslav, 75 Kurzweil-Henstock integral, 148 L’Hôpital’s rule, 30 law of the mean, 27 least upper bound argument, 21 least upper bound property, 22 Lebesgue integrable, 138 Lebesgue integral, 138 length of a curve, 112 length of a graph, 112 Levi, Beppo, 271 limit interchange of limit operations, 86 limitations of the calculus integral, 62, 125 limits of Discontinuous Derivatives, 97 linear combination, 13, 43 linear combinations, 59, 144 Lipschitz condition, 33 function, 33 Lipschitz condition, 21 Lipschitz function, 41, 82, 136 Lipschitz function is absolutely continuous, 136 locally bounded function, 20 locally of bounded variation, 83 locally strictly increasing function, 25 logarithm, 120 logarithm function, 87 Luzin, N. N., 243

maximum, 21 McShane, E. J., 78 mean-value theorem, 26, 27 second-order, 29 mean-value theorem for integrals, 63 mean-value theorem of Cauchy, 29 meaner mean-value theorem, 239 measure zero, 125 method of exhaustion, 107 monotone convergence theorem, 97, 98 no interval has measure zero, 126 nondifferentiable, 24 notation for indefinite integral, 38 null function, 147 numerical methods, 114 open interval, 3 open set, 4 oscillation of a function, 15 parametric curve, 112 parametric equations, 112 partial fractions, 46 partition, 66 piecewise monotone, 31 pointwise approximation by Riemann sums, 75, 142 pointwise continuous, 12 products of integrable functions, 57 properties of the definite integral, 58 properties of the integral, 143 properties of the total variation, 81 rectifiable, 112 review, 1 Riemann integrable function, 150 Riemann integral, 150 Riemann sums, 66, 114, 142 Riesz, Frigyes, 138 Rolle’s Theorem, 26 Rozema, Edward , 266

second mean-value theorem for integrals, M-test, 90 63 manipulations with indefinite integrals, 42 second mean-value theorem for the calcuMaple, 47, 118, 119 lus integral, 119 mapping definition of continuity, 15

INDEX second order mean-value theorem, 29 sequence definition of continuity, 15 sequence of functions uniformly bounded, 92 sequence of functions uniformly Cauchy, 89 sequence of functions uniformly convergent, 89 sequences and series of integrals, 84 sequences of functions of bounded variation, 83 sequences of sets of measure zero, 126 set Cantor set, 127, 130 Cantor ternary set, 127, 129 countable, 122 sets of measure zero, 125 Simpson’s rule, 116 sin integral function, 64, 217 smooth function, 29 step function, 14 step functions, 41 step functions are integrable, 55 straddled version of derivative, 25 subintervals, 58, 144 subpartition, 66, 126, 132 summing inside the integral, 98 Szökefalvi-Nagy, Bela, 138 tables of integrals, 48 TBB, 2, 149 ternary representation of Cantor set, 129 theorem mean-value theorem, 27 of Cantor, 122 theorem of G. A. Bliss, 74 total variation function, 80 transcendental number, 123 trapezoidal rule, 115 trigonometric functions, 11 Trillia Group, 124 unbounded interval, 3 unbounded limit of bounded functions, 85 uniform Approximation by Riemann sums, 69

291 uniform convergence, 89 Abel’s test, 90, 261 Cauchy criterion, 89 Weierstrass M-test, 90 uniform convergence and derivatives, 95 uniform convergence of continuous functions, 94 uniformly approximating the variation, 82 uniformly bounded family of functions, 92 uniformly Cauchy, 89 uniformly continuous, 11 unstraddled Riemann sums, 79 unstraddled version of derivative, 25 upper function, 40 vanishing derivatives, 31 vanishing derivatives with a few exceptions, 32 Vitali definition of absolute continuity, 138 Vitali, G., 138 Weierstrass M-test, 90 why the finite exceptional set?, 62 Young, Grace Chisholm, 271 Young, William Henry, 271 Zakon, Elias, 37 Zakon, Elias , 124 zero derivatives imply zero variation, 134 zero variation, 132 zero variation lemma, 134