Real Analysis: A Constructive Approach Through Interval Arithmetic 1470451441, 9781470451448

Real Analysis: A Constructive Approach Through Interval Arithmetic presents a careful treatment of calculus and its theo

Author: Mark Bridger

Table of contents :
Contents
0. Preliminaries
0.1 The Natural Numbers
0.2 The Rationals
1. The Real Numbers And Completeness
1.0 Introduction
1.1 Interval Arithmetic
1.2 Families of Intersecting Intervals
1.3 Fine Families
1.4 Definition of the Reals
1.5 Real Number Arithmetic
1.6 Rational Approximations
1.7 Real Intervals and Completeness
1.8 Limits and Limiting Families
Appendix: The Goldbach Number and Trichotomy
2. An Inverse Function Theorem and Its Application
2.0 Introduction
2.1 Functions and Inverses
2.2 An Inverse Function Theorem
2.3 The Exponential Function
2.4 Natural Logs and the Euler Number
3. Limits, Sequences And Series
3.1 Sequences and Convergence
3.2 Limits of Functions
3.3 Series of Numbers
Appendix I: Some Properties of Exp and Log
Appendix II: Rearrangements of Series
4. Uniform Continuity
4.1 Definitions and Elementary Properties
4.2 Limits and Extensions
Appendix I: Are There Non-Continuous Functions?
Appendix II: Continuity of Double-Sided Inverses
Appendix III: The Goldbach Function
5. The Riemann Integral
5.1 Definition and Existence
5.2 Elementary Properties
5.3 Extensions and Improper Integrals
6. Differentiation
6.1 Definitions and Basic Properties
6.2 The Arithmetic of Differentiability
6.3 Two Important Theorems
6.4 Derivative Tools
6.5 Integral Tools
7. Sequences And Series Of Functions
7.1 Sequences of Functions
7.2 Integrals and Derivatives of Sequences
7.3 Power Series
7.4 Taylor Series
7.5 The Periodic Functions
Appendix: Raabe’s Test and Binomial Series
8. The Complex Numbers And Fourier Series
8.0 Introduction
8.1 The Complex Numbers
8.2 Complex Functions and Vectors
8.3 Fourier Series Theory
Index

The Sally Series
Pure and Applied Undergraduate Texts • 38

Real Analysis: A Constructive Approach Through Interval Arithmetic

Mark Bridger

EDITORIAL COMMITTEE: Gerald B. Folland (Chair), Steven J. Miller, Jamie Pommersheim, Serge Tabachnikov

Originally published in 2007 by John Wiley & Sons, Inc. 2010 Mathematics Subject Classification. Primary 03F60, 03F55, 97-01, 97Ixx, 26-01, 26Axx.

For additional information and updates on this book, visit www.ams.org/bookpages/amstext-38

Library of Congress Cataloging-in-Publication Data Names: Bridger, Mark, 1942- author. Title: Real analysis : a constructive approach through interval arithmetic / Mark Bridger. Description: Providence, Rhode Island : American Mathematical Society, [2019] | Series: AMS pure and applied undergraduate texts ; volume 38 | Includes bibliographical references and index. Identifiers: LCCN 2019006280 | ISBN 9781470451448 (alk. paper) Subjects: LCSH: Mathematical analysis. | Continuity. | Differentiable functions. | Interval analysis (Mathematics) | AMS: Mathematical logic and foundations – Proof theory and constructive mathematics – Constructive and recursive analysis. msc | Mathematical logic and foundations – Proof theory and constructive mathematics – Intuitionistic mathematics. msc | Mathematics education – Instructional exposition (textbooks, tutorial papers, etc.). msc | Mathematics education – Analysis – Analysis. msc | Real functions – Instructional exposition (textbooks, tutorial papers, etc.). msc | Real functions – Functions of one variable – Functions of one variable. msc Classification: LCC QA300 .B68925 2019 | DDC 515–dc23 LC record available at https://lccn.loc.gov/2019006280

Copying and reprinting. Individual readers of this publication, and nonprofit libraries acting for them, are permitted to make fair use of the material, such as to copy select pages for use in teaching or research. Permission is granted to quote brief passages from this publication in reviews, provided the customary acknowledgment of the source is given. Republication, systematic copying, or multiple reproduction of any material in this publication is permitted only under license from the American Mathematical Society. Requests for permission to reuse portions of AMS publication content are handled by the Copyright Clearance Center. For more information, please visit www.ams.org/publications/pubpermissions. Send requests for translation rights and licensed reprints to [email protected].

© 2019 by the author. All rights reserved. Reprinted with corrections. Printed in the United States of America.

The paper used in this book is acid-free and falls within the guidelines established to ensure permanence and durability. Visit the AMS home page at https://www.ams.org/

PREFACE

What is a constructive approach, and why should one take it? If you look at the table of contents for this book, you’ll see mostly familiar topics, but with a slightly different emphasis. There’s a long chapter on the real numbers, followed by one on “An Inverse Function Theorem.” The chapter on limits, sequences and series is followed by one on uniform continuity. Why not just pointwise continuity? A chapter on the Riemann integral is followed by one on differentiation, but it’s actually uniform differentiation.

All of these departures from the structure of the usual real analysis text result from a careful reassessment of the role of the course in the technical education of undergraduates. Not every student in Real Analysis is a math major, and, in many schools, only a small percentage of math majors intend to do graduate work in mathematics. A modern course is populated by a wide range of students. Some are headed for careers in secondary education, while there is often a large contingent from the physical sciences and an even larger group from computer science. These students are in the course because they need or want more than a cookbook calculus course. Some need to know more about computability and calculability of floating-point numbers, hence more about the actual nature of the reals. They also need to know about continuity because they need to know about approximations; some need to know about convergence and improper integrals because they need to know about computing special functions and transforms.

But real analysis is not primarily focused on computing. It is, significantly, a course that shapes the way students think about mathematics. Very often it is a student’s introduction to precise reasoning and writing.

So: I begin with a careful construction of the real numbers, the field on which most of analysis is played. The approach here, due to Gabriel Stolzenberg, is via intervals of rational numbers and the arithmetic of such intervals.
The many elementary theorems about the properties of this arithmetic later reappear as properties of the real numbers, and verifying them provides a gentle introduction to the art and practice of devising and writing readable and correct proofs. Furthermore, there is a useful metaphor: a rational interval is exactly what is obtained when a scientist uses instruments of limited (but known) accuracy to measure something. Families of rational intervals then correspond to multiple measurements, and the condition on a family that any two of its intervals must meet establishes the consistency of its measurements. Finally, a real number is defined to be a family of rational intervals that is consistent in this sense, and that contains intervals of arbitrarily small length. Interval arithmetic, carried over to families of intervals, now becomes real arithmetic, and conditions on the lengths of intervals become the properties of approximation of reals by rationals.

At this point, the students see that the reals have a far more complex structure than the rationals. One important example is the traditional Law of Trichotomy, namely that precisely one of $x < y$, $y < x$, or $x = y$ must hold. This property holds for the rationals, since rational arithmetic is basically integer arithmetic. However, reals can, in general, only be approximated by rationals. Modern computer algebra systems allow the user to specify a tolerance, which is expressed as the number of decimal places. This number can be chosen as large as one pleases, but not infinitely large. The corresponding tolerance, say $\varepsilon$, tells us how closely we can distinguish reals using the computer’s rational representation. These considerations lead to the formulation of real number comparison that we prove and use throughout the book.

$\varepsilon$-Trichotomy: Given any tolerance $\varepsilon > 0$, then for any reals $x$ and $y$, $x < y$, $y < x$, or $x$ and $y$ are within $\varepsilon$ of each other.

Thus, a construction of the reals based on rational measurement and an analysis of what we can actually calculate produces a concordance of theory and practice that students of the sciences easily relate to.

Using the notion of $\varepsilon$-trichotomy as a tool for comparing real numbers enables us to describe a bisection-like algorithm for finding the inverse of a function $f$, providing it satisfies upper and lower bounds on its difference quotient $\frac{f(y) - f(x)}{y - x}$. This leads directly to the construction of $n$th root, exponential, and logarithm functions.

Another hallmark of the constructivist program is its emphasis on uniform vs. pointwise continuity:
• $f$ is pointwise continuous at $a$ if, given any $\varepsilon > 0$ we can find a $\delta_a(\varepsilon) > 0$ such that $|f(x) - f(a)| < \varepsilon$ whenever $|x - a| < \delta_a(\varepsilon)$.
• $f$ is uniformly continuous on $S$ if, given any $\varepsilon > 0$ we can find a $\delta(\varepsilon) > 0$ such that $|f(x) - f(y)| < \varepsilon$ whenever $|x - y| < \delta(\varepsilon)$ and $x, y \in S$.

Uniform continuity on $S$ implies pointwise continuity at each point of $S$, but the converse is not true: there is no general procedure for constructing a single $\delta$ from the infinitely many $\delta_a$. Not only is uniform continuity a stronger notion, it is the more desirable version of continuity since it is the one most useful in studying convergence and integrability.
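As a small illustration of the uniform definition, the following Python sketch checks a single modulus $\delta(\varepsilon)$ for the function $f(x) = x^2$ on $[0, b]$. The function, the interval, and the names `delta_uniform` and `check_modulus` are assumptions chosen for this example, not taken from the text; the grid test is a sanity check on sample points, not a proof.

```python
from fractions import Fraction

def delta_uniform(eps, b):
    """One modulus that serves every point of [0, b] for f(x) = x*x.

    On [0, b] we have |x*x - y*y| = |x - y| * |x + y| <= 2*b*|x - y|,
    so |x - y| < eps / (2*b) forces |x*x - y*y| < eps.
    """
    return eps / (2 * b)

def check_modulus(delta, eps, b, steps=200):
    """Test a candidate delta on every pair of points of a rational grid in [0, b]."""
    grid = [b * Fraction(i, steps) for i in range(steps + 1)]
    for x in grid:
        for y in grid:
            if abs(x - y) < delta and not abs(x * x - y * y) < eps:
                return False  # found a pair that is close but whose images are not
    return True

eps = Fraction(1, 10)
b = Fraction(3)
print(check_modulus(delta_uniform(eps, b), eps, b))  # the derived modulus passes
print(check_modulus(Fraction(1, 30), eps, b))        # a delta twice as large fails near x = b
```

The point of the design is that `delta_uniform` takes no argument for the location of $x$ or $y$: one value works across the whole interval, which is exactly what distinguishes it from a pointwise $\delta_a$.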
It turns out that the usual proofs that the basic functions of analysis are pointwise continuous also prove that these functions are uniformly continuous on appropriate intervals. We exploit this fact from the very beginning and only use the stronger and more important uniform version of continuity.

We take a similar approach to differentiability. Instead of talking about the derivative of a function at a point, we talk about the derivative function on an interval. As with uniform continuity, this notion of uniform differentiability is the one that is of most importance in later theory and applications. In fact, it is an approach that generalizes readily to vector-valued functions of several variables. An important consequence of using uniform notions is that they produce transparent proofs of important theorems such as the existence of the Riemann integral and the Fundamental Theorem of Calculus.

The pointwise versions of continuity and differentiability do lead to a number of classical examples of functions which are or aren’t continuous or differentiable on various dense or nowhere dense subsets of intervals. Since we are emphasizing uniform notions, these examples are relegated to discussions in an appendix and a few exercises, which can be covered at the discretion of the instructor.

In summary, then, this is neither a text in numerical analysis nor one intended solely to prepare students to be professional mathematicians. It is a thoroughly rigorous modern account of the theoretical underpinnings of calculus; and, being constructive in nature, every proof of every result is direct and ultimately computationally verifiable (at least in principle). In particular, existence is never established by showing that the assumption of non-existence leads to a contradiction. By looking through the index or table of contents, you’ll see that nothing of importance for undergraduates has been left off or compromised by our approach. The payoff of the constructive approach, however, is that it makes sense, not just to math majors, but to students from all branches of the sciences.

ACKNOWLEDGMENTS

About fifteen years ago, Gabe Stolzenberg lent me a copy of notes he had created and used to teach undergraduate seminars and directed studies courses in real analysis. I found Gabe’s approach so elegant, and the material so appealing pedagogically, that I signed up to teach the analysis course, which had just become required for our majors. Over the years, with his help and encouragement, I worked this material into a text suited to the particular mix of students who take this course here at Northeastern.

Many of the mathematical ideas in the current form of this text were adapted directly from Gabe’s notes, especially the following: all of the material on the construction of the reals via rational interval arithmetic, the Inverse Function Theorem and its beautiful proof, the properties of exponential functions, and the creation of the Riemann integral. The use of the uniform notions of continuity and differentiability was also part of the “constructive mindset” that Gabe introduced me to and which I have tried to employ throughout the remainder of the book. As a mathematician specializing in homological algebra, working with this new perspective was like writing a second thesis, with Gabe as advisor and friend.

Because Gabe’s creative interests have taken him in other directions, this text could not be a joint authorship. He has continued to refine and expand his exposition of constructive mathematics, some of which can be found on his website. I am quite indebted to him: many of the good things in this book are due, directly or indirectly, to Gabe, and whatever is not so good is solely my responsibility.

Professor Joseph Alper read the entire manuscript and offered many extremely helpful suggestions and I am most indebted to him for this effort. Professor Robert Seeley very generously clarified a number of mathematical issues for me, especially those relating to Fourier analysis, as did my longtime office-mate Professor John Frampton.
My wife, Maxine Bridger, not only proofread a lot of mathematics, she also kept me honest by countering my constructivist mindset with many a healthy platonist riposte.

Susanne Steitz (editor) and Anna Pierrehumbert (copyeditor) of John Wiley & Sons helped prepare the first edition of this book. Professor Stephen Kennedy skillfully shepherded this second edition through the publishing process at the American Mathematical Society.

Finally, this entire manuscript was prepared using Scientific Workplace, a product of MacKichan Software. The technical support people there were extremely helpful, patient, and generous with their time.

Mark Bridger
Newton, MA

INTRODUCTION (MOSTLY FOR INSTRUCTORS)

Formally, Real Analysis (the course) is a presentation of the theoretical underpinnings of calculus. It is about the Big Three: continuity, differentiability, and convergence. Yet it is also, for many, an introduction to reading, writing, and thinking mathematics. I have tried to address all of these issues in this book.

In the first chapter we construct the real numbers, starting with the rationals. This lays the groundwork for the entire book. The basic concept here is that of a family of rational intervals. A real number is a family of rational intervals which satisfies two important conditions: consistency (any two intervals in the family intersect) and fineness (the family contains intervals of arbitrarily small length). These conditions, together with the arithmetic that families inherit from the rationals, lead to all of the familiar algebraic properties of the reals. We establish these properties via quite a few propositions and one main theorem (completeness). Proving these results requires
• knowing simple properties of the arithmetic of rational numbers,
• applying elementary algebra and simple logic, and
• learning to apply new definitions and newly proved results.

Thus, Chapter 1 is critical because it provides not only the mathematical ideas that permeate the rest of the text, but also the introduction to the reasoning and writing skills necessary for doing and communicating mathematics.

Nothing is more boring than having to read a seemingly endless theorem-proof sequence, so I have tried to provide just enough sample proofs and hints so that readers can proceed on their own. Many propositions are left as exercises; indeed, the exercises provide a vital part of the whole pedagogical process. I take this chapter at a leisurely pace, allowing students to write, critique, and rewrite their work. It is an investment of time well worth making early in the course.
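The two conditions on a family of rational intervals, and the tolerance-based comparison from the Preface, can be tried out in a few lines of Python. This is only a sketch under stated assumptions: intervals are modeled as pairs `(lo, hi)` of `Fraction` values, a real number is modeled as a function that returns an interval shorter than a requested tolerance, and every name below (`intersects`, `is_consistent`, `sqrt2_interval`, `rational`, `eps_trichotomy`) is hypothetical rather than the book's notation.

```python
from fractions import Fraction
from itertools import combinations

def intersects(i, j):
    """Closed rational intervals (lo, hi) meet unless one lies wholly to one side."""
    return i[0] <= j[1] and j[0] <= i[1]

def is_consistent(family):
    """Consistency: any two intervals in the (finite sample of a) family intersect."""
    return all(intersects(i, j) for i, j in combinations(family, 2))

def has_interval_shorter_than(family, eps):
    """Fineness, tested at one tolerance: some interval has length < eps."""
    return any(hi - lo < eps for lo, hi in family)

def sqrt2_interval(eps):
    """A rational interval of length < eps with lo*lo <= 2 < hi*hi."""
    lo, hi = Fraction(1), Fraction(2)
    while hi - lo >= eps:
        mid = (lo + hi) / 2
        if mid * mid <= 2:
            lo = mid
        else:
            hi = mid
    return lo, hi

def rational(q):
    """View the rational q as a real: the degenerate interval [q, q] at every tolerance."""
    return lambda eps: (q, q)

def eps_trichotomy(x, y, eps):
    """Decide x < y, x > y, or 'within eps', using intervals of length < eps/2."""
    a, b = x(eps / 2), y(eps / 2)
    if a[1] < b[0]:
        return "<"
    if b[1] < a[0]:
        return ">"
    # The two intervals overlap, so their union has length < eps.
    return "within eps"

family = [sqrt2_interval(Fraction(1, 10 ** k)) for k in range(1, 6)]
print(is_consistent(family), has_interval_shorter_than(family, Fraction(1, 1000)))
print(eps_trichotomy(sqrt2_interval, rational(Fraction(14142, 10000)), Fraction(1, 100)))
print(eps_trichotomy(sqrt2_interval, rational(Fraction(14142, 10000)), Fraction(1, 10 ** 6)))
```

At the coarse tolerance the comparison cannot separate the square root of 2 from 1.4142; at the finer one it can. All comparisons inside the code are between rationals, where ordinary trichotomy is available.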
Chapter 1 ends with what may be the central result of any real analysis course: the completeness of the reals. This is expressed in terms of families of real intervals, but in Chapter 3 it is rephrased in the language of Cauchy sequences.

Chapter 2 uses the Completeness Theorem to prove the useful Inverse Function Theorem. This, in turn, is used to construct $n$th roots, general exponential functions, and logarithms. A section is devoted to the Euler number $e$ and the natural logarithm.

Chapter 3 introduces sequences, limits, and series and derives basic formulas and inequalities for the various functions already constructed.
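To give a feel for how an inverse such as an $n$th root can be pinned down by rational intervals, here is a minimal Python sketch using ordinary bisection. It is not the book's Inverse Function Theorem or its algorithm: it assumes the function is increasing and sends rationals to rationals, so that every comparison is an exact comparison of rationals, and the name `inverse_interval` is hypothetical.

```python
from fractions import Fraction

def inverse_interval(f, target, lo, hi, eps):
    """Bracket the solution of f(x) = target by a rational interval of length < eps.

    Assumes f is increasing on [lo, hi], maps rationals to rationals,
    and satisfies f(lo) <= target <= f(hi).
    """
    if not f(lo) <= target <= f(hi):
        raise ValueError("target is not bracketed by f(lo) and f(hi)")
    while hi - lo >= eps:
        mid = (lo + hi) / 2
        if f(mid) <= target:
            lo = mid   # the solution lies in [mid, hi]
        else:
            hi = mid   # the solution lies in [lo, mid]
    return lo, hi

# Cube root of 2, bracketed to within 1/1000.
lo, hi = inverse_interval(lambda x: x ** 3, Fraction(2), Fraction(0), Fraction(2), Fraction(1, 1000))
print(float(lo), float(hi))
```

Calling the function with smaller and smaller tolerances yields a family of rational intervals of arbitrarily small length, any two of which contain the same root and therefore intersect.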

In Chapter 4 we encounter uniform continuity. Since this version of continuity is the one most used in more advanced courses, we relegate the idea of pointwise continuity to the exercises. Nothing is lost, however, since the usual verifications of pointwise continuity for the basic functions of calculus are used with little modification to establish uniform continuity of these functions on intervals. We also encounter many interesting and important consequences of uniform continuity, among them boundedness and the extension of uniformly continuous functions from dense subsets, for example, extending functions from a punctured interval $[a, b] \setminus \{x_1, \ldots, x_n\}$ to the closed interval $[a, b]$.

In Chapter 5, we use the Completeness Theorem again, this time to construct the Riemann integral $\int_a^b f$ for functions uniformly continuous on an interval $[a, b]$. The results previously established for limits and extensions of uniformly continuous functions can now be applied to define and calculate improper integrals. It is here that we introduce the important idea of functions defined as integrals. This includes the definition of the arctangent as an integral, an alternate definition of the natural logarithm (previously defined as an inverse function), and the use of improper integrals to construct the Gamma function and Laplace transforms.

Chapter 6 on differentiation emphasizes the derivative as a function rather than a pointwise limit. All the usual formulas from calculus are derived. In particular, the uniform version of differentiability that we use makes for very short and illuminating proofs of two central results of calculus:
• The Law of Bounded Change, which says that bounds for the derivative (i.e. $A \le f'(x) \le B$) are bounds for the difference quotient (i.e. $A \le \frac{f(y) - f(x)}{y - x} \le B$). (This is sometimes called the “Mean Value Inequality.”)
• The Fundamental Theorem of Calculus.

In this chapter, we also derive some rather more difficult results on differentiating under the integral sign.
In the case of improper integrals, we introduce “dominated convergence” assumptions, which we will also use later in studying series of functions.

In Chapter 7, nearly all of the ideas developed in the course are applied to studying the properties of sequences and series of continuous and differentiable functions. The particular case of power series is given special attention. The chapter ends with the definition of the periodic (trigonometric) functions as power series and a derivation of their properties (including a definition of $\pi$), all without pictures. My students invariably enjoy this; in fact, with just a few simplifications and detours, it has even worked well for high school students taking AP calculus.

The last chapter of the book is organized around Fourier series, but it also provides an introduction to some of the more advanced ideas in functional analysis: inner products of functions, the Bessel and Cauchy-Schwarz inequalities and their applications, kernels and convolutions, and Abel summability. The early sections also introduce the complex numbers and the properties of complex-valued functions of a real variable.

There is enough material in the eight chapters to give a full-year course, especially if a lot of the more challenging exercises are assigned and discussed in class. Some of the exercises which have several parts and require more extensive work are labeled “projects.”

I have usually given Real Analysis as a one-semester course. I generally get to cover the following.
1. Chapter 1: sections 1.0 through 1.7 (omitting 1.8 and skimming some of the material on absolute value and betweenness).
2. Chapter 2: in which I skip the more technical results, especially the 1- and 2-sided versions of the Inverse Function Theorem and some of the inequalities relating to the Euler number $e$.
3. Chapter 3: just what I need to talk about convergence of series.
4. Chapter 4: section 4.1 and the beginning of section 4.2 (omitting extensions of continuous functions), some material on limits from Chapter 3.
5. Chapter 5: sections 5.1 and 5.2.
6. Chapter 6: sections 6.1 through 6.3.
7. Chapter 7: just the material on power series.

Having done this for one semester, if there is enough student interest in a second semester, or a student wants to do a reading course, I can cover the more technical topics such as improper integrals, general convergence of sequences of functions, complex numbers, and Fourier series.

After teaching Real Analysis for many years, I’d say that my general experience has been that there is no general experience. Student ability, background, and motivation can vary a lot from year to year, and I think it is a mistake to commit to a strict syllabus before you know your class. What is critical is that students do lots of problems and write lots of proofs. It is also very important that the central definitions and examples be memorized. I give several quizzes devoted exclusively to this. On the other hand, the more difficult material (proofs) is best tested via problem sets.
Students seem to do these best, and enjoy them more, when working with one or two others. (But I do require independent write-ups!)

In terms of submitting mathematical work, most students initially write it out by hand. Since I typically require rewrites, many soon learn to use an equation editor with their word-processor. The software package Scientific Notebook is a good alternative, especially if you can get your school to underwrite its purchase. I have even had a few ambitious students learn to use TeX or LaTeX.

It is important to remember that this is an undergraduate course and that most students taking it are probably not intending to go to graduate school in theoretical mathematics. The goal here is to have students understand the mathematics, be able to create some on their own, and come away with happy memories of the experience. There is also plenty of challenging material here, especially in the problems, for the talented and highly motivated student.

The approach I have taken in this book has worked well over the years for me and my students. I hope it does for you and yours as well.

0. PRELIMINARIES

0.1 The Natural Numbers

You have to begin somewhere. We begin with the whole numbers: $0, 1, 2, \ldots$ and assume that we know what they are and that they have all the basic properties we know and love. Here are some of them:
1. (Commutative laws) $m + n = n + m$, $mn = nm$
2. (Associative laws) $k + (m + n) = (k + m) + n$, $k(mn) = (km)n$
3. (Distributive law) $k \cdot (m + n) = k \cdot m + k \cdot n$
4. (Additive identity) $m + 0 = m$
5. (Multiplicative identity) $m \cdot 1 = m$
6. (Cancellation)
   (a) If $m + k = n + k$, then $m = n$.
   (b) If $m \cdot k = n \cdot k$ and $k \neq 0$, then $m = n$.
7. (Inequalities)
   (a) $m < n$ if and only if there is a non-zero whole number $k$ with $m + k = n$.
   (b) $m \le n$ if and only if $m > n$ is false.
   (c) For any $k$, $m + k < n + k$ if and only if $m < n$ (same for $\le$).
   (d) For any $k \neq 0$, $m \cdot k < n \cdot k$ if and only if $m < n$ (same for $\le$).
   (e) (Trichotomy) For any $m$, $n$, either $m < n$, $n < m$, or $m = n$.

There are, of course, many more such properties, but we will not attempt to list them all, nor will we try to prove any of them. Attempts have been made to derive the whole numbers and their properties solely from the “laws of logic” or from certain axioms for set theory, but we will not go down that path. In fact, it is not even clear that such an approach is worthwhile, since the existence and properties of whole numbers are arguably as basic and intuitive as the laws of logic or set theory (perhaps even more so).

We will denote the collection or set of whole numbers (including 0) by $\mathbb{N}$, standing for natural numbers.

Notation 0.1.1 $n \in \mathbb{N}$ means that $n$ is a natural number.

One of the most useful properties of $\mathbb{N}$ is the following, which we have put in a box because of its importance.

Principle of Mathematical Induction Suppose that $S$ is a collection or set of natural numbers with the properties: (a) $S$ contains the number 0, and (b) whenever $S$ contains the number $n$ it also contains $n + 1$. Then $S$ is actually all of $\mathbb{N}$.

There are several alternative and equivalent versions of this principle; the version you use depends on the nature of the result you want to prove.

Variation 1: Suppose $S$ contains $k$, and whenever $S$ contains $n$ it also contains $n + 1$; then $S$ contains all natural numbers $n \ge k$.

Variation 2: Suppose $S$ contains 0, and whenever $S$ contains all the numbers from 0 through $n$ it also contains $n + 1$; then $S = \mathbb{N}$.

Mathematical induction is often compared to the behavior of dominos. The dominos are stood up on edge close to each other in a long row. When one is knocked over, it hits the next one (analogous to $n$ in $S$ implies $n + 1$ in $S$), which in turn hits the next, etc. If then we hit the first (0 in $S$), then they will all eventually fall ($S$ is all of $\mathbb{N}$). In Variation 1 above, we start by knocking over the $k$th domino, so that it and all subsequent ones eventually fall. Here is an example of how a proof by induction works.

Example 0.1.2 Prove that for any $n \ge 1$, the sum of the first $n$ odd numbers is $n^2$.

Proof. We use Variation 1 above with $k = 1$. We first verify the claim when $n = 1$: the first odd number is 1 and the first square is $1^2 = 1$, so the claim holds in this case. Now we make the so-called “induction assumption” (or “induction hypothesis”), namely that the claim is true for some $n \ge 1$; so we have

\[ 1 + 3 + 5 + \cdots + \overbrace{(2n - 1)}^{n\text{th odd number}} = n^2. \]

The idea is to use this to prove that the claim is true for the next number, $n + 1$. So, starting with this equation, let’s add the next odd number, $2n + 1$, to both sides:

\[ 1 + 3 + 5 + \cdots + (2n - 1) + (2n + 1) = n^2 + (2n + 1). \]

The left-hand side is the sum of the first $n + 1$ odd numbers, while the right-hand side is, of course, equal to $(n + 1)^2$. Thus, whenever the claim is true for a natural number $n \ge 1$ it is also true for $n + 1$. All the dominos starting from $n = 1$ fall, and our proof is complete.
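As a quick sanity check on Example 0.1.2 (and in no way a substitute for the induction proof), the following Python sketch compares the sum of the first few odd numbers with the corresponding square. The function name `sum_first_odds` is our own illustrative choice, not notation from the text.

```python
def sum_first_odds(count: int) -> int:
    """Return 1 + 3 + 5 + ... + (2*count - 1), the sum of the first `count` odd numbers."""
    return sum(2 * k - 1 for k in range(1, count + 1))


# Check the claim of Example 0.1.2 for the first fifty cases.
for count in range(1, 51):
    assert sum_first_odds(count) == count ** 2

# The induction step in miniature: adding the next odd number, 2*count + 1,
# to count**2 gives (count + 1)**2.
count = 7
assert sum_first_odds(count) + (2 * count + 1) == (count + 1) ** 2
```

A finite check like this can only catch a false claim; it is the induction step that covers every case at once.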


WARNING: After stating the induction assumption you might be tempted to write $1 + 3 + 5 + \cdots + (2n + 1) = (n + 1)^2$, in an attempt to display what it is that must now be proved. DON’T DO IT! You may be tempted to use it to prove itself. Always proceed from what you know, never from what you want to know. If you must work backwards, do it on scrap paper, but not in the final write-up.

A careful derivation of the arithmetic properties of the natural numbers $\mathbb{N}$, using induction, was done by G. Peano (1858–1932). It is a lot of fun, but we will not pursue any of it here. Once we have the natural numbers $\mathbb{N}$, it is a relatively easy and straightforward step to conceive of, or construct, the integers $\mathbb{Z}$ by adding on the “negatives” of the naturals. This gives us the collection consisting of $0, \pm 1, \pm 2, \ldots$ on which we have to define the laws of addition and multiplication, as well as the inequalities $<$ and $\le$. There is a clever way of doing this which avoids dealing with a lot of the special cases that the straightforward approach entails. We sketch this in the exercises at the end of the chapter.

Notation 0.1.3 $n \in \mathbb{Z}$ means that $n$ is an integer.

So, we can now turn to the fractions.

0.2 The Rationals

The natural numbers 1, 2, 3, ... are used for counting discrete objects. In fact, the idea of counting is based on an assumption of discreteness. If a quantity to be measured is not a whole number of the units used to measure it, then the unit is subdivided into smaller units (yards into feet into inches into tenth-inches, etc.) and combinations of units are used (e.g. 2 feet 10 inches plus 3 tenth-inches). It is unclear when the idea that a single number could represent such a measurement was first thought of, although the Babylonians, with their uniquely advanced positional notation, may have achieved this realization.

If one magnitude $M_1$ was used to measure another $M_2$ and the first didn’t go into the second evenly, the school of Euclid referred to the relationship not as a number but as a ratio (this term itself being undefined). If a common unit could be chosen that measured each magnitude evenly, say $M_1 = a$ units and $M_2 = b$ units, then the ratio of magnitudes would be equal to a ratio of whole numbers, $M_1 : M_2 = a : b$. $M_1$ and $M_2$ were called commensurable in this case. Thus, a ratio of commensurables was basically an ordered pair of whole numbers, and various laws were given for dealing with them; for example, $ad : bd = a : b$. From Renaissance times this ordered pair $a : b$ has been denoted with a slash: $a/b$, and we now refer to it as a fraction or rational number instead of just a ratio. Historically, these ratios of whole numbers were accepted long before even negative whole numbers were used.


Definition 0.2.1 (Rational numbers) A rational number is an ordered pair $a/b$, where $a$ and $b$ are integers and $b \ne 0$.

1. $a/b = c/d$ means $ad = cb$.

2. $a/b < c/d$ means
\[ \begin{cases} ad < cb & \text{if } bd > 0, \\ ad > cb & \text{if } bd < 0; \end{cases} \]
this is also written $c/d > a/b$.

In the definition above, we used the term “ordered pair.” This can be defined in the context of set theory, but we will not pursue that formality. Basically, an ordered pair of integers (in this case denoted $a/b$, but sometimes $(a, b)$) is simply a pair of integers where the order makes a difference. In other words, $a/b$ is not necessarily the same as $b/a$. In fact, in the definition we make an explicit condition for ordered pairs to be equal, namely $ad = cb$. This is an example of the power we have when we make a definition of something hitherto undefined.

Rationals are added, subtracted, multiplied and divided in the usual way. It is not difficult to show that equal rationals produce equal results under these arithmetic operations; for example, $2/4 = 1/2$ and $2/4 + 3/7 = 1/2 + 3/7$. Here is how the proof goes for addition. Suppose $a/b = a'/b'$ and $c/d = c'/d'$, so that $ab' = a'b$ and $cd' = c'd$. Now, using the familiar rule for adding fractions (which we take as our definition of rational number addition): $a/b + c/d = (ad + bc)/bd$ and $a'/b' + c'/d' = (a'd' + b'c')/b'd'$. Are the right-hand sides of these last two equations equal? Check that $(ad + bc)\,b'd' = (a'd' + b'c')\,bd$ by carefully multiplying out and using $ab' = a'b$ and $cd' = c'd$.

In the exercises we invite you to fill in the gaps in this quick exposition of the rational numbers. It is not essential to go through with all these details, but you might find it an enjoyable diversion, especially if you are not used to giving proofs in a non-geometric setting.

Notation 0.2.2 $r \in \mathbb{Q}$ means that $r$ is a rational number.

The integers are located or represented or embedded in the rationals by the association $a \leftrightarrow a/1$. When we refer to the rational integer $a$, for example, we will mean $a/1$.

A word about ordering: The integers are discrete in the sense that for any integer $n$ there is no integer between $n$ and $n + 1$. In other words, every integer has a “next” integer. This is not true of the rationals: for any two distinct rationals $r = a/b$ and $s = c/d$, the rational $(r + s)/2 = \frac{1}{2}(ad + cb)/bd$, for example, lies strictly between $r$ and $s$. In fact, there are infinitely many others in between $r$ and $s$ as well. However, even though the rationals are infinitely close together, given two rationals it is always possible to tell exactly which of the conditions $r < s$, $r = s$ or $r > s$ holds. This is because an order relation on rationals amounts to verifying an order relation between the integers $ad$ and $cb$, and any two integers can be compared in a finite number of steps. The actual (practical) number of steps depends on how these integers are represented. In our base 10 positional notation, for example, two integers of roughly the same size $n$ require at most about $\log_{10} n$ comparisons of the digits $0, \ldots, 9$, starting from the left. We summarize this computational fact in the following important result.


Theorem 0.2.3 (Trichotomy for the rationals) For any two rational numbers $r$ and $s$, exactly one of the following conditions can be determined: $r < s$, $r = s$, $r > s$.

Suppose that we can rule out the case $r > s$. By Trichotomy we must have either $r < s$ or $r = s$. This defines what we might call the “weak” ordering of $r$ and $s$, as opposed to the strong ordering $r < s$.

Definition 0.2.4 $r \le s$ means that $r > s$ is false, i.e. $r \le s$ means that either $r < s$ or $r = s$. This is also written $s \ge r$.

Corollary 0.2.5
\[ a/b \le c/d \iff \begin{cases} ad \le cb & \text{if } bd > 0, \\ ad \ge cb & \text{if } bd < 0. \end{cases} \]

Sometimes it is difficult to prove that $r \le s$ but easier to prove that $r$ is less than or equal to any number bigger than $s$. That this proves $r \le s$ is often extremely useful, so we state it for rationals here, and will demonstrate it for real numbers later. We have called it the “Wiggle Lemma” because the presence of the $\varepsilon$ gives us some “wiggle room,” or leeway, in establishing the inequality.

Lemma 0.2.6 (Wiggle Lemma for Rationals) If $r$ and $s$ are rationals, and $r \le s + \varepsilon$ for every rational $\varepsilon > 0$, then $r \le s$.

Proof. Let us assume $r \le s + \varepsilon$ for all rational $\varepsilon > 0$; we will show that $r > s$ is impossible. If $r > s$, let $\varepsilon = \frac{r - s}{2} > 0$. We have
\[ r = \tfrac{1}{2}(r + r) > \tfrac{1}{2}(r + s) = s + \frac{r - s}{2} = s + \varepsilon, \]
which contradicts our assumption. So $r > s$ is false.

Remark 0.2.7 Although the proof ends by establishing a contradiction, it is not an indirect proof. The statement $r \le s$ is defined to mean that “$r > s$” is false, i.e. produces a contradiction.

The following special case occurs frequently in practice.

Corollary 0.2.8 Suppose $r$ and $s$ are rationals and $r < s + \varepsilon$ for every rational $\varepsilon > 0$. Then $r \le s$.

Finally, we recall that $|r|$, the absolute value of $r$, is defined to be $r$ if $r \ge 0$ and $-r$ otherwise. Absolute value has many interesting properties, many of which we will deal with later when we state them for real numbers. One of the most important is the following.

Proposition 0.2.9 (Triangle Inequality for Rationals) $|a + b| \le |a| + |b|$. (For a proof, see the exercises.)
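The computational content of Trichotomy can be made concrete. The sketch below is only an illustration under our own naming (`compare` and `wiggle_witness` are hypothetical helpers, not part of the text): it decides the order of two fractions using integer cross-multiplication alone, as in Definition 0.2.1, and it produces the $\varepsilon$ used in the proof of the Wiggle Lemma.

```python
from fractions import Fraction


def compare(a: int, b: int, c: int, d: int) -> int:
    """Compare a/b with c/d using only integer arithmetic.

    Returns -1, 0 or 1 according to whether a/b is less than, equal to,
    or greater than c/d. Both denominators must be nonzero.
    """
    if b == 0 or d == 0:
        raise ValueError("denominators must be nonzero")
    left, right = a * d, c * b
    if b * d < 0:
        # A negative product of denominators reverses the inequality.
        left, right = right, left
    return (left > right) - (left < right)


def wiggle_witness(r: Fraction, s: Fraction) -> Fraction:
    """Given r > s, return a rational eps > 0 with r > s + eps."""
    if not r > s:
        raise ValueError("requires r > s")
    return (r - s) / 2


# Cross-check against Python's built-in rational type on a small grid.
for a in range(-4, 5):
    for b in (-3, -1, 2, 5):
        for c in range(-4, 5):
            for d in (-2, 1, 7):
                r, s = Fraction(a, b), Fraction(c, d)
                assert compare(a, b, c, d) == (r > s) - (r < s)
```

Exactly one of the three return values occurs for any input, and it is reached after finitely many integer operations, which is the point of Theorem 0.2.3.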


Exercises

1. (Project) Here is an outline for constructing the integers $\mathbb{Z}$ from the natural numbers $\mathbb{N}$. The idea is to let an integer be an ordered pair $\langle m, n \rangle$ of natural numbers $m$ and $n$. This is a “formal” or abstract definition, but intuitively you should think of this as the integer $m - n$ (even though subtraction has not actually been defined for natural numbers). Since this is a definition of a new “object”, we are free to define when these objects are to be equal. Intuitively, if $\langle m, n \rangle$ is $m - n$, and $\langle m', n' \rangle$ is $m' - n'$, then equality would give $m - n = m' - n'$. Since we have to build on just what we know for natural numbers, we write this equation as $m + n' = m' + n$ and make our definition:

Definition For integers, $\langle m, n \rangle = \langle m', n' \rangle$ means $m + n' = m' + n$.

We can similarly define addition of integers: $\langle m, n \rangle + \langle m', n' \rangle = \langle m + m', n + n' \rangle$ (since $(m - n) + (m' - n') = (m + m') - (n + n')$). Multiplication is a little trickier: $\langle m, n \rangle \cdot \langle m', n' \rangle = \langle mm' + nn', mn' + m'n \rangle$ (can you see why?). Now, if $a = \langle m, n \rangle$, define $-a$ to be $\langle n, m \rangle$. Here are some things to prove:

(a) The commutative and distributive laws for addition and multiplication.
(b) If $a$, $b$ and $c$ are integers and $a = b$, then $a + c = b + c$.
(c) $a + (-a) = 0$.
(d) $(-a)b = a(-b) = -(ab)$.
(e) If $a + c = b + c$ then $a = b$ (hint: add $-c$ to both sides and use some of the results above).
(f) $-(-a) = a$.

You may also want to define the inequalities $\le$ and $<$ and prove some of the properties for inequalities listed in the text for the natural numbers. If you do this, you can also define $|m|$ and prove that $|m|\,|n| = |mn|$.

Note that the usual natural numbers can be considered integers by representing the natural number $m$ by the integer $\langle m, 0 \rangle$; thus, for example, 1 is represented by $\langle 1, 0 \rangle$ (which, incidentally, is equal, as an integer, to $\langle 2, 1 \rangle$ or $\langle 12, 11 \rangle$, etc.). Now you can show that $1 \cdot a = a$, for example, by noting that $\langle 1, 0 \rangle \cdot \langle m, n \rangle = \langle m, n \rangle$.

2. (Project) In the text we have defined rational numbers as ordered pairs $a/b$ of integers ($b \ne 0$) with the equality relationship (for rationals) that $a/b = c/d$ means $ad = bc$. We also defined $a/b < c/d$ (see 0.2.1).

Definition For rationals $r = a/b$ and $s = c/d$:

• (Sums and Negatives) $r + s = \dfrac{ad + bc}{bd}$, $\; -r = \dfrac{-a}{b} = \dfrac{a}{-b}$.
• (Products and Reciprocals) $rs = \dfrac{ac}{bd}$, and if $r \ne 0$, then $1/r = \dfrac{b}{a}$.
• (Differences and Quotients) $r - s = r + (-s)$, and if $s \ne 0$, then $r/s = r(1/s)$.


Here are some things to prove:

(a) The commutative and distributive laws for addition and multiplication.
(b) If $r$, $s$, and $t$ are rationals, and $r = s$, then $rt = st$. (We proved a similar thing for addition of rationals in the text.)
(c) $r + (-r) = 0$, and if $r \ne 0$ then $r\,(1/r) = 1$.
(d) $(-r)s = r(-s) = -(rs)$.
(e) If $r + t = s + t$ then $r = s$. If $rt = st$ and $t \ne 0$, then $r = s$.
(f) $-(-r) = r$, and if $r \ne 0$, $1/(1/r) = r$.
(g) If $r > s$ and $t > 0$, then $rt > st$. If $t < 0$, then $rt < st$.
(h) Define $\le$ for rationals and discuss its properties. Also, for $r = m/n$, define $|r| = |m/n| = |m| / |n|$. Prove that $|rs| = |r|\,|s|$.

3. Prove the Triangle Inequality, $|a + b| \le |a| + |b|$, for integers. The simplest way is to separate into the cases $a$ and $b$ both positive, both negative, and one positive, one negative. (You may assume that $m - n \le m + n$ for natural numbers.)

4. Prove the Triangle Inequality $\left| \dfrac{a}{b} + \dfrac{c}{d} \right| \le \left| \dfrac{a}{b} \right| + \left| \dfrac{c}{d} \right|$ for rationals by using the fact that it’s true for integers (see previous exercise). You may assume that $|rs| = |r|\,|s|$ for rationals $r$ and $s$. Hint: Consider multiplying or dividing through by $|bd|$.

The remaining exercises involve mathematical induction.

5. Proposition If $n > 1$, then $n^2 > n + 1$.

Here are two proofs of this proposition; one is a good one and the other is bad form. Explain.

Proof #1. This is clearly true for $n = 2$. Suppose (induction hypothesis) that it’s true for some number $n > 1$. We want to show $(n + 1)^2 > (n + 1) + 1$:
\[ n^2 + 2n + 1 > n + 2. \]
Now subtract $n + 1$ from both sides: $n^2 + n > 1$. This last statement is clearly true since $n$ is at least 2.

Proof #2. This is clearly true for $n = 2$. Suppose (induction hypothesis) that it’s true for some number $n > 1$. We then have $n^2 > n + 1$. Since we want an $(n + 1)^2$, add $2n + 1$ to both sides: $n^2 + 2n + 1 > 3n + 2$. Since $n > 0$, $2n > 0$ so, adding $n + 2$ to both sides yields $3n + 2 > n + 2$. Combining this with the above gives
\[ (n + 1)^2 = n^2 + 2n + 1 > 3n + 2 > n + 2 = (n + 1) + 1. \]
Thus, the statement is true for $n + 1$ and we are done.

6. Prove:

(a) $1^2 + 2^2 + 3^2 + \cdots + n^2 = \dfrac{n(n+1)(2n+1)}{6}$ when $n > 0$.
(b) If $n \ge 4$, $n! > 2^n$.
(c) $(1 + x)^n \ge 1 + nx$. (You must make an assumption about $x$. What is it?)
(d) For $0 \le u \le 1$ and integers $n > 0$, $(1 - u)^n \le 1 - u^n$. What about $n = 0$?
(e) For $0 \le u$ and integers $n > 0$, $1 - u^n \le n(1 - u)$.
(f) If $n > 4$, $2^n > n^2$. (By the way, will $100^n$ eventually be bigger than $n^{100}$?)

7. Prove that $x^n - y^n$ is divisible by $x - y$ (as polynomials). (Hint: When looking at $x^{k+1} - y^{k+1}$, subtract and add $y x^k$ in the middle.) Prove that $x^n + y^n$ is divisible by $x + y$ when $n$ is odd.

8. The Fibonacci sequence $f_1, f_2, f_3, \ldots$ is defined as follows:
\[ f_1 = 1, \qquad f_2 = 1, \qquad f_n = f_{n-1} + f_{n-2} \text{ for } n \ge 3. \]
Prove the amazing formula:
\[ f_n = \frac{\left( \dfrac{1 + \sqrt{5}}{2} \right)^{n} - \left( \dfrac{1 - \sqrt{5}}{2} \right)^{n}}{\sqrt{5}}. \]
(Hint: Let $\alpha = \frac{1 + \sqrt{5}}{2}$, $\beta = \frac{1 - \sqrt{5}}{2}$. Then it is helpful to note that $\alpha + 1 = \alpha^2$ and $\beta + 1 = \beta^2$. Also, use Variation 2 of mathematical induction; i.e. in the induction step, show that if the claim is true for all numbers less than $n$, it is true for $n$ as well.)

9. In studying the Mandelbrot set, one looks at a certain sequence of numbers $Z_0, Z_1, Z_2, \ldots$ which have the following properties:
\[ Z_0 = 0, \qquad |Z_1| = 2 + \varepsilon \text{ for a certain } \varepsilon > 0, \qquad |Z_{k+1}| \ge |Z_k|^2 - |Z_1| \text{ for all } k \ge 0. \]
Prove that, for all $N \ge 0$, $|Z_{N+1}| \ge 2 + N \cdot \varepsilon$.


10. Prove by induction that the product of $n$ consecutive integers is divisible by $n!$. (Use “double induction” to prove that $k(k + 1) \cdots (k + n - 1)$ is divisible by $n!$ by induction on $n$ and $k$. This is a tricky problem and you must proceed very carefully!)

11. Let’s make the ridiculous claim that any $n$ things are equal. Here is a proof by induction. When $n = 1$, then it is clearly true, since any thing is equal to itself. Now suppose that it’s true for some $n$ and consider any $n + 1$ things $x_1, x_2, \ldots, x_n, x_{n+1}$. The first $n$ of these are equal by induction assumption, and so are the last $n$:
\[ \overbrace{x_1, x_2, \ldots, x_n}^{\text{equal by induction}},\; x_{n+1} \qquad\qquad x_1,\; \underbrace{x_2, \ldots, x_n, x_{n+1}}_{\text{equal by induction}} \]
Because of the overlap on the $n - 1$ things $x_2, \ldots, x_n$ we have $x_1 = x_2 = \cdots = x_n = x_{n+1}$, so $n + 1$ things are always equal as well. What’s wrong?

1. THE REAL NUMBERS AND COMPLETENESS

1.0 Introduction

The real numbers are much more complicated than the integers or the rationals. A popular representation presents them as “infinite” decimals. Those infinite decimals that eventually repeat represent the rationals. Thus, for example, $\frac{3787}{9900}$ is represented as $0.3825252525\ldots = 0.38\overline{25}$, where the bar over the 25 indicates that it repeats. On the other hand, the infinite decimal representing $\sqrt{2}$, $1.414213562\ldots$, never repeats.

Although we can define the collection of reals in this way, there are problems, the biggest one being how to define the arithmetic operations. For example, in order to add two infinite decimals, we have to “start” somewhere, at some decimal place “on the right.” But then we have the problem of carries: some addition of digits to the right of where we start may require a carry that could affect the digits at or to the left of where we start. There is no simple way of dealing with this difficulty: any solution involves more and more complications. Furthermore, we lose the simplicity of rationals since even simple ones such as $3/17$ involve complicated, repeating, infinite decimals.

There have been several classical constructions of the reals that avoid these problems, the most famous ones being Dedekind Cuts and Cauchy Sequences, named respectively for the mathematicians Richard Dedekind (1831–1916) and Augustin Cauchy (1789–1857). We will not discuss these constructions here, but will use a more modern one developed by Gabriel Stolzenberg, based on “interval arithmetic.”

There are several advantages to this approach. The first is that we deal exclusively with rational numbers and their arithmetic (so we avoid the difficulties inherent in the infinite decimals approach). Secondly, as we discuss below, intervals of rational numbers are very much like scientific measurements, which, because of the fallibility of our instruments, are actually ranges of possible values. Finally, the analogy between interval arithmetic and scientific measurement allows us to apply theoretical results from the former to produce applications to the latter in the study of errors and error propagation.

So we begin with the study of rational intervals: $[r, s]$, where $r \le s$ are rational numbers. We define their arithmetic, i.e. how to add, subtract, multiply, and divide them (as well as a few other manipulations). We also define the length of $[r, s]$ to be $s - r$, providing a measure of the “accuracy” of a measurement represented by this interval.

Having done this, we turn to collections or families of rational intervals (representing a sequence or set of measurements). We define similar arithmetic operations on these families and then define two important properties that a family of intervals may have. The first of these, consistency, stipulates that any two intervals in the family intersect. The second, fineness, demands that the family contain intervals of arbitrarily small length. While this is seldom realizable in the real world, such a family can arise mathematically by taking partial sums of approximating series or other convergent numerical procedures. In fact, we will eventually define convergence using this idea.

Our principal definition is that a real number is a fine and consistent family of rational intervals. We will show that such families inherit the standard arithmetic operations (addition, multiplication, absolute value, etc.) from the rationals; indeed, we will prove all the usual properties of real numbers necessary to deal with algebraic equations and inequalities.

Finally, we end by proving the central property that distinguishes the reals from the rationals: completeness. This takes the form of proving that any fine and consistent family of real intervals contains a unique real number common to all the intervals. Applications of completeness will be found throughout the following chapters, especially in the constructions of $n$th roots, exponential functions and logs, limits of series and functions, and the Riemann integral.

1.1 Interval Arithmetic

When a quantity (say a charge, a weight, or a volume) is measured, the instruments and methods used generally have a known or estimable accuracy. A numerical reading, say $5.647$, usually comes with an error estimate, say $\pm 0.005$. Thus, the quantity being measured lies in the range $5.647 \pm 0.005$, so is bounded below by $5.642$ and above by $5.652$. We can represent this by the interval $[5.642, 5.652]$. Since instruments read out either digitally or by dials and scales, the possible measurements resulting from this experiment are rational numbers lying in this interval. We now make some formal definitions related to this idea.

Definition 1.1.1 $I$ is a rational interval means $I = [r, s]$, where $r$ and $s$ are rational numbers and $r \le s$. The intervals $[r, s]$ and $[u, v]$ are equal precisely when $r = u$ and $s = v$.

Definition 1.1.2 $x \in [r, s]$ means that $x$ is a rational number and $r \le x \le s$. The rational numbers $r$ and $s$ are called the endpoints of the interval $[r, s]$.

Please note that these definitions are fairly abstract. An interval is not defined to be a set but simply a symbol $[r, s]$. Unlike in set theory, we define the membership relation $\in$. If you don’t feel comfortable at this point with this degree of abstraction, that’s ok: no harm will result if you think of the interval $[r, s]$ as simply the set of rationals bounded by the endpoints $r$ and $s$.

Example 1.1.3 $[2/5, 1/2] = [6/15, 2/4]$ and $7/15 \in [2/5, 1/2]$, since $2/5 \le 7/15 \le 1/2$.
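Because an interval is determined entirely by its two rational endpoints, the definitions so far translate almost word for word into code. The following is a minimal sketch, assuming Python's `fractions.Fraction` for exact rationals; the names `Interval`, `make_interval` and `contains` are our own and do not come from the text.

```python
from fractions import Fraction
from typing import NamedTuple


class Interval(NamedTuple):
    """A rational interval [lo, hi], stored only through its two endpoints."""
    lo: Fraction
    hi: Fraction


def make_interval(r, s) -> Interval:
    """Build [r, s] after checking r <= s, as the definition of an interval requires."""
    r, s = Fraction(r), Fraction(s)
    if r > s:
        raise ValueError("an interval [r, s] needs r <= s")
    return Interval(r, s)


def contains(interval: Interval, x) -> bool:
    """Membership test: x belongs to [lo, hi] when lo <= x <= hi."""
    return interval.lo <= Fraction(x) <= interval.hi


# The measurement 5.647 +/- 0.005 as a rational interval.
reading, error = Fraction("5.647"), Fraction("0.005")
measurement = make_interval(reading - error, reading + error)

# Example 1.1.3: equal endpoints give equal intervals, however they are written.
same = make_interval(Fraction(2, 5), Fraction(1, 2)) == make_interval(Fraction(6, 15), Fraction(2, 4))
inside = contains(make_interval(Fraction(2, 5), Fraction(1, 2)), Fraction(7, 15))
```

Equality here is plain tuple equality on the endpoints, which matches the definition of equal intervals; nothing set-theoretic is involved.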


We will often use the following notation: $I = [r, s]$; $J = [u, v]$; $K = [p, q]$, where $r \le s$, $u \le v$, and $p \le q$ are rational numbers.

Definition 1.1.4 (Inclusion of Intervals) $[r, s] \subseteq [u, v]$ (or $J \supseteq I$) means that $u \le r$ and $s \le v$.

Note that, since intervals are determined solely in terms of their endpoints, this property has been defined solely in terms of endpoints. It’s usually a good idea to draw a picture to see intuitively what some condition means, so here is the picture for inclusion of intervals:

[Figure: the interval $[r, s]$ drawn inside $[u, v]$, with the endpoints in the order $u$, $r$, $s$, $v$.]

Here is a model for how proofs should go. It is a bit more wordy than it has to be, but that’s ok for now.

Proposition 1.1.5 $I \subseteq J$ and $J \subseteq K$ implies that $I \subseteq K$. (More concisely: $I \subseteq J$ and $J \subseteq K \implies I \subseteq K$.)

Proof. First we write the intervals in terms of their endpoints: $I = [r, s]$, $J = [u, v]$, and $K = [p, q]$. Now we go back to Definition 1.1.4 (and the above diagram) to unravel what our assumptions are saying:

$I \subseteq J$ means $u \le r$ and $s \le v$;
$J \subseteq K$ means $p \le u$ and $v \le q$.

We must apply the definition now to compare $I$ and $K$ using their endpoints. We can write $p \le u \le r$ and $s \le v \le q$. We conclude from these that $p \le r$ and $s \le q$ which, according to the definition, means $I \subseteq K$. This is what we wanted to prove.

The next two propositions assert that three statements “are equivalent.” This means that when any one of them is true, so are each of the others. Thus, if the three statements are (1), (2), and (3), then we are actually making six assertions: (1) $\implies$ (2), (1) $\implies$ (3), (2) $\implies$ (1), (2) $\implies$ (3), (3) $\implies$ (1) and (3) $\implies$ (2). However, you don’t have to prove six separate implications, because it suffices to prove only (1) $\implies$ (2), (2) $\implies$ (3), and (3) $\implies$ (1). If we know these three, then, for example, (2) $\implies$ (3) $\implies$ (1), so we get (2) $\implies$ (1). The full proofs are left as exercises, though we include a piece of one to show what is expected.

Proposition 1.1.6 The following are equivalent:
1. $I \subseteq J$
2. $x \in I \implies x \in J$


3. $J$ contains the endpoints of $I$.

Proof. We will prove just (1) $\implies$ (2). Suppose, then, that (1) holds, so we are given $I \subseteq J$. With our usual notation, this means that $u \le r$ and $s \le v$ (see diagram above also). We must use this to prove $x \in I \implies x \in J$, so assume that $x \in I$. By the definition of $\in$ above, this means that $r \le x \le s$. But, combining this with $u \le r$ and $s \le v$, we can write: $u \le r \le x \le s \le v$, so $u \le x \le v$, which means that $x \in J$.

Proposition 1.1.7 The following are equivalent:
1. $I = J$
2. $x \in I \iff x \in J$
3. $I \subseteq J$ and $J \subseteq I$

We now give the definitions of the fundamental arithmetic operations on intervals, also in terms of endpoints. Suppose, then, that $I = [r, s]$ and $J = [u, v]$.

Definition 1.1.8 (Addition of Intervals) $I + J$ is the interval $[r + u, s + v]$.

Definition 1.1.9 (Negation of Intervals) $-I$ is the interval $[-s, -r]$.

Definition 1.1.10 (Subtraction of Intervals) $J - I$ is the interval $J + (-I) = [u - s, v - r]$.

Notation 1.1.11 (The 0 Interval) We will occasionally denote the interval $[0, 0]$ simply by 0.

Note that $I - I = [r, s] + [-s, -r] = [r - s, s - r]$, which is not in general equal to the interval 0; however, we do have the following result.

Proposition 1.1.12 For all intervals $I$, $0 \in I - I$. (Question: for which intervals $I$ is it true that $I - I = [0, 0]$?)

Definition 1.1.13 (Multiplication of Intervals) For $I = [r, s]$ and $J = [u, v]$,
\[ IJ = [\min(ru, rv, su, sv),\; \max(ru, rv, su, sv)]. \]

When the intervals are “non-negative,” their product is much simpler: if $a \ge 0$ and $c \ge 0$ then $[a, b][c, d] = [ac, bd]$. However, with negative endpoints the situation is more complicated. For example, $[-3, 2][4, 5] = [-15, 10]$. This definition of the product of two intervals, as you might expect, turns out to be rather difficult to apply when computing successive multiplications of intervals.
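The endpoint definitions of the four operations above are short enough to state directly in code. This is a sketch under our own conventions: an interval is a plain pair `(lo, hi)` of `Fraction` values, and the function names `add`, `neg`, `sub` and `mul` are hypothetical, chosen only for illustration.

```python
from fractions import Fraction as F


def add(I, J):
    """[r, s] + [u, v] = [r + u, s + v]."""
    return (I[0] + J[0], I[1] + J[1])


def neg(I):
    """-[r, s] = [-s, -r]; the endpoints swap roles."""
    return (-I[1], -I[0])


def sub(J, I):
    """J - I is defined as J + (-I)."""
    return add(J, neg(I))


def mul(I, J):
    """Product interval: min and max over the four endpoint products."""
    products = [x * y for x in I for y in J]
    return (min(products), max(products))


I = (F(-3), F(2))
J = (F(4), F(5))

difference = sub(I, I)   # not the zero interval, but it does contain 0
product = mul(I, J)      # the four products are -12, -15, 8 and 10
```

Note that `sub(I, I)` comes out as `(-5, 5)` rather than `(0, 0)`: subtraction treats the two copies of `I` as independent measurements, which is why only the weaker statement that 0 belongs to the difference holds in general.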


Fortunately, there is another way of determining intervals that doesn’t depend explicitly on their endpoints, but rather characterizes them in terms of some property $P$. This kind of definition is very common in abstract mathematics and takes some getting used to.

Definition 1.1.14 $I$ is the smallest interval having the property $P$ means
1. $I$ has the property $P$, and
2. if $J$ also has property $P$, then $J \supseteq I$.

What do we mean by a property $P$? $P$ is simply a statement about an interval which may or may not be true. For example, $P$ might be the statement “The interval contains all the rationals between 2 and 3.” This particular statement is true about some intervals (for example, $[3/2, 7/2]$ or $[2, 3]$) and false about others (for example, $[0, 1]$, which doesn’t contain any rationals between 2 and 3, or $[5/2, 9/2]$, which doesn’t contain all rationals between 2 and 3). For $I$ to be “smallest” for a property $P$, not only does $P$ have to hold for the interval $I$, but any interval for which $P$ also holds must contain $I$. Thus, $[2, 3]$ is actually the smallest interval containing all the rationals between 2 and 3; i.e., it’s smallest for this particular property. Here’s a proof of this fact in somewhat more generality. It shows how this kind of argument works.

Proposition 1.1.15 $[r, s]$ is the smallest interval containing all rationals between $r$ and $s$.

Proof. We must verify two things: that $[r, s]$ actually has this property (i.e. that it contains all rationals between $r$ and $s$) and that any other interval containing these rationals actually contains $[r, s]$. So, let’s show that $[r, s]$ has the property. Suppose $x$ is a rational between $r$ and $s$; this means that $r \le x \le s$. By Definition 1.1.2, this means that $x \in [r, s]$. Thus, $[r, s]$ does actually contain all these rationals. Now, suppose that $K$ is any other interval containing all rationals between $r$ and $s$. Since both $r$ and $s$ are themselves rationals between $r$ and $s$, it follows that $K$ contains the endpoints $r$ and $s$. By Proposition 1.1.6, $I = [r, s] \subseteq K$ and we are done.

We should note that, for a given property $P$, there may not be any rational interval which is smallest for $P$. (For example, let $P$ be the property “The interval contains all $x$ with $x^3 > 2$.”) However, if there is a smallest, it is unique in the following sense.

Proposition 1.1.16 If $I$ and $J$ are both smallest intervals with the property $P$, then $I = J$. (This is left as an exercise. Hint: prove this using Proposition 1.1.7, which characterizes equality of intervals.)

We now apply this “smallest interval” idea to give alternative characterizations for the various operations on rational intervals.


Proposition 1.1.17
1. $I + J$ is the smallest interval containing all $x + y$, with $x \in I$ and $y \in J$.
2. $-I$ is the smallest interval containing all $-x$ with $x \in I$.
3. $IJ$ is the smallest interval containing all $xy$ with $x \in I$ and $y \in J$.

Proof. We will just prove part (3), which is the trickiest. The property here is “contains all $xy$ with $x \in I$ and $y \in J$.” We must first show that $IJ$ has this property, i.e. that it contains all these $xy$. Suppose then that $x \in I$, so $r \le x \le s$. Hence, if $y \ge 0$, then $ry \le xy \le sy$, and if $y < 0$, then $sy \le xy \le ry$. So in either case, $xy$ lies between $ry$ and $sy$. By the same reasoning, when $u \le y \le v$, we have that $ry$ must be between $ru$ and $rv$, and $sy$ between $su$ and $sv$. Therefore in all cases $xy$ is between the minimum and the maximum of the four products $ru$, $rv$, $su$, $sv$. By definition, then, $xy$ is in $IJ$. This proves the first half of what we need.

Now suppose that some other interval (we could call it $K$) also contains all of these $xy$. We must show that $IJ \subseteq K$. From the definition of the product $IJ$, we see that both endpoints of $IJ$ are of the form $xy$ with $x \in I$ and $y \in J$ ($x$ is either $r$ or $s$, and $y$ is either $u$ or $v$). Hence any interval $K$ that contains all such products must contain these endpoints and therefore the entire interval $IJ$ (see Proposition 1.1.6).

The two remaining parts of this proposition are left for the exercises at the end of this section. Now we turn to the “strict” inequality of intervals.

Definition 1.1.18 (Strict Inequality for Intervals) $I < J$ (also written $J > I$) means that $s < u$. Here is a simple picture of this.

[Figure: the interval $[r, s]$ lying entirely to the left of $[u, v]$, with a gap between $s$ and $u$.]

Intuitively, if the intervals represent scientific measurements of quantities $P$ and $Q$, and if the intervals overlap (so there is no gap), we could not conclude anything about the relative sizes of $P$ and $Q$. For example, if $[1, 3]$ measures a quantity $P$ and $[2, 5]$ measures $Q$, then $P < Q$, $Q < P$ and $P = Q$ are all possible. Of course, we haven’t yet defined any “objects” other than rationals, nor relations such as $<$, $>$, or $=$ between anything other than rationals; this is just a suggestion for why we make this definition instead of one based on the length of the intervals or based on the $\subseteq$ relation.

Proposition 1.1.19 $I < J$ if and only if $0 < J - I$.

As long as we keep away from zero, we can define reciprocals, and from them, division.


Definition 1.1.20 (Reciprocal) For $I > 0$ or $I < 0$, $1/I$ is the interval $[1/s, 1/r]$.

Note that this makes sense because if either $0 < r \le s$, or $r \le s < 0$, then $1/s \le 1/r$, so $[1/s, 1/r]$ is actually an interval, which it wouldn’t be if $[r, s]$ contained 0 (check $[-2/3, 1]$ for example).

Proposition 1.1.21 For any $I > 0$ or $I < 0$, $1/I$ is the smallest interval containing all $1/x$ for $x \in I$.

If we think of a rational interval as the result of a scientific measurement, then the length of the interval is an indication of how good (accurate) the measurement is.

Definition 1.1.22 (Length) $\ell(I) = s - r$ is the length of $[r, s]$.

Proposition 1.1.23 $\ell(I + J) = \ell(I) + \ell(J)$.

Proposition 1.1.24 $\ell(-I) = \ell(I)$.

There is no very simple formula for the lengths of products or reciprocals of intervals in terms of the lengths of the intervals themselves; however, we can get some inequalities which will be useful later. We deal with reciprocals first. We must keep away from zero!

Proposition 1.1.25 Suppose $I = [r, s]$ and $c$ is any number such that $0 < c \le r$; then $\ell(1/I) \le \ell(I)/c^2$.

Proof. $\ell(1/I) = 1/r - 1/s = (s - r)/rs = \ell(I)/rs$. A fraction is biggest when its denominator is smallest, and $c \le r \le s$ tells us that $rs$ is never smaller than $c^2$. Therefore $\ell(1/I) \le \ell(I)/c^2$.

When $r \ge c$ and $c > 0$, we sometimes say that “$c$ bounds $I$ away from 0.” A shorthand way of writing the condition is $I \subseteq [c, \infty)$ with $c > 0$. We can also bound an interval away from 0 in the negative direction and get a similar result (left as an exercise):

Proposition 1.1.26 Suppose $I \subseteq (-\infty, d)$ where $d < 0$. Then $\ell(1/I) \le \ell(I)/d^2$.

Now we turn to multiplication.

Proposition 1.1.27 Let $A$ and $B$ be positive rationals such that $I \subseteq [-A, A]$ and $J \subseteq [-B, B]$. Then $\ell(IJ) \le B\,\ell(I) + A\,\ell(J)$.

Proof. By definition $IJ = [ab, cd]$ where $a$ and $c$ are endpoints of $I = [r, s]$ (in some order) and $b$ and $d$ are endpoints of $J = [u, v]$. Therefore
\[ \ell(IJ) = cd - ab = cd - ad + ad - ab = d(c - a) + a(d - b) \]
with $|d| \le B$, $|a| \le A$, $|c - a| \le \ell(I)$ and $|d - b| \le \ell(J)$. Here is our first of many applications of the Triangle Inequality 0.2.9:
\[
\begin{aligned}
\ell(IJ) &= |\ell(IJ)| && \text{(since they are both positive)} \\
&= |d(c - a) + a(d - b)| \\
&\le |d(c - a)| + |a(d - b)| && \text{(Triangle Inequality!)} \\
&= |d|\,|c - a| + |a|\,|d - b| && \text{(properties of ordinary abs. val.)} \\
&\le B\,\ell(I) + A\,\ell(J).
\end{aligned}
\]

Note that $[r, s] \subseteq [-A, A]$ is simply saying that $0 \le |r| \le A$ and $0 \le |s| \le A$. The inequality $\ell(IJ) \le B\,\ell(I) + A\,\ell(J)$ looks a little like the product rule from calculus. In fact, the trick of adding and subtracting the term $ad$ in the first part of the proof is also used in proving the product rule. You will see this adding and subtracting trick several times later on. (It has already been used in Exercise 7 of the Preliminaries.)

Definition 1.1.28 (Absolute Value for Intervals) Suppose $I = [r, s]$; then
\[ |I| = \begin{cases} I & \text{when } r > 0 \text{ (i.e. when } I > 0\text{)}, \\ -I & \text{when } s < 0 \text{ (i.e. when } I < 0\text{)}, \\ [0, \max(-r, s)] & \text{when } r \le 0 \le s \text{ (i.e. when } 0 \in I\text{)}. \end{cases} \]
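The two length estimates for reciprocals and products can be spot-checked on concrete intervals. The sketch below assumes intervals stored as pairs of `Fraction` values; `length`, `recip`, `mul` and `product_bound_holds` are illustrative names of our own, and a passing check on examples is of course not a proof.

```python
from fractions import Fraction as F


def length(I):
    """Length of [r, s] is s - r."""
    return I[1] - I[0]


def recip(I):
    """1/[r, s] = [1/s, 1/r], defined only when the interval avoids 0."""
    r, s = I
    if r <= 0 <= s:
        raise ValueError("interval must not contain 0")
    return (1 / s, 1 / r)


def mul(I, J):
    """Product interval: min and max over the four endpoint products."""
    products = [x * y for x in I for y in J]
    return (min(products), max(products))


def product_bound_holds(I, J, A, B):
    """Check length(IJ) <= B*length(I) + A*length(J), given I in [-A, A] and J in [-B, B]."""
    assert -A <= I[0] and I[1] <= A and -B <= J[0] and J[1] <= B
    return length(mul(I, J)) <= B * length(I) + A * length(J)


# Reciprocal estimate: I = [2, 5] is bounded away from 0 by c = 2.
I, c = (F(2), F(5)), F(2)
reciprocal_ok = length(recip(I)) <= length(I) / c ** 2

# Product estimate on the earlier example [-3, 2][4, 5], with A = 3 and B = 5.
product_ok = product_bound_holds((F(-3), F(2)), (F(4), F(5)), F(3), F(5))
```

In the product example the left side is 25 and the right side is 28, so the bound is reasonably tight here; it can be far from tight for other intervals.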

Proposition 1.1.29 $|I|$ is the smallest interval containing all $|x|$ for $x \in I$.

Proposition 1.1.30 $\ell(|I|) \le \ell(I)$. (Proof: show that for each $I$, either $|I| \subseteq I$ or $|I| \subseteq -I$.)

Definition 1.1.31 (Max and Min for Intervals)

$$\max(I, J) = [\max(r, u), \max(s, v)], \qquad \min(I, J) = [\min(r, u), \min(s, v)].$$

Here is a diagram showing max and min for two intervals $[r, s]$ and $[u, v]$.

[Figure: number line with the intervals $[r, s]$ and $[u, v]$ and the resulting intervals labeled "max" and "min".]

Proposition 1.1.32 $K < \max(I, J)$ if and only if $K < I$ or $K < J$; $K > \max(I, J)$ if and only if $K > I$ and $K > J$. (A similar result holds for min.)

Proposition 1.1.33 The interval $\max(I, J)$ is the smallest interval containing all $\max(x, y)$ for $x \in I$ and $y \in J$. Also $\min(I, J)$ is the smallest interval containing all $\min(x, y)$ for $x \in I$ and $y \in J$.

Proposition 1.1.34 $\ell(\max(I, J)) \le \max(\ell(I), \ell(J))$; $\ell(\min(I, J)) \le \max(\ell(I), \ell(J))$.

(If this seems strange, look at $I = [0, 1000]$ and $J = [999, 1000]$ for example.)

More generally, given finitely many intervals $I_i = [r_i, s_i]$, $i = 1, \ldots, n$, we make similar definitions:

Definition 1.1.35

$$\max(I_1, \ldots, I_n) = [\max(r_1, \ldots, r_n), \max(s_1, \ldots, s_n)];$$
$$\min(I_1, \ldots, I_n) = [\min(r_1, \ldots, r_n), \min(s_1, \ldots, s_n)].$$

These can also be phrased "recursively" as follows:

$$\max(I_1, \ldots, I_n) = \max(\max(I_1, \ldots, I_{n-1}), I_n),$$
$$\min(I_1, \ldots, I_n) = \min(\min(I_1, \ldots, I_{n-1}), I_n).$$

Proposition 1.1.36 The interval $\max(I_1, \ldots, I_n)$ is the smallest interval that contains all $\max(x_1, \ldots, x_n)$ for $x_i \in I_i$, $i = 1, \ldots, n$. Also $\min(I_1, \ldots, I_n)$ is the smallest interval that contains all $\min(x_1, \ldots, x_n)$ for $x_i \in I_i$, $i = 1, \ldots, n$.

Example 1.1.37 Use the definition above to show that an equivalent formulation of the definition of $|I|$ is $|I| = \max(I, -I, 0)$.

Proposition 1.1.38 $\ell(\max(I_1, \ldots, I_n)) \le \max(\ell(I_1), \ldots, \ell(I_n))$; $\ell(\min(I_1, \ldots, I_n)) \le \max(\ell(I_1), \ldots, \ell(I_n))$.
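The definitions of absolute value, max, and min for intervals translate directly into a few lines of code. The sketch below is illustrative only: it assumes intervals are pairs of `Fraction` endpoints, and the names `neg`, `imax`, `imin`, and `iabs` are hypothetical helpers, not notation from the text. It checks the identity of Example 1.1.37 on three sample intervals (one for each case of Definition 1.1.28) and the length bound of Proposition 1.1.34 on the suggested pair.

```python
from fractions import Fraction as Q

def neg(iv):
    """-[r, s] = [-s, -r]."""
    r, s = iv
    return (-s, -r)

def imax(*ivs):
    """max(I_1, ..., I_n): take the max of left endpoints and of right endpoints."""
    return (max(r for r, _ in ivs), max(s for _, s in ivs))

def imin(*ivs):
    """min(I_1, ..., I_n): take the min of left endpoints and of right endpoints."""
    return (min(r for r, _ in ivs), min(s for _, s in ivs))

def iabs(iv):
    """|I| following the three cases of Definition 1.1.28."""
    r, s = iv
    if r > 0:
        return iv
    if s < 0:
        return neg(iv)
    return (Q(0), max(-r, s))

def length(iv):
    r, s = iv
    return s - r

ZERO = (Q(0), Q(0))
samples = [(Q(1), Q(3)), (Q(-5), Q(-2)), (Q(-4), Q(1))]
identity_holds = all(iabs(I) == imax(I, neg(I), ZERO) for I in samples)

I, J = (Q(0), Q(1000)), (Q(999), Q(1000))
max_len = length(imax(I, J))
min_len = length(imin(I, J))
print(identity_holds, max_len, min_len)
```

Three samples do not prove the identity; they only show each case of the definition agreeing with the max formulation.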

Exercises

1. Let $I = [2, 7]$, $J = [3, 8]$, $K = [4, 7]$.
(a) Compute the following intervals: $I + J$, $I - J$, $J - K$, $IJ$, $IK$, $JK$, $1/K$, $J(1/K)$.
(b) Compute $\max(I, J)$ and $\min(I, J)$. Draw a diagram showing $I$, $J$, and the max and min.
(c) Is $(I - J) + (J - K)$ equal to $I - K$?
(d) Discuss $1/I$, $1/J$, $1/K$ and $1/(-K)$.

2. Compare $[-4, 2]^2$ and $[-2, 4]^2$. Do you have a generalization?

3. Show by example that $\ell(I^n) = [\ell(I)]^n$ does not always hold. Is it ever true?


4. Prove that $I < J$ if and only if $x < y$ for every $x \in I$ and $y \in J$.

5. When the endpoints of an interval are all non-negative, multiplication of the intervals is much simplified.
(a) Suppose $r \ge 0$ and $u \ge 0$. Prove that $[r, s] \cdot [u, v] = [ru, sv]$.
(b) Again assuming that $r \ge 0$ and $u \ge 0$, give a simplified proof that $[r, s] \cdot [u, v]$ is the smallest interval containing all the products $xy$ where $x \in [r, s]$ and $y \in [u, v]$.

6. Suppose intervals $I$ and $J$ have an element in common (i.e. there is an $x$ with $x \in I$ and $x \in J$). Is it always true that for any interval $K$, $I + K$ and $J + K$ have an element in common? What about $IK$ and $JK$?

7. Suppose $I < J$. Prove that $\max(I, J) = J$. What about $\min(I, J)$?

The next exercises ask you to prove some propositions left unproved in the text. In what follows, we will use 0 to denote the interval $[0, 0]$; its meaning will be clear from context. For example, $[r, s] > 0$ will mean $[r, s] > [0, 0]$; i.e. $r > 0$.

8. Prove that the following are equivalent:
(a) $I \subseteq J$
(b) $x \in I \implies x \in J$
(c) $J$ contains the endpoints of $I$.

9. Prove that the following are equivalent:
(a) $I = J$
(b) $x \in I \iff x \in J$
(c) $I \subseteq J$ and $J \subseteq I$

10. Prove that, for all intervals $I$, $0 \in I - I$.

11. Prove that, if $I$ and $J$ are both smallest intervals with the property $P$, then $I = J$.

12. Prove:
(a) $I + J$ is the smallest interval containing all $x + y$, for $x \in I$ and $y \in J$.
(b) $-I$ is the smallest interval containing all $-x$ for $x \in I$.
(c) For any $I > 0$ or $I < 0$, $1/I$ is the smallest interval containing all $1/x$ for $x \in I$.

(d) $|I|$ is the smallest interval containing all $|x|$ for $x \in I$. (You must deal with each case of $|\cdot|$ separately.)
Note that in each case you must prove two things: (1) the interval actually has the property and (2) it is contained in any other interval that has the property.

13. Prove that $I < J$ if and only if $0 < J - I$.

14. Prove:
(a) If $I \subseteq J$, then $\ell(I) \le \ell(J)$.
(b) $\ell(I + J) = \ell(I) + \ell(J)$
(c) $\ell(-I) = \ell(I)$
(d) For $I \subseteq (-\infty, d)$ with $d < 0$, $\ell(1/I) \le \ell(I)/d^2$. ($[r, s] \subseteq (-\infty, d)$ simply means $s < d$.)
(e) $\ell(|I|) \le \ell(I)$.

15. Investigate the inclusion ($\subseteq$) relations, if any, among $I$, $J$, $\min(I, J)$ and $\max(I, J)$. What if $I \subseteq J$?

16. Here are some basic properties of the rational numbers:
(a) $a + b = b + a$
(b) $(a + b) + c = a + (b + c)$
(c) $ab = ba$
(d) $(ab)c = a(bc)$
(e) $(-a)b = -(ab)$
(f) $|ab| = |a||b|$
(g) $a > 0$ if and only if $-a < 0$
(h) If $a > 0$ and $b > 0$, then $a + b > 0$.
(i) If $a > 0$ and $b > 0$, then $ab > 0$.
(j) If $ab > 0$, then either $a > 0$ and $b > 0$ or $a < 0$ and $b < 0$.
(k) If $a_1 + \cdots + a_n > b$, then, for at least one value of $i$, $i = 1, \ldots, n$, we have $a_i > b/n$.
Verify that all of these properties still hold when we replace the rational numbers $a, b, c, a_1, \ldots, a_n$ by intervals $I, J, K, I_1, \ldots, I_n$. (Hint: Several of the proofs, notably those involving more than two intervals, will be much easier if you use the "smallest interval" characterization of the various operations on intervals; see Proposition 1.1.14 and subsequent applications.)

(l) Here is another basic property of the rationals (the "distributive law"): $a(b + c) = ab + ac$. Verify that, for intervals $I, J, K$ we always have $I(J + K) \subseteq IJ + IK$. However, make up an example for which $I(J + K) \ne IJ + IK$. (Note that $I(J + K) \subseteq IJ + IK$ is actually an equality when all the intervals are positive; you may want to try to prove this. Thus, your example must involve at least one negative endpoint. Nearly any will work!)

1.2 Families of Intersecting Intervals

Although it may never happen, two experimenters measuring the same quantity might, in fact, obtain the exact same measurement (rational number) for it with their instruments. If their experiments are represented by the rational intervals $I$ and $J$, then this common measurement must belong to both of these intervals. We make this formal by the following definition.

Definition 1.2.1 We say the intervals $I$ and $J$ intersect, or meet, if there is a rational number $x$ which belongs to both $I$ and $J$ (i.e. $x \in I$ and $x \in J$).

Here is the very important characterization of intersection in terms of endpoints.

Proposition 1.2.2 $I = [r, s]$ and $J = [u, v]$ intersect $\iff$ $r \le v$ and $u \le s$.

Proof. ($\Rightarrow$): If $x$ belongs to both, then $r \le x \le s$ and $u \le x \le v$. Thus $r \le x \le v$, so $r \le v$. Similarly $u \le s$.

($\Leftarrow$): If $r \le v$ and $u \le s$, then $\max(r, u) \le \min(s, v)$. We also have $r \le \max(r, u)$, $u \le \max(r, u)$; similarly, $\min(s, v) \le s$ and $\min(s, v) \le v$. Thus, the entire interval $[\max(r, u), \min(s, v)]$ is contained in both $I$ and $J$.

Notation 1.2.3 If $I$ and $J$ intersect, their intersection is an interval given by $I \cap J = [\max(r, u), \min(s, v)]$. (The intersection of finitely many intervals can be characterized as a smallest interval; see Exercise 9 at the end of this section.)

There are several ways intervals can intersect. It may be that $[r, s] \subseteq [u, v]$, in which case the intersection is all of $[r, s]$:

[Figure: $[r, s]$ lying inside $[u, v]$, with the intersection $[r, s]$ marked.]

or it may be that only parts of the intervals meet ($u < r < v < s$, making the intersection $[r, v]$):

[Figure: $[u, v]$ and $[r, s]$ partially overlapping, with the intersection $[r, v]$ marked.]

or they may just have an endpoint in common ($r < s = u < v$, making the intersection $[s, s] = [u, u]$):

[Figure: $[r, s]$ and $[u, v]$ touching at the single point $s = u$, marked as the intersection.]

We should point out that the symbol $\cap$ is not a verb! It is a common mistake to read "$I \cap J$" as "$I$ intersects $J$": $I \cap J$ is either undefined (when $I$ and $J$ don't meet), or it is an interval (when they do). $I \cap J$ is not a statement or assertion; it is a thing or a noun.

It should also be clear that $I$ and $J$ intersect if and only if neither lies to the right or left of the other; i.e. $I$ and $J$ intersect $\iff$ $I < J$ and $J < I$ are both false. (The proof of this is an exercise.)

Proposition 1.2.4 Given a finite collection of intervals $I_k = [r_k, s_k]$, $k = 1, \ldots, n$, if each pair of them intersect, then all of them do. In fact, any number in $[\max(r_1, \ldots, r_n), \min(s_1, \ldots, s_n)]$ belongs to all the intervals $I_k$.

Notation 1.2.5 For intervals $I_k = [r_k, s_k]$ that intersect,

$$\bigcap_{k=1}^{n} I_k = [\max(r_1, \ldots, r_n), \min(s_1, \ldots, s_n)].$$

Remark 1.2.6 Another way of saying this is that, given a finite number of intervals, their common intersection is equal to the intersection of any one whose left endpoint is farthest to the right with any one whose right endpoint is farthest to the left.

We now move from dealing with individual intervals to dealing with families of intervals. Science advances by considering experiments done by many, and the numerical nature of a quantity emerges from a multiplicity of measurements, expressed as a family of intervals.

Now if $I$ and $J$ are intervals resulting from experiments to measure the same quantity, and $I$ lies wholly to the left or wholly to the right of $J$, then we can be sure that at least one of the experiments is incorrect. Such measurements might be described as inconsistent.
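The endpoint test for intersection and the formula for a common intersection are easy to experiment with. Here is a small sketch under the assumption that intervals are pairs of `Fraction` endpoints; `meets` and `intersection` are hypothetical helper names, and the sample intervals are made up for illustration.

```python
from fractions import Fraction as Q
from itertools import combinations

def meets(iv, jv):
    """[r, s] and [u, v] intersect exactly when r <= v and u <= s."""
    (r, s), (u, v) = iv, jv
    return r <= v and u <= s

def intersection(ivs):
    """[max of left endpoints, min of right endpoints], or None if that is empty."""
    left = max(r for r, _ in ivs)
    right = min(s for _, s in ivs)
    return (left, right) if left <= right else None

family = [(Q(0), Q(5)), (Q(2), Q(8)), (Q(3), Q(9))]
pairwise_ok = all(meets(i, j) for i, j in combinations(family, 2))
common = intersection(family)

# Adding an interval that misses [0, 5] destroys the common intersection.
bigger = family + [(Q(6), Q(7))]
bigger_pairwise_ok = all(meets(i, j) for i, j in combinations(bigger, 2))
bigger_common = intersection(bigger)

print(pairwise_ok, common)
print(bigger_pairwise_ok, bigger_common)
```

In the first family every pair meets and the common intersection is the interval between the rightmost left endpoint and the leftmost right endpoint, as the remark describes.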

Example 1.2.7 In the mid 1990s, astronomers obtained two measurements for the age of the universe. Using data from the Hubble Space Telescope and the theory of the expanding universe, they came up with the time of creation lying in the interval $[-16, -14]$ (billion years ago). On the other hand, astronomers studying stellar chemistry and the life-cycle of stars concluded, from the ages of the oldest stars, that this creation occurred in the interval $[-12, -8]$. These intervals don't intersect! Fortunately, errors were detected within a year or so, and new computations produced intersecting intervals.

Definition 1.2.8 $\mathcal{F}$ is a consistent family of intervals means that for every pair of intervals $I$ and $J$ in $\mathcal{F}$, $I$ and $J$ intersect.

Thus, in the Hubble example above, the family $\{[-16, -14], [-12, -8]\}$ is not consistent. On the other hand, $\{[1, 4], [1, 7], [3/2, 3]\}$ is consistent: every interval contains 2, and 5/2, and 3, etc., so there are numbers that are common to all of the intervals in this family. Using Proposition 1.2.4 above, we can easily prove the following.

Proposition 1.2.9 If $\mathcal{F}$ is a consistent family having only a finite number of intervals, then there is at least one rational number which is in every one of the intervals of $\mathcal{F}$.

You may find it surprising that this fact about finite families is not necessarily true for infinite families. As we shall see in the next section (see Exercise 10 on page 38), the family

$$\mathcal{C}_2 = \{[a, b] \mid a^3 \le 2 \le b^3\}$$

is consistent (any two intervals intersect), but there is no rational number common to all of its intervals.

If a valid experiment produces an interval $[r, s]$ in measuring a physical quantity $Q$, and another such experiment produces $[u, v]$, then we would deduce that the intersection of these intervals, $[\max(r, u), \min(s, v)]$, might also be a valid measurement for $Q$. For example, experiment A may determine a lower bound $r$ for $Q$ only, so $s$ might be an arbitrarily large number. Similarly, experiment B may determine an upper bound $v$, so that $u$ is arbitrarily small. Then $[r, v]$ could be deduced as a bounding interval for $Q$. (Footnote 1: One should be cautious, however: the increased accuracy of this intersection interval could violate some quantum mechanical law like the Uncertainty Principle.)

The idea of taking intersections of intervals can be applied to any consistent family of intervals, and consistency will be preserved; that is the content of the following proposition. (We will not actually be doing this kind of construction here, so we just state it for the record.)

Proposition 1.2.10 If $\mathcal{F}$ is a consistent family of intervals, so is the family of all finite intersections of members of $\mathcal{F}$, and it contains $\mathcal{F}$ as a subfamily.

Just as we defined algebraic operations on rational intervals based on those for rational numbers, we now define algebraic operations on families of rational intervals based on the ones we defined for intervals.

Notation 1.2.11 If $\mathcal{F}$ is a family of rational intervals, then $I \in \mathcal{F}$ means that $I$ is an interval of $\mathcal{F}$.

Definition 1.2.12 Given $\mathcal{F}$ and $\mathcal{G}$ families of intervals,
$\mathcal{F} + \mathcal{G}$ is the family $\{I + J \mid I \in \mathcal{F}, J \in \mathcal{G}\}$
$-\mathcal{F}$ is the family $\{-I \mid I \in \mathcal{F}\}$
$\mathcal{F}\mathcal{G}$ is the family $\{IJ \mid I \in \mathcal{F}, J \in \mathcal{G}\}$
$|\mathcal{F}|$ is the family $\{|I| \mid I \in \mathcal{F}\}$
If at least one interval of $\mathcal{F}$ doesn't contain 0, then $1/\mathcal{F}$ is the family $\{1/I \mid I \in \mathcal{F} \text{ and } 0 \notin I\}$.
$\max(\mathcal{F}, \mathcal{G})$ is the family $\{\max(I, J) \mid I \in \mathcal{F}, J \in \mathcal{G}\}$
$\min(\mathcal{F}, \mathcal{G})$ is the family $\{\min(I, J) \mid I \in \mathcal{F}, J \in \mathcal{G}\}$
$\mathcal{F} - \mathcal{G}$ is the family $\mathcal{F} + (-\mathcal{G})$.

Proposition 1.2.13 If both $\mathcal{F}$ and $\mathcal{G}$ are consistent families, so are $\mathcal{F} + \mathcal{G}$, $-\mathcal{F}$, $\mathcal{F}\mathcal{G}$, $1/\mathcal{F}$, $|\mathcal{F}|$, $\max(\mathcal{F}, \mathcal{G})$, and $\min(\mathcal{F}, \mathcal{G})$.

Proof. Since $\mathcal{F}$ and $\mathcal{G}$ are both consistent, given $I, I' \in \mathcal{F}$ and $J, J' \in \mathcal{G}$, there is an $x \in I \cap I'$ and a $y \in J \cap J'$. Hence $x + y$ is in both $I + J$ and $I' + J'$, $-x$ is in both $-I$ and $-I'$, $xy$ is in both $IJ$ and $I'J'$, $|x|$ is in both $|I|$ and $|I'|$, $\max(x, y)$ is in both $\max(I, J)$ and $\max(I', J')$, and $\min(x, y)$ is in both $\min(I, J)$ and $\min(I', J')$. So $\mathcal{F} + \mathcal{G}$, $-\mathcal{F}$, $\mathcal{F}\mathcal{G}$, $|\mathcal{F}|$, $\max(\mathcal{F}, \mathcal{G})$ and $\min(\mathcal{F}, \mathcal{G})$ are consistent families of intervals. Finally, suppose $I$ and $I'$ are in $\mathcal{F}$ and neither contains 0. Since $\mathcal{F}$ is consistent, there is an $x$ lying in both intervals; of course $x$ is non-zero. Then $1/x$ lies in both $1/I$ and $1/I'$.

Sometimes two quantities thought to be different are actually the same. One indication that this may be the case is that families of measurements fail to distinguish them; i.e., any measurement of one meets or overlaps any measurement of the other. (Of course, the bigger and more "accurate" the families the more likely it is that the quantities are the same, but we will deal with this idea later.)

Definition 1.2.14 (Consistent Families) Given two families of intervals $\mathcal{F}$ and $\mathcal{G}$, we say $\mathcal{F}$ is consistent with $\mathcal{G}$ if each interval from the family $\mathcal{F}$ intersects each interval from the family $\mathcal{G}$. This relation is denoted $\mathcal{F} \approx \mathcal{G}$.

Example 1.2.15 Let $\mathcal{F} = \{[1, 3], [4, 9]\}$; then $\mathcal{F}$ isn't even consistent with itself ($\mathcal{F}$ is inconsistent). On the other hand, if $\mathcal{G} = \{[1.2, 1.7], [1.3, 1.5], [1.4, 1.5]\}$ and $\mathcal{H} = \{[1.25, 1.5], [1.4, 1.45]\}$, then it is not difficult to check that $\mathcal{G}$ and $\mathcal{H}$ are consistent: the number 1.414, for example, belongs to every interval in these families.
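Consistency of finite families is a finite check, so Example 1.2.15 can be verified mechanically. The sketch below assumes a family is a Python list of endpoint pairs; `meets`, `consistent_with`, `is_consistent`, and `family_sum` are hypothetical helper names chosen for this illustration.

```python
from fractions import Fraction as Q

def meets(iv, jv):
    """[r, s] and [u, v] intersect exactly when r <= v and u <= s."""
    (r, s), (u, v) = iv, jv
    return r <= v and u <= s

def consistent_with(fam_f, fam_g):
    """Every interval of the first family meets every interval of the second."""
    return all(meets(i, j) for i in fam_f for j in fam_g)

def is_consistent(fam):
    """A family is consistent when it is consistent with itself."""
    return consistent_with(fam, fam)

def family_sum(fam_f, fam_g):
    """F + G: all sums I + J with I in F and J in G."""
    return [(r + u, s + v) for (r, s) in fam_f for (u, v) in fam_g]

def iv(lo, hi):
    """Build an interval from decimal strings, kept exact as Fractions."""
    return (Q(lo), Q(hi))

F = [iv("1", "3"), iv("4", "9")]
G = [iv("1.2", "1.7"), iv("1.3", "1.5"), iv("1.4", "1.5")]
H = [iv("1.25", "1.5"), iv("1.4", "1.45")]

print(is_consistent(F))                # the two intervals of F do not meet
print(consistent_with(G, H))           # every interval of G meets every interval of H
print(is_consistent(family_sum(G, H))) # sum of consistent families
```

The last line illustrates one case of Proposition 1.2.13 on this particular pair of families; it is a check on an example, not a proof.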

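With this definition, a consistent family of intervals is one that is consistent with itself. Thus, $\mathcal{F}$ is consistent if and only if $\mathcal{F} \approx \mathcal{F}$.

Proposition 1.2.16 Given $\mathcal{F}$ and $\mathcal{G}$ consistent families of intervals, $\mathcal{F} \approx \mathcal{G}$ if and only if $\mathcal{F} \cup \mathcal{G}$ is a consistent family (where $\mathcal{F} \cup \mathcal{G}$ consists of all intervals that are in either $\mathcal{F}$ or $\mathcal{G}$).

Once again, mathematicians like to see how relations between objects (such as consistency) behave when we perform arithmetic operations on the objects. For example, in number theory we know that if $M \equiv M' \pmod{K}$ ($M$ is congruent to $M'$ mod $K$) and $N \equiv N' \pmod{K}$, then $M + N \equiv M' + N' \pmod{K}$ (and similarly for subtraction and multiplication). The next result shows that consistency works similarly.

Proposition 1.2.17 Suppose that $\mathcal{F} \approx \mathcal{F}'$ and $\mathcal{G} \approx \mathcal{G}'$. Then we also have:
1. $\mathcal{F} + \mathcal{G} \approx \mathcal{F}' + \mathcal{G}'$
2. $-\mathcal{F} \approx -\mathcal{F}'$
3. $\mathcal{F}\mathcal{G} \approx \mathcal{F}'\mathcal{G}'$
4. $1/\mathcal{F} \approx 1/\mathcal{F}'$ (when defined)
5. $|\mathcal{F}| \approx |\mathcal{F}'|$
6. $\max(\mathcal{F}, \mathcal{G}) \approx \max(\mathcal{F}', \mathcal{G}')$
7. $\min(\mathcal{F}, \mathcal{G}) \approx \min(\mathcal{F}', \mathcal{G}')$

(A trivial modification of the proof of Proposition 1.2.13 (preservation of consistency) proves this proposition as well; simply take $I \in \mathcal{F}$, $I' \in \mathcal{F}'$, $J \in \mathcal{G}$ and $J' \in \mathcal{G}'$.)

Proposition 1.2.18 Suppose the families $\mathcal{F}$, $\mathcal{G}$, and $\mathcal{H}$ are consistent. Then
1. $\mathcal{F}(\mathcal{G}\mathcal{H}) \approx (\mathcal{F}\mathcal{G})\mathcal{H}$
2. $\mathcal{F}(\mathcal{G} + \mathcal{H}) \approx \mathcal{F}\mathcal{G} + \mathcal{F}\mathcal{H}$
3. $\mathcal{F} + (\mathcal{G} + \mathcal{H}) \approx (\mathcal{F} + \mathcal{G}) + \mathcal{H}$
4. $|\mathcal{F}\mathcal{G}| \approx |\mathcal{F}|\,|\mathcal{G}|$

Proof. We will just prove (1) since the others work similarly. We must show that every interval of the left-hand family meets every interval of the right-hand family. By definition, a typical interval on the left looks like $I(JK)$, while one on the right looks like $(I'J')K'$, where $I, I' \in \mathcal{F}$, $J, J' \in \mathcal{G}$, and $K, K' \in \mathcal{H}$. By consistency, we can find numbers $x \in I \cap I'$, $y \in J \cap J'$, and $z \in K \cap K'$. It is now clear that the element $x(yz) = (xy)z$ lies in both triple products, so that $I(JK)$ and $(I'J')K'$ meet.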

Suppose we make measurements of two physical constants, $C$ and $D$, and find that $C$ is between 1.88 and 1.92, while $D$ is between 1.85 and 1.90. This shows that $C$ and $D$ do not differ by more than .07. But it does not reveal whether $C$ and $D$ are distinct or not. However, suppose we make a second set of measurements and find also that $C$ lies between 1.894 and 1.913, and $D$ is between 1.889 and 1.891. This does enable us to conclude that $C$ and $D$ are distinct, because it reveals that $D$ is at least .003 less than $C$:

$$1.889, \ldots, \overbrace{1.891 < 1.894}^{\text{gap of } .003}, \ldots, 1.913$$

In this case, the families of measurements for $C$ and $D$ are not consistent: they can distinguish between $C$ and $D$ by telling us explicitly that $D < C$. In other words, it takes only one valid pair of measurements $[r, s]$ and $[u, v]$ of the two quantities $D$ and $C$ respectively, with $s < u$, to tell us that $D < C$. This suggests that we make the following definition.

Definition 1.2.19 (Strict Inequality for Families) Given two families of intervals $\mathcal{F}$ and $\mathcal{G}$, $\mathcal{F} < \mathcal{G}$ means that we can find at least one $I$ in $\mathcal{F}$ and one $J$ in $\mathcal{G}$, with $I < J$.

Could it be that both $\mathcal{F} < \mathcal{G}$ and $\mathcal{G} < \mathcal{F}$ are true? It is not hard to construct a simple example to show that the answer is yes (try it). But the following proposition tells us that this undesirable situation will not happen if both $\mathcal{F}$ and $\mathcal{G}$ are consistent families of intervals.

Proposition 1.2.20 If $\mathcal{F}$ and $\mathcal{G}$ are consistent families with $\mathcal{F} < \mathcal{G}$, then $\mathcal{G} \not< \mathcal{F}$ (that is, $\mathcal{G} < \mathcal{F}$ is false).

Proof. Note first that for any two intervals $X$ and $Y$, if $X < Y$, then $x < y$ for all $x \in X$ and $y \in Y$ (prove this for yourself if necessary). We are given that $\mathcal{F} < \mathcal{G}$, so we can find $I$ in $\mathcal{F}$ and $J$ in $\mathcal{G}$ with $I < J$. We must show that $\mathcal{G} < \mathcal{F}$ leads to a contradiction, so suppose it's true. Then we can find $J' \in \mathcal{G}$ and $I' \in \mathcal{F}$ with $J' < I'$. Now we use consistency. Choose any $x \in I \cap I'$ and $y \in J \cap J'$. Since $I < J$, $x < y$; since $J' < I'$, $y < x$. But $x < y$ and $y < x$ can't both be true: this is our contradiction. It follows that $\mathcal{G} \not< \mathcal{F}$.

Proposition 1.2.21 Given families of intervals $\mathcal{F}$, $\mathcal{G}$ and $\mathcal{H}$, if $\mathcal{F} < \mathcal{G} < \mathcal{H}$ and $\mathcal{G}$ is consistent, then $\mathcal{F} < \mathcal{H}$.

The proof is left as an exercise.

Exercise 1.2.22 Give examples of
• A family $\mathcal{F}$ such that $\mathcal{F} < \mathcal{F}$.
• Families $\mathcal{F}$, $\mathcal{G}$, and $\mathcal{H}$ such that $\mathcal{F} < \mathcal{G}$ and $\mathcal{G} < \mathcal{H}$ but $\mathcal{F} \approx \mathcal{H}$.

The last two propositions show that strong inequality, $<$, behaves as we would expect. It is also useful to have a notion of weak inequality, usually denoted $\le$. For rational numbers, to say that $r \le s$ is to say that $r$ does not exceed $s$, i.e. $r \not> s$. We will adopt this meaning for families as well.

Definition 1.2.23 (Weak Inequality: Families) $\mathcal{F} \le \mathcal{G}$ means $\mathcal{F} > \mathcal{G}$ is false; i.e., no interval of $\mathcal{F}$ lies wholly to the right of any interval in $\mathcal{G}$. This is also written $\mathcal{G} \ge \mathcal{F}$.

Example 1.2.24 Let $\mathcal{P} = \{[1, 5], [3/2, 2]\}$ and $\mathcal{Q} = \{[4, 8], [2, 3]\}$. Then $\mathcal{P} \le \mathcal{Q}$ since no interval of $\mathcal{Q}$ is strictly less than any interval of $\mathcal{P}$.

Please note that to say $\mathcal{F} \le \mathcal{G}$ is to say that no interval in $\mathcal{F}$ lies to the right of any interval in $\mathcal{G}$. It does not say that some interval in $\mathcal{F}$ lies to the left of some interval in $\mathcal{G}$ ($\mathcal{F} < \mathcal{G}$), nor that every interval in $\mathcal{F}$ meets every interval in $\mathcal{G}$ ($\mathcal{F} \approx \mathcal{G}$), nor that the families might be equal. It is not an "or" statement at all. In trying to verify that $\mathcal{F} \le \mathcal{G}$, then, one must show that $J < I$ never happens for any intervals $J \in \mathcal{G}$ and $I \in \mathcal{F}$. In view of this remark, it is probably best to read "$\mathcal{F} \le \mathcal{G}$" as "$\mathcal{F}$ does not exceed $\mathcal{G}$", even though it is hard to avoid saying "less than or equal," given the use of the symbol $\le$.

Weak inequality can be characterized easily in terms of the endpoints of the intervals in the families. To say that $I = [r, s]$ does not lie to the right of $J = [u, v]$ is to say that $r \le v$.

Definition 1.2.25 If $[r, s]$ is an interval of the family $\mathcal{F}$, then $r$ is called a lower bound of $\mathcal{F}$ and $s$ is called an upper bound of $\mathcal{F}$. (If you look ahead to Proposition 1.3.12 in the next section, you can see why these are called bounds for $\mathcal{F}$.)

The following gives a very useful way to characterize inequality and consistency of families in terms of upper and lower bounds; it really makes giving proofs of these relationships much less confusing.

Proposition 1.2.26
1. $\mathcal{F} < \mathcal{G}$ if and only if some upper bound of $\mathcal{F}$ is less than some lower bound of $\mathcal{G}$.
2. $\mathcal{F} \le \mathcal{G}$ if and only if every lower bound of $\mathcal{F}$ is less than or equal to every upper bound of $\mathcal{G}$.
3. $\mathcal{F} \approx \mathcal{G}$ if and only if every lower bound of either family is less than or equal to every upper bound of the other.

Corollary 1.2.27 Given two families of intervals $\mathcal{F}$ and $\mathcal{G}$, $\mathcal{F} \approx \mathcal{G}$ if and only if $\mathcal{F} \le \mathcal{G}$ and $\mathcal{G} \le \mathcal{F}$.

Proofs of these results are left as exercises.
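The bound characterizations of strict and weak inequality for families can be compared on Example 1.2.24. This is a minimal sketch assuming finite families stored as lists of endpoint pairs; `lt`, `le`, and `le_by_bounds` are hypothetical names for the strict inequality, its negation-based weak form, and the endpoint form of the weak inequality.

```python
from fractions import Fraction as Q

def lt(fam_f, fam_g):
    """F < G: some upper bound of F is less than some lower bound of G."""
    return any(s < u for (_, s) in fam_f for (u, _) in fam_g)

def le(fam_f, fam_g):
    """F <= G is defined as the negation of G < F."""
    return not lt(fam_g, fam_f)

def le_by_bounds(fam_f, fam_g):
    """F <= G: every lower bound of F is <= every upper bound of G."""
    return all(r <= v for (r, _) in fam_f for (_, v) in fam_g)

P = [(Q(1), Q(5)), (Q(3, 2), Q(2))]
Q_fam = [(Q(4), Q(8)), (Q(2), Q(3))]

print(le(P, Q_fam), le_by_bounds(P, Q_fam))   # the two forms of P <= Q agree
print(lt(P, Q_fam))                           # [3/2, 2] lies to the left of [4, 8]
print(le(Q_fam, P))                           # so Q <= P fails
```

Since here only one of the two weak inequalities holds, the two families are not consistent with each other, in line with the corollary: the intervals [3/2, 2] and [4, 8] do not meet.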

Exercise 1.2.28 Is it true in general (no assumptions on the families) that $\mathcal{F} \le \mathcal{G}$ and $\mathcal{G} \le \mathcal{H}$ implies that $\mathcal{F} \le \mathcal{H}$?

Writing Proofs

Before proceeding to the exercises involving families, it might be a good idea to discuss the general method of finding proofs and writing them up. As an example, we'll develop a proof of the following fact about families of intervals (stated as Corollary 1.2.27 in the text).

Proposition Given two families of intervals $\mathcal{F}$ and $\mathcal{G}$, $\mathcal{F} \approx \mathcal{G}$ if and only if $\mathcal{G} \le \mathcal{F}$ and $\mathcal{F} \le \mathcal{G}$.

How to begin finding a proof

First, you must know exactly what each word and phrase in the statement of the proposition means: "$\mathcal{F} \approx \mathcal{G}$" means that every interval in $\mathcal{F}$ intersects every interval in $\mathcal{G}$. For families, the weak inequality $\mathcal{F} \le \mathcal{G}$ is defined as the negation of the strong inequality $\mathcal{G} < \mathcal{F}$: $\mathcal{F} \le \mathcal{G}$ means that there is no interval of $\mathcal{G}$ that lies wholly to the left of any interval of $\mathcal{F}$. (Another way to say this is that for any $I \in \mathcal{F}$ and $J \in \mathcal{G}$, it is never true that $J < I$.) Similarly, the relation $\mathcal{G} \le \mathcal{F}$ means that there is no interval in $\mathcal{F}$ which lies strictly to the left of an interval in $\mathcal{G}$.

Ok. So you know what the terms mean. Now, note that you have to prove an "if and only if" statement. That always means you have two things to do. First, prove that the left-hand side implies the right, and secondly, that the right-hand side implies the left.

So let's start from left to right. You are given that $\mathcal{F} \approx \mathcal{G}$. You must show, first, that no interval in $\mathcal{F}$ is to the left of an interval in $\mathcal{G}$. So take any interval $I$ in $\mathcal{F}$ and any interval $J$ in $\mathcal{G}$. You must show that $I$ can't lie to the left of $J$. What can you use to prove this? You clearly must use the assumption that $\mathcal{F} \approx \mathcal{G}$. What does this assumption tell us about $I$ and $J$? It tells us that they must intersect. Can one interval lie strictly to the left of another if they intersect? From this picture you can immediately see that the answer is no.

[Figure: $[r, s]$ lying strictly to the left of $[u, v]$ on a number line, labeled "No Intersection".]

However, you might want to prove this without having to use a picture; you therefore have to go back to using endpoints and inequalities. $[r, s]$ intersects $[u, v]$ is equivalent to $r \le v$ and $u \le s$. On the other hand, $[r, s] < [u, v]$ means that $s < u$ (this is the definition of $<$). If they intersect, then this can't happen since $u \le s$. In this way you can prove formally what your picture suggested. Thus, no interval in $\mathcal{F}$ can lie to the left of an interval in $\mathcal{G}$ and, by a similar argument, no interval in $\mathcal{G}$ can lie to the left of an interval in $\mathcal{F}$. This completes the proof that the left-hand side implies the right.

Now you must show that the right-hand side implies the left. You are given $\mathcal{F} \le \mathcal{G}$ and $\mathcal{G} \le \mathcal{F}$ and you must prove that $\mathcal{F} \approx \mathcal{G}$. So take any $I$ in $\mathcal{F}$ and any $J$ in $\mathcal{G}$. You have to show that they intersect, i.e. that $r \le v$ and $u \le s$. But you know that $I$ can't lie to the left of $J$, nor can $J$ lie to the left of $I$ (why?). Interpreting this in terms of endpoints, you see that it is false that $s < u$ and it is false that $v < r$. Thus, $u \le s$ and $r \le v$, and you are done.

Most of the proofs you will have to do involve just this sort of argument: knowing the definitions of the terms involved, and using them and some pictures to see how to proceed. Of course, knowing exactly what each item means is one of the most important steps in constructing any proof.

Now it's time to write up the proof. Although pictures are sometimes helpful in understanding or motivating an argument, a formal proof generally doesn't use pictures. Therefore, you'll have to include the proof, using endpoints, of the fact that if two intervals intersect, one can't be to the left of the other. Since this is a helpful fact rather than the main result, it is a good idea to single it out and call it a lemma. Below is a sample write-up. Note the use of ■ to denote the end of a proof. This is the modern typeset replacement for Q.E.D. from classical geometry (Q.E.D. stands for Quod Erat Demonstrandum, "that which was to be shown").
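You could use the "#" sign in handwritten or typed proofs.

Lemma If $I$ and $J$ intersect, then $I \not< J$ and $J \not< I$.

Proof. Write $I = [r, s]$ and $J = [u, v]$. By definition of intervals intersecting, we have $r \le v$ and $u \le s$. But $J < I \iff v < r$, and $I < J \iff s < u$. Thus, if $I$ and $J$ intersect, neither can be to the left of the other. ■

Proposition Given two families of intervals $\mathcal{F}$ and $\mathcal{G}$, $\mathcal{F} \approx \mathcal{G}$ if and only if $\mathcal{F} \le \mathcal{G}$ and $\mathcal{G} \le \mathcal{F}$.

Proof. ($\Rightarrow$): Let $I$ be any interval in $\mathcal{F}$ and $J$ be any interval in $\mathcal{G}$. Since $\mathcal{F} \approx \mathcal{G}$, $I$ and $J$ intersect. By the Lemma, $I \not< J$ and $J \not< I$. By definition of $<$ for families, neither $\mathcal{F} < \mathcal{G}$ nor $\mathcal{G} < \mathcal{F}$ can hold, so $\mathcal{G} \le \mathcal{F}$ and $\mathcal{F} \le \mathcal{G}$.

($\Leftarrow$): Let $I = [r, s]$ be any interval in $\mathcal{F}$ and $J = [u, v]$ any interval in $\mathcal{G}$. Since $\mathcal{F} \not< \mathcal{G}$, $s < u$ must be false, so $u \le s$. Similarly, since $\mathcal{G} \not< \mathcal{F}$, $v < r$ is false, so $r \le v$. We know these inequalities mean that $I$ and $J$ intersect. ■

(This proof can be further improved by making the lemma itself into an "if and only if" statement.)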

Exercises

1. Let $\mathcal{A} = \{[2, 5], [1/2, 2]\}$ and $\mathcal{B} = \{[1, 7], [1/4, 3]\}$. Calculate the families $\mathcal{A} + \mathcal{B}$, $\mathcal{A} - \mathcal{B}$, $\mathcal{A}\mathcal{B}$, $1/\mathcal{B}$, $|\mathcal{A}|$ and $\max(\mathcal{A}, \mathcal{B})$. Verify that all families are consistent.

2. Find the interval consisting of all numbers common to the intervals in the family $\{[2, 7/2], [3/2, 3], [4/3, 5/3], [5/3, 2]\}$.

3. Find a consistent family $\mathcal{H}$ having at least 3 intervals such that each interval is contained in $[-1, 1]$. Find a family of subintervals of $[-1, 1]$ which is not consistent.

4. The family $\mathcal{L} = \{[-2, 0], [1, 3]\}$ is not consistent. Why? Is it possible to add intervals to $\mathcal{L}$ to make it consistent? Explain.

5. Let $\mathcal{M}$ be the family of all intervals $[1/N^2, 1/N]$, $N = 1, 2, 3, \ldots$ Show that $\mathcal{M}$ is not consistent.

6. Let $\mathcal{N}$ be the family $\left[\dfrac{N}{N+1}, \dfrac{N+2}{N+1}\right]$, $N = 0, 1, 2, \ldots$ Is $\mathcal{N}$ consistent?

7. The family $\mathcal{C}_2$ consists of all the intervals $[a, b]$ with the property that $a^3 \le 2$ and $2 \le b^3$. Prove that $\mathcal{C}_2$ is a consistent family. (Hint: Suppose $[a, b]$ and $[c, d]$ are in $\mathcal{C}_2$ but they don't intersect; for example, $[a, b] < [c, d]$. Show that this leads to a contradiction.)

8. For intervals $I$ and $J$, prove that either $I < J$, $J < I$, or $I$ and $J$ intersect (the only three possibilities).

9. Assuming that $I$ and $J$ intersect, prove that $I \cap J$ is the smallest interval which contains all $x$ such that $x \in I$ and $x \in J$.

10. Find families $\mathcal{F}$, $\mathcal{G}$, and $\mathcal{H}$ such that $\mathcal{F} < \mathcal{G}$ and $\mathcal{G} < \mathcal{H}$, but $\mathcal{F} < \mathcal{H}$ is false (see the next exercise also).

11. Prove that if families $\mathcal{F}$, $\mathcal{G}$, and $\mathcal{H}$ satisfy $\mathcal{F} < \mathcal{G} < \mathcal{H}$, and if $\mathcal{G}$ is consistent, then $\mathcal{F} < \mathcal{H}$.
(Outline: By definition, $\mathcal{F} < \mathcal{G}$ means you can find intervals $I$ in $\mathcal{F}$ and $J$ in $\mathcal{G}$ such that $I < J$. Also, $\mathcal{G} < \mathcal{H}$ gives us intervals $J'$ and $K$ with $J' < K$. (Note that you must be careful and not assume that $J = J'$.) Write $K = [p, q]$. What must you prove about endpoints to deduce $I < K$? How can you use the consistency of $\mathcal{G}$? Write some inequalities with the endpoints of these intervals and an element in an intersection.)

12. Find consistent families $\mathcal{F}$, $\mathcal{G}$, and $\mathcal{H}$ such that $\mathcal{F} \le \mathcal{G}$ and $\mathcal{G} < \mathcal{H}$, but $\mathcal{F} \not< \mathcal{H}$. (Hint: Start with $\mathcal{G} < \mathcal{H}$ and then find $\mathcal{F}$.)

13. The text has the following proposition, with a proof of only the first statement. Prove the remaining ones.
Proposition 1.2.18 Suppose the families $\mathcal{F}$, $\mathcal{G}$, and $\mathcal{H}$ are consistent; then
(a) $\mathcal{F}(\mathcal{G}\mathcal{H}) \approx (\mathcal{F}\mathcal{G})\mathcal{H}$
(b) $\mathcal{F}(\mathcal{G} + \mathcal{H}) \approx \mathcal{F}\mathcal{G} + \mathcal{F}\mathcal{H}$
(c) $\mathcal{F} + (\mathcal{G} + \mathcal{H}) \approx (\mathcal{F} + \mathcal{G}) + \mathcal{H}$
(d) $|\mathcal{F}\mathcal{G}| \approx |\mathcal{F}|\,|\mathcal{G}|$.

14. Show by example that $\mathcal{F} \approx \mathcal{G}$ and $\mathcal{G} \approx \mathcal{H}$ need not imply $\mathcal{F} \approx \mathcal{H}$.

15. (Project) Define, for intervals $I$ and $J$, the relation $I \le J$ to mean either $I < J$ or $I$ and $J$ meet (intersect).
(a) Prove that the following are equivalent.
i. $I \le J$.
ii. The left-hand endpoint of $I$ is less than or equal to the right-hand endpoint of $J$.
iii. There exist numbers $x \in I$, $y \in J$ such that $x \le y$.
(b) Try to prove results about $\le$ that are similar to the results proved in the text about $<$. For example, is it true that $I \le J$ and $J \le K$ imply $I \le K$? Is it true that $I \ge 0$ and $J \ge 0$ imply $IJ \ge 0$?
(c) Prove that for families $\mathcal{F}$ and $\mathcal{G}$, $\mathcal{F} \le \mathcal{G}$ if and only if $I \le J$ for every pair of intervals $I \in \mathcal{F}$ and $J \in \mathcal{G}$. (You may find Proposition 1.2.26 useful.)

1.3 Fine Families

So far, in discussing consistency, we have been concerned about whether the intervals in a family seem to be measuring the same thing. However, there is another issue: accuracy or fineness of measurement. If an experiment produces an interval of possible values for a quantity, then the length of that interval is a measure of how well the experiment has determined that quantity. Certainly the interval $[3.1415, 3.1416]$ is a better estimate of a certain well-known constant than the interval $[3.14, 3.15]$.

By the way, the actual size of the endpoints of an interval has nothing to do with the accuracy of the measurement it represents. Thus, the interval $[1000, 1000.01]$ has length 0.01, so represents a more accurate measurement than the interval $[0.1, 0.2]$.

Of course, in the case of actual physical experiments, there can be limits to the accuracy of a measurement (for example, limits inherent in the laws of quantum mechanics). In the case of mathematics, however, many constructions have no such limits. The calculation of various roots, or of $\pi$, or of values of functions, can result in arbitrarily fine accuracy. We now consider families representing these kinds of measurements or constructions.

Definition 1.3.1 The family $\mathcal{F}$ is fine when, for each positive rational $\varepsilon$, there is an interval in $\mathcal{F}$ that has length less than $\varepsilon$.

Perhaps it would be more accurate to refer to a family as being "arbitrarily" fine, but we have opted for the shorter form, which is easier to say and read. For clarity we will sometimes refer to fine families as families having arbitrarily fine intervals.

Note also that the rationals $\varepsilon$ are positive. We don't expect to find intervals of length 0, since these would give absolutely accurate estimates. For example, measuring $\sqrt{2}$ by an interval of length 0 would give an estimate $[r, r]$, which would be saying that $\sqrt{2}$ is $r$, a rational number. However, it has been known for more than 2000 years that $\sqrt{2}$ cannot be rational. Thus, for fineness we only demand that intervals can be found of length less than $\varepsilon$ for each $\varepsilon > 0$.

The Greek letter $\varepsilon$, called "epsilon," has long been used by scientists and mathematicians to denote a small number. It doesn't have to be small, but that's the tradition.

Finally, this definition could be rephrased by substituting the word "any" or "every" for "each." When mathematicians say they can do something for any positive number, they really mean for each or every positive number. The words "any," "every," and "each" are usually used synonymously in this context. For example, mathematicians say "The number $N$ is square-free if, for any prime $p$ dividing $N$, $p^2$ does not divide $N$." "Any" means "each" or "every" in this sentence. We chose "each" in the definition of fine family above to avoid giving the impression, upon first reading, that finding an interval of length less than $\varepsilon$ for just any (one) value of $\varepsilon$ was good enough. Now that this is explained, we will use "any", "every", and "each" interchangeably when it is appropriate.

Example 1.3.2 Let $\mathcal{F}$ be the family of all intervals $[1/n^2, 3/n]$, $n = 1, 2, \ldots$ Suppose some (any) $\varepsilon > 0$ is given. How can we produce an interval of length less than $\varepsilon$? The length of the $n$th interval is $3/n - 1/n^2 < 3/n$. Thus, if we make $3/n < \varepsilon$, the length of the interval will be $< \varepsilon$ as well.
All we need then (solving the inequality for $n$) is to have $n > 3/\varepsilon$. So this tells us that, for any $\varepsilon > 0$, if we choose $n$ so that $n > 3/\varepsilon$, the interval $[1/n^2, 3/n]$ will have length $< \varepsilon$. As a specific example, had $\varepsilon$ actually been, say, $1/1500$, we would need $n > 3/(1/1500) = 4500$. Thus, the interval $[1/4501^2, 3/4501]$ would meet the challenge.

We have seen previously that $\mathcal{F} < \mathcal{G} < \mathcal{H}$ need not imply that $\mathcal{F} < \mathcal{H}$; however, when $\mathcal{G}$ is consistent, the implication does hold. The next few results show that similar familiar properties of inequalities, which may be false in general, even for consistent families, are true with a fineness assumption. These propositions also demonstrate how the fineness assumption is used in proofs.
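The recipe "choose $n > 3/\varepsilon$" can be run for several values of $\varepsilon$. The following is a small sketch, assuming the family of intervals $[1/n^2, 3/n]$ from the example; `interval`, `length`, and `index_for` are hypothetical helper names, and `index_for` simply returns the smallest integer strictly greater than $3/\varepsilon$.

```python
from fractions import Fraction as Q
from math import floor

def interval(n):
    """The n-th interval [1/n^2, 3/n] of the family."""
    return (Q(1, n * n), Q(3, n))

def length(iv):
    r, s = iv
    return s - r

def index_for(eps):
    """Smallest integer n with n > 3/eps."""
    return floor(3 / eps) + 1

eps = Q(1, 1500)
n = index_for(eps)
print(n, length(interval(n)) < eps)

# The same recipe works for any positive rational eps we care to try.
for e in [Q(1), Q(1, 10), Q(3, 7), Q(1, 10**6)]:
    assert length(interval(index_for(e))) < e
```

The function plays the role of a witness for fineness: given any tolerance, it names a specific interval of the family that is shorter than the tolerance.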

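Proposition 1.3.3 Suppose that $\mathcal{F}$ is fine. If $\mathcal{F} \le \mathcal{G}$ and $\mathcal{G} < \mathcal{H}$, then $\mathcal{F} < \mathcal{H}$.

Proof. Since $\mathcal{G} < \mathcal{H}$, we can find intervals $J = [u, v]$ in $\mathcal{G}$ and $K = [p, q]$ in $\mathcal{H}$ with $v < p$. Since $\mathcal{G} \not< \mathcal{F}$, for any interval $I = [r, s]$ in $\mathcal{F}$, we must have $v \not< r$, so $r \le v$. We have the following diagram.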

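[Diagram: a number line showing $I = [r, s]$ in $\mathcal{F}$ and $J = [u, v]$ in $\mathcal{G}$ with $r \le v$, and $K = [p, q]$ in $\mathcal{H}$ lying to the right of $J$; the gap between $v$ and $p$ has length $p - v$.]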

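From this picture, we see that we can make sure that $s$ lies to the left of $p$ by making the length of $[r, s]$ smaller than the gap, which is $p - v$. Because $\mathcal{F}$ is fine and $p - v > 0$, we can find an interval $I = [r, s]$ in $\mathcal{F}$ with $s - r < p - v$. Adding this inequality to the previous one, $r \le v$, gives $s < p$, so $I < K$, and hence $\mathcal{F} < \mathcal{H}$.

Remark 1.3.4 You can find a simple example to show that, without the fineness assumption, this result may be false, even for consistent families.

Proposition 1.3.5 Suppose that $\mathcal{F}$ is fine. If $\mathcal{F} \ge \mathcal{G}$ and $\mathcal{G} > \mathcal{H}$, then $\mathcal{F} > \mathcal{H}$.

The proof, similar to that of the previous proposition, is left as an exercise.

The next proposition not only uses fineness, but is our first application of the Wiggle Lemma (0.2.6), which says that if $r < q + \varepsilon$ for all $\varepsilon > 0$ then we must have $r \le q$.

Proposition 1.3.6 Suppose that $\mathcal{G}$ is fine. If $\mathcal{F} \le \mathcal{G}$ and $\mathcal{G} \le \mathcal{H}$, then $\mathcal{F} \le \mathcal{H}$.

Proof. Choose any $I = [r, s]$ in $\mathcal{F}$ and $K = [p, q]$ in $\mathcal{H}$. By the characterization of inequalities in terms of upper and lower bounds (1.2.26), we must show that $r \le q$. Let $\varepsilon > 0$ be any rational number; we will show that $r \le q + \varepsilon$. Since $\mathcal{G}$ is fine, we can choose an interval $J = [u, v]$ in $\mathcal{G}$ such that $\ell(J) < \varepsilon$. Since $\mathcal{F} \le \mathcal{G}$ and $\mathcal{G} \le \mathcal{H}$, the same characterization of weak inequality for families implies that $r \le v$ and $u \le q$. We now have

$$r \le v = u + (v - u) \le q + (v - u) < q + \varepsilon.$$

Since $\varepsilon > 0$ was arbitrary, the Wiggle Lemma gives us $r \le q$.

Remark 1.3.7 This proof has some interesting ideas. Does drawing a diagram help understand it? Try closing the book and reproducing the proof on your own.

Corollary 1.3.8 Suppose that $\mathcal{G}$ is fine. If $\mathcal{F}$ is consistent with $\mathcal{G}$ and $\mathcal{G}$ is consistent with $\mathcal{H}$, then $\mathcal{F}$ is consistent with $\mathcal{H}$.

Proof. In Proposition 1.2.27 we proved that if $\mathcal{F}$ is consistent with $\mathcal{G}$ and $\mathcal{G}$ is consistent with $\mathcal{H}$, then $\mathcal{F} \le \mathcal{G}$ and $\mathcal{G} \le \mathcal{H}$. Therefore, by the previous proposition, $\mathcal{F} \le \mathcal{H}$. Using similar reasoning, $\mathcal{H} \le \mathcal{F}$ because $\mathcal{H} \le \mathcal{G}$ and $\mathcal{G} \le \mathcal{F}$. By combining these two conclusions, a final use of the characterization of consistency in terms of $\le$ gives the desired result: $\mathcal{F}$ is consistent with $\mathcal{H}$.

Corollary 1.3.9 If $\mathcal{F}$ is consistent with a family that is fine, then $\mathcal{F}$ is a consistent family.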


Proof. Take $\mathcal{G}$ to be the fine family and let $\mathcal{H} = \mathcal{F}$ in the previous corollary.

Notation 1.3.10 For each rational number $c$, we will, if there is no danger of confusion, also use $c$ to denote the family of intervals whose sole member is $[c, c]$; that is, $c$ will also denote $\{[c, c]\}$.

Remark 1.3.11 For each $c$, the associated family $\{[c, c]\}$ is obviously consistent and fine, since it has only the one interval of zero length.

According to the notation just introduced, if $\mathcal{F}$ is any family of intervals, and $c$ and $d$ are rational numbers, then “$\mathcal{F} < d$” is a paraphrase for “there is at least one interval in $\mathcal{F}$, say $I = [r, s]$, with $s < d$,” or equivalently, “$d$ is bigger than some upper bound of $\mathcal{F}$.” Similarly, “$c < \mathcal{F}$” can be rephrased as “$c$ is smaller than some lower bound of $\mathcal{F}$.” After similar unravelling, the weak inequality $\mathcal{F} \le b$ amounts to saying that $b$ is greater than or equal to every lower bound of $\mathcal{F}$. A somewhat more general case has already been discussed in Proposition 1.2.26. The next result is very important and will be used many times in what follows.

Proposition 1.3.12 Let $\mathcal{F}$ be any family of intervals, and $[a, b]$ a rational interval. Then $a \le \mathcal{F} \le b$ if and only if $[a, b]$ intersects each interval of $\mathcal{F}$.

The proof is left as an exercise in “unravelling” that everyone should do. Since any interval from a consistent family meets every other interval in that family, we immediately deduce the following consequence.

Corollary 1.3.13 If $\mathcal{F}$ is a consistent family of intervals and $I = [r, s]$ belongs to $\mathcal{F}$, then $r \le \mathcal{F} \le s$.

This corollary also explains, at least for consistent families, why we call $r$ and $s$ lower and upper bounds of $\mathcal{F}$ when $[r, s]$ belongs to $\mathcal{F}$ (see Definition 1.2.25).

Corollary 1.3.14 If $\mathcal{F}$ is consistent and fine, and $[a, b]$ and $[c, d]$ both meet every interval of $\mathcal{F}$, then $[a, b]$ and $[c, d]$ meet each other.

Proof. Observe that $\{[a, a]\} \le \{[d, d]\}$ and $\{[c, c]\} \le \{[b, b]\}$ by applying Proposition 1.3.6 as well as 1.3.12. Thus $a \le d$ and $c \le b$, which is exactly what it means for the two intervals to meet.

If a family is not consistent, adding more intervals cannot help.
However, if it is consistent and fine, then we can construct a maximal consistent family containing it. The proposition we just proved and its corollaries tell us exactly what this family must be.

Definition 1.3.15 If $\mathcal{F}$ is a fine and consistent family, define $\overline{\mathcal{F}}$ to be the family of all $[a, b]$ that intersect every member of $\mathcal{F}$ (or, equivalently, all $[a, b]$ such that $a \le \mathcal{F} \le b$).


That $\overline{\mathcal{F}}$ is consistent follows directly from the second corollary above. Note that since $\overline{\mathcal{F}}$ contains every interval of $\mathcal{F}$, $\overline{\mathcal{F}}$ will be fine since $\mathcal{F}$ is assumed to be. $\overline{\mathcal{F}}$ is maximal in the sense that if $\mathcal{K}$ is consistent and contains all the intervals of $\mathcal{F}$, then $\mathcal{K}$ is contained in $\overline{\mathcal{F}}$. (Verification of this is left as an exercise.) The family $\overline{\mathcal{F}}$ will be used later to standardize the definitions of certain numbers. Note also that $\mathcal{F}$ is consistent with $\overline{\mathcal{F}}$, and that $\mathcal{F}$ is consistent with $\mathcal{G}$ if and only if $\overline{\mathcal{F}}$ is consistent with $\overline{\mathcal{G}}$. See the exercises for these and other properties of $\overline{\mathcal{F}}$.

Example 1.3.16 Let $\mathcal{C}2$ be the family of all $[a, b]$ such that $a^3 \le 2 \le b^3$. Then $\mathcal{C}2$ (which stands for “cube root of 2”) is a fine and consistent family. Furthermore, $\overline{\mathcal{C}2} = \mathcal{C}2$. For details, see the exercises for this section.

Next we check how fineness behaves when we do arithmetic on families. We will now see the importance of the computations we made for the lengths of sums, products, etc. of intervals at the beginning of this chapter.

Proposition 1.3.17 If both $\mathcal{F}$ and $\mathcal{G}$ are fine families of intervals, then so are the families $\mathcal{F} + \mathcal{G}$, $-\mathcal{F}$, $|\mathcal{F}|$, $\max(\mathcal{F}, \mathcal{G})$, and $\min(\mathcal{F}, \mathcal{G})$.

Proof. For $-\mathcal{F}$, $|\mathcal{F}|$, $\max(\mathcal{F}, \mathcal{G})$, and $\min(\mathcal{F}, \mathcal{G})$, the result is a direct consequence of the length estimates (for intervals) computed for $-I$ (1.1.24), $|I|$ (1.1.30), and $\max(I, J)$ and $\min(I, J)$ (1.1.34). It remains to consider $\mathcal{F} + \mathcal{G}$. Let $\varepsilon$ be any positive rational, and let $a$ and $b$ be any positive rationals whose sum is 1 (e.g., take both $a$ and $b$ to be $1/2$). Then we have $a\varepsilon + b\varepsilon = \varepsilon$. To get an interval in $\mathcal{F} + \mathcal{G}$ of length $< \varepsilon$, take any $I \in \mathcal{F}$ and $J \in \mathcal{G}$ such that $\ell(I) < a\varepsilon$ and $\ell(J) < b\varepsilon$. Then, by the length formula for $I + J$ (1.1.23), $I + J$ is an interval in $\mathcal{F} + \mathcal{G}$ whose length is less than $a\varepsilon + b\varepsilon = \varepsilon$.

It is not too difficult to find examples of fine families whose products and reciprocals are not fine (see the exercises at the end of this section). In these examples the missing property is consistency. To see this, we first establish the following somewhat technical results.
Proposition 1.3.18 Suppose $\mathcal{F}$ contains arbitrarily small subintervals of a single interval $X$ and $\mathcal{G}$ contains arbitrarily small subintervals of some interval $Y$. Then $\mathcal{F}\mathcal{G}$ contains arbitrarily small subintervals of $XY$.

Proposition 1.3.19 Suppose $\mathcal{F}$ contains arbitrarily small subintervals of $X$, and either $X > 0$ or $X < 0$. Then $1/\mathcal{F}$ contains arbitrarily small subintervals of $1/X$.

Here are the proofs of these propositions.

Proof. We do products first. Take symmetric intervals $[-S, S]$ containing $X$, and $[-T, T]$ containing $Y$ (e.g., for $X = [a, b]$, set $S = \max(b, -a)$). Then take any positive rationals $p$ and $q$ with $p + q = 1$. For each positive rational $\varepsilon$ we then have $p\varepsilon + q\varepsilon = \varepsilon$. By the length estimate for $IJ$ (1.1.27), for any $I \in \mathcal{F}$ with $I \subseteq X$ and any $J \in \mathcal{G}$ with $J \subseteq Y$, we have

$$\ell(IJ) \le T\,\ell(I) + S\,\ell(J).$$


So if we take such $I$ and $J$ with $\ell(I) < p\varepsilon/T$ and $\ell(J) < q\varepsilon/S$, then $IJ$ is a subinterval of $XY$ that belongs to $\mathcal{F}\mathcal{G}$ and

$$\ell(IJ) < T\,\frac{p\varepsilon}{T} + S\,\frac{q\varepsilon}{S} = \varepsilon.$$

Now we look at reciprocals. Suppose $X = [a, b] > 0$. By the length estimate for $1/I$ (1.1.25), if $I$ is contained in $X$, then

$$\ell\!\left(\frac{1}{I}\right) \le \frac{\ell(I)}{a^2}.$$

Therefore, given any positive rational $\varepsilon$, if we take any $I \in \mathcal{F}$ that is a subinterval of $X$ with $\ell(I) < a^2\varepsilon$, then $1/I$ is an interval in $1/\mathcal{F}$ that is a subinterval of $1/X$ with $\ell(1/I) < \varepsilon$. The case where $X < 0$ involves just a simple modification of this argument.

The intervals $X$ and $Y$ required in these propositions can always be found when the families are both fine and consistent, as shown by this lemma.

Lemma 1.3.20 Let $\mathcal{H}$ be any fine and consistent family. Then, for each $I = [r, s]$ in $\mathcal{H}$ and any positive rational $c$, the family $\mathcal{H}$ contains arbitrarily small intervals that are subintervals of $Z = [r - c, s + c]$. When $\mathcal{H} > 0$ we can choose $Z > 0$; when $\mathcal{H} < 0$ we can choose $Z < 0$.

Proof. Because $\mathcal{H}$ is consistent, every $J \in \mathcal{H}$ intersects $I$. Hence, if the length of $J$ is less than $c$, $J$ must be contained in $Z$. (Check this.)

Combining what we now know, we see that the basic arithmetic operations behave well on families that are both consistent and fine. In summary, then, we state the following.

Proposition 1.3.21 Suppose that both $\mathcal{F}$ and $\mathcal{G}$ are fine and consistent families. Then so are the families $\mathcal{F} + \mathcal{G}$, $-\mathcal{F}$, $\mathcal{F}\mathcal{G}$, $|\mathcal{F}|$, $\max(\mathcal{F}, \mathcal{G})$, $\min(\mathcal{F}, \mathcal{G})$, and $1/\mathcal{F}$ (provided either $\mathcal{F} > 0$ or $\mathcal{F} < 0$).
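The length estimate used in the proof of Proposition 1.3.18 can be spot-checked numerically. The sketch below is only an illustration under a stated assumption: the product of two intervals is taken to be the interval spanned by the four endpoint products, which is the usual interval-arithmetic definition (the text's own definition in Section 1.1 is not reproduced here). The names `mul`, `length`, and `bound_holds` are hypothetical.

```python
from fractions import Fraction
from itertools import product

def mul(I, J):
    """Product interval, assumed to be spanned by the four endpoint products."""
    ps = [x * y for x, y in product(I, J)]
    return (min(ps), max(ps))

def length(I):
    return I[1] - I[0]

def bound_holds(I, J, S, T):
    """Check l(IJ) <= T*l(I) + S*l(J) for I inside [-S, S] and J inside [-T, T]."""
    assert -S <= I[0] <= I[1] <= S and -T <= J[0] <= J[1] <= T
    return length(mul(I, J)) <= T * length(I) + S * length(J)

I = (Fraction(-1, 2), Fraction(1, 3))   # a subinterval of X = [-1, 2], so S = 2
J = (Fraction(2), Fraction(5, 2))       # a subinterval of Y = [1, 3], so T = 3
print(length(mul(I, J)), bound_holds(I, J, S=2, T=3))
```

A finite check like this proves nothing, of course; it only makes the shape of the estimate concrete.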

Exercises

1. Let $\mathcal{S}$ be the family of all intervals $\left[\frac{1}{3n+1}, \frac{1}{n+1}\right]$, $n = 1, 2, \ldots$ Show that $\mathcal{S}$ is fine. Is it consistent?

2. Let $\mathcal{T}$ be the family of all intervals $\left[\frac{1}{n}, \frac{n+2}{n+1}\right]$, $n = 1, 2, \ldots$ Is $\mathcal{T}$ fine? Consistent?

3. Here are some examples to show that the product and reciprocal of fine families needn’t be fine.


(a) Let $\mathcal{M}$ be the family of all intervals $[N, N + 1/N]$, $N = 1, 2, 3, \ldots$ Show that $\mathcal{M}$ is fine but $\mathcal{M} \cdot \mathcal{M}$ is not. Note that $\mathcal{M}$ is not consistent.

(b) Let $\mathcal{R}$ be the family of all intervals of the form $[1/N, 2/N]$, $N = 1, 2, 3, \ldots$ Show that $\mathcal{R}$ is fine but $1/\mathcal{R}$ is not. Note that $\mathcal{R}$ is not consistent.

4. Suppose that the family $\mathcal{W}$ has only a finite number of intervals, each having positive (non-zero) length. Can $\mathcal{W}$ be fine? Explain.

5. (See previous exercise) Suppose $\mathcal{F}$ is a fine family with only finitely many intervals. What can you conclude about $\mathcal{F}$?

6. Given families $\mathcal{F}, \mathcal{G}, \mathcal{H}$ with $\mathcal{F} \ge \mathcal{G}$, $\mathcal{G} > \mathcal{H}$ and $\mathcal{F}$ fine, prove that $\mathcal{F} > \mathcal{H}$. (Hint: the proof uses exactly the same idea as the proof of Proposition 1.3.3.)

7. Recall that when $\mathcal{F}$ is fine and consistent, we define $\overline{\mathcal{F}}$ to be the family of all intervals which intersect every member of $\mathcal{F}$, that is, $\overline{\mathcal{F}} = \{K \mid K \text{ intersects every } I \in \mathcal{F}\}$.

(a) Prove that $\overline{\mathcal{F}}$ is itself fine and consistent.

(b) Prove that $\mathcal{F}$ is consistent with $\overline{\mathcal{F}}$.

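(c) Prove that if $\mathcal{F}$ is consistent with $\mathcal{G}$, then $\overline{\mathcal{F}}$ is consistent with $\overline{\mathcal{G}}$.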

(d) Show that $\overline{\mathcal{F}}$ is maximal in the sense that if $\mathcal{K}$ is any other consistent family which contains all the intervals of $\mathcal{F}$, then every interval of $\mathcal{K}$ is an interval of $\overline{\mathcal{F}}$.

8. The following is an important proposition left unproved in the text; prove it.

Proposition. Let $\mathcal{F}$ be any family of intervals, and $[a, b]$ a rational interval. Then $a \le \mathcal{F} \le b$ if and only if $[a, b]$ intersects each interval of $\mathcal{F}$.

9. Use the proposition of the previous exercise to show that $\overline{\mathcal{F}}$ is the family of all $[a, b]$ such that $a \le \mathcal{F} \le b$ (when $\mathcal{F}$ is fine and consistent).

10. Let $\mathcal{C}2$ be the family of all $[a, b]$ such that $a^3 \le 2 \le b^3$.

(a) Prove that $\mathcal{C}2$ is consistent.

(b) Prove that $\mathcal{C}2$ is fine. (Outline: To prove that there are arbitrarily small intervals in $\mathcal{C}2$, first check that $[1, 2]$ belongs to $\mathcal{C}2$, and then show that if $[a, b]$ is in $\mathcal{C}2$ and $m = (a + b)/2$ (i.e. $m$ is the midpoint of $[a, b]$), then either $[a, m]$ or $[m, b]$ also belongs to $\mathcal{C}2$. Each of the intervals $[a, m]$ and $[m, b]$ is half the length of the original interval $[a, b]$. We can repeat this process over and over, getting intervals each having half the length of the previous one. We call this process a bisection method for getting arbitrarily small intervals in the family $\mathcal{C}2$.)

(c) Prove that if $r$ is a rational number which belongs to every interval in $\mathcal{C}2$, then $r^3 = 2$. (Hint: First note that $r^3$ and 2 lie in every interval $[a^3, b^3]$. Next, show that $\mathcal{C}2$ contains intervals $[a, b]$ such that $b^3 - a^3$ is arbitrarily small, using $b^3 - a^3 = (b - a)(b^2 + ab + a^2)$; you may assume $[a, b] \subseteq [1, 2]$. Conclude that for any $\varepsilon > 0$, $r^3 \le 2 + \varepsilon$ and $2 \le r^3 + \varepsilon$.)


(d) In the previous exercise, we proved that 2 lies in every interval $[a^3, b^3]$ where $[a, b] \in \mathcal{C}2$. These intervals $[a^3, b^3]$ lie in $(\mathcal{C}2)(\mathcal{C}2)(\mathcal{C}2) = (\mathcal{C}2)^3$. However, a typical interval in $(\mathcal{C}2)^3$ looks like $IJK$ where $I, J, K$ are intervals in $\mathcal{C}2$. Show that 2 lies in all of these as well. (Hint: If these intervals are all $> 0$ then $IJK = [rup, svq]$ ($I = [r, s]$ etc.). Show that $2 \in IJK$ by using $a = \max(r, u, p)$ and $b = \min(s, v, q)$. When $I, J, K$ are not all $> 0$, use the following: Lemma. If $I = [r, s]$ is in $\mathcal{C}2$ and $r \le 0$, then $I \supseteq I' = [1, s]$ and $I'$ is in $\mathcal{C}2$.)

(e) Prove that no rational number $r$ can satisfy $r^3 = 2$. (Hint: This is a simple modification of the classical proof that $\sqrt{2}$ is irrational; look it up if you haven’t seen it.) Deduce that there is no rational number common to all of the intervals in $\mathcal{C}2$.

(f) Work out some sample intervals. Start with $[1, 2]$ and apply bisection several times to find an interval $[a, b] \in \mathcal{C}2$ with $\ell([a, b]) < \frac{1}{100}$.

11. There is a better method for finding square roots than bisection. There is evidence that this method, called “Divide-and-Average,” was known to the ancient Babylonians. Let $q > 0$ be a rational number. First, we observe that if $x > 0$ and $x^2 \le q$ then $\left(\frac{q}{x}\right)^2 \ge q$. Suppose we have constructed $D_k = \left[x, \frac{q}{x}\right]$, and let $y = \frac{1}{2}\left(x + \frac{q}{x}\right)$ (we have divided $q$ by $x$ and averaged $q/x$ with $x$). Now let $D_{k+1} = [\min(y, q/y), \max(y, q/y)] = \left[x', \frac{q}{x'}\right]$, where $x'$ is either $y$ or $q/y$. Starting with $x = 1$, we get a family of rational intervals $D_k$. This family is consistent and fine. Prove consistency by observing that $x^2 \le q \le (q/x)^2$. To prove fineness, show that if $L = \ell(D_k) = |x - q/x|$, then $\ell(D_{k+1}) = |y - q/y| \le \frac{1}{2}L^2$. Thus, once these lengths are less than, say, $1/2$, they get small very quickly: they are less than $1/2$, $1/4$, $1/16$, $1/256$, $1/65536$, etc. This is sometimes called quadratic convergence.
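To get a feel for how quickly Divide-and-Average shrinks its intervals, the construction in exercise 11 can be run with exact fractions. This is a minimal sketch under the exercise's own recipe; the function name `divide_and_average` and the choice $q = 5$ are illustrative assumptions, not part of the text.

```python
from fractions import Fraction

def divide_and_average(q, steps):
    """Return the first steps+1 intervals [x, q/x] of the Divide-and-Average family."""
    q = Fraction(q)
    x = Fraction(1)                       # the exercise starts with x = 1
    intervals = [(min(x, q / x), max(x, q / x))]
    for _ in range(steps):
        y = (x + q / x) / 2               # divide q by x, then average with x
        lo, hi = min(y, q / y), max(y, q / y)
        intervals.append((lo, hi))
        x = lo                            # x' is whichever of y, q/y is the left endpoint
    return intervals

for lo, hi in divide_and_average(5, 4):
    print(lo, hi, float(hi - lo))
```

Each printed interval has left endpoint with square at most 5 and right endpoint with square at least 5, and the lengths collapse far faster than the halving produced by bisection.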

1.4

Definition of the Reals

We wish to capture the idea that real numbers are measured by rationals. For any particular real number, these measurements must be consistent (we don’t want to be measuring two different things); also, they must be arbitrarily fine, or we don’t have enough information to determine the real number uniquely. But what are we measuring? Since all we have at our disposal are the rational numbers, we must actually create the object of measurement from the measurements themselves. This is captured in the following fundamental definition.

Definition 1.4.1 (Real Number) A real number is a fine and consistent family of rational intervals. We use the symbol $\mathbb{R}$ to denote the reals, and the expression $A \in \mathbb{R}$ means that $A$ is a real number.

The fineness assumption in this definition tells us that we have as much accuracy as possible. Then, if two real numbers $A$ and $B$ cannot be distinguished from each other by a pair of measurements $I < J$ or $J < I$ (where $I$ and $J$ are intervals in $A$ and $B$ respectively), then they cannot be distinguished. But rejecting these inequalities means saying $A \le B$ and $B \le A$, i.e. that $A$ is consistent with $B$.

Definition 1.4.2 (Equality of Reals) $A = B$ as real numbers means that $A$ is consistent with $B$ as families of intervals.

In order to avoid confusion, for a while we will denote real numbers using capital letters $A$, $B$, $C$, and will continue to employ the notations $A < B$, $A > B$, $A \le B$ and $B \ge A$ in the way that was defined for families of intervals (these relations for reals are defined to be exactly the same relations for the reals when considered as families of rational intervals). The following properties of equality and order follow directly from propositions in the previous section about consistency and the relations $<$ and $\le$.

Proposition 1.4.3 Suppose $A$, $B$, $C$ and $D$ are real numbers. Then
1. $A = A$ (“Reflexivity”).
2. $A = B$ if and only if $B = A$ (“Symmetry”).
3. If $A = B$ and $B = C$, then $A = C$ (“Transitivity”).
4. $A = B$ if and only if $A \le B$ and $B \le A$.
5. If $A < B$, then $A \le B$ (i.e. $A \not> B$).
6. If $A < B < C$, then $A < C$.
7. If $A \le B$ and $B < C$, then $A < C$.
8. If $A \ge B$ and $B > C$, then $A > C$.
9. If $A \le B$ and $B \le C$, then $A \le C$.
10. If $A = B < C = D$, then $A < D$.
11. If $A = B \le C = D$, then $A \le D$.
(Note that the last two parts follow from the preceding parts.)

Exercise 1.4.4 For each of the above properties, state the property for families, already proved, which implies it.

Those reals $A$ for which the reciprocal $1/A$ exists are those for which $A > 0$ or $A < 0$ (see Definition 1.2.12). We have a special notation for these.

Definition 1.4.5 $A \in \mathbb{R}^{*}$ means that $A \in \mathbb{R}$ and either $A < 0$ or $A > 0$.


Just as we were able to consider the integers $\mathbb{Z}$ as part of the rationals $\mathbb{Q}$ by the association $N \to N/1$, we can consider the rationals $\mathbb{Q}$ as part of the reals $\mathbb{R}$. To see how, consider the operation that associates to each rational $c$ the simple family consisting of the single interval $[c, c]$. As we have observed, this family is clearly consistent and fine, and the association $c \to \{[c, c]\}$ enables us to consider the rationals as real numbers. On the other hand, we could also identify $c$ with the maximal family $\overline{c}$, which in this case turns out to be simply the family of all rational intervals containing $c$. It is easy to see, however, that these two identifications, or embeddings, produce equal real numbers since $c$ is consistent with $\overline{c}$. We can also think of these identifications as functions mapping $\mathbb{Q}$ into $\mathbb{R}$.

Suppose that $A$ is a real number and that the (rational) interval $[r, s]$ belongs to $A$. Then, since real numbers are consistent families, Proposition 1.3.12 tells us that $r \le A \le s$. Thus, the left-hand endpoints of intervals in the family $A$ are really lower bounds for $A$, and the right-hand endpoints are upper bounds. Conversely, suppose that $[r, s]$ is a rational interval with endpoints $r$ and $s$ satisfying this inequality. Then $[r, s]$ intersects every interval in $A$ by the same proposition. Now $[r, s]$ may not belong to $A$, but we can enlarge $A$ by adding $[r, s]$, without changing $A$ as a real number (since the new family will be consistent with $A$); that is, the new family will be a real number equal to $A$ and containing $[r, s]$. We can continue enlarging $A$ in this way to form $\overline{A}$, the family of all rational intervals which meet every interval of $A$. By Corollary 1.3.14 we know that $\overline{A}$ is consistent, and it is also fine since it contains $A$ (which is fine). In fact, $A$ is consistent with $\overline{A}$ as families, so $A = \overline{A}$ as real numbers. (For more about $\overline{A}$ see exercise 7 of Section 1.3.) For any bounds $r \le A \le s$, the interval $[r, s]$ may not be in $A$ but it is in $\overline{A}$.
In the previous section (see 1.3.16) we constructed the family $\mathcal{C}2$ (cube root of 2): $\mathcal{C}2$ = family of all rational intervals $[a, b]$ such that $a^3 \le 2 \le b^3$. In fact, the following properties of $\mathcal{C}2$ appear as exercise 10 of Section 1.3.
• $\mathcal{C}2$ is fine and consistent.

• 2 belongs to every interval of $(\mathcal{C}2)^3 = (\mathcal{C}2)(\mathcal{C}2)(\mathcal{C}2)$.

The first property tells us that $\mathcal{C}2$ is a real number, and the second tells us that $[2, 2]$ meets every interval of $(\mathcal{C}2)^3$, thus $(\mathcal{C}2)^3$ is consistent with $\{[2, 2]\}$. So, as real numbers, $(\mathcal{C}2)^3 = 2$, and we are justified in writing $\mathcal{C}2 = \sqrt[3]{2}$. We have constructed a number we didn’t have before, since there is no rational whose cube is 2. (The exercises for this section contain several other examples of “new” real numbers.)

It is clear that we could mimic this construction and compute various roots of rational numbers. For example, we could construct $\sqrt[6]{19}$ by looking at all rational intervals $[a, b]$ with the properties that $a > 0$ and $a^6 \le 19 \le b^6$. (We need $a > 0$ to make sure we are getting the positive root; without it, the family wouldn’t be consistent either, since we would be looking at the positive and negative root simultaneously.) What we are doing is constructing inverses, one value at a time, for functions of the form $x^N$. In the next chapter we will see a much better way of constructing inverses for functions.
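The bisection idea outlined in the exercises of the previous section gives a concrete way to produce small members of this family. The sketch below is only an illustration: the names `in_C2` and `bisect_C2` are hypothetical, and intervals are again pairs of `Fraction` values.

```python
from fractions import Fraction

def in_C2(a, b):
    """Membership test for the cube-root-of-2 family: a^3 <= 2 <= b^3."""
    return a ** 3 <= 2 <= b ** 3

def bisect_C2(tol):
    """Halve [1, 2] repeatedly, staying inside the family, until the length is < tol."""
    a, b = Fraction(1), Fraction(2)
    while b - a >= tol:
        m = (a + b) / 2
        if m ** 3 <= 2:
            a = m          # [m, b] still satisfies m^3 <= 2 <= b^3
        else:
            b = m          # otherwise m^3 > 2, so [a, m] is in the family
    return a, b

a, b = bisect_C2(Fraction(1, 100))
print(a, b, b - a, in_C2(a, b))
```

The same loop with `m ** 6 <= 19` and a positive starting interval would sketch the sixth root of 19 mentioned above.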


Exercises

1. Let us represent the real number 2 by the family $\{[2, 2]\}$ and 3 by $\{[3, 3]\}$.

(a) Show that $2 < 3$ and $2 \le 3$.

(b) Let $\mathcal{T}$ be the family of intervals $[2 - 1/N, 2 + 1/N]$ for $N = 1, 2, \ldots$ Show that $\mathcal{T}$ is a real number and, as a real number, $\mathcal{T} = 2$. Similarly, look at the family $\mathcal{H}$ of intervals $[3 - 1/N, 3 + 1/N]$. Prove that $\mathcal{T} < \mathcal{H}$.

(c) Write down some of the intervals in the family $\mathcal{T}/\mathcal{H}$, and prove that $\{[2/3, 2/3]\}$ is consistent with $\mathcal{T}/\mathcal{H}$, so that $\mathcal{T}/\mathcal{H} = 2/3$ as real numbers.

2. (Project) The following famous formula is often derived in calculus courses:

$$\ln 2 = 1 - \frac{1}{2} + \frac{1}{3} - \frac{1}{4} + \cdots = \sum_{k=1}^{\infty} (-1)^{k+1} \frac{1}{k}.$$

In fact, proving this is not at all trivial, as we will see later when we take up infinite series. However, with this formula as motivation, let’s define the following sequence of partial sums:

$$s_0 = 0, \qquad s_1 = 1, \qquad s_2 = 1 - \frac{1}{2}, \qquad \ldots, \qquad s_k = 1 - \frac{1}{2} + \frac{1}{3} - \cdots + (-1)^{1+k} \frac{1}{k},$$

and in general ($n \ge 1$),

$$s_{2n} = s_{2n-1} - \frac{1}{2n}, \qquad s_{2n+1} = s_{2n} + \frac{1}{2n+1}.$$

It is clear from the last of these equations that, for each $k \ge 0$, $s_{2k} \le s_{2k+1}$. We can therefore define, for $k = 0, 1, \ldots$, the intervals $I_k = [s_{2k}, s_{2k+1}]$.

(a) Write down some of these intervals explicitly. Also, sketch some of these intervals on a single line representing the rational numbers.

(b) Show that, for each $k$, $I_{k+1} \subseteq I_k$ and $\ell(I_k) = \frac{1}{2k+1}$.

(c) Let $\mathcal{L}$ be the family of all the intervals $I_k$, $k = 0, 1, 2, \ldots$ Given $\varepsilon > 0$, show how to find $k$ so that $\ell(I_k) < \varepsilon$. Show that $\mathcal{L}$ is a real number. (Note: for consistency, you have to show that $I_k$ and $I_n$ intersect. What, exactly, will be their intersection?) We might want to call $\mathcal{L}$ “$\ln 2$.”


(d) Similarly from calculus, $\frac{\pi}{4} = 1 - \frac{1}{3} + \frac{1}{5} - \frac{1}{7} + \cdots = \sum_{k=1}^{\infty} (-1)^{k+1} \frac{1}{2k-1}$. With this formula as motivation, we follow the same idea as above and define $t_0 = 0$, $t_1 = 1$, and $t_{2n} = t_{2n-1} - \frac{1}{4n-1}$, $t_{2n+1} = t_{2n} + \frac{1}{4n+1}$. Also, define the intervals $J_k = [t_{2k}, t_{2k+1}]$. Show that $J_{k+1} \subseteq J_k$, and find $\ell(J_k)$. Let $\mathcal{P}$ be the family of all the intervals $J_k$, $k = 0, 1, 2, \ldots$ Given $\varepsilon > 0$, show how to find $k$ so that $\ell(J_k) < \varepsilon$. $\mathcal{P}$ is a real number that we might call $\frac{\pi}{4}$.

(e) The real number $\ln 2 \cdot \frac{\pi}{4}$ is the product family $\mathcal{L} \cdot \mathcal{P}$. Find any interval in $\mathcal{L} \cdot \mathcal{P}$ that has length less than $\frac{1}{1000}$ (that is, find $m$ and $n$ so that $I_m \cdot J_n$ has length $< \frac{1}{1000}$, and prove it). Show how to find one of length less than $\varepsilon$. (Hint: Explain why all of these intervals are contained in $[0, 1]$ and then use Proposition 1.1.27, the estimate for $\ell(I \cdot J)$.)

(f) Prove, using the criterion for strict inequality of reals, that $\ln 2 < \pi/4$. This means you must find $m$ and $n$ such that $I_m < J_n$. (Note: you should not use decimals, though it may be handy to use a calculator to determine which rational intervals to compare. The Maple package will do fractional arithmetic very nicely.)

(g) There are a few cranks who have printed up (at their own expense) papers attempting to show that $\pi = \sqrt{10}$. Show that, with the definition above of $\frac{\pi}{4}$, $\pi^2 < 10$ (i.e. $\left(\frac{\pi}{4}\right)^2 < \frac{5}{8}$). This, of course, does not refute the cranks, since their definition of $\pi$ is geometric, but it may be fun anyway.

(h) (Generalization) Suppose $a_1, a_2, a_3, \ldots$ are positive rationals. Let $L_n = [s_{2n}, s_{2n+1}]$ where $s_0 = 0$, $s_1 = a_1$, and, for $n \ge 1$, $s_{2n} = a_1 - a_2 + \cdots - a_{2n}$ and $s_{2n+1} = a_1 - a_2 + \cdots - a_{2n} + a_{2n+1}$. What sort of condition will guarantee that $L_k \supseteq L_{k+1}$ for all $k$? This will make the family of intervals consistent (explain why). What sort of condition on the $a$’s will guarantee that the family is fine?

3. (Project) Using the family $\mathcal{C}2$, which we now call $\sqrt[3]{2}$ (see Section 1.4), as a model, show how you can construct $N$th roots ($N = 2, 3, \ldots$) of any positive rational number $r$. Note that you have to be careful with even roots to make the family positive.

4. Show that the family $D_k$ obtained by “Divide-and-Average” (see exercise 11 of Section 1.3) has $\sqrt{q}$ as its limit. Write down some of the Divide-and-Average family for calculating $\sqrt{5}$. Divide-and-Average is a special case of Newton’s Method, which, when it works, exhibits the same quadratic convergence.
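The intervals in the project on $\ln 2$ and $\pi/4$ are easy to generate with exact fractions, which may help in choosing which intervals to compare in part (f). The following is a hedged sketch rather than a solution: the function names are made up, and the search routine simply tries index pairs in order of increasing $m + n$ until the right endpoint of $I_m$ falls strictly below the left endpoint of $J_n$.

```python
from fractions import Fraction

def ln2_interval(k):
    """I_k = [s_{2k}, s_{2k+1}], built from partial sums of 1 - 1/2 + 1/3 - ..."""
    s = sum((Fraction((-1) ** (j + 1), j) for j in range(1, 2 * k + 1)), Fraction(0))
    return (s, s + Fraction(1, 2 * k + 1))

def quarter_pi_interval(k):
    """J_k = [t_{2k}, t_{2k+1}], built from partial sums of 1 - 1/3 + 1/5 - ..."""
    t = sum((Fraction((-1) ** (j + 1), 2 * j - 1) for j in range(1, 2 * k + 1)), Fraction(0))
    return (t, t + Fraction(1, 4 * k + 1))

def separating_pair(limit=30):
    """Search for (m, n) with I_m < J_n, i.e. right end of I_m < left end of J_n."""
    for total in range(2 * limit):
        for m in range(total + 1):
            n = total - m
            if ln2_interval(m)[1] < quarter_pi_interval(n)[0]:
                return m, n
    return None

print(separating_pair())
print(ln2_interval(4), quarter_pi_interval(4))
```

Finding such a pair by machine only suggests which fractions to compare; the exercise still asks for the comparison to be verified by hand with rational arithmetic.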

1.5

Real Number Arithmetic

By “arithmetic,” mathematicians usually mean basic operations such as addition, multiplication, taking absolute value, etc. Different collections of mathematical objects have different arithmetics. So, for example, we may talk about the arithmetic of numbers, the arithmetic of vectors or the arithmetic of matrices. In this chapter we will see that the arithmetic of real numbers is very similar to that of the rationals in the sense that similar laws hold (e.g. associativity, commutativity, etc.). This is not too surprising, since the reals inherit their basic arithmetic (a-rith-met′-ic) operations from those defined on families of rational intervals. The arithmetic of rational intervals, as we have already seen, is derived from that of the rationals. There are some differences, however, mostly stemming from the fact that we can always tell when a rational has a reciprocal since we can always tell when a rational is “not zero.” This is not true for reals, where, in order to find the reciprocal of $A$ we have to determine, explicitly, that $A > 0$ or $A < 0$, that is, whether $A \in \mathbb{R}^{*}$. Except in certain specific examples, this requires proving a theorem (see the appendix to this chapter on “The Goldbach Number”).

Suppose that $A$ and $B$ are reals. Since $A$ and $B$ are fine and consistent families, so are $A + B$, $AB$, $-A$, $|A|$, $\max(A, B)$, and $\min(A, B)$. Thus, $A + B$, $AB$, $-A$, $|A|$, $\max(A, B)$, and $\min(A, B)$ are also real numbers. When $A \in \mathbb{R}^{*}$ (i.e. $A > 0$ or $A < 0$), then $1/A$ is also a real number.

Definition 1.5.1 The following operations on real numbers are performed by considering the real numbers as families: $A + B$, $AB$, $-A$, $|A|$, $\max(A, B)$, $\min(A, B)$, and $1/A$.

A real number can be represented in many ways by different families that are consistent with each other. For example, the real number $2 = \{[2, 2]\}$ is also represented by the family $\mathcal{T} = \left\{\left[2 - \frac{1}{n}, 2 + \frac{1}{n}\right],\ n = 1, 2, \ldots\right\}$. It is also represented by the maximal family $\overline{2}$ consisting of all $[r, s]$ containing the rational number 2. Does $3 + 2 = 3 + \mathcal{T}$ as real numbers, i.e. is it true that $\{[3, 3]\} + \{[2, 2]\}$ is consistent with $\{[3, 3]\} + \mathcal{T}$ as families? Similarly, is $1/2 = 1/\mathcal{T} = 1/\overline{2}$? It is clearly important to know that arithmetic operations performed on equal reals yield equal results.
Put another way, if we perform an arithmetic operation on families $\mathcal{F}$ and $\mathcal{G}$, and then perform the same operation, replacing $\mathcal{F}$ and $\mathcal{G}$ by families $\mathcal{F}'$ and $\mathcal{G}'$ consistent with them, do we get consistent answers? Fortunately we do; indeed, this is a fact we have already proved in Proposition 1.2.17. However, we will now rephrase it in the language of reals.

Proposition 1.5.2 If $A = A'$ and $B = B'$, then
1. $A + B = A' + B'$ (in particular, $A = A'$ implies $A + B = A' + B$).
2. $AB = A'B'$ (in particular, $A = A'$ implies $AB = A'B$).
3. When $A \in \mathbb{R}^{*}$ so is $A'$, and $1/A = 1/A'$.
4. $|A| = |A'|$
5. $\max(A, B) = \max(A', B')$
6. $\min(A, B) = \min(A', B')$
(For specific examples, see the exercises.)
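The question raised above, whether $3 + 2$ and $3 + \mathcal{T}$ agree, can be explored on finitely many intervals. The sketch below assumes the usual endpoint formulas for the sum and reciprocal of intervals and takes "consistent" to mean that every interval of one family meets every interval of the other; the helper names are hypothetical, and a finite check is of course no substitute for the proposition.

```python
from fractions import Fraction

def add(I, J):
    """Sum of two intervals: add left endpoints and right endpoints."""
    return (I[0] + J[0], I[1] + J[1])

def reciprocal(I):
    """Reciprocal of an interval lying strictly to one side of 0."""
    lo, hi = I
    assert lo > 0 or hi < 0, "reciprocal needs I > 0 or I < 0"
    return (1 / hi, 1 / lo)

def meets(I, J):
    """Two closed intervals intersect exactly when neither lies strictly left of the other."""
    return I[0] <= J[1] and J[0] <= I[1]

def T(n):
    """The n-th interval [2 - 1/n, 2 + 1/n] of the family representing 2."""
    return (2 - Fraction(1, n), 2 + Fraction(1, n))

three = (Fraction(3), Fraction(3))
five = (Fraction(5), Fraction(5))
half = (Fraction(1, 2), Fraction(1, 2))

sums_ok = all(meets(add(three, T(n)), five) for n in range(1, 200))
recips_ok = all(meets(reciprocal(T(n)), half) for n in range(1, 200))
print(sums_ok, recips_ok)
```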


Definition 1.5.3 (Subtraction and Division) $A - B$ means $A + (-B)$ and, for $B \in \mathbb{R}^{*}$, $A/B$ means $A(1/B)$.

Proposition 1.5.4 The following laws of arithmetic hold.
1. $A + B = B + A$
2. $A + (B + C) = (A + B) + C$
3. $A + 0 = A$
4. $A - A = 0$
5. $AB = BA$
6. $A(BC) = (AB)C$
7. $1 \cdot B = B$
8. If $A \in \mathbb{R}^{*}$, then $A/A = 1$
9. $A(B + C) = AB + AC$

Proof. These are proved by working on the interval level and using consistency; details are left for the exercises.

Corollary 1.5.5 (Additive Laws) The following additional laws of arithmetic hold.
10. If $A + C = B + C$, then $A = B$ (“Cancellation for Addition”).
11. $A = B$ if and only if $0 = B - A$.
12. $-(A + C) = (-A) + (-C)$
13. $-(-A) = A$.

Corollary 1.5.6 (Multiplicative Laws) The following additional laws of arithmetic hold.
14. If $C \in \mathbb{R}^{*}$ and $CA = CB$, then $A = B$ (“Cancellation for Multiplication”).
15. For $A \in \mathbb{R}^{*}$, $A = B$ if and only if $B/A = 1$.
16. If $AB \in \mathbb{R}^{*}$, then $A \in \mathbb{R}^{*}$ and $B \in \mathbb{R}^{*}$, and $1/(AB) = (1/A)(1/B)$.
17. For $A \in \mathbb{R}^{*}$, $1/(1/A) = A$.


Proof of both corollaries. These are consequences of the previous proposition, and need not be proved using intervals; details are left as exercises.

Before moving on, we should note that the two Cancellation properties listed in the corollaries above are extremely useful in proving that certain numbers are equal. For example, the cancellation rule for addition says that if two numbers become equal when the same third number is added to them, they were equal to begin with. This can be used to show that $-(A + B)$ equals $-A - B$, since these two numbers become equal (to 0) when we add $C = A + B$ to them.

We now turn our attention to inequalities and how they behave with respect to arithmetic operations. The following proposition is the key to relating inequalities of reals to these operations, since it enables us to put all the quantities involved on one side. It is most easily proved by going back to intervals and using the definitions of $<$ and $\le$; once again, the proof is left as a (straightforward) exercise.

Proposition 1.5.7 $A < B$ if and only if $0 < B - A$, and $A \le B$ if and only if $0 \le B - A$.

(If you are careful, you can deduce the equivalence of the weak inequalities from the equivalence of the strict ones; however, it may be easier to deduce it directly from the definition of $\le$.)

Corollary 1.5.8
1. $A < 0$ if and only if $0 < -A$.
2. $A > 0$ if and only if $0 > -A$.
3. $A + C < B + C$ if and only if $A < B$.
4. $B < 0$ if and only if $B + C < C$.
5. $B > 0$ if and only if $B + C > C$.
6. $B \le 0$ if and only if $B + C \le C$.
7. $B \ge 0$ if and only if $B + C \ge C$.

Proposition 1.5.9 If $B \ge 0$ and $C > 0$, then $B + C > 0$.

Proof. By part 7 of the previous corollary, $B + C \ge C$. But $C > 0$. Therefore, by Proposition 1.4.3 (part 8), $B + C > 0$.

Proposition 1.5.10 If $A < C$ and $B \le D$, then $A + B < C + D$.

Proof. $(C + D) - (A + B) = (C - A) + (D - B)$. Because $C - A > 0$ and $D - B \ge 0$, the result follows from the previous proposition.

Proposition 1.5.11 If $A + B > 0$, then $A > 0$ or $B > 0$.


Proof. If $A + B > 0$, we have intervals $[r, s]$ in $A$ and $[u, v]$ in $B$ such that $r + u > 0$. Then either $r > 0$ or $u > 0$. If $r > 0$, $A > 0$. If $u > 0$, $B > 0$.

Corollary 1.5.12 If $A + B < C + D$, then $A < C$ or $B < D$. If $A \le C$ and $B \le D$ then $A + B \le C + D$.

Proposition 1.5.13 If $[r, s]$ is in $A$ and $[u, v]$ is in $B$, then $r + u \le A + B \le s + v$.

Proof. This proposition follows from the previous corollary if we make use of Corollary 1.3.13, which tells us that $r \le A \le s$ and also that $u \le B \le v$. However, this proposition is also an immediate consequence of Corollary 1.3.13 itself, because we know that if $[r, s]$ belongs to $A$ and $[u, v]$ belongs to $B$, then $[r + u, s + v]$ belongs to the consistent family $A + B$.

Proposition 1.5.14 $0 < AB$ if and only if $0 < A$ and $0 < B$, or $A < 0$ and $B < 0$.

Proof. This is a consequence of the fact that for intervals, we have $IJ > 0$ if and only if $I > 0$ and $J > 0$, or $I < 0$ and $J < 0$. (For an interval $K$, we have either $K > 0$ or $K < 0$ or $K$ contains 0.)

Corollary 1.5.15 $CA < CB$ if and only if $0 < C$ and $A < B$, or $C < 0$ and $B < A$.

Proof. $CA < CB$ if and only if $0 < CB - CA = C(B - A)$, and we can apply the previous proposition.

Recall that $\mathbb{R}^{*}$ stands for the real numbers $X$ which are either positive ($X > 0$) or negative ($X < 0$).

Lemma 1.5.16 If $AB = 1$, then $A > 0$ or $A < 0$, and $B = 1/A$.

Proof. If $AB = 1$, then $AB > 0$. As we have just seen, this means that either $A > 0$ or $A < 0$. In either case, $1/A$ is defined, and multiplying both sides of the equation $AB = 1$ by it gives us that $B = 1/A$.

Definition 1.5.17 (Unit) $A$ is a unit means that $AB = 1$ for some $B$.

This next proposition, listing elementary properties of units, follows directly from previous results in this section; its proof is left as an exercise.

Proposition 1.5.18 Suppose that $A$ and $B$ are real numbers. Then
1. $A$ is a unit if and only if $A \in \mathbb{R}^{*}$.
2. $AB$ is a unit if and only if both $A$ and $B$ are units.
3. If $A + B$ is a unit, then $A$ is a unit or $B$ is a unit.


We now establish some formal properties of max, min and absolute value. (See 1.1.32 to review max and min.) Some of these properties are useful in deriving properties of absolute value. Their proofs tend to be a bit technical and one can skim them without too much guilt.

Proposition 1.5.19 $C > \max(A, B) \iff C > A$ and $C > B$.

Proof. ($\Rightarrow$) Suppose $C > \max(A, B)$. Then we have $I$ in $A$, $J$ in $B$ and $K$ in $C$ such that $K > \max(I, J)$. This implies that $K > I$ and $K > J$, so $C > A$ and $C > B$. ($\Leftarrow$) If $C > A$ and $C > B$, we have $I$ in $A$, $J$ in $B$, and $K'$ and $K''$ in $C$ with $K' > I$ and $K'' > J$. Let $K$ be either $K'$ or $K''$, whichever has the greater left endpoint. In either case, $K$ is in $C$ and we have $K > I$ and $K > J$, so $K > \max(I, J)$. Thus $C > \max(A, B)$. Here is a picture for the case where $K'$ has the greater left endpoint:

[Figure: the intervals $I$, $J$, $K'$ and $K''$ on a number line, with $K'$ having the greater left endpoint.]

(Note that we used neither consistency nor fineness in this proof, so the proposition actually holds for arbitrary families.)

Proposition 1.5.20 $C < \max(A, B) \iff C < A$ or $C < B$.

Proof. Straightforward.

The next proposition is useful in converting a proof about max into one about min, and vice versa.

Proposition 1.5.21 $\min(A, B) = -\max(-A, -B)$.

Proposition 1.5.22
1. $C < \min(A, B)$ if and only if $C < A$ and $C < B$.
2. $C > \min(A, B)$ if and only if $C > A$ or $C > B$.
3. $\min(A, B) \le A \le \max(A, B)$.

Proposition 1.5.23 $\max(A, B) = B$ if and only if $A \le B$.

Proof. ($\Rightarrow$) By part (3) of the preceding proposition, if $\max(A, B) = B$, then $A \le B$.

($\Leftarrow$) Suppose that $A \le B$. Since $B \le \max(A, B)$, we only have to show that $B \ge \max(A, B)$. This follows from Proposition 1.5.20 because $B \not< B$ and we are assuming that $B \not< A$ (i.e. $A \le B$).


Corollary 1.5.24 $\min(A, B) = A$ if and only if $A \le B$.

(This is a good example of the use of Proposition 1.5.21 in converting min's to max's.)

Now we turn to absolute value. Its properties are quite important for what follows in later chapters.

Proposition 1.5.25 $|A| = \max(A, -A)$.

(To prove this, recall that $|I| = \max(I, -I, 0)$ for intervals.)

Proposition 1.5.26
1. $|A| < B$ if and only if $-B < A < B$ (and similarly for $\le$).
2. $C < |A|$ if and only if $C < A$ or $C < -A$ (and similarly for $\le$).

Proof. For part (1) notice that $-A < B$ if and only if $-B < A$, so we have to show that $\max(A, -A) < B$ if and only if $A < B$ and $-A < B$. This is a direct consequence of Proposition 1.5.19. The proof of (2) is similar, using Proposition 1.5.20 with $B = -A$.

Corollary 1.5.27 If $A \le X \le B$ and $A \le Y \le B$, then $|X - Y| \le B - A$.

Proof. Add the inequalities $A \le Y \le B$ and $-B \le -X \le -A$.

The following form of part (1) of the previous proposition is most often used in doing limits and other analysis constructions; it will be used many times throughout the remainder of the book.

Corollary 1.5.28 $|X - A| < E$ if and only if $A - E < X < A + E$ (and similarly for $\le$).

Corollary 1.5.29
1. $|A| > 0$ if and only if $A > 0$ or $A < 0$.
2. $|A| \ge 0$.
3. $|A| = 0$ if and only if $A = 0$.

Proposition 1.5.30
1. If $A \ge 0$, then $|A| = A$. If $A \le 0$, then $|A| = -A$.
2. $A \le |A|$ and $-A \le |A|$.

Proof. (1): If $A \ge 0$, then $-A \le A$. If $A \le 0$, then $A \le -A$. The result now follows from Propositions 1.5.23 and 1.5.25. Part (2) is similar and uses 1.5.22 and 1.5.25.
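The interval formula $|I| = \max(I, -I, 0)$ recalled above can be tried out on rational intervals. The sketch below assumes that the max of intervals is taken endpoint by endpoint (the definition in 1.1.32 is not reproduced here, so this is an assumption), and the names `neg`, `interval_max` and `interval_abs` are hypothetical.

```python
from fractions import Fraction

def neg(i):
    """Negative of the interval [r, s], namely [-s, -r]."""
    r, s = i
    return (-s, -r)

def interval_max(*intervals):
    """Endpoint-wise max (assumed definition): max of left ends, max of right ends."""
    return (max(i[0] for i in intervals), max(i[1] for i in intervals))

def interval_abs(i):
    """|I| = max(I, -I, 0), with 0 read as the degenerate interval [0, 0]."""
    zero = (Fraction(0), Fraction(0))
    return interval_max(i, neg(i), zero)

# An interval straddling 0, one to the right of 0, and one to the left of 0.
straddling = interval_abs((Fraction(-1), Fraction(2)))
positive = interval_abs((Fraction(1), Fraction(3)))
negative = interval_abs((Fraction(-3), Fraction(-1)))
print(straddling, positive, negative)
```

In each case the result is the set of absolute values of the points of the original interval, which is what one would hope for from a definition of $|I|$.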


Corollary 1.5.31 If $A < C < B$ then $|C| < \max(|A|, |B|)$; similarly for weak inequality $\le$.

The next result, though not difficult to prove, is extremely important, so we rate it a “Theorem”.

Theorem 1.5.32 (Triangle Inequality) $|A + B| \le |A| + |B|$.

Proof. From the previous proposition and Corollary 1.5.12, we get both
$$A + B \le |A| + |B| \quad\text{and}\quad -(A + B) = -A + (-B) \le |A| + |B|.$$
Therefore, it follows from Proposition 1.5.20 that
$$\max\{(A + B), -(A + B)\} \le |A| + |B|.$$
But by Proposition 1.5.25, $|A + B| = \max\{(A + B), -(A + B)\}$. So we are done.

We next turn to the formula $|B - A| = \max(A, B) - \min(A, B)$. If we knew which was bigger, $A$ or $B$ (or whether they were equal), then the proof would be straightforward. Thus, we would like to separate this into cases. What gives us the right to do this? Although we don't have trichotomy for reals (though we do have a version of it; see Proposition 1.6.9 in the next section), we do know that $A < B$, $A > B$ and $A = B$ can't all be false (see the lemma below). Here is an alternate terminology:

Definition 1.5.33 (Ruling out) To say that statements $P_1, P_2, \ldots, P_N$ can't all be ruled out is to say that they can't all be false; in other words, if they are all false it leads to a contradiction.

Lemma 1.5.34 For real numbers, the statements $A < B$, $A > B$ and $A = B$ can't all be ruled out.

Proof. Suppose these statements are all false; then $B \le A$ (since $A < B$ is false) and $A \le B$ (since $A > B$ is false). But then $A = B$ by Proposition 1.4.3, which contradicts the assumed falsity of $A = B$.

Let's see how to apply this in proving that
$$\underbrace{|B - A| = \max(A, B) - \min(A, B)}_{\text{Statement } Q}.$$
Suppose we know that it is true in each of the cases
$$\underbrace{A < B}_{P_1}, \quad \underbrace{A > B}_{P_2}, \quad\text{and}\quad \underbrace{A = B}_{P_3}.$$
That is, suppose that $P_1 \Rightarrow Q$, $P_2 \Rightarrow Q$, and $P_3 \Rightarrow Q$. Suppose now that $|B - A| < \max(A, B) - \min(A, B)$; then $Q$ is false, so each of $P_1$, $P_2$, and $P_3$ is false, which is impossible (since they cannot all be ruled out). Thus, $|B - A| < \max(A, B) - \min(A, B)$ is impossible, so $|B - A| \ge \max(A, B) - \min(A, B)$. In a similar way, $|B - A| > \max(A, B) - \min(A, B)$ is impossible, so $|B - A| \le \max(A, B) - \min(A, B)$. Thus, since we have $\ge$ and $\le$, we get equality, and we have proved the following.

Proposition 1.5.35 $|B - A| = \max(A, B) - \min(A, B)$.

Remark 1.5.36 This idea can be extended to proving weak inequalities as well as equalities. For example, suppose we know that $a \le b \Rightarrow U \le V$ and $b \le a \Rightarrow U \le V$. Then, if $U > V$, we deduce that $a \le b$ and $b \le a$ must both be false. But this cannot happen, since $a \le b$ false means $a < b$ must be false. Thus, $b \le a$. In other words, $a \le b$ and $b \le a$ can't both be ruled out. Thus, $U > V$ can't happen, so $U \le V$.

Lemma 1.5.37 $C + \max(A, B) = \max(C + A, C + B)$.

Note that it is easy to extend this, by induction, to
$$C + \max(A_1, \ldots, A_n) = \max(C + A_1, \ldots, C + A_n).$$
An application of the lemma above provides an alternative proof for Proposition 1.5.35:
$$\begin{aligned}
\max(A, B) - \min(A, B) &= \max(A, B) + \max(-A, -B) \\
&= \max\{\max(A, B) - A,\ \max(A, B) - B\} \\
&= \max\{\max(0, B - A),\ \max(A - B, 0)\} \\
&= \max(0, B - A, A - B) \\
&= \max(0, |B - A|) \\
&= |B - A|.
\end{aligned}$$
This suggests the following generalization of Proposition 1.5.35.

Proposition 1.5.38 $\max(A_1, \ldots, A_n) - \min(A_1, \ldots, A_n) = \max(A_i - A_j)$ for $i, j = 1, \ldots, n$.

Proof. Exercise.

We conclude this section with the definition and some results on betweenness. The reason why we take such formal care with this concept is that there are many cases where we want to say that $A$ is between $B$ and $C$ without specifying which of $B$ or $C$ is to the right of $A$; in fact, sometimes it is either irrelevant or unknown. In some sense, betweenness is a partial substitute for trichotomy, or at least a way of sidestepping the issue. (By the way, like “bookkeeper,” “betweenness” is one of those rare words with three sets of double letters!)

Definition 1.5.39 (Betweenness) $A$ is between $B$ and $C$ means $\min(B, C) \le A \le \max(B, C)$.

Proposition 1.5.40 If $A$ and $B$ are between $C$ and $D$, then $|B - A| \le |D - C|$.

Proposition 1.5.41 If $A$ is between $B$ and $C$, and $B$ and $C$ are between $D$ and $E$, then $A$ is between $D$ and $E$.

Proposition 1.5.42 Given finitely many real numbers $A_1, \ldots, A_n$ such that for any three taken in succession, the third is between the first two, then $A_n$ is between $A_1$ and $A_2$.

Proposition 1.5.43 $A$ is between $B$ and $C$ if and only if the following two conditions are satisfied:
1. If $B \le C$, then $B \le A \le C$, and
2. If $C \le B$, then $C \le A \le B$.

Proposition 1.5.44 $A$ is between $B$ and $C$ if and only if $|C - B| = |C - A| + |A - B|$.

Corollary 1.5.45 If $A$ is between $B$ and $C$, then for all $D$, $DA$ is between $DB$ and $DC$.

(Prove this by simply multiplying the equation in the preceding proposition by $|D|$.)

Proposition 1.5.46 For any $A$, $B$, and $X$ the following statements can't all be ruled out:
1. $X \le \min(A, B)$.
2. $\max(A, B) \le X$.
3. $X$ is between $A$ and $B$.

Proof. That $X \le \min(A, B)$ is false implies $X < \min(A, B)$ is false, which implies $\min(A, B) \le X$. That $\max(A, B) \le X$ is false implies $\max(A, B) < X$ is false, which implies $X \le \max(A, B)$. So both statements being false contradicts the falsity of $\min(A, B) \le X \le \max(A, B)$.


Remark 1.5.47 The statements $X \le \min(A, B)$ and $\max(A, B) \le X$ together can be interpreted as saying “$A$ and $B$ lie on the same side of $X$.” Thus, the previous proposition says that we cannot rule out both of the statements: “$A$ and $B$ lie on the same side of $X$” and “$X$ lies between $A$ and $B$.”
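For rational numbers, where order is decidable, the definition of betweenness and the characterization in Proposition 1.5.44 can be compared directly. The following is only a sanity check on a small grid of rationals, not a proof for reals; the function names `between` and `additive` and the grid are assumptions made for illustration.

```python
from fractions import Fraction
from itertools import product

def between(a, b, c):
    """A is between B and C: min(B, C) <= A <= max(B, C)."""
    return min(b, c) <= a <= max(b, c)

def additive(a, b, c):
    """The condition of Proposition 1.5.44: |C - B| = |C - A| + |A - B|."""
    return abs(c - b) == abs(c - a) + abs(a - b)

# Half-integers from -2 to 2, handled exactly.
grid = [Fraction(n, 2) for n in range(-4, 5)]

# Definition 1.5.39 and Proposition 1.5.44 agree on every triple of the grid.
agree = all(between(a, b, c) == additive(a, b, c)
            for a, b, c in product(grid, repeat=3))

# Proposition 1.5.35 on the same grid: |B - A| = max(A, B) - min(A, B).
abs_formula = all(abs(b - a) == max(a, b) - min(a, b)
                  for a, b in product(grid, repeat=2))
print(agree, abs_formula)
```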

Exercises

1. Let us identify the rational numbers 3 and 4 with the families $\{[3, 3]\}$ and $\{[4, 4]\}$. Let $F = \{[4 - 1/N, 4 + 1/N] : N = 1, 2, \ldots\}$. Compute the family $3 + F$ and prove directly that $3 + 4 = 3 + F = \{[7, 7]\}$ (as real numbers). You'll have to show that the underlying families are consistent with each other. Give a similar proof that $3F = \{[12, 12]\}$. Note that Proposition 1.5.2, based on Proposition 1.2.17, proves that these kinds of equalities are true in general.

2. Thinking of 2 as the real number $\{[2, 2]\}$, 3 as $\{[3, 3]\}$, etc., prove directly that $2 + 3 = 5$ and $2 \cdot 3 = 6$. Now think of 2 as the real number given by the family of all $[r, s]$ such that $r \le 2 \le s$, and similarly for 3, etc. Prove the same arithmetic formulas: $2 + 3 = 5$ and $2 \cdot 3 = 6$. Note that Proposition 1.5.2, based on Proposition 1.2.17, proves that these kinds of equalities are true in general.

3. Prove the following properties of the reals. (You will have to use intervals and consistency.)
(a) $A(B + C) = AB + AC$
(b) $A + 0 = A$
(c) $A - A = 0$
(d) $1 \cdot B = B$
(e) If $A \in R$ (that is, $A > 0$ or $A < 0$), then $A/A = 1$.

4. Prove the following properties of the reals. You shouldn't have to go back to intervals and families of intervals, but use the properties already established in the previous exercise.
(a) If $A + C = B + C$, then $A = B$. (Cancellation for Addition) (Hint: add $-C$ to both sides, using Proposition 1.5.2.)
(b) $A = B$ if and only if $0 = B - A$.
(c) $-(A + B) = (-A) + (-B)$ (Hint: Cancellation says that you can test whether two real numbers $X$ and $Y$ are equal by seeing if you can find a third number $Z$ such that they become equal when you add $Z$ to them (that is, $X + Z = Y + Z \Rightarrow X = Y$). Apply this to test whether $X = -(A + B)$ equals $Y = -A + (-B)$; you'll have to figure out what to choose for $Z$.)
(d) $-(-A) = A$ (Hint: Similar in idea to the preceding.)


(e) If $C \in R$ and $CA = CB$, then $A = B$. (Cancellation for Multiplication)
(f) For $A \in R$, $A = B$ if and only if $B/A = 1$.
(g) For $A \in R$, $1/(1/A) = A$. (Hint: Similar to the idea of previous hints involving addition, but this is multiplication.)

5. Using intervals, prove the following proposition from the text.
Proposition. For real $A$ and $B$,
(a) $A < B$ if and only if $0 < B - A$.
(b) $A \le B$ if and only if $0 \le B - A$.
(If you argue carefully, you can deduce the $\le$ part from the $<$ part, but it's probably not worth it unless you see right away how to proceed.)

6. Prove the following corollary from the text. You should not have to use intervals at all: its parts follow from the proposition proved in the previous exercise.
Corollary. For reals $A$, $B$, and $C$,
(a) $A < 0$ if and only if $0 < -A$.
(b) $A > 0$ if and only if $0 > -A$.
(c) $A + C < B + C$ if and only if $A < B$.
(d) $B < 0$ if and only if $B + C < C$.
(e) $B > 0$ if and only if $B + C > C$.
(f) $B \le 0$ if and only if $B + C \le C$.
(g) $B \ge 0$ if and only if $B + C \ge C$.

7. Suppose that $A$ and $B$ are real numbers. Prove the following.
(a) $A$ is a unit if and only if $A \in R$. (Hint: if $AX = 1$ then $AX > 0$; show that $A > 0$ or $A < 0$.)
(b) $AB$ is a unit if and only if both $A$ and $B$ are units.
(c) If $AB$ is a unit, then $1/(AB) = (1/A)(1/B)$. (Hint: Use 1 and 2.)
(d) If $A + B$ is a unit, then $A$ is a unit or $B$ is a unit. (Hint: It was already proved in the text that if the sum of two numbers is positive, one of them must be positive.)

8. Prove the very useful Corollary 1.5.28, which says that $|A - X| < E$ if and only if $X - E < A < X + E$ (and similarly for $\le$).

9. Suppose $A \ge 0$, $B \ge 0$. Prove the following.
(a) If $AB \ge 1$ and $B < 1$, then $A > 1$.
(b) If $AB > 1$, then $A > 1$ or $B > 1$.


10. Here are some further useful inequalities.
(a) If $A_1, A_2, \ldots, A_n$ are non-negative and $A_1 A_2 \cdots A_n > 1$, then some $A_k > 1$.
(b) If $\sum_{i=1}^{n} A_i < n$ where each $A_i > 0$, then some $A_k < 1$.
(c) Suppose $\prod_{i=1}^{n} A_i = 1$, where each $A_i > 0$. Then $\sum_{i=1}^{n} A_i \ge n$. (Hint: proceed by induction on $n$. If it's true for some $n$ but $\sum_{i=1}^{n+1} A_i < n + 1$, then by part (b) some $A$, say $A_1$, is $< 1$. But now the product of the remaining $A$'s must be bigger than 1, so by part (a) one of these, say $A_{n+1}$, is $> 1$. Now use the induction hypothesis, the grouping $(A_2) \cdots (A_n)(A_1 A_{n+1})$, and $-A_1 A_{n+1} + A_1 + A_{n+1} = (A_{n+1} - 1)(1 - A_1) + 1 > 1$ to get a contradiction.)

11. Suppose $a \le b \Rightarrow U = V$ and $b \le a \Rightarrow U = V$. Prove that $U = V$. (Hint: see Remark 1.5.36.)

12. Use the previous exercise to show that
$$\frac{|A| + A}{2} = \max(A, 0) \quad\text{and}\quad \frac{|A| - A}{2} = \max(-A, 0).$$

1.6 Rational Approximations

The rationals, as we have seen, sit inside the reals; mathematicians say they are embedded in the reals. This association is $r \to \{[r, r]\}$. Because real numbers are fine families, it is clear that for any real $A$ and any rational $\varepsilon > 0$, we can find rationals $r$ and $s$ such that $r < A < s$ and $s - r < \varepsilon$: simply take an interval $[r_0, s_0]$ from $A$ of length smaller than $\varepsilon$, say smaller than $\varepsilon/2$, and choose $r$ slightly smaller than $r_0$ (say $r = r_0 - \varepsilon/4$) and $s$ slightly larger than $s_0$ (say $s = s_0 + \varepsilon/4$). (We had to make these adjustments because all we know about $[r_0, s_0]$ is that $r_0 \le A \le s_0$ and we want strict inequality $<$.) We summarize this in the following proposition.

Proposition 1.6.1 For any real number $A$ and rational $\varepsilon > 0$, there are rationals $r$ and $s$ such that $r < A < s$ and $s - r < \varepsilon$.

Proposition 1.6.2 If $A > B$, there are rationals $c$ such that $A > c > B$.

Proof. If $A > B$, there are rationals $r$ and $v$ with $A \ge r > v \ge B$. (We have an interval $[r, s]$ in $A$ and an interval $[u, v]$ in $B$ with $r > v$, and we know by Corollary 1.3.13 that $A \ge r$ and $v \ge B$.) Any $c$ such that $r > c > v$ has the required property, for example, $c = (r + v)/2$.
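The recipe behind Proposition 1.6.1 can be sketched in code. Here a real number is modeled, as an assumption made for this illustration, by a function that returns for each rational $\varepsilon > 0$ a rational interval of length less than $\varepsilon$ from its family; `sqrt2` (built by bisection) and `strict_bounds` are hypothetical names.

```python
from fractions import Fraction

def sqrt2(eps):
    """A rational interval (lo, hi) with lo^2 <= 2 <= hi^2 and hi - lo < eps."""
    lo, hi = Fraction(1), Fraction(2)
    while hi - lo >= eps:
        mid = (lo + hi) / 2
        if mid * mid <= 2:
            lo = mid
        else:
            hi = mid
    return lo, hi

def strict_bounds(real, eps):
    """Rationals r < A < s with s - r < eps, following the recipe in the text."""
    r0, s0 = real(eps / 2)               # an interval of A of length < eps/2
    return r0 - eps / 4, s0 + eps / 4    # widen slightly to get strict inequalities

r, s = strict_bounds(sqrt2, Fraction(1, 100))
print(r, s, s - r)
```

The widening by $\varepsilon/4$ on each side adds $\varepsilon/2$ to a length that was already below $\varepsilon/2$, so the total stays below $\varepsilon$.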


Corollary 1.6.3 If $A > 0$, there are positive rationals $\varepsilon$ such that $A > \varepsilon$.

Corollary 1.6.4 $A < B$ or $A > B$ if and only if $|B - A| > \varepsilon$ for some rational $\varepsilon > 0$.

This follows from the preceding proposition, because we know that $A < B$ or $A > B$ if and only if $|B - A| > 0$ (see Corollary 1.5.29).

Remark 1.6.5 It also follows that $A < B$ or $A > B$ if and only if $|B - A| > C$ for some $C > 0$. (Given the latter condition, for example, we can apply either Proposition 1.6.2 to get $|B - A| > \varepsilon > C$ with $\varepsilon$ rational, or apply Corollary 1.6.3 to get $C > \varepsilon > 0$, again with $\varepsilon$ rational.)

Corollary 1.6.6 $A = B$ if and only if $|B - A| \le \varepsilon$ for all positive rationals $\varepsilon$.

The following tells us we can adjust the family that defines a given real so that the lower and upper bounds fall within certain limits.

Proposition 1.6.7 Let $A$ be a real number such that $r_0 \le A \le s_0$ for rational numbers $r_0$ and $s_0$. Let $I_0 = [r_0, s_0]$ and let $A \cap I_0$ be the family of all $I \cap I_0$ with $I \in A$. Then
1. $A \cap I_0$ is consistent and fine.
2. $A \cap I_0$ is consistent with $A$ (every interval of one meets every interval of the other), so that $A \cap I_0 = A$ as real numbers.
3. All lower and upper bounds for $A \cap I_0$ lie between $r_0$ and $s_0$.

Recall the Wiggle Lemma for rationals (Lemma 0.2.6), which says that if $r \le s + \varepsilon$ for all $\varepsilon > 0$, then $r \le s$. Here is the version for real numbers.

Lemma 1.6.8 (Wiggle Lemma for Reals) Let $X$ and $Y$ be real numbers.
1. If $X \le Y + \varepsilon$ for every real $\varepsilon > 0$, then $X \le Y$.
2. If $X \le Y + \varepsilon$ for every rational $\varepsilon > 0$, then $X \le Y$.

(The proof is left as an exercise: it follows from the Wiggle Lemma for rational numbers.)

To conclude this section, we take another look at Trichotomy. For any two rationals $r$ and $s$, we can determine exactly which of the relations $r < s$, $r > s$, and $r = s$ holds. As we have seen, given two reals $A$ and $B$, we cannot exclude all three conditions: $A < B$, $A > B$, $A = B$. Nevertheless it may be impossible, without proving a theorem, to determine which one holds for given $A$ and $B$ (see the appendix on “The Goldbach Number”). The following result expresses what we can say in general.

Proposition 1.6.9 ($\varepsilon$-Trichotomy) If $A$ and $B$ are real numbers, then for each rational $\varepsilon > 0$, either $A < B$, $B < A$, or $A$ and $B$ are within $\varepsilon$ of each other.


Proof. It is important to emphasize that the words “either...or” here mean that one can actually, constructively, determine which of the three alternatives holds. The proof should reflect this. The idea is to choose intervals $I$ and $J$ from $A$ and $B$. Since rationals satisfy trichotomy, one can determine if $I < J$ (so $A < B$), or $J < I$ (so $B < A$). If neither, then $I$ and $J$ meet. By choosing $I$ and $J$ of sufficiently small length, this third case will imply that $A$ and $B$ are within $\varepsilon$. We leave the details for the exercises.

Here are several very useful applications of $\varepsilon$-Trichotomy. The proofs are discussed in the exercises.

Lemma 1.6.10 If $A < B$ then, for any $P$, either $P > A$ or $P < B$.

Lemma 1.6.11 Let $I_k = [a_k, b_k]$, $k = 0, 1, \ldots, N$, be a family of intervals such that
1. $a_k < b_k$ for $k = 0, 1, \ldots, N$ and
2. $a_{k+1} < b_k$ for $k = 0, 1, \ldots, N - 1$ (successive intervals overlap).
If $x \in [a_0, b_N]$, then we can find some $K$ such that $x \in I_K$.

Lemma 1.6.12 Let $A = x_0 < x_1 < \cdots < x_N = B$ be a partition of the interval $[A, B]$ such that $x_i - x_{i-1} < \varepsilon$ for $i = 1, \ldots, N$. Then for each $x \in [A, B]$, there is some $j$ such that $|x - x_j| < \varepsilon$.

Lemma 1.6.13 Given reals $F_0, F_1, \ldots, F_N$ and $\varepsilon > 0$, there are indices $K$ and $L$ such that, for all $j = 0, \ldots, N$, $F_K < F_j + \varepsilon$ and $F_L > F_j - \varepsilon$.

(What is this last lemma saying? Hint: It might be nicknamed “$\varepsilon$-max/min.”)
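The outline of the proof of $\varepsilon$-Trichotomy can be read as a procedure. The sketch below is one possible rendering under stated assumptions: a real is modeled as a function returning, for each rational $\varepsilon > 0$, a rational interval of length less than $\varepsilon$ from its family, and the names `const`, `sqrt2` and `eps_trichotomy` are hypothetical. Intervals of length less than $\varepsilon/2$ are chosen so that, when they meet, the two reals are within $\varepsilon$ of each other.

```python
from fractions import Fraction

def const(q):
    """The rational q as a real: every request returns the interval [q, q]."""
    q = Fraction(q)
    return lambda eps: (q, q)

def sqrt2(eps):
    """A rational interval (lo, hi) with lo^2 <= 2 <= hi^2 and hi - lo < eps."""
    lo, hi = Fraction(1), Fraction(2)
    while hi - lo >= eps:
        mid = (lo + hi) / 2
        if mid * mid <= 2:
            lo = mid
        else:
            hi = mid
    return lo, hi

def eps_trichotomy(a, b, eps):
    """Decide, using only rational comparisons, one of the three alternatives."""
    i_lo, i_hi = a(eps / 2)    # interval I from A, length < eps/2
    j_lo, j_hi = b(eps / 2)    # interval J from B, length < eps/2
    if i_hi < j_lo:
        return "A < B"         # I lies entirely to the left of J
    if j_hi < i_lo:
        return "B < A"         # J lies entirely to the left of I
    return "within eps"        # I and J meet, so |A - B| < eps

coarse = eps_trichotomy(sqrt2, const(Fraction(141, 100)), Fraction(1, 10))
fine = eps_trichotomy(sqrt2, const(Fraction(141, 100)), Fraction(1, 1000))
print(coarse, fine)
```

With a coarse tolerance the procedure cannot separate $\sqrt{2}$ from $141/100$ and reports that they are close; with a finer tolerance it can decide the strict inequality.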

Exercises

1. Let $A$ and $B$ be any reals with $A \le B$. For any $\varepsilon > 0$, show how to find rationals $r \le s$ such that
$$r \le A \le B \le s \quad\text{and}\quad s - r < (B - A) + \varepsilon.$$

2. Use bisection (see exercises on page 38) to find rationals $r$ and $s$ such that $r \le \sqrt{2} \le s$ and $0 < s - r < 1/50$.

3. Use Divide-and-Average (see exercises on page 39) to make the same estimate as in the previous exercise.

4. Apply $\varepsilon$-Trichotomy to 0 and the Goldbach number (see the appendix to this chapter).


5. Prove the Wiggle Lemma (Lemma 1.6.8), which says that if $X \le Y + \varepsilon$ for every rational $\varepsilon > 0$, then $X \le Y$.
(Hint: Use the fact that we know it is true when $X$ and $Y$ are rationals (0.2.6). Let $[r, s]$ and $[u, v]$ be arbitrary rational intervals of $X$ and $Y$ respectively. Interpret $X \le Y + \varepsilon$ in terms of intervals and use it to show $r \le v$.)

6. Prove $\varepsilon$-Trichotomy (Proposition 1.6.9), which says: If $A$ and $B$ are real numbers, then for each rational $\varepsilon > 0$, either $A < B$, $B < A$, or $A$ and $B$ are within $\varepsilon$ of each other. The text gives an outline.

7. Prove the following three lemmas, stated in the text (Hint: $\varepsilon$-Trichotomy will come in handy).
Lemma 1.6.10 If $A < B$ then, for any $P$, either $P > A$ or $P < B$.
Lemma 1.6.12 Let $A = x_0 < x_1 < \cdots < x_N = B$ be a partition of the interval $[A, B]$ such that $x_i - x_{i-1} < \varepsilon$ for $i = 1, \ldots, N$. Then for each $x \in [A, B]$, there is some $j$ such that $|x - x_j| < \varepsilon$.
Lemma 1.6.13 (“$\varepsilon$-max/min”) Given reals $F_0, F_1, \ldots, F_N$ and $\varepsilon > 0$, there are indices $K$ and $L$ such that, for all $j = 0, \ldots, N$, $F_K < F_j + \varepsilon$ and $F_L > F_j - \varepsilon$.

8. Prove the following from the text (use Lemma 1.6.10):
Lemma 1.6.11 Let $I_k = [a_k, b_k]$, $k = 0, 1, \ldots, N$, be a family of intervals such that
(a) $a_k < b_k$ for $k = 0, 1, \ldots, N$ and
(b) $a_{k+1} < b_k$ for $k = 0, 1, \ldots, N - 1$.
If $x \in [a_0, b_N]$, then we can find some $K$ such that $x \in I_K$.
(Hint: Here is an algorithm which will produce $K$:
(i) We know that $a_0 \le x$.
(ii) Since $a_1 < b_0$, either $x < b_0$ or $x > a_1$. If $x < b_0$, then $x \in [a_0, b_0]$. Otherwise, $a_1 < x$.
(iii) Since $a_2 < b_1$, either $x < b_1$ or $x > a_2$. If $x < b_1$, then $x \in [a_1, b_1]$. Otherwise, $a_2 < x$.
(iv) Continuing in this way, we have the pattern: $x \in [a_k, b_k]$ or $a_{k+1} < x$. This must terminate either with $x$ in some $I_k$ for $k < N$, or with $a_N < x$. But in this case, $x \in [a_N, b_N]$ since $x \le b_N$.)
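The algorithm in the hint can be sketched as follows. This is an illustration under assumptions, not the text's own code: reals are modeled as functions returning rational intervals of any requested length, `above_or_below` is one hypothetical way to realize Lemma 1.6.10 as a procedure, and `locate` follows steps (i) to (iv). When both alternatives of Lemma 1.6.10 hold, this version happens to report "p > a", so it may return a later index than the hint's wording would; any index $K$ with $x \in I_K$ is acceptable.

```python
from fractions import Fraction

def const(q):
    """The rational q as a real: every request returns the interval [q, q]."""
    q = Fraction(q)
    return lambda eps: (q, q)

def sqrt2(eps):
    """A rational interval (lo, hi) with lo^2 <= 2 <= hi^2 and hi - lo < eps."""
    lo, hi = Fraction(1), Fraction(2)
    while hi - lo >= eps:
        mid = (lo + hi) / 2
        if mid * mid <= 2:
            lo = mid
        else:
            hi = mid
    return lo, hi

def above_or_below(a, b, p):
    """Assuming a < b, report 'p > a' or 'p < b' (Lemma 1.6.10)."""
    eps = Fraction(1)
    while True:                      # terminates because a < b is assumed
        a_hi = a(eps)[1]
        b_lo = b(eps)[0]
        if a_hi < b_lo:
            break
        eps /= 2
    gap = b_lo - a_hi                # a positive rational gap between a and b
    p_lo, p_hi = p(gap)              # an interval of p shorter than the gap
    if p_lo > a_hi:
        return "p > a"
    return "p < b"                   # here p_hi < p_lo + gap <= b_lo

def locate(intervals, x):
    """intervals: list of (a_k, b_k) with a_k < b_k and a_{k+1} < b_k;
    x is assumed to lie in [a_0, b_N]. Returns some K with x in [a_K, b_K]."""
    last = len(intervals) - 1
    for k in range(last):
        a_next, b_k = intervals[k + 1][0], intervals[k][1]
        if above_or_below(a_next, b_k, x) == "p < b":
            return k                 # a_k <= x (invariant) and x < b_k
    return last                      # a_N < x, and x <= b_N by assumption

cover = [(const(0), const(1)),
         (const(Fraction(1, 2)), const(2)),
         (const(Fraction(3, 2)), const(3))]
print(locate(cover, sqrt2))
```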

1.7 Real Intervals and Completeness

Recall that in Example 1.3.16 we looked at the family $\mathcal{C}_2$ of all (rational) $[a, b]$ such that $a^3 \le 2 \le b^3$. It is not difficult to show, by bisection (see exercise 10 on page 38), that this family is consistent and fine, so that it is a real number; we now call it $\sqrt[3]{2}$. This notation is justified because it is possible to show that $\left(\sqrt[3]{2}\right)^3 = 2$, as real numbers (this is discussed on page 41). Thus, although there is no rational number whose cube is 2, there is a real number with this property. Put another way, the equation $x^3 = 2$ had no solution when all we had was the rationals, but we now can construct a solution that is a real number.

Thinking of a rational interval, as we have done before, as a measurement, one might be tempted to say that this real number is “the collection of all attempts to measure a cube root of 2 with rationals.” Technically this is circular, since the cube root of 2 was not actually a “thing” that could be “measured” when all we had was the rationals; nevertheless, it describes the process of construction rather nicely.

As you have probably already realized, this kind of construction can be applied to many other kinds of problems, including, clearly, $n$th roots of rationals, and even solutions of more general algebraic equations. The exercises at the end of this chapter give some examples, and we will see others, such as Newton's Method, in later chapters. So let us take a closer look at the method.

We started with a consistent fine family of rational intervals ($\mathcal{C}_2$, or $\sqrt[3]{2}$). Since this is already a real number, we didn't have to do anything further to get our answer. It is important to note, however, that the real number $\sqrt[3]{2}$ does not lie in any of the intervals of this family, since it is not itself rational. On the other hand, if $[r, s]$ is a rational interval in the family $\sqrt[3]{2}$, then we know that $r \le \sqrt[3]{2} \le s$ as families (by Proposition 1.3.12), or as real numbers. If we allowed $[r, s]$ to contain real numbers, then we would have $\sqrt[3]{2} \in [r, s]$ for any interval in $\sqrt[3]{2}$. To pursue this idea further, we need to look at real intervals.
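To make the bisection idea concrete, here is a minimal sketch that produces members of the family of rational intervals $[a, b]$ with $a^3 \le 2 \le b^3$, of any requested length. The function name `cube_root_2` and the starting interval $[1, 2]$ are choices made for this illustration.

```python
from fractions import Fraction

def cube_root_2(eps):
    """A rational interval (lo, hi) with lo^3 <= 2 <= hi^3 and hi - lo < eps."""
    lo, hi = Fraction(1), Fraction(2)      # 1^3 <= 2 <= 2^3
    while hi - lo >= eps:
        mid = (lo + hi) / 2
        if mid ** 3 <= 2:
            lo = mid                       # keep the invariant lo^3 <= 2
        else:
            hi = mid                       # keep the invariant 2 <= hi^3
    return lo, hi

coarse = cube_root_2(Fraction(1, 10))
fine = cube_root_2(Fraction(1, 1000))
print(coarse, fine)
```

Because each interval is obtained by halving the previous one, any two of them intersect (the consistency condition), and their lengths can be made as small as desired (the fineness condition).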
Definition 1.7.1 (Real Interval) $[A, B]$ is a real interval when $A$ and $B$ are real numbers and $A \le B$. If $X$ is a real number, then $X \in [A, B]$ means $A \le X \le B$.

Of course, this definition leads to an obvious problem. Since rationals like $1/2$ and $3/4$ are real numbers, how do we know whether the interval denoted $[1/2, 3/4]$ is to be considered a real interval (containing non-rational reals as well as rationals) or a rational interval (containing only rational members)? Well, we could have done something fancy, like notating real intervals with double brackets: $[[1/2, 3/4]]$. But instead we will simply assume that from now on, all intervals $[a, b]$ are real intervals unless otherwise specified. In other words, without further comment, $[1/2, 3/4]$ will mean a real interval. Thus, for example, $\frac{\sqrt{2}}{2} \in [1/2, 3/4]$.

As we did for rational intervals, we define $[A, B] < [C, D]$ to mean $B < C$. Now, if you look back over the properties of rational intervals and the proofs of those properties, you'll see that nearly all of them depend only on elementary properties of inequalities and their algebraic manipulation. Since the same rules for manipulating inequalities hold for reals as do for rationals, most of these properties of rational intervals continue to be true for real intervals. Here are two examples.

Proposition 1.7.2 $[A, B]$ intersects $[C, D]$ (i.e. there is a real $x$ in both intervals) if and only if $A \le D$ and $C \le B$.


Proof. The argument used for Proposition 1.2.2 works here too. It gives us the result that $x$ is in the intersection of $[A, B]$ and $[C, D]$ if and only if it is in $[\max(A, C), \min(B, D)]$.

Proposition 1.7.3 For real numbers $A$ and $B$ and a family of real intervals $\mathcal{S}$, $A \le \mathcal{S} \le B$ if and only if $[A, B]$ meets every interval of $\mathcal{S}$.

Of course, $A \le \mathcal{S} \le B$ really means $\{[A, A]\} \le \mathcal{S} \le \{[B, B]\}$. The proof of this is exactly the same as the proof of its rational interval version, Proposition 1.3.12.

Remark 1.7.4 Suppose that $[a, b]$ and $[c, d]$ are real intervals with rational endpoints $a$, $b$, $c$, and $d$. Suppose also that they intersect, in other words, have a real number in common. By Proposition 1.7.2 above, this means that $a \le d$ and $c \le b$. But this tells us that there is a rational number common to both, since they now intersect as rational intervals. Thus, if two real intervals with rational endpoints intersect, they contain a common rational. Of course, having rational endpoints is critical here: for example, $[1, \sqrt{2}]$ and $[\sqrt{2}, 2]$ have a common real but no common rational.

Definition 1.7.5 For any family of real intervals $\mathcal{S}$,
1. $\mathcal{S}$ is consistent means that every pair of intervals in $\mathcal{S}$ intersect.
2. $\mathcal{S}$ is fine means that, for each $\varepsilon > 0$, there is an interval $I$ in $\mathcal{S}$ with $\ell(I) < \varepsilon$.

Remark 1.7.6 Note that in part (2) of this definition, it suffices to check the existence of intervals of length less than $\varepsilon$ for all positive rational $\varepsilon$.

Let's look at our example involving $\sqrt[3]{2}$ more carefully. Here is the original family of rational intervals:

$\mathcal{C}_2$ = All rational intervals $[r, s]$ with $r^3 \le 2 \le s^3$.

As we have seen, there can be no single rational number common to all the intervals in this family. Here is the corresponding family of real intervals:

$\mathcal{C}$ = All real intervals $[A, B]$ with $A^3 \le 2 \le B^3$.

All intervals in this family have the real number $\sqrt[3]{2}$ in common. In the second of these families, the family of real intervals, is there any real other than $\sqrt[3]{2}$ common to all the intervals? Fortunately no, as the following proposition tells us.

Proposition 1.7.7 Suppose $\mathcal{S}$ is a fine family of real intervals. If $P$ and $Q$ are real numbers that belong to each interval in $\mathcal{S}$, then $P = Q$.


Proof. Given an arbitrary $\varepsilon > 0$, let $[A, B]$ be an interval in $\mathcal{S}$ of length $< \varepsilon$. Then $P$ and $Q$ are in $[A, B]$ and $|P - Q| = \max(P, Q) - \min(P, Q) \le B - A < \varepsilon$. Since $\varepsilon$ was arbitrary, $0 \le |P - Q| \le 0$ by the Wiggle Lemma, so $P = Q$.

So let us try to generalize the construction of $\sqrt[3]{2}$ to families of real intervals $\mathcal{S}$. When will the intersection of all the intervals in $\mathcal{S}$ have one and only one element in it? For the uniqueness of this element, we will surely want $\mathcal{S}$ to be fine in view of the proposition we just proved. We will also want $\mathcal{S}$ to be consistent, since otherwise $\mathcal{S}$ might not determine any real number. So, suppose that $\mathcal{S}$ is both fine and consistent.

Since real numbers are defined by rational measurements, we may ask, “What sort of rational measurements would measure a real number determined by $\mathcal{S}$?” Suppose $[A, B]$ is an interval in $\mathcal{S}$. Since the reals $A$ and $B$ can, unless they are rational, only be approximated by rationals, the measurement $[A, B]$ will be approximated by some rational interval $[r, s]$ which, to be conservative, we will assume both underestimates and overestimates, i.e. $r \le A \le B \le s$. Note that we can't say that the rational interval $[r, s]$ contains $[A, B]$ since membership in $[r, s]$ has been restricted to rationals; of course, if we replace $[r, s]$ by a real interval that happens to have rational endpoints $r$ and $s$, then this new interval would contain $[A, B]$. Instead, let's just be more careful and introduce the following terminology.

Definition 1.7.8 (Overspreads) The rational interval $[r, s]$ overspreads the real interval $[A, B]$ means $r \le A \le B \le s$.

We can't just choose any rational intervals that overspread the intervals in $\mathcal{S}$, because the family of rational intervals we get may not be fine. Instead, we look at all rational intervals which overspread at least one interval in $\mathcal{S}$.

Definition 1.7.9 ($\lim(\mathcal{S})$) For a consistent and fine family $\mathcal{S}$ of real intervals, $\lim(\mathcal{S})$ is the family of all rational intervals that overspread at least one interval of $\mathcal{S}$.

To paraphrase this definition slightly, the rational interval $[r, s]$ is in $\lim(\mathcal{S})$ if and only if we can find some interval $[A, B]$ in $\mathcal{S}$ such that $r \le A \le B \le s$. Note that the interval $[A, B]$ will depend on $[r, s]$, and there will, in general, be many intervals in $\mathcal{S}$ that are overspread by $[r, s]$.

For example, suppose that the real interval $R = [\sqrt{2}, \sqrt{3}]$ happens to lie in $\mathcal{S}$. Then the rational interval $[1, 2]$ overspreads $R$ since $1 \le \sqrt{2} \le \sqrt{3} \le 2$; hence $[1, 2]$ lies in $\lim(\mathcal{S})$. Similarly, $[1.4, 1.75]$ also lies in $\lim(\mathcal{S})$, since $1.4 \le \sqrt{2} \le \sqrt{3} \le 1.75$.

The family $\lim(\mathcal{S})$ is very large. It is consistent. To see this, suppose $I = [r, s]$ and $J = [u, v]$ are any two (rational) intervals in $\lim(\mathcal{S})$. By the definition of $\lim(\mathcal{S})$, they overspread intervals of $\mathcal{S}$, so we can write
$$r \le A \le B \le s, \qquad u \le C \le D \le v$$
for intervals $[A, B]$ and $[C, D]$ in $\mathcal{S}$. Since $\mathcal{S}$ is consistent, $A \le D$ and $C \le B$. It follows that $r \le v$ and $u \le s$; thus, $[r, s]$ and $[u, v]$ intersect.


What about fineness? If $[A, B]$ is in $\mathcal{S}$, we can find rationals $r \le A$ and $s \ge B$ with $A - r$ and $s - B$ as small as we choose. In fact, given any $\varepsilon > 0$, we can make use of Proposition 1.6.1 to find rationals $r$ and $s'$ such that $r \le A \le s'$ and $A - r \le s' - r < \varepsilon/3$. Similarly, we can find rationals $r'$ and $s$ such that $s - B \le s - r' < \varepsilon/3$. Thus, $\ell([r, s]) = (s - B) + (B - A) + (A - r) < \varepsilon/3 + \ell([A, B]) + \varepsilon/3$. This is illustrated in the following diagram.

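[Figure: the points $r \le A \le B \le s$ on a number line. The segments from $r$ to $A$ and from $B$ to $s$ are each labeled “small length”; the segment from $A$ to $B$ is labeled “length $B - A$”.]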

Because there are arbitrarily small intervals $[A, B]$ in $\mathcal{S}$ (of length less than $\varepsilon/3$), we also have arbitrarily small intervals $[r, s]$ in $\lim(\mathcal{S})$. We now come to the fundamental property of the real number system, expressed by the following theorem, sometimes called “The Completeness of the Reals.”

Theorem 1.7.10 (Completeness) If $\mathcal{S}$ is any consistent and fine family of real intervals, then there is a real number, $\lim(\mathcal{S})$, that belongs to every interval of $\mathcal{S}$. Furthermore, it is the only real number that lies in every interval of $\mathcal{S}$.

Proof. We have already verified that $\lim(\mathcal{S})$ is consistent and fine, so $\lim(\mathcal{S})$ is a real number. We must prove that it belongs to every interval in $\mathcal{S}$. But every interval in $\mathcal{S}$ intersects every interval in $\lim(\mathcal{S})$ (because every interval in $\lim(\mathcal{S})$ overspreads an interval of $\mathcal{S}$ and the family $\mathcal{S}$ is consistent). It follows immediately from Proposition 1.7.3 that for every $[C, D]$ in $\mathcal{S}$, we have $C \le \lim(\mathcal{S}) \le D$. Thus, $\lim(\mathcal{S})$ does indeed belong to every interval in $\mathcal{S}$. The fact that there are no other real numbers lying in each interval of $\mathcal{S}$ is a direct consequence of Proposition 1.7.7.

The Completeness Theorem may seem somewhat formal and unremarkable. This is due, in part, to the fact that most of the hard work in proving it was done in the propositions preceding it. Nevertheless, as you will soon see, it is a very powerful result because it guarantees (by construction) the existence of a real number with certain properties. It will enable us to prove convergence of sequences and series, and to define derivatives and integrals and show when they exist. It will simply be a matter of choosing the proper family $\mathcal{S}$ in each application.
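The overspreading step used in the fineness argument for $\lim(\mathcal{S})$ can be sketched computationally. As an assumption for this illustration, the endpoints $A$ and $B$ of a real interval are modeled as functions returning rational intervals of any requested length; `sqrt_of` and `overspread` are hypothetical names.

```python
from fractions import Fraction

def sqrt_of(n):
    """The square root of a non-negative integer n, as an interval oracle."""
    def approx(eps):
        lo, hi = Fraction(0), Fraction(n) + 1    # 0^2 <= n <= (n + 1)^2
        while hi - lo >= eps:
            mid = (lo + hi) / 2
            if mid * mid <= n:
                lo = mid
            else:
                hi = mid
        return lo, hi
    return approx

def overspread(a, b, eps):
    """A rational [r, s] with r <= A <= B <= s and s - r < (B - A) + eps."""
    r, _ = a(eps / 3)    # r <= A with A - r < eps/3
    _, s = b(eps / 3)    # B <= s with s - B < eps/3
    return r, s

# Overspread the real interval [sqrt(2), sqrt(3)] with little extra length.
r, s = overspread(sqrt_of(2), sqrt_of(3), Fraction(1, 100))
print(r, s, float(s - r))
```

The length of the resulting rational interval exceeds $\sqrt{3} - \sqrt{2}$ by less than $2\varepsilon/3$, matching the estimate $\ell([r, s]) < \varepsilon/3 + \ell([A, B]) + \varepsilon/3$.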


Exercises

1. Let
$$\sqrt{2} = \{[r, s] \mid r > 0 \text{ and } r^2 < 2 < s^2\}, \qquad \sqrt{3} = \{[r, s] \mid r > 0 \text{ and } r^2 < 3 < s^2\},$$
where these are families of rational intervals. Find any rational interval that overspreads $[\sqrt{2}, \sqrt{3}]$. Find one whose length is less than 1.

2. Let $\sqrt{2} = \{[r, s] \mid r > 0 \text{ and } r^2 < 2 < s^2\}$ (as a family of rational intervals). Find an interval that overspreads $[\sqrt{2} - 1, \sqrt{2} + 1]$ and whose length is less than 2.01. Can you find one whose length is less than 2?

3. Let $[A, B]$ be a real interval. Given any $\varepsilon > 0$, show how to find a rational interval $I$ which overspreads $[A, B]$ and whose length is less than $(B - A) + \varepsilon$.

4. Let $\mathcal{S}$ be the family of real intervals $\{[A, B] \mid A^3 \le 2 \le B^3\}$. Explain why $\mathcal{S}$ is fine and consistent. Why is there a single real number contained in each of the intervals of $\mathcal{S}$? What is this real number?

5. To generalize the previous exercise, suppose $C$ is a real number (so $C$ is a fine and consistent family of rational intervals). Suppose we form the family $\bar{C}$ of real intervals, obtained by considering each rational interval of $C$ as a real interval with the same endpoints. What is $\lim(\bar{C})$?

6. Every real number $A$ is the limit of some fine and consistent family of real intervals. Why?

7. Make sense out of this seemingly paradoxical statement about the real number $A$ and rational numbers $r$ and $s$:
$$[r, s] \in A \Rightarrow A \in [r, s].$$
Is there any sort of converse to this?

1.8 Limits and Limiting Families

The data for taking a limit is a family $\mathcal{S}$ of real intervals that is fine and consistent. The limit of such a family is a real number contained in every interval of that family. Since we will be taking a lot of limits, we abstract this idea in the following definition.

Definition 1.8.1 A limiting family $\mathcal{S}$ is a fine and consistent family of real intervals. The limit of such a family is the real number $\lim(\mathcal{S})$ contained in each interval of $\mathcal{S}$.


Notation 1.8.2 $\mathcal{S} \to X$ means $\mathcal{S}$ is a limiting family and $X = \lim(\mathcal{S})$.

Note first that the existence and uniqueness of $\lim(\mathcal{S})$ is guaranteed by the Completeness Theorem. Since $\lim(\mathcal{S})$ is in each interval of $\mathcal{S}$, we must have
$$A \le \lim(\mathcal{S}) \le B \quad\text{for each } [A, B] \in \mathcal{S}.$$
To say this in another way, we extend Definition (1.2.25), originally made for families of rational intervals, to real intervals.

Definition 1.8.3 If $[A, B]$ is an interval of the family $\mathcal{S}$, then $A$ is called a lower bound for $\mathcal{S}$ and $B$ is called an upper bound for $\mathcal{S}$.

It is now very easy to show that the result characterizing inequalities for rational families (Proposition 1.2.26) also holds for real families. In particular, $\mathcal{S} \le \mathcal{S}'$ if and only if every lower bound of $\mathcal{S}$ is less than or equal to every upper bound of $\mathcal{S}'$.

Proposition 1.8.4 Let $\mathcal{S}$ be a limiting family, and let $L$ and $U$ be any lower and upper bounds for $\mathcal{S}$. Then $L \le \lim(\mathcal{S}) \le U$.

The proof is an exercise.

We have seen previously in the Wiggle Lemma (Lemma 1.6.8) that if $X \le Y + \varepsilon$ for all $\varepsilon > 0$, then $X \le Y$. The first assertion of the following proposition is the version for limiting families.

Proposition 1.8.5 Let $\mathcal{S}$ and $\mathcal{T}$ be limiting families. Then
1. $\mathcal{S} \le \mathcal{T} \iff \lim(\mathcal{S}) \le \lim(\mathcal{T})$.
2. $\mathcal{S} < \mathcal{T} \iff \lim(\mathcal{S}) < \lim(\mathcal{T})$.

Proof. First we assume $\mathcal{S} \le \mathcal{T}$. Suppose that $\mathcal{S} \to X$ and $\mathcal{T} \to Y$. Let $\varepsilon > 0$ be given; we will show $X < Y + \varepsilon$. Choose intervals $[A, B] \in \mathcal{S}$ and $[C, D] \in \mathcal{T}$ each of length less than $\varepsilon/2$. Since $\mathcal{S} \le \mathcal{T}$ we must have $A \le D$, and so $B - A < \varepsilon/2$ and $A - Y \le D - Y \le D - C < \varepsilon/2$. Adding these, we get $B - Y < \varepsilon$, so $X \le B < Y + \varepsilon$. Since $X < Y + \varepsilon$ for any $\varepsilon > 0$, we can invoke the Wiggle Lemma (Lemma 1.6.8) to conclude that $X \le Y$.

Conversely, suppose that $X \le Y$. $X$ is in every interval $[A, B]$ of $\mathcal{S}$, and $Y$ is in every interval $[C, D]$ of $\mathcal{T}$, so we have $A \le X \le Y \le D$. Thus $\mathcal{S} \le \mathcal{T}$. The proof of the second part, for strict inequality, is left as an exercise.

Corollary 1.8.6 If $\mathcal{S}$ and $\mathcal{T}$ are consistent with each other (i.e. every interval of $\mathcal{S}$ meets every interval of $\mathcal{T}$), then $\lim(\mathcal{S}) = \lim(\mathcal{T})$.

The following lemma will be used later in establishing a fundamental inequality for exponential functions (see Proposition 2.3.17 in the next chapter). It shows that if we want to test whether $\mathcal{F} \le \mathcal{G}$, we don't have to test all the intervals of $\mathcal{F}$ against all the intervals of $\mathcal{G}$, but just look at arbitrarily fine ones.

Lemma 1.8.7 Let $F$ and $G$ be fine and consistent families. Suppose that for each rational $\varepsilon > 0$ we can find intervals $[r, s]$ in $F$ and $[u, v]$ in $G$, each of length $< \varepsilon$, such that $r \le v$. Then $F \le G$.

Note that we did not specify whether the families are families of real or rational intervals. The lemma is true in either case, and we leave the proof for the exercises.

Corollary 1.8.8 (Eudoxus' Method of Exhaustion) Let $F$ and $G$ be limiting families. Suppose that for each rational $\varepsilon > 0$ we can find intervals $[r, s]$ in $F$ and $[u, v]$ in $G$, each of length $< \varepsilon$, such that $r \le v$. Then $\lim F > \lim G$ is false.

Eudoxus (408-355 b.c.e.) was a contemporary of Euclid, and many of his ideas were incorporated into Euclid's Geometry. The Method of Exhaustion was devised by Eudoxus to cope with relations such as equalities and inequalities between "quantities" whose ratios aren't rational. The Method was later used by Archimedes (287-212 b.c.e.) to provide elegant and rigorous proofs of various geometric formulas. It seems clear that Eudoxus and Archimedes were among the first to understand clearly the nature of real numbers and their relation to the rationals.

As you probably have suspected for some time, limiting families share many of the properties we have proved for consistent and fine families of rational intervals. In particular, the sum, difference, product and quotient (with denominator $> 0$ or $< 0$) of limiting families is once again a limiting family (the same proof as for rational intervals works here as well). We summarize below some of the main algebraic facts.

Proposition 1.8.9 Suppose $S \to X$ and $T \to Y$. Then $S + T \to X + Y$, $S - T \to X - Y$, $ST \to XY$ and, if $Y$ is a unit, $S/T \to X/Y$.

Proof. We prove the proposition only for products, leaving the other parts as exercises. A typical interval of $ST$ is $\mathbf{X}\mathbf{Y}$, where $\mathbf{X}$ is an interval of $S$ and $\mathbf{Y}$ is an interval of $T$. As in the case of rational intervals, $\mathbf{X}\mathbf{Y}$ contains all reals of the form $xy$, where $x \in \mathbf{X}$, $y \in \mathbf{Y}$. Thus $XY \in \mathbf{X}\mathbf{Y}$, a typical interval of $ST$, so it lies in all intervals of $ST$. Thus $XY = \lim ST$.

Suppose we have a real number $A$, given as a consistent and fine family of rational intervals $[r, s]$. Consider each of these intervals as a real interval, having the same endpoints; let's call it $[r, s]_{\mathbb{R}}$. The family of all $[r, s]_{\mathbb{R}}$ for $[r, s] \in A$ will be denoted $A_{\mathbb{R}}$.

Proposition 1.8.10 $A_{\mathbb{R}} \to A$.

Proof. $A_{\mathbb{R}}$ is consistent and fine because these properties depend only on the endpoints of the intervals, and the family $A$ is consistent and fine. If $[r, s] \in A$, then we have already observed (Proposition 1.3.12) that $r \le A \le s$, so $A \in [r, s]_{\mathbb{R}}$ for each $[r, s]$. Thus $A = \lim A_{\mathbb{R}}$.

One way of expressing this last proposition is that any real number is the limit of its own family of rational intervals when viewed as real intervals.
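The key step in the proof of Proposition 1.8.9, that the product interval contains every product $xy$ of its members, can be spot-checked with exact rational arithmetic. The sketch below is only an illustration: the function name `interval_product` is hypothetical, and it assumes the usual interval-arithmetic rule that the product of two closed intervals is spanned by the smallest and largest of the four endpoint products.

```python
from fractions import Fraction
from itertools import product


def interval_product(X, Y):
    """Product of closed intervals X = (a, b) and Y = (c, d).

    Assumption: the product interval runs from the least to the
    greatest of the four endpoint products a*c, a*d, b*c, b*d.
    """
    corners = [x * y for x, y in product(X, Y)]
    return min(corners), max(corners)


X = (Fraction(-1), Fraction(2))
Y = (Fraction(3), Fraction(5))
lo, hi = interval_product(X, Y)

# Every product x*y with x in X and y in Y should land in [lo, hi].
samples_x = [X[0] + (X[1] - X[0]) * Fraction(i, 10) for i in range(11)]
samples_y = [Y[0] + (Y[1] - Y[0]) * Fraction(j, 10) for j in range(11)]
all_inside = all(lo <= x * y <= hi for x in samples_x for y in samples_y)
print(lo, hi, all_inside)
```

A finite sample is of course not a proof; it only illustrates the containment that the proof relies on.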

Exercises

1. Prove the following, stated in the text: Proposition 1.8.4 Let $S$ be a limiting family, and let $L$ and $U$ be any lower and upper bounds for $S$. Then $L \le \lim(S) \le U$. (See Proposition 1.7.2.)

2. Prove Corollary 1.8.6: if every interval of $S$ meets every interval of $T$, then $\lim(S) = \lim(T)$.

3. Prove the following, stated in the text: Proposition 1.8.5 If $S$ and $T$ are limiting families, then $S < T \iff \lim(S) < \lim(T)$. (Hint: $\Rightarrow$ is straightforward. Suppose then that $X = \lim(S) < \lim(T) = Y$. Use fineness to choose $[A, B] \in S$ and $[C, D] \in T$, each with length $< (Y - X)/2$. Use this to show that $[A, B] < [C, D]$.)

4. Prove the following, stated in the text: Proposition 1.8.9 Suppose $S \to X$ and $T \to Y$. Then $S + T \to X + Y$, $S - T \to X - Y$, $ST \to XY$ and, if $Y$ is a unit, $S/T \to X/Y$.

Appendix: The Goldbach Number and Trichotomy

In 1742, in a letter to Euler, Christian Goldbach made the following guess, based on a lot of computations:

The Goldbach Conjecture: Every even number greater than 2 is the sum of two primes.

Although deep partial results have been obtained since that time, no proof of this conjecture is known. It has been tested by computer over vast stretches of integers and no counterexample has been found.

Several elementary observations can be made about the conjecture itself. The first is that it is verifiable for any particular even number $N$: simply check the sums, two by two, of the finite number of primes less than $N$. This may not be the most efficient test, but it is conceptually clear that it works. From this it follows that the Goldbach conjecture cannot be formally undecidable and false. In fact, if $N$ is an even number that is not the sum of two primes, then a search will discover $N$ after (at most) $N$ steps, and the test just mentioned will verify (prove) that $N$ is a counterexample.

Suppose first that $P$ is a computable boolean (true- or false-valued) function, with domain the positive integers. We will use the example
$$P(n) = (n \text{ is congruent to } 3 \bmod 4).$$
Now we define the function $g$ on positive integers by
$$g(n) = \begin{cases} 0 & \text{if } n = 1 \text{ or } 2n \text{ is a sum of 2 primes,} \\ 1 & \text{if } 2n \text{ is not a sum of 2 primes and } P(n) \text{ is true,} \\ -1 & \text{if } 2n \text{ is not a sum of 2 primes and } P(n) \text{ is false.} \end{cases}$$
Next, we define

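$$G_N = \sum_{k=1}^{N} \frac{g(k)}{2^k}$$
and observe that
$$\left| \sum_{k=M}^{N} \frac{g(k)}{2^k} \right| \le \sum_{k=M}^{N} \frac{1}{2^k} = \frac{1}{2^{M-1}} - \frac{1}{2^N}.$$
Finally, we define the following family of rational intervals.

Definition 1.8.11
$$G = \left\{ \left[ G_N - \frac{1}{2^{N-1}},\; G_N + \frac{1}{2^{N-1}} \right] : N = 1, 2, \ldots \right\}$$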
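The definitions of $g$, $G_N$ and the family $G$ are all directly computable, which is the point of the construction. The following is a minimal sketch under stated assumptions: the helper names (`is_prime`, `is_sum_of_two_primes`, `G_partial`, `goldbach_interval`) are hypothetical, primality is tested by naive trial division, and exact rationals are used so that no rounding enters.

```python
from fractions import Fraction


def is_prime(m):
    """Naive trial-division primality test (fine for small m)."""
    if m < 2:
        return False
    i = 2
    while i * i <= m:
        if m % i == 0:
            return False
        i += 1
    return True


def is_sum_of_two_primes(m):
    """Check all splits m = p + (m - p) with 2 <= p <= m/2."""
    return any(is_prime(p) and is_prime(m - p) for p in range(2, m // 2 + 1))


def P(n):
    """The boolean used in the text: n is congruent to 3 mod 4."""
    return n % 4 == 3


def g(n):
    """The function g from the text, with values 0, 1 or -1."""
    if n == 1 or is_sum_of_two_primes(2 * n):
        return 0
    return 1 if P(n) else -1


def G_partial(N):
    """The partial sum G_N = sum of g(k) / 2^k for k = 1..N."""
    return sum(Fraction(g(k), 2 ** k) for k in range(1, N + 1))


def goldbach_interval(N):
    """The N-th interval [G_N - 1/2^(N-1), G_N + 1/2^(N-1)] of the family G."""
    radius = Fraction(1, 2 ** (N - 1))
    center = G_partial(N)
    return center - radius, center + radius


print(goldbach_interval(5))
```

Since every even number that has been tested so far is a sum of two primes, each partial sum this sketch can reach in practice is 0, and the intervals are simply $[-1/2^{N-1}, 1/2^{N-1}]$.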

It is not hard to see that this is a fine and consistent family (see exercises below), so $G$ is a real number. We will call it the Goldbach number. (We could also define $G = \sum_{k=1}^{\infty} g(k) \cdot 2^{-k}$, but we have not yet studied infinite series carefully. We will return to $G$ in a later chapter; see page 163.)

The idea of constructing $G$ in this way has appeared in various places, and may have originated with von Neumann (cf. his essay "The formalist foundations of mathematics" (1931) reprinted in [Benacerraf-Putnam, 1984]). Von Neumann does not use the boolean $P$ (equivalently, he chooses $P(n)$ to be the statement "$n = n$"), which makes his example somewhat less effective.

Since $g(k)$ is computable for any $k$, it is clear that the partial sums $G_N$ are computable to any desired degree of accuracy. Thus $G$ is a well defined real number. This in spite of the fact that the Goldbach Conjecture has not been established. On the other hand, because the Goldbach Conjecture has not been established, we cannot tell whether $G$ is zero, positive, or negative. At this point, all we definitely know is that $G$ is very small in absolute value.

It may be that Goldbach's Conjecture will never be resolved! What then will be the truth value of the conjecture? If one takes the (Platonic) viewpoint that truth is independent of our knowledge, then one might assert strong trichotomy to say that the Goldbach number is definitely either zero or positive or negative. On the other hand, the constructivist viewpoint (to which your author subscribes) asserts that a statement has a truth value only when it has been proved or disproved. Thus, from this position, only $\varepsilon$-Trichotomy holds for $G$.

Maybe, by the time you read this, Goldbach's Conjecture will have been resolved, either proved or refuted. Then the particular question about whether $G$ is positive, negative or zero will have been settled. But as long as there are unproved theorems in number theory (which is, safe to say, forever) there will be other numbers constructed in a similar way for which these issues apply.

One final note about $G$: $|G|$ is computable since $G$ is. Consider the product
$$(G - |G|) \cdot (G + |G|) = G^2 - |G|^2 = 0.$$
We thus have the product of two numbers equal to 0, yet we cannot determine which number is 0. (Thanks to John Frampton for this idea.)

Exercises

1. Prove that the family $G$ is fine and consistent.

2. Prove that the Goldbach Conjecture is true if and only if $G = 0$.

3. For any real number $A$, $(A - |A|) \cdot (A + |A|) = A^2 - |A|^2 = 0$. Is it always true that this leads to a product being 0, but we can't determine which factor is 0? Explain.

2. AN INVERSE FUNCTION THEOREM AND ITS APPLICATION

2.0 Introduction

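Suppose $[A, B]$ is a real interval and $f$ is a function with $f(A) < f(B)$. For each number $z \in [f(A), f(B)]$, can we find some $x \in [A, B]$ such that $f(x) = z$? (We can also ask if there is more than one such $x$.) Here's a picture where the answer seems affirmative.

[Figure: graph of $f$ over $[A, B]$, with $z$ marked between $f(A)$ and $f(B)$ on the vertical axis and a corresponding $x$ marked between $A$ and $B$ on the horizontal axis.]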

We can put this differently: if $f : [A, B] \to [f(A), f(B)]$, can we go backwards to find an inverse function from $[f(A), f(B)]$ to $[A, B]$? We will show that this is the case when $f$ satisfies left and right Lipschitz conditions, that is, when there exist reals $L$ and $K$ with $0 < L \le K$ and
$$L(y - x) \le f(y) - f(x) \le K(y - x)$$
for all $x \le y$ in $[A, B]$. This is our Inverse Function Theorem (IFT).

Given $z \in [f(A), f(B)]$, we will construct a nested (hence consistent) family of subintervals of $[A, B]$ using the left-hand inequality. Each of these intervals $[a, b]$ will have the property $f(a) \le z \le f(b)$. The right-hand condition guarantees that this family is also fine; hence we can invoke the Completeness Theorem to find some $x$ which is in all of them. It will turn out that $f(x) = z$.

After proving the IFT, we will apply it to various elementary functions, the simplest of which are the power functions $x^N$. Since these functions all satisfy left and right Lipschitz conditions on finite intervals $0 < [A, B]$ (when $N > 0$), the IFT guarantees the existence of $N$th roots on positive finite intervals. This can be extended to intervals $[0, B]$ as well.

Next, we can define $A^r = \left(A^M\right)^{1/N} = \left(A^{1/N}\right)^M$ for any rational $r = M/N \ge 0$; this can be extended to negative rationals as well.

Finally, for a real exponent $x$, we can look at the family of real intervals of the form $[A^r, A^s]$, where $[r, s] \in x$. This family turns out to be fine and consistent, so we use completeness again to pick out a real number common to all the intervals in this family. We define this real number to be $A^x$.

The exponential function $A^x$ constructed in this way satisfies the following Lipschitz condition on $[c, d]$ (when $A > 1$):
$$A^c \left(1 - \frac{1}{A}\right)(y - x) \le A^y - A^x \le A^d (A - 1)(y - x).$$
This enables us to define the inverse function $\log_A(x)$. Finally, we will construct the Euler number $e$, and thus obtain the functions $e^x$ and $\ln(x)$.

Throughout this chapter we will be talking about intervals of various kinds. Here is a catalog.

Notation 2.0.1 Types of real intervals:
1. (Closed finite): $x \in [A, B] \iff A \le x \le B$.
2. (Open finite): $x \in (A, B) \iff A < x < B$.
3. (Left open finite): $x \in (A, B] \iff A < x \le B$.
4. (Right open finite): $x \in [A, B) \iff A \le x < B$.
5. (Left infinite closed): $x \in (-\infty, B] \iff x \le B$.
6. (Left infinite open): $x \in (-\infty, B) \iff x < B$.
7. (Right infinite closed): $x \in [A, \infty) \iff x \ge A$.
8. (Right infinite open): $x \in (A, \infty) \iff x > A$.

2.1 Functions and Inverses

Definition 2.1.1 A real-valued function $f$ is a rule or procedure that defines, for each allowable real number $x$, another real number denoted $f(x)$. The collection of allowable numbers $x$ is called the domain of $f$. We say that $f$ is defined on the interval $[A, B]$ if every number in $[A, B]$ is in the domain of $f$.

In the general case, a real number $x$ will be given as a consistent fine family of rational intervals, and the rule for $f$ must construct from this family a new family defining the real number $f(x)$. The most basic functions use algebraic operations, already known to yield real numbers, to construct their values, so we can usually avoid having to work with intervals again. Here are some simple examples.

Example 2.1.2 (Some elementary functions we already know about)

• Given real numbers $\alpha$ and $\beta$, define the affine function $L(x) = \alpha x + \beta$. When $\beta = 0$ and $\alpha = 1$ we have the identity function $L(x) = x$; when $\alpha = 0$ we have the constant function $L(x) = \beta$.

• Given real numbers $\alpha, \beta, \gamma, \delta$, define the fractional-linear function $F$ by
$$F(x) = \frac{\alpha x + \beta}{\gamma x + \delta}.$$
The (implied) domain here consists of those real numbers $x$ for which $\gamma x + \delta$ is a unit (i.e. $\gamma x + \delta > 0$ or $\gamma x + \delta < 0$).

• Define the squaring function to be $x^2 = x \cdot x$; more generally, for $N$ a positive integer, define the $N$th power function to be $x^N = \underbrace{x \cdot x \cdots x}_{N \text{ factors}}$.

Definition 2.1.3 (Inverse functions on intervals) Suppose $f : [A, B] \to \mathbb{R}$ with $f(A) \le f(B)$. We say that $g : [f(A), f(B)] \to [A, B]$ is an inverse for $f$ if $f(g(z)) = z$ for every $z \in [f(A), f(B)]$. We say that $f$ and $g$ are inverses of each other if, in addition, $g(f(x)) = x$ for every $x \in [A, B]$.

Definition 2.1.4 We write $f : [A, B] \to I$ if, for each $x \in [A, B]$, $f(x) \in I$.
1. $f : [A, B] \to I$ is called surjective (or onto) if, for each $z \in I$, there is some $x \in [A, B]$ (depending on $z$) such that $z = f(x)$.
2. $f : [A, B] \to I$ is called injective (or 1-to-1) if $f(x_1) = f(x_2)$ implies that $x_1 = x_2$.

The following facts are a direct consequence of the definitions, and their proof will be left as an exercise.

Proposition 2.1.5 Suppose $f : [A, B] \to \mathbb{R}$.
1. If $f : [A, B] \to [f(A), f(B)]$ and $g$ is an inverse for $f$, then $f$ is surjective and $g$ is injective.
2. If $f(A) \le f(B)$ and $f$ and $g$ are inverses of each other, then $f : [A, B] \to [f(A), f(B)]$, and both $f$ and $g$ are injective and surjective (1-to-1 and onto).

Example 2.1.6 Let $f(x) = 1 - |1 - x|$ on the interval $[A, B] = [0, 3/2]$. Then $f(A) = f(0) = 0 < 1/2 = f(3/2) = f(B)$. The function $g(z) = z$ on the interval $[f(A), f(B)] = [0, 1/2]$ is an inverse for $f$. Note that $f$ is surjective but not injective, since $f(1/2) = f(3/2)$, while $g : [0, 1/2] \to [0, 3/2]$ is injective but not surjective.
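The claims in Example 2.1.6 are easy to confirm by direct evaluation. The short sketch below uses exact rationals; the sample points are arbitrary choices made for illustration.

```python
from fractions import Fraction


def f(x):
    """f(x) = 1 - |1 - x| from Example 2.1.6, considered on [0, 3/2]."""
    return 1 - abs(1 - x)


def g(z):
    """The proposed inverse g(z) = z on [0, 1/2]."""
    return z


half = Fraction(1, 2)
three_halves = Fraction(3, 2)

# g is an inverse for f: f(g(z)) = z at sample points z in [0, 1/2].
zs = [half * Fraction(i, 10) for i in range(11)]
right_inverse_ok = all(f(g(z)) == z for z in zs)

# f is not injective, and g(f(x)) = x fails at x = 3/2.
print(right_inverse_ok, f(half) == f(three_halves), g(f(three_halves)))
```

So $g$ is an inverse for $f$ in the sense of Definition 2.1.3, but $f$ and $g$ are not inverses of each other.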

We will be interested in finding some conditions that will guarantee that a function, defined on some interval $[A, B]$, has an inverse.

Some functions can be inverted simply by solving for $x$. For example, for the fractional-linear function $\frac{\alpha x + \beta}{\gamma x + \delta}$, we can write $y = \frac{\alpha x + \beta}{\gamma x + \delta}$ and find that $x = \frac{\beta - \delta y}{\gamma y - \alpha}$ (with appropriate domains). In general, though, we will have to make some sort of assumptions to guarantee that, for each allowable real number $z$ lying in some interval, we can find an $x$ in the domain of $f$ such that $f(x) = z$; furthermore, we want to avoid the possibility that there might be more than one such $x$, since that will complicate our task of finding the right one. So, it will be useful to make sure that $f(x_1) = f(x_2)$ implies that $x_1 = x_2$. Also, for reasons that will become clear when you see how the construction unfolds, we want $f(x_1)$ and $f(x_2)$ to be close when $x_1$ and $x_2$ are close. One way to meet these conditions on $f$ is to assume that $f$ satisfies Lipschitz conditions.¹

¹ Rudolf Lipschitz (1832-1903)

Definition 2.1.7 (Lipschitz Conditions) Suppose that $f$ is defined on the interval $[A, B]$.
1. The function $f$ satisfies a left Lipschitz condition if there is a real number $L > 0$ such that $L \cdot (y - x) \le f(y) - f(x)$ for all $A \le x \le y \le B$.
2. The function $f$ satisfies a right Lipschitz condition if there is a real number $K > 0$ such that $f(y) - f(x) \le K \cdot (y - x)$ for all $A \le x \le y \le B$.
3. $f$ satisfies a two-sided Lipschitz condition if it satisfies both left and right conditions with $L \le K$.

Please note that Lipschitz conditions aren't unique, and don't have to be the best possible. Consider the function $f(x) = x^3 + 2x + 1$ on the interval $[0, 2]$. We can write
$$\begin{aligned} f(y) - f(x) &= \left(y^3 + 2y + 1\right) - \left(x^3 + 2x + 1\right) \\ &= (y^3 - x^3) + 2(y - x) \\ &= (y - x)(y^2 + xy + x^2) + 2(y - x) \\ &= (y^2 + xy + x^2 + 2)(y - x). \end{aligned}$$
For $x$ and $y$ in the interval $[0, 2]$, $y^2 + xy + x^2 + 2$ is never bigger than 14 nor less than 2, so $f$ satisfies the Lipschitz conditions
$$2(y - x) \le f(y) - f(x) \le 14(y - x).$$
On the other hand, it also satisfies
$$\tfrac{3}{2}(y - x) \le f(y) - f(x) \le 15(y - x).$$
In fact, if $0 < L \le 2$ and $14 \le K$, then $f$ also satisfies $L \cdot (y - x) \le f(y) - f(x) \le K \cdot (y - x)$.

If $f$ satisfies a two-sided Lipschitz condition we can deduce two important facts:

• $f(x) \le f(y)$ when $x \le y$ and $f(x) < f(y)$ when $x < y$ ($f$ is strictly increasing).
• $f(y) - f(x) \le K \cdot \varepsilon$ when $|y - x| \le \varepsilon$.

These are easy exercises. We are now ready to find inverses.
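The constants found for $f(x) = x^3 + 2x + 1$ on $[0, 2]$ can be sanity-checked by computing difference quotients on a grid of rational points. This is only a finite spot check, not a proof, and the helper name `quotient_range` and the grid size are arbitrary choices made for this sketch.

```python
from fractions import Fraction


def f(x):
    """The worked example f(x) = x^3 + 2x + 1."""
    return x ** 3 + 2 * x + 1


def quotient_range(func, A, B, n):
    """Smallest and largest difference quotient (func(y) - func(x)) / (y - x)
    over all pairs x < y taken from an evenly spaced grid of n + 1 points."""
    grid = [A + (B - A) * Fraction(i, n) for i in range(n + 1)]
    quotients = [
        (func(y) - func(x)) / (y - x)
        for i, x in enumerate(grid)
        for y in grid[i + 1:]
    ]
    return min(quotients), max(quotients)


low, high = quotient_range(f, Fraction(0), Fraction(2), 20)
# Every sampled quotient should lie between the constants L = 2 and K = 14.
print(low, high, 2 <= low and high <= 14)
```

Any admissible pair $L$, $K$ must bracket all such quotients, which is why smaller $L$ and larger $K$ (such as $3/2$ and $15$) also work.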

Exercises

1. Find Lipschitz conditions for the function $P(x) = 2x^2 - 3x + 4$ on the interval $[1, 3]$.

2. Show that the squaring and cubing functions satisfy left and right Lipschitz conditions on any interval $[A, B]$ with $0 < A \le B$. To find them, note that $y^2 - x^2 = (y + x)(y - x)$ and $y^3 - x^3 = (y^2 + yx + x^2)(y - x)$. Now see how big or small $(y + x)$ and $(y^2 + yx + x^2)$ can be on the interval $[A, B]$.

3. Prove that neither the squaring nor cubing functions can satisfy a left Lipschitz condition on any interval containing 0. (Hint: Use the factorizations of the previous exercise; show that $x + y$ (resp. $x^2 + xy + y^2$) can be made arbitrarily small while $y - x > 0$. You can let $x = 0$ for simplicity.) Can either function satisfy a right Lipschitz condition on such an interval?

4. Show that the function $w(x) = \dfrac{x^2}{1 + x}$ satisfies a two-sided Lipschitz condition on every interval $[A, B]$ where $A > 0$.

5. If $f$ satisfies a left Lipschitz condition, prove that $f(x) \le f(y)$ when $x \le y$ and $f(x) < f(y)$ when $x < y$. (That is, $f$ is strictly increasing.) Prove also that $f(x) = f(y)$ implies $x = y$ (so $f$ is 1-to-1). Think of this as a cancellation law: in a certain sense we can cancel $f$ from the equation $f(x) = f(y)$.

6. The left Lipschitz condition says that a function "can't grow too slowly"; explain. Suppose the function also satisfies a right Lipschitz condition: then it "can't grow too rapidly." Explain.

7. Find an interval and Lipschitz conditions for the function $h(x) = 7 - 30x + 33x^2 - 4x^3$ on that interval. (Hint: This is tricky. It may help to write $y^2 + yx + x^2$ as $(y + x)^2 - xy$ and let $u = y + x$.)

8. Assume $f$ and $g$ satisfy Lipschitz conditions $L_f \cdot (y - x) \le f(y) - f(x) \le K_f \cdot (y - x)$ and $L_g \cdot (y - x) \le g(y) - g(x) \le K_g \cdot (y - x)$ on an interval $[A, B]$.
(a) Find Lipschitz conditions for the functions $f + g$ and $\alpha f$ ($\alpha > 0$ a constant).
(b) Find Lipschitz conditions for the composition $f \circ g$, assuming it is defined.
(c) Suppose $f$ and $g$ are positively bounded on $[A, B]$, meaning that there are constants such that $0 < S_f \le f(x) \le T_f$, $0 < S_g \le g(x) \le T_g$. Find Lipschitz conditions for the product $fg$. (Hint: Add and subtract $f(x)g(y)$ when looking at $f(y)g(y) - f(x)g(x)$.)

9. Suppose that $f$ satisfies a two-sided Lipschitz condition on $[A, B]$. Prove that $|f(y) - f(x)| \le K \cdot |y - x|$ for all $x, y \in [A, B]$ (with no assumptions on the relative sizes of $x$ and $y$). (Hint: Use the fact that $x \le y$ and $y \le x$ can't both be ruled out; see Remark 1.5.36.)

2.2 An Inverse Function Theorem

Theorem 2.2.1 (Inverse Function Theorem) Suppose the function $f$ is defined on the interval $[A, B]$, and that for some constants $K, L$ with $0 < L \le K$, we have the two-sided Lipschitz condition
$$L(y - x) \le f(y) - f(x) \le K(y - x) \qquad (*)$$
whenever $A \le x \le y \le B$. Then we can define a function $g : [f(A), f(B)] \to [A, B]$ such that $f(g(z)) = z$ and the Lipschitz condition
$$\frac{1}{K}(w - z) \le g(w) - g(z) \le \frac{1}{L}(w - z) \qquad (**)$$
is satisfied for all $f(A) \le z \le w \le f(B)$.

A naive proof of this theorem, using bisection, would proceed as follows. Start with the interval $[a_0, b_0] = [A, B]$ and construct a sequence of nested intervals $[a_0, b_0] \supseteq \cdots \supseteq [a_{k-1}, b_{k-1}] \supseteq [a_k, b_k] \supseteq \cdots$, with $f(a_k) \le z \le f(b_k)$. We would construct these recursively, that is, using the one already found to find the next. So suppose that $[a_k, b_k]$ has been defined with $z$ lying between $f(a_k)$ and $f(b_k)$. Let $m$ be its midpoint (i.e. $m = (a_k + b_k)/2$), and look at $f(m)$. If $z \le f(m)$, then $f(a_k) \le z \le f(m)$, so $f^{-1}(z)$ would lie between $a_k$ and $m$ (we are assuming $f$ is increasing; i.e. if $x \le y$ then $f(x) \le f(y)$). In this case, we would choose the next interval $[a_{k+1}, b_{k+1}]$ to be $[a_k, m]$. On the other hand, if $f(m) \le z$, then $f(m) \le z \le f(b_k)$, so we would choose $[a_{k+1}, b_{k+1}] = [m, b_k]$. Each of the intervals $[a_k, m]$, $[m, b_k]$ has half the length of $[a_k, b_k]$. Repeating this process yields a nested (hence consistent) family of real intervals, each one-half the length of the previous one. By the Completeness Theorem, this family would have a limit which we call $g(z)$, our candidate for $f^{-1}(z)$.

There is only one thing wrong with this approach: it may not be possible to determine the relative sizes of $f(m)$ and $z$ because it is not always possible to compare two reals in this way. This is a very "real" problem in implementing a bisection method of this type on a computer. It may turn out that to a thousand binary or decimal places, $f(m)$ and $z$ are equal, possibly as a result of rounding errors. Even worse, differences in algorithms or rounding errors may produce, in the thousandth place, $f(m) < z$ when, in fact, $f(m) > z$. We are careful to avoid this problem in the proof that follows. We first prove the following useful lemma.

Lemma 2.2.2 If $x \in I = [r, s]$ and $z \in J = [u, v]$ where $I$ and $J$ are intersecting real intervals, then $|z - x| \le \ell(I) + \ell(J)$.

Proof. Intuitively, since $I$ and $J$ intersect, the widest interval they can span occurs when they meet at just one endpoint:

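[Figure: the interval $[r, s]$, of length $s - r$, and the interval $[u, v]$, of length $v - u$, meeting at a single endpoint.]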

Here's the formal argument. Since $r \le x \le s$ and $u \le z \le v$, $(z - v) + (r - x) \le 0$. Thus
$$z - x \le v - r = v - (u - u) - r + (s - s) = (v - u) + (s - r) + (u - s) \le \ell(J) + \ell(I)$$
since $u - s \le 0$ (because $I$ and $J$ intersect). Starting with $x - z$ instead of $z - x$, a similar argument shows that $x - z \le \ell(I) + \ell(J)$, so $-\ell(I) - \ell(J) \le z - x$ and we are done (by Proposition 1.5.26).

Proof of the Inverse Function Theorem. The proof will use a variation of the bisection method. We will want to trap $z$ in the images of smaller and smaller subintervals of $[A, B]$. To do that, we will have to make comparisons of $z$ and $f(m)$, where $m$ is a midpoint of one of these subintervals. Since the real numbers $z$ and $f(m)$ are families of rational intervals, we will need to use a comparison scheme that exploits this representation. It turns out that using a kind of "threefold" bisection does the job: we will look at the images of the left half, the right half, and the middle half of subintervals of $[A, B]$.

Suppose $f(A) \le z \le f(B)$ and let $I_0(z) = [a_0, b_0] = [A, B]$. Suppose that $I_0(z) \supseteq I_1(z) \supseteq \cdots \supseteq I_n(z) = [a_n, b_n]$ have already been defined, with
$$f(a_n) \le z \le f(b_n), \qquad \ell(I_{j+1}) = \tfrac{1}{2}\,\ell(I_j).$$

We will now construct $I_{n+1}$. For simplicity of notation, let $a = a_n$ and $b = b_n$. Take $m = (a + b)/2$, $p = (3a + b)/4$, and $q = (a + 3b)/4$. Consider the intervals $[a, m]$, $[m, b]$ and $[p, q]$, each with length $\ell([a, b])/2$. These are the left, right and middle halves of the interval $[a, b]$. We can now look at the images of these intervals under $f$, $[f(a), f(m)]$, $[f(m), f(b)]$ and $[f(p), f(q)]$, and ask: "In which one does $z$ lie?" Here's a picture:

[Figure: graph of $f$ over $[a, b] = [a_n, b_n]$, with $a$, $p$, $m$, $q$, $b$ marked on the horizontal axis and $f(a)$, $f(p)$, $f(m)$, $z$, $f(q)$, $f(b)$ marked on the vertical axis.]

To make this determination, let's suppose that $I$ is a rational interval in the family of the real number $f(m)$ and $J$ is a rational interval in the real $z$. For the moment, let these intervals be arbitrary (we'll place some length restrictions on them shortly). Since $I$ and $J$ have rational endpoints, we can compare them, with only three possible results. If $I < J$, then $f(m) < z$; if $I > J$, then $f(m) > z$. The third possibility is that $I$ and $J$ intersect. This case leads to the following diagram.

[Figure: vertical axis with $f(a)$, $f(p)$, $f(m)$, $f(q)$, $f(b)$ marked; the interval $I$ contains $f(m)$, the interval $J$ contains $z$, and $I$ and $J$ intersect.]

What parts of this diagram can we actually verify? Because of the left Lipschitz condition, $f$ is non-decreasing, so the relative placement of $f(a)$, $f(p)$, $f(m)$, $f(q)$, and $f(b)$ is correct. The diagram also correctly reflects the assumptions that $I$ and $J$ intersect and that $I$ is an interval of $f(m)$. Since we are looking for a subinterval of $[a, b]$ whose image under $f$ contains $z$, the diagram suggests that $[p, q]$ will work if we can guarantee that $z$ is sufficiently close to $f(m)$. Can we make this happen by choosing the lengths of $I$ and $J$ small enough?

Here's where the lemma we just proved helps. It tells us that $z$ and $f(m)$ can't differ by more than $\ell(I) + \ell(J)$. On the other hand, $f(p)$ and $f(q)$ can't be too close to $f(m)$ because of the left Lipschitz condition:
$$f(m) - f(p) \ge L \cdot \frac{b - a}{4} > 0 \quad \text{and} \quad f(q) - f(m) \ge L \cdot \frac{b - a}{4} > 0.$$
The distance between either $f(p)$ or $f(q)$ and $f(m)$ is at least $\varepsilon = L \cdot \frac{b - a}{4}$. Since $I$ and $J$ come from fine families, we can assume that they satisfy $\ell(I) + \ell(J) < \varepsilon$, so we have $|z - f(m)| \le \varepsilon$. Putting our inequalities together, we have
$$f(p) \le f(m) - \varepsilon \le z \le f(m) + \varepsilon \le f(q).$$
Thus $f(p) \le z \le f(q)$.

In summary, here is the definition of $I_{n+1}$, assuming that $I_n(z) = [a, b]$ has already been constructed and that we have chosen intervals $I \in f(m)$ and $J \in z$ with $\ell(I) + \ell(J) < \varepsilon = L \cdot \frac{b - a}{4}$:
$$I_{n+1} = \begin{cases} [m, b] & \text{if } I < J, \\ [a, m] & \text{if } I > J, \\ [p, q] & \text{if } I \text{ and } J \text{ intersect.} \end{cases}$$
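The threefold bisection just defined can be sketched in code. The following is an illustration under simplifying assumptions, not the general construction: the endpoints stay rational, `f` is assumed to map rationals to rationals exactly (so the interval $I$ for $f(m)$ has length 0), and $z$ is supplied by a hypothetical callback `z_interval(eps)` returning a rational interval $J$ of length less than `eps` that contains $z$. The example inverts $x^2$ on $[1, 2]$, where $L = 2$ is a valid left Lipschitz constant because $y + x \ge 2$.

```python
from fractions import Fraction


def threefold_bisection(f, A, B, L, z_interval, steps):
    """Return rational a, b with f(a) <= z <= f(b) after `steps` halvings.

    f          -- exact on rationals, left Lipschitz constant L on [A, B]
    z_interval -- eps -> (lo, hi), a rational interval of length < eps
                  containing z (assumed interface for this sketch)
    """
    a, b = Fraction(A), Fraction(B)
    for _ in range(steps):
        m = (a + b) / 2
        p = (3 * a + b) / 4
        q = (a + 3 * b) / 4
        eps = L * (b - a) / 4
        lo, hi = z_interval(eps)   # the interval J for z
        fm = f(m)                  # the interval I = [fm, fm] for f(m)
        if fm < lo:                # I < J, so f(m) < z: keep the right half
            a = m
        elif fm > hi:              # I > J, so f(m) > z: keep the left half
            b = m
        else:                      # I and J intersect: keep the middle half
            a, b = p, q
    return a, b


def two(eps):
    """A deliberately fuzzy rational interval around z = 2, of length 2*eps/3."""
    return 2 - eps / 3, 2 + eps / 3


a, b = threefold_bisection(lambda x: x * x, 1, 2, 2, two, 20)
print(float(a), float(b))  # both endpoints approximate the square root of 2
```

Every comparison in the loop is between rational numbers, so each branch is decidable, and in all three branches the new interval has exactly half the length of the old one.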

Note that for each $k$ we have $\ell(I_k) = (B - A)/2^k$. Let $\mathcal{I}(z, f) = \{I_k : k = 0, 1, 2, \ldots\}$. This is a family of nested intervals containing arbitrarily small ones, so it is consistent and fine. Applying the Completeness Theorem, define $g(z) = \lim \mathcal{I}(z, f)$. Then $g(z)$ is a real number such that for each $k$, $a_k \le g(z) \le b_k$. Hence
$$f(a_k) \le f(g(z)) \le f(b_k).$$
By construction, we also have
$$f(a_k) \le z \le f(b_k).$$
Thus, both $f(g(z))$ and $z$ lie in each interval $[f(a_k), f(b_k)]$. These intervals are arbitrarily fine because
$$f(b_k) - f(a_k) \le K \cdot (b_k - a_k) = K \cdot (B - A)/2^k.$$
(Here is where we finally use the right Lipschitz condition!) By Proposition 1.7.7 on the uniqueness of limits, we can conclude that $f(g(z)) = z$.

Now we will verify that inequalities $(**)$ hold. We know that $g(w)$ and $g(z)$ are both in $[A, B]$. We know that $g(w) \ge g(z)$ because if $g(w) < g(z)$, we would have
$$0 < L \cdot (g(z) - g(w)) \le f(g(z)) - f(g(w)) = z - w$$
or $z > w$, which contradicts the assumption. Therefore, since $A \le g(z) \le g(w) \le B$, we get
$$0 \le L \cdot (g(w) - g(z)) \le f(g(w)) - f(g(z)) = w - z \le K \cdot (g(w) - g(z)).$$
From $L \cdot (g(w) - g(z)) \le w - z$ we get $g(w) - g(z) \le \frac{1}{L}(w - z)$, and from the last inequality we get $\frac{1}{K}(w - z) \le g(w) - g(z)$. Thus we have established our Lipschitz conditions for $g$.

It is possible to give a somewhat shorter proof of this construction using $\varepsilon$-Trichotomy (Proposition 1.6.9); see the exercises.

Corollary 2.2.3 In the Inverse Function Theorem, it is also true that, for any $x \in [A, B]$, $g(f(x)) = x$.

Proof. Exercise.

As we have noted, the left Lipschitz condition implies that the function is nondecreasing. There is a version of the Inverse Function Theorem for decreasing functions. Here the Lipschitz constants are negative, but everything else is pretty much the same.

Theorem 2.2.4 (Inverses of decreasing functions) Suppose the real-valued function $f$ is defined on the interval $[A, B]$ and that for some constants $K', L'$ with $L' \le K' < 0$, we have the two-sided Lipschitz condition
$$L'(y - x) \le f(y) - f(x) \le K'(y - x) \qquad (*)$$
whenever $A \le x \le y \le B$. Then we can define a function $g : [f(B), f(A)] \to [A, B]$ such that $f(g(z)) = z$ and the Lipschitz condition
$$\frac{1}{K'}(w - z) \le g(w) - g(z) \le \frac{1}{L'}(w - z) \qquad (**)$$
is satisfied for all $f(B) \le z \le w \le f(A)$.

Proof. There are two ways to prove this. The most straightforward is to go back through the proof of the usual IFT (Theorem 2.2.1 above) and make the slight changes necessary to make it work under these new Lipschitz conditions. On the other hand, it is annoying to have to do all this work over again. Fortunately, there is a trick. Note that the function $-f$ is increasing, and satisfies the conditions of the usual IFT. Finding an inverse for $-f$ yields an inverse for $f$. The details are left as one of the exercises at the end of this section.

When $n > 0$, we can apply the Inverse Function Theorem to find $n$th roots. The following algebraic lemma and corollary, easily proved by induction, establish the necessary Lipschitz conditions.

Lemma 2.2.5 For all $x, y \in \mathbb{R}$ and $n \in \mathbb{N}$,
$$y^n - x^n = \left( \sum_{i=0}^{n-1} y^i x^{n-1-i} \right)(y - x).$$

Corollary 2.2.6 If $0 < A \le x \le y \le B$ then
$$nA^{n-1}(y - x) \le y^n - x^n \le nB^{n-1}(y - x).$$

The role of this lemma and corollary and further details about $n$th roots are discussed in the exercises.

Note that $K$ and $L$ are not allowed to be 0 in Lipschitz conditions. If you look through the proof of the IFT you can see why $K = 0$, for example, is not useful. Nevertheless, the squaring function, for example, does have the square root function as an inverse on any interval $[0, B]$. However, the square root does not satisfy any Lipschitz condition of the form $\sqrt{y} - \sqrt{x} \le K(y - x)$ on any interval containing 0. You can see this easily by letting $x = 0$ in such a relation and noting that $\sqrt{y}/y$ can't be bounded by any $K$ on an interval $[0, B]$. Thus, we have to modify the statement of the IFT and its proof, slightly, to deal with intervals containing 0. We will state the results here, but leave their proofs as exercises.

Theorem 2.2.7 (IFT: One-sided Case) Suppose that $f$ is defined on the interval $[0, B]$ with $f(0) = 0$ and $f(B) = D > 0$. Suppose also that for each $A$ with $0 < A \le B$, we have a number $L_A > 0$ (depending on $A$) such that
$$L_A \cdot (y - x) \le f(y) - f(x) \quad \text{for all } A \le x \le y \le B$$
and another number $K > 0$ such that
$$f(y) - f(x) \le K \cdot (y - x) \quad \text{for all } 0 \le x \le y \le B.$$
Then we can define a function $g : [0, D] \to [0, B]$ such that for each $z \in [0, D]$, $f(g(z)) = z$. Furthermore, $g$ will satisfy Lipschitz conditions on all intervals of the form $[P, D]$ where $0 < P \le D$.

Theorem 2.2.8 (IFT: Two-sided Case) Suppose that $A < 0 < B$ and that $f$ is defined on the interval $[A, B]$ with $f(A) = C$, $f(B) = D$ and $f(0) = 0$. Suppose also that for each $0 < E \le B$, we have a number $L_E > 0$ (depending on $E$) with
$$L_E \cdot (y - x) \le f(y) - f(x) \quad \text{for all } E \le x \le y \le B$$
and, for each $A \le F < 0$, a number $L'_F$ (depending on $F$) with
$$L'_F \cdot (y - x) \le f(y) - f(x) \quad \text{for all } A \le x \le y \le F.$$
Finally, suppose that there is a $K > 0$ with
$$f(y) - f(x) \le K \cdot (y - x) \quad \text{for all } A \le x \le y \le B.$$
Then we can define a function $g : [C, D] \to [A, B]$ such that for each $z \in [C, D]$, $f(g(z)) = z$. Furthermore, $g$ will satisfy Lipschitz conditions on all intervals of the form $[P, D]$ or $[C, Q]$ where $0 < P \le D$ and $C \le Q < 0$.

Applying the one-sided case of IFT to even power functions, we see that their inverses, even roots, are defined on intervals $[0, B]$. The two-sided case shows that odd roots are defined on any interval. The details of these applications of the IFT to general $n$th powers we leave for the exercises.

Definition 2.2.9 $\sqrt[n]{\ }$, or $(\ )^{1/n}$, is the inverse of the function $(\ )^n$. When $n$ is odd, this function is defined on any interval; when $n$ is even it is defined only on intervals $[A, B]$ with $A \ge 0$.
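The Lipschitz constants for $n$th powers used in this section come from Lemma 2.2.5 and Corollary 2.2.6, and both statements can be spot-checked with exact rational arithmetic. The sketch below is illustrative only; the function name `power_difference_factor` and the sample values are arbitrary choices.

```python
from fractions import Fraction


def power_difference_factor(x, y, n):
    """The sum y^i * x^(n-1-i) for i = 0..n-1, the factor in Lemma 2.2.5."""
    return sum(y ** i * x ** (n - 1 - i) for i in range(n))


A, B = Fraction(1), Fraction(2)        # an interval with 0 < A <= B
x, y = Fraction(3, 2), Fraction(7, 4)  # sample points with A <= x <= y <= B
n = 5

factor = power_difference_factor(x, y, n)

# Lemma 2.2.5: y^n - x^n equals the factor times (y - x).
identity_ok = (y ** n - x ** n) == factor * (y - x)

# Corollary 2.2.6: the factor lies between n*A^(n-1) and n*B^(n-1).
bounds_ok = n * A ** (n - 1) <= factor <= n * B ** (n - 1)

print(identity_ok, bounds_ok)
```

With these bounds, $L = nA^{n-1}$ and $K = nB^{n-1}$ serve as Lipschitz constants for $x^n$ on $[A, B]$, which is what the Inverse Function Theorem needs in order to produce $n$th roots there.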

Exercises

1. We know from the last set of exercises that squaring and cubing satisfy Lipschitz conditions on any finite interval $[A, B]$ with $A > 0$. Use this to show that any number $C > 0$ has a square root and a cube root. To do this, you'll need to show that there are always positive real numbers $A$ and $B$ such that $A^2 \le C \le B^2$. (Hint: $A = \min(C, 1)$ and $B = \max(C, 1)$ should work. Explain why.)

2. We know from the last set of exercises that squaring and cubing satisfy Lipschitz conditions on any finite interval $[A, B]$ with $A > 0$. What does the Inverse Function Theorem say about the Lipschitz conditions satisfied by $\sqrt{\ }$ and $\sqrt[3]{\ }$ on an interval $[C, D]$ with $0 < C \le D$? Be careful in answering this question, since the Inverse Function Theorem as stated gives the Lipschitz conditions for the inverse $g$ in terms of $x$ and $y$, lying in $[A, B]$, the interval for the original function $f$. Now you want to express the Lipschitz conditions for the inverse function (here the root function) in terms of variables $w$ and $z$ lying in some $[C, D]$. Thus, you'll have to find the corresponding $A$ and $B$ in terms of $C$ and $D$ in order to apply the IFT. The preceding exercise should be helpful.

3. More generally, prove that for all $x, y \in \mathbb{R}$ and $n \in \mathbb{N}$,
$$y^n - x^n = \left( \sum_{i=0}^{n-1} y^i x^{n-1-i} \right)(y - x).$$
(Hint: Induction is a good method.) Discuss how this result gives us Lipschitz conditions for general $n$th power functions on intervals $[A, B]$ with $A > 0$.

4. Apply exercise 3 above to show that any number $C > 0$ has an $n$th root; see the hint in exercise 1 above.

5. Apply the Inverse Function Theorem to deduce the existence of $n$th roots on intervals $[C, D]$, with $0 < C \le D$. Give the Lipschitz conditions satisfied by $n$th roots on this interval in the form $L \cdot (w - z) \le \sqrt[n]{w} - \sqrt[n]{z} \le K \cdot (w - z)$, where $L$ and $K$ are expressed in terms of $C$ and $D$. The previous exercise will be helpful.

6. Redo the previous two exercises in light of the one- and two-sided cases of the Inverse Function Theorem (see Theorems 2.2.7 and 2.2.8).

7. Let $A_1, A_2, \ldots, A_n$ be positive real numbers, and define the following.
(a) The arithmetic mean: $A = \dfrac{A_1 + A_2 + \cdots + A_n}{n}$
(b) The geometric mean: $G = \sqrt[n]{A_1 \cdot A_2 \cdots A_n}$
(c) The harmonic mean: $H = \dfrac{1}{\frac{1}{n}\left( \frac{1}{A_1} + \frac{1}{A_2} + \cdots + \frac{1}{A_n} \right)} = \dfrac{n}{\frac{1}{A_1} + \frac{1}{A_2} + \cdots + \frac{1}{A_n}}$
Prove that $A \ge G \ge H$. (Hint: Look at exercise 10 on page 55.)

8. Give a shorter proof of the Inverse Function Theorem by making use of $\varepsilon$-Trichotomy (Proposition 1.6.9): compare $z$ with $f(m)$, taking $\varepsilon = L \cdot (b - a)/4$.

82

AN INVERSE FUNCTION THEOREM AND ITS APPLICATION

9. Prove Theorem 2.2.4, the IFT for decreasing functions, using the trick of replacing the decreasing function $f$ by the increasing function $-f$. Note the new Lipschitz conditions you get, and figure out how the inverse $\bar{g}$ for $-f$ can be used to get an inverse for $f$.

10. (Technical) Prove the One-sided and Two-sided versions of the Inverse Function Theorem (Theorems 2.2.7 and 2.2.8). Here is a sketch of the more difficult two-sided case, using $\varepsilon$-Trichotomy. (Make sure you understand the proof of IFT before trying this variation; doing exercise 8 first is a good idea.)

Proof. Suppose $A < 0 < B$. Let $L'(F)$ be the left Lipschitz constant on $[A, F]$ and let $L(E)$ be the left Lipschitz constant on $[E, B]$. Divide $[A, 0]$ into fourths $(A, 3A/4, A/2, A/4, 0)$ and divide $[0, B]$ into fourths $(0, B/4, B/2, 3B/4, B)$. Use $\varepsilon$-Trichotomy: first let $\varepsilon' = L'(A/4) \cdot (-A/4)$. If $z < f(A/2)$, the next interval is $[A, A/2]$. If $z$ is within $\varepsilon'$ of $f(A/2)$, the next interval is $[3A/4, A/4]$. If $z > f(A/2)$, let $\varepsilon = L(B/4) \cdot (B/4)$, and apply $\varepsilon$-Trichotomy again. If $z > f(B/2)$, the next interval is $[B/2, B]$; if $z$ is within $\varepsilon$ of $f(B/2)$, the next interval is $[B/4, 3B/4]$. Finally, if $z < f(B/2)$, the next interval is $[A/2, B/2]$. Note that in all cases save the very last, we are reduced to the standard construction from IFT, since we have Lipschitz conditions on the full interval. In the last case we repeat these steps since $A/2 < 0 < B/2$.

11. Suppose we know the basic facts from calculus about $\exp(x) = e^x$ and $\ln(x)$, namely that $\ln(x) = \int_1^x (1/t)\,dt$, so $\ln(y) - \ln(x) = \int_x^y (1/t)\,dt$, and that $\exp(x)$ is the inverse of $\ln(x)$. Also suppose we know that the integral of a function is bounded below (resp. above) by the product of the length of the interval of integration with the smallest (resp. biggest) value of the function on the interval. (Draw a diagram to see that this makes sense.) Use this to find Lipschitz conditions on the logarithm over a finite interval $[A, B]$, $0 < A \le B$, then find Lipschitz conditions on the exponential function over an interval $[C, D] = [\ln A, \ln B]$. We will discuss the calculus of these functions in more detail in Section 6.2. Inequalities for $\ln$ (and similar ones for $\log_A$) will be derived in the next section without using calculus (see page 97).

12. Consider the function $f(x) = x^2$.

(a) Find a Lipschitz condition for $f$ on the interval $D = [.832, .833]$.

(b) Because of this condition, we know that $f$ has an inverse on this interval. It is, of course, the square root function. Now recall the assignment where you worked with the real number called "ln 2" (see page 42). Verify that $\ln 2 \in f(D) = D^2 = [.832^2, .833^2]$. Do this by finding a small enough rational interval bracketing ln 2 and contained in $D^2$.

(c) The proof of the Inverse Function Theorem describes how to find a rational subinterval $E \subseteq D$ having length half that of $D$ such that $\ln 2 \in E^2$. Go through the details of finding this subinterval and find it.


You will need to compute some very small intervals in ln 2 to do the above, so the following outline of a computer program may be helpful if you haven't done much programming. This will ask you for K and then compute the left and right endpoints of the Kth interval of ln 2.

    input K
    0 -> S
    For N = 1 to K do
    begin
        S + 1/(2*N-1) -> S
        S - 1/(2*N) -> S
    end
    output S               REM Left endpoint
    output S+1/(2*K+1)     REM Right endpoint
    REM Gives the Kth interval of ln2
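The program outline above can be turned into a short runnable sketch. The version below is one possible Python translation, not the book's own code: the function name `ln2_interval` is an assumption made for illustration, and exact rational arithmetic (`fractions.Fraction`) is used so that both endpoints are genuine rationals rather than rounded floats.

```python
from fractions import Fraction

def ln2_interval(K):
    """Return (left, right), the K-th rational interval for ln 2."""
    S = Fraction(0)
    for N in range(1, K + 1):
        S += Fraction(1, 2 * N - 1)   # add the odd-denominator term
        S -= Fraction(1, 2 * N)       # subtract the even-denominator term
    # Left endpoint is S; right endpoint adds the next (positive) term.
    return S, S + Fraction(1, 2 * K + 1)

left, right = ln2_interval(500)
print(float(left), float(right))
```

Because the endpoints are exact fractions, they can be compared directly with the rational endpoints $.832^2$ and $.833^2$ from part (b).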

2.3 The Exponential Function

We need to define $A^x$ for real numbers $x$ and $A$, with $A > 0$. We already know how to define $A^n$ for $n \in \mathbb{N}$ (the natural numbers), since this is just repeated multiplication, so our next step is to extend this to $n \in \mathbb{Z}$ (the integers). Once we have $A^n$ for integer exponents, we'll extend the definition to rational exponents, then finally to real exponents (via the Completeness Theorem) by looking at the family of real intervals $[A^r, A^s]$, where $[r, s]$ varies over the rational intervals in the real number $x$.

Definition 2.3.1 Let $A$ be a positive real number and $n$ an integer. Then

1. $A^0 = 1$.

2. For $n > 0$, $A^{n+1} = A \cdot A^n$.

3. For $n < 0$, $A^n = 1/(A^{-n}) = 1/(A^{|n|})$.
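The recursive definition of integer powers can be sketched directly in code. This is only an illustration under stated assumptions: positive rationals (`fractions.Fraction`) stand in for positive reals, and the name `int_pow` is hypothetical.

```python
from fractions import Fraction

def int_pow(A, n):
    """A**n for a positive rational A and an integer n, by recursion on n."""
    if n == 0:
        return Fraction(1)                 # A^0 = 1
    if n > 0:
        return A * int_pow(A, n - 1)       # A^n = A * A^(n-1)
    return 1 / int_pow(A, -n)              # A^n = 1 / A^|n| for n < 0

A = Fraction(3, 2)
print(int_pow(A, 3), int_pow(A, -2))
```

With exact rationals, the law $A^{m+n} = A^m A^n$ can be spot-checked for small positive and negative exponents before proving it by induction.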

Proposition 2.3.2 (Laws of exponents for integers) Given $A, B \in \mathbb{R}$ and $m$ and $n$ integers,

1. $A^{m+n} = A^m A^n$.

2. $(A^m)^n = A^{mn}$.

3. $(AB)^n = A^n B^n$.

Proof. The case where $m$ and $n$ are both positive can be proved easily by induction. The other cases use the previous definition for negative exponents (see the exercises).

The next useful fact is a direct consequence of the Inverse Function Theorem and is a type of cancellation theorem (we are cancelling the exponent $n$).

Lemma 2.3.3 (Root Lemma) Suppose $A, B$ are positive reals and $n \ne 0$ is an integer. Then $A^n = B^n$ implies $A = B$.

Proof. Exercise.

Our next step is to define $A^r$ for rationals $r = m/n$. Since there are two possibilities, $(A^{1/n})^m$ and $(A^m)^{1/n}$, it is nice to know that it doesn't make a difference. This is shown by the following lemma.

Lemma 2.3.4 Given a real number $A > 0$ and $m, n \in \mathbb{Z}$, $(A^{1/n})^m = (A^m)^{1/n}$.

Proof. By the definition of inverse function, we have $\left[ (A^m)^{1/n} \right]^n = A^m$. Also

$$\left[ (A^{1/n})^m \right]^n = (A^{1/n})^{mn} = (A^{1/n})^{nm} = \left[ (A^{1/n})^n \right]^m = A^m.$$

Applying the Root Lemma (2.3.3) above completes the proof.

Definition 2.3.5 For a rational number $r = m/n > 0$ with $m, n \in \mathbb{N}$ and $A > 0$, $A^r = (A^m)^{1/n}$.

Definition 2.3.6 For a rational number $r < 0$ and $A > 0$, $A^r = 1/(A^{-r})$.

Lemma 2.3.7 If $A, B > 0$ are real and $n$ is a non-zero integer, then $(AB)^{1/n} = A^{1/n} B^{1/n}$.

Proof. Deal with the cases $n > 0$ and $n < 0$ separately. Raise both sides to the $n$th power, show that you get the same number, then apply the Root Lemma.

For rationals we now have the fundamental exponent laws.

Proposition 2.3.8 (Laws of exponents for rationals) Suppose $A, B > 0$ are reals and $r, s$ rationals. Then

1. $r = s \implies A^r = A^s$.

2. $(AB)^r = A^r B^r$.

3. $A^{r+s} = A^r A^s$.

4. $(A^r)^s = A^{rs}$.

Proof. We prove just the first part. Suppose that $r = m/n$ and $s = m'/n'$ with $m, n, m', n' > 0$. Since $r = s$, we know that $mn' = m'n$. Then

$$(A^r)^{nn'} = (A^{m/n})^{nn'} = \left[ (A^{m/n})^n \right]^{n'} = (A^m)^{n'} = A^{mn'},$$

$$(A^s)^{nn'} = (A^{m'/n'})^{nn'} = \left[ (A^{m'/n'})^{n'} \right]^n = (A^{m'})^n = A^{m'n}.$$

Thus $(A^r)^{nn'} = (A^s)^{nn'}$, so $A^r = A^s$ by the Root Lemma. Suppose that $r$ and $s$ are negative. Then we have $A^r = (1/A)^{-r} = (1/A)^{|m|/|n|}$ and $A^s = (1/A)^{|m'|/|n'|}$. By

what we have just proved, these last numbers are equal and we are done. The other parts are proved in a similar fashion.

Having defined $A^r$ for rational $r$, we are now ready to work on $A^x$ for $x$ real. The idea will be to construct a family of intervals of the form $[A^r, A^s]$ with $[r, s] \in x$ and apply the Completeness Theorem. We first deal with the case $A \ge 1$, since in this case the function $A^r$ increases with $r$.

Proposition 2.3.9 Suppose $A \in \mathbb{R}$, $A \ge 1$, and $r, s \in \mathbb{Q}$ with $r \le s$. Then $A^r \le A^s$.

Proof. We have $A^r \le A^s$ if and only if $1 \le A^{s-r}$. Since $s - r$ is non-negative, it suffices to show that $A^u \ge 1$ for any non-negative rational $u$. First note that for $n \in \mathbb{N}$, we have $A^n \ge A \ge 1$ (this is easily established by induction). In Corollary 2.2.6, we obtained the inequality $y^n - x^n \le nA^{n-1}(y - x)$ on the interval $[1, A]$. Let $f(x) = x^n$; then $f : [1, A] \to [1, A^n]$. Using this and applying the IFT to $f$, we see that

$$\frac{1}{nA^{n-1}}(w - z) \le w^{1/n} - z^{1/n}$$

on the interval $[1, A^n]$. Since we have just shown that $1 \le A \le A^n$, we can let $w = A$ and $z = 1$, obtaining

$$0 \le \frac{1}{nA^{n-1}}(A - 1) \le A^{1/n} - 1.$$

Thus $A^{1/n} \ge 1$ for any real $A \ge 1$. Since $A^m \ge 1$, we can replace the $A$ in $A^{1/n}$ by $A^m$, giving us $A^u = A^{m/n} = (A^m)^{1/n} \ge 1$.

We now turn to $A^x$ for $A \ge 1$ and $x$ real. Recall that $\bar{x}$ is the family consisting of all rational intervals $[r, s]$ where $r \le x \le s$. For any rational interval $[r, s]$ in $\bar{x}$, we have already defined $A^r$ and $A^s$, and by the proposition just proved, we know that $A^r \le A^s$. Let

$$\Sigma(A, x) = \{ [A^r, A^s] : [r, s] \in \bar{x} \}.$$

To prove that this family is fine, we establish a fundamental inequality for exponentials (here just for rational exponents).

Lemma 2.3.10 For real $A \ge 1$ and rationals $r \le s$,

$$A^r \left( 1 - \frac{1}{A} \right)(s - r) \le A^s - A^r \le A^s (A - 1)(s - r).$$

Proof. Let $s - r = m/n$ with $m$ a non-negative integer and $n$ a positive integer. Apply Corollary 2.2.6 to $f(x) = x^m$ on the interval $[1, A^{1/n}]$ to get

$$m(A^{1/n} - 1) \le A^{m/n} - 1 \le m(A^{1/n})^{m-1}(A^{1/n} - 1).$$

Apply the corollary again to $\varphi(x) = x^n$ on the interval $[1, A^{1/n}]$ to get

$$n(A^{1/n} - 1) \le A - 1 \le n(A^{1/n})^{n-1}(A^{1/n} - 1).$$

For clarity, we label the four separate parts of these inequalities as

(1) $m(A^{1/n} - 1) \le A^{m/n} - 1$

(2) $A^{m/n} - 1 \le m(A^{1/n})^{m-1}(A^{1/n} - 1)$

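(3) $n(A^{1/n} - 1) \le A - 1$

(4) $A - 1 \le n(A^{1/n})^{n-1}(A^{1/n} - 1)$

Inequality (3) gives

$$A^{1/n} - 1 \le \frac{A - 1}{n},$$

while (4) yields:

$$A^{1/n} - 1 \ge \frac{A - 1}{n} A^{(1-n)/n} = \frac{A - 1}{n} \cdot \frac{A^{1/n}}{A} \ge \frac{A - 1}{nA} = \frac{1}{n}\left( 1 - \frac{1}{A} \right).$$

Multiplying this last inequality by $m$ and combining with (1) gives:

$$\frac{m}{n}\left( 1 - \frac{1}{A} \right) \le m\left( A^{1/n} - 1 \right) \le A^{m/n} - 1. \qquad (*)$$

Now note that, since $1/n \ge 0$, we know that $A^{1/n} \ge 1$, so we have $(A^{1/n})^{m-1} = A^{m/n}/A^{1/n} \le A^{m/n}$. Combining this fact with (2) and (3) gives us

$$A^{m/n} - 1 \le m A^{m/n}\left( A^{1/n} - 1 \right) \le m A^{m/n}\left( \frac{A - 1}{n} \right) = \frac{m}{n} A^{m/n}(A - 1). \qquad (**)$$

Putting (*) and (**) together yields

$$\frac{m}{n}\left( 1 - \frac{1}{A} \right) \le A^{m/n} - 1 \le \frac{m}{n} A^{m/n}(A - 1).$$

Since $m/n = s - r$, we have $A^s - A^r = A^r(A^{m/n} - 1)$, and therefore

$$A^r\left( 1 - \frac{1}{A} \right)(s - r) \le A^s - A^r \le A^s(A - 1)(s - r).$$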
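The fundamental inequality just proved is easy to spot-check numerically. The sketch below is only a sanity check under stated assumptions, not part of the proof: it uses floating-point powers, so a small tolerance is allowed, and the helper name `lemma_holds` is hypothetical.

```python
from fractions import Fraction

def lemma_holds(A, r, s):
    """Check A^r (1 - 1/A)(s - r) <= A^s - A^r <= A^s (A - 1)(s - r) for A >= 1, r <= s."""
    Ar, As = A ** float(r), A ** float(s)
    gap = float(s - r)
    lower = Ar * (1 - 1 / A) * gap
    upper = As * (A - 1) * gap
    tol = 1e-9 * max(1.0, abs(upper))      # allowance for floating-point rounding
    return lower - tol <= As - Ar <= upper + tol

samples = [(Fraction(0), Fraction(1, 3)),
           (Fraction(-2), Fraction(-7, 4)),
           (Fraction(5, 2), Fraction(3))]
for A in (1.0, 1.5, 2.0, 10.0):
    print(A, [lemma_holds(A, r, s) for r, s in samples])
```

Note how the upper bound carries the factor $A^s$, which is why the proof of fineness below has to confine $[r, s]$ to a fixed interval.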

Proposition 2.3.11 $\Sigma(A, x)$ is a consistent and fine family.

Proof. Given $[A^r, A^s]$ and $[A^u, A^v]$ in $\Sigma(A, x)$, we know that $r \le x \le s$ and $u \le x \le v$. Therefore $u \le s$ and $r \le v$. So, by Proposition 2.3.9, $A^u \le A^s$ and $A^r \le A^v$. Thus the two intervals intersect, and we have established consistency.

Now, given $\varepsilon > 0$, we must find an interval in $\Sigma(A, x)$ of length less than $\varepsilon$. Let us fix an interval $[c, d]$ in $\bar{x}$. For any subinterval $[r, s]$ of $[c, d]$, we can apply Lemma 2.3.10 to obtain

$$\underbrace{A^c(1 - 1/A)}_{L}(s - r) \le A^r(1 - 1/A)(s - r) \le A^s - A^r \le A^s(A - 1)(s - r) \le \underbrace{A^d(A - 1)}_{K}(s - r)$$

or

$$L \cdot (s - r) \le A^s - A^r \le K \cdot (s - r).$$

Therefore, to get an interval $[A^r, A^s]$ in $\Sigma(A, x)$ with length less than $\varepsilon$, we can simply choose $[r, s]$ to be any subinterval of $[c, d]$ having length less than $\varepsilon/K$.

It may be worthwhile to explain here why we had to expand the family $x$ to the family $\bar{x}$. We have to be careful in trying to deduce arbitrary fineness from the inequality $A^s - A^r \le A^s(A - 1)(s - r)$. Making $s - r$ arbitrarily small on the right-hand side may make $A^s$ unboundedly large at the same time. This is because we know that the family $x$ is fine, which only means that we can find some $[r, s] \in x$ with $s - r$ small. However, the algorithm for doing so may force $s$ (and hence $r$) to be large. But since the intersection of any two intervals in $\bar{x}$ is also in $\bar{x}$, we can restrict ourselves to subintervals of a fixed constant interval, as we did at the end of the proof.

We now could use the following to define the exponential function:

1. For $A \ge 1$, let $A^x = \lim(\Sigma(A, x))$,

2. For $0 < A \le 1$, let $A^x = (1/A)^{-x}$.

However, it is much simpler to define $A^x$ in terms of a single family of intervals for $x$ without having to separate into cases. Fortunately, this is not hard to do.

Definition 2.3.12 (Limiting family for $A^x$) For $A > 0$,

$$\mathcal{S}(A, x) = \{ [\min(A^r, A^s), \max(A^r, A^s)] : [r, s] \in \bar{x} \}.$$

If we know how big $A$ is, then the definition of this family can be simplified. For $A > 0$ and $r, s \in \mathbb{Q}$,

1. When $A \ge 1$, $[\min(A^r, A^s), \max(A^r, A^s)] = [A^r, A^s]$.

2. When $A \le 1$, $[\min(A^r, A^s), \max(A^r, A^s)] = [A^s, A^r]$.

Proposition 2.3.13 $\mathcal{S}(A, x)$ is a consistent and fine family.

Proof. We will deal with the two cases $A \ge 1$ and $A \le 1$ separately. (We have discussed this method of reasoning in the proof of Proposition 1.5.35.) We have proved the case for $A \ge 1$. For $A \le 1$ we have $1/A \ge 1$, and for $[r, s]$ and $[u, v]$ in $\bar{x}$

we have $r \le v$ and $u \le s$. This implies that $(1/A)^r \le (1/A)^v$ and $(1/A)^u \le (1/A)^s$, and therefore $A^v \le A^r$ and $A^s \le A^u$, so $[A^s, A^r]$ and $[A^v, A^u]$ intersect. So $\mathcal{S}(A, x)$ is consistent for both $A \ge 1$ and $A \le 1$, and therefore is consistent.

Now we will show that $\mathcal{S}(A, x)$ contains arbitrarily small intervals. Take any $[r, s]$ in $\bar{x}$. For $A \ge 1$, we have

$$0 \le A^s - A^r \le A^s(A - 1)(s - r). \qquad (\dagger)$$

For $A \le 1$, replacing $A$ by $1/A$ in the inequality above yields

$$0 \le A^r - A^s \le A^{r-1}(1 - A)(s - r). \qquad (\ddagger)$$

So if we fix an interval $[c, d]$ in $\bar{x}$, the following hold for all subintervals $[r, s]$ of that interval.

(1) By relation $(\dagger)$, $0 \le A^s - A^r \le A^d(A - 1)(s - r)$ if $A \ge 1$.

(2) By relation $(\ddagger)$, $0 \le A^r - A^s \le A^{c-1}(1 - A)(s - r)$ if $A \le 1$.

By the relations above, for $A > 0$ we have

$$|A^s - A^r| \le (s - r)\max(A^d, A^{c-1})\,|A - 1|. \qquad (\S)$$

But

$$\ell([\min(A^r, A^s), \max(A^r, A^s)]) = \max(A^r, A^s) - \min(A^r, A^s) = |A^s - A^r|.$$

Therefore we can use inequality $(\S)$ to get arbitrarily small intervals in $\mathcal{S}(A, x)$.

We now can give the definition of the exponential function in terms of the single family $\mathcal{S}$.

Definition 2.3.14 (The exponential function) For $A > 0$, $A^x = \lim(\mathcal{S}(A, x))$.

Proposition 2.3.15 (Laws of exponents for reals) Given real numbers $A$, $B$, $x$, and $y$ with $A, B > 0$,

1. If $x = y$ (as reals) then $A^x = A^y$.

2. $A^{x+y} = A^x A^y$.

3. $(A^x)^y = A^{xy}$.

4. $(AB)^x = A^x B^x$.

Proof. If $x = y$ as reals, it is easy to see that every interval in $\mathcal{S}(A, x)$ meets every interval in $\mathcal{S}(A, y)$. Apply Corollary 1.8.6 to conclude that their limits are equal, so $A^x = A^y$.

We next show that $(A^x)^y = A^{xy}$ in the case where $A \ge 1$ and $x \ge 0$. First, assume $y$ is rational, $y = \{[u, u]\}$. For $[r, s] \in x$, $A^x \in [A^r, A^s]$ and hence

$$(A^x)^y \in [\min(A^{ru}, A^{su}), \max(A^{ru}, A^{su})].$$

On the other hand, $xu \in [\min(ru, su), \max(ru, su)]$, so $A^{xu} \in [\min(A^{ru}, A^{su}), \max(A^{ru}, A^{su})]$ as well. Since the family of intervals $[\min(A^{ru}, A^{su}), \max(A^{ru}, A^{su})]$ is fine (by the fundamental inequality of Lemma 2.3.10), $(A^x)^u = A^{xu}$.

Now suppose $[u, v] \in y$. Then

$$A^{xy} \in [A^{xu}, A^{xv}] = [(A^x)^u, (A^x)^v].$$

But, by the definition of $(A^x)^y$, $(A^x)^y$ belongs to $[(A^x)^u, (A^x)^v]$ as well. Since the family of such intervals is fine, we must have $(A^x)^y = A^{xy}$. The remaining cases and properties are proved similarly.

Corollary 2.3.16 $A^{-x} = 1/(A^x)$.

Proposition 2.3.17 (Bounds for exponential functions) For $0 \le c \le x \le y \le d$,

$$A^c\left( 1 - \frac{1}{A} \right)(y - x) \le A^y - A^x \le A^d(A - 1)(y - x).$$

Proof. We will prove the first inequality; the second is proved similarly. The idea is to find a limiting family $\mathcal{P}$ for $A^c\left( 1 - \frac{1}{A} \right)(y - x)$ and a limiting family $\mathcal{Q}$ for $A^y - A^x$, and to show that $U \le V$ for every lower bound $U$ of $\mathcal{P}$ and upper bound $V$ of $\mathcal{Q}$. Then $\mathcal{P} \le \mathcal{Q}$, and we can invoke Proposition 1.8.5 ("weak inequalities are preserved in the limit") to conclude that $\lim(\mathcal{P}) \le \lim(\mathcal{Q})$.

Let $Q$ be any rational upper bound for $d$. Then we have $0 \le c \le x \le y \le d \le Q$. Let $I = [0, Q]$. Replace each rational interval in the families defining the real numbers $c$, $x$, $y$, and $d$ by its intersection with $I$. By doing so, we can assume that all lower bounds for any of these families are non-negative. We will need this to simplify our construction of product families below.

Let $c_R$, $x_R$, and $y_R$ be the limiting families obtained by converting every rational interval in $c$, $x$ and $y$ into a real interval having the same endpoints. From the section on limiting families (1.8.1), we know that these families of real intervals have limits $c$, $x$, and $y$, respectively. Let $\mathcal{F}$ be the family containing the single interval $[1 - 1/A, 1 - 1/A]$. Let

$$\mathcal{P} = A^{(c_R)} \cdot \mathcal{F} \cdot (y_R - x_R) \quad \text{and} \quad \mathcal{Q} = A^{(y_R)} - A^{(x_R)}.$$

To find a typical lower bound for $\mathcal{P}$, pick arbitrary intervals $[r_c, s_c] \in c_R$, $[r_x, s_x] \in x_R$ and $[r_y, s_y] \in y_R$; these are real intervals with rational endpoints. Because we have made all our lower bounds non-negative, all these numbers and intervals are non-negative, so it follows from the definition of interval multiplication and subtraction that $U = A^{(r_c)}(1 - 1/A)(r_y - s_x)$ is a typical lower bound for $\mathcal{P}$.

In order to prove that $\mathcal{P} \le \mathcal{Q}$, we only have to show that for each $\varepsilon > 0$ there are intervals $I \in \mathcal{P}$ and $J \in \mathcal{Q}$, each with length less than $\varepsilon$, such that the left-hand endpoint of $I$ is less than or equal to the right-hand endpoint of $J$. (This is the content of Lemma 1.8.7 and its corollary, Eudoxus' Method of Exhaustion.) So, for each pair of intervals $[r_c, s_c] \in c_R$ and $[r_x, s_x] \in x_R$ of length $< \varepsilon$, replace $[r_x, s_x]$ by $[\max(r_c, r_x), s_x] \in x_R$. This last interval also has length less than $\varepsilon$, and $\max(r_c, r_x) \ge r_c$. Thus, since the algebraic operations in $A^c\left( 1 - \frac{1}{A} \right)(y - x) \le A^y - A^x$ all preserve fineness, we need only look at intervals in $c_R$ and $x_R$ with the property $r_c \le r_x$.

A typical upper bound for $\mathcal{Q}$ from this family is $V = A^{(s'_y)} - A^{(r'_x)}$, where $[r'_x, s'_x]$ and $[r'_y, s'_y]$ are another set of intervals in $x_R$ and $y_R$ respectively (and $r_c \le r'_x$). Since all the $r$'s and $s$'s are rational, we can use Lemma 2.3.10 again to see that

$$U = A^{r_c}(1 - 1/A)(r_y - s_x) \le A^{r'_x}(1 - 1/A)(r_y - s_x) \le A^{r'_x}(1 - 1/A)(s'_y - r'_x) \le A^{s'_y} - A^{r'_x} = V.$$

We conclude $\mathcal{P} \le \mathcal{Q}$.

Corollary 2.3.18 For the numbers $A$ and $c$ with $A > 0$,

1. $A \ge 1$ and $c \ge 0$ implies $A^c \ge 1$.

2. $A \ge 1$ and $c \le 0$ implies $A^c \le 1$.

3. $A \le 1$ and $c \ge 0$ implies $A^c \le 1$.

4. $A \le 1$ and $c \le 0$ implies $A^c \ge 1$.

The first of these follows from Proposition 2.3.17, and the others follow from the first by noting that $A^{-c} = 1/(A^c)$.

It turns out that the assumption $0 \le c \le x \le y \le d$ can be weakened: the numbers don't have to be non-negative. In reading the proof below, note that we don't have to use intervals anymore. That part of the work was done in Proposition 2.3.17 above, and now we can proceed by simply dealing with the various cases.

Proposition 2.3.19 (Lipschitz condition for exponential functions) If $A > 0$ and $c \le x \le y \le d$, then

$$\underbrace{A^c\left( 1 - \frac{1}{A} \right)}_{L}(y - x) \le A^y - A^x \le \underbrace{A^d(A - 1)}_{K}(y - x).$$

Before beginning the proof, we note that, for real numbers $c$, $x$, $y$, and $d$, we cannot rule out all of the following cases:

(1) $0 \le c \le x \le y \le d$

(2) $c \le 0 \le x \le y \le d$

(3) $c \le x \le 0 \le y \le d$

(4) $c \le x \le y \le 0 \le d$

(5) $c \le x \le y \le d \le 0$

Thus, it suffices to prove the inequality in each of these cases. We will give the proof for $c \le x \le 0 \le y \le d$ and leave the others as exercises. We will need the following small but helpful result.

Lemma 2.3.20 If $A > 0$, then $A^c(1 - 1/A) \le (1 - 1/A)$ and $(A - 1) \le A^d(A - 1)$.

Proof of the Lemma. We'll only show the first inequality; the proof of the second is similar. If $A > 1$, then $A^c \le 1$ since $c$ is assumed non-positive. Multiplying by the non-negative quantity $1 - 1/A$ gives us what we want. If $A = 1$, then both sides are equal. Finally, if $A < 1$, then $A^c \ge 1$, so multiplying by the non-positive quantity $(1 - 1/A)$ also gives us the inequality we want.

Proof of the Proposition. Now we return to Proposition 2.3.19. Since $0 \le y \le d$ we can apply the inequality in a case we already know, with $c = 0 = x$, to conclude

$$A^0(1 - 1/A)\,y \le A^y - A^0 \le A^d(A - 1)\,y.$$

Applying the first inequality of the lemma gives:

$$A^c(1 - 1/A)\,y \le A^y - 1 \le A^d(A - 1)\,y. \qquad (*)$$

Now we also have $0 \le -x \le -c$, so we can once again apply the inequality we already know, with $1/A$ in place of $A$, to obtain

$$(1/A)^0(1 - A)(-x) \le (1/A)^{-x} - 1 \le (1/A)^{-c}(1/A - 1)(-x)$$

or

$$(1 - A)(-x) \le A^x - 1 \le A^c(1/A - 1)(-x).$$

Finally, multiplying by $-1$ and using the lemma, we get the inequalities

$$A^c(1 - 1/A)(-x) \le 1 - A^x \le (A - 1)(-x) \le A^d(A - 1)(-x),$$

which we add to (*) to obtain the desired result.

Corollary 2.3.21 For real numbers $A > 0$ and $x < y$,

1. $A > 1 \implies A^x < A^y$.

2. $A \ge 1 \implies A^x \le A^y$.

3. $A < 1 \implies A^x > A^y$.

4. $A \le 1 \implies A^x \ge A^y$.

5. $A = 1 \implies A^x = A^y = 1$.

Corollary 2.3.22 For real numbers $x$ and $0 < A < B$,

1. $x > 0 \implies A^x < B^x$.

2. $x \ge 0 \implies A^x \le B^x$.

3. $x < 0 \implies A^x > B^x$.

4. $x \le 0 \implies A^x \ge B^x$.

5. $x = 0 \implies A^x = B^x = 1$.

This follows by applying the previous corollary to $A/B$.
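The Lipschitz condition for exponential functions holds for bases on either side of 1 and for exponents of either sign, which a few numerical trials make concrete. This is a floating-point sanity check only, with a small tolerance for rounding; the helper names `lipschitz_constants` and `condition_holds` are hypothetical.

```python
def lipschitz_constants(A, c, d):
    """Return (L, K) = (A^c (1 - 1/A), A^d (A - 1)) for a base A > 0 on [c, d]."""
    return A ** c * (1 - 1 / A), A ** d * (A - 1)

def condition_holds(A, c, x, y, d):
    """Check L (y - x) <= A^y - A^x <= K (y - x) for c <= x <= y <= d."""
    L, K = lipschitz_constants(A, c, d)
    diff = A ** y - A ** x
    tol = 1e-12 * (1 + abs(L) + abs(K))    # allowance for floating-point rounding
    return L * (y - x) - tol <= diff <= K * (y - x) + tol

for A in (0.5, 1.0, 3.0):
    # c <= x <= 0 <= y <= d, the case proved in the text
    print(A, lipschitz_constants(A, -2.0, 1.0), condition_holds(A, -2.0, -1.0, 0.5, 1.0))
```

For $A < 1$ both constants are negative, matching the fact that $A^x$ is then decreasing.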

The Logarithm Function

We start with the Lipschitz condition for exponential functions (Proposition 2.3.19). When $c \le x \le y \le d$ and either $0 < A < 1$ or $A > 1$, we have

$$\underbrace{A^c(1 - 1/A)}_{L} \cdot (y - x) \le A^y - A^x \le \underbrace{A^d(A - 1)}_{K} \cdot (y - x).$$

It is sometimes more convenient to write $A^c(1 - 1/A)$ as $A^{c-1}(A - 1)$. This is, of course, a two-sided Lipschitz condition, so we can apply the Inverse Function Theorem to the function $A^x$ to obtain its inverse, which we call $\log_A$. Note that if $0 < A < 1$, we have to use the IFT for decreasing functions (Theorem 2.2.4).

Definition 2.3.23 When $0 < A < 1$ or $A > 1$, $\log_A(x)$ is the inverse of the function $A^x$.

We then have, for $A^c \le z \le w \le A^d$,

$$\frac{1}{A^d(A - 1)} \cdot (w - z) \le \log_A(w) - \log_A(z) \le \frac{1}{A^c(1 - 1/A)} \cdot (w - z).$$

The only problem with this formula is that the Lipschitz conditions on the logarithm are written in terms of the interval $[c, d]$, which is part of a domain for the exponential function. Supposing that $0 < C \le z \le w \le D$, can we express this inequality in terms of $C$ and $D$? We can if we can write $C = A^c$ and $D = A^d$ for some reals $c$ and $d$. So the question boils down to the following: If $0 < C \le D$ are reals, can we find powers of $A$ such that $A^P \le C \le D \le A^Q$?

Lemma 2.3.24 For any real $\varepsilon > 0$ and integer $P \ge 0$, $(1 + \varepsilon)^P \ge 1 + P\varepsilon$.

The proof, by induction, is an exercise.

Lemma 2.3.25 For any $\varepsilon > 0$ and $C > 0$ there is some $P$ such that $P\varepsilon > C$.

Proof. Choose whole numbers such that $0 < C < m/n$ and $0 < m'/n' < \varepsilon$. Then all you need is $P(nm') > mn'$ (so $P = mn'$ will work, for example).

Corollary 2.3.26 For any real $0 < A < 1$ or $A > 1$ and reals $0 < C \le D$, there exist integers $P$ and $Q$ such that $0 < A^P \le C \le D \le A^Q$.

Proof. In the case $A > 1$, choose $P$ such that $A^P > 1/C$, so $A^{-P} < C$, and choose $Q$ such that $A^Q > D$. When $0 < A < 1$ consider $1/A$, which is greater than 1.

Proposition 2.3.27 The logarithm is defined on any interval $[C, D]$ with $C > 0$.

Proof. Choose $P$ and $Q$ such that $0 < A^P \le C \le D \le A^Q$ and apply the IFT to $A^x$ on the interval $[P, Q]$.

Proposition 2.3.28 (Laws of the log)

1. $\log_A(z^w) = w \log_A z$.

2. $\log_A(zw) = \log_A z + \log_A w$.

3. $\log_A(z/w) = \log_A z - \log_A w$.

4. $\log_B z = (\log_A z)/(\log_A B)$.

(These follow from similar properties for exponential functions and the fact that $A^U = A^V$ implies $U = V$. See the exercises.)
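The two steps used to define the logarithm (bracket the argument between integer powers of the base, then invert the exponential on that interval) can be sketched numerically. The code below substitutes plain bisection with floating-point powers for the interval construction in the proof of the Inverse Function Theorem, so it is an illustration of the idea rather than the construction itself; the name `log_base` is hypothetical.

```python
def log_base(A, C, steps=60):
    """Return (lo, hi) bracketing log_A(C), for a base A > 0 with A != 1 and C > 0."""
    if A < 1:
        # log_A(C) = -log_{1/A}(C), and 1/A > 1
        lo, hi = log_base(1 / A, C, steps)
        return -hi, -lo
    P, Q = 0, 0
    while A ** P > C:       # walk P down until A^P <= C
        P -= 1
    while A ** Q < C:       # walk Q up until C <= A^Q
        Q += 1
    lo, hi = float(P), float(Q)
    for _ in range(steps):
        mid = (lo + hi) / 2
        if A ** mid <= C:   # A^x is increasing, so the answer lies in [mid, hi]
            lo = mid
        else:
            hi = mid
    return lo, hi

print(log_base(2, 8))
print(log_base(0.5, 8))
```

The integer search at the start plays the role of Corollary 2.3.26: it terminates because powers of a base greater than 1 eventually exceed any bound.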

Exercises

1. Prove these properties of integer exponentiation first when $n$ and $m$ are nonnegative (by induction), then when one or both of $n, m$ are negative:

(a) $A^{m+n} = A^m A^n$.

(b) $(A^m)^n = A^{mn}$.

(c) $(AB)^n = A^n B^n$.

(Hint: Use the definition $A^n = 1/(A^{-n})$ when $n$ is negative.)

2. Prove the Root Lemma (Lemma 2.3.3): Suppose $A, B$ are positive reals and $n \ne 0$ is an integer. Then $A^n = B^n$ implies $A = B$.

3. Using the definition $A^{m/n} = (A^m)^{1/n}$, prove that $A^{(-m)/(-n)} = A^{m/n}$ when $m, n$ are positive integers.

4. Prove Proposition 2.3.15, the Laws of Exponents (two parts are proved in the text).

5. Prove Lemma 2.3.24 from the text: For any real $\varepsilon > 0$ and integer $P \ge 0$, $(1 + \varepsilon)^P \ge 1 + P\varepsilon$.

6. Prove that if $0 < C \le D$ and $A > 0$, then there is an integer $N$ such that $0 < A^{-N} < C \le D < A^N$.

7. The function $\log_A(x)$ is defined on any interval $0 < C \le x \le D$. Find Lipschitz conditions for $\log_A$ in terms of $C$ and $D$.

8. Prove the following properties of the logarithm using the corresponding property, proved in the text for exponentiation.

(a) $\log_A(z^w) = w \log_A z$.

(b) $\log_A(zw) = \log_A z + \log_A w$.

(c) $\log_A(z/w) = \log_A z - \log_A w$.

(d) $\log_B z = (\log_A z)/(\log_A B)$.

(Hint: $A^L = A^R$ implies $L = R$.)

2.4 Natural Logs and the Euler Number $e$

In order to take the mystery out of what we're about to do in this section, remember the computation from calculus where you look at the derivative of the function $f(x) = a^x$:

$$f'(x) = \lim_{h \to 0} \frac{a^{x+h} - a^x}{h} = \lim_{h \to 0} \frac{a^x(a^h - 1)}{h} = a^x \lim_{h \to 0} \frac{a^h - 1}{h}.$$

On the other hand, using the Chain Rule gives the formula $f'(x) = a^x \ln a$. So you deduce that $L(a) = \lim_{h \to 0} \frac{a^h - 1}{h} = \ln a$. Now we haven't done any calculus yet, but we will use this as motivation to examine the numbers $(a^x - 1)/x$, where $a > 1$ and $x$ is small and positive. First note that

$$\frac{1 - a^{-x}}{x} = \frac{a^x - 1}{x a^x} = \frac{a^x - 1}{x} \cdot \frac{1}{a^x}.$$

Now, using the right-hand side of the Lipschitz condition for exponential functions (Proposition 2.3.19), with $a^y$ in place of $A$, $x/y$ in place of $y$ and $0$ in place of $x$, we get

$$(a^y)^{x/y} - 1 \le (a^y)^{x/y}(a^y - 1)(x/y)$$

or

$$a^x - 1 \le a^x(a^y - 1)(x/y). \qquad (*)$$

Therefore

$$\frac{1 - a^{-x}}{x} = \frac{1}{a^x} \cdot \frac{a^x - 1}{x} \le \frac{a^y - 1}{y}. \qquad (**)$$

Definition 2.4.1 For $a > 1$ let

$$\Lambda(a) = \left\{ \left[ \frac{1 - a^{-x}}{x}, \frac{a^x - 1}{x} \right] : x > 0 \right\}.$$

Proposition 2.4.2 $\Lambda(a)$ is a consistent and fine family.

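Proof. The consistency is an immediate consequence of inequality (**). To show that the family contains arbitrarily small intervals, note that

$$\ell\left( \left[ \frac{1 - a^{-x}}{x}, \frac{a^x - 1}{x} \right] \right) = \frac{a^x - 1}{x}\left( 1 - \frac{1}{a^x} \right) = \frac{(a^x - 1)^2}{x a^x}.$$

But inequality (*), with $y = 1$, tells us that $a^x - 1 \le x a^x(a - 1)$. Therefore

$$\frac{(a^x - 1)^2}{x a^x} \le \frac{x^2 (a^x)^2 (a - 1)^2}{x a^x} = x a^x (a - 1)^2$$

and $a^x \le a$ for $0 < x \le 1$. So we get

$$\ell\left( \left[ \frac{1 - a^{-x}}{x}, \frac{a^x - 1}{x} \right] \right) \le x a^x (a - 1)^2 \le x a (a - 1)^2.$$

Clearly, $x a (a - 1)^2$ will be less than $\varepsilon$ when $x < \dfrac{\varepsilon}{a(a - 1)^2}$.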
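To see the family of Definition 2.4.1 shrink in practice, one can tabulate a few of its intervals together with the length bound $x a (a - 1)^2$ from the proof. This is a floating-point illustration only; the function name `family_interval` is hypothetical.

```python
def family_interval(a, x):
    """Return the interval [(1 - a^-x)/x, (a^x - 1)/x] for a > 1 and x > 0."""
    return (1 - a ** (-x)) / x, (a ** x - 1) / x

a = 2.0
for x in (1.0, 0.1, 0.001):
    lo, hi = family_interval(a, x)
    bound = x * a * (a - 1) ** 2       # length bound from the proof, valid for 0 < x <= 1
    print(x, lo, hi, hi - lo, bound)
```

Every interval printed for $a = 2$ should contain the familiar value of $\ln 2$, which is the motivation for the definition that follows.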

Definition 2.4.3 Let $L(a) = \lim(\Lambda(a))$. (Think of $L(a)$ as $\ln(a)$.)

Remark 2.4.4 Note that, for all $x > 0$ and $a > 1$,

$$\frac{1 - a^{-x}}{x} \le L(a) \le \frac{a^x - 1}{x}.$$

Proposition 2.4.5 For all $\alpha > 0$, $L(a^\alpha) = \alpha L(a)$.

Proof. Let $J$ be any interval in $\Lambda(a^\alpha)$. Then

$$J = \left[ \frac{1 - (a^\alpha)^{-x}}{x}, \frac{(a^\alpha)^x - 1}{x} \right] = \left[ \alpha\,\frac{1 - a^{-y}}{y}, \alpha\,\frac{a^y - 1}{y} \right] = \alpha\left[ \frac{1 - a^{-y}}{y}, \frac{a^y - 1}{y} \right],$$

where $y = \alpha x$. Therefore, any interval $J$ in $\Lambda(a^\alpha)$ is of the form $\alpha I$, where $I$ is an interval in $\Lambda(a)$. So in the limit we have $\lim(\Lambda(a^\alpha)) = \lim(\alpha\Lambda(a)) = \alpha \lim(\Lambda(a))$, which implies that $L(a^\alpha) = \alpha L(a)$.

Proposition 2.4.6 For all $a, b > 1$, $a^{1/L(a)} = b^{1/L(b)}$.

Proof. Let $\alpha = \log_a b$. Then $b = a^\alpha$, and we have

$$b^{1/L(b)} = (a^\alpha)^{1/L(a^\alpha)} = (a^\alpha)^{1/(\alpha L(a))} = a^{1/L(a)}.$$

(Aside: Recall the basic property of logs, $\log_B z = \log_A z / \log_A B$. If we put $A = e = z$, we get $\log_B e = 1/\log_e B$. Since our motivation for this is the belief, soon to be verified, that $L(a) = \log_e a$, the result that $a^{1/L(a)}$ is the same for all $a$ just amounts to noting that $a^{1/L(a)} = a^{\log_a e} = e$.)

Definition 2.4.7 The Euler number is given by $e = 2^{1/L(2)}$.

Corollary 2.4.8 For all $a > 1$, $e = a^{1/L(a)}$.

Remark 2.4.9 As we have noted above, $\dfrac{1 - a^{-x}}{x} \le L(a) \le \dfrac{a^x - 1}{x}$. If we let $x = \frac{1}{2}$ and $a = 4$, we get

$$\frac{1 - \frac{1}{2}}{\frac{1}{2}} \le L(4) \le \frac{2 - 1}{\frac{1}{2}},$$

so $\frac{1}{2} \le \frac{1}{L(4)} \le 1$. The previous corollary gives $2 \le e \le 4$.
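The same two-sided bound on $L(a)$ brackets $e = a^{1/L(a)}$ for any $x > 0$, and the bracket tightens as $x$ shrinks. The following floating-point sketch reproduces the case $a = 4$, $x = 1/2$ from the remark and then a much smaller $x$; the function name `euler_bracket` is hypothetical.

```python
def euler_bracket(a, x):
    """Bracket e = a^(1/L(a)) using (1 - a^-x)/x <= L(a) <= (a^x - 1)/x, for a > 1, x > 0."""
    l_low = (1 - a ** (-x)) / x
    l_high = (a ** x - 1) / x
    # a > 1, so a^t increases with t: the larger bound on L(a) gives the smaller power.
    return a ** (1 / l_high), a ** (1 / l_low)

print(euler_bracket(4.0, 0.5))     # the case worked in the remark
print(euler_bracket(4.0, 1e-6))    # a much tighter bracket
```

This is only numerical evidence, of course; the text's bounds are what justify the bracket.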

Proposition 2.4.10 For all $a > 0$, $L(a) = \log_e a$.

Proof. $e = a^{1/L(a)} \implies e^{L(a)} = a \implies L(a) = \log_e a$.

Corollary 2.4.11 $L(e) = 1$.

Corollary 2.4.12 For all $x > 0$,

$$\frac{1 - e^{-x}}{x} \le 1 \le \frac{e^x - 1}{x}.$$

In particular, $e^x \ge 1 + x$.

Notation 2.4.13 From now on we will use $\ln a$ in place of $\log_e a$.

Although the definition of the power function $x^n$ for $n$ an integer preceded our study of exponential functions, we need the latter to give a more general definition of the former when $n$ is not an integer. We have defined the function $A^u$ for fixed $A > 0$ and any real $u$. So, for any $x > 0$ and any real $u$ we can define the number $x^u$ to be the value of $A^u$ when $A = x$. This allows us to piece together the function $x^u$. However, it is nicer to give a definition in terms of functions we have already defined. The key is to note that $e^{\ln x} = x$, since $e^x$ and $\ln x$ are inverse functions. Raising both sides to the power $u$, and using $(A^x)^y = A^{xy}$, we are led to the following definition.

Definition 2.4.14 (The Power Function) For $x > 0$ and real $u$, $x^u = e^{u \cdot \ln x}$.

Exercises

1. In a previous set of exercises from chapter 1 we defined a number called "ln 2". In this section we defined, for each $a$, a family $\Lambda(a)$, whose limit we proved was equal to $\ln(a)$ (where $\ln(x)$ is the inverse of the function $e^x$). Given the suggestiveness of the notation, it is reasonable to suppose that these real numbers are equal, i.e. "ln 2" $= \ln(2)$.

(a) Examine the families for "ln 2" and $\ln(2)$, and write down conditions that would prove "ln 2" $= \ln(2)$.

(b) Try to find some numerical evidence that these conditions do hold. Aside from this numerical evidence and the suggestive nature of the notation, is there any apparent reason for believing that these real numbers are equal? (As far as numerical evidence goes, note the following: $e^{\pi\sqrt{163}} = 262537412640768744.00000000000$ to 11 decimal places! But $e^{\pi\sqrt{163}} = 262537412640768743.99999999999925007259719818568865$ to 32 places!)

(c) Verifying these conditions directly seems to be a daunting task; do you have any ideas on how it might be done?

(d) The strong principle of real trichotomy says that, for reals $x$ and $y$, exactly one of the following conditions holds: $x < y$, $x = y$, $x > y$. What evidence is there that this is correct? Opposing this principle is the following: "Generally, to prove two reals equal (or unequal) requires a theorem." Discuss these principles in the light of "ln 2" and $\ln(2)$. We will be able to prove that "ln 2" and $\ln(2)$ are actually equal later, after we discuss power series.

2. We have derived the following Lipschitz conditions for exponential functions:

$$A^c\left( 1 - \frac{1}{A} \right)(y - x) \le A^y - A^x \le A^d(A - 1)(y - x),$$

where $c \le x \le y \le d$ (see Proposition 2.3.19).

(a) Suppose now that $0 < C \le X \le Y \le D$. Use the above inequality and the Inverse Function Theorem to find left and right Lipschitz conditions for the $\log_A$ function. (This issue was discussed at the end of Section 2.3.)

(b) When $A = e$, prove that we can obtain a better bound, namely

$$\frac{Y - X}{D} \le \ln(Y) - \ln(X) \le \frac{Y - X}{C}.$$

(Hint: Use the inequality $\frac{1 - e^{-x}}{x} \le 1 \le \frac{e^x - 1}{x}$ for all $x > 0$; see Corollary 2.4.12. The right-hand inequality gives $e^x \ge 1 + x$. Taking logs gives $x \ge \ln(1 + x)$. Since $Y - X > 0$, we can write $Y/X = 1 + (Y - X)/X$. Now let $x = (Y - X)/X$, so $\ln(Y/X) = \ln Y - \ln X \le \frac{Y - X}{X} \le \frac{Y - X}{C}$ since $C \le X$. Prove the other half using $\frac{1 - e^{-x}}{x} \le 1$ and choosing $x$ carefully.)

(c) Now deal with the general base $A$ by noting that $\log_A(U) = \frac{1}{\ln(A)}\ln(U)$ (obtained from $x = A^{\log_A(x)}$ by applying $\ln$ to both sides).

3. LIMITS, SEQUENCES AND SERIES

3.1 Sequences and Convergence

In section 1.8 we talked about limiting families and their limits, which are real numbers. In this section we talk about sequences of reals and their limits. This provides a parallel theory which, in a sense to be made precise, is equivalent to that of limiting families.

Intuitively, a sequence is just a list of real numbers; for example,

$$1, 2, 4, 8, \dots$$

$$1, 1/2, 1/4, 1/8, \dots$$

$$\sqrt{2}, \sqrt{\sqrt{2}}, \sqrt{\sqrt{\sqrt{2}}}, \dots$$

are all sequences. However, we want to be able to describe the various terms of the sequence precisely. For example, in the last example above, the terms represent a certain number of iterated square-roots of 2. Thus, given some whole number $k$, we want a notation for the $k$th term of the sequence. That is, we want to be able to make an association $k \to k$th term of the sequence. This suggests that a sequence should be a function which associates with each natural number $k$ some real number: the $k$th term of the sequence. Recall that the set of non-negative whole numbers is denoted by $\mathbb{N}$ ($\mathbb{N} = \{0, 1, 2, 3, \dots\}$).

Definition 3.1.1 A sequence $a$ of real numbers is a function $a : \mathbb{N} \to \mathbb{R}$.

Notation 3.1.2 It is customary to denote $a(k)$ by $a_k$ and the sequence itself by $(a_k)$. Also, sometimes $a_0$ is omitted, so that the first term of the sequence is $a_1$.

Definition 3.1.3 A sequence of real numbers $(a_k)$ is said to converge to the real number $L$ if, for any $\varepsilon > 0$ there is an integer $N_a(\varepsilon) \ge 0$ such that $|a_i - L| \le \varepsilon$ whenever $i \ge N_a(\varepsilon)$. $L$ is called the limit of the sequence, and we write $(a_k) \to L$, or $L = \lim_{k \to \infty}(a_k)$. $N_a(\varepsilon)$ is called a modulus of convergence for $(a_k)$.

Remark 3.1.4 In the above definition we have used weak inequality: $|a_k - L| \le \varepsilon$ whenever $k \ge N_a(\varepsilon)$. We might have phrased the definition using strong inequality, that is, $|a_k - L| < \varepsilon$ whenever $k \ge N_a(\varepsilon)$. This would lead to the notion of a strong

LIMITS, SEQUENCES AND SERIES

100

modulus of convergence. It is harder to be a strong modulus of convergence than a weak one, since $|a_k - L| < \varepsilon$ implies $|a_k - L| \le \varepsilon$. So any strong modulus of convergence is automatically a weak one. On the other hand, suppose $r$ is any real with $0 < r < 1$. If $N_a$ is a weak modulus of convergence for the sequence $(a_k)$, then $N'_a(\varepsilon) = N_a(r\varepsilon)$ is a strong modulus for $(a_k)$.

Here is a simple example. Let $a_k = 1/k$, and suppose $\varepsilon = 1/M$, where $M$ is some (large) whole number. Then $(a_k) \to 0$, and $N(\varepsilon) = M$ is a weak modulus of convergence, since $|a_M - 0| = 1/M = \varepsilon$. On the other hand, $N'(\varepsilon) = N(\varepsilon/2) = N\!\left(\frac{1}{2M}\right) = 2M$ is a strong modulus of convergence, since
\[ |a_{2M} - 0| = \frac{1}{2M} < \frac{1}{M} = \varepsilon. \]
Later we will define other moduli of convergence as well as moduli of continuity and differentiability, and this same issue will arise, so keep this remark in mind.

One other point: often, if we are dealing with just one sequence at a time, we will suppress the subscript $a$ and simply write $N(\varepsilon)$ for a modulus of convergence. This makes the notation less cumbersome.

It is important to note, right away, that a sequence may or may not have a limit, or it may be impossible to determine whether it does. Here are some examples.

Example 3.1.5 Let $a_k = 1 - 1/k$ for $k \ge 1$ (we have to start with $k = 1$ here, since $a_0$ is undefined). Some terms are $a_1 = 0$, $a_2 = 1/2$, $a_3 = 2/3$; we might simply write
\[ a = 0,\ 1/2,\ 2/3,\ 3/4,\ \ldots,\ (k-1)/k,\ \ldots \]
Clearly $(a_k) \to 1$ and the proof is easy: $|a_k - 1| = |-1/k| = 1/k$, which can be made less than $\varepsilon$ by making $k > 1/\varepsilon$.

Example 3.1.6 Let $b_k = (-1)^k$, so $b_0 = 1$, $b_1 = -1$, $b_2 = 1$, etc. Thus
\[ b = 1, -1, 1, -1, \ldots \]
Clearly there can be no limit here, since for any $L$
\[ |b_k - L| = \begin{cases} |1 - L| & \text{for } k \text{ even,} \\ |1 + L| & \text{for } k \text{ odd,} \end{cases} \]
and these numbers can't both be arbitrarily small, since $|1 - L|$ small means that $|1 + L|$ is close to 2.

Example 3.1.7 Let
\[ g_k = \begin{cases} 0 & \text{if } 2k \text{ is a sum of two primes,} \\ 1 & \text{if } 2k \text{ is not a sum of two primes.} \end{cases} \]
For every $k > 1$ that has ever been tested, $g_k = 0$. Yet we do not know if $(g_k)$ has a limit. If we could show that $(g_k)$ has a limit, then that limit must be 0 (why?). The famous Goldbach Conjecture says that $g_k = 0$ for all $k > 1$ (see page 67), but this has never been proved (and maybe never will be).
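To make Example 3.1.7 concrete, here is a minimal Python sketch that evaluates $g_k$ for small $k$ by brute force. The function names `is_prime` and `g` are illustrative choices, not notation from the text, and trial division is only practical for small inputs. A finite check like this says nothing about whether the limit exists; it only illustrates the definition.

```python
def is_prime(n: int) -> bool:
    """Trial-division primality test (adequate for small n only)."""
    if n < 2:
        return False
    d = 2
    while d * d <= n:
        if n % d == 0:
            return False
        d += 1
    return True


def g(k: int) -> int:
    """Return 0 if 2k is a sum of two primes, and 1 otherwise."""
    target = 2 * k
    # It suffices to try the smaller prime p with p <= k.
    for p in range(2, k + 1):
        if is_prime(p) and is_prime(target - p):
            return 0
    return 1


# First few terms g_1, g_2, ..., g_10.
print([g(k) for k in range(1, 11)])
```

Note that $g_1 = 1$, since 2 is not a sum of two primes; this is why the text restricts attention to $k > 1$.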

SEQUENCES AND CONVERGENCE


Proposition 3.1.8 $\lim(a_k)$, if it exists, is unique.

Proof. Suppose $L$ and $L'$ are both limits for $(a_k)$, and let $\varepsilon > 0$ be given. Choose $N$ so that $|a_k - L| < \varepsilon/2$ for $k \ge N$, and choose $N'$ so that $|a_k - L'| < \varepsilon/2$ for $k \ge N'$. Let $M = \max(N, N')$. Then
\[ |L - L'| = |L - a_M + a_M - L'| \le |L - a_M| + |a_M - L'| < \varepsilon/2 + \varepsilon/2 = \varepsilon. \]
Since reals that are arbitrarily close are equal (see Corollary 1.6.6), we have $L = L'$.

As with limiting families, the expected algebraic laws hold for limits of sequences, as stated in the following.

Proposition 3.1.9 Suppose $(a_k) \to L$ and $(b_k) \to M$. Then $(a_k + b_k) \to (L + M)$, $(a_k - b_k) \to (L - M)$, $(a_k b_k) \to LM$, $(a_k / b_k) \to L/M$ (provided $M \in \mathbb{R}^{\times}$, i.e. $M$ is a unit), and $|a_k| \to |L|$.

Proof. The product is the tricky case, so we prove that. As usual, we add and subtract:
\[ |a_k b_k - LM| = |a_k b_k - L b_k + L b_k - LM| \le |b_k|\,|a_k - L| + |L|\,|b_k - M|. \]
Clearly, the terms $|a_k - L|$ and $|b_k - M|$ can be made arbitrarily small; the only part we have to worry about is that the $|b_k|$ must not get arbitrarily large. But they can't, since $b_k$ is eventually within, say, 1 of $M$. Here are the details.

Suppose we are given $\varepsilon > 0$. Choose $N$ so that $|b_k - M| < \min\!\left(1, \frac{\varepsilon}{2|L|}\right)$ and $|a_k - L| < \frac{\varepsilon}{2(1+|M|)}$ when $k \ge N$. Note that $|b_k| = |b_k - M + M| \le |b_k - M| + |M|$. Then we have
\begin{align*}
|a_k b_k - LM| &\le |b_k|\,|a_k - L| + |L|\,|b_k - M| \\
&\le (|b_k - M| + |M|)\,|a_k - L| + |L|\,|b_k - M| \\
&< (1 + |M|)\left(\frac{\varepsilon}{2(1 + |M|)}\right) + |L|\left(\frac{\varepsilon}{2|L|}\right) = \frac{\varepsilon}{2} + \frac{\varepsilon}{2} = \varepsilon.
\end{align*}
The remaining parts are left as exercises. Note that there could be a problem with the sequence $(a_k / b_k)$ since some of the $b_k$ might be 0; however, once $k$ is sufficiently large, $b_k$ is very close to $M$ and $M \in \mathbb{R}^{\times}$; thus $b_k$ will be bounded away from 0. This is dealt with more carefully in the exercises.

Limits also behave reasonably well with respect to inequalities. An important principle is: weak inequalities are preserved in the limit. We show this in the following.

Proposition 3.1.10 (Weak inequalities are preserved in the limit) Suppose $(a_k) \to L$ and $(b_k) \to M$, with $a_k \le b_k$ for $k$ sufficiently large. Then $L \le M$.


Proof. Given $\varepsilon > 0$, choose $N$ large enough that $a_k \le b_k$, $L \le a_k + \varepsilon/2$, and $b_k \le M + \varepsilon/2$ when $k > N$. Then
\[ L \le a_k + \varepsilon/2 \le b_k + \varepsilon/2 \le M + \varepsilon. \]
Since $\varepsilon > 0$ was arbitrary, the Wiggle Lemma tells us that $L \le M$.

It is clear that if a sequence converges to a limit, then the terms of the sequence must be getting closer to each other because they are getting closer to the limit. One way of phrasing $L = \lim(a_k)$ is to say that for any $\varepsilon > 0$, the terms of $(a_k)$ eventually lie in the interval $[L - \varepsilon, L + \varepsilon]$ centered at $L$:

[Figure: the terms of $(a_k)$ plotted on a number line, together with an interval centered at $L$.]

Here we see that all of the terms after $a_6$ are within $\varepsilon$ of $L$.

It is reasonable to ask whether a sequence converges solely on the basis of the terms getting closer together, even if we don't happen to have the number $L$ on hand. That is, can we construct such an $L$ solely from the fact that the terms are getting closer together? The answer is yes, provided we have a strong enough notion of closeness. This was discovered by Augustin-Louis Cauchy (1789–1857).

Definition 3.1.11 The sequence $(a_k)$ is called a Cauchy sequence if, for any $\varepsilon > 0$, there is an integer $N_a(\varepsilon) > 0$ such that $|a_i - a_j| \le \varepsilon$ whenever $i$ and $j \ge N_a(\varepsilon)$. $N_a(\varepsilon)$ is called a modulus of convergence for the sequence $(a_k)$.

The inequality $|a_i - a_j| \le \varepsilon$ means $a_j \in [a_i - \varepsilon, a_i + \varepsilon]$, so once again we are looking at an interval of width $2\varepsilon$. This time we have no $L$ to center it around. The Cauchy criterion says that if we slide this interval around, we can eventually cover all but a finite number of terms of the sequence $(a_k)$. The diagram is very much like the previous one for the limit:

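[Figure: the terms of $(a_k)$ on a number line, with a sliding interval shown in three positions: one labeled "this works" and two labeled "not good enough".]

Example 3.1.12 Let $s_k = \sum_{i=1}^{k} \frac{1}{i^2}$. If $m \ge n > 1$, we can write
\begin{align*}
s_m - s_n &= \sum_{i=n+1}^{m} \frac{1}{i^2} \\
&\le \sum_{i=n+1}^{m} \frac{1}{i(i-1)} \\
&= \sum_{i=n+1}^{m} \left(\frac{1}{i-1} - \frac{1}{i}\right) \\
&= \frac{1}{n} - \frac{1}{m} \le \frac{1}{n},
\end{align*}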

so to make $|s_m - s_n| < \varepsilon$ we need only choose $N$ so that $N > 1/\varepsilon$; thus $(s_k)$ is a Cauchy sequence. I should remark that the limit of this sequence of partial sums has been known since the 18th century: it is $\frac{\pi^2}{6}$ (discovered by Euler). The proof is not easy, though (see [Kalman 1993]).

The sequences that have limits are exactly the Cauchy sequences; this is expressed in the following proposition and theorem.

Proposition 3.1.13 If $(a_k) \to L$, then $(a_k)$ is a Cauchy sequence.

The proof is left as an exercise.

The following important result is another way of stating the Completeness Theorem for the reals (perhaps, in fact, the most usual way). For more details, see the exercises. This is one of the most important results in classical real analysis; it is sometimes simply called The Completeness of the Reals.

Theorem 3.1.14 (Cauchy Completeness Theorem) Every Cauchy sequence converges to some real number. In other words, if $(a_k)$ is a Cauchy sequence, then there exists an $L$ such that $(a_k) \to L$.


Proof. Using $(a_k)$ we will define a consistent and fine family of real intervals and invoke the Completeness Theorem we've already proved to show the existence of $L$. For each integer $k > 0$, define the interval $I_k$ as follows. Let $N_k$ be an integer such that $|a_i - a_j| < 1/k$ whenever $i, j \ge N_k$. (Here we are using the assumption that $(a_k)$ is a Cauchy sequence.) Let $I_k = \left[a_{N_k} - \frac{1}{k},\ a_{N_k} + \frac{1}{k}\right]$ and $\mathcal{F} = \{I_k \mid k = 1, 2, \ldots\}$. We note that the length of $I_k$ is $\frac{2}{k}$, so the family $\mathcal{F}$ is clearly fine.

To prove consistency, suppose that we have the interval $I_k$ as above and another such interval $I_q = \left[a_{N_q} - \frac{1}{q},\ a_{N_q} + \frac{1}{q}\right]$. Either $N_k \le N_q$ or $N_q \le N_k$; suppose that $N_k \le N_q$. Then $|a_{N_q} - a_{N_k}| < 1/k$, or
\[ a_{N_q} - 1/k < a_{N_k} < a_{N_q} + 1/k. \]
The left-hand inequality gives $a_{N_q} - 1/q < a_{N_q} < a_{N_k} + 1/k$, while the right-hand inequality gives $a_{N_k} - 1/k < a_{N_q} < a_{N_q} + 1/q$. Thus $a_{N_q}$ lies in both intervals, so $I_k$ and $I_q$ intersect. The case $N_q \le N_k$ is proved the same way.

Since the family $\mathcal{F}$ is fine and consistent, there is a unique real number $L$ common to all of its intervals. This $L$ should work because it is arbitrarily close to each of the $a_{N_k}$, and the $a_{N_k}$ are arbitrarily close to all $a_i$ for $i$ and $k$ big enough. All that's left is an application of the triangle inequality to formalize this. So, suppose we are given $\varepsilon > 0$. Choose $k$ so large that $1/k < \varepsilon/2$, and suppose that $i \ge N_k$. Then
\[ |a_i - L| \le |a_i - a_{N_k}| + |a_{N_k} - L| < 1/k + 1/k \le \varepsilon. \]

Not only is Cauchy convergence equivalent to convergence, but any modulus for Cauchy convergence is a modulus of convergence. We give a separate proof of this fact, since it is an instructive example of the use of the Wiggle Lemma and the triangle inequality.

Proposition 3.1.15 Suppose $(a_k)$ is Cauchy convergent with modulus $N(\varepsilon)$. Then $(a_k)$ is convergent with modulus of convergence $N(\varepsilon)$.

Proof. We know that $(a_k) \to L$ for some $L$, so suppose $n \ge N = N(\varepsilon)$. Then for any $m \ge N$,
\[ |a_n - L| \le |a_n - a_m| + |a_m - L| \le \varepsilon + |a_m - L|. \]
But $(a_k) \to L$, so we can make $|a_m - L|$ arbitrarily small by making $m$ large enough (without changing $n$ or $N$); thus $|a_n - L| \le \varepsilon$ by the Wiggle Lemma.

A modulus of convergence for $(a_n) \to L$, however, may not be good enough to establish Cauchy convergence if the sequence oscillates around $L$. Try to find an example illustrating this.

Finally, we deal with criteria for inequalities of limits. We saw above (Proposition 3.1.10) that if $a_k \le b_k$ for $k$ sufficiently large, then $\lim(a_k) \le \lim(b_k)$. The converse is not true, however: $1/k$ is never less than or equal to $-1/k$ when $k$ is a positive integer, yet $\lim(1/k) \le \lim(-1/k)$. Furthermore, we haven't proved anything yet about strict inequality. We now deal with these issues. In what follows, $N_a(\varepsilon)$ represents a modulus of convergence for a Cauchy sequence $(a_i)$. We assume $(a_i) \to L_a$ and $(b_i) \to L_b$.


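Proposition 3.1.16 The following are equivalent.
1. There exist rationals $\varepsilon > 0$ and $\delta > 0$ such that $b_j > a_i + \varepsilon + \delta$ for $i \ge N_a(\varepsilon)$ and $j \ge N_b(\delta)$.
2. $L_b > L_a$.

Proposition 3.1.17 Let $(a_i)$ and $(b_i)$ be Cauchy sequences. The following statements are equivalent.
1. For any rationals $\varepsilon > 0$ and $\delta > 0$, $a_i - \varepsilon \le b_j + \delta$ for $i \ge N_a(\varepsilon)$ and $j \ge N_b(\delta)$.
2. For any rational $\varepsilon > 0$ there is an $M(\varepsilon)$ such that $a_i \le b_j + \varepsilon$ for $i, j \ge M(\varepsilon)$.
3. $L_a \le L_b$.

The proofs are exercises.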

Exercises 1. Discuss the convergence of the following. (a) The sequence of reciprocals uQ = 1@Q , Q = 1> 2> = = = Q +1 Q = 2Q

(b) The sequence dQ = (c) The sequence yQ

(d) The sequence vQ = sin(Q @10). 2. Let S ({) and T({) be polynomials with deg(S ) ? deg(T), let gq =

S (q) and T(q)

T(q) . What can you say about the convergence of the sequences S (q) (gq ) and (Gq ) ?

let Gq =

3. Prove that for any positive number $a$, $\lim_{n\to\infty}(a^{1/n}) = 1$. (Hint: One method is to use Lipschitz conditions on $a^x$. Another is to separate into cases: $a > 1$ and $a < 1$. Suppose $a > 1$; then $a^{1/n} > 1$ (why?), so write $a^{1/n} = 1 + e_n$ and show that $\lim_{n\to\infty} e_n = 0$ by noting that $(1 + t)^k \ge 1 + kt$. Deal with $a < 1$ by noting that $1/a > 1$.)

4. Suppose $(s_N)$ is a sequence. Form a new sequence $(t_N)$ by throwing away the first 1000 terms of $(s_N)$ and renumbering so as to start from 0 again. Discuss in a paragraph or so the comparative convergence properties of $(s_N)$ and $(t_N)$. (See the next exercise for related ideas.)


5. (Subsequences) Let u : N $N be an increasing function on the natural numbers; i.e. if p ? q then i (p) i (q). If (dn ) is a sequence, then, if we write en = du(n) , we call (en ) a subsequence of (dn ). For example, suppose dn = 1@n so that (dn ) consists of the reciprocals of the (positive) integers; let u(q) = 2q 1. Then (en ) consists of the reciprocals of the odd (positive) integers. On the other hand, if u(q) = q + 5, then (en ) consists of the reciprocals of the integers starting at 6. This subsequence is called a shift of (dn ) (see the previous exercise). (a) Prove that if (dn ) $ O and (en ) is a subsequence, then (en ) $ O. Suppose that we know that (dn ) converges and that (en ) $ O. Does that mean that (dn ) $ O? (b) Suppose we know that the subsequence (en ) of (dn ) converges. Does that mean that (dn ) converges? (Give a proof or a counterexample.) (c) Suppose (dn ) $ D and the sequence (d0n ) is obtained from (dn ) by changing a nite number of terms of (dn ). Prove the (d0n ) $ D. (d) Suppose that (dn ) $ D and D is a unit, say with D A 0. Prove that there is some integer N such that dn A 0 whenever n N. Thus, there is a shift of (dn ) which converges to D, and all of whose terms are positive. 6. Suppose the sequence (dq ) converges to 0. Discuss the convergence of (dnq ). 7. Prove those parts of Proposition 3.1.9 on the algebra of convergence not proved in the text. 8. In the problems for the section on limiting families you were asked to prove several propositions about limiting families that occur in the text (see page 66). Here are the same propositions rephrased in terms of sequences (assume the limits exist). (a) Proposition: Suppose that there are constants V> W such that V dn W for all n. Then O lim(dn ) X= (b) Proposition: Suppose that dn en for all n. Then lim(dn ) lim(en ) (“weak inequalities are preserved in the limit”). (c) Proposition: Suppose that (dn ) $ D and (en ) $ E. 
Then • (dn + en ) $ D + E, (dn en ) $ D E • (dn en ) $ DE • if E is a unit, (dn @en ) $ D@E after suitable renumbering so that dn @en makes sense (see Exercises 4 and 5 above).

Prove these propositions. Note that, in the second proposition, the strict inequality ? for the terms of the sequence gets replaced by the weak inequality in the limits. Show by example that even if the terms of one sequence are strictly less than the terms of another sequence, their limits may be equal. ³p ´ p dn $ D. Generalize 9. Suppose (dn ) $ D and dn A 0 for all n. Show that this.


10. Use the n3 + 6n¶2 + 11n + 5 = (n + 1)(n + 2)(n + 3) 1 to nd 3fact that 2 n + 6n + 11n + 5 . lim n$4 (n + 3)! 11. Prove that if (dn ) $ O, then (dn ) is a Cauchy sequence. Show by example that a modulus of convergence Q () may not itself work to establish Cauchy convergence (see Proposition 3.1.15). 12. Prove that a Cauchy sequence is always bounded (i.e. there is a number E such that |dn | ? E for all n). ¯ ¯ ¯ dq+1 ¯ ¯ ¯ = u for the sequence (dq ). Show that 13. Suppose that lim ¯ q$4 dq ¯ (a) If u ? 1 then lim (dq ) = 0. q$4

(b) If u A 1 then lim (dq ) = 4. q$4

14. Guess the limit of this sequence and prove (fairly hard) that it converges: dq =

q+1 2q+1

2q 2 22 + + ··· + 1 2 q

¶

.

15. A Fibonacci-type sequence is dened recursively by i1 = d, i2 = e and iq = iq1 + iq2 for q A 2. (a) Show that such a sequence cannot be a Cauchy sequence unless d = e = 0. (b) Modify the denition so that iq = 12 (iq1 + iq2 ). Prove that this new sequence is always a Cauchy sequence. (Hint: Prove that it converges by writing the rst few terms as d> d + (e d)> 12 (2d + (e d)) = d + 1 1 1 2 (e d)> d + 2 (e d) + 4 (e d), etc.; you might want to prove this by induction.) (c) There are a number of variations and generalizations of this averaging idea; for example, prove that if lim (dq ) = d, then q$4

lim

q$4

d1 + d + · · · + dq q

¶

= d=

(d) Here’s another: Let d1 > d2 > = = = > ds be positive numbers. Find lim

q$4

Ãs q

dq1 + dq2 + · · · + dqs s

!

.

(Hint: Let d = max(d1 > d2 > = = = > ds ) and try to nd lower and upper bounds s for the qwk root; also remember that lim q s = 1.) q$4


16. (Project) We have constructed the reals from the rationals by dening a real number to be a ne and consistent family of rational intervals. There are several other ways of constructing the reals; the one that is closest to the approach here is via Cauchy sequences. Here is an outline of some denitions and things to be proved. (a) Dene a real number to be a Cauchy sequence (ul ) of rational numbers. (b) Dene (dl ) to be a null sequence if, for any A 0 there is an Q such that l A Q =, |dl | ? . (c) Dene (dl ) = (el ) (as real numbers) to mean (dl el ) is a null sequence.

(d) Dene addition, subtraction, and multiplication of reals by: (dl ) + (el ) = (dl + el ); (dl ) (el ) = (dl el ); and (dl )(el ) = (dl el ). (e) Prove that these operations on equal reals yield equal reals.

(f) Dene (dl ) ? (el ) to mean that there exist rationals A 0 and A 0 such that em A dl + + for l Qd () and m Qe () (see Proposition 3.1.16).

(g) Dene (dl ) (el ) to mean that (el ) ? (dl ) is false. Prove that (dl ) (el ) if and only if for any rational A 0 there is an P () such that dl el + , for l> m P () (see Proposition 3.1.10).

(h) Dene division of reals, absolute value, max, min, etc.

(i) Prove some of the results from the text for real numbers using these new denitions. (j) Dene what a Cauchy sequence of reals would be using these denitions. Prove a Completeness Theorem which says that any Cauchy sequence of reals (i.e. a Cauchy sequence of Cauchy sequences of rationals) has a limit (a Cauchy sequence of rationals).

3.2

Limits of Functions

If a sequence is a function $s : \mathbb{N} \to \mathbb{R}$ and we can define the limit of such a function, we might ask about generalizing this notion for functions $f : \mathbb{R} \to \mathbb{R}$. In the case of sequences, we are looking at what happens to $s(k)$ when $k$ gets very large. By analogy, we can make the following definition for $f$.

Definition 3.2.1 $\lim_{x\to\infty} f(x) = L$ ($L$ a real number) means that, given any $\varepsilon > 0$, there is an integer $M \ge 0$ such that $|f(x) - L| < \varepsilon$ for all $x > M$.

Definition 3.2.2 $\lim_{x\to\infty} f(x) = \infty$ (respectively, $\lim_{x\to\infty} f(x) = -\infty$) means that, given an integer $N > 0$, there is an integer $M > 0$ such that $f(x) > N$ (respectively $f(x) < -N$) for all $x > M$.

(We can make similar definitions for $\lim_{x\to-\infty} f(x)$; these are left as exercises.)

LIMITS OF FUNCTIONS


Example 3.2.3 $\lim_{x\to\infty} x^3 = \infty$ and $\lim_{x\to\infty} x^{-3} = 0$. These are easily proved using the Lipschitz condition on $k$th powers (Corollary 2.2.6).

Remark 3.2.4 It may happen that for some function $f : \mathbb{R} \to \mathbb{R}$, the sequence $(f(n))$ has a limit but $\lim_{x\to\infty} f(x)$ does not exist (see the exercises).

Unlike the situation for sequences, however, we have much more flexibility in taking limits of functions from $\mathbb{R}$ to $\mathbb{R}$.

Definition 3.2.5 Let $a$ and $L$ be real numbers. We say that the limit of $f(x)$ as $x$ approaches $a$ is $L$ if for any $\varepsilon > 0$ there is a $\delta > 0$ such that $|f(x) - L| < \varepsilon$ whenever $0 < |x - a| < \delta$. In this case we write $\lim_{x\to a} f(x) = L$.

Remark 3.2.6 Note that this definition says nothing about what happens when $x = a$; in fact, $a$ may not even be in the domain of $f$ (i.e. $f(a)$ may not even be defined). The limit of $f$ at $a$ depends only on the values $f(x)$ for $x$ near $a$ (see exercises). Here is a diagram that shows the conditions for $\lim_{x\to a} f(x)$ to be $L$:

[Figure: the input $x$ near $a$ and the output $f(x)$ near $L$, with the caption "We can make these close ($\varepsilon$) by making these close enough ($\delta$)."]

Example 3.2.7 $\lim_{x\to 2} x^3 = 8$. To see this, one can manipulate Lipschitz conditions, but it is perhaps simpler to write
\[ |x^3 - 8| = |x - 2|\,|x^2 + 2x + 4|. \]
Clearly, we can make the $|x - 2|$ term small by making $x$ close to 2, but we do have to make sure that the $|x^2 + 2x + 4|$ term is bounded. However, if we make $x$ within, say, 1 of 2, then $x$ won't exceed 3, so $|x^2 + 2x + 4| < 19$. Thus, choose $\delta < \min(\varepsilon/19, 1)$, so that
\[ |x^3 - 8| = |x - 2|\,|x^2 + 2x + 4| < (\varepsilon/19)(19) = \varepsilon. \]
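The choice of $\delta$ in Example 3.2.7 can be spot-checked numerically. The following Python sketch is only an illustration in floating point, not a proof; the helper names `delta_for` and `check` and the particular choice of half the minimum are assumptions made for the example.

```python
def delta_for(eps: float) -> float:
    """Return a delta strictly below min(eps/19, 1), as in Example 3.2.7."""
    return 0.5 * min(eps / 19.0, 1.0)


def check(eps: float, samples: int = 1000) -> bool:
    """Sample points x with 0 < |x - 2| < delta and test |x^3 - 8| < eps."""
    d = delta_for(eps)
    for i in range(1, samples + 1):
        h = d * i / (samples + 1)  # guarantees 0 < h < d
        for x in (2.0 - h, 2.0 + h):
            if not abs(x ** 3 - 8.0) < eps:
                return False
    return True


for eps in (10.0, 1.0, 0.1, 1e-3):
    print(eps, delta_for(eps), check(eps))
```

Any positive value below $\min(\varepsilon/19, 1)$ would serve equally well as $\delta$; the factor one half is just one convenient way to stay strictly below the bound.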


We now can state the usual facts about the algebra of limits; the proofs are straightforward and use ideas we have encountered before.

Proposition 3.2.8 (Properties of finite limits) Suppose that $\lim_{x\to a} f(x) = L$ and $\lim_{x\to a} g(x) = M$. Then $\lim_{x\to a} (f(x) + g(x)) = L + M$, $\lim_{x\to a} (f(x) - g(x)) = L - M$, and $\lim_{x\to a} f(x)g(x) = LM$. If $M > 0$ or $M < 0$, then $\lim_{x\to a} f(x)/g(x) = L/M$.

(The tricky one is the product; for a model, use the proof given when we were looking at convergence: see Proposition 3.1.9.) The formulation of similar results for infinite limits is left to the exercises.

Once again, we have the result that weak inequalities are preserved in the limit:

Proposition 3.2.9 (Weak inequalities preserved) Suppose that $f(x) \le g(x)$ for $x \in (a, b)$, $c \in (a, b)$ and $\lim_{x\to c} f(x) = L$, $\lim_{x\to c} g(x) = M$. Then $L \le M$.

Proof. For any $\varepsilon > 0$, choose $\delta$ such that $f(x)$ and $g(x)$ are within $\varepsilon/2$ of $L$ and $M$, respectively, when $|x - c| < \delta$. Then
\[ L \le f(x) + \varepsilon/2 \le g(x) + \varepsilon/2 \le (M + \varepsilon/2) + \varepsilon/2 = M + \varepsilon. \]
Since $\varepsilon$ was arbitrary, $L \le M$.

This, by the way, is a beautiful example of the usefulness of the Wiggle Lemma. We know that $f(x) \le g(x)$, but $g(x)$ is merely close to $M$; in fact, it may be always larger than $M$. The Wiggle Lemma gives us the wiggle room to deal with this provided the amount that $g(x)$ exceeds $M$ is arbitrarily small, which is exactly what we know.

We now state some results about log and exponential functions in the language of limits. The proofs are elementary but a bit technical, so we have placed them in an appendix.

Proposition 3.2.10 If $a > 0$, then $a^x \to 1$ as $x \to 0$. (Equivalently, $a^{1/x} \to 1$ as $x \to \infty$.)

(This follows directly from Proposition 2.3.19, the Lipschitz condition for exponential functions.)

Proposition 3.2.11 For $n \in \mathbb{N}$, $(1 + 1/n)^n \to e$ as $n \to \infty$.

Proposition 3.2.12 For all $x > 0$, $x/(1 + x) \le \log(1 + x) \le x$.

Proposition 3.2.13 Given $n \in \mathbb{N}$, $t > 0$ and $k \ge 0$,
\[ \frac{(1+t)^n}{n^k} \to \infty \quad \text{as } n \to \infty. \]
(The case where $k = 1$ can be proved simply using the binomial theorem; here's how it goes: $(1 + t)^n \ge 1 + \frac{n(n-1)}{2} t^2$, so
\[ \frac{(1 + t)^n}{n} \ge \frac{1 + \frac{n(n-1)}{2} t^2}{n} = \frac{1}{n} + \frac{(n-1)}{2} t^2 \to \infty \quad \text{as } n \to \infty. \]
This idea can be extended to whole-number powers $k \ge 0$, and thence to reals $k \ge 0$.)
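The binomial lower bound used for the case $k = 1$ of Proposition 3.2.13 can be checked exactly with rational arithmetic. This is a small Python sketch under the assumption that $t$ is rational; the function names `ratio` and `lower_bound` are illustrative only.

```python
from fractions import Fraction


def ratio(t: Fraction, n: int) -> Fraction:
    """Exact value of (1 + t)^n / n."""
    return (1 + t) ** n / n


def lower_bound(t: Fraction, n: int) -> Fraction:
    """The bound 1/n + ((n - 1)/2) t^2 obtained from the binomial theorem."""
    return Fraction(1, n) + Fraction(n - 1, 2) * t * t


t = Fraction(1, 10)
for n in (2, 10, 100, 1000):
    # The ratio dominates a bound that grows linearly in n.
    print(n, float(lower_bound(t, n)), float(ratio(t, n)))
```

Since the lower bound grows without limit as $n$ increases, so does the ratio, which is the content of the $k = 1$ case.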


Corollary 3.2.14 For $x$ with $|x| < 1$ and any $k \ge 0$, $n^k x^n \to 0$.

Corollary 3.2.15 For any $k \ge 0$, $\dfrac{x^k}{e^x} \to 0$ as $x \to \infty$.

Corollary 3.2.16 If $n > 0$ then $\dfrac{\log x}{x^n} \to 0$ as $x \to \infty$, $x > 0$.

Proposition 3.2.17 $n^{1/n} \to 1$ as $n \to \infty$.

Corollary 3.2.18 For any polynomial $P(n) = a_0 + a_1 n + \cdots + a_k n^k$, with $a_k > 0$, $(P(n))^{1/n} \to 1$ as $n \to \infty$.

For proofs of these last few results, see the appendix at the end of this section.

Exercises 1. Let d A 0 be a real number. (a) Prove that lim {3 = d3 . {$d

(b) Prove that lim {2 = d2 . {$d

(c) Prove that lim {1@2 = d1@2 . {$d

(d) Generalize these by proving that lim {w = dw . {$d

2. Prove that if D A 1, then lim{$4 D{ = 0 (Hint: let D = 1 + w and use (1 + w)n 1 + nw, or use Lipschitz conditions.) Prove lim{$4 logD { = 4. 3. Let i ({) = dened.

{2 4 {4 .

Find lim{$2 i ({) (and prove it), even though i (2) is not

4. Find a function V with the property that the sequence (V(n))> n = 0> 1> === has a limit but lim{$4 V({) is undened. (Hint: Choose, for example, V so that V (0) = V(1) = V(2) = V(3) = · · · = n, while V( 12 ) = V( 32 ) = V( 52 ) = · · · = n.) Ã

5. (Tricky) Prove that the sequence

(Hint: Start with the inequality Deduce that 1@{

(1 + {)

1 1+ n

h{ 1 {

¶n !

1

h 1+

converges to h. h{ 1 {

{ 1{

¶1@{

from Corollary 2.4.12. ,

using 0 ? { ? 1 for the right-hand inequality. Choosing¡ rst {¢ = 1@q and then q { = 1@(q + 1) and doing some algebra yields h qh 1 + q1 h.)


³ { ´q 6. Suppose { A 0. Prove that lim 1 + = h{ (see exercise 5). What if we q$4 q don’t know that { A 0? 7. (Project) The algebra of limits involving forms lim{$4 k({) or lim{$4 k({), or cases where lim{$d k({) = ±4 needs to be dealt with separately. Formulate and prove results in these cases. 8. Prove that lim {q = 0 when |{| ? 1 (see Corollary 3.2.14). q$4

9. Suppose that d A 1, A 0, and is any real. Prove the following (you may use any results from this section). d{ { = 4 (equivalently, lim = 0). {$4 { {$4 d{ logd { (b) lim = 0. {$4 { { (c) lim = 1. (a) lim

{$0

(d) lim ln(1 + {) = 0. {$0

3.3

Series of Numbers

Perhaps the most common source of sequences in mathematics is the summing of numbers, and this is most interesting when there are infinitely many numbers to sum. One of the earliest examples is Zeno's "Race Course Paradox."[1] In order for a runner to complete a race, he must run half the prescribed distance, then half the remaining distance, then half of the now remaining distance, etc. This amounts to the runner covering the following fractions of the course:
\[ 1/2 + 1/4 + 1/8 + 1/16 + \cdots \]
Zeno (and many who followed him, up through modern times) was concerned about the realization of infinitely many tasks, in this case performance of the infinitely many legs of the journey. Since we can't literally add infinitely many numbers, we look at the sequence of partial sums obtained when we add increasingly large but finite numbers of terms. In this section, we will try to find criteria for the convergence of this type of sequence.

We begin by making our notion of series (an infinite sum) mathematically precise. Let $(s_k)$ be a sequence of numbers for, say, $k = 0, 1, 2, \ldots$

Definition 3.3.1 The partial sums $S_n$ of this sequence are defined by
\begin{align*}
S_0 &= s_0, \\
S_{k+1} &= S_k + s_{k+1}.
\end{align*}
We also write $S_n = \sum_{k=0}^{n} s_k = s_0 + s_1 + \cdots + s_n$.

[1] Zeno of Elea, circa 5th century B.C.E.

Definition 3.3.2 If the sequence of partial sums $(S_n)$ has a limit, i.e. $(S_n) \to L$, then we say that the series $\sum_{k=0}^{\infty} s_k$ converges to $L$, and we write
\[ \sum_{k=0}^{\infty} s_k = \lim_{n\to\infty} S_n = L. \]
If there is no limit we say the series fails to converge; if the series fails to converge because $\lim_{n\to\infty} S_n = \infty$ or $-\infty$, then we say the series diverges.

As you might expect, simple combinations of convergent series converge. We state this and leave its proof, a straightforward application of the definition, as an exercise.

Proposition 3.3.3 Suppose that $\sum_{k=0}^{\infty} s_k = S$ and $\sum_{k=0}^{\infty} t_k = T$. Then
\[ \sum_{k=0}^{\infty} (\alpha s_k + \beta t_k) = \alpha S + \beta T \]
for any constants $\alpha$ and $\beta$.

(The situation for the product $\left(\sum_{k=0}^{\infty} s_k\right)\left(\sum_{k=0}^{\infty} t_k\right)$ is more complicated: see the second appendix to this chapter.)

The use of the word "diverge" implies that the partial sums are becoming unboundedly positive or unboundedly negative. This may or may not account for the failure to converge. For example, consider the series
\[ 1 - 1 + 1 - \cdots \pm 1 \cdots = \sum_{k=0}^{\infty} (-1)^k. \]

The partial sums here are $1, 0, 1, 0, \ldots$, so there is no limit, yet they are not unbounded. Some texts call this divergence but that seems misleading; we will say, simply, that this series fails to converge. Here are some more examples.

Example 3.3.4 $\sum_{k=1}^{\infty} (-1)^{k+1} k = 1 - 2 + 3 - 4 + \cdots$. The partial sums are
\[ 1, -1, 2, -2, 3, -3, 4, -4, \ldots \]
They alternate between $N$ and $-N$, so they don't approach either $\infty$ or $-\infty$; we should say that this series fails to converge.

Example 3.3.5 $S = \sum_{k=1}^{\infty} k = 1 + 2 + 3 + \cdots$, and $T = \sum_{k=1}^{\infty} \ln(1/e^k) = -1 - 2 - 3 - \cdots$. These series diverge.
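The partial sums in the examples above are easy to tabulate. Here is a short Python sketch that does so; the helper name `partial_sums` and its arguments are illustrative assumptions, not notation from the text.

```python
from itertools import accumulate, count, islice


def partial_sums(term, start, how_many):
    """Return the first `how_many` partial sums of sum_{k >= start} term(k)."""
    terms = (term(k) for k in count(start))
    return list(islice(accumulate(terms), how_many))


# 1 - 1 + 1 - ... : bounded partial sums with no limit.
print(partial_sums(lambda k: (-1) ** k, 0, 6))            # [1, 0, 1, 0, 1, 0]

# Example 3.3.4: 1 - 2 + 3 - 4 + ... : unbounded in both directions.
print(partial_sums(lambda k: (-1) ** (k + 1) * k, 1, 8))  # [1, -1, 2, -2, 3, -3, 4, -4]

# Example 3.3.5: 1 + 2 + 3 + ... : the partial sums grow without bound.
print(partial_sums(lambda k: k, 1, 5))                    # [1, 3, 6, 10, 15]
```

Only the third series diverges in the sense of Definition 3.3.2; the first two merely fail to converge.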


Example 3.3.6 (Zeno's series) $\sum_{k=1}^{\infty} \frac{1}{2^k} = \frac12 + \frac14 + \frac18 + \cdots$, where the partial sums are
\[ \frac12,\ \frac34,\ \frac78,\ \ldots,\ \frac{2^n - 1}{2^n},\ \ldots \]
This is an example of a geometric series, which we will soon discuss at length. It clearly converges to 1. To prove it formally, first prove inductively that $S_n = \frac{2^n - 1}{2^n} = 1 - \frac{1}{2^n}$, so that $|S_n - 1| = \frac{1}{2^n}$. We can clearly make this less than $\varepsilon$ by making $n$ large enough.

Example 3.3.7 $\sum_{i=1}^{\infty} \frac{1}{i(i+1)} = \frac12 + \frac16 + \frac1{12} + \frac1{20} + \cdots$. The trick to finding this sum is to note that $\frac{1}{i(i+1)} = \frac1i - \frac{1}{i+1}$, so that
\begin{align*}
\sum_{i=1}^{n} \frac{1}{i(i+1)} &= \sum_{i=1}^{n} \left(\frac1i - \frac{1}{i+1}\right) \\
&= \left(1 - \tfrac12\right) + \left(\tfrac12 - \tfrac13\right) + \left(\tfrac13 - \tfrac14\right) + \cdots + \left(\tfrac1n - \tfrac{1}{n+1}\right) \\
&= 1 + \left(-\tfrac12 + \tfrac12\right) + \left(-\tfrac13 + \tfrac13\right) + \left(-\tfrac14 + \tfrac14\right) + \cdots + \left(-\tfrac1n + \tfrac1n\right) - \tfrac{1}{n+1} \\
&= 1 - \frac{1}{n+1},
\end{align*}
which approaches 1 as $n \to \infty$.

As the last example shows, you may need a trick or some other algebraic manipulation to get at the convergence of a series. Sometimes there is no simple expression for the limit of a convergent series; in Example 3.1.12 we saw that $\sum_{k=1}^{\infty} \frac{1}{k^2}$ converges (we'll prove it again later in this section), but it is not at all easy to see that its limit is $\frac{\pi^2}{6}$. No one knows a similar expression for the limit of the related convergent series $\sum_{k=1}^{\infty} \frac{1}{k^3}$.

General formulas for the sums of large classes of series are few, but here is the most basic of them all.

Proposition 3.3.8 The $n$th partial sum of the geometric series $S_n = a + ar + ar^2 + \cdots + ar^n$ is given by
\[ a + ar + ar^2 + \cdots + ar^n = \sum_{k=0}^{n} ar^k = \frac{a(1 - r^{n+1})}{1 - r}. \]

Proof. By induction on $n$, or compute $S_n - rS_n$ and note the cancellations.

Corollary 3.3.9 The infinite geometric series $\sum_{k=0}^{\infty} ar^k = a + ar + ar^2 + \cdots + ar^n + \cdots$ converges to $\frac{a}{1-r}$, provided $|r| < 1$.

Proof.
\[ \left|\frac{a(1 - r^{n+1})}{1 - r} - \frac{a}{1 - r}\right| = \left|\frac{a r^{n+1}}{1 - r}\right| = \left|\frac{a}{1 - r}\right| \cdot |r|^{n+1} \to 0 \]
as $n \to \infty$ when $|r| < 1$ (see Corollary 3.2.14).

Example 3.3.10 The infinite series $\sum_{k=0}^{\infty} \frac{(-1)^k}{2^k} = 1 - \frac12 + \frac14 - \frac18 + \cdots$ converges to $\dfrac{1}{1 + \frac12} = \dfrac23$.
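The closed form of Proposition 3.3.8 and the error estimate in the proof of Corollary 3.3.9 can be verified exactly on Example 3.3.10. The Python sketch below uses rational arithmetic; the function names `direct_sum` and `closed_form` are illustrative assumptions, and the formula requires $r \ne 1$.

```python
from fractions import Fraction


def direct_sum(a: Fraction, r: Fraction, n: int) -> Fraction:
    """Add a + a r + ... + a r^n term by term."""
    return sum(a * r ** k for k in range(n + 1))


def closed_form(a: Fraction, r: Fraction, n: int) -> Fraction:
    """The formula a (1 - r^(n+1)) / (1 - r), valid for r != 1."""
    return a * (1 - r ** (n + 1)) / (1 - r)


a, r = Fraction(1), Fraction(-1, 2)   # Example 3.3.10
limit = a / (1 - r)                   # equals 2/3

for n in (0, 1, 2, 5, 10):
    s = direct_sum(a, r, n)
    # Distance to the limit is exactly |a / (1 - r)| * |r|^(n+1).
    print(n, s, abs(s - limit))
```

Because the computation is done with fractions, the agreement between the two expressions is exact rather than approximate.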


The convergence or non-convergence of a series does not depend on what happens over any finite stretch; it's the adding of infinitely many terms that matters. To make this precise, we note that for any $n \ge 0$ an infinite series $\sum_{k=0}^{\infty} s_k$ can be split into a partial sum $S_n$ and a tail:
\begin{align*}
\sum_{k=0}^{\infty} s_k &= \sum_{k=0}^{n} s_k + \sum_{k=n+1}^{\infty} s_k \\
&= \underbrace{\sum_{k=0}^{n} s_k}_{S_n} + \underbrace{\sum_{k=0}^{\infty} s_{n+1+k}}_{\text{tail}}
\end{align*}
The convergence of the original series depends only on the convergence of the tail, as we now see.

Proposition 3.3.11 (Tail Convergence) For any $n \ge 0$,
1. $\sum_{k=0}^{\infty} s_k$ converges if and only if $\sum_{k=0}^{\infty} s_{n+1+k}$ converges.
2. When $\sum_{k=0}^{\infty} s_k$ converges, $\sum_{k=0}^{\infty} s_k = S_n + \sum_{k=0}^{\infty} s_{n+1+k}$.

The proof is a straightforward application of the definition of series convergence and will be left as an exercise.

For a general series, one often verifies convergence by applying the Cauchy criterion (3.1.11) to the partial sums. It is useful to note, then, that for $m \le n$,
\[ S_n - S_m = \sum_{k=0}^{n} s_k - \sum_{k=0}^{m} s_k = \sum_{k=m+1}^{n} s_k, \]
where the right-hand sum is taken to be 0 when $m = n$. In particular, $S_{n+1} - S_n = s_{n+1}$.

Proposition 3.3.12 $\sum_{k=0}^{\infty} s_k$ converges if and only if, given $\varepsilon > 0$, there exists an $M = M(\varepsilon)$ such that
\[ \left|\sum_{k=m+1}^{n} s_k\right| < \varepsilon \quad \text{whenever } n > m > M. \]

Proof. This is simply a rephrasing of the Cauchy convergence criterion.

The next result gives us a necessary condition for convergence, namely that the terms of a convergent series must approach 0. It is logically equivalent to the statement that if the terms of a series don't go to 0, the series can't converge.

Proposition 3.3.13 If the series $\sum_{k=0}^{\infty} s_k$ converges, then the $n$th term $s_n \to 0$ as $n \to \infty$; in particular, the terms of a convergent series are bounded.


Proof. If the series converges, the partial sums $(S_n)$ form a Cauchy sequence. Given $\varepsilon > 0$, choose $M$ so large that $|S_n - S_k| < \varepsilon$ when $n, k \ge M$. In particular, if $n > M$ then $n - 1 \ge M$ and $|s_n| = |S_n - S_{n-1}| < \varepsilon$.

Corollary 3.3.14 (Divergence Test) If the terms of a series don't approach 0, the series can't converge.

Example 3.3.15 $\sum_{k=0}^{\infty} \frac{k^2}{3k^2 + 2k + 1}$ diverges. To see this, we compute the limit of the $k$th term:
\begin{align*}
\lim_{k\to\infty} \frac{k^2}{3k^2 + 2k + 1} &= \lim_{k\to\infty} \frac{k^2/k^2}{3k^2/k^2 + 2k/k^2 + 1/k^2} \\
&= \lim_{k\to\infty} \frac{1}{3 + 2/k + 1/k^2} \\
&= \frac13.
\end{align*}
Since this limit is not 0, the series diverges.

What is crucial to realize is that even if the terms of a series do approach 0, the series still may not converge! Here is the oldest and simplest example; the proof is also a classic.

Proposition 3.3.16 The harmonic series $H = 1 + \frac12 + \frac13 + \frac14 + \cdots = \sum_{k=1}^{\infty} \frac1k$ diverges to $\infty$.

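Proof. There are a number of different and interesting proofs that the harmonic series diverges. The most intuitive is based on the simple observation that for any integer $n > 1$,
\[ \underbrace{\frac{1}{n+1} + \frac{1}{n+2} + \frac{1}{n+3} + \cdots + \frac{1}{2n}}_{n \text{ terms}} > \underbrace{\frac{1}{2n} + \cdots + \frac{1}{2n}}_{n \text{ terms}} = \frac12. \]
Thus, for example, we can write the 16th partial sum, $H_{16}$, as
\[ H_{16} = 1 + \underbrace{\left[\tfrac12\right]}_{= \frac12} + \underbrace{\left[\tfrac13 + \tfrac14\right]}_{> \frac12} + \underbrace{\left[\tfrac15 + \tfrac16 + \tfrac17 + \tfrac18\right]}_{> \frac12} + \underbrace{\left[\tfrac19 + \tfrac1{10} + \cdots + \tfrac1{15} + \tfrac1{16}\right]}_{> \frac12}, \]
so $H_{16} > 1 + 4\left(\frac12\right)$. And, in general, $H_{2^k} > 1 + k\left(\frac12\right)$ (for $k \ge 2$), so that the partial sums become infinite.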
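The grouping bound in this proof can be confirmed exactly for the first several powers of 2. The following Python sketch uses rational arithmetic; the function name `H` mirrors the notation $H_n$ of the proof, and everything else is an illustrative assumption.

```python
from fractions import Fraction


def H(n: int) -> Fraction:
    """Exact n-th partial sum of the harmonic series."""
    return sum(Fraction(1, i) for i in range(1, n + 1))


for k in range(0, 11):
    bound = 1 + Fraction(k, 2)
    # Equality holds for k = 0 and k = 1; the inequality is strict afterwards.
    print(k, float(H(2 ** k)), float(bound), H(2 ** k) >= bound)
```

The bound grows by one half each time the number of terms doubles, which shows how slowly the partial sums grow even though they are unbounded.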

Thus, although the terms of the harmonic series approach 0, their sum (the infinite harmonic series) does not converge. So how can we tell if a series does converge? This is a topic of such importance and subtlety that mathematicians have developed many tests, some of them quite complicated. We will state and prove a few of the most basic in this section.


We start by assuming that the terms of our series are all positive. In some sense it is harder for such a series to converge, since there can’t be any cancellation of positive and negative terms. On the other hand, this also narrows down the problem of divergence to the issue of the terms getting too big. If we already have a series that converges, its terms can’t be getting too big, so any series whose terms are roughly the same in size or smaller should converge. That is the idea of a comparison test; here’s the simplest one.

Theorem 3.3.17 (Comparison Test for Series) Suppose that $|v_n| \le w_n$ for $n$ sufficiently large and that $\sum_{n=0}^{\infty} w_n$ converges. Then $\sum_{n=0}^{\infty} v_n$ converges. On the other hand, if $v_n \ge |w_n|$ for $n$ sufficiently large and $\sum_{n=0}^{\infty} w_n$ diverges, then $\sum_{n=0}^{\infty} v_n$ diverges.

Proof.
$$|V_q - V_p| = \left|\sum_{n=p+1}^{q} v_n\right| \le \sum_{n=p+1}^{q} |v_n| \le \sum_{n=p+1}^{q} w_n,$$
say for $q \ge p > O$ ($p$, $q$ “sufficiently large”). Since the series with terms $w_n$ converges (and the $w_n$ are non-negative), there is a $P$ such that this last sum is $< \varepsilon$ when $q > p > P$. Thus, $|V_q - V_p| < \varepsilon$ when $q \ge p > \max(O, P)$. The divergence is equally straightforward.

As a quick application of the comparison test, we establish again the convergence of $\sum_{n=0}^{\infty} \frac{1}{(n+1)^2} = 1 + \frac14 + \frac19 + \frac1{16} + \cdots$. We note that $\frac{1}{(n+1)^2} < \frac{1}{n(n+1)}$, and we saw above that $\sum_{n=1}^{\infty} \frac{1}{n(n+1)}$ converges. It follows that $\sum_{n=1}^{\infty} \frac{1}{n^s}$ converges when $s \ge 2$. It is, in fact, true that this last series converges for all $s > 1$. This fact is most easily proved using the integral test (see Theorem 5.3.27), but here’s a very simple proof illustrating both a comparison with a geometric series and an idea borrowed from the proof of the divergence of the harmonic series.

Proposition 3.3.18 (s-series test) $\sum_{n=1}^{\infty} \frac{1}{n^s}$ converges for $s > 1$ and diverges for $s \le 1$.

Proof.
$$\sum_{n=1}^{\infty} \frac{1}{n^s} = 1 + \frac{1}{2^s} + \frac{1}{3^s} + \frac{1}{4^s} + \cdots$$
$$= 1 + \underbrace{\left(\frac{1}{2^s} + \frac{1}{3^s}\right)}_{2\text{ terms}} + \underbrace{\left(\frac{1}{4^s} + \frac{1}{5^s} + \frac{1}{6^s} + \frac{1}{7^s}\right)}_{4\text{ terms}} + \cdots$$
$$< 1 + 2\left(\frac{1}{2^s}\right) + 4\left(\frac{1}{4^s}\right) + 8\left(\frac{1}{8^s}\right) + \cdots$$
$$= 1 + \frac{1}{2^{s-1}} + \left(\frac{1}{2^{s-1}}\right)^2 + \left(\frac{1}{2^{s-1}}\right)^3 + \cdots$$

This last is a geometric series which converges when $2^{s-1} > 1$, i.e. when $s - 1 > 0$. The divergence for $s \le 1$ follows from comparison with the harmonic series.
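The grouping argument can be checked numerically. The following is a minimal sketch, not part of the text: the function names are illustrative assumptions, and it only compares partial sums of $\sum 1/n^s$ with the sum $1/(1 - 2^{-(s-1)})$ of the dominating geometric series.

```python
def s_series_partial_sum(s, terms):
    """Partial sum 1/1^s + 1/2^s + ... + 1/terms^s."""
    return sum(1.0 / n ** s for n in range(1, terms + 1))


def geometric_bound(s):
    """Sum of the dominating geometric series 1 + r + r^2 + ..., r = 1/2^(s-1)."""
    if s <= 1:
        raise ValueError("the geometric comparison needs s > 1")
    r = 1.0 / 2 ** (s - 1)
    return 1.0 / (1.0 - r)


# For s = 2 the ratio is r = 1/2, so every partial sum stays below 2.
bound = geometric_bound(2.0)
partial = s_series_partial_sum(2.0, 10_000)
print(partial, "<", bound)
```

The bound is crude (for $s = 2$ it gives 2), but it is enough to show that the increasing partial sums are bounded.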

LIMITS, SEQUENCES AND SERIES


Theorem 3.3.19 (Limit Comparison Test) Suppose $(v_n)$ and $(w_n)$ are sequences which are both non-negative for $n$ sufficiently large; also suppose that $\frac{v_n}{w_n} \to O > 0$ as $n \to \infty$. Then $\sum_{n=0}^{\infty} v_n$ converges if and only if $\sum_{n=0}^{\infty} w_n$ converges.

Proof. For $n$ sufficiently large, $v_n$ and $w_n$ are non-negative and $\frac{v_n}{w_n} \le O + \varepsilon$. So $v_n \le w_n (O + \varepsilon)$, and the comparison test shows that convergence of the $w$ series implies convergence of the $v$ series. To prove the converse, note that $\frac{w_n}{v_n} \to 1/O$.

Remark 3.3.20 This provides another proof that $\sum_{n=1}^{\infty} \frac{1}{n^2}$ converges, since
$$\frac{1/n^2}{1/(n(n+1))} = \frac{n^2 + n}{n^2} \to 1$$
as $n \to \infty$. It also shows that if $d > 0$, then $\sum_{n=1}^{\infty} \frac{1}{dn + e}$ diverges, since we can compare it with the harmonic series.

Since we know exactly when geometric series converge, our next move is to compare a given series with a related geometric one. So suppose we have $\sum_{n=1}^{\infty} v_n$ with $v_n > 0$. Define $u_n$ for $n = 1, 2, \ldots$ by
$$u_{n+1} = \frac{v_{n+1}}{v_n},$$
so $v_{n+1} = u_{n+1} v_n$. If all the $u_n$ were equal, we would, of course, have a geometric series; nevertheless we can make a comparison.

Proposition 3.3.21 Suppose the terms $v_n$ are eventually non-negative and $\frac{v_{n+1}}{v_n} \le u$ for some $0 \le u < 1$ and all sufficiently large $n$. Then

1. $v_n \to 0$ as $n \to \infty$.
2. $\sum_{n=0}^{\infty} v_n$ converges and
$$\sum_{n=0}^{p} v_n \le \sum_{n=0}^{\infty} v_n \le \left(\sum_{n=0}^{p} v_n\right) + \left(\frac{1}{1-u}\right) v_{p+1}$$
for $p \ge P$, where $P$ is an index from which the hypotheses hold.

Proof. First we prove claim (1). Let’s suppose that for $n \ge P$, the terms $v_n$ are non-negative and $\frac{v_{n+1}}{v_n} \le u < 1$. For $q > p \ge P$,
$$v_q = u_q v_{q-1} = u_q u_{q-1} v_{q-2} = u_q u_{q-1} u_{q-2} v_{q-3} = \cdots = u_q u_{q-1} \cdots u_{p+1} v_p,$$

SERIES OF NUMBERS


or
$$v_q = v_p \prod_{n=1}^{q-p} u_{p+n} \le v_p u^{q-p} = u^q \left(\frac{v_p}{u^p}\right).$$
So let’s fix some $p$, say $p = P$, and set $F = \frac{v_P}{u^P}$. Then $0 < v_q \le F u^q \to 0$ as $q \to \infty$ since $u < 1$. This establishes the first claim.

To prove the second, we assume that $q > p \ge P - 1$ and examine the Cauchy condition:
$$0 < V_q - V_p = \sum_{n=p+1}^{q} v_n \le v_{p+1} \left(\sum_{n=p+1}^{q} u^{n-(p+1)}\right) = v_{p+1} \left(\sum_{n=0}^{q-p-1} u^n\right) \le v_{p+1} \left(\sum_{n=0}^{\infty} u^n\right) = \frac{v_{p+1}}{1-u} \to 0,$$
since we know $v_{p+1} \to 0$. Also, the inequality above shows that for all $q > p \ge P - 1$, $V_p < V_q < V_p + \frac{v_{p+1}}{1-u}$ (where $V_q$ and $V_p$ are partial sums). We now hold $p$ fixed and let $q \to \infty$. Since weak inequalities are preserved in the limit (Corollary 3.1.10), we get the rest of the claim.

Thus, if our series is (eventually) dominated by a geometric one, then it not only converges, but we get a nice upper error bound when we truncate it at some finite partial sum. If, in addition, our series eventually dominates a geometric one, then we also get a lower error bound, using a similar argument. Combining lower and upper bounds gives us a nice estimate for the difference between the infinite sum and the large partial sums. This difference is sometimes called the truncation error: what you get when you chop off a series after a finite number of terms.

Theorem 3.3.22 (Truncation estimate) Suppose the same hypotheses as the previous proposition, but also assume that $0 \le w \le u_{n+1} \le u$ for all $n > P$. Then $\sum_{l=1}^{\infty} v_l$ converges and
$$\left(\frac{1}{1-w}\right) v_{p+1} \le \sum_{n=0}^{\infty} v_n - \left(\sum_{n=0}^{p} v_n\right) \le \left(\frac{1}{1-u}\right) v_{p+1}.$$

Example 3.3.23 As we’ll see later, $-\ln(1-x) = \sum_{n=1}^{\infty} \frac{x^n}{n}$. Here $\frac{v_{n+1}}{v_n} = \left(\frac{n}{n+1}\right) x$. Supposing that $0 \le x < 1$, we can take $p = 20$, and we have $0 < \frac{20}{21} x \le \frac{v_{n+1}}{v_n} < x$ for $n \ge 20$.

If we want to estimate $\ln 4$, we can take $x = 3/4$ and truncate after the 20th term. This gives us $0 < \frac57 \le \frac{v_{n+1}}{v_n} < \frac34$, so
$$\frac72 \left(\frac34\right)^{21} \frac{1}{21} \le \ln(4) - \sum_{n=1}^{20} \frac{x^n}{n} \le 4 \left(\frac34\right)^{21} \frac{1}{21},$$
or
$$0.000396 \le \ln(4) - \sum_{n=1}^{20} \frac{x^n}{n} \le 0.000453.$$
In fact, the actual value of the error is about 0.000403, so this is a pretty good estimate.

Corollary 3.3.24 (Ratio Test) Suppose $v_n \ge 0$ for $n$ sufficiently large, and $\lim_{n\to\infty} \frac{v_{n+1}}{v_n} = u$. Then $\sum_{n=0}^{\infty} v_n$ converges if $u < 1$, and $\sum_{n=0}^{\infty} v_n$ diverges if $u > 1$. We can draw no conclusion if $u = 1$.

Note that the ratio test fails to detect the divergence of the harmonic series $\sum_{n=1}^{\infty} 1/n$ or the convergence of $\sum_{n=1}^{\infty} 1/n^2$, which we have established using ad hoc methods. Later, we will be able to use the integral test (Theorem 5.3.27) to deal with series of this type.

Proposition 3.3.25 (Ratio Comparison Test) Suppose for sufficiently large $n$ the numbers $v_n$ and $e_n$ are non-negative, and $\frac{v_{n+1}}{v_n} \le \frac{e_{n+1}}{e_n}$. If the series $\sum_{n=0}^{\infty} e_n$ converges, so does $\sum_{n=0}^{\infty} v_n$.

Proof. For $n \ge$ some $Q$, $\frac{v_{n+1}}{e_{n+1}} \le \frac{v_n}{e_n}$, so the sequence of ratios $\frac{v_n}{e_n}$ is decreasing. If $\lambda = \frac{v_Q}{e_Q}$, then $v_n \le \lambda e_n$ for $n > Q$; so $\sum_{n=0}^{\infty} v_n$ converges by the Comparison Test.
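Returning to the truncation estimate for $\ln 4$ in Example 3.3.23, the bounds can be reproduced with a few lines of arithmetic. This is only a sketch under stated assumptions: the terms are taken to be $v_n = x^n/n$ for $n \ge 1$, the function name `truncation_bounds` is hypothetical, and the true value of $\ln 4$ comes from the standard library rather than from anything proved so far.

```python
import math


def truncation_bounds(x, p):
    """Lower and upper bounds for the error left after summing x^n/n, n = 1..p."""
    first_dropped = x ** (p + 1) / (p + 1)   # v_{p+1}
    w = p / (p + 1) * x                      # lower bound on v_{n+1}/v_n for n >= p
    u = x                                    # upper bound on v_{n+1}/v_n
    return first_dropped / (1 - w), first_dropped / (1 - u)


x, p = 0.75, 20
partial = sum(x ** n / n for n in range(1, p + 1))
error = -math.log(1 - x) - partial           # -ln(1 - 3/4) = ln 4
lower, upper = truncation_bounds(x, p)
print(lower, error, upper)
```

With $x = 3/4$ the two factors are $1/(1-w) = 7/2$ and $1/(1-u) = 4$, matching the displayed inequality.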

So far we have been dealing with series whose terms are either all positive or at least eventually positive. Now we turn our attention to more general series whose terms don’t satisfy these conditions.

Definition 3.3.26 $\sum_{n=0}^{\infty} v_n$ is said to be absolutely convergent if $\sum_{n=0}^{\infty} |v_n|$ converges. If a series converges but is not absolutely convergent, it is called conditionally convergent.

Proposition 3.3.27 Every absolutely convergent series is convergent.


Proof. Let $V_q = \sum_{n=0}^{q} v_n$ and $W_q = \sum_{n=0}^{q} |v_n|$. Since $(W_q)$ converges, it satisfies the Cauchy criterion; we must show that $(V_q)$ does as well.
$$|V_q - V_p| = \left|\sum_{n=p+1}^{q} v_n\right| \le \sum_{n=p+1}^{q} |v_n| \quad \text{(by the triangle inequality)}$$
$$= |W_p - W_q|$$
Since $(W_q)$ is a Cauchy sequence, there is a $Q(\varepsilon)$ that makes $|W_p - W_q| < \varepsilon$ for all $p, q > Q(\varepsilon)$, so the same $Q(\varepsilon)$ works for $|V_p - V_q|$.

Corollary 3.3.28 (Absolute Ratio Test) Suppose $\lim_{n\to\infty} \left|\frac{v_{n+1}}{v_n}\right| = u < 1$. Then $\sum_{n=0}^{\infty} v_n$ is absolutely convergent, so it converges.

Example 3.3.29 For the series $\sum_{n=1}^{\infty} (-1)^n \frac{2^n x^n}{n}$, we have
$$\lim_{n\to\infty} \left|\frac{v_{n+1}}{v_n}\right| = \lim_{n\to\infty} \frac{n}{n+1} |2x| = |2x|.$$
When $-1/2 < x < 1/2$, the series converges absolutely. When $x = -1/2$, we get the harmonic series, which diverges. When $x = 1/2$ we get the alternating harmonic series which converges, as we shall see below. When $|x| > 1/2$, the series diverges.

Proposition 3.3.30 (nth Root Test) Suppose $\lim_{n\to\infty} |v_n|^{1/n} = O$.

1. If $O < 1$, the series $\sum_{n=0}^{\infty} v_n$ converges absolutely.
2. If $O > 1$, the series diverges.
3. If $O = 1$, the test gives no convergence information.

The proof is by comparison with a geometric series and is left as an exercise.

We now return to the general case of a series whose terms may be positive or negative. The simplest such series are called alternating because their terms alternate in sign. Here is the formal definition.

Definition 3.3.31 The sequence $v_n$ alternates in sign if $v_n v_{n+1} \le 0$ for all $n \ge 0$; in other words, the product of any two consecutive terms is non-positive. In this case, $\sum_{n=0}^{\infty} v_n$ is called an alternating series.


An alternating series looks like $d_0 - d_1 + d_2 - d_3 + \cdots$, where the $d_n$ are nonnegative. It turns out that these converge when the terms $d_n$ eventually decrease to 0, i.e. when $d_n \ge d_{n+1}$ for all sufficiently large $n$ and $\lim(d_n) = 0$. Here is a diagram of what this looks like ($V_q$ as usual denotes the $q$th partial sum):

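[Diagram: the partial sums $V_q$ of an alternating series marked on a number line, with consecutive partial sums a distance $d_q$ apart.]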

You can see that any partial sum $V_p$ lies between any two consecutive previous partial sums. It follows that the distance between any two partial sums is less than the distance between any two consecutive previous partial sums, but this last distance is always $|V_{q-1} - V_q| = |d_q|$. These observations are enough to establish that the partial sums form a Cauchy sequence, so we get convergence.

We now turn to a more precise formulation of these results, one that doesn’t depend on a picture. Also, it is annoying to have to deal with the various special cases in alternating series that arise by having to know whether $V_{q+1} = V_q + d_{q+1}$, or $V_{q+1} = V_q - d_{q+1}$ (that is, whether $q$ is even or odd). Fortunately, we have the notion of betweenness introduced in Chapter 1 (Definition 1.5.39) that enables us to deal with both cases at once. However, if you skipped the material on betweenness, or would rather have a more concrete proof (maybe using the diagram above for guidance), see exercise 8 on page 128.

In what follows, we will assume that we have an alternating series $\sum_{n=0}^{\infty} v_n$ whose terms eventually weakly decrease (that is, do not increase) in size. All these conditions can be summarized as follows. For some $Q$, $v_{n+1} = u_{n+1} \cdot v_n$ with $-1 \le u_{n+1} \le 0$ for $n \ge Q$.

Proposition 3.3.32 Suppose the sequence $(v_n)$ weakly decreases for $n \ge Q$. Then, for $p > q \ge Q$, the partial sum $V_p$ lies between the sums $V_q$ and $V_{q+1}$.

Proof. By a previous result on betweenness (Proposition 1.5.42), it suffices to show that if we take any $n > Q$ and consider three consecutive terms $V_n$, $V_{n+1}$ and $V_{n+2}$, then the third term is between the first two. To prove this, it suffices to show that $|V_{n+1} - V_n| = |V_{n+1} - V_{n+2}| + |V_{n+2} - V_n|$ (by Proposition 1.5.44). Here is the computation:
$$|V_{n+1} - V_{n+2}| + |V_{n+2} - V_n| = |v_{n+2}| + |v_{n+2} + v_{n+1}|$$
$$= |u_{n+2}||v_{n+1}| + |u_{n+2} v_{n+1} + v_{n+1}|$$
$$= -u_{n+2}|v_{n+1}| + |u_{n+2} + 1||v_{n+1}|$$
$$= -u_{n+2}|v_{n+1}| + (u_{n+2} + 1)|v_{n+1}|$$
$$= |v_{n+1}| = |V_{n+1} - V_n|.$$
(Note that we have used the fact that $-1 \le u_{n+1} \le 0$ several times.)

Corollary 3.3.33 (Alternating Series Test) If $(v_n)$ is alternating, weakly decreasing for $n > Q$, and $v_n \to 0$, then

1. $(V_n)$ converges to a limit $V$, and for each $q \ge Q$, $V$ is between $V_q$ and $V_{q+1}$.
2. $\left|\sum_{n=0}^{q} v_n - V\right| \le |v_{q+1}|$ for $q \ge Q$.

Proof. We want to show that $(V_q)$ is a Cauchy sequence. The betweenness result from the previous proposition tells us that
$$\left|\sum_{n=0}^{q+1} v_n - \sum_{n=0}^{q} v_n\right| = \left|\sum_{n=0}^{q+1} v_n - \sum_{n=0}^{p} v_n\right| + \left|\sum_{n=0}^{p} v_n - \sum_{n=0}^{q} v_n\right|$$
or, performing the subtractions,
$$|v_{q+1}| = \left|\sum_{n=q+2}^{p} v_n\right| + \left|\sum_{n=q+1}^{p} v_n\right|.$$
The last equation tells us that $\left|\sum_{n=q+1}^{p} v_n\right| \le |v_{q+1}| \to 0$, independent of $p$. This makes $(V_q)$ a Cauchy sequence, so $(V_q) \to O$ for some $O$.

On the other hand, we can choose $Q$ so that for $p > Q$, $|V_p - O| < \varepsilon/2$ (for any $\varepsilon > 0$). Then by the triangle inequality we have
$$|V_{q+1} - O| \le |V_{q+1} - V_p| + |V_p - O|$$
$$|O - V_q| \le |O - V_p| + |V_p - V_q|,$$
and adding yields
$$|V_{q+1} - O| + |O - V_q| \le |V_{q+1} - V_q| + \varepsilon.$$
Since $\varepsilon$ was arbitrary, $|V_{q+1} - O| + |O - V_q| \le |V_{q+1} - V_q|$. One can show the reverse inequality similarly, so $|V_{q+1} - O| + |O - V_q| = |V_{q+1} - V_q|$. This completes the proof.

Since alternating series are so common, we summarize the results of Corollary 3.3.33 as follows.


Summary 3.3.34 An alternating series whose terms decrease in absolute value to 0 converges, and

1. The limit lies between any two consecutive partial sums.
2. The truncation error is no larger than the absolute value of the first term dropped.

Definition 3.3.35 The alternating harmonic series $D$ is given by
$$\sum_{n=0}^{\infty} \frac{(-1)^n}{n+1} = 1 - \frac12 + \frac13 - \frac14 + \cdots$$
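Both parts of Summary 3.3.34 can be observed on the alternating harmonic series. The sketch below is illustrative only: the function names are assumptions, and it takes the limit to be $\ln 2$, a fact the text states but proves only later.

```python
import math


def alternating_harmonic_partial(q):
    """V_q = sum_{n=0}^{q} (-1)^n / (n + 1)."""
    return sum((-1) ** n / (n + 1) for n in range(q + 1))


def check_summary(q, limit=math.log(2)):
    """Check betweenness and the truncation bound for the partial sum V_q."""
    v_q = alternating_harmonic_partial(q)
    v_next = alternating_harmonic_partial(q + 1)
    between = min(v_q, v_next) <= limit <= max(v_q, v_next)
    first_dropped = 1 / (q + 2)              # |v_{q+1}|
    small_error = abs(v_q - limit) <= first_dropped
    return between and small_error


print(all(check_summary(q) for q in range(200)))
```

The convergence is slow: the error after $q + 1$ terms is only guaranteed to be at most $1/(q+2)$.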

Corollary 3.3.36 The alternating harmonic series converges but does not converge absolutely (i.e. it is conditionally convergent).

(In fact, the limit of the alternating harmonic series is the number we called $\ln 2$ in exercise 2 on page 42 of Chapter 1. Proving this is an exercise at the end of this section.)

Here are some examples related to the Euler number $h$, although we don’t have the machinery yet to prove this connection (we will when we do power series; see Example 7.3.33).

Example 3.3.37 First we look at a series of positive terms, defined as follows:
$$h_0 = 1, \qquad h_{n+1} = \frac{h_n}{n+1},$$
$$H_q = \sum_{n=0}^{q} h_n = \sum_{n=0}^{q} \frac{1}{n!},$$
$$\sum_{n=0}^{\infty} h_n = 1 + \frac{1}{1!} + \frac{1}{2!} + \cdots + \frac{1}{q!} + \cdots$$
Note that the quotients $u_{n+1} = \frac{h_{n+1}}{h_n} = \frac{1}{n+1} \to 0$ as $n \to \infty$. Thus, the series converges by the ratio test; let the limit be $H$. In fact, as we will see in section 7.3 on power series, $H$ is actually the Euler number $h$ already defined (Definition 2.4.7). Since the partial sums are increasing, we have $H_{q+1} \le H$ for any $q$. By the second part of Proposition 3.3.21 we have
$$H_{q+1} \le H \le H_q + \left(\frac{1}{1 - u_{q+2}}\right) h_{q+1} = H_q + \frac{1}{(q+1)!} \cdot \frac{1}{1 - \frac{1}{q+2}} = H_q + \frac{1}{q!} \cdot \frac{1}{(q+1)\left(\frac{q+1}{q+2}\right)} \le H_q + \frac{1}{q!\,q}.$$
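The two-sided bound $H_{q+1} \le H \le H_q + \frac{1}{q!\,q}$ is easy to evaluate exactly. The following is a small sketch using exact rational arithmetic; the function name `euler_bounds` is an assumption, and the comparison with `math.e` relies on the identification of $H$ with the Euler number, which the text defers to section 7.3.

```python
from fractions import Fraction
from math import e, factorial


def euler_bounds(q):
    """Return (H_{q+1}, H_q + 1/(q! q)) as exact fractions, for q >= 1."""
    h_q = sum(Fraction(1, factorial(n)) for n in range(q + 1))
    lower = h_q + Fraction(1, factorial(q + 1))   # H_{q+1}
    upper = h_q + Fraction(1, factorial(q) * q)   # H_q + 1/(q! q)
    return lower, upper


for q in (1, 2, 3):
    lower, upper = euler_bounds(q)
    print(q, lower, upper, float(lower), float(upper))
```

Each extra term shrinks the gap between the bounds roughly by a factor of $q$, which is why a handful of terms already pins down several digits.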


Letting $q = 1, 2, 3$ gives us the estimates
$$1 + 1 + 1/2 < H < 1 + 1 + 1$$
$$1 + 1 + 1/2 + 1/6 < H < 1 + 1 + 1/2 + 1/4$$
$$2 + \frac{51}{72} < H < 2 + \frac{52}{72},$$
or, roughly, $H \approx 2.72$.

Now let’s make the preceding into an alternating series by defining $d_0 = 1$ and $d_{n+1} = -\frac{d_n}{n+1}$; thus, $u_{n+1} = -\frac{1}{n+1}$ and
$$\sum_{n=0}^{\infty} d_n = \sum_{n=0}^{\infty} (-1)^n \frac{1}{n!} = 1 - \frac{1}{1!} + \frac{1}{2!} - \frac{1}{3!} + \frac{1}{4!} - \cdots$$
The alternating series test tells us that this converges to some number $D$. Truncating after the term $\frac{1}{4!}$, we have the estimate
$$D = \sum_{n=0}^{4} (-1)^n \frac{1}{n!} \pm \frac{1}{5!},$$
or
$$D = \frac{9}{24} \pm \frac{1}{120}.$$
This gives us $0.3667 < D < 0.3833$. As we shall see later, $D = 1/h \approx 0.3679$.

Although the comparison and ratio tests only tell us when a series is absolutely convergent, there are also useful tests for conditional convergence, due to Dirichlet and Abel.

Theorem 3.3.38 (Dirichlet’s Test) Suppose $(d_n)$ and $(e_n)$ are sequences that have the following properties:

1. For all $n$, $d_n \ge 0$, $d_n \ge d_{n+1}$, and $\lim_{n\to\infty} d_n = 0$.
2. The partial sums $E_q = \sum_{n=0}^{q} e_n$ are all bounded by some number $E$ (that is, $|E_q| \le E$ for all $q$).

Then the series $\sum_{n=0}^{\infty} d_n e_n$ converges.

Proof. An outline of the proof is in the exercises.

Example 3.3.39 The series $\sum_{n=1}^{\infty} (\sin nx)/n$ and $\sum_{n=1}^{\infty} (\cos nx)/n$ (with $x$ not a multiple of $2\pi$) converge. We show the first of these converges. Here we have $d_n = 1/n$ and $e_n = \sin nx$. A standard formula from trigonometry, derived from the formulas for $\sin(D+E)$ and $\cos(D+E)$, gives
$$\cos(\alpha) - \cos(\beta) = -2 \sin\left(\frac{\alpha+\beta}{2}\right) \sin\left(\frac{\alpha-\beta}{2}\right).$$
Replacing $\alpha$ by $(n + \frac12)x$ and $\beta$ by $(n - \frac12)x$, solving for $\sin(nx)$, and noticing the telescoping (cancellation of almost all terms), we see that
$$E_q = \sum_{n=1}^{q} \sin nx = \frac{1}{2\sin\frac{x}{2}} \left[\cos\frac{x}{2} - \cos\left(\frac{2q+1}{2}\,x\right)\right] = \frac{1}{\sin\frac{x}{2}} \sin\left(\frac{(q+1)x}{2}\right) \sin\left(\frac{qx}{2}\right).$$
Thus, $|E_q| \le \left|\frac{1}{\sin\frac{x}{2}}\right|$, so the $E_q$ are bounded and Dirichlet’s test applies. (Note that when $\sin(x/2) = 0$ the series is identically 0, so converges anyway.)

Note, however, that both of these series converge conditionally; in fact, the series $\sum_{n=1}^{\infty} \left|\frac{\sin nx}{n}\right|$ and $\sum_{n=1}^{\infty} \left|\frac{\cos nx}{n}\right|$ diverge (see exercises).
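The closed form for $E_q = \sum_{n=1}^{q} \sin nx$ and the resulting bound can be checked numerically. This is a sketch with hypothetical function names; it compares a direct sum against the closed form for a few sample values of $x$ where $\sin(x/2) \ne 0$.

```python
import math


def sine_partial_sum(x, q):
    """E_q = sin(x) + sin(2x) + ... + sin(qx), summed directly."""
    return sum(math.sin(n * x) for n in range(1, q + 1))


def sine_closed_form(x, q):
    """sin((q+1)x/2) * sin(qx/2) / sin(x/2); requires sin(x/2) != 0."""
    return math.sin((q + 1) * x / 2) * math.sin(q * x / 2) / math.sin(x / 2)


for x in (0.3, 1.0, 2.5):
    bound = 1 / abs(math.sin(x / 2))
    worst = max(abs(sine_partial_sum(x, q)) for q in range(1, 200))
    print(x, worst, "<=", bound)
```

The bound does not depend on $q$, which is exactly the boundedness hypothesis that Dirichlet’s test needs.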

Corollary 3.3.40 (Abel’s Test) If $(d_n)$ is a decreasing (or increasing) sequence of non-negative numbers which converges and $\sum_{n=0}^{\infty} e_n$ converges, then $\sum_{n=0}^{\infty} d_n e_n$ converges.

Proof. The partial sums of $\sum_{n=0}^{\infty} e_n$ are bounded. Suppose $(d_n) \to d$ and apply Dirichlet’s test using the sequence $(d_n - d)$ (or $(d - d_n)$ when $(d_n)$ is increasing).

We conclude by returning to a result we have already proved: if a series converges, its terms go to 0. The following proposition shows the equivalence between a sequence going to 0 and the divergence of a related series. We will use this result later to study binomial series.

Proposition 3.3.41 Let $(d_n)$ be a decreasing sequence of positive numbers and define $e_n = 1 - \frac{d_{n+1}}{d_n}$; suppose that $e_n \to 0$ as $n \to \infty$. Then $d_n \to 0$ if and only if $\sum_{n=0}^{\infty} e_n$ diverges.

Proof. For $q \ge 2$, $d_q = d_1 (1 - e_1)(1 - e_2) \cdots (1 - e_{q-1})$. Therefore, taking $\ln$ of both sides, we have $\ln d_q = \ln d_1 + \sum_{n=1}^{q-1} \ln(1 - e_n)$. Thus, $d_q \to 0$ if and only if $\lim_{q\to\infty} \sum_{n=1}^{q-1} \ln(1 - e_n) = -\infty$. Since $e_n \to 0$, $\lim_{n\to\infty} \frac{\ln(1 - e_n)}{e_n} = -1$ (say by l’Hôpital’s rule, which we’ll prove later), so $\sum_{n=1}^{q-1} \ln(1 - e_n)$ diverges if and only if $\sum_{n=0}^{\infty} e_n$ diverges, by the limit comparison test (Theorem 3.3.19), so $d_q \to 0$ if and only if $\sum_{n=0}^{\infty} e_n$ diverges.


Exercises

1. Prove the following, which was left unproved in the text. Proposition 3.3.11. For any $q \ge 0$,
(a) $\sum_{n=0}^{\infty} v_n$ converges if and only if $\sum_{n=0}^{\infty} v_{q+1+n}$ converges.
(b) When $\sum_{n=0}^{\infty} v_n$ converges, $\sum_{n=0}^{\infty} v_n = V_q + \sum_{n=0}^{\infty} v_{q+1+n}$.

2. We have seen that

4 P

n=1

harmonic series

4 P

n=1

(a) Prove that (b) Prove that P4 (c) Does n=1

1 n

P4

1 ns

doesn’t (Proposition 3.3.16).

1 n=1 ns

P4

converges for s A 1 (see Proposition 3.3.18), but the

diverges for s 1.

sin(n{) ns

converges absolutely for s A 1 and all {. P4 n 1 n2 1 n=1 n4 +n+1 ? n3 +n+1 converge? What about P4 P4 (d) Do either of n=1 cos(1@n) or n=1 sin(1@n) converge? n=1 2

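3. (Ratio test) Discuss the convergence or divergence of
(a) $\sum_{n=1}^{\infty} \frac{n!}{n^n}$
(b) $\sum_{n=1}^{\infty} \frac{2^n n!}{n^n}$
(c) $\sum_{n=1}^{\infty} \frac{3^n n!}{n^n}$
(d) $\sum_{n=1}^{\infty} \frac{d^n n!}{n^n}$ (What’s the cut-off for $d$?)

4. The harmonic series isn’t the only one: show that $\sum_{n=1}^{\infty} \left(\sqrt{n+1} - \sqrt{n}\right)$ diverges even though $\lim_{n\to\infty} \left(\sqrt{n+1} - \sqrt{n}\right) = 0$. It will be helpful to use
$$\left(\sqrt{n+1} - \sqrt{n}\right)\left(\sqrt{n+1} + \sqrt{n}\right) = 1.$$

5. Prove that if $\sum_{n=1}^{\infty} d_n \to O$ and $\sum_{n=1}^{\infty} e_n \to P$, then $\sum_{n=1}^{\infty} (d_n + e_n) \to O + P$. Prove that $\sum_{n=1}^{\infty} d_n e_n$ converges also. Does it converge to $OP$? Explain.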

6. One of the consequences of the divergence of the harmonic series is that there are infinitely many primes. Here is an outline of a proof from Ivan Niven; rewrite it, filling in all the reasoning.

(a) Let $\sum'_{n<q} \frac{1}{n}$ denote the sum of the reciprocals of those numbers $n < q$ which are square-free (i.e. not divisible by the square of a whole number other than 1). Since every number is uniquely a product of a square and a square-free number,
$$\left(\sum{}'_{n<q} \frac{1}{n}\right) \left(\sum_{m<q} \frac{1}{m^2}\right) \ge \sum_{p<q} \frac{1}{p}.$$
This shows that $\sum'_{n<q} \frac{1}{n}$ is unbounded (why?).
(b) $h^x > 1 + x$ (Corollary 2.4.12).
(c) Suppose that $\sum_{s<q} \frac{1}{s}$ is bounded by $E$, where $s$ runs over primes less than $q$. Then
$$h^E > \exp\left(\sum_{s<q} \frac{1}{s}\right) = \prod_{s<q} \exp(1/s) > \prod_{s<q} (1 + 1/s) \ge \sum{}'_{n<q} \frac{1}{n}.$$
(d) There must be infinitely many primes.

7. The hyperbolic Bessel function $L_0(x)$ is given by the series
$$L_0(x) = \sum_{n=0}^{\infty} \frac{x^{2n}}{4^n (n!)^2}.$$
(a) Discuss the convergence of this series for particular values of $x$.
(b) Use the truncation estimate of Theorem 3.3.22 to approximate the error when the series for $L_0(2)$ is chopped off after 20 terms.

8. (Alternating series without betweenness theory). Let us write an alternating series as
$$\sum_{n=0}^{\infty} (-1)^n d_n = d_0 - d_1 + d_2 - d_3 + \cdots,$$
where the $d_n$ are non-negative and decreasing to 0. Denote by $V_q$ the partial sum $\sum_{n=0}^{q} (-1)^n d_n$.

(a) Prove that $V_{q-1} \le V_q \le V_{q-2}$ if $q$ is even, and $V_{q-2} \le V_q \le V_{q-1}$ if $q$ is odd.
(b) Use the previous part to show that $|V_q - V_p| + |V_p - V_{q-1}| = |V_q - V_{q-1}|$ for $q < p$. Explain why this gives us Cauchy convergence.
(c) Prove the two parts of the alternating series test (Corollary 3.3.33).


9. Show that the terms of this alternating series are eventually decreasing to 0 for any $D$:
$$1 - \frac{D^2}{2!} + \frac{D^4}{4!} - \frac{D^6}{6!} + \cdots \pm \frac{D^{2n}}{(2n)!} \cdots$$
(As far as convergence goes, however, it is actually absolutely convergent.) What will the truncation error be after $q$ terms?

10. In Exercise 2 on page 42 the numbers $\ln 2$ and $\pi/4$ were defined. Using these definitions, prove that
(a) $\ln 2 = 1 - \frac12 + \frac13 - \frac14 + \cdots = \sum_{n=1}^{\infty} (-1)^{n+1} \frac{1}{n}$.
(b) $\pi/4 = 1 - \frac13 + \frac15 - \frac17 + \cdots = \sum_{n=1}^{\infty} (-1)^{n+1} \frac{1}{2n-1}$.
(Prove these statements by showing that the partial sums of these series lie in every one of the intervals defining the numbers $\ln 2$ and $\pi/4$.)

11. The following sequence of ideas is borrowed from [Spivak, 1994]. We suppose that $i$ is a function on $\mathbb{R}$ with the property that $|i(y) - i(x)| \le f\,|y - x|$ for all $x, y$ in some interval $L$, with $f < 1$. Such a function is called a contraction map.
(a) Suppose that $(d_q)$ is a sequence such that $|d_q - d_{q+1}| < f^q$ where $f < 1$. Prove that $(d_q)$ is a Cauchy sequence. (Hint: Sum the series $f^p + f^{p+1} + \cdots + f^q$.)
(b) Prove that for any $x$, the sequence $x, i(x), i(i(x)), i(i(i(x))), \ldots$ is a Cauchy sequence.
(c) Use the previous part to show that $i$ has a fixed point (i.e. a number $z$ such that $i(z) = z$).
(d) Prove that $i$ has only one fixed point.

12. Here is an outline of the proof of Dirichlet’s test for conditional convergence (Theorem 3.3.38).
(a) Since $e_q = E_q - E_{q-1}$, we have $\sum_{n=1}^{q} d_n e_n = \sum_{n=1}^{q-1} E_n (d_n - d_{n+1}) + d_q E_q$ (this formula is sometimes called partial or Abel summation).
(b) $\sum_{n=1}^{q-1} |E_n (d_n - d_{n+1})| \le \sum_{n=1}^{q-1} E (d_n - d_{n+1}) = {?}$ So the series with partial sums $\sum_{n=1}^{q-1} E_n (d_n - d_{n+1})$ is absolutely convergent.
(c) $d_q E_q \to 0$; why?
(d) $\sum_{n=1}^{\infty} d_n e_n$ converges.

13. Prove that $\sum_{n=1}^{\infty} \left(\frac{n}{n+1}\right)^{n(n+1)}$ converges.

14. Basically, the alternating series test says that if $(d_n)$ is a sequence that decreases to 0, then the series $\sum_{n=0}^{\infty} (-1)^n d_n$ converges. Show that this follows from Dirichlet’s test (Theorem 3.3.38).


15. Prove that $\sum_{n=1}^{\infty} (\cos nx)/n$ converges provided $x$ is not a multiple of $2\pi$ (see a similar proof for $\sum_{n=1}^{\infty} (\sin nx)/n$ in Example 3.3.39).

16. Prove that $\sum_{n=1}^{\infty} |(\sin nx)/n|$ diverges when $x$ is not a multiple of $\pi$ (we already know that it converges without the absolute value signs; see the previous exercise and Example 3.3.39). (Hint: Use the formula
$$\frac{1}{n} = \frac{\cos^2 nx - \sin^2 nx}{n} + \frac{2\sin^2 nx}{n} = \frac{\cos 2nx}{n} + \frac{2\sin^2 nx}{n}$$
and the fact that $\sum_{n=1}^{\infty} \frac{\cos 2nx}{n}$ converges to deduce that $\sum_{n=1}^{\infty} \frac{\sin^2 nx}{n}$ diverges. But $0 \le \sin^2 nx \le |\sin nx|$.)

17. Prove that $\sum_{n=1}^{\infty} \left|\frac{\cos nx}{n}\right|$ diverges, using the ideas of the previous exercise.

18. Prove that $\sum_{n=1}^{\infty} \left(1 + \frac12 + \cdots + \frac1n\right) \frac{\sin n}{n}$ converges. (Hint: See Example 3.3.39 and look at $d_n = \frac1n \left(1 + \frac12 + \cdots + \frac1n\right)$.)


Appendix I: Some Properties of Exp and Log

We prove here some properties of the exponential and logarithm function that were stated in Section 3.2.

Proposition 3.2.11 For $q \in \mathbb{N}$, $(1 + 1/q)^q \to h$ as $q \to \infty$.

Proof. Put $d = h$ into the inequalities for $O(d)$ in Remark 2.4.4. Noting that $O(d) = \log_h(d)$ (so $O(h) = 1$), we get the inequalities (for $0 < x < 1$)
$$1 - \frac{1}{h^x} \le x \quad\text{or}\quad h^x \le \frac{1}{1-x} = 1 + \frac{x}{1-x}$$
and
$$x \le h^x - 1 \quad\text{or}\quad 1 + x \le h^x.$$
Putting these together and taking $x$th roots gives
$$(1 + x)^{1/x} \le h \le \left(1 + \frac{x}{1-x}\right)^{1/x}. \qquad (*)$$
(Note that the restriction $x < 1$ applies only to the right inequality.) Let $q$ be an integer greater than 1. Taking $x = 1/q$, we get $(1 + 1/q)^q \le h$. Taking $x = 1/(q+1)$, we get $h \le (1 + 1/q)^{q+1}$. But
$$\left(1 + \frac1q\right)^{q+1} = \left(1 + \frac1q\right)^{q} + \frac1q \left(1 + \frac1q\right)^{q} \le \left(1 + \frac1q\right)^{q} + \frac{h}{q}.$$
Therefore
$$h - \frac{h}{q} \le \left(1 + \frac1q\right)^{q} \le h,$$
which shows that $|(1 + 1/q)^q - h| \le h/q < 4/q$.
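The sandwich $(1 + 1/q)^q \le h \le (1 + 1/q)^{q+1}$ and the error bound $h/q$ can be observed in floating point. This is a rough sketch, with the hypothetical name `euler_sandwich`, using `math.e` as a stand-in for the Euler number; rounding error is negligible for the sample values used here but is not analysed.

```python
import math


def euler_sandwich(q):
    """Return ((1 + 1/q)^q, (1 + 1/q)^(q+1)) for an integer q > 1."""
    base = 1 + 1 / q
    return base ** q, base ** (q + 1)


for q in (2, 10, 1000, 10 ** 6):
    low, high = euler_sandwich(q)
    print(q, low, high, math.e - low, math.e / q)
```

The observed gap $h - (1 + 1/q)^q$ is roughly half of the bound $h/q$, so the estimate in the proof is of the right order.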

Proposition 3.2.12 For all $x > 0$, $x/(1 + x) \le \log(1 + x) \le x$.

Proof. Taking logarithms of the two sides of the left inequality in $(*)$, we get $(1/x) \log(1 + x) \le 1$ or $\log(1 + x) \le x$. By the right inequality in $(*)$, we have
$$h \le \left(1 + \frac{w}{1-w}\right)^{1/w}$$
for $0 < w < 1$. Now let $x = w/(1 - w)$ in the inequality above to get
$$h \le (1 + x)^{1 + 1/x}.$$
Note that we have $x \in (0, \infty)$. Taking logarithms of both sides of the inequality above shows that
$$1 \le \left(1 + \frac1x\right) \log(1 + x) \quad\text{or}\quad \frac{x}{1+x} \le \log(1 + x).$$

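As long as $d > 1$, the exponential function $d^x$ dominates any power function $x^n$. That is the content of the following.

Proposition 3.2.13 Given $q \in \mathbb{N}$, $w > 0$, and $n \ge 0$,
$$\frac{(1 + w)^q}{q^n} \to \infty \text{ as } q \to \infty.$$

Proof. Let $\lambda = \ln(1 + w)$. Then $1 + w = h^{\lambda}$, and therefore
$$\frac{(1 + w)^q}{q^n} = h^{\lambda q - n \ln q}.$$
Let $p = \sqrt{q}$. Then $\lambda q - n \ln q = \lambda p^2 - 2n \ln p$, and by Proposition 3.2.12, $\ln p \le p - 1 < p$. So
$$\lambda q - n \ln q \ge \lambda p^2 - 2np = \sqrt{q}\,(\lambda\sqrt{q} - 2n).$$
But
$$\lambda\sqrt{q} - 2n \ge 1 \iff q \ge \left(\frac{1 + 2n}{\lambda}\right)^2.$$
So if $q \ge \left(\frac{1+2n}{\lambda}\right)^2 = \left(\frac{1+2n}{\ln(1+w)}\right)^2$, we have:
$$\frac{(1 + w)^q}{q^n} = h^{\lambda q - n \ln q} \ge h^{\sqrt{q}(\lambda\sqrt{q} - 2n)} \ge h^{\sqrt{q}} \ge 1 + \sqrt{q} > \sqrt{q} \to \infty.$$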
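The threshold in this proof is explicit, so it can be tested directly. Below is a small sketch (the function names are assumptions) that works with logarithms to avoid overflow and checks, for one sample choice of $w$ and $n$, that $\ln\big((1+w)^q/q^n\big) \ge \sqrt{q}$ once $q$ passes $\left(\frac{1+2n}{\ln(1+w)}\right)^2$.

```python
import math


def growth_threshold(w, n):
    """The bound ((1 + 2n) / ln(1 + w))^2 from the proof."""
    return ((1 + 2 * n) / math.log(1 + w)) ** 2


def log_ratio(w, n, q):
    """ln of (1 + w)^q / q^n, computed with logarithms to avoid overflow."""
    return q * math.log(1 + w) - n * math.log(q)


w, n = 0.5, 3
q0 = math.ceil(growth_threshold(w, n))
print(q0, log_ratio(w, n, q0), math.sqrt(q0))
```

The threshold is far from sharp; it is chosen only so that every step of the chain of inequalities is easy to justify.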

Corollary 3.2.14 For $x$ with $|x| < 1$ and any $n \ge 0$, $q^n x^q \to 0$.

Proof. By $\varepsilon$-trichotomy, we can separate into the cases $|x| > 0$ and $|x|$ within, say, $1/2$ of 0. For $|x| > 0$, let $1/|x| = 1 + w$. Then $w > 0$, and
$$\frac{(1 + w)^q}{q^n} \to \infty \quad\text{implies}\quad \frac{q^n}{(1 + w)^q} \to 0,$$
so $q^n x^q \to 0$. For $|x|$ near 0, take $\varepsilon = (1 - |x|)/2$ and let $1 + w = 1/(|x| + \varepsilon)$. Now use the same proof.

Just as exponential functions dominate powers of $x$, log functions are dominated by powers of $x$. More precisely, we have the following.

Corollary 3.2.16 If $q > 0$ then, for $x > 0$, $\frac{\ln x}{x^q} \to 0$ as $x \to \infty$.

Proof. Let $y = q \ln x$, so $x = h^{y/q}$. Then
$$\frac{\ln x}{x^q} = \frac{y/q}{h^y} = (1/q)\,\frac{y}{h^y} \to 0$$
by Proposition 3.2.13 above.

Proposition 3.2.17 $q^{1/q} \to 1$ as $q \to \infty$.

Proof. For all $q \in \mathbb{N}$, $q^{1/q} \ge 1$. Therefore, to prove the proposition, we have to show that for each $\varepsilon > 0$, $q^{1/q} \le 1 + \varepsilon$ for $q \ge$ some $Q(\varepsilon)$. But
$$q^{1/q} \le 1 + \varepsilon \iff q \le (1 + \varepsilon)^q \iff \sqrt{q} \le \frac{(1 + \varepsilon)^q}{\sqrt{q}}.$$
From the proof of Proposition 3.2.13 above (taking $n = 1/2$), this last inequality will hold providing $q \ge \left(\frac{2}{\ln(1+\varepsilon)}\right)^2$.

Corollary 3.2.18 For any polynomial $S(q) = d_0 + d_1 q + \cdots + d_n q^n$ with $d_n > 0$, $(S(q))^{1/q} \to 1$ as $q \to \infty$.


Appendix II: Rearrangements of Series

Definition 3.3.42 A rearrangement of a series $\sum_{n=0}^{\infty} v_n$ is a series $\sum_{n=0}^{\infty} w_n$ where each term of the first series appears exactly once in the second, and each term of the second appears exactly once in the first.

A more technical way of saying this is to define a permutation of the natural numbers $\mathbb{N}$ to be a 1-to-1 correspondence $\sigma: \mathbb{N} \to \mathbb{N}$ (so $\sigma$ has an inverse). Then a rearrangement of the series $\sum_{n=0}^{\infty} v_n$ is a series $\sum_{n=0}^{\infty} v_{\sigma(n)}$ for some permutation $\sigma$.

Example 3.3.43 Starting with $v_0 + v_1 + v_2 + v_3 + \cdots$, we group consecutive even and odd terms:
$$\underbrace{v_0}_{w_0} + \underbrace{v_2}_{w_1} + \underbrace{v_1}_{w_2} + \underbrace{v_3}_{w_3} + \underbrace{v_4}_{w_4} + \underbrace{v_6}_{w_5} + \underbrace{v_5}_{w_6} + \underbrace{v_7}_{w_7} + \cdots$$
This corresponds to the permutation $\sigma$ where
$$\sigma(4n + u) = \begin{cases} 4n + u & \text{if } u = 0 \text{ or } 3 \\ 4n + u + 1 & \text{if } u = 1 \\ 4n + u - 1 & \text{if } u = 2. \end{cases}$$
(Note that any natural number $q$ can be written uniquely as $4n + u$ where $0 \le u < 4$ is the remainder when $q$ is divided by 4.)

What happens to convergence when we rearrange? Answer: it’s unpredictable. Here’s a classic example.

Example 3.3.44 We have seen that the alternating harmonic series $1 - \frac12 + \frac13 - \frac14 + \cdots$ converges to some number $O$. (We will see in a later chapter that $O = \ln 2$.) Now consider the rearrangement formed when we pair one positive term with two negative ones and add them.
$$\left(1 - \frac12 - \frac14\right) + \left(\frac13 - \frac16 - \frac18\right) + \left(\frac15 - \frac1{10} - \frac1{12}\right) + \cdots$$
$$= \left(1 - \frac12\right) - \frac14 + \left(\frac13 - \frac16\right) - \frac18 + \left(\frac15 - \frac1{10}\right) - \frac1{12} + \cdots$$
$$= \frac12 - \frac14 + \frac16 - \frac18 + \frac1{10} - \frac1{12} + \cdots$$
$$= \frac12 \left(1 - \frac12 + \frac13 - \frac14 + \frac15 - \frac16 + \cdots\right)$$
$$= \frac12\, O$$
Thus, the rearranged series converges to half the sum of the original (i.e. to $\frac12 \ln 2$).
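The effect of this rearrangement shows up clearly in a direct computation. The sketch below is illustrative (the function names are assumptions, and the value $\ln 2$ for the original sum is the fact quoted in the example): it sums the rearranged series block by block and compares the result with half the original sum.

```python
import math


def rearranged_partial_sum(groups):
    """Sum the first `groups` blocks 1/(2k-1) - 1/(4k-2) - 1/(4k), k = 1, 2, ..."""
    total = 0.0
    for k in range(1, groups + 1):
        total += 1 / (2 * k - 1) - 1 / (4 * k - 2) - 1 / (4 * k)
    return total


def alternating_harmonic(terms):
    """Partial sum 1 - 1/2 + 1/3 - ... with `terms` terms."""
    return sum((-1) ** (n + 1) / n for n in range(1, terms + 1))


print(alternating_harmonic(20_000), math.log(2))
print(rearranged_partial_sum(10_000), 0.5 * math.log(2))
```

Both computations use exactly the same terms; only the order in which they are consumed differs.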


This may seem unfair (we are going through negative terms twice as fast as positive ones) until you realize that we eventually account for every positive term (the curious nature of infinity).

So we see that a rearrangement of a series may not converge to the same thing as the original. This means that one must be very careful in trying to compute limits of series. The issue arises, for example, in defining the product of two series $\left(\sum_{n=0}^{\infty} v_n\right)\left(\sum_{n=0}^{\infty} w_n\right)$. Fortunately, the problem goes away when the series converges absolutely.

Proposition 3.3.45 If $\sum_{n=0}^{\infty} v_n$ converges absolutely to $O$, then any rearrangement $\sum_{n=0}^{\infty} w_n$ also converges absolutely to $O$.

Proof. Since $\sum_{n=0}^{\infty} |v_n|$ converges, given any $\varepsilon > 0$, we can find a $P'$ such that
$$\sum_{n=p+1}^{\infty} |v_n| = \sum_{n=0}^{\infty} |v_n| - \sum_{n=0}^{p} |v_n| < \varepsilon/2$$
whenever $p \ge P'$. This is a restatement of tail convergence (Proposition 3.3.11). Since $\sum_{n=0}^{\infty} v_n = O$, for the same $\varepsilon$ we can find a $P''$ such that
$$\left|\left(\sum_{n=0}^{p} v_n\right) - O\right| < \varepsilon/2$$
whenever $p \ge P''$. Now let $P = \max(P', P'')$. Finally, choose $Q$ so large that the terms $v_0, \ldots, v_P$ are among the $w_0, \ldots, w_Q$ (this can be done since $P$ is just a finite number). Suppose $q \ge Q$. If we subtract $\sum_{n=0}^{P} v_n$ from $\sum_{n=0}^{q} w_n$, we are left with terms $v_l$ whose subscripts are bigger than $P$. Thus
$$\left|\sum_{n=0}^{q} w_n - \sum_{n=0}^{P} v_n\right| \le \sum_{n=P+1}^{\infty} |v_n| < \varepsilon/2.$$
Now we have
$$\left|\left(\sum_{n=0}^{q} w_n\right) - O\right| \le \left|\sum_{n=0}^{q} w_n - \sum_{n=0}^{P} v_n\right| + \left|\left(\sum_{n=0}^{P} v_n\right) - O\right| \le \varepsilon/2 + \varepsilon/2 = \varepsilon.$$

Having proved that the rearrangement converges, we deduce that the rearrangement $\sum_{n=0}^{\infty} w_n$ converges absolutely, because $\sum_{n=0}^{\infty} |w_n|$ is a rearrangement of $\sum_{n=0}^{\infty} |v_n|$.

In view of this result, there are a number of possibilities for defining the product $\left(\sum_{n=0}^{\infty} v_n\right)\left(\sum_{n=0}^{\infty} w_n\right)$ as a series when each of $\sum_{n=0}^{\infty} v_n$ and $\sum_{n=0}^{\infty} w_n$ converges absolutely. We just have to find a systematic way of adding all of the products $v_l w_m$. One of the standard ways is to add them on diagonals:
$$\begin{array}{ccccc} v_0 w_0 & v_0 w_1 & v_0 w_2 & v_0 w_3 & \cdots \\ v_1 w_0 & v_1 w_1 & v_1 w_2 & \cdots & \\ v_2 w_0 & v_2 w_1 & \cdots & & \\ v_3 w_0 & \cdots & & & \\ \vdots & & & & \end{array}$$
If we define $f_q = \sum_{l=0}^{q} v_l w_{q-l}$, then $\sum_{q=0}^{\infty} f_q$ is a candidate for $\left(\sum_{n=0}^{\infty} v_n\right)\left(\sum_{n=0}^{\infty} w_n\right)$; it is called the Cauchy product of the series. When either one or both of the series fails to converge absolutely, the Cauchy product may also fail to converge (see below). On the other hand, we have the following general result.

Proposition 3.3.46 Suppose $\sum_{n=0}^{\infty} v_n$ and $\sum_{n=0}^{\infty} w_n$ converge absolutely to $V$ and $W$ respectively, and suppose the sequence $x_0, x_1, \ldots, x_q, \ldots$ contains exactly the products $v_l w_m$ in some order. Then $\sum_{n=0}^{\infty} x_n$ converges absolutely to $VW$.

Proof. We sketch the proof found in [Spivak, 1994]. If we define
$$S_O = \sum_{n=0}^{O} |v_n| \cdot \sum_{n=0}^{O} |w_n|,$$
then the sequence $(S_O)$ converges (to the product of $\sum_{n=0}^{\infty} |v_n|$ and $\sum_{n=0}^{\infty} |w_n|$). This makes $(S_O)$ a Cauchy sequence so, for any $\varepsilon > 0$ and $O$ and $O'$ large enough,
$$\left|\sum_{n=0}^{O'} |v_n| \cdot \sum_{n=0}^{O'} |w_n| - \sum_{n=0}^{O} |v_n| \cdot \sum_{n=0}^{O} |w_n|\right| < \varepsilon/2.$$
It follows that
$$\sum_{l \text{ or } m > O} |v_l| \cdot |w_m| \le \varepsilon/2.$$
Now suppose that $Q$ is an integer large enough that the terms $x_0, x_1, \ldots, x_Q$ contain every product $v_l w_m$ for $l, m \le O$. Then $\sum_{n=0}^{Q} x_n - \sum_{l=0}^{O} v_l \sum_{m=0}^{O} w_m$ contains only terms $v_l w_m$ where $l > O$ or $m > O$; consequently
$$\left|\sum_{n=0}^{Q} x_n - \sum_{l=0}^{O} v_l \sum_{m=0}^{O} w_m\right| \le \sum_{l \text{ or } m > O} |v_l| \cdot |w_m| \le \varepsilon/2.$$
Also, for $O$ big enough, we have
$$\left|\sum_{l=0}^{\infty} v_l \sum_{m=0}^{\infty} w_m - \sum_{l=0}^{O} v_l \sum_{m=0}^{O} w_m\right| < \varepsilon/2.$$
We now make $\left|\sum_{l=0}^{\infty} v_l \sum_{m=0}^{\infty} w_m - \sum_{n=0}^{Q} x_n\right|$ small by the triangle inequality.
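As an illustration of the Cauchy product $f_q = \sum_{l=0}^{q} v_l w_{q-l}$, here is a minimal sketch on two absolutely convergent geometric series. The choice of series ($v_n = (1/2)^n$ with sum 2 and $w_n = (1/3)^n$ with sum $3/2$), the truncation at 60 terms, and the function name are all assumptions made for the example.

```python
def cauchy_product(v, w):
    """Terms f_q = sum_{l=0}^{q} v[l] * w[q-l] for two lists of series terms."""
    size = min(len(v), len(w))
    return [sum(v[l] * w[q - l] for l in range(q + 1)) for q in range(size)]


terms = 60
v = [0.5 ** n for n in range(terms)]        # geometric series with sum 2
w = [(1 / 3) ** n for n in range(terms)]    # geometric series with sum 3/2
product_sum = sum(cauchy_product(v, w))
print(product_sum)                          # close to 2 * 3/2 = 3
```

Summing along diagonals is just one ordering of the products $v_l w_m$; by Proposition 3.3.46 any other ordering gives the same sum when both series converge absolutely.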


For more on this, see Exercise 10 on page 252.

Exercises

1. Let $\sum_{n=1}^{\infty} v_n = \sum_{n=1}^{\infty} w_n = \sum_{n=1}^{\infty} (-1)^n / \sqrt{n}$. These two series converge but not absolutely; show that their Cauchy product does not converge.

2. Suppose that, for a given series $\sum_{n=0}^{\infty} v_n$, we can determine exactly which terms are non-negative and which are non-positive (for example, the case where the terms $v_n$ are rational numbers). If the original series converges conditionally, prove that the sums of the non-negative and non-positive terms each diverge.

3. Suppose that $\sum_{n=0}^{\infty} v_n$ converges conditionally and that the terms $v_n$ are rational numbers (see the previous exercise). Show that, for any rational number $S$, there is some rearrangement of the terms, $\sum_{n=0}^{\infty} w_n$, such that
$$\sum_{n=0}^{\infty} w_n = S.$$
(Hint: Suppose for the sake of argument that $S > 0$. Since the non-negative terms of $(v_n)$ diverge, add them until their sum is greater than $S$. Now start adding non-positive terms until the sum is $< S$. Keep doing this. Since $\lim_{n\to\infty} v_n = 0$, these sums approach $S$ as a limit.)

The reason we have to work with rational numbers here is that we have to make comparisons in size, and that is not always possible with arbitrary reals.

4. UNIFORM CONTINUITY

4.1 Definitions and Elementary Properties

A function $i$ is continuous if $i(x)$ is close to $i(y)$ when $x$ is close to $y$. We will soon state this precisely. The functions one generally encounters in mathematics are continuous where they are well-defined. This is particularly clear for those functions that are defined by extending the arithmetic operations of the rationals to the reals.

Example 4.1.1 Let $i(x) = x^3$. Here $i$ is defined on rationals from the start. To extend $i$ to reals, we need to know that $i$ sends fine and consistent families of rational intervals into fine and consistent families of rational intervals via $i([u, v]) = [u^3, v^3]$. Since consistency is trivial, the real issue is that $i$ sends short intervals into short intervals, that is, $i(u) = u^3$ is close to $i(v) = v^3$ when $u$ is close to $v$. The fact that this is true makes $i$ continuous as a function of rationals or reals, as we shall see.

Example 4.1.2 Let $i(x) = D^x$. Here we used the formula $(D^p)^{1/q} = \left(D^{1/q}\right)^p$ and the Inverse Function Theorem to show that $i$ can be defined when $x = p/q$ is rational. Then we defined $i$ on reals $x$ by showing that the family of intervals $i([u, v]) = [D^u, D^v]$ is fine and consistent. Once again, the issue becomes whether $i(u) = D^u$ is close to $i(v) = D^v$ when $u$ is close to $v$.

In both of the above examples, Lipschitz conditions are available to guarantee that $i$ sends nearby inputs into nearby values: $O \cdot (y - x) \le i(y) - i(x) \le N \cdot (y - x)$. However, there are many examples of non-Lipschitz functions that are nonetheless continuous. In this chapter we separate the idea of close inputs implying close outputs from Lipschitz conditions and study it on its own. This is uniform continuity.

Definition 4.1.3 The function $i$ is uniformly continuous (UC) on $V \subseteq \mathbb{R}$ if for each $\varepsilon > 0$ we can find a $\delta > 0$ (depending on $\varepsilon$ and $i$) such that
$$|i(y) - i(x)| \le \varepsilon \quad\text{whenever } x, y \in V \text{ and } |y - x| < \delta.$$

Remark 4.1.4 We will call the number $\delta$ a modulus of continuity for $\varepsilon$. Of course, a modulus of continuity for $\varepsilon$ is not unique: if $\delta$ is a modulus of continuity for $\varepsilon$, then so is $\delta/2$ for example. Nevertheless, we may sometimes denote a modulus of continuity for $\varepsilon$ by $\delta(\varepsilon)$ or $\delta_i(\varepsilon)$ (or even $\delta_{i,V}(\varepsilon)$, if necessary).


UNIFORM CONTINUITY

Remark 4.1.5 The $\delta$ that is chosen depends on $f$, $\varepsilon$, and $S$, but it doesn't depend on the particular values $x$ and $y$ other than that they lie in $S$. There is another notion of continuity, continuity at a point, in which the $\delta$ depends on a particular point. We will not be using this notion, but see exercise 13 at the end of this section for a brief discussion.

Proposition 4.1.6 If $f$ satisfies an absolute-value Lipschitz condition of the type $|f(y) - f(x)| \le K |y - x|$ ($K > 0$) on $S$, then $f$ is uniformly continuous on $S$.

We know from Corollary 1.5.31 that $L \cdot (y - x) \le f(y) - f(x) \le K \cdot (y - x)$ implies that
$$|f(y) - f(x)| \le \max(|L| \cdot |y - x|,\ |K| \cdot |y - x|) = K' |y - x|, \qquad K' = \max(|L|, |K|).$$
Therefore, a two-sided Lipschitz condition implies an absolute-value one. This means that all the functions for which we have established these conditions are UC.

Example 4.1.7 Let $f(x) = x^2$ and $S = [A, B]$. Then $|f(y) - f(x)| = |y^2 - x^2| = |y - x|\,|y + x| \le K \cdot |y - x|$, where $K$ is any number bounding both $2|A|$ and $2|B|$. Given $\varepsilon > 0$, choose $\delta > 0$ to be any number less than $\varepsilon/K$; then for $|y - x| < \delta$,
$$|f(y) - f(x)| = |y^2 - x^2| \le |y - x| \cdot K < (\varepsilon/K) \cdot K = \varepsilon.$$

Proposition 4.1.8 (Log and Exponential Functions)

1. For $A > 0$, $A^x$ is uniformly continuous on any finite interval.
2. For any $A > 0$, $c > 0$, $\log_A(x)$ is uniformly continuous on any interval $[c, \infty)$.

Proof. Both of these functions satisfy Lipschitz conditions. The details are left for the exercises.

Example 4.1.9 You can prove that a function is not uniformly continuous by producing a sequence of numbers $x_1, x_2, \ldots, x_n, \ldots$ that get arbitrarily close but whose function values $f(x_1), f(x_2), \ldots$ don't get arbitrarily close. For example, let $x_k = e^{-k} = 1/e^k$; then $x_k - x_{k+1} = \frac{e - 1}{e^{k+1}}$ can be made as small as we please, but $\ln(x_k) - \ln(x_{k+1}) = (-k) - (-k - 1) = 1$. Thus, the function $\ln$ is not uniformly continuous on any interval containing all these $x_k$; for example, $\ln$ is not uniformly continuous on any interval $(0, c]$, $c > 0$. (Also, see the exercises.)

Example 4.1.10 It is easy to see that the function $f(x) = x$ is UC on all of $\mathbb{R}$. However, $g(x) = x^2$ is not UC on all of $\mathbb{R}$. To see this, we compute
$$|g(y) - g(x)| = |y^2 - x^2| = |y + x|\,|y - x|.$$
No matter how small the $|y - x|$ term is, we can always make the $|y + x|$ term much bigger. For example, no matter what $\delta$ we choose, we can always make $|y + x|\,|y - x| > 1$. To see this, work backwards. Suppose $|y - x| < \delta$, say $y = x + \delta/2$. Then $(y + x)(y - x) > 1$ gives $(2x + \delta/2)(\delta/2) > 1$, which we can solve for $x$ to obtain
$$x > \frac{1}{\delta} - \frac{\delta}{4}.$$
So, no matter what $\delta > 0$ we try to use for a modulus of continuity, we can't ensure that $|y^2 - x^2| < 1$: we simply make $y$ and $x$ close together but bigger than $\frac{1}{\delta} - \frac{\delta}{4}$.
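The "work backwards" recipe of Example 4.1.10 can be checked with exact rational arithmetic. The following is only an illustrative sketch: the function name `witness_for_square` and the particular choice $x = 1/\delta$ (one convenient value above the bound $1/\delta - \delta/4$) are assumptions made here, not part of the text.

```python
from fractions import Fraction

def witness_for_square(delta):
    """Return (x, y) with |y - x| < delta but y**2 - x**2 > 1.

    Hypothetical helper: x = 1/delta is one convenient value
    exceeding the bound 1/delta - delta/4 from the example.
    """
    delta = Fraction(delta)
    x = 1 / delta
    y = x + delta / 2          # so |y - x| = delta/2 < delta
    return x, y

for delta in (Fraction(1, 10), Fraction(1, 1000)):
    x, y = witness_for_square(delta)
    assert abs(y - x) < delta      # inputs are delta-close ...
    assert y**2 - x**2 > 1         # ... but outputs differ by more than 1
```

With this choice, $y^2 - x^2 = (2x + \delta/2)(\delta/2) = 1 + \delta^2/4$, so the gap exceeds 1 however small $\delta$ is.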

So far the examples of UC functions have all satisfied Lipschitz conditions. But UC was advertised as a generalization of Lipschitz: can it be that every UC function must satisfy Lipschitz conditions? The answer is no. The functions $x^{1/N}$ with $N > 1$, for example, do not satisfy Lipschitz conditions on any interval containing 0, but are UC wherever they are defined. Showing this for the function $x^{1/3}$ is in the exercises. To see what's going on, let's consider the square-root function on an interval $(0, c]$. If it satisfied an absolute-value Lipschitz condition, we'd have
$$\left|\sqrt{y} - \sqrt{x}\right| = \left|\frac{y - x}{\sqrt{y} + \sqrt{x}}\right| \le K |y - x|, \quad\text{so}\quad \left|\frac{1}{\sqrt{y} + \sqrt{x}}\right| \le K.$$
But $\frac{1}{\sqrt{y} + \sqrt{x}}$ can't be bounded on $(0, c]$, since we can certainly make it as big as we please by making $y$ and $x$ close to 0. For example, take $x, y < \frac{1}{2^{2M+2}}$; then $\frac{1}{\sqrt{y} + \sqrt{x}} > 2^M$.

Now we go through the standard verification that a property (in this case uniform continuity) behaves reasonably well under the usual arithmetic operations.

Proposition 4.1.11 Suppose $f$ and $g$ are uniformly continuous on $S$, and $k$ is any number. Then
1. $f + g$ is uniformly continuous on $S$.
2. $kf$ is uniformly continuous on $S$.

Proof. We'll just prove part (1). This turns out to be an "$\varepsilon$-over-2" proof, and provides a good illustration of how the triangle inequality is often used in analysis. Please study it carefully since there will be lots more like it. We begin by writing out what we must prove. Given $\varepsilon > 0$, we must make $|(f + g)(y) - (f + g)(x)|$ no bigger than $\varepsilon$ by making $|y - x|$ sufficiently small. Now
$$|(f + g)(y) - (f + g)(x)| = |f(y) + g(y) - f(x) - g(x)|.$$


Since we know about $f$ and $g$ separately, we rearrange and use the triangle inequality to obtain
$$|(f + g)(y) - (f + g)(x)| = |[f(y) - f(x)] + [g(y) - g(x)]| \le \underbrace{|f(y) - f(x)|}_{\text{first piece}} + \underbrace{|g(y) - g(x)|}_{\text{second piece}}.$$
Since we want this to be at most $\varepsilon$, we make each piece of the last expression at most $\varepsilon/2$. Since $f$ is UC, there is a modulus of continuity for $f$, call it $\delta_f(\varepsilon/2)$, which makes the first piece $\le \varepsilon/2$ when $|y - x| < \delta_f(\varepsilon/2)$. We can also find a modulus of continuity $\delta_g(\varepsilon/2)$ for $g$ so that the second piece is $\le \varepsilon/2$ when $|y - x| < \delta_g(\varepsilon/2)$. However, we want both pieces to be $\le \varepsilon/2$, so $|y - x|$ must be less than both $\delta_f(\varepsilon/2)$ and $\delta_g(\varepsilon/2)$. Let $\delta = \min(\delta_f(\varepsilon/2), \delta_g(\varepsilon/2))$. If $|y - x| < \delta$ then both pieces of the above inequality will be at most $\varepsilon/2$, so their sum will be at most $\varepsilon$. Thus
$$|(f + g)(y) - (f + g)(x)| \le |f(y) - f(x)| + |g(y) - g(x)| \le \varepsilon/2 + \varepsilon/2 = \varepsilon.$$
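The $\varepsilon$-over-2 argument is really a recipe for building a modulus for $f + g$ out of moduli for $f$ and $g$. Here is a minimal sketch of that recipe; the helper name `modulus_sum`, the sample functions on $[0, 1]$, and their Lipschitz-based moduli are all illustrative assumptions, and the grid check at the end is a spot check rather than a proof.

```python
from fractions import Fraction

def modulus_sum(delta_f, delta_g):
    """Modulus for f + g from moduli for f and g (epsilon-over-2 argument)."""
    return lambda eps: min(delta_f(eps / 2), delta_g(eps / 2))

# Assumed example on S = [0, 1]:
# f(x) = x**2 satisfies |f(y) - f(x)| <= 2|y - x|, g(x) = 3x satisfies <= 3|y - x|.
delta_f = lambda eps: eps / 2
delta_g = lambda eps: eps / 3
delta_h = modulus_sum(delta_f, delta_g)

def h(x):
    return x**2 + 3 * x        # h = f + g

eps = Fraction(1, 10)
d = delta_h(eps)               # min(1/40, 1/60) = 1/60

# Spot check on a rational grid in [0, 1]: delta-close inputs give eps-close outputs.
grid = [Fraction(i, 200) for i in range(201)]
ok = all(abs(h(y) - h(x)) <= eps
         for x in grid for y in grid if abs(y - x) < d)
```

Representing a modulus as a function of $\varepsilon$ mirrors the notation $\delta_f(\varepsilon)$ of Remark 4.1.4.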

The composition of UC functions is also UC.

Proposition 4.1.12 Suppose $f$ is UC on a set $S$ and $g$ is UC on a set $T$, and that $g(x) \in S$ for any $x \in T$. Then the composition $f \circ g$, defined by $(f \circ g)(x) = f(g(x))$, is UC on $T$. Furthermore, a modulus of continuity for the composition is given by $\delta_{f \circ g}(\varepsilon) = \delta_g(\delta_f(\varepsilon))$.

Proof. All we have to do is unravel the definitions. $|f(g(y)) - f(g(x))|$ will be at most $\varepsilon$ when $|g(y) - g(x)|$ is less than $\delta_f(\varepsilon)$, which will happen when $|y - x| < \delta_g(\delta_f(\varepsilon))$.

The situation for products and quotients is a bit more complicated. As we saw in Example 4.1.10 above, $x$ is UC on all of $\mathbb{R}$, but $x^2 = x \cdot x$ isn't. The problem stems from the unboundedness of the factors in the product $(x)(x)$.

Definition 4.1.13 $f$ is bounded on $S$ if there is a constant $K$ such that $|f(x)| < K$ for all $x \in S$.

Definition 4.1.14 $g$ is bounded away from 0 on $S$ if there is a constant $L > 0$ such that $|g(x)| > L$ for all $x \in S$.

Proposition 4.1.15 Suppose $f$ and $g$ are uniformly continuous on $S \subseteq \mathbb{R}$.
1. If $f$ and $g$ are bounded on $S$, then $fg$ is uniformly continuous on $S$.
2. If $g$ is bounded away from 0 on $S$, then $1/g$ is uniformly continuous on $S$.


Proof. We will prove just the quotient part. By the hypothesis, we may suppose that there is a number $L > 0$ such that $|g(x)| > L$ for all $x \in S$. Then
$$\left|\frac{1}{g(y)} - \frac{1}{g(x)}\right| = \left|\frac{g(x) - g(y)}{g(x)g(y)}\right| \le \frac{|g(x) - g(y)|}{L^2}.$$
If $\varepsilon > 0$ is given, let $\delta = \delta_g(L^2\varepsilon)$, so that $|g(y) - g(x)| \le L^2\varepsilon$ when $|y - x| < \delta$. Then
$$\left|\frac{1}{g(y)} - \frac{1}{g(x)}\right| \le \frac{|g(x) - g(y)|}{L^2} \le \frac{L^2\varepsilon}{L^2} = \varepsilon.$$
The proof for the product is left as an exercise.

The boundedness and bounded away from 0 conditions of the previous proposition are sufficient but not necessary. In other words, they ensure the uniform continuity of the product or reciprocal, but the product or reciprocal might be uniformly continuous even without them. Here are some examples.

1. Since the function $x$ is UC on all of $\mathbb{R}$, but $x^2 = x \cdot x$ is not UC on all of $\mathbb{R}$, we see that the product of uniformly continuous functions need not be uniformly continuous without further assumptions. On the other hand, if $S$ is the (infinite) interval $[1, \infty)$, then $x$ and $1/x$ are both UC on $S$, $x$ is unbounded, yet their product is UC. Thus, we don't always need the boundedness condition.
2. Similarly, while $x$ is UC on all of $\mathbb{R}$, it is not bounded away from 0 on any interval $(0, q]$ ($q > 0$), and $1/x$ is not uniformly continuous on any of these intervals. So, without further assumptions, the reciprocal of a UC function need not be UC. On the other hand, the function $1/x$ is uniformly continuous but not bounded away from 0 on $[1, \infty)$, yet its reciprocal is UC. Thus, the reciprocal of a UC function may be UC, even though the original function is not bounded away from 0.

An important part of real analysis (indeed, of mathematics in general) is the use of examples that show the extent to which theorems hold and the extent to which their hypotheses are really needed. (Reading the first part of the example above, you may suspect that in order for the product of two UC functions to be UC, it's enough that one of them be bounded. Show by example that this is not true.)

Corollary 4.1.16 Since $x$ is uniformly continuous, so is any polynomial on a finite interval. Rational functions are uniformly continuous on any finite interval where their denominators are bounded away from 0.

On finite, closed intervals, uniformly continuous functions must be bounded. We will now prove a strong version of this fact.


Proposition 4.1.17 If $f$ is uniformly continuous on the finite interval $[a, b]$, then for each $\varepsilon > 0$ we can find numbers $u, v \in [a, b]$ such that $f(u) - \varepsilon \le f(x) \le f(v) + \varepsilon$ for all $x \in [a, b]$.

We prove this by means of two lemmas (1.6.12 and 1.6.13) already stated. We now give the proofs that were originally left as exercises.

Lemma 4.1.18 Suppose $a = x_0 < x_1 < \cdots < x_N = b$, with $x_i - x_{i-1} < \delta$ for $1 \le i \le N$. Then, for each $x$ in $[a, b]$, we can find an $x_k$ such that $|x - x_k| < \delta$.

Proof. We apply $\varepsilon$-trichotomy (Proposition 1.6.9), with $\varepsilon = \delta$, to $x$ and $x_0$. Since $x < x_0$ is false, $x$ is either within $\delta$ of $x_0$ (so $k = 0$), or $x > x_0$. We continue to apply $\varepsilon$-trichotomy successively to $x$ and $x_i$ until we find the first $m$ such that $x < x_m$ or $x$ lies within $\delta$ of $x_m$ (we know there has to be one since $x$ can't be greater than $x_N$). If $x$ lies within $\delta$ of $x_m$, take $k = m$. Otherwise we have $x_{m-1} < x < x_m$, so we can take $k$ to be either $m$ or $m - 1$.

Lemma 4.1.19 Given numbers $F_0, F_1, \ldots, F_N$, and $\varepsilon > 0$, there are indices $L$ and $K$ such that, for all $j$, $F_K < F_j + \varepsilon$ and $F_j - \varepsilon < F_L$.

Proof. We can use induction on $N$. The case $N = 0$ is obvious. Suppose that $N > 0$ and the lemma is true for $N - 1$; then we can find $T$ with $F_T < F_j + \varepsilon$ for $j = 0, \ldots, N - 1$. If $F_T < F_N$ or is within $\varepsilon$ of $F_N$, let $K = T$. If $F_N < F_T$, let $K = N$. A similar idea is used to select $F_L$. Note that we have used $\varepsilon$-trichotomy again.

Proof of Proposition 4.1.17. Let $\delta = \delta(\varepsilon/2)$ be a modulus of continuity for $f$. Choose $N$ so large that $\frac{b - a}{N} < \delta$ and let $x_i = a + i\left(\frac{b - a}{N}\right)$. By Lemma 4.1.18, given $x$ in $[a, b]$, $x$ is within $\delta$ of some $x_k$, so $f(x_k) \le f(x) + \varepsilon/2$. By Lemma 4.1.19 applied to $F_j = f(x_j)$, there is some $K$ with $f(x_K) < f(x_j) + \varepsilon/2$ for all $j$. Thus, $f(x_K) - \varepsilon < f(x_k) - \varepsilon/2 \le f(x)$, so we can take $u = x_K$. The construction of $v$ is similar, but uses the other half of Lemma 4.1.19 (with $F_L$).

Corollary 4.1.20 If $f$ is uniformly continuous on the finite interval $[a, b]$, then there is a constant $K$ such that $|f(x)| \le K$ for all $x \in [a, b]$.

Proposition 4.1.21 Suppose $f$ is uniformly continuous on an interval $I = [a, b]$ ($a < b$). If there is an $x_0 \in I$ such that $f(x_0) > 0$ then $f$ is bounded away from 0 in some subinterval $[c, d]$ ($c < d$) of $I$ containing $x_0$.

The proof is left as an exercise.

Proposition 4.1.17 is also as close as we can come, in general, to constructing the max and min of a function on a finite closed interval. When we discuss differentiability, we will see another approach which, in special cases, will yield these extreme values exactly.

To demonstrate how continuity can be verified, we show that the $n$th-root function is uniformly continuous wherever it is defined on the real line. We begin with a simple lemma, easily proved by induction.


Lemma 4.1.22 For any integer $n \ge 1$ and real number $u$ with $0 \le u \le 1$,
$$(1 - u)^n \le 1 - u^n.$$
In fact, this lemma is true even when $n \ge 1$ is any real number. Since the proof of this more general fact is best accomplished using derivatives, we postpone it until later. We can now apply this lemma to $u = x/y$, where $x = a^{1/n}$ and $y = b^{1/n}$.

Corollary 4.1.23 For reals $0 \le a \le b$,
$$b^{1/n} - a^{1/n} \le (b - a)^{1/n}.$$
This shows easily that the $n$th-root function, for positive integers $n$, is uniformly continuous on the non-negative real axis. In fact, $|b^{1/n} - a^{1/n}|$ will be less than $\varepsilon$ provided $|b - a| < \varepsilon^n$. In the case of $n$ odd, the $n$th-root function is defined on the whole real line, and some simple algebraic inequalities show that for these functions, $|b - a| < (\varepsilon/2)^n$ works to establish uniform continuity (see the exercises).

Exercises

1. Suppose you draw the graph of $y = f(x)$ over the interval $a \le x \le b$, and imagine a rectangle of width $\delta$ and height $\varepsilon$ with sides parallel to the axes. Express the statement that $\delta$ is a modulus of continuity for $\varepsilon$ in terms of what happens as you move this box around the graph. Here is a picture showing a $\delta$ ($= 1/2$) which works in some intervals but not in others, with $\varepsilon = 1/2$.

[Figure: graph of $y = f(x)$ with a box of width $\delta$ and height $\varepsilon$.]


2. Prove (from the text) that a constant multiple of a uniformly continuous function on $S$ is also uniformly continuous. Also prove that the difference of UC functions is UC.

3. Show that the following functions are not uniformly continuous on $(0, 1)$ (you may assume the usual properties of the trigonometric functions). For a proof that a function is not UC, see Example 4.1.9 in the text.
a. $e^{1/x}$
b. $\sin(1/x)$
c. $e^x \cos(1/x)$
d. $\cos(x)\cos(\pi/x)$

4. Prove that if $f$ satisfies a Lipschitz condition of the type $|f(y) - f(x)| \le K |y - x|$ ($K > 0$) on $S$, then $f$ is uniformly continuous on $S$.

5. Prove that if $f$ satisfies a Lipschitz condition of the type $A(y - x) \le f(y) - f(x) \le B(y - x)$ where $A, B > 0$, then $f$ is uniformly continuous (see the previous exercise as well as Corollary 1.5.31).

6. Give detailed proofs that the logarithm and exponential functions are uniformly continuous on appropriate intervals.

7. Prove that if $f$ and $g$ are uniformly continuous and bounded functions, then their product $fg$ is also uniformly continuous. (Hint: Add and subtract $f(x)g(y)$ in $|f(y)g(y) - f(x)g(x)|$.) Deduce that, on a finite interval $[A, B]$, the product of uniformly continuous functions must be uniformly continuous.

8. Just for practice, prove that the polynomial $x^3 - x + 1$ is uniformly continuous on the interval $[2, 5]$. Similarly, prove that $\frac{1}{2x + 3}$ and $\sqrt{2x + 3}$ are uniformly continuous on this interval.

9. Suppose the functions $f$ and $g$ are UC on the finite interval $[a, b]$. Define the functions $h(x) = \min(f(x), g(x))$ and $H(x) = \max(f(x), g(x))$. Show that $h$ and $H$ are UC on $[a, b]$. Deduce that the absolute value function is UC.

10. Suppose $f$ is UC on $[a, \infty)$ and $\lim_{x \to \infty} f(x)$ is finite. Prove that $f$ is bounded on $[a, \infty)$. What if we just assume that $f$ is UC on every finite subinterval of $[a, \infty)$? Will $f$ be UC on $[a, \infty)$?

11. Prove the following from the text. Proposition 4.1.21 Suppose $f$ is uniformly continuous on an interval $I = [a, b]$ ($a < b$). If there is an $x_0 \in I$ such that $f(x_0) > 0$ then $f$ is bounded away from 0 in some subinterval $[c, d]$ ($c < d$) of $I$ containing $x_0$.

12. Here we look at functions satisfying certain simple identities (called functional equations). We assume in each case that the function $f$ is UC on every finite interval $[a, b]$. Prove the following statements. (It is recommended that you follow the order listed.)
(a) If $f$ satisfies $f(x + y) = f(x) + f(y)$, then there is a real number $\alpha$ such that $f(x) = \alpha x$ for all $x$. (Hint: What does $\alpha$ have to be?)
(b) Find all $f$ which satisfy the conditions $f(1) > 0$ and $f(x + y) = f(x)f(y)$.
(c) (Here $f$ is UC on every finite interval $[a, b]$ with $a > 0$.) Show that if $f$ is not identically 0 and satisfies $f(xy) = f(x) + f(y)$ then $f$ is a log function. (Hint: Think of a trick so that you can use part (a).)
(d) (Here $f$ is UC on every finite interval $[a, b]$ with $a > 0$.) Show that if $f$ is not identically 0 and satisfies $f(xy) = f(x)f(y)$, then $f$ is a power function $f(x) = x^\alpha$.

13. That a function $f$ is continuous at the point $x_0$ ($x_0$ in the domain of $f$) means that, given $\varepsilon > 0$, there exists a $\delta > 0$ (possibly depending on $x_0$) such that $|f(x) - f(x_0)| < \varepsilon$ whenever $|x - x_0| < \delta$.
(a) Prove that $f$ is continuous at $x_0$ if and only if $f(x_0) = \lim_{x \to x_0} f(x)$.
(b) Prove that $f$ UC on $S$ implies that $f$ is continuous at each point of $S$.
(c) The converse of the above is not true. Show that the function $f(x) = 1/x$ is continuous at each point of $(0, 1]$, for example, but is not uniformly continuous on $(0, 1]$. Also, the function $x^2$ is continuous at every point of $\mathbb{R}$ but is not UC on all of $\mathbb{R}$ (though it is UC on every finite subinterval).

14. Give a detailed proof that the function $x^{1/3}$ is uniformly continuous on all of $\mathbb{R}$ when we define $(-x)^{1/3} = -(x^{1/3})$ for $x \ge 0$. This function cannot satisfy a Lipschitz condition $|f(y) - f(x)| \le K \cdot |y - x|$ on any interval containing 0. Why?

15. The Inverse Function Theorem, Two-sided Case (Theorem 2.2.8) tells us that the function $x^n$ has an inverse on all of $\mathbb{R}$ when $n$ is odd. In fact, for $n$ odd and $x \le 0$, $x^{1/n} = -(-x)^{1/n}$. Use Corollary 4.1.23 to show that $x^{1/n}$ is UC in this case on all of $\mathbb{R}$, with modulus of continuity $(\varepsilon/2)^n$. (Hint: Separate out into the cases $x$ and $y$ both non-negative, both non-positive, and one non-negative and the other non-positive. This is legal since these three cases can't all be excluded, and you are trying to prove that $|y^{1/n} - x^{1/n}|$ is not greater than $\varepsilon$.)

4.2 Limits and Extensions

The main idea in this section is that uniformly continuous functions preserve limits and hence can be extended to take on their limiting values. The following proposition expresses the phrase "preserves limits" more carefully.

Proposition 4.2.1 Suppose $f$ is uniformly continuous on $S$ and $(a_k) \to a$ where each $a_k \in S$. Then
1. $(f(a_k))$ is a Cauchy sequence.
2. The limits $\lim_{x \to a,\ x \in S} f(x)$ and $L = \lim_{k \to \infty} (f(a_k))$ exist and are equal.
3. If $a \in S$ then $L = f(a)$.

Proof. (1) Given $\varepsilon > 0$, let $\delta = \delta_f(\varepsilon)$ be a modulus of continuity for $\varepsilon$ on $S$. Since $(a_k)$ is a Cauchy sequence, choose $N$ so that $|a_m - a_n| < \delta$ when $m, n > N$; then $|f(a_m) - f(a_n)| \le \varepsilon$.

(2) $L = \lim(f(a_k))$ exists by part (1) and the fact that Cauchy sequences converge (by completeness). We must show that given $\varepsilon > 0$, we can make $|f(x) - L| \le \varepsilon$ by making $|x - a|$ small enough. Once again, let $\delta$ be a modulus of continuity for $\varepsilon$ on $S$, and suppose that $|x - a| < \delta/2$. Since $(a_k) \to a$, choose an $n$ such that $|a - a_n| < \delta/2$ (we will refine $n$ soon). We now have $|x - a_n| \le |x - a| + |a - a_n| < \delta/2 + \delta/2 = \delta$, so $|f(x) - f(a_n)| \le \varepsilon$ and
$$|f(x) - L| \le |f(x) - f(a_n)| + |f(a_n) - L| \le \varepsilon + |f(a_n) - L|.$$
By making $n$ large enough, $|f(a_n) - L|$ can be made arbitrarily small (since $L = \lim(f(a_k))$), so $|f(x) - L| \le \varepsilon$ (by the Wiggle Lemma).

(3) This is left as an exercise.

Corollary 4.2.2 If $f$ is uniformly continuous on $S$ and $a \in S$, then $f(a) = \lim_{x \to a,\ x \in S} f(x)$.
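Part (1) of Proposition 4.2.1 is effective: a modulus of continuity for $f$ together with a rate of convergence for $(a_k)$ produces an explicit index $N$. The sketch below illustrates this under assumed data that do not come from the text: $f(x) = x^2$ on $S = [0, 2]$ (where $|f(y) - f(x)| \le 4|y - x|$), the sequence $a_k = 1 + 1/k$, and the helper name `cauchy_index`.

```python
from fractions import Fraction
from math import ceil

def f(x):
    return x * x                  # assumed example, UC on S = [0, 2]

def delta_f(eps):
    return eps / 4                # a modulus for f on [0, 2], since |y + x| <= 4

def a(k):
    return 1 + Fraction(1, k)     # a_k -> 1, with every a_k in [0, 2]

def cauchy_index(eps):
    """Return N such that |f(a_m) - f(a_n)| <= eps whenever m, n > N."""
    # For m, n > N we have |a_m - a_n| < 1/N, so 1/N <= delta_f(eps) suffices.
    return ceil(1 / delta_f(eps))

eps = Fraction(1, 100)
N = cauchy_index(eps)

# Spot check on a finite stretch of the tail (a check, not a proof).
tail = range(N + 1, N + 60)
ok = all(abs(f(a(m)) - f(a(n))) <= eps for m in tail for n in tail)
```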

Example 4.2.3 Consider the sequence $(1/k^3) \to 0$. If $f(x) = \sqrt[3]{x}$, then $f(1/k^3) = 1/k \to 0$. Therefore $\lim_{x \to 0} \sqrt[3]{x} = 0$.

Example 4.2.4 Consider $f(x) = \frac{x^2 - 4}{x - 2}$ and $(a_k) = (2 + 1/k) \to 2$. Then $(f(a_k)) = (4 + 1/k) \to 4$, so $\lim_{x \to 2} \frac{x^2 - 4}{x - 2} = 4$.

Note that in the second example, $f(x)$ is not defined at $x = 2$, but it is defined everywhere else (i.e. at those $x$ such that $|x - 2| > 0$). The domain (allowable inputs for $f$) has a "hole" at $x = 2$. We would be tempted to fill that hole by defining $f(2)$ to be 4. However, this leads to the rather awkward definition
$$f(x) = \begin{cases} \dfrac{x^2 - 4}{x - 2} & \text{if } x \ne 2 \\ 4 & \text{if } x = 2. \end{cases}$$

But this is worse than being merely awkward. What happens if $x$ is a real number (say given as a family of intervals, or as a limit of a Cauchy sequence, or as an infinite series) but we can't determine whether $x$ is actually equal to 2? We know that the strong trichotomy property doesn't hold for all real numbers, so how can we apply this definition? Of course, the solution in this simple case is easy: just define $f(x)$ to be $x + 2$ for any $x$.

Here are some examples of extending functions, already taken care of by other means (namely IFT).

Example 4.2.5 Let $f(x)$ be the $n$th root of $x$. For $x \ge 0$ and rational, this is defined by the family $\{[r, s] : 0 \le r \text{ and } r^n \le x \le s^n\}$. When $x$ is not necessarily rational, verifying the inequalities in bisection can create problems in determining fineness; thus the more careful approach taken in the IFT is necessary.

Example 4.2.6 $f(x) = x^A$, defined for $A = r/s$ rational to be $\sqrt[s]{x^r}$.

In all of these examples, the numbers we want to be in the domain of the function (e.g. all the reals) can be approximated to arbitrary closeness by numbers on which the function is already defined, namely the rationals. Basically, to add the number $x$ to the domain of $f$, we try to write $x = \lim(x_n)$ where $f(x_n)$ exists and then define $f(x) = \lim(f(x_n))$. We now make precise this important idea.

Definition 4.2.7 Let $S \subseteq T$ be sets of real numbers. $S$ is dense in $T$ means that every open interval of $\mathbb{R}$ which intersects $T$ also intersects $S$.

Proposition 4.2.8 For $S \subseteq T$, the following are equivalent.
1. $S$ is dense in $T$.
2. For each $x \in T$, there is a convergent sequence $(s_k) \to x$ with each $s_k \in S$.
3. For each $x \in T$ and $\varepsilon > 0$, there is some $s \in S$ such that $|x - s| < \varepsilon$.

Proof. Suppose first that $S$ is dense in $T$. For $k = 1, 2, \ldots$ let $J = \left(x - \frac{1}{k},\ x + \frac{1}{k}\right)$. By density, $J$ contains some $s_k$ which is clearly within $1/k$ of $x$. Thus, $(s_k) \to x$. Now suppose that $x \in T$, and $(s_k)$ converges to $x$. For $\varepsilon > 0$ find $s_k$ closer to $x$ than $\varepsilon$ and let $s = s_k$. Finally, suppose that $J = (u, v)$ is an open interval containing $x \in T$. Let $\varepsilon = \min(x - u, v - x)$. By hypothesis, there is an $s \in S$ lying within $\varepsilon$ of $x$; clearly $s$ must lie in $J$.

Remark 4.2.9 In the above proposition, if $x \in T$ happens to lie in $S$, we can choose the sequence so that $s_k = x$ for all $k$.

Example 4.2.10 The rationals $\mathbb{Q}$ are dense in the reals $\mathbb{R}$. The rational interval $[r, s]$ is dense in the real interval $[r, s]$. The open and half-open intervals $(a, b)$, $(a, b]$, and $[a, b)$ are dense in the closed interval $[a, b]$.

And here is the fundamental result.


Theorem 4.2.11 (Extension Theorem for UC Functions) Suppose that $f$ is uniformly continuous on $S$ and $S$ is dense in $T$. Then there is one and only one function $F$, uniformly continuous on $T$, such that $F(x) = f(x)$ for each $x \in S$. Furthermore, any modulus of continuity for $f$ is a modulus of continuity for $F$.

Proof. Suppose that $\delta(\varepsilon)$ is a modulus of continuity for $f$ on $S$. Then, for $p$ in $T$, define the family
$$\mathcal{F}(p) = \{[f(x) - \varepsilon,\ f(x) + \varepsilon] : \varepsilon > 0,\ x \in S,\ |x - p| < \delta(\varepsilon)\}.$$
Since $S$ is dense in $T$, for each $\varepsilon > 0$ there is an $x \in S$ with $|x - p| < \delta(\varepsilon)$; thus, $\mathcal{F}(p)$ is fine. To prove consistency, suppose that $I_1 = [f(x_1) - \varepsilon_1, f(x_1) + \varepsilon_1]$ and $I_2 = [f(x_2) - \varepsilon_2, f(x_2) + \varepsilon_2]$ are any two intervals in $\mathcal{F}(p)$. Choose $y \in S$ so close to $p$ that $|x_i - p| + |p - y| < \delta(\varepsilon_i)$, for $i = 1, 2$. (We can do this because $\delta(\varepsilon_i) - |x_i - p| > 0$ and $S$ is dense.) This gives
$$|x_i - y| = |y - p + p - x_i| \le |y - p| + |p - x_i| < \delta(\varepsilon_i) \quad\text{for } i = 1, 2.$$
Then $f(x_i)$ is within $\varepsilon_i$ of $f(y)$, so $f(y)$ is in both $I_1$ and $I_2$.

We now can define $F(p) = \lim \mathcal{F}(p)$, for each $p \in T$. When $p \in S$, $f(p)$ belongs to each interval in $\mathcal{F}(p)$ (by the continuity of $f$), so $F(p) = f(p)$.

Next we show that a modulus of continuity for $f$ on $S$ also works for $F$ on $T$. Given $\varepsilon > 0$, suppose $\delta = \delta_f(\varepsilon)$, and suppose that $|q - p| < \delta$. We will show that $|F(q) - F(p)| \le \varepsilon + \eta$ for any $\eta > 0$; applying the Wiggle Lemma then will give $|F(q) - F(p)| \le \varepsilon$. Let $\gamma = \min(\delta - |q - p|,\ \delta_f(\eta/2))$. Since $F(p) = \lim \mathcal{F}(p)$ and $F(q) = \lim \mathcal{F}(q)$,
$$F(p) \in [f(x) - \eta/2,\ f(x) + \eta/2] \quad\text{for some } x \in S \text{ with } |x - p| < \gamma/2,$$
$$F(q) \in [f(y) - \eta/2,\ f(y) + \eta/2] \quad\text{for some } y \in S \text{ with } |y - q| < \gamma/2.$$
Therefore $|F(p) - f(x)| \le \eta/2$, $|F(q) - f(y)| \le \eta/2$, and
$$|x - y| \le |x - p| + |p - q| + |q - y| < \gamma/2 + |p - q| + \gamma/2 = \gamma + |p - q| \le \delta.$$
Now we put these pieces together to obtain
$$|F(q) - F(p)| \le |F(p) - f(x)| + |f(x) - f(y)| + |f(y) - F(q)| \le \eta/2 + \varepsilon + \eta/2 = \varepsilon + \eta.$$
As we said, the Wiggle Lemma gives $|F(q) - F(p)| \le \varepsilon$.

Finally, uniqueness: Suppose that $G$ is another uniformly continuous extension of $f$ to $T$, so $G(s) = F(s) = f(s)$ for $s \in S$. Given $\varepsilon > 0$ and $p \in T$, choose $s \in S$ with $|p - s| < \min(\delta_F(\varepsilon/2), \delta_G(\varepsilon/2))$. Then, using $G(s) = F(s)$,
$$|G(p) - F(p)| = |G(p) - G(s) + F(s) - F(p)| \le |G(p) - G(s)| + |F(s) - F(p)| \le \varepsilon/2 + \varepsilon/2 = \varepsilon.$$
Since $\varepsilon$ was arbitrary, $G(p) = F(p)$.

Note that in this proof we used property 3 of Proposition 4.2.8 to construct $F$ via the Completeness Theorem. In view of the equivalence of this property with property 2 of the same proposition (the existence of a Cauchy sequence $(s_k) \to x$), we could have defined $F(x) = \lim f(s_k)$. This variation is left as an exercise.

Corollary 4.2.12 Any function UC on the rational interval $[r, s]$ can be extended in a unique way to a function UC on the real interval $[r, s]$.

Corollary 4.2.13 Any function UC on the open or half-open intervals $(a, b)$, $(a, b]$, $[a, b)$ can be extended in a unique way to a function UC on the closed interval $[a, b]$.

In this last corollary, we obtained the open and half-open intervals by removing one or both endpoints from a closed interval $[a, b]$. However, we needn't restrict ourselves to endpoints, but can puncture an interval any finite number of times.

Definition 4.2.14 If $x_1, \ldots, x_n \in [a, b]$, then $P = [a, b] - \{x_1, \ldots, x_n\}$ consists of those $x \in [a, b]$ such that $|x - x_i| > 0$ for $i = 1, \ldots, n$. $P$ is called a (finitely) punctured interval.

The following tells us that a finitely punctured interval $P$ is dense in $[a, b]$.

Proposition 4.2.15 Suppose $x, c_1, \ldots, c_n \in [a, b]$ and $\varepsilon > 0$. Then there is an $\bar{x} \in [a, b]$ with $|x - \bar{x}| < \varepsilon$ and $|\bar{x} - c_i| > 0$ for each $i = 1, \ldots, n$.

The proof, by induction, is left as an exercise.

Corollary 4.2.16 Any function uniformly continuous on $[a, b] - \{x_1, \ldots, x_n\}$ can be extended uniquely to a uniformly continuous function on all of $[a, b]$.

Next we turn to another way of extending functions: amalgamation, or the piecing together of UC functions that agree on overlaps.


[Figure: graphs of $f$ on $[a, b]$ and $g$ on $[c, d]$; $f$ and $g$ agree on the overlap.]

We will apply the Extension Theorem to a union of intervals, so first we define what this means for us.

Definition 4.2.17 $A \cup B$, the union of two sets (for us they will be intervals), consists of those $x$ for which it can be determined that $x$ lies in $A$ or $x$ lies in $B$.

Remark 4.2.18 As the example of the Goldbach number $g$ shows (see page 67), $g$ lies in $[-1, 1]$ but we don't know if $g \in [-1, 0]$ or $g \in [0, 1]$; thus, at this time, $g \notin [-1, 0] \cup [0, 1]$.

As in the diagram above, we have functions $f$ and $g$ defined on $[a, b]$ and $[c, d]$ respectively, and would like to define an extension $h$ by setting $h(x) = f(x)$ if $x \in [a, b]$, and $h(x) = g(x)$ if $x \in [c, d]$. But this only makes sense when we can tell which interval $x$ belongs to, i.e. only when $x \in [a, b] \cup [c, d]$. Fortunately, this union is dense in the obvious interval determined by $[a, b]$ and $[c, d]$, as the following lemma verifies.

Lemma 4.2.19 Suppose the intervals $[a, b]$ and $[c, d]$ intersect. Then $U = [a, b] \cup [c, d]$ is dense in the interval $V = [\min(a, c), \max(b, d)]$.

Proof. We must show that any open interval $(s, t)$ that meets $V$ also meets $U$. Here is what we know:
1. $a \le d$ and $c \le b$ (since the intervals meet).
2. There is a $p$ with $\min(a, c) \le p \le \max(b, d)$ and $s < p < t$, since $(s, t)$ meets $V$.
3. We must show that $(s, t)$ meets $[a, b]$ or $[c, d]$.

Because it is difficult to write conditions describing the intersection of an open and a closed interval, we choose a subinterval $[s', t']$ containing $p$, with $s < s' < p < t' < t$, and show $[s', t']$ meets $[a, b]$ or $[c, d]$. It clearly suffices to show that if $[s', t']$ does not meet $[a, b]$ then it meets $[c, d]$. The assumption that $[s', t']$ and $[a, b]$ don't meet is equivalent to assuming that $a > t'$ and $b < s'$ can't both be false, so we need to show that each of these strict inequalities implies that $[s', t']$ meets $[c, d]$. Let's do the case $b < s'$; the other one is similar. If $d < s'$ then $p > s' \ge \max(b, d)$, contradicting (2) above, so $s' \le d$. If $t' < c$ then $b < s' < t' < c$, contradicting (1) above, so $c \le t'$. Thus, $[s', t']$ meets $[c, d]$.


Remark 4.2.20 Lemma 4.2.19 above holds even when $[a, b]$ is replaced by $(-\infty, b]$, or $[c, d]$ is replaced by $[c, \infty)$.

Proposition 4.2.21 (Amalgamation of Functions) Suppose $a \le b \le c$, $f$ is uniformly continuous on $[a, b]$, $g$ is uniformly continuous on $[b, c]$, and $f(b) = g(b)$. Then there is a unique function $h$ such that
1. $h$ is uniformly continuous on $[a, c]$,
2. $h(x) = f(x)$ for $x \in [a, b]$, and
3. $h(x) = g(x)$ for $x \in [b, c]$.

(Note that this proposition is easily extended to the case of finitely many subintervals and functions.)

Proof. We first construct $h$ on the union $U = [a, b] \cup [b, c]$ and prove $h$ is uniformly continuous, then apply Lemma 4.2.19 above and the Extension Theorem to extend $h$ to $[a, c]$. By definition of union, $x \in U$ means that $x \in [a, b]$ or $x \in [b, c]$, so we define $h(x) = f(x)$ or $h(x) = g(x)$ in these cases respectively. Now we must make $|h(x) - h(y)| \le \varepsilon$. We cannot exclude all three cases $x, y \le b$; $b \le x, y$; and $b$ lies between $x$ and $y$ (we have proved this earlier; see Proposition 1.5.46). In the first case $h = f$, and we can use the modulus of continuity $\delta_f(\varepsilon)$ on $[a, b]$; in the second case, we use the modulus of continuity $\delta_g(\varepsilon)$ on $[b, c]$. Now let $\delta(\varepsilon) = \min(\delta_f(\varepsilon/2), \delta_g(\varepsilon/2))$ and suppose $|x - y| < \delta(\varepsilon)$. We write $|h(x) - h(y)| \le |h(x) - h(b)| + |h(b) - h(y)|$. Since $b$ lies between $x$ and $y$, $|x - b| + |b - y| = |x - y| < \delta(\varepsilon)$, so $|x - b|$ and $|b - y|$ are each less than $\delta(\varepsilon)$. Hence $|h(x) - h(b)| + |h(b) - h(y)| \le \varepsilon$. This proves continuity.

Remark 4.2.22 This proposition remains true when $f$ is UC on $(-\infty, b]$ and $g$ is UC on $[b, \infty)$. In the case where the two intervals $[a, b]$ and $[c, d]$ don't intersect at just a single endpoint $b = c$, but rather $b \in [c, d]$ and $c \in [a, b]$, we note that $[a, \min(b, c)] \cup [\min(b, c), \max(b, c)] \cup [\max(b, c), d]$ is dense in $[a, d]$ and we can apply Proposition 4.2.21 in this case as well (we assume $f$ and $g$ agree on $[\min(b, c), \max(b, c)]$).

One of the most important applications of the amalgamation or piecing together of functions, described in this last proposition, is the case of piecewise linear functions. Recall that a linear function $\mathbb{R} \to \mathbb{R}$ is one of the form $L(x) = ax + b$. Given points $(x_1, y_1)$ and $(x_2, y_2)$ with $|x_1 - x_2| > 0$, it is always possible to find a linear function for which $L(x_i) = y_i$, $i = 1, 2$. Applying this idea several times yields the following.

Corollary 4.2.23 Given $a = x_0 < x_1 < \cdots < x_n = b$, and numbers $y_0, y_1, \ldots, y_n$, there is a function $L$ such that
1. $L$ is linear on each interval $[x_{i-1}, x_i]$ separately, and
2. $L(x_i) = y_i$.
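For rational nodes and rational inputs, the piecewise linear function of Corollary 4.2.23 can be evaluated directly, because comparisons between rationals are decidable. The sketch below makes exactly that assumption; the name `piecewise_linear` and the sample data are illustrative only. For a general real $x$ one cannot always decide which subinterval contains $x$, which is the difficulty that amalgamation and the Extension Theorem are designed to get around.

```python
from bisect import bisect_right
from fractions import Fraction

def piecewise_linear(xs, ys):
    """Return L with L(xs[i]) == ys[i], linear on each [xs[i-1], xs[i]].

    Illustrative helper: nodes and inputs are assumed to be rational.
    """
    xs = [Fraction(x) for x in xs]
    ys = [Fraction(y) for y in ys]
    assert len(xs) == len(ys) >= 2
    assert all(u < v for u, v in zip(xs, xs[1:]))   # strictly increasing nodes

    def L(x):
        x = Fraction(x)
        if not xs[0] <= x <= xs[-1]:
            raise ValueError("x lies outside [a, b]")
        # Index i of a subinterval [xs[i-1], xs[i]] containing x.
        i = min(max(bisect_right(xs, x), 1), len(xs) - 1)
        slope = (ys[i] - ys[i - 1]) / (xs[i] - xs[i - 1])
        return ys[i - 1] + slope * (x - xs[i - 1])

    return L

# Assumed sample data: nodes 0 < 1 < 3 with values 0, 2, -2.
L = piecewise_linear([0, 1, 3], [0, 2, -2])
```

At a shared node the two adjacent linear pieces give the same value, which is the agreement condition $f(b) = g(b)$ of Proposition 4.2.21.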


Finally, we can weaken the condition that a function be uniformly continuous on all of a punctured interval in order to be extended: it just has to be uniformly continuous on closed subintervals. This is made precise below.

Lemma 4.2.24 Suppose that $f$ is uniformly continuous on every subinterval $[u, v]$ of $[a, b)$. Furthermore, suppose $\lim_{x \to b,\ x < b} f(x)$ exists. Then $f$ is uniformly continuous on $[a, b)$, and hence it has a unique uniformly continuous extension to all of $[a, b]$.

Proof. Given $\varepsilon > 0$, choose $\gamma > 0$ such that $|f(x) - L| < \varepsilon/2$ when $b - x < \gamma$, where $\lim_{x \to b,\ x < b} f(x) = L$. Since $f$ is UC on the interval $I = [a, b - \gamma/4]$, we can find a modulus of continuity $\delta_I = \delta(\varepsilon, I)$ for $\varepsilon$ on it. Now let $\delta' = \min(\delta_I, \gamma/4)$. Here is a picture of what we have.

[Figure: the interval $I = [a, b - \gamma/4]$, on which $f$ is UC, and the region near $b$ where $|L - f(x)| < \varepsilon$.]

Suppose that $|y - x| < \delta'$. We deal with the two cases $x \ge b - \gamma/2$ and $x \le b - \gamma/2$. In the former, $b - x \le \gamma/2 < \gamma$ and so $y \ge x - \gamma/4 \ge b - 3\gamma/4 > b - \gamma$, and $b - y < \gamma$ also. Thus, by definition of $\gamma$:
$$|f(y) - f(x)| \le |f(y) - L| + |L - f(x)| \le \varepsilon/2 + \varepsilon/2 = \varepsilon.$$
Now we deal with $x \le b - \gamma/2$, which tells us that $x \in I$. Also, $y \le x + \gamma/4 \le b - \gamma/4$ so $y \in I$ as well. Since $|y - x| < \delta_I$, $|f(y) - f(x)| \le \varepsilon$ in this case as well.

Note that this result is also true for intervals of the form $(a, b]$ when the function has a limit as $x \to a$, and $x > a$. It is also true when $b = \infty$, using a similar argument: see the exercises.

This lemma is very useful in its own right, since it is often hard to prove that functions are UC on open or half-open intervals when they are not defined at one or more endpoints. Two well-known examples are $(\sin x)/x$ and $x \sin(1/x)$ on the interval $(0, b]$ (see exercises). Here is another.

Example 4.2.25 The function $f(x) = (e^x - 1)/x$ is UC on $(0, 1]$, hence has an extension to $[0, 1]$. To see this, we note that on any interval $[a, 1]$ where $a > 0$, $e^x - 1$ is UC and $x$ is bounded away from 0; thus, $f(x)$ is UC on $[a, 1]$. Furthermore, we know from Section 2.4 that $\lim_{x \to 0} f(x) = 1$, so we can apply a variation of Lemma 4.2.24 for intervals open on the left.
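The limit used in Example 4.2.25 can be sanity-checked numerically. This is a floating-point illustration only, not a proof and not the interval-arithmetic machinery of the text; it relies on Python's `math.expm1`, which computes $e^x - 1$ accurately for small $x$, and on the elementary bound $0 \le (e^x - 1)/x - 1 \le x$ for $0 < x \le 1$ (which follows from the power series and is an observation added here).

```python
import math

def f(x):
    return math.expm1(x) / x      # (e**x - 1) / x, defined for x != 0

# Sample points approaching 0 from the right.
samples = [10.0 ** (-k) for k in range(1, 8)]
gaps = [abs(f(x) - 1.0) for x in samples]

# The gap |f(x) - 1| shrinks at least as fast as x on these samples.
assert all(gap <= x for gap, x in zip(gaps, samples))
```

Extending $f$ to $[0, 1]$ with the value 1 at $x = 0$ is therefore consistent with what the samples show.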


Perhaps the most important application of Lemma 4.2.24, however, is to the power function $x^q$.

Proposition 4.2.26 If $q > 0$ the function $x^q$ is UC on the interval $(0, b]$ and can be extended to the interval $[0, b]$ with value 0 at $x = 0$.

Proof. By virtue of Lemma 4.2.24, we only have to show that $\lim_{x \to 0,\ x > 0} e^{q \ln x} = 0$. Given $\varepsilon > 0$, choose $N > -\dfrac{\ln \varepsilon}{q}$. If $x < e^{-N}$, then $q \ln x < -qN < \ln \varepsilon$, so $x^q < \varepsilon$.
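The choice of $N$ in this proof can be checked numerically. The sketch below (the helper name `threshold` and the sample values are illustrative assumptions, not from the text) picks one $N > -\ln(\varepsilon)/q$ and confirms that $x^q < \varepsilon$ for several $x$ below $e^{-N}$.

```python
import math

def threshold(eps: float, q: float) -> float:
    """Return e^{-N} for one N > -ln(eps)/q, so 0 < x < e^{-N} gives x**q < eps."""
    N = -math.log(eps) / q + 1.0   # any N strictly above -ln(eps)/q works
    return math.exp(-N)

eps, q = 1e-3, 0.5
cut = threshold(eps, q)
samples = [cut * t for t in (0.999, 0.5, 0.1, 1e-6)]
assert all(x ** q < eps for x in samples)
```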

Exercises

1. Give an alternative proof of the Extension Theorem (4.2.11), using a Cauchy sequence $(s_n) \to x$ instead of a family of intervals, to define $F(x) = \lim f(s_n)$. After defining $F$, prove the following.
(a) $F(x)$ is independent of the sequence $(s_n) \to x$ and so $F(x) = f(x)$ for $x \in S$.
(b) $F$ is uniformly continuous on $T$. (In fact, it has the same modulus of continuity as $f$.)
(c) If $G$ is also uniformly continuous on $T$ and $G(x) = f(x)$ for all $x \in S$, then $G(x) = F(x)$ for all $x \in T$.

2. Prove that the punctured interval $[a, b] - \{x_1, \dots, x_n\} = \{x \in [a, b] : |x - x_i| > 0,\ i = 1, \dots, n\}$ is dense in $[a, b]$. (Hint: Use induction on $n$. Prove that every open interval which intersects $[a, b]$ also intersects the punctured interval. Lemma 1.6.10 will be useful in the inductive step.)

3. Reprove Proposition 4.2.21, which says that UC functions defined on $[a, b]$ and $[b, c]$ and agreeing at $b$ can be amalgamated into a single UC function defined on all of $[a, c]$. This time, use Cauchy sequences instead of intervals.

4. We have proved: Lemma 4.2.24 Suppose that $f$ is uniformly continuous on every subinterval $[u, v]$ of $[a, b]$ such that $v < b$. Furthermore, suppose $\lim_{x \to b,\ x < b} f(x)$ exists. Then $f$ is uniformly continuous on $[a, b)$, hence has a unique extension to all of $[a, b]$. Modify the proof given in the text to deal with the case $b = \infty$.

5. Suppose that $C_1$ and $C_2$ are two conditions. Mathematicians often say that $C_1$ is stronger than $C_2$ if $C_1 \Rightarrow C_2$. For example, the condition "$N$ is divisible by 4" is stronger than the condition "$N$ is divisible by 2." So consider the conditions:
$C_1$ = "$f$ is UC on every subinterval $[u, v]$ of $[a, b)$."
$C_2$ = "$f$ is UC on every subinterval $[u, v]$ of $[a, b]$ with $v < b$."
$C_3$ = "$f$ is UC on every subinterval $[u, v)$ of $[a, b]$."
$C_4$ = "$f$ is UC on every subinterval $[u, v)$ of $[a, b)$."
$C_5$ = "$f$ is UC on $[a, b)$."
(a) Which of these conditions are stronger than which others?
(b) Do any of these conditions imply that $f$ must be UC on all of $[a, b]$?
(c) For each condition, what additional hypothesis on $f$ can be used to ensure that $f$ is UC on $[a, b]$?

6. (a) Prove that $e^{-x}$ is UC on $[a, \infty)$.
(b) Prove that $e^{-1/x}$ is UC on $(0, \infty)$.

7. Let $f(x) = \dfrac{\sin x}{x}$. Assume the following two facts about this function (we will prove them later when we examine the sine function more carefully):
(a) $\sin x$ is uniformly continuous on all of $\mathbb{R}$.
(b) $\lim_{x \to 0} \dfrac{\sin x}{x} = 1$.
Prove that $f(x)$ is UC on any punctured interval $I - \{0\}$ ($I$ need not be finite), so that it has an extension to all of $\mathbb{R}$. (Hint: It suffices to prove this result for a bounded interval $[-B, B]$, $B > 0$, since this function is very well-behaved for $x$ not near 0.)

8. Prove that, for $\alpha > 0$, the function $x^{\alpha} \sin \dfrac{1}{x}$ is UC on any punctured interval $I - \{0\}$ ($I$ need not be finite), so that it has an extension $F$ to all of $\mathbb{R}$. (You may assume $|\sin x| \le 1$ and $\sin x$ is UC on all of $\mathbb{R}$.) What will $F(0)$ be? It suffices to prove this result for a bounded interval $[-B, B]$, $B > 0$, since this function is very well-behaved for $x$ not near 0.

9. Prove the following version of amalgamation (only minor adjustments should be necessary):
Proposition Suppose $-\infty < b < \infty$, $f$ is uniformly continuous on $(-\infty, b]$, $g$ is uniformly continuous on $[b, \infty)$, and $f(b) = g(b)$. Then there is a unique function $h$ such that
(a) $h$ is uniformly continuous on $(-\infty, \infty)$.
(b) $h(x) = f(x)$ for $x \in (-\infty, b]$.
(c) $h(x) = g(x)$ for $x \in [b, \infty)$.


Appendix I: Are There Non-Continuous Functions?

Before proceeding further, let's deal with an obvious non-continuous function.

Remark 4.2.27 The function $1/x$ is not uniformly continuous on any interval containing 0, or any interval of the form $(0, B]$ or $[A, 0)$. The problem, in the case of an interval containing 0, is that $1/x$ is not defined at $x = 0$. The problem with the other two types of intervals is that $1/x$ grows too fast near 0 to be UC.

So let us rephrase the question more carefully so as to avoid this trivial case. Are there non-continuous functions on closed intervals? By closed interval we'll mean an interval of the type $[a, b]$; by half-infinite we'll mean those of type $[a, \infty)$ or $(-\infty, b]$.

• Constant functions: UC on all of $\mathbb{R}$.
• Sine and cosine functions (as we'll see later): UC on all of $\mathbb{R}$.
• $n$th root functions ($n \ge 1$): UC on all of $\mathbb{R}$ when $n$ is an odd integer; UC on the half-infinite interval $[0, \infty)$ when $n$ is an even integer.
• $\ln x$ (on $[a, \infty)$ for $a > 0$), $e^x$ (on $(-\infty, a]$ for $a > 0$), $1/x$ (on $[a, \infty)$ or $(-\infty, -a]$ for $a > 0$): UC on half-infinite intervals.
• Quotients $\dfrac{f(x)}{g(x)}$ where $f$ is bounded and $g$ is bounded away from 0; examples:
  $\dfrac{x}{1 + x^2}$ and $\dfrac{1}{1 + x^2}$: UC on all of $\mathbb{R}$.
  $\dfrac{x^2 - 2}{x(x - 1)}$ and $\dfrac{x^2 - 2}{x^2(x - 1)}$: UC on closed or half-infinite intervals not containing 0 or 1.
  $\dfrac{x^3 - 2}{x(x - 1)}$: UC on closed intervals not containing 0 or 1.
  $\tan x$, $\sec x$, $\cot x$, $\csc x$: UC on closed intervals not containing numbers where the functions are undefined.
• Polynomials: UC on closed intervals.

On the basis of these examples, it would seem that a function can fail to be UC for two reasons:

1. The interval is infinite and the function, while defined, grows too quickly when $x$ gets large (e.g. polynomials).


2. The interval is closed and finite, but the function becomes undefined at some point in the interval (e.g. $1/x$ or $\ln x$ on $[0, 1]$).

If we accept the fact that functions which grow even moderately (e.g. $x^2$) can't be expected to be UC on non-finite intervals, then we might reasonably be content with functions being UC on closed intervals where they are defined. Are there any other possibilities?

Classically there is a group of functions called step functions that are usually put forward as examples of non-continuous functions on (finite) closed intervals. The most basic is called the Heaviside function¹:

Example 4.2.28 (The Heaviside function)
$$H(x) = \begin{cases} 0 & \text{if } x < 0 \\ \tfrac{1}{2} & \text{if } x = 0 \\ 1 & \text{if } x > 0 \end{cases}$$

[Figure: The Heaviside function]

For any $\delta > 0$, $|x - 0| < \delta$ cannot guarantee that $|H(x) - H(0)| < \tfrac{1}{2}$, so $H$ cannot be continuous on any interval $[a, b]$, $b > a$, containing 0. Is the Heaviside function an example of a function defined on a closed interval but not continuous on that interval? The answer, as disappointing as it may be to lovers of pathology, is no. The Heaviside function on the interval $[-1, 1]$, for example, is not defined on all of that interval but only on those numbers $x$ for which we can make the decision of whether $x < 0$, $x = 0$, or $x > 0$. For example, what is the value of $H(G)$ where $G$ is the Goldbach number defined on page 67? Since we can't, at this time (or maybe ever?), determine whether $G$ is 0 or not, $H(G)$ is undefined, although we do know that $G \in [-1, 1]$.

Here is another example of a piecewise defined function that shares the same problems as the Heaviside function.

¹ Oliver Heaviside, 1850-1925


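Example 4.2.29
$$\mathrm{Jump}(x) = \begin{cases} 0 & \text{if } -5 \le x < -4 \\ \cos(x) & \text{if } -4 \le x < -1 \\ x/3 & \text{if } -1 \le x < 1 \\ 0 & \text{if } 1 \le x < 2 \\ \tfrac{1}{2}(x - 3)^2 + \tfrac{1}{2} & \text{if } 2 \le x < 4 \\ 1.3 & \text{if } 4 \le x \le 5 \end{cases}$$

[Figure: The function Jump($x$)]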

Another famous function² has the following definition on any interval $[a, b]$, $a > 0$:
$$R(x) = \begin{cases} 0 & \text{if } x \text{ is not rational} \\ \tfrac{1}{q} & \text{if } x \text{ is rational with } x = \tfrac{p}{q} \text{ in lowest terms.} \end{cases}$$
This is often given as an example of a function which is continuous at every nonrational number and discontinuous at every rational (see Exercise 13 on page 147 for the definition of continuous at a point). Unfortunately, it is only defined for those numbers where we can make the determination of rational vs. non-rational. For example, what is the value of $R(\pi + e)$?

Now, there are many mathematicians (most, in fact) who believe that statements whose truth value we currently do not know (e.g. $G = 0$) nevertheless do have a truth value. In other words, they are either true or false; we just don't happen to know which. That is fine. Different mathematicians can play by different rules as long as we're up front about what we are assuming. This book takes a rather minimalist or constructive view: statements are only true (or false) when we can prove (or disprove) them. At least you won't have to unlearn anything you see here.

The upshot of all this is that, from a constructive point of view, there don't seem to be any functions, well-defined on a finite closed interval $[a, b]$, which fail to be uniformly continuous. All of the functions you'll see in classical mathematics are either

• defined on the rationals, proved UC, and extended to the reals (e.g. addition, multiplication of reals, and our construction of $A^x$), or
• derived from existing UC functions via algebraic operations (e.g. adding, dividing, composing functions, and applying the Inverse Function Theorem), or
• derived using integration or differentiation (e.g. $F(x) = \int_0^x e^{-t^2}\,dt$), or
• constructed as limits of sequences of functions (e.g. our construction, in a later chapter, of the trig functions).

As we have seen, or will see later, these operations always produce functions which, on a finite closed interval, either are UC or fail to be UC because they are not well-defined on all points of that interval. On the other hand, no one has been able to prove that every well-defined function on a closed finite interval is continuous. We will leave this issue, then, as unsettled but certainly debatable.

² due to Georg Bernhard Riemann (1826-1866)

Exercises

1. Prove carefully that the Heaviside function can't be continuous on $[-1, 1]$.

2. What is the domain of the function Jump($x$) defined above?

3. Define $x$ to be strongly irrational if $|x - r| > 0$ for every rational number $r$.
(a) Suppose we know that $r^n = s$ is impossible, where $r$ and $s$ are positive rational numbers. Prove that $\sqrt[n]{s}$ is strongly irrational. (Hint: Using strong trichotomy for rationals, suppose $r > \sqrt[n]{s}$; show $r^n > s$.) Thus, $\sqrt{2}$ and $\sqrt{3}$, for example, are strongly irrational.
(b) Let $S'$ consist of those numbers $a \in (0, \infty)$ for which we know that $a$ must be either rational or strongly irrational. Prove that
$$\lim_{x \to a,\ x \in S'} R(x) = 0$$
for all $a \in S'$, where $R$ is Riemann's function, defined above. (Hint: For any $N > 0$, there are only a finite number of rationals $m_i/n_i$ (in lowest terms) such that $0 < |m_i/n_i - a| < 1$ and $0 < n_i \le N$. Choose $N > 1/\varepsilon$ and then a $\delta > 0$, which forces $n > N$ when $0 < |m/n - a| < \delta$.)
(c) Deduce that $R$, defined on the set $S'$, is continuous at the (strongly) irrational numbers, and discontinuous at the rational ones.


Appendix II: Continuity of Double-Sided Inverses

Although we don't have a general proof that the inverse of a UC function is UC, we can prove that inverses of functions obtained via our Inverse Function Theorem (IFT), or its variations, are UC. In the case of IFT itself (Theorem 2.2.1), we know that the inverse satisfies Lipschitz conditions, so is UC. We leave the one-sided case (Theorem 2.2.7) as an exercise and deal with the more difficult two-sided case.

Theorem 4.2.30 (IFT: Two-sided Case) Suppose that $A < 0 < B$ and $f$ is defined on the interval $[A, B]$ with $f(A) = C$, $f(B) = D$ and $f(0) = 0$. Suppose also that for each $0 < E \le B$ we have a number $L_E > 0$ (depending on $E$) with
$$L_E \cdot (y - x) \le f(y) - f(x) \quad \text{for all } E \le x \le y \le B$$
and, for each $A \le F < 0$, another number $L'_F$ (depending on $F$) with
$$L'_F \cdot (y - x) \le f(y) - f(x) \quad \text{for all } A \le x \le y \le F.$$
Finally, suppose that there is a $K > 0$ with
$$f(y) - f(x) \le K \cdot (y - x) \quad \text{for all } A \le x \le y \le B.$$
Then we can define a function $g : [C, D] \to [A, B]$ such that for each $z \in [C, D]$, $f(g(z)) = z$. Furthermore, $g$ is uniformly continuous on $[C, D]$.

The proof of the existence of $g$ has already been sketched in exercise 10 on page 82, so we'll just deal with the continuity.

Proof (Continuity of Inverse). The idea is to separate the continuity issues of $g$ into two parts: the places where $g$ is UC because it's Lipschitz, and the place where $|f(y) - f(x)|$ is small because both $f(x)$ and $f(y)$ are small ($x$, $y$ near 0). Given $\varepsilon > 0$, subdivide the interval $[-\varepsilon/2, \varepsilon/2] \subset [A, B]$ using the interior division points $-\varepsilon/3 < -\varepsilon/6 < 0 < \varepsilon/6 < \varepsilon/3$. By assumption, we have the following Lipschitz conditions:
$$f(y) - f(x) \ge L'_{-\varepsilon/6} \cdot (y - x) \text{ on } [A, -\varepsilon/6], \qquad f(y) - f(x) \ge L_{\varepsilon/6} \cdot (y - x) \text{ on } [\varepsilon/6, B].$$
In particular
$$f(-\varepsilon/6) - f(-\varepsilon/3),\ \ f(-\varepsilon/3) - f(-\varepsilon/2) \ \ge\ \underbrace{L'_{-\varepsilon/6} \cdot \frac{\varepsilon}{6}}_{h_1}$$
and
$$f(\varepsilon/2) - f(\varepsilon/3),\ \ f(\varepsilon/3) - f(\varepsilon/6) \ \ge\ \underbrace{L_{\varepsilon/6} \cdot \frac{\varepsilon}{6}}_{h_2}.$$
Let $h = \min(h_1, h_2)$. Also note that the inverse function $g$ is uniformly continuous on the intervals $[f(A), f(-\varepsilon/6)]$ and $[f(\varepsilon/6), f(B)]$, since $g$ is Lipschitz there by the standard Inverse Function Theorem. Let $\delta' = \delta'(\varepsilon)$ be a modulus of continuity on these intervals (the min of the moduli for $g$ on each of these intervals). This information is shown in the following diagram:

[Figure: graph of $f$ on $[A, B]$, with $\pm\varepsilon/6$, $\pm\varepsilon/3$, $\pm\varepsilon/2$ marked on the horizontal axis. Consecutive values among $f(\varepsilon/6)$, $f(\varepsilon/3)$, $f(\varepsilon/2)$, and among $f(-\varepsilon/2)$, $f(-\varepsilon/3)$, $f(-\varepsilon/6)$, differ by at least $h$; $g$ is UC with modulus $\delta'$ on $[f(A), f(-\varepsilon/6)]$ and on $[f(\varepsilon/6), f(B)]$.]

Let $\delta = \min(\delta', h/2)$ and suppose $|z - w| < \delta$. We will now apply $\varepsilon$-Trichotomy several times, with $\varepsilon$ equal to $h/2$. First, compare $z$ with $f(\varepsilon/3)$. If $z > f(\varepsilon/3)$, then $w > f(\varepsilon/6)$ since $z$ and $w$ differ by less than $h/2 < h$. Thus, they are in the range where $g$ is uniformly continuous with modulus $\delta'$, and so $|g(z) - g(w)| < \varepsilon$. If $z$ and $f(\varepsilon/3)$ differ by less than $h/2$, then $z$ and $w$ both lie in $[f(-\varepsilon/2), f(\varepsilon/2)]$, so their images under $g$ lie in $[-\varepsilon/2, \varepsilon/2]$, and we are ok here too.

If it turns out that $z < f(\varepsilon/3)$, then compare $z$ with $f(-\varepsilon/3)$. If $z < f(-\varepsilon/3)$ then $w < f(-\varepsilon/6)$ since $z$ and $w$ differ by less than $h/2$; thus we are again in the range of uniform continuity of $g$. If $z$ is within $h/2$ of $f(-\varepsilon/3)$ or $z > f(-\varepsilon/3)$, then $z$ and $w$ both lie in $[f(-\varepsilon/2), f(\varepsilon/2)]$, so we are ok (as above). Thus, in all cases, $|g(z) - g(w)| \le \varepsilon$. (If we needed strict inequality here, we could have replaced the given $\varepsilon$ by a slightly smaller number or made the interval $[-\varepsilon/2, \varepsilon/2]$ slightly smaller from both ends.)

Exercise: Prove the continuity of the inverse in both the one- and two-sided cases using Lemma 4.2.24, Corollary 4.2.13, and Proposition 4.2.21.


Appendix III: The Goldbach Function

In the appendix to chapter 1 (see page 67) we defined the Goldbach number $G$ as follows:
$$P(n) = (n \text{ is congruent to } 3 \bmod 4)$$
$$g(n) = \begin{cases} 0 & \text{if } n = 1 \text{ or } 2n \text{ is a sum of 2 primes} \\ 1 & \text{if } 2n \text{ is not a sum of 2 primes and } P(n) \text{ is true} \\ -1 & \text{if } 2n \text{ is not a sum of 2 primes and } P(n) \text{ is false} \end{cases}$$
$$G_N = \sum_{k=1}^{N} \frac{g(k)}{2^k}$$
$$G = \left\{ \left[ G_N - \frac{1}{2^{N-1}},\ G_N + \frac{1}{2^{N-1}} \right] : N = 1, 2, \dots \right\}$$

We also saw (exercise) that the Goldbach Conjecture (every even number is a sum of 2 primes) is true if and only if $G = 0$. We can use $G$ and an idea of Von Neumann to construct a function $\mathcal{G}$ that calls into question the general applicability of the bisection method in finding roots of functions. $\mathcal{G}$ is defined on the interval $[0, 1]$; its graph on $[0, 1/3]$ is the line joining $(0, -1)$ with $(1/3, G)$; on $[1/3, 2/3]$ it's the horizontal line joining $(1/3, G)$ with $(2/3, G)$. Finally, on $[2/3, 1]$ its graph is the line joining $(2/3, G)$ with $(1, 1)$. Formally,
$$\mathcal{G}(x) = \begin{cases} 3x(G + 1) - 1 & \text{if } x \in [0, 1/3] \\ G & \text{if } x \in [1/3, 2/3] \\ 3G - 2 - 3x(G - 1) & \text{if } x \in [2/3, 1] \end{cases}$$
The existence of $\mathcal{G}$ is guaranteed by the existence of amalgamated piecewise functions (see Example 4.2.23). Here are three possibilities for the graph of $\mathcal{G}$:

[Figure: three possible graphs of $\mathcal{G}$]


It is clear that $\mathcal{G}$ is a function computable to any desired degree of accuracy, much like the square root or trigonometric or exponential functions. A computer program to do this is easily designed. The fact that the Goldbach Conjecture is (as yet) unsolved has no bearing on this computability. Furthermore, it is clear (by amalgamation) that $\mathcal{G}$ is uniformly continuous on the interval $[0, 1]$; note that $\mathcal{G}(0) = -1 < 0$ and $\mathcal{G}(1) = 1 > 0$.

Can we specify a zero of $\mathcal{G}$? The answer is clearly no. If the Goldbach Conjecture is true, every point of the middle third of this interval is a zero. If the Goldbach Conjecture is false, then the zero lies either in the first or last third; however, we don't know which unless and until a counter-example is produced. It is instructive to see how badly bisection fails: $\mathcal{G}(1/2) = G$, and we don't know whether this is positive or negative or zero, so we have no way to choose the left or right half in order to proceed.
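The claim that such a program is easily designed can be illustrated with a short Python sketch. It computes the partial sums $G_N$ exactly and returns an interval enclosing the Goldbach number, which is also an enclosure of the value of the Goldbach function at $1/2$. The function names are hypothetical, and the radius $1/2^{N-1}$ is taken as an assumption about the definition recalled from chapter 1.

```python
from fractions import Fraction

def is_prime(k: int) -> bool:
    """Trial-division primality test."""
    if k < 2:
        return False
    d = 2
    while d * d <= k:
        if k % d == 0:
            return False
        d += 1
    return True

def g(n: int) -> int:
    """g(n) from the definition of the Goldbach number."""
    if n == 1 or any(is_prime(p) and is_prime(2 * n - p) for p in range(2, n + 1)):
        return 0
    return 1 if n % 4 == 3 else -1

def goldbach_enclosure(N: int):
    """Interval [G_N - r, G_N + r] with r = 1/2^(N-1) (assumed radius)."""
    G_N = sum(Fraction(g(k), 2 ** k) for k in range(1, N + 1))
    r = Fraction(1, 2 ** (N - 1))
    return G_N - r, G_N + r

lo, hi = goldbach_enclosure(40)
print(float(lo), float(hi))
```

For every $N$ tried, the enclosure straddles 0, so the sign of the function value at the midpoint is never decided; this is exactly the obstruction to bisection described above.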

Exercises

1. Prove that $G = \sum_{k=1}^{\infty} \dfrac{g(k)}{2^k}$.

2. Prove explicitly, using Corollary 4.2.23, that the Goldbach function $\mathcal{G}$ is actually well-defined.

5. THE RIEMANN INTEGRAL

5.1 Definition and Existence

We will take the approach here of specifying what we want an integral to be, and then show that the description we give determines how it must be defined. We then construct it. Of course, our motivation for the conditions imposed on the integral comes from our intuitive idea of area, which is exploited when the integral is introduced in the usual calculus course. In keeping with our idea of doing a careful and abstract analysis, we'll try to avoid drawing pictures, at least for a while.

Let us fix some interval $I$ of reals. Throughout this section $I$ will be a finite interval. An integral on $I$ is a rule that associates with each subinterval $[u, v] \subset I$ and each uniformly continuous function $f$ on $[u, v]$, a well-defined real number denoted $\int_u^v f$. We will assume that this association satisfies the following fundamental conditions, where $A \le g \le B$ on $[u, v]$ means that $A \le g(t) \le B$ for each $t \in [u, v]$.

Basic Properties of Integrals

Condition 5.1.1 For each $[u, v] \subset I$ and each $A \le g \le B$ on $[u, v]$, we have
$$A \cdot (v - u) \le \int_u^v g \le B \cdot (v - u).$$

Condition 5.1.2 For each $[a, b] \subset I$ with $a \le c \le b$ and each $g$, we have
$$\int_a^c g + \int_c^b g = \int_a^b g.$$

Remark 5.1.3 $\int_x^x g = 0$. In fact, on $[x, x]$ we have $g(x) \le g \le g(x)$, so by Condition 5.1.1: $0 = g(x)(x - x) \le \int_x^x g \le g(x)(x - x) = 0$.

Suppose that such an integral exists. The numbers $u$ and $v$ in $\int_u^v g$ are sometimes called the limits of the integral and are assumed to satisfy $u \le v$ in the first property above. We can extend the notation somewhat by eliminating this condition.

Definition 5.1.4 For any $u, v \in I$,
$$\int_u^v g = \int_{\min(u,v)}^{v} g - \int_{\min(u,v)}^{u} g.$$

Remark 5.1.5 If we know that $u \le v$, then Definition 5.1.4 reduces to our usual notation, because $\min(u, v) = u$ and $\int_{\min(u,v)}^{v} g - \int_{\min(u,v)}^{u} g$ becomes $\int_u^v g - \int_u^u g = \int_u^v g$. On the other hand, if we know that $v \le u$, then $\min(u, v) = v$, and the definition and previous remark give
$$\int_u^v g = -\int_v^u g.$$
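Definition 5.1.4 translates directly into code. In the sketch below, `integral_ordered` is a hypothetical stand-in for the integral with ordered limits (a midpoint Riemann sum, chosen here only for illustration), and `integral` applies the definition so that the limits may come in either order.

```python
from typing import Callable

def integral_ordered(g: Callable[[float], float], lo: float, hi: float, n: int = 1000) -> float:
    """Midpoint Riemann sum for lo <= hi (a stand-in for the integral)."""
    if lo > hi:
        raise ValueError("requires lo <= hi")
    width = (hi - lo) / n
    return sum(g(lo + (i + 0.5) * width) for i in range(n)) * width

def integral(g: Callable[[float], float], u: float, v: float, n: int = 1000) -> float:
    """Definition 5.1.4: integral from u to v without assuming u <= v."""
    m = min(u, v)
    return integral_ordered(g, m, v, n) - integral_ordered(g, m, u, n)

square = lambda x: x * x
print(integral(square, 0.0, 2.0), integral(square, 2.0, 0.0))
```

Swapping the limits flips the sign, as in Remark 5.1.5, and equal limits give 0, as in Remark 5.1.3.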

Definition 5.1.6 A sequence of points $a = p_0 \le p_1 \le \cdots \le p_n = b$ is called a partition of the interval $[a, b]$.

Suppose, then, that we have $a = p_0 \le p_1 \le \cdots \le p_n = b$ in $I$. A simple induction argument using Condition 5.1.2 of the integral shows that
$$\int_a^b g = \sum_{i=1}^{n} \int_{p_{i-1}}^{p_i} g.$$
If we also know that $L_i \le g \le U_i$ on each $[p_{i-1}, p_i]$ and we set $\Delta_i = p_i - p_{i-1}$, then by using this equation and Condition 5.1.1, we can conclude that
$$\sum_{i=1}^{n} L_i \Delta_i \le \int_a^b g \le \sum_{i=1}^{n} U_i \Delta_i. \qquad (*)$$

Since our function is uniformly continuous, its values on intervals of small length cannot vary very much. This enables us to find bounds on partition intervals, which we now combine with inequality (*).

Proposition 5.1.7 Suppose $g$ is uniformly continuous on $[a, b]$ and $\delta = \delta(\varepsilon)$ is a modulus of continuity for $g$ on $[a, b]$ (i.e. $|g(y) - g(x)| < \varepsilon$ when $|y - x| < \delta$ and $x, y \in [a, b]$). Suppose also that we choose our partition so fine that $\Delta_i = p_i - p_{i-1} < \delta$. Then for any choice of points $x_i \in [p_{i-1}, p_i]$, we have
$$\sum_{i=1}^{n} g(x_i)\Delta_i - \varepsilon(b - a) \le \int_a^b g \le \sum_{i=1}^{n} g(x_i)\Delta_i + \varepsilon(b - a),$$
i.e.
$$\left| \int_a^b g - \sum_{i=1}^{n} g(x_i)\Delta_i \right| \le \varepsilon(b - a).$$

Proof. To verify this, first note we have $g(x_i) - \varepsilon \le g(t) \le g(x_i) + \varepsilon$ for any $t \in [p_{i-1}, p_i]$, since $|x_i - t| \le \Delta_i < \delta$. Therefore, we can set $L_i = g(x_i) - \varepsilon$ and $U_i = g(x_i) + \varepsilon$ in the inequality (*) above (also note that $\sum_{i=1}^{n} \Delta_i = b - a$).
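The bound in Proposition 5.1.7 can be checked on a concrete case. The sketch below takes $g(x) = x^2$ on $[0, 1]$, for which $\delta(\varepsilon) = \varepsilon/2$ is a modulus of continuity because $|y^2 - x^2| \le 2|y - x|$ there. The helper names and the choice of left endpoints as the points $x_i$ are illustrative assumptions; exact rational arithmetic is used so that the comparison is not affected by rounding.

```python
from fractions import Fraction

def riemann_sum(g, partition, tags):
    """Sum of g(x_i) * (p_i - p_{i-1}) with x_i in [p_{i-1}, p_i]."""
    return sum(g(x) * (hi - lo)
               for lo, hi, x in zip(partition, partition[1:], tags))

def delta(eps):
    """Modulus of continuity for x^2 on [0, 1]: |y^2 - x^2| <= 2|y - x|."""
    return eps / 2

g = lambda x: x * x
a, b, eps = Fraction(0), Fraction(1), Fraction(1, 10)
n = int((b - a) / delta(eps)) + 1              # forces (b - a)/n < delta(eps)
partition = [a + Fraction(i, n) * (b - a) for i in range(n + 1)]
tags = partition[:-1]                           # left endpoints as the x_i
approx = riemann_sum(g, partition, tags)
exact = Fraction(1, 3)                          # known value of the integral of x^2 on [0, 1]
assert abs(exact - approx) <= eps * (b - a)
```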

Definition 5.1.8 Assume $g$ is defined on $[a, b]$. For any partition $a = p_0 \le p_1 \le \cdots \le p_n = b$ and points $x_i \in [p_{i-1}, p_i]$, a number of the form $\sum_{i=1}^{n} g(x_i)(p_i - p_{i-1})$ is called a Riemann sum.

So, if the integral exists, we know two important facts about it. First of all, Proposition 5.1.7 tells us that Riemann sums approximate it. Even better, it tells us that they approximate the integral within $\varepsilon(b - a)$ if the intervals in the partition have length less than the modulus of continuity $\delta(\varepsilon)$. This gives us a method for calculating integrals if we can evaluate the limit of these sums: see the exercises.

Secondly, we know that if the integral exists, then it must be trapped between the lower and upper sums $\sum_{i=1}^{n} L_i \Delta_i$ and $\sum_{i=1}^{n} U_i \Delta_i$ (i.e. it must be contained in the interval $\left[\sum_{i=1}^{n} L_i \Delta_i, \sum_{i=1}^{n} U_i \Delta_i\right]$). This suggests that to construct the integral (in other words, show that it exists) we should look at the following family.

Definition 5.1.9 Given a uniformly continuous function $g$ on $[a, b]$, define the family
$$\mathcal{I}(g; [a, b]) = \left\{ \left[ \sum_{i=1}^{n} L_i \Delta_i,\ \sum_{i=1}^{n} U_i \Delta_i \right] \right\},$$
where $a = p_0 \le p_1 \le \cdots \le p_n = b$ is any partition of $[a, b]$, and $L_i \le g \le U_i$ on $[p_{i-1}, p_i]$ and $\Delta_i = p_i - p_{i-1}$ for $i = 1, \dots, n$.

Proposition 5.1.10 $\mathcal{I}(g; [a, b])$ is a consistent and fine family of real intervals.

Proof. First we prove fineness, which turns out to be easy. Given $\varepsilon > 0$, let $\varepsilon' = \dfrac{\varepsilon}{2(b - a)}$. (If necessary, to avoid dividing by 0, replace $b - a$ by a slightly larger number.) Take $n$ so large that $(b - a)/n$ is less than a modulus of continuity $\delta(\varepsilon')$. We divide $[a, b]$ into $n$ equal pieces by setting $p_i = a + \frac{i}{n}(b - a)$ for $i = 1, \dots, n$. We now can let $L_i = g(p_{i-1}) - \varepsilon'$ and $U_i = g(p_{i-1}) + \varepsilon'$. Note that we could have used, in place of $g(p_{i-1})$, any $g(t)$ for $t \in [p_{i-1}, p_i]$. We now have
$$\operatorname{length}\left( \left[ \sum_{i=1}^{n} L_i \Delta_i,\ \sum_{i=1}^{n} U_i \Delta_i \right] \right) = 2\varepsilon'(b - a) \le \varepsilon,$$
so the family is fine.

Now let's turn to consistency. We must show that the left endpoint of any interval in $\mathcal{I}(g; [a, b])$ does not exceed the right endpoint of any other. So we have to consider two partitions of $[a, b]$:
$$a = x_0 \le x_1 \le \cdots \le x_m = b \quad \text{with } L_i \le g \le U_i \text{ on } [x_{i-1}, x_i]$$
$$a = y_0 \le y_1 \le \cdots \le y_n = b \quad \text{with } L_j \le g \le U_j \text{ on } [y_{j-1}, y_j].$$
Also, let $\Delta_i = x_i - x_{i-1}$ and $\Delta_j = y_j - y_{j-1}$. (Here the index $i$ always refers to the $x$-partition and the index $j$ to the $y$-partition.) Let $P(m, n)$ denote the following statement, which expresses consistency for the sizes $m$, $n$:

For any interval $[a, b]$, function $g$ uniformly continuous on $[a, b]$, positive integers $m, n$, and partitions as above,
$$L = \sum_{i=1}^{m} L_i \Delta_i \ \le\ U = \sum_{j=1}^{n} U_j \Delta_j.$$

Call $m + n$ the degree of $P(m, n)$. We establish the truth of $P(m, n)$ by induction on the degree: the truth of $P(m, n)$ always follows from the truth of $P(p, q)$ of smaller degree. (Actually, it will always follow from the truth of $P(m - 1, n)$ or $P(m, n - 1)$.)

The smallest possible values for $m$ and $n$ are both 1, so the smallest possible degree is 2. Since $P(1, 1)$ is clearly true (because the partitions both reduce to just endpoints), we can start our induction on degree 2. So let us suppose we have proved $P(p, q)$ for all degrees $p + q$ up to (but not including) $m + n$, and consider $P(m, n)$. We cannot exclude both $x_1 \le y_1$ and $y_1 \le x_1$, so it suffices to prove $P(m, n)$ in both of these cases.

Suppose, then, that $x_1 \le y_1$. We will show that $P(m - 1, n)$ implies that $L \le U$. First, note that $a = x_0 = y_0 \le x_1 \le y_1$. Choose $t \in [a, x_1] \subset [a, y_1]$. By assumption, $L_1 \le g(t)$ (for the $x$-partition) and $g(t) \le U_1$ (for the $y$-partition), so we conclude that
$$L_1 \Delta_1 = L_1(x_1 - x_0) \le U_1(x_1 - x_0).$$
Let $c = x_1$ and consider the following partitions of the interval $[c, b]$:
$$c = x_1 \le x_2 \le \cdots \le x_m = b$$
$$c = x_1 \le y_1 \le \cdots \le y_n = b.$$
Note that the first of these has only $m - 1$ intervals. The degree of $P(m - 1, n)$ is $m - 1 + n < m + n$, so it is true by induction hypothesis. Letting $y_0 = x_1$, this tells us that
$$\underbrace{\sum_{i=2}^{m} L_i \Delta_i}_{m - 1 \text{ terms}} \ \le\ \underbrace{\sum_{j=2}^{n} U_j \Delta_j + U_1(y_1 - x_1)}_{n \text{ terms}}.$$
Now just add $L_1 \Delta_1 \le U_1(x_1 - x_0)$ to both sides to show that $L \le U$. Thus, if $x_1 \le y_1$, then $L \le U$. In a symmetrical way, if $y_1 \le x_1$, then $P(m, n - 1)$ implies $L \le U$. So, in either case, $P(m, n)$ can be established if we know it for either of the statements $P(m - 1, n)$ or $P(m, n - 1)$ of smaller degrees. Thus it is true for all degrees and we are done.

Since the family $\mathcal{I}(g; [a, b])$ is consistent and fine, the Completeness Theorem tells us that its limit exists. As we have already seen, $\sum_{i=1}^{n} L_i \Delta_i \le \int_a^b g \le \sum_{i=1}^{n} U_i \Delta_i$, so the integral must lie in every interval of the family. This leaves us no choice in how to define it.

Definition 5.1.11 $\int_a^b g = \lim\left(\mathcal{I}(g; [a, b])\right)$

For historical reasons, we often write
$$\int_a^b g = \int_a^b g(x)\,dx.$$
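The construction can be made concrete with a small sketch. The function `enclosure` below (a hypothetical name) builds one interval of the family from $n$ equal pieces, using lower bounds $g(p_{i-1}) - \varepsilon'$ and upper bounds $g(p_{i-1}) + \varepsilon'$ as in the fineness argument. It is only valid when $(b - a)/n$ is below a modulus of continuity for $\varepsilon'$; for $g(x) = x^2$ on $[0, 1]$ the sketch assumes the modulus $\varepsilon'/2$.

```python
from fractions import Fraction

def enclosure(g, a, b, n, eps1):
    """One interval [sum L_i*D_i, sum U_i*D_i] on n equal pieces, with
    L_i = g(p_{i-1}) - eps1 and U_i = g(p_{i-1}) + eps1.
    Valid only when (b - a)/n is below a modulus of continuity for eps1."""
    width = (b - a) / n
    points = [a + i * width for i in range(n)]       # p_0, ..., p_{n-1}
    lower = sum((g(p) - eps1) * width for p in points)
    upper = sum((g(p) + eps1) * width for p in points)
    return lower, upper

g = lambda x: x * x                  # on [0, 1], eps1/2 is a modulus for eps1
a, b = Fraction(0), Fraction(1)
coarse = enclosure(g, a, b, 5, Fraction(1, 2))       # 1/5 < (1/2)/2
fine = enclosure(g, a, b, 50, Fraction(1, 20))       # 1/50 < (1/20)/2
print(coarse, fine)
```

The two intervals overlap (consistency), the finer one has length $2\varepsilon'(b - a) = 1/10$ (fineness), and both contain $1/3$.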


The "variable" $x$ has no intrinsic significance; in fact, we can use any letter in this expression. If $x_i^* \in [x_{i-1}, x_i]$ is chosen arbitrarily, then $L_i \le g(x_i^*) \le U_i$ for each $i$, and we have
$$\sum_{i=1}^{n} L_i \Delta_i \le \sum_{i=1}^{n} g(x_i^*)\Delta_i \le \sum_{i=1}^{n} U_i \Delta_i.$$
Thus, sums of the form $\sum_{i=1}^{n} g(x_i^*)\Delta_i$ lie in the intervals of $\mathcal{I}(g; [a, b])$, and hence approximate the integral. The Greek $\Sigma$ was metamorphosed by Leibniz into $\int$, the elongated S, in taking the limit over finer partitions, while the $g(x_i^*)$ became simply $g(x)$. When $x$ is the chosen variable, $\Delta_i = x_i - x_{i-1} = \Delta x$ is written in the limit as $dx$, especially when all the subintervals $[x_{i-1}, x_i]$ have the same length.

Of course, we must formally check that $\int_a^b g$ defined in this way satisfies the two basic conditions for being an integral.

Proposition 5.1.12 $\int_a^b g = \lim(\mathcal{I}(g; [a, b]))$ satisfies Conditions 5.1.1 and 5.1.2.

Proof. Suppose $A \le g \le B$ on $[u, v]$. Then $[A \cdot (v - u), B \cdot (v - u)]$ is an interval in $\mathcal{I}(g; [u, v])$. Since the limit of a family lies in each interval of the family, we see that $A \cdot (v - u) \le \int_u^v g \le B \cdot (v - u)$, which establishes the first condition.

To prove the second, suppose that $a \le c \le b$, and choose intervals $J \in \mathcal{I}(g; [a, c])$ and $K \in \mathcal{I}(g; [c, b])$:
$$J = \left[ \sum_{i=1}^{n} L_i \cdot (x_i - x_{i-1}),\ \sum_{i=1}^{n} U_i \cdot (x_i - x_{i-1}) \right]$$
$$K = \left[ \sum_{i=1}^{m} L'_i \cdot (y_i - y_{i-1}),\ \sum_{i=1}^{m} U'_i \cdot (y_i - y_{i-1}) \right],$$
where the $x$'s partition $[a, c]$ and the $y$'s partition $[c, b]$. We can rewrite $K$ as
$$K = \left[ \sum_{i=n+1}^{n+m} L_i \cdot (x_i - x_{i-1}),\ \sum_{i=n+1}^{n+m} U_i \cdot (x_i - x_{i-1}) \right],$$
where $x_n = c$, $x_{m+n} = b$ and $x_{n+j} = y_j$, $L_{n+j} = L'_j$, $U_{n+j} = U'_j$ for $j = 1, \dots, m$. Also,
$$J + K = \left[ \sum_{i=1}^{n+m} L_i \cdot (x_i - x_{i-1}),\ \sum_{i=1}^{n+m} U_i \cdot (x_i - x_{i-1}) \right].$$
Thus, $J + K \in \mathcal{I}(g; [a, b])$, and we have proved that the sum $J + K$ of any $\mathcal{I}(g; [a, c])$-interval $J$ with any $\mathcal{I}(g; [c, b])$-interval $K$ is an $\mathcal{I}(g; [a, b])$-interval, so $\int_a^b g \in J + K$. Of course, $\int_a^c g + \int_c^b g \in J + K$ also. Given any $\varepsilon > 0$, we can always choose these intervals so that all of $J$, $K$, and $J + K$ have length less than $\varepsilon$. Since $\int_a^c g + \int_c^b g$ and $\int_a^b g$ both lie in a (consistent) family of intervals of arbitrarily small length, they are equal.


Summary 5.1.13 For functions $f$ uniformly continuous on $I$ and $[u, v] \subset I$, the Riemann integral $\int_u^v f$ exists and is determined by these properties.

1. If $A \le f \le B$ on $[u, v]$, then $A \cdot (v - u) \le \int_u^v f \le B \cdot (v - u)$.

2. For $c \in [u, v]$, $\int_u^c f + \int_c^v f = \int_u^v f$.

3. Given $\varepsilon > 0$ and $u = p_0 \le p_1 \le \cdots \le p_{n-1} \le p_n = v$ a partition of $[u, v]$ with $|p_k - p_{k-1}| < \delta(\varepsilon)$ (a modulus of continuity for $f$), then for any numbers $x_k \in [p_{k-1}, p_k]$, the integral can be approximated by a Riemann sum with the following accuracy:
$$\left| \int_u^v f - \overbrace{\sum_{k=1}^{n} f(x_k)\underbrace{(p_k - p_{k-1})}_{\Delta_k}}^{\text{Riemann sum}} \right| \le \varepsilon \cdot (v - u).$$

Exercises

1. Write down approximating sums $\sum_{i=1}^{n} L_i \Delta_i$, $\sum_{i=1}^{n} U_i \Delta_i$ for $\int_a^b g$ in each of the following cases. Use equal size partitions with $n = 4$ (i.e. each subinterval has length $\Delta = (b - a)/4$).
(a) $g(x) = 3x$, $[a, b] = [0, 2]$.
(b) $g(x) = x^2$, $[a, b] = [0, 1]$.
(c) $g(x) = x^2$, $[a, b] = [-1, 1]$.

2. For each of the functions and intervals in the previous question, let $x_i$ be the midpoint of the partition interval $[p_{i-1}, p_i]$.
(a) Estimate the integral using the Riemann sum $\sum_{i=1}^{4} g(x_i)\Delta$.
(b) Estimate the error $\int_a^b g - \sum_{i=1}^{4} g(x_i)\Delta$. (See Proposition 5.1.7.)

3. A function is called increasing on an interval $I$ if $f(x) \le f(y)$ for all $x \le y$ in $I$. Suppose $f$ is increasing on $[a, b]$. Partition $[a, b]$ into $n$ equal pieces, each of width $\Delta = (b - a)/n$. Find simple upper and lower bounds for $\int_a^b f - \sum_{i=1}^{n} f(x_i)\Delta$. (Hint: In each $[p_{i-1}, p_i]$, $f(p_{i-1}) \le f(x) \le f(p_i)$; also note that $\sum_{i=1}^{n} \left(f(p_i) - f(p_{i-1})\right) = f(b) - f(a)$.) Apply this idea to estimating the error in the previous problem.


4. The functions $1$, $x$ and $x^2$ are uniformly continuous on any finite $[a, b]$. Compute $\int_a^b 1$, $\int_a^b x$ and $\int_a^b x^2$. (Hint: You can use any partitions you want, so divide $[a, b]$ into $n$ equal pieces, so your partition will be $p_k = a + k(b - a)/n$, $k = 0, \dots, n$. It is also useful to know the formulas
$$\sum_{k=1}^{n} k = n(n + 1)/2, \qquad \sum_{k=1}^{n} k^2 = n(n + 1)(2n + 1)/6.)$$

5. Suppose we have two partitions of the interval $[a, b]$:
$$a = x_0 \le x_1 \le x_2 = b$$
$$a = y_0 \le y_1 \le y_2 = b.$$
Suppose also that $a = x_0 = y_0 \le x_1 \le y_1 \le x_2 = y_2 = b$. Use the idea and notation of the proof of Proposition 5.1.10 to show that
$$L_1(x_1 - x_0) + L_2(x_2 - x_1) \le U_1(y_1 - y_0) + U_2(y_2 - y_1).$$
If you feel ambitious, try this exercise again with $a = x_0 \le x_1 \le x_2 \le x_3 = b$ and $x_2 \le x_3 = y_2 = b$.

6. We have shown that Riemann sums approximate the integral $\int_a^b f$. Instead of partitioning the interval $[a, b]$ into equal-size pieces, Pierre de Fermat (1601-1665) used the following partition in computing $\int_a^b x^S$:
$$\rho = (b/a)^{1/n} = \sqrt[n]{b/a}$$
$$\underbrace{a}_{x_0} < \underbrace{a\rho}_{x_1} < \underbrace{a\rho^2}_{x_2} < \underbrace{a\rho^3}_{x_3} < \cdots < \underbrace{a\rho^{n-1}}_{x_{n-1}} < \underbrace{a\rho^n = b}_{x_n}$$
and the Riemann sum
$$\sum_{k=1}^{n} f(x_k)(x_k - x_{k-1}) = \sum_{k=1}^{n} \left(a\rho^k\right)^S \left(a\rho^k - a\rho^{k-1}\right).$$
Assume $S$ is an integer (positive or negative) not equal to $-1$, and show that this method works, producing $\int_a^b x^S = \dfrac{b^{S+1} - a^{S+1}}{S + 1}$. You'll have to do the following:
(a) Show that the lengths of the intervals become arbitrarily small as $n \to \infty$, i.e. $\rho \to 1$ as $n \to \infty$ (this is exercise 3 on page 105).
(b) Calculate the Riemann sums (they are geometric series).
(c) Find the limit of the Riemann sums as $n \to \infty$ (recalling $x^K - y^K = (x - y)(x^{K-1} + x^{K-2}y + \cdots + y^{K-1})$ will be useful).

7. Prove these useful estimates (see [Mercer, 2005]).


(a) (Mercer's Lemma) Let $f$ and $g$ be uniformly continuous on $[a, b]$, with $m \le f \le M$ on this interval. Suppose that $\int_a^b g(x)\,dx = 0$. Then
$$\left| \int_a^b f(x)g(x)\,dx \right| \le \frac{1}{2}(M - m)\int_a^b |g(x)|\,dx.$$
(Hint: $\int_a^b f(x)g(x)\,dx = \dfrac{1}{2}\int_a^b (2f(x) - M - m)\,g(x)\,dx$.)

(b) Proposition Let $f$ and $g$ be uniformly continuous on $[a, b]$ with $m \le f \le M$ and $n \le g \le N$. Let
$$F = \frac{1}{b - a}\int_a^b f(x)\,dx \quad \text{and} \quad G = \frac{1}{b - a}\int_a^b g(x)\,dx.$$
Then
$$\left| \frac{1}{b - a}\int_a^b f(x)g(x)\,dx - FG \right| \le \frac{1}{2(b - a)} \min\left[ (M - m)\int_a^b |g(x) - G|\,dx,\ (N - n)\int_a^b |f(x) - F|\,dx \right].$$
(Hint: Set $h = g - G$ so that $\int_a^b h(x)\,dx = 0$; apply Mercer's Lemma to $f$ and $h$. Now do the same thing with the roles of $f$ and $g$ reversed.)

5.2 Elementary Properties

We now deduce some of the properties of the definite integral from the definition and construction of the previous section.

Proposition 5.2.1 Given $g$ continuous on $I$ and $a, b, c \in I$,
$$\int_a^c g + \int_c^b g = \int_a^b g.$$

Proof. Using Definition 5.1.4 we can write
$$\int_r^s g = \int_{\min(r,s)}^{s} g - \int_{\min(r,s)}^{r} g = \left( \int_{\min(r,s)}^{s} g + \int_{\min(r,s,t)}^{\min(r,s)} g \right) - \left( \int_{\min(r,s)}^{r} g + \int_{\min(r,s,t)}^{\min(r,s)} g \right)$$
for any $r, s, t \in I$, since the two added terms cancel. But we have verified that the integral satisfies Condition 5.1.2 in the previous section, and $\min(r, s, t) \le \min(r, s) \le r, s$, so
$$\int_{\min(r,s)}^{s} g + \int_{\min(r,s,t)}^{\min(r,s)} g = \int_{\min(r,s,t)}^{s} g \quad \text{and} \quad \int_{\min(r,s)}^{r} g + \int_{\min(r,s,t)}^{\min(r,s)} g = \int_{\min(r,s,t)}^{r} g.$$
We conclude that
$$\int_r^s g = \int_{\min(r,s,t)}^{s} g - \int_{\min(r,s,t)}^{r} g.$$
By applying this twice we get
$$\int_a^c g + \int_c^b g = \left( \int_{\min(a,b,c)}^{c} g - \int_{\min(a,b,c)}^{a} g \right) + \left( \int_{\min(a,b,c)}^{b} g - \int_{\min(a,b,c)}^{c} g \right) = \int_{\min(a,b,c)}^{b} g - \int_{\min(a,b,c)}^{a} g = \int_a^b g.$$

Corollary 5.2.2 If $g$ is uniformly continuous on $I$ and $p_0, \dots, p_n \in I$ then
$$\int_{p_0}^{p_n} g = \sum_{i=1}^{n} \int_{p_{i-1}}^{p_i} g.$$
(Prove this by induction on $n$.)

Corollary 5.2.3 If $g$ is uniformly continuous on $I$ and $p, x, y \in I$, then
$$\int_p^y g - \int_p^x g = \int_x^y g.$$

Proposition 5.2.4 If $f$ is non-negative and continuous on $[a, b]$ and $f(c) > 0$ for some $c \in [a, b]$, then $\int_a^b f(x)\,dx > 0$.

Corollary 5.2.5 If $f$ is non-negative and continuous on $[a, b]$ and $\int_a^b f(x)\,dx = 0$, then $f$ is identically 0 on $[a, b]$.


Corollary 5.2.6 Suppose that $g \le f \le h$ on the interval $[a, b]$ and all three functions are continuous on $[a, b]$. Then
$$\int_a^b g \le \int_a^b f \le \int_a^b h.$$

Proof. These are all exercises.

Definition 5.2.7 That $g$ and $h$ don't cross on $[a, b]$ means that either $g \le h$ on all of $[a, b]$ or $h \le g$ on all of $[a, b]$.

We can now generalize our previous result without committing ourselves to a given ordering of $a$ and $b$, and $g$ and $h$. This will be useful later in our discussion of Taylor series.

Lemma 5.2.8 Suppose $f \ge 0$ is continuous on $[\min(a, b), \max(a, b)]$. Then
$$\left| \int_a^b f \right| = \int_{\min(a,b)}^{\max(a,b)} f.$$

Proof. Exercise.

Proposition 5.2.9 Suppose that the continuous functions $g$ and $h$ don't cross on the interval $[\min(a, b), \max(a, b)]$. If $f(x)$ is between $g(x)$ and $h(x)$ for all $x$ in this interval, then $\int_a^b f$ is between $\int_a^b g$ and $\int_a^b h$.

Proof. Because $g$ and $h$ don't cross, we only have to check the result for the two cases $g \le h$ and $h \le g$. So suppose $g \le h$. Since $f(x)$ is between $g(x)$ and $h(x)$, we have $g(x) \le f(x) \le h(x)$ for all $x$ in the interval. Now we have
\[
\begin{aligned}
\left| \int_a^b h - \int_a^b g \right| &= \left| \int_a^b (h - g) \right| \\
&= \int_{\min(a,b)}^{\max(a,b)} (h - g) \\
&= \int_{\min(a,b)}^{\max(a,b)} \bigl[ (h - f) + (f - g) \bigr] \\
&= \int_{\min(a,b)}^{\max(a,b)} (h - f) + \int_{\min(a,b)}^{\max(a,b)} (f - g) \\
&= \left| \int_a^b (h - f) \right| + \left| \int_a^b (f - g) \right| \\
&= \left| \int_a^b h - \int_a^b f \right| + \left| \int_a^b f - \int_a^b g \right|.
\end{aligned}
\]
Thus, we get our result in this case by Proposition 1.5.44. The case $h \le g$ is, of course, similar.


Proposition 5.2.10 (Linearity of the Integral) Given $f$ and $g$ continuous on $[a, b]$ and $r, s \in \mathbb{R}$, we have
\[ \int_a^b (rf + sg) = r \int_a^b f + s \int_a^b g. \]

Proof. We will show that for any $\varepsilon > 0$, the left and right sides differ by less than $\varepsilon$. For the moment, let $\varepsilon_1 > 0$ be arbitrary. Since $f$ and $g$ are continuous, so is $rf + sg$. Let $\delta_f(\varepsilon_1)$, $\delta_g(\varepsilon_1)$ and $\delta_{rf+sg}(\varepsilon_1)$ be moduli of continuity for these functions, and let $\delta$ be the minimum of these three moduli. Let $p_0, \ldots, p_n$ be a partition of $[a, b]$ with $p_i - p_{i-1} < \delta$ for each $i = 1, \ldots, n$.

Let $A_f = \sum_{i=1}^{n} f(p_i)\Delta_i$ and $A_g = \sum_{i=1}^{n} g(p_i)\Delta_i$ be Riemann sums for $\int_a^b f$ and $\int_a^b g$. Then $rA_f + sA_g$ is a Riemann sum for $rf + sg$. We know from Proposition 5.1.7 that these Riemann sums differ from the integral they approximate by an amount less than $\varepsilon_1(b - a)$. Hence
\[
\begin{aligned}
\left| \int_a^b (rf + sg) - r\int_a^b f - s\int_a^b g \right|
&= \left| \left( \int_a^b (rf + sg) - (rA_f + sA_g) \right) + \left( rA_f - r\int_a^b f \right) + \left( sA_g - s\int_a^b g \right) \right| \\
&\le \left| \int_a^b (rf + sg) - (rA_f + sA_g) \right| + \left| rA_f - r\int_a^b f \right| + \left| sA_g - s\int_a^b g \right| \\
&\le \varepsilon_1(b - a) + |r|\varepsilon_1(b - a) + |s|\varepsilon_1(b - a) \\
&= \varepsilon_1(b - a)(1 + |r| + |s|).
\end{aligned}
\]
By choosing $\varepsilon_1 < \dfrac{\varepsilon}{(b - a)(1 + |r| + |s|)}$ we will have $\int_a^b (rf + sg)$ and $r\int_a^b f + s\int_a^b g$ within $\varepsilon$. Since they are arbitrarily close, they are equal.
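The key step in the proof of linearity is that, on one shared partition, the Riemann sum of $rf + sg$ is exactly $r$ times the sum for $f$ plus $s$ times the sum for $g$. The following is a minimal numerical sketch of that step, not part of the text's construction: the helper names `riemann_sum` and `linearity_gap` are hypothetical, and we assume an equal-width partition with right endpoints as sample points.

```python
def riemann_sum(f, a, b, n):
    """Right-endpoint Riemann sum of f over [a, b] using n equal subintervals."""
    width = (b - a) / n
    return sum(f(a + i * width) for i in range(1, n + 1)) * width


def linearity_gap(f, g, r, s, a, b, n):
    """Distance between the Riemann sum of r*f + s*g and r*A_f + s*A_g.

    Both sides use the same partition, so the gap is zero up to
    floating-point rounding.
    """
    combined = riemann_sum(lambda x: r * f(x) + s * g(x), a, b, n)
    separate = r * riemann_sum(f, a, b, n) + s * riemann_sum(g, a, b, n)
    return abs(combined - separate)
```

For instance, `linearity_gap(lambda x: x * x, lambda x: 3 * x + 1, 2.0, -5.0, 0.0, 1.0, 1000)` should be zero up to rounding error, whatever the partition size.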

Exercises

1. Suppose $h$ is formed by amalgamating the UC functions $f$ on $[a, b]$ and $g$ on $[b, c]$ (see Proposition 4.2.21); so $h(x) = f(x)$ for $x \in [a, b]$ and $h(x) = g(x)$ for $x \in [b, c]$ (with $f(b) = g(b)$). What can you say about the integral $\int_a^c h$?

2. Suppose $f \ge 0$ is UC on $[\min(a, b), \max(a, b)]$. Prove that
\[ \left| \int_a^b f \right| = \int_{\min(a,b)}^{\max(a,b)} f. \]
(Hint: You may separate into cases $a \le b$ and $b \le a$: see Remark 1.5.36.)

3. Suppose $f$ is UC on $[a, b]$. Prove that $\left| \int_a^b f \right| \le \int_a^b |f|$. (One way is to use Proposition 5.1.7.)

4. Suppose $g \le f \le h$ on the interval $[a, b]$, and all three functions are UC on $[a, b]$. Then
\[ \int_a^b g \le \int_a^b f \le \int_a^b h. \]

5. Suppose $f$ is UC on $[a, b]$. For each positive integer $n$, let $\Delta = (b - a)/n$ and $x_k = a + k\Delta$, $k = 0, \ldots, n$. Define
\[ A_n = \frac{1}{n}\bigl( f(x_1) + f(x_2) + \cdots + f(x_n) \bigr). \]
(a) Express $A = \lim_{n \to \infty} A_n$ in terms of an integral. (Hint: $A_n$ is a Riemann sum.) $A$ is called the average value of $f$ on the interval $[a, b]$.
(b) If $L \le f \le U$ on $[a, b]$, prove that $L \le A \le U$ as well.

5.3

Extensions and Improper Integrals

In this section, we show how the integral of a function can sometimes be extended, even to an interval on which the function may not be uniformly continuous. Suppose $f$ is uniformly continuous on an interval $I$. If we fix some $c \in I$, then we can define, for all $x \in I$,
\[ \varphi(x) = \int_c^x f. \]

Proposition 5.3.1 $\varphi$ is a uniformly continuous function on $I$.

Proof. Since $f$ is continuous on $I$, $f$ is bounded on $I$. Thus, $-B \le f(x) \le B$ for some $B > 0$, and
\[ |\varphi(y) - \varphi(x)| = \left| \int_c^y f - \int_c^x f \right| = \left| \int_x^y f \right| \le B|y - x|. \]
Hence, $\varphi$ is uniformly continuous with modulus of continuity $\delta(\varepsilon) = \varepsilon/B$.

Suppose now that $I$ is an open finite interval $(a, b)$ and $f$ is uniformly continuous on $I$. Then we know (see Theorem 4.2.11) that $f$ has a unique, uniformly continuous extension $\bar{f}$ to the closed interval $[a, b]$. We can now define a new function on $[a, b]$ by $\tilde{\varphi}(x) = \int_c^x \bar{f}$. Clearly $\tilde{\varphi}(x)$ and $\varphi(x)$ are equal for $x \in (a, b)$. Since $\bar{f}$ is uniformly continuous on $[a, b]$, so is $\tilde{\varphi}$, by Proposition 5.3.1. Of course, $\varphi$ is uniformly continuous on $(a, b)$ by the same proposition, so it has its own unique uniformly continuous extension to $[a, b]$; call it $\bar{\varphi}$. Because of this uniqueness, we have

Proposition 5.3.2 $\tilde{\varphi} = \bar{\varphi}$ is the unique extension of $\varphi(x) = \int_c^x f$ to $[a, b]$.

Example 5.3.3 Let $f(t) = \dfrac{t^2 - 4}{t - 2}$ on the interval $[0, 2)$. Clearly $\bar{f}(t) = t + 2$ extends $f$ to all of $\mathbb{R}$. For any $0 \le a \le x < 2$, we have
\[ \int_a^x \frac{t^2 - 4}{t - 2}\,dt = \left( x^2/2 + 2x \right) - \left( a^2/2 + 2a \right) \to 6 - \left( a^2/2 + 2a \right) = \int_a^2 (t + 2)\,dt. \]
(We have used the Fundamental Theorem of Calculus here; we'll prove it in the next chapter!)

Example 5.3.4 Let $f(t) = \dfrac{\sin t}{t}$ on the interval $(0, 1]$: see the exercises.

There are two other ways in which we may be able to extend an integral. The first is to enlarge the region of integration from a finite one (e.g. $\int_2^{10} \frac{1}{x^2}\,dx$) to an infinite one (e.g. $\int_2^{\infty} \frac{1}{x^2}\,dx$). The second is to try to integrate a function over an interval that includes a point where the function is not defined (e.g. $\int_0^1 \frac{1}{\sqrt{x}}\,dx$). The idea in both cases is to integrate almost up to the place in question ($\infty$ in the first example, 0 in the second), and then try to take a limit as we get closer and closer.

Definition 5.3.5 The symbol $\int_a^b f$ (where either one of $a$ or $b$ or both may be $\pm\infty$) is called an improper integral if $f$ is not UC on $[a, b]$ (half open or open when an endpoint is $\pm\infty$).

Definition 5.3.6 Suppose that $c$ is either a finite number or $\infty$, and that $f$ is uniformly continuous on every finite subinterval of $I = [a, c)$.

1. When $c = \infty$, we say that $\int_a^{M \to \infty} f$ converges if, for every $\varepsilon > 0$ there is a number $A(\varepsilon)$ such that
\[ \left| \int_a^u f - \int_a^v f \right| < \varepsilon \]
for any $u, v > A(\varepsilon)$, $u, v \in I$.

2. For $c$ finite, we say that $\int_a^{M \to c^-} f$ converges if, for every $\varepsilon > 0$ there is a $\delta > 0$ such that
\[ \left| \int_a^u f - \int_a^v f \right| < \varepsilon \]
for any $u, v \in (c - \delta, c)$, $u, v \in I$.

3. When $c = \infty$, say that $\int_a^{M \to \infty} f$ converges to $L$ if, for each $\varepsilon > 0$ there is a number $B(\varepsilon)$ such that
\[ \left| \int_a^u f - L \right| < \varepsilon \]
for $u > B(\varepsilon)$, $u \in I$.

4. For $c$ finite, we say that $\int_a^{M \to c^-} f$ converges to $L$ if, for each $\varepsilon > 0$ there is a $\delta > 0$ such that
\[ \left| \int_a^u f - L \right| < \varepsilon \]
for $u \in (c - \delta, c)$, $u \in I$.

Remark 5.3.7 You can easily reword the above definitions to deal with the case where it's the lower limit of the integral that's in question; this leads to definitions of the convergence of $\int_{M \to -\infty}^b f$ and $\int_{M \to c^+}^b f$. We leave the details as an exercise. With straightforward modifications, the proofs below apply to these integrals as well.

Proposition 5.3.8 For $c$ finite or $\infty$, if $\int_a^{M \to c} f$ converges, then there is a number $L$ such that $\int_a^{M \to c} f$ converges to $L$.

Proof. Let $F(u) = \int_a^u f$. We give the proof for $c = \infty$; the other proof is similar. As you will notice, the argument is almost exactly the same as the one we gave for Cauchy Completeness (Theorem 3.1.14). For each $\varepsilon > 0$ and $u > A(\varepsilon)$, we form the real interval $[F(u) - \varepsilon, F(u) + \varepsilon]$. Let $\mathcal{F}$ be the family of all such intervals for all $\varepsilon > 0$. Since, for any $\varepsilon > 0$ we can find an $A(\varepsilon)$, $\mathcal{F}$ is fine. To prove consistency, we must show that $F(u) - \varepsilon_1 \le F(v) + \varepsilon_2$ and $F(v) - \varepsilon_2 \le F(u) + \varepsilon_1$, where $u > A(\varepsilon_1)$ and $v > A(\varepsilon_2)$. As usual by our ruling out argument (see 1.5.33), it suffices to assume $A(\varepsilon_1) \le A(\varepsilon_2)$. Then $u, v > A(\varepsilon_1)$ so $F(u)$ and $F(v)$ are within $\varepsilon_1$; consequently $F(v) - \varepsilon_1 \le F(u) \le F(v) + \varepsilon_1$. The left-hand inequality gives $F(v) - \varepsilon_2 < F(v) \le F(u) + \varepsilon_1$, while the right-hand inequality gives $F(u) - \varepsilon_1 \le F(v) < F(v) + \varepsilon_2$. We let $L$ be the limit of this family. For similar details, see the proof that Cauchy sequences converge (Theorem 3.1.14).

Notation 5.3.9 When an improper integral converges, we say that it exists, and write $\int_a^{\infty} f$ or $\int_a^c f$ (respectively: $\int_{-\infty}^b f$ or $\int_c^b f$) for the number $L$ it converges to.

Definition 5.3.10 If $\lim \int_a^{M \to \infty} f = -\infty$ or $+\infty$, we say that the integral diverges (similarly for $\int_{M \to -\infty}^b f$).

Example 5.3.11 When we've proved the Fundamental Theorem of Calculus, we'll know that $\int_a^b \frac{1}{x^2} = \frac{1}{a} - \frac{1}{b}$ (with $0 < a < b$). It follows that $\int_a^{\infty} \frac{1}{x^2} = 1/a$. (One can prove that this integral converges directly, without computing it, using the convergence of the sequence $\sum_{i=1}^{n} \frac{1}{i^2}$. See the exercises at the end of this section.)
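To see the definitions of convergence at work on this example, here is a small sketch in exact rational arithmetic. It takes the closed form $\int_a^u \frac{1}{x^2} = \frac{1}{a} - \frac{1}{u}$ from the example as given, and the threshold $A(\varepsilon) = 1/\varepsilon$ is our own illustrative choice (it works because $\frac{1}{u}$ and $\frac{1}{v}$ both lie in $(0, \varepsilon)$ once $u, v > 1/\varepsilon$). The function names are hypothetical.

```python
from fractions import Fraction


def partial_integral(a, u):
    """Integral of 1/x^2 from a to u, using the closed form 1/a - 1/u."""
    return Fraction(1) / a - Fraction(1) / u


def cauchy_threshold(eps):
    """Illustrative choice A(eps) = 1/eps for the Cauchy-style definition."""
    return 1 / eps


a = Fraction(2)
eps = Fraction(1, 100)
threshold = cauchy_threshold(eps)

# Two upper limits beyond the threshold.
u, v = threshold + 1, threshold + 1000

# Cauchy-style convergence: the two partial integrals are within eps.
gap = abs(partial_integral(a, u) - partial_integral(a, v))

# Convergence to L = 1/a: the partial integral is within eps of 1/a.
distance_to_limit = abs(partial_integral(a, u) - 1 / a)
```

Both `gap` and `distance_to_limit` come out smaller than `eps`, in line with parts 1 and 3 of the definition of convergence.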

Now we turn to defining doubly infinite improper integrals, $\int_{-\infty}^{\infty} f$. We first prove a result which is the analogue of Tail Convergence for series (Proposition 3.3.11), namely that the convergence of an integral on an infinite interval $[a, \infty)$ depends only on the infinite segment, not on the value of the integral over a finite subinterval:

Lemma 5.3.12 (Tail Convergence for Integrals) Suppose $f$ is uniformly continuous on every finite subinterval of $[a, \infty)$, and that $a \le b$; then $\int_b^{\infty} f$ exists if and only if $\int_a^{\infty} f$ exists, and $\int_a^{\infty} f = \int_a^b f + \int_b^{\infty} f$.

Proof. Suppose $\int_b^{\infty} f$ exists, and let $A(\varepsilon)$ be given as in the definition. Suppose $u, v > A(\varepsilon)$. Then
\[ \left| \int_a^u f - \int_a^v f \right| = \left| \int_a^b f + \int_b^u f - \int_a^b f - \int_b^v f \right| = \left| \int_b^u f - \int_b^v f \right| < \varepsilon. \]
Read from right to left, the same identity gives the converse. The addition formula is a straightforward consequence.

A similar proof shows that the same tail convergence holds for infinite lower limits.

Lemma 5.3.13 Suppose $f$ is uniformly continuous on every finite subinterval of $(-\infty, b]$, and that $a \le b$; then $\int_{-\infty}^a f$ exists if and only if $\int_{-\infty}^b f$ exists, and $\int_{-\infty}^b f = \int_{-\infty}^a f + \int_a^b f$.

Proposition 5.3.14 Suppose $f$ is uniformly continuous on every finite subinterval of $(-\infty, \infty)$. If, for some $a$ and $b$, both integrals $\int_{-\infty}^a f$ and $\int_b^{\infty} f$ exist, then this is true for any $a$ and $b$. Furthermore, $\int_{-\infty}^a f + \int_a^b f + \int_b^{\infty} f$ is independent of $a$ and $b$.

Proof. Use Lemmas 5.3.12 and 5.3.13.

Definition 5.3.15 When $f$ is uniformly continuous on every finite subinterval of $(-\infty, \infty)$, the integral $\int_{-\infty}^{\infty} f$ exists if both integrals $\int_{-\infty}^a f$ and $\int_a^{\infty} f$ exist; we write $\int_{-\infty}^{\infty} f = \int_{-\infty}^a f + \int_a^{\infty} f$. (The previous proposition shows that this is independent of $a$.)

In a similar way we can deal with improper integrals over a finite interval $[a, b]$, where $f$ may behave badly at some $c$ between $a$ and $b$.

Definition 5.3.16 Suppose $f$ is uniformly continuous on all closed subintervals of $[a, c)$ and of $(c, b]$. If the limits $\int_a^c f = \int_a^{M \to c^-} f$ and $\int_c^b f = \int_{M \to c^+}^b f$ both exist, then we define $\int_a^b f = \lim \int_a^{M \to c^-} f + \lim \int_{M \to c^+}^b f$.

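Proposition 5.3.17 Let $\int^{\mathrm{Improper}}$ denote one of the integrals $\int_a^{\infty}$, $\int_{-\infty}^b$, $\int_{-\infty}^{\infty}$ or $\int_a^b$. If $\int^{\mathrm{Improper}} f$ and $\int^{\mathrm{Improper}} g$ exist, then for any constants $\alpha$ and $\beta$, the integral of $\alpha f + \beta g$ exists, and
\[ \int^{\mathrm{Improper}} (\alpha f + \beta g) = \alpha \int^{\mathrm{Improper}} f + \beta \int^{\mathrm{Improper}} g. \]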

Proof. Exercise.

One of the most useful ways of showing that the improper integral of a function converges is by comparing it with the integral of a function known to converge. The simplest test is the following.

Proposition 5.3.18 (Integral Comparison Test) Suppose that both $f$ and $g$ are UC on every finite subinterval of $[a, \infty)$, and $0 \le f \le g$.
1. If $\int_a^{\infty} g$ exists, then $\int_a^{\infty} f$ exists as well.
2. If $\int_a^{\infty} f$ diverges, then $\int_a^{\infty} g$ diverges as well.

Proof. For the first part,
\[ \left| \int_a^u f - \int_a^v f \right| = \left| \int_u^v f \right| \le \left| \int_u^v g \right| = \left| \int_a^u g - \int_a^v g \right| < \varepsilon \]
(for $u$ and $v$ sufficiently large). The divergence part is left as an exercise.

Just as for series, there is a limit form of this test.

Proposition 5.3.19 (Limit Comparison) Suppose that both $f$ and $g$ are non-negative and UC on every finite subinterval of $[a, \infty)$ and $\lim_{x \to \infty} \dfrac{f(x)}{g(x)} = L$ with $0 < L < \infty$. Then $\int_a^{\infty} f$ converges if and only if $\int_a^{\infty} g$ converges.

Definition 5.3.20 A function $f$, UC on every finite subinterval of $(-\infty, \infty)$, is said to be absolutely integrable if $\int_{-\infty}^{\infty} |f|$ converges.

Example 5.3.21 The functions $\dfrac{x}{1 + x^3}$ and $x e^{-x^2}$ are both uniformly continuous on all of $\mathbb{R}$, and for $|x|$ sufficiently large, each is smaller than $\frac{1}{x^2}$. It follows that both of these functions are absolutely integrable. Also, the function $\dfrac{\sin x}{1 + x^2}$ has absolute value less than $\frac{1}{x^2}$ when $x > 0$, so it too is absolutely integrable.

We can now use the comparison test to show that absolute integrability implies convergence of the integral (what you would expect). The trick is to write $f$ as the difference $f = f^+ - f^-$, where
\[ f^+(x) = \frac{|f(x)| + f(x)}{2} \quad\text{and}\quad f^-(x) = \frac{|f(x)| - f(x)}{2}. \]
One can prove directly that $0 \le f^+(x) \le |f(x)|$ and $0 \le f^-(x) \le |f(x)|$; this also follows from the following lemma.

Lemma 5.3.22 $\dfrac{|A| + A}{2} = \max(A, 0)$ and $\dfrac{|A| - A}{2} = \max(-A, 0)$.

Proof. It suffices to prove these in the cases $0 \le A$ and $A \le 0$, where they are clear.

So we see that $0 \le f^+(x) \le |f(x)|$ and $0 \le f^-(x) \le |f(x)|$. Clearly, $f^+$ and $f^-$ are UC wherever $f$ is, so we can apply the Comparison Test to each of them to prove the following (absolute integrability implies integrability).
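As a quick sanity check on the positive and negative parts used here, the following sketch evaluates $\frac{|A| + A}{2}$ and $\frac{|A| - A}{2}$ on a few sample values and compares them with $\max(A, 0)$ and $\max(-A, 0)$. The function names are ours, chosen only for illustration.

```python
def positive_part(value):
    """(|A| + A) / 2, which should equal max(A, 0)."""
    return (abs(value) + value) / 2


def negative_part(value):
    """(|A| - A) / 2, which should equal max(-A, 0)."""
    return (abs(value) - value) / 2


def check_decomposition(value):
    """True when both parts lie in [0, |A|] and A = positive - negative."""
    pos, neg = positive_part(value), negative_part(value)
    return (
        0 <= pos <= abs(value)
        and 0 <= neg <= abs(value)
        and pos - neg == value
    )
```

Applied pointwise to $f(x)$, these are exactly the bounds $0 \le f^+ \le |f|$ and $0 \le f^- \le |f|$ that the Comparison Test needs.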

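Corollary 5.3.23 Suppose $f$ is UC on every finite subinterval of $(a, \infty)$.
1. If $\int_a^{\infty} |f|$ exists, then $\int_a^{\infty} f$ exists.
2. If $\int_{-\infty}^{\infty} |f|$ exists, then $\int_{-\infty}^{\infty} f$ exists.

Example 5.3.24 $\displaystyle\int_{-\infty}^{\infty} \frac{\sin x}{1 + x^2}\,dx$ exists.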

A number of very important mathematical functions are defined in terms of improper integrals. Below are two of them; see the exercises for further discussion. Computing these and other improper integrals or finding their special properties usually requires the Fundamental Theorem of Calculus or techniques such as integration by substitution or by parts. We will return to improper integrals when we have developed these tools.

Definition 5.3.25 For $x > 0$, the Gamma function is defined by
\[ \Gamma(x) = \int_0^{\infty} t^{x-1} e^{-t}\,dt. \]

Definition 5.3.26 For certain functions $h(t)$, the Laplace transform $H$ of $h$ is defined by
\[ H(x) = \int_0^{\infty} e^{-xt} h(t)\,dt. \]

Finally, we can use improper integrals to test for convergence of series whose terms are given by evaluating continuous functions on integers.

Theorem 5.3.27 (Integral Test) Suppose $f \ge 0$ is continuous on every finite subinterval of $\mathbb{R}$ and is weakly decreasing. Then $\sum_{k=0}^{\infty} f(k)$ converges if and only if $\int_0^{\infty} f(x)\,dx$ converges.

Proof. Since $f$ is decreasing, $f(k + 1) \le f(x) \le f(k)$ for $x \in [k, k + 1]$, so $f(k + 1) \le \int_k^{k+1} f(x)\,dx \le f(k)$. When $n \ge m$ we can sum to get
\[ \sum_{k=m}^{n} f(k + 1) \le \int_m^{n+1} f(x)\,dx \le \sum_{k=m}^{n} f(k). \]
Thus, if either the sum or the integral satisfies the Cauchy convergence criterion, so does the other.
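The sandwich inequality in this proof can be checked on a concrete case. The sketch below uses $f(x) = 1/x^2$ with integer $m \ge 1$, and borrows the closed form $\int_m^{n+1} \frac{1}{x^2}\,dx = \frac{1}{m} - \frac{1}{n+1}$ from Example 5.3.11 (which relies on the Fundamental Theorem of Calculus). The choice of $f$ and the name `sandwich` are illustrative assumptions.

```python
from fractions import Fraction


def f(k):
    """f(k) = 1/k^2 at a positive integer k, as an exact fraction."""
    return Fraction(1, k * k)


def sandwich(m, n):
    """Return (lower sum, integral, upper sum) for f(x) = 1/x^2 on [m, n+1].

    lower    = sum of f(k+1) for k = m..n
    integral = 1/m - 1/(n+1), the closed form quoted from Example 5.3.11
    upper    = sum of f(k) for k = m..n
    """
    lower = sum(f(k + 1) for k in range(m, n + 1))
    integral = Fraction(1, m) - Fraction(1, n + 1)
    upper = sum(f(k) for k in range(m, n + 1))
    return lower, integral, upper
```

For example, `sandwich(10, 10)` returns the three values $\frac{1}{121}$, $\frac{1}{110}$ and $\frac{1}{100}$, in increasing order as the proof predicts.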


Note that the convergence of neither the sum nor the integral depends on the lower limit being 0: $\sum_{k=K}^{\infty} f(k)$ converges if and only if $\int_K^{\infty} f(x)\,dx$ converges. (See exercises for applications.)

In conclusion, we note that although the Riemann integral was initially defined only for functions uniformly continuous on a finite interval, we were able to extend it by defining improper integrals for certain functions. We make this formal in the following extension of our original definition.

Definition 5.3.28 (Riemann integrability) $f$ is said to be Riemann integrable over an interval $[a, b]$ if there is a partition $a = p_0 \le p_1 \le \cdots \le p_{n-1} \le p_n = b$ such that the integrals (proper or improper)
\[ V_k = \int_{p_{k-1}}^{p_k} f \]
for $k = 1, \ldots, n$ exist. When this is the case, we define
\[ \int_a^b f = V_1 + V_2 + \cdots + V_n. \]

Remark 5.3.29 Definition 5.3.28 assumes a partition by a finite number of points. This also can be generalized by considering infinite numbers of partition points and looking at the limit of the sums $\sum_{k=1}^{N} V_k$ as $N \to \infty$. This can also be done when either $a = -\infty$ or $b = \infty$ (or both).

Exercises

1. Suppose that $f$ is UC on every closed subinterval of $(c, b]$. Discuss $\int_{M \to c^+}^b f$ and define the improper integral $\int_c^b f$ (when $f$ is not UC on $[c, b]$ itself).

2. Suppose that $0 \le f \le g$ and both functions are UC on every finite subinterval of $[a, \infty)$. Prove that $\int_a^{\infty} f$ diverges implies that $\int_a^{\infty} g$ diverges.

3. Complete the divergence part of the integral comparison test, Proposition 5.3.18.

4. Prove the limit comparison test for integrals, Proposition 5.3.19.

5. Prove that $\int_1^{\infty} \frac{1}{x^2}$ converges (see page 103).

6. Prove that $\int_c^{\infty} e^{-x}$ exists. (Hint: $e^{-x}$ is eventually smaller than $\frac{1}{x^2}$: see Corollary 3.2.15 and the previous exercise.)

7. Prove that if $f$ is UC on every finite subinterval of $[k, \infty)$ and $a, b \in [k, \infty)$, then $\int_a^{\infty} f$ exists if and only if $\int_b^{\infty} f$ exists.

8. Suppose that $f \ge 0$ is UC on $[1, \infty)$ and $\int_1^{\infty} f = B$, a finite number.
(a) Give an example to show that $\lim_{x \to \infty} f(x)$ need not exist.
(b) Prove that if $\lim_{x \to \infty} f(x)$ exists, it must be 0.

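9. Prove the following part of Proposition 5.3.17: If $\int_c^{\infty} f$ and $\int_c^{\infty} g$ exist, then for any constants $\alpha$ and $\beta$, the integral of $\alpha f + \beta g$ exists, and
\[ \int_c^{\infty} (\alpha f + \beta g) = \alpha \int_c^{\infty} f + \beta \int_c^{\infty} g. \]

10. Use the integral comparison test to prove that $\displaystyle\int_0^{\infty} \frac{\cos x}{x^2}\,dx$ exists.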

11. Prove that if $f$ is UC on every finite interval and $\lim_{x \to \infty} f(x) = L$, then
\[ \lim_{x \to \infty} \frac{1}{x} \int_0^x f(t)\,dt = L. \]
(In other words, "If $f$ is eventually close to $L$, then its average value is eventually close to $L$.")

12. Prove that the Gamma function $\Gamma(x) = \int_0^{\infty} t^{x-1} e^{-t}\,dt$ (for $x > 0$) exists. (Hint: For $t$ big enough, $t^{x-1} < e^{t/2}$ because of Corollary 3.2.15; use the integral comparison test.)

13. For certain functions $h(t)$, the Laplace transform $H$ of $h$ is defined by
\[ H(x) = \int_0^{\infty} e^{-xt} h(t)\,dt. \]
A function $h(t)$ is exponentially bounded if, for some $k > 0$, there is a constant $K > 0$, such that $|h(t)| \le e^{kt}$ for $t > K$.
(a) Prove that exponentially bounded functions have Laplace transforms.
(b) Give the values of $x$ for which $H(x)$ is defined.
(c) Give some examples of such functions. (Computing Laplace transforms usually requires integration by parts, which we'll deal with later.)

14. Let $f(x) = \dfrac{\sin x}{x}$. Assume the following two facts about this function (we will prove them later when we examine the sine function more carefully):
• $f(x)$ is UC on any finite closed interval not containing 0.
• $\lim_{x \to 0} \dfrac{\sin x}{x} = 1$.
Prove the following (Lemma 4.2.24 will be helpful).
(a) $f(x)$ is uniformly continuous on any punctured interval $I - \{0\}$.
(b) $f(x)$ has a uniformly continuous extension to all of $\mathbb{R}$.

(c) The integrals $\int_0^b f$, $\int_{-b}^0 f$ exist for any $b \ge 0$.

15. Prove that the functions $\dfrac{\sin x}{1 + x^2}$ and $x e^{-|x|}$ are absolutely integrable.

16. $\int_1^{\infty} \frac{\sin x}{x}\,dx$ is a classic example of an improper integral that converges but not absolutely.
(a) The easiest way to see that this integral converges is to use integration by parts. Although we haven't proved this technique yet, we will in the next chapter (see Proposition 6.5.3), so you are allowed to use it.
\[ \int_1^M \frac{\sin x}{x}\,dx = \left. -\frac{\cos x}{x} \right|_1^M - \int_1^M \frac{\cos x}{x^2}\,dx. \]
(b) To see that the integral $\int_1^M \left| \frac{\sin x}{x} \right| dx$ goes to $\infty$ with $M$, look at the numbers $I_n = \int_{n\pi}^{(n+1)\pi} \left| \frac{\sin x}{x} \right| dx$, and show each one is greater than $\frac{1}{2n+1}$. (Hint: Look at one arch of $|\sin x|$ and note that the function is at least $\frac{1}{2}$ on a pretty big subinterval.)

17. (In this exercise you are allowed to use formulas from calculus, including the Fundamental Theorem (see Corollary 6.3.6) and integration by substitution (see Proposition 6.5.1).) Use the integral test (Theorem 5.3.27), or any simpler test, to analyze the convergence/divergence of these series.
(a) $\sum_{k=1}^{\infty} 1/k$ (harmonic series)
(b) $\sum_{k=1}^{\infty} 1/k^2$
(c) $\sum_{k=1}^{\infty} 1/k^p$
(d) $\sum_{k=1}^{\infty} \dfrac{\ln k}{k}$
(e) $\sum_{k=2}^{\infty} \dfrac{1}{k \ln k}$
(f) $\sum_{k=2}^{\infty} \dfrac{1}{k (\ln k)^2}$

18. (Project) The definition of the Riemann integral can be extended to functions which are piecewise uniformly continuous, even if their pieces do not agree enough to be amalgamated. For example, consider Jump($x$) defined in Example 4.2.29 on page 159:
\[
\mathrm{Jump}(x) =
\begin{cases}
0 & \text{if } -5 \le x < -4 \\
\cos(x) & \text{if } -4 \le x < -1 \\
x/3 & \text{if } -1 \le x < 1 \\
0 & \text{if } 1 \le x < 2 \\
\frac{1}{2}(x - 3)^2 + \frac{1}{2} & \text{if } 2 \le x < 4 \\
1.3 & \text{if } 4 \le x \le 5
\end{cases}
\]
This function is UC on each of the open intervals $(-5, -4)$, $(-4, -1)$, $(-1, 1)$, $(1, 2)$, $(2, 4)$ and $(4, 5)$, so its restriction to each extends to six separate functions which are UC on the corresponding closed intervals. Explain how to define $\int_a^b \mathrm{Jump}(x)$ where $-5 \le a \le b \le 5$ and generalize this. If $a \in [-5, 5]$, is $\int_a^u \mathrm{Jump}(x)$ uniformly continuous as a function of $u$ for $u \in [-5, 5]$?

6. DIFFERENTIATION

6.1

Definitions and Basic Properties

Although the classical definition, found in most calculus books, of the derivative of $f$ at the point $a$,
\[ f'(a) = \lim_{h \to 0} \frac{f(a + h) - f(a)}{h}, \]
has some interest, the derivative function, $f'(x)$, is of far more significance in the broad field of analysis, for example in the Fundamental Theorem of Calculus or Taylor series. Consequently, we will sidestep the definition of the derivative at a single point, and make our central concept the derivative function.

What would it mean for a function $g(x)$ to be the derivative of the function $f(x)$? We would have to have the quotient $\dfrac{f(x + h) - f(x)}{h}$ very close to $g(x)$ when $h$ is small. More precisely, given $\varepsilon > 0$ we can find a $\delta > 0$ such that, for any $x$, this quotient will be within $\varepsilon$ of $g(x)$ whenever $0 < |h| < \delta$. We could take this as our definition of derivative, but since the idea of derivative is so important and we will have to verify that one function is the derivative of another for so many different kinds of functions, we will try to express this notion in two other equivalent ways.

The first thing we can do is change notation slightly. Let's write $y = x + h$ so that $h = y - x$. With this substitution
\[ \frac{f(x + h) - f(x)}{h} \quad\text{becomes}\quad \frac{f(y) - f(x)}{y - x}. \]

Definition 6.1.1 The change in the values of a function $f$ divided by the change in its inputs, $\dfrac{f(y) - f(x)}{y - x}$, is called the difference quotient of $f$.

We can now say that $g(x)$ is the derivative of $f(x)$ if the difference quotient can be made as close to $g(x)$ as we want by making $|y - x|$ sufficiently small, all expressed, of course, in $\varepsilon$, $\delta$ language. Finally, since the difference quotient must be close to $g(x)$, let's look at their difference and call it $r(x, y)$ (since it depends on both $x$ and $y$):
\[ \frac{f(y) - f(x)}{y - x} - g(x) = r(x, y). \]

Now multiply through by $y - x$ to get rid of the quotient and rearrange things to obtain
\[ f(y) - f(x) = g(x) \cdot (y - x) + r(x, y) \cdot (y - x). \]
With this formulation, $g(x)$ will be the derivative of $f(x)$ when $r(x, y)$ can be made small by making $y - x$ small (but not 0). The following definition helps to express this precisely. Note that we are also introducing the idea of things happening when we have several variables ($x$ and $y$) belonging to a set $E$.

Definition 6.1.2 $r(x, y) \to 0$ as $y \to x$ on $E \subset \mathbb{R}$ means that for each $\varepsilon > 0$ there is a $\delta > 0$ (depending on $\varepsilon$) such that $|r(x, y)| \le \varepsilon$ whenever $x$ and $y$ are in $E$ and $0 < |y - x| < \delta$.

Please note that $r(x, y)$ does not even have to exist when $x = y$. We now show that the three formulations of $g$ being the derivative of $f$ are logically equivalent. We also take into account that a function may not have a derivative everywhere, but only on some set of reals $E$.

Proposition 6.1.3 For a function $f$ on the set $E \subset \mathbb{R}$, the following statements are equivalent.

1. There is a function $g(x)$, defined for $x \in E$, with the property that for each $\varepsilon > 0$ we can find a $\delta > 0$ (depending on $\varepsilon$) such that
\[ \left| \frac{f(x + h) - f(x)}{h} - g(x) \right| \le \varepsilon \]
whenever $x, x + h \in E$ and $0 < |h| < \delta$.

2. There is a function $g(x)$, defined for $x \in E$, with the property that for each $\varepsilon > 0$ we can find a $\delta > 0$ (depending on $\varepsilon$) such that
\[ \left| \frac{f(y) - f(x)}{y - x} - g(x) \right| \le \varepsilon \]
whenever $x, y \in E$ and $0 < |y - x| < \delta$.

3. There are functions $g(x)$ and $r(x, y)$, defined for $x, y \in E$ and $|y - x| > 0$, such that
\[ f(y) - f(x) = g(x) \cdot (y - x) + r(x, y) \cdot (y - x) \]
where $r(x, y) \to 0$ as $y \to x$.

Proof. ($1 \Rightarrow 2$): Let $h = y - x$.
($2 \Rightarrow 3$): Let $r(x, y) = \dfrac{f(y) - f(x)}{y - x} - g(x)$.
($3 \Rightarrow 1$): Let $y = x + h$, then $\left| \dfrac{f(x + h) - f(x)}{h} - g(x) \right| = |r(x, y)|$.

Definition 6.1.4 (Uniformly Differentiable) If the above equivalent conditions hold, we say that $f$ is uniformly differentiable on $E$, with derivative $g(x)$. (Uniformly differentiable will often be abbreviated as UD.) The derivative will usually be denoted $f'(x)$.

Example 6.1.5 Because
\[ \frac{(x + h)^2 - x^2}{h} - 2x = \frac{2xh + h^2}{h} - 2x = 2x + h - 2x = h, \]
we see that $x^2$ is uniformly differentiable on all of $\mathbb{R}$ with derivative $2x$.
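The computation for $x^2$ can be replayed in exact arithmetic. In the sketch below, `remainder` is a hypothetical helper that evaluates $r(x, y) = \frac{f(y) - f(x)}{y - x} - g(x)$ for a candidate derivative $g$; for $f(x) = x^2$ and $g(x) = 2x$ it should return exactly $y - x$, no matter how large $x$ is, which is what makes the choice $\delta = \varepsilon$ work uniformly on all of $\mathbb{R}$.

```python
from fractions import Fraction


def remainder(f, g, x, y):
    """r(x, y) = (f(y) - f(x)) / (y - x) - g(x); only defined for y != x."""
    return (f(y) - f(x)) / (y - x) - g(x)


def square(t):
    return t * t


def double(t):
    return 2 * t


# The remainder equals y - x both near the origin and far from it.
near = remainder(square, double, Fraction(7, 3), Fraction(5, 2))
far = remainder(square, double, Fraction(10**6), Fraction(10**6) + Fraction(1, 1000))
```

Here `near` is $\frac{5}{2} - \frac{7}{3} = \frac{1}{6}$ and `far` is $\frac{1}{1000}$, independent of the size of $x$.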

Example 6.1.6 We can do a similar trick for $n > 2$, starting with the factorization
\[
\begin{aligned}
y^n - x^n &= \left( \sum_{i=1}^{n} y^{n-i} x^{i-1} \right)(y - x) \\
&= \underbrace{\left( \sum_{i=1}^{n} \left( y^{n-i} x^{i-1} - x^{n-1} \right) \right)}_{=\, r(x,y)} (y - x) + n x^{n-1}(y - x).
\end{aligned}
\]
On any finite interval, $\left( y^{n-i} x^{i-1} - x^{n-1} \right) \to 0$ as $y \to x$ (hence $r(x, y) \to 0$), so $x^n$ is uniformly differentiable with derivative $n x^{n-1}$ on each finite subinterval of $\mathbb{R}$ (by Condition 3).

Example 6.1.7
\[ \frac{\sqrt{y} - \sqrt{x}}{y - x} - \frac{1}{2\sqrt{x}} = \frac{1}{\sqrt{y} + \sqrt{x}} - \frac{1}{2\sqrt{x}} = \frac{\sqrt{x} - \sqrt{y}}{2\sqrt{x}\left( \sqrt{y} + \sqrt{x} \right)}. \]
Since we have already proved that the square-root function is continuous, the numerator of the last fraction can be made small by making $|y - x|$ small. If we are in some interval $[c, \infty)$, with $c > 0$, then the denominator of this fraction is bounded away from zero, so the whole fraction can be made small. Thus, $\sqrt{x}$ is UD on $[c, \infty)$, with derivative $\dfrac{1}{2\sqrt{x}}$ (by Condition 2).

Remark 6.1.8 There is another condition, equivalent to the three that constitute the definition of UD: There is a function $F(x, y)$ such that $f(y) - f(x) = F(x, y)(y - x)$ and $F(x, y) \to F(x, x)$ as $y \to x$ ($F(x, x)$ will be $f'(x)$). Details of this approach are presented in [Bridger and Stolzenberg, 1999]. One of its advantages is that it has generalizations to higher dimensional spaces.

Definition 6.1.9 For $\varepsilon > 0$ in the definition of UD, the $\delta > 0$, mentioned in the first two equivalent conditions and implied by $r(x, y) \to 0$ in the third, is called a modulus of differentiability. It depends on the function $f$, the set $E$, and $\varepsilon$, so may be denoted $\delta_f(\varepsilon, E)$.

Like a modulus of continuity, a modulus of differentiability $\delta_f(\varepsilon, E)$ is not unique; if you have one, any positive smaller number is also a modulus of differentiability.

Furthermore, there may not be any way of telling whether the one you have is best possible (i.e. largest).

Before going further, it is reasonable to ask if a function can have two different derivatives on a set $E$. Suppose that $g_1$ and $g_2$ both satisfy the definition of being the derivative of $f$. Then we have
\[
\begin{aligned}
|g_1(x) - g_2(x)| &= \left| g_1(x) - \frac{f(y) - f(x)}{y - x} + \frac{f(y) - f(x)}{y - x} - g_2(x) \right| \\
&\le \left| \frac{f(y) - f(x)}{y - x} - g_1(x) \right| + \left| \frac{f(y) - f(x)}{y - x} - g_2(x) \right|.
\end{aligned}
\]
Both of these last quantities can be made arbitrarily small if $0 < |y - x|$ is sufficiently small. The only problem is whether the set $E$ contains numbers $y$ arbitrarily close to $x$.

Proposition 6.1.10 (Uniqueness of the Derivative) Suppose $E$ is a subset of $\mathbb{R}$ with the property that for each $x \in E$ and $\delta > 0$, there exists at least one $y \in E$ such that $0 < |y - x| < \delta$. Then if $f$ is differentiable on $E$, it has exactly one derivative.

Proposition 6.1.11 If $f$ is uniformly differentiable, then $f'$ is uniformly continuous.

Proof. Because the difference quotient $D(x, y) = \dfrac{f(y) - f(x)}{y - x}$ is symmetric in $x$ and $y$, if $x$ and $y$ are within $\delta(\varepsilon/2)$ of each other, both $f'(x)$ and $f'(y)$ are within $\varepsilon/2$ of $D(x, y) = D(y, x)$, and hence within $\varepsilon$ of each other.

Since uniformly continuous functions on finite closed intervals are bounded, Proposition 6.1.11 gives us the following.

Corollary 6.1.12 On finite closed intervals, $f'$ is bounded.

Proposition 6.1.13 If $f'$ is bounded, then $f$ is uniformly continuous.

Proof. By making $|y - x| < \delta(1)$, we can make the difference quotient within 1 of $f'(x)$ (by definition of UD); so if $|f'(x)|$ is bounded by $B$, then $\left| \dfrac{f(y) - f(x)}{y - x} \right| < B + 1$. Thus $|f(y) - f(x)| \le (B + 1)|y - x|$, which can be made less than $\varepsilon$ by making $|y - x| < \min(\delta(1), \varepsilon/(B + 1))$.

Note that we could also have observed that $|f(y) - f(x)| \le (B + 1)|y - x|$ is the kind of Lipschitz condition that we have seen implies continuity (see Example 4.1.7).

We already know enough to calculate the derivative of the basic exponential and logarithm functions, $e^x$ and $\log_e(x)$. Later we will deal with the more general $a^x$ and $\log_a(x)$.

Proposition 6.1.14 For all $c \in \mathbb{R}$, $e^x$ is uniformly differentiable on $(-\infty, c]$ with derivative $e^x$.

Proof. For $h > 0$, from Corollary 2.4.12 we have
\[ 1 \le \frac{e^h - 1}{h} \le e^h. \]
Setting $h = y - x$ for $x < y$, we get
\[ e^x (y - x) \le e^y - e^x \le e^y (y - x). \tag{$*$} \]
In fact, it is easy to check that these last inequalities hold for $y < x$ as well. Let us now define $r(x, y)$ by the equation
\[ r(x, y)(y - x) = e^y - e^x - e^x (y - x). \]
The function $r$ is certainly defined for all $x < y$ and $x > y$. By inequality ($*$), we have
\[ 0 \le r(x, y)(y - x) \le (e^y - e^x)(y - x) \]
and
\[ |r(x, y)| \le |e^y - e^x| \le \max(e^x, e^y)\,|y - x| \le e^c |y - x| \to 0 \]
as $|y - x| \to 0$ on each interval $(-\infty, c]$.

Proposition 6.1.15 For all $R > 0$, $f(x) = \ln x$ is uniformly differentiable on $[R, \infty)$ with derivative $1/x$.

Proof. Use inequality ($*$) above with the substitutions $y \to \ln y$ (so $e^y \to y$) and $x \to \ln x$ (so $e^x \to x$):
\[ \frac{1}{y}(y - x) \le \ln y - \ln x \le \frac{1}{x}(y - x). \]
This gives
\[ \frac{1}{y} \le \frac{\ln y - \ln x}{y - x} \le \frac{1}{x}. \]
By reasoning similar to that in the proof of Proposition 6.1.14, we can show that if we define $s(x, y)$ by the equation
\[ (y - x) \cdot s(x, y) = \ln y - \ln x - \frac{1}{x}(y - x), \]
then (by the previous inequality):
\[ |s(x, y)| \le \frac{|y - x|}{|xy|} \le |y - x| \max\left( \frac{1}{x^2}, \frac{1}{y^2} \right) \le \frac{1}{R^2}\,|y - x| \to 0 \]
as $|y - x| \to 0$ on $[R, \infty)$.
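The bound $|r(x, y)| \le e^c |y - x|$ for the exponential function on $(-\infty, c]$ can be spot-checked numerically. This is only a floating-point illustration at a few sample points, not a proof; the helper names are hypothetical, and the step sizes are kept moderate so that cancellation in the difference quotient stays harmless.

```python
import math


def exp_remainder(x, y):
    """r(x, y) from r(x, y) * (y - x) = e^y - e^x - e^x * (y - x), for y != x."""
    return (math.exp(y) - math.exp(x)) / (y - x) - math.exp(x)


def remainder_bound(x, y, c):
    """The bound e^c * |y - x|, valid when both x and y lie in (-inf, c]."""
    return math.exp(c) * abs(y - x)


def bound_holds(x, y, c):
    """True when |r(x, y)| <= e^c * |y - x| at this particular pair."""
    return abs(exp_remainder(x, y)) <= remainder_bound(x, y, c)
```

For instance, with $c = 1$ the pair $(x, y) = (0.9, 1.0)$ gives $|r| \approx 0.127$ against a bound of about $0.272$.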

Exercises

1. Prove the following directly and in detail from the definition, and give intervals on which they hold.
(a) $k$ is UD with derivative 0 ($k$ a number).
(b) $kx$ is UD, with derivative $k$ ($k$ a number).
(c) $x^3$ is UD with derivative $3x^2$.
(d) $1/x$ is UD with derivative $-1/x^2$.
(e) $\sqrt{x}$ is UD with derivative $\dfrac{1}{2\sqrt{x}}$ (see Example 6.1.7).

2. Prove that $\sqrt{x}$ is not UD on the half open interval $(0, a]$ for any $a > 0$. (Hint: For any $\varepsilon, \delta > 0$ show that if $y = x + \delta/2$ you can choose $x > 0$ so that $\left| \dfrac{\sqrt{y} - \sqrt{x}}{y - x} - \dfrac{1}{2\sqrt{x}} \right| > \varepsilon$ while clearly $|y - x| = \delta/2 < \delta$.)

3. Write out in detail the proof of Proposition 6.1.3 showing the equivalence of the three conditions for differentiability.

4. Suppose $f$ is UD on $(a, b)$ and $x \in (a, b)$. Prove that $\lim_{h \to 0} \dfrac{f(x + h) - f(x)}{h} = f'(x)$.

5. Suppose $f$ is UD on $(a, b)$ and $f'(c) = 0$ for some $c \in (a, b)$. Prove that $f$ cannot satisfy a left Lipschitz condition $L(y - x) \le f(y) - f(x)$ on $(a, b)$ with $L > 0$.

6. Suppose we have two functions $s(x)$ and $c(x)$ defined and UC on all of $\mathbb{R}$, with the following properties.
(a) $s^2 + c^2 = 1$
(b) $s(x + y) = s(x)c(y) + c(x)s(y)$
(c) $c(0) = 1$ (so $s(0) = 0$ by property (a))
(d) $s(h)/h \to 1$ as $h \to 0$.
Prove that $s(x)$ is UD on all of $\mathbb{R}$ with derivative $c(x)$. Note that both functions are bounded. You'll need to use parts (a) and (d) to show that $\dfrac{c(h) - 1}{h} \to 0$ as $h \to 0$. Later, we will define the sine and cosine functions rigorously using power series, and show that they have these properties. We will also be able to prove other familiar properties of the sine and cosine using facts (a) through (d).

7. Prove that $|x|$ is not uniformly differentiable on any interval $[a, b]$ such that $a < 0 < b$.

8. Suppose $f$ is UD on $[a, b]$. Make a conjecture about when $|f|$ will be UD as well (see the previous exercise). Try to prove your conjecture.

9. Suppose that $D$ is dense in $S$ and $f$ is uniformly continuous and differentiable on $D$ with derivative $g$. Prove that both $f$ and $g$ can be extended to $S$, on which $f' = g$ as well. (Hint: You can approximate the difference quotients and derivatives with variables in $S$ arbitrarily closely by quotients and derivatives with variables in $D$. Prove that $\left| \dfrac{f(x + h) - f(x)}{h} - g(x) \right| < \varepsilon + \varepsilon_1$ where $\varepsilon_1 > 0$ is arbitrary.) In special cases, a similar result can be obtained using the Fundamental Theorem of Calculus, as we will see later.

10. Suppose that $f$ is UD on an interval $I$. We know that $f'$ is then UC on $I$. For $\varepsilon > 0$, prove that any modulus of continuity $\delta(\varepsilon)$ for $f'$ on $I$ is also a modulus of differentiability for $f$ on $I$. What about the other way around: can you get a modulus of continuity for $f'$ from a modulus of differentiability for $f$?

6.2

The Arithmetic of Dierentiability

In proving that sums, products and reciprocals of dierentiable functions are differentiable, the following two lemmas will be useful. By stating them here, we can give the proofs of the rules for dierentiation in a more natural and streamlined way, without getting bogged down in details that are clearly true but obscure the ideas and logic of the argument. Lemma 6.2.1 Suppose u({> |) $ 0 and v ({> |) $ 0 as | { $ 0 on H. Then 1. u({> |) + v({> |) $ 0 as | { $ 0 on H. 2. If T ({> |) is bounded on H, T({> |)u({> |) $ 0 as | { $ 0 on H. The proof of this lemma is straightforward and is left as an exercise. Note that the condition that T({> |) be bounded is important. To see why, let u({> |) = | { and 1 . Even though u({> |) $ 0 on the interval [1> 2], T({> |)u({> |) = 1 T({> |) = |{ on this interval, so it cannot go to 0. Lemma 6.2.2 Suppose we have functions I and J such that 1. I (|) I ({) = Z ({> |)(| {) + U({> |)(| {) for all {> | 5 H, 2. Z ({> |) J({) $ 0 and U({> |) $ 0 as | { $ 0. Then J is the derivative of I on H.

DIFFERENTIATION


Proof. Let $H(x, y) = W(x, y) - G(x)$, so $H(x, y) \to 0$ as $y - x \to 0$. Then
\[
F(y) - F(x) = [H(x, y) + G(x)](y - x) + R(x, y)(y - x) = G(x)(y - x) + [H(x, y) + R(x, y)](y - x).
\]
Since $[H(x, y) + R(x, y)] \to 0$ as $y - x \to 0$, $G(x)$ is the derivative of $F(x)$.

Theorem 6.2.3 (Arithmetic of Differentiation)
Sums: If $f$ and $g$ are uniformly differentiable on $E$, then $f + g$ is uniformly differentiable on $E$ with $(f + g)' = f' + g'$.
Products: If $f$, $g$ and their derivatives are bounded on $E$ (e.g. $E$ is a finite interval $[a, b]$), then $fg$ is uniformly differentiable on $E$ with $(fg)' = f'g + fg'$.
Reciprocals: If $1/f$ is defined and bounded on $E$ and $f'$ is bounded, then $1/f$ is uniformly differentiable on $E$ with $(1/f)' = -f'/f^2$.

Proof. We will use the version of differentiation involving the "error" function $r(x, y)$ (see Proposition 6.1.3).

Sums:
\[
\begin{aligned}
[f(y) + g(y)] - [f(x) + g(x)] &= [f(y) - f(x)] + [g(y) - g(x)] \\
&= [f'(x) + r_f(x, y)](y - x) + [g'(x) + r_g(x, y)](y - x) \\
&= [f'(x) + g'(x)](y - x) + [r_f(x, y) + r_g(x, y)](y - x).
\end{aligned}
\]
By Lemma 6.2.1, $r_f(x, y) + r_g(x, y) \to 0$ as $y - x \to 0$.

Products: We compute (using the usual trick of adding and subtracting):

\[
\begin{aligned}
f(y)g(y) - f(x)g(x) &= f(y)g(y) - f(y)g(x) + f(y)g(x) - f(x)g(x) \\
&= f(y)[g(y) - g(x)] + g(x)[f(y) - f(x)] \\
&= f(y)[g'(x) + r_g(x, y)](y - x) + g(x)[f'(x) + r_f(x, y)](y - x) \\
&= \underbrace{[f(y)g'(x) + g(x)f'(x)]}_{W(x, y)}(y - x) + \underbrace{[f(y)r_g(x, y) + g(x)r_f(x, y)]}_{R(x, y)}(y - x).
\end{aligned}
\]

Since $f$ and $g$ are bounded, Lemma 6.2.1 gives $R(x, y) \to 0$ as $y - x \to 0$. Now $f'$ is bounded so $f$ is UC, so $f(y) - f(x) \to 0$ as $y - x \to 0$. Also, $g'$ is bounded, so again by Lemma 6.2.1 we see that
\[
\underbrace{[f(y)g'(x) + g(x)f'(x)]}_{W(x, y)} - \underbrace{[f(x)g'(x) + g(x)f'(x)]}_{G(x)} = [f(y) - f(x)]\,g'(x) \to 0
\]
as $y - x \to 0$. We can now apply Lemma 6.2.2 to conclude that $G(x) = F'(x)$, where $F(x) = f(x)g(x)$.


Reciprocals:
\[
\begin{aligned}
\frac{1}{f(y)} - \frac{1}{f(x)} &= \frac{f(x) - f(y)}{f(y)f(x)} = \frac{-f'(x)(y - x) - r(x, y)(y - x)}{f(y)f(x)} \\
&= \underbrace{\left[ \frac{-f'(x)}{f(y)f(x)} \right]}_{W(x, y)}(y - x) + \underbrace{\left[ \frac{-r(x, y)}{f(y)f(x)} \right]}_{R(x, y)}(y - x).
\end{aligned}
\]

Since $1/f$ and $f'$ are both bounded and $1/f$ is continuous, we see that
\[
\underbrace{\frac{-f'(x)}{f(y)f(x)}}_{W(x, y)} - \underbrace{\frac{-f'(x)}{f(x)f(x)}}_{G(x)} = -\frac{f'(x)}{f(x)} \left[ \frac{1}{f(y)} - \frac{1}{f(x)} \right] \to 0
\]
as $y - x \to 0$. We can now apply Lemma 6.2.2 as we did for products.
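As an informal numerical illustration of the product rule (not part of the text's argument), the following Python sketch measures the error function $r(x, y)$ for $F = fg$ with $y = x + h$ and shows it shrinking as $h$ does. The functions $f(x) = x^2$ and $g(x) = x^3 + 1$, the interval $[0, 2]$, and the helper name `max_error` are assumptions chosen only for this example.

```python
def f(x):
    return x**2

def g(x):
    return x**3 + 1

def product_derivative(x):
    # Product rule: (fg)' = f'g + fg', with f' = 2x and g' = 3x^2.
    return 2 * x * g(x) + f(x) * 3 * x**2

def max_error(h, samples=200):
    """Largest |r(x, x+h)| over a grid of x in [0, 2 - h], where
    r(x, y) = (F(y) - F(x)) / (y - x) - F'(x) and F = f * g."""
    worst = 0.0
    for i in range(samples + 1):
        x = (2 - h) * i / samples
        y = x + h
        quotient = (f(y) * g(y) - f(x) * g(x)) / (y - x)
        worst = max(worst, abs(quotient - product_derivative(x)))
    return worst

for h in (1e-1, 1e-2, 1e-3):
    print(h, max_error(h))
```

A grid check like this can only suggest uniform convergence on the sampled points; the proof above is what establishes it.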

Corollary 6.2.4 (Quotient Rule) Suppose that $f$ is differentiable and bounded, and has bounded derivative on $E$. Suppose $g$ is differentiable, bounded away from 0, and has bounded derivative on $E$. Then $f/g$ is uniformly differentiable on $E$ with derivative $\frac{gf' - fg'}{g^2}$.

We now turn to the derivatives of compositions and inverse functions.

Proposition 6.2.5 (Chain Rule) If $f$ and $g$ are uniformly differentiable, with $f'$ and $g'$ bounded, then the composition $f(g(x))$ is uniformly differentiable with derivative $f'(g(x))g'(x)$.

Proof. Since $f$ is differentiable,
\[
f(g(y)) - f(g(x)) = f'(g(x))(g(y) - g(x)) + r_f(g(x), g(y))(g(y) - g(x)).
\]
Since $g$ is differentiable, $g(y) - g(x) = g'(x)(y - x) + r_g(x, y)(y - x)$. Substituting this in the previous equation and simplifying gives the following for $f(g(y)) - f(g(x))$:
\[
f'(g(x))g'(x)(y - x) + [f'(g(x))r_g(x, y) + g'(x)r_f(g(x), g(y)) + r_f(g(x), g(y))r_g(x, y)](y - x).
\]
Since $g$ is continuous, $g(y) - g(x) \to 0$, so $r_f(g(x), g(y)) \to 0$; also $r_g(x, y) \to 0$. Since $f'$ and $g'$ are bounded, each of the three terms in the brackets goes to 0 as $y - x \to 0$.

Corollary 6.2.6 For $a > 0$, the function $a^x$ is uniformly differentiable on $(-\infty, c]$ with derivative $a^x \ln a$.


Proof. $a^x = \left( e^{\ln a} \right)^x$ and we know how to differentiate $e^x$ by Proposition 6.1.14.

Next we consider, for any $p \in \mathbb{R}$, the power function $f(x) = x^p$ on $(0, \infty)$. We recall (Definition 2.4.14) that $x^p = \left( e^{\ln x} \right)^p = e^{p \ln(x)}$.

Corollary 6.2.7 (General Power Rule) On any (finite) interval $[R, S]$ with $R > 0$, $x^p$ is uniformly differentiable with derivative $px^{p-1}$.

Proof. Both functions $\exp u = e^u$ and $\ln v$ are uniformly differentiable on $[R, S]$. By the Chain Rule, then, so is $x^p = \exp(p \ln x)$, and
\[
(x^p)' = [e^{p \ln(x)}]' = \left( e^{p \ln x} \right) \frac{p}{x} = px^{p-1}.
\]
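The General Power Rule can be checked informally in Python by comparing a difference quotient of $\exp(p \ln x)$ with $px^{p-1}$. This is only a sketch: the exponent $p = 2.5$, the sample points, the step size, and the helper names `power` and `difference_quotient` are assumptions made for illustration.

```python
import math

def power(x, p):
    # x^p defined as exp(p ln x) for x > 0, as in the text.
    return math.exp(p * math.log(x))

def difference_quotient(x, p, h=1e-6):
    # Forward difference quotient of x^p at x.
    return (power(x + h, p) - power(x, p)) / h

p = 2.5
for x in (0.5, 1.0, 2.0, 3.0):
    print(x, difference_quotient(x, p), p * power(x, p - 1))
```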

Remark 6.2.8 Later (Corollary 6.3.22), we will see that some power functions (e.g. $\sqrt{x}$) are actually differentiable on half-infinite intervals.

Proposition 6.2.9 (Inverse Functions) If $h$ is a uniformly continuous inverse for $f$ and $f$ is differentiable on $E$ with $f'$ bounded away from 0 on $E$, then $h$ is uniformly differentiable with $h'(x) = 1/f'(h(x))$.

Proof. First, we write
\[
\begin{aligned}
h(y) - h(x) &= \frac{h(y) - h(x)}{y - x}(y - x) - \frac{y - x}{f'(h(x))} + \frac{y - x}{f'(h(x))} \\
&= \frac{1}{f'(h(x))}(y - x) + \underbrace{\left[ \frac{h(y) - h(x)}{y - x} - \frac{1}{f'(h(x))} \right]}_{s(x, y)}(y - x).
\end{aligned}
\]

We now calculate $s(x, y)$. Writing $y - x = f(h(y)) - f(h(x))$ and applying the differentiability of $f$ yields $y - x = [f'(h(x)) + r(h(x), h(y))](h(y) - h(x))$. Substituting this into $s(x, y)$ gives
\[
s(x, y) = \frac{1}{f'(h(x)) + r(h(x), h(y))} - \frac{1}{f'(h(x))} = \frac{-r(h(x), h(y))}{[f'(h(x)) + r(h(x), h(y))][f'(h(x))]}.
\]

The numerator approaches 0 as $y - x \to 0$ (since $h$ is uniformly continuous), but what about the denominator? We know that $f'$ is bounded away from 0, so we can write $|f'(h(x))| > c > 0$. But $|r(h(x), h(y))|$ can be made arbitrarily small, say less than $c/2$, when $|y - x|$ is small. Thus, adding $r(h(x), h(y))$ to $f'(h(x))$ can't make its absolute value less than $c - c/2 = c/2$, and so $|f'(h(x)) + r(h(x), h(y))|$ is also bounded away from 0. Therefore $s(x, y) \to 0$ as $y - x \to 0$, and we are done.
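To see the formula $h'(x) = 1/f'(h(x))$ at work, here is a small Python sketch. It assumes the sample function $f(x) = x^3 + x$ (whose derivative $3x^2 + 1$ is bounded away from 0) and computes the inverse by bisection; the function names, the bracketing interval $[-10, 10]$, and the tolerances are illustrative choices, not anything prescribed by the text.

```python
def f(x):
    return x**3 + x

def f_prime(x):
    return 3 * x**2 + 1

def inverse(y, lo=-10.0, hi=10.0, tol=1e-13):
    """Bisection for h(y) with f(h(y)) = y; valid because f is increasing.
    Assumes the answer lies in [lo, hi]."""
    while hi - lo > tol:
        mid = (lo + hi) / 2
        if f(mid) < y:
            lo = mid
        else:
            hi = mid
    return (lo + hi) / 2

def inverse_difference_quotient(x, step=1e-5):
    # Difference quotient of the inverse function at x.
    return (inverse(x + step) - inverse(x)) / step

x = 2.0  # f(1) = 2, so the inverse of f at 2 is 1
print(inverse_difference_quotient(x), 1 / f_prime(inverse(x)))
```

The two printed numbers should agree to several decimal places, the second being $1/f'(1) = 1/4$.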


Corollary 6.2.10 For $c > 0$, $\log_a(x)$ is uniformly differentiable on $[c, \infty)$ with derivative $\frac{1}{x \ln(a)}$.

Finally, we turn to the trigonometric functions. We will define them later as power series and derive their properties independently by using the differential equations they satisfy. So, for now, we will assume their usual properties and simply state how they are differentiated.

Summary 6.2.11 (Trig Functions) (For details, see Section 7.5.)
1. $\sin(x)$ is uniformly differentiable on all of $\mathbb{R}$ with derivative $\cos(x)$.
2. $\cos(x)$ is uniformly differentiable on all of $\mathbb{R}$ with derivative $-\sin(x)$.
3. $\tan(x) = \frac{\sin(x)}{\cos(x)}$ is uniformly differentiable on any subset of $\mathbb{R}$ bounded away from the zeros of $\cos(x)$; it has derivative $(1/\cos(x))^2 = \sec^2(x)$.

Remark 6.2.12 The number $\pi$ will also be defined when we deal, formally, with the trig functions. We will see that the zeros of $\sin(x)$ are precisely the numbers $n\pi$ where $n$ is any integer, and the zeros of $\cos(x)$ are precisely the numbers $\pi/2 + n\pi$ (so $\tan(x)$ is UD on any set bounded away from these numbers).

Exercises

1. Prove Lemma 6.2.1 on page 191.

2. The derivative of $x$ is 1. Knowing this fact and just the product rule $(fg)' = fg' + f'g$, which functions can you differentiate? If you also know the rules $(f + g)' = f' + g'$ and $(kf)' = kf'$ ($k$ a constant), which functions can now be differentiated? Finally, which functions can be differentiated if you add the rule $(1/f)' = -f'/f^2$?

3. Under reasonable hypotheses, what will be the derivative of the product $fgh$ of three functions?

4. For each of the following functions, find its derivative and a region where it is uniformly differentiable.
(a) $x^{1/3}$
(b) $xe^{x}$
(c) $xe^{-x}$
(d) $x^{2}$, $x^{3}$


5. In Proposition 6.2.9 we proved that, under suitable hypotheses, the inverse of a differentiable function is itself differentiable, and we gave a formula for its derivative.
(a) Assuming that $f$ has a differentiable inverse $g$, derive the formula for $g'$ by using $f(g(x)) = x$.
(b) Assuming that the sine, cosine and tangent functions have inverses on appropriate intervals, calculate the derivatives of these inverses.

6. On an interval $I = [-a, a]$ ($a > 0$), $f$ is called an odd function if $f(-x) = -f(x)$ for all $x \in I$, and is called even if $f(-x) = f(x)$ for all $x \in I$. Assume that $f$ is UD on $I$.
(a) If $f$ is even, prove that $f'(0) = 0$.
(b) More generally, prove that $f$ even $\implies$ $f'$ odd, and $f$ odd $\implies$ $f'$ even. (Why is this "more generally"?)

7. Suppose that $f$ is UD on $(a, b)$, and $f(c) > 0$ for some $c \in (a, b)$, and calculate
\[
\lim_{n \to \infty} \left( \frac{f(c + 1/n)}{f(c)} \right)^{1/n}.
\]
(Hint: Look at $\ln(f(x))$.)

8. Prove that $x^n e^{-x}$ is UD on intervals of the type $[c, \infty)$. (The values $c$ may take depend on $n$; discuss this.) This requires some real work: look at the proof of the product rule, in particular at what happens to the terms that look like $x^n r(x, y)$. Also, remember that $\lim_{x \to \infty} \frac{x^n}{e^x} = 0$.

9. Prove that there is no rational function (that is, no quotient $\frac{P(x)}{Q(x)}$ where $P$ and $Q$ are polynomials) whose derivative is $\frac{1}{x}$. (Hint: This is tricky. One possibility is to imitate the proof that $\sqrt{2}$ can't be rational. Note that any polynomial $P(x)$ can be written $P(x) = x^K P_1(x)$ where $x$ doesn't divide $P_1(x)$.)

6.3

Two Important Theorems

The two theorems referred to are the Fundamental Theorem of Calculus (FTC), relating derivatives and integrals, and the Mean Value Inequality (which we call the Law of Bounded Change, or LBC), relating bounds for the derivative with bounds for the change of a function. These theorems are the backbone of most of the other theoretical results of elementary calculus. We state and prove them without further digression.


Theorem 6.3.1 (Fundamental Theorem of Calculus, Part I) Suppose $f$ is uniformly continuous on an interval $I$ and $p \in I$. Define $F(x) = \int_p^x f(t)\,dt$ for $x \in I$. Then $F$ is uniformly differentiable on $I$ with $F' = f$.

Proof. First, a word about notation: the expression $\int_p^x f(t)\,dt$ is simply the integral $\int_p^x f$. We have put in the dummy variable $t$ to emphasize that we are to treat the value of the integral as a function of $x$, the upper limit of integration. We have done this before: see page 150, where we showed that the function defined in this way is UC when $f$ is. On the other hand, the expression $\int_u^v f(x)\,dt$ stands for the integral of the constant $f(x)$ (at least as far as $t$ is concerned) over the interval $[u, v]$. So $\int_u^v f(x)\,dt$ is equal to $f(x)(v - u)$. Since this can be confusing, we will put in the dummy variable $t$ throughout this argument.

We note that $F(y) - F(x) = \int_x^y f(t)\,dt$ (see Corollary 5.2.3). We must show that
\[
F(y) - F(x) - f(x)(y - x) = r(x, y)(y - x),
\]
where $r(x, y) \to 0$ as $y - x \to 0$ on $I$. As we have just noted, $f(x)(y - x) = \int_x^y f(x)\,dt$, since $f(x)$ is a constant in this integral. Thus, we can write
\[
F(y) - F(x) - f(x)(y - x) = \int_x^y [f(t) - f(x)]\,dt = \underbrace{\left[ \frac{\int_x^y [f(t) - f(x)]\,dt}{y - x} \right]}_{r(x, y)}(y - x).
\]

We now use the estimate
\[
|r(x, y)| = \frac{\left| \int_x^y [f(t) - f(x)]\,dt \right|}{|y - x|} \le \frac{M \cdot |y - x|}{|y - x|} = M,
\]
where $M$ is any bound for $|f(t) - f(x)|$, $t \in [x, y]$. Since $f$ is uniformly continuous, given $\varepsilon > 0$, there is a $\delta$ such that $|y - x| < \delta$ makes $|f(t) - f(x)| < \varepsilon$. Thus we can take $M = \varepsilon$, an arbitrarily small number. So $r(x, y) \to 0$ as $y - x \to 0$.

Theorem 6.3.2 (Law of Bounded Change) Let $f$ be a uniformly differentiable function on $[a, b]$.
1. If $f'(x) \le B$ on $[a, b]$, then $f(y) - f(x) \le B \cdot (y - x)$ for all $a \le x \le y \le b$.
2. If $A \le f'(x)$ on $[a, b]$, then $A \cdot (y - x) \le f(y) - f(x)$ for all $a \le x \le y \le b$.

The Law of Bounded Change, sometimes called the Mean Value Inequality, is probably the most useful theoretical tool in elementary calculus; furthermore, its practical or physical meaning is crystal clear: The distance (or better: displacement) of a trip can't exceed the elapsed time of the trip times a maximum bound for the velocity. In fact, the very simple proof we now give is based on the idea of estimating this distance or displacement by a sum of small displacements over short time intervals. Each one of these is roughly the rate of change (a bounded number) times the interval. Taking the sum corresponds to integrating the derivative.

Proof. We will prove (1); the other part is proved similarly.


In the definition of $f$ being differentiable, we have the function $r(x, y) \to 0$. Thus, given $\varepsilon > 0$, we can find a $\delta > 0$ such that $r(x, y) \le |r(x, y)| < \varepsilon$ when $|y - x| < \delta$. So we choose $N$ so large that $(b - a)/N < \delta$ and construct the partition of $[x, y]$ with $x_i = x + i\left( \frac{y - x}{N} \right)$. On each interval $[x_{i-1}, x_i]$ we have
\[
f(x_i) - f(x_{i-1}) = f'(x_{i-1})(x_i - x_{i-1}) + r(x_{i-1}, x_i)(x_i - x_{i-1}) \le B \cdot (x_i - x_{i-1}) + \varepsilon (y - x)/N.
\]

To estimate the total change in $f$, we simply add up all $N$ of these inequalities, noting that, because of additive cancellation (telescoping), $\sum_{i=1}^{N} (x_i - x_{i-1}) = y - x$ and $\sum_{i=1}^{N} [f(x_i) - f(x_{i-1})] = f(y) - f(x)$. So
\[
f(y) - f(x) \le B \cdot (y - x) + \varepsilon (y - x).
\]

Since this last inequality holds for all $\varepsilon > 0$, we conclude from the Wiggle Lemma (Lemma 1.6.8) that $f(y) - f(x) \le B(y - x)$.

Note that for $y$ different from $x$, LBC says that
\[
A \le \frac{f(y) - f(x)}{y - x} \le B,
\]
which can be phrased: bounds for the derivative are bounds for the difference quotient.

We will look at some of the consequences of LBC before returning to the Fundamental Theorem of Calculus. The first of these is that a function with derivative identically 0 is constant. This deceptively simple statement has many applications in calculus (for example, in differential equations), yet there seems to be no easier proof than via LBC.

Corollary 6.3.3 $f$ is constant on any interval on which $f' = 0$.

Proof. This is just LBC with $A$ and $B$ equal to 0.

Definition 6.3.4 $F$ is an antiderivative of $f$ on $E$ means $F$ is uniformly differentiable on $E$ and $F' = f$ on $E$.

Corollary 6.3.5 On an interval, any two antiderivatives of the same function differ by a constant.

Corollary 6.3.6 (Fundamental Theorem of Calculus, Part II) If $F$ is an antiderivative of $f$ on $[a, b]$, then for $x \in [a, b]$, $\int_a^x f(t)\,dt = F(x) - F(a)$.

Proof. By the Fundamental Theorem of Calculus, Part I, both sides have the same derivative and thus they differ by a constant. Since they agree at $x = a$, they are equal.
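Part II of the Fundamental Theorem can be illustrated numerically by comparing a Riemann sum with $F(x) - F(a)$. The Python sketch below assumes the sample pair $f(t) = 3t^2$, $F(t) = t^3$ and uses a midpoint sum; the helper name `midpoint_integral` and the number of subintervals are hypothetical choices for this example only.

```python
def midpoint_integral(func, a, x, pieces=10_000):
    """Midpoint Riemann sum for the integral of func over [a, x]."""
    width = (x - a) / pieces
    return sum(func(a + (k + 0.5) * width) for k in range(pieces)) * width

def f(t):
    return 3 * t**2

def F(t):
    # An antiderivative of f.
    return t**3

a = 0.0
for x in (0.5, 1.0, 2.0):
    print(x, midpoint_integral(f, a, x), F(x) - F(a))
```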

Definition 6.3.7 Let $f$ be defined on a set $E$.
1. $f$ is strictly increasing on $E$ means for all $x < y$ in $E$, $f(x) < f(y)$.


2. $f$ is strictly decreasing on $E$ means for all $x < y$ in $E$, $f(x) > f(y)$.
3. $f$ is weakly increasing (non-decreasing) on $E$ means for all $x \le y$ in $E$, $f(x) \le f(y)$.
4. $f$ is weakly decreasing (non-increasing) on $E$ means for all $x \le y$ in $E$, $f(x) \ge f(y)$.

Corollary 6.3.8 Suppose $A$ is a constant and $A \le f'$ on an interval $I$.
$A > 0 \implies f$ is strictly increasing on $I$.
$A \ge 0 \implies f$ is weakly increasing on $I$.
In fact, $f(y) \ge f(x) + A(y - x)$ for $x \le y$ in $I$.

Corollary 6.3.9 Suppose $B$ is a constant and $f' \le B$ on an interval $I$.
$B < 0 \implies f$ is strictly decreasing on $I$.
$B \le 0 \implies f$ is weakly decreasing on $I$.
In fact, $f(y) \le f(x) + B(y - x)$ for $x \le y$ in $I$.

Note that in the strict cases of these corollaries, we are assuming that $f'(x)$ is bounded away from 0 (e.g. $f' \ge A > 0$). We don't really need this to conclude that $f$ is strictly increasing or decreasing, just that $f'$ is positive (or negative); however, this result, stated below, requires a bit more work. We leave its proof as an exercise.

Corollary 6.3.10 If $f' > 0$ on an interval $I$ then $f$ is strictly increasing on $I$; if $f' < 0$ on $I$ then $f$ is strictly decreasing on $I$.

Proof. See exercises.

Corollary 6.3.11 (Generalized Law of Bounded Change) Suppose $f$ and $g$ are uniformly differentiable on an interval $[a, b]$ and $A \cdot g' \le f' \le B \cdot g'$ for reals $A$ and $B$. Then for all $a \le x \le y \le b$,
\[
A[g(y) - g(x)] \le f(y) - f(x) \le B[g(y) - g(x)].
\]

Proof. Again, we just show the right-hand inequality; the other is proved similarly. Since $f' \le B \cdot g'$ we have $Bg' - f' = (Bg - f)' \ge 0$. By the Law of Bounded Change, $Bg(x) - f(x) \le Bg(y) - f(y)$. Rearranging gives the desired inequality. (We will use this in the next section to prove "l'Hôpital's rule.")

Corollary 6.3.12 If $f'$ is bounded on an interval $I$ then $f$ is uniformly continuous on $I$.

Proof. Suppose $|f'| \le K$ on $I$, and $x, y \in I$. By the Law of Bounded Change, $|f(y) - f(x)| \le K|y - x|$, so $|y - x| < \varepsilon/K$ will make $|f(y) - f(x)| < \varepsilon$.
Remark 6.3.13 Although we’ve already proved that a bounded derivative implies uniform continuity, this corollary shows the stronger result that we actually have UC via a Lipschitz condition.
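The Lipschitz bound coming from the Law of Bounded Change is easy to spot-check. The following Python sketch assumes the sample function $f(x) = x^3$ on $[0, 2]$, where $|f'| \le K = 12$, and tests $|f(y) - f(x)| \le K|y - x|$ on a grid of pairs; the function name `lipschitz_holds`, the grid, and the small round-off allowance are assumptions of this example.

```python
def f(x):
    return x**3

K = 12.0  # bound for |f'(x)| = 3x^2 on [0, 2]

def lipschitz_holds(points):
    """Check |f(y) - f(x)| <= K |y - x| for every pair of sample points.
    The 1e-12 slack only absorbs floating-point round-off."""
    return all(
        abs(f(y) - f(x)) <= K * abs(y - x) + 1e-12
        for x in points
        for y in points
    )

grid = [2 * i / 100 for i in range(101)]
print(lipschitz_holds(grid))
```

With this bound, the modulus of continuity from the proof of Corollary 6.3.12 would be $\varepsilon/12$ for this particular function.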


Corollary 6.3.14 If $f$ is uniformly differentiable on $[a, b]$, then it is uniformly continuous on $[a, b]$; furthermore, a modulus of continuity of $f$ is $\delta(\varepsilon) = \varepsilon/K$, where $K$ is a bound for $|f'|$ on $[a, b]$.

Proof. Since $f'$ is uniformly continuous on a finite interval we know that it is bounded, so Corollary 6.3.12 applies.

Since the Law of Bounded Change gives us Lipschitz conditions on a function, it provides a good source of inverse functions.

Corollary 6.3.15 Suppose $f$ is uniformly differentiable on the interval $[a, b]$ with derivative $f'$. Suppose also that $0 < L \le f'$ on $[a, b]$. Then $f$ has an inverse $h$ defined on $[f(a), f(b)]$; furthermore, $h$ is uniformly continuous and differentiable on this interval.

Proof. Since $f'$ is uniformly continuous on $[a, b]$, it is bounded above by some $B$ on $[a, b]$. From the Law of Bounded Change, it follows that $f$ has both upper and lower positive Lipschitz bounds. The Inverse Function Theorem now tells us that $h$ has upper and lower positive Lipschitz bounds as well, so $h$ is uniformly continuous. By Proposition 6.2.9, we know that $h$ is differentiable.

Corollary 6.3.16 Assume $m \le f' \le M$ on an interval $I$. Then for $x, y, u \in I$ with $x \ne y$,
\[
\left| \frac{f(y) - f(x)}{y - x} - f'(u) \right| \le M - m.
\]

Proof. By the Law of Bounded Change, the difference quotient and the derivative both lie in the interval $[m, M]$, so they can't differ by more than $M - m$.

Corollary 6.3.17 Suppose we have an interval $D$ (not necessarily finite) and two functions $f, g : D \to \mathbb{R}$. Assume $f$ is uniformly differentiable on all sufficiently small subintervals of $D$ with derivative $g$ and that $g$ is uniformly continuous on $D$. Then $f$ is uniformly differentiable on all of $D$.

Proof. "Differentiable for all sufficiently small subintervals" means that there is some $r > 0$ such that $f$ is uniformly differentiable on each subinterval of $D$ of length at most $r$. Also, for each $\varepsilon > 0$, there is a $0 < \delta < r$ such that, on each $[x, y] \subseteq D$ of length at most $\delta$, the values of $g$ lie between $g(x) - \varepsilon/2$ and $g(x) + \varepsilon/2$. Therefore, if we write $r(x, y) = \frac{f(y) - f(x)}{y - x} - g(x)$, we can apply Corollary 6.3.16 to see that $|r(x, y)| < \varepsilon$. This makes $g$ the uniform derivative of $f$.

Now we turn to some uses of the Fundamental Theorem of Calculus.

Corollary 6.3.18 If $f$ is UD on $E$ and $a \in E$, then $f(x) = \int_a^x f'(t)\,dt + f(a)$.

Proof. This is simply a restatement of FTC Part II (Corollary 6.3.6). The Fundamental Theorem turns out to be a useful tool for extending or piecing together functions. This is because derivatives are always UC, and we know how to


extend or amalgamate uniformly continuous functions. We then recover the original function by integrating the derivative. The idea is captured in the proofs of the following two propositions.

Proposition 6.3.19 Suppose $f' = g$ on $[a, b)$. Then $f$ and $g$ have unique extensions to $[a, b]$ where they have the same relation.

Proof. Since the derivative $g$ is UC, it has an extension $\bar{g}$ to $[a, b]$. Let $F(x) = \int_a^x \bar{g} + f(a)$. By FTC, Part I, $F' = \bar{g}$ on $[a, b]$, so $F' = f' = g$ on $[a, b)$. Also by FTC, Part II, $F = f$ on any $[a, p]$ for $p < b$. Thus $F = f$ on $[a, b)$. (Note that a somewhat more general theorem is suggested in exercise 9 on page 191.)

Proposition 6.3.20 Suppose $f$ is UD on $[a, b]$, $g$ is UD on $[b, c]$, and $f(b) = g(b)$, $f'(b) = g'(b)$. Then there is a UD function $H$ on $[a, c]$ such that $H = f$ and $H' = f'$ on $[a, b]$ and $H = g$ and $H' = g'$ on $[b, c]$.

Proof. The derivatives $f'$ and $g'$ are UC, so can be amalgamated (see Proposition 4.2.21) to produce a UC function $J$ on $[a, c]$. Now let $H(x) = \int_b^x J + f(b)$. (Further details left to the exercises.)

We conclude this section with another application of LBC. We have already seen that $x^p$ is differentiable on finite positive intervals $[R, S]$, $R > 0$. When $0 \le p \le 1$ (so, for example, for cube roots) we can do better.

Proposition 6.3.21 For $0 \le p \le 1$, $f(x) = x^p$ is uniformly continuous on $(0, \infty)$.

Proof. Note that we have already proved this in the case $p = 1/n$ where $n$ is an integer $\ge 1$: see Corollary 4.1.23. The proof used the inequality (established by induction on $n$) $(1 - u)^n \le 1 - u^n$ where $0 \le u \le 1$. By generalizing this inequality to real numbers $n \ge 1$, we can get the continuity of $x^p$ using the same ideas as in the proof of Corollary 4.1.23. The details are left for the exercises.

Corollary 6.3.22 For $1 < p \le 2$, $f(x) = x^p$ is uniformly differentiable on $(0, \infty)$.

Proof. Exercise; this follows from the previous proposition and a nice application of Corollary 6.3.17, a consequence of LBC.

Remark 6.3.23 (Differentiation vs. Integration) The Fundamental Theorem shows that differentiation and integration can be described as inverse processes: they essentially undo each other. However, their effects on a function are quite different.

In general, an indefinite integral $F(x) = \int_a^x f(t)\,dt$ of a function $f$ tends to "smooth out" the original function $f$. The reason for this is that integrating is like averaging. In fact, for $f$ UC on $[a, b]$, if you divide $[a, b]$ into $n$ equal parts, average the values of $f$ at the $n$ division points, and take the limit as $n \to \infty$ of these averages,


you get $\frac{1}{b - a} \int_a^b f(x)\,dx$ (see exercise 5 on page 176). (This is consequently called the average value of $f$.) In this sense, integrating a function produces a function whose values are averages of the values of $f$. Possibly erratic behavior of $f$ gets averaged out.

The derivative, on the other hand, is very local. Its value at $x$ depends on the behavior of $f$ arbitrarily close to $x$: the limit of the average rate of change of $f$ near that $x$. So instead of being an average, which would smooth things, taking this limit "unaverages the function," magnifying variations in its change near $x$ because of the division by $y - x$.

Consider the function $f(x) = |x|$, which is uniformly continuous on all of $\mathbb{R}$. Its problem is that its rate of change to the left of 0 is $-1$ and to the right of 0 is $+1$. On any positive closed interval, $f'(x) = 1$ and on any negative closed interval, $f'(x) = -1$, but these derivatives cannot be amalgamated: its derivative cannot be defined on any interval containing 0. On the other hand, $f$ has a very nice antiderivative given by the amalgamated function
\[
F(x) = \begin{cases} x^2/2 & \text{if } x \ge 0 \\ -x^2/2 & \text{if } x \le 0. \end{cases}
\]
So the integral $F$ of $f$ is smooth; $F' = f$ is uniformly continuous but has a "corner" (not smooth); and $F'' = f'$ cannot be defined at the origin.

Another example of this phenomenon is obtained by taking $f(x) = x^{1/3}$, another UC function. Its integral, $(3/4)x^{4/3}$, is at least as nice, but its derivative $(1/3)x^{-2/3}$ goes bad near 0.

It is even possible to construct a function $f$ which is UC on an interval $I$ (so has an antiderivative or integral on $I$), but has no derivative on any subinterval of $I$. In fact, $\lim_{h \to 0} \frac{f(x + h) - f(x)}{h}$ doesn't exist for any $x \in I$.

Finally, a continuous function may not be differentiable, but the Fundamental Theorem tells us that its integral always is.
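The "integrating is like averaging" remark can be tried out directly. This Python sketch assumes the sample function $f(x) = x^2$ on $[0, 3]$, whose average value is $\frac{1}{3} \int_0^3 x^2\,dx = 3$, and averages $f$ over the right-hand division points of $n$ equal parts; the helper name and the choice of right-hand points are assumptions made for illustration.

```python
def average_at_division_points(func, a, b, n):
    """Average of func at the n division points a + k(b - a)/n, k = 1..n."""
    return sum(func(a + k * (b - a) / n) for k in range(1, n + 1)) / n

def f(x):
    return x**2

# (1/(b - a)) * integral of x^2 over [0, 3] is 9 / 3 = 3.
for n in (10, 100, 1000, 10_000):
    print(n, average_at_division_points(f, 0.0, 3.0, n))
```

The printed averages should approach 3 as $n$ grows.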

Exercises

1. Prove that if $f$ is uniformly differentiable on the finite interval $[a, b]$, then $f$ satisfies a Lipschitz condition on $[a, b]$ of the form $|f(y) - f(x)| \le K \cdot |y - x|$.

2. Prove the following part of Corollary 6.3.8: If $f' \ge A > 0$ on $E = [a, b]$, then $f(y) \ge f(x) + A(y - x)$ for $x \le y$, so $f$ is strictly increasing on $E$.

3. Prove the following stronger result from the text.
Corollary 6.3.10 If $f' > 0$ on an interval $I$ then $f$ is strictly increasing on $I$; if $f' < 0$ on $I$, then $f$ is strictly decreasing on $I$.
(Hint: This can be proved either from LBC or from the Fundamental Theorem. In either case the idea is first to observe that $f$ is weakly increasing. Now recall that if $f'(c) > 0$, then $f'(x)$ is bounded away from 0 in some neighborhood of $c$: see Proposition 4.1.21.)


4. Fill in the details of the proof of the following from the text.
Proposition 6.3.20 Suppose $f$ is UD on $[a, b]$, $g$ is UD on $[b, c]$, and $f(b) = g(b)$, $f'(b) = g'(b)$. Then there is a UD function $H$ on $[a, c]$ such that $H = f$ and $H' = f'$ on $[a, b]$ and $H = g$ and $H' = g'$ on $[b, c]$.

5. Prove the following from the text.
Corollary 6.3.22 For $1 < p \le 2$, $f(x) = x^p$ is uniformly differentiable on $(0, \infty)$. (This also holds for $p = 1$.)

6. Prove that if $\alpha > 2$, then the function $x^{\alpha} \sin(1/x)$ has a uniformly differentiable extension to any interval $[-A, A]$.

7. Describe the function whose derivative is $|x|$.

8. Prove the Riemann-Lebesgue Lemma, which says that when $f$ is UC on $[a, b]$,
\[
\lim_{\lambda \to \infty} \int_a^b f(t) \sin(\lambda t)\,dt = 0.
\]
(Hint: First use FTC to deal with the special case $f$ a constant; you may assume $(\cos t)' = -\sin t$. Next, partition $[a, b]$ into subintervals in which $f(x_k) - \varepsilon \le f(x) \le f(x_k) + \varepsilon$ and look at a Riemann sum.) There are many generalizations of this result, because it is of importance in the theory of Fourier series; see the last chapter of this book.

9. Here is an outline of the proof of Proposition 6.3.21, the continuity of $x^p$ on $(0, \infty)$ when $0 < p \le 1$.
(a) Lemma For any $q \ge 1$ and $0 \le u \le 1$, $(1 - u)^q \le 1 - u^q$.
To prove this, consider the function $\varphi(u) = (1 - u)^q + u^q$. The idea is to show that $\varphi(u) \le 1$ when $0 \le u \le 1$. Compute $\varphi'(u)$ and show that for $u \le 1/2$, $\varphi$ is decreasing; thus, if $0 < s \le u \le 1/2$ then $\varphi(u) \le \varphi(s) \le \lim_{s \to 0} \varphi(s)$ (weak inequalities are preserved in the limit). Prove that this last limit is 1 (check the proof of Proposition 4.2.26). This shows that $\varphi(u) \le 1$ when $0 \le u \le 1/2$. The case $u \ge 1/2$ is similar.
(b) Look at the proof of Corollary 4.1.23 to see how this lemma can be used to show $|b^p - a^p| \le |b - a|^p$ when $0 < p \le 1$.
(c) Deduce the continuity of $x^p$ on $(0, \infty)$ and use this to prove Corollary 6.3.22: $x^p$ is UD on $(0, \infty)$ when $1 < p \le 2$.


6.4

Derivative Tools

Part 1: Derivative Tests

Remark 6.4.1 In the hypotheses of the Law of Bounded Change, in place of $[a, b]$, we can have any $E \subseteq \mathbb{R}$ with the following property: for any $p \le q$ in $E$ and $\delta > 0$, there are $p = p_0 \le p_1 \le \cdots \le p_n = q$ in $E$ such that for each $i = 1, \ldots, n$, $p_i - p_{i-1} < \delta$. Note that if $E$ satisfies this property, then for all $p \le q$ in $E$, every $x \in [p, q]$ can be approximated arbitrarily closely by points in $E$.

Proposition 6.4.2 Suppose $f$ is uniformly differentiable on $E \subseteq \mathbb{R}$ satisfying the conditions in Remark 6.4.1, and suppose we are given $p \in E$ with $f'(p) > 0$. Then for some $\delta > 0$, $f$ is strictly increasing on $E \cap (p - \delta, p + \delta)$.

Proof. Take $r$ with $0 < r < f'(p)$. Since $f'$ is uniformly continuous, we know by Proposition 4.1.21 that there is a $\delta > 0$ such that on $E \cap (p - \delta, p + \delta)$, $0 < r < f'$. Therefore by the Law of Bounded Change, $f$ is strictly increasing on $E \cap (p - \delta, p + \delta)$.

Definition 6.4.3 (Max and Min)
1. $p$ is a local max (maximum) for $f$ means we can find a $\delta > 0$ such that for all $x$ in $(p - \delta, p + \delta)$, $f(x) \le f(p)$.
2. $p$ is a local min (minimum) for $f$ means we can find a $\delta > 0$ such that for all $x$ in $(p - \delta, p + \delta)$, $f(x) \ge f(p)$.
3. $p$ is a strict local max or min when, for $x \ne p$, the weak inequalities ($\le$ or $\ge$) can be replaced by strong ones ($<$ or $>$).

Proposition 6.4.4 Suppose $E = (a, b)$, $a < b$, $p \in E$, and $f$ is uniformly differentiable on $E$. If $p$ is a local max or a local min for $f$, then $f'(p) = 0$.

Proof. Since the proofs in the cases of local max and local min are similar, we only prove the local max case. We have to show that $f'(p) \le 0$ and $f'(p) \ge 0$. But if $f'(p) > 0$, then $f$ is strictly increasing on $(p - \delta', p + \delta')$ for some $\delta' > 0$. Let $\delta'' = \min(\delta, \delta')$, where $\delta$ is as in Definition 6.4.3. Then for any $p < x < p + \delta''$, $f(p) < f(x)$. But by assumption $f(x) \le f(p)$. Therefore $f'(p) \not> 0$. By similar reasoning, we have $f'(p) \not< 0$. Therefore $f'(p) = 0$.

Definition 6.4.5 For an integer $n > 0$, the function $f$ is $n$-times differentiable, with $n$th derivative $f^{(n)}$, if $f$ is uniformly differentiable and $f'$ is $(n - 1)$-times differentiable with $(f')^{(n-1)} = f^{(n)}$.


This is an example of a recursive definition: we know what $n$-times differentiable (resp. $n$th derivative) means when we know what $(n - 1)$-times differentiable (resp. $(n - 1)$st derivative) means. Since we know what these terms mean for $n = 1$, we can interpret them for $n > 1$. You might also call this an inductive definition. As a matter of notation, the first few derivatives of $f$ are usually written $f'$, $f''$, and $f'''$; after the third derivative we start using $f^{(4)}$, $f^{(5)}$, etc.

As we saw in Corollary 6.3.16, the error in replacing a function by its best affine approximation $f'(x)(y - x) + f(x)$ depends on the variation of its derivative. This in turn can be estimated using the second derivative, as the following result makes precise.

Proposition 6.4.6 Suppose $f$ is twice differentiable on an interval $E$. Then for all $x$ and $p$ in $E$,
1. If $f'' \le c$ on $E$, then $f(x) - f(p) - f'(p)(x - p) \le \frac{c}{2}(x - p)^2$.
2. If $d \le f''$ on $E$, then $\frac{d}{2}(x - p)^2 \le f(x) - f(p) - f'(p)(x - p)$.

Proof. Suppose first that $p \le x$. Starting with $f''(x) \le c$, integrate both sides from $x = p$ to $x = t$, obtaining $f'(t) - f'(p) \le c(t - p)$. Now integrate both sides (with respect to $t$) from $t = p$ to $t = x$, obtaining $f(x) - f(p) - f'(p)(x - p) \le \frac{c}{2}(x - p)^2$. If $x \le p$, do the exact same thing, but reverse the limits of integration, so that $p$ is always at the top. You obtain $f'(p)(p - x) - (f(p) - f(x)) \le \frac{c}{2}(p - x)^2$, which is, of course, the exact same inequality. The second inequality is proved in a similar fashion.

Corollary 6.4.7 If $f'' < c < 0$ on an interval $E$, then for $x > p$ or $x < p$ in $E$, $f(x) < f(p) + f'(p)(x - p)$. (This means that the graph of the function is always under its tangent line.) Similarly for $0 < d < f''$, $f(x) > f(p) + f'(p)(x - p)$.

The exercises suggest a direct proof using LBC.

Corollary 6.4.8 (Second Derivative Test) Let $f$ be twice uniformly differentiable on an open interval $I$, and suppose $f'(p) = 0$ for some $p \in I$. Then
1. $f''(p) > 0 \implies f$ has a strict local min at $p$.
2. $f''(p) < 0 \implies f$ has a strict local max at $p$.

Corollary 6.4.9 For $0 < \alpha < 1$ and $t \ge 0$, $t^{\alpha} \le \alpha t + (1 - \alpha)$.

Proof. See the exercises.
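The inequality of Corollary 6.4.9 can be spot-checked on a grid before attempting the proof. In the Python sketch below, the helper name `gap`, the grid of exponents, and the range $0 \le t \le 10$ are assumptions made for this example; equality is expected at $t = 1$, where the tangent line to $t^{\alpha}$ touches the curve.

```python
def gap(t, alpha):
    """alpha*t + (1 - alpha) - t**alpha, which Corollary 6.4.9 says is >= 0."""
    return alpha * t + (1 - alpha) - t**alpha

alphas = [k / 10 for k in range(1, 10)]   # 0.1, 0.2, ..., 0.9
ts = [k / 20 for k in range(0, 201)]      # t in [0, 10]
smallest = min(gap(t, a) for a in alphas for t in ts)
print(smallest)  # expected to be 0 up to floating-point round-off
```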


Part 2: l'Hôpital's Rule

We conclude this section with l'Hôpital's rule¹, a popular method for calculating certain kinds of limits. Since we know that $f(x)$ is close to $f'(p)(x - p) + f(p)$ when $x$ is close to $p$, we expect that this can be used in calculating limits. This is indeed the case if we can strip off the $f(p)$ and $(x - p)$. This will happen if $f(p) = 0$ and we take a quotient of two functions, so that the $x - p$ factors will cancel to obtain
\[
\frac{f(x)}{g(x)} \approx \frac{f'(p)(x - p) + f(p)}{g'(p)(x - p) + g(p)} = \frac{f'(p)}{g'(p)}
\]
when $f(p) = g(p) = 0$ and $x \ne p$. This is the idea behind l'Hôpital's rule. We begin with the simpler "$0/0$" case.

Proposition 6.4.10 (L'Hôpital: $0/0$ case) Suppose $f$ and $g$ are differentiable on $(0, A]$ with $f(x) \to 0$ and $g(x) \to 0$ as $x \to 0$ on $(0, A]$. Suppose also that $g'$ is bounded away from 0 on each $[r, A]$ ($r > 0$). If $f'(x)/g'(x) \to L$ as $x \to 0$ in $(0, A]$, then $f(x)/g(x) \to L$ as well.

Proof. Since $f'/g' \to L$, given $\varepsilon > 0$ we can find $\delta > 0$ small enough so that $L - \varepsilon$ and $L + \varepsilon$ are bounds for $\frac{f'}{g'}$ on $(0, \delta)$. For any $y$ with $0 < y < x < \delta$ we have, by the Generalized Law of Bounded Change (Corollary 6.3.11) on $[y, x]$,
\[
L - \varepsilon \le \frac{f(x) - f(y)}{g(x) - g(y)} \le L + \varepsilon.
\]
But $f(y) \to 0$ and $g(y) \to 0$ as $y \to 0$, and weak inequalities are preserved in the limit (Proposition 3.2.9). So, for $0 < x < \delta$,
\[
L - \varepsilon \le \frac{f(x)}{g(x)} \le L + \varepsilon.
\]

¹ G. F. Marquis de l'Hôpital (1661-1704)

Corollary 6.4.11 We can replace $(0, A]$ in Proposition 6.4.10 by $(p, B]$, $[C, q)$, $[D, \infty)$, or $(-\infty, E]$, where $D > 0$ and $E < 0$, and get the same result having $f(x) \to 0$ and $g(x) \to 0$, as $x \to p$, $x \to q$, $x \to \infty$ or $x \to -\infty$ in the respective intervals.

We will give the proof for the cases $[C, q)$ and $[D, \infty)$. The other cases will be similar. The idea is to reduce to Proposition 6.4.10 by changing the functions slightly.

Proof. On $[C, q)$, define $f_1(u) = f(-u + q)$ and $g_1(u) = g(-u + q)$ for $u \in (0, q - C]$. Then
\[
\frac{f_1'(u)}{g_1'(u)} = \frac{-f'(-u + q)}{-g'(-u + q)} = \frac{f'(-u + q)}{g'(-u + q)}.
\]


Letting $u = -x + q$, we get $u \to 0$ in $(0, q - C]$ as $x \to q$ in $[C, q)$, so Proposition 6.4.10 yields the result.

On $[D, \infty)$, define $f_1(u) = f(1/u)$ and $g_1(u) = g(1/u)$ for $u \in (0, 1/D]$. Then
\[
\frac{f_1'(u)}{g_1'(u)} = \frac{f'(1/u)\left( -\frac{1}{u^2} \right)}{g'(1/u)\left( -\frac{1}{u^2} \right)} = \frac{f'(1/u)}{g'(1/u)}.
\]
Letting $u = 1/x$, we get $u \to 0$ in $(0, 1/D]$ as $x \to \infty$ in $[D, \infty)$ and we can apply Proposition 6.4.10.
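As a concrete instance of the $0/0$ case, the following Python sketch compares $f(x)/g(x)$ with $f'(x)/g'(x)$ as $x \to 0$ from the right. The functions $f(x) = e^x - 1$ and $g(x) = \sin x$ (for which $L = 1$), the sample points, and the helper name `ratio_of_derivatives` are assumptions for this example; it relies on the usual derivatives of $e^x$ and $\sin x$ quoted earlier in the chapter.

```python
import math

def f(x):
    return math.expm1(x)      # e^x - 1, computed without cancellation

def g(x):
    return math.sin(x)

def ratio_of_derivatives(x):
    # f'(x) / g'(x) = e^x / cos(x); cos is bounded away from 0 near 0.
    return math.exp(x) / math.cos(x)

for x in (1e-1, 1e-2, 1e-3, 1e-4):
    print(x, f(x) / g(x), ratio_of_derivatives(x))
```

Both columns should approach 1 as $x$ shrinks.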

Now we turn to the more difficult $\infty/\infty$ case. The problem here is that in the fraction $\frac{f(x) - f(y)}{g(x) - g(y)}$ we don't have $f(y), g(y) \to 0$, so we have to use a few algebraic tricks.

Proposition 6.4.12 (L'Hôpital: $\infty/\infty$ case) Suppose $f$ and $g$ are differentiable on $[A, \infty)$ with $f(x) \to \infty$ and $g(x) \to \infty$ as $x \to \infty$, $g > 0$, and $g'(x)$ bounded away from 0 for $x$ sufficiently large. If $f'(x)/g'(x) \to L$ as $x \to \infty$, then $f(x)/g(x) \to L$ as $x \to \infty$ as well.

Proof. Let $\varepsilon > 0$ be given, and choose any $\delta$ with $0 < \delta < \varepsilon$. Since $f'(x)/g'(x) \to L$, there is an $N(\delta)$ such that
\[
L - \delta \le \frac{f'(x)}{g'(x)} \le L + \delta
\]
for $x \ge N$. By the Generalized Law of Bounded Change (Corollary 6.3.11) we have, for $x > y > N$,
\[
f(y) + (L - \delta)[g(x) - g(y)] \le f(x) \le f(y) + (L + \delta)[g(x) - g(y)].
\]
As usual, we deal with the right-hand inequality first; the other one is similar. Since $g(y) \to \infty$ we can, if necessary, make $y$ (hence $x$) larger, so that both $g(x)$ and $g(y)$ are positive. This being the case, divide through the above inequality by $g(x)$ to obtain
\[
\frac{f(y)}{g(x)} + (L - \delta)\left[ 1 - \frac{g(y)}{g(x)} \right] \le \frac{f(x)}{g(x)} \le \frac{f(y)}{g(x)} + (L + \delta)\left[ 1 - \frac{g(y)}{g(x)} \right].
\]
Now we note that we can make $x$ as big as we like while holding $y$ fixed, so $f(y)$ and $g(y)$ can be held fixed while $g(x)$ is made as big as we want. Thus, for $x$ sufficiently large we can make $|f(y)/g(x)|$ arbitrarily small and $[1 - g(y)/g(x)]$ between 0 and 1 but arbitrarily close to 1. Since $(L + \delta) < (L + \varepsilon)$, we can ensure that $(L + \delta)[1 - g(y)/g(x)] < (L + \varepsilon)$ as well. Making $g(x)$ even bigger if necessary, we can have $f(y)/g(x) + (L + \delta)[1 - g(y)/g(x)] \le (L + \varepsilon)$. Working in a similar way with the left-hand side and using $(L - \varepsilon) < (L - \delta)$, we can make $(L - \varepsilon) \le f(y)/g(x) + (L - \delta)[1 - g(y)/g(x)]$. Putting the two sides together gives us
\[
L - \varepsilon \le \frac{f(x)}{g(x)} \le L + \varepsilon.
\]


Since $\varepsilon$ was arbitrary, we are done. Note that nearly the same proof works when $x \to p$ instead of $x \to \infty$.
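The $\infty/\infty$ case can be sanity-checked numerically. The following is only an illustrative sketch: the functions $f(x) = 3x + \ln x$ and $g(x) = x + \sqrt{x}$ are assumptions chosen for this example (they are not from the text), picked so that $g > 0$, $g' \ge 1$, and $f'/g' \to 3$.

```python
import math

# Hypothetical example functions (not from the text): both tend to infinity,
# g > 0, and g'(x) = 1 + 1/(2*sqrt(x)) is bounded away from 0.
def f(x):
    return 3.0 * x + math.log(x)

def g(x):
    return x + math.sqrt(x)

def df(x):
    return 3.0 + 1.0 / x

def dg(x):
    return 1.0 + 0.5 / math.sqrt(x)

L = 3.0  # limit of f'(x)/g'(x) as x -> infinity

# Tabulate both ratios along x = 10, 100, ..., 10^8.
rows = []
for k in range(1, 9):
    x = 10.0 ** k
    rows.append((x, df(x) / dg(x), f(x) / g(x)))

for x, derivative_ratio, ratio in rows:
    print(f"x = {x:.0e}   f'/g' = {derivative_ratio:.6f}   f/g = {ratio:.6f}")
```

Both columns drift toward the same value $L = 3$, which is what Proposition 6.4.12 predicts; the table is evidence, not a proof.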

Exercises

1. Suppose that $f$ is twice differentiable on $(a, b)$ and $f''$ is bounded ($|f''| < M$ for some constant $M$). Prove that $f$ is uniformly continuous on $(a, b)$. Generalize this.

2. Suppose that $f$ is twice differentiable on $(a, \infty)$, and $|f''(x)| < M$ on $(a, \infty)$. If $\lim_{x \to \infty} f(x) = 0$, prove that $\lim_{x \to \infty} f'(x) = 0$ as well.

3. Suppose that $f$ is differentiable on $(a, b)$, and for some $c \in (a, b)$ there is a $\delta > 0$ such that
• $f'(x) > 0$ in $(c - \delta, c)$ and
• $f'(x) < 0$ in $(c, c + \delta)$.
Prove that $f$ has a relative maximum at $c$. This result is sometimes called the “first derivative test.”

4. The converse of the previous exercise need not be true. Find a function $f(x)$ which is uniformly differentiable on, say, $[-1, 1]$, has a maximum at $c = 0$, but for which there is no $\delta > 0$ such that
• $f'(x) > 0$ in $(c - \delta, c)$ and
• $f'(x) < 0$ in $(c, c + \delta)$.

5. Suppose that $f$ is twice differentiable on $[a, b]$ and $f''(t) \le 0$ (the graph is “concave down”) for $t \in [a, b]$.

(a) Prove that the graph of $f$ lies below its tangent lines. This means that for any $x, p \in [a, b]$,

$$f(x) \le \underbrace{f(p) + f'(p)\cdot(x - p)}_{\text{tangent line at } (p, f(p))}.$$

(Draw a picture to see what this is saying.) (Hint: $f'' \le 0$ says that $f'(t)$ is a decreasing function of $t$; separate into the two cases $x \le p$ and $p \le x$ and use LBC in each case. Note that if $f'' \le c < 0$ we can replace $\le$ with $<$.)


(b) Prove that the graph of $f$ lies above its secant line. This means that $g(x) \ge 0$, where

$$g(x) = f(x) - \Big[\underbrace{f(a) + \frac{f(b) - f(a)}{b - a}\cdot(x - a)}_{\text{line joining } (a, f(a)) \text{ and } (b, f(b))}\Big].$$

(Hint: First prove that $g'(a) \ge 0 \ge g'(b)$ by integrating the inequality $f'(a) \ge f'(t) \ge f'(b)$. Next, deal with the two cases $g'(x) \ge 0$ and $g'(x) \le 0$ separately, in each case using FTC to integrate inequalities involving $g'(t)$ and $g'(x)$ either from $a$ to $x$ or from $x$ to $b$. Once again, if $f'' \le c < 0$, then the weak inequality becomes a strong one: $g(x) > 0$.)

Analogous results hold when we assume $f'' \ge 0$ (the graph is “concave up”).

6. (Proposition 6.3.11) Use Proposition 6.4.6 to show that if $0 < \alpha < 1$ and $t \ge 0$ then

$$t^{\alpha} \le \alpha t + (1 - \alpha).$$

7. The following result is of great importance in studying Lebesgue integration and the $L^p$ spaces.

Hölder’s Inequality. Suppose that $f$ and $g$ are uniformly continuous on $[a, b]$ and that $p > 1$, $q > 1$ satisfy $\frac{1}{p} + \frac{1}{q} = 1$. Then

$$\left|\int_a^b f(x)g(x)\,dx\right| \le \left[\int_a^b |f(x)|^p\,dx\right]^{1/p} \left[\int_a^b |g(x)|^q\,dx\right]^{1/q}.$$

We outline the proof.

(a) We can assume $f$ and $g$ are non-negative (why?).

(b) If $I_f = \int_a^b f(x)^p\,dx = 0$ then $f^p = 0$ on $[a, b]$, so $f = 0$ on $[a, b]$; similarly for $I_g = \int_a^b g(x)^q\,dx$. Thus we can assume that both integrals $I_f$ and $I_g$ are positive.

(c) Let $F(x) = f(x)^p / I_f$ and $G(x) = g(x)^q / I_g$. Then the integrals of $F$ and $G$ over $[a, b]$ are both 1.

(d) Let $\alpha = 1/p$ and $\beta = 1/q = 1 - \alpha$. By exercise 6, $t^{\alpha} \le \alpha t + \beta$. If we let $t = T/U$, then we have $T^{\alpha} U^{\beta} \le \alpha T + \beta U$.

(e) Now let $T = F(x)$ and $U = G(x)$ and integrate the last inequality to get

$$\int_a^b F(x)^{\alpha} G(x)^{\beta}\,dx \le \alpha + \beta = 1.$$

(When $p = q = 2$, Hölder’s inequality is sometimes called the Cauchy-Schwarz inequality; see Theorem 8.2.19 on page 282 for another version.)


8. The following continuous version of the triangle inequality follows from the previous exercise.

Minkowski’s Inequality. Suppose that $f$ and $g$ are uniformly continuous on $[a, b]$ and $p \ge 1$. Then

$$\left[\int_a^b |f + g|^p\,dx\right]^{1/p} \le \left[\int_a^b |f|^p\,dx\right]^{1/p} + \left[\int_a^b |g|^p\,dx\right]^{1/p}.$$

We outline the proof.

(a) $|f + g| \le |f| + |g|$, so that the inequality clearly holds for $p = 1$; we will therefore assume that $p > 1$. We may also assume that $f$ and $g$ are non-negative (why?).

(b) Write $(f + g)^p = f(f + g)^{p-1} + g(f + g)^{p-1}$. Now apply Hölder’s Inequality (see exercise 7) to $p$ and $q = \dfrac{1}{1 - \frac{1}{p}} = \dfrac{p}{p - 1}$.

(c) We now get the result by dividing through by $\left[\int_a^b (f + g)^p\,dx\right]^{\frac{p-1}{p}}$.

9. Suppose that $f$ and $g$ are three-times differentiable on $(a, b)$ with $f(c) = g(c) = 0$, but $g'(c) > 0$ ($c \in (a, b)$). Calculate $\left(\dfrac{f}{g}\right)'(c)$. Now, suppose that $f'(c) = g'(c) = 0$ as well, but $g''(c) > 0$; once again calculate $\left(\dfrac{f}{g}\right)'(c)$.

10. In each of the two cases $0 < a < 1$ and $1 < a$, evaluate

$$\lim_{x \to \infty} \left(\frac{a^x - 1}{x(a - 1)}\right)^{1/x}.$$

11. Suppose $f$ is differentiable on $(0, \infty)$ and let $a > 0$. Use the $\infty/\infty$ case of l’Hôpital’s rule (Proposition 6.4.12) to prove

(a) If $\lim_{x \to \infty} \left(a f(x) + f'(x)\right) = L$, then $\lim_{x \to \infty} f(x) = L/a$.

(b) If $\lim_{x \to \infty} \left(a f(x) + 2\sqrt{x}\, f'(x)\right) = L$, then $\lim_{x \to \infty} f(x) = L/a$.

(Hint: For the first part, write $f(x) = \dfrac{e^{ax} f(x)}{e^{ax}}$. A similar trick is used for the second part.)

12. Show that $\lim_{x \to \infty} x\left(1 - \dfrac{(x - 1)^q}{x^q}\right) = q$.

6.5 Integral Tools

We begin by proving two fundamental techniques of integration from calculus. The first is the change of variables formula for integrals, popularly known as integration by substitution or more simply as “u-substitution.”

Proposition 6.5.1 (Change of Variables in Integration) Suppose we have functions $[a, b] \xrightarrow{\;u\;} I \xrightarrow{\;f\;} \mathbb{R}$ with $f$ UC on the finite interval $I$ and $u$ UC and UD on $[a, b]$. Then

$$\int_a^b f(u(t))\,u'(t)\,dt = \int_{u(a)}^{u(b)} f(u)\,du.$$

Proof. Let $\varphi(x) = \int_{u(a)}^{x} f$, so $\varphi' = f$ on $I$. By the chain rule, $(\varphi \circ u)' = \varphi'(u) \cdot u' = f(u) \cdot u'$ on $[a, b]$, so we can write:

$$\int_a^b f(u(t))\,u'(t)\,dt = \int_a^b (\varphi \circ u)' = \varphi(u(b)) - \varphi(u(a)) = \int_{u(a)}^{u(b)} \varphi' = \int_{u(a)}^{u(b)} f(u)\,du.$$
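As a numerical sanity check of Proposition 6.5.1, the sketch below compares the two sides of the change of variables formula. The choices $f = \cos$, $u(t) = t^2$, the interval $[0, 1.5]$, and the helper `simpson` (a plain composite Simpson rule) are all assumptions made for this illustration, not part of the text.

```python
import math

def simpson(func, a, b, n=2000):
    """Composite Simpson rule on [a, b] with an even number n of subintervals."""
    if n % 2:
        n += 1
    h = (b - a) / n
    total = func(a) + func(b)
    for i in range(1, n):
        total += (4 if i % 2 else 2) * func(a + i * h)
    return total * h / 3.0

# Hypothetical example data: f = cos, u(t) = t^2 on [a, b] = [0, 1.5].
f = math.cos

def u(t):
    return t * t

def du(t):
    return 2.0 * t

a, b = 0.0, 1.5

lhs = simpson(lambda t: f(u(t)) * du(t), a, b)   # integral of f(u(t)) u'(t) dt
rhs = simpson(f, u(a), u(b))                     # integral of f(u) du
exact = math.sin(u(b)) - math.sin(u(a))          # antiderivative of cos is sin

print(lhs, rhs, exact)
```

The three printed numbers should agree to many decimal places; any remaining gap is quadrature error, not a failure of the formula.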

Example 6.5.2 Here’s a standard example from calculus ($u = \ln x$):

$$\int_2^3 \frac{(\ln x)^3}{x}\,dx = \int_{\ln 2}^{\ln 3} u^3\,du = \frac{(\ln 3)^4 - (\ln 2)^4}{4}.$$

Just as we obtained the integration by substitution formula by integrating the chain rule, we get the integration by parts formula by integrating the product rule.

Proposition 6.5.3 (Integration by Parts) For $f$ and $g$ differentiable on $[a, b]$, we have

$$\int_a^b f g' = f(b)g(b) - f(a)g(a) - \int_a^b f' g.$$

Proof. By the Product Rule, $f g' = (f g)' - f' g$, so

$$\int_a^b f g' = \int_a^b (f g)' - \int_a^b f' g = f(b)g(b) - f(a)g(a) - \int_a^b f' g.$$
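The integration by parts formula can be checked the same way. This is a minimal sketch under stated assumptions: $f(x) = x^2$ and $g(x) = \sin x$ on $[0, 2]$ are example choices, and `simpson` is an illustrative helper (composite Simpson rule), none of which come from the text.

```python
import math

def simpson(func, a, b, n=2000):
    """Composite Simpson rule on [a, b] with an even number n of subintervals."""
    if n % 2:
        n += 1
    h = (b - a) / n
    total = func(a) + func(b)
    for i in range(1, n):
        total += (4 if i % 2 else 2) * func(a + i * h)
    return total * h / 3.0

# Hypothetical example data: f(x) = x^2, g(x) = sin(x) on [0, 2].
def f(x):
    return x * x

def df(x):
    return 2.0 * x

g = math.sin
dg = math.cos
a, b = 0.0, 2.0

left_side = simpson(lambda x: f(x) * dg(x), a, b)          # integral of f g'
boundary = f(b) * g(b) - f(a) * g(a)                       # f(b)g(b) - f(a)g(a)
right_side = boundary - simpson(lambda x: df(x) * g(x), a, b)

print(left_side, right_side)
```

When $f(a) = f(b) = 0$ the boundary term vanishes, which is exactly the situation of Corollary 6.5.4.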


Corollary 6.5.4 For any $f$ and $g$ differentiable on $[a, b]$ with $f(a) = f(b) = 0$,

$$\int_a^b f g' = -\int_a^b f' g.$$

Functions Defined As Integrals

Example 6.5.5 $\ln(x) = \int_1^x \frac{dt}{t}$. Although we took the approach of defining the exponential function first, and then the logarithm as its inverse, it is possible to start with the definition of the natural log as an integral, and then define the exponential function from it. The advantage of this approach is that you easily get Lipschitz conditions and differentiability from the Law of Bounded Change and the Fundamental Theorem. One also gets the properties of log and exponential functions by a clever interplay of the properties of this particular integral and the fact that any two antiderivatives of the same function differ by a constant. These ideas can be found in many calculus and analysis books; we outline them as a project in exercise 10 at the end of this section.

Example 6.5.6

$$A(x) = \underbrace{\frac{x\sqrt{1 - x^2}}{2}}_{A_1} + \underbrace{\int_x^1 \sqrt{1 - t^2}\,dt}_{A_2}, \quad \text{for } -1 \le x \le 1.$$

Intuitively, this function represents the area within a sector measured counterclockwise from the positive x-axis, inside the unit circle.

[Figure: a sector of the unit circle, divided into the regions $A_1$ and $A_2$.]

One then defines $\pi$ by $\pi = 2A(-1)$. The function $A$ is strictly decreasing (from $A(-1) = \pi/2$ to $A(1) = 0$), so it has an inverse. Since the area of a sector is half its angle and the x-coordinate on the unit circle is the cosine of the angle, we define $\cos(\theta) = A^{-1}(\theta/2)$ for any real $0 \le \theta \le \pi$ (so $\cos(\pi) = -1$), and $\sin(\theta) = \sqrt{1 - (\cos\theta)^2}$. All the remaining properties of the trigonometric functions can be derived from this. A complete account can be found in [Spivak 1994].
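The construction in Example 6.5.6 can be carried out numerically. The sketch below is illustrative only: the midpoint rule, the bisection loop, and the names `area`, `cos_from_area`, and `sin_from_area` are assumptions introduced here, and the accuracy is limited because the integrand has unbounded slope at $t = \pm 1$.

```python
import math

def midpoint(func, a, b, n=20000):
    """Midpoint rule; it never evaluates func at the endpoints a and b."""
    h = (b - a) / n
    return h * sum(func(a + (i + 0.5) * h) for i in range(n))

def half_disc(t):
    """Upper half of the unit circle, sqrt(1 - t^2), guarded against round-off."""
    return math.sqrt(max(0.0, 1.0 - t * t))

def area(x):
    """A(x) = x*sqrt(1 - x^2)/2 + integral from x to 1 of sqrt(1 - t^2) dt."""
    return 0.5 * x * half_disc(x) + midpoint(half_disc, x, 1.0)

# pi is defined as 2*A(-1): twice the area of the upper half disc.
pi_est = 2.0 * area(-1.0)

def cos_from_area(theta, iterations=40):
    """Invert A by bisection: find x in [-1, 1] with A(x) = theta/2.

    A decreases from A(-1) = pi/2 to A(1) = 0, so a value of A(mid)
    above the target means the solution lies to the right of mid.
    """
    target = theta / 2.0
    lo, hi = -1.0, 1.0
    for _ in range(iterations):
        mid = 0.5 * (lo + hi)
        if area(mid) > target:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)

def sin_from_area(theta):
    """sin(theta) = sqrt(1 - cos(theta)^2) for 0 <= theta <= pi."""
    c = cos_from_area(theta)
    return math.sqrt(max(0.0, 1.0 - c * c))

cos_third = cos_from_area(pi_est / 3.0)
print(pi_est, cos_third)
```

Bisection is used because it needs only the monotonicity of $A$, which is the same property the text relies on to obtain the inverse.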


Example 6.5.7 $\arctan(x) = \int_0^x \frac{dt}{1 + t^2}$ for $-\infty < x < \infty$; $\arcsin(x) = \int_0^x \frac{dt}{\sqrt{1 - t^2}}$ for $-1 < x < 1$. Either of these functions can also be used to define all of the trigonometric functions in a way similar to the previous example. See the exercises for further exploration.

Example 6.5.8 $P(x) = \frac{1}{\sqrt{2\pi}} \int_{-\infty}^{x} e^{-t^2/2}\,dt$ for $-\infty < x < \infty$. This function represents a cumulative probability for a normal (Gaussian) distribution. $P(x)$ and its inverse have been studied extensively because of their importance in statistics.

For functions defined in this way, where the variable is in the actual limit of the integral, the Fundamental Theorem applies, and we see that they are continuous and differentiable. So we turn our attention to a different kind of integral, where the limits are constant, but the functions being integrated also depend on a second variable or parameter. We have seen two examples of these already: the Gamma function $\Gamma(x) = \int_0^\infty t^{x-1} e^{-t}\,dt$ (Definition 5.3.25) and the Laplace transform $H(x) = \int_0^\infty e^{-xt} h(t)\,dt$ (Definition 5.3.26); see the exercises on page 183. Note that in each case the variable $t$, being the variable of integration, gets “integrated out,” leaving just a function of the remaining variable $x$. The general case for these kinds of functions looks like the following:

$$F(x) = \int_a^b f(x, t)\,dt.$$

In the examples of the Gamma function and the Laplace transform, the integral is improper ($b = \infty$), but we will begin by looking at the simpler, finite case. We need to be able to integrate the function $f$, so we will need a reasonable condition that allows this.

Definition 6.5.9 The function $f(x, t)$ is said to be uniformly continuous on the set $Q = E \times T = \{(x, t) \mid x \in E \text{ and } t \in T\}$ if for each $\varepsilon > 0$ there is a $\delta > 0$ (depending on $\varepsilon$ and $f$ and the set $Q$) such that $|f(y, s) - f(x, t)| < \varepsilon$ whenever $(x, t), (y, s) \in Q$ and $|y - x| < \delta$ and $|s - t| < \delta$.

It is important to note that this definition of uniform continuity is not the same as saying that $f(x, t)$ is uniformly continuous in each variable separately (holding the other fixed). We are saying that the same $\delta$ works for both variables, and doesn’t depend on the particular value of either.

Proposition 6.5.10 Suppose that $f$ is uniformly continuous on $Q = E \times [a, b]$. Then the function $F(x) = \int_a^b f(x, t)\,dt$ is uniformly continuous on $E$.

Proof. Suppose that $|f(y, s) - f(x, t)| < \varepsilon/(b - a)$ when $|y - x| < \delta$. Then

$$|F(y) - F(x)| \le \int_a^b |f(y, t) - f(x, t)|\,dt \le (b - a)\left(\varepsilon/(b - a)\right) = \varepsilon.$$


Since that was easy, we’ll move directly to differentiation. It turns out that the nicest possible formulation is the one that’s correct, namely that the derivative of an integral is the integral of the derivative. We want to differentiate $F(x) = \int_a^b f(x, t)\,dt$ with respect to $x$, so we first have to define the derivative of $f$ with respect to $x$.

Definition 6.5.11 If the function $f(\cdot, t)$ (for $t$ fixed) is uniformly differentiable on $E$, then we denote its derivative by $f_x(x, t)$. This is usually called the partial derivative with respect to $x$ and is also denoted $\frac{\partial f}{\partial x}$.

Theorem 6.5.12 (Differentiation Under the Integral Sign) Let $f(x, t)$ be uniformly continuous on $Q = [c, d] \times [a, b]$ and uniformly differentiable as a function of $x$ for each $t \in [a, b]$. Suppose also that its derivative $f_x(x, t)$ is uniformly continuous on $Q$. Then $F(x) = \int_a^b f(x, t)\,dt$ is uniformly differentiable on $[c, d]$ and

$$F'(x) = \int_a^b f_x(x, t)\,dt.$$

Proof. Since $f(\cdot, t)$ is differentiable for each $t$,

$$f(y, t) - f(x, t) = f_x(x, t)(y - x) + r(x, y, t)(y - x),$$

where $r(x, y, t) \to 0$ as $y - x \to 0$ (for each $t$). Integrating this equation from $a$ to $b$ gives

$$F(y) - F(x) = \int_a^b f_x(x, t)(y - x)\,dt + \int_a^b r(x, y, t)(y - x)\,dt = \left(\int_a^b f_x(x, t)\,dt\right)(y - x) + \left(\int_a^b r(x, y, t)\,dt\right)(y - x).$$

We must show that the last integral goes to 0 as $y - x \to 0$. Since the interval $[a, b]$ is finite, it suffices to show that $r(x, y, t) \to 0$ as $y - x \to 0$, independent of $t$; i.e., given $\varepsilon > 0$, there is a $\delta > 0$ (not depending on $t$) such that $|r(x, y, t)| < \varepsilon$ when $|y - x| < \delta$. However, we know that

$$|r(x, y, t)| = \left|\frac{f(y, t) - f(x, t)}{y - x} - f_x(x, t)\right|$$

is less than any bound for $|f_x(v, t) - f_x(u, t)|$ on $E$; this is a consequence of LBC (see Corollary 6.3.16). Furthermore, since we have assumed $f_x$ is uniformly continuous in the sense of Definition 6.5.9, we can find a $\delta > 0$ that makes $|f_x(v, t) - f_x(u, t)| < \varepsilon$ when $|y - x| < \delta$.

Corollary 6.5.13 (Fubini’s Theorem for Rectangles) Suppose $f(x, y)$ is uniformly continuous on the rectangle $Q = [a, b] \times [c, d]$. Then

$$\int_c^d \left(\int_a^b f(x, y)\,dx\right) dy = \int_a^b \left(\int_c^d f(x, y)\,dy\right) dx.$$

Proof. See exercises.

So we now know that, under reasonable hypotheses, $F(x) = \int_a^b f(x, t)\,dt$ is uniformly continuous and differentiable for the finite interval $[a, b]$; however, many important examples involve improper integrals, so we now have to see what happens when (for example) $b = \infty$. As usual the continuity case is easiest, so we deal with that first.

We assume that $f(x, t)$ is uniformly continuous on $[\alpha, \beta] \times [a, \infty)$ and that $\int_a^\infty f(x, t)\,dt$ exists for all $x \in [\alpha, \beta]$. Let

$$F(x) = \int_a^\infty f(x, t)\,dt.$$
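Before going further with the improper case, the finite-interval statement of Theorem 6.5.12 can be sanity-checked numerically. This is a sketch under stated assumptions: the integrand $f(x, t) = \sin(xt)$ on $[0, 1]$, the helper `simpson`, and the finite-difference step are all choices made for illustration and do not come from the text.

```python
import math

def simpson(func, a, b, n=400):
    """Composite Simpson rule on [a, b] with an even number n of subintervals."""
    if n % 2:
        n += 1
    h = (b - a) / n
    total = func(a) + func(b)
    for i in range(1, n):
        total += (4 if i % 2 else 2) * func(a + i * h)
    return total * h / 3.0

# Hypothetical example: f(x, t) = sin(x t), so f_x(x, t) = t cos(x t), on [a, b] = [0, 1].
def F(x):
    """F(x) = integral over t in [0, 1] of f(x, t)."""
    return simpson(lambda t: math.sin(x * t), 0.0, 1.0)

def integral_of_partial(x):
    """Integral over t in [0, 1] of f_x(x, t)."""
    return simpson(lambda t: t * math.cos(x * t), 0.0, 1.0)

def central_difference(func, x, h=1e-5):
    """Symmetric difference quotient approximating func'(x)."""
    return (func(x + h) - func(x - h)) / (2.0 * h)

xs = [0.5, 1.0, 2.0]
gaps = [abs(central_difference(F, x) - integral_of_partial(x)) for x in xs]
print(gaps)
```

The printed gaps should be tiny, reflecting that the derivative of the integral agrees with the integral of the partial derivative at each sample point.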

Definition 6.5.14 The integral $\int_a^\infty f(x, t)\,dt$ is said to converge uniformly if for any $\varepsilon > 0$, we can find $R(\varepsilon)$ (independent of $x$) such that

$$\left|F(x) - \int_a^M f(x, t)\,dt\right| \le \varepsilon \quad \text{when } M > R(\varepsilon).$$

Theorem 6.5.15 If the integral $\int_a^\infty f(x, t)\,dt$ converges uniformly, then $F(x) = \int_a^\infty f(x, t)\,dt$ is uniformly continuous.

Proof. Given $\varepsilon > 0$, choose $M > R(\varepsilon/3)$. Then

$$|F(y) - F(x)| \le \left|F(y) - \int_a^M f(y, t)\,dt\right| + \left|F(x) - \int_a^M f(x, t)\,dt\right| + \left|\int_a^M f(y, t)\,dt - \int_a^M f(x, t)\,dt\right|$$

$$\le \varepsilon/3 + \varepsilon/3 + \left|\int_a^M f(y, t)\,dt - \int_a^M f(x, t)\,dt\right|.$$

However, $M$ is finite, so Proposition 6.5.10 tells us that there is a $\delta > 0$ such that $|y - x| < \delta$ makes $\left|\int_a^M f(y, t)\,dt - \int_a^M f(x, t)\,dt\right| \le \varepsilon/3$ as well. Note that while $\delta$ depends on $M$, the number $M$ depends only on $\varepsilon$, so $\delta$ depends only on $\varepsilon$.

So when will $\int_a^\infty f(x, t)\,dt$ converge uniformly? Even though we’re assuming $f(x, t)$ is uniformly continuous, we’re not working over a finite interval, so we don’t automatically get boundedness of $f(x, t)$. So, we will have to impose a boundedness condition on $f$. When we do this, we get what is called a “dominated convergence” type theorem. Here is one version.

Theorem 6.5.16 (Dominated Convergence I) Suppose that $f(x, t)$ is uniformly continuous on $Q = [\alpha, \beta] \times [a, \infty)$ and that there is a function $B(t)$ such that
1. $|f(x, t)| \le B(t)$ for all $(x, t) \in Q$ and
2. $\int_a^\infty B(t)\,dt$ converges.
Then $\int_a^\infty f(x, t)\,dt$ converges uniformly (hence is a UC function of $x$).


Proof. Exercise. (There is a discrete version of this, for series of functions $\sum_{n=0}^{\infty} f_n$, called the Weierstrass M Test; see Proposition 7.1.19 in the next chapter.)

We turn now to the differentiability case, and see that the situation for improper integrals is more delicate than for finite ones. As before, $r(x, y, t) \to 0$ independently of $t$, but we still have to prove that $\int_a^\infty r(x, y, t)\,dt \to 0$ as $y - x \to 0$. Once again a dominated convergence theorem gives us what we need.

Theorem 6.5.17 (Dominated Convergence II) Suppose that for $x, y \in [c, d]$, $r(x, y, t)$ is a uniformly continuous function of $t \in [a, \infty)$ that satisfies the following conditions.
1. $r(x, y, t) \to 0$ as $y - x \to 0$.
2. $|r(x, y, t)| \le B(t)$ for some function $B$ having the property that $\int_a^\infty B(t)\,dt$ exists.
Then $\int_a^\infty r(x, y, t)\,dt$ exists and $\to 0$ as $y - x \to 0$.

Proof. By the integral comparison test (Proposition 5.3.18) and the fact that absolute integrability implies integrability (Corollary 5.3.23), $\int_a^\infty r(x, y, t)\,dt$ exists for each $x$, $y$. Now we have:

$$\left|\int_a^\infty r(x, y, t)\,dt\right| \le \int_a^\infty |r(x, y, t)|\,dt = \int_a^M |r(x, y, t)|\,dt + \int_M^\infty |r(x, y, t)|\,dt \le \int_a^M |r(x, y, t)|\,dt + \int_M^\infty B(t)\,dt.$$

Since $\int_a^\infty B(t)\,dt$ exists, we can find $M > a$ so large that $\int_M^\infty B(t)\,dt < \varepsilon/2$ when $\varepsilon > 0$ is given. Note that this $M$ depends only on $B$, not on $x$ or $y$ or $r$. Once $M$ is fixed in this way, dealing with $\int_a^M |r(x, y, t)|\,dt$ is easy: since $r \to 0$, there is a $\delta > 0$ such that $|r(x, y, t)| < (\varepsilon/2)/(M - a)$ when $|y - x| < \delta$.

Proposition 6.5.18 (Differentiation inside Improper Integrals) Suppose $f(x, t)$ is uniformly continuous on $Q = [c, d] \times I$ for every finite subinterval $I$ of $[a, \infty)$ and uniformly differentiable for each $t \in [a, \infty)$. Suppose also that
1. $f_x(x, t)$ is uniformly continuous on $Q$ and
2. there is a function $B(t)$ such that $|f_x(x, t)| \le B(t)$ and $\int_a^\infty B(t)\,dt$ exists.
Then $F(x) = \int_a^\infty f(x, t)\,dt$ is uniformly differentiable on $[c, d]$ and

$$F'(x) = \int_a^\infty f_x(x, t)\,dt.$$


Proof. As we have seen before (Corollary 6.3.16), bounds for the derivative are bounds for $r$, so we can apply the dominated convergence theorem.

See the exercises at the end of this section for applications of these ideas to the continuity and differentiability of the Gamma function and Laplace transform.

Finally, Dirichlet’s test for convergence of series (Theorem 3.3.38) has an analogue that applies to improper integrals. The proof of the version stated here is not difficult, using just integration by parts, so we leave it as exercise 11 on page 220.

Proposition 6.5.19 (Dirichlet’s Test for Integrals) Suppose that the functions $f$ and $g$ satisfy the following conditions.
1. $f$ is non-negative and uniformly differentiable on every subinterval $[a, b]$ of $[a, \infty)$.
2. $f$ is decreasing, and $\lim_{x \to \infty} f(x) = 0$.
3. $g$ is uniformly continuous on every subinterval $[a, b]$ of $[a, \infty)$.
4. $\left|\int_a^b g\right| < M$ for every $a \le b < \infty$.
Then the improper integral $\int_a^\infty f g$ converges.

This proposition can be proved without the differentiability assumption on $f$. Suggestions for doing this harder variation can be found in exercise 12.

Note that the Dirichlet Test immediately gives the convergence of $\int_1^\infty \frac{\sin x}{x}\,dx$. The integral $\int_0^1 \frac{\sin x}{x}\,dx$ need not be considered improper, since $\frac{\sin x}{x}$ is actually UC on $(0, 1]$, so has a UC extension to $[0, 1]$; thus, the integral $\int_0^\infty \frac{\sin x}{x}\,dx$ converges (its value is $\frac{\pi}{2}$ but this is not easy to prove; see [Spivak, 1994], exercise 19-42 for an outline).
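The convergence of $\int_0^\infty \frac{\sin x}{x}\,dx$ can be observed numerically by computing the partial integrals $\int_0^M \frac{\sin x}{x}\,dx$ for growing $M$. The sketch below is illustrative only: `simpson`, `sinc`, the cutoffs, and the step size are assumptions made here, and the comparison value $\pi/2$ is the one quoted in the text rather than something the code proves.

```python
import math

def simpson(func, a, b, n):
    """Composite Simpson rule on [a, b] with an even number n of subintervals."""
    if n % 2:
        n += 1
    h = (b - a) / n
    total = func(a) + func(b)
    for i in range(1, n):
        total += (4 if i % 2 else 2) * func(a + i * h)
    return total * h / 3.0

def sinc(x):
    """sin(x)/x, extended continuously by the value 1 at x = 0."""
    return 1.0 if x == 0.0 else math.sin(x) / x

def partial_integral(M):
    """Integral of sin(x)/x over [0, M], using 200 Simpson panels per unit length."""
    return simpson(sinc, 0.0, float(M), 200 * M)

cutoffs = [10, 100, 1000]
errors = [abs(partial_integral(M) - math.pi / 2.0) for M in cutoffs]

for M, err in zip(cutoffs, errors):
    print(f"M = {M:5d}   |partial integral - pi/2| = {err:.6f}")
```

The gap shrinks roughly like $1/M$, which matches the integration by parts estimate behind Dirichlet's Test with $f(x) = 1/x$ and $g(x) = \sin x$.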

Exercises

1. Use integration by parts to show that the improper integral $\int_1^\infty \frac{\sin x}{x}\,dx$ exists. (See [Spivak 1994], exercise 19-42 for a proof that $\int_0^\infty \frac{\sin x}{x}\,dx = \frac{\pi}{2}$.)

2. Find functions $f$, $g$ such that $\int_a^\infty f g'$ exists but $\int_a^\infty f' g$ does not. (Hint: See previous exercise.) Thus, the integration by parts formula for improper integrals does not apply in this case; what goes wrong?

3. Let $R_n = \frac{1}{n!} \int_{x_0}^{x} (x - t)^n f^{(n+1)}(t)\,dt$ where $[x_0, x] \subset [a, b]$ and the derivatives $f^{(k)}(t)$, $k = 0, \ldots, n+1$, exist and are UC in $[a, b]$. Find a recursion formula for $R_n$ in terms of $R_{n-1}$. This result can be used to derive the Taylor polynomial expansion with remainder for $f(x)$. (We will discuss this in more detail in the next chapter.)

4. Prove Fubini’s Theorem (Corollary 6.5.13).


5. Prove Theorem 6.5.16 (Dominated Convergence I).

6. The following relate to the Gamma function: $\Gamma(x) = \int_0^\infty t^{x-1} e^{-t}\,dt$.

(a) $\Gamma(x)$ is defined for $x > 0$.

(b) $\lim_{x \to 0,\ x > 0} \Gamma(x) = \infty = \lim_{x \to \infty} \Gamma(x)$.

(c) Use integration by parts to show that $\Gamma(x + 1) = x\Gamma(x)$.

(d) Show $\Gamma(1) = 1$ so that, for $n$ a positive integer, $\Gamma(n) = (n - 1)!$.

(e) Use the previous property to show that for $n$ a positive integer, $\Gamma(n + x) = (n - 1 + x)(n - 2 + x) \cdots (1 + x)(x)\Gamma(x)$.

(f) $\Gamma(x)$ is not defined for $x < 0$, but we can extend it to intervals $-(n + 1) < x < -n$ by using the formula $\Gamma(n + x) = (n - 1 + x)(n - 2 + x) \cdots (1 + x)(x)\Gamma(x)$.

(g) Use the well-known fact that $\int_0^\infty e^{-x^2}\,dx = \frac{\sqrt{\pi}}{2}$ to compute $\Gamma(1/2)$.

(h) For $n$ a positive integer, $\Gamma\left(n + \frac{1}{2}\right) = \sqrt{\pi}\,\dfrac{(2n - 1)!}{2^{2n-1}(n - 1)!}$.

(i) $\Gamma(x)$ is uniformly continuous on any closed interval $[a, b]$ with $a > 0$ (use Dominated Convergence I).

(j) $\Gamma(x)$ is uniformly differentiable on any closed interval $[a, b]$ with $a > 0$ (use Dominated Convergence II).

(k) Draw a sketch of $y = \Gamma(x)$.

7. (Project) (Parts of this exercise repeat exercise 13 on page 183.) For certain functions $h(t)$, the Laplace transform $H$ of $h$ is defined by

$$H(x) = \int_0^\infty e^{-xt} h(t)\,dt.$$

The function $h(t)$ is exponentially bounded if, for some $k > 0$, there is a constant $K > 0$ such that $|h(t)| \le e^{kt}$ for $t > K$.

(a) Prove that exponentially bounded functions have Laplace transforms.

(b) Give the values of $x$ for which $H(x)$ is defined.

(c) Discuss the continuity and differentiability of Laplace transforms of exponentially bounded functions, using the dominated convergence theorems.

(d) Here is a table of some functions and their Laplace transforms. Verify that the functions are exponentially bounded and that their transforms are as given.

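h(t)                 H(x)                                          Domain of H      Technique of proof

$1$                  $\frac{1}{x}$                                 $x > 0$          Straightforward
$t$                  $\frac{1}{x^2}$                               $x > 0$          See next formula, or integrate by parts
$t^{\alpha}$         $\frac{\Gamma(\alpha + 1)}{x^{\alpha + 1}}$   $x > 0$          Let $u = xt$ and see previous exercise on $\Gamma(x)$
$e^{\alpha t}$       $\frac{1}{x - \alpha}$                        $x > \alpha$     Straightforward
$\sin(kt)$           $\frac{k}{x^2 + k^2}$                         $x > 0$          Integration by parts (twice)
$\cos(kt)$           $\frac{x}{x^2 + k^2}$                         $x > 0$          Integration by parts (twice)
$\sinh(kt)$          $\frac{k}{x^2 - k^2}$                         $x > |k|$        $\sinh(kt) = \frac{1}{2}\left(e^{kt} - e^{-kt}\right)$
$\cosh(kt)$          $\frac{x}{x^2 - k^2}$                         $x > |k|$        $\cosh(kt) = \frac{1}{2}\left(e^{kt} + e^{-kt}\right)$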

8. State and prove a version of the Dominated Convergence Theorem that applies to improper integrals of the type $\int_a^{M \to b} r(x, y, t)\,dt$, where $r(x, y, t)$ is uniformly continuous for $x, y \in [c, d]$, $t \in [a, b)$.

9. Use a revised version of the Dominated Convergence Theorem (see above) to prove a version of the theorem about differentiating under the integral sign that applies to improper integrals of the type $\int_a^{M \to b} f(x, t)\,dt$, where $f(x, t)$ is uniformly continuous on $[c, d] \times [a, b']$, for any $b' < b$.

10. (Project) Forget about what we have done with $A^x$ and logs, and define, for any $x > 0$, a “new” function

$$\ln(x) = \int_1^x \frac{1}{t}\,dt.$$

Prove each of the following. If you can’t do some part, you may still use it to do other parts.

(a) $\ln(1) = 0$ and the derivative of $\ln(x)$ is $1/x$.

(b) Show that the derivative of $\ln(ax) - \ln(a) - \ln(x)$ is zero. Thus, this function is constant. Find the constant by letting $x = 1$. We now know $\ln(ab) = \ln(a) + \ln(b)$ (explain).


(c) Use the previous part to show, for $n$ an integer, that $\ln(x^n) = n \ln(x)$ (treat $n > 0$ and $n < 0$ separately).

(d) For $0 < A \le x \le y \le B$, prove that $\ln$ satisfies the Lipschitz condition

$$(1/B)(y - x) \le \ln(y) - \ln(x) \le (1/A)(y - x).$$

(Hint: Use $\ln(y) - \ln(x) = \int_x^y \frac{1}{t}\,dt$ or use the Law of Bounded Change.)

(e) Since $B > 0$ this means that $\ln$ is one-to-one, so $\ln(u) = \ln(v) \iff u = v$.

(f) Prove that $\ln(4) > 1$. (Hint: subdivide the interval $[1, 4]$ into three equal parts and show that $\int_1^4 \frac{1}{t}\,dt \ge 1/2 + 1/3 + 1/4$.)

(g) Using the Inverse Function Theorem, deduce that there is a number, call it $e$, between 1 and 4 such that $\ln(e) = 1$. (Actually, if you feel ambitious, you can improve on this by proving $\ln(2) < 1 < \ln(3)$, so that $2 < e < 3$; you’ll need a finer partition in the previous part.)

(h) Denote by $\exp(x)$ the inverse of $\ln(x)$, so that $\ln(\exp(x)) = x$. Apply the chain rule to deduce that $\exp(x)$ is its own derivative. Note that $\exp(1) = e$.

(i) Prove that $\exp(a + b) = \exp(a) \cdot \exp(b)$. The easiest way is to show that $\ln$ applied to each side gives the same number, then use the fact that $\ln$ is one-to-one.

(j) Use the previous part and $\exp(1) = e$ to show that $\exp(m) = e^m$ for all integers $m \ge 0$ (you can simply explain why it’s true or use induction on $m$).

(k) Similarly, for integers $n > 0$, $(\exp(1/n))^n = e$ so $\exp(1/n) = e^{1/n}$.

(l) Is $\exp(r) = e^r$ for $r > 0$ and rational? (Hint: $\dfrac{a}{b} = \underbrace{\dfrac{1}{b} + \cdots + \dfrac{1}{b}}_{a \text{ copies of } 1/b}$.) What if $r$ is real? Explain. (This is tricky to say rigorously: try to explain it using the approximation of reals by rationals. A good proof would probably invoke the continuity of $\exp$ and the triangle inequality. You can extend $e^r$ to reals using Corollary 4.2.12.)

11. Prove Proposition 6.5.19, Dirichlet’s Integral Test, as stated in the text. The trick is to let $G(x) = \int_a^x g(t)\,dt$ and use integration by parts:

$$\int_a^b f g = f(b)G(b) - f(a)G(a) - \int_a^b f' G.$$

Let $b \to \infty$.

12. It turns out that Dirichlet’s Integral Test also holds even when $f$ is not differentiable; however, the proof is much harder in this case. First, prove Dirichlet’s Test as in the previous exercise (assuming $f$ is differentiable). Now, the idea is to replace the last integral, $\int_a^b f' G$, by

$$\lim_{\substack{K \to \infty \\ \delta \to 0}} \sum_{k=1}^{K} G(x_k)\left(f(x_k) - f(x_{k-1})\right)$$

and the quantity $f(b)G(b) - f(a)G(a)$ by a certain telescoping sum. Here $\{x_k\}_{k=0..K}$ is a partition of the interval $[a, b]$ where the widths $x_k - x_{k-1}$ are all less than $\delta$. Sums of the type $\sum_{k=1}^{K} h(x_k)\left(f(x_k) - f(x_{k-1})\right)$ look like Riemann sums, except that $\Delta_k = x_k - x_{k-1}$ is replaced by $\Delta f_k = f(x_k) - f(x_{k-1})$. Limits of such sums are called Riemann-Stieltjes integrals and are denoted $\int_a^b h\,df$. You will get a formula that amounts to

$$\int_a^b f g = f(b)G(b) - f(a)G(a) - \int_a^b G\,df$$

which looks a lot like the integration by parts from the simpler proof. Now apply the Cauchy criterion to get convergence of $\int_a^{b \to \infty} f g$.

13. Let $\mathrm{SPiL}(x) = \dfrac{\sin \pi x}{\ln x}$ (“SinePiLog”).

(a) Discuss the continuity of $\mathrm{SPiL}(x)$ on the interval $(0, 1)$ and its possible extension to $[0, 1]$; in particular, what are its limits at 0 and 1?

(b) Is $\int_0^1 \mathrm{SPiL}(x)\,dx$ improper?

(c) Discuss the convergence or divergence of
• $\int_0^\infty \mathrm{SPiL}(x)\,dx$
• $\int_0^\infty |\mathrm{SPiL}(x)|\,dx$.

7. SEQUENCES AND SERIES OF FUNCTIONS

7.1 Sequences of Functions

A sequence of functions $(f_n)$ is a generalization of the notion of a sequence of numbers $(a_n)$. Thus, instead of numbers $a_n$, for each $n \in \mathbb{N}$ we will be given some function $f_n : S \to \mathbb{R}$, where $S \subset \mathbb{R}$. Here is a somewhat more formal definition, though none is really needed.

Definition 7.1.1 A sequence of functions on a set $S \subset \mathbb{R}$ is a rule or procedure which associates with each whole number $n \in \mathbb{N}$ a function $f_n : S \to \mathbb{R}$. The sequence is usually denoted $(f_n)$ or even $(f_n(x))$ (as long as this is not confused with the sequence of numbers $(f_n(x))$).

As with sequences of numbers, we are interested in convergence and what properties of the functions are preserved in the limit function when the sequence does converge. The simplest kind of convergence is called pointwise convergence.

Definition 7.1.2 A sequence of functions $(f_n)$ on $S$ is said to converge pointwise to the function $f$ (defined on $S$) if, for each $s \in S$, the sequence of numbers $(f_n(s))$ converges to the number $f(s)$.

Example 7.1.3 For $n = 1, 2, \ldots$, let $f_n(x) = x^n$. Then $f_n(s) \to 0$ for each $s \in S = [0, 1)$. The reason that we stick to the half-open interval $[0, 1)$ is that $f_n(1) = 1$ (not 0) for all $n$.

Example 7.1.4 For $n = 1, 2, \ldots$, let $Q_n(x) = \left(1 + \frac{x}{n}\right)^n$. Then $(Q_n(s)) \to e^s$ for any $s$ (see exercise 6 on page 112).

Example 7.1.5 For $n = 0, 1, \ldots$, let $S_n(x) = \sum_{k=0}^{n} x^k = 1 + x + x^2 + \cdots + x^n$. Then $(S_n(s)) \to \frac{1}{1 - s}$ for each $s \in S = (-1, 1)$; the $S_n(x)$ are the partial sums of a geometric series.

As we shall see, pointwise convergence is not enough to ensure that the limit of uniformly continuous functions be uniformly continuous. We need a stronger condition, and here it is.


Definition 7.1.6 That a sequence of functions $(f_n)$ on $S$ converges to a function $f : S \to \mathbb{R}$ uniformly on $S$ means that for each $\varepsilon > 0$, there is some number $N(\varepsilon)$ such that $|f_n(s) - f(s)| \le \varepsilon$ for all $s \in S$ when $n \ge N(\varepsilon)$. When this is the case, $N(\varepsilon)$ is called a modulus of convergence.

The key point here is that the number $N(\varepsilon)$ such that $|f_n(s) - f(s)| \le \varepsilon$ depends only on $\varepsilon$ and not on any particular value of $s$. This is similar to the situation for uniform continuity, where a modulus of continuity $\delta(\varepsilon)$ (such that $|f(x) - f(y)| \le \varepsilon$ when $|x - y| < \delta(\varepsilon)$) depends only on $\varepsilon$ and not on (say) $x$ or $y$. To say that $f_n(x)$ lies within $\varepsilon$ of $f(x)$ means that the graph of $y = f_n(x)$ lies within a strip of vertical width $2\varepsilon$ centered around the graph of $y = f(x)$. The diagram below shows such a strip and one function $f_1(x)$ that does not quite lie within this strip and another, $f_2(x)$, that does.

[Figure: the graph of $f$ inside a strip of vertical width $2\varepsilon$, together with the graphs of $f_1$ and $f_2$.]

Example 7.1.7 Let $S$ be any closed subinterval of $(-1, 1)$, and let $f_n(x) = \sum_{k=0}^{n} x^k$. Then $f_n(x) \to f(x) = \frac{1}{1 - x}$ uniformly on $S$ (the proof is left for the exercises). Note that the convergence is not uniform on all of $(-1, 1)$, as we will see later.

Definition 7.1.8 A sequence of functions $(f_n)$ is uniformly Cauchy on $S$ if for each $\varepsilon > 0$ there is a whole number $N(\varepsilon)$ such that $|f_n(s) - f_m(s)| \le \varepsilon$ for all $s \in S$ when $m, n > N(\varepsilon)$.

It is clear that if $(f_n)$ is uniformly Cauchy on $S$, then for each $s \in S$ the sequence of numbers $(f_n(s))$ is a Cauchy sequence, so it has a limit (Theorem 3.1.14), which we denote $f(s)$. This definition makes $f$ into a function $S \to \mathbb{R}$. We will soon see that even if $(f_n)$ is not uniformly Cauchy, the sequence of numbers $(f_n(s))$ may form a Cauchy sequence. So, in this case also, we can define $f(s) = \lim (f_n(s))$. This function is called the pointwise limit of the sequence $(f_n)$.

Proposition 7.1.9 If $(f_n) \to f$ uniformly on $S$, then $(f_n)$ is uniformly Cauchy on $S$.

Proof. Exercise.

Proposition 7.1.10 If $(f_n)$ is uniformly Cauchy on $S$, it converges uniformly on $S$ to the pointwise limit function $f(s) = \lim_{n \to \infty} f_n(s)$.

Proof. Since $(f_n)$ is uniformly Cauchy there is a modulus of convergence $N(\varepsilon)$. Let $n$ be any number $> N(\varepsilon)$; the claim is that $|f_n(s) - f(s)| \le \varepsilon$ for any $s \in S$. Let’s hold $n$ fixed and suppose that $m > n$. The triangle inequality gives

$$|f_n(s) - f(s)| \le \underbrace{|f_n(s) - f_m(s)|}_{A} + \underbrace{|f_m(s) - f(s)|}_{B}.$$

Since $m, n \ge N(\varepsilon)$, we know that $A \le \varepsilon$. But we are free to make $m$ as big as we want without changing $n$, so we can use the fact that, for any particular $s \in S$, the sequence of numbers $f_m(s) \to f(s)$. Given any $\varepsilon' > 0$, choose $m$ so big that the quantity $B < \varepsilon'$. So, without changing $|f_n(s) - f(s)|$ or $n$ or $\varepsilon$, we see that $|f_n(s) - f(s)| \le \varepsilon + \varepsilon'$ for any $\varepsilon' > 0$. The Wiggle Lemma now applies to tell us that $|f_n(s) - f(s)| \le \varepsilon$.

This proof seems like magic, so read over the logic carefully! We have actually used this reasoning before: see Lemma 3.1.15 on page 104, which says that the same modulus establishing that a sequence is Cauchy can be used as a modulus of convergence.

As promised, we now show that the uniform limit of uniformly continuous functions is itself UC. The main part of the argument is contained in the following lemmas (see Definition 6.1.2 to review the meaning of $r(x, y) \to 0$ as $y - x \to 0$). Also, $r_n(x, y) \to r(x, y)$ uniformly means what you’d expect: given $\varepsilon > 0$, there is a modulus $N(\varepsilon)$, depending only on $r$ and $\varepsilon$, such that $|r_n(x, y) - r(x, y)| < \varepsilon$ when $n > N(\varepsilon)$ (for all $x, y \in S$).

Lemma 7.1.11 Suppose we have a sequence of functions $r_n(x, y)$, and another function $r(x, y)$, all defined for $x, y \in S$, and satisfying
1. $r_n(x, y) \to 0$ as $y - x \to 0$ on $S$.
2. $r_n(x, y) \to r(x, y)$ uniformly on $S$.
Then $r(x, y) \to 0$ as $y - x \to 0$ on $S$.


Proof. Basically, $r_n(x, y)$ can be made close to $r(x, y)$ and $r_n(x, y)$ close to 0, so it’s the usual adding and subtracting, triangle inequality gambit:

$$|r(x, y)| \le |r(x, y) - r_n(x, y)| + |r_n(x, y)|.$$

The first term on the right can be made small (less than $\varepsilon/2$) by making $n$ sufficiently large, independent of $x$ and $y$, say $n = N(\varepsilon/2)$. But since $r_n(x, y) \to 0$ for this particular $n$, there is a $\delta = \delta(r_n, \varepsilon/2)$ which makes the second term also less than $\varepsilon/2$ as long as $|y - x| < \delta$.

Lemma 7.1.12 If $(f_n) \to f$ uniformly on $S$, then $f_n(y) - f_n(x) \to f(y) - f(x)$ uniformly on $S$.

Proof. We just have to make

$$|(f_n(y) - f_n(x)) - (f(y) - f(x))| = |(f_n(y) - f(y)) + (f(x) - f_n(x))|$$

smaller than $\varepsilon$ by making $n$ sufficiently large. But this is a simple ($\varepsilon/2$) triangle inequality argument.

Theorem 7.1.13 If $f_n \to f$ uniformly on $S$, and each $f_n$ is uniformly continuous on $S$, then $f$ is uniformly continuous on $S$ as well.

Proof. Use the previous two lemmas, taking $r_n(x, y) = f_n(y) - f_n(x)$ and $r(x, y) = f(y) - f(x)$.

We also have a partial converse to this theorem.

Proposition 7.1.14 Suppose $(f_n(x)) \to f(x)$ pointwise on $[a, b]$. Suppose also that each $f_n$ is (weakly) increasing and $f$ is uniformly continuous on $[a, b]$. Then
1. $f$ is weakly increasing on $[a, b]$, and
2. $(f_n) \to f$ uniformly on $[a, b]$.

Proof. See exercises.

Remark 7.1.15 This proposition also holds with the same hypotheses except each function $f_n$ is weakly decreasing.

Recall that uniformly continuous functions extend to a set $S$ from a dense subset $D$ (see Theorem 4.2.11); we can combine this fact with Theorem 7.1.13 to obtain the following result.

Proposition 7.1.16 Suppose $D$ is dense in $S$, each $f_n$ is uniformly continuous on $D$, and $(f_n) \to f$ uniformly on $D$. Let $\bar{f}_n$ and $\bar{f}$ be the (uniformly continuous) extensions of these functions to $S$. Then $\bar{f}_n \to \bar{f}$ uniformly on $S$.


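Proof. Suppose $x$ is in $S$; since $D$ is dense in $S$, we can find a $y$ in $D$ which is arbitrarily close to $x$. Noting that $\bar{f}(y) = f(y)$ since $\bar{f}$ extends $f$, and (similarly) $\bar{f}_n(y) = f_n(y)$, we have

$$\left|\bar{f}(x) - \bar{f}_n(x)\right| \le \left|\bar{f}(x) - \bar{f}(y)\right| + \left|\bar{f}(y) - \bar{f}_n(x)\right| \le \underbrace{\left|\bar{f}(x) - \bar{f}(y)\right|}_{\bar{f} \text{ is UC}} + \underbrace{|f(y) - f_n(y)|}_{f_n \to f \text{ uniformly}} + \underbrace{\left|\bar{f}_n(y) - \bar{f}_n(x)\right|}_{\bar{f}_n \text{ is UC}}.$$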

Given $\varepsilon > 0$, suppose $n > N(\varepsilon/3)$ where $N(\varepsilon/3)$ is a modulus of convergence for $f_n \to f$ on $D$. That makes the middle term above $\le \varepsilon/3$. By choosing $y$ sufficiently close to $x$, uniform continuity of $\bar f$ and $\bar f_n$ makes the other two terms $\le \varepsilon/3$ as well.

Not only are Theorem 7.1.13 and Proposition 7.1.16 useful in proving that certain limit functions are uniformly continuous, their contrapositives can be used to show that certain sequences of functions do not converge uniformly; in other words, if the functions are UC, but their limit is not, then the convergence cannot have been uniform. Here are some examples.

Example 7.1.17 For $n = 1, 2, \ldots$, let $f_n(x) = x^n$. Consider these functions on $S = [0,1]$, and let $D = [0,1)$, a dense subset of $S$. On $D$, $f_n(x) \to 0$. If the convergence on $D$ were uniform, $(f_n)$ would converge uniformly to a function $f$ identically 0 on $D$. Proposition 7.1.16 guarantees that they converge to 0 on $S$ as well. But clearly $f_n(1) = 1$ for all $n$, so this can't happen. There is no continuous function on $S = [0,1]$ which is 0 on $[0,1)$ but takes the value 1 at 1. So $(f_n)$ converges pointwise but not uniformly. Here is a picture of some of these functions. Note that for large $n$, the functions stay close to the $x$-axis over more and more of the interval.

[Figure: graphs of $f_n(x) = x^n$ on $[0,1]$ for $f_1, f_2, f_3, f_5$, and $f_{10}$.]
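As a numerical aside (not part of the text's argument), the following minimal sketch samples $f_n(x) = x^n$ at points of $D = [0,1)$ that crowd toward 1. The grid and the helper name `sup_on_grid` are illustrative assumptions; a finite sample can only suggest, not prove, that the supremum over $D$ fails to shrink.

```python
def sup_on_grid(n, xs):
    """Largest sampled value of |x**n| over the points xs."""
    return max(abs(x ** n) for x in xs)

# Sample points in D = [0, 1) that crowd toward the endpoint 1.
grid = [0.0, 0.5] + [1 - 10.0 ** (-j) for j in range(1, 8)]

# The sampled sup over D stays near 1 for every n tried ...
sup_values = {n: sup_on_grid(n, grid) for n in (1, 10, 100, 1000)}
for n, s in sup_values.items():
    print(n, s)

# ... even though at a fixed point such as x = 0.9 the values go to 0.
pointwise = [0.9 ** n for n in (1, 10, 100, 1000)]
print(pointwise)
```

The contrast between the two printed lists is the numerical shadow of "pointwise but not uniformly".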


Example 7.1.18 Here is another example similar to the one above.
\[ g_n(x) = \begin{cases} -1 & \text{if } x \le -\frac{1}{n} \\ \sin\left(\frac{n\pi x}{2}\right) & \text{if } -\frac{1}{n} \le x \le \frac{1}{n} \\ 1 & \text{if } x \ge \frac{1}{n} \end{cases} \]
As $n$ gets larger, the graph of $g_n$ gets closer and closer to the $y$-axis, while the strip where it is not equal to $1$ or $-1$ gets narrower and narrower. Denote by $g(x)$ the pointwise limit of $g_n(x)$. If $x < 0$, then $g(x) = -1$; if $x > 0$, then $g(x) = 1$, while $g(0) = 0$. Please note that for a given $x$, we may not be able to determine which of these three conditions holds; nevertheless, it is not hard to show that the function $g$ cannot be continuous, even though each $g_n$ is, in fact, differentiable.

[Figure: graphs of $g_1, g_2, g_3, g_4$, and $g_{10}$.]

We conclude with two useful tests for proving convergence of sequences obtained by taking sums of functions. We'll see more of this in Section 7.3 on power series.

Theorem 7.1.19 (Weierstrass M-test) Suppose $f_n$ is a sequence of functions on a set $S \subseteq \mathbb{R}$ and $M_n$ a sequence of non-negative numbers, with $|f_n(x)| \le M_n$ for all $n = 0, 1, \ldots$ and all $x \in S$. If $\sum_{n=0}^{\infty} M_n$ converges then $\sum_{n=0}^{\infty} f_n(x)$ converges uniformly (and absolutely) on $S$.

Proof. Exercise: use the Cauchy criterion for convergence.

Example 7.1.20 $\displaystyle\sum_{n=1}^{\infty} \frac{\cos nx}{n^2}$ converges uniformly and absolutely on all of $\mathbb{R}$.
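To make the M-test concrete for Example 7.1.20, here is a small numerical sketch. It assumes the elementary tail estimate $\sum_{n > N} 1/n^2 \le 1/N$ (by comparison with $\int_N^\infty dt/t^2$), which is not stated in the text; the sample points and the cutoffs `N` and `BIG` are arbitrary illustrative choices.

```python
import math

def partial_sum(x, n_terms):
    """Partial sum of cos(n x)/n^2 for n = 1 .. n_terms."""
    return sum(math.cos(n * x) / n ** 2 for n in range(1, n_terms + 1))

def tail_bound(n_terms):
    """Assumed estimate: sum over n > N of 1/n^2 is at most 1/N."""
    return 1.0 / n_terms

# A spread of sample points; the bound below does not depend on x.
xs = [k * 0.37 for k in range(-50, 51)]
N, BIG = 200, 5000

# Gap between a long partial sum and the N-th one, maximized over the samples.
worst_gap = max(abs(partial_sum(x, BIG) - partial_sum(x, N)) for x in xs)
print(worst_gap, tail_bound(N))
```

The same number `tail_bound(N)` controls the gap at every sampled $x$, which is exactly what uniformity asks for.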


The following is an easy generalization of Dirichlet's convergence test for series of numbers; see Theorem 3.3.38.

Theorem 7.1.21 (Dirichlet's Test for Uniform Convergence) Suppose the sequences of functions $(f_k(x))$ and $(g_k(x))$ satisfy the following conditions on an interval $I$:
1. The partial sums $\sum_{k=0}^{n} f_k(x)$ are bounded by some number $B$.
2. $(g_k) \to 0$ uniformly on $I$.
3. $g_k(x) \ge g_{k+1}(x)$ for all $k$ and all $x \in I$.
Then $\sum_{k=0}^{n} f_k g_k$ converges uniformly on $I$.

Proof. See the outline of the proof of the Dirichlet Test for numbers in exercise 12 on page 129.

Example 7.1.22 $\displaystyle\sum_{k=1}^{\infty} \frac{\sin(kx)}{k}$ converges uniformly on the interval $[\pi/2, 3\pi/2]$. It does not converge uniformly on $[-\pi/2, \pi/2]$ however. See the exercises at the end of this section.

Exercises

1. Let $S$ be any closed subinterval of $(-1,1)$, and let $f_n(x) = \sum_{k=0}^{n} x^k$. Prove that $f_n(x) \to f(x) = \frac{1}{1-x}$ uniformly on $S$ (see Example 3.3.9).

2. Prove that if $f_n \to f$ uniformly on $S$, then $f_n$ is uniformly Cauchy on $S$.

3. The uniform limit of uniformly continuous functions must be uniformly continuous. What does this tell us about the sequence of functions $f_n(x) = \dfrac{1}{1 + (nx-1)^2}$ on the interval $[0,1]$? (Hint: What is the limit function $f(x)$ on $[0,1]$?)

4. Prove that $f_n \to f$ uniformly on $S$ if and only if there is a sequence $(a_n)$ of positive numbers with $a_n \to 0$ such that, for $n$ sufficiently large, $|f_n(x) - f(x)| \le a_n$ for all $x \in S$.

Prove that the sequence of functions $(f_n)$ is Cauchy convergent on $S$ if and only if there is a convergent sequence of numbers $(a_n)$ such that, for all $n$ sufficiently large, $|f_n(x) - f_m(x)| \le |a_n - a_m|$ for all $x \in S$.

5. Discuss the convergence on $[0,1]$ of the following sequences of functions. (Hint: Use calculus to find the max of the functions on $[0,1]$ and apply the previous exercise.)


(a) $f_n(x) = \dfrac{x^2}{x^2 + (nx-1)^2}$

(b) $f_n(x) = x^n(1-x)$

(c) $f_n(x) = n x^n (1-x)$ (Hint: show that the max of $f_n(x)$ on $[0,1]$ approaches $1/e$.)

(d) $f_n(x) = n^3 x^n (1-x)^4$

(e) $f_n(x) = \dfrac{n x^2}{1 + nx}$

(f) $f_n(x) = \dfrac{1}{1 + x^n}$

(These are from [Kaczor&Nowak, 2000].)

6. Discuss the convergence of $f_n(x) = \cos^n x\,(1 - \cos^n x)$ on the interval $[0, \pi/2]$ and on $[\pi/4, \pi/2]$. Use the same ideas as the previous exercise.

7. Prove that $\sum_{n=1}^{\infty} \frac{\sin nx}{n^{\alpha}}$ converges uniformly and absolutely on all of $\mathbb{R}$ when $\alpha \ge 2$.

8. Prove the Weierstrass M-Test (Theorem 7.1.19).

9. Prove Dirichlet's test for uniform convergence (Theorem 7.1.21) using the ideas outlined in exercise 12 on page 129.

10. (Project) The series $\sum_{k=1}^{\infty} \frac{\sin(kx)}{k}$ is an example of a Fourier series: a representation of a function in terms of the family of trig functions $\sin(kx)$ and $\cos(kx)$, $k = 0, 1, 2, \ldots$. Here is a plot of the partial sum $\sum_{k=1}^{10} \frac{\sin(kx)}{k}$. (Note that this has period $2\pi$.)

[Figure: plot of the partial sum $\sum_{k=1}^{10} \frac{\sin(kx)}{k}$.]

(a) $\sum_{k=1}^{\infty} \frac{\sin(kx)}{k}$ converges uniformly on any interval that doesn't contain an integral multiple of $2\pi$. This is a consequence of Dirichlet's test for uniform convergence; see Example 3.3.39 for details.


(b) On the interval $(0, 2\pi)$, the series converges to $f(x) = \frac12(\pi - x)$. It is, in fact, the Fourier series for $\frac12(\pi - x)$. (The function $f$ is defined on $(0, 2\pi)$. We make it periodic by defining $f(x \pm 2\pi k) = f(x)$ for $x$ not in $(0, 2\pi)$.) But the Fourier series for a function doesn't always converge uniformly to that function. However, it does if the function is uniformly differentiable, which $\frac12(\pi - x)$ certainly is; see Theorem 8.3.1. There are similar examples discussed at the end of Section 8.3.

(c) This series does not converge uniformly on any interval containing 0 (or any other integral multiple of $2\pi$): see the plot above. Here is an outline of a proof from [Spivak, 1994].

i. For $x = \pi/N$, $\displaystyle\left|\sum_{k=N}^{2N} \sin kx\right| = \left|\sum_{k=0}^{N} \sin kx\right|$.

ii. Use the calculation in Example 3.3.39 to show $\displaystyle\left|\sum_{k=0}^{N} \sin kx\right| \ge \frac{N}{\pi}$ when $N$ is large. (When $N$ is large, $\sin\left(\frac{(N+1)\pi}{2N}\right)$ is close to 1, so is $\ge \frac12$.)

iii. For large $N$, $\displaystyle\left|\sum_{k=N}^{2N} \frac{\sin kx}{k}\right| \ge \frac{1}{2\pi}$, so the series can't converge uniformly on $[0, 2\pi]$, for example.

(d) The term-by-term derivatives of this series don’t converge at all (use exercise 23 below) so their sum can’t converge either.

11. Prove that if $\sum_{k=0}^{\infty} a_k$ is absolutely convergent, then $\sum_{k=0}^{n} a_k \sin(kx)$ is uniformly convergent on all of $\mathbb{R}$.

12. Discuss the convergence of $\displaystyle\sum_{k=1}^{\infty} (-1)^{k-1}\frac{x^2 + k}{k^2}$ on the interval $[0,1]$. Is it uniform? Absolute?

13. What can you say about the values of the function below (from [Spivak, 1994])?
\[ f(x) = \lim_{n \to \infty} \left( \lim_{k \to \infty} (\cos n!\pi x)^{2k} \right) \]

14. The sequence $s_n = \displaystyle\sum_{k=1}^{n} \frac{(-1)^{k+1} x^k}{k}$ converges pointwise on $(-1, 1]$. Show that this convergence is actually uniform on $[0,1]$.

15. Let $S$ be any closed subinterval of $(-1,1)$, and let $f_n(x) = \sum_{k=0}^{n} x^k$.

(a) Prove that $f_n(x) \to f(x) = \dfrac{1}{1-x}$ uniformly on $S$.


(b) This sequence does not converge at the endpoints $x = -1, 1$. Suppose we replace the partial sums $f_n(x)$ by their averages,
\[ s_M(x) = \frac{1}{M+1} \sum_{n=0}^{M} \underbrace{\left( \sum_{k=0}^{n} x^k \right)}_{f_n(x)}. \]
Now $s_{2K-1}(-1) = \frac12$ and $s_{2K}(-1) = \frac{K+1}{2K+1}$. Show that $s_M(x) \to f(x)$ at least pointwise on $[-1, 1)$.

(c) Is the convergence $s_M(x) \to f(x)$ uniform on $[-1, 1)$?

16. Suppose that $(P_n) \to P$ uniformly on all of $\mathbb{R}$, where the $P_n$ are polynomials. Prove that $P$ is also a polynomial.

17. Suppose that $f_n \to f$ and $g_n \to g$, both uniformly on $S$, and let $\alpha$ and $\beta$ be constants. Show that $\alpha f_n + \beta g_n \to \alpha f + \beta g$ uniformly on $S$.

18. Suppose that $f_n \to f$ and $g_n \to g$, both uniformly on $S$. Show by example that $f_n g_n$ need not converge uniformly to $fg$. Does it converge pointwise?

19. (See previous exercise.) Suppose $f_n \to f$ and $g_n \to g$, both uniformly on $S$, and $|f_n(x)|, |g_n(x)| < M$ for all $n$ and $x \in S$. Then $f_n g_n \to fg$ uniformly on $S$.

20. Suppose $f_n \to f$ and $g_n \to g$, both uniformly on $S$. Find conditions under which $f_n/g_n$ will converge uniformly to $f/g$ on $S$.

21. Prove the following result from the text.

Proposition 7.1.14 Suppose $(f_n(x)) \to f(x)$ pointwise on $[a,b]$. Suppose also that each $f_n$ is (weakly) increasing and $f$ is uniformly continuous on $[a,b]$. Then
(a) $f$ is weakly increasing on $[a,b]$,
(b) $(f_n) \to f$ uniformly on $[a,b]$.

(Hint: $f$ is (weakly) increasing since weak inequalities are preserved in the limit (see Proposition 3.1.10). Since $f$ is UC, given $\varepsilon > 0$, let $\delta = \delta(\varepsilon/2)$ be a modulus of continuity for $\varepsilon/2$. Partition $[a,b]$ by points $a = x_0 < x_1 < \cdots < x_k = b$, where $x_j - x_{j-1} < \delta/2$. Also, choose $N$ such that $|f_n(x_i) - f(x_i)| < \varepsilon/2$ for all $i = 0, \ldots, k$, when $n > N$. For any $x \in [a,b]$, we can find some $j$ such that $|x - x_j| < \delta/2$, so $x \in [x_{j-1}, x_{j+1}]$, and we have
\[ f(x_{j-1}) - \varepsilon/2 \le f_n(x_{j-1}) \le f_n(x) \le f_n(x_{j+1}) \le f(x_{j+1}) + \varepsilon/2. \]
Combine this with the uniform continuity of $f$ to show $|f_n(x) - f(x)| \le \varepsilon$.)


22. The sequence $(\sqrt{n}\cos(nx))$ does not converge. To see this, first note that
\[ \left|\frac{\cos nx}{n}\right| = \frac{1}{n^{3/2}} \left|\sqrt{n}\cos nx\right|. \]
Since $\sum_{n=1}^{\infty} |\cos nx|/n$ diverges, $|\sqrt{n}\cos nx|$ is unbounded. Why? (Incidentally, the oscillations of $\sqrt{n}\cos(nx)$ as $n \to \infty$ must be unbounded, since $\sum_{n=1}^{\infty} (\cos nx)/n$ converges.)

23. If $x$ is not a multiple of $2\pi$, the sequence $(\cos nx)$ does not converge, even pointwise. Here is an outline of a proof from A. Zelevinsky.

Lemma. If the sequence $\cos(nx)$ converges, then $\cos x = 1$.

Proof. First, use the familiar identity $\cos(x \pm y) = \cos x \cos y \mp \sin x \sin y$ to prove that
\[ \cos(2nx) = 2\cos^2(nx) - 1 \]
and
\[ \cos((n+1)x) + \cos((n-1)x) = 2\cos(nx)\cos x. \]
Assume $\lim_{n \to \infty} \cos nx = z$. Passing to the limit in the first formula above gives $z = 2z^2 - 1$, so $|z| > 0$. Passing to the limit in the second formula gives $z = z\cos x$, hence $\cos x = 1$ as desired.

(Note that $\sin(nx)$ doesn't converge either unless $\sin x = 0$.) The formula $\cos(x \pm y) = \cos x \cos y \mp \sin x \sin y$ and other properties of the trig functions are derived in Section 7.5.

7.2 Integrals and Derivatives of Sequences

In this section we ask, "What happens to a convergent sequence of functions $(f_n)$ when we integrate or differentiate it?" This will have important applications in the next section when the sequence consists of partial sums of power series. You will probably notice that there is a strong resemblance between the results and proofs here and those of Section 6.5 when we were looking at integrals and derivatives of a family of functions $f(x,t)$ parametrized by $x$. Here we can think of $n$ as a discrete parameter for the functions $f_n(t)$. The case of integration is the most easily settled.

Proposition 7.2.1 Suppose $f_n \to f$ uniformly on $[a,b]$, with each $f_n$ uniformly continuous on $[a,b]$. Then for any $p, x \in [a,b]$,
1. $\int_p^x f_n \to \int_p^x f$.
2. If $N(\varepsilon)$ is a modulus of convergence for $f_n \to f$, then $N\left(\frac{\varepsilon}{b-a}\right)$ is a modulus of convergence for $\int_p^x f_n \to \int_p^x f$, so this convergence is uniform as convergence of functions on $[a,b]$.

Proof. Exercise; use $g \le M \implies \int_a^b g \le M(b-a)$.
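The estimate behind Proposition 7.2.1 can be checked numerically. The sketch below is an illustration under stated assumptions, not the text's proof: `integral` is a plain midpoint rule, `wiggle(n)` is a sequence $x + \sin(nx)/n$ chosen here only because it converges uniformly to $x$ on $[0,1]$, and `tent(n)` is the piecewise linear function $h_n$ of Example 7.2.2, whose convergence to 0 is not uniform.

```python
import math

def integral(f, a, b, steps=20000):
    """Midpoint-rule approximation of the integral of f over [a, b]."""
    h = (b - a) / steps
    return h * sum(f(a + (i + 0.5) * h) for i in range(steps))

def sup_dist(f, g, a, b, steps=20000):
    """Largest sampled value of |f - g| on [a, b]."""
    pts = (a + i * (b - a) / steps for i in range(steps + 1))
    return max(abs(f(x) - g(x)) for x in pts)

def wiggle(n):
    """Illustrative sequence: x + sin(n x)/n, uniformly close to x."""
    return lambda x: x + math.sin(n * x) / n

def tent(n):
    """The tent function h_n of Example 7.2.2 (height n, base 1/n)."""
    def h(x):
        if x <= 1 / (2 * n):
            return 2 * n * n * x
        if x <= 1 / n:
            return 2 * n - 2 * n * n * x
        return 0.0
    return h

identity = lambda x: x
zero = lambda x: 0.0

for n in (5, 20, 80):
    gap_f = abs(integral(wiggle(n), 0, 1) - integral(identity, 0, 1))
    gap_h = abs(integral(tent(n), 0, 1) - integral(zero, 0, 1))
    # gap_f is controlled by (b - a) * sup|f_n - f|; gap_h does not shrink.
    print(n, gap_f, sup_dist(wiggle(n), identity, 0, 1), gap_h)
```

For the uniformly convergent sequence the integral gap sits below the sampled sup distance, while for the tents the gap stays at about one half no matter how large $n$ is.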

It is important, of course, that $f_n \to f$ uniformly, as the next example shows.

Example 7.2.2 Let
\[ h_n(x) = \begin{cases} 2n^2 x & \text{if } 0 \le x \le 1/(2n) \\ 2n - 2n^2 x & \text{if } 1/(2n) \le x \le 1/n \\ 0 & \text{if } 1/n \le x. \end{cases} \]

[Figure: graphs of $h_1, h_3, h_5, h_7$, and $h_{10}$.]

Here we have a sequence of continuous, piecewise linear functions $h_n$. For each $n$, $\int_0^1 h_n = \frac12$. The sequence $h_n(x) \to 0$ for $x = 0$ and for each $x > 0$; for these $x$, $h_n \to h$ where $h$ is the 0 function. However, we have $\int_0^1 h_n \to \frac12 \ne 0 = \int_0^1 h$. Thus, the convergence cannot be uniform in view of the result we have just proved.

This example is even more interesting. Each $h_n(x)$ is uniformly continuous on $[0,1]$, the limit function $h = 0$ is, of course, uniformly continuous on $[0,1]$, and the convergence is uniform on any interval $[a,b] \subseteq (0,1]$, and $h_n(0) \to h(0) = 0$; nevertheless, the convergence is not uniform on all of $[0,1]$.

Even improper integrals can behave well with respect to convergence.

Proposition 7.2.3 Suppose that the sequence of uniformly continuous functions $f_n$ converges uniformly to $f$ on every finite closed subinterval of $[a, \infty)$. Suppose that $\int_a^{b \to \infty} f_n$ converges uniformly in the sense that, given $\varepsilon$ there is a $C(\varepsilon)$, independent of $n$, such that $\left|\int_M^N f_n\right| \le \varepsilon$ for all $M, N > C(\varepsilon)$. Then $\int_a^{\infty} f$ exists and
\[ \int_a^{\infty} f_n \to \int_a^{\infty} f. \]

Proof. To prove that $\int_a^{\infty} f$ exists, we must show that $\int_a^M f$ is Cauchy. Given $\varepsilon > 0$, choose $M, N > C(\varepsilon)$. Then
\[ \left|\int_M^N f\right| \le \left|\int_M^N (f - f_n)\right| + \left|\int_M^N f_n\right| \le \left|\int_M^N (f - f_n)\right| + \varepsilon. \]
Holding $M, N$ fixed, we can make $n$ so large that $\left|\int_M^N (f - f_n)\right|$ is arbitrarily small; this is because $[M, N]$ is a finite interval and $f_n \to f$ converges uniformly. The Wiggle Lemma now tells us that $\left|\int_M^N f\right| \le \varepsilon$. Thus $\int_a^{\infty} f$ exists, so we compare it with $\int_a^{\infty} f_n$:
\[ \left|\int_a^{\infty} f_n - \int_a^{\infty} f\right| \le \underbrace{\left|\int_a^{\infty} f_n - \int_a^N f_n\right|}_{A} + \underbrace{\left|\int_a^N f_n - \int_a^N f\right|}_{B} + \underbrace{\left|\int_a^N f - \int_a^{\infty} f\right|}_{C}. \]
Suppose we are given $\varepsilon > 0$. By choosing $N$ large enough, we can make $A$ smaller than $\varepsilon/3$ independent of $n$, because of our assumption about the uniform convergence of $\int_a^{\infty} f_n$. Making $N$ even larger if necessary will ensure that $C$ is smaller than $\varepsilon/3$ because we have just shown that $\int_a^{\infty} f$ converges. Thus, we choose some $N$, depending only on $\varepsilon$, which makes both of these two parts $\le \varepsilon/3$. Now we can make $B$ smaller than $\varepsilon/3$ by making $n$ sufficiently large since the interval $[a, N]$ is finite and $f_n \to f$ uniformly (Proposition 7.2.1). Thus, $\left|\int_a^{\infty} f_n - \int_a^{\infty} f\right|$ can be made $\le \varepsilon$ by making $n$ sufficiently large.

As was the case for improper integrals of functions of the type $f(x,t)$ (see Proposition 6.5.16) the uniform convergence assumption about the improper integral $\int_a^{\infty} f_n$ will be satisfied when we have a "dominated convergence" assumption; here's how it goes.

Corollary 7.2.4 (Dominated Convergence III) Suppose that the sequence of uniformly continuous functions $f_n$ converges uniformly to $f$ on every finite closed subinterval of $[a, \infty)$. Suppose that there is a UC function $g$ such that $|f_n| \le g$ for all $n$ and $\int_a^{\infty} g$ exists. Then $\int_a^{\infty} f$ exists and
\[ \int_a^{\infty} f_n \to \int_a^{\infty} f. \]


Proof. Exercise; show that the hypotheses of Proposition 7.2.3 hold.

The situation for differentiability is more complicated. Consider the sequences
\[ f_n(x) = \frac{1}{n}\sin(nx) \quad\text{and}\quad g_n(x) = \frac{1}{\sqrt{n}}\sin(nx). \]
In both cases, the functions are uniformly differentiable on, say, the finite interval $[-\pi, \pi]$ and each sequence converges uniformly to 0 on $[-\pi, \pi]$ as $n \to \infty$. However, it can be shown (see exercises on page 233) that the derivatives in each case don't even converge (except at $x = 0$).

It is clear what is happening here. The functions $f_n$ and $g_n$ oscillate more and more frequently for large $n$ because of the $\sin(nx)$; however, the $n$ or $\sqrt{n}$ in the denominator makes them approach 0 as $n \to \infty$, so it doesn't matter. On the other hand, this rapid oscillation makes the derivatives oscillate just as fast (they include a $\cos nx$), but without the redeeming value of going to 0 in amplitude.

This last example shows that there is not much hope of deducing differentiability from mere convergence of a sequence of functions, even from uniform convergence. As we have remarked before, differentiation tends to make functions worse in terms of behavior while integration has the opposite effect (see Remark 6.3.23). So, we take a different tack, and assume the derivatives converge instead of the original functions. Is that good enough? Well, not quite: suppose $f_n(x) = n$. Then all the derivatives are 0, so they certainly converge, but the $f_n$ certainly don't. We need one more little piece: convergence of $f_n(x)$ for at least one value of $x$. Here is the standard result.

Theorem 7.2.5 Suppose each $f_n$ is uniformly differentiable on a finite interval $I$ and $f_n' \to g$ uniformly on $I$. If we can find some $p \in I$ with $f_n(p) \to L$ (for some $L$), then $f_n$ converges uniformly on $I$ to a function $f$ that is uniformly differentiable on $I$ with $f' = g$.

Proof. One might be tempted to look at the limit of the difference quotients $\dfrac{f_n(y) - f_n(x)}{y - x}$, but this quickly leads to complications; it is much easier to use the Fundamental Theorem. Define
\[ F_n(x) = f_n(p) + \int_p^x f_n', \qquad f(x) = L + \int_p^x g. \]
By the Fundamental Theorem, $F_n$ and $f_n$ differ by a constant; since $F_n(p) = f_n(p)$, $F_n = f_n$ on $I$. Because we have shown that integration preserves uniform convergence (Proposition 7.2.1), we know that $\int_p^x f_n' \to \int_p^x g$ uniformly. Since $f_n(p) \to L$ as a sequence of constants, it is easy to see that $f_n = f_n(p) + \int_p^x f_n' \to f$ uniformly on $I$. Clearly $f' = g$.

Corollary 7.2.6 If $f_n$ is uniformly differentiable on the finite interval $I$, $f_n \to f$, and $f_n' \to g$, both uniformly on $I$, then $f$ is also UD on $I$, with $f' = g$.


Proof. Exercise.
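The warning that precedes Theorem 7.2.5 can be seen numerically for $f_n(x) = \frac{1}{n}\sin(nx)$. This is a rough sketch only: the sample grid on $[-\pi, \pi]$, the test point $x = 1$, and the range of $n$ are arbitrary illustrative choices, and the derivative $\cos(nx)$ is entered by hand.

```python
import math

def f(n, x):
    """f_n(x) = sin(n x)/n."""
    return math.sin(n * x) / n

def df(n, x):
    """Derivative of f_n with respect to x, namely cos(n x)."""
    return math.cos(n * x)

# Sampled sup of |f_n| on [-pi, pi]: it shrinks like 1/n.
xs = [-math.pi + i * (2 * math.pi / 4000) for i in range(4001)]
sup_f = {n: max(abs(f(n, x)) for x in xs) for n in (1, 10, 100)}
print(sup_f)

# Derivative values at the fixed point x = 1 keep wandering over [-1, 1].
deriv_at_1 = [df(n, 1.0) for n in range(1, 200)]
spread = max(deriv_at_1) - min(deriv_at_1)
print(spread)
```

The functions collapse uniformly to 0 while the sampled derivative values at a single point continue to range over nearly the whole of $[-1, 1]$, so no limit of the derivatives is in sight.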

Exercises

1. Prove the following from the text.

Proposition 7.2.1 Suppose $f_n \to f$ uniformly on $[a,b]$ with each $f_n$ uniformly continuous on $[a,b]$. Then for any $p, x \in [a,b]$,
(a) $\int_p^x f_n \to \int_p^x f$.
(b) If $N(\varepsilon)$ is a modulus of convergence for $f_n \to f$, then $N\left(\frac{\varepsilon}{b-a}\right)$ is a modulus of convergence for $\int_p^x f_n \to \int_p^x f$, so this convergence is uniform as convergence of functions on $[a,b]$.

2. Examine the convergence of the sequence of functions $f_n(x) = \dfrac{x}{1 + n^2 x^2}$ and their derivatives, on $I = [-1, 1]$.

3. The sequence of partial sums $S_n = \sum_{k=0}^{n} x^k$ converges uniformly to $\dfrac{1}{1-x}$ on every closed subinterval of $(-1, 1)$.

(a) On closed subintervals of $(-1, 1)$, show that the partial sums $\sum_{k=1}^{n} \frac{x^k}{k}$ converge to $-\ln(1-x)$ uniformly.

(b) Show that this convergence is uniform on closed subintervals of $[-1, 1)$ as well. (Hint: on the left half of this interval you have an alternating series.)

(c) Deduce that $\ln 2 = 1 - \frac12 + \frac13 - \frac14 + \frac15 - \cdots$

(d) $\sum_{k=1}^{n} \frac{x^k}{k} \to -\ln(1-x)$ pointwise on $[-1, 1)$. Why can't this be uniform convergence on all of $[-1, 1)$?

4. Analyze the partial sums $S_n = \sum_{k=0}^{n} (-1)^k x^k$ in a similar way (see previous exercise).

5. Let $E_n(x) = \displaystyle\sum_{k=0}^{n} \frac{x^k}{k!}$.

(a) Show that $(E_n)$ converges uniformly on every finite subinterval of $\mathbb{R}$.

(b) Show that $E_n'(x) = E_{n-1}(x)$, so the derivatives converge as well.

(c) Let $E(x) = \lim(E_n(x))$ and show that $E$ is uniformly differentiable on every finite subinterval of $\mathbb{R}$, with $E'(x) = E(x)$.


(d) Deduce that $E(x) = e^x$. (Hint: Look at $Q(x) = \dfrac{E(x)}{e^x}$ and show that $Q'(x) = 0$ for all $x$.)

This is an example of a power series expansion. We will study these in more detail in the next section.

6. Prove the following corollary to Theorem 7.2.5.

Corollary 7.2.6 If $f_n$ is uniformly differentiable on the finite interval $I$, and $f_n \to f$, and $f_n' \to g$, both uniformly on $I$, then $f$ is also UD on $I$ with $f' = g$.

7. Prove Corollary 7.2.4 (Dominated Convergence III).

8. The Riemann Zeta function is defined by $\zeta(x) = \displaystyle\sum_{n=1}^{\infty} \frac{1}{n^x}$.

(a) Prove this converges pointwise for $x > 1$.

(b) Prove this converges uniformly on closed finite subintervals of $(1, \infty)$.

(c) Calculate $\zeta'(x)$. Where is it defined?

The Zeta function $\zeta(z)$ can also be defined for complex numbers $z$ and plays a very important role in number theory and the distribution of primes; it is the subject of the famous (and unsolved at the time of this writing) Riemann Hypothesis.

9. The length of the diagonal of the unit square is $\sqrt{2}$. On the other hand, you can "approximate" this diagonal by a sequence of staircase-like approximations; see diagram (a) below. If you lay the lower triangle down on the $x$-axis, you get diagram (b) below.

[Figure: diagrams (a) and (b).]

It is clear that the length of any of these staircase approximations is exactly 2. However, as functions whose graphs are shown in diagram (b), they converge uniformly to a function whose graph has length $\sqrt{2}$. Since arc length is given by an integral, why don't the integrals of the staircase approximations approach $\sqrt{2}$?

7.3 Power Series

In this section we look at series that are formally like polynomials having infinite degree. Their limits are called power series. Typically, a power series looks like
\[ S(x) = s_0 + s_1 x + s_2 x^2 + \cdots + s_n x^n + \cdots \]
Before studying the specifics of power series, we can make some general statements about series of functions.

Definition 7.3.1 Let $(f_k(x))$ be a sequence of functions all defined on $S \subseteq \mathbb{R}$. Let $F_n(x) = \sum_{k=0}^{n} f_k(x)$ be their $n$th partial sum. If, for $x \in S$, the sequence $(F_n(x))$ converges to a number (which we'll call $F(x)$), we say that $\sum_{k=0}^{\infty} f_k(x)$ converges pointwise and write $\sum_{k=0}^{\infty} f_k(x) = F(x)$. If the sequence $(F_n)$ converges uniformly on $S$, we say that $\sum_{k=0}^{\infty} f_k$ converges uniformly to $F$ on $S$.

Note that $\sum_{k=0}^{\infty} f_k$ converges uniformly on $S$ to $F$ means that given $\varepsilon > 0$, there is a number $N(\varepsilon)$ such that, for all $x \in S$,
\[ \left| F(x) - \sum_{k=0}^{n} f_k(x) \right| \le \varepsilon \quad\text{when } n > N(\varepsilon). \]
($N(\varepsilon)$ depends only on $\varepsilon$, not on any $x \in S$.)

Proposition 7.3.2 For a sequence of functions $(f_k)$ on $S \subseteq \mathbb{R}$,
1. $\sum_{k=0}^{\infty} f_k(x)$ converges pointwise if and only if given $x \in S$ and $\varepsilon > 0$, there is an integer $N(x, \varepsilon)$ such that $\left|\sum_{k=m}^{n} f_k(x)\right| < \varepsilon$ for all $n \ge m \ge N(x, \varepsilon)$.
2. $\sum_{k=0}^{\infty} f_k(x)$ converges uniformly if and only if given $\varepsilon > 0$, there is an integer $N(\varepsilon)$ such that $\left|\sum_{k=m}^{n} f_k(x)\right| < \varepsilon$ for all $x \in S$ and for all $n \ge m \ge N(\varepsilon)$.

Proof. These are simply restatements of the Cauchy criterion for convergence.

Theorem 7.3.3 (Comparison Test for Function Series) Suppose that for all $x \in S$ and $k = 0, 1, \ldots$, we have $|f_k(x)| \le g_k(x)$. Then
1. $\sum_{k=0}^{\infty} g_k(x)$ converges pointwise $\implies$ $\sum_{k=0}^{\infty} f_k(x)$ converges pointwise.
2. $\sum_{k=0}^{\infty} g_k(x)$ converges uniformly $\implies$ $\sum_{k=0}^{\infty} f_k(x)$ converges uniformly.

Proof. $\left|\sum_{k=m}^{n} f_k(x)\right| \le \sum_{k=m}^{n} |f_k(x)| \le \sum_{k=m}^{n} g_k(x)$; now use the preceding proposition.

We now turn to the special case of power series. The preceding comparison test will be used to compare power series with geometric series; the latter converge uniformly when they converge.

Definition 7.3.4 (Power Series) Let $(s_n)$ be a sequence of numbers. The power series with coefficients $(s_n)$ is the sequence of polynomials $(S_n(x))$ given by
\[ S_n(x) = \sum_{k=0}^{n} s_k x^k. \]
We could have defined the power series in the more usual and less abstract way as an "infinite" polynomial
\[ \sum_{k=0}^{\infty} s_k x^k. \]
The problem is, what sense is to be made of such an infinite sum? However, we will not actually push this abstraction; in fact, we immediately move away from it when the sequence of partial sums $S_n(x)$ converges.

Definition 7.3.5 If for some number $x$, the sequence of partial sums $(S_n(x))$ converges to a number $S(x)$, then we write
\[ S(x) = \lim_{n \to \infty} S_n(x) = \sum_{k=0}^{\infty} s_k x^k. \]

Example 7.3.6 The power series $(G_n(x)) = \left(\sum_{k=0}^{n} a x^k\right)$ is a geometric series and converges uniformly to $\dfrac{a}{1-x}$ for $x$ in $[-r, r]$, with $0 \le r < 1$. In this case we write $\sum_{k=0}^{\infty} a x^k = \dfrac{a}{1-x}$.


Proof. We know the series converges, and the convergence is uniform since, for $|x| \le r$,
\[ \left| G_n(x) - \frac{a}{1-x} \right| = \left| \frac{a(1 - x^{n+1})}{1-x} - \frac{a}{1-x} \right| = |a| \left| \frac{x^{n+1}}{1-x} \right| \le \frac{|a|\, r^{n+1}}{1-r} \to 0 \]
independent of $x$ as $n \to \infty$ (see Corollary 3.2.14).
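The uniform bound in this proof is easy to test numerically. The sketch below is only an illustration: the values `r = 0.9` and `a = 3.0`, the sample grid on $[-r, r]$, and the helper names `G` and `bound` are assumptions made here, and the bound used is $|a|\,r^{n+1}/(1-r)$.

```python
def G(n, x, a=1.0):
    """n-th partial sum of the geometric series: sum of a * x**k, k = 0..n."""
    return sum(a * x ** k for k in range(n + 1))

def bound(n, r, a=1.0):
    """The x-independent error bound |a| r^(n+1) / (1 - r) for |x| <= r."""
    return abs(a) * r ** (n + 1) / (1 - r)

r, a = 0.9, 3.0
xs = [-r + i * (2 * r / 400) for i in range(401)]

# Worst sampled error on [-r, r] next to the bound, for a few n.
results = {}
for n in (10, 50, 100):
    worst = max(abs(G(n, x, a) - a / (1 - x)) for x in xs)
    results[n] = (worst, bound(n, r, a))
    print(n, worst, bound(n, r, a))
```

In each row the worst sampled error stays at or below the bound, and both shrink as $n$ grows.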

Geometric power series converge absolutely and uniformly within their interval of convergence. We will exploit this fact to demonstrate a similar property for power series in general, using the comparison test for series of functions. Thus, the advantage of power series over series of arbitrary functions is that convergence for a particular $x$ gives convergence for smaller $x$; this is the substance of the following central result. (It is called "Convergence in a Disk" because it also holds in the complex plane, the space of complex numbers $z$, where $|z| < r$ describes a disk. This is discussed in Chapter 8.)

Theorem 7.3.7 (Convergence in a Disk)
1. If $S_n(a) = \sum_{k=0}^{n} s_k a^k$ converges and $0 \le r < |a|$, then $S_n(x)$ converges uniformly and absolutely for all $x$ with $|x| \le r$.
2. If $S_n(a)$ does not converge, then neither does $S_n(x)$ for any $x$ with $|x| > |a|$.

Proof. To prove the first part, suppose $S_n(a)$ converges. The key to this whole proof is to note that the terms of $\sum_{k=0}^{n} s_k a^k$ are bounded since the terms of a convergent series must approach 0. Let's write $|s_k a^k| \le M$ for all $k$, so that $|s_k| \le M |a|^{-k}$. If $|x| \le r < |a|$, then $\left|\frac{r}{a}\right| < 1$ and $|x|^k \le r^k < |a|^k$. Therefore
\[ |s_k x^k| \le M |a|^{-k} r^k = M \left|\frac{r}{a}\right|^k. \]
Thus, the series of absolute values is dominated by a convergent geometric series (independent of $x$), so converges uniformly and absolutely by the comparison test stated above.

To prove the second part, suppose that $|x| > |a|$. By the first part, $S_n(x)$ converges implies $S_n(a)$ converges. But $S_n(a)$ doesn't, so neither does $S_n(x)$.

Theorem 7.3.8 (Main Convergence Theorem) Let $(s_n)$ be a sequence of coefficients and $R$ a positive number. For convenience, let $B = 1/R$. The following four conditions are equivalent.
1. For all $x \in (-R, R)$, $s_k x^k \to 0$ as $k \to \infty$.
2. For each $\varepsilon > 0$, there is an integer $N(\varepsilon)$ such that $|s_k| \le (B + \varepsilon)^k$ when $k > N(\varepsilon)$.
3. The series $\sum_{k=0}^{\infty} s_k x^k$ converges for all $|x| < R$.
4. For all $x \in (-R, R)$, $k s_k x^{k-1} \to 0$ as $k \to \infty$.

Proof. We will show that (1) $\implies$ (2) $\implies$ (3) $\implies$ (4) $\implies$ (1).

(1) $\implies$ (2): Given $\varepsilon > 0$, let $x = \frac{1}{B + \varepsilon}$, so $x < \frac{1}{B} = R$. Since $s_k x^k \to 0$, we know that there is some $N(\varepsilon)$ such that $|s_k x^k| = \left|s_k \frac{1}{(B+\varepsilon)^k}\right| \le 1$ whenever $k > N(\varepsilon)$. This implies that $|s_k| \le (B + \varepsilon)^k$.

(2) $\implies$ (3): Suppose $0 \le |x| < R$. We can find $\varepsilon$ so small that $|x| < \frac{1}{B + \varepsilon} < R$, which we write as $(B + \varepsilon)|x| < 1$. Combining this with our assumption about $|s_k|$, we see that $|s_k x^k| \le [(B + \varepsilon)|x|]^k$ for all $k > N(\varepsilon)$. But this is a termwise comparison with a convergent geometric series, so $\sum_{k=0}^{\infty} s_k x^k$ converges by the comparison test (Theorem 3.3.17).

(3) $\implies$ (4): If $\sum_{k=1}^{n} s_k x^k$ converges for $|x| < R$, then $|s_k x^k| \to 0$ as $k \to \infty$. As in the (2) $\implies$ (3) argument, choose $\varepsilon$ so that $|x| < \frac{1}{B + \varepsilon} < R$. This makes $1 < \frac{1}{(B + \varepsilon)|x|}$, so let $\frac{1}{(B + \varepsilon)|x|} = 1 + t$ with $t > 0$. Since $\sum_{k=0}^{\infty} s_k \left(\frac{1}{B + \varepsilon}\right)^k$ converges, choose $N$ large enough so that $\left|s_k \left(\frac{1}{B + \varepsilon}\right)^k\right| < 1$ when $k > N$. Putting these pieces together, we have
\[ \left|k s_k x^{k-1}\right| \le k (B + \varepsilon)^k |x|^{k-1} = \frac{k}{|x|}\,[(B + \varepsilon)|x|]^k = \frac{1}{|x|} \cdot \frac{k}{(1 + t)^k}. \]
However, $\frac{k}{(1+t)^k} \to 0$ as $k \to \infty$ since exponential functions dominate power functions (Proposition 3.2.13).

(4) $\implies$ (1): $s_k x^k = \frac{x}{k}\left(k s_k x^{k-1}\right) \to 0$.

Corollary 7.3.9 If the sequence $(s_k p^k)$ is bounded for some $p$, then $R = |p|$ satisfies the equivalent conditions (1)-(4).

Proof. Say $|s_k p^k| \le M$ for all $k \ge$ some $N$. Then for all $|x| < |p|$ and $k \ge N$, we have $|s_k x^k| \le M \cdot |x/p|^k \to 0$, since $|x/p| < 1$.

Corollary 7.3.10 If for some $p > 0$, $\sum_{k=0}^{N} |s_k| p^k \le M$ for all $N$, then $R = p$ satisfies the equivalent conditions (1)-(4).

Remark 7.3.11 If the equivalent conditions (1)-(4) of the Main Convergence Theorem above hold for all $R > 0$, then we usually write $R = \infty$. In this case, $B = 0$ and the series converges for all numbers $x$. On the other hand, if the series converges only for $x = 0$, we write $R = 0$.


Definition 7.3.12 When $\sum_{k=0}^{n} s_k x^k$ converges for $x$ in some interval $I$, it defines a function
\[ S(x) = \sum_{k=0}^{\infty} s_k x^k : I \to \mathbb{R}. \]

Corollary 7.3.13 Suppose $S(x) = \sum_{k=0}^{\infty} s_k x^k$ on $(-R, R)$. Then $S$ is uniformly differentiable on all closed subintervals $[a,b]$ of $(-R, R)$ with derivative $S'(x) = \sum_{k=0}^{\infty} k s_k x^{k-1}$.

Proof. We know that the last sum converges by the Main Convergence Theorem. The rest follows from Corollary 7.2.6 about uniform convergence of derivatives, proved in the last section. Note also that by the Convergence-in-a-Disk Theorem (7.3.7), all convergence is uniform on any $[-r, r]$ with $0 \le r < R$.

Corollary 7.3.14 Suppose $S(x) = \sum_{k=0}^{\infty} s_k x^k$ on $(-R, R)$. Then $S$ is uniformly continuous on any subinterval of $(-R, R)$ on which $S'$ is bounded; in particular, it is uniformly continuous on any finite closed subinterval $[a,b] \subseteq (-R, R)$.

Example 7.3.15 The functions
\[ 1/(1-x) = 1 + x + x^2 + \cdots \]
\[ 1/(1+x) = 1 - x + x^2 - \cdots \]
are defined on $(-1, 1)$. Neither can be UC on this interval since neither has an extension to all of $[-1, 1]$, but they are both UC on closed subintervals. A similar statement (and argument) holds for uniform differentiability. For more on these series, see Abel's Summation Theorem below.

Corollary 7.3.16 Suppose $f(x) = \sum_{k=0}^{\infty} s_k x^k$ on $(-R, R)$. Then
\[ F(x) = \sum_{k=0}^{\infty} \frac{s_k x^{k+1}}{k+1} \]
is an anti-derivative for $f$ (i.e. $F' = f$) on any closed subinterval of $(-R, R)$.

Definition 7.3.17 Given any sequence $(t_k)$, and an increasing sequence of nonnegative integers $(n(k))$, the sequence $(u_k) = \left(t_{n(k)}\right)$ is said to be a subsequence of the original sequence $t_n$.

Example 7.3.18 Suppose $(t_k) = \left(\frac{1}{k}\right) = \frac11, \frac12, \frac13, \ldots, \frac1k, \ldots$
1. If $n(k) = k^2$, then $\left(t_{n(k)}\right) = \frac11, \frac14, \frac19, \ldots, \frac{1}{k^2}, \ldots$ is a subsequence of $(t_k)$.
2. If $n(k) = 2k - 1$, then $\left(t_{n(k)}\right) = \frac11, \frac13, \frac15, \ldots, \frac{1}{2k-1}, \ldots$ is a subsequence of $(t_k)$.
3. If $n(k) = k^2 - 6k + 1$, then $\left(t_{n(k)}\right)$ is not a subsequence, since $n(k)$ is not increasing: $\left(t_{n(k)}\right) = \frac15, \frac12, \frac11, \frac11, \frac12, \frac15, \ldots$ On the other hand, if we restrict to $k > 3$, then we are OK.


4. If $n(k) = \begin{cases} 1 & \text{if } k \text{ is odd} \\ k & \text{if } k \text{ is even} \end{cases}$, then $\left(t_{n(k)}\right)$ is again not a subsequence: $\left(t_{n(k)}\right) = \frac11, \frac12, \frac11, \frac14, \frac11, \frac16, \ldots$

Remark 7.3.19 It is easily seen that if a sequence converges to a limit $L$, then so does any subsequence. By looking at the last example above you can see why restricting the subscripts $n(k)$ to an increasing sequence of numbers is important for this to hold.

Proposition 7.3.20 Suppose the power series $\sum_{k=0}^{\infty} s_k x^k$ converges on $(-R, R)$. If any subsequence of $\left(|s_k|^{1/k}\right)$ converges to a limit $L$, then $L \le B$ $(= 1/R)$.

Proof. Suppose a subsequence of $|s_n|^{1/n}$ converges to $L > B$. Let $\varepsilon = \frac{L - B}{2}$, so that $L - \varepsilon = B + \varepsilon$. Then for each $n$ there is an $m > n$ such that $|s_m|^{1/m} > L - \varepsilon$. In other words, $|s_m| > (B + \varepsilon)^m$. But this violates part (2) of the Main Convergence Theorem 7.3.8. Thus, $L > B$ is false, so $L \le B$.

Corollary 7.3.21 If any subsequence of $|s_n|^{1/n}$ converges to $L$, then any $R$ that satisfies equivalent conditions (1)-(4) of the Main Convergence Theorem also satisfies $R \le 1/L$.

Corollary 7.3.22 If $|s_n|^{1/n} \to L$, then $R = 1/L$ satisfies the equivalent conditions of the Main Convergence Theorem.

Proof. Part (2) of the theorem is satisfied by $1/R = B = L$.

Definition 7.3.23 Whenever we have a largest value of $R$ for which the equivalent conditions of the Main Convergence Theorem hold, we call it the radius of convergence of the power series.

Remark 7.3.24 It is important to be very clear on the role of the radius of convergence. If $R > 0$ is the radius of convergence of a power series, then
1. For any $0 \le a < R$, the power series converges absolutely and uniformly when $|x| \le a$ (i.e. in $[-a, a]$).
2. The power series diverges for $|x| > R$.
3. When $|x| = R$ (i.e. $x = R$ or $x = -R$), the power series may or may not converge at $x$: see exercises.

Corollary 7.3.25 If $|s_k|^{1/k} \to L$ as $k \to \infty$, then $R = 1/L$ is the radius of convergence of the series $\sum_{k=0}^{\infty} s_k x^k$.

(This follows from the previous two corollaries.)
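As a numerical illustration of Corollary 7.3.25, the sketch below estimates $1/L$ for the coefficients $s_k = 1/(2^k k^2)$ of the series treated with the ratio test in Example 7.3.27. The roots are computed through logarithms to avoid floating-point underflow; that device and the helper name are choices made here, not part of the text.

```python
import math

def kth_root_of_coeff(k):
    """|s_k|**(1/k) for s_k = 1/(2**k * k**2), computed via logarithms."""
    log_abs_s = -k * math.log(2.0) - 2.0 * math.log(k)
    return math.exp(log_abs_s / k)

# |s_k|^(1/k) should drift toward L = 1/2, so 1/|s_k|^(1/k) toward R = 2.
roots = {k: kth_root_of_coeff(k) for k in (10, 100, 1000, 100000)}
radius_estimates = {k: 1.0 / v for k, v in roots.items()}
for k, est in radius_estimates.items():
    print(k, est)
```

The estimates approach 2 slowly from above, consistent with the radius of convergence found by the ratio test in Example 7.3.27.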


Example 7.3.26 Since $\lim_{k\to\infty} k^{1/k} = 1$ (Proposition 3.2.17), the series $\sum_{k=0}^{\infty} k x^k$ converges on $(-1, 1)$. Of course this is nothing new. In fact, you can calculate what this converges to. (What is it?)

(For another calculation of $\lim_{k\to\infty} k^{1/k}$, look at the function $x^{1/x} = e^{\frac{1}{x}\ln x}$. Its derivative is $\frac{1}{x^2}(1 - \ln x)\,x^{1/x}$, which is negative for $x > e$. Thus, $x^{1/x}$ is decreasing for $x > e$. Also, $\lim_{x\to\infty} \frac{\ln x}{x} = 0$ since powers dominate logs (Corollary 3.2.16). We conclude that $k^{1/k}$ decreases to 1 as $k \to \infty$.)

The ratio test (Corollary 3.3.24) is probably the most widely-used test to check for convergence of a power series.

Example 7.3.27 What is the radius of convergence of $\sum_{k=1}^{\infty} \frac{x^k}{2^k k^2}$? The ratio of consecutive terms is

$$\frac{x^{k+1}}{2^{k+1}(k+1)^2} \cdot \frac{2^k k^2}{x^k} = \frac{1}{2} \cdot \frac{k^2}{(k+1)^2} \cdot x,$$

whose limit as $k \to \infty$ is simply $\frac{x}{2}$. Thus, the cut-off for convergence is $|x| = 2$, so the radius of convergence is 2. The series converges when $x = 2$ or $x = -2$, so its interval of convergence is actually $[-2, 2]$, in which it converges absolutely and uniformly.

Once we know that a function has a power series representation on a non-trivial interval, then that representation is unique; in fact, the coefficients $s_k$ are determined by evaluation of the derivatives of the function at 0.

Theorem 7.3.28 (Uniqueness of Power Series) Suppose $S(x) = \sum_{k=0}^{\infty} s_k x^k$ on $(-R, R)$ for some $R > 0$. Then for all $k$,

$$s_k = \frac{S^{(k)}(0)}{k!},$$

where $S^{(k)}$ denotes the $k$th derivative of $S$.

Proof. Induction shows that the $i$th derivative of $x^k$ is $k(k-1)\cdots(k-i+1)\,x^{k-i}$. (When $i = 0$ this is interpreted, as usual, to mean simply $x^k$, and when $i > k$ this becomes simply 0.) Thus, we have (by Corollary 7.3.13)

$$S^{(i)}(x) = \sum_{k=0}^{\infty} k(k-1)\cdots(k-i+1)\,s_k x^{k-i}.$$

All terms with $i > k$ are 0, and all terms with $i < k$ have a positive power of $x$, so they all become 0 if we let $x = 0$. Thus $S^{(k)}(0) = (k!)\,s_k$, which establishes the theorem.

Corollary 7.3.29 If the function $S$ is given by a power series on some interval $(-R, R)$ with $R > 0$, and if all derivatives of $S$ vanish at 0, then $S$ is constant on $(-R, R)$.
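The coefficient formula $s_k = S^{(k)}(0)/k!$ can be checked mechanically on a truncated series. The following is a minimal sketch under stated assumptions: a polynomial stands in for the power series, the function names are hypothetical, and the sample coefficients $s_k = 1/(k+1)$ are chosen only for illustration. Exact rational arithmetic avoids any rounding questions.

```python
from fractions import Fraction
from math import factorial

def derivative(coeffs):
    """Differentiate a polynomial [s_0, s_1, ...] term by term."""
    return [k * coeffs[k] for k in range(1, len(coeffs))]

def kth_derivative_at_zero(coeffs, k):
    """Value at 0 of the k-th derivative: the constant term after k differentiations."""
    for _ in range(k):
        coeffs = derivative(coeffs)
    return coeffs[0] if coeffs else 0

# Hypothetical sample coefficients s_k = 1/(k+1), k = 0..7
s = [Fraction(1, k + 1) for k in range(8)]

# Recover each coefficient as S^(k)(0) / k!
recovered = [kth_derivative_at_zero(s, k) / factorial(k) for k in range(len(s))]
print(recovered == s)
```

Each differentiation multiplies the surviving constant term by one more factor of $k(k-1)\cdots$, which is exactly the $k!$ that the formula divides out.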


There are non-zero functions which have derivatives of all orders, all of which vanish at $x = 0$. Such a function cannot be given by a power series in any interval containing zero, since such a power series would then have to be identically zero (see Example 7.4.10 in the next section).

If we have two functions given by converging power series, we could reasonably ask if their sum, difference, product and quotient are also given by converging power series. If yes, then the coefficients of these series are determined, as we have just seen. Here is the answer to the convergence question.

Theorem 7.3.30 Suppose $S(x) = \sum_{k=0}^{\infty} s_k x^k$ with radius of convergence $R_S$ and $T(x) = \sum_{k=0}^{\infty} t_k x^k$ with radius of convergence $R_T$.
1. $(S \pm T)(x) = \sum (s_k \pm t_k) x^k$ with radius of convergence $\min(R_S, R_T)$.
2. $ST(x) = \sum_{k=0}^{\infty} u_k x^k$ with radius of convergence $\min(R_S, R_T)$, where $u_k = \sum_{j=0}^{k} s_j t_{k-j}$.
3. If $|T(0)| > 0$ (i.e. $|t_0| > 0$), then $1/T$ has a power series expansion converging in some interval $[-\rho, \rho]$ with $0 < \rho \le R_T$.

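Proof. Part 1 is straightforward. For part 2, note that

$$\left(\sum_{k=0}^{N} s_k x^k\right)\left(\sum_{\ell=0}^{M} t_\ell x^\ell\right) = \sum_{k=0}^{N}\sum_{\ell=0}^{M} s_k t_\ell x^{k+\ell}$$

and

$$\sum_{k=0}^{N}\sum_{\ell=0}^{M} \left|s_k t_\ell x^{k+\ell}\right| = \left(\sum_{k=0}^{N} |s_k|\,|x|^k\right)\left(\sum_{\ell=0}^{M} |t_\ell|\,|x|^\ell\right) \le \left(\sum_{k=0}^{\infty} |s_k|\,|x|^k\right)\left(\sum_{\ell=0}^{\infty} |t_\ell|\,|x|^\ell\right).$$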

If $|x| < \min(R_S, R_T)$, then both infinite sums in the last expression are finite. Thus, the product series, and hence its terms, must be bounded, so it converges absolutely by Corollary 7.3.10 of the Main Convergence Theorem. Since the product series converges absolutely, its terms can be rearranged or collected to put it in the form $\sum_{k=0}^{\infty} u_k x^k$ (see Proposition 3.3.45).

The uniqueness of power series expansions for a given function tells us that the power series given by this theorem are the only ones that converge to the given sum, difference and product functions. The situation for the reciprocal is a bit more complicated since it is difficult to write down, explicitly, the formulas for the coefficients of $1/T(x)$. Suppose we write $1/T(x) = v_0 + v_1 x + v_2 x^2 + v_3 x^3 + \cdots$. Then we must have

$$\left(\sum_{k=0}^{\infty} t_k x^k\right) \cdot \left(\sum_{\ell=0}^{\infty} v_\ell x^\ell\right) = 1 + 0 \cdot x + 0 \cdot x^2 + 0 \cdot x^3 + \cdots.$$


From the second part of this theorem, we know what the coefficients of the left-hand product are, so we get the equations

$$\begin{aligned} t_0 v_0 &= 1 \\ t_0 v_1 + t_1 v_0 &= 0 \\ t_0 v_2 + t_1 v_1 + t_2 v_0 &= 0 \\ &\ \ \text{etc.} \end{aligned}$$
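The triangular system above can be solved one unknown at a time, and a short program makes the pattern concrete. This is a sketch only: the function names are hypothetical, a finite coefficient list stands in for the full series, and $T(x) = 1 - x$ is a sample chosen because its reciprocal is the familiar geometric series.

```python
from fractions import Fraction

def cauchy_product(s, t):
    """Coefficients u_k = sum_{j=0}^{k} s_j t_{k-j}, for k < min(len(s), len(t))."""
    n = min(len(s), len(t))
    return [sum(s[j] * t[k - j] for j in range(k + 1)) for k in range(n)]

def reciprocal_coeffs(t, n):
    """First n coefficients v_0..v_{n-1} of 1/T, solving the equations in order."""
    if t[0] == 0:
        raise ValueError("need t_0 != 0")
    v = [Fraction(1) / t[0]]
    for l in range(1, n):
        # t_0 v_l + t_1 v_{l-1} + ... + t_l v_0 = 0, treating t_i as 0 past the end of t
        acc = sum(v[j] * t[l - j] for j in range(l) if l - j < len(t))
        v.append(-acc / t[0])
    return v

# Sample: T(x) = 1 - x, padded with zeros to length 8
t = [Fraction(1), Fraction(-1)] + [Fraction(0)] * 6
v = reciprocal_coeffs(t, 8)
print(v)                     # coefficients of the geometric series
print(cauchy_product(t, v))  # should reproduce 1 + 0x + 0x^2 + ...
```

Multiplying the computed coefficients back against those of $T$ is a useful check that the recursion was set up correctly; it says nothing about convergence, which is the harder question.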

Since $|t_0| > 0$ we can solve the first equation for $v_0$. Putting this into the second equation, we can solve for $v_1$. We can continue in this way, solving successively for each new $v_\ell$ in terms of the $t_k$'s and the previous values of $v_i$. The question is: does this new power series have a positive radius of convergence? Here is an outline of a proof that it does (adapted from [Spivak 1994]).

1. Suppose $T(x) = \sum_{k=0}^{\infty} t_k x^k$ converges for $x$ equal to some positive $r \le R_T$; note that $v_0 = 1/t_0$ and $v_\ell = -\frac{1}{t_0}\sum_{j=0}^{\ell-1} v_j t_{\ell-j}$ (from equations above) for $\ell > 0$.
2. All terms $|t_k r^k|$ are bounded by some number $M$, and we have $|v_\ell r^\ell| \le \frac{M}{|t_0|}\sum_{j=0}^{\ell-1} |v_j r^j|$.
3. Let $N = \max(M, 2)$; then $|v_\ell r^\ell| \le \frac{1}{|t_0|} N^{2\ell}$ (induction on $\ell$).
4. Deduce that $\sum_{\ell=0}^{\infty} v_\ell x^\ell$ converges when $\left|\frac{x}{r}\right|$ is small enough (and positive).

(Another proof of this theorem, for more general power series $\sum_{k=0}^{\infty} s_k (x - a)^k$, can be found in [Strichartz, 1995], p. 288.)

Corollary 7.3.31 Suppose $f(x) = \sum_{k=0}^{\infty} s_k x^k$ on the closed interval $[a, b]$. Then for any $\alpha, \beta \in [a, b]$,

$$\int_\alpha^\beta f(x) = \sum_{k=0}^{\infty} \int_\alpha^\beta (s_k x^k) = \sum_{k=0}^{\infty} \frac{s_k}{k+1}\left(\beta^{k+1} - \alpha^{k+1}\right).$$

Proof. Let $G(x) = \sum_{k=0}^{\infty} \frac{s_k}{k+1}\left(x^{k+1} - \alpha^{k+1}\right) = \sum_{k=0}^{\infty} \frac{s_k}{k+1} x^{k+1} - \sum_{k=0}^{\infty} \frac{s_k}{k+1} \alpha^{k+1}$. (All of these converge by what we have already proved about power series.) Then $G(x)$ and $\int_\alpha^x f$ have the same derivative and agree at $x = \alpha$, so $G(x) = \int_\alpha^x f$.

Example 7.3.32 (Geometric Series) We start with the two series

$$\frac{1}{1-x} = 1 + x + x^2 + x^3 + \cdots$$
$$\frac{1}{1+x} = 1 - x + x^2 - x^3 + \cdots.$$


These both have a radius of convergence of 1 and thus converge in $(-1, 1)$. Integrating term by term gives us

$$-\ln(1 - x) = x + x^2/2 + x^3/3 + x^4/4 + \cdots$$
$$\ln(1 + x) = x - x^2/2 + x^3/3 - x^4/4 + \cdots$$

and by adding, we obtain

$$\ln\left(\frac{1+x}{1-x}\right) = 2x + 2x^3/3 + 2x^5/5 + \cdots.$$

The first of these converges on $[-1, 1)$, the second on $(-1, 1]$, and the last on $(-1, 1)$. Let’s look at the second. It is not difficult to show that it converges uniformly on $[r, 1]$ for any $0 \le r < 1$ (it’s an alternating series). It follows that its limit as $x \to 1^-$ equals its value at $x = 1$, the sum of the alternating harmonic series. But its value for $x < 1$ is $\ln(1 + x)$, a function uniformly continuous on (at least) $[0, 1]$. It follows that the sum of the alternating harmonic series is simply $\ln(2)$. (This was also proved in an exercise to the last section.) We can also obtain a series for $\ln(2)$ by letting $x = \frac{1}{2}$ in the first series. Note that the first of these series for $\ln(2)$ converges much more slowly than the second. (See the exercises at the end of this section for a more detailed discussion.) The advantage of the third series is that it can be used to calculate values of $\ln(y)$ for all $y > 0$ by letting $x = \frac{y-1}{y+1}$.

If we substitute $x^2$ for $x$ in the second geometric series, we get

$$\frac{1}{1 + x^2} = 1 - x^2 + x^4 - x^6 + \cdots = \sum_{i=0}^{\infty} (-1)^i x^{2i}.$$

This converges on $(-1, 1)$; once again, we integrate both sides to obtain

$$\arctan(x) = x - x^3/3 + x^5/5 - x^7/7 + \cdots = \sum_{i=0}^{\infty} (-1)^i \frac{x^{2i+1}}{2i+1}.$$

This now converges on $(-1, 1]$. When $x = 1$, we get the famous but slowly converging series:

$$\arctan(1) = 1 - 1/3 + 1/5 - 1/7 + 1/9 - 1/11 + \cdots.$$

In Section 7.5, where we define $\pi$ and the periodic functions, we will prove that this value is $\pi/4$ (see page 266).

Example 7.3.33 (The exponential function as a power series)

$$\exp(x) = \sum_{k=0}^{\infty} \frac{x^k}{k!} = 1 + \frac{x}{1!} + \frac{x^2}{2!} + \cdots + \frac{x^n}{n!} + \cdots$$


This power series converges uniformly and absolutely on any interval $[-R, R]$, and it is easily checked (term-by-term differentiation) that $\exp(x)$ is its own derivative. We already know that $e^x$ has this property, so form the quotient $Q(x) = \frac{\exp(x)}{e^x}$. $Q'(x) = 0$, so $Q$ is constant; $Q(0) = 1$, so we conclude that $\exp(x) = e^x$. (This was the subject of an exercise in the last section.)

In general, we know that if a power series converges for $x = R > 0$, then it converges uniformly on closed subintervals of $(-R, R)$. On the other hand, it may or may not converge at $x = R$ or $x = -R$. For example, $x - x^2/2 + x^3/3 - x^4/4 + \cdots$ converges at $x = 1$ but not at $x = -1$ and is, in fact, uniformly convergent throughout $[-a, 1]$ for $0 \le a < 1$. Is it always the case that if a power series converges at $x = R > 0$, then it is uniformly convergent in $[-a, R]$ for $0 \le a < R$? The answer is yes, as shown in a famous theorem proved nearly two centuries ago by Niels Abel. We establish first a preliminary result that is useful in itself.

Lemma 7.3.34 (Abel’s Lemma) Suppose $b_1 \ge b_2 \ge \cdots$, with each $b_i \ge 0$. Fix $k$ and suppose also that the partial sums $A_n = \sum_{j=k}^{n} a_j$ (for $n \ge k$) all lie in $[m, M]$. Then the partial sums $\sum_{j=k}^{n} a_j b_j$ all lie in $[m b_k, M b_k]$.

Proof. We use the same trick (sometimes called “Abel summation”) that we used in proving Dirichlet’s Test (3.3.38):

$$\begin{aligned} a_k b_k + \cdots + a_n b_n &= A_k b_k + (A_{k+1} - A_k)\,b_{k+1} + \cdots + (A_{n-1} - A_{n-2})\,b_{n-1} + (A_n - A_{n-1})\,b_n \\ &= A_k (b_k - b_{k+1}) + A_{k+1}(b_{k+1} - b_{k+2}) + \cdots + A_{n-1}(b_{n-1} - b_n) + A_n b_n. \end{aligned}$$

For each $j$, $m \le A_j \le M$, so

$$\begin{aligned} m b_k &= m(b_k - b_{k+1}) + m(b_{k+1} - b_{k+2}) + \cdots + m(b_{n-1} - b_n) + m b_n \\ &\le A_k (b_k - b_{k+1}) + A_{k+1}(b_{k+1} - b_{k+2}) + \cdots + A_{n-1}(b_{n-1} - b_n) + A_n b_n \\ &\le M(b_k - b_{k+1}) + M(b_{k+1} - b_{k+2}) + \cdots + M(b_{n-1} - b_n) + M b_n \\ &= M b_k, \end{aligned}$$

since the first and third sums telescope. Thus, $a_k b_k + \cdots + a_n b_n \in [m b_k, M b_k]$ as required.

Theorem 7.3.35 (Abel’s Summation Theorem) Suppose that the power series $\sum_{k=0}^{\infty} a_k x^k$ converges when $x = 1$. Then it converges uniformly on $[0, 1]$, so that $f(x) = \sum_{k=0}^{\infty} a_k x^k$ is uniformly continuous on $[0, 1]$; in particular, $\lim_{x\to 1^-} f(x) = f(1) = \sum_{k=0}^{\infty} a_k$.

Proof. Exercise (apply Abel’s Lemma with $b_j = x^j$).
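Abel’s Lemma is easy to test on concrete data. The sketch below is an illustration under assumptions, not part of the proof: the function name is hypothetical, indices are 0-based, the sample uses $a_j = (-1)^j/(j+1)$ with weights $b_j = x^j$ for $x = 9/10$, and $m$, $M$ are taken to be the smallest and largest partial sums of the block that starts at index $k$. Exact fractions are used so the inequalities are checked without rounding.

```python
from fractions import Fraction

def abel_bounds_hold(a, b, k):
    """Check that every sum a_k b_k + ... + a_n b_n lies in [m*b_k, M*b_k].

    Here m and M are the min and max of the partial sums a_k + ... + a_n,
    and b is assumed non-negative and weakly decreasing.
    """
    partial, weighted = Fraction(0), Fraction(0)
    partials, weighted_sums = [], []
    for j in range(k, len(a)):
        partial += a[j]
        weighted += a[j] * b[j]
        partials.append(partial)
        weighted_sums.append(weighted)
    m, M = min(partials), max(partials)
    return all(m * b[k] <= w <= M * b[k] for w in weighted_sums)

# Hypothetical sample data: alternating harmonic terms and geometric weights
x = Fraction(9, 10)
a = [Fraction((-1) ** j, j + 1) for j in range(30)]
b = [x ** j for j in range(30)]

print(abel_bounds_hold(a, b, 0), abel_bounds_hold(a, b, 5))
```

The second call starts the block at a later index, which is the situation used when the lemma is applied to tails of a convergent series.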

Corollary 7.3.36 Suppose the power series $\sum_{k=0}^{\infty} a_k x^k$ converges when $x = R$ for some $R > 0$. Then it converges uniformly on $[-a, R]$ for $0 \le a < R$.


Proof. Exercise.

Remark 7.3.37 This provides another proof that the series $\sum_{k=1}^{\infty} \frac{(-1)^{k+1}}{k}$ converges to $\ln 2$.

But there’s more to Abel’s Theorem. Suppose we don’t know whether a numerical series $\sum_{k=0}^{\infty} a_k$ converges. We can make this series into the power series $P(x) = \sum_{k=0}^{\infty} a_k x^k$. If our original series did converge, say $\sum_{k=0}^{\infty} a_k = A$, then Abel’s Theorem tells us that $\lim_{x\to 1^-} P(x) = A$ as well. So we turn this around into a definition.

Definition 7.3.38 (Abel Summability) The series $\sum_{k=0}^{\infty} a_k$ is said to be Abel summable to $L$ if $\sum_{k=0}^{\infty} a_k x^k$ converges for all $x \in [0, 1)$ and $\lim_{x\to 1^-} \sum_{k=0}^{\infty} a_k x^k = L$.

(Note that $x \to 1^-$ means we restrict $x$ to $x < 1$.)
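A numerical experiment can make the definition of Abel summability more tangible. The sketch below is an illustration only: the function name is hypothetical, and the sample series $1 - 2 + 3 - 4 + \cdots$ is not taken from the text. Its terms do not tend to 0, so it diverges, but its associated power series $\sum_{k=0}^{\infty} (-1)^k (k+1) x^k$ equals $1/(1+x)^2$ for $|x| < 1$, which tends to $1/4$ as $x \to 1^-$.

```python
def abel_value(coeff, x, terms):
    """Partial sum of sum_k coeff(k) * x**k with the given number of terms (0 <= x < 1)."""
    total, power = 0.0, 1.0
    for k in range(terms):
        total += coeff(k) * power
        power *= x          # keep a running value of x**k
    return total

def alternating_integers(k):
    # Hypothetical sample series: a_k = (-1)**k * (k + 1), i.e. 1 - 2 + 3 - 4 + ...
    return (-1) ** k * (k + 1)

for x in (0.9, 0.99, 0.999):
    # Enough terms are taken that the neglected tail is negligible for these x
    print(x, abel_value(alternating_integers, x, 50000), 1 / (1 + x) ** 2)
```

The printed values should approach $1/4$ as $x$ gets closer to 1, even though the partial sums of the original series oscillate without settling.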

We can now interpret Abel’s Theorem as saying that if a series converges to $A$, then it is Abel summable to $A$. So Abel summability is most interesting for series that don’t converge, or whose convergence properties are unknown. The nonconverging series $1 - 1 + 1 - 1 + \cdots$ is actually Abel summable (see the exercises). We will use Abel summability in Chapter 8 to discuss the convergence of Fourier series (see page 286).

Of course, it is reasonable to ask, “When does Abel summability imply convergence?” Not always, as $1 - 1 + 1 - 1 + \cdots$ shows, but sometimes. Here is Tauber’s¹ famous partial converse to Abel’s Theorem.

Theorem 7.3.39 (Tauber’s Theorem) Suppose $\sum_{k=0}^{\infty} a_k$ is Abel summable to $L$ and the summands satisfy $\lim_{k\to\infty} k \cdot a_k = 0$. Then $\sum_{k=0}^{\infty} a_k$ converges to $L$.

Proof. Outlined in the exercises.

There is a famous generalization of Tauber’s theorem, due to Hardy and Littlewood. Even though the change in the hypothesis from Tauber’s Theorem seems minor, the proof of the Hardy-Littlewood theorem is so difficult that it is beyond the scope of this book. Here is its deceptively simple statement.

Theorem 7.3.40 (Hardy-Littlewood Tauberian Theorem) Suppose $\sum_{k=0}^{\infty} a_k$ is Abel summable to $L$ and the summands satisfy $k \cdot a_k \le C$, where $C$ is a positive constant. Then $\sum_{k=0}^{\infty} a_k$ converges to $L$.

¹ Alfred Tauber, 1866-1942


There are other summability methods besides Abel summability, and there are variations for integrals instead of sums. Theorems like Abel’s, which assert that convergent series are summable in these other senses, and sum to what they converge to, are now usually called Abelian theorems. Converses (more properly, partial converses), which state that under certain conditions summability implies convergence, are called Tauberian theorems. For a brief review of some of the literature, see [Borwein, 2005].

Exercises

1. Suppose $(f_n)$ is a sequence of functions defined on $S$ such that
(a) $f_n(x) \ge f_{n+1}(x)$ for all $n$.
(b) $f_n \to 0$ uniformly on $S$.
Prove that $\sum_{n=0}^{\infty} (-1)^n f_n(x)$ converges uniformly on $S$. Is the convergence necessarily absolute? Give some examples.

2. Show how the proposition below follows from a similar result about sequences of functions (Theorem 7.2.5).
Proposition. Suppose $(f_n)$ is a sequence of functions differentiable on $[a, b]$ such that
(a) $\sum_{n=0}^{\infty} f_n(c)$ converges for some $c \in [a, b]$.
(b) $\sum_{n=0}^{\infty} f'_n(x)$ converges uniformly on $[a, b]$.
Then $\sum_{n=0}^{\infty} f_n(x)$ converges uniformly on $[a, b]$ to a differentiable function whose derivative is $\sum_{n=0}^{\infty} f'_n(x)$.

3. Suppose $s_k = P(k)$ is a polynomial in $k$. Prove that $\sum_{k=0}^{\infty} s_k x^k$ converges exactly when $|x| < 1$.

4. It was pointed out that $\sum_{k=0}^{\infty} k x^k$ has a radius of convergence of 1. Starting with a geometric series, deduce this radius of convergence and calculate the actual value of this series for $x \in (-1, 1)$.

5. Find the radius of convergence of $\sum_{k=0}^{\infty} \sin(k)\,x^k$ and $\sum_{k=0}^{\infty} \cos(k)\,x^k$. What about convergence at $x = 1, -1$?

6. A hypergeometric series is one where the coefficients of $x^k$ are rational functions of $k$: $s_k = P(k)/Q(k)$. Check online references at http://mathworld.wolfram.com/HypergeometricSeries.html and http://en.wikipedia.org/wiki/Hypergeometric_series.
Prove that $R = 1$ is a radius of convergence for a hypergeometric series. (Hint: Prove that it suffices to show that $\lim_{n\to\infty} |p(n)|^{1/n} = 1$ when $p(n)$ is a polynomial in $n$; see Corollary 7.3.25.)


7. To see what can happen at the boundary of the interval of convergence, consider the series $\sum_{k=0}^{\infty} x^k$, $\sum_{k=1}^{\infty} \frac{x^k}{k^2}$, $\sum_{k=1}^{\infty} \frac{x^k}{k}$, and $\sum_{k=1}^{\infty} \frac{(-x)^k}{k}$. Each has radius of convergence 1 but behaves differently for the values of $x$ with $|x| = 1$.

8. Finish the proof of Abel’s Summation Theorem (7.3.35). As suggested, apply Abel’s Lemma to the sequence $b_j = x^j$, which is weakly decreasing since $x \in [0, 1]$. Since $\sum_{j=0}^{\infty} a_j$ converges, the sums $a_k + a_{k+1} + \cdots + a_n$ lie in $[-\varepsilon, \varepsilon]$ for $k$ sufficiently large, so the same is true for their limits as $n \to \infty$.

9. Use Abel’s Summation Theorem (7.3.35) to prove the following.
Corollary. Suppose the power series $f(x) = \sum_{k=0}^{\infty} a_k x^k$ converges when $x = R$ for $R > 0$. Then it converges uniformly on $[-a, R]$ where $0 \le a < R$.
(Hint: Show that the series $g(y) = \sum_{k=0}^{\infty} \left(a_k R^k\right) y^k$ converges at $y = 1$; apply Abel’s Theorem. Compare $f(x)$ and $g(y)$ and their limits.)

10. Suppose $\sum_{k=0}^{\infty} a_k$ and $\sum_{k=0}^{\infty} b_k$ are convergent series of numbers. Their Cauchy product is defined to be $\sum_{k=0}^{\infty} c_k$, where

$$c_k = (a_0 b_k + a_1 b_{k-1} + \cdots + a_j b_{k-j} + \cdots + a_k b_0) = \sum_{j=0}^{k} a_j b_{k-j}.$$

If $\sum_{k=0}^{\infty} c_k$ exists, show that $\left(\sum_{k=0}^{\infty} a_k\right)\left(\sum_{k=0}^{\infty} b_k\right) = \sum_{k=0}^{\infty} c_k$. (Hint: Look at the associated power series $\sum a_k x^k$ etc. and use Theorem 7.3.35; remember how polynomials are multiplied; see also the proof of Theorem 7.3.30.)

11. Using the ideas in the proof of Theorem 7.3.30, show that if $\sum_{k=0}^{\infty} a_k$ and $\sum_{k=0}^{\infty} b_k$ converge absolutely, then so does $\sum_{k=0}^{\infty} c_k$ where $c_k = (a_0 b_k + a_1 b_{k-1} + \cdots + a_j b_{k-j} + \cdots + a_k b_0)$. Thus $\left(\sum_{k=0}^{\infty} a_k\right)\left(\sum_{k=0}^{\infty} b_k\right) = \sum_{k=0}^{\infty} c_k$ by the previous exercise.

12. (From [Spivak 1994]): Show that the series

$$\sum_{n=0}^{\infty} \left(\frac{x^{2n+1}}{2n+1} - \frac{x^{n+1}}{2n+2}\right)$$

converges uniformly to $\frac{1}{2}\ln(x + 1)$ on $[-a, a]$ for $0 < a < 1$, but that at $x = 1$ it converges to $\ln 2$. (Hint: This is a rearrangement of the difference of two well-known log series. At $x = 1$ write out some terms of $\sum_{n=0}^{\infty} \left(\frac{1}{2n+1} - \frac{1}{2n+2}\right)$.) Why is this not a violation of Abel’s Summation Theorem?

13. Find and prove a generalization of Abel’s Summation Theorem (7.3.35) that applies to series of the form $\sum_{k=0}^{\infty} a_k f_k(x)$.

14. As was mentioned in the text, the series $1 - 1 + 1 - 1 + \cdots$ is Abel summable: show this and find its Abel limit.


15. Here is an outline of the proof of Tauber’s Theorem (7.3.39). First, prove the following.

Lemma. If $u \ge 0$, then for every positive integer $k$, $1 - u^k \le k(1 - u)$.

Lemma. If the sequence $(a_n)$ converges to $a$ then so does the sequence of averages $\left(\frac{1}{n}\sum_{k=1}^{n} a_k\right)$.

(Induction works for the first of these; the second was an exercise on page 107.)

Here are the hypotheses (and notation) for Tauber’s Theorem.
(a) $P(x) = \sum_{k=0}^{\infty} a_k x^k$ converges for $|x| < 1$.
(b) $\lim_{x\to 1^-} P(x) = L$.
(c) $\lim_{k\to\infty} k\,a_k = 0$.

We want to make $\sum_{k=0}^{n} a_k$ within $\varepsilon$ of $L$ by making $n$ larger than some $N$.

$$\left|\sum_{k=0}^{n} a_k - L\right| \le \underbrace{\left|\sum_{k=0}^{n} a_k - \sum_{k=0}^{n} a_k x^k\right|}_{S_1} + \underbrace{\left|\sum_{k=0}^{n} a_k x^k - P(x)\right|}_{S_2} + \underbrace{|P(x) - L|}_{S_3}.$$

Note that the last term, $S_3$, has no $n$ in it; we remedy this by letting $x = 1 - 1/n$. The idea now is to show that each of the three terms can be made less than $\varepsilon/3$ by making $n$ bigger than numbers $N_1$, $N_2$, $N_3$; we then let $N = \max(N_1, N_2, N_3)$.

By the first lemma,

$$S_1 \le \left|\sum_{k=0}^{n} a_k \left(1 - (1 - 1/n)^k\right)\right| \le (1/n)\sum_{k=1}^{n} k\,|a_k|,$$

and this last quantity can be made $< \varepsilon/3$ by the second lemma.

For $S_2$, choose $N_2$ so that $n\,|a_n| < \varepsilon/3$ for $n > N_2$; then $S_2 = \left|\sum_{k=n+1}^{\infty} a_k x^k\right| \le \frac{\varepsilon}{3n}\sum_{k=n+1}^{\infty} x^k$. This last is a geometric series which is $\le \varepsilon/3$ when $x = 1 - 1/n$.

Finally, $P(1 - 1/n) \to L$ as $n \to \infty$, so $S_3$ can be made small when $n$ is sufficiently large.

7.4 Taylor Series

Introduction

In this section, we will use a slight generalization of the kind of series we have been discussing. Instead of sums of terms in powers of $x$, we will consider sums of terms in powers of $(x - c)$ for a fixed number $c$. Everything we have said up to now has an analogue if we replace $x$ by $x - c$, including the notions of convergence and radius of convergence. For example, consider the following corollary to Abel’s Summation Theorem.

Corollary 7.3.36 Suppose the power series $\sum_{k=0}^{\infty} a_k x^k$ converges when $x = R$ for some $R > 0$. Then it converges uniformly on $[-a, R]$ for $0 \le a < R$.

We can rewrite it by replacing $x$ by $x - c$ to obtain the following.

Corollary 7.4.1 (Centered at $c$) Suppose $\sum_{k=0}^{\infty} a_k (x - c)^k$ converges when $x = c + R$ for some $R > 0$. Then it converges uniformly on $[c - a, c + R]$ for $0 \le a < R$.

Proof. When $x = c + R$, the hypothesis translates into the statement that $\sum_{k=0}^{\infty} a_k R^k$ converges. By the first version of the corollary, $\sum_{k=0}^{\infty} a_k x^k$ converges uniformly for $-a \le x \le R$. Replacing $x$ by $x - c$, we see that $\sum_{k=0}^{\infty} a_k (x - c)^k$ converges uniformly for $-a \le x - c \le R$, i.e. $c - a \le x \le c + R$.

This proof is typical of how much of the theory already proved for power series of the form $\sum_{k=0}^{\infty} a_k x^k$ translates easily into the theory for series of the form $\sum_{k=0}^{\infty} a_k (x - c)^k$. Here is another simple rewrite of a result stated previously for powers of $x$: Theorem 7.3.28 on the uniqueness of power series representations.

Theorem 7.4.2 (Uniqueness of power series) Suppose

$$S(x) = \sum_{k=0}^{\infty} s_k (x - c)^k$$

converges on $(c - R, c + R)$ for some $R > 0$. Then for all $k$,

$$s_k = \frac{S^{(k)}(c)}{k!},$$

where $S^{(k)}$ denotes the $k$th derivative of $S$.

To prove this we could simply redo the proof already given in the case $c = 0$ (see Theorem 7.3.28), but in order to see the translation from “centered at 0” to “centered at $c$,” it would be better to use a method with more general applicability. If we let $y = x - c$, then as $x$ takes on values in $(c - R, c + R)$, $y$ takes on values in $(-R, R)$. The series $S(x) = \sum_{k=0}^{\infty} s_k (x - c)^k$ becomes $\tilde{S}(y) = \sum_{k=0}^{\infty} s_k y^k$, which then converges for $y \in (-R, R)$. By the uniqueness theorem for series centered at 0 (Theorem 7.3.28), the coefficients of $\tilde{S}(y)$ are given by $s_k = \frac{\tilde{S}^{(k)}(0)}{k!}$. Since $\tilde{S}(y) = \tilde{S}(x - c) = S(x)$,

$$\frac{d(\tilde{S}(y))}{dx} = \frac{d(\tilde{S}(y))}{dy} \cdot \frac{dy}{dx}$$

and by the chain rule (note that $\frac{dy}{dx} = 1$),

$$\frac{d(S(x))}{dx} = \frac{d(\tilde{S}(y))}{dx} = \left.\frac{d(\tilde{S}(y))}{dy}\right|_{y = x - c}.$$

In other words, $S'(x) = \tilde{S}'(y)$ (the first as a function of $x$, the second as a function of $y$). Inductively, this holds for the higher derivatives as well: $S^{(k)}(x) = \tilde{S}^{(k)}(y)$. When $x = c$, $y = 0$, so

$$s_k = \frac{\tilde{S}^{(k)}(0)}{k!} = \frac{S^{(k)}(c)}{k!}.$$

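Just as we used the chain rule in the previous argument, we can use the change of variable formula for integrals to reformulate Corollary 7.3.31 of the previous section on definite integrals of series.

Corollary 7.4.3 Suppose $f(x) = \sum_{k=0}^{\infty} s_k (x - c)^k$ on the closed interval $[a, b]$. Then, for any $\alpha, \beta \in [a, b]$,

$$\int_\alpha^\beta f(x) = \sum_{k=0}^{\infty} \int_\alpha^\beta \left(s_k (x - c)^k\right) = \sum_{k=0}^{\infty} \frac{s_k}{k+1}\left((\beta - c)^{k+1} - (\alpha - c)^{k+1}\right).$$

Proof. Once again, we let $y = x - c$ and look at $\tilde{f}(y) = \sum_{k=0}^{\infty} s_k y^k$, which converges for $y \in [a - c, b - c]$. By Corollary 7.3.31, $\int_{\alpha - c}^{\beta - c} \tilde{f}(y)\,dy = \sum_{k=0}^{\infty} s_k \int_{\alpha - c}^{\beta - c} y^k\,dy$. We can now use the change of variables theorem for integrals (Proposition 6.5.1) with $x = y + c$ to deduce

$$\int_{\alpha - c}^{\beta - c} \tilde{f}(y)\,dy = \int_\alpha^\beta f(x)\,dx$$

and

$$\int_{\alpha - c}^{\beta - c} y^k\,dy = \int_\alpha^\beta (x - c)^k\,dx.$$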

In the previous discussion, the function $f(x)$ was defined as a power series. We now turn around the situation and start with a function $f(x)$ and ask if it can be computed as the limit of a power series. By Theorem 7.4.2, we know what the coefficients of this series must be.

Definition 7.4.4 (Taylor polynomials) Suppose the function $f$ is $n$-times differentiable on the interval $(a, b)$, and $c \in (a, b)$. For $k = 0, \ldots, n - 1$, define the Taylor polynomials for $f$, centered at $c$, by

$$T_0(f, x, c) = f(c)$$
$$T_{k+1}(f, x, c) = T_k(f, x, c) + \frac{f^{(k+1)}(c)}{(k+1)!} \cdot (x - c)^{k+1}.$$
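The recursive definition translates directly into a loop. The following is a minimal sketch, assuming the derivative values $f(c), f'(c), \ldots, f^{(n)}(c)$ are supplied by the caller; the function name is hypothetical, and the sample uses $f = \exp$ at $c = 0$, where every derivative equals 1.

```python
import math

def taylor_polynomials(derivs_at_c, c, x):
    """Return [T_0(x), ..., T_n(x)] from derivs_at_c = [f(c), f'(c), ..., f^(n)(c)]."""
    values = [derivs_at_c[0]]          # T_0 = f(c)
    power, fact = 1.0, 1.0             # running values of (x - c)^k and k!
    for k in range(len(derivs_at_c) - 1):
        power *= (x - c)
        fact *= (k + 1)
        # T_{k+1} = T_k + f^(k+1)(c) / (k+1)! * (x - c)^(k+1)
        values.append(values[-1] + derivs_at_c[k + 1] / fact * power)
    return values

# Sample: f = exp, c = 0, evaluated at x = 1
T = taylor_polynomials([1.0] * 11, 0.0, 1.0)
print(T[:4], T[10], math.e)
```

Keeping running values of $(x - c)^k$ and $k!$ mirrors the recursion: each new polynomial is the previous one plus a single correction term.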


In practice, we will be dealing with only one function $f$ and only one real number $c$ at a time, so we will adopt the more compact notation $T_n(x) = T_n(f, x, c)$. We will always assume that $f$ is differentiable enough times so that the particular Taylor polynomials we are working with make sense.

$$T_n(x) = \sum_{k=0}^{n} \frac{f^{(k)}(c)}{k!} \cdot (x - c)^k$$

Proposition 7.4.5 $T_n(x)$ is the unique polynomial of degree $n$ in $(x - c)$ with the property that its $k$th derivative at $c$ equals the $k$th derivative of $f$ at $c$ for $k = 0, \ldots, n$.

Corollary 7.4.6 Let $R_n(x) = f(x) - T_n(x)$. Then $R_n(c) = R_n'(c) = \cdots = R_n^{(n)}(c) = 0$.

Recall (Definition 5.2.7) that functions $g$ and $h$ don’t cross in an interval $I$ when $g \le h$ on all of $I$ or $h \le g$ on all of $I$.

Lemma 7.4.7 If $c, x \in (a, b)$, the functions $g$ and $h$ don’t cross in $(a, b)$, and $K$ is any constant, then the functions $K g(t) \cdot (t - c)$ and $K h(t) \cdot (t - c)$ don’t cross in $[\min(c, x), \max(c, x)]$.

The proof is a tedious though straightforward analysis of the several cases.

Theorem 7.4.8 (Taylor polynomials with remainder) Suppose the function $f$ is $(n + 1)$-times differentiable on the interval $E$ and $x, c \in E$. Suppose further that $f^{(n+1)}(t)$ lies between $m$ and $M$ for all $t \in E$. Then the remainder, or error term, $R_n(f, x, c) = f(x) - T_n(f, x, c)$ lies between $m(x - c)^{n+1}/(n + 1)!$ and $M(x - c)^{n+1}/(n + 1)!$.

Proof. Replacing $f$ by $f - T_n$, we preserve our hypotheses, and so we can assume $f(c) = f'(c) = \cdots = f^{(n)}(c) = 0$ (the higher derivatives remain the same). We now have to show that $f(x)$ lies between the stated fractions. Proceed by induction on $n$. The function $f'(t)$ is $n$-times differentiable and its $n$th derivative lies between $m$ and $M$. By assumption ($n = 0$), or induction hypothesis, $f'(t)$ lies between $\frac{m(t - c)^n}{n!}$ and $\frac{M(t - c)^n}{n!}$. By the preceding lemma, these last functions don’t cross on $[\min(x, c), \max(x, c)]$, so we can use our result on functions that don’t cross (Proposition 5.2.9) to conclude that the integrals preserve betweenness. More precisely, $\int_c^x f'(t)\,dt = f(x) - f(c) = f(x)$ lies between

$$\frac{m(x - c)^{n+1}}{(n+1)!} = \int_c^x \frac{m(t - c)^n}{n!}\,dt \quad\text{and}\quad \frac{M(x - c)^{n+1}}{(n+1)!} = \int_c^x \frac{M(t - c)^n}{n!}\,dt.$$

Definition 7.4.9 If the sequence of Taylor polynomials $T_n(f, x, c)$ converges for $-R < x - c < R$, then

$$\lim_{n\to\infty} T_n(f, x, c) = \sum_{i=0}^{\infty} \frac{f^{(i)}(c)(x - c)^i}{i!}$$

is called the Taylor series for $f$ around $x = c$, denoted $T(f, x, c)$.


Even if the Taylor series converges, it may not converge to the function $f$ itself. Here is the classic example.

Example 7.4.10 Let $f(x) = e^{-1/x^2}$. This function is uniformly continuous for $|x| > 0$, and $f(x) \to 0$ as $x \to 0$. Thus, $f$ can be extended to all of $\mathbb{R}$ with $f(0) = 0$. Computing a few derivatives of $f$ reveals that, for $|x| > 0$, each derivative is of the form $\frac{P(1/x)}{e^{1/x^2}} = \frac{P(u)}{e^{u^2}}$, where $P$ is a polynomial and $u = 1/x$. As $x \to 0$, $|u| \to \infty$. Since exponential functions dominate polynomials (Proposition 3.2.15), it is easily checked that these derivatives all exist at 0 and are equal to zero; i.e. $f^{(k)}(0) = 0$ for all $k \ge 0$ (see exercises). Thus, the Taylor series for $f$ around 0 is identically 0, so it doesn’t converge to $f$.

So how do we know when a convergent Taylor series actually converges to the function that it is built on? We have to look at the remainder term $R_n(f, x, c) = f(x) - T_n(f, x, c)$ and show that it goes to 0 as $n \to \infty$. From Theorem 7.4.8, we know that $R_n(f, x, c)$ lies in the interval

$$\left[\min\left(\frac{m(x - c)^{n+1}}{(n+1)!}, \frac{M(x - c)^{n+1}}{(n+1)!}\right),\ \max\left(\frac{m(x - c)^{n+1}}{(n+1)!}, \frac{M(x - c)^{n+1}}{(n+1)!}\right)\right],$$

so

$$|R_n(f, x, c)| \le \frac{\max(|m|, |M|)\,|x - c|^{n+1}}{(n+1)!}.$$

Example 7.4.11 (Exponential series) $f(x) = e^x$, $c = 1$. $f^{(k)}(x) = f(x) = e^x$. On the interval $[a, b]$ (containing 1) we can take $m = e^a$, $M = e^b$; also, $-(b - a) \le x - c \le b - a$, so we have

$$|R_n(f, x, c)| \le \frac{e^b\,(b - a)^{n+1}}{(n+1)!}.$$

We know from the ratio test that $\sum_{n=0}^{\infty} \frac{k^n}{n!}$ converges for any fixed number $k$, so $\frac{k^n}{n!} \to 0$ as $n \to \infty$. Thus, $|R_n(f, x, c)| \to 0$, so the Taylor series for $e^x$ centered at 1 (or any other number) converges to $e^x$ on any finite interval. Of course, we have already seen this in Example 7.3.33.
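The betweenness statement of Theorem 7.4.8 can be checked numerically for the exponential function centered at $c = 1$. This sketch is illustrative only: the function names are hypothetical, the interval $[a, b] = [0, 2]$ is an assumption chosen for the example, and only small $n$ is used so that floating-point rounding stays far below the size of the bounds.

```python
import math

def exp_taylor(n, x, c):
    """T_n(exp, x, c): every derivative of exp at c equals exp(c)."""
    return sum(math.exp(c) * (x - c) ** k / math.factorial(k) for k in range(n + 1))

def remainder_within_bounds(n, x, c, a, b):
    """Check that R_n = exp(x) - T_n lies between m*(x-c)^(n+1)/(n+1)! and M*(x-c)^(n+1)/(n+1)!."""
    m, M = math.exp(a), math.exp(b)      # bounds on the (n+1)-st derivative on [a, b]
    scale = (x - c) ** (n + 1) / math.factorial(n + 1)
    lo, hi = sorted((m * scale, M * scale))   # the order flips when scale is negative
    r = math.exp(x) - exp_taylor(n, x, c)
    return lo <= r <= hi

# Assumed sample interval [0, 2] around c = 1
checks = [remainder_within_bounds(n, x, 1.0, 0.0, 2.0)
          for n in range(7) for x in (0.0, 0.5, 1.5, 2.0)]
print(all(checks))
```

Sorting the two bounds handles the case $x < c$ with $n$ even, where $(x - c)^{n+1}$ is negative and the roles of $m$ and $M$ are exchanged.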

Example 7.4.12 (Binomial series) Let $B(x) = (1 + x)^p$, where $p$ is any real number. Its derivatives are

$$\begin{aligned} B'(x) &= p(1 + x)^{p-1} \\ B''(x) &= p(p - 1)(1 + x)^{p-2} \\ &\ \ \vdots \\ B^{(k)}(x) &= p(p - 1)\cdots(p - k + 1)(1 + x)^{p-k}. \end{aligned}$$

If we define the $k$th binomial coefficient to be $\binom{p}{k} = \frac{p(p - 1)\cdots(p - k + 1)}{k!}$ and $\binom{p}{0} = 1$, then

$$T_n(B, x, 0) = \binom{p}{0} + \binom{p}{1} x + \binom{p}{2} x^2 + \cdots + \binom{p}{n} x^n = \sum_{k=0}^{n} \binom{p}{k} x^k.$$


(Note that if $p$ is a non-negative whole number, $\binom{p}{k} = 0$ for $k > p$, since one of the factors of its numerator will be $(p - p) = 0$.) As usual, we check the ratio test first:

$$\left|\frac{\binom{p}{k+1} x^{k+1}}{\binom{p}{k} x^k}\right| = \left|\frac{p - k}{k + 1}\,x\right| = \frac{k - p}{k + 1}\,|x| \quad\text{when } k > p.$$

Thus, letting $k \to \infty$ we see that the ratio tends to $|x|$, so the binomial series converges absolutely in the interval $(-1, 1)$, and uniformly on closed subintervals. But what does it converge to? For the rest of this example, let $B(x)$ denote the sum of the series $\sum_{k=0}^{\infty} \binom{p}{k} x^k$ on $(-1, 1)$. First,

$$B'(x) = \binom{p}{1} + \binom{p}{2} 2x + \binom{p}{3} 3x^2 + \cdots + \binom{p}{k+1}(k + 1)x^k + \cdots$$

$$\begin{aligned} (1 + x)B'(x) = {} & \binom{p}{1} + \binom{p}{2} 2x + \binom{p}{3} 3x^2 + \cdots + \binom{p}{k+1}(k + 1)x^k + \cdots \\ & + \binom{p}{1} x + \binom{p}{2} 2x^2 + \cdots + \binom{p}{k}(k)x^k + \cdots \end{aligned}$$

At this point we note that, except for $x^0$, the coefficient of $x^k$ in $(1 + x)B'(x)$ is

$$\binom{p}{k+1}(k + 1) + \binom{p}{k}(k) = p\binom{p}{k}.$$

Since we can also write $\binom{p}{1} = p\binom{p}{0}$, this relation is true for $k = 0$ also, and we have

$$(1 + x)B'(x) = p\sum_{k=0}^{\infty} \binom{p}{k} x^k = pB(x).$$

Now compute

$$\begin{aligned} \left(\frac{B(x)}{(1 + x)^p}\right)' &= \frac{(1 + x)^p B'(x) - B(x)\,p(1 + x)^{p-1}}{(1 + x)^{2p}} \\ &= \frac{(1 + x)^p B'(x) - (1 + x)B'(x)(1 + x)^{p-1}}{(1 + x)^{2p}} \\ &= 0. \end{aligned}$$

Thus, $\frac{B(x)}{(1 + x)^p}$ is constant on $(-1, 1)$. But $\frac{B(0)}{(1 + 0)^p} = 1$, so $(1 + x)^p$ is, in fact, equal to its Taylor expansion $B(x)$ on $(-1, 1)$. In the Appendix to this chapter, we show that $B(x)$ also converges pointwise (though not absolutely) at $x = -1$ and $x = 1$. Thus, by the Abel Summation Theorem (7.3.35),

$$(1 + x)^p = \sum_{k=0}^{\infty} \binom{p}{k} x^k \quad\text{uniformly on } [-1, 1] \text{ and absolutely on } (-1, 1].$$

Note that we didn’t use the remainder term in this argument.
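The coefficient identity at the heart of this example, $\binom{p}{k+1}(k+1) + \binom{p}{k}k = p\binom{p}{k}$, and the convergence of the partial sums can both be checked by machine. The sketch below is an illustration under assumptions: the function names are hypothetical, and the sample values $p = 1/2$ and $x = 1/2$ are chosen only for the example. Exact fractions are used for the coefficients.

```python
from fractions import Fraction

def binom(p, k):
    """Generalized binomial coefficient p(p-1)...(p-k+1)/k!, with binom(p, 0) = 1."""
    result = Fraction(1)
    for i in range(k):
        result *= Fraction(p - i) / (i + 1)
    return result

def binomial_partial_sum(p, x, n):
    """T_n(x) = sum_{k=0}^{n} binom(p, k) * x**k."""
    return sum(binom(p, k) * x ** k for k in range(n + 1))

# Sample values (assumptions for illustration): p = 1/2, x = 1/2
p = Fraction(1, 2)
identity_ok = all((k + 1) * binom(p, k + 1) + k * binom(p, k) == p * binom(p, k)
                  for k in range(10))
approx = float(binomial_partial_sum(p, Fraction(1, 2), 40))
print(identity_ok, approx, 1.5 ** 0.5)
```

With $p = 1/2$ the partial sums approximate $\sqrt{1 + x}$, so the last two printed numbers should agree to many digits.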


The following theorem gives an explicit calculation for the remainder term. This calculation is rarely made, however, since it is generally difficult to estimate the size of the $(n + 1)$st derivative.

Theorem 7.4.13 (Integral form of the Remainder) If the first $n + 1$ derivatives of $f$ exist on $[c, x]$, then

$$R_n(f, x, c) = \int_c^x \frac{f^{(n+1)}(t)}{n!}(x - t)^n\,dt.$$

Proof. Consider the remainder as a function of $t$: $R_n(f, x, t) = f(x) - \sum_{k=0}^{n} \frac{f^{(k)}(t)}{k!} \cdot (x - t)^k$. We will differentiate and then integrate $R_n$ (notice that almost everything cancels, or telescopes).

$$\begin{aligned} \frac{d(R_n(f, x, t))}{dt} &= \frac{d f(x)}{dt} - \left[\sum_{k=0}^{n} \frac{f^{(k)}(t)}{k!} \cdot (x - t)^k\right]' \\ &= 0 - [f'(t)] - \left[\frac{f''(t)}{1!}(x - t) - f'(t)\right] - \left[\frac{f'''(t)}{2!}(x - t)^2 - \frac{f''(t)}{1!}(x - t)\right] - \cdots \\ &\qquad - \left[\frac{f^{(n+1)}(t)}{n!}(x - t)^n - \frac{f^{(n)}(t)}{(n - 1)!}(x - t)^{n-1}\right] \\ &= -\frac{f^{(n+1)}(t)}{n!}(x - t)^n. \end{aligned}$$

Our result now follows from integrating this from $c$ to $x$, since $R_n(f, x, x) = 0$.

Exercises

1. Find the sums and intervals of convergence of the following.
(a) $\sum_{k=1}^{\infty} k x^{k-1}$
(b) $\sum_{k=1}^{\infty} k^2 x^{k-1}$
(c) $\sum_{k=1}^{\infty} \frac{x^{k+2}}{k(k+2)}$

2. This exercise verifies the claims made about the function $e^{-1/x^2}$.
(a) Prove that all of the derivatives of $e^{-1/x^2}$ are of the form $\frac{P(u)}{e^{u^2}}$, where $P$ is a polynomial and $u = 1/x$.
(b) Prove that the derivatives are all uniformly continuous on $\mathbb{R} - \{0\}$ and have limit 0 as $x \to 0$.
(c) Conclude that $e^{-1/x^2}$ is infinitely differentiable and all its derivatives vanish at 0.

3. Let $f(x) = \sum_{n=0}^{\infty} \frac{\cos(n^2 x)}{e^n}$.
(a) Prove that this series converges absolutely and uniformly on all of $\mathbb{R}$. (Hint: Use the Weierstrass M-Test (Theorem 7.1.19).)
(b) Prove the same for all derivatives of $f$.
(c) Show that the Taylor series for $f$ around $c = 0$ diverges for all $|x| > 0$.
This is another example of an infinitely differentiable function whose Taylor series doesn’t converge to it.

4. Find the Taylor series, and its interval of convergence, for the following functions (from [Kaczor&Nowak, 2000]):
(a) $f(x) = \ln(1 + x)$ around $c = 0$
(b) $f(x) = \frac{1}{2}\ln\frac{1 + x}{1 - x}$ around $c = 0$
(c) $f(x) = \ln(1 + x + x^2)$ around $c = 0$
(d) $f(x) = \frac{1}{1 - 5x + 6x^2}$ around $c = 0$
(e) $f(x) = \frac{e^x}{1 - x}$ around $c = 0$
(f) $f(x) = (x + 1)e^x$ around $c = 1$
(g) $f(x) = \frac{e^x}{x}$ around $c = 1$.

5. Find the sum of each of the following series. (Hint: These are values of Taylor series.) (a) (b) (c) (d) (e) (f)

X4

q=0

X4

q=0

X4

q=0

X4

q=1

X4

q=2

X4

q=0

X4 (q + 3) 2q and q=0 q+3 2q 1 (2q)! q 2q (1)q+1 q(q + 1) (1)q q2 + q 2 3q (q + 1) q!

6. Suppose that $y = f(x)$ satisfies the differential equation $y' - y = e^x$. Find the Taylor expansion for $f$ and its radius of convergence. Find a simple expression for $f$.

7.5 The Periodic Functions

The goal of this section is to define the trigonometric functions, particularly the sine and cosine, completely rigorously and without the aid of pictures or triangles. Our tools will be power series, estimates, and results from calculus, especially the consequences of $f' > 0$ and $f' = 0$ on an interval. We start with the series defined recursively:

1. $s_0 = 0$, $s_1 = 1$, $s_{j+2} = \dfrac{-s_j}{(j + 1)(j + 2)}$;

$$S_n(x) = \sum_{j=0}^{2n+1} s_j x^j = \sum_{k=0}^{n} \frac{(-1)^k}{(2k + 1)!} x^{2k+1} = x - \frac{x^3}{3!} + \frac{x^5}{5!} - \frac{x^7}{7!} + \cdots$$

2. $c_0 = 1$, $c_1 = 0$, $c_{j+2} = \dfrac{-c_j}{(j + 1)(j + 2)}$;

$$C_n(x) = \sum_{j=0}^{2n} c_j x^j = \sum_{k=0}^{n} \frac{(-1)^k}{(2k)!} x^{2k} = 1 - \frac{x^2}{2!} + \frac{x^4}{4!} - \frac{x^6}{6!} + \cdots$$

Of course, these come from the well-known Taylor series, but here we are starting with them instead of deriving them. Note that our notation here is not the same one that we usually use for power series. The degrees of $S_n(x)$ and $C_n(x)$ are $2n + 1$ and $2n$ respectively; the $k$th term of $C_n(x)$, for example, involves $x^{2k}$. The reason is that both series have a lot of 0 terms and writing a formula taking this into account would be very messy. Furthermore, when these 0 terms are discarded, both series are clearly alternating, so the error in truncating them is no bigger than the first term deleted, at least once the terms start decreasing, which they eventually do when $n$ is big enough. (Think about how we know this.)

Lemma 7.5.1 Both $S_n(x)$ and $C_n(x)$ converge uniformly and absolutely for $x \in (-R, R)$, for any $R > 0$.

This follows directly from the ratio test. Here are plots of $C_5(x)$ and $S_5(x)$:

[Figure: plots of $C_5(x)$ and $S_5(x)$]
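The recursive definitions of the coefficients $s_j$ and $c_j$ are easy to compute with. The following is a minimal sketch with hypothetical function names; the comparison against the library values `math.sin` and `math.cos` is only a sanity check on the arithmetic and plays no role in the development of this section, which builds the functions from the series themselves.

```python
import math

def recursive_coeffs(first, second, count):
    """Coefficients a_0..a_{count-1} with a_{j+2} = -a_j / ((j+1)(j+2))."""
    a = [first, second]
    for j in range(count - 2):
        a.append(-a[j] / ((j + 1) * (j + 2)))
    return a[:count]

def S(n, x):
    """Partial sum S_n(x) of degree 2n+1, using s_0 = 0, s_1 = 1."""
    s = recursive_coeffs(0.0, 1.0, 2 * n + 2)
    return sum(sj * x ** j for j, sj in enumerate(s))

def C(n, x):
    """Partial sum C_n(x) of degree 2n, using c_0 = 1, c_1 = 0."""
    c = recursive_coeffs(1.0, 0.0, 2 * n + 1)
    return sum(cj * x ** j for j, cj in enumerate(c))

for x in (0.5, 1.0, 2.0):
    print(x, S(5, x), math.sin(x), C(5, x), math.cos(x))
```

For small $|x|$ the values of $S_5$ and $C_5$ should already agree closely with the library functions; the agreement degrades as $|x|$ grows, which is the behaviour the plots illustrate.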


Definition 7.5.2 (Sine and Cosine Functions)
1. $\sin(x) = \lim_{n\to\infty} S_n(x) = \sum_{k=0}^{\infty} (-1)^k \cdot \dfrac{x^{2k+1}}{(2k + 1)!}$.
2. $\cos(x) = \lim_{n\to\infty} C_n(x) = \sum_{k=0}^{\infty} (-1)^k \cdot \dfrac{x^{2k}}{(2k)!}$.

The following key facts about sine and cosine follow directly from their definitions and our general results about alternating series and power series.

Proposition 7.5.3 (Basic Properties)
1. For any $R > 0$, there is a positive integer $N(R)$ such that $\sin(x)$ (respectively, $\cos(x)$) lies between $S_n(x)$ and $S_{n+1}(x)$ (respectively $C_n(x)$ and $C_{n+1}(x)$) when $n \ge N(R)$ and $x \in (-R, R)$. In fact, $N(R)$ can be taken to be any integer $N \ge R$.
2. $\sin(0) = 0$ and $\cos(0) = 1$.
3. $\sin(-x) = -\sin(x)$ and $\cos(-x) = \cos(x)$.
4. $\dfrac{d(\sin(x))}{dx} = \cos(x)$ and $\dfrac{d(\cos(x))}{dx} = -\sin(x)$.

Proof. The only assertion that is not straightforward is the first. By the alternating series test (Corollary 3.3.33), the limits will lie between consecutive partial sums once the terms of the series begin decreasing in absolute value. This will happen when $n \ge R$, since $\dfrac{x^{k+1}}{(k + 1)!} = \dfrac{x}{k + 1} \cdot \dfrac{x^k}{k!}$ and $\left|\dfrac{x}{k + 1}\right| < 1$ when $x \in (-R, R)$ and $k \ge n \ge R$.

Corollary 7.5.4 For any constants $A$ and $B$, the function $\omega(x) = A\sin(x) + B\cos(x)$ satisfies the differential equation $\omega'' = -\omega$.

The sine and cosine functions are determined by their values at 0 and the differential equation $\omega'' = -\omega$ which they satisfy. This is a special case of a far more general theorem about solutions of differential equations. In this special case, however, we can use the simple consequence (Corollary 6.3.3) of LBC that a function whose derivative vanishes on an interval must be constant on that interval. Before applying this to the differential equation for sine and cosine, we’ll warm up by using it to prove an even more basic fact about the sine and cosine.

Proposition 7.5.5 $\sin^2(x) + \cos^2(x) = 1$ for all $x$.

(Note that $\sin^2(x)$ means simply $(\sin(x))^2$; this time-saving but sometimes confusing notation for powers of trigonometric functions dates back to the 18th century and was used by William Jones (1675-1749).)

Proof. Let $f(x) = \sin^2(x) + \cos^2(x)$. An easy calculation shows that $f'(x)$ is identically 0, so $f(x)$ is constant. Since $f(0) = 1$, we are done. (Can you imagine how hard it would be to prove this simple identity directly from the power series?)

Now, as promised, we show that linear combinations of sine and cosine are the only solutions to $\omega'' = -\omega$.

Proposition 7.5.6 Suppose $f$ is a function defined and twice uniformly differentiable on an interval $I$ which contains 0 and may possibly be all of $\mathbb{R}$. Suppose that $f$ has the property that $f'' = -f$ on $I$. Then $f(x) = f(0)\cos(x) + f'(0)\sin(x)$ for all $x \in I$.

Proof. Let $\omega(x) = f(0)\cos(x) + f'(0)\sin(x)$ and $g = (\omega - f)^2 + (\omega' - f')^2$. From Corollary 7.5.4 and the assumptions on $f$, we easily see that $g'(x) = 0$ on $I$, so $g$ is constant on $I$. Since $\omega$ and $f$ agree at 0, and so do $\omega'$ and $f'$, we have $g(0) = 0$, so $g$ is identically 0 on $I$. If a sum of squares is 0, each must be 0 (Propositions 1.5.9 and 1.5.14). Thus, $\omega = f$ on $I$.

Corollary 7.5.7 (Addition Formulas) For any $\alpha$ and $\beta$,

1. $\sin(\alpha + \beta) = \sin(\alpha)\cos(\beta) + \cos(\alpha)\sin(\beta)$.

2. $\cos(\alpha + \beta) = \cos(\alpha)\cos(\beta) - \sin(\alpha)\sin(\beta)$.

Proof. Exercise.

We now turn to the definition of $\pi$. There are a number of routes to take. Had we developed the notion of arclength, we might have defined $\pi$ as the length of some circular arc, but we didn't go down this road. Using our approach, we might have defined $\pi$ to be either the first (smallest) positive root of $\sin x$ or twice the first positive root of $\cos x$. The problem here is that these roots are both larger than 1. As a result, we may have to take more terms than we'd like of our series to ensure that these terms are decreasing (so that we can locate $\sin x$ or $\cos x$ between consecutive partial sums).
It turns out that the actual arithmetic is simplest if we define $\pi$ to be 4 times the first positive number where $\sin(x) = \cos(x)$. To do that, we have to show that such a number exists. The idea will be to show that the function $\sin(x) - \cos(x)$ is negative at $x = 0$, positive at $x = 1$, and has positive derivative on the interval $[0, 1]$. We can then invoke the derivative version of the Inverse Function Theorem (Corollary 6.3.15).

Lemma 7.5.8

1. $\sin(x) \ge 0$ on $[0, 2]$ and $> 0$ on $(0, 2]$.

2. $\frac{1}{2} \le \cos(x) \le 1$ on $[0, 1]$.

3. $\sin(x) + \cos(x) \ge \frac{1}{2}$ on the interval $[0, 1]$.

4. $\sin(0) - \cos(0) < 0$, $\sin(1) - \cos(1) > 0$.

Proof. When $x \in (0, 2]$, the terms of the sine series are decreasing in size; when $x \in [0, 1]$, the same is true for the cosine series. By Proposition 7.5.3 and what we know about alternating decreasing series, we see that sine and cosine lie between consecutive partial sums of their power series. Thus, we have

\[ x - \frac{x^3}{6} \le \sin(x) \le x \quad \text{for } x \in [0, 2], \]

\[ 1 - \frac{x^2}{2} \le \cos(x) \le 1 \quad \text{for } x \in [0, 1]. \]

Also, $x - \frac{x^3}{6} = x\left(1 - \frac{x^2}{6}\right)$ and the quantity in parentheses is positive when $x \in [0, 2]$, which establishes part (1). $1 - \frac{x^2}{2} \ge 1 - \frac{1}{2}$ when $x \in [0, 1]$, which establishes part (2). Part (3) now follows by addition of inequalities. The first inequality of part (4) is clear from the power series themselves. To prove the second part, note that $\cos(1)$ lies between its 3rd and 4th partial sums, i.e.,

\[ 1 - \frac{1}{2} + \frac{1}{24} - \frac{1}{720} \le \cos(1) \le 1 - \frac{1}{2} + \frac{1}{24} = \frac{13}{24}. \]

Thus, $\cos(1) \le \frac{13}{24}$. Since we already know that $\sin(1) \ge \frac{5}{6}$, we see that $\sin(1) - \cos(1) \ge \frac{5}{6} - \frac{13}{24} = \frac{7}{24} > 0$.

Let $\phi(x) = \sin(x) - \cos(x)$. Then $\phi'(x) = \cos(x) + \sin(x) > \frac{1}{2}$ on $[0, 1]$, so it has an inverse on $[\phi(0), \phi(1)]$ by the derivative version of the Inverse Function Theorem (Corollary 6.3.15). Since $\phi(0) < 0$ and $\phi(1) > 0$, $\phi$ has a unique root in $[0, 1]$. We will define $\pi$ to be 4 times this root.

Definition 7.5.9 $\frac{\pi}{4}$ is the unique number $x$ in $[0, 1]$ for which $\sin(x) = \cos(x)$.
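Definition 7.5.9 is effectively computable. The sketch below is an illustration under stated assumptions, not the author's procedure: it brackets $\sin$ and $\cos$ on $[0, 1]$ between consecutive partial sums (as in Proposition 7.5.3) and bisects on the certified sign of $\sin(x) - \cos(x)$. The helper names `enclosure` and `quarter_pi_interval`, the choice of 10 terms, and the 40 bisection steps are all hypothetical choices.

```python
from fractions import Fraction
from math import factorial

def enclosure(x, n=10):
    """Rational bounds for sin(x) and cos(x), 0 <= x <= 1, from consecutive partial sums."""
    s = [sum(Fraction((-1) ** k) * x ** (2 * k + 1) / factorial(2 * k + 1)
             for k in range(m + 1)) for m in (n, n + 1)]
    c = [sum(Fraction((-1) ** k) * x ** (2 * k) / factorial(2 * k)
             for k in range(m + 1)) for m in (n, n + 1)]
    return (min(s), max(s)), (min(c), max(c))

def quarter_pi_interval(steps=40):
    """Shrink [a, b] around the root of sin - cos, keeping sin - cos < 0 at a and > 0 at b."""
    a, b = Fraction(0), Fraction(1)
    for _ in range(steps):
        m = (a + b) / 2
        (s_lo, s_hi), (c_lo, c_hi) = enclosure(m)
        if s_hi - c_lo < 0:      # sin(m) - cos(m) is certainly negative
            a = m
        elif s_lo - c_hi > 0:    # sin(m) - cos(m) is certainly positive
            b = m
        else:                    # sign not decidable at this precision; stop early
            break
    return a, b

a, b = quarter_pi_interval()
print(4 * float(a), 4 * float(b))
```

Every comparison is made on exact rationals, so the returned endpoints are a genuine enclosure of $\pi/4$ rather than a floating-point estimate.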

Lemma 7.5.10

1. $0 < \pi/2 < 2$.

2. $\cos(\pi/2) = 0$ and $\pi/2$ is the smallest positive zero of the cosine.

3. $\sin(\pi/2) = 1$.

4. $\cos(x)$ decreases from 1 to 0 on $[0, \pi/2]$.

Proof. Part (1) follows from $0 < \pi/4 < 1$. From the addition formula for the cosine, $\cos(\pi/2) = \cos^2(\pi/4) - \sin^2(\pi/4)$, which is 0 by definition of $\pi$. Suppose $C > 0$ and $\cos(C) = 0$. We must show that $C \ge \pi/2$. If $C \ge 2$, then this is certainly the case by part (1). Suppose $0 < C \le 2$. By the addition formula for the cosine again,

\[ 0 = \cos(C) = \cos^2(C/2) - \sin^2(C/2) = (\cos(C/2) - \sin(C/2))(\cos(C/2) + \sin(C/2)). \]

However, we already know that the second factor is greater than $1/2$, so the first factor must be 0. This means that $C/2$ is a number between 0 and 1 with $\sin(C/2) = \cos(C/2)$. Thus, $C/2 = \pi/4$, so $C \ge \pi/2$. Since $\sin^2(\pi/2) + \cos^2(\pi/2) = \sin^2(\pi/2) = 1$ and $\sin(\pi/2) \ge 0$ (by Lemma 7.5.8), we must have $\sin(\pi/2) = 1$. Finally, the derivative of $\cos(x)$ is $-\sin(x)$, which is negative on $(0, \pi/2] \subset (0, 2]$, also by Lemma 7.5.8.

We now can use the addition formulas to prove all of the nice symmetry properties of sine and cosine; for example, $\cos(x) = \sin(\pi/2 - x)$ and $\cos(\pi/2 + x) = -\sin(x) = -\cos(\pi/2 - x)$. This enables us to piece together the sine and cosine functions between 0 and $2\pi$. That's all we need, because of the following results (whose proof we leave as an exercise).

Proposition 7.5.11

1. The zeros of $\sin(x)$ are exactly the numbers $k\pi$ for integer $k$.

2. The zeros of $\cos(x)$ are exactly the numbers $\pi/2 + k\pi$ for integer $k$.

3. $2\pi$ is the smallest positive number $P$ for which $\cos(x + P) = \cos(x)$ for all $x$.

4. $2\pi$ is the smallest positive number $P$ for which $\sin(x + P) = \sin(x)$ for all $x$.

Definition 7.5.12 A function $f$ is periodic if there is a number $P$ such that $f(x + P) = f(x)$ for all $x$ in the domain of $f$ with $x + P$ also in the domain of $f$. $P$ is then called a period of $f$. If $P$ is the smallest positive period for $f$, then it is called the period of $f$.

The derivative of a periodic function is itself periodic, but this is not necessarily true for the antiderivative: see the exercises. On the other hand, we do have a kind of periodicity for definite integrals, as the following proposition shows.

Proposition 7.5.13 Suppose $f$ is uniformly continuous on an interval $I$ and has period $P$. If $a$, $b$, $a + P$ and $b + P$ are all in $I$, then

\[ \int_a^{a+P} f = \int_b^{b+P} f. \]

Proof. See the exercises.

Definition 7.5.14 (The Tangent) $\tan(x) = \frac{\sin x}{\cos x}$ where $\cos(x) \ne 0$.

Proposition 7.5.15 The period of $\sin bx$ and $\cos bx$ is $2\pi/b$; the period of $\tan bx$ is $\pi/b$.

By the Quotient Rule, the derivative of $\tan(x)$ is $\frac{1}{\cos^2(x)}$, which is positive except at the zeros of $\cos(x)$, where $\tan$ is undefined; $\tan(x)$ is negative and unbounded near $-\pi/2$ and positive and unbounded near $\pi/2$. $\tan : (-\pi/2, \pi/2) \to (-\infty, \infty)$ and this function is uniformly continuous and differentiable on every closed subinterval of $(-\pi/2, \pi/2)$. Thus, $\tan(x)$ has an inverse function which we'll call, temporarily, $\operatorname{invtan}(x)$. $\operatorname{invtan} : (-\infty, \infty) \to (-\pi/2, \pi/2)$.

Recall that we have previously defined another bounded, increasing function, $\arctan(x) = \int_0^x \frac{du}{1+u^2}$ (see page 213). Let us now apply integration by substitution, with $u = \tan(t)$, to obtain

\[ \arctan(x) = \int_0^x \frac{du}{1+u^2} = \int_{\operatorname{invtan}(0)}^{\operatorname{invtan}(x)} \frac{1}{1+\tan^2(t)} \cdot \frac{1}{\cos^2(t)}\,dt = \int_0^{\operatorname{invtan}(x)} \frac{1}{\sin^2(t)+\cos^2(t)}\,dt = \operatorname{invtan}(x). \]

So, at last, we have $\operatorname{invtan}(x) = \arctan(x)$; in particular,

\[ \operatorname{invtan}(x) = \arctan(x) = \sum_{k=0}^{\infty} (-1)^k \frac{x^{2k+1}}{2k+1} \quad \text{(on } (-1, 1]\text{)}; \]

\[ \operatorname{invtan}(1) = \pi/4 = 1 - 1/3 + 1/5 - 1/7 + \cdots \quad \text{(see exercises)}. \]
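To get a feel for how slowly the series for $\pi/4$ converges, one can bracket $\pi/4$ between consecutive partial sums, as the alternating series test allows. This is a small illustrative sketch; the function name `leibniz_partial` and the sample sizes are assumptions made here.

```python
from fractions import Fraction

def leibniz_partial(n):
    """Sum of (-1)^k / (2k+1) for k = 0..n-1, computed exactly."""
    return sum(Fraction((-1) ** k, 2 * k + 1) for k in range(n))

for n in (10, 100, 1000):
    lo, hi = sorted([leibniz_partial(n), leibniz_partial(n + 1)])
    # Columns: number of terms, lower and upper bound for pi, width of the bracket.
    print(n, 4 * float(lo), 4 * float(hi), float(4 * (hi - lo)))
```

The bracket after $n$ terms has width $1/(2n+1)$ for $\pi/4$, so each additional correct decimal digit costs roughly ten times as many terms.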

Exercises

1. Prove that the power series used in the definition of the sine and cosine (see Definition 7.5.2) do, in fact, converge uniformly and absolutely on every interval $[-R, R]$.

2. Prove the addition formulas for $\sin(\alpha + \beta)$ and $\cos(\alpha + \beta)$.

3. Prove the following from the text. Proposition 7.5.11
(a) The zeros of $\sin(x)$ are exactly at the numbers $k\pi$ for integer $k$.
(b) The zeros of $\cos(x)$ are exactly at the numbers $\pi/2 + k\pi$ for integer $k$.
(c) $2\pi$ is the smallest positive number $P$ for which $\cos(x + P) = \cos(x)$ for all $x$.
(d) $2\pi$ is the smallest positive number $P$ for which $\sin(x + P) = \sin(x)$ for all $x$.

4. Prove that $3 < \pi < 4$.

5. Use the results from this section to describe or sketch one period of the sine or cosine curve, explaining features such as increasing/decreasing, lying above or below the tangent line (concavity), zeros, and maxima/minima.

6. The unit circle consists of those pairs $(x, y)$ such that $x^2 + y^2 = 1$.
(a) Suppose $(x, y)$ is on the unit circle and $y > 0$. Prove that there is a unique number $\theta \in (0, \pi)$ such that $x = \cos\theta$ and $y = \sin\theta$. (Such a point is on the "upper half" of the unit circle.)
(b) Suppose $(x, y)$ is on the unit circle and $y < 0$. Prove that there is a unique number $\theta \in (\pi, 2\pi)$ such that $x = \cos\theta$ and $y = \sin\theta$. (Such a point is on the "lower half" of the unit circle.)
(c) Show that the unit circle consists of precisely those points $(\cos\theta, \sin\theta)$ where $\theta \in [0, 2\pi)$, and this representation is unique.

7. Proposition 7.5.13 says that if $f$ has period $P$, then $\int_a^{a+P} f = \int_b^{b+P} f$. Prove this by showing that $G(x) = \int_x^{x+P} f$ is constant.

8. (From [Kaczor&Nowak, 2000]; The Weierstrass M-test, Theorem 7.1.19, will be useful for these.)
(a) Show that the function $f(x) = \sum_{n=1}^{\infty} \frac{\cos(nx)}{1+n^2}$ is differentiable on $[\pi/6, 11\pi/6]$.
(b) Show that the function $f(x) = \sum_{n=1}^{\infty} \frac{\sin(nx^2)}{1+n^3}$ is differentiable on all of $\mathbb{R}$.
(c) Show that the function $f(x) = \sum_{n=1}^{\infty} \sqrt{n}\,(\tan x)^n$ is differentiable on every closed subinterval of $(-\pi/4, \pi/4)$.

9. Prove in detail that $1 - 1/3 + 1/5 - 1/7 + \cdots$ converges to $\pi/4$. The problem is that the geometric series for $\frac{1}{1+x^2}$ only converges for $-1 < x < 1$, while the term-by-term integral of this series converges on $-1 < x \le 1$, and it's at $x = 1$ that we get our series for $\pi/4$. Note that this convergence is very slow, so that this series is a pretty inefficient way to compute $\pi$.

10. Find the Taylor series, and its interval of convergence, for the following functions (from [Kaczor&Nowak, 2000]):
(a) $f(x) = \sin(x^3)$ around $c = 0$
(b) $f(x) = (\sin x)^3$ around $c = 0$
(c) $f(x) = \sin x \cos 3x$ around $c = 0$
(d) $f(x) = \sin^6 x + \cos^6 x$ around $c = 0$
(e) $f(x) = \frac{\cos x}{x}$ around $c = 1$, $|x| > 0$
(f) $f(x) = x \arctan x - \frac{1}{2}\ln(1 + x^2)$.

11. Find the sum of each of the following series (from [Kaczor&Nowak, 2000]). (Hint: These are values of Taylor series.)
(a) $\sum_{n=0}^{\infty} \frac{(-1)^n n}{(2n+1)!}$
(b) $\sum_{n=1}^{\infty} \frac{(-1)^{n-1}}{n(2n-1)}$

Appendix: Raabe's Test and Binomial Series

Applying the ratio test to $\sum_{k=0}^{\infty} \binom{p}{k}$ is inconclusive (the limit is 1). However, there are a whole series of tests that make fine adjustments to the ratio test. Here is the first of this series; others can be found in [Lewin, 2003].

Proposition 7.5.16 (Raabe's Test) Suppose $s_n$ is a sequence of positive numbers.

1. If there is a number $A > 1$ such that $\frac{s_{n+1}}{s_n} \le 1 - \frac{A}{n}$ for all sufficiently large $n$, then $\sum_{n=0}^{\infty} s_n$ converges.

2. If there is a number $A < 1$ such that $\frac{s_{n+1}}{s_n} \ge 1 - \frac{A}{n}$ for all sufficiently large $n$, then $\sum_{n=0}^{\infty} s_n$ diverges.

3. (Limit form): Suppose $\lim_{n\to\infty} n\left(1 - \frac{s_{n+1}}{s_n}\right) = k$. The series $\sum_{n=0}^{\infty} s_n$ converges if $k > 1$, diverges if $k < 1$, and the test gives no information if $k = 1$.

Proof. (We acknowledge [Lewin, 2003] for this nice proof.) First, we need a preliminary result which is most easily proved using the $\infty/\infty$ case of l'Hôpital's rule (see Proposition 6.4.12).

Lemma. $\lim_{x\to\infty} x\left(1 - \frac{(x-1)^{\alpha}}{x^{\alpha}}\right) = \alpha$.

We prove part (1). Without loss of generality, we can assume $\frac{s_{n+1}}{s_n} \le 1 - \frac{A}{n}$ for all $n$ and $A > 1$. Choose any $r$ with $1 < r < A$, and let $b_n = \frac{1}{(n-1)^r}$ for $n = 2, 3, \ldots$. Then the series $\sum_{n=2}^{\infty} b_n$ converges (since $r > 1$). We also have

\[ \lim_{n\to\infty} n\left(1 - \frac{b_{n+1}}{b_n}\right) = \lim_{n\to\infty} n\left(1 - \frac{(n-1)^r}{n^r}\right) = r. \]

Thus, when $n$ is large enough, $n\left(1 - \frac{b_{n+1}}{b_n}\right) < A$, so $\frac{s_{n+1}}{s_n} \le 1 - \frac{A}{n} < \frac{b_{n+1}}{b_n}$. By the ratio comparison test (Proposition 3.3.25), we see that $\sum_{n=0}^{\infty} s_n$ converges. The other parts are proved similarly.

It is now a straightforward application of Raabe's test to see that $\sum_{k=0}^{\infty} \left|\binom{p}{k}\right|$ converges when $p > 0$ and diverges when $p < 0$. So what happens to $\sum_{k=0}^{\infty} \binom{p}{k}$ when $p < 0$? If $p \le -1$, it is easy to see that $\left|\binom{p}{k}\right| \ge 1$, so the series diverges. The remaining case is $-1 < p < 0$, which we now deal with.

We first note that $\binom{p}{k} = (-1)^k \left|\binom{p}{k}\right|$, since $-1 < p < 0$, so we are dealing with an alternating series, which will converge if the terms go to 0, by the alternating series test.

Let $a_k = \left|\binom{p}{k}\right|$, $b_k = 1 - \frac{a_{k+1}}{a_k} = \frac{p+1}{k+1}$. Since $b_k$ is basically the harmonic series, we know that its terms go to 0, but $\sum_{k=0}^{\infty} b_k$ diverges. However, we have previously shown (Proposition 3.3.41) that $\sum_{k=0}^{\infty} b_k$ diverges implies $a_k \to 0$. Thus, $\sum_{k=0}^{\infty} \binom{p}{k}$ converges conditionally when $-1 < p < 0$.

Here is the summary, then, for the binomial series $\sum_{k=0}^{\infty} \binom{p}{k}$.

• When $p \le -1$, the series diverges.

• When $-1 < p < 0$, the series converges conditionally.

• When $p > 0$, the series converges absolutely.
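The limit form of Raabe's test can be explored numerically for the binomial series. The sketch below is illustrative only; it relies on the ratio $|\binom{p}{k+1}| / |\binom{p}{k}| = |p - k|/(k+1)$ and on the hypothetical helper name `raabe_quantity`. For $k > p$ the quantity equals $k(p+1)/(k+1)$, which tends to $p + 1$, so it exceeds 1 in the limit exactly when $p > 0$.

```python
from fractions import Fraction

def raabe_quantity(p, k):
    """k * (1 - s_{k+1}/s_k) for s_k = |binom(p, k)|, using s_{k+1}/s_k = |p - k|/(k + 1)."""
    ratio = abs(Fraction(p) - k) / (k + 1)
    return k * (1 - ratio)

# Sample exponents (none a nonnegative integer, so every s_k is positive).
for p in (Fraction(1, 2), Fraction(-1, 2), Fraction(5, 2)):
    print(p, [float(raabe_quantity(p, k)) for k in (10, 100, 1000)])
```

The printed values approach $p + 1$ from below, which is consistent with absolute convergence for $p > 0$ and divergence of the series of absolute values for $p < 0$.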

8. THE COMPLEX NUMBERS AND FOURIER SERIES

8.0 Introduction

In studying power series we considered the representation of a function as an infinite linear combination of powers of a variable $x$, that is, sums of the form

\[ f(x) = \sum_{k=0}^{\infty} s_k x^k. \]

This has some distinct advantages because the building blocks $x^k$ are easily computed and the convergence properties are relatively easy to establish. Many important functions, especially the exponential and trigonometric ones, have these Taylor series representations.

On the other hand, only functions that are infinitely differentiable (and not even all such functions, as the example of $e^{-1/x^2}$ shows) have Taylor series converging to them. This is a major restriction, which excludes even such nice functions as $|x|$.

In this section we look at a different approach: representing functions not in terms of powers of $x$ but in terms of the basic periodic functions $\sin kx$ and $\cos kx$. We are looking for representations of the form

\[ f(x) = \sum_{k=0}^{\infty} a_k \cos kx + b_k \sin kx, \]

which are called Fourier series representations. Any function that has such a representation is periodic (since each of $\sin kx$ and $\cos kx$ is $2\pi$-periodic). Conversely, many periodic functions can be represented by such series in the sense of pointwise convergence. Moreover, many of those periodic functions that have a convergent Fourier series representation do not have a Taylor series representation (because they are not sufficiently differentiable). Finally, Fourier representations have many nice properties that make them easy to work with. For all these reasons, Fourier theory has turned out to be very important both in applications and in spurring the development of the field of analysis.

We will prove in this chapter that every periodic uniformly differentiable function has a Fourier series converging to it. This result requires some work. Proving it, however, will provide an opportunity to introduce, if only briefly, a number of ideas and constructions that are important in more advanced courses in functional analysis. These include

• the use of complex numbers and, especially, Euler's formula $e^{i\theta} = \cos\theta + i\sin\theta$;

• the idea of general inner products and the important Bessel and Cauchy-Schwarz inequalities;

• kernel functions and approximate identities;

• convolutions of functions.

This chapter is a bare introduction to Fourier Analysis. There is a huge literature on this subject; for further reading, see, for example, [Folland, 1992], [Pinsky, 2002], [Stade, 2005].

Some history

Let $y = y(x, t)$ represent the $y$ coordinate at time $t$ of a point $x$ units from the end of a vibrating string whose endpoints are stationary. An analysis of the forces on such a string leads to the following classical partial differential equation, called the wave equation:

\[ \frac{\partial^2 y(x,t)}{\partial t^2} = c^2 \frac{\partial^2 y(x,t)}{\partial x^2}. \]

Here are some facts about this motion.

• At any fixed position $x = k$, the point undergoes a periodic motion $y = y(k, t)$.

• The string has length $L$ and its endpoints are fixed, so $y(0, t) = y(L, t) = 0$.

• At initial time $t = 0$, the curve takes the form $y = y(x, 0) = f(x)$, $0 \le x \le L$.

The great eighteenth century mathematician Daniel Bernoulli (1700-1782) found a whole class of fairly simple solutions to the wave equation:

\[ y_k(x,t) = \sin\left(\frac{k\pi}{L}x\right)\cos\left(\frac{ck\pi}{L}t\right), \]

with initial condition

\[ f_k(x) = y_k(x, 0) = \sin\left(\frac{k\pi}{L}x\right). \]

Because of the form of the wave equation, if the functions $y_k(x,t)$ are solutions for $k = 1, \ldots, N$, then so is any linear combination

\[ y(x,t) = \sum_{k=1}^{N} b_k y_k(x,t), \]

where the $b_k$ are arbitrary constants. This has the initial condition

\[ f(x) = \sum_{k=1}^{N} b_k \sin\left(\frac{k\pi}{L}x\right). \]

Bernoulli seems to have been convinced that his functions and their linear combinations represented the general solution to the wave equation, in the sense that every solution took this form. Since the initial shape of the string could be anything, providing the ends were fixed at $(0, 0)$ and $(L, 0)$, $f(x)$ could be any reasonable function. This led Bernoulli to allow the sum to be infinite and make the claim that any (reasonable) function could be represented by the series

\[ f(x) = \sum_{k=1}^{\infty} b_k \sin\left(\frac{k\pi}{L}x\right). \]

Now this clearly can't work for any function, since the sines are odd functions (i.e. $\sin(-x) = -\sin(x)$) while not every function is odd, and of course you'd have to have $f(0) = f(L) = 0$. Also, Bernoulli didn't have a good method of computing the coefficients $b_k$, nor the correct formalism to deal with issues of convergence.

A half century later, Jean-Baptiste Fourier (1768-1830) was led to consider similar series when he tried to solve the heat equation

\[ \frac{\partial y(x,t)}{\partial t} = c^2 \frac{\partial^2 y(x,t)}{\partial x^2}. \]

Fourier, however, went further by looking also at cosine series $\sum_{k=1}^{N} a_k \cos(kx)$, which produce even functions. Fourier conjectured that every function of period $2\pi$ has a full trigonometric expansion on the interval $[-\pi, \pi]$; that is, one of the form

\[ f(x) = \sum_{k=0}^{\infty} a_k \cos(kx) + b_k \sin(kx). \qquad (*) \]

He also found formulas for the coefficients in the following way. Assuming enough convergence, he multiplied both sides of equation (*) by $\cos nx$ and integrated from $-\pi$ to $\pi$ to obtain

\[ \int_{-\pi}^{\pi} f(x)\cos(nx)\,dx = \sum_{k=0}^{\infty} a_k \int_{-\pi}^{\pi} \cos(kx)\cos(nx)\,dx + \sum_{k=0}^{\infty} b_k \int_{-\pi}^{\pi} \sin(kx)\cos(nx)\,dx. \qquad (**) \]

Fourier then used the following important facts.

Lemma 8.0.1 For integers $k, n \ge 0$,

\[ \int_{-\pi}^{\pi} \cos(kx)\cos(nx)\,dx = \begin{cases} \pi & \text{if } k = n \ne 0 \\ 2\pi & \text{if } k = n = 0 \\ 0 & \text{otherwise} \end{cases} \]

\[ \int_{-\pi}^{\pi} \sin(kx)\sin(nx)\,dx = \begin{cases} \pi & \text{if } k = n \ne 0 \\ 0 & \text{otherwise} \end{cases} \]

\[ \int_{-\pi}^{\pi} \sin(kx)\cos(nx)\,dx = 0. \]

Proof. This follows from the periodicity of the sine and cosine, together with the identities

\[ \sin(A)\sin(B) = -\tfrac{1}{2}[\cos(A+B) - \cos(A-B)] \]

\[ \cos(A)\cos(B) = \tfrac{1}{2}[\cos(A+B) + \cos(A-B)] \]

\[ \sin(A)\cos(B) = \tfrac{1}{2}[\sin(A+B) + \sin(A-B)]. \]

Details are left as exercises.

Thus, from equation (**) above, $a_k = \frac{1}{\pi}\int_{-\pi}^{\pi} f(x)\cos(kx)\,dx$. Similarly, by multiplying through by $\sin(nx)$, Fourier deduced that $b_k = \frac{1}{\pi}\int_{-\pi}^{\pi} f(x)\sin(kx)\,dx$.

For the remainder of this section, we will assume that our functions are $2\pi$-periodic, that is, $f(x + 2\pi) = f(x)$ for all $x$ in the domain of $f$.

Definition 8.0.2 The Fourier cosine series for the function $f$ is

\[ \sum_{k=0}^{\infty} a_k \cos(k\theta) \quad \text{where} \quad a_k = \frac{1}{\pi}\int_{-\pi}^{\pi} f(t)\cos(kt)\,dt. \]

Similarly, the Fourier sine series for $f$ is

\[ \sum_{k=1}^{\infty} b_k \sin(k\theta) \quad \text{where} \quad b_k = \frac{1}{\pi}\int_{-\pi}^{\pi} f(t)\sin(kt)\,dt. \]

The complete Fourier series for $f$ is

\[ \frac{a_0}{2} + \sum_{k=1}^{\infty} a_k \cos(k\theta) + b_k \sin(k\theta). \]

You may wonder about the term $a_0/2$. We need this to make the formulas for $a_k$ and $b_k$ all have the same form. If

\[ f(\theta) = K + \sum_{k=1}^{\infty} a_k \cos(k\theta) + b_k \sin(k\theta), \]

then $K = \frac{1}{2\pi}\int_{-\pi}^{\pi} f(t)\cos(0)\,dt = \frac{1}{2}a_0$.

It turns out that we can write the Fourier series for a function in a much more compact form by using the complex numbers $\mathbb{C}$, which is the subject of the next section.
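The coefficient formulas of Definition 8.0.2 can be tried out numerically. The following is a rough sketch under stated assumptions: the integrals are approximated by a midpoint Riemann sum, the helper name `fourier_coefficients` and the sample count are hypothetical, and the test function is a trigonometric polynomial chosen here so that the expected coefficients can be read off directly.

```python
import math

def fourier_coefficients(f, k, samples=2000):
    """Approximate a_k and b_k of Definition 8.0.2 with a midpoint Riemann sum on [-pi, pi]."""
    h = 2 * math.pi / samples
    ts = [-math.pi + (j + 0.5) * h for j in range(samples)]
    a_k = sum(f(t) * math.cos(k * t) for t in ts) * h / math.pi
    b_k = sum(f(t) * math.sin(k * t) for t in ts) * h / math.pi
    return a_k, b_k

def f(t):
    # Illustrative test function with known coefficients: a_0/2 = 3, a_1 = -5, b_5 = 2.
    return 3 + 2 * math.sin(5 * t) - 5 * math.cos(t)

for k in (0, 1, 5):
    print(k, fourier_coefficients(f, k))
```

Note that the computed $a_0$ is twice the constant term, which is exactly why the complete Fourier series starts with $a_0/2$.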

Exercises

1. Verify that Bernoulli's functions are solutions to the wave equation (see page 272).

2. Suppose that $f(x) = \sum_{k=0}^{\infty} a_k \cos kx + b_k \sin kx$ converges and that summation can be exchanged with integration (e.g. when this convergence is uniform). Verify that the coefficients $a_k$ and $b_k$ are given by the formulas in Definition 8.0.2; see page 274 for how Fourier did this.

3. The following functions are Riemann integrable (see Definition 5.3.28). Without worrying about convergence or periodicity, find their Fourier series on $[-\pi, \pi]$:
(a) $f(\theta) = 3 + 2\sin(5\theta) - 5\cos(\theta)$
(b) $f(\theta) = \cos(\theta/2)$ (Hint: Write $\cos(\theta/2)$ in terms of $\sin\theta$ and $\cos\theta$.)
(c) $f(\theta) = \sin^2(\theta)$
(d) $f(\theta) = \theta$
(e) $f(\theta) = |\theta|$

(f) i () = ||.

It is interesting to plot the original function $f(\theta)$ as well as the sum of the first five or so terms of its Fourier series.

4. The series $f(\theta) = \frac{\pi^2}{3} + 4\sum_{k=1}^{\infty} \frac{(-1)^k}{k^2}\cos(k\theta)$ converges (absolutely) for any $\theta$. What is $f$? (Hint: Guess $f$, then verify that the series is its Fourier series.)

5. The Riemann-Lebesgue Lemma (Exercise 8 on page 203) says that when $f$ is UC on $[a, b]$, $\lim_{\lambda\to\infty} \int_a^b f(t)\sin(\lambda t)\,dt = 0$. What does this say about Fourier series? In advanced courses this lemma is often proved under weaker assumptions than uniform continuity; try it for $f$ a Riemann integrable function (Definition 5.3.28).

8.1 The Complex Numbers C

You are probably familiar with complex numbers as expressions of the form $a + bi$, where $i = \sqrt{-1}$. This is a fine working definition, but we give a formal definition since we are being rather careful about our definitions in this book.

Definition 8.1.1 (Complex numbers) A complex number is an ordered pair $(a, b)$ of real numbers. Two complex numbers $(a, b)$ and $(c, d)$ are equal exactly when $a = c$ and $b = d$.

• $z \in \mathbb{C}$ means $z$ is a complex number.

• The reals $\mathbb{R}$ are embedded in the complex numbers $\mathbb{C}$ by $a \to (a, 0)$; in fact, we usually write the complex number $(a, 0)$ as simply $a$.

• We define $i = (0, 1)$.

• Complex numbers are added according to the formula $(a, b) + (c, d) = (a + c, b + d)$.

• Complex numbers are multiplied according to the formula $(a, b) \cdot (c, d) = (ac - bd, bc + ad)$; in particular, $a \cdot (c, d) = (a, 0) \cdot (c, d) = (ac, ad)$ and so $(c, d) = (c, 0) + d \cdot (0, 1) = c + di$.

• If $z = a + bi$, then $a$ and $b$ are called, respectively, the real and imaginary parts of $z$; $i^2 = -1$.

• If $a^2 + b^2 > 0$, then $a + bi$ is called a unit, and

\[ (a + bi) \cdot \left(\frac{a - bi}{a^2 + b^2}\right) = (1, 0) \quad \text{or} \quad (a + bi)\left(\frac{a - bi}{a^2 + b^2}\right) = 1. \]

The complex number $\frac{a - bi}{a^2 + b^2}$ is called the inverse or reciprocal of $(a, b)$, and is denoted $\frac{1}{a + bi}$ or $(a + bi)^{-1}$.

• For a complex number $z = a + bi$, the norm or length of $z$ is given by $|z| = |a + bi| = \sqrt{a^2 + b^2}$.

• For a complex number $z = a + bi$, the conjugate of $z$ is given by $\bar{z} = \overline{a + bi} = a - bi$.
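The ordered-pair definition translates almost word for word into code. Here is a minimal sketch, assuming rational components so that the arithmetic is exact; the function names `add`, `mul`, `conj`, and `inverse` are chosen here for illustration and are not from the text.

```python
from fractions import Fraction

def add(z, w):
    """(a, b) + (c, d) = (a + c, b + d)."""
    (a, b), (c, d) = z, w
    return (a + c, b + d)

def mul(z, w):
    """(a, b) * (c, d) = (ac - bd, bc + ad)."""
    (a, b), (c, d) = z, w
    return (a * c - b * d, b * c + a * d)

def conj(z):
    """Conjugate of a + bi is a - bi."""
    a, b = z
    return (a, -b)

def inverse(z):
    """Reciprocal (a - bi)/(a^2 + b^2), defined when a^2 + b^2 > 0."""
    a, b = z
    n2 = a * a + b * b
    if n2 == 0:
        raise ZeroDivisionError("(0, 0) has no reciprocal")
    return (Fraction(a) / n2, Fraction(-b) / n2)

i = (0, 1)
z = (3, 4)
print(mul(i, i))            # i squared
print(mul(z, inverse(z)))   # z times its reciprocal
print(mul(z, conj(z)))      # z times its conjugate, i.e. |z|^2 as a pair
```

With $z = (3, 4)$ the last product has real part $25 = |z|^2$ and imaginary part 0, in line with the conjugate and norm definitions above.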

Here are some facts that follow in a straightforward way from these definitions.

Proposition 8.1.2 (Properties of complex numbers)

• The operations of addition and multiplication are commutative and associative, and multiplication distributes over addition: $z(w + w') = zw + zw'$.

• $\overline{z + w} = \bar{z} + \bar{w}$; $\overline{zw} = \bar{z}\,\bar{w}$; $z\bar{z} = |z|^2$; $1/z = z^{-1} = \bar{z}/|z|^2$.

• $|zw| = |z||w|$ and $|z + w| \le |z| + |w|$ (triangle inequality for complex numbers).

The following result tells us that a complex number is small if and only if both its real and imaginary parts are small.

Lemma 8.1.3 If $|a + bi| < \epsilon$, then $|a| < \epsilon$ and $|b| < \epsilon$. Conversely, if $\alpha + \beta = 1$ where $\alpha$ and $\beta$ are positive reals, then $|a| < \alpha\epsilon$ and $|b| < \beta\epsilon$ imply that $|a + bi| < \epsilon$.

It is now possible to go through all the material in previous chapters on limits and convergence and see that the definitions still make sense for complex numbers instead of reals, with $|\cdot|$ for complex numbers in place of the absolute value for reals. Here is a brief summary of easily proved results.

Proposition 8.1.4 Suppose $(z_k) = (a_k + ib_k)$ is a sequence of complex numbers.

1. $(z_k) \to (a + bi)$ if and only if $(a_k) \to a$ and $(b_k) \to b$.

2. $(z_k)$ is a Cauchy sequence if and only if $(a_k)$ and $(b_k)$ are Cauchy sequences.

3. $\sum_{k=0}^{\infty} z_k$ converges (or converges absolutely) if and only if $\sum_{k=0}^{\infty} a_k$ and $\sum_{k=0}^{\infty} b_k$ converge (or converge absolutely).

4. If the series $\sum_{k=0}^{\infty} s_k z^k$ converges for $z = a + bi$ ($s_k$ are complex numbers), and $0 \le r < |a + bi|$, then this series converges for all $z$ with $|z| \le r$.

Note: $\sum_{k=0}^{\infty} z_k$ converges absolutely means that $\sum_{k=0}^{\infty} |z_k|$ converges.

Here is an important result to show how this kind of thing works.

Theorem 8.1.5 The complex exponential series

\[ e^z = \sum_{k=0}^{\infty} \frac{z^k}{k!} \]

converges uniformly and absolutely in any disk $|z| \le R$.

Proof. For $M < N$,

\[ \left|\sum_{k=0}^{N} \frac{z^k}{k!} - \sum_{k=0}^{M} \frac{z^k}{k!}\right| \le \sum_{k=M+1}^{N} \left|\frac{z^k}{k!}\right| = \sum_{k=M+1}^{N} \frac{|z|^k}{k!}. \]

The sum on the right is a truncation of the real Taylor series for $e^{|z|}$, which we know converges uniformly for $|z| \le R$, for any fixed $R$. Thus, by making $M$ and $N$ large enough (depending only on $R$), this sum on the right can be made as small as we please. It follows that the partial sums $\sum_{k=0}^{N} \left|\frac{z^k}{k!}\right|$ form a Cauchy sequence, so the series for $e^z$ converges absolutely and uniformly.

Exercises

1. Prove some of the results stated in this section about the algebra of the complex numbers.

2. Prove Proposition 8.1.4 relating convergence of sequences of complex numbers to convergence of their real and imaginary parts.

3. Suppose $\sum_{k=0}^{\infty} u_k(t) + iv_k(t)$ converges to $u(t) + iv(t)$ for each $t \in I$. Then $\sum_{k=0}^{\infty} u_k(t) + iv_k(t)$ converges uniformly to $u(t) + iv(t)$ on $I$ if and only if $\sum_{k=0}^{\infty} u_k(t)$ and $\sum_{k=0}^{\infty} v_k(t)$ both converge uniformly to $u(t)$ and $v(t)$, respectively.

4. For any complex number $a$, prove that the geometric series $\sum_{k=0}^{\infty} az^k$ converges to $\frac{a}{1-z}$ when $|z| < 1$, and that this convergence is uniform in any interval $[-R, R]$, where $0 \le R < 1$.

8.2 Complex Functions and Vectors

In this chapter, we are only concerned with complex-valued functions of a real variable. The situation with complex-valued functions of a complex variable is very much different, especially with regard to differentiation and integration; so different, in fact, that there is a separate branch of mathematics called Complex Analysis devoted to it.

We will be looking at functions $f : I \to \mathbb{C}$, where $I$ is a real interval. Since the conditions for uniform continuity and differentiability involve absolute value, we can substitute the complex norm $|\cdot|$ and everything goes through pretty much as before.

Proposition 8.2.1 $f(t) = u(t) + iv(t)$ is uniformly continuous if and only if the functions $u(t)$ and $v(t)$ are uniformly continuous.

Proposition 8.2.2 $f(t) = u(t) + iv(t)$ is uniformly differentiable if and only if the functions $u(t)$ and $v(t)$ are uniformly differentiable. When $f$ is UD, $f'(t) = u'(t) + iv'(t)$. If $f'(t) = 0$ on $I$, then $f$ is constant on $I$.

Next we come to the complex exponential function. We will not study $e^z$ (see Theorem 8.1.5) extensively as a function of $z$: we won't need that for studying Fourier series. Instead, we fix a value of $z$, say $z = z_0 = a + bi$, and look at the two functions:

\[ G(t) = e^{tz_0} = \sum_{k=0}^{\infty} \frac{(tz_0)^k}{k!} = \sum_{k=0}^{\infty} \frac{t^k z_0^k}{k!} \]

\[ F(t) = e^{at}(\cos bt + i\sin bt). \]

Proposition 8.2.3 $G(t) = F(t)$; i.e. $e^{t(a+bi)} = e^{at}(\cos bt + i\sin bt)$.

Proof. Theorem 8.1.5 tells us that the series for $e^{tz_0}$ is uniformly and absolutely convergent; differentiating it term by term gives $G'(t) = z_0 G(t)$. On the other hand, straightforward differentiation of $F$ using the previous proposition gives $F'(t) = z_0 F(t)$ as well. These formulas and the complex-valued version of the quotient rule applied to $Q(t) = G(t)/F(t)$ show that $Q'(t) = 0$. (We note that $F(t)$ is never 0.) Thus, $Q(t)$ is constant. Since $G(0) = F(0) = 1$, this constant is 1, so we get $F(t) = G(t)$.

Corollary 8.2.4 (Euler's formula) For any real number $\theta$, $e^{i\theta} = \cos\theta + i\sin\theta$.

Corollary 8.2.5 (Addition formula for complex exponents) For any complex numbers $z$ and $z'$: $e^{z+z'} = e^z e^{z'}$.

Corollary 8.2.6 (de Moivre's formula) For any integer $n$: $\left(e^{i\theta}\right)^n = (\cos\theta + i\sin\theta)^n = \cos n\theta + i\sin n\theta$.

Proof. When $n > 0$, this is proved by induction, using the sum-of-angles formulas for sine and cosine; also, by definition, $(\cos\theta + i\sin\theta)^{-n} = 1/(\cos\theta + i\sin\theta)^n = (\cos\theta - i\sin\theta)^n$ and a similar induction argument works.

Note that de Moivre's formula holds for much more general exponents than integers, but to deal with them we would have to make a big detour into defining complex exponentiation, a topic best left for a complex variables course.

Corollary 8.2.7 If $z = re^{i\theta}$, then for any integer $k$

\[ z^k = r^k e^{ik\theta} = r^k(\cos(k\theta) + i\sin(k\theta)) \]

\[ \bar{z}^k = r^k e^{-ik\theta} = r^k(\cos(k\theta) - i\sin(k\theta)). \]
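Proposition 8.2.3 and de Moivre's formula are easy to sanity-check in floating point. The sketch below is only a numerical illustration, not a proof; the helper name `exp_series`, the number of terms, and the sample values of $a$, $b$, $t$, $\theta$, and $n$ are arbitrary choices made here.

```python
import math

def exp_series(z, terms=60):
    """Partial sum of sum_k z^k / k! (the series of Theorem 8.1.5), built term by term."""
    total, term = 0j, 1 + 0j
    for k in range(terms):
        total += term
        term *= z / (k + 1)   # turns z^k/k! into z^(k+1)/(k+1)!
    return total

# G(t) = e^{t(a+bi)} from the series versus F(t) = e^{at}(cos bt + i sin bt).
a, b, t = 0.5, 2.0, 1.25
G = exp_series(t * complex(a, b))
F = math.exp(a * t) * complex(math.cos(b * t), math.sin(b * t))
print(abs(G - F))

# de Moivre: (cos(theta) + i sin(theta))^n versus cos(n theta) + i sin(n theta).
theta, n = 0.7, 5
lhs = complex(math.cos(theta), math.sin(theta)) ** n
rhs = complex(math.cos(n * theta), math.sin(n * theta))
print(abs(lhs - rhs))
```

Both printed differences should be at the level of floating-point rounding error.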

We are almost ready to apply these formulas to express Fourier series in a very compact way; we just need to know how to compute the integral of a complex-valued function. Fortunately, we can just split such a function into real and imaginary parts, and then define the integral in the most straightforward way.

Definition 8.2.8 Suppose $f : I \to \mathbb{C}$, $f(t) = u(t) + iv(t)$ where $u$ and $v$ are Riemann integrable in $I$. Then

\[ \int_a^b f(t)\,dt = \int_a^b u(t)\,dt + i\int_a^b v(t)\,dt, \]

where $[a, b] \subset I$.

The following will be very useful shortly.

Lemma 8.2.9 For any integer $k$,

\[ \int_{-\pi}^{\pi} e^{ikt}\,dt = \begin{cases} 0 & \text{if } k \ne 0 \\ 2\pi & \text{if } k = 0. \end{cases} \]

Definition 8.2.10 If the real-valued, Riemann integrable function $f$ is $2\pi$-periodic, its complex Fourier series is given by

\[ \sum_{k=-\infty}^{\infty} c_k e^{ik\theta} \quad \text{where} \quad c_k = \frac{1}{2\pi}\int_{-\pi}^{\pi} f(t)e^{-ikt}\,dt. \]

Note that this is a formal definition: we are not asserting that this series either converges or, if it does converge, that it converges to $f$. We will deal with these difficult questions in the next section. Also, note that the doubly infinite sum $\sum_{k=-\infty}^{\infty} c_k e^{ik\theta}$ means $c_0 + \sum_{k=1}^{\infty} \left(c_k e^{ik\theta} + c_{-k}e^{-ik\theta}\right)$.

It turns out that this complex Fourier series is simply our complete Fourier series in disguise, as we see in the following basically computational result.

Proposition 8.2.11 As formal series (without regard to convergence),

\[ \frac{a_0}{2} + \sum_{k=1}^{\infty} a_k\cos(k\theta) + b_k\sin(k\theta) = \sum_{k=-\infty}^{\infty} c_k e^{ik\theta}. \]

Proof. Because of Euler's formula $e^{i\theta} = \cos\theta + i\sin\theta$, we see that $\cos kt = (e^{ikt} + e^{-ikt})/2$ and $\sin(kt) = (e^{ikt} - e^{-ikt})/(2i)$, which gives us $c_k = (a_k - ib_k)/2$ and $c_{-k} = (a_k + ib_k)/2$ for $k \ge 0$. It therefore follows that $c_k e^{ik\theta} + c_{-k}e^{-ik\theta} = a_k\cos k\theta + b_k\sin k\theta$ for $k = 1, 2, \ldots$; also, $\frac{a_0}{2} = \frac{1}{2\pi}\int_{-\pi}^{\pi} f(t)\,dt = c_0$.

Looking back at the definition of the Fourier coefficients $a_k$, $b_k$ and $c_k$, as well as the formulas in Lemma 8.0.1, we see integrals of the form $\int_{-\pi}^{\pi} f(t)\phi(t)\,dt$, where the functions $\phi$ are from the families $\{\cos kt\}$, $\{\sin kt\}$, or $\{e^{\pm ikt}\}$.

Definition 8.2.12 (Inner product) If $f$ and $\phi$ are $2\pi$-periodic functions on the interval $[-\pi, \pi]$, their inner product is given by

\[ \langle f, \phi\rangle = \frac{1}{2\pi}\int_{-\pi}^{\pi} f(t)\overline{\phi(t)}\,dt, \]

where $\overline{\phi(t)}$ is the complex conjugate of $\phi(t)$.

The inner product of two functions, as defined above, has the following properties, which form the basis of the definition of the term "inner product" in the much more general setting of vector spaces and linear algebra.

Proposition 8.2.13

1. $\langle f, f\rangle$ is real and $\langle f, f\rangle \ge 0$, with equality if and only if $f = 0$.

2. $\langle \phi, f\rangle = \overline{\langle f, \phi\rangle}$

3. $\langle af + bg, \phi\rangle = \langle af, \phi\rangle + \langle bg, \phi\rangle = a\langle f, \phi\rangle + b\langle g, \phi\rangle$ and $\langle f, \alpha\phi + \beta\psi\rangle = \langle f, \alpha\phi\rangle + \langle f, \beta\psi\rangle = \bar{\alpha}\langle f, \phi\rangle + \bar{\beta}\langle f, \psi\rangle$.

If you skip to the end of this section, you'll see that inner products are generalizations of dot products of vectors. Two vectors are perpendicular, or orthogonal, if their dot product is 0. The length or magnitude of a vector $\mathbf{v}$ is defined to be $\|\mathbf{v}\| = \sqrt{\mathbf{v}\cdot\mathbf{v}}$. A vector has length 1 (is a unit vector) if its dot product with itself is 1. These ideas from elementary linear algebra motivate the following definition for functions.

Definition 8.2.14 The norm of a function $\phi$ is given by

\[ \|\phi\| = \sqrt{\langle\phi, \phi\rangle} = \left(\frac{1}{2\pi}\int_{-\pi}^{\pi} |\phi(t)|^2\,dt\right)^{1/2}. \]

$\phi$ and $\psi$ are called orthogonal if $\langle\phi, \psi\rangle = 0$. A family of functions $\{\phi_k\}$, $k = 0, 1, \ldots$ is called orthonormal if

\[ \langle\phi_i, \phi_j\rangle = \begin{cases} 1 & \text{if } i = j \\ 0 & \text{if } i \ne j. \end{cases} \]

The next two propositions explain why we make all these definitions.

Proposition 8.2.15 The family of functions $\phi_k(t) = e^{ikt}$, $k = 0, \pm 1, \pm 2, \ldots$ is an orthonormal family.

Proof. This follows from computations using Lemma 8.2.9.

Proposition 8.2.16 If $f$ is Riemann integrable then its complex Fourier coefficients are given by

\[ c_k = \langle f, \phi_k\rangle = \langle f, e^{ikt}\rangle. \]
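Propositions 8.2.15 and 8.2.16 can be illustrated numerically. The sketch below approximates the inner product of Definition 8.2.12 by a midpoint Riemann sum; the names `inner` and `phi`, the sample count, and the sample function $f(t) = t^2$ are assumptions made for this illustration. For $t^2$ the real coefficient $a_1$ is $-4$, so one expects $c_1 = a_1/2 = -2$.

```python
import cmath
import math

def inner(f, g, samples=4096):
    """Midpoint approximation of <f, g> = (1/(2 pi)) * integral of f(t) * conj(g(t)) on [-pi, pi]."""
    h = 2 * math.pi / samples
    total = 0j
    for j in range(samples):
        t = -math.pi + (j + 0.5) * h
        total += f(t) * g(t).conjugate()
    return total * h / (2 * math.pi)

def phi(k):
    """The function t -> e^{ikt}."""
    return lambda t: cmath.exp(1j * k * t)

def f(t):
    # Illustrative sample function, returned as a complex number.
    return complex(t * t)

print(abs(inner(phi(2), phi(2))))    # norm squared of phi_2
print(abs(inner(phi(2), phi(-3))))   # inner product of two different members
c1 = inner(f, phi(1))                # complex Fourier coefficient c_1 of f
print(c1)
```

The first value should be essentially 1 and the second essentially 0, as orthonormality requires, while `c1` should be close to $-2$ up to the quadrature error.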

The following theorem is a classical result; we will use it in the next section. If you take more advanced courses in functional analysis, you will see it stated and proved in far more generality for so-called inner product spaces, and, especially, for Hilbert spaces.

Theorem 8.2.17 (Bessel’s Inequality) Suppose the family $\{\varphi_k\}$ is orthonormal and $f$ and the $\varphi_k$ are $2\pi$-periodic. If $c_k = \langle f, \varphi_k \rangle$ then
\[ \sum_{k=0}^{n} |c_k|^2 \le \frac{1}{2\pi} \int_{-\pi}^{\pi} |f|^2 = \|f\|^2. \]

Proof. Apply the results from Proposition 8.2.13 to the inequality $0 \le \langle f - \varphi, f - \varphi \rangle$, where $\varphi = \sum_{k=0}^{n} c_k \varphi_k$. Details are left as an exercise.

Let’s go back for a moment to vectors, which you probably encountered in second-year calculus. Consider the following analogy. If we have two $n$-dimensional real vectors $\mathbf{v} = (v_1, v_2, \dots, v_n)$ and $\mathbf{w} = (w_1, w_2, \dots, w_n)$, then we can think of each as a function
\[ \mathbf{v} = v : \{1, 2, \dots, n\} \to \mathbb{R} \ \text{ where } v(k) = v_k, \qquad \mathbf{w} = w : \{1, 2, \dots, n\} \to \mathbb{R} \ \text{ where } w(k) = w_k. \]
The standard dot product of $\mathbf{v}$ and $\mathbf{w}$ is then given by
\[ \mathbf{v} \cdot \mathbf{w} = \sum_{k=1}^{n} v_k w_k = \sum_{k=1}^{n} v(k) w(k). \]

If we think of integrals as kinds of sums (actually limits of sums), then this is similar to the inner product $\langle v, w \rangle$ of the coordinate functions $v$ and $w$. Inner products of functions and dot products of vectors are basically the same idea. We now extend the dot product construction to vectors with complex coordinates.

Definition 8.2.18 (Inner products of complex vectors) For complex vectors $\mathbf{v} = (v_1, v_2, \dots, v_n)$ and $\mathbf{w} = (w_1, w_2, \dots, w_n)$, the dot or inner product is given by
\[ \mathbf{v} \cdot \mathbf{w} = \sum_{k=1}^{n} v_k \overline{w_k}. \]
The length, or norm, of $\mathbf{v}$ is given by
\[ \|\mathbf{v}\| = \sqrt{\mathbf{v} \cdot \mathbf{v}} = \sqrt{|v_1|^2 + |v_2|^2 + \cdots + |v_n|^2}. \]

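Finally, here is another important classical result that also can be extended to arbitrary inner product spaces.

Theorem 8.2.19 (Cauchy-Schwarz inequality) For complex vectors $\mathbf{a}$ and $\mathbf{b}$,
\[ |\mathbf{a} \cdot \mathbf{b}| \le \|\mathbf{a}\| \, \|\mathbf{b}\|. \]

Proof. Use the properties summarized in Proposition 8.2.13, which apply to all inner products. If $\mathbf{b} = 0$ the inequality is trivial, so assume $\mathbf{b} \ne 0$. Applying $0 \le (\mathbf{a} - \lambda\mathbf{b}) \cdot (\mathbf{a} - \lambda\mathbf{b})$ with $\lambda = \dfrac{\mathbf{a} \cdot \mathbf{b}}{\mathbf{b} \cdot \mathbf{b}}$, and using $\mathbf{b} \cdot \mathbf{a} = \overline{\mathbf{a} \cdot \mathbf{b}}$, gives
\[ 0 \le \mathbf{a} \cdot \mathbf{a} - \frac{\overline{\mathbf{a} \cdot \mathbf{b}}}{\mathbf{b} \cdot \mathbf{b}} (\mathbf{a} \cdot \mathbf{b}) - \frac{\mathbf{a} \cdot \mathbf{b}}{\mathbf{b} \cdot \mathbf{b}} (\mathbf{b} \cdot \mathbf{a}) + \frac{(\mathbf{a} \cdot \mathbf{b})(\mathbf{b} \cdot \mathbf{a})}{\mathbf{b} \cdot \mathbf{b}} = \mathbf{a} \cdot \mathbf{a} - \frac{(\mathbf{a} \cdot \mathbf{b}) \overline{(\mathbf{a} \cdot \mathbf{b})}}{\mathbf{b} \cdot \mathbf{b}}. \]
Multiplying by $\mathbf{b} \cdot \mathbf{b}$ and taking square roots completes the proof.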
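The algebra in this proof can be checked numerically. The sketch below, with arbitrarily chosen sample vectors and hypothetical helper names `dot` and `norm`, implements the complex dot product of Definition 8.2.18, forms $\lambda = (\mathbf{a} \cdot \mathbf{b})/(\mathbf{b} \cdot \mathbf{b})$, and confirms that $(\mathbf{a} - \lambda\mathbf{b}) \cdot (\mathbf{a} - \lambda\mathbf{b})$ equals $\mathbf{a} \cdot \mathbf{a} - |\mathbf{a} \cdot \mathbf{b}|^2 / (\mathbf{b} \cdot \mathbf{b})$ and is nonnegative.

```python
import math


def dot(a, b):
    """Complex dot product: sum of a_k * conj(b_k)."""
    return sum(x * complex(y).conjugate() for x, y in zip(a, b))


def norm(a):
    """Length of a complex vector: sqrt(a . a); a . a is real and nonnegative."""
    return math.sqrt(dot(a, a).real)


# Sample vectors chosen only for illustration.
a = [1 + 2j, -3j, 2.5 + 0j]
b = [0.5 - 1j, 4 + 1j, -1 + 1j]

lam = dot(a, b) / dot(b, b)                       # the substitution used in the proof
residual = [x - lam * y for x, y in zip(a, b)]    # the vector a - lam * b
gap = dot(residual, residual).real                # should equal a.a - |a.b|^2 / (b.b)

if __name__ == "__main__":
    print("|a.b|      =", abs(dot(a, b)))
    print("||a|| ||b|| =", norm(a) * norm(b))
    print("gap        =", gap)
```

The quantity `gap` is the squared length of the part of $\mathbf{a}$ left over after subtracting $\lambda\mathbf{b}$; the inequality is strict exactly when this leftover part is nonzero.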


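Remark 8.2.20 This proof is such a classic that it’s a good idea to remember it. One way to think of it is to multiply out $(\mathbf{a} - \lambda\mathbf{b}) \cdot (\mathbf{a} - \lambda\mathbf{b})$ as if the vectors were real. You then get
\[ \mathbf{a} \cdot \mathbf{a} - 2\lambda (\mathbf{a} \cdot \mathbf{b}) + \lambda^2 (\mathbf{b} \cdot \mathbf{b}) \ge 0. \]
This is a quadratic function of $\lambda$ and has a minimum at $\lambda = \dfrac{\mathbf{a} \cdot \mathbf{b}}{\mathbf{b} \cdot \mathbf{b}}$, so that’s the substitution you make to get Cauchy-Schwarz. (If you know something about projection of vectors from linear algebra, you can also see that this value of $\lambda$ minimizes the length of $\mathbf{a} - \lambda\mathbf{b}$.)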

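Remark 8.2.21 There is a version of Cauchy-Schwarz for functions UC on $[a, b]$ and the inner product $\langle \varphi, \psi \rangle$:
\[ |\langle \varphi, \psi \rangle| \le \sqrt{\langle \varphi, \varphi \rangle} \, \sqrt{\langle \psi, \psi \rangle}, \]
i.e.
\[ \left| \int_a^b \varphi(t)\psi(t)\,dt \right| \le \left( \int_a^b (\varphi(t))^2\,dt \right)^{1/2} \left( \int_a^b (\psi(t))^2\,dt \right)^{1/2}. \]
This is a special case of Hölder’s Inequality, whose proof is outlined in exercise 7 on page 209.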

Exercises

1. Show that the family of functions $\{\sin kt\}$ and $\{\cos kt\}$, for $k = 0, 1, \dots$ are pairwise orthogonal. Make them into an orthonormal family.
2. Verify the details in the proof that the functions $\{e^{ikt}\}$, $k = 0, \pm 1, \pm 2, \dots$ form an orthonormal family.
3. Prove Bessel’s inequality (Theorem 8.2.17).
4. Prove Euler’s formula, $e^{i\theta} = \cos\theta + i\sin\theta$, by substituting $i\theta$ into the series for $e^z$ and collecting real and imaginary parts.
5. Using the result $e^{t(a+bi)} = e^{at}(\cos bt + i \sin bt)$, prove that $e^{z+z'} = e^z e^{z'}$.
6. Recall that a function $f$ is even if $f(-x) = f(x)$ for all $x$ in its domain, and is odd if $f(-x) = -f(x)$.
(a) Show that the Fourier series for an even function involves only $\cos k\theta$ terms, while the Fourier series for an odd function has only $\sin k\theta$ terms.
(b) For any function $f$, let $f_e(x) = \dfrac{f(x) + f(-x)}{2}$ and $f_o(x) = \dfrac{f(x) - f(-x)}{2}$. Prove that $f = f_e + f_o$ and that $f_e$ is even and $f_o$ is odd.
(c) Relate the previous parts to the full Fourier series for any function $f$.


7. Calculate directly the complex Fourier series for the function $f(\theta) = |\theta|$, $\theta \in [-\pi, \pi]$. Do the same for the function $f(\theta) = \pi - |\theta|$. These are both even functions.
8. Calculate the Fourier series for the function $f$ described below ($f$ is odd).

S

; ? i () = =

if @2 if @2 @2 if @2

S S

S

9. (Project) Fourier analysis can be done for functions that are $2L$-periodic, i.e. such that $f(\theta + 2L) = f(\theta)$ for all $\theta$. Discuss how the families of functions $\{\sin(k\theta), \cos(k\theta)\}$ or $\{e^{ik\theta}\}$ have to be adjusted if we take these functions to be defined on $[-L, L]$, and restate some of the results about coefficients. Work out some examples of Fourier series in this more general setting (used by Bernoulli and Fourier).

8.3 Fourier Series Theory

We now turn to two important questions about convergence.
• Does the Fourier series for a function $f$ converge?
• If the Fourier series for $f$ converges, does it converge to $f$?

The answer to the first question is rather complicated. There exist rather complicated examples of periodic continuous functions whose Fourier series don’t converge at some points; the first one was constructed by du Bois-Reymond (Paul David Gustav du Bois-Reymond, 1831-1889); see [Pinsky, 2002]. On the other hand, there is a very difficult and famous result of Lennart Carleson (1966) that the Fourier series of continuous functions (actually $L^2$ functions) must converge at almost every point of their domain, where “almost every” has a precise meaning in measure theory.

The second question is also not trivial: the Fourier series for $f$ may converge, but not necessarily to $f$ (see Example 8.3.13 at the end of this section). When $f$ is just uniformly continuous, its values can be computed from its Fourier series (we will prove this), even though the series may not actually converge, even pointwise, to $f$. On the other hand, as we shall see, if the Fourier series for $f$ converges, and $f$ is UC, then the Fourier series converges to $f$.

First, note that each of the functions $\sin(k\theta)$ and $\cos(k\theta)$ has period $2\pi/k$, $k = 1, 2, \dots$ We wouldn’t even expect the Fourier series for $f$ to converge to $f$, then, unless $f$ is at least $2\pi$-periodic, so we will assume this. The following classical result is by no means the best possible general criterion for convergence, but deeper results are beyond the scope of this text. Note how its proof uses the Cauchy-Schwarz inequality (Theorem 8.2.19) and Bessel’s inequality (Theorem 8.2.17). A lot of classical analysis was created to deal with the kinds of questions the convergence of Fourier series raises.

Theorem 8.3.1 (Convergence of Fourier series) If $f$ has period $2\pi$ and is uniformly differentiable on $[-\pi, \pi]$, then its Fourier series converges uniformly.

Proof. Let $c_k$ denote the $k$th Fourier coefficient of $f$ and $c'_k$ denote the $k$th Fourier coefficient of $f'$. It is easy to see, using integration by parts, that for each $k$, $c'_k = ikc_k$. (This is where we use the periodicity of $f$.) Since the Fourier coefficients are of the form $\langle f, e^{ikt} \rangle$, we can apply Bessel’s inequality to obtain

\[ \sum_{k=1}^{N} k^2 |c_k|^2 = \sum_{k=1}^{N} |c'_k|^2 \le \frac{1}{2\pi} \int_{-\pi}^{\pi} |f'|^2 = B < \infty; \]
similarly,
\[ \sum_{k=1}^{N} k^2 |c_{-k}|^2 = \sum_{k=1}^{N} |c'_{-k}|^2 \le B < \infty. \]

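For integers $M \le N$ we can also apply the Cauchy-Schwarz inequality to the vectors
\[ \mathbf{A} = \left( \frac{1}{M}, \frac{1}{M+1}, \dots, \frac{1}{N} \right), \qquad \mathbf{B} = \big( M|c_M|, (M+1)|c_{M+1}|, \dots, N|c_N| \big), \]
which gives us
\[ \sum_{k=M}^{N} |c_k| = \sum_{k=M}^{N} \frac{1}{k} \big( k|c_k| \big) \le \left( \sum_{k=M}^{N} \frac{1}{k^2} \right)^{1/2} \left( \sum_{k=M}^{N} k^2 |c_k|^2 \right)^{1/2} \le \left( \sum_{k=M}^{N} \frac{1}{k^2} \right)^{1/2} \sqrt{B}. \]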

By the Cauchy test, then, $\sum_{k=1}^{\infty} |c_k|$ converges since $\sum_{k=1}^{\infty} 1/k^2$ does. Now $|c_k e^{ik\theta}| = |c_k|$, so the Weierstrass M-test (Theorem 7.1.19) tells us that the series $\sum_{k=1}^{\infty} c_k e^{ik\theta}$ converges uniformly and absolutely. A similar argument gives convergence of the negative half of the series.

Now we examine the question of what the Fourier series converges to. Here we will assume only that $f$ is uniformly continuous on $[-\pi, \pi]$ and is $2\pi$-periodic ($f(\theta + 2\pi) = f(\theta)$ for all $\theta$). Since we are interested in what the Fourier series
\[ F(\theta) = \sum_{k=-\infty}^{\infty} c_k e^{ik\theta} \]
converges to, we look at the more general power series
\[ U(x, \theta) = \sum_{k=-\infty}^{\infty} \big( c_k e^{ik\theta} \big) x^{|k|}. \]

Note that when $x = 1$ this sum becomes just the Fourier series for $f$. Although we don’t know that the Fourier series converges, Abel’s Summation Theorem (Theorem 7.3.35) suggests that we take the limit of this power series as $x \to 1$. If this limit $L$ exists, then we say that $F$ is Abel summable to $L$. Our main result will be to show that this limit is just $f(\theta)$, so that the Fourier coefficients $c_k$ actually determine $f$.

We note first that the coefficients $c_k e^{ik\theta}$ of $U(x, \theta)$ are bounded. This is because
\[ \big| c_k e^{ik\theta} \big| = |c_k| = \left| \frac{1}{2\pi} \int_{-\pi}^{\pi} f(t) e^{-ikt}\,dt \right| \le \frac{1}{2\pi} \int_{-\pi}^{\pi} |f(t)| \, \big| e^{-ikt} \big|\,dt = \frac{1}{2\pi} \int_{-\pi}^{\pi} |f(t)|\,dt = K < \infty \]
(since $f$ is assumed UC on $[-\pi, \pi]$).

Lemma 8.3.2 For each $0 \le r < 1$, $U(r, \theta)$ converges absolutely and uniformly as a function of $\theta$.

Proof. $\sum_{k=-\infty}^{\infty} \big| \big( c_k e^{ik\theta} \big) r^{|k|} \big| \le \sum_{k=-\infty}^{\infty} K r^{|k|}$, so this series is bounded by a convergent geometric series and our result follows from Theorem 7.3.3, the comparison test for series of functions.

We now replace the coefficients $c_k$ by their values as integrals. Since we have established uniform convergence, we can interchange the order of summation and integration by Proposition 7.2.1; thus, for $0 \le r < 1$,
\[ \begin{aligned} U(r, \theta) = \sum_{k=-\infty}^{\infty} c_k e^{ik\theta} r^{|k|} &= \sum_{k=-\infty}^{\infty} \left( \frac{1}{2\pi} \int_{-\pi}^{\pi} f(t) e^{-ikt}\,dt \right) e^{ik\theta} r^{|k|} \\ &= \int_{-\pi}^{\pi} \left( \frac{1}{2\pi} \sum_{k=-\infty}^{\infty} e^{ik(\theta - t)} r^{|k|} \right) f(t)\,dt \\ &= \int_{-\pi}^{\pi} P(r, \theta - t) f(t)\,dt. \end{aligned} \]


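Definition 8.3.3 (Poisson kernel) $P(r, \varphi) = \dfrac{1}{2\pi} \sum_{k=-\infty}^{\infty} e^{ik\varphi} r^{|k|}$ (defined for $0 \le r < 1$) is called the Poisson kernel.

Proposition 8.3.4 For $0 \le r < 1$,
\[ P(r, \varphi) = \frac{1}{2\pi} \cdot \frac{1 - r^2}{1 - 2r\cos\varphi + r^2}. \]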

Proof. This is an exercise; note that $P$ is the sum of two geometric series, one for positive $k$ and the other for negative $k$.

Definition 8.3.5 (Convolution) For $f$ and $g$ uniformly continuous on $[-\pi, \pi]$, their convolution $f * g$ is given by
\[ (f * g)(x) = \int_{-\pi}^{\pi} f(x - t) g(t)\,dt. \]

Proposition 8.3.6 $f * g = g * f$.

Proof. See the exercises.

Corollary 8.3.7 For each $0 \le r < 1$, $U(r, \theta) = P(r, \theta) * f(\theta) = f(\theta) * P(r, \theta)$.

In order to find $\lim_{r \to 1} U(r, \theta)$, we need to look at the Poisson kernel more carefully.
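One way to look at the kernel concretely is to evaluate it. The sketch below is a numerical illustration only: it assumes the normalization in which the factor $1/(2\pi)$ is part of $P$, and the function names, the truncation length `terms`, and the sample count `n` are hypothetical choices. It compares a truncated version of the defining series with the closed form, integrates the kernel over one period, and measures how much of that integral lies outside $|\theta| < \delta$.

```python
import cmath
import math


def poisson_series(r, phi, terms=200):
    """Truncated defining series (1/2pi) * sum over |k| <= terms of e^{ik phi} r^{|k|}."""
    total = 1.0 + 0j
    for k in range(1, terms + 1):
        total += (r ** k) * (cmath.exp(1j * k * phi) + cmath.exp(-1j * k * phi))
    return total.real / (2 * math.pi)


def poisson_closed(r, phi):
    """Closed form (1/2pi) * (1 - r^2) / (1 - 2 r cos(phi) + r^2)."""
    return (1 - r * r) / (2 * math.pi * (1 - 2 * r * math.cos(phi) + r * r))


def integral_over_period(g, n=4000):
    """Midpoint Riemann sum of g over [-pi, pi]."""
    h = 2 * math.pi / n
    return h * sum(g(-math.pi + (j + 0.5) * h) for j in range(n))


def tail_mass(r, delta, n=4000):
    """Midpoint approximation of the integral of P(r, .) over delta <= |theta| <= pi."""
    h = 2 * math.pi / n
    points = (-math.pi + (j + 0.5) * h for j in range(n))
    return h * sum(poisson_closed(r, t) for t in points if abs(t) >= delta)


if __name__ == "__main__":
    for r in (0.50, 0.75, 0.85, 0.99):
        area = integral_over_period(lambda t: poisson_closed(r, t))
        print(r, poisson_closed(r, 0.0), area, tail_mass(r, 0.5))
```

As $r$ increases toward 1 the printed peak value $P(r, 0)$ grows, the total area stays at 1, and the mass away from the origin shrinks.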

Here are plots of $P(r, x)$ for a few values of $r$.

[Figure: Poisson kernel, plotted for r = 0.50, r = 0.75, and r = 0.85]

As you can see, values of $r$ approaching 1 give plots more and more concentrated in the middle. Since they also get taller in the middle, one might conjecture that their areas might be constant. This is in fact true; we summarize these crucial properties of $P(r, \theta)$.


Lemma 8.3.8 The Poisson kernel has the following properties:
1. $P(r, \theta) \ge 0$.
2. For all $0 \le r < 1$, $\displaystyle\int_{-\pi}^{\pi} P(r, \theta)\,d\theta = 1$.
3. For all $\delta > 0$, $\displaystyle\lim_{r \to 1} \int_{\delta \le |\theta| \le \pi} P(r, \theta)\,d\theta = 0$.

Proof. The explicit formula $P(r, \varphi) = \dfrac{1}{2\pi} \cdot \dfrac{1 - r^2}{1 - 2r\cos\varphi + r^2}$ gives the first and third parts, while the second part is proved from the definition of $P(r, \theta)$ and the basic fact that
\[ \frac{1}{2\pi} \int_{-\pi}^{\pi} e^{ikt}\,dt = \begin{cases} 0 & \text{if } k \ne 0 \\ 1 & \text{if } k = 0. \end{cases} \]
Details are left for the exercises.

Functions that have the three properties enumerated in this last lemma are called approximate identities; see the exercises. We now come to our fundamental result about limits of Fourier series.

Theorem 8.3.9 If $f$ is uniformly continuous, $\lim_{r \to 1} U(r, \theta) = f(\theta)$.

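Proof. We compute:
\[ \begin{aligned} |U(r, \theta) - f(\theta)| &= \left| \int_{-\pi}^{\pi} P(r, \theta - t) f(t)\,dt - f(\theta) \right| \\ &= \left| \int_{-\pi}^{\pi} P(r, \theta - t) f(t)\,dt - f(\theta) \int_{-\pi}^{\pi} P(r, t)\,dt \right| \\ &= \Bigg| \underbrace{\int_{-\pi}^{\pi} P(r, \theta - t) f(t)\,dt}_{=\, P(r, \theta) * f(\theta)} - \int_{-\pi}^{\pi} f(\theta) P(r, t)\,dt \Bigg| \\ &= \Bigg| \underbrace{\int_{-\pi}^{\pi} P(r, t) f(\theta - t)\,dt}_{=\, f(\theta) * P(r, \theta)} - \int_{-\pi}^{\pi} f(\theta) P(r, t)\,dt \Bigg| \\ &= \left| \int_{-\pi}^{\pi} P(r, t) \big( f(\theta - t) - f(\theta) \big)\,dt \right| \\ &\le \int_{-\pi}^{\pi} P(r, t) \, |f(\theta - t) - f(\theta)|\,dt. \end{aligned} \]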

Suppose we are given $\varepsilon > 0$. Since $f$ is UC, we can find a modulus of continuity $\delta = \delta_f(\varepsilon/2)$. We now split the last integral above into two pieces:
\[ \underbrace{\int_{|t| \le \delta} P(r, t) \, |f(\theta - t) - f(\theta)|\,dt}_{A} + \underbrace{\int_{|t| \ge \delta} P(r, t) \, |f(\theta - t) - f(\theta)|\,dt}_{B}. \]
By the first two parts of Lemma 8.3.8, $|A| \le \varepsilon/2$; on the other hand, by making $r$ close enough to 1, we can make $|B| < \varepsilon/2$ as well (this uses the third part of Lemma 8.3.8), since the continuity of $f$ ensures that $|f(\theta - t) - f(\theta)|$ is bounded. Thus, $|U(r, \theta) - f(\theta)|$ can be made less than $\varepsilon$ by making $r$ close enough to 1.

Ok, so let’s put together what we now know when $f$ is UC on $[-\pi, \pi]$ and the Fourier series for $f$ converges.
1. $\lim_{r \to 1} U(r, \theta) = f(\theta)$: this is what we just proved.
2. Thinking of $U(r, \theta)$ as a power series in $r$, we have that $U(1, \theta)$ converges, since $U(1, \theta)$ is the Fourier series for $f$.
3. With minor adjustments, it follows from Abel’s Summation Theorem (7.3.35) that $\lim_{r \to 1} U(r, \theta) = U(1, \theta)$.
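The limit in item 1 can be watched numerically. The sketch below is an illustration under stated assumptions, not the text’s method: it picks the sample function $f(\theta) = |\theta|$ on $[-\pi, \pi]$, computes its coefficients $c_k$ by a midpoint sum, truncates $U(r, \theta)$ at $|k| \le 200$, and uses the fact that $c_{-k} = \overline{c_k}$ for real-valued $f$. All names and numerical parameters are hypothetical choices.

```python
import cmath
import math


def f(theta):
    """Sample uniformly continuous function on [-pi, pi] (an arbitrary choice)."""
    return abs(theta)


def fourier_coefficient(func, k, n=1000):
    """c_k = (1/2pi) * integral over [-pi, pi] of func(t) e^{-ikt} dt, by a midpoint sum."""
    h = 2 * math.pi / n
    total = 0j
    for j in range(n):
        t = -math.pi + (j + 0.5) * h
        total += func(t) * cmath.exp(-1j * k * t)
    return total * h / (2 * math.pi)


def abel_mean(coeffs, r, theta):
    """Truncated U(r, theta) = sum of c_k e^{ik theta} r^{|k|} for a real-valued function.

    coeffs[k] holds c_k for k = 0..K; the terms for k and -k are complex
    conjugates, so together they contribute 2 * Re(c_k e^{ik theta}) * r^k.
    """
    total = coeffs[0].real
    for k in range(1, len(coeffs)):
        total += 2 * (coeffs[k] * cmath.exp(1j * k * theta)).real * r ** k
    return total


coeffs = [fourier_coefficient(f, k) for k in range(201)]
theta = 1.0
radii = (0.5, 0.9, 0.99)
errors = [abs(abel_mean(coeffs, r, theta) - f(theta)) for r in radii]

if __name__ == "__main__":
    for r, err in zip(radii, errors):
        print(f"r = {r}: |U(r, theta) - f(theta)| = {err:.5f}")
```

The printed differences shrink as $r$ approaches 1, which is the behavior Theorem 8.3.9 guarantees for any uniformly continuous $f$.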

Corollary 8.3.10 If the Fourier series for a uniformly continuous, $2\pi$-periodic function $f$ converges, it converges to $f$; in particular, the Fourier series for any uniformly differentiable function converges to that function.

In working out examples, the following is usually helpful in simplifying computation.

Lemma 8.3.11 Suppose $f$ is Riemann integrable on $[-\pi, \pi]$.
1. If $f$ is even (i.e. $f(-x) = f(x)$), then the Fourier coefficients are given by
\[ a_k = \frac{2}{\pi} \int_0^{\pi} f(t)\cos(kt)\,dt, \qquad b_k = 0. \]
2. If $f$ is odd (i.e. $f(-x) = -f(x)$), then the Fourier coefficients are given by
\[ a_k = 0, \qquad b_k = \frac{2}{\pi} \int_0^{\pi} f(t)\sin(kt)\,dt. \]
The proof of this is left as an exercise.

Example 8.3.12 On the interval $[-\pi, \pi]$ define the sawtooth function $\sigma(\theta) = \pi - |\theta|$. We can extend $\sigma$ to all of $\mathbb{R}$ by requiring $2\pi$-periodicity, $\sigma(\theta + 2\pi) = \sigma(\theta)$:

[Figure: The sawtooth function $\sigma(\theta)$]

Since this is an even function, we see from the lemma above that the Fourier coefficients are given by
\[ a_0 = \pi, \qquad a_k = -\frac{2}{\pi} \left( \frac{(-1)^k - 1}{k^2} \right), \qquad b_k = 0. \]
Thus, the Fourier series for $\sigma(\theta)$ is given by
\[ \frac{\pi}{2} + \frac{4}{\pi} \sum_{k=1}^{\infty} \frac{\cos((2k-1)\theta)}{(2k-1)^2}. \]
Since $\left| \dfrac{\cos((2k-1)\theta)}{(2k-1)^2} \right| \le \dfrac{1}{(2k-1)^2}$, this series converges absolutely; since $\sigma$ is continuous, it is equal to its Fourier series. Here is a plot of the Fourier series truncated at $n = 5$:

[Figure: $\sigma(\theta) \approx \dfrac{\pi}{2} + \dfrac{4}{\pi} \sum_{k=1}^{5} \dfrac{\cos((2k-1)\theta)}{(2k-1)^2}$]
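The truncated series can also be evaluated directly. The following sketch assumes the sawtooth is $\sigma(\theta) = \pi - |\theta|$ on $[-\pi, \pi]$ and that the cosine coefficients follow the even-function formula $a_k = \frac{2}{\pi}\int_0^{\pi} f(t)\cos(kt)\,dt$; the function names and sample counts are illustrative. It compares numerically integrated coefficients with $\frac{2}{\pi}\cdot\frac{1 - (-1)^k}{k^2}$ and measures how far the partial sums are from $\sigma$.

```python
import math


def sawtooth(theta):
    """sigma(theta) = pi - |theta| on [-pi, pi] (assumed form of the sawtooth)."""
    return math.pi - abs(theta)


def cosine_coefficient(f, k, n=2000):
    """a_k = (2/pi) * integral over [0, pi] of f(t) cos(kt) dt, by a midpoint sum."""
    h = math.pi / n
    total = sum(f((j + 0.5) * h) * math.cos(k * (j + 0.5) * h) for j in range(n))
    return 2 * total * h / math.pi


def exact_a(k):
    """Closed form of a_k for the sawtooth, k >= 1: 4/(pi k^2) for odd k, 0 for even k."""
    return (2 / math.pi) * (1 - (-1) ** k) / k ** 2


def partial_sum(theta, n):
    """pi/2 + (4/pi) * sum for k = 1..n of cos((2k-1) theta) / (2k-1)^2."""
    s = sum(math.cos((2 * k - 1) * theta) / (2 * k - 1) ** 2 for k in range(1, n + 1))
    return math.pi / 2 + 4 * s / math.pi


if __name__ == "__main__":
    for k in range(1, 6):
        print(k, cosine_coefficient(sawtooth, k), exact_a(k))
    grid = [j * math.pi / 20 for j in range(-20, 21)]
    for n in (5, 50):
        worst = max(abs(sawtooth(t) - partial_sum(t, n)) for t in grid)
        print(n, worst)
```

The largest error sits at the corners $\theta = 0$ and $\theta = \pm\pi$, and it is bounded by the tail estimate $\frac{4}{\pi} \cdot \frac{1}{2(2n-1)}$.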


The more terms of the series we take, the sharper the corners on the plot will be.

Example 8.3.13 Now we turn to a function that fails to be defined at many points. We start with the interval $(-\pi, \pi)$ and the step function
\[ S(\theta) = \begin{cases} 1 & \text{if } \pi > \theta > 0 \\ -1 & \text{if } -\pi < \theta < 0. \end{cases} \]
We note that $S$ is defined only for those $\theta$ for which we can determine either $-\pi < \theta < 0$ or $0 < \theta < \pi$ (see the discussion of these kinds of functions beginning on page 157). We note that $S$ is uniformly continuous on both $(-\pi, 0)$ and $(0, \pi)$, so both improper integrals $\int_{-\pi}^{\pi} S(t)\cos(kt)\,dt$ and $\int_{-\pi}^{\pi} S(t)\sin(kt)\,dt$ exist; since $S$ is odd, the first of these is 0, and we have the Fourier coefficients given by
\[ a_k = 0, \qquad b_k = \frac{2}{\pi} \left( \frac{1 - \cos(k\pi)}{k} \right) = \begin{cases} 0 & \text{if } k \text{ is even} \\ \dfrac{4}{\pi k} & \text{if } k \text{ is odd.} \end{cases} \]
This gives the Fourier series for $S$:
\[ \frac{4}{\pi} \sum_{k=1}^{\infty} \frac{\sin((2k-1)\theta)}{2k-1}, \]
which converges by Dirichlet’s test (after a bit of trigonometric manipulation). The convergence is uniform only in closed intervals not containing a multiple of $\pi$. We can extend $S$ by periodicity, i.e. by defining $S(\theta + 2\pi) = S(\theta)$. This defines $S$ on the union of the open intervals $(k\pi, (k+1)\pi)$, $k = 0, \pm 1, \pm 2, \dots$ Here is what it looks like:

[Figure: The step function $S(\theta)$]

and here is its 5 term Fourier series:

[Figure: $S(\theta) \approx \dfrac{4}{\pi} \sum_{k=1}^{5} \dfrac{\sin((2k-1)\theta)}{2k-1}$]


Because the function $S$ is undefined at multiples of $\pi$, we cannot conclude automatically that the Fourier series converges at these values of $\theta$. In fact, it clearly does (it’s 0), but it can’t converge uniformly since if it did, the original function $S$ would be uniformly continuous on all of $\mathbb{R}$, which it clearly isn’t. If you look at the graph of the Fourier series, you’ll see that it has what appear to be extra high or low “horns” around multiples of $\pi$; they are examples of what is called the Gibbs phenomenon. When we add on more and more terms of the Fourier series, these horns decrease in width but not in height, which persists in “overshooting” the value of the function near the jump by about 18%. Thus, the Gibbs phenomenon prevents the Fourier series from converging uniformly. Here is a picture with 15 terms of the Fourier series:

[Figure: $S(\theta) \approx \dfrac{4}{\pi} \sum_{k=1}^{15} \dfrac{\sin((2k-1)\theta)}{2k-1}$]
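The persistence of the overshoot can be checked with a few lines of Python. This is a minimal sketch with a hypothetical function name `step_partial_sum`; it relies on the identity $\sum_{k=1}^{n} \cos((2k-1)\theta) = \dfrac{\sin(2n\theta)}{2\sin\theta}$, which shows that the derivative of the $n$-term partial sum first vanishes to the right of 0 at $\theta = \pi/(2n)$, so that is where the first horn peaks.

```python
import math


def step_partial_sum(theta, n):
    """(4/pi) * sum for k = 1..n of sin((2k-1) theta) / (2k-1)."""
    s = sum(math.sin((2 * k - 1) * theta) / (2 * k - 1) for k in range(1, n + 1))
    return 4 * s / math.pi


if __name__ == "__main__":
    for n in (5, 15, 100, 1000):
        location = math.pi / (2 * n)          # first critical point right of the jump
        peak = step_partial_sum(location, n)  # height of the first horn
        print(f"n = {n:5d}: peak at theta = {location:.5f}, height = {peak:.5f}")
```

The peak moves toward the jump as $n$ grows, while its height stays near 1.18 rather than approaching 1, in line with the overshoot of about 18% described above.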

Some final remarks

The difference between Taylor and Fourier series is often expressed by the statement “Taylor series are local in nature while Fourier series are global.” To understand this, recall that the Taylor series for $f$ is determined by the values of the successive derivatives of $f$ at a single reference point $x = c$. Functions that have a Taylor series must be infinitely differentiable at $c$, and the convergence of the Taylor series (when it converges to $f$) is much better the closer $x$ is to $c$, as you can see from the presence of $(x - c)^{n+1}$ in the expression for the remainder (see Theorem 7.4.8).

Fourier series, on the other hand, exist when their coefficients can be calculated. These coefficients are defined by integrals (not derivatives), and the form of these integrals, $\frac{1}{2\pi} \int_{-\pi}^{\pi} f(t) e^{-ikt}\,dt$, suggests that they are averages, with the function $e^{-ikt}$ acting as a weight. Thus, the Fourier series depends on the values of the function not at a single point, but over an entire interval. From Carleson’s very difficult theorem, we know that the Fourier series of a continuous function converges almost everywhere, but even from the more elementary results in this book, we see that the Fourier series determines the function (it is a limit of convolutions) and that we get convergence with just the assumption of uniform differentiability.


When a function is not periodic, it obviously cannot have a Fourier series converging to it everywhere. If the function is defined on all of $\mathbb{R}$ or on an interval where its values don’t agree on the endpoints, trying to force periodicity causes inaccuracies in the Fourier series representation (e.g. the Gibbs phenomenon). One way around this is to allow non-integer, even irrational, frequencies in the trigonometric functions $\sin \lambda t$. When this is the case, it doesn’t make sense to expand the function as a series; instead, we think of the assignment of Fourier coefficients itself as a representation of the function. Starting with $f(t)$ as a function of the “time” variable $t$, we create the Fourier transform function $\hat{f}(s)$ as a function of the frequency $s$:
\[ \hat{f}(s) = \int_{-\infty}^{\infty} f(t) e^{-ist}\,dt. \]
If $f$ satisfies certain technical hypotheses, its Fourier transform will exist. Furthermore, $f$ is recoverable from its transform via the inverse transform:
\[ f(t) = \frac{1}{2\pi} \int_{-\infty}^{\infty} \hat{f}(s) e^{ist}\,ds. \]

In the theory, the functions $f$ and $\hat{f}$ are regarded simply as dual ways of describing the same data: one from a temporal or time domain point of view, the other from a frequency domain point of view. This perspective, along with beautiful formulas relating simple functions with their Fourier transforms and multiplication with convolution, provides deep insights into the ways functions can describe physical processes. There are important applications to signal and image processing. Finally, the (re)discovery of the Fast Fourier Transform (FFT) has enabled computers to calculate transforms of functions represented by data very quickly, thus providing practical techniques for digital filtering and smoothing.

Arguably, the study of Fourier analysis has provided much of the stimulus for the development of modern analysis, starting with the precise definition of convergence (pointwise vs. uniform) and continuing with the creation of abstract notions of function and inner product spaces, convolution and transform theory, and harmonic analysis on more general topological spaces. I hope that the introduction provided by this book will encourage you to look further into these exciting topics.

Exercises

1. Prove that convolution is commutative, i.e. $f * g = g * f$ (Proposition 7.5.13 will be useful).
2. As was suggested in the text, integration is a continuous form of addition. With this in mind, what is the analogy between convolution and multiplication of polynomials or series? (See Cauchy products of series on page 136 and exercise 10 on page 252.)
3. A function $g(r, t)$ is called an approximate identity if it has the following properties, enumerated in Lemma 8.3.8:
(a) $g(r, t) \ge 0$ (resp. $g_k(t) \ge 0$)
(b) For all $0 \le r < 1$, $\int_{-\pi}^{\pi} g(r, t)\,dt = 1$
(c) For all $\delta > 0$, $\lim_{r \to 1} \int_{\delta \le |t| \le \pi} g(r, t)\,dt = 0$.
Show that if $f$ is $2\pi$-periodic and uniformly continuous on $[-\pi, \pi]$, then $\lim_{r \to 1} f(\theta) * g(r, \theta) = f(\theta)$.
4. We can also define an approximate identity in a different way, as a sequence of functions $g_k(t)$ such that
(a) For all $k$, $g_k(t) \ge 0$
(b) For all $k$, $\int_{-\pi}^{\pi} g_k(t)\,dt = 1$
(c) For all $\delta > 0$, $\lim_{k \to \infty} \int_{\delta \le |t| \le \pi} g_k(t)\,dt = 0$.
Show that if $f$ is $2\pi$-periodic and uniformly continuous on $[-\pi, \pi]$, then $\lim_{k \to \infty} f(\theta) * g_k(\theta) = f(\theta)$.
5. In exercise 3 on page 275, you were asked to compute the Fourier series for the functions listed below. Now discuss the convergence of these series.
(a) $f(\theta) = 3 + 2\sin(5\theta) - 5\cos(\theta)$
(b) $f(\theta) = \cos(\theta/2)$
(c) $f(\theta) = \sin^2(\theta)$
(d) $f(\theta) = \theta$
(e) $f(\theta) = |\theta|$
(f) $f(\theta) = \pi - |\theta|$
6. Discuss the convergence of the series $\dfrac{4}{\pi} \sum_{k=0}^{\infty} \dfrac{(-1)^k \sin((2k+1)x)}{(2k+1)^2}$. What function does it converge to? (Make some plots!)

REFERENCES

[Benacerraf-Putnam, 1984] P. Benacerraf, H. Putnam, ed., Philosophy of Mathematics, 2nd edition, Cambridge Univ. Press, Cambridge, 1984.
[Borwein, 2005] D. Borwein, review of Tauberian Theory, in Bulletin of the Amer. Math. Soc., v. 42, no. 3, pp. 401-406 (2005).
[Bridger-Stolzenberg, 1999] M. Bridger, G. Stolzenberg, “Uniform Calculus and the Law of Bounded Change”, Amer. Math. Monthly, v. 106, no. 7, pp. 628-635 (August-September 1999).
[Folland, 1992] G.B. Folland, Fourier Analysis and its Applications, Brooks-Cole, Pacific Grove, 1992.
[Kaczor-Nowak, 2000] W.J. Kaczor, M.T. Nowak, Problems in Mathematical Analysis I, II, Amer. Math. Soc., Providence, 2000.
[Lewin, 2003] J. Lewin, An Interactive Introduction to Mathematical Analysis, Cambridge Univ. Press, Cambridge (UK), 2003.
[Mattuck, 1999] A. Mattuck, Introduction to Analysis, Prentice-Hall, Upper Saddle River, 1999.
[Mercer, 2005] P.R. Mercer, “Error Estimates For Numerical Integration Rules”, College Math. J., v. 36, no. 1, pp. 27-34 (January 2005).
[Pinsky, 2002] M.A. Pinsky, Introduction to Fourier Analysis and Wavelets, Brooks-Cole, Pacific Grove, 2002.
[Spivak, 1994] M. Spivak, Calculus, 3rd Ed., Publish or Perish, Houston, 1994.
[Stade, 2005] E. Stade, Fourier Analysis, John Wiley and Sons, New Jersey, 2005.
[Strichartz, 1995] R.S. Strichartz, The Way of Analysis, Jones and Bartlett, Boston, 1995.


Published Titles in This Series

38 Mark Bridger, Real Analysis, 2019
33 Brad G. Osgood, Lectures on the Fourier Transform and Its Applications, 2019
32 John M. Erdman, A Problems Based Course in Advanced Calculus, 2018
31 Benjamin Hutz, An Experimental Introduction to Number Theory, 2018
30 Steven J. Miller, Mathematics of Optimization: How to do Things Faster, 2017
29 Tom L. Lindstrøm, Spaces, 2017
28 Randall Pruim, Foundations and Applications of Statistics: An Introduction Using R, Second Edition, 2018
27 Shahriar Shahriari, Algebra in Action, 2017
26 Tamara J. Lakins, The Tools of Mathematical Reasoning, 2016
25 Hossein Hosseini Giv, Mathematical Analysis and Its Inherent Nature, 2016
24 Helene Shapiro, Linear Algebra and Matrices, 2015
23 Sergei Ovchinnikov, Number Systems, 2015
22 Hugh L. Montgomery, Early Fourier Analysis, 2014
21 John M. Lee, Axiomatic Geometry, 2013
20 Paul J. Sally, Jr., Fundamentals of Mathematical Analysis, 2013
19 R. Clark Robinson, An Introduction to Dynamical Systems: Continuous and Discrete, Second Edition, 2012
18 Joseph L. Taylor, Foundations of Analysis, 2012
17 Peter Duren, Invitation to Classical Analysis, 2012
16 Joseph L. Taylor, Complex Variables, 2011
15 Mark A. Pinsky, Partial Differential Equations and Boundary-Value Problems with Applications, Third Edition, 1998
14 Michael E. Taylor, Introduction to Differential Equations, 2011
13 Randall Pruim, Foundations and Applications of Statistics, 2011
12 John P. D’Angelo, An Introduction to Complex Analysis and Geometry, 2010
11 Mark R. Sepanski, Algebra, 2010
10 Sue E. Goodman, Beginning Topology, 2005
9 Ronald Solomon, Abstract Algebra, 2003
8 I. Martin Isaacs, Geometry for College Students, 2001
7 Victor Goodman and Joseph Stampfli, The Mathematics of Finance, 2001
6 Michael A. Bean, Probability: The Science of Uncertainty, 2001
5 Patrick M. Fitzpatrick, Advanced Calculus, Second Edition, 2006
4 Gerald B. Folland, Fourier Analysis and Its Applications, 1992
3 Bettina Richmond and Thomas Richmond, A Discrete Transition to Advanced Mathematics, 2004
2 David Kincaid and Ward Cheney, Numerical Analysis: Mathematics of Scientific Computing, Third Edition, 2002
1 Edward D. Gaughan, Introduction to Analysis, Fifth Edition, 1998

Real Analysis: A Constructive Approach Through Interval Arithmetic presents a careful treatment of calculus and its theoretical underpinnings from the constructivist point of view. This leads to an important and unique feature of this book: All existence proofs are direct, so showing that the numbers or functions in question exist means exactly that they can be explicitly calculated. For example, at the very beginning, the real numbers are shown to exist because they are constructed from the rationals using interval arithmetic. This approach, with its clear analogy to scientific measurement with tolerances, is taken throughout the book and makes the subject especially relevant and appealing to students with an interest in computing, applied mathematics, the sciences, and engineering. The first part of the book contains all the usual material in a standard one-semester course in analysis of functions of a single real variable: continuity (uniform, not pointwise), derivatives, integrals, and convergence. The second part contains enough more technical material (including an introduction to complex variables and Fourier series) to fill out a full-year course. Throughout the book the emphasis on rigorous and direct proofs is supported by an abundance of examples, exercises, and projects (many with hints) at the end of every section. The exposition is informal but exceptionally clear and well motivated throughout.

For additional information and updates on this book, visit www.ams.org/bookpages/amstext-38

AMSTEXT/38

The Sally Series

This series was founded by the highly respected mathematician and educator, Paul J. Sally, Jr.