Differential Calculus


English Pages 296 [304] Year 1956




DIFFERENTIAL CALCULUS

BY W. L. FERRAR, D.Sc.
FELLOW OF HERTFORD COLLEGE, OXFORD

OXFORD
AT THE CLARENDON PRESS
1956

Oxford University Press, Amen House, London E.C.4
GLASGOW NEW YORK TORONTO MELBOURNE WELLINGTON
BOMBAY CALCUTTA MADRAS KARACHI CAPE TOWN IBADAN
Geoffrey Cumberlege, Publisher to the University

PRINTED IN GREAT BRITAIN

PREFACE

My aim in writing this book has been to provide a textbook for mathematicians and scientists during their first year or so at a university. The books already available for the study of the differential calculus provide either too much or too little to be really helpful at this particular stage.

In making my choice of what to put in and what to leave out I have been guided by my own experience in teaching undergraduates. I have put in most things they should know about, including points of theory which I know will appeal to some, will not appeal to others, and will be gracefully evaded by those who are not so much interested in what can be said as in what can be done.

Most of the book is suitable for study during the first year at a university, but some topics inevitably go beyond a first-year course and are more suitable for second-year work. I have noted these in the preliminary 'reading list'.

The reader is expected to have worked through a school course in the differential (and integral) calculus and its applications. Accordingly, the presentation here does not stress (often does not even mention) some of the points appropriate to an introductory course. On the other hand, the average first-year undergraduate is not well versed in the theory and practice of partial differentiation; and this topic I have presented as a step-by-step development from the simplest exercises to the hardest theorems.

The book is mainly concerned with the theorems and processes of the calculus and but little with their applications to mechanics and geometry. There is a note, at the end of the book, on 'singular points and envelopes'. No case can be made out for such an extraneous note save my own teaching experience, which tells me that this is the one section of geometrical work that needs revision and a new point of view at the university stage of study.

I have tried to meet the particular needs of honours science students by drawing up a list of 'reading for scientists'; this leaves out all the specialized work appropriate only to the honours mathematician.

A list of some of the books to which I am indebted will serve as a guide for the reader's further study: Hardy, Pure Mathematics, Gibson, Advanced Calculus, the great series of Cours d'analyse by de la Vallée Poussin, Goursat, Valiron, and, for the connoisseur (even though its date is 1909), Jordan; last, but by no means least, T. W. Chaundy's Differential Calculus.

In reading the proofs I have been greatly helped by Professor E. T. Copson and I gratefully acknowledge the criticisms that have enabled me to remove many faults and blemishes from the original text. Finally, I wish to thank the staff of the Oxford University Press for their skill and patience in executing the detailed spacing work, which this subject inevitably imposes, and for their unfailing helpfulness.

W.L.F.

Hertford College, Oxford 25 April 1955

READING FOR SCIENTISTS

PART I. The foundations of the differential calculus

Read only Chapter I, §§ 1-6; note the explanation of the symbols ∃ and . in § 4.1. For the rest, refer to Part I if the need arises.

PART II. Functions of one variable

Chapters VI, VII. Chapter VIII, §§ 1 and 2; § 4.1 (replacing the proof of Theorem 16 by a graphical proof). In § 5 note the formula of Theorem 17 (the proof also if possible). Note § 6. Chapter IX, omit; read X and XI. Chapter XII, omit §§ 3-5.

PART III. Functions of two or more variables

Chapters XIII, XIV. Chapter XV. Read in the light of Chrystal's advice 'when you come on a hard or dreary passage, pass it over and come back to it after you have seen its importance or found the need for it further on'; but note the results of § 1 (Theorem 32) and of § 5. Chapter XVI, omit. Chapter XVII, omit §§ 3 and 4. Chapter XVIII, omit. Chapter XIX. Read § 1 and, if possible, § 5.1 [follow the argument with n = 2 or 3]. Read §§ 2-4 in the way recommended for Chapter XV. Chapter XX, omit.

READING FOR MATHEMATICIANS
(A provisional separation of first-year from second-year work)

PART I. The foundations of the differential calculus

First year. Chapters I-IV; possibly a first reading of Chapter V, taking what you can and leaving the rest. Some first-year mathematicians may prefer to omit most of Part I and to study only that part of it recommended in the 'Reading for Scientists'.

Second year. Chapter V.

PART II. Functions of one variable

First year. Chapters VI-VIII. Chapter IX, §§ 1-3 and possibly 6. Chapters X, XI. Chapter XII, §§ 1-4, possibly 5, 6-8. (A less exacting selection is §§ 1, 2, and 6-8.)

Second year. Revise and complete Chapter IX. If needed, complete Chapter XII.

PART III. Functions of two or more variables

First year. Chapters XIII-XIV. Chapter XV. See 'Reading for Scientists'. Chapter XVII, §§ 1, 2, 5-8. Chapter XIX (probably omitting § 5.2 and parts of § 4). Note on singular points and envelopes; a first reading, taking what you can and leaving the rest.

Second year. Chapter XVI. Revise and complete Chapter XVII. Chapters XVIII and XX. Note on singular points and envelopes.

CONTENTS

PART I. THE FOUNDATIONS OF THE DIFFERENTIAL CALCULUS

I. NUMBER: INTERVAL: FUNCTION
II. LIMITS
III. DERIVATIVES
IV. PARTICULAR FUNCTIONS
V. REAL NUMBERS: CONTINUOUS FUNCTIONS

PART II. FUNCTIONS OF ONE VARIABLE

VI. DIFFERENTIATION
VII. REPEATED DIFFERENTIATION
VIII. THE MEAN VALUE THEOREM
IX. EXTENSIONS OF THE MEAN VALUE THEOREM
X. INEQUALITIES
XI. MAXIMA AND MINIMA
XII. INDETERMINATE FORMS

PART III. FUNCTIONS OF TWO OR MORE VARIABLES

XIII. PARTIAL DIFFERENTIATION
XIV. THE TOTAL DIFFERENTIAL
XV. DIFFERENTIAL OPERATORS: TAYLOR'S EXPANSION
XVI. THE SECOND DIFFERENTIAL
XVII. MAXIMA AND MINIMA
XVIII. RESTRICTED MAXIMA AND MINIMA
XIX. MISCELLANEOUS TOPICS: Functions of a complex variable: elimination of arbitrary functions; $x_u$ and $u_x$; contact transformations: Jacobians
XX. IMPLICIT FUNCTIONS: INDEPENDENT FUNCTIONS

NOTE ON SINGULAR POINTS AND ENVELOPES

ANSWERS

INDEX

PART I

THE FOUNDATIONS OF THE DIFFERENTIAL CALCULUS

I

NUMBER: INTERVAL: FUNCTION

1. Preliminary

The differential calculus is based on the idea of a limit, limits are based on the properties of number, and a full-scale presentation of the calculus requires a preliminary exposition of at least the following topics: the definition and properties of the natural numbers 1, 2, 3, ...; the definition and properties of the rational and irrational numbers; a careful discussion of limits and of continuous functions; some of the theory commonly collected under the heading 'Functions of a Real Variable'. Such a full-scale presentation is not our intention: on the other hand, we shall try to give an adequate foundation to our later work by setting down the essentials of those theories on which the differential calculus rests. Sometimes we shall give proofs; sometimes only references. Where only references are given, one of them is usually to the appropriate section of Chapter V of the present book; this gives, in brief outline, the main facts about real numbers and continuous functions.

2. Real numbers

We assume that we have defined (or have left undefined) the positive integers, negative integers, zero, rational numbers (the ratio of two integers), and irrational numbers. These constitute the real numbers and it is with real numbers that we shall be concerned. We shall not make use of complex numbers, at least not in the development of our theory: we may, from time to time, use complex numbers in particular examples.

REFERENCES: Chapt […]

[…] since, by definition, $|b-a| > 0$.

(iv) On using directed lengths along a line,†

$$\overrightarrow{BA} = \overrightarrow{BO} + \overrightarrow{OA} = \overrightarrow{OA} - \overrightarrow{OB} = a - b,$$

and so $|a-b|$ is the positive distance between A and B. The result (ii) extends, step by step, to

(v) $|a+b+\dots+k| \le |a|+|b|+\dots+|k|$,

or, in words, 'the modulus of a sum is less than or equal to the sum of the moduli'.
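Result (v) is easy to spot-check numerically. The following is a minimal Python sketch, not part of the original text; the helper name `modulus_inequality_holds` and the random integer samples are illustrative assumptions.

```python
import random

def modulus_inequality_holds(values):
    """Return True when |a + b + ... + k| <= |a| + |b| + ... + |k|."""
    return abs(sum(values)) <= sum(abs(v) for v in values)

# Integers keep the arithmetic exact, so no rounding can blur the comparison.
random.seed(0)
samples = [[random.randint(-50, 50) for _ in range(6)] for _ in range(1000)]
all_hold = all(modulus_inequality_holds(s) for s in samples)

# Equality occurs, for instance, when every term has the same sign.
equality_case = abs(sum([1, 2, 3])) == sum(abs(v) for v in [1, 2, 3])
print(all_hold, equality_case)
```

A check of this kind illustrates the inequality on particular numbers; it is no substitute for the step-by-step extension of (ii) described in the text.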

4. Sequences of numbers

A set of numbers in a definite order of occurrence is called a sequence. A sequence

$$\alpha_1, \alpha_2, \alpha_3, \dots$$

in which the number $\alpha_n$ is defined (in some way or other) whatever positive integer n is taken, is called an unending, or an infinite, sequence.

4.1. Sequences that converge to zero

The sequence

$$1,\ -\frac{1}{2},\ \frac{1}{2^2},\ \dots,\ \frac{(-1)^n}{2^n},\ \dots$$

or again

$$\frac{1}{1+1^2},\ \frac{1}{1+2^2},\ \dots,\ \frac{1}{1+n^2},\ \dots$$

is one in which the terms become indefinitely small in magnitude as we proceed along the sequence; we can make the terms as small as we please in magnitude by going on far enough. We need a formal statement of this property and, as this is an important detail in the study of limits, we lead up to it by easy stages.

† Cf., for example, E. A. Maxwell, Geometry for Advanced Pupils (Oxford, 1949), p. 67.

We shall express our DEFINITION in three different forms; they state the same thing in different ways.

FORM A. The sequence $\alpha_1, \alpha_2, \alpha_3, \dots$ is said to converge, or tend, to zero (in symbols, $\alpha_n \to 0$) if '$|\alpha_n|$ can be made as small as we please by taking any n that is sufficiently large', e.g.

$$1,\ \frac{1}{2},\ \frac{1}{3},\ \dots,\ \frac{1}{n},\ \dots. \qquad (1)$$

Here $\alpha_n = n^{-1}$; we can make $\alpha_n < 10^{-5}$ by taking any $n > 10^5$; we can make $\alpha_n < 10^{-12}$ by taking any $n > 10^{12}$. In testing whether the sequence (1) satisfied our definition, we first took a small positive number, at random, and then showed that $\alpha_n$ was less than this small positive number if n were big enough. This gives us a lead to the next form of our definition.

FORM B. $\alpha_n \to 0$ if, having chosen any positive number $\varepsilon$ soever, there is then a definite number N for which $|\alpha_n| < \varepsilon$ when $n > N$.

In all applications we shall use a form of definition which is almost wholly composed of symbols, namely

FORM C. $\alpha_n \to 0$ if

$$\varepsilon > 0;\quad \exists\, N \,.\, |\alpha_n| < \varepsilon \ \text{ when } n > N.$$

In this symbolism, what comes before the semi-colon is set down to begin with and is subject to no limitation which is not explicitly shown. Thus '$\varepsilon > 0$;' means 'setting down any positive number $\varepsilon$ whatsoever to begin with'. The semi-colon separates what is set down to begin with from what can be stated after it has been set down. The symbol ∃ is to be read 'there is' or 'there can be found' ('a number', or whatever follows the symbol). The full point . is read 'such that' and is merely a useful symbolism for a common phrase. The whole of Form C thus reads (when there is need to read it): $\alpha_n \to 0$ if, on setting down any positive number $\varepsilon$ whatsoever to begin with, there is a number N such that $|\alpha_n| < \varepsilon$ when $n > N$.

REFERENCE: W. L. Ferrar, Convergence, pp. 9-15, for a more detailed discussion and some examples.
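For the sequence $\alpha_n = 1/n$ of example (1), the number N demanded by Forms B and C can be written down explicitly. The sketch below is an illustration only, not part of the original text; the function name `smallest_N` is an assumption, and exact rationals are used so that the comparison $1/n < \varepsilon$ is not affected by rounding.

```python
import math
from fractions import Fraction

def smallest_N(eps):
    """Smallest integer N such that 1/n < eps for every integer n > N."""
    eps = Fraction(eps)
    if eps <= 0:
        raise ValueError("eps must be positive")
    # 1/n < eps  <=>  n > 1/eps, so N = floor(1/eps) works and nothing smaller does.
    return math.floor(1 / eps)

eps = Fraction(1, 10**5)
N = smallest_N(eps)
print(N)
```

With $\varepsilon = 10^{-5}$ this gives $N = 10^5$, in agreement with the remark in the text that $\alpha_n < 10^{-5}$ for any $n > 10^5$.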

4.2. Sequences that converge, but not to zero

We say that the sequence $\alpha_n$ tends to l if the sequence $\alpha_n - l$ tends to zero. In symbols,

FORM C (we omit forms A and B). $\alpha_n \to l$ if

$$\varepsilon > 0;\quad \exists\, N \,.\, |\alpha_n - l| < \varepsilon \ \text{ when } n > N.$$

We call l the LIMIT of the sequence. A common notation is

$$\lim_{n \to \infty} \alpha_n = l;$$

another is

$$\alpha_n \to l \quad \text{as } n \to \infty.$$

In the sequel we shall use whichever notation is the more convenient. Notice that

$|\alpha_n - l|$ […]

[…] $X_1$; $X_2$ […] $\max(X_1, X_2)$ […] $|f(x) - 2|$ […] $\max(X_1, X_2, 18/\varepsilon)$.

2.6. Functions that diverge as $x \to \infty$

DEFINITION 10. We say that $f(x) \to \infty$ as $x \to \infty$, if

$$A > 0;\quad \exists\, X \,.\, f(x) > A \ \text{ when } x > X.$$

It is a consequence of the definition that, if $f(x) \to \infty$, then $1/f(x) \to 0$. The converse is not necessarily true, for when $f(x) \to 0$ it may do so through values of varying sign: but if $f(x) \to 0$ through positive values as $x \to \infty$, then $1/f(x) \to \infty$. [Work Examples IIA, p. 24.]

[…]. Limit of a function as $x \to a$

What we have to do is to make precise the notion that, as x approaches the value a, f(x) approaches a definite finite number l; or again, an attempt to say the same thing in other words but this time beginning with f(x), we can make f(x) differ from l by as little as we please if we take x near enough to a. The form of definition we use is

DEFINITION 11. We say that f(x) tends to zero as x tends to a, and write $f(x) \to 0$ as $x \to a$, if

$$\varepsilon > 0;\quad \exists\, \delta \,.\, |f(x)| < \varepsilon \ \text{ when } 0 < |x-a| < \delta.$$

[…] 0, $\to 1$ if $p = 0$.

[…] l, then certainly

$$\varepsilon > 0;\quad |f(x) - l| < \varepsilon \ \text{ for all } x$$

and so, in accordance with Definition 9, we may correctly write $f(x) \to l$ as $x \to \infty$.]

III

DERIVATIVES

1. Definitions and notations

1.1. Definitions

DEFINITION 15. Let f(x) be defined in an interval containing $\alpha$ as an interior point. If

$$\frac{f(x)-f(\alpha)}{x-\alpha} \to l \quad \text{as } x \to \alpha$$

[…]

[…] f(x). Thus, if the derivative of y is positive whenever $a \le x \le b$, y increases as x increases from x = a to x = b. Similarly, if the derivative of y is negative whenever $a \le x \le b$, y decreases as x increases from x = a to x = b. This property of increasing or decreasing is sufficient in most circumstances; occasionally we need the following, more particular and detailed, lemma.

LEMMA. Let f(x) be defined in an interval having a as an interior point and let f'(a) be positive. Then there is an open interval $a < x < a+\delta$ throughout which $f(x) > f(a)$ and an open interval $a-\delta < x < a$ throughout which $f(x) < f(a)$.

Proof. Let $f'(a) = l > 0$. Then

$$\frac{f(x)-f(a)}{x-a} \to l \quad \text{as } x \to a$$

and ∃ $\delta$ . when $0 < |x-a| < \delta$,

$$\frac{f(x)-f(a)}{x-a} > \tfrac{1}{2}l > 0.$$

When $a < x < a+\delta$, so that x-a is positive, this implies $f(x) > f(a)$. When $a-\delta < x < a$, so that x-a is negative, it implies $f(x) < f(a)$.

COROLLARY 1. The lemma also holds when $f'(a) = +\infty$.

For then the ratio $\{f(x)-f(a)\}/(x-a) \to +\infty$; with A any given positive number, ∃ $\delta$ . when $0 < |x-a| < \delta$,

$$\frac{f(x)-f(a)}{x-a} > A > 0,$$

and A may replace $\tfrac{1}{2}l$ in the proof of the lemma.

COROLLARY 2. Let f'(a) be either a finite negative number or $-\infty$. Then there is an open interval $a < x < a+\delta$ throughout which $f(x) < f(a)$ and an open interval $a-\delta < x < a$ throughout which $f(x) > f(a)$.

2. Theorem concerning continuity

2.1. THEOREM 6. If f(x) has a finite derivative at x = a, then it is continuous at x = a.

Proof. Let f'(a) be the finite derivative at x = a of y = f(x). Then

$$f(a+h)-f(a) = f'(a)h + \eta h$$

where $\eta \to 0$ as $h \to 0$. Hence, as $h \to 0$,

$$f(a+h)-f(a) \to 0 \quad \text{and} \quad f(a+h) \to f(a),$$

which is the condition that f(x) be continuous at x = a.

NOTE. The theorem is conveniently summed up in the form 'a function with a finite derivative is necessarily continuous'. The converse is not true: there are functions which are continuous but have not a derivative. Thorough treatment of this topic is beyond our scope, but elementary examples are easily constructed. We illustrate the main point by one such example.

2.2. An example

Let the function f(x) be defined in (0, 2) in the following manner:

in (0, 1) define y = f(x) by y = x+1;
in (1, 2) define y = f(x) by y = 3-x.

Then f(1) = 2 and y is a single-valued function of x in (0, 2). Its graph is

[Figure: graph of y = f(x) on (0, 2), with the point (1, 2) marked.]

We see, either from the graph or from the definition of f(x), that $f(x) \to 2$ as $x \to 1$ from either side of x = 1: that is, $f(x) \to f(1)$ as $x \to 1$, which is the condition that f(x) be continuous at x = 1. But there is not a derivative at x = 1; for when x > 1,

$$\frac{f(x)-f(1)}{x-1} = \frac{(3-x)-2}{x-1} = -1,$$

when x […]

[…](x) $= -1$ for all x considered. (3)
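The behaviour of the example of § 2.2 at x = 1 can be made concrete by computing the difference quotient on each side. The sketch below is an illustration, not part of the original text; the function names are assumptions, and exact rationals are used so that the quotients come out exactly.

```python
from fractions import Fraction

def f(x):
    """The function of the example: y = x + 1 up to x = 1, y = 3 - x beyond it."""
    return x + 1 if x <= 1 else 3 - x

def quotient(x):
    """Difference quotient {f(x) - f(1)} / (x - 1), defined for x != 1."""
    return (f(x) - f(1)) / (x - 1)

steps = [Fraction(1, 10**k) for k in range(1, 6)]
right = [quotient(1 + h) for h in steps]  # x > 1
left = [quotient(1 - h) for h in steps]   # x < 1
print(right[0], left[0])
```

Under these definitions the quotient is -1 for every x > 1 and +1 for every x < 1, so it has no single limit as $x \to 1$, although f(x) itself tends to f(1) = 2 from both sides.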

Proofs. Let $u+\Delta u, \dots$ be the values of $u, \dots$ that correspond to the value $x+\Delta x$ of the independent variable x. Then

(i) $\Delta c = 0$ and therefore

$$\lim_{\Delta x \to 0} \frac{\Delta c}{\Delta x} = \lim_{\Delta x \to 0} 0 = 0 \qquad \text{(Definition 11, Note 2)}.$$

(ii) $\Delta(cu) = c(u+\Delta u) - cu = c\,\Delta u$ and therefore

$$\frac{\Delta(cu)}{\Delta x} = c\,\frac{\Delta u}{\Delta x} \to cu' \qquad \text{(Theorem 2, Corollary 1)}.$$

(iii) By Theorem 2, Corollary 2,

$$\frac{\Delta u}{\Delta x} + \frac{\Delta v}{\Delta x} + \frac{\Delta w}{\Delta x} + \dots \to u' + v' + w' + \dots.$$

(iv) We first prove that if y = uv, then

$$y' = u'v + uv'. \qquad (4)$$

Here

$$\Delta y = u\,\Delta v + v\,\Delta u + \Delta u\,\Delta v \qquad (5)$$

and, since u has a finite derivative u', u is continuous (Theorem 6) and $\Delta u \to 0$ as $\Delta x \to 0$. Also $\Delta v/\Delta x \to v'$. Hence, by Theorem 2,

$$\Delta u \cdot \frac{\Delta v}{\Delta x} \to 0 \quad \text{as } \Delta x \to 0$$

and (Theorem 2, Corollary 2) from (5)

$$\frac{\Delta y}{\Delta x} \to uv' + vu' + 0,$$

which proves (4). Provided $uv \ne 0$, we may write (4) as

$$\frac{y'}{y} = \frac{u'}{u} + \frac{v'}{v}. \qquad (6)$$

The proof of (iv) now follows by induction: for let

$$y = u_1 \cdots u_n$$

and suppose that

$$\frac{y'}{y} = \frac{u_1'}{u_1} + \dots + \frac{u_n'}{u_n}.$$

Further let $z = y\,u_{n+1}$. Then, by (6), provided $z \ne 0$,

$$\frac{z'}{z} = \frac{y'}{y} + \frac{u_{n+1}'}{u_{n+1}},$$

so that, if (iv) is true for a product of n functions, it is also true for a product of n+1 functions; but (6) proves that it is true when n = 2 and hence, by induction, it is true for the product of any finite number of functions.

COROLLARY TO (iv). Let

$$y = \prod_{r=1}^{n} u_r \Big/ \prod_{s=1}^{m} v_s.$$

Then, provided no $u_r$ or $v_s$ is zero,

$$\frac{y'}{y} = \sum_{r=1}^{n} \frac{u_r'}{u_r} - \sum_{s=1}^{m} \frac{v_s'}{v_s}. \qquad (7)$$

This is proved by applying (iv) to both sides of the equation

$$y \prod_{s=1}^{m} v_s = \prod_{r=1}^{n} u_r.$$

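(v) This is an example of (7); or, a direct proof,

$$\frac{1}{\Delta x}\left(\frac{1}{v+\Delta v} - \frac{1}{v}\right) = -\frac{\Delta v}{\Delta x}\cdot\frac{1}{v}\cdot\frac{1}{v+\Delta v}$$

and $\Delta v/\Delta x \to v'$, v remains constant as $\Delta x \to 0$, while $\Delta v \to 0$ since v is continuous (for v' is finite). By Theorem 2, Corollary 2, the R.H.S. $\to -v' \cdot \dfrac{1}{v} \cdot \dfrac{1}{v}$.

(vi) By (4),

$$\frac{d}{dx}\left(\frac{u}{v}\right) = u\,\frac{d}{dx}\left(\frac{1}{v}\right) + \frac{1}{v}\,\frac{d}{dx}(u) = -\frac{uv'}{v^2} + \frac{u'}{v} = \frac{u'v - uv'}{v^2}.$$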

A special form of (iv)

It is sometimes necessary to use a formula for the derivative of y = uvw... when one of the factors is zero. The rule is

$$y' = (u'vw\dots) + (uv'w\dots) + (uvw'\dots) + \dots.$$

It is proved by induction from (4).
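The special form of (iv) is the one to use when a factor vanishes, since the logarithmic form (6) is then unavailable. The following sketch, not part of the original text, checks the rule for three factors at a point where one of them is zero; the particular functions u, v, w and the central-difference helper are illustrative assumptions.

```python
def u(x):
    return x - 1.0        # vanishes at x = 1

def du(x):
    return 1.0

def v(x):
    return x * x + 1.0

def dv(x):
    return 2.0 * x

def w(x):
    return x + 2.0

def dw(x):
    return 1.0

def rule(x):
    """y' = u'vw + uv'w + uvw' for y = uvw."""
    return du(x) * v(x) * w(x) + u(x) * dv(x) * w(x) + u(x) * v(x) * dw(x)

def central_difference(g, x, h=1e-6):
    """Numerical estimate of g'(x); an approximation, not a proof."""
    return (g(x + h) - g(x - h)) / (2.0 * h)

def y(x):
    return u(x) * v(x) * w(x)

x0 = 1.0
print(rule(x0), central_difference(y, x0))
```

At x = 1 only the term u'vw survives, giving y' = 1 · 2 · 3 = 6, and the numerical estimate agrees to within the accuracy of the difference quotient.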

4. Rules for derivation (II): function of a function

Let z be a function of y, which in turn is a function of x. Then z is a function of x and, a rule familiar to most readers,

$$\frac{dz}{dx} = \frac{dz}{dy}\,\frac{dy}{dx}. \qquad (1)$$

Our next concern is to give a careful proof of this rule in the form

Let $z = f(y)$, $y = \phi(x)$; let† z have a finite derivative $f'(y)$ w.r.t. y, y have a finite derivative $\phi'(x)$ w.r.t. x. Then

$$\frac{d}{dx}z = f'(y)\phi'(x).$$

A common form of proof is based on the argument

$$\frac{\Delta z}{\Delta x} = \frac{\Delta z}{\Delta y}\cdot\frac{\Delta y}{\Delta x},$$

the factors on the right $\to dz/dy$ and $dy/dx$, their product therefore tends to the product of these limits, and so

$$\frac{dz}{dx} = \frac{dz}{dy}\,\frac{dy}{dx}.$$

The attempt to polish this line of proof so as to exclude the possibility $\Delta y = 0$ and to observe the formal definition of limit at each relevant step leads to much troublesome detail and I have rejected such a proof as being too repellent to be worth the reader's while. The proof that follows is really based on Definition 15 A.

Proof. By the hypotheses that $f'(y)$ and $\phi'(x)$ are finite, we may write

$$\Delta z = f'(y)\Delta y + \zeta\,\Delta y, \qquad (2)$$

$$\Delta y = \phi'(x)\Delta x + \eta\,\Delta x, \qquad (3)$$

where $\zeta \to 0$ as $\Delta y \to 0$ and $\eta \to 0$ as $\Delta x \to 0$. Let us first firmly establish that $\zeta \to 0$ as $\Delta x \to 0$. This is so because

$$\Delta y = \{\phi'(x)+\eta\}\,\Delta x$$

which, by Theorem 2, $\to \phi'(x) \cdot 0$ as $\Delta x \to 0$. Since $\Delta y \to 0$ as $\Delta x \to 0$, it follows that $\zeta \to 0$ as $\Delta x \to 0$.

[We can make $|\zeta|$ as small as we please by taking $|\Delta y|$ small enough, and make $|\Delta y|$ as small as we please by taking $|\Delta x|$ small enough.]

From (2) and (3),

$$\Delta z = \{f'(y)+\zeta\}\{\phi'(x)+\eta\}\,\Delta x = f'(y)\phi'(x)\,\Delta x + \{\eta f'(y) + \zeta\phi'(x) + \zeta\eta\}\,\Delta x.$$

† We use the standard abbreviation w.r.t. = with respect to.

Now, as $\Delta x$ varies, $f'(y)$ and $\phi'(x)$ remain constant, while $\zeta$, $\eta$, and the product $\zeta\eta$ all tend to zero as $\Delta x \to 0$. Thus

$$\Delta z = f'(y)\phi'(x)\,\Delta x + \xi\,\Delta x,$$

where $\xi \to 0$ as $\Delta x \to 0$. Hence†

$$\frac{\Delta z}{\Delta x} - f'(y)\phi'(x) = \xi \to 0 \quad \text{as } \Delta x \to 0$$

and, by the definition of limit (Definition 11),

$$\frac{\Delta z}{\Delta x} \to f'(y)\phi'(x) \quad \text{as } \Delta x \to 0.$$
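The conclusion of the proof, that $\Delta z/\Delta x - f'(y)\phi'(x)$ tends to zero with $\Delta x$, can be watched on a concrete pair of functions. The sketch below is an illustration only, not part of the original text; the choices $\phi(x) = x^2 + 1$ and $f(y) = y^3$ are assumptions made for the example, and exact rationals avoid any rounding.

```python
from fractions import Fraction

def phi(x):
    return x * x + 1      # y = phi(x)

def dphi(x):
    return 2 * x          # phi'(x)

def f(y):
    return y ** 3         # z = f(y)

def df(y):
    return 3 * y * y      # f'(y)

def remainder(x, dx):
    """Delta z / Delta x - f'(y) phi'(x), the quantity shown to tend to zero."""
    y = phi(x)
    dz = f(phi(x + dx)) - f(y)
    return dz / dx - df(y) * dphi(x)

x0 = Fraction(1)
errors = [abs(remainder(x0, Fraction(1, 10**k))) for k in range(1, 7)]
print([float(e) for e in errors])
```

For this pair the remainder shrinks roughly in proportion to $\Delta x$, which is the numerical counterpart of the statement that the remainder term tends to 0 as $\Delta x \to 0$.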

5. Rules for derivation (III): the chain rule

Let

u = f(v) have a finite derivative $f'(v)$,
v = g(w) have a finite derivative $g'(w)$,
w = h(z) have a finite derivative $h'(z)$,
z = $\phi(x)$ have a finite derivative $\phi'(x)$,

where, to illustrate the process, we have taken a chain of four relations whereby we define u as a function of x via three intermediate variables v, w, z. Then

$$\frac{d}{dx}u = f'(v)g'(w)h'(z)\phi'(x) \qquad (1)$$

or, in a more familiar notation,

$$\frac{du}{dx} = \frac{du}{dv}\,\frac{dv}{dw}\,\frac{dw}{dz}\,\frac{dz}{dx}.$$

Proof. By § 4,

$$\frac{du}{dw} = f'(v)g'(w), \qquad \frac{du}{dz} = \frac{du}{dw}\,h'(z), \qquad \frac{du}{dx} = \frac{du}{dz}\,\phi'(x),$$

and, on combining these results,

$$\frac{du}{dx} = f'(v)g'(w)h'(z)\phi'(x).$$

† Or a direct appeal to Definition 15A at this point shows that z, qua function of x, has a derivative $f'(y)$ […]

[…] 0 ensures that we always keep away from the difficulties which may arise if we admit $x = 0$ or $x < 0$; e.g. with $p/q = -\tfrac{1}{2}$, $x^{-1/2}$ is not defined at $x = 0$ (though $x^{-1/2} \to \infty$ as $x \to 0$), or again $y = x^{1/2}$ does not define a real function when x is negative.

(iii) When n is a negative rational number and x > 0

Let $y = 1/x^m$, where m is a positive rational number. Then, by (ii) above and Rule (v),

$$y' = -\frac{m x^{m-1}}{x^{2m}} = -m x^{-m-1}.$$

(iv) When n is an irrational number

We come to this in § 6.
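Case (iii) can be checked numerically for a particular negative rational index. This is a minimal sketch, not part of the original text; the value m = 3/2, the point x = 2 and the central-difference helper are illustrative assumptions.

```python
def derivative_formula(m, x):
    """The result of case (iii): d/dx x^(-m) = -m x^(-m-1), for x > 0."""
    return -m * x ** (-m - 1)

def via_rule_v(m, x):
    """The intermediate form -(m x^(m-1)) / x^(2m) obtained from Rule (v)."""
    return -(m * x ** (m - 1)) / x ** (2 * m)

def central_difference(g, x, h=1e-6):
    """Numerical estimate of g'(x); an approximation only."""
    return (g(x + h) - g(x - h)) / (2 * h)

m, x0 = 1.5, 2.0
numeric = central_difference(lambda x: 1 / x ** m, x0)
print(derivative_formula(m, x0), via_rule_v(m, x0), numeric)
```

The two closed forms agree to rounding error, and the difference quotient approaches the same value, as the rule predicts for x > 0.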

1.2. Polynomials and rational functions

When n is a positive integer (or zero) and $c_n$ is a constant,

$$\frac{d}{dx}(c_n x^n) = n c_n x^{n-1}$$

and, by Rule (iii),

$$\frac{d}{dx}\sum_{r=0}^{n} c_r x^r = \sum_{r=0}^{n} r c_r x^{r-1}.$$

This gives P'(x) when P(x) is a given polynomial in x. The derivative of a rational function P(x)/Q(x) is given by the formula† (Rule vi)

$$\frac{d}{dx}\left(\frac{P}{Q}\right) = \frac{P'Q - Q'P}{Q^2} \qquad (Q \ne 0).$$
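The two formulae of § 1.2 translate directly into operations on coefficient lists. The sketch below is an illustration, not part of the original text; the list representation $[c_0, c_1, \dots, c_n]$ and the function names are assumptions.

```python
from fractions import Fraction

def differentiate(coeffs):
    """Coefficients [c_0, ..., c_n] of P -> coefficients of P', term by term."""
    return [r * c for r, c in enumerate(coeffs)][1:]

def evaluate(coeffs, x):
    """Value of the polynomial at x, by Horner's scheme."""
    total = 0
    for c in reversed(coeffs):
        total = total * x + c
    return total

def rational_derivative(p, q, x):
    """(P'Q - Q'P) / Q^2 at x; requires Q(x) != 0."""
    qx = evaluate(q, x)
    if qx == 0:
        raise ZeroDivisionError("Q(x) must not vanish")
    numerator = evaluate(differentiate(p), x) * qx - evaluate(differentiate(q), x) * evaluate(p, x)
    return Fraction(numerator) / (qx * qx)

P = [1, 2, 3]   # 1 + 2x + 3x^2
Q = [1, 1]      # 1 + x
print(differentiate(P), rational_derivative(P, Q, Fraction(2)))
```

For $P = 1 + 2x + 3x^2$ the derivative is $2 + 6x$, and at x = 2 the quotient rule gives $(14 \cdot 3 - 1 \cdot 17)/9 = 25/9$.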

1.3. Algebraic functions These are not, in general, single-valued functions and their range of definition is often restricted. We shall deal with particular examples as they arise, but make no attempt here at a general treatment. (See Chap. XIV,§ 3.3.)

2. Trigonometrical functions

We take the functions sine and cosine to be defined by the usual methods of geometrical trigonometry and, on this basis, prove the formulae

$$\frac{d}{dx}\sin x = \cos x, \qquad \frac{d}{dx}\cos x = -\sin x. \qquad (1)$$

In an appendix to this chapter we outline, for the reader's interest and not as an integral part of our 'foundations of the calculus', a purely analytical treatment of the functions sin x and cos x.

† It is often convenient to write P for P(x) and P' for P'(x).

2.1. Preliminary

In the figure, AP is an arc of a circle of unit radius, PN is perpendicular to OA, and PM is the tangent at P. From this figure, in which x is the radian measure of an acute angle,

area of $\triangle OPN$ […]

[…] $\sqrt{6}$, so that (4) and (6) […]

As all but immediate consequences of (1) and (2) we can now obtain

$$S(\pi) = 2S(\tfrac{1}{2}\pi)C(\tfrac{1}{2}\pi) = 0, \quad C(\pi) = -1, \quad S(2\pi) = 0, \quad C(2\pi) = 1,$$

and, from these,

$$S(x+2\pi) = S(x), \qquad C(x+2\pi) = C(x).$$

The function S(x) is identified with sin x, C(x) with cos x and, as we have seen (at least in rapid outline), all the familiar trigonometrical formulae relating to sine and cosine can be established directly from the properties of the defining power-series and without reference to geometry.
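The values just quoted can be checked from partial sums of the power series. The sketch below is not part of the original text. It assumes that the defining series of § 7.1 (not reproduced in this excerpt) are the usual ones, $S(x) = x - x^3/3! + x^5/5! - \dots$ and $C(x) = 1 - x^2/2! + x^4/4! - \dots$, and it takes the numerical value of $\pi$ from the library constant `math.pi` rather than from the series themselves.

```python
import math

def S(x, terms=30):
    """Partial sum of x - x^3/3! + x^5/5! - ... (assumed form of the series)."""
    total, term = 0.0, x
    for n in range(terms):
        total += term
        term *= -x * x / ((2 * n + 2) * (2 * n + 3))
    return total

def C(x, terms=30):
    """Partial sum of 1 - x^2/2! + x^4/4! - ... (assumed form of the series)."""
    total, term = 0.0, 1.0
    for n in range(terms):
        total += term
        term *= -x * x / ((2 * n + 1) * (2 * n + 2))
    return total

print(S(math.pi), C(math.pi), S(2 * math.pi), C(2 * math.pi))
```

Thirty terms are far more than double-precision arithmetic needs for arguments of this size; the printed values differ from 0, -1, 0, 1 only by rounding error.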

7.4. Complex variables The series of § 7.1 are taken as the definitions of sine and cosine when x is a complex number. All the work of§ 7.1 is valid whether x and y are real or complex: in particular, formulae (1) and (2) are true for all x and y, real or complex.

V

REAL NUMBERS: CONTINUOUS FUNCTIONS

This chapter contains proofs of certain theorems quoted elsewhere in the book. It is intended as a first, brief course on real numbers and continuous functions. The reading of the chapter is not necessary to the understanding of the rest of the book.

1. The elementary rational numbers

We suppose that we may use, without explanation, the positive and negative integers

$$\pm 1, \pm 2, \dots, \pm n, \dots \qquad (1)$$

and zero. Further, we suppose that we may assume the laws governing the use of the fractions p/q, where p and q belong to (1). All the above form the ELEMENTARY RATIONAL NUMBERS. We refer to them, for convenience, as the E.R. For our use in the next section we note a preliminary

LEMMA. When x is a given E.R., there is no least E.R. which exceeds x.

Proof. Let y be any E.R. which exceeds x. Then

(i) $\tfrac{1}{2}(x+y)$ is an E.R.,

(ii) x […]

[…] b' (in sense of > as used for real numbers).
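The lemma of § 1 (no least E.R. exceeds a given E.R.) rests on the observation that the mean of two rationals is again rational. The sketch below is an illustration, not part of the original text; the function name `smaller_exceeding` is an assumption, and Python's `Fraction` stands in for the elementary rationals.

```python
from fractions import Fraction

def smaller_exceeding(x, y):
    """Given rationals x < y, return (x + y)/2, a rational strictly between them."""
    if not x < y:
        raise ValueError("y must exceed x")
    return (x + y) / 2

x = Fraction(1, 3)
chain = [Fraction(1, 2)]          # any rational exceeding x will do as a start
for _ in range(5):
    chain.append(smaller_exceeding(x, chain[-1]))
print(chain)
```

Whatever rational y exceeding x is proposed as the least, the mean $\tfrac{1}{2}(x+y)$ is a smaller rational that still exceeds x; the loop simply repeats this step.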

2.6. The sum of two real numbers

DEFINITION 22. Let x, y be two real numbers given by cuts, $x = (L_1, R_1)$ and $y = (L_2, R_2)$. Divide the E.R. into two sets $\lambda$, $\rho$ by the rule

an E.R. c is in $\lambda$ if ∃ an $l_1$ and an $l_2$ . $l_1 + l_2 \ge c$

and is otherwise in $\rho$. Then $(\lambda, \rho)$ is a cut and defines a real number called the sum of x and y and denoted by x+y.

The reader should satisfy himself that $(\lambda, \rho)$ satisfies the conditions (a), (b), (c) required of a cut and notice that an E.R. of the form $r_1 + r_2$ belongs to $\rho$. Thus, in essence, the cut is the separation of the $l_1 + l_2$ from the $r_1 + r_2$.

2.7. The difference of two real numbers

We first define -x when x is given by the cut (L, R). Let l, r be typical E.R. in L, R respectively. In $L_1$ put all -r and in $R_1$ put all -l. Then $(L_1, R_1)$ is a cut and it is denoted by -x. The diagram leading to this definition is a reflection in 0.

[Figure: the cut (L, R) defining x and its reflection in 0 defining -x, with -r and +r marked on either side of 0.]

DEFINITION 23. We define x-y to be x+(-y).

2.8. Manipulation of real numbers

In a full account of this matter we should continue through the definitions of xy, 1/y, x/y to the familiar rules of calculation, such as

$$a-(b+c) = a-b-c, \qquad \frac{a}{b} + \frac{c}{d} = \frac{ad+bc}{bd}, \qquad x > y \text{ and } y > z \text{ imply } x > z.$$

In fact, the definitions of real numbers, their sums, products, etc., are such that the ordinary arithmetical manipulations hold for them as for the elementary rationals. If we begin to prove this for particular steps it soon becomes clear that such is the case generally. A full treatment† would require a very lengthy, formal scheme, out of place in a book of this kind. We shall press on, assuming that we may work with 'real numbers' as we have always worked with the more familiar numbers of our earlier mathematics. The point of introducing real numbers at all is the fact that, with them, we can go on to prove theorems which we could not prove if we relied solely on the familiar numbers of elementary mathematics. From this point on, all numbers will be 'real numbers' unless the context states expressly that they are E.R.

† Cf. Landau, Grundlagen der Analysis (Chelsea Publishing Co., New York).

2.9. Positive and negative real numbers

Zero, as a real number, is the cut given by (L, R) where any E.R. that is positive goes into R, any E.R. that is negative goes into L.

[A positive E.R. is, of course, a fraction p/q wherein p and q are positive integers chosen from 1, 2, 3, ....]

A positive real number is one which, according to the definition of § 2.5, is greater than zero: it will therefore contain some positive E.R. in its L class.

[Figure: the cuts $G = (L_1, R_1)$ and $G+\varepsilon = (L_2, R_2)$ marked on a line.]

If $\varepsilon$ is a positive real number, G denotes the cut $(L_1, R_1)$ and $G+\varepsilon$ is the cut $(L_2, R_2)$, then, as we may deduce from the definition of sum in § 2.6, some $l_2$ > some $r_1$; or, in a form convenient for later use, some l of $G+\varepsilon$ > some r of G.

3. Upper and lower bound

3.1. THEOREM 7. Let {a} be a given set of real numbers and let there be at least one number $A_0 \ge$ each and every member of {a}. Then there is a number U, called THE upper bound of {a}, with the two properties

(i) $U \ge$ each and every member of {a},
(ii) whenever $U' < U$, there is at least one member of {a} that exceeds U'.

Proof. (i) Let {A} denote the set of numbers $\ge$ each and every a. Every rational real number either belongs to {A} or it does not: if it does, put the corresponding E.R. into a set R, and otherwise put the corresponding E.R. into a set L. Then (L, R) is a cut: a little care is necessary to see that (a) at least one E.R. goes into L and at least one into R, but it is fairly clear from the mode of definition of L and R that (b) an E.R. goes either into L or R (but not both), and (c) each and every E.R. in L < each and every E.R. in R. Denote this cut by U. Then (§ 2.9), for every positive number $\varepsilon$,

∃ an l of $U+\varepsilon$ > some r of U

and, on using Corollaries 1 and 2 of § 2.5, if l', r' are the rational real numbers derived from the E.R. l, r,

$$U+\varepsilon \ge l' > r'.$$

Hence, from the way in which the r of U are defined,

$U+\varepsilon$ > each and every member of {a}. (X)

It follows, since (X) is true for every positive $\varepsilon$, that

$U \ge$ each and every member of {a}. (Y)

This proves (i).

NOTE. If the deduction of (Y) from (X) is not at once clear, consider the following. Suppose (Y) is not true and that there is a member of {a}, say $a_1$, that exceeds U. Then

$$U + \tfrac{1}{2}(a_1 - U) < U + (a_1 - U) = a_1$$

and, when $\varepsilon = \tfrac{1}{2}(a_1 - U)$, $U + \varepsilon < a_1$, which denies (X) for that value of $\varepsilon$.

(ii) Moreover if $U' < U$, ∃ an r of U' […]

[…] V, there is at least one member of {a} that is less than V'.

3.3. Notation. We reserve the phrase 'THE upper bound' for the number U with properties (i) and (ii) of Theorem 7. If we happen to know that a number K has property (i), and have no knowledge as to whether or not it has property (ii), we call K AN upper bound, or a rough upper bound, of the set {a}. For example, let {a} be the set of numbers

$$\frac{n-1}{n} \qquad (n = 1, 2, \dots).$$

Then the number 26 is AN upper bound, but the number 1 is THE upper bound. [See also, Chapter I, § 7, p. 9, for further comment on this matter.]
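The distinction between AN upper bound and THE upper bound can be illustrated on the set (n-1)/n. The sketch below is not part of the original text; the function names are assumptions, and only finitely many members can of course be inspected, so it illustrates properties (i) and (ii) without proving them.

```python
import math
from fractions import Fraction

def a(n):
    """The n-th member (n - 1)/n of the set, n = 1, 2, ..."""
    return Fraction(n - 1, n)

def member_exceeding(u_prime):
    """For any U' < 1, return an n whose member (n - 1)/n exceeds U'."""
    u_prime = Fraction(u_prime)
    if u_prime >= 1:
        raise ValueError("property (ii) only concerns U' < 1")
    # (n - 1)/n > U'  <=>  n > 1/(1 - U')
    return math.floor(1 / (1 - u_prime)) + 1

n = member_exceeding(Fraction(99, 100))
print(n, a(n))
```

Both 26 and 1 satisfy property (i) on every member inspected, but only 1 also satisfies property (ii): below 26 there are numbers, such as 2, that no member exceeds, whereas any U' < 1 is exceeded by a member whose index the function returns.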

4. Continuous functions

4.1. We recall the definitions of Chapter II, § 6, p. 19.

(i) Let f(x) be defined in an interval containing a as an interior point; we say that f(x) is continuous at x = a if $f(x) \to f(a)$ as $x \to a$. That is, in the standard $\varepsilon$, $\delta$ notation,

$\varepsilon > 0$; ∃ $\delta$ . $|f(x)-f(a)|$ […]

[…] $\varepsilon > 0$; ∃ $\delta$ . $|f(x)-f(a)|$ […]

[…] 0, there is an interval $c-\delta < x < c+\delta$ throughout which $f(x) > \tfrac{1}{2}k$; (ii) if f(c) = 0 and M is any given positive number, there is an interval $c-\delta < x < c+\delta$ throughout which $|f(x)| < M$.

These properties are often quoted. Both derive directly from the definition of continuity and their proofs, which follow, consist merely of giving $\varepsilon$ particular values in (1) of § 4.1.

Proof (i) Let f(c) = k. Then, on putting […] ∃ $\delta$ . $|f(x)-k|$ […]

[…] 0, there is an interval $b-\delta < x \le b$ throughout which f(x) is positive.

4.3. A note for future reference Let f(x) be continuous in the closed interval (a, b). Then, having set down a definite positive number E, there is assigned to each and every interior point rx an open interval }cx-'-o, rx+o( within which property§ 4.1 (1) holds; there is assigned to a an interval which is closed on the left and open on the right, namely a:::;;; x < a+o in (ii)§ 4.1; and there is assigned to b


REAL NUMBERS: CONTINUOUS FUNCTIONS

an interval closed on the right and open on the left. [The value of $\delta$ varies, of course, from point to point.] What we wish to stress is not why we have assigned the intervals, but the mere fact that each point of (a, b) has an interval assigned to it. In the next two sections we shall examine such sets of assigned intervals and shall be not at all concerned with why or how they are assigned. It will be only after we have considered the general problem that we shall return to the application of our general theorem to continuous functions.

5. Introduction to the Heine-Borel theorem This is an essentially simple theorem. It does not occur as the natural sequel to what has gone before, but is the outcome of many attempts to establish the various properties of continuous functions: when once we have proved the Heine-Borel theorem, all the other complications and difficulties disappear. One further foreword. An essential difficulty in dealing with an infinite number of lengths, or an infinite sequence of numbers, is to determine whether or not there is among them a least length, or a least number. When we have a finite set of lengths, or numbers, one of them must be $\le$ each and every other member of the set. The point of the Heine-Borel theorem is that, starting with an infinity of intervals, it selects a finite number of them and thereafter we can deal with that finite number of intervals and, in particular, with the least length of this finite number of intervals.

6. The Heine-Borel theorem 6.1. Preliminary DEFINITION 24. Let {a} be a given set of points. Let $I_1, I_2, \ldots, I_n$ be a given (finite)† set of intervals. We say that {a} is 'COVERED' by the set of intervals $I_1, I_2, \ldots, I_n$ if each and every member of {a} belongs to at least one of the intervals.

† (i) We are using the terminology, explained in Chapter I, § 5, which uses geometrical language to discuss properties of numbers. (ii) The definition also applies to an infinite sequence $I_1, I_2, \ldots$ of intervals, or indeed to an infinite set {I} of intervals not necessarily arranged as a sequence; but we shall not here be concerned with such sets.


6.2. THEOREM 8. THE HEINE-BOREL THEOREM HYPOTHESIS. Let I denote the closed interval (a, b). We are given that to each interior point P of I there is assigned a determinate open interval $I_P$, which contains P. We are given, further, that to the end points a and b there are assigned respectively an interval $I_a$ containing a and open on the right, and an interval $I_b$ containing b and open on the left.

DEDUCTION. Then it is possible to select a finite number of the assigned intervals in such a way that I is covered by this finite number of selected intervals.

Proof. Call $\xi$ an attainable point if it is in I and the closed interval $(a, \xi)$ can be covered by a finite number of the assigned intervals. Then (i) there are attainable points: for if x lies to the right of a in $I_a$, we can 'cover' (a, x) by the single assigned interval $I_a$. Denote the set of all attainable points by $\{\xi\}$. Then since, by definition, an attainable point lies in I, each and every member of $\{\xi\}$ is $\le b$. Hence $\{\xi\}$ has a finite upper bound and (Theorem 7, Corollary) (ii) THE upper bound of $\{\xi\}$ $= w \le b$.

Suppose that w < b. We must have w > a, since there are attainable points to the right of a. Thus w is an interior point of I, for a < w < b. By hypothesis, there is assigned to w an open interval $I_w$ which contains w; let this open interval be $)w-\delta_1,\ w+\delta_2($. Since w is the upper bound of the attainable points, there is (by (ii) of Theorem 7) an attainable point $\xi_1 > w-\delta_1$. Since $\xi_1$ is attainable, a finite number m of the assigned intervals covers $(a, \xi_1)$; and these m intervals together with $I_w$ cover $(a, w+\tfrac12\delta_2)$.

[Figure: the line from a to b, marking the points $w-\delta_1$, $w$, $w+\delta_2$ and the interval $I_w$.]

Hence there are attainable points to the right of w,


which contradicts the hypothesis that w is the upper bound of such points. Hence (iii) the upper bound w cannot be less than b.

Accordingly, the upper bound of $\{\xi\}$ is b. Let $b-\delta$ be the left-hand end point of the assigned interval $I_b$. Since b is the upper bound of $\{\xi\}$, there is an attainable point $\xi_2 > b-\delta$. Since $\xi_2$ is attainable, a finite number n of the assigned intervals covers $(a, \xi_2)$; and these n intervals together with $I_b$ cover the whole interval (a, b). Hence (iv) I can be covered by a finite number of the assigned intervals, say $I_1, I_2, \ldots, I_n$.

6.3. Note on the selected intervals The intervals $I_1, I_2, \ldots, I_n$ must overlap on the pattern of the diagram

[Figure: overlapping intervals on a line, with the end points $X_1$, $X_2$, $Y_1$, $X_3$, ... marked.]

in which $I_1$ is $X_1 \le x < Y_1$, $I_2$ is $X_2 < x < Y_2$, and so on. The need for overlapping is apparent from the following considerations: The interval $I_1$ does not contain the end-point $Y_1$, which must therefore belong to one of $I_2, I_3, \ldots, I_n$. All these are open on the left and the interval containing $Y_1$ must begin at a point $X_2$ lying to the left of $Y_1$ [for if $X_2$ were equal to $Y_1$, the interval $I_2$ would be $Y_1 < x < Y_2$ and would not contain $Y_1$]. Similarly, $X_3$ must lie to the left of $Y_2$; and so on.
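The overlapping pattern can be turned into a small test. The sketch below is an illustration under stated assumptions, not part of the proof: the intervals are supplied already ordered from left to right, the first is read as $X_1 \le x < Y_1$ with $X_1 = a$, the last as $X_n < x \le Y_n$ with $Y_n = b$, the others as open intervals, and at least two intervals are given. The function name `chain_covers` is hypothetical. The condition checked, $X_{k+1} < Y_k$ for each k, is sufficient for the chain to cover (a, b).

```python
def chain_covers(a, b, chain):
    """Return True if the ordered chain of (X, Y) pairs covers the closed interval (a, b).

    The first pair is read as X1 <= x < Y1, the last as Xn < x <= Yn,
    and every pair in between as the open interval Xk < x < Yk.
    """
    if len(chain) < 2:
        return False
    if chain[0][0] != a or chain[-1][1] != b:
        return False
    if any(x >= y for x, y in chain):
        return False
    # Consecutive intervals must genuinely overlap: X_{k+1} lies to the left of Y_k.
    return all(nxt[0] < cur[1] for cur, nxt in zip(chain, chain[1:]))


overlapping = [(0.0, 0.4), (0.3, 0.7), (0.6, 1.0)]
touching = [(0.0, 0.4), (0.4, 0.7), (0.6, 1.0)]   # the point 0.4 is left uncovered

covers_when_overlapping = chain_covers(0.0, 1.0, overlapping)
covers_when_touching = chain_covers(0.0, 1.0, touching)
```

The second chain fails for exactly the reason given in the text: with $X_2 = Y_1$ the end-point $Y_1$ belongs to neither interval.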

7. Properties of continuous functions 7.1. THEOREM 9. Let f(x) be continuous in (a, b). Then it is bounded in (a, b). Proof. Let $\epsilon$ be a given positive number. Then, for each interior point $\alpha$ of (a, b),

3 8"' • lf(x)-f(o:)j Call o:-8"'


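1, DEFINE $\cosh^{-1}x$ to be the positive number $\theta$ for which $\cosh\theta = x$. With this definition, $\sinh\theta$ is positive and $\cosh^2\theta - \sinh^2\theta = 1$ gives $\sinh\theta = +\sqrt{(x^2-1)}$. From $\cosh\theta = x$ we obtain $\sinh\theta\,\dfrac{d\theta}{dx} = 1$, so that

$\dfrac{d}{dx}\cosh^{-1}x = \dfrac{d\theta}{dx} = \dfrac{1}{\sqrt{(x^2-1)}}$.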

(iii) EXERCISE. Show that, in the range $-1 < x$ 1 and $\theta = \cosh^{-1}x$. Then, by definition, $\theta$ is positive and so $\sinh\theta$ is positive and equal to $+\sqrt{(x^2-1)}$. Thus $\tfrac12(e^{\theta}+e^{-\theta}) = x$, $\tfrac12(e^{\theta}-e^{-\theta}) = \sqrt{(x^2-1)}$, and, on adding these, $e^{\theta} = x+\sqrt{(x^2-1)}$. Thus

$\cosh^{-1}x = \theta = \log\{x+\sqrt{(x^2-1)}\}$.

NOTE. The negative value of $\theta$ for which $\cosh\theta = x$ is given by $\log\{x-\sqrt{(x^2-1)}\}$; for†

$\log\{x-\sqrt{(x^2-1)}\} = -\log\{x+\sqrt{(x^2-1)}\}$.

† $\{x-\sqrt{(x^2-1)}\}\{x+\sqrt{(x^2-1)}\} = x^2-(x^2-1) = 1$; $\log 1 = 0$.


DIFFERENTIATION

(iii)

When - l

EXERCISE.


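1.
5. $(1+x^2)^{1/2}\sinh^{-1}x$; $\tanh^{-1}x$ when $|x| < 1$; $\log(\sec x+\tan x)$ when $|x| < \tfrac12\pi$.
6. $\tanh x$, $\operatorname{sech} x$; $\coth x$, $\operatorname{cosech} x$ when $x \ne 0$.
7. $\operatorname{cosec}^{-1}x$, $\sec^{-1}x$ when $|x| > 1$; $\cot^{-1}x$.
8. $\operatorname{cosech}^{-1}x$; $\operatorname{sech}^{-1}x$ when $0 < x < 1$; $\coth^{-1}x$ when $|x| > 1$.
9. $\log(1-3x)$ when $x < \tfrac13$; $\log\{(1-x)/(1+x)\}$ when $|x| < 1$.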

Differentiate w.r.t. t 10. (a) cos{tan-I l~t

2 };

(b) tan{sin-I.J(l-t 2 )} when 0 < t 0.


Ix/
0, $y_n = r^n e^{-ax}\cos(bx+c+n\phi)$, where $r = +\sqrt{(a^2+b^2)}$, $r\cos\phi = -a$, and $r\sin\phi = b$. Show that one value of $\phi$ lies between $\tfrac12\pi$ and $\pi$ and is equal to $\pi - \sin^{-1}(b/r)$.
8. Show that, when $y = x^2e^x$, $y_n = \tfrac12 n(n-1)y_2 - n(n-2)y_1 + \tfrac12(n-1)(n-2)y$.
9. Prove that

$e^{-x}D^n(x^ne^x) = \sum_{r=0}^{n} \left\{\dfrac{n!}{(n-r)!}\right\}^2 \dfrac{x^{n-r}}{r!}$.

10. When $z = x^ke^x$, prove that $(z_1-z)x = kz$ and that

$xz_{n+2} + z_{n+1}(n+1-k-x) - (n+1)z_n = 0$.

Show that, if $y = D^n(x^{n+1}e^x)$, $xy_2 - xy_1 = (n+1)y$.

REPEATED DIFFERENTIATION


11. Find an expression for $D^n\{(1-x)^{-1}\log x\}$.
12. Differentiate n times the equations (i) $x^2y_2+xy_1+y = 0$, (ii) $(1-x^2)y_2-xy_1 = 0$.
13. Find $y_n$ when $y = (x-1)^{-3}(x-2)^{-1}$.
14. (i) Find $y_n$ when $y = (1-x^2)e^{-x}$. (ii) Find expressions for the nth derivative of $e^x/(x^2+1)$ when x = 1 and when x = 0.
15. Prove that the nth derivatives of $e^x\cos x$, $e^{-x}\sin x$ are

$(\sqrt2)^n e^x\cos(x+\tfrac14 n\pi)$, $(\sqrt2)^n e^{-x}\sin(x+\tfrac34 n\pi)$.

16. Prove that, when $y = \sin(m\sin^{-1}x)$, $(1-x^2)y_{n+2} - (2n+1)xy_{n+1} + (m^2-n^2)y_n = 0$.
17. Prove by induction, or otherwise, that, when u and v are functions of x, $vD^nu = D^n(uv) - nD^{n-1}(uDv) + \tfrac12 n(n-1)D^{n-2}(uD^2v) - \ldots$.
18. (Harder.) $a_0, a_1, \ldots, a_{2n}$ are given constants and $P_r(x) \equiv a_0x^r + ra_1x^{r-1} + \tfrac12 r(r-1)a_2x^{r-2} + \ldots + a_r$, where $0 \le r \le 2n$. Prove that $P_r'(x) = rP_{r-1}(x)$ and that

$\sum_{r=0}^{2n} (-1)^r\,{}^{2n}C_r\,P_r(x)P_{2n-r}(x)$

is a constant.
19. (Harder.) Show that

d'

-d (l+x)n-t = (n- t )(n-!) ... (n-r+t)(l+x)"-H

x'

and that the product of this and the (2n-r)th derivative of (1-x)n-i is equal to

"{l 3 (-l)

2n-1}2 (l +x)n-r-t (1- x)'•-•+t'

2·2 .. ·:·-2-

Differentiate 2n times the product (1-x)"-t(I+x)"-t to show that, when x = cos8, sin2 "H8

2n-l

s~n._

8)

= (-l)"l2 . 3 2..... (2n-1) 2 •

d2"( ·

20. Prove that $D(e^{-x^2}) = -2xe^{-x^2}$, that

$D^n(e^{-x^2}) = -2\{xD^{n-1}(e^{-x^2}) + (n-1)D^{n-2}(e^{-x^2})\}$,

and that $f_n(x) = e^{x^2}D^n(e^{-x^2})$ is a polynomial of degree n in x. Prove that

$Df_n(x) = e^{x^2}\{2xD^n(e^{-x^2}) + D^{n+1}(e^{-x^2})\} = -2nf_{n-1}(x)$.

21. When $y = \sqrt{\{(1+x)/(1-x)\}}$, prove that $y = (1-x^2)y_1$ and deduce, by using Leibniz's formula, that

$(1-x^2)y_n - \{2(n-1)x+1\}y_{n-1} - (n-1)(n-2)y_{n-2} = 0$.

Prove that the function $f_n(x)$ defined by $(1-x^2)^ny_n = yf_n(x)$ is a polynomial in x of degree n-1.

22. When $(x^2-1)y = 1$ and $f_n(x) = (x^2-1)^{n+1}y_n$, show that
(i) $f_{n+1} = (x^2-1)f_n' - 2(n+1)xf_n$,
(ii) $f_{n+1} + 2(n+1)xf_n + n(n+1)(x^2-1)f_{n-1} = 0$,
(iii) $n(n+1)f_{n-1} = -f_n'$,
(iv) $(x^2-1)f_n'' - 2nxf_n' + n(n+1)f_n = 0$.
[$f_n \equiv f_n(x)$, $f_n' \equiv f_n'(x)$, etc.]

23. The function $f_n(z)$ is defined by

$\left(z\dfrac{d}{dz}\right)^n \dfrac{1}{1-z} = (1-z)^{-n-1}f_n(z)$.

Show that it is a polynomial of degree n or less in z, that

$f_{n+1}(z) = z\{(1-z)f_n'(z) + (n+1)f_n(z)\}$,

and that $f_n(1) = n!$ Use the above relation between $f_{n+1}$ and $f_n$ to show that, on writing

$f_n(z) \equiv A_1z + A_2z^2 + \ldots + A_nz^n$,

the coefficients $A_1, A_2, \ldots, A_n$ are all positive.

24. (Harder.) When $\mu = \cos\theta$ and $K = (-1)^{n-1}\{1.3.5\ldots(2n-1)\}n^{-1}$, show that, if

$\dfrac{d^{n-1}\sin^{2n-1}\theta}{d\mu^{n-1}} = K\sin n\theta$, (1)

the use of Leibniz's formula for $D^{n+1}\{(1-\mu^2)\sin^{2n-1}\theta\}$ gives

$\dfrac{d^{n+1}\sin^{2n+1}\theta}{d\mu^{n+1}} = Kn(2n+1)\cos(n+1)\theta\,\operatorname{cosec}\theta$,

and that integration w.r.t. $\mu$ then gives

$\dfrac{d^n\sin^{2n+1}\theta}{d\mu^n} = (-1)^n\,\dfrac{1.3\ldots(2n+1)}{n+1}\,\sin(n+1)\theta$.

Use induction to prove that formula (1) is true. Aliter. Prove the formula (1) by using the method of Example 19.

VIII THE MEAN VALUE THEOREM 1. Graphical introduction Let y = f(x) have a graph with a continuously turning tangent: let A, B be the points {a, f(a)}, {b, f(b)}.

[Figure: two sketches of a curve from A to B with the chord AB; the tangent at P (or at $P_1$, $P_2$) is parallel to the chord.]

Then, as a glance at the figure will show, there is at least one point P (possibly two or more points $P_1, P_2, \ldots$) on the curve between A and B such that the tangent at P is parallel to the chord AB. Now the slope of the chord is {f(b)-f(a)}/(b-a) and the slope of the tangent is $f'(\xi)$, where $\xi$ is the abscissa of the point P. Thus 'there is a number $\xi$ between a and b for which

$\dfrac{f(b)-f(a)}{b-a} = f'(\xi)$.'

This result is usually referred to as the MEAN VALUE THEOREM of the differential calculus. [In short, M.V.T.]
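The statement can be illustrated numerically. The sketch below is an illustration only: it assumes that $f'(x)$ minus the slope of the chord takes opposite signs at the two ends of the interval, so that bisection applies; that is an extra assumption, not something the theorem requires. The function name `mean_value_point` and the choice $f(x) = x^3$ on (0, 2) are hypothetical.

```python
import math


def mean_value_point(df, fa, fb, a, b, tol=1e-12):
    """Locate xi in (a, b) with df(xi) equal to the chord slope, by bisection.

    Assumes df(x) - slope changes sign between a and b.
    """
    slope = (fb - fa) / (b - a)

    def g(x):
        return df(x) - slope

    lo, hi = a, b
    if g(lo) * g(hi) > 0:
        raise ValueError("no sign change at the end points; bisection does not apply")
    while hi - lo > tol:
        mid = 0.5 * (lo + hi)
        if g(lo) * g(mid) <= 0:
            hi = mid
        else:
            lo = mid
    return 0.5 * (lo + hi)


# f(x) = x^3 on (0, 2): the chord has slope 4 and f'(x) = 3x^2.
xi = mean_value_point(lambda x: 3 * x * x, 0.0, 8.0, 0.0, 2.0)
```

For this function the tangent is parallel to the chord where $3\xi^2 = 4$, that is at $\xi = 2/\sqrt3$, which lies between 0 and 2.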

2. Need for proof free of graphical argument The M.V.T. was found to be a powerful instrument for developing the theory of the calculus and a search was made

MAXIMA AND MINIMA


We require certain algebraical theorems, which we dispose of before coming to our main problem. 2.1. Algebraical details LEMMA 1. Let r, s, t be given real numbers: let

$r > 0$, $t > 0$ and $rt > s^2$.

Then, for all real values of h and k other than h = k = 0, $rh^2+2shk+tk^2 > 0$. The proof follows at once on writing

$rh^2+2shk+tk^2 = \dfrac{1}{r}\{(rh+sk)^2+(rt-s^2)k^2\}$.

The quadratic form $rh^2+2shk+tk^2$ in the variables h, k is then said to be POSITIVE-DEFINITE. In order to round off the proof of our theorems concerning maximum and minimum, even when we use the elementary approach, we need a refinement on Lemma 1. This is

LEMMA 2. Let r, s, t be given real numbers: let

$r > 0$, $t > 0$ and $rt > s^2$.

Let $Q = rh^2+2shk+tk^2$. Let h, k vary subject to $h^2+k^2 = \rho^2$, where $\rho$ is fixed. Then

$Q \ge m\rho^2$,

where m is a positive constant independent of the value of $\rho$ and dependent only on the values of r, s, t. Proof. Write $h = \rho\cos\theta$, $k = \rho\sin\theta$. Then

$Q = \tfrac12\rho^2\{r(1+\cos2\theta)+2s\sin2\theta+t(1-\cos2\theta)\} = \tfrac12\rho^2\{(r+t)+(r-t)\cos2\theta+2s\sin2\theta\}$.

On putting $r-t = A\cos\alpha$, $2s = A\sin\alpha$, where $A = \sqrt{\{(r-t)^2+4s^2\}}$,

$Q = \tfrac12\rho^2\{r+t+A\cos(2\theta-\alpha)\} \ge \tfrac12\rho^2(r+t-A)$. (2)

Now $A = \sqrt{\{(r+t)^2-4(rt-s^2)\}} < r+t$, since $rt-s^2 > 0$. Hence r+t-A is a positive number, 2m say, whose value depends only on r, s, t and, from (2),

$Q \ge m\rho^2$.
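Both lemmas lend themselves to a quick numerical check. The sketch below is an illustration under assumed values (r = 2, s = 1, t = 3, chosen only because they satisfy the hypotheses); the helper names are hypothetical. It verifies the identity behind Lemma 1 exactly on a grid of rational h, k, and compares Q with $m\rho^2$ on a sampled circle, taking $m = \tfrac12(r+t-A)$ as in the proof of Lemma 2.

```python
import math
from fractions import Fraction
from itertools import product


def q_form(r, s, t, h, k):
    """The quadratic form rh^2 + 2shk + tk^2."""
    return r * h * h + 2 * s * h * k + t * k * k


def completed_square(r, s, t, h, k):
    """Right-hand side of the identity used to prove Lemma 1 (needs r != 0)."""
    return ((r * h + s * k) ** 2 + (r * t - s * s) * k * k) / r


def lemma2_constant(r, s, t):
    """m = (r + t - A)/2 with A = sqrt((r - t)^2 + 4 s^2), as in Lemma 2."""
    big_a = math.sqrt((r - t) ** 2 + 4 * s * s)
    return 0.5 * (r + t - big_a)


# Illustrative coefficients (an assumption): r > 0, t > 0, rt = 6 > s^2 = 1.
r, s, t = 2, 1, 3

# Lemma 1: exact check of the identity on a small grid of rational h, k.
grid = [Fraction(n, 2) for n in range(-4, 5)]
identity_holds = all(
    q_form(r, s, t, h, k) == completed_square(r, s, t, h, k)
    for h, k in product(grid, grid)
)

# Lemma 2: sample the circle h^2 + k^2 = rho^2 and compare Q with m * rho^2.
rho = 0.5
m = lemma2_constant(r, s, t)
smallest_ratio = min(
    q_form(r, s, t, rho * math.cos(th), rho * math.sin(th)) / rho ** 2
    for th in (2 * math.pi * j / 3600 for j in range(3600))
)
```

With these values $A = \sqrt5$, so m is a little above 1.38, and the smallest sampled value of $Q/\rho^2$ stays at or above m, as the lemma asserts.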

2.2. Conditions for a minimum Let z = f(x, y) have continuous partial derivatives of the first, second, and third orders in a certain region $\mathcal{R}$ and let x = a, y = b lie inside that region. Then, by Taylor's expansion (Chap. XV, § 5.1, p. 192)

$f(a+h, b+k) = f(a, b) + (ph+qk) + \dfrac{1}{2!}(rh^2+2shk+tk^2) + \dfrac{1}{3!}\left[\left(h\dfrac{\partial}{\partial x}+k\dfrac{\partial}{\partial y}\right)^3 f\right]_{a+\theta h,\ b+\theta k}$,

where p, q, ... are the values of $f_x, f_y, \ldots$ evaluated at x = a, y = b. The first approximation when h and k are small gives

$f(a+h, b+k)-f(a, b) \approx ph+qk$. (3)

If $p \ne 0$, the pairs of values (i) $h = +\delta$, $k = 0$, (ii) $h = -\delta$, $k = 0$ give opposite signs to (3) and so f(a+h, b+k)-f(a, b) will be positive for some h, k and negative for others; there can be neither maximum nor minimum at x = a, y = b. Similarly, if $q \ne 0$, there is neither maximum nor minimum. Thus, a first condition for either a maximum or a minimum is that p = q = 0. When p = q = 0, our expansion gives the approximation

$f(a+h, b+k)-f(a, b) \approx \tfrac12(rh^2+2shk+tk^2)$

when h, k are small. For a minimum, this must certainly be positive for all h, k and so the second condition for a minimum is that

$r > 0$, $t > 0$ and $rt > s^2$.

In order to prove that this actually gives a minimum we must deal with the remainder term in the Taylor expansion. Suppose that the third order derivatives of f are bounded in $\mathcal{R}$; let their moduli be all less than a given M. Then the magnitude of

$\dfrac{1}{3!}\left(h^3\dfrac{\partial^3f}{\partial x^3} + 3h^2k\dfrac{\partial^3f}{\partial x^2\,\partial y} + \ldots\right) \le \dfrac{1}{3!}(|h|+|k|)^3M$

which, by elementary algebra,† is $\le \tfrac16\{2(h^2+k^2)\}^{3/2}M$.

When $h^2+k^2 = \rho^2$, this last term is less than $M\rho^3$. Moreover, when $h^2+k^2 = \rho^2$, $rh^2+2shk+tk^2 \ge m\rho^2$, where m is the positive constant in Lemma 2. Thus, when $h^2+k^2 = \rho^2$,

$f(a+h, b+k)-f(a, b) = \tfrac12(rh^2+2shk+tk^2) + \dfrac{1}{3!}\left[\left(h\dfrac{\partial}{\partial x}+k\dfrac{\partial}{\partial y}\right)^3 f\right]_{a+\theta h,\ b+\theta k} > \tfrac12 m\rho^2 - M\rho^3$.

Since m, M are fixed numbers independent of $\rho$, we can make this positive by taking $\rho$ to be sufficiently small.

There is therefore a minimum at x = a, y = b if

(i) $p = q = 0$;

(ii) $r > 0$, $t > 0$ and $rt > s^2$.
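The conditions of this chapter can be collected into a small classifier. The sketch below is an illustration, not the author's procedure: it combines the minimum conditions just stated with the maximum, doubtful and saddle cases treated in §§ 2.3, 5 and 6, and it labels every point with $rt < s^2$ a saddle point although § 6 discusses only the case $r \ne 0$. The function name `classify` is hypothetical, and the derivative values in the usage lines were computed here for $z = x^4+y^4-2(x-y)^2$, for which $r = 12x^2-4$, $s = 4$, $t = 12y^2-4$.

```python
def classify(p, q, r, s, t):
    """Classify the point from the values of f_x, f_y, f_xx, f_xy, f_yy there."""
    if p != 0 or q != 0:
        return "not critical"
    disc = r * t - s * s
    if disc > 0:
        # rt > s^2 forces r and t to have the same sign.
        return "minimum" if r > 0 else "maximum"
    if disc < 0:
        return "saddle point"
    return "doubtful case"


# z = x^4 + y^4 - 2(x - y)^2 at (sqrt 2, -sqrt 2): p = q = 0, r = 20, s = 4, t = 20.
at_root_two = classify(0, 0, 20, 4, 20)

# The same function at (0, 0): p = q = 0, r = -4, s = 4, t = -4, so rt = s^2.
at_origin = classify(0, 0, -4, 4, -4)
```

Exact values are passed in rather than floating-point evaluations of the derivatives, so that the tests p = 0 and q = 0 are not upset by rounding.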

2.3. Conditions for a maximum When p = q = 0, the Taylor expansion gives

$f(a+h, b+k)-f(a, b) \approx \tfrac12(rh^2+2shk+tk^2)$.

Moreover, $Q = rh^2+2shk+tk^2$ will be negative provided that

$-Q = (-r)h^2+2(-s)hk+(-t)k^2$

is positive. Thus Q is negative for all h, k other than h = 0, k = 0 when

$r < 0$, $t < 0$ and $rt > s^2$.

There is a maximum at x = a, y = b if

(i) $p = q = 0$;

(ii) $r < 0$, $t < 0$ and $rt > s^2$.




0

for all dx, dy; save when dx = dy = 0.

Then f(x, y) has a minimum value at x = a, y = b.

The proof is based on two details, one of algebra and one of calculus. We deal with these separately.

3.1. A detail of algebra LEMMA. Let $Q \equiv r\xi^2+2s\xi\eta+t\eta^2$ be a positive-definite form in the variables $\xi$, $\eta$. Then, if $\xi^2+\eta^2 = \rho^2$, $Q \ge m\rho^2$, where m is a positive constant independent of $\rho$ and dependent only on r, s, t. Proof. [Cf. the elementary proof of § 2.1.] Since Q is positive-definite, there is an orthogonal transformation‡ from the variables $\xi$, $\eta$ to new variables X, Y such that, in these new variables,

$Q = \lambda_1X^2+\lambda_2Y^2$,

where $\lambda_1$ and $\lambda_2$ are positive. When $\xi^2+\eta^2 = \rho^2$, then also $X^2+Y^2 = \rho^2$ and, as X and Y vary,

$Q \ge \lambda(X^2+Y^2) = \lambda\rho^2$,

where $\lambda = \min(\lambda_1, \lambda_2)$ and is independent of $\rho$; it depends only on the original coefficients r, s, t. This proof extends automatically to the corresponding problem in n variables.

† We use the notation of § 2 (1).
‡ Cf. W. L. Ferrar, Algebra (Oxford, 1941), Theorem 49, p. 153 for the form $\lambda_1X^2+\ldots$ and Theorem 45, p. 146 for $\lambda_1$, $\lambda_2$ positive.

3.2. A detail of calculus Let z = f(x, y) have continuous partial derivatives of the first and second orders at x = a, y = b; let the values of these derivatives at x = a, y = b be denoted by p, q, r, s, t and let $d^2f = r\,dx^2+2s\,dx\,dy+t\,dy^2$. LEMMA. When p = q = 0 and $dx^2+dy^2 = \rho^2$, $f(a+dx, b+dy)-f(a, b) = \theta(d^2f+\alpha\rho^2)$,

(1)

where

(i)

(ii)

o


and, from (5), provided only that $dx^2+dy^2 < \delta^2$. Accordingly (Definition 32), f(x, y) has a minimum at x = a, y = b.

COROLLARY. With the same conditions as in Theorem 35 save that now (i) df = 0 for all dx, dy, (ii) $d^2f < 0$ save when dx = dy = 0, f(x, y) has a maximum value at x = a, y = b.

The proof is similar to that of the theorem. Since $d^2f < 0$, the quadratic form $-d^2f \equiv -r\,dx^2-2s\,dx\,dy-t\,dy^2$ is positive-definite. Accordingly, when $dx^2+dy^2 = \rho^2$,




Then $f(x_1, \ldots, x_n)$ has a minimum value at $x_1 = a_1, \ldots, x_n = a_n$.

Proof. The condition (ii) ensures† that

$d^2f = \sum_{r,s} a_{rs}\,dx_r\,dx_s \quad (a_{rs} = a_{sr})$

is a positive-definite form. Let $dx_1^2+\ldots+dx_n^2 = \rho^2$. The lemma of § 3.1 shows that

$d^2f \ge \lambda\rho^2$,

where $\lambda$ is a positive constant independent of $\rho$. A proof on the lines of § 3.2 establishes that

$f(a_1+dx_1, \ldots, a_n+dx_n)-f(a_1, \ldots, a_n) = \theta(d^2f+\beta\rho^2)$,

where $0 < \theta < 1$ and $\beta \to 0$ as $\rho \to 0$ and, as in the proof of Theorem 35, there is a number $\delta$ for which

$f(a_1+dx_1, \ldots, a_n+dx_n) > f(a_1, \ldots, a_n)$

whenever $\rho < \delta$. That is to say, $f(x_1, \ldots, x_n)$ has a minimum at $x_1 = a_1, \ldots, x_n = a_n$.

† W. L. Ferrar, Algebra (Oxford, 1941), p. 138.

COROLLARY. With the same conditions as in the theorem save that (ii) is replaced by

(iii) $a_{11} < 0$, $\begin{vmatrix} a_{11} & a_{12} \\ a_{21} & a_{22} \end{vmatrix} > 0$, $\begin{vmatrix} a_{11} & a_{12} & a_{13} \\ a_{21} & a_{22} & a_{23} \\ a_{31} & a_{32} & a_{33} \end{vmatrix} < 0$, ...,

$f(x_1, \ldots, x_n)$ has a maximum at $x_1 = a_1, \ldots, x_n = a_n$.


0 save when $dx_1 = \ldots = dx_n = 0$.

Then $f(x_1, \ldots, x_n)$ has a minimum value at $x_1 = a_1, \ldots, x_n = a_n$. Moreover, if (i) holds and (ii) has $d^2f < 0$ in place of $d^2f > 0$, there is a maximum value.

The proof is exactly that of Theorem 36 save that the positive-definite character of $d^2f$ is taken as a hypothesis instead of being deduced from the positive sign of the numbers in (ii) of Theorem 36.

5. The doubtful case Let z = f(x, y) and let p, q, r, s, t be defined as in § 2. When x = a, y = b let

$p = q = 0$, $rt = s^2$. (1)

This is called the 'doubtful case', and for this reason: We have, when (1) is satisfied, df = 0 for all dx, dy and

EITHER (i) $r \ne 0$, $d^2f = \dfrac{1}{r}(r\,dx+s\,dy)^2$, (2)

OR (ii) $r = 0$, $d^2f = t\,dy^2$. (3)

When (i) holds $f(a+dx, b+dy)-f(a, b) \approx \tfrac12 d^2f$ and has the same sign as r unless the ratio dx : dy is equal to -s : r; but when dx : dy = -s : r, (2) merely gives $d^2f = 0$ and tells us nothing about the sign of f(a+dx, b+dy)-f(a, b), which can only be found by a closer approximation. We cannot tell from (2) alone whether there is a maximum, or a minimum, or neither at x = a, y = b. The same uncertainty arises in (ii) when dy = 0; (3) will tell us nothing about the sign of f(a+dx, b)-f(a, b). The problem can often be handled quite simply in particular cases (see Example 1, § 8), but the general theory is too involved to be worth the pains, and the conditions (1) are usually referred to as the doubtful case and the problem left at that.†

6. Saddle points The reader will probably know the general shape of a saddle as it is placed on a horse's back. Think of a point A, roughly at the centre of the seat of the saddle. Curves through A in the surface of the saddle and in the general direction 'from nose to tail' have a minimum at A; curves in the general direction 'from flank to flank' have a maximum at A. This is the characteristic of what are called saddle points of a surface; some curves on the surface through the point have minima there, others have maxima.

We shall suppose that z = f(x, y) has continuous partial derivatives of the first, second, and third orders in a neighbourhood of the point (a, b) and that all points $(a+\lambda\,dx,\ b+\lambda\,dy)$ considered lie in this neighbourhood. We may then write, when $dx^2+dy^2 = \rho^2$,

$f(a+\lambda\,dx,\ b+\lambda\,dy)-f(a, b) = \lambda\,df + \dfrac{\lambda^2}{2!}\,d^2f + \dfrac{\lambda^3}{3!}\,M_1\rho^3$, (1)

where $M_1$ is bounded. Let $\lambda$ vary and let $\rho$ be fixed. Let p, q, r, s, t have their usual meanings (§ 2) and, when x = a and y = b, let

$r \ne 0$, $p = q = 0$, $rt < s^2$.
0. Then $d^2f$ has a positive value, and [with a proof as in (i)]

$f(a+\lambda\,dx,\ b+\lambda\,dy)-f(a, b) > 0$

when $|\lambda|$ is sufficiently small. The curve on the surface defined by this value of the ratio dx : dy has a minimum at x = a, y = b. The signs are reversed throughout (i) and (ii) if r is negative. Thus, if for x = a, y = b,

$r \ne 0$, $p = q = 0$, $rt < s^2$,




s2•

There is a minimum.

† Aliter. (1) gives $y^3 = -x^3$, whence y = -x and so $x^3 = 2x$.

q

= 0,


The doubtful case at (0, 0). Here $d^2f = -4\,dx^2+8\,dx\,dy-4\,dy^2 = -4(dx-dy)^2$. For any definite unequal dx and dy, $d^2f$ is negative and

$f(\lambda\,dx, \lambda\,dy)-f(0, 0) \approx \tfrac12\lambda^2d^2f < 0$

when $|\lambda|$ is sufficiently small; z decreases as we leave the origin in any direction other than dx = dy. On the other hand, let dx = dy. The form

$z = f(x, y) = x^4+y^4-2(x-y)^2$

gives at once

$f(\lambda\,dx, \lambda\,dx)-f(0, 0) = 2\lambda^4(dx)^4 > 0$,

and z increases as we leave the origin in the direction dx = dy. The function has neither maximum nor minimum when x = 0 and y = 0.

EXAMPLE 2. Show that, when $ab \ne h^2$,

$z = x^3+ax^2+2hxy+by^2+2b^{-1}(hx+by)$

has one and only one maximum or minimum value and that this is a maximum if b < 0.

SOLUTION.

$p = 3x^2+2ax+2hy+2hb^{-1}$, $q = 2hx+2by+2$,

$r = 6x+2a$, $s = 2h$, $t = 2b$.

Maxima or minima can occur only if p = q = 0. On taking bp-hq = 0 [eliminating y], x then satisfies $3bx^2+2(ab-h^2)x = 0$, i.e.

$x = 0$ or $x = -2(ab-h^2)/3b = \alpha$, say.

When x = 0, $rt-s^2 = 4(ab-h^2)$; when $x = \alpha$, $rt-s^2 = -4(ab-h^2)$.

For one of these $rt > s^2$ and there is a maximum or minimum; for the other $rt < s^2$ and there is a saddle point. Finally, $rt > s^2$ implies that rt is positive and that r has the same sign as t. Hence z has a maximum if t is negative, that is, if b is negative.


8.2. Solutions without the calculus There are some circumstances in which the formal theorems of the calculus fail to give results, while simple algebra provides a clear indication whether there is, or is not, a maximum or minimum.

EXAMPLE 3. Show that the function

$z = (x+y-1)(x^4+y^4)$

has a maximum at x = 0, y = 0.

SOLUTION. At (0, 0), p = q = r = s = t = 0 and the theorems of the calculus, without elaborations which we have omitted, give no result. On the other hand, when x and y are small and not both zero, x+y-1 is negative, $x^4+y^4$ is positive, and z is negative. When x = y = 0, z is zero. Hence z has a maximum at x = y = 0. [Strictly, we should show that z is negative when $0 < x^2+y^2 <$ some definite $\delta^2$; in fact, $\delta = 1/\sqrt2$ will serve, for then $(x+y)^2 \le 2(x^2+y^2) < 1$.]

EXAMPLE 4. Prove that

$V = xyz+x^4+y^4+z^4$

has neither maximum nor minimum at x = y = z = 0.

SOLUTION. When x = y = z = k and k is small, V has the sign of $k^3$: this is positive when k is positive, negative when k is negative. Also V = 0 when x = y = z = 0. Hence as we leave x = y = z = 0, V increases for some x, y, z and decreases for others. There is neither maximum nor minimum.

EXAMPLE 5. Show that

$z = x^4+y^4-2(x-y)^2$

has a minimum value at $x = \sqrt2$, $y = -\sqrt2$ (Example 1).

SOLUTION. Put $x = h+\sqrt2$, $y = k-\sqrt2$. After a little reduction

$z = -8+h^2(h+2\sqrt2)^2+k^2(k-2\sqrt2)^2+2(h+k)^2$

and, when h and k are small real numbers, this exceeds -8, the value of z when $x = \sqrt2$, $y = -\sqrt2$.
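The 'little reduction' of Example 5 can be confirmed numerically. The sketch below is only a spot check at a few sample points (the points and the names `z` and `reduced` are illustrative choices, not part of the text): it compares the original expression for z with the reduced form and confirms that the reduced form is never below -8 at those points.

```python
import math

ROOT2 = math.sqrt(2.0)


def z(x, y):
    """The function of Example 5."""
    return x ** 4 + y ** 4 - 2 * (x - y) ** 2


def reduced(h, k):
    """The reduced form obtained after putting x = h + sqrt 2, y = k - sqrt 2."""
    return (-8
            + h * h * (h + 2 * ROOT2) ** 2
            + k * k * (k - 2 * ROOT2) ** 2
            + 2 * (h + k) ** 2)


samples = [(-0.3, 0.2), (0.1, 0.1), (0.05, -0.4), (0.0, 0.0), (0.25, -0.25)]

# Largest discrepancy between the two expressions over the sample points.
max_gap = max(abs(z(h + ROOT2, k - ROOT2) - reduced(h, k)) for h, k in samples)

# Every term added to -8 is a square, so the reduced form cannot fall below -8.
never_below_minus_8 = all(reduced(h, k) >= -8 for h, k in samples)
```

Because the reduced form is -8 plus a sum of squares, the minimum at h = k = 0 can be read off without any appeal to the second-order conditions.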

EXAMPLES XVII

[Some of these examples require heavy 'arithmetic' to deal with all the possibilities. For the harder examples consult Chapter XIII, § 3.]

1. Find the maxima or minima of z when
(i) $z = x^4+y^4-(x+y)^2$,
(ii) $z = (x+y-1)(x^2+y^2)$,
(iii) $z = x^4+2x^2y-x^2+3y^2$,
(iv) (Harder.) $z = (x-y)/(x^2+y^2+1)$.
2. (Harder.) Prove that $(ax+by+c)^2/(x^2+y^2+1)$ has a maximum at x = a/c, y = b/c.
3. Prove that $x^3y^2(1-x-y)$ has a maximum at $(\tfrac12, \tfrac13)$.
4. Prove that $x^3+y^3-3axy$ has only one maximum or minimum, a minimum if a > 0 and a maximum if a < 0.
5. Show that, when x and y are restricted to be positive acute angles, $\sin x\sin y\sin(x+y)$ has one and only one maximum.
6. Show that, when x and y are restricted to be positive acute angles and m, n are positive constants,

$\sin^mx\,\sin^ny\,\sin^{m+n}(x+y)$

can have a maximum or minimum only for values of x, y which satisfy the equations

$\dfrac{\tan x}{m} = \dfrac{\tan y}{n}$, $\tan x\tan y = 2$.

(Harder in detail.) Show that, when m = 2n, there is a maximum at $x = \tan^{-1}2$, $y = \tfrac14\pi$.
7. Prove that, when $\alpha$, $\beta$, $\gamma$ are the angles of an acute-angled triangle,

$\sec\alpha+\sec\beta+\sec\gamma$

has a minimum value if the triangle is equilateral.

[Examples 8 and 9 are somewhat exacting.]
8. Show that the function

$(y+z)^2+(z+x)^2+xyz$

has neither maximum nor minimum.
9. Show that for positive values of the constants a, b, h and for positive values of the variables x, y, z, the only maximum value of

x2 xy(z-h) (a2

y2 z2) -+--b2 c2

----.

is $(2h/5)^5abc^{-4}$.
10. When a, b are positive and $ab \ne h^2$, show that $(ax^2+2hxy+by^2)e^x$ has a minimum at (0, 0) or at (-2, 2h/b) according as $ab > h^2$ or $ab < h^2$.


11. Show that when z is defined implicitly as a function of x and y by an equation of the form

$xy+xf(z)+yg(z)+h(z) = 0$

it can have neither maximum nor minimum.
12. When $F \equiv Ax^2+2Hxy+By^2$ is a positive-definite form,

$f \equiv ax^2+2hxy+by^2$ and $z = f/F$,

show that p = q = 0 when

$\dfrac{f}{F} = \dfrac{ax+hy}{Ax+Hy} = \dfrac{hx+by}{Hx+By}$

and hence show that any critical (stationary) value of z is a root of the equation

$\begin{vmatrix} a-A\lambda & h-H\lambda \\ h-H\lambda & b-B\lambda \end{vmatrix} = 0$.

When $C \equiv \sum c_{rs}x_rx_s$ is positive-definite in the n variables $x_1, \ldots, x_n$, $A \equiv \sum a_{rs}x_rx_s$ and z = A/C, show that any critical (stationary) value of z is a root of the determinantal equation

$|a_{rs}-\lambda c_{rs}| = 0$.

XVIII RESTRICTED MAXIMA AND MINIMA 1. When the variables x, y, z are not independent but are connected by an identity, say

$\dfrac{x^2}{a^2}+\dfrac{y^2}{b^2}+\dfrac{z^2}{c^2} \equiv 1$, (1)

which, within a certain region at any rate,† serves to determine the value of any one of the variables when the values of the other two are given, the differentials dx, dy, dz are not independent. They must satisfy the linear relation

$\dfrac{x}{a^2}\,dx+\dfrac{y}{b^2}\,dy+\dfrac{z}{c^2}\,dz = 0$.

If (1) is replaced by a general type of identity

$\phi(x, y, z) \equiv 0$, (2)

the differentials dx, dy, dz must satisfy the relation

$\phi_x\,dx+\phi_y\,dy+\phi_z\,dz = 0$. (3)

We shall suppose throughout, and without further reference, that the function $\phi(x, y, z)$ is such that (2), within the region we are considering, serves to determine the value of any one of x, y, z when the values of the other two are given.‡ We work mostly in three variables, leaving it to be understood that the methods apply to any number of variables.

† Cf. Chapter XIV § 3.3, p. 166.
‡ For example, within the positive octant (1) determines z when x and y are given. We leave aside such formal theorems as 'If $\phi_z \ne 0$ at $(x_0, y_0, z_0)$, then (2) defines z uniquely as a function of x and y in the immediate neighbourhood of $(x_0, y_0, z_0)$'. See Theorem 40 et seq. (p. 254); or Goursat, Cours d'analyse, vol. i, chapter iii.

2. One relation

We seek maxima and minima of f(x, y, z) when $\phi(x, y, z) \equiv 0$. We first show how to find the points x, y, z at which f(x, y, z) has critical values and afterwards we determine whether these values are maxima, or minima, or neither.

2.1. Algebraical preliminary LEMMA. With three variables $\xi$, $\eta$, $\zeta$, let

$a'\xi+b'\eta+c'\zeta = 0$ (i)

whenever

$a\xi+b\eta+c\zeta = 0$. (ii)

Then there is a number $\lambda$ for which

$a' = \lambda a$, $b' = \lambda b$, $c' = \lambda c$. (iii)

Proof. If a = b = c = 0, (ii) imposes no condition on $\xi$, $\eta$, $\zeta$ and (i) requires a' = b' = c' = 0. The relation (iii) is satisfied automatically by any value of $\lambda$. Let a, b, c be not all zero, say $a \ne 0$. Then, from (i) and (ii),

$(ab'-a'b)\eta+(ac'-a'c)\zeta = 0$ (iv)

whenever (ii) holds. But, on taking arbitrary values for $\eta$ and $\zeta$, (ii) merely serves to determine $\xi$ in terms of $\eta$ and $\zeta$ and so (iv) is true for arbitrary values of $\eta$ and $\zeta$. Accordingly,

$ab' = a'b$, $ac' = a'c$

and, on writing $\lambda = a'/a$,

$a' = \lambda a$, $b' = \lambda b$, $c' = \lambda c$.

Conversely, if (iii) holds, then $a'\xi+b'\eta+c'\zeta = 0$ whenever $a\xi+b\eta+c\zeta = 0$.
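The lemma can be tried on concrete coefficients. The sketch below is an illustration under the same assumption as the proof, namely $a \ne 0$, and with integer coefficients so that the arithmetic is exact; the function name `multiplier` is hypothetical. It tests (i) on two independent solutions of (ii), which is equivalent to the pair of conditions ab' = a'b, ac' = a'c, and returns $\lambda = a'/a$ when they hold.

```python
from fractions import Fraction


def multiplier(coeffs, primed):
    """Return lambda with (a', b', c') = lambda (a, b, c), or None if (i) fails.

    coeffs = (a, b, c) and primed = (a', b', c') are integer triples with a != 0.
    """
    a, b, c = coeffs
    a1, b1, c1 = primed
    if a == 0:
        raise ValueError("this sketch assumes a != 0, as in the proof")
    # Two independent solutions (xi, eta, zeta) of a*xi + b*eta + c*zeta = 0.
    basis = [(-b, a, 0), (-c, 0, a)]
    if not all(a1 * x + b1 * y + c1 * z == 0 for x, y, z in basis):
        return None
    return Fraction(a1, a)


proportional = multiplier((2, 1, -3), (6, 3, -9))
not_proportional = multiplier((2, 1, -3), (6, 3, -8))
```

In the second call the primed form does not vanish on the whole plane (ii), so no multiplier exists and the function returns None.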

2.2. Finding critical points Let $\phi(x, y, z) \equiv 0$. Then the function f(x, y, z) has a critical value at (x, y, z) if, and only if,

$df = f_x\,dx+f_y\,dy+f_z\,dz = 0$ (4)

whenever the differentials dx, dy, dz satisfy

$\phi_x\,dx+\phi_y\,dy+\phi_z\,dz = 0$. (5)

= 8dy; from

0.

There is a minimum with $u = \sqrt{17}$. Similar working shows (ii) a maximum, $u = 3\sqrt2$ and (iv) a minimum, $u = \sqrt{17}$.


EXAMPLE 2. The variables x, y, z satisfy the relation

$x^2+y^2+z^2 = a^2+b^2+c^2 \quad (abc \ne 0)$.

Show that the function

$V \equiv f(x)+g(y)+h(z)$

has a minimum value when x = a, y = b, z = c if

$\dfrac{f'(a)}{a} = \dfrac{g'(b)}{b} = \dfrac{h'(c)}{c}$

and if also, when $a^2\alpha \equiv f''(a)-\{f'(a)/a\}$, etc.,

$\beta+\gamma > 0$ and $\beta\gamma+\gamma\alpha+\alpha\beta > 0$.

SOLUTION. At a critical point

$dV = f'(x)\,dx+g'(y)\,dy+h'(z)\,dz = 0$

whenever

$x\,dx+y\,dy+z\,dz = 0$. (1)

There is a critical point at x = a, y = b, z = c if $f'(a)/a = g'(b)/b = h'(c)/c = \lambda$. Also

$d^2V = \sum f'(x)\,d^2x+\sum f''(x)\,dx^2$

and, from (1),

$0 = \sum x\,d^2x+\sum dx^2$.

At (a, b, c), $f'(a) = \lambda a$, $g'(b) = \lambda b$, $h'(c) = \lambda c$ and so

$d^2V = \sum \{f''(a)-\lambda\}\,dx^2$.

That is, with the notation indicated in the question,

$d^2V = \sum \alpha a^2\,dx^2$

while, from (1),

$0 = \sum a\,dx$.

On writing a dx = X, b dy = Y, c dz = Z, we see that there is a minimum at (a, b, c) if

$\alpha X^2+\beta Y^2+\gamma Z^2 > 0$ whenever $X+Y+Z = 0$.

Thus there is a minimum if

$\alpha X^2+\beta Y^2+\gamma(X+Y)^2 \equiv (\alpha+\gamma)X^2+2\gamma XY+(\beta+\gamma)Y^2$

is a positive-definite form in X and Y. This is so if $\alpha+\gamma > 0$, $\beta+\gamma > 0$, $(\alpha+\gamma)(\beta+\gamma)-\gamma^2 = \beta\gamma+\gamma\alpha+\alpha\beta > 0$. If $\beta+\gamma > 0$ and $\beta\gamma+\gamma\alpha+\alpha\beta > 0$, then also $\alpha+\gamma > 0$. Hence, under the given conditions, there is a minimum at x = a, y = b, z = c.


2.5. A property of determinants We conclude these examples with a well-known property of determinants, due to Hadamard.† The proof of this property presumes a knowledge of the following theorem about determinants: If $\Delta = |a_{rs}|$ is a determinant of the nth order having the element $a_{rs}$ in the rth row and sth column, and if $A_{rs}$ is the cofactor of $a_{rs}$, then

$\Delta = \sum_{s=1}^{n} a_{rs}A_{rs} \quad (r = 1, \ldots, n)$,

$0 = \sum_{s=1}^{n} a_{rs}A_{ts} \quad (r \ne t)$,

and

$\Delta' \equiv |A_{rs}| = \Delta^{n-1}$,

where the notation indicates that $\Delta'$ is the determinant in which $A_{rs}$ is the element in the rth row and sth column.

† Hadamard's property concerns least and greatest values rather than maxima and minima; but see § 3, which follows.

EXAMPLE 3. If $\Delta = |a_{rs}|$ and the elements $a_{rs}$ satisfy

$\sum_{s=1}^{n} a_{rs}^2 = b_r \quad (r = 1, \ldots, n)$,

where the $b_r$ are given positive constants, then $\Delta$ has a maximum value $\sqrt{(b_1b_2\ldots b_n)}$ and a minimum value $-\sqrt{(b_1b_2\ldots b_n)}$. The values of $a_{rs}$ which yield these stationary values satisfy

$\sum_{s=1}^{n} a_{rs}a_{ts} = 0$.

NOTE. When $b_1 = \ldots = b_n = 1$, the critical matrices $[a_{rs}]$ are orthogonal matrices.

SOLUTION. When only elements in the rth row vary, $\Delta$ has a critical value if

$\sum_{s=1}^{n} A_{rs}\,d(a_{rs}) = 0$ whenever $\sum_{s=1}^{n} a_{rs}\,d(a_{rs}) = 0$.

This needs

$\dfrac{A_{r1}}{a_{r1}} = \ldots = \dfrac{A_{rs}}{a_{rs}} = \ldots = \dfrac{A_{rn}}{a_{rn}} = \dfrac{\Delta}{b_r}$.

That is

$A_{rs} = (\Delta/b_r)a_{rs}$ (1)

and when the elements in all the rows vary, a critical value requires (1) to hold for all r and s. When this is so

$\Delta^{n-1} = |A_{rs}| = (\Delta^n/b_1b_2\ldots b_n)|a_{rs}|$,

i.e. $\Delta^2 = b_1b_2\ldots b_n$ or $\Delta = 0$.

To examine for maximum or minimum, first let only the elements in the rth row vary; then

$\Delta = \sum_s A_{rs}a_{rs}$, $d\Delta = \sum_s A_{rs}\,d(a_{rs})$, $d^2\Delta = \sum_s A_{rs}\,d^2(a_{rs})$,

while, from the relation $\sum_s a_{rs}^2 = b_r$,

$\sum_s a_{rs}\,d(a_{rs}) = 0$, $\sum_s a_{rs}\,d^2(a_{rs})+\sum_s \{d(a_{rs})\}^2 = 0$.

From these and (1),

$d^2\Delta = (\Delta/b_r)\sum_s a_{rs}\,d^2(a_{rs}) = -(\Delta/b_r)\sum_s \{d(a_{rs})\}^2$.

When the elements in all the rows vary

$d^2\Delta = -\Delta\sum_r \left[b_r^{-1}\sum_s \{d(a_{rs})\}^2\right]$.

Hence there is a maximum when $\Delta = +\sqrt{(b_1b_2\ldots b_n)}$ and a minimum when $\Delta = -\sqrt{(b_1b_2\ldots b_n)}$. Finally, at a critical value

$A_{ts} = (\Delta/b_t)a_{ts}$

and, when $r \ne t$, the relation $\sum_s a_{rs}A_{ts} = 0$ gives

$\sum_s a_{rs}a_{ts} = 0$.

3. Greatest and least values This is a suitable point to introduce two further considerations. The first is that a maximum is not necessarily a greatest value and that in Hadamard's determinant (§ 2.5) no one is really interested in the maximum, but only in the greatest value: all the applications turn on the latter. If we are to get the most we can out of our calculus work, we must be able to decide when a maximum is, in fact, a greatest value. We shall make a beginning on this problem but, as we shall realize, its full development would take us too far afield.


The second consideration is that the most readily convincing proof of a greatest value is often obtained by the methods of algebra, without recourse to calculus; we give examples in § 4.

3.1. The interior points of a region

The points of a region $R$ which lie inside and not on the boundary of the region are called INTERIOR POINTS. If $(x_0, y_0)$ is an interior point, there is a neighbourhood, say $(x-x_0)^2+(y-y_0)^2$


1 and a minimum value at that point if $a < 1$. Show also that, if $3a < 2\sqrt{2}$, $w$ has no other maximum or minimum value.

3. Show that $u = yz+zx+xy$ has no maximum or minimum values when considered as a function of three independent variables $x, y, z$ but has a maximum value when these variables are subject to the relation $ax+by+cz = 1$ and $a, b, c$ are positive constants satisfying the condition $2(bc+ca+ab) > a^2+b^2+c^2$.

4. Find the minimum and maximum values of $7x^2+8xy+y^2$ when $x^2+y^2 = 1$.

5. If $u = f(x, y)$, where $\phi(x, y) = 0$, has a stationary value, show that it occurs when $x$ and $y$ satisfy equations of the type

$$f_x+\lambda\phi_x = 0, \quad f_y+\lambda\phi_y = 0$$

and show that the stationary value is a maximum when

$$\begin{vmatrix} f_{xx}+\lambda\phi_{xx} & f_{xy}+\lambda\phi_{xy} & \phi_x \\ f_{xy}+\lambda\phi_{xy} & f_{yy}+\lambda\phi_{yy} & \phi_y \\ \phi_x & \phi_y & 0 \end{vmatrix}$$

is positive for these values of $x$, $y$, and $\lambda$.


6. (Easy.) When the variables $x_1, \ldots, x_n$ are subject to the single condition $\sum a_rx_r = 1$, in which the $a_r$ are constants, the function $y = \sum x_r^2$ has one and only one critical value $(\sum a_r^2)^{-1}$, which is a minimum.

7. When $a_1, \ldots, a_n$, $C$ are positive constants, $k \ne 1$, $\sum a_r^{k/(k-1)} = A^{k/(k-1)}$ and $\sum x_r^k = C^k$, prove that $y = \sum a_rx_r$ has a critical value $AC$ for certain real positive values of the $x_r$, and that this critical value is a maximum if $k > 1$, a minimum if $k < 1$.

8. Find the maxima and minima of $u = (x^2+xy+y^2)/(x+y)^4$ when $x^4+y^4 = a^4$.

9. When $x_1, \ldots, x_n$ are real and $\sum x_r^k = s$, where $k$ is an even positive integer and $s$ a positive constant, show that the stationary values of

$$f = \prod_{r=1}^{n} x_r^{\,r}$$

satisfy

$$f^k = \left\{\frac{2s}{n(n+1)}\right\}^{\frac12 n(n+1)} \prod_{r=1}^{n} r^r$$

and that one stationary value of $f$ is a maximum and the other a minimum.

10. When $m, n, p$ are positive constants and $x, y, z$ are restricted to be positive and subject to the relation $x+y+z = 1$, find any maximum or minimum of $x^my^nz^p$.

5. Two or more relations

We set out to find the critical points of $f(x,y,z,\ldots)$ when the $x, y, z, \ldots$ are not independent, but are subject to two or more restrictions of the type

$$\phi(x,y,z,\ldots) = 0, \quad \psi(x,y,z,\ldots) = 0.$$

If there are $n$ variables $x, y, z, \ldots$ and $k$ relations

$$\phi_1(x,y,z,\ldots) = 0, \quad \ldots, \quad \phi_k(x,y,z,\ldots) = 0$$

we shall assume that, in the $n$-dimensional region under consideration, these relations serve to determine the values of $k$ of the variables when the values of $n-k$ of them are given. The differentials $dx, dy, dz, \ldots$ are not then independent but are subject to the $k$ conditions

$$\frac{\partial\phi_1}{\partial x}dx+\frac{\partial\phi_1}{\partial y}dy+\cdots = 0, \quad \ldots, \quad \frac{\partial\phi_k}{\partial x}dx+\frac{\partial\phi_k}{\partial y}dy+\cdots = 0.$$


As in §§ 1 and 2, there is a critical point of $f(x,y,z,\ldots)$ when $df = 0$; further, the critical point is a maximum (minimum) if

$$d^2f = \sum f_x\,d^2x + \sum df_x\,dx$$

is negative (positive) whenever $dx, dy, dz, \ldots$ are not all zero (cf., in particular, § 2.3).

5.1. Algebraical preliminary

LEMMA. With four variables $\xi, \eta, \zeta, \theta$ let

$$a_3\xi+b_3\eta+c_3\zeta+d_3\theta = 0 \qquad (1)$$

whenever

$$a_2\xi+b_2\eta+c_2\zeta+d_2\theta = 0 \qquad (2)$$

and

$$a_1\xi+b_1\eta+c_1\zeta+d_1\theta = 0. \qquad (3)$$

Then there are numbers $\lambda$ and $\mu$ for which

$$a_3 = \lambda a_1+\mu a_2, \quad \ldots, \quad d_3 = \lambda d_1+\mu d_2. \qquad (4)$$

Proof. First let

$$a_1 = ka_2, \quad b_1 = kb_2, \quad c_1 = kc_2, \quad d_1 = kd_2, \qquad (5)$$

so that either (2) and (3) are one and the same condition or, when $k = 0$, (3) imposes no restriction on the variables. Then, as in § 2.1, there is a number $\mu$ for which

$$a_3 = \mu a_2, \quad \ldots, \quad d_3 = \mu d_2$$

and (4) is proved with $\lambda = 0$.

If (5) is not true, one of $a_1b_2-a_2b_1, \ldots$ is not zero. Say $b_1c_2-b_2c_1 = A_3 \ne 0$ and let $b_2c_3-b_3c_2 = A_1$ and $b_3c_1-b_1c_3 = A_2$. Then, $A_1, A_2, A_3$ being the cofactors of $a_1, a_2, a_3$ in the determinant $|a_1\,b_2\,c_3|$,

$$\sum b_rA_r = 0, \quad \sum c_rA_r = 0 \qquad (6)$$

and

$$\Bigl(\sum a_rA_r\Bigr)\xi+\Bigl(\sum d_rA_r\Bigr)\theta = 0 \qquad (7)$$

whenever (2) and (3) hold. But when $A_3 \ne 0$ and $\xi, \theta$ are given arbitrary values, (2) and (3) merely serve to determine $\eta$ and $\zeta$ in terms of $\xi$ and $\theta$. Hence (7) is true for arbitrary values of $\xi$ and $\theta$ and, accordingly, in addition to (6),

$$a_1A_1+a_2A_2+a_3A_3 = 0, \quad d_1A_1+d_2A_2+d_3A_3 = 0.$$


Since $A_3 \ne 0$ we may write

$$a_3 = -\frac{A_1}{A_3}a_1-\frac{A_2}{A_3}a_2, \quad b_3 = -\frac{A_1}{A_3}b_1-\frac{A_2}{A_3}b_2,$$

and so on; and the lemma is proved.

General form of the lemma

If, for $r = 1, 2, \ldots, m \le n$, $L_r$ is a linear form in the $n$ variables $x_1, x_2, \ldots, x_n$ and if $L_m = 0$ whenever $L_1 = L_2 = \cdots = L_{m-1} = 0$, then there are numbers $\lambda_1, \ldots, \lambda_{m-1}$ for which

$$L_m = \lambda_1L_1+\lambda_2L_2+\cdots+\lambda_{m-1}L_{m-1}.$$
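The second case of the proof of the four-variable lemma gives explicit multipliers, $\lambda = -A_1/A_3$ and $\mu = -A_2/A_3$. The following sketch computes them with exact rational arithmetic; the function name `lemma_multipliers` and the sample coefficients are hypothetical, chosen only for illustration, and the code assumes $A_3 \ne 0$.

```python
from fractions import Fraction

def lemma_multipliers(row1, row2, row3):
    """Return (lam, mu) such that row3 = lam*row1 + mu*row2.

    Each row is (a_i, b_i, c_i, d_i) with integer entries. The multipliers are
    read off from the cofactors A1, A2, A3 used in the proof; the case
    A3 = b1*c2 - b2*c1 = 0 is not handled here.
    """
    a1, b1, c1, d1 = row1
    a2, b2, c2, d2 = row2
    a3, b3, c3, d3 = row3
    A1 = b2 * c3 - b3 * c2
    A2 = b3 * c1 - b1 * c3
    A3 = b1 * c2 - b2 * c1
    if A3 == 0:
        raise ValueError("A3 is zero; this sketch covers only the case A3 != 0")
    return Fraction(-A1, A3), Fraction(-A2, A3)

# Sample forms (hypothetical): the third is 3*(first) - 2*(second), so it
# vanishes whenever the first two do.
row1 = (1, 2, 3, 4)
row2 = (2, 0, 1, 5)
row3 = (-1, 6, 7, 2)

lam, mu = lemma_multipliers(row1, row2, row3)
rebuilt = tuple(lam * p + mu * q for p, q in zip(row1, row2))
print(lam, mu, rebuilt)
```

The check that `rebuilt` reproduces all four coefficients of the third form, including $a_3$ and $d_3$, corresponds to relation (4).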

NOTE. It is tempting to say at the outset that, since $L_m = 0$ whenever $L_1, \ldots, L_{m-1}$ are zero, $L_m$ must be linearly dependent on $L_1, \ldots, L_{m-1}$. The point of the lemma is that it proves this and does not leave it as an unproved, intuitive inference from the facts given.

5.2. Worked example

The general theory of critical values in the present context is too heavy and unrewarding to warrant its full development. We content ourselves with an example; even then, unless we take an example in which the geometrical interpretation shows at once whether a maximum or minimum is in question (cf. Examples XVIII B, 1), it is difficult to make progress beyond the point at which we can say 'These are the critical values'.

PROBLEM. Find the critical values of

$$F = \sum x^2a^{-4}$$

when the variables $x, y, z$ are subject to the two relations

$$\sum lx = 0, \quad \sum x^2a^{-2} = 1.$$

SOLUTION. For a critical point

$$dF = 2\sum xa^{-4}\,dx = 0$$

whenever

$$\sum l\,dx = 0, \quad \sum xa^{-2}\,dx = 0.$$

At a critical point $(x, y, z)$ there are constants $\lambda, \mu$ for which

$$xa^{-4}+\lambda l+\mu xa^{-2} = 0, \quad\text{etc.},$$

whence, on multiplying by $x, y, z$ and adding, $F+\mu = 0$ and so

$$xa^{-4}(1-Fa^2) = -\lambda l, \quad\text{etc.} \qquad (1)$$


Hence the critical points $(x, y, z)$ satisfy

$$\frac{x}{la^4/(1-Fa^2)} = \frac{y}{mb^4/(1-Fb^2)} = \frac{z}{nc^4/(1-Fc^2)} = -\lambda$$

and, since $\sum lx = 0$, the critical values of $F$ satisfy the equation

$$\sum \{l^2a^4/(1-Fa^2)\} = 0. \qquad (2)$$

The roots, $F_1$ and $F_2$, of this quadratic equation in $F$ are the critical values required.

To determine whether the critical value is a maximum or minimum:

$$\tfrac12 d^2F = \sum a^{-4}\,dx^2+\sum xa^{-4}\,d^2x,$$

$$0 = \sum l\,d^2x,$$

$$0 = \sum a^{-2}\,dx^2+\sum xa^{-2}\,d^2x$$

and these give, on using (1),

$$\tfrac12 d^2F = \sum (a^{-4}+a^{-2}\mu)\,dx^2 = \sum a^{-4}(1-a^2F)\,dx^2,$$

but it is not a rewarding task to examine whether $F_1$ or $F_2$ leads to a positive or negative sign for $d^2F$.

NOTE. In all problems of this sort it is easy to obtain the equations that correspond to (1) above; it is not so easy to see what to do with them and the only method is to work out problems for oneself.
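As a rough numerical illustration of the worked example, the sketch below clears equation (2) of fractions to obtain a quadratic in $F$ and compares its roots with the least and greatest values of $F$ sampled along the curve cut on $\sum x^2a^{-2} = 1$ by the plane $\sum lx = 0$. The helper names and the sample data $l, m, n = 1, 2, 3$ and $a, b, c = 1, 2, 3$ are assumptions made for the illustration, and the sampling is a brute-force check, not part of the method of the text.

```python
import math

def critical_values(l, m, n, a, b, c):
    """Roots F1 <= F2 of sum l^2 a^4/(1 - F a^2) = 0 after clearing fractions."""
    P, Q, R = l * l * a**4, m * m * b**4, n * n * c**4
    a2, b2, c2 = a * a, b * b, c * c
    qa = P * b2 * c2 + Q * c2 * a2 + R * a2 * b2
    qb = -(P * (b2 + c2) + Q * (c2 + a2) + R * (a2 + b2))
    qc = P + Q + R
    root = math.sqrt(qb * qb - 4.0 * qa * qc)
    return (-qb - root) / (2.0 * qa), (-qb + root) / (2.0 * qa)

def cross(p, q):
    """Vector product of two 3-vectors."""
    return (p[1] * q[2] - p[2] * q[1],
            p[2] * q[0] - p[0] * q[2],
            p[0] * q[1] - p[1] * q[0])

def unit(p):
    """Vector of unit length in the direction of p."""
    size = math.sqrt(sum(x * x for x in p))
    return tuple(x / size for x in p)

def sampled_extremes(l, m, n, a, b, c, steps=20000):
    """Least and greatest sampled F on the plane section of the quadric."""
    normal = (l, m, n)
    # Coordinate axis least aligned with the normal, to build a basis of the plane.
    k = min(range(3), key=lambda i: abs(normal[i]))
    axis = tuple(1.0 if i == k else 0.0 for i in range(3))
    e1 = unit(cross(normal, axis))
    e2 = unit(cross(normal, e1))
    semi = (a, b, c)
    values = []
    for j in range(steps):
        t = math.pi * j / steps
        d = [math.cos(t) * p + math.sin(t) * q for p, q in zip(e1, e2)]
        # Scaling d to lie on sum x^2/a^2 = 1 divides F by this quantity.
        on_quadric = sum(x * x / s**2 for x, s in zip(d, semi))
        values.append(sum(x * x / s**4 for x, s in zip(d, semi)) / on_quadric)
    return min(values), max(values)

F1, F2 = critical_values(1, 2, 3, 1, 2, 3)
least, greatest = sampled_extremes(1, 2, 3, 1, 2, 3)
print(F1, F2, least, greatest)
```

Since the section is a closed curve, the two critical values are necessarily the least and greatest values of $F$ on it, which is what the comparison tests.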

EXAMPLES XVIII B

1. Find the minimum value of $x^2+y^2+z^2$ when

$$ax+by+cz = a'x+b'y+c'z = 1.$$

[Perpendicular from origin on line of intersection of two planes.]

2. Find the critical values of $x^2+y^2+z^2$ when

$$ax^2+by^2+cz^2 = 1, \quad lx+my+nz = 0.$$

[Lengths of axes of a central section of a quadric.]

3. Show that the critical values of $F = x_1^2+\cdots+x_n^2$ when

$$a_1x_1+\cdots+a_nx_n = 0, \quad b_1^2x_1^2+\cdots+b_n^2x_n^2 = 1$$

satisfy the equation

$$\sum_{r=1}^{n} \{a_r^2/(b_r^2F-1)\} = 0.$$

4. Prove that $y = x_1^2+\cdots+x_n^2$ has one and only one critical value, which is a minimum, when the variables $x_1, \ldots, x_n$ are subject to (i) the single condition $\sum a_rx_r = 1$, (ii) the two conditions $\sum a_rx_r = 0$, $\sum b_rx_r = 1$, and the $a_r$, $b_r$ are given constants.


Find these minimum values in the forms

(i) $(\sum a_r^2)^{-1}$, (ii) $(\sum a_r^2)/\sum (a_rb_s-a_sb_r)^2$.

[Geometrically, with $n = 3$, perpendicular on a plane in (i) and on a line in (ii).]

5. Show that the critical values of $u = \sum a_rx_r$ when the variables $x_1, \ldots, x_n$ are subject to the relations

$$\sum b_rx_r = 1, \quad \sum x_r^2 = 1$$

are the roots of the quadratic equation

$$\sum (ub_r-a_r)^2 = \sum (a_rb_s-a_sb_r)^2$$

and that the greater root is a maximum, the smaller a minimum. [Take $\sum b_r^2 > 1$.]

XIX

MISCELLANEOUS TOPICS

1. Functions of a complex variable

In this section we consider two complex variables $z = x+iy$ and $w = u+iv$ connected by a relation $w = f(z)$. This relation is

$$u+iv = f(x+iy) \qquad (1)$$

and, on taking the real and imaginary parts, serves to find $u$ and $v$ as functions of $x$ and $y$, say $u(x, y)$ and $v(x, y)$. Throughout we shall suppose that $u$ and $v$ have continuous second order partial derivatives w.r.t. $x$ and $y$: this condition will ensure [cf. Chap. XV] that $u_{xy} = u_{yx}$ and $v_{xy} = v_{yx}$.

PROPERTY I. When $u+iv = f(x+iy)$,

$$u_x = v_y, \quad u_y = -v_x. \qquad (2)$$

Proof. From (1),†

$$u_x+iv_x = f'(x+iy)\frac{\partial}{\partial x}(x+iy) = f'(x+iy), \qquad (3)$$

$$u_y+iv_y = f'(x+iy)\frac{\partial}{\partial y}(x+iy) = if'(x+iy).$$

Hence $i(u_x+iv_x) = u_y+iv_y$ and so $u_x = v_y$, $-v_x = u_y$.

† We assume that the function $f(t)$ is one for which, as $\Delta t \to 0$, $\{f(t+\Delta t)-f(t)\}/\Delta t \to$ a unique limit, denoted by $f'(t)$, and this whether $t$ and $\Delta t$ are real or complex variables.

PROPERTY II. When $a$ and $b$ are constants and $u(x, y) = a$, $v(x, y) = b$ are the equations of curves, the curve $u = a$ is orthogonal to the curve $v = b$ at any point of intersection of the two.

Proof. The slope of a curve $u = a$ is given by $dy/dx$ calculated from the equation [Chap. XIV, p. 166]

$$u_x+u_y\frac{dy}{dx} = 0$$

and so is equal to $-u_x/u_y$.

The slope of a curve $v = b$ is given by $dy/dx$ calculated from the equation

$$v_x+v_y\frac{dy}{dx} = 0$$

and so is equal to $-v_x/v_y$. But, by (2),

$$u_xv_x+u_yv_y = 0$$

at any point of intersection of a curve $u = a$ with a curve $v = b$ and hence the curves cut at right angles [$mm' = -1$].

PROPERTY III. Each of the functions $u$ and $v$ is a solution of the partial differential equation

$$\frac{\partial^2V}{\partial x^2}+\frac{\partial^2V}{\partial y^2} = 0.$$

Proof. By differentiating $u_x = v_y$ partially w.r.t. $x$,

$$u_{xx} = v_{xy} = v_{yx} = \frac{\partial}{\partial y}(-u_y) = -u_{yy}.$$

Similarly, $v_{xx} = -v_{yy}$.
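Properties I to III can be illustrated numerically. The sketch below is an approximate check only: the choice $f(z) = z^3+e^z$, the sample point and the step lengths are assumptions made for the illustration, and the derivatives are estimated by central differences.

```python
import cmath

def f(z):
    """Sample differentiable function of a complex variable (an assumed choice)."""
    return z**3 + cmath.exp(z)

def u(x, y):
    return f(complex(x, y)).real

def v(x, y):
    return f(complex(x, y)).imag

def d_dx(g, x, y, h=1e-5):
    """Central-difference estimate of the partial derivative of g w.r.t. x."""
    return (g(x + h, y) - g(x - h, y)) / (2 * h)

def d_dy(g, x, y, h=1e-5):
    """Central-difference estimate of the partial derivative of g w.r.t. y."""
    return (g(x, y + h) - g(x, y - h)) / (2 * h)

def laplacian(g, x, y, h=1e-3):
    """Five-point estimate of g_xx + g_yy."""
    return (g(x + h, y) + g(x - h, y) + g(x, y + h) + g(x, y - h)
            - 4 * g(x, y)) / (h * h)

x0, y0 = 0.3, -0.8   # arbitrary sample point
ux, uy = d_dx(u, x0, y0), d_dy(u, x0, y0)
vx, vy = d_dx(v, x0, y0), d_dy(v, x0, y0)

property_1 = (ux - vy, uy + vx)        # both should be near zero
property_2 = ux * vx + uy * vy         # orthogonality of u = a and v = b
property_3 = (laplacian(u, x0, y0), laplacian(v, x0, y0))
print(property_1, property_2, property_3)
```

All the printed quantities should be small, up to the error of the finite-difference estimates.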

PROPERTY IV. Let $V$ be a function of $u, v$ having continuous second order derivatives. Then

$$V_{xx}+V_{yy} = |f'(x+iy)|^2\,(V_{uu}+V_{vv}).$$

Proof. Since $V_x = V_uu_x+V_vv_x$,

$$V_{xx} = V_uu_{xx}+u_x(V_{uu}u_x+V_{uv}v_x)+V_vv_{xx}+v_x(V_{vu}u_x+V_{vv}v_x).$$

Adding this to the corresponding expression for $V_{yy}$ gives

$$V_{xx}+V_{yy} = V_u(u_{xx}+u_{yy})+V_v(v_{xx}+v_{yy})+V_{uu}(u_x^2+u_y^2)+V_{vv}(v_x^2+v_y^2)+2V_{uv}(u_xv_x+u_yv_y).$$

In this the coefficients of $V_u$, $V_v$ are zero by Property III; that of $V_{uv}$ is zero by Property I and, on again using Property I,

$$V_{xx}+V_{yy} = (V_{uu}+V_{vv})(u_x^2+v_x^2)$$

and Property IV follows on using (3).

Examples on this section: Examples XIX A 1-6.
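Property IV can be tried on a concrete case. In the sketch below the choices $f(z) = z^2$ and $V(u, v) = u^2v+v^3$ are assumptions made for the illustration; for them $V_{uu}+V_{vv} = 8v$ and $|f'(z)|^2 = 4(x^2+y^2)$. The left-hand side is estimated by a five-point finite-difference formula, so agreement is only approximate.

```python
def u(x, y):
    """Real part of (x + iy)^2."""
    return x * x - y * y

def v(x, y):
    """Imaginary part of (x + iy)^2."""
    return 2 * x * y

def V(p, q):
    """Sample function of u and v (an assumed choice)."""
    return p * p * q + q**3

def W(x, y):
    """V regarded as a function of x and y."""
    return V(u(x, y), v(x, y))

def laplacian(g, x, y, h=1e-3):
    """Five-point estimate of g_xx + g_yy."""
    return (g(x + h, y) + g(x - h, y) + g(x, y + h) + g(x, y - h)
            - 4 * g(x, y)) / (h * h)

x0, y0 = 0.6, 0.4                              # arbitrary sample point
lhs = laplacian(W, x0, y0)                     # V_xx + V_yy, estimated
mod_fprime_sq = 4 * (x0 * x0 + y0 * y0)        # |f'(z)|^2 for f(z) = z^2
rhs = mod_fprime_sq * 8 * v(x0, y0)            # |f'|^2 (V_uu + V_vv)
print(lhs, rhs)
```

Any other twice-differentiable $V$ could be substituted, provided `rhs` is recomputed from its own second derivatives.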


2. Elimination of arbitrary functions

We indicate how partial differential equations are formed by the elimination of arbitrary functions. We employ $p, q, r, s, t$ to denote the partial derivatives [cf. Chap. XVII, § 2, p. 201].

EXAMPLE 1. (Easy.) Let

$$z = (x+y)f(xy), \qquad (1)$$

where $f(t)$ is any function of $t$ with a finite derivative $f'(t)$. Then $z$ satisfies the partial differential equation

$$z(x-y) = (x+y)(px-qy).$$

SOLUTION. Differentiate (1) partially, first w.r.t. $x$ and then w.r.t. $y$; this gives

$$p = f(xy)+(x+y)yf'(xy), \quad q = f(xy)+(x+y)xf'(xy).$$

Eliminate $f'$ and substitute for $f$ from (1); this gives

$$px-qy = (x-y)z/(x+y).$$
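The equation of Example 1 holds whatever differentiable $f$ is taken, and this can be tried numerically. In the sketch below the particular $f(t) = \sin t+t^2$ and the sample point are assumptions made for the illustration, and $p$, $q$ are estimated by central differences.

```python
import math

def f(t):
    """An arbitrary differentiable function (an assumed choice)."""
    return math.sin(t) + t * t

def z(x, y):
    """z = (x + y) f(xy), as in (1)."""
    return (x + y) * f(x * y)

def partials(x, y, h=1e-6):
    """Central-difference estimates of p = z_x and q = z_y."""
    p = (z(x + h, y) - z(x - h, y)) / (2 * h)
    q = (z(x, y + h) - z(x, y - h)) / (2 * h)
    return p, q

x0, y0 = 1.3, 0.4   # arbitrary sample point
p, q = partials(x0, y0)
lhs = z(x0, y0) * (x0 - y0)
rhs = (x0 + y0) * (x0 * p - y0 * q)
print(lhs, rhs)
```

Replacing `f` by another differentiable function should leave the two sides equal to within the accuracy of the difference quotients.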

EXAMPLE 2. (Harder.) Let

$$z = (x+y)f\!\left(\frac{y}{x}\right)+(x^2-y^2)\phi(x^2+y^2), \qquad (2)$$

where $f(t)$, $\phi(t)$ are any functions of $t$ having second order derivatives. Then $z$ satisfies the partial differential equation

$$(x^2-y^2)\{(r-t)xy-s(x^2-y^2)\} = 4xy(px+qy-z).$$

SOLUTION. On differentiating (2),

$$p = f-\frac{y}{x^2}(x+y)f'+2x\phi+(x^2-y^2)2x\phi',$$

$$q = f+\frac{1}{x}(x+y)f'-2y\phi+(x^2-y^2)2y\phi'.$$

Eliminate $f'$ to give

$$px+qy = (x+y)f+2(x^2-y^2)\phi+2(x^2-y^2)(x^2+y^2)\phi',$$

i.e.

$$px+qy-z = (x^2-y^2)[\phi(x^2+y^2)+2(x^2+y^2)\phi'(x^2+y^2)]. \qquad (3)$$


Our object is to eliminate

0.

2. Deduction. The difference between $\cos x$ and either extreme of the inequalities is less than $x^{4n+2}/(4n+2)!$, which $\to 0$.

5. Last part. The inequality holds if $f(x) = x(1+x^2)^{-1/3}-\tan^{-1}x > 0$ and $f'(x)$ has sign of $1+\tfrac13x^2-(1+x^2)^{1/3}$; but $(1+\tfrac13x^2)^3 > 1+x^2$.

ANSWERS


6. $f'(x) = (x\sin x-2+2\cos x)/x^3$. Apply Theorem 23 (twice) to numerator.

7. HINTS. (i) $f(x) = x^2-\sin x\sinh x$ gives $f^{(iv)}(x) = 4\sin x\sinh x > 0$. (ii) $\tanh x < x < \tan x$. Also, with $H$ = harmonic mean, $1/H = \tfrac12(\cot x+\coth x)$; take $f(x) = H-x$ and in $f'(x)$ use $\operatorname{cosec}^2x = 1+\cot^2x$, $\operatorname{cosech}^2x = \coth^2x-1$, etc.

8. HINTS. With $f(x) = \tan^{-1}x/\tanh x$, $f'(x)$ has same sign as $F(x) = (1+x^2)^{-1}\sinh x\cosh x-\tan^{-1}x$, $F'(x)$ has same sign as $\phi(x) = (1+x^2)\sinh x-x\cosh x$, and $\phi'(x) > 0$ when $x > 0$. $f(x) \to \tfrac12\pi$ as $x \to \infty$.

EXAMPLES XI

1. (i) max. $x = 2$, min. x = ~ (ii) $f'(x) = (bc-ad)/(c\cos x+d\sin x)^2$.

2. $x = ac(3c^2-2a^2)^{-1/2}$; $a^2 > c^2$ needed to make function real at the maximum. Use § 3 to get $f''(\alpha)$.

3. min. at $x = 2$: $\log(-1)$ is not real.

4. Critical at $x = (k\pi-\phi)/b$, where $r\cos\phi = -a$, $r\sin\phi = b$, and $k$ is any integer or zero.

5. $f'(x) = m\sin(m+1)x\,\{\sin^{m-1}x-\cos^{m-1}x\}$; use § 3 for $f''(b)$.

6. (HINTS.) With $y$ fixed $F' > 0$ for $x < a+n(2a-y)$ and $F$ increases with $x$ throughout $0 \le x \le y$ provided $y \le a+n(2a-y)$, i.e. if (i) $y \le a(2n+1)/(n+1)$. If (ii) $y$ exceeds this, $F$ has a maximum at $x = a+n(2a-y)$. When (i) holds greatest $F = (4n+2)ay-(2n+1)y^2$, which has its greatest value $(2n+1)a^2$ at $y = a$. When (ii) holds greatest $F = \{a+n(2a-y)\}^2$; also $2a-y < a/(n+1)$ and $F \le (2n+1)^2a^2/(n+1)^2$.

7. Differentiate $(ax^2+2hx+b)u = Ax^2+2Hx+B$: when $u' = 0$, $(ax+h)u = Ax+H$ and, from this and the above, $(hx+b)u = Hx+B$: equate the two values of $x$.

EXAMPLES XII 1. 2, 0, 0, 1/ 30, 5, 27.

Last part.

sin 3 3x x ) - . - sec 3 3x -+ 1. (3x 3 smx

2 . 2, 1/ 18, l /a.

Last p art. y =

x2 (l+ax)"' gives logy= xlog (l+ax) =:=a·


3. (i) $\log y = x\log(1+x^{-1})$; $\{\log(x+1)-\log x\}/x^{-1}$ and L'Hôpital, or $x\log(1+x^{-1}) \approx x(x^{-1}-\tfrac12x^{-2}) \to 1$. (ii) $(1+x^{-1})^x = \exp\{1-\tfrac12x^{-1}+O(x^{-2})\}$.

4. (iv) Put $x = \tfrac12\pi-h$; $\log y = \sin 2h\,(\log\cos h-\log\sin h) \to 0$.

5. (i) k 2/log(4/9); L'Hôpital or expansion. (ii) $\tfrac12\{(\log a)^2-(\log b)^2\}$; $a^x = e^{x\log a}$ and expand.

6. Expand in powers of $x$: need $3+2A+B = 0$, $27+8A+B = 0$ to make coefficients of $x$ and $x^3$ zero.

7. 0; $f(x) = \log\cos x$, $g(x) = \sec x$ and L'Hôpital or $x = \tfrac12\pi-h$ gives $\sin h\log(\sin h)$, which $\to 0$.

8. $g'(x) = 0$ when $x = k\pi$ and the condition '$g'(x) \ne 0$ when $x > X$' is not satisfied.

9. $f'/g' = (2x\sin x^{-1}-\cos x^{-1})/\cos x$; $|\sin x^{-1}| \le 1$ and $2x\sin x^{-1} \to 0$, $\cos x \to 1$, but $\cos x^{-1}$ oscillates over the whole range $-1$ to $+1$. 'No.' Theorem 28 says 'if $f'/g' \to l$, then $f/g \to l$'; it says nothing of the converse and the example shows that the converse is not true.

10. 'No', as in Example 9 above.

EXAMPLES XIII A •

2

1 x -sec 2 - , smxy +Y Iog (x+y )cosxy, (l xy2 . , . , (1 -x2y,2)-f . y y x y -x y · x x smxy x2y xay 2. - 2y sec 2 -y , x+ y +xlog(x+y)cosxy, (l -x 2y 2)p (l -x 2y 2)t' 3. (i) 2xsin2(x 2+y), ex+av, (x 2+y 2)-t, cosh(x 2+y 3 )+2x 2sinh(x 2+y 3 ). (ii) sin2(x 2+y), 3e"'+3 11, -(x/y)(x 2+y 2)-t, 3xy 2sinh(x2+y 3 ). 4. u, = ercos9cos(8+rsin8), v, = ercos9sin(8+rsin8).

-+

I.

2 5. (i) z{l+y 2cot(xy 2)}, ( Y )2, -3cos 28sin8with8=x+4y, ~. x+y x +Y 1

J[x2+y2)' (ii) 2z{l + ycot(xy2)},

2

(x~:)2'

-x -12cos28sin8, x2+y2'

yz

x+.J(x~+y2)'

EXAMPLES XIII B

2. $z = f(r^2)$; $z_x = f'(r^2)\cdot 2r(\partial r/\partial x) = 2xf'(r^2)$.

4. $V_x = (x/r)e^x+re^x+e^r+(x^2/r)e^r$.

5. (i) $z_x = f(ax+by)+a(x+y)f'(ax+by)$.

6. $z+(x+y)z_x = \phi'(x)$.

7. $\partial S/\partial\theta = \tfrac12ab\cos\theta$.

8. $43{,}200\,\alpha(t_2-t_1)/(1+\alpha t_1)$.

9. First show that $(2y^2+1)u_x+(y^2+2)v_x = 0$ and then differentiate this w.r.t. $x$.

10. $u_{xx} = f''(r)\dfrac{x^2}{r^2}+f'(r)\left\{\dfrac{1}{r}-\dfrac{x^2}{r^3}\right\}$.


11. $V_x = 3rx$, $V_{xx} = 3(x^2/r)+3r$, $V_{yz} = 3yz/r$.

14. Put $f = \phi(u)$; $f_x = \phi'(u)u_x$.

16. $V_x = nxr^{n-2}$, etc., leads to $(n+1)u+z(du/dz) = 0$. If solution of differential equation is not known, put u = vz-