Calculus and Analysis in Euclidean Space (Instructor Solution Manual, Solutions) [1 ed.] 3319493124, 9783319493121


Calculus and Analysis in Euclidean Space: Selected Solutions

Preface

0.0.1. (a) Consider two surfaces in space, each surface having a tangent plane and therefore a normal line at each of its points, and consider pairs of points, one on each surface. Conjecture a geometric condition, phrased in terms of tangent planes and/or normal lines, about the closest pair of points.

There needn't be a closest pair of points at all. If there is a closest pair of points, it needn't be unique. If the two surfaces meet then a shared point certainly is a "closest pair," but no particular geometric condition need hold at the point. All of this detritus aside, the case of interest is when the two surfaces don't meet and there is at least one closest pair of points. In this case, call the surfaces A and B, and call the points a and b. Geometric intuition says that the line containing the two points a and b is normal to both surfaces A and B. So:

• The normal line to A at a is equal to the normal line to B at b.

Consequently:

• The tangent planes to A at a and to B at b are parallel.
• The normal line to A at a is orthogonal to the tangent plane to B at b, and conversely.

(Here and elsewhere in this answer, "normal" and "orthogonal" are synonyms, each being used in different places to try to make the ideas easier to understand.) Note that the first bullet is a stronger condition than the second and the third. Also note that in all three cases, the geometric condition is necessary but not sufficient. For example, it also applies to the farthest pair of points, and it can apply to pairs of points that are neither nearest nor farthest. By analogy from one-variable calculus, a horizontal tangent is necessary at each maximum

and each minimum of a smooth curve (not thinking about endpoints), but a horizontal tangent does not imply a maximum or a minimum.

In this context, it is worth mentioning that the ideas of "parallel" and "orthogonal" deserve some rethinking in space. In the plane, the condition that two lines never meet and the condition that two lines are everywhere equidistant from one another mean the same thing, and this is our notion of parallel. But in space, the first condition is weaker, the second stronger: two lines in space can be skew, meaning that they never meet but nor are they parallel in the sense of being everywhere equidistant. Similarly, for a pair of lines in space to be orthogonal, not only should their directions form a right angle, but the lines should meet. In the plane, every two nonparallel lines meet, and so this is a nonissue. Two planes in space are equal, or they are parallel, or they share one line. When they share one line, we can visualize them in a cross-sectional plane at right angles to the shared line, and what we see of the planes is two lines crossing, a sort of "X" shape. The planes are orthogonal when the two lines of the "X" are orthogonal in the planar sense. Note how this process reduces a three-dimensional issue to two dimensions.

(b) Consider a surface in space and a curve in space, the curve having a tangent line and therefore a normal plane at each of its points, and consider pairs of points, one on the surface and one on the curve. Make a conjecture about the nearest pair of points.

After again making all of the disclaimers, the condition is that the line containing the two points a and b is normal to the surface A and to the curve B. One can reach four conclusions from this, the first two stronger than the last two, but in any case all necessary rather than sufficient:

• The normal line to A at a lies in the normal plane to B at b.
• The normal line to A at a is orthogonal to the tangent line to B at b.
• The tangent plane to A at a is orthogonal to the normal plane to B at b.
• The tangent plane to A at a contains a line parallel to the tangent line to B at b.

(c) Make a conjecture about the nearest pair of points on two curves.

The key beginning observation is again that the line between the two points a and b is normal to both curves A and B. In the terms of the problem, this can be expressed in three ways:

• The normal planes to A at a and to B at b share a line.
• The tangent lines to A at a and to B at b are skew or possibly parallel.
• The normal plane to A at a contains a line orthogonal to the tangent line to B at b, and conversely.

You might think about whether any of these conditions is stronger than any other, but in any case none of them is sufficient. The answers to (a) and (c) have three phrasings each, while the answer to (b) has four. Why is this?

0.0.2. (a) Assume that the factorial of a half-integer makes sense, and grant the general formula for the volume of a ball in n dimensions. Explain why it follows that (1/2)! = √π/2.

The general formula is
$$\operatorname{vol}(B_n(r)) = \frac{\pi^{n/2}}{(n/2)!}\, r^n, \qquad n = 1, 2, 3, 4, \dots$$
So in particular, for n = 1 we have
$$2r = \frac{\sqrt{\pi}}{(1/2)!}\, r,$$
and the result follows by basic algebra.

Further assume that the half-integral factorial function satisfies the relation
$$x! = x \cdot (x-1)! \qquad \text{for } x = 3/2, 5/2, 7/2, \dots$$
Subject to these assumptions, verify that the volume of the ball of radius r in three dimensions is (4/3)πr³ as claimed.

The formula for n = 3 is
$$\operatorname{vol}(B_3(r)) = \frac{\pi^{3/2}}{(3/2)!}\, r^3 = \frac{\pi^{3/2}}{(3/2)\cdot(1/2)!}\, r^3 = \frac{\pi^{3/2}}{(3/2)\cdot\sqrt{\pi}/2}\, r^3 = \frac{\pi}{3/4}\, r^3 = \frac{4}{3}\pi r^3.$$

What is the volume of the ball of radius r in five dimensions? Similarly,
$$\operatorname{vol}(B_5(r)) = \frac{\pi^{5/2}}{(5/2)!}\, r^5 = \frac{\pi^{5/2}}{(5/2)\cdot(3/2)\cdot\sqrt{\pi}/2}\, r^5 = \frac{\pi^2}{15/8}\, r^5 = \frac{8}{15}\pi^2 r^5.$$

(b) The ball of radius r in n dimensions sits inside a circumscribing box of sides 2r. Draw pictures of this configuration for n = 1, 2, 3. Determine what portion of the box is filled by the ball in the limit as the dimension n gets large. That is, find
$$\lim_{n\to\infty} \frac{\operatorname{vol}(B_n(r))}{(2r)^n}.$$

The limit is
$$\lim_{n\to\infty} \frac{\operatorname{vol}(B_n(r))}{(2r)^n} = \lim_{n\to\infty} \frac{\pi^{n/2} r^n/(n/2)!}{(2r)^n} = \lim_{n\to\infty} \frac{(\sqrt{\pi}/2)^n}{(n/2)!}.$$
Thus, perhaps surprisingly, the limit is zero. Indeed, the convergence is rapid because the numerator is roughly (0.9)ⁿ, which converges geometrically to 0, while the denominator grows factorially. As the dimension grows large, the ball tends to fill an insignificant portion of its circumscribing box. Already by dimension n = 10 the portion of the box filled by the ball is less than one quarter of one percent.
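The shrinking box-fraction above is easy to sanity-check numerically. The following Python sketch (not part of the original solutions) models the half-integral factorial (n/2)! with the Gamma function, Γ(n/2 + 1):

```python
from math import pi, gamma

def ball_fraction(n: int) -> float:
    """Portion of the circumscribing box of side 2r filled by the n-ball.

    vol(B_n(r)) / (2r)^n = pi^(n/2) / ((n/2)! * 2^n); the radius r cancels.
    The half-integral factorial (n/2)! is computed as Gamma(n/2 + 1).
    """
    return pi ** (n / 2) / (gamma(n / 2 + 1) * 2 ** n)

# In dimension 1 the "ball" [-r, r] fills the whole box; in dimension 3 it
# fills pi/6 of the box; by dimension 10 the fraction is below 0.25 percent.
for n in (1, 2, 3, 10):
    print(n, ball_fraction(n))
```

Running this confirms the claim at the end of the solution: the n = 10 fraction is about 0.0025.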

Chapter 1

1.1.2. Prove that in any ordered field, 0 < 1.

First we argue that nonzero squares are positive in any ordered field. Let b be a nonzero square. Then b = a² where a ≠ 0 (else b = 0). If a < 0 then −a > 0, and also b = (−a)², so in any case b is the square of a positive element and hence b is positive. Now, we know that 0 ≠ 1 because 0 and 1 are assumed to be distinct. So either 1 is positive or −1 is positive. In the latter case, 1 = (−1)(−1) is positive as well, contradicting trichotomy. So 1 is positive, i.e., 1 > 0.

Prove that the complex number field C can not be made an ordered field.

In C we have that −1 = i² is a nonzero square, so that if C is an ordered field then −1 > 0. This is incompatible with part (a).

1.1.4. (a) Prove by induction that
$$\sum_{i=1}^{n} i^2 = \frac{n(n+1)(2n+1)}{6} \qquad \text{for all } n \in \mathbb{Z}^+.$$
For each n ∈ Z⁺ let P(n) be the proposition that the displayed equality holds for that value of n. It suffices to show that P(1) holds and that for each n ∈ Z⁺, if P(n) holds then consequently so does P(n+1). To confirm P(1), simply compute that
$$\sum_{i=1}^{1} i^2 = 1^2 = 1 \qquad \text{and} \qquad \frac{1(1+1)(2\cdot 1+1)}{6} = 1.$$
Thus the two sides of the desired equality are indeed equal. Now let n ∈ Z⁺ be arbitrary, and assume that P(n) holds. Compute that therefore
$$\begin{aligned}
\sum_{i=1}^{n+1} i^2 &= \sum_{i=1}^{n} i^2 + (n+1)^2 && \text{by definition of summation}\\
&= \frac{n(n+1)(2n+1)}{6} + (n+1)^2 && \text{since } P(n) \text{ holds}\\
&= \frac{n(n+1)(2n+1) + 6(n+1)^2}{6} && \text{by adding fractions}\\
&= \frac{(n+1)(2n^2+7n+6)}{6} && \text{by algebra}\\
&= \frac{(n+1)(n+2)(2n+3)}{6} && \text{by algebra}\\
&= \frac{(n+1)((n+1)+1)(2(n+1)+1)}{6} && \text{by algebra.}
\end{aligned}$$

This establishes P(n+1), completing the induction.

(b) For any real number r ≥ −1, prove Bernoulli's Inequality by induction: (1+r)ⁿ ≥ 1 + nr for all n ∈ N.

Here the wrinkle is that the problem involves the real variable r. The idea is to carry out induction on n independently of the value of r, using only the condition that r ≥ −1. For n = 0, compute that
$$(1+r)^0 = 1 \qquad \text{and} \qquad 1 + r \cdot 0 = 1,$$
so indeed (1+r)ⁿ ≥ 1 + nr when n = 0. Here the fact that r ≥ −1 is irrelevant. Next, assume that the inequality holds for some n ∈ N, and compute that therefore
$$\begin{aligned}
(1+r)^{n+1} &= (1+r)^n (1+r) && \text{by definition of the }(n+1)\text{st power}\\
&\ge (1+nr)(1+r) && \text{since } (1+r)^n \ge 1+nr \text{ and since } 1+r \ge 0\\
&= 1 + (n+1)r + nr^2 && \text{by algebra}\\
&\ge 1 + (n+1)r && \text{since } nr^2 \ge 0 \text{ regardless of the value of } r.
\end{aligned}$$

(Note where the condition r ≥ −1 is used.) This establishes the inequality for n+1, completing the induction.

(c) For what positive integers n is 2ⁿ > n³?

Although the problem didn't specifically request a proof, it should go without saying that one is called for, especially since the answer here is a bit tricky. The idea is that 2ⁿ grows exponentially in n, while n³ grows as a polynomial in n. Thus we expect the left side to be larger eventually. However, although the inequality 2ⁿ > n³ holds for n = 1, it fails for n = 2 and then continues to fail for several more values of n. Nonetheless, it becomes true again at n = 10 since 2¹⁰ = 1024 > 1000 = 10³, and we expect it to remain true from then on. So we do an induction argument starting at n = 10: Suppose that n ≥ 10 and that 2ⁿ > n³. Then
$$\begin{aligned}
2^{n+1} &= 2 \cdot 2^n && \text{by definition of the }(n+1)\text{st power}\\
&> 2n^3 && \text{since } 2^n > n^3\\
&= n^3 + n \cdot n^2 && \text{by algebra}\\
&> n^3 + 3n^2 + 3n^2 + n^2 && \text{since } n > 7\\
&> n^3 + 3n^2 + 3n + 1 && \text{since } n > 1\text{, making } n^2 > n \text{ and } n^2 > 1\\
&= (n+1)^3 && \text{by algebra.}
\end{aligned}$$

This completes the induction. Note that the induction step works only for n ≥ 7, so that an induction starting at n = 1 fails, and the base case fails at n = 7, so that an induction starting at n = 7 fails. The smallest starting value of n for which the induction can succeed is n = 10.

1.2.1. Use the Intermediate Value Theorem to show that 2 has a positive square root.

Define a function
$$f : [1, 2] \longrightarrow \mathbb{R}, \qquad f(x) = x^2.$$
This function is continuous. Note that f(1) = 1 < 2 and f(2) = 4 > 2. By the Intermediate Value Theorem f(s) = 2 for some s between 1 and 2. This value s is a positive square root of 2. One can not use f(x) = √x to solve this problem, as this function's very existence presupposes that square roots exist in the first place.

1.2.2. Let f : [0, 1] −→ [0, 1] be continuous. Use the Intermediate Value Theorem to show that f(x) = x for some x ∈ [0, 1].

This problem asks us to prove that the graph of f meets a diagonal line. On the other hand, the Intermediate Value Theorem says that under certain conditions the graph of a function g meets a horizontal line. So the idea is to reduce the diagonal problem to the horizontal one. Let g : [0, 1] −→ R be g(x) = f(x) − x. The original problem is equivalent to showing that g(x) = 0 for some x. The function g is continuous since the function f and the identity function are continuous and the difference of any two continuous functions is again continuous. Compute that since the values of f(x) lie in [0, 1],
$$g(0) = f(0) - 0 \ge 0 \qquad \text{and} \qquad g(1) = f(1) - 1 \le 0.$$
If g(0) = 0 or if g(1) = 0 then we are done. Otherwise, g(0) > 0 and g(1) < 0, so g changes sign, and so by the Intermediate Value Theorem g(x) = 0 for some x ∈ [0, 1], and again we are done.
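The sign-change argument of 1.2.2 is constructive enough to compute with: bisection on g(x) = f(x) − x locates a fixed point. A Python sketch (the choice of cos as a sample continuous self-map of [0, 1] is mine, not the text's):

```python
import math

def fixed_point(f, lo=0.0, hi=1.0, tol=1e-12):
    """Find x in [lo, hi] with f(x) = x, for continuous f: [0,1] -> [0,1].

    g(x) = f(x) - x satisfies g(lo) >= 0 and g(hi) <= 0, so the Intermediate
    Value Theorem guarantees a zero of g; bisection preserves that sign pattern.
    """
    def g(t):
        return f(t) - t
    while hi - lo > tol:
        mid = (lo + hi) / 2
        if g(mid) >= 0:
            lo = mid
        else:
            hi = mid
    return (lo + hi) / 2

x = fixed_point(math.cos)   # cos maps [0, 1] into [cos 1, 1], a subset of [0, 1]
print(x)                    # the solution of cos x = x, approximately 0.739085
```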

1.2.3. Let a and b be real numbers with a < b. Suppose that f : [a, b] −→ R is continuous and that f is differentiable on the open subinterval (a, b). Use the Mean Value Theorem to show that if f′ > 0 on (a, b) then f is strictly increasing on [a, b].

Let x and x′ lie in [a, b], with x′ > x. Then f is continuous on [x, x′] and differentiable on (x, x′). By the Mean Value Theorem,
$$\frac{f(x') - f(x)}{x' - x} = f'(c) \qquad \text{for some } c \in (x, x').$$
Since f′ > 0 on (a, b), this means that the quotient in the display is positive. Since the denominator x′ − x is also positive, it follows that so is the numerator, f(x′) − f(x) > 0. That is,
$$\text{for all } x, x' \in [a, b], \qquad x' > x \implies f(x') > f(x).$$
This is the desired result.

1.3.1. (a) Making a table shows routinely that the (2n+1)st degree Taylor polynomial for sin at 0 is
$$T_{2n+1}(x) = x - \frac{x^3}{3!} + \frac{x^5}{5!} - \cdots + (-1)^n \frac{x^{2n+1}}{(2n+1)!} = \sum_{k=0}^{n} (-1)^k \frac{x^{2k+1}}{(2k+1)!}.$$
The remainder is
$$R_n(x) = \pm \frac{\sin(c)\, x^{2n+2}}{(2n+2)!} \qquad \text{for some } c \text{ between } 0 \text{ and } x$$
(whether it's "+" or "−" won't matter in a moment, so we don't worry about exactly what the sign is). Since |sin(c)| ≤ 1 for all c,
$$|R_n(x)| \le \frac{|x|^{2n+2}}{(2n+2)!}.$$
For any fixed x, as n → ∞, the factorial (2n+2)! dominates the exponential |x|^{2n+2}, giving lim_{n→∞} R_n(x) = 0. Therefore, since f(x) = T_n(x) + R_n(x),
$$\lim_{n\to\infty} T_n(x) = f(x).$$
Similarly for the cosine, the 2nth degree Taylor polynomial at 0 is
$$T_{2n}(x) = 1 - \frac{x^2}{2!} + \frac{x^4}{4!} - \cdots + (-1)^n \frac{x^{2n}}{(2n)!} = \sum_{k=0}^{n} (-1)^k \frac{x^{2k}}{(2k)!}.$$
The argument with the remainder is virtually the same as it was for the sine, though with 2n+1 in place of 2n+2.

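The remainder estimate of 1.3.1(a) can be checked numerically. This Python sketch compares the degree-(2n+1) Taylor polynomial of sin against the library sine and verifies the bound |x|^(2n+2)/(2n+2)!:

```python
from math import sin, factorial

def taylor_sin(x: float, n: int) -> float:
    """Degree-(2n+1) Taylor polynomial of sin at 0."""
    return sum((-1) ** k * x ** (2 * k + 1) / factorial(2 * k + 1)
               for k in range(n + 1))

def remainder_bound(x: float, n: int) -> float:
    """The bound |R_n(x)| <= |x|^(2n+2) / (2n+2)! from the solution."""
    return abs(x) ** (2 * n + 2) / factorial(2 * n + 2)

x = 2.0
for n in range(8):
    # the actual error never exceeds the factorial bound
    assert abs(sin(x) - taylor_sin(x, n)) <= remainder_bound(x, n)
print(taylor_sin(x, 7), sin(x))   # the two values agree closely by n = 7
```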

(b) The approximation sin x ≈ x is the approximation of sin x by its quadratic Taylor polynomial T₂(x) at 0 (the x² term in T₂(x) is zero). Since the third derivative of sin is −cos, the remainder is
$$R_2(x) = \frac{-\cos c}{3!}\, x^3 \qquad \text{for some } c \text{ between } 0 \text{ and } x.$$
Although we don't know where c is, the estimate |cos c| ≤ 1 holds for all values of c. Thus the remainder satisfies
$$|R_2(x)| \le \frac{|x|^3}{6}.$$

If we set x to ±8° = ±(8π/180) rad ≈ ±0.14 then we get |x|³/6 ≈ 0.00045, which is to say that the approximation sin x ≈ x is "correct to three digits after the decimal." (For ±9° instead, the error-estimate is ±0.0006···, too big to guarantee "three digits of accuracy.") Long ago, before computing power was ubiquitous, one digit of accuracy was viewed as rough, two digits of accuracy as fair, and three digits of accuracy (accurate to the thousandth, i.e., to one-tenth of one percent) as quite precise. This explains the mysterious choice of ±8° as the outer range for the approximation sin x ≈ x to be "accurate."

1.3.2. (a) What is the nth degree Taylor polynomial Tn(x) for the function f(x) = arctan x at 0?

The derivative of arctan x is 1/(1+x²). There are several ways to proceed from here:

1. We can start computing derivatives of 1/(1+x²) and quickly get discouraged by the messy algebra.

2. We can carry out a partial fractions analysis of 1/(1+x²), using the complex numbers,
$$\frac{1}{1+x^2} = \frac{1}{2}\left(\frac{1}{1-ix} + \frac{1}{1+ix}\right),$$
and then compute the derivatives of the right side a little more systematically:
$$\begin{array}{c|c|c}
k & f^{(k)}(x) & f^{(k)}(0)/k! \\
\hline
0 & \arctan x & 0 \\[4pt]
1 & \dfrac{1}{2}\left(\dfrac{1}{1-ix} + \dfrac{1}{1+ix}\right) & 1 \\[8pt]
2 & \dfrac{1}{2}\left(\dfrac{i}{(1-ix)^2} - \dfrac{i}{(1+ix)^2}\right) & 0 \\[8pt]
3 & \dfrac{1}{2}\left(\dfrac{2i^2}{(1-ix)^3} + \dfrac{2i^2}{(1+ix)^3}\right) & -\dfrac{1}{3} \\[8pt]
4 & \dfrac{1}{2}\left(\dfrac{3!\,i^3}{(1-ix)^4} - \dfrac{3!\,i^3}{(1+ix)^4}\right) & 0 \\[8pt]
5 & \dfrac{1}{2}\left(\dfrac{4!\,i^4}{(1-ix)^5} + \dfrac{4!\,i^4}{(1+ix)^5}\right) & \dfrac{1}{5} \\[4pt]
\vdots & \vdots & \vdots
\end{array}$$

Thus the Taylor polynomials are the truncations of the series
$$T(x) = x - \frac{x^3}{3} + \frac{x^5}{5} - \frac{x^7}{7} + \cdots.$$

3. We can recognize the derivative as a geometric series 1/(1−r) where r = −x². Then by the geometric series formula,
$$\arctan' x = 1 - x^2 + x^4 - x^6 + \cdots.$$
This expansion is valid for |r| < 1, i.e., for |−x²| < 1, i.e., for |x| < 1. Integrating term-by-term gives that for some constant C,
$$\arctan x = C + x - \frac{x^3}{3} + \frac{x^5}{5} - \frac{x^7}{7} + \cdots.$$
We hope that this is still valid for |x| < 1, but we don't know that it is unless we invoke the Differentiation Theorem from the end of Ray Mayer's Math 112 notes, and somehow that is too powerful a tool for this problem. Proceeding blithely on in any case, substitute x = 0 to see that C = 0, giving the same answer as in method 2. Although this method is appealing by virtue of involving by far the least algebra, it either has a logical gap or relies on a substantial theorem.

(b) f(x) = (1+x)^α where α ∈ R.

The first few derivatives of f are f(x) = (1+x)^α, f′(x) = α(1+x)^{α−1}, f″(x) = α(α−1)(1+x)^{α−2}, and so on. That is, for each k ∈ N,
$$f^{(k)}(x) = \alpha(\alpha-1)\cdots(\alpha-k+1)(1+x)^{\alpha-k}.$$
For convenience introduce the binomial coefficient symbol,
$$\binom{\alpha}{k} = \frac{\alpha(\alpha-1)\cdots(\alpha-k+1)}{k!} \qquad \text{for } \alpha \in \mathbb{R} \text{ and } k \in \mathbb{N}.$$
(By definition, this means 1 when k = 0.) Then for each k ∈ N,
$$\frac{f^{(k)}(0)}{k!} = \binom{\alpha}{k},$$
and thus the nth degree Taylor polynomial is
$$T_n(x) = 1 + \alpha x + \frac{\alpha(\alpha-1)}{2} x^2 + \frac{\alpha(\alpha-1)(\alpha-2)}{3!} x^3 + \cdots + \binom{\alpha}{n} x^n.$$
The interesting feature of this problem is that if α ∈ N (i.e., α is 0 or 1 or 2 or ...) then the binomial coefficient (α choose k) is 0 for all k > α. This is because

the product in its numerator contains a 0. In this case we recover the binomial theorem from high school algebra: For every natural number α ∈ N there is a finite expansion of (1+x)^α,
$$(1+x)^\alpha = 1 + \alpha x + \frac{\alpha(\alpha-1)}{2} x^2 + \frac{\alpha(\alpha-1)(\alpha-2)}{3!} x^3 + \cdots + x^\alpha.$$
For example, (1+x)⁵ = 1 + 5x + 10x² + 10x³ + 5x⁴ + x⁵. On the other hand, if α ∉ N then the Taylor series is infinite. For example, the Taylor series for (1+x)^{1/2} is
$$1 + \frac{1}{2}x - \frac{1}{8}x^2 + \frac{1}{16}x^3 - \frac{5}{128}x^4 + \cdots.$$

1.3.3. In figure 1.1, identify the graphs of T1 through T5 and the graph of ln near x = 0 and near x = 2.

We have
$$\begin{aligned}
T_1(x) &= x-1,\\
T_2(x) &= (x-1) - \frac{(x-1)^2}{2},\\
T_3(x) &= (x-1) - \frac{(x-1)^2}{2} + \frac{(x-1)^3}{3},\\
T_4(x) &= (x-1) - \frac{(x-1)^2}{2} + \frac{(x-1)^3}{3} - \frac{(x-1)^4}{4},\\
T_5(x) &= (x-1) - \frac{(x-1)^2}{2} + \frac{(x-1)^3}{3} - \frac{(x-1)^4}{4} + \frac{(x-1)^5}{5}.
\end{aligned}$$
At x = 0 the function ln(x) is undefined (and ln(x) tends to −∞ as x → 0), but the Taylor polynomials are perfectly well behaved. Specifically,
$$\begin{aligned}
T_1(0) &= -1,\\
T_2(0) &= -1 - \tfrac{1}{2},\\
T_3(0) &= -1 - \tfrac{1}{2} - \tfrac{1}{3},\\
T_4(0) &= -1 - \tfrac{1}{2} - \tfrac{1}{3} - \tfrac{1}{4},\\
T_5(0) &= -1 - \tfrac{1}{2} - \tfrac{1}{3} - \tfrac{1}{4} - \tfrac{1}{5}.
\end{aligned}$$
Thus toward the left of the figure, the graphs from top to bottom show T1(x), T2(x), T3(x), T4(x), T5(x), and then ln(x).

At x = 2 the function is ln(2) and the Taylor polynomials take the values
$$\begin{aligned}
T_1(2) &= 1,\\
T_2(2) &= 1 - \tfrac{1}{2},\\
T_3(2) &= 1 - \tfrac{1}{2} + \tfrac{1}{3},\\
T_4(2) &= 1 - \tfrac{1}{2} + \tfrac{1}{3} - \tfrac{1}{4},\\
T_5(2) &= 1 - \tfrac{1}{2} + \tfrac{1}{3} - \tfrac{1}{4} + \tfrac{1}{5}.
\end{aligned}$$

That is, T1(2) is the greatest, T2(2) drops to the lowest, T3(2) climbs back to the second-greatest, T4(2) drops back to the second-lowest, and then T5(2) climbs again, with the Taylor polynomial values oscillating around the actual value ln(2). Thus toward the right of the figure, the graphs from top to bottom show T1(x), T3(x), T5(x), ln(x), T4(x), and finally T2(x).

1.3.5. Use a second degree Taylor polynomial to approximate √4.2 and to estimate the accuracy of the approximation.

Let f(x) = √x on the interval [0, ∞), and let a = 4. Then
$$f(x) = x^{1/2}, \qquad f'(x) = \frac{1}{2} x^{-1/2}, \qquad f''(x) = -\frac{1}{4} x^{-3/2}, \qquad f'''(x) = \frac{3}{8} x^{-5/2}.$$
It follows that
$$f(a) = 2, \qquad f'(a) = \frac{1}{4}, \qquad \frac{f''(a)}{2!} = -\frac{1}{64}, \qquad \text{and} \qquad \frac{f'''(c)}{3!} = \frac{1}{16 c^{5/2}}.$$
Thus the second degree Taylor polynomial is
$$T_2(x) = 2 + \frac{1}{4}(x-4) - \frac{1}{64}(x-4)^2,$$
and the remainder is
$$R_2(x) = \frac{(x-4)^3}{16 c^{5/2}} \qquad \text{for some } c \text{ between } 4 \text{ and } x.$$
Substitute x = 4.2:
$$T_2(4.2) = 2 + \frac{1}{4}(0.2) - \frac{1}{64}(0.2)^2 = 2 + \frac{1}{20} - \frac{1}{1600} = 2.049375,$$
and
$$|R_2(4.2)| \le \frac{(0.2)^3}{16 \cdot 32} = \frac{1}{512 \cdot 125} < \frac{1}{50000} = 0.00002.$$
These calculations show that
$$2.049355 < \sqrt{4.2} < 2.049395.$$
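The approximation in 1.3.5 is easy to verify against a library square root; this Python check uses the T₂ and error bound derived above:

```python
from math import sqrt

def T2(x: float) -> float:
    """Second-degree Taylor polynomial of sqrt at a = 4, from the solution."""
    return 2 + (x - 4) / 4 - (x - 4) ** 2 / 64

approx = T2(4.2)                        # 2.049375
error_bound = 0.2 ** 3 / (16 * 32)      # (0.2)^3 / (16 c^(5/2)) with c >= 4
print(approx, sqrt(4.2))
assert abs(sqrt(4.2) - approx) <= error_bound
assert 2.049355 < sqrt(4.2) < 2.049395  # the bracketing claimed in the text
```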

Chapter 2

2.1.3. Verify that Rⁿ satisfies vector space axioms (A2), (A3), (D1).

These are very similar to the argument in the text for (M1), each reducing to its field axiom counterpart. For (A2), let x = (x₁, ..., xₙ) and likewise for y and z. Then
$$\begin{aligned}
(x+y)+z &= ((x_1, \dots, x_n) + (y_1, \dots, y_n)) + (z_1, \dots, z_n) && \text{by definition of } x, y, \text{ and } z\\
&= (x_1+y_1, \dots, x_n+y_n) + (z_1, \dots, z_n) && \text{by definition of vector addition}\\
&= ((x_1+y_1)+z_1, \dots, (x_n+y_n)+z_n) && \text{by definition of vector addition}\\
&= (x_1+(y_1+z_1), \dots, x_n+(y_n+z_n)) && \text{by } n \text{ applications of (a1) in } \mathbb{R}\\
&= (x_1, \dots, x_n) + (y_1+z_1, \dots, y_n+z_n) && \text{by definition of vector addition}\\
&= x + (y+z) && \text{by definition of vector addition.}
\end{aligned}$$

The other two proofs are virtually identical.

2.1.5. (Throughout this exercise, the solution can proceed in coordinates by reducing problems in Rⁿ to problems in R that have been solved, or it can proceed intrinsically by repeating in Rⁿ the symbol-pattern of the solution from R with no reference to coordinates.)

Show that 0 is the unique additive identity in Rⁿ.

A solution that reduces to R is as follows: Let z = (z₁, ..., zₙ) be an additive identity in Rⁿ. That is, x + z = x for all x ∈ Rⁿ. In terms of component scalars, this means that
$$(x_1+z_1, \dots, x_n+z_n) = (x_1, \dots, x_n) \qquad \text{for all } x_1, \dots, x_n \in \mathbb{R}.$$
Thus each of z₁, ..., zₙ is an additive identity in R. But 0 is the unique additive identity in R, so each of z₁, ..., zₙ is 0, and therefore z = 0 as desired.

A solution that repeats the symbol-pattern of the solution in R is as follows: Let z be an additive identity in Rⁿ. Then
$$\begin{aligned}
z &= z + 0 && \text{since } 0 \text{ is an additive identity}\\
&= 0 + z && \text{since vector addition is commutative}\\
&= 0 && \text{since } z \text{ is an additive identity.}
\end{aligned}$$

Show that each vector x ∈ Rⁿ has a unique additive inverse, which can therefore be denoted −x.

This is very similar. A solution that reduces to R is as follows: If y = (y₁, ..., yₙ) is an additive inverse of x then the vector equation x + y = 0 is
$$(x_1+y_1, \dots, x_n+y_n) = (0, \dots, 0).$$
This forces each yⱼ to be an additive inverse of xⱼ in R, determining yⱼ uniquely since additive inverses are unique in R. Thus y is determined uniquely.

A solution that repeats the symbol-pattern of the solution in R is as follows: Let y and z be additive inverses of x. Then
$$\begin{aligned}
y &= y + 0 && \text{since } 0 \text{ is the additive identity}\\
&= y + (x+z) && \text{since } z \text{ is an additive inverse of } x\\
&= (y+x) + z && \text{since vector addition is associative}\\
&= (x+y) + z && \text{since vector addition is commutative}\\
&= 0 + z && \text{since } y \text{ is an additive inverse of } x\\
&= z + 0 && \text{since vector addition is commutative}\\
&= z && \text{since } 0 \text{ is the additive identity.}
\end{aligned}$$

Show that 0x = 0 for all x ∈ Rⁿ.

A solution that reduces to R is as follows: Let x = (x₁, ..., xₙ). Then by definition of scalar multiplication, 0x = (0x₁, ..., 0xₙ), and this is (0, ..., 0) since we know that 0 is a multiplicative annihilator in R.

A solution that repeats the symbol-pattern of the solution in R is as follows: Compute that
$$\begin{aligned}
0x &= (0+0)x && \text{since } 0 \text{ is the additive identity in } \mathbb{R}\\
&= 0x + 0x && \text{by the distributivity axiom (D1) for } \mathbb{R}^n.
\end{aligned}$$

Add the additive inverse of 0x to both sides to get 0 = 0x.

2.1.7. Show the uniqueness of additive identity and additive inverse using only (A1), (A2), (A3).

To show uniqueness of additive identity, suppose that 0 and 0′ are both additive identities on the right. That is, suppose that x + 0 = x + 0′ = x for all x. By (A3), 0′ has an additive inverse on the right, i.e., there exists some y such that 0′ + y = 0. (Throughout this solution, we don't bother using boldface notation for zero since no reference will be made to scalars.) It follows that
$$\begin{aligned}
0' &= 0' + 0 && \text{by (A2), since } 0 \text{ is an additive identity}\\
&= 0' + (0' + y) && \text{from the previous display}\\
&= (0' + 0') + y && \text{by (A1)}\\
&= 0' + y && \text{by (A2), since } 0' \text{ is an additive identity}\\
&= 0 && \text{from the previous display.}
\end{aligned}$$
This completes the argument.

To show uniqueness of right inverses, suppose that some x has two right inverses a and b. That is, x + a = x + b = 0. By (A3), a has a right inverse α, i.e., a + α = 0. Consequently,
$$\begin{aligned}
a + x &= (a+x) + 0 && \text{by (A2)}\\
&= (a+x) + (a+\alpha) && \text{since } a + \alpha = 0\\
&= a + (x + (a+\alpha)) && \text{by (A1)}\\
&= a + ((x+a) + \alpha) && \text{by (A1)}\\
&= a + (0 + \alpha) && \text{since } x + a = 0\\
&= (a+0) + \alpha && \text{by (A1)}\\
&= a + \alpha && \text{by (A2)}\\
&= 0 && \text{since } a + \alpha = 0.
\end{aligned}$$

That is, a + x = 0. And similarly, b + x = 0. So a + x = b + x, and consequently, (a + x) + a = (b + x) + a, so that by (A1), a + (x + a) = b + (x + a). That is, a + 0 = b + 0, so that finally, by (A2), a = b.

2.1.9. Which of the following sets are bases of R³?

S1 = {(1, 0, 0), (1, 1, 0), (1, 1, 1)} is a basis since the equation (x, y, z) = a(1, 0, 0) + b(1, 1, 0) + c(1, 1, 1) has unique solution a = x − y, b = y − z, c = z.

S2 = {(1, 0, 0), (0, 1, 0), (0, 0, 1), (1, 1, 1)} is not a basis since the fourth vector is a linear combination of the first three, giving a nonunique linear combination of the elements of S2.

S3 = {(1, 1, 0), (0, 1, 1)} is not a basis since, for example, (0, 1, 0) can not be expressed as a linear combination of its elements.

S4 = {(1, 1, 0), (0, 1, 1), (1, 0, −1)} is not a basis since the third vector is the first minus the second; or it is not a basis because, for example, (1, 0, 0) can not be expressed as a linear combination of its elements.

Conjecturally a basis of Rⁿ must have n elements. Bases of R² consist of pairs of nonzero noncollinear vectors, like the hands of a clock excluding "midnight" and "six o'clock" scenarios. Bases of R³ consist of triples of nonzero noncoplanar vectors.

2.1.10. A basis of Cⁿ over R is
$$\{(1, 0, \dots, 0), (i, 0, \dots, 0), (0, 1, \dots, 0), (0, i, \dots, 0), \dots, (0, 0, \dots, 1), (0, 0, \dots, i)\}.$$
Note that this basis contains 2n elements. On the other hand, scalar multiplication by complex numbers is more powerful than scalar multiplication by real numbers only, so a basis of Cⁿ over C requires only n elements,
$$\{(1, 0, \dots, 0), (0, 1, \dots, 0), \dots, (0, 0, \dots, 1)\}.$$

2.2.2. Show that x = (2, −1, 3, 1), y = (4, 2, 1, 4), and z = (1, 3, 6, 1) form the vertices of a triangle in R⁴ with two equal angles.

Compute that the sides of the triangle are
$$x - y = (-2, -3, 2, -3), \qquad y - z = (3, -1, -5, 3), \qquad z - x = (-1, 4, 3, 0).$$

Consequently, |x − y|² = |z − x|² = 26. That is, the sides that meet at x have the same length, suggesting that the angles at y and z are equal. The cosine of the angle at y is
$$\frac{\langle x-y,\, z-y\rangle}{|x-y|\,|z-y|} = \frac{6-3+10+9}{\sqrt{26}\sqrt{44}} = \frac{22}{\sqrt{26\cdot 44}},$$
and the cosine of the angle at z is
$$\frac{\langle x-z,\, y-z\rangle}{|x-z|\,|y-z|} = \frac{3+4+15+0}{\sqrt{26}\sqrt{44}} = \frac{22}{\sqrt{26\cdot 44}}.$$
Since the angles lie between 0 and π and they have the same cosines, they are equal.

2.2.3. Prove that $x = \sum_{j=1}^{n} \langle x, e_j\rangle e_j$.

This is a matter of unwinding the notation. Compute that
$$\langle x, e_j\rangle = \langle (x_1, \dots, x_j, \dots, x_n),\, (0, \dots, 1, \dots, 0)\rangle = x_j,$$
so that ⟨x, eⱼ⟩eⱼ is the vector with xⱼ in the jth slot and all other entries 0. Summing these vectors clearly gives x.

2.2.4. Prove the Inner Product Properties.

For (IP1), compute that for any x ∈ Rⁿ,
$$\langle x, x\rangle = \sum_{i=1}^{n} x_i^2.$$

This is a sum of squares. By ordered field properties of the real number system, each square xᵢ² is zero or positive, zero if and only if xᵢ = 0. A sum of such numbers is zero or positive, zero if and only if xᵢ = 0 for i = 1, ..., n, i.e., if and only if x = 0.

For (IP2), compute that for any x, y ∈ Rⁿ,
$$\langle x, y\rangle = \sum_{i=1}^{n} x_i y_i = \sum_{i=1}^{n} y_i x_i = \langle y, x\rangle.$$

For the first part of (IP3), compute that for any x, x′, y ∈ Rⁿ,
$$\begin{aligned}
\langle x + x', y\rangle &= \sum_{i=1}^{n} (x+x')_i\, y_i && \text{by definition of inner product}\\
&= \sum_{i=1}^{n} (x_i + x'_i) y_i && \text{by definition of vector addition}\\
&= \sum_{i=1}^{n} (x_i y_i + x'_i y_i) && \text{by the distributive field axiom}\\
&= \sum_{i=1}^{n} x_i y_i + \sum_{i=1}^{n} x'_i y_i && \text{by the commutativity field axiom}\\
&= \langle x, y\rangle + \langle x', y\rangle.
\end{aligned}$$
For the second part of (IP3), compute that for any a ∈ R and any x, y ∈ Rⁿ,
$$\begin{aligned}
\langle ax, y\rangle &= \sum_{i=1}^{n} (ax)_i\, y_i && \text{by definition of inner product}\\
&= \sum_{i=1}^{n} a x_i y_i && \text{by definition of scalar–vector multiplication}\\
&= a \sum_{i=1}^{n} x_i y_i && \text{by the distributive field axiom}\\
&= a\langle x, y\rangle.
\end{aligned}$$
The third and fourth parts of (IP3) follow from the first two parts via (IP2):
$$\langle x, y+y'\rangle = \langle y+y', x\rangle = \langle y, x\rangle + \langle y', x\rangle = \langle x, y\rangle + \langle x, y'\rangle,$$
and
$$\langle x, by\rangle = \langle by, x\rangle = b\langle y, x\rangle = b\langle x, y\rangle.$$

2.2.5. Prove that |x| ≥ 0 for all x ∈ Rⁿ, with equality if and only if x = 0.

The definition
$$|x| = \sqrt{\langle x, x\rangle},$$

the property (IP1) that ⟨x, x⟩ ≥ 0 with equality if and only if x = 0, and the fact that the square root is nonnegative and is zero if and only if ⟨x, x⟩ = 0 all combine to give the result immediately.

Prove that |ax| = |a| |x| for all a ∈ R and x ∈ Rⁿ.

Compute, using results from the section, that
$$|ax| = \sqrt{\langle ax, ax\rangle} = \sqrt{a^2 \langle x, x\rangle}.$$
Each of a² and ⟨x, x⟩ is a nonnegative real number, so
$$\sqrt{a^2 \langle x, x\rangle} = \sqrt{a^2}\,\sqrt{\langle x, x\rangle}.$$
But √(a²) = |a| by work in the real number system (the square root of a² is not necessarily a), and √⟨x, x⟩ = |x| by definition. In sum, we have the desired result, |ax| = |a| |x|.

2.2.7. Starting from the basic Triangle Inequality,
$$|x+y| \le |x| + |y| \qquad \text{for all } x, y \in \mathbb{R}^n,$$
derive the full Triangle Inequality,
$$\big|\,|x| - |y|\,\big| \le |x \pm y| \le |x| + |y| \qquad \text{for all } x, y \in \mathbb{R}^n.$$
For any x and y, substitute −y for y in the basic inequality to see that |x + (−y)| ≤ |x| + |−y|. That is, |x − y| ≤ |x| + |y|. Along with the basic inequality, this gives the right side of the full inequality. Now, for any x and y, note that by the right side of the full inequality,
$$|x| = |(x+y) - y| \le |x+y| + |y|,$$
so that subtracting |y| from both sides gives |x| − |y| ≤ |x + y|. Reverse the roles of x and y to see that also |y| − |x| ≤ |x + y|. Since ||x| − |y|| is one of |x| − |y| or |y| − |x|, we now have ||x| − |y|| ≤ |x + y|. Finally, replace y by −y to complete the proof.

2.2.8. Prove the Size Bounds.

First, compute for each j ∈ {1, ..., n}, by the fact that xⱼ = ⟨x, eⱼ⟩, by the Cauchy–Schwarz inequality, and by the fact that |eⱼ| = 1, that
$$|x_j| = |\langle x, e_j\rangle| \le |x|\,|e_j| = |x|.$$
Equality holds in the Cauchy–Schwarz inequality exactly when the two vectors involved are parallel. So in our case here, equality holds exactly when x is parallel to eⱼ, i.e., x = xⱼeⱼ has all but its jth component equal to 0 (and possibly xⱼ = 0 as well). Second, compute by the Triangle Inequality, by the second part of exercise 2.2.5, and by the fact that |eⱼ| = 1 for all j that
$$|x| = |x_1 e_1 + \cdots + x_n e_n| \le |x_1 e_1| + \cdots + |x_n e_n| = |x_1|\,|e_1| + \cdots + |x_n|\,|e_n| = |x_1| + \cdots + |x_n|.$$
Equality holds in the triangle inequality when all the vectors involved are parallel and point in the same direction. So in our case here, since the vectors are xᵢeᵢ, equality holds exactly when x has at most one nonzero component.

2.2.10. Use the Law of Cosines to derive the formula for cos θ in the plane.

With x, y, and θ defined as in the exercise, the Law of Cosines is
$$|x-y|^2 = |x|^2 + |y|^2 - 2|x|\,|y|\cos\theta.$$
The left side is ⟨x−y, x−y⟩, which, by bilinearity, expands out to |x|² − 2⟨x, y⟩ + |y|². Thus, after some cancellation the display becomes
$$|x|^2 + |y|^2 - 2\langle x, y\rangle = |x|^2 + |y|^2 - 2|x|\,|y|\cos\theta.$$
The result follows immediately from a little algebra.

2.2.12. Show that two nonzero vectors x and y are orthogonal if and only if |x + y|² = |x|² + |y|².

Similarly to the work in exercise 2.2.10, compute that
$$|x+y|^2 = |x|^2 + 2\langle x, y\rangle + |y|^2.$$
The result follows immediately.

2.2.13. Show that the diagonals of a parallelogram are orthogonal if and only if the parallelogram is a rhombus.

Let the parallelogram have sides x and y. Then the diagonals are x + y and x − y. Compute, using properties of the inner product, that
$$\langle x+y, x-y\rangle = |x|^2 - |y|^2.$$
This shows that the diagonals are orthogonal if and only if the sides are equal. This is the desired result.
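The inner-product identities behind 2.2.12 and 2.2.13 can be spot-checked numerically. A small Python sketch, with sample vectors of my own choosing:

```python
def dot(u, v):
    """Standard inner product on R^n."""
    return sum(a * b for a, b in zip(u, v))

# hypothetical parallelogram sides
x = (2.0, -1.0, 3.0, 1.0)
y = (4.0, 2.0, 1.0, 4.0)
diag_sum = tuple(a + b for a, b in zip(x, y))    # diagonal x + y
diag_diff = tuple(a - b for a, b in zip(x, y))   # diagonal x - y

# <x+y, x-y> = |x|^2 - |y|^2, so the diagonals are orthogonal iff |x| = |y|
assert abs(dot(diag_sum, diag_diff) - (dot(x, x) - dot(y, y))) < 1e-12

# |x+y|^2 = |x|^2 + 2<x, y> + |y|^2, the expansion used in 2.2.12
assert abs(dot(diag_sum, diag_sum)
           - (dot(x, x) + 2 * dot(x, y) + dot(y, y))) < 1e-12
```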

2.3.1. For A ⊂ Rn , partially verify that M(A, Rm ) is a vector space over R by showing that it satisfies vector space axioms (A4) and (D1). To verify that M(A, Rm ) satisfies (D1), consider any mapping f ∈ M(A, Rm ) and any scalars a, b ∈ R. For any x ∈ A, ((a + b)f )(x) = (a + b)(f (x)) = a(f (x)) + b(f (x)) = (af )(x) + (bf )(x) = (af + bf )(x)

by definition of “·” in M(A, Rm ) by (D1) in Rm by definition of “·” in M(A, Rm )

by definition of “+” in M(A, Rm ).

Since x is arbitrary, (a + b)f = af + bf .
2.3.4. Define an inner product and a modulus on C([0, 1], R) by

    ⟨f, g⟩ = ∫₀¹ f (t)g(t) dt,    |f | = √⟨f, f ⟩.

How much of the material on inner product and modulus in Rn carries over to C([0, 1], R)? Express the Cauchy–Schwarz inequality as a relation between integrals.
All of the material carries through. The point of the exercise is that to establish this, all one has to do is establish the inner product properties for this new inner product, because the rest of the results of the section were derived solely from the inner product properties. Beyond parsing language, the subtle point in the inner product properties for this new inner product is that ⟨f, f ⟩ = 0 only if f is identically 0. The proof of this is where it matters that the functions in question are continuous. The argument proceeds as follows: If f is not identically 0 then f (t₀)² > 0 for some t₀ ∈ [0, 1]. Give f (t₀)² the name r, so that r > 0. In the Persistence of Inequality principle (Proposition 2.3.9), let the f in the principle be the f ² here, let a = t₀ and let b = r/2. The principle then says that f (t)² > r/2 for all t ∈ [0, 1] within some positive distance ε of t₀. Even if t₀ is an endpoint, this shows that ∫₀¹ f (t)² dt ≥ rε/4 > 0.
The Cauchy–Schwarz inequality in this context is

    ( ∫₀¹ f (t)g(t) dt )² ≤ ∫₀¹ f (t)² dt · ∫₀¹ g(t)² dt.
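The integral Cauchy–Schwarz inequality just displayed can be sanity-checked numerically. The sketch below uses midpoint-rule Riemann sums with the sample pair f (t) = t, g(t) = t², which is a hypothetical choice made here, not part of the exercise.

```python
# Numerical sanity check of Cauchy-Schwarz for the integral inner product on
# C([0,1], R), using the hypothetical sample pair f(t) = t and g(t) = t^2.

def integral(func, n=100000):
    """Midpoint-rule approximation of the integral of func over [0, 1]."""
    h = 1.0 / n
    return sum(func((i + 0.5) * h) for i in range(n)) * h

f = lambda t: t
g = lambda t: t * t

lhs = integral(lambda t: f(t) * g(t)) ** 2                            # (integral of fg)^2
rhs = integral(lambda t: f(t) ** 2) * integral(lambda t: g(t) ** 2)   # integral f^2 times integral g^2

# Exact values here: lhs = (1/4)^2 = 1/16 and rhs = (1/3)(1/5) = 1/15.
assert lhs <= rhs
```

For this pair the two sides are 1/16 and 1/15, so the inequality holds with a little room to spare, matching the fact that f and g are not scalar multiples of each other.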

2.3.5. Prove the componentwise nature of convergence. Let {xν } be a sequence of vectors in Rn , and let a ∈ Rn be a fixed vector. The text has shown that the sequence {xν } converges to a if and only if the sequence {xν −a} is null. That is, {xν } converges to a if and only if the sequence {(x1,ν − a1 , . . . , xn,ν − an )} is null. By Lemma 2.3.2, this holds if and only if each scalar sequence {xj,ν −aj } is null, and by definitions and/or results from one variable, this is equivalent to each scalar sequence {xj,ν } converging to aj . 19

2.3.9. Which of the following functions on R2 can be defined continuously at 0?

    f (x, y) = (x⁴ − y⁴)/(x² + y²)²  if (x, y) ≠ 0,    f (x, y) = b  if (x, y) = 0.

Let x ≠ 0 and let y = mx, and compute that in this case

    f (x, mx) = (1 − m⁴)x⁴ / ((1 + m²)² x⁴) = (1 − m⁴)/(1 + m²)².

This shows that f is a "spiral staircase" function as in the text, so it can not be made continuous at 0.

    g(x, y) = (x² − y³)/(x² + y²)  if (x, y) ≠ 0,    g(x, y) = b  if (x, y) = 0.

Compute that g(x, 0) = 1 for x ≠ 0 but g(0, y) = −y for y ≠ 0. Thus g(x, 0) → 1 as x → 0, while g(0, y) → 0 as y → 0, showing that g can not be made continuous at 0.

    h(x, y) = (x³ − y³)/(x² + y²)  if (x, y) ≠ 0,    h(x, y) = b  if (x, y) = 0.

Compute that for (x, y) ≠ 0,

    |h(x, y)| = |x³ − y³|/|(x, y)|² ≤ (|x|³ + |y|³)/|(x, y)|² ≤ (|(x, y)|³ + |(x, y)|³)/|(x, y)|² = 2|(x, y)|.

Since 2|(x, y)| → 0 as (x, y) → 0, it follows that |h(x, y)| → 0 as (x, y) → 0, and therefore h(x, y) → 0 as (x, y) → 0. So define h(0, 0) = 0 to make h continuous at 0.

    k(x, y) = xy²/(x² + y⁶)  if (x, y) ≠ 0,    k(x, y) = b  if (x, y) = 0.

Compute that for any slope m and any nonzero x,

    k(x, mx) = m²x³/(x² + m⁶x⁶) = m²x/(1 + m⁶x⁴),

and so k(x, mx) → 0 as x → 0, i.e., the only possible value of b that could make k continuous at 0 is b = 0. But now let (x, y) approach the origin along the curve x = y²,

    k(y², y) = y⁴/(y⁴ + y⁶) = 1/(1 + y²),

and so k(y², y) → 1 as y → 0. Thus k can not be made continuous at 0.
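The two competing limits for k are easy to confirm numerically: along every line through the origin the values go to 0, while along the parabola x = y² they go to 1. A minimal sketch:

```python
# Numerical check of the two limits computed for k(x, y) = x*y^2 / (x^2 + y^6):
# along lines y = m*x the values tend to 0, but along the curve x = y^2 they
# tend to 1, so no single value of b makes k continuous at the origin.

def k(x, y):
    return x * y * y / (x * x + y ** 6)

# Along lines y = m*x: k(x, mx) = m^2 x / (1 + m^6 x^4) -> 0 as x -> 0.
for m in (0.0, 1.0, -2.0, 5.0):
    assert abs(k(1e-6, m * 1e-6)) < 1e-3

# Along the curve x = y^2: k(y^2, y) = 1 / (1 + y^2) -> 1 as y -> 0.
assert abs(k(1e-12, 1e-6) - 1.0) < 1e-3
```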

2.3.11. Let f, g ∈ M(Rn , R) be such that f + g and f g are continuous. Are f and g necessarily continuous? No. For instance, let n = 1, let

    f (x) = 1   if x ∈ Q,
    f (x) = −1  if x ∉ Q,

and let g(x) = −f (x). Neither f nor g is continuous, but f + g is the constant function 0 and f g is the constant function −1, both of which are continuous.
2.4.1. (a) The set B(0, 1) is not closed: for example, e1 is a limit point of the set that is not in the set. The set is bounded.
(b) The set {(x, y) ∈ R2 : y − x² = 0} is closed. It is not bounded: for any positive number R the set contains the point (R, R²) whose modulus is greater than R.
(c) The set {(x, y, z) ∈ R3 : x² + y² + z² − 1 = 0} is closed and bounded, and hence it is compact.
(d) The set {x : f (x) = 0m } where f ∈ M(Rn , Rm ) is continuous is closed. To see this, let a be a limit point of the set. Then there is a sequence {xν } in the set such that {xν } approaches a. By the definition of the set, f (xν ) = 0m for each ν. By the continuity of f it follows that f (a) = 0m as well, i.e., a lies in the set. As shown by (b) and (c), we can not determine whether the set is bounded unless we know more about f .
(e) The set Qn is neither closed nor bounded. For example, √2 e1 is a limit point of the set that is not in the set. The set is unbounded: for any positive real number R there is an integer n > R by Archimedes's Principle, and so ne1 ∈ Qn has modulus greater than R.
(f) The set {(x1 , . . . , xn ) : x1 + · · · + xn > 0} is neither closed nor bounded. For example, 0 is a limit point of the set that does not belong to the set, and for any positive real number R the point (R + 1)e1 lies in the set and has modulus greater than R.
2.4.2. Give a set A ⊂ Rn and a limit point b of A such that b ∉ A. Let A = B(0, 1) and let b = (1, 0, . . . , 0). Give a set A ⊂ Rn and a point a ∈ A such that a is not a limit point of A. Let A = {0} and let a = 0.
2.4.5. Prove that any ball B(p, ε) is bounded in Rn . Let R = |p| + ε.
For any x ∈ B(p, ε) we have by the triangle inequality |x| = |p + x − p| ≤ |p| + |x − p| < |p| + ε = R. This shows that B(p, ε) ⊂ B(0, R). 2.4.7. Show by example that a closed set need not satisfy the sequential characterization of bounded sets, and that a bounded set need not satisfy the sequential characterization of closed sets. 21

For the sake of simple examples, let n = 1. Consider the closed set A = R. The sequence {xν } = {1, 2, 3, · · · } has all of its entries in A but it has no subsequence that converges in R. Consider the bounded set A = (0, 1]. The sequence {xν } = {1, 1/2, 1/3, · · · } has all of its entries in A and it converges; but its limit, 0, does not lie in A.
2.4.8. Show by example that the continuous image of a closed set need not be closed. The closed set [1, ∞) is taken by the continuous function f (x) = 1/x to the nonclosed set (0, 1]. Show that the continuous image of a closed set need not be bounded. The closed set R is taken by the continuous function f (x) = x to the unbounded set R. Show that the continuous image of a bounded set need not be closed. The bounded set (0, 1) is taken by the continuous function f (x) = x to the nonclosed set (0, 1). Show that the continuous image of a bounded set need not be bounded. The bounded set (0, 1] is taken by the continuous function f (x) = 1/x to the unbounded set [1, ∞).
2.4.9. A subset A of Rn is called discrete if each of its points is isolated. Is discreteness a topological property? That is, need the continuous image of a discrete set be discrete? No. Scrutinizing the definition of continuity shows that every mapping whose domain is discrete must be continuous. In particular, the function f : N −→ R given by

    f (x) = 0    if x = 0,
    f (x) = 1/x  if x > 0,

is continuous. And this function takes the discrete set N to the set {0} ∪ {1, 1/2, 1/3, . . . }, a set in which 0 is a nonisolated point.
2.4.10. A subset A of Rn is called path-connected if for any two points x, y ∈ A, there is a continuous mapping γ : [0, 1] −→ A such that γ(0) = x and γ(1) = y. Prove that path-connectedness is a topological property. Let A ⊂ Rn be path-connected, and let f : A −→ Rm be continuous. Consider any two points x, y ∈ f (A). These points take the form x = f (x′ ), y = f (y ′ ) where x′ , y ′ ∈ A. Since A is path-connected, there exists a continuous map γ : [0, 1] −→ A such that γ(0) = x′ and γ(1) = y ′ . Now consider the composite δ = f ◦ γ : [0, 1] −→ f (A). This composite is continuous since its factors γ and f are continuous. Furthermore, δ(0) = f (γ(0)) = f (x′ ) = x, and similarly δ(1) = y. Having shown that any two points x, y ∈ f (A) are joined by a path in f (A), we have shown that f (A) is path-connected.

Chapter 3
3.1.1. Prove that T : Rn −→ Rm is linear if and only if it satisfies (3.1) and (3.2). If T is linear then it satisfies the displayed condition in Definition 3.1.1,

    T (α₁x₁ + · · · + αk xk ) = α₁T (x₁) + · · · + αk T (xk )

for all k ∈ Z+ , α₁ , . . . , αk ∈ R, and x₁ , . . . , xk ∈ Rn . Specialize to k = 2 and α₁ = α₂ = 1 to get (3.1). Specialize to k = 1 to get (3.2).
Conversely, if T satisfies (3.1) and (3.2) then we prove that it satisfies the condition in Definition 3.1.1 by induction on k. For k = 1, the condition in the definition is (3.2), which we know that T satisfies. Now let k ∈ Z+ be arbitrary, and assume that T satisfies the condition in the definition for k. We need to show that T satisfies the condition for k + 1. Compute that for any α₁ , . . . , αk+1 ∈ R and x₁ , . . . , xk+1 ∈ Rn ,

    T (α₁x₁ + · · · + αk+1 xk+1 )
      = T ((α₁x₁ + · · · + αk xk ) + αk+1 xk+1 )        by def'n of summation
      = T (α₁x₁ + · · · + αk xk ) + T (αk+1 xk+1 )       since T satisfies (3.1)
      = α₁T (x₁) + · · · + αk T (xk ) + αk+1 T (xk+1 )   by ind. hyp. and (3.2)
      = α₁T (x₁) + · · · + αk+1 T (xk+1 )                by def'n of summation.
This completes the induction. Note how neatly the induction step uses (3.1) and (3.2) once each.
3.1.2. Suppose that T ∈ L(Rn , Rm ). Show that T (0n ) = 0m . Compute that

    T (0n ) = T (0n + 0n ) = T (0n ) + T (0n ),

and now adding −T (0n ) to both sides gives the result.

3.1.3. Fix a vector a ∈ Rn . Show that the mapping T : Rn −→ R given by T (x) = ⟨a, x⟩ is linear, and that T (ej ) = aj for j = 1, . . . , n. To prove linearity it suffices to establish (3.1) and (3.2). For (3.1), compute that for any x₁ , x₂ ∈ Rn ,

    T (x₁ + x₂ ) = ⟨a, x₁ + x₂ ⟩         by definition of T
                 = ⟨a, x₁ ⟩ + ⟨a, x₂ ⟩   since the inner product is bilinear
                 = T (x₁ ) + T (x₂ )     by definition of T .

For (3.2), compute that for any α ∈ R and x ∈ Rn ,

    T (αx) = ⟨a, αx⟩   by definition of T
           = α⟨a, x⟩   since the inner product is bilinear
           = αT (x)    by definition of T .

By exercise 3.1.1, this is enough to show that T is linear. For the second part of the exercise, take any j ∈ {1, . . . , n} and compute

    T (ej ) = ⟨a, ej ⟩ = ⟨(a₁ , . . . , aj , . . . , an ), (0, . . . , 1, . . . , 0)⟩ = aj .

3.1.5. Complete the proof of the componentwise nature of linearity. Let T = (T₁ , . . . , Tm ) : Rn −→ Rm . We need to show that T satisfies (3.2) if and only if each Tj does. Compute that for any α ∈ R and any x ∈ Rn ,

    T (αx) = (T₁ (αx), . . . , Tm (αx))

and

    αT (x) = α(T₁ (x), . . . , Tm (x)) = (αT₁ (x), . . . , αTm (x)).

But T satisfies (3.2) exactly when the left sides are equal, the left sides are equal exactly when the right sides are equal, and the right sides are equal exactly when each Tj satisfies (3.2). This completes the proof.
3.1.6. Carry out the matrix-by-vector multiplies.

    [1 1 0] [1]   [3]
    [0 1 1] [2] = [5],
    [1 1 1] [3]   [6]

    [a b] [x]   [ax + by]
    [c d] [y] = [cx + dy],
    [e f]       [ex + fy]

    [x1 · · · xn ][y1 , . . . , yn ]ᵀ = x1 y1 + · · · + xn yn = ⟨x, y⟩,

    [ 0  1 −1] [1]   [0]
    [−1  0  1] [1] = [0].
    [ 1 −1  0] [1]   [0]
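The row-times-vector mechanics of 3.1.6 can be spot-checked with a tiny helper. The 3-by-3 matrices below are reconstructions of a badly garbled display and should be treated as assumptions; the mechanics are the point.

```python
# Spot check of two matrix-by-vector products in the style of 3.1.6.
# The specific 3-by-3 matrices are assumed reconstructions, not certain.

def matvec(A, x):
    """Multiply matrix A (a list of rows) by vector x (a list)."""
    return [sum(a * b for a, b in zip(row, x)) for row in A]

A = [[1, 1, 0],
     [0, 1, 1],
     [1, 1, 1]]
assert matvec(A, [1, 2, 3]) == [3, 5, 6]

# A skew-symmetric matrix whose rows are each orthogonal to (1, 1, 1).
B = [[0, 1, -1],
     [-1, 0, 1],
     [1, -1, 0]]
assert matvec(B, [1, 1, 1]) == [0, 0, 0]
```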

3.1.8. Let θ denote a fixed but generic angle. Argue geometrically that the mapping R : R2 −→ R2 given by counterclockwise rotation by θ is linear, and then find its matrix. The argument is virtually identical to the particular case θ = π/6 in the text. The matrix is

    A = [cos θ  −sin θ]
        [sin θ   cos θ].

3.1.13. If S ∈ L(Rn , Rm ) and T ∈ L(Rp , Rn ), show that S ◦ T : Rp −→ Rm lies in L(Rp , Rm ). It suffices to show that S ◦ T satisfies (3.1) and (3.2). For (3.1), take any x₁ , x₂ ∈ Rp and compute that

    (S ◦ T )(x₁ + x₂ ) = S(T (x₁ + x₂ ))              by definition of composition
                       = S(T (x₁ ) + T (x₂ ))         since T satisfies (3.1)
                       = S(T (x₁ )) + S(T (x₂ ))      since S satisfies (3.1)
                       = (S ◦ T )(x₁ ) + (S ◦ T )(x₂ )  by definition of composition.

The proof that S ◦ T satisfies (3.2) is similar.
3.1.14. (a) Let S ∈ L(Rn , Rm ). Its transpose is S T : Rm −→ Rn defined by the characterizing condition

    ⟨x, S T (y)⟩ = ⟨S(x), y⟩   for all x ∈ Rn and y ∈ Rm .

Use the condition to show that S T (y + y ′ ) = S T (y) + S T (y ′ ) for all y, y ′ ∈ Rm . Compute that for all x ∈ Rn and y, y ′ ∈ Rm ,

    ⟨x, S T (y + y ′ )⟩ = ⟨S(x), y + y ′ ⟩             by the property of S T
                       = ⟨S(x), y⟩ + ⟨S(x), y ′ ⟩      since the inner product is bilinear
                       = ⟨x, S T (y)⟩ + ⟨x, S T (y ′ )⟩  by the property of S T
                       = ⟨x, S T (y) + S T (y ′ )⟩       since the inner product is bilinear.

(b) Keeping S from part (a), now further introduce T ∈ L(Rp , Rn ), so that also S ◦ T ∈ L(Rp , Rm ). Show that the transpose of the composition is the composition of the transposes in reverse order, i.e., (S ◦ T )T = T T ◦ S T . Compute that for all x ∈ Rp and z ∈ Rm ,

    ⟨x, (S ◦ T )T (z)⟩ = ⟨(S ◦ T )(x), z⟩     by the property of (S ◦ T )T
                       = ⟨S(T (x)), z⟩        by definition of composition
                       = ⟨T (x), S T (z)⟩     by the property of S T
                       = ⟨x, T T (S T (z))⟩   by the property of T T
                       = ⟨x, (T T ◦ S T )(z)⟩  by definition of composition.

3.2.2. Carry out the matrix multiplies.

    [ d −b] [a b]
    [−c  a] [c d] = (ad − bc)I₂ ,

    [x₁ x₂ x₃ ] [a₁ b₁]
                [a₂ b₂] = [a₁x₁ + a₂x₂ + a₃x₃   b₁x₁ + b₂x₂ + b₃x₃ ],
                [a₃ b₃]

    [0 1 0 0]²   [0 0 1 0]
    [0 0 1 0]    [0 0 0 1]
    [0 0 0 1]  = [0 0 0 0],
    [0 0 0 0]    [0 0 0 0]

    [1 1 1] [1 0 0]   [3 2 1]        [1 0 0] [1 1 1]   [1 1 1]
    [0 1 1] [1 1 0] = [2 2 1],       [1 1 0] [0 1 1] = [1 2 2].
    [0 0 1] [1 1 1]   [1 1 1]        [1 1 1] [0 0 1]   [1 2 3]
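Two of the products in 3.2.2 can be confirmed with a small pure-Python multiplier: the 2-by-2 adjugate identity and the triangular all-ones products. The numeric entries a, b, c, d below are hypothetical sample values.

```python
# Check of two products from 3.2.2 with a minimal matrix multiplier.

def matmul(A, B):
    """Multiply matrices given as lists of rows."""
    return [[sum(A[i][k] * B[k][j] for k in range(len(B)))
             for j in range(len(B[0]))] for i in range(len(A))]

a, b, c, d = 2, 5, 3, 7   # hypothetical sample entries
adj = [[d, -b], [-c, a]]
M = [[a, b], [c, d]]
det = a * d - b * c
assert matmul(adj, M) == [[det, 0], [0, det]]   # (ad - bc) I_2

U = [[1, 1, 1], [0, 1, 1], [0, 0, 1]]   # upper-triangular ones
L = [[1, 0, 0], [1, 1, 0], [1, 1, 1]]   # lower-triangular ones
assert matmul(U, L) == [[3, 2, 1], [2, 2, 1], [1, 1, 1]]
assert matmul(L, U) == [[1, 1, 1], [1, 2, 2], [1, 2, 3]]
```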

3.2.4. Let A = [aij ] ∈ Mm,n (R) be the matrix of S ∈ L(Rn , Rm ). Its transpose AT ∈ Mn,m (R) is the matrix of the transpose mapping S T . The characterizing property of S T gives

    ⟨x, AT y⟩ = ⟨Ax, y⟩   for all x ∈ Rn and y ∈ Rm .

Make specific choices of x and y to show that the (i, j)th entry of AT is aji . Take x = ei and y = ej . Then the two sides of the characterizing equality are

    ⟨ei , AT ej ⟩ = ⟨ei , jth column of AT ⟩ = (i, j)th entry of AT ,
    ⟨Aei , ej ⟩ = ⟨ith column of A, ej ⟩ = (j, i)th entry of A.

This gives the result. (Also, although the exercise observes that the formula

    (AB)T = B T AT    for all A ∈ Mm,n (R) and B ∈ Mn,p (R)

follows from a previous exercise, a direct proof is also immediate: for any i ∈ {1, · · · , p} and any j ∈ {1, · · · , m}, the (i, j)th entry of (AB)T is the (j, i)th entry of AB, which is

    ⟨jth row of A, ith column of B⟩,

which in turn is

    ⟨ith row of B T , jth column of AT ⟩,

and this is the (i, j)th entry of B T AT .)
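The identity (AB)ᵀ = BᵀAᵀ from 3.2.4 is easy to spot-check on small sample matrices (the particular entries below are hypothetical):

```python
# Spot check of (AB)^T = B^T A^T on sample rectangular matrices.

def matmul(A, B):
    return [[sum(A[i][k] * B[k][j] for k in range(len(B)))
             for j in range(len(B[0]))] for i in range(len(A))]

def transpose(A):
    """Transpose a matrix given as a list of rows."""
    return [list(col) for col in zip(*A)]

A = [[1, 2, 3], [4, 5, 6]]          # 2-by-3
B = [[7, 8], [9, 10], [11, 12]]     # 3-by-2
assert transpose(matmul(A, B)) == matmul(transpose(B), transpose(A))
```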

3.3.1. Write down the following 3-by-3 elementary matrices and their inverses: R3;2,π , S3,3 , T3;2 , T2;3 . Solution:

    R3;2,π = [1 0 0]        R3;2,π⁻¹ = R3;2,−π = [1  0 0]
             [0 1 0],                            [0  1 0],
             [0 π 1]                             [0 −π 1]

and

    S3,3 = [1 0 0]          S3,3⁻¹ = S3,1/3 = [1 0  0 ]
           [0 1 0],                           [0 1  0 ],
           [0 0 3]                            [0 0 1/3]

and

    T3;2 = [1 0 0]
           [0 0 1],         T3;2⁻¹ = T3;2 ,
           [0 1 0]

and T2;3 = T3;2 , so that T2;3⁻¹ = T2;3 = T3;2 .
3.3.3. Let

    A = [1 2]
        [3 4].
        [5 6]

Evaluate the following products without actually multiplying matrices: R3;2,π A, S3,3 A, T3;2 A, T2;3 A. Since T2;3 = T3;2 , we only need to compute the first three. They are

    R3;2,π A = [   1       2   ]      S3,3 A = [ 1  2]      T3;2 A = [1 2]
               [   3       4   ],              [ 3  4],              [5 6].
               [5 + 3π  6 + 4π ]               [15 18]               [3 4]
3.3.7. Are the following matrices echelon? tion M x = 0. The matrix  1 0 M = 0 1 0 0

For each matrix M , solve the equa 3 1 1

is not echelon. Its echelon form is I3 , and so the only solution of the equation M x = 0 is x = 0. The matrix   0 0 0 1 M= 0 0 0 0 is echelon. The equation M x = 0 is solved by all vectors (x1 , x2 , x3 , 0) where x1 , x2 , and x3 are free. The matrix   1 1 0 0 M= 0 0 1 1 is echelon. The equation M x = 0 has solutions (−x2 , x2 , −x4 , x4 ) where x2 and x4 are free. The matrix   0 0 1 0  M = 0 1 0 0

is not echelon because its nonzero rows are not at the top. The only solution of the equation M x = 0 is x = 0. The matrix   1 0 0 0 M =  0 1 1 0 0 0 1 0 27

is not echelon because the leading 1 in the bottom row doesn’t have all 0’s above it. But it becomes echelon after its third row is subtracted from its second, and then the equation M x = 0 has solutions (0, 0, 0, x4 ) where x4 is free. The matrix   0 1 1 M =  1 0 3 0 0 0

is not echelon, but it becomes echelon after its first two rows are transposed. The equation M x = 0 has solutions (−3x3 , −x3 , x3 ). 3.3.8. For each  −1  1 1

matrix A solve the equation   2 −1 3 1 4 4 0 3 8 ,  1 2 6 −1 2 5

Ax = 0.   3 −1 2 1 1 ,  2 1 −3 5

The echelon form of the first matrix is   1 0 −1 3 , E= 0 1 0 0 0

 2 1 . 0

and so in this case the equation Ax = 0 has solutions (x3 , −3x3 , x3 ) where x3 is free. The echelon form of the second matrix is   1 0 0 17/5 E =  0 1 0 −3/5  , 0 0 1 −9/5

and so the equation Ax = 0 has solutions (−(17/5)x4 , (3/5)x4 , (9/5)x4 , x4 ) where x4 is free. The echelon form of the third matrix is I3 , and so the equation Ax = 0 has only the trivial solution x = 0. 3.3.9. Balance the chemical equation Ca + H3 PO4 −→ Ca3 P2 O8 + H2 . Of course this is silly. One can pretty much do it on sight. But also one can set it up as a linear algebra problem, exhibiting a systematic method that applies as well to situations too complicated to do in one’s head. Let x be the number of Ca molecules, y the number of H3 PO4 molecules, z the number of Ca3 P2 O8 molecules, and w the number of H2 molecules. Then the conditions for balancing the equation are that the number of Ca atoms be the same on both sides, and similarly for the number of H atoms, P atoms, and O atoms. This gives four equations in the unknowns:    x 1 0 −3 0  0 3  y  0 −2     = 04 .  0 1 −2 0  z  w 0 4 −8 0 28

Putting this equation into echelon form gives    x 1 0 0 −1  0 1 0 −2/3   y      0 0 1 −1/3   z  = 04 . w 0 0 0 0

So x = w, y = (2/3)w, and z = (1/3)w. Set w = 3: then x = 3 and y = 2 and z = 1, 3 Ca + 2 H3 PO4 −→ Ca3 P2 O8 + 3 H2 .

3.5.5. The square matrix A is orthogonal if AT A = I. Show that if A is orthogonal then det A = ±1. Give an example with determinant −1. By Theorem 3.5.4 (whose proof is exercise 3.5.4), det AT = det A. Thus, if T A A = I then 1 = det I = det AT A = det AT det A = (det A)2 , and so det A = ±1. An example with determinant −1 is A = [−1].

3.5.6. The matrix A is skew symmetric if AT = −A. Show that if A is n-by-n skew symmetric with n odd then det A = 0. By Theorem 3.5.4 (whose proof is exercise 3.5.4), det A = det(AT ). That is, det A = det(−A). But −A is the matrix whose n rows are the n rows of A each multiplied by −1. Thus, by the multilinearity of the determinant, det(−A) = (−1)n det A. Since n is odd, this says that det(−A) = − det A. Concatenating the various results so far, we have det A = det(AT ) = det(−A) = (−1)n det A = − det A. Since det A is a real number and equal to its additive inverse, it is 0.

3.6.2. Use the desired determinant properties to obtain the formula in the section for 1-by-1 and the 3-by-3 determinant. Verify that the 1-by-1 formula satisfies the properties. For the 1-by-1 case, the only permutation is (1), and so the formula is simply det[a] = a. This is multilinear because det[αa + α′ a′ ] = αa + α′ a′ = α det[a] + α′ det[a]. It is vacuously skew-symmetric. And it is normalized because det[1] = 1. For the 3-by-3 case, let A have entries aij . Many applications of multilinearity give det(A) = det(

3 X

a1,i1 ei1 ,

i1 =1

=

3 3 X 3 X X

3 X

i2 =1

a2,i2 ei2 ,

3 X

a3,i3 ei3 )

i3 =1

a1,i1 a2,i2 a3,i3 det(ei1 , ei2 , ei3 ).

i1 =1 i2 =1 i3 =1

29

Because the determinant is alternating, we may ignore the 21 terms of the 27 where standard basis vectors repeat, leaving det(A) = a11 a22 a33 det(e1 , e2 , e3 ) + a11 a23 a32 det(e1 , e3 , e2 ) + a12 a21 a33 det(e2 , e1 , e3 ) + a12 a23 a31 det(e2 , e3 , e1 ) + a13 a21 a32 det(e3 , e1 , e2 ) + a13 a22 a31 det(e3 , e2 , e1 ). Counting inversions reduces this to det(A) = (a11 a22 a33 − a11 a23 a32 − a12 a21 a33 + a12 a23 a31

+ a13 a21 a32 − a13 a22 a31 ) det(e1 , e2 , e3 ),

and since the determinant is normalized, we have the formula, det(A) = a11 a22 a33 − a11 a23 a32 − a12 a21 a33 + a12 a23 a31

+ a13 a21 a32 − a13 a22 a31 .

3.6.3. For each permutation, count the inversions and compute the sign: (2, 3, 4, 1), (3, 4, 1, 2), (5, 1, 4, 2, 3). The permutation (2, 3, 4, 1) has the three inversions (2, 1), (3, 1), and (4, 1), so its sign is negative. The permutation (3, 4, 1, 2) has the four inversions (3, 1), (3, 2), (4, 1), and (4, 2), so its sign is positive. The permutation (5, 1, 4, 2, 3) has the six inversions (5, 1), (5, 4), (5, 2), (5, 3), (4, 2), (4, 3), and so its sign is positive. 3.6.8. Use the determinant formula to re-explain why the determinant of a triangular matrix is the product of its diagonal entries. If a rook is above the diagonal then the corresponding permutation (π(1), · · · , π(n)) has i for some i, so also it must have π(j) < j for some j since Pn π(i) > P n π(i) = i=1 i=1 i, so a second rook is below the diagonal. And conversely. Theus, for a triangular matrix the only rook-placement that can give a nonzero term in the determinant formula is the placement of the rooks down the diagonal. This placement has no upward slopes, so its sign is positive. This gives the result. 3.6.9. Calculate the determinants of the following matrices:     1 −1 2 3 4 3 −1 2   0 1 2 0 2  2 3  . ,  2   4  1 0 1 −1 −1  4 1  1 2 3 0 2 0 3 0 The determinants are 9 and 128 respectively. In each case there are many ways to do it, but the idea is to use row and column operations to put the 30

matrix into triangular form, and then the determinant is product of the diagonal elements of the triangular matrix, times the reciprocals of the row and column scale-factors that were used, times the parity of the number of transpositions that were used. 3.8.2. Describe the geometric effect of multiplying by the matrices R′ and S ′ in the text. Describe the effect of multiplying by R and S if a < 0. Multiplying by R′ produces a shear in the y-direction, taking (1, 0) to (1, a) and taking (0, 1) back to itself. Multiplying by S ′ produces a scale in the y-direction, taking (1, 0) back to itself and taking (0, 1) to (0, a). Multiplying by R but with a < 0 produces a negative shear in the x-direction, and multiplying by S but with a < 0 scales the box in the negative x-direction. 3.8.3. Describe the geometric effect of multiplying by the 3-by-3 elementary matrices R2;3,1 , R3;1,2 , and S2,−3 . Multiplying by R2;3,1 preserves (1, 0, 0) and (0, 1, 0) but takes (0, 0, 1) to (0, 1, 1). The effect is to shear the positive z-axis 45 degrees toward the positive y-axis. (The negative z-axis is sheared the other way, of course.) Multiplying by R3;1,2 shears the positive x-axis toward the positive z-axis, taking (1, 0, 0) to (1, 0, 2). Multiplying by S2,−3 negates and triples in the y-direction while preserving the x and z-directions. 3.8.4. Describe counterclockwise rotation of the plane by angle π/2 as a composition of shears and scales. The relevant matrix is   0 −1 . 1 0 Compute that R1;2,1 R2;1,−1 R1;2,1



0 1

−1 0



= I.

It follows that       1 −1 1 0 1 −1 0 −1 . = R1;2,−1 R2;1,1 R1;2,−1 = 0 1 1 1 0 1 1 0 That is, the counterclockwise rotation through angle π/2 is a composition shearing the positive y-axis counterclockwise through angle π/4, then shearing the positive x-axis counterclockwise through angle π/4, then again shearing the positive y-axis counterclockwise through angle π/4. 3.8.6. In R3 , describe the linear mapping that takes e1 to e2 , e2 to e3 , and e3 to e1 as a composition of shears, scales, and transpositions. The relevant matrix is   0 0 1 1 0 0 . 0 1 0 31

Compute that

It follows that



0 T2;3 T1;2 1 0 

0 0 1 0 0 1

 0 1 0 0 = I. 1 0

 1 0 = T1;2 T2;3 . 0

That is, the mapping is a composition of transposing the second and third axes followed by transposing the first and second. 3.9.1. Any invertible mapping T : Rn −→ Rn is a composition of scales, shears and transpositions. Give conditions on such a composition to make the mapping orientation-preserving, orientation-reversing. The condition for the mapping to preserve orientation is that the determinant be positive, and similarly for reversing orientation and negative determinant. Since each shear has determinant 1, shears preserve orientation. Scaling in any direction by a positive number preserves orientation, while scaling in any direction by a negative number reverses it. Every transposition reverses orientation. Thus the condition for the mapping to preserve orientation is that the combined number of negative scales and transpositions must be even, and the condition for the mapping to reverse orientation is that the combined number of negative scales and transpositions must be odd. 3.10.1. Evaluate (2, 0, −1) × (1, −3, 2). By the formula in the section, the cross product is (−3, −5, −6). 3.10.2. Suppose that v = u1 × e1 = u2 × e2 for some u1 , u2 . Describe v. Since v is orthogonal to e1 and to e2 , it is a scalar multiple of e3 . 3.10.3. True or false: For all u, v, w in R3 , (u × v) × w = u × (v × w). This is false. For instance, (e1 × e1 ) × e2 = 03 × e2 = 03 , but e1 × (e1 × e2 ) = e1 × e3 = −e2 . 3.10.4. Express (u + v) × (u − v) as a scalar multiple of u × v. Compute, using various properties of the cross product, that (u + v) × (u − v) = u × u − u × v + v × u − v × v = −2(u × v). 3.10.5. For fixed u, v in R3 with u 6= 0, describe the vectors w satisfying the condition u × v = u × w. The condition is u × (w − v) = 03 . That is w − v is parallel to u, and since u 6= 0 this condition is that w − v = cu for some c ∈ R, or w = v + cu for some c ∈ R. Another way to say this is that w ∈ ℓ(v, u). 3.10.10. 
What can you conclude about the lines x − xp y − yp z − zp = = xd yd zd

and 32

x − xp y − yp z − zp = = xD yD zD

given that xd xD + yd yD + zd zD = 0? The lines have the common point p and orthogonal directions. That is, they are normal. What can you conclude if xd /xD = yd /yD = zd /zD ? The lines have the common point p and parallel directions. That is, they are equal. 3.10.12. Use vector geometry to show that the distance from the point q to the line ℓ(p, d) is |(q − p) × d| . |d|

By results in the text, |(q − p) × d| is the area of the parallelogram spanned by q − p and d. View d as the base of the parallelogram, making the height of the parallelogram exactly the distance we seek. But the height is the area divided by the base, |(q − p) × d|/|d|. So we are done. 3.10.15. Where does the plane x/a + y/b + z/c = 1 intersect each axis? The points of intersection are (a, 0, 0), (0, b, 0), and (0, 0, c).

3.10.17. Use vector geometry to show that the distance from the point q to the plane P (p, n) is |hq − p, ni| . |n| We want the modulus of (q − p)kn . From exercise 2.2.15, hq − p, ni n, |n|2

(q − p)kn = and this is (q − p)kn =

hq − p, ni n . |n| |n|

Since n/|n| is a unit vector, the modulus of the displayed vector is |hq −p, ni|/|n| as desired.

Chapter 4 4.2.2. Let e be a nonnegative real number. Consider the function ϕe : Rn −→ R,

ϕ(x) = |x|e .

(a) Suppose that e > 0. Let c > 0 be given. If |h| ≤ c1/e then what do we know about |ϕe (h)| in comparison to c? What does this tell us about ϕe ? If |h| ≤ c1/e then |ϕe (h)| = |h|e ≤ c. This says that ϕe is o(1). (b) Prove that ϕ1 is O(h). Let c = 1. Then |ϕ1 (h)| = |h| ≤ c|h| for any h. 33

(c) Suppose that e > 1. Show that ϕe is o(h). Note that ϕe = ϕe−1 ϕ1 . By parts (a) and (b), ϕe−1 is o(1) and ϕ1 is O(h). By the product property for Landau functions, ϕe is o(h). (d) Explain how parts (a), (b), and (c) have proved Proposition 4.2.2. Part (a) has shown that ϕe is o(1) if e > 0. Part (b) has shown that ϕ1 is O(h). Part (c) has shown that ϕe is o(h) if e > 1. Since o(h) functions are also O(h), parts (b) and (c) together have shown that ϕe is O(h) if e ≥ 1. 4.3.2. Give a geometric interpretation of the derivative when n = m = 2. Give a geometric interpretation of the derivative when n = 1 and m = 2. When m = n = 2, we can conceive of the map f from (some of) the plane back to the plane as distorting a grid in some nonuniform, curvy way, as in figure 2.9 from the notes. The derivative Ta distorts a grid into another grid, as in figure 3.6. The interpretation of the derivative is that near the point a where the derivative is taken, the distortion of the grid under f is very closely approximated by the translation to f (a) of the distortion of the grid under T near the origin. When n = 1 and m = 2, we can conceive of the map f from (some of) the line into the plane as motion along a curve. Similarly, the derivative Ta can be viewed as motion along a straight line through the origin in the plane. The interpretation of the derivative is that near the point a where the derivative is taken, the motion along the curve is very closely approximated by the translation to f (a) of the motion along the straight line near the origin. 4.3.3. Prove the componentwise nature of differentiability: Let f : A −→ Rm (where A ⊂ Rn ) have component functions f1 , · · · , fm , and let a be a point of A. Let T : Rn −→ Rm be a linear mapping with component functions T1 , · · · , Tm . Show that f is differentiable at a with derivative T if and only if each component fi is differentiable at a with derivative Ti . Let a be an interior point of A. 
(At any other point a, neither f nor any of its component functions can be differentiable, so the condition that f be differentiable is equivalent to the condition that each fi be differentiable, since both conditions are flat-out false.) The condition that f is differentiable at a with derivative T is f (a + h) − f (a) − T (h)

is o(h).

By the componentwise nature of the o(h) condition, this condition is fi (a + h) − fi (a) − Ti (h)

is o(h)

for i = 1, · · · , m.

And this is the condition the each fi is differentiable at a with derivative Ti . 4.3.4. Let f (x, y) = (x2 − y 2 , 2xy). Show that Df(a,b) (h, k) = (2ah − 2bk, 2bh + 2ak) for all (a, b) ∈ R2 . We may work componentwise. The first component was discussed in the section. For the second component, compute that for any (a, b) ∈ R2 and any

34

(h, k) ∈ R2 , a little algebra gives f2 (a + h, b + k) − f2 (a, b) − (2bh + 2ak) = 2(a + h)(b + k) − 2ab − 2bh − 2ak = 2hk.

As explained in the text immediately after the product property for Landau functions (Proposition 4.2.6), 2hk is o(h, k), and so we are done. 4.3.5. Let g(x, y) = xey . Show that Dg(a,b) (h, k) = heb + kaeb for all (a, b) ∈ R2 . The characterizing property of the one-variable exponential function derivative at 0 is ek − 1 = ek − e0 = k + o(k). Compute that g(a + h,b + k) − g(a, b) − heb − kaeb

= (a + h)eb+k − aeb − heb − kaeb  = eb (a + h)(ek − 1) − ka  = eb (a + h)(k + o(k)) − ka  = eb ao(k) + h(k + o(k))  = eb ao(k) + hO(k)

by definition of g by algebra by the characterizing property by algebra since k and o(k) are O(k).

Note that o(k) = o(h, k) and O(k) = O(h, k) since |k| ≤ |(h, k)|. Also h = o(1), and so hO(k) = o(1)O(h, k) = o(h, k) by the product property (Proposition 4.2.6). Thus overall the right side of the previous display is o(h, k), giving the desired result. 4.3.6. Show that if f : Rn −→ R satisfies |f (x)| ≤ |x|2 for all x ∈ Rn then f is differentiable at 0n . Since the graph of f is trapped in a parabolic envelope, the only reasonable candidate for the derivative Df0 is the zero function T (h) = 0 for all h. Note that the condition |f (x)| ≤ |x|2 specializes for x = 0 to |f (0)| ≤ 0, i.e., f (0) = 0. Compute that indeed, |f (0 + h) − f (0) − 0| = |f (h)| ≤ |h|2 , and since |h|2 is o(h) by Proposition 4.2.2, so is f (0 + h) − f (0) − 0 by the Dominance Principle, Proposition 4.2.3. 4.4.2. Prove part (2) of Proposition 4.4.2. We are assuming that f is differentiable at a with derivative Dfa . That is, we are assuming that f (a + h) − f (a) − Dfa (h)

is o(h).

We want to show that αf is differentiable at a with derivative αDfa . That is, we want to show that (αf )(a + h) − (αf )(a) − (αDfa )(h) 35

is o(h).

By definition of scalar multiplication of mappings, this means showing that α · (f (a + h) − f (a) − Dfa (h))

is o(h).

(Here the dot denotes scalar-by-vector multiplication in Rm .) This is immediate from the hypothesis and the fact that o(h) is closed under scalar multiplication. 4.4.3. Prove part (2) of Lemma 4.4.4. Let a be a nonzero real number. Then a is an interior point of the domain R − {0} of the reciprocal function r, so the derivative Dra at least might exist. Compute that for all h small enough, 2 r(a + h) − r(a) + h = 1 − 1 + h = a − a(a + h) + h(a + h) 2 2 2 a a+h a a a (a + h) |h|2 |h| = 2 = 2 |h|. a |a + h| a |a + h| For any c > 0, the constant |h|/(a2 |a + h|) is less than c for all small enough h. Thus r(a + h) − r(a) + h/a2 is o(h). That is, the linear mapping T : R −→ R given by T (h) = −h/a2 satisfies the defining condition of Dra . Alternatively, since r is a function of one variable, we may quote from onevariable calculus that 1 r′ (a) = − 2 , a 6= 0. a (And a person can check that deriving the one-variable formula is little different from the work done here.) By the discussion in the chapter that relates the onevariable derivative to the new definition of the derivative in the one-variable case, it follows that the derivative of r at a in our new sense is T (h) = −h/a2 . 4.4.5. Let f (x, y, z) = xyz. Find Df(a,b,c) for arbitrary (a, b, c) ∈ R3 . Use the product rule, the fact that X, Y , and Z are linear, and the fact that the derivative of a linear map is itself: Df(a,b,c) = D((XY )Z)(a,b,c) = (XY )(a, b, c)DZ(a,b,c) + Z(a, b, c)D(XY )(a,b,c) = abZ + c(X(a, b, c)DY(a,b,c) + Y (a, b, c)DX(a,b,c) ) = abZ + c(aY + bX) = abZ + bcX + caY. That is, Df(a,b,c) (h, k, ℓ) = bch + cak + abℓ. 4.4.7. Recall that a function f : Rn × Rn −→ R is called bilinear if for all x, x′ , y, y ′ ∈ Rn and all α ∈ R, f (x + x′ , y) = f (x, y) + f (x′ , y), f (x, y + y ′ ) = f (x, y) + f (x, y ′ ), f (αx, y) = αf (x, y) = f (x, αy). 36

(a) Show that if f is bilinear then f(h, k) is o(h, k).
For any h, k not both 0_n, define ĥ = h/|(h, k)| and k̂ = k/|(h, k)|, viewing (h, k) as an element of R^{2n}. By bilinearity,

    f(h, k) = f(|(h, k)| ĥ, |(h, k)| k̂) = |(h, k)|^2 f(ĥ, k̂).

And so it suffices to bound f(ĥ, k̂). But

    ĥ = Σ_{i=1}^{n} h_i e_i  where each |h_i| ≤ 1,

and similarly for k̂ = Σ_j k_j e_j. Thus

    |f(ĥ, k̂)| = |f(Σ_i h_i e_i, Σ_j k_j e_j)|
              = |Σ_{i,j} h_i k_j f(e_i, e_j)|
              ≤ Σ_{i,j} |h_i| |k_j| |f(e_i, e_j)|
              ≤ Σ_{i,j} |f(e_i, e_j)|, call this C.

This completes the argument.
(b) Show that if f is bilinear then f is differentiable with Df_(a,b)(h, k) = f(a, k) + f(h, b).
Compute that by bilinearity and then by part (a),

    f(a + h, b + k) − f(a, b) − f(a, k) − f(h, b) = f(h, k) = o(h, k).

This gives exactly what we want.
(c) What does this exercise say about the inner product?
The inner product function is differentiable. Its derivative is

    D⟨,⟩_(a,b)(h, k) = ⟨a, k⟩ + ⟨h, b⟩.

4.5.3. Define f : R −→ R by

    f(x) = x^2 sin(1/x)  if x ≠ 0,
           0             if x = 0.

Show that f'(x) exists for all x but that f' is discontinuous at 0. Explain how this disproves the converse of Theorem 4.5.3.

For x ≠ 0, the methods of one-variable calculus give

    f'(x) = 2x sin(1/x) − cos(1/x),  x ≠ 0.

For x = 0, the quick solution is to recognize that exercise 4.2.8 applies here, showing that the linear mapping derivative of f at 0 (i.e., Df_0) is the zero mapping, and hence f'(0) is the 1-by-1 matrix with entry 0, i.e., f'(0) = 0. Alternatively, we can go back to the definition of the one-variable derivative:

    lim_{h→0} (f(0 + h) − f(0))/h = lim_{h→0} (h^2 sin(1/h) − 0)/h = lim_{h→0} h sin(1/h).

But h → 0 while sin(1/h) is bounded, so this limit is 0. That is, regardless of which method we use, f'(0) = 0. So f'(x) exists for all x.
But the limit

    lim_{x→0} f'(x) = lim_{x→0} (2x sin(1/x) − cos(1/x))

does not exist: both sin(1/x) and cos(1/x) oscillate faster and faster near x = 0, and although the first term 2x sin(1/x) is a dampened oscillation that goes to 0, the second term cos(1/x) oscillates without being dampened. So f' is discontinuous at 0.
The converse of Theorem 4.5.3 is the statement that if f is differentiable at a then all partial derivatives exist at and about a, and they are continuous at a. The example here (with n = m = 1) is differentiable at 0, and all partial derivatives (i.e., the derivative itself in the one-variable sense) exist at and about 0, but the partial derivative fails to be continuous at 0.

4.5.4. Discuss the derivatives of the following mappings at the following points.
(a) f(x, y) = (x^2 − y)/(y + 1) on {(x, y) ∈ R^2 : y ≠ −1} at generic (a, b) with b ≠ −1.
Each such (a, b) is an interior point of the domain of f. Since f is a rational function, its partial derivatives exist and are continuous at every point of its domain. Therefore Theorem 4.5.3 applies, and then Theorem 4.5.2. That is, we may simply compute the Jacobian matrix of partial derivatives of f at (a, b):

    f'(a, b) = [D_1 f(a, b), D_2 f(a, b)] = [2a/(b + 1), −(a^2 + 1)/(b + 1)^2].

The derivative is the corresponding linear map,

    Df_(a,b)(h, k) = (2a/(b + 1)) h − ((a^2 + 1)/(b + 1)^2) k.

(Compare the ease of this method to the far more laborious derivation of the same formula from first principles at the end of section 4.4.)
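The Jacobian just computed can be spot-checked against central difference quotients. A sketch (sample point and step size chosen ad hoc; not part of the solution):

```python
# Numerical spot-check of the Jacobian of f(x, y) = (x^2 - y)/(y + 1)
# at a sample point (a, b) with b != -1.
def f(x, y):
    return (x * x - y) / (y + 1.0)

def jacobian_closed_form(a, b):
    # the formula derived above: [2a/(b+1), -(a^2+1)/(b+1)^2]
    return (2 * a / (b + 1.0), -(a * a + 1.0) / (b + 1.0) ** 2)

a, b = 1.5, 0.5
eps = 1e-6
d1 = (f(a + eps, b) - f(a - eps, b)) / (2 * eps)  # central difference in x
d2 = (f(a, b + eps) - f(a, b - eps)) / (2 * eps)  # central difference in y
D1, D2 = jacobian_closed_form(a, b)
assert abs(d1 - D1) < 1e-6 and abs(d2 - D2) < 1e-6
```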

(b) f(x, y) = xy^2/(y − 1) on {(x, y) ∈ R^2 : y ≠ 1} at generic (a, b) with b ≠ 1.
We may reason similarly to part (a) and then proceed to the calculation of the Jacobian matrix,

    f'(a, b) = [D_1 f(a, b), D_2 f(a, b)] = [b^2/(b − 1), ab(b − 2)/(b − 1)^2].

Again the derivative is the corresponding linear map,

    Df_(a,b)(h, k) = (b^2/(b − 1)) h + (ab(b − 2)/(b − 1)^2) k.

(c) f(x, y) = xy/√(x^2 + y^2) if (x, y) ≠ (0, 0), and f(0, 0) = 0, at generic (a, b) ≠ (0, 0) and at (0, 0).
Away from (0, 0) we may proceed as in parts (a) and (b) since all partial derivatives exist around each such point and are continuous at each such point. The Jacobian matrix at a point (a, b) ≠ (0, 0) is

    f'(a, b) = [D_1 f(a, b), D_2 f(a, b)] = [b^3/(a^2 + b^2)^{3/2}, a^3/(a^2 + b^2)^{3/2}],

and so the derivative is

    Df_(a,b)(h, k) = (b^3/(a^2 + b^2)^{3/2}) h + (a^3/(a^2 + b^2)^{3/2}) k.

At (0, 0) we cannot quote general results from calculus since the function is defined casewise. But it is easy to compute that the partial derivatives at (0, 0) are both 0:

    D_1 f(0, 0) = lim_{t→0} (f(0 + t, 0) − f(0, 0))/t = lim_{t→0} (0 − 0)/t = lim_{t→0} 0 = 0,

and similarly D_2 f(0, 0) = 0. Therefore, by Theorem 4.5.2, the only possible candidate for Df_(0,0) is the zero map, and we need to check whether it works. That is, we need to study

    f(h, k) − f(0, 0) − 0 = f(h, k) = hk/√(h^2 + k^2)

to see whether it is o(h, k). Let (h, k) → (0, 0) along the 45-degree line, i.e., let h = k, and compute that

    |f(h, h)| = h^2/(√2 |h|) = (1/√2)|h| = (1/2)|(h, h)|.

This is not less than c|(h, h)| for c = 1/4. The counterexample shows that the quantity is not o(h, k) in general. Therefore Df_(0,0) does not exist.
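The failure along the 45-degree line is easy to see numerically. A sketch (not part of the proof): the ratio |f(h, h)| / |(h, h)| is identically 1/2, so it does not go to 0 as required for o(h, k).

```python
import math

# Sketch: along h = k, |f(h,h)| / |(h,h)| stays at 1/2, so f is not o(h,k).
def f(h, k):
    return h * k / math.hypot(h, k) if (h, k) != (0.0, 0.0) else 0.0

ratios = [abs(f(h, h)) / math.hypot(h, h) for h in (0.1, 0.01, 0.001)]
# each ratio is 1/2 up to roundoff, no matter how small h is
assert all(abs(r - 0.5) < 1e-9 for r in ratios)
```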

4.5.5. For what differentiable mappings f : A −→ R^m is f'(a) a diagonal matrix for all a ∈ A?
This depends a bit on what we mean by “diagonal matrix,” but the main idea is that we want all off-diagonal entries of f'(a) to be 0. For this to happen on the first row, whose entries are the partial derivatives of the first component function f_1 of f, in fact f_1 must be a function of x_1 alone. Similarly, the second component function f_2 of f must be a function of x_2 alone, and so on. That is,

    f(x) = (f_1(x_1), ..., f_m(x_m)).

If we further insist that a diagonal matrix must be square then we are requiring that m = n. If not, then when m > n the previous display says that the last component functions f_{n+1} through f_m must be constant functions.

4.5.7. Let w = F(xz, yz). Show that x·w_x + y·w_y = z·w_z.
Compute by the Chain Rule that

    w_x = D_1 F(xz, yz) · z + D_2 F(xz, yz) · 0 = z D_1 F(xz, yz),
    w_y = D_1 F(xz, yz) · 0 + D_2 F(xz, yz) · z = z D_2 F(xz, yz),
    w_z = D_1 F(xz, yz) · x + D_2 F(xz, yz) · y.

The result follows immediately.

4.5.9. The function f : R^2 −→ R is called homogeneous of degree k if f(tx, ty) = t^k f(x, y) for all scalars t and vectors (x, y). Show that such f satisfies the differential equation

    x f_1(x, y) + y f_2(x, y) = k f(x, y).

Since f(tx, ty) = t^k f(x, y), the partial derivatives of both sides with respect to t are equal,

    f_1(tx, ty) · x + f_2(tx, ty) · y = k t^{k−1} f(x, y).

Now set t = 1 to get the result.
Alternatively, one can let w = f(tx, ty) and then cite exercise 4.5.8 with t in place of z to get x w_x + y w_y = t w_t. But also w = t^k f(x, y), so that w_x = t^k f_1(x, y), w_y = t^k f_2(x, y), and w_t = k t^{k−1} f(x, y). Thus the previous display becomes

    x t^k f_1(x, y) + y t^k f_2(x, y) = k t^k f(x, y).

In particular this holds for t = 1, giving the result.

4.6.1. Let

    f(x, y) = xy(y^2 − x^2)/(x^2 + y^2)  if (x, y) ≠ (0, 0),
              0                          if (x, y) = (0, 0).

Away from (0, 0), f is rational, so it is continuous and all its partial derivatives of all orders exist and are continuous.
(a) Show that f is continuous at (0, 0).
The numerator of f away from (0, 0) is homogeneous of degree 4 in x and y while its denominator is |(x, y)|^2, so the Size Bounds show that |f(x, y)| is at most a constant multiple of |(x, y)|^2, which goes to 0 as (x, y) → (0, 0). Since f(0, 0) = 0, the desired continuity of f at (0, 0) is established.
(b) Show that D_1 f and D_2 f exist and are continuous at (0, 0).
Since f vanishes on the x-axis,

    D_1 f(0, 0) = lim_{h→0} (f(h, 0) − f(0, 0))/h = lim_{h→0} 0 = 0,

and similarly D_2 f(0, 0) = 0. Away from the origin, the first partial derivative of f is, by calculation,

    D_1 f(x, y) = (−x^4 y − 4x^2 y^3 + y^5)/(x^2 + y^2)^2,

and since f(x, y) = −f(y, x) the second partial derivative must be the same thing but with the roles of x and y exchanged and then the whole thing negated,

    D_2 f(x, y) = (−x^5 + 4x^3 y^2 + x y^4)/(x^2 + y^2)^2.

In both cases the numerator is homogeneous of degree 5 while the denominator is |(x, y)|^4, so the Size Bounds show that each partial derivative is at most a constant multiple of |(x, y)|. Thus lim_{(x,y)→(0,0)} D_i f(x, y) = 0 = D_i f(0, 0) for i = 1, 2, and the desired continuity of D_i f at (0, 0) for i = 1, 2 is established.
(c) Show that D_12 f(0, 0) = 1 ≠ −1 = D_21 f(0, 0).
Compute,

    D_12 f(0, 0) = lim_{k→0} (D_1 f(0, k) − D_1 f(0, 0))/k = lim_{k→0} (k^5/k^4 − 0)/k = 1,
    D_21 f(0, 0) = lim_{h→0} (D_2 f(h, 0) − D_2 f(0, 0))/h = lim_{h→0} (−h^5/h^4 − 0)/h = −1.
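The unequal mixed partials can also be exhibited with nested central differences. A sketch (step sizes chosen ad hoc; not part of the proof):

```python
# Sketch: nested central differences show D12 f(0,0) ~ 1 and D21 f(0,0) ~ -1.
def f(x, y):
    return x * y * (y * y - x * x) / (x * x + y * y) if (x, y) != (0.0, 0.0) else 0.0

def D1(x, y, t=1e-6):
    return (f(x + t, y) - f(x - t, y)) / (2 * t)

def D2(x, y, t=1e-6):
    return (f(x, y + t) - f(x, y - t)) / (2 * t)

k = 1e-2
D12 = (D1(0.0, k) - D1(0.0, -k)) / (2 * k)   # approximately  1
D21 = (D2(k, 0.0) - D2(-k, 0.0)) / (2 * k)   # approximately -1
assert abs(D12 - 1.0) < 1e-3 and abs(D21 + 1.0) < 1e-3
```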

4.6.2. Suppose u, as a function of x and y, satisfies the differential equation u_xx − u_yy = 0. Make the change of variables x = s + t, y = s − t. What corresponding differential equation does u satisfy when viewed as a function of s and t?
Compute that since x_s = 1 and y_s = 1,

    u_s = u_x x_s + u_y y_s = u_x + u_y,

and then, since also x_t = 1 and y_t = −1,

    u_st = u_xx x_t + u_xy y_t + u_yx x_t + u_yy y_t = u_xx − u_yy.

But we are given that this is 0, so the new differential equation is u_st = 0.

4.6.3. (a) Let c be a constant. Let x and t denote a space variable and a time variable, and introduce variables

    p = x + ct,  q = x − ct.

Show that a quantity w, viewed as a function of x and t, satisfies the wave equation, c^2 w_xx = w_tt, if and only if it satisfies the equation w_pq = 0.
Compute, freely using the relations p_x = 1, q_x = 1, p_t = c, q_t = −c,

    w_x = w_p + w_q,
    w_xx = w_pp + 2 w_pq + w_qq,
    w_t = c(w_p − w_q),
    w_tt = c^2 (w_pp − 2 w_pq + w_qq).

Thus c^2 w_xx − w_tt = 4c^2 w_pq, and the result follows.
(b) Using part (a), show that in particular if w = F(x + ct) + G(x − ct) then w satisfies the wave equation.
We are given that w = F(p) + G(q). Thus w_p = F'(p) and so w_pq = 0. By part (a), c^2 w_xx = w_tt as desired.
(c) Now let 0 < v < c, and define new space and time variables in terms of the original ones by a Lorentz transformation,

    y = γ(x − vt),  u = γ(t − (v/c^2)x)  where γ = (1 − v^2/c^2)^{−1/2}.

Show that

    y + cu = γ(1 − v/c)(x + ct),  y − cu = γ(1 + v/c)(x − ct),

so that consequently (y, u) has the same spacetime norm as (x, t),

    y^2 − c^2 u^2 = x^2 − c^2 t^2.

This is a matter of algebra,

    y + cu = γ((x − vt) + c(t − (v/c^2)x)) = γ(1 − v/c)(x + ct),
    y − cu = γ((x − vt) − c(t − (v/c^2)x)) = γ(1 + v/c)(x − ct),

and so

    y^2 − c^2 u^2 = (y + cu)(y − cu) = γ^2 (1 − v^2/c^2)(x + ct)(x − ct) = x^2 − c^2 t^2.

(d) Recall the variables p = x + ct and q = x − ct from part (a). Similarly, let r = y + cu and s = y − cu. Suppose that a quantity w, viewed as a function of p and q, satisfies the wave equation w_pq = 0. Use the results r = γ(1 − v/c)p, s = γ(1 + v/c)q from part (c) to show that it also satisfies the wave equation in the (r, s)-coordinate system, w_rs = 0. Consequently, if w satisfies the wave equation c^2 w_xx = w_tt in the original space and time variables then it also satisfies the wave equation c^2 w_yy = w_uu in the new space and time variables.
Since r_p = γ(1 − v/c), r_q = 0, s_p = 0, and s_q = γ(1 + v/c), the Chain Rule gives

    w_p = γ(1 − v/c) w_r,

    w_pq = γ^2 (1 − v/c)(1 + v/c) w_rs = w_rs.

Thus altogether,

    c^2 w_xx = w_tt  ⟺  w_pq = 0  ⟺  w_rs = 0  ⟺  c^2 w_yy = w_uu.

4.6.4. Show that the substitution x = e^s, y = e^t converts the equation x^2 u_xx + y^2 u_yy + x u_x + y u_y = 0 into Laplace's equation u_ss + u_tt = 0.
Compute that since x_s = x and y is independent of s, the chain rule simplifies to u_s = x u_x, so that by the product rule and then the chain rule again,

    u_ss = (x u_x)_s = x_s u_x + x (u_x)_s = x u_x + x^2 u_xx.

Symmetrical calculations with t and y in place of s and x give

    u_t = y u_y,  u_tt = y u_y + y^2 u_yy.

Thus the given equation is precisely u_ss + u_tt = 0.

4.6.5. Show that the substitution u = x^2 − y^2, v = 2xy converts Laplace's equation w_xx + w_yy = 0 into Laplace's equation w_uu + w_vv = 0.
The substitution is the complex squaring map, which squares the radius and doubles the angle of a point. Thus if we take x = r cos θ, y = r sin θ and u = ρ cos φ, v = ρ sin φ then the substitution becomes

    ρ = r^2,  φ = 2θ.

Now compute

    w_r = w_ρ ρ_r = 2 w_ρ r,
    w_rr = (2 w_ρ r)_r = 2 w_ρr r + 2 w_ρ = 4 w_ρρ r^2 + 2 w_ρ,
    w_θ = w_φ φ_θ = 2 w_φ,
    w_θθ = 2 w_φθ = 2 w_φφ φ_θ = 4 w_φφ.

Thus r^2 w_rr = 4 r^4 w_ρρ + 2 r^2 w_ρ = 4 ρ^2 w_ρρ + 2 ρ w_ρ, and r w_r = 2 w_ρ r^2 = 2 ρ w_ρ, and w_θθ = 4 w_φφ. Recalling that in polar coordinates Laplace's equation takes the form r^2 w_rr + r w_r + w_θθ = 0 (after multiplying through by r^2), this form indeed is preserved under the transformation:

    ρ^2 w_ρρ + ρ w_ρ + w_φφ = 0.

4.7.1. Compute the best quadratic approximation to f(x, y) = e^x cos y at the point (0, 0),

    f(h, k) ≈ f(0, 0) + Df_(0,0)(h, k) + (1/2) Qf_(0,0)(h, k).

Note that f(0, 0) = 1. The Jacobian matrix is f'(a, b) = [e^a cos b, −e^a sin b], so f'(0, 0) = [1, 0] and thus Df_(0,0)(h, k) = h. The second derivative matrix at general (a, b) and then at (0, 0) is

    f''(a, b) = [[e^a cos b, −e^a sin b], [−e^a sin b, −e^a cos b]],  f''(0, 0) = [[1, 0], [0, −1]],

and thus

    (1/2) Qf_(0,0)(h, k) = (1/2)(h^2 − k^2).

In sum, the quadratic approximation is

    f(h, k) ≈ 1 + h + (1/2)(h^2 − k^2).
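The quality of this approximation can be checked numerically; the error should be of cubic order in the step. A sketch (sample steps chosen ad hoc; not part of the solution):

```python
import math

# Sketch: the quadratic approximation of e^x cos y near (0,0) has cubic-order error.
def f(x, y):
    return math.exp(x) * math.cos(y)

def quad(h, k):
    return 1 + h + 0.5 * (h * h - k * k)

err1 = abs(f(0.1, 0.1) - quad(0.1, 0.1))      # step 0.1
err2 = abs(f(0.01, 0.01) - quad(0.01, 0.01))  # step 0.01: error ~ 1000x smaller
assert err1 < 1e-3 and err2 < 1e-6
```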

4.7.2. Compute the best quadratic approximation to f(x, y) = e^{x+2y} at the point (0, 0).
Note that f(0, 0) = 1. The Jacobian matrix is f'(a, b) = [e^{a+2b}, 2e^{a+2b}], so f'(0, 0) = [1, 2] and thus Df_(0,0)(h, k) = h + 2k. The second derivative matrix at general (a, b) and then at (0, 0) is

    f''(a, b) = [[e^{a+2b}, 2e^{a+2b}], [2e^{a+2b}, 4e^{a+2b}]],  f''(0, 0) = [[1, 2], [2, 4]],

and thus

    (1/2) Qf_(0,0)(h, k) = (1/2)(h^2 + 4hk + 4k^2).

In sum, the quadratic approximation is

    f(h, k) ≈ 1 + h + 2k + (1/2) h^2 + 2hk + 2k^2.

4.7.3. Give a heuristic explanation, making whatever reasonable assumptions seem to be helpful, of why the n-dimensional conceptual analogue of figure 4.5 should have 3^n “pictures.” How does this relate to figure 4.7?
In each of the n independent variable directions, the quadratic approximation can bend up, bend down, or be flat, giving three possibilities. Since the behavior in each direction is independent, the total number of possibilities is 3^n, as desired. In figure 4.7 we have n = 1, and indeed the quadratic approximations are an up-parabola, a down-parabola, and a horizontal line. The tacit geometric assumption here is that the coordinates can be chosen to make this argument reasonable; that is, the assumption is that we can pretend that the second derivative matrix has zeros off the diagonal.

4.7.4. Find the extreme values taken by f(x, y) = xy(4x^2 + y^2 − 16) on the quarter ellipse

    E = {(x, y) ∈ R^2 : x ≥ 0, y ≥ 0, 4x^2 + y^2 ≤ 16}.

Note that f(x, y) = 0 on the boundary of the quarter ellipse and f(x, y) < 0 at interior points (x, y). Thus f takes its maximum value, 0, all along the boundary. Since the quarter ellipse is compact and since f is continuous, f also takes a minimum, and this necessarily happens at some interior point, where the partial derivatives are necessarily 0. Compute that the partial derivatives of f are

    f_1(x, y) = y(4x^2 + y^2 − 16) + 8x^2 y = y(12x^2 + y^2 − 16),
    f_2(x, y) = x(4x^2 + y^2 − 16) + 2xy^2 = x(4x^2 + 3y^2 − 16).

Since we are in the interior of the quarter ellipse, we have x > 0 and y > 0, so the condition that the partial derivatives both vanish is unaffected by dividing out y and x,

    12x^2 + y^2 − 16 = 0,
    4x^2 + 3y^2 − 16 = 0.

This is a linear system in x^2 and y^2,

    [[12, 1], [4, 3]] [x^2; y^2] = [16; 16].

The techniques of section 3.4 show that the solution is x^2 = 1, y^2 = 4, and so, since we are in the quarter ellipse, x = 1, y = 2. Thus the minimum value taken by f is f(1, 2) = 2(4 + 4 − 16) = −16.
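The little linear system and the resulting minimum are easy to verify directly. A sketch (using hand-coded Cramer's rule; not part of the solution):

```python
# Sketch: solve 12X + Y = 16, 4X + 3Y = 16 with X = x^2, Y = y^2, then
# evaluate f at the interior critical point.
def fval(x, y):
    return x * y * (4 * x * x + y * y - 16)

det = 12 * 3 - 1 * 4                  # determinant of [[12, 1], [4, 3]]
X = (16 * 3 - 1 * 16) / det           # Cramer's rule for X = x^2
Y = (12 * 16 - 16 * 4) / det          # Cramer's rule for Y = y^2
assert (X, Y) == (1.0, 4.0)

x, y = X ** 0.5, Y ** 0.5             # x, y > 0 in the quarter ellipse
assert fval(x, y) == -16.0
```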

4.7.5. Find the local extrema of the function f(x, y) = x^2 + xy − 4x + (3/2) y^2 − 7y on R^2.
The function is differentiable everywhere, so we are seeking its critical points. The partial derivatives are

    f_1(x, y) = 2x + y − 4,
    f_2(x, y) = x + 3y − 7.

In matrix form, the condition that both partial derivatives vanish is

    [[2, 1], [1, 3]] [x; y] = [4; 7].

The only solution is (x, y) = (1, 2). The second derivative matrix of f is

    [[2, 1], [1, 3]],

and so by Proposition 4.7.8, f(1, 2) = −9 is a local minimum. (As |(x, y)| → ∞, f(x, y) is dominated by its quadratic part x^2 + xy + (3/2) y^2 = (x + (1/2) y)^2 + (5/4) y^2, which goes to +∞. So f has no local maxima, and f(1, 2) = −9 is the global minimum.)

4.7.7. Find the critical points. Are they maxima, minima, or saddle points?
For f(x, y) = x^2 y + x y^2, compute that the partial derivatives are

    f_1(x, y) = 2xy + y^2 = (2x + y) y,
    f_2(x, y) = x^2 + 2xy = x(x + 2y).

In particular, f_1(x, 0) = 0 for all x, while f_2(x, 0) = x^2, so (0, 0) is the only critical point with y = 0. Similarly, f_2(0, y) = 0 while f_1(0, y) = y^2, so also (0, 0) is the only critical point with x = 0. All other critical points (x, y) have x ≠ 0 and y ≠ 0. At such a point the conditions are therefore

    2x + y = 0,  x + 2y = 0.

We can solve this system of equations using the techniques of chapter 3, or we can just do it by hand. In any case, the unique solution is (x, y) = (0, 0), which has been disallowed since we are assuming x ≠ 0 and y ≠ 0. That is, the critical point (0, 0) found before is the only critical point. The second derivative matrix of f at a general point and then at (0, 0) is

    f''(x, y) = [[2y, 2x + 2y], [2x + 2y, 2x]],  f''(0, 0) = [[0, 0], [0, 0]].

That is, the best quadratic approximation to f near (0, 0) is the zero function. (This is unsurprising since all terms of f are cubic.) Therefore the max/min test tells us nothing. However, we note that f(x, y) = xy(x + y), so that f changes its sign each time its input (x, y) crosses an axis or crosses the line y = −x of slope −1. That is, f has a sort of saddle point at (0, 0), but with three up-directions and three down-directions. This is more complicated behavior than the usual two up-directions and two down-directions of an ordinary saddle point.

For g(x, y) = e^{x+y}, compute that the partial derivatives are

    g_1(x, y) = e^{x+y},  g_2(x, y) = e^{x+y}.

Neither of these ever vanishes, so g has no critical points.
For h(x, y) = x^5 y + x y^5 + xy, compute that the partial derivatives are

    h_1(x, y) = 5x^4 y + y^5 + y,
    h_2(x, y) = x^5 + 5x y^4 + x.

In particular, h_1(x, 0) = 0 for all x, while h_2(x, 0) = x^5 + x = x(x^4 + 1), so since x^4 + 1 is always positive, (0, 0) is the only critical point with y = 0. Similarly, h_2(0, y) = 0 while h_1(0, y) = y^5 + y = y(y^4 + 1), and so (0, 0) is the only critical point with x = 0. All other critical points (x, y) have x ≠ 0 and y ≠ 0. At such a point the conditions are therefore

    5x^4 + y^4 = −1,  x^4 + 5y^4 = −1.

These have no solution, and so the critical point (0, 0) is the only critical point. The second derivative matrix of h at a general point and then at (0, 0) is

    h''(x, y) = [[20x^3 y, 5x^4 + 5y^4 + 1], [5x^4 + 5y^4 + 1, 20x y^3]],  h''(0, 0) = [[0, 1], [1, 0]].

That is, the best quadratic approximation to h near (0, 0) is the function

    (1/2) Qh_(0,0)(x, y) = xy.

This is unsurprising since all terms of h except its “xy” have degree higher than 2. The max/min test tells us that h has a saddle point at (0, 0).

4.8.2. Let g(x, y, z) = xyz, and let d be the unit vector in the direction from (1, 2, 3) to (3, 1, 5). Find D_d g(1, 2, 3).
First note that although (3, 1, 5) − (1, 2, 3) = (2, −1, 2), this vector is not yet d because it has not been normalized to have unit length. In fact its length is 3, and so

    d = (1/3)(2, −1, 2).

Note that g has continuous partial derivatives everywhere, so in particular Theorem 4.5.3 guarantees that g is differentiable at (1, 2, 3). Now finding the directional derivative is easy by the general formula D_d f(a) = ⟨∇f(a), d⟩ of Theorem 4.8.2:

    D_d g(1, 2, 3) = ⟨(yz, zx, xy)|_(1,2,3), (1/3)(2, −1, 2)⟩
                   = (1/3)⟨(6, 3, 2), (2, −1, 2)⟩
                   = (1/3)(12 − 3 + 4)
                   = 13/3.
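The value 13/3 can be confirmed with a difference quotient along d. A sketch (step size chosen ad hoc; not part of the solution):

```python
# Sketch: central difference along d confirms D_d g(1,2,3) = 13/3 for g = xyz.
def g(x, y, z):
    return x * y * z

d = (2 / 3, -1 / 3, 2 / 3)   # (2, -1, 2) normalized to unit length
t = 1e-7
p = (1 + t * d[0], 2 + t * d[1], 3 + t * d[2])
q = (1 - t * d[0], 2 - t * d[1], 3 - t * d[2])
quotient = (g(*p) - g(*q)) / (2 * t)
assert abs(quotient - 13 / 3) < 1e-5
```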

4.8.5. Show that if f : Rn −→ R and g : Rn −→ R are differentiable then so is their product f g : Rn −→ R and ∇(f g) = f ∇g + g∇f . This is simply a restatement of the Product Rule (see Proposition 4.4.5) in coordinates, since ∇f = f ′ for scalar-valued functions f .

4.8.6. Find the tangent plane to the surface {(x, y, z) : x^2 + 2y^2 + 3zx − 10 = 0} in R^3 at the point (1, 2, 1/3).
Let f(x, y, z) = x^2 + 2y^2 + 3zx. The surface in question is the level set where f = 10, which indeed contains the point (1, 2, 1/3). Compute the gradient of this function in general and then at (1, 2, 1/3),

    (∇f)(x, y, z) = (2x + 3z, 4y, 3x),  (∇f)(1, 2, 1/3) = (3, 8, 3).

The gradient is normal to the tangent plane, so by the equation for a plane in section 3.8, the tangent plane has equation

    3(x − 1) + 8(y − 2) + 3(z − 1/3) = 0,  or  3x + 8y + 3z = 20.

4.8.8. (a) Let A and α be nonzero constants. Solve the one-variable differential equation

    z'(t) = Aα e^{αz(t)},  z(0) = 0.

The differential equation gives

    e^{−αz(t)} z'(t) = Aα  for t ≥ 0 where the system is sensible.

Integrate,

    ∫_{τ=0}^{t} e^{−αz(τ)} z'(τ) dτ = Aαt,

to get

    (1/α)(1 − e^{−αz(t)}) = Aαt,

and thus

    z(t) = −(1/α) log(1 − α^2 A t).

One should check this: certainly z(0) = 0; also z'(t) = (−1/α)(−α^2 A/(1 − α^2 A t)) = αA/(1 − α^2 A t), while Aα e^{αz(t)} = αA/(1 − α^2 A t) as well.
(b) The pheromone concentration in the plane is given by f(x, y) = e^{2x} + 4e^y. What path does a bug take, starting from the origin?
We have f(x, y) = e^{2x} + 4e^y, and so

    ∇f(x, y) = [2e^{2x}, 4e^y].

Thus we need to solve

    x'(t) = 2e^{2x(t)},  x(0) = 0

and

    y'(t) = 4e^{y(t)},  y(0) = 0.

Using part (a) twice (for x with α = 2, A = 1, and for y with α = 1, A = 4),

    (x(t), y(t)) = (−(1/2) log(1 − 4t), −log(1 − 4t)),  0 ≤ t < 1/4.
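This gradient flow can also be integrated numerically as a sanity check. A sketch (forward Euler, with step size and horizon chosen ad hoc; not part of the solution):

```python
import math

# Sketch: forward-Euler integration of the gradient flow of e^{2x} + 4e^y,
# compared to the exact solution at t = 0.1 < 1/4.
def grad(x, y):
    return (2 * math.exp(2 * x), 4 * math.exp(y))

x, y = 0.0, 0.0
dt = 1e-5
for _ in range(10000):          # integrate up to t = 0.1
    gx, gy = grad(x, y)
    x, y = x + dt * gx, y + dt * gy

assert abs(x - (-0.5) * math.log(0.6)) < 1e-3   # exact x(0.1)
assert abs(y - (-math.log(0.6))) < 1e-3         # exact y(0.1)
assert abs(y - 2 * x) < 1e-3                    # the path stays on y = 2x
```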

The integral curve is the ray y = 2x in the first quadrant, with the bug running off to infinity in one quarter unit of time.

4.8.9. Sketch some level sets and integral curves for the function f(x, y) = xy.
The level sets are hyperbolas having the two coordinate axes as their asymptotes. An example in the notes suggests that the integral curves should be a second family of hyperbolas, orthogonal to the first, having the lines of slopes ±1 as their asymptotes. The general differential equation and initial condition are

    γ'(t) = (∇f)(γ(t)),  γ(0) = (a, b).

In this case, where f(x, y) = xy, writing γ(t) = (x(t), y(t)), these work out to

    x'(t) = y(t),  x(0) = a,
    y'(t) = x(t),  y(0) = b.

Add and subtract these equations to get

    (x + y)'(t) = (x + y)(t),   (x + y)(0) = a + b,
    (x − y)'(t) = −(x − y)(t),  (x − y)(0) = a − b.

Thus

    (x + y)(t) = (a + b)e^t,  (x − y)(t) = (a − b)e^{−t},

so that

    (x(t), y(t)) = (((a + b)/2) e^t + ((a − b)/2) e^{−t}, ((a + b)/2) e^t − ((a − b)/2) e^{−t}).

The solution is more naturally expressed using the hyperbolic functions, defined as

    cosh(t) = (e^t + e^{−t})/2,  sinh(t) = (e^t − e^{−t})/2.

These satisfy the relation cosh^2 − sinh^2 = 1, in analogy with the circular functions cos and sin. In terms of the hyperbolic functions, the solution is

    (x(t), y(t)) = (a cosh(t) + b sinh(t), b cosh(t) + a sinh(t)).

A little algebra shows that

    x^2 − y^2 = a^2 − b^2,

and so the integral curves are hyperbolas, as expected.
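The conserved quantity x^2 − y^2 = a^2 − b^2 along the integral curves can be checked numerically. A sketch (sample start point and times chosen arbitrarily; not part of the solution):

```python
import math

# Sketch: the integral curve through (a, b) conserves x^2 - y^2 = a^2 - b^2.
def curve(a, b, t):
    return (a * math.cosh(t) + b * math.sinh(t),
            b * math.cosh(t) + a * math.sinh(t))

a, b = 3.0, 1.0
for t in (0.0, 0.5, 1.0, 2.0):
    x, y = curve(a, b, t)
    assert abs((x * x - y * y) - (a * a - b * b)) < 1e-9
```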

Another method is to proceed from the differential equation

    (x'(t), y'(t)) = (y(t), x(t)),  (x(0), y(0)) = (a, b)

by noting that x''(t) = y'(t) = x(t) has solutions x(t) = c e^t + d e^{−t}, and similarly for y(t). From here one can fiddle around with constants and find the same solution as before.

4.8.10. Define f : R^2 −→ R by

    f(x, y) = x^2 y/(x^2 + y^2)  if (x, y) ≠ (0, 0),
              0                  if (x, y) = (0, 0).

(a) Show that f is continuous at (0, 0).
We need to check that

    lim_{(x,y)→(0,0)} x^2 y/(x^2 + y^2) = 0.

The verification is a routine application of the Size Bounds since |x^2 y| ≤ |(x, y)|^3 and x^2 + y^2 = |(x, y)|^2.
(b) Find the partial derivatives D_1 f(0, 0) and D_2 f(0, 0).
Since f is 0 on both axes, a routine calculation shows that both partial derivatives at the origin are 0.
(c) Let d be any unit vector in R^2 (thus d takes the form d = (cos θ, sin θ) for some θ ∈ R). Show that D_d f(0, 0) exists by finding it.
Compute,

    lim_{t→0} (f(td) − f(0))/t = lim_{t→0} (t^2 cos^2 θ · t sin θ / t^2)/t = lim_{t→0} cos^2 θ sin θ = cos^2 θ sin θ.

Thus D_d f(0, 0) = cos^2 θ sin θ.
(d) Show that in spite of (c), f is not differentiable at (0, 0). Thus, the existence of every directional derivative at a point is not sufficient for differentiability at the point.
If f were differentiable at (0, 0) then by Theorem 4.8.2 we would have for all θ,

    D_(cos θ, sin θ) f(0, 0) = ⟨f'(0, 0), (cos θ, sin θ)⟩.

But we would also have f'(0, 0) = [0 0] by part (b) and Theorem 4.5.2, and we know that D_(cos θ, sin θ) f(0, 0) = cos^2 θ sin θ by part (c). So we would have for all θ,

    cos^2 θ sin θ = 0.

But the last displayed equality fails for many values of θ. Thus f is not differentiable at (0, 0).
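The obstruction is visible numerically: the directional derivatives exist and match cos^2(θ) sin(θ), which is not a linear function of the direction, so no single linear map could produce them all. A sketch (sample angles chosen ad hoc; not part of the proof):

```python
import math

# Sketch: finite-difference directional derivatives of f at the origin match
# cos^2(theta) sin(theta), yet both partials are 0, so f cannot be differentiable.
def f(x, y):
    return x * x * y / (x * x + y * y) if (x, y) != (0.0, 0.0) else 0.0

def directional(theta, t=1e-8):
    d = (math.cos(theta), math.sin(theta))
    return (f(t * d[0], t * d[1]) - f(0.0, 0.0)) / t

for theta in (0.3, 1.0, 2.0):
    expected = math.cos(theta) ** 2 * math.sin(theta)
    assert abs(directional(theta) - expected) < 1e-6

# If Df_(0,0) existed it would be the zero map, forcing every directional
# derivative to be 0, but at theta = pi/4 the value is sqrt(2)/4 != 0:
assert abs(directional(math.pi / 4)) > 0.3
```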

50

Chapter 5

5.1.1. Let x ∈ B(a, ε) and let δ = ε − |x − a|. Explain why δ > 0 and why B(x, δ) ⊂ B(a, ε).
Since x ∈ B(a, ε) we have |x − a| < ε. That is, ε − |x − a| > 0, i.e., δ > 0. Now take any y ∈ B(x, δ). Then |y − x| < δ, and so

    |y − a| = |y − x + x − a| ≤ |y − x| + |x − a| < δ + |x − a| = ε.

Since this argument applies to each y ∈ B(x, δ), it shows that B(x, δ) ⊂ B(a, ε) as desired.

5.1.5. Define f : R −→ R by f(x) = x^3 − 3x. Compute f(−1/2). Find f^{−1}((0, 11/8)), f^{−1}((0, 2)), f^{−1}((−∞, −11/8) ∪ (11/8, ∞)). Does f^{−1} exist?
Algebra and one-variable calculus show that

    f(x) = x(x − √3)(x + √3),  f'(x) = 3(x − 1)(x + 1).

The local extrema of f are thus f(1) = −2 and f(−1) = 2. So the graph of f is clear. Compute that f(−1/2) = −1/8 + 3/2 = 11/8. Thus the polynomial f(x) − 11/8 = x^3 − 3x − 11/8 must be divisible by x + 1/2. And indeed,

    x^3 − 3x − 11/8 = (x + 1/2)(x^2 − x/2 − 11/4).

This further factors as

    f(x) − 11/8 = (x + 1/2)(x − r_1)(x − r_2),  r_1, r_2 = (1 ∓ 3√5)/4.

Similarly, f(x) − 2 = (x + 1)^2 (x − 2). It follows that

    f^{−1}((0, 11/8)) = (−√3, r_1) ∪ (−1/2, 0) ∪ (√3, r_2),
    f^{−1}((0, 2)) = (−√3, −1) ∪ (−1, 0) ∪ (√3, 2)  (the point −1 is excluded since f(−1) = 2),
    f^{−1}((−∞, −11/8) ∪ (11/8, ∞)) = (−∞, −r_2) ∪ (r_1, −1/2) ∪ (1/2, −r_1) ∪ (r_2, ∞).

There is no inverse function f^{−1} since, for example, f takes the three inputs 0, ±√3 to the common output 0.
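The factorization and the roots r_1, r_2 are quick to verify numerically. A sketch (not part of the solution):

```python
# Sketch: -1/2, r1, r2 are exactly the solutions of f(x) = 11/8 for f(x) = x^3 - 3x.
def f(x):
    return x ** 3 - 3 * x

r1 = (1 - 3 * 5 ** 0.5) / 4
r2 = (1 + 3 * 5 ** 0.5) / 4
for r in (-0.5, r1, r2):
    assert abs(f(r) - 11 / 8) < 1e-9

# Vieta's formulas for x^2 - x/2 - 11/4: sum of roots 1/2, product -11/4
assert abs(r1 + r2 - 0.5) < 1e-9 and abs(r1 * r2 + 11 / 4) < 1e-9
```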

5.1.9. Let a and b be real numbers with a < b. Let n > 1, and suppose that the mapping f : [a, b] −→ R^n is continuous and that f is differentiable on the open interval (a, b). It is tempting to generalize the Mean Value Theorem (Theorem 1.2.3) to the assertion

    “f(b) − f(a) = f'(c)(b − a) for some c ∈ (a, b).”

This assertion is grammatically meaningful, since it posits an equality between two n-vectors. However, the assertion is false.
(a) Let f : [0, 2π] −→ R^2 be f(t) = (cos t, sin t). Show that the assertion fails for this f. Describe the situation geometrically.
The assertion becomes “0_2 = 2π(−sin c, cos c) for some c ∈ (0, 2π),” and this is visibly false. Geometrically, the impossibility is that the endpoint-minus-startpoint vector of the circle (the zero vector) is not 2π times any tangent vector of the circle (none of which is zero).
(b) Let f : [0, 2π] −→ R^3 be f(t) = (cos t, sin t, t). Show that the assertion fails for this f. Describe the situation geometrically.
The assertion becomes “(0, 0, 2π) = 2π(−sin c, cos c, 1) for some c ∈ (0, 2π),” and this is visibly false. Geometrically, the impossibility is that the endpoint-minus-startpoint vector of the helix (a purely vertical vector) is not 2π times any tangent vector of the helix (none of which is vertical).
(c) Here is an attempt to prove the assertion: Let f = (f_1, ..., f_n). Since each f_i is scalar-valued, we have for i = 1, ..., n by the Mean Value Theorem,

    f_i(b) − f_i(a) = f_i'(c)(b − a) for some c ∈ (a, b).

Assembling the scalar results gives the desired vector result. What is the error here?
The error is the tacit assumption that the same c works simultaneously for all the coordinates.

5.2.1. Define f : R^2 −→ R^2 by f(x, y) = (x^3 + 2xy + y^2, x^2 + y). Is f locally invertible at (1, 1)? If so, what is the best affine approximation to the inverse near f(1, 1)?
Note that f has a continuous derivative. Compute that f(1, 1) = (4, 2), and that

    f'(x, y) = [[3x^2 + 2y, 2x + 2y], [2x, 1]],  f'(1, 1) = [[5, 4], [2, 1]].

This matrix is invertible, so yes, f is locally invertible at (1, 1). The affine approximation to the inverse mapping is

    g(4 + h, 2 + k) ≈ (1, 1) − (1/3)[[1, −4], [−2, 5]][h; k] = (1, 1) − (1/3)(h − 4k, −2h + 5k).

That is,

    g(4 + h, 2 + k) ≈ (1 − (1/3)h + (4/3)k, 1 + (2/3)h − (5/3)k).
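The affine approximation to the inverse can be sanity-checked by composing it with f: the composition should return (4 + h, 2 + k) up to second-order error. A sketch (sample increments chosen ad hoc; not part of the solution):

```python
# Sketch: f composed with the affine approximation of its local inverse
# reproduces (4+h, 2+k) up to quadratic error.
def f(x, y):
    return (x ** 3 + 2 * x * y + y * y, x * x + y)

def g_approx(h, k):
    # the affine approximation derived above
    return (1 - h / 3 + 4 * k / 3, 1 + 2 * h / 3 - 5 * k / 3)

h, k = 1e-3, -2e-3
u, v = f(*g_approx(h, k))
assert abs(u - (4 + h)) < 1e-4 and abs(v - (2 + k)) < 1e-4
```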

5.2.2. Same question for f(x, y) = (x^2 − y^2, 2xy) at (2, 1).
Again the function has a continuous derivative. Compute that f(2, 1) = (3, 4), and that

    f'(x, y) = [[2x, −2y], [2y, 2x]],  f'(2, 1) = [[4, −2], [2, 4]].

This matrix is invertible, so yes, f is locally invertible at (2, 1). The affine approximation to the inverse mapping is

    g(3 + h, 4 + k) ≈ (2, 1) + (1/20)[[4, 2], [−2, 4]][h; k] = (2, 1) + (1/20)(4h + 2k, −2h + 4k).

That is,

    g(3 + h, 4 + k) ≈ (2 + (1/5)h + (1/10)k, 1 − (1/10)h + (1/5)k).

(The function in this problem is the complex square, so we are computing a local affine approximation to the complex square root.)

5.2.4. Same question for C(ρ, θ, φ) = (ρ cos θ sin φ, ρ sin θ sin φ, ρ cos φ) at (1, 0, π/2).
The function has a continuous derivative. Compute that C(1, 0, π/2) = (1, 0, 0), and that

    C'(ρ, θ, φ) = [[cos θ sin φ, −ρ sin θ sin φ, ρ cos θ cos φ],
                   [sin θ sin φ, ρ cos θ sin φ, ρ sin θ cos φ],
                   [cos φ, 0, −ρ sin φ]],

so that

    C'(1, 0, π/2) = [[1, 0, 0], [0, 1, 0], [0, 0, −1]].

This matrix is invertible (in fact it is its own inverse), so yes, C is locally invertible at (1, 0, π/2). The affine approximation to the inverse mapping is

    g(1 + h, k, ℓ) ≈ (1 + h, k, π/2 − ℓ).

(The function in this problem is the spherical coordinate map.)

5.2.6. Define f : R^2 −→ R^2 by f(x, y) = (e^x cos y, e^x sin y). Show that f is locally invertible at each point (a, b) ∈ R^2, but that f is not globally invertible. Let (a, b) = (0, π/3); let (c, d) = f(a, b); let g be the local inverse to f near (a, b). Find an explicit formula for g, compute g'(c, d), and verify that it agrees with f'(a, b)^{−1}.
Compute that

    f'(a, b) = [[e^a cos b, −e^a sin b], [e^a sin b, e^a cos b]].

This has determinant e^{2a} (not e^a), which is always nonzero. The local invertibility of f follows from the Inverse Function Theorem. But since f(x, y + 2kπ) = f(x, y) for all integers k, f is not globally invertible: it is periodic in the y-direction.
If (a, b) = (0, π/3) then (c, d) = f(a, b) = (1/2, √3/2). To find the precise local inverse of f near (a, b), compute

    z = e^x cos y and w = e^x sin y
      ⟹ z^2 + w^2 = e^{2x} and w/z = tan y
      ⟹ x = (1/2) ln(z^2 + w^2) and y = arctan(w/z).

This calculation is sloppy in its wanton divide-by-z and casual in its reference to the inverse tangent function. Since z is near 1/2, we don't have to worry about dividing by it; but to get the right inverse function we do need to specify that we are using the branch of arctan that takes √3 to π/3. All of this said, the local inverse is

    g(z, w) = ((1/2) ln(z^2 + w^2), arctan(w/z)).

Now we can return to our computations. On the one hand,

    f'(0, π/3) = [[1/2, −√3/2], [√3/2, 1/2]],

which has inverse

    f'(0, π/3)^{−1} = [[1/2, √3/2], [−√3/2, 1/2]].

But on the other hand, the derivative matrix of g works out to

    g'(z, w) = (1/(z^2 + w^2)) [[z, w], [−w, z]],

so that in particular

    g'(1/2, √3/2) = [[1/2, √3/2], [−√3/2, 1/2]] = f'(0, π/3)^{−1}.
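The explicit inverse can be double-checked by composition. A sketch (sample points near (0, π/3) chosen ad hoc; not part of the solution), using the principal arctan branch, which does take √3 to π/3:

```python
import math

# Sketch: g(f(x, y)) reproduces (x, y) for (x, y) near (0, pi/3).
def f(x, y):
    return (math.exp(x) * math.cos(y), math.exp(x) * math.sin(y))

def g(z, w):
    # math.atan is the principal branch, valid here since z > 0 near 1/2
    return (0.5 * math.log(z * z + w * w), math.atan(w / z))

for (x, y) in [(0.0, math.pi / 3), (0.1, 1.0), (-0.2, 1.2)]:
    gx, gy = g(*f(x, y))
    assert abs(gx - x) < 1e-12 and abs(gy - y) < 1e-12
```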

This is the desired result.

5.2.8. Define f : R −→ R by

    f(x) = x + 2x^2 sin(1/x)  if x ≠ 0,
           0                  if x = 0.

(a) Show that f is differentiable at x = 0 and that f'(0) ≠ 0.
Compute, using the casewise definition of f,

    f'(0) = lim_{h→0} (f(h) − f(0))/h = lim_{h→0} (1 + 2h sin(1/h)) = 1.

(b) Despite the result from (a), show that f is not locally invertible at x = 0. Why doesn't this contradict the Inverse Function Theorem?
For x ≠ 0 we have

    f'(x) = 1 + 4x sin(1/x) − 2 cos(1/x).

If 1/x is a large even multiple of π then f'(x) = 1 + 0 − 2 = −1, but if 1/x is a large odd multiple of π then f'(x) = 1 + 0 + 2 = 3. Thus f alternates between climbing and falling no matter how close x gets to 0, and so f has no local inverse at 0. This result does not contradict the Inverse Function Theorem because f' is discontinuous at 0; indeed, lim_{x→0} f'(x) does not exist.
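The alternation of slopes is easy to see numerically. A sketch (sample multiples of π chosen ad hoc; not part of the solution):

```python
import math

# Sketch: near 0, f'(x) = 1 + 4x sin(1/x) - 2 cos(1/x) swings between
# about -1 (1/x an even multiple of pi) and about 3 (an odd multiple).
def fprime(x):
    return 1 + 4 * x * math.sin(1 / x) - 2 * math.cos(1 / x)

for n in (10, 100, 1000):
    assert abs(fprime(1 / (2 * n * math.pi)) - (-1.0)) < 1e-2
    assert abs(fprime(1 / ((2 * n + 1) * math.pi)) - 3.0) < 1e-2
```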

5.3.1. Does the relation x^2 + y + sin(xy) = 0 implicitly define y as a function of x near the origin? If so, what is its best affine approximation? How about x as a function of y and its affine approximation?
Let g(x, y) = x^2 + y + sin(xy). Compute

    g'(x, y) = [2x + y cos(xy), 1 + x cos(xy)],      g'(0, 0) = [0, 1].

Since the right entry of g'(0, 0) is nonzero, the relation g(x, y) = 0 does define y as a function of x near the origin, and the best affine approximation to this function is k ≈ 0.
Since the left entry of g'(0, 0) is zero, the Implicit Function Theorem gives no information about whether the relation g(x, y) = 0 defines x as a function of y near the origin.
(Note: This problem is simple enough that we can see what's going on without using the theorem. Use Taylor's Theorem to approximate the given relation as x^2 + y + xy ≈ 0. That is, y(1 + x) ≈ −x^2. Since x and y are close to 0 we can further approximate this relation as y ≈ −x^2, suggesting that y depends on x in the small, but not conversely.)

5.3.2. Does the relation xy − z ln y + e^{xz} = 1 implicitly define z as a function of (x, y) near (0, 1, 1)? How about y as a function of (x, z)? When possible, give the affine approximation to the function.
Let g(x, y, z) = xy − z ln y + e^{xz} − 1, and note that g(0, 1, 1) = 0. Compute that

    g'(x, y, z) = [y + z e^{xz}, x − z/y, −ln y + x e^{xz}],      g'(0, 1, 1) = [2, −1, 0].

Since the third entry is zero, the Implicit Function Theorem gives no information about whether the relation g(x, y, z) = 0 defines z as a function of (x, y) near (0, 1, 1). On the other hand, since the second entry is nonzero, the Implicit Function Theorem says that the relation g(x, y, z) = 0 does define y as a function of (x, z) near (0, 1, 1). The best affine approximation is

    1 + k = ψ(0 + h, 1 + ℓ) ≈ 1 + 2h.

(Note: Let x = h, y = 1 + k, z = 1 + ℓ, and use Taylor's Theorem to approximate ln y by y − 1 = k and to approximate e^{xz} by 1 + xz = 1 + h(1 + ℓ). The resulting approximation of the condition g(x, y, z) = 0 near (0, 1, 1) works out to

    2h − k + hk − kℓ + hℓ = 0.

Looking at the first order terms we see the matrix [2, −1, 0] from before, but now we have quadratic terms as well. In particular we can solve for ℓ,

    ℓ = (−2h + k − hk)/(h − k).

If k = h then the numerator is −h − h^2 while the denominator vanishes, and so under the quadratic approximation ℓ is not a well behaved function of (h, k) near (0, 0, 0). Thus z does not appear to be a well behaved function of (x, y) near (0, 1, 1).)

5.3.3. Do the simultaneous conditions x^2(y^2 + z^2) = 5 and (x − z)^2 + y^2 = 2 implicitly define (y, z) as a function of x near (1, −1, 2)? If so, what is the function's affine approximation?
Let g(x, y, z) = (x^2(y^2 + z^2) − 5, (x − z)^2 + y^2 − 2), and note that g(1, −1, 2) = (0, 0). Compute that

    g'(x, y, z) = [ 2x(y^2 + z^2)   2x^2 y    2x^2 z   ]
                  [ 2(x − z)          2y     −2(x − z) ],

so that

    g'(1, −1, 2) = [ 10   −2   4 ]
                   [ −2   −2   2 ].

Since the right square block is invertible, the condition g(x, y, z) = (0, 0) implicitly defines (y, z) as a function of x near (1, −1, 2). The affine approximation is

    (−1 + k, 2 + ℓ) = ψ(1 + h) ≈ (−1, 2) − (1/4) [ 2   −4 ] [ 10 ] h.
                                                 [ 2   −2 ] [ −2 ]

This works out to

    (−1 + k, 2 + ℓ) = ψ(1 + h) ≈ (−1 − 7h, 2 − 6h).

5.3.4. Same question for the conditions x^2 + y^2 = 4 and 2x^2 + y^2 + 8z^2 = 8 near (2, 0, 0).
Let g(x, y, z) = (x^2 + y^2, 2x^2 + y^2 + 8z^2). Note that g(2, 0, 0) = (4, 8). Compute

    g'(x, y, z) = [ 2x   2y    0  ]        g'(2, 0, 0) = [ 4   0   0 ]
                  [ 4x   2y   16z ],                     [ 8   0   0 ].

Since g'(2, 0, 0) has no invertible 2-by-2 subblock, the Implicit Function Theorem gives no information. (However, the first condition is satisfied by a cylinder of radius 2 over the (x, y)-plane, and the second condition is satisfied by an ellipsoid. Replace x^2 + y^2 by 4 in the second condition to get x^2 + 8z^2 = 4, an ellipse in the (x, z)-plane. Thus the conditions define a pair of ellipses on the cylinder, each having the ellipse from a moment earlier as its (x, z)-shadow. The two ellipses cross at (±2, 0, 0), and the two crossing arcs at (2, 0, 0) do not form the graph of any two of x, y, and z as a function of the third.)

5.3.5. Do the simultaneous conditions xy + 2yz = 3xz and xyz + x − y = 1 implicitly define (x, y) as a function of z near (1, 1, 1)? How about (x, z) as a function of y? How about (y, z) as a function of x? Give affine approximations when possible.

Let g(x, y, z) = (xy + 2yz − 3xz, xyz + x − y − 1). Note that g(1, 1, 1) = (0, 0). Compute

    g'(x, y, z) = [ y − 3z   x + 2z   2y − 3x ]        g'(1, 1, 1) = [ −2   3   −1 ]
                  [ yz + 1   xz − 1     xy    ],                     [  2   0    1 ].

The left 2-by-2 subblock is invertible, so locally (x, y) is a function of z, and the affine approximation is

    (1 + h, 1 + k) ≈ (1, 1) + (1/6) [  0   −3 ] [ −1 ] ℓ,
                                    [ −2   −2 ] [  1 ]

that is, (1 + h, 1 + k) ≈ (1 − ℓ/2, 1). The right 2-by-2 subblock is invertible, so locally (y, z) is a function of x, and the affine approximation is

    (1 + k, 1 + ℓ) ≈ (1, 1) − (1/3) [ 1   1 ] [ −2 ] h,
                                    [ 0   3 ] [  2 ]

that is, (1 + k, 1 + ℓ) ≈ (1, 1 − 2h). The 2-by-2 block consisting of the first and third columns is not invertible, and so the Implicit Function Theorem gives no information.

5.3.6. Do the conditions xy^2 + xzu + yv^2 = 3 and u^3 yz + 2xv − u^2 v^2 = 2 implicitly define (u, v) in terms of (x, y, z) near the point (1, 1, 1, 1, 1)? If so, what is the derivative matrix of the implicitly defined mapping at (1, 1, 1)?
Let g(x, y, z, u, v) = (xy^2 + xzu + yv^2 − 3, u^3 yz + 2xv − u^2 v^2 − 2). Compute that

    g'(x, y, z, u, v) = [ y^2 + zu   2xy + v^2    xu          xz             2yv       ]
                        [ 2v          u^3 z      u^3 y   3u^2 yz − 2uv^2   2x − 2u^2 v ],

so that

    g'(1, 1, 1, 1, 1) = [ 2   3   1   1   2 ]
                        [ 2   1   1   1   0 ].

The right 2-by-2 block is invertible, so yes, the constraint g = 0 implicitly defines (u, v) = ψ(x, y, z) near (1, 1, 1, 1, 1). The derivative matrix of the implicitly defined mapping at (1, 1, 1) is

    − [ 1   2 ]^{−1} [ 2   3   1 ]  =  (1/2) [  0   −2 ] [ 2   3   1 ]  =  [ −2   −1   −1 ]
      [ 1   0 ]      [ 2   1   1 ]           [ −1    1 ] [ 2   1   1 ]     [  0   −1    0 ],

and so the approximation is ψ(1 + h, 1 + k, 1 + ℓ) ≈ (1 − 2h − k − ℓ, 1 − k).
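The derivative matrix of ψ can be double-checked numerically: solve g = 0 for (u, v) by Newton's method at nearby base points and take central finite differences. This is an illustrative sketch only; the helper names g and solve_uv are ours, not from the text:

```python
def g(x, y, z, u, v):
    return (x * y**2 + x * z * u + y * v**2 - 3,
            u**3 * y * z + 2 * x * v - u**2 * v**2 - 2)

def solve_uv(x, y, z, u=1.0, v=1.0):
    # Newton's method on (u, v) with (x, y, z) held fixed.
    for _ in range(50):
        g1, g2 = g(x, y, z, u, v)
        a, b = x * z, 2 * y * v                              # dg1/du, dg1/dv
        c, d = 3 * u**2 * y * z - 2 * u * v**2, 2 * x - 2 * u**2 * v
        det = a * d - b * c
        u, v = u - (d * g1 - b * g2) / det, v - (-c * g1 + a * g2) / det
    return u, v

h = 1e-6
cols = []
for dx, dy, dz in [(h, 0, 0), (0, h, 0), (0, 0, h)]:
    up, vp = solve_uv(1 + dx, 1 + dy, 1 + dz)
    um, vm = solve_uv(1 - dx, 1 - dy, 1 - dz)
    cols.append(((up - um) / (2 * h), (vp - vm) / (2 * h)))
print(cols)  # columns of psi'(1,1): approximately (-2,0), (-1,-1), (-1,0)
```

The finite-difference columns reproduce the matrix computed above.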

5.3.8. Do the conditions

    2x + y + 2z + u − v = 1,
    xy + z − u + 2v = 1,
    yz + xz + u^2 + v = 0

define the first three variables (x, y, z) as a function ϕ(u, v) near the point (x, y, z, u, v) = (1, 1, −1, 1, 1)? If so, find the derivative matrix ϕ'(1, 1).
Let g(x, y, z, u, v) = (2x + y + 2z + u − v − 1, xy + z − u + 2v − 1, yz + xz + u^2 + v). Compute

    g'(x, y, z, u, v) = [ 2   1     2      1    −1 ]
                        [ y   x     1     −1     2 ]
                        [ z   z   x + y   2u     1 ],

and so

    g'(1, 1, −1, 1, 1) = [  2    1   2    1   −1 ]
                         [  1    1   1   −1    2 ]
                         [ −1   −1   2    2    1 ].

The left 3-by-3 block is invertible, and so indeed locally (x, y, z) = ϕ(u, v), and

    ϕ'(1, 1) = − [  2    1   2 ]^{−1} [  1   −1 ]
                 [  1    1   1 ]      [ −1    2 ]
                 [ −1   −1   2 ]      [  2    1 ]

             = −(1/3) [  3   −4   −1 ] [  1   −1 ]
                      [ −3    6    0 ] [ −1    2 ]
                      [  0    1    1 ] [  2    1 ]

             = [ −5/3    4 ]
               [   3    −5 ]
               [ −1/3   −1 ].

5.3.9. Define g : R^2 −→ R by g(x, y) = 2x^3 − 3x^2 + 2y^3 + 3y^2 and let L be the level set {(x, y) : g(x, y) = 0}. Find those points of L about which y need not be defined implicitly as a function of x, and find the points about which x need not be defined implicitly as a function of y. Describe L precisely; the result should explain the points you found.
Compute

    g'(x, y) = [6x^2 − 6x, 6y^2 + 6y] = 6[x(x − 1), y(y + 1)].

The points of L where y need not be defined implicitly as a function of x are the points (x, 0) and (x, −1). To find the points (x, 0) such that g(x, y) = 0, note that g(x, 0) = 2x^3 − 3x^2, which vanishes at x = 0 (twice) and x = 3/2. Thus the points are (0, 0) and (3/2, 0). To find the points (x, −1) such that g(x, y) = 0, note that g(x, −1) = 2x^3 − 3x^2 + 1, which vanishes at x = 1 (twice) and x = −1/2. Thus the points are (1, −1) and (−1/2, −1).
The points of L where x need not be defined implicitly as a function of y are the points (0, y) and (1, y). To find the points (0, y) such that g(x, y) = 0, note that g(0, y) = 2y^3 + 3y^2, which vanishes at y = 0 (twice) and y = −3/2. Thus the points are (0, 0) and (0, −3/2). To find the points (1, y) such that g(x, y) = 0, note that g(1, y) = 2y^3 + 3y^2 − 1, which vanishes at y = −1 (twice) and y = 1/2. So the points are (1, −1) and (1, 1/2).
To summarize the data so far:
• g'(x, y) is [0, 0] at the points (0, 0), (1, −1). These are the points of L where neither variable need be defined implicitly as a function of the other.
• g'(x, y) is [∗, 0] (where ∗ ≠ 0) at the points (3/2, 0), (−1/2, −1). These are the points of L where x is defined implicitly as a function of y but perhaps not conversely, i.e., where the tangent is vertical.
• g'(x, y) is [0, ∗] (where ∗ ≠ 0) at the points (0, −3/2), (1, 1/2). These are the points of L where y is defined implicitly as a function of x but perhaps not conversely, i.e., where the tangent is horizontal.
Keeping only the lowest order (biggest) terms near (0, 0), we have

    g(x, y) ≈ −3(x^2 − y^2) = −3(x + y)(x − y),

and so near (0, 0), the level set L where g = 0 looks like two lines of slope ±1. Similarly, near (1, −1) the lowest nonzero terms of g in local coordinates are

    g(1 + ξ, −1 + η) ≈ 3(ξ^2 − η^2) = 3(ξ + η)(ξ − η),

and so near (1, −1), the level set L where g = 0 again looks like two lines of slope ±1. But also,

    g(x, y) = 2(x^3 + y^3) − 3(x^2 − y^2)
            = 2(x + y)(x^2 − xy + y^2) − 3(x + y)(x − y)
            = (x + y)(2x^2 − 2xy + 2y^2 − 3x + 3y).

This vanishes where x + y = 0 (a line through the origin of slope −1) and where 2x^2 − 2xy + 2y^2 − 3x + 3y = 0 (an ellipse). The final picture makes gratifyingly vivid all the features that we have been discovering.
A little practice with eigenvalues and eigenvectors lets us understand the ellipse better, in its natural coordinate system. The quadratic terms in its equation are

    2x^2 − 2xy + 2y^2 = [x  y] A [x]      where A = [  2   −1 ]
                                 [y],               [ −1    2 ].

The characteristic polynomial of A is

    p_A(λ) = λ^2 − 4λ + 3 = (λ − 3)(λ − 1).

Thus the eigenvalues of A are 3 and 1. To find an eigenvector for the eigenvalue 3, solve the equation

    [ 2 − 3    −1   ] [x]   [0]
    [  −1    2 − 3  ] [y] = [0],

which is −x − y = 0, twice. Thus an eigenvector is [1, −1]. Similarly, to find an eigenvector for the eigenvalue 1, solve the equation

    [ 2 − 1    −1   ] [x]   [0]
    [  −1    2 − 1  ] [y] = [0],

which is x − y = 0, twice, and thus an eigenvector is [1, 1]. Introduce the matrix having unit-length dilations of the eigenvectors as its columns,

    P = (1/√2) [  1   1 ]
               [ −1   1 ].

Note that P is orthogonal, and compute for confirmation that

    P^T A P = (1/2) [ 1   −1 ] [  2   −1 ] [  1   1 ]
                    [ 1    1 ] [ −1    2 ] [ −1   1 ]

            = (1/2) [ 1   −1 ] [  3   1 ]
                    [ 1    1 ] [ −3   1 ]

            = (1/2) [ 6   0 ]  =  [ 3   0 ]
                    [ 0   2 ]     [ 0   1 ].
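As a quick sanity check on the diagonalization (ours, not part of the original solution), a few lines of Python confirm that P^T A P = diag(3, 1):

```python
import math

# A is the quadratic-form matrix of the ellipse; P has the unit eigenvectors
# [1, -1]/sqrt(2) and [1, 1]/sqrt(2) as its columns.
A = [[2.0, -1.0], [-1.0, 2.0]]
s = 1 / math.sqrt(2)
P = [[s, s], [-s, s]]
Pt = [[P[0][0], P[1][0]], [P[0][1], P[1][1]]]   # transpose of P

def matmul(X, Y):
    return [[sum(X[i][k] * Y[k][j] for k in range(2)) for j in range(2)]
            for i in range(2)]

D = matmul(Pt, matmul(A, P))
print(D)  # numerically diag(3, 1)
```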

This is as it must be, because P takes the first standard basis vector e1 to the first eigenvector, then A dilates the eigenvector by 3, and then P^T = P^{−1} takes the dilated eigenvector back to 3e1. And similarly for e2.
Now let [x, y]^T = P [u, v]^T. That is, x = (u + v)/√2 and y = (−u + v)/√2, so that u = (x − y)/√2 and v = (x + y)/√2. Now the quadratic terms of the ellipse equation are

    [x  y] A [x]  =  [u  v] P^T A P [u]  =  3u^2 + v^2,
             [y]                    [v]

and the cross term is gone. The ellipse equation overall also includes the terms −3x + 3y = −3(x − y) = −3√2 u, so overall the ellipse equation is 3u^2 − 3√2 u + v^2 = 0, or 3(u − 1/√2)^2 + v^2 = 3/2, or 2(u − 1/√2)^2 + (2/3)v^2 = 1, or

    ( (u − 1/√2) / (1/√2) )^2 + ( v / √(3/2) )^2 = 1.

Since the u-axis points into the fourth quadrant at 45 degrees while the v-axis points into the first quadrant at 45 degrees, we see that the ellipse having center (1/√2, 0) and u-radius 1/√2 is doing exactly what it must: its extreme points in the u-direction are (0, 0) and (√2, 0) in the (u, v)-coordinate system, and these are (0, 0) and (1, −1) in the (x, y)-coordinate system. The ellipse has tangent slope 1 at these points. The extreme points of the ellipse in the v-direction,

    (u, v) = (1/√2, ±√(3/2)),

are the points where the ellipse has tangent slope −1, and these work out to (x, y) = ((1 ± √3)/2, (−1 ± √3)/2).

5.4.1. Find the nearest point to the origin on the intersection of the hyperplanes x + y + z − 2w = 1 and x − y + z + w = 2 in R^4.
The problem is to minimize f(x, y, z, w) = x^2 + y^2 + z^2 + w^2 subject to the constraints

    g1(x, y, z, w) = x + y + z − 2w = 1,

    g2(x, y, z, w) = x − y + z + w = 2.

The Lagrange multiplier condition (after absorbing a constant into the λ's) is

    (x, y, z, w) = λ1 (1, 1, 1, −2) + λ2 (1, −1, 1, 1),
    x + y + z − 2w = 1,
    x − y + z + w = 2.

Use the first equation to substitute for x, y, z, w in the constraining equations, obtaining

    7λ1 − λ2 = 1,
    −λ1 + 4λ2 = 2,

or

    [  7   −1 ] [ λ1 ]   [ 1 ]
    [ −1    4 ] [ λ2 ] = [ 2 ].

The solution is

    [ λ1 ]          [ 4   1 ] [ 1 ]          [  6 ]         [ 2 ]
    [ λ2 ] = (1/27) [ 1   7 ] [ 2 ] = (1/27) [ 15 ] = (1/9) [ 5 ].

Consequently

    (x, y, z, w) = (1/9)(2 + 5, 2 − 5, 2 + 5, −4 + 5) = (1/9)(7, −3, 7, 1).
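The small linear-algebra computation can be replayed exactly with rational arithmetic; this check is ours, not the book's, with Cramer's rule written out by hand:

```python
from fractions import Fraction as F

# Solve [7 -1; -1 4] (l1, l2) = (1, 2) by Cramer's rule, exactly.
det = F(7 * 4 - (-1) * (-1))           # 27
l1 = F(1 * 4 - 2 * (-1), 1) / det      # 6/27 = 2/9
l2 = F(7 * 2 - (-1) * 1, 1) / det      # 15/27 = 5/9

# The Lagrange condition gives the point as a combination of the two normals.
point = tuple(l1 * a + l2 * b for a, b in zip((1, 1, 1, -2), (1, -1, 1, 1)))
x, y, z, w = point
print(point)                             # (7/9, -1/3, 7/9, 1/9)
print(x + y + z - 2 * w, x - y + z + w)  # 1 2 : both constraints hold
```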

And although the problem didn't ask, the actual distance is

    (1/9) √(7^2 + 3^2 + 7^2 + 1^2) = (1/9) √108 = (1/9) √(36 · 3) = 6√3/9 = 2/√3.

5.4.2. Find the nearest point on the ellipse x^2 + 2y^2 = 1 to the line x + y = 4.
Any ellipse and any line

    E : x^2/a^2 + y^2/b^2 = 1,      L : Ax + By = C

have normal vectors nE = 2(x/a^2, y/b^2) and nL = (A, B). Assuming that the ellipse and the line don't meet, when (x, y) is the point on the ellipse nearest the line, the normals are parallel,

    (x/a^2, y/b^2) = λ(A, B),      x^2/a^2 + y^2/b^2 = 1.

By the first equation, x = λa^2 A and y = λb^2 B, and substituting these into the second equation gives λ^2(a^2 A^2 + b^2 B^2) = 1. Therefore,

    λ = ±1/√(a^2 A^2 + b^2 B^2),

and consequently

    x = ±a^2 A/√(a^2 A^2 + b^2 B^2),      y = ±b^2 B/√(a^2 A^2 + b^2 B^2).

When A > 0, B > 0, and C > 0, as in this problem, the line passes through the first quadrant, so the near point is the one with the "+" signs. (The ellipse and the line in the problem don't meet because the farthest the ellipse goes right is x = 1 and the highest it goes is y = 1/√2. Therefore the ellipse sits in a box whose upper right corner is (1, 1/√2). But 1 + 1/√2 is less than 4, so this upper right corner is on the same side of the line x + y = 4 as the origin. That is, the ellipse lies entirely on one side of the line.)

5.4.4. Maximize f(x, y, z) = xy + yz subject to the constraints x^2 + y^2 = 2, yz = 2.
By the second constraint, f(x, y, z) = xy + 2, so the problem reduces to maximizing the new function f̃(x, y) = xy + 2 subject to the constraint g(x, y) = x^2 + y^2 = 2. The Lagrange multiplier condition is

    (y, x) = λ(x, y),      x^2 + y^2 = 2.

So λ = y/x = x/y and therefore x^2 = y^2. But also x^2 + y^2 = 2, so x^2 = y^2 = 1, and each of x and y is ±1 (independently of the other). Compute that

    f̃(1, 1) = f̃(−1, −1) = 3,      f̃(1, −1) = f̃(−1, 1) = 1.

So the maximum is 3 (and the minimum is 1).

5.4.6. Find the largest rectangular box with sides parallel to the coordinate axes that can be inscribed in the ellipsoid

    (x/a)^2 + (y/b)^2 + (z/c)^2 = 1.

We need to maximize the function f(x, y, z) = xyz subject to the given constraint. The tidy way to do this is to reduce it to the problem where the ellipsoid is a sphere by introducing new, scaled variables,

    ξ = x/a,   η = y/b,   ζ = z/c.

Now the function to maximize is ϕ(ξ, η, ζ) = abc ξηζ (and we may as well just maximize ξηζ since abc is constant), and the new constraint is simply ξ^2 + η^2 + ζ^2 = 1.
The Lagrange multiplier condition is

    (ηζ, ξζ, ξη) = λ(ξ, η, ζ),      ξ^2 + η^2 + ζ^2 = 1.

The first equation gives ξηζ = λξ^2 = λη^2 = λζ^2, and this combines with the second equation to give λ = 3ξηζ. So the first component of the first equation is now ηζ = 3ξ^2 ηζ, or

    ξ = ±1/√3,

and similarly for η and ζ. In terms of the original variables, we have

    x/a = y/b = z/c = ±1/√3,

so that

    x = ±a/√3,   y = ±b/√3,   z = ±c/√3.

The biggest box thus has volume 8abc/(3√3).
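A crude random-sampling check (ours, not the book's) supports the claim that ξηζ never exceeds 1/(3√3) on the unit sphere, which is what bounds the box volume by 8abc/(3√3):

```python
import math, random

random.seed(0)
bound = 1 / (3 * math.sqrt(3))          # claimed maximum of (xi)(eta)(zeta)
best = 0.0
for _ in range(100000):
    p = [random.gauss(0, 1) for _ in range(3)]
    r = math.sqrt(sum(t * t for t in p))
    xi, eta, zeta = (t / r for t in p)  # a uniform random point on the sphere
    best = max(best, xi * eta * zeta)
print(best, bound)  # best approaches bound from below
```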

5.4.7. The lengths of the twelve edges of a rectangular block sum to 4, and the areas of the six faces sum to 4α. Find the lengths of the edges when the excess of the block's volume over that of a cube with edge equal to the least edge of the block is greatest.
Let the dimensions be x, y, z, with z smallest. Our task is to optimize the quantity xyz − z^3 = (xy − z^2)z subject to the constraints

    x + y + z = 1,
    xy + yz + zx = 2α.

By the first constraint, the quantity to optimize is

    (xy − z^2)z = (xy − (1 − x − y)z)z = (xy + yz + zx − z)z,

and then by the second constraint, the quantity is

    (xy + yz + zx − z)z = (2α − z)z.

Call this quantity f(z). The graph of f is a parabola having roots at z = 0 and z = 2α. Thus it attains its maximum at z = α, and the maximum value is α^2.
However, we still need to check that such a box exists at all, i.e., we need to check that the sides x and y are real numbers at least as big as z. This turns out to depend on α. The conditions satisfied by the box-sides are first, since x + y + z = 1 and z = α,

    x + y = 1 − α,

and second, since xy + yz + zx = 2α and xy + yz + zx = xy + (x + y)z,

    xy = α(1 + α).

Thus x and y satisfy the quadratic equation

    u^2 − (1 − α)u + α(1 + α) = 0,

which has real roots so long as its discriminant is nonnegative,

    (1 − α)^2 − 4α(1 + α) = 1 − 6α − 3α^2 ≥ 0.

Thus the values of α that make x and y real are 0 < α ≤ (2 − √3)/√3. When α = (2 − √3)/√3 the optimizing box has equal x- and y-sides. In general the x- and y-sides are

    x, y = (1 − α ± √(1 − 6α − 3α^2)) / 2.

Elementary algebra shows that the smaller of these is indeed greater than α. So any value of α in (0, (2 − √3)/√3] makes the problem viable.

5.4.8. A cylindrical can (with top and bottom) has volume V. Subject to this constraint, what dimensions give it the least surface area?
The problem is to minimize A(r, h) = 2πr^2 + 2πrh subject to the constraint that V(r, h) = πr^2 h = v where v is some fixed value. The Lagrange multiplier condition is (absorbing a 2 into the λ)

    (2πr + πh, πr) = λ(2πrh, πr^2),      πr^2 h = v.

So λ = (2r + h)/(2rh) = 1/r, quickly giving 2r = h. That is, the height is twice the radius.

5.4.9. Find the distance in the plane from the point (0, 1) to the parabola y = ax^2 where a > 0.
The distance squared and the constraint are

    f(x, y) = x^2 + (y − 1)^2,      g(x, y) = y − ax^2.

So the Lagrange condition is, after absorbing a factor of 2 into λ,

    [x, y − 1] = λ[−2ax, 1],      y = ax^2.

One solution is (x, y) = (0, 0) (with λ = −1). In this case the square-distance is obviously 1. On the other hand, if x ≠ 0 then the proportionality of [x, y − 1] and [−2ax, 1] quickly gives

    y = 1 − 1/(2a).

If a ≤ 1/2 then this makes y nonpositive, which is impossible. So a solution other than (x, y) = (0, 0) cannot exist unless a > 1/2, which we now assume. The previous display gives

    x^2 = y/a = 1/a − 1/(2a^2).

In this case the square-distance is

    f(x, y) = x^2 + (y − 1)^2 = 1/a − 1/(2a^2) + 1/(4a^2) = 1/a − 1/(4a^2).

Subtract this square-distance from 1, the square-distance from (0, 1) to the point (0, 0) on the parabola:

    1 − 1/a + 1/(4a^2) = (1 − 1/(2a))^2 > 0.

Thus, for a > 1/2 the two points on the parabola having y-coordinate 1 − 1/(2a) are closer to (0, 1) than is (0, 0).

5.4.10. This exercise extends the Arithmetic–Geometric Mean Inequality. Let e1, ..., en be positive numbers with Σ_{i=1}^n ei = 1. Maximize the function f(x1, ..., xn) = x1^{e1} · · · xn^{en} (where all xi > 0) subject to the constraint Σ_{i=1}^n ei xi = 1.
The function to optimize and the constraint are respectively

    f(x1, ..., xn) = x1^{e1} · · · xn^{en},

g(x1 , . . . , xn ) = e1 x1 + · · · + en xn = 1.

The Lagrange multiplier condition ∇f = λ∇g, g = 1 is

    f(x1, ..., xn)[e1/x1, ..., en/xn] = λ[e1, ..., en].

Thus

    λ/f(x1, ..., xn) = 1/x1 = · · · = 1/xn,

and therefore all the xi are equal, say to x. Substitute this into the constraint to get (recalling that Σ_i ei = 1 for the second equality)

    e1 x + · · · + en x = 1  =⇒  (e1 + · · · + en)x = 1  =⇒  x = 1.

Thus f(x1, ..., xn) ≤ f(1, ..., 1) = 1^{e1} · · · 1^{en} = 1.
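A randomized check of the constrained inequality just proved (the loop below is our own sketch, not part of the solution): rescale random positive x_i so that Σ e_i x_i = 1 and confirm that the weighted geometric mean never exceeds 1:

```python
import math, random

random.seed(1)
worst = 0.0
for _ in range(2000):
    n = random.randint(2, 6)
    e = [random.random() + 0.01 for _ in range(n)]
    s = sum(e)
    e = [t / s for t in e]                     # weights e_i > 0 summing to 1
    x = [random.uniform(0.01, 3.0) for _ in range(n)]
    m = sum(ei * xi for ei, xi in zip(e, x))
    x = [xi / m for xi in x]                   # rescale so sum e_i x_i = 1
    worst = max(worst, math.prod(xi ** ei for ei, xi in zip(e, x)))
print(worst)  # the weighted geometric mean stays at or below 1
```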

(To see that the inequality is indeed "≤" take x1 = ε/(e1 n), x2 = (2 − ε)/(e2 n), and xi = 1/(ei n) for i = 3, ..., n. The weighted arithmetic mean of these xi is 1 while their weighted geometric mean is close to 0.)
Use your result to derive the generalized Arithmetic–Geometric Mean inequality,

    a1^{e1} · · · an^{en} ≤ e1 a1 + · · · + en an      for all positive a1, ..., an.

Given positive a1, ..., an, let a = e1 a1 + · · · + en an, and let xi = ai/a for i = 1, ..., n. Then

    e1 x1 + · · · + en xn = (e1 a1 + · · · + en an)/a = a/a = 1,

so the xi satisfy the constraint g(x1, ..., xn) = 1. Therefore,

    a1^{e1} · · · an^{en} = f(a1, ..., an) = f(ax1, ..., axn) = (ax1)^{e1} · · · (axn)^{en}
                          = a^{e1+···+en} x1^{e1} · · · xn^{en}
                          ≤ a                        (by the first part of this exercise)
                          = e1 a1 + · · · + en an,

and we are done.
What values of e1, ..., en reduce this to the first Arithmetic–Geometric Mean Inequality?
If e1 = · · · = en = 1/n then this problem gives

    (a1 · · · an)^{1/n} ≤ (a1 + · · · + an)/n,

the usual Arithmetic–Geometric Mean Inequality.

5.5.1. Let f(x, y) = y and let g(x, y) = y^3 − x^4. Graph the level set L = {(x, y) : g(x, y) = 0}. Show that the Lagrange multiplier criterion does not find any candidate points where f is optimized on L. Optimize f on L nonetheless.
The level set is the graph of the function y = x^{4/3} for all x ∈ R. The graph has a parabola-like shape. The Lagrange condition ∇f = λ∇g is

    [0, 1] = λ[−4x^3, 3y^2].

66

Nonetheless, the shape of the level set shows that f (x, y) = y on the level set is minimized at (x, y) = (0, 0), where f = 0. 5.5.2. Consider the linear mapping g(x, y, z) = (x + 2y + 3z, 4x + 5y + 6z). (a) Use Theorem 5.5.2, part (1) to optimize the linear function f (x, y, z) = 6x + 9y + 12z subject to the affine constraints g(x, y, z) = (7, 8). We have the data     1 2 3 T a = 6 9 12 , M= . 4 5 6

Note that

That is, λT = [2 consequently



6

     1 2 3 9 12 = 2 1 . 4 5 6

1]. By the method in the section, the optimal value of f is T



f (x) = λ b = 2

   7 1 = 22. 8

(b) Verify without using the Lagrange multiplier method that the function f subject to the constraints g = (7, 8) (with f and g from part (a)) is constant, always taking the value that you found in part (a). Since x + 2y + 3z = 7 and 4x + 5z + 6z = 8 we have f (x, y, z) = 6x + 9y + 12z = 2(x + 2y + 3z) + (4x + 5y + 6z) = 2 · 7 + 8 = 22. (c) Show that the function f (x, y, z) = 5x + 7y + z can not be optimized subject to any constraint g(x, y, z) = b. This time aT = [5 7 1], and the condition aT = λT M has no solution. Alternatively, note that f (x, y, z) = 5x + 7y + z = (x + 2y + 3z) + (4x + 5y + 6z) − 8z = 7 + 8 − 8z, and clearly this has no optimal value. 5.5.3. (a) Use Theorem 5.5.2, part (2) to minimize the quadratic function f (x, y) = x2 + y 2 subject to the affine constraint 3x + 5y = 8. The data are   A = I2 M= 3 5 b = 8. According to the theorem, the optimum value is

f (x) = bT (M A−1 M T )−1 b = 8 · 34−1 · 8 = 64/34 = 32/17. (And this value is easy to find geometrically as well.) (b) Use the same result to find the extrema of f (x, y, z) = 2xy + z 2 subject to the constraints x + y + z = 1, x + y − z = 0. 67

The data are



0 A = 1 0

 1 0 0 0 0 1

M=



1 1 1 1 1 −1



b=

  1 . 0

And so the optimum value is (noting that A−1 = A) f (x) = bT (M A−1 M T )−1 b     0 1   1 1 1  1 0 = 1 0  1 1 −1 0 0  −1     3 1 1 = 1 0 1 3 0     3 −1 1 1 3 1 0 = = . −1 3 0 8 8

 −1   0 1 1 1 0 1 1  0 1 1 −1

(c) Use the same result to find the nearest point to the origin on the intersection of the hyperplanes x + y + z − 2w = 1 and x − y + z + w = 2 in R4 , reproducing your answer to exercise 5.4.1. The data are     1 1 1 −2 1 A = I4 M= b= . 1 −1 1 1 2 And so the optimum value is (noting that A−1 = I) f (x) = bT (M M T )−1 b −1      1 7 −1 = 1 2 2 −1 4     4 1 1 1  1 2 = 1 7 2 27 1 = (4 · 12 + 2 · 1 · 2 + 7 · 22 ) 27 36 4 = = . 27 3 Indeed this reproduces the answer from 5.4.1, since here we are optimizing the square of the distance. 5.5.4. (a) Use Theorem 5.5.2, part (3) to optimize f (x, y, z) = x − 2y + 2z on the sphere of radius 3. The data are   aT = 1 −2 2 M = I3 b = 9.

Thus

    λ = ±√(a^T M^{−1} a / b) = ±√(9/9) = ±1,

and then the optimal values of f are

    f(x) = λb = ±9.

(b) Use the same result to optimize the function f(x, y, z, w) = x + y − z − w subject to the constraint g(x, y, z, w) = 1, g(x, y, z, w) = x^2/2 − y^2 + z^2 − w^2.
The data are

    a^T = [1  1  −1  −1],      M = diag(1/2, −1, 1, −1),      b = 1.

Thus

    λ = ±√(a^T M^{−1} a / b) = ±√(1/1) = ±1,

and then the optimal values of f are

    f(x) = λb = ±1 · 1 = ±1.

5.5.5. (a) Use Theorem 5.5.2, part (4) to optimize the function f(x, y) = 2xy subject to the constraint g(x, y) = 1 where g(x, y) = x^2 + 2y^2.
The data are

    A = [ 0   1 ]        M = [ 1   0 ]        b = 1.
        [ 1   0 ],           [ 0   2 ],

Thus

    M^{−1} A = [ 1    0  ] [ 0   1 ]  =  [  0    1 ]
               [ 0   1/2 ] [ 1   0 ]     [ 1/2   0 ],

and this matrix has characteristic polynomial

    p_{M^{−1}A}(λ) = λ^2 − 1/2.

Thus the eigenvalues are λ = ±1/√2, and so the optimal values of f are

    f(x) = λb = ±1/√2.

(b) Use the same result to optimize the function f(x, y, z) = 2(xy + yz + zx) subject to the constraint g(x, y, z) = 1 where g(x, y, z) = x^2 + y^2 − z^2.
The data are

    A = [ 0   1   1 ]        M = [ 1   0    0 ]        b = 1.
        [ 1   0   1 ],           [ 0   1    0 ],
        [ 1   1   0 ]            [ 0   0   −1 ]

Thus

    M^{−1} A = M A = [  0    1   1 ]
                     [  1    0   1 ]
                     [ −1   −1   0 ],

and this matrix has characteristic polynomial

    p_{M^{−1}A}(λ) = det [ −λ    1    1  ]
                         [  1   −λ    1  ]
                         [ −1   −1   −λ  ]
                   = −(λ^3 + λ + 2)
                   = −(λ + 1)(λ^2 − λ + 2).

The quadratic factor has imaginary roots, and so the only real eigenvalue is λ = −1. Thus the optimal values of f are

    f(x) = λb = −1 · 1 = −1.

Chapter 6

6.1.1. (a) Let I = [0, 1], let P = {0, 1/2, 1}, let P′ = {0, 3/8, 5/8, 1}, and let P′′ be the common refinement of P and P′. What are the subintervals of P, and what are their lengths? Same question for P′. Same question for P′′.
The subintervals of P are [0, 1/2] and [1/2, 1], both of length 1/2. The subintervals of P′ are [0, 3/8], [3/8, 5/8], and [5/8, 1], of lengths 3/8, 1/4, and 3/8. The common refinement of P and P′ is P′′ = {0, 3/8, 1/2, 5/8, 1}. Its subintervals are [0, 3/8], [3/8, 1/2], [1/2, 5/8], and [5/8, 1], of lengths 3/8, 1/8, 1/8, and 3/8.
(b) Let B = I × I, let Q = P × {0, 1/2, 1}, let Q′ = P′ × {0, 1/2, 1}, and let Q′′ be the common refinement of Q and Q′. What are the subboxes of Q and what are their areas? Same question for Q′. Same question for Q′′.
The subrectangles of Q are [0, 1/2] × [0, 1/2], [1/2, 1] × [0, 1/2],

[0, 1/2] × [1/2, 1], [1/2, 1] × [1/2, 1],

all of area 1/4. The subrectangles of Q′ are [0, 3/8] × [0, 1/2], [3/8, 5/8] × [0, 1/2], [5/8, 1] × [0, 1/2],

[0, 3/8] × [1/2, 1], [3/8, 5/8] × [1/2, 1], [5/8, 1] × [1/2, 1],

of areas 3/16, 1/8, 3/16, 3/16, 1/8, 3/16. The common refinement of Q and Q′ is Q′′ = {0, 3/8, 1/2, 5/8, 1} × {0, 1/2, 1}.

Its subrectangles are [0, 3/8] × [0, 1/2], [3/8, 1/2] × [0, 1/2], [1/2, 5/8] × [0, 1/2], [5/8, 1] × [0, 1/2],

[0, 3/8] × [1/2, 1], [3/8, 1/2] × [1/2, 1], [1/2, 5/8] × [1/2, 1], [5/8, 1] × [1/2, 1], of areas 3/16, 1/16, 1/16, 3/16, 3/16, 1/16, 1/16, and 3/16.
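The refinement bookkeeping above, and the fact (exercise 6.1.2, next) that the subinterval lengths and subbox areas sum to the whole, can be replayed with exact rational arithmetic. This sketch is ours, not the book's:

```python
from fractions import Fraction as F
from itertools import product

P = [F(0), F(1, 2), F(1)]
Pp = [F(0), F(3, 8), F(5, 8), F(1)]
Ppp = sorted(set(P) | set(Pp))           # common refinement {0, 3/8, 1/2, 5/8, 1}

def subintervals(pts):
    return [(pts[i], pts[i + 1]) for i in range(len(pts) - 1)]

lengths = [b - a for a, b in subintervals(Ppp)]
print(Ppp, lengths)                      # lengths 3/8, 1/8, 1/8, 3/8

# Subbox areas of Q'' = P'' x {0, 1/2, 1}; they sum to the area of the box.
Q2 = [F(0), F(1, 2), F(1)]
areas = [(b - a) * (d - c)
         for (a, b), (c, d) in product(subintervals(Ppp), subintervals(Q2))]
print(sum(lengths), sum(areas))          # 1 and 1
```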

6.1.2. Show that the lengths of the subintervals of any partition of [a, b] sum to the length of [a, b]. Same for the areas of the subboxes of [a, b] × [c, d]. Generalize to R^n.
For an interval, the partition is

    P = {t_0, t_1, ..., t_k},      a = t_0 < t_1 < · · · < t_k = b.

The subintervals and their lengths are

    J_j = [t_{j−1}, t_j],   j = 1, ..., k,      length(J_j) = t_j − t_{j−1}.

So the sum of the lengths of the subintervals is

    Σ_{j=1}^k length(J_j) = Σ_{j=1}^k (t_j − t_{j−1})
                          = (t_1 − t_0) + (t_2 − t_1) + · · · + (t_k − t_{k−1})
                          = t_k − t_0 = b − a.

(A sum that mostly cancels in this fashion is called a telescoping sum.)
For a 2-dimensional box [a, b] × [c, d], the partition is

    P = {t_{1,0}, t_{1,1}, ..., t_{1,k_1}} × {t_{2,0}, t_{2,1}, ..., t_{2,k_2}},

where

    a = t_{1,0} < t_{1,1} < · · · < t_{1,k_1} = b,      c = t_{2,0} < t_{2,1} < · · · < t_{2,k_2} = d.

So the sum of the areas of the subboxes is

    Σ_{j_1=1}^{k_1} Σ_{j_2=1}^{k_2} (t_{1,j_1} − t_{1,j_1−1})(t_{2,j_2} − t_{2,j_2−1}).

This is a doubly-indexed sum of twofold products. Note that in each product, each factor depends on only one index of summation. Therefore, one factor of each product passes through the sum over the other index, and the sum of the areas is

    Σ_{j_1=1}^{k_1} (t_{1,j_1} − t_{1,j_1−1})  Σ_{j_2=1}^{k_2} (t_{2,j_2} − t_{2,j_2−1}).

This is (b − a)(d − c) by two applications of the 1-dimensional result.
The discussion of the 2-dimensional case has made clear how the general argument will go. For an n-dimensional box [a_1, b_1] × · · · × [a_n, b_n], proceed either by induction or by ellipsis. First we give a solution by ellipsis. The partition is

    P = {t_{1,0}, t_{1,1}, ..., t_{1,k_1}} × · · · × {t_{n,0}, t_{n,1}, ..., t_{n,k_n}},

where

    a_1 = t_{1,0} < t_{1,1} < · · · < t_{1,k_1} = b_1,   ...,   a_n = t_{n,0} < t_{n,1} < · · · < t_{n,k_n} = b_n.

So the sum of the volumes of the subboxes is

    Σ_{j_1=1}^{k_1} · · · Σ_{j_n=1}^{k_n} (t_{1,j_1} − t_{1,j_1−1}) · · · (t_{n,j_n} − t_{n,j_n−1}).

In this n-fold sum of n-fold products, each term of each product depends on only one index of summation, and therefore passes through all the other sums. This shows that the sum of the volumes of the subboxes is

    Σ_{j_1=1}^{k_1} (t_{1,j_1} − t_{1,j_1−1}) · · · Σ_{j_n=1}^{k_n} (t_{n,j_n} − t_{n,j_n−1}),

and this is (b_1 − a_1) · · · (b_n − a_n) by n applications of the 1-dimensional result.
Alternatively, for a solution by induction, note (or prove by induction) that an n-dimensional box is the cartesian product of an (n − 1)-dimensional box and an interval, B_n = B_{n−1} × I. And similarly for its n-dimensional subboxes, J_n = J_{n−1} × J_1. Let vol_n denote n-dimensional volume and let vol_{n−1} denote (n − 1)-dimensional volume. Then

    Σ_{J_n} vol_n(J_n) = Σ_{J_{n−1} × J_1} vol_n(J_{n−1} × J_1)
                       = Σ_{J_{n−1}} Σ_{J_1} vol_{n−1}(J_{n−1}) length(J_1)
                       = Σ_{J_{n−1}} vol_{n−1}(J_{n−1})  Σ_{J_1} length(J_1).

By inductive hypothesis,

    Σ_{J_{n−1}} vol_{n−1}(J_{n−1}) = vol_{n−1}(B_{n−1}),

and by the result in one dimension,

    Σ_{J_1} length(J_1) = length(I).

So our calculation has shown that

    Σ_{J_n} vol_n(J_n) = vol_{n−1}(B_{n−1}) length(I) = vol_n(B_n).

This is the desired result.

6.1.3. Let J = [0, 1]. Compute mJ(f) and MJ(f) for each of the following functions f : J −→ R.
(a) f(x) = x(1 − x).
One-variable calculus shows that mJ(f) = 0 and MJ(f) = 1/4.
(b) f(x) = 1 if x is irrational, and f(x) = 1/m if x = n/m in lowest terms, n, m ∈ Z and m > 0.
All values taken by f are positive. The values 1/m for m ∈ Z+ get arbitrarily close to 0 despite never reaching it. On the other hand, f takes a maximum value of 1. Thus, mJ(f) = 0, MJ(f) = 1.
(c) f(x) = (1 − x) sin(1/x) if x ≠ 0, and f(0) = 0. (See figure 1.)

[Figure 1: Oscillation in an envelope]

This function is trapped between 1 − x and −1 + x. As x approaches 0, f(x) oscillates ever faster between 1 − x and −1 + x, so it assumes values closer and closer to 1 (though always below 1), and it assumes values closer and closer to −1 (though always above −1). Thus

    mJ(f) = −1,      MJ(f) = 1.
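The claimed bounds mJ(f) = −1 and MJ(f) = 1 in part (c) can be supported numerically. The sampling below is ours, not the book's; it evaluates f at points where sin(1/x) is 0 or ±1:

```python
import math

def f(x):
    return (1 - x) * math.sin(1 / x) if x != 0 else 0.0

# Sample at x = 2/(k*pi), where sin(1/x) = sin(k*pi/2) is 0 or +-1.
samples = [f(2 / (k * math.pi)) for k in range(1, 20001)]
lo, hi = min(samples), max(samples)
print(lo, hi)  # approaches -1 and 1 but stays strictly between them
```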

6.1.4. (a) Let I, P, P′ and P′′ be as in exercise 6.1.1(a), and let f(x) = x^2 on I. Compute the lower sums L(f, P), L(f, P′), L(f, P′′) and the corresponding upper sums, and check that they conform to Lemma 6.1.6, Lemma 6.1.8, and Proposition 6.1.10.
The sums are

    L(f, P)   = 0^2 · 1/2 + (1/2)^2 · 1/2 = 1/8 = 64/512,
    L(f, P′)  = 0^2 · 3/8 + (3/8)^2 · 1/4 + (5/8)^2 · 3/8 = 93/512,
    L(f, P′′) = 0^2 · 3/8 + (3/8)^2 · 1/8 + (1/2)^2 · 1/8 + (5/8)^2 · 3/8 = 100/512,
    U(f, P)   = (1/2)^2 · 1/2 + 1^2 · 1/2 = 5/8 = 320/512,
    U(f, P′)  = (3/8)^2 · 3/8 + (5/8)^2 · 1/4 + 1^2 · 3/8 = 269/512,
    U(f, P′′) = (3/8)^2 · 3/8 + (1/2)^2 · 1/8 + (5/8)^2 · 1/8 + 1^2 · 3/8 = 260/512.

And indeed,

    max{L(f, P), L(f, P′)} ≤ L(f, P′′) ≤ U(f, P′′) ≤ min{U(f, P), U(f, P′)}.

(b) Let B, Q, Q′ and Q′′ be as in exercise 6.1.1(b), and define f : B −→ R by

    f(x, y) = 0  if 0 ≤ x < 1/2,
    f(x, y) = 1  if 1/2 ≤ x ≤ 1.

U (f, Q′ ) = 0 · 3/16 + 1 · 1/8 + 1 · 3/16 + 0 · 3/16 + 1 · 1/8 + 1 · 3/16 = 5/8, U (f, Q′′ ) = 0 · 3/16 + 1 · 1/16 + 1 · 1/16 + 1 · 3/16 + 0 · 3/16 + 1 · 1/16 + 1 · 1/16 + 1 · 3/16 = 5/8.

Again,

    max{L(f, Q), L(f, Q′)} ≤ L(f, Q′′) ≤ U(f, Q′′) ≤ min{U(f, Q), U(f, Q′)}.

6.1.5. Draw the cartesian product ([a1, b1] ∪ [c1, d1]) × ([a2, b2] ∪ [c2, d2]) ⊂ R^2 where a1 < b1 < c1 < d1 and similarly for the other subscript.
The picture is four boxes arranged like panes of a window. (See figure 2.)

6.2.1. Let f : B −→ R be a bounded function. Explain how Lemma 6.2.2 shows that L∫_B f ≤ U∫_B f.

Figure 2: Cartesian product

Let L be the set of lower sums of f over all partitions P of B, and similarly let U be the set of upper sums. Then by Proposition 6.1.10,

    ℓ ≤ u  for all ℓ ∈ L and u ∈ U.

This is the required condition for the lemma. By the definitions of lower and upper integral, the lemma's conclusion is precisely that L ∫_B f ≤ U ∫_B f.

6.2.2. Let U and L be real numbers satisfying U ≥ L. Show that U = L if and only if U − L < ε for all ε > 0.
If U = L then U − L = 0, so certainly U − L < ε for all ε > 0. On the other hand, if U − L < ε for all ε > 0, then
• The condition U − L > 0 is impossible, because the given information that U − L < ε for all ε > 0 says in particular that U − L < U − L, which is nonsense.
• So, since we are given that U ≥ L, and we now know that U > L is impossible, the only remaining possibility is U = L, as desired.

6.2.3. Let f : B −→ R be the constant function f(x) = k for all x ∈ B. Show that f is integrable over B and ∫_B f = k · vol(B).
Let P be any partition of B. For each subbox J of P, we have mJ(f) = k. It follows that the lower sum for f and P is

    L(f, P) = Σ_J mJ(f) vol(J) = Σ_J k vol(J) = k vol(B).

Since this is independent of the partition P, we have shown that all lower sums have the same value, k vol(B). The lower integral is the least upper bound of the lower sums, so since they all have the same value, it is

    L ∫_B f = k vol(B).

A virtually identical argument shows that the upper integral is also

    U ∫_B f = k vol(B).

Since the lower and upper integrals agree, the integral exists, and it is their shared value,

    ∫_B f = k vol(B).

6.2.4. Fill in the details in the argument that the function f : [0, 1] −→ R with f(x) = 0 for irrational x and f(x) = 1 for rational x is not integrable over [0, 1].
Every positive-length interval J of real numbers contains both rational and irrational numbers. (We take this fact for granted here.) Consequently,

    mJ(f) = 0  and  MJ(f) = 1.

Consequently, every lower sum for this function is

    L(f, P) = Σ_J 0 · length(J) = 0,

and therefore the lower integral is

    L ∫_[0,1] f = 0.

Also, every upper sum for this function is

    U(f, P) = Σ_J 1 · length(J) = Σ_J length(J) = 1,

and therefore the upper integral is

    U ∫_[0,1] f = 1.

Since the lower and upper integrals are unequal, the integral does not exist.

6.2.5. Let B = [0, 1] × [0, 1] ⊂ R². Define a function f : B −→ R by

    f(x, y) = 0 if 0 ≤ x < 1/2,   1 if 1/2 ≤ x ≤ 1.

Show that f is integrable and ∫_B f = 1/2.
Let P = {0, 1/2, 1} × {0, 1}. By a small calculation, L(f, P) = 1/2. Consequently,

    L ∫_B f ≥ 1/2.

On the other hand, let ε be a small positive number, and let

    Pε = {0, (1 − ε)/2, 1/2, 1} × {0, 1}.

By another calculation, U(f, Pε) = 1/2 + ε/2, so U ∫_B f ≤ 1/2 + ε/2 < 1/2 + ε. Since ε > 0 was arbitrary, U ∫_B f ≤ 1/2. Putting all of this together gives

    1/2 ≤ L ∫_B f ≤ U ∫_B f ≤ 1/2.

It follows that L ∫_B f = U ∫_B f = 1/2, showing that the integral exists and is 1/2.
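The two small calculations can be verified exactly. In the sketch below, the subboxes of a grid partition are scanned and f is sampled at subbox corners; this finds the true infimum and supremum here because f is constant in y and its only jump in x occurs at a partition point (corner sampling is not a general-purpose method).

```python
from fractions import Fraction as F

def step(x, y):
    # the function of exercise 6.2.5: 0 for x < 1/2, 1 for x >= 1/2
    return 0 if x < F(1, 2) else 1

def darboux_2d(f, xs, ys):
    """Lower and upper sums over the grid partition xs x ys, sampling corners."""
    L = U = F(0)
    for x0, x1 in zip(xs, xs[1:]):
        for y0, y1 in zip(ys, ys[1:]):
            vals = [f(x, y) for x in (x0, x1) for y in (y0, y1)]
            area = (x1 - x0) * (y1 - y0)
            L += min(vals) * area
            U += max(vals) * area
    return L, U

ys  = [F(0), F(1)]
P   = [F(0), F(1, 2), F(1)]
eps = F(1, 100)
Pe  = [F(0), (1 - eps) / 2, F(1, 2), F(1)]

assert darboux_2d(step, P, ys) == (F(1, 2), F(1))        # L(f,P) = 1/2
assert darboux_2d(step, Pe, ys)[1] == F(1, 2) + eps / 2  # U(f,Pε) = 1/2 + ε/2
```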

6.2.6. This exercise shows that integration is linear. Let f : B −→ R and g : B −→ R be integrable.
(a) Let P be a partition of B and let J be some subbox of P. Show that

    mJ(f) + mJ(g) ≤ mJ(f + g) ≤ MJ(f + g) ≤ MJ(f) + MJ(g).

Show that consequently,

    L(f, P) + L(g, P) ≤ L(f + g, P) ≤ U(f + g, P) ≤ U(f, P) + U(g, P).

This is left as an exercise.
(b) Part (a) of this exercise obtained comparisons between lower and upper sums, analogously to the first paragraph of the proof of Proposition 6.2.4. Argue analogously to the rest of the proof to show that ∫_B (f + g) exists and equals ∫_B f + ∫_B g.
Let ε > 0 be given. There exist partitions P′ and P′′ of B such that

    U(f, P′) − L(f, P′) < ε/2,   U(g, P′′) − L(g, P′′) < ε/2.

Let P be the common refinement of P′ and P′′. Then

    U(f, P) − L(f, P) < ε/2,   U(g, P) − L(g, P) < ε/2,

so that

    (U(f, P) + U(g, P)) − (L(f, P) + L(g, P)) < ε.

By the estimate from part (a), it follows that

    U(f + g, P) − L(f + g, P) < ε.

Thus ∫_B (f + g) exists by the Integrability Criterion.

Similarly, given any ε > 0 we can find a partition P of B such that

    ∫_B f + ∫_B g − ε ≤ L(f, P) + L(g, P) ≤ U(f, P) + U(g, P) ≤ ∫_B f + ∫_B g + ε,

and so consequently

    ∫_B f + ∫_B g − ε ≤ L(f + g, P) ≤ U(f + g, P) ≤ ∫_B f + ∫_B g + ε.

Since ε can be arbitrarily small,

    ∫_B f + ∫_B g ≤ L ∫_B (f + g) ≤ U ∫_B (f + g) ≤ ∫_B f + ∫_B g.

The quantities at either end are equal, so the inequalities must all be equalities. Hence ∫_B (f + g) exists and equals ∫_B f + ∫_B g.
(c) Let c ≥ 0 be any constant. Let P be any partition of B. Show that for any subbox J of P,

    mJ(cf) = c mJ(f)  and  MJ(cf) = c MJ(f).

If c = 0 then all the quantities are 0, so we may assume that c > 0. Since mJ(f) ≤ f(x) for all x ∈ J, it follows that c mJ(f) ≤ c f(x) = (cf)(x) for all x ∈ J, so that c mJ(f) ≤ mJ(cf). Since mJ(cf) ≤ (cf)(x) = c f(x) for all x ∈ J, it follows that mJ(cf)/c ≤ f(x) for all x ∈ J, so that mJ(cf)/c ≤ mJ(f) and therefore mJ(cf) ≤ c mJ(f). The two inequalities show that mJ(cf) = c mJ(f). The argument for MJ is virtually identical.
Explain why consequently

    L(cf, P) = c L(f, P)  and  U(cf, P) = c U(f, P).

This follows immediately:

    L(cf, P) = Σ_J mJ(cf) vol(J) = Σ_J c mJ(f) vol(J) = c Σ_J mJ(f) vol(J) = c L(f, P),

and similarly for upper sums.
Explain why consequently

    L ∫_B cf = c L ∫_B f  and  U ∫_B cf = c U ∫_B f.

Similarly to the argument at the beginning of this part of the problem, sup{c L(f, P)} = c sup{L(f, P)}. It follows that

    L ∫_B cf = sup{L(cf, P)} = sup{c L(f, P)} = c sup{L(f, P)} = c L ∫_B f,

and similarly for the upper integral.
Explain why consequently ∫_B cf exists and

    ∫_B cf = c ∫_B f.

Since L ∫_B f = U ∫_B f = ∫_B f, we have in fact established that

    L ∫_B cf = U ∫_B cf = c ∫_B f,

which is exactly what we need to show.
(d) Let P be any partition of B. Show that for any subbox J of P,

    mJ(−f) = −MJ(f)  and  MJ(−f) = −mJ(f).

Since mJ(−f) ≤ −f(x) for all x ∈ J, it follows that −mJ(−f) ≥ f(x) for all x ∈ J, so that −mJ(−f) ≥ MJ(f), and so mJ(−f) ≤ −MJ(f). Since MJ(f) ≥ f(x) for all x ∈ J, it follows that −MJ(f) ≤ −f(x) for all x ∈ J, so that −MJ(f) ≤ mJ(−f). The two inequalities show that mJ(−f) = −MJ(f). The relation MJ(−f) = −mJ(f) is shown virtually identically.
Explain why consequently

    L(−f, P) = −U(f, P)  and  U(−f, P) = −L(f, P),

and so

    L ∫_B (−f) = −U ∫_B f  and  U ∫_B (−f) = −L ∫_B f,

and so ∫_B (−f) exists and

    ∫_B (−f) = −∫_B f.

The argument is very similar to the argument in part (c).
Explain why the work so far here in part (d) combines with part (c) to show that for any c ∈ R, ∫_B cf exists and

    ∫_B cf = c ∫_B f.

If c ≥ 0 then we have the result from part (c). If c < 0 then c = −c̃ where c̃ > 0, and so by the various results that we have established, skipping some pedantic basic algebra steps,

    ∫_B cf = ∫_B (−c̃f)
           = −∫_B c̃f    by (d), since ∫_B c̃f exists by (c)
           = −c̃ ∫_B f   by (c)
           = c ∫_B f.

6.2.7. This exercise shows that integration preserves order. Let f : B −→ R and g : B −→ R both be integrable, and suppose that f ≤ g, meaning that f(x) ≤ g(x) for all x ∈ B. Show that ∫_B f ≤ ∫_B g.
By exercise 6.2.6, the integral ∫_B (g − f) exists and is equal to ∫_B g − ∫_B f. So it suffices to prove that ∫_B (g − f) ≥ 0. Simplify the notation by replacing the symbol-string "g − f" by "g". Now we only have to prove that if g ≥ 0 then also ∫_B g ≥ 0.
Take any partition P of B. For each subbox J, we have mJ(g) ≥ 0, so that L(g, P) ≥ 0. The condition that each lower sum L(g, P) is at least 0 means that the lower integral, being at least each lower sum, is at least 0. Since g is integrable, its integral is equal to its lower integral, giving the desired result.

6.3.2. Let f : R −→ R be the cubing function f(x) = x³. Give a direct proof that f is ε-δ continuous on R.
Note that for any x, x̃ ∈ R,

    |x̃³ − x³| = |(x̃ − x)(x̃² + x̃x + x²)| = |x̃ − x| |x̃² + x̃x + x²| ≤ |x̃ − x|(|x̃|² + |x̃||x| + |x|²).

Now take |x̃ − x| < 1. We can compute

    |x̃|² + |x̃||x| + |x|² = |x̃ − x + x|² + |x̃ − x + x||x| + |x|²
      ≤ (|x̃ − x| + |x|)² + (|x̃ − x| + |x|)|x| + |x|²
      < (1 + |x|)² + (1 + |x|)|x| + |x|²
      = 1 + 2|x| + |x|² + |x| + |x|² + |x|² = 1 + 3|x| + 3|x|²,

or we can note that |x̃| < |x| + 1 and therefore, again,

    |x̃|² + |x̃||x| + |x|² < (1 + |x|)² + (1 + |x|)|x| + |x|² = 1 + 3|x| + 3|x|².

Now, pick any x ∈ R and let ε > 0 be given. Define

    δ = min{1, ε/(1 + 3|x| + 3|x|²)}.

Then for any x̃ ∈ R such that |x̃ − x| < δ, we have

    |x̃³ − x³| ≤ |x̃ − x|(|x̃|² + |x̃||x| + |x|²)
      < |x̃ − x|(1 + 3|x| + 3|x|²)                          since |x̃ − x| < 1
      < (ε/(1 + 3|x| + 3|x|²))(1 + 3|x| + 3|x|²) = ε       since |x̃ − x| < δ.

6.3.3. Is the cubing function uniformly continuous on R? On [0, 500]?
No, not on R. Let δ > 0 be given. The claim is that this δ fails to satisfy the definition of uniform continuity for ε = 1. To see this, set

    x = 1/√δ,   x̃ = 1/√δ + δ/3.

Then certainly |x̃ − x| < δ, and also

    |x̃³ − x³| = |(1/√δ + δ/3)³ − (1/√δ)³| = 1 + δ^{3/2}/3 + δ³/27 > 1 = ε.

So uniform continuity fails, as claimed. On [0, 500]? Yes: the cubing function is continuous on the set [0, 500], and the set is compact, so the continuity is uniform.

6.3.4. (a) Show: If I ⊂ R is an interval (possibly all of R), f : I −→ R is differentiable, and there exists a positive constant R such that |f′(x)| ≤ R for all x ∈ I, then f is uniformly continuous on I.
This is an application of the Mean Value Theorem. Given ε > 0, let δ = ε/R. Then for all x, x̃ ∈ I,

    f(x̃) − f(x) = f′(c)(x̃ − x)  for some c between x and x̃.
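Both halves of the cubing-function analysis can be spot-checked numerically. The sketch below samples points x̃ within the δ of 6.3.2 and confirms |x̃³ − x³| < ε, then exhibits the pair from the non-uniform-continuity claim of 6.3.3 (the sampling ranges are an arbitrary choice for illustration).

```python
import math
import random

def delta(x, eps):
    # the δ from the solution to 6.3.2
    return min(1.0, eps / (1 + 3 * abs(x) + 3 * x * x))

random.seed(0)
for _ in range(1000):
    x = random.uniform(-10, 10)
    eps = 10 ** random.uniform(-4, 0)
    d0 = delta(x, eps)
    x_near = x + random.uniform(-d0, d0) * 0.999   # any point with |x_near - x| < δ
    assert abs(x_near**3 - x**3) < eps

# the pair from 6.3.3: within δ, yet the cubes differ by more than ε = 1
d = 1e-4
x, xt = 1 / math.sqrt(d), 1 / math.sqrt(d) + d / 3
assert abs(xt - x) < d
assert abs(xt**3 - x**3) > 1
```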

It follows that |f(x̃) − f(x)| = |f′(c)| |x̃ − x| ≤ R|x̃ − x|. So in particular, if |x̃ − x| < δ then (recalling that δ = ε/R)

    |f(x̃) − f(x)| < Rδ = ε,

as desired.
(b) Prove that sine and cosine are uniformly continuous on R.
This is immediate from (a) since |sin′| = |cos| ≤ 1 and similarly for cos′.

6.3.5. Let f : [0, +∞) −→ R be the square root function f(x) = √x. Take for granted that f is ε-δ continuous on [0, +∞).
(a) What does part (a) of the previous problem say about the uniform continuity of f?
Nothing. The derivative 1/(2√x) of the square root function is not defined at 0, and it is unbounded near 0, so the hypotheses are not met. However, the fact that a particular diagnostic tool fails to show that f is uniformly continuous does not preclude the possibility that it is.
(b) Is f uniformly continuous on [0, +∞)?
Yes. The idea of the proof is that given ε > 0, whatever δ works at x = 0 should work all along the graph, because the graph is steepest at the origin. To quantify this statement about the graph, the claim is that:

    For any x, x̃ such that 0 ≤ x ≤ x̃,   √x̃ − √x ≤ √(x̃ − x).

One way to prove the claim is to note first that in general, if 0 ≤ a ≤ b then certainly

    √(b − a) ≤ √(b + a).

Multiply both sides of the inequality by √(b − a) to get

    b − a ≤ √(b² − a²).

In particular this holds for b = √x̃ and a = √x, proving the claim. A variant proof of the claim is to observe that if 0 ≤ x ≤ x̃ then

    0 ≤ (√x̃ − √x)² = x̃ − 2√(x̃x) + x ≤ x̃ − 2x + x = x̃ − x,

and then the claim follows by taking square roots.
A little thought shows that the box that works at x = 0 has ε = √δ, i.e., δ = ε². This is the δ that should work everywhere. Now we can write the proof: Given ε > 0, let δ = ε². Take any x, x̃ ∈ [0, ∞), and assume without loss of generality that x ≤ x̃. Then

    |x̃ − x| < δ  =⇒  |√x̃ − √x| ≤ √(x̃ − x) < √δ = √(ε²) = ε.

Another solution is to argue that
• the square root function is pointwise continuous,
• so it is uniformly continuous on [0, 1] since [0, 1] is compact,
• and it is uniformly continuous on [1, ∞) since it is differentiable with bounded derivative there,
• and the two uniform continuities concatenate to make the square root function uniformly continuous on [0, ∞).

Invoking the continuity of the square root function for the first bullet is fine, and the second and third bullets are supported. But the fourth bullet requires showing that two uniform continuities concatenate to give a single uniform continuity, and the general concatenation argument is not quite as trivial as a person might think: taking δ = min{δ1, δ2} is not guaranteed to work (though it will work for the square root function). For example, consider the function

    f : R −→ R,   f(x) = −1 if x ≤ −1/2,   2x if −1/2 < x < 1/2,   1 if x ≥ 1/2.

Let ε = 1.5. On (−∞, 0] we may take δ1 = 1.1 in response, and on [0, ∞) we may take δ2 = 1.1 in response, but δ = 1.1 does not work on (−∞, ∞). For example, if we take x = −1/2 and x̃ = 1/2 then |x̃ − x| = 1 < 1.1 but

    |f(x̃) − f(x)| = |1 − (−1)| = 2 > 1.5.
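The concatenation counterexample is concrete enough to run. The sketch below implements the piecewise function and exhibits the failing pair; it also brute-force samples one half-line to illustrate that δ1 = 1.1 really does answer ε = 1.5 there (on (−∞, 0] the function only takes values in [−1, 0], so any pair of outputs differs by at most 1).

```python
import random

def f(x):
    # the piecewise counterexample from the solution to 6.3.5(b)
    if x <= -0.5:
        return -1.0
    if x < 0.5:
        return 2.0 * x
    return 1.0

# δ1 = 1.1 suffices for ε = 1.5 on (-∞, 0] (sampled, not a proof)
random.seed(1)
pairs = [(random.uniform(-5, 0), random.uniform(-5, 0)) for _ in range(10_000)]
assert all(abs(f(a) - f(b)) < 1.5 for a, b in pairs if abs(a - b) < 1.1)

# ... but the same δ fails on all of R at the pair x = -1/2, x̃ = 1/2
x, xt = -0.5, 0.5
assert abs(xt - x) < 1.1
assert abs(f(xt) - f(x)) == 2.0   # 2 > ε = 1.5
```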

Generalizing the problem, for any α such that 0 ≤ α ≤ 1, the function fα(x) = x^α on [0, +∞) is uniformly continuous. In particular, the square root function is the case α = 1/2. To solve the generalized problem, use calculus. For any fixed positive h, introduce the function

    gα,h(x) = fα(x + h) − fα(x),   x ≥ 0.

Compute that for any x > 0, because α − 1 ≤ 0 we have

    g′α,h(x) = α((x + h)^{α−1} − x^{α−1}) ≤ 0.

Thus gα,h is decreasing, which is to say that for any x ≥ 0,

    fα(x + h) − fα(x) = gα,h(x) ≤ gα,h(0) = fα(h).

In particular, given nonnegative x and x̃ with x̃ > x, let h = x̃ − x to get

    fα(x̃) − fα(x) ≤ fα(x̃ − x) = (x̃ − x)^α.

Now we can solve the problem as before. Let ε > 0 be given, and set δ = ε^{1/α}. Let 0 ≤ x < x̃ with x̃ − x < δ. Then

    fα(x̃) − fα(x) ≤ (x̃ − x)^α < δ^α = ε.

6.3.6. Let J be a box in Rn with sides of length less than δ/n. Show that any points x and x̃ in J satisfy |x̃ − x| < δ.
Let x = (x1, . . . , xn) and x̃ = (x̃1, . . . , x̃n). Then

    x̃ − x = (x̃1 − x1, . . . , x̃n − xn)  where  |x̃1 − x1| < δ/n, . . . , |x̃n − xn| < δ/n.

And so by the Size Bounds,

    |x̃ − x| ≤ Σ_{i=1}^{n} |x̃i − xi| < Σ_{i=1}^{n} δ/n = δ.
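The key subadditivity inequality fα(x + h) − fα(x) ≤ fα(h) can be stress-tested numerically before trusting the calculus argument. The sampling ranges below are arbitrary illustration choices, and a small float tolerance absorbs rounding.

```python
import random

random.seed(2)
for _ in range(10_000):
    a = random.uniform(0.0, 1.0)       # exponent α ∈ [0, 1]
    x = random.uniform(0.0, 100.0)
    h = random.uniform(0.0, 100.0)     # h = x̃ − x ≥ 0
    # the inequality f_α(x + h) − f_α(x) ≤ f_α(h) from the solution
    assert (x + h)**a - x**a <= h**a + 1e-9
```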

6.3.7. For ∫_B f to exist, it is sufficient that f : B −→ R be continuous, but it is not necessary. What preceding exercise provides an example of this?
Exercise 6.2.5. Here is another example. Let B = [0, 1] and let f : B −→ R be monotonic increasing, meaning that if x1 < x2 in B then f(x1) ≤ f(x2). Show that such a function is bounded, though it need not be continuous. Use the Integrability Criterion to show that ∫_B f exists.
The function is bounded because its outputs lie in a compact interval,

    f(x) ∈ [f(0), f(1)]  for all x ∈ [0, 1].

A discontinuous such function is

    f(x) = 0 if 0 ≤ x ≤ 1/2,   1 if 1/2 < x ≤ 1.

To use the Integrability Criterion to show that ∫_B f exists, let ε > 0 be given. For some positive integer n we have

    (f(1) − f(0))/n < ε.

Partition the interval [0, 1] into n subintervals of equal length 1/n. The lower and upper sums for the partition are

    L(f, P) = 1/n (f(0) + f(1/n) + f(2/n) + · · · + f((n − 1)/n)),
    U(f, P) = 1/n (f(1/n) + f(2/n) + f(3/n) + · · · + f(1)).

Their difference telescopes, and thus by our choice of n,

    U(f, P) − L(f, P) = (f(1) − f(0))/n < ε.

Consequently ∫_[0,1] f exists by the Integrability Criterion.
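The telescoping identity U(f, P) − L(f, P) = (f(1) − f(0))/n is easy to confirm in exact arithmetic for a sample increasing function (the particular f below, increasing with a jump at 1/2, is a hypothetical choice made just for this check).

```python
from fractions import Fraction as F

def darboux_monotone(f, n):
    """Lower/upper sums of an increasing f on [0, 1] over n equal subintervals."""
    pts = [F(k, n) for k in range(n + 1)]
    L = sum(f(a) for a in pts[:-1]) * F(1, n)   # inf on each piece at left end
    U = sum(f(b) for b in pts[1:]) * F(1, n)    # sup on each piece at right end
    return L, U

# an increasing, discontinuous example: 3x² plus a unit jump at x = 1/2
f = lambda x: F(3) * x * x if x < F(1, 2) else F(3) * x * x + 1

for n in (4, 10, 1000):
    L, U = darboux_monotone(f, n)
    assert U - L == (f(F(1)) - f(F(0))) / n   # the telescoping identity
```

Since (f(1) − f(0))/n → 0, the sums squeeze together, exactly as the Integrability Criterion requires.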

6.4.1. (a) Show that for three points a, b, c ∈ R in any order, and any integrable function f : [min{a, b, c}, max{a, b, c}] −→ R, ∫_a^c f = ∫_a^b f + ∫_b^c f.
For example, suppose that b ≤ c ≤ a. Then we know that

    ∫_b^c f + ∫_c^a f = ∫_b^a f,

and so by algebra,

    −∫_c^a f = ∫_b^c f − ∫_b^a f.

Consequently,

    ∫_a^c f = −∫_c^a f             by definition
            = ∫_b^c f − ∫_b^a f    by the previous display
            = ∫_b^c f + ∫_a^b f    by definition
            = ∫_a^b f + ∫_b^c f    obviously.

b a

f =−

Z

a

f

by definition

b

= −(k(a − b))

= k(b − a)

by the result that we already have by algebra.

6.4.3. Show that if F1 , F2 : [a, b] −→ R are differentiable and F1′ = F2′ , then F1 = F2 + C for some constant C. This is a consequence of the Mean Value Theorem. Let G = F1 − F2 . Then G′ = F1′ − F2′ = 0. For any x ∈ [a, b], G(x) − G(a) = (x − a)G′ (c) =0

for some c ∈ [a, x] since G′ = 0.

This shows that G(x) = G(a) for all x ∈ [a, b]. That is, G is some constant C. Since G = F1 − F2 , we are done. 6.4.5. Let f : [0, 1] −→ R be continuous and suppose that for all x ∈ [0, 1], Rx R1 f = x f . What is f ? 0 Rx R1 Rx We are given that 0 f = x f = − 1 f for all x. Differentiate to obtain f (x) = −f (x) for all x. Thus f is identically zero. 6.4.6. Find R xall differentiable functions f : R≥0 −→ R such that for all x ∈ R≥0 , (f (x))2 = 0 f . Differentiate the given relation to get 2f (x)f ′ (x) = f (x). It follows that at every x such that f (x) 6= 0 we have f ′ (x) = 1/2. R0 Next note that (f (0))2 = 0 f = 0, so f (0) = 0. 85

One solution is f (x) = 0 for all x ≥ 0. For any other solution, f (x) 6= 0 for some x. Take any such x and consider the associated set Sx = {˜ x ∈ R≥0 : f (˜ x) = 0 and x ˜ < x}. Then Sx is nonempty (since it contains 0), and Sx is bounded above by x. The completeness of the real number system now says that sup(Sx ) exists. Call it x0 . If x0 is an isolated point of Sx then it belongs to Sx and so f (x0 ) = 0. On the other hand, if x0 is a limit point of Sx then by the continuity of f and the definition of x0 , also f (x0 ) = 0. That is, f (x0 ) = 0 in all cases. Recall that we have fixed some x such that f (x) 6= 0, while on the other hand f (x0 ) = 0. So x0 6= x. Since x and x0 are upper bounds of Sx and since x0 is the least upper bound, it follows that x0 < x. For any c between x0 and x, necessarily f (c) 6= 0, and so f ′ (c) = 1/2. Now compute that by the Mean Value Theorem, f (x) = f (x) − f (x0 ) = f ′ (c)(x − x0 ) 1 = (x − x0 ) 2

for some c ∈ (x0 , x) as just explained.

This calculation shows that f (x) is positive. Next we show that for every x′ > x, also f (x′ ) is positive. The alternative is that for some x′ > x, f (x′ ) = 0. In this case define another set Tx = {˜ x ∈ R≥0 : f (˜ x) = 0 and x ˜ > x}. Then Tx is nonempty (since it contains x′ ), and Tx is bounded below by x. Similarly to the previous argument, the infimum (greatest lower bound) x1 of Tx satisfies f (x1 ) = 0 and x1 ≥ x, and so x1 > x. And then another Mean Value Theorem calculation gives −f (x) = f (x1 ) − f (x) =

1 (x1 − x) > 0. 2

But since f (x) > 0 this is a contradiction. So it is impossible to have any x′ > x such that f (x′ ) = 0. That is, f (x′ ) > 0 for all x′ ≥ x, and so the reasoning of the previous paragraph shows that f (x′ ) =

1 (x − x0 ) 2

for all x′ > x.

All the quantization that has gone on shows that in this expression, x0 is the largest x-value where f is 0, and that f is identically 0 for all values up to x0 . Thus  0 if 0 ≤ x ≤ x0 , f (x) = 1  (x − x0 ) if x > x0 . 2 86

But this function fails to be differentiable at x0 unless x0 = 0. Thus finally, either f is identically 0, or f is the function f (x) = x/2. 6.4.7. RDefine f : R+ −→ R by f (u) = e(u+1/u) /u and F : R+ −→ R by x F (x) = 1 f . Show that F behaves somewhat like a logarithm in that F (1/x) = −F (x) for all x ∈ R+ . Interpret this property of F as a statement about area under the graph of f . We have f (u) = eu+1/u /u. Let φ : R+ −→ R+ be φ(u) = 1/u. Then  e1/u+u −1 eu+1/u (f ◦ φ) · φ′ (u) = · 2 =− = −f (u). 1/u u u

That is, (f ◦ φ) · φ′ = −f . Consequently, F (1/x) =

Z

1/x

f= 1

Z

1

(−f ) = 1/x

Z

1 ′

1/x

(f ◦ φ) · φ =

Z

1 x

f =−

Z

x 1

f = −F (x).
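The identity F(1/x) = −F(x) can be confirmed numerically with any quadrature rule; the sketch below uses composite Simpson's rule (an illustrative choice, not the text's method) and checks the cancellation at a few sample points.

```python
import math

def f(u):
    return math.exp(u + 1 / u) / u

def integral(a, b, n=4000):
    """Composite Simpson's rule for the signed integral of f from a to b (n even)."""
    h = (b - a) / n
    s = f(a) + f(b) + sum((4 if k % 2 else 2) * f(a + k * h) for k in range(1, n))
    return s * h / 3

def F(x):
    return integral(1.0, x)

for x in (1.5, 2.0, 3.0):
    assert abs(F(1 / x) + F(x)) < 1e-6   # F(1/x) = -F(x)
```

Note that when x > 1 the integral defining F(1/x) runs right-to-left, which is exactly the sign convention the solution's chain of equalities uses.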

This says that f has the following property: For any x > 1, the area under the graph of f between 1 and x equals the area under the graph from 1/x to 1.

6.5.4. Let S be the set of rational numbers in [0, 1]. Show that S does not have a volume (i.e., a length) under Definition 6.5.1.
The unit interval [0, 1] is a box containing S, so according to the definition, the volume is

    ∫_[0,1] χS,

if this integral exists. But by exercise 6.2.4, it doesn't.

6.5.5. Prove the Volume Zero Criterion.
By Definition 6.5.1, a set S contained in the box B has volume zero if and only if ∫_B χS exists and equals 0. Since χS(x) ≥ 0 for all x ∈ B, each lower sum L(χS, P) is nonnegative. Thus the lower integral L ∫_B χS is nonnegative. And we have established that the upper integral U ∫_B χS is at least the lower integral. So we have the inequalities

    0 ≤ L ∫_B χS ≤ U ∫_B χS,

and therefore ∫_B χS exists and equals 0 if and only if U ∫_B χS ≤ 0. Since the upper integral is the greatest lower bound of the upper sums, it suffices to show that no positive number is a lower bound of the upper sums. So we need to show that given any ε > 0, some upper sum is less than ε. But the upper sum is the sum of the areas of the type I subboxes, and so the criterion is that ∫_B χS exists and equals 0 if and only if

    for every ε > 0, there exists a partition P of B such that Σ_{J : type I} vol(J) < ε.

6.5.9. Use Theorem 6.5.4, the discussion immediately after its proof, Proposition 6.5.3, and any other results necessary to explain why for each set K and function f : K −→ R below, the integral ∫_K f exists.
(a) K = {(x, y) : 2 ≤ y ≤ 3, 0 ≤ x ≤ 1 + ln y/y}, f(x, y) = e^{xy}.
Put the shaded set K inside the box B = [0, 1 + ln 3/3] × [2, 3]. Extend f from K to B by defining f = 0 on B − K. Then f is discontinuous only on the boundary curve between the shaded and unshaded regions, and this curve is the graph of the function

    ϕ : [2, 3] −→ R,   ϕ(y) = 1 + ln y/y.

Thus the boundary curve has volume zero by Proposition 6.5.3. Consequently, ∫_B f exists by Theorem 6.5.4. This integral is ∫_K f, as explained near the end of the section.
(b) K = {(x, y) : 1 ≤ x ≤ 4, 1 ≤ y ≤ √x}, f(x, y) = e^{x/y²}/y⁵.
This is very similar to (a). This time it involves the graph of the function

    ϕ : [1, 4] −→ R,   ϕ(x) = √x.

(c) K = the region between the curves y = 2x² and x = 4y², f(x, y) = 1.
This is again very similar to (a), but it involves two graphs rather than one.
(d) K = {(x, y) : 1 ≤ x² + y² ≤ 2}, f(x, y) = x².
This is again very similar to (a), but it involves four graphs. The functions in question are

    ϕ1 : [−√2, √2] −→ R,   ϕ1(x) = √(2 − x²),
    ϕ2 : [−√2, √2] −→ R,   ϕ2(x) = −√(2 − x²),
    ϕ3 : [−1, 1] −→ R,     ϕ3(x) = √(1 − x²),
    ϕ4 : [−1, 1] −→ R,     ϕ4(x) = −√(1 − x²).

(e) K = the pyramid with vertices (0, 0, 0), (3, 0, 0), (0, 3, 0), (0, 0, 3/2), and f(x, y, z) = x.
Put the pyramid in a box B. Extend f from K to B by defining f = 0 outside the pyramid. Then f is discontinuous only on the tilted pyramid-roof. This roof is a subset of the graph of the function

    ϕ : [0, 3] × [0, 3] −→ R,   ϕ(x, y) = (3 − x − y)/2.

Hence the roof has area zero by Proposition 6.5.3 and exercise 6.5.6.
(f) K = {x ∈ Rn : |x| ≤ 1} (the solid unit ball in Rn), f(x1, . . . , xn) = x1 · · · xn.
Extend f from the ball K to the box [−1, 1]^n. Then f is discontinuous only on the boundary of K, i.e., on the unit sphere. The upper half of the unit sphere is a subset of the graph of the continuous function ϕ : [−1, 1]^{n−1} −→ R given by

    ϕ(x1, · · · , xn−1) = √(1 − x1² − · · · − xn−1²) if x1² + · · · + xn−1² ≤ 1,   0 if x1² + · · · + xn−1² > 1.

Similarly, the lower half of the unit sphere is a subset of the graph of −ϕ. Consequently, ∫_B f exists by Theorem 6.5.4, and this is ∫_K f by definition.

6.6.1. Let S be the set of points (x, y) ∈ R² between the x-axis and the sine curve as x varies between 0 and 2π. Since the sine curve has two arches between 0 and 2π, and since the area of an arch of sine is 2,

    ∫_S 1 = 4.

On the other hand,

    ∫_{x=0}^{2π} ∫_{y=0}^{sin x} 1 = ∫_{x=0}^{2π} sin x = 0.

Why doesn't this contradict Fubini's Theorem?
The set S has equal parts of its area above and below the x-axis. The double integral simply measures area without being sensitive to this, but the iterated integral is sensitive to it, since each of its one-dimensional piece-integrals takes orientation into account.

6.6.2. Exchange the order of integration in ∫_{x=a}^{b} ∫_{y=a}^{x} f(x, y).
The other iterated integral is ∫_{y=a}^{b} ∫_{x=y}^{b}.

6.6.3. Exchange the inner order of integration in ∫_{x=0}^{1} ∫_{y=0}^{1−x} ∫_{z=0}^{x+y} f.
The result is

    ∫_{x=0}^{1} ( ∫_{z=0}^{x} ∫_{y=0}^{1−x} + ∫_{z=x}^{1} ∫_{y=z−x}^{1−x} ) f.

6.6.4. Exchange the inner order of integration in ∫_{x=0}^{1} ∫_{y=0}^{1} ∫_{z=0}^{x²+y²} f. Sketch the region of integration.
The result is

    ∫_{x=0}^{1} ( ∫_{z=0}^{x²} ∫_{y=0}^{1} + ∫_{z=x²}^{1+x²} ∫_{y=√(z−x²)}^{1} ) f.

6.6.5. Evaluate ∫_K f from parts (a), (b), (c), (f) of exercise 6.5.9.
(a) The integral is

    ∫_{y=2}^{3} ∫_{x=0}^{1+ln y/y} e^{xy} = ∫_{y=2}^{3} (1/y)e^{xy} |_{x=0}^{1+ln y/y} = ∫_{y=2}^{3} (1/y)(e^{y+ln y} − 1)
      = ∫_{y=2}^{3} (e^y − 1/y) = (e^y − ln y) |_{y=2}^{3} = e³ − e² − ln(3/2).
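A numerical cross-check of (a): after the exact inner integration in x, what remains is a one-variable integral in y, which Simpson's rule evaluates to high accuracy and compares against the closed form e³ − e² − ln(3/2).

```python
import math

def inner(y):
    # the inner x-integral in closed form: (1/y)(e^{y + ln y} - 1)
    return (math.exp(y + math.log(y)) - 1.0) / y

def simpson(g, a, b, n=2000):
    h = (b - a) / n
    s = g(a) + g(b) + sum((4 if k % 2 else 2) * g(a + k * h) for k in range(1, n))
    return s * h / 3

value = simpson(inner, 2.0, 3.0)
exact = math.exp(3) - math.exp(2) - math.log(1.5)
assert abs(value - exact) < 1e-9
```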

(b) The integral is

    ∫_{y=1}^{2} ∫_{x=y²}^{4} e^{x/y²}/y⁵ = ∫_{y=1}^{2} e^{x/y²}/y³ |_{x=y²}^{4} = ∫_{y=1}^{2} (e^{4/y²}/y³ − e/y³)
      = (−(1/8)e^{4/y²} + (1/2)e/y²) |_{y=1}^{2}
      = (1/8)(e⁴ − e) − (e/2)(1 − 1/4) = e⁴/8 − e/2.

(c) The integral is

    ∫_{x=0}^{1/16^{1/3}} ∫_{y=2x²}^{√x/2} 1 = ∫_{x=0}^{1/16^{1/3}} (√x/2 − 2x²) = ((1/3)x^{3/2} − (2/3)x³) |_{x=0}^{1/16^{1/3}}
      = 1/(3·4) − 2/(3·16) = 1/12 − 1/24 = 1/24.

(f) The integral is

    ∫_{x1=0}^{1} ∫_{x2=0}^{1} · · · ∫_{xn=0}^{1} x1 x2 · · · xn = ∫_{x1=0}^{1} x1 ∫_{x2=0}^{1} x2 · · · ∫_{xn=0}^{1} xn = (1/2)^n.

6.6.6. Find the volume of the region K bounded by the coordinate planes, x + y = 1, and z = x² + y². Sketch K.
Compute

    ∫_{x=0}^{1} ∫_{y=0}^{1−x} ∫_{z=0}^{x²+y²} 1 = ∫_{x=0}^{1} ∫_{y=0}^{1−x} (x² + y²)
      = ∫_{x=0}^{1} (x²(1 − x) + (1/3)(1 − x)³)
      = ((1/3)x³ − (1/4)x⁴ − (1/12)(1 − x)⁴) |_{x=0}^{1}
      = 1/3 − 1/4 + 1/12 = 1/6.

6.6.7. Evaluate ∫_K (1 + x + y + z)^{−3} where K is the unit simplex.
Compute

    ∫_{x=0}^{1} ∫_{y=0}^{1−x} ∫_{z=0}^{1−x−y} (1 + x + y + z)^{−3}
      = −(1/2) ∫_{x=0}^{1} ∫_{y=0}^{1−x} (1 + x + y + z)^{−2} |_{z=0}^{1−x−y}
      = −(1/2) ∫_{x=0}^{1} ∫_{y=0}^{1−x} (1/4 − (1 + x + y)^{−2})
      = −(1/2) ∫_{x=0}^{1} ((1/4)y + (1 + x + y)^{−1}) |_{y=0}^{1−x}
      = −(1/2) ∫_{x=0}^{1} ((1/4)(1 − x) + 1/2 − (1 + x)^{−1})
      = −(1/2) (1/8 + 1/2 − ln 2)
      = (ln 2)/2 − 5/16.
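As a sanity check on 6.6.7, the one-variable integrand left after integrating out z and then y can be handed to Simpson's rule and compared against (ln 2)/2 − 5/16.

```python
import math

def g(x):
    # the reduced integrand: (1/4)(1 - x) + 1/2 - 1/(1 + x)
    return 0.25 * (1 - x) + 0.5 - 1.0 / (1 + x)

def simpson(h, a, b, n=2000):
    step = (b - a) / n
    s = h(a) + h(b) + sum((4 if k % 2 else 2) * h(a + k * step) for k in range(1, n))
    return s * step / 3

value = -0.5 * simpson(g, 0.0, 1.0)
assert abs(value - (math.log(2) / 2 - 5 / 16)) < 1e-10
```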

6.6.8. Find the volume of the region K in the first octant bounded by x = 0, z = 0, z = y, and x = 4 − y². Sketch K.
Compute

    ∫_{y=0}^{2} ∫_{x=0}^{4−y²} ∫_{z=0}^{y} 1 = ∫_{y=0}^{2} y ∫_{x=0}^{4−y²} 1 = ∫_{y=0}^{2} y(4 − y²)
      = ∫_{y=0}^{2} (4y − y³) = (2y² − (1/4)y⁴) |_{y=0}^{2} = 8 − 4 = 4.

6.6.11. Let K and L be compact subsets of Rn with boundaries of volume zero. Suppose that for each x1 ∈ R, the cross-sectional sets

    Kx1 = {(x2, . . . , xn) : (x1, x2, . . . , xn) ∈ K},
    Lx1 = {(x2, . . . , xn) : (x1, x2, . . . , xn) ∈ L}

have equal (n − 1)-dimensional volumes. Show that K and L have the same volume. Illustrate for n = 2.
Compute

    vol(K) = ∫_K 1 = ∫_{x1} ∫_{Kx1} 1 = ∫_{x1} vol(Kx1) = ∫_{x1} vol(Lx1) = ∫_{x1} ∫_{Lx1} 1 = ∫_L 1 = vol(L).

6.6.13. Let n ∈ Z+ and r ∈ R≥0. The n-dimensional simplex of side r is

    Sn(r) = {(x1, . . . , xn) : 0 ≤ x1, . . . , 0 ≤ xn, x1 + · · · + xn ≤ r}.

(a) Show that for n > 1, Sn(r) = ⊔_{xn ∈ [0,r]} Sn−1(r − xn) × {xn}. That is, Sn(r) is a disjoint union of cross-sectional (n − 1)-dimensional simplices of side r − xn at height xn as xn varies from 0 to r. Make sketches for n = 2 and n = 3.
For all (x1, . . . , xn−1) ∈ R^{n−1} and any fixed xn ∈ [0, r], we have the equivalences

    (x1, . . . , xn) ∈ Sn(r) ⇐⇒ 0 ≤ x1, 0 ≤ x2, . . . , 0 ≤ xn−1 and x1 + x2 + · · · + xn−1 ≤ r − xn
                             ⇐⇒ (x1, . . . , xn−1) ∈ Sn−1(r − xn).

That is, the cross-section of Sn(r) at xn is Sn−1(r − xn). The result follows.
(b) Prove that vol(S1(r)) = r. Compute vol(S1(r)) = ∫_{x1=0}^{r} 1 = r. Use Fubini's Theorem to prove that

    vol(Sn(r)) = ∫_{xn=0}^{r} vol(Sn−1(r − xn))  for n > 1,

and show by induction that vol(Sn(r)) = r^n/n!.
Compute

    vol(Sn(r)) = ∫_{Sn(r)} 1 = ∫_{xn=0}^{r} ∫_{Sn−1(r−xn)} 1      by Fubini's Theorem
               = ∫_{xn=0}^{r} vol(Sn−1(r − xn))
               = ∫_{xn=0}^{r} (r − xn)^{n−1}/(n − 1)!            by induction hypothesis
               = −(r − xn)^n/n! |_{xn=0}^{r} = r^n/n!.

(c) Use Fubini's Theorem to show that ∫_{Sn(r)} xn = ∫_{xn=0}^{r} xn (r − xn)^{n−1}/(n − 1)!. Work this integral by parts to get ∫_{Sn(r)} xn = r^{n+1}/(n + 1)!.
This is similar to (b). Compute

    ∫_{Sn(r)} xn = ∫_{xn=0}^{r} xn ∫_{Sn−1(r−xn)} 1              by Fubini's Theorem
                 = ∫_{xn=0}^{r} xn (r − xn)^{n−1}/(n − 1)!       as in (b).

Now let u = xn and dv = (r − xn)^{n−1}/(n − 1)!, so that du = 1 and v = −(r − xn)^n/n!, and continue,

    ∫_{Sn(r)} xn = −xn (r − xn)^n/n! |_{xn=0}^{r} + ∫_{xn=0}^{r} (r − xn)^n/n!
                 = −(r − xn)^{n+1}/(n + 1)! |_{xn=0}^{r} = r^{n+1}/(n + 1)!.

(d) The centroid of Sn(r) is (x̄1, . . . , x̄n), where x̄j = ∫_{Sn(r)} xj / vol(Sn(r)) for each j. What are these coordinates explicitly?
By the previous calculations,

    x̄n = (r^{n+1}/(n + 1)!) / (r^n/n!) = r/(n + 1).

By symmetry, x̄j = r/(n + 1) for all j. When r = 1 and n = 3, this gives x̄ = ȳ = z̄ = 1/4, as in the text.
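The formula vol(Sn(1)) = 1/n! can be sanity-checked independently of the induction, e.g. by seeded Monte Carlo sampling of the unit cube (a loose-tolerance check, not a proof; the sample size and seed are arbitrary choices).

```python
import random
from math import factorial

def monte_carlo_simplex_volume(n, samples=200_000, seed=42):
    """Estimate vol{x in [0,1]^n : x1 + ... + xn <= 1} by uniform sampling."""
    rng = random.Random(seed)
    hits = sum(1 for _ in range(samples)
               if sum(rng.random() for _ in range(n)) <= 1)
    return hits / samples

for n in (2, 3, 4):
    assert abs(monte_carlo_simplex_volume(n) - 1 / factorial(n)) < 0.01
```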

93

6.7.1. Evaluate ∫_S (x² + y²) where S is the region bounded by x² + y² = 2z and z = 2. Sketch S.
Using cylindrical coordinates, compute

    ∫_S (x² + y²) = ∫_{φ(K)} (x² + y²) = ∫_K r² · r = ∫_{θ=0}^{2π} ∫_{r=0}^{2} ∫_{z=r²/2}^{2} r³
      = 2π ∫_{r=0}^{2} r³ (2 − r²/2)
      = 2π (r⁴/2 |_{0}^{2} − r⁶/12 |_{0}^{2}) = 2π (8 − 64/12) = 16π/3.

6.7.2. Find the volume of the region S between x² + y² = 4z and x² + y² + z² = 5. Sketch S.
The surfaces intersect at z such that 4z = x² + y² = 5 − z², i.e., z = 1, and x² + y² = 4, i.e., r² = 4. So, using cylindrical coordinates, compute

    ∫_S 1 = ∫_{θ=0}^{2π} ∫_{r=0}^{2} ∫_{z=r²/4}^{√(5−r²)} r = 2π ∫_{r=0}^{2} (r √(5 − r²) − r³/4)
      = 2π (−(1/3)(5 − r²)^{3/2} − r⁴/16) |_{r=0}^{2}
      = 2π ((1/3)(5^{3/2} − 1) − 1) = (2π/3)(5√5 − 4).

6.7.3. Find the volume of the region between the graphs of z = x² + y² and z = (x² + y² + 1)/2.
The graphs intersect where x² + y² = (x² + y² + 1)/2, i.e., x² + y² = 1, i.e., r = 1. So, using cylindrical coordinates, compute

    ∫_{θ=0}^{2π} ∫_{r=0}^{1} ∫_{z=r²}^{(r²+1)/2} r = 2π ∫_{r=0}^{1} (r(r² + 1)/2 − r³) = 2π ∫_{r=0}^{1} (r − r³)/2
      = 2π (r²/4 − r⁴/8) |_{r=0}^{1} = 2π(1/4 − 1/8) = π/4.
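The closed form (2π/3)(5√5 − 4) for 6.7.2 can be cross-checked by numerically integrating the cylindrical-coordinate radial integrand with Simpson's rule.

```python
import math

def shell(r):
    # cylindrical integrand for 6.7.2: r * (top surface - bottom surface)
    return r * (math.sqrt(5 - r * r) - r * r / 4)

def simpson(f, a, b, n=20000):
    h = (b - a) / n
    s = f(a) + f(b) + sum((4 if k % 2 else 2) * f(a + k * h) for k in range(1, n))
    return s * h / 3

vol = 2 * math.pi * simpson(shell, 0.0, 2.0)
exact = 2 * math.pi / 3 * (5 * math.sqrt(5) - 4)
assert abs(vol - exact) < 1e-6
```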

6.7.5. Let φ be the spherical coordinate mapping. Describe φ(K) where

    K = {(ρ, θ, ϕ) : 0 ≤ θ ≤ 2π, 0 ≤ ϕ ≤ π/2, 0 ≤ ρ ≤ cos ϕ}.

The range of θ- and ϕ-values shows that φ(K) sits in the upper half of (x, y, z)-space, where z ≥ 0, and that φ(K) is symmetric about the z-axis. The curve ρ = cos ϕ in parameter space also has the equation ρ² = ρ cos ϕ, so long as we rule out ρ = 0 unless ϕ = π/2. The spherical coordinate map φ takes the curve to points (x, y, z) such that

    x² + y² + z² = z.

That is, x² + y² + z² − z + (1/2)² = (1/2)², or

    x² + y² + (z − 1/2)² = (1/2)².

This is a sphere of radius 1/2 centered at (0, 0, 1/2), i.e., a sphere centered on the positive z-axis, tangent to the (x, y)-plane, of diameter 1.
Same question for K = {(ρ, θ, ϕ) : 0 ≤ θ ≤ 2π, 0 ≤ ϕ ≤ π, 0 ≤ ρ ≤ sin ϕ}.
The analysis here is similar. This time φ(K) is again symmetric about the z-axis but is not restricted to the upper half space. The curve ρ = sin ϕ in parameter space also has the equation ρ² = ρ sin ϕ, so long as we rule out ρ = 0 unless ϕ ∈ {0, π}. Since φ(K) is symmetric about the z-axis, fix θ at 0 and note that the spherical coordinate map φ takes the curve to points (x, 0, z) such that

    x² + z² = x.

That is, x² − x + (1/2)² + z² = (1/2)², or

    (x − 1/2)² + z² = (1/2)².

This is a circle of radius 1/2 in the (x, z)-plane, centered at (1/2, 0, 0). So its rotation about the z-axis is a sort of degenerate torus of inner radius 0 and outer radius 1.

6.7.6. Evaluate ∫_S xyz where S is the first octant of B3(1).
Compute, using spherical coordinates,

    ∫_S xyz = ∫_{φ(K)} xyz = ∫_K ρ cos θ sin ϕ · ρ sin θ sin ϕ · ρ cos ϕ · ρ² sin ϕ
      = ∫_{θ=0}^{π/2} cos θ sin θ ∫_{ρ=0}^{1} ρ⁵ ∫_{ϕ=0}^{π/2} sin³ ϕ cos ϕ
      = (1/2) sin² θ |_{θ=0}^{π/2} · ρ⁶/6 |_{ρ=0}^{1} · (1/4) sin⁴ ϕ |_{ϕ=0}^{π/2}
      = (1/2) · (1/6) · (1/4) = 1/48.

6.7.7. Find the mass of a solid figure filling the spherical shell

    S = B3(b) − B3(a)

with density δ(x, y, z) = x² + y² + z². Use spherical coordinates and Fubini's Theorem.
In terms of spherical coordinates, the density is ρ², so the mass is

    M = ∫_{θ=0}^{2π} ∫_{ϕ=0}^{π} ∫_{ρ=a}^{b} ρ² · ρ² sin ϕ = 2π · ∫_{ϕ=0}^{π} sin ϕ · ∫_{ρ=a}^{b} ρ⁴
      = 2π · 2 · (b⁵ − a⁵)/5 = 4π(b⁵ − a⁵)/5.

6.7.8. A solid sphere of radius r has density δ(x, y, z) = e^{−(x^2+y^2+z^2)^{3/2}}. Find its mass, ∫_{B3(r)} δ.

Again, use spherical coordinates. In spherical coordinates the density is e^{−ρ^3}, and so the mass is

    M = ∫_{θ=0}^{2π} ∫_{ϕ=0}^{π} ∫_{ρ=0}^{r} e^{−ρ^3} ρ^2 sin ϕ = 2π · ∫_{ϕ=0}^{π} sin ϕ · ∫_{ρ=0}^{r} e^{−ρ^3} ρ^2

    = 4π · (−(1/3) e^{−ρ^3}) |_{ρ=0}^{r} = (4π/3)(1 − e^{−r^3}).
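As an informal numerical check of the closed form (a sketch; the test radius and grid size are my assumptions):

```python
# Check of exercise 6.7.8: integrating the density e^{-rho^3} over the ball
# of radius r in spherical coordinates gives
#   M = 4*pi * int_0^r e^{-rho^3} rho^2 d(rho) = (4*pi/3)*(1 - e^{-r^3}).
import math

def midpoint(f, a, b, n=4000):
    """Midpoint-rule approximation of the integral of f over [a, b]."""
    h = (b - a) / n
    return h * sum(f(a + (i + 0.5) * h) for i in range(n))

r = 1.7  # arbitrary test radius
numeric = 4 * math.pi * midpoint(lambda rho: math.exp(-rho**3) * rho**2, 0.0, r)
closed_form = (4 * math.pi / 3) * (1 - math.exp(-r**3))
print(numeric, closed_form)  # should agree to several decimal places
```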

6.7.9. Find the centroid of the region S = B3(a) ∩ {x^2 + y^2 ≤ z^2} ∩ {z ≥ 0}. Sketch S.

The region is an ice cream cone, so by symmetry, x̄ = ȳ = 0. Computing with spherical coordinates shows that the volume of the region is

    V = ∫_{θ=0}^{2π} ∫_{ϕ=0}^{π/4} ∫_{ρ=0}^{a} ρ^2 sin ϕ = (a^3/3) · 2π · (1 − √2/2) = (πa^3/3)(2 − √2).

Similarly, since z in spherical coordinates is ρ cos ϕ, integrating z over the region gives

    ∫_S z = ∫_{ρ=0}^{a} ρ^3 · ∫_{θ=0}^{2π} ∫_{ϕ=0}^{π/4} sin ϕ cos ϕ = (a^4/4) · 2π · (1/2) sin^2 ϕ |_{ϕ=0}^{π/4} = πa^4/8.

It follows that

    z̄ = (πa^4/8) / (πa^3(2 − √2)/3) = 3(2 + √2)a/16.
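An informal numerical check of the centroid with a = 1 (a sketch; the grid sizes are my assumptions): the volume and the z-integral separate into one-variable factors in spherical coordinates, and their ratio should match 3(2 + √2)/16 ≈ 0.6402.

```python
# Check of exercise 6.7.9 with a = 1.  In spherical coordinates:
#   V       = 2*pi * int_0^{pi/4} sin(p) dp         * int_0^1 rho^2 d(rho),
#   int_S z = 2*pi * int_0^{pi/4} sin(p)cos(p) dp   * int_0^1 rho^3 d(rho).
import math

def midpoint(f, a, b, n=4000):
    h = (b - a) / n
    return h * sum(f(a + (i + 0.5) * h) for i in range(n))

V = 2 * math.pi * midpoint(math.sin, 0, math.pi / 4) * midpoint(lambda r: r**2, 0, 1)
z_int = (2 * math.pi * midpoint(lambda p: math.sin(p) * math.cos(p), 0, math.pi / 4)
         * midpoint(lambda r: r**3, 0, 1))
z_bar = z_int / V
print(z_bar, 3 * (2 + math.sqrt(2)) / 16)  # both approximately 0.6402
```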

6.7.10. (a) Prove Pappus's Theorem: Let K be a compact set in the (x, z)-plane lying to the right of the z-axis and with boundary of area zero. Let S be the solid obtained by rotating K about the z-axis in R^3. Then

    vol(S) = 2πx̄ · area(K),

where as always, x̄ = ∫_K x / area(K).

Use triples (x, θ, z) rather than the usual (r, θ, z) to denote cylindrical coordinates. Let R = {(x, θ, z) : (x, z) ∈ K, 0 ≤ θ ≤ 2π}. Use cylindrical coordinates to parametrize S,

    φ : R −→ S,   φ(x, θ, z) = (x cos θ, x sin θ, z).

Then by the Change of Variable Theorem and then Fubini's Theorem, the volume of S is

    vol(S) = ∫_S 1 = ∫_{φ(R)} 1 = ∫_R x = ∫_{θ=0}^{2π} ∫_{(x,z)∈K} x = 2π ∫_K x = 2πx̄ · area(K).

(b) What is the volume of the torus T_{a,b} of cross-sectional radius a and major radius b?

By the formula from part (a), the volume is vol(T_{a,b}) = 2πb · πa^2 = 2π^2 a^2 b.

6.7.11. Prove the change of scale principle: If the set K ⊂ R^n has volume v then for any r ≥ 0, the set rK = {rx : x ∈ K} has volume r^n v.

The map

    φ : K −→ rK,   φ(x) = rx

is linear, so it is its own derivative: φ′(x) = rI (where I is the n-by-n identity matrix), which has determinant r^n. By the Change of Variable Theorem,

    vol(rK) = ∫_{rK} 1 = ∫_{φ(K)} 1 = ∫_K r^n = r^n ∫_K 1 = r^n vol(K) = r^n v.

6.7.12. (Volume of the n-ball, first version.) Let n ∈ Z^+ and r ∈ R^{≥0}. The n-dimensional ball of radius r is B_n(r) = {x ∈ R^n : |x| ≤ r}. Let v_n = vol(B_n(1)).

(a) Explain how exercise 6.7.11 reduces computing the volume of B_n(r) to computing v_n.

It is straightforward to show that B_n(r) = rB_n(1) since in general |rx| = r|x| for r ≥ 0 and x ∈ R^n:

    B_n(r) = {x ∈ R^n : |x| ≤ r} = {rx : x ∈ R^n, |x| ≤ 1} = rB_n(1).

Consequently, the change of scale principle shows that

    vol(B_n(r)) = ∫_{B_n(r)} 1 = ∫_{rB_n(1)} 1 = r^n ∫_{B_n(1)} 1 = r^n vol(B_n(1)) = r^n v_n.

(b) Show that v_1 = 2 and v_2 = π.

Note that v_1 is the length of [−1, 1], which is 2, and that v_2 is the area of the unit disk, which is the very definition of π.

(c) Let D denote the unit disk B_2(1). Explain why for n > 2,

    B_n(1) = ⊔_{(x_1,x_2)∈D} {(x_1, x_2)} × B_{n−2}(√(1 − x_1^2 − x_2^2)).

That is, the unit n-ball is a union of cross-sectional (n − 2)-dimensional balls of radius √(1 − x_1^2 − x_2^2) as (x_1, x_2) varies through the unit disk. Make a sketch for n = 3, the only value of n for which we can see this.

For any x = (x_1, . . . , x_n) we have the equivalences

    x ∈ B_n(1) ⟺ x_1^2 + x_2^2 + x_3^2 + · · · + x_n^2 ≤ 1
              ⟺ x_3^2 + · · · + x_n^2 ≤ 1 − x_1^2 − x_2^2
              ⟺ (x_1, x_2) ∈ D and (x_3, . . . , x_n) ∈ B_{n−2}(√(1 − x_1^2 − x_2^2)).

(d) The problem assumes that n > 2 and gives a string of equalities. The first equality is

    v_n = v_{n−2} ∫_{(x_1,x_2)∈D} (1 − x_1^2 − x_2^2)^{n/2−1}.

To show this, note that by the definition of volume and by part (c),

    v_n = ∫_{B_n(1)} 1 = ∫_{⊔ {(x_1,x_2)}×B_{n−2}(√(1−x_1^2−x_2^2)) : (x_1,x_2)∈D} 1.

Next, by Fubini's Theorem the last integral is

    ∫_{(x_1,x_2)∈D} ∫_{B_{n−2}(√(1−x_1^2−x_2^2))} 1.

The inner integral is the volume of B_{n−2}(√(1 − x_1^2 − x_2^2)), and by the change of scale principle this is (1 − x_1^2 − x_2^2)^{n/2−1} v_{n−2}. So now we have

    v_{n−2} ∫_{(x_1,x_2)∈D} (1 − x_1^2 − x_2^2)^{n/2−1}.

Switching to polar coordinates, the integral becomes

    v_{n−2} ∫_{(r,θ)∈[0,1]×[0,2π]} r(1 − r^2)^{n/2−1},

and by Fubini's Theorem this is

    v_{n−2} ∫_{θ=0}^{2π} ∫_{r=0}^{1} r(1 − r^2)^{n/2−1}.

And by a short calculation, this is

    −v_{n−2} · 2π · (1 − r^2)^{n/2}/n |_{r=0}^{1} = v_{n−2} · π/(n/2) = (2π/n) v_{n−2}.

(e) Show by induction the even case of the formula

    v_n = π^{n/2}/(n/2)!                     for n even,
    v_n = 2^n ((n − 1)/2)! π^{(n−1)/2}/n!    for n odd.

For n = 2, the claimed value of v_n is π^{2/2}/(2/2)! = π, which is indeed v_2. Now the induction can proceed in steps of 2. For even n ≥ 2, if the formula holds for n then the right side for n + 2 is

    π^{(n+2)/2}/((n + 2)/2)! = (π/((n + 2)/2)) · π^{n/2}/(n/2)! = (2π/(n + 2)) v_n,

and this is v_{n+2} by part (d). The induction is complete.

6.7.13. This exercise computes the "improper" integral I = ∫_{x=0}^{∞} e^{−x^2}, defined as the limit lim_{R→∞} ∫_{x=0}^{R} e^{−x^2}. Let I(R) = ∫_{x=0}^{R} e^{−x^2} for any R ≥ 0.

(a) Use Fubini's Theorem to show that I(R)^2 = ∫_{S(R)} e^{−x^2−y^2}, where S(R) is the square S(R) = {(x, y) : 0 ≤ x ≤ R, 0 ≤ y ≤ R}.

Note that the variable of integration x in the formula for I(R) is a dummy variable. So we can compute

    I(R)^2 = I(R) · I(R) = ∫_{x=0}^{R} e^{−x^2} · ∫_{y=0}^{R} e^{−y^2} = ∫_{x=0}^{R} ∫_{y=0}^{R} e^{−x^2−y^2}.

By Fubini's Theorem, this last integral is ∫_{S(R)} e^{−x^2−y^2}.

(b) Let Q(R) be the quarter disk

    Q(R) = {(x, y) : 0 ≤ x, 0 ≤ y, x^2 + y^2 ≤ R^2},

and similarly for Q(√2 R). Explain why

    ∫_{Q(R)} e^{−x^2−y^2} ≤ ∫_{S(R)} e^{−x^2−y^2} ≤ ∫_{Q(√2 R)} e^{−x^2−y^2}.

The inequalities between the integrals follow from two conditions. First, we have a containment of sets,

    Q(R) ⊂ S(R) ⊂ Q(√2 R),

and second, the integrand e^{−x^2−y^2} is positive.

(c) Change variables, and evaluate ∫_{Q(R)} e^{−x^2−y^2} and ∫_{Q(√2 R)} e^{−x^2−y^2}. What are the limits of these two quantities as R → ∞?

Compute, using polar coordinates and Fubini's Theorem, that

    ∫_{Q(R)} e^{−x^2−y^2} = ∫_{θ=0}^{π/2} ∫_{r=0}^{R} re^{−r^2} = (π/2) · (−(1/2) e^{−r^2}) |_{r=0}^{R} = (π/4)(1 − e^{−R^2}).

Substitute √2 R for R to get that also

    ∫_{Q(√2 R)} e^{−x^2−y^2} = (π/4)(1 − e^{−2R^2}).

(d) What is I?

Both of the integrals computed in (c) go to π/4 as R goes to infinity. Since the quantity I(R)^2 is trapped between them, it is squeezed to π/4 as well. Hence the desired integral is

    I = ∫_{x=0}^{∞} e^{−x^2} = √π/2.

6.7.14. (Volume of the n-ball, improved version) Define the gamma function as an integral,

    Γ(s) = ∫_{x=0}^{∞} x^{s−1} e^{−x} dx,   s > 0.

(a) Show that Γ(1) = 1.

Compute

    Γ(1) = ∫_{x=0}^{∞} x^0 e^{−x} dx = −e^{−x} |_{x=0}^{∞} = −(0 − 1) = 1.

Show that Γ(1/2) = √π.

Start from

    Γ(1/2) = ∫_{x=0}^{∞} x^{−1/2} e^{−x} dx.

Let x = y^2, so that y = x^{1/2} and dy = (1/2)x^{−1/2} dx. Then by the previous exercise,

    Γ(1/2) = 2 ∫_{y=0}^{∞} e^{−y^2} dy = 2 · √π/2 = √π.

Show that Γ(s + 1) = sΓ(s).

Start from

    Γ(s + 1) = ∫_{x=0}^{∞} x^s e^{−x} dx.

Let u = x^s and dv = e^{−x} dx. Then du = sx^{s−1} dx and v = −e^{−x}. Integration by parts gives

    Γ(s + 1) = −x^s e^{−x} |_{x=0}^{∞} + s ∫_{x=0}^{∞} x^{s−1} e^{−x} dx.

For large x, the exponential decay of e^{−x} dominates the polynomial growth of x^s, and since s > 0, x^s e^{−x} is zero at x = 0. So the boundary term of the previous display vanishes. The remaining integral is Γ(s), giving Γ(s + 1) = sΓ(s) as desired.

(b) Use part (a) to show that Γ(n) = (n − 1)! for n = 1, 2, 3, . . . .

This is immediate by induction on n since Γ(1) = 1 = 0! and then for n ≥ 1, if we assume inductively that Γ(n) = (n − 1)! then also

    Γ(n + 1) = nΓ(n) = n(n − 1)! = n!,

and this completes the induction.

(c) Use exercise 6.7.12(b), exercise 6.7.12(d), and the extended definition of the factorial in part (b) of this exercise to obtain a uniform formula for the volume of the unit n-ball,

    v_n = π^{n/2}/(n/2)!,   n = 1, 2, 3, . . . ,

where (n/2)! means Γ(n/2 + 1). Thus the n-ball of radius r has volume

    vol(B_n(r)) = (π^{n/2}/(n/2)!) r^n,   n = 1, 2, 3, . . . .

We already have the formula for v_n if n is even. For n odd, the argument is essentially identical to exercise 6.7.12(e) but starting at the base case n = 1. For the base case, we need to show that π^{1/2}/Γ(1/2 + 1) = 2. So compute that indeed

    π^{1/2}/Γ(3/2) = √π/((1/2) · Γ(1/2)) = √π/((1/2)√π) = 2.

For the induction step, assume that

    v_{n−2} = π^{(n−2)/2}/Γ((n − 2)/2 + 1) = π^{(n−2)/2}/Γ(n/2).

Then

    v_n = (π/(n/2)) · v_{n−2} = (π/(n/2)) · π^{(n−2)/2}/Γ(n/2) = π^{n/2}/Γ(n/2 + 1).

This completes the induction.
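The gamma-function identities and the uniform volume formula can be cross-checked numerically with the standard library's math.gamma (a sketch; the sample values of s and n are my choices, not part of the text):

```python
# Checks of exercise 6.7.14 using Python's math.gamma.
import math

# Gamma(1) = 1, Gamma(1/2) = sqrt(pi), and Gamma(s + 1) = s * Gamma(s).
assert abs(math.gamma(1.0) - 1.0) < 1e-12
assert abs(math.gamma(0.5) - math.sqrt(math.pi)) < 1e-12
for s in (0.5, 1.3, 4.0):
    assert abs(math.gamma(s + 1) - s * math.gamma(s)) < 1e-9

# Gamma(n) = (n - 1)! for positive integers n.
for n in range(1, 8):
    assert math.isclose(math.gamma(n), math.factorial(n - 1))

# v_n = pi^(n/2) / Gamma(n/2 + 1) reproduces v_1 = 2, v_2 = pi, v_3 = 4*pi/3,
# and satisfies the recursion v_n = (2*pi/n) * v_{n-2} from exercise 6.7.12(d).
def v(n):
    return math.pi ** (n / 2) / math.gamma(n / 2 + 1)

assert math.isclose(v(1), 2.0)
assert math.isclose(v(2), math.pi)
assert math.isclose(v(3), 4 * math.pi / 3)
for n in range(3, 12):
    assert math.isclose(v(n), (2 * math.pi / n) * v(n - 2))
print("all gamma/volume identities check out")
```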

Chapter 8

8.2.1. (a) Let α : I −→ R^n be a regular curve that doesn't pass through the origin, but has a point α(t_0) of nearest approach to the origin. Show that the position vector α(t_0) and the velocity vector α′(t_0) are orthogonal.

The scalar-valued function f(t) = |α(t)|^2 = ⟨α(t), α(t)⟩ has a minimum at t_0. Its derivative is f′(t) = 2⟨α(t), α′(t)⟩, so that since f′(t_0) = 0,

    ⟨α(t_0), α′(t_0)⟩ = 0.

The previous display says precisely that α(t_0) ⊥ α′(t_0), as desired. Geometrically it is clear that at the point of nearest approach to the origin, the velocity is orthogonal to the position.

(b) Find a regular curve α : I −→ R^n that does not pass through the origin and does not have a point of nearest approach to the origin.

Let I = (0, 1), let n = 1, and let α be the identity map. Does an example exist with I compact? No. If I is compact then as in part (a), the continuous function f(t) = |α(t)| assumes a minimum.

8.2.2. Let α be a regular parametrized curve with α′′(t) = 0 for all t ∈ I. What is the nature of α?

The curve α is a line. Since α′′ vanishes componentwise, α takes the form

    α(t) = (a_1 t + b_1, a_2 t + b_2, . . . , a_n t + b_n).

That is, letting d = (a_1, . . . , a_n) and p = (b_1, . . . , b_n), α(t) = td + p. Furthermore, since α′(t) = d for all t, it follows that d ≠ 0 since α is regular. In sum, the trace of α is the line through the point p having direction d.

8.2.3. Let α : I −→ R^n be a parametrized curve and let v ∈ R^n be a fixed vector. Assume that ⟨α′(t), v⟩ = 0 for all t ∈ I and that ⟨α(t_0), v⟩ = 0 for some t_0 ∈ I. Prove that ⟨α(t), v⟩ = 0 for all t ∈ I. What is the geometric idea?

For any t ∈ I we have

    ⟨α(t), v⟩′ = ⟨α′(t), v⟩ + ⟨α(t), v′⟩ = 0 + ⟨α(t), 0⟩ = 0 + 0 = 0.

Thus ⟨α(t), v⟩ is constant, and since ⟨α(t_0), v⟩ = 0 the constant is 0. The geometric idea is that the trace of α lies in the hyperplane orthogonal to v.

8.3.2. The parametrized curve

    α : [0, +∞) −→ R^2,   α(t) = (ae^{bt} cos t, ae^{bt} sin t)

(where a > 0 and b < 0 are real constants) is called a logarithmic spiral.

(a) Show that as t → +∞, α(t) spirals in toward the origin.

Compute that

    lim_{t→∞} α(t) = lim_{t→∞} (ae^{bt} cos t, ae^{bt} sin t) = (0, 0).

The approach to the origin winds around and around because α(t) is a positive scalar multiple of (cos t, sin t).

(b) Show that as t → +∞, L(0, t) remains bounded. Thus the spiral has finite length.

Compute that since

    |α′(t)|^2 = |abe^{bt}(cos t, sin t) + ae^{bt}(−sin t, cos t)|^2 = a^2 e^{2bt}(b^2 + 1),

it follows that

    L(0, t) = ∫_{τ=0}^{t} |α′(τ)| dτ = a√(b^2 + 1) ∫_{τ=0}^{t} e^{bτ} dτ
            = (a/b)√(b^2 + 1)(e^{bt} − 1) = (a/|b|)√(b^2 + 1)(1 − e^{bt}),

so that

    lim_{t→∞} L(0, t) = (a/|b|)√(b^2 + 1).
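An informal numerical confirmation of the arc-length formula (a sketch; the constants a, b and the grid size are my choices):

```python
# Check of exercise 8.3.2(b): arc length of the logarithmic spiral
# alpha(t) = (a e^{bt} cos t, a e^{bt} sin t) with a > 0, b < 0.
# The speed is |alpha'(t)| = a e^{bt} sqrt(b^2 + 1), so
#   L(0, t) = (a/|b|) * sqrt(b^2 + 1) * (1 - e^{bt}).
import math

a, b, t_end = 2.0, -0.5, 10.0  # arbitrary test constants

def speed(t):
    return a * math.exp(b * t) * math.sqrt(b * b + 1)

n = 20000
h = t_end / n
numeric = h * sum(speed((i + 0.5) * h) for i in range(n))  # midpoint rule
closed_form = (a / abs(b)) * math.sqrt(b * b + 1) * (1 - math.exp(b * t_end))
print(numeric, closed_form)  # both near the finite total length (a/|b|)sqrt(b^2+1)
```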

8.3.3. Explicitly reparametrize each curve α : I −→ R^n with a curve γ : I′ −→ R^n parametrized by arc length.

(a) The ray α : R^{>0} −→ R^n given by α(t) = t^2 v where v is some fixed nonzero vector.

Letting t_0 = 1 gives, for all t > 0,

    ℓ(t) = ∫_{τ=1}^{t} |α′(τ)| dτ = ∫_{τ=1}^{t} |2τv| dτ = |v| ∫_{τ=1}^{t} 2τ dτ = |v|(t^2 − 1).

Thus solving the equation s = |v|(t^2 − 1) for t gives ℓ^{−1}(s) = √(1 + s/|v|). The parametrization by arc length is consequently

    γ : (−|v|, ∞) −→ R^n,   γ(s) = (1 + s/|v|)v.

(b) The circle α : R −→ R^2 given by α(t) = (cos e^t, sin e^t).

Letting t_0 = 0 gives, for all t ∈ R,

    ℓ(t) = ∫_{τ=0}^{t} |α′(τ)| dτ = ∫_{τ=0}^{t} e^τ dτ = e^t − 1.

Thus solving the equation s = e^t − 1 for t gives ℓ^{−1}(s) = ln(1 + s). The parametrization by arc length is consequently

    γ : (−1, ∞) −→ R^2,   γ(s) = (cos(1 + s), sin(1 + s)).

(c) The helix α : [0, 2π] −→ R^3 given by α(t) = (a cos t, a sin t, bt).

Letting t_0 = 0 gives, for all t ∈ [0, 2π],

    ℓ(t) = ∫_{τ=0}^{t} |α′(τ)| dτ = √(a^2 + b^2) ∫_{τ=0}^{t} dτ = √(a^2 + b^2) t.

Thus solving the equation s = √(a^2 + b^2) t for t gives ℓ^{−1}(s) = s/√(a^2 + b^2). The parametrization by arc length is consequently

    γ : [0, 2π√(a^2 + b^2)] −→ R^3,
    γ(s) = (a cos(s/√(a^2 + b^2)), a sin(s/√(a^2 + b^2)), bs/√(a^2 + b^2)).

8.4.1. (a) Let a and b be positive. Find the curvature of the ellipse α(t) = (a cos t, b sin t) for t ∈ R.

Compute,

    κ = (x′y′′ − x′′y′)/(x′^2 + y′^2)^{3/2} = (ab sin^2 t + ab cos^2 t)/(a^2 sin^2 t + b^2 cos^2 t)^{3/2}
      = ab/(a^2 sin^2 t + b^2 cos^2 t)^{3/2}.

Assuming that a > b, at t = 0 the curvature is κ = a/b^2 > 1/a (because a^2/b^2 > 1), and at t = π/2 the curvature is κ = b/a^2 < 1/b. These results agree with the geometry: at the point (a, 0) the ellipse is bending inside its tangent circle of radius a, and at the point (0, b) the ellipse is bending outside its tangent circle of radius b.

(b) Let a be positive and b be negative. Find the curvature of the logarithmic spiral α(t) = (ae^{bt} cos t, ae^{bt} sin t) for t ≥ 0.

In the formula

    κ = (x′y′′ − x′′y′)/(x′^2 + y′^2)^{3/2}

we have

    x′y′′ = a^2 e^{2bt}(b cos t − sin t)((b^2 − 1) sin t + 2b cos t)

and

    x′′y′ = a^2 e^{2bt}((b^2 − 1) cos t − 2b sin t)(b sin t + cos t),

so that (after a while)

    x′y′′ − x′′y′ = (1 + b^2) a^2 e^{2bt}.

Also,

    x′^2 + y′^2 = a^2 e^{2bt}(b^2 cos^2 t + sin^2 t + b^2 sin^2 t + cos^2 t) = a^2 e^{2bt}(1 + b^2).

Thus the curvature is

    κ = (1 + b^2) a^2 e^{2bt}/(a^3 e^{3bt}(1 + b^2)^{3/2}) = 1/(ae^{bt}(1 + b^2)^{1/2}) = e^{|b|t}/(a(1 + b^2)^{1/2}).

8.4.2. Let γ : I −→ R^2 be parametrized by arc length. Fix any unit vector v ∈ R^2, and define a function θ : I −→ R by the conditions

    cos(θ(s)) = ⟨T(s), v⟩,   sin(θ(s)) = −⟨N(s), v⟩.

Thus θ is the angle that the curve γ makes with the fixed direction v. Show that θ′ = κ. Thus our notion of curvature does indeed measure the rate at which γ is turning.

Differentiate the first condition to get −sin(θ(s))θ′(s) = ⟨T′(s), v⟩. The second condition says that −sin(θ(s)) = ⟨N(s), v⟩, and the Frenet equations say that T′(s) = κ(s)N(s). Thus

    ⟨N(s), v⟩θ′(s) = κ(s)⟨N(s), v⟩.

So long as ⟨N(s), v⟩ ≠ 0 we may cancel to get the result. In the exceptional case, proceed similarly but start by differentiating the second condition. This time the factor that we want to cancel will be ⟨T(s), v⟩, and this is nonzero when ⟨N(s), v⟩ = 0.

8.5.1. (a) Let a and b be positive. Compute the curvature κ and the torsion τ of the helix α(t) = (a cos t, a sin t, bt).

Routine calculations give

    κ = a/(a^2 + b^2)   and   τ = b/(a^2 + b^2).

(b) How do κ and τ behave if a is held constant and b → ∞? Here κ → 0 and τ → 0. These results are sensible since the helix is tending to a vertical line through (a, 0, 0).

(c) How do κ and τ behave if a is held constant and b → 0? Here κ → 1/a and τ → 0. These results are sensible since the helix is tending to a circle of radius a.

(d) How do κ and τ behave if b is held constant and a → ∞? Here κ → 0 and τ → 0. These results are perhaps sensible since the helix is acquiring an ever-larger radius but its pitch is being held constant.

(e) How do κ and τ behave if b is held constant and a → 0? Here κ → 0 and τ → 1/b. These results are not immediately intuitive to me. The helix is tending to a vertical line through (0, 0, 0), but somehow even in the limit it is twisting at a rate reciprocal to the pitch.
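The "routine calculations" of part (a) can be spot-checked numerically using the standard vector-calculus formulas κ = |α′ × α′′|/|α′|^3 and τ = ⟨α′ × α′′, α′′′⟩/|α′ × α′′|^2 (these formulas are standard facts, not derived in this exercise; the test constants are my choices):

```python
# Spot check of exercise 8.5.1(a) for the helix alpha(t) = (a cos t, a sin t, b t).
import math

a, b = 1.5, 0.7  # arbitrary positive test constants
t = 0.3          # arbitrary parameter value; kappa and tau are constant in t

# Exact derivatives of alpha.
d1 = (-a * math.sin(t), a * math.cos(t), b)
d2 = (-a * math.cos(t), -a * math.sin(t), 0.0)
d3 = (a * math.sin(t), -a * math.cos(t), 0.0)

def cross(u, v):
    return (u[1]*v[2] - u[2]*v[1], u[2]*v[0] - u[0]*v[2], u[0]*v[1] - u[1]*v[0])

def dot(u, v):
    return sum(ui * vi for ui, vi in zip(u, v))

c = cross(d1, d2)
kappa = math.sqrt(dot(c, c)) / dot(d1, d1) ** 1.5
tau = dot(c, d3) / dot(c, c)
print(kappa, a / (a*a + b*b))  # should agree
print(tau, b / (a*a + b*b))    # should agree
```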


Chapter 9

9.1.1. Consider two vectors u = (x_u, y_u, z_u) and v = (x_v, y_v, z_v). Calculate that |u|^2 |v|^2 − (u · v)^2 = |u × v|^2.

Compute that

    |u|^2 |v|^2 = (x_u^2 + y_u^2 + z_u^2)(x_v^2 + y_v^2 + z_v^2)
    = x_u^2 x_v^2 + x_u^2 y_v^2 + x_u^2 z_v^2 + y_u^2 x_v^2 + y_u^2 y_v^2 + y_u^2 z_v^2 + z_u^2 x_v^2 + z_u^2 y_v^2 + z_u^2 z_v^2,

and that

    (u · v)^2 = (x_u x_v + y_u y_v + z_u z_v)^2
    = x_u^2 x_v^2 + y_u^2 y_v^2 + z_u^2 z_v^2 + 2x_u x_v y_u y_v + 2x_u x_v z_u z_v + 2y_u y_v z_u z_v.

Thus the difference is

    |u|^2 |v|^2 − (u · v)^2 = x_u^2 y_v^2 − 2x_u y_v y_u x_v + y_u^2 x_v^2
                            + x_u^2 z_v^2 − 2x_u z_v z_u x_v + z_u^2 x_v^2
                            + y_u^2 z_v^2 − 2y_u z_v z_u y_v + z_u^2 y_v^2
    = (x_u y_v − y_u x_v)^2 + (x_u z_v − z_u x_v)^2 + (y_u z_v − z_u y_v)^2
    = |u × v|^2.

9.1.3. Let f(x, y, z) = x^2 + yz.

(a) Integrate f over the box B = [0, 1]^3.

This is an integral of the type studied in chapter 6, so Fubini's Theorem applies immediately,

    ∫_B f = ∫_{x=0}^{1} ∫_{y=0}^{1} ∫_{z=0}^{1} (x^2 + yz) = ∫_{x=0}^{1} ∫_{y=0}^{1} (x^2 z + (1/2)yz^2) |_{z=0}^{1}
    = ∫_{x=0}^{1} ∫_{y=0}^{1} (x^2 + y/2) = ∫_{x=0}^{1} (x^2 y + y^2/4) |_{y=0}^{1}
    = ∫_{x=0}^{1} (x^2 + 1/4) = (x^3/3 + x/4) |_{x=0}^{1} = 1/3 + 1/4 = 7/12.

(b) Integrate f over the parametrized curve

    γ : [0, 2π] −→ R^3,   γ(t) = (cos t, sin t, t).

Since a curve is a 1-surface, Definition 9.1.3 specializes to

    ∫_γ f = ∫_{t=0}^{2π} (f ◦ γ) length(γ′) = ∫_{t=0}^{2π} (f ◦ γ) |γ′|.

Compute that (f ◦ γ)(t) = cos^2 t + t sin t, and that

    |γ′(t)| = √((−sin t)^2 + (cos t)^2 + 1^2) = √2.

Therefore the integral is

    ∫_γ f = √2 ∫_{t=0}^{2π} (cos^2 t + t sin t).

A standard trick is that since sine and cosine are translates of each other and both have period 2π,

    ∫_0^{2π} cos^2 = ∫_0^{2π} sin^2 = (1/2) ∫_0^{2π} (cos^2 + sin^2) = (1/2) ∫_0^{2π} 1 = π.

And integration by parts with u = t and v′ = sin t gives

    ∫_{t=0}^{2π} t sin t = −t cos t |_{t=0}^{2π} + ∫_{t=0}^{2π} cos t = −2π.

Thus the entire integral is

    ∫_γ f = √2 (π − 2π) = −√2 π.

(c) Integrate f over the parametrized surface

    S : [0, 1]^2 −→ R^3,   S(u, v) = (u + v, u − v, v).

The derivative matrix of S is

    S′(u, v) = [1 1; 1 −1; 0 1]   (rows listed left to right).

Therefore,

    S′(u, v)^t S′(u, v) = [1 1 0; 1 −1 1] [1 1; 1 −1; 0 1] = [2 0; 0 3].

This has determinant 6, showing that the volume factor in Definition 9.1.3 is

    area(P(D_1 S, D_2 S)) = √6.

(One can compute this factor in other ways as well, since we are in the particular case of k = 2 and n = 3.) So the integral is

    ∫_S f = √6 ∫_{[0,1]^2} (f ◦ S).

But (f ◦ S)(u, v) = (u + v)^2 + (u − v)v = u^2 + 3uv, and so the integral is

    ∫_S f = √6 ∫_{u=0}^{1} ∫_{v=0}^{1} (u^2 + 3uv) = √6 ∫_{u=0}^{1} (u^2 + (3/2)u) = √6 (1/3 + 3/4) = 13√6/12.

(d) Integrate f over the parametrized solid

    V : [0, 1]^3 −→ R^3,   V(u, v, w) = (u + v, v − w, u + w).

The derivative matrix of V is

    V′(u, v, w) = [1 1 0; 0 1 −1; 1 0 1].

This has determinant 0, and since the volume factor in Definition 9.1.3 is

    vol(P(D_1 V, D_2 V, D_3 V)) = |det V′| = 0,

the integral is 0. (In parts (b), (c), and (d) of this exercise, the volume factor worked out to a constant, and the constant even was 0 for (d). This is all flukish: in general the volume factor depends on the parameters.)

9.2.2. Derive equations (9.6) and (9.8) from equation (9.12).

To get (9.6), substitute n = 2 and write out the terms. To get (9.8), substitute n = 3 and write out the terms.

9.3.1. Write out all ordered k-tuples from {1, . . . , n} in the cases n = 4, k = 1; n = 3, k = 2.

• n = 4, k = 1: (1), (2), (3), (4).
• n = 3, k = 2: (1, 1), (1, 2), (1, 3), (2, 1), (2, 2), (2, 3), (3, 1), (3, 2), (3, 3).

In general, how many ordered k-tuples I = (i_1, . . . , i_k) from {1, . . . , n} are there? Each of the k entries can take any of n values, so there are n^k possibilities. How many of these are increasing, meaning that i_1 < · · · < i_k? There are as many such as there are ways to choose k distinct indices from {1, . . . , n}. This is the binomial coefficient,

    (n choose k) = n!/(k!(n − k)!).

Write out all increasing k-tuples from {1, 2, 3, 4} for k = 1, 2, 3, 4.

• k = 1: (1), (2), (3), (4).
• k = 2: (1, 2), (1, 3), (1, 4), (2, 3), (2, 4), (3, 4).
• k = 3: (1, 2, 3), (1, 2, 4), (1, 3, 4), (2, 3, 4).
• k = 4: (1, 2, 3, 4).

9.3.2. An expression ω = Σ_I f_I dx_I where the sum is only over increasing k-tuples from {1, . . . , n} is called a standard presentation of ω. Write out explicitly what a standard presentation for a k-form on R^4 looks like for k = 0, 1, 2, 3, 4.

Here are the standard presentations of a k-form on R^4. For k = 0, simply a function f. For k = 1,

    f_1 dx_1 + f_2 dx_2 + f_3 dx_3 + f_4 dx_4.

For k = 2,

    f_{1,2} dx_1 ∧ dx_2 + f_{1,3} dx_1 ∧ dx_3 + f_{1,4} dx_1 ∧ dx_4
    + f_{2,3} dx_2 ∧ dx_3 + f_{2,4} dx_2 ∧ dx_4 + f_{3,4} dx_3 ∧ dx_4.

For k = 3,

    f_{1,2,3} dx_1 ∧ dx_2 ∧ dx_3 + f_{1,2,4} dx_1 ∧ dx_2 ∧ dx_4
    + f_{1,3,4} dx_1 ∧ dx_3 ∧ dx_4 + f_{2,3,4} dx_2 ∧ dx_3 ∧ dx_4.

For k = 4,

    f_{1,2,3,4} dx_1 ∧ dx_2 ∧ dx_3 ∧ dx_4.

9.4.1. Let ω = x dy − y dx, a 1-form on R^2. Evaluate ∫_γ ω for the following curves.

(a) γ : [−1, 1] −→ R^2, γ(t) = (t^2 − 1, t^3 − t).

The integral is

    ∫_γ ω = ∫_{t=−1}^{1} ((t^2 − 1)(3t^2 − 1) − (t^3 − t) · 2t) = ∫_{t=−1}^{1} (t^4 − 2t^2 + 1)
          = 2(1/5 − 2/3 + 1) = 16/15.

(b) γ : [0, 2] −→ R^2, γ(t) = (t, t^2).

The integral is

    ∫_γ ω = ∫_{t=0}^{2} (t · 2t − t^2 · 1) = ∫_{t=0}^{2} t^2 = 8/3.
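Both line integrals can be confirmed numerically by integrating the pullback x(t)y′(t) − y(t)x′(t) directly (a sketch; the grid size is my choice):

```python
# Numerical check of exercise 9.4.1: omega = x dy - y dx pulled back to [a, b].
def line_integral(x, y, dx, dy, a, b, n=20000):
    """Midpoint-rule integral of x(t) y'(t) - y(t) x'(t) over [a, b]."""
    h = (b - a) / n
    total = 0.0
    for i in range(n):
        t = a + (i + 0.5) * h
        total += x(t) * dy(t) - y(t) * dx(t)
    return h * total

# (a) gamma(t) = (t^2 - 1, t^3 - t) on [-1, 1]; expected 16/15.
val_a = line_integral(lambda t: t**2 - 1, lambda t: t**3 - t,
                      lambda t: 2*t, lambda t: 3*t**2 - 1, -1.0, 1.0)
# (b) gamma(t) = (t, t^2) on [0, 2]; expected 8/3.
val_b = line_integral(lambda t: t, lambda t: t**2,
                      lambda t: 1.0, lambda t: 2*t, 0.0, 2.0)
print(val_a, 16/15)
print(val_b, 8/3)
```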

9.4.2. Let ω = z dx + x^2 dy + y dz, a 1-form on R^3. Evaluate ∫_γ ω for the following two curves.

(a) γ : [−1, 1] −→ R^3, γ(t) = (t, at^2, bt^3).

The integral is

    ∫_γ ω = ∫_{t=−1}^{1} (bt^3 · 1 + t^2 · 2at + at^2 · 3bt^2) = ∫_{t=−1}^{1} (bt^3 + 2at^3 + 3abt^4) = 6ab/5.

(b) γ : [0, 2π] −→ R^3, γ(t) = (a cos t, a sin t, bt).

The integral is

    ∫_γ ω = ∫_{t=0}^{2π} (bt · (−a sin t) + a^2 cos^2 t · a cos t + a sin t · b)
          = ∫_{t=0}^{2π} (−abt sin t + a^3 cos^3 t + ab sin t).

The integrals of cos^3 t and of sin t over [0, 2π] are 0. This leaves an integration by parts,

    ∫_γ ω = −ab ∫_{t=0}^{2π} t sin t = ab (t cos t |_{t=0}^{2π} − ∫_{t=0}^{2π} cos t) = 2πab.
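Again the pullback integrals can be cross-checked numerically (a sketch; the constants a, b and grid size are my choices):

```python
# Numerical check of exercise 9.4.2 for sample constants a, b.
import math

a, b = 1.3, 0.8  # arbitrary test constants

def integrate(f, lo, hi, n=40000):
    """Midpoint-rule approximation of the integral of f over [lo, hi]."""
    h = (hi - lo) / n
    return h * sum(f(lo + (i + 0.5) * h) for i in range(n))

# (a) gamma(t) = (t, a t^2, b t^3): pullback of z dx + x^2 dy + y dz.
val_a = integrate(lambda t: b*t**3 * 1 + t**2 * 2*a*t + a*t**2 * 3*b*t**2, -1, 1)
# (b) gamma(t) = (a cos t, a sin t, b t).
val_b = integrate(lambda t: b*t * (-a*math.sin(t))
                  + (a*math.cos(t))**2 * a*math.cos(t)
                  + a*math.sin(t) * b, 0, 2*math.pi)
print(val_a, 6*a*b/5)        # expected 6ab/5
print(val_b, 2*math.pi*a*b)  # expected 2*pi*a*b
```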

9.4.3. (a) Let ω = f dy where f : R^2 −→ R depends only on y. That is, f(x, y) = ϕ(y) for some ϕ : R −→ R. Show that for any curve γ = (γ_1, γ_2) : [a, b] −→ R^2,

    ∫_γ ω = ∫_{γ_2(a)}^{γ_2(b)} ϕ.

Compute, using the Change of Variable Theorem for the third equality,

    ∫_γ ω = ∫_{t=a}^{b} f(γ(t))γ_2′(t) = ∫_{t=a}^{b} ϕ(γ_2(t))γ_2′(t) = ∫_{u=γ_2(a)}^{γ_2(b)} ϕ(u).

Using the notation of functions rather than their outputs, the calculation is

    ∫_γ ω = ∫_a^b (f ◦ γ)γ_2′ = ∫_a^b (ϕ ◦ γ_2)γ_2′ = ∫_{γ_2(a)}^{γ_2(b)} ϕ.

(b) Let ω = f dx + g dy where f depends only on x and g depends only on y. Show that ∫_γ ω = 0 whenever γ : [a, b] −→ R^2 is a closed curve, meaning that γ(b) = γ(a).

Let f(x, y) = ϕ(x) and let g(x, y) = ψ(y). Then by part (a),

    ∫_γ ω = ∫_{γ_1(a)}^{γ_1(b)} ϕ + ∫_{γ_2(a)}^{γ_2(b)} ψ,

and the right side is 0 because γ_1(b) = γ_1(a) and γ_2(b) = γ_2(a).

9.5.1. Let a be a positive number. Consider a 2-surface in R^3,

    Φ : [0, a] × [0, π] −→ R^3,   Φ(r, θ) = (r cos θ, r sin θ, r^2).

Sketch this surface, noting that θ varies from 0 to π, not from 0 to 2π. Try to determine ∫_Φ dx ∧ dy by geometrical reasoning, and then check your answer by evaluating the integral. Do the same for dy ∧ dz and dz ∧ dx. Do the same for z dx ∧ dy − y dz ∧ dx.

The figure is a half-chalice. Its projected oriented (x, y)-area is πa^2/2, its projected oriented (y, z)-area is 0 by cancellation, and its projected oriented (z, x)-area is ±4a^3/3 by a small exercise in one-variable calculus. Furthermore, the (z, x)-plane has the opposite orientation from the projected (z, x)-image of the (r, θ)-grid on the half-chalice, so we expect a minus sign. Finally, z dx ∧ dy should capture the volume above the (x, y)-plane and below the half-chalice, while −y dz ∧ dx should capture the volume above the half-chalice and below the horizontal plane at height z = a^2. Thus z dx ∧ dy − y dz ∧ dx should capture the volume of the half-cylinder of radius a and height a^2, i.e., πa^4/2.

Now compute that indeed

    Φ′(r, θ) = [cos θ  −r sin θ; sin θ  r cos θ; 2r  0],

so that

    ∫_Φ dx ∧ dy = ∫_{r=0}^{a} ∫_{θ=0}^{π} r = πa^2/2,

and

    ∫_Φ dy ∧ dz = ∫_{r=0}^{a} ∫_{θ=0}^{π} (−2r^2 cos θ) = 0,

and

    ∫_Φ dz ∧ dx = ∫_{r=0}^{a} ∫_{θ=0}^{π} (−2r^2 sin θ) = −4 ∫_{r=0}^{a} r^2 = −4a^3/3,

and

    ∫_Φ (z dx ∧ dy − y dz ∧ dx) = ∫_{r=0}^{a} ∫_{θ=0}^{π} r^3(1 + 2 sin^2 θ) = πa^4/2.
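These four form-integrals can be confirmed numerically with a = 1 by summing the appropriate 2-by-2 subdeterminants of Φ′ over a midpoint grid (a sketch; the grid sizes are my choices):

```python
# Numerical check of exercise 9.5.1 with a = 1 for
# Phi(r, theta) = (r cos t, r sin t, r^2) on [0, 1] x [0, pi].
import math

nr, nt = 400, 400
hr, ht = 1.0 / nr, math.pi / nt

sums = {"dxdy": 0.0, "dydz": 0.0, "dzdx": 0.0, "mixed": 0.0}
for i in range(nr):
    r = (i + 0.5) * hr
    for j in range(nt):
        t = (j + 0.5) * ht
        # Rows of Phi'(r, t): partial derivatives of x, y, z in (r, t).
        xr, xt = math.cos(t), -r * math.sin(t)
        yr, yt = math.sin(t), r * math.cos(t)
        zr, zt = 2 * r, 0.0
        det_xy = xr * yt - xt * yr  # = r
        det_yz = yr * zt - yt * zr  # = -2 r^2 cos t
        det_zx = zr * xt - zt * xr  # = -2 r^2 sin t
        z, y = r * r, r * math.sin(t)
        sums["dxdy"] += det_xy
        sums["dydz"] += det_yz
        sums["dzdx"] += det_zx
        sums["mixed"] += z * det_xy - y * det_zx

cell = hr * ht
print(sums["dxdy"] * cell, math.pi / 2)   # pi a^2 / 2
print(sums["dydz"] * cell, 0.0)
print(sums["dzdx"] * cell, -4 / 3)        # -4 a^3 / 3
print(sums["mixed"] * cell, math.pi / 2)  # pi a^4 / 2
```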

9.5.2. Let ω = x dy ∧ dz + y dx ∧ dy, a 2-form on R^3. Evaluate ∫_Φ ω when Φ is the 2-surface

(a) Φ : [0, 1] × [0, 1] −→ R^3, Φ(u, v) = (u + v, u^2 − v^2, uv).

The derivative matrix of Φ is

    Φ′(u, v) = [1  1; 2u  −2v; v  u].

Consequently, letting D = [0, 1] × [0, 1],

    ∫_Φ ω = ∫_D ((u + v) det[2u  −2v; v  u] + (u^2 − v^2) det[1  1; 2u  −2v])
    = ∫_D (2(u + v)(u^2 + v^2) − 2(u^2 − v^2)(u + v))
    = 4 ∫_D (u + v)v^2 = 4 ∫_{v=0}^{1} v^2 ∫_{u=0}^{1} (u + v)
    = 4 ∫_{v=0}^{1} v^2(v + 1/2) = 4(1/4 + 1/6) = 5/3.

(b) Φ : [0, 2π] × [0, 1] −→ R^3, Φ(u, v) = (v cos u, v sin u, u).

This time the derivative matrix of Φ is

    Φ′(u, v) = [−v sin u  cos u; v cos u  sin u; 1  0].

Consequently, letting D = [0, 2π] × [0, 1],

    ∫_Φ ω = ∫_D (v cos u · det[v cos u  sin u; 1  0] + v sin u · det[−v sin u  cos u; v cos u  sin u])
    = ∫_D (−v sin u cos u − v^2 sin u)
    = −∫_{v=0}^{1} v ∫_{u=0}^{2π} sin u cos u − ∫_{v=0}^{1} v^2 ∫_{u=0}^{2π} sin u
    = 0.
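A numerical cross-check of both surface integrals (a sketch; the helper function and grid size are my constructions): the integrand of ∫_Φ ω is x times the (y, z)-rows determinant of Φ′ plus y times the (x, y)-rows determinant.

```python
# Numerical check of exercise 9.5.2: omega = x dy^dz + y dx^dy.
import math

def integrate_form(phi_rows, domain, n=300):
    """Midpoint-grid integral of x*det(y,z rows) + y*det(x,y rows) of Phi'.

    phi_rows(u, v) returns ((x, xu, xv), (y, yu, yv), (z, zu, zv)).
    """
    (u0, u1), (v0, v1) = domain
    hu, hv = (u1 - u0) / n, (v1 - v0) / n
    total = 0.0
    for i in range(n):
        u = u0 + (i + 0.5) * hu
        for j in range(n):
            v = v0 + (j + 0.5) * hv
            (x, xu, xv), (y, yu, yv), (z, zu, zv) = phi_rows(u, v)
            det_yz = yu * zv - yv * zu
            det_xy = xu * yv - xv * yu
            total += x * det_yz + y * det_xy
    return total * hu * hv

# (a) Phi(u, v) = (u + v, u^2 - v^2, u v) on [0,1]^2; expected 5/3.
val_a = integrate_form(
    lambda u, v: ((u + v, 1.0, 1.0), (u*u - v*v, 2*u, -2*v), (u*v, v, u)),
    ((0.0, 1.0), (0.0, 1.0)))
# (b) Phi(u, v) = (v cos u, v sin u, u) on [0, 2pi] x [0, 1]; expected 0.
val_b = integrate_form(
    lambda u, v: ((v*math.cos(u), -v*math.sin(u), math.cos(u)),
                  (v*math.sin(u), v*math.cos(u), math.sin(u)),
                  (u, 1.0, 0.0)),
    ((0.0, 2*math.pi), (0.0, 1.0)))
print(val_a, 5/3)
print(val_b, 0.0)
```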

9.5.4. This exercise proves that integration of k-forms on R^n reduces to standard integration when k = n. Let D ⊂ R^n be compact and connected. Define the corresponding natural parametrization, ∆ : D −→ R^n, by ∆(u_1, . . . , u_n) = (u_1, . . . , u_n). Let ω = f dx_1 ∧ · · · ∧ dx_n, an n-form on R^n. Use (9.14) to show that

    ∫_∆ ω = ∫_D f.

Since ∆ is the identity map, its derivative matrix is the identity matrix, which has determinant det(∆′_{(1,...,n)}) = 1. Consequently,

    ∫_∆ ω = ∫_D (f ◦ ∆) · det(∆′_{(1,...,n)}) = ∫_D f · 1 = ∫_D f.

9.5.5. This exercise proves that, granting the Change of Variable Theorem, integration of forms is invariant under reparametrizations of a surface.

Let A be an open subset of R^n. Let Φ : D −→ A and Ψ : D̃ −→ A be k-surfaces in A. Suppose that there exists a smoothly invertible mapping T : D −→ D̃ such that Ψ ◦ T = Φ. In other words, T is smooth, T is invertible, its inverse is also smooth, and the diagram

    D --T--> D̃ --Ψ--> A,   Φ = Ψ ◦ T,

commutes.

(a) Let S = T^{−1} : D̃ −→ D, a smooth mapping. Starting from the relation (S ◦ T)(u) = id(u) for all u ∈ D (where id is the identity mapping on D), differentiate, use the chain rule, and take determinants to show that det T′(u) ≠ 0 for all u ∈ D.

Differentiate the equality of mapping-actions,

    (S ◦ T)(u) = id(u)   for all u ∈ D,

to obtain an equality of k-by-k matrices,

    (S ◦ T)′(u) = I   for all u ∈ D.

By the chain rule, this says that

    S′(T(u)) · T′(u) = I   for all u ∈ D,

where S′(T(u)) and T′(u) are both k-by-k matrices. Therefore, taking determinants gives

    det(S′(T(u))) det(T′(u)) = 1   for all u ∈ D.

It follows that det(T′(u)) ≠ 0 for all u ∈ D, as desired.

(b) From now on, assume that det T′ > 0 on D. For any n-by-k matrix M and any ordered k-tuple I from {1, . . . , n}, recall that M_I denotes the k-by-k matrix comprising the Ith rows of M. If N is a k-by-k matrix, prove the equality

    (MN)_I = M_I N.

In words, this says that the Ith rows of (M times N) are (the Ith rows of M) times N.

The word-phrasing of the desired result makes it essentially immediate. For any single number i ∈ {1, . . . , n}, the ith row of (M times N) is (the ith row of M) times N. The conclusion follows since the multi-index I is simply a concatenation of single indices i_1, . . . , i_k: letting m_i denote the ith row of M, the rows of MN are m_1 N, . . . , m_n N, so

    (MN)_I = [m_{i_1} N; . . . ; m_{i_k} N] = [m_{i_1}; . . . ; m_{i_k}] N = M_I N.

(c) Use the chain rule and part (b) to show that for any I,

    det Φ′_I(u) = det Ψ′_I(T(u)) det T′(u)   for all u ∈ D.

For all u ∈ D we have by the chain rule, Φ′(u) = Ψ′(T(u)) T′(u). The right side has the form MN from part (b), and so by part (b),

    Φ′_I(u) = Ψ′_I(T(u)) T′(u).

This is an equality of the form "k-by-k matrix equals product of two k-by-k matrices." The desired result follows since the determinant of a product of square matrices is the product of the determinants.

(d) Let ω = f(x) dx_I, a k-form on A. Show that

    ∫_Ψ ω = ∫_{T(D)} (f ◦ Ψ) det Ψ′_I.

This is essentially immediate. By definition, ∫_Ψ ω = ∫_{D̃} (f ◦ Ψ) det Ψ′_I. Since D̃ = T(D), the result follows.

Explain why if the Change of Variable Theorem holds then

    ∫_Ψ ω = ∫_D ((f ◦ Ψ) det Ψ′_I) ◦ T · det T′.

This follows by applying the Change of Variable Theorem to the right side of the previous display. The theorem gives

    ∫_{T(D)} (f ◦ Ψ) det Ψ′_I = ∫_D ((f ◦ Ψ) det Ψ′_I) ◦ T · |det T′|,

but since we are assuming that det T′ > 0, the result follows.

Explain why this shows that if the Change of Variable Theorem holds then

    ∫_Ψ ω = ∫_Φ ω.

So far we have established that

    ∫_Ψ ω = ∫_D ((f ◦ Ψ) det Ψ′_I) ◦ T · det T′.

But the right side is

    ∫_D (f ◦ Ψ ◦ T) · det(Ψ′_I ◦ T) · det T′ = ∫_D (f ◦ Φ) · det Φ′_I = ∫_Φ ω.

What would the conclusion be for orientation-reversing Ψ? Here we would need to note that |det T′| = −det T′, leading to the conclusion that

    ∫_Ψ ω = −∫_Φ ω.

(e) Do the results from (d) remain valid if instead ω = Σ_I f_I dx_I? Yes, because sums pass through integrals.

9.7.1. Find a wedge product of two differential forms that encodes the inner product of R^4.

Let

    ω = f_1 dx_1 + f_2 dx_2 + f_3 dx_3 + f_4 dx_4,
    λ = g_1 dx_2 ∧ dx_3 ∧ dx_4 − g_2 dx_1 ∧ dx_3 ∧ dx_4 + g_3 dx_1 ∧ dx_2 ∧ dx_4 − g_4 dx_1 ∧ dx_2 ∧ dx_3.

Then a straightforward calculation shows that

    ω ∧ λ = (f_1 g_1 + f_2 g_2 + f_3 g_3 + f_4 g_4) dx_1 ∧ dx_2 ∧ dx_3 ∧ dx_4.

9.7.2. Find a wedge product of three differential forms that encodes the 3-by-3 determinant.

Let

    ω = a_{1,1} dx_1 + a_{1,2} dx_2 + a_{1,3} dx_3,
    λ = a_{2,1} dx_1 + a_{2,2} dx_2 + a_{2,3} dx_3,
    µ = a_{3,1} dx_1 + a_{3,2} dx_2 + a_{3,3} dx_3.

Then a straightforward calculation shows that

    ω ∧ λ ∧ µ = Σ_{π∈S_3} (−1)^π a_{1,π(1)} a_{2,π(2)} a_{3,π(3)} dx_1 ∧ dx_2 ∧ dx_3.

X I

X I

=

X I,J

=

X I,J

X

fI dxI ∧ fI dxI ∧

gJ,1 dxJ +

J

X

X

gJ,2 dxJ

J

!

(gJ,1 + gJ,2 ) dxJ

J

fI (gJ,1 + gJ,2 ) dxI ∧ dxJ fI gJ,1 dxI ∧ dxJ +

= ω ∧ λ1 + ω ∧ λ2 .

X I,J

fI gJ,2 dxI ∧ dxJ

For the second, (ω ∧ λ) ∧ µ = =

X I

X I,J

=

fI dxI ∧

X

X I

=

X I

J

fI fJ dx(I,J) ∧

fJ dxJ X

!



X

fK dxK

K

fK dxK

K

fI fJ fK dx(I,J,K)

I,J,K

=

X

fI dxI ∧ fI dxI ∧

= ω ∧ (λ ∧ µ).

X

fJ fK dxJ,K

J,K

X J

fJ dxJ ∧

X K

fK dxK

!

For the third, note that for dxI ∈ Λk (A) and dxJ ∈ λℓ (A), a small counting argument shows that dxJ ∧ dxI = (−1)kℓ dxI ∧ dxJ .

116

The general result follows, λ∧ω = =

X J

X J,I

fJ dxJ ∧

X

fJ dxI

I

fJ fI dxJ ∧ dxI

= (−1)kℓ

X I,J

= (−1)kℓ

X I

fI fJ dxI ∧ dxJ fI dxI ∧

= (−1)kℓ ω ∧ λ.

X

fJ dxJ

J

9.7.4. Prove that (ω1 + ω2 ) ∧ λ = ω1 ∧ λ + ω2 ∧ λ for all ω1 , ω2 ∈ Λk (A) and λ ∈ Λℓ (A). Compute,  (ω1 + ω2 ) ∧ λ = (−1)k+ℓ λ ∧ (ω1 + ω2 )  = (−1)k+ℓ λ ∧ ω1 + λ ∧ ω2  = (−1)k+ℓ (−1)k+ℓ ω1 ∧ λ + (−1)k+ℓ ω2 ∧ λ = ω1 ∧ λ + ω2 ∧ λ. 9.8.1. Let ω = f dx + g dy + h dz. Show that dω = (D2 h − D3 g) dy ∧ dz + (D3 f − D1 h) dz ∧ dx + (D1 g − D2 f ) dx ∧ dy. Compute, dω = (D1 f dx + D2 f dy + D3 f dz) ∧ dx + (D1 g dx + D2 g dy + D3 g dz) ∧ dy

+ (D1 h dx + D2 h dy + D3 h dz) ∧ dz = D1 f dx ∧ dx + D2 f dy ∧ dx + D3 f dz ∧ dx + D1 g dx ∧ dy + D2 g dy ∧ dy + D3 g dz ∧ dy

+ D1 h dx ∧ dz + D2 h dy ∧ dz + D3 h dz ∧ dz.

The result follows by the properties of the wedge product. 9.8.2. Let ω = f dy ∧ dz + g dz ∧ dx + h dx ∧ dy. Evaluate dω.

117

Compute,

    dω = (D_1 f dx + D_2 f dy + D_3 f dz) ∧ dy ∧ dz
       + (D_1 g dx + D_2 g dy + D_3 g dz) ∧ dz ∧ dx
       + (D_1 h dx + D_2 h dy + D_3 h dz) ∧ dx ∧ dy
    = D_1 f dx ∧ dy ∧ dz + D_2 f dy ∧ dy ∧ dz + D_3 f dz ∧ dy ∧ dz
    + D_1 g dx ∧ dz ∧ dx + D_2 g dy ∧ dz ∧ dx + D_3 g dz ∧ dz ∧ dx
    + D_1 h dx ∧ dx ∧ dy + D_2 h dy ∧ dx ∧ dy + D_3 h dz ∧ dx ∧ dy
    = (D_1 f + D_2 g + D_3 h) dx ∧ dy ∧ dz.

9.8.3. Differential forms of orders 0, 1, 2, 3 on R^3 are written

    ω_0 = φ,
    ω_1 = f_1 dx + f_2 dy + f_3 dz,
    ω_2 = g_1 dy ∧ dz + g_2 dz ∧ dx + g_3 dx ∧ dy,
    ω_3 = h dx ∧ dy ∧ dz.

(a) For a 0-form φ, what are the coefficients f_i of dφ in terms of φ?

By the definition of dφ,

    f_1 = D_1 φ,   f_2 = D_2 φ,   f_3 = D_3 φ.

(b) For a 1-form ω_1, what are the coefficients g_i of dω_1 in terms of the coefficients f_i of ω_1?

By exercise 9.8.1,

    g_1 = D_2 f_3 − D_3 f_2,   g_2 = D_3 f_1 − D_1 f_3,   g_3 = D_1 f_2 − D_2 f_1.

(c) For a 2-form ω_2, what is the coefficient h of dω_2 in terms of the coefficients g_i of ω_2?

By exercise 9.8.2,

    h = D_1 g_1 + D_2 g_2 + D_3 g_3.

9.8.4. Classical vector analysis features the operator ∇ = (D_1, D_2, D_3), where the D_i are the familiar partial derivative operators. Thus, for a function φ : R^3 −→ R,

    ∇φ = (D_1 φ, D_2 φ, D_3 φ), called grad φ.

Similarly, for a mapping F = (f1, f2, f3) : R^3 −→ R^3, ∇ × F is defined in the symbolically appropriate way, and for a mapping G = (g1, g2, g3) : R^3 −→ R^3, so is ⟨∇, G⟩. Write down explicitly the vector-valued mapping ∇ × F and the function ⟨∇, G⟩ for F and G as just described.

Compute that
∇ × F = (D2 f3 − D3 f2, D3 f1 − D1 f3, D1 f2 − D2 f1) = curl F,
and
⟨∇, G⟩ = D1 g1 + D2 g2 + D3 g3 = div G.
9.8.5. Continuing with the notation of the previous two problems, introduce correspondences between the classical scalar-vector environment and the environment of differential forms, as follows. Let
ds⃗ = (dx, dy, dz),  dn⃗ = (dy ∧ dz, dz ∧ dx, dx ∧ dy),  dV = dx ∧ dy ∧ dz.

Let id be the mapping that takes each function φ : R^3 −→ R to itself. Let ·ds⃗ be the mapping that takes each vector-valued mapping F = (f1, f2, f3) to the 1-form
F · ds⃗ = f1 dx + f2 dy + f3 dz.
Let ·dn⃗ be the mapping that takes each vector-valued mapping G = (g1, g2, g3) to the 2-form
G · dn⃗ = g1 dy ∧ dz + g2 dz ∧ dx + g3 dx ∧ dy.
And let dV be the mapping that takes each function h to the 3-form
h dV = h dx ∧ dy ∧ dz.
Combine the previous problems to verify that the following diagram commutes, meaning that either path around each square yields the same result.

  φ --grad--> (f1, f2, f3) --curl--> (g1, g2, g3) --div--> h
  |id         |·ds⃗                   |·dn⃗                  |dV
  v           v                      v                     v
  φ ---d---> f1 dx + f2 dy + f3 dz ---d---> g1 dy∧dz + g2 dz∧dx + g3 dx∧dy ---d---> h dx∧dy∧dz

The desired result for the left square is exactly the content of exercise 9.8.3(a) and the definition of the gradient operator as reviewed in exercise 9.8.4. The desired result for the middle square is exactly the content of exercise 9.8.3(b) and the computation of the curl operator in exercise 9.8.4. The desired result for the right square is exactly the content of exercise 9.8.3(c) and the computation of the divergence operator in exercise 9.8.4.


9.8.6. Two of these operators are zero:
curl ∘ grad,  div ∘ curl,  div ∘ grad.
Explain, using the diagram from the preceding exercise and the nilpotence of d.
Both curl ∘ grad and div ∘ curl are composites along the top row of the diagram corresponding to composites d^2 = 0 along the bottom row. Since the vertical arrows are invertible maps, the two classical composites must be 0.
For a function φ : R^3 −→ R, write out the harmonic equation (or Laplace's equation), div(grad φ) = 0.
Compute that
φ ↦ (D1 φ, D2 φ, D3 φ) ↦ D11 φ + D22 φ + D33 φ,
the first map being grad and the second div. Therefore the harmonic equation is
D11 φ + D22 φ + D33 φ = 0.
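The two vanishing composites can also be observed numerically. The Python sketch below (an illustration, not part of the text; the sample fields φ and F and the test point are arbitrary choices) builds grad, curl, and div from central differences and checks that curl ∘ grad and div ∘ curl vanish to within floating-point error:

```python
import math

def partial(F, i, p, step=1e-4):
    """Central-difference approximation of D_i F at the point p."""
    q1, q2 = list(p), list(p)
    q1[i] += step
    q2[i] -= step
    return (F(q1) - F(q2)) / (2 * step)

def grad(phi):
    return lambda p: [partial(phi, i, p) for i in range(3)]

def curl(F):
    return lambda p: [
        partial(lambda q: F(q)[2], 1, p) - partial(lambda q: F(q)[1], 2, p),
        partial(lambda q: F(q)[0], 2, p) - partial(lambda q: F(q)[2], 0, p),
        partial(lambda q: F(q)[1], 0, p) - partial(lambda q: F(q)[0], 1, p),
    ]

def div(G):
    return lambda p: sum(partial(lambda q: G(q)[i], i, p) for i in range(3))

phi = lambda p: math.sin(p[0]) * p[1] + p[2] ** 2         # sample scalar field
F = lambda p: [p[1] * p[2], math.cos(p[0]), p[0] * p[1]]  # sample vector field
p = (0.4, 1.1, -0.8)

# Discrete central differences commute, so both composites vanish to roundoff.
print(max(abs(c) for c in curl(grad(phi))(p)) < 1e-6)  # prints True
print(abs(div(curl(F))(p)) < 1e-6)                     # prints True
```

The cancellation is exact in the discrete setting because iterated central differences in two different coordinates use the same four sample points in either order, mirroring the equality of mixed partials that drives d^2 = 0.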

9.9.1. Define S : R^2 −→ R^2 by S(u, v) = (u + v, uv) = (x, y). Let ω = x^2 dy + y^2 dx and λ = xy dx, forms on (x, y)-space.
(a) Compute ω ∧ λ, S′(u, v), and S∗(ω ∧ λ).
First, ω ∧ λ = −x^3 y dx ∧ dy. Second,
S′(u, v) = [ 1  1 ]
           [ v  u ].
Third, by the Pullback–Determinant Theorem,
S∗(ω ∧ λ) = −(u + v)^3 uv det S′(u, v) du ∧ dv = −(u + v)^3 uv (u − v) du ∧ dv.
(b) Compute S∗ω, S∗λ, and S∗ω ∧ S∗λ. How do you check the last of these?
Compute that
S∗ω = (u + v)^2 d(uv) + (uv)^2 d(u + v)
    = (u + v)^2 (u dv + v du) + (uv)^2 (du + dv)
    = ((u + v)^2 v + (uv)^2) du + ((u + v)^2 u + (uv)^2) dv,
and
S∗λ = (u + v)uv d(u + v) = (u + v)uv (du + dv),


and therefore,
S∗ω ∧ S∗λ = (((u + v)^2 v + (uv)^2) du + ((u + v)^2 u + (uv)^2) dv) ∧ ((u + v)uv (du + dv))
          = (u + v)uv (((u + v)^2 v + (uv)^2) − ((u + v)^2 u + (uv)^2)) du ∧ dv
          = (u + v)uv (u + v)^2 (v − u) du ∧ dv
          = −(u + v)^3 uv (u − v) du ∧ dv.
This agrees with S∗(ω ∧ λ) from part (a), as it must since the pullback of the product is the product of the pullbacks.
(c) Compute dω and S∗(dω).
Compute that
dω = d(x^2 dy + y^2 dx) = 2(x − y) dx ∧ dy,
and then that
S∗(dω) = 2(u + v − uv) det S′(u, v) du ∧ dv = 2(u + v − uv)(u − v) du ∧ dv.
(d) Compute d(S∗ω). How do you check this?
Compute, using the previously computed value of S∗ω, that
d(S∗ω) = d(((u + v)^2 v + (uv)^2) du + ((u + v)^2 u + (uv)^2) dv)
       = (D1((u + v)^2 u + (uv)^2) − D2((u + v)^2 v + (uv)^2)) du ∧ dv
       = (2(u + v)u + (u + v)^2 + 2uv^2 − 2(u + v)v − (u + v)^2 − 2u^2 v) du ∧ dv
       = 2(u + v − uv)(u − v) du ∧ dv.
This agrees with S∗(dω) from part (c), as it must since the pullback of the derivative is the derivative of the pullback.
(e) Define T : R^2 −→ R^2 by T(s, t) = (s − t, se^t) = (u, v). Compute T∗(S∗λ).
Using the value of S∗λ from (b), compute that
T∗(S∗λ) = T∗((u + v)uv (du + dv))
        = (s − t + se^t)(s − t)se^t (d(s − t) + d(se^t))
        = (s − t + se^t)(s − t)se^t (ds − dt + e^t ds + se^t dt)
        = (s − t + se^t)(s − t)se^t ((e^t + 1) ds + (se^t − 1) dt).
(f) What is the composite mapping S ∘ T? Compute (S ∘ T)∗λ. How do you check this?


The composite mapping is
(S ∘ T)(s, t) = S(T(s, t)) = S(s − t, se^t) = (s − t + se^t, (s − t)se^t).
It follows that
(S ∘ T)∗λ = (s − t + se^t)(s − t)se^t d(s − t + se^t)
          = (s − t + se^t)(s − t)se^t (ds − dt + d(se^t))
          = (s − t + se^t)(s − t)se^t (ds − dt + e^t ds + se^t dt)
          = (s − t + se^t)(s − t)se^t ((e^t + 1) ds + (se^t − 1) dt).
This agrees with T∗(S∗λ) from part (e), as it must since the pullback of a composition is the composition of the pullbacks.
9.9.2. Recall the two forms from the beginning of the section,
λ = dx ∧ dy,  ω = (x dy − y dx)/(x^2 + y^2).

Consider a mapping from the nonzero points of (u, v)-space to nonzero points of (x, y)-space,
T(u, v) = (u/(u^2 + v^2), −v/(u^2 + v^2)) = (x, y).
As at the end of the section, in light of the fact that T is the complex reciprocal mapping, determine what T∗ω and T∗λ must be.
We recognize T as the complex reciprocal function, λ as the area-form, and ω as the change-of-angle-form. Consider the polar coordinate map
Φ : R>0 × R −→ R^2 \ {(0, 0)},  Φ(r, θ) = (r cos θ, r sin θ) = (u, v).
In polar coordinates the reciprocal map re-expresses itself as
S : R>0 × R −→ R>0 × R,  S(r, θ) = (1/r, −θ) = (r̃, θ̃).
And the polar coordinate map also applies to the polar coordinates output by the reciprocal map,
Φ : R>0 × R −→ R^2 \ {(0, 0)},  Φ(r̃, θ̃) = (r̃ cos θ̃, r̃ sin θ̃) = (x, y).
Thus we have a commutative diagram

  R>0 × R --Φ--> R^2 \ {(0, 0)}
     |S              |T
     v               v
  R>0 × R --Φ--> R^2 \ {(0, 0)}.


In terms of differential forms and pullbacks we have the resulting diagram (in which k = 1 or k = 2, depending on whether we want to study λ or ω)

  Λ^k(R>0 × R) <--Φ∗-- Λ^k(R^2 \ {(0, 0)})
       ^S∗                     ^T∗
       |                       |
  Λ^k(R>0 × R) <--Φ∗-- Λ^k(R^2 \ {(0, 0)}).

Now, to study λ = dx ∧ dy, note that in the second diagram we have

  r^(−1) d(r^(−1)) ∧ d(−θ) <-- T∗λ
         ^                      ^
  r̃ dr̃ ∧ dθ̃  <--------------   λ

and a small calculation gives
r^(−1) d(r^(−1)) ∧ d(−θ) = dr ∧ dθ / r^3.
Thus T∗λ is the (u, v)-form that pulls back through the polar coordinate map to dr ∧ dθ/r^3. The area-form du ∧ dv pulls back to r dr ∧ dθ, so the correct answer is the area form du ∧ dv divided by r^4 in (u, v)-coordinates. That is, since r in (u, v)-coordinates is √(u^2 + v^2),
T∗λ = T∗(dx ∧ dy) = du ∧ dv / (u^2 + v^2)^2.
Similarly, since ω measures change in angle and since the reciprocal function negates angle, we have

  −dθ <-- T∗ω
   ^        ^
  dθ̃ <---   ω

and so T∗ω should be the negative of ω, but with u and v in place of x and y,
T∗ω = −(u dv − v du)/(u^2 + v^2).

These formulas for T∗λ and T∗ω can be verified directly by purely mechanical computation.
9.9.3. Consider a differential form on the punctured (x, y)-plane,
µ = (x dx + y dy)/√(x^2 + y^2).

(a) Pull µ back through the polar coordinate mapping from the end of the section,
Φ(r̃, θ̃) = (r̃ cos θ̃, r̃ sin θ̃) = (x, y).
In light of the value of the pullback, what must be the integral ∫_γ µ where γ is a parametrized curve in the punctured (x, y)-plane?
Compute
Φ∗µ = (r̃ cos θ̃ d(r̃ cos θ̃) + r̃ sin θ̃ d(r̃ sin θ̃))/r̃
    = cos θ̃ (cos θ̃ dr̃ − r̃ sin θ̃ dθ̃) + sin θ̃ (sin θ̃ dr̃ + r̃ cos θ̃ dθ̃)
    = dr̃.
Thus µ measures change in distance from the origin, and so for any parametrized curve γ : [0, 1] −→ R^2 − {(0, 0)},
∫_γ µ = |γ(1)| − |γ(0)|.
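This conclusion lends itself to a quick numerical spot check. The Python sketch below (an illustration, not part of the text; the curve γ is an arbitrary choice in the punctured plane) approximates ∫_γ µ by a Riemann sum and compares it with |γ(1)| − |γ(0)|:

```python
import math

def mu(p, v):
    """The 1-form mu = (x dx + y dy)/sqrt(x^2 + y^2) at the point p, on the vector v."""
    x, y = p
    return (x * v[0] + y * v[1]) / math.hypot(x, y)

gamma = lambda t: (1 + t, t * t)   # sample curve that avoids the origin

N = 20000
total = 0.0
for k in range(N):
    t = k / N
    p = gamma(t)
    q = gamma(t + 1e-7)
    v = ((q[0] - p[0]) / 1e-7, (q[1] - p[1]) / 1e-7)  # forward-difference tangent
    total += mu(p, v) / N                              # Riemann sum of mu(gamma(t), gamma'(t)) dt

expected = math.hypot(*gamma(1)) - math.hypot(*gamma(0))
print(abs(total - expected) < 1e-3)  # prints True
```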

(b) In light of part (a), pull µ back through the complex square mapping from the section,
T(u, v) = (u^2 − v^2, 2uv) = (x, y),
by using diagrams rather than by relying heavily on computation. Check your answer by computation if you wish.
In polar coordinates the complex square map re-expresses itself as
S : R>0 × R −→ R>0 × R,  S(r, θ) = (r^2, 2θ) = (r̃, θ̃).
Thus we have a commutative diagram

  R>0 × R --Φ--> R^2 \ {(0, 0)}
     |S              |T
     v               v
  R>0 × R --Φ--> R^2 \ {(0, 0)},

where the variables are (r, θ) at the upper left, (u, v) at the upper right, (r̃, θ̃) at the lower left, and (x, y) at the lower right. In terms of differential forms and pullbacks we have the resulting diagram:

  Λ^1(R>0 × R) <--Φ∗-- Λ^1(R^2 \ {(0, 0)})
       ^S∗                     ^T∗
       |                       |
  Λ^1(R>0 × R) <--Φ∗-- Λ^1(R^2 \ {(0, 0)}).

Now, to study µ, note that in the second diagram we have

  2r dr <-- T∗µ
    ^         ^
  dr̃  <---   µ

Thus T∗µ is the (u, v)-form that pulls back through the polar coordinate map to 2r dr. Since µ itself pulls back to dr̃, the answer is µ in (u, v)-coordinates times 2r in (u, v)-coordinates, and since r in (u, v)-coordinates is √(u^2 + v^2), this works out tidily to
T∗µ = 2(u du + v dv).
This formula can be verified by purely mechanical computation.
(c) Similarly to part (a), pull µ back through the complex reciprocal mapping from the previous exercise,
T(u, v) = (u/(u^2 + v^2), −v/(u^2 + v^2)) = (x, y),
by using diagrams. Check your answer by computation if you wish.
The argument is very similar except that now the complex reciprocal map in polar coordinates is
S : R>0 × R −→ R>0 × R,  S(r, θ) = (1/r, −θ) = (r̃, θ̃).
Thus our commutative diagram works out to

  d(r^(−1)) <-- T∗µ
      ^           ^
     dr̃  <-----   µ

and a small calculation gives
d(r^(−1)) = −r^(−2) dr.
Thus T∗µ is the (u, v)-form that pulls back through the polar coordinate map to −r^(−2) dr. Since µ itself pulls back to dr̃, the answer is −µ/r^2 in (u, v)-coordinates, and since r in (u, v)-coordinates is √(u^2 + v^2), this works out to
T∗µ = −(u du + v dv)/(u^2 + v^2)^(3/2).

This formula can be verified by purely mechanical computation.
9.9.4. Let r be a fixed positive real number. Consider a 2-surface in R^3,
Φ : [0, 2π] × [0, π] −→ R^3,  Φ(θ, ϕ) = (r cos θ sin ϕ, r sin θ sin ϕ, r cos ϕ).
Consider also a 2-form on R^3,
ω = −(x/r) dy ∧ dz − (y/r) dz ∧ dx − (z/r) dx ∧ dy.
Compute the derivative matrix Φ′(θ, ϕ), and use the Pullback–Determinant Theorem three times to compute the pullback Φ∗ω. Compare your answer to the integrand of the surface integral near the end of section 9.1 used to compute the volume of the sphere of radius r.
The columns of the derivative matrix were computed in section 9.1. In any case, the derivative matrix is
            [ −r sin θ sin ϕ   r cos θ cos ϕ ]
Φ′(θ, ϕ) =  [  r cos θ sin ϕ   r sin θ cos ϕ ]
            [       0            −r sin ϕ    ].
By the Pullback–Determinant Theorem,
Φ∗(dy ∧ dz) = −r^2 cos θ sin^2 ϕ dθ ∧ dϕ,
Φ∗(dz ∧ dx) = −r^2 sin θ sin^2 ϕ dθ ∧ dϕ,
Φ∗(dx ∧ dy) = −r^2 sin ϕ cos ϕ dθ ∧ dϕ.
Also, by definition of Φ,
Φ∗(−x/r) = −cos θ sin ϕ,  Φ∗(−y/r) = −sin θ sin ϕ,  Φ∗(−z/r) = −cos ϕ.
It follows that the pullback of ω is
Φ∗ω = r^2 sin ϕ dθ ∧ dϕ.
This is the result in section 9.1. We need the minus signs in ω because our (poorly-chosen) spherical coordinate system reverses orientation.
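The pullback computation can be replayed numerically: approximate the two columns of Φ′ by central differences, form the three 2×2 determinants that the Pullback–Determinant Theorem calls for, and compare against r^2 sin ϕ. The Python sketch below does this at one sample point (the radius and the point are arbitrary illustrative choices):

```python
import math

r = 2.0   # sample radius

def Phi(theta, phi):
    """The spherical parametrization from the exercise."""
    return (r * math.cos(theta) * math.sin(phi),
            r * math.sin(theta) * math.sin(phi),
            r * math.cos(phi))

def columns(theta, phi, step=1e-6):
    """The two columns of Phi'(theta, phi), by central differences."""
    ct = [(a - b) / (2 * step)
          for a, b in zip(Phi(theta + step, phi), Phi(theta - step, phi))]
    cp = [(a - b) / (2 * step)
          for a, b in zip(Phi(theta, phi + step), Phi(theta, phi - step))]
    return ct, cp

def pullback(theta, phi):
    (xt, yt, zt), (xp, yp, zp) = columns(theta, phi)
    x, y, z = Phi(theta, phi)
    dydz = yt * zp - yp * zt   # coefficient of Phi*(dy ^ dz)
    dzdx = zt * xp - zp * xt   # coefficient of Phi*(dz ^ dx)
    dxdy = xt * yp - xp * yt   # coefficient of Phi*(dx ^ dy)
    return -(x / r) * dydz - (y / r) * dzdx - (z / r) * dxdy

theta, phi = 0.9, 1.3
print(abs(pullback(theta, phi) - r * r * math.sin(phi)) < 1e-6)  # prints True
```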

9.10.1. Let T : R^2 −→ R^2 be given by T(x1, x2) = (x1^2 − x2^2, 2x1x2) = (y1, y2). Let γ be the curve γ : [0, 1] −→ R^2 given by γ(t) = (1, t), mapping the unit interval into (x1, x2)-space, and let T ∘ γ be the corresponding curve mapping into (y1, y2)-space. Let ω = y1 dy2, a 1-form on (y1, y2)-space.
(a) Compute ∫_{T∘γ} ω.
Compute first that (T ∘ γ)(t) = T(1, t) = (1 − t^2, 2t), and then that
∫_{T∘γ} ω = ∫_{t=0}^{1} (1 − t^2) d(2t) = 2 ∫_{t=0}^{1} (1 − t^2) dt = 2(1 − 1/3) = 4/3.

(b) Compute T∗ω, the pullback of ω by T.

Compute that since ω = y1 dy2 and T(x1, x2) = (x1^2 − x2^2, 2x1x2),
T∗ω = (x1^2 − x2^2) d(2x1x2) = 2(x1^2 − x2^2)(x2 dx1 + x1 dx2).
(c) Compute ∫_γ T∗ω. What theorem says that the answer here is the same as (a)?
Compute that since γ(t) = (1, t),
∫_γ T∗ω = 2 ∫_{t=0}^{1} (1 − t^2)(t d1 + 1 dt) = 2 ∫_{t=0}^{1} (1 − t^2) dt = 4/3.
This equality is a special case of Theorem 9.10.2, the Change of Variable Theorem for differential forms.
(d) Let λ = dy1 ∧ dy2, the area form on (y1, y2)-space. Compute T∗λ.
Since T(x1, x2) = (x1^2 − x2^2, 2x1x2), the pullback is
T∗λ = d(x1^2 − x2^2) ∧ d(2x1x2) = 4(x1 dx1 − x2 dx2) ∧ (x2 dx1 + x1 dx2) = 4(x1^2 + x2^2) dx1 ∧ dx2.
(e) A rectangle in the first quadrant of (x1, x2)-space,
R = {(x1, x2) : a1 ≤ x1 ≤ b1, a2 ≤ x2 ≤ b2},
gets taken to some indeterminate patch B = T(R) by T. Find the area of B, ∫_B λ, using (d). (This exercise abuses notation slightly, identifying R with its natural parametrization and B with the corresponding surface T ∘ R.)
Compute, initially viewing R as a surface,
∫_B λ = ∫_{T∘R} λ = ∫_R T∗λ    by the Change of Variable Theorem
      = 4 ∫_R (x1^2 + x2^2) dx1 ∧ dx2    by the calculation in part (d)
      = 4 ∫_R (x1^2 + x2^2)    now viewing R as a set, by exercise 9.5.2
      = 4 ∫_{x1=a1}^{b1} ∫_{x2=a2}^{b2} (x1^2 + x2^2)    by Fubini's Theorem.
From here the calculation is routine, albeit a bit tedious,
4 ∫_{x1=a1}^{b1} ∫_{x2=a2}^{b2} (x1^2 + x2^2) = 4 ∫_{x1=a1}^{b1} (x1^2 x2 + x2^3/3) |_{x2=a2}^{b2}
  = 4 ∫_{x1=a1}^{b1} ((b2 − a2) x1^2 + (b2^3 − a2^3)/3)
  = 4 ((b2 − a2) x1^3/3 + (b2^3 − a2^3) x1/3) |_{x1=a1}^{b1}
  = (4/3)((b2 − a2)(b1^3 − a1^3) + (b2^3 − a2^3)(b1 − a1)).
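The closed-form area of B can be spot-checked by integrating 4(x1^2 + x2^2) over R numerically. The Python sketch below (an illustration, not part of the text; the rectangle bounds are arbitrary first-quadrant choices) compares a midpoint Riemann sum with the formula just derived:

```python
a1, b1, a2, b2 = 1.0, 2.0, 0.5, 1.5   # sample first-quadrant rectangle

# Closed form derived above.
exact = (4 / 3) * ((b2 - a2) * (b1 ** 3 - a1 ** 3)
                   + (b2 ** 3 - a2 ** 3) * (b1 - a1))

# Midpoint Riemann sum of the integrand 4(x1^2 + x2^2) over R.
N = 400
dx1, dx2 = (b1 - a1) / N, (b2 - a2) / N
total = 0.0
for i in range(N):
    x1 = a1 + (i + 0.5) * dx1
    for j in range(N):
        x2 = a2 + (j + 0.5) * dx2
        total += 4 * (x1 * x1 + x2 * x2) * dx1 * dx2

print(abs(total - exact) < 1e-3)  # prints True
```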

(f) Why does this exercise require that R live in the first quadrant?
This ensures that T is injective on R, i.e., that it doesn't fold R onto itself. (Note that T is the complex squaring map in disguise, so it doubles angles.) Can the restriction be weakened? Yes. Since T doubles angles, all that we need is that R lie entirely on one side of some line through the origin.
9.11.1. (a) Here is a special case of showing that a closed form is exact. A function f : R^3 −→ R is called homogeneous of degree k if
f(tx, ty, tz) = t^k f(x, y, z)  for all t ∈ R and (x, y, z) ∈ R^3.
Such a function must satisfy Euler's identity,
xD1f + yD2f + zD3f = kf.
Suppose that ω = f1 dx + f2 dy + f3 dz is a closed 1-form whose coefficient functions are all homogeneous of degree k where k ≥ 0. Show that ω = dφ where
φ = (xf1 + yf2 + zf3)/(k + 1).
Compute,
(k + 1) dφ = d(xf1 + yf2 + zf3)
  = x df1 + f1 dx + y df2 + f2 dy + z df3 + f3 dz
  = x(D1f1 dx + D2f1 dy + D3f1 dz) + f1 dx
  + y(D1f2 dx + D2f2 dy + D3f2 dz) + f2 dy
  + z(D1f3 dx + D2f3 dy + D3f3 dz) + f3 dz
  = (xD1f1 + yD1f2 + zD1f3 + f1) dx
  + (xD2f1 + yD2f2 + zD2f3 + f2) dy
  + (xD3f1 + yD3f2 + zD3f3 + f3) dz.
But ω is closed, i.e., dω = 0. This means that
D2f1 = D1f2,  D3f1 = D1f3,  D3f2 = D2f3.
The result follows from this and Euler's identity,
(k + 1) dφ = (xD1f1 + yD2f1 + zD3f1 + f1) dx
  + (xD1f2 + yD2f2 + zD3f2 + f2) dy
  + (xD1f3 + yD2f3 + zD3f3 + f3) dz
  = (k + 1)(f1 dx + f2 dy + f3 dz) = (k + 1)ω.
(b) Here is a closed form that is not exact: Let
ω = (x dy − y dx)/(x^2 + y^2),

a 1-form on the punctured plane A = R^2 − {(0, 0)}. Show that ω is closed.
Compute,
dω = (D1(x/(x^2 + y^2)) + D2(y/(x^2 + y^2))) dx ∧ dy
   = ((y^2 − x^2)/(x^2 + y^2)^2 + (x^2 − y^2)/(x^2 + y^2)^2) dx ∧ dy
   = 0.
Integrate ω around the counterclockwise unit circle,
γ : [0, 2π] −→ A,  γ(t) = (cos t, sin t),
to show that there is no 0-form (i.e., function) θ on the punctured plane such that ω = dθ.
Compute
∫_γ ω = ∫_{t=0}^{2π} (cos t · cos t + sin t · sin t)/(cos^2 t + sin^2 t) = ∫_{t=0}^{2π} 1 = 2π.
But if ω were to take the form ω = dθ then we would have
∫_γ ω = ∫_γ dθ = θ(γ(2π)) − θ(γ(0)) = θ(1, 0) − θ(1, 0) = 0.
Since the integral is not 0, ω cannot take the form ω = dθ.
(c) Use part (b) to show that there cannot exist a homotopy of the punctured plane. How does this nonexistence relate to the example of the annulus at the beginning of the section?
If a homotopy of the punctured plane exists, then every closed form on the punctured plane is exact. However, part (b) demonstrated a closed form on the punctured plane that is not exact, and so by contraposition no homotopy of the punctured plane exists.
Consequently, no homotopy of the annulus exists either. Otherwise we could use half a unit of time to shrink the punctured plane into the annulus, and then half a unit of time to shrink the annulus to a point, giving altogether a homotopy of the punctured plane. But no such homotopy exists. Thus no homotopy of the annulus exists.
9.13.2. Describe the boundary of the hemispherical shell H : D −→ R^3 where D is the unit disk in R^2 and H(x, y) = (x, y, √(1 − x^2 − y^2)).
Parametrize the shell via a composite mapping,

[0, 1]^2 --Φ--> D --H--> R^3.
Here D is the unit disk and
Φ(r, θ) = (r cos 2πθ, r sin 2πθ) = (x, y),
while H is the map from the disk up to the hemispherical shell,
H(x, y) = (x, y, √(1 − x^2 − y^2)).
Then the boundary is
−H ∘ Φ ∘ ∆^2_{1,0} + H ∘ Φ ∘ ∆^2_{1,1} + H ∘ Φ ∘ ∆^2_{2,0} − H ∘ Φ ∘ ∆^2_{2,1}.
The first of these parametrizes the north pole. Integrating any 1-form over this surface will give 0, so the first piece is negligible. The second term traverses the equator west to east, i.e., counterclockwise if one is looking down the z-axis. The third and fourth pieces give cancelling traversals of the quarter great circle at longitude 0, from the north pole to the equator and back. The entire boundary is one traversal of the equator, eastward.
9.13.3. Describe the boundary of the solid unit upper hemisphere
H = {(x, y, z) ∈ R^3 : x^2 + y^2 + z^2 ≤ 1, z ≥ 0}.
Parametrize H by Φ : [0, 1]^3 −→ R^3 where
Φ(ρ, u, v) = (ρ cos(2πu) sin(πv/2), ρ sin(2πu) sin(πv/2), ρ cos(πv/2)).
Then
∂Φ = Φ ∘ ∂∆^3 = Φ ∘ (−∆^3_{1,0} + ∆^3_{1,1} + ∆^3_{2,0} − ∆^3_{2,1} − ∆^3_{3,0} + ∆^3_{3,1}).
The first piece parametrizes the centerpoint of the sphere. Integrating any 2-form over this surface will give 0, so the first piece is negligible. The second piece parametrizes the upper half spherical shell. The third piece parametrizes the quarter-disk cross section of the solid hemisphere at longitude zero. The fourth piece parametrizes the same quarter disk, but with the opposite sign, and so the third and fourth pieces cancel. The fifth piece traverses the z-axis from 0 to 1. Integrating any 2-form over this surface will give 0, so the fifth piece is negligible. Finally, the sixth piece parametrizes the unit disk in the (x, y)-plane. In sum, the boundary is the upper half spherical shell and the equatorial disk.
9.13.4. Describe the boundary of the paraboloid Φ : D −→ R^3 where Φ(u, v) = (u, v, u^2 + v^2).
This is very similar to 9.13.2. The boundary is the lip, one traversal of the circle at height 1.
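The bookkeeping in these boundary computations, faces ∆^k_{i,α} weighted by signs (−1)^{i+α}, always makes repeated boundaries cancel. The following Python sketch (an illustration, not part of the text) verifies this combinatorially for the 3-cube: every doubly-constrained cell of ∂∂[0,1]^3 receives two opposite signs.

```python
from collections import defaultdict
from itertools import product

# The face map Delta^n_(i, alpha) of the unit n-cube fixes coordinate i at
# the value alpha and carries the sign (-1)^(i + alpha).  Composing two face
# maps fixes two coordinates; each doubly-constrained cell arises from
# exactly two compositions, with opposite signs, so the boundary of a
# boundary is 0.
n = 3
cells = defaultdict(int)
for i, alpha in product(range(n), (0, 1)):          # faces of the 3-cube
    for j, beta in product(range(n - 1), (0, 1)):   # faces of each face
        # The inner face map inserts beta at slot j of the remaining
        # coordinates; in the cube that is coordinate j, or j + 1 when the
        # slot i is already occupied.
        jp = j if j < i else j + 1
        key = frozenset({(i, alpha), (jp, beta)})
        cells[key] += (-1) ** (i + alpha) * (-1) ** (j + beta)

print(all(total == 0 for total in cells.values()))  # prints True
```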

9.13.5. Describe the boundary of Φ : [0, 2π] × [0, π] −→ R^3 where
Φ(θ, φ) = (cos θ sin φ, sin θ sin φ, cos φ).
Reparametrize to Ψ : [0, 1]^2 −→ R^3 where
Ψ(u, v) = (cos 2πu sin πv, sin 2πu sin πv, cos πv).
Then the boundary is
Ψ ∘ (−∆^2_{1,0} + ∆^2_{1,1} + ∆^2_{2,0} − ∆^2_{2,1}).
The first term traverses the great half-circle at longitude 0, from the north pole to the south pole. The second term does the same, but with the opposite sign, so these two terms cancel. The third term parametrizes the north pole. The integral of any 1-form over this surface is 0, so the third term is negligible. The fourth term is similar, but with the south pole instead. In sum, the boundary is 0.
9.13.6. Describe the boundary of Φ : [0, 1] × [0, 2π] × [0, π] −→ R^3 where
Φ(ρ, θ, φ) = (ρ cos θ sin φ, ρ sin θ sin φ, ρ cos φ).
This is very similar to 9.13.3. The boundary is the spherical shell.
9.13.7. Fix constants 0 < a < b. Describe the boundary of Φ : [0, 2π] × [0, 2π] × [0, 1] −→ R^3 where
Φ(u, v, t) = (cos u (b + at cos v), sin u (b + at cos v), at sin v).
This surface is a solid torus. As in exercise 9.13.3, its boundary is
∂Φ = Φ ∘ ∂∆^3 = Φ ∘ (−∆^3_{1,0} + ∆^3_{1,1} + ∆^3_{2,0} − ∆^3_{2,1} − ∆^3_{3,0} + ∆^3_{3,1}).
The first two terms are a pair of cross-sectional circles at longitude 0 that cancel. The second two terms are a pair of equatorial annuli of inner radius b and outer radius b + a that cancel. The fifth term is a central equatorial circle that is negligible because the integral of any 2-form over it is zero. Finally, the sixth term is the torus shell. Thus the boundary is the torus shell.
9.14.1. Similarly to the second example before the proof of the Generalized FTIC, show that the theorem holds when C = ∆^3 and ω = f dz ∧ dx.
The left side integral is
∫_{∆^3} dω = ∫_{[0,1]^3} D2 f.
For the right side integral, only the left and right faces contribute, and they give
∫_{∂∆^3} ω = ∫_{z=0}^{1} ∫_{x=0}^{1} (f(x, 1, z) − f(x, 0, z)).
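As a numerical aside (not part of the text), the equality of the two displayed integrals can be spot-checked for a sample f by comparing midpoint Riemann sums of both sides:

```python
import math

# Sample f for 9.14.1 (an arbitrary illustrative choice); its partial
# derivative D2 f in the y-direction is computed by hand.
f = lambda x, y, z: math.sin(x * y) + y * y * z
D2f = lambda x, y, z: x * math.cos(x * y) + 2 * y * z

N = 80
step = 1.0 / N
mids = [(k + 0.5) * step for k in range(N)]

# Left side: integral of D2 f over the unit cube (midpoint rule).
lhs = sum(D2f(x, y, z) for x in mids for y in mids for z in mids) * step ** 3
# Right side: the two contributing boundary faces y = 1 and y = 0.
rhs = sum(f(x, 1.0, z) - f(x, 0.0, z) for x in mids for z in mids) * step ** 2

print(abs(lhs - rhs) < 1e-3)  # prints True
```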

By the one-variable FTIC and then Fubini's Theorem, the right sides of the two displays are equal, and hence so are the two left sides.
9.14.2. Prove as a corollary to the Generalized FTIC that ∂^2 = 0, in the sense that ∫_{∂^2 C} ω = 0 for all forms ω.
Compute that since d^2 = 0,
∫_{∂^2 C} ω = ∫_{∂C} dω = ∫_C d^2 ω = 0.
9.14.3. Let C be a k-chain in R^n, f : R^n −→ R a function, and ω a (k − 1)-form on R^n. Use the Generalized FTIC to prove a generalization of the formula for integration by parts,
∫_C f dω = ∫_{∂C} f ω − ∫_C df ∧ ω.
First compute that by the product rule for derivatives, using the fact that f is a 0-form so that there is no minus sign,
d(f ∧ ω) = df ∧ ω + f dω.
Therefore, by Stokes's Theorem,
∫_{∂C} f ω = ∫_C d(f ω) = ∫_C df ∧ ω + ∫_C f dω.
The result follows immediately.
9.14.4. If Φ is a 4-chain in R^4 with boundary ∂Φ, prove the identity
∫_{∂Φ} f1 dy ∧ dz ∧ dw + f2 dz ∧ dw ∧ dx + f3 dw ∧ dx ∧ dy + f4 dx ∧ dy ∧ dz
  = ∫_Φ (D1 f1 − D2 f2 + D3 f3 − D4 f4) dx ∧ dy ∧ dz ∧ dw.

This is just a special case of Stokes's Theorem. That is, the integrand on the right side is the derivative of the integrand on the left side.
9.16.2. Use Green's Theorem to show that for a planar region Φ,
area(Φ) = ∫_{∂Φ} x dy = −∫_{∂Φ} y dx.
Thus one can measure the area of a planar set by traversing its boundary.
Since x dy = f(x, y) dx + g(x, y) dy where f(x, y) = 0 and g(x, y) = x, Green's Theorem gives
∫_{∂Φ} x dy = ∫_{∂Φ} f dx + g dy = ∫∫_Φ (∂g/∂x − ∂f/∂y) dx dy = ∫∫_Φ dx dy = area(Φ).
The second argument is similar but with f(x, y) = −y and g(x, y) = 0. More generally,
area(Φ) = ∫_Φ dx ∧ dy = ∫_{∂Φ} λ  for any λ such that dλ = dx ∧ dy,
and this specializes to λ = x dy and λ = −y dx.
9.16.3. Let H be the upper unit hemispherical shell,
H = {(x, y, z) ∈ R^3 : x^2 + y^2 + z^2 = 1, z ≥ 0}.
Define a vector-valued mapping on R^3,
F(x, y, z) = (x + y + z, xy + yz + zx, xyz).
Use Stokes's Theorem to calculate ∫∫_H curl F · dn⃗.
By Stokes's Theorem,
∫∫_H curl F · dn⃗ = ∫_{∂H} F · ds⃗.

From a previous exercise, ∂H is the equatorial circle, and the choice of traversing it counterclockwise (as viewed from above) seems natural:
∂H : [0, 2π] −→ R^3,  ∂H(θ) = (cos θ, sin θ, 0).
Thus on ∂H, the integrand F · ds⃗ is
(cos θ + sin θ, cos θ sin θ, 0) · (−sin θ, cos θ, 0) = cos^2 θ sin θ − cos θ sin θ − sin^2 θ.
The first two terms will integrate to 0 over a period of the cosine, and so the integral is
∫_{∂H} F · ds⃗ = −∫_{θ=0}^{2π} sin^2 θ dθ = −π.
Choosing the opposite orientation for ∂H will give an answer of π instead.
9.16.4. Use the Divergence Theorem to evaluate
∫∫_{∂H} x^2 dy ∧ dz + y^2 dz ∧ dx + z^2 dx ∧ dy,
where ∂H is the boundary of the solid unit hemisphere
H = {(x, y, z) ∈ R^3 : x^2 + y^2 + z^2 ≤ 1, z ≥ 0}.
By the Divergence Theorem, the integral is
∫∫∫_H (D1 x^2 + D2 y^2 + D3 z^2) dx ∧ dy ∧ dz,

and this is
2 ∫∫∫_H (x + y + z) dx ∧ dy ∧ dz.
By symmetry, the integral of x over H is 0, and similarly for y. This leaves
2 ∫∫∫_H z dx ∧ dy ∧ dz.
To compute this, parametrize with spherical coordinates to get
2 ∫_{ρ=0}^{1} ∫_{θ=0}^{2π} ∫_{ϕ=0}^{π/2} ρ cos ϕ · ρ^2 sin ϕ.
And this is
2 · 2π ∫_{ρ=0}^{1} ρ^3 ∫_{ϕ=0}^{π/2} cos ϕ sin ϕ = π/2.
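The value π/2 can be confirmed by a quick numerical integration in the same spherical coordinates. The following Python sketch (illustrative) evaluates the triple integral by the midpoint rule, with the trivial θ-integral contributing a factor of 2π:

```python
import math

N = 200
dr, dp = 1.0 / N, (math.pi / 2) / N
total = 0.0
for i in range(N):
    rho = (i + 0.5) * dr
    for k in range(N):
        phi = (k + 0.5) * dp
        # Integrand rho*cos(phi) * rho^2*sin(phi); the theta integral gives 2*pi.
        total += (2 * (rho * math.cos(phi)) * (rho * rho * math.sin(phi))
                  * 2 * math.pi * dr * dp)

print(abs(total - math.pi / 2) < 1e-4)  # prints True
```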

9.16.5. Let g and h be functions on R^3. Recall the operator ∇ = (D1, D2, D3), which takes scalar-valued functions to vector-valued functions. The Laplacian operator is ∆ = D11 + D22 + D33. From an earlier exercise, ∆ = div ∘ grad.
(a) Prove that div(g ∇h) = g ∆h + ∇g · ∇h.
First note that g ∇h = (g D1h, g D2h, g D3h). Now take the divergence,
div(g ∇h) = D1(g D1h) + D2(g D2h) + D3(g D3h)
  = D1g D1h + g D11h + D2g D2h + g D22h + D3g D3h + g D33h
  = g(D11h + D22h + D33h) + D1g D1h + D2g D2h + D3g D3h
  = g ∆h + ∇g · ∇h.
This is the desired result.
(b) If D is a closed compact subset of R^3 with positively oriented boundary ∂D, prove that
∫∫∫_D (g ∆h + ∇g · ∇h) dV = ∫∫_{∂D} g ∇h · dn⃗.
This is immediate from part (a) and then the Divergence Theorem,
∫∫∫_D (g ∆h + ∇g · ∇h) dV = ∫∫∫_D div(g ∇h) dV = ∫∫_{∂D} g ∇h · dn⃗.
Interchange g and h and then subtract the resulting formula from the first one to get
∫∫∫_D (g ∆h − h ∆g) dV = ∫∫_{∂D} (g ∇h − h ∇g) · dn⃗.
These two formulas are Green's identities. This follows immediately.
(c) Assume that h is harmonic, meaning that it satisfies the harmonic equation ∆h = 0. Take g = h and use Green's first identity to conclude that if h = 0 on the boundary ∂D then h = 0 on all of D.
Green's first identity with h harmonic and g = h is
∫∫∫_D |∇h|^2 dV = ∫∫_{∂D} h ∇h · dn⃗.
If also h = 0 on ∂D then the right side is 0, forcing |∇h|^2 to be 0 throughout D. This in turn forces ∇h = (D1h, D2h, D3h) to be (0, 0, 0) throughout D, meaning that h itself is constant. But this constant is 0 on ∂D, and so it is 0 throughout D.
Take g = 1 and use Green's second identity to show that
∫∫_{∂D} ∇h · dn⃗ = 0.
The desired equality is an immediate consequence of Green's second identity when g = 1 and h is harmonic. It says that the gradient vector field of a harmonic function creates no net flux through any boundary surface.
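As a closing numerical illustration (not part of the text), the zero-flux conclusion can be observed for the sample harmonic function h = x^2 − y^2 on the unit ball: the flux of ∇h through the unit sphere, computed by the midpoint rule in spherical coordinates, vanishes to within roundoff.

```python
import math

# Gradient of h = x^2 - y^2, an arbitrary harmonic sample (D11 h + D22 h = 2 - 2 = 0).
grad_h = lambda x, y, z: (2 * x, -2 * y, 0.0)

N = 200
dt, dp = 2 * math.pi / N, math.pi / N
flux = 0.0
for i in range(N):
    theta = (i + 0.5) * dt
    for j in range(N):
        phi = (j + 0.5) * dp
        # Point on the unit sphere; the outward normal there is the point itself,
        # and the spherical area element is sin(phi) dtheta dphi.
        x = math.cos(theta) * math.sin(phi)
        y = math.sin(theta) * math.sin(phi)
        z = math.cos(phi)
        gx, gy, gz = grad_h(x, y, z)
        flux += (gx * x + gy * y + gz * z) * math.sin(phi) * dt * dp

print(abs(flux) < 1e-6)  # prints True
```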

