Convex Functions, Monotone Operators and Differentiability (Lecture Notes in Mathematics, 1364) 9783540567158, 3540567151

The improved and expanded second edition contains expositions of some major results which have been obtained in the year

113 64 7MB

English Pages 132 [127] Year 1993

Report DMCA / Copyright

DOWNLOAD FILE

Polecaj historie

Convex Functions, Monotone Operators and Differentiability (Lecture Notes in Mathematics, 1364)
 9783540567158, 3540567151

Table of contents :
0003
0005
0006
0007
0008
0009
0010
0011
0013
0015
0016
0017
0018
0019
0020
0021
0022
0023
0024
0025
0026
0027
0028
0029
0030
0031
0032
0033
0034
0035
0036
0037
0038
0039
0040
0041
0042
0043
0044
0045
0046
0047
0048
0049
0050
0051
0052
0053
0054
0055
0056
0057
0058
0059
0060
0061
0062
0063
0064
0065
0066
0067
0068
0069
0070
0071
0072
0073
0074
0075
0076
0077
0078
0079
0080
0081
0082
0083
0084
0085
0086
0087
0088
0089
0090
0091
0092
0093
0094
0095
0096
0097
0098
0099
0100
0101
0102
0103
0104
0105
0106
0107
0108
0109
0110
0111
0112
0113
0114
0115
0116
0117
0118
0119
0120
0121
0122
0123
0124
0125
0126
0127
0128
0129
0130
0131

Citation preview

Lecture Notes in Mathematics Editors: A. Dold, Heidelberg B. Eckmann, Zurich F. Takens, Groningen

1364

Robert R. Phelps

Convex Functions, Monotone Operators and Differentiability 2nd Edition

Springer-Verlag Berlin Heidelberg New York London Paris Tokyo Hong Kong Barcelona Budapest

Author Robert R. Phelps Department of Mathematics GN-50 University of Washington Seattle, WA 98195-0001, USA

Cover Graphic by Diane McIntyre

Mathematics Subject Classification (1991): 46B20, 46B22, 47H05, 49A29, 49A51 , 52A07 ISBN 3-540-56715-1 Springer-Verlag Berlin Heidelberg New York ISBN 0-387-56715-1 Springer-Verlag New York Berlin Heidelberg Library of Congress Cataloging-in-Publication Data Phelps, Robert R. (Robert Ralph), 1926 Convex functions, monotone operators, and differentiability/Robert R. Phelps. 2nd. ed. p. cm. - (Lecture notes in mathematics; 1364) Includes bibliographical references and index. ISBN 0-387-56715-1 1. convex functions. 2. Monotone operators. 3. Differentiable Functions. I. Title. II. Series: Lecture notes in mathematics (Springer Verlag); 1364. QA3.L28 no. 1364 (QA331.5) 510 s-dc20 (515'.8) This work is subject to copyright. All rights are reserved, whether the whole or part of the material is concerned, specifically the rights of translation, reprinting, re-use of illustrations, recitation, broadcasting, reproduction on microfilms or in any other way, and storage in data banks. Duplication of this publication or parts thereof is permitted only under the provisions of the German Copyright Law of September 9, 1965, in its current version, and permission for use must always be obtained from Springer-Verlag. Violations are liable for prosecution under the German Copyright Law. © Springer-Verlag Berlin Heidelberg 1989, 1993 Printed in Germany Printing and binding: Druckhaus Beltz, HemsbachlBergstr. 2146/3140-543210 - Printed on acid-free paper

PREFACE In the three and a half years since the first edition to these notes was written there has been progress on a number of relevant topics. D. Preiss answered in the affirmative the decades old question of whether a Banach space with an equivalent Gateaux differentiable norm is a weak Asplund space, while R. Haydon constructed some very ingenious examples which show, among other things, that the converse to Preiss' theorem is false. S. Simons produced a startlingly simple proof of Rockafellar's maximal monotonicity theorem for subdifferentials of convex functions. G. Godefroy, R. Deville and V. Zizler proved an exciting new version of the Borwein-Preiss smooth variational principle. Other new contributions to the area have come from J. Borwein, S. Fitzpatrick, P. Kenderov, 1. Namioka, N. Ribarska, A. and M. E. Verona and the author. Some of the new material and substantial portions of the first edition were used in a one-quarter graduate course at the University of Washington in 1991 (leading to a number of corrections and improvements) and some of the new theorems were presented in the Rainwater Seminar. An obvious improvement is due to the fact that I learned to use '!EX. The task of converting the original MacWrite text to '!EXwas performed by Ms. Mary Sheetz, to whom I am extremely grateful. Robert R. Phelps February 6, 1992 Seattle, Washington

PREFACE TO THE FIRST EDITION These notes had their genesis in a widely distributed but unpublished set of notes Differentiability of convex functions on Banach spaces which I wrote in 1977-78 for a graduate course at University College London (UCL). Those notes were largely incorporated into J. Giles' 1982 Pitman Lecture Notes Convex analysis with application to differentiation of convex functions. In the course of doing so, he reorganized the material somewhat and took advantage of any simpler proofs available at that time. I have not hesitated to return the compliment by using a few of those improvements. At my invitation, R. Bourgin has also incorporated material from the UCL notes in his extremely comprehensive 1983 Springer Lecture Notes Geometric aspects of convex sets with the Radon-Nikodym property. The present notes do not overlap too greatly with theirs, partly because of a substantially changed emphasis and partly because I am able to use results or proofs that have come to light since 1983. Except for some subsequent revisions and modest additions, this material was covered in a graduate course at the University of Washington in Winter Quarter of 1988. The students in my class all had a good background in functional analysis, but there is not a great deal needed to read these notes, since they are largely self-contained; in particular, no background in convex functions is required. The main tool is the separation theorem (a.k.a. the Hahn-Banach theorem); like the standard advice given in mountaineering classes (concerning the all-important bowline for tying oneself into the end of the climbing rope), you should be able to employ it using only one hand while standing blindfolded in a cold shower. These notes have been influenced very considerably by frequent conversations with Isaac Namioka (who has an almost notorious instinct for simplifying proofs) as well as occasional conversations with Terry Rockafellar; I am grateful to them both. I am also grateful to Jon Borwein, Marian Fabian and Simon Fitzpatrick, each of whom sent me useful suggestions based on a preliminary verSIon. Robert R. Phelps October 5, 1988 Seattle, Washington

INTRODUCTION The study of the differentiability properties of convex functions on infinite dimensional spaces has continued on and off for over fifty years. There are a couple of obvious reasons for this. Aside from the intrinsic interest of investigating the many consequences implicit in something as simple as convexity, there is the satisfaction (for this author, at least) in discovering that a number of apparently disparate mathematical topics (extreme points - rather, strongly exposed points - of noncompact convex sets, monotone operators, perturbed optimization of real-valued functions, differentiability of vector-valued measures) are in fact closely intertwined, with differentiability of convex functions forming a common thread. Starting in Section 1 with the definition of convex functions and a fundamental differentiability property in the one-dimensional case [right-hand and left-hand derivatives always exist], we get quickly to the first infinite dimensional result, Mazur's intriguing 1933 theorem: A continuous convex function on a separable Banach space has a dense G6 set of points where it is Gateaux differentiable. In order to go beyond Mazur's theorem, some time is spent in studying the subdifferentialof a convex function f; this is a set-valued map from the space to its dual whose image at each point x consists of all plausible candidates for the derivative of f at x. [The function f is Gateaux differentiable precisely when the subdifferential is single-valued, and it is Frechet differentiable precisely when its subdifferential is single-valued and norm-tonorm continuous.] Since a subdifferential is a special case of a monotone operator, Section 2 starts with a detailed look at monotone operators. These objects are of independent origin, having been extensively studied in the sixties and early seventies by numerous mathematicians (with major contributions from H. Brezis, F. Browder and G. J. Minty) in connection with nonlinear partial differential equations and other aspects of nonlinear analysis. (See, for instance, [Bre] , [De], [Pa-Sb] or [Ze]). Also in the sixties, an in-depth study of monotone operators in fairly general spaces was carried out by R. T. Rockafellar, who established a number of fundamental properties, such as their local boundedness. He also gave an elegant characterization of those monotone operators which are the subdifferentials of convex functions. [The connection between monotone operators and derivatives of convex functions is readily apparent on the real line, since single-valued monotone operators coincide in that case with monotone nondecreasing functions, as do the right-hand derivatives of convex functions of one variable.] Mazur's theorem was revisited 30 years later by J. Lindenstrauss, who showed in 1963 that if a separable Banach space is assumed to be reflexive, then Mazur's conclusion about Gatea.ux differentiability could be strengthened to Frechet differentiability. In 1968, E. Asplund extended Mazur's theorem in two ways: He found more general spaces in which the same conclusion holds, and he studied a smaller class of spaces (now called Asplund spaces) in which Lindenstrauss' Frechet differentiability conclusion is valid. Asplund used an

VIII

ingenious combination of analytic and geometric techniques to prove some of the basic theorems in the subject. Roughly ten years later, P. Kenderov (as well as R. Robert and S. Fitzpatrick) proved some general continuity theorems for monotone operators which, when applied to subdifferentials, yield Asplund's results as special cases. In Section 2 we follow this approach, ipcorporating recent work by D. Preiss and L. Zajicek to obtain the major differentiability theorems. The results of Section 2 all involve continuous convex functions defined on open convex sets. For many applications, it is more suitable to consider lower semicontinuous convex functions, even those which are extended real valued (possibly equal to +00). (For instance, in many optirr.Lization problems one finds just such a function in the form of the supremum of an infinite family of affine continuous functions.) Lower semicontinuous convex functions also yield a natural way to translate results about closed convex sets into results about convex functions and vice versa. (For instance, the set of points on or above the graph of such a convex function - its epigraph - forms a closed convex set). In Section 3 one will find some classical results (various versions and extensions of the Bishop-Phelps theorems) which, among other things, guarantee that subdifferentials still exist for lower semicontinuous convex functions. A nonconvex version of this type of theorem is I. Ekeland's variational principle, which asserts that a lower semicontinuous function which nearly attains its minimum at a point x admits arbitrarily small perturbations (by translates of the norm) which do attain a minimum, at points near x. This result, while simple to state and prove, has been shown by Ekeland [Ek] to have an extraordinarily wide variety of applications, in areas such as optimization, mathematical programming, control theory, nonlinear semigroups and global analysis. In Section 4, we examine variational principles which use differentiable perturbations. The first such result was due to J. Borwein and D. Preiss; subsequently, this was recast in a different and somewhat simpler form by G. Godefroy, R. Deville and V. Zizler; we follow their approach. Some deep theorems about differentiability of convex functions fallout as fairly easy corollaries, and it is reasonable to expect future useful applications. This is followed by the generalization (to maximal monotone operators) of Preiss' theorem that Gateaux differentiability of the norm forces every continuous convex function to be generically Gateaux differentiable. Section 5 describes the duality between Asplund spaces and spaces with the Radon-Nikodym property (RNP). These are Banach spaces for whIch a Radon-Nikodym-type differentiation theorem is valid for vector measures with values in the space. Spaces with the RNP have an interesting history, starting in the late sixties with the introduction by M. Rieffel of a geometric property (dentability) which turne~ out to characterize the RNP and which has led to a number of other characterizations in terms of the extreme points (or strongly expos ed points) of bounded closed convex subsets of the space. A truly beautiful result in this area is the fact that a Banach space is an Asplund space if and only if its dual has the RNP. (Superb expositions of the RNP may be

IX

found in the books by J. Diestel and J. J. Uhl [Dt-U] and R. Bourgin [Bou].) In Section 5, the RNP is defined in terms of dent ability, and a number of basic results are obtained using more recent (and simpler) proofs than are used in the above monographs. One will also find there J. Bourgain's proof of C. Stegall's perturbed optimization theorem for semicontinuous functions on spaces with the RNP; this yields as a corollary the theorem that in such spaces every bounded closed convex set is the closed convex hull of its strongly exposed points. The notion of perturbed optimization has been moving closer to center stage, since it not only provides a more general format for stating previously known theorems, but also permits the formulation of more gelleral results. The idea is simple: One starts with a real-valued function f which is, say, lower semicontinuous and bounded below on a nice set, and then shows that there exist arbitrarily small perturbations 9 such that f + 9 attains a minimum on the set. The perturbations 9 might be restrictions of continuous linear functionals of small norm, or perhaps Lipschitz functions of small Lipschitz norm. Moreover, for really nice sets, the perturbed function attains a strong minimum: Every mimimizing sequence converges. The brief Section 6 is devoted to the class of Banach spaces in which every continuous convex function is Gateaux differentiable in a dense set of points (dropping the previous condition that the set need be a G6). Some evidence is presented that this is perhaps the "right" class to study. Even more general than monotone operators is a class of set valued maps (from a metric space, say, to a dual Banach space) which are upper semicontinuous and take on weak* compact convex values, the so-called useD maps. In Section 7, some interesting connections between monotone operators and usco maps are described, culminating in a topological proof of one of P. Kenderov's continuity theorems.

CONTENTS 1. Convex functions on real Banach spaces

1

Subdifferentials of continuous convex functions, Gateaux and Frechet differentiability, Mazur's theorem.

2. Monotone operators, subdifferentials and Asplund spaces

17

Upper semicontinuity of set-valued monotone operators, characterization of Gateaux and Frechet differentiability in terms of continuous selections, PreissZajicek generic continuity theorem for monotone operators into separable dual spaces, Asplund spaces and subspaces with separable duals, weak*-slices, subdifferentials of convex functions as maximal monotone operators, local boundedness of monotone operators, Kenderov's generic continuity theorems for maximal monotone operators, weakly compactly generated dual spaces and Asplund spaces.

3. Lower semicontinuous convex functions

38

Extended real-valued convex functions and their subdifferentials, support points of convex sets, minimal points with respect to special cones, Ekeland's variational principle, Brondsted-Rockafellar theorem, Bishop-Phelps theorems, maximal monotonicity of the subdifferential, maximal cyclically monotone operators are subdifferentials, subdifferentials of saddle functions.

4. Smooth variational principles, Asplund spaces, weak Asplund spaces

58

,a-differentiability, smooth bump functions and the Godefroy-Deville-Zizler variational principles, density of ,a-differentiability points, Borwein-Preiss smooth variational principle, Banach-Mazur game, generically single-valued monotone operators on Gateaux smooth spaces.

5. Asplund spaces, the RNP and perturbed optimization

79

Slices and weak* slices and dentability, RNP, infinite trees, E is an Asplund space means E' has the RNP, duality between weak* strongly exposed points and Frechet differentiability, perturbed optimization on RNP sets, bounded closed convex RNP sets are generated by their strongly exposed points, Ghoussoub-Maurey theorems.

6. Gateaux differentiability spaces

95

Gateaux differentiability spaces, equivalence with M-differentiability spaces, duality characterization in terms of weak* exposed points, stability results.

7. A generalization of monotone operators: Usco maps

102

Upper semicontinuous compact valued (usco) maps, maximal monotone operators are minimal usco maps, topological proof of Kenderov's generic continuity theorem, the Clarke subdifferential.

References

110

Index

115

Index of Symbols

117

Section 1

1. Convex functions on real Banach spaces. The letter E will always denote a real Banach space, D will be a nonempty open convex subset of E and f will be a convex function on D. That is, f: D ---t R satisfies

f(tx

+ (1- t)y] :::; tf(x) + (1- t)f(y)

whenever x, y E D and 0 < t < 1. IT equality always holds, f is said to be affine. A function f: D ---t R is said to be concave if - f is convex. We will be studying the differentiability properties of such functions, assuming, in the beginning, that they are continuous. (The important case of lower semicontinuous convex functions is considered in Sec. 3.) 1.1. Examples. (a) The norm function f(x) = IIxli is an obvious example. More generally, if C is a nonempty convex subset of E, then the distance function dc(x)

= inf{lIx -

is continuous and convex on D

yll:y E C},

x E E,

= E. (Note that dc(x) = Ilxll if C = {O}.)

(b) The supremum of any family of convex functions is convex on the set where it is finite. In particular, if A is a nonempty bounded subset of E, then the farthest distance function x -+ sup{lIx - yll: yEA} is continuous and convex on D = E. (c) The norm function is also generalized by sublinear functionals, that is, functions p: E -+ R which satisfy

p(x + y) :::; p(x) + p(y) and p(tx)

= tp(x) whenever t 2: O.

Obviously, the supremum of a finite family of linear functionals is sublinear. A sublinear functional p is continuous if and only if there exists M > 0 such that p(x) ::; Mllxll for all x. (d) The Minkowski gauge functional is another generalization of the norm function: Suppose that C is a convex subset of E, with 0 E int C. Define

pc(x)

=

inf{>.

> O:x

E >'C},

x E E.

The functional pc is sublinear and nonnegative. Moreover, pc(x) {>.x: >. 2: O} C C, and bdry C = {x:pc(x) = I}; in fact

= 0 if and only if

2

Section 1 int C

= {x:pc(x) < 1} C C C {x:pc(x) ~ 1} = C.

There exists M > 0 such that pc(x) ~ Mllxll for all x (take M = 1/r, where the ball of radius r centered at 0 is contained in C), hence pc is necessarily continuous. Conversely, any positive-homogeneous, subadditive, nonnegative and continuous functional p on E is of the form pc, simply take C = {x:p(x) ~ 1}. If p fails to be a seminorm, that is, if p( x) ~ p( -x), then C is not symmetric with respect to O. Conversely, if C is either open or closed and is not symmetric, then p is not a seminorm.

The following elementary lemma is fundamental to the study of differentiability of convex functions. Lemma 1.2. If Xo E D, then for each x E E the "right hand" directional derivative

d+ f(xo)(x)

=

lim f(xo t--+O+

+ tx) t

f(xo)

exists and defines a sublinear functional on E. Proof. Note that since D is open, f( xo + tx) is defined for sufficiently small t > O. Figure 1.1 below shows why d+ f(xo) exists; the difference quotient is nonincreasing as t ~ 0+ , and bounded below, by the corresponding difference

quotient from the left.

)co +tx Fig. 1.1 To prove this, we can assume that Xo by convexity

f(tx)

~

= 0 and f(xo) = o.

IT 0

< t < s, then

t (s-t) t - f(sx) + --f(O) = - f(sx), s s s

which proves monotonicity. Applying this to -x in place of x, we see that

-[f(xo - tx) - f(xo)]/t is nondecreasing as t ~ 0+. Moreover, by convexity again, for t

2f(xo) so that

~

f(xo - 2tx) + f(xo

-[f(xo - 2tx) - f(xo)] < [f(xo 2t -

>0

+ 2tx),

+ 2tx) -

f(xo)]

2t

which shows that the right side is bounded below and the left is bounded above. Thus, both limits exist; the left one is -d+ f( xo)( -x) and we obviously have

1. Convex functions on real Banach spaces.

3

-d+ I(xo)( -x) ~ d+ I(xo)(x). It is also obvious that d+ I( x) is positively homogeneous. To see that it is subadditive, use convexity again to show that for t > 0,

[/(x

+ t(u + v)) - I(x)] < I(x + 2tu) - I(x) -

t

+

2t

I(x

+ 2tv) - I(x) 2t

and take limits as t --+ 0+.

Definition 1.3. The convex function Xo E D provided the limit

df(xo)(x)

I

is said to be Gateaux differentiable at

= lim f(xo + tx) t--+O

t

f(xo)

exists for each x E E. The function dl( xo) is called the Gateaux derivative (or Gateaux differential) of I at xo. It is immediate from this definition (requiring the existence of a two-sided limit) that I is Gateaux differentiable at Xo if and only if -d+ I( xo)( -x) = d+ I(xo)(x) for each x E E. Since a sublinear functional p is linear if and only if pc-x) = -p(x) for all x, this shows that I is Gateaux differentiable at Xo if and only if x --+ d+ I(xo)(x) is linear in X; in particular, if this is true, then dl( xo) is a linear functional on E. 1.4. Examples. (a) If f is a linear functional on E (not necessarily continuous), then df(xo)(x) = f(x) for all Xo and x. For an example of a discontinuous linear functional on a normed linear space, let f(x) = x'(O), for x in the space of all polynomials on [-1,1] with

supremum norm. (It is easy to construct a sequence of polynomials Xn converging uniformly to 0 such that x~(O) = 1 for all n.) Thus, x ~ df(xo)(x) need not be continuous. (b) The norm IIxll = Elxn I in [1 is Gateaux differentiable precisely at those points x = (x n ) for which X n i= 0 for all n. In this case, the Gateaux differential is the bounded sequence (sgn x n ) E [00. The norm in [1(r) (r uncountable) is not Gateaux differentiable at any point.

°

Proof. If x E [1 and X n = for some n, let cn = (0,0, ... ,0,1,0, ...) be the sequence whose only nonzero term is a 1 in the n-th place. It follows that IIx+tcn llt-"xllt = Itl, so dividing both sides by t shows that the (two-sided) limit as t -+ 0 does not exist. [This observation shows how to prove the second assertion, since any element of [1(r) vanishes at all but a countable number of members of r.] Suppose, on the other hand, that for every n, X n f:. 0, that f > 0 and that y E [1. We can choose N > 0 such that En>NIYnl < f/2. For sufficiently small 6 > 0 we have sgn(x n Consequently,

+ tYn) = sgn

Xn if 1 ~ n ~ N,

ItI < C.

4

Section 1

x _ IlIx + tylht - II lh - LJYn sgn XnI < IE~=lt-l{lxn + tYnl-IXnl- tYn sgn xnll + 2En>NIYnl < € r'

provided It I < 6.

IT f is a continuous convex function which is Gateaux differentiable at a point, then its differential is in fact a continuous linear functional. This is a consequence of the following basic result. Notation. IT x E E and r > 0, the closed ball centered at x is denoted by B(x;r) = {y E E: IIx - yll ~ r}. 1.5.

Proposition 1.6. If the convex function f is continuous at XQ E D, then it is locally Lipschitzian at XQ, that is, there exist M > 0 and 8 > 0 such that

B(xQ; 8) C D and

If(x) - f(y)1

~

Mllx -

yll

whenever x, y E B(xQ; 8). Proof. Since f is continuous at Xo, it is locally bounded there; that is, there exist M 1 > 0 and 8 > 0 such that If I ~ M 1 on B(xQ; 28) C D. If x, y are distinct points of B(xQ; 8), let a = IIx - yll and let

z = y + (8/a)(y - x);

see Fig. 1.2 below.

Fig. 1.2 Note that z E B(xQ;28). Since y = [a/(a combination (lying in B(xQ; 28)), we have

fey) fey) - f(x)

~

~

[al(a

+ 6)]z + [6/(a + 8)]x

+ 6)]f(z) + [8/(a + 6)]f(x)

[a/(a + 8)]{f(z) - f(x)}

~

is a convex

so

(a/6) · 2M1 = (2M1 /8)lI x -

Interchanging x and y gives the desired result, with M

= 2M1 /8.

yll·

Remark. Note that we only used local boundedness of f; hence the latter property is equivalent to continuity for a convex function on an open convex

1. Convex functions on real Banach spaces.

5

set. In fact, it suffices that f be merely locally bounded above: If f ~ M on the ball B(xo;r), then for all x E B(xo;r) we have 2xo - x E B(xo;r) and hence f( 2x o - x) + f(x) < M + f(x) f( x o) < 2 2 '

+ 2If(xo)l; that is, If I ~ M + 2If(xo)1 on B(xo; r).

so - f(x) ~ M

Corollary 1.7. If the convex function f is continuous at Xo E D, then d+ f(xo) is a continuous sublinear functional on E, and hence df(xo) (when it exists) is a continuous linear functional.

Proof. Given Xo E D there exists a neighborhood B of Xo and M that, if x E E, then f(xo + tx) - f(xo) ~ Mtllxll

> 0 such

provided t > 0 is sufficiently small that Xo +tx E B. Thus, for all points x E E, we have d+ f(xo)(x) ~ Mllxll, which implies that d+ f(xo) is continuous. Proposition 1.8. The continuous convex function f is Gateaux differentiable at Xo E D if and only if there exists a unique functional x* in E* satisfying

(x*,x - xo)

~

f(x) - f(xo),

xED,

(1.1 )

or equivalently

(1.2)

Y E E.

Proof. We first show that (1.1) and (1.2) are equivalent. If x* satisfies (1.1), then for any y E E we have Xo + ty E D for sufficiently small t > 0 hence t(x*,y) = (x*,(xo + ty) - xo) ~ f(xo + ty) - f(xo) which implies that x* satisfies (1.2). Conversely, if x* satisfies (1.2) and xED, let y = x - Xo; then Xo + t(x - xo) E D if 0 < t ~ 1 so (x*,x - xo) ~ d+ f(xo)(x - xo) ~ c 1 (f(xo

+ t(x -

xo)) - f(xo)).

Setting t = 1 yields (1.1). If df(xo) exists, then df(xo)(x - xo) ~ f(x) - f(xo) as above, so df(xo) satisfies (1.1). Moreover, if x* satisfies (1.1), then it satisfies (1.2); linearity of d+ f(xo) = df(xo) implies that x* = df(xo). Conversely, suppose that x* is the unique element of E* satisfying (1.1), hence the unique element satisfying (1.2). We now apply the general fact that if a continuous sublinear functional p majorizes exactly one linear functional, then p is itself linear. Indeed, if p is not linear, then it dominates many linear functionals (see the sketch below); the proof is an easy consequence of the Hahn-Banach theorem: If -pc -x) < p(x), find p-dominated extensions of the linear functionals

fl(rx) = rp(x)

and

f2(rx)

= -rp( -x).

6

Section 1

Fig. 1.3 The functionals x· which satisfy (1.1) play an important role in the study of convex functions, so they are singled out for special attention.

Definition 1.9. IT f is a convex function defined on the convex set C and x E C, we define the subdifferential of f at x to be the set 8 f( x) of all x* E E* satisfying (x·,y - x) $ f(y) - f(x) for all y E C. Note that this is the same as saying that the affine function x* + a, where a = f(x) - (x*, x), is dominated by f and is equal to it at y = x, as indicated in the sketch.

Graph of

x Fig. 1.4 The Hahn-Banach argument we used above shows quickly that if f is continuous at Xo, then 8f(xo) is nonempty: d+ f(xo) is continuous and sublinear, so (as above) there exists x* such that (x*, y) ~ d+ f(xo)(Y) for all y E E. Using the fact that the right-hand difference quotients for d+ f(xo) are decreasing, replacing y by y - Xo and letting t = 1, we get

(x*, y - xo) $ ~ f(xo)(Y - xo) ~ f((y - xo) + xo) - f(xo) for all y E C. As we will see later, it is still possible to have 8f(xo) nonempty for certain convex f which are not continuous at Xo.

1. Convex functions on real Banach spaces.

7

1.10. Exercise. Prove that for any convex function f the set 8f(xo) (possibly empty!) is convex and weak* closed. (Note that a continuous convex f is Gateaux differentiable at Xo if and only if 8f(xo) is a singleton.)

Proposition 1.11. If the convex function f is continuous at Xo E D, then af(xo) is a nonempty, convex and weak* compact subset of E*. Moreover, the map x --+ af( x) is locally bounded at Xo, that is, there exist M > 0 and neighborhood U of Xo in D such that IIx* II :5 M whenever x E U and x* E af(x). Proof. The fact that af(xo) is nonempty, weak* closed and convex follows from the preceding remarks and Exercise 1.10. The fact that it is weak* compact will follow from Alaoglu's theorem, once we have shown the local boundedness property. Since, by Proposition 1.6, f is locally Lipschitzian at Xo, there exist M > 0 and a neighborhood U of Xo such that

If(y) - f(x)1 :51\Jlly -

xII

whenever x,y E U.

H x E U and x* E af(x), then for all y E U we have

(x*,y - x) :5 fey) - f(x):5 Mlly -

xII,

which implies that IIx* II :5 M.

Definition 1.12. Suppose that E and F are normed linear spaces, that U is a nonempty open subset of E and that 0, we can assume that p(x) = 1. Choose a subsequence {Xn(i)} of {x n } such that IXn(i)1 --+ 1. By taking a further subsequence we can assume that the Xn(i) have the same sign and, since p( x) = p( -x), it suffices to consider the case Xn(i) > 0 for all i. Define Yn = 0 if either n :F n( i) or n = n( i) with i odd, while Yn = 1 if n = n( i) with i even. Then

t- 1 [p(x

+ ty) -

p(x)] = 1 if t > 0, = 0 if - 1 < t < O.

Despite the foregoing example, there are nonseparable spaces in which the conclusion to Mazur's theorem remains valid, for instance, the class of weakly compactly generated (WCG) Banach spaces (to which we will return later). The attempt to characterize those spaces in which convex continuous functions are always generically differentiable has motivated the following terminology.

Definition 1.22. A Banach space E is said to be an Asplund space ( weak Asplund space) provided every continuous convex function defined on a nonempty open convex subset D of E is Frechet differentiable (Gateaux differentiable) at each point of some dense G 6 subset of D. Note that the term "weak" does not refer to the weak topology; it arose because Gateaux differentiability has sometimes been called "weak differentiability" (since it is weaker than Frechet - sometimes called "strong"- differentiability).

Some 30 years after Mazur's paper, J. Lindenstrauss [Li] obtained the next result of this nature, when he showed that if a separable Banach space was also reflexive, then the Gateaux differentiability in Mazur's theorem could be changed from Gateaux to Frechet. Five years later, E. Asplund renewed the study of such questions [Asp], making some impressive contributions. Mazur's theorem can be restated in the form "Separable Banach spaces are weak Asplund spaces" (while Example 1.21 shows that ROO is not a weak Asplund space and Example 1.14(a) shows that R1 is not an Asplund space). 'Ye will eventually prove, among other things, the fundamental result (due to Asplund) which states that if E* is separable, then E is an Asplund space; this clearly implies Lindenstrauss' theorem.

14

Section 1

For convex continuous functions f, it is possible (and quite useful) to characterize Frechet differentiability at x solely in terms of f, that is, without mentioning the linear functional f'(x). (This is possible since there are always candidates for the derivative lurking in 8f(x).)

Proposition 1.23. Suppose that f is continuous and convex on the nonempty open convex subset D of E. Then f is Frechet differentiable at xED if and only if for all e > 0 there exists 6 > 0 such that

f(x whenever

lIyll =

+ ty) + f(x - ty) - 2f(x) < te < t < 6.

1 and 0

Proof. H f'(x) exists, then for all e > 0 there exists 6

+ ty) -

f(x

(1.7)

> 0 such that

f(x) - (f'(x), ty) < (ej2)tllyll

whenever 0 < t < 6 and lIyll = 1. Writing this down with -y in place of y and adding both expressions yields (1.7). Conversely, suppose the stated condition holds. Choose x* E 8 f( x); it follows that for all y and -all sufficiently small t > 0 such that x ± ty ED,

(x*, ty) = (x*, (x

+ ty) -

x)

~

f(x

+ ty) -

f(x) and

-(x*, ty) 5: f(x - ty) - f(x).

(1.8)

(1.9)

> 0 there exists 6 > 0 such that (1.7) holds for any

By hypothesis, for any e

o < t < 6 and any y with lIyll =

1, that is,

f(x+ty)-f(x)-(x*,ty) S:te +f(x)-f(x-ty)-(x*,ty). Inequality (1.8) shows that the left side is greater or equal to 0 while (1.9) shows that the right side is less than or equal to te, which completes the proof.

1.24. Exercise. Prove that a continuous convex function f on the open convex set D is Gateaux differentiable at xED if and only if for each y E E lim

t

_

+ f(x

o

+ ty) + f(x t

ty) - 2f(x)

= o.

The previous proposition makes it easy to formulate and prove the following basic fact.

Proposition 1.25. Suppose that f is continuous and convex on the nonempty open convex subset D of E. Then the set G (possibly empty) of points xED where f is Frechet differentiable is a G 6. Proof. For each n let G n be the set of all those xED for which there exists 6 > 0 such that

1. Convex functions on real Banach spaces.

sup

f(x

+ 6y) + f(x

6

- 6y) - 2f(x)


0 and M > 0 such that If(u) - f(v)1 ~ Mllu - vII whenever u, v E B(x; 61 ). Since x E Gn , there exist 6 > 0 and r > 0 such that for all y with lIyll = 1 we have x ± 6y E D and

f(x

+ 6y) + f(x6 -

6y) - 2f(x)
0 such that every weak* slice of A has diameter greater than f; we will show that p is nowhere Frechet differentiable. Indeed, given any x E E, for each n ~ 1 the weak* slice S(x, A, f/3n) has diameter greater than f. It follows that there exist x~, y~ E Sex, A, f/3n) with IIx~ - y~1I > f, that is (x~,

and

(x~

-

y~,

xn ) >

f

x) > p( x) - f/3n,

for some

Xn

E E with

(y:,x) > p(x) - f/3n

IIxnll =

1. Thus

p[x + (l/n)xn ] + p[x - (l/n)xn ] - 2p(x) ~ ~ (x~,

x + (lin)x n ) + (y:, x - (lin)xn ) ~ (x~ + y:, x) - 2f/3n =

= (lln)(x~ - y:, xn )

-

2f /3n > fin - 2f/3n = f/3n.

IT we divide through by lin it becomes evident (using Proposition 1.23) that p is not Frechet differentiable at x. We can now characterize separable Asplund spaces. Theorem 2.19. A separable Banach space E is an Asplund space if and only if E* is separable. Proof. It was shown in Theorem 2.12 that if E* is separable, then E is an Asplund space. Suppose then, that E is a separable Asplund space. If E* is not separable, then its unit ball B* is not separable, hence there is an uncountable set A C B* and n ~ 1 such that IIx* -y*1I > lin whenever x*, y* are distinct points of A. (Look at maximall/n-nets in B*.) Since E is separable, B* (in its weak* topology) is compact and metrizable, hence satisfies the second axiom of countability. Thus, A has at most countably many points which are not weak* condensation points; we assume that these points have been deleted from A. Let S be any weak* slice of A. Since such a slice is weak* relatively open and since no point of A is weak* isolated, it must contain two distinct points of A, so its diameter is greater than lin, contradicting Lemma 2.18.

We next look at some additional properties which distinguish subdifferentials within the class of monotone operators, proving some basic results along the way.

26

Section 2

Definition 2.20. A set-valued map T: E monotone provided

L

--+

2E * is said to be n-cyclically

(Xk' Xk - Xk-l) ~ 0

l~k~n

whenever n ~ 2 and xo, Xl, X2, ... x n E E, Xn = Xo, and xk E T(Xk), k = 1,2,3, ... , n. We say that T is cyclically monotone if it is n-cyclically monotone for every n. Clearly, a 2-cyclically monotone operator is monotone.

2.21 Examples. (a) The linear map in R 2 defined by T(XI' X2) = (X2, -Xl) is positive, hence monotone, but it is not 3-cyclically monotone: Look at the points (1,1), (0,1) and (1,0). (b) Let f be a continuous convex function on an open convex set; then 8f is cyclically monotone. (As we will see, this is almost the only· example.)

Definition 2.22. A subset G of E x E* is said to be monotone provided (x*-y*,x-y) ~ 0 whenever (x,x*), (y,y*) E G. IfT:E ~ 2 E * is a monotone operator then its graph G(T)

= {(x, x*) E E x E*: x* E T(x)}

is a monotone set. A monotone set is said to be maximal monotone if it is maximal in the family of monotone subsets of E x E*, ordered by inclusion. We say that a monotone operator T is maximal monotone provided its graph is a maximal monotone set. There is an obvious one-to-one correspondence between monotone sets and monotone operators. An easy application of Zorn's lemma shows that every monotone operator T can be extended to a maximal monotone operator If , in the sense that G(T) C G(T).

2.23 Exercises. (a) Prove that a monotone operator T: E ~ 2E * is maximal monotone if and only if the following condition holds: Whenever y E E, y. E E· and

(y. - x·,y - x)

~

0 for all

X

E D(T),

x· E T(x)

then it necessarily follows that y. E T(y). (b) Prove that if T is maximal monotone, then T(x) is convex set, for every

x E E.

(c) Show that the operator in Example 2.21(a) is maximal monotone.

The next theorem is due to G. Minty. A much more general result has been proved by Rockafellar and will be presented in the next section (Theorem 3.24). We start with a simple and useful result.

2. Monotone operators, subdifferentials and Asplund spaces

27

Proposition 2.24. If f is convex on D and continuous at xED, then for all y E E, d+f(x)(y) = sup{(x*,y):x* E 8f(x)}

and this supremum is attained at some point x* E 8f(x). Proof. As shown in Prop. 1.8, if x* E 8f(x), then (x*,y) 5 d+ f(x)(y) for all y. On the other hand, by Corollary 1.7, d+ f(x) is a continuous sublinear functional, so for any y f 0 we can use the Hahn-Banach theorem to find x* E E* such that (x*,z) 5 d+f(x)(z) for all z E E (so x* E 8f(x) by Proposition 1.8) and (x*,y) = J+ f(x)(y), which completes the proof. Theorem 2.25. If f is continuous and convex on all of E, then its subdifferential map 8 f is maximal monotone.

Proof. By Exercise 2.23( a), to show that 8 f is maximal it suffices to show that whenever y E E and y* E E* are such that y* (j. 8f(y), then there exist x E E and x* E 8f(x) such that (y* - x*, y - x) < O. To simplify the proof we can replace f by the convex continuous function 9 defined by g(x) = f(x + y) - (y*,x). It is easily verified that one has x* E 8g(x) if and only if x* + y* E 8/(x + y). Thus, if y* (j. 8f(y), then 0 (j. 8g(0) and if there exist x* and x with x* E 8g(x) such that (x*,x) < 0, then for z = x + y and z* = x* + y* we have z* E 8f(z) and (y* - z*,'y - z) = (x*, x) < O. Assume, then, that y = 0 and y* = OJ we want to produce x E E and x* E 8f(x) such that (x*, x) < O. By Prop. 1.25, we know that 0 is not a global minimum for f, so there exists a point Xl E E such that f(O) > f(X1)' Consider the convex function h(t) = f(tx1)'O 5 t 5 1. Its right hand derivative at a point to E (0,1) is clearly equal to d+ f(toX1)(XI). Suppose this quantity were nonnegative for each such to; by one form of the mean value theorem (see, for instance, [Fl, p. 22]), this would imply that h(O) 5 h(1), a contradiction. Thus, it is necessarily negative for some 0 < to < 1, so by homogeneity (and letting x = toxI), we have d+ f(x)(x) < O. By Prop. 2.24 above, there must exist x* E 8f(x) such that (x*,x) = d+ f(x)(x) < 0, which completes the proof. 2.26 Example. In a Banach space E define f(x) = (1/2)lIxIl 2 and define J = 8f; the maximal monotone map J is called the duality mappingfor E. Explicitly,

= {x* E E*:(x*,x) = IIx*II'lIxll and Ilx*1I = IIxll}. Proof. It is readily computed that d+ f(x)(y) = IIxll . d+llxll(y). If x = 0, then d+ f(O)(y) = 0 for all y, hence is linear and therefore 8f(0) = {O}. Suppose, then, J(x)

that x f. O. We know (by Prop. 1.8) that x* E 8f(x) if and only if x* ::; d+ f(x), that is, if and only if IIxll-1x* ::; d+llxll, which is equivalent to y* == IIxll-1x* E 811xll, that is, if and only if (y*, y - x) ::; Ilyll-lIxli for all y E E. If, in this last inequality, we take y x + z, IIzl1 ::; 1 and apply the triangle inequality, we conclude that lIy*1I ::; 1. If we take y = 0, we conclude that IIxli ::; (y*,x) ::;lIy*II'lIxll, so lIy*1I = 1 and (y*, x) = IIxll, which is equivalent to what we want to prove. The converse

=

28

Section 2

is easy: If

Ily·11 = 1 and (y·,x) = Ilxll, lIyll-lIxll, so y. E 811 xll·

(y.,y - x) S

then for all y in E, we necessarily have

If the norm of E is Gateaux differentiable, then obviously J( x) contains only one element; by a slight abuse of notation, we denote it also by J(x).

We now know that subdifferential maps of continuous convex functions are maximal cyclically monotone; we will discuss later the fact that maximal cyclically monotone mappings are the subdifferentials of certain convex functions. These (and other) considerations require a fundamental property of monotone operators which was easy to prove for subdifferentials of continuous convex functions (in Prop. 1.11); namely, that they are locally bounded at points in the interior of their domain.

Definition 2.27. a) A monotone operator T: E - t 2E • is locally bounded at x E D(T) provided there exist M > 0 and [; > 0 such that Ily*11 S M whenever y E B(x; 8) n D(T) and y* E T(y). b) A subset (not necessarily convex) AcE which contains the origin is said to be absorbing if E = U{AA : A > O}. Equivalently, A is absorbing if for each x E E there exists t > 0 such that tx E A. A point x E A is called an absorbing point of A if the translate A - x is absorbing. It is obvious that any interior point of a set is an absorbing point. If Al is the union of the unit sphere and {O}, then Al is absorbing, even though it has empty interior.

Theorem 2.28. Suppose that T: E - t 2 E • is monotone and that x E D(T). If x E int D(T), or, more generally, if x is an absorbing point of D(T), then T is locally bounded at x.

Proof. By choosing any x· E T( x) and replacing T by the monotone operator y - t T(y + x) - x*, we lose no generality in assuming that x = 0 and that o E T(O). Thus, we want to show that, under these assumptions, T is locally bounded at O. Define, for x E E, f(x) = sup{(y*,x - y):y E D(T), lIyll S 1 and y* E T(y)} and let C = {x E E: f( x) S I}. As the supremum of affine continuous functions, f is convex and lower semicontinuous, and hence C is closed and convex. It also contains the origin: First, since 0 E T(O), we must have f 2:: O. Second, whenever y E D(T) and y* E T(y), monotonicity implies that 0 S (y* - 0, y - 0), so f(O) SO. We claim that the closed convex symmetric set A = C n (-C) is absorbing, hence, by a standard consequence of the Baire category theorem, is a neighborhood of the origin. It suffices to prove that C is absorbing, so suppose x E E. By hypothesis, D(T) is absorbing so there exists t > 0 such that T(tx) =I- 0. Choose any element u* E T(tx). If y E D(T) and y* E T(y), then by monotonicity

2. Monotone operators, subdifferentials and Asplund spaces

(y*, tx - y)

~

29

(u*, tx - y).

Consequently,

f(tx) Choose 0

~

sup{(u*,tx - y):y

E

D(T), lIyll

~

I} < (u*,tx) +

Ilu*11 < 00.

< A < 1 such that Af(tx) < 1. By convexity f(AtX)

~

Af(tx) + (1 - A)f(O) = Af(tx) < 1,

so Atx E C. Thus, A is a neighborhood of 0 and hence there exists 6 > 0 such that f(x) ~ 1 whenever IIxll ~ 26 . Equivalently, if !lxll ~ 28, then (y*,x) ~ (y*,y) + 1 whenever y E D(T), lIyll ~ 1 and y* E T(y). Thus, if y E B(O; 6) n D(T) and y* E T(y), then

2811y*1I = so

lIy*1I

sup{(y*,x) : II x ll ~

28}

~

Ily*II·llyll + 1 ~ 811y*1I + 1,

~ l/S.

Note that the foregoing result does not require that D(T) be convex. There are trivial examples which show that 0 can be an absorbing point of D(T) but not an interior point (for instance, let T be the restriction of the subdifferential of the norm to the set Al defined above). Even if D(T) is convex and T is maximal monotone, D(T) can have empty interior, as shown by the following example. (In this example, T is an unbounded linear operator, hence it is -not locally bounded at any point and therefore D(T) has no absorbing points.)

Example. In the Hilbert space [2 let D = {x = (x n ) E [2: (2 n x n ) E [2} and define Tx = (2 n x n ), xED. Then D(T) = D is a proper dense linear subspace of [2 and T is a positive operator, hence is monotone. We use Exercise 2.23 (a) to show that it is maximal monotone. Suppose, then, that y and y. are in [2 and that for all x E D(T)

o~

(y. - Tx, y - x) == (y., y) - (Tx, y) - (y., x) + (Tx, x);

(2.2)

we want to show that y E D(T) and that Ty = y•. Fix n ~ 1, m ~ 1 and a E R and let x = (Y1, Y2, . • ~ , Yn -1 , a, Yn+ 1 , ... , Yn+ m , 0, 0, ...). Since x E D(T), we can expand the right side of (2.2) and cancel a number of terms to obtain {"1n+m YkYk * - aYn* O« _ Y *,y) - a 271. Yn - Ll1

* + 271. a. 2 + YnYn

Letting m --+ 00 yields 0271. (Yn - a) ~ Y: (Yn - a) for all n ~ 1. Since this is true for arbitrary a E R, it follows that y~ = 2n Yn for each n. Since y* E [2 and y* == (2 1t Yn), we conclude that Y E D(T) and y. = T(y).

It is conceivable that for a maximal monotone T, any absorbing point of

D(T) is actually an interior point. A word of caution i; in order at this point. Our knowledge of the structure of the domain of monotone operators is incomplete, although some things are

30

Section 2

known about D(T) when T is maximal monotone. Rockafellar [Ro 2] gives two special conditions under which int D(T) =1= 0 for a maximal monotone T; in particular, this is the case if the convex hull of D(T) is assumed to have nonempty interior. Under this hypothesis, incidentally, the interior of D(T) is convex, and for any point x E D(T)\int D(T), the set T(x) is unbounded. (Indeed, by the separation theorem there exists y. =F 0 such that (y., x) ~ (y., u) for every point 1.t E D(T). Thus, for every A> 0 and. any x· E T(x), u· E T(u), we have (x· + Ay·) - u·,x - u) (x· - u·,x - u) + A(y·,x - u) ~ 0,

=

by monotonicity. By maximality, x·

+ Ay· E T(x) for every A> 0.)

2.29. Exercises. (a) Prove that if T is maximal monotone, then T(x) is weak* compact and convex, for all x E int D(T). (b) Prove that if T is maximal monotone, then it is norm-to-weak* upper semicontinuous on int D(T).

The following theorem is due to P. Kenderov [Ke2]. Note that by Lemma 2.18, the "small slices" hypothesis on E* is satisfied if E is an Asplund space. Theorem 2.30. Suppose that every nonempty bounded subset of the Banach space E* admits weak* slices of arbitrarily small diameter; then for every maximal monotone operator T: E --+ 2 E * there is a dense G6 subset G of W = int D(T) (which we assume nonempty) such that T is single-valued and norm-to-norm upper semicontinuous at each point of G. Proof. For each n ~ 1 let G n be the set of all x E W for which there is a neighborhood V of x in W such that diam T(V) < lin. Clearly, T is single-valued and norm-to-norm upper semicontinuous at each point of the intersection G = nG n • Since W is a Baire space, we need only show that each of the sets G n is open and dense in W. They are open by their definition, so it remains

to show that each G n is dense. Let x E Wand let U be any neighborhood

of x in W. From Theorem 2.28 we know that T is locally bounded in W, so without loss of generality we can assume that T(U) is a bounded subset of E*. By hypothesis, there exist z E E and 0 > 0 such that the weak* slice

S =. S(z,T(U),o) = {x* E T(U): (x*,z) >

o. We claim that T(xo) C S. Indeed, if y* E T(xo), then we have

o ~ (y* -

x*,xo -

Xl)

= r(y* - x*,z),

which implies that y* E S. Since {x* E E*: (x*,z) > l1T(U)(Z) - o} is weak* open and since (by Exercise 2.29(b)) T is norm-to-we~k* upper semicontinuous, there exists 6 > 0 such that B(xo; 6) C U and T(y) C S for any point

2. Monotone operators, subdifferentials and Asplund spaces

31

B(xo; 6); it follows that T[B(xo; 6)] has diameter less than lIn. This says that Xo E Gn n U, which completes the proof.

YE

We can now prove the converse to Lemma 2.18, obtaining a characterization of Asplund spaces. We first prove a local extension result for convex functions, proved originally by Asplund (although the proof given below is simpler).

Lemma 2.31. Suppose that I is continuous and convex on the open convex set DeE and that Xo E D. Then t3ere exists a neig'lborhood U of Xo in D and a convex Lipschitzian function I on E such that I = I in U. Proof. Given n

~

nil· ·11:

1, x E E, define

In

to be the "inf-convolution" of

I

and

fn(x) = inf{f(y) + nllx - yll: y ED}.

We need to know that f n > -00 (at least for all sufficiently large n). To this end, choose x* E 8f(xo). If n ~ IIx*lI, then for any x E E and y E D we have

I(y) - I(xo)

~

(x*,y - xo)

~

-nlly -

xoll

~

-nlly -

xll- nllx - xoll,

so fn(x) ~ f(xo) - nllx - xoll. Thus, we will assume below that n is large enough that In> -00. It follows easily from the definition that for Xl, X2 E E, o ~ t ~ 1 and any e > O,onehastln(xl)+(1-t)fn(X2) ~ fn(txI+(1-t)x2)-e, so fn is convex. Also, fn(x) ~ f(x) for all xED (take y = x in the definition). Moreover, given u, vEE and e > 0 we can choose y E D such that

fn(u) > f(y)

+ nllu - yll- e.

Since

fn(v)

~

I(y)

+ nllv - yll

we have

fn(v) - fn(u) < f(y)

+ nllv - yll- [f(y) + nllu - yll- €]

$ nllv -

ull + €,

which shows that fn(u) - fn(v) $ nllu - vII for all u, v. Interchanging u, v proves that f n has Lipschitz constant n. Since 8f is locally bounded, there exists a neighborhood U of Xo and an integer n > 0 such that 8f(U) is contained in the ball nB*. Suppose that x E U and choose any functional x* E 81(x); then IIx*1I ~ n and for all y E D

I(x)

~

which implies that I(x)

f(y) ~

+ (x*,x - y)

~

I(y) +nllx - yll,

fn(x); that is, I = In in U.

Theorem 2.32. A Banach space E is an Asplund space if and only if every nonempty bounded subset of E* admits weak* slices of arbitrarily small diameter.

32

Section 2

Proof. The necessity portion has already been proved in Lemma 2.18. Suppose, then, that I is continuous and convex on the nonempty open convex set D c E; we need only show that I' exists on a dense subset of D. Given Xo E D and a neighborhood U of Xo, we can assume that U is sufficiently small that the conclusion to Lemma 2.31 holds in U; that is, there exists a convex Lipschitz continuous function on E such that = I on U. By Theorems 2.25 and 2.30, there is a dense G 6 subset G of points in E at each of which is singlevalued and norm-to-norm upper semicontinuous, hence any selection for is norm-to-norm continuous. By Prop. 2.8, is Frechet differentiable at each such point. But this implies that I has points of differentiability in U.

1

1

aJ

1

a1

As we will see shortly, this characterization has a number of applications. For instance, it is not obvious that a closed subspace M of an Asplund space E is itself an Asplund space; even if we could extend a continuous convex function on an open convex subset of M to one on an open convex subset of E, the (dense) set of points of differentiability of the extension could fail to intersect the (nowhere dense) set M. The Asplund property i3 inherited by closed subspaces:, however, and Theorem 2.32 can be used to prove it. Proposition 2.33. A cl03ed 3ub3pace M of an A3plund 3pace E i3 it3elf an A3plund 3pace.

Proof. We need only show that any bounded nonempty subset A of M* E* / M.L admits weak* slices of arbitrarily small diameter. Without loss of generality, we can assume that A is weak* compact and convex. (Any weak* slice of the weak* closed convex hull of A is also a slice of A.) The quotient map Q: E* ~ M* is of norm one, onto and weak*-to-weak* continuous. Suppose that f > O. Let B* be the unit ball of E*; since Q is an open map, the set Q( B*) contains a neighborhood of the origin in M*. Since A is bounded, this implies that there exists A > 0 such that Q(AB*) = AQ(B*) :> A. Let C = AB* n Q-I(A); clearly, C is weak* compact, convex and Q(C) = A. By Zorn's lemma there exists a minimal (under inclusion) set C with these properties; let C I be such a minimal set. By hypothesis, there exists a weak* slice S == S(x,CI,a) of C I of diameter less that E; since S is relatively weak* open, the set Al = Q(CI \ S) is a weak* compact convex set which, by the minimality of C I , is properly contained in A. IT xi, xi E A \ AI, there exist E S such that Q(Yi) = xi and

yt, y;

Thus, diam(A \ AI) ~ E. By the separation theorem, there exists a weak* slice of A which misses AI, and hence has diameter at most E. By Theorem 2.32, again, we conclude that M is an Asplund space. Theorem 2.34. A Banach 3pace i3 an A3plund 3pace if and only if every 3eparable cl03ed 3ub3pace of E ha3 a 3eparable dual.

2. Monotone operators, subdifferentials and Asplund spaces

33

Proof. The sufficiency half of this theorem is precisely Corollary 2.15. The necessity follows from the previous proposition and Theorem 2.19.

Problem. It is a difficult open question whether a closed subspace of a weak Asplund space is itself a weak Asplund space. At this point we should state the obvious: The property of being an Asplund (or weak Asplund) space is preserved under linear isomorphisms; that is, replacing the norm in a Banach space by an equivalent norm will have no effect on the differentiability of functions on the space nor on a topological property like being a dense G 6. This fact has played an important role in the subject, since many results, starting with Asplund's own theorems, were obtained by showing first that particular classes of spaces admit norms which themselves have good differentiability properties, then using that fact to deduce differentiability for arbitrary continuous convex functions. The close connection between differentiable norms and Asplund spaces is illustrated by the following corollary to Theorem 2.32.

Corollary 2.35. If a Banach space E is not an Asplund space, then there exists an equivalent norm on E which is nowhere Frechet differentiable. Proof. If E is not an Asplund space, then by Theorem 2.32 there exists a bounded nonempty subset A of E* and E > 0 such that every weak* slice of A has diameter greater than E. We will use A to construct a bounded symmetric convex set U* with nonempty interior in E* such that every weak* closed slice of U* has diameter greater than E. From the proof of Lemma 2.18, it will follow that the continuous support function p for U* is nowhere Frechet differentiable. Since U* has nonempty interior, p defines an equivalent norm on ·E. To construct U*, note that both the convex hull C of A U -A and the sum U* = C + B*, where B* is the unit ball of E*, are bounded, symmetric and have the property that all their weak* slices have diameter greater than E. (This last assertion for U* makes use of the fact that O'C+B* = O'c + 0' B*.) The set U* has nonempty interior, since B* does.

For one of his Gateaux differentiability theorems, Asplund used the hypothesis that E could be given an equivalent norm which was itself Gateaux differentiable in a certain strong sense. We will approach this result through maximal monotone operators and another theorem of Kenderov.

Definition 2.36. (a) A norm on a Banach space E is said to be strictly convex (or rotund) provided there are no line segments in the unit sphere; equivalently, provided

IIxli =

1=

whenever 0


> (1 - f)Kllz - xii> O. Since z* E 8f(z), we have f(x) the proof.

2:: fez)

+ (z*,x -

z) > fez), which completes

Theorem 3.24 (Rockafellar). If f i3 a proper lower 3emicontinuous convex function on a Banach 3pace E, the it3 3ubdifferential 8f i3 a maximal monotone operator. Proof. Suppose that x E E, that x* E E* and that x* rf:. af(x). Thus,

oi

a(! -x*)(x), which implies that inf E (! -x*) < (! -x*)(x). By Lemma 3.23 there exists z E dome! - x*) == dome!) and z* E 8(! - x*)(z) such that (z* , z - x) < O. Thus, there exists y* E a f( z) such that z* = y* - x*, so that (y*-x*,z-x) f(xo) in E\B(xo; "1/2). By Cor. 4.9, there exists h E D p such that IIhlloo < 1/4 and 9 + ¢ + h attains its minimum at some point Xl. It follows that (g + 4> + h)(x) > f( xo) + 3/4 for x ¢ B(xo; "1/2) while (g + ¢ + h)(xo) < f(xo) + 1/4. Necessarily, then, Xl E B(xo; "l/2). Now, f = 9 in B( xo; "1/2) and 9 + (¢ + h) attains its minimum at Xl; by Prop. 4.12(c), f is ,B-subdifferentiable at Xl. Corollary 4.14. Suppose that E has the property (H p ), that C is a nonempty open convex subset of E and f: C --4 R is continuous and convex; then f is ,B-differentiable on a dense subset of C.

Proof. Choose an open ball B C C such that B C C. Let 9 = -fin B while = 00 in E\B, so 9 is proper and lower semicontinuous and hence there exists a dense subset of B at each point of which 9 is ,B-subdifferentiable. In particular, there exists X E B such that 9 is ,B-subdifferentiable at x. Since 9 is concave and continuous in B, by Prop. 4.12(b), 9 is ,B-differentiable at x, therefore f is ,B-differentiable at x. 9

This corollary does not say that the dense set of points of ,a-differentiability is a Gs, but as we have seen (Prop. 1.25), this is automatic for Frechet differentiability, hence we obtain the following corollary. Corollary 4.15. If a Banach space admits a Frechet differentiable bump function with bounded derivative, then it is an Asplund space.

This corollary is a slightly weaker form of the original theorem by 1. Ekeland and G. Lebourg [Ek-L], who proved it without assuming boundedness of the derivative. By Prop. 4.3, their result (or the one above) solved half of a long-standing open problem, whether the existence of an equivalent Frechet differentiable norm implies the Asplund property. The converse remained open for decades until R. Haydon's ingenious example [Hal]: There exists a scattered compact Hausdorff space X with the property that C(X) is an Asplund space but does not even admit an equivalent Gateaux differentiable norm. It had long been known that the converse is valid for separable E; more generally, Fabian [Fal] has shown that this is true if E is assumed to be WCG. (He assumes the formally weaker hypothesis that E be "weakly countably determined", but then shows that these are equivalent notions in Asplund spaces.) Corollary 4.16. If a Banach space E admits an equi.valent Gateaux differentiable norm (or, more generally, if it satisfies property (HG)), then for every

4. Smooth variational principles, Asplund spaces, weak Asplund spaces.

67

nonempty open convex 3ub3et D of E and continuoU3 convex function f on D there exi3t3 a den3e 3et of point3 in D where f i3 Gateaux differentiable.

This corollary came close to completely solving another long-standing problem: If a Banach 3pace admit3 an equivalent Gateaux differentiable norm, i3 it nece33arily a weak A3plund 3pace? (As was mentioned in Section 2, there exist continuous convex functions whose set G of points of Gateaux differentiability is dense but G does not contain a dense GD subset.) Spaces which satisfy the conclusion of the Corollary 4.16 are called "Gateaux Differentiability Spaces"; see Section 6. The fact that the existence of a Gateaux differentiable norm doe3 imply that the space is a weak Asplund space was proved much later by D. Preiss [P-P-N]; we will pursue this later in this section. Corollary 4.16 is sufficient to yield some negative results concerning smooth renormings.

Proposition 4.17. The 3pace3 foo, LOO[O, 1] and fI(F) (F uncountable) do not admit equivalent Gateaux differentiable norm3. Proof. We showed in Example 1.4(b) that the norm in fI(F) is nowhere Gateaux differentiable, and it is an interesting exercise (below) to prove the same fact about LOO[O, 1]. Also, in Example 1.21 it was shown that p(x) = lim sup Ixnl defines a continuous seminorm on foo which is nowhere Gateaux differentiable.

4.18 Exercises. (a) Prove that the essential supremum norm on the Banach space LOO [O,1] is nowhere Gateaux differentiable. (b) Prove that any separable Banach space admits an equivalent strictly convex (rotund) norm. (Hint: Use a dual version of the construction in Exercise 2.40(b).) (c) Show that there exists a strictly convex space whose dual norm is not smooth. (Hint: Look at the separable space [1 and Prop. 4.17).

We mentioned at the beginning of this section that Borwein and Preiss [Bor-P] were the first to establish a smooth variational principle. We will describe their result here and refer the reader to [Bor-P] or [Ph4 ] for the proofs.

Definition 4.19 Let of the form

e denote the family of all real-valued functions (J on E

where EJ-ln = 1, J-ln ~ 0 and {v n } converges in norm to some vEE. Thus, each (} is a possibly infinite convex combination of squares of translates of the norm, with the translates themselves converging. It can be shown that if the norm in E is ,8-differentiable (at nonzero points), then each function in is everywhere ,B-differentiable.

e

68

Section 4

Theorem 4.20 (Borwein-Preiss). Suppose that g: E ~ (-00,00] is lower semicontinuous and bounded below, that e > 0 and that ,\ > o. Assume further that

XQ

satisfies

(a) Then there exist 8 E

(b)

9

e and vEE such that

+ (2e/,\2)8

attains its minimum on E

at v

while

II X

(c)

Q -

vII

< ,\

and

(d)

< infEg + e.

g(v)

Moreover, if E has a f3-smooth norm, then 8 is f3-differentiable and

8pg( v)

(e)

n (2ej,\ )B* :I 0,

where B* is the dual unit ball.

Figure 4.1 below illustrates (a), (b), (c) and (d), where (b) is written as g(x) ~ g(v)

+ (2ej,\2)[8(v) -

8(x)]:

i nf 9 + s E······················· inf 9 E

9

••••••••• ~n ••••

g(v) + (2£/A)[e(V) - e(-)]

Fig. 4.1 One can apply this theorem to prove Cor. 4.13 (using the same arguments). While this result is more difficult to prove than Theorem 4.10, it has the advantage of giving (in conclusions (d) and (e)) additional information about the minimizing point of the perturbation of g. ' The Borwein-Preiss theorem was motivated by the following deep theorem due to Preiss [PrJ. The proof is long and not easy; we content ourselves with merely giving the statement. Theorem 4.21 (Preiss). Any locally Lipschitzian real-valued function on an Asplund space is Frechet differentiable at the points of a dense set.

4. Smooth variational principles, Asplund spaces, weak Asplund spaces.

69

We return now to the fact that the existence of an equivalent Gateaux differentiable norm on E implies that it is a weak Asplund space; more generally, it implies that maximal monotone operators are generically single-valued on E. The technique of proof (which, as we will see later, has application to yet more general results) is due to D. Preiss [P-P-N]. It also uses a topological result called the Banach-Mazur game, which we present next.

Definition 4.22. Suppose that X is a topological space and that S is any subset of X; we define a game (X, S) with players A and B, as follows. A play is a decreasing sequence UI :) VI :) U2 :) V2 :) ••• of nonempty open subsets of X which have been chosen alternately by A and B: Player A chooses U1 , B chooses VI, A chooses U2 , ••• etc. A strategy for B is a sequence IB = {In} of maps In where, for each n, In is defined for (U 1 , VI, U2 , lr2 , ••• , Un) (the first 2n - 1 elements of a play) and In (UI , Vi, U2 , V2 , ••• , Un) is a nonempty open subset of Un' A play is consistent with IB = {In} provided Vn = In(UI , VI, U2 , V2 , • •• , Un) for each n. We say that IB is a winning strategy for B if Vn C S for every play consistent with lB. (One can similarly define a winning strategy for A, but for our purposes, it is unnecessary.)

n

Recall that a set is said to be residual provided its complement is of first category. We only need the "only if" portion of the following theorem; for a proof of the other half and related results, see [Ox].

Theorem 4.23 (Banacll-Mazur). Suppose that S is a subset of the topological space X and that A and B are players of the game (X, S). There exists a winning strategy fdr B if and only if S is residual in X. Proof. Suppose there exists a winning strategy IB of order n we mean a nested sequence

= {In} for B. By an I-chain

of nonempty open sets such that Vi = li(U1 , VI, ... Ui), i = 1,2, ... , n. An I-chain of order n + k is a continuation of one of order n if the first 2n terms of each are the same. We partially order the I-chains by continuation. Among all I-chains or order 1, let F I be a maximal family with the property that any two distinct I-chains in F 1 have disjoint smallest elements. This exists, by a straightforward application of Zorn's lemma. Since B has a strategy, the union WI of all the VI'S belonging to I -chains in F I is dense in X. \Ve now proceed by induction: Suppose that a family F n of I-chains of order n has been defined so that the corresponding Vn's are pairwise disjoint and their union W n is dense in X. Among the I-chains of order n + 1 which are continuations of members of F n , use Zorn's lemma to produce a maximal family F n +l with the property that all the corresponding Vn + 1 's are pairwise disjoint. The union W n+1 of the Vn+I's is dense, by maximality and the existence of a strategy

70

Section 4

n

for B. Having found F n for all n, let W = W n . For each x E W, there exists a unique sequence {Cn} of I-chains such that Cn E F n and x is in the corresponding Vn , for all n. Now, these I-chains en are linearly ordered by continuation, so we have x E Vn • This sequence is consistent with the strategy IB by definition of I-chain; since B has a winning strategy, xES. Thus, we have shown that W == Wn C S and hence X\S C U(X\ W n ) is of first category.

n n

Our first application of the Banach-Mazur game is to a proof of the following basic theorem due to Asplund [Asp].

Theorem 4.24. If E i3 a weak A3plund 3pace and T: E --+ F i3 continuow, linear and onto the Banach 3pace F, then F i3 a weak A3plund 3pace. Proof. Suppose that D C F is open and convex and that I is a continuous convex function on D. Then D l = T-l(D) is open and convex in E and 11 = loT is continuous and convex on D l • By hypothesis, there exists a dense GfJ subset Gl C D l such that dll(x) exists for all x E Gl . Let G = T(G l ); then G is dense in D and dl(x) exists for all x E G. Indeed, since T is onto, its adjoint T* is one-one, and it is simple to see that T*(BI(Tx)) C B/l(X) for all x E D l . In general, there is no reason to assume that G is a GfJ set, but the fact that it contain3 a dense G fJ set follows from the following corollary to Theorem 4.23.

Lemma 4.25. Let M be a complete metric 3pace, X a Hau3dorff 3pace and I: M --+ X a continuou3, open 3urjective mapping. If {G n } i3 a 3equence of den3e open 3ub3et3 of M, then the image I( G) of G = G n i3 re3idual in X.

n

Proof. Let S = I(G) and suppose that A and B play the Banach-Mazur game (X, S). The following reasoning will show that B has a winning strategy. Suppose that A has chosen the nonempty open subset Ul of X. Since G l is dense and open in M, there exists an open metric ball B l = B( Xl, rl) such that o < rl < 1 and B l C l-l(Ul ) G l , so player B chooses VI = f(Bl ). Player A chooses some nonempty open subset U2 C VI. The set B 1 n/- l (U2 )nG2 is nonempty and open, so there exists B 2 = B(X2' r2) with 0 < r2 < 1/2 and B 2 C B l l- l (U2) G2; player B takes V2 = I(B 2). Continuing in this way, for any sequence Ul ::> Vi ::> U2 ::> V2 ::> ..• ::> Un player B lets Vn = I(B n ), where B n = B(xn,r n ) C B n- l , 0 < r n < lin and -B n C I -1 (Un)nG n. These choices define a winning strategy for B, that is, nVn c I(G). Indeed, by completeness of M, B n is a single point, say {xo}. If Vn , then for each n there exists Zn E B n such that y = I(zn) --+ I(xo) and {xo} = nBn c nG n = G hence y = I(xo) E I(G).

n

n

n

n

yEn

We next consider maximal monotone operators on Gateaux smooth Banach spaces.

4. Smooth variational principles, Asplund spaces, weak Asplund spaces.

71

Definition 4.26. Let E be a Banach space, T a maximal monotone operator on E and D = int D(T). Define

xED, y E E.

UT(X,y) = sup {(x*,y): x* E T(x)}, (We will usually write u(x, y) instead of UT(X, y).)

IT A is any subset of E* and y E E, we will, for simplicity, write {(A, y)} for the set of real numbers {(x*,y):x* E A}. We will also need a notion for monotone operators analogous to the situation when the directional derivative dl( x )(y) of I exists (at the point xED in the direction 0 t= y E E). Recall that this will be the case if and only if d+ I(x)( -y) = -d+ I(x)(y). From Prop. 2.24 we know that d+ I(x)(y) = sup{ (x*, y): x* E al(x)}, so the equality above is equivalent to sup {(al(x), y)} = in! {(af(x), y)}, that is, this set of real numbers is actually a singleton. This motivates the following definition.

Definition 4.27. For any y E E and T as above, let yT denote the set-valued mapping from E into the real line defined by

(yT)(x) = {(x*,y): x* E T(x)}. Our substitute for saying that df( x )(y) exists will be the assertion that yT( x)

is a singleton.

Definition 4.28. The following notation will be convenient: If A c E*, y E E and a is a real number, the assertion that (A, y) > a means that (x*, y) > a for each x* E A. Proposition 4.29. Let T and D be as above. Then (i) For each xED, the real-valued function y ~ UT(X, y) is subadditive and positive homogeneous and, for any A > 0, U~T(X, y) = AUT(X, y). (ii) For each x E D(T),

sup{U(x,y):

lIyll = I} = sup{u(x,y): lIyll ~ I} = sup{lI x *lI:

x* E T(x)}.

(iii) (yT)(x) is a singleton if and only if u(x, -y) = -u(x, y). (iv) If Xo ED, Y E E and (yT)( xo) is a singleton (say equal to {a}), then for all f > 0 there exists a neighbohood U of Xo in D such that (T(U), y) > a-

f.

(v) Fixing xED and y t= 0, letting I f(t) = u(x

= {t E R:

x + ty E D} and defining

+ ty, y),

t E I,

72

Section 4

the function J is monotone nondecreasing on I (and hence is continuous at all but countably many points of I). Moreover, if J is continuous at to E I, then (yT)(x +toY) is a singleton. Proof. (i) and (ii) are immediate from the definitions.

(iii) If (yT)(x) is a singleton, then so is (-yT)(x), wi~h (-yT)(x) = -u(x,y); that is, u(x,-y) = -u(x,y). On the other hand, if (yT)(x) is not a singleton, there exists x* E T(x) such that (x*,y) < u(x,y) and hence u(x,-y) ~ (x*,-y) ~ -u(x,y). (iv) Given f > 0, let W = {y* E E*: (y*,y) > a - f}. This is a weak* open set containing T(xo), so by the norm-to-weak* upper semicontinuity of T at xo, there exists an open neighborhood U of Xo in D such that T(x) C W

for all x E U, which was to be proved.

(v) Note that we are not asserting that the open set I is an interval, but this does not affect our argument. Suppose that t l , t2 E I with tl < t2; then for any xi E T(x + tiY), i = 1,2, we have (by monotonicity)

o ~ (xr hence (xi, y)

~

x;, (x

+ tly) - (x + t 2y))

= (t l

-

t2)(X~

- x;, y),

(X2, y). This shows that

J(t l ) == sup{(x*,y): x* E T(x +tly)}

~

inf{(x*,y): x* E T(x +t 2y)}; (4.5)

in particular, J(tl) ~ J(t2), so J is monotone. Suppose, now, that yT is not single-valued at x + toY. Then a

== inf {(x*,y): x*

E

T(x +·toY)} < J(t o).

Hence if t E I, t < to, then (by (4.5)), J(t) at to.

~

a

< J(t o), so J is not continuous

The relationship between Gateaux smoothness of the norm and ~ingle­ valuedness of T is exhibited by the following lemma. Lemma 4.30. Suppose that Xo E D and that Yo E E, IIYoll = 1, is such,that yoT is single-valued at Xo, with value a. If

sup{ UT( Xo, y): lIyll = I} ~ a, then

T(xo)

Ca·

all· ·1I(yo).

In particular, if the norm is Gateaux differentiable at Yo, then T( xo) singleton.

IS

a

Proof. Suppose that x* E T(xo); then from Prop. 4.29 (ii), we see that a ~ 0 and IIx*1I ~ a. Since (YoT)(xo) is a singleton, it follows that (x*,Yo) = a and so x* Ea· all· ·1I(yo).

4. Smooth variational principles, Asplund spaces, weak Asplund spaces.

73

Theorem 4.31. Suppose that E admits an equivalent Gateaux smooth norm, that T is a maximal monotone operator on E and that

D

==

int {x E E: T(x)

i= 0}

is nonempty. Then there exists a dense G8 subset G C D such that T( x) is a singleton for every x E G. Proof. Let PI = 11·11 denote an equivalent smooth norm on E. Choose sequences of positive numbers 1/2 > €1 > €2 > · · · and (31 > (32 > (33 > · .. such that

€k ~ 0,

E(3~

0 there exists x* E E* and 0: > 0 such that diam S(x*, A, a) < €. (c) IT A is a nonempty subset of E*, we say that it is weak*-dentable provided it admits weak*-slices S(x,A,a) of arbitrarily small diameter. Using this terminology, Theorem 2.31 can be reformulated to say that a Banach space E is an Asplund space if and only if every nonempty bounded subset of E* is weak* dentable.

Definition 5.2. A subset A of a Banach space E is said to have the RadonNikodym property (RNP) if every nonempty bounded subset of A is dentable.

80

Section 5

Since weak* dentable sets are obviously dentable, Theorem 2.31 implies that if E is an Asplund space, then E* has the RNP. One of the most striking results in this area is that the reverse implication is also valid. To that end, we need another definition.

Definition 5.3. An infinite tree in a Banach space E is a sequence {x n } such that Xn = (1/2)(x2n + X2n+t) for each n. An infinite tree such that IIx2n - x2n+tll 2:: 28 for some 8 > 0 and all n is called an infinite 8-tree.

Geometri c Pi cture X7

X

Schematic Picture

"'./ V

XV 4

X2

5

V

VV

XV X3

Fig. 5.1 Here is an obvious but important observation: An infinite 8-tree cannot be dentable (since every slice must have diameter at least 28). What we will do is use the characterization that E is an Asplund space (if and) only if every separable subspace of E has separable dual (Theorem 2.33) in order to produce a 8-tree in E*. We need some preliminary results. Lemma 5.4. Assume that E is a Banach space which contains a separable subspace F such that F* is nonseparable. Then there exists € > 0 and a nonempty subset A of the unit ball B* of E* such that every non~mpty relatively weak* open subset of A has diameter greater than €. Proof. Let B(F*) be the unit ball of the nonseparable space F*. By the proof of Theorem 2.19, there exist € > 0 and an uncountable subset At of B(F*) such that each point of At is a weak* condensation point of At and such that IIx* - y* II > € whenever x*, y* are distinct points of At. It follows that any nonempty relatively weak*-open subset of the weak* closure A 2 of Al contains at least two distinct points of AI. The restriction map R: E* --+ F* maps B* onto B(F*) and is weak*-to-weak* continuous. Let A be a minimal weak* compact subset of B* such that R(A) = A2 • Thus, if U is a relatively weak* open subset of A, then the image A3 under R of the weak* compact set A \ U is a proper closed subset of A2 • Since A2 \ A3 is a relatively weak* open subset of A2 it contains at least two distinct points of At and hence there exist distinct points x*, y* in U with II x* - y* II > €.

5. Asplund spaces, the RNP and perturbed optimization.

81

Lemma 5.5. Let {!{n} be a sequence of nonempty compact convex subsets of a topological vector space such that K 2n U K 2n+ I C K n for all n. Then there exists an infinite tree {x n } such that

Xn

E K n for all n.

Proof. Let K = II K n be the (compact) cartesian product consisting of all x = (x n ) such that X n E K n for each n. Define, for each n,

Each An is closed, hence compact, and a sequence {x n } corresponding to an element x E K will be an infinite tree if x E nAn, .that is, we need only show that this intersection is nonempty. By compactness, it suffices to prove that Al nA2 n·· ·nAk :I 0 for each k. To this end, fix k and define x in K as follows: If n > k, let X n be any element of !{n' From n = k to n = 1, use induction: Suppose that, for m = n + 1, n + 2, ... , a point x m E K m has been chosen, and define X n = (1/2)(x2n + X2n+I). Clearly, X n E K n since the latter is convex and K 2n U !{2n+I C !{n. The resulting element x is in Al n A2 n ... n Ak, so the proof is complete.

Proposition 5.6. Suppose that E is a Banach space and suppose that there exists a nonempty bounded set A C E* and an € > 0 such that diam U > € whenever U is a nonempty relatively weak* open subset of A. Then its weak* closed convex hull w* -cl coA contains an infinite €/2-tree. Proof. We construct a sequence {Un} of nonempty relatively weak* open subsets of A and a sequence {x n } in E such that

(a)

IIxnll =

1 for each n,

(b) U2n U U2n +1 C Un, n = 1,2,3, ... , and (c) x* E U2n and y* E U2n +1 imply that (x* - y*, Xn) ~ € for each n. First, let U1 = A. Suppose that, for some positive integer m, sets Uk have been defined for 1 :::; k < 2 m and points X n have been defined for each 1 ~ n < 2m - 1 so that properties (a), (b) and (c) are valid for all n such that 1 :::; n < 2m - I . Suppose k satisfies 2m - 1 :::; k < 2m • By hypothesis, we have diam Uk > €, hence there are functionals z~ and zi in Uk such that Ilz~ -zi II > €. Choose points Xk E E such that IIxkll = 1 and (z~ - zj, Xk) = € + 0 such that p(x) ::; M . Ilxll for all x. A nonnegative continuous sublinear functional is called a Minkowski functional (or gauge). It can be characterized in terms of its associate closed convex "ball" {x E E : p( x) ::; I}, but we will be more interested in its "dual ball" C(p) - or simply C - defined by

C(p)

= {x* E E: (x*,x) ::; p(x) for all x E E}.

This is easily seen to be weak* compact and convex. The following elementary facts are useful enough that we spell out the details. Lemma 5.10. If p is a Minkowski gauge, then x* E apex) if and only if x* E C and (x*,x) =p(x). Moreover, p = f7c, the support functional for the set C.

Proof. By definition, x* E apex) provided

(x* , y - x) ::; p(y) - p(x) for all y E E.

(5.1)

Assuming this, apply it to the point ry where r > 0 and y is fixed, divide by r and let r ~ 00 to get (x*, y) ::; p(y) for all y. Thus x* is in C. Moreover, setting y in (5.1) equal to 0 yields (x*,x) ~ p(x), so equality holds at x. Conversely, if x* ::; p and (x*,x) = p(x), then (5.1) is immediate. It is also easy to see that

5. Asplund spaces, the RNP and perturbed optimization.

85

p(x) = O"c(x) == sup{(x*,x) : x* E C}; indeed, if x E E, then (x*,x) ~ p(x) for all x* E C, so O"c(x) ~ p(x). Since p is continuous, there exists an element x* E 8p(x), so x* E C and

p(x) = (x *, x)

~

0" C ( X ).

The duality property in which we are interested is described by the following proposition. Proposition 5.11. Suppose that p is a Minkowski gauge on E. An element x* in C(p) is weak* strongly exposed by x E E if and only if p is Frechet differentiable at x, with p'(x) = x* .

Proof. Suppose that x* E C is weak* strongly exposed by x =1= 0, so that (x*, x) = O"c(x) = p(x). Since p is continuous, 8p(y) =1= 0 for all y in E. Let 0 there exists x E A such that x is not in co( A \ B( x; €)).

The next lemma gives a recipe for constructing nondentable sets; we will use it when we prove Theorem 5.15 by contradiction.

Lemma 5.17. Suppose that {An} is a sequence of (eventually) nonernpty subsets of a Banach space E with the following property: There exist constants > 0 and A > 0 such that for all x E coA n and y E E,



5. Asplund spaces, the RNP and perturbed optimization.

dist [x, co(A n+1 \ B(y; f))]

~

89

>../2 n .

Then the set

A = nnU{coAj:j

~

n}

is nonempty and not dentable. Proof. We will show that x E coCA \ B(x; €/2)) for each x E A. First however, we show that A is nonempty and that for each n ~ 1, (5.3) where B == B(O; 1). To this end, fix n sufficiently large that An is nonempty and suppose that XQ E co An. By hypothesis, there exists a point Xl E co A n +l such that IIxQ - xIII ~ 2>../2 n . Similarly, there exists a point n l X2 E co A n +2 such that IIxI - x211 ~ 2>../2 + . Continuing by induction, we obtain Xk E COAn+k, k = 0,1,2, ... such that Ellxk - xk+lll < 00. This implies that the series E(Xk - Xk+l) converges to an element Y E E, of norm at most 4>../2 n . By writing Y as a limit of the (collapsing) partial sums we get Y = XQ - Z where Z = limxk. It follows that Z E A (so A is nonempty) and XQ E A + (4)../2 n )B, which proves (5.3). Suppose that x E A. Fix m such that 4>../2 m < €/2. Since we have x E {Uco.4j:j ~ n} for all n, for each n ~ m there exist j ~ nand Yn E coAj such that IIx - Ynll ~ >../2 n . By hypothesis,

j n dist [Yn, CO(Aj+1 \ B(x; f))] ~ >../2 ~ >"/2 , so there exists Zn E co(A j+1 \ B(x; f)) such that llYn - znll write Zn as a finite convex combination

< 2>../2 n . We can

By (5.3), for each i we have Ui E A + (4)../2 j )B C A + (4)../2 n )B, so there exists Vi E A such that lIui - ViII ~ 4>../2 n . Let W n = E>"iVi; it follows that Ilzn - wnll ~ 4>../2 n and hence IIx - wnll ~ 7>../2 n . Since lIui - xII ~ € for each i, we have IIvi - xII ~ €- 4>../2 n ~ €- 4>../2 m > €/2, that is, W n E coCA \ B(x; €/2)). It follows that x E coCA \ B(x; €/2)) as required. The next two lemmas will allow us to reduce the proof of Theorem 5.13 to simply showing that there are arbitrarily small perturbations I + x* of I which define small slices. The first lemma shows that we can get decreasing families of slices provided we perturb the function I slightly; the proof is a straightforward verification from the definition. Since all our slices will be in the same set C, we will write S(I, a) in place of the usual S(I, C, a). Lemma 5.18. Suppose that the real-valued function I is bounded above on the nonempty subset C of the unit ball B of E. For any a > 0 we have S(I + x*, (3) c S(I, a) provided Ilx* II < a/2 and 0 < f3 < a - 211 x * II·

90

Section 5

Lemma 5.19. Let C be a nonempty bounded clolJed lJublJet of E and lJuppOlJe that for every upper-bounded upper lJemicontinuoulJ function! on C and every € > 0, there exilJt x* E E*, IIx* II < €, and 0 > 0 lJuch that diam S(! + x*, 0) ~ 2€. Then given € > 0 there exilJtlJ x* E E* lJuch that IIx* II < € and! + x* attainlJ a lJtrong maximum on C. Proof. We assume without loss of generality that C is contained in the unit ball of E and that 0 < € < 1. By hypothesis, there exist xi E E*, "xi" < €/2, and 0 < 01 < 1 such that diam Sl < €, where we write Sl = S(! + xi, 01). By applying the hypotheses to ! + xi and €1 = 01 . €/2 2 , we obtain Ilx; II < €1 so that diam S2 < 2€1, where S2 = S(! + xi + x;, 02) and 0 < 02 < 01· Continuing by induction with 00 = 1, we obtain sequences €n

> 0,

0
-00, so that t l is an interior point of I. "\Ve will use the characterization of Gateaux differentiability as given in Exercise 1.24. For any (y, 8) E E x R and all t > 0 sufficiently small, we have (Xl ±ty,tl ±ts) E B x I and

o~

f( Xl

h(XI + ty)

+ ty, t l + ts) + f( Xl

- ty, t l - ts) - 2f( Xl, t l ) ~

+ h(XI - ty) - 2h(xt) - [g(tt + t) + g(t l

-

t) - 2g(t 1 )].

Since both hand 9 are differentiable, if we divide through these inequalities by t > 0 and let t -+ 0, the terms on the right tend to zero, showing that f is differentiable at (Xl, t l ). Corollary 6.6. A Banach space E is a GDS if and only if it is an MDS.

98

Section 6

Proof. Let H .be a closed hyperplane in E, so that E is isomorphic to the product H x R. IT E is an MDS, then by Prop. 6.4, H is a GDS. By Prop. 6.5, the space H x R, that is, the space E, is a GDS.

6.7. Problems. Is a closed subspace of a GDS necessarily a GDS? Is the product of two Gateaux differentiability spaces necessarily a GDS? (It is obvious from Prop. 6.5 that, by induction, the second question has an affirmative answer for the product of a GDS with a finite dimensional space.)

The following permanence property for Gateaux differentiability spaces is similar to the corresponding one for weak Asplund spaces (Theorem 4.19) which required the operator T to be onto; the proof is essentially the same: .

Proposition 6.8. Suppose that T: E

--+ F is continuous and linear, with dense range. If the Banach space E is a GDS, then so is F.

Proof. Suppose that D is a nonempty open convex subset of F and that 1 is continuous and convex on D. The function 11 = loT is convex and continuous on the open convex set D 1 = T- 1 (D), hence is Gateaux differentiable at the points of a dense set G 1 C D 1 . To see that 1 is Gateaux differentiable at the points of the dense set T(G 1), it suffices to verify that T*[81(Tx)] C 811(x) for all x E D 1 and use the fact that density of T(E) implies that T* is one-one.

(It follows easily from the foregoing proposition that the first question in 6.7 has an affirmative answer for any complemented subspace M of a GDS space E; that is, for any M which is the image of a continuous linear projection on E.) The next proposition is competely analogous to Prop. 5.11.

Proposition 6.9. Suppose that p is a Minkowski gauge on E. An element x* in C(p) = {x* E E*: (x*,x) $ p(x) for all x E E} is weak* exposed by x E E if and only if p is Gateaux differentiable at x, with dp(x) = x* . Proof. Suppose that x* E C and that x weak* exposes C at x*; then 8p(x) = {x*}. Indeed, if y* E op(x), then by Lemma 5.10 we have y* E C and (y *, x) = p(x) = (1 C ( x), hence y* = x*. Conversely, suppose that p is Gateaux differentiable at x, with x* = dp(x). Then x* is in 8p(x), so by Lemma 5.10 again, x* E C and x attains its supremum on C at x*. Suppose there were another point y* E C such that (1 c( x) = (y*, x); then the other implication in Lemma 5.10 shows that y* E 8p(x), hence y* = x*, that is, x weak* exposes C at x*.

Recall the statement of Theorem 6.2: A Banach space E is a Gateaux differentiability space if and only if every weak* compact convex subset C of E* is the weak* closed convex hull of its weak* exposed points.

6. Gateaux differentiability spaces.

99

Proof of necessity in Theorem 6.2. The proof that the dual of a GDS has the indicated property is identical to the proof of the analogous portion of Theorem 5.12, except for the use of Prop. 6.9 in place of Prop. 5.11 and, of course, the substitution of Gateaux differentiability for Frechet differentiability. To prove the converse, we need the following dual version of a classical "parallel hyperplane lemma".

Lemma 6.10. Suppose that x, y E E, with I(x*, y) I ~ 1 whenever x* E E* satisfies

(x*,x) = 0 and then either

IIx*11

IIxll = 1 = lIyll,

and

E

>

o.

If

< 2/e,

IIx - yll ~ e or IIx + yll ~ e.

The following sketch shows the reason for the name; the hypotheses require that hyperplanes defined by the functionals x and y be nearly parallel.

y =1

y = -1 Fig. 6.2 Proof. Note that, as a (weak* continuous) linear functional on E* , y is bounded in absolute value by e/2 on the intersection of the dual ball with the subspace H = {x* : (x*, x) = O}. By the Hahn-Banach theorem, its restriction to H can be extended to a functional of norm at most E/2 defined on all of E*. Since this extension is necessarily weak* continuous, it is defined by an element z E E. Thus, y - z = 0 on H, so y - z = ax for some a E R. Note that

II zl1 ~ E/2. 11(1 - a)x - zll ~ 11 - al + IIzll ~ e. 11(1 + a)x + zll ~ 11 + al + II zll ~ E.

11 -Iall = IIly'lI-lly - zlll If a ~ 0, then If a < 0, then

IIx - yll = IIx + yll =

~

Proof of sufficiency in Theorem 6.2. By Cor. 6.6, it suffices to show that E is an MDS. To that end, suppose that p is a Minkowski gauge functional on E, so that, by Lemma 5.10, we can write p = (7c, where C == C(p) contains the origin. (This is the weak* compact convex set defined again in the statement of Prop. 6.9.) We lose no generality in assuming that O"c( x) ~ IIxll for all x E E. Since we want to show that (7C (or, more simply, 0") is Gateaux differentiable on a dense set and since it is clearly Gateaux differentiable at each point of the interior of the closed convex set where 0" vanishes, we need only consider the open set where (7 is strictly positive. Suppose, then, that x E E and 0"( x) > 0; we want to approximate x by a point y where 0" is

100

Section 6

Gateaux differentiable. Since u is positive homogeneous, we can assume that = 1. Given € > 0 with € < a( x), it suffices to produce y such that IIx - yll < € and y exposes C. Consider the weak* compact convex subset Cl of E* defined by Cl = co( C UN), where

a( x)

N

= {x* E E*:(x*,x) = 0 and Ilx*11 < 2/€}.

Fig. 6.3 Let Ul denote the support functional of C l . By hypothesis, Cl is the weak* closed convex hull of its weak* exposed points, so any weak* slice of Cl contains a weak* exposed point. In particular, then, if 0 < a == a( x) - €, then the open slice S(x,Cl,a) contains a weak* exposed point y* of Cl . Let y 1= 0 be a point of E which exposes Cl at y*; we may assume that lIyll = 1. Since y* is an extreme point of C l it is either in C or in N. The latter is impossible; since x vanishes on N we have al(x) = a(x) so (y*,x) > al(x) - a and the latter equals E. Thus, y actually exposes C at y*; since 0 E C and y* 1= 0, this implies in particular that (y*, y) > o.

Fig. 6.4 The sketch above suggests that the hyperplanes defined by x and yare nearly parallel (hence that y is close to x). To see that this is actually the case, note first that since (y*,z) ::; u(z) ::; IIzll for all z, we must have Ily*1I :::; 1, hence (y*, y) ::; 1. Thus, if x* ENe Cl , then

(x*, y) ~ al(Y) = (y*, y) $ 1; by symmetry of N this implies that I(x*, y) I ~ 1 for all x* EN. It follows from Lemma 6.10 that either IIx - yll ~ € or Ilx + yll ~ €. But the latter is impossible: lIy* II ~ 1 hence

IIx+yll ~ (y*,x+y) > (y*,x) >

E.

6. Gateaux differentiability spaces.

101

Remarks. Much of the material in this section originated in [La-Ph], with one major exception. Prop. 6.5 was shown to us by M. Fabian, who says the proof was motivated by work of A. loffe. (The function 9 introduced in the proof is what optimization specialists call a "penalty" function.) Fabian's result represents the first bit of affirmative progress in this topic in the last decade, although there have been a number of relevant striking examples (see [C-K], [Tah] and [TaI2]). We are grateful for his permission to include it in these notes. The names "Asplund space" and "weak Asplund space" were introduced in [Na-Ph]; Asplund called them "strong" and "weak differentiability spaces". In retrospect, both definitions could easily have dropped the G6 requirement; indeed, it wouldn't have changed anything in the Frechet case, and in the Gateaux case, Asplund's name would now be attached to what appears to be a more tractable class of spaces.

J. Borwein and S. Fitzpatrick [Bor-F 2] have sketched a way to unify the results of this section with some of those of Sec. 5. A key (and easily proved) observation is the following: An element x* of a weak* compact convex subset C C E* is weak* exposed by x E E provided O'c(x) = (x*,x) and x~ ---+ x* in the weak* topology whenever {x:} C C and (x:, x) ---+ (x*, x). Now, the weak* topology is precisely the topology (call it GO) of uniform convergence on elements of the Gateaux bornology (3 = G and the norm topology in E* is the topology FO of uniform convergence on the elements of the Frechet bornology (3 = F. One is thus led to a unified definition of a weak * (3-exposed point of such a set C: replace "weak* convergence" in the observation above by "convergence in the topology (30". One can also define a "(3-differentiability space" to be one in which every convex continuous function is (3-differentiable on a dense subset of its domain. This makes it possible to mimic the proofs of Theorems 5.12 and 6.2 to prove the general result that a Banach space E is a (3-differentiability space if and only if every weak* compact convex subset of E* is the weak* closed convex hull of its weak* (3-exposed points.

Section 7

7. A generalization of monotone operators: U seo maps. There has been much relatively recent work showing that Kenderov's results about generic continuity of maximal monotone operat9rs are special cases of theorems of a more topological nature. We will describe some of this, selecting the results most closely related to our primary interest in Banach spaces.

Definition 7.1. ( a) Suppose that X and Y are Hausdorff topological spaces and let F : X ~ 2 Y be a map from X into the nonempty compact subsets of Y. We say that F is usco (for upper semicontinuous and compact valued) provided it is also upper semicontinuous. (Recall that upper semicontinuity means that {x EX: F( x) C U} is open in X whenever U is open in Y.) (b) If 1'- is a Hausdorff linear space, then a usco map F: X ~ 2 Y is called convex useo provided F(x) is convex for each x E X. (c) We denote the graph of F by G(F) == {(x,y) E X x Y:y E F(x)}, and, as \vith monotone operators, we partially order these set-valued maps by the inclusion ordering on their graphs. We will usually write F I C F 2 in place of G(F1 ) C G(F2 ), and talk about one such map containing another. A usco map [convex usco map] is said to be minimal if it does not properly contain any other usco [convex usco] map. (d) If Y = E* is a dual Banach space, saying that a usco map F from X into 2Y is w*-usco means that we are taking Y to be E* in its weak* topology (so the values of Fare weak* compact sets). 7.2. Example. Suppose that E is a Banach space, that T: E -+ 2 E * is maximal monotone and that D is a nonempty open subset of D(T). Then the restriction Tv of T to D is convex w* -usco (as a set-valued map from D to E* in its weak* topology). Proof This is a consequence of Exercise 2.29.

The kind of theorem we will consider has the form: For certain minimal usco maps F from a Baire space X into 2Y , there exists a dense GfJ subset D of ~y such that F is single-valued and continuous (for an appropriate topology in }'") at each point of D. (This is not too surprising, since we would expect minimal set-valued maps to be single-valued at many points.) In view of the

7. A generalization of monotone operators: Usco maps.

103

relationship between of Asplund spaces E and certain continuity properties of maximal monotone operators defined in E, it is also not surprising that such results would be of interest. We need some elementary facts about usco maps. Proposition 7.3. (aJ If F:X - 2 Y is usco, then its graph G(F) is closed in

XxY.

(bJ For every usco map [convex usco map] F there exists a minimal useo map [minimal convex usco map] contained in F. Proof. (a) Suppose that (x o ,Yo) is a net in G(F) which converges to a point (x, y) E X x Y, but that y is not in F( x). Since the latter is compact there exists an open neighborhood U of F( x) whose closure U does not contain y. By upper semicontinuity, (Yo) is eventually in U, and hence y is in U, a contradiction. (b) This will follow from Zorn's lemma provided we can show that any decreasing chain {Fo } of [convex] usco maps contained in F has a minimal element. For x E X define Fo(x) = nFo(x)j by compactness, this is nonempty, [convex] and compact. To see that Fo is upper semicontinuous, suppose that x E X, U is open in Y and Fo(x) C U. Necessarily, Fo(x) C U for some OJ indeed, if each Fo(x) \ U were nonempty, the intersection of these compact nested sets would be a nonempty subset of F o( x) \ U. By upper semicontinuity of F o , there exists an open set V containing x such that Fo(V) C Fo(V) C U. Example 7.2 showed that maximal monotone operators are convex usco maps; the following example shows that they need not be minimal usco maps.

7.4 Example. Define monotone operators Fo , F I and F 2 from R to 2R by

Fo(x) while

Fo(O)

= FI(x) = F2 (x) =

sgn x if x

i= 0,

= {-I}, FI(O) = {-I, I} and F2 (0) = [-1,1].

Then Fo C F I C F2 , they are all distinct, F2 is maximal monotone and minimal convex usco, F I is minimal usco (hence has closed graph) and Fo does not have closed graph, although Fo(x) is closed for each x. Proof Since F2 = 8f, where f(x) = Ixl is continuous and convex, F2 is maximal monotone. The remaining assertions are readily verified. The following sufficient condition for a map to be usco is quite useful. Proposition 7.5. Suppose that F l , F z are set-valued maps from X into Y such that G(Fz ) is closed, Fz C F l and F I is useo. Then Fz is useo. Proof. The fact that G( F z ) is closed implies that F z (x) is closed for each x EX. Since G(Fz ) C G(Fl ) implies that Fz(x) C Fl(x), each Fz(x) is necessarily compact. It remains to prove that F z is upper semicontinuous. Suppose that x E X but that F z is not upper semicontinuous at x. Then there exist an

104

Section 7

open set U C Y containing F2 (x), a net (x a ) C X such that X a ---+ x, and points Ya E F2 (x a ) \ U for each a. Since F I is upper semicontinuous at x, the net (Ya) is eventually in every open neighborhood of FI(x). This implies that some cluster point Y of (Ya) is in the compact set FI(x). (Indeed, if not, then for every Y E F 2 ( x) there would exist an open neighborhood Vy of y such that (Ya) is eventually in Y \ Vy. By covering FI(x) with such neighborhoods and using compactness, we could find an open set W containing F I (x) such that (Ya) is eventually outside W.) Take a subnet of (x a , Ya) converging to (x,y); since G(F2 ) is closed, we conclude that Y E F 2 (x) C U. On the other hand, Ya E Y \ U for all a, so Y E Y \ U, a contradiction which completes the proof. We want to extend the notion of "maximal monotone" in the following way:

Definition 7.6. IT T: E ---+ 2 E • is monotone and DeE, we say that T is maximal monotone in D provided the monotone set G(T)

n (D

x E*) == {(x, x*) E D x E*: xED and x* E T(x)}

is maximal (under set inclusion) in the family of all monotone sets contained in D x E*. We will see shortly (Cor. 7.8, below) that if T is a maximal monotone operator in E, then its restriction to any open subset D of D(T) is maximal monotone in D.

Lemma 7.7. Suppose that DeE is open, that T: D ---+ 2E • is monotone and norm-to-weak* upper semicontinous and that T( x) is nonempty, convex and weak* closed for all xED; then T is maximal monotone in D.

Proof. As with the usual notion of maximal monotone, we need only show that if (y, y*) E D x E* satisfies (y* - x*,y - x)

~

0 for all x E D,x* E T(x),

(7.1)

then y* E F(y). If not, by the separation theorem there exists z E E such that F(y) C {z* E E*: (z*,z) < (y*,z)} = W. Since W is weak* open and T is norm-to-weak* upper semicontinuous, there exists a neighborhood U of y in D such that T(U) C W. For t > 0 sufficiently small, y + tz E U and therefore T(y + tz) C W. Applying (7.1) to any u* E T(y + tz) we get

o~

(y* - u*, y - (y + tz)) = -t(y* - u*, z),

which implies that (u*, z)

~

(y*, z), that is, u* is not in W, a contradiction.

Corollary 7.8. If T: E ---+ 2E • is maximal monotone and D is a nonempty open subset of D(T), then the restriction T D of T to D is maximal monotone in D.

7. A generalization of monotone operators: Useo maps.

105

Proof. By Example 7.2, the maximal monotonicty of T implies that the monotone map TD is convex w*-usco, so the result follows from Lemma 7.7. The next theorem exhibits an interesting relationship between maximal monotone operators in an open set D and minimal usco maps in D. Theorem 7.9. If D is open in the Banach space E and if T: D _ 2E * has nonempty values and is maximal monotone in D, then T is a minimal convex w* -usco map in D.

Proof. We know by Example 7.2 that T is convex and w* -usco, so the only question is whether it is minimal in this family. Suppose that F: D _ 2E * is convex and w*-usco and that G(F) C G(T). By Lemma 7.7, F is maximal monotone and hence F = T. It is easy to see that not every minimal convex w* -usco is monotone; consider any continuous non-monotonic function from R into itself. The application of these notions (usco maps, minimal usco maps, etc.) to a generic continuity theorem for monotone operators requires several further lemmas.

Lemma 7.10. Let X, Y be Hausdorff spaces and suppose that F: X -+ 2 Y has closed graph and is locally relatively compact, that is, every x E X has an open neighborhood V M£ch that F(V) is relatively compact in Y. Then F is a usco map.

Proof. Since G( F) is closed, every image F( x) is closed and contained in some relatively compact set, hence is compact. To see that F is upper semicontinuous, it suffices to show that for each x in X and each open neighborhood V of x, the restriction of F to V is upper semicontinuous. By hypothesis, we may restrict attention to open neighborhoods V of x for which F(V) is compact. Given such a V, define F I : V - 2 Y by FI(x) = F(V), x E V. Obviously, F I is a usco map and Flv C F I ; since G(Flv) is closed in V x Y, Proposition 7.5 implies that F is upper semicontinuous in V. Lemma 7.11. If D is open in E and T: D - 2E * is monotone, with Iiil nonempty for each xED, then the map T whose graph is the closure G(T) in D x (E*,w*) ofG(T) is monotone and w*-usco.

Proof. Let T I :D - 2E * be a maximal monotone extension of T. By Theorem 7.9, T I is a minimal convex w*-usco map in D, and by Prop. 7.3 its graph G(TI ) is closed in D x (E*,w*). Thus, T c T I so T is monotone and-by Prop. 7.5-it is also a w* -usco map. The next theorem shows how the concept of minimal usco maps can be used to prove a basic result about monotone operators.

106

Section 7

Lemma 7.12. Suppose that X is a Hausdorff space and that T: X ~ 2E * is w*-usco. 'For x E X define coT(x) to be the weak* closed convex hull ofT(x). Then the map coT is convex w* -usco. Proof. Since coT obviously has weak* compact convex values, it suffices to prove that it is weak* upper semicontinuous. To see this, suppose x E X and that U is a weak* open subset of E* with coT(x) c U. In any locally convex space, a compact convex set I{ has a neighborhood base of the form I{ + IV, where the closed convex sets W form a neighborhood· base of o. Thus, we can assume that U is of the form U = coT(x) + W, where W is a weak* closed convex neighborhood of o. By the upper semicontinuity of T there exists an open neighborhood V of x in X such that T(V) C coT(x)

+ W.

It follows that coT(V) C coT(x) + IV, so coT is upper semicontinuous.

Theorem 7.13. Suppose that D is an open subset of the Banach space E and that T: D ~ 2E * is a monotone operator, with T( x) =1= 0 for all x in D. Then there is a unique maximal monotone operator M in D containing T. In fact, M can be characterized as follows: Let T be the set-valued map whose graph is the closure in D x (E*,w*) of G(T) and for each xED let M(x) be the weak* closed convex hull ofT(x); this defines M.

Remark. As Example 7.4 shows, one must distinguish between sets of the form T(x) and the (possibly smaller) sets T(x). Proof. Let TI be any maximal monotone operator containing T. By Theorem

7.9, T1 is a minimal convex w*-usco map. Since it has closed graph, we must

have T C TI , and since T has closed graph, Prop. 7.5 implies that it is w*-usco. From Lemma 7.12, we conclude that the map M = coT is convex w*-usco and, clearly, M C T I . By the minimality of T I , we have M = TI , which proves the uniqueness assertion. There does not appear to be a unique extension theorem for monotone operators with arbitrary effective domains. IT, for instance, E has dimension at least one, if (xo, Yo) E E x E* and if T is defined to be the monotone operator whose graph is {( Xo, Yo)}, then there are many maximal monotone extensions of T. The next lemma, which is purely topological in nature, has Kenderov's Theorem 2.30 on continuity of maximal monotone mappings as an immediate corollary. The main hypothesis will seem less peculiar when we apply the lemma to maximal monotone operators.

Lemma 7.14. Let F be a minimal usco map on the Baire space X with compact values in the Hausdorff space (Y, T) and let d be a metric on Y. Suppose that for every nonempty open subset U of X there exist nonempty open

7. A generalization of monotone operators: Useo maps.

107

subsets V of U such that F(V) contains relatively open subsets of arbitrarily small d-diameter. Then there exists a dense G6 subset D of X such that F is single-valued and d-upper semicontinuous at each point of D. Proof. We first note that the fact that F is a minimal usco map implies that if J is a proper closed subset of G(F), then p(J) =1= X, where p is the natural projection of X x Y onto X. Indeed, if p( J) = X, then J would be the (closed) graph of a set-valued map which, by Prop. 7.5, would be a usco map properly contained in F. Next, given e > 0, let

Of: = U{G: G is an open subset of X and d - diam F(G)

~

e}.

Clearly, Of: is open in X; we will show that it is dense. Let U be a nonempty open subset of X. By hypothesis, there is a nonempty open subset V of U and a r-open subset W of Y such that F(V) n W =1= 0 and d-diam (F(V) n W) ~ e. Since G(F) n (V - W) =1= 0 and F is minimal (and, by Prop. 7.3 (a), G(F) is closed), we must have p[G(F) \ (V x W)] =1= X. Choose Xo E X \ p[G(F) \ (V x W)]. Then p-I(xO) n G(F) c V x ~ that is, Xo E V and F(xo) C W. If G = {x E X: F(x) C W} n V, then G is an open neighborhood of Xo with d-diam F(G) ~ d-diam(F(V) n W) ~ e. It follows that G C Of: and therefore 0 =1= V n Of: c U n Of:. This proves that Of: is dense in X. Now let D = n{OI/n:n = 1,2,3, ... }.

Since X is a Baire space, D is a dense G 6 subset of ..Y". From the definition of D it is evident that not only is F( x) a singleton at each point xED but that F is d-upper semicontinuous at each such point. This lemma leads to the following alternative proof of Theorem 2.30.

Tlleorem 7.15. Let E be a Banach space such that every bounded nonempty subset of E* is weak* dentable. If T: E ~ 2E * is maximal monotone, with X == int D(T) =1= 0, then there exists a dense G6 subset D of X such that T is 3ingle-valued and norm-to-norm upper 3emicontinuo'lt3 at each point of D. Proof. We know from Example 7.2, Cor. 7.8 and Theorem 7.9 that T is a minimal convex w*-usco map from X into (E*, w*); by Prop. 7.3 it contains a minimal w* -usco map T I . By Theorem 2.28, T is locally bounded in X, hence the same is true of T I . Consequently, given any nonempty open subset U of X there exists a nonempty open subset V of U such that T 1 (V) is bounded. By the weak* dentability hypothesis, T 1 (V) admits nonempty relatively weak* open neighborhoods of arbitrarily small norm diameter, so by Lemma 7.14 there exists a dense G6 subset D of ~Y' such that T1 is single-valued and normto-norm upper semicontinuous at each point of D. Now, by Lemma 7.12, coT} is convex w*-usco, and hence by the minimality of Tlx, we have coT} = TI.x.

108

Section 7

Thus, for x E X, if T 1 (x) is contained in a closed ball, then so is T(x). It follows that T is also single-valued and norm-to-norm upper semicontinuous at each xED. The following exercise exhibits a particular convex w*-usco map which is of considerable importance in optimization.

7.16. Exercises: The Clarke subdifferential. Suppose that f is a locally Lipschitzian real-valued function on a nonempty open subset D (not necessarily convex) of the Banach space E. For each xED the generalized directional derivative fO (x; h) of f at x in the direction h E E is defined by

r(x;h)=

limsup (t,y)-(o+ ,x)

f(y+th)-f(Y). t

(a) Show that fO(x; h) is finite and that for fixed x, the function h is positive homogeneous, subadditive and Lipschitzian. (b) Show that if {x n } C D and

Xn

~

---+

fO(x; h)

xED, then for each h E E,

lim supfO(x n ; h)

~

fO(x; h).

The Clarke subdifferential 8° f( x) of f at xED is defined by 8° f(x) = {x· E E·: (x·, h) ~ fO(x; h)

for all h E E}.

(c) Show that, for all x E E, the set 8° f(x) is a nonempty convex weak* compact subset of E·. (d) Show that the graph of 8° f is norm X weak· closed in D that 8° f is a convex weak· usco map whenever D = E.

X

E· and hence

If f is a continuous convex function, then it is locally Lipschitzian, hence one can define both its generalized directional derivative and Clarke subdifferential as above. The following proposition reassures us that these coincide with the usual notions of directional derivative and subdifferential in this case.

Proposition 7.16. If f is convex and continuous on a nonempty open convex subset D of E (hence is locally Lipschitzian), then for all xED and h E E one has

and hence 8° f(x) = 8f(x). Proof. It is immediate from the definitions that fO( x; h) ~ d+ f( x)( h). To prove the reverse inequality, note that for any fixed 6 > 0,

r(x;h) = lim f-O+

sup 11J,-:z:1I