Optimization: A Theory of Necessary Conditions 0691081417

This book presents a comprehensive treatment of necessary conditions for general optimization problems.

English Pages [437] Year 1976

Lucien W. Neustadt

OPTIMIZATION A Theory of Necessary Conditions

Princeton University Press

1976

Copyright © 1976 by Princeton University Press
Published by Princeton University Press, Princeton, New Jersey
In the United Kingdom: Princeton University Press, Guildford, Surrey
All Rights Reserved
Library of Congress Cataloging in Publication Data will be found on the last printed page of this book
Printed in the United States of America by Princeton University Press, Princeton, New Jersey

to Hilda and Adoph Neustadt

PREFACE

THE PURPOSE OF this book is to give a comprehensive development of necessary conditions for optimization problems. This is done in the context of a general theory for extremal problems in a topological vector space setting. A brief summary of the mathematical background necessary to understand the material in the text is presented in Chapter I. On the assumption that the reader is familiar with the fundamentals of real analysis and basic elements of measure theory and integration, pertinent definitions and results needed for the subsequent analysis in the linear topological space setting are given. Included also in Chapter I is a rather complete discussion of various types of differentials for functions from a linear (vector) space into a topological vector space. In Chapter II is found a generalized Lagrange multiplier rule for abstract optimization problems with a finite number of equality and inequality constraints. It is shown that application of this multiplier rule to a particular class of optimization problems defined in terms of operator equations in a Banach space yields a maximum principle which solutions of the problems must satisfy. Sufficiency of these conditions is discussed under certain convexity hypotheses on the problem data. Chapter III is devoted to a development of an extremal theory that leads to a generalization of the multiplier rule given in Chapter II. These generalizations involve a weakening of the hypotheses on the underlying set on which the optimization is carried out and a relaxation on the allowable constraints to permit a considerably more general type of "inequality" constraint.
In Chapter IV the fundamental multiplier rules developed earlier are used to treat the general optimization problem: Given a family $\mathcal{W}$ of continuously differentiable operators $T: A \to \mathcal{X}$, where $A$ is an open subset of the Banach space $\mathcal{X}$, choose $x \in A$ satisfying (i) $Tx = x$ for some $T \in \mathcal{W}$, (ii) certain equality and generalized "inequality" constraints, and which is in some sense optimal. The formulation here is such that not only are the usual necessary conditions for restricted phase coordinate optimal control problems with ordinary differential equation restrictions obtained as special cases (the subject matter of Chapter V), but many other general optimal control problems can also be easily treated as special cases. This is discussed in Chapter VI, where results are given for control problems with parameters and control problems with mixed control-phase inequality constraints. In Chapter VII necessary conditions using the framework of Chapter IV are obtained for control problems governed by such diverse systems as functional differential equations (differential-difference equations being a special case), Volterra integral equations, and difference equations. An appendix contains fundamental results (existence, continuation, uniqueness, continuous dependence) for equations defined in terms of the Volterra-type operators used in the formulation of certain of the problems discussed in Chapter IV. A concluding chapter (Notes and Historical Comments) comprises an extensive literature survey in which the development of necessary conditions and sufficient conditions in modern optimization theory is outlined and comments are made on the relationship between the differing approaches of various contributors to the literature.

We in the Division of Applied Mathematics at Brown University were fortunate to have Lucien Neustadt spend the 1971-1972 academic year on sabbatical leave with us. During that period much of the material in the present book was presented by Lucien in a year-long advanced seminar. The main text of the book (Chapters I-VII and the Appendix) was finished by the summer of 1972 but at the time of Lucien's death in October 1972 the question of a publisher was still unsettled. In addition, work on the Notes and Historical Comments chapter was still in a rough draft and incomplete form. In the winter of 1972-1973 Joe LaSalle and I undertook the responsibility for having the manuscript reviewed by several publishers and then made recommendations to the executor of Lucien's estate concerning choice of a publisher. Due to a number of unfortunate delays, it was not until the spring of 1974 that legal matters were finally cleared up and agreement for publication by Princeton University Press was completed.
I agreed to undertake the task of completing the work on the Notes and Historical Comments chapter and putting the entire chapter into a polished form. In addition I accepted the usual author's responsibility concerning copy editing, reading of proofs, etc. The present text of Chapters I-VII and the Appendix is unchanged from Lucien's final manuscript except for minor corrections.

The referencing system chosen by Lucien has been maintained throughout the text, the appendix, and the historical comments. It is easily understood. Chapters are denoted by Roman numerals and are each divided into a number of sections. These sections in turn consist of collections of numbered paragraphs and formulae. In references the number (II.4.13) refers to paragraph 13 of Section 4 of Chapter II while (I.7.13, 16, and 19) refers to paragraphs 13, 16, and 19 of Section 7 of Chapter I. Chapter numbers are omitted when referring to entities within the same chapter. For example the reference to (2.14) within Chapter II refers to paragraph 14 of Section 2 of that chapter. The same procedure is followed in referencing paragraphs within a given section (i.e., the reference to (14) in a section of a chapter refers to paragraph (14) of that section of that chapter).

Much of the material of the chapter Notes and Historical Comments came either directly from Lucien's rough draft material or from my own summaries of the references on which Lucien had intended to comment (as indicated by his incomplete rough draft bibliography). But both Lucien's Notes and Historical Comments rough draft and his bibliography were unfinished at the time of his death and I have tried to complete the bibliography (along with relevant paragraphs in appropriate places in the Notes and Historical Comments chapter) in the way in which I judge, to the best of my ability, Lucien would have done so had he lived to complete the task. However, I accept all responsibility for any errors of commission or omission with respect to the bibliography and the final chapter Notes and Historical Comments.

Lucien undoubtedly would have wished to express his appreciation to a number of people for their helpful comments during the lengthy period he worked on this book. It is impossible for me to guess the names of all such individuals and I will therefore mention by name only those who are due thanks (on the behalf of both Lucien and myself) for help since I became involved with completing and publishing of the manuscript. These include Professors N. Nahi (who helped in locating the rough draft notes for the bibliographical section among Lucien's effects), J. A. Burns (who during the summer of 1974 helped with some of the literature search necessary to complete the Notes and Historical Comments chapter) and J. P. LaSalle. Special appreciation is due S. Spinacci and K. Avery for their excellent typing both before and after Lucien's death. Finally I wish to express my personal appreciation to John W. Hannon of Princeton University Press for his patience and help with publication details that were without doubt more difficult than they would have been had Lucien been alive.

I am certain that all of us feel that the present book more than justifies our own meagre efforts and feel privileged to have helped in some small way in the completion of a book that is indeed a lasting memorial to a man who contributed so much to the intellectual and spiritual lives of all who were associated with him.

H. T. Banks
Providence, Rhode Island
May, 1975

SUMMARY OF NOTATION

$B(\cdot,\cdot)$ (I.5.1)
Ch (IV.5.2)
Ch (IV.5.15)
co $A$ (I.1.27)
$\overline{\mathrm{co}}\,A$ (I.4.24)
co co $A$ (I.1.29)
$\overline{\mathrm{co\,co}}\,A$ (I.4.24)
cone $A$ (I.1.28)
$\overline{\mathrm{cone}}\,A$ (I.4.24)
[…]

I.2 FUNCTIONS BETWEEN VECTOR SPACES

[…] $B$ and $g: B \to C$ (or if $g: f(A) \to C$), then the function from $A$ into $C$ which assigns to each $x \in A$ the element $g(f(x))$ in $C$ [or, as we shall write in the future for such situations, the function $x \to g(f(x))$] will be called the composite of $g$ and $f$, and will be denoted by $g \circ f$.

If $A_1, \dots, A_m$ are arbitrary sets, then the set of all $m$-tuples $(x_1, \dots, x_m)$ such that $x_i \in A_i$ for each $i = 1, \dots, m$ will be called the direct product of $A_1, \dots, A_m$, and will be denoted by $A_1 \times \cdots \times A_m$. If $A_i = A$ (for some set $A$) for each $i = 1, \dots, m$, then we shall also write $A^m$ for this direct product. If $f_i$, for each $i = 1, \dots, m$, is a function from some set $A_0$ into $A_i$, then the function $x \to (f_1(x), \dots, f_m(x))$ from $A_0$ into $A_1 \times \cdots \times A_m$ will be denoted by $(f_1, \dots, f_m)$.

Functions from a subset of a linear vector space into a linear vector space will occasionally be referred to as operators.

Throughout the remainder of this section, $\mathcal{Y}$, $\mathcal{Z}$, $\mathcal{Z}_1, \dots, \mathcal{Z}_m$ will denote linear vector spaces. A function $f: \mathcal{Y} \to \mathcal{Z}$ will be said to be linear if $f(\alpha y_1 + \beta y_2) = \alpha f(y_1) + \beta f(y_2)$ for all $y_1, y_2 \in \mathcal{Y}$ and $\alpha, \beta \in R$. Linear functions from one linear vector space into another will often be referred to as linear operators. Note that $f(0) = 0$ whenever $f$ is a linear operator. If $A \subset \mathcal{Y}$, a function $f: A \to \mathcal{Z}$ will be said to be linear if $f$ is the restriction to $A$ of a linear function from $\mathcal{Y}$ into $\mathcal{Z}$. In this case, we shall sometimes permit ourselves the abuse of notation of using the same symbol to denote both the function on $A$ and its (linear) extension to $\mathcal{Y}$ (even though there may be infinitely many such extensions).

A function $f$ from $\mathcal{Y}$ (or from a set $A$ […] $f$ is $W$-convex, then the function $f_1$ defined on the set $y_0 + A$ through the relation $f_1(y) = f(y - y_0) + \zeta$ is also $W$-convex for every $y_0 \in$ […] $W_2$, then every $W_2$-convex function is also $W_1$-convex. In particular, if $W = \mathcal{Z}$, then evidently every function $f: A \to \mathcal{Z}$ is $W$-convex. An affine function $f$: […] is clearly $W$-convex for any pointed, convex cone $W$ […] such that $f$ is the restriction of $\tilde{f}$ to $A$. If also $f(0) = 0$, then $f$ is linear.

Proof. Let $f_1: A \to \mathcal{Z}$ be defined by $f_1(y) = f(y) - f(0)$ for all $y \in A$, so that $f_1$ is also $\{0\}$-convex [see (35)] and $f_1(0) = 0$. We shall construct, as is evidently sufficient for our proof, a linear function $\tilde{f}_1: \mathcal{Y} \to \mathcal{Z}$ such that $f_1$ is the restriction of $\tilde{f}_1$ to $A$.

If $y \in A$ and $0 \le \alpha \le 1$, then $f_1(\alpha y) = f_1(\alpha y + (1 - \alpha)0) = \alpha f_1(y) + (1 - \alpha) f_1(0) = \alpha f_1(y)$. If $y \in A$ and $\alpha > 1$, then $f_1(y) = f_1(\alpha^{-1}(\alpha y)) = \alpha^{-1} f_1(\alpha y)$ (by what we have just shown), so that $f_1(\alpha y) = \alpha f_1(y)$ for all $\alpha \ge 0$ and all $y \in A$. If $y_1, y_2 \in A$ and $\alpha \ge 0$, $\beta \ge 0$, and $\alpha + \beta > 0$, then

$$f_1(\alpha y_1 + \beta y_2) = (\alpha + \beta)\, f_1\!\left(\frac{\alpha}{\alpha + \beta}\, y_1 + \frac{\beta}{\alpha + \beta}\, y_2\right) = \alpha f_1(y_1) + \beta f_1(y_2).$$

Since $A$ is a convex cone, the span of $A$, as is easily seen, is $A - A$. If $y \in \operatorname{span} A$, so that $y = y_1 - y_2$ with $y_1, y_2 \in A$, we set $\tilde{f}_1(y) = f_1(y_1) - f_1(y_2)$. It is then straightforward to verify that $\tilde{f}_1$ is well-defined and linear on $\operatorname{span} A$ and that $f_1(y) = \tilde{f}_1(y)$ for all $y \in A$. It only remains to extend $\tilde{f}_1$ to […]

I.3 TOPOLOGICAL SPACES

[…] $X'$ and $g: A' \to X''$ be functions such that $f(A) \subset A'$. Then, if $f$ is continuous at a point $x_0 \in A$ and the restriction of $g$ to $f(A)$ is continuous at $f(x_0)$, $g \circ f$ is continuous at $x_0$. In particular, if $f$ is continuous and $g$ is continuous on $f(A)$, then $g \circ f$ is continuous.

28

If $X$ and $X'$ are topological spaces, $A \subset X$, and $A' \subset X'$, then $A$ and $A'$ will be said to be homeomorphic if there is a continuous, one-to-one function $f$ from $A$ onto $A'$ such that $f^{-1}$ is also continuous. In this case, $f$ will be said to be a homeomorphism of $A$ onto $A'$.
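As a concrete illustration of this definition, the following minimal sketch (not taken from the text) checks numerically that the hyperbolic tangent maps R one-to-one onto the open interval (-1, 1) and that its inverse undoes it, which is consistent with R and (-1, 1) being homeomorphic in the usual topology. The names `f`, `f_inv`, and the sample points are assumptions chosen only for this example, and a finite numerical check illustrates, but does not prove, continuity of either map.

```python
import math


def f(x: float) -> float:
    """Continuous, strictly increasing map of R onto the open interval (-1, 1)."""
    return math.tanh(x)


def f_inv(y: float) -> float:
    """Inverse map from (-1, 1) back onto R; it is continuous as well."""
    return math.atanh(y)


# Hypothetical sample points, listed in increasing order.
samples = [-3.0, -0.5, 0.0, 0.25, 2.0]

# Images under f stay inside (-1, 1) and keep the order of the samples,
# which reflects that f is one-to-one.
images = [f(x) for x in samples]

# Applying f and then f_inv recovers each sample up to rounding error.
round_trip_errors = [abs(f_inv(f(x)) - x) for x in samples]
```

Here `images` is strictly increasing and every entry of `round_trip_errors` is at rounding-error level.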

29

A topological space $X$ will be said to be Hausdorff (or separated) if for every pair of distinct points $x_1$ and $x_2$ in $X$, there are neighborhoods $U_1$ of $x_1$ and $U_2$ of $x_2$ such that $U_1$ and $U_2$ are disjoint, i.e., have no points in common. The following lemma may easily be verified.

LEMMA. In a Hausdorff space, every finite set is closed as well as compact, and every compact set is closed.

COROLLARY. In a Hausdorff space, the intersection of compact sets is compact, and each subset of a compact set is conditionally compact.

Proof. Corollary (31) follows at once from Lemma (30), (4), and (15).

LEMMA. Let $X$ and $X'$ be topological spaces, with $X'$ Hausdorff, let $A$ be a compact set in $X$, and let $f: A \to X'$ be a continuous, one-to-one function. Then $f$ is a homeomorphism from $A$ onto $f(A)$.

Proof. By Theorem (25), it is sufficient to prove that, for every closed set $C$ […]

[…] $f_i(x) \to x_i$ as $x \to x_0$ (respectively, as $x \to x_0$ and $x \in A_1$) for each $i = 1, \dots, m$ if and only if $(f_1, \dots, f_m)(x) \to (x_1, \dots, x_m)$ as $x \to x_0$ (respectively, as $x \to x_0$ and $x \in A_1$). A similar assertion may be made for uniform limits.

If $X_0$, $x_0$, $A$, $A_1$, and $f_1, \dots, f_m$ are as in (37), then $(f_1, \dots, f_m)$ is continuous (respectively, continuous on $A_1$, continuous at $x_0$) if and only if $f_1, \dots, f_m$ are all continuous (respectively, continuous on $A_1$, continuous at $x_0$).

As usual, if $X_1 = \cdots = X_m = X$, then we shall write $X^m$ for $X_1 \times \cdots \times X_m$.

40

As an example, let us consider the usual topology on $R$. Namely, as a base for the topology on $R$ let us take all sets of the form $\{\lambda: |\lambda - \lambda_0| < \varepsilon\}$ where $\lambda_0 \in R$ and $\varepsilon \in R_+$. It is easy to see that in this way we obtain a Hausdorff topology. In this topology, all sets of the form $\{\lambda: \lambda_0 < \lambda < \lambda_1\}$, where $\lambda_0, \lambda_1 \in R$, are clearly open and will be denoted by $(\lambda_0, \lambda_1)$; all sets of the form $\{\lambda: \lambda_0 \le \lambda \le \lambda_1\}$, where $\lambda_0, \lambda_1 \in R$ (we do not exclude $\lambda_0 = \lambda_1$), are clearly closed and will be denoted by $[\lambda_0, \lambda_1]$. The sets $[\lambda_0, \lambda_1)$ and $(\lambda_0, \lambda_1]$ are defined in an analogous manner (and are neither open nor closed). The sets of the form $\{\lambda: \lambda > \lambda_0\}$, where $\lambda_0 \in R$, are open and will be denoted by $(\lambda_0, \infty)$. The sets $[\lambda_0, \infty)$, $(-\infty, \lambda_0)$, and $(-\infty, \lambda_0]$ are defined in an analogous manner. By an interval we shall mean any subset of $R$ of the just-described types.

Since $R^n$ is the same as $R \times R \times \cdots \times R$ ($n$ times), we may define the product topology on $R^n$, with the just-indicated topology for $R$. It is not hard to see that, for any $\xi_0 \in R^n$, we may then take as base at $\xi_0$ the family of all sets of the form $\{\xi: |\xi - \xi_0| < \alpha\}$ where $\alpha > 0$. This topology is known as the Euclidean topology for $R^n$ and is, by (36), Hausdorff. Henceforth, $R^n$ will always be considered to be a topological space (with the Euclidean topology). It is easily seen that $R^n_+$ and $R^n_-$ are open in $R^n$, and that $\overline{R}^n_+$ and $\overline{R}^n_-$ are the closures of $R^n_+$ and $R^n_-$, respectively.

The following results are of importance for the spaces $R^n$ ($n = 1, 2, \dots$).

42

The function $(\xi, \eta) \to \xi + \eta$ from $R^n \times R^n$ into $R^n$ is continuous, i.e., vector addition is continuous in $R^n$.

43

The function $(\alpha, \xi) \to \alpha\xi$ from $R \times R^n$ into $R^n$ is continuous, i.e., scalar multiplication is continuous in $R^n$.

44

A set in $R^n$ is open if and only if it is finitely open [see (1.37)].

45

A set in $R^n$ is compact if and only if it is closed and bounded. (A set $A$ in $R^n$ is bounded if there is an $\alpha_0 \in R_+$ such that $|\xi| < \alpha_0$ for all $\xi \in A$.) This result is known as the Heine-Borel Theorem [7, Theorem 10.10, p. 53]. A set in $R^n$ is conditionally compact if and only if it is bounded. (This follows because the closure of a bounded set in $R^n$ is bounded.)

It follows from (45) and Theorem (26) that if $C$ is a compact subset of a topological space $X$, and $f$ is a continuous map from $A$ into $R$, where $X \supset A \supset C$, then there are points $x_1$ and $x_2$ in $C$ such that $f(x_1) = \inf\{f(x): x \in C\}$ and $f(x_2) = \sup\{f(x): x \in C\}$, so that $f$ is bounded from above as well as from below on $C$, and attains both its infimum and its supremum on $C$.

4. Linear Topological Spaces

1

A linear topological space is a linear vector space $\mathcal{Y}$ together with a topology on $\mathcal{Y}$ such that (a) with this topology, $\mathcal{Y}$ is Hausdorff, (b) vector addition in $\mathcal{Y}$ is a continuous function from […]

[…] $\to \mathcal{Z}$, where $A$ is a subset of $\mathcal{Y}$ and $\mathcal{Y}$ and $\mathcal{Z}$ are linear topological spaces, and if $A_1$ is a subset of $A$ which is compact in $\mathcal{Y}$ and on which $\mathcal{F}$ is equicontinuous, then $\mathcal{F}$ is uniformly equicontinuous on $A_1$.

Proof. Let $N'$ be an arbitrary neighborhood of 0 in $\mathcal{Z}$. Let $N''$ be a neighborhood of 0 in $\mathcal{Z}$ such that $N'' + N'' \subset N'$ [see Corollary (4.4)], and let $N''' = N'' \cap (-N'')$ [see Corollary (4.6)]. Since $\mathcal{F}$ is equicontinuous on $A_1$, for every $y' \in A_1$ there is a neighborhood $N_{y'}$ of 0 in $\mathcal{Y}$ such that $f(y) \in f(y') + N'''$ for all $f \in \mathcal{F}$ whenever […]. For each $y' \in A_1$, let $\tilde{N}_{y'}$ be a neighborhood of 0 in $\mathcal{Y}$ such that […]. The family $\{y' + \tilde{N}_{y'}: y' \in A_1\}$ evidently forms a covering of $A_1$, so that, since $A_1$ is compact, there are points $y'_1, \dots, y'_p$ such that […]. Let […]. Now suppose that $y_1 \in A_1$ and that […]. Then […] for some $j = 1, \dots, p$, and $y \in y'_j +$ […]. Consequently, for every $f \in \mathcal{F}$, […]

[…] $x \to x_0^+$ (respectively, $x \to x_0^-$) to mean $x \to x_0$ and […] [respectively, […]

If $\mathcal{X}$, $A'$, and $x_0$ are as in the second sentence of (33), $Y$ is an arbitrary set, $A_1 \subset A \subset \mathcal{X} \times Y$, $Y_1 \subset Y$, and $f$ is a function from $A$ into $\mathcal{Z}$, then we shall say that $f(x,y)$ tends (or converges) to $A'$ as $x$ tends to $x_0$ uniformly with respect to $y \in Y_1$, written as

$$f(x,y) \to A' \ \text{as}\ x \to x_0 \quad \text{uniformly w.r. to } y \in Y_1,$$

if, for every neighborhood $N'$ of 0 in $\mathcal{Z}$, there is a neighborhood $N$ of 0 in $\mathcal{X}$ such that $f(x,y) \in A' + N'$ whenever $(x,y) \in$ […] $\times Y_1] \cap A$. We shall also write

$$f(x,y) \to A' \ \text{as}\ x \to x_0 \ \text{and}\ x \in A_1 \quad \text{uniformly w.r. to } y \in Y_1 \tag{35}$$

with a meaning that should now be clear, and shall write $x \to x_0^+$ or $x \to x_0^-$ in (35) in place of $x \to x_0$ and $x \in A_1$ when $\mathcal{X} = R$ and $A_1$ is as previously described.

It is easy to see that if $A'$ consists of a single point $x' \in \mathcal{Z}$, then all of the definitions of convergence to $A'$ introduced above coincide with the corresponding convergence definitions of (3.16-23).

Note that if $\mathcal{X}$, $\mathcal{Z}$, $x_0$, $A_1$, and $A$ are as in the second sentence of (33), $f$ and $g$ are functions from $A$ into $\mathcal{Z}$, $A' \subset \mathcal{Z}$, and $A'' \subset \mathcal{Z}$, then, if $f(x) \to A'$ and $g(x) \to A''$ as $x \to x_0$, we also have that $\alpha f(x) + \beta g(x) \to \alpha A' + \beta A''$ as $x \to x_0$ (for all $\alpha, \beta \in R$), and similarly if we adjoin "$x \in A_1$" to "$x \to x_0$." Further, an analogous statement can be made for uniform convergence. Finally, with respect to direct products, an assertion similar to that in (3.37) can be made.


I. MATHEMATICAL PRELIMINARIES

If $A$ is a subset of an arbitrary set $X$, and $f$ is a map from $A$ into $X$, then any $x \in A$ that satisfies the equation $x = f(x)$ will be called a fixed point of $f$. The following so-called fixed point theorem will be very important in the sequel.

THEOREM. Let $C$ be a compact, convex set in a locally convex linear topological space, and let $f$ be a continuous function from $C$ into $C$. Then $f$ has at least one fixed point.

Proof. See [4, Theorem 5, p. 456].

Given two sets $K$ and […] in $R^m$, we shall say that […] is a simplicial linearization of $K$ if, for every pair […], where $S$ is a simplex in $R^m$ such that $0 \in S$ […]

[…] $\cdot\, x(t)$, $t \in I$. It is easily seen that […] is complete, i.e., that […] is a Banach space.

If $I$ is a compact interval $[t_1, t_2]$, then we shall denote by $NBV(I)$ the normed linear vector space of all functions $\lambda$ from $I$ into $R$ which (i) are of bounded variation on $I$, (ii) are continuous from the right in $(t_1, t_2)$ [i.e., which have the property that $\lambda(t + \varepsilon) \to \lambda(t)$ as $\varepsilon \to 0^+$ for every $t$, $t_1 < t < t_2$], and (iii) satisfy $\lambda(t_2) = 0$, with $\|\lambda\|$ equal to the total variation of $\lambda$, written as TV $\lambda$ and defined as

$$\mathrm{TV}\,\lambda = \sup \sum_{i=1}^{k} |\lambda(s_i) - \lambda(s_{i-1})|,$$

with the supremum taken over all subdivisions $t_1 = s_0 < s_1 < \cdots < s_k = t_2$ of $I$.
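The total variation norm can be made concrete with a small numerical sketch. The code below is an illustration under stated assumptions, not part of the text: it takes the hypothetical function lam(t) = (t - 1/2)^2 - 1/4 on I = [0, 1], which is continuous, of bounded variation, and vanishes at the right endpoint, and it evaluates the sum of |lam(s_i) - lam(s_{i-1})| over two uniform subdivisions. Every such sum is a lower bound for the total variation, and the supremum here is 1/2.

```python
from typing import Callable, List, Sequence


def variation_over_subdivision(lam: Callable[[float], float],
                               points: Sequence[float]) -> float:
    """Sum of |lam(s_i) - lam(s_{i-1})| over one subdivision s_0 < ... < s_k."""
    return sum(abs(lam(b) - lam(a)) for a, b in zip(points, points[1:]))


def uniform_subdivision(t1: float, t2: float, k: int) -> List[float]:
    """Subdivision of [t1, t2] into k cells of equal length."""
    return [t1 + (t2 - t1) * i / k for i in range(k + 1)]


def lam(t: float) -> float:
    """Hypothetical example on I = [0, 1]: continuous, with lam(1) = 0."""
    return (t - 0.5) ** 2 - 0.25


# A coarse subdivision misses the turning point at t = 1/2 and underestimates.
coarse = variation_over_subdivision(lam, uniform_subdivision(0.0, 1.0, 3))

# A subdivision containing t = 1/2 already attains the supremum, which is 1/2:
# lam decreases by 1/4 on [0, 1/2] and increases by 1/4 on [1/2, 1].
fine = variation_over_subdivision(lam, uniform_subdivision(0.0, 1.0, 1000))
```

With three cells the sum is 4/9, while any subdivision that contains the point 1/2 gives the full value 1/2, so for this example the norm of `lam` in NBV([0, 1]) would be 1/2.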

12

There is an intimate relationship between the spaces […]

[…] $R$ which are essentially bounded, with the norm defined by

$$\|z\| = \operatorname*{ess\,sup}_{t \in I} |z(t)|. \tag{25}$$

In the spaces $L_p(I)$ ($1 \le p \le \infty$) we shall, as is customary, not distinguish between functions which differ only on a set of measure zero. It is then easily seen that (25) defines a norm on $L_\infty$; for a proof that (24) defines a norm on $L_p$, for $1 \le p < \infty$, see [8, Example 5, p. 90]. The spaces $L_p(I)$ ($1 \le p \le \infty$) are complete, and hence are Banach spaces (see [4, Theorem 6, p. 146 and Corollary 14, p. 150]).

It turns out that $(L_p(I))^*$ is isometrically isomorphic to $L_q(I)$ for every $p \in (1, \infty)$, where $q = p/(p - 1)$, and that $(L_1(I))^*$ is isometrically isomorphic to $L_\infty(I)$ (see [4, Theorem 1, p. 286, and Theorem 5, p. 289]). However, we shall be primarily interested in $L_\infty(I)$, whose dual is more difficult to specify.

THEOREM. For every $\ell \in (L_\infty(I))^*$, there is a finitely additive, real-valued measure (which is not necessarily countably additive) $\nu$, defined on all Lebesgue measurable subsets of $I$, such that $\nu(I_0) = 0$ for all subsets $I_0$ of $I$ of Lebesgue measure zero, and such that

$$\ell(z) = \int_I z(s)\, d\nu(s) \quad \text{for all } z \in L_\infty(I). \tag{29}$$
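The conjugate exponent q = p/(p - 1) that links $L_p(I)$ with its dual $L_q(I)$ can be explored numerically. The sketch below is an illustration under stated assumptions rather than anything from the text: it uses step functions on I = [0, 1] with four cells of equal width (so that the integrals are exact finite sums), and the helper names and sample values are hypothetical. It checks that 1/p + 1/q = 1 and that the pairing of x and y, the integral of x(s) y(s) over I, is bounded in absolute value by the product of the $L_p$ norm of x and the $L_q$ norm of y (Hölder's inequality), which is the estimate that lets each element of $L_q(I)$ act as a continuous linear functional on $L_p(I)$.

```python
from typing import Sequence


def conjugate_exponent(p: float) -> float:
    """Return q with 1/p + 1/q = 1, for 1 < p < infinity."""
    if p <= 1.0:
        raise ValueError("p must satisfy 1 < p < infinity")
    return p / (p - 1.0)


def step_lp_norm(values: Sequence[float], width: float, p: float) -> float:
    """L_p norm of a step function taking values[i] on cells of equal width."""
    return (sum(abs(v) ** p for v in values) * width) ** (1.0 / p)


# Hypothetical data: two step functions on I = [0, 1] with four equal cells.
p = 3.0
q = conjugate_exponent(p)
width = 0.25
x = [1.0, -2.0, 0.5, 4.0]
y = [0.3, 1.0, -2.0, 0.25]

# The pairing is the integral of x(s) * y(s) over I, an exact finite sum here.
pairing = sum(a * b for a, b in zip(x, y)) * width

# Hoelder's inequality bounds |pairing| by the product of the two norms.
bound = step_lp_norm(x, width, p) * step_lp_norm(y, width, q)
```

For p = 3 the conjugate exponent is q = 3/2, and the pairing of the two sample functions stays within the Hölder bound, as it must for any choice of step values.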

Proof. See [4, Theorem 16 and Definition 15, p. 296].

For every positive integer $n$, we define $L^n_\infty(I)$ (where $I$ is the usual compact interval $[t_1, t_2]$) as the normed linear vector space consisting of all functions $z: I \to R^n$ such that $|z(\cdot)|$ is essentially bounded, with the norm defined by

$$\|z\| = \operatorname*{ess\,sup}_{t \in I} |z(t)|.$$

Arguing as in (7) and (8), we can show that $L^n_\infty(I)$ is topologically isomorphic to $(L_\infty(I))^n$. The following theorem is analogous to Theorem (22):

THEOREM. Let $K$ denote the closed convex cone in $L_\infty(I)$ defined as follows: $K = \{z: z(t) \le 0 \text{ for almost all } t \in I'\}$, where $I'$ is some given Lebesgue measurable subset of $I$. Then $K^*$ [see (5.23)] consists of all linear functionals $\ell \in (L_\infty(I))^*$ with the representation (29) for some finitely additive, nonnegative real-valued measure $\nu$, defined on all Lebesgue measurable subsets of $I$, such that $\nu(\tilde{I}) = 0$ for every Lebesgue measurable subset $\tilde{I}$ of $I$ whose intersection with $I'$ has Lebesgue measure zero.

Theorem (32) may be proved much in the same way as Theorem (22).

Now suppose that $A$ is a subset of some topological space $X$ (which is typically a linear topological space), that $\mathcal{Z}$ is some normed linear vector space, and that