Introduction to the Theory of Optimization in Euclidean Space (Chapman & Hall/CRC Series in Operations Research) [1 ed.] 0367195577, 9780367195571

Introduction to the Theory of Optimization in Euclidean Space is intended to provide students with a robust introduction to the theory of optimization in Euclidean space.


English. Pages: 334 [335]. Year: 2019.





Table of contents :
Cover
Half Title
Series Page
Title Page
Copyright Page
Dedication
Contents
Preface
Acknowledgments
Symbol Description
Author
1. Introduction
1.1 Formulation of Some Optimization Problems
1.2 Particular Subsets of Rn
1.3 Functions of Several Variables
2. Unconstrained Optimization
2.1 Necessary Condition
2.2 Classification of Local Extreme Points
2.3 Convexity/Concavity and Global Extreme Points
2.3.1 Convex/Concave Several Variable Functions
2.3.2 Characterization of Convex/Concave C1 Functions
2.3.3 Characterization of Convex/Concave C2 Functions
2.3.4 Characterization of a Global Extreme Point
2.4 Extreme Value Theorem
3. Constrained Optimization-Equality Constraints
3.1 Tangent Plane
3.2 Necessary Condition for Local Extreme Points-Equality Constraints
3.3 Classification of Local Extreme Points-Equality Constraints
3.4 Global Extreme Points-Equality Constraints
4. Constrained Optimization-Inequality Constraints
4.1 Cone of Feasible Directions
4.2 Necessary Condition for Local Extreme Points/Inequality Constraints
4.3 Classification of Local Extreme Points-Inequality Constraints
4.4 Global Extreme Points-Inequality Constraints
4.5 Dependence on Parameters
Bibliography
Index


Introduction to the Theory of Optimization in Euclidean Space

Series in Operations Research
Series Editors: Malgorzata Sterna, Marco Laumanns

About the Series
The CRC Press Series in Operations Research encompasses books that contribute to the methodology of Operations Research and apply advanced analytical methods to help make better decisions. The scope of the series is wide, including innovative applications of Operations Research which describe novel ways to solve real-world problems, with examples drawn from industrial, computing, engineering, and business applications. The series explores the latest developments in Theory and Methodology, and presents original research results contributing to the methodology of Operations Research and to its theoretical foundations. Featuring a broad range of reference works, textbooks and handbooks, the books in this Series will appeal not only to researchers, practitioners and students in the mathematical community, but also to engineers, physicists, and computer scientists. The inclusion of real examples and applications is highly encouraged in all of our books.

Rational Queueing
Refael Hassin

Introduction to the Theory of Optimization in Euclidean Space
Samia Challal

For more information about this series please visit: https://www.crcpress.com/Chapman--HallCRC-Series-in-Operations-Research/book-series/CRCOPSRES

Introduction to the Theory of Optimization in Euclidean Space

Samia Challal Glendon College-York University Toronto, Canada

CRC Press
Taylor & Francis Group
6000 Broken Sound Parkway NW, Suite 300
Boca Raton, FL 33487-2742

© 2020 by Taylor & Francis Group, LLC
CRC Press is an imprint of Taylor & Francis Group, an Informa business

No claim to original U.S. Government works
Printed on acid-free paper
International Standard Book Number-13: 978-0-367-19557-1 (Hardback)

This book contains information obtained from authentic and highly regarded sources. Reasonable efforts have been made to publish reliable data and information, but the author and publisher cannot assume responsibility for the validity of all materials or the consequences of their use. The authors and publishers have attempted to trace the copyright holders of all material reproduced in this publication and apologize to copyright holders if permission to publish in this form has not been obtained. If any copyright material has not been acknowledged please write and let us know so we may rectify in any future reprint.

Except as permitted under U.S. Copyright Law, no part of this book may be reprinted, reproduced, transmitted, or utilized in any form by any electronic, mechanical, or other means, now known or hereafter invented, including photocopying, microfilming, and recording, or in any information storage or retrieval system, without written permission from the publishers.

For permission to photocopy or use material electronically from this work, please access www.copyright.com (http://www.copyright.com/) or contact the Copyright Clearance Center, Inc. (CCC), 222 Rosewood Drive, Danvers, MA 01923, 978-750-8400. CCC is a not-for-profit organization that provides licenses and registration for a variety of users. For organizations that have been granted a photocopy license by the CCC, a separate system of payment has been arranged.

Trademark Notice: Product or corporate names may be trademarks or registered trademarks, and are used only for identification and explanation without intent to infringe.

Visit the Taylor & Francis Web site at http://www.taylorandfrancis.com and the CRC Press Web site at http://www.crcpress.com

To my parents

Contents

Preface  ix

Acknowledgments  xi

Symbol Description  xiii

Author  xv

1 Introduction  1
1.1 Formulation of Some Optimization Problems  1
1.2 Particular Subsets of Rn  8
1.3 Functions of Several Variables  20

2 Unconstrained Optimization  49
2.1 Necessary Condition  49
2.2 Classification of Local Extreme Points  71
2.3 Convexity/Concavity and Global Extreme Points  93
2.3.1 Convex/Concave Several Variable Functions  93
2.3.2 Characterization of Convex/Concave C1 Functions  95
2.3.3 Characterization of Convex/Concave C2 Functions  98
2.3.4 Characterization of a Global Extreme Point  102
2.4 Extreme Value Theorem  117

3 Constrained Optimization-Equality Constraints  135
3.1 Tangent Plane  137
3.2 Necessary Condition for Local Extreme Points-Equality Constraints  151
3.3 Classification of Local Extreme Points-Equality Constraints  167
3.4 Global Extreme Points-Equality Constraints  187

4 Constrained Optimization-Inequality Constraints  203
4.1 Cone of Feasible Directions  204
4.2 Necessary Condition for Local Extreme Points/Inequality Constraints  220
4.3 Classification of Local Extreme Points-Inequality Constraints  251
4.4 Global Extreme Points-Inequality Constraints  271
4.5 Dependence on Parameters  292

Bibliography  315

Index  317

Preface

The book is intended to provide students with a useful background in optimization in Euclidean space. Its primary goal is to demystify the theoretical aspect of the subject.

In presenting the material, we refer first to the intuitive idea in one dimension, then make the jump to n dimensions as naturally as possible. This approach allows the reader to focus on understanding the ideas, postpone the proofs for later, and learn to apply the theorems through examples and problem solving. A detailed solution follows each problem, serving both as an illustration and as a deepening of the theory. These solved problems provide a repetition of the basic principles, an update on some difficult concepts and a further development of some ideas.

Students are taken progressively through the development of the proofs, where they have the occasion to practice tools of differentiation (chain rule, Taylor formula) for functions of several variables in abstract situations. They learn to apply important results established in advanced algebra and analysis courses, such as the Farkas-Minkowski lemma, the implicit function theorem and the extreme value theorem.

The book starts, in Chapter 1, with a short introduction to mathematical modeling leading to the formulation of optimization problems. Each formulation involves a function and a set of points. Thus, basic properties of open, closed and convex subsets of Rn are discussed. Then, the usual topics of differential calculus for functions of several variables are reviewed.

In the following chapters, the study is devoted to the optimization of a function of several variables f over a subset S of Rn. Depending on the nature of this set, three situations are identified. In Chapter 2, the set S has a nonempty interior; in Chapter 3, S is described by an equation g(x) = 0; and in Chapter 4, by an inequality g(x) ≤ 0, where g is a function of several variables. In each case, we try to answer the following questions:

– If an extreme point exists, where is it located in S? Here, we look for necessary conditions that produce candidate points for optimality. We make the distinction between local and global points.

– Among the local candidate points, which of them are local maximum or local minimum points? Here, we establish sufficient conditions to identify a local candidate point as an extreme point.

– Among the local extreme points found, which ones are global extreme points? Here, the convexity/concavity property intervenes for a positive answer.

Finally, we explore how the extreme value of the objective function f is affected when some parameters involved in the definition of the functions f or g change slightly.

Acknowledgments

I am very grateful to my colleagues David Spring, Mario Roy and Alexander Nenashev for introducing the course on optimization, for the first time, to our math program and giving me the opportunity to teach it. I, especially, thank Professor Vincent Hildebrand, Chair of the Economics Department for the useful discussions during the planning of the course content to support students majoring in Economics. My thanks are also due to Sarfraz Khan and Callum Fraser from Taylor and Francis Group, to the reviewers for their invaluable help, and to Shashi Kumar for the expert technical support. I have relied on the various authors cited in the bibliography, and I am grateful to all of them. Many exercises are drawn or adapted from the cited references for their aptitude to reinforce the understanding of the material.


Symbol Description

∀            For all, or for each
∃            There exists
∃!           There exists a unique
∅            The empty set
s.t          Subject to
‖A‖          = ( Σ_{i,j=1,...,n} a_ij² )^{1/2}, norm of the matrix A = (a_ij), i, j = 1, ..., n
S̊            Interior of the set S
S̄            Closure of the set S
∂S           Boundary of the set S
CS           The complement of S
i, j, k      i = (1, 0, 0), j = (0, 1, 0), k = (0, 0, 1), standard basis of R³
rankA        rank of the matrix A
detA         determinant of the matrix A
KerA         = {x : Ax = 0}, kernel of the matrix A
ᵗh           = [h1 . . . hn], transpose of the column vector h with components h1, . . . , hn
Br(x0)       Ball centered at x0 with radius r
Br(x0)       Bordered Hessian of order r at x0
⟨ . , . ⟩ or [ . , . ]   brackets for vectors
∇f           gradient of f
x*           column vector with components x1*, . . . , xn*, identified sometimes with the point (x1*, . . . , xn*)
‖x‖          = √(x1² + x2² + . . . + xn²), norm of the vector x
Mm n         set of matrices of m rows and n columns
A            = (aij), i = 1, . . . , m, j = 1, . . . , n, is an m × n matrix
ᵗh·x*        = Σ_{k=1,...,n} hk xk*, dot product of the vectors h and x*
C1(D)        set of continuously differentiable functions on D
Ck(D)        set of continuously differentiable functions on D up to the order k
C∞(D)        set of continuously differentiable functions on D for any order k
Hf(x)        = (fxixj)n×n, Hessian of f
Dk(x)        determinant of the k × k matrix (fxixj), i, j = 1, . . . , k, the leading minor of order k of the Hessian Hf

Author

Samia Challal is an assistant professor of Mathematics at Glendon College, the bilingual campus of York University. Her research interests include homogenization, optimization, free boundary problems, partial differential equations and problems arising from mechanics.


Chapter 1 Introduction

Optimization problems arise in different domains. In Section 1.1 of this chapter, we introduce some applications and learn how to model a situation as an optimization problem. The points where an optimal quantity is attained are sought in subsets that can be one-dimensional or multi-dimensional, open, closed, bounded or unbounded. We devote Section 1.2 to the study of some topological properties of such subsets of Rn. Finally, since the phenomena analyzed are often complex because of the many parameters involved, an introduction to functions of several variables is needed; we study them in Section 1.3.

1.1 Formulation of Some Optimization Problems

The purpose of this short section is to show, through some examples, the main elements involved in an optimization problem.

Example 1. Different ways in modeling a problem. To minimize the material in manufacturing a closed can with volume capacity of V units, we need to choose a suitable radius for the container. i)

Show how to make this choice without finding the exact radius.

ii)

How to choose the radius if the volume V may vary from one liter to two liters?


Solution: Denote by h and r the height and the radius of the can respectively. Then, the area and the volume of the can are given by

    area = A = 2πr² + 2πrh,    volume = V = πr²h.

i) * The area can be expressed as a function of r and the problem is reduced to finding r ∈ (0, +∞) for which A is minimum:

    minimize A = A(r) = 2πr² + 2V/r over the set S,
    S = (0, +∞) = {r ∈ R : r > 0}.

Note that the set S, as shown in Figure 1.1, is an open unbounded interval of R.

FIGURE 1.1: S = (0, +∞) ⊂ R

** We can also express the problem as follows:

    minimize A(r, h) = 2πr² + 2πrh over the set S,
    S = {(r, h) ∈ R⁺ × R⁺ : πr²h = V}.

Here, the set S is a curve in R² and is illustrated by Figure 1.2 below.

FIGURE 1.2: S is a curve h = π⁻¹/r² in the plane (V = 1 liter)

ii) In case we allow more possibilities for the volume, for example 1 ≤ V ≤ 2, we can formulate the problem as a two-dimensional problem:

    minimize A(r, h) = 2πr² + 2πrh over the set S,
    S = {(r, h) ∈ R⁺ × R⁺ : 1/(πr²) ≤ h ≤ 2/(πr²)}.

The set S is the plane region, in the first quadrant, between the curves h = 1/(πr²) and h = 2/(πr²) (see Figure 1.3).

FIGURE 1.3: S is a plane region between two curves

A three-dimensional formulation of the same problem is

    minimize A(r, h, V) = 2πr² + 2V/r over the set S,
    S = {(r, h, V) ∈ R⁺ × R⁺ × R⁺ : πr²h = V, 1 ≤ V ≤ 2},

where the set S ⊂ R³ is the part of the surface V = πr²h located between the planes V = 1 and V = 2 in the first octant; see Figure 1.4.

FIGURE 1.4: S is a surface in the space
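Before moving on, the one-variable formulation in i) can be checked numerically. The following is a minimal sketch (assuming NumPy and SciPy are available; the names V and A simply denote the quantities above) that minimizes A(r) = 2πr² + 2V/r over r > 0 and compares the minimizer with the analytic stationary point r = (V/(2π))^(1/3) obtained from A′(r) = 4πr − 2V/r² = 0.

    import numpy as np
    from scipy.optimize import minimize_scalar

    V = 1.0  # illustrative volume (one liter)

    def A(r):
        # surface area of the closed can, with h eliminated through V = pi*r^2*h
        return 2 * np.pi * r**2 + 2 * V / r

    res = minimize_scalar(A, bounds=(1e-6, 10.0), method="bounded")
    r_star = (V / (2 * np.pi)) ** (1 / 3)   # analytic candidate from A'(r) = 0

    print(res.x, r_star)          # the numerical and analytic radii agree
    print(A(res.x), A(r_star))    # minimal surface area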


Example 2. Too many variables and linear inequalities. Diet Problem. * One can buy four types of aliments where the nutritional content per unit weight of each food and its price are shown in Table 1.1 [5]. The diet problem consists of obtaining, at the minimum cost, at least twelve calories and seven vitamins.

              type1   type2   type3   type4
    calories    2       1       0       1
    vitamins    3       4       3       5
    price       2       2       1       8

TABLE 1.1: A diet problem with four variables

Solution: Let ui be the weight of the food of type i. The total price of the four aliments consumed is given by the relation

    2u1 + 2u2 + u3 + 8u4 = f(u1, u2, u3, u4).

To ensure that at least twelve calories and seven vitamins are included, we can express these conditions by writing

    2u1 + u2 + u4 ≥ 12   and   3u1 + 4u2 + 3u3 + 5u4 ≥ 7.

Hence, the problem would be

    minimize f(u1, u2, u3, u4) over the set
    S = {(u1, u2, u3, u4) ∈ R⁴ : 2u1 + u2 + u4 ≥ 12, 3u1 + 4u2 + 3u3 + 5u4 ≥ 7}.
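For a small problem of this form, the minimizer can be computed with a linear-programming routine. The sketch below is only an illustration (it assumes SciPy and adds the natural restriction uᵢ ≥ 0, which is not stated in the formulation above but is implicit for weights; linprog imposes it by default). The ≥ constraints are rewritten as ≤ constraints by changing signs.

    from scipy.optimize import linprog

    c = [2, 2, 1, 8]              # prices per unit weight of the four food types

    # 2u1 + u2 + u4 >= 12 and 3u1 + 4u2 + 3u3 + 5u4 >= 7, rewritten for A_ub @ u <= b_ub
    A_ub = [[-2, -1,  0, -1],
            [-3, -4, -3, -5]]
    b_ub = [-12, -7]

    res = linprog(c, A_ub=A_ub, b_ub=b_ub)   # default bounds: u_i >= 0 (assumption)
    print(res.x, res.fun)                    # optimal weights and minimal total price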

** The above problem is rendered more complex if more factors (fat, proteins) and types of food (steak, potatoes, fish, ...) were to be considered. For example, from Table 1.2, we deduce that the total price of the seven

              type1   type2   type3   type4   type5   type6   type7
    protein     3       1       2       7       8       5      10
    fat         0       1       0       8      15      10       6
    calories    2       1       0       1       5       7       9
    vitamins    3       4       3       5       1       2       5
    price       2       2       1       8      12      10       8

TABLE 1.2: A diet problem with seven variables


aliments consumed is

    2u1 + 2u2 + u3 + 8u4 + 12u5 + 10u6 + 8u7 = p(u1, u2, u3, u4, u5, u6, u7).

To ensure that at least twelve calories, seven vitamins and twenty proteins are included, and less than fifteen fats are consumed, the problem would be formulated as

    minimize p(u1, u2, u3, u4, u5, u6, u7) over the set
    S = {(u1, ..., u7) ∈ R⁷ :
         3u1 + u2 + 2u3 + 7u4 + 8u5 + 5u6 + 10u7 ≥ 20,
         u2 + 8u4 + 15u5 + 10u6 + 6u7 ≤ 15,
         2u1 + u2 + u4 + 5u5 + 7u6 + 9u7 ≥ 12,
         3u1 + 4u2 + 3u3 + 5u4 + u5 + 2u6 + 5u7 ≥ 7}.

Example 3. Too many variables and nonlinearities. * A company uses x units of capital and y units of labor to produce xy units of a manufactured good. Capital can be purchased at $3/unit and labor at $2/unit. A total of $6 is available to purchase capital and labor. How can the firm maximize the quantity of the good that can be manufactured?

Solution: We need to maximize the quantity xy on the set of points (see Figure 1.5)

    S = {(x, y) ∈ R² : 3x + 2y ≤ 6, x ≥ 0, y ≥ 0}.

FIGURE 1.5: S is a triangular region in the plane


The set S is the triangular plane region bounded by the sides L1, L2 and L3, defined by:

    L1 = {(x, 0), 0 ≤ x ≤ 2},   L2 = {(0, y), 0 ≤ y ≤ 3},   L3 = {(x, (6 − 3x)/2), 0 ≤ x ≤ 2}.

Here, the objective function f(x, y) = xy is nonlinear and the set S is described by linear inequalities.

** Such a model may work for a certain production process. However, it may not reflect the situation when other factors involved in the production process cannot be ignored. Therefore, new models have to be considered. For example [7]:
- The Canadian manufacturing industries for 1927 are estimated by P(l, k) = 33 l^0.46 k^0.52, where P is product, l is labor and k is capital.
- The production P for dairy farming in Iowa (1939) is estimated by P(A, B, C, D, E, F) = A^0.27 B^0.01 C^0.01 D^0.23 E^0.09 F^0.27, where A is land, B is labor, C is improvements, D is liquid assets, E is working assets and F is cash operating expenses.
Each of these nonlinear production functions P is optimized on a suitable set S that describes well the elements involved.
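As a quick numerical illustration of the first formulation in this example (a sketch only; it assumes SciPy, and the variable names are ours, not from the text), one can maximize f(x, y) = xy over the triangular region S by minimizing −xy subject to 3x + 2y ≤ 6, x ≥ 0, y ≥ 0.

    from scipy.optimize import minimize

    objective = lambda v: -(v[0] * v[1])                 # maximize xy <=> minimize -xy
    constraints = [{"type": "ineq", "fun": lambda v: 6 - 3 * v[0] - 2 * v[1]}]
    bounds = [(0, None), (0, None)]                      # x >= 0, y >= 0

    res = minimize(objective, x0=[0.5, 0.5], bounds=bounds, constraints=constraints)
    print(res.x, -res.fun)   # approximately x = 1, y = 1.5, with product 1.5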

As seen above, the main purpose of this study is to find a solution to the following optimization problems:

    find u ∈ S such that f(u) = min over S of f(v)
or
    find u ∈ S such that f(u) = max over S of f(v),

where f : S ⊂ Rⁿ → R is a given function and S a given subset of Rⁿ. It is clear that establishing existence and uniqueness results for the extreme points depends on properties satisfied by the set S and the function f. So, we need to know some categories of subsets of Rⁿ as well as some calculus of multi-variable functions. But first, look at the following remark:


Remark 1.1.1 The extreme point may not exist on the set S. In our study, we will explore the situations where min over S of f and max over S of f are attained in S.

For example, the minimum of f(x) = x² over (0, 1) does not exist.

Indeed, suppose there exists x0 ∈ (0, 1) such that f(x0) = min over (0, 1) of f(x). Then 0 < x0/2 < x0 < 1 and f(x0/2) = x0²/4 < x0² = f(x0), which contradicts the minimality of x0.

To include these limit cases, usually, instead of looking for a minimum or a maximum, we look for

    inf over S of f(x) = inf{f(x) : x ∈ S}   and   sup over S of f(x) = sup{f(x) : x ∈ S},

where inf E and sup E of a nonempty subset E of R are defined by [2]:

    sup E = the least number greater than or equal to all numbers in E,
    inf E = the greatest number less than or equal to all numbers in E.

If E is not bounded below, we write inf E = −∞. If E is not bounded above, we write sup E = +∞. By convention, we write sup ∅ = −∞ and inf ∅ = +∞. For the previous example, we have

    inf over (0, 1) of x² = 0   and   sup over (0, 1) of x² = 1.
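The non-attainment of the minimum can also be seen numerically; the following small sketch (illustrative only) shows that the values of f(x) = x² on (0, 1) approach the infimum 0 but that every candidate minimizer x0 is beaten by x0/2.

    f = lambda x: x**2

    for x in [0.1, 0.01, 0.001, 1e-6]:
        print(x, f(x))            # values decrease toward 0 but never reach it

    x0 = 1e-6                     # any point of (0, 1)
    print(f(x0 / 2) < f(x0))      # True: x0 cannot be a minimizer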


1.2 Particular Subsets of Rn

We list here the main categories of sets that we will encounter and give the main tools that allow us to identify them easily. Even though the purpose is not a topological study of these sets, it is important to be aware of the precise definitions and how to apply them accurately [18], [13].

Open and Closed Sets

In one dimension, the distance between two real numbers x and y is measured by the absolute value function and is given by d(x, y) = |x − y|. d satisfies, for any x, y, z, the properties

    d(x, y) ≥ 0, and d(x, y) = 0 ⟺ x = y
    d(y, x) = d(x, y)                          (symmetry)
    d(x, z) ≤ d(x, y) + d(y, z)                (triangle inequality).

These three properties induce on R a metric topology where a set O is said to be open if and only if, at each point x0 ∈ O, we can insert a small interval centered at x0 that remains included in O, that is,

    O is open  ⟺  ∀x0 ∈ O, ∃ε > 0 such that (x0 − ε, x0 + ε) ⊂ O.

In higher dimensions, these tools are generalized as follows. The distance between two points x = (x1, ..., xn) and y = (y1, ..., yn) is measured by the quantity

    d(x, y) = ‖x − y‖ = √((x1 − y1)² + ... + (xn − yn)²).

d is called the Euclidean distance and satisfies the three properties above. A set O ⊂ Rⁿ is said to be open if and only if, at each point x0 ∈ O, we can insert a small ball

    Bε(x0) = {x ∈ Rⁿ : ‖x − x0‖ < ε}

centered at x0 with radius ε that remains included in O, that is,

    O is open  ⟺  ∀x0 ∈ O, ∃ε > 0 such that Bε(x0) ⊂ O.
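The Euclidean distance and the ball-membership test translate directly into code. The short sketch below (illustrative, assuming NumPy; the function names are ours) computes d(x, y) = ‖x − y‖ and checks the triangle inequality and a ball membership on sample points.

    import numpy as np

    def d(x, y):
        # Euclidean distance between two points of R^n
        return np.linalg.norm(np.asarray(x, float) - np.asarray(y, float))

    def in_open_ball(x, x0, r):
        # membership in B_r(x0) = {x : ||x - x0|| < r}
        return d(x, x0) < r

    x, y, z = [1.0, 2.0], [4.0, 6.0], [0.0, 0.0]
    print(d(x, y))                              # 5.0
    print(d(x, z) <= d(x, y) + d(y, z))         # True: triangle inequality
    print(in_open_ball([0.5, 0.5], [0, 0], 1))  # True: inside the unit ball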

The point x0 is said to be an interior point of O.

Example 1. As n varies, the ball takes different shapes; see Figure 1.6.

    n = 1, a ∈ R:              Br(a) = (a − r, a + r), an open interval
    n = 2, a = (a1, a2):       Br(a) = {(x1, x2) : (x1 − a1)² + (x2 − a2)² < r²}, an open disk
    n = 3, a = (a1, a2, a3):   Br(a) = {(x1, x2, x3) : (x1 − a1)² + (x2 − a2)² + (x3 − a3)² < r²},
                               the set of points delimited by the sphere centered at a with radius r
    n > 3, a = (a1, ..., an):  Br(a) is the set of points delimited by the hypersphere of points x
                               satisfying d(a, x) = r

FIGURE 1.6: Shapes of balls in R, R² and R³

Using the distance d, we define

Definition 1.2.1 Let S be a subset of Rⁿ.

– S̊ is the interior of S, the set of all interior points of S.
– S is a neighborhood of a if a is an interior point of S.
– S is a closed set ⟺ CS is open.
– ∂S is the boundary of S, the set of boundary points of S, where
      x0 ∈ ∂S ⟺ ∀r > 0, Br(x0) ∩ S ≠ ∅ and Br(x0) ∩ CS ≠ ∅.
– S̄ = S ∪ ∂S is the closure of S.
– S is bounded ⟺ ∃M > 0 such that ‖x‖ ≤ M ∀x ∈ S.

Example 2. For the sets,

S1 = [−2, 2] ⊂ R

S2 = {(x, y) : x2 +y 2  4} ⊂ R2 ,

S3 = {(x, y, z) : x2 +y 2 +z 2 < 4} ⊂ R3 ,

we have ◦

S

S

∂S

S

S1

(−2, 2)

{−2, 2}

S1

S2

B2 (0)

C2 (0) : circle

S2

S3

S3 = B2 (0)

S2 (0) : sphere

S3 ∪ S2 (0)

where C2 (0) = {(x, y) : x2 + y 2 = 4},

S2 (0) = {(x, y, z) : x2 + y 2 + z 2 = 4}.

We have the following properties:

Remark 1.2.1
– Rⁿ and ∅ are open and closed sets.
– The union (resp. intersection) of arbitrary open (resp. closed) sets is open (resp. closed).
– The finite intersection (resp. union) of open (resp. closed) sets is open (resp. closed).
– S is open ⟺ S = S̊.
– S is closed ⟺ S = S̄.
– If f is continuous on an open subset Ω ⊂ Rⁿ (see Section 1.3), then
      f⁻¹((−∞, a]) = [f ≤ a], [f ≥ a], [f = a] are closed sets in Rⁿ,
      f⁻¹((−∞, a)) = [f < a], [f > a] are open sets in Rⁿ.

Example 3. Sketch the set S in the xy-plane and determine whether it is open, closed, bounded or unbounded. Give S̊, ∂S and S̄.

    S = {(x, y) : x ≥ 0, y ≥ 0, xy ≥ 1}

FIGURE 1.7: An unbounded closed subset of R²

∗ Note that the set S, sketched in Figure 1.7, doesn't contain the points on the x and y axes. So

    S = {(x, y) : x > 0, y > 0, xy ≥ 1}

and can be described, using the continuous function f : (x, y) ⟼ xy on the open set Ω = {(x, y) : x > 0, y > 0}, as

    S = {(x, y) ∈ Ω : f(x, y) ≥ 1} = f⁻¹([1, +∞)).

Therefore, S is a closed subset of R². Thus S̄ = S.

∗∗ The set is unbounded since it contains the points (x(t), y(t)) = (t, t) for t ≥ 1 (xy = t·t = t² ≥ 1) and

    ‖(x(t), y(t))‖ = ‖(t, t)‖ = √(t² + t²) = √2 t → +∞ as t → +∞.


∗ ∗ ∗ We have

    S̊ = {(x, y) : x > 0, y > 0, xy > 1},  the region in the 1st quadrant above the hyperbola y = 1/x,
    ∂S = {(x, y) : x > 0, y > 0, xy = 1},  the arc of the hyperbola in the 1st quadrant.

Example 4. A person can afford any commodities x ≥ 0 and y ≥ 0 that satisfy the budget inequality x + 3y ≤ 7. Sketch the set S described by these inequalities in the xy-plane and determine whether it is open, closed, bounded or unbounded. Give S̊, ∂S and S̄.

FIGURE 1.8: Closed set as intersection of three closed sets of R2

∗ Figure 1.8 shows that S is the triangular region formed by all the points in the first quadrant below the line x + 3y = 7:

    S = {(x, y) : x + 3y ≤ 7, x ≥ 0, y ≥ 0}

and can be described, using the continuous functions

    f1 : (x, y) ⟼ x + 3y,   f2 : (x, y) ⟼ x,   f3 : (x, y) ⟼ y

on R², as

    S = {(x, y) ∈ R² : f1(x, y) ≤ 7, f2(x, y) ≥ 0, f3(x, y) ≥ 0}
      = f1⁻¹((−∞, 7]) ∩ f2⁻¹([0, +∞)) ∩ f3⁻¹([0, +∞)).

Therefore, S is a closed subset of R², as the intersection of three closed subsets of R². Thus S̄ = S.


∗∗ The set S is bounded since

    x + 3y ≤ 7, x ≥ 0, y ≥ 0  ⟹  0 ≤ x ≤ 7, 0 ≤ y ≤ 7/3,

from which we deduce

    ‖(x, y)‖ = √(x² + y²) ≤ √(7² + (7/3)²) = (7/3)√10   ∀(x, y) ∈ S.

∗ ∗ ∗ We have

    S̊ = {(x, y) : x > 0, y > 0, x + 3y < 7},  the region S excluding its three sides,
    ∂S = the three sides of the triangular region.

Convex sets

The category of convex sets deals with sets S ⊂ Rⁿ where any two points x, y ∈ S can be joined by a line segment that remains entirely in the set. Such sets are without holes and do not bend inwards. Thus

    S is convex  ⟺  (1 − t)x + ty ∈ S  ∀x, y ∈ S, ∀t ∈ [0, 1].

We have the following properties:

Remark 1.2.2
– Rⁿ and ∅ are convex sets.
– A finite intersection of convex sets is a convex set.

Example 5. “Well known convex sets” (see Figure 1.9)

∗ A line segment joining two points x and y is convex. It is described by

    [x, y] = {z ∈ Rⁿ : ∃t ∈ [0, 1] such that z = x + t(y − x) = (1 − t)x + ty}.

∗∗ A line passing through two points x0 and x1 is convex. It is described by

    L = {x ∈ Rⁿ : ∃t ∈ R such that x = x0 + t(x1 − x0)}.

FIGURE 1.9: Convex sets in R2

∗ ∗ ∗ A ball Br(x0) = {x ∈ Rⁿ : ‖x − x0‖ < r} is convex.

Indeed, let a and b be in Br(x0) and t ∈ [0, 1]. We have

    ‖[(1 − t)a + tb] − x0‖ = ‖(1 − t)(a − x0) + t(b − x0)‖
                           ≤ ‖(1 − t)(a − x0)‖ + ‖t(b − x0)‖
                           = |1 − t| ‖a − x0‖ + |t| ‖b − x0‖
                           < |1 − t| r + |t| r = r

since ‖a − x0‖ < r and ‖b − x0‖ < r. Hence (1 − t)a + tb ∈ Br(x0) for any t ∈ [0, 1]; that is, [a, b] ⊂ Br(x0).

FIGURE 1.10: A closed ball is convex

∗ ∗ ∗∗ A closed ball B̄r(x0) = {x ∈ Rⁿ : ‖x − x0‖ ≤ r} is convex. For example, in the plane, the set in Figure 1.10, defined by

    {(x, y) : x² + y² ≤ 4} = B̄2((0, 0)),

is convex.


The set is the closed disk with center (0, 0) and radius 2. It is closed since it includes its boundary points, located on the circle with center (0, 0) and radius 2. This set is bounded since ‖(x, y)‖ ≤ 2 ∀(x, y) ∈ B̄2((0, 0)).

Example 6. “Convex sets described by linear expressions”

∗ For a = (a1, ..., an) ∈ Rⁿ, b ∈ R, the set of points x = (x1, ..., xn) ∈ Rⁿ such that

    a1x1 + a2x2 + ... + anxn = a·x = b

is convex and is called a hyperplane. Indeed, consider x1, x2 in the hyperplane and t ∈ [0, 1]; then

    a·[(1 − t)x1 + tx2] = (1 − t)a·x1 + t a·x2 = (1 − t)b + tb = b,

thus (1 − t)x1 + tx2 belongs to the hyperplane. As illustrated in Figure 1.11, the graph of a hyperplane is reduced to the point x1 = b/a1 when n = 1, to the line a1x1 + a2x2 = b in the plane when n = 2, and to the plane a1x1 + a2x2 + a3x3 = b in the space when n = 3.


FIGURE 1.11: Hyperplane in R, R2 and R3

∗∗ The set of points x = (x1, ..., xn) ∈ Rⁿ defined by a linear inequality

    a1x1 + a2x2 + ... + anxn = a·x ≤ b    (resp. ≥, <, >)

is convex.

Indeed, as above, consider x1, x2 in the region [a·x ≤ b] and t ∈ [0, 1]; then

    a·x1 ≤ b  ⟹  (1 − t)a·x1 ≤ (1 − t)b   since (1 − t) ≥ 0,
    a·x2 ≤ b  ⟹  t a·x2 ≤ tb              since t ≥ 0.

Adding the two inequalities, we get

    a·[(1 − t)x1 + tx2] = (1 − t)a·x1 + t a·x2 ≤ (1 − t)b + tb = b,

thus (1 − t)x1 + tx2 belongs to the region [a·x ≤ b]. The set [a·x ≤ b] describes the region of points located below the hyperplane a·x = b.

∗ ∗ ∗ A set of points in Rⁿ described by linear equalities and inequalities is convex, as it can be seen as the intersection of convex sets described by equalities and inequalities. For example, in Figure 1.12, the set

    S = {(x, y) : 2x + 3y ≤ 19, −3x + 2y ≤ 4, x + y ≤ 8, 0 ≤ x ≤ 6, x + 6y ≥ 0}

can be described as S = S1 ∩ S2 ∩ S3 ∩ S4 ∩ S5 ∩ S6 where

    S1 = {(x, y) ∈ R² : x + 6y ≥ 0}       S2 = {(x, y) ∈ R² : x ≤ 6}
    S3 = {(x, y) ∈ R² : x + y ≤ 8}        S4 = {(x, y) ∈ R² : 2x + 3y ≤ 19}
    S5 = {(x, y) ∈ R² : −3x + 2y ≤ 4}     S6 = {(x, y) ∈ R² : x ≥ 0}.


FIGURE 1.12: A convex set described by linear inequalities


S is the region of the xy-plane bounded by the lines

    L1 : x + 6y = 0,   L2 : x = 6,   L3 : x + y = 8,
    L4 : 2x + 3y = 19,   L5 : −3x + 2y = 4,   L6 : x = 0.

Often, such sets are described using matrices and vectors:

    S = { (x, y) ∈ R² :
          [  2   3 ]            [ 19 ]
          [ −3   2 ]            [  4 ]
          [  1   1 ]  [ x ]  ≤  [  8 ]
          [  1   0 ]  [ y ]     [  6 ]
          [ −1  −6 ]            [  0 ]
          [ −1   0 ]            [  0 ]   }.

Example 7. “Well-known non convex sets” ∗ The hyper-sphere (see Figure 1.13 for an illustration in the plane) ∂Br (x∗ ) = {x ∈ Rn :

x − x0 = r}

is not convex.

y 3

2

circle 1 x  12  y  12  4

1

1

2

3

x

1

FIGURE 1.13: Circle ∂B2 ((1, 1)) is not convex Indeed, we have (x∗1 , . . . , x∗n ± r) ∈ ∂Br (x∗ ) since (0, . . . , ±r) = r  1 1    (x∗1 , . . . , x∗n + r) + (1 − )(x∗1 , . . . , x∗n − r) − x∗  2 2  1   =  (2x∗1 , . . . , 2x∗n + r − r) − x∗  = x∗ − x∗ = 0 = r 2 1 ∗ 1 (x , . . . , x∗n + r) + (1 − )(x∗1 , . . . , x∗n − r) = x∗ ∈ ∂Br (x∗ ). =⇒ 2 1 2 ∗∗ The domain located outside the hyper-sphere, described by S = {x ∈ Rn :

x − x∗ > r} = Rn \ Br (x∗ )

is not convex.

18

Introduction to the Theory of Optimization in Euclidean Space y 4

x2  y2  4 2

4

2

2

4

x

2

4

FIGURE 1.14: An unbounded open non convex set of R2

Indeed, we have (x∗1 , . . . , x∗n ± 2r) ∈ S

since

(0, . . . , ±2r) = 2r > r

1 1 ∗ (x1 , . . . , x∗n + 2r) + (1 − )(x∗1 , . . . , x∗n − 2r) 2 2 1 (2x∗1 , . . . , 2x∗n + 2r − 2r) = x∗ ∈ S. 2 For example, in the plane, the set =

{(x, y) :

x2 + y 2 > 4} = R2 \ B2 ((0, 0))

is not convex.

Moreover, the set is open since it is the complementary of the closed disk with center (0, 0) and radius 2 (see Figure 1.14). It is not bounded since for t  2, the points (0, t2 ) belong to the set, but (0, t2 ) = t2 −→ +∞ as t −→ +∞. ∗ ∗ ∗ The region located outside the hyper-sphere, including the hyper-sphere, described by S = {x ∈ Rn :

x − x0  r} = Rn \ Br (x∗ )

is not convex.

Example 8. “The union of convex sets is not necessarily convex ” ∗ The union of the disk and the line in Figure 1.9 is not convex. ∗∗ The set E = {(x, y) ∈ R2 : is not convex.

xy + x − y − 1 > 0}, graphed in Figure 1.15,

Introduction

19

Indeed, we have xy + x − y − 1 > 0 ⇐⇒ x > 1 and

⇐⇒ (x − 1)(y + 1) > 0 y > −1 or x 1 and y > −1}

E2 = {(x, y) ∈ R2 :

x < 1 and y < −1}

E1 and E2 are convex since they are described by linear inequalities. However, E = E1 ∪ E2 is not convex since for example (2, 0) and (0, −2) are points of E, but



1 1 2, 0 + 1 − 0, −2 = 1, −1 2 2

4

2

doesn’t belong to the set E.

2

0.5

4

x1 and y  1

1.0

y  1  1  x

1.5

2.0

FIGURE 1.15: Union of convex sets
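Convexity of a set defined by an inequality can be probed numerically by testing points on segments. The sketch below (illustrative only; the membership function is written from the inequality defining E) confirms that (2, 0) and (0, −2) belong to E while their midpoint (1, −1) does not, so E is not convex.

    def in_E(x, y):
        # membership test for E = {(x, y) : x*y + x - y - 1 > 0}
        return x * y + x - y - 1 > 0

    a, b = (2.0, 0.0), (0.0, -2.0)
    mid = ((a[0] + b[0]) / 2, (a[1] + b[1]) / 2)   # midpoint (1, -1)

    print(in_E(*a), in_E(*b))   # True True: both endpoints are in E
    print(in_E(*mid))           # False: the midpoint is not, so E is not convex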


1.3 Functions of Several Variables

We refer the reader to any book of calculus [1], [3], [21], [23] for details on the points introduced in this section.

Definition 1.3.1 A function f of n variables x1 , · · · , xn is a rule that assigns to each n-vector x = (x1 , . . . , xn ) in the domain of f , denoted by Df , a unique number f (x) = f (x1 , . . . , xn ).

Example 1. Formulas may be used to model problems from different fields. – Linear function f (x1 , . . . , xn ) = a1 x1 + a2 x2 + . . . + an xn . – The body mass index is described by the function B(w, h) =

w h2

where w is the the weight in kilograms and h is the height measured in meters. – The distance of a point P (x, y, z) to a given point P0 (x0 , y0 , z0 ) is a function of three variables  d(x, y, z) = (x − x0 )2 + (y − y0 )2 + (z − z0 )2 . – The Cobb-Douglas function or the production function, describes the relationship between the output: the product Q and the inputs: x1 , . . . , xn (capital, labor, . . .) involved in the production process Q(x1 , · · · , xn ) = Cxa1 1 xa2 2 . . . xann

C, a1 , . . . , an

are constants, C > 0.

– The electric potential function for two positive charges, one at (0, 1) with twice the magnitude as the charge at (0, −1), is given by ϕ(x, y) = 

2 x2

+ (y −

1)2

+

1 x2

+ (y + 1)2

.

Introduction

21

Example 2. When given a formula of a function, first identify its domain of definition before any other calculation. The domains of definition of the functions given by the following formulas: √ √ √ g(x, y) = x h(x, y, z) = x f (x) = x are Df = {x ∈ R/ x  0} Dg = {(x, y) ∈ R2 / x  0} : the half plane bounded by the y axis, including the axis and the points located in the 1st and 4th quadrants. Dh = {(x, y, z) ∈ R3 / x  0} : the half space bounded by the plane yz, including this plane and the points with positive 1st coordinates x  0. The three domains Df , Dg , Dh are closed, convex, unbounded subsets of R, R2 and R3 respectively; see Figure 1.16. y 1.0

Dg : x  0

Df : x  0

interval 0

1

2

3

1.0

4

5

0.5

0.5

0.5

6 0.5

1.0 10 y

Dh : x  0

5

0 5 10 10

5

z

0

5

10 0

x

5 10

FIGURE 1.16: Domains of definition

1.0

x

22

Introduction to the Theory of Optimization in Euclidean Space

Graphs and Level Curves With the aid of monotony, and convexity, sketching the graph of a real function is performed by plotting few points. This is not possible in the case of dimension 3. To get familiar with some sets in R3 , we describe the traces’ method used for plotting graphs of functions of two variables. The method consists on sketching the intersections of the graph (or surface) with well-chosen planes, usually planes that are parallel to the coordinates planes: xy-plane : z = 0

xz-plane : y = 0

yz-plane : x = 0.

These intersections are called traces.

Definition 1.3.2 The graph of a function f : x = (x1 , . . . , xn ) ∈ Df ⊂ Rn −→ z = f (x) ∈ R is the set Gf = {(x, f (x)) ∈ Rn+1 :

x ∈ Df }.

The set of points x in Rn satisfying f (x) = k is called a level surface of f .

When n = 2, a level surface f (x, y) = k is called level curve. It is the projection of the trace Gf ∩[z = k] onto the xy-plane. Drawing level curves of f is another way to picture the values of f . The following examples illustrate how to proceed to graph some surfaces and level curves. Example 3. A cylinder is a surface that consists of all lines that are parallel to a given line and that pass through a given plane curve. Let

E = {(x, y, z), x = y 2 }.

The set E cannot be the graph of a function z = f (x, y) since (1, 1, z) ∈ E for any z, and then (1, 1) would have an infinite number of images. However, we can look at E as the graph of the function x = f (y, z) = y 2 . Moreover, we have  {(x, y, z), x = y 2 , (x, y) ∈ R2 }. E= z∈R

Introduction

23

This means that any horizontal plane z = k (// to the xy plane) intersects the graph in a curve with equation x = y 2 . So these horizontal traces E ∩ [z = k], k ∈ R are parabolas. The graph is formed by taking the parabola x = y 2 in the xy-plane and moving it in the direction of the z-axis. The graph is a parabolic cylinder as it can be seen as formed by parallel lines passing through the parabola x = y 2 in the xy-plane (see Figure 1.17). Note that for any k ∈ R, the level curve z = k is the parabola x = y 2 in the xy plane.

z traces y y 4

Level curve x  y2

x

2

4

2

2

4

x

2

4 2 graph x

 y2

1

y 0 1 2 2

1

z

0

1 2 2 1 0 x

1 2

FIGURE 1.17: Parabolic cylinder Example 4. An Elliptic Paraboloid, in its standard form, is the graph of the function f (x, y) = z = The graph Gf =

x2 y2 + a2 b2 

 (x, y, z),

z∈[0,+∞)

with

a > 0, b > 0.

 x2 y2 + = z a2 b2

y2 x2 + = k in the planes z = k, k  0. a2 b2 By choosing the traces in Table 1.3, we can shape the graph in the space (see Figure 1.18 for a = 2, b = 3): can be seen as the union of ellipses

24

Introduction to the Theory of Optimization in Euclidean Space plane xy (z = 0)

trace point : (0, 0)

xz (y = 0)

parabola : z =

x2 a2

yz (x = 0)

parabola : z =

y2 b2

z=1

ellipse :

x2 y2 + =1 a2 b2

TABLE 1.3: Traces to sketch a paraboloid z

x2 4



y2 9

2 y 0 y 10

x2

levelcurves

4



y2 9

2 k

1.0

k9

5 k4 k1

z

k0 10

5

10

x

2

1.5 z

1.0 0.5

5

0

0.0

y

0.0 2

2

1

1

2

0 x 10

0.5

2.0 5

1 2

0 x

1 2

FIGURE 1.18: Elliptic paraboloid

Note that for any k < 0, the level curves z = k are not defined. For k > 0, x2 y2 the level curves are ellipses √ + √ = 1 centered at the origin. For (a k)2 (b k)2 k = 0, the level curve is reduced to the point (0, 0). Example 5. The Elliptic Cone, in its standard form, is described by the equation x2 y2 with a > 0, b > 0. z2 = 2 + 2 a b y2 x2 It is the union of the graphs of the functions z = ± + 2. 2 a b To sketch the cone, one can make the choice of traces in Table 1.4 (see Figure 1.19 for a = 2, b = 3):

Introduction

25

plane xy (z = 0)

trace point : (0, 0)

xz (y = 0)

lines : z = ±

x a

yz (x = 0)

lines : z = ±

y b

z = ±1

ellipse :

x2 y2 + =1 a2 b2

TABLE 1.4: Traces to sketch a cone z2 

x2 4



y2 9

2

y 0 y 10

x2

levelcurves

4



y2 9

2  k2

1.0

k3

5

0.5

k2 k1 k0 10

5

z

1.0

5

10

x

z

0.0

0.5 0.0

0.5

0.5

2

1.0

1.0

2 5

0

1 x

1 0

2

1 10

2

y

0 x

1 2

2

FIGURE 1.19: Elliptic cone

Note that for any k = 0, the level curves z = ±k are ellipses y2 x2 + = 1 centered at the origin. For k = 0, the level curve is 2 (|k|a) (|k|b)2 reduced to the point (0, 0).

Example 6. The Elliptic Ellipso¨ıd, in its standard form, is described by the equation y2 z2 x2 + + =1 a2 b2 c2

with

a > 0, b > 0, c > 0.

x2 y2 − 2 that one 2 a b can sketch by making the following choice of traces in Table 1.5 (see Figure 1.20 for a = 2, b = 3, c = 4): It is the union of the graphs of the functions z = ±c

1−

26

Introduction to the Theory of Optimization in Euclidean Space plane

trace x2 y2 ellipse : 2 + 2 = 1 a b

xy (z = 0) xz (y = 0)

ellipse :

x2 z2 + =1 a2 c2

yz (x = 0)

ellipse :

y2 z2 + 2 =1 2 b c

TABLE 1.5: Traces to sketch an ellipso¨ıd x2 44 y y 3

9



z2 16

1

1

2

0 1

k0

2

4 4

2

2

y2

0

2

x



2

k  2 y 0

1

2

k  3 2

k  4 3

2

1

1

2

3

z

0

x 4

2 1

2

4 4 z

0

2

2

0 2

3

x 4

2 4

FIGURE 1.20: An elliptic ellipso¨ıd

For k ∈ R, the level curves centered z = ±k are ellipses at the origin with ver k2 k2 k2 k2

tices − a 1 − 2 , a 1 − 2 , − b 1 − 2 , b 1 − 2 in the xy plane. c c c c

Limits and Continuity For the local study of a function, the concept of limit is generalized to functions of several variables as follows

n Definition 1.3.3

Let x0 ∈ R and let f be a function defined on Df ∩ Br (x0 ) \ {x0 } . We write lim f (x) = L

⇐⇒



x−→x0

∀ > 0, ∃δ > 0 such that ∀x : 0 < x−x0 < δ =⇒ |f (x)−L| <  .

Introduction

27

Remark 1.3.1 i) The definition above supposes that f is defined in a neighborhood of x0 , except possibly at x0 . It includes points x0 located at the boundary of the domain of f . ii) One can establish, using similar tools in one dimension [2], that the standard properties of limits hold for limits of functions of n variables. iii) If the limit of f (x) fails to exist as x −→ x0 along some smooth curve, or if f (x) has different limits as x −→ x0 along two different smooth curves, then the limit of f (x) does not exist as x −→ x0 .

Example 7. • lim xi = ai ,

a = (a1 , · · · , an ) ∈ Rn .

i = 1, · · · , n

x−→a

Indeed, for  > 0, choose δ =  > 0. Then, we have for x satisfying x − a < δ

|xi − ai |  x − a < δ = .

=⇒

• Algebraic operations on limits. lim

(x,y,z)−→(1,2,3)

=

lim

3xy 2 + z − 5

(x,y,z)−→(1,2,3)

= 3[

lim

[3xy 2 ] +

(x,y,z)−→(1,2,3)

x] . [

lim

(x,y,z)−→(1,2,3)

lim

(x,y,z)−→(1,2,3)

• The limit

2x2 y + y2

lim

(x,y)→(0,0) x4

z −

lim

(x,y,z)−→(1,2,3)

5

y]2 + 3 − 5 = 3(1)(2)2 + 3 − 5 = 10.

does not exist.

Indeed, if we consider the smooth curves C1 and C2 with equations y = x2 and y = x respectively, we find that lim

(x,y)→(0,0)(x,y)∈C

1

2x2 y 2x2 x2 2x4 = lim 4 = lim 4 = 1, 2 2 2 x→0 x + (x ) x→0 2x +y

x4

2x2 y 2x2 x 2x = 0, = lim = lim 2 4 2 4 2 x→0 x→0 x +x x +1 (x,y)→(0,0)(x,y)∈C x + y 2 lim

the limits have different values along C1 and C2 (see Figure 1.21).

28

Introduction to the Theory of Optimization in Euclidean Space z

2 x2 x4  y2

15

0.5

10z 5 0 0.5

0.0 y

0.0 x

0.5

0.5

FIGURE 1.21: Behavior of f (x, y) =

2x2 y near (0, 0) x4 + y 2
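The two-path behavior can also be observed numerically. The sketch below (illustrative only) evaluates f(x, y) = 2x²y/(x⁴ + y²) along the parabola y = x² and along the line y = x as x → 0; the values tend to 1 and 0 respectively, so the limit at (0, 0) does not exist.

    def f(x, y):
        # f(x, y) = 2 x^2 y / (x^4 + y^2), defined away from the origin
        return 2 * x**2 * y / (x**4 + y**2)

    for x in [0.1, 0.01, 0.001]:
        print(f(x, x**2), f(x, x))   # first value is 1 (along y = x^2), second tends to 0 (along y = x)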

Definition 1.3.4 Let f be a function defined on Df ⊂ Rn . Then ⎧ ⎪ f (x0 ) is defined and ⎨ f is continuous at x0 ⇐⇒ ⎪ ⎩ lim f (x) = f (x0 ). x−→x0

If f is continuous at every point in an open set O, then we say that f is continuous on O.

Remark 1.3.2 A function of n variables that can be constructed from continuous functions by combining the operations of addition, substraction, multiplication, division and composition is continuous wherever it is defined.

Example 8. Give the largest region where f is continuous f (x, y) =

1 . −1

exy

Solution: f is continuous on its domain of definition Df = R2 \ {(x, y) ∈ R2 / x = 0 or y = 0}.

Introduction

29

More precisely, we have ∗ (x, y) −→ x y is continuous on R2 as the product of the function (x, y) −→ x and the function (x, y) −→ y 1 is continuous on Df as the composition of the C 0 −1 1 on function (x, y) −→ xy on R2 and the C 0 function t −→ t e −1 R \ {0} : 1 . (x, y) ∈ Df −→ xy = t ∈ R \ {0} −→ t e −1

∗ ∗ (x, y) −→

exy

First-order Partial Derivatives Our purpose, now, is to generalize the concept of differentiability to functions of several variables. More precisely, we will show that the existence of a line tangent for a real differentiable function f at a point x0 is extended to the existence of an hyperplane for a differentiable function with several variables. First, we introduce some tools:

Definition 1.3.5 If

z = f (x) = f (x1 , · · · , xn ), then the quantity

∂f f (x1 , · · · , xi + h, · · · , xn ) − f (x1 , · · · , xi , · · · , xn ) (x) = lim h−→0 ∂xi h is the partial derivative of f (x1 , · · · , xn ) with respect to xi when all the other variables xj (j = i, i = 1, . . . , n) are held constant.

Remark 1.3.3 - The partial derivative  d ∂f  (a) = [f (a1 , . . . , xi , . . . , an )] , ∂xi dxi xi =ai

i = 1, . . . , n

can be viewed as the slope of the line tangent to the curve Ci : z = f (a1 , . . . , xi , . . . , an ) at the point a, or the rate of change of z with respect to xi along the curve Ci at a.

30

Introduction to the Theory of Optimization in Euclidean Space

- Other notations are : ∂z ∂f = = f x i = zx i ∂xi ∂xi

i = 1, · · · , n.

- We call gradient of f the vector ∇f (x) = fx1 , fx2 , · · · , fxn  = f  (x).

f (w, x, y, z) = xeyw sin z.

Example 9. Let fx (1, 2, 3, π/2),

fy (1, 2, 3, π/2),

Find

fz (1, 2, 3, π/2)

and

fw (1, 2, 3, π/2).

Solution: We have fx = eyw sin z fy = xweyw sin z fz = xeyw cos z fw = xyeyw sin z

fx (1, 2, 3, π/2) = eyw sin z

 (w,x,y,z)=(1,2,3,π/2)

fy (1, 2, 3, π/2) = xweyw sin z fz (1, 2, 3, π/2) = xeyw cos z

= e3



(w,x,y,z)=(1,2,3,π/2)



(w,x,y,z)=(1,2,3,π/2)

fw (1, 2, 3, π/2) = xyeyw sin z

= 2e3

=0



(w,x,y,z)=(1,2,3,π/2)

= 6e3 .

Example 10. The rate of change of the (BMI) body mass index function B(w, h) = w/h2 with respect of the weight w at a constant height h is 1 ∂B = 2 > 0. ∂w h Thus, at constant height, people’s BMI differs by a factor of 1/h2 . The rate of change of the BMI with respect of the height h at a constant weight w is 2w ∂B = − 3 < 0. ∂h h Therefore, with similar weight, people’s BMI is a decreasing function of the height.

Introduction

31

Higher Order Partial Derivatives • Each partial derivative is also a function of n variables. These functions may themselves have partial derivatives, called second order derivatives. For each i = 1, . . . , n, we have ∂2f ∂ ∂f

= = fxi xj . ∂xj ∂xi ∂xj ∂xi The n second-order partial derivatives fxi xi are called direct second-order partial; the others, fxi xj where i = j, are called mixed second-order partial. Usually these second-order partial derivatives are displayed in an n × n matrix named the Hessian ⎡ ⎤ fx1 x1 fx1 x2 . . . f x1 xn ⎢ fx2 x1 fx2 x2 . . . f x2 xn ⎥ ⎢ ⎥ Hf (x) = (fxi xj )n×n = ⎢ ⎥ .. .. .. .. ⎦ ⎣ . . . . f xn x1

fxn x2

...

fxn xn

• The mixed derivatives are equal in the following situation [15]

Theorem 1.3.1 Clairaut’s theorem Let f (x) = f (x1 , x2 , · · · , xn ). If fxi xj and fxj xi , i = j for i, j ∈ {1, · · · , n} are defined on a neighborhood of a point a ∈ Rn and are continuous at a then fxi xj (a) = fxj xi (a).

• Third-order, fourth-order and higher-order partial derivatives can be obtained by successive differentiation. Clairaut’s theorem reduces the steps of calculations when the continuity assumption is satisfied. Example 11. Write the Hessian of the Cobb-Douglas function Q(L, K) = cLa K b

(c, a, b are positive constants)

where the two inputs are labor L and capital K. Solution: For L, K > 0, we have ln Q = ln c + a ln L + b ln K

32

Introduction to the Theory of Optimization in Euclidean Space QL a ∂(ln Q) = = ∂L Q L

=⇒

QK b ∂(ln Q) = = ∂K Q K QLL = QKK =

QL =

=⇒

a Q L

QK =

b Q K

a

a a

a(a − 1) a a

QL + − 2 Q = Q+ − 2 Q= Q L L L L L L2 b

b b

b(b − 1) b b

QK + − 2 Q = Q+ − 2 Q= Q K K K K K K2

a ab QK = Q. L LK The Hessian matrix of Q is given by: QKL = QLK =



 HQ (L, K) =

QLL QKL

QLK QKK

a(a − 1)  ⎢ L2 ⎢ = Q⎢ ⎣ ab LK

ab LK



⎥ ⎥ ⎥. b(b − 1) ⎦ K2

Example 12. Laplace’s equation of a function u = u(x1 , . . . , xn ) is u =

∂2u ∂2u ∂2u + + ... + = 0. 2 2 ∂x1 ∂x2 ∂x2n

For which value of k, the function u = (x21 + x22 + . . . + x2n )k satisfies Laplace’s equation? Solution: We have ∂u = 2xi k(x21 + x22 + . . . + x2n )k−1 ∂xi ∂2u = 2k(x21 + x22 + . . . + x2n )k−1 + 4x2i k(k − 1)(x21 + x22 + . . . + x2n )k−2 ∂x2i u =

∂2u ∂2u ∂2u + + ... + 2 2 ∂x1 ∂x2 ∂x2n

= 2kn(x21 + x22 + . . . + x2n )k−1 + 4k(k − 1) n

 x2i (x21 + x22 + . . . + x2n )k−2 × i=1

= 2k[n + 2(k − 1)](x1² + x2² + ... + xn²)^(k−1). Thus Δu = 0 if n + 2(k − 1) = 0, i.e. for k = 1 − n/2.
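This computation can be confirmed symbolically for a particular dimension. The sketch below (assuming SymPy; n = 3 is chosen only for illustration) forms the Laplacian of u = (x1² + x2² + x3²)^k, checks the identity Δu = 2k[n + 2(k − 1)](x1² + x2² + x3²)^(k−1), and verifies that it vanishes for k = 1 − n/2 = −1/2.

    import sympy as sp

    x1, x2, x3 = sp.symbols("x1 x2 x3", real=True)
    k = sp.symbols("k")
    r2 = x1**2 + x2**2 + x3**2
    u = r2**k

    lap = sum(sp.diff(u, v, 2) for v in (x1, x2, x3))            # Laplacian, n = 3
    print(sp.simplify(lap - 2*k*(3 + 2*(k - 1)) * r2**(k - 1)))  # 0: the identity above
    print(sp.simplify(lap.subs(k, sp.Rational(-1, 2))))          # 0: u is harmonic for k = -1/2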

Introduction

33

Differentiability While the existence of a derivative of a one variable function at a point guarantees the continuity of the function at this point, the existence of partial derivatives for a function of several variables doesn’t. Indeed, for example ⎧ if x > 0 and y > 0 ⎨ 2 f (x, y) = ⎩ 0 if not has partial derivatives at (0, 0) since f (h, 0) − f (0, 0) 0−0 = lim = lim 0 = 0, h→0 h→0 h→0 h h

fx (0, 0) = lim

f (0, h) − f (0, 0) =0 h but f is not continuous at (0, 0) since fy (0, 0) = lim

h→0

lim f (t, t) = lim 2 = 2 = 0 = f (0, 0).

t→0+

t→0+

This motivates, the following definition

Definition 1.3.6 A function of n variables is said to be differentiable at a = (a1 , . . . , an ) provided that fxi (a), i = 1, . . . , n exist and that there exists a function ε : R+ −→ R such that: f (x) = f (a) + fx1 (a)(x1 − a1 ) + . . . + fxn (a)(xn − an ) + x − a ε( x − a ) with lim ε( x − a ) = 0.

x−→a

Remark 1.3.4 The definition extends the concept of differentiability of functions of one variable to functions of n variables in such a way that we preserve properties like: - f continuous at a; - the values of f at points near a can be very closely approximated by the values of a linear function: f (x) ≈ f (a) + fx1 (a)(x1 − a1 ) + . . . + fxn (a)(xn − an ).

34

Introduction to the Theory of Optimization in Euclidean Space

The next theorem provides particular conditions for a function f to be differentiable.

Theorem 1.3.2 If all first-order partial derivatives of f exist and are continuous at a point, then f is differentiable at that point.

If f has continuous partial derivatives of first-order in a domain D, we call f continuously differentiable in D. In this case, f is also called a C 1 function on D. If all partial derivatives up to order k exist and are continuous, f is called a C k function. Example 13. Use the linear approximation to estimate the change of the Cobb-Douglas production function Q(L, K) = L1/3 K 2/3

from

Solution: We have 1 2 Q, QK (L, K) = Q, QL (L, K) = 3L 3K 1 Q(20, 10), QL (20, 10) = 3(20) Thus, close to (20, 10), we have

(20, 10)

to

(20.6, 10.3).

Q(20, 10) = 201/3 102/3 = 10(21/3 ), QK (20, 10) =

2 Q(20, 10) 3(10)

Q(L, K) ≈ Q(20, 10) + QL (20, 10)(L − 20) + QK (20, 10)(K − 10)

2 1 = 1 + (L − 20) + (K − 10) Q(20, 10) 60 30 from which we deduce the estimate

2 1 Q(20.6, 10.3) ≈ 1+ (20.6−20)+ (10.3−10) Q(20, 10) = 1.003 Q(20, 10). 60 30 Another consequence of the differentiability is the chain rule for derivation under composition.

Theorem 1.3.3 Chain rule 1 If f is differentiable at x = (x1 , x2 , . . . , xn ) and each xj = xj (t), j = 1, . . . , n, is a differentiable function of a variable t, then z = f (x(t)) is differentiable at t and ∂z dx1 ∂z dx2 ∂z dxn dz = + + ...... + . dt ∂x1 dt ∂x2 dt ∂xn dt

Introduction

35

Proof. Since f is differentiable at the point a, then, for x(t) close to a = x(t0 ), we have f (x(t)) − f (a) = fx1 (a)(x1 (t) − a1 ) + · · · + fxn (a)(xn (t) − an ) + x(t) − a ε( x(t) − a )

with

lim ε( x − a ) = 0.

x−→a

Dividing each side of the equality by Δt = t − t0 , we obtain x (t) − a

x (t) − a

f (x(t)) − f (a) 1 1 n n = fx1 (a) + . . . . . . + fxn (a) Δt Δt Δt  x(t) − a    +  ε( x(t) − a ). Δt Then letting t −→ t0 and using the fact that each xj = xj (t), j = 1, . . . , n, is a differentiable function of the variable t and that lim ε( x − a ) = 0, then x−→a

x (t) − a

f (x(t)) − f (a) 1 1 = fx1 (a). lim + ... t−→t0 t−→t0 Δt Δt x (t) − a  x(t) − a    n n + fxn (a). lim +  lim  . lim ( x(t) − a ) t−→t0 t−→t0 t−→t0 Δt Δt from which we deduce that lim

 dx  dxn d(f (x(t)))  dx1   (t0 ) + . . . . . . + fxn (a). (t0 ) +  (t0 ).0 = fx1 (a).  dt dt dt dt t=t0 and the result follows. In the general situation, each variable xi is a function of m independent variables t1 , t2 , . . . , tm . Then z = f (x(t1 , t2 , . . . , tm )) is a function of ∂z , we hold ti with i = j fixed and compute the t1 , t2 , . . . , tm . To compute ∂tj ordinary derivative of z with respect to tj . The result is given by the following theorem:

Theorem 1.3.4 Chain rule 2 If f is differentiable at x = (x1 , x2 , . . . , xn ) and each xj = xj (t1 , t2 , · · · , tm ), j = 1, · · · , n, is a differentiable function of m variables t1 , t2 , . . . , tm , then z = f (x(t1 , t2 , . . . , tm )) is differentiable at (t1 , t2 , . . . , tm ) and ∂z ∂x1 ∂z ∂x2 ∂z ∂xn ∂z = + + ...... + . ∂ti ∂x1 ∂ti ∂x2 ∂ti ∂xn ∂ti

36

Introduction to the Theory of Optimization in Euclidean Space

Example 14. Let f (x, y) = x2 − 2xy + 2y 3 ,

x = s ln t,

y = s t.

Use the chain rule formula to find ∂f , ∂s

∂f , ∂t

∂f |s=1,t=1 ∂s

and

∂f |s=1,t=1 . ∂t

Solution: i) We have ∂f = 2x − 2y, ∂x

∂f = −2x + 6y 2 , ∂y

x = x(s, t),

∂x = ln t, ∂s

y = y(s, t),

∂y = t, ∂s

∂x s = , ∂t t ∂y = s. ∂t

Hence the partial derivatives of f at (s, t) are: ∂f ∂x ∂f ∂y ∂f = . + . = (2x − 2y) ln t + (−2x + 6y 2 )t ∂s ∂x ∂s ∂y ∂s = (2s ln t − 2st) ln t + (−2s ln t + 6s2 t2 )t

∂f ∂x ∂f ∂y s ∂f = . + . = (2x − 2y) + (−2x + 6y 2 )s ∂t ∂x ∂t ∂y ∂t t s = (2s ln t − 2st) + (−2s ln t + 6s2 t2 )s. t ii) When s = 1 and t = 1, we have x(1, 1) = (1) ln(1) = 0,

and

y(1, 1) = 1.

Thus the partial derivatives of f at (s, t) = (1, 1) are: 

∂f  ∂s 

s=1,t=1



∂f  ∂t 

s=1,t=1

  = (2x(s, t) − 2y(s, t)) ln t + (−2x(s, t) + 6y 2 (s, t))t   = (2x(s, t) − 2y(s, t)) st + (−2x(s, t) + 6y(s, t)2 )s

s=1,t=1

s=1,t=1

= 6

= 4.

Introduction

37

Solved Problems

1. – Sketch the domains of definition of the functions given by the following formulas:   ii) f (x, y, z) = z (1 − x2 )(y 2 − 4) i) f (x, y) = e2x y − x2  iii) H(x, y, z) = z − x2 − y 2 .

Solution: 2 2 Df : 1 5  x  y  4  0

DH : z  x2  y2

y

2

y

0

0 2

5 5 10

y z 8

5 0

z

y  x2  0

Df

0

6 5

4

5

5

2

2

0

0

x 3

2

1

1

2

3

x

x

2

5

FIGURE 1.22: Domains of definitions i)

f (x, y) = e2x

 y − x2 Df = {(x, y) ∈ R2 :

y − x2  0}

the plane region located above the parabola, including the parabola. ii)

f (x, y, z) = z

 (1 − x2 )(y 2 − 4)

Df = {(x, y, z) ∈ R3 : x 1 − x2 y2 − 4

-2 − +

(1 − x2 )(y 2 − 4)  0}

-1 − −

1 + −

2 − −

− +

38

Introduction to the Theory of Optimization in Euclidean Space

so



 = [−1, 1] × (−∞, −2] ∪ [2, +∞) × R 

 ∪ (−∞, −1] ∪ [1, +∞) × [−2, 2] × R .

Df

iii)

H(x, y, z) =

 z − x2 − y 2

DH = {(x, y, z) ∈ R3 :

z − x2 − y 2  0}

set of points bounded by the paraboloid z = x2 +y 2 , including the paraboloid. The three domains are illustrated in Figure 1.22.

2. – Match the functions with their graphs in Figure 1.23. a.

y − z2 = 0

d.

x2 +

b.

x+y+z =0

y2 − z2 = 1 9

x2 +

e.

c.

4x2 +

y2 + z2 = 1 9

f.

z − y2 = 0

y2 = z2 9

2 2

2

y

y

y 0

0

1

0 1

2

2

2

1.0 4

2

z

0.5 z

z0.0

0

2 0.5

2

0

1.0

2

1.0

1

0.5

2

0.0 x

0

A

x 2 3 y

B

0 0.5 1.0

2

1

2

y 0

2

0

5

1.0 1.0 0.5

1

0.5 z0.0

0

z0.0

0.5

1

0.5

1.0

2

1.0

1.0 0.5

2

D

1

5

y 0

2

2

z

x

C

0.0 x

0 x

2

E

1 0.5 1.0

F

FIGURE 1.23: Surfaces in R3

x

0 1

Introduction

39

Solution: equation of the surf ace a.

b.

its graph

why?

y − z2 = 0

(D)

parabolic cylinder in the direction of the x − axis, located in y  0

x+y+z =0

(A)

a plane

c.

4x2 +

y2 9

+ z2 = 1

(E)

ellipsoid centered at (0, 0, 0)

d.

x2 +

y2 9

− z2 = 1

(F )

the traces at z = −1, 0, 1 are ellipses

(B)

elliptic cone

(C)

parabolic cylinder in the direction of the x − axis, located in z  0

e. f.

x2 +

y2 9

= z2

z − y2 = 0

3. – Sketch the graphs of the following functions:   i) f (x, y) = 81 − x2 ii) f (x, y) = 3 iii) f (x, y) = − x2 + y 2 .

Solution: i) Df

z

15

81  x2

10

y 0 10

10 15

5

10

0 z

5

5

0

10

10 0

15

x 15

10

5

0

5

10

15

10

FIGURE 1.24: Domain and graph of z =



81 − x2

40

Introduction to the Theory of Optimization in Euclidean Space

Domain of f : Df = {(x, y) ∈ R2 : 81 − x2  0} = {(x, y) ∈ R2 : |x|  9} Graph of f : Gf = {(x, y, z) ∈ R3 : ∃(x, y) ∈ Df such that z = = {(x, y, z) ∈ R3 :

 81 − x2 }

∃(x, y) ∈ Df such that x2 + z 2 = 81,

z  0}.

It is the half circular cylinder located in the z  0 with radius 9 and axis the y axis (see Figure 1.24). ii) Df = {(x, y) ∈ R2 :

Domain of f :

Gf = {(x, y, z) ∈ R3 :

Graph of f :

f (x, y) = 3 ∈ R} = R2 ∃(x, y) ∈ Df such that z = 3}

It is the plane passing through (0, 0, 3) with normal vector k = 0, 0, 1 (see Figure 1.25). iii) Df = {(x, y) ∈ R2 : x2 + y 2  0} = R2

Domain of f :

Gf = {(x, y, z) ∈ R3 : ∃(x, y) ∈ Df such that  z = − x2 + y 2 }

Graph of f :

= {(x, y, z) ∈ R3 : ∃(x, y) ∈ R2 such that z 2 = x2 + y 2 , z  0} The graph is the part of the circular cone z 2 = x2 + y 2 located in the region [z  0]; see Figure 1.25. 5

FIGURE 1.25: Graph of z = 3 and graph of z = −√(x² + y²)


4. – Match the surfaces with the level curves in Figure 1.26.


FIGURE 1.26: Surfaces and their level curves

Solution:

level curves:  (1)  (2)  (3)  (4)  (5)  (6)
surface:       (E)  (F)  (C)  (A)  (D)  (B)


5. – Draw a set of level curves for the following functions:

i) z = x² + y   ii) f(x, y, z) = (x − 2)² + y² + z².

Solution: i) We have Df = {(x, y) ∈ R² : x² + y ∈ R} = R², and

z = x² + y = k  ⟺  y = k − x²,

a parabola with vertex (0, k) and the y axis as its axis; see Figure 1.27.

ii) The level surface (see the second graph in Figure 1.27)

(x − 2)² + y² + z² = k

reduces to: the point (2, 0, 0) if k = 0; the sphere centered at (2, 0, 0) with radius √k if k > 0; no points if k < 0.
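The level curves x² + y = k can also be drawn numerically. The Python/Matplotlib sketch below is offered only as an illustration; the chosen levels are arbitrary.

```python
import numpy as np
import matplotlib.pyplot as plt

x = np.linspace(-3, 3, 200)
y = np.linspace(-5, 5, 200)
X, Y = np.meshgrid(x, y)
Z = X**2 + Y                      # z = x^2 + y

# Each contour z = k is the parabola y = k - x^2
cs = plt.contour(X, Y, Z, levels=[-4, -2, 0, 2, 4])
plt.clabel(cs, inline=True)
plt.xlabel('x'); plt.ylabel('y')
plt.show()
```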


FIGURE 1.27: Level curves x2 +y = k and level surfaces (x−2)2 +y 2 +z 2 = k

6. – Sketch the largest region on which the function is continuous. Explain why the function is continuous.

f(x, y, z) = √(y − x²) ln z.


Solution: f is continuous on its domain of definition

Df = {(x, y, z) ∈ R³ : y − x² ≥ 0 and z > 0}

because it is the product of the two continuous functions:

∗ u : (x, y, z) ⟼ ln z, continuous on D1 = {(x, y, z) : z > 0} with values in R, as the composite of the polynomial function (x, y, z) ∈ D1 ⟼ z ∈ R⁺ \ {0} and the function t ⟼ ln t, continuous on R⁺ \ {0}; we have (x, y, z) ∈ D1 ⟼ z = t ∈ R⁺ \ {0} ⟼ ln t.

∗∗ v : (x, y, z) ⟼ √(y − x²), continuous on D2 = {(x, y, z) : y − x² ≥ 0}, as the composite of the polynomial function (x, y, z) ∈ D2 ⟼ y − x² ∈ R⁺ and the function t ⟼ √t, continuous on R⁺; we have (x, y, z) ∈ D2 ⟼ y − x² = t ∈ R⁺ ⟼ √t.

∗∗∗ f = u·v is continuous on D1 ∩ D2 = Df, the set in Figure 1.28.

FIGURE 1.28: Domain of continuity of f(x, y, z) = √(y − x²) ln z

7. – Let f(x, y, z) = x²y² − y³ + 3x⁴ + xe^{−2z} sin(πy) + 5. Find

(a) fxy   (b) fyz   (c) fxz   (d) fzz
(e) fzyy   (f) fxxy   (g) fzyx   (h) fxxyz.


Solution: Note that f is indefinitely differentiable. Therefore, we can change the order of differentiation with respect to the variables by using Clairaut’s theorem.

fx = 2xy² + 12x³ + e^{−2z} sin(πy)
fy = 2x²y − 3y² + πxe^{−2z} cos(πy)
fz = −2xe^{−2z} sin(πy)

(a) fxy = (fx)y = 4xy + πe^{−2z} cos(πy)
(b) fyz = (fy)z = −2πxe^{−2z} cos(πy)
(c) fxz = (fx)z = −2e^{−2z} sin(πy)
(d) fzz = (fz)z = 4xe^{−2z} sin(πy)
(e) fzyy = (fzy)y = (fyz)y = 2π²xe^{−2z} sin(πy)
(f) fxxy = (fx)xy = (fx)yx = (fxy)x = 4y
(g) fzyx = (fzy)x = (fyz)x = −2πe^{−2z} cos(πy)
(h) fxxyz = (fxxy)z = (4y)z = 0.
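The equality of mixed partial derivatives asserted by Clairaut’s theorem can also be verified symbolically. The Python/SymPy sketch below is an illustration, not part of the text; it checks two of the answers above.

```python
import sympy as sp

x, y, z = sp.symbols('x y z')
f = x**2*y**2 - y**3 + 3*x**4 + x*sp.exp(-2*z)*sp.sin(sp.pi*y) + 5

# Clairaut's theorem: the order of differentiation does not matter for smooth f
fzyx = sp.diff(f, z, y, x)
fxyz = sp.diff(f, x, y, z)
print(sp.simplify(fzyx - fxyz))             # 0
print(sp.simplify(sp.diff(f, x, x, y, z)))  # 0, as found in (h)
```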

8. – Show that u = ln(x² + y²) satisfies the Laplace equation

∂²u/∂x² + ∂²u/∂y² = 0.

Show, without calculation, that ∂²u/∂x∂y = ∂²u/∂y∂x.

Solution: We have

∂u/∂x = 2x/(x² + y²),   ∂u/∂y = 2y/(x² + y²),

∂²u/∂x² = 2[(1)(x² + y²) − x(2x)]/(x² + y²)² = 2(y² − x²)/(x² + y²)²,   ∂²u/∂y² = 2(x² − y²)/(x² + y²)²,

∂²u/∂x² + ∂²u/∂y² = 2(y² − x²)/(x² + y²)² + 2(x² − y²)/(x² + y²)² = 0.

Note that ∂u/∂x is a rational function, so ∂²u/∂y∂x is also a rational function. As a consequence, ∂²u/∂y∂x is continuous on R² \ {(0, 0)}. In the same way, since ∂u/∂y is a rational function, ∂²u/∂x∂y is also a rational function; therefore ∂²u/∂x∂y is continuous on R² \ {(0, 0)}.

From Clairaut’s Theorem, the two second derivatives uxy and uyx are equal on R² \ {(0, 0)}.
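A symbolic check of the Laplace equation is immediate; the Python/SymPy sketch below is given only as an illustration.

```python
import sympy as sp

x, y = sp.symbols('x y')
u = sp.log(x**2 + y**2)

laplacian = sp.diff(u, x, 2) + sp.diff(u, y, 2)
print(sp.simplify(laplacian))                            # 0
print(sp.simplify(sp.diff(u, x, y) - sp.diff(u, y, x)))  # 0
```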

9. – Find the value dw/ds |_{s=0} if

w = x²e^{2y} cos(3z);   x = cos s,   y = ln(s + 2),   z = s.

Solution: We have x = x(s), y = y(s), z = z(s) and w = w(x, y, z). Then

dx/ds = −sin s,   dy/ds = 1/(s + 2),   dz/ds = 1,

∂w/∂x = 2xe^{2y} cos(3z),   ∂w/∂y = 2x²e^{2y} cos(3z),   ∂w/∂z = −3x²e^{2y} sin(3z),

x(0) = 1,   y(0) = ln 2,   z(0) = 0.

dw/ds = (∂w/∂x)(dx/ds) + (∂w/∂y)(dy/ds) + (∂w/∂z)(dz/ds)
      = [2xe^{2y} cos(3z)](−sin s) + [2x²e^{2y} cos(3z)] (1/(s + 2)) + [−3x²e^{2y} sin(3z)](1),

dw/ds |_{s=0} = e^{2 ln 2} = 4.
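The value dw/ds |_{s=0} = 4 can also be confirmed numerically with a central difference; the Python sketch below is an illustration with an arbitrary step size h.

```python
import math

def w(s):
    x, y, z = math.cos(s), math.log(s + 2), s
    return x**2 * math.exp(2*y) * math.cos(3*z)

h = 1e-6
# Central-difference approximation of dw/ds at s = 0
print((w(h) - w(-h)) / (2*h))   # approximately 4.0
```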

10. – Let R = ln(u² + v² + w²),   u = x + 2y,   v = 2x − y,   w = 2xy.

Find ∂R/∂x |_{x=1, y=0} and ∂R/∂y |_{x=1, y=0}.

Introduction to the Theory of Optimization in Euclidean Space

Solution: We have 2u ∂R = 2 , ∂u u + v 2 + w2

2v ∂R = 2 , ∂v u + v 2 + w2

∂u = 1, ∂x

∂v = 2, ∂x

∂u = 2, ∂y

∂v = −1, ∂y

2w ∂R = 2 , ∂w u + v 2 + w2 ∂w = 2y, ∂x ∂w = 2x. ∂y

The partial derivatives of R are: ∂R ∂u ∂R ∂v ∂R ∂w 2u + 4v + 4wy ∂R = . + . + . = 2 ∂x ∂u ∂x ∂v ∂x ∂w ∂x u + v 2 + w2 ∂R ∂u ∂R ∂v ∂R ∂w 4u − 2v + 4wx ∂R = . + . + . = 2 . ∂y ∂u ∂y ∂v ∂y ∂w ∂y u + v 2 + w2 When x = 1 and y = 0, we have u=1

v = 2,

u2 + v 2 + w2 = 5.

w = 0,

Thus ∂R 4(1) − 2(2) + 4(0) = = 0. ∂y 5

2(1) + 4(2) + 4(0) ∂R = = 2, ∂x 5

11. – Use the linear approximation of f (x, y, z) = x3 (2, 3, 4) to estimate the number  (1.98)3 (3.01)2 + (3.97)2 .



y 2 + z 2 at the point

Solution: Since f is differentiable at the point (2, 3, 4), the linear approximation of L(x, y, z) at the point (2, 3, 4) is given by: L(x, y, z) = f (2, 3, 4) + fx (2, 3, 4)(x − 2) + fy (2, 3, 4)(y − 3) + fz (2, 3, 4)(z − 4). We have fx = 3x2



y2 + z2 ,

fy = 

yx3 y2 + z2

,

fz = 

zx3 y2 + z2

Introduction

47

and fx (2, 3, 4) = 60,

f (2, 3, 4) = 40,

fy (2, 3, 4) =

24 , 5

fz (2, 3, 4) =

32 . 5

Thus 24 32 12 (x − 2) + (y − 3) + (z − 4). 5 5 5 Using this approximation, one obtain the following estimate: L(x, y, z) = 40 +

(1.98)3



(3.01)2 + (3.97)2 ≈ L(1.98, 3.01, 3.97) 24 32 = 40 + 60(1.98 − 2) + (3.01 − 3) + (3.97 − 4)3 5 5 32 24 = 40 + 60(−0.02) + (0.01) + (−0.03) = 38.656. 5 5

12. – Determine whether the limit exists. If so, find its value. x4 − x + y − x3 y , x−y (x,y)→(0,0) lim

cos(xy) , (x,y)→(0,0) x + y lim

x − y4 . (x,y)→(1,1) x3 − y 4 lim

Solution: We have i)

x4 − x + y − x3 y x3 (x − y) − (x − y) = lim x−y x−y (x,y)→(0,0) (x,y)→(0,0) lim

= ii)

lim

(x,y)→(0,0)

x3 − 1 = −1.

cos(xy) doesn’t exist since (x,y)→(0,0) x + y lim

cos(xy) cos(t2 ) = lim = +∞. 2t (x,y)=(t,t),t>0→(0,0) x + y t→0+ cos(xy) cos(t2 ) and lim = lim = −∞. 2t (x,y)=(t,t),t 0

f (x)  f (x∗ )

such that

(resp. )

∀x ∈ Br (x∗ ) ∩ S.

– a strict local maximum (resp. minimum) of f if ∃r > 0 such that f (x) < f (x∗ ) (resp. >) ∀x ∈ Br (x∗ )∩S,

x = x∗ .

– a global maximum (resp. minimum) of f if f (x)  f (x∗ )

(resp. )

∀x ∈ S.

– a strict global maximum (resp. minimum) of f if f (x) < f (x∗ )

∀x ∈ S,

(resp. >)

x = x∗ .

Remark 2.1.1 Note that a global extreme point is also a local extreme point when S is an open set, but the converse is not always true.

Indeed, suppose, for example, that x∗ is such that min f (x) = f (x∗ ) S

then

f (x)  f (x∗ )

∀x ∈ S.



Because S is an open set and x ∈ S, there exists a ball Br (x∗ ) such that Br (x∗ ) ⊂ S, and then, in particular, f (x)  f (x∗ )

∀x ∈ Br (x∗ )

which shows that x∗ is a local minimum. To show that the converse is not true, consider the function f (x) = x3 − 3x. The study of the variations of f , in Table 2.1, and its graph, in Figure 2.1, show that f has a local minimum at x = 1 and a local maximum at x = −1, but none of them is a global maximum or a global minimum, as we have f  (x) = 3x2 −3

f  (x) = 6x

lim f (x) = +∞

x→+∞

lim f (x) = −∞.

x→−∞

Now, here is a characterization of a local extreme point for a regular objective function.

Unconstrained Optimization x f  (x) f (x) f  (x) f is

−∞ −∞

−1 +  − concave

51

0 − 

2 − concave

+∞

1 −2

+ 

+ convex

+∞ + convex

TABLE 2.1: Study of f (x) = x3 − 3x y 3

2

y  x3  3 x

1

3

2

1

1

2

3

x

1

2

3

FIGURE 2.1: Local extreme points but not global ones

Theorem 2.1.1 Necessary condition for local extreme points Let S ⊂ Rn and f : S −→ R be a differentiable function at an interior ◦

point x∗ ∈ S. Then x∗

is a local extreme point

=⇒

∇f (x∗ ) = 0.

Proof. Suppose f has a local minimum at x∗ . Since f is differentiable at x = (x∗1 , x∗2 , . . . , x∗n ), its first derivatives exist. From the definition of the partial derivative, we have, for j ∈ {1, . . . , n} ∗

f (x∗1 , . . . , x∗j + t, . . . , x∗n ) − f (x∗1 , . . . , x∗j , . . . , x∗n ) ∂f ∗ (x ) = lim t→0 ∂xj t Because f has an interior local minimum at x∗ , there is an  > 0 such that ∀x ∈ B (x∗ ) ⊂ S

=⇒

f (x)  f (x∗ ).

In particular, for |t| < , we have (x∗1 , . . . , x∗j +t, . . . , x∗n )−x∗ = (x∗1 , . . . , x∗j +t, . . . , x∗n )−(x∗1 , . . . , x∗j , . . . , x∗n )

52

Introduction to the Theory of Optimization in Euclidean Space  = (0, . . . , 0, t, 0, . . . , 0) = 02 + . . . + 02 + t2 + 02 + . . . + 02 = |t| < .

Thus the points (x∗1 , . . . , x∗j + t, . . . , x∗n ) remain inside the ball B (x∗ ) and therefore satisfy f (x∗1 , . . . , x∗j + t, . . . , x∗n )  f (x∗ ) ⇐⇒

f (x∗1 , . . . , x∗j + t, . . . , x∗n ) − f (x∗1 , . . . , x∗j , . . . , x∗n )  0.

Thus, if t is positive, f (x∗1 , . . . , x∗j + t, . . . , x∗n ) − f (x∗1 , . . . , x∗j , . . . , x∗n ) 0 t and letting t → 0+ , we deduce that lim+

t→0

f (x∗1 , . . . , x∗j + t, . . . , x∗n ) − f (x∗1 , . . . , x∗j , . . . , x∗n )  0. t

In the same way, if t is negative, f (x∗1 , . . . , x∗j + t, . . . , x∗n ) − f (x∗1 , . . . , x∗j , . . . , x∗n ) 0 t and letting t → 0− , we deduce that lim

t→0−

f (x∗1 , . . . , x∗j + t, . . . , x∗n ) − f (x∗1 , . . . , x∗j , . . . , x∗n )  0. t

Because lim

t→0+

f (x∗1 , . . . , x∗j + t, . . . , x∗n ) − f (x∗1 , . . . , x∗j , . . . , x∗n ) t

= lim

t→0−

f (x∗1 , . . . , x∗j + t, . . . , x∗n ) − f (x∗1 , . . . , x∗j , . . . , x∗n ) ∂f ∗ = (x ) t ∂xj

∂f ∗ ∂f ∗ ∂f ∗ (x )  0 and (x )  0, and we deduce that (x ) = 0. ∂xj ∂xj ∂xj This holds for each j ∈ {1, . . . , n}. Hence ∇f (x∗ ) = 0.

we have

A similar argument applies if f has a local maximum at x∗ .

Remark 2.1.2 Note that a local extremum can also occur at a point where a function is not differentiable.

Unconstrained Optimization

53

y 3

2

y  x

1

3

2

1

1

2

3

x

1

FIGURE 2.2: A minimum point where f (x) = |x| is not differentiable

• For example, the one variable function f (x) = |x|, illustrated in Figure 2.2, has a local minimum at 0 but f is not differentiable at 0 since we have

lim+

f (x) − f (0) x−0 = lim+ =1 x x x→0

lim

f (x) − f (0) −x − 0 = lim = −1. x x x→0−

x→0

x→0−

Moreover 0 is a global minimum since we have f (x) = |x|  0 = f (0) z 1.0 y

∀x ∈ R.

x2  y2

0.5

0.0 0.5

1.0

1.0 1.0

0.5

z

0.5 0.0

0.0

0.5

1.0 0.5 0.0 x

0.5

1.0 1.0

1.0

0.5

FIGURE 2.3: A minimum point where f (x, y) = tiable

0.0

0.5

1.0

 x2 + y 2 is not differen-

 x2 + y 2 , graphed in Figure 2.3, • The two variables function f (x, y) = attains its minimum value at (0, 0) because we can see that  ∀(x, y) ∈ R2 . f (x, y) = x2 + y 2  0 = f (0, 0) But f is not differentiable at (0, 0) since, for example fx (0, 0) doesn’t exist. Indeed, we have

54

Introduction to the Theory of Optimization in Euclidean Space ⎧ √ if h → 0+ ⎨ 1, |h| h2 − 0 f (0 + h, 0) − f (0, 0) = = −→ ⎩ h h h −1, if h → 0− .

The above remark leads to the following definition.

Definition 2.1.2 Critical point An interior point x∗ of the domain of a function f is a critical point of f if it is a stationary point where ∇f (x∗ ) = 0 or a point where f is not differentiable.

Example 1. (0, 0) is the only stationary point for the functions f and g i)

f (x, y) = x2 + y 2

g(x, y) = 1 − x2 − y 2 .

ii)

It is a local and absolute minimum for f and a local and absolute maximum for g. The values of the level curves are increasing in Figure 2.4, while they are decreasing in Figure 2.5. z  x2  y2

1.0 1.8

1.08

1.26

1.62

1.44

1.8 1.62

1.0

1.44 0.72 1.26

0.5 0.36 1.08 z

0.5 0.0

1.08 0.0

1.08

0.18

0.5

1.0

0.54 1.44

0.5 0.0 x

1.44 1.26

1.8

0.5

1.0 1.0

1.0

0.9

1.62

1.26 0.5

0.0

0.5

1.62

1.8 1.0

FIGURE 2.4: Local minimum point that is a global one Indeed, we have for any (x, y) ∈ R2 f (x, y) = x2 + y 2  0 = f (0, 0)

and

g(x, y) = 1 − (x2 + y 2 )  1 = g(0, 0).

Example 2. In economics, one is interested in maximizing the total profit P (x) in the sale of x units of some product. If C(x) is the total cost of production and R(x) is the revenue function then P (x) = R(x) − C(x).

Unconstrained Optimization

55

z  x2  y2  1

1.0 0.9 0.54

0.18

0.36

0.72

1.5

0.36

0.72 0.9 0.54

0.36 0.5

0.1

1.0 z 0.54

0.0 0 0.5 0.72 0.0

0.1

0.5

1.0

0.36

0.5

0.54

0.0 x

0.72 0.5

1.0 0.9 0.54 1.0

1.0

0.18 0.36

0.18 0.5

0.0

0.5

0.720.9 1.0

FIGURE 2.5: Local maximum point that is a global one

The maximum profit occurs when P  (x) = 0, or R (x) = C  (x). From the linear approximation, we have for Δx = 1, R(x+1)−R(x) ≈ R (x)Δx = R (x),

C(x+1)−C(x) ≈ C  (x)Δx = C  (x),

C(x + 1) − C(x) ≈ R(x + 1) − R(x) that is, the cost of manufacturing an additional unit of a product is approximately equal to the revenue generated by that unit. P  (x), R (x), C  (x) are interpreted respectively as the additional profit, revenue and cost that result from producing one additional unit when the production and sales levels are at x units.

Remark 2.1.3 A function need not have a local extremum at every critical point.

• For example, the one variable function f (x) = x3 has a local critical point since ⇐⇒ x = 0. f  (x) = 3x2 = 0 But 0 is not a local extremum (see Figure 2.6). Indeed we have f (x) = x3 > 0 = f (0)

∀x > 0

and

f (x) = x3 < 0 = f (0)

∀x < 0.

The point 0 is called an inflection point. • The two variables function f (x, y) = y 2 − x2 , graphed in Figure 2.7, has a critical point at (0, 0) since we have ∇f (x, y) = −2x, 2y = 0, 0

⇐⇒

(x, y) = (0, 0).

56

Introduction to the Theory of Optimization in Euclidean Space y 2

y  x3

1

2

1

1

2

x

1

2

FIGURE 2.6: The critical point x = 0 is an inflection point for f (x) = x3

However, the function f has neither a relative maximumnor a relative minimum at (0,0). Indeed, along the x and y axis, we have f (x, 0) = −x2  0 = f (0, 0)

∀x ∈ R and f (0, y) = y 2  0 = f (0, 0)

∀y ∈ R.

The point (0, 0) is called a saddle point. Figure 2.7 shows how the values of the level curves are increasing in one side and decreasing on the other side. 1.0 z y

 y2  x2

0.5

0.0 0.5

1.0

0.54

0.72

0.18

1.0 1.0

0.3 0.18 0.50.72

0.5

0.36

0

0.9 z

0

0.36

0.0 0.0

0.9

0.54

0.5

1.0

0

0.5

1.0

0.54

0.5

0.72

0.18 0.54

0.0 x

0.5

1.0 1.0

1.0

0.36

0.72 0.5

0.0

0.5

0.1 1.0

FIGURE 2.7: (0, 0) is a saddle point for f (x, y) = y 2 − x2

Definition 2.1.3 Saddle point A differentiable function f (x) has a saddle point at a critical point x∗ if in every open ball centered at x∗ there are domain points x where f (x) > f (x∗ ) and domain points x where f (x) < f (x∗ ).

Remark 2.1.4 In two dimensions, the projection of horizontal traces shows circular curves around (x∗ , y ∗ ) when it is a local extreme point, and hyperbolas when the point is a saddle point.

Unconstrained Optimization

57

Now, we give a necessary condition when the extreme point is not necessarily an interior point [5].

Theorem 2.1.2 Necessary condition for a relative extreme point on a convex set Let S ⊂ Ω ⊂ Rn , Ω an open set, S a convex set and f : Ω −→ R be a differentiable function at a point x∗ ∈ S. Then f (x)  f (x∗ ) =⇒

(resp. )

∇f (x∗ ).(x − x∗ )  0

∀x ∈ S (resp.  0)

∀x ∈ S.

Proof. Let x ∈ S, x = x∗ . Since S is convex, θx + (1 − θ)x∗ = x∗ + θ(x − x ) ∈ S for θ ∈ [0, 1]. Suppose f has a relative minimum at x∗ . Since f is differentiable at x∗ , we can write ∗

f (x∗ + θ(x − x∗ )) − f (x∗ ) = θ[ f  (x∗ ).(x − x∗ ) + (θ) ],

lim (θ) = 0.

θ→0

If f  (x∗ ).(x − x∗ ) < 0, then ∃θ0 ∈ (0, 1) : Hence,

∀θ ∈ (0, θ0 )

∀θ ∈ (0, θ0 ),

=⇒

1 |(θ)| < − f  (x∗ ).(x − x∗ ). 2

f (x∗ + θ(x − x∗ )) − f (x∗ )
0, x2 > 0} since f  (x1 , x2 ) = 2x1 −1+x2 , 1+x1  = 0, 0

⇐⇒



(x1 , x2 ) = (−1, 3) ∈ S.

So the minimum value, if it exists, must be attained on the boundary of S. Note that 1 1 1 1 f (x1 , 0) = x21 − x1 = (x1 − )2 −  − = f ( , 0) 2 4 4 2

∀x1  0

and f (0, x2 ) = x2  0 Since −1/4 < 0, the point shown in Figure 2.9.

( 12 , 0)

∀x2  0.

is the global minimum point of f on S, as

At this point   f  (x1 , x2 )

x1 = 12 ,x2 =0

  = 2x1 − 1 + x2 , 1 + x1 

x1 = 12 ,x2 =0

3 = 0,  = 0. 2

Unconstrained Optimization 2 z 10 x  y x  x  y

y

59 2 z 10 x  y x  x  y

5

y

0

5

5 10 10

0 10

5

z

5

z

0

5

0

5

10

10

10

0 5 0 x

5 x

5 10

10

FIGURE 2.9: Min f attained at the boundary of x1  0, x2  0 and 1 3 1 ∇f ( , 0).x1 − , x2 − 0 = x2  0 2 2 2

∀(x1 , x2 ) ∈ S = R+ × R+ .

Remark 2.1.5 * Note that, it is not easy to find the candidate points by solving an inequality ∇f (x∗ ).(x − x∗ )  0 (resp.  0). However, the information gained is useful to establish other results. ** Solving the equation ∇f (x) = 0 is not that easy either! It induces nonlinear equations or large linear systems when the number of variables is large. To overcome this difficulty, we resort to approximate methods. Newton’s method is one of the well known approximate methods for approaching a root of the equation F (x) = 0. In Exercise 5, the method is described and applied for solving a nonlinear equation in one dimension. Steepest descent method, Conjugate gradient methods and many other methods are developed for approaching the solution [22], [5].

∗ ∗ ∗ Finally, the following example, in dimension 2, shows the necessity of using new methods for finding the critical points. Indeed, the graph, Figure 2.10, of z = f (x, y) = 10e−(x

2

+y 2 )

+ 5e−[(x+5)

2

+(y−3)2 ]/10

+ 4e−2[(x−4)

2

+(y+1)2 ]

,

on the window [−10, 8] × [−10, 8] × [−1, 12], shows three peaks. Thus, we have at least three local maxima points. These points are solution of the system ⎧ 2 2 2 2 2 2 fx = −20xe−(x +y ) − 10(x + 5)e−[(x+5) +(y−3) ]/10 − 16(x − 4)e−2[(x−4) +(y+1) ] = 0 ⎪ ⎪ ⎨ −(x2 +y 2 ) − (y − 3)e−[(x+5)2 +(y−3)2 ]/10 − 16(y + 1)e−2[(x−4)2 +(y+1)2 ] = 0, ⎪ ⎪ ⎩fy = −20ye

60

Introduction to the Theory of Optimization in Euclidean Space

a nonlinear system, for which it is not evident to find an explicit solution by algebraic manipulations. The following Maple software command searches for a solution near (0, 0) using an approximate method: f := (x, y)− > 10 ∗ exp(−x2 − y 2 ) + 5 ∗ exp(−((x + 5)2 + (y − 3)2 ) ∗ (1/10)) +4 ∗ exp(−2 ∗ ((x − 4)2 + (y + 1)2 )) with(Optimization) : N LP Solve(f (x, y), x = −8..8, y = −8..8, initialpoint = x = 0, y = 0, maximize);

The result is [10.1678223807097599, [x = −0.842598632890276e − 2, y = 0.505559179745079e − 2]].

Thus, (−0.084e−2 , 0.5e−2 ) ≈ (−0.115, 0.067) is an approximate critical point, where f takes the approximate local maximal value 10.1678. A search near (−5, 3) and (4, −1) yields to the other approximate local maxima points: with(Optimization); N LP Solve(f (x, y), x = −8..8, y = −8..8, initialpoint = x = −4, y = 3, maximize) [5.00000000000001688, [x = −5.00000000010854, y = 2.99999999999990]] with(Optimization) : N LP Solve(f (x, y), x = −8..8, y = −8..8, initialpoint = x = 4, y = −1, maximize); [4.00030684298145278, [x = 3.99996531847993, y = −.999984626392930]

5

y 0 5 10

5 10

0

z 5

5

0 10 5 x

0

10 5

10

5

FIGURE 2.10: Location of mountains

0

5

Unconstrained Optimization

61

Solved Problems

1. – A suitable choice of the objective function. Find a point on the curve y = x2 that is closest to the point (3, 0).

Solution: When formulating an optimization problem, sometimes, one can encounter some technical difficulties by considering an auxiliary objective function instead of considering the direct one. This situation is illustrated by the two choices below. y 4

3

y  x2

2

1

4

2

2

4

x

1

FIGURE 2.11: Closest point • 1st choice. Let D = distance between (3,0) and any point (x, y). Since (x, y) lies on the curve y = x2 , the distance D must satisfy   D = D(x) = (x − 3)2 + (y − 0)2 = (x − 3)2 + x4 . We need to solve the problem (see Figure 2.11) min D(x). x∈R

Since R is an open set, the minimum must occur at a critical point, i.e., since D is differentiable, at a point where

62

Introduction to the Theory of Optimization in Euclidean Space 2(x − 3) + 4x3 dD =  =0 dx 2 (x − 3)2 + x4 ⇐⇒

2(x − 3) + 4x3 = 2(x − 1)(2x2 + 2x + 3) = 0

⇐⇒

x = 1.

Since D ∈ C 0 (R) and lim D(x) = +∞

and

x→−∞

lim D(x) = +∞,

x→+∞

the minimum exists and it must be at x = 1 [1]. The variations of D is given by Table 2.2. x D (x) D(x)

−∞



+∞



+∞

2 0 D(1)

+ 

TABLE 2.2: Variations of D(x) = Thus min D(x) = D(1) = x∈R

+∞

 (x − 3)2 + x4

√ 5.

• 2nd choice. Note that, for any x0 , x ∈ R, we have   0  D2 (x0 )  D2 (x) ⇐⇒ 0  D(x0 ) = D2 (x0 )  D2 (x) = D(x) √ since t is an increasing function on the interval [0, +∞). It suffices, then, to minimize on R the function F (x) = D2 (x) = (x − 3)2 + x4 . Since R is an open set, the minimum must occur at a critical point, i.e., since F is differentiable, at a point where dF = 2(x − 3) + 4x3 = 0 dx

⇐⇒

2(x − 1)(2x2 + 2x + 3) = 0

⇐⇒

x = 1.

Since F ∈ C 0 (R) and lim F (x) = +∞

x→−∞

and

lim F (x) = +∞,

x→+∞

the minimum exists and it must be at x = 1. The variations of F is given by Table 2.3. The point (1, 1) is the closest point on the curve [y = x2 ] to the point (3, 0).

Unconstrained Optimization x F  (x) = 2(x − 1)(2x2 + 2x + 3) F (x)

−∞ +∞



63 +∞

1 0 F (1)



+ 

+∞

TABLE 2.3: Variations of F (x) = (x − 3)2 + x4

2. – To minimize the material in manufacturing a closed can with volume capacity of V units, we need to choose a suitable radius for the container. Find the radius if the container is cylindrical.

Solution: From Section 1.1, Example 1, we are lead to solve the minimization problem ⎧ 2V ⎪ ⎨ minimize A = A(r) = 2πr2 + over the set S r ⎪ ⎩ S = (0, +∞) = {r ∈ R / r > 0}. Since S is an open set, the minimum must occur at a critical point, ie., since A(r) is differentiable, at a point where 2V dA = 4πr − 2 = 0 dr r 0 Since A ∈ C (S) and

=⇒

lim A(r) = +∞

and

r→0+

r=

V 1/3 2π

∈ S.

lim A(r) = +∞,

r→+∞

V 1/3 the minimum exists and it must be on r = . Indeed the variations of 2π A are as shown in Table 2.4. r A (r) = 4πr − A(r)

2V r2

0 − +∞



√ V 1/3 / 3 2π 0 + A((V /2π)1/3 )

TABLE 2.4: Variations of A(r) = 2πr2 +

+∞ 

+∞

2V r

So we should choose for the can a radius r = (V /2π)1/3 and a height h = V /(2πr) = (V /2π)2/3 .

64

Introduction to the Theory of Optimization in Euclidean Space 3. – Locate all absolute maxima and minima if any for each function. f (x, y) = 1 − (x + 1)2 − (y − 5)2

i)

g(x, y) = 3x − 2y + 5

ii) iii)

h(x, y) = x2 − xy + y 2 − 3y.

Solution: i) 2 2 z  x 7  1  y  5  1

y

6

5 7 6.48

4

5.04

3.6

4.32

5.76

3

5.76 6.48

2.16 5.0

1.0 4.32 6

0 3.6

0.5 1.44 z

0.0

5

0.5 0.72

3.6

1.0

3.6

4

3 2

5.04

5.0

1 x

0

3 1

6.48 5.76

4.32

3

2

4.32

2.88 1

0

6.48 5.76 1

FIGURE 2.12: Graph and level curves of z = f (x, y) Since f is differentiable on R2 , its absolute extremum that are also local extremum (if they exist), are stationary points, ie. solution of ∇f = −2(x + 1), −2(y − 5) = 0, 0

⇐⇒

(x, y) = (−1, 5).

So, there is only one critical point. It satisfies f (−1, 5) = 1  1 − (x + 1)2 − (y − 5)2 = f (x, y)

∀(x, y) ∈ R2 .

Hence, it is a global maximum of f in R2 ; see Figure 2.12. However, f does not have a global minimum since the following hold: (x + 1)2 + (y − 5)2 = (x, y) − (−1, 5) 2 (x, y) − (−1, 5) 2  ( (x, y) + (−1, 5) )2 = (



x2 + y 2 +

√ 2 26)

 2  √   (x, y) − (−1, 5) 2   (x, y) − (−1, 5)  = ( x2 + y 2 − 26)2 .

Unconstrained Optimization Then 1−(

65

  √ √ x2 + y 2 + 26)2  f (x, y)  1 − ( x2 + y 2 − 26)2

and we deduce that lim

(x,y)→+∞

f (x, y) = −∞.

It suffices also to show that f takes large negative values on a subset of its domain R2 , like f (x, 5) = 1 − (x + 1)2 −→ −∞

x −→ ±∞.

as

ii) Since g is differentiable on R2 , its absolute extreme points that are also z 3x2 y5 2

y 0

3

2.7

2

1.0

1

0.5

z

8.1

2.7

8.1

2

5.4 13.5 0

0.0

0.5

1

1.0

10.8

5.4

2 0

2 0

16.2

3

x 2

3

2

1

0

1

2

3

FIGURE 2.13: Graph and level curves of g(x, y) = 3x − 2y + 5 local extreme points (if they exist), are stationary points, ie. solution of ∇g = 0, 0. But ∇g = 3, −2 = 0, 0. So, there is no critical point. g has no local or global extreme point. The graph z = g(x, y) is a plane in R3 which spreads in the space taking large values when x or y −→ ±∞; see Figure 2.13. For example g(0, y) = −2y + 5 −→ ∓∞ as g(x, 0) = 3x + 5 −→ ±∞ as

y −→ ±∞ x −→ ±∞.

iii) Since h is differentiable on R2 , its absolute extreme points that are also local extreme points (if they exist) are stationary points, ie. solution of ∇h = 2x − y, −x + 2y − 3 = 0, 0

⇐⇒

(x, y) = (1, 2).

66

Introduction to the Theory of Optimization in Euclidean Space So, there is only one critical point. It satisfies h(1, 2) = 1 − 2 + 4 − 6 = −3 y y2 + y 2 − 3y + 3 h(x, y) − h(1, 2) = x2 − xy + y 2 − 3y + 3 = (x − )2 − 2 4 3 y ∀(x, y) ∈ R2 . = (x − )2 + (y − 1)2  0 2 4

Hence, the point (1, 2) is a global minimum of h in R2 . Here also, one can see that h takes large values, for example, along the x axis, we have h(x, 0) = x2

−→ +∞

when x −→ ±∞.

So h has no global maximum (see Figure 2.14). z  4x2  y x  y2  3 y y

3

2 4

1

7

0

5

3

1

6

0

1

4

3

0

1 2 z

2

2

3

2

4

2

1

1

4

0

6 1 x

2

1

0 1

3

0

1

3

5

7

2

3

FIGURE 2.14: Graph and level curves of h(x, y) = x2 − xy + y 2 − 3y

4. – Consider the problem min f (x, y) = y S

where

S = {(x, y) : x2 + y 2  1}.

i) Does f have local minimum points? ii) Where may the minimum points locate if they exist? iii) Solve the inequality ∇f (a, b).(x − a, y − b)  0

∀(x, y) ∈ S

to find the candidate points (a, b) and solve the problem. iv) Can you proceed as in iii) if S = {(x, y) : x2 + y 2  1}? What is the solution in this case?

Unconstrained Optimization

67

Solution: i) Since f is differentiable on R2 , a local minimum point would be a critical point, ie. solution of ∇f = 0, 0. But ◦

∀(x, y) ∈ {(x, y) : x2 + y 2 < 1} = S.

∇f = 0, 1 = 0, 0

So, there is no critical point. f has no local minimum point. ii) If the minimum points exist, they may be on the unit circle, the boundary of S: ∂S = {(x, y) : x2 + y 2 = 1}. 1.0 y

0.5

0.0 0.5 1.0 1.0

0.5

z

0.0

0.5

1.0 1.0 0.5 0.0 x

0.5 1.0

FIGURE 2.15: Graph of f (x, y) = y on the set x2 + y 2  1 iii) Since S is convex and f differentiable on R2 , a solution (a, b) of the problem, if it exists, must satisfy ⎧ 2 ⎨ a + b2 = 1 ⎩

∇f (a, b).(x − a, y − b)  0

⇐⇒

⇐⇒

∀(x, y) ∈ S

⎧ 2 ⎨ a + b2 = 1 ⎩

0, 1.x − a, y − b  0

∀(x, y) ∈ S

⎧ 2 ⎨ a + b2 = 1 ⎩

y−b0

∀(x, y) ∈ S

⇐⇒

yb

Thus b = −1

and

a = 0.

∀(x, y) ∈ S

68

Introduction to the Theory of Optimization in Euclidean Space

So the only point candidate is (a, b) = (0, 1). In fact, it is the minimum point (see Figure 2.15) since we have f (x, y) = y  −1 = f (0, −1)

∀(x, y) ∈ S.

iv) We cannot proceed as in iii) because the set S = {(x, y) : x2 + y 2  1} is not convex. And because this set is not bounded, we can see that f can take large negative values. Therefore, it doesn’t attain a minimum value. For example, on the negative y axis, we have f (0, y) = y

−→ −∞

as

y −→ −∞.

5. – Newton’s Method[2] Let I = [a, b] and let F : I −→ R be twice differentiable on I. Suppose that ∃m, M ∈ R+ : |F  (x)|  m > 0 and |F  (x)|  M F (a).F (b) < 0, K = M/2m.

∀x ∈ I

Then there exists a subinterval I ∗ containing a root r of F such that for any x1 ∈ I ∗ , the sequence xn defined by xn+1 = xn −

F (xn ) F  (xn )

∀n ∈ N,

belongs to I ∗ and (xn ) converges to r. Moreover |xn+1 − r|  K|xn − r|2 Application

∀n ∈ N.

Let f (x) = x3 − 2x − 5.

i) Show that f has a root on the interval I = [2, 2.2]. ii) If x1 = 2 and if (xn ) is the sequence obtained by Newton’s method, show that |xn+1 − r|  (0.7)|xn − r|2 iii) Show that x4 is exact up to 6 decimals.

Solution: i) We have f (2.2) = (2.2)3 − 2.(2.2) − 5 = 1.248 > 0 f is continuous on [2, 2.2]

and

f (2) = 8 − 4 − 5 = −1 < 0

0 is between f (2) and f (2.2).

Unconstrained Optimization

69

From the intermediate value theorem, there exists x0 ∈ (2, 2.2) such that f (x0 ) = 0. ii) The sequence (xn ) obtained by Newton’s method is: ⎧ x1 = 2 ⎪ ⎪ ⎨ 2x3 + 5 f (xn ) x3 − 2xn − 5 ⎪ ⎪ = xn − n 2 = n2 ⎩ xn+1 = xn −  f (xn ) 3xn − 2 3xn − 2 with f  (x) = 3x2 − 2

f  (x) = 6x.

Because, the functions f  and f  are increasing on [2, 2.2], we have 10 = f  (2)  f  (x)  f  (2.2) = 12.52 12 = f  (2)  f  (x)  f  (2.2) = 13.2. In particular |f  (x)|  10 = m

|f  (x)|  13.2 = M

and

∀x ∈ [2, 2.2].

We deduce that the sequence (xn ) converges to a root r of f (x) = 0 in [2, 2.2] and satisfies |xn+1 − r|  0.7|xn − r|2

K=

M = 0.66 < 0.7. 2m

iii) Denote by en = xn − r, the approximation error of the root r, then |Ken+1 |  K 2 |en |2 = |Ken |2

=⇒

|Ken+1 |  |Ke1 |2n

by induction,

where |e1 | = |x1 − r| < (2.2 − 2) = 0.2

since

x1 , r ∈ [2, 2.2].

Thus

2n = (0.0196)n |Ken+1 |  |Ke1 |2n  (0.7)(0.2) To obtain an accuracy up to 6 decimals, it suffices to choose n such that |en+1 | 

(0.0196)n  10−6 . 0.66

70

Introduction to the Theory of Optimization in Euclidean Space We have (0.0196)2 = 0.000582061 0.66

(0.0196)3 ≈ 0.0000114084 0.66

(0.0196)4 ≈ 0.0000002236 < 10−6 . 0.66 The desired accuracy is obtained for n = 4. The approximate values of this root are:

x2 =

2(8) + 5 21 2x31 + 5 = = = 2.1 2 2 3x1 − 2 3(2 ) − 2 10

x3 =

2(2.1)3 + 5 23.522 2x32 + 5 = = = 2.0945681 2 3x2 − 2 3(2.1)2 − 2 11.23

x4 =

3 2( 23.522 2x33 + 5 11.23 ) + 5 = ≈ 2.09455148 23.522 2 3x3 − 2 3( 11.23 )2 − 2

x5 =

23.3782059 2x34 + 5 ≈ ≈ 2.0945514841. 3x24 − 2 11.1614377

We can see that x4 is exact up to six decimals; see Figure 2.16. y 10

5

y 1.0 3

2

1

1

2

3

x

0.5

5

1

1

2

3

x

0.5

10

1.0

FIGURE 2.16: Approximate position of the root of f (x) = x3 − 2x − 5

Unconstrained Optimization

2.2

71

Classification of Local Extreme Points

For a C 2 function f of one variable, in a neighborhood of a critical point x∗ , one can write by using the second order Taylor’s formula: f (x) = f (x∗ ) +

f  (c) f  (x∗ ) (x − x∗ ) + (x − x∗ )2 1! 2!

for some number c between x∗ and x. Then, since f  (x∗ ) = 0, we have

f (x) = f (x∗ ) + (x − x∗ )2

f  (c) . 2!

Now, if we have f  (x∗ ) > 0, then by continuity of f  , we deduce that for x close to x∗ , ( x ∈ (x∗ − , x∗ + ) ), we will have

f  (c) > 0

=⇒

f (x) > f (x∗ )

∀x ∈ (x∗ − , x∗ + ) \ {x∗ }.

This means that x∗ is a strict local minimum point. Similarly, we show that

f  (x∗ ) < 0

=⇒

x∗

is a strict local maximum point.

This classification of critical points, into minima and maxima points, where the sign of the second derivative intervenes, is generalized to C 2 functions with several variables in the theorem below, following the definition: Definition 2.2.1 Let Hf (x) = (fxi xj (x))n×n be the Hessian of a C 2 function f . Then, the n leading minors of Hf are defined by    f x1 x1 fx1 x2 . . . f x1 xk         fx2 x1 fx2 x2 . . . f x2 xk  Dk (x) =  k = 1, . . . , n. ,   .. .. .. ..   . . . .    fx x fx x . . . fx x  k 1 k 2 k k

Theorem 2.2.1 Second derivatives test - Sufficient conditions for a strict local extreme point

Let S ⊂ Rn and f : S −→ R be a C 2 function in a neighborhood of a critical point x∗ ∈ S (∇f (x∗ ) = 0). Then (i)

∀k = 1, . . . , n Dk (x∗ ) > 0, ∗ =⇒ x is a strict local minimum point,

72

Introduction to the Theory of Optimization in Euclidean Space

(−1)k Dk (x∗ ) > 0, ∀k = 1, . . . , n ∗ =⇒ x is a strict local maximum point,

(ii)

Dn (x∗ ) = 0

(iii)

and neither of the conditions in (i) and (ii) are satisfied, then x∗ is a saddle point.

Before proving the theorem, we will see its application through some examples. Example 1. Profit in selling one commodity A commodity is sold at 5$ per unit. The total cost for producing x units is given by C(x) = x3 − 10x2 + 17x + 66. Find the most profitable level of production. Solution: The total revenue for selling x units is R(x) = 5x. Thus, the profit P (x) on x units is P (x) = R(x) − C(x) = 5x − (x3 − 10x2 + 17x + 66) = −x3 + 10x2 − 12x − 66. The profit, illustrated in Figure 2.17, will be at its maximum at points where 2 dP = −3x2 + 20x − 12 = −3(x − 6)(x − ) = 0. dx 3 We deduce that we have two critical points x = 6 and x = 23 . The Hessian of P is  d2 P 

= [−6x + 20]. dx2 Applying the second derivatives test, we obtain HP (x) =

∗ at x = 6, (−1)1 D1 (6) = (−1)

d2 P

dx2 Thus, x = 6 is a local maximum. ∗∗ at x = 2/3,

d2 P 2 2 ( ) = −6( ) + 20 = 16 > 0 2 dx 3 3 is a local minimum. D1 (2/3) =

Thus, x =

2 3

(6) = (−1)(−6(6) + 20) = 16 > 0

Unconstrained Optimization

73

Thus six units is a candidate point for optimality. We have to check that it is the point at which we have the most profitable profit. This can be done by comparing P (x) and P (6). Indeed, we have P (x) − P (6) = −(x − 6)2 (x + 2)  0 =⇒

∀x > 0,

x = 6

∀x ∈ (0, +∞) \ {6}.

P (x) < P (6) y 2

4

6

8

10

x

50

100

y  x3  10 x2  12 x  66

150

FIGURE 2.17: Graph of P and the maximum profit at x = 6

Example 2. Profit in selling two commodities The cost to produce x units of a commodity A and y units of a commodity B is C(x, y) = 0.2x2 + 0.05xy + 0.05y 2 + 20x + 10y + 2500. If each unit from A and B are sold for 75 and 45 respectively, find the daily production levels x and y that maximize the profit per day. Solution: The daily profit is given by P (x, y) = 75x + 45y − C(x, y) = −0.2x2 − 0.05xy − 0.05y 2 + 55x + 35y − 2500. Since P is differentiable (because it is a polynomial), the points that maximize the profit are critical ones, i.e, solutions of  x = 100 ∇P (x, y) = −0.4x−0.05y+55, −0.05x−0.1y+35 = 0, 0 ⇐⇒ y = 300. We deduce that (100, 300) is the only critical point of P ; see Figure 2.18. Now, we apply the second derivatives test to classify that point. We have     −0.4 0 Pxx Pxy = HP (x, y) = 0 −0.1 Pyx Pyy   D1 (100, 300) =  Pxx  = Pxx = −0.4 < 0,   P D2 (100, 300) =  xx Pxy

  Pxy   −0.4 = Pyy   0

 0  = 0.004 > 0. −0.1 

74

Introduction to the Theory of Optimization in Euclidean Space

So (100, 300) is a local maximum point. In fact, it is a global maximum point where P attains the optimal value P (100, 300) = 5500. This is true because P is concave in R2 . Indeed, we have D1 (x, y) < 0 500

and

D2 (x, y) > 0

∀(x, y) ∈ R2

(see next section).

225018001350900

2700

4050

z  0.2 x2  0.05 y x  55 x  0.05 y2  35 y  2500

3150 3600 4950

400

3600

2700 3150

300 global maximum

4000 z

3150 200

2700

500

2000

400

3600 0 3600 4500

3150

0

50

100

150

200

100 x

2700

900135018002250

100

300 y 50

150 200

200

100

FIGURE 2.18: Profit function P (x, y) and maximum point (100, 300) Example 3. Several local extreme points Find the stationary points and classify them when f (x, y) = 3x − x3 − 2y 2 + y 4 . Solution: Since f is a differentiable function (because it is a polynomial), the local extreme points are critical points, i.e, solutions of ∇f (x, y) = 0, 0. We have ∇f (x, y) = 3 − 3x2 , −4y + 4y 3  = 0, 0 ⎧ ⎧ 2 ⎨ x = 1 or ⎨ x =1 and and ⇐⇒ ⇐⇒ ⎩ ⎩ y = 0 or y(y 2 − 1) = 0 ⎧ ⎧ ⎨ x=1 ⎨ x=1 and and ⇐⇒ or or ⎩ ⎩ y=0 y=1 ⎧ ⎧ ⎨ x = −1 ⎨ x = −1 and and or or ⎩ ⎩ y=0 y=1

x = −1 y = 1 or y = −1 ⎧ ⎨ x=1 and ⎩ y = −1 ⎧ ⎨ x = −1 and or ⎩ y = −1.

We deduce that (1, 0), (1, 1), (1, −1), (−1, 0), (−1, 1) and (−1, −1) are the critical points of f . The level curves, graphed in Figure 2.19, show the nature of these points.

Unconstrained Optimization 9.7 5.82 7.76 0.67

z  y4  2 y2  x3  3 x

75 4.85 2.91 0.97 0.97 6.7

3.88

1.94

1.94

1.5 8.73

4.85 0

1.0 2.91 0.5

3.88 7.76

2.91

1.94

10

1.94

0.0 2

5 z

6.79

2

1.0

0.97

7.76

4.85

1

1.5 8.73

1 2

5.82

2.91

0y 1 x0

0

0.97

0.5

1

0

0.67 7.76 2.91 9.7

2

1.94

1.94 2

3.88 1

0

3.88

4.85 2.91 1

2

FIGURE 2.19: Local extreme points of f (x, y) = 3x − x3 − 2y 2 + y 4 ∗ Classification of the critical points: We have fyy (x, y) = 12y 2 − 4, fxy (x, y) = 0, fxx (x, y) = −6x,     −6x 0 fxx fxy = Hf (x, y) = 0 12y 2 − 4 fyx fyy   D1 (x, y) =  fxx  = fxx = −6x,      fxx fxy   −6x  0   = −24x[3y 2 − 1].   D2 (x, y) =  = fxy fyy   0 12y 2 − 4  Applying the second derivative test, we obtain: (x, y) (1, 0)

D1 (x, y) −6

D2 (x, y)   −6   0

 0  = 24 −4 

  6   0

 0  = 48 8 

  6   0

 0  = 48 8 

(−1, 1)

6

(−1, −1)

6

(1, 1)

−6

  −6   0

 0  = −48 8 

(1, −1)

−6

  −6   0

 0  = −48 8 

(−1, 0)

6

  6   0

 0  = −24 −4 

type local maximum

local minimum

local minimum

saddle point

saddle point

saddle point

TABLE 2.5: Critical points’ classification for f (x, y) = 3x − x3 − 2y 2 + y 4

76

Introduction to the Theory of Optimization in Euclidean Space

The proof of Theorem 2.2.1 uses Taylor’s formula for a function of several variables and a characterization of symmetric quadratic forms (see the end of this section). Taylor’s formula will be used several times through out the next chapters. It is therefore important to understand its proof.

Theorem 2.2.2 2nd order Taylor’s formula for a function of n variables Suppose f is C 2 in an open set of Rn containing the line segment [x∗ , x∗ + h]. Then f (x∗ + h) = f (x∗ ) +

n n n  1   ∂2f ∂f ∗ (x )hi + (x∗ + c h)hi hj ∂x 2 ∂x ∂x i i j i=1 i=1 j=1

or

1t hHf (x∗ + ch)h 2 ⎤ ⎤ ⎡ h1 ⎥ ⎢ . ⎥ ⎦ , h = ⎣ .. ⎦, t h = x∗n hn   h1 . . . hn . Here, we identified the column vector x∗ + th with the point (x∗1 + th1 , . . . , x∗n + thn ), t ∈ R. f (x∗ + h) = f (x∗ ) + ∇f (x∗ ).h + ⎡ ∗ x1 ⎢ .. ∗ for some c ∈ (0, 1) and where x = ⎣ .

Proof. Define the function g(t) = f (x∗1 + th1 , . . . , x∗n + thn ) = f (x∗ + th). Note that g(t) = f (x1 (t), x2 (t), . . . , xn (t))

with

xj (t) = x∗j + thj

j = 1, . . . , n.

Since the real functions xj , j = 1, . . . , n, are differentiable with xj (t) = hj , then g is differentiable and we have by the chain rule formula g  (t) =

∂f ∂x2 ∂f ∂xn ∂f ∂x1 + + ...... + ∂x1 ∂t ∂x2 ∂t ∂xn ∂t

      = fx1 (x∗ + th) h1 + fx2 (x∗ + th) h2 + . . . . . . + fxn (x∗ + th) hn   = ∇f (x∗ + th) .h.

Unconstrained Optimization

77

Because f is C 2 , then g is also C 2 , and we have g  (t) =

d  d  d  fx1 (x∗ +th) h1 + fx2 (x∗ +th) h2 + . . . . . . + fxn (x∗ +th) hn . dt dt dt

For each i = 1, . . . , n, we have fxi (x∗ + th) = fxi (x1 (t), x2 (t), . . . , xn (t)). Then ∂fxi ∂x2 ∂fxi ∂xn ∂fxi ∂x1 d fx (x∗ + th) = + + ...... + dt i ∂x1 ∂t ∂x2 ∂t ∂xn ∂t       = fxi x1 (x∗ + th) h1 + fxi x2 (x∗ + th) h2 + . . . . . . + fxi xn (x∗ + th) hn =

n  

 fxi xj (x∗ + th) hj .

j=1

Hence g  (t) =

n  n  

 fxi xj (x∗ + th) hi hj .

i=1 j=1

Now, since f is defined on the segment [x∗ , x∗ + h], g is defined on the interval [0, 1] and by using the 2nd order Taylor’s formula for real functions [1], [2], we get g(1) = g(0) +

g  (c) g  (0) (1 − 0) + (1 − 0)2 1! 2!

1 = g(0) + g  (0) + g  (c) 2

for some c ∈ (0, 1),

or equivalently f (x∗ + h) = f (x∗ ) +

n  i=1

fxi (x∗ )hi +

n

n

1  fx x (x∗ + ch)hi hj . 2 i=1 j=1 i j

Proof. (Theorem 2.2.1) Since x∗ is an interior point of S and is a local stationary point of f then ∇f (x∗ ) = 0.

78

Introduction to the Theory of Optimization in Euclidean Space

For h ∈ Rn such that x∗ + h ∈ S, we have from the 2nd order Taylor’s formula f (x∗ + h) = f (x∗ ) +

Situation (i)

1t hHf (x∗ + ch)h 2

for some c ∈ (0, 1).

Suppose that Dk (x∗ ) > 0 for all k = 1, . . . , n.

By continuity of the second-order partial derivatives of f , there exists r > 0 such that Dk (x) > 0

∀x ∈ Br (x∗ )

∀k = 1, . . . , n.

As a consequence, the quadratic form Q(h)(x) =

n  n 

fxi xj (x)hi hj =t hHf (x)h

i=1 j=1

  with the associated symmetric matrix Hf (x) = fxi xj (x) n×n is definitely positive. Since x∗ + ch ∈ Br (x∗ ), then Q(h)(x∗ + ch) =t hHf (x∗ + ch)h > 0. Therefore, we have for x∗ + h ∈ Br (x∗ ) 1 Q(h)(x∗ + ch) > 0 2 which shows that the stationary point x∗ is a strict local minimum point for f in S. f (x∗ + h) − f (x∗ ) =

Situation (ii)

Suppose that (−1)k Dk (x∗ ) > 0 for all k = 1, . . . , n.

By continuity of the second-order partial derivatives of f , there exists r > 0 such that (−1)k Dk (x) > 0

∀x ∈ Br (x∗ )

From the property of determinants, we can write   (−f )x1 x1 (−f )x1 x2 . . .     (−f )x2 x1 (−f )x2 x2 . . .   k ∗ (−1) Dk (x ) =   .. .. ..  . . .     (−f )x x (−f )x x . . . k 1 k 2

∀k = 1, . . . , n.  (−f )x1 xk    (−f )x2 xk     ..  .    (−f )xk xk 

Unconstrained Optimization

79

As a consequence, the quadratic form t

hH−f (x)h =

n  n 

(−f )xi xj (x)hi hj

i=1 j=1

  with the associated symmetric matrix H−f (x) = (−f )xi xj (x) n×n is definite positive. Therefore, we have for x∗ + h ∈ B(x∗ , r) 1 (−f )(x∗ + h) − (−f )(x∗ ) = ( )t hH−f (x∗ + ch)h > 0 2 =⇒

(−f )(x∗ + h) > (−f )(x∗ )

f (x∗ ) > f (x∗ + h)

⇐⇒

which shows that the stationary point x∗ is a strict local maximum point for f in S.

Situation (iii) hold.

Assume Dn (x∗ ) = 0 and neither of the conditions i) and ii)

 situation (i) (resp. (ii)) means also that the matrix A = Note that fxi xj (x∗ ) n×n is definite positive (resp. negative), which is equivalent to each of its eigen value λi to be positive (resp. negative). So, if neither (i) or (ii) hold, there exist i0 , j0 ∈ {1, . . . , n} such that Dn (x∗ ) =

n

λi = 0

with

λi0 > 0

and

λj0 < 0.

i=1

Now, since A is symmetric, there exists an orthogonal matrix O = (pij )n×n (O−1 =t O) such that ⎡ ⎤ λ1 · · · 0 ⎢ ⎥ A = ODt O D = ⎣ 0 ... 0 ⎦ 0

···

λn

Then the quadratic form Q(h) can be written as Q(h)(x∗ ) =t hAh =t [t Oh]D[t Oh] =

n  i=1

λi

n 

pji hj

2

.

j=1

Choose hs and hs such that for s > 0, t

s 2s Ohs =  ei0 +  ej0 −λj0 λi0

t

2s s Ohs =  ei0 +  ej0 , −λj0 λi0

80

Introduction to the Theory of Optimization in Euclidean Space

which is possible since t O is invertible. Then we have s 2 2s 2 Q(hs )(x∗ ) = λi0  + λ j0  = s2 − 4s2 = −3s2 < 0 −λj0 λi0 2s 2 s 2 + λ j0  = 4s2 − s2 = 3s2 > 0. Q(hs )(x∗ ) = λi0  −λj0 λi0 We deduce, by continuity of Q(h)(x) the existence of δ > 0 such that ∀s ∈ (0, δ) 1 f (x∗ + hs ) − f (x∗ ) = Q(hs )(x∗ + chs ) < 0 2 1 Q(hs )(x∗ + chs ) > 0. 2 f takes values greater and less than f (x∗ ) in the neighborhood of x∗ . Therefore x∗ is a saddle point. f (x∗ + hs ) − f (x∗ ) =

The following theorem shows that the Hessian matrix of a C 2 function at a local maximum (resp. minimum) point is necessarily positive (resp. negative) semi definite. However, this condition is not sufficient as we can see it in a suggested exercise where the origin is neither a local minimum, nor a local maximum.

Theorem 2.2.3 Necessary conditions for a local extreme point Let S ⊂ Rn and f : S −→ R be a C 2 function in a neighborhood of a ◦

critical point x∗ ∈ S

(∇f (x∗ ) = 0). Then

(i) x∗ is a local minimum point (ii) x∗ is a local maximum point

=⇒ =⇒

k (x∗ )  0

∀k = 1, n

(−1)k k (x∗ )  0

∀k = 1, n

where k (x∗ ) is the principal minor of order k of the Hessian matrix Hf (x∗ ); that is the determinant of a matrix obtained by deleting n − k rows and n − k columns such that if the ith row (column) is selected, then so is the ith column (row).

Proof. (i) Suppose that x∗ is an interior local minimum point for f . There exists r > 0 such that f (x∗ )  f (x)

∀x ∈ Br (x∗ ).

Unconstrained Optimization

81

In particular, for t ∈ (−r, r) and h ∈ Rn with h = 1, we have x∗ + th ∈ Br (x∗ ) since x∗ + th − x∗ = |t| h = |t| < r. Then g(0) = f (x∗ )  f (x∗ + th) = g(t)

∀t ∈ (−r, r).

So g is a one variable function that has an interior local minimum at t = 0. Consequently, it satisfies g  (0) = 0

g  (0)  0.

and

From previous calculations, we have g  (t) =

n  n 

fxi xj (x∗ + th)hi hj .

i=1 j=1

Hence g  (0) =

n  n 

fxi xj (x∗ )hi hj =t hHf (x∗ )h  0.

i=1 j=1

The above inequality remains true for h = 0 and for h = 0. It suffices to consider for this last case h/ h which is a unit vector. Hence the Hessian matrix of f at x∗ is positive semi definite by the result below from Algebra (see [10]). (ii) is proved similarly.

Quadratic forms Consider the quadratic form in n variables Q(h) =

n  n 

aij hi hj =t hAh

t

h=



h1

...

hn



i=1 j=1

associated to the symmetric matrix ⎡ a11 . . . ⎢ .. .. A = (aij )i,j=1,...,n = ⎣ . . an1 . . .

⎤ a1n .. ⎥ . ⎦ ann

(aij = aji ).

82

Introduction to the Theory of Optimization in Euclidean Space Definition. Q is positive (resp. negative) definite if Q(h) > 0 (resp. < 0) for all h = 0. Q is positive (resp. negative) semi definite if Q(h)  0 (resp.  0) for all h ∈ Rn .

We have the following necessary and sufficient condition for a quadratic form Q to be positive (negative), definite or semi definite.

Theorem. Q is positive definite Q is negative definite

⇐⇒ ⇐⇒

r = 1, . . . , n Dr > 0 r (−1) Dr > 0 r = 1, . . . , n

where Dr is the leading principal minor of order r of the matrix A;    a11 . . . a1n     ..  .. Dr =  ... for r = 1, . . . , n. . .    an1 . . . ann 

Theorem. Q is positive semi definite Q is negative semi definite

⇐⇒ ⇐⇒

Δr  0 r = 1, . . . , n (−1)r Δr  0 r = 1, . . . , n

where Δr is the principal minor of order r of the matrix A; that is the determinant of the matrix obtained from the matrix A by deleting n − r rows and n − r columns such that if the i th row (column) is selected, then so is the i th column (row).

Unconstrained Optimization

83

Solved Problems

1. – Use the following functions to show that the positivity or negativity semi definite of the Hessian of the objective function at a critical point is not a necessary condition for local optimality. f (x, y) = x4 + y 4 ,

g(x, y) = −(x4 + y 4 ),

h(x, y) = x4 − y 4 .

Solution: z  x4  y4

z  x4  y4

z  x4  y4

2.0

1.0

0.0

1.5 z1.0

0.5 z 1.0

1.0 0.5

0.5 0.0 1.0

2.0 1.0

0.0y

1.0 0.5

0.5

0.5

1.5

0.5

0.5 z0.0

1.0

1.0 1.0

0.0y

0.0y 0.5

0.5 0.5

0.0 x

0.5

0.0 x

0.5

0.5

0.5 1.0

1.0

0.5

0.0 x

1.0

1.0

1.0

1.0

FIGURE 2.20: Graphs of f, g, h

We have ∇f (x, y) = 4x3 , 4y 3 ,

∇g(x, y) = −4x3 , −4y 3 ,

∇h(x, y) = 4x3 , −4y 3 .

So (0, 0) is the only stationary point for f , g and h. But, we cannot conclude anything about its nature by using the second derivatives test since the Hessian matrix at (0, 0) of each function is equal to the zero matrix.  Hf =

12x2 0

0 12y 2



 ,

Hg = −Hf ,

Hh = 

Hf (0, 0) = Hg (0, 0) = Hh (0, 0) =

0 0

12x2 0 0 0



0 −12y 2



84

Introduction to the Theory of Optimization in Euclidean Space Δ11 (0, 0) = Δ21 (0, 0) = Δ2 (0, 0) = 0

where Δl1 is the principal minor of order l obtained by removing the leme row and leme column l = 1, 2. Thus the Hessian matrices of f , g and h are positive and negative semi definite at (0, 0). However, this doesn’t imply that (0, 0) is a local minimum or maximum point. Indeed, by looking at the functions directly, we can classify the point. The three situations are shown in Figure 2.20. First, note that (0, 0) is a global maximum for f since we have f (x, y) = x4 + y 4  0 = f (0, 0)

∀(x, y) ∈ R2 .

Next, note that (0, 0) is a global minimum for g. Indeed, we have g(x, y) = −(x4 + y 4 )  0 = g(0, 0)

∀(x, y) ∈ R2 .

Finally, (0, 0) is a saddle point for h since we have h(x, 0) = x4  0 = h(0, 0) h(0, y) = −y 4  0 = h(0, 0)

∀x ∈ R ∀y ∈ R.

Thus, for any disk centered at (0, 0), h takes values greater and lower than h(0, 0).

2. – Classify the stationary points of f (x1 , x2 , x3 , x4 ) = 20x2 + 48x3 + 6x4 + 8x1 x2 − 4x21 − 12x23 − x24 − 4x32 . Does f attain its global extreme values on R4 ?

Solution: Since the function f is differentiable (because it is a polynomial), the local extreme points are critical points, i.e, solutions of ∇f (x1 , x2 , x3 , x4 ) = 8x2 − 8x1 , 20 + 8x1 − 12x22 , 48 − 24x3 , 6 − 2x4  = 0R4 ⎧ 5 + 2x1 − 3x22 = 0 ⎨ x2 = x 1 ⇐⇒ ⎩ x3 = 2 x4 = 3. 5 5 We deduce that (−1, −1, 2, 3) and ( , , 2, 3) are the critical points of f . 3 3

Unconstrained Optimization

85

• Classification of the critical points: The Hessian matrix of f is ⎡ ⎤ −8 8 0 0 ⎢ 8 −24x2 0 0 ⎥ ⎥ Hf (x1 , x2 , x3 , x4 ) = ⎢ ⎣ 0 0 −24 0 ⎦ 0 0 0 −2 The leading principal minors at the point (−1, −1, 2, 3) are D1 = −8 < 0,

D2 = −256 < 0,

D3 = −24D2 > 0 and D4 = −2D3 < 0.

Then, (−1, −1, 2, 3) is a saddle point. The leading principal minors at the point ( 53 , 53 , 2, 3) are D1 = −8 < 0,

D2 = 256 > 0,

D3 = −24D2 < 0 and

D4 = −2D3 > 0.

5 5 Then, ( , , 2, 3) is a local maximum point. 3 3 • Global optimal points: Note that f (0, x2 , 0, 0) = 20x2 − 4x32

−→ ∓∞

as

x2 −→ ±∞.

Thus f takes large negative and positive values. Therefore f doesn’t attain its global optimal values on R2 .

3. – Let f (x, y) = ln(1 + x2 y). i) Find and sketch the domain of definition of f . ii) Find the stationary points and show that the second-derivatives test is inconclusive at these points. iii) Describe the behavior of f at these points.

Solution: i) The domain of f is given by: Df = {(x, y) ∈ R2 : = {(0, y) :

1 + x2 y > 0}

y ∈ R} ∪ {(x, y) ∈ R∗ × R :

y>−

1 }. x2

1 The domain of f is the region located above the curve y = − 2 , including the x y axis; see Figure 2.21.

86

Introduction to the Theory of Optimization in Euclidean Space y 10

5

5

10

x

0.2

0.4

0.6

0.8

FIGURE 2.21: Domain of f (x, y) = ln(1 + x2 y)

ii) f is differentiable on its open domain Df because v(t) = ln t, u(x, y) = 1 + x2 y u is differentiable in R then, in particular, in Df

f = vou

with

2

u(Df ) ⊂ R∗ + v is differentiable in R∗ + . The stationary points are solutions of x2 2xy ,  = 0, 0 1 + x2 y 1 + x2 y ⎧ ⎧ ⎨ x=0 ⎨ xy = 0 ⇐⇒ ⎩ ⎩ 2 x=0 x =0

∇f (x, y) = 

⇐⇒

⇐⇒

⎧ ⎨ x=0 or ⎩ y=0

and

x=0

and

x=0

or

⇐⇒

y=0

x = 0,

y ∈ R.

We deduce that the points located on the y axis are the critical points of f . The Hessian matrix of f is 1 Hf (x, y) = (1 + x2 y)2



1 − x2 y 2x

At the stationary points, we have  Hf (0, y) =

1 0 0 0

 .

2x −x4

 .

Unconstrained Optimization

87

2 2.04 1.7

z  logy x2  1

0.51

0

1 2 2

0 y

0.17

1

1 2

1.1 2.0

1.7

1.36 0.17

1.87 1.19 0.34

2

2

0.51 1.53

0.68

0.68

1

0

0.34 1.02

0

1.53

1 x

0.34

0 0.51 1.36

2.04 0.85

1

1.19

0.68

0.17

0.51

0.34 1.02 1.7

2

1.53

0.17

1.53 1.02

1 z 0

1.87 1.02

0.85

1.36

2

1.36 1.7

1.87 1

2.04

0.85

0

0.68

1.19

2

1

1.87 0.85 0

1

2

FIGURE 2.22: Graph and level curves of f (x, y) = ln(1 + x2 y)

The leading minor D2 (0, y) = det(Hf (0, y)) = 0, then the second derivatives test fails at these points. The behaviour of the function is illustrated in Figure 2.22. Classification of these points: • The points (0, y0 ) with y0 > 0 are local minimum points for f . Indeed, since the logarithm function is increasing, we have f (x, y) = ln(1 + x2 y)  ln(1) = ln(1 + 02 y0 ) = 0 = f (0, y0 ) ∀x ∈ R,

∀y ∈ (y0 −

y0 3y0 y0 y0 , y0 + ) = ( , ). 2 2 2 2

Thus, f takes values greater than f (0, y0 ) in a neighborhood of (0, y0 ) with y0 > 0. • The points (0, y0 ) with y0 < 0 are local maximum points for f . Indeed, since ln is an increasing function, we have f (x, y) = ln(1 + x2 y)  ln(1) = ln(1 + 02 y0 ) = 0 = f (0, y0 ) ∀y ∈ (y0 −

3y0 y0 −y0 −y0 , y0 + )=( , ), 2 2 2 2

∀x

such that

0 < 1 + x2 y.

f takes values lower than f (0, y0 ) in a neighborhood of (0, y0 ) with y0 < 0. • The point (0, 0) is a saddle point for f . Indeed, we have f (x, y) = ln(1 + x2 y)  ln(1) = 0 = f (0, 0)

∀y ∈ R+

88

Introduction to the Theory of Optimization in Euclidean Space

f (x, y) = ln(1 + x2 y)  ln(1) = 0 = f (0, 0)

∀y ∈ R− such that 0 < 1 + x2 y.

For any disk centered at (0, 0), f takes values greater and lower than f (0, 0).

4. – Find and classify all stationary points of f (x, y) = x2 y + y 3 x − x y. Are there global minimum and maximum values of f on R2 ?

Solution: 2

z  x y3  x2 y  x y

1

5

0 2

z0 1 5 2

1

0y 1 1

x0 1 2

2

2

2

1

0

1

2

FIGURE 2.23: Graph and level curves of f (x, y) = x2 y + y 3 x − xy The function f is differentiable on its open domain R2 since it is a polynomial. So, the local extreme points are critical, i.e, solutions of ∇f (x, y) = 2xy + y 3 − y, x2 + 3y 2 x − x = 0, 0 ⇐⇒ ⎧ ⎨ y(y 2 + 2x − 1) = 0 ⎩

⇐⇒

x(3y 2 + x − 1) = 0 ⎧ y = 0 and ⎪ ⎪ ⎪ ⎪ ⎪ ⎪ ⎪ ⎪ ⎨ or [y = 0 ⎪ ⎪ or ⎪ ⎪ ⎪ ⎪ ⎪ ⎪ ⎩ or

⇐⇒

⎧ ⎨ y=0

or

y 2 + 2x − 1 = 0



or

3y 2 + x − 1 = 0

x=0

x=0 and

3y 2 + x − 1 = 0]

[y 2 + 2x − 1 = 0

and

x = 0]

[y 2 + 2x − 1 = 0

and

3y 2 + x − 1 = 0]

⇐⇒

⎧ ⎨ y=0 ⎩

or

and [y = 0

Unconstrained Optimization ⎧ 2 ⎪ x=0 ⎨ or [y − 1 = 0 and

x = 1]

⎪ ⎩ or

[y 2 =

1 5

89 and

and

x = 0]

x=

2 ]. 5

1 2 2 1 We deduce that (0, 0), (1, 0), (0, 1), (0, −1), ( , √ ) and ( , − √ ) are the 5 5 5 5 critical points of f . Reading the level curves in Figure 2.23, one can locate four saddle points and two local extrema. Classification of the critical points: Applying the second derivatives test, we obtain: critical point

D1 (x, y)

D2 (x, y)

classification

(0, 0)

0

−1

saddle point

(1, 0)

0

−4

saddle point

(0, 1)

2

−4

saddle point

(0, −1)

−2

−4

saddle point

2 1 ( ,√ ) 5 5

2 √ 5

4 5

local minimum point

1 2 ( , −√ ) 5 5

2 −√ 5

4 5

local maximum point

TABLE 2.6: Critical points’ classification for f (x, y) = x2 y + y 3 x − xy where fxx (x, y) = 2y,  Hf (x, y) =

fxy (x, y) = 2x + 3y 2 − 1

fyy (x, y) = 6yx,

2y 2x + 3y 2 − 1 2 6xy 2x + 3y − 1

 ,

  D1 (x, y) =  fxx  = fxx = 2y

90 Introduction to the Theory of Optimization in Euclidean Space      fxx fxy   2y 2x + 3y 2 − 1  2 2 2  =  fxy fyy   2x + 3y 2 − 1  = 12xy − [2x + 3y − 1] . 6yx Finally, note that f takes large positive and negative values since we have f (1, y) = y 3 −→ ±∞

y −→ ±∞.

as

Therefore, f doesn’t attain a global maximal value nor a minimal one.

5. – A power substation must be located at a point closest to three houses located at the points (0, 0), (1, 1), (0, 2). Find the optimal location by minimizing the sum of the squares of the distances between the houses and the substation. Solution: Let (x, y) be the position of the power substation. Then, we have y

2.0

1.5

1.0

0.5

0.2

0.2

0.4

0.6

0.8

1.0

1.2

x

FIGURE 2.24: The closest power station to three houses to look for (x, y) as the point that minimize the function f (x, y) = d2 ((x, y), (0, 0)) + d2 ((x, y), (1, 1)) + . . . + d2 ((x, y), (0, 2)) which can be written as f (x, y) = [(x − 0)2 + (y − 0)2 ] + [(x − 1)2 + (y − 1)2 ] + [(x − 0)2 + (y − 2)2 ]. Because f is polynomial, it is differentiable on the open set R2 . Thus a global minimum point is also a local one. Therefore, it is a solution of

Unconstrained Optimization

91

∇f (x, y) = 2x + 2(x − 1) + 2x, 2y + 2(y − 1) + 2(y − 2) = 0, 0 ⎧ ⎨ 6x − 2 = 0

⇐⇒



1 (x, y) = ( , 1). 3

⇐⇒

6y − 6 = 0

Thus, we have one critical point and by applying the second derivatives test, we obtain:  Hf (x, y) =

6 0

0 6



  6 0 1 D2 ( , 1) =  0 6 3

1 D1 ( , 1) = 6 > 0 3

   = 36 > 0. 

So ( 13 , 1) is a local minimum; see Figure 2.24 for the position of the point and the three houses. To show that it is the point that minimizes f globally, we proceed by comparing the values of f and completing squares: 2 1 f (x, y) − f ( , 1) = 3x2 − 2x + 1 + 3y 2 − 6y + 5 − ( + 2) 3 3 1 2 ∀(x, y) ∈ R2 . = 3(x − ) + 3(y − 1)2  0 3 6. – Based on the level curves that are visible in Figures 2.25 and 2.26, identify the approximate position of the local maxima, local minima and saddle points. 2

0.128

0.128 0.256

0.128

0.12 0.32

0.064 1 0.192 0.256

0.064

0 0

0.192

0 0.192

0.064

0.192

0.256

0.256

1 0.064

0.32 0.128

0.128 0.128

2 2

1

0.128 0

1

FIGURE 2.25: Level curves of f (x, y) = −xye−(x

2

2

+y 2 )/2

on [−2.2] × [−2, 2]

Solution: i) From the level curves’ plotting, one can locate: - a saddle point at (0, 0)

92

Introduction to the Theory of Optimization in Euclidean Space 1.2

1.2

0.4

1.2

0 2.4

0.4 0.4 0.8 0

2

1.6

2.4

8 0.8

0.4

0.8

0.8

2

1.6

1.2

6 1.2

0.8

1.2

0.4 0.4

0.8 0.8

0

1.2

1.2

4

1.2

0.4

0.4 0.4

1.2

1.6

0.8

2

0.8 0.4 0.4 0

0.4 2.4

2

2.4

1.6

0

0 0

1.2

0.8 0.8

2

1.2 2

1.2

0.8 4

1.2

0 6

8

FIGURE 2.26: Level curves of g(x, y) = sin(x) + sin(y) − cos(x + y) for x, y in [0, 3π]

- two local maxima at (−1, 1), (1, −1)
- two local minima at (−1, −1), (1, 1).

Using Maple software, one can check these observations by applying the second derivatives test using the coding:

with(Student[MultivariateCalculus])
LagrangeMultipliers(-x*y*exp(-(x^2+y^2)*(1/2)), [], [x, y], output = detailed)
    [x = 0, y = 0, -x*y*exp(-(1/2)*x^2-(1/2)*y^2) = 0],
    [x = 1, y = 1, -x*y*exp(-(1/2)*x^2-(1/2)*y^2) = -exp(-1)],
    [x = 1, y = -1, -x*y*exp(-(1/2)*x^2-(1/2)*y^2) = exp(-1)],
    [x = -1, y = 1, -x*y*exp(-(1/2)*x^2-(1/2)*y^2) = exp(-1)],
    [x = -1, y = -1, -x*y*exp(-(1/2)*x^2-(1/2)*y^2) = -exp(-1)]
SecondDerivativeTest(-x*y*exp(-(x^2+y^2)*(1/2)), [x, y] = [0, 0])
    LocalMin = [], LocalMax = [], Saddle = [[0, 0]]
SecondDerivativeTest(-x*y*exp(-(x^2+y^2)*(1/2)), [x, y] = [1, 1])
    LocalMin = [[1, 1]], LocalMax = [], Saddle = []
...

ii) For the second figure, the exact points found, using Maple, are:

- 5 saddle points at (3π/2, 3π/2), (π/2, 3π/2), (3π/2, π/2), (3π/2, 5π/2), (5π/2, 3π/2)

- 4 local maxima at (π/2, π/2), (π/2, 5π/2), (5π/2, π/2), (5π/2, 5π/2)

- 2 local minima at (7π/6, 7π/6), (11π/6, 11π/6).
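These locations can be confirmed numerically. The short sketch below (Python/NumPy; an added check rather than part of the text) evaluates ∇g at each listed point and verifies that the gradient vanishes there.

import numpy as np

def grad_g(x, y):
    # g(x, y) = sin(x) + sin(y) - cos(x + y)
    return np.array([np.cos(x) + np.sin(x + y), np.cos(y) + np.sin(x + y)])

points = [(3*np.pi/2, 3*np.pi/2), (np.pi/2, 3*np.pi/2), (3*np.pi/2, np.pi/2),
          (3*np.pi/2, 5*np.pi/2), (5*np.pi/2, 3*np.pi/2),       # saddle points
          (np.pi/2, np.pi/2), (np.pi/2, 5*np.pi/2),
          (5*np.pi/2, np.pi/2), (5*np.pi/2, 5*np.pi/2),         # local maxima
          (7*np.pi/6, 7*np.pi/6), (11*np.pi/6, 11*np.pi/6)]     # local minima

for p in points:
    assert np.allclose(grad_g(*p), 0.0, atol=1e-12), p
print("all listed points are critical points of g")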

2.3    Convexity/Concavity and Global Extreme Points

In dimension 1, when a C² function f is convex on its domain D_f and x* is a local minimum of f, then x* is a global minimum. Indeed, the convexity of f is characterized by f''(x) ≥ 0 [2], [1]. Then, using Taylor's formula, the values f(x) and f(x*) can be compared as follows:

        f(x) = f(x*) + (x − x*) f'(x*) + ((x − x*)²/2) f''(c)        for some c between x* and x.

Because f'(x*) = 0, then

        f(x) − f(x*) = ((x − x*)²/2) f''(c) ≥ 0.

As x is arbitrarily chosen in the domain of f, then

        f(x) ≥ f(x*)        ∀x ∈ D_f,

which shows that x* is a global minimum point for f.

In this section, we want to generalize the convexity property to functions of several variables in order to establish, later, results of global optimality.

2.3.1    Convex/Concave Several Variable Functions

Definition 2.3.1 Let S be a convex set of Rⁿ and let f be a real function

        f : S −→ R,        x = (x₁, · · · , xₙ) ↦ f(x).

Then,

        f is convex             ⇐⇒        f(ta + (1 − t)b) ≤ t f(a) + (1 − t) f(b)

        f is strictly convex    ⇐⇒        f(ta + (1 − t)b) < t f(a) + (1 − t) f(b),        a ≠ b,  t ≠ 0, 1

        f is concave            ⇐⇒        f(ta + (1 − t)b) ≥ t f(a) + (1 − t) f(b)

        f is strictly concave   ⇐⇒        f(ta + (1 − t)b) > t f(a) + (1 − t) f(b),        a ≠ b,  t ≠ 0, 1.

These equivalences must hold ∀a, b ∈ S, ∀ t ∈ [0, 1].
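As a quick illustration of this definition, the sketch below (Python/NumPy; an added illustration, not part of the text) samples the defining inequality for f(x, y) = x² + y², whose convexity is established analytically in Example 1 of Subsection 2.3.2, and finds no violation at the sampled pairs.

import numpy as np

def f(p):
    return p[0] ** 2 + p[1] ** 2      # candidate convex function

rng = np.random.default_rng(0)
for _ in range(10000):
    a, b = rng.uniform(-10.0, 10.0, size=(2, 2))
    t = rng.uniform(0.0, 1.0)
    # convexity: f(t a + (1 - t) b) <= t f(a) + (1 - t) f(b)
    assert f(t * a + (1 - t) * b) <= t * f(a) + (1 - t) * f(b) + 1e-9
print("no violation of the convexity inequality found")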


• Using the definition, one can check that the functions

        i) f(x) = ax + b                ii) f(x, y) = ax + by + c

are simultaneously concave and convex in R and R² respectively. Their respective graphs represent a line y = ax + b and a plane z = ax + by + c.

• A convex/concave function is not necessarily differentiable at every point. Consider

        i) f(x) = |x|                ii) f(x, y) = √(x² + y²) = ‖(x, y)‖.

Each function is not differentiable at the origin and represents the Euclidean distance in R and R² respectively. We use the triangular inequality to verify that they are convex.

• One can form new convex/concave functions using algebraic operations. For example [25], if f, g are functions defined on a convex set S ⊂ Rⁿ and s, t ≥ 0, then:

        f and g are concave (resp. convex)   =⇒   sf + tg is concave (resp. convex),
                                                  and min(f, g) (resp. max(f, g)) is concave (resp. convex).

Remark 2.3.1 The geometrical interpretation of the convexity of f expresses that the graph of f remains under the line segment [AB] joining any two points A(a, f(a)) and B(b, f(b)) of the graph of f. Indeed,

        [A, B] = { (x, y) ∈ Rⁿ × R :  x = a + t(b − a),  y = f(a) + t(f(b) − f(a)),  t ∈ [0, 1] }

is located above the part of the graph of f

        { (x, y) ∈ Rⁿ × R :  x = a + t(b − a),  y = f(a + t(b − a)),  t ∈ [0, 1] },

since we have, ∀t ∈ [0, 1],

        f(a + t(b − a)) = f(tb + (1 − t)a) ≤ t f(b) + (1 − t) f(a) = f(a) + t(f(b) − f(a)).

Similarly, the geometrical interpretation of the concavity of f expresses that the graph of f remains above the line segment [AB] joining any two points A(a, f(a)) and B(b, f(b)) of the graph of f; see Figure 2.27.

FIGURE 2.27: Shape of convex functions

Remark 2.3.2 There is a connection between the convexity/concavity of a function f defined on a convex set S ⊂ Rⁿ and the convexity of particular sets described by f [25]. Indeed, we have

        f is convex    ⇐⇒        the set { (x, y) ∈ S × R :  y ≥ f(x) }  is convex

        f is concave   ⇐⇒        the set { (x, y) ∈ S × R :  y ≤ f(x) }  is convex.


2.3.2    Characterization of Convex/Concave C¹ Functions

When n = 1, the theorem below expresses that the graph of a convex (resp. concave) C¹ function remains above (resp. below) its tangent lines.

Theorem 2.3.1 Let S be a convex open set of Rⁿ and let f : S −→ R be C¹. Then, for any x, a ∈ S, the following equivalences hold:

        f is convex in S             ⇐⇒        f(x) − f(a) ≥ [∇f(a)].(x − a)

        f is strictly convex in S    ⇐⇒        f(x) − f(a) > [∇f(a)].(x − a),        x ≠ a

        f is concave in S            ⇐⇒        f(x) − f(a) ≤ [∇f(a)].(x − a)

        f is strictly concave in S   ⇐⇒        f(x) − f(a) < [∇f(a)].(x − a),        x ≠ a.

Proof. We prove the first assertion. The other assertions can be established similarly.

=⇒)  If f is convex in S, then, by definition, we have for a, b ∈ S,

        f(tb + (1 − t)a) ≤ t f(b) + (1 − t) f(a)        ∀t ∈ [0, 1],

from which we deduce

        f(b) − f(a) ≥ [f(tb + (1 − t)a) − f(a)] / t = [f(a + t(b − a)) − f(a)] / t        ∀t ∈ (0, 1].

Since f ∈ C¹(S), we obtain, letting t → 0⁺,

        f(b) − f(a) ≥ lim_{t→0⁺} [g(t) − g(0)] / (t − 0) = g'(0),

where

        g(t) = f(a + t(b − a)),        g'(t) = f'(a + t(b − a)).(b − a),        g'(0) = f'(a).(b − a).

Indeed, g(t) = f(a₁ + t(b₁ − a₁), . . . , aₙ + t(bₙ − aₙ)) = f(x₁(t), x₂(t), . . . , xₙ(t)). Each function xⱼ(t) = aⱼ + t(bⱼ − aⱼ), j = 1, . . . , n, is differentiable with xⱼ'(t) = bⱼ − aⱼ. So g is differentiable and we obtain, by the chain rule formula,

        g'(t) = (∂f/∂x₁)(∂x₁/∂t) + (∂f/∂x₂)(∂x₂/∂t) + . . . + (∂f/∂xₙ)(∂xₙ/∂t)

              = [fx₁(a + t(b − a))](b₁ − a₁) + [fx₂(a + t(b − a))](b₂ − a₂) + . . . + [fxₙ(a + t(b − a))](bₙ − aₙ)

              = [(∇f)(a + t(b − a))].(b − a).

⇐=)

Assume that

        f(x) − f(u) ≥ [∇f(u)].(x − u)        ∀ x, u ∈ S.

Let a, b ∈ S and t ∈ [0, 1]. Choosing x = a and u = ta + (1 − t)b in the above inequality, we obtain

        f(a) − f(ta + (1 − t)b) ≥ [∇f(ta + (1 − t)b)].(a − [ta + (1 − t)b]) = (1 − t)[∇f(ta + (1 − t)b)].(a − b).        (∗)

Now, choose x = b and u = ta + (1 − t)b in the same inequality. We get

        f(b) − f(ta + (1 − t)b) ≥ [∇f(ta + (1 − t)b)].(b − [ta + (1 − t)b]) = −t[∇f(ta + (1 − t)b)].(a − b).        (∗∗)

Multiply the inequality (∗) by t > 0 and the inequality (∗∗) by (1 − t) > 0, then add the resulting inequalities. This gives

        t f(a) + (1 − t) f(b) − (t + (1 − t)) f(ta + (1 − t)b) ≥ [t(1 − t) − (1 − t)t] [∇f(ta + (1 − t)b)].(a − b) = 0.

Therefore f is convex.

Example 1. Show that f(x, y) = x² + y² is convex on R².

Solution: We have



        f(x, y) − f(s, t) − ∇f(s, t).⟨x − s, y − t⟩
                = x² + y² − (s² + t²) − ⟨2s, 2t⟩.⟨x − s, y − t⟩
                = x² + y² − (s² + t²) − 2s(x − s) − 2t(y − t)
                = (s − x)² + (t − y)² ≥ 0        ∀ (x, y), (s, t) ∈ R².

Thus f is convex on R². Note that by taking (s, t) = (0, 0), the critical point of f, we deduce that f(x, y) − f(0, 0) ≥ 0 ∀ (x, y) ∈ R². Hence, (0, 0) is a global minimum of f.
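The inequality of Theorem 2.3.1 can also be sampled numerically. The sketch below (Python/NumPy; an added check, not part of the text) verifies f(x) − f(a) ≥ ∇f(a).(x − a) for f(x, y) = x² + y² at random pairs of points.

import numpy as np

def f(p):
    return float(p @ p)              # f(x, y) = x^2 + y^2

def grad_f(p):
    return 2.0 * p                   # gradient (2x, 2y)

rng = np.random.default_rng(1)
for _ in range(10000):
    x, a = rng.uniform(-10.0, 10.0, size=(2, 2))
    # first-order characterization of convexity (Theorem 2.3.1)
    assert f(x) - f(a) >= grad_f(a) @ (x - a) - 1e-9
print("the gradient inequality holds at all sampled pairs")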

As we can expect from the above example, it will not always be easy to check the convexity or concavity of a function by solving inequalities. Next, we show a more practical characterization, but one requiring more regularity on the function.

2.3.3    Characterization of Convex/Concave C² Functions

Theorem 2.3.2 Strict convexity/concavity
Let S be a convex open set of Rⁿ and let f : S −→ R, f ∈ C²(S). Then

        (i)   Dk(x) > 0           ∀x ∈ S,  k = 1, . . . , n        =⇒        f is strictly convex in S.

        (ii)  (−1)ᵏ Dk(x) > 0     ∀x ∈ S,  k = 1, . . . , n        =⇒        f is strictly concave in S.

Here Dk(x), k = 1, . . . , n, are the n leading principal minors of the Hessian matrix Hf(x) = (fxixj(x))n×n of f.
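The sign conditions of Theorem 2.3.2 are easy to test numerically at a given point. The sketch below (Python/NumPy; an added illustrative aid rather than part of the text) evaluates the leading principal minors of a given Hessian matrix; applied to Hf = [[2, 0], [0, 2]] of f(x, y) = x² + y², it reports strict convexity at the sampled point.

import numpy as np

def leading_minors(H):
    """Leading principal minors D1, ..., Dn of a symmetric matrix H."""
    n = H.shape[0]
    return [np.linalg.det(H[:k, :k]) for k in range(1, n + 1)]

def classify(H):
    D = leading_minors(H)
    if all(d > 0 for d in D):
        return "criterion (i): strictly convex at this point"
    if all((-1) ** (k + 1) * d < 0 for k, d in enumerate(D, start=1)):
        return "criterion (ii): strictly concave at this point"
    return "test inconclusive from the leading minors alone"

H = np.array([[2.0, 0.0], [0.0, 2.0]])    # Hessian of f(x, y) = x^2 + y^2
print(leading_minors(H), classify(H))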

Proof. i) For a, b ∈ S, a ≠ b, and t ∈ [0, 1], define the function

        g(t) = f(tb + (1 − t)a) = f(a + t(b − a)) = f(x₁(t), . . . , xₙ(t))        with  xⱼ(t) = aⱼ + t(bⱼ − aⱼ),  j = 1, . . . , n.

By the chain rule theorem, we have

        g'(t) = [(∇f)(a + t(b − a))].(b − a).

Since f is C², g is also C² and we have

        g''(t) = [d/dt fx₁(a + t(b − a))](b₁ − a₁) + . . . + [d/dt fxₙ(a + t(b − a))](bₙ − aₙ).

For each i = 1, . . . , n, we have fxᵢ(a + t(b − a)) = fxᵢ(x₁(t), x₂(t), . . . , xₙ(t)). Then

        d/dt fxᵢ(a + t(b − a)) = (∂fxᵢ/∂x₁)(∂x₁/∂t) + (∂fxᵢ/∂x₂)(∂x₂/∂t) + . . . + (∂fxᵢ/∂xₙ)(∂xₙ/∂t)

                               = [fxᵢx₁(a + t(b − a))](b₁ − a₁) + . . . + [fxᵢxₙ(a + t(b − a))](bₙ − aₙ)

                               = Σ_{j=1}^{n} [fxᵢxⱼ(a + t(b − a))](bⱼ − aⱼ).

Hence

        g''(t) = Σ_{i=1}^{n} Σ_{j=1}^{n} [fxᵢxⱼ(a + t(b − a))](bᵢ − aᵢ)(bⱼ − aⱼ).

Now, by assumption, we have Dk(z) > 0 for all z ∈ S and for all k = 1, . . . , n; then the quadratic form

        Q(h) = Σ_{i=1}^{n} Σ_{j=1}^{n} [fxᵢxⱼ(a + t(b − a))] hᵢ hⱼ,

with the associated symmetric matrix [fxᵢxⱼ(a + t(b − a))]n×n, is positive definite. As a consequence, g''(t) > 0 and g is strictly convex. In particular

        f(tb + (1 − t)a) = g(t) = g(t·1 + (1 − t)·0) < t g(1) + (1 − t) g(0) = t f(b) + (1 − t) f(a),

and the strict convexity of f follows.

ii) Under the assumptions of ii), the quadratic form

        Q*(h) = Σ_{i=1}^{n} Σ_{j=1}^{n} [(−f)xᵢxⱼ(a + t(b − a))] hᵢ hⱼ = ᵗh H₋f(a + t(b − a)) h,

where H₋f = [(−f)xᵢxⱼ]n×n is the Hessian matrix of −f, is positive definite by assumption. As a consequence, (−g)''(t) > 0 and −g is strictly convex. In particular

        −f(tb + (1 − t)a) = (−g)(t) = (−g)(t·1 + (1 − t)·0) < t(−g)(1) + (1 − t)(−g)(0) = t(−f)(b) + (1 − t)(−f)(a)

        ⇐⇒        f(tb + (1 − t)a) > t f(b) + (1 − t) f(a)        ∀ a ≠ b,  ∀t ∈ (0, 1),

and the strict concavity of f follows.

We also have the following characterization.

Theorem 2.3.3 Convexity/concavity
Let S be a convex open set of Rⁿ and let f : S −→ R be C². Then

        f is convex in S    ⇐⇒        Δk(x) ≥ 0              ∀x ∈ S,  ∀ k = 1, . . . , n,

        f is concave in S   ⇐⇒        (−1)ᵏ Δk(x) ≥ 0        ∀x ∈ S,  ∀ k = 1, . . . , n.

Here the conditions bear on every principal minor of each order k. A principal minor Δr(x) of order r in the Hessian [fxᵢxⱼ(x)] of f is the determinant obtained by deleting n − r rows and the n − r columns with the same numbers (if the ith row (column) is selected, then so is the ith column (row)).
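For small n, all principal minors can be enumerated directly. The sketch below (Python/NumPy with itertools; an added illustration, not part of the text) lists every principal minor of a Hessian and applies the convexity criterion of Theorem 2.3.3; for Hf = [[2, 0], [0, 0]], the Hessian of f(x, y) = x² treated in Solved Problem 3 below, all principal minors are ≥ 0, so f is convex.

import numpy as np
from itertools import combinations

def principal_minors(H):
    """All principal minors of a symmetric matrix H, grouped by order."""
    n = H.shape[0]
    return {k: [np.linalg.det(H[np.ix_(idx, idx)])
                for idx in combinations(range(n), k)]
            for k in range(1, n + 1)}

H = np.array([[2.0, 0.0], [0.0, 0.0]])
minors = principal_minors(H)
print(minors)                                                 # {1: [2.0, 0.0], 2: [0.0]}
print(all(d >= 0 for ds in minors.values() for d in ds))      # True: f is convex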

Proof. We prove only the first assertion. The second one is established by replacing f by −f.

⇐=)  We proceed as in the proof of the previous theorem. We conclude that Q(h) is positive semi-definite. As a consequence, g''(t) ≥ 0 and g is convex. In particular

        f(tb + (1 − t)a) = g(t) = g(t·1 + (1 − t)·0) ≤ t g(1) + (1 − t) g(0) = t f(b) + (1 − t) f(a),

and the convexity of f follows.

=⇒)  Suppose f convex in S. It suffices to show that the quadratic form Q(h) satisfies

        Q(h) = Σ_{i=1}^{n} Σ_{j=1}^{n} fxᵢxⱼ(a) hᵢ hⱼ ≥ 0        ∀a ∈ S.

So, let a ∈ S. Since S is an open set, there exists ε > 0 such that Bε(a) ⊂ S. In particular, for h ∈ Rⁿ, h ≠ 0, we have

        a + th ∈ Bε(a)        ⇐⇒        ‖a + th − a‖ = |t| ‖h‖ < ε        ⇐⇒        |t| < ε/‖h‖.

        (−1)ᵏ Dk(x, y, z, t) > 0        ∀(x, y, z, t) ∈ R⁴,        for k = 1, 2, 3, 4.

Therefore, f is strictly concave on R⁴ and the point (12, 16, 12, 12) is the only global maximum point. Note that min_{R⁴} f doesn't exist since f takes large negative values. Indeed, we have, for example,

        f(x, 0, 0, 0) = 24x − x² −→ −∞        as        x −→ ±∞.


Solved Problems

1. – A power substation must be located at a point closest to m houses located at m distinct points (x1, y1), (x2, y2), . . . , (xm, ym). Find the optimal location by minimizing the sum of the squares of the distances between the houses and the substation.

Solution: Let (x, y) be the position of the power substation. Then, we have to look for (x, y) as the point that minimizes the function

        f(x, y) = d²((x, y), (x1, y1)) + d²((x, y), (x2, y2)) + . . . + d²((x, y), (xm, ym)),

which can be written as

        f(x, y) = [(x − x1)² + (y − y1)²] + [(x − x2)² + (y − y2)²] + . . . + [(x − xm)² + (y − ym)²].

Because f is a polynomial, it is differentiable on the open set R². Thus a global minimum point is also a local one. Therefore, it is a solution of

        ∇f(x, y) = ⟨2(x − x1) + 2(x − x2) + . . . + 2(x − xm), 2(y − y1) + 2(y − y2) + . . . + 2(y − ym)⟩ = ⟨0, 0⟩

        ⇐⇒        m·x − Σ_{k=1}^{m} xk = 0   and   m·y − Σ_{k=1}^{m} yk = 0

        ⇐⇒        x = (1/m) Σ_{k=1}^{m} xk        and        y = (1/m) Σ_{k=1}^{m} yk.

We have only one critical point. The Hessian matrix of f is

        Hf(x, y) = [ fxx  fxy ; fyx  fyy ] = [ 2m  0 ; 0  2m ].

The leading principal minors satisfy

        D1(x, y) = 2m > 0,        D2(x, y) = det [ 2m  0 ; 0  2m ] = 4m² > 0.

So f is strictly convex on R². Then, the critical point is the global minimum of f and describes the optimal location of the substation: the centroid of the m points.
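The general conclusion is easy to reproduce numerically: the optimal location is the centroid (mean) of the m points. The sketch below (Python/NumPy; an added illustration, not part of the text) computes the centroid for a sample configuration and checks that the gradient of f vanishes there.

import numpy as np

rng = np.random.default_rng(2)
points = rng.uniform(0.0, 10.0, size=(7, 2))      # m = 7 sample house locations

def grad_f(p):
    # gradient of the sum of squared distances to the points
    return 2.0 * np.sum(p - points, axis=0)

centroid = points.mean(axis=0)                     # ((1/m) sum xk, (1/m) sum yk)
print(centroid, grad_f(centroid))                  # gradient is (0, 0) at the centroid
assert np.allclose(grad_f(centroid), 0.0)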

2. – Let f be a function of two variables given by f (x, y) = x2 + y 4 − 4xy

for all x and y.

i) Calculate the first and second order partial derivatives of f. ii) Find all the stationary points of f and classify them by means of the second derivatives test. iii) Does f have any global extreme points? iv) Use software to graph f.

Solution: i) and ii) Since the function f is differentiable (because it is a polynomial), the local extreme points are critical, i.e., solutions of

        ∇f(x, y) = ⟨2x − 4y, 4y³ − 4x⟩ = ⟨0, 0⟩

        ⇐⇒        2x − 4y = 0  and  4y³ − 4x = 0
        ⇐⇒        x = 2y  and  y³ − 2y = 0
        ⇐⇒        x = 2y  and  y(y² − 2) = 0
        ⇐⇒        [x = 2y and y = 0]   or   [x = 2y and y = √2]   or   [x = 2y and y = −√2].

We deduce that (0, 0), (2√2, √2) and (−2√2, −√2) are the critical points.

Classification of the critical points: The Hessian matrix of f is

        Hf(x, y) = [ fxx  fxy ; fyx  fyy ] = [ 2  −4 ; −4  12y² ].

The leading principal minors are

        D1(x, y) = 2,        D2(x, y) = det [ 2  −4 ; −4  12y² ] = 24y² − 16.

An application of the second derivatives test gives the characterization in Table 2.7.

        (x, y)            D1(x, y)    D2(x, y)    type
        (0, 0)            2           −16         saddle point
        (2√2, √2)         2           32          local minimum
        (−2√2, −√2)       2           32          local minimum

        TABLE 2.7: Critical points classification of f(x, y) = x² + y⁴ − 4xy

iii) and iv) The first plot in Figure 2.28 shows the shape of a saddle. On the second plot, there are two families of closed curves and a hyperbola, which confirm the previous classification of the critical points.

FIGURE 2.28: Graph and level curves of f

Global extreme points. We cannot conclude about the concavity/convexity of f on R² since the signs of the principal minors of the Hessian are as follows:

        Δ1^{11}(x, y) = 12y² ≥ 0,        Δ1^{22}(x, y) = 2 ≥ 0,        Δ2(x, y) = 24y² − 16,

and Δ2 depends on y. Thus, f is neither convex nor concave on R².


However, we remark that, on the y-axis, we have

        f(0, y) = y⁴ −→ +∞        as        y −→ ±∞.

So f cannot attain a maximum value in R².

Moreover, by completing the squares, we compare the values of f with its value at the local minimum points f(2√2, √2) = f(−2√2, −√2) = −4 and obtain

        f(x, y) + 4 = (x − 2y)² + (y² − 2)² ≥ 0        ∀(x, y) ∈ R².

Thus, f attains its global minimal value −4 at these two points.

3. – Let f(x, y) = x².
i) Show that f has infinitely many critical points and that the second derivatives test fails for these points.
ii) Show that f is convex on R².
iii) What is the minimum value of f? Give the minima points.
iv) Does f have any local or global maxima? Justify your answer.

Solution:

FIGURE 2.29: Graph and level curves of f

i) Since f is a differentiable function (because it is a polynomial), the local extreme points are critical ones, i.e., solutions of

        ∇f(x, y) = ⟨2x, 0⟩ = ⟨0, 0⟩        ⇐⇒        x = 0.

We deduce that the points on the y axis are all critical points of f .


Classification of the critical points: The Hessian matrix of f is

        Hf(x, y) = [ fxx  fxy ; fyx  fyy ] = [ 2  0 ; 0  0 ].

The leading principal minors at the critical points (0, y) of f (y ∈ R) are

        D1(0, y) = 2 > 0,        D2(0, y) = det [ 2  0 ; 0  0 ] = 0.

So the second derivative test is inconclusive.

ii) The principal minors are

        Δ1^{11}(x, y) = 0,        Δ1^{22}(x, y) = 2,        Δ2(x, y) = det [ 2  0 ; 0  0 ] = 0,

and they satisfy Δk(x, y) ≥ 0, k = 1, 2, for all (x, y) ∈ R². Therefore f is convex in R².

iii) Note that

        f(x, y) = x² ≥ 0 = f(0, y)        ∀(x, y) ∈ R².

We deduce that the critical points are global minimum points for f in R².

iv) Since f is infinitely differentiable (because it is polynomial) in the open set R², an absolute maximum of f would be a local maximum, and therefore a critical point. But all the critical points are minimum points for f. Hence, f has no local nor absolute maxima; see Figure 2.29. In fact, on the x-axis, we have

        f(x, 0) = x² −→ +∞        as        x −→ +∞.

So f cannot attain a maximum value M in R². Indeed, if it did, we would have f(x, y) ≤ M ∀(x, y) ∈ R², hence

        f(x, 0) = x² ≤ M        ∀x ∈ R,

which is not possible: for example, f(M, 0) = M² > M for all M > 1.


4. – Discuss the convexity/concavity on R² of

        f(x, y) = 4xy − x² − y² − 6x.

Are there global extreme points?

Solution:

FIGURE 2.30: Graph and level curves of f

We have ∇f(x, y) = ⟨4y − 2x − 6, 4x − 2y⟩. Since f is C^∞, f is convex if and only if Hf is positive semi-definite, where the Hessian matrix of f is

        Hf(x, y) = [ fxx  fxy ; fyx  fyy ] = [ −2  4 ; 4  −2 ].

The principal minors of Hf are

        Δ1^{11}(x, y) = −2,        Δ1^{22}(x, y) = −2,        Δ2(x, y) = det [ −2  4 ; 4  −2 ] = −12.

So f is neither convex nor concave on R²; see Figure 2.30. Remark that

        f(0, y) = −y² −→ −∞        as        y −→ ±∞,

so f takes large negative values and doesn't attain its minimal value.


On the other hand, when looking for the critical points of f, we obtain

        ∇f(x, y) = ⟨4y − 2x − 6, 4x − 2y⟩ = ⟨0, 0⟩        ⇐⇒        x = 1  and  y = 2x = 2.

This point is a saddle point. It will help us to find a direction of increase of the values of f. Indeed, by completing the squares, we obtain

        f(x, y) − f(1, 2) = −(2x − y)² + 3(x − 1)²,

from which we deduce that

        f(x, 2x) = f(1, 2) + 3(x − 1)² −→ +∞        as        x −→ ±∞.

So f takes large positive values and doesn't attain its maximal value either.
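The indefiniteness of the Hessian found above can also be seen from its eigenvalues. The sketch below (Python/NumPy; an added check, not part of the text) shows that Hf has one positive and one negative eigenvalue, so f is neither convex nor concave and (1, 2) is indeed a saddle point.

import numpy as np

H = np.array([[-2.0, 4.0],
              [4.0, -2.0]])         # Hessian of f(x, y) = 4xy - x^2 - y^2 - 6x

eigenvalues = np.linalg.eigvalsh(H)
print(eigenvalues)                   # [-6.  2.]: one negative, one positive
assert eigenvalues.min() < 0 < eigenvalues.max()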

5. – Let f be the function defined by: f (x, y) = x4 − 2x2 + y 2 − 6y. i) Find the critical points of f . ii) Use the second derivative test to classify the critical points of f . iii) Find the global minimum value of f on R2 by completing squares. iv) Is there a global maximum value of f on R2 ? v) Show that f is convex on each of the open convex sets √ √ S1 = {(x, y) : x < −1/ 3} and S2 = {(x, y) : x > 1/ 3}. vi) Sketch these sets and plot the critical points. vii) Find

min f (x, y) S1

and

viii) Set S = S1 ∪ S2 . Find ix) Use

min f (x, y) S2

(justify)

m0 = min (x4 − 2x2 ) 2 R \S

(x4 − 2x2 )  m0 on R2 \ S to deduce

min f (x, y).

R2 \S

Unconstrained Optimization

113

Solution: The shape of the surface, in Figure 2.31, shows that the function is neither convex, nor concave. z  x4  2 x2  y2  6 y

4.0

5 3.5 0 z 3.0

5

10 2.5

2 1 0 x

1

2.0 2

2

1

0

1

2

FIGURE 2.31: Graph and level curves of f i) Since f is a differentiable function (because it is a polynomial), the local extreme points are critical points solution of ⎧ ⎨ 4x(x + 1)(x − 1) = 0 ∇f (x, y) = 4x3 − 4x, 2y − 6 = 0, 0 ⇐⇒ ⎩ 2(y − 3) = 0

⇐⇒

⇐⇒

⎧ ⎨ x=0

or

x+1=0

or

x−1=0



y=3 ⎧ x = 0 and y = 3 ⎪ ⎪ ⎪ ⎪ ⎨ or [x = −1 and y = 3] ⎪ ⎪ ⎪ ⎪ ⎩ or [x = 1 and y = 3].

We deduce that (−1, 3), (0, 3) and (1, 3) are the critical points of f . ii) Classification of the critical points: The Hessian matrix of f is Hf (x, y) =



fxx fyx

The leading principal minors are D1 (x, y) = 12x2 − 4

fxy fyy



 =

12x2 − 4 0

0 2



   12x2 − 4 0  . D2 (x, y) =  0 2 

114

Introduction to the Theory of Optimization in Euclidean Space

(x, y)

D1 (x, y)

(−1, 3)

(0, 3)

D2 (x, y)   8   0

8

(1, 3)

 0  = 16 2 

  −4 0   0 2

−4

  8   0

8

type

local minimum

   = −8 

 0  = 16 2 

saddle point

local minimum

TABLE 2.8: Classifying critical points of f (x, y) = x4 − 2x2 + y 2 − 6y The second derivative test gives the following characterization of the points in Table 2.8. iii) Global minimum value of f : We have f (x, y) = (x2 − 1)2 + (y − 3)2 − 10  −10 = f (1, 3) = f (−1, 3)

∀(x, y) ∈ R2

Thus min

(x,y)∈R2

f (x, y) = −10 = f (1, 3) = f (−1, 3).

iv) Global maximum value of f : We have f (x, 3) = (x2 − 1)2 − 10

and

lim f (x, 3) = +∞.

x→±∞

So f doesn’t attain its maximum value on R2 . v) The principal minors of the Hessian of f are    fyy  = 2  0, Δ11 1 =   12x2 − 4  Δ2 =   0

   fxx  = 12x2 − 4  0 ⇐⇒ |x|  √1 Δ22 1 = 3

 0   = 8(3x2 − 1)  0  2 

⇐⇒

1 |x|  √ 3

Unconstrained Optimization So Δk  0

1 |x|  √ . 3

⇐⇒

k = 1, 2

115

Hence, Hf is semi definite positive on each open convex set S1 and S2 . Hence f

is convex on each of the open convex sets

S1 and S2 .

vi) Sketch of the sets S1 and S2 in Figure 2.32. y

4

2

4

2

S2

2

4

x

2

S1

4

FIGURE 2.32: The convex sets S1 , and S2

vii) Since f is convex on S1 = [x < − √13 ] and the critical point (−1, 3) is in S1 with f (−1, 3) = −10, then min f (x, y) = f (−1, 3) = −10. S1

f is also convex on S2 = [x > f (1, 3) = −10, then

√1 ] 3

and the critical point (1, 3) is in S2 with

min f (x, y) = f (1, 3) = −10. S2

viii) We have ϕ(x) = x4 − 2x2

ϕ (x) = 4x3 − 4x = 4x(x − 1)(x + 1)

Using Table 2.9, we find that min

x∈[− √13 , √13 ]

1 5 1 ϕ(x) = ϕ(− √ ) = ϕ( √ ) = − . 9 3 3

116

Introduction to the Theory of Optimization in Euclidean Space x ϕ (x) ϕ(x)

−1 + 

−1

0 0 0

1 − 

−1

TABLE 2.9: Variations of ϕ(x) = x4 − 2x2 ix) We deduce that x4 − 2x2  −

5 9

1 1 ∀x ∈ [− √ , √ ] 3 3

5 1 5 5 f (x, y)  − +y 2 −6y = (y−3)2 −9−  −9− = f (± √ , 3) ∀(x, y) ∈ R2 \S. 9 9 9 3 Hence, 86 1 5 1 min f (x, y) = f (− √ , 3) = f ( √ , 3) = −9 − = − . 9 9 3 3

R2 \S

Remark. Note that f (x, y)  −9 −

5  −10 9

∀(x, y) ∈ R2 \ S.

We also have f (x, y)  −10 = f (−1, 3) = f (1, 3)

∀(x, y) ∈ S.

Hence, f (x, y) = f (−1, 3) = f (1, 3) = −10. min 2 R

Unconstrained Optimization

2.4

117

Extreme Value Theorem

The first main result of this section is

Theorem 2.4.1 Extreme value theorem Let S be a closed bounded set of Rn . Let f ∈ C 0 (S). Then f attains both its maximal and minimal values in S; that is, min f exist. max f and S

S

The proof of the extreme value theorem uses the fact that the image of a closed bounded set S of Rn by a real valued continuous function f : S −→ R is a closed bounded set of R [18]. Thus f (S) is a closed bounded interval [a, b]. Therefore ∃xm , xM ∈ S

such that

f (xm ) = a,

f (xM ) = b.

Since f (S) = [f (xm ), f (xM )], then f (xm )  f (x)  f (xM )

∀x ∈ S.

Therefore, f (xm ) = min f (x) S

and

f (xM ) = max f (x). S

Remark 2.4.1 When f is a continuous function on a closed and bounded set S, then the extreme value theorem guarantees the existence of an absolute maximum and an absolute minimum of f on S. These absolute extreme points can occur either on the boundary of S or in the interior of S. As a consequence, to look for these points, we can proceed as follows: – find the critical points of f that lie in the interior of S – find the boundary points where f takes its absolute values on the boundary – compare the values of f taken at the critical and boundary points found. The largest of the values of f at these points is the absolute maximum and the smallest is the absolute minimum.

118

Introduction to the Theory of Optimization in Euclidean Space

Example 1. Find the extreme values of f (x) = 13 x3 − 12 x2 − 2x + 3 on the intervals [−1, 1] and [−2, 2]. Solution: We have

f  (x) = x2 − x − 2 = (x − 2)(x + 1)

f  (x) = 0

⇐⇒

x=2

and

x = −1.

or

We deduce that 2 and −1 are the critical points of f ; see Figure 2.33. • The values max f (x) and min f (x) exist by the extreme value theorem x∈[−1,1]

x∈[−1,1]

because f is continuous on the closed bounded interval [−1, 1]. Now, since there is no critical points in the interior of the interval (−1, 1), these values must be in {f (−1) , f (1)}. Comparing these two values, we conclude that 25 6

max f (x) = f (−1) =

x∈[−1,1]

and

min f (x) = f (1) =

x∈[−1,1]

5 . 6

• The values max f (x) and min f (x) exist by the extreme value theorem x∈[−2,2]

x∈[−2,2]

because f is continuous on the closed bounded interval [−2, 2]. The critical point −1 is in the interior of the interval (−2, 2), the absolute values must be 7 25 1 in {f (−2) , f (−1) , f (2)} = { , , − }. Comparing these three values, we 3 6 3 conclude that max f (x) = f (−1) =

x∈[−2,2]

25 6

1 min f (x) = f (2) = − . 3

and

x∈[−2,2]

y y

x3 3



x2 2

2x3 4

2

4

2

2

4

x

2

4

FIGURE 2.33: Absolute values on a closed interval

Unconstrained Optimization

119

Example 2. Find the absolute maximum and minimum values of f (x, y) = 4xy − x2 − y 2 − 6x on the closed triangle S = {(x, y) : 0  x  2,

0  y  3x}.

Solution: f is continuous (because it is a polynomial) on the triangle S, which is a bounded and closed subset of R2 . So f attains its absolute extreme points on S at the stationary points lying at the interior of S or on points located at the boundary of S (see Figure 2.34). ∗ Interior stationary points of f : We have ∇f = 4y − 2x − 6, 4x − 2y = 0, 0

⇐⇒

(x, y) = (1, 2).

The point (1, 2) is the only critical point of f and f (1, 2) = −3. y

6 2 2 z  x 6 4 yx6x y

y

4

2 4

L3

0 0 L2 5 z

2

10

15 0.0 1

1 L1

2

3

4

0.5

x

1.0 x

1.5 2.0

FIGURE 2.34: Extreme values of f on the triangular plane region S ∗ Extreme values of f at the boundary of S: Let L1 , L2 and L3 be the three sides of the triangle, defined by: L1 = {(x, 0), 0  x  2}

L2 = {(2, y), 0  y  6}

L3 = {(x, 3x), 0  x  2}. – On L1 , we have: f (x, 0) = −x2 − 6x = g(x), g  (x) = −2x − 6. We deduce from the monotony of g (see Table 2.10) that

120

Introduction to the Theory of Optimization in Euclidean Space x g  (x) g(x)

0

2 − 

0

− 16

TABLE 2.10: Variations of g(x) = −x2 − 6x on [0, 2]

max f = f (0, 0) = 0

min f = f (2, 0) = −16.

and

L1

L1

– On L2 , we have: f (2, y) = −y 2 + 8y − 16 = h(y), y h (y) h(y) 

0

4 0

+ −16



h (y) = −2y + 8. 6

− 

0

−4

TABLE 2.11: Variations of h(y) = −y 2 + 8y − 16 on [0, 6] Then, from Table 2.11, we obtain max f = f (2, 4) = 0

min f = f (2, 0) = −16.

and

L2

L2

– On L3 , we have: f (x, 3x) = 2x2 − 6x = l(x), x l (x) l(x) 

0 0

− 

3 2

l (x) = 4x − 6. 2

0 − 92

+ 

−4

TABLE 2.12: Variations of l(x) = 2x2 − 6x on [0, 2] Using Table 2.12, we deduce that max f = f (0, 0) = 0 L3

and

min f = f (3/2, 9/2) = −9/2. L3

∗ Conclusion: We list, in Table 2.13, the values of f at the interior critical points and at the boundary points where an absolute extreme value occurs on the considered side of the boundary. We conclude that the absolute maximum value of f is f (0, 0) = f (2, 4) = 0 and the absolute minimum value is f (2, 0) = −16. Now, here is a version of an extreme value theorem for a continuous function on an unbounded domain.

Unconstrained Optimization (x, y) f (x, y)

(1, 2) −3

(0, 0) 0

(2, 0) −16

121

(2, 4) 0

(3/2, 9/2) −9/2

TABLE 2.13: Values of f at the points

Theorem 2.4.2 Let f (x) be a continuous function on an unbounded set S of Rn such that lim

x→+∞

(resp. − ∞).

f (x) = +∞

Then, there exists an element x∗ ∈ S such that f (x∗ ) = min f (x)

(resp. max f (x)).

x∈S

x∈S

Proof. Let x0 ∈ S. There exists R0 > 0 such that ∀x ∈ S :

x > R0 inf f (x)

So the optimization problem min f (x)

x∈S0

=⇒

x∈S

f (x) > f (x0 ).

is equivalent to

S0 = S ∩ {x ∈ Rn :

with

x  R0 }.

Indeed, we have S0 ⊂ S

min f (x) = inf f (x)  inf f (x).

=⇒

x∈S0

x∈S0

x∈S

Moreover, we have f (x) > f (x0 ) f (x)  min f (z)

if

z∈S0

then

x ∈ S \ S0

if

x ∈ S0

f (x)  max f (x0 ), min f (z)  min f (z) z∈S0

z∈S0

Hence inf f (x)  min f (z).

x∈S

z∈S0

∀x ∈ S.

122

Introduction to the Theory of Optimization in Euclidean Space

Note that the minimum min f (z) is attained by the extreme value theorem z∈S0

since S0 is a bounded closed set of Rn . Therefore ∃x∗ ∈ S0 Now, since, we have

min f (x) = f (x∗ ).

such that

inf f (x)  f (x∗ ),

x∈S

x∈S0

we deduce that

f (x∗ ) = inf f (x) = min f (x). x∈S

x∈S0

Example 3. Let f (x) = 3x4 + 4x3 − 12x2 + 2. i)

Show that f has an absolute minimum on R.

ii)

Find the minimal value of f on R.

Solution: The graphing in Figure 2.35 shows three local extrema. y 10

3

2

1

1

2

3

x

10

20

y  3 x4  4 x3  12 x2  2

30

FIGURE 2.35: Absolute minimum of f i) f is continuous on R since f is polynomial. Moreover, we have lim

|x|→+∞

f (x) = +∞.

Then f attains its minimum value at some point x∗ ∈ R. ii) Since R is an open set and x∗ ∈ R, then x∗ must be a critical point of f . We have f  (x) = 12x3 + 12x2 − 24x = 12x(x + 2)(x − 1) f  (x) = 0

⇐⇒

x = 0,

x = −2

or

and x = 1.

Unconstrained Optimization

123

We deduce that ! " ! " min f (x) = min f (0), f (−2), f (1) = min 2, −30, −3 = −30 = f (−2). x∈R

Example 4. Let f (x) = p(x) = xn + +an−1 xn−1 + · · · + a1 x + a0 be a polynomial with n  1. If n is odd, then lim p(x)

x→+∞

and

lim p(x)

x→−∞

have opposite signs (one is +∞ and the other is −∞), so f has no absolute extreme points. If n is even, then the limits above have the same sign. When they are both equal to +∞, f has an absolute minimum but no absolute maximum. When the limits are both equal to −∞, f has an absolute maximum but no absolute minimum.

124

Introduction to the Theory of Optimization in Euclidean Space

Solved Problems

1. – Define the function f (x, y) =

1 2 1 2 x − y on the closed unit disk. Find 4 9

i) the critical points ii) the local extreme values iii) the absolute extreme values.

Solution: z

x2 4



y2 9

1.0

0.5

0.0

0.2 1.0

0.1 0.0

0.5 0.5

0.1 0.0

1.0 0.5 0.5

0.0

1.0

0.5 1.0

1.0

1.0

0.5

0.0

0.5

1.0

FIGURE 2.36: Graph of f on the unit disk and level curves i) Since f is differentiable, the critical points are solution of ∇f (x, y) = 0, 0. That is x 2y ∇f (x, y) =  , −  = 0, 0 2 9

⇐⇒

(x, y) = (0, 0).

So (0, 0), the origin of the unit disk, is the unique critical point of f .

Unconstrained Optimization ii) Nature of the local extreme point. fxx =

1 , 2

125

We have

2 fyy = − , 9

fxy = 0.

1 2 ](0, 0) = − < 0 Then D2 (0, 0) = [fxx fyy − fxy 9 point; see Figure 2.36.

and (0, 0) is a saddle

iii) Global extreme points. Since the unit disk is a bounded closed subset of R2 , f attains its global extreme points on this set since it is continuous (because it is a polynomial function). These extreme points are interior critical points or points on the boundary of the disk. ∗ Extreme values of f on the boundary of the disk: On the unit disk, f takes the values (see Table 2.14) f (cos t, sin t) =

1 1 cos2 t − sin2 t = g(t), 4 9

t ∈ [0, 2π].

We have 2 13 1 g  (t) = − cos t sin t − sin t cos t = − sin t cos t 2 9 18

θ sin t cos t g  (t) g(t)

π 2

0

1 4

+ + − 



1 9

3π 2

π + − + 

1 4

− − −

TABLE 2.14: Variations of g(t) =



1 9

cos2 t −

1 9

 1 4

2π − + + 

1 4

sin2 t

∗ Conclusion: We list, in Table 2.15, the values of f at the critical point and at the boundary points where f attains its absolute values on that boundary. (x, y)

(0, 0)

f (x, y)

0

(1, 0) 1 4

(0, 1) 1 − 9

(−1, 0) 1 4

(0, −1) 1 − 9

TABLE 2.15: Values of f (x, y) = 14 x2 − 19 y 2 at candidate points

126

Introduction to the Theory of Optimization in Euclidean Space

The absolute maximal value of f on the disk is (1, 0) and (−1, 0).

1 and is attained on the points 4

The absolute minimal value of f on the disk is − points (0, 1) and (0, −1).

1 and is attained on the 9

2. – Find the absolute extreme points of the function f (x, y) = (4x − x2 ) cos y on the rectangular region

1  x  3,

−π/4  y  π/4.

Solution: y 2.0 1.5 L3

1.0 0.5 L4 1

L2 1

2

0.5

3

4

x

R

1.0

L1

1.5

FIGURE 2.37: The plane region R

f is continuous (because it is the product of a polynomial function and the cosine function) on the rectangle R = [1, 3] × [−π/4, π/4] (see Figure 2.37), which is a closed bounded set of R2 , then f attains its absolute extreme points on R. These points are attained at the critical points of f located at the interior of R or on points located on ∂R. ∗ Interior stationary points of f . We have ∇f = (4 − 2x) cos y, −(4x − x2 ) sin y = 0, 0

⇐⇒

⎧ ⎨ x=2

or

cos y = 0



or

x=4

x=0

⇐⇒ or

sin y = 0

(x, y) = (2, 0).

Unconstrained Optimization

127

The point (2, 0) is the only critical point of f , as shown in Figure 2.38, and f (2, 0) = 4. ∗ Extreme values of f at the boundary of R: Let L1 , L2 , L3 and L4 the four sides of the rectangle R, defined by: L1 = {(x, − π4 ), L3 = {(x, π4 ),

1  x  3}, 1  x  3},

L2 = {(3, y),

− π4  y 

π 4}

L4 = {(1, y),

− π4  y 

π 4 }.

1.0

z  4 x  x2  cosy

0.5

4.0

0.0

3.5 3.0

0.5

2.5 0.5

0.0

1.0 1.5 2.0

0.5 2.5

1.0 1.0

3.0

1.5

2.0

2.5

3.0

FIGURE 2.38: Values of f on R and level curves

– On L1 , we have: f (x, − π4 ) = x g  (x) g(x)

1 3 2



2 2 (4x

− x2 ) = g(x),

g  (x) =

2 √

2

+ 

√ 2 2

√ 2(2 − x).

3 − 

3 2



TABLE 2.16: Variations of g(x) =

2 2 (4x

√ 2

− x2 )

We deduce from the monotony of g, described in Table 2.16, that √ π max f = f (2, − ) = 2 2 L1 4

√ π 3 2 π . min f = f (1, − ) = f (3, − ) = L1 4 4 2

– On L2 , we have: f (3, y) = 3 cos y = h(y), h (y) = −3 sin y. From the monotony of h (see Table 2.17), we have

max f = f (3, 0) = 3 L2

√ π 3 2 π . min f = f (3, − ) = f (3, − ) = L2 4 4 2

128

Introduction to the Theory of Optimization in Euclidean Space − π4

y h (y) h(y) 

3 2



π 4

0 2



+ 

3 2



3

√ 2

TABLE 2.17: Variations of h(y) = 3 cos y – On L3 , we have: f (x, π4 ) = x l (x) l(x)



2 2 (4x

− x2 ) = l(x),

1 3 2

l (x) =

2 √

2

+ 



2(2 − x).

3 − 

√ 2 2

3 2



TABLE 2.18: Variations of l(x) =

2 2 (4x



2

− x2 )

As a consequence of Table 2.18, we have √ π 3 2 π . min f = f (1, ) = f (3, ) = L3 4 4 2

√ π max f = f (2, ) = 2 2 L3 4

– On L4 , we have: f (1, y) = 3 cos y = m(y), y m (y) m(y) 

− π4 3 2

√ 2

m (y) = −3 sin y. π 4

0 + 

− 3



3 2

√ 2

TABLE 2.19: Variations of m(y) = 3 cos y From the behaviour described in Table 2.19, we deduce that

max f = f (1, 0) = 3 L4

√ π 3 2 π . min f = f (1, − ) = f (1, − ) = L4 4 4 2

∗ Conclusion: We list the particular points found above in Table 2.20. The maximal value of f on R is 4 and it is attained at the point (2, 0), which is an interior critical point. √ The minimal value of f on R is 3 2 2 and it is attained at the points (1, − π4 ), (1, π4 ), (3, − π4 ) and (3, π4 ).

Unconstrained Optimization (x, y) f (x, y)

(2, 0) 4

(2, ± π4 ) √ 2 2

(1, √ ± π4 ) 3 2 2

129

(3, √ ± π4 ) 3 2 2

(3, 0) 3

(1, 0) 3

TABLE 2.20: Values of f (x, y) = (4x − x2 ) cos y at candidate points 3. – Find the points on the surface z 2 = xy + 4 that are closer to the origin.

Solution: The distance of a point (x, y, z) to the origin is given by d =  x2 + y 2 + z 2 . The problem is equivalent to minimize d2 = x2 + y 2 + z 2 on the set z 2 = xy + 4 or equivalently to look for min x2 + y 2 + (xy + 4) = f (x, y).

S=R2

Note that the function f is continuous on the unbounded set R2 and satisfies 1 1 1 f (x, y)  x2 + y 2 − (x2 + y 2 ) + 4 = (x2 + y 2 ) + 4 = (x, y) + 4 2 2 2 since |xy| 

1 2 (x + y 2 ). 2

lim

f (x, y) = +∞.

Thus (x,y)→+∞

Hence, the minimization problem has a solution. Note that a global minimum of the problem is also a local minimum, i.e., solution of

∇f = 2x+y, 2y +x = 0, 0

⇐⇒

(x, y) = (0, 0)

since

  2   1

 1 

0. = 2 

The point (0, 0) is the only critical point of f and f (0, 0) = 4. Since the global minimum exists, then (0, 0) is the global minimum and the corresponding points on the surface z 2 = 4 + xy where the distance is closer to (0, 0, 0) are: (0, 0, ±2). We can also verify that (0, 0) is a local minimum by applying the second derivatives test. Indeed, we have fxx = 2,

fxy = 1,

fyy = 2

130

Introduction to the Theory of Optimization   f fxy D2 (x, y) =  xx D1 (x, y) = fxx = 2 fxy fyy

in Euclidean Space      2 1  =    1 2  = 3.

Since D1 (0, 0) > 0 and D2 (0, 0) > 0, then (0, 0) is a strict local minimum.

4. – i) Find the quantities x, y that should be produced to maximize the total profit function f (x, y) = x + 4y subject to 2x + 3y  19, x + y  8,

−3x + 2y  4 0  x  6,

y  0.

ii) Use level curves to solve the problem geometrically.

Solution: y

6

L4 4

L5

L3 2

S

L2

L6

2

L1

4

6

x

FIGURE 2.39: Hexagonal plane region S

i) Set S = {(x, y) : 2x + 3y  19, −3x + 2y  4, x + y  8, 0  x  6, y  0}. The set S is the region of the plan xy, located in the first quadrant and bounded by the lines 2x+3y = 19, −3x+2y = 4, x+y = 8; see Figure 2.39. It is a closed bounded convex of R2 . Since f is continuous (because it is a

Unconstrained Optimization

131

polynomial), it attains its absolute extreme points on S at the stationary points lying at the interior of S or on points located at the boundary of S. ∗ Interior stationary points of f .

We have ∀(x, y) ∈ R2 .

∇f = 1, 4 = 0, 0 There is no critical point of f . ∗ Extreme values of f at the boundary of S:

Let L1 , · · · , L6 be the six sides of the hexagon S, defined by: L1 = {(x, 0),

L3 = {(x, 8 − x),

L5 = {(x,

4 + 3x ), 2

0  x  6},

5  x  6},

0  x  2},

L2 = {(6, y),

L4 = {(x,

0  y  2}

19 − 2x ), 2

L6 = {(0, y),

2  x  5},

0  y  2}.

On L1 , we have: f (x, 0) = x, max f = f (6, 0) = 6 L1

min f = f (0, 0) = 0. L1

On L2 , we have: f (6, y) = 6 + 4y, max f = f (6, 2) = 10 L2

min f = f (6, 0) = 6. L2

On L3 , we have: f (x, 8 − x) = x + 4(8 − x) = 32 − 3x, max f = f (5, 3) = 17 L3

min f = f (6, 2) = 10. L3

On L4 , we have: f (x, 19−2x ) = x + 43 (19 − 2x) = 13 (76 − 5x), 3 max f = f (2, 5) = 22 L4

min f = f (5, 3) = 17. L4

On L5 , we have: f (x, 4+3x 2 ) = x + 2(4 + 3x) = 8 + 7x, max f = f (2, 5) = 22 L5

min f = f (0, 2) = 8. L5

132

Introduction to the Theory of Optimization in Euclidean Space On L6 , we have: f (0, y) = 4y, max f = f (0, 2) = 8

min f = f (0, 0) = 0.

L6

L6

∗ Conclusion: We list, in Table 2.21 below, the values of f at the boundary points where f takes absolute values on each side of the set S. We conclude that the absolute maximum value of f is f (2, 5) = 22 and the absolute minimum value is f (0, 0) = 0. (x, y) f (x, y)

(0, 0) 0

(6, 0) 6

(6, 2) 10

(5, 3) 17

(2, 5) 22

(0, 2) 8

TABLE 2.21: Values of f (x, y) = x + 4y at candidate points

ii) To solve the problem geometrically, we sketch the level curves x + 4y = k. The profit k is attained if the line has common points with the region S. The profit 0 is attained at the point (0, 0) at the level curve x + 4y = 0. When the profit k increases, the lines x + 4y = k are parallel and move out farther to reach the point (2, 5) where the highest profit is attained; see Figure 2.40. y

6

L4 4

x4y 22

L5

L3 2

S

L2

L6

2

L1

4

6

x

x4y 0

FIGURE 2.40: Level curve of highest profit

Unconstrained Optimization

133

Remark 2.4.2 Note that the points that appear in the above table are the vertices of the hexagon S. The extreme points are attained at two of these vertices. This is true in the more general problem min p.x = p1 x1 + · · · + pn xn

or

S

with

S = {x = (x1 , · · · , xn ) ∈ R+

n

max p.x S

: Ax  b}

where A = (aij )1im, 1jn ,

b =t (b1 , · · · , bn )

and

p =t (p1 , · · · , pn ).

We look for the extreme points on the polyhedra U = {x = (x1 , · · · , xn ) ∈ R+n : Ax = b}. We establish, when it exists, that an extreme point is at least a corner of the polyhedra U . However, when m and n take large values, the number of corners is very important, and linear programming develops various methods to approach these optimal values of the objective function [19], [5], [29].
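For small instances such as the hexagon of Solved Problem 4, these optimal vertices can be found directly with a linear programming routine. The sketch below (Python with SciPy's linprog; an added illustration under the assumption that SciPy is available, not part of the text) recovers the maximizer (2, 5) with profit 22.

from scipy.optimize import linprog

# maximize x + 4y  <=>  minimize -(x + 4y)
c = [-1.0, -4.0]
A_ub = [[2.0, 3.0],     # 2x + 3y <= 19
        [-3.0, 2.0],    # -3x + 2y <= 4
        [1.0, 1.0]]     # x + y <= 8
b_ub = [19.0, 4.0, 8.0]
bounds = [(0.0, 6.0), (0.0, None)]   # 0 <= x <= 6, y >= 0

res = linprog(c, A_ub=A_ub, b_ub=b_ub, bounds=bounds, method="highs")
print(res.x, -res.fun)               # approximately [2. 5.] and 22.0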

Chapter 3 Constrained Optimization-Equality Constraints

In this chapter, we are interested in optimizing functions f : Ω ⊂ Rn −→ R over subsets described by equations

g(x) = (g1 (x), g2 (x), . . . , gm (x)) = cRm

with

m < n x ∈ Rn .

Denote the set of the constraints by

S = [g(x) = c] = [g1 (x) = c1 ] ∩ [g2 (x) = c2 ] ∩ . . . ∩ [gm (x) = cm ]. In dimension n = 3, when m = 1, the equation g1 (x, y, z) = cR3 may describe a surface, while when m = 2, the equations

g1 (x, y, z) = c1

g2 (x, y, z) = c2

and

may describe a curve as the intersection of two surfaces. Thus, the set [g = c] can be seen as a set of dimensions less than 3. For m = 3, the set [g = c] may be reduced to some points or to the empty set. For this reason, we will not consider these situations and assume always m < n. x52  y2  z2  9 y 0

5 5

z

0

5 5

0 x 5

FIGURE 3.1: S = [g1 = 9], n = 3, m = 1

Example. ∗ S = [g1 (x, y, z) = x2 + y 2 + z 2 = 9] is a surface (the sphere centered at the origin with radius 3; see Figure 3.1). Here (n = 3, m = 1).

135

136

Introduction to the Theory of Optimization in Euclidean Space

∗∗ S = [g1 (x, y, z) = x2 + y 2 + z 2 = 9] ∩ [g2 (x, y, z) = z = 2] is the intersection of the previous sphere with the plan z = 2; see Figure 3.2.

x2 5y2  z2  9, z  2

x2  y2  z2  9, z  2

y

2

0

y

1

0 1

5 5

2 4

3 z

0

z

2

1 5

0

5

2 1 0

0

x

x

1

5

2

FIGURE 3.2: S = [g1 = 9] ∩ [g2 = 2], n = 3, m = 2

∗∗ S = [g1 (x, y, z) = x2 + y 2 + z 2 = 9] ∩ [g2 (x, y, z) = z = 2] ∩ [g3 (x, y, z) = y = 1] = {(2, 1, 2), (−2, 1, 2)}is the intersection of the sphere with the two planes z = 2 et y = 1. It is reduced to two points; see Figure 3.3.

x2  y2  z2  9, z  2, y  1

5 y

2

0

y

1

0 1

5 5

2 4

3 z

0

z

2

1 5

0

5

2 1 0

0

x

x 5

1 2

FIGURE 3.3: S = [g1 = 9] ∩ [g2 = 2] ∩ [g3 = 1],

n = 3, m = 3

As in the case of unconstrained optimization, we will need to reduce our set of searches of the extreme points by looking for some necessary conditions. A local study for such points x∗ cannot be done by considering balls centered at these points because the points x∗ + th, with |h| small, do not remain necessarily inside the set [g = c]. This situation prevents us from comparing the values f (x∗ + th) with f (x∗ ). In order to remain close to x∗ through points of the set [g = c], an idea is to consider all the curves passing through x∗ included in the constraint set. We will consider curves t −→ x(t) such that the set {x(t) : t ∈ [−a, a], x(0) = x∗ }, for some a > 0, are included in [g = c]. So, if x∗ is a local maximum of f , then we have

Constrained Optimization-Equality Constraints f (x(t))  f (x∗ )

137

∀t ∈ [−a, a].

Thus, 0 is local maximum point for the function t −→ f (x(t)). Hence

  d   f (x(t))  =0 = f  (x(t)).x (t) dt t=0 t=0

=⇒

f  (x∗ ).x (0) = 0

x (0) is a tangent vector to the curve x(t) at the point x(0) = x∗ . This equality musn’t depend on a particular curve x(t). So, we must have

f  (x∗ ).x (0) = 0

for any curve x(t) such that g(x(t)) = c.

In this chapter, first, we will characterize, in Section 3.1, the set of tangent vectors to such curves, then establish in Section 3.2, the equations satisfied by a local extreme point x∗ . In Section 3.3, we identify the candidates’ points for optimality, and in Section 3.4, we explore the global optimality of a constrained local candidate point. Finally, we establish, in Section 3.5, the dependence of the optimal function with respect to certain of its parameters.

3.1

Tangent Plane

Let

x∗ ∈ S = [g(x) = c].

Definition 3.1.1 The set defined by T = { x (0) :

t −→ x(t) ∈ S,

x ∈ C 1 (−a, a),

a > 0,

x(0) = x∗ }

of all tangent vectors at x∗ to differentiable curves included in S, is called tangent plane at x∗ to the surface [g = c].

We have the following characterization of the tangent plane T at a regular point x∗ of S.

138

Introduction to the Theory of Optimization in Euclidean Space

Definition 3.1.2 A point x∗ ∈ S = [g = c] is said to be a regular point of the constraints if the gradient vectors ∇g1 (x∗ ), . . ., ∇gm (x∗ ) are linearly independent (LI). That is, the m × n matrix ⎤ ⎡ ∂g ∂g1 ∂g1 1 . . . ∂x1 ∂x2 ∂xn ⎥ ⎢ ⎥ ⎢ ⎢ ∂g2 ∂g2 ∂g2 ⎥  ∗ . . . ⎢ has rank m. g (x ) = ⎢ ∂x1 ∂x2 ∂xn ⎥ .. .. .. ⎥ ⎥ ⎢ .. ⎣ . . . . ⎦ ∂gm ∂x1

∂gm ∂x2

...

∂gm ∂xn



v1 , . . . , vm ∈ Rn are LI ⇐⇒ α1 v1 + . . . + αm vm = 0 =⇒ α1 = . . . = 0 . The rank of a matrix = rank of its transpose [10].

Theorem 3.1.1 At a regular point x∗ ∈ S = [g = c], where g is C 1 in a neighborhood of x∗ , the tangent plane T is equal to the subspace M = {y ∈ Rn :

g  (x∗ )y = 0}.

The proof of this theorem is an application of the implicit function theorem. Proof. We have T ⊂ M : Indeed, let y ∈ T, then ∃ x ∈ C 1 (−a, a) such that g(x(t)) = c x(0) = x∗ , x (0) = y.

∀t ∈ (−a, a) for some a > 0,

Differentiating the relation g(x(t)) = c, we obtain g  (x(t))x (t) = 0

∀t ∈ (−a, a)

=⇒

g  (x(0))x (0) = 0

⇐⇒

g  (x∗ )y = 0.

Hence y ∈ M. M ⊂ T : ∗ Indeed, let y ∈ M \ {0} and consider the vectorial equation F (t, u) = g(x∗ + ty +t g  (x∗ )u) − c = 0, where for fixed t, the vector u ∈ Rm is the unknown.

Constrained Optimization-Equality Constraints

139

Note that F is well defined on an open subset of R × Rm . Indeed, if g is C 1 on Bδ (x∗ ) ⊂ Rn , then ∀(t, u) ∈ (−δ0 , δ0 ) × Bδ0 (0)

with

δ0 = min

δ δ , 2 y 2 g  (x∗ )

(x∗ + ty +t g  (x∗ )u) − x∗  |t| y + u g  (x∗ ) < =⇒

δ δ δ δ y + g  (x∗ ) = + = δ  ∗ 2 y 2 g (x ) 2 2

[x∗ + ty +t g  (x∗ )u] ∈ Bδ (x∗ ).

We have X(t, u) = x∗ + ty +t g  (x∗ )u

F (t, u) = g(X(t, u)) − c Xj (t, u) = x∗j + tyj +

m  ∂gl ∗ (x )ul ∂xj

∂gi ∗ ∂Xj = (x ) ∂ui ∂xj

l=1

∂Fk (t, u) = ∂ui  ∂F

k

∂ui

n  j=1

 (t, u)

n

 ∂gk ∂gi ∗ ∂gk ∂Xj = (X(t, u)) (x ) ∂Xj ∂ui ∂x ∂x j j j=1

k,i=1,··· ,m

t

= g  (X(t, u)) g  (x∗ ) .

By hypotheses, we have – F is a C 1 function in the open set A = (−δ0 , δ0 ) × Bδ0 (0) – F (0, 0) = g(x∗ ) − c = 0 – (0, 0) ∈ (−δ0 , δ0 ) × B(0, δ0 ), so (0, 0) is an interior point – det(∇u F (0, 0)) = 



rankg (x ) = m.

 t

 ∂(F1 , · · · , Fm ) = det g  (x∗ ) g  (x∗ )

= 0 ∂(u1 , · · · , um )

as

Then, by the implicit function theorem, there exists open balls B (0) ⊂ (−δ0 , δ0 ),

Bη (0) ⊂ Bδ0 (0),

, η > 0

with

B (0) × Bη (0) ⊆ A,

140

Introduction to the Theory of Optimization in Euclidean Space

and such that det(∇u F (t, u)) = 0 ∀t ∈ B (0),

∃!u ∈ Bη (0) :

u : (−, ) −→ Bη (0); The curve

B (0) × Bη (0)

in

F (t, u) = 0 is a C 1 function.

t −→ u(t)

x(t) = X(t, u(t)) = x∗ + ty +t g  (x∗ )u(t)

is thus, by construction, a curve on S. By differentiating both sides of F (t, u(t)) = g(x(t)) − c = g(X(t, u(t))) − c = 0 with respect to t, we get n

0=

 ∂g ∂Xj d g(x(t)) = dt ∂Xj ∂t j=1

0=

 d g(x(t)) dt t=0

m  ∂gl ∗ (x )ul ∂xj

m

 ∂gl ∂ul ∂Xj = yj + (x∗ ) ∂t ∂xj ∂t l=1 l=1 # n m    ∂g ∂gl ∗ ∂ul  = (X(t, u)) yj + (x ) ∂xj ∂xj ∂t j=1

Xj (t, u) = x∗j + tyj +







∗ t 



l=1 

t=0

= g (x )y + g (x ) g (x )u (0). Since y ∈ M \ {0}, we have g  (x∗ ).y = 0. Moreover, since g  (x∗ )t g  (x∗ ) is nonsingular, we conclude that u (0) = 0. Hence x (0) = y +t g  (x∗ )u (0) = y + 0 = y and y is a tangent vector to the curve x(t) included in S, so y ∈ T. ∗∗ If y = 0, the constant curve x(t) = x∗ is included in S and x (0) = 0 = y, so 0 ∈ T. It is easy to show that M is a subspace of Rn . Indeed, 0 ∈ M and for y1 , y2 ∈ M, κ ∈ R, we have g  (x∗ )(y1 + κy2 ) = g  (x∗ )y1 + κg  (x∗ )y2 = 0.

Constrained Optimization-Equality Constraints

141

Theorem 3.1.2 Implicit function theorem [15] [20] Let A in Rn × Rm be an open set. Let F = (F1 , . . . , Fm ) be a C 1 (A) function. Consider the vector equation F (x, y) = 0. If ◦

∃(x0 , y 0 ) ∈ A = A,

F (x0 , y 0 ) = 0

then, ∃, η > 0 such that

det Fy (x, y) = 0 ∀x ∈ B (x0 ),

and



det Fy (x0 , y 0 ) = 0,

∀(x, y) ∈ B (x0 ) × Bη (y 0 ) ⊂ A

∃!y ∈ Bη (y 0 ) :

ϕ : B (x0 ) −→ Bη (y 0 );

F (x, y) = 0

x −→ ϕ(x) = y

is C 1 (B (x0 ))



−1 Fx (x, y) ϕ (x) = − Fy (x, y)

where ⎡ ∂F1 ∂y1

⎢ Fy (x, y) = ∇y F (x, y) = ⎣ ...

∂Fm ∂y1

det(Fy (x, y)) =

... .. . ...

∂(F1 , . . . , Fm ) ∂(y1 , . . . , ym )

∂F1 ∂ym



.. ⎥ . ⎦

gradient of F with respect to y

∂Fm ∂ym

Jacobian of F with respect to y.

Remark 3.1.1 Denote by T(x∗ ) the translation of T by the vector x∗ : T(x∗ ) = x∗ + M = {x∗ + h ∈ Rn : g  (x∗ ).h = 0} = {x ∈ Rn : g  (x∗ ).(x − x∗ ) = 0}. T(x∗ ) is the tangent plane to the surface [g(x) = c] passing through x∗ .

142

Introduction to the Theory of Optimization in Euclidean Space y 1.0

y 1.5 2

0.5

y  x  1  1

tangent line y 1 1.0

0.5

0.5

1.0

1.5

2.0

2.5

x y  1  x2

0.5 0.5

1.0

1.5

tangent line y 1

1.0

0.5

0.5

1.0

1.5

x

0.5

FIGURE 3.4: Horizontal tangent line at local extreme points

Remark 3.1.2 Tangent plane at a point of a surface z = f (x) * Suppose x∗ is an interior point of a surface z = f (x) where f is a C 1 function. Then, the tangent plane at (x∗ , f (x∗ )) is given by z = f (x∗ ) + f  (x∗ ).(x − x∗ ) Indeed, setting

g(x, z) = z − f (x) = 0,

g  (x, z) = −f  (x), 1 = 0

and

then rank(g  (x∗ , f (x∗ ))) = 1.

The tangent plane at (x∗ , f (x∗ )) is characterized by g  (x∗ , f (x∗ )).x − x∗ , z − f (x∗ ) = −f  (x∗ ), 1.x − x∗ , z − f (x∗ ) = 0. ** If x∗ is an interior stationary point, then ∇f (x∗ ) = 0 , and the tangent plane T (x∗ , f (x∗ )) is the horizontal plane z = f (x∗ ). *** The graph of the tangent plane is the graph of the linear approximation L(x) = f (x∗ ) + f  (x∗ ).(x − x∗ ). Thus, we have f (x) ≈x∗ L(x)

for x close to x∗ .

Example 1. The tangent plane to a curve y = f (x) at a point (x0 , f (x0 )) corresponds to the tangent line to the curve at that point described by the equation y = f (x0 ) + f  (x0 )(x − x0 ). The following examples, in Table 3.1, show that the tangent line is horizontal at local extreme points and separates the graph into two parts at an inflection point; see Figure 3.4 and Figure 3.5.

Constrained Optimization-Equality Constraints

f  (x0 )

f (x)

point x0

(x − 1)2 − 1

1 : global minimum

2(x − 1)

1 − x2

0 : global maximum

−2x

(x − 1)3 + 1

0 : inflection point

3(x − 1)2

ln x

e

143

tangent line

 x=1

y = −1

=0

 x=0

=0

y=1

 x=1

=0

1 1 = x x=e e

y=1 y−1=

1 (x − e) e

TABLE 3.1: Tangent planes in one dimension y 1.5

tangent line y 1 1.0

y 1.5 1.0

0.5 y  1  x2

0.5

0.5

0.5 1.0

1.5

2.0

x

0.0

tangent line y

y  logx

x

1

2

3

4

x

0.5

0.5

FIGURE 3.5: Tangent lines at an inflection and at an ordinary points Example 2. The tangent plane to a surface z = f (x, y) at a point (x0 , y0 , f (x0 , y0 )) corresponds to the usual tangent plane to the surface at that point described by the equation z = f (x0 , y0 ) + fx (x0 , y0 )(x − x0 ) + fy (x0 , y0 )(y − y0 ). A normal vector to this plane is n = fx (x0 , y0 ), fy (x0 , y0 ), −1 = fx (x0 , y0 )i + fy (x0 , y0 )j − k. A normal line to the surface z = f (x, y) at (x0 , y0 , f (x0 , y0 )) is the line parallel to the vector n. The examples, given in Table 3.2 and graphed in Figures 3.6 and 3.7, show that the tangent plane is horizontal at local extreme points and separates the graph into two parts at a saddle point.

144

Introduction to the Theory of Optimization in Euclidean Space 4

4

2 2

2

2

z  x  1  y  1  1

y

y

0

plane tangent z 4

0

2

2 4

z  x2  y2  4

2 3 1 z z

2

0 1 1 0 2

2

plane tangent z 1

1 0

0 x

1 x

1 2

2

FIGURE 3.6: Horizontal tangent planes at local extreme points

The corresponding tangent planes at (x0 , y0 ) are respectively a) z = −1,

b) z = 4,

d) z = 2x + 2y − 3

c) z = 0,

4

2 y

1 2

0

plane tangent 2 x 2 y  z3 0

y

1

0 2 2

z  y2  x2 2 2

1

z

1 0 z

0

1 1 2

plane tangent z 0

2

2 1 x

z  x  12  y  12  1 0

0

1 1

x 2

2 3

FIGURE 3.7: Tangent planes at a saddle and ordinary points

Example 3. Find the tangent plane at the point (0, 1, 0) to the set g = (g1 , g2 ) = 1, 1 with g1 (x, y, z) = x + y + z,

g2 (x, y, z) = x2 + y 2 + z 2 .

and

Solution: The surface g(x, y, z) = 1, 1 is the intersection of the two surfaces g1 (x, y, z) = 1 and g2 (x, y, z) = 1. So, it is a curve in the space R3 . We have ⎡ ⎤ ∂g1 ∂g1 ∂g1   ∂y ∂z 1 1 1 ⎢ ∂x ⎥  g (x, y, z) = ⎣ ⎦ = 2x 2y 2z . ∂g2 ∂g2 ∂g2 ∂x

∂y

∂z

Constrained Optimization-Equality Constraints

f  (x0 , y0 )

z = f (x, y)

a)

(x − 1)2 + (y + 1)2 − 1 2(x − 1), 2(y + 1)

(x,y)=(1,−1)



−2x, −2y

y 2 − x2

−2x, 2y

(x − 1)2 + (y + 1)2 − 1

2(x − 1), 2(y + 1)

c) d)



4 − x2 − y 2

b)

145

(x,y)=(0,0)

 (x,y)=(0,0)

= 0, 0

= 0, 0

= 0, 0

 (x,y)=(2,0)

= 2, 2

TABLE 3.2: Examples in two dimension

g  (0, 1, 0) =



1 0

1 2

1 0

 has rank 2

The tangent plane is the set of points (x, y, z) such that ⎡ ⎤     x 1 1 1 0  ⎣ ⎦ g (0, 1, 0).x − 0, y − 1, z − 0 = . y−1 = 0 2 0 0 z ⎧ ⎨ x+y−1+z =0 ⇐⇒ ⎩ 2(y − 1) = 0. A parametrization of the tangent plane to the two surfaces at (0, 1, 0) is the line (see Figure 3.8) x=t

y=1

z = −t,

t ∈ R.

146

Introduction to the Theory of Optimization in Euclidean Space 2 y

1

0 1 2 2

1

z

0

1

2 2 1 0 x

1 2

FIGURE 3.8: Tangent plane at (0, 1, 0) to [g = 1, 1]

Remark 3.1.3 Note that the representation of the tangent plane obtained in the theorem has used the fact that the point was regular. When this hypothesis is omitted, the representation is not necessary true. Indeed, if S is the set defined by g(x, y) = 0

with

g(x, y) = x2 ,

then S is the y axis. No point of S is regular since we have g  (x, y) = 2x, 0

and

g  (0, y) = 0, 0

on the y-axis.

We deduce that at each point (0, y0 ) ∈ S, we have M = {h = (h1 , h2 ) :

g  (0, y0 ).h = 0} = R2 .

However, the line x(t) = 0

y(t) = y0 + t

passes through the point (0, y0 ) at t = 0 with direction x (0), y  (0) = 0, 1 and remains included in S. Hence, the tangent plane is equal to S.

Constrained Optimization-Equality Constraints

147

Solved Problems

1. – Find an equation of the tangent plane to the ellipsoid x2 +4y 2 +z 2 = 18 at the point (1, 2, 1). Solution: Set g(x, y, z) = x2 + 4y 2 + z 2 = 18. Then, g  (x, y, z) = 2xi + 8yj + 2zk, g  (1, 2, 1) = 2i + 16j + 2k = 0

=⇒

rank(g  (1, 2, 1)) = 1.

The tangent plane (see Figure 3.9) is the set of points (x, y, z) such that ⎡ ⎤ x−1   g  (1, 2, 1).x − 1, y − 2, z − 1 = 2 16 2 . ⎣ y − 2 ⎦ = 0 z−1 ⇐⇒

2(x − 1) + 16(y − 2) + 2(z − 1) = 0

⇐⇒

x + 8y + z − 18 = 0.

5 y 0

5 5

z

0

5 5

0 x 5

FIGURE 3.9: Tangent plane at (1, 2, 1) to the ellipsoid

148

Introduction to the Theory of Optimization in Euclidean Space

2. – Find all points on the surface 2x2 + 3y 2 + 4z 2 = 9 at which the tangent plane is parallel to the plane x − 2y + 3z = 5. Solution: Set g(x, y, z) = 2x2 + 3y 2 + 4z 2 = 9. We have g  (x, y, z) =



4x

6y

8z



= 0

on [g = 9]

=⇒

rank(g  (x, y, z)) = 1

since g  (x, y, z) = 0

⇐⇒

(x, y, z) = 0

g(0) = 9.

and

The tangent plane to the surface g(x, y, z) = 9 at a point (x0 , y0 , z0 ) is the set of points (x, y, z) such that ⎡ ⎤ x − x0   g  (x0 , y0 , z0 ).x − x0 , y − y0 , z − z0  = 4x0 6y0 8z0 . ⎣ y − y0 ⎦ = 0 z − z0 4x0 (x − x0 ) + 6y0 (y − y0 ) + 8z0 (z − z0 ) = 0.

⇐⇒

This tangent plane will be parallel to the plane x − 2y + 3z = 5 if the two planes have their respective normals g  (x0 , y0 , z0 ) and 1, −2, 3 parallel. So, we have to solve the following system ⎧ ⎪ ⎪ 4x0 = t ⎪ ⎧ ⎪ ⎪ ⎪ find t ∈ R : ⎪ ⎪ ⎪ ⎪ ⎨  ⎨ 6y0 = −2t g (x0 , y0 , z0 ) = t1, −2, 3 ⇐⇒ ⎪ ⎪ ⎪ ⎪ 8z0 = 3t ⎩ ⎪ ⎪ g(x0 , y0 , z0 ) = 9 ⎪ ⎪ ⎪ ⎪ ⎩ 2x20 + 3y02 + 4z02 = 9 =⇒

t 2

2

4

t 2 3t 2 +3 − +4 =9 3 8

The needed points on the surface are 3√ 4√ 9 √

3 , 3, − 3, 7 7 14





=⇒

t=±

12 √ 3. 7

3√ 4√ 9√

3, 3 . 3, − 7 7 14

The equations of the tangent planes to the surface (see Figure 3.10) at these points are

x−

3√

4√

9√

3 −2 y+ 3 +3 z− 3 = 0, 7 7 14

Constrained Optimization-Equality Constraints

149

2 1

y 0 1 2 2

1

z

0

1

2 2 1 0 x

1 2

FIGURE 3.10: Parallel tangent planes to an ellipsoid x+

4√

9√

3√

3 −2 y− 3 +3 z+ 3 = 0. 7 7 14

3. – Show that the surfaces  z = x2 + y 2

and

z=

1 2 5 (x + y 2 ) + 10 2

intersect at (3, 4, 5) and have a common tangent plane at that point.

Solution: Set g1 (x, y, z) = z −



x2 + y 2

g2 (x, y, z) = z −

5 1 2 (x + y 2 ) − . 10 2

Since g1 (3, 4, 5) = 0 and g2 (3, 4, 5) = 0, then the point (3, 4, 5) is a common point to the surfaces g1 (x, y, z) = 0 and g2 (x, y, z) = 0. We have g1 (x, y, z) = − 

x x2

y2

y i−  j+k 2 x + y2

+ y x g2 (x, y, z) = − i − j + k 5 5

4 3 g1 (3, 4, 5) = − i − j + k = 0, 5 5

rank(g1 (3, 4, 5)) = 1

4 3 g2 (3, 4, 5) = − i − j + k = 0, 5 5

rank(g2 (3, 4, 5)) = 1.

150

Introduction to the Theory of Optimization in Euclidean Space

Note that the normal vectors g1 (3, 4, 5) and g2 (3, 4, 5) of the tangent planes to the surfaces g1 (x, y, z) = 0 and g2 (x, y, z) = 0 respectively are the same. Hence, the two surfaces have a common tangent plane at this point with the equation 4 3 − (x − 3) − (y − 4) + (z − 5) = 0. 5 5

4. – Find two unit vectors that are normal to the surface sin(xz) − 4 cos(yz) = 4 at the point P(π, π, 1).

Solution: A vector that is normal to the surface g(x, y, z) = sin(xz) − 4 cos(yz) = 4 at this point is normal to the tangent plane to the surface there, and we have

g'(x, y, z) = z cos(xz) i + 4z sin(yz) j + (x cos(xz) + 4y sin(yz)) k,

g'(π, π, 1) = −i − π k ≠ 0   =⇒   rank(g'(π, π, 1)) = 1.

A normal vector to the tangent plane is g'(π, π, 1) = −i − π k, and two unit vectors that are normal to the surface at P(π, π, 1) are

± g'(π, π, 1)/‖g'(π, π, 1)‖ = ± ⟨ −1/√(1 + π^2), 0, −π/√(1 + π^2) ⟩.

3.2 Necessary Condition for Local Extreme Points-Equality Constraints

Before stating the results rigorously, we give an intuitive approach to comparing the values of f near a local maximum value f(x*) under the constraint g(x) = c, following the unconstrained case in parallel.

• Unconstrained case: We compare values of f taken in a neighborhood of x* in all directions:

f(x* + th) ≤ f(x*)   for h ∈ R^n,  |t| < δ,

or equivalently, for each i = 1, . . . , n,

f(x* + t e_i) ≤ f(x*)   for |t| < δ.

Then, for |t| < δ, we have

[f(x* + t e_i) − f(x*)]/t ≤ 0 if t > 0   and   [f(x* + t e_i) − f(x*)]/t ≥ 0 if t < 0.

Since f is differentiable, we obtain, as t → 0+ and t → 0− respectively,

f_{x_i}(x*) ≤ 0   and   f_{x_i}(x*) ≥ 0.

So f_{x_i}(x*) = 0 for each i = 1, . . . , n.

• Constrained case: We cannot choose points around x* in any direction, because we need to remain on the set [g = c]. A way to do that is to consider curves t ↦ x(t) satisfying x(t) ∈ [g = c] for t ∈ (−a, a) and x(0) = x*. Then we have

f(x(t)) ≤ f(x*)   ∀t ∈ (−a, a),

and 0 is a local maximum point for the function t ↦ f(x(t)). Hence, for regular functions, we have

d/dt f(x(t)) |_{t=0} = f'(x(t)).x'(t) |_{t=0} = 0   =⇒   f'(x*).x'(0) = 0.


x'(0) is a tangent vector to the curve x(t) at the point x(0) = x*. This equality must not depend on a particular curve. Thus, it must be satisfied for any y = x'(0) ∈ M, which is summarized below:

Lemma 3.2.1 Let f and g = (g1, . . . , gm) be C^1 functions in a neighborhood of x* ∈ [g = c]. If x* is a regular point and a local extreme point of f subject to these constraints, then we have

∀y ∈ R^n :   g'(x*)y = 0   =⇒   f'(x*)y = 0.

The lemma says that f'(x*) is orthogonal to the plane tangent at x* to the surface g(x) = c. As a consequence, we will see that f'(x*) is a linear combination of g1'(x*), . . . , gm'(x*).

Theorem 3.2.1 Let f and g = (g1, . . . , gm) be C^1 functions in a neighborhood of x* ∈ [g = c]. If x* is a regular point and a local extreme point of f subject to these constraints, then there exist unique numbers λ1*, . . . , λm* such that

∂f/∂x_i(x*) − Σ_{j=1}^m λj* ∂gj/∂x_i(x*) = 0,   i = 1, . . . , n.

Proof. The proof uses a simple argument of linear algebra. Indeed, set

A = g'(x*) = [ ∂gj/∂x_i(x*) ]_{m×n}   (the m × n Jacobian matrix of g at x*, with rank(A) = m),

b = f'(x*) = ( ∂f/∂x_1(x*), . . . , ∂f/∂x_n(x*) ) ∈ R^n.

From the previous lemma, we have

∀y ∈ R^n :   Ay = 0   =⇒   b.y = 0.


In other words, we have

Ker A = Ker [A; b],

where Ker N denotes the kernel [10] of the linear transformation induced by the matrix N, and [A; b] is the matrix A with the row b appended. Since we have [10]

dim R^n = dim(Ker A) + rank(A) = dim(Ker [A; b]) + rank([A; b]),

then rank(A) = rank([A; b]), which means that the vector b is linearly dependent on the row vectors of A. So there exists a unique vector λ* = (λ1*, . . . , λm*) ∈ R^m such that

ᵗb = ᵗA λ*   ⇐⇒   ∂f/∂x_i(x*) = Σ_{j=1}^m λj* ∂gj/∂x_i(x*),   i = 1, . . . , n.

Finally, to look for extreme points of f subject to the constraint g(x) = c, we are led to solve the system

∂f/∂x_i(x) − Σ_{j=1}^m λj ∂gj/∂x_i(x) = 0,   i = 1, . . . , n,
gj(x) − cj = 0,   j = 1, . . . , m.

These equations suggest introducing the function

L(x, λ) = f(x) − λ1(g1(x) − c1) − · · · − λm(gm(x) − cm),

called the Lagrange function or Lagrangian, with λ1, . . . , λm the Lagrange multipliers. The necessary conditions can then be expressed in the form

∂L/∂x_i(x, λ) = ∂f/∂x_i(x) − Σ_{j=1}^m λj ∂gj/∂x_i(x) = 0,   i = 1, . . . , n,
∂L/∂λj(x, λ) = −(gj(x) − cj) = 0,   j = 1, . . . , m,

or simply ∇L(x, λ) = 0.

We may reformulate the previous theorem as follows:

Theorem 3.2.2 Let f and g = (g1 , . . . , gm ) be C 1 functions in a neighborhood of x∗ ∈ [g = c]. x∗ is a regular point and a local extreme point of f ∃! λ∗ ∈ Rm such that ∇L(x∗ , λ∗ ) = 0.

=⇒

Remark 3.2.1 When m = 1, the necessary condition is reduced to ∃!λ∗ ∈ R :

∇f = λ∗ ∇g

∇f // ∇g.

=⇒

The vectors g  (x∗ ) and f  (x∗ ) are respectively normal to the level curves g(x) = c and f (x) = f (x∗ ). When the extreme point is attained then the two vectors g  (x∗ ) and f  (x∗ ) are parallel. Thus the two level curves have a common tangent plane at x∗ . When, using a graphic utility, it is where the level curves are tangent, the constrained extreme points may locate. Example 1. At what points on the circle x2 + y 2 = 1 does f (x, y) = xy have its maximum and minimum? Solution: Set g(x, y) = x2 + y 2

S = {(x, y) :

g(x, y) = x2 + y 2 = 1}

By the extreme-value theorem, f attains its maximum and minimum values on S since f is continuous on the closed and bounded unit circle S; see Figure 3.11.


FIGURE 3.11: Graph of f(x, y) = xy on the unit disk [x^2 + y^2 ≤ 1]


Next, the functions f and g are C^1 around each point (x, y) ∈ R^2; in particular, each point of S is relatively interior to S and is a regular point, since we have

g'(x, y) = 2x i + 2y j ≠ 0   =⇒   rank(g'(x, y)) = 1.

Thus, introducing the Lagrangian L(x, y, λ) = xy − λ(x^2 + y^2 − 1), we can apply the Lagrange multiplier method and look for the interior extreme points as solutions of the system

Lx = fx(x, y) − λ gx(x, y) = y − 2xλ = 0,
Ly = fy(x, y) − λ gy(x, y) = x − 2yλ = 0,
Lλ = −(g(x, y) − 1) = −(x^2 + y^2 − 1) = 0.

Combining the first two equations gives x(1 − 4λ^2) = 0, so x = 0 or λ = ±1/2.

x = 0 leads to y = 0, and (0, 0) is not a point on the constraint curve.
λ = 1/2 leads to y = x, and from the constraint equation we deduce that x = ±1/√2.
λ = −1/2 leads to y = −x, and from the constraint equation we deduce that x = ±1/√2.

So the stationary points for the Lagrangian are the four points

(1/√2, 1/√2),   (−1/√2, −1/√2),   (1/√2, −1/√2),   (−1/√2, 1/√2),

at which f takes its maximum and minimum values respectively

f(1/√2, 1/√2) = f(−1/√2, −1/√2) = 1/2,   f(1/√2, −1/√2) = f(−1/√2, 1/√2) = −1/2.

The problem can be solved graphically, as illustrated in Figure 3.12.

FIGURE 3.12: The constraint [x^2 + y^2 = 1] and the level curves xy = −1/2, 1/2 are tangent

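The stationary points of Example 1 can also be recovered with a computer algebra system. The sketch below is only an illustration (it assumes sympy is available) of solving the Lagrange system ∇L = 0 directly.

```python
# Sketch: solve the Lagrange system of Example 1 (f = xy on the circle x^2 + y^2 = 1).
import sympy as sp

x, y, lam = sp.symbols('x y lambda', real=True)
L = x*y - lam*(x**2 + y**2 - 1)                      # Lagrangian

eqs = [sp.diff(L, v) for v in (x, y, lam)]           # first-order conditions
sols = sp.solve(eqs, [x, y, lam], dict=True)

for s in sols:
    print(s, '  f =', (x*y).subs(s))
# four points (±1/sqrt(2), ±1/sqrt(2)) with f = ±1/2
```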

Remark 3.2.2 Note that Lagrange's method does not transform a constrained optimization problem into one of finding an unconstrained extreme point of the Lagrangian.

Example 2. Consider the problem

max xy   subject to   x + y = 2,  x ≥ 0,  y ≥ 0.

Using the Lagrange multiplier method, prove that (x, y) = (1, 1) solves the problem with λ = 1. Prove also that (1, 1, 1) does not maximize the Lagrangian L.

Solution: Since x and y must be nonnegative and satisfy x + y = 2, we may look for the extreme points in the set [0, 2] × [0, 2]. Let us denote

f(x, y) = xy,   g(x, y) = x + y,   Ω = [0, 2] × [0, 2].

First, the optimization problem has a solution by the extreme-value theorem. Indeed, f is continuous on the line segment (see Figure 3.13)

S = {(x, y) : g(x, y) = 2,  x ≥ 0,  y ≥ 0},

which is a closed and bounded subset of R^2. Next, the functions f and g are C^1 around each point (x, y) ∈ (0, 2) × (0, 2), which is a regular point since we have

g'(x, y) = i + j ≠ 0   =⇒   rank(g'(x, y)) = 1.


FIGURE 3.13: Set of the constraints

So, by applying the method of Lagrange multipliers, we introduce the Lagrangian

L(x, y, λ) = f(x, y) − λ(g(x, y) − 2) = xy − λ(x + y − 2)

and look for the interior extreme points as solutions of the system

Lx = fx − λgx = y − λ = 0,   Ly = fy − λgy = x − λ = 0,   Lλ = −(g − 2) = −(x + y − 2) = 0   ⇐⇒   x = y = λ = 1.

So the point (1, 1, 1) is a stationary point for the Lagrangian L. But it is not an extreme point for L. Indeed, the second derivative test gives the Hessian matrix of L with respect to (x, y, λ):

HL(x, y, λ) = [ 0  1  −1 ; 1  0  −1 ; −1  −1  0 ].

Its leading principal minors at (1, 1, 1) are

D1 = 0,   D2 = det[ 0 1 ; 1 0 ] = −1,   D3 = det HL = 2 ≠ 0.

Since D2 < 0, the matrix is indefinite.

Hence, (1, 1, 1) is a saddle point. It remains to show that the point (1, 1) is the maximum point for the problem; see Figure 3.14 for a graphical solution using level curves. Indeed, since it is the only interior candidate point of the segment, it suffices to compare the value of f at (1, 1) with its values at the endpoints of the segment. We have

f(1, 1) = 1,   f(2, 0) = 0,   f(0, 2) = 0.

FIGURE 3.14: The constraint x + y = 2 and the level curve f = xy = 1 are tangent

So f attains its maximum value at (1, 1) under the constraint g(x, y) = 2.
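A quick numerical way to see that (1, 1, 1) is a stationary but not an extreme point of L is to inspect the eigenvalues of its Hessian. The snippet below is only an illustrative check, under the assumption that numpy is available.

```python
# Sketch: the Hessian of L(x, y, lambda) = xy - lambda*(x + y - 2), in the order (x, y, lambda),
# has eigenvalues of both signs, so the stationary point (1, 1, 1) is a saddle point of L.
import numpy as np

H = np.array([[ 0.,  1., -1.],
              [ 1.,  0., -1.],
              [-1., -1.,  0.]])

print(np.linalg.eigvalsh(H))    # mixed signs => indefinite => saddle point
```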

Remark 3.2.3 A function subject to a constraint need not have a local extremum at every stationary point of the associated Lagrangian. The Lagrange multiplier method transforms a constrained optimization problem into one of finding the appropriate stationary points of the Lagrangian.

Example 3. Consider the problem

min xy   subject to   x + y = 2,  x ≥ 0,  y ≥ 0.

Using the Lagrange multiplier method, prove that (x, y) = (1, 1) doesn't solve the problem with λ = 1.

Solution: Arguing as in Example 2, the problem has a solution by the extreme-value theorem. But, by applying the Lagrange multiplier method, we found only the candidate point (1, 1), and it realizes the maximum of f. So the minimum point of f is not necessarily a stationary point of L. In fact, f attains its minimum value 0 under the constraint g(x, y) = 2 at (2, 0) and (0, 2).


Solved Problems

1. – i) Show that the Lagrange equations for

max (min) f(x, y) = x + y + 3   subject to   g(x, y) = x − y = 0

have no solution.
ii) Show that any point of the constraint set is a regular point.
iii) What can you conclude about the minimum and maximum values of f subject to g = 0? Show this directly.

Solution: i) Set L(x, y, λ) = f(x, y) − λ(g(x, y) − 0) = x + y + 3 − λ(x − y). By applying the Lagrange multiplier method, we look for the interior extreme points as solutions of the system

Lx = fx − λgx = 1 − λ = 0,   Ly = fy − λgy = 1 + λ = 0,   Lλ = −(g − 0) = −(x − y) = 0,

which leads to a contradiction, since λ = 1 and λ = −1 cannot hold simultaneously. So the system has no solution.

ii) Any point of the constraint set is a regular point, since we have

g'(x, y) = i − j ≠ 0   =⇒   rank(g'(x, y)) = 1.

iii) We can conclude that f has no maximum nor minimum on the constraint set: if they existed, they would be solutions of the above

FIGURE 3.15: No solution for the constrained optimization problem

system. Indeed, all the conditions of the theorem on the necessary conditions for a constrained candidate point are satisfied. The problem is equivalent to optimizing

F(x) = f(x, x) = 2x + 3   for x ∈ R.

We can see that

lim_{x→−∞} F(x) = −∞   and   lim_{x→+∞} F(x) = +∞.

Therefore, f cannot attain a finite lower or upper bound on the constraint set. The graph of f is a plane; see Figure 3.15. The level curves x + y + 3 = k are parallel lines that intersect the constraint line x − y = 0 at the points ((k − 3)/2, (k − 3)/2). This shows that f takes arbitrarily large values along the constraint (see Figure 3.16).

FIGURE 3.16: ∇f = ⟨1, 1⟩ ∦ ⟨1, −1⟩ = ∇g


2. – Consider the problem of minimizing f(x, y) = y + 1 subject to g(x, y) = x^4 − (y − 2)^5 = 0.

i) Show, without using calculus, that the minimum occurs at (0, 2). Is it a regular point?
ii) Show that the Lagrange condition ∇f = λ∇g is not satisfied for any value of λ.
iii) Does this contradict the theorem on the necessary conditions for a constrained candidate point?

Solution: i) Note that we have

g(x, y) = x^4 − (y − 2)^5 = 0   ⇐⇒   (y − 2)^5 = x^4 ≥ 0   =⇒   y ≥ 2.

So, on the constraint set (see Figure 3.17), we have

f(x, y) = y + 1 ≥ 3 = f(x, 2)   ∀(x, y) ∈ [g = 0].

Since g(0, 2) = 0, the point (0, 2) belongs to [g = 0]. Thus

f(x, y) ≥ f(0, 2)   ∀(x, y) ∈ [g = 0],

and (0, 2) is a global minimum point.

FIGURE 3.17: Minimal value of f on the constraint set g = 0


ii) Let L(x, y, λ) = f(x, y) − λ(g(x, y) − 0) = y + 1 − λ(x^4 − (y − 2)^5). An interior extreme point, if it exists, is a solution of the system

Lx = fx − λgx = −4λx^3 = 0,
Ly = fy − λgy = 1 + 5λ(y − 2)^4 = 0,
Lλ = −(g − 0) = −(x^4 − (y − 2)^5) = 0.

Note that λ = 0 is not possible, by the second equation. So we deduce that x = 0 from the first equation, and then y = 2 from the third equation. But this contradicts the second equation. So the system has no solution. No level curve is tangent to the constraint set in Figure 3.18.

FIGURE 3.18: No solution with the Lagrange method

iii) This does not contradict the theorem on the necessary conditions for a constrained candidate point, since the theorem applies only when all its assumptions are satisfied, which is not the case here: the point (0, 2) is not regular. Indeed, we have

g'(x, y) = 4x^3 i − 5(y − 2)^4 j,   g'(0, 2) = ⟨0, 0⟩   =⇒   rank(g'(0, 2)) = 0 ≠ 1.

3. – At what points on the curve g(x, y) = x4 +y 4 = 1 does f (x, y) = x2 +y 2 have its maximum and minimum values? Give a geometric interpretation of the problem.


Solution: Note that, the optimization problem has a solution by the extremevalue theorem

 since f is continuous on the closed and bounded subset [g = 1] = g −1 {1} of R2 . Next, the functions f and g are C 1 around each point (x, y) ∈ R2 . In particular each point of [g = 1] is relatively interior to [g = 1]. Indeed, if (x0 , y0 ) ∈ [g = 1], then the point (x20 , y02 ) is on the unit circle. Thus, (x20 , y02 ) is an interior point and we conclude, by using the preimage of an open ball by the continuous function (x, y) −→ (x2 , y 2 ) is an open set. Moreover, each point of [g = 1] is a regular point since we have g  (x, y) = 4x3 i + 4y 3 j = 0

on [g = 1]

=⇒

rank(g  (x, y)) = 1.

So, by setting L(x, y, λ) = x2 + y 2 − λ (x4 + y 4 − 1) we are led to solve the system ⎧ ⎧ Lx = 2x − 4λx3 = 0 2x(1 − 2λx2 ) = 0 ⎪ ⎪ ⎪ ⎪ ⎪ ⎪ ⎪ ⎪ ⎨ ⎨ Ly = 2y − 4λy 3 = 0 2y(1 − 2λy 2 ) = 0 ⇐⇒ ⎪ ⎪ ⎪ ⎪ ⎪ ⎪ ⎪ ⎪ ⎩ ⎩ 4 x + y4 = 1 Lλ = −(x4 + y 4 − 1) = 0 ⎧ x = 0 or 2λx2 = 1 ⎪ ⎪ ⎪ ⎪ ⎨ y = 0 or 2λy 2 = 1 ⇐⇒ ⇐⇒ ⎪ ⎪ ⎪ ⎪ ⎩ 4 x + y4 = 1 ⎧ ⎧ ⎧ 2 x=0 y=0 x = y2 ⎪ ⎪ ⎪ ⎪ ⎪ ⎪ ⎪ ⎪ ⎪ ⎪ ⎪ ⎪ ⎨ ⎨ ⎨ x4 = 1/2 y = ±1 x = ±1 or or ⎪ ⎪ ⎪ ⎪ ⎪ ⎪ ⎪ ⎪ ⎪ ⎪ ⎪ ⎪ ⎩ ⎩ ⎩ λ = 1/(2x2 ). λ = 1/2 λ = 1/2

So, the stationary points for the Lagrangian are 1 1 1 1 , ± 1/4 ), (− 1/4 , ± 1/4 ) 21/4 2 2 2 at which f takes its maximum and minimum values respectively √ 1 1 1 1 max f = f ( 1/4 , ± 1/4 ) = f (− 1/4 , ± 1/4 ) = 2 g=1 2 2 2 2 (0, ±1),

(±1, 0),

(


min_{g=1} f = f(±1, 0) = f(0, ±1) = 1.

FIGURE 3.19: The constraint [x^4 + y^4 = 1] and the level curves f = 1, √2 are tangent

Since f(x, y) = ‖(x, y) − (0, 0)‖^2, the problem looks for the points (x, y) on the curve x^4 + y^4 = 1 that are closest to and farthest from the origin; see Figure 3.19.

4. – Figures A and B (see Figure 3.20) show the level curves of f and the constraint curve g(x, y) = 0 graphed thickly. Estimate the maximum and minimum values of f subject to the constraint. Locate the point(s), if any, where an extreme value occurs.

Solution: Figure A. Two level curves of f are tangent to the constraint curve g = 0. Comparing the values of f taken at these level curves, we deduce that

local max_{g=0} f ≈ 15 ≈ f(−1.5, 1.5),   local min_{g=0} f ≈ 3.64 ≈ f(−1.5, 1.5).

Figure B. One level curve of f is tangent to the constraint curve g = 0. Comparing the values of f taken at different level curves, we see that f keeps taking large values on the constraint set. Therefore, we deduce that

local max_{g=0} f does not exist,   local min_{g=0} f ≈ 19.2 ≈ f(3, −2).

FIGURE 3.20: Level curves of f and the constraint curve g = 0 (panels A and B)

5. – Find the points on the sphere x2 + y 2 + z 2 = 1 that are closest to and farthest from the point (1, 2, 2).

Solution: The distance of a point (x, y, z) to the point (1, 2, 2) is given by

D(x, y, z) = √( (x − 1)^2 + (y − 2)^2 + (z − 2)^2 ).

Looking for the shortest and the farthest distance when (x, y, z) remains on the unit sphere is equivalent to optimizing D^2(x, y, z) under the constraint x^2 + y^2 + z^2 = 1. So, let us denote

f(x, y, z) = (x − 1)^2 + (y − 2)^2 + (z − 2)^2,   g(x, y, z) = x^2 + y^2 + z^2,   S = [g = 1].

First, the optimization problem has a solution by the extreme value theorem, since f is continuous on the unit sphere S, which is a closed and bounded subset of R^3. Next, f and g are C^∞ around each point (x, y, z) ∈ R^3. In particular, each point of S is a relatively interior point and a regular point, since we have

g'(x, y, z) = 2x i + 2y j + 2z k ≠ 0 on S   =⇒   rank(g'(x, y, z)) = 1.

So, consider the Lagrangian L(x, y, z, λ) = (x − 1)2 + (y − 2)2 + (z − 2)2 − λ(x2 + y 2 + z 2 − 1)


and apply the Lagrange multiplier method to look for the interior extreme points by solving the system

Lx = 2(x − 1) − 2xλ = 0,
Ly = 2(y − 2) − 2yλ = 0,
Lz = 2(z − 2) − 2zλ = 0,
Lλ = −(x^2 + y^2 + z^2 − 1) = 0.

If x = 0, the first equation gives −2 = 0, which is impossible; similarly, we cannot have y = 0 or z = 0. So we deduce from the system that

λ = 1 − 1/x = 1 − 2/y = 1 − 2/z,

from which we deduce y = z = 2x, and then the constraint gives x^2 + 4x^2 + 4x^2 = 1, i.e. x = ±1/3.

So, the stationary points for the Lagrangian are the two points

(1/3, 2/3, 2/3, −2)   and   (−1/3, −2/3, −2/3, 4),

and f takes its minimum and maximum values respectively

f(1/3, 2/3, 2/3) = 4   and   f(−1/3, −2/3, −2/3) = 16.

The level surfaces of f passing through these points are spheres tangent to the constraint, as shown in Figure 3.21.

FIGURE 3.21: The constraint [g = 1] and the level curves f = 4, 16 are tangent
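A quick numerical cross-check of Problem 5 can be done with an equality-constrained solver. The following is only a verification sketch, under the assumption that scipy is installed; it is not part of the original solution.

```python
# Sketch: closest/farthest points on the unit sphere from (1, 2, 2) via SLSQP.
import numpy as np
from scipy.optimize import minimize

f = lambda p: (p[0] - 1)**2 + (p[1] - 2)**2 + (p[2] - 2)**2
con = {'type': 'eq', 'fun': lambda p: p[0]**2 + p[1]**2 + p[2]**2 - 1}

closest = minimize(f, x0=[1.0, 0.0, 0.0], constraints=[con], method='SLSQP')
farthest = minimize(lambda p: -f(p), x0=[-1.0, 0.0, 0.0], constraints=[con], method='SLSQP')

print(closest.x)    # approximately ( 1/3,  2/3,  2/3), where f = 4
print(farthest.x)   # approximately (-1/3, -2/3, -2/3), where f = 16
```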

3.3 Classification of Local Extreme Points-Equality Constraints

To classify a local extreme point x∗ in the case of an unconstrained optimization problem, we compared values f (x∗ + h) with f (x∗ ) using Taylor’s formula and the fact that ∇f (x∗ ) = 0. In this constrained case, we also need to make this comparison, but, we have to take into account the presence of the constraints. The Lagrangian function links the values of f to those of g. Therefore, we will apply Taylor’s formula to compare values L(x∗ + h) with L(x∗ ) using the fact that ∇L(x∗ , λ∗ ) = 0. More precisely, we establish a second derivative test under specific assumptions.

Consider the optimization problem with equality constraints, local max(min)f (x)

subject to

g(x) = c

where g(x) = g1 (x), . . . , gm (x),

c = c1 , . . . , cm 

(m < n).

The associated Lagrangian is L(x, λ) = f (x) − λ1 (g1 (x) − c1 ) − λ2 (g2 (x) − c2 ) − . . . − λm (gm (x) − cm ).

Theorem 3.3.1 (Sufficient conditions for a strict local constrained extreme point) Let f and g = (g1, . . . , gm) be C^2 functions in a neighborhood of x* in R^n such that

g(x*) = c,   rank(g'(x*)) = m,   ∇L(x*, λ*) = 0 for a unique vector λ* = ⟨λ1*, . . . , λm*⟩.

Then

(i) (−1)^m Br(x*) > 0  ∀r = m + 1, . . . , n   =⇒   x* is a strict local minimum point;

(ii) (−1)^r Br(x*) > 0  ∀r = m + 1, . . . , n   =⇒   x* is a strict local maximum point.


For r = m + 1, . . . , n, Br(x*) is the bordered Hessian determinant of the (m + r) × (m + r) matrix

[ 0_{m×m}                      ( ∂gk/∂xj(x*) )_{m×r} ]
[ ( ∂gk/∂xj(x*) )ᵗ_{r×m}       ( L_{x_i x_j}(x*, λ*) )_{r×r} ],

that is, the matrix whose upper-left block is the m × m zero matrix, whose upper-right block contains the partial derivatives ∂gk/∂xj(x*) for k = 1, . . . , m and j = 1, . . . , r, whose lower-left block is the transpose of that block, and whose lower-right block contains the second partials L_{x_i x_j}(x*, λ*) for i, j = 1, . . . , r.

The variables are renumbered in order to make the first m columns of the matrix g'(x*) linearly independent.

Remark 3.3.1 If we introduce the notations

Q(h) = Q(h1, . . . , hn) = Σ_{i=1}^n Σ_{j=1}^n L_{x_i x_j}(x*, λ*) h_i h_j,

the (m + n) × (m + n) bordered matrix

[ 0_{m×m}     g'(x*) ]
[ ᵗg'(x*)     ( L_{x_i x_j}(x*, λ*) )_{n×n} ],

and M = {h ∈ R^n : g'(x*).h = 0}, the theorem says that

Q(h) > 0  ∀h ∈ M, h ≠ 0   =⇒   x* is a strict local minimum,
Q(h) < 0  ∀h ∈ M, h ≠ 0   =⇒   x* is a strict local maximum.

It suffices, then, to study the positive (negative) definiteness of the quadratic form on the tangent plane M to the constraint g = c at the point x* (see the reminder at the end of this section).

Before proving the theorem, we will see its application through some examples.


Example 1. Consider the problem

local max f(x, y) = xy   subject to   g(x, y) = x + y = 2,  x ≥ 0,  y ≥ 0.

The Lagrange multiplier method shows that (1, 1) is a regular candidate point. Prove that it is a local maximum of the constrained optimization problem.

Solution: Considering the Lagrangian L(x, y, λ) = f(x, y) − λ(g(x, y) − 2) = xy − λ(x + y − 2), we can study the nature of the point (1, 1) using the second derivative test. Here we have n = 2 and m = 1. The first column vector of g'(1, 1) = ⟨1, 1⟩ is linearly independent, so we keep the matrix g'(1, 1) without renumbering the variables. Then we have to consider the sign of the bordered Hessian determinant (r = m + 1 = 2 = n):

(−1)^2 B2(1, 1) = det [ 0  gx  gy ; gx  Lxx  Lxy ; gy  Lxy  Lyy ] = det [ 0 1 1 ; 1 0 1 ; 1 1 0 ] = 2 > 0.

We conclude that the point (1, 1) is a local maximum of the problem.

Example 2. Solve the problem

local max f(x, y, z) = xy + yz + xz

subject to g(x, y, z) = x + y + z = 3.

Solution: Note that f and g are C^1 in R^3 and

g'(x, y, z) = i + j + k ≠ 0   =⇒   rank(g'(x, y, z)) = 1.

Thus any point of the constraint set [g = 3] (see Figure 3.22) is a regular point. Consider the Lagrangian

L(x, y, z, λ) = f(x, y, z) − λ(g(x, y, z) − 3) = xy + yz + xz − λ(x + y + z − 3),

and let us look for its stationary points, solutions of the system

∇L(x, y, z, λ) = 0   ⇐⇒   Lx = y + z − λ = 0,  Ly = x + z − λ = 0,  Lz = y + x − λ = 0,  Lλ = −(x + y + z − 3) = 0.

FIGURE 3.22: The constraint set [g = 3]

From the first three equations, we deduce that λ/2 = x = y = z, which inserted into the last equation gives

x = y = z = 1,   λ = 2.

Now let us study the nature of the point (1, 1, 1). For this we use the second derivative test, since f and g are C^2 around this point. The first column vector of g'(1, 1, 1) is linearly independent, so we keep the matrix g'(1, 1, 1) without renumbering the variables. As n = 3 and m = 1, we have to consider the signs of the following bordered Hessian determinants (the derivatives of g are taken at (1, 1, 1) and those of L at (1, 1, 1, 2)):

(−1)^2 B2(1, 1, 1) = det [ 0  gx  gy ; gx  Lxx  Lxy ; gy  Lyx  Lyy ] = det [ 0 1 1 ; 1 0 1 ; 1 1 0 ] = 2 > 0,

(−1)^3 B3(1, 1, 1) = − det [ 0  gx  gy  gz ; gx  Lxx  Lxy  Lxz ; gy  Lyx  Lyy  Lyz ; gz  Lzx  Lzy  Lzz ] = − det [ 0 1 1 1 ; 1 0 1 1 ; 1 1 0 1 ; 1 1 1 0 ] = 3 > 0.

We conclude that the point (1, 1, 1) is a local maximum of the constrained maximization problem.
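The two bordered Hessian determinants of Example 2 are easy to check numerically. The sketch below (assuming numpy is available) simply evaluates them; it is an illustration, not part of the author's argument.

```python
# Sketch: bordered Hessian determinants for Example 2 (f = xy + yz + xz, g = x + y + z = 3)
# at the candidate point (1, 1, 1) with lambda = 2.
import numpy as np

B3 = np.array([[0., 1., 1., 1.],   # first row/column: 0 bordered by the gradient of g
               [1., 0., 1., 1.],   # remaining block: second derivatives of L
               [1., 1., 0., 1.],
               [1., 1., 1., 0.]])
B2 = B3[:3, :3]                    # the 3x3 leading block is the matrix whose determinant is B2

print(np.linalg.det(B2))           # B2 = 2,  so (-1)^2 * B2 = 2 > 0
print(-np.linalg.det(B3))          # B3 = -3, so (-1)^3 * B3 = 3 > 0  => local maximum
```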


Proof. We will prove assertion i). Assertion ii) can be established similarly. We follow for this the proof in [25] with more details in the steps involved.



Step 1 : Let Ω be a neighborhood of x∗ . For h ∈ Rn such that x∗ + h ∈ Ω, we have from Taylor’s formula, for some τ ∈ (0, 1), L(x∗ +h, λ∗ ) = L(x∗ , λ∗ )+

n 

Lxi (x∗ , λ∗ )hi +

i=1

n

n

1  Lx x (x∗ +τ h, λ∗ )hi hj . 2 i=1 j=1 i j



Since x∗ ∈ Ω and (x∗ , λ∗ ) is a local stationary point of L then, in particular, Lxi (x∗ , λ∗ ) = 0

i = 1, · · · , n.

Moreover, we have g1 (x∗ ) − c1 = g2 (x∗ ) − c2 = . . . = gm (x∗ ) − cm = 0 L(x∗ , λ∗ ) = f (x∗ ) − λ∗1 (g1 (x∗ ) − c1 ) − . . . − λ∗m (gm (x∗ ) − cm ) = f (x∗ ) L(x∗ + h, λ∗ ) = f (x∗ + h) − λ∗1 (g1 (x∗ + h) − c1 ) − . . . − λ∗m (gm (x∗ + h) − cm ) from which we deduce





f (x +h)−f (x ) =

n 

λ∗k [gk (x∗ +h)−ck ]

k=1

n

n

1  + Lx x (x∗ +τ h, λ∗ )hi hj . 2 i=1 j=1 i j

Using Taylor’s formula for each gk , k = 1, . . . , m, we obtain

gk (x∗ + h) − ck = gk (x∗ + h) − gk (x∗ ) =

n  ∂gk j=1

∂xj

(x∗ + τk h)hj

τk ∈ (0, 1).

Step 2 : Now consider the (m + n) × (m + n) bordered Hessian matrix ⎤ ⎡ 0 G(x1 , . . . , xm ) ⎦ B(x0 , x1 , . . . , xm ) = ⎣ t G(x1 , . . . , xm ) HL(.,λ∗ ) (x0 )



where ⎡ G(x1 , . . . , xm ) =

x1 , . . . , xm

 ∂g

i

∂xj



(xi )

m×n

⎢ =⎢ ⎣

∂g1 1 ∂x1 (x )

.. . ∂gm m (x ) ∂x1

... .. . ...

∂g1 1 ∂xn (x )



⎥ .. ⎥ . ⎦ ∂gm m (x ) ∂xn

are arbitrary vectors in some open ball around x∗

HL(.,λ∗ ) (x0 ) : is the Hessian matrix of L with respect to x evaluated at x0 .

For r = m+1, . . . , n, let detBr (x0 , x1 , . . . , xm ) be the (m+r)×(m+r) leading principal minor of the matrix B(x0 , x1 , . . . , xm ). Suppose that (−1)m Br (x∗ ) > 0 for all r = m + 1, . . . , n, then by continuity of the second-order partial derivatives of f and g, and since detBr (x∗ , x∗ , . . . , x∗ ) = Br (x∗ ) there exists ρ > 0 such that, ∀r = m + 1, . . . , n, (−1)m detBr (x0 , x1 , . . . , xm ) > 0

∀x0 , x1 , . . . , xm ∈ Bρ (x∗ ).

As a consequence, for x0 , x1 , . . . , xm ∈ Bρ (x∗ ), the quadratic form Q(t) = Q(t1 , . . . , tn ) =

n  n 

Lxi xj (x0 , λ∗ )ti tj ,

i=1 j=1



with the associated symmetric matrix Lxi xj (x0 ) ject to the constraints

G(x1 , . . . , xm ).t = 0

⇐⇒

n  ∂gk j=1

∂xj

 n×n

, is definite positive sub-

(xk )tj = 0

k = 1, . . . , m.

Step 3 : Because τ, τk ∈ (0, 1), we have, for x∗ + h ∈ Bρ (x∗ ), x0 = x∗ + τ h, Then

n  n  i=1 j=1

x1 = x∗ + τ1 h, . . . , xm = x∗ + τm h ∈ Bρ (x∗ ).

Lxi xj (x∗ + τ h, λ∗ )ti tj > 0

∀t = 0 such that

Constrained Optimization-Equality Constraints n  ∂gk j=1

∂xj

(x∗ + τk h)tj = 0

173

k = 1, . . . , m.

In particular, for t = h such that n  ∂gk j=1

∂xj

(x∗ + τk h)hj = 0

k = 1, . . . , m,

(1)

we have n

n

1  f (x + h) − f (x ) = Lx x (x∗ + τ h, λ∗ )hi hj > 0. 2 i=1 j=1 i j ∗



(2)

This shows that the stationary point x∗ is a strict local minimum point for f subject to the constraint g(x) = c in particular directions.

Step 4 : Suppose that x∗ is not a strict relative minimum point. Then, there exists a sequence of points yl satisfying yl −→ x∗

f (yl )  f (x∗ ).

g(yl ) = c

Write each yl in the form yl = x∗ + δl sl = 0 Note that we have

sl ∈ Rn

sl = 1

δl > 0

∀l.

δl = δl sl = yl − x∗ −→ 0.

Hence, there exists l0 > 1 such that for all l  l0 , yl ∈ Bρ (x∗ ). Choose in steps 1 and 3, h = δl sl = yl − x∗ . Then g(x∗ + h) − g(x∗ ) = g(ylk ) − g(x∗ ) = c − c = 0 gk (x∗ + h) − gk (x∗ ) =

n ∂gk ∗ (x + τk h)hj = 0 ∂xj

τk ∈ (0, 1),

k = 1, . . . , m

j=1

and we should have from (1) and (2) 0  f (yl ) − f (x∗ ) = f (x∗ + h) − f (x∗ ) = which is a contradiction.

n

n

1  Lx x (x∗ + τ h)hi hj > 0 2 i=1 j=1 i j



Theorem 3.3.2 Necessary conditions for local extreme points Let f and g = (g1 , . . . , gm ) be C 2 functions in a neighborhood of x∗ in Rn such that: g(x∗ ) = c ∇L(x∗ , λ∗ ) = 0

rank(g  (x∗ )) = m, for a unique vector

λ∗ = λ∗1 , . . . , λ∗m .

Then, (i) x∗ is a local minimum point =⇒ HL = (Lxi xj (x∗ , λ∗ ))n×n is positive semi definite on M : t yHL y  0 ∀y ∈ M (ii)

x∗ is a local maximum point

=⇒

HL = (Lxi xj (x∗ , λ∗ ))n×n

is negative semi definite on M : t yHL y  0 where M = {h ∈ Rn : g  (x∗ ).h = 0} g(x) = c at the point x∗ .

∀y ∈ M

is the tangent plane to the surface

Proof. We prove i), then ii) can be established similarly. Let x(t) be a two differentiable curve on the constraint surface g(x) = c with x(0) = x∗ . Suppose that x∗ is a local minimum point for f subject to the constraint g(x) = c. Then there exists r > 0 such that f (x∗ )  f (x(t)) Then

f$(0) = f (x∗ )  f (x(t)) = f$(t)

∀t ∈ (−r, r). ∀t ∈ (−r, r).

So f$ is a one variable function that has an interior minimum at t = 0. Consequently, it satisfies f$ (0) = 0 and f$ (0)  0 or equivalently  d2  and f (x(t))  0. ∇f (x∗ ).x (0) = 0  dt2 t=0 We have d2 f (x(t)) = t x (t)Hf (x(t))x (t) + ∇f (x(t))x (t) dt2  d2  f (x(t)) = t x (0)Hf (x∗ )x (0) + ∇f (x∗ ).x (0).  dt2 t=0 Moreover, differentiating the relation g(x(t)) = c twice, we obtain t 

x (t)Hg (x(t))x (t) + ∇g(x(t))x (t) = 0

=⇒

Constrained Optimization-Equality Constraints

175

t 

x (0)Hg (x∗ )x (0) + ∇g(x∗ )x (0) = 0.

Hence 0

 d2  f (x(t)) = [t x (0)Hf (x∗ )x (0) + ∇f (x∗ )x (0)]  dt2 t=0 − t λ∗ [t x (0)Hg (x∗ )x (0) + ∇g(x∗ )x (0)] = =

t 

x (0)[Hf (x∗ ) −t λ∗ Hg (x∗ )]x (0) + [∇f (x∗ ) + t λ∗ ∇g(x∗ )]x (0)

t 

x (0)[HL (x∗ )]x (0)

∇f (x∗ ) + t λ∗ ∇g(x∗ ) = 0

since

and the result follows since x (0) is an arbitrary element of M.

Quadratic Forms with Linear Constraints Consider the symmetric quadratic form in n variables Q(h) =

n  n 

aij hi hj

(aij = aji )

i=1 j=1

subject to m linear homogeneous constraints

b11 h1 + . . . + b1n hn = 0 .. .. . . . bm1 h1 + . . . + bmn hn = 0

Set



a11 ⎢ A = ⎣ ... an1

... .. . ...

⎤ a1n .. ⎥ . ⎦ ann



b11 ⎢ B = ⎣ ... bm1

... .. . ...

⎤ b1n .. ⎥ . ⎦ bmn



⎤ h1 ⎢ ⎥ h = ⎣ ... ⎦ hn

Definition. Q(h) = t hAh is positive (resp. negative) definite subject to the linear constraints Bh = 0 if Q(h) > 0 (resp. < 0) for all h = 0 that satisfy Bh = 0. We have the following necessary and sufficient condition for a quadratic form Q to be positive (resp. negative) definite subject to linear constraints.



Theorem: Assume the first m columns in the matrix B = (bij ) are linearly independent. Then Q is positive definite subject to the constraints Bh = 0 ⇐⇒

(−1)m Br > 0

r = m + 1, . . . , n

Q is negative definite subject to the constraints Bh = 0 ⇐⇒

(−1)r Br > 0

r = m + 1, . . . , n

where Br are the symmetric determinants         Br =       

··· .. . ...

0 .. .

b11 .. .

...

b1r .. .

0

bm1

...

bmr

b11 .. .

...

bm1 .. .

a11 .. .

a1r .. .

b1r

...

bmr

ar1

... .. . ...

0 .. . 0

arr

               

for

r = m + 1, . . . , n.



Solved Problems

1. – Consider the problem max(min) f (x, y) = x2 + 2y 2

subject to

g(x, y) = x2 + y 2 = 1.

i) Find the four points that satisfy the first-order conditions. ii) Classify them by using the second derivatives test. iii) Graph some level curves of f and the graph of g = 1. Explain, where the extreme points occur.

Solution: i) First, each of the optimization problems has a solution by the extreme-value theorem; see Figure 3.23. Indeed, f is continuous on the unit circle S = {(x, y) : g(x, y) = 1} which is a closed and bounded subset of R2 . 1.0 y

0.5

0.0 0.5 1.0 2.0

1.5

z 1.0

0.5

0.0

z  x2  2 y2

1.0 0.5 0.0 x

0.5 1.0

FIGURE 3.23: Graph of f on the set [x2 + y 2  1]


Introduction to the Theory of Optimization in Euclidean Space

Next, the functions f and g are C 1 in R2 and any point on the unit circle is regular since, for each (x, y) ∈ S, we have g  (x, y) = (2x, 2y) = (0, 0) Thus, if we introduce the Lagrangian

rank(g  (x, y)) = 1.

=⇒

L(x, y, λ) = f (x, y) − λ(g(x, y) − 1) = x2 + 2y 2 − λ(x2 + y 2 − 1), then, by applying Lagrange multipliers method, the interior extreme points candidates are solutions of the system ∇L(x, y, λ) = 0, 0, 0

⇐⇒

⎧ Lx = 2x − λ(2x) = 0 ⎪ ⎪ ⎪ ⎪ ⎨ Ly = 4y − λ(2y) = 0 ⎪ ⎪ ⎪ ⎪ ⎩ Lλ = −(x2 + y 2 − 1) = 0

⇐⇒

⎧ x = 0 or λ = 1 ⎪ ⎪ ⎪ ⎪ ⎨ y = 0 or λ = 2 ⎪ ⎪ ⎪ ⎪ ⎩ 2 x + y 2 − 1 = 0.

We cannot have x = y = 0 since the constraint is not satisfied. If x = 0 and λ = 2, we deduce from the third equation y = ±1. Then, if y = 0 and λ = 1, we get x = ±1. So the four points that satisfy the necessary conditions are (1, 0) (−1, 0) (0, 1) (0, −1). ii) Now, because f and g are C 2 , we may study the nature of the four points by using the second derivatives test. Here, we have n = 2 and m = 1. Then, we have to consider the sign of the bordered Hessian determinant B2 at each point. Nature of the points (±1, 0) where λ = 1 : First, we have g  (x, y) = (2x, 2y),

g  (±1, 0) = (±2, 0),

rank(g  (±1, 0)) = 1,

and the first column vector of g  (±1, 0) is linearly independent. We have   0  B2 (x, y) =  gx (x, y)  gy (x, y)   0  B2 (1, 0) =  2  0

2 0 0

gx (x, y) Lxx (x, y, λ) Lxy (x, y, λ) 0 0 2

    = −8  

gy (x, y) Lxy (x, y, λ) Lyy (x, y, λ)

    0    =  2x     2y

2x 2 − 2λ 0

  0 −2  B2 (−1, 0) =  −2 0  0 0

0 0 2

2y 0 4 − 2λ     = −8.  

For m = 1, we have (−1)1 B2 (1, 0) = 2 > 0 and the points (±1, 0) are local minima.

(−1)1 B2 (−1, 0) = 2 > 0

     

Constrained Optimization-Equality Constraints

179

Nature of the points (0, ±1) where λ = 2 : We have g  (x, y) = (2x, 2y),

g  (0, ±1) = (0, ±2)

rank(g  (0, ±1)) = 1.

=⇒

Note that the first column vector of g  (0, ±1) is linearly dependent and the second column vector is linearly independent. So, we renumber the variables so that the second column vector of g  (0, ±1) is in the first position. Hence B2 will be written as   0  B2 (x, y) =  gy (x, y)  gx (x, y)

gy (x, y) Lyy (x, y, λ) Lxy (x, y, λ)

  0 2  B2 (0, 1) =  2 0  0 0

0 0 −2

gx (x, y) Lyx (x, y, λ) Lxx (x, y, λ)

    0    =  2y     2x

2y 4 − 2λ 0

  0 −2  B2 (0, −1) =  −2 0  0 0

   =8  

2x 0 2 − 2λ

     

    = 8.  

0 0 −2

For r = m + 1 = 2 = n, we have (−1)2 B2 (0, 1) = 8 > 0

(−1)2 B2 (0, −1) = 8 > 0

and the points (0, ±1) are local maxima. y .4 76 5.12

6.08 5.44

3.2

4.16

y

1.5 3.84

4.8

6.4 5.12 6.0 5.4

3.52

2.56

1.0

4.4

0.96

1.92

1.5

5.76

4.8

2.8

0.5

x2  2 y2  2

1.0

0.5 x2  2 y2  1

1.5

1.0

0.64

1.6

0.32 0.5

0.5

1.0

1.5

0.5 .2

2.24

2.56

2.88 4.16 5 12

3.84 1.5

1.0

0.5

0.5

1.0

1.5

x

3.5

1.0 5.44 08 6 4 5 76

1.5

0.5 1.28

4.8

x

48

1.0 4.48 5.4 5.12 6.0 5 76 6 4

1.5

FIGURE 3.24: Level curves f = 1 and f = 2 are tangent to the constraint g=1

iii) Conclusion: We have f (±1, 0) = 1

f ((0, ±1) = 2.

Subject to the constraint g(x, y) = 1, f attains its maximum value 2 at the points (0, ±1) and its minimum value 1 at the points (±1, 0). At these points,

180

Introduction to the Theory of Optimization in Euclidean Space

the level curves x2 + 2y 2 = 1, x2 + 2y 2 = 2 and the constraint x2 + y 2 = 1, sketched in Figure 3.24, are tangent.

2. – Consider the problem min f (x, y, z) = (x − x0 )2 + (y − y0 )2 + (z − z0 )2 subject to

g(x, y, z) = ax + by + cz + d = 0

for (x0 , y0 , z0 ) ∈ R3 , d ∈ R and (a, b, c) = (0, 0, 0). i) Find the points that satisfy the first-order conditions. ii) Show that the second-order conditions for a local minimum are satisfied. iii) Give a geometric argument for the existence of a minimum solution. iv) Does the maximization problem have any solution? v) Solve min x2 + y 2 + z 2

subject to

x + y + z = 1.

Solution: i) Note that f and g are C 1 in R3 . In particular, each point of [g = 0] is a relative interior and regular point since we have g  (x, y, z) = ai + bj + ck = 0

=⇒

rank(g  (x, y, z)) = 1.

So, by applying Lagrange multipliers method, we will look for the candidate extreme points as stationary points for the Lagrangian L(x, y, z, λ) = (x − x0 )2 + (y − y0 )2 + (z − z0 )2 − λ(ax + by + cz + d). These points are solution of the system

∇L(x, y, z, λ) = 0, 0, 0, 0

⇐⇒

⎧ Lx = 2(x − x0 ) − λa = 0 ⎪ ⎪ ⎪ ⎪ ⎪ ⎪ ⎪ ⎪ ⎨ Ly = 2(y − y0 ) − λb = 0 ⎪ ⎪ Lz = 2(z − z0 ) − λc = 0 ⎪ ⎪ ⎪ ⎪ ⎪ ⎪ ⎩ Lλ = −(ax + by + cz + d) = 0

Constrained Optimization-Equality Constraints

181

from which we deduce ⎧ λ λ λ ⎪ ⎪ y = b + y0 z = c + z0 ⎪ ⎨ x = 2 a + x0 2 2

λ

λ

⎪ ⎪ λ ⎪ ⎩ a a + x 0 + b b + y 0 + c c + z0 + d = 0 2 2 2 and that ax0 + by0 + cz0 + d λ∗ λ =− . = 2 a 2 + b2 + c 2 2 Thus, we have only one critical point denoted (x∗ , y ∗ , z ∗ ) with λ = λ∗ . ii) First, note that g  (x∗ , y ∗ , z ∗ ) = (a, b, c) = (0, 0, 0) and discuss: Case a = 0. The first column vector of g  (x∗ , y ∗ , z ∗ ) is linearly independent, and because n = 3 and m = 1, we have to consider the signs of the following bordered Hessian determinants:   0  ∗ ∗ ∗ B2 (x , y , z ) =  gx  gy

gy Lxy Lyy

gx Lxx Lxy

    0   = a     b

a 2 0

b 0 2

    = −2(a2 + b2 ) < 0.  

The partial derivatives of g are taken at (x∗ , y ∗ , z ∗ ) and those of L at (x∗ , y ∗ , z ∗ , λ∗ ).     B3 =   

0 gx gy gz

gx Lxx Lyx Lzx

gy Lxy Lyy Lzy

gz Lxz Lyz Lzz

        =      

0 a b c

a 2 0 0

b 0 2 0

c 0 0 2

     = −4(a2 + b2 + c2 ) < 0.   

Case a = 0 & b = 0. The first column vector of g  (x∗ , y ∗ , z ∗ ) is linearly dependent and the second is linearly independent. We renumber the variables in the order y, x, z and obtain   0  B2 =  b  a

 b a  2 0  = −2(a2 + b2 ) 0 2 

    B3 =   

0 b a c

b a 2 0 0 2 0 0

c 0 0 2

     = −4(a2 + b2 + c2 ).   

182

Introduction to the Theory of Optimization in Euclidean Space

Case a = 0, b = 0, & c = 0. The first and second column vector of g  (x∗ , y ∗ , z ∗ ) are linearly dependent and the third is linearly independent. We renumber the variables in the order z, x, y and obtain   0  B2 =  c  a

    B3 =   

 c a  2 0  = −2(a2 + c2 ) 0 2 

0 c a b

c 2 0 0

a 0 2 0

b 0 0 2

     = −4(a2 + b2 + c2 ).   

Conclusion. In each case, we have, with m = 1, (−1)m B2 (x∗ , y ∗ , z ∗ ) > 0

(−1)m B3 (x∗ , y ∗ , z ∗ ) = 4(a2 + b2 + c2 ) > 0.

We conclude that the point (x∗ , y ∗ , z ∗ ) is a local minimum to the constrained minimization problem. iii) Geometric interpretation of the minimization problem: If M (x, y, z), M0 (x0 , y0 , z0 ) ∈ R3 , then f (x, y, z) = (x − x0 )2 + (y − y0 )2 + (z − z0 )2 = M0 M 2 is the square of the distance of the point M to the point M0 . The constraint surface g(x, y, z) = ax + by + cz + d = 0

is the plane with normal

a, b, c.

The minimization problem consists in finding a point M in the plane that is located at a shortest distance from M0 . Such a point exists and is obtained by considering the intersection of the line passing through the point M0 and perpendicular to the plane. A direction of this line is given by the normal to the plane a, b, c. Therefore, parametric equations of the line are x = x0 + ta

y = y0 + tb

Clearly the intersection of the line with the plane gives





a x0 + ta + b y0 + tb + c z0 + tc + d = 0 t=

t ∈ R.

z = z0 + tc

⇐⇒

λ∗ ax0 + by0 + cz0 + d =− . 2 a 2 + b2 + c 2

f takes its minimum value f(

λ∗ λ∗ λ∗ λ∗ λ∗ λ∗2 2 2 2 λ∗ a+x0 , b+y0 , c+z0 ) = ( a)2 +( b)2 +( c)2 = (a +b +c ). 2 2 2 2 2 2 4

Constrained Optimization-Equality Constraints

183

The shortest distance of M0 to the plan g = 0 is |λ∗ |  2 |ax0 + by0 + cz0 + d| λ∗2 2 √ (a + b2 + c2 ) = . a + b2 + c 2 = D= 4 2 a 2 + b2 + c 2 iv) The maximization problem doesn’t have a solution: Suppose that there exists (xm , ym , zm ) a solution to the maximization problem. Then the points (xm + t, ym − t, zm ) for t ∈ R are located in the plane and satisfy f (xm +t, ym −t, zm ) = (xm +t)2 +(ym −t)2 +z 2

−→ +∞

as

t −→ +∞.

v) From, the previous study, choose (a, b, c) = (1, 1, 1), d = −1, (x0 , y0 , z0 ) = (0, 0, 0). Then 1 λ = 2 3

(x∗ , y ∗ , z ∗ ) =

and

1 1 1

, , . 3 3 3

We conclude that the point ( 13 , 13 , 13 ) is a local minimum to the constrained minimization problem. At this point, the two level surfaces x2 + y 2 + z 2 =

1 1 1

1 =f , , , 3 3 3 3

and

x+y+z =1

are tangent, as it is described in Figure 3.25. 1.0 y

0.5

0.0 0.5 1.0 1.0 x yz  1 0.5

z

0.0

0.5 x2  y2  z2  1.0

1 3

1.0 0.5 0.0 x

0.5 1.0

FIGURE 3.25: The level surface and the plane are tangent


Introduction to the Theory of Optimization in Euclidean Space

3. – The planes x + y + z = 3 and x − y = 2 intersect in a straight line. Find the point on that line that is closest to the origin.

Solution: i) We formulate the problem as follows ⎧ ⎨ g1 (x, y, z) = x + y + z = 3

min f (x, y, z) = x2 + y 2 + z 2

subject to



g2 (x, y, z) = x − y = 2.

Note that f , g1 and g2 are C 1 in R3 and any point of the set of the constraints, sketched in Figure 3.26 and defined by g = (g1 , g2 ) = (3, 2), is an interior point and regular since we have   1 1 1  rank(g  (x, y, z)) = 2. g (x, y, z) = 1 −1 0 x

0 y

2

x

2 4

y

2

0 2

0

4

0 2

2

5 5

minimum

z

z origin

0 0

5

5

FIGURE 3.26: The constraints, the origin and the minimum point

Consider the Lagrangian L(x, y, z, λ1 , λ2 ) = f (x, y, z) − λ1 (g1 (x, y, z) − 3) − λ2 (g2 (x, y, z) − 2) = x2 + y 2 + z 2 − λ1 (x + y + z − 3) − λ2 (x − y − 2)

Constrained Optimization-Equality Constraints

185

and look for the stationary points solutions of ∇L(x, y, z, λ1 , λ2 ) = 0R5 ⎧ (1) Lx = 2x − λ1 − λ2 = 0 ⎪ ⎪ ⎪ ⎪ ⎪ ⎪ ⎪ ⎪ (2) Ly = 2y − λ1 + λ2 = 0 ⎪ ⎪ ⎪ ⎪ ⎨ (3) Lz = 2z − λ1 = 0 ⇐⇒ ⎪ ⎪ ⎪ ⎪ ⎪ ⎪ (4) Lλ1 = −(x + y + z − 3) = 0 ⎪ ⎪ ⎪ ⎪ ⎪ ⎪ ⎩ (5) Lλ2 = −(x − y − 2) = 0. From equations (1), (2) and (3), we deduce that ⎧ 1 ⎪ ⎪ x = 2 (λ1 + λ2 ) ⎪ ⎪ ⎨ y = 12 (λ1 − λ2 ) ⎪ ⎪ ⎪ ⎪ ⎩ z = 12 λ1 then substituting these values into equations (4) and (5), we obtain ⎧ 1 ⎨ 2 (λ1 + λ2 ) + 12 (λ1 − λ2 ) + 12 λ1 = 3 =⇒ (λ1 , λ2 ) = (2, 2). ⎩ 1 1 (λ + λ ) − (λ − λ ) = 2 1 2 1 2 2 2 The only critical point for L is

(x∗ , y ∗ , z ∗ , λ∗1 , λ∗2 ) = (2, 0, 1, 2, 2).

ii) Note that the first two column vectors of g  (x, y, z) are linearly independent. We can, therefore, keep the matrix without renumbering the variables, and consider the sign of the following bordered Hessian determinant (n = 3, m = 2, r = m + 1 = 3):   0     0     1 B3 (2, 0, 1) =  ∂g  ∂x   ∂g1  ∂y     ∂g1 ∂z

We have

0

∂g1 ∂x

∂g1 ∂y

∂g1 ∂z

0

∂g2 ∂x

∂g2 ∂y

∂g2 ∂z

∂g2 ∂x

Lxx

Lxy

Lxz

∂g2 ∂y

Lyx

Lyy

Lyz

∂g2 ∂z

Lzx

Lzy

Lzz

                =           

0 0 1 1 1

0 0 1 −1 0

1 1 2 0 0

(−1)m B3 (2, 0, 1) = (−1)2 B3 (2, 0, 1) = 12 > 0.

1 −1 0 2 0

1 0 0 0 2

      = 12    


Introduction to the Theory of Optimization in Euclidean Space

We conclude that the point (2, 0, 1) is a local minimum to the constrained optimization problem. iii) To show that the point is the global minimum point, we use the following parametrization of the set of the constraints; see Figure 3.26: x = t + 2,

y = t,

z = 1 − 2t

t ∈ R.

So the optimization problem is reduced to min F (t) = f (t + 2, t, 1 − 2t) = (t + 2)2 + t2 + (2t − 1)2 . t∈R

We have F  (t) = 2(t + 2) + 2t + 2(2t − 1)(2) = 12t = 0 and

F  (t) = 12 > 0

⇐⇒

t=0

∀t ∈ R.

Hence 0 is a global minimum for F . That is, the point (2, 0, 1) is the solution to the minimization problem. In Section 3.4, we will see that using the convexity of the Lagrangian in (x, y, z), when (λ1 , λ2 ) = (2, 2), we can conclude that the local minimum point (2, 0, 1) is the global minimum point. Therefore, it solves the problem. The advantage, in arguing in this way, prevents us from exploring the geometry of the constraint set.

Constrained Optimization-Equality Constraints

3.4

187

Global Extreme Points-Equality Constraints

The following theorem gives sufficient conditions for a critical point of the Lagrangian to be a global extreme point for the associated constrained optimization problem.

Theorem 3.4.1 Let Ω ⊂ Rn , Ω be an open set and f, g1 , . . . , gm : Ω −→ ◦

R be C 1 functions. Let S ⊂ Ω be convex, x∗ ∈ S and L be the Lagrangian L(x, λ) = f (x) − λ1 (g1 (x) − c1 ) − . . . − λm (gm (x) − cm ). Then, we have ∃ λ∗ = λ∗1 , . . . , λ∗m  : ∇x,λ L(x∗ , λ∗ ) = 0

⎫ ⎬

L(., λ∗ ) is concave (resp. convex) in x ∈ S



=⇒

f (x∗ ) =

max

{x∈S: g(x)=c}

f (x)

( resp. min)

Proof. Suppose that the Lagrangian L(., λ∗ ) is concave in x and that m  ∂f ∗ ∂gj ∗ ∂L ∗ ∗ (x , λ ) = (x ) − λ∗j (x ) = 0 ∂xi ∂xi ∂x i j=1

i = 1, . . . , n,

then x∗ is a stationary point for L(., λ∗ ). Therefore, x∗ is a global maximum for L(., λ∗ ) in S (by Theorem 2.3.4) and we have L(x∗ , λ∗ ) = f (x∗ ) − λ∗1 (g1 (x∗ ) − c1 ) − . . . − λ∗m (gm (x∗ ) − cm )  f (x) − λ∗1 (g1 (x) − c1 ) − . . . − λ∗m (gm (x) − cm ) = L(x, λ∗ ) Since, we have ∂L ∗ ∗ (x , λ ) = −(gj (x∗ ) − cj ) = 0 ∂λj then

j = 1, . . . , m

g1 (x∗ ) − c1 = g2 (x∗ ) − c2 = . . . = gm (x∗ ) − cm = 0.

∀x ∈ S.


Introduction to the Theory of Optimization in Euclidean Space

So, the previous inequality reduces to f (x∗ )  f (x) − λ∗1 (g1 (x) − c1 ) − . . . − λ∗m (gm (x) − cm ). In particular, we have f (x∗ )  f (x)

∀x ∈ {x ∈ S :

g(x) = c}.

Thus x∗ solves the constrained maximization problem. The minimization case can be established similarly.

Remark 3.4.1 * Note that there is no regularity assumption on the point x∗ in the theorem. The proof uses the characterization of a C 1 convex function on a convex set. ** The concavity/convexity hypothesis is a sufficient condition. We may have a global extreme point with a Lagrangian that is neither concave nor convex (see Example 3).

Example 1. Economy. If the cost of capital K and labor L is r and w dollars per unit respectively, find the values of K and L that minimize the cost to produce the output Q = c K a Lb , where c, a and b are positive parameters satisfying a + b < 1. Solution: The inputs K and L minimizing the cost must solve the problem min rK + wL

subject to

cK a Lb = Q.

We look for the extreme points in the set Ω = (0, +∞) × (0, +∞) since K and L must satisfy cK a Lb = Q. Denote g(K, L) = cK a Lb

f (K, L) = rK + wL

S = Ω.

Note that f and g are C 1 in the open convex set Ω. Consider the Lagrangian L(K, L, λ) = f (K, L) − λ(g(K, L) − Q) = rK + wL − λ(cK a Lb − Q) and Lagrange’s necessary conditions

∇L(K, L, λ) = 0, 0, 0

⇐⇒

⎧ LK = r − λcaK a−1 Lb = 0 ⎪ ⎪ ⎪ ⎪ ⎨ LL = w − λcbK a Lb−1 = 0 ⎪ ⎪ ⎪ ⎪ ⎩ Lλ = −(cK a Lb − Q) = 0.

Constrained Optimization-Equality Constraints

189

Multiplying each side of the first equality by K, each side of the second equality by L, we obtain rK = λcaK a Lb = λaQ

wL = λcbK a Lb = λbQ

then using the third equality, we deduce the unique solution of the system K ∗ = λ∗

aQ r

L∗ = λ ∗

bQ w

λ∗ =

1 a b Q a+b r a+b w a+b . c aQ bQ

Convexity of L in (K, L). The Hessian matrix of L is ⎤ ⎡ −λ∗ cabK a−1 Lb−1 −λ∗ ca(a − 1)K a−2 Lb ⎦. HL(.,.,λ∗ ) = ⎣ −λ∗ cabK a−1 Lb−1 −λ∗ cb(b − 1)K a Lb−2 The leading principal minors are D1 (K, L) = −λ∗ ca(a − 1)K a−2 Lb > 0   −λ∗ ca(a − 1)K a−2 Lb  D2 (K, L) =   −λ∗ cabK a−1 Lb−1

since 0 < a < a + b < 1 −λ∗ cabK a−1 Lb−1 −λ∗ cb(b − 1)K a Lb−2

     

= (λ∗ )2 c2 abK 2a−2 L2b−2 (1 − (a + b)) > 0. Hence, L(., ., λ∗ ) is strictly convex in (K, L) in Ω, and we conclude that the point (K ∗ , L∗ ) is the solution to the constrained minimization problem. Example 2. Two-constraint problem. Solve the problem ⎧ ⎨ g1 (x, y, z) = x2 + y 2 = 1 min (max) f (x, y, z) = x−z subject to ⎩ g2 (x, y, z) = x2 + z 2 = 1. Solution: i) Consider the Lagrangian L(x, y, z, λ1 , λ2 ) = f (x, y, z) − λ1 (g1 (x, y, z) − 1) − λ2 (g2 (x, y, z) − 1) = x − z − λ1 (x2 + y 2 − 1) − λ2 (x2 + z 2 − 1)

190

Introduction to the Theory of Optimization in Euclidean Space and look for its stationary points, solution of the system

∇L(x, y, z, λ1 , λ2 ) = 0R5

⎧ (1) Lx = 1 − 2xλ1 − 2xλ2 = 0 ⎪ ⎪ ⎪ ⎪ ⎪ ⎪ ⎪ ⎪ (2) Ly = 0 − 2yλ1 = 0 ⎪ ⎪ ⎪ ⎪ ⎨ (3) Lz = −1 − 2zλ2 = 0 ⇐⇒ ⎪ ⎪ ⎪ ⎪ ⎪ ⎪ (4) Lλ1 = −(x2 + y 2 − 1) = 0 ⎪ ⎪ ⎪ ⎪ ⎪ ⎪ ⎩ (5) Lλ2 = −(x2 + z 2 − 1) = 0.

From equation (2), we deduce that λ1 = 0

or

y = 0.

∗ If y = 0, then from (4) and (5) we deduce that x = ±1

and

z = 0.

But (3) is not possible. ∗ If λ1 = 0, then (1) and (3) reduce to 1 − 2xλ2 = 0

or

− 1 − 2zλ2 = 0.

Since λ2 cannot be equal to zero, we deduce that x = −z =

1 . 2λ2

Inserting x = −z in (5), we obtain 2x2 = 1

⇐⇒

√ x = ±1/ 2.

Then, from (4), we get 1 + y2 = 1 2

√ y = ±1/ 2.

⇐⇒

So, the critical points of L are 1 1 1 ( √ , ± √ , − √ , λ∗1 , λ∗2 ) 2 2 2 1 1 1 (− √ , ± √ , √ , λ∗1 , λ∗2 ) 2 2 2

with with

1 (λ∗1 , λ∗2 ) = (0, √ ), 2 1 (λ∗1 , λ∗2 ) = (0, − √ ). 2

Constrained Optimization-Equality Constraints

191

The values taken by f at these points are √ 1 1 1 f(√ , ±√ , −√ ) = 2 2 2 2 ii) To study the convexity ⎡ Lxx HL(x,y,z,λ1 ,λ2 ) = ⎣ Lyx Lzx

√ 1 1 1 f (− √ , ± √ , √ ) = − 2. 2 2 2

of L in (x, y, z), consider the Hessian matrix ⎤ ⎡ ⎤ −2(λ1 + λ2 ) Lxy Lxz 0 0 Lyy Lyz ⎦ = ⎣ 0 ⎦ 0 −2λ1 Lzy Lzz 0 0 −2λ2

1 * With (λ∗1 , λ∗2 ) = (0, √ ), the Hessian is 2 ⎡ √ − 2 HL(x,y,z,0, √1 ) = ⎣ 0 2 0 and

√  √   − 2 =− 2 Δ12 1 =   0 0 √ Δ12 =  0 − 2  √  − 2 0  0 Δ3 =  0  0 0

  =0 

   0 =0 Δ13 1 =  √  − 2 Δ22 =  0

   =0  − 2  0 0 √

⎤ 0 0 0 0 ⎦ √ 0 − 2 √  √   − 2 =− 2 Δ23 1 =

  =2 − 2  0 √

(−1)k Δk  0

 √  − 2 Δ32 =  0

 0  =0 0 

k = 1, 2, 3.

Thus L(., 0, √12 ) is concave in R3 and the points ( √12 , ± √12 , − √12 ) are maxima points. ** Similarly, we show that L(., 0, − √12 ) is convex and the points (− √12 , ± √12 , √12 ) are minima points. iii) Comments. The constraint set, illustrated in Figure 3.27, is the intersection of two cylinders. A parametrization of this set is described by the equations  −t t ∈ [−1, 1]. x(t) = ± 1 − t2 , y(t) = t, z(t) = t or The set is closed since g1 and g2 are continuous on R3 and



 [(g1 , g2 ) = (1, 1)] = g1−1 {1} ∩ g2−1 {1} . It is bounded since, for any (x, y, z) ∈ [(g1 , g2 ) = (1, 1)], we have (x, y, z) = x2 + y 2 + z 2  (x2 + y 2 ) + (x2 + z 2 )  1 + 1 = 2.

192

Introduction to the Theory of Optimization in Euclidean Space 4 y

1.0

2

y

0 2

0.5

x2  y2  1

4 4

1.0 1.0

2

z

0.5

0.0

0.5

z

0

2

0.0

0.5

x2  z2  1

4

1.0

4

1.0 2

0.5 0 x

0.0 x

2 4

0.5 1.0

FIGURE 3.27: The constraint set

As f is continuous on the closed bounded constraint set [(g1 , g2 ) = (1, 1)], it attains its maximum and minimum values on this set by the extreme value theorem. Thus, the solution of the problem is found by comparing the values of f taken at the candidate points obtained in i). Example 3. No concavity nor convexity. Consider the problem max f (x, y, z) = xy + yz + xz

subject to

g(x, y, z) = x + y + z = 3

and the associated Lagrangian L(x, y, z, λ) = f (x, y, z) − λ(g(x, y, z) − 3) = xy + yz + xz − λ(x + y + z − 3). Show that the local maximum point (1, 1, 1) of the constrained optimization problem, with λ = 2, is a global maximum, but L(., 2) is not concave. Solution: We have Lx = y + z − λ

Ly = x + z − λ

Lz = y + x − λ

Lλ = −(x + y + z − 3).

To study the concavity of L in (x, y, z) when λ = 2, consider the Hessian matrix ⎤ ⎡ ⎡ ⎤ 0 1 1 Lxx Lxy Lxz HL(x,y,z,2) = ⎣ Lyx Lyy Lyz ⎦ = ⎣ 1 0 1 ⎦ . Lzx Lzy Lzz 1 1 0

Constrained Optimization-Equality Constraints

193

The principal minors are    0 =0 Δ12 1 =   0 Δ12 =  1   0  Δ3 =  1  1

   0 =0 Δ13 1 =

   0 1  2  = −1 Δ = 2  1 0   1 1  0 1  = 2. 1 0 

   0 =0 Δ23 1 =

 1  = −1 0 

  0 Δ32 =  1

 1  = −1 0 

So L(., 2) is neither concave nor convex in (x, y, z). Thus, we cannot conclude, by using the theorem, whether the point (1, 1, 1) is a global maximum or not. Now, to show that (1, 1, 1) is a global maximum point, we can proceed as follows. Consider the values of f taken on the plane g(x, y, z) = 3: f (x, y, 3−(x+y)) = xy +(y +x)[3−(x+y)] = xy +3(x+y)−(x+y)2 = θ(x, y). The maximization problem is equivalent to solve the following unconstrained problem max

(x,y)∈R2

θ(x, y).

Since θ is C 1 , the critical points are solutions of ∇θ(x, y) = y + 3 − 2(x + y), x + 3 − 2(x + y) = 3 − 2x − y, 3 − x − 2y = 0, 0 ⎧ ⎨ 2x + y = 3 ⇐⇒ (x, y) = (1, 1). ⇐⇒ ⎩ x + 2y = 3 (1, 1) is the only critical point for θ. Moreover, we have     −2 −1 θxx θxy Hθ (x, y) = = −1 −2 θyx θyy D1 (x, y) = −2   −2 D2 (x, y) =  −1

=⇒  −1  =3 −2 

(−1)1 D1 (x, y) = 2 > 0 =⇒

(−1)2 D2 (x, y) = 3 > 0.

θ is strictly concave on R2 . Thus (1, 1) is a global maximum of θ on R2 . Therefore, (1, 1, 1) is a global maximum of f on [g = 3].
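The reduction of Example 3 to the unconstrained problem in θ can also be checked symbolically. The following is only an illustrative sketch assuming sympy is available.

```python
# Sketch: verify that theta(x, y) = xy + 3(x + y) - (x + y)^2 has a unique critical point at (1, 1)
# and is strictly concave there (negative definite Hessian), as claimed in Example 3.
import sympy as sp

x, y = sp.symbols('x y', real=True)
theta = x*y + 3*(x + y) - (x + y)**2

crit = sp.solve([sp.diff(theta, x), sp.diff(theta, y)], [x, y], dict=True)
H = sp.hessian(theta, (x, y))

print(crit)            # [{x: 1, y: 1}]
print(H, H.det())      # Matrix([[-2, -1], [-1, -2]]), det = 3 > 0 with theta_xx = -2 < 0
```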


Introduction to the Theory of Optimization in Euclidean Space

Solved Problems

Part 1. – A constrained optimization problem. [29] i) Solve the following constrained minimization problem min x21 + x22 + . . . + x2n

subject to

x 1 + x2 + . . . + x n = c

where c ∈ R. ii) Use part (i) to show that if x1 , x2 , . . . , xn are given numbers, then n

i=n 

x2i 

i=n 

i=1

xi

2

.

i=1

When does the equality hold ? Solution: i) Denote by f and g the C ∞ functions in Rn : f (x1 , . . . , xn ) = x21 + x22 + . . . + x2n

g(x1 , . . . , xn ) = x1 + x2 + . . . + xn .

Consider the Lagrangian L(x1 , . . . , xn , λ) = f (x1 , . . . , xn ) − λ(g(x1 , . . . , xn ) − c) = x21 + x22 + . . . + x2n − λ(x1 + x2 + . . . + xn − c). Note that any point of the hyperplane g = c is a regular point since we have g  (x1 , . . . , xn ) = 1, . . . , 1

=⇒

rank(g  (x1 , . . . , xn )) = 1.

The stationary points of the Lagrangian are solutions of the system

∇L(x1 , . . . , xn , λ) = 0, . . . , 0, 0 ⇐⇒

⎧ ⎪ ⎪ ⎪ ⎪ ⎪ ⎪ ⎪ ⎪ ⎪ ⎪ ⎨

Lx1 = 2x1 − λ = 0 .. .

Lxi = 2xi − λ = 0 .. ⎪ . ⎪ ⎪ ⎪ ⎪ ⎪ Lxn = 2xn − λ = 0 ⎪ ⎪ ⎪ ⎪ ⎩ Lλ = −(x1 + x2 + . . . + xn − c) = 0.


We deduce, from the n first equations, that

λ = 2x1 = . . . = 2xi = . . . = 2xn   =⇒   xi = λ/2,   i = 1, . . . , n,

which, inserted into the last equation, gives n(λ/2) = c. Hence the unique solution to the system is

λ = 2c/n,   xi = c/n,   i = 1, . . . , n.

Now, let us study the convexity of L in (x1, . . . , xn) when λ = 2c/n. The corresponding Hessian matrix is the n × n diagonal matrix diag(2, . . . , 2). The leading principal minors are

D1 = 2 > 0,   D2 = 2² > 0,   . . . ,   Di = 2ⁱ > 0,   . . . ,   Dn = 2ⁿ > 0.

Hence, L is strictly convex in (x1, . . . , xn), and we conclude that the point

(c/n, . . . , c/n)

is the solution to the constrained minimization problem.

ii) Let x1, x2, . . . , xn be given numbers. Denote by c their sum. From part i), we have

f (c/n, . . . , c/n) ≤ f (t1, . . . , tn)   ∀(t1, . . . , tn) ∈ [t1 + . . . + tn = c].

In particular, for the given xi, we can write

(c/n)² + . . . + (c/n)² ≤ x1² + x2² + . . . + xn²
⇐⇒   n (c²/n²) = c²/n ≤ x1² + x2² + . . . + xn²
⇐⇒   c² = (x1 + x2 + . . . + xn)² ≤ n (x1² + x2² + . . . + xn²).

The equality holds only at the minimum point, whose coordinates are all equal to (x1 + x2 + . . . + xn)/n.
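For a concrete illustration (a small check added here, not part of the original solution), take n = 3 and c = 6; a symbolic solver confirms the minimizer (c/n, c/n, c/n) = (2, 2, 2) and the minimal value c²/n = 12:

Minimize[{x1^2 + x2^2 + x3^2, x1 + x2 + x3 == 6}, {x1, x2, x3}]
(* {12, {x1 -> 2, x2 -> 2, x3 -> 2}} *)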


Part 2. – Method of least squares. [1] Consider n points (x1, y1), . . . , (xn, yn) such that x1, . . . , xn are not all equal. Find the slope m and the y-intercept b of the line y = mx + b that minimize the quantity

D(m, b) = Σ_{i=1}^{n} (mxi + b − yi)² = (mx1 + b − y1)² + . . . + (mxn + b − yn)²

which represents the sum of the squares of the vertical distances di = [yi − (mxi + b)] from these points to the line. This line is called the regression line or the least squares' line of best fit. (Hint: find the candidate point and check its global optimality by using Part 1 (ii).)

Solution: Consider the following unconstrained minimization problem:

min over (m, b) of D(m, b) = [y1 − (mx1 + b)]² + . . . + [yn − (mxn + b)]².

Since D is C¹ (it is a polynomial), the local extreme points are stationary points of D, i.e., solutions of ∇D(m, b) = (0, 0):

∂D/∂m = −2[y1 − (mx1 + b)]x1 − . . . − 2[yn − (mxn + b)]xn = 0
∂D/∂b = −2[y1 − (mx1 + b)] − . . . − 2[yn − (mxn + b)] = 0

⇐⇒   Σ xi yi = m [Σ xi²] + b [Σ xi]   and   Σ yi = m [Σ xi] + b [n].

The determinant of this 2 × 2 linear system is

| Σ xi²   Σ xi ; Σ xi   n | = n Σ xi² − (Σ xi)² ≠ 0


since x1, . . . , xn are not all equal (see Part 1 (ii)). Therefore, there exists a unique solution to the system. It remains to show that it is the minimum point. For this, we study the convexity of D, whose Hessian matrix is given by

HD(m, b) = [ 2 Σ xi²   2 Σ xi ; 2 Σ xi   2n ].

The leading principal minors are

D1(m, b) = 2 Σ xi² > 0,
D2(m, b) = 4 [ n Σ xi² − (Σ xi)² ] > 0.

So D is convex and the unique critical point (m∗, b∗) is the global minimum. The regression line equation is y = m∗x + b∗ with, by Cramer's rule,

m∗ = [ n Σ xi yi − (Σ xi)(Σ yi) ] / [ n Σ xi² − (Σ xi)² ],
b∗ = [ (Σ xi²)(Σ yi) − (Σ xi)(Σ xi yi) ] / [ n Σ xi² − (Σ xi)² ].

Part 3. – Students' scores. In a math course, Table 3.3 lists the scores xi of 14 students on the midterm exam and their scores yi on the final exam.
i) Plot the data. Do the data appear to lie along a straight line?
ii) Find the least squares' line of best fit of y as a function of x.
iii) Plot the points and the regression line on the same graph.
iv) Use your answer from ii) to predict the final exam score of a student whose midterm score was 41 and who dropped the course.

xi   100   95   81   71   83   48   92   100   85   63   78   58   73   60
yi    95   88   53   58   80   31   91    78   85   52   78   74   60   60

TABLE 3.3: Students' scores

Solution: i) The plot, in Figure 3.28, shows that the points are close to a line. The plot is obtained using the Mathematica coding below:

fp = {{100, 95}, {95, 88}, {81, 53}, {71, 58}, {83, 80}, {48, 31}, {92, 91}, {100, 78}, {85, 85}, {63, 52}, {78, 78}, {58, 74}, {73, 60}, {60, 60}};
gp = ListPlot[fp]


FIGURE 3.28: The data shows an alignment

ii) Using the results from Part 2, we have (sums over i = 1, . . . , 14)

Σ xi = 1087,   Σ xi² = 87855,   Σ yi = 983,   Σ xi yi = 79428,

so that

m∗ = 1499/1669 ≈ 0.8981426,   b∗ = 801/1669 ≈ 0.4799281,

and the regression line will be

y = 0.8981426 x + 0.4799281.

iii) To check the equation of the line of best fit, we use the instruction

line = Fit[fp, {1, x}, x]
0.479928 + 0.898143 x
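Equivalently (a small check added here, not in the original solution), the closed-form expressions from Part 2 can be evaluated directly on the data set fp defined above:

xs = fp[[All, 1]]; ys = fp[[All, 2]]; n = Length[xs];
den = n Total[xs^2] - Total[xs]^2;
{(n Total[xs ys] - Total[xs] Total[ys])/den, (Total[xs^2] Total[ys] - Total[xs] Total[xs ys])/den} // N
(* {0.898143, 0.479928}, matching the slope and intercept given by Fit *)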


To sketch the line (see Figure 3.29) with the data, we add the following Mathematica coding:

gl = Plot[line, {x, 25, 110}]; Show[gl, gp]


FIGURE 3.29: Data and line y = 0.479928 + 0.898143 x

iv) The student who dropped the course would have obtained at the final exam the approximate mark of y(41) ≈ 0.8981426(41) + 0.4799281 ≈ 37.3. The student would have failed if he didn't improve his understanding of the material studied. However, this is only a relative prediction that doesn't take into account other factors involving the learning experience of the student.

Part 4. – University tuition. [12] The following, in Table 3.4, are the tuition fees that were charged at Vanderbilt University from 1982 to 1991.
i) Plot the data.
ii) To fit these data with a model of the form y = β0 e^(β1 x), find the least squares' line of best fit of ln y as a function of x. Deduce approximate values of β0 and β1.
iii) Sketch the curve in ii) with the data plot in i).
iv) Suppose the exponential model is accurate for a period of time. In which year would the tuition attain a rate of $40000?

Solution: i) The data, of points (x, y), appear to lie along a straight line. The plot, shown in Figure 3.30, is obtained using the Mathematica coding below:


year                        1982  1983  1984  1985  1986  1987  1988  1989    1990    1991
year after 1981, x             1     2     3     4     5     6     7     8       9      10
tuition (in thousands $), y  6.1   6.8   7.5   8.5   9.3  10.5  11.5  12.625  13.975  14.975

TABLE 3.4: University tuition

fp1 = {{1, 6.1}, {2, 6.8}, {3, 7.5}, {4, 8.5}, {5, 9.3}, {6, 10.5}, {7, 11.5}, {8, 12.625}, {9, 13.975}, {10, 14.975}};
gp1 = ListPlot[fp1]


FIGURE 3.30: The data (xi , yi ) lie along a straight line

The plot of the data, of points (xi, ln yi), appears also to lie along a straight line (see Figure 3.31).

fp2 = {{1, Log[6.1]}, {2, Log[6.8]}, {3, Log[7.5]}, {4, Log[8.5]}, {5, Log[9.3]}, {6, Log[10.5]}, {7, Log[11.5]}, {8, Log[12.625]}, {9, Log[13.975]}, {10, Log[14.975]}};
gp2 = ListPlot[fp2]



FIGURE 3.31: The data (xi , ln yi ) are positioned along a straight line

ii) Using the results from Part 2, the least squares' line of best fit is given by ln(y) = ln(β0) + β1 x, where b∗ = ln(β0) and m∗ = β1 are the solution of the linear system

q = A m∗ + B b∗,   p = B m∗ + 10 b∗,

where (sums over i = 1, . . . , 10)

B = Σ xi = 55,   A = Σ xi² = 385,   p = Σ ln(yi) ≈ 22.7832,   q = Σ xi ln(yi) ≈ 133.687.

Hence

m∗ = (10q − Bp)/(10A − B²) ≈ 0.10156,   b∗ = (Ap − Bq)/(10A − B²) ≈ 1.71975,

and the regression line will be

ln y = 0.10156 x + 1.71975.

Thus

β0 = e^{b∗} ≈ 5.583112,   β1 = m∗ ≈ 0.10156.

iii) Using Mathematica, we find the equation of the line of best fit

line = Fit[fp2, {1, x}, x]
1.71975 + 0.10156 x

We sketch the line with the data (xi, ln(yi)), in Figure 3.32, using the coding:

gl = Plot[line, {x, 1/2, 11}]; Show[gl, gp2]


FIGURE 3.32: Data (xi , ln(yi )) and line y = 1.71975 + 0.10156 x

Finally, we sketch, in Figure 3.33, the curve y = f (x) = β0 e^{β1 x} with the original data (xi, yi):

curve = Plot[5.583112 Exp[0.10156 x], {x, 1/2, 11}]; Show[curve, gp1]


FIGURE 3.33: Data (xi , yi ) and curve model f (x) = 5.583112e0.10156x

iv) Using the formula for prediction, we need to solve the equation

1000 f (x) = 40000   ⇐⇒   x = (1/0.10156) ln(40/5.583112) ≈ 19.39.

Thus the tuition fees reach the rate of $40000 about 19.4 years after 1981, that is, in the course of the year 2000-2001.
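As a quick numerical cross-check of this prediction (added here as an illustration; it is not part of the original solution), the value of x can be computed directly:

N[Log[40/5.583112]/0.10156]
(* 19.3889, i.e. between the 19th and 20th year after 1981 *)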

Chapter 4
Constrained Optimization-Inequality Constraints

In this chapter, we are interested in optimizing functions f : Ω ⊂ Rn −→ R over subsets described by inequalities

g(x) = (g1(x), g2(x), . . . , gm(x)) ≤ b in Rm   ⇐⇒   g1(x) ≤ b1, . . . , gm(x) ≤ bm,   x ∈ Rn.

Denote the set of the constraints

S = [g(x) ≤ b] = [g1(x) ≤ b1] ∩ [g2(x) ≤ b2] ∩ . . . ∩ [gm(x) ≤ bm].

Example.
∗ S = [g1(x, y) = x² + y² ≤ 1] ∩ [g2(x, y) = x − y ≤ 0] is the plane region inside the unit disk and above the line y = x. Here (n = 2, m = 2).
∗∗ S = [g1(x, y, z) = 9 − (x² + y² + z²) ≤ 0] = [x² + y² + z² ≥ 9] is the domain outside the sphere centered at the origin with radius 3. Here (n = 3, m = 1).
∗∗∗ S = [g(x, y) = x² ≤ 0] = {(0, y) : y ∈ R} is the y-axis. Here (n = 2, m = 1).

Note that sets defined by inequalities contain interior points and boundary points. So, for comparing the values of a function f taken around an extreme point x∗, it will be suitable to consider curves x(t) passing through x∗ and included in the constraint set [g ≤ b]. We will consider, this time, curves t −→ x(t) such that the set {x(t) : t ∈ [0, a], x(0) = x∗}, for some a > 0, is included in [g ≤ b]. Then, if x∗ is a local maximum of f, we have

f (x(t)) ≤ f (x∗)   ∀t ∈ [0, a].


Thus, 0 is a local maximum point for the function t −→ f (x(t)). Hence

d/dt [f (x(t))] at t = 0  =  f ′(x(t)).x′(t) at t = 0  ≤ 0   =⇒   f ′(x∗).x′(0) ≤ 0.

x′(0) is a tangent vector to the curve x(t) at the point x(0) = x∗. This inequality mustn't depend on a particular curve x(t). So, we should have

f ′(x∗).x′(0) ≤ 0   for any curve x(t) such that g(x(t)) ≤ b.

In this chapter, we will first characterize, in Section 4.1, the set of tangent vectors to such curves, then establish, in Section 4.2, the equations satisfied by a local extreme point x∗. In Section 4.3, we identify the candidate points for optimality, and in Section 4.4, we explore the global optimality of a constrained local candidate point. Finally, we establish, in Section 4.5, the dependence of the optimal value of the objective function on certain parameters involved in the problem.

4.1  Cone of Feasible Directions

Let x∗ ∈ S = [g(x) ≤ b].

Definition 4.1.1 The set defined by

T = { x′(0) :  t −→ x(t) ∈ S,  x ∈ C¹[0, a],  a > 0,  x(0) = x∗ }

of all tangent vectors at x∗ to differentiable curves included in S, is called the cone of feasible directions at x∗ to the set [g ≤ b].

We have the following characterization of the cone T at an interior point x∗ of S.


Remark 4.1.1 We have

g continuous on Ω  and  x∗ ∈ [g(x) < b]   =⇒   T = Rn.

That is, when x∗ is an interior point of S, the cone at x∗ coincides with the whole space. Indeed, we have T ⊂ Rn. Let us prove that Rn ⊂ T. Let y ∈ Rn.

∗ If y = 0, then the constant curve x(t) = x∗ with t ∈ [0, 1] satisfies: x ∈ C¹[0, 1], x(0) = x∗, x′(t) = 0, x′(0) = 0 = y, x(t) = x∗ ∈ S ∀t ∈ [0, 1]. So y = 0 ∈ T.

∗∗ Suppose y ≠ 0. We have x∗ ∈ [g1(x) < b1] ∩ . . . ∩ [gm(x) < bm], which is an open subset of Rn. So there exists δ > 0 such that

Bδ(x∗) ⊂ [g1(x) < b1] ∩ . . . ∩ [gm(x) < bm].

Now x(t) = x∗ + ty ∈ Bδ(x∗) for all t ∈ [−δ/(2|y|), δ/(2|y|)], since

|x(t) − x∗| = |t||y| ≤ (δ/(2|y|)) |y| = δ/2 < δ.

We deduce that y ∈ T since the curve satisfies: x ∈ C¹[0, δ/(2|y|)], x(0) = x∗, x′(t) = y, x′(0) = y, and x(t) = x∗ + ty ∈ S for all t ∈ [0, δ/(2|y|)].

Example 1. Find and sketch the cone of feasible directions at the point (−1/2, 1/2) belonging to the set

S = {(x, y) ∈ R² : g1(x, y) = x² + y² − 1 ≤ 0  and  g2(x, y) = x − y ≤ 0}.

Solution: The set S is the part of the unit disk located above the line y = x. The point (−1/2, 1/2) is an interior point of S; see Figure 4.1. Thus T = R².
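A one-line check (added as an illustration; it is not part of the original text) confirms that both constraints are slack at (−1/2, 1/2), so the point is indeed interior:

{x^2 + y^2 - 1, x - y} /. {x -> -1/2, y -> 1/2}
(* {-1/2, -1} : both values are strictly negative *)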


FIGURE 4.1: S and the cone at (−1/2, 1/2)

We know a representation of the cone T when x∗ is a regular point of S.

Definition 4.1.2 A point x∗ ∈ S = [g ≤ b] is said to be a regular point of the constraints if the gradient vectors ∇gi(x∗), i ∈ I(x∗), are linearly independent, where

I(x∗) = { i ∈ {1, . . . , m} : gi(x∗) = bi }.

Theorem 4.1.1 At a regular point x∗ ∈ S = [g ≤ b], where g is C¹ in a neighborhood of x∗, the cone of feasible directions T is equal to the convex cone

C = { y ∈ Rn : g′i(x∗)y ≤ 0,  i ∈ I(x∗) }.

Before giving the proof, we give some remarks and identify some cones.

Remark 4.1.2 The cone of feasible directions at a point x∗ ∈ S with vertex x∗ is the translation of C by the vector x∗ given by

C(x∗) = x∗ + C = x∗ + { h ∈ Rn : g′i(x∗).h ≤ 0,  i ∈ I(x∗) }
      = { x∗ + h ∈ Rn : g′i(x∗).h ≤ 0,  i ∈ I(x∗) }
      = { x ∈ Rn : g′i(x∗).(x − x∗) ≤ 0,  i ∈ I(x∗) }.

C(x∗) is the cone of feasible directions to the constraint set [g(x) ≤ b] passing through x∗.

Example 2. Find and sketch the cone of feasible directions C(x, y) with vertex (x, y) = (−1/2, −1/2), (0, 1) and (1/√2, 1/√2). The points belong to the set

S = {(x, y) ∈ R² : g1(x, y) = x² + y² − 1 ≤ 0  and  g2(x, y) = x − y ≤ 0}.

Solution: Note that the three points belong to ∂S; see Figure 4.2.


FIGURE 4.2: Location of the points on S and C(−1/2, −1/2)

To determine the cone of feasible directions at each point (see Figures 4.2 and 4.3), we need to discuss the regularity of each point. First, we will need:

g′(x, y) = ( ∂g1/∂x  ∂g1/∂y ; ∂g2/∂x  ∂g2/∂y ) = ( 2x  2y ; 1  −1 ).

∗ At (−1/2, −1/2), only the equality constraint g2 = 0 is satisfied, and the point is regular. We have

g′2(x, y) = ( 1  −1 )   and   rank(g′2(−1/2, −1/2)) = 1,

C(−1/2, −1/2) = { (x, y) ∈ R² : ( 1  −1 ).( x + 1/2 , y + 1/2 ) ≤ 0 }
             = { (x, y) ∈ R² : x − y ≤ 0 }.

∗∗ At (0, 1), only the equality constraint g1 = 0 is satisfied, and the point is regular. We have

g′1(0, 1) = ( 0  2 )   and   rank(g′1(0, 1)) = 1,

C(0, 1) = { (x, y) ∈ R² : ( 0  2 ).( x − 0 , y − 1 ) ≤ 0 }
       = { (x, y) ∈ R² : y ≤ 1 }.

FIGURE 4.3: C(0, 1) = [y ≤ 1] and C(1/√2, 1/√2) = [x + y ≤ √2] ∩ [x ≤ y]

∗∗∗ At (1/√2, 1/√2), the two equality constraints g1 = g2 = 0 are satisfied, and the point is regular. We have

g′(1/√2, 1/√2) = ( √2  √2 ; 1  −1 )   and   rank(g′(1/√2, 1/√2)) = 2,

C(1/√2, 1/√2) = { (x, y) ∈ R² : ( √2  √2 ; 1  −1 ).( x − 1/√2 , y − 1/√2 ) ≤ 0 }
              = { (x, y) ∈ R² : x + y − √2 ≤ 0  and  x − y ≤ 0 }.


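The regularity checks in Example 2 are easy to reproduce numerically. The short Mathematica sketch below is added as an illustration (it is not part of the original text); it computes the Jacobian of g = (g1, g2) and evaluates it at the vertex where both constraints are active:

g = {x^2 + y^2 - 1, x - y};
J = D[g, {{x, y}}]                       (* {{2 x, 2 y}, {1, -1}} *)
J /. {x -> 1/Sqrt[2], y -> 1/Sqrt[2]}    (* {{Sqrt[2], Sqrt[2]}, {1, -1}} *)
MatrixRank[%]                            (* 2, so (1/Sqrt[2], 1/Sqrt[2]) is regular *)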

Remark 4.1.3 The conclusion of the theorem is also true when the point x∗ satisfies any one of the following regularity conditions [5]:
i) Each constraint gj(x) is affine for j ∈ I(x∗).
ii) There exists x̄ such that, for all j ∈ I(x∗), gj(x̄) ≤ bj, and gj(x̄) < bj if gj is not affine.

Example 3. Suppose that all the constraints are affine and that the set S is described by

S = { x ∈ Rn : Σ_{j=1}^{n} aij xj ≤ bi,  i = 1, . . . , m } = { x ∈ Rn : Ax ≤ b }

where A = (aij) is an m × n matrix and b ∈ Rm. Here g(x) = Ax, g′(x) = A and g′i(x) = ( ai1  ai2  . . .  ain ). Thus, from the previous remark, any point of S is a regular point, and the cone of feasible directions at a point x∗ ∈ S with vertex x∗ is given by the polyhedron

C(x∗) = { x ∈ Rn : g′i(x∗).(x − x∗) = Σ_{j=1}^{n} aij (xj − x∗j) ≤ 0,  i ∈ I(x∗) }
      = { x ∈ Rn : Σ_{j=1}^{n} aij xj ≤ Σ_{j=1}^{n} aij x∗j = bi,  i ∈ I(x∗) }.
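As a small concrete instance of this construction (the data below are made up for illustration; they are not from the text), take A = {{1, 1}, {-1, 0}}, b = (2, 0) and the boundary point x∗ = (2, 0). Only the first constraint is active there, so the cone with vertex x∗ is the half-plane x + y ≤ 2:

A = {{1, 1}, {-1, 0}}; b = {2, 0}; xstar = {2, 0};
Select[Range[Length[b]], A[[#]].xstar == b[[#]] &]
(* {1} : only the first constraint is binding at xstar *)
(* hence C(xstar) = {x : A[[1]].(x - xstar) <= 0} = {(x, y) : x + y <= 2} *)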

Example 4. Suppose f is a C¹ function and

S = [z ≤ f (x)] = { (x, z) ∈ Ω × R : z ≤ f (x) },   Ω ⊂ Rn.

Let x∗ be a relative interior point of the surface z = f (x). Find the cone at x∗.

Solution: If we set g(x, z) = z − f (x), then the set S can be described by

S = [g(x, z) ≤ 0] = { (x, z) ∈ Ω × R : g(x, z) ≤ 0 }


and the point (x∗, f (x∗)) is a regular point since we have

g′(x∗, f (x∗)) = ( −f ′(x∗)  1 ) ≠ 0,   rank(g′(x∗, f (x∗))) = 1.

The cone of feasible directions at the point (x∗, f (x∗)) with vertex (x∗, f (x∗)) is given by

C(x∗, f (x∗)) = { (x, z) ∈ Rn × R : g′(x∗, f (x∗)).( x − x∗ , z − f (x∗) ) ≤ 0 }.

We have

g′(x∗, f (x∗)).( x − x∗ , z − f (x∗) ) = ( −f ′(x∗)  1 ).( x − x∗ , z − f (x∗) ) = −f ′(x∗).(x − x∗) + z − f (x∗) ≤ 0
⇐⇒  z ≤ f (x∗) + f ′(x∗).(x − x∗).

Hence

C(x∗, f (x∗)) = { (x, z) ∈ Rn × R : z ≤ f (x∗) + f ′(x∗).(x − x∗) }.

Remark 4.1.4 Note that the representation of the cone of feasible directions obtained in the theorem used the fact that the point was regular. When, this hypothesis is omitted the representation is not necessary valid. Indeed, if we consider the set S defined by g(x, y)  0

with

g(x, y) = x2 ,

then S is reduced to the y axis. No point of S is regular since we have     and g  (0, y) = 0 0 on the y-axis. g  (x, y) = 2x 0 We deduce that at each point (0, y0 ), we have     x−0  0 C(0, y0 ) = (x, y) : g (0, y0 ). y − y0       x−0 0 0 . = 0 = R2 . = (x, y) : y − y0

Constrained Optimization-Inequality Constraints

211

However, the line y(t) = y0 + t

x(t) = 0

remains included in  S, passes  through the point (0, y0 ) at t = 0, and has the   0 x (0) = . Hence, the cone of feasible directions at each direction 1 y  (0) point of S is equal to S. Note that, it also coincides with the tangent plane at each point, since g(x, y) = x2  0

g(x, y) = x2 = 0.

⇐⇒

Proof. We have: T ⊂ C : Indeed, let y ∈ T , y = 0, then ∃ x(t) differentiable such that g(x(t))  b x (0) = y. x(0) = x∗ ,

∀t ∈ [0, a] for some a > 0,

So 0 is a minimum for the function φi (t) = gi (x(t)) − bi , (i ∈ I(x∗ )), over the interval [0, a] since we have φi (t) = gi (x(t)) − bi  0 = φi (0) φi (0) = gi (x∗ ) − bi = 0

because i ∈ I(x∗ ).

Since gi and x(.) are C 1 , then φi is C 1 and Taylor’s formula gives

φi (t) − φi (0) = φi (0)t + tα(t) = t φi (0) + α(t)

with

lim α(t) = 0.

t→0+

If φi (0) > 0 then there exists a0 ∈ (0, a) such that α(t)


φi (0) 2

∀t ∈ (0, a0 ).

We deduce that φ (0) φ (0)

=t i >0 φi (t) − φi (0) > t φi (0) − i 2 2

∀t ∈ (0, a0 )

which contradicts that 0 is a maximum for φi on [0, a]. So y ∈ C since we have φi (0) =

  d (gi (x(t))) = ∇gi (x(t)).x (t) = gi (x∗ ).y  0. dt t=0 t=0

212

Introduction to the Theory of Optimization in Euclidean Space

C ⊂ T : Let y ∈ C \ {0} . We distinguish between two situations: First

Suppose that

case

gi (x∗ ).y < 0

∀i ∈ I(x∗ ).

Since x∗ ∈ [gj (x) < bj ] for j ∈ I(x∗ ) and g continuous, there exists δ > 0 such that [gj (x) < bj ]. Bδ (x∗ ) ⊂ j ∈I(x∗ )

Consider the curve x(t) = x∗ + ty

t>0

x(0) = x∗

where

x (0) = y.

and

We claim that δ

∃δ0 ∈ 0, min(δ, ) |y|

such that

x(t) ∈ S = [g(x)  b]

∀t ∈ [0, δ0 ].

Indeed, for j ∈ I(x∗ ), we have gj (x(t)) = gj (x∗ + ty) = gj (x∗ ) + tgj (x∗ ).y + tεj (t) Since gj (x∗ ) = bj and gj (x∗ ).y δ δ0j ∈ (0, min(δ, )) such that |y|

with

lim εj (t) = 0.

t→0

< 0, we deduce the existence of

1 |εj (t)| < − gj (x∗ ).y. 2 Consequently, for δ0 = min∗ δ0j , we have j∈I(x )

∀j ∈ I(x∗ ),

t t gj (x(t)) < bj + tgj (x∗ ).y − gj (x∗ ).y = bj + gj (x∗ ).y < bj 2 2 Second case

∀t ∈ (0, δ0 ).

Suppose that

gi (x∗ ).y = 0

∀i ∈ {i1 , i2 , . . . , ip } ⊂ I(x∗ )

and

gi (x∗ ).y < 0

∀i ∈ I(x∗ ) \ {i1 , i2 , . . . , ip }

p < n.

Constrained Optimization-Inequality Constraints

213

Consider the system of equations

F (t, u) = G x∗ + ty +t G (x∗ )u − B = 0 where, for t fixed, u ∈ Rp is the unknown, and where G = (gi1 , gi2 , . . . , gip ),

B = (bi1 , bi2 , . . . , bip ),

rank(G (x∗ )) = p.

Note that F is well defined on an open subset of R × Rp . Indeed, if g is C 1 on Bδ (x∗ ) ⊂ {x ∈ Rn : gj (x) < bj , j ∈ I(x∗ )},

δ δ , , we have then ∀(t, u) ∈ (−δ0 , δ0 ) × Bδ0 (0) with δ0 = min  ∗ 2 y 2 G (x ) (x∗ + ty +t G (x∗ )u) − x∗  |t| y + u G (x∗ )
0 with B (0) × Bη (0) ⊆ A, and such that det(∇u F (t, u)) = 0 ∀t ∈ B (0),

B (0) × Bη (0)

in

∃!u ∈ Bη (0) :

u : (−, ) −→ Bη (0);

F (t, u) = 0 is a C 1 function.

t −→ u(t)

Thus, the curve x(t) = X(t, u(t)) = x∗ + ty +t G (x∗ )u(t) is, by construction, a curve in S since we have for each t ∈ (−, ) G(x(t)) − B = 0

⇐⇒

gj (x(t)) − bj = 0

x(t) ∈ Bδ (x∗ ) ⊂ {x ∈ Rn : gj (x) < bj , ⇐⇒

gj (x(t)) − bj < 0

∀j ∈ {i1 , i2 , . . . , ip } ⊂ I(x∗ )

j ∈ I(x∗ )}

∀j ∈ I(x∗ ).

By differentiating both sides of F (t, u(t)) = G(x(t)) − B = G(X(t, u(t))) − B = 0 with respect to t, we get n

0=

 ∂G ∂Xj d G(x(t)) = dt ∂Xj ∂t j=1

0=

 d G(x(t)) dt t=0

m  ∂Gi

m

 ∂Gi ∂ul ∂Xj l = yj + (x∗ ) ∂xj ∂t ∂xj ∂t l=1 l=1 # n m    ∂G ∂Gl ∗ ∂ul  = (X(t, u)) yj + (x ) ∂xj ∂xj ∂t j=1

Xj (t, u) = x∗j + tyj +

l

(x∗ )ul

l=1

t=0

= G (x∗ )y + G (x∗ )t G (x∗ )u (0). Since we have G (x∗ )y = 0 and that G (x∗ )t G (x∗ ) is nonsingular and definite positive, we conclude that G (x∗ )t G (x∗ )u (0) = G (x∗ )y = 0

=⇒

u (0) = 0.

Constrained Optimization-Inequality Constraints Hence

215

x (0) = y +t G (x∗ )u (0) = y.

Now, for j ∈ I(x∗ ) \ {i1 , i2 , . . . , ip }, we have gj (x(t)) = gj (x(0)) + tgj (x∗ ).x (0) + tη(t) = bj + tgj (x∗ ).y + tη(t) with

lim η(t) = 0. Then, from the first case, there exists 0 ∈ (0, ) such that

t→0

gj (x(t)) < bj

∀t ∈ (0, 0 )

thus x(t) ∈ [gj (x)  bj ]

for all

j ∈ I(x∗ ) \ {i1 , i2 , . . . , ip }.

Finally, y is a tangent vector to the curve x(t) included in S for t ∈ [0, 0 /2], so y ∈ T . *C is a cone of Rn since for y ∈ C and κ ∈ R+ , we have gi (x∗ )(κy) = κgi (x∗ )y  0 for i ∈ I(x∗ ). Thus κy ∈ C. *C is a convex of Rn since for y, y  ∈ C and s ∈ [0, 1], we have gi (x∗ )(sy + (1 − s)y  ) = sgi (x∗ )y + (1 − s)gi (x∗ )y   s.0 + (1 − s).0 = 0 for

i ∈ I(x∗ ). Thus sy + (1 − s)y  ∈ C.

216

Introduction to the Theory of Optimization in Euclidean Space

Solved Problems

1. – Find and draw the cone of feasible directions at the point (0, 3, 0) belonging to the set x2 + y 2 + z 2  9. Solution: Set g(x, y, z) = 9 − (x2 + y 2 + z 2 ). We have 5 5

y

y

0

0

5

5 5 5

Cone y3

z

z

0

5

x0

5

0

x2  y2  z2  9

5

4 2

x

0 x 5

0

FIGURE 4.4: The set [g  3] ∩ [x  0]

g  (x, y, z) = −2xi − 2yj − 2zk,

and

C(0, 3, 0) = [y  3]

g  (0, 3, 0) = −6j = 0,

rank(g  (0, 3, 0)) = 1.

So (0, 3, 0) is a regular point and the cone of feasible directions to [g  0], with vertex at this point (see Figure 4.4), is given by ⎡ ⎤ x−0 C(0, 3, 0) = {(x, y, z) ∈ R3 : g  (0, 3, 0). ⎣ y − 3 ⎦  0}. z−0 We have 

0 −6 0





⎤ x−0 .⎣ y − 3 ⎦  0 z−0

⇐⇒

0(x − 0) − 6(y − 3) + 0(z − 1)  0

Constrained Optimization-Inequality Constraints ⇐⇒

y3:

217

C(0, 3, 0) = [y  3].

2. – Find the cone of feasible directions at the point (0, 1, 0) to the set g(x, y, z) = (g1 (x, y, z), g2 (x, y, z))  (1, 1) g2 (x, y, z) = x2 + y 2 + z 2 .

g1 (x, y, z) = x + y + z,

Solution: The set S = [g  (1, 1)], as illustrated in Figure 4.5, is the part of the unit ball located below the plane x + y + z  1. 2 y

2

1

y

0

1

0

1

Cone

1

2 2

x  y  z  1  1 y1  0

2 2 P

P

1

z

1

z

0

1

0

1

x2  y2  z2  1  0 and

2

2

x yz1  0

2

2 1

1 0

0

x

x

1

1

2

2

FIGURE 4.5: [g  (1, 1)]

and

C(0, 1, 0)

The point (0, 1, 0) ∈ S satisfies the two constraints g1 (x, y, z) = g2 (x, y, z) = 1 and is a regular point since we have: ⎡ ⎢ g (x, y, z) = ⎣ 



g (0, 1, 0) =



∂g1 ∂x

∂g1 ∂y

∂g1 ∂z

∂g2 ∂x

∂g2 ∂y

∂g2 ∂z

1 0

1 2

1 0



⎤ ⎥ ⎦=



1 2x

1 2y

has rank 2.

1 2z



218

Introduction to the Theory of Optimization in Euclidean Space

The cone of feasible directions to the set S at the point (0, 1, 0), with vertex this point, is the set of points (x, y, z) such that ⎡ ⎤ ⎡ ⎤     x−0 x 1 1 1 0 g  (0, 1, 0). ⎣ y − 1 ⎦ = .⎣ y − 1 ⎦  0 2 0 0 z−0 z ⇐⇒

x+y−1+z 0

2(y − 1)  0.

and

Thus C(0, 1, 0) = {(x, y, z) ∈ R3 :

3. – Show that the sets  z  x2 + y 2

x+y+z 1

z

and

and

y  1}.

1 2 5 (x + y 2 ) + 10 2

have a common cone of feasible directions at the point (3, 4, 5). Solution: Set g1 (x, y, z) = z −

 x2 + y 2

g2 (x, y, z) = z −

1 2 5 (x + y 2 ) − . 10 2

We have g1 (3, 4, 5) = g2 (3, 4, 5) = 0 g1 = 

1 x2 + y 2



− xi − yj + k ,

y x g2 (x, y, z) = − i − j + k, 5 5 rank(g1 (3, 4, 5)) = 1

4 3 g1 (3, 4, 5) = − i − j + k = 0 5 5 4 3 g2 (3, 4, 5) = − i − j + k = 0 5 5 rank(g2 (3, 4, 5)) = 1.

So (3, 4, 5) is a regular point for the two constraints g1 (x, y, z) = 0 and g2 (x, y, z) = 0. Therefore, the cones of feasible directions at the point (3, 4, 5) for the sets [g1 (x, y, z)  0] and [g2 (x, y, z)  0], with vertex (3, 4, 5), are given respectively by: C1 (3, 4, 5) = {(x, y, z) ∈ R3 : g1 (3, 4, 5).t C2 (3, 4, 5) = {(x, y, z) ∈ R3 : g2 (3, 4, 5).t

 

x−3

y−4

z−5

x−3

y−4

z−5

 

 0}  0}.

Constrained Optimization-Inequality Constraints

219

Clearly, since g1 (3, 4, 5) = g2 (3, 4, 5), the two sets are equal and we have for i = 1, 2 ⎡ ⎤ ⎡ ⎤ x−3 x−3   − 35 − 45 1 . ⎣ y − 4 ⎦  0 gi (3, 4, 5). ⎣ y − 4 ⎦  0 ⇐⇒ z−5 z−5 3 4 − (x − 3) − (y − 4) + 1(z − 5)  0. 5 5

⇐⇒

Hence, the two given sets have a common cone of feasible directions at this point (see the illustrations in Figure 4.6) characterized by the inequality 4 3 − (x − 3) − (y − 4) + (z − 5)  0. 5 5 10 y

5

5

y

0

0

5 5

10 10

P 6

z

5

z

4

Cone

2 0 0

10 5

5 0 x

0

5

x 5

10 10 y

5

0 5 10 10

z

5

Cone

0 10 5 0 x

5 10

FIGURE 4.6: Sets

[g1  0], [g2  0]

and

C(3, 4, 5)

220

Introduction to the Theory of Optimization in Euclidean Space

4.2

Necessary Condition for Local Extreme Points/ Inequality Constraints

In what follows, we will be interested in the study of the maximization problem

max f (x1 , . . . , xn )

⎧ ⎪ ⎨ g1 (x1 , . . . , xn )  b1 .. .. . . ⎪ ⎩ gm (x1 , . . . , xn )  bm

subject to

The results established are strongly related to the fact that we are maximizing a function f under inequality constraint g(x)  b. To solve a minimization problem min f (x), we can maximize −f (x), and if a constraint is given in the form gj (x)  bj , we can transform it into −gj (x)  −bj . An equality constraint gj (x) = bj can be equivalently written as gj (x)  bj and −gj (x)  −bj .

We have the following preliminary lemma

Lemma 4.2.1 Let f and g = (g1 , . . . , gm ) be C 1 functions in a neighborhood of x∗ ∈ [g(x)  b]. If x∗ is a regular point and a local maximum point of f subject to these constraints, then we have

∀y ∈ Rn : gi (x∗ ).y  0, i ∈ I(x∗ ) =⇒ f  (x∗ ).y  0.

Proof. Let y ∈ Rn such that g  (x∗ ).y  0. Because, x∗ is a regular point of the set [g(x)  b], then y ∈ C(x∗ ), the cone of feasible directions at x∗ to the set [g(x)  b]. So ∃ a > 0, ∃ x ∈ C 1 [0.a] such that g(x(t))  c

∀t ∈ [0, a],

x(0) = x∗ ,

x (0) = y.

Now, since x∗ is a local maximum point of f on the set g(x)  b, then there exists δ ∈ (0, a) such that ∀t ∈ (0, δ), f (x(t))  f (x∗ ) = f (x(0))

⇐⇒

f (x(t)) − f (x(0)) 0 t−0

from which we deduce f  (x∗ ).y = f  (x(t)).x (t)

 t=0

=

 d f (x(t)) − f (x(0)) f (x(t))  0. = lim+ dt t−0 t=0 t→0

Constrained Optimization-Inequality Constraints

221

Remark 4.2.1 The lemma generalizes the necessary condition for a local maximum point x∗ in a convex S: f  (x∗ ).(x − x∗ )  0

∀x ∈ S.

Without assuming the set S = [g(x)  b] is convex, the local maximum point must satisfy an inequality on the convex cone C(x∗ ): f  (x∗ ).(x − x∗ )  0

∀x ∈ C(x∗ ).

As a consequence of the lemma, we have the following characterization of a constrained local maximum point.

Theorem 4.2.1 Let f and g = (g1 , . . . , gm ) be C 1 functions in a neighborhood of x∗ ∈ [g(x)  b]. If x∗ is a regular point and a local maximum point of f subject to these constraints, then ∃λ∗j  0,

j ∈ I(x∗ ) = {k ∈ N,

1km:

gk (x) = bk }

such that ∂f ∗ (x ) − ∂xi



λ∗j

j∈I(x∗ )

∂gj ∗ (x ) = 0, ∂xi

i = 1, · · · , n.

The proof uses an argument of linear algebra called “Farkas-Minkowski’s Lemma” [5] that says: Farkas-Minkowski’s Lemma. Let A be an p×n real matrix and c ∈ Rn . Then, the inclusion {x ∈ Rn :

Ax  0}



{x ∈ Rn :

c.x  0}

is satisfied if and only if ∃λ = λ1 , . . . , λp  ∈ Rp ,

Proof. Set

λ  0,

I(x∗ ) = {i1 , i2 , · · · , ip },

such that

c = t Aλ.

222

Introduction to the Theory of Optimization in Euclidean Space ⎡ A = −[gj (x∗ )]j∈I(x∗ )

⎢ ⎢ ⎢ = −⎢ ⎢ ⎢ ⎣

∂gi1 ∂x1

∂gi1 ∂x2

∂gi2 ∂x1

∂gi2 ∂x2

∂gip ∂x1

∂gip ∂x2

.. .

∂gi1 ∂xn

...

∂gi2 ∂xn

... .. .. . . ∂gip . . . ∂xn ⎡ ∂f ⎤

.. .

⎢ ⎢  ∂f  ⎢ t  ∗ t ∂f = −⎢ ,··· , c = − f (x ) = − ⎢ ∂x1 ∂xn ⎣

∂x1

.. .

⎤ ⎥ ⎥ ⎥ ⎥ ⎥ ⎥ ⎦

⎥ ⎥ ⎥ ⎥. ⎥ ⎦

∂f ∂xn

From Farkas-Minkowski’s Lemma, the inclusion {y = (y1 , · · · , yn ) ∈ Rn :



Ay  0} =

{y ∈ Rn :

gi (x∗ ).y  0}

i∈I(x∗ )



{y ∈ Rn :

f  (x∗ ).y  0} = {y ∈ Rn :

is satisfied, then ∃ λ∗ = (λ∗1 , . . . , λ∗p ) ∈ Rp , −t f  (x∗ ) = c = t Aλ∗ = −

p 

c.y = −f  (x∗ ).y  0}

λ∗  0

λ∗k t gik (x∗ )

⇐⇒

such that f  (x∗ ) =

k=1

p 

λ∗k gik (x∗ ).

k=1

So we are led to solve the system  ∂f ∂gj (x) − λj (x) = 0 ∂xi ∂x i ∗

i = 1, . . . , n,

λj  0

j ∈ I(x∗ )

j∈I(x )

gj (x) − bj = 0

∀j ∈ I(x∗ )

gj (x) − bj < 0

∀j ∈ I(x∗ ).

To find a practical way to solve the system, we introduce the complementary slackness conditions

λj  0,

with

λj = 0

if

gj (x) < bj ,

j = 1, . . . , m

Constrained Optimization-Inequality Constraints

223

When gj (x∗ ) = bj , we say that the constraint gj (x)  bj is active or binding at x∗ . When gj (x∗ ) < bj , we say that the constraint gj (x)  bj is inactive or slack at x∗ . We introduce the Lagrangian function L(x, λ) = f (x) − λ1 (g1 (x) − b1 ) − . . . − λm (gm (x) − bm ) where λ1 , · · · , λm are the generalized Lagrange multipliers. Then, we reformulate the previous theorem as follows:

Theorem 4.2.2 Let f and g = (g1 , . . . , gm ) be C 1 functions in a neighborhood of x∗ ∈ [g(x)  b]. If x∗ is a regular point and a local maximum point of f subject to these constraints, then ∃!λ∗ = (λ∗1 , . . . , λ∗m ) such that the following Karush-Kuhn-Tucker (KKT) conditions hold at (x∗ , λ∗ ): ⎧ m  ⎪ ∂f ∗ ∂gj ∗ ∂L ∗ ∗ ⎪ ⎪ (x , λ ) = (x ) − λ∗j (x ) = 0 i = 1, . . . , n ⎨ ∂x ∂x ∂x i i i j=1 ⎪ ⎪ ⎪ ⎩ ∗ λj  0, with λ∗j = 0 if gj (x∗ ) < bj , j = 1, . . . , m.

Remark 4.2.2 The numbers λ∗j , j ∈ I(x∗ ) are unique. Indeed, suppose there exist λ = λ1 , · · · , λp  and λ = λ1 , · · · , λp  solutions of c =t Aλ

and

c =t Aλ

then t A(λ − λ ) = 0, which we can write  (λj − λj )gj (x∗ ) = 0. j∈I(x∗ )

Since the vectors gj (x∗ ) are linearly independent, deduce that (λj − λj ) = 0

for each

j ∈ I(x∗ ).

224

Introduction to the Theory of Optimization in Euclidean Space

Remark 4.2.3 If I(x∗ ) = ∅ then the Karush-Kuhn-Tucker conditions reduce to ∇f (x∗ ) = 0 which is expected since then the point x∗ belongs to the interior of the set of the constraints. On the other hand, this shows that the Kuhn-Tucker conditions are not sufficient for optimality. In fact, when x∗ is an interior point, it could be a local maximum, a local minimum or a saddle point. First, let us practice writing the KKT conditions through simple examples. Example 1. Solve the problem max (x − 2)3

g(x) = −x  0.

subject to

Solution: Since f, g, are C 1 in R, consider the Lagrangian L(x, α) = (x − 2)3 − α(−x) and write the KKT conditions: ⎧ ∂L ⎪ ⎨ (1) = 3(x − 2)2 + α = 0 ∂x ⎪ ⎩ (2) α0 with α = 0 ∗

if

−x 1, solve the problem min f (x, y) = (x, y) − (a, b) 2

g(x, y) = x2 + y 2  1.

subject to

Solution: i) The problem describes the shortest distance of the point (a, b) to the unit disk (here (a, b) is located outside the unit disk). This distance is attained by the extreme value theorem since f is continuous on the constraint set [g  1], which is a closed and bounded subset of R2 . The case (a, b) = (2, 3) is illustrated in Figure 4.7 and a graphical solution is described in Figure 4.8 using level curves. x

4

0 2

y

2

4

0

1.0

z  x  22  y  32

0.5 y 0.0 0.5

20

1.0 20

z

15 z 10

10

1.0 0.5 0

0.0 x 0.5 1.0

FIGURE 4.7: Graph of z = (x − 2)2 + (y − 3)2 on x2 + y 2  1

ii) KKT conditions. f and g being C 1 , introduce, for the corresponding maximization problem, the Lagrangian L(x, y, λ) = −(x − a)2 − (y − b)2 − λ(x2 + y 2 − 1). The necessary conditions to satisfy are: ⎧ (i) Lx = −2(x − a) − 2λx = 0 ⇐⇒ x(1 + λ) = a ⎪ ⎪ ⎪ ⎪ ⎨ (ii) Ly = −2(y − b) − 2λy = 0 ⇐⇒ y(1 + λ) = b ⎪ ⎪ ⎪ ⎪ ⎩ (iii) λ  0 with λ = 0 if x2 + y 2 < 1 ∗ If x2 + y 2 < 1 then λ = 0, and then (i) and (ii) yield (x, y) = (a, b) which leads to a contradiction since a2 + b2 > 1.

Constrained Optimization-Inequality Constraints

227

∗∗ If x2 +y 2 = 1 then from (i) and (ii), we deduce that (x, y) = ( By substitution in x2 + y 2 = 1, we get a 2 b 2 + = 1, λ  0 ⇐⇒ 1+λ 1+λ Thus, the only solution of the system is the point

λ=



b a , ). 1+λ 1+λ

a2 + b2 − 1.

a b ,√ ) 2 2 +b a + b2 where the constraint is active. Finally, the point is regular since we have   g  (x, y) = 2x 2y , and rankg  (x∗ , y ∗ ) = 1. (x∗ , y ∗ ) = ( √

a2

Therefore, the point is candidate for optimality. Conclusion. Now, since it is guaranteed that the maximum value is attained, it must be at the candidate point found. Hence,  f (x, y) = f (x∗ , y ∗ ) = a2 + b2 − 1. max 2 2 x +y 1

y 6.8 30.4 25.6

6.44

12.8

1.6

4.8

17.6 3.2

35.2

2

4.8

22.4 9.6

40

4 6.4

8

14.4

2

2

11.2

4 16

28.8 19.2

51.2 54.4 57.6 60.8 62.4 65.6 68.8 72 5.2 80

70.4 7876473 8 6

24

33.6 43.2

27.2

2

32

38.4

36.8 41

49.6 67 264 59.2 56

x

20

44.8 452 8

48 49 6 51 2

FIGURE 4.8: Minimal value of z = (x − 2)2 + (y − 3)2 on x2 + y 2 = 1

Remark 4.2.4 The conclusion of the Karush-Kuhn-Tucker theorem is also true when the extreme point x∗ satisfies any one of the following regularity conditions (see [14], [5]): i) Linear constraints: gj (x) is linear, j = 1, · · · , m.

ii) Slater’s condition: gj (x) is convex and there exists x ¯ such that gj (¯ x) < bj , j = 1, · · · , m (with f concave).

228

Introduction to the Theory of Optimization in Euclidean Space

iii) Concave programming (with f concave): gj (x) is convex and there exists x ¯ such that for any j = 1, . . . , m, x)  bj gj (¯

and

gj (¯ x ) < bj

if

gj

is not linear.

iv) The rank condition: The constraints gi1 , . . . , gip , (p  m), are binding. The rank of the matrix ⎡  ∗ ⎤ gi1 (x ) ⎢ ⎥ .. ⎣ ⎦ . gip (x∗ )

is equal to p. This last case is the one we consider here in our study. These four conditions are not equivalent to one another. For example, the uniqueness of the Lagrange multipliers is established under the rank condition iv).

Example 4. (Non-uniqueness of Lagrange multipliers). Solve the problem max f (x, y) = x1/2 y 1/4 2x + y  3,

x + 2y  3,

subject to

x+y 2

with

x  0,

y  0.

Solution: To simplify calculations, we will transform the problem to an equivalent one as we did for distance problems, where the square distance is considered instead of the distance itself. Here, to avoid the powers, we will use the logarithmic function. i) The constraint set, sketched in Figure 4.9, and defined by: S = {(x, y) ∈ R+ × R+ :

2x + y  3,

x + 2y  3,

x + y  2}

is a closed bounded subset of R2 . f is continuous on S, then, by the extreme value theorem, ∃(x∗ , y∗ ) ∈ S

such that

f (x∗ , y∗ ) = max f (x, y). (x,y)∈S

Note that, we have f (0, y) = f (x, 0) = 0 f (x, y) > 0

∀x > 0,

∀x  0, y > 0.

y0

Constrained Optimization-Inequality Constraints

229

So f (x∗ , y∗ ) = max f (x, y) > 0. Therefore, at the maximum point, the con(x,y)∈S

straints x  0 and y  0 cannot be binding. y 3.0 2.0

2x y 3 y

2.5

1.5

1.0 0.5 0.0

2.0

1.5 z x

1.5 z

4

y

1.0

0.5

1.0

0.0

0.5

0.0

x2 y 3

S

0.5

x y 2

1.0 x

0.5

1.0

1.5

2.0

2.5

3.0

x

1.5 2.0

FIGURE 4.9: Graph of z = x1/2 y 1/4 on S

ii) Set

Ω = (0, +∞) × (0, +∞).

As a consequence of i), we have max f (x, y) = max ∗ f (x, y)

(x,y)∈S

(x,y)∈S

where S ∗ = {(x, y) ∈ Ω :

2x + y  3,

x + 2y  3,

x + y  2}.

Set F (x, y) = ln f (x, y) =

1 1 ln(x) + ln(y) 2 4

(x, y) ∈ Ω

F is well defined and we have max f (x, y) = max ∗ f (x, y) = max ∗ eln F (x,y)

(x,y)∈S

(x,y)∈S

max ∗ F (x, y) = ln

(x,y)∈S

(x,y)∈S

max ∗ f (x, y) = ln f (x∗ , y∗ )

(x,y)∈S

since the functions t −→ ln t and t −→ et are increasing. Note that S ∗ is a bounded subset of R2 but not closed. Thus, we cannot apply the extreme value theorem to conclude about the existence of a solution to the problem

230

Introduction to the Theory of Optimization in Euclidean Space max F (x, y).

(x,y)∈S ∗

iii) Since F and the constraints are C 1 in Ω, to solve the problem, we write the KKT conditions for the associated Lagrangian L(x, y, λ1 , λ2 , λ3 ) =

1 1 ln x+ ln y−λ1 (2x+y−3)−λ2 (x+2y−3)−λ3 (x+y−2), 2 4

The necessary conditions to satisfy are: ⎧ 1 ⎪ ⎪ − 2λ1 − λ2 − λ3 = 0 (i) Lx = ⎪ ⎪ 2x ⎪ ⎪ ⎪ ⎪ ⎪ ⎪ 1 ⎪ ⎪ ⎪ − λ1 − 2λ2 − λ3 = 0 (ii) Ly = ⎪ ⎪ 4y ⎨ ⎪ (iii) λ1  0 ⎪ ⎪ ⎪ ⎪ ⎪ ⎪ ⎪ ⎪ (iv) λ2  0 ⎪ ⎪ ⎪ ⎪ ⎪ ⎪ ⎩ (v) λ3  0

with

λ1 = 0

if

2x + y < 3

with

λ2 = 0

if

x + 2y < 3

with

λ3 = 0

if

x+y 0 4x 4y

=⇒

y = x.

Because λ1 > 0 then 2x + y = 3. Thus, we have (x, y) = (1, 1). But x + y = 1 + 1 = 2 : contradiction. ◦◦ if x + 2y = 3, then −

if

2x + y < 3, then λ1 = 0, and λ2 =

1 1 = >0 2x 8y

=⇒

x = 4y.

Because x + 2y = 3, then, we have (x, y) = (2, 1/2). But x + y = 2 + 1/2 > 2: contradiction. − contradiction.

if 2x + y = 3, then (x, y) = (1, 1). But x + y = 1 + 1 = 2:

Constrained Optimization-Inequality Constraints

231

** If x + y = 2, then by drawing the constraint set, we see that the only point satisfying x + y = 2 is (x, y) = (1, 1) for which we have also 2x + y = 3 and x + 2y = 3 with 2λ1 + λ2 + λ3 = 1

⇐⇒

λ1 + 2λ2 + λ3 = 1

λ1 = λ2

λ3 = 1 − 3λ1 .

iv) Conclusion. The only point candidate is (1, 1), and it is the maximum point since we know that such a point exists. However, we see that we do not have uniqueness of the Lagrange multipliers, but still we can apply the KKT conditions since the constraints are linear. Note also, that the rank condition is not satisfied since we have g(x, y) = (2x + y, ⎡ 2 g  (x, y) = ⎣ 1 1

x + 2y , x + y) g(1, 1) = (3, 3, 3) ⎤ 1 2 ⎦ rank(g  (1, 1)) = 2 = 3 1

and the three constraints are active at (1, 1); see Figure 4.10. y 2.0 096

0.704

1.312

1.024

0.416

1.44

1.152 0.864

1.5 0.288

1.28

0.576

1.536 1.6 1.472

1.056 1.184

0.768 0.48

1.344

1.504 1.408

0.928 1.088

1.00.16

1.248

0.64 0.384

1.376

0.832 0.96

0.5 0.224

1.568

1.12

1.216

0.544 0.672

0.8

0.896

0.992

0.32 0.0 032 0.0

0.192 0.5

0.448

0 128 1.0

0.512

0.608 0 256 1.5

0.7 0 352 0 0 x 2.0

FIGURE 4.10: Maximal value of z = x1/2 y 1/4 on S

232

Introduction to the Theory of Optimization in Euclidean Space

Mixed Constraints Some maximization problems take the form ⎧ ⎨ gj (x) = bj , max f (x)

subject to



hk (x)  ck ,

j = 1, · · · , r

(r < n)

k = 1, · · · , s

We have:

Theorem 4.2.3 Let f , g = (g1 , . . . , gr ), and h = (h1 , . . . , hs ) be C 1 functions in a neighborhood of x∗ ∈ [g(x) = b] ∩ [h(x)  c]. If x∗ is a regular point and a local maximum point of f subject to these constraints, then ∃!(λ∗ , μ∗ ), λ∗ = (λ∗1 , . . . , λ∗r ), μ∗ = (μ∗1 , . . . , μ∗s ) such that the following Karush-Kuhn-Tucker (KKT) conditions hold at (x∗ , λ∗ , μ∗ ): ⎧ r s   ⎪ ∂f ∗ ∂hk ∗ ∂L ∗ ∗ ∗ ⎪ ∗ ∂gj ∗ ⎪ (x , λ , μ ) = (x ) − λ (x ) − μ∗k (x ) = 0 ⎪ j ⎪ ∂x ∂x ∂x ∂xi ⎪ i i i ⎪ j=1 k=1 ⎪ ⎪ ⎪ ⎪ i = 1, . . . , n ⎨ ⎪ ∂L ∗ ∗ ∗ ⎪ ⎪ (x , λ , μ ) = −(gj (x∗ ) − bj ) = 0 j = 1, . . . , r. ⎪ ⎪ ⎪ ∂λ j ⎪ ⎪ ⎪ ⎪ ⎪ ⎩ ∗ with μ∗k = 0 if hk (x∗ ) < ck , k = 1, . . . , s, μk  0, where L(x, λ, μ) = f (x) −

r  j=1

λj (gj (x) − bj ) −

s 

μk (hk (x) − ck ).

k=1

Proof. The maximization problem is equivalent to

max f (x)

subject to

⎧ j = 1, . . . , r gj (x)  bj , ⎪ ⎪ ⎪ ⎪ ⎨ −gj (x)  −bj , j = 1, . . . , r ⎪ ⎪ ⎪ ⎪ ⎩ hk (x)  ck , k = 1, . . . , s

Constrained Optimization-Inequality Constraints

233

By applying the KKT conditions with the Lagrangian



L (x, τ, κ, μ) = f (x)−

r 

τj (gj (x)−bj )−

j=1

r 

κj (−gj (x)+bj )−

j=1

s 

μk (hk (x)−ck )

k=1

there exist unique multipliers τj∗ , κ∗j , μ∗k such that the necessary conditions are satisfied: ⎧ r  ∂f ∗ ∂gj ∗ ∂L∗ ∗ ∗ ∗ ∗ ⎪ ⎪ ⎪ (x , τ , κ , μ ) = (x ) − τj∗ (x ) ⎪ ⎪ ∂xi ∂xi ∂xi ⎪ ⎪ j=1 ⎪ ⎪ ⎪ ⎪ ⎪ ⎪ r s ⎪   ⎪ ∂hk ∗ ∗ ∂gj ∗ ⎪ ⎪ + κ (x ) − μ∗k (x ) = 0 ⎪ j ⎨ ∂xi ∂xi ⎪ ⎪ ⎪ ⎪ τj∗  0, ⎪ ⎪ ⎪ ⎪ ⎪ ⎪ ⎪ ⎪ ⎪ κ∗j  0, ⎪ ⎪ ⎪ ⎪ ⎪ ⎩ ∗ μk  0,

j=1

i = 1, . . . , n

k=1

with

τj∗ = 0

if

gj (x∗ ) < bj ,

with

κ∗j = 0

if

− gj (x∗ ) < −bj ,

with

μ∗k = 0

if

hk (x∗ ) < ck ,

j = 1, . . . , r j = 1, . . . , r k = 1, . . . , s.

Setting λ∗ = τ ∗ − κ ∗

and

L(x, λ, μ) = L∗ (x, τ, κ, μ) = f (x) −

r 

(τj − κj )(gj (x) − bj ) −

j=1

s 

μk (hk (x) − ck )

k=1

we deduce that (x∗ , λ∗ , μ∗ ) is also a solution of the KKT conditions corresponding to Lagrangian L. Moreover, for j = 1, . . . , r, λj changes sign λj = τj − κj = −κj  0 λ j = τj − κ j = τj  0

if if

gj (x) < bj gj (x) > bj .

234

Introduction to the Theory of Optimization in Euclidean Space

Uniqueness of λ∗ and μ∗ . Suppose λ∗ and μ∗ are not uniquely defined, then we would have for some λ = λ and μ = μ r s   ∂f ∗ ∂gj ∗ ∂hk ∗ (x ) − λj (x ) − μk (x ) = 0 ∂xi ∂xi ∂xi j=1 k=1

r s   ∂gj ∗ ∂hk ∗ ∂f ∗ (x ) − λj (x ) − μk (x ) = 0. ∂xi ∂x ∂xi i j=1 k=1

Subtracting the two equalities and using the fact that x∗ is a regular point, we obtain a contradiction: r 

(λj − λj )∇gj (x∗ ) −

j=1

s 

(μk − μk )∇hk (x∗ ) = 0

=⇒

(λ , μ ) = (λ, μ).

k=1

Nonnegativity constraints Some maximization problems take the form ⎧ ⎨ gj (x)  bj , max f (x)

subject to



j = 1, . . . , m

x1  0, . . . , xn  0.

We introduce the following n new constraints: gm+1 (x) = −x1  0, . . . . . . . . . . . . , gm+n (x) = −xn  0. The maximization problem is equivalent to ⎧ ⎨ gj (x)  bj , max f (x)

subject to



gj (x)  0,

j = 1, . . . , m j = m + 1, . . . , m + n.

By applying the KKT conditions, for a regular point x, with the Lagrangian L∗ (x, λ, μ) = f (x) −

m  j=1

λj (gj (x) − bj ) −

n  k=1

μk (−xk )

Constrained Optimization-Inequality Constraints there exist unique multipliers λj , μk such that ⎧ m  ⎪ ∂f ∂gj ∂L∗ ⎪ ⎪ (x, λ, μ) = (x) − λj (x) + μi = 0 ⎪ ⎪ ∂x ∂x ∂x ⎪ i i i ⎪ j=1 ⎨ ⎪ λj  0, ⎪ ⎪ ⎪ ⎪ ⎪ ⎪ ⎩ μk  0,

with

λj = 0

if

gj (x) < bj ,

with

μk = 0

if

xk > 0,

235

i = 1, . . . , n j = 1, . . . , m k = 1, . . . , n.

We deduce then:

Theorem 4.2.4 Let f and g = (g1 , . . . , gr ) be C 1 functions in a neighborhood of x∗ ∈ [g(x)  b] ∩ [x  0]. If x is a regular point and a local maximum point of f subject to these constraints, then ∃!λ∗ = (λ∗1 , . . . , λ∗m ) such that the following Karush-Kuhn-Tucker (KKT) conditions hold at (x, λ): ⎧ m  ⎪ ∂f ∂gj ∂L ⎪ ⎪ (x, λ) = (x) − λj (x)  0 (=0 ⎪ ⎪ ∂xi ∂xi ⎨ ∂xi j=1 i = 1, . . . , n ⎪ ⎪ ⎪ ⎪ ⎪ ⎩ λj  0, with λj = 0 if gj (x) < bj , where the Lagrangian is L(x, λ) = f (x) −

m  j=1

λj (gj (x) − bj ).

if

xi > 0 ),

j = 1, . . . , m

236

Introduction to the Theory of Optimization in Euclidean Space

Solved Problems

1. – Importance of KKT hypotheses. Show that the KKT conditions fail to hold at the optimal solution of the problem ⎧ ⎨ g1 (x, y) = (x − 2)2 = 0 2 subject to max f (x, y) = x + y ⎩ g2 (x, y) = (y + 1)3  0.

Solution: i) The set of constraints, graphed in Figure 4.11, is S = {(x, y) :

g1 (x, y) = 0

= {(x, y) :

x=2

and

g2 (x, y)  0} y  −1}.

and

y 0.0

0.5

1.0

1.5

2.0

2.5

3.0

x

0.5

1.0

1.5

2.0

S

2.5

3.0

FIGURE 4.11: Constraint set S

ii) The Karush-Kuhn-Tucker conditions for the Lagrangian L(x, y, α, β) = x2 + y − α((x − 2)2 ) − β((y + 1)3 )

Constrained Optimization-Inequality Constraints ⎧ (1) ⎪ ⎪ ⎪ ⎪ ⎨ (2) ⎪ ⎪ ⎪ ⎪ ⎩ (3)

are

237

Lx = 2x − 2α(x − 1) = 0 Ly = 1 − 3β(y + 1)2 = 0 β0

with

β=0

(y + 1)3 < 0

if

* If (y + 1)3 < 0, then β = 0. We get a contradiction with (2) which leads to 1 = 0. * If (y + 1)3 = 0, then y = −1, and by (3) again, we obtain 1 = 0 which is not possible. Thus, KKT conditions have no solution. ii) The problem has a solution at (2, −1) since, we have f (x, y) = x2 + y = 22 + y  4 + (−1) = f (2, −1)

∀(x, y) ∈ S.

Thus max f (x, y) = f (2, −1) = 3. S

Note that the point is not a candidate for the KKT conditions. This is because it doesn’t satisfy the constraint qualification under which the KKT conditions are established. In particular, the rank condition is not satisfied. Indeed,  the   g1 (2, −1) ) = 0 two constraints are active at (2, −1), but we have rank( g2 (2, −1) since           g1 (x, y) g1 (2, −1) 2(x − 2) 0 0 0 = = . 0 3(y + 1)2 0 0 g2 (x, y) g2 (2, −1)

2. –KKT conditions are not sufficient. Consider the problem min f (x, y) = 2 − y − (x − 1)2

subject to

y − x = 0,

x+y−20

i) Sketch the feasible set and write down the necessary KKT conditions. ii) Find the point(s) solution of the KKT conditions and check their regularity. iii) What can you conclude about the solution of the minimization problem? iv) Does this contradict the theorem on the necessary conditions for a constrained candidate point?

238

Introduction to the Theory of Optimization in Euclidean Space

Solution: i) The set of the constraints is the set of points on the line y = x included in the region below the line y = 2 − x, as shown in Figure 4.12. y 3 x y 2 2

1

3

2

1

1

2

3

x

1 x y 0 2

S

S 3

FIGURE 4.12: Constraint set S

Writing the Karush-Kuhn-Tucker conditions. First, transform the problem into a maximization one as: max −f (x, y) = y − 2 + (x − 1)2

y − x = 0,

subject to

x+y−20

Note that f , and the constraints g1 and g2 are C ∞ in R2 where g1 (x, y) = y − x

g2 (x, y) = x + y − 2.

Thus, the Lagrangian associated is L(x, y, α, β) = y − 2 + (x − 1)2 − α(y − x) − β(x + y − 2) and the Karush-Kuhn-Tucker conditions are ⎧ Lx = 2(x − 1) + α − β = 0 ⎪ ⎪ (1) ⎪ ⎪ ⎪ ⎪ ⎪ ⎪ Ly = 1 − α − β = 0 ⎨ (2) ⎪ ⎪ (3) ⎪ ⎪ ⎪ ⎪ ⎪ ⎪ ⎩ (4)

Lα = −(y − x) = 0 β0

with

ii) Solving the KKT conditions. ∗ If x + y − 2 < 0 then β = 0 and

β=0

if

x + y − 2 < 0.

Constrained Optimization-Inequality Constraints ⎧ ⎨ 2(x − 1) + α = 0 1−α=0 ⎩ y−x=0

1 1 =⇒ (x, y) = ( , ) 2 2

and

239

(α, β) = (1, 0).

∗∗ If x + y − 2 = 0 then ⎧ 2(x − 1) + α − β = 0 ⎪ ⎪ ⎨ 1−α−β =0 =⇒ (x, y) = (1, 1) y−x=0 ⎪ ⎪ ⎩ x+y−2=0

and

(α, β) = (1/2, 1/2).

So, there are two solutions: (1/2, 1/2) and (1, 1) for the KKT conditions. Regularity of the point (1/2, 1/2). Only the constraint g1 (x, y) = y − x is active at (1/2, 1/2) and we have 

g1 (x, y)



=



−1

1

    rank( g1 (1/2, 1/2) ) = rank( −1 1 ) = 1.



The point (1/2, 1/2) is a regular point. Regularity of the point (1, 1). The two constraints are active at (1, 1). We have 

g1 (x, y) g2 (x, y)



 =

−1 1 1 1



 rank(

g1 (1, 1) g2 (1, 1)



 ) = rank(

−1 1 1 1

 ) = 2.

Thus the point (1, 1) is a regular point. iii) Conclusion. The two points are candidates for optimality. Comparing the values taken by f at these points gives: f (1, 1) = 1,

1 1 5 1 1 f ( , ) = 2 − − = > 1, 2 2 2 4 4

we deduce that, only (1, 1) is the candidate for minimality. However, it is not the minimum point. Indeed, we have f (x, x) = 2 − x − (x − 1)2 −→ −∞ Therefore, f doesn’t attain its minimal value.

as

x −→ −∞.

240

Introduction to the Theory of Optimization in Euclidean Space

iv) This doesn’t contradict the theorem since KKT conditions indicate only where to find the possible points when they exist.

3. – Positivity constraints. Solve the problem by two methods: ⎧ ⎨ y − x2  0, y  4 max f (x, y) = 3 + x − y + xy subject to ⎩ x  0, y0 i) using the extreme value theorem. ii) using the KKT conditions.

Solution: i) EVT method. The constraints set, graphed in Figure 4.13, is S = {(x, y) / 0  x  2,

x2  y  4}. x

y

0 4

5

1

3 y

2 3

2

2

yx

1 0 10

4 y4

3

z  yxx y3

S

z 5

2

1

0

0 0.0

0.5

1.0

1.5

2.0

2.5

3.0

x

FIGURE 4.13: Graph of z = 3 + x − y + xy on S

f is continuous (because it is a polynomial) on the set S, which is a bounded and closed subset of R2 . So f attains its absolute extreme points on S (by the ◦

extreme value theorem), either at the critical points located in S or on ∂S.

Constrained Optimization-Inequality Constraints

241

* Critical points of f : f has no critical point in the interior of S because ∇f (x, y) = 1 + y, −1 + x = 0, 0

⇐⇒

(x, y) = (1, −1) ∈ S.

** Extreme values on ∂S : Let L1 , L2 and L3 the three parts of the boundary of S defined by: L1 = {(x, x2 ), 0  x  2},

L2 = {(x, 4), 0  x  2}

L3 = {(0, y), 0  y  4} – On L1 , we have f (x, x2 ) = 3 + x − x2 + x3 = g(x), g  (x) = 3x2 − 2x + 1. x g  (x) g(x)

0

2 +

3



9

TABLE 4.1: Variations of g(x) = 3 + x − x2 + x3 on [0, 2] Then, using Table 4.1, we deduce that max f = f (2, 4) = 9

min f = f (0, 0) = 3.

L1

L1

– On L2 , we have: f (x, 4) = 5x − 1 = h(x), x h (x) h(x)

h (x) = 5.

0 −1

2 + 

9

TABLE 4.2: Variations of h(x) = 5x − 1 on [0, 2] From Table 4.2, the extreme values on this side are max f = f (2, 4) = 9 L2

min f = f (0, 4) = −1. L2

ϕ (y) = −1.

– On L3 , we have: f (0, y) = 3 − y = ϕ(y), Hence, we obtain from Table 4.3, max f = f (0, 0) = 3 L3

min f = f (0, 4) = −1. L3

242

Introduction to the Theory of Optimization in Euclidean Space y ϕ (y) ϕ(y)

0

4 

3



−1

TABLE 4.3: Variations of ϕ(y) = 3 − y on [0, 4] ∗ ∗ ∗Conclusion: The maximal value of f on S is 9 and is attained at the point (2, 4). The minimal value of f on S is −1 and is attained at the point (0, 4). ii) KKT conditions. Consider the Lagrangian L(x, y, λ, μ) = 3 + x − y + x y − λ(x2 − y) − μ(y − 4).

The Karush-Kuhn-Tucker conditions are ⎧ (= 0 if x > 0) (1) Lx = 1 + y − 2λx  0 ⎪ ⎪ ⎪ ⎪ ⎪ ⎪ ⎪ ⎪ Ly = −1 + x + λ − μ  0 (= 0 if y > 0) ⎨ (2) ⎪ ⎪ (3) ⎪ ⎪ ⎪ ⎪ ⎪ ⎪ ⎩ (4)

λ0

with

λ=0

if

y < x2

μ0

with

μ=0

if

y 0, then y = x2 > 0. From (1) and (2), we get 1 + x2 − 2λx = 0

and

λ=1−x

=⇒

3x2 − 2x + 1 = 0

with no solution. ◦ if x = 0, then y = x2 = 0, and (1) leads to a contradiction.

Constrained Optimization-Inequality Constraints

243

∗∗ If y = 4 – Suppose y < x2 then λ = 0 and by (1), we get 1 + 4  0 which is not possible. – Suppose y = x2 . We deduce that x = 2 or x = −2. The second value is not possible since x  0. For x = 2 > 0, we insert the values x = 2 and y = 4 in (1) and (2), and obtain  5 9 5 − 4λ = 0 =⇒ (λ, μ) = ( , ). 1+λ−μ=0 4 4 Note that, both constraints are active at (2, 4), and if g(x, y) = (x2 − y, y − 4), then     2x −1 4 −1 rank(g  (2, 4)) = rank( ) = 2. g  (x, y) = 0 1 0 1 Thus, (2, 4) is a regular point and, therefore, a candidate point. Moreover, (2, 4) solves the problem since the maximal value of f is attained on S by the EVT; see Figure 4.14. y 0.7

5

8.75 10.15

4.55

1.05

12.2513.3 14.715.4

.75 6.3

1.4 0

14. 15.

11.2

12.95

2.45

1.05 4

11.9

7.7 9.45

14.

13. 10.5

0.7 0.35

11.55 12. 8.4

5.25

3 1.75

7.

35

9.8 10.8

3.5 2 8.05 .4

9.1

5.6

1 2.1

6.65 7.3

2.8 0 3 15 0.0

0.5

3 85

4.9

4.2 1.0

1.5

2.0

2.5

5 95 x 3.0

FIGURE 4.14: Maximal value of z = 3 + x − y + xy on S

244

Introduction to the Theory of Optimization in Euclidean Space

4. – Application. Find (x, y) ∈ S = {(x, y) : x + y  0, x2 − 4  0} that lies closest to the point (2, 3) by following the steps below: i) Formulate the problem as an optimization problem. ii) Illustrate the problem graphically (Hint: use level curves). iii) Write down the KKT conditions. iv) Find all points that satisfy the KKT conditions. Check whether or not each point is regular. v) What can you conclude about the solution of the problem?

Solution: i) The square of the distance between (x, y) and (2, 3) is given by (x − 2)2 + (y − 3)2 . To find the point (x, y) ∈ S that lies closest to the point (2, 3) is equivalent to solve the minimization problem

min (x − 2)2 + (y − 3)2

⎧ ⎨ g1 (x, y) = x + y  0 subject to



g2 (x, y) = x2 − 4  0

or to maximize the objective function f below subject to the two constraints: ⎧ ⎨ g1 (x, y) = x + y  0 max f (x, y) = −(x − 2)2 − (y − 3)2 subject to ⎩ g2 (x, y) = x2 − 4  0

ii) The feasible set, graphed in Figure 4.15, is also described by S = {(x, y) :

y  −x,

−2  x  2}

The level curves of f , with equations: (x√− 2)2 + (y − 3)2 = k where k  0, are circles centered at (2, 3) with radius k; see Figure 4.16. If we increase the values of the radius, the values of f decrease. The first circle that will intersect the set S will be the circle with radius equal to the distance of the point (2, 3) to the line y = −x. So, only the first constraint will be active in solving the optimization problem. iii) Writing the KKT conditions. Consider the Lagrangian L(x, y, λ, β) = −(x − 2)2 − (y − 3)2 − λ(x + y) − β(x2 − 4)

Constrained Optimization-Inequality Constraints

245 x

1

0 1

4

2

y 2

0

y 10

x  2

x2

4

y  x

2

z 5 3

2

1

1

2

3

x

2

z  x  22  y  32

S 0 4

FIGURE 4.15: Graph of z = (x − 2)2 + (y − 3)2 on S

The KKT conditions ⎧ ⎪ ⎪ (1) ⎪ ⎪ ⎪ ⎪ ⎪ ⎪ ⎪ ⎪ ⎪ ⎨ (2)

are

⎪ ⎪ ⎪ ⎪ ⎪ (3) ⎪ ⎪ ⎪ ⎪ ⎪ ⎪ ⎩ (4)

∂L = −2(x − 2) − λ − 2βx = 0 ∂x ∂L = −2(y − 3) − λ = 0 ∂y λ0

with

λ=0

if

x+y 0. 1+β

This contradicts x + y < 0. ∗∗ If x + y = 0 then – Suppose x2 − 4 < 0 then β = 0 and ⎧ ⎨ −2(x − 2) − λ = 0 =⇒ ⎩ −2(y − 3) − λ = 0

y = x + 1.

With x+y = 0, we deduce that (x, y) = (−1/2, 1/2). Note that (−1/2)2 −4 < 0 is satisfied and λ = 5 > 0. – Suppose x2 − 4 = 0. We deduce that x = 2 or x = −2. Then, inserting in (1) and (2), we obtain

(x, y) = (2, −2)

⎧ ⎨ λ + 4β = 0 =⇒



10 − λ = 0

=⇒

(λ, β) = (10, −5/2)

Constrained Optimization-Inequality Constraints ⎧ ⎨ 8 − λ + 4β = 0 (x, y) = (−2, 2)

=⇒



=⇒

247

(λ, β) = (0, −2)

λ=0

contradicting β  0. So, the only point solution of the system is (x∗ , y ∗ ) = (−1/2, 1/2)

with

(λ, β) = (5, 0).

Regularity of the candidate point (−1/2, 1/2). Note that only the constraint g1 (x, y) = x + y is active at (−1/2, 1/2). We have 

g1 (x, y)



=



1 1



   rank( g1 (−1/2, 1/2) ) = rank( 1

 1 ) = 1.

Thus the point (−1/2, 1/2) is a regular point. iv) Conclusion. The constraint set is an unbounded closed convex and we have |f (x, y)| = (x, y) − (2, 3) 2  ( (x, y) − (2, 3) )2 =⇒

lim

(x,y)→+∞

f (x, y) = +∞.

By Theorem 2.4.2, there exists a minimum point for f on S. Thus, the candidate found solves the problem.

5. – Mixed constraints. Solve the problem ⎧ ⎨ 2x2 + y 2 + z 2 = 1 2 2 2 max x + y + z subject to ⎩ x + y + z  0.

Solution: Set U (x, y, z) = x2 + y 2 + z 2 and g(x, y, z) = x + y + z

h(x, y, z) = 2x2 + y 2 + z 2 − 1.

First, the maximization problem has a solution by the extreme-value theorem. Indeed, U is continuous on the set S = {(x, y, z) :

x + y + z  0,

2x2 + y 2 + z 2 = 1}

248

Introduction to the Theory of Optimization in Euclidean Space

which is a closed and bounded subset of R3 as the intersection of the ellipsoid x2 √ + y 2 + z 2 = 1 with the region below the plane x + y + z = 0. The (1/ 2)2 plane passes through the center of the ellipsoid; see Figure 4.17. 2 y

1

0 1 2 2

Plane

Ellipsoid

1

z

0

1

2 2 1 0 x

1 2

FIGURE 4.17: S is the part of the ellipsoid below the plane Next, the functions U , g and h are C 1 around each point (x, y, z) ∈ R3 . We then may deduce the solution by using the Karusk-Kuhn-Tucker conditions. The Lagrangian is given by L(x, y, λ, μ) = x2 + y 2 + z 2 − λ(x + y + z) − μ(2x2 + y 2 + z 2 − 1), and the necessary conditions to satisfy are: ⎧ (i) ⎪ ⎪ ⎪ ⎪ ⎪ ⎪ ⎪ ⎪ (ii) ⎪ ⎪ ⎪ ⎪ ⎨ (iii) ⎪ ⎪ ⎪ ⎪ ⎪ ⎪ (iv) ⎪ ⎪ ⎪ ⎪ ⎪ ⎪ ⎩ (v)

Lx = 2x − λ − 4μx = 0

⇐⇒

2x(1 − 2μ) = λ

Ly = 2y − λ − 2μy = 0

⇐⇒

2y(1 − μ) = λ

Lz = 2z − λ − 2μz = 0

⇐⇒

2z(1 − μ) = λ

Lμ = −(2x2 + y 2 + z 2 − 1) = 0 λ0

with

λ=0

if

x + y + z < 0.

* If x + y + z < 0, then λ = 0, and from (i), (ii), (iii) and (iv), we deduce that

Constrained Optimization-Inequality Constraints ⎧ x = 0 or μ = 1/2 ⎪ ⎪ ⎨ y = 0 or μ = 1 ⎪ z = 0 or μ = 1 ⎪ ⎩ 2x2 + y 2 + z 2 = 1

249

We obtain the points (0, 0, −1),

(0, −1, 0)

1 (− √ , 0, 0) 2

with

with

μ=1

μ=

1 2

The active constraint at these points satisfies

h′(x, y, z) = (4x, 2y, 2z),  rank h′(0, −1, 0) = rank h′(0, 0, −1) = rank h′(−1/√2, 0, 0) = 1.

Thus, the points are regular and candidates for optimality.
* If x + y + z = 0, then
• Suppose x = 0. We deduce from (iv) and x + y + z = 0 that

y² + z² = 1  and  y + z = 0,

and deduce the two candidate points

(x, y, z) = (0, 1/√2, −1/√2)  or  (0, −1/√2, 1/√2)  with  (λ, μ) = (0, 1).

The two constraints are active at these points and satisfy

(g(x, y, z), h(x, y, z)) = (x + y + z, 2x² + y² + z² − 1),
g′(x, y, z) = (1, 1, 1),  h′(x, y, z) = (4x, 2y, 2z),

rank [ g′(0, 1/√2, −1/√2) ; h′(0, 1/√2, −1/√2) ] = rank [ 1  1  1 ; 0  √2  −√2 ] = 2,
rank [ g′(0, −1/√2, 1/√2) ; h′(0, −1/√2, 1/√2) ] = rank [ 1  1  1 ; 0  −√2  √2 ] = 2.

The points satisfy the constraint qualification. They are regular points and candidates for optimality.

• Suppose x ≠ 0. Then
– if μ = 1/2, then λ = 0. By (ii) and (iii), we have y = z = 0. Thus, from x + y + z = 0, we deduce x = 0: contradiction with x ≠ 0.
– if μ ≠ 1/2, then from (i), we have λ ≠ 0. Moreover, by (ii) and (iii), we have μ ≠ 1. So, by dividing each side of (ii) by each side of (iii), we obtain y = z. Then we deduce that

x + 2y = 0,  2x² + 2y² = 1,  2y(1 − μ) = λ = 2(−2y)(1 − 2μ)  =⇒  y = ±1/√10,  μ = 3/5.

With λ ≥ 0, the only possible point is

(x, y, z) = (−2/√10, 1/√10, 1/√10)  with  (λ, μ) = (4/(5√10), 3/5).

It is also clear that the constraint qualification condition is satisfied, so the point is regular:

rank [ g′(−2/√10, 1/√10, 1/√10) ; h′(−2/√10, 1/√10, 1/√10) ] = rank [ 1  1  1 ; −8/√10  2/√10  2/√10 ] = 2.

Conclusion: Finally, comparing the values of f at the candidate points,

f(−2/√10, 1/√10, 1/√10) = 3/5,  f(−1/√2, 0, 0) = 1/2,
f(0, 0, −1) = f(0, −1, 0) = f(0, 1/√2, −1/√2) = f(0, −1/√2, 1/√2) = 1,

we deduce that f attains its maximum value subject to the constraints at

(0, 0, −1),  (0, −1, 0),  (0, 1/√2, −1/√2)  and  (0, −1/√2, 1/√2).
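The comparison of the candidate values can also be reproduced numerically (this check is an addition, not in the original text). A minimal sketch, assuming NumPy and SciPy are available; since the maximizer is not unique, several starting points are used and every run should return the optimal value 1.

    import numpy as np
    from scipy.optimize import minimize

    # Maximize x^2 + y^2 + z^2 subject to 2x^2 + y^2 + z^2 = 1 and x + y + z <= 0.
    # SLSQP minimizes, so the objective is negated; inequality constraints are fun >= 0.
    objective = lambda v: -(v[0]**2 + v[1]**2 + v[2]**2)
    constraints = [
        {"type": "eq",   "fun": lambda v: 2*v[0]**2 + v[1]**2 + v[2]**2 - 1},
        {"type": "ineq", "fun": lambda v: -(v[0] + v[1] + v[2])},
    ]

    for x0 in ([0.1, 0.5, -0.8], [0.1, -0.9, 0.2], [-0.5, 0.3, -0.3]):
        res = minimize(objective, x0, method="SLSQP", constraints=constraints)
        print(np.round(res.x, 4), "max value ~", round(-res.fun, 4))   # expect 1.0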

4.3  Classification of Local Extreme Points-Inequality Constraints

To classify a candidate point x∗ for optimality of the problem

local max (min) f(x)  subject to  g(x) ≤ b,

with g = (g1, . . . , gm) and b = (b1, . . . , bm), we proceed as in the case of equality constraints by comparing the values taken by the Lagrangian

L(x, λ) = f(x) − λ1(g1(x) − b1) − · · · − λm(gm(x) − bm)

at points close to x∗. Then, since x∗ ∈ [g(x) ≤ b] means that

x∗ ∈ ∩_{j∉I(x∗)} [gj(x) < bj] = O  and  x∗ ∈ ∩_{j∈I(x∗)} [gj(x) = bj],

we remark that, by working in a neighborhood of x∗ included in the open set O, we bring ourselves to solving a local optimization problem with equality constraints:

local max (min) f(x)  subject to  gj(x) = bj,  j ∈ I(x∗).

Consequently, we can apply the second derivative test established for equality constraints by considering in the test only the active constraints at that point. In what follows, suppose we have:

Hypothesis (H): f and g = (g1, . . . , gm) are C² functions in a neighborhood of x∗ in Rⁿ such that

gj(x∗) = bj  if  j ∈ I(x∗) = {i1, . . . , ip},
λj = 0  if  gj(x∗) < bj,
rank(G′(x∗)) = p,  ∇x L(x∗, λ∗) = 0,  p < n.

Theorem 4.3.1 Suppose hypothesis (H) holds. Then

(−1)^p Br(x∗) > 0  ∀r = p + 1, . . . , n  =⇒  x∗ is a strict local minimum point,
(−1)^r Br(x∗) > 0  ∀r = p + 1, . . . , n  =⇒  x∗ is a strict local maximum point,

where the Br(x∗) are the bordered Hessian determinants built from the active constraints at x∗.

Proof. The proof follows the one seen for the case of equality constraints. We outline here the key modification that allows us to conclude with the previous proof. We assume that I(x∗) ≠ {1, . . . , m} to avoid the case of equality constraints. Note that the positivity of λ is not assumed in hypothesis (H), in order to include both the maximization and minimization problems, as explained below. The Lagrangian introduced is used to link values of f and g for comparison. Then, depending on its positivity or negativity on the tangent plane of the active constraints at that point, we identify whether we have a minimum or a maximum point.

Step 0: Suppose that we assign to the two problems

max f:  L(x, α) = f(x) − α·(g(x) − b),  α ≥ 0,
min f:  L(x, β) = −f(x) − β·(g(x) − b),  β ≥ 0;

then

−L(x, β) = f(x) − (−β)·(g(x) − b),  −β ≤ 0.

So, to consider the two problems simultaneously, we can introduce the Lagrangian

L(x, λ) = f(x) − λ·(g(x) − b)  with  λ ≥ 0 (resp. ≤ 0)

for the maximization (resp. minimization) problem.

Step 1: Since x∗ ∈ ∩_{j∉I(x∗)} [gj(x) < bj] = O and x∗ ∈ ∩_{j∈I(x∗)} [gj(x) = bj], the point x∗ belongs to the open set O. So, one can find ρ0 > 0 such that Bρ0(x∗) ⊂ O. Then, for h ∈ Rⁿ such that x∗ + h ∈ Bρ0(x∗), we have from Taylor's formula, for some τ ∈ (0, 1),

L(x∗ + h, λ∗) = L(x∗, λ∗) + Σ_{i=1}^{n} Lxi(x∗, λ∗) hi + (1/2) Σ_{i=1}^{n} Σ_{j=1}^{n} Lxixj(x∗ + τh, λ∗) hi hj.

By assumptions, we have

Lxi(x∗, λ∗) = 0,  i = 1, . . . , n,
L(x, λ) = f(x) − Σ_{j∈I(x∗)} λj (gj(x) − bj),
gi1(x∗) − bi1 = gi2(x∗) − bi2 = . . . = gip(x∗) − bip = 0;

then we have

L(x∗, λ∗) = f(x∗) − λ∗i1(gi1(x∗) − bi1) − . . . − λ∗ip(gip(x∗) − bip) = f(x∗),
L(x∗ + h, λ∗) = f(x∗ + h) − λ∗i1(gi1(x∗ + h) − bi1) − . . . − λ∗ip(gip(x∗ + h) − bip),

from which we deduce

f(x∗ + h) − f(x∗) = Σ_{k∈I(x∗)} λ∗k [gk(x∗ + h) − bk] + (1/2) Σ_{i=1}^{n} Σ_{j=1}^{n} Lxixj(x∗ + τh, λ∗) hi hj.

Using Taylor's formula for each gk, k ∈ I(x∗), we obtain

gk(x∗ + h) − bk = gk(x∗ + h) − gk(x∗) = Σ_{j=1}^{n} ∂gk/∂xj (x∗ + τk h) hj,  τk ∈ (0, 1).

Step 2: Consider the (p + n) × (p + n) bordered Hessian matrix B(x⁰, x¹, . . . , xᵖ) built from

G(x¹, . . . , xᵖ) = [ ∂g_{ik}/∂xj (xᵏ) ]_{p×n},  k = 1, . . . , p,  j = 1, . . . , n,

whose k-th row is the gradient of the active constraint g_{ik} evaluated at xᵏ.

The remaining steps of the proof for equality constraints then carry over with the above notations.

Remark 4.3.1 If we introduce the notations

Q(h) = Q(h1, . . . , hn) = Σ_{i=1}^{n} Σ_{j=1}^{n} Lxixj(x∗, λ∗) hi hj,

the (p + n) × (p + n) bordered matrix

[ 0_{p×p}  G′(x∗) ; ᵗG′(x∗)  (Lxixj(x∗, λ∗))_{n×n} ],

and M = {h ∈ Rⁿ : G′(x∗)·h = 0}, the theorem says that

Q(h) > 0  ∀h ∈ M, h ≠ 0  =⇒  x∗ is a strict local constrained minimum,
Q(h) < 0  ∀h ∈ M, h ≠ 0  =⇒  x∗ is a strict local constrained maximum.

It suffices then to study the positivity (negativity) of the quadratic form on the tangent plane M to the constraints gk(x) = bk, k ∈ I(x∗), at x∗.
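In practice, checking the sign of Q on M amounts to restricting the Hessian of the Lagrangian to the null space of G′(x∗). The sketch below is an addition, not from the original text; the matrices shown are illustrative values rather than data from any example in the book, and NumPy/SciPy are assumed.

    import numpy as np
    from scipy.linalg import null_space

    def classify_on_tangent_space(HL, G):
        """Restrict the quadratic form h -> h.T @ HL @ h to M = {h : G h = 0}
        and report its definiteness there (the test of Remark 4.3.1)."""
        Z = null_space(G)                 # columns span the tangent space M
        restricted = Z.T @ HL @ Z         # matrix of Q restricted to M
        eig = np.linalg.eigvalsh(restricted)
        if np.all(eig > 0):
            return "strict local constrained minimum"
        if np.all(eig < 0):
            return "strict local constrained maximum"
        return "test inconclusive"

    # Illustrative data: one active constraint with gradient (1, 1) and a sample Hessian.
    G = np.array([[1.0, 1.0]])
    HL = np.array([[-2.0, 1.0],
                   [1.0, -2.0]])
    print(classify_on_tangent_space(HL, G))   # here Q(h) = -6 t^2 < 0 on M: a maximum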

Example 1. Solve the problem

local max (min) f(x, y) = xy  subject to  g(x, y) = x + y ≤ 2.

Solution: Consider the Lagrangian L(x, y, λ) = f (x, y) − λ(g(x, y) − 2) = xy − λ(x + y − 2)


and the system

(i)  Lx = y − λ = 0,  (ii)  Ly = x − λ = 0,  (iii)  λ ≥ 0, with λ = 0 if x + y < 2.

From (i) and (ii), we deduce that λ = x = y.
* If x + y < 2, then λ = 0. Thus (0, 0) is a candidate point; it is an interior point of [g ≤ 2]. To explore its nature, we use the second derivatives test for unconstrained problems. We have

Hf(x, y) = [ 0  1 ; 1  0 ],  D1(0, 0) = 0  and  D2(0, 0) = −1 < 0.

Then (0, 0) is a saddle point.
**

If x + y = 2, then (x, y) = (1, 1) is a candidate point with λ = 1.

First, (1, 1) is a regular point since g′(x, y) = (1, 1) and rank[g′(1, 1)] = 1. Next, since n = 2 and p = 1, we have to consider the sign of the bordered Hessian determinant:

(−1)² B2(1, 1) = det [ 0  gx(1, 1)  gy(1, 1) ; gx(1, 1)  Lxx(1, 1, 1)  Lxy(1, 1, 1) ; gy(1, 1)  Lyx(1, 1, 1)  Lyy(1, 1, 1) ]
              = det [ 0  1  1 ; 1  0  1 ; 1  1  0 ] = 2 > 0.

We conclude that the point (1, 1) is a local maximum to the problem. Finally, we also have

Theorem 4.3.2 (Necessary conditions for local constrained extreme points) If assumptions (H) hold, then

(i)  x∗ is a local minimum point  =⇒  HL = (Lxixj(x∗, λ∗))_{n×n} is positive semi-definite on M:  ᵗy HL y ≥ 0  ∀y ∈ M;

(ii) x∗ is a local maximum point  =⇒  HL = (Lxixj(x∗, λ∗))_{n×n} is negative semi-definite on M:  ᵗy HL y ≤ 0  ∀y ∈ M;

where M = {h ∈ Rⁿ : G′(x∗)·h = 0} is the tangent plane to the constraints gk(x) = bk, k ∈ I(x∗), at x∗.


Proof. Let x(t) ∈ C²[0, a], a > 0, be a curve on the constraint set g(x) ≤ b passing through x∗ at t = 0. Suppose that x∗ is a local maximum point for f subject to the constraint g(x) ≤ b. Then

f(x∗) ≥ f(x(t))  ∀t ∈ [0, a),  or  f̂(0) = f(x∗) ≥ f(x(t)) = f̂(t)  ∀t ∈ [0, a).

So f̂ is a one-variable function that has a local maximum at t = 0. Consequently, it satisfies f̂′(0) ≤ 0 and f̂″(0) ≤ 0, or equivalently

∇f(x∗)·x′(0) ≤ 0  and  d²/dt² f(x(t)) |_{t=0} ≤ 0.

We have

d²/dt² f(x(t)) = ᵗx′(0) Hf(x∗) x′(0) + ∇f(x∗)·x″(0).

Moreover, differentiating the relations gk(x(t)) = bk, k ∈ I(x∗), twice, and denoting Λ∗ = (λ∗i1, . . . , λ∗ip), we obtain

ᵗx′(0) HG(x∗) x′(0) + ∇G(x∗) x″(0) = 0  =⇒  ᵗx′(0) ᵗΛ∗ HG(x∗) x′(0) + ᵗΛ∗ ∇G(x∗) x″(0) = 0.

Hence

0 ≥ d²/dt² f(x(t)) |_{t=0} = [ᵗx′(0) Hf(x∗) x′(0) + ∇f(x∗) x″(0)] − [ᵗx′(0) ᵗΛ HG(x∗) x′(0) + ᵗΛ ∇G(x∗) x″(0)]
  = ᵗx′(0) [Hf(x∗) − ᵗΛ HG(x∗)] x′(0) + [∇f(x∗) − ᵗΛ ∇G(x∗)] x″(0) = ᵗx′(0) [HL(x∗)] x′(0),

since  ∇f(x∗) − ᵗΛ ∇G(x∗) = 0,

and the result follows since x′(0) is an arbitrary element of M.

Example 2. Suppose that (4, 0) is a candidate satisfying the KKT conditions, where only the constraint g1 is active, with λ = −8, and such that

g1′(4, 0) = (−1, 0)  and  HL(·,−8)(4, 0) = [ −2  0 ; 0  14 ].

Can (4, 0) be a local maximum or minimum to the constrained optimization problem?

Solution: The point (4, 0) is regular since rank(g1′(4, 0)) = 1.


We have p = 1 < 2 = n. Then we can consider the following determinant (r = p + 1 = 2). (Note that the first column vector of g1′(4, 0) is linearly independent, so we do not have to renumber the variables.)

B2(4, 0) = det [ 0  −1  0 ; −1  −2  0 ; 0  0  14 ] = −14.

We have (−1)¹ B2(4, 0) = 14 > 0 and λ = −8 < 0. So the second derivatives test is satisfied and (4, 0) is a strict local minimum. This shows also that the Hessian is positive definite under the constraint. Indeed, we can check this directly:

g1′(4, 0)·(h, k) = (−1, 0)·(h, k) = −h + (0)k = 0  ⇐⇒  h = 0.

Thus,

M = { (0, k) : k ∈ R }  and  (0, k) [ −2  0 ; 0  14 ] ᵗ(0, k) = 14k² ≥ 0  ∀ (0, k) ∈ M.
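A two-line numerical confirmation of this computation (added here, not part of the original text; assumes NumPy) builds the bordered matrix and the restriction of the Hessian to M directly.

    import numpy as np

    g1 = np.array([-1.0, 0.0])                    # gradient of the active constraint at (4, 0)
    HL = np.array([[-2.0, 0.0], [0.0, 14.0]])     # Hessian of the Lagrangian at (4, 0), lambda = -8

    # Bordered determinant B2(4, 0): border HL with the active-constraint gradient.
    B2 = np.block([[np.zeros((1, 1)), g1[None, :]],
                   [g1[:, None],      HL        ]])
    print(np.linalg.det(B2))            # -14, so (-1)^1 * B2 = 14 > 0: strict local minimum

    # Restriction to M = {h : g1 . h = 0} = span{(0, 1)}.
    h = np.array([0.0, 1.0])
    print(h @ HL @ h)                   # 14 > 0: positive definite on M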

Solved Problems

1. – Solve the problem

local min f(x, y) = x² + y²  s.t.  x + 2y ≥ 3,  2x − y ≥ 1.

Solution: The problem is equivalent to the maximization problem

local max −f(x, y) = −(x² + y²)  s.t.  −(x + 2y) ≤ −3,  −(2x − y) ≤ −1.

Consider the Lagrangian

L(x, y, λ1, λ2) = −f(x, y) − λ1(−(x + 2y) + 3) − λ2(−(2x − y) + 1) = −(x² + y²) + λ1(x + 2y − 3) + λ2(2x − y − 1).

The constraints are linear, so we can look for the candidate points by writing the Karush-Kuhn-Tucker conditions:

(i)   Lx = −2x + λ1 + 2λ2 = 0
(ii)  Ly = −2y + 2λ1 − λ2 = 0
(iii) λ1 ≥ 0, with λ1 = 0 if x + 2y > 3
(iv)  λ2 ≥ 0, with λ2 = 0 if 2x − y > 1.

We distinguish several cases:
• If 2x − y > 1, then λ2 = 0. From (i) and (ii), we deduce that λ1 = 2x = y. But then 2x − y = 0 ≯ 1. So, no solution.
• If 2x − y = 1, then
– If x + 2y > 3, then λ1 = 0. From (i) and (ii), we deduce that λ2 = x = −2y. With 2x − y = 1, we deduce (x, y) = (2/5, −1/5). But 2/5 + 2(−1/5) = 0 ≯ 3. So, no solution.
– If x + 2y = 3, then with 2x − y = 1, we have (x, y) = (1, 1), and (λ1, λ2) are such that

λ1 + 2λ2 = 2,  2λ1 − λ2 = 2  ⇐⇒  (λ1, λ2) = (6/5, 2/5).

Hence, the only solution point is

(x, y) = (1, 1)  with  (λ1, λ2) = (6/5, 2/5).

Regularity of the point. The two constraints are active at the point. We have

g = (g1, g2) = (−(x + 2y), −(2x − y)),
g′(x, y) = [ ∂g1/∂x  ∂g1/∂y ; ∂g2/∂x  ∂g2/∂y ] = [ −1  −2 ; −2  1 ]  =⇒  rank(g′(1, 1)) = 2.

Classification of the point. Since n = 2 and p = 2, p ≮ n, we cannot apply the second derivatives test. Let us use comparison to conclude. We have

L(x, y, 6/5, 2/5) = −(x² + y²) + (6/5)(x + 2y − 3) + (2/5)(2x − y − 1)
                 = −x² − y² + 2x + 2y − 4 = −(x − 1)² − (y − 1)² − 2 ≤ −2,

and, on the set of the constraints, we have

−f(x, y) ≤ L(x, y, 6/5, 2/5) = −f(x, y) + (6/5)(x + 2y − 3) + (2/5)(2x − y − 1).

Thus,

−f(x, y) ≤ −2 = −f(1, 1)  ∀(x, y) :  x + 2y ≥ 3,  2x − y ≥ 1.

Hence, (1, 1) is the minimum point solution; see Figure 4.18 for a geometric interpretation of the solution.
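For readers who want to confirm the geometry numerically, a short check (an addition, not part of the original text; assumes NumPy/SciPy) solves the same program and recovers the point (1, 1).

    import numpy as np
    from scipy.optimize import minimize

    # Minimize x^2 + y^2 subject to x + 2y >= 3 and 2x - y >= 1.
    objective = lambda v: v[0]**2 + v[1]**2
    constraints = [
        {"type": "ineq", "fun": lambda v: v[0] + 2*v[1] - 3},
        {"type": "ineq", "fun": lambda v: 2*v[0] - v[1] - 1},
    ]

    res = minimize(objective, x0=[3.0, 3.0], method="SLSQP", constraints=constraints)
    print(np.round(res.x, 4), round(res.fun, 4))   # expected: [1. 1.] and 2.0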

FIGURE 4.18: Local minimum of z = x² + y² on x + 2y ≥ 3 and 2x − y ≥ 1

2. – Classify the solutions of the problem

local max (min) f(x, y) = x²y + 3y − 4  s.t.  g(x, y) = 4 − xy ≤ 0.

Solution: i) Consider the Lagrangian

L(x, y, λ) = f(x, y) − λ(g(x, y) − 0) = x²y + 3y − 4 − λ(4 − xy)

and write the conditions

(1)  Lx = 2xy − λ(−y) = 0  ⇐⇒  y(2x + λ) = 0
(2)  Ly = x² + 3 − λ(−x) = 0
(3)  λ ≥ 0, with λ = 0 if xy > 4.

* If xy > 4, then λ = 0, and with (2), we have x² + 3 = 0, which has no solution.
** If xy = 4, then x ≠ 0 and y ≠ 0. By (1), we deduce that λ = −2x, which, inserted in (2), gives 3 − x² = 0. Thus, we have two solutions:

(x, y) = (√3, 4/√3)  with  λ = −2√3,
(x, y) = (−√3, −4/√3)  with  λ = 2√3.

ii) Constraint qualification. Note that g is C 1 in R2 and any point of the set of the constraints g = 0 is an interior point and regular; see Figure 4.19. Indeed, we have

FIGURE 4.19: Graph of z = x²y + 3y − 4 on xy ≥ 4

g′(x, y) = (−y, −x),  rank(g′(x, y)) = 1  for (x, y) ∈ [g = 0],

since g′(x, y) = 0 ⇐⇒ (x, y) = (0, 0) and (0, 0) ∉ [g = 0]. In particular, (√3, 4/√3) and (−√3, −4/√3) are regular points of [g = 0]. Therefore, they are candidate points; see in Figure 4.20 the variations of the values of the function close to these points.

iii) Classification. With m = 1 (the number of active constraints) and n = 2 (the dimension of the space), r, taking values from m + 1 to n = 2, must be equal to r = 2. So, consider the following determinant:

B2(x, y) = det [ 0  gx  gy ; gx  Lxx  Lxy ; gy  Lyx  Lyy ] = det [ 0  −y  −x ; −y  2y  2x + λ ; −x  2x + λ  0 ].

* At (√3, 4/√3), we have λ = −2√3, so 2x + λ = 0 and

B2(√3, 4/√3) = det [ 0  −4/√3  −√3 ; −4/√3  8/√3  0 ; −√3  0  0 ] = −8√3.

Because λ = −2√3 ≤ 0 and (−1)¹ B2(√3, 4/√3) > 0, we deduce that (√3, 4/√3) is a local minimum.

* At (−√3, −4/√3), we have λ = 2√3, so 2x + λ = 0 and

B2(−√3, −4/√3) = det [ 0  4/√3  √3 ; 4/√3  −8/√3  0 ; √3  0  0 ] = 8√3.

Because λ = 2√3 ≥ 0 and (−1)² B2(−√3, −4/√3) > 0, we deduce that (−√3, −4/√3) is a local maximum.

FIGURE 4.20: Local extrema of z = x²y + 3y − 4 on xy ≥ 4
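Because the constraint is active at both candidates, their nature can also be seen by parametrizing the boundary xy = 4. This check is an addition, not from the original text; it assumes NumPy. On the branch y = 4/x the objective becomes a one-variable function whose critical points land exactly at x = ±√3:

    import numpy as np

    # On the boundary xy = 4, substitute y = 4/x into f(x, y) = x^2*y + 3y - 4.
    phi = lambda x: x**2 * (4/x) + 3*(4/x) - 4        # = 4x + 12/x - 4
    dphi = lambda x: 4 - 12/x**2                       # phi'(x) = 0  <=>  x = +/- sqrt(3)

    for x in (np.sqrt(3), -np.sqrt(3)):
        print(round(x, 4), round(phi(x), 4), round(dphi(x), 6))
    # x =  1.7321 gives the local minimum value 8*sqrt(3) - 4 ~  9.856
    # x = -1.7321 gives the local maximum value -8*sqrt(3) - 4 ~ -17.856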

3. – Solve the problem local min f (x, y, z) = x2 +y 2 +z 2

⎧ ⎨ g1 (x, y, z) = x + 2y + z  30 s.t



g2 (x, y, z) = 2x − y − 3z  10

Solution: Note that f , g1 and g2 are C 1 in R3 . The problem is equivalent to the maximization problem

local max −f = −(x2 + y 2 + z 2 )

⎧ ⎨ −g1 = −(x + 2y + z)  −30 s.t



g2 = 2x − y − 3z  10

Consider the Lagrangian L(x, y, z, λ1 , λ2 ) = −(x2 + y 2 + z 2 ) + λ1 (x + 2y + z − 30) − λ2 (2x − y − 3z − 10). Because the constraints are linear, the local candidate points satisfy the KKT conditions: ⎧ (i) ⎪ ⎪ ⎪ ⎪ ⎪ ⎪ ⎪ ⎪ (ii) ⎪ ⎪ ⎪ ⎪ ⎨ (iii) ⎪ ⎪ ⎪ ⎪ ⎪ ⎪ (iv) ⎪ ⎪ ⎪ ⎪ ⎪ ⎪ ⎩ (v)

Lx = −2x + λ1 − 2λ2 = 0 Ly = −2y + 2λ1 + λ2 = 0 Lz = −2z + λ1 + 3λ2 = 0 λ1  0 λ2  0

with with

λ1 = 0 λ2 = 0

if if

x + 2y + z > 30 2x − y − 3z < 10.

Constrained Optimization-Inequality Constraints

263

From the first three equations, we deduce that

x=

1 y = λ 1 + λ2 2

1 λ1 − λ 2 2

z=

1 3 λ1 + λ2 . 2 2

We distinguish several cases: ∗ If x + 2y + z = 30 and 2x − y − 3z = 10, then inserting the expressions of x, y and z above into the two equations gives ⎧ ⎨ 3λ1 + 32 λ2 = 30 54 8 , λ2 = − ⇐⇒ λ1 = ⎩ 3 5 5 − 2 λ1 − 7λ2 = 10 which contradicts λ2  0. ∗∗

If x + 2y + z = 30 and 2x − y − 3z < 10, then λ2 = 0 and 1 1 (x, y, z) = λ1 ( , 1, ) 2 2

which inserted into the equation x + 2y + z = 30 gives λ1 = 10 and (x, y, z) = (5, 10, 5). We have 2x − y − 3z = 2(5) − 10 − 3(5) = −15 < 10. So the point λ1 = 10,

(x, y, z) = (5, 10, 5)

λ2 = 0

is a candidate for optimality. Now, let us study the nature of the point (5, 10, 5). For this, we use the second derivatives test since f , g1 and g2 are C 2 around this point. Since n = 3 and p = 1 (only the constraint g1 is active), then r takes the values p + 1 = 2 to n = 3. First, we consider the matrix     1 1 1 − ∂g − ∂g = −1 −2 −1 g  (x, y, z) = − ∂g ∂x ∂y ∂z Then rank(g  (x, y, z)) = 1. Moreover, the first column vector of g  (5, 10, 5) is linearly independent, so we don’t have to renumber the variables. Next, we have to consider the sign of the following bordered Hessian determinants:   0    1 B2 (5, 10, 5) =  − ∂g  ∂x  ∂g  − 1 ∂y

1 − ∂g ∂x

1 − ∂g ∂y

Lxx

Lxy

Lyx

Lyy

       0 −1 −2    −1 −2 0   = 10. =   −2 0 −2   

264

Introduction to the Theory of Optimization in Euclidean Space

  0    − ∂g1  ∂x B3 (5, 10, 5) =  1  − ∂g  ∂y   − ∂g1 ∂z

1 − ∂g ∂x

1 − ∂g ∂y

1 − ∂g ∂z

Lxx

Lxy

Lxz

Lyx

Lyy

Lyz

Lzx

Lzy

Lzz

         =       

0 −1 −2 −1

−1 −2 0 0

−2 0 −2 0

−1 0 0 −2

    = −24.  

Here, the partial derivatives of g1 are evaluated at the point (5, 10, 5) and the second partial derivatives of L are evaluated at the point (5, 10, 5, 10, 0). We have (−1)2 B2 (5, 10, 5) = 10 > 0

and

(−1)3 B3 (5, 10, 5) = 24 > 0.

We conclude that the point (5, 10, 5) is a local maximum to the maximization problem, or equivalently, a local minimum to the minimization problem. ∗∗∗

If x + 2y + z > 30 and 2x − y − 3z = 10 then λ1 = 0 and 1 3 (x, y, z) = λ2 (1, − , − ) 2 2

which inserted into the equation 2x − y − 3z = 10 gives λ2 = −10/7 < 0 : contradiction. ∗ ∗ ∗∗ If x + 2y + z > 30 and 2x − y − 3z < 10 then λ1 = 0 and λ2 = 0. So (x, y, z) = (0, 0, 0) which contradicts the first above inequality. Conclusion: The minimization problem has one local minimum at the point (5, 10, 5).
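As with the previous problems, this conclusion can be sanity-checked numerically (an added aside, not in the original text; assumes NumPy/SciPy): the solver should return the point (5, 10, 5).

    import numpy as np
    from scipy.optimize import minimize

    # Minimize x^2 + y^2 + z^2 subject to x + 2y + z >= 30 and 2x - y - 3z <= 10.
    objective = lambda v: np.sum(v**2)
    constraints = [
        {"type": "ineq", "fun": lambda v: v[0] + 2*v[1] + v[2] - 30},
        {"type": "ineq", "fun": lambda v: 10 - (2*v[0] - v[1] - 3*v[2])},
    ]

    res = minimize(objective, x0=np.zeros(3), method="SLSQP", constraints=constraints)
    print(np.round(res.x, 4))    # expected: [ 5. 10.  5.]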

4. – Classify the candidates of the problem ⎧ ⎨ g 1 = x2 + y 2 + z 2 = 1 local max(min)f (x, y, z) = x + y + z

s.t



g2 = x − y − z  1.

Solution: i) Note that f , g1 and g2 are C ∞ in R3 and consider the Lagrangian L(x, y, z, λ1 , λ2 ) = f (x, y, z) − λ1 (g1 (x, y, z) − 1) − λ2 (1 − g2 (x, y, z)) = x + y + z − λ1 (x2 + y 2 + z 2 − 1) + λ2 (x − y − z − 1)

Constrained Optimization-Inequality Constraints and let us look for the ⎧ (1) ⎪ ⎪ ⎪ ⎪ ⎪ ⎪ ⎪ ⎪ (2) ⎪ ⎪ ⎪ ⎪ ⎨ (3) ⎪ ⎪ ⎪ ⎪ ⎪ ⎪ (4) ⎪ ⎪ ⎪ ⎪ ⎪ ⎪ ⎩ (5)

265

solutions of the system Lx = 1 − 2xλ1 + λ2 = 0 Ly = 1 − 2yλ1 − λ2 = 0 Lz = 1 − 2zλ1 − λ2 = 0 Lλ1 = −(x2 + y 2 + z 2 − 1) = 0 λ2 = 0

if

x − y − z > 1.

From the first three equations, we deduce that −λ2 = 1 − 2xλ1

λ1 (x + y) = 1

λ1 (x + z) = 1.

Note that λ1 = 0 is not possible because we would have from (1) : λ2 = −1, and from (2) : λ2 = 1. So λ1 = 0 and we have x+y =x+z =

1 λ1

=⇒

y = z.

∗ If x − y − z > 1, then λ2 = 0. We deduce that

1 = 2x, thus λ1

1 = 2x. So x = y = z, which inserted into (4) gives 3x2 = 1. λ1 Hence, we have two points x+y =x+z =

1 1 1 (x, y, z) = ( √ , √ , √ ) 3 3 3

1 1 1 (− √ , − √ , − √ ). 3 3 3

or

But, they do not satisfy x − y − z > 1. ∗ If x − y − z = 1, then with y = z and (4), we have ⎧ ⎨ x = 1 + 2y 1 2 ⇐⇒ (x, y) = (1, 0) or (x, y) = (− , − ). ⎩ 3 3 2y(3y + 2) = 0 We deduce then (x, y, z) = (1, 0, 0)

with

1 2 2 (x, y, z) = (− , − , − ) 3 3 3

λ1 = 1, with

λ2 = 1

λ1 = −1,

1 λ2 = − . 3

266

Introduction to the Theory of Optimization in Euclidean Space

ii) Regularity of the points. We have ⎡ ⎢ g (x, y, z) = ⎣

∂g1 ∂x

∂g1 ∂y

∂g1 ∂z

2 − ∂g ∂x

2 − ∂g ∂y

2 − ∂g ∂z



g  (1, 0, 0) =



2 0 −1 1

0 1



⎤ ⎥ ⎦=



2x −1

1 2 2 g  (− , − , − ) = 3 3 3

2y 1 

− 23 −1

2z 1



− 43 1

− 43 1

 .

Then

1 2 2 rank(g  (1, 0, 0)) = rank(g  (− , − , − )) = 2. 3 3 3 The two points are regular. Moreover, we remark that the first two column vectors are linearly independent and we will not renumber the variables. iii) Classification of the points. Now, let us study the nature of the points (1, 0, 0) and (− 13 , − 23 , − 23 ). For this we use the second derivatives test since f , g1 and g2 are C 2 around these points. We have to consider the sign of the following bordered Hessian determinant:   ∂g1 ∂g1 ∂g1 0  0  ∂x ∂y ∂z     ∂g2 ∂g2 ∂g2   0 0 − ∂x − ∂y − ∂z      ∂g1  2  ∂x − ∂g Lxx Lxy Lxz  ∂x B3 (x, y, z) =      ∂g1 − ∂g2 Lyx Lyy Lyz   ∂y ∂y    ∂g1  2  ∂z − ∂g  L L L zx zy zz ∂z       =   

0 0 2x 2y 2z

0 0 −1 1 1

2x −1 −2λ1 0 0

2y 1 0 −2λ1 0

2z 1 0 0 −2λ1

    .   

The first partial derivatives of g1 and g2 are evaluated at (x, y, z). The second partial derivatives of L are evaluated at (x, y, z, λ1 , λ2 ). ∗ At (1, 0, 0) with      B3 (1, 0, 0) =    

λ1 = 1 and λ2 = 1, we have  0 0 2 0 0  0 0 −1 1 1  2 −1 −2 0 0  = −16 0 1 0 −2 0  0 1 0 0 −2 

(−1)3 B3 = 16 > 0.

Constrained Optimization-Inequality Constraints

267

We conclude that the point (1, 0, 0) is a local maximum to the constrained optimization problem (λ2  0, (−1)3 B3 > 0). ∗∗ At (− 13 , − 23 , − 23 ) with λ1 = −1 and λ2 = − 13 , we have      1 2 2 B3 (− , − , − ) =  3 3 3   

0 0 − 23 − 43 − 43

0 0 −1 1 1

− 23 −1 2 0 0

− 43 1 0 2 0

− 43 1 0 0 2

      = 16    

(−1)2 B3 = 16 > 0.

We conclude that the point (− 13 , − 23 , − 23 ) is a local minimum to the constrained optimization problem (λ2  0, (−1)2 B3 > 0). iii) The set of the constraints is a closed bounded set of R2 as it is the intersection of the unit sphere [g1 = 1] and the region above the plane [g2 = 1]. By the extreme value theorem, f attains its extreme values on this set of the constraints. Therefore, the local points found in ii) are also the global extreme points. Hence, we have max

g1 =1, g2 1

f = f (1, 0, 0) = 1

min

g1 =1, g2 1

5 1 2 2 f = f (− , − , − ) = − . 3 3 3 3

5. – Classify the candidates of the problem ⎧ ⎨ g 1 = x2 + y 2 + z 2 = 1 local max(min)f (x, y, z) = x + y + z

s.t



g2 = x − y − z  1.

Solution: i) Note that f , g1 and g2 are C ∞ in R3 and consider the Lagrangian L(x, y, z, λ1 , λ2 ) = f (x, y, z) − λ1 (g1 (x, y, z) − 1) − λ2 (g2 (x, y, z) − 1) = x + y + z − λ1 (x2 + y 2 + z 2 − 1) − λ2 (x − y − z − 1). We look for the solutions of the system ⎧ (1) ⎪ ⎪ ⎪ ⎪ ⎨ (2) ⎪ ⎪ ⎪ ⎪ ⎩ (3)

Lx = 1 − 2xλ1 − λ2 = 0 Ly = 1 − 2yλ1 + λ2 = 0 Lz = 1 − 2zλ1 + λ2 = 0

⎧ ⎨ (4) ⎩

(5)

Lλ1 = −(x2 + y 2 + z 2 − 1) = 0 λ2 = 0

if

x − y − z < 1.

268

Introduction to the Theory of Optimization in Euclidean Space

From the first three equations, we deduce that λ2 = 1 − 2xλ1

λ1 (x + y) = 1

λ1 (x + z) = 1.

Note that λ1 = 0 is not possible because we would have from (1) : λ2 = −1, and from (2) : λ2 = 1. So λ1 = 0 and we have x+y =x+z =

1 λ1

=⇒

y = z.

∗ If x − y − z < 1, then λ2 = 0. We deduce that

1 = 2x, thus λ1

1 = 2x. So x = y = z, which inserted into (4) gives 3x2 = 1. λ1 Hence, we have two solutions x+y =x+z =



1 1 1 (x, y, z) = ( √ , √ , √ ) 3 3 3

with

1 1 1 (x, y, z) = (− √ , − √ , − √ ) 3 3 3

3 , 2

λ1 =

λ2 = 0

√ 3 , λ1 = − 2

with

λ2 = 0.

∗ If x − y − z = 1, then with y = z and (4), we have ⎧ ⎨ x = 1 + 2y 1 2 ⇐⇒ (x, y) = (1, 0) or (x, y) = (− , − ). ⎩ 3 3 2y(3y + 2) = 0 We deduce then (x, y, z) = (1, 0, 0)

1 2 2 (x, y, z) = (− , − , − ) 3 3 3 ii) Regularity of the points. We have ⎡ ⎢ g (x, y, z) = ⎣ 

 1 1 1 g2 ( √ , √ , √ ) = 3 3 3

λ1 = 1,

with

∂g1 ∂y

∂g1 ∂z

∂g2 ∂x

∂g2 ∂y

∂g2 ∂z

√2 3

√2 3

√2 3

λ1 = −1,

with

∂g1 ∂x

⎤ ⎥ ⎦= 

λ2 = −1



2x 1

2y −1

λ2 =

2z −1

1 . 3



1 1 1 = −g2 (− √ , − √ , − √ ) 3 3 3

Constrained Optimization-Inequality Constraints

269

1 1 1 1 1 1 rank(g2 ( √ , √ , √ )) = rank(g2 (− √ , − √ , − √ )) = 1. 3 3 3 3 3 3 



g (1, 0, 0) =

2 1

0 −1

0 −1



1 2 2 g (− , − , − ) = 3 3 3 



− 23 1

− 43 −1

− 43 −1

 .

1 2 2 rank(g  (1, 0, 0)) = rank(g  (− , − , − )) = 2. 3 3 3 The four points are regular. Moreover, we will not have to renumber the variables since the first two column vectors of each derivative above are linearly independent. 1 1 1 iii) Classification of the points (± √ , ± √ , ± √ ). 3 3 3 Here n = 3, p = 1, thus we have to consider the sign of the following bordered Hessian determinants:   0  B2 =  2x  2y

2x −2λ1 0

2y 0 −2λ1

     

    B3 =   

0 2x 2y 2z

2x −2λ1 0 0

2y 0 −2λ1 0

2z 0 0 −2λ1

    .   

We have 1 1 8 1 1 1 1 B3 ( √ , √ , √ ) = −12 B2 ( √ , √ , √ ) = √ 3 3 3 3 3 3 3 1 1 1 (−1)r Br ( √ , √ , √ ) > 0 r = 2, 3. 3 3 3 Thus, the point is a local maximum since λ1 = 1 > 0 and (−1)2 B2 > 0, (−1)3 B3 > 0. 1 1 8 1 B2 (− √ , − √ , − √ ) = − √ 3 3 3 3

1 1 1 B3 (− √ , − √ , − √ ) = −12 3 3 3

1 1 1 (−1)1 Br (− √ , − √ , − √ ) > 0 3 3 3

r = 2, 3

Thus, the point is a local minimum since λ1 = −1 < 0 and (−1)1 B2 > 0, (−1)1 B3 > 0. 1 2 2 iv) Classification of the points (1, 0, 0), (− , − , − ). 3 3 3 Here n = 3, p = 2, thus we have to consider the sign of the following bordered Hessian determinant:

270

Introduction to the Theory of Optimization in Euclidean Space    0 0 2x 2y 2z    0 0 1 −1 −1    0 0  . B3 (x, y, z) =  2x 1 −2λ1  2y −1 0  0 −2λ 1   2z −1 0 0 −2λ1 

∗ At (1, 0, 0) with λ1 = 1 and λ2 = −1, we have B3 (1, 0, 0) = −16. We conclude that the point cannot be a local maximum because λ2 = −1  0. It cannot also be a local minimum because the Hessian is not semi definite positive at the point on the tangent plane ⎡ ⎤ ⎡ ⎤ ⎡ ⎤   h    0  h  2 0 0 ⎣ k ⎦= 0 M = ⎣ k ⎦: = k⎣ 1 ⎦ : k ∈ R 1 −1 −1 0 l l −1 

0 k

−k





⎤⎡ ⎤ 0 0 0 −2 0 ⎦ ⎣ k ⎦ = −4k2  0 0 −2 −k

−2 ⎣ 0 0

on M.

∗∗ At (− 13 , − 23 , − 23 ) with λ1 = −1 and λ2 = 13 , we have B3 (− 13 , − 23 , − 23 ) = 16. We conclude that the point cannot be a local minimum because λ2 = 1/3  0. It cannot also be a local maximum because the Hessian is not semi definite negative at the point on the tangent plane ⎡ ⎡ ⎤ ⎡ ⎤ ⎤  h     2 0  h  4 4 0 −3 −3 −3 ⎣ k ⎦= M = ⎣ k ⎦: = k ⎣ 1 ⎦: k ∈ R 0 1 −1 −1 l l −1 

0

k

−k





2 ⎣ 0 0

0 2 0

⎤⎡ ⎤ 0 0 0 ⎦ ⎣ k ⎦ = 4k2  0 2 −k

on M.

v) The set of the constraints is a closed bounded set of R2 as it is the intersection of the unit sphere [g1 = 1] and the region below the plane [g2 = 1]. By the extreme value theorem, f attains its extreme values on this set of the constraints. Hence, we have max

g1 =1, g2 1

f (x, y, z) =



3

and

min

g1 =1, g2 1

√ f (x, y, z) = − 3.

4.4  Global Extreme Points-Inequality Constraints

When the Lagrangian is concave/convex on a convex constraint set, a solution of the Karush-Kuhn-Tucker conditions is a global maximum/minimum point.

Theorem 4.4.1 Let Ω ⊂ Rⁿ be an open set and f, g1, . . . , gm : Ω → R be C¹ functions. Let S ⊂ Ω be convex, x∗ ∈ S̊, and

L(x, λ) = f(x) − λ1(g1(x) − b1) − . . . − λm(gm(x) − bm),
∃ λ∗ = (λ∗1, . . . , λ∗m) :  ∇x L(x∗, λ∗) = 0,  λ∗j = 0 if gj(x∗) < bj,  j = 1, . . . , m.

Then, we have

λ∗ ≥ 0 and L(·, λ∗) is concave in x ∈ S  =⇒  f(x∗) = max_{S ∩ {x∈Ω : g(x)≤b}} f(x),
λ∗ ≤ 0 and L(·, λ∗) is convex in x ∈ S  =⇒  f(x∗) = min_{S ∩ {x∈Ω : g(x)≤b}} f(x).

Proof. i) First implication. The point x∗ is a critical point for the Lagrangian L(., λ∗ ) (∇x L(x∗ , λ∗ ) = 0) and L(., λ∗ ) is concave on the convex set S, then x∗ is a global maximum for L(., λ∗ ) on S (by Theorem 2.3.4). Thus, we have L(x∗ , λ∗ ) = f (x∗ ) − λ∗1 (g1 (x∗ ) − b1 ) − . . . − λ∗m (gm (x∗ ) − bm )  f (x) − λ∗1 (g1 (x) − b1 ) − . . . − λ∗m (gm (x) − bm ) = L(x, λ∗ ) At x∗ , we have so

λ∗j  0,

with

λ∗j = 0

−λ∗j (gj (x∗ ) − bj ) = 0

if

gj (x∗ ) < bj

j = 1, . . . , m,

∀x ∈ S.

j = 1, . . . , m


and, the previous inequality reduces to L(x∗ , λ∗ ) = f (x∗ )  f (x) − λ∗1 (g1 (x) − b1 ) − . . . − λ∗m (gm (x) − bm ) = L(x, λ∗ ). For each j = 1, . . . , m, we also have , λ∗j  0 and gj (x) − bj  0, then −λ∗j (gj (x) − bj )  0. Therefore, L(x∗ , λ∗ ) = f (x∗ )  L(x, λ∗ )  f (x)

∀x ∈ S ∩ {x ∈ Ω : g(x)  b}.

Hence x∗ solves the constrained problem. ii) Second implication. This part can be deduced similarly. Moreover, it suggests, for example, when looking for candidates for a maximization problem that we keep the points with negative Lagrange multipliers and see if they are global minima points without maximizing (−f ) and introducing another Lagrangian. Example 1. Solve the problem min(max)f (x, y, z) = x2 + y 2 + z 2

s.t

g(x, y, z) = x − 2z  −5.

Solution: Form the Lagrangian using the C ∞ functions f and g on R3 : L(x, y, z, λ) = x2 + y 2 + z 2 − λ(x − 2z + 5) Let us solve the system ⎧ (i) Lx = 2x − λ = 0 ⎪ ⎪ ⎪ ⎪ ⎪ ⎪ ⎪ ⎪ ⎨ (ii) Ly = 2y = 0 ⎪ ⎪ (iii) Lz = 2z + 2λ = 0 ⎪ ⎪ ⎪ ⎪ ⎪ ⎪ ⎩ (iv) λ = 0 if x − 2z + 5 < 0. ∗ If x − 2z + 5 < 0 , then λ = 0. From the equations (i), (ii) and (iii), we deduce that (x, y, z) = (0, 0, 0). But, then the inequality x − 2z + 5 < 0 is not satisfied. ∗∗ If x − 2z + 5 = 0 , then using (i), (ii) and (iii), we obtain λ = 2x = −z,

y = 0,

x−2z+5 = 0 ⇐⇒ (x, y, z) = (−1, 0, 2) with λ = −2

which is the only candidate point for maximality.

Constrained Optimization-Inequality Constraints

273

Now, we study the convexity/concavity of L have ⎡ 2 HL(.,−2) (x, y, z) = ⎣ 0 0

in (x, y, z) when λ = −2. We ⎤ 0 0 2 0 ⎦ 0 2

The leading principal minors are such that:

∀(x, y, z) ∈ R3 ,

D1 (x, y, z) = 2 > 0,

D2 (x, y, z) = 4 > 0,

D3 (x, y, z) = 8 > 0.

Hence, L(·, −2) is strictly convex in (x, y, z), and we conclude that the point (−1, 0, 2) is the solution to the constrained minimization problem. The maximization problem doesn't have a solution, since there is only one solution to the system and it is a global minimum point. Interpretation. The problem looks for the shortest and farthest distance from the origin to the region of space located below the plane x − 2z + 5 = 0. The shortest distance is attained on the plane.
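Since the nearest point of a half-space is the orthogonal projection of the origin onto its boundary plane, the candidate can also be recovered by the explicit projection formula. This numerical aside is an addition, not in the original text; it assumes NumPy.

    import numpy as np

    # Region: x - 2z <= -5, i.e. a.p <= c with a = (1, 0, -2), c = -5.
    a = np.array([1.0, 0.0, -2.0])
    c = -5.0

    # The origin violates a.p <= c, so its closest feasible point is its projection
    # onto the plane a.p = c:  p* = (c / ||a||^2) * a.
    p_star = (c / np.dot(a, a)) * a
    print(p_star)                      # [-1.  0.  2.]
    print(np.linalg.norm(p_star)**2)   # minimal value of x^2 + y^2 + z^2 = 5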

Remark 4.4.1 The rank condition, at the point x∗ , is not assumed in the theorem. The proof uses the characterization of a C 1 convex function on a convex set only.

Example 2. In Example 4, Section 4.2, the point (1, 1) doesn’t satisfy the rank condition. It solves the KKT conditions related to the problem with linear constraints: max F (x, y) = ln x + ln y 2x + y  3,

x + 2y  3

and

subject to

x+y 2

with

x > 0,

y > 0.

Use concavity to show that (1, 1) solves the problem. Solution: i) With the Lagrangian L(x, y, λ1 , λ2 , λ3 ) =

1 1 ln x+ ln y−λ1 (2x+y−3)−λ2 (x+2y−3)−λ3 (x+y−2), 2 4

the Hessian with respect to (x, y) is ⎡

1 ⎢ − x2 HL(.,λ1 ,λ2 ,λ3 ) (x, y) = ⎣ 0

⎤ 0 ⎥ 1 ⎦ − 2 y

274

Introduction to the Theory of Optimization in Euclidean Space

is negative definite, since the leading principal minors are such that D1(x, y) = −

1 < 0, x2

D2 (x, y) =

1 x2 y 2

>0

for (x, y) ∈ Ω = (0, +∞) × (0, +∞). So the Lagrangian is strictly concave in (x, y) ∈ Ω, and (1, 1) is the maximum point.

Remark 4.4.2 The concavity/convexity hypothesis is a sufficient condition. We may have a global extreme point with a Lagrangian that is neither concave nor convex (see Exercise 3).

Example 3. In Exercise 2, Section 4.3, the points √ 4 ( 3, √ ) 3

with

√ λ = −2 3

and

√ 4 (− 3, − √ ) 3

with

√ λ=2 3

solve respectively the local min and local max problems local max (min)f (x, y) = x2 y + 3y − 4

s.t

g(x, y) = 4 − xy  0.

Are there global extreme points? Solution: i) Let us explore the concavity and convexity of L with respect to (x, y) L(x, y, λ) = x2 y + 3y − 4 + λ(xy − 4) The Hessian matrix of L in (x, y) is    2y Lxx Lxy = HL = 2x + λ Lyx Lyy √ When λ = 2 3, the principal minors are Δ11 = Lyy = 0,

Δ21 = Lxx = 2y

and

So L is neither concave nor convex in (x, y) ∈ R2 . √ Similarly, when λ = −2 3 the principal minors are Δ11 = Lyy = 0,

Δ21 = Lxx = 2y

and

2x + λ 0



Δ2 = −(2x + λ)2 .

√ Δ2 = −(2x − 2 3)2 ,

and L is neither concave nor convex in (x, y). Therefore, we cannot use this sufficient condition to conclude anything about the global optimality of the candidate points.

Constrained Optimization-Inequality Constraints

275

ii) Note that, on the boundary of the constraint set [g  4], we have y = 4/x and f takes the values 12 4 −4 f (x, ) = 4x + x x and

4 4 lim f (x, ) = +∞ and lim f (x, ) = −∞. x→−∞ x x Hence f doesn’t attain an absolute maximum nor an absolute minimum value on the constraint set. x→+∞

Remark. Note that √ √ √ √ 24 24 f (− 3, −4/ 3) = − √ − 4. f ( 3, 4/ 3) = √ − 4 3 3 √ √ √ √ √ √ With and √ minimum √ √ f ( 3,√4/ 3) > f (− 3, −4/ 3), ( 3, 4/ 3) being a local we can see that ( 3, 4/ 3) cannot (− 3, −4/ 3) being a local maximum, √ √ be a global minimum and (− 3, −4/ 3) cannot be a global maximum. A constrained global extreme point would be a local one since any point of the set of the constraints g = 4 is an interior point and regular. Example 4. Quadratic programming. The general quadratic program (QP) can be formulated as min

1 t xQx +t x.d 2

Ax  b

s.t

where Q is a symmetric n × n matrix, d ∈ Rn , b ∈ Rm and A an m × n matrix. Introduce the Lagrangian L(x, λ) = −(

1 t xQx +t x.d) − λ(Ax − b) 2

and write the KKT conditions ⎧ ⎨ ∇x L = −Qx − d −t Aλ = 0 ⎩

λi  0

with

λi = 0

if

(Ax)i < bi .

If (x∗ , λ∗ ) is a solution of the KKT conditions, and x∗ is a candidate point where p constraints are active (Ax)ik = bik , k = 1, . . . , p, then the second derivatives test at the point shows whether the point is a solution or not since the HL (x, λ∗ ) = Q is constant and the constraints are linear. Thus the positivity of the Hessian subject to these constraints is equivalent to test the

276

Introduction to the Theory of Optimization in Euclidean Space

bordered determinants formed from the matrix ⎡ t ⎤ ai1   0 Ap ⎢ ⎥ t Ap = ⎣ ... ⎦ aik is the ik eme row vector of A t Ap Q t aip

Remark 4.4.3 * To sum up, solving an unconstrained or constrained optimization problem leads to solving a nonlinear system F (x, λ) = 0 that appears in different forms no constraints f  (x) = 0 ↓ F(x, λ) = 0 



equality constraints ∇x,λ L(x, λ) = 0

inequality constraints

∇x L(x, λ) = 0,

λ.(g(x) − b) = 0



On the other hand, solving a nonlinear equation is not easy even when F is a polynomial of degree 3 of one variable. ** The importance of the theorems studied comes from - locating the possible candidates - showing how to compare the values of f along the feasible directions. These two points are the start for the development of numerical methods for approaching the solution with accuracy (see [17], [19], [8], [4]). *** The proofs we studied for optimization problems in the Euclidean space constitute a natural step to more complex ones developed in calculus of variation where the maximum and minimum are searched in a class of functions and where the objective function is a function defined on that class (see [16], [6], [9]).

Constrained Optimization-Inequality Constraints

277

Solved Problems

1. – Distance to an hyperplane. Let a ∈ Rn , a = 0, b ∈ R b > 0. i) Solve

min x 2

subject to

t

a.x  b.

ii) Deduce the solution to the following problems.  5 + x2 + y 2 β) max −6x2 − 6y 2 − 6z 2 + 4 α) min −x+y2

2x−y+2z−1

    Solution: i) Let t a = a1 . . . an , t x = x1 . . . xn . The minimization problem looks for points in the region above the hyperplane t

a.x = b

⇐⇒

a 1 x 1 + a2 x 2 + . . . + a n x n = b

that are closest to the origin. It is a nonlinear minimization problem with inequality constraints. We introduce the Lagrangian L(x, λ) = −(x21 + x22 + . . . + x2n ) + λ(a1 x1 + a2 x2 + . . . + an xn − b) and write the KKT conditions ⎧ Lx1 ⎪ ⎪ ⎪ ⎪ ⎪ ⎪ ⎪ ⎪ ⎪ ⎨

= −2x1 + λa1 = 0 .. .

⎪ ⎪ ⎪ Lxn = −2xn + λan = 0 ⎪ ⎪ ⎪ ⎪ ⎪ ⎪ ⎩ λ0 with λ = 0

if

a1 x1 + a2 x2 + . . . + an xn > b.

278

Introduction to the Theory of Optimization in Euclidean Space

Finding a candidate. * If a1 x1 + a2 x2 + . . . + an xn > b, then λ = 0. We get x1 = . . . = xn = 0. But then, we have a contradiction with a1 (0) + . . . + an (0) = 0  b. * If a1 x1 + a2 x2 + . . . + an xn = b, then λ x i = ai , i = 1, . . . , n 2 which inserted in the equation of the hyperplane, we obtain a1

λ

λ

λ

a1 + a 2 a2 + . . . + a n an = b 2 2 2

λ

⇐⇒

2

=

b . a 2

Hence, a solution to the system is xi =

b ai , a 2

⇐⇒

i = 1, . . . , n

x=

b a. a 2

Finding the solution.

Lx 1 x 1 .. .

... .. .

b , consider the Hessian matrix a 2 ⎤ ⎡ ⎤ Lx 1 x n −2 . . . 0 ⎥ ⎢ . .. .. ⎥ .. ⎦ = ⎣ .. . . . ⎦

Lxn x1

...

Lxn xn

To study the concavity of L in x when λ = 2 ⎡ HL(.,λ) (x)

=

⎢ ⎣

0

...

−2

The leading minor principals are equal to Dk (x) = (−2)k , k = 1, . . . , n. The matrix is semi-definite negative. Thus, the point maximizes − x 2 subject to the constraint t ax  b. Hence, the point solves the minimization problem and the minimal distance of the origin to this point is equal to  b  b   . a =  a 2 a ii) α) Note that  min min 5 + x2 + y 2 =

−x+y2

−x+y2

5 + min 5 + x2 + y 2 =

−x+y2

Moreover, we have min

−x+y2

x2 + y 2

=



min⎡ −1 1



.⎣

x y

⎤ ⎦2

(x, y) 2 .

x2 + y 2 .

Constrained Optimization-Inequality Constraints Thus min

−x+y2

and is attained at

x2 + y 2 =



279

√ 2 2  =√ = 2 −1 2 1

(x∗ , y ∗ ) = (−1, 1). Hence min

−x+y2



) 5+

x2

+

y2

=

5+

√ 2.

β) We have max

2x−y+2z−1

−6x2 − 6y 2 − 6z 2 + 4 = 4 − 6

min

−2x+y−2z1

x2 + y 2 + z 2



Thus min

−2x+y−2z1

max

2x−y+2z−1

x2 + y 2 + z 2

=

min 

−2 1 −2

−6x2 − 6y 2 − 6z 2 + 4 = 4 −

and is attained at

(x∗ , y ∗ , z ∗ ) =

⎡ ⎢ .⎢ ⎣

x y z



(x, y, z) 2 ,

⎥ ⎥1 ⎦

6 6 ⎤ = 4 − √ = 2, −2 9 ⎣ 1 ⎦ −2 ⎡

1 (−2, 1, −2). 9

2. – Distance to an hyperplane with positive constraints. i) Let a ∈ Rn , a = 0, b ∈ R b > 0. Solve min x 2

⎧ t ⎨ a.x  b subject to



x0

ii) Minimize x2 + y 2 over the following sets α)

− y  −2, x  0

β)

x − y  2, x  0, y  0

γ)

− x + y  2, x  0, y  0

δ)

x + y  2, x  0, y  0

Sketch graphs to check the solution.

280

Introduction to the Theory of Optimization in Euclidean Space     t x = x1 . . . xn . The minSolution: i) Let t a = a1 . . . an , imization problem looks for points with positive coordinates in the region above the hyperplane t

⇐⇒

a.x = b

a1 x1 + a2 x2 + . . . + an xn = b,

and that are closest to the origin. It is a nonlinear minimization problem with inequality constraints. We introduce the Lagrangian L(x, λ) = −(x21 + x22 + . . . + x2n ) + λ(a1 x1 + a2 x2 + . . . + an xn − b) and write the KKT conditions ⎧ Lx1 ⎪ ⎪ ⎪ ⎪ ⎪ ⎨

= −2x1 + λa1  0 .. .

⎪ ⎪ ⎪ Lxn = −2xn + λan  0 ⎪ ⎪ ⎩ λ0 with λ = 0

(= 0

if x1 > 0)

(= 0

if xn > 0)

a1 x1 + a2 x2 + . . . + an xn > b.

if

Finding a candidate. * If xi = 0 for each i ∈ {1, . . . , n}, then, a1 (0) + . . . + an (0) = 0  b > 0, and we get a contradiction with. * If xi0 > 0 for some i0 ∈ {1, . . . , n}, then λ ai 2 0 then λ > 0 and ai0 > 0. As a consequence, we have a1 x1 +a2 x2 +. . .+an xn = b. Suppose xi > 0 for i ∈ {i0 , i1 , . . . , ip }, and xi = 0 for i = i0 , i1 , . . . , ip . Then, −2xi0 + λai0 = 0

λaj  0

for j = i0 , i1 , . . . , ip

⇐⇒

⇐⇒

xi0 =

aj  0

for j = i0 , i1 , . . . , ip

since λ > 0. Hence, we can write xj =

λ λ max(aj , 0) = (aj )+ 2 2

for j = i0 , i1 , . . . , ip

and get a unified formula for the candidate point x∗ =

λ + a 2

t +

a =



a+ 1

...

a+ n



Constrained Optimization-Inequality Constraints

281

Inserting the expression of x∗ in the equation of the hyperplane, we obtain a1

λ

λ

λ

a+ + a2 a+ + . . . + a n an + = b 1 2 2 2 2

λ

⇐⇒

2

=

b . a+ 2

Hence, a solution to the system is xi =

b a+ , a+ 2 i

i = 1, . . . , n

⇐⇒

x=

b a+ . a+ 2

Finding the solution. To study the concavity of L in x when λ = 2 matrix



HL(.,λ) (x)

=

Lx 1 x 1 ⎢ .. ⎣ . Lxn x1

b , consider the Hessian a+ 2

⎤ Lx 1 x n ⎥ .. ⎦ . Lxn xn

... .. . ...

⎡ =

−2 ⎢ .. ⎣ . 0

... .. . ...

⎤ 0 .. ⎥ . ⎦ −2

The leading minor principals are equal to Dk (x) = (−2)k , k = 1, . . . , n. The matrix is semi-definite negative. Thus, the point maximizes − x 2 subject to the constraint t ax  b and to the positivity constraint x  0. Hence, the point solves the minimization problem and the minimal distance of the origin to this point is equal to   b b    + 2 a+  = + . a a ii) Here, in filling Table 4.4, we have b = 2. a

t +

a

a+

(x∗ , y ∗ )

α

(0, 1)

(0, 1)

1

(0, 2)

β

(1, −1)

(1, 0)

1

(2, 0)

γ

(−1, 1)

(0, 1)

1

(0, 2)

δ

(1, 1)

(1, 1)



√ √ ( 2, 2)

set

t

2

TABLE 4.4: Minima points for x2 + y 2 on the four sets One can easily check the minimal distance of the origin to the given sets from the graphics in Figure 4.21.

FIGURE 4.21: Closest point of the constraint set to the origin
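The closed-form solution x = (b/‖a⁺‖²) a⁺ derived above makes the computations behind Table 4.4 easy to reproduce (an added check, not in the original text; assumes NumPy).

    import numpy as np

    def closest_point(a, b):
        """Minimizer of ||x||^2 over {a.x >= b, x >= 0}, b > 0:  x = b * a_plus / ||a_plus||^2."""
        a_plus = np.maximum(a, 0.0)          # componentwise positive part of a
        return b * a_plus / np.dot(a_plus, a_plus)

    # The four constraint sets of Table 4.4 (b = 2 in every case).
    for label, a in [("alpha", [0, 1]), ("beta", [1, -1]), ("gamma", [-1, 1]), ("delta", [1, 1])]:
        x = closest_point(np.array(a, dtype=float), 2.0)
        print(label, x, "distance:", round(np.linalg.norm(x), 4))
    # alpha -> (0, 2), beta -> (2, 0), gamma -> (0, 2), delta -> (1, 1);
    # the minimal distances b/||a_plus|| are 2, 2, 2 and sqrt(2).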

3. – L not convex nor concave Consider the following minimization problem: ⎧ 2 ⎪ ⎪ y 4−x ⎪ ⎪ ⎨ y  3x min x2 + y 2 s.t ⎪ ⎪ ⎪ ⎪ ⎩ y  −3x i) Sketch the feasible set. ii) Write the problem as a maximization problem in the standard form, and write down the necessary KKT conditions for a point (x∗ , y ∗ ) to be a solution of the problem. iii) Find the points that satisfy the KKT conditions. Check whether or not each point is regular. iv) Determine whether or not the point(s) in part ii) satisfy the secondorder sufficient condition. v) Explore the concavity of the Lagrangian in (x, y) ∈ R2 . vi) What can you conclude about the solution of the problem? vii) Give a geometric interpretation of the problem that confirms the solution you have found (Hint: use level curves).

Solution: i) The feasible set is the plane region located above the curve and the two lines, as described in Figure 4.22.

FIGURE 4.22: The constraint set S

ii) Writing the KKT conditions. The problem is equivalent to the following maximization problem

max (−x2 − y 2 )

subject to

⎧ g1 (x, y) = 4 − x2 − y  0 ⎪ ⎪ ⎪ ⎪ ⎨ g2 (x, y) = 3x − y  0 ⎪ ⎪ ⎪ ⎪ ⎩ g3 (x, y) = −3x − y  0.

Consider the Lagrangian L(x, y, λ, β, γ) = −x2 − y 2 − λ(4 − x2 − y) − β(3x − y) − γ(−3x − y). The conditions are ⎧ (1) ⎪ ⎪ ⎪ ⎪ ⎪ ⎪ ⎪ ⎪ (2) ⎪ ⎪ ⎪ ⎪ ⎨ (3) ⎪ ⎪ ⎪ ⎪ ⎪ ⎪ (4) ⎪ ⎪ ⎪ ⎪ ⎪ ⎪ ⎩ (5)

Lx = −2x + 2λx − 3β + 3γ = 0 Ly = −2y + λ + β + γ = 0 λ0

with

λ=0

if

4 − x2 − y < 0

β0

with

β=0

if

3x − y < 0

γ0

with

γ=0

if

− 3x − y < 0

284

Introduction to the Theory of Optimization in Euclidean Space

iii) Solving the equations satisfying the KKT conditions. • If 4 − x2 − y < 0 then λ = 0 and ⎧ ⎨ −2x − 3β + 3γ = 0 ⎩

−2y + β + γ = 0

then we discuss ∗ Suppose 3x − y < 0, then β = 0. Thus, ⎧ ⎨ −2x + 3γ = 0 ⎩

−2y + γ = 0

we get 2x = 3γ = 6y =⇒ x = 3y. But, then 3x − y = 3(3y) − y = 8y < 0 and hence γ < 0 which contradicts γ  0. ∗ Suppose 3x − y = 0. We have then ⎧ ⎨ −2x − 3β + 3γ = 0 ⎩

=⇒

−2(3x) + β + γ = 0

6γ = 20x

and

3β = 8x  0.

We deduce that x  0 and y  0. So x = y = 0 since −3x − y  0. But, this contradicts 4 − 02 − 0 = 4 < 0. •• If 4 − x2 − y = 0 then ∗ Suppose 3x − y < 0 then β = 0 and ⎧ ⎨ −2x + 2λx + 3γ = 0 ⎩

−2y + λ + γ = 0

– Suppose −3x − y < 0 then γ = 0 and ⎧ ⎨ −2x + 2λx = 0 ⎩

−2y + λ = 0

⇐⇒

⎧ ⎨ 2x(−1 + λ) = 0 ⇐⇒ ⎩

x = 0 or λ = 1

2y = λ  0 √ √ ◦ λ = 1 leads to y = 1/2 and x = ± 7/2. But, for (x, y) = (√7/2, 1/2), the inequality 3x − y < 0 is not satisfied, and for (x, y) = (− 7/2, 1/2), the inequality −3x − y < 0 is not satisfied. So we cannot have λ = 1.

Constrained Optimization-Inequality Constraints

285

◦ x = 0 leads to y = 4 and λ = 8 > 0. The two inequalities 3x − y < 0 and −3x − y < 0 are satisfied at this point. Hence, the following point is a solution: (x∗ , y ∗ ) = (0, 4)

(λ∗ , β ∗ , γ ∗ ) = (8, 0, 0)

with

←−

– Suppose −3x − y = 0 then y = −3x and ⎧ ⎨ −2x + 2λx + 3γ = 0

⎧ ⎨ −2x + 2λx + 3γ = 0 ⎩

=⇒

−2(−3x) + λ + γ = 0



λ + γ = −6x

From 4 − x2 − y = 0, we have y = −3x

and

4 − x2 − y = 0

⇐⇒

(x, y) = (−1, 3)

or

(4, −12).

The point (4, −12) doesn’t satisfy the inequality 3x − y < 0, so it cannot be a solution. The point (−1, 3) satisfies the inequality 3x − y < 0, and we have ⎧ ⎨ −2λ + 3γ = −2 =⇒ (λ, γ) = (4, 2). ⎩ λ+γ =6 Thus, we have another candidate point: (x∗ , y ∗ ) = (−1, 3)

with

(λ∗ , β ∗ , γ ∗ ) = (4, 0, 2)

←−

∗∗ Suppose 3x − y = 0 then y = 3x. We have y = 3x

and

4 − x2 − y = 0

⇐⇒

(x, y) = (−4, −12)

or

(1, 3).

The points (−4, −12) doesn’t satisfy the inequality −3x − y  0, so it cannot be a candidate. The point (1, 3) satisfies the inequality −3x − y < 0, thus γ = 0, and we have ⎧ ⎨ 2λ − 3β = 2 =⇒ (λ, β) = (4, 2). ⎩ λ+β =6

286

Introduction to the Theory of Optimization in Euclidean Space Thus, we have another candidate point: (x∗ , y ∗ ) = (1, 3)

with

(λ∗ , β ∗ , γ ∗ ) = (4, 2, 0)

←−

Regularity of the candidate point (0, 4). Only the constraint g1 (x, y) = 4 − x2 − y is active at (0, 4) and we have     g1 (0, 4) = 0 −1 rank(g1 (0, 4)) = 1. g1 (x, y) = −2x −1 Thus the point (0, 4) is a regular point.

Regularity of the candidate point (−1, 3). Only the constraints g1 (x, y) = 4 − x2 − y and g3 (x, y) = −3x − y are active at (−1, 3) and we have           g1 (x, y) −2x −1 g1 (−1, 3) 2 −1 = = . −3 −1 −3 −1 g3 (x, y) g3 (−1, 3)  Thus the point (1, −3) is a regular point since rank(

g1 (−1, 3) g3 (−1, 3)

 ) = 2.

Regularity of the candidate point (1, 3). Only the constraints g1 (x, y) = 4 − x2 − y and g2 (x, y) = 3x − y are active at (1, 3) and we have           g1 (x, y) −2x −1 g1 (1, 3) −2 −1 = = . 3 −1 3 −1 g2 (x, y) g2 (1, 3)    g1 (1, 3) ) = 2. Thus the point (1, 3) is a regular point since rank( g2 (1, 3) iv) With p = 2 (the number of active constraints) at the points (3, −1) and (3, 1), n = 2 (the dimension of the space), then p = n. The second derivatives test cannot be applied since it is established for p < n. For the point (0, 4), we have p = 1 < 2 = n. We consider the following determinant (r = p + 1 = 2) (Note that the first column vector of [g1 (x, y)] is linearly dependent, so we have to renumber the variables)   0   1 B2 (x, y) =  ∂g ∂y  ∂g  1 ∂x

∂g1 ∂y

Lyy Lxy

∂g1 ∂x

Lyx Lxx

      

=

  0 −1   −1 −2   −2x 0

−2x 0 −2 + 2λ

     

Constrained Optimization-Inequality Constraints

287

∗ At (0, 4), we have λ = 8,   0 −1  B2 (0, 4) =  −1 −2  0 0

0 0 14

    = −14.  

We have (−1)2 B2 (0, 4) = −14 < 0. So the second derivatives test is not satisfied at (0, 4). v) Let us explore the concavity and convexity of L with respect to (x, y) where the Hessian matrix of L in (x, y) is     −2 0 Lxx Lxy = HL = 0 −2 + 2λ Lyx Lyy When λ = 8 or 4, the principal minors are Δ11 = Lyy = −2 + 2λ > 0

Δ21 = Lxx = −2 < 0

Δ2 = 4(1 − λ) < 0.

Therefore, L is neither concave, nor concave in (x, y). vi) We have a situation where the theorems studied remain inconclusive. To conclude, we proceed by comparison. Since, the candidate points are on the boundary of the constraint set, let us study directly the values of the objective function on these points. On the lines y = ±3x, with |x|  1, the function f (x, y) = x2 + y 2 takes the values f (x, ±3x) = x2 + (±3x)2 = 10x2  10 = f (1, ±3)

∀ |x|  1.

On the parabola x2 = 4 − y, with |x|  1,we have f (x, 4 − x2 ) = x2 + (4 − x2 )2 = x4 − 8x2 + 16 + x2 = x4 − 7x2 + 16 = ϕ(x)  ϕ (x) = 4x3 − 14x = 2x(2x2 − 7) = 0 ⇐⇒ x = 0, ± 7/2. By the extreme value theorem, ϕ attains its extreme values on the closed bounded interval [−1, 1] at the critical points inside the interval (−1, 1) or at the end points. Therefore, we have min ϕ(x) = min{ϕ(−1), ϕ(0), ϕ(1)} = min{10, 16, 10} = 10.

[−1,1]

Thus, f (x, 4 − x2 ) = ϕ(x)  10 = f (±1, 3)

∀ |x|  1.

288

Introduction to the Theory of Optimization in Euclidean Space

So we can conclude that the minimum value attained by f on the set of the constraints is 10.

vii) The feasible set is S = {(x, y) :

4 − x2 − y  0,

3x − y  0, 2

−3x − y  0}

2

The level curves of f , with equations : x + y = k where k  0, are circles √ centered at (0, 0) with radius k; see Figure 4.23. If we increase the values of the radius, the values of f increase. The value k = 10 is the first one at which the level curve intersects the constraints g1 = g2 = 0 and g1 = g3 = 0. Thus the value 10 is the minimal value of f reached at (±1, 3). Moreover, the objective function f (x, y) = x2 +y 2 is the square of the distance between (x, y) and (0, 0). So our problem is to find the point(s) in the feasible region that are closest to (0, 0). y

6

S

4

2

4

2

2

4

x

2

4

FIGURE 4.23: Level curves of f and the closest points of S to the origin

4. – The data in Table 4.5 can be found in [8]. Here we consider boundary conditions to illustrate an inequality constrained problem. The Body Fat Index (BFI) measures the fitness of an individual. It is a function of the body density ρ (in units of kilograms per liter) according

Constrained Optimization-Inequality Constraints to Brozek’s formula, BF I =

289

457 − 414.2. ρ

However the accurate measurement of ρ is costing. An alternative solution is to try to describe the dependence of the BFI with respect of five variables x1 , x2 , x3 , x4 , x5 in the form f : x −→ BF I = y = f (x) = a1 x1 + a2 x2 + a3 x3 + a4 x4 + a5 x5 The variables are easier to measure and represent

x1 = weight(lb.) x4 = wrist(cm.)

x2 = height(in.) x5 = neck(cm.)

x3 = abdomen(cm.) y = BF I

Using the following table of measurements, we assume the average of each category x ¯i of measurements satisfying: ¯ 1 + a2 x ¯ 2 + a3 x ¯ 3 + a4 x ¯ 4 + a5 x ¯5  y¯ a1 x

(∗)

hoping to find a model when BF I  y¯ = 15.23. i)

Use a software to find a linear function f which best fits the given data, in the sense of least-squares, i.e., find a that minimizes the sum of the square errors 10  i=1

(f (xi ) − yi )2 =

10 

(a1 xi1 + a2 xi2 + a3 xi3 + a4 xi4 + a5 xi5 − yi )2

s.t

i=1

xi = (xi1 , xi2 , xi3 , xi4 , xi5 ) are the measurements for the ieme individual. ii)

Formulate the constrained problem using matrices. Use Maple to check that the Hessian of the resulting objective function is definite positive on the convex described by (∗).

Solution: We use Maple software for solving the problem. i) Finding the linear regression of best fit. We solve the “least square problem” “LS” with ten linear residuals. The objective function is 1 (154.25a1 + 67.75a2 + 85.2a3 + 17.1a4 + 36.2a5 − 12.7)2 ϕ(a1 , a2 , a3 , a4 , a5 ) = 2 +(173.25a1 + 72.25a2 + 83a3 + 18.2a4 + 38.5a5 − 6.9)2 +(154a1 + 66.25a2 + 87.9a3 + 16.6a4 + 34a5 − 24.6)2 +(184.75a1 + 72.25a2 + 86.4a3 + 18.2a4 + 37.4a5 − 10.9)2 +(184.25a1 + 71.25a2 + 100a3 + 17.7a4 + 34.4a5 − 27.8)2

290

Introduction to the Theory of Optimization in Euclidean Space x1 154.25 173.25 154 184.75 184.25 210.25 181 176 191 198.25

x2 67.75 72.25 66.25 72.25 71.25 74.75 69.75 72.5 74 73.5

x3 85.2 83 87.9 86.4 100 94.4 90.7 88.5 82.5 88.6

x4 17.1 18.2 16.6 18.2 17.7 18.8 17.7 18.8 18.2 19.2

x5 36.2 38.5 34 37.4 34.4 39 36.4 37.8 38.1 42.1

y 12.6 6.9 24.6 10.9 27.8 20.6 19 12.8 5.1 12

TABLE 4.5: Measurements involved in BFI +(210.25a1 + 74.75a2 + 94.4a3 + 18.8a4 + 39a5 − 20.6)2 +(181a1 + 69.75a2 + 90.7a3 + 17.7a4 + 36.4a5 − 19)2 +(176a1 + 72.5a2 + 88.5a3 + 18.8a4 + 37.8a5 − 12.8)2 +(191a1 + 74a2 + 82.5a3 + 18.2a4 + 38.1a5 − 5.1)2 +(198.25a1 + 73.5a2 + 88.6a3 + 19.2a4 + 42.1a5 − 12)2

with(Optimization) : LSSolve( [154.25a1 + 67.75a2 + 85.2a3 + 17.1a4 + 36.2a5 − 12.6, 173.25a1 + 72.25a2 + 83a3 + 18.2a4 + 38.5a5 − 6.9, 154a1 + 66.25a2 + 87.9a3 + 16.6a4 + 34a5 − 24.6, 184.75a1 + 72.25a2 + 86.4a3 + 18.2a4 + 37.4a5 − 10.9, 184.25a1 + 71.25a2 + 100a3 + 17.7a4 + 34.4a5 − 27.8, 210.25a1 + 74.75a2 + 94.4a3 + 18.8a4 + 39a5 − 20.6, 181a1 +69.75a2 +90.7a3 +17.7a4 +36.4a5 −19, 176a1 +72.5a2 +88.5a3 +18.8a4 +37.8a5 −12.8, 191a1 +74a2 +82.5a3 +18.2a4 +38.1a5 −5.1, 198.25a1 +73.5a2 +88.6a3 +19.2a4 +42.1a5 −12], {180.7a1 + 71.425a2 + 88.72a3 + 18.05a4 + 37.39a5 ≤ 15.23}) [15.0549945448635683, [a1 = 0.474753096134219e − 1, a2 = −1.03634130223772, a3 = 1.22920301075594, a4 = −1.86308283592359, a5 = .140089140413700]] Thus f (x1 , x2 , x3 , x4 , x5 ) ≈ 0.474x1 − 1.036x2 + 1.229x3 − 1.863x4 + .140x5 f can be used to predict an individual’s body fat index, based upon the five measurements types. Comments. - Least square problems are solved by the LSSolve command. - When the residuals in the objective function and the constraints are all linear, which is the case here, then an active set method is used. This is an approximate method [19],[22], [17]. -The LSSolve command uses various methods implemented in a built in library provided by a group of numerical algorithms. ii) Finding the linear regression of best fit using matrices. Let G = (xi1 , xi2 , xi3 , xi4 , xi5 )i=1,...,10 ∈ M10;5 be the matrix whose rows are the vectors xi , or equivalently, the matrix whose columns are the five first columns entries of the table. Let

Constrained Optimization-Inequality Constraints

291

c be the last column entry of the table. Denote a =t (a1 , a2 , a3 , a4 , a5 ),

A =t (180.7, 71.425, 88.72, 18.05, 37.39),

then ϕ(a) = ϕ(a1 , a2 , a3 , a4 , a5 ) =

1 1 ((G.a − c)1 )2 + . . . + ((G.a − c)10 )2 = G.a − c 2 2 2

and the problem can be expressed as 1 G.a − c 2 subject to Aa  b. 2 Following Maple’s instructions, we enter the data using matrices min

with(Optimization) : c := V ector([12.6, 6.9, 24.6, 10.9, 27.8, 20.6, 19, 12.8, 5.1, 12], datatype = f loat) : G := M atrix([[154.25, 67.75, 85.2, 17.1, 36.2], [173.25, 72.25, 83, 18.2, 38.5], [154, 66.25, 87.9, 16.6, 34], [184.75, 72.25, 86.4, 18.2, 37.4], [184.25, 71.25, 100, 17.7, 34.4], [210.25, 74.75, 94.4, 18.8, 39], [181, 69.75, 90.7, 17.7, 36.4], [176, 72.5, 88.5, 18.8, 37.8], [191, 74, 82.5, 18.2, 38.1], [198.25, 73.5, 88.6, 19.2, 42.1]], datatype = f loat) : with(Statistics) : A := M ean(G) : b := M ean(c) : A := M atrix([[180.7, 71.425, 88.72, 18.05, 37.39]], datatype = f loat) : b := V ector([15.23], datatype = f loat) : lc := [A, b] : LSSolve([c, G], lc) : ⎡



⎢ ⎢ ⎢ 15.0549945448635683, ⎣

⎢ ⎢ ⎢ ⎣

0.0474753096134219 −1.03634130223772 1.22920301075594 −1.86308283592359 0.140089140413700

⎤ ⎤ ⎥ ⎥ ⎥ ⎦

⎥ ⎥ ⎥ ⎦

Hence, we obtain the same coefficients ai . The Hessian of ϕ is ϕ(a) = =

b = 15.23

1 1 G.a − c 2 = (G.a − c).(G.a − c) 2 2

1 1 ( G.a 2 − 2t c.G.a + c 2 ) = (t at GG.a − 2t c.G.a + c 2 ) 2 2 ϕ (a) =t GG.a − G.c ϕ (a) =t GG

Checking that the Hessian is definite positive. with(LinearAlgebra) H := M ultiply(T ranspose(G), G) : IsDef inite(H) true

292

Introduction to the Theory of Optimization in Euclidean Space

4.5

Dependence on Parameters

The cost to produce an output Q is equal to rK + wL where r and w are respectively the prices of the input capital K and labor L. The firm would like the output to obey the Cobb-Douglas production function Q = cK a Lb (r > 0, w > 0, c > 0, a + b < 1). Thus, to minimize the cost of production, the problem is expressed as: min rK + wL

subject to

cK a Lb = Q

with (K, L) ∈ (0, +∞) × (0, +∞). Using Lagrange’s multiplier method, the unique solution is (see Example 1, Section 3.4) 1 a b Q a+b aQ bQ aQ a+b bQ a+b L∗ = λ λ∗ = . K∗ = λ r w c r w One can see the dependence of the extreme point on the parameters r, w, c, a, b. In general, it is not easy to express explicitly the solution with respect of many parameters. On the other hand, changing the parameters and solving a new optimization problem is costing or difficult. An alternative solution is to have an estimate on how much the optimal value changes compared to an initial situation. To set the main result of this section, we suppose the objective function f and the constraint function g depending on a parameter r ∈ Rk , i.e. f (x, r) = f (x1 , . . . , xn , r1 , . . . , rk ), g = (g1 , . . . , gn ), g(x, r) = g(x1 , . . . , xn , r1 , . . . , rk ) I(x(r)) = {i ∈ {1, · · · , m} : gi (x(r), r) < 0}, Consider the problem (Pr ) f ∗ (r) = local max f (x, r)

(resp. local min)

s.t

g(x, r)  0

and introduce the Lagrangian L(x, λ, r) = f (x, r) − λ1 g1 (x, r) − . . . − λm gm (x, r).

Hypothesis (Hr ). f and g are C 2 functions in a neighborhood of x∗ and for each r ∈ Bδ (¯ r) ⊆ Rk such that: " ! p 0

r) ⊆ A, B1 (x∗ ) × B2 (λ∗p ) × Bη (¯ det(∇x,λp F (x, λp , r)) = 0

in

B1 (x∗ ) × B2 (λ∗p ) × Bη (¯ r)

such that r), ∀r ∈ Bη (¯ (x, λp ) :

∃!(x, λp ) ∈ B1 (x∗ ) × B2 (λ∗p ) :

F (x, λp , r) = 0

Bη (¯ r) −→ B1 (x∗ ) × B2 (λ∗p ) r −→ (x(r), λp (r))

are C 1 functions.

296

Introduction to the Theory of Optimization in Euclidean Space

Remark 4.5.2
* In the theorem above, the local max (min) problem can be replaced by the max (min) problem, provided we assume, for example,

        ∀r ∈ B(r̄, δ),   x −→ L(x, λ*, r) is strictly concave (resp. convex).

* For the unconstrained case, L is reduced to f, F(x, r) = ∇x f(x, r) and det(∇x F(x*, r̄)) = det Hf(x*, r̄).

Example 1. Suppose that when a firm produces and sells x units of a commodity, it has a revenue R(x) = x, while the cost is C(x) = x².
i) Find the optimal number of units of the commodity that maximizes the profit.
ii) Find the approximate change of the optimal profit if the revenue changes to 0.99x.

Solution: i) The profit is given by

        P(x) = R(x) − C(x) = x − x²        with x > 0.

Since the constraint set S = (0, +∞) is an open set and the profit function is regular, the optimal point, if it exists, is a critical point, solution of the equation

        dP/dx = 1 − 2x = 0   ⇐⇒   x = 1/2.

Moreover, we have d²P/dx² = −2 < 0, so P is strictly concave and x* = 1/2 is the global maximum point on S.

ii) Introduce the parameter r and consider, for r > 0,

        P(x, r) = rx − C(x) = rx − x²        with x > 0.

Proceeding as in i), one can verify that:
1. For r close to 1, we have ∂²P/∂x²(x, r) = −2 < 0; thus P(·, r) is concave in x.
2. The second order condition for strict maximality is satisfied when r = 1.
3. P(1/2, 1) = max_S P(x, 1) = 1/2 − 1/4 = 1/4.

As a consequence,
– ∃η > 0 such that the function P*(r) = max_{x∈S} P(x, r) is defined for any r ∈ (1 − η, 1 + η),
– P* is C¹, and
– dP*/dr(1) = ∂P/∂r(x, r) |_{x=1/2, r=1} = x |_{x=1/2, r=1} = 1/2.

We can write the following approximation

        P*(r) ≈ P*(1) + dP*/dr(1)(r − 1) = 1/4 + (1/2)(r − 1)        for r close to 1.

In particular, for r = 0.99, the objective function P* takes the approximate value

        P*(0.99) ≈ 0.25 + 0.5(0.99 − 1) = 0.25 − 0.5(0.01) = 0.245,

and the approximate change in the maximum profit is

        P*(0.99) − P*(1) ≈ −0.5(0.01) = −0.005.

* Note that, for this example, we easily obtain the exact value of the objective function P*; see Figure 4.24. Indeed, we have

        P*(r) = P(x*(r), r) = P(r/2, r) = r(r/2) − (r/2)² = r²/4,

from which we deduce

        P*(0.99) = (0.99)²/4 = 0.245025.

We also have the equality

        dP*/dr = ∂P(x, r)/∂r |_{x=x*(r)} = x*(r) = r/2.
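A few lines of plain Python (illustrative only) confirm how close the linear approximation is to the exact value for r = 0.99:

    def P(x, r):
        return r * x - x**2          # profit with revenue r x

    def P_star_exact(r):
        return r**2 / 4              # exact optimal value, attained at x*(r) = r/2

    r = 0.99
    approx = 0.25 + 0.5 * (r - 1)    # P*(1) + dP*/dr(1)(r - 1)
    print("linear approximation:", approx)           # 0.245
    print("exact optimal value :", P_star_exact(r))  # 0.245025
    print("error:", P_star_exact(r) - approx)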


FIGURE 4.24: Highest profit for r = 1 and r = 0.99


Remark 4.5.3 In particular, when r = b, f(x, r) = f(x) and g(x, r) = g(x) − b, we have

        ∂f*/∂bj(b̄) = ∂L/∂bj(x, λ, b) |_{x=x*, λ=λ*, b=b̄} = λj(b̄),        j = 1, . . . , m.

This tells us that the Lagrange multiplier λj = λj(b̄) for the jth constraint is the rate at which the optimal value function changes with respect to the parameter bj at the point b̄. Using the linear approximation formula,

        f*(b) − f*(b̄) ≈ ∂f*/∂b1(b̄)(b1 − b̄1) + · · · + ∂f*/∂bm(b̄)(bm − b̄m)
                      = λ1(b̄)(b1 − b̄1) + · · · + λm(b̄)(bm − b̄m),

the change in the optimal value function can be estimated when one or more components of the resource vector are slightly changed.

Example 2. For b close to 3, estimate

        f*(b) = local max f(x, y, z) = xy + yz + xz        subject to        x + y + z = b,

knowing that (see Example 2, Section 3.3)

        f*(3) = f(1, 1, 1) = 3,        λ(3) = 2,        (−1)^r Br(1, 1, 1) > 0 for r = 2, 3.

Solution: We can deduce that f* ∈ C¹(3 − η, 3 + η) for some η > 0, and write the linear approximation

        f*(b) ≈ f*(3) + ∂f*/∂b(3)(b − 3)        for b close to 3.

If we denote by L(x, y, z, λ, b) = xy + yz + xz − λ(x + y + z − b) the Lagrangian associated with the constrained maximization problem, then we have

        ∂f*/∂b(3) = ∂L/∂b(x(b), y(b), z(b), λ(b), b) |_{b=3} = λ(b) |_{b=3} = 2,

so that

        f*(b) ≈ 3 + 2(b − 3)        for b close to 3.
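This estimate can be checked against a direct numerical solution; the sketch below (illustrative only, assuming SciPy is available and using its SLSQP routine merely as a convenient local solver) solves the constrained problem for values of b near 3 and compares with 3 + 2(b − 3):

    import numpy as np
    from scipy.optimize import minimize

    def f_star(b):
        obj = lambda v: -(v[0]*v[1] + v[1]*v[2] + v[0]*v[2])   # maximize by minimizing the negative
        cons = [{"type": "eq", "fun": lambda v: v[0] + v[1] + v[2] - b}]
        res = minimize(obj, x0=np.ones(3), method="SLSQP", constraints=cons)
        return -res.fun

    for b in (3.0, 3.1, 2.9):
        print("b =", b, " numerical:", round(f_star(b), 4), " estimate:", 3 + 2*(b - 3))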


Solved Problems

1. – Irregular value function.
i) Show that the value function f*(r) = max_{x∈[−1,1]} (x − r)² is not differentiable on R. Is there a contradiction with the theorem?
ii) Can you expect regularity for the value function g*(r) = min_{x∈[−1,1]} (x − r)²?

Solution: This example shows that the optimal value function is not necessarily regular. Indeed, set

        y = f(x, r) = (x − r)²,        f*(r) = max_{x∈[−1,1]} f(x, r).

We have

        y′ = dy/dx = fx(x, r) = 2(x − r).

We distinguish different cases:

∗ r ∈ (−1, 1): From Table 4.6, we deduce the maximum value.

        x               |  −1                r                1
        y′ = 2(x − r)   |           −        0        +
        y = (x − r)²    |  (1 + r)²    ↘     0    ↗    (1 − r)²

        TABLE 4.6: Variations of y = (x − r)² when r ∈ (−1, 1)

        max_{x∈[−1,1]} (x − r)² = max{(1 + r)², (1 − r)²} = f*(r).


        x               |  −1                          1
        y′ = 2(x − r)   |              −
        y = (x − r)²    |  (1 + r)²    ↘        (1 − r)²

        TABLE 4.7: Variations of y = (x − r)² when r ∈ (1, +∞)

        x               |  −1                          1
        y′ = 2(x − r)   |              +
        y = (x − r)²    |  (1 + r)²    ↗        (1 − r)²

        TABLE 4.8: Variations of y = (x − r)² when r ∈ (−∞, −1)

∗∗ r ∈ (1, +∞): Using Table 4.7, we obtain

        max_{x∈[−1,1]} (x − r)² = (1 + r)² = f*(r).

∗∗ r ∈ (−∞, −1): Table 4.8 shows that

        max_{x∈[−1,1]} (x − r)² = (1 − r)² = f*(r).

Conclusion: Note that (1 + r)² − (1 − r)² = 4r; then

        f*(r) = (1 − r)²  if r < 0,        f*(r) = 1  if r = 0,        f*(r) = (1 + r)²  if r > 0.

For r ≠ 0, f* is differentiable since it is a polynomial. For r = 0, we have

        (f*(r) − f*(0))/(r − 0) = ((1 − r)² − 1)/r = −(2 − r)   if r < 0,
        (f*(r) − f*(0))/(r − 0) = ((1 + r)² − 1)/r = 2 + r      if r > 0.

Hence

        lim_{r→0−} (f*(r) − f*(0))/(r − 0) = −2,        lim_{r→0+} (f*(r) − f*(0))/(r − 0) = 2,

and f* is not differentiable at 0.


This doesn't contradict the theorem, since the regularity of f* was proved when x* is an interior point, which is not the case here with x* = ±1. Indeed, we have f(x, 0) = x² and f*(0) = f(±1, 0) = 1.

ii) Set f(x) = f(x, 0) = x². We have

        min_{x∈[−1,1]} x² = 0 = f(0) = g*(0),        0 ∈ (−1, 1),        f″(x) = 2 > 0.

So f attains its minimal value at the interior point 0, where the second derivatives test is satisfied. Moreover, f is convex on [−1, 1], so 0 is the global minimum point. Therefore, g* is regular for r close to 0; that is, ∃η > 0 such that g* ∈ C¹(−η, η). In fact, from i), we have exactly g*(r) = 0 for r ∈ (−1, 1), which is a regular function.
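Returning to part i), a few lines of plain Python (illustrative only) exhibit the kink of f* at r = 0: the difference quotients approach −2 from the left and +2 from the right.

    def f_star(r):
        # the maximum of (x - r)^2 over [-1, 1] is attained at an endpoint
        return max((1 - r)**2, (1 + r)**2)

    for r in (-0.1, -0.01, 0.01, 0.1):
        slope = (f_star(r) - f_star(0.0)) / r
        print("r =", r, " difference quotient =", round(slope, 3))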

2. – Find an approximate value of

        max_{R²}  (1.05)² x + 5y sin(0.01) − 2x² − 3y².

Solution: Since, we are looking for an estimate of the maximal value, we will proceed using the linear approximation for a suitable function. First, we remark that 1.05 ≈ 1, 0.01 ≈ 0 and sin(0.01) ≈ 0. So, if we introduce the function f (x, y, r, s) = r2 x + 5y sin(s) − 2x2 − 3y 2 where r and s are parameters, then the problem seems like a perturbation of the simpler problem max f (x, y, 1, 0) = x − 2x2 − 3y 2 to the given problem max f (x, y, 1.05, 0.01). Solving

x − 2x2 − 3y 2 . max 2 R

Since R2 is an open set, a global extreme point of f (x, y) = f (x, y, 1, 0) is also a local extreme point. Therefore, it is a stationary point of f (x, y) = x−2x2 −3y 2 (f is a polynomial, it is C ∞ ). We have ∇f (x, y) = 1 − 4x, −6y = 0, 0

⇐⇒

1 (x, y) = ( , 0). 4

The only stationary point is ( 14 , 0). The Hessian matrix is   −4 0 Hf (x, y) = 0 −6


The leading principal minors are D1(x, y) = −4 < 0 and D2(x, y) = 24 > 0. Hence, f is strictly concave on R² and we conclude that (x*, y*) = (1/4, 0) is a global maximum point, and the only one.

Linear approximation. We have
1. H_{f(·,·,r,s)}(x, y) = [ −4  0 ; 0  −6 ]   =⇒   f(·, ·, r, s) is concave on R² for any (r, s) ∈ R².
2. f(1/4, 0) = max_{R²} f(x, y, 1, 0) = 1/8.
3. The second order condition for strict maximality is satisfied when (r, s) = (1, 0) at the point (x, y) = (1/4, 0).

As a consequence,
– ∃η > 0 such that the function f*(r, s) = max_{(x,y)∈R²} f(x, y, r, s) is defined for any (r, s) ∈ Bη(1, 0),
– f* is C¹(Bη(1, 0)), and
–
        ∂f*/∂r(1, 0) = ∂f/∂r |_{(x,y)=(1/4,0), (r,s)=(1,0)} = 2rx |_{(x,y)=(1/4,0), (r,s)=(1,0)} = 1/2,
        ∂f*/∂s(1, 0) = ∂f/∂s |_{(x,y)=(1/4,0), (r,s)=(1,0)} = 5y cos(s) |_{(x,y)=(1/4,0), (r,s)=(1,0)} = 0.

We can write the following approximation, for (r, s) close to (1, 0),

        f*(r, s) ≈ f*(1, 0) + ∂f*/∂r(1, 0)(r − 1) + ∂f*/∂s(1, 0)(s − 0) = 1/8 + (1/2)(r − 1).

In particular, for (r, s) = (1.05, 0.01), the objective function f* takes the approximate value

        f*(1.05, 0.01) ≈ 0.125 + (1/2)(1.05 − 1) = 0.125 + 0.025 = 0.15,

and the approximate change in the maximum value is

        f*(1.05, 0.01) − f*(1, 0) ≈ (1/2)(1.05 − 1) = 0.025.
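Since f(·, ·, r, s) is strictly concave, its exact maximum is also available in closed form (at x = r²/4, y = 5 sin(s)/6), so the estimate can be checked directly; the following plain-Python sketch (illustrative only) does this:

    import math

    def exact_max(r, s):
        x, y = r**2 / 4, 5 * math.sin(s) / 6        # stationary point of the concave objective
        return r**2 * x + 5 * y * math.sin(s) - 2 * x**2 - 3 * y**2

    r, s = 1.05, 0.01
    print("linear estimate:", 0.125 + 0.5 * (r - 1))   # 0.15
    print("exact maximum  :", exact_max(r, s))          # about 0.152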


3. – Consider the problem

        min(max) f(x, y, z) = e^x + y + z        s.t        g1 = x + y + z = 1,    g2 = x² + y² + z² = 1.

i) Apply Lagrange's theorem to the problem to show that there are four points satisfying the necessary conditions.
ii) Show that each point is a regular point.
iii) What can you conclude about the global minimal and maximal values of f subject to g1 = g2 = 1? Justify your answer.
iv) Replace the constraints by x + y + z = a and x² + y² + z² = b with (a, b) close to (1, 1) (a > 0, b > 0).
- What is the approximate change in the optimal value function f*(a, b) = min_{g1=a, g2=b} f(x, y, z)?
- What is the approximate change in the optimal value function F*(a, b) = max_{g1=a, g2=b} f(x, y, z)?

Solution: i) Note that f, g1 and g2 are C^∞ in R³. Consider the Lagrangian

        L(x, y, z, λ1, λ2) = e^x + y + z − λ1(x + y + z − 1) − λ2(x² + y² + z² − 1)

and look for its stationary points, solutions of the system ∇L(x, y, z, λ1, λ2) = 0_{R⁵}:

        (1) Lx = e^x − λ1 − 2xλ2 = 0
        (2) Ly = 1 − λ1 − 2yλ2 = 0
        (3) Lz = 1 − λ1 − 2zλ2 = 0
        (4) Lλ1 = −(x + y + z − 1) = 0
        (5) Lλ2 = −(x² + y² + z² − 1) = 0.

From equations (2) and (3), we deduce that

        (z − y)λ2 = 0   =⇒   z = y   or   λ2 = 0.


∗ If λ2 = 0, we deduce from equation (2) that λ1 = 1 and then, from equation (1), that x = 0. Hence, equations (4) and (5) give

        2y² − 2y = 0   =⇒   y = 0   or   y = 1.

Therefore, we have the two points

        (0, 1, 0),   (0, 0, 1)        with        (λ1, λ2) = (1, 0).

∗∗ If z = y, then equations (4) and (5) give

        x = 1 − 2y   and   6y² − 4y = 0   =⇒   y = 0   or   y = 2/3.

Therefore, we have the two points

        (1, 0, 0)           with   λ1 = 1                      and   λ2 = (1/2)(e − 1),
        (−1/3, 2/3, 2/3)    with   λ1 = (1/3)(1 + 2e^{−1/3})   and   λ2 = (1/2)(1 − e^{−1/3}).
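The four candidates can be verified by substituting them back into the system (1)–(5); the following sketch (illustrative only, assuming NumPy for convenience) prints the largest residual at each point, which should vanish up to rounding:

    import math
    import numpy as np

    def grad_L(x, y, z, l1, l2):
        return np.array([
            math.exp(x) - l1 - 2 * x * l2,    # (1)
            1 - l1 - 2 * y * l2,              # (2)
            1 - l1 - 2 * z * l2,              # (3)
            -(x + y + z - 1),                 # (4)
            -(x**2 + y**2 + z**2 - 1),        # (5)
        ])

    e = math.e
    candidates = [
        (0, 1, 0, 1.0, 0.0),
        (0, 0, 1, 1.0, 0.0),
        (1, 0, 0, 1.0, (e - 1) / 2),
        (-1/3, 2/3, 2/3, (1 + 2 * math.exp(-1/3)) / 3, (1 - math.exp(-1/3)) / 2),
    ]
    for pt in candidates:
        print(pt[:3], " max residual:", np.abs(grad_L(*pt)).max())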

ii) Consider the matrix

        g′(x, y, z) = [ ∂g1/∂x   ∂g1/∂y   ∂g1/∂z ]   =   [  1     1     1  ]
                      [ ∂g2/∂x   ∂g2/∂y   ∂g2/∂z ]       [ 2x    2y    2z  ]

        g′(0, 1, 0) = [ 1  1  1 ]        g′(0, 0, 1) = [ 1  1  1 ]
                      [ 0  2  0 ]                      [ 0  0  2 ]

        g′(1, 0, 0) = [ 1  1  1 ]        g′(−1/3, 2/3, 2/3) = [   1      1      1  ]
                      [ 2  0  0 ]                             [ −2/3    4/3    4/3 ]

Each critical point is regular. We remark that the first two column vectors in the matrices g′(−1/3, 2/3, 2/3), g′(0, 1, 0) and g′(1, 0, 0) are linearly independent, while they are linearly dependent in g′(0, 0, 1). Therefore, we can keep the matrices without renumbering the variables when applying the second derivatives test for the first three points, and change the order of the variables for the last one.

iii) Now, f is continuous on the constraint set, which is a closed and bounded curve of R³, being the intersection of the unit sphere x² + y² + z² = 1 and the plane x + y + z − 1 = 0. So, by the extreme value theorem, f attains its optimal values at points that are also critical points of the Lagrangian. Comparing the values of f at these points, we obtain

        2 < f(−1/3, 2/3, 2/3) = e^{−1/3} + 4/3 ≈ 2.0498 < e.

Hence

        min_{g1=1, g2=1} f(x, y, z) = f(0, 1, 0) = f(0, 0, 1) = 2        and        max_{g1=1, g2=1} f(x, y, z) = f(1, 0, 0) = e.

iv) ∗ If we denote

        f*(a, b) = min_{g1=a, g2=b} f(x, y, z),

then f* is regular for (a, b) close to (1, 1) because we have:
1. for (a, b) close to (1, 1), there exists a solution to the constrained minimization problem by the extreme value theorem (because f is continuous on the closed bounded set x + y + z = a, x² + y² + z² = b);
2. (0, 1, 0) and (0, 0, 1) are solutions to the constrained minimization problem when (a, b) = (1, 1) and are regular points;
3. the second order condition for minimality is satisfied when (a, b) = (1, 1) at (0, 1, 0) and (0, 0, 1). Indeed, n = 3 and m = 2, so we have to consider the sign of the following bordered Hessian determinant:

        B3(x, y, z) = | 0          0          ∂g1/∂x   ∂g1/∂y   ∂g1/∂z |
                      | 0          0          ∂g2/∂x   ∂g2/∂y   ∂g2/∂z |
                      | ∂g1/∂x     ∂g2/∂x     Lxx      Lxy      Lxz    |
                      | ∂g1/∂y     ∂g2/∂y     Lyx      Lyy      Lyz    |
                      | ∂g1/∂z     ∂g2/∂z     Lzx      Lzy      Lzz    |

                    = | 0    0     1            1       1     |
                      | 0    0     2x           2y      2z    |
                      | 1    2x    e^x − 2λ2    0       0     |
                      | 1    2y    0            −2λ2    0     |
                      | 1    2z    0            0       −2λ2  |

        B3(0, 1, 0) = | 0  0  1  1  1 |
                      | 0  0  0  2  0 |
                      | 1  0  1  0  0 |  = 4        =⇒        (−1)² B3(0, 1, 0) = 4 > 0.
                      | 1  2  0  0  0 |
                      | 1  0  0  0  0 |

We change the variables in the order (x, z, y) to compute B3(0, 0, 1) and obtain

        B3(0, 0, 1) = | 0  0  1  1  1 |
                      | 0  0  0  2  0 |
                      | 1  0  1  0  0 |  = 4        =⇒        (−1)² B3(0, 0, 1) = 4 > 0.
                      | 1  2  0  0  0 |
                      | 1  0  0  0  0 |


Consequently, with the new Lagrangian

        L_{a,b}(x, y, z, λ1, λ2) = e^x + y + z − λ1(x + y + z − a) − λ2(x² + y² + z² − b),

we have

        f*(1, 1) = f(0, 1, 0) = f(0, 0, 1) = 2        with        λ1(1, 1) = 1  and  λ2(1, 1) = 0,

        ∂f*/∂a(1, 1) = ∂L_{a,b}/∂a |_{(x,y,z,λ1,λ2)=(0,1,0,λ1(1,1),λ2(1,1))} = λ1(1, 1) = 1,
        ∂f*/∂b(1, 1) = ∂L_{a,b}/∂b |_{(x,y,z,λ1,λ2)=(0,1,0,λ1(1,1),λ2(1,1))} = λ2(1, 1) = 0,

        f*(a, b) ≈ f*(1, 1) + ∂f*/∂a(1, 1)(a − 1) + ∂f*/∂b(1, 1)(b − 1) = 2 + (a − 1) + (0)(b − 1) = a + 1.

∗∗ If we denote

        F*(a, b) = max_{g1=a, g2=b} f(x, y, z),

then F* is regular for (a, b) close to (1, 1) because we have:
1. for (a, b) close to (1, 1), there exists a solution to the constrained maximization problem by the extreme value theorem (because f is continuous on the closed bounded set x + y + z = a, x² + y² + z² = b);
2. (1, 0, 0) is the solution to the constrained maximization problem when (a, b) = (1, 1) and it is a regular point;
3. the second order condition for maximality is satisfied when (a, b) = (1, 1) at (1, 0, 0). Indeed, n = 3 and m = 2, so we have to consider the sign of the following bordered Hessian determinant:

        B3(1, 0, 0) = | 0  0  1      1      1     |
                      | 0  0  2      0      0     |
                      | 1  2  1      0      0     |  = 8(1 − e) < 0        =⇒        (−1)³ B3(1, 0, 0) = 8(e − 1) > 0.
                      | 1  0  0      1 − e  0     |
                      | 1  0  0      0      1 − e |

Consequently, we have

        F*(1, 1) = f(1, 0, 0) = e        with        λ1(1, 1) = 1  and  λ2(1, 1) = (1/2)(e − 1).

        ∂F*/∂a(1, 1) = ∂L_{a,b}/∂a |_{(x,y,z,λ1,λ2)=(1,0,0,λ1(1,1),λ2(1,1))} = λ1(1, 1) = 1,
        ∂F*/∂b(1, 1) = ∂L_{a,b}/∂b |_{(x,y,z,λ1,λ2)=(1,0,0,λ1(1,1),λ2(1,1))} = λ2(1, 1) = (1/2)(e − 1),

        F*(a, b) ≈ F*(1, 1) + ∂F*/∂a(1, 1)(a − 1) + ∂F*/∂b(1, 1)(b − 1) = e + (a − 1) + (1/2)(e − 1)(b − 1).
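These linear estimates can be compared with a direct numerical solution of the perturbed problems; the sketch below (illustrative only, assuming SciPy; SLSQP is started near the unperturbed solutions, and the values (a, b) = (1.05, 0.98) are chosen purely for illustration) prints both:

    import math
    import numpy as np
    from scipy.optimize import minimize

    def solve(a, b, sense):
        sign = 1.0 if sense == "min" else -1.0
        obj = lambda v: sign * (math.exp(v[0]) + v[1] + v[2])
        cons = [{"type": "eq", "fun": lambda v: v[0] + v[1] + v[2] - a},
                {"type": "eq", "fun": lambda v: v[0]**2 + v[1]**2 + v[2]**2 - b}]
        x0 = np.array([0.0, 1.0, 0.0]) if sense == "min" else np.array([1.0, 0.0, 0.0])
        res = minimize(obj, x0, method="SLSQP", constraints=cons)
        return sign * res.fun

    a, b = 1.05, 0.98
    print("f* numerical:", solve(a, b, "min"), "  estimate a + 1 =", a + 1)
    print("F* numerical:", solve(a, b, "max"),
          "  estimate:", math.e + (a - 1) + 0.5 * (math.e - 1) * (b - 1))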

4. – Consider the problem

        min(max) f(x, y) = 1 − (x − 2)² − y²        s.t        x² + y² ≤ 8,    x − y ≤ 0.

i) Sketch the feasible set and write down the necessary KKT conditions.
ii) Find the candidate solutions of the necessary KKT conditions.
iii) Use the second derivatives test to classify the points.
iv) Explore the concavity and convexity of the associated Lagrangian in (x, y).
v) What can you conclude about the solution of the maximization problem?
vi) Determine the approximate values of each problem:

        min(max) 1 − (0.98)³(x − 2)² − e^{−0.01} y²        s.t        x² + √1.04 y² ≤ 8,    (1.04)² x − y ≤ 0.

Solution: i) Figure 4.25 describes the constraint set and locates, approximately, the extreme points, following the variation of the objective function along the level curves. Consider the Lagrangian

        L(x, y, λ, β) = 1 − (x − 2)² − y² − λ(x² + y² − 8) − β(x − y).

We look simultaneously for the possible minimum and maximum candidates. Thus, the Karush-Kuhn-Tucker conditions are


FIGURE 4.25: Level curve of highest profit

        (1) Lx = −2(x − 2) − 2λx − β = 0
        (2) Ly = −2y − 2λy + β = 0
        (3) λ = 0   if   x² + y² < 8
        (4) β = 0   if   x − y < 0.

∗ If x² + y² < 8, then λ = 0 by (3).
– Suppose x − y < 0. Then, by (4), β = 0, and equations (1) and (2) give x = 2 and y = 0; hence x − y = 2 > 0, which contradicts x − y < 0.
– Suppose x − y = 0. We then have y = x and, adding (1) and (2), y = −x + 2. Thus, we have a candidate point for optimality:

        (x, y) = (1, 1)        with        (λ, β) = (0, 2).


∗ If x² + y² = 8, then
– Suppose x − y < 0. Then β = 0 and

        x − 2 + λx = 0,        −2y(1 + λ) = 0   ⇐⇒   y = 0   or   λ = −1.

λ = −1 is not possible by x − 2 + λx = 0. Thus y = 0. With x² + y² = 8, we deduce that x = √8, which contradicts x < y, or x = −√8. Inserting the value x = −√8 into x − 2 + λx = 0 gives λ = −1 − 1/√2. So, we have another candidate

        (x, y) = (−√8, 0)        with        (λ, β) = (−1 − 1/√2, 0).

– Suppose x − y = 0. With x² + y² = 8, we deduce that x = 2 or x = −2. Then, inserting in (1) and (2), we obtain

        (x, y) = (2, 2)     =⇒   −4λ − β = 0,   −4λ + β = 4    ⇐⇒   (λ, β) = (−1/2, 2),

contradicting the common sign of λ and β;

        (x, y) = (−2, −2)   =⇒   4λ − β = −8,   4λ + β = −4    ⇐⇒   (λ, β) = (−3/2, 2),

contradicting the common sign of λ and β.

Regularity of the candidate point (1, 1). Note that the constraints g1(x, y) = x² + y² and g2(x, y) = x − y are C¹ in R² and that only the constraint g2 is active at (1, 1). We have

        g2′(x, y) = [ 1   −1 ],        rank(g2′(1, 1)) = 1.

Thus the point (1, 1) is a regular point.

Regularity of the candidate point (−√8, 0). Only the constraint g1 is active at (−√8, 0). We have

        g1′(x, y) = [ 2x   2y ],        rank(g1′(−√8, 0)) = 1.

Thus the point (−√8, 0) is a regular point.


iii) Second derivatives test at (1, 1). With p = 1 (the number of active constraints) and n = 2 (the dimension of the space), we have r = p + 1, . . . , n, that is r = 2, and we consider the following determinant:

        B2(x, y) = | 0          ∂g2/∂x     ∂g2/∂y  |     | 0     1           −1       |
                   | ∂g2/∂x     Lxx        Lxy     |  =  | 1     −2 − 2λ     0        |
                   | ∂g2/∂y     Lyx        Lyy     |     | −1    0           −2 − 2λ  |

∗ At (1, 1), we have λ = 0; then

        B2(1, 1) = | 0     1     −1 |
                   | 1     −2    0  |  = 4        =⇒        (−1)² B2(1, 1) > 0
                   | −1    0     −2 |

and (1, 1) is a local maximum.

Second derivatives test at (−√8, 0). We consider the following determinant:

        B2(x, y) = | 0          ∂g1/∂x     ∂g1/∂y  |     | 0     2x          2y       |
                   | ∂g1/∂x     Lxx        Lxy     |  =  | 2x    −2 − 2λ     0        |
                   | ∂g1/∂y     Lyx        Lyy     |     | 2y    0           −2 − 2λ  |

∗ At (−√8, 0), we have λ = −1 − 1/√2, so −2 − 2λ = √2, and

        B2(−√8, 0) = | 0       −2√8    0  |
                     | −2√8    √2      0  |  = −32√2        =⇒        (−1)¹ B2(−√8, 0) > 0
                     | 0       0       √2 |

and (−√8, 0) is a local minimum.

iv) and v) Let us explore the concavity and convexity of L with respect to (x, y), where the Hessian matrix of L in (x, y) is

        HL = [ Lxx   Lxy ] = [ −2 − 2λ    0        ]
             [ Lyx   Lyy ]   [ 0          −2 − 2λ  ]

• When λ = 0, the first-order principal minors are Lyy = −2 < 0 and Lxx = −2 < 0, and Δ2 = 4 > 0. So (−1)^k Δk ≥ 0 for k = 1, 2. Therefore, L is concave in (x, y), and then (1, 1) is a global maximum for the constrained maximization problem.

• When λ = −1 − 1/√2, the first-order principal minors are Lyy = √2 > 0 and Lxx = √2 > 0, and Δ2 = 2 > 0. So Δk ≥ 0 for k = 1, 2. Therefore, L is convex in (x, y), and then (−√8, 0) is a global minimum for the constrained minimization problem.
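These global conclusions can also be sanity-checked by brute force over a grid covering the feasible set (plain Python, illustrative only; the grid resolution is arbitrary):

    import math

    def f(x, y):
        return 1 - (x - 2)**2 - y**2

    best_max, best_min = -math.inf, math.inf
    n = 400
    for i in range(n + 1):
        for j in range(n + 1):
            x, y = -3 + 6 * i / n, -3 + 6 * j / n
            if x*x + y*y <= 8 and x - y <= 0:      # feasible set S
                v = f(x, y)
                best_max, best_min = max(best_max, v), min(best_min, v)

    print("grid max:", best_max, "  f(1, 1) =", f(1, 1))
    print("grid min:", best_min, "  f(-sqrt(8), 0) =", f(-math.sqrt(8), 0))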


vi) Note that 0.98 ≈ 1, 1.04 ≈ 1, 0.01 ≈ 0 and e^{−0.01} ≈ 1. Thus, the new problems appear as perturbations of the original problem. Therefore, we will use the linear approximation with r = 0.98, s = 1.04 and t = −0.01. So, introduce the Lagrangian associated with the new constrained optimization problem

        L(x, y, λ, β, r, s, t) = 1 − r³(x − 2)² − e^t y² − λ(x² + √s y² − 8) − β(s²x − y).

• Set f(x, y, r, s, t) = 1 − r³(x − 2)² − e^t y², and the value function

        f*(r, s, t) = min f(x, y, r, s, t)        s.t        x² + √s y² ≤ 8,    s²x − y ≤ 0.

Then f* is well defined and differentiable when (r, s, t) is close to (1, 1, 0). Indeed, the following is satisfied:
1. There is a unique solution (x, y) = (−√8, 0) to the constrained minimization problem when (r, s, t) = (1, 1, 0), and (−√8, 0) is a regular point.
2. For (r, s, t) close to (1, 1, 0), there exists a solution to the constrained minimization problem by the extreme value theorem, since the set of constraints is a closed bounded set and the function is continuous.
3. The second order condition for minimality is satisfied at (−√8, 0) when (r, s, t) = (1, 1, 0).

As a consequence, evaluating at (x, y, λ, β) = (−√8, 0, −1 − 1/√2, 0) and (r, s, t) = (1, 1, 0),

        ∂f*/∂r(1, 1, 0) = ∂L/∂r = −3r²(x − 2)² = −3(√8 + 2)²,

        ∂f*/∂s(1, 1, 0) = ∂L/∂s = −λ y²/(2√s) − 2βsx = 0,
        ∂f*/∂t(1, 1, 0) = ∂L/∂t = −e^t y² = 0.


Hence, for (r, s, t) close to (1, 1, 0),

        f*(r, s, t) ≈ f*(1, 1, 0) + ∂f*/∂r(1, 1, 0)(r − 1) + ∂f*/∂s(1, 1, 0)(s − 1) + ∂f*/∂t(1, 1, 0)(t − 0)
                    = 1 − (√8 + 2)² − 3(√8 + 2)²(r − 1),

        f*(0.98, 1.04, −0.01) ≈ 1 − (√8 + 2)² + 3(√8 + 2)²(0.02) ≈ −20.91.

• Set the value function

        F*(r, s, t) = max f(x, y, r, s, t)        s.t        x² + √s y² ≤ 8,    s²x − y ≤ 0.

Then F* is well defined and differentiable when (r, s, t) is close to (1, 1, 0). Indeed, the following is satisfied:
1. There is a unique solution (x, y) = (1, 1) to the constrained maximization problem when (r, s, t) = (1, 1, 0), and (1, 1) is a regular point.
2. For (r, s, t) close to (1, 1, 0), there exists a solution to the constrained maximization problem by the extreme value theorem, since the set of constraints is a closed bounded set and the function is continuous.
3. The second order condition for maximality is satisfied at (1, 1) when (r, s, t) = (1, 1, 0).

As a consequence, evaluating at (x, y, λ, β) = (1, 1, 0, 2) and (r, s, t) = (1, 1, 0),

        ∂F*/∂r(1, 1, 0) = ∂L/∂r = −3r²(x − 2)² = −3,
        ∂F*/∂s(1, 1, 0) = ∂L/∂s = −λ y²/(2√s) − 2βsx = −4,
        ∂F*/∂t(1, 1, 0) = ∂L/∂t = −e^t y² = −1.


Hence, for (r, s, t) close to (1, 1, 0),

        F*(r, s, t) ≈ F*(1, 1, 0) + ∂F*/∂r(1, 1, 0)(r − 1) + ∂F*/∂s(1, 1, 0)(s − 1) + ∂F*/∂t(1, 1, 0)(t − 0)
                    = −1 − 3(r − 1) − 4(s − 1) − (t − 0),

        F*(0.98, 1.04, −0.01) ≈ −1 − 3(−0.02) − 4(0.04) − (−0.01) = −1.09.
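As a cross-check of the two estimates in vi), the perturbed problems can also be solved numerically; the sketch below (illustrative only, assuming SciPy; SLSQP is started near the unperturbed solutions) prints the numerical optimal values next to the linear estimates:

    import math
    import numpy as np
    from scipy.optimize import minimize

    r, s, t = 0.98, 1.04, -0.01
    obj = lambda v: 1 - r**3 * (v[0] - 2)**2 - math.exp(t) * v[1]**2
    cons = [{"type": "ineq", "fun": lambda v: 8 - v[0]**2 - math.sqrt(s) * v[1]**2},  # x^2 + sqrt(s) y^2 <= 8
            {"type": "ineq", "fun": lambda v: v[1] - s**2 * v[0]}]                    # s^2 x - y <= 0

    res_max = minimize(lambda v: -obj(v), np.array([0.9, 1.1]), method="SLSQP", constraints=cons)
    res_min = minimize(obj, np.array([-math.sqrt(8), 0.0]), method="SLSQP", constraints=cons)
    print("F* numerical:", -res_max.fun, "  linear estimate:", -1 - 3*(r - 1) - 4*(s - 1) - t)
    print("f* numerical:", res_min.fun, "  linear estimate:",
          1 - (math.sqrt(8) + 2)**2 - 3*(math.sqrt(8) + 2)**2 * (r - 1))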

Remark. The set of feasible solutions S = {(x, y) : x² + y² ≤ 8, x − y ≤ 0} is a closed bounded set of R² and f is continuous on S. Therefore, the extreme points are attained on this set by the extreme value theorem. Moreover, such points must occur either at points satisfying the KKT conditions or at points where the constraint qualification fails. Since (1, 1) and (−√8, 0) are the only two such points and they are regular, they solve the problem.

For more practice, we refer the reader to [11], [27], [28], [26], [25], [24], [4].

Bibliography

[1] H. Anton, I. Bivens, and S. Davis. Calculus. Early Transcendentals. John Wiley & Sons, Inc., New York, NY, USA, 2005.
[2] R. G. Bartle and D. R. Sherbert. Introduction to Real Analysis. John Wiley & Sons, Inc., 2011.
[3] W. Briggs, L. Cochran, and B. Gillett. Calculus. Early Transcendentals. Addison-Wesley, Pearson, 2011.
[4] E. K. P. Chong and S. H. Żak. An Introduction to Optimization. Wiley, 2013.
[5] P. G. Ciarlet. Introduction à l'analyse numérique matricielle et l'optimisation. Masson, 1985.
[6] B. Dacorogna. Introduction au calcul des variations. Presses polytechniques et universitaires romandes, Lausanne, 1992.
[7] E. F. Haeussler Jr., R. S. Paul, and R. J. Wood. Introductory Mathematical Analysis for Business, Economics, and the Life and Social Sciences. Pearson, Prentice Hall, 2008.
[8] P. E. Fishback. Linear and Nonlinear Programming with Maple™: An Interactive, Applications-Based Approach. CRC Press, Taylor and Francis Group, 2010.
[9] A. S. Gupta. Calculus of Variations with Applications. Prentice-Hall of India, 2006.
[10] W. Keith Nicholson. Linear Algebra with Applications. McGraw-Hill Ryerson, 2014.
[11] D. Koo. Elements of Optimisation with Applications in Economics and Business. Springer-Verlag, 1977.
[12] R. J. Larsen and M. L. Marx. An Introduction to Mathematical Statistics and its Applications. Prentice Hall, 2001.
[13] S. Lipschutz. Topologie, cours et problèmes. McGraw-Hill, 1983.
[14] D. G. Luenberger. Introduction to Linear and Nonlinear Programming. Addison Wesley, 1973.
[15] J. E. Marsden. Elementary Classical Analysis. W. H. Freeman and Company, 1974.
[16] M. Mesterton-Gibbons. A Primer on the Calculus of Variations and Optimal Control Theory. Student Mathematical Library vol. 50. American Mathematical Society, 2009.
[17] M. Minoux. Mathematical Programming: Theory and Algorithms. John Wiley and Sons, 1986.
[18] J. R. Munkres. Topology: A First Course. Prentice Hall, 1975.
[19] J. Nocedal and S. J. Wright. Numerical Optimization. Springer, 1999.
[20] M. H. Protter and C. B. Morrey. A First Course in Real Analysis. Springer, 2000.
[21] S. L. Salas, E. Hille, and G. J. Etgen. Calculus. One and Several Variables. Tenth Edition. John Wiley & Sons, Inc., 2007.
[22] J. A. Snyman. Practical Mathematical Optimization: An Introduction to Basic Optimization Theory and Classical and New Gradient-Based Algorithms. Springer, 2005.
[23] J. Stewart. Essential Calculus. Brooks/Cole, 2013.
[24] K. Sydsæter and P. Hammond. Mathematics for Economic Analysis. FT Prentice Hall, 1995.
[25] K. Sydsæter, P. Hammond, A. Seierstad, and A. Strøm. Further Mathematics for Economic Analysis. FT Prentice Hall, 2008.
[26] K. Sydsæter, P. Hammond, A. Seierstad, and A. Strøm. Instructor's Manual: Further Mathematics for Economic Analysis. 2nd Edition. Pearson, 2008.
[27] K. Sydsæter, A. Strøm, and P. Hammond. Instructor's Manual: Essential Mathematics for Economic Analysis. 3rd Edition. Pearson, 2008.
[28] K. Sydsæter, A. Strøm, and P. Hammond. Instructor's Manual: Essential Mathematics for Economic Analysis. 4th Edition. Pearson, 2014.
[29] W. L. Winston. Operations Research: Applications and Algorithms. Brooks/Cole, 2004.

Index

absolute maximum, 54, 117
absolute minimum, 54, 117
active, 223
affine, 209
approximate method, 60
approximation, 293
ball, 8
binding, 223
bordered Hessian determinant, 252
boundary, 10, 27, 117
bounded, 10, 117
chain rule, 34, 96, 294
Clairaut, 31
closed, 10, 117
closure, 10
Cobb-Douglas, 20, 292
columns, 80
concave, 93
cone, 24, 206
cone of feasible directions, 204
constraint function, 292
continuous, 28, 117
continuously differentiable, 34
convex, 13, 93
critical point, 54
critical points, 117
cylinder, 22
dependence, 292
determinant, 80
differentiability, 29
differentiable, 33
dimension, 22
domain, 21
eigen value, 79
ellipse, 23
ellipsoid, 25
extreme-value theorem, 117
Farkas-Minkowski, 222
generalized Lagrange multipliers, 223
global extreme points, 117
global maximum, 50
global minimum, 50
gradient, 30, 141
graph, 22
Hessian, 31, 71, 98
hyperplane, 29
implicit function theorem, 138, 139, 214, 295
inactive, 223
inflection point, 55
interior, 9, 117, 139
interior point, 9, 54, 204
intermediate value theorem, 69
Jacobian, 141
Karush-Kuhn-Tucker, 223, 232, 235
Lagrange, 153
Lagrange multipliers, 153
Lagrangian, 153, 223
Laplace, 32
leading minors, 71, 98
level curve, 22
level surface, 22
line, 23
line tangent, 29
linear, 33
linear combination, 152
linear constraints, 175
Linear programming, 133
linearly independent, 138, 176, 206, 252
local extreme point, 50
local maximum, 50
local minimum, 50
negative definite, 175
negative semi definite, 82, 255
neighborhood, 9, 27, 80
normal line, 143
normal vector, 143
objective function, 50, 292
open, 9
optimal value function, 293
orthogonal, 152
orthogonal matrix, 79
parabola, 23
Paraboloid, 23
parallel, 23, 143
parameters, 292
partial derivative, 29
plane tangent, 152
polyhedra, 133
positive definite, 82, 99, 175
positive semi definite, 82, 255
principal minor, 80, 82, 100
production, 20
quadratic form, 76, 78, 81, 99, 254
radius, 9
rank, 139
rate of change, 29
regular point, 137, 206
relative maximum, 56
relative minimum, 56
rows, 80
saddle point, 56, 80
second derivatives test, 72, 293
semi definite, 80
several variables, 26, 29
slack, 223
slope, 29
stationary point, 54
strictly concave, 93
strictly convex, 93, 99
subspace, 138, 140
surface, 22
symmetric, 76, 79
symmetric matrix, 81
tangent line, 95, 142
tangent plane, 137, 254
Taylor's formula, 76, 253
traces, 22
triangular inequality, 8, 94
unbounded, 10, 121
unit vectors, 150
vertices, 26