Convex Analysis and Global Optimization [2ed.] 3319314823, 978-3-319-31482-2, 978-3-319-31484-6

This book presents state-of-the-art results and methodologies in modern global optimization, and has been a staple refer

421 75 3MB

English Pages 505 [511] Year 2016

Report DMCA / Copyright

DOWNLOAD FILE

Polecaj historie

Convex Analysis and Global Optimization [2ed.]
 3319314823, 978-3-319-31482-2, 978-3-319-31484-6

Table of contents :
Front Matter....Pages i-xvi
Front Matter....Pages 1-1
Convex Sets....Pages 3-37
Convex Functions....Pages 39-86
Fixed Point and Equilibrium....Pages 87-102
DC Functions and DC Sets....Pages 103-123
Front Matter....Pages 125-125
Motivation and Overview....Pages 127-149
General Methods....Pages 151-165
DC Optimization Problems....Pages 167-228
Special Methods....Pages 229-281
Parametric Decomposition....Pages 283-336
Nonconvex Quadratic Programming....Pages 337-390
Monotonic Optimization....Pages 391-433
Polynomial Optimization....Pages 435-452
Optimization Under Equilibrium Constraints....Pages 453-487
Back Matter....Pages 489-505

Citation preview

Springer Optimization and Its Applications  110

Hoang Tuy

Convex Analysis and Global Optimization Second Edition

Springer Optimization and Its Applications VOLUME 110 Managing Editor Panos M. Pardalos (University of Florida) Editor–Combinatorial Optimization Ding-Zhu Du (University of Texas at Dallas) Advisory Board J. Birge (University of Chicago) C.A. Floudas (Princeton University) F. Giannessi (University of Pisa) H.D. Sherali (Virginia Polytechnic and State University) T. Terlaky (McMaster University) Y. Ye (Stanford University)

Aims and Scope Optimization has been expanding in all directions at an astonishing rate during the last few decades. New algorithmic and theoretical techniques have been developed, the diffusion into other disciplines has proceeded at a rapid pace, and our knowledge of all aspects of the field has grown even more profound. At the same time, one of the most striking trends in optimization is the constantly increasing emphasis on the interdisciplinary nature of the field. Optimization has been a basic tool in all areas of applied mathematics, engineering, medicine, economics, and other sciences. The series Springer Optimization and Its Applications publishes undergraduate and graduate textbooks, monographs and state-of-the-art expository work that focus on algorithms for solving optimization problems and also study applications involving such problems. Some of the topics covered include nonlinear optimization (convex and nonconvex), network flow problems, stochastic optimization, optimal control, discrete optimization, multi-objective programming, description of software packages, approximation techniques and heuristic approaches.

More information about this series at http://www.springer.com/series/7393

Hoang Tuy

Convex Analysis and Global Optimization Second Edition

123

Hoang Tuy Vietnam Academy of Science and Technology Institute of Mathematics Hanoi, Vietnam

ISSN 1931-6828 ISSN 1931-6836 (electronic) Springer Optimization and Its Applications ISBN 978-3-319-31482-2 ISBN 978-3-319-31484-6 (eBook) DOI 10.1007/978-3-319-31484-6 Library of Congress Control Number: 2016934409 Mathematics Subject Classification (2010): 90-02, 49-02, 65K-10 © Springer International Publishing AG 1998, 2016 This work is subject to copyright. All rights are reserved by the Publisher, whether the whole or part of the material is concerned, specifically the rights of translation, reprinting, reuse of illustrations, recitation, broadcasting, reproduction on microfilms or in any other physical way, and transmission or information storage and retrieval, electronic adaptation, computer software, or by similar or dissimilar methodology now known or hereafter developed. The use of general descriptive names, registered names, trademarks, service marks, etc. in this publication does not imply, even in the absence of a specific statement, that such names are exempt from the relevant protective laws and regulations and therefore free for general use. The publisher, the authors and the editors are safe to assume that the advice and information in this book are believed to be true and accurate at the date of publication. Neither the publisher nor the authors or the editors give a warranty, express or implied, with respect to the material contained herein or for any errors or omissions that may have been made. Printed on acid-free paper This Springer imprint is published by Springer Nature The registered company is Springer International Publishing AG Switzerland

Preface to the First Edition

This book grew out of lecture notes from a course given at Graz University of Technology (autumn 1990), The Institute of Technology at Linköping University (autumn 1991), and the Ecole Polytechnique in Montréal (autumn 1993). Originally conceived of as a Ph.D. course, it attempts to present a coherent and mathematically rigorous theory of deterministic global optimization. At the same time, aiming at providing a concise account of the methods and algorithms, it focuses on the main ideas and concepts, leaving aside too technical details which might affect the clarity of presentation. Global optimization is concerned with finding global solutions to nonconvex optimization problems. Although until recently convex analysis has been developed mainly under the impulse of convex and local optimization, it has become a basic tool for global optimization. The reason is that the general mathematical structure underlying virtually every nonconvex optimization problem can be described in terms of functions representable as differences of convex functions (dc functions) and sets which are differences of convex sets (dc sets). Due to this fact, many concepts and results from convex analysis play an essential role in the investigation of important classes of global optimization problems. Since, however, convexity in nonconvex optimization problems is present only partially or in the “other” way, new concepts have to be introduced and new questions have to be answered. Therefore, a new chapter on dc functions and dc sets is added to the traditional material of convex analysis. Part I of this book is an introduction to convex analysis interpreted in this broad sense as an indispensable tool for global optimization. Part II presents a theory of deterministic global optimization which heavily relies on the dc structure of nonconvex optimization problems. The key subproblem in this approach is to transcend an incumbent, i.e., given a solution of an optimization problem (the best so far obtained), check its global optimality, and find a better solution, if there is one. As it turns out, this subproblem can always be reduced to solving a dc inclusion of the form x 2 D n C; where D; C are two convex sets. Chapters 4–6 are devoted to general methods for solving concave and dc programs through dc inclusions of this form. These methods include successive partitioning and cutting, outer approximation and polyhedral annexation, or combination of the v

vi

Preface to the First Edition

concepts. The last two chapters discuss methods for exploiting special structures in global optimization. Two aspects of nonconvexity deserve particular attention: the rank of nonconvexity, i.e., roughly speaking the number of nonconvex variables, and the degree of nonconvexity, i.e., the extent to which a problem fails to be convex. Decomposition methods for handling low rank nonconvex problems are presented in Chap. 7, while nonconvex quadratic problems, i.e., problems involving only nonconvex functions which in a sense have lowest degree of nonconvexity, are discussed in the last Chap. 8. I have made no attempt to cover all the developments to date because this would be merely impossible and unreasonable. With regret I have to omit many interesting results which do not fit in with the mainstream of the book. On the other hand, much new material is offered, even in the parts where the traditional approach is more or less standard. I would like to express my sincere thanks to many colleagues and friends, especially R.E. Burkard (Graz Technical University), A. Migdalas and P. Värbrand (University of Linköping), and B. Jaumard and P. Hansen (University of Montréal), for many fruitful discussions we had during my enjoyable visits to their departments, and P.M. Pardalos for his encouragement in publishing this book. I am particularly grateful to P.T. Thach and F.A. Al-Khayyal for many useful remarks and suggestions and also to P.H. Dien for his efficient help during the preparation of the manuscript. Hanoi, Vietnam October 1997

Hoang Tuy

Preface to the Second Edition

Over the 17 years since the publication of the first edition of this book, tremendous progress has been achieved in global optimization. The present revised and enlarged edition is to give an up-to-date account of the field, in response to the needs of research, teaching, and applications in the years to come. In addition to the correction of misprints and errors, much new material is offered. In particular, several recent important advances are included: a modern approach to minimax, fixed point and equilibrium, a robust approach to optimization under nonconvex constraints, and also a thorough study of quadratic programming with a single quadratic constraint. Moreover, three new chapters are added: monotonic optimization (Chap. 11), polynomial optimization (Chap. 12), and optimization under equilibrium constraints (Chap. 13). These topics have received increased attention in recent years due to many important engineering and economic applications. Hopefully, as main reference in deterministic global optimization, this book will replace both its first edition and the 1996 (last) edition of the book Global Optimization: Deterministic Approaches by Reiner Horst and Hoang Tuy. Hanoi, Vietnam September 2015

Hoang Tuy

vii

Contents

Part I

Convex Analysis

1

Convex Sets . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .. . . . . . . . . . . . . . . . . . . . 1.1 Affine Sets . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .. . . . . . . . . . . . . . . . . . . . 1.2 Convex Sets. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .. . . . . . . . . . . . . . . . . . . . 1.3 Relative Interior and Closure . . . . . . . . . . . . . . . . . . . .. . . . . . . . . . . . . . . . . . . . 1.4 Convex Hull . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .. . . . . . . . . . . . . . . . . . . . 1.5 Separation Theorems .. . . . . . . . . . . . . . . . . . . . . . . . . . . .. . . . . . . . . . . . . . . . . . . . 1.6 Geometric Structure .. . . . . . . . . . . . . . . . . . . . . . . . . . . . .. . . . . . . . . . . . . . . . . . . . 1.6.1 Supporting Hyperplane .. . . . . . . . . . . . . . . .. . . . . . . . . . . . . . . . . . . . 1.6.2 Face and Extreme Point . . . . . . . . . . . . . . . .. . . . . . . . . . . . . . . . . . . . 1.7 Representation of Convex Sets . . . . . . . . . . . . . . . . . .. . . . . . . . . . . . . . . . . . . . 1.8 Polars of Convex Sets . . . . . . . . . . . . . . . . . . . . . . . . . . . .. . . . . . . . . . . . . . . . . . . . 1.9 Polyhedral Convex Sets . . . . . . . . . . . . . . . . . . . . . . . . . .. . . . . . . . . . . . . . . . . . . . 1.9.1 Facets of a Polyhedron . . . . . . . . . . . . . . . . .. . . . . . . . . . . . . . . . . . . . 1.9.2 Vertices and Edges of a Polyhedron . . .. . . . . . . . . . . . . . . . . . . . 1.9.3 Polar of a Polyhedron . . . . . . . . . . . . . . . . . .. . . . . . . . . . . . . . . . . . . . 1.9.4 Representation of Polyhedrons.. . . . . . . .. . . . . . . . . . . . . . . . . . . . 1.10 Systems of Convex Sets . . . . . . . . . . . . . . . . . . . . . . . . . .. . . . . . . . . . . . . . . . . . . . 1.11 Exercises .. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .. . . . . . . . . . . . . . . . . . . .

3 3 5 7 12 16 19 19 21 23 25 27 28 29 31 32 34 36

2

Convex Functions .. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .. . . . . . . . . . . . . . . . . . . . 2.1 Definition and Basic Properties .. . . . . . . . . . . . . . . . .. . . . . . . . . . . . . . . . . . . . 2.2 Operations That Preserve Convexity . . . . . . . . . . . .. . . . . . . . . . . . . . . . . . . . 2.3 Lower Semi-Continuity . . . . . . . . . . . . . . . . . . . . . . . . . .. . . . . . . . . . . . . . . . . . . . 2.4 Lipschitz Continuity . . . . . . . . . . . . . . . . . . . . . . . . . . . . .. . . . . . . . . . . . . . . . . . . . 2.5 Convex Inequalities . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .. . . . . . . . . . . . . . . . . . . . 2.6 Approximation by Affine Functions .. . . . . . . . . . . .. . . . . . . . . . . . . . . . . . . . 2.7 Subdifferential .. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .. . . . . . . . . . . . . . . . . . . . 2.8 Subdifferential Calculus . . . . . . . . . . . . . . . . . . . . . . . . .. . . . . . . . . . . . . . . . . . . . 2.9 Approximate Subdifferential . . . . . . . . . . . . . . . . . . . .. . . . . . . . . . . . . . . . . . . . 2.10 Conjugate Functions . . . . . . . . . . . . . . . . . . . . . . . . . . . . .. . . . . . . . . . . . . . . . . . . .

39 39 43 47 51 52 56 58 62 65 67 ix

x

Contents

2.11 Extremal Values of Convex Functions . . . . . . . . . .. . . . . . . . . . . . . . . . . . . . 2.11.1 Minimum.. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .. . . . . . . . . . . . . . . . . . . . 2.11.2 Maximum . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .. . . . . . . . . . . . . . . . . . . . 2.12 Minimax and Saddle Points .. . . . . . . . . . . . . . . . . . . . .. . . . . . . . . . . . . . . . . . . . 2.12.1 Minimax . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .. . . . . . . . . . . . . . . . . . . . 2.12.2 Saddle Point of a Function . . . . . . . . . . . . .. . . . . . . . . . . . . . . . . . . . 2.13 Convex Optimization .. . . . . . . . . . . . . . . . . . . . . . . . . . . .. . . . . . . . . . . . . . . . . . . . 2.13.1 Generalized Inequalities .. . . . . . . . . . . . . . .. . . . . . . . . . . . . . . . . . . . 2.13.2 Generalized Convex Optimization .. . . .. . . . . . . . . . . . . . . . . . . . 2.13.3 Conic Programming . . . . . . . . . . . . . . . . . . . .. . . . . . . . . . . . . . . . . . . . 2.14 Semidefinite Programming . . . . . . . . . . . . . . . . . . . . . .. . . . . . . . . . . . . . . . . . . . 2.14.1 The SDP Cone . . . . . . . . . . . . . . . . . . . . . . . . . .. . . . . . . . . . . . . . . . . . . . 2.14.2 Linear Matrix Inequality . . . . . . . . . . . . . . .. . . . . . . . . . . . . . . . . . . . 2.14.3 SDP Programming .. . . . . . . . . . . . . . . . . . . . .. . . . . . . . . . . . . . . . . . . . 2.15 Exercises .. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .. . . . . . . . . . . . . . . . . . . .

69 69 70 72 72 75 76 76 78 80 81 81 83 84 84

3

Fixed Point and Equilibrium . . . . . . . . . . . . . . . . . . . . . . . . . .. . . . . . . . . . . . . . . . . . . . 87 3.1 Approximate Minimum and Variational Principle .. . . . . . . . . . . . . . . . . 87 3.2 Contraction Principle.. . . . . . . . . . . . . . . . . . . . . . . . . . . .. . . . . . . . . . . . . . . . . . . . 89 3.3 Fixed Point and Equilibrium .. . . . . . . . . . . . . . . . . . . .. . . . . . . . . . . . . . . . . . . . 91 3.3.1 The KKM Lemma . . . . . . . . . . . . . . . . . . . . . .. . . . . . . . . . . . . . . . . . . . 91 3.3.2 General Equilibrium Theorem . . . . . . . . .. . . . . . . . . . . . . . . . . . . . 93 3.3.3 Equivalent Minimax Theorem . . . . . . . . .. . . . . . . . . . . . . . . . . . . . 96 3.3.4 Ky Fan Inequality .. . . . . . . . . . . . . . . . . . . . . .. . . . . . . . . . . . . . . . . . . . 97 3.3.5 Kakutani Fixed Point . . . . . . . . . . . . . . . . . . .. . . . . . . . . . . . . . . . . . . . 98 3.3.6 Equilibrium in Non-cooperative Games .. . . . . . . . . . . . . . . . . . 98 3.3.7 Variational Inequalities .. . . . . . . . . . . . . . . .. . . . . . . . . . . . . . . . . . . . 99 3.4 Exercises .. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .. . . . . . . . . . . . . . . . . . . . 101

4

DC Functions and DC Sets . . . . . . . . . . . . . . . . . . . . . . . . . . . .. . . . . . . . . . . . . . . . . . . . 4.1 DC Functions .. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .. . . . . . . . . . . . . . . . . . . . 4.2 Universality of DC Functions.. . . . . . . . . . . . . . . . . . .. . . . . . . . . . . . . . . . . . . . 4.3 DC Representation of Basic Composite Functions . . . . . . . . . . . . . . . . . 4.4 DC Representation of Separable Functions . . . . .. . . . . . . . . . . . . . . . . . . . 4.5 DC Representation of Polynomials .. . . . . . . . . . . . .. . . . . . . . . . . . . . . . . . . . 4.6 DC Sets . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .. . . . . . . . . . . . . . . . . . . . 4.7 Convex Minorants of a DC Function .. . . . . . . . . . .. . . . . . . . . . . . . . . . . . . . 4.8 The Minimum of a DC Function . . . . . . . . . . . . . . . .. . . . . . . . . . . . . . . . . . . . 4.9 Exercises .. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .. . . . . . . . . . . . . . . . . . . .

Part II 5

103 103 105 107 110 112 113 117 120 122

Global Optimization

Motivation and Overview . . . . . . . . . . . . . . . . . . . . . . . . . . . . .. . . . . . . . . . . . . . . . . . . . 127 5.1 Global vs. Local Optimum .. . . . . . . . . . . . . . . . . . . . . .. . . . . . . . . . . . . . . . . . . . 127 5.2 Multiextremality in the Real World . . . . . . . . . . . . .. . . . . . . . . . . . . . . . . . . . 129

Contents

5.3

xi

Basic Classes of Nonconvex Problems .. . . . . . . . .. . . . . . . . . . . . . . . . . . . . 5.3.1 DC Optimization . . . . . . . . . . . . . . . . . . . . . . .. . . . . . . . . . . . . . . . . . . . 5.3.2 Monotonic Optimization . . . . . . . . . . . . . . .. . . . . . . . . . . . . . . . . . . . General Nonconvex Structure . . . . . . . . . . . . . . . . . . .. . . . . . . . . . . . . . . . . . . . Global Optimality Criteria . . . . . . . . . . . . . . . . . . . . . . .. . . . . . . . . . . . . . . . . . . . DC Inclusion Associated with a Feasible Solution . . . . . . . . . . . . . . . . . Approximate Solution .. . . . . . . . . . . . . . . . . . . . . . . . . . .. . . . . . . . . . . . . . . . . . . . When is Global Optimization Motivated? . . . . . .. . . . . . . . . . . . . . . . . . . . Exercises .. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .. . . . . . . . . . . . . . . . . . . .

132 133 135 135 136 139 140 142 147

6

General Methods . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .. . . . . . . . . . . . . . . . . . . . 6.1 Successive Partitioning.. . . . . . . . . . . . . . . . . . . . . . . . . .. . . . . . . . . . . . . . . . . . . . 6.1.1 Simplicial Subdivision . . . . . . . . . . . . . . . . .. . . . . . . . . . . . . . . . . . . . 6.1.2 Conical Subdivision . . . . . . . . . . . . . . . . . . . .. . . . . . . . . . . . . . . . . . . . 6.1.3 Rectangular Subdivision . . . . . . . . . . . . . . .. . . . . . . . . . . . . . . . . . . . 6.2 Branch and Bound.. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .. . . . . . . . . . . . . . . . . . . . 6.2.1 Prototype BB Algorithm . . . . . . . . . . . . . . .. . . . . . . . . . . . . . . . . . . . 6.2.2 Advantage of a Nice Feasible Set . . . . . .. . . . . . . . . . . . . . . . . . . . 6.3 Outer Approximation . . . . . . . . . . . . . . . . . . . . . . . . . . . .. . . . . . . . . . . . . . . . . . . . 6.3.1 Prototype OA Procedure . . . . . . . . . . . . . . .. . . . . . . . . . . . . . . . . . . . 6.4 Exercises .. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .. . . . . . . . . . . . . . . . . . . .

151 151 151 155 156 158 158 160 161 162 164

7

DC Optimization Problems .. . . . . . . . . . . . . . . . . . . . . . . . . . .. . . . . . . . . . . . . . . . . . . . 7.1 Concave Minimization Under Linear Constraints . . . . . . . . . . . . . . . . . . 7.1.1 Concavity Cuts. . . . . . . . . . . . . . . . . . . . . . . . . .. . . . . . . . . . . . . . . . . . . . 7.1.2 Canonical Cone .. . . . . . . . . . . . . . . . . . . . . . . .. . . . . . . . . . . . . . . . . . . . 7.1.3 Procedure DC . . . . . . . . . . . . . . . . . . . . . . . . . . .. . . . . . . . . . . . . . . . . . . . 7.1.4 !-Bisection . . . . . . . . . . . . . . . . . . . . . . . . . . . . .. . . . . . . . . . . . . . . . . . . . 7.1.5 Branch and Bound .. . . . . . . . . . . . . . . . . . . . .. . . . . . . . . . . . . . . . . . . . 7.1.6 Rectangular Algorithms . . . . . . . . . . . . . . . .. . . . . . . . . . . . . . . . . . . . 7.2 Concave Minimization Under Convex Constraints .. . . . . . . . . . . . . . . . 7.2.1 Simple Outer Approximation . . . . . . . . . .. . . . . . . . . . . . . . . . . . . . 7.2.2 Combined Algorithm .. . . . . . . . . . . . . . . . . .. . . . . . . . . . . . . . . . . . . . 7.3 Reverse Convex Programming.. . . . . . . . . . . . . . . . . .. . . . . . . . . . . . . . . . . . . . 7.3.1 General Properties . . . . . . . . . . . . . . . . . . . . . .. . . . . . . . . . . . . . . . . . . . 7.3.2 Solution Methods .. . . . . . . . . . . . . . . . . . . . . .. . . . . . . . . . . . . . . . . . . . 7.4 Canonical DC Programming .. . . . . . . . . . . . . . . . . . . .. . . . . . . . . . . . . . . . . . . . 7.4.1 Simple Outer Approximation . . . . . . . . . .. . . . . . . . . . . . . . . . . . . . 7.4.2 Combined Algorithm .. . . . . . . . . . . . . . . . . .. . . . . . . . . . . . . . . . . . . . 7.5 Robust Approach .. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .. . . . . . . . . . . . . . . . . . . . 7.5.1 Successive Incumbent Transcending . .. . . . . . . . . . . . . . . . . . . . 7.5.2 The SIT Algorithm for DC Optimization.. . . . . . . . . . . . . . . . . 7.5.3 Equality Constraints .. . . . . . . . . . . . . . . . . . .. . . . . . . . . . . . . . . . . . . . 7.5.4 Illustrative Examples . . . . . . . . . . . . . . . . . . .. . . . . . . . . . . . . . . . . . . . 7.5.5 Concluding Remark . . . . . . . . . . . . . . . . . . . .. . . . . . . . . . . . . . . . . . . .

167 167 169 172 173 178 180 182 184 185 187 189 190 194 198 201 203 204 207 208 211 212 213

5.4 5.5 5.6 5.7 5.8 5.9

xii

Contents

7.6

DC Optimization in Design Calculations . . . . . . .. . . . . . . . . . . . . . . . . . . . 7.6.1 DC Reformulation . . . . . . . . . . . . . . . . . . . . . .. . . . . . . . . . . . . . . . . . . . 7.6.2 Solution Method .. . . . . . . . . . . . . . . . . . . . . . .. . . . . . . . . . . . . . . . . . . . DC Optimization in Location and Distance Geometry . . . . . . . . . . . . . 7.7.1 Generalized Weber’s Problem.. . . . . . . . .. . . . . . . . . . . . . . . . . . . . 7.7.2 Distance Geometry Problem . . . . . . . . . . .. . . . . . . . . . . . . . . . . . . . 7.7.3 Continuous p-Center Problem . . . . . . . . .. . . . . . . . . . . . . . . . . . . . 7.7.4 Distribution of Points on a Sphere .. . . .. . . . . . . . . . . . . . . . . . . . Clustering via dc Optimization . . . . . . . . . . . . . . . . . .. . . . . . . . . . . . . . . . . . . . 7.8.1 Clustering as a dc Optimization.. . . . . . .. . . . . . . . . . . . . . . . . . . . Exercises .. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .. . . . . . . . . . . . . . . . . . . .

213 214 215 218 218 220 221 221 222 223 226

Special Methods . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .. . . . . . . . . . . . . . . . . . . . 8.1 Polyhedral Annexation .. . . . . . . . . . . . . . . . . . . . . . . . . .. . . . . . . . . . . . . . . . . . . . 8.1.1 Procedure DC . . . . . . . . . . . . . . . . . . . . . . . . . . .. . . . . . . . . . . . . . . . . . . . 8.1.2 Polyhedral Annexation Method for (BCP) . . . . . . . . . . . . . . . . 8.1.3 Polyhedral Annexation Method for (LRC) . . . . . . . . . . . . . . . . 8.2 Partly Convex Problems.. . . . . . . . . . . . . . . . . . . . . . . . .. . . . . . . . . . . . . . . . . . . . 8.2.1 Computation of Lagrangian Bounds .. .. . . . . . . . . . . . . . . . . . . . 8.2.2 Applications.. . . . . . . . . . . . . . . . . . . . . . . . . . . .. . . . . . . . . . . . . . . . . . . . 8.3 Duality in Global Optimization .. . . . . . . . . . . . . . . . .. . . . . . . . . . . . . . . . . . . . 8.3.1 Quasiconjugate Function .. . . . . . . . . . . . . .. . . . . . . . . . . . . . . . . . . . 8.3.2 Examples .. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .. . . . . . . . . . . . . . . . . . . . 8.3.3 Dual Problems of Global Optimization.. . . . . . . . . . . . . . . . . . . 8.4 Application . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .. . . . . . . . . . . . . . . . . . . . 8.4.1 A Feasibility Problem with Resource Allocation Constraint .. . . . . . . . . . . . . . . . . .. . . . . . . . . . . . . . . . . . . . 8.4.2 Problems with Multiple Resources Allocation .. . . . . . . . . . . 8.4.3 Optimization Under Resources Allocation Constraints .. . 8.5 Noncanonical DC Problems . . . . . . . . . . . . . . . . . . . . .. . . . . . . . . . . . . . . . . . . . 8.6 Continuous Optimization Problems . . . . . . . . . . . . .. . . . . . . . . . . . . . . . . . . . 8.6.1 Outer Approximation Algorithm.. . . . . .. . . . . . . . . . . . . . . . . . . . 8.6.2 Combined OA/BB Algorithm .. . . . . . . . .. . . . . . . . . . . . . . . . . . . . 8.7 Exercises .. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .. . . . . . . . . . . . . . . . . . . .

229 229 229 233 234 236 244 247 249 250 253 255 258 259 261 264 268 271 275 278 279

Parametric Decomposition . . . . . . . . . . . . . . . . . . . . . . . . . . . .. . . . . . . . . . . . . . . . . . . . 9.1 Functions Monotonic w.r.t. a Convex Cone . . . . .. . . . . . . . . . . . . . . . . . . . 9.2 Decomposition by Projection .. . . . . . . . . . . . . . . . . . .. . . . . . . . . . . . . . . . . . . . 9.2.1 Outer Approximation .. . . . . . . . . . . . . . . . . .. . . . . . . . . . . . . . . . . . . . 9.2.2 Branch and Bound .. . . . . . . . . . . . . . . . . . . . .. . . . . . . . . . . . . . . . . . . . 9.3 Decomposition by Polyhedral Annexation .. . . . .. . . . . . . . . . . . . . . . . . . . 9.4 The Parametric Programming Approach .. . . . . . .. . . . . . . . . . . . . . . . . . . . 9.5 Linear Multiplicative Programming .. . . . . . . . . . . .. . . . . . . . . . . . . . . . . . . . 9.5.1 Parametric Objective Simplex Method . . . . . . . . . . . . . . . . . . . . 9.5.2 Parametric Right-Hand Side Method . .. . . . . . . . . . . . . . . . . . . . 9.5.3 Parametric Combined Method . . . . . . . . .. . . . . . . . . . . . . . . . . . . .

283 283 286 288 289 292 294 298 300 302 303

7.7

7.8 7.9 8

9

Contents

xiii

9.6

Convex Constraints. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .. . . . . . . . . . . . . . . . . . . . 9.6.1 Branch and Bound .. . . . . . . . . . . . . . . . . . . . .. . . . . . . . . . . . . . . . . . . . 9.6.2 Polyhedral Annexation . . . . . . . . . . . . . . . . .. . . . . . . . . . . . . . . . . . . . 9.6.3 Reduction to Quasiconcave Minimization.. . . . . . . . . . . . . . . . 9.7 Reverse Convex Constraints . . . . . . . . . . . . . . . . . . . . .. . . . . . . . . . . . . . . . . . . . 9.7.1 Decomposition by Projection . . . . . . . . . .. . . . . . . . . . . . . . . . . . . . 9.7.2 Outer Approximation .. . . . . . . . . . . . . . . . . .. . . . . . . . . . . . . . . . . . . . 9.7.3 Branch and Bound .. . . . . . . . . . . . . . . . . . . . .. . . . . . . . . . . . . . . . . . . . 9.7.4 Decomposition by Polyhedral Annexation . . . . . . . . . . . . . . . . 9.8 Network Constraints . . . . . . . . . . . . . . . . . . . . . . . . . . . . .. . . . . . . . . . . . . . . . . . . . 9.8.1 Outer Approximation .. . . . . . . . . . . . . . . . . .. . . . . . . . . . . . . . . . . . . . 9.8.2 Branch and Bound Method.. . . . . . . . . . . .. . . . . . . . . . . . . . . . . . . . 9.9 Some Polynomial-Time Solvable Problems . . . .. . . . . . . . . . . . . . . . . . . . 9.9.1 The Special Concave Program CPL.1/ .. . . . . . . . . . . . . . . . . . . 9.10 Applications . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .. . . . . . . . . . . . . . . . . . . . 9.10.1 Problem PTP(2).. . . . . . . . . . . . . . . . . . . . . . . .. . . . . . . . . . . . . . . . . . . . 9.10.2 Problem FP(1) . . . . . . . . . . . . . . . . . . . . . . . . . .. . . . . . . . . . . . . . . . . . . . 9.10.3 Strong Polynomial Solvability of PTP.r/ . . . . . . . . . . . . . . . . . 9.11 Exercises .. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .. . . . . . . . . . . . . . . . . . . .

305 306 306 308 313 314 314 316 317 319 320 321 326 326 328 328 329 330 333

10 Nonconvex Quadratic Programming .. . . . . . . . . . . . . . . .. . . . . . . . . . . . . . . . . . . . 10.1 Some Basic Properties of Quadratic Functions .. . . . . . . . . . . . . . . . . . . . 10.2 The S-Lemma . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .. . . . . . . . . . . . . . . . . . . . 10.3 Single Constraint Quadratic Optimization . . . . . .. . . . . . . . . . . . . . . . . . . . 10.4 Quadratic Problems with Linear Constraints . . .. . . . . . . . . . . . . . . . . . . . 10.4.1 Concave Quadratic Programming.. . . . .. . . . . . . . . . . . . . . . . . . . 10.4.2 Biconvex and Convex–Concave Programming.. . . . . . . . . . . 10.4.3 Indefinite Quadratic Programs . . . . . . . . .. . . . . . . . . . . . . . . . . . . . 10.5 Quadratic Problems with Quadratic Constraints .. . . . . . . . . . . . . . . . . . . 10.5.1 Linear and Convex Relaxation .. . . . . . . .. . . . . . . . . . . . . . . . . . . . 10.5.2 Lagrangian Relaxation . . . . . . . . . . . . . . . . .. . . . . . . . . . . . . . . . . . . . 10.5.3 Application.. . . . . . . . . . . . . . . . . . . . . . . . . . . . .. . . . . . . . . . . . . . . . . . . . 10.6 Exercises .. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .. . . . . . . . . . . . . . . . . . . .

337 337 339 346 355 355 358 361 363 364 377 382 388

11 Monotonic Optimization .. . . . . . . . . . . . . . . . . . . . . . . . . . . . . .. . . . . . . . . . . . . . . . . . . . 11.1 Basic Concepts . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .. . . . . . . . . . . . . . . . . . . . 11.1.1 Increasing Functions, Normal Sets, and Polyblocks . . . . . . 11.1.2 DM Functions and DM Constraints . . .. . . . . . . . . . . . . . . . . . . . 11.1.3 Canonical Monotonic Optimization Problem . . . . . . . . . . . . . 11.2 The Polyblock Algorithm .. . . . . . . . . . . . . . . . . . . . . . .. . . . . . . . . . . . . . . . . . . . 11.2.1 Valid Reduction .. . . . . . . . . . . . . . . . . . . . . . . .. . . . . . . . . . . . . . . . . . . . 11.2.2 Generating the New Polyblock . . . . . . . .. . . . . . . . . . . . . . . . . . . . 11.2.3 Polyblock Algorithm . . . . . . . . . . . . . . . . . . .. . . . . . . . . . . . . . . . . . . . 11.3 Successive Incumbent Transcending Algorithm .. . . . . . . . . . . . . . . . . . . 11.3.1 Essential Optimal Solution .. . . . . . . . . . . .. . . . . . . . . . . . . . . . . . . . 11.3.2 Successive Incumbent Transcending Strategy.. . . . . . . . . . . . 11.4 Discrete Monotonic Optimization . . . . . . . . . . . . . . .. . . . . . . . . . . . . . . . . . . .

391 391 391 395 397 399 399 402 403 405 406 407 413

xiv

Contents

11.5 Problems with Hidden Monotonicity .. . . . . . . . . . .. . . . . . . . . . . . . . . . . . . . 11.5.1 Generalized Linear Fractional Programming.. . . . . . . . . . . . . 11.5.2 Problems with Hidden Monotonicity in Constraints .. . . . . 11.6 Applications . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .. . . . . . . . . . . . . . . . . . . . 11.6.1 Reliability and Risk Management . . . . .. . . . . . . . . . . . . . . . . . . . 11.6.2 Power Control in Wireless Communication Networks . . . 11.6.3 Sensor Cover Energy Problem . . . . . . . . .. . . . . . . . . . . . . . . . . . . . 11.6.4 Discrete Location .. . . . . . . . . . . . . . . . . . . . . .. . . . . . . . . . . . . . . . . . . . 11.7 Exercises .. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .. . . . . . . . . . . . . . . . . . . .

417 417 422 424 424 425 426 429 433

12 Polynomial Optimization . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .. . . . . . . . . . . . . . . . . . . . 12.1 The Problem . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .. . . . . . . . . . . . . . . . . . . . 12.2 Linear and Convex Relaxation Approaches .. . . .. . . . . . . . . . . . . . . . . . . . 12.2.1 The RLT Relaxation Method .. . . . . . . . . .. . . . . . . . . . . . . . . . . . . . 12.2.2 The SDP Relaxation Method.. . . . . . . . . .. . . . . . . . . . . . . . . . . . . . 12.3 Monotonic Optimization Approach . . . . . . . . . . . . .. . . . . . . . . . . . . . . . . . . . 12.3.1 Monotonic Reformulation .. . . . . . . . . . . . .. . . . . . . . . . . . . . . . . . . . 12.3.2 Essential Optimal Solution .. . . . . . . . . . . .. . . . . . . . . . . . . . . . . . . . 12.3.3 Solving the Master Problem . . . . . . . . . . .. . . . . . . . . . . . . . . . . . . . 12.3.4 Equality Constraints .. . . . . . . . . . . . . . . . . . .. . . . . . . . . . . . . . . . . . . . 12.3.5 Discrete Polynomial Optimization .. . . .. . . . . . . . . . . . . . . . . . . . 12.3.6 Signomial Programming . . . . . . . . . . . . . . .. . . . . . . . . . . . . . . . . . . . 12.4 An Illustrative Example .. . . . . . . . . . . . . . . . . . . . . . . . .. . . . . . . . . . . . . . . . . . . . 12.5 Exercises .. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .. . . . . . . . . . . . . . . . . . . .

435 435 436 436 439 441 442 443 445 450 450 451 451 452

13 Optimization Under Equilibrium Constraints . . . . . .. . . . . . . . . . . . . . . . . . . . 13.1 Problem Formulation .. . . . . . . . . . . . . . . . . . . . . . . . . . . .. . . . . . . . . . . . . . . . . . . . 13.2 Bilevel Programming.. . . . . . . . . . . . . . . . . . . . . . . . . . . .. . . . . . . . . . . . . . . . . . . . 13.2.1 Conversion to Monotonic Optimization . . . . . . . . . . . . . . . . . . . 13.2.2 Solution Method .. . . . . . . . . . . . . . . . . . . . . . .. . . . . . . . . . . . . . . . . . . . 13.2.3 Cases When the Dimension Can Be Reduced .. . . . . . . . . . . . 13.2.4 Specialization to (LBP). . . . . . . . . . . . . . . . .. . . . . . . . . . . . . . . . . . . . 13.2.5 Illustrative Examples . . . . . . . . . . . . . . . . . . .. . . . . . . . . . . . . . . . . . . . 13.3 Optimization Over the Efficient Set . . . . . . . . . . . . .. . . . . . . . . . . . . . . . . . . . 13.3.1 Monotonic Optimization Reformulation . . . . . . . . . . . . . . . . . . 13.3.2 Solution Method for (OWE) . . . . . . . . . . .. . . . . . . . . . . . . . . . . . . . 13.3.3 Solution Method for (OE) .. . . . . . . . . . . . .. . . . . . . . . . . . . . . . . . . . 13.3.4 Illustrative Examples . . . . . . . . . . . . . . . . . . .. . . . . . . . . . . . . . . . . . . . 13.4 Equilibrium Problem .. . . . . . . . . . . . . . . . . . . . . . . . . . . .. . . . . . . . . . . . . . . . . . . . 13.4.1 Variational Inequality.. . . . . . . . . . . . . . . . . .. . . . . . . . . . . . . . . . . . . . 13.5 Optimization with Variational Inequality Constraint . . . . . . . . . . . . . . . 13.6 Exercises .. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .. . . . . . . . . . . . . . . . . . . .

453 453 455 457 459 463 465 467 468 469 469 474 478 479 480 483 486

References .. .. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .. . . . . . . . . . . . . . . . . . . . 489 Index . . . . . . . . .. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .. . . . . . . . . . . . . . . . . . . . 501

Glossary of Notation

R Rn aff E conv E cl C ri C @C rec.C/ NC .x0 / Eı X\ domf epif f  .p/ f \ .p/ ıC .x/ sC .x/ dC .x/ @f .x0 / @" f .x0 / @\ f .x0 / LMI SDP DC.˝/ BB BRB OA PA CS SIT BCP

The real line Real n-dimensional space Affine hull of the set E Convex hull of the set E Closure of the set C Relative interior of the set C Relative boundary of a convex set C Recession cone of a convex set C (Outward) normal cone of a convex set C at point x0 2 C Polar of set E Lower conjugate of set X Effective domain of function f .x/ Epigraph of f .x/ Conjugate of function f .x/ Quasiconjugate of function f .x/ Indicator function of C Support function of C Distance function of C Subdifferential of f at x0 "-subdifferential of f at x0 Quasi-supdifferential of f at x0 Linear matrix inequality Semidefinite program Linear space formed by dc functions on ˝ Branch and Bound Branch Reduce and Bound Outer Approximation Polyhedral Annexation Cone Splitting Successive Incumbent Transcending Basic Concave Program xv

xvi

SBCP GCP LRC CDC COP MO MPEC MPAEC VI OVI GBP CBP LBP OE

Glossary of Notation

Separable Basic Concave Program General Concave Program Linear Reverse Convex Program Canonical DC Program Continuous Optimization Problem Canonical Monotonic Optimization Problem Mathematical Programming with Equilibrium Constraints MPEC with an Affine Variational Inequality Variational Inequality Optimization with Variational Inequality constraint General Bilevel Programming Convex Bilevel Programming Linear Bilevel Programming Optimization over Efficient Set

Part I

Convex Analysis

Chapter 1

Convex Sets

1.1 Affine Sets Let a; b be two points of Rn : The set of all x 2 Rn of the form x D .1  /a C b D a C .b  a/;

2R

(1.1)

is called the line through a and b: A subset M of Rn is called an affine set (or affine manifold ) if it contains every line through two points of it, i.e., if .1 /a Cb 2 M for every a 2 M; b 2 M, and every  2 R: An affine set which contains the origin is a subspace. Proposition 1.1 A nonempty set M is an affine set if and only if M D a C L; where a 2 M and L is a subspace. Proof Let M be an affine set and a 2 M: Then M D a C L; where the set L D M  a contains 0 and is an affine set, hence is a subspace. Conversely, let M D a C L; with a 2 M and L a subspace. For any x; y 2 M;  2 R we have .1  /x C y D a C .1  /.x  a/ C .y  a/: Since x  a; y  a 2 L and L is a subspace, it follows that .1  /.x  a/ C .y  a/ 2 L; hence .1  /x C y 2 M: Therefore, M is an affine set. t u The subspace L is said to be parallel to the affine set M: For a given nonempty affine set M the parallel subspace L is uniquely defined. Indeed, if M D a0 C L0 ; then L0 D M  a0 D L C .a  a0 /; but a0 D a0 C 0 2 M D a C L; hence a0  a 2 L; and so L0 D L C .a  a0 / D L: The dimension of the subspace L parallel to an affine set M is called the dimension of M: A point a 2 Rn is an affine set of dimension 0 because the subspace parallel to M D fag is L D f0g: A line through two points a; b is an affine set of dimension 1, because the subspace parallel to it is the one-dimensional subspace

© Springer International Publishing AG 2016 H. Tuy, Convex Analysis and Global Optimization, Springer Optimization and Its Applications 110, DOI 10.1007/978-3-319-31484-6_1

3

4

1 Convex Sets

fx D .b  a/j  2 Rg: An .n  1/-dimensional affine set is called a hyperplane, or a plane for short. Proposition 1.2 Any r-dimensional affine set is of the form M D fxj Ax D bg; where b 2 Rm ; A 2 Rmn such that rankA D n  r: Conversely, any set of the above form is an r-dimensional affine set. Proof Let M be an r-dimensional affine set. For any x0 2 M; the set L D M  x0 is an r-dimensional subspace, so L D fxj Ax D 0g; where A is some m  n matrix of rank n  r: Hence M D fxj A.x  x0 / D 0g D fxj Ax D bg; with b D Ax0 : Conversely if M is a set of this form then, for any x0 2 M; we have Ax0 D b; hence M D fxjA.x  x0 / D 0g D x0 C L; with L D fxj Ax D 0g: Since rankA D n  r; L is an r-dimensional subspace. t u Corollary 1.1 Any hyperplane is a set of the form H D fxj ha; xi D ˛g;

(1.2)

where a 2 Rn n f0g; ˛ 2 R: Conversely, any set of the above form is a hyperplane. The vector a in (1.2) is called the normal to the hyperplane H: Of course a and ˛ are defined up to a nonzero common multiple, for a given hyperplane H: The line ˛ faj  2 Rg meets H at a point a such that ha; ai D ˛; hence  D kak 2 : Thus, the distance from 0 to a hyperplane ha; xi D ˛ is equal to jjkak D

j˛j kak :

The intersection of a family of affine sets is again an affine set. Given a set E  Rn ; there exists at least an affine set containing E; namely Rn : Therefore, the intersection of all affine sets containing E is the smallest affine set containing E: This set is called the affine hull of E and denoted by aff E: Proposition 1.3 The affine hull of a set E is the set consisting of all points of the form x D  1 x 1 C : : : C k x k ; such that xi 2 E; 1 C : : : C k D 1; and k a natural number. ProofPLet M be the set P of all points of the above Pform. If x;Py 2 M; for example, x D kiD1 i xi ; y D hjD1 j yj ; with xi ; yj 2 E; kiD1 i D hjD1 j D 1; then for any ˛ 2 R we can write .1  ˛/x C ˛y D

k h X X .1  ˛/i xi C ˛j yj iD1

jD1

1.2 Convex Sets

5

P P with kiD1 .1  ˛/i C hjD1 ˛j D .1  ˛/ C ˛ D 1; hence .1  ˛/x C ˛y 2 M: Thus P M is affine, and hence M affE: On the other hand, it is easily seen that if P k x D kiD1 i xi with xi 2 E; iD1 i D 1; then x 2 affE (for k D 2 this follows from the definition of affine sets, for k  3 this follows by induction). So M  affE; and therefore, M D affE: t u Proposition 1.4 The affine hull of a set of k points x1 ; : : : ; xk .k > r/ in Rn is of dimension r if and only if the .n C 1/  k matrix 

x1 x2    xk 1 1  1

 (1.3)

is of rank r C 1: Proof Let M D afffx1 ; : : : ; xk g: Then M D xk C L; where L is the smallest space containing x1  xk ; : : : ; xk1  xk : So M is of dimension r if and only if L is of dimension r; i.e., if and only if the matrix Œx1  xk ; : : : ; xk1  xk  is of rank r: This in turn is equivalent to requiring that the matrix (1.3) be of rank r C 1: t u We say that k points x1 ; : : : ; xk are affinely independent if afffx1 ; : : : ; xk g has dimension k  1; i.e., if the vectors x1  xk ; : : : ; xk1  xk are linearly independent, or equivalently, the matrix (1.3) is of rank k: Corollary 1.2 The affine hull S of a set of k affinely independent points fx1 ; : : : ; xk g in Rn is a .k  1/-dimensional affine set. Every point x 2 S can be uniquely represented in the form xD

k X

i x i ;

iD1

k X

i D 1:

(1.4)

iD1

Pk i Proof By Proposition 1.3 any point x 2 S has the form (1.4). If x D iD1 i x Pk1 i k is another representation, then iD1 .i  i /.x  x / D 0; hence i D i 8i D 1; : : : ; k: t u Corollary 1.3 There is a unique hyperplane passing through n affinely independent points x1 ; x2 ; : : : ; xn in Rn : Every point of this hyperplane can be uniquely represented in the form xD

n X iD1

i x ; i

n X

i D 1:

iD1

1.2 Convex Sets Given two points a; b 2 Rn ; the set of all points x D .1  /a C b such that 0    1 is called the (closed) line segment between a and b and denoted by Œa; b:

6

1 Convex Sets

A set C  Rn is called convex if it contains any line segment between two points of it; in other words, if .1  /a C b 2 C whenever a; b 2 C; 0    1: Affine sets are obviously convex. Other particular examples of convex sets are halfspaces which are sets of the form fxj ha; xi  ˛g;

fxj ha; xi  ˛g

fxj ha; xi < ˛g;

fxj ha; xi > ˛g

(closed halfspaces), or

(open halfspaces), where a P 2 Rn ; ˛ 2 R: P A point x such that x D kiD1 i ai with ai 2 Rn ; i  0; kiD1 i D 1; is called a convex combination of a1 ; : : : ; ak 2 Rn : Proposition 1.5 A set C  Rn is convex if and only if it contains all the convex combinations of its elements. Proof It suffices to show that if C is a convex set then ai 2 C; i D 1; : : : ; k; i > 0;

k X

i D 1 )

iD1

k X

i ai 2 C:

iD1

Indeed, this P is obvious for k D 2: Assuming this is true for k D h  1  2 and h1 letting ˛ D iD1 i > 0 we can write h X iD1

i a i D

h1 X

i a i C h a h D ˛

iD1

h1 X i iD1

˛

a i C h a h :

Ph1 Ph1 Since iD1 .i =˛/ D 1; by the induction hypothesis y D iD1 .i =˛/ai 2 C: Next, h since ˛ C h D 1; we have ˛y C h a 2 C: Thus the above property holds for every natural k: t u Proposition 1.6 The intersection of any collection of convex sets is convex. If C; D are convex sets then C C D WD fx C yj x 2 C; y 2 Dg; ˛C D f˛xj x 2 Cg (and hence, also C  D WD C C .1/D/ are convex sets. Proof If fC˛ g is a collection of convex sets, and a; b 2 \˛ C˛ ; then for each ˛ we have a 2 C˛ ; b 2 C˛ ; therefore Œa; b  C˛ and hence, Œa; b  \˛ C˛ : If C; D are convex sets, and a D x C y; b D u C v with x; u 2 C; y; v 2 D; then .1  /a C b D Œ.1  /x C u C Œ.1  /y C v 2 C C D; for any  2 Œ0; 1; hence C C D is convex. The convexity of ˛C can be proved analogously. t u A subset M of Rn is called a cone if x 2 M;  > 0 ) x 2 M:

1.3 Relative Interior and Closure

7

The origin 0 itself may or may not belong to M: A set aCM; which is the translate of a cone M by a 2 Rn ; is also called a cone with apex at a: A cone M which contains no line is said to be pointed; in this case, 0 is also called the vertex of M; and a C M is a cone with vertex at a: Proposition 1.7 A subset M of Rn is a convex cone if and only if M  M

8 > 0

M C M  M:

(1.5) (1.6)

Proof If M is a convex cone then (1.5) holds by definition of a cone; furthermore, from x; y 2 M we have 12 .x C y/ 2 M; hence by (1.5), x C y 2 M: Conversely, if (1.5) and (1.6) hold, then M is a cone, and for any x; y 2 M;  2 .0; 1/ we have .1  /x 2 M; y 2 M; hence .1  /x C y 2 M; so M is convex. t u Corollary 1.4 A subset M of Rn is a convex cone if and only if it contains all the positive linear combinations of its elements. Proof Properties (1.6) and (1.5) mean that M is closed under addition and positive P scalar multiplication, i.e., M contains all elements of the form kiD1 i ai ; with ai 2 M; i > 0: t u Corollary 1.5 Let E be a convex set. The set fxj x 2 E;  > 0g is the smallest convex cone which contains E: Proof Since any cone containing E must contain this set it suffices to show that this set Pcone. Buti this follows Pfrom the preceding corollary and the fact that P is ai convex t u i ˛i x D  i .˛i =/x ; with  D i ˛i : The convex cone obtained by adjoining 0 to the cone mentioned in Corollary 1.5 is often referred to as the cone generated by E and is denoted by cone E:

1.3 Relative Interior and Closure A norm in Rn is by definition a mapping k  k from Rn to R which satisfies kxk  0 8x 2 Rn I kxk D 0 only if x D 0 1 k˛xk D j˛jkxk 8x 2 Rn ; ˛ 2 R n kx C yk  kxk C kyk x; y 2 R : Familiar examples of norms in Rn are the lp -norms

kxkp D

n X iD1

!1=p jxi j

p

;

1  p < 1;

(1.7)

8

1 Convex Sets

and the l1 -norm kxk1 D max jxi j: 1in

which can also be defined via the inner product by means of The l2 -norm, p kxk D hx; xi, is also called the Euclidean norm. The l1 -norm is also called the Tchebycheff norm. Given a norm k  k in Rn ; the distance between two points x; y 2 Rn is the nonnegative number kxykI the ball of center a and radius r is the set fx 2 Rn j kx ak  rg: Topological concepts in Rn such as open set, closed set, interior, closure, neighborhood, convergent sequence, etc., are defined in Rn with respect to a given norm. However, the following proposition shows that the topologies in Rn defined by different norms are all equivalent. Proposition 1.8 For any two norms kk; kk0 in Rn there exist constants c2  c1 > 0 such that c1 kxk  kxk0  c2 kxk 8x 2 Rn : We omit the proof which can be found, e.g., in Ortega and Rheinboldt (1970). Denote the closure and the interior of a set C by clC and intC; respectively. Recall that a 2 clC if and only if every ball around a contains at least one point of C; or equivalently, a is the limit of a sequence of points of CI a 2 intC if and only if there is a ball around a entirely included in C: An immediate consequence of Proposition 1.8 is the following alternative characterization of interior points of convex sets in Rn : Corollary 1.6 A point a of a convex set C  Rn is an interior point of C if for every x 2 Rn there exists ˛ > 0 such that a C ˛.x  a/ 2 C: Proof Without loss of generality we can assume that a D 0: Let ei be the i-unit vector of Rn : The condition implies that for each i D 1; : : : ; n there is ˛i > 0 such that ˛i ei 2 C and ˇi > 0 such that ˇi ei 2 C: Now let ˛ D minf˛i ; ˇiP ; i D 1; : : : ; ng and consider the set B D fxj kxi k1  ˛g: For any x 2 B; since x D niD1 xi ei with Pn jxi j i i i i iD1 ˛  1; if we define u D ˛e for xi > 0 and u D ˛e for xi  0 then Pn jxi j i Pn jxi j 1 n u ; : : : ; u 2 C and we can write x D iD1 ˛ u C .1  iD1 ˛ /0: From the convexity of C it follows that x 2 C: Thus, B  C; and since B is a ball in the l1 -norm, 0 is an interior point of C: t u A point a belongs to the relative interior of a convex set C  Rn W a 2 riC; if it is an interior point of C relative to affC: The set difference (clC/ n .riC/ is called the relative boundary of C and denoted by @C: Proposition 1.9 Any nonempty convex set C  Rn has a nonempty relative interior.

1.3 Relative Interior and Closure

9

Proof Let M D affC with dim M D r; so there existPr C 1 affinely independent rC1 i 1 elements x1 ; : : : ; xrC1 of C: We show that a D rC1 x is a relative interior PrC1 iD1 PrC1 i of C: Indeed, any x 2 M is of the form x D  x ; with iD1 i iD1 i D 1; so PrC1 i a C ˛.x  a/ D iD1 i x with i D .1  ˛/

1 C ˛i ; rC1

rC1 X

i D 1:

iD1

For ˛ > 0 sufficiently small, we have i > 0; i D 1; : : : ; r C 1; hence a C ˛.x  a/ 2 C: Therefore, by Corollary 1.6, a 2 riC: t u The dimension of a convex set is by definition the dimension of its affine hull. A convex set C in Rn is said to have full dimension if dim C D n: It follows from the above Proposition that a convex set C in Rn has a nonempty interior if and only if it has full dimension. Proposition 1.10 The closure and the relative interior of a convex set are convex. Proof Let C be a convex set, a; b 2 clC; say a D lim!1 x ; b D lim!1 y ; where x ; y 2 C for every : For every  2 Œ0; 1; we have .1  /x C y 2 C; hence, .1  /a C b D limŒ.1  /x C y  2 clC: Thus, if a; b 2 clC then Œa; b  clC; proving the convexity of clC: Now let a; b 2 riC; so that there is a ball B around 0 such that .a C B/ \ affC and .b C B/ \ affC are contained in C: For any x D .1  /a C b; with  2 .0; 1/ we have .x C B/ \ affC D .1  /.a C B/ \ affC C .b C B/ \ affC  C; so x 2 riC: This proves the convexity of riC: t u Let C  Rn be a convex set containing the origin. The function pC W Rn ! R such that pC .x/ D inff > 0j x 2 Cg is called the gauge or Minkowski functional of C: Proposition 1.11 The gauge pC .x/ of a convex set C  Rn containing 0 in its interior satisfies (i) (ii) (iii) (iv)

pC .˛x/ D ˛pC .x/ 8x 2 Rn ; ˛  0 (positive homogeneity), pC .x C y/  pC .x/ C pC .y/ 8x; y 2 C (subadditivity), pC .x/ is continuous, intC D fxj pC .x/ < 1g  C  clC D fxj pC .x/  1g:

Proof (i) Since 0 2 intC it is easy to see that pC .x/ is finite everywhere. For every ˛  0 we have pC .˛x/ D inff > 0j ˛x 2 Cg D ˛ inff > 0j x 2 Cg D ˛pC .x/: (ii) If x 2 C; y 2 C .;  > 0/; i.e., x D x0 ; y D y0 with x0 ; y0 2 C; then

10

1 Convex Sets

x C y D . C /Œ

  0 x0 C y  2 . C /C: C C

Thus, pC .x C y/   C  for all ;  > 0 such that x 2 C; y 2 C: Hence pC .x C y/  pC .x/ C pC .y/: (iii) Consider any x0 2 Rn and let x  x0 2 Bı WD fuj kuk  ıg; where ı > 0 is so small that the ball Bı is contained in "C: Then, x  x0 2 "C; consequently pC .x  x0 /  "; hence pC .x/  pC .x0 / C pC .x  x0 /  pC .x0 / C "I on the other hand, x0  x 2 "C implies similarly pC .x0 /  pC .x/ C ": Thus, jpC .x/  pC .x0 /j  " whenever x  x0 2 Bı ; proving the continuity of pC at any x0 2 Rn : (iv) This follows from the continuity of pC .x/: t u Corollary 1.7 If a 2 intC; b 2 clC; then x D .1/aCb 2 intC for all  2 Œ0; 1/: If a convex set C has nonempty interior then cl(intC) = clC; int(clC/ = intC: Proof By translating we can assume that 0 2 intC; so intC D fxj pC .x/ < 1g: Then pC .x/  .1  /pC .a/ C pC .b/ < 1 whenever pC .a/ < 1; pC .b/  1; and 0   < 1: The second assertion is a straightforward consequence of (iv). t u Let C be a convex set in Rn : A vector y ¤ 0 is called a direction of recession of C if fx C yj   0g  C

8x 2 C:

(1.8)

Lemma 1.1 The set of all directions of recession of C is a convex cone. If C is closed then (1.8) holds provided fx C yj   0g  C for one x 2 C: Proof The first assertion can easily be verified. If C is closed, and fx C yj   0 0g  C for one x 2 C then for any x0 2 C;    0 we have x C y 2 C because   x0 C y D lim!C1 z ; where z D 1  C x0 C C .x C y/ 2 C: t u The convex cone formed by all directions of recession and the vector zero is called the recession cone of C and is denoted by recC: Proposition 1.12 Let C  Rn be a convex set containing a point a in its interior. (i) For every x ¤ a the halfline  .a; x/ D fa C .x  a/j  > 0g either entirely lies in C or it cuts the boundary @C at a unique point .x/; such that every point in the line segment Œa; .x/; except .x/; is an interior point of C: (ii) The mapping .x/ is defined and continuous on the set Rn n .a C M/; where M D rec.intC/ D rec.clC/ is a closed convex cone. Proof (i) Assuming, without loss of generality, that 0 2 intC; it suffices to prove the proposition for a D 0: Let pC .x/ be the gauge of C: For every x ¤ a; if pC .x/ D 0 then x 2 C for all arbitrarily small  > 0; i.e., x 2 C for all  > 0; hence the halfline  .a; x/ entirely lies in C: If pC .x/ > 0 then pC .x/ D pC .x/ D 1

1.3 Relative Interior and Closure

11

a+M

a

x

C

σ (x) Fig. 1.1 The mapping  .x/

just for  D 1=pC .x/: So  .a; x/ cuts @C at a unique point .x/ D x=pC .x/; such that for all 0   < 1=pC .x/ we have pC .x/ < 1 and hence x 2 intC: (ii) The domain of definition of  is Rn n M; with M D fxj pC .x/ D 0g: The set M is a closed convex cone (Fig. 1.1) because pC .x/ is continuous and pC .x/ D 0; pC .y/ D 0 implies 0  pC .x C y/  pC .x/ C pC .y/ D 0 for all  > 0 and  > 0: Furthermore, the function .x/ D x=pC .x/ is continuous at any x where pC .x/ > 0: It remains to prove that M is the recession cone of both intC and clC: It is plain that rec(intC/  M: Conversely, if y 2 M and x 2 intC then x D ˛a C .1  ˛/c; for some c 2 C and 0 < ˛ < 1: Since a C y 2 C for all  > 0; it follows that x C y D ˛a C .1  ˛/c C y D ˛.a C ˛ y/ C .1  ˛/c 2 C for all  > 0; i.e., y 2 rec.intC/: Thus, M  rec.intC/: Finally, any x 2 @C is the limit of some sequence fxk g  intC: If y 2 M then for every  > 0 W xk C y 2 C; 8k; hence by letting k ! 1; x C y 2 clC: This means that M  rec.clC/; and since the converse inclusion is obvious we must have M D rec.clC/: t u Note that if C is neither closed nor open then recC may be a proper subset of rec(intC/ D rec.clC/: An example is the set C D f.x1 ; x2 /j x1 > 0; x2 > 0g [ f.0; 0/g; for which recC D C; rec.intC/ D clC: Corollary 1.8 A nonempty closed convex set C  Rn is bounded if and only if its recession cone is the singleton f0g: Proof By considering C in affC; we can assume that C has an interior point (Proposition 1.6) and that 0 2 intC: If C is bounded then by Proposition 1.12, .x/ is defined on Rn n f0g; i.e., rec(int)C D f0g: Conversely, if rec(int)C D f0g; then pC .x/ > 0 8x ¤ 0: Letting K D minfpC .x/j kxk D 1g; we then have, for every   1 1 x   : t u x 2 C W pC .x/ D kxkpC kxk  1; hence, kxk  K pC x kxk

12

1 Convex Sets

1.4 Convex Hull Given any set E  Rn ; there exists a convex set containing E; namely Rn : The intersection of all convex sets containing E is called the convex hull of E and denoted by convE: This is in fact the smallest convex set containing E: Proposition 1.13 The convex hull of a set E  Rn consists of all convex combinations of its elements. Proof Let C be the set of all convex combinations of E: Clearly C  convE: To prove the converse suffices to show that C is convex (since obviously P inclusion, itP C  E/: If x D i2I i ai ; y D j2J j bj with ai ; bj 2 E and 0  ˛  1 then X X .1  ˛/i ai C ˛j bj : .1  ˛/x C ˛y D i2I

Since

X

.1  ˛/i C

i2I

D .1  ˛/

X

X

j2J

˛j

j2J

i C ˛

i2I

X

j D .1  ˛/ C ˛ D 1;

j2J

it follows that .1  ˛/x C ˛y 2 C: Therefore, C is convex.

t u

The convex hull S of a set of k affinely independent points a1 ; : : : ; ak in Rn ; written S D Œa1 ; : : : ; ak ; is called a .k  1/-simplex spanned by a1 ; : : : ; ak : By Corollary 1.2 this is a .k  1/-dimensional set every point of which is uniquely represented as a convex combination of a1 ; : : : ; ak W xD

k X iD1

i a i ;

with

k X

i D 1; i  0; i D 1; : : : ; k:

iD1

The numbers i D i .x/; i D 1; : : : ; k are called the barycentric coordinates of x in the simplex S: For any two points x; x0 in S; since  k  k X  X  0 0  i .x/  i .x / D kx  x0 k k.x/  .x /k     iD1

iD1

the 1-1 map x 7! .x/ is continuous. Proposition 1.13 shows that any point x 2 convE can be expressed as a convex combination of a finite number of points of E: The next proposition which gives an upper bound of the minimal number of points in such expressions is a fundamental dimensionality result in convexity theory. Theorem 1.1 (Caratheodory’s Theorem) Let E be a set contained in an affine set of dimension k: Then any vector x 2 convE can be represented as a convex combination of k C 1 or fewer elements of E:

1.4 Convex Hull

13

Proof By Proposition 1.13, for any x 2 convE there exists a finite subset of E; say a1 ; : : : ; am 2 E; such that m X

i a D x; i

iD1

m X

i D 1; i  0:

iD1

If  D .1 ; : : : ; m / is a basic solution (in the sense of linear programming theory) of the above linear system in 1 ; : : : ; m ; then at most k C 1 components of  are positive. For example, i D 0 for all i > h where h  kC1; i.e., x can be represented as a convex combination of a1 ; : : : ; ah ; with h  k C 1: t u Corollary 1.9 The convex hull of a compact set is compact. g  convE: By the above theorem, xk D Proof Let E be a compact set and fxkP P nC1 i;k i;k 2 E; i;k  0; nC1 iD1 i;k a ; with a iD1 i;k D 1: Since E  Œ0; 1 is compact, there exists a subsequence fk g such that i;k ai;k ! i ai ; ai 2 E; i 2 Œ0; 1: We then have xk !

PnC1 iD1

i ai 2 convE; proving the corollary.

t u

Remark 1.1 The convex hull of a closed set E may not be closed, as shown by the example of a set E  R2 consisting of a straightline L and a point a … L (Fig. 1.2). Let fS!i ; i D 1; : : : ; mg be any m X Si there exists a subfamily finite family of sets in Rn : For every x 2 conv Theorem 1.2 (Shapley–Folkman’s Theorem)

iD1

fSi ; i 2 Ig; with jIj  n; such that X X Si C conv.Si /: x2 i…I

i2I

a

L Fig. 1.2 Convex hull of a closed set

14

1 Convex Sets

Roughly speaking, if m is very large and every Si is sufficiently small then the sum P m iD1 Si differs little from its convex hull. Pm P Indeed, define P the linear Proof Observe that conv. m iD1 Si / D iD1 convSi : P m m nm n 1 m i mapping ' W R ! R such that '..x ; : : : ; x // D x : Then iD1 iD1 Si D Qm '. iD1 Si / and by linearity of ' W !! ! ! m m m Y Y Y D ' conv Si Si D ' convSi ; conv ' iD1

iD1

iD1

Pm

P i proving our assertion. So any x 2 conv. iD1 Si / is of the form x D m iD1 x ; with i i ij x 2 convSi : Let x be a convex combination of x 2 Si ; j 2 Ji : Then there exist ij satisfying m X X

ij xij D x

iD1 j2Ji

X

ij D 1

i D 1; : : : ; m:

j2Ji

ij  0 

i D 1; : : : ; m; j 2 Ji :

.ij /

Let  D be a basic solution of this linear system in ij ; so that at most n C m components of  are positive. Denote by I the set of all i such that ij > 0 for at least two j 2 Ji ; and let jIj D k: Then  has at least 2k C .m  k/ positive components, therefore n C m  2k C .m  k/; and hence k  n: Since for every i … I we have iji D 1 for some ji 2 Ji and ij D 0 for all other j 2 Ji ; it follows that XX X X X xiji C ij xij 2 Si C convSi : xD i…I

i2I j2Ji

i…I

i2I

t u

This is the desired expression.

We close this section by a proposition generalizing Caratheodory’s Theorem. A set X  Rn is called a Caratheodory core of a set E  Rn if there exist a set B  Rn ; an interval   R along with a mapping ˛ W X !  such that tB  t0 B for t; t0 2 ; t < t0 ; and [ ED .x C ˛.x/B/: x2X

A case Sof interest for the applications is when B is the unit ball and  D RC ; so that E D x2X B.x; ˛.x// where B.x; r/ denotes the ball of center x and radius r: Since a trivial Caratheodory core of any set E is X D E with B being the unit ball in Rn and  D f0g; Caratheodory’s Theorem appears to be a special case of the following general proposition (Thach and Konno 1996):

1.4 Convex Hull

15

Proposition 1.14 If a set E in Rn has a Caratheodory core X contained in a k-dimensional affine set then any vector of convE is a convex combination of k C 1 or fewer elements of E: Proof For any y 2 conv.E/ there exists a finite set of vectors fy1 ; : : : ; ym g  E such that m X

yD

i yi ; i  0;

iD1

m X

i D 1:

(1.9)

iD1

i i i i i Since E, there Pm is x 2 X such that y 2 x C ˛i B; ˛i WD ˛.x /: Setting x D Pm y 2 i iD1 i x , D iD1 i ˛i ; we have

.x; / D

m X

i .xi ; ˛i /:

iD1

Consider the linear program max

m X

ti ˛i W

iD1 m X

i

ti x D x;

iD1

m X

ti D 1

(1.10)

iD1

ti  0; i D 1; : : : ; m:

(1.11)

and let t WD .t1 ; : : : ; tm / be a basic optimal solution of it. Since X is contained in an affine set of dimension k the matrix of the system (1.10)  1 2  x x    xm 1 1  1 is of rank at most k C 1 (Proposition 1.4). Therefore, t must satisfy m  .k C 1/ constraints (1.11) as equalities. That is, ti D 0 for at least m  .k C 1/ indices i: Without loss of generality we can assume that ti D 0 for i D k C 2; : : : ; m; so by P  setting  D kC1 iD1 ti ˛i , we have 

.x; / D

kC1 X

ti .xi ; ˛i /; ti

 0;

iD1 

kC1 X

ti D 1:

iD1



Here  (because t is optimal to the above program while ti D i ; i D 1; : : : ; m is a feasible solution). Hence B   B and we can write yD

m X

i y i 2

iD1

m X

i .xi C ˛i B/

iD1

D

m X iD1

i x i C

m X iD1

i ˛i B

16

1 Convex Sets

D x C B  x C  B D

kC1 X

ti xi

iD1

D

kC1 X

C

kC1 X

ti ˛i B

iD1

ti .xi C ˛i B/:

iD1 i i We have found k C 1 vectors PkC1thus PkC1 u WD x C ˛i B 2 E; i D 1; : : : ; k C 1; such that  i  y D iD1 ti u with ti  0; iD1 ti D 1: t u

Note that in the above proposition, the dimension k of affX can be much smaller than the dimension of affE: For instance, if X D Œa; b is a line segment in Rn and B is the unit ball while ˛.a/ D ˛.b/ D 1; ˛.x/ D 0 8x … fa; bg (so that E is the union of a line segment and two balls centered at the endpoints of this segment) then affE D Rn but affX D R: Thus in this case, by Proposition 1.14, any point of convE can be represented as the convex combination of only two points of E (and not n C 1 points as by applying Caratheodory’s Theorem.

1.5 Separation Theorems Separation is one of the most useful concepts of convexity theory. The first and fundamental fact is the following: Lemma 1.2 Let C be a nonempty convex subset of Rn : If 0 … C then there exists a vector t 2 Rn such that supht; xi  0; x2C

inf ht; xi < 0:

x2C

(1.12)

Proof The proposition is obvious for n D 1: Arguing by induction, assume it is true for n D k  1 and prove it for n D k: Define C0 D fx 2 Cj xn D 0g; CC D fx 2 Cj xn > 0g;

C D fx 2 Cj xn < 0g:

If xn D 0 for all x 2 C then C can be identified with a set in Rn1 ; so the Lemma is true by the induction assumption. If C D ; but CC ¤ ;; then the vector t D .0; : : : ; 0; 1/ satisfies (1.12). Analogously, if CC D ; and C ¤ ; we can take t D .0; : : : ; 0; 1/: Therefore, we may assume that both C and CC are nonempty. For any x 2 CC ; x0 2 C we have   x0 x   0 2 C0 (1.13) xn xn

1.5 Separation Theorems

17

   D 1: Thus, the set xn x0n

for  > 0 such that

C D f.x1 ; : : : ; xn1 /j .x1 ; : : : ; xn1 ; 0/ 2 Cg is a nonempty convex set in Rn1 which does not contain the origin. By the induction hypothesis there exists a vector t 2 Rn1 and a point .Ox1 ; : : : ; xO n1 / 2 C satisfying n1 X

ti xO i < 0;

sup

iD1

n1 X

ti xi j

.x1 ; : : : ; xn1 / 2 C  0:

(1.14)

iD1

From (1.13) and (1.14), for all x 2 CC ; x0 2 C we deduce   n1 X x0i xi   0; ti  xn x0n iD1 i.e., p.x/  q.x0 /; where p.x/ D

n1 X iD1

ti

xi .x 2 CC /; xn

q.x0 / D

n1 X iD1

ti

x0i 0 .x 2 C /: xn

Setting p D sup p.x/;

q D inf q.x/ x2C

x2CC

 we then conclude p  q: Therefore, if we now take t D .t1 ; : : : ; tn1 ; tn /; with tn 2 Œp; q then p.x/ C tn  0 8x 2 CC and tn C q.x/  0 8x 2 C : This implies n X

ti xi  0 8x 2 C;

iD1

completing the proof since ht; xO i < 0 for xO D .Ox1 ; : : : ; xO n1 ; 0/:

t u

We say that two nonempty subsets C; D of Rn are separated by the hyperplane ht; xi D ˛ .t 2 Rn n f0g/ if supht; xi  ˛  inf ht; yi: x2C

y2D

(1.15)

Theorem 1.3 (First Separation Theorem) Two disjoint nonempty convex sets C; D in Rn can be separated by a hyperplane. Proof The set C  D is convex and satisfies 0 … C  D; so by the previous Lemma, there exists t 2 Rn n f0g such that ht; x  yi  0 8x 2 C; y 2 D: Then (1.15) holds for ˛ D supx2C ht; xi: t u Remark 1.2 Before drawing some consequences of the above result, it is useful to recall a simple property of linear functions:

18

1 Convex Sets

A linear function l.x/ WD ht; xi with t 2 Rn n f0g never achieves its minimum (or maximum) over a set D at an interior point of DI if it is bounded above (or below) over an affine set then it is constant on this affine set. It follows from this fact that if the set D in Theorem 1.3 is open then (1.15) implies ht; xi  ˛ < ht; yi

8x 2 C; y 2 D:

Furthermore: Corollary 1.10 If an affine set C does not meet an open convex set D then there exists a hyperplane containing C and not meeting D: Proof Let ht; xi D ˛ be the hyperplane satisfying (1.15). By the first part of the Remark above, we must have ht; xi D const 8x 2 C, i.e., C is contained in the hyperplane ht; x  x0 i D 0; where x0 is any point of C: By the second part of the Remark, ht; yi < ˛ 8y 2 D (hence D \ C D ;/; since if ht; y0 i D ˛ for some y0 2 D then this would mean that ˛ is the maximum of the linear function ht; xi over the open set D: t u Lemma 1.3 Let C be a nonempty convex set in Rn : If C is closed and 0 … C then there exists a vector t 2 Rn n f0g and a number ˛ < 0 such that ht; xi  ˛ < 0

8x 2 C:

Proof Since C is closed and 0 … C; there is a ball D around 0 which does not intersect C: By the above theorem, there is t 2 Rn n f0g satisfying (1.15). Then ˛ < 0 because 0 2 D: t u We say that two nonempty subsets C; D of Rn are strongly separated by a hyperplane ht; xi D ˛; .t 2 Rn n f0g; ˛ 2 R/ if supht; xi < ˛ < inf ht; yi: x2C

y2D

(1.16)

Theorem 1.4 (Second Separation Theorem) Two disjoint nonempty closed convex sets C; D in Rn such that either C or D is compact can be strongly separated by a hyperplane. Proof Assume, e.g., that C is compact and consider the convex set C  D: If zk D xk  yk ! z; with xk 2 C; yk 2 D; then by compactness of C there exists a subsequence xk ! x 2 C and by closedness of D; yk D xk  zk ! x  z 2 DI hence z D xy; with x 2 C; y 2 D: So the set CD is closed. Since 0 … CD; there exists by the above Lemma a vector t 2 Rn nf0g such that ht; xyi  < 0 8x 2 C; y 2 D: Then supx2C ht; xi  2  infy2D ht; yi C 2 : Setting ˛ D infx2C ht; xi C 2 yields (1.16). t u

1.6 Geometric Structure

19

Remark 1.3 If the set C is a cone then one can take ˛ D 0 in (1.15). Indeed, for any given x 2 C we then have x 2 C 8 > 0; so ht; xi  ˛ 8 > 0; hence ht; xi  0 8x 2 C: Similarly, one can take ˛ D 0 in (1.16).

1.6 Geometric Structure There are several important concepts related to the geometric structure of convex sets: supporting hyperplane and normal cone, face and extreme point.

1.6.1 Supporting Hyperplane A hyperplane H D fxj ht; xi D ˛g is said to be a supporting hyperplane to a convex set C  Rn if at least one point x0 of C lies in H W ht; x0 i D ˛; and all points of C lie in one halfspace determined by H; say: ht; xi  ˛ 8x 2 C (Fig. 1.3). The halfspace ht; xi  ˛ is called a supporting halfspace to C: Theorem 1.5 Through every boundary point x0 of a convex set C  Rn there exists at least one supporting hyperplane to C: Proof Since x0 … riC; there exists by Theorem 1.3 a hyperplane ht; xi D ˛ such that ht; xi  ˛  ht; x0 i: 8x 2 riC; hence ht; x  x0 i  0

8x 2 C:

Thus, the hyperplane H D fxj ht; x  x0 i D 0g is a supporting hyperplane to C at x0 : u t Fig. 1.3 Supporting hyperplane

x0

C

20

1 Convex Sets

The normal t to a supporting hyperplane to C at x0 is characterized by the condition ht; x  x0 i  0

8x 2 C:

(1.17)

A vector t satisfying this condition does not make an acute angle with any line segment in C having x0 as endpoint. It is also called an outward normal, or simply a normal, to C at point x0 (t is called an inward normal). The set of all vectors t normal to C at x0 is called the normal cone to C at x0 and is denoted by NC .x0 /: It is readily verified that NC .x0 / is a convex cone which reduces to the singleton f0g when x0 is an interior point of C: Proposition 1.15 Let C be a closed convex set. For any y0 … C there exists x0 2 @C such that y0  x0 2 NC .x0 /: Proof Let fxk g  C be a sequence such that ky0  xk k ! inf ky0  xk x2C

.k ! 1/:

The sequence fky0  xk kg is bounded (because convergent), and hence so is the sequence fkxk kg because kxk k  kxk y0 kCky0 k: Therefore, there is a subsequence xk ! x0 : Since C is closed, we have x0 2 C and ky0  x0 k D min ky0  xk: x2C

0

0

0

It is easy to see that y  x 2 NC .x /: Indeed, for any x 2 C and  2 .0; 1/ we have x WD x C .1  /x0 2 C; so ky0  x k2  ky0  x0 k2 ; i.e., k.y0  x0 /  .x  x0 /k2  ky0  x0 k2 : Upon easy computation this yields 2 kx  x0 k2  2hy0  x0 ; x  x0 i  0 for all  2 .0; 1/; hence hy0  x0 ; x  x0 i  0; as desired.

t u

Note that the condition y0  x0 2 NC .x0 / uniquely defines x0 : Indeed, if y0  x1 2 NC .x1 / then from the inequalities hy0  x0 ; x1  x0 i  0;

hy0  x1 ; x0  x1 i  0;

it follows that hx0  x1 ; x1  x0 i  0; hence x1  x0 D 0: The point x0 is called the projection of y0 on the convex set C: As seen from the above, it is the point of C nearest to y0 : Theorem 1.6 A closed convex set C which is neither empty nor the whole space is equal to the intersection of all its supporting halfspaces.

1.6 Geometric Structure

21

Proof Let E be this intersection. Obviously, C  E: Suppose E n C ¤ ; and let y0 2 E n C: By the above proposition, there exists a supporting halfspace to C which does not contain y0 ; conflicting with y0 2 E: Therefore, C D E: t u

1.6.2 Face and Extreme Point A convex subset F of a convex set C is called a face of C if any line segment in C with a relative interior point in F entirely lies in F; i.e., if x; y 2 C; .1  /x C y 2 F; 0 <  < 1 ) Œx; y  F:

(1.18)

Of course, the empty set and C itself are faces of C: A proper face of C is a face which is neither the empty set nor C itself. A face of a convex cone is itself a convex cone. Proposition 1.16 The intersection of a convex set C with a supporting hyperplane H is a face of C: Proof The convexity of F WD H\C is obvious. If a segment Œa; b  C has a relative interior point x in F and were not contained in F then H would strictly separate a and b; contrary to the definition of a supporting hyperplane. t u A zero-dimensional face of C is called an extreme point of C: In other words, an extreme point of C is a point x 2 C which is not a relative interior point of any line segment with two distinct endpoints in C: If a convex set C has a halfline face, the direction of this halfline is called an extreme direction of C: The set of extreme directions of C is the same as the set of extreme directions of its recession cone. It can easily be verified that a face of a face of C is also a face of C: In particular: An extreme point (or direction) of a face of a convex set C is also an extreme point (or direction, resp.) of C: Proposition 1.17 Let E be an arbitrary set in Rn : Every extreme point of C D convE belongs to E: Pk i i Proof Let x be an extreme point of C: By Proposition 1.13 x D iD1 i x ; x 2 Pk E; i > 0; iD1 i D 1: It is easy to see that i D 1 (i.e., x D xi / for some i: In fact, otherwise there is h with 0 < h < 1; so x D h xh C .1  h /yh ; with yh D

X i¤h

i xi 2 C 1  h

P i D 1/; i.e., x is a relative interior point of the line segment (because i¤h 1 h h h Œx ; y   C; conflicting with x being an extreme point. t u

22

1 Convex Sets

Given an arbitrary point a of a convex set C; there always exists at least one face of C containing a; namely C: The intersection of all faces of C containing a is obviously a face. We refer to this face as the smallest face containing a and denote it by Fa : Proposition 1.18 Fa is the union of a and all x 2 C for which there is a line segment Œx; y  C with a as a relative interior point. Proof Denote the mentioned union by F: If x 2 F; i.e., x 2 C and there is y 2 C such that a D .1  /x C y with 0 <  < 1; then Œx; y  Fa because Fa is a face. Thus, F  Fa : It suffices, therefore, to show that F is a face. By translating if necessary, we may assume that a D 0: It is easy to see that F is a convex subset of C: Indeed, if x1 ; x2 2 F; i.e., x1 D 1 y1 ; x2 D 2 y2 ; with y1 ; y2 2 C and 1 ; 2 > 0 then for any x D .1  ˛/x1 C ˛x2 .0 < ˛ < 1/ we have x D .1  ˛/1 y1  ˛2 y2 ; with .1  ˛/1 ˛2 1 xD y1 C y2 2 C; .1  ˛/1 C ˛2 .1  ˛/1 C ˛2 .1  ˛/1 C ˛2 hence x 2 F: Thus, it remains to prove (1.18). Let x; y 2 C; u D .1  /x C y 2 F; 0 <  < 1: Since u 2 F; there is v 2 C such that u D v; so xD

u y v y  D  : 1 1 1 1

We have 

.1  /x   D vC y 2 C; C C C

so x 2 F; and similarly, y 2 F: Therefore, F is a face, completing the proof.

t u

An equivalent formulation of the above proposition is the following: Proposition 1.19 Let C be a convex set and a 2 C: A face F of C is the smallest face containing a if and only if a 2 riF: Proof Indeed, for any subset F of C the condition in Proposition 1.18 is equivalent to saying that F is a face and a 2 riF: t u Corollary 1.11 If a face F1 of a convex set C is a proper subset of a face F2 then dim F1 < dim F2 : Proof Let a be any relative interior point of F1 : If dim F1 D dim F2 ; then a 2 riF2 ; and so both F1 and F2 coincide with the smallest face containing a: t u

1.7 Representation of Convex Sets

23

Corollary 1.12 A face F of a closed convex set C is a closed set. Proof We may assume that 0 2 riF; so that F D F0 : Let M be the smallest space containing F: Clearly, every x 2 M is of the form x D y for some y 2 F and  > 0: Therefore, by Proposition 1.18, F D M \ C and since M and C are closed, so must be F: t u

1.7 Representation of Convex Sets Given a nonempty convex set C; the set (recC/ \ .recC/ is called the lineality space of C: It is actually the largest subspace contained in recC and consists of the zero vector and all the nonzero vectors y such that, for every x 2 C; the line through x in the direction of y is contained in C: The dimension of the lineality space of C is called the lineality of C: A convex set C of lineality 0, i.e., such that (recC/ \ .recC/ D f0g; contains no line (and so is sometimes said to be line-free). Proposition 1.20 A nonempty closed convex set C has an extreme point if and only if it contains no line. Proof If C has an extreme point a then a is also an extreme point of a C recC; hence 0 is an extreme point of recC; i.e., (recC/ \ .recC/ D f0g: Conversely, suppose C is line-free and let F be a face of smallest dimension of C: Then F must also be line-free, and if F contains two distinct points x; y then the intersection of C with the line through x; y must have an endpoint a: Of course Fa cannot contain Œx; y; hence by Corollary 1.11, dim Fa < dim F; a contradiction. Therefore, F consists of a single point which is an extreme point of C: t u If a nonempty convex set C has a nontrivial lineality space L then for any x 2 C we have x D u C v; with u 2 L; v 2 L? ; where L? is the orthogonal complement of L: Clearly v D x  u 2 C  L D C C L  C; hence C can be decomposed into the following sum: C D L C .C \ L? /;

(1.19)

in which the convex set C \ L? contains no line, hence has an extreme point, by the above proposition. We are now in a position to state the fundamental representation theorem for convex sets. In view of the above decomposition (1.19) it suffices to consider closed convex sets C with lineality zero. Denote the set of extreme points of C by V.C/ and the set of extreme directions of C by U.C/: Theorem 1.7 Let C  Rn be a closed convex set containing no line. Then C D convV.C/ C coneU.C/:

(1.20)

24

1 Convex Sets

In other words any x 2 C can be represented in the form xD

X i2I

i v i C

X

j uj ;

j2J

with I; J finite index sets, v i 2 V.C/; ui 2 U.C/; and i  0; j  0;

P i

i D 1:

Proof The theorem is obvious if dim C D 0: Arguing by induction, assume that the theorem is true for dim C D n  1 and consider the case when dim C D n: Let x be an arbitrary point of C and  be any line containing x: Since C is line-free, the set  \ C is either a closed halfline, or a closed line segment. In the former case, let  D fa C uj   0g: Of course a 2 @C; and there is a supporting hyperplane to C at a by Theorem 1.5. The intersection F of this hyperplane with C is a face of C of dimension less than n: By the induction hypothesis, a 2 coneU.F/ C convV.F/  coneU.C/ C convV.C/: Since u 2 coneU.C/ and x D a C u for some   0; it follows that x D a C u 2 .coneU.C/ C convV.C// C coneU.C/  coneU.C/ C convV.C/: In the case when  is a closed line segment Œa; b; both a; b belong to @C; and by the same argument as above, each of these points belongs to coneU.C/CconvV.C/: Since this set is convex and x 2 Œa; b; x belongs to this set too. t u An immediate consequence of Theorem 1.7 is the following important Minkowski’s Theorem which corresponds to the special case U.C/ D f0g: Corollary 1.13 A nonempty compact convex set C is equal to the convex hull of its extreme points. t u In view of Caratheodory’s Theorem one can state more precisely: Every point of a nonempty compact convex set C of dimension k is a convex combination of k C 1 or fewer extreme points of C: Corollary 1.14 A nonempty compact convex set C has at least one extreme point and every supporting hyperplane H to C contains at least one extreme point of C: Proof The set C \ H is nonempty, compact, convex, hence has an extreme point. Since by Proposition 1.16 this set is a face of C any extreme point of it is also an extreme point of C: t u

1.8 Polars of Convex Sets

25

1.8 Polars of Convex Sets By Theorem 1.6 a closed convex set which is neither empty nor the whole space is fully determined by the set of its supporting hyperplanes. Its turns out that a duality correspondence can be established between convex sets and their sets of supporting hyperplanes. For any set E in Rn the set Eı D fy 2 Rn j hy; xi  1; 8x 2 Eg is called the polar of E: If E is a closed convex cone, so that E  E for all   0, then the condition hy; xi  1; 8x 2 E, is equivalent to hy; xi  0; 8x 2 E. Therefore, the polar of a cone M is the cone M ı D fy 2 Rn j hy; xi  0; 8x 2 Mg: The polar of a subspace is the same as its orthogonal complement. Proposition 1.21 (i) The polar Eı of any set E is a closed convex set containing the origin; if E  F then F ı  Eı I (ii) E  Eıı ; if E is closed, convex, and contains the origin then E ıı D EI (iii) 0 2 intE if and only if Eı is bounded. Proof (i) Straightforward. (ii) If x 2 E then hx; yi  1 8y 2 Eı , hence x 2 Eıı : Suppose E is closed, convex and contains 0, but there is a point u 2 Eıı n E. Then by the Separation Theorem 1.4 there exists a vector y such that hy; xi  1; 8x 2 E, while hy; ui > 1. The former inequality (for all x 2 E) means that y 2 Eı I the latter inequality then implies that u … Eıı , a contradiction. (iii) First note that the polar of the ball Br of radius r around 0 is the ball B1=r . 1 . Therefore, if Indeed, the distance from 0 to a hyperplane hy; xi D 1 is kyk 1  r, hence kyk  1=r, i.e., y 2 .Br /ı , i.e., hy; xi  1 8x 2 Br then kyk y 2 B1=r , and conversely. This proves that .Br /ı D B1=r . Now, 0 2 intE if and only if Br  E for some r > 0, hence if and only if Eı  .Br /ı D B1=r , i.e., Eı is bounded. t u Proposition 1.22 If M1 ; M2  Rn are closed convex cones then .M1 C M2 /ı D M1ı \ M2ı I

.M1 \ M2 /ı D M1ı C M2ı :

26

1 Convex Sets

Proof If y 2 .M1 CM2 /ı then for any x 2 M1 ; writing x D xC0 2 M1 CM2 we have hx; yi  0; hence y 2 M1ı I analogously, y 2 M2ı : Therefore .M1 C M2 /ı  M1ı \ M2ı : Conversely, if y 2 M1ı \ M2ı then for all x1 2 M1 ; x2 2 M2 we have hx1 ; yi  0; hx2 ; yi  0; hence hx1 C x2 ; yi  0; implying that y 2 .M1 C M2 /ı : Therefore, the first equality holds. The second equality follows, because by (ii) of the previous proposition, M1ı C M2ı D .M1ı C M2ı /ıı D Œ.M1ı /ı \ .M2ı /ı ı D .M1 \ M2 /ı : t u Given a set C  Rn ; the function x 2 Rn 7! sC .x/ D supy2C hx; yi is called the support function of C: Proposition 1.23 Let C be a closed convex set containing 0. (i) The gauge of C is the support function of Cı : (ii) The recession cone of C and the closure of the cone generated by Cı are polar to each other. (iii) The lineality space of C and the subspace generated by Cı are orthogonally complementary to each other. Dually, also, with C and Cı interchanged. Proof (i) Since .Cı /ı D C by Proposition 1.21, we can write x x 2 Cg D inff > 0j h ; yi  1 8y 2 Cı g   D inff > 0j hx; yi   8y 2 Cı g D sup hx; yi

pC .x/ D inff > 0j

y2Cı

(ii) Since 0 2 C; the recession cone of C is the largest closed convex cone contained in C: Its polar must then be the smallest closed convex cone containing Cı ; i.e., the closure of the convex cone generated by Cı : (iii) Similarly, the lineality space of C is the largest space contained in C: Therefore, its orthogonal complement is the smallest subspace containing Cı : t u Corollary 1.15 For any closed convex set C  Rn containing 0 we have dim Cı D n  linealityC linealityCı D n  dim C: Proof The latter relation follows from (iii) of the previous proposition, while the former is obtained from the latter by noting that .Cı /ı D C: u t The number dim C  linealityC which can be considered as a measure of the nonlinearity of a convex set C is called the rank of C: From (1.19) we have rankC D dim.C \ L? /; where L is the lineality space of C: The previous corollary also shows that for any closed convex set C W rankC D rankCı :

1.9 Polyhedral Convex Sets

27

1.9 Polyhedral Convex Sets A convex set is said to be polyhedral if it is the intersection of a finite family of closed halfspaces. A polyhedral convex set is also called a polyhedron. In other words, a polyhedron is the solution set of a finite system of linear inequalities of the form hai ; xi  bi

i D 1; : : : ; m;

(1.21)

or in matrix form: Ax  b;

(1.22)

where A is the m  n matrix of rows ai and b 2 Rm : Since a linear equality can be expressed as two linear inequalities, a polyhedron is also the solution set of a system of linear equalities and inequalities of the form hai ; xi D bi ;

i D 1; : : : ; m1

hai ; xi  bi ;

i D m1 C 1; : : : ; m:

The rank of a system of linear inequalities of the form (1.22) is by definition the rank of the matrix A: When this rank is equal to the number of inequalities, we say that the inequalities are linearly independent. Proposition 1.24 A polyhedron D ¤ ; defined by the system (1.22) has dimension r if and only if the subsystem of (1.22) formed by the inequalities which are satisfied as equalities by all points of D has rank n  r: Proof Let I0 D fij hai ; xi D bi 8x 2 Dg and M D fxj hai ; xi D bi i 2 I0 g: Since dim M D n  r; and D  M; it suffices to show that M  affD: We can assume that I1 WD f1; : : : ; mg n I0 ¤ ; for otherwise D D M: For each i 2 I1 take xi 2 D such 1 P i i 0 i that ha ; x i < bi and let x D q i2I1 x ; where q D jI1 j: Then clearly x0 2 M;

hai ; x0 i < bi 8i 2 I1 :

(1.23)

Consider now any x 2 M: Using (1.23) it is easy to see that for  > 0 sufficiently small, we have x0 C .x  x0 / 2 M;

hai ; x0 C .x  x0 /i  bi 8i 2 I1 ;

hence y WD x0 C .x  x0 / 2 D: Thus, both points x0 and y belong to D: Hence the line through x0 and y lies in affD; and in particular, x 2 affD: Since x is an arbitrary point of M this means that M  affD; and hence M D affD: t u

28

1 Convex Sets

Corollary 1.16 A polyhedron (1.22) is full-dimensional if and only if there is x0 satisfying hai ; x0 i < bi 8i D 1; : : : ; m: Proof This corresponds to the case r D 0; i.e., I1 D f1; : : : ; mg: Note that if for every i 2 I1 there exists xi 2 D satisfying hai ; xi i < bi then any x0 2 riD (or x0 constructed as in the above proof) satisfies hai ; x0 i < bi 8i 2 I1 : t u

1.9.1 Facets of a Polyhedron As previously, denote by D the polyhedron (1.22) and let I0 D fij hai ; xi D bi 8x 2 Dg: Theorem 1.8 A nonempty subset F of D is a face of D if and only if F D fxj hai ; xi D bi ; i 2 II

hai ; xi  bi ; i … Ig;

(1.24)

for some index set I such that I0  I  f1; : : : ; mg: Proof (i) Any set F of the form (1.24) is a face. Indeed, F is obviously convex. Suppose there is a line segment Œy; z  D with a relative interior point x in F: We have z D .1  /x C y for some  < 0; so if hai ; yi < bi for some i 2 I then hai ; zi D .1  /hai ; xi C hai ; yi > .1  /bi C bi D bi ; a contradiction. Therefore, hai ; yi D bi 8i 2 I; and analogously, hai ; zi D bi 8i 2 I: Hence Œy; z  F; proving that F is a face. (ii) Any face F has the form (1.24). Indeed, by Proposition 1.19 F D Fx0 ; where x0 is any relative interior point of F: Denote by F 0 the set in the right-hand side of (1.24) for I D fij hai ; x0 i D bi g: According to what has just been proved, F 0 is a face, so F 0  F: To prove the converse inclusion, consider any x 2 F 0 and let y D x0 C .x0  x/: Since hai ; x0 i D hai ; xi D bi 8i 2 II

hai ; x0 i < bi 8i … I

one has hai ; yi D bi 8i 2 I; hai ; yi  bi 8i … I provided  > 0 is sufficiently small. Then y 2 F 0 and Œx; y  F 0  D: Since x0 2 F and x0 is a relative interior point of Œx; y; while F is a face, this implies that Œx; y  F; and in particular, x 2 F: Thus, any x 2 F 0 belongs to F: Therefore F D F 0 ; completing the proof. t u A proper face of maximal dimension of a polyhedron is called a facet. By the above proposition, if F is a facet of D then there is a k … I0 such that F D fx 2 Djhak ; xi D bk g:

(1.25)

1.9 Polyhedral Convex Sets

29

However, not every face of this form is a facet. An inequality hak ; xi  bk is said to be redundant if the removal of this inequality from (1.21) does not affect the polyhedron D; i.e., if the system (1.21) is equivalent to hai ; xi  bi ;

i 2 f1; : : : ; mg n fkg:

Proposition 1.25 If the inequality hak ; xi  bk with k … I0 is not redundant then (1.25) is a facet. Proof The non-redundance of the inequality hak ; xi  bk means that there is an y such that hai ; yi D bi ; i 2 I0 ;

hai ; yi  bi ; i … .I0 [ fkg/;

hak ; yi > bk :

Let x0 be a relative interior point of D; i.e., such that hai ; x0 i D bi ; i 2 I0 ;

hai ; x0 i < bi ; i … I0 :

Then the line segment Œx0 ; y meets the hyperplane hak ; xi D bk at a point z satisfying hai ; zi D bi ; i 2 .I0 [ fkg/;

hai ; zi < bi ; i … .I0 [ fkg/:

This shows that F defined by (1.25) is of dimension dim D  1; hence is a facet. u t Thus, if a face (1.25) is not a facet, then the inequality hak ; xi  bk can be removed without changing the polyhedron. In other words, the set of facet-defining inequalities, together with the system hai ; xi D bi ; i 2 I0 ; completely determines the polyhedron.

1.9.2 Vertices and Edges of a Polyhedron An extreme point (0-dimensional face) of a polyhedron is also called a vertex, and a 1-dimensional face is also called an edge. The following characterizations of vertices and edges are immediate consequences of Theorem 1.8: Corollary 1.17 (i) A point x 2 D is a vertex if and only if it satisfies as equalities n linearly independent inequalities from (1.21). (ii) A line segment (or halfline, or line)   D is an edge of D if and only if it is the set of points of D satisfying as equalities n  1 linearly independent inequalities from (1.21). Proof A set F of the form (1.24) is a k-dimensional face if and only if the system hai ; xi D bi ; i 2 I; is of rank n  k: t u

30

1 Convex Sets

A vertex of a polyhedron D  Rn [defined by (1.21)] is said to be nondegenerate if it satisfies as equalities exactly n inequalities (which must then be linearly independent) from (1.21). Two vertices x1 ; x0 are said to be neighboring (or adjacent) if the line segment between them is an edge. Theorem 1.9 Let x0 be a nondegenerate vertex of a full-dimensional polyhedron D defined by a system (1.21). Then there are exactly n edges of D emanating from x0 : If I is the set of indices of inequalities that are satisfied by x0 as equalities, then for each k 2 I there is an edge emanating from x0 whose direction z is defined by the system hak ; zi D 1;

hai ; zi D 0; i 2 I n fkg:

(1.26)

Proof By the preceding corollary a vector z defines an edge emanating from x0 if and only if for all  > 0 small enough x0 C z belongs to D and satisfies exactly n  1 equalities from the system hai ; xi D bi ; i 2 I: Since hai ; x0 i < bi .i … I/; we have hai ; x0 C zi  bi .i … I/ for all sufficiently small  > 0: Therefore, it suffices that for some k 2 I we have hai ; x0 C zi D bi .i 2 I n fkg/; hak ; x0 C zi < bk ; i.e., that z (or a positive multiple of z) satisfies (1.26). Since the latter system is of rank n it fully determines the direction of z: We thus have exactly n edges emanating from x0 : t u A bounded polyhedron is called a polytope. Proposition 1.26 An r-dimensional polytope has at least r C 1 vertices. Proof A polytope is a compact convex set, so by Corollary 1.13 it is the convex hull of its vertex set fx1 ; x2 ; : : : ; xk g: The affine hull of the polytope is then the affine hull of this vertex set, and by Proposition 1.4, if it is of dimension r then the matrix 

is of rank r C 1; hence k  r C 1:

x 1 x2    xk 1 1  1



t u

A polytope which is the convex hull of r C 1 affinely independent points x1 ; x2 ; : : : ; xrC1 is called an r-simplex and is denoted by Œx1 ; x2 ; : : : ; xrC1 : By Proposition 1.17 any vertex of such a polytope must then be one of the points x1 ; x2 ; : : : ; xrC1 ; and since there must be at least r C 1 vertices, these are precisely x1 ; x2 ; : : : ; xrC1 : If D is an r-dimensional polytope then any set of k  r C 1 vertices of D determines an r-simplex which is an r-dimensional face of D: Conversely, any rdimensional face of D is of this form. Proposition 1.27 The recession cone of a polyhedron (1.22) is the cone M WD fxj Ax  0g:

1.9 Polyhedral Convex Sets

31

Proof By definition y 2 recD if and only if for any x 2 D and  > 0 we have x C y 2 D; i.e., A.x C y/ D Ax C Ay  b; i.e., Ay  0: Hence the conclusion. t u Thus, the extreme directions of a polyhedron (1.22) are the same as the extreme directions of the cone Ax  0: Corollary 1.18 A nonempty polyhedron (1.22) has at least a vertex if and only if rankA D n: It is a polytope if and only if the cone Ax  0 is the singleton f0g: Proof By Proposition 1.20 D has a vertex if and only if it contains no line, i.e., the recession cone Ax  0 contains no line, which in turns is equivalent to saying that rankA D n: The second assertion follows from Corollary 1.8 and the preceding proposition. t u

1.9.3 Polar of a Polyhedron Let x0 be a solution of the system (1.21), so that Ax0  b: Then A.x  x0 /  b  Ax0 ; with b  Ax0  0: Therefore, by shifting the origin to x0 and dividing, the system (1.21) can be put into the form hai ; xi  1; i D 1; : : : ; p;

hai ; xi  0; i D p C 1; : : : ; m:

(1.27)

Proposition 1.28 Let P be the polyhedron defined by the system (1.27). Then the polar of P is the polyhedron Q D convf0; a1 ; : : : ; ap g C conefapC1 ; : : : ; am g:

(1.28)

and conversely, the polar of Q is the polyhedron P: Proof If y 2 Q; i.e., yD

m X

i ai ; with

iD1

p X

i  1; i  0; i D 1; : : : ; m

iD1

then hy; xi D

m X

i hai ; xi  1

8x 2 P;

iD1

hence y 2 Pı (polar of P/: Therefore, Q  Pı : To prove the converse inclusion, observe that for any y 2 Qı we must have hy; xi  1 8x 2 Q: In particular, for i  p; taking x D ai yields hai ; yi  1; i D 1; : : : ; pI while for i  p C 1; taking x D ai with arbitrarily large > 0 yields hai ; yi  0; i D p C 1; : : : ; m: Thus, Qı  P and hence, by Proposition 1.21, Pı  .Qı /ı D Q: This proves that Q D Pı : The equality Qı D Pıı D P then follows. t u

32

1 Convex Sets

Note the presence of the vector 0 in (1.28). For instance, if P D fx 2 R2 j x1  1; x2  1g then its polar is Pı D convf0; e1 ; e2 g and not convfe1; e2 g: However, if P is bounded then 0 2 intPı (Proposition 1.21), and 0 can be omitted in (1.28). As an example, it is easily verified that the polar of the unit l1 -ball fxj maxi jxi j  1g D fxj  1  P xi  1; i D 1; : : : ; ng is the set convfei; ei ; i D 1; : : : ; ng; i.e., the unit l1 -ball fxj niD1 jxi j  1g: This fact is often expressed by saying that the l1 -norm and the l1 -norm are dual to each other. Corollary 1.19 If two full-dimensional polyhedrons P; Q containing the origin are polar to each other then there is a 1-1 correspondence between the set of facets of P not containing the origin and the set of nonzero vertices of QI likewise, there is a 1-1 correspondence between the set of facets of P containing the origin and the set of extreme directions of Q: Proof One can suppose that P; Q are defined as in Proposition 1.28. Then an inequality hak ; xi  1; k 2 f1; : : : ; pg; is redundant in (1.27) if and only if the vector ak can be removed from (1.28) without changing Q: Therefore, the equality hak ; xi D 1 defines a facet of P not containing 0 if and only if ak is a vertex of Q: Likewise, for the facets passing through 0 (Fig. 1.4). t u Fig. 1.4 Polyhedrons polar to each other a1

Q

a1 , x = 1

a3

a3 , x = 0 P

a2 0 a2 , x = 0

1.9.4 Representation of Polyhedrons From Theorem 1.8 it follows that a polyhedron has finitely many faces, in particular finitely many extreme points and extreme directions. It turns out that this property completely characterizes polyhedrons in the class of closed convex sets. Theorem 1.10 For every polyhedron D there exist two finite sets V D fv i ; i 2 Ig and U D fuj ; j 2 Jg such that D D convV C coneU:

(1.29)

1.9 Polyhedral Convex Sets

33

In other words, D consists of points of the form xD X

X i2I

i D 1;

i v i C

X

j uj ; with

j2J

i  0; j  0; i 2 I; j 2 J:

i2I

Proof If L is the lineality of D then, according to (1.19): D D L C C;

(1.30)

where C WD D \ L? is a polyhedron containing no line. By Theorem 1.7, C D convV C coneU1 ;

(1.31)

where V is the set of vertices and U1 the set of extreme directions of C: Now let U0 be a basis of L; so that L D conefU0 [ .U0 /g: From (1.30) and (1.31) we conclude D D convV C coneU1 C conefU0 [ .U0 /g; whence (1.29) by setting U D U0 [ .U0 / [ U1 :

t u

Lemma 1.4 The convex hull of a finite set of halflines emanating from the origin is a polyhedral convex cone. In other words, given any finite set of vectors U  Rn ; there exists an m  n matrix A such that coneU D fxj Ax  0g

(1.32)

Proof Let U D fuj ; j 2 Jg; M D conefuj ; j 2 Jg: By Proposition 1.28 M ı D fxj hv jP ; xi  0; j 2 Jg and by Theorem 1.10 there exist a1 ; : : : ; am such that ı i M D fx D m iD1 i a j i  0; i D 1; : : : ; mg: Finally, again by Proposition 1.28, ı ı i .M / D fxj ha ; xi  0; i D 1; : : : ; mg; which yields the desired representation by noting that M ıı D M: t u Theorem 1.11 Given any two finite sets V and U in Rn ; there exist an m  n matrix A and a vector b 2 Rm such that convV C coneU D fxj Ax  bg: Proof Let V D fv i ; i 2 Ig; U D fuj ; j 2 Jg: We can assume V ¤ ;; since the case V D ; has just been treated. In the space RnC1 D Rn  R; consider the set 8 9 < = X X K D .x; t/ D i .v i ; 1/ C j .uj ; 0/j i  0; j  0 : : ; i2I

j2J

34

1 Convex Sets

This set is contained in the halfspace f.x; t/ 2 Rn  Rj t  0g of RnC1 and if we set D D convV C coneU then: x 2 D , .x; 1/ 2 K: By Lemma 1.4, K is a convex polyhedral cone, i.e., there exist .A; b/ such that K D f.x; t/j Ax  tb  0g: Thus, D D fxj completing the proof.

Ax  bg; t u

Corollary 1.20 A closed convex set D which has only finitely many faces is a polyhedron. Proof Indeed, in view of (1.19), we can assume that D contains no line. Then D has a finite set V of extreme points and a finite set U of extreme directions, so by Theorem 1.7 it can be represented in the form convV C coneU; and hence is a polyhedron by the above theorem. t u

1.10 Systems of Convex Sets Given a system of closed convex sets we would like to know whether they have a nonempty intersection. Lemma 1.5 If m closed convex proper sets C1 ; : : : ; Cm of Rn have no common point in a compact convex set C then there exist closed halfspaces Hi  Ci .i D 1; : : : ; m/ with no common point in C: Proof The closed convex set Cm does not intersect the compact set C0 D C \ .\m1 iD1 Ci /; so by Theorem 1.4 there exists a closed halfspace Hm  Cm disjoint with C0 (i.e., such that C1 ; : : : ; Cm1 ; Hm have no common point in C). Repeating this argument for the collection C1 ; : : : ; Cm1 ; Hm we find a closed halfspace Hm1  Cm1 such that C1 ; : : : ; Cm2 ; Hm1 ; Hm have no common point in C; etc. Eventually we obtain the desired closed halfspaces H1 ; : : : ; Hm : t u Theorem 1.12 If the closed convex sets C1 ; : : : ; Cm in Rn have no common point in a convex set C but any m  1 sets of the collection have a common point in C then there exists a point of C not belonging to any Ci ; i D 1; : : : ; m: Proof First, it can be assumed that the set C is compact. In fact, if it were not so, we could pick, for each i D 1; : : : ; m; a point ai common to C and the sets Cj ; j ¤ i: Then the set C0 D convfa1 ; : : : ; am g is compact and the sets C1 ; : : : ; Cm have no common point in C0 (because C0  C/I moreover, any m  1 sets among these have a common point in C0 (which is one of the points a1 ; : : : ; am ). Consequently, without any harm we can replace C with C0 :

1.10 Systems of Convex Sets

35

Second, each Ci ; i D 1; : : : ; m; must be a proper subset of Rn because if some Ci0 D Rn then a common point ai0 of Ci ; i ¤ i0 ; in C would be a common point of C1 ; : : : ; Cm ; in C: Therefore, by Lemma 1.5 there exist closed halfspaces Hi  Ci ; i D 1; : : : ; m; such that H1 ; : : : ; Hm have no common point in C: Let Hi D fxj hhi .x/  0g; where hi .x/ is an affine function. We contend that for every set I  f1; : : : ; mg and every q … I there exists xqI 2 C satisfying 8 < > 0 if i D q hi .xqI / D 0 if i 2 I :  0 if i ¤ q

.1/ .2/ .3/

In fact, for I D ; this is clear: since Hi  Ci ; i D 1; : : : ; m; any m  1 sets among the Hi ; i D 1; : : : ; m; have a common point in C: Supposing this is true for j I j D s consider the case j I j D sC1: For p 2 I; J D Infpg by the induction hypothesis there exist xpJ and xqJ : The convex set fx 2 Cj hhi .x/ D 0 .i 2 J/; hi .x/  0 i … fp; qgg contains two points xpJ and xqJ satisfying hp .xpJ / > 0 and hp .xqJ /  0; hence it must contain a point xqI satisfying hp .xqI / D 0: Clearly xqI satisfies (2) and (3), and hence, also (1) since the Hi have no common point in C: For each q denote by xq the point xqI associated with I D f1; : : : ; mg n fqg: P q It is easily seen that the point x WD m1 m x 2 C satisfies 8i hi .x / D qD1 P m 1 1 q i  qD1 hi .x / D m hi .x / > 0: This means x does not belong to any Hi ; hence m neither to any Ci : t u Corollary 1.21 Let C1 ; C2 ; : : : ; Cm be closed convex sets whose union is a convex set C: If any m1 sets of the collection have a nonempty intersection, then the entire collection has a nonempty intersection. Proof If the entire collection had an empty intersection then by Theorem 1.12 there would exist a point of C not belonging to any Ci ; a contradiction. t u Theorem 1.13 (Helly) Let C1 ; : : : ; Cm be m > n C 1 closed convex subsets of Rn : If any n C 1 sets of the collection have a nonempty intersection then the entire collection has a nonempty intersection. Proof Suppose m D n C 2; so that m  1 D n C 1: For each i take an ai 2 \j¤i Cj : Since m  1 > n the vectors ai  a1 ; i D 2; : : : ; m must be linearly dependent, i.e., there exist numbers i ;P i D 1; : : : ; m; P not all zero, Pm m m i 1 i such that Setting  D   yields 1 i iD2 i .a a /D0: iD2 iD1 i a D0. P P With J D fij i  0; ˛ D i2J i D  i…J i > 0 the latter equality can be written as 1X 1X i a i D .i /ai : ˛ i2J ˛ i…J

36

1 Convex Sets

P i i For i 2 J we have ai 2 \j…J Cj ; so x D i2J ˛ a 2 \j…J Cj I for i … J we have P i i m i a 2 \j2J Cj ; so x D i…J ˛ a 2 \j2J Ci : Therefore, x 2 \jD1 Cj ; proving the theorem for m D n C 2: The general case follows by induction on m: t u

1.11 Exercises 1 Show that any affine set in Rn is the intersection of a finite collection of hyperplanes. 2 Let A be an mn matrix. The kernel (null-space) of A is by definition the subspace L D fx 2 Rn j Ax D 0g: The orthogonal complement to L is the set L? D fx 2 Rn j hx; yi D 0 8y 2 Lg: Show that L? is a subspace and that L? D fx 2 Rn j x D AT u; u 2 Rm g: 3 Let M be a convex cone containing 0. Show that the affine hull of M is the set M  M D fx  yj x 2 M; y 2 Mg and that the largest subspace contained in M is the set M \ M: 4 Let D be a nonempty polyhedron defined by a system of linear inequalities Ax  b where rankA D r < n: Show that the lineality of D is the subspace E D fxj Ax D 0g; of dimension n  r; and that D D E C D0 where D0 is a polyhedron having at least one extreme point. 5 Let ' W Rn ! Rm be a linear mapping. If D is a (bounded) polyhedron in Rn then '.D/ is a (bounded) polyhedron in Rm : (Use Theorems 1.10 and 1.11.) 6 If C and D are polyhedrons in Rn then C C D is also a polyhedron. (C C D is the image of E D f.x; y/j x 2 C; y 2 Dg under the linear mapping .x; y/ 7! x C y from R2n to Rn :/ 7 Let fCi ; i 2 Ig be an arbitrary collection of nonempty convex sets in Rn : Then a vector x 2 Rn belongs to the convex hull of the union of the collection if and only if x is the convex combination of a finite number of vectors yi ; i 2 J such that yi 2 Ci and J  I: 8 Let C1 ; C2 be two disjoint convex sets in Rn ; and a 2 Rn : Then at least one of the two sets C2 \ conv.C1 [ fag/; C1 \ conv.C2 [ fag/ is empty. (Argue by contradiction.) 9 Let C1 ; : : : ; Cm be open convex proper subsets of Rn with no common point in a convex proper subset D of Rn : Then there exist open halfspaces Li  Ci ; i D 1; : : : ; m; and a closed halfspace H  D such that L1 ; : : : ; Lm have no common point in H: 10 Let a1 ; : : : ; am be a set of m vectors in Rn : Show that the convex hull of this set contains 0 in its interior if and only if either of the following conditions holds:

1.11 Exercises

37

1. For any nonzero vector x 2 Rn one has maxfhai ; xij i D 1; : : : ; mg > 0: 2. Any polyhedron hai ; xi  ˛i ; i D 1; : : : ; m is bounded. 11 Let C be a closed convex set and let y0 … C: If x0 2 C is the point of C nearest to y0 ; i.e., such that kx0  y0 k D min kx  y0 k; then kx  x0 k < kx  y0 k 8x 2 C: x2C

(Hint: From the proof of Proposition 1.15 it can be seen that x0 is characterized by the condition x0  y0 2 NC .x0 /; i.e., hx0  y0 ; x  x0 i  0 8x 2 C:/ 12 (Fejer) For any set E in Rn ; a point y0 belongs to the closure of convE if and only if for every y 2 Rn there is x 2 E such that kxy0 k  kxyk: (Use Exercise 10 to prove the “only if” part.) 13 The radius of a compact set S is defined to be the number rad.S/ D minx maxy2S kx  yk: Show that for any x 2 convS there exists y 2 S such that kx  yk  rad.S/: 14 Show by a counter-example that Theorem 1.7 (Sect. 1.7) is not true even for polyhedrons if the condition that C be line-free is omitted. 15 Show by a counter-example in R3 that Corollary 1.19 is not true if P is not full dimensional.

Chapter 2

Convex Functions

2.1 Definition and Basic Properties Given a function f W S ! Œ1; C1 on a nonempty set S  Rn ; the sets domf D fx 2 Sj f .x/ < C1g epif D f.x; ˛/ 2 S  Rj f .x/  ˛g are called the effective domain and the epigraph of f .x/; respectively. If domf ¤ ; and f .x/ > 1 for all x 2 S, then we say that the function f .x/ is proper. A function f W S ! Œ1; C1 is called convex if its epigraph is a convex set in Rn R: This is equivalent to saying that S is a convex set in Rn and for any x1 ; x2 2 S and  2 Œ0; 1, we have f ..1  /x1 C x2 /  .1  /f .x1 / C f .x2 /

(2.1)

whenever the right-hand side is defined. In other words (2.1) must always hold unless f .x1 / D f .x2 / D ˙1: By induction it can be proved that if f .x/ is convex then for any finite set x1 ; : : : ; xk 2 S and any nonnegative numbers 1 ; : : : ; k summing up to 1, we have f

k X iD1

! i x i



k X

i f .xi /

iD1

whenever the right-hand side is defined. A function f .x/ is said to be concave on S if f .x/ is convex; affine on S if f .x/ is finite and both convex and concave. An affine function on Rn has the form f .x/ D ha; xi C ˛; with a 2 Rn ; ˛ 2 R; because its epigraph is a halfspace in Rn  R containing no vertical line.

© Springer International Publishing AG 2016 H. Tuy, Convex Analysis and Global Optimization, Springer Optimization and Its Applications 110, DOI 10.1007/978-3-319-31484-6_2

39

40

2 Convex Functions

For a given nonempty convex set C  Rn we can define the convex functions: 0 if x 2 C • the indicator function of C W ıC .x/ D C1 if x … C • the support function (see Sect. 1.8) sC .x/ D supy2C hy; xi • the distance function dC .x/ D infy2C kx  yk: The convexity of ıC .x/ is obvious; that of the two other functions can be verified directly or derived from Propositions 2.5 and 2.9 below. Proposition 2.1 If f .x/ is an improper convex function on Rn then f .x/ D 1 at every relative interior point x of its effective domain. Proof By the definition of an improper convex function, f .x0 / D 1 for at least some x0 2 domf (unless domf D ;/: If x 2 ri.domf / then there is a point x0 2 domf such that x is a relative interior point of the line segment Œx0 ; x0 : Since f .x0 / < C1; it follows from x D x0 C .1  /x0 with  2 .0; 1/; that f .x/  f .x0 / C .1  / f .x0 / D 1: t u From the definition it is straightforward that a function f .x/ on Rn is convex if and only if its restriction to every straight line in Rn is convex. Therefore, convex functions on Rn can be characterized via properties of convex functions on the real line. Theorem 2.1 A real-valued function f .x/ on an open interval .a; b/  R is convex if and only if it is continuous and possesses at every x 2 .a; b/ finite left and right derivatives f0 .x/ D lim t"0

f .x C t/  f .x/ ; t

fC0 .x/ D lim t#0

f .x C t/  f .x/ t

such that fC0 .x/ is nondecreasing and f0 .x/  fC0 .x/;

fC0 .x1 /  f0 .x2 / for x1 < x2 :

(2.2)

Proof (i) Let f .x/ be convex. If 0 < s < t and x C t < b then the point .x C s; f .x C s// is below the segment joining .x; f .x// and .x C t; f .x C t//; so f .x C t/  f .x/ f .x C s/  f .x/  : s t

(2.3)

This shows that the function t 7! Œf .x C t/  f .x/=t is nonincreasing as t # 0: Hence it has a limit fC0 .x/ (finite or D 1/: Analogously, f0 .x/ exists (finite or D C1/: Furthermore, setting y D x C s; t D s C r, we also have f .x C s/  f .x/ f .y C r/  f .y/  ; s r

(2.4)

2.1 Definition and Basic Properties

41

which implies fC0 .x/  fC0 .y/ for x < y; i.e., fC0 .x/ is nondecreasing. Finally, writing (2.4) as f .y  s/  f .y/ f .y C r/  f .y/  : s r and letting s " 0; r # 0 yields f0 .y/  fC0 .y/; proving the left part of (2.2) and at the same time the finiteness of these derivatives. The continuity of f .x/ at every x 2 .a; b/ then follows from the existence of finite f0 .x/ and fC0 .x/: Furthermore, setting x D x1 ; y C r D x2 in (2.4) and letting s; r ! 0 yields the right part of (2.2). (ii) Now suppose that f .x/ has all the properties mentioned in the Proposition and let a < c < d < b: Consider the function g.x/ D f .x/  f .c/  .x  c/

f .d/  f .c/ : dc

Since for any x D .1  /c C d, we have g.x/ D f .x/  f .c/  Œf .d/  f .c/ D f .x/  Œ.1  /f .c/ C f .d/; to prove the convexity of f .x/ it suffices to show that g.x/  0 for any x 2 Œc; d: Suppose the contrary, that the maximum of g.x/ over the segment Œc; d is positive (this maximum exists because f .x/ is continuous). Let e 2 Œc; d be the point where this maximum is attained. Note that g.c/ D g.d/ D 0; (hence c < e < d/ and from its expression, g.x/ has the same properties as f .x/; namely: g0 .x/; g0C .x/ exist at every x 2 .c; d/; g0 .x/  g0C .x/; g0C .x/ is nondecreasing and g0C .x1 /  g0 .x2 / for x1  x2 : Since g.e/  g.x/ 8x 2 Œc; d, we must have g0 .e/  0  g0C .e/; consequently g0 .e/ D g0C .e/ D 0; and hence, since g0C .x/ is nondecreasing, g0C .x/  0 8x 2 Œe; d: If g0 .y/  0 for some y 2 .e; d then g0C .x/  g0 .y/  0 hence g0 .x/ D 0 for all x 2 Œe; y/; from which it follows that g.y/ D g.e/ > 0: Since g.d/ D 0; there must exist y 2 .e; d/ with g0 .y/ > 0: Let x1 2 Œy; d/ be the point where g.x/ attains its maximum over the segment Œy; d: Then g0C .x1 /  0; contradicting g0C .y/  g0 .y/ > 0: Therefore g.x/  0 for all x 2 Œc; d; as was to be proved. t u Corollary 2.1 A differentiable real-valued function f .x/ on an open interval is convex if and only if its derivative f 0 is a nondecreasing function. A twice differentiable real-valued function f .x/ on an open interval is convex if and only if its second derivative f 00 is nonnegative throughout this interval. t u Proposition 2.2 A twice differentiable real-valued function f .x/ on an open convex set C in Rn is convex if and only if for every x 2 C its Hessian matrix Qx D .qij .x//;

qij .x/ D

@2 f .x1 ; : : : ; xn / @xi @xj

42

2 Convex Functions

is positive semidefinite, i.e., hu; Qx ui  0 8u 2 Rn : Proof The function f is convex on C if and only if for each a 2 C and u 2 Rn the function 'a;u .t/ D f .a C tu/ is convex on the open real interval ftj a C tu 2 Cg: The proposition then follows from the preceding corollary since an easy computation yields ' 00 .t/ D hu; Qx ui with x D a C tu: t u In particular, a quadratic function f .x/ D

1 hx; Qxi C hx; ai C ˛; 2

where Q is a symmetric n  n matrix, is convex on Rn if and only if Q is positive semidefinite. It is concave on Rn if and only if its matrix Q is negative semidefinite. Proposition 2.3 A proper convex function f on Rn is continuous at every interior point of its effective domain. Proof Let x0 2 int.domf /: Without loss of generality one can assume x0 D 0: By Theorem 2.1, for each i D 1; : : : ; n the restriction of f to the open interval ftj x0 C tei 2 int.domf /g is continuous relative to this interval. Hence for any given " > 0 and for each i D 1; : : : ; n, we can select ıi > 0 so small that jf .x/  f .x0 /j  " for all x 2 Œıi ei ; Cıi ei : Let ı D minfıi j i D 1; : : : ; ng and B D fxj kxk1  ıg: Denote ui D ıei ; uiCn D ıei ; i D 1; P : : : ; n: Then, as seen P2n in the proof of Corollary 1.6, i any x 2 B is of the form x D 2n  u ; with 1; 0  i  1; hence iD1 i iD1 i DP P2n 2n i 0 i 0 f .x/  iD1 i f .u /; and consequently, f .x/  f .x /  iD1 i Œf .x /  f .x /: Therefore, jf .x/  f .x0 /j 

2n X

i jf .x/  f .x0 /j  "

iD1

for all x 2 B; proving the continuity of f .x/ at x0 :

t u

Proposition 2.4 Let f be a real-valued function on a convex set C  Rn : If for every x 2 C there exists a convex open neighborhood Ux of x such that f is convex on Ux \ C then f is convex on C: Proof It suffices to show that for every a 2 C; u 2 Rn ; the function '.t/ D f .a C tu/ is convex on the interval  WD ftj a C tu 2 Cg: But from the hypothesis, this function is convex in a neighborhood of every t 2 ; hence is continuous and has 0 left and right derivatives '0 .t/  'C .t/ which are non decreasing in a neighborhood of every t 2 . These derivatives thus exist and satisfy the conditions described in Theorem 2.1 on the whole interval : Hence, '.t/ is convex. t u

2.2 Operations That Preserve Convexity

43

Fig. 2.1 Convex piecewise affine function

For example, let f .x/ W R ! R be a piecewise convex function on the real line, i.e., a function such that the real line can be partitioned into a finite number of intervals i ; i D 1; : : : ; N; in each of which f .x/ is convex. Then f .x/ is convex if and only if it is convex in the neighborhood of each breakpoint (endpoint of some interval i /: In particular, a piecewise affine function on the real line is convex if and only if at each breakpoint the left slope is at most equal to the right slope (in other words, the sequence of slopes is nondecreasing) (see Fig. 2.1).

2.2 Operations That Preserve Convexity A function with a complicated expression may be built up from a number of simpler ingredient functions via certain standard operations. The convexity of such a function can often be established indirectly, by proving that the ingredient functions are known convex functions, whereas the operations involved in the composition of the ingredient functions preserve convexity. It is therefore useful to be familiar with some of the most important operations which preserve convexity. Proposition 2.5 A positive combination of finitely many proper convex functions on Rn is convex. The upper envelope (pointwise supremum) of an arbitrary family of convex functions is convex. Proof If f .x/ is convex and ˛  0, then ˛f .x/ is obviously convex. If f1 and f2 are proper convex functions on Rn ; then it is also evident that f1 C f2 is convex. This proves the first part of the proposition. The second part follows from the facts that if f .x/ D supffi .x/j i 2 Ig; then epif D \i2I epifi ; and the intersection of a family of convex sets is a convex set. t u Proposition 2.6 Let ˝ be a convex set in Rn , G a convex set in Rm ; '.x; y/ a realvalued convex function on ˝  G: Then the function f .x/ D inf '.x; y/ y2G

is convex on ˝.

44

2 Convex Functions

Proof Let x1 ; x2 2 ˝ and x D x1 C .1  /x2 with  2 Œ0; 1: For each i D 1; 2 select a sequence fyi;k g  G such that '.xi ; yi;k / ! inf '.xi ; y/: y2G

By convexity of '; f .x/  '.x; y1;k C .1  /y2;k /  '.x1 ; y1;k / C .1  /'.x2 ; y2;k /; hence, letting k ! 1 yields f .x/  f .x1 / C .1  /f .x2 /: t u Proposition 2.7 If gi .x/; i D 1; : : : ; m; are concave positive functions on a convex set C  Rn then their geometric mean f .x/ D

"m Y

#1=m gi .x/

(2.5)

iD1

is a concave function (so f .x/ is a convex function) on C: Qm Proof Let T D ft 2 Rm Cj iD1 ti  1g: We show that for any fixed x 2 C W "

m Y

#1=m gi .x/

iD1

( m ) X 1 D min ti gi .x/ : m t2T iD1

(2.6)

Indeed, observe that the right-hand side of (2.6) is equal to ( m ) m X Y 1 min ti gi .x/j ti D 1; ti > 0; i D 1; : : : ; m m iD1 iD1 Qm 0 Q 0 since if m iD1 ti > 1 then by replacing Pm ti with ti  ti such that iD1 ti D 1; we Qm can only decrease the value of iD1 ti gi .x/: Therefore, it can be assumed that iD1 ti D 1 and hence m Y iD1

ti gi .x/ D

m Y

gi .x/:

(2.7)

iD1

Since ti gi .x/ > 0; i D 1; Q:m: : ; m; and for fixed x the product of these positive numbers is constant [D iD1 gi .x/ by (2.7)] their sum is minimal when these

2.2 Operations That Preserve Convexity

45

numbers are equal (by theorem on arithmetic and geometric mean). That is, Pm taking account of (2.7), the minimum of t g i i .x/ is achieved when ti gi .x/ D iD1 Q 1=m Πm 8iPD 1; : : : ; m; hence (2.6). Since for fixed t 2 T the function iD1 gi .x/ x 7! 't .x/ WD m1 m t g .x/ is concave by Proposition 2.5, their lower envelope Q iD1 i i1=m g .x/ is concave by the same proposition. t u inft2T 't .x/ D Πm i iD1 An interesting concave function of the class (2.5) is the geometric mean: f .x/ D

.x1 x2    xn /1=n if x1  0; : : : ; xn  0 1 otherwise

which corresponds to the case when gi .x/ D xi : Proposition 2.8 Let g.x/ W Rn ! .1; C1/ be a convex function and let '.t/ W R ! .1; C1/ be a nondecreasing convex function. Then f .x/ D '.g.x// is convex on Rn : Proof The proof is straightforward. For any x1 ; x2 2 Rn and  2 Œ0; 1 we have g..1  /x1 C x2 /  .1  /g.x1 / C g.x2 / hence '.g..1  /x1 C x2 //  .1  /'.g.x1 // C '.g.x2 //: t u

Pm

gi .x/

is convex if ci > 0 For example, by this proposition the function f .x/ D iD1 ci e and each gi .x/ is convex proper. Given the epigraph E of a convex function f .x/, one can restore f .x/ by the formula f .x/ D infftj .x; t/ 2 Eg:

(2.8)

Conversely, given a convex set E  RnC1 the function f .x/ defined by (2.8) is a convex function on Rn by Proposition 2.6. Therefore, if f1 ; : : : ; fm are m given convex functions, and E  RnC1 is a convex set resulting from some operation on their epigraphs E1 ; : : : ; Em ; then one can use (2.8) to define a corresponding new convex function f .x/: Proposition 2.9 Let f1 ; : : : ; fm be proper convex functions on Rn : Then ( f .x/ D inf

m X iD1

is a convex function on Rn :

fi .x /j x 2 R ; i

i

n

m X iD1

) i

x Dx

46

2 Convex Functions

Proof Indeed, f .x/ is defined by (2.8), where E D E1 C    C Em and Ei D epifi , i D 1; : : : ; m: t u The above constructed function f .x/ is called the infimal convolution of the functions f1 ; : : : ; fm : For example, the convexity of the distance function dC .x/ D inffkx  yk j y 2 Cg associated with a convex set C follows from the above proposition because dC .x/ D infy fkxykCıC .y/g D inffkx1 kCıC .x2 /j x1 Cx2 D xg: Let g.x/ now be a nonconvex function, so that its epigraph is a nonconvex set. The relation (2.8) where E D conv.epig/ defines a function f .x/ called the convex envelope or convex hull of g.x/ and denoted by convg: Since E is the smallest convex set containing the epigraph of g it is easily seen that convg is the largest convex function majorized by g: When C is a subset of Rn ; the convex envelope of the function gjC D

g.x/ if x 2 C C1 if x … C

is called the convex envelope of g over C: Proposition 2.10 Let g W Rn ! R: The convex envelope of g over a set C  Rn such that dim(aff C/ D k is given by f .x/ D inf

( kC1 X

i g.x /j x 2 C; i  0; i

i

kC1 X

iD1

i D 1;

kC1 X

iD1

) i x D x : i

iD1

Proof Let X D C  f0g SRn  R; B D f.0; t/j 0  t  1g  Rn  R and define E D f.x; g.x//j x 2 Cg D .x;0/2X ..x; 0/ C g.x/B/: Then X is a Caratheodory core of E (cf Sect. 1.4) and since dim(aff X/ D k; by Proposition 1.14, we have ( convE D .x; t/ D

kC1 X

i .x ; g.x //j x 2 C; i  0; i

i

i

iD1

kC1 X

) i D 1 :

iD1

But clearly for every .x; t/ 2 epig there is  t such that .x; / 2 E; so for every .x; t/ 2 conv.epig/ there is  t such that .x; / 2 convE: Therefore, (convg/.x/ D P i infftj .x; t/ 2 conv.epig/g D infftj .x; t/ 2 convEg D inff kC1  g.x /j xi 2 C; iD1 i kC1 kC1 X X i  0; i xi D x; i D 1g: t u iD1

iD1

Corollary 2.2 The convex envelope of a concave function g W Rn ! R over a polytope D in Rn with vertex set V is the function f .x/ D min

( nC1 X iD1

i g.v /j v 2 V; i  0; i

i

nC1 X iD1

i D 1;

nC1 X iD1

) i v D x : i

2.3 Lower Semi-Continuity

47

Proof By the above proposition, (convg/.x/  f .x/ but any x 2 D is of the form nC1 X P P i i i x D nC1  v ; with v 2 V;   0; i D 1; hence f .x/  nC1 i iD1 i iD1 i g.v / (by iD1 P i  g.v /  g.x/ by the concavity of g: Therefore, definition of f .x//; while nC1 iD1 i f .x/  g.x/ 8x 2 D; and since f .x/ is convex, we must have f .x/  .convg/.x/; and hence f .x/ D .convg/.x/: t u

2.3 Lower Semi-Continuity Given a function f W Rn ! Œ1; C1 the sets fxj f .x/  ˛g;

fxj f .x/  ˛g

where ˛ 2 Œ1; C1 are called lower and upper level sets, respectively, of f : Proposition 2.11 The lower (upper, resp.) level sets of a convex (concave, resp.) function f .x/ are convex. Proof This property is equivalent to f ..1  /x1 C x2 /  maxff .x1 /; f .x2 /g

8 2 .0; 1/

(2.9)

for all x1 ; x2 2 Rn ; which is an immediate consequence of the definition of convex functions. t u Note that the converse of this proposition is not true. For example, a real-valued function on the real line which is nondecreasing has all its lower level sets convex, but may not be convex. A function f .x/ whose every nonempty lower level set is convex (or, equivalently, which satisfies (2.9) for all x1 ; x2 2 Rn /; is said to be quasiconvex. If every nonempty upper level set is convex, f .x/ is said to be quasiconcave. Proposition 2.12 For any proper convex function f W (i) The maximum of f over any line segment is attained at one endpoint. (ii) If f .x/ is finite and bounded above on a halfline, then its maximum over the halfline is attained at the origin of the halfline. (iii) If f .x/ is finite and bounded above on an affine set then it is constant on this set. Proof (i) Immediate from Proposition 2.11. (ii) If f .b/ > f .a/ then for any x D b C .b  a/ with   0; we have 1  b D 1C xC 1C a; hence .1 C/f .b/  f .x/Cf .a/; (whenever f .x/ < C1/;

48

2 Convex Functions

i.e., f .x/  Œf .b/  f .a/ C f .b/; which implies f .x/ ! C1 as  ! C1: Therefore, if f .x/ is finite and bounded above on a halfline of origin a; one must have f .b/  f .a/ for every b on this halfline. (iii) Let M be an affine set on which f .x/ is finite. If f .b/ > f .a/ for a; b 2 M; then by (ii), f .x/ is unbounded above on the halfline in M from a through b: Therefore, if f .x/ is bounded above on M; it must be constant on M: t u A function f from a set S  Rn to Œ1; C1 is said to be lower semi-continuous (l.s.c.) at a point x 2 S if lim inf f .y/  f .x/: y2S

y!x

It is said to be upper semi-continuous (u.s.c.) at x 2 S if lim sup f .y/  f .x/: y2S

y!x

A function which is both lower and upper semi-continuous at x is continuous at x in the ordinary sense. Proposition 2.13 Let S be a closed set in Rn : For an arbitrary function f W S ! Œ1; C1 the following conditions are equivalent: (i) The epigraph of f is a closed set in RnC1 I (ii) For every ˛ 2 R the set fx 2 Sj f .x/  ˛g is closed; (iii) f is lower semi-continuous throughout S: Proof (i)) (ii). Let x 2 S; x ! x; f .x /  ˛: Since .x ; ˛/ 2 epif ; it follows from the closedness of epif that .x; ˛/ 2 epif ; i.e., f .x/  ˛; proving (ii). (ii))(iii). Let x 2 S; x ! x: If lim!1 f .x / < f .x/ then there is ˛ < f .x/ such that f .x /  ˛ for all sufficiently large : From (ii) it would then follow that f .x/  ˛; a contradiction. Therefore lim!1 f .x /  f .x/; proving (iii). (iii)) (i). Let .x ; t / 2 epif ; (i.e., f .x /  t / and .x ; t / ! .x; t/: Then from (iii), we have liminff .x /  f .x/; hence t  f .x/; i.e., .x; t/ 2 epif : t u Proposition 2.14 Let f be a l.s.c. proper convex function. Then all the nonempty lower level sets fxj f .x/  ˛g; ˛ 2 R; have the same recession cone and the same lineality space. The recession cone is made up of 0 and the directions of halflines over which f is bounded above, while the lineality space is the space parallel to the affine set on which f is constant. Proof By Proposition 2.13 every lower level set C˛ WD fxj f .x/  ˛g is closed. Let  D fuj   0g: If f is bounded above on a halfline a D a C ; then f .a/ 2 R (because f is proper) and by Proposition 2.12 f .x/  f .a/ 8x 2 a : For any nonempty C˛ ; ˛ 2 R; consider a point b 2 C˛ and let ˇ D maxff .a/; ˛g so that

2.3 Lower Semi-Continuity

49

ˇ 2 R and a  Cˇ : Since Cˇ is closed and b 2 Cˇ it follows from Lemma 1.1 that b  Cˇ ; i.e., f .x/ is finite and bounded above on b : Then by Proposition 2.12, f .x/  f .b/  ˛ 8x 2 b ; hence b  C˛ : Thus, if f is bounded above on a halfline a then  is a direction of recession for every nonempty C˛ ; ˛ 2 R: The converse is obvious. Therefore, the recession cone of C˛ is the same for all ˛ and is made up of 0 and all directions of halflines over which f is bounded above. The rest of the proposition is straightforward. t u The recession cone and the lineality space common to each lower level set of f are also called the recession cone and the constancy space of f ; respectively. Corollary 2.3 If the lower level set fxj f .x/  ˛g of a l.s.c. proper convex function f is nonempty and bounded for one ˛ then it is bounded for every ˛: Proof Any lower level set of f is a closed convex set. Therefore, it is bounded if and only if its recession cone is the singleton f0g (Corollary 1.8). t u Corollary 2.4 If a l.s.c. proper convex function f is bounded above on a halfline then it is bounded above on every parallel halfline emanating from a point of domf : If it is constant on a line then it is constant on every parallel line passing through a point of domf : t u

Proof Immediate.

Proposition 2.15 Let f be any proper convex function on Rn : For any y 2 Rn ; there exists t 2 R such that .y; t/ belongs to the lineality space of epif if and only if f .x C y/ D f .x/ C t

8x 2 domf ; 8 2 R:

(2.10)

When f is l.s.c., this condition is satisfied provided for some x 2 domf the function  7! f .x C y/ is affine. Proof .y; t/ belongs to the lineality space of epif if and only if for any x 2 domf W .x; f .x// C .y; t/ 2 epif 8 2 R; i.e., if and only if f .x C y/  t  f .x/

8 2 R:

By Proposition 2.12 applied to the proper convex function './ D f .xCt/t; this is equivalent to saying that './ D constant, i.e., f .x C y/  t D f .x/ 8 2 R: This proves the first part of the proposition. If f is l.s.c., i.e., epif is closed, then .y; t/ belongs to the lineality space of epif provided for some x 2 domf the line f.x; f .x// C .y; t/j  2 Rg is contained in epif (Lemma 1.1). t u The projection of the lineality space of epif on Rn ; i.e., the set of vectors y for which there exists t such that .y; t/ belongs to the lineality space of epif ; is called the lineality space of f : The directions of these vectors y are called directions in which f is affine. The dimension of the lineality space of f is called the lineality of f :

50

2 Convex Functions

By definition, the dimension of a convex function f is the dimension of its domain. The number dimf  linealityf which is a measure of the nonlinearity of f is then called the rank of f W rankf D dimf  linealityf :

(2.11)

Corollary 2.5 If a proper convex function f of full dimension on Rn has rank k then there exists a k  n matrix B with rankB D k such that for any b 2 B.domf / the restriction of f to the affine set Bx D b is affine. Proof Let L be the lineality space of f : Since dim L D n  k; there exists a k  n matrix B of rank k such that L D fu 2 Rn j Bu D 0g: If b D Bx0 for some x0 2 domf then for any x such that Bx D b, have B.x  x0 / D 0; hence, denoting by Pwe h 1 h 0 u ; : : : ; u a basis of L; x D x C iD1 i ui : By Proposition 2.15 there exist ti 2 R P P such that f .x0 C hiD1 i ui / D f .x0 / C hiD1 i ti for all  2 Rh : Therefore, f .x/ is affine on the affine set Bx D b: t u Given any proper function f on Rn ; the function whose epigraph is the closure of epif is the largest l.s.c. minorant of f : It is called the l.s.c. hull of f or the closure of f , and is denoted by clf : Thus, for a proper function f ; epi.clf / D cl.epif /:

(2.12)

A proper convex function f is said to be closed if clf D f (so for proper convex functions, closed is synonym of l.s.c.). An improper convex function is said to be closed only if f C1 or f 1 (so if f .x/ D 1 for some x then its closure is the constant function 1/: Proposition 2.16 The closure of a proper convex function f is a proper convex function which agrees with f except perhaps at the relative boundary points of domf : Proof Since epif is convex, cl(epif /, i.e., epi(clf /, is also convex (Proposition 1.10). Hence by Proposition 2.13, clf is a closed convex function. Now the condition (2.12) is equivalent to clf .x/ D lim inf f .y/ y!x

8x 2 Rn :

(2.13)

If x 2 ri.domf / then by Proposition 2.3 f .x/ D lim f .y/; hence, by (2.13), f .x/ D y!x

clf .x/: Furthermore, if x … cl.domf / then f .y/ D C1 for all y in a neighborhood of x and the same formula (2.13) shows that clf .x/ D C1: Thus the second half of the proposition is true. It remains to prove that clf is proper. For every x 2 ri.domf /; since f is proper, 1 < clf .x/ D f .x/ < C1: On the other hand, if clf .x/ D 1 at some relative boundary  point x of domf D dom.clf / then for an arbitrary y 2 ri.domf /, we have clf xCy  12 clf .x/ C 12 clf .y/ D 1: Noting that xCy 2 2 2 ri.domf / this contradicts what has just been proved and thereby completes the proof. t u

2.4 Lipschitz Continuity

51

2.4 Lipschitz Continuity By Proposition 2.3 a convex function is continuous relative to the relative interior of its domain. In this section we present further continuity properties of convex functions. Proposition 2.17 Let f be a convex function on Rn and D a polyhedron contained in domf : Then f is u.s.c. relative to D; so that if f is l.s.c. f is actually continuous relative to D: Proof Consider any x 2 D: By translating if necessary, we can assume that x D 0: Let e1 ; : : : ; en be the unit vectors of Rn and C the convex hull of the set fe1 ; : : : ; en ; e1 ; : : : ; en g: This set C is partitioned by the coordinate hyperplanes into simplices Si ; i D 1; : : : ; 2n ; with a common vertex at 0. Since D is a polyhedron, n each Di D Si \D is a polytope and obviously D\C D [2iD1 Di : Now let fxk g  D\C be any sequence such that xk ! 0; f .xk / ! : Then at least one Di ; say D1 , contains an infinite subsequence of fxk g: For convenience we also denote this subsequence k by fxk g: If V1 is the set of vertices of D1 other than each P 0, then Px is ak convex k k combination of 0 and elements of V1 W x D .1  v2V1 v /0 C v2V1 v v; with P kv  0; v2V1 kv  1: By convexity of f we can write f .xk /  .1 

X v2V1

kv /f .0/ C

X

kv f .v/:

v2V1

As k ! C1; since xk ! 0; it follows that kv ! 0 8v; hence  f .0/; i.e., lim sup f .xk /  f .0/: This proves the upper semi-continuity of f at 0 relative to D:

t u

Theorem 2.2 For a proper convex function f on Rn the following assertions are equivalent: (i) (ii) (iii) (iv)

f is continuous at some point; f is bounded above on some open set; int.epif / ¤ ;I int.domf / ¤ ; and f is Lipschitzian on every bounded set contained in int.domf /I (v) int.domf / ¤ ; and f is continuous there.

Proof (i) ) (ii) If f is continuous at a point x0 , then there exists an open neighborhood U of x0 such that f .x/ < f .x0 / C 1 for all x 2 U: (ii) ) (iii) If f .x/  c for all x in an open set U, then U  Œc; C1/  epif ; hence int.epif / ¤ ;: (iii) ) (iv) If int(epif / ¤ ;, then there exists an open set U and an open interval I  R such that U  I  epif ; hence U  domf ; i.e., int(domf / ¤ ;: Consider any

52

2 Convex Functions

compact set C  int.domf / and let B be the Euclidean unit ball. For every r > 0 the set CCrB is compact, and the family of closed sets f.CCrB/nint.domf /; r > 0g has an empty intersection. In view of the compactness of C C rB some finite subfamily of this family must have an empty intersection, hence for some r > 0, we must have .C CrB/nint.domf / D ;; i.e., C CrB  int.domf /: By Proposition 2.3 the function f is continuous on int(domf /: Denote by 1 and 2 the maximum and the minimum 0/ of f over C C rB: Let x; x0 be two distinct points in C and let z D x C r.xx : Then kxx0 k z 2 C C rB  int.domf /: But x D .1  ˛/x0 C ˛z;

˛D

kx  x0 k r C kx  x0 k

and z; x0 2 domf ; hence f .x/  .1  ˛/f .x0 / C ˛f .z/ D f .x0 / C ˛.f .z/  f .x0 // and consequently f .x/  f .x0 /  ˛.f .z/  f .x0 /  ˛.1  2 / 1  2 :  kx  x0 k; D r By symmetry, we also have f .x0 /  f .x/  kx  x0 k: Hence, for all x; x0 such that x 2 C; x0 2 C W jf .x/  f .x0 /j  kx  x0 k; proving the Lipschitz property of f over C: (iv)) (v) and (v)) (i): obvious.

t u

2.5 Convex Inequalities A convex inequality in x is an inequality of the form f .x/  0 or f .x/ < 0 where f is a convex function. Note that an inequality like f .x/  0 or f .x/ > 0; with f .x/ convex, is not convex but reverse convex, because it becomes convex only when reversed. A system of inequalities is said to be consistent if it has a solution, i.e., if there exists at least one value x satisfying all the inequalities; it is inconsistent otherwise. Many mathematical problems reduce to investigating the consistency (or inconsistency) of a system of inequalities. Proposition 2.18 Let f0 ; f1 ; : : : ; fm be convex functions, finite on some nonempty convex set D  Rn : If the system x 2 D; fi .x/ < 0 i D 0; 1; : : : ; m

(2.14)

2.5 Convex Inequalities

53

is Pminconsistent, then there exist multipliers i  0; i D 0; 1; : : : ; m; such that iD0 i > 0 and 0 f0 .x/ C

m X

i fi .x/  0 8x 2 D:

(2.15)

iD1

If in addition 9x0 2 D

fi .x0 / < 0 i D 1; : : : ; m

(2.16)

then 0 > 0; so that one can take 0 D 1: Proof Consider the set C of all vectors y D .y0 ; y1 ; : : : ; ym / 2 RmC1 for each of which there exists an x 2 D satisfying fi .x/ < yi ;

i D 0; 1; : : : ; m:

(2.17)

As can readily be verified, C is a nonempty convex set and the inconsistency of the system (2.14) means that 0 … C: By Lemma 1.2 there is a hyperplane in RmC1 properly separating 0 from C; i.e., a vector  D .0 ; 1 ; : : : ; m / ¤ 0 such that m X

i y i  0

8y D .y0 ; y1 ; : : : ; ym / 2 C:

(2.18)

iD0

If x 2 D then for every " > 0, we have fi .x/ < fi .x/ C " for i D 0; 1; : : : ; m; so .f0 .x/ C "; : : : ; fm .x/ C "/ 2 C and hence, m X

i .fi .x/ C "/  0:

iD0

Since " > 0 can be arbitrarily small, this implies (2.15). Furthermore, i  0; i D 0; 1; : : : ; m because if j < 0 for some j; then by fixing, forP an arbitrary x 2 D; m all yi > fi .x/; i ¤ j; while letting yj ! C1; we would have Pm iD0 i yi ! 1; contrary toP(2.18). Finally, under (2.16), if 0 D 0 then P iD1 i > 0; hence m m 0 0 by (2.16)  f .x / < 0; contradicting the inequality iD1 i i iD1 i fi .x /  0 from (2.15). Therefore, 0 > 0 as was to be proved. t u Corollary 2.6 Let D be a convex set in Rn ; g; f two convex functions finite on D: If g.x0 / < 0 for some x0 2 D; while f .x/  0 for all x 2 D satisfying g.x/  0; then there exists a real number   0 such that f .x/ C g.x/  0 8x 2 D: Proof Apply the above Proposition for f0 D f ; f1 D g:

t u

A more general result about inconsistent systems of convex inequalities is the following:

54

2 Convex Functions

Theorem 2.3 (Generalized Farkas–Minkowski Theorem) Let fi ; i 2 I1  f1; : : : ; mg; be affine functions on Rn ; and let f0 and fi ; i 2 I2 WD f1; : : : ; mg n I1 ; be convex functions finite on some convex set D  Rn : If there exists x0 satisfying x0 2 riD; fi .x0 /  0 .i 2 I1 /; fi .x0 / < 0 .i 2 I2 /:

(2.19)

x 2 D; fi .x/  0 .i D 1; : : : ; m/; f0 .x/ < 0;

(2.20)

while the system

is inconsistent, then there exist numbers i  0; i D 1; : : : ; m; such that f0 .x/ C

m X

i fi .x/  0 8x 2 D:

(2.21)

iD1

Proof By replacing I1 with fij fi .x0 / D 0g we can assume fi .x0 / D 0 for every i 2 I1 : Arguing by induction on m; observe first that the theorem is true for m D 1: Indeed, in this case, if I1 D ; the theorem follows from Proposition 2.18, so we can assume that I1 D f1g; i.e., f1 .x/ is affine. In view of the fact f1 .x0 / D 0 for x0 2 riD; if f1 .x/  0 8x 2 D; then f1 .x/ D 0 8x 2 D and (2.21) holds with  D 0: On the other hand, if there exists x 2 D satisfying f1 .x/ < 0 then, since the system x 2 D; f1 .x/  0; f0 .x/ < 0 is inconsistent, again the theorem follows from Proposition 2.18. Thus, in any event the theorem is true when m D 1: Assuming now that the theorem is true for m D k  1  1; consider the case m D k: The hypotheses of the theorem imply that the system x 2 D; fi .x/  0 .i D 1; : : : ; k  1/; maxffk .x/; f0 .x/g < 0

(2.22)

is inconsistent, and since the function maxffk .x/; f0 .x/g is convex, there exists, by the induction hypothesis, ti  0; i D 1; : : : ; k  1; such that maxffk .x/; f0 .x/g C

k1 X

ti fi .x/  0 8x 2 D:

(2.23)

iD1

We show that this implies the inconsistency of the system x 2 D;

k1 X

ti fi .x/  0; fk .x/  0; f0 .x/ < 0:

(2.24)

iD1

Indeed, from (2.23) it follows that no solution of (2.24) exists with fk .x/ < 0: Furthermore, if there exists x 2 D satisfying (2.24) with fk .x/ D P 0 then, setting k1 0 x0 D ˛x0 C .1  ˛/x with ˛ 2 .0; 1/, we would have x0 2 D; iD1 ti fi .x /  0; fk .x0 / < 0 and f0 .x0 /  ˛f0 .x0 / C .1  ˛/f0 .x/ < 0 for sufficiently small ˛ > 0:

2.5 Convex Inequalities

55

Since this contradicts (2.23), the system (2.24) is inconsistent and, again by the induction hypothesis, there exist  0 and k  0 such that f0 .x/ C

k1 X

ti fi .x/ C k fk .x/  0

8x 2 D:

(2.25)

iD1

This is the desired conclusion with i D ti  0 for i D 1; : : : ; k  1:

t u

Corollary 2.7 (Farkas’ Lemma) Let A be an m  n matrix, and let p 2 Rn : If hp; xi  0 for all x 2 Rn satisfying Ax  0 then p D AT  for some  D .1 ; : : : ; m /  0: Proof Apply the above theorem for D D Rn ; f0 .x/ D hp; xi; and fi .x/ D hai ; xi; i i D 1; : : : ; m; where i-th row of A: Then there exist nonnegative 1 ; : : : ;  m P a is the Pm i n i such that hp; xi m  ha ; xi  0 for all x 2 R ; hence hp; xi  i iD1 P iD1 i ha ; xi D 0 i for all x 2 Rn ; i.e., p D m  a : t u iD1 i Theorem 2.4 Let f1 ; : : : ; fm be convex functions finite on some convex set D in Rn ; and let A be a k  n matrix, b 2 riA.D/: If the system x 2 D; Ax D b; fi .x/ < 0; i D 1; : : : ; m

(2.26)

is inconsistent, then there exist a vector t 2 Rm and nonnegative numbers 1 ; : : : ; m summing up to 1 such that ht; Ax  bi C

m X

i fi .x/  0 8x 2 D:

(2.27)

iD1

Proof Define E D fx 2 D j Ax D bg: Since the system x 2 E; fi .x/ < 0; i D 1; : : : ; m is inconsistent, there exist, by Proposition 2.18, nonnegative numbers 1 ; : : : ; m ; not all zero, such that m X

i fi .x/  0 8x 2 E:

(2.28)

iD1

Pm Pm By dividing Pm iD1 i , we may assume iD1 i D 1: Obviously, the convexk function f .x/ WD iD1 i fi .x/ is finite on D: Consider the set C of all .y; y0 / 2 R  R for which there exists an x 2 D satisfying Ax  b D y; f .x/ < y0 :

56

2 Convex Functions

Since by (2.28) 0 … C; and since C is convex there exists, again by Lemma 1.2, a vector .t; t0 / 2 Rk  R such that inf Œht; yi C t0 y0   0;

.y;y0 /2C

sup Œht; yi C t0 y0  > 0:

(2.29)

.y;y0 /2C

If t0 < 0 then by fixing x 2 D; y D Ax  b and letting y0 ! C1 we would have ht; yi C t0 y0 ! 1; contradicting (2.29). Consequently t0  0: We now contend that t0 > 0: Suppose the contrary, that t0 D 0; so that ht; yi  0 8.y; y0 / 2 C; and hence ht; yi  0 8y 2 A.D/  b: Since by hypothesis b 2 riA.D/; i.e., 0 2 ri.A.D/  b/; this implies ht; yi D 0 8y 2 A.D/; hence ht; yi C t0 y0 D 0 for all .y; y0 / 2 C; contradicting (2.29). Therefore, t0 > 0 and we can take t0 D 1: Then the left inequality (2.29), where y D Ax  b; y0 D f .x/ C " for x 2 D; yields the desired relation (2.27) by making " # 0. t u

2.6 Approximation by Affine Functions A general method of nonlinear analysis is linearization, i.e., the approximation of convex functions by affine functions. The basis for this approximation is provided by the next result on the structure of closed convex functions which is merely the analytical equivalent form of the corresponding theorem on the structure of closed convex sets (Theorem 1.5). Theorem 2.5 A proper closed convex function f on Rn is the upper envelope (pointwise supremum) of the family of all affine functions h on Rn minorizing f : Proof We first show that for any .x0 ; t0 / … epif there exists .a; ˛/ 2 Rn  R such that ha; xi  t < ˛ < ha; x0 i  t0

8.x; t/ 2 epif :

(2.30)

Indeed, since epif is a closed convex set there exists by Theorem 1.3 a hyperplane strongly separating .x0 ; t0 / from epif ; i.e., an affine function ha; xi C t such that ha; xi C t < ˛ < ha; x0 i C t0

8.x; t/ 2 epif :

It is easily seen that  0 because if > 0 then by taking a point x 2 domf and an arbitrary t  f .x/; we would have .x; t/ 2 epif ; hence ha; xi C t < ˛ for all t  f .x/; which would lead to a contradiction as t ! C1: Furthermore, if x0 2 domf then D 0 would imply ha; x0 i < ha; x0 i; which is absurd. Hence, in this case, < 0; and by dividing a and ˛ by  , we can assume D 1; so

2.6 Approximation by Affine Functions

57

that (2.30) holds. On the other hand, if x0 … domf and D 0; we can consider an x1 2 domf and t1 < f .x1 /, so that .x1 ; t1 / … epif and by what has just been proved, there exists .b; ˇ/ 2 Rn  R satisfying hb; xi  t < ˇ < hb; x1 i  t1

8.x; t/ 2 epif :

For any > 0 we then have for all .x; t/ 2 epif W hb C a; xi  t D .hb; xi  t/ C ha; xi < ˇ C ˛; while hb C a; x0 i  t0 D .hb; x0 i  t0 / C ha; x0 i > ˇ C ˛ for sufficiently large because ˛ < ha; x0 i: Thus, for > 0 large enough, setting a0 D bC a; ˛ 0 D ˇC ˛, we have ha0 ; xi  t < ˛ 0 < ha0 ; x0 i  t0

8.x; t/ 2 epif ;

i.e., .a0 ; ˛ 0 / satisfies (2.30). Note that (2.30) implies ha; xi  ˛  f .x/ 8x; i.e., the affine function h.x/ D ha; xi  ˛ minorizes f .x/: Now let Q be the family of all affine functions h minorizing f : We contend that f .x/ D supfh.x/j h 2 Qg:

(2.31)

Suppose the contrary, that f .x0 / >  D supfh.x/j h 2 Qg for some x0 : Then .x0 ; / … epif and by the above there exists .a; ˛/ 2 Rn  R satisfying (2.30) for t0 D : Hence, h.x/ D ha; xi˛ 2 Q and ˛ < ha; x0 i; i.e., h.x0 / D ha; x0 i˛ > ; a contradiction. Thus (2.31) holds, as was to be proved. t u Corollary 2.8 For any function f W Rn ! Œ1; C1 the closure of the convex hull of f is equal to the upper envelope of all affine functions minorizing f : Proof An affine function h minorizes f if and only if it minorizes cl(convf /; hence the conclusion. t u Proposition 2.19 Any proper convex function f has an affine minorant. If x0 2 int.domf / then an affine minorant h exists which is exact at x0 ; i.e., such that h.x0 / D f .x0 /: Proof Indeed, the collection Q in Theorem 2.5 for cl(f / is nonempty. If x0 2 int.domf / then .x0 ; f .x0 // is a boundary point of the convex set epif : Hence by Theorem 1.5 there exists a supporting hyperplane to epif at this point, i.e., there exists .a; ˛/ 2 Rn  R such that either a ¤ 0 or ˛ ¤ 0 and ha; xi  ˛t  ha; x0 i  ˛f .x0 /

8.x; t/ 2 epif :

As in the proof of Theorem 2.5, it is readily seen that ˛  0: Furthermore, if ˛ D 0 then the above relation implies that ha; xi  ha; x0 i for all x in some open neighborhood U of x0 contained in domf ; and hence that a D 0, a contradiction.

58

2 Convex Functions

Therefore, ˛ < 0; so we can take ˛ D 1: Then ha; xi  t  ha; x0 i  f .x0 / for all .x; t/ 2 epif and the affine function h.x/ D ha; x  x0 i C f .x0 / satisfies h.x/  f .x/ 8x and h.x0 / D f .x0 /:

t u

2.7 Subdifferential Given a proper function f on Rn ; a vector p 2 Rn is called a subgradient of f at a point x0 if hp; x  x0 i C f .x0 /  f .x/

8x:

(2.32)

The set of all subgradients of f at x0 is called the subdifferential of f at x0 and is denoted by @f .x0 / (Fig. 2.2). The function f is said to be subdifferentiable at x0 if @f .x0 / ¤ ;: Theorem 2.6 Let f beSa proper convex function on Rn : For any bounded set C  int.domf / the set x2C @f .x/ is nonempty and bounded. In particular, @f .x0 / is nonempty and bounded at every x0 2 int.domf /: Proof By Proposition 2.19 if x0 2 int.domf / then f has an affine minorant h.x/ such that h.x0 / D f .x0 /; i.e., h.x/ D hp; x  x0 i C f .x0 / for some p 2 @f .x0 /: Thus, @f .x0 / ¤ ; for every x0 2 int.domf /: Consider now any bounded set C  int.domf /: As we saw in the proof of Theorem 2.2, there is r > 0 such that C C rB  int.domf /; where B denotes the Euclidean unit ball. By definition, for any

f (x) p, x − x0 f (x0 )

0 Fig. 2.2 The set @f .x0 /

x0

x

2.7 Subdifferential

59

x 2 C and p 2 @f .x/, we have hp; y  xi C f .x/  f .y/ 8y; but by Theorem 2.2, there exists > 0 such that jf .x/  f .y/j  ky  xk for all y 2 C C rB: Hence jhp; y  xij  ky  xk for all y 2 C C rB; i.e., jhp; S uij  kuk for all u 2 B: By taking u D p=kpk this implies kpk  ; so the set x2C @f .x/ is bounded. t u Corollary 2.9 Let f be a proper convex function on Rn : For any bounded convex subset C of int.domf / there exists a positive constant such that f .x/ D supfh.x/j h 2 Q0 g 8x 2 C;

(2.33)

where every h 2 Q0 has the form h.x/ D ha; xi  ˛ with kak  : Proof It suffices to take as Q0 the family of all affine functions h.x/ D ha; x  yi C f .y/; with y 2 C; a 2 @f .y/: t u Corollary 2.10 Let f W D ! R be a convex function defined and continuous on a convex set D with nonempty interior. If the set [f@f .x/j x 2 intDg is bounded, then f can be extended to a finite convex function on Rn : Proof For each point y 2 intD take a vector py 2 @f .y/ and consider the affine function hy .x/ D f .y/ C hpy ; x  yi: The function fQ .x/ D supfhy .x/j y 2 intDg is convex on Rn as the upper envelope of a family of affine functions. If a is any fixed point of D then hy .x/ D f .y/ C hpy ; a  yi C hpy ; x  ai  f .a/ C hpy ; x  ai  f .a/ C kpy k:kx  ak: Since kpy k is bounded on intD the latter inequality shows that 1 < fQ .x/ < C1 8x 2 Rn : Thus, fQ .x/ is a convex finite function on Rn : Finally, since obviously fQ .x/ D f .x/ for every x 2 intD it follows from the continuity of both f .x/ and fQ .x/ on D that fQ .x/ D f .x/ 8x 2 D: t u Example 2.1 (Positively Homogeneous Convex Function) Let f W Rn ! R be a positively homogeneous convex function, i.e., a convex function f W Rn ! R such that f .x/ D f .x/ 8 > 0: Then @f .x0 / D fp 2 Rn j hp; x0 i D f .x0 /; hp; xi  f .x/ 8xg

(2.34)

Proof if p 2 @f .x0 / then hp; x  x0 i C f .x0 /  f .x/ 8x: Setting x D 2x0 yields hp; x0 i C f .x0 /  2f .x0 /; i.e., hp; x0 i  f .x0 /; then setting x D 0 yields hp; x0 i  f .x0 /; hence hp; x0 i D f .x0 /: (Note that this condition is trivial and can be omitted if x0 D 0/: Furthermore, hp; x  x0 i D hp; xi  hp; x0 i D hp; xi  f .x0 /; hence hp; xi  f .x/ 8x: Conversely, if p belongs to the set on the right-hand side of (2.34) then obviously hp; x  x0 i  f .x/  f .x0 /; so p 2 @f .x0 /: t u If, in addition, f .x/ D f .x/  0 8x then the condition hp; xi  f .x/ 8x is equivalent to jhp; xij  f .x/ 8x: In particular: 1. If f .x/ D kxk (Euclidean norm) then 0

@f .x / D



fp j kpk  1g (unit ball) if x0 D 0 if x0 ¤ 0: fx0 =kx0 kg

(2.35)

60

2 Convex Functions

2. If f .x/ D maxfjxi j j i D 1; : : : ; ng (Tchebycheff norm) then

0

@f .x / D

if x0 D 0 convf˙e1 ; : : : ; ˙en g 0 0 convf.signxi /xi j i 2 Ix0 g if x0 ¤ 0;

(2.36)

where Ix D fi j jxi j D f .x/g: p 3. If Q is a symmetric positive semidefinite matrix and f .x/ D hx; Qxi (elliptic norm) then ( 0

@f .x / D

p fp j hp; p xi  hx; Qxi 8xg if x0 2 KerQ if x0 … KerQ: f.Qx0 /= hx0 ; Qx0 ig

(2.37)

Example 2.2 (Distance Function) Let C be a closed convex set in Rn ; and f .x/ D minfky  xk j y 2 Cg: Denote by C .x/ the projection of x on C; so that k C .x/  xk D minfky  xk j y 2 Cg and hx  C .x/; y  C .x/i  0 8y 2 C (see Proposition 1.15). Then ( @f .x0 / D

0 0 / \ B.0; N o 1/ if x 2 C n C .x x0  C .x0 / if x0 … C; kx0  .x0 /k

(2.38)

C

where NC .x0 / denotes the outward normal cone of C at x0 and B.0; 1/ the Euclidean unit ball. Proof Let x0 2 C; so that f .x0 / D 0: Then p 2 @f .x0 / implies hp; x  x0 i  f .x/ 8x; hence, in particular, hp; x  x0 i  0 8x 2 C; i.e., p 2 NC .x0 /I furthermore, hp; xx0 i  f .x/  kxx0 k 8x; hence kpk  1; i.e., p 2 B.0; 1/: Conversely, if p 2 NC .x0 / \ B.0; 1/ then hp; x  C .x/i  kx  C .x/k  f .x/; and hp; C .x/  x0 i  0; consequently hp; x  x0 i D hp; x  C .x/i C hp; C .x/  x0 i  f .x/ D f .x/  f .x0 / for all x; and so p 2 @f .x0 /: Turning to the case x0 … C; observe that p 2 @f .x0 / implies hp; x  x0 i C f .x0 /  f .x/ 8x; hence, setting x D C .x0 / yields hp; C .x0 /  x0 i C k C .x0 /  x0 k  0; i.e., hp; x0  C .x0 /i  kx0  C .x0 /k: On the other hand, setting x D 2x0  C .x0 / yields hp; x0  C .x0 /i C k C .x0 /  x0 k  2k C .x0 /  x0 k; i.e., hp; x0  C .x0 /i  k C .x0 /  x0 k: Thus, hp; x0  C .x0 /i D k C .x0 /  x0 k and consequently p D x0  C .x0 / : Conversely, the last equality implies hp; x0  C .x0 /i D kx0  C .x0 /k D kx0  C .x0 /k f .x0 /; hp; x C .x/i  kx C .x/k D f .x/; hence hp; xx0 iCf .x0 / D hp; xx0 iC hp; x0  C .x0 /i D hp; x  C .x0 /i D hp; x  C .x/i C hp; C .x/  C .x0 /i;  kx  C .x/k D f .x/ for all x (note that hp; C .x/ C .x0 /i  0 because p 2 NC . C .x0 ///: Therefore, hp; x  x0 i C f .x0 /  f .x/ for all x; proving that p 2 @f .x0 /: t u Observe from the above examples that there is a unique subgradient (which is just the gradient) at every point where f is differentiable. This is actually a general fact which we are now going to establish.

2.7 Subdifferential

61

Let f W Rn ! Œ1; C1 be any function and let x0 be a point where f is finite. If for some u ¤ 0 the limit (finite or infinite) lim #0

f .x0 C u/  f .x0 / 

exists, then it is called the directional derivative of f at x0 in the direction u, and is denoted by f 0 .x0 I u/: Proposition 2.20 Let f be a proper convex function and x0 2 domf : Then: (i) f 0 .x0 I u/ exists for every direction u and satisfies f .x0 C u/  f .x0 / I >0 

f 0 .x0 I u/ D inf

(2.39)

(ii) The function u 7! f 0 .x0 I u/ is convex and homogeneous and p 2 @f .x0 / if and only if hp; ui  f 0 .x0 ; u/

8u:

(2.40)

(iii) If f is continuous at x0 then f 0 .x0 I u/ is finite and continuous at every u 2 Rn ; the subdifferential @f .x0 / is compact and f 0 .x0 I u/ D maxfhp; ui j p 2 @f .x0 /g:

(2.41)

Proof (i) For any given u ¤ 0 the function './ D f .x0 C u/ is proper convex on the 0 real line, and 0 2 dom': Therefore, its right derivative 'C .0/ D f 0 .x0 I u/ exists (but may equal C1 if 0 is an endpoint of dom'/: The relation (2.39) follows from the fact that Œ'./  '.0/= is nonincreasing as  # 0: (ii) The homogeneity of f 0 .x0 I u/ is obvious. The convexity then follows from the relations f 0 .x0 I u C v/ D inf

>0

f .x0 C 2 .u C v//  f .x0 /  2

f .x0 C u/  f .x0 / C f .x0 C v/  f .x0 / >0 

 inf

D f 0 .x0 I u/ C f 0 .x0 I v/: Setting x D x0 C u we can turn the subgradient inequality (2.32) into the condition hp; ui  Œf .x0 C u/  f .x0 /=

8u; 8 > 0;

which is equivalent to hp; ui  inf>0 Œf .x0 C u/  f .x0 /=; 8u; i.e., by (i), hp; ui  f 0 .x0 I u/ 8u:

62

2 Convex Functions

(iii) If f is continuous at x0 then there is a neighborhood U of 0 such that f .x0 C u/ is bounded above on U: Since by (i) f 0 .x0 I u/  f .x0 C u/  f .x0 /; it follows that f 0 .x0 I u/ is also bounded above on U; and hence is finite and continuous on Rn (Theorem 2.2). The Condition (2.40) then implies that @f .x0 / is closed and hence compact because it is bounded by Theorem 2.6. In view of the homogeneity of f 0 .x0 I u/; an affine minorant of it which is exact at some point must be of the form hp; ui; with hp; ui  f 0 .x0 I u/ 8u; i.e., by (ii), p 2 @f .x0 /: By Corollary 2.9, we then have f 0 .x0 I u/ D maxfhp; ui j p 2 @f .x0 /g: t u According to the usual definition, a function f is differentiable at a point x0 if there exists a vector rf .x0 / (the gradient of f at x0 / such that f .x0 C u/ D f .x0 / C hrf .x0 /; ui C o.kuk/: This is equivalent to lim #0

f .x0 C u/  f .x0 / D hrf .x0 /; ui; 

8u ¤ 0;

so the directional derivative f 0 .x0 I u/ exists, and is a linear function of u: Proposition 2.21 Let f be a proper convex function and x0 2 domf : If f is differentiable at x0 then rf .x0 / is its unique subgradient at x0 : Proof If f is differentiable at x0 then f 0 .x0 I u/ D hrf .x0 /; ui; so by (ii) of Proposition 2.20, a vector p is a subgradient f at x0 if and only if hp; ui  hrf .x0 /; ui 8u; i.e., if and only if p D rf .x0 /: t u One can prove conversely that if f has a unique subgradient at x0 then f is differentiable at x0 (see, e.g., Rockafellar 1970).

2.8 Subdifferential Calculus A convex function f may result from some operations on convex functions fi ; i 2 I: (cf Sect. 2.2). It is important to know how the subdifferential of f can be computed in terms of the subdifferentials of the fi0 s. Proposition 2.22 Let fi ; i D 1; : : : ; m; be proper convex functions on Rn : Then for every x 2 Rn W @

m X iD1

! fi .x/ 

m X iD1

@fi .x/:

2.8 Subdifferential Calculus

63

If there exists a point a 2 \m iD1 domfi ; where every function fi ; except perhaps one, is continuous, then the above inclusion is in fact an equality for every x 2 Rn : Proof It suffices to prove the proposition for m D 2 because the general case will follow by induction. Furthermore, the first part is straightforward, so we only need to prove the second part. If p 2 @.f1 C f2 /.x0 /; then the system x  y D 0; f1 .x/ C f2 .y/  f1 .x0 /  f2 .x0 /  hp; x  x0 i < 0 is inconsistent. Define D D domf1  domf2 and A.x; y/ WD x  y: By hypothesis, f1 is continuous at a 2 domf1 \ domf2 ; so there is a ball U around 0 such that aCU  domf1 ; hence U D .aCU/a  domf1 domf2 D A.D/; i.e., 0 2 intA.D/: Therefore, by Theorem 2.4 there exists t 2 Rn such that ht; x  yi C Œf1 .x/ C f2 .y/  f1 .x0 /  f2 .x0 /  hp; x  x0 i  0 for all x 2 Rn and all y 2 Rn : Setting y D x0 yields hp  t; x  x0 i  f1 .x/  f1 .x0 / 8x 2 Rn ; i.e., p  t 2 @f1 .x0 /: Then setting x D x0 yields ht; y  x0 i  f2 .y/  f2 .x0 / 8y 2 Rn ; i.e., t 2 @f2 .x0 /: Thus, p D .p  t/ C t 2 @f1 .x0 / C @f2 .x0 /; as was to be proved. t u Proposition 2.23 Let A W Rn ! Rm be a linear mapping and g be a proper convex function on Rm : Then for every x 2 Rn W AT @g.Ax/  @.g ı A/.x/: If g is continuous at some point in Im.A/ .the range of A/ then the above inclusion is in fact an equality for every x 2 Rn : Proof The first part is straightforward. To prove the second part, consider any p 2 @.g ı A/.x0 /: Then the system Ax  y D 0; g.y/  g.Ax0 /  hp; x  x0 i < 0; x 2 Rn ; y 2 Rm is inconsistent. Define D D Rn  domg; B.x; y/ D Ax  y: Since there is a point b 2 ImA \ int.domg/, we have b 2 intB.D/; so by Theorem 2.4 there exists t 2 Rm such that ht; Ax  yi C g.y/  g.Ax0 /  hp; x  x0 i  0 for all x 2 Rn and all y 2 Rm : Setting y D 0 then yields hAT t  p; xi  g.Ax0 / C hp; x0 i  0 8x 2 Rn ; hence p D AT t; while setting x D x0 yields ht; y  Ax0 i  g.y/  g.Ax0 /; i.e., t 2 @g.Ax0 /: Therefore, p 2 AT @g.Ax0 /: t u Proposition 2.24 Let g.x/ D .g1 .x/; : : : ; gm .x//; where each gi is a convex functions from Rn to R; let ' W Rm ! R be a convex function, component-wise

64

2 Convex Functions

increasing, i.e., such that '.t/  '.t0 / whenever ti  ti0 ; i D 1; : : : ; m: Then the function f D ' ı g is convex and ( @f .x/ D

m X

) si p j p 2 @gi .x/; .s1 ; : : : ; sm / 2 @'.g.x// : i

i

(2.42)

iD1

Proof The convexity of f .x/ follows from an obvious extensionPof Proposition 2.8 i i which corresponds to the case m D 1: To prove (2.42), let p D m iD1 si p with p 2 0 0 0 0 @gi .x /; s 2 @'.g.x //: First observe that hs; y  g.x /i  '.y/  '.g.x // 8y 2 Rm 0 implies, for all y D g.x0 /Cu with u   '.g.x0 /Cu/'.g.x //  0 8u  P0m W hs; ui Pm 0 i 0 0; hence s  0: Now hp; x  x i D iD1 si hp ; x  x i  iD1 si Œgi .x/  gi .x0 / D hs; g.x/  g.x0 /i  '.g.x//  '.g.x0 // D f .x/  f .x0 / for all x 2 Rn : Therefore p 2 @f .x0 /; i.e., the right-hand side of (2.42) is contained in the left-hand side. To prove the converse, let p 2 @f .x0 /; so that the system x 2 Rn ; y 2 Rm ; gi .x/ < yi

i D 1; : : : ; m

'.y/  '.g.x0 //  hp; x  x0 i < 0

(2.43) (2.44)

is inconsistent, while the system (2.43) has a solution. By Proposition 2.18 there exists s 2 Rm C such that '.y/  '.g.x0 //  hp; x  x0 i C hs; g.x/  yi  0 for all x 2 Rn ; y 2 Rm : Setting x D x0 yields '.y/  '.g.x0 //  hs; y  g.x0 /i for 0 0 all y 2 Rm ; which means Pm that s 2 @'.g.x0 //: On the othern hand, setting y D g.x / 0 yields p2 P hp; x  0x i  iD1 si Œgi .x/  gi .x / for allPxm 2 R i; whichi means that 0 @. m s g .x //; hence by Proposition 2.22, p D s p with p 2 @g .x /: t u i i i i iD1 iD1 Note that when '.y/ is differentiable at g.x/ the above formula (2.42) is similar to the classical chain rule, namely: @.' ı g/.x/ D

m X @' .g.x//@gi .x/: @yi iD1

Proposition 2.25 Let f .x/ D maxfg1 .x/; : : : ; gm .x/g; where each gi is a convex function from Rn to R: Then @f .x/ D convf[@gi .x/j i 2 I.x/g; where I.x/ D fi j f .x/ D gi .x/g: Proof If p 2 @f .x0 / then the system gi .x/  f .x0 /  hp; x  x0 i < 0

i D 1; : : : ; m

(2.45)

2.9 Approximate Subdifferential

65

is inconsistent. By Proposition 2.18, there exist i  0 such that P m 0 0 0 iD1 i Œgi .x/  f .x /  hp; x  x i  0: Setting x D x , we have X

Pm iD1

i D 1 and

i Œgi .x0 /  f .x0 /  0;

i…I.x0 /

with gi .x0 /  f .x0 / < 0 for every i … I.x0 /: This implies that i D 0 for all i … I.x0 /: Hence X i Œgi .x/  gi .x0 /  hp; x  x0 i  0 i2I.x0 /

P P for all x 2 Rn and so p 2 @. i2I.x0 / i gi .x0 //: By Proposition 2.22, p D i2I.x0 / pi ; with pi 2 @gi .x0 /: Thus @f .x0 /  convf[@gi .x0 /j i 2 I.x0 /g: The converse inclusion can be verified in a straightforward manner. t u

2.9 Approximate Subdifferential A proper convex function f on Rn may have an empty subdifferential at certain points. In practice, however, one often needs only a concept of approximate subdifferential . Given a positive number " > 0; a vector p 2 Rn is called an "-subgradient of f at point x0 if hp; x  x0 i C f .x0 /  f .x/ C " 8x:

(2.46)

The set of all "-subgradients of f at x0 is called the "-subdifferential of f at x0 , and is denoted by @" f .x0 / (Fig. 2.3).

f (x0 ) f (x0 ) − ε

0

Fig. 2.3 The set @" f .x0 /

x0

x

66

2 Convex Functions

Proposition 2.26 For any proper closed convex function f on Rn ; and any given " > 0 the "-subdifferential of f at any point x0 2 domf is nonempty. If C is a bounded subset of int(dom f / then the set [x2C @" f .x/ is bounded. Proof Since the point .x0 ; f .x0 /  "/ … epif ; there exists p 2 Rn such that hp; x  x0 i < f .x/  .f .x0 /  "/ 8x (see (2.30) in the proof of Theorem 2.5). Thus, @" f .x0 / ¤ ;: The proof of the second part of the proposition is analogous to that of Theorem 2.6. t u Note that @" f .x0 / is unbounded when x0 is a boundary point of domf : Proposition 2.27 Let x0 ; x1 2 domf : If p 2 @" f .x0 / ."  0/ then p 2 @ f .x1 / for

D f .x1 /  f .x0 /  hp; x1  x0 i C "  0: Proof If hp; x  x0 i  f .x/  f .x0 / C " for all x then hp; x  x1 i D hp; x  x0 i C hp; x0  x1 i  f .x/  f .x0 / C "  hp; x1  x0 i D f .x/  f .x1 / C for all x: t u A function f .x/ is said to be strongly convex on a convex set C if there exists r > 0 such that f ..1  /x1 C x2 /  .1  /f .x1 / C f .x2 /  .1  /rkx1  x2 k2

(2.47)

for all x1 ; x2 2 C; and all  2 Œ0; 1: The number r > 0 is then called the modulus of strong convexity of f .x/: Using the identity .1  /kx1  x2 k2 D .1  /kx1 k2 C kx2 k2  k.1  /x1 C x2 k2 for all x1 ; x2 2 Rn ; and all  2 Œ0; 1; it is easily verified that a convex function f .x/ is strongly convex with modulus of strong convexity r if and only if the function f .x/  rkxk2 is convex. Proposition 2.28 If f .x/ is a strongly convex function on Rn with modulus of strong convexity r then for any x0 2 Rn and "  0 W p @f .x0 / C B.0; 2 r"/  @" f .x0 /;

(2.48)

where B.0; ˛/ denotes the ball of radius ˛ around 0. Proof Let p 2 @f .x0 /: Since F.x/ WD f .x/  rkxk2 is convex and p  2rx0 2 @F.x0 / we can write f .x/  f .x0 /  r.kxk2  kx0 k2 /  hp  2rx0 ; x  x0 i for all x; hence f .x/  f .x0 /  hp; x  x0 i C rkx  x0 k2 :

2.10 Conjugate Functions

67

Let us determine a vector u such that rkx  x0 k2  hu; x  x0 i  " 8x 2 Rn :

(2.49)

The convex quadratic function rkx  x0 k2  hu; x  x0 i achieves its minimum at the point x such that 2r.x  x0 /  u D 0; i.e., x  x0 D 2ru : This minimum is 2

2

kuk kuk u 2 2 equal to rŒp 2r   2r D 4r : Thus, by choosing u such that kuk  4r"; i.e., 0 u 2 B.0; 2 r"/ we will have (2.49), hence p C u 2 @" f .x /: t u

Corollary 2.11 Let f .x/ D 12 hx; Qxi C hx; ai; where a 2 Rn ;and Q is an n  n symmetric positive definite matrix. Let r > 0 be the smallest eigenvalue of Q: Then Qx C a C u 2 @" f .x/ p for any u 2 Rn such that kuk  2 r": Proof Clearly f .x/  rkxk2 is convex (as its smallest eigenvalue is nonnegative), so f .x/ is a strongly convex function to which the above proposition applies. t u

2.10 Conjugate Functions Given an arbitrary function f W Rn ! Œ1; C1, we consider the set of all affine functions h minorizing f : It is natural to restrict ourselves to proper functions, because an improper function either has no affine minorant (if f .x/ D 1 for some x/ or is minorized by every affine function (if f .x/ is identical to C1/: Observe that if hp; xi  ˛  f .x/ for all x then ˛  hp; xi  f .x/ 8x: The function f  .p/ D sup fhp; xi  f .x/g:

(2.50)

x2Rn

which is clearly closed and convex is called the conjugate of f : For instance, the conjugate of the function f .x/ D ıC .x/ (indicator function of a set C; see Sect. 2.1) is the function f  .p/ D suphp; xi D sC .p/ (support function x2C

of C/: The conjugate of an affine function f .x/ D hc; xi  ˛ is the function f  .p/ D supfhp; xi  hc; xi C ˛g D x



C1; p ¤ c ˛; p D c:

Two less trivial examples are the following: Example 2.3 The conjugate of the proper convex function f .x/ D ex ; x 2 R; is by definition f  .p/ D supfpx  ex g: Obviously, f  .p/ D 0 for p D 0 and f  .p/ D C1 x

for p < 0: For p > 0; the function px  ex achieves a maximum at x D satisfying p D e ; so f  .p/ D p log p  p: Thus,

68

2 Convex Functions

8 pD0 < 0;  f .p/ D C1; p 0: The conjugate of f  .p/ is in turn the function f  .x/ D supfpx  f  .p/g D supfpx  p

p log p C pj p > 0g D ex : Example 2.4 The conjugate of the function f .x/ D is f  .p/ D

1 ˛

Pn iD1

jxi j˛ ; 1 < ˛ < C1;

n 1X ˇ jpi j ; 1 < ˇ < C1; ˇ iD1

where 1=˛ C 1=ˇ D 1: Indeed, (

) n 1X ˛ p i xi  jxi j : f .p/ D sup ˛ iD1 x iD1 

n X

By differentiation, we find that the supremum on the right-hand side is achieved at x D satisfying pi D j i j˛1 sign i ; hence f  .p/ D

  n 1X ˇ 1 D j i j˛ 1  jpi j : ˛ ˇ iD1 iD1

n X

Proposition 2.29 Let f W Rn ! Œ1; C1 be an arbitrary proper function. Then: (i) f .x/ C f  .p/  hp; xi 8x 2 Rn ; 8p 2 Rn I   (ii) f .x/  f .x/ 8x; and f D f if and only if f is convex and closed; (iii) f  .x/ D supfh.x/j h affine; h  f g; i.e., f  .x/ is the largest closed convex function minorizing f .x/ W f  D cl.conv/f : Proof (i) is obvious, and from (i) f  .x/ D supp fhp; xif  .p/g  f .x/: Let Q be the set of all affine functions majorized by f : For every h 2 Q; say h.x/ D hp; xi  ˛; we have hp; xi  ˛  f .x/ 8x; hence ˛  supx fhp; xi  f .x/g D f  .p/; and consequently h.x/  hp; xi  f  .p/ 8x: Thus supfhj h 2 Qg  supfhp; xi  f  .p/g D f  ;

(2.51)

p

If f is convex and closed then, since it is proper, by Theorem 2.5 it is just equal to the function on the left-hand side of (2.51) and since f  f  ; it follows that f D f  : Conversely, if f D f  then f is the conjugate of f  , hence is convex and closed. Turning to (iii) observe that if Q D ; then f  .p/ D supx fhp; xi  f .x/g D C1 for

2.11 Extremal Values of Convex Functions

69

all p and consequently, f  1: But in this case, supfhj h 2 Qg D 1; too, hence f  D supfhj h 2 Qg: On the other hand, if there is h 2 Q then from (2.51) h  f  I conversely, if h  f  then h  f (because f   f /: In this case, since f  .x/  f .x/ < C1 at least for some x; and f  .x/ > 1 8x (because there is an affine function minorizing f  /; it follows that f  is proper. By Theorem 2.5, then f  D supfhj h affine h  f  g D supfhj h 2 Qg: This proves the equality in (iii), and so, by Corollary 2.8, f  D cl.conv/f : t u

2.11 Extremal Values of Convex Functions The smallest and the largest values of a convex function on a given convex set are often of particular interest. Let f W Rn ! Œ1; C1 be an arbitrary function, and C an arbitrary set in Rn : A point x0 2 C \ domf is called a global minimizer of f .x/ on C if 1 < f .x0 /  f .x/ for all x 2 C: It is called a local minimizer of f .x/ on C if there exists a neighborhood U.x0 / of x0 such that 1 < f .x0 /  f .x/ for all x 2 C \ U.x0 /: The concepts of global maximizer and local maximizer are defined analogously. For an arbitrary function f on a set C we denote the set of all global minimizers (maximizers) of f on C by argminx2Cf .x/ (argmaxx2Cf .x/; resp.). Since minx2C f .x/ D  maxx2C .f .x// the theory of the minimum (maximum) of a convex function is the same as the theory of the maximum (minimum, resp.) of a concave function.

2.11.1 Minimum Proposition 2.30 Let C be a nonempty convex set in Rn ; and f W Rn ! R be a convex function. Any local minimizer of f on C is also global. The set argmin f .x/ is x2C

a convex subset of C: Proof Let x0 2 C be a local minimizer of f and U.x0 / be a neighborhood such that f .x0 /  f .x/ 8x 2 C \ U.x0 /: For any x 2 C we have x WD .1  /x0 C x 2 C \ U.x0 / for sufficiently small  > 0: Then f .x0 /  f .x /  .1  /f .x0 / C f .x/; hence f .x0 /  f .x/; proving the first part of the proposition. If ˛ D min f .C/ then argminx2C f .x/ coincides with the set C \ fxj f .x/  ˛g which is a convex set by the convexity of f .x/ (Proposition 2.11). t u Remark 2.1 A real-valued function f on a convex set C is said to be strictly convex on C if f ..1  /x1 C x2 / < .1  /f .x1 / C f .x2 / for any two distinct points x1 ; x2 2 C and 0 <  < 1: For such a function f the set argminx2C f .x/; if nonempty, is a singleton, i.e., a strictly convex function f .x/ on C has at most one minimizer over C: In fact, if there were two distinct minimizers 1 2 x1 ; x2 then by strict convexity f . x Cx / < f .x1 /; which is impossible. 2

70

2 Convex Functions

Proposition 2.31 Let C be a convex set in Rn ; and f W Rn ! Œ1; C1 a convex function which is finite on C: For a point x0 2 C to be a minimizer of f on C it is necessary and sufficient that 0 2 @f .x0 / C NC .x0 /;

(2.52)

where NC .x0 / denotes the (outward) normal cone of C at x0 : (cf Sect. 1.6) Proof If (2.52) holds there is p 2 @f .x0 / \ NC .x0 /: For every x 2 C; since p 2 @f .x0 /, we have hp; x  x0 i  f .x/  f .x0 /; i.e., f .x0 /  f .x/  hp; x  x0 iI on the other hand, since p 2 NC .x0 /; we have hp; x  x0 i  0; hence f .x0 /  f .x/; i.e., x0 is a minimizer. Conversely, if x0 2 argminx2C f .x/ then the system .x; y/ 2 C  Rn ; x  y D 0; f .y/  f .x0 / < 0 is inconsistent. Define D WD C  Rn and A.x; y/ WD x  y; so that A.D/ D C  Rn : For any ball U around 0, x0 C U  Rn ; hence U D x0  .x0 C U/  A.D/; and so 0 2 intA.D/: Therefore, by Theorem 2.4, there exists a vector p 2 Rn such that hp; x  yi C f .y/  f .x0 /  0 8.x; y/ 2 C  Rn : Letting y D x0 yields hp; x  x0 i  0 8x 2 C; i.e., p 2 NC .x0 /; then letting x D x0 yields f .y/  f .x0 /  hp; y  x0 i 8y 2 Rn ; i.e., p 2 @f .x0 /: Thus, p 2 NC .x0 / \ @f .x0 /; completing the proof. t u Corollary 2.12 Under the assumptions of the above proposition, an interior point x0 of C is a minimizer if and only if 0 2 @f .x0 /: Proof Indeed, NC .x0 / D f0g if x0 2 intC:

t u

Proposition 2.32 Let C be a nonempty compact set in Rn ; f W C ! R an arbitrary continuous function, f c the convex envelope of f over C: Then any global minimizer of f .x/ on C is also a global minimizer of f c .x/ on convC: Proof Let x0 2 C be a global minimizer of f .x/ on C: Since f c is a minorant of f , we have f c .x0 /  f .x0 /: If f c .x0 / < f .x0 / then the convex function h.x/ D maxff .x0 /; f c .x/g would be a convex minorant of f larger than f c ; which is impossible. Thus, f c .x0 / D f .x0 / and f c .x/ D h.x/ 8x 2 convC: Hence, f c .x0 / D f .x0 /  f c .x/ 8x 2 convC, i.e., x0 is also a global minimizer of f c .x/ on convC: t u

2.11.2 Maximum In contrast with the minimum, a local maximum of a convex function may not be global. Generally speaking, local information is not sufficient to identify a global maximizer of a convex function.

2.11 Extremal Values of Convex Functions

71

Proposition 2.33 Let C be a convex set in Rn ; and f W C ! R be a convex function. If f .x/ attains its maximum on C at a point x0 2 riC then f .x/ is constant on C: The set argmax f .x/ is a union of faces of C: x2C

Proof Suppose that f attains its maximum on C at a point x0 2 riC and let x be an arbitrary point of C: Since x0 2 riC there is y 2 C such that x0 D x C .1  /y for some  2 .0; 1/: Then f .x0 /  f .x/ C .1  /f .y/; hence f .x/  f .x0 /  .1  / f .y/  f .x0 /  .1  /f .x0 / D f .x0 /: Thus f .x/  f .x0 /; hence f .x/ D f .x0 /; proving the first part of the proposition. The second part follows, because for any maximizer x0 there is a face F of C such that x0 2 riF W then by the previous argument, any point of this face is a global minimizer. t u Proposition 2.34 Let C be a closed convex set, and f W C ! R be a convex function. If C contains no line and f .x/ is bounded above on every halfline of C then supff .x/j x 2 Cg D supff .x/j x 2 V.C/g; where V.C/ is the set of extreme points of C: If the maximum of f .x/ is attained at all on C; it is attained on V.C/: Proof By Theorem 1.7, C D convV.C/ C K; where K is the convex cone generated by the extreme directions of C: Any point of C which is actually not an extreme point belongs to a halfline emanating from some v 2 V.C/ in the direction of a ray of K: Since f .x/ is finite and bounded above on this halfline, its maximum on the halfline is attained at v (Proposition 2.12, (ii)). Therefore, the supremum of f .x/ on C is reduced to the supremum on convV.C/: The P conclusion then follows from the i i fact that any xP2 convV.C/ is of the form x D i2I i v ; with jIj < C1; v 2 P i i V.C/; i  0; i2I i D 1; hence f .x/  i2I i f .v /  maxi2I f .v /: t u Corollary 2.13 A real-valued convex function f .x/ on a polyhedron C containing no line is either unbounded above on some unbounded edge or attains its maximum at an extreme point of C: Corollary 2.14 A real-valued convex function f .x/ on a compact convex set C attains its maximum at an extreme point of C: The latter result is in fact true for a wider class of functions, namely for quasiconvex functions. As was defined in Sect. 2.3, these are functions f W Rn ! Œ1; C1 such that for any real number ˛; the level set L˛ WD fx 2 Rn j f .x/  ˛g is convex, or equivalently, such that f ..1  /x1 C x2 /  maxff .x1 /; f .x2 /g

(2.53)

for any x1 ; x2 2 C and any  2 Œ0; 1: To see that Corollary 2.14 extends to quasiconvex functions, just note that a compact convex set C is the convex hull of its extreme points (Corollary 1.13), so

72

2 Convex Functions

P any P x 2 C can be represented as x D i2I i v i ; where v i are extreme points, i  0 and i2I i D 1: If f .x/ is a quasiconvex function finite on C; and ˛ D maxi2I f .v i /; then v i 2 C \ L˛ ; 8i 2 I; hence x 2 C \ L˛ ; because of the convexity of the set C \ L˛ : Therefore, f .x/  ˛ D maxi2I f .v i /; i.e., the maximum of f on C is attained at some extreme point. A function f .x/ is said to be quasiconcave if f .x/ is quasiconvex. A convex (concave) function is of course quasiconvex (quasiconcave), but the converse may not be true, as can be demonstrated by a monotone nonconvex function of one variable. Also it is easily seen that an upper envelope of a family of quasiconvex functions is quasiconvex, but the sum of two quasiconvex functions may not be quasiconvex.

2.12 Minimax and Saddle Points 2.12.1 Minimax Given a function f .x; y/ W C  D ! R we can compute infx2C f .x; y/ and supy2D f .x; y/: It is easy to see that there always holds sup inf f .x; y/  inf sup f .x; y/: y2D x2C

x2C y2D

Indeed, infx2C f .x; y/  f .z; y/ 8z 2 C; y 2 D; so supy2D infx2C f .x; y/  supy2D f .z; y/ 8z 2 C; hence supy2D infx2C f .x; y/  infz2C supy2D f .z; y/: We would like to know when the reverse inequality is also true, i.e., when there holds the minimax equality WD sup inf f .x; y/ D inf sup f .x; y/ WD : y2D x2C

x2C y2D

Investigations on this question date back to von Neumann (1928). A classical result of his states that if C; D are compact and f .x; y/ is convex in x; concave in y and continuous in each variable then the minimax equality holds. Since minimax theorems have found important applications, there has been a great deal of work on the extension of von Neumann’s theorem. Almost all these extensions are based either on the separation theorem of convex sets or a fixed point argument. The most important result in this direction is due to Sion (1958). Later a more general minimax theorem was established by Tuy (1974) without any appeal to separation or fixed point argument. We present here a simplified version of the latter result which is also a refinement of Sion’s theorem. Theorem 2.7 Let C; D be two closed convex sets in Rn ; Rm ; respectively, and let f .x; y/ W C  D ! R be a function quasiconvex, lower semi-continuous in x and quasiconcave, upper semi-continuous in y: Assume that (*) There exists a finite set N  D such that supy2N f .x; y/ ! C1 as x 2 C; kxk ! C1:

2.12 Minimax and Saddle Points

73

Then there holds the minimax equality inf sup f .x; y/ D sup inf f .x; y/:

x2C y2D

y2D x2C

(2.54)

Proof Since the inequality infx2C supy2D f .x; y/  sup inf f .x; y/ is trivial, it suffices y2D x2C

to show the reverse inequality: inf sup f .x; y/  sup inf f .x; y/:

x2C y2D

y2D x2C

(2.55)

Let WD supy2D infx2C f .x; y/: If D C1 then (2.55) is obvious, so we can assume

< C1: For an arbitrary ˛ > define C˛ .y/ D fx 2 Cj f .x; y/  ˛g: Since supy2D infx2C < ˛; we have C˛ .y/ ¤ ; 8y 2 D: If we can show that \

C˛ .y/ ¤ ;;

(2.56)

y2D

i.e., there is x 2 C satisfying f .x; y/  ˛ for all y 2 D; then infx2C supy2D  ˛ and since this is true for every ˛ > it will follow that infx2C supy2D  ; proving (2.55). Thus, all is reduced to establishing (2.56). This will be done in three stages. To simplify the notation, from now on we shall omit the subscript ˛ and write simply C.a/; C.b/; etc. . . I. Let us first show that for every pair a; b 2 D the two sets C.a/ and C.b/ intersect. Assume the contrary, that C.a/ \ C.b/ D ;:

(2.57)

Consider an arbitrary  2 Œ0; 1 and let y D .1  /a C b: If x 2 C.y /; i.e., f .x; y /  ˛ then minff .x; a/; f .x; b/g  f .x; y /  ˛ by quasiconcavity of f .x; :/; hence, either f .x; a/  ˛ or f .x; b/  ˛: Therefore, C.y /  C.a/ [ C.b/:

(2.58)

Since C.y / is convex it follows from (2.58) that C.y / cannot meet both sets C.a/ and C.b/ which are disjoint by assumption (2.57). Consequently, for every  2 Œ0; 1; one and only one of the following alternatives occurs: .a/ C.y /  C.a/I

.b/ C.y /  C.b/:

74

2 Convex Functions

Denote by Ma .Mb ; resp.) the set of all  2 Œ0; 1 satisfying (a) (satisfying (b), resp.). Clearly 0 2 Ma ; 1 2 Mb ; Ma [ Mb D Œ0; 1 and, analogously to (2.58): C.y /  C.y1 / [ C.y2 / 8 2 Œ1 ; 2 :

(2.59)

Therefore,  2 Ma implies Œ0;   Ma ; and  2 Mb implies Œ; 1  Mb : Let s D sup Ma D inf Mb and assume, for instance, that s 2 Ma (the argument is similar if s 2 Mb /: We show that (2.57) leads to a contradiction. Since ˛ >  infx2C f .x; ys /; we have f .x; ys / < ˛ for some x 2 C: By upper semi-continuity of f .x; :/ there is " > 0 such that f .x; ysC" / < ˛ and so x 2 C.ysC" /: But x 2 C.ys /  C.a/; hence C.ysC" /  C.a/; i.e., s C " 2 Ma ; contradicting the definition of s: Thus (2.57) cannot occur and we must have C.a/ \ C.b/ ¤ ; for all a; b 2 C; ˛ < : II. We now show that any finite collection C.y1 /; : : : ; C.yk / with y1 ; : : : ; yk 2 D has a nonempty intersection. By (I) this is true for k D 2: Assuming this is true for k D h1 let us consider the case k D h: Set C0 D C.yh /; C0 .y/ D C0 \C.y/: From part I we know that C0 .y/ ¤ ; for every y 2 D: This means that for all ˛ > W 8y 2 D 9x 2 C0 f .x; y/  ˛; so that supy2D infx2C0 f .x; y/  ˛: Since D supy2D infx2C f .x; y/  supy2D infx2C0 f .x; y/; it follows that D supy2D infx2C0 f .x; y/: So all the hypotheses of the theorem still hold when C is replaced by C0 : It then follows from the induction assumption that the sets C0 .y1 /; : : : ; C0 .yh1 / have a nonempty intersection. Thus the family fC˛ .y/; y 2 Dg has the finite intersection property. III. Finally, for every y 2 D let C .y/ D fx 2 Cj f .x; y/  ˛; supz2N f .x; z/  ˛g: Then C .y/  CN WD fx 2 Cj supz2N f .x; z/  ˛g and the set CN is compact because if it were not so there would exist a sequence xk 2 CN such that kxk j ! C1; contradicting assumption (*). On the other hand, for any finite set E  D clearly \y2E C .y/ D \y2E[N C.y/; so by part II \y2E C .y/ ¤ ;; i.e., the family fC .y/; y 2 Dg has the finite intersection property. Since every C .y/ is a subset of the compact set CN it follows that \y2D C˛ .y/ D \y2D C .y/ ¤ ;; i.e. (2.56) must hold. t u Remark 2.2 In view of the symmetry in the roles of x; y Theorem 2.7 still holds if instead of condition (*) one assumes that (!) There exists a finite set M  C such that infx2M f .x; y/ ! 1 as y 2 D; kyk ! C1: The proof is analogous, using D˛ .x/ WD fy 2 Dj f .x; y/  ˛g with ˛ > WD infx2C supy2D f .x; y/ (instead of C˛ .y/ with ˛ < WD supy2D infx2C f .x; y/) and proving that \x2C D˛ .x/ ¤ ;:

2.12 Minimax and Saddle Points

75

2.12.2 Saddle Point of a Function A pair .x; y/ 2 C  D is called a saddle point of the function f .x; y/ W C  D ! R if f .x; y/  f .x; y/  f .x; y/

8x 2 C; 8y 2 D:

(2.60)

This means min.x; y/ D f .x; y/ D max f .x; y/; x2C

y2D

so in a neighborhood of the point .x; y; f .x; y// the set epif WD f.x; y; t/j .x; y/ 2 C  D; t 2 R; f .x; y/  tg reminds the image of a saddle (or a mountain pass). Proposition 2.35 A point .x; y/ 2 C  D is a saddle point of f .x; y/ W C  D ! R if and only if max inf f .x; y/ D min sup f .x; y/: y2D x2C

x2C y2D

(2.61)

Proof We have .2.60/ , supy2D f .x; y/  f .x; y/  infy2C f .x; y/ , infx2C supy2D f .x; y/  supy2D f .x; y/  f .x; y/  infx2C f .x; y/  supy2D infx2C f .x; y/: Since always infx2C supy2D f .x; y/  supy2D infx2C f .x; y/ we must have equality everywhere in the above sequence of inequalities. Therefore, (2.60) must be equivalent to (2.61). t u As a consequence of Theorem 2.7 and Remark 2.2 we can now state Proposition 2.36 Assume that C  Rn ; D  Rm are nonempty closed convex sets and the function f .x; y/ W C  D ! R is quasiconvex l.s.c in x and quasiconcave u.s.c. in y: If both the following conditions hold: (i) There exists a finite set N  D such that supy2N f .x; y/ ! C1 as x 2 C; kxk ! C1I (ii) There exists a finite set M  C such that infx2M f .x; y/ ! C1 as y 2 D; kyk ! C1I then there exists a saddle point .x; y/ for the function f .x; y/: Proof If (i) holds, we have minx2C supy2D f .x; y/ D supy2D infx2C f .x; y/ by Theorem 2.7. If (ii) holds, we have infx2C supy2D f .x; y/ D maxy2D infx2C f .x; y/ by Remark 2.2. Hence the condition (2.61). t u

76

2 Convex Functions

2.13 Convex Optimization We assume that the reader is familiar with convex optimization problems, i.e., optimization problems of the form minff .x/j gi .x/  0 .i D 1; : : : ; m/; hj .x/ D 0 .j D 1; : : : ; p/; x 2 ˝g;

(2.62)

where ˝  Rn is a closed convex set, f ; g1 ; : : : ; gm are convex functions finite on an open domain containing ˝; h1 ; : : : ; hm are affine functions. In this section we study a generalization of problem (2.62), where the convex inequality constraints are understood in a generalized sense.

2.13.1 Generalized Inequalities a. Ordering Induced by a Cone A cone K  Rn induces a partial ordering K on Rn such that x K y , y  x 2 K: We also write x K y to mean y K x: In the case K D RnC this is the usual ordering x  y , xi  yi ; i D 1; : : : ; m: The following properties of the ordering K are straightforward: (i) (ii) (iii) (iv)

transitivity: x K y; y K z ) x K zI reflexivity: x K x 8x 2 Rn I preservation under addition: x K y; x0 K y0 ) x C x0 K y C y0 I preservation under nonnegative scaling: x K y; ˛ > 0 ) ˛x K ˛y:

The ordering induced by a cone K is of particular interest when K is closed, solid (i.e., has nonempty interior), and pointed (i.e., contains no line: x 2 K ) x … K/: In that case a relation x K y is called a generalized inequality. We also write x K y to mean that y  x 2 intK and call such a relation a strict generalized inequality. Generalized inequalities enjoy the following important properties: (i) (ii) (iii) (iv) (v) (vi)

x K y; y K x ) x D yI x 6 K xI xk K yk ; xk ! x; yk ! y ) x K yI If x K y then for u; v small enough x C u K y C vI x K y; u K ) x C u K y C vI The set fzj x K z K yg is bounded.

For instance, (ii) is true because x K x implies 0 D x  x 2 intK; which is impossible as K is pointed; (vi) holds because the set E D fzj x K z K yg

2.13 Convex Optimization

77

is closed, so if there exists fzk g  E; kzk k ! C1 then, up to a subsequence, k k zk =kzk k ! u with kuk D 1 and since kzzk k 2 kzxk k C K; kzzk k 2 kzxk k  K; letting k ! C1 yields u 2 K; u 2 K; hence by (i) u D 0 conflicting with kuk D 1: b. Dual Cone The dual cone of a cone K is by definition the cone K  D fy 2 Rn j hx; yi  0 8x 2 Kg: Note that K  D K o ; where K o is the polar of K (Sect. 1.8). As can easily be seen: (i) (ii) (iii) (iv)

K  is a closed convex cone; K  is also the dual cone of the closure of KI .K  / D clKI If x 2 intK then hx; yi > 0 8y 2 K  n f0g:

A cone K is said to be self-dual if K  D K: Clearly the orthant RnC is a self-dual cone. Lemma 2.1 If K is a closed convex cone then y … K , 9 2 K  h; yi < 0: Proof By (iii) above K D .K  / so y … K if and only if y … .K  / ; hence if and only if there exists  2 K  satisfying h; yi < 0: t u

c. K-Convex Functions Given a convex cone K  Rm inducing an ordering K on Rm a map g W Rn ! Rm is called K-convex if for every x1 ; x2 2 Rn and 0  ˛  1 we have the generalized inequality g.˛x1 C .1  ˛/x2 / K ˛g.x1 / C .1  ˛/g.x2 /:

(2.63)

For instance, the map g W Rn ! Rm is Rm C -convex (or component-wise convex) if each function gi .x/; i D 1 : : : ; m; is convex. m  Lemma 2.2 If g W Rn ! PmR is a K-convex map then for every  2 K the function h; g.x/i D iD1 i gi .x/ is convex in the usual sense and the set fx 2 Rn j g.x/ K 0g is convex.

Proof For every x1 ; x2 2 Rn , we have by (2.63) ˛g.x1 / C .1  ˛/g.x2 /  g.˛x1 C .1  ˛/x2 / 2 K:

78

2 Convex Functions

Since  2 K  it follows that h; ˛g.x1 / C .1  ˛/g.x2 /  g.˛x1 / C .1  ˛/g.x2 /i  0; hence ˛h; g.x1 /i C .1  ˛/h; g.x1 /i  h; g.˛x1 / C .1  ˛/g.x2 /i; proving that the function h; g.x/i is convex. Further, by (2.2), if g.x1 / K 0; g.x2 / K 0 then g.˛x1 / C .1  ˛/x2 / K ˛g.x1 / C .1  ˛/g.x2 / K 0; proving that the set fxj g.x/ K 0g is convex.

t u

2.13.2 Generalized Convex Optimization A generalized convex optimization problem is a problem of the form minff .x/j gi .x/ Ki 0 .i D 1; : : : ; m/; h.x/ D 0; x 2 ˝g;

(2.64)

where f W Rn ! R is a convex function, Ki is a closed, solid, pointed convex cone in Rsi ; gi W Rn ! Rsi ; i D 1; : : : ; m; is Ki -convex, finite on the whole Rn ; h W Rn ! Rp is an affine map, and ˝  Rn is a closed convex set. By Lemma 2.2 the constraint set of this problem is convex, so this is also a problem of minimizing a convex function over a convex set. Let i 2 Ki be the Lagrange multiplier associated with the generalized inequality gi .x/ Ki 0 and  2 Rp the Lagrange multiplier associated with the equality h.x/ D 0: So the Lagrangian of the problem is the function L.x; ; / WD f .x/ C

m X hi ; gi .x/i C h; h.x/i; iD1

where i 2 Ki ; i D 1; : : : ; m; and  2 Rp : The dual Lagrange function is '.; / D inf L.x; ; / D inf ff .x/ C x2˝

x2˝

m X

hi ; gi .x/i C h; h.x/ig:

ID1

Since '.; / is the lower envelope of a family of affine functions in .; /; it is a concave function. Setting K  D K1      Km ;  D .1 ; : : : ; m / we can show that sup

2K  ;2Rp

L.x; ; / D

f .x/ if gi .x/ Ki 0 .i D 1; : : : ; m/; h.x/ D 0I C1 otherwise

2.13 Convex Optimization

79

In fact, if gi .x/ Ki 0 .i D 1; : : : ; m/; h.x/ D 0 then the supremum is attained for  D 0;  D 0 and equals f .x/: On the other hand, if h.x/ ¤ 0 there is  2 Rp satisfying h; h.x/i > 0 and for  D 0 the supremum is equal to sup >0 ff .x/ C h ; h.x/ig D C1: If there is an i D 1; : : : ; m with gi .x/ 6 Ki 0 then by Lemma 2.1 there is i 2 Ki satisfying hi ; gi .x/i > 0; hence for  D 0; j D 0 8j ¤ i; the supremum equals sup >0 ff .x/ C hi ; gi .x/ig D C1: Thus, the problem (2.64) can be written as inf

sup

x2˝ 2K  ;2Rp

L.x; ; /:

The dual problem is sup

inf L.x; ; /;

2K  ;2Rp x2˝

that is, sup

2K  ;2Rp

'.; /:

(2.65)

Theorem 2.8 (i) (weak duality) The optimal value in the dual problem never exceeds the optimal value in the primal problem (2.64). (ii) (strong duality) The optimal values in the two problems are equal if the Slater condition holds, i.e., if 9x 2 int˝

h.x/ D 0; gi .x/ Ki 0; i D 1; : : : ; m:

Proof (i) is straightforward, we need only prove (ii). By Lemma 2.2 for every i 2 Ki the function hi ; gi .x/i is convex (in the usual sense), finite on Rn and hence continuous. So L.x; ; / is a convex continuous function in x 2 ˝ for every fixed .; / 2 D WD K1   Km Rp and affine in .; / 2 D for every fixed x 2 h.˝/: Since h W Rn ! Rp is an affine map, without loss of generality it can be assumed that h.˝/ D Rp : Since h.x/ D 0 and x 2 int˝, we have 0 2 int˝; so for every j D 1; : : : ; p there is an aj 2 ˝ such that h.aj / has its j-th component equal to 1, and all other components equal to 0. Then for sufficiently small " > 0, we have xi WD x C ".aj  x/ 2 ˝; xO j WD x  ".aj  x/ 2 ˝; and so gi .xj / < 0 .i D 1; : : : ; m/; hj .xj / > 0; hj .xj / D 0 8i ¤ j gi .Oxj / < 0 .i D 1; : : : ; m/; hj .Oxj / < 0; hi .Oxj / D 0 8i ¤ j: Let M D fxj ; xO j ; j D 1; : : : ; pg: For every .; / 2 D n f0g we can write p m X X .; / WD minŒ i gi .x/ C j hj .x/ < 0 x2M

iD1

jD1

80

2 Convex Functions

p hence, WD maxf.; /j  2 Rm C ;  2 R ; kk C kk D 1g < 0: Consequently,

min L.x; .; //  max f .x/ C minf x2M

x2M

x2M

m X

i gi .x/ C

iD1

p X

j hj .x/g

jD1

 max f .x/ C .kk C kk/ ! 1 x2M

as .; / 2 D; kk C kk ! C1: So the function L.x; .; // satisfies condition (!) in Remark 2.2 with C D ˝; y D .; /: By virtue of this Remark, inf sup L.x; .; // D max inf L.x; .; //; .;/2D x2˝

x2˝ .;/2D

t u

completing the proof of (ii).

The difference between the optimal values in the primal and dual problems is called the duality gap. Theorem 2.8 says that for problem (2.64) the duality gap is never negative and is zero if Slater condition is satisfied.

2.13.3 Conic Programming Let K  Rm be a closed, solid, and pointed convex cone, let c 2 Rn and let A be an m  n matrix, and b 2 Rm : Then the following optimization problem: minfhc; xij Ax K bg

(2.66)

is called a conic programming problem. Since A.˛x1 C .1  ˛/x2 /  Œ˛Ax1 C .1  ˛/Ax2  D 0 2 K; i.e., A.˛x1 C .1  ˛/x2 / K ˛Ax1 C .1  ˛/Ax2 ; for all x1 ; x2 2 Rn ; 0  ˛  1; the map x 7! b  Ax is K-convex. So a conic program is nothing but a special case of the above considered problem (2.64) when m D 1; K1 D K; f .x/ D hc; xi; g1 .x/ D b  Ax; h 0; ˝ D Rn : The Lagrangian of problem (2.66) is L.x; y/ D hc; xi C hy; b  Axi D hc  AT y; xi C hb; yi .y K 0/: But, as can easily be seen, infn L.x; y/ D

x2R

hb; yi if AT y D x 1 otherwise

2.14 Semidefinite Programming

81

so the dual of the conic programming problem (2.66) is the problem maxfhb; yij AT y D c; y K 0g:

(2.67)

Clearly linear programming is a special case of conic programming when K D RnC . However, while strong duality always holds for linear programming (except only when both the primal and the dual problems are infeasible), it is not so for conic programming. By Theorem 2.8 a sufficient condition for strong duality in conic programming is 9x

Ax K b:

2.14 Semidefinite Programming 2.14.1 The SDP Cone Given a symmetric n  n matrix A D Œaij  (i.e., a matrix A such that AT D A) the trace of A; written Tr(A), is the sum of its diagonal elements: Tr.A/ WD

n X

aii :

iD1

From the definition it is readily seen that Tr.˛A C ˇB/ D ˛Tr.A/ C ˇTr.B/;

Tr.AB/ D Tr.BA/

Tr.A/ D 1 C    C n where i ; i D 1; : : : ; n; are the eigenvalues of A; the latter equality being derived from the development of the characteristic polynomial det.In  A/. Consider now the space Sn of all n  n symmetric matrices. Using the concept of trace we can introduce an interior product1 in Sn defined as follows: hA; Bi D Tr.AB/ D

X

aij bij D vec.A/T vec.B/;

i;j

where vec.A/ denotes the n  n column vector whose elements are elements of the matrix A in the order a11 ;    ; a1n ; a21 ;    ; a2n ;    ; an1 ; ann : The norm of a matrix A associated with this inner product is the Frobenius norm given by

1

Sometimes also written as A  B and called dot product.

82

2 Convex Functions

0 kAkF D .hA; Ai/1=2 D .Tr.AT A/1=2 D @

X

11=2 a2ij A

:

i;j

The set of semidefinite positive matrices X 2 Sn forms a convex cone SnC called the SDP cone (semidefinite positive cone). For any two matrices A; B 2 Sn we write A B to mean that B  A is semidefinite positive. In particular, A 0 means A is a semidefinite positive matrix. Lemma 2.3 The cone SnC is closed, convex, solid, pointed, and self-dual, i.e., SnC D .SnC / : Proof We prove only the self-dual property. By definition .SnC / D fY 2 SnC j Tr.YX/  0 8X 2 SnC g: Let Y 2 SnC . For every X 2 SnC , we have Tr.YX/ D Tr.YX 1=2 X 1=2 / D Tr.X 1=2 YX 1=2 /  0; where the last inequality holds because the matrix X 1=2 YX 1=2 is semidefinite positive. Therefore, SnC  .SnC / : Conversely, let Y 2 .SnC / : For any u 2 Rn , we have Tr.YuuT / D

n X iD1

y1i ui u1 C    C

n X iD1

yni ui un D

n X

yij ui uj D uT Yu:

i;jD1

The matrix uuT is obviously semidefinite positive, while the trace of YuuT is nonnegative by assumption, so the product uT Yu is nonnegative. Since u is arbitrary, this means Y 2 SnC : Hence, .SnC /  SnC : t u A map F W Rn ! Sm is said to be convex, or more precisely, convex with respect n to matrix inequalities if it is Sm C -convex, i.e., such that for every X; Y 2 S and 0t1 F.tX C .1  t/Y/ tF.X/ C .1  t/F.Y/: For instance, the map X 7! X 2 is convex because for every u 2 Rm the function uT X 2 u D kXuk2 is convex quadratic with respect to the components of X and so uT .X C .1  /X/2 u  uT X 2 u C .1  /uT X 2 u; which implies .X C .1  /X/2 X 2 C .1  /X 2 : Analogously, the function X 7! XX T is convex.

2.14 Semidefinite Programming

83

2.14.2 Linear Matrix Inequality A linear matrix inequality (LMI) is a generalized inequality A 0 C x1 A 1 C    C xn A n 0 where x 2 Rn is the variable and Ai 2 Sp ; i D 0; 1; : : : ; n; are given p  p symmetric p matrices. The inequality sign is understood with respect to the cone SC W the notation the matrix P is semidefinite negative. Obviously, A.x/ WD P P 0 means p A0 C nkD1 xk Ak 2 SC and each element of this matrix is an affine function of x W A.x/ D Œaij .x/; aij .x/ D a0ij C

n X

akij xk :

kD1

Therefore an LMI can also be defined as an inequality of the form A.x/ 0; where A.x/ is a square symmetric matrix whose every element is an affine function of x: By definition A.x/ 0 , hy; A.x/yi  0 8y 2 Rp ; so setting C WD fx 2 Rn j A.x/ 0g, we have C D \y2Rp fx 2 Rn j hy; A.x/yi  0g: Since for every fixed y the set fx 2 Rn j hy; A.x/yi  0g is a halfspace, we see that the solution set C of an LMI is a closed convex set. In other words, an LMI is nothing but a specific convex inequality which is equivalent to an infinite system of linear inequalities. Obviously, the inequality A.x/ 0 is also an LMI, determining a convex constraint for x: Furthermore, a finite system of LMI’s of the form A.1/ .x/ 0; : : : ; A.m/ .x/ 0 can be equivalently written as the single LMI Diag.A.1/ .x/;    ; A.m/ .x// 0:

84

2 Convex Functions

2.14.3 SDP Programming A semidefinite program .SDP/ is a problem of minimizing a linear function under an LMI constraint, i.e., a problem of the form .SDP/

minfhc; xij A0 C x1 A1 C : : : C xn An 0g:

Clearly this is a special case of generalized convex optimization. Specifically, .SDP/ can be rewritten in the form (2.64), with m D 1; f .x/ D hc; xi; g1 .x/ D A0 C x1 A1 C p : : : C xn An ; K1 D SC : To form the dual problem to .SDP/ we associate with the LMI constraint a dual p p variable Y 2 .SC / D SC (see Lemma 2.3), so the Lagrangian is L.x; Y/ D cT x C Tr.Y.A0 C x1 A1 C    C xn An //: The dual function is '.Y/ D inffL.x; Y/j x 2 Rn g: Since L.x; Y/ is affine in x it is unbounded below, except if it is identically zero, i.e., if ci C Tr.YAi / D 0; i D 1; : : : ; n; in which case L.x; Y/ D Tr.A0 Y/: Therefore, '.Y/ D

Tr.A0 Y if Tr.Ai Y/ C ci D 0; i D 1; : : : ; n 1 otherwise

Consequently, the dual of .SDP/ is .SDD/

maxfTr.A0 Y/j Tr.Ai Y/ C ci D 0; i D 1; : : : ; n; Y D Y T 0g:

Writing this problem in the form maxfhA0 ; Yij hAi ; Yi D ci ; i D 1; : : : ; ; Y 0g

(2.68)

we see that .SDP/ reminds a linear program in standard form. By Theorem 2.8 strong duality holds for .SDP/ if Slater condition is satisfied 9x

A 0 C x1 A 1 C    C xn A n :

(2.69)

2.15 Exercises 1 Let f .x/ be a convex function and C D domf  Rn : Show that for all x1 ; x2 2 C and  … Œ0; 1 such that x1 C .1  /x2 2 C W f .x1 C .1  /x2 /  f .x1 / C .1  /f .x2 /:

2.15 Exercises

85

2 A real-valued function f .t/; 1 < t < C1; is strictly convex (cf Remark 2.1) if it has a strictly monotonically increasing derivative f 0 .t/: Apply this result to f .t/ D et and show that for any positive numbers ˛1 ; : : : ˛k I k Y iD1

!1=k ˛i

1X ˛i k iD1 k



with equality holding only when ˛1 D : : : D ˛k : 3 A function f .x/ is strongly convex (cf Sect. 2.9) on a convex set C if and only if there exists r > 0 (modulus of strong convexity) such that f .x/  rkxk2 is convex. Show that: f ..1  /x1 C x2 /  .1  /f .x1 / C f .x2 /  .1  /rkx1  x2 k2 for all x1 ; x2 2 C and  … Œ0; 1 such that x1 C .1  /x2 2 C: 4 Show that if f .x/ is strongly convex on Rn (with modulus of strong convexity r/ then for any x0 2 Rn and p 2 @f .x0 / W f .x/  f .x0 /  hp; x  x0 i C rkx  x0 k2 8x 2 Rn : 5 Notations being the same as in Exercise 4, show that for any x1 ; x2 2 Rn and p1 2 @f .x1 /; p2 2 @f .x2 / W hp1  p2 ; x1  x2 i  rkx1  x2 k2 : 6 If f .x/ is strongly convex on a convex set C then for any x0 2 C the level set fx 2 Cj f .x/  f .x0 /g is bounded. 7 Let C be a nonempty convex set in Rn ; f W C ! R a convex function, Lipschitzian with constant L on C: The function F.x/ D infff .y/ C Lkx  ykj y 2 Cg is convex, Lipschitzian with constant L on the whole space and satisfies F.x/ D f .x/ 8x 2 C: 8 Show that if @" f .x0 / is a singleton for some x0 2 domf and " > 0; then f .x/ is an affine function. 9 For a proper convex function f W p 2 @f .x0 / if and only if .p; 1/ is an outward normal to the set epif at point .x0 ; f .x0 //: 10 Let M be a nonempty set in Rn ; h W M ! R an arbitrary function, E an n  n matrix. The function

86

2 Convex Functions

'.x/ D maxfhx; Eyi  h.y/j y 2 Mg is convex and for every x0 2 dom'; if y0 2 argmaxy2M fhx0 ; Eyi  h.y/g then Ey0 2 @'.x0 /: 11 Let f W Rn ! R be a convex function, X a closed convex subset of Rn ; A 2 Rmn ; B 2 Rmp ; c 2 Rm : Show that the function '.y/ D minff .x/j Ax C By  c; x 2 Xg is convex and for every y0 2 dom'; if  is a Kuhn–Tucker vector for the problem minff .x/j Ax C By0  c; x 2 Xg then the vector BT  is a subgradient of ' at y0 :

Chapter 3

Fixed Point and Equilibrium

3.1 Approximate Minimum and Variational Principle It is well known that a real-valued function f W Rn ! R which is l.s.c. on a nonempty closed set X  Rn attains a minimum over X if X is compact. This is true, more generally, if f is coercive on X; i.e., if f .x/ ! C1 as x 2 X; kxk ! C1:

(3.1)

In fact, for an arbitrary a 2 X the set C D fx 2 Xj f .x/  f .a/g is closed and if it were unbounded there would exist a sequence fxk g  X such that f .xk /  f .a/; kxk k ! C1 while by (3.1) this would imply f .xk / ! C1; a contradiction. So C is compact and the minimum of f .x/ over C; i.e., the minimum of f .x/ on X; exists. If condition (3.1) fails, f .x/ may not have a minimum over X: In that case one must be content with the following concept of approximate minimum: Given " > 0 a point x" is called an "-approximate minimizer of f .x/ on X if inf f .x/  f .x" /  inf f .x/ C ":

x2X

x2X

(3.2)

An "-approximate minimizer on X always exists if f .x/ is bounded below on X: For differentiable functions f W Rn ! R a minimizer x of f .x/ on Rn satisfies rf .x/ D 0: Is there some similar condition for an "-approximate minimizer? The next propositions give an answer to this question. Theorem 3.1 (Ekeland Variational Principle (Ekeland 1974)) Let f W X ! R [ fC1g be a function on a closed set X  Rn which is finite at at least one point, l.s.c. and bounded below on X: Let " > 0 be given. Then for every "-approximate minimizer x" of f and every  > 0 there exists x satisfying

© Springer International Publishing AG 2016 H. Tuy, Convex Analysis and Global Optimization, Springer Optimization and Its Applications 110, DOI 10.1007/978-3-319-31484-6_3

87

88

3 Fixed Point and Equilibrium

f .x /  f .x" /I 

f .x /  f .x/ C

kx  x" k  I "  kx

(3.3)



 x k 8x 2 X:

(3.4)

Proof We follow the elegant proof in Hiriart-Urruty (1983). Clearly the function g.x/ D f .x/ C " kx  x" k is l.s.c. and coercive on X; so it has a minimizer x on X; i.e., a point x 2 X such that f .x / C

"  " kx  x" k  f .x/ C kx  x" k  

8x 2 X:

(3.5)

Letting x D x" yields f .x / C

"  kx  x" k  f .x" /; 

(3.6)

hence f .x /  f .x" /: Further, infx2X f .x/ C " kx  x" k  f .x / C " kx  x" k  f .x" /  infx2X f .x/ C " by the definition of x" , hence kx  x" k  : Finally, from (3.5) it follows that " fkx  x" k  kx  x" kg  "  f .x/ C kx  x k 8x 2 X: 

f .x /  f .x/ C

t u

Note that by (3.3) x is also an "-approximate minimizer which is not too distant from x" ; moreover by (3.4) x is an exact minimizer of the function f .x/C " kxx k which differs from f .x/ just by a ‘perturbation’ quantity " kx  x k: The larger  the smaller this perturbation quantity and the closer x to an exact minimizer, but x may be farther from x"p : In practice, the choice of  depends on the circumstances. Usually the value  D " is suitable. Corollary 3.1 Let f W Rn ! R be a differentiable function which is bounded below on Rn : For every "-approximate minimizer x" of f there exists an "-approximate minimizer x such that kx  x" k 

p

";

krf .x /k 

p ";

p Proof From (3.4) for x D x C tu; u 2 Rn ; t < 0 and  D " we have f .x /  p p   f .x Ctu/f .x / f .x C tu/ C "ktuk:  "kuk; p Therefore, n t p and letting t ! C1  yields hrf .x /; ui  "kuk 8u 2 R ; hence krf .x /k  ": t u Let X; Y be two sets in Rn ; Rm , respectively. A set-valued map f from X to Y is a map which associates with each point x 2 X a set f .x/  Y called the value of f at x: Theorem 3.1 implies the following fixed point theorem due to Caristi (1976):

3.2 Contraction Principle

89

Theorem 3.2 Let P be a set-valued map from a closed set X  Rn into itself and let f W X ! Œ0; C1 be a l.s.c. function finite at at least one point. If .8x 2 X/.9y 2 P.x//

kx  yk  f .x/  f .y/

(3.7)

then P has a fixed point, i.e., a point x 2 P.x /: Proof Let x be the point that exists according to Theorem 3.1 for " D 1 < : We k have f .x /  f .x/C kxx 8x 2 X; so f .x / < f .x/Ckxx k 8x 2 X nfx g: On the  other hand, by hypothesis there exists y 2 P.x / satisfying kx  yk  f .x /  f .y/; i.e., f .x /  f .y/ C ky  x k: This implies y D x ; so x 2 P.x /: t u It can be shown that Theorem 3.2 in turn implies Theorem 3.1 (Exercise 1), so these two theorems are in fact equivalent. Heuristically, a set-valued map P with values P.x/ ¤ ; can be conceived of as a dynamic system whose potential function value at position x is f .x/: The assumption (3.7) means that when the system moves from position x to position y its potential diminishes at least by a quantity equal to the distance from x to y: The fixed point x corresponds to an equilibrium—a position from where no move to a position with lower potential is possible.

3.2 Contraction Principle For any set A  Rn and ı > 0 define d.x; A/ WD infy2A kx  yk; Aı WD fx 2 Rn j d.x; A/  ıg: If A; B are two subsets of Rn the number d.A; B/ WD inffı > 0j A  Bı ; B  Aı g is called the Hausdorff distance between A and B: From this definition it readily follows that: if d.A; B/  ı then d.x; B/  ı 8x 2 A (because A  Bı ) and d.y; A/  ı 8y 2 B (because B  Aı ). A set-valued map P from a set X  Rn to Rn is said to be a contraction if there exists ; 0 < < 1; such that .8x; x0 2 X/

d.P.x/; P.x0 //  kx  x0 k:

(3.8)

Lemma 3.1 If a set-valued map P from a closed set X  Rn to X is a contraction then the function f .x/ D d.x; P.x/ is l.s.c. on X: Proof We have to show that if d.xk ; P.xk //  ˛; kxk x0 j ! 0 then d.x0 ; P.x0 //  ˛: By definition of d.xk ; P.xk // we can select yk 2 P.xk / such that kxk  yk k  d.xk ; P.xk // C 1=k: Then d.yk ; P.x0 //  d.P.xk /; P.x0 //  kxk  x0 k and therefore d.x0 ; P.x0 //  kx0  xk k C kxk  yk k C d.yk ; P.x0 //

90

3 Fixed Point and Equilibrium

 kx0  xk k C d.xk ; P.xk // C 1=k C kxk  x0 k  kx0  xk k C ˛ C 1=k C kxk  x0 k ! ˛ as k ! C1: Consequently, d.x0 ; P.x0 //  ˛:

t u

The next theorem extends the well-known Banach contraction principle to setvalued maps. Theorem 3.3 (Nadler 1969) Let P be a set-valued map from a closed set X  Rn into itself, such that P.x/ is a nonempty closed set for every x 2 X: If there exists a number ; 0 < < 1; satisfying (3.8) then for every a 2 X and every ˛ 2 . ; 1/ there exists a point x 2 P.x / such that kx  ak  kaP.a/k : ˛ Proof By Lemma 3.1 the function f .x/ D d.x; P.x// is l.s.c., so the set E WD fx 2 Xj .˛  /kx  ak  f .a/  f .x/g is closed and nonempty (as a 2 E). We show that .8x 2 E/.9y 2 P.x/ \ E/

.˛  /kx  yk  f .x/  f .y/:

(3.9)

Let x 2 X: Since ˛ < 1 by definition of d.x; P.x// there exists y 2 P.x/ satisfying ˛kx  yk  d.x; P.x//:

(3.10)

On the other hand, since y 2 P.x/ it follows from (3.8) that d.y; P.y//  d.P.x/; P.y//  kx  yk:

(3.11)

Adding (3.10) and (3.11) yields ˛kx  yk C d.y; P.y//  d.x; P.x// C kx  yk; hence .˛  /kx  yk  f .x/  f .y/; proving (3.9). Finally, since x 2 E means .˛  /kx  ak  f .a/  f .x/; we have .˛  /ky  ak  .˛  /.kx  yk C kx  ak/  f .x/  f .y/ C f .a/  f .x/ D f .a/  f .y/: So y 2 E and all the conditions in Caristi Theorem 3.2 are satisfied for the set E; the map P.x/ \ E and the function f .x/ D d.x; P.x//: Therefore, there exists .x / f .a/ x 2 P.x / \ E;, i.e., x 2 P.x / and kx  ak  f .a/f D ˛ by noting that ˛    f .x / D 0 as x 2 P.x /: t u

3.3 Fixed Point and Equilibrium

91

3.3 Fixed Point and Equilibrium Solving an equation f .x/ D 0 amounts to finding a fixed point of the mapping F.x/ D x  f .x/: This explains the important role of fixed point theorems in analysis and optimization. In the preceding section we have already encountered two fixed point theorems: Caristi theorem which is equivalent to Ekeland variational principle and Nadler contraction theorem which is an extension of Banach fixed point principle. In this section we will study topological fixed points and their connection with the theory of equilibrium. The foundation of topological fixed point theorems lies in a famous proposition due to the three Polish mathematicians B. Knaster, C. Kuratowski, and S. Mazurkiewicz (Knaster et al. 1929) and for this reason called the KKM Lemma.

3.3.1 The KKM Lemma Let S WD Œa0 ; a1 ; : : : ; an  be an n-simplex spanned by the vertices a0 ; a1 ; : : : ; an : For every set I  f0; 1; : : : ; ng the set convfaij i 2 Ig is a face of S; more precisely the face spanned by the vertices ai ; i 2 I; or the face opposite the vertices aj ; j … I: A face opposite a vertex ai is also called a facet. For example, Œa1 ; : : : ; an  is the facet opposite a0 : A simplicial subdivision of S is a decomposition of it into a finite number of n-simplices such that any two simplices either do not intersect or intersect along a common face. A simplicial subdivision (briefly, a subdivision) of S can be constructed inductively as follows. For a 1-simplex Œa0 ; a1  the subdivision is formed by two simplices Œa0 ; 12 .a0 C 1 a / and Œ 12 .a0 C a1 /; a1 : Suppose subdivisions have been defined for k-simplices with k < n: The subdivision of the n-simplex Œa0 ; a1 ; : : : ; an  consists of all nsimplices spanned each by a set of the form  [ fcg; where  is the vertex set of an 1 .n  1/-simplex in the subdivision of a facet of S and c D nC1 .a0 C a1 C : : : C an /: It can easily be checked that the required condition of a subdivision is satisfied, namely: any two simplices in the subdivision either do not intersect or intersect along a common face. The just constructed subdivision is referred to as subdivision of degree 1. By applying the subdivision operation to each simplex in a subdivision of degree 1 a more refined subdivision is obtained which is referred to as subdivision of degree 2, and so on. By continuing the process, subdivisions of larger and larger degrees are obtained. It is not hard to see that the maximal diameter of simplices in a subdivision of degree k tends to zero as k ! C1: Consider now an n-simplex S D Œa0 ; a1 ; : : : ; an  and a subdivision of S consisting of m simplices S1 ; S2 ; : : : ; Sm :

(3.12)

92

3 Fixed Point and Equilibrium

To avoid confusion we will refer to the simplices in (3.12) as subsimplices in the subdivision. Let k be the vertex set of Sk : The set V D [m kD1 k is actually the set of subdivision points of the subdivision. A map ` W V ! f0; 1; 2; : : : ; ng is called a labeling of the subdivision and a subsimplex Sk is said to be complete if `.k / D f0; 1; 2; : : : ; ng: Lemma 3.2 Let ` W V ! f0; 1; 2; : : : ; ng be a labeling such that if z 2 V belongs to a facet Œai0 ; ai1 ; : : : ; aik  of S then `.z/ 2 fi0 ; i1 ; : : : ; ik g: Then there exists an odd number of complete subsimplices. Proof The proposition is trivial if S is 0-dimensional, i.e., if S is a single point. Arguing by induction, suppose the proposition is true for any .n  1/-dimensional S and check it for an n-dimensional S D Œa0 ; a1 ; : : : ; an : Consider any subsimplex Sk : A facet of Sk will be called a fair facet if its vertex set U satisfies `.U/ D f1; : : : ; ng: Let k denote the number of fair facets of the subsimplex Sk : It can easily be seen that if Sk is complete then k D 1; otherwise k D 2. Therefore the number of complete P 0 subsimplices will be odd if so is the number  D m kD1 k : Let  be the number 1 of fair facets of subsimplices that lie entirely in the facet Œa ; : : : ; an  of S: Clearly each of these fair facets is in fact a complete subsimplex in the subdivision of the .n  1/-simplex Œa1 ; : : : ; an  (with respect to the labeling of Œa1 ; : : : ; an  induced by `/: By the induction hypothesis 0 is an odd number. The number of remaining fair facets of subsimplices (3.12) is then   0 : But each of these fair facets is shared by two subsimplices in (3.12), so it appears in two subsimplices in (3.12) and is counted twice in the sum : Therefore,   0 is even, and hence,  is odd. t u The above proposition, due to Sperner (1928), is often referred to as Sperner Lemma. We are now in a position to state and prove the KKM Lemma. Proposition 3.1 (KKM Lemma) Let fa0 ; a1 ; : : : ; an g be any finite subset of Rm : If L0 ; L1 ; : : : ; Ln are closed subsets of Rm such that for every set I  f0; 1; : : : ; ng convfai ; i 2 Ig  [i2I Li ;

(3.13)

then \niD0 Li ¤ ;: Proof First assume, as in the classical formulation of this proposition, that a0 ; a1 ; : : : ; an are affinely independent. The set S D Œa0 ; a1 ; : : : ; an  is then an n-simplex and for each I  f0; 1; : : : ; ng the set convfai; i 2 Ig is a face of S: Consider a simplicial subdivision of degree k of S: Let x be a subdivision point, i.e., a vertex of some subsimplex in the subdivision, and convfaij i 2 Ig be the smallest face of S containing x: By assumption x 2 [i2I Li ; so there exists i 2 I such that x 2 Li : We then set `.x/ D i: Clearly this labeling ` of the subdivision points satisfies the condition of Lemma 3.2. Therefore, there exists a subsimplex Œx0;k ; x1;k ; : : : ; xn;k  with `.xi;k / D i; i D 0; 1; : : : ; n: By compactness of S there exists

3.3 Fixed Point and Equilibrium

93

x 2 S such that, up to a subsequence, x0;k ! x : As k ! C1; the diameter of the simplex Œx0;k ; x1;k ; : : : ; xn;k  tends to zero, hence kx0;k  xi;k k ! 0; i.e., xi;k ! x ;

i D 0; 1; : : : ; n:

Since xi;k 2 Li 8k and the set Li is closed, we conclude x 2 Li ; i D 0; 1; : : : ; n: Turning to the general case when a0 ; a1 ; : : : ; an may not be affinely independent let K WD convfa0 ; a1 ; : : : ; an g and S WD Œe0 ; e1 ; : : : ; en  where ei is the i-th unit vector of Rn ; i.e., the vector such that eii D 1; eij D 0 8j ¤ i: Noting that for every x 2 S Pn P we have x D niD0 xi with Pxn i  0;i iD0 xi D 1; we can define the continuous map  W S ! K by .x/ D iD0 xi a : For each i let Mi WD fx 2 Sj .x/ 2 Li \ Kg: By continuity of  each Mi is a closed set and from the assumption (3.13) it can easily be deduced that for every I  f0; 1; : : : ; ng; convfei ; i 2 Ig  [i2I Mi : Since e0 ; e1 ; : : : ; en are affinely independent, by the above there exists x 2 Mi 8i D 0; 1; : : : ; n; hence .x/ 2 Li 8i D 0; 1; : : : ; n: t u Remark 3.1 To get insight into condition (3.13) consider the case when ai Dei ; iD0; 1; : : : ; n: Imagine that a sum of value 1 is to be distributed among n C 1 persons i D 0; 1; : : : ; n: A sharing program in which person i receives a share equal to xi can be represented by a point x D .x0 ; x1 ; : : : ; xn / of the simplex S D Œe0 ; e1 ; : : : ; en : It may happen that for each person not every sharing program is acceptable. Let Li denote the set of sharing programs x acceptable for person i (each Li may naturally be assumed to be a closed set). Then condition (3.13), i.e., convfei; i 2 Ig  [i2I Li simply states that for any set I  f0; 1; : : : ; n C 1g every sharing program in which only people in the group I have positive shares is acceptable for at least someone of them. The KKM Lemma says that under these natural assumptions there exists a sharing program acceptable for everybody. Thus a very simple common sense underlies this proposition whose proof, as we saw, is quite sophisticated.

3.3.2 General Equilibrium Theorem Given a nonempty subset C  Rn ; a nonempty subset D  Rm and a real-valued function F.x; y/ W C  D ! R; the general equilibrium problem is to find a point x such that x 2 C; F.x; y/  0

8y 2 D:

Such a point x is referred to as an equilibrium point (shortly, equilibrium) for the system .C; D; F/: In the particular case when C D D is a nonempty closed convex set this problem was first introduced by Blum and Oettli (1994) and since then has been extensively studied.

94

3 Fixed Point and Equilibrium

The KKM Lemma can be viewed as a proposition asserting the existence of equilibrium in a basic particular case. In fact given a finite set ˝  Rm ; and for each y 2 ˝ a closed set L.y/  Rm ; the KKM Lemma gives conditions for the existence of an x 2 \y2˝ L.y/: If we define F.x; y/ D y .x/  1 where y .x/ is the characteristic function of the closed set L.y/ (so that F.x; y/ D 0 for x 2 L.y/ and F.x; y/ D 1 otherwise), then x 2 \y2˝ L.y/ means x 2 ˝; F.x; y/  0 8y 2 ˝: So x is an equilibrium point for the system .˝; ˝; F/: Based on the KKM Lemma we now prove a general existence condition for equilibrium in a system .C; D; F/. Recall that given a set X  Rn and a set Y  Rm ; a set-valued map f from X to Y is a map which associates with each point x 2 X a set f .x/  Y; called the value of f at x: A set-valued map f from X to Y is said to be closed if its graph f.x; y/j x 2 X; y 2 f .x/g is closed in Rn  Rm ; i.e., if x 2 X; y 2 f .x / and x ! x; y ! y always imply y 2 f .x/: Theorem 3.4 (General Equilibrium Theorem) Let C be a nonempty compact subset of Rn ; D a nonempty closed convex subset of Rm ; and F.x; y/ W C  D ! R a function which is upper semi-continuous in x and quasiconvex in y: If there exists a closed set-valued map ' from D to C with nonempty convex compact values such that infy2D infx2'.y/ F.x; y/  0 then the system .C; D; F/ has an equilibrium point, i.e., there exists an x such that x 2 C; F.x; y/  0 8y 2 D:

(3.14)

Proof For every y 2 D define C.y/ WD fx 2 Cj F.x; y/  0g: By assumption F.x; y/  0 8x 2 '.y/; 8y 2 D; so '.y/  C.y/ 8y 2 D: Since '.y/ ¤ ;; this implies C.y/ ¤ ;I furthermore, C.y/ is compact because C is compact and F.:; y/ is u.s.c. So to prove (3.14), i.e., that \y2D C.y/ ¤ ;; it will suffice to show that the family fC.y/; y 2 Dg has the finite intersection property, i.e., for any finite set fa1 ; : : : ; ak g  D, we have \kiD1 C.ai / ¤ ;:

(3.15)

Let ˝ WD fai ; i D 1; : : : ; kg and for each i D 1; : : : ; k define L.ai / WD fy 2 conv˝j '.y/ \ C.ai / ¤ ;g (note that conv˝  D because D is convex). It is easily seen that L.ai / is a closed subset of D: Indeed, let fy g  L.ai / be a sequence such that y ! y: Then y 2 conv˝; so y 2 conv˝ because the set conv˝ is closed. Further, for each  there is x 2 '.y / \ C.ai /: Since C.ai / is compact we have, up to a subsequence, x ! x for some x 2 C.ai / and since the map ' is closed, x 2 '.y/: Consequently, x 2 '.y/ \ C.ai /; hence '.y/ \ C.ai / ¤ ;; i.e., y 2 L.ai /:

3.3 Fixed Point and Equilibrium

95

For any nonempty set I  f1; : : : ; kg; let DI WD fy 2 conv˝j C.y/  [i2I C.ai /g D fy 2 conv˝j F.x; y/ < 0 8x … [i2I C.ai /g: In view of the quasiconvexity of F.x; :/ the set DI is convex and since ai 2 DI 8i 2 I it follows that convfai ; i 2 Ig  DI  fy 2 conv˝j '.y/  [i2I C.ai /g:

(3.16)

Therefore, convfai ; i 2 Ig  [i2I fy 2 conv˝j '.y/ \ C.ai / ¤ ;g  [i2I L.ai /:

(3.17)

Since this holds for every set I  f1; : : : ; kg; by KKM Lemma, we have \kiD1 L.ai / ¤ ;: So there exists y 2 conv˝ such that '.y/ \ C.ai / ¤ ; 8i D 1; : : : ; k: Now for each i D 1; : : : ; k take an xi 2 '.y/ \ C.ai /: It is not hard to see that for every set I  f1; : : : ; kg, we have M WD convfxi ; i 2 Ig  [i2I C.ai /:

(3.18)

In fact, assume the contrary, that S WD fx 2 Mj x … [i2I C.ai /g ¤ ;: Since y 2 conv˝; by (3.16) '.y/  [kiD1 C.ai /; and since '.y/ is convex it follows that M  '.y/  [kiD1 C.ai /:

(3.19)

Therefore, S D fx 2 Mj x 2 [kiD1 C.ai / n [i2I C.ai /g D fx 2 Mj x 2 [i…I C.ai /g D M \ .[i…I C.ai //: Since each set C.ai /; i D 1; : : : ; k; is compact, so is the set [i…I C.ai /; hence S is a closed subset of M: On the other hand, M n S D fx 2 Mj x 2 [i2I C.ai /g D M \ .[i2I C.ai //; so M n S; too, is a closed subset of M: But M n S ¤ ; because xi 2 C.ai / 8i 2 I: So M is the union of two disjoint nonempty closed sets S and M n S; conflicting with the convexity of M: Therefore S D ;, proving (3.18). Since (3.18) holds for every set I  f1; : : : ; kg; again by KKM Lemma (this time with ˝ D fx1 ; : : : ; xk g; L.xi / D C.ai /), we have \kiD1 C.ai / ¤ ;; i.e., (3.15) and hence, \y2D C.a/ ¤ ;; as was to be proved. t u Remark 3.2 (i) Theorem 3.4 still holds if instead of the compactness of C we only assume the existence of a y 2 D such that the set fx 2 Cj F.x; y/  0g is compact (or, equivalently, such that F.x; y/ ! 1 as x 2 C; kxk ! C1). Indeed, for any finite set N  D we can write \y2N[fyg C.y/ D \y2D C0 .y/; where C0 .y/ D C.y/ \ C.y/; so the family fC0 .y/; y 2 Dg has the finite intersection property; since C0 .y/ is compact it follows that \y2D C0 .y/ ¤ ;; hence \y2D C.y/ ¤ ; and the above proof goes through. (ii) By replacing F.x; y/ with F.x; y/  ˛; where ˛ is any real number Theorem 3.4 can also be reformulated as follows:

96

3 Fixed Point and Equilibrium

Let C  Rn be a nonempty compact set, D  Rm a nonempty closed convex set, and F.x; y/ W C  D ! R a function upper semi-continuous in x and quasiconvex in yI let ' be a closed set-valued map from D to C with nonempty convex compact values. Then for any real number ˛ we have the implication inf inf F.x; y/  ˛ ) 9x 2 C inf F.x; y/  ˛:

y2D x2'.y/

y2D

(3.20)

Or equivalently, max inf F.x; y/  inf inf F.x; y/: x2C y2D

y2D x2'.y/

(3.21)

Remark 3.3 For a heuristic interpretation of Theorem 3.4, consider a company (a person, a community) with an utility function f .x; y/ that depends upon a variable x 2 C under its control and a variable y 2 D outside its control. For each given y 2 D; the best for the company is certainly to select an xy 2 C maximizing F.xy ; y/ over C; i.e., such that f .xy ; y/  f .x; y/  0 8x 2 C: Theorem 3.4 tells under which conditions there exists an x 2 C such that f .x; y/  f .x; y/  0 8x 2 C 8y 2 D; i.e., an x D xy for any whatever y 2 D: As it turns out, such kind of situation underlies the equilibrium mechanism in most cases of interest. For instance, a weaker variant of Theorem 3.4 with both C; D compact and F.x; y/ continuous on C  D was shown many years ago to be a generalization of the Walras Excess Demand Theorem (Tuy 1976). Below we show how various important existence propositions pertaining to equilibrium can be obtained as immediate consequences of Theorem 3.4.

3.3.3 Equivalent Minimax Theorem First an alternative equivalent form of Theorem 3.4 is the following minimax theorem: Theorem 3.5 Let C be a nonempty compact subset of Rn ; D a nonempty closed convex subset of Rm ; and F.x; y/ W C  D ! R a function which is u.s.c. in x and quasiconvex in y: If there exists a closed set-valued map ' from D to C with nonempty compact convex values such that for every ˛ < WD infy2D supx2C F.x; y/ we have '.y/  C˛ .y/ WD fx 2 Cj F.x; y/  ˛g then the following minimax equality holds: max inf F.x; y/ D inf sup F.x; y/: x2C y2D

y2D x2C

(3.22)

Proof Clearly, for every ˛ < the assumption '.y/  C˛ .y/ 8y 2 D means that infy2D infx2'.y/ F.x; y/  ˛ while by Theorem 3.4 the latter inequality implies maxx2C infy2D F.x; y/  ˛: Therefore max inf F.x; y/  WD inf sup F.x; y/; x2C y2D

y2D x2C

3.3 Fixed Point and Equilibrium

97

whence (3.22) because the converse inequality is trivial. Thus Theorem 3.4 implies Theorem 3.5. The converse can be proved in the same manner. t u Corollary 3.2 Let C be a nonempty compact convex set in Rn ; D a nonempty closed convex set in Rm ; and F.x; y/ W C  D ! R a u.s.c. function which is quasiconcave in x and quasiconvex in y: Then there holds the minimax equality (3.22). Proof For every ˛ < define '.y/ WD fx 2 Cj F.x; y/  ˛g: Then '.y/ is a nonempty closed convex set. If .x ; y / 2 C  D; .x ; y / ! .x; y/; and x 2 '.y /; i.e., F.x ; y /  ˛ then by upper semi-continuity of F.x; y/, we must have F.x; y/  ˛; i.e., x 2 '.y/: So the set-valued map ' is closed and '.y/ D C˛ .y/ 8y 2 D: The conclusion follows by Theorem 3.5. t u Note that Corollary 3.2 differs from Sion’s minimax theorem (Theorem 2.7) only by the continuity properties imposed on F.x; y/: Actually, with a little bit more skill Sion’s theorem can also be derived directly from the General Equilibrium Theorem (Exercise 6).

3.3.4 Ky Fan Inequality When D D C and '.y/ is single-valued Theorem 3.4 yields Theorem 3.6 (Ky Fan 1972) Let C be a nonempty compact convex set in Rn ; and F.x; y/ W C  C ! R a function which is u.s.c in x and quasiconvex in y: If '.y/ is a continuous map from C into C there exists x 2 C such that inf F.x; y/  inf F.'.y/; y/:

y2C

y2C

(3.23)

In the special case '.y/ D y 8y 2 C this inequality becomes inf F.x; y/  inf F.y; y/:

y2C

y2C

If, in addition, F.y; y/ D 0 8y 2 C then inf F.x; y/  0:

y2C

Proof The inequality (3.21) reduces to (3.23) when '.y/ is a singleton.

t u

In the special case F.x; y/ D kx  yk we have from (3.23) miny2C kx  yk  miny2C k'.y/  yk; hence 0 D miny2C k'.y/  yk; which means there exists y 2 C such that y D '.y/: Thus, as a consequence of Ky Fan inequality we get Brouwer fixed point theorem: a continuous map ' from a compact convex set C into itself has a fixed point. Actually, the following more general fixed point theorem is also a direct consequence of Theorem 3.4:

98

3 Fixed Point and Equilibrium

3.3.5 Kakutani Fixed Point Theorem 3.7 (Kakutani 1941) Let C be a compact convex subset of Rn and ' a closed set-valued map from C to C with nonempty convex values '.x/  C 8x 2 C: Then ' has a fixed point, i.e., a point x 2 '.x/: Proof The function F.x; y/ W C  C ! R defined by F.x; y/ D kx  yk is continuous on C  C and convex in y for every fixed x: By Theorem 3.4 (Remark 3.2, (3.21)) there exists x 2 C such that inf kx  yk  inf inf kx  yk:

y2C

y2C x2'.y/

Since infy2C kx  yk D 0; for every natural k we have infy2C infx2'.y/ kx  yk  1=k; hence there exist yk 2 C; xk 2 '.yk /  C such that kyk xk k  1=k: By compactness of C there exist x; y such that, up to a subsequence, xk ! x 2 C; yk ! x 2 C: Since the map ' is closed it follows that x 2 '.x/: t u Historically, Brouwer Theorem was established in 1912. Kakutani Theorem was derived from Brouwer Theorem almost three decades later, via a quite involved machinery. The above proof of Kakutani Theorem is perhaps the shortest one.

3.3.6 Equilibrium in Non-cooperative Games Consider an n-person game in which the player i has a strategy set Xi  Rmi : When the player i chooses a strategy Q xi 2 Xi ; the situation of the game is described by the vector x D .x1 ; : : : ; xn / 2 niD1 Xi : In that situation the player i obtains a payoff fi .x/: Assume that each player does Qn not know which strategy is taken by the other players. A vector x 2 X WD iD1 Xi is called a Nash equilibrium if for every i D 1; : : : ; n W fi .x1 ; x2 ; : : : ; xn / D max fi .x1 ; : : : ; xi1 ; xi ; xiC1 ; : : : ; xn /: xi 2Xi

In other words, x 2 X is a Nash equilibrium if fi .x/ D max fi .xji xi /; xi 2Xi

where for any z 2 Xi ; xji z denotes the vector x 2 replaced by z:

Qn iD1

Xi in which xi has been

3.3 Fixed Point and Equilibrium

99

Theorem 3.8 (Nash Equilibrium Theorem (Nash 1950)) Assume that for each i D 1; : : : ; n the set Xi is convex compact and the function fi .x/ is continuous on X and concave in xi : Then there exists a Nash equilibrium. Qn Proof The set X WD iD1 Xi is convex, compact as the Cartesian product of n convex compact sets. Consider the function F.x; y/ WD

n X

Œfi .x/  fi .xji yi /; x 2 X; y 2 X:

(3.24)

iD1

This is a function continuous in x and convex in y: The latter is due to the assumption that for each i D 1; : : : ; n the function fi .x/ is concave in xi ; which means that the function z 7! fi .xji z/ is concave. Since yji yi D y we also have F.y; y/ D 0 8y 2 X: Therefore, by Ky Fan inequality Theorem 3.6, there exists x 2 X satisfying F.x; y/ D

n X Œfi .x/  fi .xji yi /  0

8y 2 X:

iD1

Fix an arbitrary i and take y 2 X such that yj D xj for j ¤ i: Writing the above inequality as fi .x/  fi .xji yi / C

X

Œfj .x/  fj .xjj xj /  0 8y 2 X;

j¤i

and noting that xjj xj D x we deduce the inequality fi .x/  fi .xjyi / 8yi 2 Xi : Since this holds for every i D 1; : : : ; n; x is a Nash equilibrium. t u

3.3.7 Variational Inequalities As we saw in Chap. 2, Proposition 2.31, a minimizer x of a convex function f W Rn ! R over a convex set C  Rn must satisfy the condition 0 2 @f .x/ C NC .x/;

(3.25)

where @f .x/ is the subdifferential of f at x and NC .x/Dfy 2 Rn j hy; xxi  0 8x 2 Cg is the outward normal cone to C at x: The above condition can be equivalently written as 9y 2 @f .x/ hy; x  xi  0 8x 2 C: A relation like (3.25) is called a variational inequality. Replacing the map x 7! @f .x/ with a given arbitrary set-valued map ˝ from a nonempty closed convex set

100

3 Fixed Point and Equilibrium

C to Rn , we consider the variational inequality for (C; ˝): .VI.C; ˝//

Find x 2 C satisfying 0 2 ˝.x/ C NC .x/:

Alternatively, following Robinson (1979) we can consider the generalized equation 0 2 ˝.x/ C NC .x/:

(3.26)

Although only a condensed form of writing VI.C; ˝/; the generalized equation (3.26) is often a more efficient tool for studying many questions of nonlinear analysis and optimization. Based on Theorem 3.4 we can now derive a solvability condition for the variational inequality VI.C; ˝/ or the generalized equation (3.26). For that we need a property of set-valued maps. A set-valued map f .x/ from a set X  Rn to a set Y  Rm is said to be upper semi-continuous if for every x 2 X and every open set U  f .x/ there exists a neighborhood W of x such that f .z/  U 8z 2 W \ X: Proposition 3.2 (i) An upper semi-continuous set-valued map f from a set X to a set Y which has closed values is a closed map. Conversely, a closed set-valued map f from X to a compact set Y is upper semi-continuous. (ii) If f .x/ is an upper semi-continuous map from a compact set X to Y with nonempty compact values then f .X/ is compact. Proof (i) Let f be an upper semi-continuous map with closed values. If .x ; y / 2 X  Y is a sequence such that .x ; y / ! .x; y/ 2 X  Y; then for any neighborhood U of f .x/ we will have y 2 U for all large enough ; so y 2 f .x/: Conversely, let f be a closed set-valued map from X to a compact set Y: If f is not upper semi-continuous. There would exist x0 2 X and an open set U  f .x0 / such that for every k there exists xk ; yk 2 f .xk / satisfying kxk  x0 k  1=k; yk … U: Since Y is compact, by taking a subsequence if necessary, yk ! y 2 Y and since f .x0 /  U; clearly y … f .x0 /; conflicting with the map being closed. (ii) Let Wt ; t 2 T; be an open covering of f .X/: For each fixed x 2 X since f .x/ is compact there exists a finite covering Wi ; i 2 I.x/  T: In view of the upper semi-continuity of the map f there exists an open ball V.y/ around x such that f .z/  [i2I.x/ Wi 8z 2 V.x/: Then by compactness of Y there exists a finite set E  Y such that Y is entirely covered by [x2E V.x/: So the finite family Wt ; t 2 [fWi ; i 2 I.x/; x 2 Eg covers f .X/; proving the compactness of this set. t u Theorem 3.9 Let C be a compact convex set in Rn ; ˝ an upper semi-continuous set-valued map from C to Rn with nonempty compact convex values. Then the variational inequality VI.C; ˝) [the generalized equation (3.26)] has a solution, i.e., there exists x 2 C and y 2 ˝.x/ such that hy; x  xi  0 8x 2 C:

3.4 Exercises

101

Proof By Proposition 3.2 ˝ is a closed map and its range D D ˝.C/ is compact. Define F.x; y/ D hx; yi  minz2C hz; yi: This function is continuous in y and convex in x: By Theorems3.4, (3.21), there exists y 2 D such that inf F.x; y/  inf inf F.x; y/:

x2C

x2C y2˝.x/

But infx2C F.x; y/ D minx2C hx; yi  minz2C hz; yi D 0; hence inf inf F.x; y/  0:

x2C y2˝.x/

Since C is compact, for every k D 1; 2; : : : ; there exists xk 2 C satisfying inf F.xk ; y/  1=k;

y2˝.xk /

hence there exists xk 2 C; yk 2 ˝.xk / such that F.xk ; yk /  1=k; i.e., hxk ; yk i  minhz; yk i  1=k: z2C

Let .x; y/ be a cluster point of .xk ; yk /: Then x 2 C; y 2 ˝.x/ because the map ˝ is closed, and hx; yi  minz2C hz; yi  0; i.e., hy; x  xi  0 8x 2 C as to be proved. u t

3.4 Exercises 1 Derive Ekeland variational principle from Theorem 3.2. (Hint: Let x" be an "-approximate minimizer of f .x/ on X and  > 0: Define P.x/ D fy 2 Xj " ky  x" /k  f .x" /  f .y/g and show that if Ekeland’s theorem is false there exists for every x 2 X an y 2 P.x/ satisfying " kx  yk < f .x/  f .y/: Then apply Caristi theorem to derive a contradiction.) 2 Let S be an n-simplex spanned by a0 ; a1 ; : : : ; an : For each i D 0; 1; : : : ; n let Fi be the facet of S opposite the vertex ai (i.e., the .n  1/-simplex spanned by n points aj ; j ¤ i/; and Li a given closed subset of S: Show that the KKM Lemma is equivalent to saying that if the following condition is satisfied: S  [niD0 Li ;

Fi  Li ; 8i;

then \niD0 Li ¤ ;: 3 Let S; Fi ; Li ; i D 0; 1; : : : ; n be as in the previous exercise. Show that the KKM Lemma is equivalent to saying that if the following condition is satisfied: S  [niD0 Li ; then \niD0 Li ¤ ;:

ai 2 Li ; Fi \ Li D ;;

102

3 Fixed Point and Equilibrium

4 Derive the KKM Lemma from Brouwer fixed point theorem. (Hint: Let S; L0 ; L1 ; : : : ; Ln be as in the previous exercise, .x; Li / D minfkx  ykj y 2 Li g: For k > 0 set Hi D fx 2 Sj .x; Li /  1=k and define a continuous map f W S ! S i/ by fi .x/ D Pxni1 .x;H with the convention x1 D xn : Show that if x D f .x/ then iD1 xi .x;Hi / fi .x/ > 0 8i; hence .x; Hi / > 0; i.e., .x; Li /  1=k for every i D 0; 1; : : : ; n: Since k can be arbitrary, deduce that \niD0 Li ¤ ;:) 5 Let C1 ; : : : ; Cn be n closed convex sets in Rk such that their union is a convex set C: Show that if the intersection of n  1 arbitrary sets among them is nonempty then the intersection of the n sets is nonempty. (Hint: Let ai 2 Li WD \j¤i Cj and apply KKM to the set fa1 ; : : : ; an g and the collection of closed sets fL1 ; : : : ; Ln g.) 6 Let C be a compact convex subset of Rn ; D a closed convex subset of Rm ; f .x; y/ W C  D ! R a function which is quasi convex, l.s.c. in x and quasiconcave u.s.c. in y: Show that for every ˛ > WD supy2D infx2C f .x; y/ the set-valued map ' from D to C such that '.y/ WD fx 2 Cj f .x; y/  ˛g 8y 2 D is closed and using this fact deduce Sion’s Theorem from Theorem 3.4. (Hint: show that the set-valued map ' is u.s.c., hence closed because C is compact.)

Chapter 4

DC Functions and DC Sets

4.1 DC Functions Convexity is a nice property of functions but, unfortunately, it is not preserved under even such simple algebraic operations as scalar multiplication or lower envelope. In this chapter we introduce the dc structure (also called the complementary convex structure ) which is the common underlying mathematical structure of virtually all nonconvex optimization problems. Let ˝ be a convex set in Rn . We say that a function is dc on ˝ if it can be expressed as the difference of two convex functions on ˝, i.e., if f .x/ D f1 .x/f2 .x/, where f1 ; f2 are convex functions on ˝. Of course, convex and concave functions are particular examples of dc functions. A quadratic function f .x/ D hx; Qxi, where Q is a symmetric matrix, is a dc function which may be neither convex nor concave. Indeed, setting x D Uy where U D Œu1 ; : : : ; un  is the matrix of normalized eigenvectors of Q, we have U T QU D diag.1 ; 2 ; : : : ; n /, hence f .x/ D f .Uy/ D hUy; QUyi D hy; U T QUyi, so that f .x/ D f1 .x/  f2 .x/ with f1 .x/ D

X i 0

i y2i ;

f2 .x/ D 

X

i y2i ;

y D U 1 x:

i 0:

(4.28)

Proof Without loss of generality we may assume that g.x0 / D h.x0 / D 0. Then x0 is a global minimizer of g.x/  h.x/ on Rn if and only if h.x/  g.x/ 8x 2 Rn . It is easy to see that the latter condition is equivalent to (4.28). Indeed, observe that an affine function '.x/ D ha; x  x0 i C '.x0 / is a minorant of a proper function f .x/ if and only if a 2 @" f .x0 / with " D f .x0 /  '.x0 /. Now, since h.x/ is a proper convex l.s.c. function on Rn , it is known that h.x/ D supf'.x/j ' 2 Qg, where Q denotes the collection of all affine minorants of h.x/ (Theorem 2.5). Hence, the condition h.x/  g.x/ 8x 2 Rn is equivalent to saying that every ' 2 Q is a minorant of g, i.e., by the above observation, every a 2 @" h.x0 / belongs to @" g.x0 /, for every " > 0. t u Remark 4.4 Geometrically, the condition h.x/  g.x/ 8x 2 Rn which expresses the global minimality of x0 means G  H;

(4.29)

where G; H are the epigraphs of the functions g; h, respectively. Since h is proper convex and l.s.c., H is a nonempty closed convex set in Rn  R and it is obvious that (4.29) holds if and only if every closed halfspace in Rn  R that contains H also contains G (see Strekalovski 1987, 1990, 1993). The latter condition, if restricted to nonvertical halfspaces, is nothing but a geometric expression of (4.28). The next result exhibits an important duality relationship in dc minimization problems which has found many applications (Toland 1978).

122

4 DC Functions and DC Sets

Proposition 4.21 Let g; h be as in Proposition 4.20. Then inf fg.x/  h.x/g D

x2domg

inf fh .u/  g .u/g:

u2domh

(4.30)

If u0 is a minimizer of h .u/  g .u/ then any x0 2 @g .u0 / is a minimizer of g.x/  h.x/. Proof Suppose g.x/  h.x/  , i.e., g.x/  h.x/ C for all x 2 domg. Then g .u/ D supx2domg fhu; xi  g.x/g  supx2domg fhu; xi  h.x/g   h .u/  , i.e., h .u/  g .u/  . Conversely, the latter inequality implies (in an analogous manner) g .x/  h .x/  , hence g.x/  h.x/  . This proves (4.30). Suppose now that u0 is a minimizer of h .u/g .u/ and let x0 2 @g .u0 /, hence u0 2 @g.x0 /. Since h .u0 / C h.x0 /  hu0 ; x0 i, we have g.x0 /  h.x0 /  g.x0 /  hu0 ; x0 i C h .u0 /  g.x/  hu0 ; xi C h .u0 /  g.x/  Œhu0 ; xi  h .u0 /  g.x/  h.x/ for all x 2 domg.

t u

4.9 Exercises 1 Let C be a compact set in Rn . For every x 2 Rn denote by d.x/ D inffkx  yk j y 2 @Cg. Show that d.x/ is a dc function such that if C is a convex set then d.x/ is concave on C but convex on any convex subset of Rn n C. Find a dc representation of the functions: P 2 P.x; y/ D njD1 Œaj xj C bj yj C cj x2j C dj y2j C pj xj yj C qj xj y2j C rj x2j yj  for .x; y/ 2 RnC  RnC . 3 8 ˆ < ˛  wt f .t/ D .˛  wı/.1  ˆ :0

0tı ıt

t

tı /



for t  0 (˛; ı; are positive numbers, and ı < /. 4 f .x/ D

n X iD1

ˇ0 C

ri .xi / Pk jD1

ˇj xj

;

x  0;

where each ri .t/ is a strictly increasing S-shaped function (convex for t 2 Œ0; ˛i , concave for t  ˛i , with ˛i > 0/, and ˇi ; i D 0; 1; : : : ; n are positive numbers.

4.9 Exercises

1 1=2 4 5 f .x/ D x1 C C ; x2 .x1  x2 /2

123

x 2 R2C .

6 A function f W C ! R (where C is an open convex subset of Rn / is said to be weakly convex if there exists a constant r > 0 such that for all x1 ; x2 2 C and all  2 Œ0; 1: f ..1  /x1 C x2 /  .1  /f .x1 / C f .x2 / C .1  /rkx1  x2 k2 : Show that f is weakly convex if and only if it is a dc function of the form f .x/ D g.x/  rkxk2 , where r > 0. 7 Show that a function f W Rn ! R is weakly convex if and only if there exists a constant r > 0 such that for every x0 2 Rn one can find a vector y 2 Rn satisfying f .x/  f .x0 /  hy; x  x0 i  rkx  x0 k2 8x 2 Rn . 2

8 Find a convex minorant for the function f .x/ D e.x1 Cx2 x3 / C x1 x2 x3 on 1  xi  C1; i D 1; 2; 3. 9 Same question for the function f .x/ D log.x1 C x22 /Œminfx1 ; x2 g on 1  xi  3; i D 1; 2. 10 (1) If x is a local minimizer of the function f .x/ D g.x/  h.x/ over Rn then 0 2 @g.x/  @h.x/ where A  B D fxj x C B  Ag. (2) If x is a global minimizer of the dc function f D g  h over Rn then any point x 2 @h.x/ achieves a minimum of the function h  g over domg .

Part II

Global Optimization

Chapter 5

Motivation and Overview

5.1 Global vs. Local Optimum We will be concerned with optimization problems of the general form .P/

minff .x/ j

gi .x/  0 .i D 1; : : : ; m/; x 2 Xg;

where X is a closed convex set in Rn ; f W ˝ ! R and gi W ˝ ! R; i D 1; : : : ; m, are functions defined on some set ˝ in Rn containing X. Denoting the feasible set by D: D D fx 2 Xj gi .x/  0; i D 1; : : : ; mg we can also write the problem as minff .x/j x 2 Dg: A point x 2 D such that (Fig. 5.1) f .x /  f .x/; 8x 2 D; is called a global optimal solution (global minimizer) of the problem. A point x0 2 D such that there exists a neighborhood W of x0 satisfying f .x0 /  f .x/; 8x 2 D \ W is called a local optimal solution (local minimizer). When the problem is convex, i.e., when all the functions f .x/ and gi .x/; i D 1; : : : ; m, are convex, any local minimizer is global (Proposition 2.30), but this is no longer true in the general case. A problem is said to be multiextremal when it has many local minimizers

© Springer International Publishing AG 2016 H. Tuy, Convex Analysis and Global Optimization, Springer Optimization and Its Applications 110, DOI 10.1007/978-3-319-31484-6_5

127

128

5 Motivation and Overview

Nonconvex function

Nonconvex constraints

Fig. 5.1 Local and global minimizers

with different objective function values (so that a local minimizer may fail to be global; Fig. 5.1). Throughout the sequel, unless otherwise stated it will always be understood that the minimum required in problem .P/ is a global one, and we will be mostly interested in the case when the problem is multiextremal. Often we will assume X compact and f .x/; gi .x/; i D 1; : : : ; m, continuous on X, (so that if D is nonempty then the existence of a global minimum is guaranteed); but occasionally the case where f .x/ is discontinuous at certain boundary points of D will also be discussed, especially when this turns out to be important for the applications. By its very nature, global optimization requires quite different methods from those used in conventional nonlinear optimization, which so far has been mainly concerned with finding local optima. Strictly speaking, local methods of nonlinear programming which rely on such concepts as derivatives, gradients, subdifferentials and the like, may even fail to locate a local minimizer. Usually these methods compute a Karush–Kuhn–Tucker (KKT) point, but in many instances most KKT points are not even local minimizers. For example, the problem: n X minimize  .ci xi C x2i / subject to  1  xi  1 .i D 1; : : : ; n/; iD1

where the ci are sufficiently small positive numbers, has up to 3n KKT points, of which only 2n C 1 points are local minimizers, the others are saddle points. Of course the above example reduces to n univariate problems minfci xi  x2i j  1  xi  1g; i D 1; : : : ; n, and since the strictly quasiconcave function ci xi  x2i attains its unique minimum over the segment Œ1; 1 at xi D 1, it is clear that .1; 1; : : : ; 1/ is the unique global minimizer of the problem. This situation is typical: although certain multiextremal problems may be easy (among them instances of problems which have been shown to be NP-hard, see Chap. 7), their optimal solution is most often found on the basis of exploiting favorable global properties

5.2 Multiextremality in the Real World

129

of the functions involved rather than by conventional local methods. In any case, easy multiextremal problems are exceptions. Most nonconvex global optimization problems are difficult, and their main difficulty is often due to multiextremality. From the computational complexity viewpoint, even the problem which only differs from the above mentioned example in that the objective function is a general quadratic concave function is NP-hard (Sahni 1974). It should be noted, however, that, as shown in Murty and Kabadi (1987), to verify that a given point is an unconstrained local minimizer of a nonconvex function is already NP-hard. Therefore, even if we were to restrict our goal to obtaining a local minimizer only, this would not significantly lessen the complexity of the problem. On the other hand, as optimization models become widely used in engineering, economics, and other sciences, an increasing number of global optimization problems arise from applications that cannot be solved successfully by standard methods of linear and nonlinear programming and require more suitable methods. Often, it may be of considerable interest to know at least whether there exists any better solution to a practical problem than a given solution that has been obtained, for example, by some local method. As will be shown in Sect. 5.5, this is precisely the central issue of global optimization.

5.2 Multiextremality in the Real World The following examples illustrate how multiextremal nonconvex problems arise in different economic, engineering, and management activities. Example 5.1 (Production–Transportation Planning) Consider r factories producing a certain good to satisfy the demands dj ; j D 1; : : : ; m, of m destination points. The cost of producing t units of goods at factory i is a concave function gi .t/ (economy of scale), the transportation cost is linear, with unit cost cij from factory i to destination point j. In addition, if the destination point j receives t ¤ dj units, then a shortage penalty hj .t/ must be paid which is a decreasing nonnegative convex function of t in the interval Œ0; dj /. Under these conditions, the problem that arises is the following one where a dc function has to be minimized over a polyhedron: minimize

r X m X

cij xij C

iD1 jD1

s:t:

m X jD1 r X

r X

gi .yi / C

iD1

xij D yi

.i D 1; : : : ; r/

xij D zj

.j D 1; : : : ; m/

iD1

xij ; yi ; zj  0

8i; j:

m X jD1

hj .zj /

130

5 Motivation and Overview

In a general manner nonconvexities in economic processes arise from economies of scale (concave cost functions) or increasing returns (convex utility functions). Example 5.2 (Location–Allocation) A number of p facilities providing the same service have to be constructed to serve n users located at points aj ; j D 1; : : : ; n on the plane. Each user will be served by one of the facilities, and the transportation cost is a linear function of the distance. The problem is to determine the locations xi 2 R2 ; i D 1; : : : ; p, of the facilities so as to minimize the weighted sum of all transportation costs from each user to the facility that serves this user. Introduced by Cooper (1963), this problem was shown to be NP-hard by Megiddo and Supowwit (1984). Originally it was formulated as consisting simultaneously of a nontrivial combinatorial part (allocation, i.e., partitioning the set of users into groups to be served by each facility) and a nonlinear part (location, i.e., finding the optimal location of the facility serving each group). The corresponding mathematical formulation of the problem is (Brimberg and Love 1994): Pp P minx;u iD1 NjD1 wij kaj  xi k Pp s:t: iD1 wij D rj ; j D 1; : : : ; N wij  0; i D 1; : : : ; p; j D 1; : : : ; N xi 2 S; i D 1; : : : ; p; where wij is an unknown weight representing the flow from point aj to facility i, S is a rectangular domain in R2 where the facilities must be located. With this traditional formulation the problem is exceedingly complicated as it involves a large number of variables and a highly nonconvex objective function. However, observing that to minimize the total cost, each user must be served by the closest facility, and that the distance from a user j to the closest facility is obviously dj .x1 ; : : : ; xp / D min kaj  xi k iD1;:::;p

(5.1)

the problem can be formulated more simply as minimize

n X

wj dj .x1 ; : : : ; xp / subject to xi 2 R2 ; i D 1; : : : ; p;

jD1

where wj > 0 is the weight assigned to user j. Since each dj .:/ is a dc function (pointwise minimum of a set of convex functions kaj  xi k, see Sect. 3.1), the problem amounts to minimizing a dc function over .R2 /p . Example 5.3 (Engineering Design) Random fluctuations inherent in any fabrication process (e.g., integrated circuit manufacturing) may result in a very low production yield. To help the designer to minimize the influence of these random fluctuations, a method consists in maximizing the yield by centering the nominal value of design parameters in the region of acceptability. In more precise terms,

5.2 Multiextremality in the Real World

131

this design centering problem (Groch et al. 1985; Vidigal and Director 1982) can be formulated as follows. Suppose the quality of a manufactured item can be characterized by an n-dimensional parameter and an item is accepted when the parameter is contained in a given region M  Rn . If x is the nominal value of the parameter, and y its actual value, then usually y will deviate from x and for every fixed x the probability that the deviation kx  yk does not exceed a given level r monotonically increases with r. Under these conditions, for a given nominal value of x the production yield can be measured by the maximal value of r D r.x/ such that B.x; r/ WD fy W kx  yk  rg  M and to maximize the production yield, one has to solve the problem maximize r.x/ subject to x 2 Rn :

(5.2)

Often, the norm k:k is ellipsoidal, i.e., there exists a positive definite matrix Q such that kx  yk2 D .x  y/T Q.x  y/. Denoting D D Rn n M, we then have r2 .x/ D inff.x  y/T Q.x  y/ W y … Dg D inffxT Qx C yT Qy  2xT Qy W y … Dg; so r2 .x/ D xT Qx  h.x/ with h.x/ D supf2xT Qy  yT Qy W y … Dg. Since for each y … D the function x 7! 2xT Qy  yT Qy is affine, h.x/ is a convex function and therefore, r2 .x/ is a dc function. Using nonlinear programming methods, we can only compute KKT points which may even fail to be local minima, so the design centering problem is another instance of multiextremal optimization problem (see, e.g., Thach 1988). Problems similar to (5.2) appear in engineering design, coordinate measurement technique, etc. Example 5.4 (Pooling and Blending) In oil refineries, a two-stage process is used to form final products j D 1; : : : ; J from given oil components i D 1; : : : ; I (see Fig. 5.2). In the first stage, intermediate products are obtained by combining the components supplied from different sources and stored in special tanks or pools. In the second stage, these intermediate products are combined to form final products of prescribed quantities and qualities. A certain quantity of an initial component can go directly to the second stage. If xil denotes the amount of component i allocated Fig. 5.2 The pooling problem

Comp 1

x11

x12 Comp 2 z31 Comp 3

y11

Product 1

Pool 1 y12 z32

Product 2

132

5 Motivation and Overview

to pool l, ylj the amount going from pool l to product j, zij the amount of component i going directly to product j, plk the level of quality k (relative sulfur content level) in pool l, and Cik the level of quality k in component i, then these variables must satisfy (see, e.g., Ben-Tal et al. 1994; Floudas and Aggarwal 1990; Foulds et al. 1992) (Fig. 5.2): ) P P xil C j zij  Ai l P P (component balance) i xil  j ylj D 0 P x  Sl (pool balance) P Pi il .p  P /y C i .Cij  Pjk /zij  0 (pool quality) Pl lk Pjk lj l ylj C i zij  Dj (product demands constraints) xil ; ylj ; zij ; plk  0; where Ai ; Sl ; Pjk ; Dj are the upper bounds for component availabilities, pool sizes, product qualities, and product demands, respectively, and Cik the level of quality k in component i. Given the unit prices ci ; dj of component i and product j, the problem is to determine the variables xil ; ylj ; zij ; plk satisfying the above constraints with maximum profit 

XX i

l

ci xil C

XX l

j

dj ylj C

XX .dj  ci /zij : i

j

Thus, a host of optimization problems of practical interest in economics and engineering involve nonconvex functions and/or nonconvex sets in their description. Nonconvex global optimization problems arise in many disciplines including control theory (robust control), computer science (VLSI chip design and databases), mechanics (structural optimization), physics (nuclear design and microcluster phenomena in thermodynamics), chemistry (phase and chemical reaction equilibrium, molecular conformation), ecology (design and cost allocation for waste treatment systems), . . . Not to mention many mathematical problems of importance which can be looked at and studied as global optimization problems. For instance, finding a fixed point of a map F W Rn ! Rn on a set D  Rn , which is a recurrent problem in applied mathematics, reduces to finding a global minimizer of the function kx  F.x/k over D; likewise, solving a system of equations gi .x/ D 0 .i D 1; : : : ; m/ amounts to finding a global minimizer of the function f .x/ D maxi jgi .x/j.

5.3 Basic Classes of Nonconvex Problems Optimization problems without any particular structure are hopelessly difficult. Therefore, our interest is focussed first of all on problems whose underlying mathematical structure is more or less amenable to analysis and computational study. This point of view leads us to consider two basic classes of nonconvex global optimization problems: dc optimization and monotonic optimization.

5.3 Basic Classes of Nonconvex Problems

133

5.3.1 DC Optimization A dc optimization (dc programming) problem is a problem of the form ˇ ˇ minimize f .x/ ˇ ˇ subject to x 2 D; hi .x/  0 .i D 1; : : : ; m/;

(5.3)

where D is a convex set in Rn , and each of the functions f .x/; h1 .x/, : : : ; hm .x/ is a dc function on Rn . Most optimization problems of practical interest are dc programs (see the examples in Sect. 4.2). The class of dc optimization problems includes several important subclasses: concave minimization, reverse convex programming, non convex quadratic optimization among others.

a. Concave Minimization (Convex Maximization) minimize f .x/ subject to

x 2 D;

(5.4)

where f .x/ is a concave function (i.e., f .x/ is a convex function) and D is a closed convex set given by an explicit system of convex inequalities: gi .x/  0; i D 1; : : : ; m. This is a global optimization problem with the simplest structure. Since (5.4) can be written as minftj x 2 D; f .x/  tg, it is equivalent to minimizing t under the constraint .x; t/ 2 G n H with G D D  R and H D f.x; t/j f .x/ > tg (H is a convex set because the function f .x/ is concave). Of particular interest is the linearly constrained concave minimization problem (i.e., problem (5.4) when the constraint set D is a polyhedron given by a system of linear inequalities) which plays a central role in global optimization and has been extensively studied in the last four decades. Since the most important property of f .x/ that should be exploited in problem (5.4) is its quasiconcavity (expressed in the fact that every upper level set fxj f .x/  ˛g is convex), rather than its concavity, it is convenient to slightly extend problem (5.4) by assuming f .x/ to be only quasiconcave. The corresponding problem, called quasiconcave minimization (or quasiconvex maximization), does not differ much from concave minimization. Also, by considering quasiconcave rather than concave minimization, more insight can be obtained into the relationship between this problem and other nonconvex optimization problems.

134

5 Motivation and Overview

b. Reverse Convex Programming minimize f .x/

subject to

x 2 D; h.x/  0;

(5.5)

where f .x/; h.x/ are convex finite functions on Rn and D is a closed convex set in Rn . Setting C D fxj h.x/  0g; the problem can be rewritten as minimize f .x/

subject to x 2 D n intC:

This differs from a conventional convex program only by the presence of the reverse convex constraint h.x/  0: Writing the problem as minftj f .x/  t; x 2 D; h.x/  0g, we see that it amounts to minimizing t over the set G n intH, with G D f.x; t/j f .x/  t; x 2 Dg; H D C  R. Just as concave minimization can be extended to quasiconcave minimization, so can problem (5.5) be extended to the case when f .x/ is quasiconvex. As indicated above, concave minimization can be converted to reverse convex programming by introducing an additional variable. For many years, it was commonly believed (but never established) that conversely, a reverse convex program should be equivalent to a concave minimization problem via some simple transformation. It was not until 1991 that a duality theory for nonconvex optimization was developed (Thach 1991), within which framework the two problems: quasiconcave minimization over convex sets and quasiconvex minimization over complements of convex sets are dual to each other, so that solving one can be reduced to solving the other and vice versa.

c. Quadratic Optimization An important subclass of dc programs is constituted by quadratic nonconvex programs which, in their most general formulation, consist in minimizing a quadratic function under nonconvex quadratic constraints, i.e., a problem (5.3) in which 1 hx; Q0 xi C hc0 ; xi 2 1 hi .x/ D hx; Qi xi C hci ; xi i D 1; : : : ; m; 2 f .x/ D

where Qi ; i D 0; 1; : : : ; m, are symmetric n  n matrices, and at least one of these matrices is indefinite or negative semidefinite.

5.4 General Nonconvex Structure

135

5.3.2 Monotonic Optimization This is the class of problems of the form ˇ ˇ maximize f .x/ ˇ ˇ subject to g.x/  0  h.x/; x 2 Rn ; C

(5.6)

where f .x/; g.x/; h.x/ are increasing functions on RnC . A function f W RnC ! R is said to be increasing if f .x/  f .x0 / whenever x; x0 2 RnC ; x  x0 . Monotonic optimization is a new important chapter of global optimization which has been developed in the last 15 years. It is easily seen that if F denotes the family of increasing functions on RnC then f1 .x/; f2 .x/ 2 F ) minff1 .x/; f2 .x/g 2 F I maxff1 .x/; f2 .x/g 2 F : These properties are similar to those of the family of dc functions, so a certain parallel exists between Monotonic Optimization and DC Optimization.

5.4 General Nonconvex Structure In the deterministic, much more than in the stochastic and other approaches to global optimization, it is essential to understand the mathematical structure of the problem being investigated. Only a deep understanding of this structure can provide insight into the most relevant properties of the problem and suggest efficient methods for handling it. Despite the great variety of nonconvex optimization problems, most of them share a common mathematical structure, which is the complementary convex structure as described in (5.5). Consider, for example, the general dc programming problem (5.3). As we saw in Sect. 4.1, the system of dc constraints hi .x/  0; i D 1; : : : ; m is equivalent to the single dc constraint: h.x/ D max hi .x/  0: iD1;:::;m

(5.7)

If h.x/ D p.x/  q.x/ then the constraint (5.7) in turn splits into two constraints: one convex constraint p.x/  t and one reverse convex constraint q.x/  t. Thus, assuming f .x/ linear and setting z D .x; t/, the dc program (5.3) can be reformulated as the reverse convex program: minfl.z/j z 2 G n intHg;

(5.8)

where l.z/ D f .x/ is linear, G D fz D .x; t/jx 2 D; p.x/  tg, and H D fz D .x; t/j q.x/  tg. This reverse convex program (5.8) is also referred to as a canonical dc programming problem.

136

5 Motivation and Overview

Consider next the more general problem of continuous optimization minimize f .x/ subject to

x 2 S;

(5.9)

where S is a compact set in R and f .x/ is a continuous function on R . As previously, by writing the problem as minftj f .x/  t; x 2 Sg and changing the notation, we can assume that the objective function f .x/ in (5.9) is a linear function. Then, by Proposition 4.13, if r W Rn ! RC is any separator for S (for instance, r.x/ D inffkx  yk j y 2 Sg/ the set S can be described as n

n

S D fxj g.x/  kxk2  0g; where g.x/ D supy…S fr2 .y/ C 2xy  kyk2 g is a convex function on Rn . Therefore, problem (5.9) is again equivalent to a problem of the form (5.8). To sum up, a complementary convex structure underlies every continuous global optimization problem. Of course, this structure is not always apparent and even when it has been made explicit still a lot of work remains to be done to bring it into a form amenable to efficient computational analysis. Nevertheless, the attractive feature of the complementary convex structure is that, on the one hand, it involves convexity, and therefore brings to bear analytical tools from convex analysis, like subdifferential, supporting hyperplane, polar set, etc. On the other hand, since convexity is involved in the “wrong” (reverse) direction, these tools must be used in some specific way and combined with combinatorial tools like cutting planes, relaxation, restriction, branch and bound, etc. In this sense, global optimization is a sort of bridge between nonlinear programming and combinatorial (discrete) optimization. Remark 5.1 The connection between global optimization and combinatorial analysis is also apparent from the fact that any 0-1 integer programming problem can be converted into a global optimization problem. Specifically, a discrete constraint of the form xi 2 f0; 1g .i D 1; : : : ; n/ can be rewritten as x 2 G n H; with G being the hypercube Œ0; 1n and H the interior of the ball circumscribing the hypercube, or alternatively as n X 0  xi  1; xi .xi  1/  0: iD1

5.5 Global Optimality Criteria The essence of an iterative method of optimization is that we have a trial solution x0 to the problem and look for a better one. If the method used is some standard procedure of nonlinear programming and x0 is a KKT point, there is no clue to

5.5 Global Optimality Criteria

137

proceed further. We must then stop the procedure even though there is no guarantee that x0 is a global optimal solution. It is this ability to become trapped at a stationary point which causes the failure of classical methods of nonlinear programming and motivates the need to develop global search procedures. Therefore, any global search procedure must be able to address the fundamental question of how to transcend a given feasible solution (the current incumbent, which may be a stationary point). In other words, the core of global optimization is the following issue: Given a solution x0 (the incumbent), check whether it is a global optimal solution, and if it is not, find a better feasible solution. As in classical optimization theory, one way to deal with this issue is by devising criteria for recognizing a global optimum. For unconstrained dc optimization problems, we already know a global optimality criterion by Hiriart-Urruty (Proposition 4.20), namely: Given a proper convex l.s.c. function h.x/ on Rn and an arbitrary proper function g W Rn ! R [ fC1g, a point x0 such that g.x0 / < C1 is an unconstrained global minimizer of g.x/  h.x/ if and only if @" h.x0 /  @" g.x0 /

8" > 0:

(5.10)

We now present two global optimality criteria for the general problem min f f .x/j x 2 Sg;

(5.11)

where the set S  Rn is assumed nonempty, compact, and robust, i.e., S D cl.intS/, and f .x/ is a continuous function on S. Since f .x/ is then bounded from below on S, without loss of generality we can also assume f .x/ > 0 8x 2 S. Proposition 5.1 A point x0 2 S is a global minimizer of f .x/ over S if and only if ˛ WD f .x0 / satisfies either of the following conditions: (i) The sequence r.˛; k/; k D 1; 2; : : : ; is bounded (Falk 1973a), where Z  r.˛; k/ D S

˛ f .x/

k dx:

(ii) The set E WD fx 2 Sj f .x/ < ˛g has null Lebesgue measure (Zheng 1985). Proof Suppose x0 is not a global minimizer, so that f .x/ < ˛ for some x 2 S. By continuity we will have f .x0 / < ˛ for all x0 sufficiently close to x. On the other hand, since S is robust, in any neighborhood of x there is an interior point of S. Therefore, by taking a point x0 2 intS sufficiently close to x we will have f .x0 / < ˛. Let W be a neighborhood of x0 contained in S such that f .y/ < ˛ for all y 2 W. Then W  E, hence mesE  mesW > 0. Conversely, if mesE > 0, then E ¤ ;, hence x0 is not a global minimizer. This proves that (ii) is a necessary and sufficient condition for the optimality of x0 . We now show that (i) is equivalent to (ii). If mesE > 0, then since ˛ 1 ˛ 1 E D [C1 iD1 fx 2 Sj f .x/  1 C i g there is i0 such that E0 D fx 2 Sj f .x/  1 C i0 g has

138

5 Motivation and Overview

mesE0 D  > 0. This implies Z  S

˛ f .x/

k dx  .1 C 1=i0 /k ! C1 as k ! C1:

On the other hand, if mesE D 0 then Z  S

˛ f .x/

k



Z dx D SnE

˛ f .x/

k dx  mes.S n E/ 8k;

proving that (i) , (ii).

t u

Using Falk’s criterion it is even possible, when the global optimizer is unique, to derive a global criterion which in fact yields a closed formula for this unique global optimizer (Pincus 1968). Proposition 5.2 Assume, in addition to the previously specified conditions, that f .x/ has a unique global minimizer x0 on S. Then for every i D 1; : : : ; n: R xi ekf .x/ dx RS ! x0i kf .x/ dx Se

.k ! C1/:

(5.12)

Proof We show that for every i D 1; : : : ; n: R 'i .k/ WD

S

jxi  x0i jekf .x/ dx R !0 kf .x/ dx Se

.k ! C1/:

(5.13)

For any fixed " > 0 let W" D fx 2 S j kx  x0 k < "g and write R W"

'i .k/ D

jxi  x0i jekf .x/ dx R C kf .x/ dx Se

R SnW"

jxi  x0i jekf .x/ dx R : kf .x/ dx Se

Since x0 is the unique minimizer of f .x/ over S, clearly minff .x/j x 2 SnW" g > f .x0 / and there is ı > 0 such that maxff .x0 /  f .x/ j x 2 S n W" g < ı. Then R W"

R jxi  x0i jekf .x/ dx ekf .x/ dx R RW"  "  ": kf .x/ dx kf .x/ dx Se Se

(5.14)

On the other hand, setting M D maxx2S kx  x0 k, we have R SnW"

Z 0 jxi  x0i jekf .x/ dx ekŒf .x /f .x/ dx R R  M kf .x/ dx kŒf .x0 /f .x/ dx SnW" S e Se 

Mmes.S n W" /ekı R : kŒf .x0 /f .x/ dx Se

(5.15) (5.16)

5.6 DC Inclusion Associated with a Feasible Solution

139

R 0 The latter ratio tends to 0 as k ! C1 because S ekŒf .x /f .x/ dx is bounded by Proposition 5.1 (Falk’s criterion for x0 to be the global minimizer of ef .x/ /. Since " > 0 is arbitrary, this together with (5.14) implies (5.13). u t Remark 5.2 Since f .x/ > 0 8x 2 S, a minimizer of f .x/ over S is also a maximizer 1 of f .x/ and conversely. Therefore, from the above propositions we can state, under the same assumptions as previously (S compact and robust and f .x/ continuous positive on S/: (i) A point x0 2 S with f .x0 / D ˛ is a global maximizer of f .x/ over S if and only if the sequence Z  s.˛; k/ WD S

f .x/ ˛

k dx; k D 1; 2; : : : ;

is bounded. (ii) If f .x/ has a unique global maximizer x0 over S then x0i

R xi ekf .x/ dx D lim RS kf .x/ ; k!C1 dx Se

i D 1; : : : ; n:

5.6 DC Inclusion Associated with a Feasible Solution In view of the difficulties with computing multiple integrals the above global optimality criteria are of very limited applicability. On the other hand, using the complementary convex structure common to most optimization problems, it is possible to formulate global optimality criteria in such a way that in most cases transcending an incumbent solution x0 reduces to checking a dc inclusion of the form D  C;

(5.17)

where D; C are closed convex subsets of Rn depending in general on x0 . Furthermore, this inclusion is such that from any x 2 D n C it is easy to derive a better feasible solution than x0 . Specifically, for the concave minimization problem (5.4), C D fxj f .x/  f .x0 /g and D is the constraint set. Clearly these sets are closed convex, and x0 2 D is optimal if and only if D  C, while any x 2 D n C yields a better feasible solution than x0 .

140

5 Motivation and Overview

Consider now the canonical dc program (5.8): minfl.x/j x 2 G n intHg;

(5.18)

where G and H are closed convex subsets of R . As shown in Sect. 4.3 this is the general form to which one can reduce virtually any global optimization problem to be considered in this book. Assume that the feasible set D D G n intH is robust, i.e., satisfies D D cl.intD/ and that the problem is genuinely nonconvex, which means that n

minfl.x/j x 2 Gg < minfl.x/j x 2 G n intHg:

(5.19)

Proposition 5.3 Under the stated assumptions a feasible solution x0 to (5.18) is global optimal if and only if fx 2 Gj l.x/  l.x0 /g  H:

(5.20)

Proof Suppose there exists x 2 G \ fxj l.x/  l.x0 /g n H. From (5.19) we have l.z/ < l.x/ for at least some z 2 G \ intH. Then the line segment Œz; x meets the boundary @H at some point x0 2 GnintH, i.e., at some feasible point x0 . Since x … H, we must have x0 D xC.1/z with 0 <  < 1, hence l.x0 / D l.x/C.1/l.z/ < l.x/  l.x0 /. Thus, if (5.20) does not hold then we can find a feasible solution x0 better than x0 . Conversely suppose there exists a feasible solution x better than x0 . Let W be a ball around x such that l.x0 / < l.x0 / for all x0 2 W. Since D is robust one can find a point x0 2 W \ intD. Then clearly x0 2 G n H and l.x0 / < l.x0 /, contrary to (5.20). Therefore, if (5.20) holds, x0 must be global optimal. t u Thus, for problem (5.18) transcending x0 amounts to solving an inclusion (5.17) where D D fx 2 Gj l.x/  l.x0 /g and C D H. If x 2 G n H then the point x0 2 @H \ Œz; x yields a better feasible solution than x0 , where z is any point of D \ intH with l.z/ < l.x/ [this point exists by virtue of (5.19)]. We shall refer to the problem of checking an inclusion D  C or, equivalently, solving the inclusion x 2 D n C, as the DC feasibility problem. Note that for the concave minimization problem the set C D fxj f .x/  f .x0 /g depends on x0 , whereas for the canonical dc program the set D D fx 2 Gj l.x/  l.x0 /g depends on x0 .

5.7 Approximate Solution In practice, the computation of an exact global optimal solution may be too expensive while most often we only need the global optimum within some prescribed tolerance "  0. So problem .P/

minff .x/j x 2 Dg

5.7 Approximate Solution

141

can be considered solved if a feasible solution x has been found such that f .x/  f .x/  ";

8x 2 D:

(5.21)

Such a feasible solution is called a global "-optimal solution. Setting f D max f .D/  min f .D/, the relative error made by accepting f .x/ as the global optimum is f .x/  min f .D/ " "   ; f f M where M is a lower bound for f . With tolerance " the problem of transcending a feasible solution x0 then consists in the following: () Find a feasible solution z such that f .z/ < f .x0 / or else prove that x0 is a global "-minimizer. A typical feature of many global optimization problems which results from their complementary convex structure is that a subset X (typically, a finite subset of the boundary) of the feasible domain D is known to contain at least one global minimizer and is such that starting from any point z 2 D we can find, by local search, a point x 2 X with f .x/  f .z/. Under these circumstances, if a procedure is available for solving () then .P/ can be solved according to a two phase scheme as follows: Let z0 2 D be an initial feasible solution. Set k D 0. Phase 1 (Local phase). Starting from zk search for a point xk 2 X such that f .xk /  f .zk /. Go to Phase 2. Phase 2 (Global phase). Solve () for x D xk , by the available procedure. If the output of this procedure is the global "-optimality of xk , then terminate. Otherwise, the procedure returns a feasible point z such that f .z/ < f .xk /. Then set zk z, increase k by 1 and go back to Phase 1. Phase 1 can be carried out by using any local optimization or nondeterministic method (heuristic or stochastic). Only Phase 2 is what distinguishes deterministic approaches from the other ones. In the next chapters we shall discuss various methods for solving inclusions of the form (), i.e., practically for carrying out Phase 2, for various classes of problems. Clearly, the completion of one cycle of the above described iterative scheme achieves a move from one feasible point xk 2 X to a feasible point xkC1 2 X such that f .xkC1 / < f .xk /. If X is finite then the process is guaranteed to terminate after finitely many cycles by a global "-optimal solution. In the general case, there is the possibility of the process being infinite. Fortunately, however, for most problems it turns out that the iterative process can be organized so that, whenever infinite, it will converge to a global optimum; in other words, after a sufficient number of iterations it will yield a solution as close to the global optimum as desired.

142

5 Motivation and Overview

5.8 When is Global Optimization Motivated? Since global optimization methods are generally expensive, they should be used only when they are really motivated. Before applying a sophisticated global optimization algorithm to a given problem, it is therefore necessary to perform elementary operations (such as fixing certain variables to eliminate them, tightening bounds, and changing the variables) for improving or simplifying the formulation wherever possible. In some simple cases, an obvious transformation may reduce a seemingly nonconvex problem to a convex or even linear program. For instance, a problem of the form p minf 1 C .l.x//2 j x 2 Dg; where D is a polyhedron and l.x/ a linear function nonnegative on D is merely a disguise of the linear program maxfl.x/j x 2 Dg. More generally, problems such as minf˚.l.x/j x 2 Dg where D is a polyhedron, l.x/ a linear function, while ˚.u/ is a nondecreasing function of u on the set l.D/ should be recognized as linear programs and solved as such.1 There are, however, circumstances when more or less sophisticated transformations may be required to remove the apparent nonconvexities and reformulate the problem in a form amenable to known efficient solution methods. Therefore, preprocessing is often a necessary preliminary phase before solving any global optimization problem. Example 5.5 Consider the following problem encountered in the theoretical evaluation of certain fault-tolerant shortest path distributed algorithms (Bui 1993): .Q/

min n X

n X .  2xi /yi C xi W .ı  yi /xi iD1

xi  a;

n X

yi  b

iD1

iD1

 xi  ;

 yi  ı;

(5.22)

(5.23) (5.24)

where 0 < n  b; 0 < < 1 <  and 0 < ı < 1. Assume that n  a; nı  b (otherwise the problem is infeasible). The objective function looks rather complicated, and one might be tempted to study the problem as a difficult nonconvex optimization problem. Close scrutiny reveals, however, that certain variables can be fixed and eliminated. Specifically, by rewriting the objective function as " ! # n X 1  2ı  ı 1 C C2 : (5.25) xi ı  yi ı  yi iD1 1 Unfortunately, such linear programs disguised in “nonlinear” global optimization problems have been and still are sometimes used for “testing” global optimization algorithms.

5.8 When is Global Optimization Motivated?

143

it is clear that a feasible solution .x; y/ of (Q) can be optimal only if xi D ; i D 1; : : : ; n. Hence, we can set xi D ; i D 1; : : : ; n. Problem (Q) then becomes .Q/

" n X  min iD1 n X

! # 1  2ı ı 1 C C2 W ı  yi ı  yi

 yi  ı:

yi  b;

(5.26)

(5.27)

iD1

The objective function of .Q/ can further be rewritten as n X

"

iD1

  = C 2 ı  yi

#

1 with  D .ı/= C 1  2ı/ D .1  ı/ C ı.=  1/ > 0. Since each term ıy is a i convex function of yi , it follows that .Q/ is a convex program. Furthermore, due to the special structure of this program, an optimal solution of it can be computed in closed form by standard convex mathematical programming methods. Note, however, that if the inequalities (5.23) are replaced by equalities then the problem becomes a truly nonconvex optimization problem.

Example 5.6 (Geometric Programming) An important class of optimization problems called geometric programming problems, which have met with many engineering applications (see, e.g., Duffin et al. 1967), have the following general formulation: minfg0 .x/j gi .x/  1 .i D 1; : : : ; m/; x D .x1 ; : : : ; xn / > 0g; where each gi .x/ is a posynomial, i.e., a function of the form gi .x/ D

Ti X jD1

cij

n Y

.xk /aijk

i D 0; 1; : : : ; m

kD1

with cij > 0 8i D 1; : : : ; m; j D 1; : : : ; Ti . Setting xk D exp.tk /; gi .x/ D

Ti X

cij exp

jD1

X

! aijk tk

WD hi .t/;

k

one can rewrite this problem into the form minflog h0 .t/j log hi .t/  0

i D 1; : : : ; mg:

144

5 Motivation and Overview

It is now not hard to see that each function hi .t/ is convex on Rn . Indeed, if the vectors aij D .aij1 ; : : : ; aijn / 2 Rn are defined, then hi .t/ D

Ti X

cij eha

ij ;ti

jD1

and, since cij > 0, to prove the convexity of hi .t/ it suffices to show that for any vector a 2 Rn ; the function eha;ti is convex. But the function '.y/ D ey is convex increasing on the real line 1 < y < C1, while l.t/ D ha; ti is linear in t, therefore, by Proposition 2.8, the composite function '.l.t// D eha;ti is convex. Thus a geometric program is actually a convex program of a special form. Very efficient specialized methods exist for solving this problem (Duffin et al. 1967). Note, however, that if certain cij are negative, then the problem becomes a difficult dc optimization problem (see, e.g., McCormick 1982). Example 5.7 (Generalized Linear Programming) A generalized linear program is a problem of the form:

.GLP/

ˇ ˇ min cx W ˇ Pn j ˇ jD1 y xj D b ˇ ˇ yj 2 Y j D 1; : : : ; n j ˇ ˇ x  0;

(5.28)

where x 2 Rn ; yj 2 Rm .j D 1; : : : ; n/; b 2 Rm , and Yj  Rm .j D 1; : : : ; n/ are polyhedrons. The name comes from the fact that this problem reduces to an ordinary linear program when the vectors y1 ; : : : ; yn are constant, i.e., when each Yj is a singleton. Since yj are now variables, the constraints are bilinear. In spite of that, the analogy of the problem with ordinary linear programs suggests an efficient algorithm for solving it which can be considered as an extension of the classical simplex method (Dantzig 1963). A special case of (GLP) is the following problem which arises, e.g., from certain applications in agriculture: .SP/

min

n X

ci xi W

(5.29)

iD1

x 2 D  Rn xi D ui vi u 2 Œr; s 

(5.30)

i D 1; : : : ; p  n; p RCC ;

v2S

(5.31) p RC ;

(5.32)

where D; S are given polytopes in Rn ; Rp , respectively. Setting ui D xi ; vi D 1 for all i > p and substituting (5.31) into (5.29) and (5.30), we see that the problem is to

5.8 When is Global Optimization Motivated?

145

minimize a bilinear function in u; v under linear and bilinear constraints. However, this natural transformation would lead to a complicated nonlinear problem, whereas a much simpler solution method is by reducing it to a (GLP) as follows. Let 8 9 p < = X p D D v 2 RC j ˛ij vj D ei .i D 1; : : : ; m/ ; : ; 8 < SD

:

jD1

n X

x 2 RnC j

jD1

(5.33)

9 =

ˇkj xj D dk .k D 1; : : : ; l/ : ;

(5.34)

By rewriting the constraints (5.30)–(5.32) as p X

xj D ei uj

i D 1; : : : ; m

ˇkj xj D dk

k D 1; : : : ; l

˛ij

jD1 n X jD1

rj  uj  sj

j D 1; : : : ; p

xj  0 j D 1; : : : ; n; we obtain a (GLP) where b D

yj 2 RmCl ;

  e 2 RmCl , d

8 < ˛hj =uj h D 1; : : : ; m; j D 1; : : : ; p j yh D 0 h D 1; : : : ; m; j D p C 1; : : : ; n : ˇhm;j h D m C 1; : : : ; m C l; j D 1; : : : ; n

and Yj is a polyhedron in RmCl defined by ˛hj =rj  yh  ˛hj =sj

h 2 IjC

(5.35)

j yh

h 2 Ij

(5.36)

j

˛hj =sj 

 ˛hj =rj

IjC D fij ˛ij > 0g;

Ij D fij ˛ij < 0g:

(5.37)

An interesting feature worth noticing of this problem is that, despite the bilinear constraints, the set ˝ of all x satisfying (5.31), (5.32) is a polytope, so that .SP/ is actually a linear program (Thieu 1992). To see this, for every fixed u 2 Œr; s denote by uQ W Rp ! Rp the affine map that carries v 2 Rp to uQ .v/ 2 Rp with

146

5 Motivation and Overview

uQ i .v/ D ui vi ; i D 1; : : : ; p. Then the set uQ .S/ is a polytope, as it is the image of a polytope under an affine map. Now let uj ; j 2 J, be the vertices of the rectangle Œr; s and G be the convex hull of [j2J uQ j .S/; G D fx 2 Rn j .x1 ; : : : ; xp / 2 Gg. If x 2 G, P P j then xi D j2J j uQ i .v j /; i D 1; : : : ; p with v j 2 S; j  0, j2J j D 1 (Exercise 7, Chap. 1), and so xi D

X

j j

j ui vi ;

i D 1; : : : ; p:

(5.38)

j2J

P j j j Since ui > 0; vi  0; one can have j2J j vi D 0 only if xi D 0, so one can write P j xi D ui vi .i D 1; : : : ; p/, with v D j2J j v 2 S, and ( xi if vi > 0 ui D vi ri otherwise: Clearly, ri  ui  si ; i D 1; : : : ; p because ri vi  xi  si vi by (5.38). Thus x 2 G implies that x 2 ˝. Conversely, ifPx 2 ˝; i.e., xi D ui vi ;Pi D 1; : : : ; p j for some u 2 Œr; s; v 2 S, then u D j2J j u with j  0; j2J j D 1, P j  .u v /; i D 1; : : : ; p, which means that x 2 G. Therefore hence xi D j2J j i i ˝ D G, and so the feasible set of (SP) is a polytope. Though this result implies that .SP/ can be solved as a linear program, the determination of the constraints in this linear program [i.e., the linear inequalities describing the feasible set] may be cumbersome computationally. It should also be noted that the problem will become highly nonconvex if the rectangle Œr; s in the condition u 2 Œr; s is replaced by an arbitrary polytope. Example 5.8 (Semidefinite Programming) In recent years linear matrix inequality (LMI) techniques have emerged as powerful numerical tools for the analysis and design of control systems. Recall from Sect. 2.12.2 that an LMI is any matrix inequality of the form Q.x/ WD Q0 C

n X

xi Qi 0;

(5.39)

iD1

where Qi ; i D 0; 1; : : : ; m are real symmetric m  m matrices, and for any two real symmetric m  m matrices A; B the notation A B (A B, resp.) means that the matrix A  B is negative semidefinite (definite, resp.), i.e., the largest eigenvalue of A  B, max .A  B/, is nonpositive (negative, resp.). A semidefinite program is a linear program under LMI constraints, i.e., a problem of the form minfcxj Q.x/ 0g;

(5.40)

5.9 Exercises

147

where Q.x/ is as in (5.39). Obviously, Q.x/ is negative semidefinite if and only if supfuT Q.x/uj u 2 Rm g  0; Since for fixed u the function 'u .x/ WD uT Q.x/u is affine in x, it follows that the set G D fx 2 Rn j 'u .x/  0 8u 2 Rm g (the feasible set of (5.40)) is convex. Therefore, the problem (5.40) is a convex program which can be solved by specialized interior point methods (see, e.g., Boyd and Vandenberghe 2004). Note that the problem minfcxj Q.x/ 0g is also a semidefinite program , since Q.x/ 0 if and only if Q.x/ 0. Many problems from control theory reduce to nonconvex global optimization problems under LMI constraints. For instance, the “robust performance problem” (Apkarian and Tuan 1999) can be formulated as that of minimizing subject to N X kD1

zi A0i C

m X .ui Ai C vi AiCm / C B 0

(5.41)

iD1

ui D xi ; xi vi D

i D 1; : : : ; m

x 2 Œa; b; z 2 Œd; d ; N

(5.42) (5.43)

where A0i ; Ai ; B are real symmetric matrices (the constraint (5.41) is convex, while (5.42) are bilinear).

5.9 Exercises 1 Let C; D be two compact convex sets in R2 such that D  intC. Show that the distance d.x/ D inffkx  yk W y … Cg from a point x 2 D to the boundary of C is a concave function. Suppose C is a lake, and D an island in the lake. Show that finding the shortest bridge connecting D with the mainland (i.e., R2 n intC) is a concave minimization problem. If C; D are merely compact (not necessarily convex), show that this is a dc optimization problem. 2 (Weber problem with attraction and repulsion ) 1. A single facility is to be constructed to serve n users located at points aj 2 D on the plane (where D is a rectangle in R2 /. If the facility is located at x 2 D, then the attraction of the facility to user j is qj .hj .x//, where hj .x/ D kx  aj k is the distance from x to aj and qj W R ! R is a convex decreasing function (the farther x is away from aj the less attractive it looks to user j). Show that Pthe problem of finding the location of the facility with maximal total attraction njD1 qj .hj .x// is a dc optimization problem. 2. In practice, some of the points aj may be repulsion rather than attraction points for the facility. For example, there may be in the area D a garbage dump, or

148

5 Motivation and Overview

sewage plant, or nuclear plant, and one may wish the facility to be located as far away from these points as possible. If J1 is the set of attraction points, J2 the set of repulsion points, then in typical situations one may have ( qj Œhj .x/ D

˛j  wj kx  aj k j 2 J1 ; j j 2 J2 ; wj e j kxa k

where ˛j > 0 is the maximal attraction of point j 2 J1 , j > 0 is the rate of decay of the repulsion of point j 2 J2 and wj > 0 for all j. Show that in this case the problem of maximizing the function X j2J1

qj Œhj .x/ 

X

qj Œhj .x/

j2J2

is still a dc optimization problem. (When J2 D ; the problem is known as Weber’s problem). 3 In the previous problem, show that if there is an index i 2 J1 such that wi  P n i jD1;j¤i wj then a is the optimal location of the facility (“majority theorem”). 4 (Competitive location ) In Exercise 2, suppose that J2 D ; (no repulsion point), and the attraction of the new facility to user j is defined as follows. There exist already in the area several facilities and the new facility is meant to compete with these existing facilities. If ıj > 0 . j > ıj ; resp.) is the distance from user j to the nearest (farthest, resp.) existing facility, then the attraction of the new facility to user j is a function qj .t/ of the distance t from the new facility to user j such that 8 ˛j  wj t  ˆ ˆ < qj .t/ D .˛j  wj ıj / 1  ˆ ˆ : 0

tıj

j ıj

 0  t  ıj ıj  t  j t  j

(for user j the new facility is attractive as long as it is closer than any existing facility, but this attraction quickly decreases when the new facility is farther than some of the existing facilities and becomes zero when it is farther than the farthest existing facility). Denoting by hj .x/ the distance from the unknown location x of the new facility to user j, show that to maximize the total attraction one has again to solve a dc optimization problem. 5 In Example 5.3 on engineering design prove that r.x/ D maxfrj B.x; r/  Mg D inffkx  yk W y 2 @Mg (distance from x to @M/. Derive that r.x/ is a dc function (cf Exercise 1, Chap. 4), and hence that finding the maximal ball contained in a given compact set M  Rn reduces to maximizing a dc function under a dc constraint.

5.9 Exercises

149

6 Consider a system of disjunctive convex inequalities x 2

n [

Ci , where Ci D

iD1

fx W gi .x/  0g. Show that this system is equivalent to a single dc inequality p.x/  q.x/  0 and find the two convex functions p.x/; q.x/. P 7 In Example 5.4 introduce the new variables qil such that xil D qil j ylj and transform the pooling problem into a problem in qil ; ylj ; zij (eliminate the variables xil /. 8 (Optimal design of a water distribution network) Assuming that a water distribution network has a single source, a single demand pattern, and a new pumping facility at the source node, the problem is to determine the pump pressure H, the flow rate qi and the head losses Ji along the arcs i D 1; : : : ; s, so as to minimize the cost (see, e.g., Fujiwara and Khang 1990): f .q; H; J/ WD b1

X

Li .KLi =Ji /ˇ=˛ .qi =c/ˇ=˛ C b2 ja1 je1 H e2 C b3 ja1 jH

i

under the constraints P P q  i2out.k/ qi D ak k D 1; : : : ; n (flow balance) Pi2in.k/ i ˙Ji D 0 p D 1; : : : ; m (head loss balance) Pi2loop p min p D 1; : : : ; m (hydraulic requirement) i2r.k/ ˙Ji  H C h1  hk

 qi  qmin  0 i D 1; : : : ; s qmax i i (bounds) H max  H  H min  0

˛ KLi .qi =c/ =dmax  Ji i D 1; : : : ; s (Hazen–Williams equations) ˛ Ji  KLi .qi =c/ =dmin (Li is the length of arc i; K and c are the Hazen–Williams coefficients, a1 is the supply water rate at the source and b1 ; b2 ; b3 ; e1 ; e2 ; ˇ are constants such that 0 < e1 ; e2 < 1 and 1 < ˇ < 2:5;  D 1:85 and ˛ D 4:87/. By the change of variables pi D log qi ; i D log Ji , transform the first term of the sum defining f .q; H; J/ into a convex function of p; and reduce the problem to a dc optimization problem. 9 Solve the problem in Example 5.5. 10 Solve the problem minimize 3x1 C 4x2 y11 x1 C 2x2 C x3 D 5I

subject to y12 x1 C x2 C x4 D 3

x1 ; x2  0I y11  y12  1I 0  y11  2I 0  y12  2:

Chapter 6

General Methods

Methods for solving global optimization problems combine in different ways some basic techniques: partitioning, bounding, cutting, and approximation by relaxation or restriction. In this chapter we will discuss two general methods: branch and bound (BB) and outer approximation (OA). A basic operation involved in a BB procedure is partitioning, more precisely, successive partitioning.

6.1 Successive Partitioning Partitioning is a technique consisting in generating a process of subdivision of an initial set (which is either a simplex or a cone, or a rectangle) into several subsets of the same type (simplices, cones, rectangles, respectively) and reiterating this operation for each of the partition sets as many times as needed. There are three basic types of partitioning : simplicial, conical, and rectangular.

6.1.1 Simplicial Subdivision Consider a .p  1/-simplex S D Œu1 ; : : : ; up   Rn and an arbitrary point v 2 S, so that vD

p X iD1

i u i ;

p X

i D 1;

i  0 .i D 1; : : : ; p/:

iD1

Let I D fij i > 0g. It is a simple matter to check that for every i 2 I, the points u1 ; : : : ; ui1 ; v; uiC1 ; : : : ; up are vertices of a simplex Si  S and that: © Springer International Publishing AG 2016 H. Tuy, Convex Analysis and Global Optimization, Springer Optimization and Its Applications 110, DOI 10.1007/978-3-319-31484-6_6

151

152

6 General Methods

u1

H u2

v

v u3 u1

u4

u3

x0 u2 Fig. 6.1 Simplicial and conical subdivision

.riSi / \ .riSj / D ;

8j ¤ iI

[

Si D S:

i2I

We say that the simplices Si ; i 2 I form a radial partition(subdivision) of the simplex S via v (Fig. 6.1). Each Si will be referred to as a partition member or a “child” of S. Clearly, the partition is proper, i.e., consists of at least two members, if and only if v does not coincide with any ui . An important special case is when v is a point of a longest edge of the simplex S, i.e., v 2 Œuk ; uh , where kuk  uh k D maxfkui  uj kj i < jI i; j D 1; : : : ; pg (k  k denotes any given norm in Rn , for example, kxk D maxfjx1 j; : : : ; jxn jg/. If v D ˛uk C .1  ˛/uh with 0 < ˛  1=2 then the partition is called a bisection of ratio ˛ (S is divided into two subsimplices such that the ratio of the volume of the smaller subsimplex to that of S equals ˛/. When ˛ D 1=2, the bisection is called exact. Consider now an infinite sequence of nested simplices S1  S2      Sk     such that SkC1 is a child of Sk in a partition via someTv k 2 Sk . Such a sequence is called a filter (of simplices) and the simplex S1 D 1 kD1 Sk the limit of the filter. Let ı.S/ denote the diameter of S, i.e., the length of its longest edge. The filter is said to be exhaustive if \1 kD1 Sk is a singleton, i.e., if lim ı.Sk / D 0 (the filter shrinks to k!1

a single point). An important question about a filter concerns the conditions under which it is exhaustive. Lemma 6.1 (Basic Exhaustiveness Lemma) Let Sk D Œuk1 , : : : ; ukn ; k D 1; 2; : : :, be a filter of simplices such that SkC1 is a child of Sk in a subdivision via v k 2 Sk . Assume that: (i) For infinitely many k the subdivision of Sk is a bisection; (ii) There exists a constant  2 .0; 1/ such that for every k: maxfkv k  uki k W i D 1; : : : ; ng  ı.Sk /: Then the filter is exhaustive.

6.1 Successive Partitioning

153

Proof Denote ı.Sk / D ık . Since ıkC1  ık , it follows that ık tends to some limit ı as k ! 1. Arguing by contradiction, suppose ı > 0, so there exists t such that ık < ı for all k  t. By assumption (ii) we can write maxfkv k  uki k W i D 1; : : : ; ng  ık < ı  ık ;

8k  t:

(6.1)

Let us color every vertex of St “black” and for any k > t let us color “white” every vertex of Sk which is not black. Clearly, every white vertex of Sk must be a point v h for some h 2 ft; : : : ; k  1g, hence, by (6.1), maxfkv h  uhi k W i D 1; : : : ; ng  ıh < ı  ık . So any edge of Sk incident to a white vertex must have a length less than ık and cannot be a longest edge of Sk . In other words, for k  t a longest edge of Sk must have two black endpoints. Now denote K D fk  tj Sk is bisectedg. Then for k 2 K; v k belongs to a longest edge of Sk , hence, according to what has just been proved, to an edge of Sk joining two black vertices. Since v k becomes a white vertex of SkC1 , it follows that for k 2 K, SkC1 has one white vertex more than Sk . On the other hand, in the whole subdivision process the number of white vertices never decreases. That means, after one bisection the number of white vertices increases at least by 1. Therefore, after a certain number of bisections, we come up with a simplex Sk with only white vertices. Then, according to (6.1), every edge of Sk has length less than ı, conflicting with ı  ık D ı.Sk /. Therefore, ı D 0, as was to be proved. t u It should be emphasized that condition (ii) is essential for the exhaustiveness of the filter (see Exercise 1 of this chapter). This condition implies in particular that the bisections mentioned in (i) are bisections of ratio no less than 1   > 0. A filter of simplices satisfying (i) but not necessarily (ii) enjoys the following weaker property of “semi-exhaustiveness” which, however, may sometimes be more useful. Theorem 6.1 (Basic Simplicial Subdivision Theorem) Let Sk D Œuk1 ; : : : ; ukn , k D 1; 2; : : :, be a filter of simplices such that SkC1 is a child of Sk in a subdivision via a point v k 2 Sk . Then (i) Each sequence fuki ; k D 1; 2; : : :g has an accumulation point ui such that k 1 n S1 WD \1 kD1 S is the convex hull of u ; : : : ; u ; k (ii) The sequence fv ; k D 1; 2; : : :g has an accumulation point v 2 fu1 ; : : : ; un g; (iii) If for infinitely many k the subdivision of Sk is a bisection of ratio no less than ˛ > 0 then an accumulation point of the sequence fv k ; k D 1; 2; : : :g is a vertex of S1 D Œu1 ; : : : ; un . Proof (i) Since S1 is compact and fuki ; k D 1; 2; : : :g  S1 there exists a subsequence fks g  f1; 2; : : :g such that for every i D 1; : : : ; n, uks i ! ui 1 n as ˛i  0 such that 1 D Œu ; : : : ; u  then there exist P Pns ! C1. If x 2 SP n n i ks i ˛ D 1 and x D ˛ u D lim x , for x D 2 S ks , s!C1 s s iD1 i iD1 i iD1 ˛i u ks k hence x 2 S . Since this holds for every s and fS g is a nested sequence, it 1 k k ks follows that x 2 \1 kD1 S . So S1  \kD1 S . Conversely for any s if x 2 S , then

154

6 General Methods

P P P with ˛si  0 satisfying niD1 ˛si D 1 we have x D niD1 ˛si uks i ! niD1 ˛i ui k as s ! C1, hence x 2 Œu1 ; : : : ; un  D S1 . Therefore, S1 D \1 kD1 S D 1 n Œu ; : : : ; u . (ii) Sk is a child of Sk1 in a subdivision via v k1 , so a vertex of Sk , say ukik , must be v k1 . Since ik 2 f1; : : : ; ng there exists a subsequence fks g  f1; 2; : : :g such that iks is the same for all s, for instance, iks D 1 8s. Therefore v ks 1 D uks 1 8s. Since u1 is an accumulation point of the sequence fuk1 g we can assume that uks 1 ! u1 as s ! C1. Hence, v ks 1 ! u1 as s ! C1. (iii) This is obviously true if ı.S1 / D 0, i.e., if S1 is a singleton. Consider the case ı.S1 / > 0 which by Lemma 6.1 can occur only if condition (ii) of this Lemma does not hold, i.e., if for every s D 1; 2; : : : there is a ks such that maxfkv ks  uks i kj i D 1; : : : ; ng  s ı.Sks /; where s ! 1.s ! C1/. Since S1 is compact, by taking a subsequence if necessary we may assume that v ks ! v; uks i ! ui .i D 1; : : : ; n/. Then letting s ! 1 in the previous inequality yields maxiD1;:::;n kv  ui k D ı.S1 /. Since for each i the function x 7! kx  ui k is strictly convex, so is the function x 7! maxiD1;:::;n kx  ui k. Therefore, a maximizer of the latter function over the polytope convfu1; : : : ; un g must be a vertex of it: v 2 vertS1 . t u Incidentally, the proof has shown that one can have ı.Sk / > 0 only if infinitely many subdivisions are not bisections of ratio  ˛. Therefore: Corollary 6.1 A filter of simplices Sk ; k D 1; 2; : : : such that every SkC1 is a child of Sk via a bisection of ratio no less than ˛ > 0 is exhaustive. t u A finite collection N of simplices covering a set ˝ is called a (simplicial) net for ˝. A net for ˝ is said to be a refinement of another net if it is obtained by subdividing one or more members of the latter and replacing them with their partitions. A (simplicial) subdivision process for ˝ is a sequence of (simplicial) nets N1 ; N2 ; : : :, each of which, except the first one, is a refinement of its predecessor. Lemma 6.2 An infinite subdivision process N1 ; N2 ; : : :, generates at least one filter. Proof Since N1 has finitely many members, at least one of these, say Sk1 , has infinitely many descendants. Since Sk1 has finitely many children, one of these, say Sk2 , has infinitely many descendants. Continuing this way we shall generate an infinite nested sequence which is a filter. t u A subdivision process is said to be exhaustive if every filter fSk ; k D 1; 2; : : : ; g generated by the process is exhaustive. From Corollary 6.1 it follows that a subdivision process consisting only of bisections of ratio at least ˛ is exhaustive.

6.1 Successive Partitioning

155

6.1.2 Conical Subdivision Turning to conical subdivision, consider a fixed cone M0  Rn vertexed at x0 (throughout this chapter, by cone in Rn we always mean a polyhedral cone having exactly n edges). To simplify the notation assume x0 D 0. Let H be a fixed hyperplane meeting each edge of M0 at a point distinct from 0. For any cone M  M0 the set H \ M is an .n  1/-simplex S D Œu1 ; : : : ; un  not containing 0. We shall refer to this simplex S as the base of the cone M. Then any subdivision of the simplex S via a point v 2 S induces in an obvious way a subdivision (splitting) of the cone M into subcones, via the ray through v (Fig. 6.1). A subdivision of M is called bisection (of ratio ˛) if it is induced by a bisection (of ratio ˛) of S. A filter of cones, i.e., an infinite sequence of nested cones each of which is a child of its predecessor by a subdivision, is said to be exhaustive if it is induced by an exhaustive filter of simplices, i.e., if their intersection is a ray. Theorem 6.2 (Basic Conical Subdivision Theorem) Let C be a compact convex set in Rn , containing 0 in its interior, fMk ; k D 1; 2; : : :g a filter of cones vertexed at 0 and having each exactly n edges. For each k let zk1 ; : : : ; zkn be the intersection points of the n edges of Mk with @C (the boundary of C). If for each k MkC1 is a child of Mk in a subdivision via the ray through a point qk 2 Œzk1 ; : : : ; zkn  then one accumulation point of the sequence fqk ; k D 1; 2; : : : ; g belongs to @C. Proof Let Sk D Œuk1 ; : : : ; ukn  be the base of Mk . Then SkC1 is a child of Sk in a subdivision via a point v k 2 Sk , with v k being the point where Sk meets the halfline from 0 through qk . For every i D 1; : : : ; n let ui be an accumulation point of the sequence fuki ; k D 1; 2; : : :g. By Theorem 6.1, (ii), there exists ui such that v ks ! ui for some subsequence fks g  f1; 2; : : :g. Therefore, qks ! zi where zi D lims!1 zks i 2 @C (Fig. 6.2). t u

zk1

qk2

∂C

q Mk2

Mk1

vk1

0 Fig. 6.2 Basic conical subdivision theorem

qk1 zk2

156

6 General Methods

6.1.3 Rectangular Subdivision Q A rectangle in Rn is a set of the form M D Œr; s D niD1 Œri ; si  D fx 2 Rn j ri  xi  si .i D 1; : : : ; n/g. There are several ways to subdivide a rectangle M D Œr; s into subrectangles. However, in global optimization procedures, we will mainly consider partitions through a hyperplane parallel to one facet of the given rectangle. Specifically, given a rectangle M, a partition of M is defined by giving a point v 2 M together with an index j 2 f1; : : : ; ng: the partition sets are then the two subrectangles M and MC determined by the hyperplane xj D vj , namely: M D fxj rj  xj  vj ; ri  xi  si .i ¤ j/gI MC D fxj vj  xj  sj ; ri  xi  si .i ¤ j/g: We will refer to this partition as a partition via .v; j/ (Fig. 6.3). A nice property of this partition method is expressed in the following: Lemma 6.3 Let fMk WD Œrk ; sk ; k 2 Kg be a filter of rectangles such that Mk is partitioned via .v k ; jk /. Then there exists a subsequence K1  K such that jk D j0 8k 2 K1 and, as k ! C1; k 2 K1 , rjk0 ! rj0 ; skj0 ! sj0 ; vjk0 ! v j0 2 frj0 ; sj0 g:

(6.2)

Proof Without loss of generality we can assume K D f1; 2; : : :g. Since jk 2 f1; : : : ; ng there exists a j0 such that jk D j0 for all k forming an infinite subsequence K1  K. Because of boundedness we may assume vjk0 ! v j0 .k ! C1; k 2 K1 /. Furthermore, since rjk0  rjkC1  rjh0  shj0  skC1  skj0 for all h > k, we have j0 0 kC1 kC1 k k rj0 ! rj0 ; rj0 ! rj0 , and sj0 ! sj0 ; sj0 ! sj0 .k ! C1/. But vjk0 2 frjkC1 ; skC1 j0 g, 0 hence, by passing to the limit, v j0 2 frj0 ; sj0 g. t u

s2 v M−

M+

r2

r1 Fig. 6.3 Rectangular subdivision

v1

s1

6.1 Successive Partitioning

157

Theorem 6.3 (Basic Rectangular Subdivision Theorem) Let fMk D Œrk ; sk ; k 2 Kg be a filter of rectangles as in Lemma 6.3, where jk 2 argmaxf kj j j D 1; : : : ; ng;

kj D minfvjk  rjk ; skj  vjk g:

(6.3)

Then at least one accumulation point of the sequence of subdivision points fv k g coincides with a corner of the limit rectangle M1 D \C1 kD1 Mk . Proof Let K1 be the subsequence mentioned in Lemma 6.3. One can assume that rk ! r; sk ! s; v k ! v as k ! C1; k 2 K1 . From (6.2) and (6.3) it follows that for all j D 1; : : : ; n:

kj  kj0 ! 0 .k ! C1; k 2 K1 /; hence v j 2 frj ; sj g 8j, which means that v is a corner of M1 D Œr; s.

t u

The subdivision of a rectangle M D Œr; s via .v; j/ is called a bisection of ratio ˛ if the index j corresponds to a longest side of M and v is a point of this side such that j D min.vj  rj ; sj  vj / D ˛.sj  rj /. Corollary 6.2 Let fMk WD Œrk ; sk g be any filter of rectangles such that MkC1 is a child of Mk in a bisection of ratio no less than a positive constant ˛ > 0. Then the filter is exhaustive, i.e., the diameter ı.Mk / of Mk tends to zero as k ! 1. Proof Indeed, by the previous theorem, one accumulation point v of the sequence of subdivision points v k must be a corner of the limit rectangle. This can occur only if ı.Mk / ! 0. t u Theorem 6.4 (Adaptive Rectangular Subdivision Theorem) For each k D 1; 2; : : :, let uk ; v k be two points in the rectangle Mk D Œrk ; sk  and let wk D 12 .uk C v k /. Define jk by jukjk  vjkk j D max juki  vik j: iD1;:::;n

If for every k D 1; 2; : : : ; MkC1 is a child of Mk in the subdivision via .wk ; jk / then there exists an infinite subsequence K  f1; 2; : : :g such that kuk  v k k ! 0 as k 2 K; k ! C1. Proof By Theorem 6.3 there exists an infinite subsequence K  f1; 2; : : :g such that jk D j0 8k 2 K; wkj0 ! wj0 2 frj0 ; sj0 g as k 2 K; k ! C1. Since wkj0 D 12 .ukj0 C vjk0 / it follows that jukj0  vjk0 j ! 0 as k 2 K; k ! C1, i.e., maxiD1;:::;n juki  vik j ! 0 as k 2 K; k !1 . Hence, kv k  uk k ! 0 as k 2 K; k ! 1. t u The term “adaptive” comes from the adaptive role of this subdivision rule in the context of BB algorithms for solving problems with “nice” constraints, see Sect. 6.2 below. Its aim is to drive the difference kuk  v k k to zero as k ! C1. We refer to it as an adaptive subdivision via .uk ; v k /.

158

6 General Methods

6.2 Branch and Bound Consider a problem .P/

minff .x/j x 2 Dg;

where f .x/ is a real-valued continuous function on Rn and D is a closed bounded subset of Rn . In general it is not possible to compute an exact global optimal solution of this problem by a finite procedure. Therefore, in many cases one must be content with only a feasible solution with objective function value sufficiently near to the global optimal value. Given a tolerance " > 0 a feasible solution x such that f .x/  minff .x/j x 2 Dg  " is called a global "-optimal solution of the problem. A BB method for solving problem .P/ can be outlined as follows: – Start with a set (simplex or rectangle) M0  D and perform a (simplicial or rectangular, resp.) subdivision of M0 , obtaining the partition sets Mi ; i 2 I. If a feasible solution is available let x be the best feasible solution available and D f .x/. Otherwise, let D C1. – For each partition set Mi compute a lower bound ˇ.Mi / for f .Mi \ D/, i.e., a number ˇ.Mi / D C1 if M \ D D ;; ˇ.Mi /  f .M \ D/ otherwise. Update the current best feasible solution x and the current best feasible value . – Among all the current partition sets select a partition set M 2 argmini2I ˇ.Mi /. If  ˇ.M /  " stop: x WD CBS is a global "-optimal solution. Otherwise, further subdivide M , add the new partition sets to the old ones to form a new refined partitioning of M0 . – For each new partition set M compute a lower bound ˇ.M/. Undate x; and repeat the process. More precisely we can state the following

6.2.1 Prototype BB Algorithm Select a (simplicial or rectangular) subdivision rule and a tolerance "  0. Initialization. Start with an initial set (simplex or rectangle) M0 containing the feasible set D. If any feasible solution is readily available, let x0 be the best among such solutions and 0 D f .x0 /. Otherwise, set 0 D C1. Set P0 D fM0 g; S0 D fM0 g; k D 0. Step 1. (Bounding) For each set M 2 Pk compute a lower bound ˇ.M/ for f .M \ D/, i.e., a number such that ˇ.M/ D C1 if M \ D D ;;

ˇ.M/  inf f .M \ D/ otherwise:

(6.4)

6.2 Branch and Bound

159

If some feasible solutions have been known thus far, let xk be the best among all of them and set k D f .xk /. Otherwise, set k D C1. Step 2. (Pruning) Delete every M 2 Sk for which ˇ.M/  k  " and let Rk be the collection of remaining sets of Sk . Step 3. (Termination Criterion) If Rk D ;, terminate: xk is a global "-optimal solution. Otherwise, select Mk 2 argminfˇ.M/j M 2 Rk . Step 4. (Branching) Subdivide Mk according to the chosen subdivision rule. Let Pk be the partition of Mk . Reset Sk .Rk n fMk g/ [ Pk . If any new feasible solution is available, update the current best feasible solution xk and k D f .xk /. Set k k C 1 and return to Step 1. We say that bounding is consistent with branching (briefly, consistent) if ˇ.M/  minff .x/j x 2 M \ Dg ! 0 as diamM ! 0:

(6.5)

Proposition 6.1 If the subdivision is exhaustive and bounding is consistent, then the BB algorithm is convergent, i.e., it can be infinite only when " D 0 and then ˇ.Mk / ! v.P/ WD minff .x/j x 2 Dg as k ! C1. Proof First observe that if the algorithm stops at Step 3 then k < ˇ.M/ C " for every M 2 Rk , so k < C1, and k D f .xk /. Since xk is feasible and satisfies f .xk / < ˇ.M/C"  f .M\D/C" for every M 2 Sk , [see (6.4)], xk is indeed a global "-optimal solution. Suppose now that Step 3 never occurs, so that the algorithm is infinite. Since ˇ.Mk / < k C ", it follows from condition (6.4) that Mk \ D ¤ ;. So there exists xk 2 Mk \ D . Let zk 2 Mk \ D be a minimizer of f .x/ over Mk \ D (the points xk and zk exist, but may not be known effectively). We now contend that diam.Mk / ! 0 as k ! C1. In fact, for every partition set M let .M/ be the generation index of M defined by induction as follows: .M0 / D 1; whenever M 0 is a child of M then .M 0 / D .M/ C 1. It is easily seen that .Mk / ! C1 as k ! C1. But .Mk / D m means that Mk D \m iD1 Si where S1 D M0 ; Sm D Mk ; Si 2 Si ; SiC1 is a child of Si . Hence, since the subdivision process is exhaustive, diam.Mk / ! 0 as k ! C1. We then have zk  xk ! 0, so zk and xk tend to a common limit x 2 D. But by (6.5), f .zk /  ˇ.Mk / ! 0, i.e., ˇ.Mk / ! f .x /, hence, in view of (6.4), f .x /  v(P). Since x 2 D the latter implies f .x / D v(P), i.e., ˇ.Mk / ! v(P)). Therefore, for k large enough k D f .xk /  f .xk / < ˇ.Mk /C", a contradiction. t u As it is apparent from the above proof, the condition ˇ.M/ D C1 if M \ D D ; in (6.4) is essential in a BB algorithm: if it fails to hold, Mk \ D may be empty for some k, in which case x 2 \hk .Mh \ D/ is infeasible, hence ˇ.Mk / 6! v(P), despite (6.5). In the general case, however, it may be very difficult to decide whether a set Mk \ D is empty or not. It may also happen that computing a feasible solution is as hard as solving the problem itself, and then no feasible solution is available effectively at each iteration. Under these conditions, when the algorithm

160

6 General Methods

stops (case " > 0/ it only provides an approximate value of v.P/, namely a value ˇ.Mk / < v.P/ C "; when the algorithm is infinite (case " D 0/ it only provides a sequence of infeasible solutions converging to a global optimal solution.

6.2.2 Advantage of a Nice Feasible Set All the above difficulties of the BB algorithm disappear when the problem has a nice feasible set D. By this we mean that for any simplex or rectangle M a point x 2 M\D can be computed at cheap cost whenever it exists. More importantly, in most cases when the feasible set is nice a rectangular adaptive subdivision process can be used which ensures a faster convergence than an exhaustive subdivision process does. Specifically suppose for each rectangle Mk two points xk ; yk in Mk can be determined such that xk 2 Mk \ D;

yk 2 Mk ;

f .y /  ˇ.Mk / D o.kx  y k/: k

k

k

(6.6) (6.7)

Example 6.1 Let 'Mk .x/ be an underestimator of f .x/ tight at yk 2 Mk (i.e., a continuous function 'Mk .x/ satisfying 'Mk .x/  f .x/ 8x 2 Mk \ D; 'Mk .yk / D f .yk //; ˇ.Mk / D minf'Mk .x/j x 2 Mk \Dg, while xk 2 argminf'Mk .x/j x 2 Mk \Dg. (As is easily checked, f .yk /  ˇ.Mk / D 'Mk .yk /  'Mk .xk / ! 0 as xk  yk ! 0.) Example 6.2 Let f .x/ D f1 .x/  f2 .x/; xk 2 argminff1 .x/j x 2 Mk \ Dg, yk 2 argmaxff2 .x/j x 2 Mk g; ˇ.Mk / D f1 .xk /  f2 .yk /. (If f1 .x/ is continuous then f .yk /  ˇ.Mk / D f1 .yk /  f2 .yk /  Œf1 .xk /  f2 .yk / D f1 .yk /  f1 .xk / ! 0 as xk  yk ! 0) Under these conditions, the following adaptive bisection rule can be used for subdividing Mk : Bisect Mk via .wk ; jk / where wk D 12 .xk C yk / and jk satisfies kykjk  xkjk k D max kykj  xkj k: jD1;:::;n

Proposition 6.2 A rectangular BB algorithm using an adaptive bisection rule is convergent, i.e., it produces a global "-optimal solution in finitely many steps. Proof By Theorem 6.4 there exists a subsequence fk g such that yk  xk ! 0 . ! C1/

(6.8)

lim!C1 yk D lim!C1 xk D x 2 D:

(6.9)

From (6.6) and (6.7) we deduce ˇ.Mk / ! f .x / as  ! C1. But, since x 2 D, we have ˇ.Mk /  f .x /. Therefore, ˇ.Mk / ! v.P/ . ! C1/, proving the convergence of the algorithm. t u

6.3 Outer Approximation

161

We shall refer to a BB algorithm using an adaptive (exhaustive, resp.) subdivision rule as an adaptive (exhaustive , resp.) BB algorithm. To sum up, global optimization problems with all the nonconvexity concentrated in the objective function are generally easier to handle by BB algorithm than those with nonconvex constraints. This suggests that in dealing with nonconvex global optimization problems one should try to use valid transformations to shift, in one way or another, all the nonconvexity in the constraints to the objective function. Classical Lagrange multipliers or penalty methods, as well as decomposition methods are various ways to exploit this idea. Most of these methods, however, work under restrictive conditions. In the next chapter we will present a more practical way to handle hard constraints in dc optimization problems.

6.3 Outer Approximation One of the earliest methods of nonconvex optimization consists in approximating a given problem by a sequence of easier relaxed problems constructed in such a way that the sequence of solutions of these relaxed problems converges to a solution of the given problem. This approach, first introduced in convex programming in the late 1950s (Cheney and Goldstein 1959; Kelley 1960), was later extended under the name of outer approximation, to concave minimization under convex constraints (Hoffman 1981; Thieu et al. 1983; Tuy 1983; Tuy and Horst 1988) and to more general nonconvex optimization problems (Mayne and Polak 1984; Tuy 1983, 1987a). Referring to the fact that cutting planes are used to eliminate unfit solutions of relaxed problems, this approach is sometimes also called a cutting plane method. It should be noted, however, that in an outer approximation procedure cuts are always conjunctive, i.e., the set that results from the cuts is always the intersection of all the cuts performed. In this section we will present a flexible framework of outer approximation (OA) applicable to a wide variety of global optimization problems (Tuy 1995b), including DC optimization and monotonic optimization. Consider the general problem of searching for an element of an unknown set ˝  Rn (for instance, ˝ is the set of optimal solutions of a given problem). Suppose there exist a closed set G  ˝ and a family P of particular sets P  G, such that for each P 2 P a point x.P/ 2 P (called a distinguished point associated with P/ can be defined satisfying the following conditions: A1. x.P/ always exists and can be computed if ˝ ¤ ;, and whenever a sequence of distinguished points x1 D x.P1 /; x2 D x.P2 /; : : :, converges to a point x 2 G then x 2 ˝ (in particular, x.P/ 2 G implies that x.P/ 2 ˝/. A2. Given any distinguished point z D x.P/; P 2 P, we can recognize when z 2 G and if z … G, we can construct an affine function l.x/ (called a “cut”) such that P0 D P \ fxj l.x/  0g 2 P and l.x/ strictly separates z from G, i.e., satisfies l.z/ > 0;

l.x/  0 8x 2 G:

162

6 General Methods

In traditional applications ˝ is the set of optimal solutions of an optimization problem, G the feasible set, P a family of polyhedrons enclosing G, and x.P/ an optimal solution of the relaxed problem obtained by replacing the feasible set with P. However, it should be emphasized that in the most general case ˝; G; P; x.P/ can be defined in quite different ways, provided the above requirements A1, A2 are fulfilled. For instance, in several interesting applications, the set G may be defined only implicitly or may even be unknown, while x.P/ need not be a solution of any relaxed problem. Under assumptions A1, A2 a natural method for finding a point of ˝ is the following.

6.3.1 Prototype OA Procedure Initialization. Start with a set P1 2 P. Set k D 1. Step 1. Find the distinguished point xk D x.Pk / (by (A1)). If x.Pk / does not exist then terminate: ˝ D ;. If x.Pk / 2 G then terminate: x.Pk / 2 ˝. Otherwise, continue. Step 2. Using (A2) construct an affine function lk .x/ such that PkC1 D Pk \ fxj lk .x/  0g 2 P and lk .x/ strictly separates xk from G , i.e., satisfies lk .xk / > 0; Set k

lk .x/  0 8x 2 G:

(6.10)

k C 1 and return to Step 1.

The above scheme generates a nested sequence of sets Pk 2 P: P1  P2  : : :  Pk  : : :  G approximating G more and more closely from outside, hence the name given to the method (Fig. 6.4). The approximation is refined at each iteration but only around the current distinguished point (considered the most promising at this stage). An OA procedure as described above may be infinite (for instance, if x.P/ exists and x.P/ … G for every P 2 P/. An infinite OA procedure is said to be convergent if it generates a sequence of distinguished points xk , every accumulation point x of which belongs to G [hence solves the problem by (A1)]. In practice, the procedure is stopped when xk has become sufficiently close to G, which necessarily occurs for sufficiently large k if the procedure is convergent. The study of the convergence of an OA procedure relies on properties of the cuts lk .x/  0 satisfying condition (6.10). Let lk .x/ D hpk ; xi C ˛k ;

lk .x/ D

lk .x/ D hpk ; xi C ˛ k : kpk k

6.3 Outer Approximation

163 x1

x2

x3

Fig. 6.4 Outer approximation

Lemma 6.4 (Outer Approximation Lemma) If the sequence fxk g is bounded then lk .xk / ! 0

.k ! C1/:

Proof Suppose the contrary, that lkq .xkq /  > 0 for some infinite subsequence fkq g. Since fxk g is bounded we can assume xkq ! x .q ! C1/. Clearly, for every k, lk .xk / D lk .x/ C hpk ; xk  xi: But for every fixed k; lk .xkq /  0 8kq > k and letting q ! C1 yields lk .x/  0. Therefore, 0 <  lkq .xkq /  hpkq ; xkq  xi ! 0, a contradiction. u t Theorem 6.5 (Basic Outer Approximation Theorem) Let G be an arbitrary closed subset of Rn with nonempty interior, let fxk g be an infinite sequence in Rn , and for each k let lk .x/ D hpk ; xi C ˛k be an affine function satisfying (6.10). If the sequence fxk g is bounded and for every k there exist wk 2 G, and yk 2 Œwk ; xk nintG, such that lk .yk /  0; fwk g is bounded and, furthermore, every accumulation point of fwk g belongs to the interior of G, then x k  yk ! 0

.k ! C1/:

(6.11)

In particular, if a sequence fyk g as above can be identified whose every accumulation point belongs to G then the OA procedure converges. Proof By (6.10) hpk ; xk i < ˛ k  hpk ; wk i, so the sequence ˛ k is bounded. We can then select an infinite subsequence fkq g such that ˛ kq ! ˛; xkq ! x, ykq ! y; pkq ! p; wkq ! w .kpk D 1 and w 2 intG/. So lkq .x/ ! l.x/ WD px C ˛ 8x. Since l.x/ D lim lkq .x/ D limŒlkq .xkq / C hpkq ; x  xkq i D lim lkq .xkq /, by Lemma 6.4 we have l.x/ D 0. On the other hand, from (6.10), l.x/  0 8x 2 G, and since

164

6 General Methods

w 2 intG and l.x/ 6 0, we must have l.w/ < 0. From the hypothesis lk .yk /  0 it follows that hpk ; yk i C ˛ k  0, hence l.y/  0. But y D w C .1  /x for some 2 Œ0; 1, hence l.y/ D l.w/ C .1  /l.x/ D l.w/, and since l.y/  0; l.w/ < 0, this implies that D 0. Therefore, y D x. t u An important special case is the following: Corollary 6.3 Let G D fxj g.x/  0g, where g W Rn ! R is a convex function and fxk g  Rn n G a bounded sequence. If the affine functions lk .x/ satisfying (6.10) are such that lk .x/ D hpk ; x  yk i C ˛k ; pk 2 @g.yk /; yk 2 Œwk ; xk  n intG; 0  ˛k  g.yk /;

(6.12)

where fwk g is a bounded sequence every accumulation point of which belongs to the interior of G, and g.yk /  ˛k ! 0 .k ! C1/, then every accumulation point x of fxk g belongs to G. Proof Since fxk g and fwk g are bounded, fyk g is bounded, too. Therefore, fpk g is bounded (Theorem 2.6), and by Lemma 6.4, lk .xk / ! 0. Furthermore, by Theorem 6.5, xk yk ! 0, hence ˛k D lk .xk /hpk ; xk yk i ! 0. But by hypothesis, g.yk /  ˛k ! 0, consequently, g.yk / ! 0. If xkq ! x then by Theorem 6.5 ykq ! y D x, so g.x/ D lim g.yk / D 0, i.e., x 2 G. t u In a standard outer approximation procedure wk is fixed and ˛k D g.yk / 8k. But as will be illustrated in subsequent sections, by allowing more flexibility in the choice of wk and ˛k the above theorems can be used to establish convergence under much weaker assumptions than usual.

6.4 Exercises 1 Construct a sequence of nested simplices Mk such that: for every k of an infinite sequence   f1; 2; : : :g; MkC1 is a child of Mk by an exact bisection and nevertheless the intersection of all Mk ; k D 1; 2; : : :, is not a singleton. Thus, in Lemma 6.1 (which is fundamental to The Basic Subdivision Theorem) the condition (i) alone is not sufficient for the filter Mk to be exhaustive. 2 Prove directly (i.e., without using Lemma 6.1) that if a sequence of nested simplices Mk ; k D 1; 2; : : :, is constructed such that MkC1 is a child of Mk in a bisection of ration ˛ > 0 then the intersection of all Mk is a singleton. (Hint: observe that in a triangle Œa; b; c where Œa; b is the longest side, with length of ı > 0, and v is a point of this side such that minfkv  ak; kv  bkg  ˛ı with 0 < ˛  1=2 then there exists a constant 2 .0; 1/ such that kv  ck  ı; in fact, D kwsk ı where w is a point of Œa; b such that kw  ak D ˛ı and s is the third vertex of an equilateral triangle whose two other vertices are a; b.)

6.4 Exercises

165

3 Show that Corollary 6.2 can be strengthened as follows: Let fMk WD Œrk ; sk g  R be any filter of rectangles in Rn such that for infinitely many k; MkC1 is a child of Mk in a bisection of ratio no less than a positive constant ˛ > 0. Then the filter is exhaustive, i.e., diamMk ! 0 as k ! C1. (Hint: use Lemma 6.3; This result is in contrast with what happens for filters of simplices, see Exercise 1). 4 Consider a sequence of nonempty polyhedrons D1  : : :  Dk  : : : such that DkC1 D Dk \ fxj lk .x/  0g, where lk .x/ is an affine function. Show that if there exist a number " > 0 along with for each k a point xk 2 Dk n DkC1 such that the distance from xk to the hyperplane Hk D fxj lk .x/ D 0g is at least " then the sequence D1 ; : : : ; Dk ; : : : must be finite. 5 Suppose in Step 1 of the Prototype BB Algorithm, we take ˇ.M/  inf f .M \ P/; where P is a pregiven set containing D. Does the algorithm work with this modification?

Chapter 7

DC Optimization Problems

7.1 Concave Minimization Under Linear Constraints One of the most frequently encountered problems of global optimization is Concave Minimization (or Concave Programming) which consists in minimizing a concave function f W Rn ! R over a nonempty closed convex set D  Rn . In this and the next chapters we shall focus on the Basic Concave Programming (BCP) Problem which is a particular variant of the concave programming problem when all the constraints are linear, i.e., when D is a polyhedron. Assuming D D fxj Ax  b; x  0g, where A is an m  n matrix, and b an m-vector, the problem can be stated as .BCP/

minimize f .x/

(7.1)

subject to Ax  b

(7.2)

x  0:

(7.3)

This is the earliest problem of nonconvex global optimization studied by deterministic methods (Tuy 1964). Numerous decision models in operations research, mathematical economics, engineering design, etc., lead to formulations such as (BCP). Important special (BCP) models are plant location problems and also concave min-cost network flow problems which have many applications. Other models which originally are not concave can be transformed into equivalent (BCP)s. On the other hand, techniques for solving (BCP) play a central role in global optimization, as will be demonstrated in subsequent chapters. Recall that the concavity of f .x/ means that for any x0 ; x00 in Rn we have f ..1  /x0 C x00 /  .1  /f .x0 / C f .x00 /

8 2 Œ0; 1:

© Springer International Publishing AG 2016 H. Tuy, Convex Analysis and Global Optimization, Springer Optimization and Its Applications 110, DOI 10.1007/978-3-319-31484-6_7

167

168

7 DC Optimization Problems

Since f .x/ is finite throughout Rn , it is continuous (Proposition 2.3) . Later we shall discuss the case when f .x/ is defined only on D and may even be discontinuous at certain boundary points of D. Of course, all the difficulty of (BCP) is caused by the nonconvexity of f .x/. However, for the purpose of minimization, the concavity of f .x/ implies several useful properties listed below: (i) For any real the level set fxj f .x/  g is convex (Proposition 2.11). (ii) The minimum of f .x/ over any line segment is attained at one of the endpoints of the segment (Proposition 2.12). (iii) If f .x/ is bounded below on a halfline, then its minimum over the halfline is attained at the origin of the halfline (Proposition 2.12). (iv) A concave function is constant on any affine set M where it is bounded below (Proposition 2.12). (v) If f .x/ is unbounded below on some halfline then it is unbounded below on any parallel halfline (Corollary 2.4). (vi) If the level set fxj f .x/  g is nonempty and bounded for one , it will be bounded for all (Corollary 2.3). The first two properties, which are actually equivalent, characterize quasiconcavity. When D is bounded, these turn out to be the most fundamental properties for the construction of solution methods for (BCP), so many results in this chapter still remain valid for quasiconcave minimization over polytopes. Let be a real number such that the set C. / WD fxj f .x/  g is nonempty. By quasiconcavity and continuity of f .x/ this set is closed and convex. To find a feasible point x 2 D such that f .x/ < if there is one (in other words, to transcend ) we must solve the dc inclusion x 2 D n C. /:

(7.4) 0

This inclusion depends upon the parameter . Since C. /  C. / whenever  0 , the optimal value in (BCP) is the value inff j

D n C. / ¤ ;g

(7.5)

A first property to be noted of this parametric dc inclusion is that the range of can be restricted to a finite set of real numbers. This follows from Property (iii) listed above, or rather, from the following extension of it: Proposition 7.1 Either the global minimum of f .x/ over D is attained at one vertex of D, or f .x/ is unbounded below over some extreme direction of D. Proof Let z be a point of D, chosen so that z 2 argminx2D f .x/ if f .x/ is bounded on D and f .z/ < f .v/ for any vertex v of D otherwise. From convex analysis (Theorem 2.6) we know that the convex function f .x/ has a subgradient at z, i.e., a vector p such that hp; x  zi  f .x/  f .z/ 8x 2 Rn . Consider the linear program minimize hp; x  zi

subject to x 2 D:

(7.6)

7.1 Concave Minimization Under Linear Constraints

169

Since the objective function vanishes at x D z, by solving this linear program we obtain either an extreme direction u of D such that hp; ui < 0 or a basic optimal solution v 0 such that hp; v 0  zi  0. In the first case, for any x we have f .x C u/  f .z/  hp; x C u  zi D hp; x  x i C hp; ui ! 1 as  ! C1, i.e., f .x/ is unbounded below over any halfline of direction u. In the second case, v 0 is a vertex of D such that f .v 0 /  f .z/  hp; v 0  zi  0, hence f .v 0 /  f .z/. By the choice of z this implies that the second case cannot occur if f .x/ is unbounded below over D. Therefore, f .x/ has a minimum over D and z is a global minimizer. Since then f .v 0 / D f .z/, the vertex v 0 is also a global minimizer. t u By this proposition the search for an optimal solution can be restricted to the finite set of vertices and extreme directions of D, thus reducing (BCP) to a combinatorial optimization problem. Furthermore, if a vertex v 0 is such that f .v 0 /  f .v/ for all neighboring vertices v then by virtue of the same property v 0 achieves the minimum of f .x/ over the polytope spanned by v 0 and all its neighbors, and hence is a local minimizer of f .x/ on D. Therefore, given any point z 2 D one can compute a vertex x which is a local minimizer such that f .x/  f .z/ by the following simple procedure: Take a subgradient p of f .x/ at z. Solve (7.6) . If an extreme direction u is found with hp; ui < 0 then (BCP) is unbounded. Otherwise, a vertex v 0 is obtained with f .v 0 /  f .z/. Starting from v 0 , pivot from a vertex to a better adjacent one, and repeat until a vertex v k is reached which is no worse than any of its neighbors. Then x D v k is a local minimizer. Thus, according to the two phase scheme described in Chap. 5 (Sect. 5.7), to solve (BCP) within a tolerance "  0 we can proceed as follows: Let z 2 D be an initial feasible solution. Phase 1

Phase 2

(Local phase). Starting from z check whether the optimal value is unbounded (D 1), and if it is not, search for a vertex x of D such that f .x/  f .z/. Go to Phase 2. (Global phase). Solve the inclusion (7.4) for D f .x/". If the inclusion has no solution, i.e., DnC. / D ;, then terminate: x is a global "-optimal solution of (BCP). Otherwise, a feasible point z0 is obtained such that f .z0 / < f .x/  ". Then set z z0 , and go back to Phase 1.

Since a polyhedron has finitely many vertices, this scheme terminates after finitely many cycles and for sufficiently small " > 0 a global "-optimal solution is actually an exact global optimal solution. The key point, of course, is how to solve the dc inclusion (7.4).

7.1.1 Concavity Cuts Most methods of global optimization combine in different ways some basic techniques: cutting, partitioning, approximating (by relaxation or restriction), and dualizing. In this chapter we will discuss successive partitioning methods which combine cutting and partitioning techniques.

170

7 DC Optimization Problems

A combinatorial tool which has proven to be useful in the study of many nonconvex optimization problems is the concept of concavity cut first introduced in Tuy (1964). This is a device based on the concavity of the objective function and used to exclude certain portions of the feasible set in order to avoid wasteful search on non-promising regions. Without loss of generality, throughout the sequel we will assume that the feasible polyhedron D in (BCP) is full-dimensional. Consider the inclusion (7.4) for a given value of (for instance, D f .x/  ", where x is the incumbent feasible solution at a given stage). Let x0 be a vertex of D such that f .x0 / > . Suppose that the convex set C. / is bounded and that a cone M vertexed at x0 is available that contains D and has exactly n edges. Since x0 2 intC. /, these edges meet the boundary of C. / at n uniquely determined points z1 ; : : : ; zn which are affinely independent (because so are the directions of the n edges of M/. Let .x  x0 / D 1 be the hyperplane passing through these n points. From .zi  x0 / D 1

.i D 1; : : : ; n/;

(7.7)

we deduce D eQ1 ;

(7.8)

where Q D .z1  x0 ; : : : ; zn  x0 / is the nonsingular matrix with columns zi  x0 ; i D 1; : : : ; n, and e D .1; : : : ; 1/ is a row vector of n ones. Proposition 7.2 Any point x 2 D such that f .x/ < must lie in the halfspace eQ1 .x  x0 / > 1:

(7.9)

Proof Let S D Œx0 ; z1 ; : : : ; zn  denote the simplex spanned by x0 ; z1 ; : : : ; zn . Clearly, S D M\fxj eQ1 .xx0 /  1g. Since x0 ; z1 ; : : : ; zn 2 C. /, it follows that S  C. /, in other words, f .x/  for all x 2 S. Therefore, if x 2 D satisfies f .x/ < then x 2 M but x … S, and hence, x must satisfy (7.9). t u Corollary 7.1 The linear inequality eQ1 .x  x0 /  1

(7.10)

excludes x0 without excluding any point x 2 D such that f .x/ < . We call the inequality (7.10) (or the hyperplane H of equation eQ1 .x  x0 / D 1) a -valid cut for .f ; D/, constructed at x0 (Fig. 7.1). Corollary 7.2 If D f .x/  " for some x 2 D and it so happens that maxfeQ1 .x  x0 /j x 2 Dg  1

(7.11)

then f .x/  f .x/  " for all x 2 D, i.e., x is a global "-optimal solution of (BCP). Thus, (7.11) provides a sufficient condition for global "-optimality which can easily be checked by solving the linear program maxfeQ1 .x  x0 /j x 2 Dg.

7.1 Concave Minimization Under Linear Constraints Fig. 7.1 -valid concavity cut

171

D

x0 + θ2 u2 C(γ )

x0 + θ1 u1 x0

Remark 7.1 Denote by ui ; i D 1; : : : ; n, the directions of the edges of M and by U D .u1 ; : : : ; un / the nonsingular matrix with columns u1 ; : : : ; un , so that M D fxj x D x0 C Ut; t D .t1 : : : ; tn /T  0gI i.e., x 2 M if and only if t D U 1 .x  x0 /  0. Let zi D x0 C i ui , where i > 0 is defined from the condition f .x0 C i ui / D . Then Q D Udiag. 1 ; : : : ; n /, hence 1 1 Q1 D diag. ;    ; /U 1 and 1 n X ti 1 1 ;    ; /.t1 ; : : : ; tn /T D : 1 n iD1 i n

eQ1 .x  x0 / D .

Thus, if zi D x0 C i ui ; i D 1; : : : ; n then the cut (7.10) can also be written in the form n X ti  1; iD1 i

t D U 1 .x  x0 /;

(7.12)

which is often more convenient than (7.10) . Remark 7.2 Given a real number < f .x0 / and any point x ¤ x0 , if D supfj f .x0 C .x  x0 //  g then the point x0 C .x  x0 / is called the -extension of x with respect to x0 (or simply, the -extension of x if x0 is clear from the context). The boundedness of C. / ensures that < C1 (thus, each zi defined above is the -extension of v i D x0 C ui with respect to x0 /. Without assuming the boundedness of C. /, there may exist an index set I  f1; : : : ; ng such that each ith edge, i 2 I, of the cone M is entirely contained in C. /. In this case, for each i 2 I the -extension of v i D x0 C ui is

172

7 DC Optimization Problems

infinite, but x0 C i ui 2 C. / for all i > 0, so the inequality (7.12) still defines a -valid cut for whatever large i .i 2 I/. Making i D C1 .i 2 I/ in (7.12), we then obtain the cut X ti  1; i

t D U 1 .x  x0 /:

(7.13)

i…I

We will refer to this cut (which exists, regardless of whether C. / is bounded or not) as the -valid concavity cut for .f ; D/ at x0 (or simply concavity cut, when f ; D; ; x0 are clear from the context).

7.1.2 Canonical Cone The construction of the cut (7.13) supposes the availability of a cone M  D vertexed at x0 and having exactly n edges. When x0 D 0 (which implies b  0/, M can be taken to be the nonnegative orthant x  0. In the general case, to construct M we transform the problem to the space of nonbasic variables relative to x0 , in the following way. Recall that D D fx 2 Rn j Ax  b; x  0g where A is an m  n matrix and b an m-vector. Introducing the slack variables s D b  Ax, denote y D .x; s/ 2 RnCm . Let yB D .yi ; i 2 B/ be the vector of basic variables and yN D .yi ; i 2 N/ the vector of nonbasic variables relative to the basic feasible solution y0 D .x0 ; s0 /; s0 D b  Ax0 .jBj D m; jNj D n because dimD D n as assumed). The associated simplex tableau then yields yB D y0B  WyN ;

yB  0;

yN  0

(7.14)

where W D Œwij  is an m  n matrix and the basic solution y0 corresponds to yN D 0. By setting yN D t the constraints (7.14) become Wt  y0B ; t  0, so the polyhedron D is contained in the orthant t  0. Note that, since xi D yi for i D 1; : : : ; n, from (7.14) we derive x D x0 C Ut; where U is an n  n matrix of elements uij ; i D 1; : : : ; n; j 2 N, such that 8 < wij i 2 B uij D 1 iDj : 0 otherwise: Thus, the orthant t  0 corresponds, in the original space, to the cone M D fxj x D x0 C Ut; t  0g

(7.15)

7.1 Concave Minimization Under Linear Constraints

173

which contains D, is vertexed at x0 and has exactly n edges of directions represented by the n columns uj ; j D 1; : : : ; n, of U. For convenience, in the sequel we will refer to the cone (7.15) as the canonical cone associated with x0 . Concavity cuts of the above kind have been developed for general concave minimization problems. When the objective function f .x/ has some additional structure, as in concave quadratic minimization and bilinear programming problems, it is possible to significantly strengthen the cuts (Konno 1976a,b; see Chap. 10, Sect. 10.4). Also note that concavity cuts, originally designed for concave programming, have been used in other contexts (Balas 1971, 1972; Glover 1972, 1973a,b). Given the inclusion (7.4), where D f .x/", one can check whether the incumbent x is global "-optimal by using the sufficient optimality condition in Corollary 7.2. This suggests a cutting method for solving the inclusion which consists in iteratively reducing the feasible set by concavity cuts, until a global "-optimal solution is identified (see e.g. Bulatov 1977, 1987). Unfortunately, it has been observed that, when applied repeatedly the concavity cuts often become shallower and shallower, and may even leave untouched a significant portion of the feasible set. To make the cuts deeper, the idea is to combine cutting with splitting the feasible set into smaller and smaller parts. As illustrated in Fig. 7.2, by splitting the initial cone M0 into sufficiently small subcones and applying concavity cuts to each of the subcones separately, it is possible to exclude a portion of the feasible set as close to D \ C. / as desired, and obtain an accurate approximation to the set D n C. /.

7.1.3 Procedure DC With the preceding background, let us now consider the fundamental DC feasibility problem that was formulated in Sect. 7.1 as the key to the solution of (BCP): ./ Given two closed convex sets C; D in Rn , either find a point x 2 D n C;

z1

(7.16)

ω1 ω3

ω2

M0 x0 Fig. 7.2 Cut and split process

z2

174

7 DC Optimization Problems

or else prove that D  C:

(7.17)

As shown in Sect. 7.1, if x is the best feasible solution of (BCP) known thus far (the incumbent), then solving ./ for D D fxj Ax  b; x  0g and C D fxj f .x/  f .x/g amounts to transcending x. According to the general two phase scheme for solving (BCP) within a given tolerance " > 0, it suffices, however, to find a point x 2 D n C (i.e., a better feasible solution than x/ or else to establish that D  C" for C" D fxj f .x/  f .x/  "g (i.e., that x is a global "-optimal solution). Therefore, we will discuss a slight modification of Problem ./, by relaxing the condition (7.17) to D  C" ;

(7.18)

where C" is a closed convex set such that C0 D C;

C  intC" if " > 0:

(7.19)

In other words, we will consider the problem .#/ For a given closed convex set C" satisfying (7.19) find a point x 2 D n C or else prove that D  C" . We will assume that C" is bounded, since by restricting ourselves to the case when (BCP) has a finite optimal solution it is always possible to take C" D fxj f .x/  f .x/  "; kxk  g with sufficiently large  > 0. Let x0 be a vertex of D such that f .x0 /  f .z/ for all neighboring vertices z and f .x0 / > f .x/  " (i.e., x0 2 intC" / and let M0 D fxj x D x0 C U0 t; t  0g be the canonical cone associated with x0 as defined above (so the columns u01 ; : : : ; u0n of U0 are the directions of the n edges of M0 /. By S0 D Œv 01 ; : : : ; v 0n  denote the simplex spanned by v 0i D x0 C u0i ; i D 1; : : : ; n. For a given subcone M of M0 with base S D Œv 1 ; : : : ; v n   Z0 , i.e., a subcone generated by n halflines from x0 through v 1 ; : : : ; v n , let U be the nonsingular matrix of columns ui D v i  x0 ; i D 1; : : : ; n. If i D maxfj x0 C ui 2 C" g (so that x0 C i ui is the -extension of x0 C ui for D f .x/  "/, then, as we saw from (7.12), the inequality n X ti  1; iD1 i

t D U 1 .x  x0 /

defines a -valid cut for .f ; D \ M/ at x0 . Consider the linear program maxf

n X ti j iD1 i

t D U 1 .x  x0 /  0; Ax  b; x  0g:

i.e., LP.D; M/

n X ti Q  b; Q t  0g maxf j AUt i iD1

(7.20)

7.1 Concave Minimization Under Linear Constraints

175

with  AQ D

   A b  Ax0 Q : ; bD x0 I

(7.21)

Denote the optimal value and a basic optimal solution of this linear program in t by .M/ and t.M/, respectively, and let !.M/ D x0 C Ut.M/. Proposition 7.3 If .M/  1 then D \ M  C" , i.e., f .x/  f .x/  " for all points x 2 D \ M. If .M/ > 1 and f .!.M//  f .x/  " then !.M/ does not lie on any edge of M. Proof Clearly .M/ and !.M/ are the optimal value and a basic optimal solution of the linear program maximize eQ1 .x  x0 / subject to x 2 D \ M; where Q D Udiag. 1 ; : : : ; n / (i.e., Q is the matrix of columns 1 u1 ; : : : ; n un ). The first assertion then follows from Proposition 7.2. To prove the second assertion, suppose that .M/ > 1 and f .!.M//  f .x/  " while !.M/ belongs to some jth n X ti D 1 cuts this edge at the point x0 C j uj we must edge. Since the hyperplane i iD1

have !.M/ D x0 C .M/. j uj / D x0 C uj for D .M/ j > j which contradicts the definition of j because f .!.M//  f .x/  ". t u

From this proposition it follows that if M0 can be partitioned into a number of subcones such that .M/  1 for every subcone, then D  C" , i.e., x is a global "optimal solution. Furthermore, if .M/ > 1 for some subcone M while f .!.M//  f .x/  ", then the subdivision of M via the ray through !.M/ will be proper. We can now state the following procedure for the master problem (#), assuming D D fxj Ax  b; x  0g, C D fxj f .x/  f .x/g, and x is a vertex of D.

7.1.3.1 Procedure DC Let x0 be a vertex of D which is a local minimizer, such that x0 2 intC" , let M0 D fxj x D x0 C U0 t; t  0g be a canonical cone associated with x0 , S0 D Œx0 C u01 ; : : : ; x0 C u0n , where u01 ; : : : ; u0n are the columns of U0 . Set P D fM0 g; N D fM0 g. Step 1.

For each cone M 2 P, of base S D Œx0 C u1 ; : : : ; x0 C un   S0 let U be the matrix of columns u1 ; : : : ; un . Compute i such that x0 C i ui 2 @C" .i D 1; : : : ; n/ and solve the linear program [see (7.20)]: LP.D; M/

maxf

n X ti Q  b; Q t  0g j AUt iD1 i

176

Step 2. Step 3. Step 4. Step 5.

Step 6.

7 DC Optimization Problems

to obtain the optimal value .M/ and a basic optimal solution t.M/. Let !.M/ D x0 C Ut.M/. If for some M 2 P; !.M/ … C, i.e., f .!.M// < f .x/, then terminate: !.M/ yields a point z 2 D n C. Delete every cone M 2 N for which .M/  1 and let R be the collection of all remaining cones. If R D ; then terminate: D  C" .x is a global "-optimal solution of .BCP/). Let M 2 argmaxf.M/j M 2 Rg, with base S . Select a point v  2 S which does not coincide with any vertex of S . Split S via v  , obtaining a partition P of M . Set N .R n fM g/ [ P , P P and return to Step 1.

Incorporating the above procedure into the two phase scheme for solving (BCP) (see Sect. 5.7), we obtain the following: CS (Cone Splitting) Algorithm for (BCP) Let z0 be an initial feasible solution. Phase 1. Phase 2.

Compute a vertex x which is a local minimizer such that f .x/  f .z0 /. Call Procedure DC with CDfxj f .x/  f .x/g; C" Dfxj f .x/  f .x/"g. (a) If the procedure returns a point z 2 D such that f .z/ < f .x/ then set z0 z and go back to Phase 1. (Optionally: with D f .z/ and x0 being the vertex of D where Procedure DC has been started, construct a -valid cut .xx0 /  1 for .f ; D/ at x0 and reset D D\fxj .x x0 /  1g/. (b) If the procedure establishes that D  C" , then x is a global "-optimal solution.

With regard to convergence and efficiency, a key point in Procedure DC is the subdivision strategy, i.e., the rule of how the cone M in Step 5 must be subdivided. Let Mk denote the cone M at iteration k. Noting that !.Mk / does not lie on any edge of Mk , a natural strategy that comes to mind is to split the cone Mk upon the halfline from x0 through !.Mk /, i.e., to subdivide its base Sk via the intersection point v k of Sk with this halfline. A subdivision of this kind is referred to as an !subdivision and the strategy which prescribes !-subdivision in every iteration is called the !-strategy. This natural strategy was used in earliest works in concave minimization (Tuy 1964; Zwart 1974) and although it seemed to work quite well in practice (Hamami and Jacobsen 1988), its convergence was never rigorously established, until the works of Jaumard and Meyer (2001), Locatelli (1999), and most recently Kuno and Ishihama (2015). Now we can sate the following: Theorem 7.1 The CS Algorithm with !-subdivisions terminates after finitely many steps, yielding a global "-optimal solution of (BCP). Proof Thanks to Theorem 6.2, a new proof for this result can be given which is much shorter than all previously published ones in the above cited works. In fact, consider iteration k of Procedure DC with !-subdivisions. The cone Mk (i.e., M in step 5) is subdivided via the halfline from x0 through ! k WD !.Mk /. Let zk1 ; : : : ; zkn

7.1 Concave Minimization Under Linear Constraints

177

be the points where the edges of Mk meet the boundary @C" of C" and let qk be the point where the halfline from x0 through ! k meets the simplex Œzk1 ; : : : ; zkn . By Theorem 6.2 if the procedure is infinite there is an infinite subsequence fks g such that qks ! q 2 @C" as s ! 1. By continuity of f .x/ it follows that f .qks / ! f .q/ D f .x/  ", so f .! ks / < f .x/ for sufficiently large ks . Since this is the stopping criterion in Step 2 of the procedure, the algorithm terminates after finitely many iterations by the stopping criterion in Step 4. t u Remark 7.3 It is not hard to show that ! k D .Mk /qk . From the above proof there is an infinite subsequence fks g such that k! ks  qks k ! 0, i.e., ..Mks /  1/kqks k ! 0, hence .Mks / ! 1 .s ! 1/:

(7.22)

Later we will see that this condition is sufficient to ensure convergence of the CS Algorithm viewed as a Branch and Bound Algorithm. Remark 7.4 The CS Algorithm for solving (BCP) consists of a finite number of cycles each of which involves a Procedure DC for transcending the current incumbent. Any new cycle is actually a restart of the algorithm, whence the name CS restart given sometimes to the algorithm. Due to restart, the set Nk (the collection of cones to be stored) can be kept within reasonable size to avoid storage problems. Furthermore, the vertex x0 used to initialize Procedure DC in each cycle needs not be fixed throughout the whole algorithm, but may vary from cycle to cycle. When started from x0 Procedure DC concentrates the search over the part of the boundary of D contained in the canonical cone associated with x0 , so a change of x0 means a change of search directions. This multistart effect may sometimes drastically improve the convergence speed. On the other hand, further flexibility can be given to the algorithm by modifying Step 2 of Procedure DC as follows: Step 2.

(Modified) If for some M 2 N ; !.M/ … C, i.e., f .!.M// < f .x/ then, optionally: (a) either set z0 !.M/ and return to Phase 1, or (b) compute a vertex xO of D such that f .Ox/  f .!.M//, reset x xO . and go to Step 3.

With this modification, as soon as in Step 2 of Procedure DC a better feasible solution is found which is better than the incumbent, the algorithm either returns to Phase 1 (with z0 !.M// or goes to Step 3 (with an updated incumbent) to continue the current Procedure DC. In general, the former option (“return to Phase 1”) should be chosen when the set N has reached a critical size. The efficiency of the multirestart strategy is illustrated in the following example: Example 7.1 Consider the problem minimize x1  10x2 C 10x3 C x8 x21  x22  x23  x24  7x25  4x26  x27  2x28 C2x1 x2 C 6x1 x5 C 6x2 x5 C 2x3 x4

178

7 DC Optimization Problems

ˇ ˇ x1 C 2x2 C x3 C x4 C x5 C x6 C x7 C x8  8 ˇ ˇ 2x C x C x  9 2 3 ˇ 1 ˇ ˇ x3 C x4 C x5  5 subject to ˇ ˇ 0:5x5 C 0:5x6 C x7 C 2x8  3 ˇ ˇ 2x2  x3  0:5x4  5 ˇ ˇ x1  6; xi  0 i D 1; 2; : : : ; 8 A local minimizer of the problem is the vertex x0 D .0; 0; 0; 0; 5; 1; 0; 0/ of the feasible polyhedron, with objective function value 179. Suppose that Procedure DC is applied to transcend this local minimizer. Started from x0 the procedure runs through 37 iterations, generating 100 cones, without detecting any better feasible solution than x0 nor producing evidence that x0 is already a global optimal solution. By restarting the procedure from the new vertex x0 .0; 2:5; 0; 0; 0; 0; 0; 0/ we find after eight iterations that no better feasible solution than x0 exists. Therefore, x0 is actually a global minimizer.

7.1.4 !-Bisection As said above, for some time the convergence status of conical algorithms using !subdivisions was unsettled. It was then natural to look for other subdivisions which could secure convergence, even at some cost for efficiency. The simplest subdivision that came to mind was bisection: at each iteration the cone M is split along the halfline from x0 through the midpoint of a longest edge of the simplex S . In fact, a first conical algorithm using bisections was developed (Thoai and Tuy 1980) whose convergence was proved rigorously. Subsequently, however, it was observed that the convergence achieved with bisections is too slow. This motivated the Normal Subdivision (Tuy 1991a) which is a kind of hybrid strategy, using !-subdivisions in most iterations and bisections occasionally, just enough to prevent jamming and eventually ensure convergence. At the time when the pure !-subdivision was efficient, yet uncertain to converge, Normal Subdivision was the best hybrid subdivision ensuring both convergence and efficiency (Horst and Tuy 1996). However, once the convergence of pure !subdivision has been settled, bisections are no longer needed to prevent jamming in conical algorithms. On the other hand, an advantage of bisections over other kinds of subdivision is that each bisection creates only two new partition sets, so that using bisections may help to keep the growth of the collection N of accumulated partition sets within more manageable limits. Moreover, recent investigations by Kuno and Ishihama (2015) show that bisections can be built in such a way to take account as well of current problem structure just like !-subdivisions.

7.1 Concave Minimization Under Linear Constraints

179

The following concept of !-bisection is a slight modification of that introduced by Kuno–Ishihama. The modification allows us to dispense with assuming strict concavity of f .x/ and also to simplify the convergence proof. Let Mk be the cone to be subdivided at iteration k of Procedure DC, Sk D n Œuk1 ; : : : ; ukn   RP its base, v k the point Sk meets the ray through ! k D Pn where n k ki k k !.Mk /. Let v D iD1 i u where iD1 i D 1; ki  0 .i D 1; : : : ; n/. Take ˛ > 0, and for every k define ik D

ki C ˛ ; i D 1; : : : ; n; 1 C n˛

Clearly ik > 0 8k; i D 1; : : : ; n; longest edge of the simplex Sk and wk D

Pn

k iD1 i

vQ k D

n X

ik uki :

iD1

D 1, so vQ k 2 Sk . Let Œukrk ; uksk  be a

rkk ukrk C skk uksk rkk C skk

:

(7.23)

The !-bisection of Mk is then defined to be the subdivision of Mk via the halfine from x0 through wk (or, in other words, the subdivision of Mk induced by the subdivision of Sk via wk /. Since ik > 0 8i the set fvQ k ; uki ; i … frk ; sk gg is affinely independent and, as is easily seen, wk is just the point where the segment Œukrk ; uksk  meets the affine manifold through vQ k ; uki ; i … frk ; sk g. Theorem 7.2 The CS Algorithm with !-bisections terminates after finitely many steps, yielding a global "-optimal solution of (BCP). Proof Since ˛ 1Cn˛ kn

rkk k rk C skk



krk C˛ 1Cn˛



˛ 1Cn˛

the !-bisection of Sk is a bisection of Sk of

> 0, so by Corollary 6.1 the set \1 ratio  kD1 Sk is a singleton fvg. Hence, if k1 z ; : : : ; z are the points where the halflines from x0 through uk1 ; : : : ; ukn meet @C" while qk ; qO k are the points where the halfline from x0 through v k meets the simplex Œzk1 ; : : : ; zkn  and @C" , respectively, then kqk  qO k k ! 0 as k ! 1. Therefore there is q 2 @C" such that qk ; qO k ! q as k ! 1. Since ! k WD !.Mk / 2 Œqk ; qO k  it follows that ! k ! q 2 @C" . By reasoning just as in the last part of the proof of Theorem 6.1 we conclude that the CS Algorithm with !-bisections terminates after finitely many iterations, yielding a global "-optimal solution. t u From the computational experiments reported in Kuno and Ishihama (2015) the CS Algorithm with !-bisection seems to compete favorably with the CS Algorithm using !-subdivisions.

180

7 DC Optimization Problems

7.1.5 Branch and Bound Let M be a cone vertexed at a point x0 2 D and having n edges of directions u1 ; : : : ; un . A lower bound of f .M \ D/ to be used in a conical BB algorithm for solving (BCP) can be computed as follows. Let be the current incumbent value. Assuming that  " < f .x0 / (otherwise 0 x is already a global "-optimal solution of (BCP) compute i D supfj f .x0 C ui /   "g; i D 1; : : : ; n;

(7.24)

and by  D .M/ denote the optimal value of the linear program LP.D; M/ given by (7.20). If yi D x0 C  i ui ; i D 1; : : : ; n then clearly D \ M  Œx0 ; y1 ; : : : ; yn , so by concavity of f .x/ a lower bound of f .D \ M/ can be taken to be the value ˇ.M/ D

" if .M/  1 minff .y1 /; : : : ; f .yn /g otherwise:

(7.25)

(see Fig. 7.3). Using this lower bounding M 7! ˇ.M/ together with the conical !-subdivision rule we can then formulate the following BB (branch and bound) version of the CS Algorithm discussed in Sect. 5.4. Conical BB Algorithm for (BCP) Initialization Compute a feasible solution z0 . Compute a vertex x of D which is a local minimizer such that f .x/  f .z0 /. Take a vertex x0 of D (local minimizer) such that f .x0 / > f .x/  ". Let M0 D fxj x D x0 C U0 t; t  0g be the canonical cone associated with x0 ; S0 D Œx0 C u01 ; : : : ; x0 C u0n , where u0i ; i D 1; : : : ; n are the columns of U0 . Set P D fM0 g; N D fM0 g; D f .x/.

Phase 1 Phase 2

y2

f (x) = γ − ε

M x0 Fig. 7.3 Computation of ˇ.M/

y1

7.1 Concave Minimization Under Linear Constraints

Step 1.

181

(Bounding) For each cone M 2 P, of base S D Œz1 ; : : : ; zn   S0 let U be the matrix of columns ui D zi  x0 ; i D 1; : : : ; n. Compute i ; i D 1; : : : ; n according to (7.24) and solve the linear program [see (7.20)]: (

LP.D; M/

Step 2.

n X ti max j iD1 i

) Q  b; Q t0 AUt

(7.26)

to obtain the optimal value .M/ and a basic optimal solution t.M/. Let !.M/ D x0 C Ut.M/. Compute ˇ.M/ as in (7.25). (Options) If for some M 2 P; f .!.M// < , then do either of the options: (a) with f .!.M// construct a .  "/-valid cut for .f ; D/ at x0 W 0 h ; x  x i  1, reset D D \ fxj h ; x  x0 i  1g, z0 !.M/, and go back to Phase 1; (b) compute a vertex xO of D such that f .Ox/  f .!.M//, reset x xO ; f .Ox/ and go to Step 3.

Step 3. Step 4. Step 5. Step 6.

(Pruning) Delete every cone M 2 N for which ˇ.M/   " and let R be the collection of all remaining cones. (Termination Criterion) If R D ;, then terminate: x is a global "-optimal solution of (BCP). (Branching) Let M 2 argminfˇ.M/j M 2 Rg, with base S . Perform an !-subdivision of M and let P be the partition of M . (New Net) Set N .R n fM g/ [ P , P P and return to Step 1.

Theorem 7.3 The above algorithm can be infinite only if " D 0 and then any accumulation point of the sequence fxk g is a global optimal solution of (BCP). Proof Since D has finitely many vertices, if the algorithm is infinite, a Phase 2 must be infinite and generate a filter fMk ; k 2 Kg. Let k be the current value of , and let k D .Mk /; ! k D !.Mk /; D lim k . By formula 7.22 (which extends easily to the case when k & /, there exists K1  K such that k ! 1; ! k ! !, with ! satisfying f .!/ D " as k ! 1; k 2 K1 . But in view of Step 2, f .! k /  k , hence f .!/  , which is possible only if " D 0. Moreover, without loss of generality we can assume ˇ.Mk / D f .x0 C k 1k u1k /, and so, in that case, k  ˇ.Mk /  f . 1k u1k /  f .k 1k u1k / ! 0 as k ! 1; k 2 K1 . Therefore, by Proposition 6.1 the algorithm converges. t u Remark 7.5 Option (a) in Step 2 allows the algorithm to be restarted when storage problems arise in connection with the growth of the current set of cones. On the other hand, with option (b) the algorithm becomes a unified procedure. This flexibility sometimes enhances the practicability of the algorithm. Note, however, that the BB algorithm requires more function evaluations than the CS Algorithm (for each cone M the CS algorithm only requires the value of f .x/ at !.M/, while the BB Algorithm requires the values of f .x/ at n points y1 ; : : : ; yn /. This should be taken into account when functions evaluations are costly, as it occurs in problems

182

7 DC Optimization Problems

where the objective function f .x/ is implicitly defined via some auxiliary problem (for instance, f .x/ D minfhQx; yij y 2 Gg, where Q 2 Rpn and G is a polytope in Rp /.

7.1.6 Rectangular Algorithms Aside from simplicial and conical subdivisions, rectangular subdivisions are also used in branch and bound algorithms for (BCP). Especially, rectangular subdivisions are often preferred when the objective function is such that a lower bound of its values over a rectangle can be obtained in a straightforward manner. The latter occurs, in particular, when the function is separable, i.e., has the form f .x/ D Pn iD1 fi .xi /. Consider now the Separable Basic Concave Program .SBCP/

minff .x/ WD

n X

fj .xj /j

x 2 Dg;

(7.27)

jD1

where D is a polytope contained in the rectangle Œc; d D fx W c  x  dg and each function fj .t/ is concave and continuous on the interval Œcj ; dj . Separable programming problems of this form appear quite frequently in production, transportation, and planning models. Also note that any quadratic function can be made separable by an affine transformation (see Chap. 4, Sect. 4.5). P Proposition 7.4 If f .x/ D njD1 fj .xj / is a separable concave function and M D ŒrI s is any rectangle with c  r; s  d, then there is an affine minorant of f .x/ on M which agrees with f .x/ at the vertices of M. This function is given by 'M .x/ D

n X

'M;j .xj /;

jD1

where, for every j; 'M;j .:/ is the affine function of one variable that agrees with fj .:/ at the endpoints of the interval Œrj ; sj , i.e., 'M;j .t/ D fj .rj / C

fj .sj /  fj .rj / .t  rj /: sj  rj

(7.28)

Proof That 'M .x/ is affine and minorizes f .x/ over M is obvious. On the other hand, if x is a vertex of M then, for every j, we have either xj D rj or xj D sj , hence 'M;j .xj / D fj .xj /; 8j, and therefore, f .x/ D 'M .x/. t u On the basis of this proposition a lower bound for f .x/ over a rectangle M D Œr; s can be computed by solving the linear program LP.D; M/

minf'M .x/j x 2 D \ Mg:

(7.29)

7.1 Concave Minimization Under Linear Constraints

183

If !.M/ and ˇ.M/ denote a basic optimal solution and the optimal value of this linear program, then ˇ.M/  min f .D\M/, with equality holding when f .!.M// D 'M .!.M//. It is also easily seen that if a rectangle M D ŒrI s is entirely outside D then D\M is empty and M can be discarded from consideration (ˇ.M/ D C1). On the other hand, if M is entirely contained in D then D\M D M, hence the minimum of f .x/ over D \ M is achieved at a vertex of M and since 'M .x/ agrees with f .x/ at the vertices of M, it follows that ˇ.M/ D min f .D \ M/. Therefore, in a branch and bound rectangular algorithm for (SBCP) only those rectangles intersecting the boundary of D will need to be investigated, so that the search will essentially be concentrated on this boundary. Proposition 7.5 Let fMk ; k 2 Kg be a filter of rectangles as in Lemma 6.3. If jk 2 argmaxffj .!jk /  'Mk ;j .!jk /jj D 1; : : : ; ng

(7.30)

then there exists a subsequence K1  K such that f .! k /  'Mk .! k / ! 0 .k ! 1; k 2 K1 /:

(7.31)

Proof By Lemma 6.3, we may assume, without loss of generality, that jk D 1 8k 2 K1 and !1k  r1k ! 0 as k ! 1; k 2 K1 (the case !1k  sk1 ! 0 is similar). Since 'Mk ;1 .r1k / D f1 .r1k /, it follows that f1 .!1k /  'Mk ;1 .!1k / ! 0: The choice of jk D 1 by (7.30) then implies that fj .!jk /  'Mk ;j .!jk / ! 0 .j D 1; : : : ; n/, hence (7.31). t u The above results provide the basis for the following algorithm, essentially due to Falk and Soland (1969): BB Algorithm for (SBCP) Initialization Let x0 be the best feasible solution available, P1 D N1 D fM0 WD Œc; dg. Set k D 1. Step 1.

Step 2. Step 3. Step 4. Step 5.

Step 6.

(Bounding) For each rectangle M 2 Pk solve the linear program LP.D; M/ given by (7.29) to obtain its optimal value ˇ.M/ and a basic optimal solution !.M/. (Incumbent) Update the incumbent by setting xk equal to the best among xk1 and all !.M/; M 2 Pk . (Pruning) Delete every M 2 Nk such that ˇ.M/  f .xk /  ". Let Rk be the collection of remaining members of Nk . (Termination Criterion) If Rk D ;, then terminate: xk is a global "-optimal solution of (SBCP). (Branching) Choose Mk 2 argminfˇ.M/jM 2 Rk g. Let ! k D !.Mk /. Choose jk according to (7.30) and subdivide Mk via .! k ; jk /. Let PkC1 be the partition of Mk . (New Partitioning) Set NkC1 D .Rk n fMk g/ [ PkC1 . Set k k C 1 and go back to Step 1.

184

7 DC Optimization Problems

Theorem 7.4 The above algorithm can be infinite only if " D 0 and in this case every accumulation point of the sequence fxk g is a global optimal solution of (SBCP). Proof Proposition 7.5 ensures that every filter fMk ; k 2 Kg contains an infinite subsequence fMk ; k 2 K1 g such that f .! k /  ˇ.Mk / ! 0 .k ! 1; k 2 K1 /: The conclusion then follows from Proposition 5.1.

t u

Remark 7.6 The subdivision via (! k ; jk ) as used in the above algorithm is also referred to as !-subdivision. Note that if Mk is chosen as in Step 5, and ! k coincides with a corner of Mk then, as we saw above [see comments following (7.29)], f .! k / D ˇ.Mk /, hence ! k solves (SBCP). Intuitively this suggests that convergence would be faster if the subdivision tends to bring ! k more rapidly to a corner of Mk , i.e., if !-subdivision is used instead of bisection as in certain similar rectangular algorithms (see, e.g., Kalantari and Rosen 1987). In fact, although bisection also guarantees convergence (in view of its exhaustiveness, by Corollary 6.2), for (SBCP) it seems to be generally less efficient than !-subdivision. Also note that just as with conical or simplicial subdivisions, rectangular procedures using !-subdivision do not need any special device to ensure convergence.

7.2 Concave Minimization Under Convex Constraints Consider the following General Concave Minimization Problem: .GCP/

minff .x/j

x 2 Dg;

(7.32)

where f W Rn ! R is a concave function, and D D fxj g.x/  0g is a compact convex set defined by the convex function g W Rn ! R. Since f .x/ is continuous and D is compact an optimal solution of (GCP) exists. Using the outer approximation (OA) approach described in Chap. 5 we will reduce (GCP) to solving a sequence of (BCP)s. First it is not hard to check that the conditions (A1) and (A2) required in the Prototype OA Algorithm are fulfilled. Let G D D, let ˝ be the set of optimal solutions to .GCP/, P the collection of polyhedrons that contain D. Since G D D is a compact convex set, it is easily seen that Condition A2 is fulfilled by taking l.x/ to be an affine function strictly separating z from G. The latter function l.x/ can be chosen in a variety of different ways. For instance, according to Corollary 6.3, if g.x/ is differentiable then to separate xk D x.Pk / from G D D one can take lk .x/ D hrg.xk /; x  xk i C g.xk /

(7.33)

7.2 Concave Minimization Under Convex Constraints

185

[so yk D xk ; ˛k D g.xk / in (6.12)]. If an interior point x0 of D is readily available, the following more efficient cut (Veinott 1967) should be used instead: lk .x/ D hpk ; x  yk i C g.yk /;

yk 2 Œx0 ; xk  n intG; pk 2 @g.yk /:

(7.34)

Thus, to apply the outer approximation method to (GCP) it only remains to define a distinguished point xk WD x.Pk / for every approximating polytope Pk so as to fulfill Condition A1.

7.2.1 Simple Outer Approximation In the simple outer approximation method for solving (GCP), xk D x.Pk / is chosen to be a vertex of Pk minimizing f .x/ over Pk . In other words, denoting the vertex set of Pk by Vk one defines xk 2 argminff .x/j x 2 Vk g:

(7.35)

Since f .xk /  min f .D/ it is obvious that x D lim!1 xk 2 D implies that x solves (GCP), so Condition A1 is immediate. A practical implementation of the simple OA method thus requires an efficient procedure for computing the vertex set Vk of Pk . At the beginning, P1 can be taken to be a simple polytope (most often a simplex) with a readily available vertex set V1 . Since Pk is obtained from Pk1 by adding a single linear constraint, all we need is a subroutine for solving the following subproblem of on-line vertex enumeration: Given the vertex set Vk of a polytope Pk and a linear inequality lk .x/  0, compute the vertex set VkC1 of the polytope PkC1 D Pk \ fxj lk .x/  0g: Denote Vk D fv 2 Vk j lk .x/ < 0g; VkC D fv 2 Vk j lk .x/ > 0g: Proposition 7.6 VkC1 D Vk [ W, where w 2 W if and only if it is the intersection point of the hyperplane lk .x/ D 0 with an edge of Pk joining an element of VkC with an element of Vk . Proof The inclusion Vk  VkC1 is immediate from the fact that a point of a polyhedron of dimension n is a vertex if and only if it satisfies as equalities n linearly independent constraints (Corollary 1.17). To prove that W  VkC1 let w be a point of an edge Œv; u joining v 2 VkC with u 2 Vk . Then w must satisfy as equalities n1 linearly independent constraints of Pk (Corollary 1.17). Since all these constraints are binding for v while lk .v/ > 0 (because v 2 VkC /, the inequality lk .x/  0

186

7 DC Optimization Problems

cannot be a linear combination of these constraints. Hence w satisfies n linearly independent constraints defining PkC1 , i.e., w 2 VkC1 . Conversely if w 2 VkC1 then either lk .w/ < 0 (i.e., w 2 Vk /, or else lk .w/ D 0. In the latter case, n  1 linearly independent constraints of Pk must be binding for w, so w is the intersection of the edge determined by these n1 constraints with the hyperplane lk .y/ D 0. Obviously, one endpoint of this edge must be some v 2 Vk such that lk .v/ > 0. t u Assume now that the polytopes Pk and PkC1 are nondegenerate and for each vertex v 2 Vk the list N.v/ of all its neighbors (adjacent vertices) together with the index set I.v/ of all binding constraints for v is available. Also assume that jVkC j  jVk j (otherwise, interchange the roles of these sets). Based on Proposition 7.6, VkC1 D Vk [ W, where W (set of new vertices) can be computed as follows (Chen et al. 1991): On-Line Vertex Enumeration Procedure Step 1.

Set W

; and for all v 2 VkC :

• for all u 2 N.v/ \ Vk : – compute  2 Œ0; 1 such that lk .v C .1  /u/ D 0I – set w D v C .1  /uI N.w/ D fug; N.u/ .N.u/ n fvg/ [ fwgI I.w/ D .I.v/ \ I.u// [ fg where  denotes the index of the new constraint. – set W W [ fwg. Step 2.

For all v 2 W; u 2 W W • if I.u/\I.v/ D n1 then set N.v/

N.v/[fug; N.u/

N.u/[fvg.

In the context of outer approximation, if Pk is nondegenerate then PkC1 can be made nondegenerate by slightly perturbing the hyperplane lk .x/ D 0 to prevent it from passing through any vertex of Pk . In this way, by choosing P1 to be nondegenerate, one can, at least in principle, arrange to secure nondegeneracy during the whole process. Earlier variants of on-line vertex enumeration procedures have been developed by Falk and Hoffman (1976), Thieu et al. (1983), Horst et al. (1988), and Thieu (1989). Remark 7.7 If an interior point x0 of D is available, the point yk 2 Œx0 ; xk  \ @D is feasible and xk 2 argminff .y1 /; : : : ; f .yk /g is the incumbent at iteration k. Therefore, a global "-optimal solution is obtained when f .xk /  f .xk /  ", which must occur after finitely many iterations. The corresponding algorithm was first proposed by Hoffman (1981). A major difficulty with simple OA methods for global optimization is the rapid growth of the set Vk which creates serious storage problems and often makes the computation of VkC1 a formidable task. Remark 7.8 When D is a polyhedron, g.x/ D maxiD1;:::;m .hai ; xibi /, so if xk … D, then haik ; xk ibik > 0 for ik 2 argmaxfhai ; xk ibi j i D 1; : : : ; mg, and rg.xk / D aik . Hence the cut (7.33) is lk .x/ D haik ; xi  bik  0, and coincides thus with the constraint the most violated by xk . Therefore, for (BCP) a simple OA method produces an exact optimal solution after finitely many iterations. Despite this and

7.2 Concave Minimization Under Convex Constraints

187

several other positive features, the uncontrolled growth of Vk is a handicap which prevents simple OA methods from competing favorably with branch and bound methods, except on problems of small size. In a subsequent section a dual version of OA methods called “polyhedral annexation” will be presented which offers restart and multistart possibilities, and hence, can perform significantly better than simple OA procedures in many circumstances. Let us also mention a method for solving (BCP) via “collapsing polytopes” proposed by Falk and Hoffman (1986) which can be interpreted as an OA method with the attractive feature that at each iteration only n C 1 new vertices are to be computed for Pk . Unfortunately, strong requirements on nondegeneracy limit the applicability of this ingenious algorithm.

7.2.2 Combined Algorithm By replacing f .x/, if necessary, with minf; f .x/g where  > 0 is sufficiently large, we may assume that the upper level sets of the concave function f .x/ are bounded. According to (7.35), the distinguished point xk of Pk in the above OA scheme is a global optimal solution of the relaxed problem .SPk /

minff .x/j x 2 Pk g:

The determination of xk in (7.35) involves the computation of the vertex set Vk of Pk . However, as said in Remark 7.7, a major difficulty of this approach is that, for many problems the set Vk may grow very rapidly and attain a prohibitively large size. To avoid computing Vk an alternative approach is to compute xk by using a BB Algorithm, e.g., the BB conical procedure, for solving .SPk /. A positive feature of this approach that can be exploited to speed up the convergence is that the relaxed problem (SPk ) differs from .SPk1 / only by one single additional linear constraint. Due to this fact, it does not pay to solve each of these problems to optimality. Instead, it may suffice to execute only a restricted number of iterations (even one single iteration) of the conical procedure for .SPk ), and take xk D !.Mk /, where Mk is the current candidate for branching. At the next iteration k C 1, the current conical partition in the conical procedure for (SPk ) may be used to start the conical procedure for (SPkC1 ). This idea of combining OA with BB concepts leads to the following hybrid procedure (Fig. 7.4). Combined OA/BB Conical Algorithm for (GCP) Initialization Take an n-simplex Œs1 ; : : : ; snC1  inscribed in D and an interior point x0 of this simplex. For each i D 1; : : : ; n C 1 compute the intersection si of the halfline from x0 through si with the boundary of D, Let x1 2 argminff .s1 /; : : : ; f .snC1 /g, 1 D f .x1 /, Let Mi be the cone with vertex at x0 and n edges passing through sj ; j ¤ i, N1 D fMi ; i D 1; : : : ; n C 1g, P1 D N1 . (For any subcone of a cone M 2 P1 its base is taken to be the simplex spanned by the n intersection points of the edges of M with the boundary of D) Let P1 be an initial polytope enclosing D. Set k D 1.

188

7 DC Optimization Problems

Fig. 7.4 Combined algorithm

Pk

Pk+1

ωk

ωk

Mk

x0 D

∂C

Step 1.

(Bounding) For each cone M 2 Pk of base Œu1 ; : : : ; un  compute the point x0 C i ui where the ith edge of M meets the surface f .x/ D k . Let U be the matrix of columns u1 ; : : : ; un . Solve the linear program ( n ) X ti LP.Pk ; M/ max j x0 C Ut 2 Pk ; t  0 i iD1 to obtain the optimal value .M/ and a basic optimal solution t.M/. Let !.M/ D x0 C Ut.M/. If .M/ > 1, compute zi D x0 C .M/ i ui ; i D 1; : : : ; n and ˇ.M/ D minff .z1 /; : : : ; f .zn /g:

Step 2. Step 3. Step 4. Step 5. Step 6.

(7.36)

(Pruning) Delete every cone M 2 Nk such that .M/  1 or ˇ.M/  k  ". Let Rk be the collection of remaining members of Nk . (Termination criterion 1) If Rk D ; terminate: xk is a global "-optimal solution. (Termination criterion 2) Select Mk 2 argminfˇ.M/j M 2 Rk g. If ! k WD !.Mk / 2 D terminate: ! k is a global optimal solution. (Branching) Subdivide Mk via the halfline from x0 through ! k , obtaining the partition PkC1 of Mk . (Outer approximation) Compute pk 2 @g.! k /, where ! k is the intersection of the halfline from x0 through ! k with the boundary of D and define PkC1 D Pk \ fxj hpk ; x  ! k i C g.! k /  0g:

Step 7.

(Updating) Let xkC1 be the best of the feasible points xk ; ! k . Let kC1 D f .xkC1 /, NkC1 D .Rk n fMk g/ [ PkC1 . Set k k C 1 and go back to Step 1.

Theorem 7.5 The Combined Algorithm can be infinite only if " D 0, and in that case every accumulation point of the sequence fxk g is a global optimal solution of (GCP).

7.3 Reverse Convex Programming

189

Proof First observe that for any k we have ˇ.Mk /  min f .D/ while f .! k /  minff .zk1 /; : : : ; f .zkn /g D ˇ.Mk /. Therefore, when Step 4 occurs ! k is effectively a global optimal solution. We now show that for " > 0 if Step 5 never occurs the current best value k decreases by at least a quantity " after finitely many iterations. In fact, suppose that k D k0 for all k  k0 . Let C D fx 2 Rn j f .x/  k0 g and by zki denote the intersection point of the ith edge of Mk with @C. Let k D .Mk / and by qk denote the intersection point of the halfline from x0 through ! k with the simplex Œzk1 ; : : : ; zkn . By Theorem 6.2 there exists a subsequence fks g  f1; 2 : : : ; g and a point q 2 @C such that qks ! q as s ! C1. Moreover, ks ! 1 as s ! C1 [see (7.22) in Remark 7.3]. Since ! ks  x0 D ks .qks  x0 / it follows that ! ks ! q 2 @C, hence f .! ks / ! f .q/ D k0 as s ! C1. But f .! k /  ˇ.Mk / 8k, hence ˇ.Mks /  f .! ks / ! k0 as s ! C1. Therefore, for s sufficiently large we will have ˇ.Mks  k0  " D ks  ", and then the algorithm stops by the criterion in Step 3 yielding k0 as the global "-optimal value and xks as a global "-optimal solution. Thus if k0 is not yet the global "-optimal value, we must have k  k0  " for some k > k0 . Since min f .D/ < C1, this proves the finiteness of the algorithm. t u Remark 7.9 If for solving (SPk ) the CS Algorithm is used instead of the BB algorithm, then we have a Combined OA/CS Conical Algorithm for (GCP). In Step 1 of the latter algorithm, the values i are computed from the condition f .x0 C i ui / D k  ", and there is no need to compute ˇ.M/ while in Step 2 a cone M is deleted if .M/  1. The OA/CS Algorithm involves less function evaluations than the OA/BB Algorithm. Combined algorithms of the above type were introduced for general global optimization problems in Tuy and Horst (1988) under the name of restart BB Algorithms. A related technique of combining BB with OA developed in Horst et al. (1991) leads essentially to the same algorithm in the case of .GCP/.

7.3 Reverse Convex Programming A reverse convex inequality is an inequality of the form h.x/  0;

(7.37)

where h W Rn ! R is a convex function. When such an inequality is added to the constraints of a linear program minfcxjAx  b; x  0g, the latter becomes a complicated nonconvex global optimization problem .LRC/

minimize cx subject to Ax  b; h.x/  0:

(7.38) x0

(7.39) (7.40)

190

7 DC Optimization Problems

Fig. 7.5 Reverse convex constraint

D c, x = γ

h(x) ≤ 0 C

The difficulty is brought on by the additional reverse convex constraint which destroys the convexity and sometimes the connectedness of the feasible set (Fig. 7.5). Even when h.x/ is as simple as a convex quadratic function, testing the feasibility of the constraints (7.39)–(7.40) is already NP-hard. This is no surprise since (BCP), which is NP-hard, can be written as a special (LRC): minftj

Ax  b; x  0; t  f .x/  0g:

Reverse convex constraints occur in many problems of mechanics, engineering design, economics, control theory, and other fields. For instance, in engineering design a control variable x may be subject to constraints of the type kx  dk  ı .; i:e:; h.x/ WD kx  dk  ı  0/, or hpi ; xi  ˛i for at least one i 2 I .i:e:; h.x/ D max.hpi ; xi  ˛i /  0/. In combinatorial optimization a discrete constraint such as i2I

xi 2 f0; 1g for all i 2 I amounts to requiring that 0  xi  1; i 2 I, and h.x/ WD P i2I xi .xi  1/  0. Also, in bilevel linear programming problems, where the variable is x D .y; z/ 2 Rp Rq , a constraint of the form z 2 argmaxfdzjPyCQz  sg can be expressed as h.x/ WD '.y/  dz  0, where '.y/ WD maxfdzjQz  Py  sg is a convex function of y.1

7.3.1 General Properties Setting D D fxjAx  b; x  0g, C D fxjh.x/  0g, we can write the problem as .LRC/ W

minfcxj x 2 D n intCg:

(7.41)

In the above formulation the function h.x/ is assumed to be finite throughout Rn I if this condition may not be satisfied (as in the last example) certain results below must be applied with caution.

1

7.3 Reverse Convex Programming

191

Here D is a polyhedron and C is a closed convex set. For the sake of simplicity we will assume D nonempty and C bounded. Furthermore, it is natural to assume that minfcxjx 2 Dg < minfcxjx 2 D n intCg:

(7.42)

which simply means that the reverse convex constraint h.x/  0 is essential, so that (LRC) does not reduce to the trivial underlying linear program. Thus, if w is a basic optimal solution of the linear program then w 2 D \ intC

and cw < cx; 8x 2 D n intC:

(7.43)

Proposition 7.7 For any z 2 D n C there exists a better feasible solution y lying on the intersection of the boundary @C of C with an edge E of the polyhedron D. Proof We show how to compute y satisfying the desired conditions. If p 2 @h.z/ then the polyhedron fx 2 Dj hp; xziCh.z/ D 0g is contained in the feasible set, and a basic optimal solution z1 of the linear program minfcxj x 2 D; hp; xziCh.z/ D 0g will be a vertex of the polyhedron P D fx 2 Dj cx  cz1 g. But in view of (7.43) z1 is not optimal to the linear program minfcxj x 2 Pg. Therefore, starting from z1 and using simplex pivots for solving the latter program we can move from a vertex of D to a better adjacent vertex until we reach a vertex u with an adjacent vertex v such that h.u/  0 but h.v/ < 0. The intersection of @C with the line segment Œu; v will be the desired y. t u Corollary 7.3 (Edge Property) If (LRC) is solvable then at least one optimal solution lies on the intersection of the boundary of C with an edge of D. If E denotes the collection of all edges of D and for every E 2 E , .E/ is the set of extreme points of the line segment E \ C, then it follows from Corollary 7.3 (Hillestad and Jacobsen 1980b) that an optimal solution must be sought only among the finite set [ XD f.E/j E 2 E g: (7.44) We next examine how to recognize an optimal solution. Problem (LRC) is said to be regular (stable) if for every feasible point x there exists, in every neighborhood of x, a feasible point x0 such that h.x0 / > 0. Since by slightly moving x0 towards an interior point of D, if necessary, one can make x0 2 intD, this amounts to saying that the feasible set S D D n intC satisfies S D cl.intD/, i.e., it is robust. For example, the problem depicted in Fig. 7.5 is regular, but the one in Fig. 7.6 is not (the point x D opt which is an isolated feasible point does not satisfy the required condition). By specializing Proposition 5.3 to (LRC) we obtain Theorem 7.6 (Optimality Criterion) In order that a feasible point x be a global optimal solution it is necessary that fx 2 Dj cx  cxg  C:

(7.45)

192

7 DC Optimization Problems

Fig. 7.6 Nonregular problem

x

C

opt

t u

This condition is also sufficient if the problem is regular.

For any real number let D. / D fx 2 Dj cx  g. Since D. /  D. 0 / whenever  0 , it follows from the above theorem that under the regularity assumption , the optimal value in (LRC) is inff j D. / n C ¤ ;g:

(7.46)

Also note that by letting D cx condition (7.45) is equivalent to maxfh.x/j x 2 D. /g  0;

(7.47)

where the maximization problem on the left-hand side is a (BCP) since h.x/ is convex. In the general case when regularity is not assumed, condition (7.45) [i.e. (7.47)] may hold without x being a global optimal solution, as illustrated in Fig. 7.6. Fortunately, a slight perturbation of (LRC) can always make it regular, as shown in the following: Proposition 7.8 For " > 0 sufficiently small, the problem minimize cx

subject to x 2 D; h.x/ C ".kxk2 C 1/  0

(7.48)

is regular and its optimal value tends to the optimal value of (LRC) as " ! 0. Proof. Since the vertex set V of D is finite, there exists "0 > 0 so small that " 2 .0; "0 / implies that h" .x/ WD h.x/ C ".kxk2 C 1/ ¤ 0; 8x 2 V. Indeed, if V1 D fx 2 Vjh.x/ < 0g and "0 satisfies h.x/ C "0 .kxk2 C 1/ < 0; 8x 2 V1 , then whenever 0 < " < "0 we have h.x/ C ".kxk2 C 1/ < 0; 8x 2 V1 , while h.x/ C ".kxk2 C 1/  0 C " > 0; 8x 2 V n V1 . Now consider problem (7.48) for 0 < " < "0 and let x 2 D be such that h" .x/  0. If x … V then x is the midpoint of some line segment   D, and since the function h" .x/ is strictly convex, any neighborhood of x must contain a point x0 of  such that h" .x0 / > 0. On the other hand, if x 2 V, then h" .x/ > 0, and any point x0 of D sufficiently near to x will satisfy h" .x0 / > 0. Thus, given any

7.3 Reverse Convex Programming

193

feasible solution x of problem (7.48) there exists a feasible solution x0 arbitrarily close to x such that h" .x0 / > 0. This means that problem (7.48) is regular. If x" is a global optimal solution of (7.48) then obviously cx"  WD minfcxjx 2 D; h.x/  0g and since h" .x" /  0 and h" .x" /  h.x" / ! 0 as " ! 0, if x is any cluster point of fx" g then h.x/  0, i.e., x is feasible to (LRC), hence cx D . t u Note that the function h" .x/ in the above perturbed problem (7.48) is strictly convex. This additional feature may sometimes be useful. An alternative approach for handling nonregular problems is to replace the original problem by the following: .LRC" /

minfcxj

x 2 D; h.x/  "g:

(7.49)

A feasible solution x" to the latter problem such that cx"  minfcxj x 2 D; h.x/  0g

(7.50)

is called an "-approximate optimal solution to (LRC). The next result shows that for " > 0 sufficiently small an "-approximate optimal solution is as close to an optimal solution as desired. Proposition 7.9 If for each " > 0, x" is an "- approximate optimal solution of (LRC), then any accumulation point x of the sequence fx" g solves (LRC). Proof Clearly h.x/  0, hence x is feasible to (LRC). For any feasible solution x of (LRC), the condition (7.50) implies that cx  minfcxj x 2 D; h.x/  0g, hence x is an optimal solution of (LRC). t u Observe that if C" D fxj h.x/  "g then a feasible solution x" to the problem (7.49) such that fx 2 Dj cx  cx" g  C"

(7.51)

is an "-approximate optimal solution of (LRC). This together with Proposition 7.7 suggests that, within a tolerance " > 0 (so small that still x0 2 intC" /, transcending an incumbent value in (LRC) reduces to solving the following problem, analogous to problem (#) for (BCP) in Sect. 7.1, Subsect. 7.1.3: (\) Given two closed convex sets D. / D fx 2 Dj cx  g and C D fxj h.x/   g (0 < < "), find a point z 2 D. / n C" or else prove that D. /  C . Note that here C"  C . By Proposition 7.7, from a point z 2 D. / n C" one can easily derive a feasible point y to (LRC" ) with cy < I on the other hand, if D. /  C , then is an "-approximate optimal value because 0 < < ".

194

7 DC Optimization Problems

7.3.2 Solution Methods Thus the search for an "-approximate optimal solution of (LRC) reduces to solving the parametric dc feasibility problem inff j D. / n C" ¤ ;g:

(7.52)

analogous to (7.5) for (BCP). This leads to a method for solving (LRC) which is essentially the same as that used for solving (BCP). In the following X" denotes the set analogous to the set X in (7.44) but defined for problem (LRC" ) instead of (LRC). CS (Cone Splitting) Restart Algorithm for (LRC) Initialization Solve the linear program minfcxj x 2 Dg to obtain a basic optimal solution w. If h.w/  ", terminate: w is an "-approximate optimal solution. Otherwise: (a) If a vertex z0 of D is available such that h.z0 / > ", go to Phase 1. (b) Otherwise, set D 1; D. / D D, and go to Phase 2. Phase 1. Phase 2.

Starting from z0 search for a feasible point x 2 X" [see (7.44)]. Let D cx; D. / D fx 2 Dj cx  g. Go to Phase 2. Call Procedure DC to solve problem (\) (i.e., apply Procedure DC in Sect. 6.1 with D. /; C" and C in place of D; C; C" resp.). z and go to Phase 1. (1) If a point z 2 D. / n C" is obtained, set z0 (2) If D. /  C then terminate: x is an "-approximate optimal solution (when < C1/, or the problem is infeasible (when D C1/.

Fig. 7.7 The FW/BW algorithm

z1

x0

x1

c, x = γ z2

C x2

w

Remark 7.10 In Phase 1, the point x 2 X can be computed by performing a number of simplex pivots as indicated in the proof of Proposition 7.7. In Phase 2, the initial cone of Procedure DC is the canonical cone associated with the vertex x0 D w of D (w is also a vertex of the polyhedron D. //. Also note that " > 0, whereas

7.3 Reverse Convex Programming

195

the algorithm does not assume regularity of the problem. However, if the problem is regular, then the algorithm with D " D 0 will find an exact global optimal solution after finitely many steps, though may require infinitely many steps to identify it as such. Since the current feasible solution moves in Phase I forward from the region h.x/ > 0 to the frontier h.x/ D 0, and in Phase 2 backward from this frontier to the region h.x/ > 0, the algorithm is also called the forward–backward (FW/BW) algorithm (Fig. 7.7). Example 7.2 Minimize 2x1 C x2 subject to x1 C x2 x1 C 2x2 x1  x 2 2x1  3x2 x1 ; x2 x21  x1 x2 C x22  6x1

 10  8  4  6  0  0:

The optimal solution x D .4:3670068I 5:6329932/ is obtained after one cycle (Fig. 7.8).

opt 4 w 2

0

3

4

Fig. 7.8 An example

Just as the CS Algorithm for (BCP) admits of a BB version, the CS Algorithm for (LRC) can be transformed into a BB Algorithm by using lower bounds to define the selection operation in Phase 2 . Let M0 be the canonical cone associated with the vertex x0 D w of D, and let i0 z ; i D 1; : : : ; n, be the points where the ith edge of M0 meets the boundary @C of the closed convex set C D fxj h.x/  0g. Consider a subcone M of M0 , of base

196

7 DC Optimization Problems

S D Œz1 ; : : : ; zn   S0 . Let yi D x0 C i ui be the point where the ith edge of M meets @C and let U be the nonsingular n  n matrix of columns u1 ; : : : ; un . Let AQ and bQ be defined as in (7.20). Proposition 7.10 A lower bound for minfcxj x 2 D \ M; h.x/  0g is given by the optimal value ˇ.M/ of the linear program ( ) n X t i 0 Q  b; Q min hc; x C Utij AUt  1; t  0 : (7.53) iD1 i Furthermore, if a basic optimal solution t.M/ of this linear program is such that !.M/ D x0 C Ut.M/ lies on some edge of M then !.M/ 2 S, and hence ˇ.M/ D minfcxj x 2 M \ Sg, where S D D n intC is the feasible set of (LRC). P Proof Clearly x 2 M n intC if and only if x D x0 C Ut; t  0; niD1 tii  1. Since M \ S  .M \ D/ n intC, a lower bound of minfcxj x 2 M \ Sg is given by the optimal value ˇ.M/ of the linear program minfhc; x0 C Utij

x0 C Ut 2 D; t  0;

n X ti  1g iD1 i

(7.54)

which is nothing but (7.53). To prove the second part of the proposition, observe that if !.M/ lies on the ith edge it must satisfy !.M/ D x0 C .yi  x0 / for some   1 and since x0 2 D; !.M/ 2 D it follows by convexity of D that yi 2 D, hence yi 2 S. In view of assumption (7.42) this implies that cx0 < cyi , i.e., c.yi  x0 / > 0, and hence hc; !.M/  yi i D hc; .  1/yi i > 0 if  > 1. Since c!.M/  cyi from the definition of !.M/ we must have  D 1, i.e., !.M/ 2 D n intC. t u As a consequence of this proposition, when !.M/ is not feasible to (LRC) then it does not lie on any edge of M, so the subdivision of M upon the halfline from x0 through !.M/ is proper. Also, if all yi ; i D 1; : : : ; n, belong to D, then !.M/ is decidedly feasible (and is one of the yi ), hence ˇ.M/ is the exact minimum of cx over the feasible portion in M and M will be fathomed in the branch and bound process. Thus, in a conical procedure using the above lower bounding, only those cones will actually remain for consideration which contain boundary points of the feasible set. In other words, the search will concentrate on regions which are potentially of interest. In the next algorithm, let "  0 be the tolerance. Conical BB Algorithm for (LRC) Initialization Solve the linear program minfcxj x 2 Dg to obtain a basic optimal solution w. If h.w/  ", terminate: w is an "-approximate optimal solution. Otherwise: (a) If a vertex z0 of D is available such that h.z0 / > ", go to Phase 1. (b) Otherwise, set 1 D 1; D. 1 / D D, and go to Phase 2.

7.3 Reverse Convex Programming

Phase 1. Phase 2 Step 1.

Step 2.

Step 3. Step 4.

197

Starting from z0 search for a point x1 2 X" [see (7.44)]. Let 1 D cx1 ; D. 1 / D fx 2 Dj cx  1 g. Go to Phase 2. Let M0 be the canonical cone associated with the vertex x0 D w of D, let N1 D fM0 g; P1 D fM0 g. Set k D 1. (Bounding) For each cone M 2 Pk whose base is Œx0 C u1 ; : : : ; x0 C un  and whose edges meet @C at points yi D x0 C i ui ; i D 1; : : : ; n, solve the linear program (7.53), where U denotes the matrix of columns ui D yi  x0 ; i D 1; : : : ; n. Let ˇ.M/; t.M/ be the optimal value and a basic optimal solution of this linear program, and let !.M/ D x0 C Ut.M/. (Options) If some !.M/; M 2 Pk , are feasible to (LRC" ) (7.49) and satisfy hc; !.M/i < k , then reset k equal to the smallest among these values and xk arg k (i.e., the x such that hc; xi D k ). Go to Step 3 or (optionally) return to Phase 1. (Pruning) Delete every cone M 2 Nk such that ˇ.M/  k and let Rk be the collection of remaining cones. (Termination Criterion) If Rk D ;, then terminate: (a) If k < C1, then xk is an "-approximate optimal solution. (b) Otherwise, (LRC) is infeasible.

Step 5. Step 6.

(Branching) Select Mk 2 argminfˇ.M/j M 2 Rk g. Subdivide Mk along the halfline from x0 through ! k D !.Mk /. Let PkC1 be the partition of Mk . (New Net) Set Nk .Rk n fMk g/ [ PkC1 , k k C 1 and return to Step 1.

Theorem 7.7 The above algorithm can be infinite only if " D 0, and then any accumulation point of each of the sequences f! k g; fxk g is a global optimal solution of (LRC). Proof. Since the set X" [see (7.44)] is finite, it suffices to show that Phase 2 is finite, unless " D 0. If Phase 2 is infinite it generates at least a filter fMk ; k 2 Kg. Denote by yki ; i D 1; : : : ; n the intersections of the edges of Mk with @C, by qk the intersection of the halfline from x0 through ! k WD !.Mk / with the hyperplane through yk1 ; : : : ; ykn . By Theorem 6.2, there exists at least one accumulation point of fqk g, say q D lim qk .k ! 1; k 2 K1  K/, such that q 2 @C. We may assume that ! k ! ! .k ! 1; k 2 K1 /. But, as a result of Steps 2 and 3, ! k 2 intC" , hence ! 2 @C" . On the other hand, qk 2 Œx0 ; ! k , hence q 2 Œx0 ; !. This is possible only if " D 0 and ! D q 2 @C, so that ! is feasible. Since, furthermore, c! k  minfcxj x 2 D n intCg; 8k, it follows that !, and hence any accumulation point of the sequence fxk g is a global optimal solution. t u Remark 7.11 It should be emphasized that the above algorithm does not require regularity of the problem, even when " D 0. However, in the latter case, since every ! k is infeasible, the procedure may approach a global optimal solution through a sequence of infeasible points only. This situation is illustrated in Fig. 7.9, where

198

7 DC Optimization Problems

y02 opt

ω1

ω2

ω3 y01

Fig. 7.9 Convergence through infeasible points

an isolated global optimal solution of a nonregular problem is approached by a sequence of infeasible points ! 1 ; ! 2 ; : : : A conical algorithm for (LRC) using bisections throughout was originally proposed by Muu (1985).

7.4 Canonical DC Programming In its general formulation (cf. Sect. 5.3) the Canonical DC Programming problem is to find .CDC/ minff .x/j g.x/  0  h.x/g;

(7.55)

where f ; g; h W Rn ! R are convex functions. Clearly (LRC) is a special case of (CDC) when the objective function f .x/ is linear. Proposition 7.11 Any dc optimization problem can be reduced to the canonical form (CDC) at the expense of introducing at most two additional variables. Proof An arbitrary dc optimization problem minff1 .x/  f2 .x/j gi;1 .x/  gi;2 .x/  0; i D 1; : : : ; mg

(7.56)

can be rewritten as minff1 .x/  zj z  f2 .x/  0; g.x/  0; i D 1; : : : ; mg; where the objective function is convex (in x; z) and g.x/ D maxiD1;:::;m Œgi;1 .x/  gi;2 .x/ is a dc function (cf. Proposition 4.1). Next the dc inequality g.x/ D p.x/  q.x/  0 with p.x/; q.x/ convex can be split into two inequalities: p.x/  t  0;

t  q.x/  0;

(7.57)

7.4 Canonical DC Programming

199

where the first is a convex inequality, and the second is a reverse convex inequality. By changing the notation if necessary, one obtains the desired canonical form. u t Whereas the application of OA technique to .GCP/ is almost straightforward, it is much less obvious for .CDC/ and requires special care to ensure convergence. By hypothesis the sets D WD fxj g.x/  0g;

C WD fxj h.x/  0g

are closed and convex. For simplicity we will assume D bounded, although with minor modifications the results can be extended to the unbounded case. If an optimal solution w of the convex program minff .x/j g.x/  0g satisfies h.w/  0, then the problem is solved (w is a global optimal solution). Therefore, without loss of generality we may assume that there exists a point w satisfying w 2 intD \ intC;

f .x/ > f .w/ 8x 2 D n intC:

(7.58)

The next important property is an immediate consequence of this assumption. Proposition 7.12 (Boundary Property) Every global optimal solution lies on D \ @C. Proof Let z0 be any feasible solution. If z0 … @C, then the line segment Œw; z0  meets @C at a point z1 D .1  /w C z0 such that 0 <  < 1. By convexity we have from (7.58), f .z1 /  .1  /f .w/ C f .z0 / < f .z0 /, so z1 is a better feasible solution than z0 . t u Just like (LRC), Problem .CDC/ is said to be regular if the feasible set S D D n intC is robust or, which amounts to the same, D n intC D cl.D n C/:

(7.59)

Theorem 7.6 for (LRC) extends to (CDC): Theorem 7.8 (Optimality Criterion) In order that a feasible solution x to (CDC) be global optimal it is necessary that fx 2 Dj f .x/  f .x/g  C:

(7.60)

This condition is also sufficient if the problem is regular. Proof Analogous to the proof of Proposition 5.3, with obvious modifications (f .x/ instead of l.x/). t u

200

7 DC Optimization Problems

To exploit the above optimality criterion, it is convenient to introduce the next concept. Given " > 0 (the tolerance), a vector x is said to be an "-approximate optimal solution to .CDC/ if x 2 D;

h.x/  ";

f .x/  minff .x/j x 2 D; h.x/  0g:

(7.61) (7.62)

Clearly, as "k # 0, any accumulation point of a sequence fxk g of "k -approximate optimal solutions to .CDC/ yields an exact global optimal solution. Therefore, in practice one should be satisfied with an "-approximate optimal solution for " sufficiently small. Denote C" D fxj h.x/  "g; D. / D fx 2 Dj f .x/  g. In view of (7.58) it is natural to require that w 2 intD \ intC" ;

f .x/ > f .w/

8x 2 D n intC" :

(7.63)

Define D infff .x/j x 2 D; h.x/ > "g and let G D D. /;

˝ D D. / n intC" :

By (7.63) and Proposition 7.12, it is easily seen that ˝ coincides with the set of "approximate optimal solutions of (CDC), so the problem amounts to searching for a point x 2 ˝. Denote by P the family of polytopes P in Rn for which there exists 2 Œ ; C1 satisfying G  D. /  P. For every P 2 P define x.P/ 2 argmaxfh.x/j x 2 Vg;

(7.64)

where V is the vertex set of P. Let us verify that Conditions A1 and A2 stated in Sect. 6.3 are satisfied. Condition A1 is obvious because x.P/ exists and can be computed provided ˝ ¤ ;I moreover, since P  G  ˝, we must have h.x.P//  ", so any accumulation point x of a sequence x.Pk /; k D 1; 2; : : :, satisfies h.x/  ", and hence x 2 ˝ whenever x 2 G. To verify A2, consider any z D x.P/ associated to a polytope P such that D. /  P for some 2 Œ ; C1. As we just saw, h.z/  ". If h.z/ < 0 then maxfh.x/j x 2 Pg < 0, hence D. /  fxj h.x/ < 0g, which implies that is an "approximate optimal value (if < C1/ or the problem is infeasible (if D C1/. If h.z/  0 then maxfg.z/; h.z/ C "g > 0 and since maxfg.w/; h.w/ C "g < 0, we can compute a point y such that y 2 Œw; z;

maxfg.y/; h.y/ C "g D 0:

7.4 Canonical DC Programming

201

Two cases are possible: (a) g.y/ D 0 W since g.w/ < 0 this event may occur only if g.z/ > 0 and so z D x.P/ can be separated from G by a cut l.x/ WD hp; x  yi  0 with p 2 @g.y/. (b) h.y/ D " W then g.y/  0, so y is an "-approximate solution in the sense of (7.61). Furthermore, since y D .1  /w C z; 0 <  < 1/, it follows that f .y/  .1  /f .w/ C f .z/ < f .z/, so z D x.P/ can be separated from G by a cut l.x/ WD hp; x  yi  0 with p 2 @f .y/. In either case, if we set 0 D minf ; f .y/g then P0 D P \ fxj l.x/  0g  D. 0 /, i.e., P0 2 P. Thus, both Conditions A1 and A2 are satisfied, and hence the outer approximation scheme described in Sect. 6.3 can be applied to solve .CDC/.

7.4.1 Simple Outer Approximation We are led to the following OA Procedure (Fig. 7.10) for finding an "-approximate optimal solution of (CDC): OA Algorithm for (CDC) Step 0.

Step 1.

Let x1 be the best feasible solution available, 1 D f .x1 / (if no feasible solution is known, set x1 D ;; 1 D C1/. Take a polytope P0  D and let P1 D fx 2 P0 j f .x/  1 g. Determine the vertex set V1 of P1 . Set k D 1. Compute xk 2 argmaxfh.x/j x 2 Vk g. If h.xk / < 0, then terminate: (a) If k < C1, xk is an "-approximate optimal solution; (b) If k D C1, the problem is infeasible. x1 g(x) = 0

f (x) = c, x h(x) = −ε P1

l1 (x) = 0 P3

y1 w

y2

l2 (x) = 0

P2 x2

Fig. 7.10 Outer approximation algorithm for .CDC/

202

Step 2.

7 DC Optimization Problems

Compute yk 2 Œw; xk  such that maxfg.yk /; h.yk / C "g D 0. If g.yk / D 0 then set kC1 D k ; xkC1 D xk and let lk .x/ WD hpk ; x  yk i;

Step 3.

(7.65)

If h.yk / D ", then set kC1 D minf k ; f .yk /g, xkC1 D yk if f .yk /  k , xkC1 D xk otherwise. Let lk .x/ WD hpk ; x  yk i;

Step 4.

pk 2 @g.yk /:

pk 2 @f .yk /:

Compute the vertex set VkC1 of PkC1 D Pk \ fxj lk .x/  0g, set k and go back to Step 1.

(7.66) kC1

Theorem 7.9 The OA Algorithm for .CDC) can be infinite only if " D 0 and in this case, if the problem is regular any accumulation point of the sequence fxk g will solve (CDC). Proof Suppose that the algorithm is infinite and let x D limq!C1 xkq . It can easily be checked that all conditions of Theorem 6.5 are fulfilled (w 2 intG by (7.58); yk 2 Œw; xk  n intG, lk .yk / D 0/. By this theorem, x D lim ykq , therefore h.x/ D lim h.ykq /  ". On the other hand, h.xk /  0, hence h.x/  0. This is a contradiction unless " D 0, Since g.yk /  0 8k it follows that x 2 D. Thus x is a feasible solution. Now let Q D lim k . Since for every k; h.xk / D maxfh.x/j x 2 Pk g and D. k /  Pk it follows that 0 D h.x/ D maxfh.x/j x 2 D. Q /g. Hence D. Q /  C. By the optimality criterion (7.60) this implies, under the regularity assumption, that x is a global optimal solution and Q D , the optimal value. Furthermore, since k D f .xk /, any accumulation point of the sequence fxk g is also a global optimal solution. t u Remark 7.12 No regularity condition is required when " > 0. It can easily be checked that the convergence proof is still valid if in Step 3 of certain iterations the cut is taken with pk 2 @h.yk / instead of pk 2 @f .yk /. This flexibility should be used to monitor the speed of convergence. Remark 7.13 The above algorithm presupposes the availability of a point w 2 intD \ intC, i.e., satisfying maxfg.w/; h.w/g < 0. Such a point, if not readily available, can be computed by methods of convex programming. It is also possible to modify the algorithm so as to dispense with the computation of w (Tuy 1995b). An earlier version of the above algorithm with " D 0 was developed in (Tuy 1987a). Several modified versions of this algorithm have also been proposed (e.g., Thoai 1988) which, however, were later shown to be nonconvergent. In fact, the convergence of cutting methods for (CDC) is a more delicate question than it seems.

7.4 Canonical DC Programming

203

7.4.2 Combined Algorithm By translating if necessary, assume that (7.58) holds with w D 0, and the feasible set D n intC lies entirely in the orthant M0 D RnC . It is easily verified that the above OA algorithm still works if, in the case when maxfh.x/j x 2 Pk g > 0, x.Pk / is taken to be any point xk 2 Pk satisfying h.xk / > " rather than a maximizer of h.x/ over Pk as in (7.64). As shown in Sect. 6.3 such a point xk can be computed by applying the Procedure DC to the inclusion .SPk /x 2 Pk n C" : For " > 0, after finitely many steps this procedure either returns a point xk such that h.xk / > ", or establishes that maxfh.x/j x 2 Pk g < 0 (the latter implies that the current incumbent is an "-approximate optimal solution if k < C1, or that the problem is infeasible if k D C1/. Furthermore, since PkC1 differs from Pk by only one additional linear constraint, the last conical partition in the DC procedure for (SPk ) can be used to start the procedure for (SPkC1). We can thus formulate the following combined algorithm (recall that w D 0/: Combined OA/CS Conical Algorithm for (CDC) Initialization Let 1 D cx1 , where x1 is the best feasible solution available (if no feasible solution is known, set x1 D ;; 1 D C1/. Take a polytope P0  D and let P1 D fx 2 P0 j cx  1 g. Set N1 D P1 D fRnC g (any subcone of RnC will be assumed to have its base in the simplex spanned by the units vectors fei ; i D 1; : : : ; ng/. Set k D 1. Step 1.

(Evaluation) For each cone M 2 Pk of base Œu1 ; : : : ; un  compute the point i ui where the ith edge of M meets @C. Let U be the matrix of columns u1 ; : : : ; un . Solve the linear program (

n X ti max j Ut 2 Pk ; t  0 iD1 i

Step 2. Step 3.

Step 4.

)

to obtain the optimal value .M/ and a basic optimal solution t.M/. Let !.M/ D Ut.M/. (Screening) Delete every cone M 2 Nk such that .M/ < 1, and let Rk be the set of remaining cones. (Termination Criterion 1) If Rk D ;, then terminate: xk is an "approximate optimal solution of (CDC) (if k < C1/, or the problem is infeasible (if k D 1/. (Termination Criterion 2) Select Mk 2 argmaxf.M/j M 2 Rk g. If ! k D !.Mk / 2 D. k / WD fx 2 Dj f .x/  k g, then terminate: ! k is a global optimal solution.

204

Step 5. Step 6.

7 DC Optimization Problems

(Branching) Subdivide Mk along the ray through ! k D !.Mk /. Let PkC1 be the partition of Mk . (Outer Approximation) If h.!.M// > " for some M 2 Rk then let xk D !.M/; yk 2 Œ0; xk  such that maxfg.yk /; h.yk / C "g D 0. (a) If g.yk / D 0 let lk .x/  0 be the cut (7.65), xkC1 D xk ; kC1 D k . (b) Otherwise, let lk .x/  0 be the cut (7.66), xkC1 D yk ; kC1 D cyk .

Step 7.

(Updating) Let NkC1 D .Rk nfMk g/[PkC1 , PkC1 D fx 2 Pk j lk .x/  0g. Set k k C 1 and go back to Step 1.

Note that in Step 2, a cone M is deleted if .M/ < 1 (strict inequality), so Rk D ; implies that maxfh.x/j x 2 Pk g < 0. It is easy to prove that, analogously to the Combined Algorithm for (GCP) (Theorem 7.5), the above Combined Algorithm for (CDC) can be infinite only if " D 0 and in this case, if the problem is regular, every accumulation point of the sequence fxk g is a global optimal solution of (CDC).

7.5 Robust Approach Consider the general dc optimization problem .P/

minff .x/j x 2 Œa; b; gi .x/  0; i D 1; : : : ; mg;

where f ; g1 ; : : : ; gm are dc functions on Rn ; a; b 2 Rn and Œa; b D fx 2 Rn j a  x  bg. If f .x/ D f1 .x/  f2 .x/ with f1 ; f2 being convex functions, we can rewrite (P) as minff1 .x/  tj x 2 Œa; b; gi .x/  0; i D 1; : : : ; m; t  f2 .x/  0g, so by changing the notation it can always be assumed that the objective function f .x/ in (P) is convex. In the most general case, a common approach for solving problem (P) is to use a suitable OA or BB procedure for generating a sequence of infeasible solutions xk such that, as k ! C1; xk tends to a feasible solution x , while f .xk / tends to the optimal value of the problem. Under these conditions x is a global optimal solution, and theoretically the problem can be considered solved. Practically, however, the algorithm has to be stopped at some xk which is expected to be close enough to the optimal solution. On the one hand, xk must be approximately feasible, on the other hand, the value f .xk / must approach the optimal value v.P/. In precise terms the former condition means that, given a tolerance " > 0, one must have xk 2 Œa; b; gi .xk /  "; i D 1; : : : ; mI the latter condition means that, given a tolerance > 0, one must have f .xk /  v.P/ C . A point xk satisfying both these conditions is referred to as an ."; /-approximate optimal solution, or for D " an "-approximate optimal solution. Everything in this approximation scheme seems to be in order, so computing an ("; )-approximate optimal solution has become a common practice for solving nonconvex global optimization problems. Close scrutiny reveals, however, pitfalls behind this ."; )-approximation concept.

7.5 Robust Approach

205

Fig. 7.11 Inadequate "-approximate optimal solution

b

x∗

g1 (x) Opt g2 (x) x g3 (x) f (x)

Consider, for example, the quadratic program depicted in Fig. 7.11, where the objective function f .x/ is linear and the nonconvex constraint set is defined by the inequalities a  x  b; gi .x/  0 .i D 1; 2; 3/. The optimal solution is x , while the point x is infeasible, but almost feasible: g1 .x/ D 0; g2 .x/ D 0; g3 .x/ D ˛ for some small ˛ > 0. If " > ˛ then x is feasible with tolerance " and will be accepted as an "-approximate optimal solution, though it is quite far off the optimal solution x . But if " < ˛ then x is no longer feasible with tolerance " and the "approximate solution will come close to x . Thus, even for regular problems (problems with no isolate feasible solution), the "-approximation approach may give an incorrect optimal solution if " is not sufficiently small. The trouble is that in practice we often do not know what exactly means “sufficiently small,” i.e., we do not know how small the tolerance should be to guarantee a correct approximate optimal solution. Things are much worse when the problem happens to have a global optimal solution which is an isolated feasible solution. In that case a slight perturbation of the data may cause a drastic change of the global optimal solution; consequently, a small change of the tolerances "; may cause a drastic change of the ("; )-approximate optimal solution. Due to this instability an isolated global optimal solution is very difficult to compute and very difficult to implement when computable. To avoid such annoying situation a common practice is to restrict consideration to problems with a robust feasible set, i.e., a feasible set D satisfying D D cl.intD/;

(7.67)

where cl and int denote the closure and the interior, respectively. Unfortunately, this condition (7.67) which implies the absence of isolated feasible solutions is generally very hard to check for a given nonconvex problem. Practically we often have to deal with feasible sets which are not known à priori to contain isolated points or not.

206

7 DC Optimization Problems

Therefore, from a practical point of view only nonisolated feasible solutions of (P) should be considered and instead of trying to find a global optimal solution of (P) it would be more reasonable to look for the best nonisolated feasible solution. Embodying this idea, the robust approach is devised for computing the best nonisolated feasible solution without preliminary knowledge about condition (7.67) holding or not. First let us introduce some basic concepts. Let D denote the set of nonisolated points (i.e., accumulation points) of D WD fx 2 Œa; bj g.x/  0g. A solution x 2 D is called an essential optimal solution of (P) if f .x /  f .x/ 8x 2 D : In other words, an essential optimal solution is an optimal solution of the problem minff .x/j x 2 D g: For given " > 0; > 0, a point x 2 Œa; b satisfying g.x/  " is said to be "-essential feasible, and a nonisolated feasible solution x is called essential ."; /optimal if f .x /  f .x/ C 8x 2 D" ; where D" WD fx 2 Œa; bj g.x/  "g is the set of all "-essential feasible solutions. Clearly for " D D 0 an essential ."; /-optimal solution is a nonisolated feasible point which is optimal. To find the best nonisolated feasible solution the robust approach proceeds by successive improvement of the current incumbent. Specifically, let w 2 Rn be any point such that f .w/  " > f .x/ 8x 2 DI in each stage of the process, one has to solve the following subproblem of incumbent transcending: .SP/ Given an x 2 Rn , find a nonisolated feasible solution xO of (P) such that f .Ox/  f .x/  ", or else establish that none such xO exists. Clearly, if x D w then an answer to .SP/ would give a nonisolated feasible solution or else identify essential infeasibility of the problem (P). If x is the best nonisolated feasible solution currently available then an answer to .SP/ would give a new nonisolated feasible solution xO with f .Ox/  f .x/  ", or else identify x as an essential "-optimal solution. Since the essential optimal value is bounded above in view of the compactness of D, by successively solving a finite sequence of subproblems .SP/ we will finally come up with an essential "-optimal solution, or an evidence that the problem is essentially infeasible. The key reduces to investigating the incumbent transcending subproblem .SP/. Setting g.x/ D maxiD1;:::;m gi .x/, consider the following pair of problems: minff .x/j g.x/  "; x 2 Œa; bg;

(P" )

minfg.x/j f .x/  ; x 2 Œa; bg;

(Q )

where the objective and constraint functions are interchanged.

7.5 Robust Approach

207

Due to the fact that f .x/ is convex a key property of problem (Q ) is that its feasible set is a convex set, so (Q ) has no isolated feasible solution and computing a feasible solution of it can be done at cheap cost. Therefore, as shown in Sect. 7.5, an adaptive BB Algorithm can be devised that, for any given tolerance > 0, can compute an -optimal solution of (Q ) in finitely many steps (Proposition 5.2). Furthermore, such an -optimal solution is stable under small changes of . For our purpose of solving problem (P) this property of (Q ) together with the following relationship between (P" ) and (Q ) turned out to be crucial. Let min.P" /, min.Q / denote the optimal values of problem (P" ), (Q ), respectively. Proposition 7.13 For every " > 0, if min.Q / > " then min.P" / > . Proof If min.Q / > ", then any x 2 Œa; b such that g.x/  " must satisfy f .x/ > , hence, by compactness of the feasible set of (P" ), minff .x/j g.x/  "; x 2 Œa; bg > . t u This almost trivial proposition simply expresses the interchangeability between objective and constraint in many practical situations (Tuy 1987a, 2003). Exploiting this interchangeability, the basic idea of robust approach is to replace the original problem (P), possibly very difficult, by a sequence of easier, stable, problems (Q ), where the parameter can be iteratively adjusted until a stable (robust) solution to (P) is obtained.

7.5.1 Successive Incumbent Transcending As shown above, a key step towards finding an essential ("; )-optimal solution of problem (P) is to deal with the following question which is the subproblem .SP/ when D f .x/  " W .SP / Given a real number , and " > 0, check whether problem (P) has a nonisolated feasible solution x satisfying f .x/  , or else establish that no "essential feasible solution x exists such that f .x/  . Based on the convexity of f .x/, consider an adaptive BB Algorithm for solving problem (Q /. As described in the proof of Proposition 6.2, such an algorithm generates at each iteration k two points xk 2 Mk ; yk 2 Mk satisfying f .xk /  ;

g.yk /  ˇ.Mk / ! 0 as k ! C1;

where ˇ.Mk /  v.Q / (note that the objective function is g.x/, the constraints are f .x/  ; x 2 Œa; b). Proposition 7.14 Let " > 0 be given. Either g.xk / < 0 for some k or ˇ.Mk / > " for some k. In the former case, xk is a nonisolated feasible solution of (P) satisfying f .xk /  . In the latter case, no "-essential feasible solution x of (P) exists such that f .x/  (so, if D f .x/  for a given > 0 and a nonisolated feasible solution x then x is an essential ."; /-optimal solution).

208

7 DC Optimization Problems

Proof If g.xk / < 0, then xk is an essential feasible solution of (P) satisfying f .xk /  , because by continuity g.x/ < 0 for all x in a neighborhood of xk . If ˇ.Mk / > ", then v.Q / > ", hence, by Proposition 7.13, v.P" / > . It remains to show that either g.xk / < 0 for some k, or ˇ.Mk / > " for some k. Suppose the contrary that g.xk /  0; ˇ.Mk /  " 8k:

(7.68)

As we saw in Sect. 6.1 (Theorem 6.4), the sequence xk ; yk generated by an adaptive BB Algorithm satisfies, for some infinite subsequence fk g  f1; 2; : : :g, g.yk /  ˇ.Mk / ! 0 . ! C1/; k

k

lim!C1 x D lim!C1 y D x





(7.69) 

with f .x /  ; x 2 Œa; b:

(7.70)

Since by (7.68) g.xk /  0 8k, it follows that g.x /  0. On the other hand, by (7.69) and (7.68) we have g.x / D lim!C1 ˇ.Mk /  ". This contradiction shows that the event (7.68) is impossible, completing the proof. t u Thus, with the stopping criterium “ g.xk / < 0 or ˇ.Mk / > " ” an adaptive BB procedure for solving (Q ) will help to solve the subproblem .SP /. Note that ˇ.Mk / D minfˇ.M/j M 2 Rk g (Rk being the collection of partition sets remaining for exploration at iteration k/. So if the deletion (pruning) criterion in each iteration of this procedure is “ ˇ.M/ > " ” then the stopping criterion “ ˇ.Mk / > " ” is nothing but the usual criterion “ Rk D ; ”. For brevity an adaptive BB Algorithm for solving (Q / with the deletion criterion “ˇ.M/ > " ” and stopping criterion “ g.xk / < 0 ” will be referred to as Procedure .SP /. Using this procedure, problem (P) can be solved according to the following scheme: Successive Incumbent Transcending (SIT) Scheme Start with D 0 , where 0 D f .a/  . Call Procedure .SP /. If a nonisolated feasible solution x of (P) is obtained with f .x/  , reset f .x/  and repeat. Otherwise, stop: if D f .x/  for a nonisolated feasible solution x then x is an essential ."; /-optimal solution; if D 0 then problem (P) has no "-essential feasible solution. Since f .D/ is compact and > 0 this scheme is necessarily finite.

7.5.2 The SIT Algorithm for DC Optimization For the sake of convenience let us rewrite problem (P) in the form: .P/

minff .x/j g1 .x/  g2 .x/  0; x 2 Œa; bg;

where f ; g1 ; g2 are convex functions.

7.5 Robust Approach

209

As shown above (Proposition 7.14), Procedure .SP / for problem (P) is an adaptive BB algorithm for solving problem (Q ), with deletion criterion “ˇ.M/ > "” and stopping criterion “g.xk / < 0”. Incorporating Procedure .SP / into the SIT Scheme we can state the following SIT Algorithm for (P): SIT Algorithm for (P) Select tolerances " > 0; > 0. Let 0 D f .a/  . Step 0. Step 1.

Set P1 D fM1 g; M1 D Œa; b; R1 D ;; D 0 . Set k D 1. For each box (hyperrectangle) M 2 Pk W – Reduce M, i.e., find a box Œp; q  M as small as possible satisfying minfg.x/j f .x/  ; x 2 Œp; qg D minfg.x/j f .x/  ; x 2 Mg and set M Œp; q. – Compute a lower bound ˇ.M/ for g.x/ over the feasible solutions in Œp; q. ˇ.M/ must be such that: ˇ.M/ < C1 ) fx 2 Mj f .x/  g ¤ ;, see (5.4).

Step 2.

Step 3.

Delete every M such that ˇ.M/ > ". Let P 0 k be the collection of boxes that results from Pk after completion of Step 1. Let R 0 k D Rk [ P 0 k . If R 0 k D ;, then terminate: if D 0 the problem (P) is "-essential infeasible; if D f .x/  for some nonisolated feasible solution x, then x is an essential ."; /-optimal solution of (P). [ Step 4.] If R 0 k ¤ ;, let Mk 2 argminfˇ.M/j M 2 R 0 k g; ˇk D ˇ.Mk /. Determine xk 2 Mk and yk 2 Mk such that f .xk /  ;

Step 5.

Step 6.

g.yk /  ˇ.Mk / D o.kxk  yk k/:

(7.71)

If g.xk / < ", go to Step 5. If g.xk /  ", go to Step 6. xk is a nonisolated feasible solution satisfying f .xk /  . If D 0 define x D xk ; D f .x/  and go to Step 6. If D f .x/ 

and f .xk / < f .x/ reset x xk and go to Step 6. Divide Mk into two subboxes by the adaptive bisection, i.e., bisect Mk via .v k ; jk /, where jk 2 argmaxfjykj  xkj j W j D 1; : : : ; ng; v k D 12 .xk C yk /. Let PkC1 be the collection of these two subboxes of Mk ; RkC1 D R 0 k n fMk g. Increment k, and return to Step 1.

Theorem 7.10 The SIT Algorithm terminates after finitely many steps, yielding an essential ."; /-optimal solution or an evidence that the problem has no "-essential feasible solution. Proof Follows from the above discussion.

t u

210

7 DC Optimization Problems

To help practical implementation, it may be useful to discuss in more detail two operations important for the efficiency of the algorithm: the box reduction in Step 1 and the computation of xk and yk in Step 4. Box Reduction Let M D Œp; q be a box. We wish to determine a box redM D Œp0 ; q0   M such that fx 2 Œp0 ; q0 j f .x/  ; g1 .x/  g2 .x/  "g D fx 2 Mj f .x/  ; g1 .x/  g2 .x/  "g: (7.72) Select a convex function g.x/ underestimating g.x/ D g1 .x/  g2 .x/ on M and tight at a point y 2 M, i.e., such that g.y/ D g.y/. For example, g.x/ D g1 .x/  g2 .y/, where y is a corner of the hyperrectangle Œp; q maximizing g2 .x/ over Œp; q/. Then p0 ; q0 are determined by solving, for i D 1; : : : ; n, the convex programs: p0i D minfxi j x 2 Œp; q; f .x/  ; g.x/  "g; q0i D maxfxi j x 2 Œp; q; f .x/  ; g.x/  "g: Bounding and Condition (7.71) With g.x/ being a convex underestimator of g.x/ D g1 .x/g2 .x/ over M D Œp; q, tight at a point y 2 redM define ˇ.M/ D minfg.x/j f .x/  ; x 2 redMg. So for each k there is a point yk 2 Mk such that g.yk / D g.yk /. Also define xk 2 argminfg.x/j f .x/  ; x 2 redMk g, so that g.xk / D ˇ.Mk /. If xk D yk then ˇ.Mk / D minfg.x/j x 2 Mk g so, as can easily be checked, condition (7.71) is satisfied. Alternative Method: Define xM 2 argminfg1 .x/j f .x/  ; x 2 redMg; yM 2 argmaxfg2 .x/j f .x/  ; x 2 redMg; ˇ.M/ D g1 .xM /  g2 .yM /: The points xk D xMk ; yk D yMk are such that xk D yk implies that minfg1 .x/  g2 .x/j x 2 Mk g D 0, hence it is easy to verify that condition (7.71) is satisfied. Special Case: When f ; g are polynomials or signomials, one can define at each iteration k a convex program (e.g., a SDP) .Lk /

maxf`k .x; u/j f .x/  ; .x; u/ 2 Ck g;

where u 2 Rmk ; Ck is a compact set in Mk  Rmk , and `k .x; u/ is a linear function such that the problem maxfg.x/j fQ .x/  ; x 2 Mk g is equivalent to

7.5 Robust Approach

.NLk /

211

maxf`k .x; u/j f .x/  ; .x; u/ 2 Ck ; ri .x; u/  0 .i 2 Ik /g;

where ri .x; u/  0; i 2 Ik , are nonconvex constraints (Lasserre 2001; Sherali and Adams 1999). Furthermore, .Lk / can be selected so that there exists yk 2 Mk satisfying .yk ; u/ 2 Ck ) ri .yk ; u/  0 .i 2 Ik /

(7.73)

(see, e.g., Tuy (1998, Proposition 8.12)). In that case, setting ˇ.Mk / D maxf`k .x; u/j f .x/  ; .x; u/ 2 Ck g and taking an optimal solution .xk ; uk / of (Lk / one has a couple xk ; yk satisfying (7.73) and xk 2 Mk ; f .xk /  ; .xk ; uk / 2 Ck ; `k .xk ; uk / D ˇ.Mk /:

(7.74)

If xk D yk then from (7.73) and .xk ; uk / 2 Ck it follows that ri .xk ; u/  0 .i 2 Ik /, so .xk ; uk / solves (NLk / and hence xk solves the subproblem minfg.x/j fQ .x/  ; x 2 Mk g, i.e., g.xk / D `k .xk ; uk / D ˇ.Mk /. It can then easily be shown that g.yk /  ˇ.Mk / ! 0 as xk  yk ! 0, so that xk ; yk satisfy condition (7.71).

7.5.3 Equality Constraints So far we assumed that all nonconvex constraints are of inequality type: gi .x/  0; i D 1; : : : ; m. If there are equality constraints such as hj .x/ D 0; j D 1; : : : ; s with all functions h1 ; : : : ; hs nonlinear, the SIT algorithm for whatever " > 0 will conclude that there is no "-essential feasible solution. This does not mean, however, that the problem has no nonisolated feasible solution. In that case, since an exact solution to a nonlinear system of equations cannot be expected to be computable in finitely many steps, one should be content with replacing every equality constraint hj .x/ D 0 by an approximate one jhj .x/j  ı; where ı > 0 is the tolerance. The above approach can then be applied to the resulting set of inequality constraints.

212

7 DC Optimization Problems

7.5.4 Illustrative Examples To illustrate the practicality of the robust approach, we give two numerical examples. Example 7.3 Minimize x1 subject to .x1  5/2 C 2.x2  5/2 C .x3  5/2  18 .x1 C 7  2x2 /2 C 4.2x1 C x2  11/2 C 5.x3  5/2  100 This problem has one isolated feasible point .1; 3; 5/ which is also the optimal solution. With " D D 0:01 the essential ("; )-optimal solution (3.747692, 7.171420, 2.362317) with objective function value 3:747692. is found by the SIT Algorithm after 15736 iterations. Example 7.4 (n=4) Minimize .3 C x1 x3 /.x1 x2 x3 x4 C 2x1 x3 C 2/2=3 subject to 3.2x1 x2 C 3x1 x2 x4 /.2x1 x3 C 4x1 x4  x2 / .x1 x3 C 3x1 x2 x4 /.4x3 x4 C 4x1 x3 x4 C x1 x3  4x1 x2 x4 /1=3 C3.x4 C 3x1 x3 x4 /.3x1 x2 x3 C 3x1 x4 C 2x3 x4  3x1 x2 x4 /1=4  309:219315 2.3x3 C 3x1 x2 x3 /.x1 x2 x3 C 4x2 x4  x3 x4 /2 C.3x1 x2 x3 /.3x3 C 2x1 x2 x3 C 3x4 /4  .x2 x3 x4 C x1 x3 x4 /.4x1  1/3=4 3.3x3 x4 C 2x1 x3 x4 /.x1 x2 x3 x4 C x3 x4  4x1 x2 x3  2x1 /4  78243:910551 3.4x1 x3 x4 /.2x4 C 2x1 x2  x2  x3 /2 C2.x1 x2 x4 C 3x1 x3 x4 /.x1 x2 C 2x2 x3 C 4x2  x2 x3 x4  x1 x3 /4  9618 0  xi  5 i D 1; 2; 3; 4: This problem would be difficult to solve by Sherali-Adams’ RLT method or Lassere’s method which both would require introducing a very large number of additional variables. For " D D 0:01 the essential ("; )-optimal solution xessopt D .4:994594; 0:020149; 0:045424; 4:928073/ with objective function value 5.906278 is found by the SIT Algorithm at iteration 667, and confirmed as such at iteration 866.

7.6 DC Optimization in Design Calculations

213

7.5.5 Concluding Remark Global optimization problems with nonconvex constraint sets are difficult in two major aspects: (1) a feasible solution may be very hard to determine; (2) the optimal solution may be an isolated feasible solution, in which case a correct solution cannot be computed by a finite procedure and implemented successfully. To cope with these difficulties the robust approach consists basically in finding a nonisolated feasible solution and improving it step by step. The resulting algorithm works its way to the best nonisolated optimal solution through a number of cycles of incumbent transcending. A major advantage of it, apart from stability, is that when prematurely stopped it may still provide a good nonisolated feasible solution, in contrast to other current methods which are almost useless in that case.

7.6 DC Optimization in Design Calculations An important class of nonconvex optimization problems encountered in the applications has the following general formulation: .P/ Given a compact set S  Rn with intS ¤ ; and a gauge p W Rn ! RC find the largest value r for which there is an x 2 S satisfying B.xI r/ WD fy 2 Rn j p.y  x/  rg  S:

(7.75)

Here by gauge is meant a nonnegative positively homogeneous and convex function. As can be immediately seen, a gauge p W Rn ! RC is actually the gauge of the convex set fxj p.x/  1g in the sense earlier defined in Sect. 1.3. Problems of the above kind arise in many design calculations. A typical example is the design centering problem mentioned in Example 5.3, Sect. 5.2. Specifically, suppose in a fabrication process the quality of a manufactured item is characterized by an n-dimensional parameter. An item is acceptable if this parameter lies in some region of acceptability S. If x 2 Rn is the designed value of the parameter, then, due to unavoidable fluctuations, the actual value y of it will usually deviate from x, so that p.y  x/ > 0. For each fixed value of x the value of y must lie in B.xI r/ in order to be acceptable, so the production yield can be measured by the maximal value of r D r.x/ such that B.xI r/ is still contained in S. Under these conditions, to maximize the yield the designer has to solve the problem (7.75). This is an inherently difficult problem involving two levels of optimization: for every fixed value x 2 S compute r.x/ WD maxfrj B.xI r/  Sg;

(7.76)

then determine the value x that achieves r WD maxfr.x/j x 2 Sg:

(7.77)

214

7 DC Optimization Problems

Except in special cases both these two subproblems are nonconvex. Even in the relatively simple case when S is convex nonpolyhedral and p.x/ D kxk (the Euclidean norm), the subproblem (7.76), for fixed x, is known to be NP-hard (Shiau 1984). In view of these difficulties, for some time most of the papers published on this subject either concentrated on some particular aspects of the problem or were confined to the search for a local optimum, under rather restrictive conditions. If no further structure is given on the data, the general problem of design centering is essentially intractable by deterministic methods. Fortunately, in many practically important cases (see, e.g., Vidigal and Director (1982)) it can be assumed that 1. p.x/ D xT Ax, where A is a symmetric positive definite matrix; 2. S D C \ D1 \    \ Dm , where C is a closed convex set in Rn with intC ¤ ; and Di D Rn n intCi .i D 1; : : : ; m/ and Ci is a closed convex subset of Rn . Below we present a solution method proposed by Thach (1988) for problem .P/ under these assumptions.

7.6.1 DC Reformulation For every x 2 Rn define g.x/ D sup.2xT Ay  yT Ay/: y…S

Proposition 7.15 (Thach 1988) The function g.x/ is convex, finite on Rn and the general design centering problem is equivalent to the unconstrained dc maximization problem maxfxT Ax  g.x/j x 2 Rn g:

(7.78)

Proof For every x 2 Rn we can write xT Ax  g.x/ D inf .xT Ax  2xT Ay C yT Ay C yT Ay/ y…intS

D inf .y  x/T A.y  x/

(7.79)

y…intS

D inf p2 .y  x/: y…intS

But for x 2 S it is easily seen from (7.76) that r.x/ D infy…intS p.y  x/ 8x 2 S. On the other hand, for x … S, by making y D x in (7.79) we obtain xT Ax  g.x/ D 0 since p2 .y  x/  0. Hence,

7.6 DC Optimization in Design Calculations

215

T

x Ax  g.x/ D

r2 .x/ 0

x2S x…S

(7.80)

This shows that problem (7.77) is equivalent to (7.78). The finiteness of g.x/ follows from (7.80) and that of r.x/. Finally, since for every fixed y the function x 7! 2xT Ay  yT Ay is affine, g.x/ is convex as the pointwise supremum of a family of affine functions. t u Remark 7.14 (i) In view of (7.80) problem (7.78) is the same as the constrained optimization problem maxfxT Ax  g.x/j x 2 Sg: (ii) For x 2 S we can also write r.x/ D miny2@S p.y  x/, where @S WD S n intS is the boundary of S. Therefore, from (7.79), g.x/ D maxf2xT Ay  yT Ay/ y2@S

8x 2 S:

7.6.2 Solution Method A peculiar feature of the dc optimization problem (7.78) is that the convex function g.x/ is not given explicitly but is defined as the optimal value in the optimization problem maxf2xT Ay  yT Ayj y … Sg:

(7.81)

n However, from the relation Rn n intS D .[m iD1 Ci / [ .R n intC/ it is easily seen that

g.x/ D maxfg0 .x/; g1 .x/; : : : ; gm .x/g with g0 .x/ D maxf2xT Ay  yT Ayj y 2 Rn n intCgI

(7.82)

gi .x/ D maxf2x Ay  y Ayj y 2 Ci g; i D 1; : : : ; m:

(7.83)

T

T

Thus, the subproblem (7.81) splits into mC1 simpler ones. Every subproblem (7.83) is a convex program (maximizing a concave function y 7! 2xT Ay  yT Ay over a closed convex set Ci ), so it can be solved by standard algorithms. The subproblem (7.82), though harder (maximizing a concave function over a closed reverse convex set), can be solved by reverse convex programming methods developed in Section 7.3. In that way the value of g.x/ at every point x can be computed. Also the next proposition helps to find a subgradient of g.x/ at every x.

216

7 DC Optimization Problems

Proposition 7.16 If x … S, then 2Ax 2 @g.x/. If x 2 S, then 2Ay 2 @g.x/, where y is an optimal solution of (7.81) for x D x. Proof By y denote an optimal solution of (7.81) for x D x. Then g.x/ D 2xT Ay  yT Ay, so g.x/  g.x/  .2xT Ay  yT Ay/  .2xT Ay  yT Ay/ D .x  x/T 2Ay 8x 2 Rn : This implies that 2Ay 2 @g.x/ for every x 2 Rn . In particular, when x … S we have g.x/ D xT Ax; so y D x is an optimal solution of (7.81), hence 2Ax 2 @g.x/. t u We are now in a position to derive a solution method for the dc optimization problem (7.78) equivalent to the general design centering problem .P/. Let f .x/ WD xT Ax  g.x/. Starting with an n-simplex Z0 containing the compact set S we shall construct a sequence of functions fk .:/; k D 0; 1; : : :, such that (i) fk .:/ overestimates f .:/, i.e., fk .x/  f .x/ allx 2 SI (ii) Each problem maxffk .x/j x 2 Z0 g can be solved by available efficient algorithms; (iii) maxffk .x/j x 2 Z0 g ! maxff .x/j x 2 Sg as k ! C1. Specifically, we can state the following: Algorithm Select an n-simplex Z0  S, a point x0 2 S and y0 D x0 . Set k D 1. Step 1.

Let fk .x/ D xT Ax  maxfh2Ayi ; x  xi i C g.xi /j i D 0; 1;    ; k  1g. Solve the kth approximating problem .Pk /

Step 2.

obtaining xk . If xk 2 S go to Step 2, otherwise go to Step 3. Solve the kth subproblem .Sk /

Step 3.

maxffk .x/j x 2 Z0 g;

maxf2.xk /T Ay  yT Ayj y … intSg

obtaining an optimal solution yk and the optimal value g.xk /. If fk .xk / D .xk /T Axk  g.xk /, stop: xk solves .P/. Otherwise, set k and go back to Step 1. Let yk D xk . Set k k C 1 and go back to Step 1.

kC1

Proposition 7.17 If the above algorithm stops at Step 2 the current xk solves the design centering (P). If it generates an infinite sequence xk then any cluster point of this sequence solves (P). Proof First observe that fk1 .x/  fk .x/  f .x/ 8x and that fk .xi / D f .xi /; i D 0; 1; : : : ; k  1:

7.6 DC Optimization in Design Calculations

217

In fact, since 2Ayi 2 @g.xi / by Proposition 7.16, it follows that h2Ayi ; x  xi i C g.xi /  g.x/; i D 0; 1; : : : ; k  1, hence fk .x/  fkC1 .x/  f .x/ 8x 2 Rn . For x D xi we have g.xi /  h2Ayi ; xi  xj i C g.xj /; i; j D 0; 1; : : : ; k  1. Hence, fk .xi /  .xi /T Axi  g.xi / D f .xi /; which implies that fk .xi / D f .xi /. Now suppose that at some iteration k; fk .xk / D .xk /T Axk  g.xk /, i.e., fk .xk / D f .xk /. Then, since xk solves .Pk / we have, for any x 2 Z0 W f .x/  fk .x/  fk .xk / D f .xk /, so xk solves .P/, justifying the stopping criterium in Step 2. If the algorithm is infinite, let x D lim!C1 xk . As we just saw, fkC1 .xkC1 / D maxffkC1 .x/j x 2 Z0 g  maxffk .x/j x 2 Z0 g D fk .xk /, so ffk .xk /g is an increasing sequence bounded above by maxff .x/j x 2 Sg. Therefore, there exists D limk!C1 fj .xk /. But for any i 0 W k Y

fr r .x / D 3:

(8.94)

rD1

Indeed, let !D

k Y

frr .x /:

rD1

By (8.92) xj > 2; j D 1; : : : ; n, so log.!/ > 0. Taking D

log.3/ log.!/

Qk

yields log.!/ D log.3/, i.e., log.

D . Clearly max x2X

k Y rD1

fr r .x/

D max x2X

r  .x // rD1 fr

k Y

! frr .x/

D

D log.3/, hence (8.94). Now let k Y

! frr .x /

r

rD1

so 2 M. Hence, q D q.x / (  sup q.x/j

k Y rD1

D h. /  h :

) fr r .x/

 3; x 2 X

D 3;

8.4 Application

267

This implies, in particular, that h > 0. Conversely, let  be an optimal solution of (8.44), i.e.,  2 M and h.  / D h . By x denote a maximizer of the function Qk

 r   rD1 fr .x/ on X. By Theorem 8.14 x 2 XE . Since 2 M we have k Y



fr r .x / D sup

k Y

x2X rD1

rD1



fr r .x/

 3: But if k Y



fr r .x / < 3;

rD1

then

( 

h. / D sup q.x/j

k Y

)  fr r .x/

 3; x 2 X

rD1

D sup ; D0 < h ; which is a contradiction. Thus, k Y



fr r .x / D 3

rD1

and therefore, h D h.  / ( D sup q.x/j

k Y

)  fr r .x/

 3; x 2 X

rD1

D q.x /  q : t Consequently, q D h , and so x is an optimal solution of the problem (10.12). u Thus, by quasi-gradient duality a difficult nonconvex problem in an ndimensional space like (8.85)–(8.89) can be converted into a k-th dimensional problem (8.93) with k  n, which is a much more tractable problem of convex maximization over a compact convex set. Especially when k is small, as is often the case in practice, the converted problem can be solved efficiently even by outer approximation method as proposed in Thach et al. (1996).

268

8 Special Methods

8.5 Noncanonical DC Problems In many dc optimization problems the constraint set is defined by a mixture of inequalities of the following types: • • • •

convex; reverse convex; piecewise linear; quadratic (possibly indefinite).

Problems of this form often occur, for instance, in continuous location theory (Tuy 1996; Tuy et al. 1995a). To handle these problems it is not always convenient to reduce them to the canonical form. In this section we present an outer approximation method based on a “visibility” assumption borrowed from location theory. This method is practical when the number of nonconvex variables is small. Consider the problem minff .x/j x 2 Sg;

(8.95)

where f W Rn ! R is a convex function, and S is a compact set in Rn . A feasible point x is said to be visible from a given point w 2 Rn n S if x is the unique feasible point in the line segment Œw; x, i.e., if Œw; x \ S D fxg (Fig. 7.3). Our basic assumption is the following: Visibility Assumption: The feasible set S is such that there exists an efficient finite procedure to compute, for any given point w 2 Rn n S and any z ¤ w, the visible feasible point w .z/ on the line segment Œw; z, if it exists. Note that since S is compact, either the line through w; z does not meet S, or w .z/ exists and is the nearest feasible point to w in this line. To simplify the notation we shall omit the subscript w and write .z/ when w is clear from the context. Proposition 8.4 If S D fxj gi .x/  0 i D 1; : : : ; mg and each function gi .x/ is either convex, concave, piecewise linear, or quadratic, then S satisfies the visibility assumption. Proof It suffices to check that if g.x/ belongs to one of the mentioned types and g.w/ > 0 then, for any given z ¤ w one can easily determine whether g.w C .z  w//  0 for some  > 0 and if so, compute the smallest such . In particular, if g.x/ is quadratic (definite or indefinite), then g.w C .z  w// is a quadratic function of , so computing supfj g.w C .z  w//  0g is an elementary problem. t u Since f .x/ is convex one can easily compute w 2 argminff .x/j x 2 Rn g. If it so happens that w 2 S, we are done: w solves the problem. Otherwise, w … S and we have f .w/ < minff .x/j x 2 Sg:

(8.96)

8.5 Noncanonical DC Problems

269

From the convexity of f .x/ it then follows that f .w C .z  w// < f .z/ for any z 2 S;  2 .0; 1/, so the visible feasible point in Œw; z achieves the minimum of f .x/ over the line segment Œw; z. Therefore, assuming the availability of a point w satisfying (8.96) we can reformulate (8.95) as the following: Optimal Visible Point Problem: Find the feasible point visible from w with minimal function value f .x/. Aside from the basic visibility condition and (8.96), we will further assume that: (a) the set S is robust, i.e., S D cl.intS/I (b) f .x/ is strictly convex and has bounded level sets. (c) a number ˛ 2 R is known such that f .x/ < ˛ 8x 2 S. Let P0 be a polytope containing S. If we denote D minff .x/j x 2 Sg; G D fx 2 P0 j f .x/  g, then G is a closed convex set and the problem is to find a point of ˝ WD S \ G. Applying the outer approximation method to this problem, we construct a sequence of polytopes P1  P2  : : :  G. At iteration k let xk be the incumbent feasible point and zk 2 argmaxff .x/j x 2 Pk g. The distinguished point associated with Pk is defined to be xk D .zk /. Clearly, if xk 2 G, then xk is an optimal solution. If f .xk / < f .xk /, then xkC1 D xk I otherwise xkC1 D xk . Furthermore, PkC1 D Pk \ fxj l.x/  0g where l.x/  0 is a cut which separates xk from the closed convex set G. We can thus state the following OA algorithm for solving the problem (8.95): Optimal Visible Point Algorithm Initialization. Take a polytope (simplex or rectangle) P1 known to contain one optimal solution. Let x1 be the best feasible solution available, 1 D f .x1 / (if no feasible solution is known, set x1 D ;; 1 D ˛/. Let V1 be the vertex set of P1 . Set k D 1. Step 1. Step 2. Step 3. Step 4.

If xk 2 argminff .x/j x 2 Pk g, then terminate: xk is an optimal solution. Compute zk 2 argmaxff .x/j x 2 Vk g. Find .zk /. If .zk / exists and f . .zk // < f .xk /, set xkC1 D .zk / and kC1 D f .xkC1 /I otherwise set xkC1 D xk ; kC1 D k . Let yk 2 Œw; zk  \ fxj f .x/ D kC1 g. Compute pk 2 @f .yk / and let PkC1 D Pk \ fxj hpk ; x  yk i  0g:

Step 5.

Compute the vertex set VkC1 of PkC1 . Set k

kC1 and go back to Step 1.

Figure 8.3 illustrates the algorithm on a simple example in R2 where the feasible domain consists of two disjoint parts. Starting from a rectangle P1 we arrive after two iterations at a polygon P3 . The feasible solution x2 obtained at the second iteration is already an optimal solution. Theorem 8.12 Either the above algorithm terminates by an optimal solution, or it generates an infinite sequence fxk g every accumulation point of which is an optimal solution.

270

8 Special Methods z2

π (z2 ) w

π (z1 )

P3 z1 Fig. 8.3 The optimal visible point problem

Proof Let D limk!C1 k ; G D fxj f .x/  g. By Theorem 6.5, zk  yk ! 0 and any accumulation point z of fzk g belongs to G. Suppose now that there exists a feasible point x better than z, e.g., such that f .z/  f .x / > ı > 0. By robustness of S, there exists a point x0 2 intS so close to x that f .x0 / < f .z/  ı and, consequently, a ball W  S around x0 such that f .x/ < f .z/  ı 8x 2 W. Let xO be the point where the halfline from w through x0 meets @G. Since f .x/ is strictly convex, its maximum over P1  G is achieved at a vertex, hence f .Ox/ < f .z1 / and there exists in P1 a sequence fxq g ! xO such that f .xq / & f .Ox/ D . Since f .zk / D maxff .x/j x 2 Pk g & , for each q there exists kq such that f .zkq / < f .xq /, hence xq … Pk , and consequently, hpkq ; xkq  ykq i > 0. By passing to subsequences if necessary we may assume zkq ! z; ykq ! y; pkq ! p. Since zkq  ykq ! 0, we must have y D z 2 G. Furthermore, from pk 2 @f .yk / it follows that p 2 @f .y/, hence p ¤ 0 for the relation 0 2 @f .y/ would imply that y is a minimizer of f .x/ over Rn , which is not the case. Letting q ! 1 in the inequality hpkq ; xkq  ykq i > 0 then yields hp; xO  y  0. This together with the equality f .Ox/ D f .y/ D implies, by the strict convexity of f .x/, that xO D y, and consequently, zkq ! xO . For sufficiently large q the halfline from w through zkq must then meet W  S, i.e., .zkq / exists and f . .zkq // < f .z/  ı, conflicting with f . .zk //  D f .z/ 8k. Thus, z is a global optimal solution, and hence so is any accumulation point of the sequence fxk g. t u Remark 8.9 Assumptions (a)–(c) are essential for the implementability and convergence of the algorithm. If f .x/ is not strictly convex, one could add a small perturbation "kxk2 to make it strictly convex, though this may not be always computationally practical. The robustness assumption can be bypassed by requiring only an "-approximate optimal solution as in the case of (CDC/. Remark 8.10 Suppose S D fxj gi .x/  0; i D 1; : : : ; mg and we know, for every i 2 I  f1; 2; : : : ; mg, a concave function gQ i .x/ such that gQ i .x/  gi .x/ for all x 2 P1 . Then along with zk we can compute the value "k D max min gQ i .x/: i2I x2Pk

8.6 Continuous Optimization Problems

271

(Of course this involves minimizing the concave functions gQ i .x/ over Pk but not much extra computational effort is needed, since we have to compute the vertex set Vk of Pk anyway). Let g.x/ D maxiD1;:::;m gi .x/. Clearly, "k  max min gi .x/  min max gi .x/ iD1;:::;m x2Pk

x2Pk iD1;:::;m

D min g.x/: x2Pk

(8.97)

But it is easy to see that fx 2 P1 j f .x/  f .xk /g  Pk . Indeed, the inclusion is obvious for k D 1I assuming that it holds for some k  1, we have, for any x 2 Pk satisfying f .x/  f .xkC1 /, hpk ; x  yk i  f .x/  f .yk / D f .x/  f .xkC1 /  0, hence x 2 PkC1 , so that the inclusion also holds for k C 1. From the just established inclusion and (8.97) we deduce minfg.x/j x 2 P1 ; f .x/  f .xk /g  "k ; implying that there is no x 2 P1 such that g.x/ < "k ; f .x/  f .xk /. In other words f .xk /  infff .x/j g.x/ < "k g; so xk is an "k -approximate optimal solution.

8.6 Continuous Optimization Problems The most challenging of all nonconvex optimization problems is the one in which no convexity or reverse convexity is explicitly present, namely: .COP/

minff .x/j x 2 Sg;

where f W Rn ! R is a continuous function and S is a compact set in Rn given by a system of continuous inequalities of the form gi .x/  0; i D 1; : : : ; m:

(8.98)

Despite its difficulty, this problem started to be investigated by a number of authors from the early seventies (Dixon and Szego (1975, 1978) and the references therein). However, most methods in this period dealt with unconstrained minimization of smooth or Lipschitzian functions and were able to handle only problems of just one or two dimensions. Attempts to solve constrained problems of higher dimensions by deterministic methods have begun only in recent years. As shown in Chap. 4 the core of (COP) is the subproblem of transcending an incumbent: (SP˛) Given a number ˛ 2 f .S/ (the best feasible value of the objective function known at a given stage), find a point x 2 S with f .x/ < ˛, or else prove that ˛ D minff .x/j x 2 Sg.

272

8 Special Methods

In the so-called modified function approaches, the subproblem (SP˛) for ˛ D f .x/ is solved by replacing the original function with a properly modified function such that a local search procedure, started from x and applied to the modified function will lead to a better feasible solution than x, when x is not yet a global minimizer. However, a modified function satisfying the required conditions is very difficult to construct. In its two best known versions, this modified function (tunneling function of Levy and Montalvo (1988), or filled function of Ge and Qin (1987)) depends on parameters whose correct values, in many cases, can be determined only by trial and error. The following approach to (COP) Thach and Tuy (1990) is based on the transformation of (SP˛) into the minimization of a dc function '.˛; x/, called a relief indicator. Assume that the problem is regular, i.e., minff .x/j x 2 Sg D infff .x/j x 2 intSg:

(8.99)

Since f .x/ is continuous this assumption is satisfied provided the constraint set S is robust : S D cl(intS/; whereas usual int and cl denote the interior and the closure, respectively. For ˛ 2 .1; C1 define F.˛; x/ D supff .x/  ˛; g1 .x/; : : : ; gm .x/g:

(8.100)

Proposition 8.5 Under the assumption (8.99) there holds ˛ D min f .x/ , 0 D minn F.˛; x/:1 x2S

x2R

(8.101)

Proof If ˛ D minx2S f .x/ and x 2 S satisfies f .x/ D ˛, then gi .x/  0 for i D 1; : : : ; m, hence F.˛; x/ D 0I on the other hand, f .x/  ˛ 8x 2 S, hence F.˛; x/  0 8x 2 Rn , i.e., minx2Rn F.˛; x/ D 0. Conversely, if the latter equality holds then for all x 2 intS, i.e., for all x satisfying gi .x/ < 0; i D 1; : : : ; m, we have f .x/  ˛, hence minx2S f .x/  ˛ in view of (8.99). Furthermore, there is x satisfying F.˛; x/ D 0, hence gi .x/  0; i D 1; : : : ; m (so x 2 S/ and f .x/ D ˛, i.e., ˛ D minx2S f .x/. t u Thus, solving (COP) can be reduced, essentially, to finding the unconstrained minimum of the function F.˛; x/: if this minimum is zero, and F.˛; x/ D 0 then x solves (COP); if it is negative, then any x such that F.˛; x/ < 0, will be a feasible point satisfying f .x/ < ˛I finally, if it is positive then S D ;, i.e., the problem is infeasible.

1 If f ; g1 ; : : : ; gm are convex, condition (8.99) is implied by intS ¤ ; and by writing (8.101) in the form 0 2 @F.˛; x/ we recover the classical optimality criterion in convex programming.

8.6 Continuous Optimization Problems

273

However, finding the global minimum of F.˛; x/ may be as hard as solving the original problem. Therefore, we look for some function '.˛; x/ which could do essentially the same job as F.˛; x/ while being easier to handle. The key for this is the representation of closed sets by dc inequalities as discussed in Chap. 3. Define SQ ˛ D fx 2 Rn j F.˛; x/  0g D fx 2 Sj f .x/  ˛g:

(8.102)

To make the problem tractable, assume that for every ˛ 2 .1; C1 a function r.˛; x/ (separator for .f ; S/) is available which is l.s.c. in x and satisfies: (a) r.˛; x/ D 0 8x 2 SQ ˛ I (b) 0 < r.˛; y/  d.y; SQ ˛ / WD minfkx  ykj x 2 SQ ˛ g 8y … SQ ˛ I (c) For fixed x; r.˛1 ; x/  r.˛2 ; x/ if ˛1 < ˛2 . This assumption is fulfilled in many cases of interest, as shown in the following examples: Example 8.3 If all functions f ; g1 ; : : : ; gm are Lipschitzian with constants L0 ; L1 ; : : : ; Lm then a separator is

f .x/  ˛ gi .x/ r.˛; x/ D max 0; ; ; i D 1; : : : ; m : L0 Li Indeed, for any x 2 SQ ˛ W f .y/  ˛  f .y/  f .x/  L0 kx  yk gi .y/  gi .y/  gi .x/  Li kx  yk i D 1; : : : ; m hence r.˛; y/  kx  yk, from this it is easy to verify (a)–(c). Example 8.4 If f is twice continuously differentiable, with kf 00 .x/k  K for every x, and S D fxj kxk  cg then a separator is r.˛; x/ D maxf.˛; x/; kxk  cg; where .˛; x/ denotes the unique nonnegative root of the equation K 2 t C kf 00 .x/kt D maxf0; f .x/  ˛g: 2 Indeed, we need only verify condition (b) since the other conditions are obvious. For any x 2 SQ ˛ , we have by Taylor’ s formula: jf .x/  f .y/  f 0 .y/.x  y/j 

M kx  yk2 ; 2

hence jf .x/  f .y/j  kf 0 .y/kkx  yk C f .x/  f .y/ 

M kx  yk2 2

M kx  yk2  kf 0 .y/kkx  yk: 2

274

8 Special Methods

Now M2 t2 C kf 0 .y/kt is a monotonic increasing function of t. Therefore, since f .x/  ˛ implies that M kx  yk2 C kf 0 .y/kkx  yk  f .y/  ˛; 2 it follows from the definition of .˛; y/ that .˛; y/  kx  yk. On the other hand, since kxk  c; we have kykc  kykkxk  kxyk. Hence, r.˛; y/  kxyk 8x 2 SQ ˛ , proving (b). By Proposition 4.13, given a separator r.˛; x/, if we define h.˛; x/ D sup fr2 .˛; y/ C 2xy  kyk2 g y…SQ ˛

then h.˛; x/ is a closed convex function in x such that SQ ˛ D fxj h.˛; x/  kxk2  0g:

(8.103)

Since for any y 2 S satisfying f .y/ D ˛ we have r.˛; y/ D 0, hence r2 .˛; y/ C 2xy  kyk2  kxk2 D kx  yk2  0 8x, the representation (8.103) still holds with h.˛; x/ D sup fr2 .˛; y/ C 2xy  kyk2 g

(8.104)

y…S˛

where S˛ D fx 2 Sj f .x/ < ˛g:

(8.105)

Proposition 8.6 Assume S ¤ ; and let '.˛; x/ D h.˛; x/  kxk2 . If '.˛; x/ D 0 D minn '.˛; x/: x2R

(8.106)

then x is a global optimal solution of (COP). Otherwise, the minimum of '.˛; x/ is negative and any x such that '.˛; x/ < 0 satisfies x 2 S; f .x/ < ˛. Proof Since F.˛; x/  0 , '.˛; x/  0 (see (8.102) and (8.103)), it suffices to show further that F.˛; x/ < 0 , '.˛; x/ < 0. But if F.˛; x/ < 0 then x 2 intS˛ and denoting by ı > 0 the radius of a ball around x contained in S˛ , we have for every y … S˛ W d 2 .y; S˛ /  kx  yk2  ı 2 , hence r2 .˛; y/  kx  yk2  ı 2 , i.e., '.˛; x/  ı 2 < 0. Conversely, if '.˛; x/ < 0 then for every y … S˛ W r2 .˛; y/ C 2xy  kyk2 < kxk2  , with > 0, hence r2 .˛; y/ < kx  yk2  and we must have F.˛; x/ < 0 (for otherwise x … intS˛ and by taking a sequence yn ! x; such that yn … S˛ 8n and letting n ! 1 in the inequality r2 .˛; yn / < kx  yn k2  we would arrive at a contradiction: 0   ). t u

8.6 Continuous Optimization Problems

275

The function '.˛; x/ is called a relief indicator. By Proposition 8.6 it gives information about the “altitude” f .x/ of every point x relative to the level ˛ and can be used as a substitute for F.˛; x/. The knowledge of a relief indicator '.˛; x/ reduces each subproblem SP.˛) to an unconstrained dc optimization problem, namely: minimize h.˛; x/  kxk2

s:t:

x 2 Rn :

(8.107)

We shall shortly discuss an algorithm for solving this dc optimization problem. Assuming such an algorithm available, the global optimum of (COP) can be computed by the following iterative scheme: Relief Indicator Method Start with an initial value ˛1 2 f .S/, if available, or ˛1 D C1 if no feasible solution is known. At iteration k D 1; 2; : : : solve minf'.˛k ; x/j x 2 Rn g;

(8.108)

to obtain its optimal solution xk and optimal value k . If k  0 then terminate: xk is an optimal solution of (COP) (if k D 0/, or (COP) is infeasible (if k > 0/. Otherwise, xk is a feasible solution such that f .xk / < ˛k W then set ˛kC1 D f .xk / and go to the next iteration. The convergence of this iterative scheme is immediate. Indeed, if x is a accumulation point of the sequence fxk g and ˛ D f .x/, then x 2 S and '.˛; x/  0. On the other hand, since xk … S˛ it follows that '.˛; x/  r2 .˛; xk / C 2xxk  kxk k2  kxk2  kxk  xk2 ! 0 as xk ! x. Hence, '.˛; x/ D 0, so by Proposition 8.6 x is a global optimal solution of (COP). t u

8.6.1 Outer Approximation Algorithm The implementation of the above conceptual scheme requires the availability of a procedure for solving subproblems of the form (8.108). One possibility is to rewrite (8.108) as a concave minimization problem: minft  kxk2 j h.˛k ; x/  tg;

(8.109)

and to solve the latter by the simple OA algorithm (Sect. 6.2). Since ˛kC1 < ˛k we have h.˛k ; x/  h.˛kC1 ; x/, so if Gk denotes the constraint set of (8.109) then GkC1  Gk 8k. This allows the procedures of solving (8.109) for successive k D 1; 2; : : : to be integrated into a unified OA procedure for solving the single concave minimization problem minft  kxk2 j h.˛; x/  tg; where ˛ is the (unknown) optimal value of (COP).

(8.110)

276

8 Special Methods

Specifically, denote by G the constraint set of (8.110). Let Pk be a polytope outer approximating Gk  G and let .xk ; tk / be an optimal solution of the relaxed problem minft  kxk2 j x 2 Pk g: If tk  kxk k2  0 then the optimal value of (8.110) is   tk  kxk k2  0 and there are two possibilities: (1) if ˛k D C1, then (COP) is infeasible (otherwise, one would have  < 0, conflicting with 0  tk  kxk k2  /I (2) if ˛k < C1; then xk is an optimal solution of (COP) (because 0  tk  kxk k2    0, hence  D 0/. So the only case that needs further investigation is when tk  kxk k2 < 0. Define minf˛k ; f .xk /g if xk 2 S (8.111) ˛kC1 D otherwise ˛k Lemma 8.4 The affine function lk .x/ WD 2hxk ; xi  kxk k2 C r2 .˛kC1 ; xk / strictly separates .xk ; tk / from GkC1 , i.e., satisfies lk .xk / > tk ;

lk .x/  t  0 8x 2 GkC1 :

Proof Since tk < kxk k2 we have lk .xk /  tk > lk .xk /  kxk k2 D r2 .˛kC1 ; xk /  0. On the other hand, x 2 GkC1 implies h.˛kC1 ; x/  t  0, but since xk … S˛kC1 it follows from (8.104) that lk .x/  h.˛kC1 ; x/, and hence lk .x/  t  h.˛kC1 ; x/  t  0. u t In view of this Lemma if we define PkC1 D Pk \fxj lk .x/  tg then .xk ; tk / … PkC1 while PkC1  GkC1  G, so the sequence P1  P2      G realizes an OA scheme for solving (8.110). We are thus led to the following: Relief Indicator OA Algorithm for (COP) Initialization. Take a simplex or rectangle P1  Rn known to contain an optimal solution of the problem. Let x1 be a point in P1 and let ˛1 D f .x1 / if x1 2 S, or ˛1 D C1 otherwise. Set k D 1. Step 1.

If xk 2 S, then improve xk by local search methods (provided this can be done at reasonable cost). Denote by xk the best feasible solution so far obtained. If xk exists, let ˛kC1 D f .xk /, otherwise let ˛kC1 D C1. Define lk .x/ D 2hxk ; xi  kxk k2 C r2 .˛kC1 ; xk /:

Step 2.

Solve .SPk /

minft  kxk2 j x 2 P1 ; li .x/  t; i D 1; : : : ; kg

to obtain .xkC1 ; tkC1 /.

8.6 Continuous Optimization Problems

Step 3.

Step 4.

277

If tkC1  kxkC1 k2 > 0, then terminate: S D ; (COP) is infeasible). If tkC1  kxkC1 k2 D 0, then terminate (S D ; if ˛kC1 D C1; xk solves (COP) if ˛kC1 < C1). If tkC1  kxkC1 k2 < 0, then set k k C 1 and go back to Step 1.

Lemma 8.5 After finitely many steps the above algorithm either gives evidence that .COP/ is infeasible or generates a feasible solution, if there is one. Proof Let xQ 2 intS (see assumption (8.99)), so '.˛; xQ / < 0 for ˛ D C1. If xk … S 8k, then ˛k D C1; tk < kxk k2 8k, and tk  kxk k2  min '.˛k1 ; x/  '.˛k1 ; xQ /; x2P1

8k

while for all i < k W tk  kxk k2  li .xk /  kxk k2  kxi  xk k2 : Thus, kxi  xk k2  '.˛k ; xQ / > 0 8i < k, conflicting with the boundedness of fxk g  P1 . Therefore, xk 2 S for all sufficiently large k. In particular, any accumulation point of the sequence fxk g is feasible. t u Proposition 8.7 Whenever infinite the Relief Indicator Algorithm generates an infinite sequence fxk g every accumulation point of which is an optimal solution of (COP). Proof Let x D lim xks , (as we just saw, x 2 S). Define s!C1

Qlk .x/ D lk .x/  kxk2 : For any q > s we have Qlks .xkq / D lks .xkq /  kxkq k2  tkq  kxkq k2 < 0, hence, by making q ! C1 W Qlks .x/  0. But clearly, Qlks .xks / D Qlks .x/  kxks  xk2  kxks  xk2 ! 0 as s ! C1, while Qlk .xk / D r2 .˛kC1 ; xk / > 0. Therefore, Qlks .xks / ! 0 and also tks  kxks k2 ! 0, as s ! C1. If t D lim tks then t  kxk2 D 0. Since tk kxk k2 D minftkxk2 j x 2 P1 ; li .x/  t; i D 1; 2; : : :, k1g  minftkxk2 j x 2 P1 ; h.˛k ; x/  tg 8k, it follows that t  kxk2 D minft  kxk2 j x 2 P1 ; h.˛; x/  tg for ˛ D lim ˛k . That is, '.˛; x/ D 0, and hence, x solves (COP). t u Remark 8.11 Each subproblem (SPk ) is a linearly constrained concave minimization problem which can be solved by vertex enumeration as in simple outer approximation algorithms. Since (SPkC1 ) differs from (SPk ) by just one additional linear constraint, if Vk denotes the vertex set of the constraint polytope of (SPk ) then V1 can be considered known, while VkC1 can be derived from Vk using an on-line vertex enumeration subroutine (Sect. 6.2). If the algorithm is stopped when tk  kxk k2 > "2 then xk yields an "-approximate optimal solution.

278

8 Special Methods

Example 8.5 Solve the problem Minimize f .x/ D x21 C x22  cos 18x1  cos 18x2 1

subject to g.x/ D Œ.x1  0:5/2 C .x2  0:415331/2 2 C 0:65  0 0  x1  1; 0  x2  1: The functions f and g are Lipshitzian on S D Œ0; 1  Œ0; 1, with Lipschitz constants 28.3 and 1, respectively. Following Example 6.2 a separator is r.˛; x/ D maxf0;

f .x/  ˛ ; g.x/g: 28:3

With tolerance " D 0:01 the algorithm terminates after 16 iterations, yielding an approximate optimal solution x D .0; 1/ with objective function value 0:6603168.

8.6.2 Combined OA/BB Algorithm An alternative method for solving (8.110) is by combining outer approximation with branch and bound. This gives rise to a relief indicator OA/BB algorithm (Tuy 1992b, 1995a) for problems of the following form: .CO=SNP/

minff .x/j G.x; y/  0; gi .x/  0; i D 1; : : : ; mg;

where f ; g1 ; : : : ; gm are as previously, while G W Rn  Rq ! R is a convex function. To take advantage of the convexity of G in this problem it is convenient to treat the constraint G.x; y/  0 separately. Assuming robustness of the constraint set of (CO/SNP), and the availability of a separator r.˛; x/ for .f ; S/ (with S being the set determined by the system gi .x/  0; i D 1; : : : ; m) we can transform (CO/SNP) into the concave minimization problem minft  kxk2 j G.x; y/  0; h.˛; x/  tg;

(8.112)

where h.˛; x/ is defined as in (8.104) and ˛ is the (unknown) optimal value of f .x/. Since for every fixed x this is a convex problem, the most convenient way is to branch upon x and compute bounds by using relaxed subproblems of the form LPk .M/

minft 

M .x/j

x 2 M; li .x/  t .i 2 Ik / Lj .x; y/  0 .j 2 Jk /g

where M is the partition set, M .x/ is the affine function that agrees with kxk2 at the vertices of M; f.x; t/j li .x/  t .i 2 Ik /g the outer approximating polytope of the convex set C˛ WD f.x; t/j h.˛; x/  tg and f.x; y/j Lj .x; y/  0 .j 2 Jk /g the outer approximating polytope of the convex set ˝ D f.x; y/j G.x; y/  0g, at iteration k.

8.7 Exercises

279

BB/OA Algorithm for (CO/SNP) Initialization. Take an n-simplex M1 in Rn known to contain an optimal solution of (CO/SNP). If a feasible solution is available then let .x0 ; y0 / be the best one, ˛0 D f .x0 /I otherwise, let ˛0 D C1. Set I1 D J1 D ;; R D N D fM1 g. Set k D 1. Step 1. Step 2.

Step 3.

Step 4.

For each M 2 N compute the optimal value ˇ.M/ of the linear program LPk .M/. Update .xk ; yk / and ˛k . Delete every M 2 R such that ˇ.M/ > 0. Let R 0 be the remaining collection of simplices. If R 0 D ; then terminate: (CO/SNP) is infeasible. Select Mk 2 argminfˇ.M/j M 2 R 0 g. Let .xk ; yk ; tk / be a basic optimal solution of LPk .Mk /. If ˇ.Mk / D 0 then terminate: (CO/SNP) is infeasible (if ˛k D C1/, or .xk ; yk / solves (CO/SNP) (if ˛k D C1). If kxk k2  tk and G.xk ; yk /  0 then set IkC1 D Ik ; JkC1 D Jk . Otherwise: (a) if kxk k2  tk > G.xk ; yk / then set JkC1 D Jk , IkC1 D Ik [ fkg; lk .x/ D 2hxk ; xi  kxk k2 C r2 .˛k ; xk /I (b) if kxk k2  tk  G.xk ; yk / then set IkC1 D Ik , JkC1 D Jk [ fkg; Lk .x/ D huk ; x  xk i C hv k ; y  yk i C G.xk ; yk /; where .uk ; v k / 2 @G.xk ; yk /.

Step 5.

Perform a bisection or an !-subdivision ( subdivision via ! k D xk ) of Mk , according to a normal subdivision rule. Let N  be the partition of Mk . Set S N ; R N  [ .R 0 n fMk g/, increase k by 1 and return to Step 1.

The convergence of this algorithm follows from that of the general algorithm for (COP).

8.7 Exercises 1 Solve by polyhedral annexation: min .x1  1/2  x22  .x3  1/2 4x1  5x2 C 4x3  4 6x1  x2  x3  4:1 x1 C x2  x 3  1

s:t:

280

8 Special Methods

12x1 C 5x2 C 12x3  34:8 12x1 C 12x2 C 7x3  29:1 x1 ; x2 ; x3  0 (optimal solution x D .1; 0; 0// 2 Consider the problem: minfcxj x 2 ˝; g.x/  0; h.x/  0; hg.x/; h.x/i D 0g where g; h W Rn ! Rn are two convex maps and ˝ a convex subset of Rn (convex program with an additional complementarity condition). Show that this problem can be converted into a (GCP) (a map g W Rn ! Rm is said to be convex if its components gi .x/; i D 1; : : : ; m are all convex functions) 3 Show that an "-optimal solution of (LRC) can be computed according to the following scheme (Nghia and Hieu 1986): Initialization Solve the linear program ˇ1 D minfcxj x 2 Dg: Let w be a basic optimal solution of this program. If h.w/  0, terminate: w solves (LRC). Otherwise, set ˛1 D C1; k D 1. Iteration k (1) If ˛k  ˇk  ", terminates: x is an "-optimal solution. (2) Solve

˛k C ˇk

k D max h.z/j z 2 D; cz  2



to obtain zk . (3) If k < 0, and ˛k D 1, terminate: the problem is infeasible. ˛k C ˇk (4) If k < 0, and ˛k < 1, set ˛kC1 D ˛k ; ˇkC1 D and go to iteration 2 k C 1. (5) If k  0, then starting from zk perform a number of simplex pivots to move to a point xk such that cxk  czk and xk is on the intersection of an edge of D with the surface h.x/ D 0 (see Phase 1 of FW/BW Procedure). Set x D xk ; ˛kC1 D cxk ; ˇkC1 D ˇk and go to iteration k C 1. 4 Solve minfy3 C 4:5y2  6yj 0  y  3g.

8.7 Exercises

281

5 Solve min 4x41 C 2x22  4x21 W x21  2x1  2x2  1  0 1  x1  1I

1  x2  1:

(optimal solution x D .0; 707; 0:000//. 6 Solve minf2x1 C x2 j

x21 C x22  1; x1  x22  0g:

(optimal solution x D .0:618034; 0:786151// 7 Solve minfx21 C x22 C x1 x2 C x1 C x2 j

x21 C x22  0:5I x1 ; x2  0g:

8 Solve min 2x21 C x1 x2  10x1 C 2x22  20 W x1 C x2  11I

1:5x1 C x2  11:4I

x1 ; x2  0

9 Solve 1 2 1 min x1 C x2 C x22  x21 j 2 3 2

2x1  x2  3I x1 ; x2  0 :

10 Solve the previous problem by reduction to quasiconcave minimization via dualization (see Sect. 8.3). 11 Let C and D two compact convex sets in Rn containing 0 in their interiors. Consider the pair of dual problems .P/

maxx;r frj x C rC  Dg

.Q/

minx;r frj C  x C rDg:

Show that 1) .x; r/ solves .P/ if and only if .x=r; 1=r/ solves QI 2) When D is a polytope, problem .P) merely reduces to a linear program; 3) When C is a polytope, problem .Q/ reduces to minimizing a dc function. 2

12 Solve minfex sin x  jyj j 0  x  10; 0  y  10g. (Hint: Lipschitz constant =2)

Chapter 9

Parametric Decomposition

A partly convex problem as formulated in Sect. 8.2 of the preceding chapter can also be viewed as a convex optimization problem depending upon a parameter x 2 Rn : The dimension n of the parameter space is referred to as the nonconvexity rank of the problem. Roughly speaking, this is the number of “nonconvex variables” in the problem. As we saw in Sect. 8.2, a partly convex problem of rank n can be efficiently solved by a BB decomposition algorithm with branching performed in Rn : This chapter discusses decomposition methods for partly convex problems with small nonconvexity rank n: It turns out that these problems can be solved by streamlined decomposition methods based on parametric programming.

9.1 Functions Monotonic w.r.t. a Convex Cone Let X be a convex set in Rn and K a closed convex cone in Rn : A function f W Rn ! R is said to be monotonic on X with respect to K (or K-monotonic on X for short) if x; x0 2 X; x0  x 2 K

)

f .x0 /  f .x/:

(9.1)

In other words, for every x 2 X and u 2 K the univariate function 7! f .x C u/ is nondecreasing in the interval .0;  / where  D supf j x C u 2 Xg: When X D Rn we simply say that f .x/ is K-monotonic. Let L be the lineality space of K; and Q an r  n matrix of rank r such that L D fxj Qx D 0g: From (9.1) it follows that x; x0 2 X; Qx D Qx0

)

f .x/ D f .x0 /;

(9.2)

i.e., f .x/ is constant on every set fx 2 Xj Qx D constg: Denote Q.X/ D fy 2 Rr j y D Qx; x 2 Xg: © Springer International Publishing AG 2016 H. Tuy, Convex Analysis and Global Optimization, Springer Optimization and Its Applications 110, DOI 10.1007/978-3-319-31484-6_9

283

284

9 Parametric Decomposition

Proposition 9.1 Let K  recX: A function f .x/ is K-monotonic on X if and only if f .x/ D F.Qx/ 8x 2 X;

(9.3)

where F W Q.X/ ! R satisfies y; y0 2 Q.X/; y0  y 2 Q.K/

)

F.y0 /  F.y/:

(9.4)

Proof If f .x/ is K-monotonic on X then we can define a function F W Q.X/ ! R by setting, for every y 2 Q.X/ W F.y/ D f .x/ for any x 2 X such that y D Qx:

(9.5)

This definition is correct in view of (9.2). Obviously, f .x/ D F.Qx/: For any y; y0 2 Q.X/ such that y0  y 2 Q.K/ we have y D Qx; y0 D Qx0 ; with Q.x0  x/ D Qu; u 2 K; i.e., Qx0 D Q.x C u/; u 2 K: Here x C u 2 X because u 2 recX; hence by (9.1), f .x0 / D f .x C u/  f .x/; i.e., F.y0 /  F.y/: Conversely, if f .x/ D F.Qx/ and F W Q.X/ ! R satisfies (9.4), then for any x; x0 2 X such that x0  x 2 K we have y D Qx; y0 D Qx0 ; with y0  y 2 Q.K/; hence F.y0 /  F.y/; i.e., f .x0 /  f .x/: t u The above representation of K-monotonic functions explains their role in global optimization. Indeed, using this representation, any optimization problem in Rn of the form minf f .x/j x 2 Dg

(9.6)

where D  X; and f .x/ is K-monotonic on X; can be rewritten as minfF.y/j y 2 Q.D/g

(9.7)

which is a problem in Rr : Assume further, that f .x/ is quasiconcave on X; i.e., by (9.5), F.y/ is quasiconcave on Q.X/: Proposition 9.2 Let K  recX: A quasiconcave function f W X ! R on X is Kmonotonic on X if and only if for every x0 2 X with f .x0 / D W x0 C K  X. / WD fx 2 Xj f .x/  g: Proof If f .x/ is K-monotonic on X, then for any x 2 X. / and u 2 K we have x C u 2 X (because K  recX/ and condition (9.1) implies that f .x C u/  f .x/  ; hence x C u 2 X. /; i.e., u is a recession direction for X. /: Conversely, if for any 2 f .X/; K is contained in the recession cone of X. /; then for any x0 ; x 2 X such that x0  x 2 K one has, for D f .x/ W x 2 X. /; hence x0 2 X. /; i.e., f .x0 /  f .x/: t u

9.1 Functions Monotonic w.r.t. a Convex Cone

285

Proposition 9.1 provides the foundation for reducing the dimension of problem (9.6), whereas Proposition 9.2 suggests methods for restricting the global search domain. Indeed, since local information is generally insufficient to verify the global optimality of a solution, the search for a global optimal solution must be carried out, in principle, over the entire feasible set. If, however, the objective function f .x/ in (9.6) is K-monotonic, then, once a solution x0 2 D is known, one can ignore the whole set D \ .x0 C K/  fx 2 Dj f .x/  f .x0 /g; because no better feasible solution than x0 can be found in this set. Such kind of information is often very helpful and may drastically simplify the problem by limiting the global search to a restricted region of the feasible domain. In the next sections, we shall discuss how efficient algorithms for K-monotonic problems can be developed based on these observations. Below are some important classes of quasiconcave monotonic functions encountered in the applications. Example 9.1 f .x/ D '.x1 ; : : : ; xp / C d T x; with '.y1 ; : : : ; yp / a continuous concave function of y D .y1 ; : : : ; yp /: Here F.y/ D '.y1 ; : : : ; yp / C ypC1 ; Qx D .x1 ; : : : ; xp ; dT x/T : Monotonicity holds with respect to the cone K D fuj ui D 0.i D 1; : : : ; p/; d T u  0g: It has long been recognized that concave minimization problems with objective functions of this form can be efficiently solved only by taking advantage of the presence of the linear part in the objective function (Rosen 1983; Tuy 1984). Pp i 2 pC1 Example 9.2 f .x/ D ; xi: iD1 Œhc ; xi C hc P p 2 Here F.y/ D  iD1 yi C ypC1 ; Qx D .hc1 ; xi; : : : ; hcpC1 ; xi/T : Monotonicity holds with respect to K D fuj hci ; ui D 0.i D 1; : : : ; p/; hcpC1 ; ui  0g: For p D 1 such a function f .x/ cannot in general be written as a product of two affine functions and finding its minimum over a polytope is a problem known to be NP-hard (Pardalos and Vavasis 1991), however, quite practically solvable by a parametric algorithm (see Sect. 9.5). Q Example 9.3 f .x/ D riD1 Œhci ; xi C di ˛i with ˛i > 0; i D 1; : : : ; r: This class P includes functions which are products of affine functions. Since log f .x/ D riD1 ˛i logŒhci ; xi C di  for every x such that hci ; xi C di > 0; it is easily seen that f .x/ is quasiconcave on the set X D fxj hci ; xi C di  0; i D 1; : : : ; rg: Furthermore, f .x/ is monotonic on X with respect to theQ cone K D fuj hci ; ui  0; i D 1; : : : ; rg: Here f .x/ has the form (9.3) with F.y/ D riD1 .yi Cdi /˛i : Problems of minimizing functions f .x/ of this form (with ˛i D 1; i D 1; : : : ; r/ under linear constraints are termed linear multiplicative programs and will be discussed later in this chapter (Sect. 9.5). P i Example 9.4 f .x/ P D  riD1 i ehc ;xi with i > 0; i D 1; : : : ; r. Here F.y/ D  riD1 i eyi is concave in y; so f .x/ is concave in x: Monotonicity holds with respect to the cone K D fxj hci ; xi  0; i D 1; : : : ; rg: Functions of this type appear when dealing with geometric programs with negative coefficients (“signomial programs,” cf Example 5.6).

286

9 Parametric Decomposition

Example 9.5 f .x/ quasiconcave and differentiable, with rf .x/ D

r X

pi .x/ci ;

ci 2 Rn ; i D 1; : : : ; r:

iD1

For every x; x0 2 X satisfying hci ; x0  xi D 0; i D 1; : : : ; r; there P is some 2 .0; 1/ such that f .x0 /  f .x/ D hrf .x C .x0  x//; x0  xi D riD1 pi .x C .x0  x//hci ; x0  xi D 0; hence f .x/ is monotonic with respect to the space L D fxj hci ; xi D 0; i D 1; : : : ; rg: If, in addition, pi .x/  0 8x .i D 1; : : : ; r/ then monotonicity holds with respect to the cone K D fuj hci ; ui  0; i D 1; : : : ; rg: Example 9.6 f .x/ D minfyT Qxj y 2 Eg where E is a convex subset of Rm and Q is an m  n matrix of rank r: This function appears when transforming rank r bilinear programs into concave minimization problems (see Sect. 10.4). Here monotonicity holds with respect to T the subspace K D fuj Qu D 0g: If E  Rm C then Qu  0 implies y Q.x C u/  T y Qx; hence f .x C u/  f .x/; and so f .x/ is monotonic with respect to the cone K D fuj Qu  0g: Example 9.7 f .x/ D supfd T yj Ax C By  qg where A 2 Rpn with rankA D r; B 2 Rpm ; y 2 Rm and q 2 Rp : This function appears in bilevel linear programming, for instance, in the max-min problem (Falk 1973b): min maxfcT x C d T yj Ax C By  q; x  0; y  0g: x

y

Obviously, Au  0 implies A.x C u/  Ax; hence fyj Ax C By  qg  fyj A.x C u/ C By  qg; hence f .x C u/  f .x/: Therefore, monotonicity holds with respect to the cone K D fuj Au  0g: Note that by Corollary 1.15, rankQ D n  dim L D dim K ı : A K- monotonic function f .x/ on X; with rankQ D r is also said to possess the rank r monotonicity property.

9.2 Decomposition by Projection Let D be a compact subset of a closed convex set X in Rn and let f W Rn ! R be a quasiconcave function, monotonic on X with respect to a polyhedral convex cone K contained in the recession cone of X: Assume that the lineality space of K is L D fxj Qx D 0g; where Q is an r  n matrix of rank r: Consider the quasiconcave monotonic problem .QCM/

minff .x/j

x 2 Dg

(9.8)

9.2 Decomposition by Projection

287

In this and the next two sections, we shall discuss decomposition methods for solving this problem, under the additional assumption that D is a polytope. The idea of decomposition is to transform the original problem into a sequence of subproblems involving each a reduced number of nonconvex variables. This can be done either by projection, or by dualization (polyhedral annexation), or by parametrization ( for r  3/: We first present decomposition methods by projection. As has been previously observed, by setting F.y/ D f .x/ for any x 2 X satisfying y D Qx;

(9.9)

we unambiguously define a quasiconcave function F W Q.X/ ! R such that problem .QCM/ is equivalent to min F.y/ s:t:

y D .y1 ; : : : ; yr / 2 G

(9.10)

where G D Q.D/ is a polytope in Rr : The algorithms presented in Chaps. 5 and 6 can of course be adapted to solve this quasiconcave minimization problem. We  show how to compute F.y/: Since rankQ D r; by writing Q D ŒQB ; QN ;  first xB ; where QB is an r  r nonsingular matrix, the equation Qx D y yields xD xN  xD

   Q1 Q1 B B QN x N yC 0 xN

   Q1 QN x N Q1 B B and u D we obtain Setting Z D 0 xN 

x D Zy C u

with Qu D QN xN C QN xN D 0:

(9.11)

Since Qx D Q.Zy/ with Zy D x  u 2 X  L D X; it follows that f .x/ D f .Zy/: Thus, the objective function of (9.10) can be computed by the formula F.y/ D f .Zy/:

(9.12)

We shall assume that f .x/ is continuous on some open set in Rn containing D; so F.y/ is continuous on some open set in Rr containing the constraint polytope G D Q.D/ of (9.10). The peculiar form of the latter set (image of a polytope D under the linear map Q/ is a feature which requires special treatment, but does not cause any particular difficulty.

288

9 Parametric Decomposition

9.2.1 Outer Approximation To solve (9.10) by outer approximation, the key point is: given a point y 2 Q.Rn /  Rr , determine whether y 2 G D Q.D/; and if y … G then construct a linear inequality (cut) l.y/  0 to exclude y without excluding any point of G: Observe that y … Q.D/ if and only if the linear system Qx D y has no solution in D; so the existence of l.y/ is ensured by the separation theorem or any of its equivalent forms (such as the Farkas–Minkowski Lemma). However, since we are not so much interested in the existence as in the effective construction of l.y/; the best way is to use the duality theorem of linear programming (which is another form of the separation theorem). Specifically, assuming D D fxj Ax  b; x  0g; with A 2 Rmn ; b 2 Rm ; consider the dual pair of linear programs minfhh; xij Qx D y; Ax  b; x  0g

(9.13)

maxfhy; vi  hb; wij QT v  AT w  h; w  0g:

(9.14)

where h is any vector chosen so that (9.14) is feasible (for example, h D 0/ and can be solved with least effort. Proposition 9.3 Solving (9.14) either yields a finite optimal solution or an extreme direction .v; w/ of the cone QT v  AT w  0; w  0; such that hy; vi  hb; wi > 0: In the former case, y 2 G D Q.D/I in the latter case, the affine function l.y/ D hy; vi  hb; wi

(9.15)

satisfies l.y/ > 0;

l.y/  0 8y 2 G:

(9.16)

Proof Immediate from the duality theorem of linear programming.

t u

On the basis of this proposition a finite OA procedure for solving (9.10) can be carried out in the standard way, starting from an initial r-simplex P1  G D Q.D/ with a known or readily computable vertex set. In most cases, one can take this initial simplex to be ( P1 D yj yi  ˛i ; .i D 1; : : : ; r/;

r X

) .yi  ˛i /  ˇ

(9.17)

iD1

where ˛i D minfyi j y D Qx; x 2 Dg; i D 1; : : : ; r;

˚Pr ˇ D max iD1 .yi  ˛i /j y D Qx; x 2 D :

(9.18) (9.19)

9.2 Decomposition by Projection

289

Remark 9.1 A typical case of interest is the concave program with few nonconvex variables (Example 7.1): minf'.y/ C dzj Ay C Bz  b; y  0; z  0g

(9.20)

where the function ' W Rp ! R is concave and .y; z/ 2 Rp  Rq : By rewriting this problem in the form (9.10): minf'.y/ C tj dz  t; Ay C Bz  b; y  0; z  0g p

we see that for checking the feasibility of a vector .y; t/ 2 RC  R the best choice of h in (9.13) is h D d: This leads to the pair of dual linear programs minfdzj Bz  b  Ay; z  0g

(9.21)

maxfhAy  b; vij  B v  d; v  0g:

(9.22)

T

We leave it to the reader to verify that: 1. If v is an optimal solution to (9.22) and hAy  b; vi  t then .y; t/ is feasible; 2. If v is an optimal solution to (9.22) and hAy  b; vi > t then the cut hAy  b; vi  t excludes .y; t/ without excluding any feasible point; 3. If (9.22) is unbounded, i.e., hAy  b; vi > 0 for some extreme direction v of the feasible set of (9.22), then the cut hAy  b; vi  0 excludes .y; t/ without excluding any feasible point. Sometimes the matrix B has a favorable structure such that for each fixed y  0 the problem (9.20) is solvable by efficient specialized algorithms (see Sect. 9.8). Since the above decomposition preserves this structure in the auxiliary problems, the latter problems can be solved by these algorithms.

9.2.2 Branch and Bound A general rule in the branch and bound approach to nonconvex optimization problems is to branch upon the nonconvex variables, except for rare cases where there are obvious reasons to do otherwise. Following this rule the space to be partitioned for solving .QCM/ is Q.Rn / D Rr : Given a partition set M  Rr a basic operation is to compute a lower bound for F.y/ over the feasible points in M: To this end, select an appropriate affine minorant lM .y/ of F.y/ over M and solve the linear program minflM .y/j Qx D y; x 2 D; y 2 Mg:

290

9 Parametric Decomposition

If ˇ.M/ is the optimal value of this linear program then obviously ˇ.M/  minfF.y/j y 2 Q.D/ \ Mg: According to condition (6.4) in Sect. 6.2 this can serve as a valid lower bound, provided, in addition, ˇ.M/ < C1 whenever Q.D/ \ M ¤ ;  which is the case when M is bounded as, e.g., when using simplicial or rectangular partitioning. Simplicial Algorithm If M D Œu1 ; : : : ; urC1  is an r-simplex in Q.Rn / then lM .x/ can beP taken to be the i affine function that agrees with F.y/ at u1 ; : : : ; urC1 : Since lM .y/ D rC1 iD1 ti F.u / for PrC1 i PrC1 y D iD1 ti u with t D .t1 ; : : : ; trC1 /  0; iD1 ti D 1; the above linear program can also be written as ( rC1 ) rC1 rC1 X X X i i min ti F.u /j x 2 D; Qx D ti u ; ti D 1; t  0 : iD1

iD1

iD1

Simplicial subdivision is advisable when f .x/ D f0 .x/ C dx; where f0 .x/ is concave and monotonic with respect to the cone Qx  0: Although f .x/ is then actually monotonic with respect to the cone Qx  0; dx  0; by writing .QCM/ in the form minfF0 .y/ C dxj Qx D y; x 2 Dg one can see that it can be solved by a simplicial algorithm operating in Q.Rn / rather than Q.Rn /  R: As initial simplex M1 one can take the simplex (9.17). Rectangular Algorithm When F.y/ is concave separable, i.e., F.y/ D

r X

Fi .yi /;

iD1

where each Fi ./; i D 1; : : : ; r is a concave univariate function, this structure can be best exploited by rectangular subdivision. In that case, for any rectangle M D Œp; q D fyj p  y  qg; lM .y/ can be taken to be the uniquely defined affine function which agrees with F.y/ at the corners of M: This function is P lM .y/ D riD1 li;M .yi / where each li;M .t/ is the affine univariate function that agrees with Fi .t/ at the endpoints pi ; qi of the segment Œpi ; qi : The initial rectangle can be taken to be M1 D Œ˛1 ; ˇ1   : : :  Œ˛r ; ˇr ; where ˛i is defined as in (9.18) and ˇi D supfyi j y D Qx; x 2 Dg: Conical Algorithm In the general case, since the global minimum of F.y/ is achieved on the boundary of Q.D/; conical subdivision is most appropriate. Nevertheless, instead of bounding, we use a selection operation, so that the algorithm is a branch and select rather than branch and bound procedure.

9.2 Decomposition by Projection

291

The key point in this algorithm is to decide, at each iteration, which cones in the current partition are ‘non promising’ and which one is the ‘most promising’ (selection rule). Let y be the current best solution (CBS) at a given iteration and WD F.y/: Assume that all the cones are vertexed at 0 and F.0/ > ; so that 0 2 int˝ ; where ˝ D fy 2 ˝j F.y/  g: Let lM .y/ D 1 be the equation of the hyperplane in Rr passing through the intersections of the edges of M with the surface F.y/ D : Denote by !.M/ a basic optimal solution and by .M/ the optimal value of the linear program LP.M; /

maxflM .y/j y 2 Q.D/ \ Mg:

Note that if u1 ; : : : ; ur are the directions of the edges of M and U D .u1 ; : : : ; ur / is the matrix of columns u1 ; : : : ; ur then M D fyj y D Ut; t D .t1 ; : : : ; tr /T  0g; so LP(M, / can also be written as maxflM .Ut/j Ax  b; Qx D Ut; x  0; t  0g: The deletion and selection rule are based on the following: Proposition 9.4 If .M/  1 then M cannot contain any feasible point better than CBS. If .M/ > 1 and !.M/ lies on an edge of M then !.M/ is a better feasible solution than CBS. Proof Clearly D \ M is contained in the simplex Z WD M \ fyj lM .y/  .M/g: If .M/  1; then the vertices of Z other than 0 lie inside ˝ ; hence the values of F.y/ at these vertices are at least equal to : But by quasiconcavity of F.y/; its minimum over Z must be achieved at some vertex of Z, hence must be at least equal to : This means that F.y/  for all y 2 Z; and hence, for all y 2 D \ M: To prove the second assertion, observe that if !.M/ lies on an edge of M and .M/ > 1, then !.M/ … ˝ ; i.e., F.!.M// < I furthermore, !.M/ is obviously feasible. t u The algorithm is initialized from a partition of Rr into r C 1 cones vertexed at 0 (take, e.g., the partition induced by the r-simplex Œe0 ; e1 ; : : : ; er  where ei ; for i D 1; : : : ; r; is the i-th unit vector of Rr and e0 D  1r .e1 C : : : C er //: On the basis of Proposition 9.4, at iteration k a cone M will be fathomed if .M/  1; while the most promising cone Mk (the candidate for further subdivision) is the unfathomed cone that has largest .M/: If ! k D !.Mk / lies on any edge of M k it yields, by Proposition 9.4, a better feasible solution than CBS and the algorithm can go to the next iteration with this improved CBS. Otherwise, Mk can be subdivided via the ray through ! k : As proved in Chap. 7 (Theorem 7.3), the above described conical algorithm converges, in the sense that any accumulation point of the sequence of CBS that

292

9 Parametric Decomposition

it generates is a global optimal solution. Note that no bound is needed in this branch and select algorithm. Although a bound for F.y/ over the feasible points in M can be easily derived from the constructions used in Proposition 9.4, it would require the computation of the values of F.y/ at r additional points (the intersections of the edges of M with the hyperplane lM .y/ D .M//; which may entail a nonnegligible extra cost.

9.3 Decomposition by Polyhedral Annexation The decomposition method by outer approximation is conceptually simple. However, a drawback of this method is its inability to be restarted. Furthermore, the construction of the initial polytope involves solving r C 1 linear programs. An alternative decomposition method which allows restarts and usually performs better is by dualization through polyhedral annexation . Let x be a feasible solution and C WD fx 2 Xj f .x/  f .x/g: By quasiconcavity and continuity of f .x/; C is a closed convex set. By translating if necessary, assume that 0 is a feasible point such that f .0/ > f .x/; so 0 2 D \ intC: The PA method (Chap. 8, Sect. 8.1) uses as a subroutine Procedure DC* for solving the following subproblem: Find a point x 2 D n C (i.e., a better feasible solution than x/ or else prove that D  C (i.e., x is a global optimal solution). The essential idea of Procedure DC* is to build up, starting with a polyhedron P1  Cı (Lemma 8.1), a nested sequence of polyhedrons P1  P2  : : :  Cı that yields eventually either a polyhedron Pk  Dı (in that case Cı  Dı ; hence D  C/ or a point xk 2 D n C: In order to exploit the monotonicity structure, we now try to choose the initial polyhedron P1 so that P1  L? (orthogonal complement of L/: If this can be achieved, then the whole procedure will operate in L? ; which allows much computational saving since dim L? D r < n: ? Let c1 ; : : : ; cr be the rows is the subspace generated by these Prof Qi (so that L ? 0 vectors) and let c D  c : Let W L ! Rr be the linear map such iD1 Pr i that y D iD1 ti c , .y/ D t: For each i D 0; 1; : : : ; r since 0 2 intC; the halfline from 0 through ci intersects C: If this halfline meets @C; we define ˛i to be the positive number such that zi D ˛i ci 2 @CI otherwise, we set ˛i D C1: Let I D fij ˛i < C1g; S1 D convf0; zi ; i 2 Ig C conefci ; i … Ig: Lemma 9.1 (i) The polar P1 of S1 C L is an r-simplex in L? defined by (

) 1 .P1 / D t 2 R j ti hc ; c i  ; j D 0; 1; : : : ; r ; ˛j iD1 r

(as usual

1 C1

D 0/:

r X

i

j

(9.23)

9.3 Decomposition by Polyhedral Annexation

293

(ii) Cı  P1 and if V is the vertex set of any polytope P such that Cı  P  P1 ; then maxfhv; xij x 2 Dg  1

8v 2 V

C ı  Dı :

)

(9.24)

Proof Clearly S1  L? and 0 2 riS1 ; so 0 2 int.S1 C L/; hence P1 is compact. SinceP.S1 C L/ı D S1ı \ L? ; any y 2 P1 belongs to L? ; i.e., is of the form y D riD1 ti ci for some t D .t1 ; : : : ; tr /: But by Proposition 1.28, y 2 S1ı is equivalent to hcj ; yi  ˛1j 8j D 0; 1; : : : ; r: Hence P1 is the polyhedron defined by (9.23). We

have 0
f .x/ and set D D x0 ; C C  x0 : Let PQ 1 D .P1 / be the simplex in Rr defined by (9.23) (or PQ 1 D .P1 / with P1 being any polytope in L? such that Cı  P1 /. Let V1 be the vertex set of PQ 1 : Set k D 1: Step 1.

For every new t D .t1 ; : : : ; tr / 2 Vk solve the linear program ( r ) X max ti hci ; xij x 2 D

(9.25)

iD1

Step 2. Step 3. Step 4.

to obtain its optimal value .t/ and a basic optimal solution x.t/: Let tk 2 argmaxf.t/j t 2 Vk g: If .tk /  1 then terminate: x is an optimal solution of (QCM). If .tk / > 1 and xk D x.tk / … C, then update the current best feasible solution and the set C by resetting x D xk : Compute k D supf j f . xk /  f .x/g and define (

PQ kC1

Step 5.

r X

1 D PQ k \ tj ti hx ; c i  k iD1 k

i

From Vk derive the vertex set VkC1 of PQ kC1 : Set k Step 1.

) : k C 1 and go back to

294

9 Parametric Decomposition

Remark 9.2 Reported computational experiments (Tuy and Tam 1995) seem to indicate that the PA method is quite practical for problems with fairly large n provided r is small (typically r  6/: For r  3; however, specialized parametric methods can be developed which are more efficient. These will be discussed in the next section.

Case When K D fuj Qu  0g A case when the initial simplex P1 can be constructed so that its vertex set is readily available when K D fuj Qu  0g; with an r  n matrix Q of rank r: Let c1 ; : : : ; cr be the rows of Q: Using formula (9.11) one can compute a point w satisfying Qw D e; where e D .1; : : : ; 1/T ; and a value ˛ > 0 such that wO D ˛w 2 C (for the efficiency of the algorithm, ˛ should be taken to be as large as possible). Lemma 9.2 The polar P1 of the set S1 D wO C K is an r- simplex with vertex set f0; c1 =˛; : : : ; cr =˛g and satisfying condition (ii) in Lemma 9.1. Proof Since K  recC and wO 2 C it follows that S1 D wO C K  C; and hence P1  Cı : Furthermore, clearly S1 D fxj hci ; xi  ˛; i D 1; : : : ; rg because x 2 S1 if and only if x  wO 2 K; i.e., hci ; x  wi O  0 i D 1; : : : ; r; i.e., hci ; xi  i hc ; wi O D ˛; i D 1; : : : ; r: From Proposition 1.28 we then deduce that S1ı D convf0; c1 =˛; : : : ; cr =˛g: The rest is immediate. t u Also note that inP this case K ı D conefc1 ; : : : ; cr g and so .K ı / D Rr while r .P1 / D ft 2 R j riD1 ti  1=˛g:

9.4 The Parametric Programming Approach In the previous sections, it was shown how a quasiconcave monotonic problem can be transformed into an equivalent quasiconcave problem of reduced dimension. In this section an alternative approach will be presented in which a quasiconcave monotonic problem is solved through the analysis of an associated linear program depending on a parameter (of the same dimension as the reduced problem in the former approach). Consider the general problem formulated in Sect. 8.2: .QCM/

minff .x/j

x 2 Dg

(9.26)

where D is a compact set in Rn ; not necessarily even convex, and f .x/ is a quasiconcave monotonic function on a closed convex set X  D with respect to a polyhedral cone K with lineality space L D fxj Qx D 0g: To this problem we associate the following program where v is an arbitrary vector in K ı W

9.4 The Parametric Programming Approach

LP.v/

295

minfhv; xij x 2 Dg:

(9.27)

Denote by Kbı any compact set such that cone(Kbı / D K ı : Theorem 9.1 For every v 2 K ı ; let xv be an arbitrary optimal solution of LP.v/: Then minff .x/j x 2 Dg D minff .xv /j v 2 Kbı g:

(9.28)

Proof Let D minff .xv /j v 2 Kb0 g: Clearly D minff .xv /j v 2 K ı n f0gg: Denote by E the convex hull of the closure of the set of all xv ; v 2 K ı n f0g: Since D is compact, so is E and it is easy to see that the convex set G D E C K is closed. Indeed, for any sequence x D u C v  ! x0 with u 2 E; v  2 K; one has, by passing to subsequences if necessary, u ! u0 2 E; hence v  D x  u ! x0  u0 2 K; which implies that x0 D u0 C .x0  u0 / 2 E C K D G: Thus, G is a closed convex set. By K-monotonicity we have f .y C u/  f .y/ for all y 2 E; u 2 K; hence f .x/  for all x 2 E C K D G: Suppose now that > minff .x/j x 2 Dg; so that for some z 2 D we have f .z/ < ; i.e., z 2 D n G: By Proposition 1.15 on the normals to a convex set there exists x0 2 @G such that v WD x0  z satisfies hv; x  x0 i  0 8x 2 G;

hv; z  x0 i < 0:

(9.29)

For every u 2 K; since x0 C u 2 G C K D G; we have hv; ui  0; hence v 2 K ı n f0g: But, since z 2 D; it follows from the definition of xv that hv; xv i  hv; zi < hv; x0 i by (9.29), hence hv; xv  x0 i < 0; conflicting with the fact xv 2 E  G and (9.29). t u The above theorem reduces problem .QCM/ to minimizing the function '.v/ WD f .xv / over Kbı : Though K ı  L? ; and dim L? D r may be small, this is still a difficult problem even when D is convex, because '.v/ is generally highly nonconvex. The point, however, is that in many important cases the problem (9.28) is more amenable to computational analysis than the original problem. For instance, as we saw in the previous section, when D is a polytope the PA Method solves .QCM/ through solving a suitable sequence of linear P programs (9.25) which can be recognized to be identical to (9.27) with v D  riD1 ti ci : An important consequence of Theorem 9.1 can be drawn when: ˇ ˇ D is a polyhedronI ˇ ˇ K D fxj hci ; xi  0; i D 1; : : : ; pI hci ; xi D 0; i D p C 1; : : : ; rg

(9.30)

296

9 Parametric Decomposition

(c1 ; : : : ; cr being linearly independent vectors of Rn /: Let p

 D f 2 RC j p D 1g; W D f˛ D .˛pC1 ; : : : ; ˛r /j ˛ i  ˛i  ˛ i ; i D p C 1; : : : ; rg; ˛ i D inf hci ; xiI x2D

˛ i D suphci ; xi i D p C 1; : : : ; r: x2D

Corollary 9.1 For each .; ˛/ 2   W let x;˛ be an arbitrary basic optimal solution of the linear program LP.; ˛/

ˇ ˇ min Pp1 i hci ; xi C hcp ; xi iD1 ˇ ˇ s:t: x 2 D; hci ; xi D ˛ ; i D p C 1; : : : ; r: i

Then minff .x/j x 2 Dg D minff .x;˛ /j  2 ; ˛ 2 Wg:

(9.31)

Proof Clearly minff .x/j x 2 Dg D min infff .x/j x 2 D \ H˛ g; ˛

where H˛ D fxj hci ; xi D ˛i ; i D p C 1 : : : ; rg: Let KQ D fxj hci ; xi  0; Q i D 1; : : : ; pg: Since f .x/ is quasiconcave K-monotonic on X \ H˛ ; we have by Theorem 9.1 : minff .x/j x 2 D \ H˛ g D minff .xv;˛ /j v 2 KQ bı g;

(9.32)

where xv;˛ is any basic optimal solution of the linear program minfhv; xij x 2 D \ H˛ g:

(9.33)

Noting that KQ ı D conefc1 ; : : : ; cp g we can take KQ bı D Œc1 ; : : : ; cp : Also we can 0 v 0 2 riKQ bı such that xv ;˛ assume v 2 riKQ bı since for any v 2 KQ bı there Pp exists is a basic optimal solution of (9.33). So v D iD1 ti ci with tp > 0 and the desired result follows by setting i D ttpi : t u As mentioned above, by Theorem 9.1 one can solve .QCM/ by finding first the vector v 2 Kbı that minimizes f .xv / over all v 2 Kbı : In certain methods, as, for instance, in the PA method when D is polyhedral, an optimal solution x D xv is immediately known once v has been found. In other cases, however, x must be sought by solving the associated program minfhv; xij x 2 Dg: The question then arises as to whether any optimal solution of the latter program will solve .QCM/:

9.4 The Parametric Programming Approach

297

Theorem 9.2 If, in addition to the already stated assumptions, D  intX; while the function f .x/ is continuous and strictly quasiconcave on X, then for any optimal solution x of .QCM/ there exists a v 2 K ı n f0g such that x is an optimal solution of LP.v/ and any optimal solution to LP.v/ will solve .QCM/. (A function f .x/ is said to be strictly quasiconcave on X if for any two x; x0 2 X; f .x0 / < f .x/ always implies f .x0 / < f . x C .1  /x0 / 8 2 .0; 1// Proof It suffices to consider the case when f .x/ is not constant on D: Let x be an optimal solution of .QCM/ and X.x/ D fx 2 Xj f .x/  f .x/g: By quasiconcavity and continuity of f .x/, this set X.x/ is closed and convex. By optimality of x; D  X.x/ and again by continuity of f .x/; fx 2 Xj f .x/ > f .x/g  intX.x/: Furthermore, since f .x/ is not constant on D one can find x0 2 D such that f .x0 / > f .x/; so if x 2 intX.x/ then for > 0 small enough, x D x  .x0  x/ 2 X.x/; f .x / < f .x0 /; hence, by strict quasiconcavity of f .x/; f .x / < f .x/; conflicting with the definition of X.x/: Therefore, x lies on the boundary of X.x/: By Theorem 1.5 on supporting hyperplanes there exists then a vector v ¤ 0 such that hv; x  xi  0

8x 2 X.x/:

(9.34)

For any u 2 K we have x C u 2 D C K  X.x/; so hv; ui  0; and consequently, v 2 K ı nf0g: If xQ is any optimal solution of LP.v/; then hv; xQ xi D 0; and in view of (9.34), xQ cannot be an interior point of X.x/ for then hv; x  xi D 0 8x 2 Rn ; conflicting with v ¤ 0: So xQ is a boundary point of X.x/; and hence f .Qx/ D f .x/; because fx 2 Xj f .x/ > f .x/g  intX.x/: Thus, any optimal solution xQ to LP.v/ is optimal to .QCM/. t u Remark 9.3 If f .x/ is differentiable and pseudoconcave then it is strictly quasiconcave (see, e.g., Mangasarian 1969). In that case, since X.x/ D fx 2 Xjf .x/  f .x/g one must have hrf .x/; x  xi  0 8x 2 X.x/ and since one can assume that there is x 2 X satisfying f .x/ > f .x/ the strict quasiconcavity of f .x/ implies that rf .x/ ¤ 0 (Mangasarian 1969). So rf .x/ satisfies (9.34) and one can take v D rf .x/: This result was established in Sniedovich (1986) as the foundation of the so-called Cprogramming. Corollary 9.2 Let F W Rq ! R be a continuous, strictly quasiconcave and Kmonotonic function on a closed convex set Y  Rq ; where K  Rq is a polyhedral convex cone contained in the recession cone of Y: Let g W D ! Rq be a map defined on a set D  Rn such that g.D/  intY: If F.g.x// attains a minimum on D then there exists t 2 K ı such that any optimal solution of minfht; g.x/ij x 2 Dg

(9.35)

minfF.g.x//j x 2 Dg:

(9.36)

will solve

298

9 Parametric Decomposition

Proof If x is a minimizer of F.g.x// over D, then y D g.x/ is a minimizer of F.y/ over G D g.D/; and the result follows from Theorem 9.2. t u q

In particular, when K  RC and D is a convex set while gi .x/; i D 1; : : : ; q are convex functions, problem (9.35) is a parametric convex program (t  0 because q K ı  RC /: Example 9.8 Consider the problem (Tanaka et al. 1991): ( n ! ! ) n X X f1;i .xi / : f2;i .xi / j xi 2 Xi ; i D 1; : : : ; n min x

iD1

(9.37)

iD1

where Xi  RCC and f1;i ; f2;i are positive-valued functions Pn .i D 1 : : : ; n/: Applying the Corollary for g.x/ D .g .x/; g .x//; g .x/ D 1 2 1 iD1 f1;i .xi /; g2 .x/ D Pn 2 iD1 f2;i .xi /; F.y/ D y1 y2 for every y D .y1 ; y2 / 2 RCC ; we see that if this problem is solvable then there exists a number t > 0 such that any optimal solution of ( n ) X min .f1;i .xi / C tf2;i .xi //j xi 2 Xi ; i D 1; : : : ; n (9.38) x

iD1

will solve (9.37). For fixed t > 0 the problem (9.38) splits into n one-dimensional subproblems minff1;i .xi / C tf2;i .xi /j xi 2 Xi g: In the particular case of the problem of optimal ordering policy for jointly replenished products as treated in the mentioned paper of Tanaka et al. f1;i .xi / D a=n C ai =xi ; f2;i .xi / D bi xi =2 .a; ai ; bi > 0/ and Xi is the set of positive integers, so each of the above subproblems can easily be solved.

9.5 Linear Multiplicative Programming In this section we discuss parametric methods for solving problem (QCM) under assumptions (9.30). Let Q be the matrix of rows c1 ; : : : ; cr : In view of Proposition 9.1, the problem can be reformulated as minfF.hc1; xi; : : : ; hcr ; xi/j x 2 Dg;

(9.39)

where D is a polyhedron in Rn ; and F W Rr ! R is a quasiconcave function on a closed convex set Y  Q.D/; such that for any y; y0 2 Y W yi  y0i .i D 1; : : : ; p/ yi D y0i .i D p C 1; : : : ; r/



) F.y/  F.y0 /:

(9.40)

9.5 Linear Multiplicative Programming

299

A typical problem of this class is the linear multiplicative program : .LMP/

min

8 r 3 the cells are very complicated and there is in general no practical method for computing these cells, except in rare cases where additional structure (like network constraints, see Sect. 9.8) may simplify this computation. Below we briefly examine the method for problems with r  3: For more detail the reader is referred to the cited papers of Konno and Kuno.

9.5.1 Parametric Objective Simplex Method When p D r D 2 the problem (9.39) is minfF.hc1; xi; hc2 ; xi/j x 2 Dg

(9.46)

9.5 Linear Multiplicative Programming

301

where D is a polyhedron contained in the set fxj hc1 ; xi  0; hc2 ; xi  0g; and F.y/ is a quasiconcave function on the orthant R2C such that y; y0 2 R2C ; y  y0 ) F.y/  F.y0 /:

(9.47)

As seen earlier, aside from the special case when F.y/ D y1 y2 (linear multiplicative program), the class (9.46) includes problems with quite different objective functions, such as .hc1 ; xi/q1 .hc2 ; xi/q2 I q1 ehc

1 ;xi

 q2 ehc

2 ;xi

.q1 > 0; q2 > 0/

corresponding to different choices of F.y/ W q

q

y11 y22 I q1 ey1  q2 ey2 : By Corollary 9.1 (with p D r D 2/ the parametric linear program associated with (9.46) is LP./

minfhc1 C c2 ; xij x 2 Dg;

  0:

(9.48)

Since the parameter domain  is the nonnegative real line, the cells of this parametric program are intervals and can easily be determined by the standard parametric objective simplex method (see, e.g., Murty 1983) which consists in the following. Suppose a basis Bk is already available such that the corresponding basic solution xk is optimal to LP./ for all  2 ˘k WD Œk1 ; k  but not for  outside this interval. The latter means that if  is slightly greater than k then at least one of the inequalities in the optimality criterion .c1 C c2 /Nk  .c1 C c2 /Bk B1 k Nk  0 becomes violated. Therefore, by performing a number of simplex pivot steps we will pass to a new basis BkC1 with a basic solution xkC1 which will be optimal to LP./ for all  in a new interval ˘kC1  Œk ; C1/: In this way, starting from the value 0 D 0 one can determine the interval ˘1 D Œ0 ; 1  then pass to the next ˘2 D Œ1 ; 2 ; and so on, until the whole halfline is covered. Note that the optimum objective value function ./ in the linear program LP./ is a concave piecewise affine function of  with ˘k as linearity intervals. The points 0 D 0; 1 ; : : : ; l are the breakpoints of ./: An optimal solution to (9.46) is then xk where k 2 argminfF.hc1 ; xk i; hc2 ; xk i/j k D 1; 2; : : : ; lg:

(9.49)

Remark 9.4 Condition (9.47) is a variant of (9.40) for p D r D 2: Condition (9.40) with p D 1; r D 2 always holds since it simply means F.y/ D F.y0 / whenever

302

9 Parametric Decomposition

y D y0 : Therefore, without assuming (9.47), Corollary 9.1 and hence the above method still applies. However, in this case,  D R; i.e., the linear program LP./ should be considered for all  2 .1; C1/:

9.5.2 Parametric Right-Hand Side Method Consider now problem (9.39) under somewhat weaker assumptions than previously, namely: F.y/ is not required to be quasiconcave, while condition (9.47) is replaced by the following weaker one: y1  y01 ; y2 D y02

)

F.y/  F.y0 /:

(9.50)

The latter condition implies that for fixed y2 the univariate function F.:; y2 / is monotonic on the real line. It is then not hard to verify that Corollary 9.1 still holds for this case. In fact this is also easy to prove directly. Lemma 9.4 Let ˛ D minfhc2 ; xij x 2 Dg; ˛ D maxfhc2 ; xij x 2 Dg and for every ˛ 2 Œ˛; ˛ let x˛ be a basic optimal solution of the linear program LP.˛/

minfhc1 ; xij x 2 D; hc2 ; xi D ˛g

:

Then minfF.hc1 ; xi; hc2 ; xi/j x 2 Dg D minfF.hc1; x˛ i; hc2 ; x˛ i/j ˛  ˛  ˛g: Proof For every feasible solution x of LP.˛/ we have hc1 ; x  x˛ i  0 (because x˛ is optimal to LP(˛// and hc2 ; x  x˛ i D 0 (because x˛ is feasible to LP(˛//: Hence, from (9.50) F.hc1 ; xi; hc2 ; xi/  F.hc1 ; x˛ i; hc2 ; x˛ i/ and the conclusion follows. t u Solving (9.39) thus reduces to solving LP.˛/ parametrically in ˛: This can be done by the following standard parametric right-hand side method (see, e.g., Murty 1983). Assuming that the constraints in LP.˛/ have been rewritten in the form Ax D b C ˛b0 ; x  0; let Bk be a basis of this system satisfying the dual feasibility condition c1Nk  c1Bk B1 k Nk  0: Then the basic solution defined by this basis, i.e., the vector xk D uk C ˛v k such k that xkBk D B1 k .b C ˛b0 /; xNk D 0 [see (9.45)], is dual feasible. This basic solution k x is optimal to LP.˛/ if and only if it is feasible, i.e., if and only if it satisfies the nonnegativity condition B1 k .b C ˛b0 /  0:

9.5 Linear Multiplicative Programming

303

Let k WD Œ˛k1 ; ˛k  be the interval formed by all values ˛ satisfying the above inequalities. Then xk is optimal to LP.˛/ for all ˛ in this interval, but for any value ˛ slightly greater than ˛k at least one of the above inequalities becomes violated, i.e., at least one of the components of xk becomes negative. By performing then a number of dual simplex pivots, we can pass to a new basis BkC1 with a basic solution xkC1 which will be optimal for all ˛ in some interval kC1 D Œ˛k ; ˛kC1 : Thus, starting from the value ˛0 D ˛; one can determine an initial interval 1 D Œ˛; ˛1 ; then pass to the next interval 2 D Œ˛1 ; ˛2 ; and so on, until the whole interval Œ˛; ˛ is covered. The generated points ˛0 D ˛; ˛1 ; : : : ; ˛h D ˛ are the breakpoints of the optimum objective value function .˛/ in LP.˛/ which is a convex piecewise affine function. By Lemma 9.4 an optimal solution to (9.39) is then xk where k 2 argminfF.hc1 ; xk i; hc2 ; xk i/j k D 1; 2; : : : ; hg:

(9.51)

Remark 9.5 In both objective and right-hand side methods the sequence of points xk on which the objective function values have to be compared depends upon the vectors c1 ; c2 but not on the specific function F.y/: Computational experiments (Konno and Kuno 1992), also (Tuy and Tam 1992) indicate that solving a problem (9.39) requires no more effort than solving just a few linear programs of the same size. Two particular problems of this class are worth mentioning: minfc1 x:c2 xj x 2 Dg minfx21 C c0 xj

x 2 Dg;

(9.52) (9.53)

where D is a polyhedron [contained in the domain c1 x > 0; c2 x > 0 for (9.52)]. These correspond to functions F.y/ D y1 y2 and F.y/ D y21 C y2 , respectively and have been shown to be NP-hard (Matsui 1996; Pardalos and Vavasis 1991), although, as we saw above, they are efficiently solved by the parametric method.

9.5.3 Parametric Combined Method When r D 3 the parametric linear program associated to (9.39) is difficult to handle because the parameter domain is generally two-dimensional. In some particular cases, however, a problem (9.39) with r D 3 may be successfully tackled by the parametric method. As an example consider the problem minfc0 x C c1 x:c2 xj

x 2 Dg

(9.54)

which may not belong to the class (QCM) because we cannot prove the quasiconcavity of the function F.y/ D y1 C y2 y3 : It is easily seen, however, that this problem is equivalent to minfhc0 C ˛c1 ; xij x 2 D; hc2 ; xi D ˛; ˛  ˛  ˛g

(9.55)

304

9 Parametric Decomposition

where ˛ D minx2D hc2 ; xi; ˛ D maxx2D hc2 ; xi: By writing the constraints x 2 D; hc2 ; xi D ˛ in the form Ax D b C ˛b0 :

(9.56)

we can prove the following: Lemma 9.5 Each basis B of the system (9.56) determines an interval B  Œ˛; ˛ and an affine map xB W B ! Rn such that xB .˛/ is a basic optimal solution of the problem P.˛/

minfhc0 C ˛c1 ; xij x 2 D; hc2 ; xi D ˛g

for all ˛ 2 B : The collection of all intervals B corresponding to all bases B such that B ¤ ; covers all ˛ for which P.˛/ has an optimal solution.   xB Proof Let B be a basis, A D .B; N/; x D ; so that the associated basic xN solution is xB D B1 .b C ˛b0 /;

xN D 0:

(9.57)

This basic solution is feasible for all ˛ 2 Œ˛; ˛ satisfying the nonnegativity condition B1 .b C ˛b0 /  0:

(9.58)

and is dual feasible for all ˛ satisfying the dual feasibility condition (optimality criterion) .c0 C ˛c1 /N  .c0 C ˛c1 /B B1 N  0:

(9.59)

Conditions (9.58) and (9.59) determine each an interval. If the intersection B of the two intervals is nonempty, then the basic solution (9.57) is optimal for all ˛ 2 B : The rest is obvious. t u Based on this Lemma, one can generate all the intervals B corresponding to different bases B by a procedure combining primal simplex pivots with dual simplex pivots (see, e.g., Murty 1983). Specifically, for any given basis Bk ; denote by uk C ˛v k the vector (9.57) which is the basic solution of P(˛/ corresponding to this basis, and by k D Œ˛k1 ; ˛k  the interval of optimality of this basic solution. When ˛ is slightly greater than ˛k either the nonnegativity condition [see (9.58)] or the dual feasibility condition [see (9.59)] becomes violated. In the latter case we perform a primal simplex pivot step, in the former case a dual simplex pivot step, and repeat as long as necessary, until a new basis BkC1 with a new interval kC1 is obtained. Thus, starting from an initial interval 1 D Œ˛; ˛1  we can pass from one interval to

9.6 Convex Constraints

305

the next, until we reach ˛: Let then k ; k D 1; : : : ; l be the collection of all intervals. For each interval k let uk C ˛v k be the associated basic solution which is optimal to P(˛/ for all ˛ 2 k and let xk D uk C k v k ; where k is a minimizer of the quadratic function '.˛/ WD hc0 C ˛c1 ; uk C ˛v k i over k : Then from (9.55) and by Lemma 9.5, an optimal solution of (9.54) is xk where k 2 argminfc0 xk C .c1 xk /.c2 xk /j k D 1; 2; : : : ; lg: Note that the problem (9.54) is also NP-hard (Pardalos and Vavasis 1991).

9.6 Convex Constraints So far we have been mainly concerned with solving quasiconcave monotonic minimization problems under linear constraints. Consider now the problem .QCM/

minff .x/j

x 2 Dg:

(9.60)

formulated in Sect. 9.2 [see (9.8)], where the constraint set D is compact convex but not necessarily polyhedral. Recall that by Proposition 9.1 where Qx D .c1 x; : : : ; cr x/T there exists a continuous quasiconcave function F W Rr ! R on a closed convex set Y  Q.D/ satisfying condition (9.40) and such that f .x/ D F.hc1 ; xi; : : : ; hcr ; xi/:

(9.61)

It is easy to extend the branch and bound algorithms in Sect. 8.2 and the polyhedral annexation algorithm in Sect. 8.3 to (QCM) with a compact convex constraint set D; and in fact, to the following more general problem: .GQCM/

minfF.g.x//j x 2 Dg

(9.62)

where D and F.y/ are as previously (with g.D/ replacing Q.D//; while g D .g1 ; : : : ; gr /T W Rn ! Rr is a map such that gi .x/; i D 1; : : : ; p; are convex on a closed convex set X  D and gi .x/ D hci ; xi C di ; i D p C 1; : : : ; r: For the sake of simplicity we shall focus on the case p D r; i.e., we shall assume F.y/  F.y0 / whenever y; y0 2 Y; y  y0 : This occurs, for instance, for the convex multiplicative programming problem ( min

r Y iD1

) gi .x/j x 2 D

(9.63)

306

where F.y/ D rewritten as

9 Parametric Decomposition

Qr

iD1 yi ;

D  fxj gi .x/  0g: Clearly problem (GQCM) can be

minfF.y/j g.x/  y; x 2 Dg

(9.64)

which is to minimize the quasiconcave RrC -monotonic function F.y/ on the closed convex set E D fyjg.x/  y; x 2 Dg D g.D/ C RrC :

(9.65)

9.6.1 Branch and Bound If the function F.y/ is concave (and not just quasiconcave) then simplicial subdivision can be used. In this case, a lower bound for F.y/ over the feasible points y in a simplex M D Œu1 ; : : : ; urC1  can be obtained by solving the convex program minflM .y/j y 2 E \ Mg; where lM .y/ is the affine function that agrees with F.y/ at the vertices of M (this function satisfies lM .y/  F.y/ 8y 2 M in view of the concavity of F.y//: Using this lower bounding rule, a simplicial algorithm for (GQCM) can be formulated following the standard branch and bound scheme. It is also possible to combine branch and bound with outer approximation of the convex set E by a sequence of nested polyhedrons Pk  E: P If the function F.y/ is concave separable, i.e., F.y/ D rjD1 Fj .yj /; where each Fj .:/ is a concave function of one variable, then rectangular subdivision can be more convenient. However, in the general case, when F.y/ is quasiconcave but not concave, an affine minorant of a quasiconcave function over a simplex (or a rectangle) is in general not easy to compute. In that case, one should use conical rather than simplicial or rectangular subdivision.

9.6.2 Polyhedral Annexation Since F.y/ is monotonic with respect to the orthant RrC ; we can apply the PA Algorithm using Lemma 9.2 for constructing the initial polytope P1 : Let y 2 E be an initial feasible point of (9.64) and C D fyj F.y/  F.y/g: By translating, we can assume that 0 2 E and 0 2 intC; i.e., F.0/ > F.y/: Note that instead of a linear program like (9.25) we now have to solve a convex program of the form maxfht; yij y 2 Eg D maxfht; g.x/ij x 2 Dg

(9.66)

9.6 Convex Constraints

307

where t 2 .RrC /ı D RrC : Therefore, to construct the initial polytope P1 (see Lemma 9.2) we compute ˛ > 0 such that F.˛w/ D F.y/ (where w D .1; : : : ; 1/T 2 Rr / and take the initial polytope P1 to be the r-simplex in Rr with vertex set V1 D f0; e1 =˛; : : : ; er =˛g: We can thus state: PA Algorithm (for .GQCM/) Initialization. Let y be a feasible solution (the best available), C D fyj F.y/  F.y/g: Choose a point y0 2 E such that F.y0 / > F.y/ and set E E  y0 ; C C  y0 : r Let P1 be the simplex in R with vertex set V1 : Set k D 1: Step 1. Step 2. Step 3. Step 4.

For every new t D .t1 ; : : : ; tr /T 2 Vk nf0g solve the convex program (9.66), obtaining the optimal value .t/ and an optimal solution y.t/: Let tk 2 argmaxf.t/j t 2 Vk g: If .tk /  1, then terminate: y is an optimal solution of (GQCM). If .tk / > 1 and yk WD y.tk / … C (i.e., F.yk / < F.y//; then update the current best feasible solution and the set C by resetting y D yk : Compute k D supf j F. yk /  F.y/g

(9.67)

and define PkC1 D Pk \ ftj ht; yk i  Step 5.

1 g: k

From Vk derive the vertex set VkC1 of PkC1 : Set k Step 1.

k C 1 and go back to

To establish the convergence of this algorithm we need two lemmas. Lemma 9.6 .tk / & 1 as k ! C1: Proof Observe that .t/ D maxfht; yij y 2 g.D/g is a convex function and that yk 2 @.tk / because .t/  .tk /  ht; yk i  htk ; yk i D ht  tk ; yk i 8t: Denote lk .t/ D ht; yk i1: We have lk .tk / D htk ; yk i1 D .tk /1 > 0; and for t 2 Œg.D/ı W lk .t/ D ht; yk i  1  .t/  1  0 because yk 2 g.D/ C RrC : Since g.D/ is compact, there exist t0 2 intŒg.Dı (Proposition 1.21), i.e., t0 such that l.t0 / D .t0 /  1 < 0 and sk 2 Œt0 ; tk / n intŒg.D/ı such that l.sk / D 0: By Theorem 6.1 applied to the set Œg.D/ı ; the sequence ftk g and the cuts lk .t/; we conclude that tk  sk ! 0: Hence every accumulation point t of ftk g satisfies .t / D 1; i.e., .tk / & .t / D 1: t u Denote by Ck the set C at iteration k: Lemma 9.7 Ckı  Pk for every k:

308

9 Parametric Decomposition

Proof The inclusion C1ı  P1 follows from the construction of P1 : Arguing by induction on k; suppose that Ckı  Pk for some k  1: Since k yk 2 CkC1 ; for all ı ı t 2 CkC1 we must have ht; k yk i  1; and since Ck  CkC1 ; i.e., CkC1  Ckı  Pk it k follows that t 2 Pk \ ftj ht; k y i  1g D PkC1 : t u Proposition 9.5 Let yk be the incumbent at iteration k: Either the above algorithm terminates by an optimal solution of (GQCM) or it generates an infinite sequence fyk g every accumulation point of which is an optimal solution. Proof Let y be any accumulation point of the sequence fyk g: Since by Lemma 9.6 ı maxf.t/j t 2 Pk g & 1 it follows from Lemma 9.7 that maxf.t/j t 2 \1 kD1 Ck g  1 ı 1 ı 1: This implies that \kD1 Ck  E ; and hence E  [kD1 Ck : Thus, for any y 2 E there is k such that y 2 Ck ; i.e., F.y/  F.yk /  F.y/; proving the optimality of y: t u Incidentally, the above PA Algorithm shows that minfF.y/j y 2 Eg D minfF.g.x.t///j t 2 Rr g;

(9.68)

where x.t/ is an arbitrary optimal solution of (9.66), i.e., of the convex program minfht; g.x/ij x 2 Dg: This result could also be derived from Theorem 9.1. Remark 9.6 All the convex programs (9.66) have the same constraints. This fact should be exploited for an efficient implementation of the above algorithm. Also, in practice, these programs are usually solved approximately, so .t/ is only an "-optimal value, i.e., .t/  maxfht; yij y 2 Eg  " for some tolerance " > 0:

9.6.3 Reduction to Quasiconcave Minimization An important class of problems (GQCM) is constituted by problems of the form: minimize

p qi X Y

gij .x/

s:t

x 2 D;

(9.69)

iD1 jD1

where D is a compact convex set and gij W D ! RCC ; i D 1; : : : ; p; j D 1; : : : ; qi ; are continuous convex positive-valued functions on D: This class includes generalized convex multiplicative programming problems (Konno and Kuno 1995; Konno et al. 1994) which can be formulated as:

9.6 Convex Constraints

309

8 9 q < = Y min g.x/ C gj .x/j x 2 D : : ;

(9.70)

jD1

or, alternatively as ( min g.x/ C

p X

) gi .x/hi .x/j x 2 D ;

(9.71)

iD1

where all functions g.x/; gi .x/; hi .x/ are convex positive-valued on D: Another special case worth mentioning is the problem of minimizing the scalar product of two vectors: ( n ) X n min (9.72) xi yi j .x; y/ 2 D  RCC : iD1

Let us show that any problem (9.69) can be reduced to concave minimization under convex constraints. For this we rewrite (9.69) as Pp Qqi minimize iD1 . jD1 yij /1=qi s:t: Œgij .x/qi  yij ; i D 1; : : : ; pI j D 1; : : : ; qi yij  0; x 2 D:

(9.73)

D yij ; j D 1; : : : ; qi are linear, it follows from Proposition 2.7 that each Since ' Qj .y/ qi term . jD1 yij /1=qi ; i D 1; : : : ; p; is a concave function of y D .yij / 2 Rq1 C:::;qp ; hence their sum, i.e., the objective function F.y/ in (9.73), is also a concave function of y: Furthermore, y  y0 obviously implies F.y/  F.y0 /; so F.y/ is monotonic with respect to the cone fyjy  0g: Finally, since every function gij .x/ is convex, so is Œgij .x/qi : Consequently, (9.73) is a concave programming problem. If q D q1 C : : : C qp is relatively small this problem can be solved by methods discussed in Chaps. 5 and 6. An alternative way to convert (9.69) into a concave program is to use the following relation already established in the proof of Proposition 2.7 [see (2.6)]: qi Y jD1

X 1 q min

ij giji .x/; i qi 2Ti jD1 qi

gij .x/ D

Qqi i with i D . i1 ; : : : ; iqi /; Ti D f i W jD1 ij  1g: Hence, setting 'i . ; x/ D qi 1 X q

ij giji .x/; the problem (9.69) is equivalent to qi jD1

310

9 Parametric Decomposition

min x2D

p X iD1

min 'i . i ; x/

i 2Ti

D min minf

p X

x2D

'i . i ; x/ W i 2 Ti ; i D 1; : : : ; pg

iD1

D minfmin

p X

x2D

'i . i ; x/ W i 2 Ti ; i D 1; : : : ; pg

iD1 1

D minf'. ; : : : ; p /j i 2 Ti ; i D 1; : : : ; pg

(9.74)

where 8 9 p qi 0;

l.y/  0

8y 2 ˝:

(9.84)

Note that we assume (a) and 0 2 D; so '.0/ > 0;

.0/ < minf .y/j y 2 Q.D/; '.y/  0g:

(9.85)

Now, since '.yk / < 0 < '.0/ and k  .yk /  0 < k  .0/; one can compute zk 2 Œ0; yk  satisfying minf'.zk /; k  .zk /g D 0: Consider the linear program .SP.zk //

maxfhzk ; vi  hb; wijQT v  AT w  c; v  0g:

Proposition 9.7 (i) If SP(zk ) has a (finite) optimal solution .v k ; wk / then zk 2 Q.D/ and (9.84) is satisfied by the affine function l.y/ WD hv k ; y  zk i:

(9.86)

(ii) If SP(zk / has no finite optimal solution, so that the cone QT v  AT w  d; w  0 has an extreme direction .v k ; wk / such that hzk ; v k i  hb; wk i > 0; then zk … Q.D/ and (9.84) is satisfied by the affine function l.y/ WD hy; v k i  hb; wk i:

(9.87)

Proof If SP.zk / has an optimal solution .v k ; wk / then its dual .SP .zk //

minfhc; xij Ax  b; Qx D zk ; x  0g

is feasible, hence zk 2 Q.D/: It is easy to see that v k 2 @ .zk /: Indeed, for any y; since .y/ is the optimal value in SP .y/; it must also be the optimal value in SP.y/I i.e., .y/  hy; v k i  hb; wk i; hence .y/  .zk /  hy; v k i  hb; wk i  hzk ; v k i C hb; wk i D hv k ; y  zk i; proving the claim. For every y 2 ˝ we have .y/  Q  .zk /; so l.y/ WD hv k ; y  zk i  .y/  .zk /  0: On the other hand, yk D zk D zk C .1  /0 for some  1; hence l.yk / D l.zk / C .1  /l.0/ > 0 because l.0/  .0/  .zk / < 0; while l.zk / D 0: To prove (ii), observe that if SP.zk / is unbounded, then SP .zk / is infeasible, i.e., zk … Q.D/: Since the recession cone of the feasible set of SP.zk / is the cone QT v  AT w  0; v  0; SP.zk / may be unbounded only if an extreme direction

316

9 Parametric Decomposition

.v k ; wk / of this cone satisfies hzk ; v k i  hb; wk i > 0: Then the affine function (9.87) satisfies l.zk / > 0 while for any y 2 Q.D/; SP .y/ (hence SP.y// must have a finite optimal value, and consequently, hy; v k i  hb; wk i  0; i.e., l.y/  0: t u The convergence of an OA method for .MRP/ based on the above proposition follows from general theorems established in Sect. 6.3. Remark 9.7 In the above approach we only used the constancy of h.x/ on every manifold fxj Qx D yg: In fact, since '.y0 /  '.y/ for all y0  y; problem .MRP/ can also be written as minf .y/j x 2 D; Qx  y; '.y/  0g:

(9.88)

Therefore, SP.zk / could be replaced by maxfhzk ; vi  hb; wij QT v  AT w  0; v  0; w  0g:

9.7.3 Branch and Bound We describe a conical branch and bound procedure. A simplicial and a rectangular algorithm for the case when '.y/ is separable can be developed similarly. From assumptions (a), (b) we have that ˇ ˇ 0 2 Q.D/; '.0/ > 0; ˇ ˇ .0/ < minf .y/j y 2 Q.D/; '.y/  0g:

(9.89)

A conical algorithm for solving (9.83) starts with an initial cone M0 contained in Q.X/ and containing the feasible set Q.D/ \ fyj '.y/  0g. Given any cone M  Q.Rn / of base Œu1 ; : : : ; uk  let i ui be the point where the i-th edge of M intersects the surface '.y/ D 0: If P U denotes the matrix of columns u1 ; : : : ; uk then M D ft 2 Rk j Ut  0; t  0g and kiD1 ti = i D 1 is the equation of the hyperplane passing through the intersections of the edges of M with the surface '.y/ D 0: Denote by ˇ.M/ and t.M/ the optimal value and a basic optimal solution of the linear program (

LP.M/

) k X t i min hc; xij x 2 D; Qx D ti ui ;  1; t  0 : iD1 iD1 i k X

Proposition 9.8 ˇ.M/ gives a lower bound for .y/ over the set of feasible points in M: If !.M/ D Ut.M/ lies on an edge of M then ˇ.M/ D minf .y/j y 2 Q.D/ \ M; '.y/  0g:

(9.90)

9.7 Reverse Convex Constraints

317

Proof The first part of the proposition P follows from the inclusion fy 2 Q.D/ \ Mj '.y/  0g  fy 2 Q.D/ \ Mj kiD1 ti = i  1g and the fact that minf .y/j y 2 Q.D/ \ M; '.y/  0g D minfhc; xij x 2 D; Qx D y; y 2 M; '.y/  0g: On the other hand, if !.M/ lies on an edge of M it must lie on the portion of this edge outside the convex set '.y/  0; hence '.!.M//  0: Consequently, !.M/ is feasible to the minimization problem in (9.90) and hence is an optimal solution of it. t u It can be proved that a conical branch and bound procedure using !-subdivision or !-bisection (see Sect. 7.1) with the above bounding is guaranteed to converge. Example 9.10 min 2x1 C x2 C 0:5x3 s.t. 2x1 C x3  x4  2:5 x1  3x2 C x4  2 x1 C x 2  2 x1 ; x2 ; x3 ; x4  0 h.x/ WD .3x1 C 6x2 C 8x3 /2  .4x1 C 5x2 C x4 /2 C 154  0: Here the function h.x/ is concave and monotonic with respect to the cone K D fyj c1 y D 0; c2 y D 0g; where c1 D .3; 6; 8; 0/; c2 D .4; 5; 0; 1/ (cf Example 7.2). So Q is the matrix of two rows c1 ; c2 and Q.D/ D fy D Qx; x 2 Dg is contained in the cone M0 D fyjy  0g (D denotes the feasible set). Since the optimal solution 0 2 R4 of the underlying linear program satisfies h.0/ > 0; conditions (9.89) hold, and M0 D conefe1 ; e2 g can be chosen as an initial cone. The conical algorithm terminates after 4 iterations, yielding an optimal solution of .0; 0; 1:54; 2:0/ and the optimal value 0:76: Note that branching is performed in R2 (t-space) rather than in the original space R4 :

9.7.4 Decomposition by Polyhedral Annexation From L D fxj Qx D 0g  K  C we have Cı  K ı  L? ;

(9.91)

where L? (the orthogonal complement of L/ is the space generated by the rows c1 ; : : : ; cr of Q (assuming rankQ D r/: For every 2 R [ Cf1g let D D fx 2 Dj hc; xi  g: By Theorem 7.6 the value is optimal if and only if D n intC ¤ ;;

D  C:

(9.92)

Following the PA Method for .LRC/ (Sect. 8.1), to solve .MRP/ we construct a nested sequence of polyhedrons P1  P2  : : : together with a nonincreasing

318

9 Parametric Decomposition

sequence of real numbers 1  2  : : : such that D k n intC ¤ ;; Cı  Pk  L? 8k; and eventually Pl  ŒD l ı for some l W then D l  Pıl  C; hence l is optimal by the criterion (9.92). To exploit monotonicity [which implies (9.91)], the initial polytope P1 is taken to be the r-simplex constructed as in Lemma 9.1 (note that 0 2 intC by assumption (a), so is possible). Let W L? ! Rr be the linear map Prthis construction i y D iD1 ti c 7! .y/ D t: For any set E  Rr denote EQ D .E/  Rr : Then by Lemma 9.1 PQ 1 is the simplex in Rr defined by the inequalities r X

ti hci ; cj i 

iD1

1 ˛j

j D 0; 1; : : : ; r

P where c0 D  riD1 ci ; ˛j D supf˛j ˛cj 2 Cg: In the algorithm below, when a better feasible solution xk than the incumbent has been found, we can compute a feasible point xOk at least as good as xk and lying on an edge of D (see Sect. 5.7). This can be done by carrying out a few steps of the simplex algorithm. We shall refer to xOk as a point obtained by local improvement from xk : PA Algorithm for .MRP/ Initialization. Choose a vertex x0 of D such that h.x0 / > 0 and set D Dx0 ; C 1 1 1 0 C  x : Let x D best feasible solution available, 1 D hc; x i .x D ;; k D C1 if no feasible solution is available). Let P1 be the simplex constructed as in Lemma 9.1, V1 the vertex set of PQ 1 : Set k D 1: Step 1.

For every t D .t1 ; : : : ; tr /T 2 Vk solve the linear program ( max

r X

) ti hci ; xij x 2 D; hc; xi  k

iD1

Step 2.

Step 3.

to obtain its optimal value .t/: (Only the new t 2 Vk should be considered if this step is entered from Step 3). If .t/  1 8t 2 Vk , then terminate: if k < 1; xk is an optimal solution; otherwise (MRP) is infeasible. If .tk / > 1 for some tk 2 Vk , then let xk be a basic optimal solution of the corresponding linear program. If xk … C, then let xOk be the solution obtained by local improvement from xk ; reset xk D xOk ; k D hc; xOk i; and return to Step 1. If xk 2 C then set xkC1 D xk ; kC1 D k ; compute k D supf j xk 2 Cg and define ) ( r X 1 k i : ti hx ; c i  PQ kC1 D PQ k \ tj k iD1 From Vk derive the vertex set VkC1 of PQ kC1 : Set k Step 1.

k C 1 and return to

9.8 Network Constraints

319

Proposition 9.9 The above PA Algorithm for .MRP/ is finite. Proof Immediate, since this is a mere specialization of the PA Algorithm for .LRCP/ in Sect. 6.4 to .MRP/. u t Remark 9.8 If K D fxjQx  0g then P1 can be constructed as in Lemma 9.2. The above algorithm assumes regularity of the problem (assumption (b)). Without this assumption, we can replace C by C" D fx 2 Xj h.x/  "g: Applying the above algorithm with C C" will yield an "-approximate optimal solution (cf Sect. 7.3), i.e., a point x" satisfying x" 2 D; h.x" /  "; and hc; x" i  minfhc; xij x 2 D; h.x/  0g.

9.8 Network Constraints Consider a special concave programming problem of the form: min g.y1 ; : : : ; yr / C hc; xi

.SCP/

s.t. y 2 Y Qx D y; Bx D d; x  0:

(9.93) (9.94)

where g W RrC ! RC is a concave function, Y a polytope in RrC , c; x 2 Rn ; d 2 Rm ; Q an r  n matrix of rank r; and B an m  n matrix. In an economic interpretation, y D .y1 ; : : : ; yr /T may denote a production program to be chosen from a set Y of technologically feasible production programs, x a distribution– transportation program to be determined so as to meet the requirements (9.94). The problem is then to find a production–distribution–transportation program, satisfying conditions (9.94), with a minimum cost. A variant of this problem is the classical concave production–transportation problem

.PTP/

ˇ r X m X ˇ ˇ min g.y/ C cij xij ˇ ˇ iD1 jD1 ˇ m ˇ X ˇ s.t. xij D yi i D 1; : : : ; r ˇ ˇ jD1 ˇ r ˇ X ˇ xij D dj j D 1; : : : ; m ˇ ˇ iD1 ˇ ˇ x  0; i D 1; : : : ; r; j D 1; : : : ; m ij

where yi is the production level to be determined for factory i and xij the amount to be sent from factory i to warehouse j in order to meet the demands d1 ; : : : ; dm of the warehouses. Another variant of SCP is the minimum concave cost network flow problem which can be formulated as

320

9 Parametric Decomposition

.MCCNFP/

ˇ P P ˇ min r gi .xi / C n iD1 iDrC1 ci xi ˇ ˇ s:t: Ax D d; x  0:

where A is the node-arc incidence matrix of a given directed graph with n arcs, d the vector of node demands (with supplies interpreted as negative demands), and xi the flow value on arc i: Node j withP dj > 0 are the sinks, nodes j with dj < 0 are the sources, and it is assumed that j dj D 0 (balance condition). The cost of shipping t units along arc i is a nonnegative concave nondecreasing function gi .t/ for i D 1; : : : ; r; and a linear function ci t (with ci  0/ for i D r C 1; : : : ; n: To cast .MCCNFP/ into the form .SCP/ it suffices to rewrite it as ˇ P P ˇ min r gi .yi / C n iD1 iDrC1 ci xi ˇ ˇ s:t: xi D yi .i D 1; : : : ; r/; Ax D d; x  0:

(9.95)

Obviously, Problem .SCP/ is equivalent to min g.y/ C t s.t. hc; xi  t; Qx D y; Bx D d; x  0; y 2 Y:

(9.96)

Since this is a concave minimization problem with few nonconvex variables, it can be handled by the decomposition methods discussed in the previous Sects. 7.2 and 7.3. Note that in both .PTP/ and .MCCNFP/ the constraints (9.94) have a nice structure such that when y is fixed the problem reduces to a special linear transportation problem (for .PTP/) or a linear cost flow problem (for .MCCNFP/), which can be solved by very efficient specialized algorithms. Therefore, to take full advantage of the structure of (9.94) it is important to preserve this structure under the decomposition.

9.8.1 Outer Approximation This method is initialized from a polytope P1 D Y  Œ˛; ˇ,where ˛; ˇ are, respectively, a lower bound and an upper bound of hc; xi over the polytope Bx D d; x  0: At iteration k; we have a polytope Pk  P1 containing all .y; t/ for which there exists x  0 satisfying (9.96). Let .yk ; tk / 2 argminft C g.y/j .y; t/ 2 Pk g: Following Remark 7.1 to check whether .yk ; tk / is feasible to (9.96) we solve the pair of dual programs minfhc; xij Qx D yk ; Bx D d; x  0g

(9.97)

9.8 Network Constraints

321

maxfhyk ; ui C hd; vij QT u C BT v  cg

(9.98)

obtaining an optimal solution xk of (9.97) and an optimal solution .uk ; v k / of the dual (9.98). If hc; xk i  tk then .xk ; yk / solves .SCP/: Otherwise, we define PkC1 D Pk \ f.y; t/j hy; uk i C hd; v k i  tg

(9.99)

and repeat the procedure with k k C 1: In the particular case of .PTP/ the subproblem (9.97) is a linear transportation problem obtained from .PTP/ by fixing y D yk ; while in the case of .MCCNFP/ the subproblem (9.97) is a linear min-cost network flow problem obtained from .MCCNFP/ by fixing xi D yki for i D 1; : : : ; r (or equivalently, by deleting the arcs i D 1; : : : ; r and accordingly modifying the demands of the nodes which are their endpoints).

9.8.2 Branch and Bound Method Assume that in problem .SCP/ the function g.y/ is concave separable, i.e., g.y/ D P r iD1 gi .yi /: For each rectangle M D Œa; b let lM .y/ D

R X

li;M .yi /

iD1

where li;M .t/ is the affine function that agrees with gi .t/ at the endpoints ai ; bi of the segment Œai ; bi ; i.e., li;M .t/ D gi .ai / C

gi .bi /  gi .ai / .t  ai /: bi  ai

(9.100)

(see Proposition 7.4). Then a lower bound ˇ.M/ for the objective function in .SCP/ over M is provided by the optimal value in the linear program ( .LP.M//

min

r X

) li;M .yi / C hc; xij Qx D y; Ax D d; x  0 :

iD1

Using this lower bound and a rectangular !-subdivision rule, a convergent rectangular algorithm has been developed in Horst and Tuy (1996) which is a refined version of an earlier algorithm by Soland (1974) and should be able to handle even large scale .PTP/ provided r is relatively small. Also note that if the data are all integral, then it is well known that a basic optimal solution of the linear transportation problem LP.M/ always exists with integral values. Consequently, in this case, starting from an initial rectangle with integral vertices the algorithm will generate only subrectangles with integral vertices, and so it will necessarily be finite.

322

9 Parametric Decomposition

In practice, aside from production and transportation costs there may also be shortage/holding costs due to a failure to meet the demands exactly. Then we have the following problem (cf Example 5.1) which includes a formulation of the stochastic transportation–location problem as a special case:

.GPTP/

ˇ ˇ min hc; xi C Pr g .y / C Pm h .z / ˇ iD1 i i jD1 j j ˇ m X ˇ ˇ s:t: xij D yi .i D 1; : : : ; r/ ˇ ˇ jD1 ˇ r X ˇ ˇ xij D zj .j D 1; : : : ; m/ ˇ ˇ iD1 ˇ ˇ 0  yi  si 8i; ˇ ˇ xij  0 8i; j:

where gi .yi / is the cost of producing yi units at factory i and hj .zj / is the shortage/holding cost to be incurred if the warehouse j receives zj ¤ dj (demand of warehouse j/: It is assumed that gi is a concave nondecreasing function satisfying gi .0/ D 0  gi .0C / (so possibly gi .0C / > 0Pas, e.g., for a fixed charge), while r hj W Œ0; s ! RC is a convex function (s D iD1 si /; such that hj .dj / D 0; and the right derivative of hj at 0 has a finite absolute value j : The objective function is then a d.c. function, which is a new source of difficulty. However, since the problem becomes convex when y is fixed, it can be handled by a branch and bound algorithm in which branching is performed upon y (Holmberg and Tuy 1995). For any rectangle Œa; b  Y WD fy 2 RrC j 0  yP i  si i D 1; : : : ; rg let as previously li;M .t/ be defined by (9.100) and lM .y/ D riD1 li;M .yi /: Then a lower bound of the objective function over the feasible points in M is given by the optimal value ˇ.M/ in the convex program

CP.M/

ˇ ˇ min hc; xi C l .y/ C Pm h .z / M ˇ jD1 j j ˇ m X ˇ ˇ s:t: xij D yi .i D 1; : : : ; r/ ˇ ˇ jD1 ˇ r X ˇ ˇ xij D zj .j D 1; : : : ; m/ ˇ ˇ iD1 ˇ ˇ ai  yi  bi 8i; ˇ ˇ xij  0 8i; j:

Note that if an optimal solution .x.M/; y.M/; z.M// of this convex program satisfies lM .y.M// D g.y.M// then ˇ.M/ equals the minimum of the objective function over the feasible solutions in M: When y is fixed, .GPTP/ is a convex program in x; z: If '.y/ denotes the optimal value of this program then the problem amounts to minimizing '.y/ over all feasible .x; y; z/.

9.8 Network Constraints

323

Fig. 9.1 Consequence of concavity of g1 .t/

g1 (t)

bkq 1

τ1 t

O

BB Algorithm for .GPTP/ Initialization. Start with a rectangle M1  Y. Let y1 be the best feasible solution available, INV = '.y1 /: Set P1 D S1 D fM1 g; k D 1: Step 1. Step 2. Step 3. Step 4. Step 5.

For each M 2 R1 solve the associated convex program CP(M) obtaining its optimal value ˇ.M/ and optimal solution .x.M/; y.M/; z.M//: Update INV and yk : Delete all M 2 Sk such that ˇ.M/  INV  ": Let Rk be the collection of remaining rectangles. If Rk D ;; then terminate: yk is a global "-optimal solution of .GPTP/: Select Mk 2 argminfˇ.M/j M 2 Rk g: Let yk D y.Mk /; ik 2 arg maxfgi .yk /  li;MK .yk /g: i

Step 6.

(9.101)

If gik .ykik /  lik ;Mk .yk / D 0; then terminate: yk is a global optimal solution. Divide Mk via .yk ; ik / (see Sect. 5.6). Let PkC1 be the partition of Mk and SkC1 D Rk n .fMk g/ [ PkC1 : Set k k C 1 and return to Step 1.

To establish the convergence of this algorithm, recall that yk D y.Mk /: Similarly denote xk D x.Mk /; zk D z.Mk /; so .xk ; yk ; zk / is an optimal solution of CP.Mk /: Suppose the algorithm is infinite and let y be any accumulation point of fyk g, say y D limq!1 ykq : Without loss of generality we may assume xkq ! x; zkq ! z and also ikq D 1 8q; so that 1 2 arg maxfgi .ykq /  li;Mkq .ykq /g: i

(9.102)

k

Clearly if for some q we have y1q D 0, then gi .ykq / D l1;Mkq .ykq /; and the algorithm k

terminates in Step 5. Therefore, since the algorithm is infinite, we must have y1q > 0 8q: Lemma 9.9 If g1 .0C / > 0 then k

y1 D lim y1q > 0: q!1

In other words, y1 is always a continuity point of g1 .t/:

(9.103)

324

9 Parametric Decomposition q

k

q

q

q

q

q

Proof Let Mkq ;1 D Œa1 ; b1 ; so that y1q 2 Œa1 ; b1  and Œa1 ; b1 ; q D 1; 2; : : : is a q q q q sequence of nested intervals. If 0 … Œa10 ; b10  for some q0 , then 0 … Œa1 ; b1  8q  q0 ; k q q and y1q  a10 > 0 8q  q0 ; hence y1  a10 > 0; i.e., (9.103) holds. Thus, it suffices to consider the case k

q

a1 D 0; y1q > 0

8q:

(9.104)

Recall that j denotes the absolute value of the right derivative of hj .t/ at 0. Let D max j  min c1j :

(9.105)

j

j

P k k k Since njD1 x1jq D y1q > 0; there exists j0 such that x1jq0 > 0: Let .Qxkq ; ykq ; zkq / be a feasible solution to CP.M/ obtained from .xkq ; ykq ; zkq / by subtracting a very k k k k small positive amount < x1jq0 from each of the components x1jq0 ; y1q ; zj0q ; and letting unchanged all the other components. Then the production cost at factory 1 decreases k k k k by l1;Mkq .y1q /  l1;Mkq .Qy1q / D g1 .b1q / =b1q > 0; the transportation cost in the arc .1; j0 / decreases by c1j0 ; while the penalty incurred at warehouse j0 either decreases k k k (if zj0q > dj /; or increases by hj0 .zj0q  /  hj0 .zj0q /  j0 : Thus the total cost decreases by at least " ıD

#

k

g1 .b1q / k

b1q

C c1j0  j0 :

(9.106)

If  0 then c1j0  j0  0; hence ı > 0 and .Qxkq ; yQ kq ; zQkq / would be a better feasible solution than .xkq ; ykq ; zkq /; contradicting the optimality of the latter for CP.Mkq /: Therefore, > 0: Now suppose g1 .0C / > 0: Since g1 .t/=t ! C1 as t ! 0C (in view of the fact g1 .0C / > 0/; there exists 1 > 0 satisfying g1 .1 / > : 1 k

k

k0

k

Observe that since Mkq ;1 D Œ0; b1q  is divided via y1q ; we must have Œ0; b1q   Œ0; y1q  k

k

for all q0 > q; while Œ0; y1q   Œ0; b1q  for all q: With this in mind we will show that k

b1q  1

8q:

(9.107) k

By the above observation this will imply that y1q  1 8q; thereby completing k the proof. Suppose (9.107) does not hold, i.e., b1q < 1 for some q: Then kq kq y1  b1 < 1 ; and since g1 .t/ is concave it is easily seen by Fig. 9.1 that k k g1 .b1q /  b1q g1 .1 /=1 ; i.e.,

9.8 Network Constraints

325 k

g1 .b1q / k b1q



g1 .1 / : 1

This, together with (9.106) implies that  ı

 g1 .1 / k  x1jq0 > 0 1

hence .Qxkq ; yQ kq ; zQkq / is a better feasible solution than .xkq ; ykq ; zkq /: This contradiction proves (9.107). t u Theorem 9.3 The above algorithm for .GPTP/ can be infinite only if " D 0 and in this case it generates an infinite sequence yk D y.Mk /; k D 1; 2; : : : every accumulation point of which is an optimal solution. Proof For " D 0 let y D limq!1 ykq be an accumulation point of fykq g with, as q q q q previously, ikq D 1 8q; and Mkq ;1 D Œa1 ; b1 : Since fa1 g is nondecreasing, fb1 g q q is nonincreasing, we have a1 ! a1 ; b1 ! b1 : For each q either of the following situations occurs: q

qC1

a1 < aqC1 < b1 q a1


and for every y 2 D let C˛ .y/ WD fx 2 Cj L.x; y/  ˛g: We first prove that for any segment  D Œa; b  D the following holds: \y2 C˛ .y/ ¤ ;:

(10.4)

It can always be assumed that y 2 .a; b/: Since infx2C L.x; y /  < ˛ there exists x 2 C satisfying L.x; y / < ˛; and by continuity there exist u; v such that u < y < v and L.x; y/ < ˛ 8y 2 Œu; v: Therefore, setting s D supfy 2 Œu; bj \uzy C˛ .z/ ¤ ;g:

(10.5)

we have s  v: By definition of s there exists a sequence yk % s; such that Ck WD \uyyk C˛ .y/ ¤ ;: In view of assumption (iii) C˛ .y / is compact, so Ck ; k D 1; 2; : : : ; form a nested sequence of nonempty closed subsets of the compact set C˛ .y /: Therefore, there exists x0 2 \1 kD1 Ck ; i.e., satisfying L.x0 ; y/  ˛ 8y 2 Œu; s/: Since L.x0 ; yk /  0 8k; letting k ! C1 yields L.x0 ; s/  ˛: So L.x0 ; y/  ˛ 8y 2 Œu; s: We contend that s D b: Indeed, if s < b we cannot have L.x0 ; s/ < ˛ for then by continuity there would exist q > s satisfying L.x0 ; y/ < ˛ 8y 2 Œs; q/; conflicting with (10.5). Therefore, if s < b then necessarily L.x0 ; s/ D ˛: Furthermore, for any ball W centered at x0 we cannot have ˛ D minx2C\W L.x; s/ for then assumption (ii) would imply that ˛ D L.x0 ; s/ D minx2C L.x; s/; conflicting with ˛ > infx2C L.x; s/: So there exists xk 2 C such that xk ! x0 and L.xk ; s/ < ˛: If for some y 2 .s; b and some k we had L.xk ; y/ > ˛; then, since L.xk ; s/ < ˛ assumption (i) would imply that L.xk ; z/ < ˛ 8z 2 Œu; s and, moreover, by continuity L.xk ; q/ < ˛ 8y 2 Œs; q for some q > s; conflicting with (10.5). Therefore, L.xk ; y/  ˛ 8k; hence letting xk ! x0 yields L.x0 ; y/  ˛; 8y 2 .s; b: This proves that s D b and so \uyb C˛ .y/ ¤ ;:

(10.6)

10.2 The S-Lemma

341

In an analogous manner, setting t D inffy 2 Œu; bj \yzb C˛ .z/ ¤ ;g: we can show that t D a; i.e., \ayb C˛ .y/ ¤ ;; proving (10.4). It is now easy to complete the proof. By assumption (iii) the set C˛ .y / is compact. Since any finite set E  D is contained in some segment  it follows from (10.4) that the family C˛ .y/ \ C˛ .y /; y 2 D; has the finite intersection property. So there exists x 2 C satisfying L.x; y/  ˛ 8y 2 D: Taking ˛ D C 1=k we see that for each k D 1; 2; : : : ; there exists xk 2 C satisfying L.xk ; y/ 

C 1=k 8y 2 D: Since, in particular, L.xk ; y /  C 1=k < C 1; i.e., xk 2 C C1 .y / 8k; while the set C C1 .y / is compact, the sequence fxk g has a cluster point x 2 C satisfying L.x; y/  8y 2 D: Noting that  this implies (10.3). t u In the above theorem D is an interval of R: The following proposition deals with the case D  Rm with m  1: Corollary 10.1 Let C be a closed subset of Rn ; D a convex subset of Rm , and L.x; y/ W Rn  D ! R be a continuous function. Assume the following conditions: (i) for every x 2 Rn the function L.x; :/ is affine; (ii) for every y 2 D any local minimizer of L.x; y/ on C is a global minimizer of L.x; y/ on CI (iii*) there exists y 2 D satisfying L.x; y / ! C1 as x 2 C; kxk ! C1 and such that, moreover inf L.x; y / D max inf L.x; y/

x2C

y2D x2C

& L.x ; y / D min L.x; y / x2C

(10.7)

for a unique x 2 C: Then we have the minimax equality min sup L.x; y/ D max inf L.x; y/: x2C y2D

y2D x2C

(10.8)

Proof Consider an arbitrary ˛ > WD supy2D infx2C L.x; y/ and for every y 2 D let C˛ .y/ WD fx 2 Cj L.x; y/  ˛g: By assumption (iii*) infx2C L.x; y / D ; hence C .y / D fx 2 Cj L.x; y / D g D fx g: For every y 2 D let Dy WD Œy ; y D f.1  t/y C tyj 0  t  1g  R and Ly .x; t/ WD L.x; .1  t/y C ty/; which is a function on C  Dy : It can easily be verified that all conditions of Theorem 1 are satisfied for C; Dy and Ly .x; t/: Since ˛ >  supt2Dy infx2C L.x; t/; it follows that ; ¤ C˛ .y / \ C˛ .y/  C˛ .y /  C:

342

10 Nonconvex Quadratic Programming

Furthermore, since by assumption L.x; y / ! C1 as x 2 C; kxk ! C1; the set C˛ .y / is compact and so is C˛ .y / \ C˛ .y/ because C˛ .y/ is closed. Letting ˛ ! C then yields ; ¤ C .y / \ C .y/  C .y / D fx g: Thus, L.x ; y/  8y 2 D and hence, inf sup L.x; y/ 

x2C y2D

t u

completing the proof.

Remark 10.1 (a) When C is an affine manifold and L.x; y/ is a quadratic function in x it is superfluous to require the uniqueness of x in assumption (iii*). In fact, in that case the uniqueness of x follows from the condition L.x; y / ! C1 as x 2 C; kxk ! C1 (which implies that x 7! L.x; y / is strictly convex on the affine manifold C/: (b) Conditions (iii*) is satisfied, provided there exists a unique pair .x ; y / 2 CD such that L.x ; y / D max min L.x; y/ and L.x; y / ! C1 as x 2 C; kxk ! C1: y2D x2C

Indeed, let y satisfy min L.x; y/ D maxy2D minx2C L.x; y/; and L.x; y/ ! C1 x2C

as x 2 C; kxk ! C1: Since L.:; y/ is convex, there is x satisfying L.x; y/ D minx2C L.x; y/: Then L.x; y/ D maxy2D minx2C L.x; y/; so .x ; y / D .x; y/ by uniqueness condition. Hence minx2C L.x; y / D maxy2D minx2C L.x; y/; i.e., (iii*) holds. Assumption (ii) in Theorem 10.1 or Corollary 10.1 is obviously satisfied if L.:; y/ is a convex function and C is a convex set (then Theorem 10.1 reduces to a classical minimax theorem). However, as will be seen shortly, this assumption is also satisfied in many other cases of interest in quadratic optimization. Corollary 10.2 Let f .x/ and g.x/ be quadratic functions on Rn ; L.x; y/ D f .x/ C yg.x/ for x 2 Rn ; y 2 R. Then for every line  in Rn there holds the minimax equality inf sup L.x; y/ D sup inf L.x; y/:

x2 y2R C

y2RC x2

(10.9)

If g.x/ is not affine on  there also holds the minimax equality inf sup L.x; y/ D sup inf L.x; y/:

x2 y2R

y2R x2

(10.10)

10.2 The S-Lemma

343

Proof If f .x/ and g.x/ are both concave on  and one of them is not affine then clearly infff .x/j g.x/  0; x 2  g D 1; hence infx2 supy2RC L.x; y/ D infff .x/j g.x/  0; x 2  g D 1; and (10.9) is trivial. If f .x/ and g.x/ are both affine on ; (10.9) is a consequence of linear programming duality. Otherwise, either f .x/ or g.x/ is strictly convex on  . Then there exists y 2 RC such that f .x/ C y g.x/ is strictly convex on ; hence such that L.x; y / ! C1 as x 2 ; kxk ! C1: Since by Proposition 10.2 condition (ii) in Theorem 10.1 is satisfied, (10.9) follows by Theorem 10.1 with C D ; D D RC : If g.x/ is not affine on ; there always is y 2 R such that f .x/ C y g.x/ is strictly convex on  (y > 0 when g.x/ is strictly convex on  and y < 0 when g.x/ is strictly concave on  ). Therefore all conditions of Theorem 10.1 are satisfied again for C D  and D D R and (10.10) follows. t u We can now state a generalized version of the S-Lemma which is in fact valid for arbitrary lower semi-continuous functions f .x/ and g.x/; not necessarily quadratic. Theorem 10.2 (Generalized S-Lemma (Tuy and Tuan 2013)) Let f .x/ and g.x/ be quadratic, functions on Rn ; W a closed subset of Rn ; D a closed subset of R and L.x; y/ D f .x/ C yg.x/ for y 2 R: Assume WD infx2W supy2D L.x; y/ > 1 and the following conditions are satisfied: (i) Either : D D RC and there exists x 2 W such that g.x / < 0 Or : D D R and there exist a; b 2 W such that g.a/ < 0 < g.b/: (ii) For any two points a; b 2 W such that g.a/ < 0 < g.b/ we have sup inf L.x; y/  : y2D x2fa;bg

Then there exists y 2 D satisfying inf L.x; y/ D ;

x2W

(10.11)

and so infx2W supy2D L.x; y/ D maxy2D infx2W L.x; y/: Proof We first show that for any real ˛ < and any finite set E  W there exists y 2 D satisfying inf L.x; y/  ˛:

x2E

(10.12)

Since x 2 E  W; g.x/ D 0 implies L.x; y/ D f .x/ C yg.x/ D f .x/  > ˛ 8y 2 D, it suffices to prove (10.12) for any finite set E  fx 2 Wj g.x/ ¤ 0g: Then E D E1 [ E2 where E1 WD fx 2 Ej g.x/ < 0g; E2 D fx 2 Ej g.x/ > 0g: For every .x/ x 2 E define .x/ WD ˛f : Clearly g.x/ • for every x 2 E1 W • for every x 2 E2 W

.x/ > 0 and Œ L.x; y/  ˛ , y  .x/ ]; L.x; y/  ˛ , y  .x/:

344

10 Nonconvex Quadratic Programming

Consequently, if E1 ¤ ; then minx2E1 .x/ > 0 and we have L.x; y/  ˛ 8x 2 E1 when 0  y  minx2E1 .x/I if E2 ¤ ;, then we have L.x; y/  ˛ 8x 2 E2 when y  maxx2E2 .x/: Further, if Ei ¤ ;; i D 1; 2; then by assumption (ii) with a 2 argminx2E1 .x/; b 2 argmaxx2E2 .x/ there exists y 2 D satisfying infx2fa;bg L.x; y/  ˛; i.e., such that max .x/ D .b/  y  .a/ D min .x/; x2E1

x2E2

hence L.x; y/  ˛ 8x 2 E2 ;

L.x; y/  ˛ 8x 2 E1 :

Thus in any case for any finite set E  W there exists y 2 D satisfying (10.12). Setting D.x/ WD fy 2 Dj L.x; y/  ˛g we then have \x2E D.x/ ¤ ; for any finite set E  W: In other words, the collection of closed sets D.x/; x 2 W; has the finite intersection property. If assumption (i) holds then • either D D RC and f .x / C yg.x / ! 1 as y ! C1; so the set D.x / D fy 2 RC j L.x ; y/  ˛g is compact • or D D R and f .a/ C yg.a/ ! 1 as y ! C1; f .b/ C yg.b/ ! 1 as y ! 1: So the set D.a/ D fy 2 Rj L.a; y/  ˛g is bounded above, while D.b/ is bounded below; hence the set D.a/ \ D.b/ is compact. Consequently, \x2W D.x/ ¤ ;: For every ˛k D  1=k; k D 1; 2; : : : there exists yk 2 \x2W fy 2 Dj L.x; y/  ˛k g; i.e., yk 2 D such that L.x; yk /  ˛k 8x 2 W: Obviously every yk is contained in the compact set \x2W fy 2 Dj L.x; y/  ˛1 g; so there exists y 2 D such that, up to a subsequence yk ! y .k ! C1/: Then L.x; y/  8x 2 W; hence, infx2W L.x; y/ D : This proves (10.11). t u Theorem 10.2 yields the most general version of S-Lemma. In fact, if W is an affine manifold in Rn and D D RC condition (ii) is automatically fulfilled because for every line  connecting a and b in W we have, by Corollary 10.2, sup

inf L.x; y/  sup inf L.x; y/ D inf sup L.x; y/

y2RC x2fa;bg

y2RC x2

x2 y2R C

 inf sup L.x; y/ D : x2W y2R C

Also using Corollary 10.2, condition (ii) is automatically fulfilled if W is an affine manifold, D D R and g.x/ is a quadratic function g.x/ D hx; Q1 xiChc1 ; xiCd1 with either Q1 nonsingular or with c1 D 0 and d1 D 0 (g.x/ is homogeneous). Indeed, either of the latter conditions implies that g.x/ is not affine on any line  in W and hence, by Corollary 10.2, sup inf L.x; y/  sup inf L.x; y/ D inf sup L.x; y/  inf sup L.x; y/ D : y2R x2fa;bg

y2R x2

x2 y2R

x2W y2R

10.2 The S-Lemma

345

Corollary 10.3 Let f .x/ and g.x/ be two quadratic functions on Rn : (i) Let H be an affine manifold in Rn and assume there exists x 2 H such that g.x / < 0: Then the system x 2 H; f .x/ < 0; g.x/  0 is inconsistent if and only if there exists y 2 RC such that f .x/ C yg.x/  0 8x 2 H: (ii) Let H be an affine manifold in Rn and assume that g.x/ is either homogeneous or g.x/ D hx; Q1 xi C hc1 ; xi C d1 with Q1 nonsingular and that, moreover, g.x/ takes both positive and negative values on H: Then the system x 2 H; f .x/ < 0; g.x/ D 0 is inconsistent if and only if there exists y 2 R such that f .x/ C yg.x/  0 8x 2 H: (iii) Let H be a subspace of Rn and assume both f .x/; g.x/ are homogeneous. Then the system x 2 H n f0g; f .x/  0; g.x/  0 is inconsistent if and only if there exists .y1 ; y2 / 2 R2C such that y1 f .x/ C y2 g.x/ > 0 8x 2 H n f0g: Proof Since the ‘if’ part in each of the three assertions is obvious, it suffices to prove the ‘only if’ part. Let L.x; y/ D f .x/ C yg.x/: (i) This is a special case of Theorem 10.2 with D D RC ; W D H: In fact, WD infff .x/j g.x/  0; x 2 Hg D inf sup L.x; y/; x2H y2R

C

so inconsistency of the system fx 2 H; f .x/ < 0; g.x/  0g means  0. Therefore, by Theorem 10.2 there exists y 2 RC such that f .x/ C yg.x/   0; 8x 2 H: (ii) This is a special case of Theorem 10.2 with D D R; W D H: In fact, inconsistency of the system fx 2 H; f .x/  0; g.x/ D 0g means WD infff .x/j g.x/ D 0; x 2 Hg D infx2H supy2R L.x; y/  0: (iii) With H being a subspace and both f .x/; g.x/ homogeneous, the system fx 2 H n f0g; f .x/  0; g.x/  0g is inconsistent only if so is the system fx 2 H n f0g; f .x/=hx; xi  0; g.x/=hx; xi  0g: p Setting z D x= hx; xi we have the implication: Œz 2 H; hz; zi D 1; g.z/  0 ) f .z/ > 0: Since the set fx 2 Hj hx; xi D 1g is compact it follows that 1 WD infff .z/j g.z/  0; hz; zi D 1g > 0 and noting that f .0/ D g.0/ D 0 we can write infff .x/  1 hx; xij g.x/  0; x 2 Hg  0: If there exists x 2 H satisfying g.x/ < 0 then by (i) there exists y2 2 RC satisfying f .x/  1 hx; xi C y2 g.x/  8x 2 H; hence f .x/ C y2 g.x/ > 0 8x 2 H n f0g: Similarly, with the roles of f .x/ and g.x/ interchanged, if there exists x 2 H satisfying f .x/ < 0; then there exists y1 2 RC satisfying y1 f .x/ C g.x/  0 8x 2 H n f0g: Finally, in the case g.x/  0 8x 2 H and f .x/  0 8x 2 H; since there cannot exist x 2 H n f0g such that f .x/ D 0 and g.x/ D 0 we must have f .x/ C g.x/ > 0 8x 2 H n f0g; completing the proof. t u Assertion (i) of Corollary 10.3 is a result proved in Polik and Terlaky (2007) and later termed the ‘generalized S-Lemma’ in Jeyakumar et al. (2009). Assertion (ii) for homogeneous g.x/ is the S-Lemma with equality constraint while Assertion

346

10 Nonconvex Quadratic Programming

(iii) is the S-Lemma with nonstrict inequalities. So Theorem 10.2 is indeed the most general version of the S-Lemma. Condition (i) is a kind of generalized Slater condition and the proof given above is a simple elementary proof for the S-Lemma in its full generality. Remark 10.2 By Proposition 10.2 the function f .x/ C yg.x/ in assertions (i) and (ii) of the above corollary is convex on H:

10.3 Single Constraint Quadratic Optimization Let H be an affine manifold in Rn ; and f .x/; g.x/ two quadratic functions on Rn defined by f .x/ WD

1 hx; Q0 xi C hc0 ; xi; 2

g.x/ D

1 hx; Q1 xi C hc1 ; xi C d1 ; 2

(10.13)

where Qi ; i D 0; 1; are symmetric n  n matrices and ci 2 Rn ; d1 2 R: Consider the quadratic optimization problems: .QPA/

minff .x/j x 2 H; g.x/  0g;

.QPB/

minff .x/j x 2 H; g.x/ D 0g:

Define D D RC ; G D fx 2 Rn j g.x/  0g for .QPA/, and D D R; G D fx 2 Rn j g.x/ D 0g for .QPB/ and let L.x; y/ D f .x/ C yg.x/; where x 2 Rn ; y 2 R: Clearly sup L.x; y/ D y2D

f .x/ if x 2 G C1 otherwise

so the optimal value of problem .QPA/ or .QPB/ is v.QPA/ D inf sup L.x; y/; Q x2H y2R

v.QPB/ D inf sup L.x; y/: x2H y2R

C

Strong duality is said to hold for .QPA/ or .QPB/ if v.QPA/ D sup inf L.x; y/

.case D D RC /

(10.14)

v.QPB/ D sup inf L.x; y/

.case D D R/:

(10.15)

y2RC x2H y2R x2H

The problem on the right-hand side of (10.14) or (10.15) is called the dual problem of .QPA/ or .QPB/, respectively. By Proposition 10.2, if infx2H L.x; y/ > 1 the function L.:; y/ is convex on H: Therefore, with the usual convention

10.3 Single Constraint Quadratic Optimization

347

inf ; D 1; the supremum in the dual problem can be restricted to those y 2 D for which the function L.:; y/ is convex on H: As will be seen shortly (Remark 10.4 below), in view of this fact many properties are obvious which otherwise have been proved sometimes by unnecessarily involved arguments. Actually, it is this hidden convexity that is responsible for the striking analogy between certain properties of quadratic inequalities and corresponding ones of convex inequalities. To investigate conditions guaranteeing strong duality we introduce the following assumptions: (S) There exists x 2 H satisfying g.x / < 0: (T) There exists y 2 D such that L.x; y / D f .x/ C y g.x/ is strictly convex on H: Theorem 10.3 (Duality Theorem) (i) If assumption (S) is satisfied then (10.14) holds and if, furthermore, v.QPA/ > 1 (which is the case if assumption (T), too, is satisfied with D D RC ), then there exists y 2 RC such that L.x; y/ is convex on H and v.QPA/ D min L.x; y/ D max inf L.x; y/: x2H

(10.16)

y2RC x2H

A point x 2 H is then an optimal solution of .QPA/ if and only if it satisfies hrf .x/ C yrg.x/; ui D 0 8u 2 E;

yg.x/ D 0;

g.x/  0;

(10.17)

where E is the subspace parallel to the affine manifold H: (ii) If H is a subspace and v.QPB/ > 1 while • either g.x/ is homogeneous and takes both positive and negative values on H; • or g.x/ and f .x/ are both homogeneous, then (10.15) holds and there exists y 2 R such that L.x; y/ is convex on H and v.QPB/ D min L.x; y/ D max inf L.x; y/: x2H

y2R x2H

(10.18)

A point x 2 H is then an optimal solution of .QPB/ if and only if it satisfies hrf .x/ C yrg.x/; ui D 0 8u 2 H;

g.x/ D 0:

(10.19)

(iii) If assumption (T) is satisfied then (10.15) holds, where the supremum can be restricted to those y 2 D for which L.:; y/ is convex on H: If, furthermore, the dual problem has an optimal solution y 2 D then x 2 H is an optimal solution of the primal problem if and only if hrf .x/ C yrg.x/; ui D 0 8u 2 E; where E is the subspace parallel to H:

(10.20)

348

10 Nonconvex Quadratic Programming

Proof (i) The existence of y follows from Corollary 10.3, (i). Then x is an optimal solution of .QPA/ if and only if max L.x; y/  L.x; y/  min L.x; y/;

y2RC

x2H

hence, if and only if conditions (10.17) hold by noting that L.:; y/ is convex on H: (ii) When g.x/ is homogeneous and takes both positive and negative values on the subspace H the existence of y follows from Corollary 10.3, (ii). When both f .x/; g.x/ are homogeneous, if g.x/  0 8x 2 H then infff .x/j g.x/  0; x 2 Hg D v.QPB/, so by Corollary 10.3, (iii), v.QPB/ > 1 implies the existence of .y1 ; y2 / 2 RC ; such that y1 f .x/ C y2 g.x/ > 0 8x 2 H n f0g: If y1 D 0 then y2 g.x/ > 0 8x 2 H n f0g; hence g.x/  0 8x 2 H; hence g.x/ D 0 8x 2 H; a contradiction. Therefore, y1 > 0 and (10.18) holds with y D y2 =y1 2 RC : Analogously, if g.x/  0 8x 2 H, then infff .x/j  g.x/  0; x 2 Hg D v.QPB/; so (10.18) holds for some y 2 R : Since L.x; y/ is convex on H, conditions (10.19) are necessary and sufficient for x 2 H to be an optimal solution. (iii) If assumption (T) is satisfied the conclusion follows from Theorem 10.1 where C D H and D D R for .QPB/: If, in addition to (T), the dual problem has an optimal solution y 2 D then x 2 H solves the primal problem if and only if it solves the convex optimization problem minx2H L.x; y/; and hence, if and only if it satisfies (10.20). t u Remark 10.3 In the usual case H D Rn it is well known that by setting Q.y/ D Q0 C yQ1 ; c.y/ D c0 C yc1 ; d.y/ D yd1 the dual problem of .QPA/ can be written as the SDP 

 Q.y/ c.y/ 0; y 2 D : (10.21) max tj c.y/T 2.d.y/  t/ In the general case H ¤ Rn ; as noticed above the supremum in (10.14) can be restricted to the set of those y 2 RC for which L.x; y/ is a convex function on H; i.e., infx2H L.x; y/ is a convex problem. Since the problem (10.14) amounts to maximizing the concave function '.y/ D infx2H L.x; y/ on RC ; and the value of '.y/ at each y is obtained by solving a convex problem, it is easy to solve the dual problem by convex programming methods. Corollary 10.4 In .QPA/ or .QPB/ if f .x/ or g.x/ is strictly convex on Rn then strong duality holds. Proof If f .x/ or g.x/ is strictly convex on Rn there exists y 2 RC such that f .x/ C y g.x/ is strictly convex on Rn ; and hence, by Proposition 10.2, coercive on H: So the function L.x; y/ D f .x/ C yg.x/ satisfies all conditions of Theorem 10.1 with C D H; D D RC (for .QPA/) or D D R (for .QPB/). By this theorem, strong duality follows. t u

10.3 Single Constraint Quadratic Optimization

349

As a consequence, quadratic minimization over an ellipsoid, i.e., the problem minff .x/j hQx; xi  r2 ; x 2 Rn g

(10.22)

with Q 0 (positive definite) can be solved efficiently by solving a SDP which is its dual. This is in contrast with linearly constrained quadratic minimization which has been long known to be NP-hard (Sahni 1974). Remark 10.4 As a simple example illustrating the relevance of condition (T), consider the problem minfx41 Cax21 Cbx1 g: Upon the substitution x21 ! x2 ; it becomes a quadratic minimization problem of two variables x1 ; x2 : minfx22 C ax2 C bx1 j x21  x2 D 0g:

(10.23)

Since .x22 C ax2 C bx1 / C .x21  x2 / D x1 .x1 C b/ C x2 .x2 C a  1/ ! C1 as k.x1 ; x2 /k ! C1 condition (T) is satisfied. Hence, by Theorem 10.3, (iii), (10.23) is equivalent to a convex problem—a result proved by an involved argument in Shor and Stetsenko (1989). Incidentally, note that the convex hull of the nonconvex set f.1; x1 ; x21 ; : : : ; xn1 /T j x1 2 Œa; bg  RnC1 is exactly described by SDP constraints (Hoang et al. 2008), so for any n-order univariate polynomial Pn .x1 / the optimization problem minx1 2Œa;b Pn .x1 / can be fully characterized by means of SDPs. Corollary 10.5 In .QPA/ where H D Rn assume condition (S). Then x 2 Rn solves .QPA/ if and only if there exists y 2 RC satisfying Q0 C yQ1 0 0

1

0

1

(10.24)

.Q C yQ /x C c C yc D 0

(10.25)

hQ1 x C 2c1 ; xi C 2d1  0:

(10.26)

When Q0 is not positive definite, the last condition is replaced by hQ1 x C 2c1 ; xi C 2d1 D 0

(10.27)

Proof By Theorem 10.3, (i), x is an optimal solution of .QPA/ if and only if there exists y 2 RC such that L.x; y/ D f .x/ C yg.x/ is convex and x is a minimizer of this convex function. That is, if and only if there exists y 2 RC satisfying (10.24) and (10.17), i.e., (10.24)–(10.26). If the inequality (10.26) holds strictly, i.e., hQ1 x C 2c1 ; xi C 2d1 < 0

350

10 Nonconvex Quadratic Programming

then g.x/ < 0; hence g.x/  0 for all x in some neighborhood of x: Then x is a local minimizer, hence a global minimizer, of f .x/ over Rn ; which happens only if Q0 is positive definite. t u Corollary 10.6 With f ; g; H defined as in (10.13) above, assume that g.x/ is homogeneous and takes both positive and negative values on H: Then x 2 H; g.x/ D 0

)

f .x/  0

(10.28)

if and only if there exists y 2 R such that f .x/ C yg.x/  0 8x 2 H: Proof This is a proposition often referred to as the S-Lemma with equality constraint. By Corollary 10.3, (ii), it extends to the case when g.x/ D hx; Q1 xi C hc1 ; xiCd1 with Q1 nonsingular and, moreover, g.x/ takes both positive and negative values on H: t u Note that when H D Rn and f .x/; g.x/ are homogeneous with associated matrices Q and Q1 ; g.x/ takes on both positive and negative values if Q1 is indefinite, while f .x/ C yg.x/  0 8x 2 Rn if and only if Q0 C yQ1 0: So in this special case Corollary 10.6 gives the following nonstrict Finsler Theorem (Finsler 1937): If Q1 is indefinite then hQ1 x; xi D 0 implies hQ0 x; xi  0 if and only if Q0 CyQ1 0 for some y 2 R: Also note: 0

Corollary 10.7 (Strict Finsler’s Theorem (Finsler 1937)) With f ; g homogeneous and H D Rn we have x ¤ 0; g.x/ D 0

)

f .x/ > 0

(10.29)

if and only if there exists y 2 R such that f .x/ C yg.x/ > 0 8x ¤ 0: Proof This follows directly from Corollary 10.3, (iii) (S-Lemma with nonstrict inequalities). Since f .x/ and g.x/ are both homogeneous, there exists .y1 ; y2 / 2 R2C such that y1 f .x/ C y2 g.x/ > 0 8x 2 H n f0g: From (10.29) we must have y1 > 0; so y D y2 =y1 : t u Corollary 10.8 Let f ; g; H be defined as in (10.13) above. Then just one of the following alternatives holds: (i) There exists x 2 H satisfying f .x/ < 0; g.x/ < 0I (ii) There exists y D .y1 ; y2 / 2 R2C n f.0; 0/g satisfying y1 f .x/ C y2 g.x/  0 8x 2 H: Proof Clearly (i) and (ii) cannot hold simultaneously. If (i) does not hold, two cases are possible: 1) f .x/  0 for every x 2 H such that g.x/ < 0I 2) g.x/  0 for every x 2 H such that f .x/ < 0: In case (1), if 0 is a local minimum of g.x/ on H; then g.x/  0 8x 2 H; so y1 f .x/ C y2 g.x/  0 8x 2 H with y1 D 0; y2 D 1I if, on

10.3 Single Constraint Quadratic Optimization

351

the contrary, 0 is not a local minimum of g.x/ on H; then every x 2 H satisfying g.x/ D 0 is the limit of a sequence xk 2 H satisfying g.xk / < 0; k D 1; 2; : : : ; so infff .x/j g.x/  0; x 2 Hg D infff .x/j g.x/ < 0; x 2 Hg  0; hence by Theorem 10.3, (i), there exists y2 2 RC such that f .x/ C y2 g.x/  0 8x 2 H: The proof in case (2) is similar, with the roles of f .x/; g.x/ interchanged. t u Corollary 10.9 Let f ; g; H be defined as above, let A; B be two closed subsets of an affine manifold H such that A [ B D H: Assume condition (S). If f .x/  0

8x 2 A;

while

g.x/  0 8x 2 B

(10.30)

then there exists y 2 RC such that f .x/ C yg.x/  0 8x 2 H: Proof Condition (S) means that H n B ¤ ;; hence A ¤ ;: Clearly fx 2 Hj g.x/ < 0g  A: If x0 2 H satisfies g.x0 / D 0; then for every k there exists xk 2 H satisfying kxk  x0 k  1=k; g.xk / < 0 for otherwise x0 would be a local minimum of g.x/ on H; hence a global minimum on H; i.e., g.x/  g.x0 / D 0 8x 2 H; contradicting (S). Therefore, fx 2 Hj g.x/  0g D clfx 2 Hj g.x/ < 0g  A: Assumption (10.30) then implies f .x/  0 for every x 2 H such that g.x/  0; hence infff .x/j g.x/  0g  0; and the existence of y follows from Theorem 10.3, (i). t u The above results, and especially the next one, are very similar to corresponding properties of convex inequalities. Corollary 10.10 (i) With f .x/; g.x/; H as in (10.13) the set C D ft 2 R2 j 9x 2 H s.t. f .x/  t1 ; g.x/  t2 g is convex. (ii) With f .x/; g.x/ homogeneous and H a subspace, the set G D ft 2 R2 j 9x 2 H s.t. f .x/ D t1 ; g.x/ D t2 g is convex. Proof (i) 1 We first show that for any two points a; b 2 H the set Cab WD ft 2 R2 j 9x 2 Œa; b s.t. f .x/  t1 ; g.x/  t2 g is closed and convex. Let tk 2 Cab ; k D 1; 2; : : : ; be a sequence converging to t: For each k let xk 2 Œa; b satisfy f .xk /  t1k ; g.xk /  t2k : Then xk ! x 2 Œa; b (up to a subsequence) and by continuity of f ; g we have f .x/  t1 ; g.x/  t2 ; so t 2 Cab : Therefore, Cab is a closed set. Further, consider any point t D .t1 ; t2 / 2 R2 n Cab : Since Cab is a closed set there exists a point t0 … Cab such that t10 > t1 ; t20 > t2 : Then 6 9x 2 Œa; b

f .x/ < t10 ; g.x/ < t20 :

By Corollary 10.8 there exists .y1 ; y2 / 2 R2C n f0; 0g satisfying y1 .f .x/  t10 / C y2 .g.x/  t20 /  0 8x 2 Œa; b: Setting Lt WD fs 2 R2 j y1 s1 C y2 s2  y1 t10 C y2 t20 g it is easily checked that Cab  Lt while t … Lt : So for every t … Cab there exists a halfplane Lt separating t from Cab : Consequently, C D \t…C Lt ; proving the convexity of Cab :

1

This proof corrects a flaw in the original proof given in Tuy and Tuan (2013).

352

10 Nonconvex Quadratic Programming

Now if t; s are any two points of C then there exist a; b 2 H such that f .a/  t1 ; g.a/  t2 ;

f .b/  s1 ; g.b/  s2 :

This means t; s 2 Cab ; and hence Œt; s  Cab  C: Thus t; s 2 C ) Œt; s  C; proving the convexity of C: (ii) Turning to the set G with f .x/; g.x/ homogeneous and H a subspace, observe that if there exists x 2 H satisfying a1 g.x/  a2 f .x/ D 0; a1 f .x/ > 0; then f .x/ a1 a1 2 2 D Œfa1.x/ 2 > 0; so setting f .x/ D r yields a1 D r f .x/ D f .rx/ and a2 D f .x/ a1 2 f .x/ g.x/ D r g.x/ D g.rx/; hence a 2 G because rx 2 H: Therefore, if a … G then minfa1 f .x/j a1 g.x/ a2 f .x/ D 0; x 2 Hg D 0: By Corollary 10.3, (ii), there exists y 2 R satisfying a1 f .x/ C y.a1 g.x/  a2 f .x//  0 8x 2 H: Then G  La WD ft 2 R2 j  a1 t1 C y.a1 t2  a2 t1 /  0g; while a … La (because a21 < 0/: Thus, every a … G can be separated from G by a halfspace, proving the convexity of the set G: t u Assertion (i) of Corollary 10.10 is analogous to a well-known proposition for convex functions. It has been derived from Corollary 10.3, (i), but the converse is also true. In fact, suppose for given f .x/; g.x/; H the set C WD ft 2 R2 j 9x 2 H s:t: f .x/  t1 ; g.x/  t2 g is convex. Clearly if the system x 2 H; f .x/  0; g.x/  0 is inconsistent then .0; 0/ … C; hence, by the separation theorem, there exists .u; y/ 2 R2C ; such that ut1 Cyt2  0 8.t1 ; t2 / 2 C; i.e., uf .x/Cyg.x/  0 8x 2 H: If, furthermore, there exists x 2 H such that g.x / < 0 then .f .x /; g.x // 2 C; so uf .x / C yg.x /  0: We cannot have u D 0 for this would imply yg.x /  0; hence y D 0; conflicting with .u; y/ ¤ 0: Therefore u > 0 and setting y D y=u yields f .x/ C yg.x/  0 8x 2 H: So the fact that C is convex is equivalent to Corollary 10.3, (i). Assertion (ii) of Corollary 10.10 (which still holds with H replaced by a regular cone) includes a result of Dines (1941) which was used by Yakubovich (1977) in his original proof of the S-Lemma. Since this result has been derived from Corollary 10.3, (ii), i.e., the S-Lemma with equality constraints, Dines’ theorem is actually equivalent to the latter. It has been long known that the general linearly constrained quadratic programming problem is NP-hard (Sahni 1974). Recently, two special variants of this problem have also been proved to be NP-hard: 1. The linearly constrained quadratic program with just one negative eigenvalue (Pardalos and Vavasis 1991): minfx21 C hc; xij Ax  b; x  0g: 2. The linear multiplicative program (Matsui 1996): minf.hc; xi C /.hd; xi C ı/j Ax  b; x  0g

10.3 Single Constraint Quadratic Optimization

353

Nevertheless, quite practical algorithms exist which can solve each of the two mentioned problems in a time usually no longer than the time needed for solving a few linear programs of the same size (Konno et al. 1997). In this section we are going to discuss another special quadratic minimization problem which can also be solved by efficient algorithms (Karmarkar 1990; Ye 1992). This problem has become widely known due to several interesting features. Referring to its role in nonlinear programming methods it is sometimes called the Trust Region Subproblem. It consists of minimizing a nonconvex quadratic function over a Euclidean ball, i.e.,

1 .TRS/ min hx; Qxi C hc; xij hx; xi  r2 ; 2 (10.31) where Q is an n  n symmetric matrix and c 2 Rn ; r 2 R: A seemingly more general formulation, in which the constraint set is an ellipsoid hx; Hxi  r2 (with H being a positive definite matrix), can be reduced to the above one by means of a nonsingular linear transformation. Let u1 ; : : : ; un be a full set of orthonormal eigenvectors of Q; and let 1 : : : ; n be the corresponding eigenvalues. Without loss of generality one can assume that 1 D : : : D k < kC1  : : :  n ; and 1 < 0: (The latter inequality simply means that the problem is nonconvex.) By Proposition 4.18, the global minimum is achieved on the sphere hx; xi D r2 ; so the problem is equivalent to

1 2 hx; Qxi C hc; xij hx; xi D r : min 2

Proposition 10.4 A necessary and sufficient condition for x to be a global optimal solution of .TRS/ is that there exist   0 satisfying Q C  yI is positive semidefinite 

 

Qx C  x C c D 0 

kx k D r:

(10.32) (10.33) (10.34)

Proof Clearly .TRS/ is a problem .QPA/ where H D Rn ; Q0 D Q; Q1 D I and condition .S/ is satisfied, so the result follows from Corollary 10.5. t u Remark 10.5 It is interesting to note that when this result first appeared it came as a surprise. In fact, at the time .TRS/ was believed to be a nonconvex problem, so the fact that a global optimal solution can be fully characterized by conditions like (10.32)–(10.34) seemed highly unusual. However, as we saw by Theorem 10.3, x is a global optimal solution of .TRS/ as a .QPA/ if and only if there exists   0 such that the function 12 hx; Qxi C  hc; xi is convex (which is expressed by (10.32)) and x is a minimizer of this convex function (which is expressed by the equalities (10.33)–(10.34)—standard optimality condition in convex programming).

354

10 Nonconvex Quadratic Programming

Also note that by a result of Martinez (1994) .TRS/ has at most one local nonglobal optimal solution. No wonder that good local methods such as the DCA method (Tao 1997; Tao and Hoai-An (1998) and the references therein) can often give actually a global optimal solution. In view of Proposition 10.4, solving .TRS/ reduces to finding   0 satisfying (10.32)–(10.34). Condition (10.33) implies that 1   < C1: For 1 <  since QCI is positive definite, the equation QxCxCc D 0 has a unique solution given by x./ D .Q C I/1 c: Proposition 10.5 The function kx./k is monotonically strictly increasing in the interval .1 ; C1/: Furthermore, lim kx./k D 0;

!1

lim kx./k D

!1

kxk if hc; ui i D 0 8i D 1; : : : ; k C1 otherwise;

where xD

n X hc; ui iui :   1 iDkC1 i

Proof From x./ D Qx./ C c; we have for every i D 1; : : : ; n: hx./; ui i D hx; Qui i C hc; ui i D i hx; ui i C hc; ui i; hence, for  > 1 : hx./; ui i D

hc; ui i : i  

Since fu1 ; : : : ; un g is an orthonormal base of Rn ; we then derive x./ D

n X hc; ui iui iD1

i  

;

(10.35)

and consequently kx./k2 D

n X .hc; ui i/2 : .i  /2 iD1

(10.36)

10.4 Quadratic Problems with Linear Constraints

355

This expression shows that kx./k is a strictly monotonically increasing function of  in the interval .1 ; C1/: Furthermore, lim kx./k D 0: If hc; ui i D 0 8i D Pn !1i i 1; : : : ; k, then from (10.35) x./ D iDkC1 hc; u iu =.i  /; hence x./ ! x as  ! 1 : Otherwise, since hc; ui i ¤ 0 for at least some i D 1; : : : ; k and 1 D : : : ; k we have kx./k ! C1 as  ! 1 : t u From the above results we derive the following method for solving (TRS): • If hc; ui i ¤ 0 for at least one i D 1; : : : ; k or if kxk  r then there exists a unique value  2 Œ1 ; C1/ satisfying kx. /k D r: This value  can be determined by binary search. The global optimal solution of (TRS) is then x D .Q C  I/1 c and is unique. • If hc; ui i D 0 8i D 1; : : : ; k and kxk < r then  D 1 and an optimal solution is x D

k X

˛i ui C x;

iD1

P where ˛1 ; : : : ; ˛k are any real numbers such that kiD1 ˛i2 D r2  kxk2 : Indeed, P it is easily verified that kx k2 D r2 and also Qx C c D kiD1 ˛i Qui C Qx C c D Pk P  iD1 ˛i 1 ui  1 x D 1 . kiD1 ˛i ui C x/ D 1 x : Since there are many different choices for ˛1 ; : : : ; ˛k satisfying the required condition, the optimal solution in this case is not unique.

10.4 Quadratic Problems with Linear Constraints 10.4.1 Concave Quadratic Programming The Concave Quadratic Programming Problem .CQP/

minff .x/j Ax  b; x  0g;

where f .x/ is a concave quadratic function, is a special case of .BCP/—the Basic Concave Programming Problem (Chap. 5). Algorithms for this problem have been developed in Konno (1976b), Rosen (1983), Rosen and Pardalos (1986), Kalantari and Rosen (1987), Phillips and Rosen (1988), and Thakur (1991), . . . . Most often the quadratic function f .x/ is considered in the separable form, which is always possible via an affine transformation of the variables (cf Sect. 8.1). Also, particular attention is paid to the case when the quadratic function is of low rank, i.e., involves a restricted number of nonconvex variables and, consequently, is amenable to decomposition techniques (Chap. 7).

356

10 Nonconvex Quadratic Programming

Of course, the algorithms presented in Chaps. 5–6–7 for .BCP/ can be adapted to handle .CQP/: In doing this one should note two points where substantial simplifications and improvements are possible due to the specific properties of quadratic functions. The first point concerns the computation of the -extension (Remark 5.2). Since f .x/ is concave quadratic, for any point x0 2 Rn and any direction u 2 Rn ; the function 7! f .x0 C u/ is concave quadratic in : Therefore, for any x0 ; x 2 Rn and any real number such that f .x0 / > ; the -extension of x with respect to x0 ; i.e., the point x0 C .x  x0 / corresponding to the value D supfj f .x0 C .x  x0 //  g; can be found by solving the quadratic equation in W f .x0 C .x  x0 // D (note that the level set f .x/  is bounded). The second point is the construction of the cuts to reduce the search domain. It turns out that when the objective function f .x/ is quadratic, the usual concavity cuts can be substantially strengthened by the following method (Konno 1976a, see also Balas and Burdet 1973). Assume that D  RnC ; x0 D 0 is a vertex of D and write the objective function as f .x/ D hx; Qxi C 2hc; xi; where Q is a negative semidefinite matrix. Consider a -valid cut for .f ; D/ constructed at x0 : n X xi iD1

ti

 1:

(10.37)

P So if we define M.t/ D fx 2 Dj niD1 xi =ti /  1g, then fx 2 Dj f .x/ < g  M.t/: Note that by construction f .ti ei /  ; i D 1; : : : ; n; where as usual ei is the ith unit vector. We now construct a new concave function 't .x/ such that a -valid concavity cut for .'t ; D/ will also be -valid for .f ; D/ but in general deeper than the above cut. To this end define the bilinear function F.x; y/ D hx; Qyi C hc; xi C hc; yi: Observe that f .x/ D F.x; x/;

F.x; y/  minff .x/; f .y/g:

Indeed, the first equality is evident. On the other hand, F.x; y/  F.x; x/ D hx; Q.y  x/i C hc; y  xi F.x; y/  F.y; y/ D hy; Q.x  y/i C hc; x  yi;

(10.38)

10.4 Quadratic Problems with Linear Constraints

357

hence ŒF.x; y/  F.x; x/ C ŒF.x; y/  F.y; y/ D hx  y; Q.x  y/i  0 because Q is negative semidefinite. Thus at least one of the differences F.x; y/  F.x; x/; F.x; y/  F.y; y/ must be nonnegative, whence the inequality in (10.38). The argument also shows that if Q is negative definite then the inequality in (10.38) is strict for x ¤ y because then hx  y; Q.x  y/i < 0: Lemma 10.1 The function 't .x/ D minfF.x; y/j y 2 M.t/g is concave and satisfies 't .x/  f .x/

8x 2 M.t/

(10.39)

minf't .x/j x 2 M.t/g D minff .x/j x 2 M.t/g:

(10.40)

Proof The concavity of 't .x/ is obvious from the fact that it is the lower envelope of the family of affine functions x 7! F.x; y/; y 2 M.t/: Since 't .x/  F.x; x/ D f .x/ 8x 2 M.t/ (10.39) holds. Furthermore, 't .x/  minfF.x; y/j y 2 M.t/; y D xg D minfF.x; x/j x 2 M.t/g; hence 't .x/  minff .x/j x 2 M.t/g; t u

and (10.40) follows from (10.39).

Proposition 10.6 Let '.ti e / D F.ti e ; y /; i.e., y 2 argminfF.ti e ; y/j y 2 M.t/g: If f .yi /  ; i D 1; : : : ; n then the vector t D .t1 ; : : : ; tn / such that i

i

i

i

i

ti D maxfj 't .ei /  g; i D 1; : : : ; n

(10.41)

defines a -valid cut for .f ; D/ at x0 : n X xi  1; t iD1 i

(10.42)

which is no weaker than (10.37) (i.e., such that ti  ti ; i D 1; : : : ; n/: Proof The cut (10.42) is -valid for .'t ; D/, hence for .f ; D/; too, in view of (10.39). Since 't .ti ei /  minff .ti ei /; f .yi /g by (10.38), and since f .ti ei /  ; it follows from the inequality f .yi /  that 't .ti ei /  ; and hence ti  ti by the definition of ti : t u Note that if 't .ti ei / > minff .ti ei /; f .yi /g (which occurs when Q is negative definite) then ti > ti provided that ti ei … D (for yi 2 M.t/ implies yi 2 D; hence yi ¤ ti ei /: This means that the cut (10.42) in this case is strictly deeper than (10.37).

358

10 Nonconvex Quadratic Programming

The computation of ti in (10.41) reduces to solving a linear program. Indeed, if Qi denotes the ith row of Q, then F.ei ; y/ D hQi C c; yi C ci ; so for D D fxj Ax  b; x  0g we have 't .ei / D gi ./ C ci ; where ( gi ./ D min hQi C hc; yij

n X yi iD1

ti

)  1; Ay  b; y  0 :

By the duality theorem of linear programming, (



1 1 gi ./D max hb; ui C j  A u C  ;:::; t1 tn T

)

T

 .Qi C c/ ; u  0;   0 : T

Thus, ti D maxfj gi ./ C ci  g, hence (

ti



1 1 D max j  hb; ui C  C ci  ; Q  A u C  ;:::; t1 tn

T

T

 .Qi C c/T ;

u  0;   0g :

(10.43)

To sum up, a cut (10.37) can be improved by the following procedure: 1. Compute yi 2 argminfF.ti ei ; y/j y 2 M.t/g; i D 1; : : : ; n: 2. If f .yi / < for some i, then construct a 0 -valid cut for .f ; D/,with the new incumbent 0 WD minff .yi /j i D 1; : : : ; ng: 3. Otherwise, compute ti ; i D 1; : : : ; n by (10.43), and construct the cut (10.42). Clearly, if a further improvement is needed the above procedure can be repeated with t t :

10.4.2 Biconvex and Convex–Concave Programming A problem that has earned a considerable interest due to its many applications is the bilinear program whose general form is minfhc; xi C hx; Ayi C hd; yi W x 2 X; y 2 Yg; where X; Y are polyhedrons in Rp ; Rq , respectively, and A is a pq matrix. By setting '.x/ D hc; xi C minfhx; Ayi C hd; yi W y 2 Yg, the problem becomes minf'.x/ W x 2 Xg, which is a concave program, because '.x/ is a concave function (pointwise minimum of a family of affine functions). A more difficult problem, which can also be shown to be more general, is the following jointly constrained biconvex program: .JCBP/

minff .x/ C hx; yi C g.y/ W .x; y/ 2 Gg;

10.4 Quadratic Problems with Linear Constraints

359

where f .x/; g.y/ are convex functions on Rn and G is a compact convex subset of Rn  Rn : Since hx; yi D .kx C yk2  kx  yk2 /=4, the problem can be solved by rewriting it as the reverse convex program: minff .x/ C g.y/ C .kx C yk2  t/=4 W .x; y/ 2 G; kx  yk2  tg: An alternative branch and bound method was proposed in Al-Khayyal and Falk (1983), where branching is performed by rectangular partition of Rn  Rn : To compute a lower bound for F.x; y/ WD f .x/ C hx; yi C g.y/ over any rectangle M D f.x; y/ 2 Rn  Rn W r  x  sI u  y  vg observe the following relations for every i D 1; : : : ; n: xi yi  ui xi C ri yi  ri ui

8xi  ri ; yi  ui

xi yi  vi xi C si yi  si vi

8xi  si ; yi  vi

(these relations hold because .xi  ri /.yi  ui /  0 and .xi  si /.yi  vi /  0, respectively). Therefore, an underestimator of xy over M is the convex function ˇ P ˇ '.x; y/ D niD1 'i .x; y/; ˇ 'i .x; y/ D maxfui xi C ri yi  ri ui ; vi xi C si yi  si vi g ˇ

(10.44)

and a lower bound for minfF.x; y/ W .x; y/ 2 Mg is provided by the value ˇ.M/ WD minff .x/ C '.x; y/ C g.y/ W .x; y/ 2 G \ Mg

(10.45)

(the latter problem is a convex program and can be solved by standard methods). At iteration k of the branch and bound procedure, a rectangle Mk in the current partition is selected which has smallest ˇ.Mk /: Let .xk ; yk / be the optimal solution of the bounding subproblem (10.45) for Mk ; i.e., the point .xk ; yk / 2 G \ Mk ; such that ˇ.Mk / D f .xk / C '.xk ; yk / C g.yk /: If xk yk D '.xk ; yk / then the procedure terminates: .xk ; yk / is a global optimal solution of (JCBP). Otherwise, select ik 2 arg max fxki yki  'i .xk ; yk /g: iD1;:::;n

If Mk D Œrk ; sk   Œuk ; v k ; then subdivide Œrk ; sk  via .xk ; ik / and subdivide Œuk ; v k  via .yk ; ik / to obtain a partition of Mk into four smaller rectangles which will replace Mk in the next iteration. By simple computation, one can easily verify that '.x; y/ D xy; hence f .x/ C '.x; y/Cg.y/ D F.x; y/ at every corner .x; y/ of the rectangle M: Using this property

360

10 Nonconvex Quadratic Programming

and the basic rectangular subdivision theorem it can then be proved that the above branch and bound procedure converges to a global optimal solution of .JCBP/: Another generalization of the bilinear programming problem is the convex– concave programming problem studied in Muu and Oettli (1991): .CCP/

minfF.x; y/ W .x; y/ 2 G \ .X  Y/g;

where G  Rp  Rq is a closed convex set, X a rectangle in Rp ; Y a rectangle in Rq , and F.x; y/ is convex in x 2 Rp but concave in y 2 Rq : The method of Muu–Oettli starts from the observation that, for any rectangle M  Y; if ˇ.M/ is the optimal value and .xM ; yM ; uM / an optimal solution of the problem minfF.x; y/ W x 2 X; y 2 M; u 2 M; .x; u/ 2 Gg;

(10.46)

then .xM ; uM / is feasible to .CCP/ and ˇ.M/ D F.xM ; yM /  minfF.x; y/ W .x; y/ 2 G \ .X  M/g  F.xM ; uM /: In particular, ˇ.M/ yields a lower bound for F.x; y/ on G \ .X  M/; which becomes the exact minimum if yM D uM : This suggests a branch and bound procedure in which branching is by rectangular subdivision of Y and ˇ.M/ is used as a lower bound for every subrectangle M of Y: At iteration k of this procedure, a rectangle Mk in the current partition is selected which has smallest ˇ.Mk /: Let .xk ; yk ; uk / D .xMk ; yMk ; uMk /: If yk D uk then the procedure terminates: .xk ; yk / solves (CCP). Otherwise, select ik 2 arg max fjyki  uki jg iD1;:::;q

and subdivide Mk via . y Cu ; ik /: In other words, subdivide Mk according to the 2 adaptive rule via yk ; uk (see Theorem 6.4, Chap. 6). The convergence of the procedure follows from the fact that, by virtue of Theorem 5.4, a nested sequence fMk.s/ g is generated such that jyMk.s/  uMk.s/ j ! 0 as s ! C1: This method is implementable if the bounding subproblems (10.46) can be solved efficiently, which is the case, e.g., for the d.c. problem k

k

minfg.x/  h.y/ W .x; y/ 2 G \ .X  Y/g: In that case, the subproblem (10.46): minfg.x/  h.y/ W x 2 X; y 2 M; u 2 M; .x; u/ 2 Gg is equivalent to minfg.x/ W x 2 X; u 2 M; .x; u/ 2 Gg  maxfh.y/ W y 2 Mg;

10.4 Quadratic Problems with Linear Constraints

361

which reduces to solving a standard convex program and computing the maximum of the convex function h.y/ at 2q corners of the rectangle M: By bilinear programming, one usually means a problem of the form .BLP/

minfhx; Qyi C hc; xi C hd; yij x 2 D; y 2 Eg;

where D; E are polyhedrons in Rn1 and Rn2 , respectively, Q is an n1  n2 matrix and c 2 Rn1 ; d 2 Rn2 : This is a rather special (though perhaps the most popular) variant of bilinear problem, since in the general case a bilinear problem may involve bilinear constraints as well. The fact that the constraints in .BLP/ are linear and separated in x and y allow this problem to be converted into a linearly constrained concave minimization problem. In fact, setting '.x/ D minfhx; Qyi C hd; yij y 2 Eg we see that .BLP/ is equivalent to minf'.x/ C hc; xij x 2 Dg: Since for each fixed y the function x 7! hx; Qyi C hd; yi is affine, it follows that '.x/ is concave. Therefore, the converted problem is a concave program. Aside from the bilinear program, a number of other nonconvex problems can also be reduced to concave or quasiconcave minimization. Some of these problems (such as minimizing a sum of products of convex positive-valued functions) have been discussed in Chap. 9 (Subsect. 9.6.3).

10.4.3 Indefinite Quadratic Programs Using an appropriate affine transformation any indefinite quadratic minimization problem under linear constraints can be written as: .IQP/

minfhx; xi  hy; yij Ax C By C c  0g;

where x 2 Rn1 ; y 2 Rn2 : Earlier methods for dealing with this problem (Kough 1979; Tuy 1987b) were essentially based on Geoffrion-Benders’ decomposition ideas and used the representation of the objective function as a difference of two convex quadratic separable functions. Kough’s method, which closely follows Geoffrion’s scheme, requires the solution, at each iteration, of a relaxed master problem which is equivalent to a number of linearly constrained concave programs, and, consequently, may not be easy to handle. Another approach is to convert .IQP/ to a single concave minimization under convex constraints and to use for solving it a decomposition technique which extends to convex constraints the decomposition

362

10 Nonconvex Quadratic Programming

by outer approximation presented in Chap. 7 for linear constraints. In more detail, this approach can be described as follows. For simplicity assume that the set Y D fyj .9x/ Ax C By C c  0g is bounded. Setting '.y/ D inffhx; xij Ax C By C c  0g it is easily seen that the function '.y/ is convex (see Proposition 2.6) and that dom'  Y: Then .IQP/ can be rewritten as minft  hy; yij '.y/  t; y 2 Yg:

(10.47)

Here the objective function t  hy; yi is concave, while the constraint set D WD f.y; t/ 2 Rn2  Rj '.y/  t; y 2 Yg is convex. So (10.47) is a concave program under convex constraints, which can be solved, e.g., by the outer approximation method. Since, however, the convex feasible set in this program is defined implicitly via a convex program, the following key question must be resolved for carrying out this method (Chap. 7, Sect. 7.1): given a point .y; t/ determine whether it belongs to D and if it does not, construct a cutting plane (in the space Rn2  R/ to separate it from D: Clearly, .y; t/ 2 D if and only if y 2 Y; and '.y/  t: • One has y 2 Y if and only if the linear program minf0j Ax  Bycg is feasible, hence, by duality, if and only if the linear program maxfhBy C hc; uij AT u D 0; u  0g

(10.48)

has optimal value 0. So, if u D 0 is an optimal solution of the linear program (10.48), then y 2 YI otherwise, in solving this linear program an extreme direction v of the cone AT u D 0; u  0 is obtained such that hBy C c; vi > 0 and the cut hBy C hc; vi  0

(10.49)

separates .y; t/ from D: • To check the condition '.y/  t one has to solve the convex program minfhx; xij Ax C By C c  0g:

(10.50)

Lemma 10.2 The convex program (10.50) has at least one Karush-Kuhn-Tucker vector  and any such vector satisfies BT  2 @'.y/:

10.5 Quadratic Problems with Quadratic Constraints

363

Proof Since hx; xi  0 one has '.y/ > 1; so the existence of  follows from a classical theorem of convex programming (see, e.g., Rockafellar 1970, Corollary 28.2.2). Then '.y/ D inffhx; xi C h; Ax C By C cij x 2 Rn1 g

(KKT vector)

and for any y 2 Rn2 : '.y/  inffhx; xi C h; Ax C By C cij x 2 Rn1 g (because   0/: Hence '.y/  '.y/  h; By  Byi D hBT ; y  yi; i.e., BT  2 @'.y/:

t u

Corollary 10.11 If '.y/ > t, then a cut separating .y; t/ from D is given by h; B.y  y/i C '.y/  t  0:

(10.51)

Thus, given any point .y; t/ one can check whether .y; t/ 2 D and if .y; t/ … D then construct a cut separating .y; t/ from D (this cut is (10.49) if y … Y; or (10.51) if '.y/ > t/: With the above background, a standard outer approximation procedure can be developed for solving the indefinite quadratic problem .IQP/ (Tuy 1987b). At iteration k of this procedure a polytope Pk is in hand which is an outer approximation of the feasible set D: Let V.Pk / be the vertex set of Pk : By solving the subproblem minft hy; yij y 2 V.Pk /g a solution .yk ; tk / is obtained. If .yk ; tk / 2 D the procedure terminates. Otherwise a hyperplane is constructed which cuts off this solution and determines a new polytope PkC1 : Under mild assumptions this procedure converges to a global optimal solution. In recent years, BB methods have been proposed for general quadratic problems which are applicable to .IQP/ as well. These methods will be discussed in the next section and also in Chap. 11 devoted to monotonic optimization.

10.5 Quadratic Problems with Quadratic Constraints We now turn to the general nonconvex quadratic problem : .GQP/

minff0 .x/j fi .x/  0; i D 1; : : : ; mI x 2 Xg;

where X is a closed convex set and the functions fi ; i D 0; 1; : : : ; m are indefinite quadratic: fi .x/ D

1 hx; Qi xi C hci ; xi C di ; 2

i D 0; 1; : : : ; m:

364

10 Nonconvex Quadratic Programming

By virtue of property (iii) listed in Sect. 8.1 .GQP/ can be rewritten as minf'0 .x/  rkxk2 j 'i .x/  rkxk2  0; i D 1; : : : ; mI x 2 Xg;

(10.52)

where r  maxiD0;1;:::;m 12 .Qi / and 'i .x/ D fi .x/ C rkxk2 ; i D 0; 1; : : : ; m; are convex functions. Thus .GQP/ is a d.c. optimization problem to which different approaches are possible. Aside from methods derived by specializing the general methods presented in the preceding chapters, there exist other methods designed specifically for quadratic problems. Below we shall focus on BB (branch and bound) methods.

10.5.1 Linear and Convex Relaxation For low rank nonconvex problems, i.e., for problems which are convex when a relatively small number of variables are kept fixed, and also for problems whose quadratic structure is implicit, outer approximation (or combined OA/BB methods) may in many cases be quite practical. Two such procedures have been discussed earlier: the Relief Indicator Method (Chap. 7, Sect. 7.5), designed for continuous optimization and the Visible Point Method (Chap. 7, Sect. 7.4) which seems to work well for problems in low dimension with many nonconvex quadratic or piecewise affine constraints (such as those encountered in continuous location theory (Tuy et al. 1995a)). For solving large-scale quadratic problems with little additional special structure branch and bound methods seem to be most suitable. Two fundamental operations in a BB algorithm are bounding and branching. As we saw in Chap. 5, these two operations must be consistent in order to ensure convergence of the procedure. Roughly speaking, the latter means that in an infinite subdivision process, the current lower bound must tend to the exact global minimum. A standard bounding method consists in considering for each partition set a relaxed problem obtained from the original one by replacing the objective function with an affine or convex minorant and enclosing the constraint set in a polyhedron or a convex set.

a. Lower Linearization Consider a class of problems which can be formulated as:

.SNP/

ˇ ˇ min g0 .x/ C h0 .y/ ˇ ˇ s:t: gi .x/ C hi .y/  0; i D 1; : : : ; m ˇ ˇ .x; y/ 2 ˝;

10.5 Quadratic Problems with Quadratic Constraints

365

where g0 ; gi W Rn ! R are continuous functions, h0 ; hi W Rp ! R are affine functions, while ˝ is a polytope in Rn  Rp : Obviously any quadratically constrained quadratic program is of this form. It turns out that for this problem a bounding method can be defined consistent with simplicial subdivision provided the nonconvex functions involved satisfy a lower linearizability condition to be defined below.

I. Simplicial Subdivision A function f W Rn ! R is said to be lower linearizable on a compact set C  Rn with respect to simplicial subdivision if for every n-simplex M  C there exists an affine function 'M .x/ such that 'M .x/  f .x/ 8x 2 M;

supŒf .x/  'M .x/ ! 0 as diamM ! 0:

(10.53)

x2M

The function 'M .x/ will be called a lower linearization of f .x/ on M: Most continuous functions are locally lower linearizable in the above sense. For example, if f .x/ is concave, then 'M .x/ can be taken to be the affine function that agrees with f .x/ at the vertices of M; if f .x/ is Lipschitz, with Lipschitz constant L, then 'M .x/ can be taken to be the affine function that agrees with fQ .x/ WD f .xM /  Lkx  xM k at the vertices of M; where xM is an arbitrary point in M (in fact, fQ .x/ is concave, so 'M .x/ is a lower linearization of fQ .x/; hence, also a lower linearization of f .x/ because by Lipschitz condition fQ .x/  f .x/ and jf .x/  fQ .x/j  2kx  xM k 8x 2 M/: Thus special cases of problem (SNP) with locally lower linearizable gi ; i D 0; 1; : : : ; m; include reverse convex programs (g0 .x/ linear, gi .x/; i D 1; : : : ; m; concave), as well as Lipschitz optimization problems (g0 .x/; gi .x/; i D 1; : : : ; m; Lipschitz). Now assume that every function gi is lower linearizable and let M;i .x/ be a lower linearization of gi .x/ on a given simplex M: Then the following linear program is a relaxation of (SNP) on MI

SP.M/

ˇ ˇ min M;0 .x/ C h0 .y/ ˇ ˇ s:t: M;i .x/ C hi .y/  0; i D 1; : : : ; m: ˇ ˇ .x; y/ 2 ˝; x 2 M:

Therefore, the optimal value ˇ.M/ of this linear program is a lower bound for the objective function value on the feasible solutions .x; y/ such that x 2 M: Incorporating this lower bounding operation into an exhaustive simplicial subdivision process of the x-space yields the following algorithm: BB Algorithm for (SNP) Initialization. Take an n-simplex M1  Rn containing the projection of ˝ on Rn : Let .x; y/ be the best feasible solution available, D g0 .x/ ( D C1 if no feasible solution is known). Set R D S D fM1 g; k D 1:

366

Step 1. Step 2. Step 3. Step 4. Step 5. Step 6.

10 Nonconvex Quadratic Programming

For each M 2 S compute the optimal value ˇ.M/ of the linear program SP.M/: 0 Update .x; y/ and : Delete every M 2 R such that ˇ.M/  . Let R be the remaining collection of simplices. 0 If R D ;, then terminate: .x; y/ solves (SNP) (if < C1) or (SNP) is infeasible (if D C1/: 0 Let Mk 2 argminfˇ.M/ W M 2 R g; and let .xk ; yk / be a basic optimal solution of SP.Mk /. Perform a bisection or a radial subdivision of Mk ; following an exhaustive subdivision rule. Let S  be the partition of Mk . 0 Set S S ; R S  [ .R n fMk g/; increase k by 1 and return to Step 1.

The convergence of this algorithm is easy to establish. Indeed, if the algorithm is infinite, it generates a filter fMk g collapsing to a point x : Since ˇ.Mk /  min(SNP) and (10.53) implies that Mk ;i .x / D gi .x /; i D 0; 1; : : : ; m; it follows that if ˇ.Mk / D g0 .xk / C h0 .yk /, then xk ! x ; yk ! y such that .x ; y / solves (SNP). Remark 10.6 If the lower linearizations are such that for any simplex M each function M;i .x/g matches gi .x/ at every corner of M then as soon as xk in Step 3 coincides with a corner of Mk the solution .xk ; yk / will be feasible, hence optimal to the problem. Therefore, in this case an adaptive subdivision rule to drive xk eventually to a corner of Mk should ensure a much faster convergence (Theorem 6.4).

II. Rectangular Subdivision A function f .x/ is said to be lower linearizable on a compact set C  Rn with respect to rectangular subdivision if for any rectangle M  C there exists an P affine function 'M .x/ satisfying (10.53). For instance, if f .x/ is separable: f .x/ D niD1 fi .x/ and each univariate function fi .t/ has a lower linearization 'M;i .x/ on the segment Œai ; bi  then Pn a lower linearization of f .x/ on the rectangle C D Œa; b is obviously 'M .x/ D iD1 'M;i .xi /: For problems .SNP/ where the functions gi .x/; i D 0; 1; : : : ; m are lower linearizable with respect to rectangular subdivision, a rectangular algorithm can be developed similar to the simplicial algorithm.

b. Tight Convex Minorants A convex function '.x/ is said to be a tight convex minorant of a function f .x/ on a rectangle M (or a simplex M/ if '.x/  f .x/ 8x 2 M;

minŒf .x/  '.x/ D 0: x2M

(10.54)

10.5 Quadratic Problems with Quadratic Constraints

367

If f .x/ is continuous on M, then its convex envelope '.x/ WD convf .x/ over M must be a tight convex minorant, for if minx2M Œf .x/  '.x/ D ˛ > 0 then the function '.x/ C ˛ would give a larger convex minorant. Example 10.1 The function maxfpy C rx  pr; qy C sx  qsg is a tight convex minorant of xy 2 R2 on Œp; q  Œr; s: It matches xy for x D p or x D q (and y arbitrary in the interval r  y  s). Proof This function is the convex envelope of the product xy on the rectangle Œp; q Œr; s (Corollary 4.6). The second assertion follows from the inequalities xy  .py C rx  pr/ D .x  p/.y  r/  0; xy  .qy C sx  qs/ D .q  x/.s  y/  0 which hold for every .x; y/ 2 Œp; q  Œr; s: t u Example 10.2 Let f .x/ D hx; Qxi: The function 'M .x/ D f .x/ C r

n X .xi  pi /.xi  qi /; iD1

where r  .Q/ is a tight convex minorant of f .x/ on the rectangle M D Œp; q: It matches f .x/ at each corner of the rectangle Œp; q; i.e., at each point x such that xi 2 fpi ; qi g; i D 1; : : : ; n: Proof By (v), Sect. 8.1, 'M .x/ is a convex minorant of f .x/ and clearly 'M .x/ D f .x/ when xi 2 fpi ; qi g; i D 1; : : : ; n: t u Note that a d.c. representation of f .x/ is f .x/ D g.x/  rkxk2 with g.x/ D 2 2 f .x/ C rkxk Pn ; and r  .Q/: Since the convex envelope of kxk on Œp; q is l.x/ D  iD1 Œ.pi C qi /xi  pi qi ; a tight convex minorant of f .x/ is g.x/ C rl.x/: It is easily verified that g.x/ C rl.x/ D 'M .x/: Example 10.3 Let Qi be the row i and Qij be the element .i; j/ of a symmetric n  n matrix Q: For any rectangle M D Œp; q  Rn define ki D minfQi xj x 2 Œp; qg D Ki D maxfQi xj x 2 Œp; qg D

Pn jD1

minfpj Qij ; qj Qij g;

(10.55)

jD1

maxfpj Qij ; qj Qij g:

(10.56)

Pn

Then a tight convex minorant of the function f .x/ D hx; Qxi on M D Œp; q is 'M .x/ D

n X

'M;i .x/;

(10.57)

iD1

'M;i .x/ D maxfpi Qi x C ki xi  pi ki ; qi Qi x C Ki xi  qi Ki g: Actually, 'M .x/ D f .x/ at every corner x of the rectangle Œp; q:

(10.58)

368

10 Nonconvex Quadratic Programming

Pn Proof We can write f .x/ D iD1 fi .x/ with fi .x/ D xi yi ; yi D Qi x so the result simply follows from the fact that 'M;i .x/ is a convex minorant of fi .x/ and 'M;i .x/ D fi .x/ when xi 2 fpi ; qi g (Example 10.1). t u Consider now the nonconvex optimization problem minff0 .x/j fi .x/  0; i D 1; : : : ; m; x 2 Dg;

(10.59)

where f0 ; fi W Rn ! R are nonconvex functions and D a convex set in Rn : Given any rectangle M  Rn one can compute a lower bound for f0 .x/ over the feasible points x in M by solving the convex problem minf'M;0 .x/j x 2 M \ D; 'M;i .x/  0; i D 1; : : : ; mg obtained by replacing fi ; i D 0; 1; : : : ; m with their convex minorants 'M;i on M: Using this bounding a BB algorithm for (10.59) with an exhaustive rectangular subdivision can then be developed. Theorem 10.4 Assume that fi ; i D 0; 1; : : : ; are continuous functions and for every partition set M each function 'M;i is a tight convex minorant of fi on M such that for any filter fMk g and any two sequences xk 2 Mk ; zk 2 Mk : 'Mk ;i .xk /  'Mk ;i .zk / ! 0 whenever xk  zk ! 0: Then the corresponding BB algorithm converges. Proof If the algorithm is infinite, it generates a filter fMk g such that \ Mk D fxg: Let us write 'k;i for 'Mk ;i and denote by zk;i a point of Mk where 'k;i .zk;i / D fi .zk;i /: Then for every i D 0; 1; : : : ; m: zk ;i ! x: Furthermore, denote by xk an optimal solution of the relaxed program associated with Mk ; so that 'k;0 .xk / D ˇ.Mk /: Then xk ! x: Hence fi .x/ D lim fi .zk ;i / D lim 'k ;i .zk ;i / D lim 'k ;i .xk / 8i D 0; 1; : : : ; m: Since 'k ;i .xk / /  0 8i D 1; : : : ; m it follows that fi .x/  0 8i D 1; : : : ; m; i.e., x is feasible. On the other hand, for any k: ˇ.Mk /  WD minff0 .x/j fi .x/  0; i D 1; : : : ; mg; so 'k ;0 .xk /  ; hence f0 .x/  : Therefore, f0 .x/ D and x is an optimal solution, proving the convergence of the algorithm. t u It can easily be verified that the convex minorants in Examples 8.1–8.3 satisfy the conditions in the above theorem. Consequently, BB algorithms using these convex minorants for computing bounds are guaranteed to converge. An algorithm of this type has been proposed in Al-Khayyal et al. (1995) (see also Van Voorhis 1997) for the quadratic program (10.59) where f0 .x/ D hx; Q0 xi C hc0 ; xi; fh .x/ D hx; Qh xi C hch ; xi C dh ; h D 1; : : : ; m and D is a polytope. For each rectangle M the convex h minorant of hx; Qh xi over M is 'M .x/ computed as in Example 10.3. To obtain an upper bound one can use, in addition, a concave majorant for each function fh : By

10.5 Quadratic Problems with Quadratic Constraints

369

similar arguments to the above, it is easily seen that a concave majorant of hx; Qh xi is Pn h h h h h h h h h M .x/ D iD1 i .x/ with i .x/ D minfpi Qi xCKi xi pi Ki ; qi Qi xCki xi qi ki g: So the relaxed problem is the linear program 0 min 'M .x/ C hc0 ; xi s.t. th C hch ; xi C dh  0 h D 1; : : : ; m h .x/  th  Mh .x/ h D 1; : : : ; m 'M x2M\D :

Tight convex minorants of the above type are also used in Androulakis et al. (1995) for the general nonconvex problem (10.59). By writing each function fh .x/; h D 0; 1; : : : ; m; in (10.59) as a sum of bilinear terms of the form bij xi xj and a nonconvex function with no special structure, a convex minorant 'M;h .x/ of fh .x/ on a rectangle M can easily be computed from Examples 10.1 and 10.2. Such a convex minorant is tight because it matches fh .x/ at any corner of M; furthermore, as is easily seen, 'M;h .x/; h D 0; 1; : : : ; m; satisfy the conditions formulated in Theorem 10.4. The convergence of the BB algorithm in Androulakis et al. (1995) then follows. The reader is referred to the cited papers (Al-Khayyal et al. 1995; Androulakis et al. 1995) for details and computational results concerning the above algorithms. In fact the convex minorants in Examples 10.1–10.3 could be obtained from the general rules for computing convex minorants of factorable functions (see Chap. 4, Sect. 4.7). In other words the above algorithms could be viewed as specific realizations of the general scheme of McCormick (1982) for factorable programming. Remark 10.7 (i) If the convex minorants used are of the types given in Examples 8.1–8.3 then for any rectangle M each function 'M;i matches fi at the corners of M: In that case, the convergence of the algorithm would be faster by using the !-subdivision, viz : at iteration k divide Mk D Œpk ; qk  via .! k ; jk /; where ! k is an optimal solution of the relaxed problem at iteration k and jk is a properly chosen index (jk 2 argmaxf k;j j j D 1; : : : ; ng where

k;j D minf!jk  pkj ; qkj  !jk gI see Theorem 6.4. (ii) A function f .x/ is said to be locally convexifiable on a compact set C  Rn with respect to rectangular subdivision if there exists for each rectangle M  C a convex minorant 'M .x/ satisfying (10.53) (this convex minorant is then said to be asymptotically tight). By an argument analogous to the proof of Theorem 10.4 it is easy to show that if the functions fi .x/; i D 0; 1; : : : ; m in (10.59) are locally convexifiable and the convex minorants 'M;i used for bounding are asymptotically tight, while the subdivision is exhaustive then the BB algorithm converges. For the applications, note that if two functions f1 .x/; f2 .x/ are locally convexifiable then so will be their sum f1 .x/ C f2 .x/:

c. Decoupling Relaxation A lower bounding method which can be termed decoupling relaxation consists in converting the problem into an equivalent one where the nonconvexity is concentrated in a number of “coupling constraints.” When these coupling constraints are

370

10 Nonconvex Quadratic Programming

omitted, the problem becomes linear, or convex, or easily solvable. The method proposed in Muu and Oettli (1991), Muu (1993), and Muu (1996), for convex– concave problems, and also the reformulation–linearization method of Sherali and Tuncbilek (1992, 1995), can be interpreted as different variants of variable decoupling relaxations.

d.

Convex–Concave Approach

A real-valued function f .x; y/ on X  Y  Rn1  Rn2 is said to be convex–concave if it is convex on X for every fixed y 2 Y and concave on Y for every fixed x 2 X: Consider the problem .CCAP/

minff .x; y/j .x; y/ 2 C; x 2 X; y 2 Yg;

(10.60)

where X  Rn1 ; Y  Rn2 are polytopes, C is a compact convex set in Rn1  Rn2 and the function f .x; y/ is convex–concave on X  Y: Clearly, linearly (or convexly) constrained quadratic programs and more generally, d.c. programming problems with convex constraints, are of this form. Since by fixing y the problem is convex, a branch and bound method for solving .CCAP/ should branch upon y: Using, e.g., rectangular partition of the y-space, one has to compute, for a given rectangle M  Rn2 ; a lower bound ˇ.M/ for .SP.M//

minff .x; y/j .x; y/ 2 C; x 2 X; y 2 Y \ Mg:

(10.61)

The difficulty of this subproblem is due to the fact that convexity and concavity are coupled in the contraints. To get round this difficulty the idea is to decouple x and y by reformulating the problem as minff .x; y/j .x; u/ 2 C; x 2 X; u 2 Y; y 2 M; u D yg and omitting the constraint u D y: Then, obviously, the optimal value in the relaxed problem .RP.M//

minff .x; y/j .x; u/ 2 C; x 2 X; u 2 Y; y 2 Mg

(10.62)

is a lower bound of the optimal value in .CCAP/: Let V.M/ be the vertex set of M: By concavity of f .x; y/ in y we have minff .x; y/j y 2 Mg D minff .x; y/jy 2 V.M/g; so the problem (10.62) is equivalent to min minff .x; y/j .x; u/ 2 C; x 2 X; u 2 Yg:

y2V.M/

10.5 Quadratic Problems with Quadratic Constraints

371

Therefore, one can take ˇ.M/ D min minff .x; y/j .x; u/ 2 C \ .X  Y/g: y2V.M/

(10.63)

For each fixed y the problem minff .x; y/j .x; u/ 2 C \ .X  Y/g is convex (or linear if f .x; y/ is affine in x; and C is a polytope), so (10.63) can be solved by standard algorithms. If .xM ; yM ; uM / is an optimal solution of RP.M/ then along with the lower bound f .xk ; yk / D ˇ.Mk /; we also obtain a feasible solution .xM ; uM / which can be used to update the current incumbent. Furthermore, when yM D uM ; ˇ.M/ equals the exact minimum in (10.63). At iteration kth of the branch and bound procedure let Mk be the rectangle such that ˇ.Mk / is smallest among all partition sets still of interest. Denote .xk ; yk ; uk / D .xMk ; yMk ; uMk /: If uk D yk then, as mentioned above, ˇ.Mk / D f .xk ; yk / while .xk ; yk / is a feasible solution. Since ˇ.Mk /  min (CCAP) it follows that .xk ; yk / is an optimal solution of .CCAP/: Therefore, if uk D yk the procedure terminates. Otherwise, Mk must be further subdivided. Based on the above observation, one should choose a subdivision method which, intuitively, could drive kyk  uk k to zero more rapidly than an exhaustive subdivision process. A good choice confirmed by computational experience is an adaptive subdivision via .yk ; uk /; i.e., to subdivide Mk via the hyperplane yik D .ykik C ukik /=2 where ik 2 argmaxiD1;:::;n2 jyki  uki j: Proposition 10.7 The rectangular algorithm with the above branching and bounding operations is convergent. Proof If the algorithm is infinite, it generates at least a filter of rectangles, say Mkq ; q D 1; 2; : : : : To simplify the notation we write Mq for Mkq : Let Mq D Œaq ; bq ; so that by a basic property of rectangular subdivision (Lemma 5.4), one can suppose, q q without loss of generality, that iq D 1; Œaq ; bq  ! Œa ; b ; .y1 C u1 /=2 ! y1 2 q q   fa1 ; b1 g: The latter relation implies that jy1  u1 j ! 0; hence yq  uq ! 0 q q q q because jy1  u1 j D maxiD1;:::;n2 jyi  ui j by the definition of iq D 1: Let  q  q q x D lim x ; y D lim y .D lim u /: Then ˇ.Mq / D f .xq ; yq / ! f .x ; y / and since ˇ.Mq /  min(CCAP) for all q; while .x ; y / is feasible, it follows that .x ; y / is an optimal solution. t u This method is practicable if n2 is relatively small. Often, instead of rectangular subdivision, it may be more convenient to use simplicial subdivision. In that case, a simplex Mk is subdivided via ! k D .yk C uk /=2 according to the !-subdivision rule. The method can also be extended to problems in which convex and nonconvex variables are coupled in certain joint constraints, e.g., to problems of the form minff .x/j .x; y/ 2 C; g.x; y/  0; x 2 X; y 2 Yg;

(10.64)

372

10 Nonconvex Quadratic Programming

where f .x/ is a convex function, X; Y are as previously, and g.x; y/ is a convex– concave function on X  Y: For any rectangle M in the y-space, one can take ˇ.M/ D minff .x/j .x; u/ 2 C; g.x; y/  0; x 2 X; u 2 Y; y 2 Mg D minff .x/j .x; u/ 2 C \ .X  Y/; g.x; y/  0; y 2 V.M/g: Computing ˇ.M/ again amounts to solving a convex subproblem.

e. Reformulation–Linearization Let X; Y be two polyhedrons in Rn1 ; Rn2 ; n1  n2 ; defined by reformulation– linearization X D fx 2 Rn1 j hai ; xi  ˛i ; i D 1; : : : ; h1 g

(10.65)

Y D fy 2 Rn2 j hbj ; yi  ˇj ; j D 1; : : : ; h2 g;

(10.66)

and let f0 .x; y/ D hx; Q0 yi C hc0 ; xi C hd0 ; yi

(10.67)

fi .x; y/ D hx; Q yi C hc ; xi C hd ; yi C gi ; i D 1; : : : ; m;

(10.68)

i

i

i

where for each i D 0; 1; : : : ; m Qi is an n1  n2 matrix, ci 2 Rn1 ; di 2 Rn2 and gi 2 R: Consider the general bilinear program minff0 .x; y/j fi .x; y/  0 .i D 1; : : : ; m/; x 2 X; y 2 Yg:

(10.69)

Assume that the linear inequalities defining X and Y are such that both systems hai ; xi  ˛i  0

i D 1; : : : ; h1

(10.70)

hbj ; yi  ˇj  0 j D 1; : : : ; h2

(10.71)

are inconsistent, or equivalently, 8x 2 Rn1 miniD1;:::;h1 .hai ; xi  ˛i / < 0; 8y 2 Rn2 minjD1;:::;h2 .hbj ; yi  ˇj / < 0:

(10.72)

This condition is fulfilled if the inequalities defining X and Y include inequalities of the form 1 < p  xi0  q < C1

(10.73)

1 < r  yj0  s < C1

(10.74)

10.5 Quadratic Problems with Quadratic Constraints

373

for some i0 2 f1; : : : ; h1 g; p < q; and some j0 2 f1; : : : ; h2 g; r < sI in particular, it is fulfilled if X and Y are bounded. Lemma 10.3 Under condition (10.72) the system x 2 X; y 2 Y is equivalent to .hai ; xi  ˛i /.hbj ; yi  ˇj /  0

8i D 1; : : : ; h1 ; j D 1; : : : ; h2 :

(10.75)

Proof It suffices to show that (10.75) implies .x; y/ 2 X  Y: Let .x; y/ satisfy (10.75). By (10.72) there exists for this x an index j0 2 f1; : : : ; h2 g such that hbj0 ; yi  ˇj0 < 0: Then for any i D 1; : : : ; h1 ; since .hai ; xi  ˛i /.hbj0 ; yi  ˇj0 /  0; it follows that hai ; xi ˛i  0: Similarly, for any j D 1; : : : ; h2 W hbj ; yi ˇj  0: u t Thus, a system of two sets of linear constraints in x and in y like (10.65) and (10.66) can be converted to an equivalent system of formally bilinear constraints. Now for each bilinear function P.x; y/ denote by ŒP.x; yL the affine function of x; y; w obtained by replacing each product xl yl0 by wll0 : For instance, if P.x; y/ D .ha; xi  ˛/.hb; yi  ˇ/ where a 2 Rn1 ; b 2 Rn2 ; ˛; ˇ 2 R then Œ.ha; xi  ˛/.hb; yi  ˇ/L D

n2 n1 X X

al bl0 wll0  ˛hb; yi  ˇha; xi C ˛ˇ:

lD1 l0 D1

Let Œfi .x; y/L D 'i .x; y; w/; i D 0; 1; : : : ; m and for every .i; j/: Œ.hai ; xi  ˛i /.hbj ; yi  ˇj /L D Lij .x; y; w/: Then the bilinear program (10.69) can be written as ˇ ˇ min '0 .x; y; w/ ˇ ˇ s:t: 'i .x; y; w/  0 i D 1; : : : ; m; ˇ ˇ Lij .x; y; w/  0 i D 1; : : : ; h1 ; j D 1; : : : ; h2 ˇ ˇ wll0 D xl yl0 l D 1; : : : ; n1 ; l0 D 1; : : : ; n2 :

(10.76)

A new feature of the latter problem is that the only nonlinear elements in it are the coupling constraints wll0 D xl yl0 : Let (LP) be the linear program obtained from (10.76) by omitting these coupling constraints:

.LP/

ˇ ˇ min '0 .x; y; w/ ˇ ˇ s:t: 'i .x; y; w/  0 i D 1; : : : ; m; ˇ ˇ Lij .x; y; w/  0 i D 1; : : : ; h1 ; j D 1; : : : ; h2 :

Proposition 10.8 The optimal value in the linear program (LP) is a lower bound of the optimal value in (10.69). If an optimal solution .x; y; w/ of (LP) satisfies wll0 D xl yl0 8l; l0 then it solves (10.69).

374

10 Nonconvex Quadratic Programming

Proof This follows because (LP) is a relaxation of the problem (10.76) equivalent to (10.69). t u Remark 10.8 The replacement of the linear constraints x 2 X; y 2 Y; by constraints Lij .x; y; w/  0; i D 1; : : : ; h1 ; j D 1; : : : ; h2 ; is essential. For example, to compute a lower (or upper) bound for the product xy 2 R  R over the rectangle p  x  q; r  y  s; one should minimize (maximize, resp.) the function w under the constraints Œ.x  p/.y  r/L  0;

Œ.q  x/.s  y/L  0

Œ.x  p/.s  y/L  0;

Œ.q  x/.y  r/L  0:

This yields the well-known bounds given in Example 8.1: maxfrx C py  pr; sx C qy  qsg  w  minfsx C py  ps; rx C qy  qrg: Notice that trivial bounds 1 < w < C1 would have been obtained, were the original linear constraints used instead of the constraints Lij .x; y; w/  0: Thus, although it is more natural to linearize nonlinear constraints, it may sometimes be useful to tranform linear constraints into formally nonlinear constraints. From the above results we now derive a BB algorithm for solving (10.69). For simplicity, consider the special case of (10.69) when x D y; i.e., the general nonconvex quadratic program minff0 .x/j fi .x/  0 .i D 1; : : : ; m/; x 2 Xg;

(10.77)

where X is a polyhedron in Rn and f0 .x/ D

1 1 hx; Q0 xi C hc0 ; xi; fi .x/ D hx; Qi xi C hci ; xi C di : 2 2

(10.78)

Given any rectangle M in Rn ; the set X \ M can always be described by a system hai ; xi  ˛i

i D 1; : : : ; h

(10.79)

which includes among others all the inequalities pi  xi  qi ; i D 1; : : : ; n and satisfies (10.72), i.e., miniD1;:::;h .hai ; xi  ˛i / < 0 8x 2 Rn : For every .i; j/ with 1  i  j  h denote by Lij .x; w/ D Œ.hai ; xi  ˛i /.haj ; xi  ˛j /L the function that results from .hai ; xi  ˛i /.haj ; xi  ˛j / by the substitution wll0 D xl xl0 ; 1  l  l0  n: Also let

10.5 Quadratic Problems with Quadratic Constraints

 1 0 hx; Q xi C hc0 ; xi '0 .x; w/ D 2 L   1 hx; Qi xi C hci ; xi C di 'i .x; w/ D 2 L

375



i D 1; : : : ; m:

Then a lower bound of f0 .x/ over all feasible points x in X \ M is given by the optimal value ˇ.M/ in the linear program

LP.M/

ˇ ˇ min '0 .x; w/ ˇ ˇ s:t: 'i .x; w/  0 i D 1; : : : ; m ˇ ˇ Lij .x; w/  0 1  i  j  h:

To use this lower bounding for algorithmic purpose, one has to construct a subdivision process consistent with this lower bounding, i.e., such that the resulting BB algorithm is convergent. The next result gives the foundation for such a subdivision process. Proposition 10.9 Let M D Œp; q: If .x; w/ is a feasible point of LP.M/ such that x is a corner of the rectangle M then .x; w/ satisfies the coupling constraints wll0 D xl xl0 for all l; l0 : Proof Since x is a corner of M D Œp; q one has, for some I  f1; : : : ; ng: xi D pi for i 2 I and xj D qj for j … I: The feasibility of .x; w/ implies that Œ.xl pl /.ql0 xl0 /L  0; Œ.xl  pl /.xl0  pl0 /L  0 for any l; l0 ; i.e., pl xl0 C pl0 xl  pl pl0  wll0  ql0 xl C pl xl0  pl ql0 : For l 2 I we have xl D pl ; so xl xl0 C pl0 pl  pl pl0  wll0  ql0 pl C xl xl0  pl ql0 ; hence wll0 D xl xl0 : On the other hand, Œ.ql xl /.ql0 xl0 /k  0; Œ.ql xl /.xl0 pl0 /k  0; i.e., ql ql0 C ql0 xl C ql xl0  wll0  ql xl0  ql pl0 C pl0 xl : For l … I we have xl D ql ; so ql ql0 C ql0 ql C xl xl0  wll0  xl xl0  ql pl0 C pl0 ql ; hence wll0 D xl xl0 : Thus, in any case, wll0 D xl xl0 :

t u

Based on this property the following rectangular BB procedure can be proposed for solving (10.77). At iteration k; let Mk D Œpk ; qk  be the partition set with smallest lower bound among all partition sets still of interest and let .xk ; wk / be the optimal

376

10 Nonconvex Quadratic Programming

solution of the bounding problem, so that ˇ.Mk / D '.xk ; wk /: If wkll0 D xkl xkl0 for all l; l0 (in particular, if xk is a corner of Mk / then xk is an optimal solution of (10.77) and the procedure terminates. Otherwise, bisect Mk via .xk ; ik / where ik 2 argmaxf ki j i D 1; : : : ; ng;

ki D minfxki  pki ; qki  xki g:

Proposition 10.10 The rectangular algorithm with bounding and branching as above defined converges to an optimal solution of (10.77). Proof The subdivision ensures that if the algorithm is infinite it generates a filter of rectangles fM D Œp ; q g such that .x ; w / ! .x ; w /; and x is a corner of Œp ; q  D \ Œp ; q  (Theorem 6.3). Clearly .x ; w / is feasible to LP.M / for M D Œp ; q ; so by Proposition 10.9, wll0 D xl xl0 for all l; l0 ; and hence, x is a feasible solution to (10.77). Since '0 .x ; w / D f0 .x / is a lower bound for f0 .x/ over the feasible solutions, it follows that x is an optimal solution. t u Example 10.4 (Sherali and Tuncbilek 1995) Minimize .x1  12/2  x22 subject to 6x1 C 8x2  48; 3x1 C 8x2  120; 0  x1  24; x2  0: Setting w11 D x21 ; w12 D x1 x2 ; w22 D x22 the first bounding problem is to minimize the linear function w11 C 24x1  w22  144 subject to .5  6/=2 D 15 linear constraints corresponding to different pairwise products of the factors .48 C 6x1  8x2 /  0; .120  3x1  8x2 /  0; .24  x1 /  0; x1  0; x2  0: The solution to this linear program is .x1 ; x2 ; w11 ; w12 ; w22 / D .8; 6; 192; 48; 72/; with objective function value 216: By bisecting the feasible set X into M1 D fx 2 Xj x1  8g and M1C D fx 2 Xj x1  8g and solving the corresponding bounding problems one obtains a lower bound of 180 for both. Furthermore, the solution .x1 ; x2 ; w11 ; w12 ; w22 / D .0; 6; 0; 0; 36/ to the bounding problem on M1 is such that .x1 ; x2 / D .0; 6/ is a vertex of M1 : Hence, by Proposition 10.9, .0; 6/ is an optimal solution of the original problem and 180 is the optimal value. This result could also be checked by solving the problem as a concave minimization over a polytope (the vertices of this polytope are .0; 0/; .0; 6/; .8; 12/; .24; 6/; and .24; 0//: Remark 10.9 Since the number of constraints Lij .x; w/  0 is often quite large, it may sometimes be more practical to use only a selected subset of these constraints, even if this may result in a poorer bound. On the other hand, it is always possible to add some convex constraints of the form x2i  wii to obtain a convex rather than linear relaxation of the original problem. This may give, of course, a tighter bound at the expense of more computational effort.

10.5 Quadratic Problems with Quadratic Constraints

377

10.5.2 Lagrangian Relaxation Consider again the general problem \

fX WD infff0 .x/j fi .x/  0 .i D 1; : : : ; m/; x 2 Xg;

(10.80)

where X is a convex set in Rn ; f0 .x/; fi .x/ are arbitrary real-valued functions. The function L.x; u/ D f0 .x/ C

m X

ui fi .x/;

u 2 Rm C

iD1

is called the Lagrangian of problem (10.80) and the problem  X

WD sup u2Rm C

.u/;

the Lagrange dual problem . Clearly

.u/ D inf L.x; u/ x2X

(10.81)

\

.u/  fX 8u 2 Rm C ; hence  X

\

 fX ;

(10.82)

\

\

i.e., X always provides a lower bound for fX : The difference fX  X  0 is referred to as the Lagrange duality gap. When the gap is zero, i.e., strong duality holds, then  X is the exact optimal value of (10.80). Note that the function .u/ is concave on every convex subset of its domain as it is the lower envelope of a family of affine functions in u: Furthermore: Proposition 10.11 Assume X is compact while f0 .x/ and fi .x/ are continuous on X: Then .u/ is defined and concave on Rm .u/ D L.x; u/ then the vector C and if .fi .x/; i D 1; : : : ; m/ is a subgradient of  .u/ at u: Proof It is enough to prove the last part of the proposition. For any u 2 Rm C we have .u/ 

.u/  L.x; u/  L.x; u/ D

m X .ui  ui /fi .x/: iD1

t u

The rest is obvious.

The next three propositions indicate important cases when strong duality holds. Proposition 10.12 Assume f0 .x/ and fi .x/ are convex and finite on X: If the optimal value in (10.80) is not 1; and the Slater regularity condition holds, i.e., there exists at least one feasible solution x0 of the problem such that x0 2 riX;

fi .x0 / < 0 8i 2 I;

378

10 Nonconvex Quadratic Programming

where I D fi D 1; : : : ; mj fi .x/ is not affineg; then  X

\

D fX :

Proof This is a classical result from convex programming theory which can also be directly derived from Theorem 2.3 on convex inequalities. t u Proposition 10.13 In (10.80) where X D Rn ; assume that the concave function u 7! infx2Rn L.x; u/ attains its maximum at a point u 2 Rm C such that L.x; u/ is strictly convex. Then strong duality holds, so  X

\

D fX :

Proof This follows from Corollary 10.1 with C D Rn ; D D Rm C and Remark 10.1(a) by noting that by Proposition 10.3 L.x; u/ is coercive on Rn : t u Example 10.5 Consider the following minimization of a concave quadratic function under linear constraints min f50jjxjj2 C cT x j Ax  b; xi 2 Œ0; 1; i D 1; 2; : : : :; 10g;

x2R10

2

2 6 6 6 5 6 6 A D 6 5 6 6 4 9 5 8 7

1 0 8 3 5 3 0 9 4 5

3 3 0 1 8 8 1 8 9 1

2 6 3 8 9 2 3 9 7 1

(10.83)

3 2 2 9 3 7 7 7 0 9 7 7 9 3 5 3 2

c D .48; 42; 48; 45; 44; 41; 47; 42; 45; 46/T ; b D .4; 22; 3; 23; 12/T : The box constraint 0  xi  1; i D 1; : : : ; 10; can be written as 10 quadratic constraints x2i  xi  0; i D 1; 2; : : : ; 10; so actually there are in total 15 constraints in (10.83). By solving a SDP, it can be checked that maxy2R15 infx2R10 L.x; y/ is attained at y C such that L.x; y/ is strictly convex and attains its minimum at x D .1; 1; 0; 1; 1; 1; 0; 1; 1; 1/T which is feasible to (10.83). The optimal value is 47 with zero duality gap by Proposition 10.13. Proposition 10.14 In (10.80) whereP X D Rn assume there exists y 2 Rm C such that that the function L.x; y/ WD f .x/ C m iD1 yi gi .x/ is convex and has a minimizer x satisfying yi gi .x/ D 0; gi .x/  0 8i D 1; : : : ; m: Then strong duality holds, so  X

\

D fX :

10.5 Quadratic Problems with Quadratic Constraints

379

Proof It is readily verified that L.x; y/  L.x; y/  L.x; y/

8x 2 Rn ; 8y 2 Rm C;

so .x; y/ is a saddle point of L.x; y/ on Rn  Rm C ; and hence, by Proposition 2.35, max inf L.x; y/ D minn sup L.x; y/:

x2Rn y2Rm C

t u

x2R y2Rm C

Example 10.6 Consider the following nonconvex quadratic problem which arises from the so-called optimal beamforming in wireless communication (Bengtsson and Ottersen 2001): d X

min

xi 2RNi ;

iD1;2;:::;d

jjxi jj2 W xTi Rii xi C

iD1

X

xTj Rij xj C i  0; i D 1; 2; : : : ; d;

j¤i

(10.84) where 0 < i ;

0 Rij 2 RNi Ni ; i; j D 1; 2; : : : ; d;

(10.85)

so each xi is a vector in RNi while each Rij is a positive semidefinite matrix of dimension Ni  Ni . Applying Proposition 10.14 yields the following result (Tuy and Tuan 2013): Corollary 10.12 Strong Lagrangian duality holds for problem (10.84), i.e., min

sup L.x; y/ D max

xi 2RNi ; iD1;2;:::;d y2Rd C

where L.x; y/ D

d X iD1

xTi .I C

min

y2RdC xi 2RNi ; iD1;2;:::;d

X

yj Rji  yi Rii /xi C

d X

L.x; y/;

(10.86)

i yi :

iD1

j¤i

Proof It is convenient to precede the proof by a lemma. As usual the inequalities y > 0 or y < x for vectors y and x are component-wise understood. Lemma 10.4 For a given matrix R 2 Rdd with positive diagonal entries rii > 0 and nonpositive off-diagonal entries rij  0, i ¤ j, assume that there is y > 0 such that RT y > 0:

(10.87)

Then the inverse matrix R1 is positive, i.e., all its entries are positive. Proof Define a positive diagonal matrix D WD diagŒrii iD1;2;:::;d and nonnegative matrix G D D  R, so R D D  G. Then, (10.87) means  WD Dy  GT y > 0 and 0 < D1 GT y D y  D1  < y. Due to (10.87), without loss of generality

380

10 Nonconvex Quadratic Programming

we can assume rij < 0 so GD1 is irreducible. Let max Œ: stand for the maximal eigenvalue of a matrix. As D1 GT is obviously nonnegative, by Perron–Frobenius theorem (Gantmacher 1960, p.69), we can write max ŒGD1  D max ŒD1 GT  D min max

y>0 iD1;2;:::;d



max

iD1;2;:::;d

.D1 GT y/i yi

.D1 GT y/i < 1: yi

Again by Perron–Frobenius theorem, the matrix I  GD1 is invertible; moreover its inverse .I  GD1 /1 is positive. Then R1 D D1 .I  GD1 /1 is obviously positive. t u Turning to the proof of Corollary 10.12, let

WD sup

min

Ni y2RdC xi 2R ; iD1;2;:::;d

2 0 1 3 d d X X X 4 xTi @I C yj Rji  yi Rii A xi C i yi 5 ; iD1

iD1

j¤i

which means 9 ˇ ˇ = X ˇ i yi ˇˇ I C yj Rji  yi Rii 0; i D 1; 2; : : : ; d ;

D sup ; : ˇ y2RdC iD1 j¤i 8 d yi such that IC

X

381

yj Rji  yi Rii 0:

j¤i

Then setting

( yQ j D

yj j ¤ i yQ i j D i

yields a feasible solution yQ 2 RdC to (10.88) satisfying

d X

i yi
0; i D 1; 2; : : : ; d and there are xQ i 2 RNi , jjQxi jj D 1, i D 1; 2; : : : ; d such that 0 1 X xQ Ti @I C yj Rji  yi Rii A xQ i D 0; i D 1; 2; : : : ; d; (10.91) j¤i

hence, by setting R D Œrij i;jD1;2;:::;d with rii WD xQ Ti Rii xQ i and rij WD QxTj Rij xQ j , i ¤ j, RT y D .1; 1; : : : ; 1/T > 0: By Lemma 10.4, R1 is positive so p D R1  > 0, which is the solution of the linear equations X pj .QxTj Rij xQ j / D i ; i D 1; 2; : : : ; d: (10.92) pi .QxTi Rii xQ i /  j¤i

p With such p it is obvious that fxi WD pi xQ i ; i D 1; 2; : : : ; d} is feasible to (10.84) and satisfies (10.89), completing the proof of (10.86). t u Remark 10.10 The main result in Bengtsson and Ottersen (2001) states that the convex relaxation of (10.84), i.e., the convex program min

0 Xi 2RNi Ni ; iD1;2;:::;d



X

d X

Trace.Xi / W Trace.Xi Rii /

iD1

Trace.Xj Rij / C i ; i D 1; 2; : : : ; d;

(10.93)

j¤i

attains its optimal value at a rank-one feasible solution Xi D xi xTi , so (10.84) and (10.93) are equivalent. It can be shown that the Lagrangian dual of the convex

382

10 Nonconvex Quadratic Programming

problem (10.93) is precisely max min L.x; y/: In other words the above result of y2RdC xi 2RNi

Bengtsson-Ottersen is just equivalent to Corollary 10.12 which, besides, has been obtained here by a more straightforward and rigorous argument. Also note that the computation of the optimal solution of the nonconvex program (10.84), which is based on the optimal solution y of the convex Lagragian p dual (10.88) and pi xQ i ; i D 1; 2; : : : ; d by (10.91) and (10.92), is much more efficient than by using the optimal solutions of the convex program (10.93). This is because the convex program (10.93) has multiple optimal solutions, so its rankone optimal solution is not quite easily computed. We have shown above several examples where strong duality holds, so that the optimal value of the dual equals exactly the optimal value of the original problem. In the general case, a gap exists which in a sense reflects the extent to which the given problem fails to be convex and regular. Quite often, however, the gap can be reduced by gradually restricting the set X: More specifically, it may happen that: ( ) Whenever a nested sequence of rectangles M1  M2  : : : collapses to a \  singleton: \1 kD1 Mk D fxg then fMk  Mk ! 0: Under these circumstances, a BB algorithm for (10.80) can be developed in the standard way, such that: • Branching follows a selected exhaustive rectangular subdivision rule. • Bounding is by Lagrangian relaxation: for each rectangle M the value M is taken \ to be a lower bound for fM : • The candidate for further subdivision at iteration k is Mk 2 argminf M j M 2 Rk g where Rk denotes the collection of all partition sets currently still of interest. • Termination criterion: At each iteration k a new feasible solution is computed and the incumbent xk is set equal to the best among all feasible solutions so far obtained. If f .xk / D M k , then terminate (xk is an optimal solution). Theorem 10.5 Under assumption ( ) the above BB algorithm converges. In other words, whenever infinite, the algorithm generates a sequence fMk g such that M k % \

f \ WD fX :

Proof By the selection of Mk we have M k  f \ for all k: By exhaustiveness of the subdivision, if the procedure is infinite it generates at least one filter fMk g \ collapsing to a singleton fxg: Since M k  f \  fMk and since obviously M kC1     \ t u Mk for all k; it follows from Assumption ( ) that Mk % f :

10.5.3 Application a. Bounds by Lagrange relaxation Following Ben-Tal et al. (1994) let us apply the above method to the problem: .PX/

\

fX D minfcT yj A.x/y  b; y  0; x 2 Xg;

(10.94)

10.5 Quadratic Problems with Quadratic Constraints

383

where X is a rectangle in Rn ; and A.x/ is an m  p matrix whose elements aij .x/ are continuous functions of x: Clearly this problem includes as a special case the problem of minimizing a linear function under bilinear constraints. Since for fixed x the problem is linear in y we form the dual problem with respect to the constraints A.x/y  b; i.e., max minfhc; yi C hu; A.x/y  bij y  0; x 2 Xg:

(10.95)

u0

Lemma 10.5 The problem (10.95) can be written as .DPX/

maxfhb; uij hA.x/; ui C c  0 8x 2 Xg:

(10.96)

u0

Proof For each fixed u  0 we have minfhc; yi C hu; A.x/y  bij x 2 X; y  0g D min minfhc; yi C hu; A.x/y  big x2X y0

D hb; ui C minfhc; yi C hu; A.x/yij y  0g x2X

D hb; ui C h.u/; where

h.u/ D

0 if hA.x/; ui C c  0 8x 2 X 1 otherwise:

The last equality is due to the fact that the infimum of a linear function of y over the orthant y  0 is 0 or 1 depending on whether the coefficients of this linear function are all nonnegative or not. The conclusion follows. t u Denote the column j of A.x/ by Aj .x/ and for any given rectangle M  X define gM .u/ D min minŒhAj .x/; ui C cj : jD1;:::;p x2M

Since for each fixed x the function u 7! hAj .x/; ui C cj is affine, gM .u/ is a concave function and the optimal value of the dual (DPM) is  M

D maxfhb; uij gM .u/  0; u  0g:

Consider now a filter fMk g of subrectangles of X such that \1 kD1 Mk D fxg: For \ \     brevity write fk ; k ; x for fMk ; Mk ; fxg ; respectively. We have \

fk  minfhc; yij A.x/y  b; y  0g (since x 2 Mk 8k/ minfhc; yij A.x/y  b; y  0g D maxfhb; uij hA.x/; ui C c  0; u  0g (dual linear programs)  x

D maxfhb; uij hA.x/; ui C c  0; u  0g (definition of

 fxg /;

hence, \

fk 

 x:

(10.97)

384

10 Nonconvex Quadratic Programming

Also, setting L D fu 2 Rm C j hA.x/; ui C c  0g; Lk D fu 2 Rm C j hA.x/; ui C c  0 8x 2 Mk g we can write  k

 x

D maxfhb; uiju 2 Lk g;  1

L1  : : :  Lk  : : :  L;

D maxfhb; uij u 2 Lg

 ::: 

 k

 x:

 ::: 

Proposition 10.15 Assume B1:

.8x 2 X/ .9u 2 Rm C / hA.x/; ui C c > 0:

If fMk g is any filter of subrectangles of X such that \1 kD1 Mk D fxg; then \

fk  Proof For all k we have, by (10.97), to show that  k

%

 x

 k

! 0:  k

\

 f \  fk 

 x;

so it suffices

.k ! C1/:

Denote .x; u/ D hA.x/; ui C c 2 Rm : Assumption B1 ensures that int L ¤ ; and .x; u/ is never identical to zero. Therefore, if u 2 intL then .x; u/ > 0 (i.e., i .x; u/ > 0 8i D 1; : : : ; m/ and since .x; u/ is continuous in x one must have .x; u/ > 0 for all x in a sufficiently small ball around x; hence for all x 2 Mk with sufficiently large k (since maxx2Mk kx  xk ! 0 as k ! C1/: Thus, if u 2 intL then u 2 Lk for all sufficiently large k: Keeping this in mind let " > 0 be an arbitrary positive number and u an arbitrary element of L: Since L is a polyhedron with nonempty interior, there exists a sequence fut g  intL such that u D limt!C1 ut : Hence, there exists ut 2 intL such that hb; ut i  hb; ui  ": Since ut 2 intL; for all sufficiently large k we have, by the above, ut 2 Lk and, consequently, hb; ut i   k : Thus  k  hb; ui C " for all sufficiently large k: Since " > 0 and u 2 L are arbitrary, this shows that k ! x ; completing the proof. t u Thus, under Assumption B1, Condition ( ) is fulfilled. It then follows from Theorem 10.5 that Problem .PX/ can be solved by a convergent BB procedure using the bounds by Lagrangian relaxation and an exhaustive rectangular subdivision process. For the implementation of this BB algorithm note that the dual problem .DPX/ is a semi-infinite linear program which in the general case may not be easy to solve. However, if we assume, additionally, that for u  0:

10.5 Quadratic Problems with Quadratic Constraints

B2:

385

Every function hAj .x/; ui C cj ; j D 1; : : : p; is quasiconcave

then the condition hAj .x/; ui C cj  0 8x 2 M is equivalent to hAj .v/; ui C cj  0 8v 2 V.M/; where V.M/ denotes the vertex set of the rectangle M: Therefore, under Assumption B2, .DPM/ merely reduces to the linear program maxfhb; uij hAj .v/; ui C cj  0 u0

v 2 V.M/; j D 1; : : : ; pg:

(10.98)

If M D Œr; s then the set V.M/ consists of all points v 2 Rn such that vj 2 frj ; sj g; j D 1; : : : ; n: Thus,under Assumptions B1 and B2 the problem (PX) can be solved by the following algorithm: BB Algorithm for (PX) Initialization. Let .x1 ; y1 / be an initial feasible solution and 1 D hc; y1 i (found, e.g., by local search). Set M0 D X; S1 D P1 D fM0 g: Set k D 1: Step 1.

For each rectangle M 2 Pk solve maxfhb; uij hAj .v/; ui C cj  0 v 2 V.M/; j D 1; : : : ; p; u  0g

Step 2. Step 3. Step 4.

Step 5. Step 6.

to obtain M : Delete every rectangle M 2 Sk such that M  k  ": Let Rk be the collection of remaining rectangles. If Rk D ; then terminate: .xk ; yk / is an "-optimal solution. Let Mk 2 argminf M j M 2 Rk g: Compute a new feasible solution by local search from the center of Mk (or by solving a linear program, see Remark 10.11 below). Let .xkC1 ; ykC1 / be the new incumbent, kC1 D hc; ykC1 i: Bisect Mk upon its longest side. Let PkC1 be the partition of Mk : Set SkC1 D .Rk n fMk g/ [ PkC1 ; k k C 1 and return to Step 1.

Proposition 10.16 The above algorithm can be infinite only if " D 0: In the latter case, M k % min.PX/; and any filter of rectangles fMk g determines an optimal solution .x; y/; such that fxg D \1 D1 Mk and y is an optimal solution of Pfxg : Proof Immediate from Theorem 10.5 and Proposition 10.15.

t u

Remark 10.11 (i) If xk is any point of Mk then an optimal solution yk of the linear program minfhc; yij A.xk /y  b; y  0g yields a feasible solution .xk ; yk /: (ii) For definiteness we assumed that a rectangular subdivision is used but the above development is also valid for simplicial subdivisions (replace everywhere

386

10 Nonconvex Quadratic Programming

“rectangle” by “simplex”). For many problems, simplicial subdivision is even more efficient because in that case jV.M/j D n C 1 instead of jV.M/j D 2n as above, so that (10.98) will involve much less constraints. Example 10.7 ConsiderPthe Pooling and PBlending Problem formulated in Example 4.4. Setting xil D qil j ylj and flk D i Cik qil ; it can be reformulated as P min  j j s.t. P P P .dj  i ci qil /ylj C i .di  ci /zij D j l P P P P P P q y C j zij  Ai ; j ylj  Sl ; l ylj C i zij  Dj P Pl Pj il lj . C q  P /y C .C  P /z  0 ik il jk lj ik jk ij l i P i qil  0; i qil D 1; ylj  0; zij  0: In this form it appears as a problem of type P .PX/ since for fixed q it is linear in .y; z; /; while the set X D fqj qil  0; i qil D 1g is a simplex. It is readily verified that conditions B1 and B2 are satisfied, so the above method applies.

b. Bounds via Semidefinite Programming Lagrange relaxation has also been used to derive good and cheaply computable lower bounds for the optimal value of problem (10.80) where X is a polyhedron and 1 1 hx; Q0 xi C hc0 ; xi C d0 ; fi .x/ D hx; Qi xi C hci ; xi C di : 2 2 The Lagrangian is f0 .x/ D

L.x; u/ D

1 hx; Q.u/xi C hc.u/; xi C d.u/ 2

with Q.u/ D Q0 C

m X

ui Qi ; c.u/ D c0 C

iD1

m X

ui ci ; d.u/ D d0 C

iD1

m X

ui di :

iD1

Define D D fu 2 Rm C j Q.u/ 0g;

D D fu 2 Rm C j Q.u/ 0g;

where for a matrix A the notation A 0 .A 0/ means that A is positive definite (positive semidefinite, resp.). Obviously Q.u/ 0 , minfhx; Q.u/xij kxk D 1g > 0 Q.u/ 0 , minfhx; Q.u/xij kxk D 1g  0 so D; D are convex sets (when nonempty) and D is the closure of D: If u 2 D then finding the value .u/ D inf fL.x; u/g x2X

10.5 Quadratic Problems with Quadratic Constraints

387

is a convex minimization problem which can be solved by efficient algorithms. But if u … D; then computing .u/ is a difficult global optimization problem. Therefore, instead of  D supu2Rm .u/ Shor (1991) proposes to consider the C weaker estimate 

D sup .u/ 



:

u2D 

D  because then u … D implies Note that in the special case X D Rn ; we have .u/ D 1: For simplicity below we assume X D Rn : Proposition 10.17 If



D supu2D

.u/ is attained on D then



D f \:

Proof Let .u/ D L.x; u/ D supu2D .u/: By Proposition 10.11 the vector .fi .x/; i D 1; : : : ; m/ is a subgradient of  .u/ at u: If the concave function .u/ attains a maximum at a point u 2 D then since u is an interior point of D one must have fi .x/  0; i D 1; : : : ; m: That is, x is feasible and since  D L.x; u/  f \ one must have  D f \ : t u Thus, to compute a lower bound, one way is to solve the problem supf .u/j Q.u/ 0; u D .u1 ; : : : ; um /  0g; where

(10.99)

.u/ is the concave function 1 .u/ D infn hx; Q.u/xi C hc.u/; xi C d.u/: x2R 2

Remark 10.12 The problem (10.99) is a concave maximization (i.e., convex minimization) under convex constraints. It can also be written as max t Since

s.t.

.u/  t  0; Q.u/ 0; u D .u1 ; : : : ; um /  0:

.u/  t  0 if and only if 1 hx; Q.u/xi C hc.u/; xi C d.u/  t  0 2

8x 2 Rn

it is easy to check that (10.99) is equivalent to max t j



 Q.u/ c.u/ 0; .c.u//T 2.d.u/  t/

u D .u1 ; : : : ; um /  0 :

Thus (10.99) is actually a linear program under LMI constraints . For this problem several efficient algorithms are currently available (see, e.g., Vandenberge and Boyd (1996) and the references therein). However, it is not known whether the bounds computed that way are consistent with a rectangular or simplicial subdivision process and can be used to produce convergent BB algorithms.

388

10 Nonconvex Quadratic Programming

We will have to get back to quadratic optimization under quadratic constraints in Chap. 11 on polynomial optimization. There a SIT Algorithm for this class of problems will be presented which is the most suitable when a feasible solution is available and we want to check whether it is already optimal and if not, to find a better feasible solution.

10.6 Exercises 1 Let D D fx 2 Rn j hai ; xi  ˛i

i D 1; : : : ; mg be a polytope. P i j 1. Show that there exist ij  0 such that hx; xi D  m i;jD1 ij ha ; xiha ; xi: (Hint: use (2) of Exercise 11, Chap. 1, to show that the convex hull of a1 ; : : : ; am contains 0 in its interior; then for every i D 1; : : : ; n; the ith unit vector ei is a nonnegative combination of a1 ; : : : ; am /: 2. Deduce that for any quadratic functionPf .x/ there exists  such that for all i j r >  the quadratic function f .x/ C r m i;jD1 ij .ha ; xi  ˛i /.ha ; xi  ˛j / is convex, where ij are defined as in (1).

2 Consider the problem minff .x/j x 2 Xg where X is a polyhedron, and f .x/ is \ an indefinite quadratic function. Denote the optimal value of this problem by fX : Given any rectangle M D Œp; q; assume that X \ M D fx 2 Rn jP hai ; xi  ˛i i D i 1; : : : ; mg: Show that for sufficiently large r the function f .x/ C r m i;jD1 ij .ha ; xi  \

˛i /.haj ; xi  ˛j / is a convex minorant of f .x/ over M and a lower bound for fM D minff .x/jPx 2 X \Mg is given by the unconstrained minimum of the convex function i j n f .x/ C r m i;jD1 ij .ha ; xi  ˛i /.ha ; xi  ˛j / over R : (Hint: see Exercise 1). 3 Solve the indefinite quadratic program: min 3.x  0:5/2  2y2 s.t. x C 2y  1; 0:5  x  1; 0:75  y  2: 4 Solve the generalized linear program min 3x1 C 4x2 s.t. y11 x1 C 2x2 C x3 D 5; y12 x1 C x2 C x4 D 3 x 0; x2  0I y11  y12  1I 0  y11  2; 0  y12  2: 5 Solve the problem minf0:25x1  0:5x2 C x1 x2 j x1  0:5; 0  x2  0:5g: 6 Solve the problem minfx1 C x1 x2  x2 j  6x1 C 8x2  3I 3x1  x2  3I 0  x1  5I 0  x2  5g:

10.6 Exercises

389

7 Solve the problem     x 1 2 x1 C Œ1; 1 1 s.t. x2 x2 1 1    1 1 x1 Œx1 ; x2   6; x1 C x2  4; 0  x1  4; 0  x2  4: x2 32 

min Œx1 ; x2 

8 Consider a quadratic program minff .x/j x 2 Dg; where D is a polytope in Rn ; f .x/ D 12 hx; Qxi C hc; xi and Q is a symmetric matrix of rank k  n: Show that there exist k linearly independent vectors a1 ; : : : ; ak such that for fixed values of t1 ; : : : ; tk the problem minff .x/j x 2 D; hai ; xi D ti ; i D 1; : : :g becomes a linear program. Develop a decomposition method for solving this problem (see Chap. 7). 9 Let G D G.V; E/ be an undirected graph, where V D f1; 2; : : : ; ng denotes the set of vertices, E the set of edges. A subset V 0  V is said to be stable if no two distinct members of V 0 are adjacent, i.e., if .i; j/ … E for every i; j 2 V 0 such that i ¤ j: Let ˛.G/ be the cardinality of the largest stable set in G: Show that ˛.G/ is the optimal value in the quadratic problem; ( n ) X max xi j xi xj D 0 8.i; j/ 2 E/; xi .xi  1/ D 0 8i D 1; : : : ; n/ : x

iD1

1. Using the Lagrangian relaxation method show that an upper bound for ˛.G/ is given by " 

D inff ./j Q./ > 0g;

./ D max 2 x

#

n X

xi  hx; Q./xi ;

iD1

where Q./ D ŒQij  is a symmetric matrix such that Qij D 0 if i ¤ j; .i; j/ … E and Qij D ij for every other .i; j/: 2. Show that ./ D './ where ( './ D minŒhx; Q./xij x

n X

) 1 xi D 1

iD1

(Shor and Stetsenko 1989). Hint: show that for any positive definite matrix A and vector b 2 Rn : maxx Œ2hb; xi  hx; Axi D fminx Œhx; Axij hb; xi D 1g1 : 10 Let A0 ; A1 ; A2 ; B be symmetric matrices of order n; and a; b two positive numbers such that a < b: Consider the problem  WD inff j A0 C uA1 C vA2 C B < 0; u D x; hx; vi  ; a  x  b; > 0g:

390

Show that

10 Nonconvex Quadratic Programming

4ab ˛    ˛; where ˛ is the optimal value of the problem .a C b/2 minf j zA0 C uA1 C vA2 C B < 0 4u C .a C b/2 v  4.a C b/ ; a  u  b;  0g:

Hint: Observe that in the plane .x; v/ the line segment f.x; v/j 4 x C .a C b/2 v D 4.p C q/ ; a  x  bg is tangent to the arc f.x; v/j hx; vi D ; a  x  bg at point 2 4ab . aCb ; aCb /; and is the chord of the arc f.x; v/j hx; vi D .aCb/ 2 ; a  x  bg: 2 11 Show that the water distribution network problem formulated in Exercise 8, Chap. 4, is of type (PX) and can be solved by a BB method using Lagrange relaxation for lower bounding. 12 Show that the objective function f .q; H; J/ in the water distribution network problem (Exercise 8, Chap. 4) is convex in J and concave in .q; H/: Given any M rectangle M D f.q; H/j qM  q  qM ; H M  H  H g we have f .qM ; H M ; J/

 f .q; H; J/

 =c/  KLi .qi =c/  KLi .qM KLi .qM i =c/ ; i D 1; : : : ; s i

for all .q; H/ 2 M: Therefore, a lower bound for the objective function value over all feasible solutions .q; H; J/ with .q; H/ 2 M can be computed by solving a convex program. Derive a BB method for solving the problem based on this lower bounding.

Chapter 11

Monotonic Optimization

By their very nature, many nonconvex functions encountered in mathematical modeling of real world systems in a broad range of activities, including economics and engineering, exhibit monotonicity with respect to some variables (partial monotonicity) or to all variables (total monotonicity). In this chapter we will present a mathematical framework for solving optimization problems that involve increasing functions, and more generally, functions representable as differences of increasing functions. Monotonic optimization was first considered in Rubinov et al. (2001) and extensively developed in the two main papers Tuy (2000a) and Tuy et al. (2006).

11.1 Basic Concepts 11.1.1 Increasing Functions, Normal Sets, and Polyblocks For any two vectors x; y 2 Rn we write x  y .x < y; resp.) and say that y dominates x ( y strictly dominates x, resp.) if xi  yi (xi < yi resp.) for all i D 1; : : : ; n: If a  b then the box Œa; b is the set of all x such that a  x  b: We also write .a; b WD fxj a < x  bg; Œa; b/ WD fxj a  x < bg: As usual e is the vector of all ones and ei the i-th unit vector of the space under consideration, i.e., the vector such that eii D 1; eij D 0 8j ¤ i: A function f W RnC ! R is said to be increasing if f .x/  f .x0 / whenever 0  x  x0 I strictly increasing if, in addition, f .x/ < f .x0 / whenever x < x0 : A function f is said to be decreasing if f is increasing. Many functions encountered in various applications are increasing in this sense. Outstanding examples are the production functions and the utility functions in mathematical economics (under the assumptions that all goods are useful), polynomials (in particular quadratic functions) with nonnegative coefficients, and posynomials © Springer International Publishing AG 2016 H. Tuy, Convex Analysis and Global Optimization, Springer Optimization and Its Applications 110, DOI 10.1007/978-3-319-31484-6_11

391

392

11 Monotonic Optimization m X jD1

cj

n Y

a

xi ij

with cj  0 and aij  0:

iD1

Q (in particular, Cobb–Douglas functions f .x/ D i xai i ; ai  0). The following obvious proposition shows that the class of increasing functions includes a large variety of functions: Proposition 11.1 (i) If f1 ; f2 are increasing functions, then for any nonnegative numbers 1 ; 2 the function 1 f1 C 2 f2 is increasing. (ii) The pointwise supremum of a bounded above family .f˛ /˛2A of increasing functions and the pointwise infimum of a bounded below family .f˛ /˛2A of increasing functions are increasing. Nontrivial examples of increasing functions given by this proposition are functions of the form f .x/ D supy2a.x/ g.y/; where g W RnC ! R is an arbitrary function and a W RnC ! RnC is a set-valued map with bounded images such that a.x0 /  a.x/ for x0  x: A set G  RnC is called normal if Œ0; x  G whenever x 2 GI in other words, if 0  x0  x 2 G ) x0 2 G: The empty set, the singleton f0g; and RnC are special normal sets which we shall refer to as trivial normal sets in RnC : If G is a normal set then G [ fx 2 RnC j xi D 0g for some i D 1; : : : ; n is still normal. Proposition 11.2 For any increasing function g.x/ on RnC and any ˛ 2 R the set G D fx 2 RnC j g.x/  ˛g is normal and it is closed if g.x/ is lower semi-continuous. Conversely, for any closed normal set G  RnC with nonempty interior there exists a lower semi-continuous increasing function g W RnC ! R and ˛ 2 R such that G D fx 2 RnC j g.x/  ˛g: Proof We need only prove the second assertion. Let G be a closed normal set with nonempty interior. For every x 2 RnC define g.x/ D inff > 0j x 2 Gg: Since intG ¤ ; there is u > 0 such that Œ0; u  G: The halfline fxj   0g intersects Œ0; u  G; hence 0  g.x/ < C1: Since for every  > 0 the set G is normal, if 0  x  x0 2 G, then x 2 G; too, so g.x0 /  g.x/; i.e., g.x/ is increasing. We n show that G D fx 2 TC j g.x/  1g: In fact, if x 2 G, then obviously g.x/  1: Conversely, if x … G, then since G is closed there exists ˛ > 1 such that x … ˛GI hence, since G is normal, x … G 8  ˛; which means that g.x/  ˛ > 1: Consequently, G D fx 2 RnC j g.x/  1g: It remains to prove that g.x/ is lower semicontinuous. Let fxk g  RnC be a sequence such that xk ! x0 and g.xk /  ˛ 8k: Then for any given ˛ 0 > ˛ we have inffj xk 2 Gg < ˛ 0 8k; i.e., xk 2 ˛ 0 G 8kI hence x0 2 ˛ 0 G in view of the closedness of the set ˛ 0 G: This implies g.x0 /  ˛ 0 and since ˛ 0 can be taken arbitrarily close to ˛; we must have g.x0 /  ˛; as was to be proved. t u A point y 2 Rn is called an upper boundary point of a normal set G if y 2 clG and there is no point x 2 G such that x D a C .y  a/ with  > 1: The set of all upper boundary points of G is called its upper boundary and is denoted @C G:

11.1 Basic Concepts

393

A point y 2 clG is called a Pareto point of G if x 2 G; x  y ) x D y: Since such a point necessarily belongs to @C G; it is also called an upper extreme point of G: Proposition 11.3 If G  Œa; b is a compact normal set with nonempty interior, then for every point z 2 Œa; b n fag the line segment joining a to z meets @C G at a unique point G .z/ defined by G .z/ D a C .z  a/;

 D maxf˛ > 0j a C ˛.z  a/ 2 Gg:

(11.1)

Proof If u 2 intG then Œa; u  G and the set f˛ > 0j aC˛.za/ 2 Gg is nonempty. So the point G .z/ defined by (11.1) exists and is unique. Clearly G .z/ 2 G and for ˛ >  we have a C .z  a/ … G; hence G .z/ is in fact the point where the line segment joining a to z meet @C G: t u A set H  RnC is called conormal (or sometimes, reverse normal) if x C RnC  H whenever x 2 H: It is called conormal in a box Œa; b if b  x0  x  a; x 2 H implies x0 2 H; or equivalently, if x … H whenever a  x  x0  b; x0 … H: Clearly a set H is conormal if and only if the set H [ WD RnC n H is normal. The following propositions are analogous to Propositions 11.2 and 11.3 and can be proved by similar arguments: 1. For any increasing function h.x/ on RnC and any ˛ 2 R the set H D fx 2 RnC j h.x/  ˛g is normal and is closed if h.x/ is upper semi-continuous. Conversely, for any closed conormal set H  RnC such that RnC n H has a nonempty interior there exist an upper semi-continuous increasing function h W RnC ! R and ˛ 2 R such that H D fx 2 RnC j h.x/  ˛g: A point y 2 Œa; b is called a lower boundary point of a conormal set H  Œa; b if y 2 clH and there is no point x 2 H such that x D a C .y  a/ with  < 1: The set of all lower boundary points of H is called its lower boundary and is denoted @ H: A point y 2 clH is called a Pareto point or a lower extreme point of H if x 2 H; x  y ) x D y: 2. H  Œa; b is a closed conormal set with b 2 intH, then for every point z 2 Œa; b n H the line segment joining z to b meets @ H at a unique point H .z/ defined by H .z/ D b  .z  b/;

 D maxf˛ > 0j b  ˛.z  b/ 2 Hg:

(11.2)

Given a set A  RnC the whole orthant RnC is a normal set containing A: The intersection of all normal sets containing A; i.e., the smallest normal set containing A; is called the normal hull of A and denoted Ae : The conormal hull of A, denoted bA, is the smallest conormal set containing A: Proposition 11.4 (i) The normal hull of a set A  Œa; b is the set Ae D [z2A Œa; z: If A is compact then so is Ae : (ii) The conormal hull of a set A  Œa; b is the set bA D [z2A Œz; b: If A is compact then so is bA:

394

11 Monotonic Optimization

Proof It suffices to prove (i), because the proof of (ii) is similar. Let P D [z2A Œa; z: Clearly P is normal and P  A; hence P  Ae : Conversely, if x 2 P then x 2 Œa; z for some z 2 A  Ae ; hence x 2 Ae by normality of Ae ; so P  Ae ; proving the equality P D Ae : If A is compact then for any sequence fxk g  Ae we have xk 2 Œa; zk  with zk 2 A so xk  A; hence, up to a subsequence, xk ! x0 2 A  Ae ; proving the compactness of Ae : t u Proposition 11.5 A compact normal set G  Œa; b has at least one upper extreme point and is equal to the normal hull of the set V of its upper extreme points. Proof For each i D 1; : : : ; n an extreme point of the line segment fx 2 Gj x D ei ;   0g is an upper extreme point of G: So V ¤ ;: We show that G D V e : Since V  @C G; we have V e  .@C G/e  G; so it suffices to show that G  V e : If y 2 G then G .y/ 2 @C G; so y 2 .@C G/e : Define x1 2 argmaxfx1 j x 2 G; x  yg; and xi 2 argmaxfxi j x  xi1 g for i D 2; : : : ; n: Then v WD xn 2 G and v  x for all x 2 G satisfying x  y: Therefore, x 2 G; x  v ) x D v: This means that y  v 2 V; hence y 2 V e : Consequently, G  V e ; as was to be proved. t u A polyblock P is the normal hull of a finite set V  Œa; b called its vertex set. By the above proposition P D [z2V Œa; z: A point z 2 V is called a proper vertex of P if there is no z0 2 V n fzg such that z0  z: It is easily seen that a proper vertex of a polyblock is nothing but a Pareto point of it. The vertex set of a polyblock P is denoted by vertP; the set of proper vertices by pvertP: A vertex which is not proper is called improper. Clearly a polyblock is fully determined by its proper vertex; more precisely, P D .pvertP/e ; i.e., a polyblock is the normal hull of its proper vertex set (Fig. 11.1). Similarly, a copolyblock (or reverse polyblock) Q is the conormal hull of a finite set T  Œa; b; i.e., Q D [z2T Œz; b: The set T is called the vertex set of the copolyblock. A proper vertex of a copolyblock Q is a vertex z for which there is no vertex z0 ¤ z such that z0  z: A vertex which is not proper is said to be improper. A copolyblock is fully determined by its proper vertex set, i.e., a copolyblock is the conormal hull of its proper vertex set.

H G

Fig. 11.1 Polyblock, Copolyblock

11.1 Basic Concepts

395

Proposition 11.6 (i) The intersection of finitely many polyblocks is a polyblock. (ii) The intersection of finitely many copolyblocks is a copolyblock. Proof If V1 ; V2 are the vertex sets of two polyblocks P1 ; P2 , respectively, then P1 \ P2 D .[z2V1 Œa; z/ \ .[y2V2 Œa; y/ D [.z;y/2V1 V2 Œa; z \ Œa; y D [.z;y/2V1V2 Œa; z ^ y where u D z ^ y means ui D minfzi ; yi g 8i D 1; : : : ; n: Similarly if T1 ; T2 are the vertex sets of two copolyblocks Q1 ; Q2 , respectively, then Q1 \ Q2 D [.z;y/2T1 T2 Œz; b \ Œy; b D [.z;y/2T1 T2 Œz _ y; b where v D z _ y means vi D maxfzi ; yi g 8i D 1; : : : ; n: t u Proposition 11.7 If y 2 Œa; b/ then the set Œa; b n .y; b is a polyblock with proper vertices ui D b C .yi  bi /ei ; i D 1; : : : ; n:

(11.3)

Proof We have .y; b D [niD1 fx 2 Œa; bj xi > yi g; so Œa; b n .y; b D [niD1 .x 2 Œa; bj xi  yi g D [niD1 Œa; ui : t u

11.1.2 DM Functions and DM Constraints A function f W RnC ! R is said to be a dm (difference-of-monotonic) function if it can be represented as a difference of two increasing functions: f .x/ D f1 .x/  f2 .x/; where f1 ; f2 are increasing functions. Proposition 11.8 If f1 .x/; f2 .x/ are dm then (i) for any 1 ; 2 2 R the function 1 f1 .x/ C 2 f2 .x/ is dm; (ii) the functions maxff1 .x/; f2 .x/g and minff1 .x/; f2 .x/g are dm. Proof (i) is trivial. To prove (ii) let fi D fi .x/  qi .x/; i D 1; 2; with fi ; qi increasing functions. Since f1 D p1 C q2  .q1 C q2 /; f2 D p2 C q1  .q1 C q2 / we have maxff1 ; f2 g D maxfp1 C q2 ; p2 C q1 g  .q1 C q2 /; minff1 ; f2 g D minfp1 C q2 ; p2 C q1 g  .q1 C q2 /: t u Proposition 11.9 The following functions are dm on Œa; b  RnC :

P ˛ (i) any polynomial, and more generally, any synomial P.x/ D ˛ c˛ x ; where ˛ ˛ ˛ 1 2 ˛n ˛ D .˛1 ; : : : ; ˛n /  0 and x D x1 x2    xn ; c˛ 2 R: (ii) any convex function f W Œa; b  RnC ! R: (iii) any Lipschitz function f W Œa; b ! R with Lipschitz constant K: (iv) any C1 -function f W Œa; b  RnC ! R: Proof (i) P.x/ D P1 .x/  P2 .x/ where P1 .x/ is the sum of terms c˛ x˛ with c˛ > 0 and P2 .x/ is the sum of terms c˛ x˛ with c˛ < 0:

396

11 Monotonic Optimization

(ii) Since the set [a K is increasing on Œa; b because one has, for a  x  x0  b: g.x0 /  g.x/ D M

n X .x0i  xi / C f .x0 /  f .x/ iD1

n X M .x0i  xi /  Kjx0  xj  0: iD1

P (iv) For M > K WD supx2Œa;b krf .x/k the function g.x/ D f .x/ C M niD1 xi is increasing. In fact, by the mean-value theorem, for a  x  x0  b there exists 2 .0; 1/ such that f .x0 /  f .x/ D hrf .x C .x0  x//; x0  xi; hence g.x0 /  g.x/  M

Pn

0 iD1 .xi

 xi /  K

Pn

0 iD1 .xi

 xi /  0:

t u

As a consequence of Proposition 11.9, (i), the class of dm functions on Œa; b is dense in the space CŒa;b of continuous functions on Œa; b with the supnorm topology. A function f .x/ is said to be locally increasing (locally dm, resp.) on RnC if for every x 2 RnC there is a box Œa; b containing x in its interior such that f .x/ is increasing (dm, resp.) on Œa; b: Proposition 11.10 A locally increasing function on RnC is increasing on RnC : A locally dm function on a box Œa; b is dm on this box. Proof Suppose f .x/ is locally increasing and let 0  x  x0 : Since Œx; x0  is compact there exists a finite covering of it by boxes such that f .x/ is increasing on each of these boxes. Then the segment  joining x to x0 can be divided into a finite number of intervals: x1 D x  x2      xm D x0 such that f .xi /  f .xiC1 /; i D 1; : : : ; m  1: Hence, f .x/  f .x2 /      f .x0 /:

11.1 Basic Concepts

397

Suppose now that f .x/ is locally dm on Œa; b: By compactness the box Œa; b can be covered by a finite number of boxes Œp1 ; q1 ;    ; Œpm ; qm  such that f .x/ D ui .x/ vi .x/ 8x 2 Œpi ; qi ; with ui ; vi increasing functions on Œp1 ; qi : For x 2 Œpi ; qi \Œpj ; qj  we have ui .x/  vi .x/ D uj .x/  vj .x/ D f .x/; u.x/ D ui .x/ C P so we can define P i i i i maxf0; u .q /  u .q /g; v.x/ D v .x/ C maxf0; u .q /  u j i i j i .q /g for every j¤i j¤i i i x 2 Œp ; q : These two functions are increasing and obviously f .x/ D u.x/  v.x/: t u

11.1.3 Canonical Monotonic Optimization Problem A monotonic optimization problem is an optimization problem of the form maxff .x/j gi .x/  0; i D 1; : : : ; m; x 2 Œa; bg;

(11.4)

where the objective function f .x/ and the constraint functions gi .x/ are dm functions on the box Œa; b  Rn : For the purpose of optimization an important property of the class of dm functions is that, like the class of dc functions, it is closed under linear operations and operations of taking pointwise maximum and pointwise minimum. Exploiting this property we have the following proposition: Theorem 11.1 Any monotonic optimization problem can be reduced to the canonical form .MO/

maxff .x/j g.x/  0  h.x/; x 2 Œa; bg

(11.5)

where f ; g; h are increasing functions. Proof Consider a monotonic optimization problem in the general form (11.4). Since the system of inequalities gi .x/  0; i D 1; : : : ; m; is equivalent to the single inequality g.x/ WD maxiD1;:::;m gi .x/  0; where g.x/ is dm, we can rewrite the problem as maxff1 .x/  f2 .x/j g1 .x/  g2 .x/  0; x 2 Œa; bg; where f1 ; f2 ; g1 ; g2 are increasing functions. Introducing two additional variables z; t we can further write maxff1 .x/ C zj f2 .x/ C z  0; g1 .x/ C t  0  g2 .x/ C t; z 2 Œf2 .b/; f2 .a/; t 2 Œg2 .b/; g1 .a/; x 2 Œa; bg: Finally, replacing the system of two inequalities f2 .x/ C z  0; g1 .x/ C t  0 by the single inequality maxff2 .x/ C z; g1 .x/ C tg  0 and changing the notations yield the canonical form (MO). t u

398

11 Monotonic Optimization

Remark 11.1 A minimization problem such as minff .x/j g.x/  0  h.x/; x 2 Œa; bg can be converted to the form (MO). In fact, by setting x D a C b  y; fQ .y/ D f .a C b  y/; gQ .y/ D g.a C b  y/; it can be rewritten as Q maxffQ .y/j h.y/  0  gQ .y/; y 2 Œa; bg: For simplicity we will assume in the following that f .x/; h.x/ are upper semicontinuous, while g.x/ is lower semi-continuous. Then the set G WD fx 2 Œa; bj g.x/  0g is a compact normal set, H WD fx 2 Œa; bj h.x/  0g is a compact conormal set and the problem (MO) can be rewritten as maxff .x/j x 2 G \ Hg

(11.6)

Proposition 11.11 If G \ H ¤ ; the problem (MO) has at least an optimal solution which is a Pareto point of G: Proof Since the feasible set G \ H is compact, while the objective function f .x/ is upper semi-continuous, the problem has an optimal solution x: If x is not a Pareto point of G then the set Gx WD fx 2 Gj x  xg is a normal set in the box Œx; b and any Pareto point xQ of this set is clearly a Pareto point of G and f .Qx/ D f .x/; i.e., xQ is also optimal. t u Proposition 11.12 Let G  Œa; b be a compact normal set and z 2 Œa; b n G; y D G .z/: Then the polyblock P with proper vertices ui D b C .yi  b/ei ; i D 1; : : : ; n

(11.7)

strictly separates z from G; i.e., P  G; z 2 P n G: Proof Recall that G .z/ denotes the point where the line segment from a to z meets the upper boundary @C G of G; so y < z: By Proposition 11.7 Py D Œa; bn.y; b: u t The next corollary shows that with respect to compact normal sets polyblocks behave like polytopes with respect to compact convex sets. Corollary 11.1 Any compact normal set is the intersection of a family of polyblocks. In other words, a compact normal set can be approximated as closely as desired by a polyblock. Proof Clearly, if G  Œa; b then P0 D Œa; b is a polyblock containing G: Therefore, the family I of polyblocks containing G is not empty and G  \i2I Pi : If there were x 2 \i2I Pi n G then by the above proposition there would exist a polyblock P  G such that x … P; a contradiction. t u

11.2 The Polyblock Algorithm

399

On the basis of these properties, one method for solving problem (MO) is to generate inductively a nested sequence of polyblocks outer approximating the feasible set: Œa; b WD P0  P1      G \ H; in such a way that maxff .x/j x 2 Pk g & maxff .x/j x 2 G \ Hg: This can be done as follows. Start with the polyblock P1 D Œa; b: At iteration k a polyblock Pk  G has been constructed with proper vertex set Tk : If Tk0 D Tk \ H D ; stop: the problem is infeasible. Otherwise let zk 2 argmaxff .x/j x 2 Tk0 g: Since f .x/ is an increasing function, f .zk / is the maximum of f .x/ over Pk \ H  G \ H: If zk 2 G, then stop: zk 2 G \ H; i.e., zk is feasible, so f .zk / is also the maximum of f .x/ over G \ H; i.e., zk solves the problem. If, on the other hand, zk … G, then compute xk D G .zk / and let PkC1 D .Œa; b n .xk ; b/ \ Pk : By Proposition 11.7 Œa; b n .xk ; b is polyblock, so the polyblock PkC1 satisfies G  PkC1  Pk n fzk g: The procedure can then be repeated at the next iteration. This method is analogous to the outer approximation method for minimizing a convex function over a convex set. Although it can be proved that, under mild conditions, as k ! C1 the sequence xk generated by this method converges to a global optimal solution of (MO), the convergence is generally slow. To speed up convergence the method should be improved in two points: (1) bounding: using valid cuts reduce the current box Œp; q in order to have tighter bounds; (2) computing the set TkC1 (proper vertex set of polyblock PkC1 /: not from scratch, but by deriving it from Tk :

11.2 The Polyblock Algorithm 11.2.1 Valid Reduction At any stage of the solving process, some information may have been gathered about the global optimal solution. To economize the computational effort this information should be used to reduce the search domain by removing certain unfit portions where the global optimal solution cannot be expected to be found. The subproblem which arises is the following. Given a number 2 f .G \ H/ (the current best feasible objective function value) and a box Œp; q  Œa; b check whether the box Œp; q contains at least one solution to the system fx 2 Œp; qj g.x/  0  h.x/; f .x/  g ¤ ;

(11.8)

400

11 Monotonic Optimization

and to find a smaller box Œp0 ; q0   Œp; q still containing at least one solution of (11.8). Such a box Œp0 ; q0  is called a -valid reduction of Œp; q and is denoted red Œp; q: Observe that if g.q/  0 then for every x 2 Œp; q satisfying (11.8) the line segment joining x to q intersects the surface g.:/ D 0 at a point x0 2 Œp; q satisfying g.x0 / D 0  h.x0 /;

f .x0 /  f .x/  :

Therefore, the box Œp0 ; q0  should contain all x 2 Œp; q satisfying g.x/ D 0  h.x/; f .x/  ; i.e., satisfying g.x/  0  h .x/ WD minfg.x/; h.x/; f .x/  g:

(11.9)

Lemma 11.1 (i) If g.p/ > 0 or h .q/ < 0 then there is no x 2 Œp; q satisfying (11.9). P (ii) If g.p/  0 then the box Œp; q0  where q0 D p C niD1 ˛i .qi  pi /ei ; with ˛i D supf˛j 0  ˛  1; g.p C ˛.qi  pi /ei /  0g; i D 1 : : : ; n;

(11.10)

still contains all feasible solutions to (11.9) in Œp; q: P (iii) If g.p/  0  h .q/ then the box Œp0 ; q0  where p0 D q0  niD1 ˇi .q0i  pi /ei ; with ˇi D supfˇj 0  ˇ  1; h .q0  ˇ.q0i  pi /ei /  0g; i D 1; : : : ; n;

(11.11)

still contains all feasible solutions to (11.9) in Œp; q: Proof It suffices to prove (ii) because (iii) can be proved analogously, while (i) is obvious. Since q0i D ˛qi C .1  ˛/pi with 0  ˛  1; it follows that pi  q0i  qi 8i D 1; : : : ; n; i.e., Œp; q0   Œp; q: Recall that G WD fx 2 Œp; qj g.x/  0g: For any x 2 G \ Œp; q we have by normality Œp; x  G; so xi D p C .xi  pi /ei 2 G; i D 1; : : : ; n: But xi  qi ; so xi D p C ˛.qi  pi /ei with 0  ˛  1: This implies that ˛  ˛i ; i.e., xi  p C ˛i .qi  pi /ei ; i D 1; : : : ; n; and consequently x  q0 ; i.e., x 2 Œp; q0 : Thus, G \ Œp; q  G \ Œp; q0 ; which completes the proof because the converse inclusion is obvious by the fact Œp; q0   Œp; q: t u Clearly the box Œp; q0  defined in (1) is obtained from Œp; q by cutting off the set [niD1 fxj xi > q0i g; while the box Œp0 ; q defined in (2) is obtained from Œp; q by cutting off the set [niD1 fxi j xi < p0i g: The cut [niD1 fxi jxi > q0i g is referred to as an upper -valid cut with vertex q0 and the cut [niD1 fxi j xi < p0i g as a lower -valid cut with vertex p0 ; applied to the box Œp; q (Fig. 11.2). We now deduce a closed formula for red Œp; q:

11.2 The Polyblock Algorithm

401

Fig. 11.2 Reduced box

q q’

H G p’ p

Given any continuous increasing function '.˛/ W Œ0; 1 ! R such that '.0/ < 0 < '.1/ let N.'/ denote the largest value ˛ 2 .0; 1/ satisfying '.˛/ D 0: For every i D 1; : : : ; n let qi WD p C .qi  pi /ei : Proposition 11.13 A -valid reduction of a box Œp; q is given by the formula red Œp; q D

;; if g.p/ > 0 or h .q0 / < 0; Œp0 ; q0 ; if g.p/  0 & h .q0 /  0:

(11.12)

where q0 D p C

n X

˛i .qi  pi /ei ;

p0 D q0 

iD1

˛i D ˇi D

1; if g.qi /  0 N.'i /; if g.qi / > 0:

1; if h .q0i /  0 N. i /; if h .q0i / < 0:

Proof Straightforward from Lemma 11.1 .

n X

ˇi .q0  pi /ei ;

(11.13)

iD1

'i .t/ D g.p C t.qi  pi /ei /:

i .t/

D h .q0  t.q0i  pi /ei /: t u

Remark 11.2 The reduction operation which is meant to enhance the efficiency of the search procedure is worthwhile only if the cost of this operation does not offset its advantage. Therefore, although the exact values of N.'i /; N. i / can be computed easily using, e.g., a procedure of Bolzano bisection, in practical implementation of the above formula it is enough to compute only approximate values of N.'i /; N. i / by applying just a few iterations of the Bolzano bisection procedure. Proposition 11.14 Let E WD fx 2 G \ Hj f .x/  g; and let P be a polyblock containing E with proper vertex V: Let V 0 be the set obtained from V by deleting

402

11 Monotonic Optimization

every z 2 V satisfying red Œa; z D ; and replacing every other z 2 V with the highest corner z0 of the box red Œa; z D Œa; z0 : Then the polyblock P0 generated by V 0 satisfies fx 2 G \ Hj f .x/  g  P0  P: Proof Since E \Œa; z D ; for every deleted z; while E \Œa; z  red Œa; z WD Œa; z0  for every other z; the conclusion follows. t u We shall refer to the polyblock P0 with vertex set V 0 as the -valid reduction of the polyblock P and denote it by red P:

11.2.2 Generating the New Polyblock The next question after reducing the search domain is how to derive the new polyblock PkC1 that should better approximate G \ H than the current polyblock Pk : The general subproblem to be solved is the following. Given a polyblock P  Œa; b with V D pvertP; a vertex v 2 V and a point x 2 Œa; v/ such that G \ H  P n .x; b; determine a new polyblock P0 satisfying P n .x; b  P0  P n fvg. For any two z; y let J.z; y/ D fjj zj > yj g and if a  x  z  b then define zi D z C .xi  zi /ei ; i D 1; : : : ; n: Proposition 11.15 Let P be a polyblock with proper vertex V  Œa; b and let x 2 Œa; b satisfy V WD fz 2 Vj x < zg ¤ ;: (i) P0 WD P n .x; b is a polyblock with vertex set T 0 D .V n V / [ fzi D z C .xi  zi /ei j z 2 V ; i D 1; : : : ; ng:

(11.14)

(ii) The proper vertex set of P0 is obtained from T 0 by removing improper elements according to the rule: For every pair z 2 V ; y 2 VC WD fy 2 Vj x  yg compute J.z; y/ D fJj zj > yj g and if J.z; y/ D fig then zi is an improper element of T 0 : Proof Since Œa; z\.x; b D ; for every z 2 V nV it follows that Pn.x; b D P1 [P2 ; where P1 is the polyblock with vertex set V n V and P2 D .[z2V Œa; z/ n .x; b D [z2V .Œa; z n .x; b/: Noting that by Proposition 11.7 Œa; b n .x; b is a polyblock with vertices ui D C.xi  bi /ei ; i D 1; : : : ; n; we can then write Œa; z n .x; b D Œa; z \ .Œa; b n .x; b/ D Œa; z \ .[niD1 Œa; ui / D [niD1 Œa; z \ Œa; ui  D [niD1 Œa; z ^ ui ; where z ^ ui denotes the vector whose every j-th component equals minfzj ; uij g: Hence, P2 D [fŒa; z ^ ui j z 2 V ; i D 1; : : : ; ng: Since z ^ ui D zi for z 2 V ; this shows that the vertex set of P n .x; b is the set T 0 given by (11.14).

11.2 The Polyblock Algorithm

403

It remains to show that every y 2 V n V is proper, while a zi D z C .xi  zi /ei with z 2 V is improper if and only if J.z; y/ D fig for some y 2 VC : Since every y 2 V n V is proper in V; while zi  z 2 V for every zi it is clear that every y 2 V n V is proper. Therefore, an improper element must be some zi such that zi  y for some y 2 T 0 : Two cases are possible: (1) y 2 VI (2) y 2 T 0 n V: In the former case, since obviously x  zi we must have x  yI furthermore, zj D zij  yj 8j ¤ i; hence, since z 6 y; it follows that zi > yi ; i.e., J.z; y/ D fig: In the latter case, zi  yl for some y 2 V and some l 2 f1; : : : ; ng: Then it is clear that we cannot have y D z, so y ¤ z and zij  ylj 8j D 1; : : : ; n: It l D i then this implies zj D zij  yij D yj 8j ¤ i; hence, since z 6 y it follows that zi > yi and J.z; y/ D fig: On the other hand, if l ¤ i then from zij  ylj 8j D 1; : : : ; n; we have zj  yj 8j ¤ i and again, since z 6 y we must have zi > yi ; so J.z; y/ D fig: Thus an improper zi must satisfy J.z; y/ D fig for some y 2 VC : Conversely, if J.z; y/ D fig for some y 2 VC then zj  yj 8j ¤ i; hence, since zii D xi and xi D yi if y 2 V or else xi  yi if y 2 VC ; it follows that zi  yi if y 2 V or else zi  y if y 2 VC I so in either case zi is improper. This completes the proof of the proposition. t u

11.2.3 Polyblock Algorithm By shifting the origin to ˛e if necessary we may always assume that x  a C ˛e

8x 2 H:

(11.15)

where ˛ > 0 is chosen not too small compared to kb  ak (e.g., ˛ D 14 kb  ak). Furthermore, since the problem is obviously infeasible if b … H whereas b is an obvious optimal solution if b 2 G \ H; we may also assume that b 2 H n G;

i:e:; h.b/  0; g.b/  0:

(11.16)

Algorithm POA (Polyblock Outer Approximation Algorithm, or briefly, Polyblock Algorithm) Initialization Let P1 be an initial polyblock containing G \ H and let V1 be its proper vertex set. For example, P1 D Œa; b; V1 D fbg: Let x1 be the best feasible solution available and CBV D f .x1 / (if no feasible solution is available, set CBV D 1). Set k D 1: Step 1. Step 2. Step 3.

For D CBV let Pk D red Pk and V k D pvertPk : Reset b _fx 2 V k g where x D _fa 2 Ag means xi D maxa2A ai 8i D 1; : : : ; n: If V k D ; terminate: if CBV D 1 the problem is infeasible; if CBV > 1 the current best solution xk is an optimal solution. If V k ¤ ; select zk 2 argmaxff .x/j x 2 V k g: If g.zk /  0 then terminate: zk is an optimal solution. Otherwise, go to Step 4.

404

Step 4.

Step 5.

11 Monotonic Optimization

Compute xk D G .zk /; the last point of G in the line segment from a to xk : Determine the new current best feasible xkC1 and the new current best value CBV: Compute the proper vertex set VkC1 of PkC1 D Pk n .xk ; b according to Proposition 11.15. Increment k and return to Step 1.

Proposition 11.16 When infinite Algorithm POA generates an infinite sequence fxk g every cluster point of which is a global optimal solution of (MO). Proof First note that by replacing Pk with Pk D red Pk in Step 1, all z 2 Vk such that f .z/ < are deleted, so that at any iteration k all feasible solutions z with f .z/  CBV are contained in the polyblock Pk : This justifies the conclusions when the algorithm terminates at Step 2 or Step 3. Therefore it suffices to consider the case when the algorithm is infinite. Condition (12.16) implies that min.zi  ai /  i

˛ kz  ak  kz  ak 8z 2 H; kz  ak

(11.17)

˛ where  D kbak : We contend that zk  xk ! 0 as k ! C1: Suppose the contrary, that there exist > 0 and an infinite sequence kl such that kzkl  xkl k  > 0 8l: For  > l we have zk … .xkl ; zkl  because Pkm u  Pkl n .xkl ; b: Hence, kzkm u  zkl k  miniD1;:::;n jzki l  xki l j: On the other hand, miniD1;:::;n .zki l  ai /  kzkl  ak by (11.17) because zkl 2 H; while xkl lies on the line segment joint a to zkl ; so

zki l xki l D

k

zi l ai kzkl xkl k kzkl ak km u kl

 kzkl xkl k 8i; i.e., miniD1;:::;n jzki l xki l j  kzkl xkl k:

Therefore, kz  z k  miniD1;:::;n jzki l  xki l j  kzkl  xkl k  ; conflicting with the boundedness of the sequence fzkl g  Œa; b: Thus zk  xk ! 0 and by boundedness we may assume that, up to subsequences, k x ! x; zk ! x: Then, since zk 2 H; xk 2 G 8k; it follows that x 2 G \ H; i.e., x is feasible. Furthermore, f .zk /  f .z/ 8z 2k  G \ H; hence, by letting k ! C1; f .x/  f .x/ 8x 2 G \ H; i.e., x is a global optimal solution. t u Remark 11.3 In practice we must stop the algorithm at some iteration k: The question arises as to when to stop, i.e., how large k should be, to obtain a zk sufficiently close to an optimal solution. If the problem (MO) does not involve conormal constraints, i.e., has the form .MO1/

maxff .x/j g.x/  0; x 2 Œa; bg;

then xk is always feasible, so when f .zk /  f .xk / < "; then f .xk / C " > f .zk /  maxff .x/j g.x/  0; x 2 Œa; bg; i.e., xk will be an "-optimal solution of the problem. In other words, the Polyblock Algorithm applied to problem (MO1) provides an "optimal solution in finitely many steps.

11.3 Successive Incumbent Transcending Algorithm

405

In the general case, if both normal and conormal constraints are present, xk may not be feasible, and the Polyblock Algorithm may not provide an "-optimal solution in finitely many steps. The most that can be said is that, since g.zk /  g.xk / ! 0 and g.xk /  0; while h.zk /  0 8k; f .zk / ! max .MO/ as k ! C1; we will have, for k sufficiently large, g.zk /  "; h.zk /  0; f .zk /  max .MO/  ": A point x 2 Œa; b satisfying g.x/  "  0  h.x/; f .x/  max .MO/  " is called an "-approximate optimal solution of (MO). Thus, given any tolerance " > 0; Algorithm POA can only provide an "-approximate optimal solution zk in finitely many steps.

11.3 Successive Incumbent Transcending Algorithm Very often in practice we are not so much interested in solving a problem to optimality as in obtaining quickly a feasible solution better than a given one. In that case, the Polyblock Algorithm is a good choice only if the problem does not involve conormal constraints, i.e., if its feasible set is a normal set. But if the problem involves both normal and conormal sets, i.e., if the feasible set is the intersection of a normal and a conormal set, then the Polyblock Algorithm does not serve our purpose. As we saw at the end of the previous section, for such a problem the Polyblock Algorithm is incapable of computing a feasible solution in finitely many iterations. More importantly, when the problem happens to have an isolated feasible or almost feasible solution with an objective function value significantly better than the optimal value (as, e.g., point x in the problem depicted in Fig. 11.3), then the "-approximate solution produced by the algorithm may be quite far off the optimal solution and so can hardly be accepted as an adequate approximation of the optimal solution. The situation is very much like to what happens for dc optimization problems with a nonconvex feasible set. To overcome or at least alleviate the above difficulties a robust approach to (MO) can be developed which is similar to the robust approach for dc optimization (Chap. 7, Sect. 7.5), with two following basic features : 1. Instead of an ."; /-approximate optimal solution an essential -optimal solution is computed which is always feasible and provides an adequate approximation of the optimal solution; 2. The algorithm generates a sequence of better and better nonisolated feasible solutions converging to a nonisolated feasible solution which is the best among all nonisolated feasible solutions. In that way, for any prescribed tolerance > 0 an essential -optimal solution is obtained in finitely many iterations.

406

11 Monotonic Optimization

Fig. 11.3 Inadequate approximate optimal solution x

b h(x)=0 h(x) ≥ 0 g(x)=0 opt

g(x) ≤ 0

f (x) = γ

• x∗

a

11.3.1 Essential Optimal Solution When the problem has an isolated optimal solution this optimal solution is generally difficult to compute and difficult to implement when computable. To get round the difficulty a common practice is to assume that the feasible set D of the problem under consideration is robust, i.e., it satisfies D D cl.intD/;

(11.18)

where cl and int denote the closure and the interior, respectively. Condition (11.18) rules out the existence of an isolated feasible solution and ensures that for " > 0 sufficiently small an "-approximate optimal solution will be close enough to the exact optimum. However, even in this case the trouble is that in practice we often do not know what does it mean precisely sufficiently small " > 0: Moreover, condition (11.18) is generally very hard to check. Quite often we have to deal with feasible sets which are not known a priori to contain isolated points or not. Therefore, from a practical point of view a method is desirable which could help to discard isolated feasible solutions from consideration without having to preliminarily check their presence. This method should employ a more adequate concept of approximate optimal solution than the commonly used concept of "-approximate optimal solution. A point x 2 Rn is called an essential feasible solution of (MO) if it satisfies x 2 Œa; b; g.x/ < 0  h.x/: Given a tolerance > 0 an essential feasible solution x is said to be an essential

-optimal solution of (MO) if it satisfies f .x/  f .x/  for all essential feasible solutions x of (MO).

11.3 Successive Incumbent Transcending Algorithm

407

Given a tolerance " > 0; a point x 2 Rn is said to be an "-essential feasible solution if it satisfies x 2 Œa; b; g.x/ C "  0  h.x/ Finally, given two tolerances " > 0; > 0; an "-essential feasible solution x is said to be ."; /-optimal if it satisfies f .x/  f .x/ for all "-essential optimal solutions. Since g; h are continuous increasing functions, it is readily seen that if x is an essential feasible point then for all t > 0 sufficiently small x0 D x C t.b  x/ is still an essential feasible point. Therefore, an essential feasible solution is always a nonisolated feasible solution. The above discussion suggests that instead of trying to find an optimal solution to (MO), it would be more practical and reasonable to look for an essential -optimal solution. The robust approach to problem (MO) to be presented below embodies this point of view.

11.3.2 Successive Incumbent Transcending Strategy A key step towards finding a global optimal solution of problem (MO) is to deal with the following subproblem of incumbent transcending: .SP / Given a real number ; check whether (MO) has a nonisolated feasible solution x satisfying f .x/ > ; or else establish that no nonisolated feasible solution x of (MO) exists such that f .x/ > : Since f ; g are increasing, if f .b/  , then f .x/  8x 2 Œa; b; so there is no feasible solution with objective function value exceeding I if g.a/ > 0; then g.x/ > 0 8x 2 Œa; b; so the problem is infeasible. If f .b/ > and g.b/ < 0 then b is an essential feasible solution with objective function value exceeding : Therefore, barring these trivial cases, we can assume that f .b/ > ;

g.a/  0  g.b/:

(11.19)

To seek an answer to .SP / consider the master problem: minfg.x/j h.x/  0; f .x/  ; x 2 Œa; bg: Setting k .x/ WD minfh.x/; f .x/  g and noting that k .x/ is an increasing function, we write the master problem as .Q /

minfg.x/j k .x/  0; x 2 Œa; bg:

For our purpose an important feature of this problem is that it can be solved efficiently, while solving it furnishes an answer to question (IT), as will shortly be seen.

408

11 Monotonic Optimization

By min.Q / denote the optimal value of .Q /. The robust approach to (MO) is founded on the following relationship between problems (MO) and .Q ): Proposition 11.17 Assume (11.19). (i) If min.Q / < 0 there exists an essential feasible solution x0 of (MO) satisfying f .x0 /  : (ii) If min.Q /  0, then there is no essential feasible solution x of (MO) satisfying f .x/  : Proof (i) If min.Q / < 0 there exists an x0 2 Œa; b satisfying g.x0 / < 0  h.x0 /; f .x0 /  : This point x0 is an essential feasible solution of (MO), satisfying f .x0 /  : (ii) If min.Q /  0 any x 2 Œa; b such that g.x/ < 0 is infeasible to .Q ). Therefore, any essential feasible solution of (MO) must satisfy f .x/ < : t u On the basis of this proposition, to solve the problem (MO) one can proceed according to the following conceptual SIT (Successive Incumbent Transcending) scheme: Select tolerance > 0: Start from D 0 with 0 D f .a/: Solve the master problem .Q ). – If min.Q / < 0 an essential feasible x with f .x/  is found. Reset f .x/  and repeat. – Otherwise, min.Q /  0: no essential feasible solution x for (MO) exists such that f .x/  : Hence, if D f .x/  for some essential feasible solution x of (MO) then x is an essential -optimal solution; if D 0 the problem (MO) is essentially infeasible (has no essential feasible solution). Below we indicate how this conceptual scheme can be practically implemented.

a. Solving the Master Problem Recall that the master problem .Q ) is minfg.x/j x 2 Œa; b; k .x/  0g;

(11.20)

where k .x/ WD minfh.x/; f .x/  g is an increasing function. Since the feasible set of .Q ) is robust (has no isolated point) and easy to handle (a feasible solution can be computed at very cheap cost), this problem can be solved efficiently by a rectangular RBB (reduce-bound-and-branch) algorithm characterized by three basic operations (see Chap. 6, Sect. 6.2): – Reducing: using valid cuts reduce each partition set M D Œp; q  Œa; b without losing any feasible solution currently still of interest. The resulting box Œp0 ; q0  is referred to as a valid reduction of M:

11.3 Successive Incumbent Transcending Algorithm

409

– Bounding: for each partition set M D Œp; q compute a lower bound ˇ.M/ for g.x/ over the feasible solutions of .Q ) contained in the valid reduction of MI – Branching: use an adaptive subdivision rule to further subdivide a selected box in the current partitioning of the original box Œa; b:

b. Valid Reduction At a given stage of the RBB algorithm for .Q ) let Œp; q be a box generated during the partitioning process and still of interest. The search for a nonisolated feasible solution x of (MO) in Œp; q such that f .x/  can then be restricted to the set fx 2 Œp; qj g.x/  0  k .x/g: A -valid reduction of Œp; q; written red Œp; q, is a box Œp0 ; q0   Œp; q that still contains all the just mentioned set. For any box Œp; q  Rn define qi WD pC.qi pi /ei ; i D 1; : : : ; n: As in Sect. 10.2, for any increasing function '.t/ W Œ0; 1 ! R denote by N.'/ the largest value of t 2 Œ0; 1 such that '.t/ D 0: By Proposition 11.13 we can state: A -valid reduction of a box Œp; q is given by the formula red Œp; q D

;; if g.p/ > 0 or k .q/ < 0; Œp0 ; q0  if g.p/  0 & k .q/  0:

where q0 D p C

n X

˛i .qi  pi /ei ;

p0 D q0 

iD1

˛i D ˇi D

1; if g.qi /  0 N.'i /; if g.qi / > 0:

1; if k .q0i /  0 N. i /; if k .q0i / < 0:

n X

ˇi .q0i  pi /ei ;

(11.21)

iD1

'i .t/ D g.p C t.qi  pi /ei /:

i .t/

D k .q0  t.q0i  pi /ei /:

In the context of RBB algorithm for (Q ), a smaller red Œp; q allows a tighter bound to be obtained for g.x/ over Œp; q but requires more computational effort. Therefore, as already mentioned in Remark 11.2, in practical implementation of the above formulas for ˛i ; ˇi it is enough to compute approximate values of N.'i / or N. i / by using, e.g., just a few iterations of a Bolzano bisection procedure.

410

11 Monotonic Optimization

c. Bounding Let M D Œp; q be a valid reduction of a box in the current partition. Since g.x/ is an increasing function an obvious lower bound for g.x/ over the box Œp; q; is the number ˇ.M/ D g.p/:

(11.22)

As simple as it is, this bound combined with an adaptive branching rule suffices to ensure convergence of the algorithm. However, to enhance efficiency better bounds are often needed. For example, applying two or more iterations of the Polyblock procedure for computing minfg.x/j x 2 Œp; qg may yield a reasonably good lower bound. Alternatively, based on the observation that minfg.x/ C tk .x/j x 2 Œp; qg  minfg.x/j x 2 Œp; q; k .x/  0g 8t 2 RC ; one may take ˇ.M/ D g.p/ C tk .p/ for some t > 1: As shown in (Tuy 2010), if either of the functions f .x/; g.x/ C tk .x/ is coercive, i.e., tends to C1 as kxk ! C1; then this bound approaches the exact minimum of g.x/ over Œp; q as t ! C1: Also one may try to exploit any partial convexity present in the problem. For instance, if a convex function u.x/ and a concave function v.x/ are available such that u.x/  g.x/; v.x/  k .x/ 8x 2 Œp; q; then a lower bound is given by the optimal value of the convex minimization problem minfu.x/j v.x/  0; x 2 Œp; qg which can be solved by many currently available algorithms. Sometimes it may be useful to write g.x/ as a sum of several functions such that a lower bound can be easily computed for each of these functions: ˇ.M/ can then be taken to be the sum of the separate lower bounds. Various methods and examples of computing good bounds are considered in (Tuy-AlKhayyal-Thach 2005).

d.

Branching

Branching is performed according to an adaptive rule. Specifically let Mk D Œpk ; qk  be the valid reduction of the partition set chosen for further subdivision at iteration k; and let zk 2 Mk be a point such that g.zk / provides a lower bound for minfg.x/j k .x/  0; x 2 Mk g: (for instance, zk D pk ). Obviously, if k .zk /  0 then g.zk / is the exact minimum of the latter problem and so if Mk is the partition set with smallest lower bound among all partition sets M currently still of interest then zk is a global optimal solution of (Q ). Based on this observation one can use an adaptive branching method to eventually drive zk to the feasible set of (Q ). This method proceeds as follows. Let yk be the point where the line segment Œzk ; qk  meets the surface k .x/ D 0; i.e., yk D N. / for

.t/ D k .zk C t.qk  zk //; 0  t  1: Let wk be the midpoint of the segment

11.3 Successive Incumbent Transcending Algorithm

411

Œzk ; yk  W wk D .zk C yk /=2 and jk 2 argmaxfjzki  yki j i D 1; : : : ; ng: Then divide the box Mk D Œpk ; qk  into two smaller boxes Mk ; MkC via the hyperplane xjk D wkjk : Mk D Œpk ; qk ; q

k

k

Dq 

.qkjk



MkC D ŒpkC ; qk ;

wkjk /ejk ;

p

kC

k

Dp C

.wkjk

with  pkjk /ejk :

In other words, subdivide Mk according to the adaptive subdivision rule via .zk ; yk /: Proposition 11.18 Whenever infinite the above adaptive RBB algorithm for (Q ) generates a sequence of feasible solutions yk at least a cluster point y of which is an optimal solution of (Q ). Proof Since the algorithm uses an adaptive subdivision rule via .zk ; yk /; by Theorem 6.4 there exists an infinite sequence K  f1; 2; : : :g such that kzk  yk k ! 0 as k 2 K; k ! C1: From the fact g.zk /  minfg.x/j x 2 Œa; b; k .x/  0g; k .yk / D 0; we deduce that, up to a subsequence, zk ! y; yk ! y with g.y/ D minfg.x/j x 2 Œa; b; k .x/  0g: t u Incorporating the above RBB procedure for solving the master problem into the SIT (Successive Incumbent Transcending) Scheme yields the following robust algorithm for solving (MO): SIT Algorithm for (MO) Let 0 D f .b/: Select tolerances " > 0; > 0: Initilization. Start with P1 D fM1 g; M1 D Œa; b; R1 D ;: If an essential feasible solution x of (MO) is available let D f .x/  : Otherwise, let D 0 : Set k D 1. Step 1.

For each box M 2 Pk : – Compute its valid reduction red M; – Delete M if red M D ;: Replace M by red M if otherwise; – Letting red M D Œp; q compute a lower bound ˇ.M/ for g.x/ over the set fx 2 Œp; qj k .x/  0g such that ˇ.M/ D g.z/ for some z 2 Œp; q: – Delete M if ˇ.M/ > ":

Step 2. Step 3.

Step 4.

Let P 0 k be the collection of boxes that results from Pk after completion of Step 1 and let R 0 k D Rk [ P 0 k : If R 0 k D ;, then terminate: if D 0 the problem (MO) is "-essential infeasible; if D f .x/  then x is an essential ."; /-optimal solution of (MO). If R 0 k ¤ ;; let Œpk ; qk  D Mk 2 argminfˇ.M/j M 2 R 0 k g and let zk 2 Mk be the point satisfying g.zk / D ˇ.Mk /: If k .zk /  0 let yk D zk and go to Step 5. If k .zk / < 0; let yk be the point where the line segment joining zk

412

Step 5. Step 6.

11 Monotonic Optimization

to qk meets the surface k .x/ D 0: If g.yk / < " go to Step 5. If g.yk /  " go to Step 6. yk is an essential feasible solution of (MO) with f .yk /  : Reset x yk and go back to Step 0. Divide Mk into two subboxes according to an adaptive branching via .zk ; yk /: Let PkC1 be the collection of these two subboxes of Mk : Let RkC1 D R 0 k n fMk g: Increment k and return to Step 1.

Proposition 11.19 The above algorithm terminates after finitely many steps, yielding either an essential ."; /-optimal solution of (MO), or an evidence that the problem is essentially infeasible. Proof Since Step 1 deletes every box M with ˇ.M/ > "; the fact Rk D ; in Step 3 implies that g.x/ C " > 0 8x 2 Œp; q \ fxj k .x/  0g; hence there is no "-essential feasible solution x of (MO) with f .x/  ; justifying the conclusion in Step 3. It remains to show that Step 3 occurs for sufficiently large k: Suppose the contrary, that the algorithm is infinite. Since each occurrence of Step 5 increases the current value of f .x/ by at least > 0 while f .x/ is bounded above on the feasible set, Step 5 cannot occur infinitely often. Therefore, for all k sufficiently large Step 6 occurs, i.e., g.yk /  ": By Proposition 11.18 there exists a subsequence fks g such that kzks  yks k ! 0 as s ! C1: Also, by compactness, one can assume that yks ! y; zks ! y: Then g.y/  "; while g.zk / D ˇ.Mk /  "; hence g.y/  "; a contradiction. t u Remark 11.4 So far we assumed that the problem involves only inequality constraints. In the case there are also equality constraints, e.g., cj .x/ D 0; j D 1; : : : ; s:

(11.23)

observe that the linear constraints can be used to eliminate certain variables, so we can always assume that all the constraints (11.23) are nonlinear. Since, however, in the most general case one cannot expect to compute a solution to a system of nonlinear equations in finitely many steps, one should be content with replacing the system (11.23) by the approximate system ı  cj .x/  ı;

j D 1; : : : ; s ;

where ı > 0 is the tolerance. The method presented in the previous sections can then be applied to the resulting approximate problem. An essential "-optimal solution to the above defined approximate problem is called an essential .ı; "/-optimal solution of (MO).

e. Application to Quadratic Programming Consider the quadratically constrained quadratic optimization problem

11.4 Discrete Monotonic Optimization

.QQP/

413

minff0 .x/j fk .x/  0; k D 1; : : : ; m; x 2 Cg

where C  Rn is a polyhedron and each fk .x/; k D 0; 1 : : : ; m; is a quadratic function. Noting that every quadratic polynomial can be written as a difference of two quadratic polynomials with positive coefficients we can rewrite (QQP) as a monotonic optimization problem of the form .P/

minff .x/j g.x/  0  h.x/; x 2 Œa; bg

where f .x/; g.x/; h.x/ are quadratic functions with positive coefficients (hence, increasing functions). The SIT method applied to the monotonic optimization problem .P/ then yields a robust algorithm for solving (QQP). For details of implementation, we refer the reader to the article (Tuy and Hoai-Phuong 2007). Sometimes it may be more convenient to convert (QQP) to the form .Q/

minff .x/j g.x/  0; x 2 Œa; bg

where g.x/ D minkD1;:::;m fuk .x/  vk .x/g and f ; uk ; vk are quadratic functions with positive coefficients. Then the SIT Algorithm can be applied, using the master problem .Q /

maxfg.x/j f .x/  ; x 2 Œa; bg:

11.4 Discrete Monotonic Optimization Consider the discrete monotonic optimization problem maxff .x/j g.x/  0  h.x/; .x1 ; : : : ; xs / 2 S; x 2 Œa; bg; where f ; g; h W RnC ! R are increasing functions and S is a given finite subset of RsC ; s  n: Setting G D fxj g.x/  0g; H D fxj h.x/  0g S D fx 2 Œa; bj .x1 ; : : : ; xs / 2 Sg; we can write the problem as .DMO/

maxff .x/j x 2 Œa; b; x 2 .G \ S / \ Hg:

414

11 Monotonic Optimization

Q D .G \ S /e ; the normal hull of the set G \ S : Then Proposition 11.20 Let G problem (DMO) is equivalent to Q \ Hg: maxff .x/j x 2 G

(11.24)

Q we need only show that for each point x 2 G Q there Proof Since G \ S  G Q D [z2G\S Œa; z; exists z 2 G \ S satisfying f .z/  f .x/: By Proposition 11.4 G so x 2 Œa; z for some z 2 G \ S : Since x  z and f .x/ is increasing we have f .x/  f .z/: t u As a consequence, solving problem (DMO) reduces to solving (11.24) which is a monotonic problem without explicit discrete constraint. The difficulty, now, is how Q which is defined only implicitly as the normal hull of to handle the polyblock G  G\S : Given a box Œp; q  Œa; b and a point x 2 Œp; q; we define the lower Sadjustment of x to be the point bxcS D xQ ; with xQ i D maxfyi j y 2 S [ fpg; yi  xi g; i D 1; : : : ; n

(11.25)

and the upper S-adjustment of x to be the point dxeS D xO ; with xO i D minfyi j y 2 S [ fqg; yi  xi g; i D 1; : : : ; n:

(11.26)

In the frequently encountered special case when S D S1      Ss and every Si is a finite set of real numbers we have xQ i D maxf j 2 Si [ fpi g;  xi g; i D 1; : : : ; n: xO i D minf j 2 Si [ fqi g;  xi g; i D 1; : : : ; n: For example, if each Si is the set of integers and pi ; qi 2 Si then xQ i is the largest integer no larger than xi ; while xO i is the smallest integer no less than xi : The box obtained from red Œp; q upon the above S-adjustment is referred to as the S-adjusted -valid reduction of Œp; q and denoted redS Œp; q: However rather than compute first red Œp; q and then S-adjust it, one should compute directly redS Œp; q using the formulas redS Œp; q D ŒQp; qQ  with qQ i D maxfxi j x 2 red Œp; q \ .G \ S /g; i D 1; : : : ; n; pQ i D minfxi j x 2 red Œp; qQ  \ .H \ S /g; i D 1; : : : ; n: With the above modifications to accommodate the discrete constraint the Polyblock Algorithm and the SIT Algorithm for (MO) can be extended to solve (DMO). For every box Œp; q let redS Œp; q denote the S-adjusted -valid reduction of Œp; q;

11.4 Discrete Monotonic Optimization

415

i.e., the box obtained from Œp0 ; q0  WD red Œp; q by replacing p0 with dp0 eS and q0 with bq0 cS : Discrete Polyblock Algorithm Initialization. Take an initial polyblock P0  G \ H; with proper vertex set T0 : Let xO 0 be the best feasible solution available (the current best feasible solution), 0 D f .Ox0 /; S0 D fx 2 S j f .x/ > 0 g: If no feasible solution is available, let 0 D 1: Set k WD 0: Step 1.

Step 2. Step 3. Step 4.

Step 5.

Step 6.

Delete every v 2 Tk such that redS k Œa; v D ;: For every v 2 Tk such that redS k Œa; v ¤ ; denote the highest vertex of the latter box again by v and if f .v/  k then drop v: Let TQ k be the set that results from Tk after performing this operation for every v 2 Tk (so TQ k  H). Reset Tk WD TQ k : If Tk D ; terminate: if k D 1 the problem is infeasible; if k D f .Oxk /; xO k is an optimal solution. If Tk ¤ ; select v k 2 argmaxff .v/j v 2 Tk g: If v k 2 G \ S terminate: v k is an optimal solution. If v k 2 G n S compute vQ k D bv k cSk (using formula (11.25) for S D Sk ). If v k … G compute xk D G .v k / and define vQ k D xk if xk 2 Sk ; vQ k D bxk cSk if xk … Sk . Let Tk; D fz 2 Tk j z > vQ k g: Compete \ Tk0 D .Tk n Tk; / fzk;i D z C .vQ ik  zi /ei j z 2 Tk; ; i D 1; : : : ; ng: Let TkC1 be the set obtained from Tk0 by removing every zi such that fjj zj > yj g D fig for some z 2 Tk; and y 2 Tk : Determine the new current best feasible solution xO kC1 : Set kC1 D  f .OxkC1 /; SkC1 D fx 2 S j f .x/ > kC1 g: Increment k and return to Step 1.

Proposition 11.21 If s D n the DPA (Discrete Polyblock) algorithm is finite. If s < n and there exists a constant ˛ > 0 such that min .xi  ai /  ˛

iDsC1;:::;n

8x 2 H;

(11.27)

then either the DPA algorithm is finite or it generates an infinite sequence fv k g converging to an optimal solution. Proof A each iteration k a pair v k ; vQ k is generated such that v k … .G \ Sk /c ; vQ k 2 .G \ Sk /c and the rectangle .vQ k ; b contains no point of Pl with l > k and hence no vQ l with l > k: Therefore, there can be no repetition in the sequence fvQ 0 ; vQ 1 ; : : : ; vQ k ; : : :g  S : This implies finiteness of the algorithm in the case s D n because then S D S is then finite. In the case s < n; if the sequence fv k g is infinite, it follows from the fact .vQ 1k ; : : : ; vQ sk / 2 S that for some sufficiently large k0 ; vQik D vQi ;

i D 1; : : : ; s;

8k  k0 :

On the other hand, since v k 2 H we have from (11.27) that min .vik  ai /  ˛

iDsC1;:::;n

8k:

(11.28)

416

11 Monotonic Optimization

We show that v k  xk ! 0 as k ! C1: Suppose the contrary, that there exist > 0 and an infinite sequence kl such that kv kl  xkl k  > 0 8l: For all  > l we have v k … .vQ kl ; v kl  because Pk  Pkl n .vQ kl ; b: Since vQik D xki 8i > s; we then derive kv k  v kl k 

min

iDsC1;:::;n

jvikl  vQ ikl j D

min

iDsC1;:::;n

jvikCl  xkCl i j:

By (11.28) minfvikl  ai j i D s C 1; : : : ; ng  ˛ because v kl 2 H; while xkl lies on the line segment joining a to v kl : Thus vikl  xki l D

vikl  ai ˛

kv kl  xkl k  kv kl  ak kb  ak

8i > s:

Consequently, kv k  v kl k 

min

iDsC1;:::;n

jvikl  xki l k 

˛

; kb  ak

conflicting with the boundedness of the sequence fv kl g  Œa; b: Therefore, kv k  xk k ! 0 and by boundedness we may assume that, up to a subsequence, xk ! x; v k ! x: Then, since v k 2 H; xk 2 G 8k; it follows that x 2 G\H and hence, v 2 G \ H \ S for v defined by v i D vQ i .i D 1; : : : ; s/; v i D xi .i D s C 1; : : : ; n/: Furthermore, f .v k /  f .v/ 8v 2 Pk  G \ H \ Sk I hence, letting k ! C1 yields f .x/  f .x/ 8x 2 G \ H \ S for D limk!C1 k : The latter in turn implies that v 2 argmaxff .x/j x 2 G \ H \ Sg; so v is an optimal solution. t u Discrete SIT Algorithm Let 0 D f .b/: Select tolerances " > 0; > 0: Initialization. Start with P1 D fM1 g; M1 D Œa; b; R1 D ;: If an essential feasible solution x of (DMO) is available let D f .x/  : Otherwise, let D 0 : set k D 1: Step 1.

For each box M 2 Pk : – Compute its S-adjusted -valid reduction redS M: – Replace M by redS M (in particular, delete it if redS M D ;/; – Let redS M D ŒpM ; qM ; take g.pM / to be ˇ.M/; a lower bound for g.x/ over the box M: – Delete M if g.pM / > ":

Step 2. Step 3.

Step 4.

Let P 0 k be the collection of boxes that results from Pk after completion of Step 1 and let R 0 k D Rk [ P 0 k : If R 0 k D ; terminate: in the case D 0 the problem (DMO) is "essential infeasible; otherwise, D f .x/  then x is an essential ."; /optimal solution of (DMO). If R 0 k ¤ ;; let Œpk ; qk  D Mk 2 argminfˇ.M/j M 2 R 0 k g: If k .pk /  0; let yk D pk and go to Step 5. If k .pk / < 0 let yk be the upper Sadjustement of the point where the line segment joining pk to qk meets the surface k .x/ D 0: If g.yk /  " go to Step 5; otherwise go to Step 6.

11.5 Problems with Hidden Monotonicity

Step 5.

Step 6.

417

yk is an essential feasible solution of (DMO) with f .yk /  : If D 0 define x D yk ; D f .x/  and go back to Step 0. If D f .x/  and f .yk / > f .x/ reset x yk and go to Step 0. Divide Mk into two subboxes according to an adaptive branching via .pk ; yk /: Let PkC1 be the collection of these two subboxes of Mk : Let RkC1 D R 0 k n fMk g: Increment k and return to Step 1.

Proposition 11.22 The Discrete SIT Algorithm terminates after finitely many iterations, yielding either an essential ."; /-optimal solution, or an evidence that problem (DMO) is "-essential infeasible. Proof Analogous to the proof of Proposition 11.19.

t u

11.5 Problems with Hidden Monotonicity Many problems encountered in various applications have a hidden monotonic structure which can be disclosed upon suitable variable transformations. In this section we study two important classes of such problems.

11.5.1 Generalized Linear Fractional Programming This is a class of problems with hidden monotonicity in the objective function. It includes problems of either of the following forms: .P/ .Q/



 fm .x/ f1 .x/ ;:::; j x 2 Dg g1 .x/ gm .x/



 fm .x/ f1 .x/ ;:::; j x 2 Dg g1 .x/ gm .x/

maxf˚

minf˚

where D is a nonempty polytope in Rn ; f1 ; : : : ; fm ; g1 ; : : : ; gm are linear affine functions on Rn such that  1 < ai WD min x2D

fi .x/ < C1; gi .x/

i D 1; : : : ; m;

(11.29)

m while ˚ W Rm ! R is a continuous function, increasing on Rm aC WD fy 2 R j yi  ai ; i D 1; : : : ; mg; i.e., satisfying

ai  y0i  yi .i D 1; : : : ; m/ ) ˚.y0 /  ˚.y/:

(11.30)

This class of problems include, aside from linear multiplicative programs (see Chap. 8, Sect. 8.5), diverse generalized linear fractional programs, namely:

418

11 Monotonic Optimization

1. Maximin and Minimax 

f1 .x/ fm .x/ g1 .x/ ; : : : ; gm .x/

 j x 2 Dg

(11.31)

.˚.y/ D minfy1 ; : : : ; ym g/   fm .x/ j x 2 Dg minfmax gf11.x/ ; : : : ; .x/ gm .x/

(11.32)

maxfmin

.˚.y/ D maxfy1 ; : : : ; ym g/ 2. Maxsum and Minsum Pm P f1 .x/ maxf m iD1 g1 .x/ j x 2 Dg .˚.y/ D iD1 yi / P Pm f1 .x/ minf m iD1 g1 .x/ j x 2 Dg .˚.y/ D iD1 yi /

(11.33) (11.34)

3. Maxproduct and Minproduct Qm Q f1 .x/ maxf m iD1 g1 .x/ j x 2 Dg .˚.y/ D iD1 yi / Qm f1 .x/ Qm minf iD1 g1 .x/ j x 2 Dg .˚.y/ D iD1 yi /

(11.35) (11.36)

(For the last two problems it is assumed that ai > 0; i D 1; : : : ; m in condition (11.29)) Standard linear fractional programs which are special cases of problems (P), (Q) when m D 1 were first studied by Charnes and Cooper as early as in 1962. Problems (11.31) and (11.33) received much attention from researchers (Schaible 1992 and the references therein); problems (11.33) and (11.34) were studied in Falk and Palocsay (1992), Konno and Abe (1999), Konno et al. (1991), and problems (11.35), (11.36) in Konno and Abe (1999) and Konno and Yajima (1992). For a review of fractional programing and extensions up to 1993, the reader is referred to Schaible (1992), where various potential and actual applications of fractional programming are also described. In general, fractional programming arises in situations where one would like to optimize functions of such ratios as return/investment, return/risk, cost/time, output/input, etc. Let us also mention the book (Stancu-Minasian 1997) which is a comprehensive presentation of the theory, methods and applications of fractional programming until this date. From a computational point of view, so far different methods have been proposed for solving different variants of problems .P/ and .Q/: A general unifying idea of parametrization underlies the methods developed by Konno and his associates (Konno et al. 1991; Konno and Yajima 1992; Konno and Yamashita (1997)). Each of these methods is devised for a specific class of problems, and most of them work quite well for problem instances with m  3: In the following we present a unified approach to all variants of problems (P) and (Q) which has been developed in Hoai-Phuong and Tuy (2003), based on an application of monotonic optimization.

11.5 Problems with Hidden Monotonicity

419

a. Conversion to Monotonic Optimization First note that in the study of problems (11.31)–(11.34) it is usually assumed that min fi .x/  0; x2D

min gi .x/ > 0; x2D

i D 1; : : : ; m;

(11.37)

a condition generally weaker than (11.29). Further, without loss of generality we can always assume in problems .P/ and .Q/ that ˚ W RnC ! RCC and minfgi .x/; fi .x/g > 0

8x 2 D;

i D 1; : : : ; m:

(11.38)

In fact, in view of (11.29) gi .x/ does not vanish on D; hence has a constant sign on D and by replacing fi ; gi with their negatives if necessary, we can assume gi .x/ > 0 8x 2 D; i D 1; : : : ; m: Then setting fQi .x/ D fi .x/  ai gi .x/ we have, by (11.29), Q fi .x/ fQi .x/ > 0 8x 2 D; i D 1; : : : ; m; and since gfii.x/ .x/ C ai D gi .x/ ; the problem (P) is equivalent to maxf˚

! fQi .x/ fQm .x/ ;:::; j x 2 Dg; gi .x/ gm .x/

where fQi .x/ > 0 8x 2 D; i D 1; : : : ; m: Setting ˚Q .y1 ; : : : ; ym / D ˚.y1 C a1 ; : : : ; ym C am / one has, in view of (11.30), an increasing function ˚Q W Rm C ! R; so by adding a positive constant one can finally assume ˚Q W Rm ! R CC : Analogously C for problem (Q). Now define the sets G D fy 2 Rm C j yi 

fi .x/ gi x/ ; .i

D 1; : : : ; m/;

x 2 Dg

(11.39)

H D fy 2 Rm C j yi 

fi .x/ ; .i gi x/

D 1; : : : ; m/;

x 2 Dg:

(11.40)

Clearly, G is a normal set, H a conormal set and both are contained in the box Œ0; b with bi D max x2D

fi .x/ ; i D 1; : : : ; m: gi .x/

Proposition 11.23 Problems (P) and (Q) are equivalent, respectively, to the following problems: .MP/

maxf˚.y/j y 2 Gg

.MQ/

minf˚.y/j y 2 Hg:

420

11 Monotonic Optimization

More precisely, if x solves (P) ((Q), resp.) then y with yi D

fi .x/ gi .x/

solves (MP) ((MQ), fi .x/ gi .x/

resp.). Conversely, if y solves (MP) ((MQ), resp.) and x satisfies yi  fi .x/ gi .x/ ;

.yi 

resp.) then x solves (P) ((Q), resp.).

.x Proof Suppose x solves (P). Then y D . gf1i .x/ ; : : : ; gfmm.x/ / satisfies ˚.y/  ˚.y/ for all .x/

; i D 1; : : : ; m; with x 2 D: Since ˚ is increasing this implies y satisfying yi D gfii.x/ .x/ ˚.y/  ˚.y/ 8y 2 G; hence y solves (MP). Conversely, if y solves (MP) then yi  fi .x/ f1 .x/ fm .x/ gi .x/ ; i D 1 : : : ; m; for some x 2 D; and since ˚ is increasing, ˚. g1 .x/ ; : : : ; gm .x/ / 

˚.y/  ˚.y/ for every y 2 G; i.e., for every y satisfying yi D with x 2 D: This means x solves (P).

fi .x/ gi .x/ ; i

D 1; : : : ; m t u

Thus generalized linear fractional problems .P/; .Q/ in Rn can be converted into monotonic optimization problems (MP), M.Q/ respectively, in Rm C with m usually much smaller than n:

b. Solution Method We only briefly discuss a solution method for problem (MP) (a similar method works for (MQ)): .MP/

maxf˚.y/j y 2 Gg

where (see (11.39)) G D fy 2 Rm C j yi  the box Œ0; b with b satisfying bi D max x2D

fi .x/ gi .x/

fi .x/ gi .x/

8i D 1; : : : ; m; x 2 Dg is contained in

i D 1; : : : ; m:

(11.41)

In view of assumption (11.38) we can select a vector a 2 Rm CC such that ai D min x2D

fi .x/ >0 gi .x/

i D 1; : : : ; m;

(11.42)

where it can be assumed that ai < bi ; i D 1; : : : ; m because if ai D bi for some i then gfii.x/ .x/ 8x 2 D and ˚.:/ reduces to a function of m  1 variables. So with a; b defined as in (11.41) and (11.42) the problem to be solved is maxf˚.y/j y 2 G \ Œa; bg;

Œa; b  RnCC :

(11.43)

Let be the optimal value of (MP) ( > 0 since ˚ W RnC ! RCC ). Given a tolerance " > 0; we say that a vector y 2 G is an "-optimal solution of (MP) if  .1 C "/˚.y/; or equivalently, if ˚.y/  .1 C "/˚.y/ 8y 2 G: The vector x 2 D satisfying y D gfii.x/ .x/ is then called an "-optimal solution of .P/:

11.5 Problems with Hidden Monotonicity

421

The POA algorithm specialized to problem (11.43) reads as follows: POA algorithm (tolerance " > 0) Initilization. Start with T1 D TQ 1 D fbg: Set k D 1: Step 1.

Step 2.

Step 3.

Step 4.

Select zk 2 argmaxf˚.z/j z 2 TQ k g. Compute yk D G .zk /; the point where the halfline from a through zk meets the upper boundary @C G of G: Let k k/ xk 2 D satisfy yki  gfii.x.xk// ; i D 1; : : : ; m: Define yQ k 2 G by yQ ki D gfii.x ; iD .xk / k 1; : : : ; m (so yQ is a feasible solution). Determine the new current best solution yk (with corresponding xk 2 D) and the new current best value CBV D ˚.yk / by comparing yQ k with yk1 (for k D 1 set y1 D yQ 1 ; x1 D x1 ). Select a set Tk0  Tk such that zk 2 Tk0  fzj z > yk g and let Tk be the set obtained from Tk by replacing each z 2 Tk0 with the points zi D z  .zi  yki /ei ; i D 1; : : : ; m: From Tk remove all improper elements, all z 2 TQ k such that ˚.z/ < .1 C "/CBV and also all z 6 a: Let TkC1 be the set of remaining elements. If TkC1 D ;; terminate: yk is an "-optimal solution of (MP), and the corresponding xk an "-optimal solution of .P/: If TkC1 ¤ ;; set k k C 1 and return to Step 1.

c. Implementation Issues A basic operation in Step 1 is to compute the point G .z/; where the halfline from a through z meets the upper boundary of the normal set G: Since G  Œ0; b we have G .z/ D z where fi .x/ ; x 2 Dg iD1;:::;m zi gi .x/

 D maxf˛j ˛  min D max min

fi .x/

x2D iD1;:::;m zi gi .x/

:

(11.44)

So computing G .z/ amounts to solving a problem (11.31). Since this is computationally expensive, instead of computing G .z/ D z one should compute only an approximate value of G .z/; i.e., a vector y  G .z/; such that O with O    .1 C /; O y D .1 C /z;

(11.45)

where > 0 is a tolerance. Of course, to guarantee termination of the above algorithm, the tolerance must be suitably chosen. This is achieved by choosing

> 0 such that O < .1 C "=2/˚.z/; O ˚..1 C /z/

(11.46)

422

11 Monotonic Optimization

O  G .z/  y WD .1 C /z; O i.e., ˚.y/  .1 C "=2/˚. G .z//: Indeed, since z k k while ˚.y/ is continuous, it follows that, for sufficiently large k; ˚.z /  ˚.y /  "=2˚. G .zk //: Then, noting that ˚.Qyk /  ˚. G .zk // because ˚.y/ is increasing, we will have .1 C "/CBV  .1 C "/˚.Qyk /  .1 C "/˚. G .zk //  .1 C "=2/˚. G .zk // C ."=2/˚. G.zk //  ˚.yk / C ."=2/˚. G.zk //  ˚.yk / C Œ˚.zk /  ˚.yk / D ˚.zk /; and so the termination criterion will hold. Consider the subproblem LP.˛/

R.˛/ WD max min .fi .x/  ˛zi gi .x// x2D iD1;:::;m

which is equivalent to the linear program maxftj t  fi .x/  ˛zi gi .x/; i D 1; : : : ; m; x 2 Dg: As has been proved in Hoai-Phuong and Tuy (2003), one way to ensure (11.46) is to compute y  G .z/ by the following subroutine: Procedure G .z/ Step 1. Step 2.

Step 3.

Set s D 1; ˛1 D 0: Solve LP.˛1 /; obtaining an optimal solution x1 of it. 1 Define ˛Q 1 D mini zifgi .xi .x1/ / : Let ˛sC1 D ˛Q s .1 C /: Solve LP.˛sC1 /; obtaining an optimal solution xsC1 of it. If R.˛sC1 /  0; and ˚.˛sC1 z/  .1 C "=2/˚.˛Q s z/; terminate: y D ˛sC1 z; yQ si D fi .xsC1 /=gi .xsC1 /; i D 1; : : : ; m: If R.˛sC1 /  0 but ˚.˛sC1 z/ > .1 C "=2/˚.˛Q sz/ reset

=2; ˛Q sC1 D fi .xsC1 / mini z g .xsC1 / ; increment s and go back to Step 2. i i

Step 4.

If R.˛sC1 / > 0; set ˛Q sC1 D mini Step 2.

fi .xsC1 / ; zi gi .xsC1 /

increment s and go back to

11.5.2 Problems with Hidden Monotonicity in Constraints This class includes problems of the form minff .x/j ˚.C.x//  0; x 2 Dg;

(11.47)

where f .x/ is an increasing function, ˚ W Rm C ! R is an u.s.c. increasing function, D a compact convex set and C W Rn ! Rm C a continuous map.

11.5 Problems with Hidden Monotonicity

423

The monotonic structure in (11.47) is easily disclosed. Since the set C.D/ WD fy D C.x/; x 2 Dg is compact it is contained in some box Œa; b  Rm C : For each y 2 Œa; b let '.y/ WD minff .x/j C.x/  y; x 2 Dg:

R.y/

Proposition 11.24 The problem (11.47) is equivalent to the monotonic optimization problem minf'.y/j ˚.y/  0g

(11.48)

in the sense that if y solves (11.48) then an optimal solution x of R.y / solves (11.47) and conversely, if x solves (11.47 ) then y D C.x / solves (11.48). Proof That (11.48) is a monotonic optimization problem is clear from the fact that '.y/ is an increasing function by the definition of R.y/ while ˚.y/ is an increasing function by assumption. The equivalence of the two problems is easily seen by writing (11.47) as infff .x/j x 2 D; C.x/  y; y 2 Œa; b; h.y/  0g x;y

D

min

y2Œa;b;h.y/0

minff .x/j x 2 D; C.x/  0g x

D minf'.y/j y 2 Œa; b; h.y/  0g: In more detail, let y be an optimal solution of (11.48) and let x be an optimal solution of R.y /; i.e., '.y / D f .x / with x 2 D; C.x /  y : For any x 2 D satisfying ˚.C.x//  0; we have ˚.y/  0; for y D C.x/  '.y/  '.y / D f .x /; proving that x solves (11.47). Conversely, let x be an optimal solution of (11.47), i.e., x 2 D; ˚.y /  0 with y D C.x /: For any y 2 Œa; b satisfying ˚.y/  0; we have '.y/ D f .x/ for some x 2 D with C.x/  y; hence ˚.C.x//  ˚.y/  0; i.e., x is feasible to (11.47), and hence, '.y/ D f .x/  f .x /  '.y /; i.e., y is an optimal solution of (11.48). t u Note that the objective function '.y/ in the monotonic optimization problem (11.48) is given implicitly as the optimal value of a subproblem R.y/: However, in many cases of interest, R.y/ is solvable by standard methods and no difficulty can arise from that. Consider, for example, the problem minff .x/j x 2 D;

m X ui .x/ iD1

vi .x/

 1g;

where D is a polytope, f .x/ is an increasing function, ui .x/; vi .x/; i D 1; : : : ; m; are continuous functions such that vi .x/ > 0 8x 2 D and the set fyj yi D vuii .x/ .x/ ; i D 1; : : : ; m; x 2 Dg is contained in a box Œa; b  Rm : C

424

11 Monotonic Optimization

Setting Ci .x/ D

ui .x/ vi .x/ ;

˚.y/ D

Pm

iD1 yi

 1; we have

'.y/ D minff .x/j x 2 D; yi vi .x/  ui .x/; i D 1; : : : ; ng so the equivalent monotonic optimization problem is minf'.y/j

m X

yi  1; y 2 Œa; bg:

iD1

11.6 Applications Monotonic optimization has met with important applications in economics, engineering, and technology, especially in risk and reliability theory and in communication and networking systems.

11.6.1 Reliability and Risk Management As an example of problem with hidden monotonicity in the constraints consider the following problem which arises in reliability and risk management (Prékopa 1995): minfhc; xij Ax D b; x 2 RnC ; PfTx  .!/g  ˛g;

(11.49)

where C 2 Rn ; A 2 Rpn ; b 2 Rp ; T 2 Rmn are given deterministic data, ! is a random vector from the probability space .˝; ˙; P/ and W ˝ ! Rm represent stochastic right-hand side parameters; PfSg represent the probability of the event S 2 ˙ under the probability measure P and ˛ 2 .0; 1/ is a scalar. This is a linear program with an additional probabilistic constraint which requires that the inequalities Tx  .!/ hold with a probability of at least ˛: If the stochastic parameters have a continuous log-concave distribution, then problem (11.49) is guaranteed to have a convex feasible set (Prékopa 1973), and hence may be solvable with standard convex programming techniques (Prékopa 1995). For general distributions, in particular for discrete distributions, the feasible set is nonconvex. For every y 2 Rm define f .y/ WD inffhc; xij Ax D b; x 2 RnC ; Tx  yg: Let F W Rm ! Œ0; 1 be the cumulative density function of the random vector .!/; i.e., F.y/ D Pfy  .!/g: Assuming p  y  q with 1 < f .y/ < C1 8y 2 Œp; q problem (11.49) can then be rewritten as minff .y/j F.y/  ˛; y 2 Œp; qg:

(11.50)

11.6 Applications

425

As is easily seen, f .y/ is a continuous convex and increasing function while F.y/ is an u.s.c. increasing function, so this is a monotonic optimization problem equivalent to (11.49). For a study of problem (11.49) based on its monotonic reformulation (11.50) see, e.g., Cheon et al. (2006).

11.6.2 Power Control in Wireless Communication Networks We briefly describe a general monotonic optimization framework for solving resource allocation problems in communication and networking systems, as developed in the monographs (Bj¯ornson and Jorswieck 2012; Zhang et al. 2012) (see also Bj¯ornson et al. 2012; Utschick and Brehmer 2012; Jorswieck and Larson 2010; Qian and Jun Zhang 2010; Qian et al. 2009, 2012). For details the reader is referred to these works and the references therein. Consider a wireless network with a set of n distinct links. Each link includes a transmitter node Ti and receiver node Ri : Let Gij be the channel gain between node Ti and node Rj I fi the transmission power of link i (from Ti to Ri ); i the received noise power on link iI Then the received signal to interference-plus-noise ratio (SINR) of link i is i D P

Gii fi j¤i Gij fj C i

If p D .f1 ; : : : ; fn /; .p/ D . 1 .p/; : : : ; n .p// and U. .p// is the system utility then the problem we are concerned with can be formulated as ˇ ˇ max U. .p//; ˇ s.t. i .p/  i;min i D 1; : : : ; nI ˇˇ 0  fi  fimax i D 1; : : : ; n: ˇ

(11.51)

By its very nature U. .p// is an increasing function of .:/; so this is a problem of Generalized Linear Fractional Programming as discussed in the previous section. In the more general case of MISO (multi-input-single-output) links, each link includes a multi-antenna transmitter Ti and a single-antenna receiver Ri : The channel gain between node Ti and node Rj is denoted by vector hij : If wi is the beamforming vector of Ti which yields a transmit power level of fi D kwi k22 ; and i s the noise power at node Ri ; then the SINR of link i is 2 jhH ij wi j

i .w1 ; : : : ; wn / D P j¤i

jhYij wj j2 C i

:

As before, by U. / denote the system utility function which monotonically increases with WD . 1 ; : : : ; n /: Then the problem is to

426

11 Monotonic Optimization

ˇ ˇ max U. .w// ˇ s.t. i .w1 ; : : : ; wn /  i;min i D 1; : : : ; nI ˇˇ ˇ i D 1; : : : ; n: 0  kwi k22  Pmax i

(11.52)

where i;min is the minimum SINR requirement of link i and Pmax is the maximum i transmit power of link i: Again problem (11.52) can be converted into a canonical monotonic optimization problem. In fact, setting G D fyj 0  yi  i .w1 ; : : : ; wn /; 0  kwi k22  Pmax ; i D 1; : : : ; ng i H D fyj yi  i;min ; i D 1; : : : ; ng we can rewrite problem (11.52) as maxfU.y/j y 2 G \ Hg; where G is a closed normal set, H a closed conormal set. Remark 11.5 In Bj¯ornson and Jorswieck (2012) and Zhang et al. (2012) the monotonic optimization problem was solved by the Polyblock Algorithm or the BRB Algorithm. Certainly the performance could have been much better with the SIT Algorithm, had it been available at the time. Once again, it should be emphasized that not only is the SIT algorithm more efficient, but it also suits best the user’s purpose if we are mostly interested in quickly finding a feasible solution or, possibly, a feasible solution better than a given one.

11.6.3 Sensor Cover Energy Problem An important optimization problem that arises in wireless communication about sensor coverage energy (Yick et al. 2008) can be described as follows: Consider a wireless network deployed in a certain geographical region, with n sensors at points si 2 Rp ; i D f1; : : : ; ng; and m target nodes at points tj 2 Rp ; j D f1; : : : ; mg; where p is generally set to 2 or 3. The energy consumption per unit time of a sensor i is a monotonically non decreasing function of its sensing radius ri : Denoting this function by Ei .ri / assume, as usual in many studies on wireless sensor coverage, that ˇ

Ei .ri / D ˛i ri i C ;

li  ri  ui ;

where ˛i > 0; ˇi > 0 are constants depending on the specific device and represents the idle-state energy cost. The problem is to determine ri 2 Œli ; ui   RC ; i D 1; : : : ; n; such that each target node tj ; j D 1; : : : ; m; is covered by at least one sensor and the total energy consumption is minimized. That is,

11.6 Applications

427

Fig. 11.4 Sensor cover energy problem (n=3, m=14) x x

x

x

x x x

x x x

x x x

.SCEP/

min s.t.

x

sensor x target

Pn

iD1 Ei .ri / max1in .ri  ksi  tj k/  0 l  r  u;

j D 1; : : : ; m;

where the last inequalities are component-wise understood (Fig. 11.4). ˇ ˇ ˇ Setting xi D ˛i ri i ; aij D ˛i ksi  tj kˇi and ai D ˛i li i ; bi D ˛i ui i we can rewrite (SCEP) as ˇ P ˇ minimize f .x/ D niD1 xi subject to ˇ (11.53) gj .x/ WD max1in .xi  aij /  0; j D 1; : : : ; m; ˇˇ ˇ a  x  b: Clearly each gj .x/; j D 1; : : : ; m; is a convex function, so the problem is a linear optimization under a reverse convex constraint and can be solved by methods of reverse convex programming (Chap. 7, Sect. 7.3). In Astorino and Miglionico (2014), the authors use a penalty function to shift the reverse convex constraint to the objective function, transforming (11.53) into an equivalent dc optimization problem which is then solved by the DCA algorithm of P.D. Tao. Aside from the need of a penalty factor which is not simple to determine even approximately, a drawback of this method is that it can at best provide an approximate local optimal solution which is not guaranteed to be global. A better approach to problem (SCEP) is to view it as a monotonic optimization problem. In fact, each gj .x/; j D 1; : : : ; m; is an increasing function, i.e., gj .x/  gj .x0 / whenever x  x0 : So the feasible set of problem (11.53) : H WD fx 2 Œa; bj gj .x/  0; j D 1; : : : ; mg is a conormal set (see Sect. 11.1) and (SCEP) is actually a standard P monotonic optimization problem: minimizing the increasing function f .x/ D niD1 xi over the conormal set H:

428

11 Monotonic Optimization

Let aj D .a1j ; : : : ; anj /; J WD fj D 1; : : : ; mj a  aj g: We have H D \j2J Hj ; where Hj D fx 2 Œa; bj max1in .xi  aij /  0g: Clearly Hj D Œa; b n Cj with Cj D fx 2 Œa; bj max1in .xi  aij / < 0g: For every j 2 J; since the closure of Cj is Cj D fxj a  x  aj g D Œa; aj ; the set G WD [j2J C j D [j2J Œa; aj  is a polyblock. Let J1 be the set of all j 2 J for which aj is a proper vertex of the polyblock G, i.e., for which there is no k 2 J; k ¤ j such that ak  aj : Then G D [j2J1 Œa; aj ; hence [j2J Cj D [j2J1 Cj : Since H D \j2J Hj D Œa; b n [j2J Cj D Œa; b n [j2J1 Cj it follows that H D fx 2 Œa; bj gj .x/ D max .xi  aij /  0; 8j 2 J1 g D \j2J1 Hj : 1in

Proposition 11.25 H is a copolyblock in Œa; b and (SCEP) reduces to minimizing P the increasing function f .x/ D niD1 xi over this copolyblock. Proof We have Hj D Œa; bj n Cj ; where Cj D fx 2 Œa; bj max1in .xi  aij < 0g D Œa; b n Œa; aj / D [1in Œuij ; b; with uij D a C .aij  ai /ei (Proposition 11.7), so each Cj is a copolyblock with vertex set fu1j ; : : : ; unj g: Hence H D \j2J1 Hj is a copolyblock (Proposition 11.6). t u Let V denote the proper vertex P set of the copolyblock H: Since each v 2 V is a local minimizer of f .x/ D niD1 xi over H; we thus have a simple procedure for computing a local optimal solution of (SCEP). Starting from such a local optimal solution, a BRB Algorithm can then be applied to solve the discrete monotonic optimization minf

n X

xi j x 2 Vg

iD1

which is equivalent to (SCEP). This is essentially the method recently developed in Hoai and Tuy (2016) for solving SCEP. For detail on the implementation of the method and computational experiments with problems of up to 1000 variables, the reader is referred to the cited paper.

11.6 Applications

429

11.6.4 Discrete Location Consider the following discrete location problem: (DL) Given m balls in Rn of centers ai 2 RnCC and radii ˛i .i D 1; : : : ; m/; and a bounded discrete set S  RnC ; find the largest ball that has center in S and is disjoint from any of these m balls. In other words, ˇ ˇ maximize z subject to ˇ kx  ai k  ˛i  z; i D 1; : : : ; m; ˇˇ ˇ x 2 S  RnC ; z 2 RC :

(11.54)

This problem is encountered in various applications. For example, in location theory (Plastria 1995), it can be interpreted as a “maximin location problem”: ai ; i D 1; : : : ; m; are the locations of m obnoxious facilities, and ˛i > 0 is the radius of the polluted region of facility i; while an optimal solution is a location x 2 S outside all polluted regions and as far as possible from the nearest of these obnoxious facilities. In engineering design (DL) appears as a variant of the “design centering problem” (Thach 1988; Vidigal and Director 1982), an important special case of which, when ˛i D 0; i D 1; : : : ; m; is the “largest empty ball problem” (Shi and Yamamoto 1996): given m points a1 ; : : : ; am in RnCC and a bounded set S; find the largest ball that has center in S  RnC and contains none of these points (Fig. 11.5). As a first step towards solving (DL) one can study the following feasibility problem: (Q(r)) Given a number r  0; find a point x.r/ 2 S lying outside any one of the m balls of centers ai and radii i D ˛i C r: It has been shown in Tuy et al. (2003) that this problem can be reduced to maxfkxk2  h.x/j x 2 Sg; Fig. 11.5 Discrete location problem (n D 2; m D 10/

2

3

7 5

1

8 10 4 6

9

430

11 Monotonic Optimization

where h.x/ D maxiD1;:::;m .2hai ; xi C i2  kai k2 /: Since both kxk2 and h.x/ are obviously increasing functions, this is a discrete dm optimization problem that can be solved by a branch-reduce-and-bound method as discussed for solving the Master Problem in Sect. 11.3. Clearly if r is the maximal value of r such that (Q(r)) is feasible, then x D x.r/ will solve (DL). Noting that r  0 and for any r > 0 one has r  r or r < r according to whether (Q(r)) is feasible or not, the value r can be found by a Bolzano binary search scheme: starting from an interval Œ0; s containing r; one reduces it by a half at each step by solving a (Q(r)) with a suitable r: Quite encouraging computational results with this preliminary version of the SIT approach to dm optimization have been reported in Tuy et al. (2003). However, it turns out that more complete and much better results can be obtained by a direct application of the enhanced Discrete SIT Algorithm adapted to problem (11.54). Next we describe this method. Observe that kx  ai k  ˛i  z;

i D 1; : : : ; m;

, kxk2 C kai k2  2hai ; xi  .z C ˛i /2 ; 2

i D 1; : : : ; m;

, max f.z C ˛i / C 2ha ; xi  ka k g  kxk2 : i

i 2

iD1;:::;m

Therefore, by setting '.x; z/ D max ..z C ˛i /2 C 2hai; xi  kai k2 /; iD1;:::;m

(11.55)

problem (11.54) can be restated as maxfzj '.x; z/  kxk2  0; x 2 Œa; b \ S; z  0g;

(11.56)

where '.x; z/ and kxk2 are increasing functions and Œa; b is a box containing S: To apply the Discrete SIT Algorithm for solving this problem, observe that the value of z is determined when x is fixed. Therefore, branching should be performed upon the variables x only. Another key operation in the algorithm is bounding: given a box Œp; q  Œa; b; compute an upper bound .M/ for the optimal value of the subproblem maxfzj '.x; z/  kxk2  0; x 2 S \ Œp; q; 0  zg:

(11.57)

If CBV D r > 0 is the largest thus far known value of z at a feasible solution .x; z/; then only feasible points .x; z/ with z > r are still of interest. Therefore, to obtain a tighter value of .M/ one should consider, instead of (11.57), the problem .DL.M; r//

maxfzj '.x; z/  kxk2  0; x 2 S \ Œp; q; z  rg:

11.6 Applications

431

Because '.x; z/ and kxk2 are both increasing, the constraints of DL.M; r/ can be relaxed to '.p; z/  kqk2 ; so an obvious upper bound is c.p; q/ WD maxfzj '.p; z/  kqk2  0g:

(11.58)

Although this bound is easy to compute (it is the zero of the increasing function z 7! '.p; z/kqk2 /; it is often not quite efficient. A better bound can be computed by solving a linear relaxation of DL.M; r/ obtained by omitting the discrete constraint x 2 S and replacing '.x; z/ with a linear underestimator. Since .z C ˛i /2  .r C ˛i /2 C 2.r C ˛i /.z  r/; a linear underestimator of '.x; z/ is .x; z/ WD max f2.r C ˛i /.z  r/ C .r C ˛i /2 C 2hai ; xi  kai k2 g: iD1;:::;n

On the other hand, kxk2 

n X Œ.pj C qj /xj  pj qj 

8x 2 Œp; q;

jD1

so by (11.55) the constraint '.x; z/  kxk2  0 can be relaxed to .x; z/ 

n X Œ.pj C qj /xj  pj qj   0; jD1

which is equivalent to the system of linear constraints max f2.r C ˛i /.z  r/ C .r C ˛i /2 C 2hai; xi  kai k2 g 

iD1;:::;m

n X

Œ.pj C qj /xj  pj qj :

jD1

Therefore, a lower bound can be taken to be the optimal value .M/ of the linear programming problem

.LP.M; r//

ˇ ˇ max ˇ ˇ ˇ s.t. ˇ ˇ ˇ ˇ

2.r C ˛i /.z  r/ C .r C ˛i /2 C 2hai ; xi  kai k2 n X Œ.pj C qj /xj  pj qj   0; i D 1; : : : ; m; jD1

z  r; p  x  q:

The bounds can be further improved by using valid reduction operations. For this, observe that, since '.x; r/  '.x; z/ for r  z; the feasible set of (DL(M; r)) is contained in the set fxj '.x; r/  kxk2  0; x 2 Œp; qg;

432

11 Monotonic Optimization

so, according to Proposition 11.13, if p0 ; q0 are defined by (11.13) with g.x/ D '.x; r/;

h.x/ D kxk2 ;

and pQ D dp0 eS ; qQ D bq0 cS ; then the box ŒQp; qQ   Œp; q is a valid reduction of Œp; q: Thus, the Discrete SIT Algorithm as specialized to problem (DL) becomes the following algorithm: SIT Algorithm for (DL) Initialization. Let P1 WD fM1 g; M1 WD Œa; b; R1 WD ;: If some feasible solution .x; z/ is available, let r WD z be CBV; the current best value. Otherwise, let r WD 0: Set k WD 1. Step 1.

Step 2.

Step 3. Step 4.

Step 5.

Step 6.

Apply S-adjusted reduction cuts to reduce each box M WD Œp; q 2 Pk : In particular delete any box Œp; q such that '.p; r/  kqk2  0: Let Pk0 be the resulting collection of reduced boxes. For each M WD Œp; q 2 Pk0 compute .M/ by solving LP(M; r). If LP(M; r) is infeasible or .M/ D r; then delete M: Let Pk WD fM 2 Pk0 j .M/ > rg: For every M 2 Pk if a basic optimal solution of LP(M; r) can be S-adjusted to derive a feasible solution .xM ; zM / with zM > r; then use it to update CBV. Let Sk WD Rk [ Pk : Reset r WD CBV: Delete every M 2 Sk such that .M/ < r and let RkC1 be the collection of remaining boxes. If RkC1 D ;; then terminate: if r D 0; the problem is infeasible; otherwise, r is the optimal value and the feasible solution .x; z/ with z D r is an optimal solution. If RkC1 ¤ ;; let Mk 2 argmaxf.M/j M 2 RkC1 g: Divide Mk into two boxes according to the standard bisection rule. Let PkC1 be the collection of these two subboxes of Mk : Increment k and return to Step 1:

Theorem 11.2 “SIT Algorithm for (DL)” solves the problem .DL/ in finitely many iterations. Proof Although the variable z is not explicitly required to take on discrete values, this is actually a discrete variable, since the constraint (11.54) amounts to requiring that z D miniD1;:::;m .kx  ai k  ˛i / for x 2 S: The finiteness of “SIT Algorithm for (DL)” then follows from Theorem 7.10. t u As reported in Tuy et al. (2006), the above algorithm has been tested on a number of problem instances of dimension ranging from 10 to 100 (10 problems for each instance of n/: Points aj were randomly generated in the square 1000  xi  90;000; i D 1; : : : ; n; while S was taken to be the lattice of points with integral coordinates. The computational results suggest that the method is quite practical for this class of discrete optimization problems. The computational cost increases much

11.7 Exercises

433

more rapidly with n (dimension of space or number of variables) than with m (number of balls). Also, when compared with the results preliminarily reported in Tuy et al. (2003), they show that the performance of the method can be drastically improved by using a more suitable monotonic reformulation.

11.7 Exercises 1 Let P be a polyblock in the box Œa; b  Rn : Show that the set Q D cl.Œa; b n P/ is a copolyblock. Hint: For each vertex ai of P the set Œa; b n Œa; ai / is a copolyblock; see Proposition 11.7. 2 Let h.x/ be a dm function on Œa; b such that h.x/  ˛  0 8x 2 Œa; b and q W RC ! R a convex decreasing function with finite right derivative at a W q0C .a/ > 1: Prove that q.h.x// is a dm function onŒa; b: Hint: For any 2 RC we have q.t/  q. / C q0C . /.t  / with equality holding for D t; so q.t/ D sup 2RC fq. / C q0C . /.t  /g: If h.x/ D u.x/  v.x/ with u.x/; v.x/ increasing, show that g.x/ WD K.u.x/ C v.x//  q.h.x// is increasing. 3 Rewrite the following problem (Exercise 10, Chap. 5) as a canonical monotonic optimization problem (MO): minimise 3x1 C 4x2 subject to y11 x1 C 2x2 C x3 D 5I

y12 x1 C x2 C x4 D 3

x1 ; x2  0I y11  y12  1I 0  y11  2I 0  y12  2 Solve this problem by the SIT Algorithm for (MO). 4 Consider the monotonic optimization problem minff .x/j g.x/  0; x 2 Œa; bg where f .x/; g.x/ are lower semi-continuous increasing functions. Devise a rectangular branch and bound method for solving this problem using an adaptive branching rule. Hint: A polyblock (with vertex set V) which contains the normal set G D fx 2 Œa; bj g.x/  0g can be viewed as a covering of G by the set of overlapping boxes Œa; v; v 2 V W G  [v2V Œa; v; so the POA algorithm is a BB procedure in each iteration of which a set of overlapping boxes Œa; v; v 2 V is used to cover the feasible set instead of a set of non-overlapping boxes Œp; q  Œa; b as in a conventional BB algorithm.

Chapter 12

Polynomial Optimization

In this chapter we are concerned with polynomial optimization which is a very important subject with a multitude of applications: production planning, location and distribution (Duffin et al. 1967; Horst and Tuy 1996), risk management, water treatment and distribution (Shor 1987), chemical process design, pooling and blending (Floudas 2000), structural design, signal processing, robust stability analysis, design of chips (Ben-Tal and Nemirovski 2001), etc.

12.1 The Problem In its general form the polynomial optimization problem can be formulated as .P/

minff0 .x/j fi .x/  0 .i D 1; : : : ; m/; a  x  bg;

where a; b 2 RnC and f0 .x/; f1 .x/; : : : ; fm .x/ are polynomials, i.e., fi .x/ D

X ˛2Ai

ci˛ x˛ ; x˛ WD x˛1 1 x˛2 2    x˛n n ; ci˛ 2 R;

(12.1)

with ˛ D .˛1 ; : : : ; ˛n / being an n-vector of nonnegative integers. Since the set of polynomials on Œa; b is dense in the space CŒa; b of continuous functions on Œa; b with the supnorm topology, any continuous optimization problem can be, in principle, approximated as closely as desired by a polynomial optimization problem. Therefore, polynomial programming includes an extremely wide class of optimization problems. It is also a very difficult problem of global optimization because many special cases of it such as the nonconvex quadratic optimization problem are known to be NP-hard.

© Springer International Publishing AG 2016 H. Tuy, Convex Analysis and Global Optimization, Springer Optimization and Its Applications 110, DOI 10.1007/978-3-319-31484-6_12

435

436

12 Polynomial Optimization

Despite its difficulty, polynomial optimization has become a subject of intensive research during the last decades. Among the best so far known methods of polynomial optimization one should mention: an earliest method by Shor and Stetsenko (1989), the RLT method (Reformulation–Lineralization Technique) by Sherali and Adams (1999), a SDP relaxation method by Lasserre (2001). Furthermore, since any polynomial on a compact set can be represented as a dc function, it is natural that dc methods of optimization have been adapted to solve polynomial programs. Particularly a primal-dual relaxed algorithm called GOP developed by Floudas (2000) for biconvex programming has been later extended to a wider class of polynomial optimization problems. Finally, by viewing a polynomial as a dm function, polynomial optimization problems can be efficiently treated by methods of monotonic optimization.

12.2 Linear and Convex Relaxation Approaches In branch and bound methods for solving optimization problems with a nonconvex constraint set, an important operation is bounding over a current partition set M. A common bounding practice consists in replacing the original problem .P/ by a more easily solvable relaxed problem. Solving this relaxed problem restricted to M yields a lower bound for the objective function value over M: Under suitable conditions this relaxation can be proved to lead to a convergent algorithm. Below we present the RLT method (Sherali and Adams 1999) which is an extension of the Reformulation–Linearization technique for quadratic programming (Chap. 6, Sect. 9.5) and the SDP relaxation method (Lasserre 2001, 2002) which uses a specific convex relaxation technique based on Semidefinite Programming.

12.2.1 The RLT Relaxation Method Letting li .x/ WD ci0 C hci ; xi be the linear part of fi .x/; the polynomial (12.1) can be written as X fi .x/ D li .x/ C ci˛ x˛ ; ˛2A i

P   where Ai D f˛ 2 Ai j njD1 ˛j  2g: Setting A D [m iD0 Ai ; ci˛ D 0 for ˛ 2 A n Ai we can further write (12.1) as fi .x/ D li .x/ C

X ˛2A

ci˛ x˛ :

(12.2)

12.2 Linear and Convex Relaxation Approaches

437

The RLT method is based on the simple observation that using the substitution y˛ x˛P each polynomial fi .x/ can be transformed into the function Li .x; y/ D li .x/ C ˛2A ci˛ y˛ ; which is linear in .x; y/ 2 Rn  RjAj : Consequently, the polynomial optimization problem (P) can be rewritten as

.Q/

P ˇ ˇ min l0 .x/ C ˛2A c0˛ y˛ ; ˇ P ˇ s:t li .x/ C ˛2A ci˛ y˛  0 .i D 1; : : : ; m/; ˇ ˇ a  x  b; ˇ ˇ y˛ D x˛ ; ˛ 2 A:

which is a linear program with the additional nonconvex constraints y˛ D x˛ ; ˛ 2 A:

(12.3)

The linear program obtained by omitting these nonconvex constraints

.LR.P//

ˇ P ˇ min l0 .x/ C ˛2A c0˛ y˛ ; ˇ P ˇ s:t li .x/ C ˛2A ci˛ y˛  0 .i D 1; : : : ; m/; ˇ ˇ a  x  b:

yields a linear relaxation of .P/: For any box M D Œp; q  Œa; b let .PM / denotes the problem .P/ restricted to the box M D Œp; qI then the optimal value of LR.PM / yields a lower bound ˇ.M/ for min(PM ) such that ˇ.M/ D C1 whenever the feasible set of .PM / is empty. The bounds computed this way may be rather poor. To have tighter relaxation and hence, tighter bounds, one way is to augment the original constraint set with implied constraints formed by taking mixed products of original constraints. For example, for a partition set Œp; q; such an implied constraint is .x1  p1 /.x3  p3 / .q2  x2 /  0; i.e., x1 x2 x3 C p1 x2 x3 C p3 x1 x2 C q2 x1 x3  p1 p3 x2 q2 p3 x1 C p1 q3 x2  p1 q2 x3  0: Of course, the more implied constraints are added, the tighter the relaxation is, but the larger the problem becomes. In practice it is enough to add just a few implied constraints. With bounds computed by RLT relaxation it is not difficult to devise a convergent rectangular branch and bound algorithm for polynomial optimization. To ease the reasoning in the sequel it is convenient to write a vector ˛ D .˛1 ; ˛2 ; : : : ; ˛n / 2 A as a sequence of nonnegative integers between 0 and n as follows: ˛1 numbers 1, then ˛2 numbers 2, . . . , then ˛n numbers n: For example, for n D 4 a vector ˛ D .2; 3; 1; 2/ is written as J.˛/ D 11222344: By A denote the set of all J.˛/; ˛ 2 A:

438

12 Polynomial Optimization

Lemma 12.1 Consider a box M D Œp; q  Œa; b and let .x; y/ be a feasible solution to LR(PM ). If xl D pl then yJ[flg D pl yJ

8J 2 A ; l 2 f0; 1; : : : ; ng:

(12.4)

8J 2 A ; l 2 f0; 1; : : : ; ng:

(12.5)

Similarly, if xl D ql then yJ[flg D ql yJ

Proof Consider first the case xl D pl : For jJj D 1 consider any j 2 f0; 1; : : : ; ng: If .x; y/ is feasible to LR.PM / then .x; y/ satisfies Œ.xl  pl /.xj  pj /L D ylj  pj xl  pl xj C pl pj  0 Œ.xl  pl /.qj  xj /L D ylj C pl xj C qj xl  pl qj  0; hence, noting that xl D pl ; pj .xl  pl / C pl xj  ylj  qj .xl  pl / C pl xj and consequently, ylj D pl xj : Thus (12.4) is true for jJj D 1: Arguing by induction, suppose (12.4) true for jJj D 1; : : : ; t  1 and consider jJj D t where 2  t  jA j: For any j 2 f0; 1; : : : ; ng; if .x; y/ is feasible to LR.PM / then Œ.xj  pj /.xl  pl / Œ.xl  pl /.qj  xj /

Q

r2Jnfjg .xr

 pr /L  0

r2Jnfjg .xr

 pr /L  0:

Q

Q Q By writing r2Jnfjg .xr  pr / D r2Jnfjg xr C f .x/ where f .x/ is a polynomial in x of degree no more than t  2; we then have .yJ[flg  pl yJ /  pj .yJnflg[fjg  pl yJnfjg / CŒpl xj f .x/  xl xj f .x/L C pj Œxl f .x/  pl f .x/L .yJ[flg  pl yJ /  qj .yJ[flgnfjg  pl yJnfjg / CŒpl xj f .x/  xl xj f .x/L C qj Œxl f .x/  pl f .x/L : By induction hypothesis yJ[flgnfjg Dpl yJnfjg ; Œxl xj f .x/L Dpl Œxj f .x/L ; and Œxl f .x/L D pl Œf .x/L ; so the right-hand sides of both the above inequalities are zero. Hence, yJ[flg D pl yJ : t u Corollary 12.1 Let M D Œp; q  Œa; b: If .x; y/ is a feasible solution of LR.PM / such that x is a corner of the box M then x is a feasible solution of .PM /; i.e., is a feasible solution of problem (P) in the box M.

12.2 Linear and Convex Relaxation Approaches

439

Proof If x is a corner of the box M D Œp; q there exists a set I  f1; : : : ; ng such that xi D pi 8i 2 I and xj D qj 8j 2 K WD f1; : : : ; ng n I: Consider any ˛ 2 A and J D J.˛/:QBy Lemma 12.1, for every Q j 2 J \ K we Q have yJ D qj yJnfjg D xj yJnfjg ; so yJ D . j2J\K xj /yJ\I : Then yJ D . j2J\K xj /. i2J\I xi /; i.e., y˛ D x˛ : Hence, x 2 M and is a feasible solution of .P/: t u Observe that the above proposition is an extension of Proposition 9.9 for quadratic programming to polynomial optimization. Just like for quadratic programming, it gives rise to the following branching method in a BB procedure for solving .P/: By D denote the feasible set of .P/: Let Mk D Œpk ; qk   Œa; b be the partition set selected for further exploration at iteration kth of the BB procedure. Let ˇ.Mk / be the lower bound for f0 .x/ over D \ Mk computed by RLT relaxation, i.e., X c0˛ yk˛ ; (12.6) ˇ.Mk / D l0 .xk / C ˛2A

where .xk ; yk / 2 Rn  RjAj is an optimal solution of problem LR.PMk /: If xk is a corner of Mk ; we are done by the above proposition. Otherwise, we bisect Mk via .xk ; ik / where ik 2 argmaxf ki j i D 1; : : : ; ng;

ki D minfxki  pki ; qki  xki g:

Theorem 12.1 A rectangular BB Algorithm with bounding by RLT relaxation and branching as above either terminates after finitely many iterations, yielding an optimal solution xk of .P/ or it generates an infinite sequence fxk g converging to an optimal solution x of .P/: Proof The proof is analogous to that of Proposition 9.10. Specifically, we need only consider the case when the algorithm is infinite. By Theorem 6.3 the branching rule ensures that a filter of rectangles M D Œp ; q  is generated such that .x ; y / ! .x ; y /; and x is a corner of M WD Œp ; q  D \ Œp ; q /: Clearly .x ; y / is a feasible solution of LR.PM /; P so by above corollary x is a feasible solution of   .P/: Since ˇ.M / D l .x / C D we have l0 .x / C  0 ˛2A c0˛ y˛  f0 .x/ 8x 2 P P      c y  f .x/ 8x 2 D: But x 2 D implies that l .x /C 0 0 ˛2A 0˛ ˛ ˛2A c0˛ y˛ D f0 .x /; hence f0 .x /  f0 .x/ 8x 2 D: t u

12.2.2 The SDP Relaxation Method This method is due essentially to Lasserre (2001, 2002). By p denote the optimal value of the polynomial optimization problem .P/: p WD minff0 .x/j fi .x/  0 .i D 1; : : : ; m/; x D .x1 ; x2 ; : : : ; xn / 2 Rn g:

440

12 Polynomial Optimization

We show that a SDP relaxation of .P/ can be constructed whose optimal value approximates p as closely as desired. Consider the following matrices called moment matrices: 2

3 x1   M1 .x/ D 4 : : : 5 1 x1 : : : xn xn 2 3 1 x1 x 2 : : : xn 6 x 7 6 1 x21 x1 x2 : : : x1 xn 7 6 7 D 6 x2 x1 x2 x22 : : : x2 xn 7 6 7 4::: ::: :::: ::: ::::5 xn x1 xn x2 xn : : : : x2n

(12.7)

(12.8)

2

3 x1 6 ::: 7 6 7 6 7 6 xn 7 6 2 7 6 x1 7 6 7 6 x1 x2 7 6 7  2 2 2 7 M2 .x/ D 6 6 : : : 7 x 1 : : : xn x1 x1 x2 : : : x1 xn x2 x2 x3 : : : xn 6x x 7 6 1 n7 6 x2 7 6 2 7 6 7 6 x2 x3 7 6 7 4 ::: 5 x2n 2 3 1 x1 : : : : xn x21 : : : x2n 6 x1 x2 : : : : x1 xn x3 : : : : x1 x2 7 n7 1 1 D6 4::: ::: ::: ::: ::: ::: :::5 x2n x1 x2n : : : x3n x21 x2n : : : x4n

(12.9)

(12.10)

and so all. We also set M0 .x/ D Œ1: Clearly, each moment matrix Mk .x/ is semidefinite positive. Setting Mik .x/ WD fi .x/Mk .x/ for any given natural k we then have Mi.k1/ .x/ 0 , fi .x/  0; so the problem (P) is equivalent to .Qk /

minff0 .x/j Mi.k1/ .x/ 0; i D 1; : : : ; mg:

Upon the change of variables x˛ y˛ the objective function in .Qk / becomes a linear function of y D fy˛ g while the constraints become linear matrix inequalities

12.3 Monotonic Optimization Approach

441

in y D fy˛ g: The problem .Qk / then becomes a SDP which is a relaxation of .P/; referred to as the k-th order SDP relaxation of .P/: We have the following fundamental result: Theorem 12.2 Suppose the constraint set fx W fi .x/  0; i D 1; 2; : : : ; mg is compact. Then lim inf.Qk / D min.Q/:

(12.11)

k!C1

We omit the proof which is based on a theorem of real algebraic geometry by Putinar (1993) on the representation of a positive polynomial as sum of squares of polynomials. The interested reader is referred to Lasserre (2001, 2002) for more detail. Remark 12.1 In the particular case of quadratic programming: fi .x/ D

1 hQi x; xi C hci ; xi C di ; i D 0; 1; : : : ; m; 2

with symmetric matrices Qi ; the first order SDP relaxation is 1 1 minf hQ0 ; yi C hc0 ; xi C d0 j hQi ; yi C di  0; i D 1; : : : ; m; x;y 2 2



 y x 0g x T x0

which can be shown to be the Lagrangian dual of the problem   Q.u/ c.u/ max ftj 0g; .c.u//T 2.d.u/  t/ t2R;u2Rm C P Pm Pm where Q.u/ D Q0 C m iD1 ui Qi ; c.u/ D c0 C iD1 ui ci ; d.u/ D iD1 ui di : Thus, for quadratic programming the first order SDP relaxation coincides with the Lagrange relaxation as computed at the end of Chap. 10.

12.3 Monotonic Optimization Approach Although very ingenious, the RLT and SDP relaxation approaches suffer from two drawbacks. First, they are very sensitive to the highest degree of the polynomials involved. Even for problems of small size but with polynomials of high degree they require introducing a huge number of additional variables. Second, they only provide an approximate optimal value rather than an approximate optimal solution of the problem. In fact, in these methods the problem .P/ is relaxed to a linear program or a SDP problem in the variables x 2 Rn ; yDfy˛ ; ˛ 2 Ag; which will become equivalent to .P/ when added the nonconvex constraints

442

12 Polynomial Optimization

y˛ D x˛

8˛ 2 A:

(12.12)

So if .Ox; yO / is an optimal solution to this relaxed problem then xO will solve problem .P/ if yO ˛ D xO ˛ 8˛ 2 A: The idea, then, is to use a suitable BB procedure with bounding based on this kind of relaxation to generate a sequence .Oxk ; yO k / such that kOyk  xO k k ! 0 as k ! C1: Under these conditions, given tolerances " > 0; > 0 it is expected that for large enough k one would have kOyk  xO k k  "; while f0 .Oxk /   p ; where p is the optimal value of problem .P/: Such an xO k is accepted as an approximate optimal solution, though it is infeasible and may sometimes be quite far off the true optimal solution. Because of these difficulties, if we are not so much interested in finding an optimal solution as in finding quickly a feasible solution, or possibly a feasible solution better than a given one, then neither the RLT nor the SDP approach can properly serve our purpose. By contrast, with the monotonic optimization approach to be presented below we will be able to get round these limitations.

12.3.1 Monotonic Reformulation Recall that a function f W Rn ! R is said to be increasing on a box Œa; b WD fx 2 Rn j a  x  bg  Rn if f .x0 /  f .x/ for all x0 ; x satisfying a  x0  x  b: It is said to be a d.m. function on Œa; b if f .x/ D f1 .x/  f2 .x/; where f1 ; f2 are increasing on Œa; b: Clearly a polynomial of n variables with positive coefficients is increasing on the orthant RnC : Since every polynomial can be written as a difference of two polynomials with positive coefficients: X

c˛ x˛ D

˛

X

c˛ x˛  Œ

c˛ >0

X

.c˛ /x˛ 

c˛ 0g ¤ ;:

(12.14)

444

12 Polynomial Optimization

For "  0; an x 2 Œa; b satisfying g.x/  " is called an "-essential feasible solution, and a nonisolated feasible solution x of .Q/ is called an essential "-optimal solution if it satisfies f .x/  "  infff .x/j g.x/  "; x 2 Œa; bg:

(12.15)

Clearly for " D 0 a nonisolated feasible solution which is essentially "-optimal is optimal. A basic step towards finding an essential "-optimal solution is to deal with the following subproblem of incumbent transcending: .SP / Given a real number  f .a/ (the incumbent value), find a nonisolated feasible solution xO of .Q/ such that f .Ox/  ; or else establish that none such xO exists. Since f .x/ is increasing, if D f .b/ then an answer to the subproblem .SP / would give a nonisolated feasible solution or else identify essential infeasibility of the problem (a problem is called essentially infeasible if it has no nonisolated feasible solution). If x is the best nonisolated feasible solution currently available and D f .x/  "; then an answer to .SP / would give a new nonisolated feasible solution xO with f .Ox/  f .x/  "; or else identify x as an essential "-optimal solution. In connection with question .SP / consider the master problem: .Q /

maxfg.x/j f .x/  ; x 2 Œa; bg:

If g.a/ > 0 and D f .a/ then a solves .Q /: Therefore, barring this trivial case, we can assume that g.a/  0; f .a/ < :

(12.16)

Since g and f are both increasing functions, problem .Q / can be solved very efficiently, while, as it turns out, solving .Q / furnishes an answer to question .SP /. More precisely, Proposition 12.1 Under assumption (12.16): (i) Any feasible solution x0 of .Q / such that g.x0 / > 0 is a nonisolated feasible solution of (P) with f .x0 /  : In particular, if max.Q / > 0 then the optimal solution xO of .Q / is a nonisolated feasible solution of .P/ with f .Ox/  : (ii) If max.Q / < " for D f .x/  "; and x is a nonisolated feasible solution of .P/, then it is an essential "-optimal solution of (P). If max(Q / < " for D f .b/; then the problem .P/ is "-essentially infeasible. Proof (i) In view of assumption (12.16), x0 ¤ a: Since f .a/ < ; and x0 ¤ a; every x in the line segment joining a to x0 and sufficiently close to x0 satisfies f .x/  f .x0 /  ; g.x/ > 0; i.e., is a feasible solution of .P/: Therefore, x0 is a nonisolated feasible solution of .P/ satisfying f .x0 /  :

12.3 Monotonic Optimization Approach

445

(ii) If max.Q // < " then " > supfg.x/j f .x/  ; x 2 Œa; bg; so for every x 2 Œa; b satisfying g.x/  "; we must have f .x/ > D f .x/  "; in other words, infff .x/j g.x/  "; x 2 Œa; bg > f .x/  ": This means that, if x is a nonisolated feasible solution, then it is an essential "-optimal solution, and if D f .b/; then fx 2 Œa; bj g.x/  "g D ;; i.e., the problem is "-essentially infeasible. t u On the basis of this proposition, the problem .Q/ can be solved by proceeding according to the following SIT (Successive Incumbent Transcending) scheme: 1. Select tolerance > 0: Start with D 0 for 0 D f .b/: 2. Solve the master problem .Q ). f .Ox/C

- If max.Q / > 0 an essential feasible xO with f .Ox/  is found. Reset and repeat. - Otherwise, max.Q /  0 W no essential feasible solution x for .Q/ exists such that f .x/  : So if D f .Ox/ C for some essential feasible solution xO of .Q/; xO is an essential -optimal solution; if D 0 problem .Q/ is essentially infeasible. The key for implementing this conceptual scheme is an algorithm for solving the master problem.

12.3.3 Solving the Master Problem Recall that the master problem .Q ) is maxfg.x/j f .x/  ; x 2 Œa; bg: We will solve this problem by a special RBB (Reduce-Bound-and-Branch) algorithm involving three basic operations: reducing (the partition sets), bounding, and branching. - Reducing: using valid cuts reduce the size of each current partition set M D Œp; q  Œa; b: The box Œp0 ; q0  obtained from M as a result of the cuts is referred to as a valid reduction of M and denoted by red Œp; q: - Bounding: for each current partition set M D Œp; q estimate a valid upper bound ˇ.M/ for the maximum of g.x/ over the feasible portion of .Q ) contained in the box Œp0 ; q0  D red Œp; q: - Branching: successive partitioning of the initial box M0 D Œa; b using an exhaustive subdivision rule or any subdivision consistent with the bounding.

446

12 Polynomial Optimization

a. Valid Reduction At a given stage of the RBB Algorithm for .Q /; let Œp; q  Œa; b be a box generated during the partitioning procedure and still of interest. The search for a nonisolated feasible solution of .Q/ in Œp; q such that f .x/  can then be restricted to the set B \ Œp; q; where B WD fxj f .x/   0; g.x/  0g:

(12.17)

Since g.x/ D minjD1;:::;m fuj .x/  vj .x/g with uj .x/; vj .x/ being increasing polynomials (see (12.13)), we can also write B D fxjf .x/  ; uj .x/  vj .x/  0 j D 1; : : : ; mg: The reduction operation aims at replacing the box Œp; q with a smaller box Œp0 ; q0   Œp; q without losing any point x 2 B \ Œp; q; i.e., such that B \ Œp0 ; q0  D B \ Œp; q: The box Œp0 ; q0  satisfying this condition is referred to as a valid reduction of Œp; q and denoted by redŒp; q: Let ei be the i-th unit vector of Rn and for any continuous function '.t/ W Œ0; 1 ! R such that '.0/ < 0 < '.1/; let N.'/ denote the largest value t 2 .0; 1/ satisfying '.t/ D 0: Also for every i D 1; : : : ; n let qi D p C .qi  pi /ei : By Proposition 10.13 applied to problem .Q ) we have ; if f .p/ > or minj fuj .q/  vj .p/g < 0 red Œp; q D Œp0 ; q0  if f .p/  & minj fuj .q/  vj .p/g  0 where 0

q DpC

n X

˛i .qi  pi /e ; i

iD1

˛i D



0

p Dq 

n X

ˇi .q0i  pi /ei ;

iD1

1; if f .qi /   0 N.'i /; if f .qi /  > 0: ˇi D

0

'i .t/ D f .p C t.qi  pi /ei /  :

1; if minj fuj .q0i /  vj .p/g  0 N. i /; if minj fuj .q0i /  vj .p/g < 0:

i .t/

D minj fuj .q0  t.q0i  pi /ei /  vj .p/g:

Remark 12.2 It is easily verified that the box Œp0 ; q0  D red Œp; q still satisfies f .p0 /  ;

minfuj .q0 /  vj .p0 /g  0: j

12.3 Monotonic Optimization Approach

447

b. Valid Bounds Given a box M WD Œp; q; which is supposed to have been reduced, we want to compute an upper bound ˇ.M/ for maxfg.x/j f .x/  ; x 2 Œp; qg: Since g.x/ D minjD1;:::;m fuj .x/  vj .x/g and uj .x/; vj .x/ are increasing, an obvious upper bound is ˇ.M/ D min Œuj .q/  vj .p/: jD1;:::;m

(12.18)

Though very simple, this bound ensures convergence of the algorithm, as will be seen shortly. However, for a better performance of the procedure, tighter bounds can be computed using, for instance, the following: Lemma 12.2 (i) If g.p/ > 0 and f .p/ < , then p is a nonisolated feasible solution with f .p/ < : (ii) If f .q/ > and x.M/ D p C .q  p/ with  D maxf˛j f .p C ˛.q  p//  g; zi D q C .xi .M/  qi /ei ; i D 1; : : : ; n; then an upper bound of g.x/ over all x 2 Œp; q satisfying f .x/  is ˇ.M/ D max

min fuj .zi /  vj .p/g:

iD1;:::;n jD1;:::;m

Proof (i) Obvious. (ii) Let Mi D Œp; zi  D fxj p  x  zi g D fx 2 Œp; qj fi  xi  xi .M/g: From the definition of x.M/ it is clear that f .z/ > f .x.M// D for all z D p C ˛.q  p/ with ˛ > : Since for every x > x.M/ there exists z D p C ˛.q  p/ with ˛ > ; such that x  z; it follows that f .x/  f .z/ > for all x > x.M/: So fx 2 Œp; qj f .x/  g  Œp; q n fxj x > x.M/g: On the other hand, noting that fxj x > x.M/g D \niD1 fxj xi > xi .M/g; we can write Œp; q n \niD1 fxj xi > xi .M/g D [iD1;:::;n fx 2 Œp; qj pi  xi  xi .M/g D [iD1;:::;n Mi : Since ˇ.Mi / D minjD1;:::;m Œuj .zi /  vj .p/  maxfg.x/j x 2 Mi g it follows that ˇ.M/ D maxfˇ.Mi /j i D 1; : : : ; ng  maxfg.x/j x 2 [iD1;:::;n Mi g  maxfg.x/j x 2 B \ Œp; qg: t u Remark 12.3 Each box Œp; zi  can be reduced by the method presented above. If Œp0i ; q0i  D redŒp; zi ; i D 1; : : : ; n; then without much extra effort, we can have a more refined upper bound, namely ˇ.M/ D max f min Œuj .q0i /  vj .p0i /g: iD1;:::;n jD1;:::;m

The points zi constructed as in (ii) determine a set Z WD [niD1 Œp; zi  containing fx 2 Œp; qj f .x/  g: Such a set Z is a polyblock with vertex set zi ; i D 1; : : : ; n

448

12 Polynomial Optimization

(see Sect. 10.1). The above procedure thus amounts to constructing a polyblock Z  fx 2 Œp; qj f .x/  g—which is possible because f .x/ is increasing. To have a tighter bound, one can even construct a sequence of polyblocks Z1  Z2 ; : : : ; approximating the set B \ Œp; q more and more closely. For the details of this construction the interested reader is referred to Tuy (2000a). By using polyblock approximation one could compute a bound as tight as we wish; since, however, the computation cost increases rapidly with the quality of the bound, a trade-off must be made, so practically just one approximating polyblock as in the above lemma is used. Remark 12.4 In many cases, efficient bounds can also be computed by exploiting, additional structure such as partial convexity, aside from monotonicity properties Tuy (2007b). For example, assuming, without loss of generality, that p 2 RnCC ; i.e., fi > 0; i D 1; : : : ; n; and using the transformation ti D log xi ; i D 1; : : :P ; n; one can ˛ ˛ write each monomial xP as expht; ˛i; and each polynomial fk .x/ D ˛2Ak ck˛ x as a function 'k .t/ D ˛2Ak ck˛ expht; ˛i: Since, as can easily be seen, expht; ˛i with ˛ 2 Rm C is an increasing convex function of t; each polynomial in x becomes a d.c. function of t and the subproblem over Œp; q becomes a d.c. optimization problem under d.c. constraints. Here each d.c. function is a difference of two increasing convex functions of t on the box Œexp p; exp q; where exp p denotes the vector .exp f1 ; : : : ; exp fn /: A bound over Œp; q can then be computed by exploiting simultaneously the d.c. and d.m. structures.

c. A Robust Algorithm Incorporating the above RBB procedure for solving .Q ) into the SIT scheme yields the following robust algorithm for solving .Q/ W SIT Algorithm for (Q) Initialization. If no feasible solution is known, let D f .b/ I otherwise, let x be the best nonisolated feasible solution available, D f .x/  ": Let P1 D fM1 g; M1 D Œa; b; R1 D ;: Set k D 1. Step 1.

For each box M 2 Pk W -

Step 2.

Compute its valid reduction redMI Delete M if redM D ;I Replace M by redM if redM ¤ ;I If redM D Œp0 ; q0 , then compute an upper bound ˇ.M/ for g.x/ over the feasible solutions in redM: (ˇ.M/ must satisfy ˇ.M/  minjD1;:::;m Œuj .q0 /  vj .p0 /; see (12.18)). Delete M if ˇ.M/ < 0:

Let P 0 k be the collection of boxes that results from Pk after completion of Step 1. Let R 0 k D Rk [ P 0 k :

12.3 Monotonic Optimization Approach

Step 3. Step 4. Step 5. Step 6.

449

If R 0 k D ;, then terminate: x is an essential "-optimal solution of .Q/ if D f .x/  "; or the problem .Q/ is essentially infeasible if D f .b/. If R 0 k ¤ ;; let Œpk ; qk  WD Mk 2 argmaxfˇ.M/j M 2 R 0 k g; ˇk D ˇ.Mk /: If ˇk < ", then terminate: x is an essential "-optimal solution of .Q/ if D f .x/  "; or the problem .Q/ is "-essentially infeasible if D f .b/: If ˇk  "; compute k D maxf˛j f .pk C ˛.qk  pk //   0g and let xk D pk C k .qk  pk /:

6.a)

6.b) Step 7.

If g.xk / > 0, then xk is a new nonisolated feasible solution of .Q/ with f .xk /  W if g.pk / < 0; compute the point xk where the line segment joining pk to xk meets the surface g.x/ D 0; and reset x xk I otherwise, k reset x p : Go to Step 7. If g.xk /  0; go to Step 7, with x unchanged. Divide Mk into two subboxes by the standard bisection (or any bisection consistent with the bounding M 7! ˇ.M//: Let PkC1 be the collection of these two subboxes of Mk ; RkC1 D R 0 k n fMk g: Increment k; and return to Step 1.

Proposition 12.2 The above algorithm terminates after finitely many steps, yielding either an essential "-optimal solution of .Q/; or an evidence that the problem is essentially infeasible. Proof Since any feasible solution x with f .x/  D f .x/  " must lie in some box M 2 R 0 k the event R 0 k D ; implies that no such solution exists, hence the conclusion in Step 3. If Step 5 occurs, so that ˇk < "; then max.Q = /  "; hence the conclusion in Step 5 (see Proposition 12.1). Thus the conclusions in Step 3 and Step 5 are correct. It remains to show that either Step 3 (R 0 k D ;/ or Step 5 (ˇk < "/ must occur for sufficiently large k: To this end, observe that in Step 6, since f .pk /  (Remark 12.2), the point xk exists and satisfies f .xk /  ; so if g.xk / > 0; then by Proposition 12.1, xk is a nonisolated feasible solution with f .xk /  f .x/  "; justifying Step 6a. Suppose now that the algorithm is infinite. Since each occurrence of Step 6a decreases the current best value at least by " > 0 while f .x/ is bounded below it follows that Step 6a cannot occur infinitely often. Consequently, for all k sufficiently large, x is unchanged, and g.xk /  0; while ˇk  ": But, as k ! C1; we have, by exhaustiveness of the subdivision, diamMk ! 0; i.e., kqk  pk k ! 0: Denote by xQ the common limit of qk and pk as k ! C1: Since "  ˇk  min Œuj .qk /  vj .pk /; jD1;:::;m

it follows that "  lim ˇk  min Œuj .Qx/  vj .Qx/ D g.Qx/: k!C1

jD1;:::;m

But by continuity, g.Qx/ D limk!C1 g.xk /  0; a contradiction. Therefore, the algorithm must be finite. u t

450

12 Polynomial Optimization

Remark 12.5 The SIT Algorithm works its way to the optimum through a sequence of better and better nonisolated solutions. If for some reason the algorithm has to be stopped prematurely, some reasonably good feasible solution may have been already obtained. This is a major advantage of it over most existing algorithms, which may be useless when stopped prematurely.

12.3.4 Equality Constraints So far we assumed (12.14), so that the feasible set has a nonempty interior. Consider now the case when there are equality constraints, e.g., hl .x/ D 0; l D 1; : : : ; s ;

(12.19)

First, if some of these constraints are linear, they can be used to eliminate certain variables. Therefore, without loss of generality we can assume that all the constraints (12.19) are nonlinear. Since, however, in the most general case one cannot expect to compute a solution of a nonlinear system of equations in finitely many steps, one should be content with an approximate system ı  hl .x/  ı;

l D 1; : : : ; s ;

where ı > 0 is the tolerance. In other words, a set of constraints of the form gj .x/  0

j D 1; : : : ; m

hl .x/ D 0

l D 1; : : : ; s

should be replaced by the approximate system gj .x/  0

j D 1; : : : ; m

hl .x/ C ı  0; l D 1; : : : ; s hl .x/ C ı  0

l D 1; : : : ; s:

The method presented in the previous sections can then be applied to the resulting approximate problem. With g.x/ D minjD1;:::;m gj .x/; h.x/ D maxlD1;:::;s jhl .x/j; the required assumption is, instead of (12.14), fx 2 Œa; bj g.x/ > 0; h.x/ C ı > 0g ¤ ;:

12.3.5 Discrete Polynomial Optimization The above approach can be extended without difficulty to the class of discrete (in particular, integral) polynomial optimization problems. This can be done along the same lines as the extension of monotonic optimization methods to discrete monotonic optimization.

12.4 An Illustrative Example

451

12.3.6 Signomial Programming A large class of problems called signomial programming or generalized geometric programming can be considered which inludes problems of the form .P/ where each vector ˛ D .˛1 ; : : : ; ˛n / may involve rational non integral components, i.e., may be an arbitrary vector ˛ 2 RnC : The above solution method via conversion of polynomial programming into monotonic optimization can be applied without modification to signomial programming.

12.4 An Illustrative Example We conclude by an illustrative example. Minimize .3Cx1 x3 /.x1 x2 x3 x4 C2x1 x3 C2/2=3 subject to 3.2x1 x2 C 3x1 x2 x4 /.2x1 x3 C 4x1 x4  x2 / .x1 x3 C 3x1 x2 x4 /.4x3 x4 C 4x1 x3 x4 C x1 x3  4x1 x2 x4 /1=3 C3.x4 C 3x1 x3 x4 /.3x1 x2 x3 C 3x1 x4 C 2x3 x4  3x1 x2 x4 /1=4  309:219315 2.3x3 C 3x1 x2 x3 /.x1 x2 x3 C 4x2 x4  x3 x4 /2 C.3x1 x2 x3 /.3x3 C 2x1 x2 x3 C 3x4 /4  .x2 x3 x4 C x1 x3 x4 /.4x1  1/3=4 3.3x3 x4 C 2x1 x3 x4 /.x1 x2 x3 x4 C x3 x4  4x1 x2 x3  2x1 /4  78243:910551 3.4x1 x3 x4 /.2x4 C 2x1 x2  x2  x3 /2 C2.x1 x2 x4 C 3x1 x3 x4 /.x1 x2 C 2x2 x3 C 4x2  x2 x3 x4  x1 x3 /4  9618 0  xi  5 i D 1; 2; 3; 4: This problem would be difficult to solve by the RLT or Lasserre’s method, since a very large number of variables would have to be introduced. For " D 0:01; after 53 cycles of incumbent transcending, the SIT Algorithm found the essential "-optimal solution xessopt D .4:994594; 0:020149; 0:045424; 4:928073/ with essential "-optimal value 5.906278 at iteration 667, and confirmed its essential "-optimality at iteration 866.

452

12 Polynomial Optimization

12.5 Exercises 1 Solve the problem minimize 3x1 C 4x2 ; s:t: x21 x2 C 2x2 C x3 D 5I x21 x22 C x2 C x4 D 3I x1  0; x2  0I x1 x2  x1 x22  1I 0  x1 x2  2I 0  x1 x22  2: 2 Consider the problem Minimize 4.x21 x3 C 2x21 x2 x23 x5 C 2x21 x2 x3 /.5x21 x3 x24 x5 C 3x2 /3=5 C 3.2x24 x25 /.4x21 x4 C 4x2 x5 /5=3 subject to 2.2x1 x5 C 5x21 x2 x24 x5 /.3x1 x4 x25 C 5 C 4x3 x25 /1=2  7684:470329 2.2x1 x22 x3 x24 /.2x1 x2 x3 x24 C 2x2 x24 x5  x21 x25 /3=2  1286590:314422 0  xi  5

i D 1; 2; 3; 4; 5:

With tolerance " D 0:01 solve this problem by the SIT monotonic optimization algorithm. Hints: the essential "-optimal solution is x D .4:987557; 4:984973; 0:143546; 1:172267; 0:958926/ with objective function value 28766.057367.

Chapter 13

Optimization Under Equilibrium Constraints

In this closing chapter we study optimization problems with nonconvex constraints of a particular type called equilibrium constraints. This class of constraints arises from numerous applications in natural sciences, engineering, and economics.

13.1 Problem Formulation Let there be given two closed subsets C; D of Rn ; Rm , respectively, and a function F W C  D ! R: As defined in Chap. 3, Sect. 3.3, an equilibrium point of the system C; D; F is a point z 2 Rn such that z 2 C;

F.z; y/  0 8y 2 D:

Consider now a real-valued function f .x; z/ on Rn  Rm together with a nonempty closed set U  Rn  Rm ; and two a set-valued maps C; D from U to Rn with closed values C.x/; D.x/ for every .x; z/ 2 U: The problem we are concerned with and which is commonly referred to as Mathematical Programming problem with Equilibrium Constraints can be formulated as follows:

.MPEC/

minimize ˇ f .x; z/ ˇ .x; z/ 2 U; z 2 C.x/; subject to ˇˇ F.z; y/  0 8y 2 D.x/:

Setting S.x/ D fz 2 C.x/j F.x; z/  0 8y 2 D.x/g the equilibrium constraints can also be written as z 2 S.x/:

© Springer International Publishing AG 2016 H. Tuy, Convex Analysis and Global Optimization, Springer Optimization and Its Applications 110, DOI 10.1007/978-3-319-31484-6_13

(13.1)

453

454

13 Optimization Under Equilibrium Constraints

For every fixed x 2 X WD fx 2 Rn j .x; z/ 2 U for some z 2 Rm g; S.x/ is the set of all equilibrium points of the system .C.x/; D.x/; F/: So the problem can be viewed as a game (called Stackelberg game) in which a distinctive player is leader and the remaining players are followers. The leader has an objective function f .x; z/ depending on two variables x 2 Rn and z 2 Rm : The leader controls the variable x and to each decision x 2 X by the leader, the followers react by choosing a value of z 2 C.x/ achieving equilibrium of the system .C.x/; D.x/; F/: Supposing that f .x; z/ is a loss, the leader’s objective is to minimize f .x; z/ under these conditions. In this interpretation the variable x under the leader’s control is also referred to as upper level variable and the variable y under the followers’ control as lower level variable. The most important cases that occur in the applications are the following: 1. S is generated by an optimization problem: S.x/ D argminfg.x; y/j y 2 D.x/gI

(13.2)

then F.z; y/ D g.x; y/  g.x; z/: 2. S is generated by a variational inequality: S.x/ D fz 2 C.x/j hg.z/; y  zi  0 8y 2 D.x/gI

(13.3)

then F.z; y/ D hg.z/; y  zi: 3. S is generated by a complementarity problem: S.x/ D fz 2 RnC j hF.z/; y  zi  0 8y 2 RnC g D fz 2 RnC ; F.z/  0; hF.z/; zi D 0gI

(13.4)

then C.x/ D D.x/ D RnC ; F.z; y/ D hF.z/; y  zi: Although this case can be reduced to the previous one, it deserves special interest as it can be shown to include the important case when S is generated by a Nash equilibrium problem. There is no need to dwell on the difficulty of MPEC. Finding just an equilibrium point is already a hard problem, let alone solving an optimization problem with an equilibrium constraint. In spite of that, the problem has attracted a great deal of research, due to its numerous important applications in natural sciences, engineering, and economics. Quite understandably, in the early period aside from heuristic algorithms for solving certain classes of MPECs (Friesz et al. 1992), many solution methods proposed for MPECs have been essentially local optimization methods. Few papers have been concerned with finding a global optimal solution to MPEC. Among the most important works devoted to MPECs we should mention the monographs by Luo et al. (1996) and Outrata et al. (1998). Both monographs describe a large number of MPECs arising from applications in diverse fields of natural sciences, engineering, and economics.

13.2 Bilevel Programming

455

In Luo et al. (1996), under suitable assumptions including a constraint qualification, MPEC is converted into a standard optimization problem. First and second order optimality conditions are then derived and based on these a penalty interior point algorithm and a descent algorithm are developed for solving the problem. Particular attention is also given to MPAECs which are mathematical programs with affine equilibrium constraint. The monograph by Outrata–Ko˘cvara–Zowe presents a nonsmooth approach to MPECs where the equilibrium constraint is given in the form of a variational or quasivariational equality. Specifically, using the implicit function locally defined by the equilibrium constraint the problem is converted into a mathematical program with a nonsmooth objective which can then be solved by the bundle method of nonsmooth optimization. In the following we will present a global optimization approach to MPEC. Just like in the approaches of Luo–Pang–Ralp and Outrata–Ko˘cvara–Zowe, the basic idea is to convert MPEC into a standard optimization problem. The difference, however, is that the latter will be an essential global optimization problem belonging to a class amenable to efficient solution methods, such as dc or monotonic optimization. As it turns out, most MPECs of interest can be approached in that way. The simplest among these are bilevel programming and optimization over efficient set, for which the equilibrium constraint is an ordinary or multiobjective optimization problem.

13.2 Bilevel Programming In its general form the Bilevel Programming Problem can be formulated as follows

.GBP/

min f .x; y/ s.t. g1 .x; y/  0; x 2 RnC1 ; y solves R.x/ W minfd.y/j g2 .C.x/; y/  0; y 2 RnC2 g

where it is assumed that (A1) The map C W Rn1 ! Rm and the functions f .x; y/; d.y/; g1 .x; y/; g2 .u; y/ are continuous. (A2) The set D WD f.x; y/ 2 RnC1  RnC2 j g1 .x; y/  0; g2 .C.x/; y/  0g is nonempty and compact. Also an assumption common to a wide class of (GBP) is that (A3) For every fixed y the function u 7! g2 .u; y/ is decreasing. Assumptions (A1) and (A2) are quite natural. Assumption (A3) means that the function g2 .C.x/; y/ decreases when the action C.x/ of the leader increases (component-wise). By replacing u with v D u; if necessary, this assumption also holds when g2 .C.x/; y/ increases with C.x/:

456

13 Optimization Under Equilibrium Constraints

If all the functions F.x; y/; g1 .x; y/; g2 .C.x/; y/; d.y/ are convex, the problem is called a Convex Bilevel Program (CBP); if all these functions are linear affine, the problem is called a Linear Bilevel Program (LBP). Specifically, the Linear Bilevel Programming problem can be formulated as

.LBP/

min L.x; y/ s.t. A1 x C B1 y  c1 ; x 2 RnC1 ; y solves minfhd; yij A2 x C B2 y  c2 ; y 2 RnC2 g

where L.x; y/ is a linear function, d 2 Rn2 ; Ai 2 Rmi n1 ; Bi 2 Rmi n2 ; ci 2 Rmi ; i D 1; 2: In this case g1 .x; y/ D maxiD1;:::;m1 .c1  A1 x  B1 y/i  0; g2 .u; y/ D maxiD1;:::;m2 .c2  u  B2 y/i  0; while C.x/ D A2 x: Clearly all assumptions (A1), (A2), (A3) hold for (LBP) provided the set D WD f.x; y/ 2 RnC1  RnC2 j A1 x C B1 y  c1 ; A2 x C B2 y  c2 g is a nonempty polytope. Also for every x  0 satisfying A1 x C B1 y  c1 the set fy 2 RnC2 j A2 x C B2 y  c2 g is compact, so R.x/ is then solvable. Originally formulated and studied as a mathematical program by Bracken and McGill (1973, 1974) bilevel and more generally, multilevel optimization has become a subject of extensive research, owing to numerous applications in diverse fields such as: economic development policy, agriculture economics, road network design, oil industry regulation, international water systems and flood control, energy policy, traffic assignment, structural analysis in mechanics, etc. Multilevel optimization is also useful for the study of hierarchical designs in many complex systems, particularly in biological systems (Baer 1992; Vincent 1990). Even (LBP) is an NP-hard nonconvex global optimization problem (Hansen et al. 1992). Despite the linearity of all functions involved, this is a difficult problem fraught with pitfalls. Actually, several early published methods for its solution turned out to be non convergent, incorrect or convergent to a local optimum quite far off the true optimum. Some of these errors have been reported and analyzed, e.g., in Ben-Ayed (1993). A popular approach to (LBP) at that time was to use, directly or indirectly, a reformulation of it as a single mathematical program which, in most cases, was a linear program with an additional nonconvex constraint. It was this peculiar additional nonconvex constraint that constituted the major source of difficulty and was actually the cause of most common errors. Whereas (LBP) has been extensively studied, the general bilevel programming problem has received much less attention. Until two decades ago, most research in (GBP) was limited to convex or quadratic bilevel programming problems, and/or was mainly concerned with finding stationary points or local solutions (Bard 1988, 1998; Bialas and Karwan 1982; Candler and Townsley 1982; Migdalas et al. 1998; Shimizu et al. 1997; Vicentee et al. 1994; White and Annandalingam 1993). Global optimization methods for (GBP) appeared much later, based on converting (GBP) into a monotonic optimization problem (Tuy et al. 1992, 1994b, and also Tuy and Ghannadan 1998).

13.2 Bilevel Programming

457

13.2.1 Conversion to Monotonic Optimization We show that under assumptions (A1)–(A3) (GBP) can be converted into a single level optimization problem which is a monotonic optimization problem. In view of Assumptions (A1) and (A2) the set f.C.x/; d.y//j .x; y/ 2 Dg is compact, so without loss of generality we can assume that this set is contained in some box Œa; b  RmC1 C : a  .C.x/; d.y//  b 8.x; y/ 2 D:

(13.5)

Setting U D fu 2 Rm j 9.x; y/ 2 D

u  C.x/g;

W D f.u; t/ 2 R  Rj 9.x; y/ 2 D m

u  C.x/; t  d.y/g;

(13.6) (13.7)

we define .u/ D minfd.y/j .x; y/ 2 D; u  C.x/g; f .u; t/ D minfF.x; y/j .x; y/ 2 D; u  C.x/; t  d.y/g:

(13.8) (13.9)

Proposition 13.1 (i) The function .u/ is finite for every u 2 U and satisfies .u0 /  .u/ whenever u0  u 2 U: In other words, .u/ is a decreasing function on U: (ii) The function f .u; t/ is finite on the set W: If .u0 ; t0 /  .u; t/ 2 W then .u0 ; t0 / 2 W and f .u0 ; t0 /  f .u; t/: In other words, f .u; t/ is a decreasing function on W: (iii) u 2 U for every .u; t/ 2 W: Proof (i) If u 2 U then there exists .x; y/ 2 D such that C.x/  u; hence .u/ < C1: It is also obvious that u0  u 2 U implies that .u0 /  .u/: (ii) By assumption (A2) the set D of all .x; y/ 2 RnC1  RnC2 satisfying g1 .x; y/  0; g2 .C.x/; y/  0 is nonempty and compact. Then for every .u; t/ 2 W the feasible set of the problem determining f .u; t/ is nonempty, compact, hence f .u; t/ < C1 because F.x; y/ is continuous by (A1). Furthermore, if .u0 ; t0 /  .u; t/ 2 W, then obviously .u0 ; t0 / 2 W and f .u0 ; t0 /  f .u; y/: (iii) Obvious. t u As we just saw, the assumption (A2) is only to ensure the solvability of problem (13.9) and hence, finiteness of the function f .u; t/: Therefore, without any harm it can be replaced by the weaker assumption (A2*) The set W D f.u; t/ 2 Rm  Rj 9.x; y/ 2 D u  C.x/; t  d.y/g is compact. In addition to (A1), (A2)/(A2*), and (A3) we now make a further assumption: (A4) The function .u/ is upper semi-continuous on intU while f .u; t/ is lower semi-continuous on intW:

458

13 Optimization Under Equilibrium Constraints

Proposition 13.2 If all the functions F.x; y/; g1 .x; y/; g2 .C.x/; y/; d.y/ are convex then Assumption (A4) holds, with .u/; f .u; t/ being convex functions. Proof Observe that if .u/ D d.y/ for x satisfying g2 .C.x/; y/  0; C.x/  u; and .u0 / D d.y0 / for x0 satisfying g2 .C.x0 /; y0 /  0; C.x0 /  u0 ; then g2 .C.˛x C .1  ˛/x0 /; ˛y C .1  ˛/y0 /  ˛g2 .C.x/; y/ C .1  ˛/g2 .C.x0 /; y0 /; and C.˛x C .1  ˛/x0 /  ˛C.x/C.1˛/C.x0 /  ˛uC.1˛/u0 ; so that .˛xC.1˛/x0 ; ˛yC.1˛/y0 / is feasible to the problem determining .˛u C.1 ˛/u0 /; hence .˛u C.1 ˛/u0 /  d.˛y C .1  ˛/y0 /  ˛d.y/ C .1  ˛/d.y0 / D ˛ .u/ C .1  ˛/ .u0 /: Therefore the function .u/ is convex, and hence continuous on intU: The convexity and hence, the continuity of f .u; t/; on intW is proved similarly. t u Proposition 13.3 (GBP) is equivalent to the monotonic optimization problem .Q/

minff .u; t/j t  .u/  0; .u; t/ 2 Wg

in the sense that min(GBP)= min(Q) and if .u; t/ solves (Q) then any optimal solution .x; y/ of the problem minfF.x; y/j g1 .x; y/  0; g2 .u; y/  0; u  C.x/; t  d.y/; x  0; y  0g solves (GBP). Proof By Proposition 13.1 the function f .u; t/ is decreasing while t  .u/ is increasing, so .Q/ is indeed a monotonic optimization problem. If .x; y/ is feasible to (GBP), then g1 .x; y/  0; g2 .C.x/; y/  0; and taking u D C.x/; t D d.y/ D .C.x// we have  t; hence F.x; y/  minfF.x0 ; y0 /j g1 .x0 ; y0 /  0; g2 .C.x0 /; y0 /  0; u  C.x0 /; t  d.y0 /; x0  0; y0  0g D f .u; t/  minff .u0 ; t0 /j .u0 /  t0  0g D min.Q/: Conversely, if .u/  t  0; then the inequalities u  C.x/; t  d.y/ imply d.y/  t  .u/  ..x//; hence f .u; t/ D minfF.x; y/j g1 .x; y/  0; g2 .C.x/; y/  0; u  C.x/; t  d.y/; x  0; y  0g  minfF.x; y/jg1.x; y/  0; g2 .C.x/; y/  0; .C.x//  d.y/g D min.GBP/: Consequently, min(GBP)= min.Q/: The last assertion of the proposition readily follows. t u

13.2 Bilevel Programming

459

13.2.2 Solution Method For convenience of notation we set z D .u; t/ 2 RmC1 ;

h.z/ D t  .u/:

(13.10)

The problem .Q/ can then be rewritten as .Q/

minff .z/j h.z/  0; z 2 Wg;

where f .z/ is a decreasing function and h.z/ a continuous increasing function The feasible set of this monotonic optimization problem .Q/ is a normal set, hence is robust and can be handled without difficulty. As shown in Chap. 11, such a problem .Q/ can be solved either by the polyblock outer approximation method or by the BRB (branch-reduce-and-bound) method. The polyblock method is practical only for problems .Q/ with small values of m (typically m  5/: The BRB method we are going to present is more suitable for problems with fairly large values of m: As its name indicates, the BRB method proceeds according to the standard branch and bound scheme, with three basic operations: branching, reducing (the partition sets), and bounding. 1. Branching consists in a successive rectangular partition of the initial box M0 D Œa; b  Rm  R; following a chosen subdivision rule. As will be explained shortly, branching should be performed upon the variables u 2 Rm ; so the subdivision rule is induced by a standard bisection upon these variables. 2. Reducing consists in applying valid cuts to reduce the size of the current partition set M D Œp; q  Œa; b: The box Œp0 ; q0  that results from the cuts is referred to as a valid reduction of M: 3. Bounding consists in estimating a valid lower bound ˇ.M/ for the objective function value f .z/ over the feasible portion contained in the valid reduction Œp0 ; q0  of a given partition set M D Œp; q: Throughout the sequel, for any vector z D .u; t/ 2 Rm  R we define zO D u:

a. Branching Since an optimal solution .u; t/ of the problem (Q) must satisfy t D .u/; it suffices to determine the values of the variables u in an optimal solution. This suggests that instead of branching upon z D .u; t/ as usual, we should branch upon u; according to a chosen subdivision rule which may be the standard bisection or the adaptive subdivision rule. As shown by Corollary 6.2, the standard subdivision method is exhaustive, i.e., for any infinite nested sequence Mk ; k D 1; 2; : : : ; such that each MkC1 is a child of Mk via a bisection we have diamMk ! 0: This property guarantees a rather slow convergence, because the method does not take account

460

13 Optimization Under Equilibrium Constraints

of the current problem structure. By contrast the adaptive subdivision method exploits the current problem structure and due to that can ensure a much more rapid convergence.

b. Valid Reduction At a given stage of the BRB algorithm for .Q/ a feasible solution z is available which is the best so far known. Let D f .z/ and let Œp; q  Œa; b be a box generated during the partitioning procedure which is still of interest. Since an optimal solution of .Q/ is attained at a point satisfying h.z/ WD t  .u/ D 0; the search for a feasible solution z of .Q/ in Œp; q such that f .z/  can be restricted to the set B \ Œp; q; where B WD fzj f .z/  ; h.z/ D 0g:

(13.11)

The reduction aims at replacing the box Œp; q with a smaller box Œp0 ; q0   Œp; q without losing any point y 2 B \ Œp; q; i.e., such that B \ Œp0 ; q0  D B \ Œp; q: The box Œp0 ; q0  satisfying this condition is referred to as a -valid reduction of Œp; q and denoted by red Œp; q: As usual ei denotes the ith unit vector of RmC1 ; i.e., eii D 1; eij D 0 8j ¤ i: Given any continuous increasing function '.˛/ W Œ0; 1 ! R such that '.0/ < 0 < '.1/ let N.'/ denote the largest value ˛ 2 .0; 1/ satisfying '.˛/ D 0: For every i D 1; : : : ; n let qi WD p C .qi  pi /ei : Proposition 13.4 A -valid reduction of a box Œp; q is given by the formula ;; if h.q/ < 0 or  f .q/ < 0; (13.12) red Œp; q D Œp0 ; q0 ; if h.q/  0 &  f .q/  0: where q0 D p C ˛i D ˇi D

n X

˛i .qi  pi /ei ;

p0 D q0 

iD1

1; if h.qi /  0 N.'i /; if h.qi / < 0:

n X

ˇi .q0  pi /ei ;

(13.13)

iD1

'i .˛/ D h.p C ˛.qi  pi /ei /:

1; if f .q0i /   0 N. i /; if f .q0i /  > 0:

i .ˇ/

D  f .q0  ˇ.q0i  pi /ei /: t u

Proof Follows from Proposition 11.13 0

0

Remark 13.1 As can be easily checked, the box Œp ; q  D redŒp; q still satisfies h.q0 /  0; f .q0 /  :

13.2 Bilevel Programming

461

c. Valid Bounds Let M WD Œp; q be a partition set which is supposed to have been reduced, so that, as mentioned in the above remark, h.q/  0;

f .q/  :

(13.14)

Let us now examine how to compute a lower bound ˇ.M/ for minff .z/j z 2 Œp; q; h.z/ D 0g: Since f .z/ is decreasing, an obvious lower bound is f .q/: We will shortly see that to ensure convergence of the BRB Algorithm, it suffices that the lower bounds satisfy ˇ.M/  f .q/: We shall refer to such a bound as a valid lower bound. Define h .z/ WD minf  f .z/; h.z/g: Since f .z/ and h.z/ are increasing, so is h .z/: Let G D fz 2 Œp; qj h.z/  0g;

H D fz 2 Œp; qj h .z/  0g:

From (13.14) h .q/  0; so if h.q/ D 0 then obviously q is an exact minimizer of f .z/ over the feasible points z 2 Œp; q that are at least as good as the current best, and f .q/ can be used to update the current best. Suppose now that h.q/ < 0; and hence, h.p/ < 0: For each z 2 Œp; q such that z 2 H n G let p .z/ be the first point where the line from z to p meets the upper boundary of G; i.e., p .z/ D z  ˛.z  p/ with ˛ D N.'/; '.˛/ D h.z  ˛.z  p//: Obviously, h. p .z// D 0: Lemma 13.1 If s D p .q/; v i D q  .qi  si /ei ; iD1; : : : ; mC1; and IDfij v i / 2 Hg then a valid lower bound over M D Œp; q is ˇ.M/ D minff .v i /j i 2 Ig D minff .u; t/j g1 .x; y/  0; g2 .Oui ; y/  0; .C.x/; d.y//  v i .i 2 I/; x  0; y  0g: Proof This follows since the polyblock with vertices v i ; i 2 I; contains all feasible solutions z D .u; t/ still of interest in Œp; q: t u

462

13 Optimization Under Equilibrium Constraints

Remark 13.2 Each box Mi D Œp; v i  can be reduced by the method presented above. If Œp0 ; v 0i  D red Œp; v i ; i D 1; : : : ; m C 1; then without much extra effort, we can have the following more refined lower bound: ˇ.M/ D min f .v 0i /; i2I

I D fij fQ .v i /  0g:

The above procedure amounts to constructing a polyblock Z  B \ Œp; q D fz 2 Œp; qjh.z/  0  fQ .z/g; which is possible since h.z/; fQ .z/; are increasing functions.

d.

Improvements for the Case of CBP

In the case of (CBP) (Convex Bilevel Programming) the function .u/ is convex by Proposition 13.1, so the function h.u; t/ D t  .u/ is concave, and a lower bound can be computed which is rather tight. In fact, since h.z/ is concave and since the box Œp; q has been -reduced, h.qi / D 0; i D 1; : : : ; n [see (13.13)], the set fz 2 Œp; qj h.z/ D 0g is contained in the simplex S WD Œq1 ; : : : ; qn : Hence a lower bound of f .u; t/ over M D Œp; q is given by ˇ.M/ D minff .z/j z 2 Sg:

(13.15)

Taking account of (13.8)–(13.9) this convex program can be written as minu;t F.x; y/ s.t. g1 .x; y/  0; g2 .C.x/; y/  0; C.x/  u; d.y/  t; x  0; y  0; .u; t/ 2 Œq1 ; : : : ; qn :

(13.16)

We are now in a position to state the proposed monotonic optimization (MO) algorithm for .Q/: Recall that the problem is minff .z/j z 2 Œa; b; h.z/ D 0g and we assume (A1)– O .b// O yields a feasible (A4) so always b 2 W; bO D .b1 ; : : : ; bm / 2 U; and .b; solution. MO Algorithm for (GBP) Initialization. Start with P1 D fM1 g; M1 D Œa; b; R1 D ;: Let CBS be the best feasible solution available and CBV (current best value) the corresponding value of O .b//: O Set k D 1: f .z/: (At least CBV= f .b; Step 1.

For each box M 2 Pk and for DCBV: (1) (2) (3) (4)

Compute the -valid reduction red M of MI Delete M if red M D ;I Replace M by red M if red M ¤ ;I If red M D Œp; q, then compute a valid lower bound ˇ.M/ for f .z/ over the feasible solutions in M:

13.2 Bilevel Programming

Step 2.

Step 3. Step 4.

Step 5.

463

Let P 0 k be the collection of boxes that results from Pk after completion of Step 1. Update CBS and CBV. From Rk remove all M 2 Rk such that ˇ.M/  CBV and let R 0 k be the resulting collection. Let Sk D R 0k [ P 0k : If Sk D ; terminate: CBV is the optimal value and CBS (the feasible solution z with f .z/ D CBV) is an optimal solution. If Sk ¤ ; select Mk 2 argminfˇ.M/j M 2 Sk g: Subdivide Mk D Œpk ; qk  according to the adaptive subdivision rule: if ˇ.Mk / D f .v k / for some v k 2 Mk and zk D pk .v k / is the first point where the line segment from v k to pk meets the surface h.z/ D 0 then bisect Mk via .wk ; jk / where wk D 12 .v k C zk / and jk satisfies kzkjk  vjkk k D maxj kzkj  vjk k: Let PkC1 be the collection of newly generated boxes. Set RkC1 D Sk n fMk g; increment k and return to Step 1.

Proposition 13.5 Whenever infinite the above algorithm generates an infinite sequence fzk g every cluster point of it yields a global optimal solution of (GBP). t u

Proof Follows from Proposition 6.2.

13.2.3 Cases When the Dimension Can Be Reduced Suppose m > n1 while the map C satisfies x0  x ) C.x0 /  C.x/:

(13.17)

In that case the dimension of the problem (Q) can be reduced. For this, define .v/ D minfd.y/j g2 .C.v/; y/  0; y 2 RnC2 g

(13.18)

f .v; t/ D minfF.x; y/j G1 .x; y/  0; g2 .C.v/; y/  0 v  x; t  d.y/; x 2 RnC1 ; y 2 RnC2 g:

(13.19)

It is easily seen that all these functions are decreasing. Indeed, v 0  v implies C.v 0 /  C.v/ and hence g2 .C.v 0 /; y/  0; i.e., whenever v is feasible to (13.18) and v 0  v then v is also feasible. This shows that .v 0 /  .v/: Similarly, whenever .v; t/ is feasible to (13.19) and .v 0 ; t0 /  .v; t/ then .v; t/ is also feasible, and hence, f .v 0 ; t0 /  f .v; t/: With .v/ and f .v; t/ defined that way, we still have Proposition 13.6 Problem (GBP) is equivalent to the monotonic optimization problem Q .Q/

minff .v; t/j t  .v/  0g

464

13 Optimization Under Equilibrium Constraints

Q then an optimal solution .x; t/ of the problem and if .v; t/ solves (Q) minfF.x; y/j g1 .x; y/  0; g2 .C.x/; y/  0; v  x; t  d.y/; x  0; y  0g solves (GBP). Proof If .x; y/ is feasible to (GBP), then g1 .x; y/  0; g2 .C.x/; y/  0 and taking v D x; t D d.y/ D .x/ we have t  .v/ D 0; hence, setting G WD f.v; t/j t .v/  0g yields F.x; y/  minfF.x0 ; y0 /j g1 .x0 ; y0 /  0; g2 .C.x0 /; y/  0; v  x0 ; t  d.y0 /; x0  0; y0  0g D f .v; t/  minff .v 0 ; t0 /j .v 0 ; t0 / 2 Gg Q D min.Q/: Conversely, if t  .v/  0; then the inequalities v  x; t  d.y/ imply d.y/  t  /v/  .x/; hence f .v; t/ D minfF.x; y/j g1 .x; y/  0; g2 .C.x/; y/  0; v  x; t  d.y/; x  0; y  0g  minfF.x; y/j g1 .x; y/  0; g2 .C.x/; y/  0; .x/  d.y/; x  0; y  0g D min.GDP/ Q The rest is clear. Consequently, min.GBP/ D min.Q/:

t u

Thus, under assumption (13.17), (GBP) can be solved by the same method as previously, with v 2 Rn1 as main parameter instead of u 2 Rm : Since n1 < m the method works in a space of smaller dimension. More generally, if C is a linear map with rankC D r < m such that E D KerC satisfies Ex0  Ex ) C.x0 /  C.x/;

(13.20)

then one can arrange the method to work basically   in a space of dimension r rather x than m: In fact, writing E D ŒEB ; EN ; x D B ; where EB is an r  r nonsingular xN matrix, we have for every v 2 Ex:    EB1 EN xN EB1 vC : xD 0 xN 

13.2 Bilevel Programming

465

   EB1 EN xN EB1 ; zD yields Hence, setting Z D 0 xCN 

x D Zv C z

with Ez D EN xN C EN xN D 0:

Since by (13.20) Ez D 0 implies Cz D 0; it follows that Cx D C.Zv/: Now define .v/ D minfd.y/j g2 .C.Zv/; y/  0; y 2 RnC2 g;

(13.21)

f .v; t/ D minfF.x; y/j g1 .x; y/  0; g2 .C.Zv/; y/  0; v  Ex; t  d.y/; x 2 RnC1 ; y 2 RnC1 g:

(13.22)

For any v 0 2 Rr D E.Rn1 / we have v 0 D Ex0 for some x0 ; so v 0  v implies Ex0  Ex; hence by (13.20) Cx0  Cx; i.e., C.Zv 0 /  C.Zv/; and therefore, g2 .C.Zv 0 /; y/  g2 .C.Zv/; y/  0: It is then easily seen that .v 0 /  .v/ and f .v 0 ; t0 /  f .v; t/ for some v 0  v and t0  t; i.e., both .v/ and f .v; t/ are decreasing. Furthermore, if .v; t/ solves the problem minff .v; t/j t  .v/  0g

(13.23)

(so that, in particular, t D .v/; and .x; y/ solves the problem (13.22) where v D v; t D t D .v//; then, since v D Ex; we have .Ex/  .v/  d.y/: Noting that Zv D ZEx D x; this implies that y solves minfd.y/j g2 .Cx; y/  0g; and consequently, .x; y/ solves (GBP).

13.2.4 Specialization to (LBP) In the special case when all functions involved in (GBP) are linear we have the Linear Bilevel Programming problem

.LBP/

ˇ ˇ min L.x; y/ s.t. ˇ ˇ A1 x C B1 y  c1 ; x 2 Rn1 ; y solves C ˇ ˇ minfhd; yij A x C B y  c2 ; y 2 Rn2 g 2 2 C

where L.x; y/ is a linear function, d 2 Rn2 ; Ai 2 Rmi n1 ; Bi 2 Rmi n2 ; i D 1; 2: In this case C.x/ D A2 x while g1 .x; y/ D maxiD1;:::;m1 .c1 x  B1 y/i  0; g2 .u; y/ D maxiD1;:::;m2 .c2  u  B2 y/i  0: Clearly all assumptions (A1), (A2), and (A3) hold, provided the polytope D WD f.x; y/ 2 RnC1 2 RnC2 j A1 x C B1 y  c1 ; A2 x C B2 y  c2 g is nonempty. To specialize the MO Algorithm for (GBP) to .BLP/ define

466

13 Optimization Under Equilibrium Constraints

    1 B1 c A1 ; BD ; cD 2 ; AD A2 B2 c 2 2 3 3 A1;1 A2;1 A1 D 4 : : : 5 ; A2 D 4 : : : 5 ; A1;m1 A2;m1 

D D f.x; y/j Ax C By  c; x  0; y  0g: By Proposition 13.3, .BLP/ is equivalent to the monotonic optimization problem minff .u; t/j h.u; t/  0g; where for u 2 Rm2 ; t 2 R: f .u; t/ D minfL.x; y/j .x; y/ 2 D; u  A2 x; t  hd; yig; h.u; t/ D t  minfhd; yij .x; y/ 2 D; u  A2 xg; with f .u; t/ decreasing and h.u; t/ increasing. For convenience set z D .u; t/ 2 Rm2 C1 : The basic operations proceed as follows: – Initial box: M1 D Œa; b  Rm2 C1 with ai D minf.A2;i ; x/j .x; y/ 2 Dg;

i D 1; : : : ; m2

am2 C1 D minfhd; yij .x; y/ 2 Dg bi D maxfhA2;i ; xij.x; y/ 2 Dj

i D 1; : : : ; m2

bm2 C1 D maxfhd; yij .x; y/ 2 Dg: – Reduction operation: Let z D .u; t/ D CBS and D f .z/ DCBV. If M D Œp; q is any subbox of Œa; b still of interest at a given iteration, then the box red M (valid reduction M/ is determined as follows: (1) If h.z/ < 0 or f .z/ > , then red .M/ D ; (M is to be deleted); (2) Otherwise, red .M/ D Œp0 ; q0  with p0 ; q0 being determined by formulas (13.12)–(13.13), i.e., q0 D p C

n X

˛i .qi  pi /ei ;

p0 D q0 

iD1

n X

ˇi .q0i  pi /ei ;

iD1

where ˛i D ˇi D

1; if h.qi /  0 N.'i /; if h.qi / < 0

'i .˛/ D h.p C ˛.qi  pi /ei /:

1; if f .q0i /   0 N. i /; if f .q0i /  > 0

i .ˇ/

D f .q0  ˇ.q0i  pi /ei /  :

13.2 Bilevel Programming

467

– Lower bounding: For any box M D Œp; q with red .M/ D Œp0 ; q0  a lower bound for f .z/ over Œp; q is ˇ.M/ D minu;t L.x; y/ s.t. A 1 x C B 1 y  c1 ; A 2 x C B 2 y  c2 ; A2 x  u; d.y/  t; x  0; y  0; .u; t/ 2 Œq01 ; : : : ; q0n :

13.2.5 Illustrative Examples Example 13.1 Solve the problem min.x2 C y2 / x  0; y  0; min.y/

s.t. y solves s.t.

3x C y  15; x C y  7; x C 3y  15: We have D D f.x; y/ 2 R2C j 3x C y  15I x C y  7; x C 3y  15g; .u/ D minfy/j .x; y/ 2 D; 3x  ug; f .u; t/ D minfx2 C y2 j .x; y/ 2 D; 3x  u; y  tg: Computational results (tolerance 0.0001): Optimal solution: (4.492188, 1.523438) Optimal value: 22.500610 (relative error  0:0001) Optimal solution found and confirmed at iteration 10. Example 13.2 (Test Problem 7 in Floudas (2000), Chap. 9) min.8x1  4x2 C 4y1  40y2 C 4y3 / x1 ; x2  0;

y solves

min.y1 C y2 C 23 / s.t. y1 C y2 C y3  1; 2x1  y1 C 2y2  0:5y3  1; 2x2 C 2y1  y2  0:5y3  1; y1 ; y2 ; y3  0:

s.t.

468

13 Optimization Under Equilibrium Constraints

D D f.x; y/ 2 R2C  R3C j  y1 C y2 C y3  1; 2x1  y1 C 2y2  0:5y3  1; 2x2 C 2y1  y2  0:5y3  1g; .u/ D minfy1 C y2 C 2y3 j .x; y/ 2 D; x1  u1 ; x2  u2 g; f .u; t/ D minf8x1  4x2 C 4y1  40y2 C 4y3 j .x; y/ 2 D; x1  u1 ; x2  u2 ; y1 C y2 C 2y3  tg: Computational results (tolerance 0.01): Optimal solution x D .0; 0:896484/; y D .0; 0:597656; 0:390625/ Optimal value: 25.929688 (relative error  0:01) Optimal solution found and confirmed at iteration 22.

13.3 Optimization Over the Efficient Set Given a nonempty set X  Rn and a mapping C D .C1 ; : : : ; Cm / W Rn ! Rm ; a point x0 2 X is said to be efficient w.r.t. C if there is no x 2 X such that C.x/  C.x0 / and C.x/ ¤ C.x0 /: It is said to be weakly efficient w.r.t. C if there is no x 2 X such that C.x/ > C.x0 /: These concepts appear in the context of multiobjective programming, when a decision maker has several objectives C1 .x/; : : : ; Cm .x/ he/she would like to maximize over a feasible set X: Since these objectives may not be consistent with one another, efficient or weakly efficient solutions are used as substitutes for the standard concept of optimal solution. Denote the set of efficient points by XE and the set of weakly efficient points by XWE : Obviously, XE  XWE : To help the decision maker to select a most preferred solution, the following problems can be considered: .OE/ .OWE/

maxff .x/j x 2 X; g.x/  0; x 2 XE g maxff .x/j x 2 X; g.x/  0; x 2 XWE g

where f .x/; g.x/ are given functions. Since x 2 XWE if and only if C.y/  C.x/  0 8y 2 X the problem (OWE) is a MPEC. It can be proved that (OE), too, is a MPEC, though the proof of this fact is a little more involved (see Philip 1972). Among numerous studies devoted to different aspects of the two above problems we should mention (Benson 1984, 1986, 1991, 1993, 1995, 1996, 1998, 1999, 2006; Bolintineau 1993a,b; Ecker and Song 1994; Bach-Kim 2000; An-Tao-Muu 2003; An-Tao-Nam-Muu 2010; Muu and Tuyen 2002; Thach et al. 1996). All these approaches are based in one way or another on dc optimization or extensions. Below we present a different approach, first proposed in Luc (2001) and later expanded in Tuy and Hoai-Phuong (2006), which is based on reformulating these problems as monotonic optimization problems.

13.3 Optimization Over the Efficient Set

469

13.3.1 Monotonic Optimization Reformulation Assume X is compact and C is continuous. Then the set C.X/ is compact and there exists a box Œp; q  Rm C such that C.X/  Œa; b: For every y 2 Œa; b define h.y/ D minftj yi  Ci .x/  t; i D 1; : : : ; m; x 2 Xg:

(13.24)

Proposition 13.7 The function h W Œa; b ! R is increasing and continuous and there holds x 2 XWE , h.C.x//  0: Proof We can write x 2 XWE , 8z 2 X 9i Ci .x/  Ci .z/ , max min.Ci .x/  yi /  0 y2Y

i

, max minftj yi  Ci .x/  t; i D 1; : : : ; mg  0 y2Y

, h.C.x//  0:

t u

Since the set C.X/ is compact, it is contained in some box Œa; b  Rm : For each y 2 Œa; b let '.y/ WD minff .x/j x 2 X; g.x/  0; C.x/  yg: Clearly for y0  y we have '.y0 /  '.y/; so this is a decreasing function. Proposition 13.8 Problem (OWE) is equivalent to the monotonic optimization problem minf'.y/j h.y/  0; y 2 Œa; bg:

(13.25)

Proof By Proposition 13.7 problem (OWE) can be written as minff .x/j x 2 X; g.x/  0; h.C.x//  0g:

(13.26)

This is a problem with hidden monotonicity in the constraints which has been studied in Chap. 11, Sect. 11.5. By Proposition 11.24, it is equivalent to problem (13.25).

13.3.2 Solution Method for (OWE) For small values of m the monotonic optimization problem (13.25) can be solved fairly efficiently by the copolyblock approximation algorithm analogous to the

470

13 Optimization Under Equilibrium Constraints

polyblock algorithm described in Sect. 11.2, Chap. 11. For larger values of m; the SIT Algorithm presented in Sect. 11.3 is more suitable. The latter is a rectangular BRB algorithm involving two specific operations: valid reduction and lower bounding.

a. Valid Reduction At a given stage of the BRB algorithm for solving (13.25), a feasible solution y is known or has been produced which is the best so far available. Let D '.y/ and let Œp; q  Œa; b be a box currently still of interest among all boxes generated in the partitioning process. Since both '.x/ and h.x/ are increasing, an optimal solution of (13.25) must be attained at a point y 2 Œp; q satisfying h.y/ D 0 (i.e., a lower boundary point of the conormal set ˝ WD fx 2 Œa; bj h.y/  0g). The search for a better feasible solution than y can therefore be restricted to the set B WD fy 2 Œp; qj h.y/  0  h.y/; '.y/   0g D fy 2 Œp; qj h.y/  0  h .y/g;

(13.27)

where h .y/ D minf  '.y/; h.y/g: In other words, we can replace the box Œp; q by a smaller one, say Œp0 ; q0   Œp; q; provided B \ Œp0 ; q0  D B : A box Œp0 ; q0  satisfying this condition is referred to as a -valid reduction of Œp; q; and denoted red Œp; q: For any continuous increasing function '.˛/ W Œ0; 1 ! R such that '.0/ < 0 < '.1/ let N.'/ denote the largest value of ˛ 2 .0; 1/ satisfying '.˛/ D 0: For every i D 1; : : : ; n let qi WD p C .qi  pi /ei ; where eii D 1; eij D 0 8j ¤ i: Proposition 13.9 We have

; if h.p/ > 0 or h .q0 /g < 0; 0 0 Œp ; q ; if h.p/  0 & h .q0 /g  0:

red Œp; q D where q0 D p C ˛i D ˇi D

n X

p0 D q0i 

˛i .qi  pi /ei ;

iD1

n X

ˇi .q0i  pi /ei ;

iD1

1; if h.q /  0 N.'i /; if h.qi / > 0: i

'i .t/ D h.p C t.qi  pi /ei /:

1; if h .q0i /  0 N. i /; if h .q0i / < 0:

i .t/

D h .q0  t.q0i  pi /ei /:

13.3 Optimization Over the Efficient Set

471

Remark 13.3 It is easily verified that the box Œp0 ; q0  D red Œp; q still satisfies '.p0 /  ; h.p0 /  0; h.q0 /  0:

b. Lower Bounding Let M WD Œp; q be a partition set which is assumed to have been reduced, so that according to Remark 13.3 '.p/  ; h.p/  0; h.q/  0: To compute a lower bound ˇ.M/ for minf'.y/j y 2 Œp; q; h.y/ D 0g we note that, since '.y/ is increasing, an obvious bound is '.p/: It turns out that, as will be shortly seen, the convergence of the BRB algorithm for solving (eqn13.25) will be ensured, provided that ˇ.M/  '.p/:

(13.28)

A lower bound satisfying this condition will be referred to as a valid lower bound. Define .y/ D minf'.y/  ; h.y/g;  D fy 2 Œp; qj .y/  0g: If .p/  0  .q/; then obviously p is an exact minimizer of '.y/ over the feasible points in Œp; q at least as good as the current best, and can be used to update the current best solution. Suppose therefore that .p/  0; h.p/ < 0; i.e., p 2 n˝: For each y 2 Œp; q such that h.y/ < 0 let .y/ be the first point where the line segment from y to q meet the lower boundary of ˝; i.e., .y/ D y C .q  y/; with  D maxf˛j h.y C ˛.q  y//  0g: Obviously, h. .y// D 0: Lemma 13.2 If z D .p/; zi D p C .zi  pi /ei ; i D 1; : : : ; m; and I D fij zi 2 g; then a valid lower bound over M D Œp; q is ˇ.M/ D minf'.zi /j i 2 Ig D minff .x/j x 2 D; C.x/  zi ; i 2 Ig: Proof Let Mi D Œzi ; q: From the definition of z it is easily seen that h.u/ < 0 8u 2 Œp; z/; i.e., Œp; q\\˝  Œp; qnŒp; z/: Noting that fuj p  u < zg D \m iD1 fuj pi  m ui < zi g we can write Œp; qnŒp; z/ D Œp; qn\m fuj u < z g D [ fu 2 Œp; qj zi  i i iD1 iD1 m i ui  qi g D [m M : Thus, if I D fijz 2 g, then Œp; q \  \ ˝  [ i iD1 iD1 Mi : Since '.zi /  minf'.y/j y 2 Mi g; the result follows. t u

472

13 Optimization Under Equilibrium Constraints

MO Algorithm for (OWE) Initialization. Start with P1 D fM1 g; M1 D Œa; b; R1 D ;: If a current best feasible solution (CBS) is available let CBV (current best value) denote the value of f .x/ at this point. Otherwise, set CBV=C1: Set k D 1: Step 1.

For each bx M 2 Pk : – – – –

Step 2.

Step 3.

Step 4. Step 5.

Compute the -valid reduction red M of M for =CBV; Delete M if red M D ;I Replace M by red M otherwise; If red M D Œp; q, then compute a valid lower bound ˇ.M/ for '.y/ over the feasible solutions in M:

Let Pk0 be the collection of boxes that results from Pk after completion of Step 1. From Rk remove all M 2 Rk such that ˇ.M/ CBV and let Rk0 be the resulting collection. Let Sk D Rk0 [ Pk0 : If Sk D ;, then terminate: the problem is infeasible if CBVD C1 or CBV is the optimal value and the feasible solution y with '.y/=CBV is an optimal solution if CBV< C1: Otherwise, let Mk 2 argminfˇ.M/j M 2 Sk g: Divide Mk into two subboxes by the standard bisection. Let PkC1 be the collection of these two subboxes of Mk : Let RkC1 D Sk n fMk g: Increment k and return to Step 1.

Proposition 13.10 Either the algorithm is finite, or it generates an infinite filter of boxes fMkl g whose intersection yields a global optimal solution. Proof If the algorithm is finite it must generate an infinite filter of boxes fMkl g such  that \C1 lD1 Mkl D fy g by exhaustiveness of the standard bisection process. Since kl kl Mkl D Œp ; q  with h.pkl /  0  h.qkl / and y D lim pkl D lim qkl it follows that h.y /  0  h.q /; so y is a feasible solution. Therefore, if the problem is infeasible, the algorithm must stop at some iteration where no box remains for consideration while CBVD C1 (see Step 3). Otherwise, '.pkl /  ˇ.Mkl /  '.qkl /; whence liml!C1 ˇ.Mkl / D '.y /: On the other hand, since ˇ.Mkl / is minimal among the current set Skl we have ˇ.Mkl /  minf'.y/j y 2 Œa; b \ ˝g ad, consequently, '.y /  minf'.y/j y 2 Œa; b \ ˝g: Since y is feasible it must be an optimal solution.

t u

13.3 Optimization Over the Efficient Set

473

c. Enhancements When C Is Concave Tighter Lower Bounds Recall from (13.25) that (OWE) is equivalent to the monotonic optimization problem minf'.y/j y 2 ˝g where ˝ D fy 2 Œa; bj h.y/  0g; h.y/ D minftj yi  Ci .x/  t; i D 1; : : : ; m; x 2 Xg and '.y/ D minff .x/j x 2 X; g.x/  0; C.x/  yg:

(13.29)

Lemma 13.3 If the functions Ci .x/; i D 1; : : : ; m; are concave the set  D fy 2 Œa; bj h.y/  0g is convex. Proof Clearly  D fy 2 Œa; bj y  C.x/ for some x 2 Xg: If y; y0 2 ; 0  ˛  1, then y  C.x/; y0  C.x0 / with x; x0 2 C and ˛yC.1˛/y0  ˛C.x/C.1˛/C.x0 /  C.˛x C .˛/x0 /; where ˛x C .1  ˛/x0 2 X because X is convex by assumption. u t The convexity of  permits a simple method for computing a tight lower bound for the minimum of '.y/ over the feasible solutions in Œp; q: For each i D 1; : : : ; m let yi be the last point of  on the line segment from p to 0 p D p C .qi  pi /ei ; i.e., yi D p C i .qi  pi /ei with i D maxf˛j p C ˛.qi  pi /ei 2 ; 0  ˛  1g: Clearly h.yi / D 0: Proposition 13.11 If the functions Ci .x/; i D 1; : : : ; m are concave then a lower bound for '.y/ over the feasible portion in M D Œp; q is ˇ.M/ D minff .x/j x 2 X; g.x/  0; C.x/  p; m X .Ci .x/  pi /=.yii  pi /  1g:

(13.30)

iD1

Proof First observe that by convexity of  the simplex S D Œy1 ; : : : ; ym  satisfies fy 2 Œp; qj h.y/  0g  S C Rm C and since '.y/ is increasing we have '.y/  '. .y// 8y 2 S; where .y/ D y C .q  y/ with  D maxf˛j h.y C ˛.q  y/  0g, i.e., h. .y// D 0: Hence, ˇ.M/ D minf'.y/j y 2 Sg  minf'.y/j y 2 Œp; q; h.y/ D 0g;

(13.31)

474

13 Optimization Under Equilibrium Constraints

so (13.31) gives a valid lower bound. Further, letting E D f.x; y/j x 2 X; g.x/  0; C.x/  y; y 2 Sg Ey D fx 2 Xj g.x/  0; C.x/  yg D fxj .x; y/ 2 Eg for every y 2 S; we can write min '.y/ y2S

D min minff .x/j x 2 Ey g y2S

D minff .x/j .x; y/ 2 Eg D minff .x/j x 2 X; g.x/  0; C.x/  y; y 2 Sg: Since S D fy  pj

Pm

iD1 .yi

 pi /=.yii  pi / D 1g; (13.30) follows.

t u

Remark 13.4 In particular, when X is a polytope, C is a linear mapping, g.x/; f .x/ are affine functions, as e.g., in problems (OWE) with linear objective criteria, then '.y/ and ˇ.M/ are optimal values of linear programs. Also note that the linear program (13.29) for different y differs only by the right-hand side of the linear constraints and the linear programs (13.30) for different partition sets M differ only by the last constraint and the right-hand side of the constraint C.x/  p: Exploiting these facts, reoptimization techniques can be used to solve efficiently these linear programs. Adaptive Subdivision Rule Let yM be the optimal solution of problem (13.30) and zM D .yM /: Since the equality yM D zM would imply that ˇ.M/ D minf'.y/j y 2 Œp; q; h.y/ D 0g; the following adaptive subdivision rule must be used to accelerate convergence of the algorithm (see Theorem 6.4, Chap. 6): M Determine v M D 12 .yM C zM /; sM 2 argmaxiD1;:::;m kyM i  zi j; and subdivide M M via .v ; sM /:

13.3.3 Solution Method for (OE) We now turn to the constrained optimization problem over the efficient set .OE/

˛ WD minff .x/j x 2 X; g.x/  0; x 2 XE g

where XE is the set of efficient points of X with respect to C W Rn ! Rm : Although efficiency is a much stronger property than weak efficiency, it turns out that the problem (OE) can be solved, roughly speaking, by the same method as (OWE). More precisely, let, as previously,  D fy 2 Œa; bj x 2 X; C.x/  yg: This is a normal set in the sense that y0  y 2  ) y0 2  (see Chap. 11);

13.3 Optimization Over the Efficient Set

475

moreover, it is compact. Recall from Sect. 11.1 (Chap. 11) that a point z 2  is called an upper boundary point of ; written z 2 @C ; if .z; b  Œa; b n : A point z 2 @C  is called an upper extreme point of ; written z 2 extC ; if x 2 ; x  z implies x D z: By Proposition 11.5 a compact normal set such as  always has at least one upper extreme point. Proposition 13.12 x 2 XE , C.x/ 2 extC : Proof If x0 2 XE , then y0 D C.x0 / 2  and there is no x 2 X such that C.x/  C.x/ ; C.x/ ¤ C.x0 /, i.e., no y D C.x/; x 2 X such that y  y0 ; y ¤ y0 ; hence y0 2 extC : Conversely, if y0 D C.x0 / 2 extC  with x0 2 X, then there is no y D C.x/; x 2 X; such that y  y0 ; y ¤ y0 ; i.e., no x 2 X with C.x/  C.x0 /; C.x/ ¤ C.x0 /; hence x0 2 XE : t u For every y 2 Rm C define ( .y/ D min

m X

) .yi  zi /j z  y; z 2 

:

(13.32)

iD1

Proposition 13.13 The function .y/ W Rm C ! RD [ fC1g is increasing and satisfies 8 if y 2  yi for at least C if y 2 ext  then for every z 2  iD1 .yi  zi / < 0; a contradiction. Conversely, P such that z  y one must have z D y; hence m iD1 .yi  zi / D 0; i.e., .y/ D 0: That .y/  0 8y 2 ; .y/ D C1 8y ¤  is obvious, t u Q Setting D D fx 2 Xj g.x/  0g; h.y/ D minfh.y/; .y/g; we can thus formulate the problem (OE) as Q minff .x/j x 2 D; h.C.x//  0g;

(13.34)

Q which has the same form as (13.25), with the difference, however, that h.y/; though increasing like h.y/; is not usc. Upon the same transformation as previously, this problem can be rewritten as minf'.y/j y 2 ˝; .y/ D 0g; where '.y/ is defined as in (13.29) and ˝ D fx 2 Œa; bj h.y/  0g.

(13.35)

476

13 Optimization Under Equilibrium Constraints

Clearly problem (13.35) differs from problem (13.25) only by the additional constraint .y/ D 0; whose presence is what makes (OE) a bit more complicated than (OWE). There are two possible approaches for solving (OE). In the first approach proposed by Thach and discussed in Tuy et al. (1996a) the problem (OE) is approximated by an (OWE) differing from (OE) only in that the criterion mapping C is slightly perturbed and replaced by C" with Cj" .x/ D Cj .x/ C "

m X

Ci .x/;

j D 1; : : : ; m;

iD1 " where " > 0 is sufficiently small. Denote by XWE and XE" resp. the weakly efficient " set and the set of efficient set of X w.r.t. C .x/ and consider the problem

.OWE" /

" ˛."/ WD minff .x/j g.x/  0; x 2 XWE g:

" Proposition 13.14 (i) For any " > 0 we have XWE  XE : (ii) ˛."/ ! ˛ as " & 0: " (iii) If X is a polytope and C is linear, then there is "0 > 0 such that XWE D XE for all " 2 .0; "0 /:  Proof (i) Let x 2 X n XE : Then there is x 2 XPsatisfying C.x/ Pm C.x /; m  Ci .x/ > Ci .x / for at least one i: This implies iD1 Ci .x/ > iD1 Ci .x /; " hence C" .x/ > C" .x /; so that x … XWE : Thus, if x 2 X n XE then " x 2 X n XWE  XE : (ii) Property (i) implies that ˛."/  ˛: Define

 " D fy 2 Œa; bj y  C" .x/; x 2 Xg; '" .y/ D minff .x/j; x 2 X; g.x/  0; C" .x/  yg;

(13.36) (13.37)

so that problem .OWE" / reduces to minf'" .y/j y 2 cl.Œa; b n  " g:

(13.38)

" " : Since x … XWE there exists Next consider an arbitrary point x 2 XE n XWE x" 2 X such that C" .x" / > C" .x/: Then x" is feasible to problem .OWE" / and, consequently, f .x" /  ˛."/  ˛: In view of the compactness of X; by passing to a subsequence if necessary, one can suppose that x" ! x 2 X as " & 0: Therefore, f .x /  lim ˛."/  ˛: But from C" .x" / > C" .x/ we have C.x /  C.x/ and since x 2 XE it follows that x D x; and hence, " f .x/  lim ˛."/  ˛: This being true for arbitrary x 2 XE n XWE we conclude that lim ˛."/ D ˛: (iii) We only sketch the proof, referring the interested reader to Tuy et al. (1996b, Proposition 15.1), for detail. Since C.x/ is linear, we can consider the cones K WD fxj Ci .x/  0; i D 1; : : : ; mg; K " WD fxj C" .x/  0; i D 1; : : : ; mg: Let x0 2 XE and denote by F the smallest facet of X that contains x0 : Since

13.3 Optimization Over the Efficient Set

477

x0 2 XE we have .x0 C K/ \ F D ;; and hence, there exits "F > 0 such that .x0 C K " / \ F D ; 8" 2 .0; "F /: Then we have .x C K " / \ F D ; 8x 2 F whenever 0 < "  "F : If F denotes the set of facets of X such that F  XE ; then F is finite and "0 D minf"F j F 2 F g > 0: Hence, .x C K " / \ F D " ; 8x 2 F; 8F 2 F ; i.e., XE  XWE whenever 0 < " < "0 : t u As a consequence, an approximate optimal solution of (OE) can be obtained by solving .OWE" / for any " such that 0 < "  "0 : An alternative approach to (OE) consists in solving the equivalent monotonic optimization problem (13.25) by a BRB algorithm analogous to the one for solving problem (OWE). To take account of the peculiar constraint .y/ D 0 some precautions and special manipulations are needed (i) A “feasible solution” is an y 2 Œa; b satisfying .y/ D 0; i.e., a point of extC : The current best solution is the best among all so far known feasible solutions. (ii) In Step 1, after completion of the reduction operation described above, a special supplementary reduction–deletion operation is needed to identify and fathom any box Œp; q that contains no point of extC : This supplementary operation will ensure that every filter of partition sets Œpk ; qk  collapses to a point corresponding to an efficient point. Define h" .y/ D minftj x 2 X; yi  Ci" .x/  t; i P D 1; : : : ; mg;  " D fy 2 " " Œa; bj y  C .x/; x 2 Xg where Cj .x/ D Cj .x/ C " m j D 1; : : : ; m; iD1 Ci .x/; " " Since C.x/  a 2 Rm C 8x 2 X; we have C.x/  C .x/ 8x 2 X; hence    : Let Œp; q be a box which has been reduced as described in Proposition 13.9, so that p 2 ; q … : Since p 2  we have .p/  0: Supplementary Reduction–Deletion Rule: Fathom Œp; q in either of the following events: (a) .q/  0 (so that .y/  0 8y 2  \ Œp; q  @C  \ Œp; q/I if .q/ D 0, then q gives an efficient point and can be used to update the current best. (b) h" .q/  0 for some small enough " > 0 (then q 2  " ; so the box Œp; q contains no point of @C . " /; and hence no point of extC  /: Proposition 13.15 The BRB algorithm applied to problem (13.35) with precaution (i) and the supplementary reduction–deletion rule either yields an optimal solution to (OE) in finitely many iterations or generates an infinite filter of boxes whose intersection is a global optimal solution y of (13.35), corresponding to an optimal solution x of (OE). Proof Because of precautions (i) and (ii) the box selected for further partitioning in each iteration contains either a point of extC  or a point of extC  " : Therefore, any infinite filter of boxes generated by the algorithm collapses either to a point y 2 extC  (yielding then an x 2 XE ) or to a point y 2 extC  " (yielding then " x 2 XWE  XE ). t u

478

13 Optimization Under Equilibrium Constraints

13.3.4 Illustrative Examples Example 13.3 Problem (OWE) with linear input functions. Solve minff .x/j x 2 X; g.x/  0; x 2 XWE with following data: f .x/ D hc; xi; c D .4; 8; 3; 1; 7; 0; 6; 2; 9; 3/; X D fx 2 R10 C j A1 x  b1 ; A2 x  b2 g; where

2

3 7 1 3 7 0 9 2 1 5 1 6 5 8 1 7 5 0 1 3 4 0 7 6 7 6 7 A1 D 6 2 1 0 2 3 2 2 5 1 3 7 6 7 4 1 4 2 9 4 3 3 4 0 2 5 3 1 0 8 3 1 2 2 5 5 b1 D .66; 150; 81; 79; 53/ 2 2 9 1 2 2 1 6 2 3 2 4 5 4 6 6 4 8 1 1 5 3 6 6 6 2 7 1 2 5 9 A2 D 6 6 2 3 1 1 4 3 6 6 6 0 0 0 4 3 6 4 2 2 3 5 6 4 1 4 4 6 0 3

3 4 1 5 2 1 9 2 1 7 7 2 0 2 9 7 7 7 4 1 2 0 7 7 1 0 0 6 7 7 2 2 4 6 7 7 0 0 1 4 5 4 2 4 1

b2 D .126; 12; 52; 23; 2; 23; 28; 90/ g.x/ WD Gx  d where

 GD

1 1 2 4 4 4 1 4 6 3 4 9 0 1 2 1 6 5 0 0



d D .14; 89/ 2 3 5 1 7 1 4 9 0 4 3 7 C D 4 1 2 5 4 1 6 4 0 3 0 5 3 3 0 4 0 1 2 1 4 0 Computational results: Optimal solution (tolerance D 0:01): x D .0; 8:741312; 8:953411; 9:600323; 3:248449; 3:916472; 6:214436; 9:402384; 10; 4:033163/ (found at iteration 78 and confirmed at iteration 316)

13.4 Equilibrium Problem

479

Optimal value: 107.321065 Maximal number of nodes generated: 32 Example 13.4 Problem (OE) with linear-fractional multicriteria (taken from Muu and Tuyen (2002)) Solve minff .x/j x 2 X; x 2 XE g with following data: f .x/ D x1  x2 X D fx 2 R2C j Ax  bg where 2

3 1 2 6 1 2 7 7 AD6 4 1 1 5 ; 1 0; b D .2; 2; 1; 6/T ; x1 ; C1 .x/ D x1 C x2 C2 .x/ D

x1 C 6 : x1  x 2 C 3

Computational results: Optimal solution (tolerance D 0:01): x D .1:991962; 2:991962/ (found at iteration 16 and confirmed at iteration 16), Optimal value: 4.983923.

13.4 Equilibrium Problem Given a nonempty compact set C  Rn ; a nonempty closed convex set D  Rm ; and a function F.x; y/ W C  D ! R which is upper semi-continuous in x and quasiconvex in y; the problem is to find an equilibrium of the system .C; D; F/; i.e., a point x 2 Rn satisfying x 2 C; F.x; y/  0 8y 2 D: (cf. Sect. 3.3).

(13.39)

480

13 Optimization Under Equilibrium Constraints

By Theorem 3.4 we already know that this problem has a solution, provided the following assumption holds: (A) There exists a closed set-valued function ' from D to C with nonempty compact values '.y/  C 8y 2 D; such that inf inf F.x; y/  0:

y2D x2'.y/

(13.40)

The next question that arises is how to find effectively the equilibrium point when condition (13.40) is fulfilled. As it turns out, this is a question of fundamental importance both for the theory and applications of mathematics. Historically, about 40 years ago the so-called combinatorial pivotal methods were developed for computing fixed point and economic equilibrium, which was at the time a very popular subject. Subsequently, the interest switched to variational inequalities—a new class of equilibrium problems which have come to play a central role in many theoretical and practical questions of modern nonlinear analysis. With the rapid progress in nonlinear optimization, including global optimization, nowadays equilibrium problems and, in particular, variational inequalities can be approached and solved as equivalent optimization problems.

13.4.1 Variational Inequality Let C be a compact convex set in Rn ; ' W Rn ! Rn a continuous mapping. Consider the variational inequality .VI/

Find x 2 C;

s.t.

h'.x/; x  xi  0 8x 2 C:

For x; y 2 C define F.x; y/ D maxfhx; y  zij z 2 Cg D hx; yi  minhx; zi: z2C

(13.41)

Writing minz2C hx; zi D hx; zi for z 2 C we see that the function F.x; y/ is continuous in x and convex in y: By Theorem 3.4 there exists x 2 C such that inf F.x; y/  inf F.'.y/; y/:

y2C

y2C

But infy2C F.x; y/ D infy2C hx; yi  minz2C hx; zi D 0; so inf F.'.y/; y/  0:

y2C

(13.42)

13.4 Equilibrium Problem

481

On the other hand, F.'.y/; y/ D h'.y/; yi  minz2C h'.y/; zi  0 8y 2 C; hence, taking account of the continuity of F.'.y/; y/ and the compactness of C: min F.'.y/; y/ D 0: y2C

(13.43)

If y is a global optimal solution of this optimization problem, then F.'.y/; y/ D 0; i.e., 0 D h'.y/; yi  minz2C h'.y/; zi; hence h'.y/; yi D miny2C h'.y/; yi; i.e., h'.y/; y  yi  0 8y 2 C; which means y is a solution to (VI). Conversely, it is easily seen that any solution y of (VI) satisfies F.'.y/; y/ D 0: Thus, under the stated assumptions the variational inequality (VI) is solvable and any solution of it can be found by solving the global optimization problem minfF.'.y/; y/j y 2 Cg;

(13.44)

where F.'.y/; y/ D h'.y/; yi  minh'.y/; zi  0 8y 2 C: z2C

(13.45)

It is important to note that the optimization problem (13.44) should be solved to global optimality because F.'.y/; y/ is generally nonconvex and a local minimizer y of it over C may satisfy F.'.y/; y/ > 0; so may fail to provide a solution to the variational inequality. A function g.y/; such that g.y/  0 8y 2 C and g.y/ D 0 for y 2 C if and only if y solves the variational inequality (VI), is often referred to as a gap function associated with (VI). It follows from the above that the function F.'.y/; y/ defined by (13.45), which was first introduced by Auslender (1976), is a gap function for (VI). Thus, once a gap function g.y/ is known for the variational inequality (VI), solving the latter reduces to solving the equation y 2 C; g.y/ D 0;

(13.46)

which in turn is equivalent to the global optimization problem minfg.y/j y 2 Cg:

(13.47)

The existence theorem for variational inequalities and the conversion of variational inequalities into equivalent equations date back to Hartman and Stampacchia (1966). The use of gap functions to convert variational inequalities into optimization problems (13.47) appeared subsequently (Auslender 1976). However, in the earlier period, when global optimization methods were underdeveloped, the converted optimization problem was solved mainly by local optimization methods, i.e., by methods seeking a local minimizer, or even a stationary point satisfying necessary optimality conditions. Since the function g.y/ may not be convex such points do not

482

13 Optimization Under Equilibrium Constraints

always satisfy the required inequality in (VI) for all y 2 C; so strictly speaking they do not solve the variational inequality. Therefore, additional ad-hoc manipulations have often to be used to guarantee the global optimality of the solution found. Aside from the function (13.45) introduced by Auslender, many other gap functions have been developed and analyzed in the last two decades: (Fukushima 1992), (Zhu and Marcotte 1994), (Mastroeni 2003),. . . For a gap function to be practically useful it should possess some desirable properties making it amenable to currently available optimization methods. In particular, it should be differentiable, if differentiable optimization methods are to be used. From this point of view, the following function, discovered independently by Auchumuty (1989) and Fukushima (1992), is of particular interest:

1 g.y/ D max h'.y/; z  yi  hz  y; M.z  y/i ; z2C 2

(13.48)

where M is an arbitrary n  n symmetric positive definite matrix. Let 1 h.z/ D h'.y/; z  yi  hz  y; M.z  y/i: 2

(13.49)

For every fixed y 2 C; since h.z/ is a strictly concave quadratic function, it has a unique maximizer u D u.y/ on the compact convex set C which is also a minimizer of the strictly convex quadratic function h.z/ on C and hence, is completely defined by the condition 0 2 rh.u/ C NX .u/ (see Proposition 2.31), i.e., hM.u.y/  y/  '.y/; z  u.y/i  0

8z 2 C:

(13.50)

Proposition 13.16 (Fukushima 1992) The function (13.48) is a gap function for the variational inequality (VI). Furthermore, a vector y 2 C solves (VI) if and only if y D u.y/; i.e., if and only if y is a fixed point of the mapping u W C ! C that carries each point y 2 C to the maximizer u.y/ of the function h.z/ over the set C: Proof Clearly h.y/ D 0; so g.y/ D maxz2C h.z/  h.y/ D 0 8y 2 C: We show that y 2 C solves (VI) if and only if g.y/ D 0. Since g.y/ D maxz2C h.z/ we have g.y/ D h'.y/; u.y/  yi  12 hu.y/  y; M.u.y/  y/i: It follows that g.y/ D 0 for y 2 C if and only if y D u.y/; g.y/ D minz2C g.y/; and h'.y/; z  yi  0; 8z 2 C [see (13.50)], consequently, if and only if y is a solution of (VI). t u It has also been proved in Fukushima (1992) that under mild assumptions on continuity and differentiability of '.y/ the function (13.48) is also continuous and differentiable and that any stationary point of it on C solves (VI). Consequently, with the special gap function (13.48) and under appropriate assumptions (VI) can be solved by using conventional differentiable local optimization methods to find a stationary point of the corresponding problem (13.47). The next proposition indicates two important cases when the problem (13.44) using the gap function (13.45) can be solved by well-developed global optimization methods.

13.5 Optimization with Variational Inequality Constraint

483

Proposition 13.17 (i) If the map ' W Rn ! Rm is affine, then the gap function (13.45) is a dc function and the corresponding optimization problem equivalent to (VI) is a dc optimization problem. n (ii) If C  Rm C and each function 'i .y/ W RC ! RC ; i D 1; : : : ; m; is an increasing function then the gap function (13.45) is a dm function and the corresponding optimization problem equivalent to (VI) is a dm optimization problem. Proof (i) If '.y/ is affine, then h'.y/; yi is a quadratic function, while minz2C h'.y/; zi is a concave function, so (13.45) is a dc function. 0 (ii) If every function 'i .y/ is increasing then for any y; y0 2 Rm C such that y  y by writing h'.y/; yi  '.t0 /; y0 i D Œh'.y/; yi  h'.y0 /; yi C Œh'.y0 /; yi  h'.y0 /; y0 i; it is easily seen that h'.y/; yi is an increasing function. Since it is also obvious that minz2C h'.y/; zi is an increasing function, (13.45) is a dm function. t u In the case (i) where, in addition, C is a polytope, (VI) is called an affine variational inequality.

13.5 Optimization with Variational Inequality Constraint A class of MPECs which has attracted much research in recent years includes optimization problems with a variational inequality constraint. These are problems of the form .OVI/

minimize ˇ f .x; y/ ˇ .x; y/ 2 U; y 2 C; subject to ˇˇ h'.x; y/; z  yi  0 8z 2 C;

where f .x; y/ W Rn  Rm ! R is a continuous function, U  Rn  Rm a nonempty closed set, and C  Rm is a nonempty compact set. For each given x 2 X WD fx 2 Rn j .x; y/ 2 U for some y 2 Rm g the set S.x/ WD fy 2 Cj h'.x; y/; zyi  0 8z 2 Cg is the solution set of a variational inequality. So (OVI) is effectively an MPEC as defined at the beginning of this chapter. Among the best known solution methods for MPECs let us mention the nonsmooth approach by Outrata et al. (1998), the implicit programming approach and the penalty interior point method by Luo et al. (1996). Using a gap function g.y/ we can convert (OVI) into an equivalent optimization problem as follows:

484

13 Optimization Under Equilibrium Constraints

Proposition 13.18 (OVI) is equivalent to the optimization problem min f .x; y/ s.t. .x; y/ 2 U; y 2 C; : g.y/  0: Proof Immediate, because by definition of a gap function, y 2 C solves (VI) if and only if g.y/ D 0: t u Of particular interest is (OVI) when (VI) is an affine variational inequality, i.e., when ' is an affine mapping and C is a polytope. The problem is then MPEC with an affine equilibrium constraint and is referred to as MPAEC:

.MPAEC/

minimize ˇ f .x; y/ ˇ .x; y/ 2 U; y 2 C; subject to ˇˇ hAx C By C c; z  yi  0 8z 2 C;

where f .x; y/ W Rn  Rm ! R is a convex function, U  Rn  Rm a nonempty closed set, A; B are appropriate real matrices, c 2 Rn and C  Rn a nonempty polytope. By Proposition 13.18, using a gap function g.x; y/ we can reformulate (MPAEC) as the optimization problem min f .x; y/ s.t. .x; y/ 2 U; g.x; y/  0:

(13.51)

If the gap function used is (13.48), i.e.,

1 g.x; y/ D max h'.y/; z  yi  hz  y; M.z  y/i ; z2C 2 where M is a symmetric nn positive definite matrix, then (13.51) is a differentiable optimization problem for which, as shown by Fukushima (1992), even a stationary point yields a global optimal solution, hence a solution to MPAEC. Exploiting this property which is rather exceptional for gap functions, descent methods and other differentiable (local) optimization methods can be developed for solving MPAEC: see Fukushima (1992), and also Muu-Quoc-An-Tao, (2012) among others. If the gap function (13.45) is used, we have g.x; y/ D hAx C By C c; yi  minhAx C By C c; zi; z2C

which is a dc function (more precisely, the difference of a quadratic function and a concave function). In this case, (13.51) is a convex minimization under dc constraints, i.e., a tractable global optimization problem.

13.5 Optimization with Variational Inequality Constraint

485

Since the feasible set of problem (13.51) is nonconvex, an efficient method for solving it, as we saw in Chap. 7, Sect. 7.4, is to proceed according to the following SIT scheme. For any real number denote by .SP / the auxiliary problem of finding a feasible solution .x; y/ to (MPAEC) such that f .x; y/  : This problem is a dc optimization under convex constraints since it can be formulated as .SP /

min g.x; y/ s.t. .x; y/ 2 U; f .x; y/  :

(13.52)

Consequently, as argued in Sect. 6.2, it should be much easier to handle than (13.51) whose feasible set is nonconvex. In fact this is the rationale for trying to reduce (MPAEC) to a sequence of subproblems .SP /: Given a tolerance > 0; a feasible solution .x; y/ of (MPAEC) is said to be an

-optimal solution if f .x; y/  f .x; y/ C for every feasible solution .x; y/: Basically, the SIT approach consists in replacing (MPAEC) by a sequence of auxiliary problems .SP / where the parameter is gradually adjusted until a suitable value is reached for which by solving the corresponding .SP / an -optimal solution of (MPAEC) will be found. Suppose .x0 ; y0 / is the best feasible solution available at a given stage and let 0 D f .x0 ; y0 /  : Solving the problem .SP 0 / leads to two alternatives: either a solution .x1 ; y1 / is obtained, or an evidence is found that .SP 0 / is infeasible. In the former alternative, we set .x0 ; y0 / .x1 ; y1 / and repeat, with 1 D f .x1 ; y1 /  in place of 0 : In the latter alternative, f .x0 ; y0 /  f .x; y/ C for any feasible solution .x; y/ of MPAEC, so .x0 ; y0 / is an -optimal solution as desired and we stop. In that way each iteration improves the objective function value by at least > 0; so the procedure must terminate after finitely many iterations, yielding eventually an

-optimal solution. Formally, we can state the following algorithm. By translating if necessary it can be assumed that U is a compact convex subset m of a hyperrectangle Œa; b  RnC  Rm C and C is a compact convex subset of RC : A vector u D .x; y/ 2 U is said to be "-feasible to (MPAEC) if it satisfies g.u/  "I a vector u D .x; y/ 2 U is called an ."; /-optimal solution if it is "-feasible and satisfies f .u/  f .u/ C for all "-feasible solutions. Algorithm for (MPAEC) Select tolerances " > 0; > 0: Let 0 D f .a/: Step 0. Step 1.

Set P1 D fM1 g; M1 D Œa; b; R1 D ;; D 0 : Set k D 1. For each box (hyperrectangle) M 2 Pk : – Reduce M; i.e., find a box Œp; q  M as small as possible satisfying minfg.u/j f .u/  ; u 2 Œp; qg D minfg.u/j f .u/  ; u 2 Mg: Set M Œp; q: – Compute a lower bound ˇ.M/ for g.u/ over the feasible solutions in Œp; q:

486

13 Optimization Under Equilibrium Constraints

Delete every M such that ˇ.M/ > ": Step 2.

Step 3.

Step 4.

Let P 0 k be the collection of boxes that results from Pk after completion of Step 1. Let R 0 k D Rk [ P 0 k : If R 0 k D ;, then terminate: if D 0 the problem (MPAEC) is "infeasible (has no "-feasible solutions); if D f .u/ for some "-feasible solution u, then u is an ."; /-optimal solution of (MPAEC). If R 0 k ¤ ;; let Mk 2 argminfˇ.M/j M 2 R 0 k g; ˇk D ˇ.Mk /: Determine uk 2 Mk and v k 2 Mk such that f .uk /  ;

Step 5.

Step 6.

g.v k /  ˇ.Mk / D o.kuk  v k k/:

(13.53)

If g.uk / < "; go to Step 5. If g.uk /  "; go to Step 6. uk is an "-feasible solution satisfying f .uk /  : If D 0 reset f .u/  ; u D uk ; and go to Step 6. If D f .u/ 

k and f .u / < f .u/ reset u uk and go to Step 6. Divide Mk into two subboxes by the adaptive bisection, i.e., bisect Mk via .wk ; jk /; where wk D 12 .uk C v k /; jk 2 argmaxfjvjk  ukj j W j D 1; : : : ; ng: Let PkC1 be the collection of these two subboxes of Mk ; RkC1 D R 0 k n fMk g: Increment k; and return to Step 1.

Proposition 13.19 The above SIT Algorithm terminates after finitely many iterations, yielding either an ."; /-optimal solution to MPAEC, or an evidence that the problem is infeasible. t u

Proof Immediate.

Remark 13.5 If a feasible solution can be found by some fast local optimization method available, it should be used to initialize the algorithm. Also if for some reason the algorithm has to be stopped prematurely, some reasonably good feasible solution may have been already obtained. This is in contrast with other algorithms which would become useless in that case.

13.6 Exercises 1 Solve the Linear Bilevel Program (Tuy et al. 1993a): min.2x1 C x2 C 0:5y1 / x1 ; x2  0;

s.t.

y D .y1 ; y2 / solves

min.4y1 C y2 /

s.t.

2x1 C y1  y2  2:5 x1  3x2 C y2  2 y1 ; y2  0:

13.6 Exercises

487

Hints: a non-optimal feasible solution is x D .1; 0/; y D .0:5; 1/: 2 Solve the Convex Bilevel Program (Floudas 2000, Sect. 9.3): min..x  5/2 C .2y C 1/2 / x  0;

s.t.

y solves

min..y  1/2  1:5xy/ s.t. 3x C y  3 x  0:5y  4 xCy7 x  0; y  0: Hints: A known feasible solution is x D 1; y D 0: 3 Show that any linear bilevel program is a MPAEC. 4 Solve the MPAEC equivalent to the LBP in Exercise 1. 5 Consider the MPAEC minimize f .x; y/ subject to .x; y/ 2 U; y 2 C; hAx C By C c; z  yi  0 8z 2 C

.1/ .2/ .3/

where U  RnCm ; C  Rn are two nonempty closed convex sets, f .x; y/ is convex function on RnCm while A; B are given appropriate real matrices and c 2 Rn : Show that if A is a symmetric positive semidefinite matrix then this problem reduces to a convex bilevel program. 6 Let g.x; y/ D maxy2C fhAx C By C c; y  zi  12 hz  y; G.z  y/g; where G is an arbitrary n  n symmetric positive definite matrix. Show that: (i) g.x; y/  0 8.x; y/ 2 C  Rm ; (ii) .x; y/ 2 U; y 2 C; g.x; y/ D 0 if and only if .x; y/ satisfies the conditions (2) and (3) in Exercise 5. In other words, the MPAEC in Exercise 5 is equivalent to the dc optimization problem minf .x; y/ W .x; y/ 2 U; y 2 C; g.x; y/  0:

References

Ahuja, R.K., Magnanti, T.L., Orlin, J.B.: Networks Flows: Theory, Algorithms and Applications. Prentice Hall, Englewood Cliffs, NJ (1993) Alexandrov, A.D.: On surfaces which may be represented by a difference of convex functions. Izvestiya Akademii Nauk Kazakhskoj SSR, Seria Fiziko-Matematicheskikh 3, 3–20 (1949, Russian) Alexandrov, A.D.: On surfaces which may be represented by differences of convex functions. Dokl. Akad. Nauk SSR 72, 613–616 (1950, Russian) Al-Khayyal, F.A., Falk, J.E.: Jointly constrained biconvex programming. Math. Oper. Res. 8, 273– 286 (1983) Al-Khayyal, F.A., Larsen, C., Van Voorhis, T.: A relaxation method for nonconvex quadratically constrained quadratic programs. J. Glob. Optim., 215–230 (1995) Al-Khayyal, F.A., Tuy, H., Zhou, F.: DC optimization methods for multisource location problems. School of Industrial and Systems Engineering. George Institute of Technology, Atlanta (1997). Preprint Al-Khayyal, F.A., Tuy, H., Zhou, F.: Large-scale single facility continuous location by DC Optimization. Optimization 51, 271–292 (2002) An, L.T.H.: Solving large scale molecular distance geometry problems by a smoothing technique via the Gaussian Transform and D.C. Programming. J. Glob. Optim. 27, 375–397 (2003) An, L.T.H., Tao, P.D.: The DC (difference of convex) programming and DCA revisited with DC models of real world nonconvex optimization problems. Ann. Oper. Res. 133, 23–46 (2005) An, L.T.H., Tao, P.D., Muu, L.D.: Simplicially-constrained DC optimization over the efficient and weakly efficient set. J. Optim. Theory Appl. 117, 503–531 (2003) An, L.T.H., Belghiti, M.T., Tao, P.D.: A new efficient algorithm based on DC programming and DCA for clustering. J. Glob. Optim. 37, 557–569 (2007) An, L.T.H., Tao, P.D., Nam, N.C., Muu, L.D.: Methods for optimization over the efficient and weakly efficient sets of an affine fractional vector optimisation problem. Optimization 59, 77– 93 (2010) Androulakis, I.P., Maranas, C.D., Floudas, C.A.: ˛BB: a global optimization method for general constrained nonconvex problems. J. Glob. Optim. 7, 337–363 (1995) Apkarian, P., Tuan, H.D.: Concave programming in control theory. J. Glob. Optim. 15, 343–370 (1999) Asplund, E.: Differentiability of the metric projection in finite dimensional Euclidean space. Proc. AMS 38, 218–219 (1973) Astorino, A., Miglionico, G.: Optimizing sensor cover energy via DC Programming. Optim. Lett. (2014). doi:10.1007/s11590-014-0778-y

© Springer International Publishing AG 2016 H. Tuy, Convex Analysis and Global Optimization, Springer Optimization and Its Applications 110, DOI 10.1007/978-3-319-31484-6

489

490

References

Aubin, J.-P., Ekeland, I.: Applied Nonlinear Analysis. John Wiley & Sons, NewYork (1984) Auchumuty, G.: Variational principles for variational inequalities. Numer. Funct. Anal. Optim. 10, 863–874 (1989) Auslender, A.: Optimisation, Méthodes Numériques. Masson, Paris (1976) Bach-Kim, N.T.: An algorithm for optimization over the efficient set. Vietnam J. Math. 28, 329– 340 (2000) Baer, E., et al.: Biological and synthetic hierarchical composites. Phys. Today 46, 60–67 (1992) Balas, E.: Intersection cuts—a new type of cutting planes for integer programming. Oper. Res. 19, 19–39 (1971) Balas, E.: Integer programming and convex analysis: intersection cuts and outer polars. Math. Program. 2, 330–382 (1972) Balas, E., Burdet, C.A.: Maximizing a convex quadratic function subject to linear constraints. Management Science Research Report No 299, Carnegie–Melon University (1973) Bard, J.F.: Practical Bilevel Optimization Algorithms and Optimization. Kluwer, Dordrecht (1988) Bard, J. F.: Convex two-level optimization. Math. Program. 40, 15–27 (1998) Ben-Ayed, O.: Bilevel linear programming. Comput. Oper. Res. 20, 485–501 (1993) Bengtsson, M., Ottersen, B.: Optimum and suboptimum transmit beamforming. In: Handbook of Antennas in Wirelss Coomunications, chap. 18. CRC, Boca Raton (2001) Benson, H.P.: Optimization over the efficient set. J. Math. Anal. Appl. 98, 562–580 (1984) Benson, H.P.: An algorithm for optimizing over the weakly-efficient set. Eur. J. Oper. Res. 25, 192–199 (1986) Benson, H.P.: An all-linear programming relaxation algorithm for optimization over the efficient set. J. Glob. Optim. 1, 83–104 (1991) Benson, H.P.: A bisection-extreme point search algorithm for optimizing over the efficient set in the linear dependance case. J. Glob. Optim. 3, 95–111 (1993) Benson, H.P.: Concave minimization: theory, applications and algorithms. In: Horst, R., Pardalos, P. (eds.) Handbook of Global Optimization, pp. 43–148. Kluwer Academic Publishers, Dordrecht (1995) Benson, H.P.: Deterministic algorithms for constrained concave minimization: a unified critical survey. Nav. Res. Logist. 43, 765–795 (1996) Benson, H.P.: An outer approximation algorithm for generating all efficient extreme points in the outcome set of a multi objective linear programming problem. J. Glob. Optim. 13, 1–24 (1998) Benson, H.P.: An outcome space branch and bound outer approximation algorithm for convex multiplicative programming. J. Glob. Optim. 15, 315–342 (1999) Benson, H.P.: A global optimization approach for generating efficient points for multiplicative concave fractional programming. J. Multicrit. Decis. Anal. 13, 15–28 (2006) Ben-Tal, A., Eiger, G., Gershovitz, V.: Global minimization by reducing the duality gap. Math. Program. 63, 193–212 (1994) Ben-Tal, A., Nemirovski, A.: Lectures on Modern Convex Optimization. MPS-SIAM Series on Optimization. SIAM, Philadelphia (2001) Bentley, J.L., Shamos, M.I.: Divide and conquer for linear expected time. Inf. Proc. Lett. 7, 87–91 (1978) Bialas, W.F., Karwan, C.H.: On two-level optimization. IEEE Trans. Auto. Const. AC-27, 211–214 (1982) Bj¯ornson, E., Jorswieck, E.: Optimal resource allocation in coordinated multi-cell systems. Foundations and Trends in Communications and Information Theory, vol. 9(2–3), pp. 113–381 (2012). doi:10.1561/0100000069 Bj¯ornson, E., Zheng, G., Bengtson, M., Ottersen, B.: Robust monotonic optimization framework for multicell MISO systems. IEEE Trans. Signal Process. 60(5), 2508–2521 (2012) Blum, E., Oettli, W.: From optimization and variational inequality to equilibrium problems. Math. Student 63, 127–149 (1994) Bolintineau, S.: Minimization of a quasi-concave function over an efficient set. Math. Program. 61, 83–120 (1993a)

References

491

Bolintineau, S.: Optimality condition for minimization over the (weakly or properly) efficient set. J. Math. Anal. Optim. 173, 523–541 (1993b) Boyd, S., Vandenberghe, L.: Convex Optimization. Cambridge University Press, Cambridge, United Kingdom (2004) Bracken, J., McGill, J.: Mathematical programs with optimization problems in the constraints. Oper. Res. 21, 37–44 (1973) Bracken, J., McGill, J.: A method for solving mathematical programs with nonlinear programs in the constraints. Oper. Res. 22, 1097–1101 (1974) Bradley, P.S., Mangasarian, O.L., Street, W.N.: Clustering via concave minimization. Technical report 96–03. Computer Science Department, University of Wisconsin (1997) Brimberg, J., Love, R.F.: A location problem with economies of scale. Stud. Locat. Anal. (7), 9–19 (1994) Bui, A.: Etude analytique d’algorithmes distribués de routage. Thèse de doctorat, Université Paris VII (1993) Bulatov, V.P. Embedding methods in optimization problems. Nauka, Novosibirsk (1977) (Russian) Bulatov, V.P. Methods for solving multiextremal problems (Global Search). In Beltiukov, B.A., Bulatov, V.P. (eds) Methods of Numerical Analysis and Optimization. Nauka, Novosibirsk (1987) (Russian) Candler, W., Townsley, R.: A linear two-level programming problem. Comput. Oper. Res. 9, 59–76 (1982) Caristi, J.: Fixed point theorems for mappings satisfying inwardness conditions. Trans. Am. Math. Soc. 215, 241–251 (1976) Charnes, A., Cooper, W.W.: Programming with linear fractional functionals. Nav. Res. Logist. 9, 181–186 (1962) Chen, P.C., Hansen, P., Jaumard, B.: On-line and off-line vertex enumeration by adjacency lists. Oper. Res. Lett. 10, 403–409 (1991) Chen, P.-C., Hansen, P., Jaumard, B., Tuy, H.: Weber’s problem with attraction and repulsion. J. Regional Sci. 32, 467–486 (1992) Chen, P.-C., Hansen, P., Jaumard, B., Tuy, H.: Solution of the multifacility Weber and conditional Weber problems by D.C. Programming, Cahier du GERAD G-92-35. Ecole Polytechnique, Montréal (1992b) Chen, P.C., Hansen, P., Jaumard, B., Tuy, H.: Solution of the multifacility Weber and conditional Weber problems by DC Programming. Oper. Res. 46, 548–562 (1998) Cheon, M.-S., Ahmed, S., Al-Khayyal, F.: A branch-reduce-cut algorithm for the global optimization of probabilistically constrained linear programs. Math. Program. Ser. B 108, 617–634 (2006) Cheney, E.W., Goldstein, A.A.: Newton’s method for convex programming and Tchebycheff approximation. Numer. Math. 1, 253–268 (1959) Cooper, L.: Location-allocation problems. Oper. Res. 11, 331–343 (1963) Dantzig, G.B.: Linear Programming and Extensions. Princeton University Press, Princeton, NJ (1963) Dines, L.L.: On the mapping of quadratic forms. Bull. Am. Math. Soc. 47, 494–498 (1941) Dixon, L.C.W., Szego, G.P. (eds.): Towards Global Optimization, vol. I. North-Holland, Amsterdam (1975) Dixon, L.C.W., Szego, G.P. (eds.): Towards Global Optimization, vol. II. North-Holland, Amsterdam (1978) Duffin, R.J., Peterson, E.L., Zener, C.: Geometric Programming—Theory and Application. Wiley, New York (1967) Eaves, B.C., Zangwill, W.I.: Generalized cutting plane algorithms. SIAM J. Control 9, 529–542 (1971) Ecker, J.G., Song, H.G.: Optimizing a linear function over an efficient set. J. Optim. Theory Appl. 83, 541–563 (1994) Ekeland, I.: On the variational principle. J. Math. Anal. Appl. 47, 324–353 (1974)

492

References

Ekeland, I., Temam, R.: Convex Analysis and Variational Problems. North-Holland, Amsterdam (1976) Ellaia, R.: Contribution à l’analyse et l’optimisation de différences de fonctions convexes, Thèse de trosième cycle, Université de Toulouse (1984) Falk, J.E.: Lagrange multipliers and nonconvex problems. SIAM J. Control 7, 534–545 (1969) Falk, J.E.: Conditions for global optimality in nonlinear programming, Oper. Res. 21, 337–340 (1973a) Falk, J.E.: A linear max-min problem. Math. Program. 5, 169–188 (1973b) Falk, J.E., Hoffman, K.L.: A successive underestimation method for concave minimization problems. Math. Oper. Res. 1, 251–259 (1976) Falk, J.E., Hoffman, K.L.: Concave minimization via collapsing polytopes. Oper. Res. 34, 919– 929 (1986) Falk, J.E., Palocsay, S.W.: Optimizing the sum of linear fractional functions. In: Floudas, C., Pardalos, P. (eds) Global Optimization, pp. 221–258. Princeton University Press, Princeton (1992) Falk, J.E., Soland, R.M.L: An algorithm for separable nonconvex programming problems. Manag. Sci. 15, 550–569 (1969) Fan, K.: A minimax inequality and applications. In: Shisha (ed.) Inequalities, vol. 3, pp. 103–113. Academic, New York (1972) Finsler, P.: Uber das Vorkommen definiter und semidefiniter Formen in Scharen quadratischer Formen. Comment. Math. Helv. 9, 188–192 (1937) Floudas, C.A., Aggarwal, A.: A decomposition strategy for global optimum search in the pooling problem. ORSA J. Comput. 2(3), 225–235 (1990) Floudas, C.A. : Deterministic Global Optimization, Theory, Methods and Applications. Kluwer, Dordrecht, The Netherlands (2000) Forgo, F.: The solution of a special quadratic problem (in Hungarian). Szigma 53–59 (1975) Foulds, L.R., Haugland, D., Jörnsten, K.: A bilinear approach to the pooling problem. Optimization 24, 165–180 (1992) Fredman, M.L., Tarjan, R.E.: Fibonacci heaps and their uses in improved network optimization algorithms. J. Assoc. Comput. Mach. 34, 596–615 (1984) Friesz, T.L., et al.: A simulated annealing approach to the network design problem with variational inequality constraints. Transp. Sci. 28, 18–26 (1992) Fujiwara, O., Khang, D.B.: A two-phase decomposition method for optimal design of looped water distribution networks. Water Resour. Res. 23, 977–982 (1990) Fukushima, M.: Equivalent differentiable optimization problems and descent methods for asymmetric variational inequality problems. Math. Program. 53, 99–110 (1992) Gabasov, R., Kirillova, F.M.: Linear Programming Methods. Part 3 (Special Problems). Minsk, Russian (1980) Gale, P.: An algorithm for exhaustive generation of building floor plans. Commun. ACM 24, 813– 825 (1981) Gantmacher F.R.: The Theory of Matrices, vol. 2. Chelsea Publishing Company, New York (1960) Ge, R.P., Qin, Y.F.: A class of filled functions for finding global minimizers of a function of several variables. J. Optim. Theory Appl. 54, 241–252 (1987) Glover, F.: Cut search methods in integer programming. Math. Program. 3, 86–100 (1972) Glover, F.: Convexity cuts and cut search. Oper. Res. 21, 123–124 (1973a) Glover, F.: Concave programming applied to a special class of 0–1 integer programs. Oper. Res. 21, 135–140 (1973b) Graham, R.L.: An efficient algorithm for determining the convex hull of a finite planar set. Inf. Proc. Lett. 1, 132–133 (1972) Groch, A., Vidigal, L., Director, S.: A new global optimization method for electronic circuit design. IEEE Trans. Circuits Syst. 32, 160–179 (1985) Guisewite, G.M., Pardalos, P.M.: A polynomial time solvable concave network flow problem. Network 23, 143–148 (1992)

References

493

Haims, M.J., Freeman, H.: A multistage solution of the template-layout problem. IEEE Trans. Syst. Sci. Cyber. 6, 145–151 (1970) Hamami, M., Jacobsen, S.E.: Exhaustive non-degenerate conical processes for concave minimization on convex polytopes. Math. Oper. Res. 13, 479–487 (1988) Hansen, P., Jaumard, B., Savard, G.: New branch and bound rules for linear bilinear programming. SIAM J. Sci. Stat. Comput. 13, 1194–1217 (1992) Hansen, P., Jaumard, B., Tuy, H.: Global optimization in location. In Dresner, Z. (ed.) Facility Location, pp. 43–68. Springer, NewYork (1995) Hartman, P.: On functions representable as a difference of convex functions. Pac. J. Math. 9, 707– 713 (1959) Hartman, P., Stampacchia, G.: On some nonlinear elliptic differential functional equations. Acta Math. 115, 153–188 (1966) Hillestad, R.J., Jacobsen, S.E.: Linear programs with an additional reverse convex constraint. Appl. Math. Optim. 6, 257–269 (1980b) Hiriart-Urruty, J.-B.: A short proof of the variational principle for approximate solutions of a minimization problem. Am. Math. Mon. 90, 206–207 (1983) Hiriart-Urruty, J.-B.: From convex optimization to nonconvex optimization, Part I: necessary and sufficient conditions for global optimality. In: Clarke, F.H., Demyanov, V.F., Giannessi, F. (eds.) Nonsmooth Optimization and Related Topics. Plenum, New York (1989a) Hiriart-Urruty, J.-B. Conditions nécessaires et suffisantes d’optimalité globale en optimisation de différences de deux fonctions convexes. C.R. Acad. Sci. Paris 309, Série I, 459–462 (1989b) Hoai, P.T., Tuy, H.: Monotonic Optimization for Sensor Cover Energy Problem. Institute of Mathematics, Hanoi (2016). Preprint Hoai-Phuong, N.T., Tuy, H.: A unified monotonic approach to generalized linear fractional programming. J. Glob. Optim. 26, 229–259 (2003) Hoang, H.G., Tuan, H.D., Apkarian, P.: A Lyapunov variable-free KYP Lemma for SISO continuous systems. IEEE Trans. Autom. Control 53, 2669–2673 (2008) Hoffman, K.L.: A method for globally minimizing concave functions over convex sets. Math. Program. 20, 22–32 (1981) Holmberg, K., Tuy, H.: A production-transportation problem with stochastic demands and concave production cost. Math. Program. 85, 157–179 (1995) Horst, R., Tuy, H.: Global Optimization (Deterministic Approaches), 3rd edn. Springer, Berlin/Heidelberg/New York (1996) Horst, R., Thoai, N.V., Benson, H.P.: Concave minimization via conical partitions and polyhedral outer approximation. Math. Program. 50, 259–274 (1991) Horst, R., Thoai, N.V., de Vries, J.: On finding new vertices and redundant constraints in cutting plane algorithms for global optimization. Oper. Res. Lett. 7, 85–90 (1988) Idrissi, H., Loridan, P., Michelot, C.: Approximation for location problems. J. Optim. Theory Appl. 56, 127–143 (1988) Jaumard, B., Meyer, C.: On the convergence of cone splitting algorithms with !-subdivisions. J. Optim. Theory Appl. 110, 119–144 (2001) Jeyakumar, V., Lee, G.M., Li, G.Y.: Alternative theorems for quadratic inequality systems and global quadratic optimization. SIAM J. Optim. 20, 983–1001 (2009) Jorswieck, E.A., Larson, E.G.: Monotonic optimization framework for the two-user MISO interference channel. IEEE Trans. Commun. 58, 2159–2168 (2010) Kakutani, S.: A generalization of Brouwer’s fixed point theorem. Duke Math. J. 8, 457–459 (1941) Kalantari, B., Rosen, J.B.: An algorithm for global minimization of linearly constrained concave quadratic problems. Math. Oper. Res. 12, 544–561 (1987) Karmarkar, N.: An interior-point approach for NP-complete problems. Contemp. Math. 114, 297– 308 (1990) Kelley, J.E.: The cutting plane method for solving convex programs. J. SIAM 8, 703–712 (1960) Klinz, B., Tuy, H.: Minimum concave-cost network flow problems with a single nonlinear arc cost. In: Pardalos, P., Du, D. (eds.) Network Optimization Problems, pp. 125–143. World Scientific, Singapore (1993)

494

References

Knaster, B., Kuratowski, C., Mazurkiewicz, S.: Ein Beweis des Fix-punktsatzes für n-dimensionale Simplexe. Fund. Math. 14, 132–137 (1929) Konno, H.: A cutting plane for solving bilinear programs. Math. Program. 11,14–27 (1976a) Konno, H.: Maximization of a convex function subject to linear constraints. Math. Program. 11, 117–127 (1976b) Konno, H., Abe, N.: Minimization of the sum of three linear fractions. J. Glob. Optim. 15, 419–432 (1999) Konno, H., Kuno, T.: Generalized linear and fractional programming. Ann. Oper. Res. 25, 147–162 (1990) Konno, H., Kuno, T.: Linear multiplicative programming. Math. Program. 56, 51–64 (1992) Konno, H., Kuno, T.: Multiplicative programming problems. In: Horst, R., Pardalos, P. (eds.) Handbook of Global Optimization, pp. 369–405. Kluwer, Dordrecht, The Netherlands (1995) Konno, H., Yajima, Y.: Minimizing and maximizing the product of linear fractional functions. In: Floudas, C., Pardalos, P. (eds.) Recent Advances in Global Optimization, pp. 259–273. Princeton University Press, Princeton (1992) Konno, H., Yamashita, Y.: Minimization od the sum and the product of several linear fractional functions. Nav. Res. Logist. 44, 473–497 (1997) Konno, H., Yajima, Y., Matsui, T.: Parametric simplex algorithms for solving a special class of nonconvex minimization problems. J. Glob. Optim. 1, 65–82 (1991) Konno, H., Kuno, T., Yajima, Y.: Global minimization of a generalized convex multiplicative function. J. Glob. Optim. 4, 47–62 (1994) Konno, H., Thach, P.T., Tuy, H.: Optimization on Low Rank Nonconvex Structures. Kluwer, Dordrecht, The Netherlands (1997) Kough, P.L.: The indefinite quadratic programming problem. Oper. Res. 27, 516–533 (1979) Kuno, T.: Globally determining a minimum-area rectangle enclosing the projection of a higher dimensional set. Oper. Res. Lett. 13, 295–305 (1993) Kuno, T., Ishihama, T.: A convergent conical algorithm with !-bisection for concave minimization. J. Glob. Optim. 61, 203–220 (2015) Landis, E.M.: On functions representable as the difference of two convex functions. Dokl. Akad. Nauk SSSR 80, 9–11 (1951) Lasserre, J.: Global optimization with polynomials and the problem of moments. SIAM J. Optim. 11, 796–817 (2001) Lasserre, J.: Semidefinite programming vs. LP relaxations for polynomial programming. Math. Oper. Res. 27, 347–360 (2002) Levy, A.L., Montalvo, A.: The tunneling algorithm for the global minimization of functions. SIAM J. Sci. Stat. Comput. 6, 15–29 (1988) Locatelli, M.: Finiteness of conical algorithm with !-subdivisions. Math. Program. 85, 593–616 (1999) Luc, L.T.: Reverse polyblock approximation for optimization over the weakly efficient set and the efficient set. Acta Math. Vietnam. 26, 65–80 (2001) Luo, Z.Q., Pang, J.S., Ralph, D.: Mathematical Programs with Equilibrium Constraints. Cambridge University Press, Cambridge, United Kingdom (1996) Maling, K., Nueller, S.H., Heller, W.R.: On finding optimal rectangle package plans. In: Proceedings 19th Design Automation Conference, pp. 663–670 (1982) Mangasarian, O.L.: Nonlinear Programming. MacGraw-Hill, New York (1969) Mangasarian, O.L.: Mathematical programming data mining. In: Data Mining and Knowledge Discovery, vol. 1, pp. 183–201 (1997) Maranas, C.D., Floudas, C.A.: A global optimization method for Weber’s problem with attraction and repulsion. In: Hager, W.W., Hearn, D.W., Pardalos, P.M. (eds.) Large Scale Optimization, pp. 259–293. Kluwer, Dordrecht (1994) Martinez, J.M.: Local minimizers of quadratic functions on Euclidean balls and spheres. SIAM J. Optim. 4, 159–176 (1994) Mastroeni, G.: Gap functions for equilibrium problems. J. Glob. Optim. 27, 411–426 (2003)

References

495

Matsui, T.: NP-hardness of linear multiplicative programming and related problems. J. Glob. Optim. 9, 113–119 (1996) Mayne, D.Q., Polak, E.: Outer approximation algorithms for non-differentiable optimization. J. Optim. Theory Appl. 42, 19–30 (1984) McCormick, G.P.: Attempts to calculate global solutions of problems that may have local minima. In: Lootsma, F. (ed.) Numerical Methods for Nonlinear Optimization, pp. 209–221. Academic, London/New York (1972) McCormick, G.P.: Nonlinear Programming: Theory, Algorithms and Applications. Wiley, New York (1982) Megiddo, N., Supowwit, K.J.: On the complexity of some common geometric location problems. SIAM J. Comput. 13, 182–196 (1984) Migdalas, A., Pardalos, P.M., Värbrand, P. (eds): Multilevel Optimization: Algorithms and Applications. Kluwer, Dordrecht (1998) Murty, K.G.: Linear Programming. Wiley, New York (1983) Murty, K.G., Kabadi, S.N.: Some NP-complete problems in quadratic and nonlinear programming. Math. Program. 39, 117–130 (1987) Muu, L.D.: A convergent algorithm for solving linear programs with an additional reverse convex constraint. Kybernetika 21, 428–435 (1985) Muu, L.D.: An algorithm for solving convex programs with an additional convex–concave constraint. Math. Program. 61, 75–87 (1993) Muu, L.D.: Models and methods for solving convex–concave programs. Habilitation thesis, Institute of Mathematics, Hanoi (1996) Muu, L.D., Oettli, W.: Method for minimizing a convex concave function over a convex set. J. Optim. Theory Appl. 70, 377–384 (1991) Muu, L.D., Tuyen, H.Q.: Bilinear programming approach to optimization over the efficient set of a vector affine fractional problem. Acta Math. Vietnam. 27, 119–140 (2002) Muu, L.D., Quoc, T.D., An, L.T. and Tao, P.D.: A new decomposition algorithm for globally solving mathematical programs with equilibrium constraints. Acta Math. Vietnam. 37, 201– 217 (2012) Nadler, S.B.: Multivalued contraction mappings. Pac. J. Math. 30, 475–488 (1969) Nash, J.: Equilibrium points in n-person game. Proc. Natl. Acad. Sci. U.S.A. 36, 48–49 (1950) Nghia, N.D., Hieu, N.D.: A Method for solving reverse convex programming problems. Acta Math. Vietnam. 11, 241–252 (1986) Ortega, J.M., Rheinboldt, W.C.: Iterative Solution of Nonlinear Equations in Several Variables. Academic, New York (1970) Outrata, J.V., Ko˘cvara, M., Zowe, J.: Nonsmooth Approach to Optimization Problems with Equilibrium Constraints. Kluwer, Cambridge (1998) Pardalos, P.M., Vavasis, S.A.: Quadratic programming with one negative eigenvalue is NP-hard. J. Glob. Optim. 21, 843–855 (1991) Philip, J. : Algorithms for the vector maximization problem. Math. Program. 2, 207–229 (1972) Phillips, A.T., Rosen, J.B.: A parallel algorithm for constrained concave quadratic global minimization. Math. Program. 42, 421–448 (1988) Pincus, M.: A close form solution of certain programming problems. Oper. Res. 16, 690–694 (1968) Plastria, F.: Continuous location problems. In: Drezner, Z., Hamacher, H.W. (eds.) Facility Location, pp. 225–262. Springer, Berlin (1995) Polik, I., Terlaky T.: A survey of the S-lemma. SIAM Rev. 49, 371–418 (2007) Prékopa, A.: Contributions to the theory of stochastic programming. Math. Program. 4, 202–221 (1973) Prékopa, A.: Stochastic Programming. Kluwer, Dordrecht (1995) Putinar, M.: Positive polynomials on compact semi-algebraic sets. Indiana Univ. Math. J. 42, 969– 984 (1993) Qian, L.P., Jun Zhang, Y.: S-MAPEL: monotonic optimization for non-convex joint power control and scheduling problems. IEEE Trans. Wirel. Commun. 9, 1708–1719 (2010)

496

References

Qian, L.P., Jun Zhang, Y., Huang, J.W.: MAPEL: achieving global optimality for a nonconvex wireless power control problem. IEEE Trans. Wirel. Commun. 8, 1553–1563 (2009) Qian, L.P., Jun Zhang, Y., Huang, J.: Monotonic optimization in communications and networking systems. Found. Trends Netw. 7, 1–75 (2012) Roberts, A.W., Varberg, D.E.: Convex Functions. Academic, New York (1973) Robinson, S.: Generalized equations and their solutions, part 1: basic theory. Math. Program. Stud. 10,128–141 (1979) Rockafellar, R.T.: Convex Analysis. Princeton University Press, Princeton (1970) Rosen, J.B.: Global minimization of a linearly constrained concave function by partition of feasible domain. Math. Oper. Res. 8, 215–230 (1983) Rosen, J.B., Pardalos, P.M.: Global minimization of large-scale constrained concave quadratic problems by separable programming. Math. Program. 34, 163–174 (1986) Rubinov, A., Tuy, H., Mays, H.: Algorithm for a monotonic global optimization problem. Optimization 49, 205–221 (2001) Saff, E.B., Kuijlaars, A.B.J.: Distributing many points on a sphere. Math. Intell. 10, 5–11 (1997) Sahni, S.: Computationally related problems. SIAM J. Comput. 3, 262–279 (1974) Schaible, S.: Fractional Programming. In: Horst ,R., Pardalos, P. (eds.) Handbook of Global Optimization, pp. 495–608. Kluwer, Dordrecht (1992) Sherali, H.D., Adams, W.P.: A Reformulation-Lineralization Technique (RLT) for Solving Discrete and Continuous Nonconvex Programming Problems. Kluwer, Dordrecht (1999) Sherali, H.D., Tuncbilek, C.H.: A global optimization algorithm for polynomial programming problems using a reformulation-linearization technique. J. Glob. Optim. 2, 101–112 (1992) Sherali, H.D., Tuncbilek, C.H.: A reformulation-convexification approach to solving nonconvex quadratic programming problems. J. Glob. Optim. 7, 1–31 (1995) Shi, J.S., Yamamoto, Y.: A D.C. approach to the largest empty sphere problem in higher dimension. In: Floudas, C., Pardalos, P. (eds.) State of the Art in Global Optimization, pp. 395–411. Kluwer, Dordrecht (1996) Shiau, T.-H.: Finding the largest lp -ball in a polyhedral set. Technical Summary Report No. 2788. University of Wisconsin, Mathematics Research Center, Madison (1984) Shimizu, K., Ishizuka, Y., Bard, J.F.: Nondifferentiable and Two-Level Mathematical Programming. Kluwer, Cambridge (1997) Shor, N.Z.: Quadratic Optimization Problems. Technicheskaya Kibernetica, 1, 128–139 (1987, Russian) Shor, N.Z.: Dual estimates in multiextremal problems. J. Glob. Optim. 2, 411–418 (1991) Shor, N.Z., Stetsenko, S.I.: Quadratic Extremal Problems and Nondifferentiable Optimization. Naukova Dumka, Kiev (1989, Russian) Sion, M.: On general minimax theorems. Pac. J. Math. 8, 171–176 (1958) Sniedovich, M.: C-programming and the minimization of pseudolinear and additive concave functions. Oper. Res. Lett. 5, 185–189 (1986) Soland, R.M.: Optimal facility location with concave costs. Oper. Res. 22, 373–382 (1974) Sperner, E.: Neuer Beweis für die Invarianz der Dimensionszahl und des Gebietes. Abh. Math. Seminar Hamburg VI , 265–272 (1928) Stancu-Minasian, I.M.: Fractional Programming, Theory, Methods and Applications. Kluwer, Dordrecht (1997) Strekalovski, A.C.: On the global extremum problem. Sov. Dokl. 292, 1062–1066 (1987, Russian) Strekalovski, A.C.: On problems of global extremum in nonconvex extremal problems. Izvestya Vuzov, ser. Matematika 8, 74–80 (1990, Russian) Strekalovski, A.C.: On search for global maximum of convex functions on a constraint set. J. Comput. Math. Math. Phys. 33, 349–363 (1993, Russian) Suzuki, A., Okabe, A.: Using Voronoi diagrams. In: Drezner, Z. (ed.) Facility Location, pp. 103– 118. Springer, New York (1995) Swarup, K.: Programming with indefinite quadratic function with linear constraints. Cahiers du Centre d’Etude de Recherche Operationnelle 8, 132–136 (1966a)

References

497

Swarup, K.: Indefinite quadratic programming. Cahiers du Centre d’Etude de Recherche Operationnelle 8, 217–222 (1966b) Swarup, K.: Quadratic programming. Cahiers du Centre d’Etude de Recherche Opérationnelle 8, 223–234 (1966c) Tanaka, T., Thach, P.T., Suzuki, S.: An optimal ordering policy for jointly replenished products by nonconvex minimization techniques. J. Oper. Res. Soc. Jpn. 34, 109–124 (1991) Tao, P.D.: Convex analysis approach to dc programming: theory, algorithm and applications. Acta Math. Vietnam. 22, 289–355 (1997) Tao, P.D.: Large-scale molecular optimization from distance matrices by a dc optimization approach. SIAM J. Optim. 14, 77–116 (2003) Tao, P.D., Hoai-An, L.T.: DC optimization algorithms for solving the trust region subproblem. SIAM J. Optim. 8, 476–505 (1998) Thach, P.T.: The design centering problem as a d.c. programming problem. Math. Program. 41, 229–248 (1988) Thach, P.T.: Quasiconjugates of functions, duality relationship between quasiconvex minimization under a reverse convex constraint and quasiconvex maximization under a convex constraint, and applications. J. Math. Anal. Appl. 159, 299–322 (1991) Thach, P.T.: New partitioning method for a class of nonconvex optimization problems. Math. Oper. Res. 17, 43–69 (1992) Thach, P.T.: D.c. sets, d.c. functions and nonlinear equations. Math. Program. 58, 415–428 (1993a) Thach, P.T.: Global optimality criteria and duality with zero gap in nonconvex optimization problems. SIAM J. Math. Anal. 24, 1537–1556 (1993b) Thach, P.T.: A nonconvex duality with zero gap and applications. SIAM J. Optim. 4, 44–64 (1994) Thach, P.T.: Symmetric duality for homogeneous multiple-objective functions. J. Optim Theory Appl. (2012). doi: 10.1007/s10957-011-9822-6 Thach, P.T., Konno, H.: On the degree and separability of nonconvexity and applications to optimization problems. Math. Program. 77, 23–47 (1996) Thach, P.T., Thang, T.V.: Conjugate duality for vector-maximization problems. J. Math. Anal. Appl. 376, 94–102 (2011) Thach, P.T., Thang, T.V.: Problems with resource allocation constraints and optimization over the efficient set. J. Glob. Optim. 58, 481–495 (2014) Thach, P.T., Tuy, H.: The relief indicator method for constrained global optimization. Naval Res. Logist. 37, 473–497 (1990) Thach, P.T., Konno, H., Yokota, D.: A dual approach to a minimization on the set of Pareto-optimal solutions. J. Optim. Theory Appl. 88, 689–701 (1996) Thakur, L.S.: Domain contraction in nonlinear programming: minimizing a quadraticconcave function over a polyhedron. Math. Oper. Res. 16, 390–407 (1991) Thieu, T.V.: Improvement and implementation of some algorithms for nonconvex optimization problems. In: Optimization—Fifth French German Conference, Castel Novel 1988. Lecture Notes in Mathematics, vol. 1405, pp. 159–170. Springer, Berlin (1989) Thieu, T.V.: A linear programming approach to solving a jointly contrained bilinear programming problem with special structure. In: Van tru va He thong, vol. 44, pp. 113–121. Institute of Mathematics, Hanoi (1992) Thieu, T.V., Tam, B.T., Ban, V.T.: An outer approximation method for globally minimizing a concave function over a compact convex set. Acta Math. Vietnam 8, 21–40 (1983) Thoai, N.V.: A modified version of Tuy’s method for solving d.c. programming problems. Optimization 19, 665–674 (1988) Thoai, N.V.: Convergence of duality bound method in partly convex programming. J. Glob. Optim. 22, 263–270 (2002)*** Thoai, N.V., Tuy, H.: Convergent algorithms for minimizing a concave function. Math. Oper. Res. 5, 556–566 (1980) Toland, J.F.: Duality in nonconvex optimization. J. Math. Anal. Appl. 66, 399–415 (1978) Toussaint, G.T.: Solving geometric problems with the rotating calipers. Proc. IEEE MELECON 83 (1978)

498

References

Tuan, H.D., Apkarian, P., Hosoe, S., Tuy, H.: D.C. optimization approach to robust control: feasibility problems. Int. J. Control 73(2), 89–104 (2000a) Tuan, H.D., Hosoe, S., Tuy, H.: D.C. optimization approach to robust controls: the optimal scaling value problems. IEEE Trans. Autom. Control 45(10), 1903–1909 (2000b) Tuy, H.: Concave programming under linear constraints. Sov. Math. Dokl. 5, 1437–1440 (1964) Tuy, H.: On a general minimax theorem. Sov. Math. Dokl. 15, 1689–1693 (1974) Tuy, H.: A note on the equivalence between Walras’ excess demand theorem and Brouwer’s fixed point theorem’. In: Los, J., Los, M. (eds.) Equilibria: How and Why ?, pp. 61–64. NorthHolland, Amsterdam (1976) Tuy, H.: On outer approximation methods for solving concave minimization problems. Acta Math. Vietnam. 8, 3–34 (1983) Tuy, H.: Global minimization of a difference of two convex functions. In: Lecture Notes in Economics and Mathematical Systems, vol. 226, pp. 98–108. Springer, Berlin (1984) Tuy, H.: Convex Programs with an additional reverse convex constraint. J. Optim. Theory Appl. 52, 463–486 (1987a) Tuy, H.: Global minimization of a difference of two convex functions. Math. Program. Study 30, 150–182 (1987b) Tuy, H.: Convex Analysis and Global Optimization. Kluwer, Dordrecht (1998) Tuy, H.: On polyhedral annexation method for concave minimization. In: Leifman, L.J., Rosen, J.B. (eds.) Functional Analysis, Optimization and Mathematical Economics, pp. 248–260. Oxford University Press, Oxford (1990) Tuy, H.: Normal conical algorithm for concave minimization over polytopes. Math. Program. 51, 229–245 (1991) Tuy, H.: On nonconvex optimization problems with separated nonconvex variables. J. Glob. Optim. 2, 133–144 (1992) Tuy, H.: D.C. optimization: theory, methods and algorithms. In: Horst, R., Pardalos, P. (eds.) In: Handbook of Global Optimization, pp. 149–216. Kluwer, Dordrecht (1995a) Tuy, H.: Canonical d.c. programming: outer approximation methods revisited. Oper. Res. Lett. 18, 99–106 (1995b) Tuy, H.: A general d.c. approach to location problems. In: Floudas, C., Pardalos, P. (eds.) State of the Art in Global Optimization: Computational Methods and Applications, pp. 413–432. Kluwer, Dordrecht (1996) Tuy, H.: Monotonic Optimization: Problems and Solution Approaches. SIAM J. Optim. 11, 464– 494 (2000a) Tuy, H.: Strong polynomial time solvability of a minimum concave cost network flow problem. Acta Math. Vietnam. 25, 209–217 (2000b) Tuy, H.: On global optimality conditions and cutting plane algorithms. J. Optim. Theory Appl. 118, 201–216 (2003) Tuy, H.: On a decomposition method for nonconvex global optimization. Optim. Lett. 1, 245–258 (2007a) Tuy, H.: On dual bound methods for nonconvex global optimization. J. Glob. Optim. 37, 321–323 (2007b) Tuy, H.: D(C)-optimization and robust global optimization. J. Glob. Optim. 47, 485–501 (2010) Tuy, H., Ghannadan, S.: A new branch and bound method for bilevel linear programs. In: Migdalas, A., Pardalos, P.M., Värbrand, P. (eds.) Multilevel Optimization: Algorithms and Applications, pp. 231–249. Kluwer, Dordrecht (1998) Tuy, H., Hoai-Phuong, N.T.: Optimization under composite monotonic constraints and constrained optimization over the efficient set. In: Liberti, L., Maculan, N. (eds.) Global Optimization: From Theory to Implementation, pp. 3–32. Springer, Berlin/New York (2006) Tuy, H., Hoai-Phuong, N.T.: A robust algorithm for quadratic optimization under quadratic constraints. J. Glob. Optim. 37, 557–569 (2007) Tuy, H., Horst, R.: Convergence and restart in branch and bound algorithms for global optimization. Application to concave minimization and d.c. optimization problems. Math. Program. 42, 161– 184 (1988)

References

499

Tuy, H., Luc, L.T.: A new approach to optimization under monotonic constraints. J. Glob. Optim. 18, 1–15 (2000) Tuy, H., Oettli, W.: On necessary and sufficient conditions for global optimization. Matemáticas Aplicadas 15, 39–41 (1994) Tuy, H., Tam, B.T.: An efficient solution method for rank two quasiconcave minimization problems. Optimization 24, 3–56 (1992) Tuy, H., Tam, B.T.: Polyhedral annexation vs outer approximation methods for decomposition of monotonic quasiconcave minimization. Acta Math. Vietnam. 20, 99–114 (1995) Tuy, H., Tuan, H.D.: Generalized S-Lemma and strong duality in nonconvex quadratic programming. J. Glob. Optim. 56, 1045–1072 (2013) Tuy, H., Migdalas, A., Väbrand, P.: A global optimization approach for the linear two-level program. J. Glob. Optim. 3, 1–23 (1992) Tuy, H., Dan, N.D., Ghannadan, S.: Strongly polynomial time algorithm for certain concave minimization problems on networks. Oper. Res. Lett. 14, 99–109 (1993a) Tuy, H., Ghannadan, S., Migdalas, A., Värbrand, P.: Strongly polynomial algorithm for a production-transportation problem with concave production cost. Optimization 27, 205–227 (1993b) Tuy, H., Ghannadan, S., Migdalas, A., Värbrand, P.: Strongly polynomial algorithm for two special minimum concave cost network flow problems. Optimization 32, 23–44 (1994a) Tuy, H., Migdalas, A., Väbrand, P.: A quasiconcave minimization method for solving linear twolevel programs. J. Glob. Optim. 4, 243–263 (1994b) Tuy, H., Al-Khayyal, F., Zhou, F.: A d.c. optimization method for single facility location problems. J. Glob. Optim. 7, 209–227 (1995a) Tuy, H., Ghannadan, S., Migdalas, A., Värbrand, P.: The minimum concave cost flow problem with fixed numbers of nonlinear arc costs and sources. J. Glob. Optim. 6, 135–151 (1995b) Tuy, H., Al-Khayyal, F., Zhou, P.C.: DC optimization method for single facility location problems. J. Glob. Optim. 7, 209–227 (1995c) Tuy, H., Ghannadan, S., Migdalas, A., Värbrand, P.: Strongly polynomial algorithm for a concave production-transportation problem with a fixed number of nonlinear variables. Math. Program. 72, 229–258 (1996a) Tuy, H., Thach, P.T., Konno, H., Yokota, D.: Dual approach to optimization on the set of paretooptimal solutions. J. Optim. Theory Appl. 88, 689–707 (1996b) Tuy, H., Bagirov, Rubinov, A.M.: Clustering via D.C. optimization. In: Hadjisavvas, N., Pardalos, P.M. (eds.) Advances in Convex Analysis and Optimization, pp. 221–235. Kluwer, Dordrecht (2001) Tuy, H., Nghia, N.D., Vinh, L.S.: A discrete location problem. Acta Math. Vietnam. 28, 185–199 (2003) Tuy, H., Al-Khayyal, F., Thach, P.T.: Monotonic optimization: branch and cut methods. In: Audet, C., Hansen, P., Savard, G. (eds.) Essays and Survey on Global Optimization, pp. 39–78. GERAD/Springer, New York (2005) Tuy, H., Minoux, M., Hoai-Puong, N.T.: Discrete monotonic optimization with application to a discrete location problem. SIAM J. Optim. 17, 78–97 (2006) Utschick, W., Brehmer, J.: Monotonic optimization framework for coordinated beamforming in multi-cell networks. IEEE Trans. Signal Process. 60, 2496–2507 (2012) Vandenberge, L., Boyd, S.: Semidefinite programming. SIAM Rev. 38, 49–95 (1996) Van Voorhis, T.: The quadratically constrained quadratic programs. Ph.D. thesis, Georgia Institute of Technology, Atlanta (1997) Veinott, A.F.: The supporting hyperplane method for unimodal programming. Oper. Res. 15, 147– 152 (1967) Vidigal, L., Director, S.: A design centering algorithm for nonconvex regions of acceptability. IEEE Trans. Comput.-Aided Des. Integr. Circuits Syst. 14, 13–24 (1982) Vincent, J.F.: Structural Biomaterials. Princeton University Press, Princeton (1990) Vicentee, L., Savard, G., Judice, J.: Descent approaches for quadratic bilevel programming. J. Optim. Theory Appl. 81, 379–388 (1994)

500

References

von Neumann, J.: Zur Theorie der Gesellschaftsspiele. Math. Ann. 100, 293–320 (1928) White, D.J., Annandalingam, G.: A penalty function approach for solving bilevel linear programs. J. Glob. Optim. 3, 397–419 (1993) Wolberg, W.H., Street, W.N., Mangasarian, O.L.: Machine learning techniques to diagnose breast cancer from fine-needle aspirates. Cancer Lett. 77, 163–171 (1994) Wolberg, W.H., Street, W.N., Mangasarian, O.L.: Image analysis and machine learning applied to breast cancer diagnosis and prognosis. Anal. Quant. Cytol. Histol. 17(2), 77–87 (1995) Wolberg, W.H., Street, W.N., Heisey, D.M., Mangasarian, O.L.: Computerized breast cancer diagnosis and prognosis from fine-needle aspirates. Arch. Surg. 130, 511–516 (1995a) Wolberg, W.H., Street, W.N., Heisey, D.M., Mangasarian, O.L.: Computer-derived nuclear features distinguish malignant from benign beast cytology. Hum. Pathol. 26, 792–796 (1995b) Yakubovich, V.A.: The S-procedure in nonlinear control theory. Vestnik Leningrad Univ. 4, 73–93 (1977) Ye, Y.: On affine scaling algorithms for nonconvex quadratic programming. Math. Program. 56, 285–300 (1992) Yick, J., Mukherjee, B., Ghosal, D.: Wireless sensor network survey. Comput. Network 52, 2292– 2320 (2008) Zadeh, N.: On building minimum cost communication networks over time. Networks 4, 19–34 (1974) Zhang, Y.J., Qian, L.P., Huang, J.W.: Monotonic optimization communication and networking systems. Found. Trends Commun. Inf. Theory 7(1), 1–75 (2012) Zheng, Q.: Optimality conditions for global optimization (I) and (II). Acta Math. Appl. Sinica, English Ser. 2, 66–78, 118–134 (1985) Zhu, D.L., Marcotte, P.: An extended descent framework for variational inequalities. J. Optim. Theory Appl. 80, 349–366 (1994) Zukhovitzki, S.I., Polyak, R.A., Primak, M.E.: On a class of concave programming problems. Ekonomika i Matematitcheskie Methody 4, 431–443 (1968, Russian) Zwart, P.B.: Global maximization of a convex function with linear inequality constraints. Oper. Res. 22, 602–609 (1974)

Index

Symbols K-monotonic functions, 283 -extension, 171, 356 -valid cut, 170 !-bisection, 179 !-subdivision, 176, 184 "-approximate optimal solution, 193 "-subgradient, 65 p-center problem, 220 r-simplex, 30

A adaptive BB algorithm, 161 adaptive subdivision, 157 affine hull, 4 manifold, 3 set, 3 affine function, 39 affine variational inequality, 484 affinely independent, 5 apex of a cone, 7 approximate minimum, 87 approximate optimal solution, 204 approximate subdifferential, 65

B barycentric coordinates, 12 basic concave programming (BCP) problem, 167 basic outer approximation theorem, 163 BB/OA algorithm for (CO/SNP), 279

bilevel programming, 455 Bilinear Matrix Inequalities Problem, 248 bilinear matrix inequalities problem, 248 bilinear pprogram, 358 bisection of ratio ˛, 152 of a cone, 155 bisection of ratio ˛ of a rectangle, 157 Brouwer fixed point theorem, 97, 98

C canonical cone, 172, 173 canonical dc programming, 198 canonical dc programming problem, 135 Caratheodory core, 14 Caratheodory’s Theorem, 12 Caristi fixed point theorem, 88 closed convex function, 50 set-valued map, 94 closure of a convex function, 50 of a set, 8 clustering, 222 coercive function, 87 combined OA/BB conical algorithm for (GCP), 187 combined OA/CS algorithm for (CDC), 203 competitive location, 148 complementary convex, 104, 135 complementary convex set, 116 complementary convex structure, 103 concave function, 39 concave majorant, 118

© Springer International Publishing AG 2016 H. Tuy, Convex Analysis and Global Optimization, Springer Optimization and Its Applications 110, DOI 10.1007/978-3-319-31484-6

501

502 concave minimization, 133, 167 concave production–transportation problem, 319 concave programming, 167 concavity cut, 170 cone, 6, 155 cone generated by a set, 7 conical BB algorithm for (BCP), 180 conical BB algorithm for LRC, 196 conical subdivision, 155 conjugate, 67 conormal hull, 393 conormal set, 393 consistent bounding, 159 constancy space, 49 continuity of convex function, 42 continuous p-center problem, 221 continuous optimization problems, 271 contraction, 89 convex bilevel program, 456 convex combination, 6 convex cone, 7 convex envelope, 46 convex envelope of a concave function, 46 convex function, 39 convex hull, 12 convex hull of a function, 46 convex inequality, 52 convex maximization, 133 convex minorant, 118 convex minorant of a dc function, 117 convex set, 6 copolyblock, 394 CS algortihm for (BCP), 176 CS restart algorithm for (LRC), 194 CS restart algorithm for LRC, 194

D DC feasibility problem, 140 DC feasibility problem, 229 dc function, 103 dc inclusion, 139 dc representation, 107 of basic composite functions, 107 dc set, 113 dc structure, 103 decomposition methods by projection, 287 decreasing function, 391 design centering problem, 213 dimension of a convex set, 9 of an affine set, 3 dimension of a convex function, 50

Index direction of recession, 10 directional derivative, 61 directions in which f is affine, 49 discrete location, 429 distance function, 40 distance geometry problems, 220 dm function, 395 dual DC feasibility problem, 230 duality bound, 236 duality gap, 237 dualization through polyhedral annexation, 292

E edge of a polyhdron, 29 edge property, 191 effcient point, 468 effective domain, 39 Ekeland variational principle, 87 epigraph, 39 equilibrium constraints, 453 equilibrium point, 94 equilibrium problem, 479 essential optimal solution, 206, 406 exact bisection, 152 exhaustive BB algorithm, 161 exhaustive filter, 152 exhaustive subdivision process, 154 extreme direction, 21 extreme point, 21

F face, 28 face of a convex set, 21 facet, 28 facet of a simplex, 91 factorable function, 107 Fejer Theorem, 37 Fekete points, 221 filter, 152 First Separation Theorem, 17 free disposal, 255 full dimension, 9

G gap function, 481 gauge, 9 General Concave Minimization Problem, 184 general equilibrium theorem, 94 general nonconvex quadratic problem, 363 generalized convex multiplicative programming problems, 308

Index generalized convex optimizatiion, 78 generalized equation, 100 generalized inequality, 76 generalized KKT condition, 256 generalized linear program, 144 geometric programming, 143 global "-optimal solution, 141, 158 global maximizer, 69 global minimizer, 69, 128 global optimal solution, 127, 128 global phase, 141

H halfspaces, 6 Hausdorff distance, 89 hyperplane, plane, 4

I increasing function, 391 indefinite quadratic minimization, 361 indicator function, 40 infimal convolution, 46 inner approximation, 229 interior of a set, 8

K K-convex function, 77 K-monotonic function, 283 Kakutani fixed point theorem, 98 KKM Lemma, 91, 92 Ky Fan inequality, 97

L l.s.c. hull, 50 Lagrange dual problem, 377 Lagrangian, 377 Lagrangian bound, 244 lagrangian bound, 236 line segment, 5 lineality of a convex function, 49 lineality of a convex set, 23 lineality space, 23 lineality space of a convex function, 49 linear bilevel program, 456 linear matrix inequality, 146 linear multiplicative program, 299 linear multiplicative programming, 298 linear program under LMI constraints, 387 Lipschitz continuity, 51 LMI (linear matrix inequality), 83

503 local maximizer, 69 local minimizer, 69, 128 local optimal solution, 127, 128 local phase, 141 locally convexifiable, 369 locally dc, 105 locally increasing, 396 lower -valid cut, 400 lower S-adjustment, 414 lower level set, 47 lower linearizable, 365 lower semi-continuous (l.s.c.), 48

M master problem, 408 Mathematical Programming problem with Equilibrium Constraints, 453 maximin location, 219 minimax equality, 72 theorem, 72 minimum concave cost network flow problem, 319 minimum of a dc function, 120 Minkowski functional, 9 Minkowski’s Theorem, 24 modulus of strong convexity, 66 moment matrices, 440 Monotonic Optimization, 391 monotonic reverse convex problem, 313 multidimensional scaling, 220 multiextremal, 127

N Nadler contraction theorem, 90 Nash equilibrium, 98 Nash equilibrium theorem, 99 network constraints, 319 non-cooperative game, 98 noncanonical dc problems, 268 nonconvex quadratic programming, 337 nonconvexity rank, 283 nondegenerate vertex, 30 nonregular problems, 193 norm lp -norm, 7 equivalent norms, 8 Euclidean norm, 8 Tchebycheff norm, 8 normal cone, 20 inward normal cone, 20 outward normal cone, 20

504

Index

normal hull, 393 normal set, 392 normal to a convex set, 20 normal vector, 4

quasiconcave, 72 quasiconcave function, 47 quasiconjugate function, 250 quasiconvex function, 47

O on-line vertex enumeration, 185 optimal visible point algorithm, 269 optimal visible point problem, 269 outer approximation, 161

R radial partition(subdivision), 152 radius of a compact set, 37 rank of a convex function, 50 rank of a convex set, 26 rank of a quadratic form, 338 recession cone, 10 recession cone of a convex function, 49 rectangular subdivision, 156 redundant inequality, 29 reformulation-linearization, 372 regular (satble) problem, 191 regularity asumption, 192 relative boundary, 8 relative interior, 8 relief indicator, 272, 275 OA algorithm, 276 relief indicator method, 275 Representation of polyhedrons, 32 representation theorem, 23 reverse convex, 52 reverse convex constraint, 134 reverse convex inequality, 104 reverse convex programming, 189 reverse convex set, 116 reverse normal set, 393 reverse polyblock, 394 robust approach, 204 constraint set, 272 feasible set, 140, 406

P Pareto point, 393 partition via .v; j/, 156 partly convex problems, 236 piecewise convex function, 43 pointed cone, 7 Polar of a polyhedron, 31 polar, 25 polyblock, 394 polyblock algorithm, 403 polyhedral annexation, 229 polyhedral annexation method, 233 polyhedral annexation algorithm for BCP, 233 polyhedral convex set, 27 polyhedron, 27 Polynomial Optimization, 435 polytope, 30 pooling and blending problem, 131, 247 positively homogeneous, 59 posynomial, 391 preprocessing, 142 problem PTP(2), 328 procedure DC, 175 procedure DC*, 229 projection, 20 proper function, 39 proper partition, 152 proper vertex, 394 prototype BB algorithm, 158 prototype BB decomposition algorithm, 237 prototype OA procedure, 162

Q quadratic function, 42 quais-supgradient, 254 quasi-subdifferential, 253 quasi-subgradient, 253 quasi-supdifferential, 254

S S-shaped functions, 112 saddle point, 75 SDP (semidefinite program), 84 Second Separation Theorem, 18 semi-exhaustiveness, 153 semidefinite program, 147 sensor cover energy problem, 426 Separable Basic Concave Program, 182 separator, 115 set-valued map, 88 Shapley-Folkman’s Theorem, 13 signomial programming, 451 simple OA for CDC, 201

Index simple outer approximation, 185 simplex, 12 simplicial subdivision, 91, 151 SIT algorithm for dc optimization, 208 SIT scheme, 208 special concave program CPL(1), 326 Sperner Lemma, 92 Stackelberg game, 454 stochastic transportation-location problem, 322 strictly convex, 69 strictly convex function, 114 strongly convex, 66 strongly separated, 18 subdifferentiable, 58 subdifferential, 58 subgradient, 58 successive incumbent transcending, 207 successive incumbent transcending scheme, 208 successive partitioning, 151 support function, 26, 40 supporting halfspace, 19 supporting hyperplane, 19 T tight convex minorant, 366 Toland theorem, 121 transcending stationarity, 139

505 trust region subproblem, 353 two phase scheme, 141

U upper -valid cut, 400 upper S-adjustment, 414 upper boundary point, 475 upper extreme point, 475 upper level sets, 47 upper semi-continuous (u.s.c.), 48 upper semi-continuous set-valued map, 100

V variational inequality, 99 vertex adjacent, 30 vertex of a cone, 7 vertex of a polyhedron, 29 visibility assumption, 268 Voronoi cell, 221

W weakly convex, 123 weakly efficient point, 468 Weber problem, 218 Weber problem with attraction and repulsion, 147