Recent Developments in Mathematical Programming [1 ed.] 9782881248009, 9781138413184, 9780429333439, 2881248004

This work is concerned with theoretical developments in the area of mathematical programming, development of new algorithms…


English Pages 469 [470] Year 1991


Table of contents :
Cover
Half Title
Title Page
Copyright Page
Table of Contents
Preface
List of Contributors
Part 1: Review Articles in Mathematical Programming
1 Recent Advances in Global Optimization: A Tutorial Survey
2 Two-level Resource Control Preemptive Hierarchical Linear Programming Problem: A Review
3 Some Recent Developments in Infinite Programming
4 Recent Developments in Mathematical Programming Software for the Microcomputer
5 C-Programming: Its Theory and Applications
Part 2: Multicriteria Optimization
6 Aspects of Multicriteria Optimization
7 A Duality Theorem for a Fractional Multiobjective Problem with Square Root Terms
8 Efficiency and Duality Theory for a Class of Differentiable Multiobjective Programming Problems with Invexity
9 Efficiency and Duality Theory for a Class of Nondifferentiable Multiobjective Programs
10 Symmetric Duality for Nonlinear Multiobjective Programming
Part 3: System Optimization and Heuristics
11 Data Envelopment Analysis: A Comparative Tool
12 A Study of Protean Systems — Some Heuristic Strategies for Redundancy Optimization
Part 4: Interior-Point Approach and Quadratic Programming
13 Nearest Points in Nonsimplicial Cones and LCP's with PSD Symmetric Matrices
14 A Study on Monotropic Piecewise Quadratic Programming
15 Interior-Point Algorithms for Quadratic Programming
Part 5: Computational Efficiency, Methods and Software
16 Problems in Protean Systems — Computer Programs for Some Solution Methods
17 Alternative Methods for Representing the Inverse of Linear Programming Basis Matrices
18 Toward Parallel Computing on Personal Computers in Mathematical Programming
Part 6: Mathematical Programming Applications
19 A Mixed Integer Model of Petroleum Fields with Moving Platforms
20 Optimal Stochastic Hydrothermal Scheduling Using Nonlinear Programming Technique
21 Nonlinear Programming Applied to the Dynamic Rescheduling of Trains
22 Network Routing Applications in National and Regional Planning
23 An Application of the Lagrangean Relaxation Based Approach to the Bulk Commodity Production Distribution Problem
Part 7: Algorithms, Games and Paradox
24 Flow Truncation in a Four Axial Sums' Transportation Problem
25 Continuous Linear Programs and Continuous Matrix Game Equivalence
26 A Short Note on a Path-following Interpretation of the Primal-dual Algorithm
27 On a Paradox in Linear Fractional Transportation Problems
28 Nash Equilibrium Points of Stochastic N-uels
Appendices
Subject Index
Author Index
List of Referees


Recent Developments in Mathematical Programming

Edited by

Santosh Kumar, Department of Mathematics, Royal Melbourne Institute of Technology, Melbourne, Australia

on behalf of the

Australian Society for Operations Research

Boca Raton London New York

CRC Press is an imprint of the Taylor & Francis Group, an informa business

CRC Press, Taylor & Francis Group, 6000 Broken Sound Parkway NW, Suite 300, Boca Raton, FL 33487-2742

First issued in hardback 2019

© 1991 by Taylor & Francis Group, LLC
CRC Press is an imprint of Taylor & Francis Group, an Informa business

No claim to original U.S. Government works

ISBN-13: 978-2-88124-800-9 (pbk)
ISBN-13: 978-1-138-41318-4 (hbk)
DOI: 10.1201/9780429333439

This book contains information obtained from authentic and highly regarded sources. Reasonable efforts have been made to publish reliable data and information, but the author and publisher cannot assume responsibility for the validity of all materials or the consequences of their use. The authors and publishers have attempted to trace the copyright holders of all material reproduced in this publication and apologize to copyright holders if permission to publish in this form has not been obtained. If any copyright material has not been acknowledged please write and let us know so we may rectify in any future reprint.

Except as permitted under U.S. Copyright Law, no part of this book may be reprinted, reproduced, transmitted, or utilized in any form by any electronic, mechanical, or other means, now known or hereafter invented, including photocopying, microfilming, and recording, or in any information storage or retrieval system, without written permission from the publishers.

For permission to photocopy or use material electronically from this work, please access www.copyright.com (http://www.copyright.com) or contact the Copyright Clearance Center, Inc. (CCC), 222 Rosewood Drive, Danvers, MA 01923, 978-750-8400. CCC is a not-for-profit organization that provides licenses and registration for a variety of users. For organizations that have been granted a photocopy license by the CCC, a separate system of payment has been arranged.

Trademark Notice: Product or corporate names may be trademarks or registered trademarks, and are used only for identification and explanation without intent to infringe.

Library of Congress Cataloging-in-Publication Data

Recent developments in mathematical programming / edited by Santosh Kumar on behalf of the Australian Society for Operations Research.
p. cm.
Includes bibliographical references and indexes.
ISBN 2-88124-800-4 (softcover) 2-88124-820-9 (hardcover)
1. Programming (Mathematics) I. Kumar, Santosh, 1936- . II. Australian Society for Operations Research.
QA402.5.R43 1991
519.7-dc20
91-14324 CIP

Visit the Taylor & Francis Web site at http://www.taylorandfrancis.com and the CRC Press Web site at http://www.crcpress.com

CONTENTS

Preface ix
List of Contributors xi

PART 1: REVIEW ARTICLES IN MATHEMATICAL PROGRAMMING
1 Recent Advances in Global Optimization: A Tutorial Survey
  Reiner Horst 1
2 Two-level Resource Control Preemptive Hierarchical Linear Programming Problem: A Review
  Subhash C. Narula and Adiele D. Nwosu 29
3 Some Recent Developments in Infinite Programming
  A. B. Philpott 45
4 Recent Developments in Mathematical Programming Software for the Microcomputer
  R. Sharda and D. M. Steiger 61
5 C-Programming: Its Theory and Applications
  Moshe Sniedovich 79

PART 2: MULTICRITERIA OPTIMIZATION
6 Aspects of Multicriteria Optimization
  B. D. Craven 93
7 A Duality Theorem for a Fractional Multiobjective Problem with Square Root Terms
  R. R. Egudo 101
8 Efficiency and Duality Theory for a Class of Differentiable Multiobjective Programming Problems with Invexity
  Zulfiqar Ali Khan 115
9 Efficiency and Duality Theory for a Class of Nondifferentiable Multiobjective Programs
  Zulfiqar Ali Khan 125
10 Symmetric Duality for Nonlinear Multiobjective Programming
  B. Mond and T. Weir 137

PART 3: SYSTEM OPTIMIZATION AND HEURISTICS
11 Data Envelopment Analysis: A Comparative Tool
  M. J. Foster 155
12 A Study of Protean Systems — Some Heuristic Strategies for Redundancy Optimization
  Radha Kalyan and Santosh Kumar 181

PART 4: INTERIOR-POINT APPROACH AND QUADRATIC PROGRAMMING
13 Nearest Points in Nonsimplicial Cones and LCP's with PSD Symmetric Matrices
  K. S. Al-Sultan and K. G. Murty 199
14 A Study on Monotropic Piecewise Quadratic Programming
  Jie Sun 213
15 Interior-Point Algorithms for Quadratic Programming
  Yinyu Ye 237

PART 5: COMPUTATIONAL EFFICIENCY, METHODS AND SOFTWARE
16 Problems in Protean Systems — Computer Programs for Some Solution Methods
  Radha Kalyan and Santosh Kumar 263
17 Alternative Methods for Representing the Inverse of Linear Programming Basis Matrices
  Gautam Mitra and Mehrdad Tamiz 273
18 Toward Parallel Computing on Personal Computers in Mathematical Programming
  Moshe Sniedovich 303

PART 6: MATHEMATICAL PROGRAMMING APPLICATIONS
19 A Mixed Integer Model of Petroleum Fields with Moving Platforms
  Dag Haugland, Kurt Jornsten and Ebrahim Shayan 323
20 Optimal Stochastic Hydrothermal Scheduling Using Nonlinear Programming Technique
  D. P. Kothari 335
21 Nonlinear Programming Applied to the Dynamic Rescheduling of Trains
  R. G. J. Mills and S. E. Perkins 345
22 Network Routing Applications in National and Regional Planning
  J. P. Saksena 359
23 An Application of the Lagrangean Relaxation Based Approach to the Bulk Commodity Production Distribution Problem
  R. R. K. Sharma 369

PART 7: ALGORITHMS, GAMES AND PARADOX
24 Flow Truncation in a Four Axial Sums' Transportation Problem
  L. Bandopadhyaya and M. C. Puri 383
25 Continuous Linear Programs and Continuous Matrix Game Equivalence
  S. Chandra, B. Mond and M. V. Durga Prasad 397
26 A Short Note on a Path-following Interpretation of the Primal-dual Algorithm
  Patrick Tobin and Santosh Kumar 407
27 On a Paradox in Linear Fractional Transportation Problems
  Vanita Verma and M. C. Puri 413
28 Nash Equilibrium Points of Stochastic N-uels
  P. Zeephongsekul 425

APPENDICES
Subject Index 453
Author Index 456
List of Referees 457

PREFACE

Although the genesis of mathematical programming can be traced back to the work of L. V. Kantorovich (1,2), a major advance in the field occurred in 1949 when George Dantzig (3) developed the simplex method for solving the linear programming problem. Since that time the subject has grown in various ways: theory, computing efficiency and applications (4,5,6). Mathematical programming has provided a charge which seems to have an endless energy in it. Harvesting this energy will depend on the work of the mathematical programming community. The ever increasing theoretical and computational developments in mathematical programming already have applications in science, engineering, and business activities in the private and public sectors.

The initial idea for this publication was supported by the Australian Society for Operations Research. The original aim was to publish papers in the ASOR Bulletin. However, significant contributions from international researchers justified a separate publication. Further, the involvement of Gordon and Breach Science Publishers has made it available in the international market place, which would not have been possible under the original plan. The book is divided into the following sections:

• Review Articles in Mathematical Programming
• Multicriteria Optimization
• System Optimization and Heuristics
• Interior-Point Approach and Quadratic Programming
• Computational Efficiency, Methods and Software
• Mathematical Programming Applications
• Algorithms, Games and Paradox

It is hoped that this book will be of interest to researchers and practitioners, and that it will provide useful information on recent developments in this rapidly growing field. In editing this volume I am indebted to a large number of referees whose critical comments resulted in a better presentation of the papers. A list of the names and affiliations of the referees is included in the appendices. My thanks


are due to all authors and referees for their contributions. I am grateful to the Australian Society for Operations Research, Melbourne Chapter Committee as well as the National Council for their encouragement and for providing necessary financial support, in particular Dr Charles Newton (President), Dr Bill Haebich (Chapter Chairman) and Ms Kaye Marion (Treasurer). I am also grateful to Dr Raj Vasudeva and Dr Howard J. Connell, Mathematics Department, Royal Melbourne Institute of Technology, for providing facilities to complete this volume. My thanks are due to Dr Moshe Sniedovich, Department of Mathematics, University of Melbourne, for professional support and to Eva, Joi, Jill and Pam from the Mathematics Department, Royal Melbourne Institute of Technology, for the support they have given me in completing this task. Finally I thank my wife Bala and son Sammy for their encouragement.

I will feel suitably gratified if this book is useful to workers in the field and if it gives them encouragement to pursue their research further. In that event, I am sure that the ASOR would consider their involvement well worthwhile.

Santosh Kumar

References

1. Kantorovich, L. V. (1939) Mathematical methods for organizing and planning of production, Leningrad State University.
2. Romanovsky, J. V. (1989) L. V. Kantorovich's work in mathematical programming, in Mathematical Programming: Recent Developments and Applications, edited by M. Iri and K. Tanabe, Kluwer Academic Publishers, pp 365-382.
3. Dantzig, G. B. (1985) Reminiscences about the origin of linear programming, in ASOR 7th National Conference Proceedings, edited by E. A. Cousins and C. E. M. Pearce, pp A12-A25.
4. Iri, M. and Tanabe, K. (Editors) (1989) Mathematical Programming: Recent Developments and Applications, Kluwer Academic Publishers.
5. Murphy, F. H. (Editor) (1990) Special issue on practice of mathematical programming, Interfaces, vol 20, no 4, p 182.
6. Schittkowski, Klaus (Editor) (1985) Computational Mathematical Programming, Springer-Verlag, Berlin.

LIST OF CONTRIBUTORS

K.S. Al-Sultan, Department of Industrial and Operations Engineering, The University of Michigan, Ann Arbor, Michigan 48109-2117, USA
Lakshmisree Bandopadhyaya, Deshbandhu College, University of Delhi, Delhi, India
S. Chandra, Department of Mathematics, Indian Institute of Technology, Hauz Khas, New Delhi 110016, India
B.D. Craven, Department of Mathematics, The University of Melbourne, Parkville, Victoria 3052, Australia
M.V. Durga Prasad, Department of Mathematics, Indian Institute of Technology, Hauz Khas, New Delhi 110016, India
R.R. Egudo, Department of Mathematics and Statistics, The Papua New Guinea University of Technology, Private Mail Bag, Lae, Papua New Guinea
M.J. Foster, Kingston Business School, Kingston Hill, Kingston-upon-Thames KT1 7LB, UK
Dag Haugland, Chr. Michelsen Institute, Bergen, Norway
Reiner Horst, University of Trier, Department of Mathematics, Trier, Germany
Kurt Jornsten, Norwegian School of Economics, Bergen, Norway
Radha Kalyan, Department of Mathematics, Royal Melbourne Institute of Technology, GPO Box 2476V, Victoria 3001, Australia
Zulfiqar Ali Khan, Faculty of Engineering Science, Osaka University, Osaka, Japan
D.P. Kothari, Centre for Energy Studies, Indian Institute of Technology Delhi, New Delhi 110016, India
Santosh Kumar, Department of Mathematics, Royal Melbourne Institute of Technology, GPO Box 2476V, Victoria 3001, Australia
R.G.J. Mills, School of Mathematics and Computer Studies, South Australian Institute of Technology, The Levels, SA 5095, Australia


Gautam Mitra, Brunel University, UK
B. Mond, Department of Mathematics, La Trobe University, Bundoora, Victoria 3083, Australia
K.G. Murty, Department of Industrial and Operations Engineering, The University of Michigan, Ann Arbor, Michigan 48109-2117, USA
Subhash C. Narula, School of Business, Virginia Commonwealth University, 1015 Floyd Avenue, Richmond, Virginia 23284-4000, USA
Adiele D. Nwosu, Department of Mathematics, University of Nigeria, Nsukka, Nigeria
S.E. Perkins, School of Mathematics and Computer Studies, South Australian Institute of Technology, The Levels, SA 5095, Australia
A.B. Philpott, Department of Engineering Science, University of Auckland, Private Bag, Auckland, New Zealand
M.C. Puri, Indian Institute of Technology, Delhi, India
J.P. Saksena, National Productivity Council, New Delhi, India
R. Sharda, Oklahoma State University, Stillwater, Oklahoma, USA
R.R.K. Sharma, Department of Industrial and Management Engineering, Indian Institute of Technology, Kanpur 208016, India
Ebrahim Shayan, School of Mechanical and Manufacturing Engineering, Swinburne Institute of Technology, Hawthorn, Victoria 3122, Australia
Moshe Sniedovich, Department of Mathematics, The University of Melbourne, Parkville, Victoria 3052, Australia
D.M. Steiger, Oklahoma State University, Stillwater, Oklahoma, USA
Jie Sun, Department of Industrial Engineering and Management Sciences, Northwestern University, Evanston, Illinois 60208, USA
Mehrdad Tamiz, The Numerical Algorithms Group Ltd, UK
Patrick Tobin, Swinburne Institute of Technology, John Street, Hawthorn, Victoria 3122, Australia
Vanita Verma, Department of Mathematics, Indian Institute of Technology, Hauz Khas, New Delhi 110016, India
T. Weir, 13 Boehm Close, Isaacs, ACT 2607, Australia
Yinyu Ye, Department of Management Sciences, University of Iowa, Iowa City, Iowa 52242, USA
P. Zeephongsekul, Department of Mathematics, Royal Melbourne Institute of Technology, GPO Box 2476V, Victoria 3001, Australia

RECENT ADVANCES IN GLOBAL OPTIMIZATION: A TUTORIAL SURVEY

REINER HORST
Department of Mathematics, University of Trier, Trier, West Germany

DOI: 10.1201/9780429333439-1

Abstract  This paper describes the state-of-the-art in important parts of global optimization. By definition, a (multiextremal) global optimization problem seeks at least one global minimizer of a real-valued objective function that possesses (often very many) different local minimizers. The feasible set of (admissible) points in IRn is usually determined by a system of inequalities. The enormous practical need for solving global optimization problems coupled with a rapidly advancing computer technology has allowed one to consider problems which a few years ago would have been considered computationally intractable. As a consequence we are seeing the creation of a large and increasing number of diverse algorithms for solving a wide variety of multiextremal global optimization problems. It is the purpose of this paper to present in a tutorial way a survey of recent methods and typical applications.

Key Words  Global Optimization, Multiextremal Optimization, Nonconvex Programming

1. INTRODUCTION AND CLASSIFICATION

By definition, a (multiextremal) global optimization problem seeks at least one global minimizer of a real-valued objective function that possesses (often very many) different local minimizers with objective function values that can be substantially different from the global minimum. It is well known that in practically all disciplines where mathematical models are used there are many real-world problems which can be formulated as multiextremal global optimization problems. Standard nonlinear programming techniques have not been successful for solving these problems. Their deficiency is due to the intrinsic multiextremality of the formulation and not to the lack of smoothness or continuity. One can observe that local tools such as gradients, subgradients, and second order constructions such as Hessians, cannot be expected to yield more than local solutions. One finds, for example, that a stationary point (satisfying certain first order optimality conditions) is often detected for which there is even no guarantee of local minimality. Moreover, determining the local minimality of such a point is known to be NP-hard in the sense of computational complexity even in relatively simple cases. Apart from this deficiency in the local situation, classical methods do not recognize conditions for global optimality. For these reasons global solution methods must be significantly different from standard nonlinear programming techniques, and they can be expected to be, and are, much more expensive computationally. However, the enormous practical need for solving global optimization problems coupled with a rapidly advancing computer technology has allowed one to consider problems which a few years ago would have been considered computationally intractable. As a consequence, we are seeing the creation of a large and increasing number of diverse algorithms for solving a wide variety of multiextremal global optimization problems. Most of these procedures are designed for special problem types where helpful specific structures can be exploited. Moreover, in many practical global optimizations, the multiextremal feature involves only a small number of variables, and additional structure is amenable to large scale solutions. Other methods which have been proposed for solving very general and difficult global problems that possess little additional structure can handle only small problem sizes with sufficient accuracy. However, in these very general cases, the methods often provide useful tools for transcending local optimality restrictions, in the sense of providing valuable information about the global quality of a given feasible point. Typically, such information will give upper and lower bounds for the optimal objective function value and indicate parts of the feasible set where further investigations of global optimality will not be worthwhile.
We distinguish between unconstrained and constrained global optimization problems. The unconstrained problem is to find a point x* ∈ IRn such that for a real-valued objective function f: IRn → IR we have

f(x*) ≤ f(x)  for all x ∈ IRn.   (1)

For obvious computational reasons, one usually assumes that a compact convex set S which contains a global minimizer x* in its interior is specified in advance, i.e., instead of (1), one seeks a point x* ∈ S satisfying

f(x*) = min {f(x): x ∈ S}.   (2)

Often, the set S is supposed to be an n-dimensional interval (hyperrectangle). Nevertheless, problem (2) remains essentially one of unconstrained optimization. Unconstrained global optimization problems are encountered in engineering (especially technical design) and in econometrics (for example, when a likelihood function has to be maximized).


Most often, however, we encounter constrained problems, i.e. problems where a point x* is sought such that

f(x*) = min {f(x): x ∈ D}   (3)

with the feasible set D ⊂ IRn of admissible points defined by a system of inequalities

g_i(x) ≤ 0  (i = 1,...,m),   (4)

where the g_i: IRn → IR (i = 1,...,m) are given functions. One can observe that many constrained global optimization problems encountered in the decision sciences, in engineering and operations research have at least the following closely related properties:

(i) convexity is present in a limited and often unusual sense;
(ii) a global optimum occurs within a subset of the boundary of the feasible set.

In an abundant class of global optimizations, convexity is present in a reverse sense. In this direction recent research is focussed on the following main problem classes: (a) minimization of concave functions subject to linear and convex constraints (i.e., "concave minimization"); (b) convex minimization over the intersection of convex sets and complements of convex sets (i.e., "reverse convex programming"); and (c) global optimization of functions that can be expressed as a difference of two convex functions (i.e., "d.c.-programming"). Another large class of constrained global optimization problems of importance has been termed "Lipschitz Programming", where now the functions in the formulation (3), (4) are assumed to be Lipschitz continuous on certain subsets of their domains. Although neither of the aforementioned properties (i)-(ii) is necessarily satisfied in Lipschitz problems, much has been done here recently by applying the basic ideas that have been developed for the problem classes (a), (b) and (c) mentioned above. Finally, it turns out that global optimization problems are related to solving systems of equations and/or inequalities in the sense that most naturally every such system can be transformed into an equivalent global optimization problem.
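As a small illustration of that last remark (a sketch, not taken from the chapter): a system h(x) = 0, g(x) ≤ 0 can be recast as minimizing F(x) = h(x)² + max(0, g(x))², whose global minimum value is 0 exactly at the solutions of the system. The particular functions h, g and the brute-force grid used to locate the minimizer are illustrative assumptions.

```python
# Sketch: recasting a system of equations/inequalities as an
# equivalent global optimization problem. The system
#   h(x) = 0 and g(x) <= 0
# has a solution iff min F = 0, where F(x) = h(x)**2 + max(0, g(x))**2.

def h(x):
    return x * x - 2.0        # equation: x^2 = 2

def g(x):
    return -x                 # inequality: x >= 0

def F(x):
    return h(x) ** 2 + max(0.0, g(x)) ** 2

# Brute-force check on a grid over [-3, 3]: the minimizer of F
# approximates sqrt(2), the unique solution of the system.
x_best = min((i / 1000.0 for i in range(-3000, 3001)), key=F)
best = F(x_best)
```

Note how the negative root of h is screened out: at x = -sqrt(2) the inequality penalty max(0, g(x))² is about 2, so F is far from zero there.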
A rough classification of the methods developed to solve global optimization problems distinguishes two classes, depending on whether or not they incorporate any stochastic elements. Stochastic methods involve function evaluations in a random sample of points and subsequent manipulations of the sample. As a rule, application of stochastic methods is limited to thl. essentially unconstrained case (2). Due to the stochastic e1ements involved the possibility of an absolute guarantee of success has to he sacrificed. However, under quite mild conditions on the sampling distribution and on f, the probability of sampling a close approximation of the global optimum can he shown

~

4

REINER HORST

for most methods to approach 1 as the sample size increases. Deterministic methods do not involve any stochastic elements. Many diverse deterministic approaches have been proposed to solve unconstrained as well as constrained global optimization problems. Within the current state of the art, the properties (i) and (ii) mentioned above that are present in most problems encountered in the decision sciences are best exploited by deterministic methods that combine analytical and combinatorial tools in an effective way. We find that typical recent approaches use techniques such as branch and bound, relaxation, outer approximation, and cutting planes, whose basic principles have long appeared in the related fields of integer and combinatorial optimization as well as convex minimization. One has found, however, that application of these fruitful ideas to global optimization is raising many new interesting theoretical and computational questions whose answers cannot be inferred from previous successes. For example, branch and bound methods applied to global optimization problems generate infinite processes, and hence their own convergence theory and stopping rules must be developed. In contrast, in integer programming these are finite procedures, and so their convergence properties do not directly apply. Other examples involve important results in convex minimization that reflect the coincidence of local and global solutions. Here also one cannot expect a direct application to multiextremal global minimization. The plan of this article is as follows. In the next section, a brief introduction is given to the main deterministic and stochastic methods which have been proposed to solve the unconstrained case (2). Certain deterministic methods, however, will already be formulated in a way that includes their application to the constrained case as well.
Section 3 contains a brief introduction to the most important classes of constrained global optimization problems that can be treated by deterministic methods. Some basic properties of each class are mentioned and some typical practical applications are indicated. Section 4 is devoted to recent solution techniques for the constrained problem, and a brief report on computational experiments and on prospective lines of research will be given in the final section. The presentation will be kept on a tutorial level. In line with the scope of the article, only textbooks and surveys will be cited rather than original contributions.

2. ESSENTIALLY UNCONSTRAINED GLOBAL OPTIMIZATION

Multistart and Related Methods

We consider the essentially unconstrained problem (2). The simplest and very straightforward stochastic approach is to sample N points in S, drawn from a prescribed probability distribution over S (for instance, the uniform distribution), and to select as candidate for an approximate optimal solution the point y*(N) with the smallest objective function value (pure random search). A more sophisticated and certainly more efficient way is to initiate a descent local search procedure LS in some or all points of a uniformly distributed sample drawn from S and to select the approximate local minimizer y*(N) found in that way with the smallest function value (multistart). It is easily seen that for N → ∞ the values f(y*(N)) obtained with multistart converge to the optimal value f(x*) with probability 1 under weak smoothness assumptions on f (cf., e.g., 1). On the other hand, it is obvious, however, that this method must be wasteful from a computational point of view, in that the same local minimizers will be discovered from many different starting points. Therefore, one has attempted to estimate the region of attraction A(y*), which is defined to be the set of all points in S starting from which the local procedure LS will eventually arrive at y*. Since direct deterministic estimates of A(y*) (that used the Hessian of f at y*) have not been very successful, much research has focussed on clustering methods in the following sense: starting from a uniform sample of points from S one creates groups of mutually close points that should correspond to the regions of attraction, with the aim to start the local search procedure LS no more than once in every such region. One way to generate such groups removes a certain fraction of the sample points with the highest objective function values (reduction method). Another way forms the sample by admitting points that are obtained from the initial sample by a few steepest descent steps (concentration method). Clusters are formed from the transformed sample by starting from a so-called seed point, which may be an unclustered point with lowest function value or the local minimizer found by applying LS to this point.
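The two baseline strategies, pure random search and multistart, can be sketched as follows. The toy objective, the sample size, and the crude fixed-step gradient descent standing in for the local search procedure LS are all illustrative assumptions, not prescriptions from the survey.

```python
import random

def f(x):
    # Toy multiextremal objective: global minima at x = -1 and x = 1, f = 0.
    return (x * x - 1.0) ** 2

def local_search(x, step=1e-3, iters=4000):
    # Crude descent via a central-difference gradient; stands in for LS.
    for _ in range(iters):
        grad = (f(x + 1e-6) - f(x - 1e-6)) / 2e-6
        x -= step * grad
    return x

def pure_random_search(n, lo=-2.0, hi=2.0):
    # Keep the best of n uniform samples from S = [lo, hi].
    sample = [random.uniform(lo, hi) for _ in range(n)]
    return min(sample, key=f)

def multistart(n, lo=-2.0, hi=2.0):
    # Run the local search from each sample point; keep the best minimizer.
    sample = [random.uniform(lo, hi) for _ in range(n)]
    return min((local_search(x) for x in sample), key=f)

random.seed(0)
y_prs = pure_random_search(50)
y_ms = multistart(50)
```

The sketch also exposes the wastefulness noted above: with 50 starting points, the same two minimizers are rediscovered dozens of times, which is exactly what clustering methods try to avoid.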
Points are added to the cluster according to certain rules known from cluster analysis. One of several such rules is to add the closest unclustered point to the current cluster until the distance of this point to the closest point in the cluster exceeds a certain critical distance (single linkage). The major deficiency of clustering methods is that the clusters generated by these methods at best correspond to the connected components of a level set, i.e., a set of the form {x ∈ S: f(x) ≤ a}, instead of regions of attraction. Such a component can obviously contain more than one region of attraction. It is therefore possible that, although a point in A(y*) was sampled, a local minimizer y* will not be found. Recently, a so-called multi-level single linkage method was proposed where the local search procedure LS is applied to every sample point, except if there is another sample point within the critical distance which has a smaller function value. If the critical distance is appropriately chosen, then even if sampling continues forever, the total number of local searches ever started in this method is finite with probability 1. All local minima will be located when N → ∞ (cf. 2 and references therein). Another question of interest that has been studied by quite a number of authors is that of an appropriate stopping rule for multistart and related procedures. Most of these rules are based on a Bayesian estimate of certain problem properties (Bayesian stopping rules). Under certain assumptions about the cost and potential benefit of further investigations probabilistic optimal stopping points can be derived (optimal Bayesian stopping rules, cf., e.g., 3, 4 and references therein). As a rule, multistart and related methods are easy to implement and can be recommended for use in situations where little is known about the objective function. First numerical test results on small problems seem to favour the multi-level single linkage method.

Random Decrease and Trajectory Methods

Standard nonlinear programming techniques for (locally) solving unconstrained optimization problems generate a sequence {x^k} of points with decreasing objective function values f(x^k) by means of a descent process which is defined by

x^(k+1) = x^k + λ_k d^k,   (5)

where d^k ∈ ℝ^n is an appropriate direction of descent, and λ^k ∈ ℝ_+ is the step size. It is well known that all of these methods are local, i.e., one gets stuck at a local minimum where no direction of descent can be found. A much investigated way to achieve global reliability is to use random directions d^k. In these methods, having arrived at x^k ∈ S, one generates a direction d^k ∈ ℝ^n satisfying x^k + d^k ∈ S from a distribution and chooses a step size λ^k such that

f(x^{k+1}) = f(x^k + λ^k d^k) ≤ min {f(x^k), f(x^k + d^k)} .   (6)
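A minimal sketch of such a random-direction method, using the simplest step-size rule λ^k ∈ {0, 1}; the two-well objective function and all parameter values here are illustrative assumptions, not taken from the text:

```python
import numpy as np

rng = np.random.default_rng(0)

def random_direction_search(f, x0, n_iter=3000, scale=0.5):
    # Random-direction method with the simplest step-size rule:
    # lambda_k = 1 if the trial point improves f, else lambda_k = 0.
    x = np.asarray(x0, dtype=float)
    for _ in range(n_iter):
        d = scale * rng.standard_normal(x.shape)   # random direction d^k
        if f(x + d) < f(x):                        # acceptance condition (6)
            x = x + d
    return x

# Illustrative two-well objective; global minima at (0,0), (0,4), (4,0), (4,4)
f = lambda x: ((x[0]**2 - 4*x[0])**2 + (x[1]**2 - 4*x[1])**2) / 8.0
x_star = random_direction_search(f, [-1.0, -1.0])
```

Started at an arbitrary infeasible-looking point, the iterate drifts into one of the four wells; which one is reached depends on the random draws, illustrating the probabilistic correctness statement below.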

The step size λ^k can simply be chosen to be 0 or 1, such that f(x^{k+1}) = min {f(x^k), f(x^k + d^k)}. More sophisticated techniques determine λ^k in the same way as in certain classical local descent methods, by minimizing f or an approximating polynomial over the line segment joining x^k and x^k + d^k. The random direction methods are correct in the sense that, under most natural mild conditions on the distribution, a point which is arbitrarily close to the global minimizer will be found with a

GLOBAL OPTIMIZATION

probability that tends to 1 with increasing k. The issue of appropriate stopping rules has not been resolved yet. Numerical experiments indicate that these methods are fast but occasionally unreliable in practice.
A continuous analogue to descent methods is based on appropriate perturbations of the steepest descent trajectory x(t), defined by

dx(t)/dt = −g(x(t)) ,   (7)

where g(x(t)) is the gradient of f at x(t). Deterministic and stochastic perturbations of (7) have been proposed. The deterministic approach suffers from theoretical weakness, since it can only be shown that the method cannot converge to a local minimum with relatively high function value. The stochastic versions lead to stochastic differential equations whose trajectories converge for sufficiently smooth functions to the global minimum in the stochastic sense, but application in practice is very limited because of the enormous computational expense required to solve the stochastic differential equations.
Similarities exist between these approaches and the well-known simulated annealing method. A continuous version of simulated annealing generates a random point z in a neighbourhood of the incumbent x and continues the procedure from z if f(z) ≤ f(x) or, with a certain probability that depends on f(z) − f(x) and on a so-called cooling parameter, if f(z) > f(x). Stochastic convergence to the global optimum has been shown for several variants when f is sufficiently smooth and the cooling parameter decreases in an appropriate way. The speed of convergence, however, can be very slow and no useful stopping rules have been found yet.
A continuous version of the Newton method considers the differential equation

dx(t)/dt = λ H^{−1}(x(t)) g(x(t)) ,   (8)

where H is the Hessian and λ = ±1.
Since (8) is only defined in the region of x-space where H is nonsingular, and because of the numerical effort needed to compute H^{−1}(x(t)), one has also considered the differential equation

H(x(t)) dx(t)/dt = λ g(x(t)) ,   (9)

the exact solution of which, in terms of the gradient vector, is

g(x(t)) = g(x(0)) e^{λt} .   (10)

It is clear from (10) that for λ = −1, a trajectory x(t) satisfying (10) will converge to a stationary point as t → ∞. If at such a point one switches to λ = +1, then x(t) will move away from that stationary point. It follows from (10) that, for any starting point x(0), the


trajectory is contained in S(x(0)) := {x ∈ S: g(x) = μ g(x(0)), μ ∈ ℝ}, which of course contains all stationary points. But S(x(0)) often consists of several components, only one of which is the trajectory that will be followed. It is not known yet how different starting points x(0) in each component of S(x(0)) can be found such that eventually all stationary points, and hence all global solutions, would be detected. Recently it has been shown that different components of S(x(0)) can be connected by a finite number of curves, but the problem of how to construct those curves has not yet been solved satisfactorily. Another way of exploiting the relations between the trajectories of (9) and S(x(0)) is to consider the set {x ∈ S: g(x) = μ g(x(0)), μ ∈ [0,1]} and to use so-called homotopy methods and simplicial approximation techniques to compute the paths x(μ) as a function of μ ∈ [0,1] (see 5).

Branch and Bound Methods
One of the most natural and basic deterministic tools for solving difficult optimization problems is branch and bound (BB). In global optimization, BB methods have been successfully applied to many types of essentially unconstrained and constrained problems. We therefore give an outline of BB methods for the constrained case with feasible set D, which clearly includes the essentially unconstrained problem (set D = S). Suppose that min f(D) := min {f(x): x ∈ D} exists. Then a BB method operates in the following way: Start with a relaxed feasible set M_0 ⊃ D and split (partition) M_0 into a finite number of subsets M_i, i ∈ I. In each subset M_i determine lower bounds β(M_i) and upper bounds α(M_i) of f on M_i ∩ D satisfying

β(M_i) ≤ inf f(M_i ∩ D) ≤ α(M_i) .

Then β := min {β(M_i): i ∈ I} and α := min {α(M_i): i ∈ I} are "overall" bounds, i.e., we have β ≤ min f(D) ≤ α. If α = β (or α − β ≤ ε, ε > 0 prescribed), then stop (α = β = min f(D); resp. 0 ≤ α − min f(D) ≤ ε and 0 ≤ min f(D) − β ≤ ε). Otherwise, select some subsets M_i and partition these selected subsets further in order to obtain a refined partition of M_0. Determine new, better bounds on the new partition elements and repeat the process in this way. Here {M_i: i ∈ I} defines a partition of M_0 if M_0 is the union of the M_i (i ∈ I), and if two different partition sets M_i and M_j have at most boundary points in common.
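The scheme just outlined can be sketched in a few lines. The following one-dimensional prototype uses bisection as subdivision, midpoint evaluations for α(M), and a midpoint Lipschitz bound for β(M); these concrete bounding and selection choices are illustrative assumptions, not prescriptions of the text:

```python
import heapq
import math

def branch_and_bound(f, a, b, L, eps=1e-3):
    """One-dimensional prototype of the BB scheme: partition sets are
    intervals, subdivision is bisection, alpha(M) = f(midpoint), and
    beta(M) = f(midpoint) - L*d(M)/2 is a Lipschitz lower bound."""
    mid = 0.5 * (a + b)
    alpha, x_best = f(mid), mid                 # incumbent upper bound
    heap = [(f(mid) - L * (b - a) / 2, a, b)]   # entries (beta(M), M)
    while heap:
        beta, lo, hi = heapq.heappop(heap)      # select smallest lower bound
        if beta >= alpha - eps:                 # every remaining M is fathomed
            break
        for lo2, hi2 in ((lo, 0.5*(lo+hi)), (0.5*(lo+hi), hi)):  # bisection
            m = 0.5 * (lo2 + hi2)
            fm = f(m)
            if fm < alpha:
                alpha, x_best = fm, m           # better feasible point found
            heapq.heappush(heap, (fm - L * (hi2 - lo2) / 2, lo2, hi2))
    return x_best, alpha

# Example: f(x) = sin x on [0, 10] with Lipschitz constant L = 1;
# the global minimum value is -1, attained at x = 3*pi/2.
x_best, alpha = branch_and_bound(math.sin, 0.0, 10.0, L=1.0)
```

On termination the incumbent α is within ε of min f, because the smallest remaining lower bound satisfies β ≥ α − ε and β ≤ min f.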


Whenever M′ ⊂ M is an element of a partition of a refined subset M, it is required that β(M′) ≥ β(M) and α(M′) ≤ α(M) (monotonicity of the bounds). Subsets M of a current partition are said to be fathomed if it is known that min f(D) cannot be attained in M. Let β_k, α_k denote the incumbent bounds in iteration k, and let M be an element of the current partition. Then M is fathomed if we have β(M) ≥ α_k or M ∩ D = ∅. Only the remaining subsets of M_0 are considered henceforth. In most applications, the upper bounds α(M) are simply the objective function values at the best feasible point known in M. More precisely, if S_M ⊂ M ∩ D is the finite sample of known feasible points in M, then α(M) = min f(S_M). We set α(M) = +∞ if S_M = ∅. The sequence of approximate solutions x^k is usually defined via f(x^k) = α_k, where α_k denotes again the incumbent upper bound in iteration k of the procedure.

Evidently, for convergence and efficiency of any realization of the scheme outlined above, the concrete choices of the following three basic operations are crucial: Partitioning (how to choose M_0 and its refinements?), Selection (how to determine the partition sets to be refined further?), Bounding (how to determine β(M)?).

Partitioning

For the partition sets M_0, M, most simple types of polytopes or convex polyhedral sets are used, such as simplices, rectangles and polyhedral cones. The subdivision is often supposed to be exhaustive, meaning that every decreasing sequence of polytopes generated by successive subdivision converges to a singleton. A subdivision of polyhedral cones is exhaustive if every decreasing sequence of successively refined cones converges to a ray. A large class of exhaustive subdivisions of simplices is discussed in 6. Often bisections are used in the following sense: Let M be an n-simplex defined as the convex hull of its n+1 affinely independent vertices, and let [v^1, v^2] be one of the longest edges of M. Replacing v^1 and v^2, respectively, by v = 1/2 (v^1 + v^2) yields the vertex sets of two simplices of equal volume that form a partition of M which is called bisection.
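A minimal sketch of this longest-edge bisection; the triangle at the end is an arbitrary example of my own choosing:

```python
import itertools
import numpy as np

def bisect_simplex(vertices):
    """Bisection of an n-simplex given by its n+1 vertices: split a
    longest edge [v_r, v_s] at its midpoint, yielding two subsimplices
    of equal volume."""
    V = [np.asarray(v, dtype=float) for v in vertices]
    # find (an index pair of) the longest edge
    r, s = max(itertools.combinations(range(len(V)), 2),
               key=lambda e: np.linalg.norm(V[e[0]] - V[e[1]]))
    mid = 0.5 * (V[r] + V[s])
    S1 = [mid if i == r else V[i] for i in range(len(V))]  # replace v_r
    S2 = [mid if i == s else V[i] for i in range(len(V))]  # replace v_s
    return S1, S2

# Example: standard triangle; its longest edge joins (1,0) and (0,1)
S1, S2 = bisect_simplex([[0, 0], [1, 0], [0, 1]])
```

Applying the routine recursively to its own output produces an exhaustive subdivision in the sense defined above.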

Now let S be an n-simplex containing 0 in its interior (for example,


let S be defined by its vertices v^i = e^i (i=1,...,n), v^{n+1} = −e, where e^i is the i-th unit vector in ℝ^n and e = (1,...,1) ∈ ℝ^n). Consider the n+1 facets F_i of S. Each F_i is an (n−1)-simplex. For each F_i, let C_i be the convex cone with vertex at 0 having exactly n edges, namely the halflines from 0 through the n vertices of F_i. Then {C_i: i=1,...,n+1} is a conical partition of ℝ^n. Any exhaustive partition of the initial facets F_i into (n−1)-simplices induces an exhaustive partition of cones, where, as above, to each (n−1)-simplex that cone is associated whose generating edges are the halflines from 0 through the vertices of the simplex. The conical subdivision induced by bisection of the (n−1)-simplices is also called bisection. Let M = {x: a ≤ x ≤ b} be an n-rectangle. Then bisection consists of a subdivision of M into two n-rectangles by a cutting hyperplane through 1/2(a+b) perpendicular to the longest edge of M. Bisections are exhaustive in each of the three cases. The drawback of bisection, however, is that the given structure of the optimization problem is not taken into account. Recently, more sophisticated and numerically more efficient subdivision rules have been shown to yield convergent BB procedures in various circumstances (cf. 6).

Selection
Let R_k be the current partition, and let P_k denote the set of partition elements that are selected for subdivision in iteration k of the BB procedure. Obviously, the lower bound β_{k−1} ≤ min f(D) cannot be improved unless the set of partition elements M ∈ R_k satisfying β_{k−1} = β(M) is refined. A slightly more general selection rule that yields convergent algorithms when combined with suitable partition and bounding procedures, and which provides enough flexibility for many practical applications, is to require that (at least each time after finitely many steps) P_k satisfies

P_k ∩ {M ∈ R_k : β(M) = β_{k−1}} ≠ ∅ .   (11)

Bounding
Let M be a polytope. An often proposed way of calculating lower bounds β(M) for min f(D ∩ M) is to minimize over D ∩ M a suitable convex underestimator of f on M, i.e., a convex function φ: M → ℝ satisfying φ(x) ≤ f(x) for all x ∈ M. The uniformly best convex underestimator φ_M of f on M, in the sense that no other convex underestimator exceeds it at any point of M, is called the convex envelope of f over M.


The convex envelope φ_M exists and is unique whenever f is lower semicontinuous on M. It is the pointwise supremum of all affine underestimators of f on M. Another, geometrically more appealing characterization is by means of the epigraphs of f and φ_M. Let epi(f) = {(x,r) ∈ M×ℝ: f(x) ≤ r} denote the epigraph of f over M. Then conv epi(f) = epi(φ_M), where conv epi(f) denotes the convex hull of epi(f). The following interesting properties of convex envelopes have been used frequently in various contexts. Let M be a polytope with vertex set V(M) and let φ_M denote the convex envelope of f over M. Then the minimal values of φ_M and f over M coincide, and every global minimizer of f over M is a global minimizer of φ_M over M;

f and φ_M coincide on V(M); let M = M_1 × M_2 × ... × M_r be the product of the r rectangles M_i (i=1,...,r) and let f(x) = Σ_{i=1}^{r} f_i(x^i), where f_i: M_i → ℝ (i=1,...,r). Then φ_M(x) = Σ_{i=1}^{r} φ_{M_i}(x^i).

The convex envelope of a concave function over an n-dimensional simplex M is the affine function φ_M that is uniquely determined by the system of n+1 linear equations expressing that f and φ_M coincide at the vertices of M. Note that these linear equations need not be solved explicitly. If barycentric coordinates λ_1,...,λ_{n+1} are used, i.e., x ∈ M is expressed as x = Σ_{i=1}^{n+1} λ_i v^i with v^i ∈ V(M), Σ_{i=1}^{n+1} λ_i = 1, λ_i ≥ 0 (i=1,...,n+1), then φ_M(x) = Σ_{i=1}^{n+1} λ_i f(v^i).
Another type of natural lower bounds is available for Lipschitz functions. Recall that a real-valued function f is called Lipschitzian on a set M ⊂ ℝ^n if there is a constant L = L(f,M) > 0 such that

|f(x) − f(y)| ≤ L ‖x − y‖  for all x, y ∈ M ,


where ‖·‖ denotes the Euclidean norm. The value of knowing a Lipschitz constant L follows from a simple observation. Suppose that the diameter d(M) < ∞ of M is known. Then it is easily seen that f(x) ≥ f(y) − L‖x−y‖ ≥ f(y) − L d(M) holds. Let S_M ⊂ M denote a finite sample of points in M at which the function values have been calculated. Then it follows that we have

min f(S_M) ≥ inf f(M) ≥ max f(S_M) − L d(M) .

Another often occurring interesting case is that of so-called factorable functions, i.e., f is the last of a finite sequence of functions f^(1), f^(2), ..., where this factorization sequence is built up as follows: f^(j)(x_1,...,x_n) = x_j (j=1,...,n), and for all k > n one of the following holds:

f^(k)(x) = f^(i)(x) + f^(j)(x)  for some i, j < k ,
f^(k)(x) = f^(i)(x) · f^(j)(x)  for some i, j < k ,
f^(k)(x) = F(f^(i)(x))  for some i < k ,

where F belongs to a given class of elementary functions ℝ → ℝ (e.g., F(t) = t^p, F(t) = e^t, F(t) = sin t, ...; cf. 7). This factorization sequence corresponds quite naturally to the order in which f(x_1,...,x_n) would be computed for given values of the arguments x_j (j=1,...,n). In the context of BB procedures, factorable functions call for hyperrectangular partition sets M, where either convex envelopes φ_M(x) can be computed with reasonable computational effort (7), or interval analysis can be applied to yield lower bounds (cf. 8). Clearly, between convex envelopes and interval analytical bounds there is a lot of room for clever and ad hoc ideas in specific situations. Convergence of BB methods to the global optimum is assured if the lower bounding procedure is asymptotically accurate, in the sense that the lower bound β(M) approaches the minimum of f over M ∩ D when the volume of M becomes sufficiently small (cf. 6).

3. CONSTRAINED GLOBAL OPTIMIZATION: PROBLEM CLASSES

Many constrained global optimization problems encountered in the decision sciences and engineering belong to one of the classes concave minimization, d.c. programming and reverse convex constraints, and Lipschitz optimization that were mentioned in the introduction.


Concave Minimization
One of the most important global optimization problems is that of minimizing a concave function over a compact convex set (concave minimization). Concave minimization problems are multiextremal global optimization problems; it is easy to construct, for example, concave functions f and polytopes D having the property that every vertex of D is a local minimizer of f with respect to D: choose, e.g., f(x) = −‖x‖², where ‖·‖ denotes the Euclidean norm, and D = {x ∈ ℝ^n: a ≤ x ≤ b}, a, b ∈ ℝ^n, a < 0, b > 0. The literature abounds with problems from such diverse disciplines as technical design, economics, medicine, business and political science which can be formulated as concave minimization problems or, equivalently, as convex maximization problems. In decision models, where f represents, e.g., the cost associated with certain activities x, concave minimization problems often reflect positive setup cost (fixed charge) and/or a decrease in unit cost with increasing levels of activity (economies of scale). Several important models yield not necessarily concave, but quasiconcave objective functions that, however, can often be suitably transformed into concave functions (see 6, 10, 11 and references therein). The following important classes of optimization problems can be transformed into equivalent concave minimization problems:
integer programming, where f is twice continuously differentiable and D is the set of integral vectors in a compact subset of ℝ^n;
bilinear programming, where x = (x^1, x^2) and f(x^1, x^2) = ⟨c, x^1⟩ + ⟨x^1, A x^2⟩ + ⟨d, x^2⟩, D = X_1 × X_2 with c ∈ ℝ^{n_1}, d ∈ ℝ^{n_2}, A ∈ ℝ^{n_1×n_2}, X_1 ⊂ ℝ^{n_1}, X_2 ⊂ ℝ^{n_2} polyhedral, n_1 + n_2 = n;
concave complementarity problems, where, given a convex set D ⊂ ℝ^n and two concave mappings g, h: ℝ^n → ℝ^m, a point x ∈ D is sought satisfying g(x) ≥ 0, h(x) ≥ 0, ⟨g(x), h(x)⟩ = 0;
linear max-min problems of the form

maximize_{x^1} minimize_{x^2} ⟨c, x^1⟩ + ⟨d, x^2⟩
s.t. A x^1 + B x^2 ≤ b, x^1, x^2 ≥ 0 ,

where x^1, c ∈ ℝ^{n_1}; x^2, d ∈ ℝ^{n_2}; A ∈ ℝ^{s×n_1}, B ∈ ℝ^{s×n_2}, b ∈ ℝ^s. Proofs and references can be found in 6.
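The two-dimensional instance of the example above, f(x) = −‖x‖² over a box with a < 0 < b, can be checked numerically; the concrete box below is an arbitrary illustrative choice:

```python
import itertools
import numpy as np

rng = np.random.default_rng(1)

# f(x) = -||x||^2 over the box a <= x <= b with a < 0 < b: every vertex
# of the box is a local minimizer, and the global minimum is attained
# at a vertex (the one farthest from the origin).
a = np.array([-1.0, -2.0])
b = np.array([3.0, 1.0])
f = lambda x: -np.dot(x, x)

vertices = [np.where(mask, b, a) for mask in itertools.product([0, 1], repeat=2)]
best_vertex_value = min(f(v) for v in vertices)

# sample the box densely: no point does better than the best vertex
samples = rng.uniform(a, b, size=(10000, 2))
sample_min = min(f(x) for x in samples)
```

Here the best vertex is (3, −2) with value −13, and no sampled interior point improves on it, in line with the extreme-point result quoted below.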

Besides convexity of the feasible set D that - of course - is heavily


exploited in the design of algorithms, the most interesting property of concave minimization problems is given in the following result, which belongs to the folklore of convexity theory. Let D ⊂ ℝ^n be nonempty, compact and convex and let f: D → ℝ be concave. Then the global minimum of f over D is attained at an extreme point of D. Most concave minimization problems encountered in practice possess additional structure. For example, often the feasible set D is a polytope. Many problems have network structure. Quadratic concave or piecewise linear objective functions are frequently encountered. The functions involved in the problems are often separable, i.e., the sum of functions of one variable, and large scale concave minimization problems often involve only a few "nonlinear" but many "linear" variables.

D.C. Programming and Reverse Convex Constraints
A real-valued function f defined on a convex set C ⊂ ℝ^n is called d.c. on C if, for all x ∈ C, f can be expressed in the form f(x) = p(x) − q(x), where p, q are convex functions on C. The function f is called d.c. if it is d.c. on ℝ^n. The representation is said to be a d.c. decomposition of f.

In all these notions, d.c. stands for "difference of two convex functions". A global optimization problem is called a d.c. programming problem if it has the form

min f(x),  s.t. g_j(x) ≤ 0 (j=1,...,m),

where all functions f, g_j are d.c. Clearly, every concave minimization problem is also a d.c. programming problem. The following results show that the class of d.c. functions is very rich and, moreover, that it enjoys a remarkable stability with respect to operations frequently encountered in optimization. Let f, f_i (i=1,...,m) be d.c. Then the following assertions hold:
- every linear combination of the f_i is d.c.;
- max {f_i(x): i=1,...,m} and min {f_i(x): i=1,...,m} are d.c.;
- |f(x)|, f^+(x) := max {0, f(x)} and f^−(x) := min {0, f(x)} are d.c.;
- the product of two d.c. functions is d.c.
Another main result on recognizing d.c. functions is that every function which is d.c. on a convex neighbourhood of a point x^0 is d.c.


on the whole space ℝ^n. Two consequences are remarkable:
- every function f: ℝ^n → ℝ with continuous second partial derivatives is d.c.;
- every real-valued continuous function on a compact convex set D ⊂ ℝ^n is the limit of a sequence of d.c. functions which converges uniformly on D.

(For proofs, see 6 and the literature therein.)
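One of the stability properties above can be made concrete: if f_i = p_i − q_i with p_i, q_i convex, then max{f_1, f_2} = max(p_1 + q_2, p_2 + q_1) − (q_1 + q_2), again a difference of two convex functions. A small numerical check with toy functions of my own choosing:

```python
import numpy as np

# Toy d.c. decompositions f_i = p_i - q_i with p_i, q_i convex:
# f1(x) = x^2 - |x|,  f2(x) = exp(x) - x^2.
p1, q1 = (lambda x: x**2), (lambda x: abs(x))
p2, q2 = (lambda x: np.exp(x)), (lambda x: x**2)
f1 = lambda x: p1(x) - q1(x)
f2 = lambda x: p2(x) - q2(x)

# Explicit d.c. decomposition of max{f1, f2}:
#   max{f1, f2} = max(p1 + q2, p2 + q1) - (q1 + q2).
p = lambda x: max(p1(x) + q2(x), p2(x) + q1(x))
q = lambda x: q1(x) + q2(x)
```

The identity holds pointwise because adding q_1 + q_2 inside the maximum leaves it unchanged, and both p and q are convex as maxima and sums of convex functions.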

It follows that every problem of minimizing a continuous function

over a compact convex set can be approximated as closely as desired

(in the sense of the sup-norm) by a d.c. programming problem. The

main concern when using the above results is how to construct the

appropriate d.c. decomposition or d.c. approximation. In many cases

of practical importance, such a d.c. decomposition is readily available.

Example 1  In many applications separable objective functions are encountered, i.e., we have f(x) = Σ_{i=1}^{n} f_i(x_i), where each f_i is a real-valued function of one real variable on a given (possibly unbounded) interval. Often the functions f_i represent utility or production functions and have the property that there is a point x̄_i such that f_i is concave for x_i < x̄_i and convex for x_i ≥ x̄_i (S-shaped functions). Likewise, we often encounter the case where f_i is convex for x_i < x̄_i and concave for x_i ≥ x̄_i. Omitting the index i, we easily see that a d.c. decomposition f(x) = p(x) − q(x) of a differentiable concave-convex function of one real variable x is obtained by linearizing f at x̄:

p(x) = f′(x̄)(x − x̄) + f(x̄)  (x ≤ x̄),   p(x) = f(x)  (x > x̄),
q(x) = f′(x̄)(x − x̄) + f(x̄) − f(x)  (x ≤ x̄),   q(x) = 0  (x > x̄).

This amounts to solving a nondifferentiable convex minimization problem when all g_j are convex. For nonconvex g_j, however, the difficulty of evaluating (14) for all M where feasible points are not readily at hand will, in general, render the corresponding BB procedures computationally impracticable. This deficiency has until very recently restricted the applicability of BB procedures to cases where M ∩ D ≠ ∅ can be guaranteed for all partition sets M or where the exact decision on M ∩ D ≠ ∅ is easily obtained. It seems that for more complicated feasible sets we have to admit that an exact decision on M ∩ D = ∅ is not possible, and the crucial problem of handling partition sets whose feasibility is not known can only be resolved in an approximate sense. The following proposals, coupled with appropriate bounding and subdivision procedures, have considerably enlarged the fields of optimization where BB procedures can be applied. Let V(M) denote the vertex set of M. We consider the cases of convex feasible sets, feasible sets defined by convex and reverse


convex constraints, and of feasible sets defined by Lipschitzian constraints.
Let D := {x ∈ ℝ^n: g(x) ≤ 0}, where g(x) = max {g_j(x): j=1,...,m}, g_j: ℝ^n → ℝ convex (j=1,...,m), possess a known interior point y^0 satisfying g(y^0) < 0 (Slater condition). Suppose V(M) ∩ D = ∅ (otherwise M is obviously feasible) and choose an arbitrary point p ∈ M\D. (Note that a nonempty subset of V(M) is always known.) Compute the unique point z where the line segment [y^0, p] intersects the boundary of D (z = λy^0 + (1−λ)p, where λ is the unique solution of the univariate convex programming problem min {μ ∈ [0,1]: μy^0 + (1−μ)p ∈ D}). Let s(z) be a subgradient of g at z. Then delete M if it is strictly separated from D by the hyperplane H := {x: ⟨s(z), x − z⟩ = 0}, i.e., if

V(M) ⊂ {x: ⟨s(z), x − z⟩ > 0} .   (15)

Note that H supports D at z. Now let D = D_1 ∩ D_2, where D_1 = {x: g(x) ≤ 0}, D_2 = {x: h_j(x) ≥ 0 (j=1,...,r)}, and g, h_j: ℝ^n → ℝ are convex (j=1,...,r). Assume as above that an interior point y^0 of the convex set D_1 satisfying g(y^0) < 0 is known. Let C_j := {x: h_j(x) < 0} (j=1,...,r). Then delete a partition set M if its vertex set V(M) satisfies (15) applied to D_1 (instead of D), or if there is a j ∈ {1,...,r} such that

V(M) ⊂ C_j .   (16)

Finally, let D = {x: g_j(x) ≤ 0 (j=1,...,m)}, where all functions g_j are Lipschitzian on M with Lipschitz constants L_j. Let d(M) be the diameter of M (which is the length of the longest edge when M is a simplex, and the distance ‖b−a‖ between the "lower left" and the "upper right" vertex when M = {a ≤ x ≤ b} is a rectangle). Let S(M) be an arbitrary nonempty, finite subset of M. Then delete M whenever there is a j ∈ {1,...,m} satisfying

max {g_j(x): x ∈ S(M)} − L_j d(M) > 0 .   (17)

It is easy to see that by (15), (16), (17) only partition sets M are removed that are not feasible; but there might be infeasible sets M in each case that are not deleted by these rules. Nonetheless, when combined with appropriate bounding and subdivision procedures these


rules lead to convergent BB algorithms in each case.

A comprehensive discussion of the state-of-the-art in BB algorithms

can be found in 6.
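Deletion rule (17) is easy to implement for rectangular partition sets; the constraint, the sample rule and the two test boxes below are illustrative assumptions of my own:

```python
import numpy as np

def certainly_infeasible(gs, Ls, lo, hi, sample):
    """Deletion rule (17) for a rectangle M = {x: lo <= x <= hi}:
    M contains no point of D = {x: g_j(x) <= 0 for all j} whenever,
    for some j,  max{g_j(x): x in S(M)} - L_j * d(M) > 0,
    where d(M) = ||hi - lo|| is the diameter of M."""
    d = np.linalg.norm(np.asarray(hi, float) - np.asarray(lo, float))
    return any(max(g(x) for x in sample) - L * d > 0 for g, L in zip(gs, Ls))

# Illustrative constraint: g(x) = ||x|| - 1 <= 0 (the unit ball),
# with Lipschitz constant L = 1; S(M) = {corners, midpoint}.
g = lambda x: np.linalg.norm(x) - 1.0
sample_of = lambda lo, hi: [lo, hi, 0.5 * (lo + hi)]

box_far = (np.array([3.0, 3.0]), np.array([3.5, 3.5]))    # disjoint from D
box_near = (np.array([0.0, 0.0]), np.array([0.5, 0.5]))   # meets D
far_deleted = certainly_infeasible([g], [1.0], *box_far, sample_of(*box_far))
near_deleted = certainly_infeasible([g], [1.0], *box_near, sample_of(*box_near))
```

The far box is deleted with certainty, while the near box is kept, matching the remark that the rules remove only infeasible sets but not necessarily all of them.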

Outer Approximation
Consider our standard global optimization problem

min f(x), s.t. x ∈ D ,   (18)

where f: ℝ^n → ℝ is continuous and D is a closed subset of ℝ^n defined by a finite number of inequalities g_j(x) ≤ 0 (j=1,...,m). Assume that min f(D) exists. Outer approximation (relaxation) methods used in global optimization proceed essentially according to the following simple basic scheme.
Initialization: Determine a closed set D_1 ⊂ ℝ^n satisfying D_1 ⊃ D. Set k ← 1.
Iteration k (k=1,2,3,...): Solve the relaxed problem

(P_k)  min f(x), s.t. x ∈ D_k ,

obtaining a solution x^k. If x^k ∈ D, then stop: x^k solves problem (18). Otherwise, construct a constraint function ℓ_k: ℝ^n → ℝ satisfying ℓ_k(x) ≤ 0 for all x ∈ D and ℓ_k(x^k) > 0. Let D_{k+1} = D_k ∩ {x: ℓ_k(x) ≤ 0}, set k ← k+1, and go to the next iteration.
It is supposed, of course, that the subproblems (P_k) can be solved with reasonable computational effort. The set H_k := {x: ℓ_k(x) = 0} that strictly separates x^k from D is called a "cut". Note that, in contrast to other cut methods that are frequently applied for solving certain linearly constrained global and integer programming problems, it follows from the above that no part of the feasible region D is ever cut off (outer approximation). Until very recently, only linear cuts have been used in global optimization: Let all functions g_j(x) be convex (j=1,...,m) and define


g(x) = max {g_j(x): j=1,...,m}. Suppose that a point y^0 satisfying g(y^0) < 0 is known. In the basic scheme above, let the cuts be defined by the affine functions

ℓ_k(x) = ⟨p^k, x − y^k⟩ + g(y^k) ,

where y^k ∈ D_1 is suitably chosen and p^k ∈ ∂g(y^k) is a subgradient of g at y^k. The proposed methods differ from each other by the choice of y^k. A main field of application is concave minimization, where the subproblems (P_k) are solved by vertex enumeration of D_k. The transition from D_k to D_{k+1}, where D_{k+1} is obtained from D_k by a cut, leads to the subproblem of determining the vertex set of D_{k+1} from that of D_k. This can be done by pivoting procedures similar to the simplex techniques. The related question of detecting redundant constraints has been solved quite satisfactorily (6). Other recent applications of outer approximation methods concern d.c. programming and even Lipschitz optimization, where nonlinear cuts ℓ_k(x) have been proposed. Convergence of outer approximation methods is guaranteed if the sequence of cut functions ℓ_k satisfies certain convergence conditions (for example, continuous convergence, equicontinuity, etc.) which can be enforced for a large class of cut functions.
Another question of interest with outer approximation that has attracted much research is that of constraint dropping strategies in the following sense. In the basic version of an outer approximation method presented above, at each iteration a new constraint is added to the set of existing constraints, but no constraint is ever deleted. Consequently, the size of the subproblems to be solved in each iteration is strictly increasing with the number of iterations. It is therefore important to develop devices that allow dropping certain previous constraints. One variant of recently developed constraint dropping strategies is as follows. Let H_k := {x: ℓ_k(x) ≤ 0}.

Then, in the basic scheme described above, we have

D_{k+1} = D_1 ∩ ( ∩_{i=1}^{k} H_i ) .   (19)

Dropping certain constraints defining D_{k+1} amounts to finding a


strict subset I_k of {1,...,k} such that replacing (19) by D_{k+1} = D_1 ∩ ( ∩_{i∈I_k} H_i ) defines a convergent outer approximation procedure.
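A minimal sketch of the basic outer approximation scheme with linear cuts, using scipy's linprog to solve the relaxed problems (P_k); the ball constraint, the box D_1 and the tolerance are illustrative assumptions:

```python
import numpy as np
from scipy.optimize import linprog

# Outer approximation for min c.x over D = {x: g(x) <= 0}, with
# g(x) = ||x||^2 - 1 (convex, differentiable); D_1 is the box [-2, 2]^2.
c = np.array([1.0, 1.0])
g = lambda x: x @ x - 1.0
grad = lambda x: 2.0 * x                       # gradient (= subgradient) of g

A, b = [], []                                  # accumulated cuts  A x <= b
for _ in range(300):
    res = linprog(c, A_ub=np.array(A) if A else None,
                  b_ub=np.array(b) if A else None,
                  bounds=[(-2.0, 2.0)] * 2)
    xk = res.x
    if g(xk) <= 1e-3:                          # x_k (nearly) feasible: stop
        break
    p = grad(xk)                               # cut l_k(x) = <p, x - x_k> + g(x_k),
    A.append(p)                                # i.e.  p.x <= p.x_k - g(x_k)
    b.append(p @ xk - g(xk))

# xk approximates the minimizer (-1/sqrt(2), -1/sqrt(2)); since each
# (P_k) is a relaxation, c.xk is always a lower bound on min over D.
```

Because no part of D is ever cut off, every iterate yields a valid lower bound, and the iterates approach the boundary of D from outside.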

Let p: D_1 → ℝ be a continuous penalty function, i.e., we suppose that p(x) > 0 … 1 is a prefixed number chosen as large as convenient. The idea behind this construction is to exploit the concavity of f in order to find on each edge of M the farthest point y^i with the property that on the line segment [0, y^i], f(x) is not better than α. Now let Y be the n×n matrix with columns y^i (i=1,...,n) and consider the closed halfspace not containing 0 that is generated by the


hyperplane through y^1,...,y^n. This halfspace is defined by

x = Yλ, λ ∈ ℝ^n, Σ_{i=1}^{n} λ_i ≥ 1 ,   (22)

or

e Y^{−1} x ≥ 1 ,   (23)

where e = (1,...,1) ∈ ℝ^n. It follows from the concavity of f that f(x) ≥ α on the simplex defined by its vertices 0, y^1,...,y^n, and hence that

ℓ(x) = 1 − e Y^{−1} x ≤ 0

defines a cut in the sense discussed above. Moreover, if we have P ⊂ {x: ℓ(x) ≥ 0}, then our vertex 0 is the global minimum we are seeking. Checking whether this is the case can be done by linear programming techniques, which also yield lower bounds of f on M ∩ P. This idea, suitably incorporated in a cone-splitting branch and bound procedure, provides a quite efficient algorithm for solving linearly constrained concave minimization problems. Modifications of this idea have been successfully applied to problems with reverse convex constraints and also in integer programming (cf. 6).

Present Lines of Research and Computational Results
Present lines of research in constrained global optimization are focussing on the following four topics: solving large classes of global optimization problems by a

sequence of linear programs and line searches,

decomposition and projection methods for certain large scale

problems,

parallel implementation,

integer global programming.

Very recently, clever combinations of branch and bound, outer

approximation and cutting plane methods have led to algorithms for solving concave minimization and d.c. programming problems that avoid the most time-consuming subroutines of the pure branch and bound or outer approximation approaches. Only linear programs and one-dimensional convex minimization problems (line searches) have to be solved (for details, see 6). Another line of present research deals with global optimization problems of the form

min f(x) + c^T y, s.t. (x,y) ∈ P_x × P_y ,   (24)

where P_x and P_y are polytopes in ℝ^n and ℝ^p, respectively. Very often


f(x) is concave or d.c., and p is much larger than n. This type of problem reflects the often encountered fact that many variables enter a problem in a linear way, whereas only a few variables make up its nonconvex (global) aspect. Several decomposition and projection methods have been proposed that take the structure of (24) into account. In most of these methods, the time-consuming "global" subroutines can be restricted to the space ℝ^n of few variables, whereas in ℝ^{p+n} only linear programs have to be solved (cf., e.g., 6, 10, 11). Since most methods involve certain partition procedures of the feasible set, where in each partition set M a certain subroutine for obtaining a lower bound on min f(M) must be carried out, it is natural to consider parallel implementations that run the corresponding bounding routines in several partition sets at the same time (cf. 6, 10, 11 and references therein). Finally, global methods have recently been applied to solve nonlinear integer programming problems (e.g., 10 and references therein). Computational studies in global optimization are less advanced than theoretical research, and we believe that, in the near future, much effort should go into computationally oriented research. The many future tasks include development of generally acknowledged sets of test problems for each of the main classes of global problems where algorithms have been proposed; numerical comparisons of the various methods; further incorporation of devices such as decomposition, local steps, and heuristics in order to handle large problems; and further research on parallel computation. The most significant numerical results have been obtained for the problem of minimizing the sum of a concave or indefinite quadratic function and a linear function subject to linear constraints. This problem, along with various of its special forms, has been extensively investigated by J.B. Rosen and his students.
The average computation time of the most efficient algorithm on a four-processor CRAY 2 computer, with an error tolerance of 10^{−3}, on problems having 25 variables in the quadratic term and 400 variables in the linear term of the objective function was 15 seconds. A comparative study of three algorithms for linearly constrained concave minimization problems by Horst and Thoai has shown that problem sizes up to 50 variables and 30 constraints can be readily solved on a PC. The expected superiority of the linear programming / line search approach for the nonlinearly constrained concave minimization problem has been confirmed by test series of Horst, Thoai and Benson (cf. 6, 10 and references therein).


Several numerical tests are also reported on stochastic methods (e.g., 2, 3, 4).

REFERENCES
1. Rubinstein, R.Y. (1981). Simulation and the Monte Carlo Method. New York: John Wiley & Sons.
2. Rinnooy Kan, A.H.G. and Timmer, G.T. (1989). Global Optimization: A Survey. International Series of Numerical Mathematics, 87, 133-155.
3. Törn, A. and Žilinskas, A. (1989). Global Optimization. Lecture Notes in Computer Science, 350, Berlin: Springer.
4. Mockus, J. (1989). Bayesian Approach to Global Optimization. Dordrecht: Kluwer Academic Publishers.
5. Garcia, C.B. and Zangwill, W.I. (1981). Pathways to Solutions, Fixed Points and Equilibria. Englewood Cliffs: Prentice-Hall.
6. Horst, R. and Tuy, H. (1990). Global Optimization: Deterministic Methods. Berlin: Springer.
7. McCormick, G.P. (1983). Nonlinear Programming: Theory, Algorithms and Applications. New York: John Wiley.
8. Ratschek, H. and Rokne, J. (1988). New Computer Methods for Global Optimization. Chichester: Ellis Horwood.
9. Horst, R. (1984). On the Global Minimization of Concave Functions: Introduction and Survey. Oper. Res. Spektrum, 6, 195-205.
10. Horst, R. (1990). Deterministic Methods in Constrained Global Optimization: Some Recent Advances and New Fields of Application. Naval Res. Logistics, 37, 433-471.
11. Pardalos, P.M. and Rosen, J.B. (1987). Constrained Global Optimization: Algorithms and Applications. Lecture Notes in Computer Science, 268, Berlin: Springer.
12. Jongen, H.Th., Jonker, P. and Twilt, F. (1986). Nonlinear Optimization in ℝ^n. Frankfurt: Peter Lang.
13. Zielinski, R. and Neumann, P. (1983). Stochastic Methods for Seeking the Minimum of a Function (in German). Berlin: Akademie-Verlag.

TWO-LEVEL RESOURCE CONTROL PRE-EMPTIVE HIERARCHICAL LINEAR PROGRAMMING PROBLEM: A REVIEW

SUBHASH C. NARULA
School of Business, Virginia Commonwealth University, 1015 Floyd Avenue, Richmond, VA 23284-4000 USA

ADIELE D. NWOSU
Department of Mathematics, University of Nigeria, Nsukka, Nigeria

Abstract: Multilevel pre-emptive hierarchical programming problems arise whenever control over the decision variables is partitioned among several independent decision makers representing various levels of an organization, and the decisions are made not simultaneously and in concert but in sequence. Because of the inherent nonconvexity of the problem, it is not an easy matter to find an optimal solution even for a bilevel linear programming problem. Our objective is to review the literature on the multilevel pre-emptive hierarchical programming problem.

Keywords: Bilevel programming; Bicriterion programming; Multicriteria programming.

1. INTRODUCTION

Consider an organization in which control over the decision variables is partitioned among a hierarchy of independent decision makers, and where decisions are made not simultaneously and in concert but in sequence down the hierarchy (e.g., the hierarchy of federal, state and local governments in a democracy). As the objective functions and constraints of the system are, in general, functions of all decision variables, the independent actions of lower-level decision makers often impact adversely on the objective function values of higher-level decision makers. However, the sequential nature of the decision process enables a higher-level decision maker to use his superior rank to unilaterally and pre-emptively set the values of the decision variables under his control, in an attempt to optimize his objective function within the overall constraint set and in cognizance of the consequent actions of lower-level decision makers. We shall call such problems pre-emptive hierarchical programming (PHP) problems.

DOI: 10.1201/9780429333439-2

Although a project planning exercise in any organization that has a clearly defined chain of command and practices decentralized and sequential decision making may be modelled as a PHP problem, it is in the public domain that most examples of decentralized decision-making problems have been reported. For example, in a democracy the federal (or central) government, the highest-level decision maker, can influence but not dictate to independent lower-level economic subunits within the system (e.g., state and local governments, corporations and individual households) by pre-emptively setting policy. This is usually done by implementing a package of quotas, taxes and/or subsidies. PHP models have been used to model policy questions in industrial pollution control (1), project selection (2), agricultural (or other) sector strategic planning (3), and regional development (4). Hierarchical programming problems also occur in economic arrangements that involve sharing and incentives, and can be described in terms of principal and agent relationships. The principal's problem in principal-agent theory may be modelled as a PHP problem (5), for which Shavell (6) and Srinivasan (7) give conditions under which Pareto-optimal solutions exist.
The pre-emptive hierarchical programming problem is currently discussed in the literature under the titles "multilevel programming" or "hierarchical programming". However, the same phrases describe the result of decomposition techniques applied to large-scale systems with a single decision maker (or possibly a committee) who has a single objective function and controls all the decision variables (8,9). Such a decomposition produces several subproblems, each with its own objective function and constraint set. Even if these subproblems correspond to decision-making components of the organization, the decision process is not partitioned among a hierarchy of independent decision makers. Also, the decision maker or committee obtains an overall system plan by solving a new system-wide problem derived from the subproblem solutions.


Thus, "the decomposition obtained here cannot be viewed as complete decentralization of the decision-making process. A better term would be 'centralized planning without complete information at the center'" (8). Furthermore, the philosophy of multilevel decomposition techniques envisages an iterative process involving the coordinator/decision maker repeatedly sending out data and receiving feedback with which to upgrade the current system solution. The ultimate equilibrium solution is therefore approached asymptotically, if at all.

We shall put PHP in a sharper perspective by comparing it to the following established mathematical programming and other techniques and theories having related problem structure, viz., pre-emptive goal programming, multiobjective programming, game theory and equilibrium programming.

Pre-emptive goal programming, multiobjective programming and PHP problems all deal with problems having multiple objective functions. In practice, conflicts exist among these multiple objectives, as each objective function will, in general, attain its optimum at a different point in the decision space. Both goal programming and multiobjective programming assume the objective functions to be those of a single autocratic decision maker or, at worst, those of a collection of harmonious decision makers. On the other hand, in PHP these objective functions are those of independent decision makers at various levels of an organization. Each decision maker optimizes his objective function without considering the effect of his actions on any other decision maker. Even more important, control of the decision variables is partitioned among the decision makers, a concept that is meaningless in both goal programming and multiobjective programming. The optimizations in PHP are not simultaneous but, beginning at the highest level, proceed sequentially down the administrative hierarchical decision structure.
Each decision maker, aiming only to optimize his own objective function, sets the values of the decision variables under his control after the decision makers at higher levels have fixed theirs. This is user-optimization, not system-optimization. As a result, the solution to a PHP problem can exhibit gross system sub-optimality in the sense that there could be another point that satisfies all the constraints and yields a higher payoff for each of the decision makers, and yet that point is unattainable in the absence of cooperation or coalition among some of the decision makers (10,11). Thus, the solution need not be Pareto-optimal.

In pre-emptive goal programming the objective functions are ordered in a preference hierarchy and optimized in sequence (12,13). First, the most preferred function is optimized over the entire decision space. The second most preferred is next optimized subject to the value of the first remaining at its optimum. Each subsequent optimization for the next function maintains the optimal values of the functions preceding it in the preference hierarchy.

The solution techniques for multiobjective programming problems seek Pareto-optimal solutions. This is equivalent to optimizing some convex combination of the often incommensurable objective functions (14), and thus to seeking a weighted compromise among the competing objectives. The decision maker (or the group performing that role) therefore chooses the most preferred one from the set of all compromise points. From the foregoing we see that each of the three methods is characterized by the manner in which it attempts to model and resolve the inherent conflicts among the objective functions.

In game theory the usual assumptions are that all players move simultaneously and that their strategy spaces are independent and disjoint. However, it is possible to prescribe a game requiring sequential play, provided each player employs a Stackelberg strategy (15). Although the decision variables are thus partitioned among the players (decision makers), as required for the PHP problem, current game theory does not consider the case where the payoff as well as the strategy space of a player could be altered by the actions of other players.

Equilibrium programming is the model closest to the PHP model. There is only one major difference between the two: in equilibrium programming all decisions are made simultaneously, and that effectively destroys any hierarchy that may otherwise exist (16).
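The sequential optimization used in pre-emptive goal programming (optimize the most preferred function over the whole decision space, then optimize the next one subject to the first staying at its optimum) can be sketched in a few lines. The feasible set and the two objectives below are toy data of our own choosing, not from the chapter; only the mechanics of the two-stage procedure are illustrated.

```python
# Toy feasible set: lattice points with x + y <= 6, 0 <= x, y <= 5.
feas = [(x, y) for x in range(6) for y in range(6) if x + y <= 6]

f1 = lambda p: p[0] + p[1]        # most preferred objective
f2 = lambda p: p[0]               # second objective in the preference order

# Stage 1: optimize f1 over the entire decision space.
best1 = max(f1(p) for p in feas)

# Stage 2: optimize f2 subject to f1 remaining at its optimal value.
sol = max((p for p in feas if f1(p) == best1), key=f2)

print(sol)   # -> (5, 1)
```

Note that, unlike PHP, both stages here serve a single decision maker's preference order; no control over variables is partitioned.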
In the class of pre-emptive hierarchical programming problems, the series resource control bilevel PHP problem with linear constraints and objective functions has attracted much attention in the literature. The qualifier "series" means that there is one decision maker at each level. A number of algorithms have been proposed to solve this problem. Our objective is to review the literature on the pre-emptive hierarchical programming problem. In Section 2 we give the problem formulation for the bilevel linear pre-emptive hierarchical programming problem. In Section 3 we report on the available algorithms to solve the bilevel PHP problem. We conclude the paper with a few remarks in Section 4.

2. PROBLEM FORMULATION

Now we give the mathematical programming formulation of the series bilevel PHP problem and point out some of the important features and difficulties in solving it. Consider an organization that has two decision-making levels with one decision maker at each level, i.e., an organization with a series two-level hierarchy. Let level 1 be the higher level and let DMi denote the decision maker at level i, i = 1, 2. Suppose that the n decision variables are partitioned between the two decision makers such that DM1 controls x' = (x1, x2, ..., xn1) and DM2 controls y' = (y1, y2, ..., yn2), where n1 + n2 = n. Let DMi maximize the function fi: R^n --> R^1 for i = 1, 2. We use the following standard notation:

max {g(a, c) : (a|c)} = max over a of {g(a, c)} for fixed c,

in which (a|c) is replaced by (a) when the value of c is not pre-set. The series two-level sequential decentralized decision-making problem can be formulated as the following two-level PHP problem:

(PHP)   P1: max {f1(x, y) : (x)},
            subject to
        P2: max {f2(x, y) : (y|x)},
            subject to (x, y) ∈ S,

where S denotes the constraint region and Pi denotes the problem to be solved by DMi, i = 1, 2. The vector (x0, y0) is a solution of PHP if y0 solves P2 with x = x0, and (x0, y0) solves P1. For convenience we follow the established practice of writing PHP as follows:

        max {f1(x, y) : (x)},
        max {f2(x, y) : (y|x)},
        subject to (x, y) ∈ S.

If fi, i = 1, 2, and S are all linear, we obtain the resource control bilevel PHP problem, RCP:

(RCP)   max {f1(x, y) = a1x + a2y : (x)},
        max {f2(x, y) = d1x + d2y : (y|x)},
        subject to A1x + A2y ≤ b,  x ≥ 0,  y ≥ 0,

where a1 and d1 are n1-vectors; a2 and d2 are n2-vectors; b is an m-vector; Ai is an m × ni matrix, i = 1, 2, and


S = {(x', y')' : A1x + A2y ≤ b; x ≥ 0, y ≥ 0} denotes the constraint region of the problem. The solution space of DM1 is P = {x : there is a y such that A1x + A2y ≤ b}, and the solution space of DM2 for a given x0 is S(x0) = {y : A2y ≤ b - A1x0}. Let Y(x) denote the set of optimal solutions to the problem of DM2, max {d2y : y ∈ S(x)}, for each fixed x ∈ P. Then the problem of DM1 can be stated as:

max {a1x + a2y : x ∈ P, y ∈ Y(x)} = max {a1x + a2y : (x, y) ∈ S, y ∈ Y(x)}.

It is usually assumed that the constraint space S is bounded and that a unique solution exists for the lower-level decision maker for any feasible value of the decision variables of the upper-level decision maker. The unique response assumption is necessary to avoid philosophical difficulties in the definition of a solution; the boundedness assumption, on the other hand, is a mere convenience. The notions of feasibility and optimality for a bilevel PHP can be defined as follows:

Definition 1: A point (x0, y0) is said to be a feasible solution for RCP if x0 ∈ P and y0 ∈ Y(x0).

Definition 2: A point (x*, y*) is said to be an optimal solution for RCP if
(i) (x*, y*) is feasible,
(ii) a1x* + a2y* is unique for all y* ∈ Y(x*), and
(iii) a1x* + a2y* ≥ a1x0 + a2y0 for all feasible pairs (x0, y0) ∈ S.

The problem RCP, even with linear objective functions and constraints, is non-convex. Observe that for a given x, the lower-level decision maker DM2 solves the following problem:

(PL2)   max over y of {f2(x, y) = d1x + d2y}
        subject to A2y ≤ b - A1x,  y ≥ 0.

Let S̄(x) be the solution set of PL2 for each fixed x. Then S̄ = ∪ S̄(x), the union taken over x ∈ P, is the set of optimal responses of DM2 to the actions of DM1. The set S̄ is piecewise linear and hence non-convex (11,17). However, it may be decomposed into non-overlapping but connected hyperplanes (11,18), Lj of S̄, j = 1, 2, ..., q, where q ≤ min{m, n2}. Now the problem of DM1, or equivalently RCP, may be written as PL1:

(PL1)   max over x of {f1(x, y) = a1x + a2y}
        subject to (x, y) ∈ S̄.

PL2 is a linear programming problem; PL1 is not, since S̄ is non-convex. The problems PL1 and PL2 are often referred to as the policy and behavioral problems, respectively. Further, since each objective function depends on all the variables and the decision makers are independent, the actions of the lower-level decision maker may impact adversely on the objective function value of DM1. As mentioned earlier, in the absence of cooperation between the two decision makers, the solution to the problem need not be Pareto-optimal. We illustrate these points with an example.

Example: Consider the following bilevel PHP problem:

        max {f1(x, y) = x + y : (x)},
        max {f2(x, y) = -2x - y : (y|x)},
        subject to  x - y ≤ 4,
                    4x + y ≤ 21,
                    2x + 3y ≤ 18,
                    -2x + 3y ≤ 6,
                    x, y ≥ 0.

Each decision maker controls exactly one decision variable. The constraint set S for the problem is given in Figure 1. Now if DM1 fixes the value of his control variable x at some feasible value, the feasible region of DM2 is given by the line AF; that is, DM2 has to optimize his objective function f2(x, y) over AF. The point F is optimal for him (whereas the optimal point for DM1, with x so fixed, is at A). Clearly, f1(F) < f1(A), and the action of DM2 has reduced the attainable f1 value. In particular, when x = 9/2, the problem solution is not at C(9/2, 3), the optimal point in S for f1(x, y), but at E(9/2, 1/2), which has a lower f1(x, y) value. The global optimal solution is at D(5, 1), with objective function values (f1(x, y), f2(x, y)) = (6, -11). Note that this solution is not Pareto-optimal, since the point B(3, 4) yields a better objective function value for both decision makers. But, in the absence of cooperation, the latter point is unattainable.
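The sequential play in this example can be checked numerically. The sketch below is our own illustration (the chapter treats the example geometrically): the leader scans his variable x on a quarter-step grid, the follower's LP for each fixed x is solved by picking the smallest feasible y (since f2 = -2x - y decreases in y), and the leader then keeps the best follower-rational point. Exact rational arithmetic avoids rounding issues.

```python
from fractions import Fraction as F

# Leader (DM1) maximizes f1 = x + y; follower (DM2) maximizes f2 = -2x - y.
def feasible(x, y):
    return (x - y <= 4 and 4*x + y <= 21 and 2*x + 3*y <= 18
            and -2*x + 3*y <= 6 and x >= 0 and y >= 0)

def follower_best(x):
    # DM2's problem for fixed x: maximize -2x - y, i.e. take the smallest
    # feasible y; a quarter-step grid is exact enough for this example.
    feas = [F(k, 4) for k in range(0, 25) if feasible(x, F(k, 4))]
    return min(feas) if feas else None

best = None
for k in range(0, 22):                 # leader scans x = 0, 1/4, ..., 21/4
    x = F(k, 4)
    y = follower_best(x)
    if y is not None and (best is None or x + y > best[0]):
        best = (x + y, x, y)

print(best)   # the bilevel (user-optimal) solution: f1 = 6 at x = 5, y = 1

# B(3, 4) is feasible and strictly better for both players than D(5, 1),
# so the bilevel solution is not Pareto-optimal.
assert feasible(3, 4)
assert 3 + 4 > best[0] and -2*3 - 4 > -2*best[1] - best[2]
```

The grid is deliberately coarse; it happens to contain the exact solution because all vertices of S in this example have quarter-integer coordinates.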

Figure 1: The constraint set S of the example, with the coordinates of the labelled points, the corresponding values of (f1(x, y), f2(x, y)), and the line f2(x, y) = -2x - y. [Graphic not recoverable from the source.]

... ≥ 0, where H is any negative definite matrix and ε is a suitably small positive scalar, so that the added term leads only to small perturbations of the problem KTP1. In the implementation of the algorithm they found that small values of ε, relative to the absolute average parameter value, rarely changed the final solution. Once the solution for KTP3 is obtained, the value of α is increased parametrically until the current complementary basis for KTP3 is no longer feasible. Then the PCP algorithm finds another complementary basis to satisfy KTP3. The algorithm terminates when α cannot be increased.

Bard (10) considered the following parametrized linear programming problem:

(KTP4)  max λ(a1x + a2y) + (1 - λ)d2y
        subject to A1x + A2y ≤ b,  x, y ≥ 0,

for λ ∈ (0, 1], and showed that there exists a λ* ∈ (0, 1] such that the corresponding solution (x*, y*) of KTP4 is both feasible and optimal for RCP. He presented a grid search algorithm (GSA) that is guaranteed to converge finitely to λ* under certain nondegeneracy assumptions. He compared five algorithms, viz., the K-th Best procedure (17), the implicit search method (21), the branch-and-bound algorithm (24), the complementary pivot algorithm (26), and his own GSA. The results of the study showed that GSA outperformed the others.

Recently there has been interest in exploring the relationship between the bilevel PHP problem and the bicriteria programming problem (10,28,29). Consider the bicriteria programming problem:

(BCP)  max over x, y of {a1x + a2y, d1x + d2y}
       subject to (x, y) ∈ S.

Definition 3: A point (x*, y*) ∈ R^n is a Pareto-optimal solution to BCP if
(i) (x*, y*) is feasible, and
(ii) there exists no other (x, y) ∈ S such that
     a1x + a2y ≥ a1x* + a2y*,
     d1x + d2y ≥ d1x* + d2y*,

with at least one inequality holding as a strict inequality.

One common way to find Pareto-optimal solutions to BCP is to solve the parameterized linear programming problem

max {λ(a1x + a2y) + (1 - λ)(d1x + d2y) : (x, y) ∈ S}

for λ ∈ [0, 1]. Bard (10) concluded that the solution to the bilevel PHP problem is a Pareto-optimal solution of a modified version of the associated bicriteria programming problem, namely

max {a1x + a2y, d2y : (x, y) ∈ S}.

The reader may observe this from KTP4. Based on this observation, Unlu (28) proposed an adaptation of an algorithm for the bicriteria programming problem to find a solution to the bilevel PHP problem. Her computational results indicated that the proposed algorithm performed even better than GSA. However, using a counterexample, Candler (30) showed that Theorem 2, due to Bard (10) and used in Unlu (28), was in error. More recently, Wen and Hsu (29) pointed out that bicriteria programming is not suitable for searching for an optimal solution for all bilevel PHP problems; however, they proposed a sufficient condition under which a bicriteria programming algorithm can be used to solve a bilevel PHP problem.
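The λ-parametrization can be made concrete on the example of Section 2. The sketch below is our own illustration, not code from any of the cited papers: it enumerates the vertices of S exactly, scans a grid of λ values, and keeps the KTP4-style maximizer that is rational for the follower and best for the leader. (For this example, a1 = a2 = 1 and d2 = -1, and the follower's best response is y = max(0, x - 4).)

```python
from fractions import Fraction as F
from itertools import combinations

# Constraints of the Section 2 example, each written as a*x + b*y <= c
# (the bounds x >= 0, y >= 0 are included as -x <= 0 and -y <= 0).
cons = [(F(1), F(-1), F(4)), (F(4), F(1), F(21)), (F(2), F(3), F(18)),
        (F(-2), F(3), F(6)), (F(-1), F(0), F(0)), (F(0), F(-1), F(0))]

def feasible(x, y):
    return all(a * x + b * y <= c for a, b, c in cons)

# Vertices of S: feasible intersections of pairs of constraint boundaries.
verts = set()
for (a1, b1, c1), (a2, b2, c2) in combinations(cons, 2):
    det = a1 * b2 - a2 * b1
    if det != 0:
        x = (c1 * b2 - c2 * b1) / det       # Cramer's rule
        y = (a1 * c2 - a2 * c1) / det
        if feasible(x, y):
            verts.add((x, y))

def follower_y(x):
    # DM2 maximizes f2 = -2x - y, so his best response here is the
    # smallest feasible y, namely y = max(0, x - 4).
    return max(F(0), x - 4)

best = None
for k in range(1, 100):                     # lam on a grid inside (0, 1]
    lam = F(k, 100)
    # Objective lam*(a1x + a2y) + (1 - lam)*d2y with a1 = a2 = 1, d2 = -1.
    v = max(verts, key=lambda p: lam * (p[0] + p[1]) - (1 - lam) * p[1])
    if v[1] == follower_y(v[0]):            # maximizer is follower-rational
        f1 = v[0] + v[1]
        if best is None or f1 > best[0]:
            best = (f1, v, lam)

print(best[0], best[1][0], best[1][1])      # -> 6 5 1
```

For λ roughly between 1/3 and 4/7 the parametrized LP is maximized at D(5, 1), the bilevel optimum; for larger λ the maximizer C(9/2, 3) is rejected because it is not a best response of the follower.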

4. SOME REMARKS

Much progress has been made in understanding the pre-emptive hierarchical programming problem and in finding efficient solution procedures for it. However, most of the theoretical research has been conducted, and computational procedures have been developed, for the bilevel linear PHP problem. Some results have been reported on other PHP problems. Wen and Bialas (31) have developed a hybrid algorithm that combines the K-th Best algorithm (17) and the complementary pivot algorithm (26) to solve the three-level linear programming problem. Recently, Wen and Yang (32) have proposed one heuristic and one exact algorithm to solve the mixed-integer bilevel linear programming problem in which the decision variables controlled by the higher-level decision maker DM1 are zero-one variables. Their computational results show that the heuristic procedure finds near-optimal solutions, and that the computational time of the exact method increases exponentially with the number of zero-one variables.

It is hoped that this review will encourage more research on the subject.

REFERENCES

1. Schenk, G. R., Bialas, W. F. and Karwan, M. H. (1980) A multi-level programming model to determine optimal effluent control policies. Working Paper, Department of Industrial Engineering, SUNY at Buffalo, Buffalo, New York, USA.
2. Cassidy, R. G., Kirby, M. J. L. and Raikes, W. M. (1971) Efficient distribution of resources through three levels of government. Management Science, 17, 462-473.
3. Candler, W., Fortuny-Amat, J. and McCarl, B. (1981) The potential role of multi-level programming in agricultural economics. American Journal of Agricultural Economics, 63, 521-533.
4. Seo, F. and Sakawa, M. (1980) Evaluation for industrial land-use program related to water quality management. Working Paper 80-49, International Institute for Applied Systems Analysis, Laxenburg, Austria.
5. Holmstrom, B. (1979) Moral hazard and observability. Bell Journal of Economics, 10, 74-91.
6. Shavell, S. (1979) Risk sharing and incentives in the principal and agent relationship. Bell Journal of Economics, 10, 55-73.
7. Srinivasan, V. (1981) An investigation of the equal commission rate policy for a multi-product sales force. Management Science, 27, 731-756.
8. Lasdon, L. S. (1970) Optimization Theory for Large Systems, p. 152. New York: The MacMillan Co.
9. Dirickx, Y. M. I. and Jennergren, L. P. (1979) Systems Analysis by Multi-level Methods: with Applications to Economics and Management. Chichester: John Wiley and Sons.
10. Bard, J. F. (1983) An efficient point algorithm for a linear two-stage optimization problem. Operations Research, 31, 670-684.
11. Nwosu, A. D. (1983) Pre-emptive hierarchical programming problem: A decentralized decision model. Ph.D. dissertation, Operations Research and Statistics, Rensselaer Polytechnic Institute, Troy, New York, USA.
12. Ignizio, J. P. (1976) Goal Programming and Extensions. Lexington: D. C. Heath and Co.


13. Cohon, J. L. (1978) Multi-objective Programming and Planning. New York: Academic Press.
14. Isermann, H. (1974) Proper efficiency and the linear vector maximum problem. Operations Research, 22, 189-191.
15. Simaan, M. and Cruz, J. B. (1973) On the Stackelberg strategy in non-zero sum games. Journal of Optimization Theory and Applications, 11, 533-555.
16. Zangwill, W. I. and Garcia, C. B. (1981) Pathways to Solutions, Fixed Points and Equilibria. Englewood Cliffs: Prentice-Hall, Inc.
17. Bialas, W. F. and Karwan, M. H. (1982) On two-level optimization. IEEE Transactions on Automatic Control, AC-27, 211-214.
18. Wen, U. (1981) Mathematical methods for multi-level linear programming. Ph.D. dissertation, Department of Industrial Engineering, SUNY at Buffalo, Buffalo, New York, USA.
19. Falk, J. E. (1973) A linear max-min problem. Mathematical Programming, 5, 169-188.
20. Candler, W. and Norton, R. (1976) Multilevel programming. Research Memorandum, Development Research Center, World Bank, Washington, D. C., USA.
21. Candler, W. and Townsley, R. (1982) A linear two-level programming problem. Journal of Computers and Operations Research, 9, 59-76.
22. Bard, J. F. (1984) Optimality conditions for the bilevel problem. Naval Research Logistics Quarterly, 31, 13-26.
23. Narula, S. C. and Nwosu, A. D. (1983) Two-level hierarchical programming problem. In Multiple Criteria Decision Making - Theory and Application, edited by P. Hansen, pp. 290-299. New York: Springer-Verlag.
24. Bard, J. F. and Falk, J. E. (1982) An explicit solution to the multi-level programming problem. Journal of Computers and Operations Research, 9, 77-100.
25. Fortuny-Amat, J. and McCarl, B. (1981) A representation and economic interpretation of a two-level programming problem. Journal of the Operational Research Society, 32, 783-792.
26. Bialas, W. F., Karwan, M. H. and Shaw, J. P. (1980) A parametric complementary pivot approach for two-level linear programming. Research Report No. 80-2, Dept. of Industrial Engineering, SUNY at Buffalo, Buffalo, New York, USA.
27. Bialas, W. F. and Karwan, M. H. (1984) Two-level linear programming. Management Science, 30, 1004-1020.


28. Unlu, G. (1987) A linear bilevel programming algorithm based on bicriteria programming. Journal of Computers and Operations Research, 14, 173-179.
29. Wen, U. P. and Hsu, S. T. (1989) A note on a linear bilevel programming algorithm based on bicriteria programming. Journal of Computers and Operations Research, 16, 79-83.
30. Candler, W. (1988) A linear bilevel programming algorithm: A comment. Journal of Computers and Operations Research, 15, 297-298.
31. Wen, U. P. and Bialas, W. F. (1989) The hybrid algorithm for solving the three-level linear programming problem. Journal of Computers and Operations Research, 13, 367-377.
32. Wen, U. P. and Yang, Y. H. (1990) Algorithms for solving the mixed integer two-level linear programming problem. Journal of Computers and Operations Research, 17, 133-142.

SOME RECENT DEVELOPMENTS IN INFINITE PROGRAMMING

A.B. PHILPOTT Department of Engineering Science University of Auckland, Private Bag, Auckland, NZ

Abstract: Infinite programming is concerned with optimization problems in which the number of variables and the number of constraints are both possibly infinite. If either the constraints or the variables are finite in number, then an infinite program is called a semi-infinite program. In this paper we review some recent work which has been carried out in infinite programming, in particular in the fields of semi-infinite linear programming and continuous-time linear programming. Some discussion is made of theoretical issues such as the existence of solutions and duality theory, but we put a greater emphasis on endeavouring to summarize and explain some of the algorithmic developments in these areas.

Keywords: Infinite programming, semi-infinite programming, continuous linear programming

1. INTRODUCTION

DOI: 10.1201/9780429333439-3

Infinite programming (or infinite-dimensional programming) is concerned with mathematical programming problems in which the number of variables and the number of constraints are both possibly infinite. For example, one might consider the problem:

minimize  c^T x
subject to  a(s)^T x ≥ b(s),   s ∈ S,          (1)

where a and b are vectors in R^n, both of which are parametrized by some variable s which lies in the index set S. This is just a linear program if S is a finite set, but is an infinite linear program otherwise. Infinite programs arise naturally in many contexts in applied mathematics [1], engineering [2], economics [3], and operations research. In particular, many finite-dimensional (convex) mathematical programs can in principle be formulated as optimization problems with an infinite number of linear inequality constraints, although it is usually not advantageous to do so.

The subject of infinite programming has a long history. At first sight it could be considered to be almost as old as classical linear programming (see [4] or [5] for two early papers on the subject), or very much older if one considers the work of Monge [6], Appell [7,8], and Kantorovich [9] on an infinite-dimensional version of the transportation problem. Notwithstanding this long history, in this paper we shall be concerned with recent developments in the subject, by which we mean developments which have occurred in the last decade. Since infinite programming encompasses as a special case much of finite-dimensional mathematical programming, there is much one could write about, and any attempt to give an exhaustive survey will be unsatisfactory. We shall therefore only be concerned here with models that in some sense have natural formulations as infinite programming problems. Furthermore, we shall in the main restrict attention to infinite programming problems which have linear constraints and objective functions.

Following on from the work of [4], much research on infinite linear programming has been devoted to the development of an adequate duality theory, in the hope of emulating the duality theory of finite-dimensional linear and convex programming. The monograph [10] of Anderson and Nash gives a comprehensive survey of this work, as well as being a detailed reference work on infinite linear programming. (In fact much of this paper is drawn from material in [10].) Progress in algorithm development for infinite linear programs seems to have occurred mainly in two areas:


1. Semi-infinite linear programming
2. Continuous-time linear programming

In the sequel we shall proceed to examine recent developments in both of these areas. In the next section we deal with semi-infinite programming, and in Section 3 we discuss continuous-time linear programming. For each class of problem we shall endeavour to give a brief introduction to the class of problems, and outline recent research that has been carried out on algorithms for their solution. We shall avoid a detailed discussion of the duality theory of these problems, although this is often important from an algorithmic point of view, especially when one requires a procedure to check optimality.

2. SEMI-INFINITE PROGRAMMING

A semi-infinite program is an infinite program in which either the number of variables is finite, or the number of constraints is finite. The problem (1) with infinite S posed in the previous section is an example of a semi-infinite program. One of the typical manifestations of a semi-infinite program occurs in approximation theory, where one might seek the best uniform approximation of the function f on [0, 1] in terms of functions g1, g2, ..., gn, by choosing real numbers h, x1, x2, ..., xn so as to

minimize  h
subject to  Σ_{j=1}^{n} xj gj(s) - h ≤ f(s),    s ∈ [0, 1],
           -Σ_{j=1}^{n} xj gj(s) - h ≤ -f(s),   s ∈ [0, 1].     (2)
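A standard computational attack on problems of the form (2) is to replace the index set [0, 1] by a finite grid, which turns the semi-infinite program into an ordinary linear program. The sketch below uses illustrative data of our own choosing (n = 2, g1(s) = 1, g2(s) = s, f(s) = s^2) and, since the instance is tiny, brute-forces the discretized problem over a coefficient grid instead of calling an LP solver. For this data the best line is known to be s - 1/8, with uniform error 1/8 (the error equioscillates at s = 0, 1/2, 1).

```python
# Discretized version of (2): best uniform approximation of f(s) = s^2
# by x1*g1(s) + x2*g2(s) with g1(s) = 1, g2(s) = s (illustrative data).
f = lambda s: s * s
S = [k / 100 for k in range(101)]          # finite stand-in for [0, 1]

def worst_error(x1, x2):
    # The value h forced by (2) at the point (x1, x2) on the grid S.
    return max(abs(x1 + x2 * s - f(s)) for s in S)

# Brute-force search over a coefficient grid that contains the true
# optimum x1 = -1/8, x2 = 1.
cands = [(a / 16, b / 8) for a in range(-8, 9) for b in range(0, 17)]
x1, x2 = min(cands, key=lambda p: worst_error(*p))

print(x1, x2, worst_error(x1, x2))         # -> -0.125 1.0 0.125
```

Because the grid S contains the equioscillation points 0, 1/2 and 1, the discretized value here coincides with the value of the semi-infinite problem; in general, each refinement of the grid only gives a lower bound.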

Semi-infinite programs were first formulated as such by Charnes et al. [11] and have been comprehensively studied in recent years. An excellent survey of research in this area is given by Gustafson and Kortanek [12]. Although we do not intend to dwell here on the developments described in [12], which relate to the period 1962-82, we shall have cause to describe some of the earlier work in the subject in order to place recent research in context. It is possible to draw a distinction between countable semi-infinite programs, in which the index set S is countable, and continuous semi-infinite programs, which have uncountable index sets like those in (1) and (2) above. We can formulate the countable semi-infinite program explicitly as follows:

minimize  c^T x
subject to  a_i^T x ≥ b_i,   i = 1, 2, ....     (3)

Here c and a_i, i = 1, 2, ..., are vectors in R^n, and b_i, i = 1, 2, ..., are real scalars. We shall digress briefly here to discuss some aspects of duality theory for countable semi-infinite linear programs, as an illustration of some of the difficulties which arise in this field, as well as a means to present an example of a duality theorem. For a full discussion of duality theory for these problems, as well as for continuous semi-infinite programs, the reader is urged to consult [10] or [13]. The dual program for (3) has an infinite number of variables and a finite number of constraints. It can be written as follows:

maximize  Σ_i y_i b_i
subject to  Σ_i y_i a_i = c,                    (4)
            y_i ≥ 0, and only a finite number of y_i > 0.

Most work on duality theory for these problems has been concerned with deriving conditions on the problem data to ensure that strong duality holds, whereby the values of the primal and dual problems coincide. Here the term value is defined to be the infimum and supremum of the respective objective functions of (3) and (4) over their respective feasible regions. One would also like conditions which ensure that both problems are solvable, whereby the respective infimum and supremum are attained by some (primal) feasible point and some (dual) feasible point. (In terms of a simplex algorithm, solvability and strong duality guarantee that there exists some analogue of a reduced cost vector to show that a candidate for optimality is indeed a solution.) If the values of (3) and (4) do not coincide then we say that the problems have a duality gap. Such a situation might occur even in

INFINITE PROGRAMMING

simple problems, as can be seen by examining the following example of Karney [14]. The primal problem is

minimize   x_1
subject to x_1 ≥ −1,
           −x_2 ≥ 0,                        (5)
           x_1 − (1/i) x_2 ≥ 0,   i = 3, 4, ...,

and the dual problem is

maximize   −y_1
subject to y_1 + y_3 + y_4 + ... = 1,
           −y_2 − (y_3/3) − (y_4/4) − ... = 0,     (6)
           y_i ≥ 0, and only a finite number of y_i > 0.

The optimal solution to (5) is easily seen to be x_1 = x_2 = 0, with an optimal value of zero, whereas the optimal solution of (6) is y_1 = 1, y_i = 0, i ≥ 2, with an optimal value of −1. In order to preclude behaviour such as this it seems to be necessary to impose quite strong conditions on the problem data. A typical condition involves the subset M of R^{n+1} defined by

M = { ( Σ_i y_i a_i , Σ_i y_i b_i ) | y_i ≥ 0, and only a finite number of y_i > 0 },

which is called a moment cone in R^{n+1}. (There are a number of alternative definitions in the literature of moment cones for both primal and dual problems.) The following theorem is derived in [10].

Theorem 1. If the problem (3) has a finite value and M is closed then the values of (3) and (4) coincide.

The set M in the example above is generated by

{ (1, 0, −1)^T, (0, −1, 0)^T, (1, −1/3, 0)^T, (1, −1/4, 0)^T, ... },


which gives a cone having limit points on the x_1 axis which do not lie in M.

Most research in semi-infinite programming has concentrated on the continuous semi-infinite linear program (1). When a(s) and b(s) are continuous functions on [0,1] this has a dual problem

maximize   ∫_0^1 b(s) dw(s)
subject to ∫_0^1 a(s) dw(s) = c,     (7)
           w ≥ 0,   w ∈ M_r[0,1],

where M_r[0,1] is the space of regular Borel measures on [0,1] (i.e. the topological dual space of C[0,1]). If (7) has a feasible solution w_0 then it can be shown that it has a feasible solution w with the same value as w_0 and finite support. It then becomes an equivalent problem to (4), where the support of w corresponds to the set of indices with positive y_i. Analogous results to Theorem 1 may be obtained for the continuous semi-infinite linear program and its dual, as well as examples of problems which exhibit a duality gap. The interested reader is referred to [10]. We remark in closing that a number of authors (see e.g. [12]) treat problem (7) with finite support as the primal problem, and (1) as the dual.

We now turn our attention to numerical algorithms for solving semi-infinite linear programming problems. We shall confine our account to methods for solving the continuous semi-infinite program. (A discussion of conceptual algorithms for countable semi-infinite programs is given in [10], but it is difficult to see how any of these methods can be implemented in practice.) A number of different numerical methods have been proposed for semi-infinite programs over index sets which are compact subsets of R^n. Since in this case, under an appropriate Slater condition, it can be shown [13] that only a finite number of the constraints of (1) are binding at the optimal solution, most numerical methods attempt to find this set of indices in an iterative fashion. We shall proceed to give a brief overview of these methods.

Exchange Methods

The oldest algorithm for semi-infinite programming problems is the


Remez exchange algorithm for the uniform approximation problem (2). This method is an example of the more generally applicable class of exchange algorithms. These make use of a finite subset S(k) of S called the reference set. At iteration k the algorithm solves the linear program given by (1) with S(k) replacing S. This yields a solution x(k) which is used to compute all local minima of a(s)^T x(k) − b(s) over S. If these values are all nonnegative then x(k) solves (1); otherwise some point s(k) ∈ S where this is not the case is added to S(k) to form S(k+1) = S(k) ∪ {s(k)}.

For the uniform approximation problem (2), where in addition the functions g_i satisfy the Haar condition*, the Remez exchange algorithm gives an explicit procedure for exchanging a single point in the reference set with a point outside the reference set. This procedure can be proved to converge to the solution to (2). An excellent account of this method for the uniform approximation problem is given by Powell [15]. For general semi-infinite problems the specification of explicit exchange rules is more complicated. Computational evidence (see Hettich [16]) seems to indicate that multiple exchange rules (in which more than one point is added to the reference set in one iteration) are more favourable than making exchanges one at a time. Exchange methods can exhibit the same poor convergence as nonlinear cutting plane algorithms, and share the disadvantage of some other semi-infinite programming methods which have to compute all local minima of a(s)^T x(k) − b(s) at each iteration. The reader is referred to [16] for a fuller discussion of the advantages and drawbacks of these methods.

Three Phase Methods

A well known method attributed to Gustafson (see [17,18] and the references therein) works in three phases. These are:

1. Construct an approximate solution to (1) by solving a linear program with a discretized index set.

2. Approximate the indices corresponding to active constraints at the solution by clustering.

* The Haar condition states that for any n distinct points s_1, s_2, ..., s_n ∈ [0,1] the matrix G with g_ij = g_i(s_j) is nonsingular.


3. Improve this approximation to the optimal indices using a numerical method with good local convergence behaviour.

The performance of each of these phases depends critically on how well the previous stage is computed. If the discretization used in the first phase is not fine enough then the clustering procedure might fail to identify a good approximation to the optimal reference set. Since, for a fine discretization, the resulting linear program will be large and ill conditioned, the first phase of these methods can be computationally expensive, and there have been recent efforts to improve methods for doing this. As an alternative to the simplex method, some authors (e.g. [19]) have successfully applied interior-point methods in the spirit of Karmarkar [20] and Dikin [21] to discretized versions of (1). Notwithstanding this, with an appropriate grid refinement strategy, the simplex method can perform well. In [22] Hettich reports on the implementation of a method for (1) which successively refines a discretization of S, adding extra grid points in regions where the constraints are violated, and using the solution from the previous problem as a starting point for the new problem. A similar approach of successive grid refinement is advocated by Asić and Kovačević-Vujčić [23], although they make use of the problem data by considering the interior

X(k) = { x ∈ R^n : a(s)^T x − b(s) ≥ ε(k), s ∈ S(k) }

of the feasible region, where S(k) is the grid at iteration k, and ε(k) is a positive parameter which is forced to converge to zero during the course of the algorithm. In order to guarantee that X(k) lies in the feasible region, the fineness of the grid and the initial value of ε(0) depend on an estimate of a Lipschitz constant for the functions a and

b. The clustering procedure in phase two usually varies with the problem being solved. Typically index points are clustered at their centre of gravity if the distance between them is less than some preassigned tolerance. We have remarked above that under appropriate conditions the optimal solution of (1) is the same as that of the following problem:

minimize   c^T x
subject to a(s_j)^T x ≥ b(s_j),   s_j ∈ S,   j = 1, 2, ..., t,     (8)

where t ≥ n. We can thus form the Lagrangian

L(x, λ, s) = c^T x + Σ_{j=1}^{t} λ_j ( b(s_j) − a(s_j)^T x ),

and in the third phase apply standard numerical techniques such as Newton's method to compute a stationary point of L(x, λ, s) in the neighbourhood of the clustered solution obtained by the first two phases. Because of the difficulty in clustering in the second phase some authors (see e.g. [24]) have sought techniques to obtain solutions from poor approximations to the optimal reference set. One possibility explored in [24] is to use an exact penalty function of the form

c^T x + θ Σ_{j=1}^{t} max{ b(s_j) − a(s_j)^T x, 0 },

where θ > 0.

Indicate by superscript * the restrictions of A, H, and d to constraints active at z. The Kuhn-Tucker conditions (3) for small ρ ≠ 0 then require that

(∀ i ∈ J)   [ τ^T(C + ρD) + λ^T(A + ρH) ]_i = 0,   τ^T e = 1,   (A* + ρH*) v = d* − H* z.     (13)

So, to first order in ρ,

(∀ i ∈ J)   [ τ′^T C + λ′^T A ]_i = −[ τ^T D + λ^T H ]_i,   τ^T e = 1,   A* v = d* − H* z,     (14)

where τ′ and λ′ denote the first-order corrections to the multipliers τ and λ.

The remaining requirements λ ≥ 0, z + ρv ≥ 0 hold automatically when ρ is small. The first equation of (14) is solvable, provided that the matrix formed by the rows i of A and C with i ∈ J has full rank. Assuming this, any direction v satisfying (for small ρ) A* v = d* − H* z gives a Pareto minimum x = z + ρv for the perturbed MLP (11). Thus the optimal vector objective function Cx changes by an amount ρ(Cv), where the (non-unique) direction v satisfies A* v = d* − H* z. If there is only a single objective, c^T x replacing the vector Cx, denote basic components by suffix b; then A_b v_b = d and c_b^T + λ^T A_b = 0; then c_b^T v_b = −λ^T d, in agreement with the usual shadow cost calculation. The MLP example:

B. D. CRAVEN

MIN  [ −2  −3 ] [x_1]     s.t.   x_1 + x_2 ≤ 1 + ρ,   x_1 ≥ 0,   x_2 ≥ 0     (15)
     [ −3  −2 ] [x_2]

is converted to the form (1) by introducing a slack variable x_3 ≥ 0, thus x_1 + x_2 + x_3 = 1 + ρ. The point x_1 = 1/2, x_2 = 1/2, x_3 = 0 is a Pareto minimum. For the perturbed problem, τ = [1 1] and λ = 5 do not change; and each direction v satisfying [1 1] v = 1 leads to a perturbed Pareto minimum. But this simple Pareto minimum is not a vertex ([1 0]^T or [0 1]^T) of the feasible region. At these latter Pareto points, the set of nonzero x_i could change, and the present approach for simple minima would need some extension.

Consider now the multiconvex program (2), assuming that the functions are twice differentiable, and that a regularity condition ("constraint qualification") holds at each minimum point. This is relevant to a genuinely nonlinear problem. The Kuhn-Tucker conditions necessary for a weak minimum at x = z (or for a Pareto minimum, which implies a weak minimum) are [1] that Lagrange multipliers τ and λ satisfy:

τ^T f_x(z) + λ^T g_x(z) = 0;   τ ≥ 0;   λ ≥ 0;   τ^T e = 1;   λ^T g(z) = 0;   g(z) ≤ 0.     (16)
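The multiplier claims in the linear example above are easy to verify numerically. The following sketch (an added illustration using NumPy, with the data of (15); the direction v and step ρ are arbitrary choices satisfying [1 1]v = 1) checks the stationarity condition τ^T C + λA = 0 at z = (1/2, 1/2) and the first-order objective change ρ(Cv):

```python
import numpy as np

C = np.array([[-2.0, -3.0],
              [-3.0, -2.0]])      # vector objective MIN Cx
A = np.array([1.0, 1.0])          # constraint row of x1 + x2 <= 1 + rho
tau = np.array([1.0, 1.0])        # objective multipliers tau
lam = 5.0                         # constraint multiplier lambda

# Stationarity tau^T C + lam * A = 0 at the Pareto minimum z = (1/2, 1/2).
print(tau @ C + lam * A)          # [0. 0.]

# Any direction v with [1 1] v = 1 gives a perturbed Pareto minimum
# x = z + rho * v, and the vector objective changes by rho * (C v).
z = np.array([0.5, 0.5])
rho, v = 0.1, np.array([0.5, 0.5])   # illustrative choice with [1 1] v = 1
print(C @ (z + rho * v) - C @ z)     # equals rho * (C @ v)
```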

Here f_x and g_x denote derivatives with respect to x. Because the problem is convex, these necessary conditions also imply a weak minimum. The minimum is proper Pareto if τ > 0, by Geoffrion [3]. Assume now that f and g depend also on a parameter p (which may be a vector). Then the minimum z is perturbed, say to x, where x is not uniquely determined by p. However, the system (16) is solvable for x, provided that an implicit function theorem can be applied. Assume a simple minimum, where τ > 0, and λ_j > 0 for each constraint g_j(x) ≤ 0 that is active at x = z; and denote by h(x) ≤ 0 the constraints active at z. Then the system (16) reduces locally (thus near z) to

τ^T f_x(x) + λ^T h_x(x) = 0;   τ^T e = 1;   h(x) = 0.     (17)

The Jacobian matrix for this system at the point z is

             [ (τ^T f + λ^T h)_xx(z)   f_x^T(z)   h_x^T(z) ]
J(z, τ, λ) = [          0                 e^T         0    ]     (18)
             [       h_x(z)               0          0    ]

If this (n+1+m) × (n+r+m) matrix has full rank (thus, rank = n+1+m), then the system (17) is solvable, near z, for x (and the perturbed τ and λ), say

ASPECTS OF MULTICRITERIA OPTIMIZATION

x = x(p). Denote by M the matrix consisting of the first n rows of the inverse matrix J(z, τ, λ)^{−1} (thus, corresponding to (τ^T f + λ^T h)_xx).

> 0, i ∈ P. Optimization in (F) means obtaining efficient solutions as defined below.

Definition. A feasible solution x_0 for (F) is an efficient solution if there exists no other feasible x for (F) such that

[ f_j(x) + (x^t B_j x)^{1/2} ] / [ g_j(x) − (x^t D_j x)^{1/2} ] ≤ [ f_j(x_0) + (x_0^t B_j x_0)^{1/2} ] / [ g_j(x_0) − (x_0^t D_j x_0)^{1/2} ]   ∀ j ∈ P     (2)

and

[ f_i(x) + (x^t B_i x)^{1/2} ] / [ g_i(x) − (x^t D_i x)^{1/2} ] < [ f_i(x_0) + (x_0^t B_i x_0)^{1/2} ] / [ g_i(x_0) − (x_0^t D_i x_0)^{1/2} ]   for some i ∈ P.     (3)

For vector maximization programs the inequality signs in (2) and (3) are reversed. The following multiobjective problem is a Schaible^{9,10} type dual to (F):

(DF):   Maximize (λ_1, λ_2, ..., λ_p)

subject to

Σ_{i=1}^{p} τ_i ( ∇f_i(u) + B_i w_i + λ_i D_i z_i − λ_i ∇g_i(u) ) + ∇y^t h(u) = 0     (4)

[ f_i(u) + u^t B_i w_i ] / [ g_i(u) − u^t D_i z_i ] = λ_i ≥ 0,   i ∈ P     (5)

y^t h(u) ≥ 0     (6)
z_i^t D_i z_i ≤ 1,   i ∈ P     (7)
w_i^t B_i w_i ≤ 1,   i ∈ P     (8)
y ≥ 0     (9)
τ_i ≥ 0,   i ∈ P,   Σ_{i=1}^{p} τ_i = 1.     (10)

MULTIOBJECTIVE PROBLEM WITH SQUARE ROOT TERMS

We prove weak duality for cases where functions are convex, ρ-convex (weakly or strongly convex^{11,12}), pseudoconvex and quasi-convex.

Definition. A function f : R^n → R is said to be ρ-convex^{11,12} if there exists a real number ρ such that for each x, u ∈ R^n and 0 ≤ λ ≤ 1,

f(λx + (1 − λ)u) ≤ λf(x) + (1 − λ)f(u) − ρλ(1 − λ)||x − u||^2.

A differentiable function f : R^n → R is ρ-convex if and only if

f(x) − f(u) ≥ (x − u)^t ∇f(u) + ρ||x − u||^2   ∀ x, u ∈ R^n.
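As a concrete check of the differentiable characterization, the added sketch below (the choice f(x) = ||x||^2 with ρ = 1 is an illustrative example, for which the inequality in fact holds with equality) verifies the gradient inequality at random points:

```python
import numpy as np

rng = np.random.default_rng(0)
f = lambda x: float(x @ x)        # f(x) = ||x||^2
grad = lambda u: 2.0 * u          # its gradient
rho = 1.0                         # f is rho-convex with rho = 1

for _ in range(1000):
    x, u = rng.normal(size=3), rng.normal(size=3)
    lhs = f(x) - f(u)
    rhs = (x - u) @ grad(u) + rho * np.linalg.norm(x - u) ** 2
    assert lhs >= rhs - 1e-9      # for this f the two sides coincide
print("rho-convexity inequality verified")
```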

In the proofs of weak duality results, the following generalized Schwarz inequality^{13,14} will be required.

Lemma. If B is a positive semi-definite matrix then

x^t B w ≤ (x^t B x)^{1/2} (w^t B w)^{1/2}.

Equality holds if for some λ ≥ 0, Bx = λBw.
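The Lemma is easy to sanity-check numerically; the added sketch below generates positive semi-definite matrices via the standard construction B = A^t A (an illustrative choice) and also exercises the equality case:

```python
import numpy as np

rng = np.random.default_rng(1)
for _ in range(500):
    A = rng.normal(size=(4, 4))
    B = A.T @ A                          # positive semi-definite by construction
    x, w = rng.normal(size=4), rng.normal(size=4)
    assert x @ B @ w <= np.sqrt(x @ B @ x) * np.sqrt(w @ B @ w) + 1e-9
    # Equality case Bx = lam * Bw, e.g. x = lam * w with lam >= 0:
    lam = float(rng.uniform())
    xe = lam * w
    assert abs(xe @ B @ w
               - np.sqrt(xe @ B @ xe) * np.sqrt(w @ B @ w)) < 1e-6
print("generalized Schwarz inequality verified")
```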

The statement and proof of a strong duality result requires a constraint qualification to be satisfied at an efficient point. In this paper we invoke the Abadie^{15} constraint qualification used in single objective programming problems. Any constraint qualification which implies the Abadie constraint qualification, such as the one used by Singh^{16}, can be used.

2. NECESSARY CONDITIONS

In this section we give a Kuhn-Tucker necessary condition for a point to be an efficient solution of (F). To derive this condition we will use the Abadie constraint qualification^{15,17}. The following concepts will also be required. A function f : R^n → R is said to be locally Lipschitz if for each bounded subset B of R^n there exists a constant K ≥ 0 such that

|f(x') − f(x)| ≤ K|x' − x|

for all points x', x ∈ B.

The directional derivative of a function f at a point x in the direction d is given by

f'(x; d) = lim_{α↓0} α^{−1} [ f(x + αd) − f(x) ].

A function q is said to be quasidifferentiable at a point x if q has a directional derivative at x for each direction d such that q'(x; d) is convex with respect to d (see for example Craven^{18} and Schechter^{14}). It is known that if p(x) and q(x) are quasidifferentiable functions then their ratio r(x) = p(x)/q(x), q(x) ≠ 0, is also quasidifferentiable (see for example Demyanov^{19}).

R. R. EGUDO

Let

F_i(x) = [ f_i(x) + (x^t B_i x)^{1/2} ] / [ g_i(x) − (x^t D_i x)^{1/2} ];

now we show that the set of directions

L(x_0) = { d ∈ R^n : F_i'(x_0; d) < 0   ∀ i ∈ P }

has no common elements with the tangent cone

T_{X^0}(x_0) = { d ∈ R^n : ∃ {d_k} → d, α_k ↓ 0, x_0 + α_k d_k ∈ X^0 },

where

F_i'(x_0; d) = lim_{α↓0} α^{−1} [ F_i(x_0 + αd) − F_i(x_0) ]

is the directional derivative of F_i at x_0 in the direction d.

Lemma 1. Let F(x) = (F_1(x), F_2(x), ..., F_p(x)), where F_i : R^n → R, i ∈ P, are locally Lipschitz and possess directional derivatives at each point in each direction. If x_0 is an efficient solution of F(x) on X^0 ⊆ R^n, then

L(x_0) ∩ T_{X^0}(x_0) = ∅.

Proof: Suppose that L(x_0) ∩ T_{X^0}(x_0) ≠ ∅ and let d ∈ L(x_0) ∩ T_{X^0}(x_0). Since T_{X^0}(x_0) is a tangent cone to X^0 at x_0, x ∈ X^0 can be expressed as x = x_0 + αd + β(α) for some sequence of real numbers α ↓ 0, where β(α) = o(α). Since F_i, i ∈ P, are locally Lipschitz and directionally differentiable, we have for each i ∈ P

F_i(x_0 + αd + β(α)) − F_i(x_0)
   = F_i(x_0 + αd) − F_i(x_0) + F_i(x_0 + αd + β(α)) − F_i(x_0 + αd)
   = αF_i'(x_0; d) + o(α) + o(β(α))
   = αF_i'(x_0; d) + o(α) + o(α).

Dividing through by α we obtain for each i ∈ P

[ F_i(x_0 + αd + β(α)) − F_i(x_0) ] / α = F_i'(x_0; d) + o(α)/α + o(α)/α.

Now taking the limit as α ↓ 0 yields

lim_{α↓0} α^{−1} [ F_i(x_0 + αd + β(α)) − F_i(x_0) ] = F_i'(x_0; d) < 0,   i ∈ P.

The last step follows from F_i'(x_0; d) < 0 for each i ∈ P. From the above inequality we have, for sufficiently small α, F_i(x_0 + αd + β(α)) < F_i(x_0) for all i ∈ P. Since x_0 + αd + β(α) ∈ X^0 the above inequality contradicts the

efficiency of x_0 for F(x) on X^0. Therefore we must have L(x_0) ∩ T_{X^0}(x_0) = ∅.  ∎

Remark 1. Let ∂F_i(x_0) be the subdifferential of the function F_i, i ∈ P, at x_0; then we have for each V_i ∈ ∂F_i(x_0), i ∈ P,

V_i^t d ≤ F_i'(x_0; d)   for all d ∈ R^n.

It now follows from Lemma 1 that for all V_i ∈ ∂F_i(x_0), i ∈ P, the system

V_i^t d < 0,   i ∈ P,

has no solution d ∈ T_{X^0}(x_0).

f_i(x) + x^t B_i w_i + λ_i x^t D_i z_i − λ_i g_i(x) < f_i(u) + u^t B_i w_i + λ_i u^t D_i z_i − λ_i g_i(u),     (25)

respectively. Applying convexity of f and −g to (24) and (25) yields

(x − u)^t ( ∇f_j(u) + B_j w_j + λ_j D_j z_j − λ_j ∇g_j(u) ) ≤ 0,   ∀ j ∈ P,     (26)

and for some i ∈ P

(x − u)^t ( ∇f_i(u) + B_i w_i + λ_i D_i z_i − λ_i ∇g_i(u) ) < 0.     (27)

If hypothesis (a) holds, i.e. τ_i > 0, i ∈ P, then (26) and (27) imply

(x − u)^t Σ_{i=1}^{p} τ_i ( ∇f_i(u) + B_i w_i + λ_i D_i z_i − λ_i ∇g_i(u) ) < 0.     (28)

If τ_i ≥ 0, i ∈ P, then (24) and (25) imply

Σ_{i=1}^{p} τ_i ( f_i(x) + x^t B_i w_i + λ_i x^t D_i z_i − λ_i g_i(x) ) ≤ Σ_{i=1}^{p} τ_i ( f_i(u) + u^t B_i w_i + λ_i u^t D_i z_i − λ_i g_i(u) ).

This and hypothesis (b) imply (28). Again (21) and (28) contradict (4); hence the result of the Theorem must hold.  ∎

Theorem 2 (Weak duality). Assume that for all feasible x in (F) and all feasible (τ, u, w_1, ..., w_p, y, z_1, ..., z_p) for (DF), f_i is σ_i-convex, i ∈ P, −g_i is ρ_i-convex, i ∈ P, and h_j is ξ_j-convex, j ∈ M. If either of the following conditions holds:

(a) τ_i > 0, i ∈ P, and Σ_{i=1}^{p} τ_i (σ_i + λ_i ρ_i) + Σ_{j=1}^{m} y_j ξ_j ≥ 0;
(b) Σ_{i=1}^{p} τ_i (σ_i + λ_i ρ_i) + Σ_{j=1}^{m} y_j ξ_j > 0;

then inequalities (19) and (20) cannot hold.

Proof: The first part of the proof is the same as the proof of Theorem 1 up to inequality (25), except that instead of using convexity of y^t h to obtain (21) we use the Σ_{j=1}^{m} y_j ξ_j-convexity of y^t h to obtain

(x − u)^t ∇y^t h(u) + ||x − u||^2 Σ_{j=1}^{m} y_j ξ_j ≤ 0.     (29)

Applying ρ-convexity of −g and σ_i-convexity of f_i, i ∈ P, to (24) and (25) yields, for all j ∈ P,

(x − u)^t ( ∇f_j(u) + B_j w_j + λ_j D_j z_j − λ_j ∇g_j(u) ) + (σ_j + λ_j ρ_j) ||x − u||^2 ≤ 0,     (30)

and for some i ∈ P

(x − u)^t ( ∇f_i(u) + B_i w_i + λ_i D_i z_i − λ_i ∇g_i(u) ) + (σ_i + λ_i ρ_i) ||x − u||^2 < 0,     (31)

respectively. If τ_i > 0, i ∈ P, then (30) and (31) imply

(x − u)^t Σ_{i=1}^{p} τ_i ( ∇f_i(u) + B_i w_i + λ_i D_i z_i − λ_i ∇g_i(u) ) + ( Σ_{i=1}^{p} τ_i (σ_i + λ_i ρ_i) ) ||x − u||^2 < 0.     (32)

Now (32) and (29) imply

(x − u)^t [ Σ_{i=1}^{p} τ_i ( ∇f_i(u) + B_i w_i + λ_i D_i z_i − λ_i ∇g_i(u) ) + ∇y^t h(u) ] + ( Σ_{i=1}^{p} τ_i (σ_i + λ_i ρ_i) + Σ_{j=1}^{m} y_j ξ_j ) ||x − u||^2 < 0.     (33)

If hypothesis (a) holds then (33) implies

(x − u)^t [ Σ_{i=1}^{p} τ_i ( ∇f_i(u) + B_i w_i + λ_i D_i z_i − λ_i ∇g_i(u) ) + ∇y^t h(u) ] < 0,     (34)

which contradicts (4). If τ_i ≥ 0, i ∈ P, then (24) and (25) imply

Σ_{i=1}^{p} τ_i ( f_i(x) + x^t B_i w_i + λ_i x^t D_i z_i − λ_i g_i(x) ) ≤ Σ_{i=1}^{p} τ_i ( f_i(u) + u^t B_i w_i + λ_i u^t D_i z_i − λ_i g_i(u) ).     (35)

Now (35) and y^t h(x) ≤ y^t h(u) yield

Σ_{i=1}^{p} τ_i ( f_i(x) + x^t B_i w_i + λ_i x^t D_i z_i − λ_i g_i(x) ) + y^t h(x) ≤ Σ_{i=1}^{p} τ_i ( f_i(u) + u^t B_i w_i + λ_i u^t D_i z_i − λ_i g_i(u) ) + y^t h(u).     (36)


Using ρ_i-convexity of −g_i, σ_i-convexity of f_i, i ∈ P, and ξ_j-convexity of h_j, j ∈ M, in (36) we obtain

(x − u)^t [ Σ_{i=1}^{p} τ_i ( ∇f_i(u) + B_i w_i + λ_i D_i z_i − λ_i ∇g_i(u) ) + ∇y^t h(u) ] + ( Σ_{i=1}^{p} τ_i (σ_i + λ_i ρ_i) + Σ_{j=1}^{m} y_j ξ_j ) ||x − u||^2 ≤ 0.     (37)

If hypothesis (b) holds, then (37) implies (34), again contradicting (4). Hence in either case the result must hold.  ∎

Theorem 3 (Weak duality). Assume that for all feasible x in (F) and all feasible (τ, u, w_1, ..., w_p, y, z_1, ..., z_p) in (DF), y^t h is quasiconvex at u. If either of the following holds:

(a) τ_i > 0, i ∈ P, and f_i(·) + (·)^t B_i w_i + λ_i (·)^t D_i z_i − λ_i g_i(·) is pseudoconvex at u;
(b) Σ_{i=1}^{p} τ_i ( f_i(·) + (·)^t B_i w_i + λ_i (·)^t D_i z_i − λ_i g_i(·) ) is strictly pseudoconvex at u;

then inequalities (19) and (20) cannot hold.

Proof: From h(x) ≤ 0, y ≥ 0 and y^t h(u) ≥ 0 we have y^t h(x) − y^t h(u) ≤ 0. And by quasiconvexity of y^t h at u this implies

(x − u)^t ∇y^t h(u) ≤ 0.     (38)

Now suppose to the contrary that (19) and (20) do hold. Using the same argument used in the proof of Theorem 1 we have, for all j ∈ P,

f_j(x) + x^t B_j w_j + λ_j x^t D_j z_j − λ_j g_j(x) ≤ f_j(u) + u^t B_j w_j + λ_j u^t D_j z_j − λ_j g_j(u)     (39)

and for some i ∈ P

f_i(x) + x^t B_i w_i + λ_i x^t D_i z_i − λ_i g_i(x) < f_i(u) + u^t B_i w_i + λ_i u^t D_i z_i − λ_i g_i(u).     (40)

From the pseudoconvexity of f_j(·) + (·)^t B_j w_j + λ_j (·)^t D_j z_j − λ_j g_j(·), j ∈ P, (39) and (40) imply

(x − u)^t ( ∇f_j(u) + B_j w_j + λ_j D_j z_j − λ_j ∇g_j(u) ) ≤ 0,   ∀ j ∈ P     (41)

and for some i ∈ P

(x − u)^t ( ∇f_i(u) + B_i w_i + λ_i D_i z_i − λ_i ∇g_i(u) ) < 0,     (42)


respectively. If hypothesis (a) holds then (41) and (42) imply

(x − u)^t Σ_{i=1}^{p} τ_i ( ∇f_i(u) + B_i w_i + λ_i D_i z_i − λ_i ∇g_i(u) ) < 0.     (43)

Now (43) and (38) contradict (4). For τ_i ≥ 0, i ∈ P, (39) and (40) imply

Σ_{i=1}^{p} τ_i ( f_i(x) + x^t B_i w_i + λ_i x^t D_i z_i − λ_i g_i(x) ) ≤ Σ_{i=1}^{p} τ_i ( f_i(u) + u^t B_i w_i + λ_i u^t D_i z_i − λ_i g_i(u) ).     (44)

If hypothesis (b) holds then (44) implies (43); again (38) and (43) contradict (4). Therefore in either case the Theorem holds.  ∎

Corollary 1. Assume that weak duality (Theorem 1, 2 or 3) holds. If (τ^0, u_0, w_1^0, ..., w_p^0, y^0, z_1^0, ..., z_p^0) is a feasible solution of (DF) such that

u_0^t B_i w_i^0 = (u_0^t B_i u_0)^{1/2}   and   u_0^t D_i z_i^0 = (u_0^t D_i u_0)^{1/2},   i ∈ P,     (45)

and u_0 is a feasible solution for (F), then u_0 is an efficient solution of (F) and (τ^0, u_0, w_1^0, ..., w_p^0, y^0, z_1^0, ..., z_p^0) is an efficient solution of (DF).

Proof: Suppose that u_0 is not an efficient solution of (F). Then there exists a feasible solution x for (F) such that

[ f_j(x) + (x^t B_j x)^{1/2} ] / [ g_j(x) − (x^t D_j x)^{1/2} ] ≤ [ f_j(u_0) + (u_0^t B_j u_0)^{1/2} ] / [ g_j(u_0) − (u_0^t D_j u_0)^{1/2} ]   ∀ j ∈ P     (46)

and

[ f_i(x) + (x^t B_i x)^{1/2} ] / [ g_i(x) − (x^t D_i x)^{1/2} ] < [ f_i(u_0) + (u_0^t B_i u_0)^{1/2} ] / [ g_i(u_0) − (u_0^t D_i u_0)^{1/2} ]   for some i ∈ P.     (47)

Applying (45) to (46) and (47) yields

[ f_j(x) + (x^t B_j x)^{1/2} ] / [ g_j(x) − (x^t D_j x)^{1/2} ] ≤ [ f_j(u_0) + u_0^t B_j w_j^0 ] / [ g_j(u_0) − u_0^t D_j z_j^0 ]   ∀ j ∈ P

and

[ f_i(x) + (x^t B_i x)^{1/2} ] / [ g_i(x) − (x^t D_i x)^{1/2} ] < [ f_i(u_0) + u_0^t B_i w_i^0 ] / [ g_i(u_0) − u_0^t D_i z_i^0 ]   for some i ∈ P.

> 0 for all j ∈ P; thus the following cannot hold:

f_j(x) ≤ f_j(u)   for all j ∈ P

and

f_i(x) < f_i(u)   for at least one i ∈ P.

Theorem 4 (Strong duality). Let f and g be invex. Let x^0 be an efficient solution for (P) at which the necessary Kuhn-Tucker conditions hold for (EP_k) for all k ∈ P. Then there exist λ^0 ∈ Λ^+ and y^0 ∈ R^m such that (x^0, λ^0, y^0) is an efficient solution for (MWD) and the optimal values of (P) and (MWD) are equal.

Proof: Since f and g are invex functions and x^0 is an efficient solution of (P), by Lemma 1, x^0 solves (EP_k) for each k ∈ P. From the hypothesis, the following conditions are satisfied at x^0 for each (EP_k), k ∈ P:

∇f_k(x^0) + Σ_{j≠k} λ_kj ∇f_j(x^0) + Σ_{l=1}^{m} y_l^k ∇g_l(x^0) = 0,
Σ_{l=1}^{m} y_l^k g_l(x^0) = 0,
λ_kj ≥ 0   and   y^k ≥ 0.

Now define λ = 1 + Σ_{j≠k} λ_kj and y^k = Σ_{l=1}^{m} y_l^k for k ∈ P. Under the above Kuhn-Tucker conditions, since x^0 solves (EP_k), by Lemma 4 there exist λ^0 ∈ Λ^+ and 0 ≤ y^0 ∈ R^m such that (x^0, λ^0, y^0) solves (ED_k) and the optimal values of (EP_k) and (ED_k) are equal. Consequently, the optimality of (x^0, λ^0, y^0) for (ED_k), in view of Lemma 1, implies the efficiency of this solution for (MWD). Clearly, the optima of (P) and (MWD) are equal.  ∎

5. CONCLUDING REMARKS

We have considered a differentiable MOP problem and extended multiobjective duality results for efficient solutions to invex functions. The Kuhn-Tucker necessary conditions are required to establish these results. In view of the results reported in this paper the following can be concluded:

DUALITY THEORY FOR MOP PROBLEMS

It is noteworthy that in the strong duality results of Egudo^8 additional constraints appear to be binding, and therefore in general a constraint qualification may not be satisfied. The results here are therefore obtained by considering a sequence of equivalent single objective programming problems, which rests on the conditions for strong duality using the necessary Kuhn-Tucker conditions. In fact, if the Kuhn-Tucker conditions are satisfied one does not need a constraint qualification to establish duality results. Consequently, we have extended the results of Egudo^8 to invex functions and established another approach for proving duality results. However, in some rare cases the Kuhn-Tucker optimality conditions, and hence a constraint qualification, are not satisfied (cf. Ben-Tal et al^15). Subsequently, Weir and Mond^16 established duality results without a constraint qualification. An extension to prove more general duality results, in view of this paper, with no constraint qualification is possible; this would extend the results of Weir and Mond^16 to the multiobjective case and will be considered in subsequent papers.

ACKNOWLEDGEMENT

The helpful comments of a referee are gratefully acknowledged.

REFERENCES

1. Geoffrion, A.M. (1968) Proper Efficiency and the Theory of Vector Maximization. Journal of Mathematical Analysis and Applications, 22, 618-630
2. Weir, T. (1987) Proper Efficiency and Duality for Vector Valued Optimization Problems. Journal of the Australian Mathematical Society, A43, 21-34
3. Egudo, R.R. (1987) Proper Efficiency and Multi-objective Duality in Nonlinear Programming. Journal of Information and Optimization Sciences, 2, 155-166
4. Wolfe, P. (1961) A Duality Theorem for Nonlinear Programming. Quarterly of Applied Mathematics, 19, 239-244
5. Mond, B. and Weir, T. (1981) Generalized Concavity and Duality. In Generalized Concavity in Optimization and Economics, edited by S. Schaible and W.T. Ziemba, pp. 263-279, New York: Academic Press
6. Egudo, R.R. and Hanson, M.A. (1987) Multiobjective Duality with Invexity. Journal of Mathematical Analysis and Applications, 126, 469-477
7. Chankong, V. and Haimes, Y.Y. (1983) Multiobjective Decision Making: Theory and Methodology, New York: North-Holland
8. Egudo, R.R. (1989) Efficiency and Generalized Convex Duality for Multiobjective Programs. Journal of Mathematical Analysis and Applications, 138, 84-94
9. Vial, J.P. (1982) Strong Convexity of Sets and Functions. Journal of Mathematical Economics, 9, 187-205
10. Singh, C. (1988) Duality Theory in Multiobjective Differentiable Programming. Journal of Information and Optimization Sciences, 9, 231-240
11. Kuhn, H.W. and Tucker, A.W. (1951) Nonlinear Programming. In Proceedings of the Second Berkeley Symposium on Mathematical Statistics and Probability, pp. 481-492, Berkeley: University of California Press
12. Mangasarian, O.L. (1969) Nonlinear Programming. New York: McGraw-Hill
13. Hanson, M.A. (1981) On Sufficiency of the Kuhn-Tucker Conditions. Journal of Mathematical Analysis and Applications, 80, 545-550
14. Craven, B.D. (1981) Duality for Generalized Convex Fractional Programs. In Generalized Concavity in Optimization and Economics, edited by S. Schaible and W.T. Ziemba, pp. 473-490, New York: Academic Press
15. Ben-Tal, A., Ben-Israel, A. and Zlobec, S. (1976) Characterization of Optimality in Convex Programming Without a Constraint Qualification. Journal of Optimization Theory and Applications, 20, 417-437
16. Weir, T. and Mond, B. (1987) Duality for Generalized Convex Programming Without a Constraint Qualification, Utilitas Mathematica, 31, 233-242

EFFICIENCY AND DUALITY THEORY FOR A CLASS OF NONDIFFERENTIABLE MULTIOBJECTIVE PROGRAMS

ZULFIQAR ALI KHAN
Faculty of Engineering Science, Osaka University, Osaka, Japan

Abstract In this paper, by considering a sequence of equivalent single objective programs, the concept of efficiency is used to establish duality results for a class of nondifferentiable multiobjective programs under suitable convexity assumptions.

Key Words: Multiobjective programming, duality, efficiency, convex functions.

1. INTRODUCTION

An important concept in mathematical models in economics, game theory, optimal control and decision theory is that of a vector minimum. In recent years, optimization programs with multiobjective cost criteria, or programs with many objectives conflicting with each other, have been extensively reviewed in the literature. Introducing the concept of proper efficiency, Geoffrion^1 established an equivalence between a multiobjective program (MOP) and a related parametric (scalar) objective program. Using this equivalence, Weir^2 formulated a dual for MOP involving differentiable convex functions, and weak and strong duality theorems which relate the proper efficient solutions of primal and dual programs are proved. Later on, Mond et al^3 reformulated this dual for a nondifferentiable MOP in which each component of the objective function contains a term involving the square root of a certain positive semidefinite quadratic form, and weak and strong duality theorems in the sense of Weir^2 are then established for properly efficient solutions. Haimes et al^4 presented another scalar equivalence for

DOI: 10.1201/9780429333439-9

Pareto optimization. Egudo^5 used this equivalence to formulate duality for a differentiable MOP. He established weak and strong duality theorems relating efficient solutions of the primal and dual programs under convex, ρ-convex and pseudo-convex functions. In this paper, by considering a sequence of equivalent single objective programs, the concept of efficiency is used to formulate the duality results for a class of nondifferentiable MOPs. Appropriate weak and strong duality theorems under suitable convexity conditions are proved for efficient solutions.

2. PREREQUISITES

Consider the following differentiable MOP:

(P*)   Minimize   f(x) = ( f_1(x), f_2(x), ..., f_p(x) )
       subject to  g(x) ≤ 0,   x ∈ X,

where X ⊆ R^n is an open convex set, f : X → R^p and g : X → R^m are differentiable functions, and minimization means obtaining an efficient solution in the following sense: A feasible solution x^0 ∈ X is said to be an efficient solution of (P*) if, for all feasible x of (P*),

f_i(x^0) ≥ f_i(x)   for all i ∈ P = {1, 2, ..., p}

implies that

f_i(x^0) = f_i(x)   for all i ∈ P.

In relation to (P*), Haimes et al^4 considered the following single objective minimization problem:

(P_k*)   Minimize   f_k(x)
         subject to  f_j(x) ≤ f_j(x^0)   for all j ≠ k,
                     g(x) ≤ 0,

and proved the following Lemma connecting (P*) and (P_k*).

Lemma 1. x^0 ∈ X is an efficient solution for (P*) if and only if x^0 solves the single objective program (P_k*) for each k ∈ P.
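Lemma 1 can be illustrated by brute force on a small discrete bi-objective instance (an added sketch; the objective pair and grid are invented for illustration): every efficient point of the grid problem solves both scalar subproblems (P_k*).

```python
# Bi-objective toy instance: f(x) = (x, (x - 2)^2) on a grid over [0, 3].
# Indices k are 0-based here, standing in for P = {1, 2}.
X = [i / 10.0 for i in range(0, 31)]
f = lambda x: (x, (x - 2.0) ** 2)

def efficient(x0):
    # x0 is efficient iff no x has both objectives <= with at least one <.
    return not any(f(x)[0] <= f(x0)[0] and f(x)[1] <= f(x0)[1]
                   and f(x) != f(x0) for x in X)

def solves_Pk(x0, k):
    # (P_k*): minimize f_k subject to f_j(x) <= f_j(x0) for j != k.
    j = 1 - k
    feas = [x for x in X if f(x)[j] <= f(x0)[j]]
    return f(x0)[k] == min(f(x)[k] for x in feas)

eff = [x for x in X if efficient(x)]
print(eff[0], eff[-1])                        # 0.0 2.0 (efficient set [0, 2])
assert all(solves_Pk(x0, k) for x0 in eff for k in (0, 1))
```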

We now consider the following nondifferentiable MOP:

(P)   Minimize   φ(x) = ( f_1(x) + (x^t B_1 x)^{1/2}, f_2(x) + (x^t B_2 x)^{1/2}, ..., f_p(x) + (x^t B_p x)^{1/2} )     (1)
      subject to  g(x) ≤ 0,

where φ : X → R^p, g : X → R^m and B_i, i ∈ P, are positive semidefinite matrices.

NONDIFFERENTIABLE MOP PROGRAMS


As suggested by Lemma 1, an equivalent single objective program to be considered is:

(EP_k)   Minimize   f_k(x) + (x^t B_k x)^{1/2}
         subject to  f_j(x) + (x^t B_j x)^{1/2} ≤ f_j(x^0) + ((x^0)^t B_j x^0)^{1/2}   for j ≠ k,     (2)
                     g(x) ≤ 0.     (3)

In the subsequent analysis the generalized Schwarz inequality as stated in the following Lemma 2 will be used and functions are assumed to be convex/strictly convex throughout the paper.

Lemma 2. If B is a positive semidefinite matrix, then for x, y ∈ R^n,

x^t B y ≤ (x^t B x)^{1/2} (y^t B y)^{1/2}.

(a) λ_i > 0 for all i ∈ P and f_i, i ∈ P, are convex at u; or
(b) λ_i > 0 for all i and Σ_{i=1}^{p} λ_i f_i(·) is convex at u; or
(c) Σ_{i=1}^{p} λ_i f_i(·) is strictly convex at u.

Then the following cannot hold:

f_i(x) + (x^t B_i x)^{1/2} ≤ f_i(u) + u^t B_i z_i,   for all i ∈ P,     (31)
f_j(x) + (x^t B_j x)^{1/2} < f_j(u) + u^t B_j z_j,   for some j ∈ P.     (32)

Proof: We prove this theorem by contradiction. In view of (1), (28) and (29), we obtain

y^t g(x) − y^t g(u) ≤ 0.     (33)

The convexity of g implies that y^t g is convex; the convexity of y^t g along with (33) yields

(x − u)^t ∇y^t g(u) ≤ 0.     (34)

Applying result (34) to (26), we get

(x − u)^t Σ_{i=1}^{p} λ_i ( ∇f_i(u) + B_i z_i ) ≥ 0.     (35)

Now suppose that inequalities (31) and (32) hold. From hypothesis (a), we obtain

λ_i f_i(x) + λ_i (x^t B_i x)^{1/2} ≤ λ_i f_i(u) + λ_i u^t B_i z_i,   all i ∈ P,     (36)
λ_j f_j(x) + λ_j (x^t B_j x)^{1/2} < λ_j f_j(u) + λ_j u^t B_j z_j,   some j ∈ P.     (37)

Together, the inequalities (36), (37) and Σ_{i=1}^{p} λ_i = 1 yield

Σ_{i=1}^{p} λ_i f_i(x) + Σ_{i=1}^{p} λ_i (x^t B_i x)^{1/2} < Σ_{i=1}^{p} λ_i f_i(u) + Σ_{i=1}^{p} λ_i u^t B_i z_i,     (38)

which, in view of hypothesis (a), yields


(x − u)^t Σ_{i=1}^{p} λ_i ∇f_i(u) + Σ_{i=1}^{p} λ_i (x^t B_i x)^{1/2} − Σ_{i=1}^{p} λ_i u^t B_i z_i < 0.

From (27) and Lemma 2 it follows that

(x − u)^t Σ_{i=1}^{p} λ_i ∇f_i(u) + Σ_{i=1}^{p} λ_i ( x^t B_i z_i − u^t B_i z_i ) < 0,

or

(x − u)^t { Σ_{i=1}^{p} λ_i ( ∇f_i(u) + B_i z_i ) } < 0,     (39)

which contradicts inequality (35). From hypothesis (b), inequality (38) implies (39), which again contradicts inequality (35). Similar to the above analysis, for λ_i ≥ 0, i ∈ P, we obtain

Σ_{i=1}^{p} λ_i f_i(x) + Σ_{i=1}^{p} λ_i (x^t B_i x)^{1/2}     (40)

x > y ⟺ x_i > y_i,   i = 1, 2, ..., n;
x ≧ y ⟺ x_i ≥ y_i,   i = 1, 2, ..., n;
x ≥ y ⟺ x_i ≥ y_i,   i = 1, 2, ..., n, but x ≠ y;
x ≱ y is the negation of x ≥ y.

If F(x, y) is a k-dimensional vector valued differentiable function of x ∈ R^n, y ∈ R^m, then F_y(x, y) denotes the m × k matrix of first partial derivatives. If η ∈ R^k, then η^t F is a scalar valued differentiable function. (η^t F)_x and (η^t F)_y denote gradient (column) vectors with respect to x and y respectively. Subsequently (η^t F)_yy and (η^t F)_yx denote respectively the m × m and n × m matrices of second partial derivatives. ∇_x will also be used to denote the gradient (column) vector with respect to x.

Consider the multiple-objective programming problem

(P)  minimize f(x) subject to x ∈ X.

Here f: R^n → R^k and X ⊆ R^n. A feasible point x_0 is said to be an efficient solution of (P) if f_i(x_0) ≥ f_i(x) for all i = 1, 2, ..., k implies f_i(x_0) = f_i(x) for all i = 1, 2, ..., k.

A feasible point x_0 is said to be properly efficient if it is efficient for (P) and if there exists a scalar M > 0 such that, for each i,

    (f_i(x_0) − f_i(x)) / (f_j(x) − f_j(x_0)) ≤ M

for some j such that f_j(x) > f_j(x_0), whenever x is feasible for (P) and f_i(x) < f_i(x_0).


A feasible point x_0 is said to be a weak minimum if there exists no other feasible point x for which f(x_0) > f(x). If a feasible point x_0 is efficient then it is clear that it is also a weak minimum.
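On a finite set of feasible points the efficiency definition can be applied directly; a minimal sketch (function name ours), treating row j of F as the objective vector of the j'th point and testing whether any other point dominates it:

```python
import numpy as np

def efficient_mask(F):
    """Boolean mask over the rows of F (points x objectives, minimization):
    row j is efficient iff no other row is <= it componentwise with at
    least one strictly smaller component."""
    n = F.shape[0]
    mask = np.ones(n, dtype=bool)
    for j in range(n):
        dominated_by = np.all(F <= F[j], axis=1) & np.any(F < F[j], axis=1)
        mask[j] = not dominated_by.any()
    return mask

F = np.array([[1.0, 4.0],
              [2.0, 3.0],
              [2.0, 5.0],
              [3.0, 3.0]])
# (2, 5) and (3, 3) are dominated by (2, 3); the first two rows are efficient.
mask = efficient_mask(F)
```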

3. SYMMETRIC DUALITY I.
Consider the two programs

(P1)  minimize f(x, y)
      subject to
          (λ^t f)_y(x, y) ≤ 0        (1)
          y^t (λ^t f)_y(x, y) ≥ 0    (2)
          x ≥ 0                      (3)
          λ ≥ 0                      (4)

(D1)  maximize f(u, v)
      subject to
          (λ^t f)_x(u, v) ≥ 0        (5)
          u^t (λ^t f)_x(u, v) ≤ 0    (6)
          v ≥ 0                      (7)
          λ ≥ 0                      (8)

Here f: R^n × R^m → R^k; λ ∈ R^k.

Note that if k = 1, then (P1) and (D1) become the symmetric dual programs given by Mond and Weir^4.

Theorem 1. (Weak Duality). Let (x, y, λ) be feasible for (P1) and (u, v, λ) be feasible for (D1). Let either
a) x ≠ u, λ^t f(·, y) be strictly pseudoconvex for fixed y and λ^t f(x, ·) be pseudoconcave for fixed x; or
b) y ≠ v, λ^t f(·, y) be pseudoconvex for fixed y and λ^t f(x, ·) be strictly pseudoconcave for fixed x; or
c) λ > 0, λ^t f(·, y) be pseudoconvex for fixed y and λ^t f(x, ·) be pseudoconcave for fixed x.
Then

    f(x, y) ≰ f(u, v).


B. MOND AND T. WEIR

Proof. (a) Assume that f(x, y) ≤ f(u, v). Then from (4) or (8),

    λ^t f(x, y) ≤ λ^t f(u, v).    (9)

From (3), (5) and (6),

    (x − u)^t (λ^t f)_x(u, v) ≥ 0;

strict pseudoconvexity of λ^t f(·, v) then gives

    λ^t f(x, v) > λ^t f(u, v).    (10)

From (1), (2) and (7),

    (v − y)^t (λ^t f)_y(x, y) ≤ 0;

pseudoconcavity of λ^t f(x, ·) then gives

    λ^t f(x, v) ≤ λ^t f(x, y).    (11)

(10) and (11) yield λ^t f(x, y) > λ^t f(u, v), which contradicts (9).
(b) Assume that f(x, y) ≤ f(u, v). Then from (4) or (8),

    λ^t f(x, y) ≤ λ^t f(u, v).    (9)

From (3), (5) and (6),

    (x − u)^t (λ^t f)_x(u, v) ≥ 0;

pseudoconvexity of λ^t f(·, v) then gives

    λ^t f(x, v) ≥ λ^t f(u, v).    (12)

From (1), (2) and (7),

    (v − y)^t (λ^t f)_y(x, y) ≤ 0;

strict pseudoconcavity of λ^t f(x, ·) then gives

    λ^t f(x, v) < λ^t f(x, y).    (13)


(12) and (13) yield λ^t f(x, y) > λ^t f(u, v), which contradicts (9).
(c) The proof of this case is similar to that given in Weir and Mond^5 but is repeated here for completeness. From (3), (4), (5) and (6),

    (x − u)^t (λ^t f)_x(u, v) ≥ 0.

Since λ^t f(·, v) is pseudoconvex it follows that

    λ^t f(x, v) ≥ λ^t f(u, v).

From (1), (2), (7) and (8),

    (v − y)^t (λ^t f)_y(x, y) ≤ 0.

Since λ^t f(x, ·) is pseudoconcave,

    λ^t f(x, v) ≤ λ^t f(x, y).

Thus

    λ^t f(x, y) ≥ λ^t f(u, v).

But f(x, y) ≤ f(u, v) with λ > 0 leads to λ^t f(x, y) < λ^t f(u, v), a contradiction.

Theorem 2. (Strong Duality). Let (x_0, y_0, λ_0) be an efficient solution for (P1); fix λ = λ_0 in (D1). Let (λ_0^t f)_yy(x_0, y_0) be positive definite and let the set {(f_1)_y, (f_2)_y, ..., (f_k)_y} be linearly independent. Assume that either
a) λ^t f(·, y) is strictly pseudoconvex for fixed y and λ^t f(x, ·) is strictly pseudoconcave for fixed x; or
b) λ > 0, λ^t f(·, y) is pseudoconvex for fixed y and λ^t f(x, ·) is pseudoconcave for fixed x.
Then (x_0, y_0, λ_0) is an efficient solution for (D1).

Proof. Since (x_0, y_0, λ_0) is an efficient solution of (P1), there exist^7 τ ∈ R^k, r ∈ R^m, w ∈ R, s ∈ R^n, t ∈ R^k such that


the following Fritz John conditions are satisfied at (x_0, y_0, λ_0):

    (τ^t f)_x − (λ_0^t f)_yx(w y_0 − r) − s = 0      (14)
    f_y(τ − w λ_0) − (λ_0^t f)_yy(w y_0 − r) = 0     (15)
    f_y^t(r − w y_0) − t = 0                         (16)
    r^t (λ_0^t f)_y = 0                              (17)
    w y_0^t (λ_0^t f)_y = 0                          (18)
    s^t x_0 = 0                                      (19)
    t^t λ_0 = 0                                      (20)
    τ ≥ 0, r ≥ 0, w ≥ 0, s ≥ 0, t ≥ 0                (21)
    (τ, r, w, s, t) ≠ 0                              (22)

Premultiplying (15) by (r − w y_0)^t and then applying (16) and (20) gives

    (r − w y_0)^t (λ_0^t f)_yy(r − w y_0) = −τ^t t ≤ 0.    (23)

Since (λ_0^t f)_yy is assumed positive definite, τ^t t = 0. Thus

    r − w y_0 = 0    (24)
and
    τ^t t = 0.    (25)

From (15), f_y(τ − w λ_0) = 0, and since, by assumption, the set {(f_1)_y, (f_2)_y, ..., (f_k)_y} is linearly independent, then

    τ = w λ_0.    (26)

If w = 0 then τ = 0 and, from (24), r = 0. From (14), s = 0 and, from (16), t = 0, contradicting (21). Hence w > 0 and τ ≥ 0. From (24), y_0 ≥ 0 and, by (14) and (26), (λ_0^t f)_x ≥ 0. From (14), (24) and (19), it also follows that x_0^t (λ_0^t f)_x = 0.

Thus (x_0, y_0, λ_0) is feasible for (D1) and the objective functions of (P1) and (D1) are equal there. Clearly (x_0, y_0, λ_0) is efficient for (D1), otherwise there would exist feasible (ū, v̄, λ_0) such that f(ū, v̄) ≥ f(x_0, y_0), contradicting weak duality.


4. SPECIAL CASES I.
Let f(x, y) = F(x) + y^t G(x) e where F = (F_1, F_2, ..., F_k)^t, G = (G_1, G_2, ..., G_m)^t and e = (1, 1, ..., 1)^t ∈ R^k. Then (P1) and (D1) become

(P2)  min F(x) + y^t G(x) e
      subject to
          G(x)(λ^t e) ≤ 0                                 (27)
          y^t G(x)(λ^t e) ≥ 0                             (28)
          x ≥ 0                                           (29)
          λ ≥ 0                                           (30)

(D2)  max F(u) + v^t G(u) e
      subject to
          ∇_x λ^t F(u) + ∇_x v^t G(u)(λ^t e) ≥ 0          (31)
          u^t [∇_x λ^t F(u) + ∇_x v^t G(u)(λ^t e)] ≤ 0    (32)
          v ≥ 0                                           (33)
          λ ≥ 0                                           (34)

Now these problems (and particularly (P2)) can be simplified. Divide (27), (28), (30), (31), (32) and (34) by λ^t e (which is strictly positive) and let λ̄_i = λ_i/λ^t e, i = 1, ..., k. (27), (28), (30), (31), (32) and (34) become

    G(x) ≤ 0                                  (27A)
    y^t G(x) ≥ 0                              (28A)
    λ̄ ≥ 0                                    (30A)
    ∇_x λ̄^t F(u) + ∇_x v^t G(u) ≥ 0          (31A)
    u^t [∇_x λ̄^t F(u) + ∇_x v^t G(u)] ≤ 0    (32A)
    λ̄ ≥ 0                                    (34A)

(P2) can now be further simplified by eliminating λ̄ and taking the best possible value for y, i.e. y = 0. Thus (P2) and (D2) now become (writing λ instead of λ̄)

(P3)  minimize F(x)


      subject to
          G(x) ≤ 0    (35)
          x ≥ 0       (36)

(D3)  maximize F(u) + v^t G(u) e
      subject to
          ∇_x λ^t F(u) + ∇_x v^t G(u) ≥ 0          (37)
          u^t [∇_x λ^t F(u) + ∇_x v^t G(u)] ≤ 0    (38)
          v ≥ 0                                    (39)
          λ ≥ 0, λ^t e = 1                         (40)

In view of the simplifications achieved in going from (P1) to (P3), we now obtain stronger duality results between (P3) and (D3) directly rather than as a special case of (P1) and (D1).

Theorem 3. (Weak Duality). For all feasible x of (P3) and (u, v, λ) of (D3), let λ^t F + v^t G be strictly pseudoconvex. Then

    F(x) ≰ F(u) + v^t G(u) e.

Proof. Assume that

    F(x) ≤ F(u) + v^t G(u) e.

From (40),

    λ^t F(x) ≤ λ^t F(u) + v^t G(u).    (41)

Then, by (35) and (39), strict pseudoconvexity of λ^t F + v^t G implies

    (x − u)^t ∇_x(λ^t F(u) + v^t G(u)) < 0.

But combining (36), (37) and (38) yields

    (x − u)^t ∇_x(λ^t F(u) + v^t G(u)) ≥ 0,

which is a contradiction.

Theorem 4. (Strong Duality). Let x_0 be an efficient solution for (P3) at which a constraint qualification is satisfied. Then there exist v_0, λ_0 such that (x_0, v_0, λ_0) is feasible for (D3) and the objective values of (P3) and (D3) are equal. If also, for all feasible (x, u, v, λ), λ^t F + v^t G is strictly pseudoconvex, then (x_0, v_0, λ_0) is efficient for (D3).


Proof. Since x_0 is an efficient solution for (P3) at which a constraint qualification is satisfied, then by the Kuhn-Tucker conditions^7 there exist v_0, λ_0, z_0 such that

    ∇_x λ_0^t F(x_0) + ∇_x v_0^t G(x_0) − z_0 = 0
    v_0^t G(x_0) = 0
    z_0^t x_0 = 0
    v_0 ≥ 0, z_0 ≥ 0, λ_0 ≥ 0.

Thus (x_0, v_0, λ_0) is feasible for (D3) and the objective values of (P3) and (D3) are equal. The efficiency of (x_0, v_0, λ_0) for (D3) follows from weak duality if λ^t F + v^t G is strictly pseudoconvex.

5. SYMMETRIC DUALITY II.
Consider the following problems

(P4)  minimize f(x, y) − [y^t (λ^t f)_y(x, y)] e
      subject to
          (λ^t f)_y(x, y) ≤ 0
          x ≥ 0
          λ ≥ 0
          λ^t e = 1

(D4)  maximize f(u, v) − [u^t (λ^t f)_x(u, v)] e
      subject to
          (λ^t f)_x(u, v) ≥ 0
          v ≥ 0
          λ ≥ 0
          λ^t e = 1

where, as before, f: R^n × R^m → R^k, λ ∈ R^k and e = (1, 1, ..., 1)^t ∈ R^k. Observe that if k = 1, then (P4) and (D4) become the pair of symmetric dual programs given by Dantzig, Eisenberg and Cottle^2 (see also Mond^3).


Theorem 5. (Weak Duality). Let (x, y, λ) be feasible for (P4) and (u, v, λ) be feasible for (D4). Let either
a) x ≠ u, f(·, y) be strictly convex for fixed y and f(x, ·) be concave for fixed x; or
b) y ≠ v, f(x, ·) be strictly concave for fixed x and f(·, y) be convex for fixed y; or
c) λ > 0, f(·, y) be convex for fixed y and f(x, ·) be concave for fixed x.
Then

    f(x, y) − [y^t (λ^t f)_y(x, y)] e ≰ f(u, v) − [u^t (λ^t f)_x(u, v)] e.
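For k = 1 this is the Dantzig-Eisenberg-Cottle pair, and the conclusion of Theorem 5 can be spot-checked numerically for a simple convex-concave function; the choice f(x, y) = x² − y² below is ours, purely for illustration:

```python
import numpy as np

rng = np.random.default_rng(1)

# f(x, y) = x^2 - y^2 is convex in x for fixed y and concave in y for fixed x.
f   = lambda x, y: x**2 - y**2
f_x = lambda x, y: 2.0 * x            # gradient in x
f_y = lambda x, y: -2.0 * y           # gradient in y

for _ in range(1000):
    # (P4)-feasible (k = 1): x >= 0 and f_y(x, y) <= 0, i.e. y >= 0
    x, y = rng.uniform(0.0, 5.0, size=2)
    # (D4)-feasible (k = 1): v >= 0 and f_x(u, v) >= 0, i.e. u >= 0
    u, v = rng.uniform(0.0, 5.0, size=2)
    primal = f(x, y) - y * f_y(x, y)  # here: x^2 + y^2
    dual   = f(u, v) - u * f_x(u, v)  # here: -u^2 - v^2
    assert primal >= dual             # weak duality, Theorem 5
```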

Proof. (a) By the assumption of strict convexity,

    f(x, v) − f(u, v) > (x − u)^t f_x(u, v),
so that
    λ^t f(x, v) − λ^t f(u, v) > (x − u)^t (λ^t f)_x(u, v).    (42)

Now by concavity,

    f(x, v) − f(x, y) ≤ (v − y)^t f_y(x, y),
so that
    λ^t f(x, v) − λ^t f(x, y) ≤ (v − y)^t (λ^t f)_y(x, y).    (43)

Subtracting (43) from (42) and rearranging gives

    [λ^t f(x, y) − y^t (λ^t f)_y(x, y)] − [λ^t f(u, v) − u^t (λ^t f)_x(u, v)]
        > x^t (λ^t f)_x(u, v) − v^t (λ^t f)_y(x, y) ≥ 0.    (44)

Thus

    f(x, y) − [y^t (λ^t f)_y(x, y)] e ≰ f(u, v) − [u^t (λ^t f)_x(u, v)] e.    (45)

(b) By the assumption of strict concavity,

    f(x, v) − f(x, y) < (v − y)^t f_y(x, y),
so that
    λ^t f(x, v) − λ^t f(x, y) < (v − y)^t (λ^t f)_y(x, y).    (46)


By convexity,

    f(x, v) − f(u, v) ≥ (x − u)^t f_x(u, v),
so that
    λ^t f(x, v) − λ^t f(u, v) ≥ (x − u)^t (λ^t f)_x(u, v).    (47)

Subtracting (46) from (47) and rearranging gives (44), which implies (45).
(c) By convexity and concavity, one obtains (43) and (47). Subtracting (43) from (47) gives

    [λ^t f(x, y) − y^t (λ^t f)_y(x, y)] − [λ^t f(u, v) − u^t (λ^t f)_x(u, v)]
        ≥ x^t (λ^t f)_x(u, v) − v^t (λ^t f)_y(x, y) ≥ 0.    (48)

However, since λ > 0, this implies (45).

However, since>.. > 0, this implies (45). In a manner analogous to the proof of Theorem 2, one can now show the following: Theorem 6 (Strong Duality). Let (xo, Yo, >"0) be an efficient so­ lution of (P4); fix >.. = >"0 in (D4). Let (>..~f)yy(xo, yo) be positive definite and let the set {(tI)y, (h)y, ... , (ik)y} be linearly indepen­ dent. Assume that either a) f(., y) is strictly convex for fixed y and f(x,.) is strictly concave for fixed x; or b) >.. > 0, f(., y) is convex for fixed y and f(x,.) is concave for fixed x. Then (xo, yo) is an efficient solution of (D4). Proof. Since (xo, Yo, >"0) is an efficient solution of (P4) there exists 7 r E Rk , r E Rm, s E Rn, t E Rk such that the following Fritz John conditions are satisfied at (xo, Yo, >"0)'

    (τ^t f)_x + (λ_0^t f)_yx(r − (τ^t e) y_0) − s = 0           (49)
    f_y(τ − (τ^t e) λ_0) + (λ_0^t f)_yy(r − (τ^t e) y_0) = 0    (50)
    f_y^t(r − (τ^t e) y_0) − t = 0                              (51)
    r^t (λ_0^t f)_y = 0                                         (52)
    s^t x_0 = 0                                                 (53)
    t^t λ_0 = 0                                                 (54)
    τ ≥ 0, r ≥ 0, s ≥ 0, t ≥ 0                                  (55)
    (τ, r, s, t) ≠ 0                                            (56)


Premultiplying (50) by (r − (τ^t e) y_0)^t and using (51) and (54) gives

    (r − (τ^t e) y_0)^t (λ_0^t f)_yy(r − (τ^t e) y_0) = −τ^t t ≤ 0.

Since (λ_0^t f)_yy is assumed positive definite, then r − (τ^t e) y_0 = 0 and τ^t t = 0. Thus

    (τ^t e) y_0 = r    (57)
    τ^t t = 0          (58)

From (50) and (57), f_y(τ − (τ^t e) λ_0) = 0, and since, by assumption, the set {(f_1)_y, (f_2)_y, ..., (f_k)_y} is linearly independent, then

    τ = (τ^t e) λ_0.    (59)

If τ = 0, then by (57), r = 0, and by (49), s = 0. From (51) and (57), t = 0; but this contradicts (56); hence τ ≥ 0 and τ^t e > 0 and, from (57), y_0 ≥ 0. From (49) and (59), (λ_0^t f)_x ≥ 0; from (49), (53) and (59), x_0^t (λ_0^t f)_x = 0 and, from (52) and (57), y_0^t (λ_0^t f)_y = 0.

Thus (x_0, y_0, λ_0) is feasible for (D4) and the objective functions of (P4) and (D4) are equal there. Clearly (x_0, y_0, λ_0) is efficient for (D4), otherwise there would exist feasible (ū, v̄, λ_0) such that

    f(ū, v̄) − [ū^t (λ_0^t f)_x(ū, v̄)] e ≥ f(x_0, y_0) − [x_0^t (λ_0^t f)_x(x_0, y_0)] e.

Since x_0^t (λ_0^t f)_x = 0 = y_0^t (λ_0^t f)_y it follows that

    f(ū, v̄) − [ū^t (λ_0^t f)_x(ū, v̄)] e ≥ f(x_0, y_0) − [y_0^t (λ_0^t f)_y(x_0, y_0)] e,

which contradicts weak duality.

Remark 1. Note that symmetric weak duality (Theorems 1 and 5) requires the same λ in primal and dual. This in turn means that for strong duality (Theorems 2 and 6), λ = λ_0 is fixed in the dual problems (D1) and (D4).

Remark 2. Note that in weak duality, Theorems 1 and 5, a) holds only if x ≠ u while b) holds only if y ≠ v. To eliminate the cases u = x_0 or v = y_0 in strong duality, Theorems 2 and 6, we require the weak duality assumptions of both a) and b).

6. SPECIAL CASES II.
If, as before, f(x, y) = F(x) + y^t G(x) e, (P4) becomes (P3) while (D4) reduces to

(D5)  maximize F(u) + v^t G(u) e − u^t [∇_x λ^t F(u) + ∇_x v^t G(u)] e
      subject to
          ∇_x [λ^t F(u) + v^t G(u)] ≥ 0    (60)
          v ≥ 0
          λ ≥ 0
          λ^t e = 1.

Thus, (D3) may be regarded as a multi-objective Mond-Weir type dual of (P3) while (D5) corresponds to the Wolfe dual. As before, since λ does not appear in (P3), we obtain stronger duality results between (P3) and (D5) directly rather than as a special case of (P4) and (D4).

Theorem 7. (Weak Duality). For all feasible x of (P3) and (u, v, λ) of (D5), let F be strictly convex and let G be convex. Then

    F(x) ≰ F(u) + v^t G(u) e − u^t [∇_x λ^t F(u) + ∇_x v^t G(u)] e.

Proof. Assume that

    F(x) ≤ F(u) + v^t G(u) e − u^t [∇_x λ^t F(u) + ∇_x v^t G(u)] e;
then
    λ^t F(x) ≤ λ^t F(u) + v^t G(u) − u^t [∇_x λ^t F(u) + ∇_x v^t G(u)].    (61)

The strict convexity of F and convexity of G together with (35) and (39) imply

    λ^t F(x) − λ^t F(u) − v^t G(u) + u^t [∇_x λ^t F(u) + ∇_x v^t G(u)] > x^t [∇_x λ^t F(u) + ∇_x v^t G(u)].    (62)


(61) and (62) imply

    x^t [∇_x λ^t F(u) + ∇_x v^t G(u)] < 0,

which contradicts (36) and (60).

Theorem 8. (Strong Duality). Let x_0 be an efficient solution for (P3) at which a constraint qualification is satisfied. Then there exist v_0, λ_0 such that (x_0, v_0, λ_0) is feasible for (D5) and the objective values of (P3) and (D5) are equal. If also, for all feasible (x, u, v, λ), F is strictly convex and G is convex, then (x_0, v_0, λ_0) is efficient for (D5).

Proof. Since x_0 is an efficient solution for (P3) at which a constraint qualification is satisfied, then by the Kuhn-Tucker conditions there exists (v_0, λ_0, z_0) such that

    ∇_x λ_0^t F(x_0) + ∇_x v_0^t G(x_0) − z_0 = 0
    v_0^t G(x_0) = 0
    z_0^t x_0 = 0
    v_0 ≥ 0, z_0 ≥ 0, λ_0 ≥ 0.

Thus (x_0, v_0, λ_0) is feasible for (D5) and the objective values of (P3) and (D5) are equal. The efficiency of (x_0, v_0, λ_0) for (D5) follows from weak duality if F is strictly convex and G is convex.

Finally we specialize our results to the linear case. Let f(x, y) = Cx + y^t b e − y^t A x e, where C = [c_1, c_2, ..., c_k]^t, c_i ∈ R^n, i = 1, 2, ..., k. Then (P1) and (D1) become (assuming λ > 0 and λ^t e = 1)

(LP)  minimize Cx + y^t b e − y^t A x e
      subject to
          Ax ≥ b
          y^t A x ≤ y^t b
          x ≥ 0
          λ > 0, λ^t e = 1.

(LD)  maximize Cu + v^t b e − v^t A u e
      subject to
          C^t λ ≥ A^t v
          u^t C^t λ ≤ u^t A^t v
          v ≥ 0
          λ > 0, λ^t e = 1.

(LP) simplifies to

(LP1)  minimize Cx subject to Ax ≥ b, x ≥ 0.

In (LD) let μ = Cu + v^t b e − v^t A u e; then

    λ^t μ = λ^t C u + v^t b − v^t A u ≤ v^t b.

Hence (LD) simplifies to

(LD1)  maximize μ
       subject to
           C^t λ ≥ A^t v
           λ^t μ ≤ v^t b
           v ≥ 0
           λ > 0, λ^t e = 1.

Similarly in (P4) and (D4) set f(x, y) = Cx + y^t b e − y^t A x e. (P4) simplifies to (LP1) while (D4) becomes

       maximize Cu + v^t b e − (λ^t C u) e
       subject to
           C^t λ ≥ A^t v
           v ≥ 0
           λ > 0, λ^t e = 1.

Setting μ = Cu + v^t b e − (λ^t C u) e gives

(LD2)  maximize μ
       subject to
           C^t λ ≥ A^t v
           λ^t μ = v^t b
           v ≥ 0
           λ > 0, λ^t e = 1.
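For k = 1, (LP1) and (LD1) collapse to the classical primal-dual pair of linear programs, so their duality can be checked numerically; a small sketch using scipy (the data values are ours, purely for illustration):

```python
import numpy as np
from scipy.optimize import linprog

# k = 1: (LP1) is  min c^t x  s.t.  Ax >= b, x >= 0,
# and (LD1) is     max b^t v  s.t.  A^t v <= c, v >= 0.
c = np.array([3.0, 2.0])
A = np.array([[1.0, 1.0],
              [2.0, 1.0]])
b = np.array([4.0, 5.0])

# linprog minimises subject to A_ub @ x <= b_ub, so Ax >= b becomes -Ax <= -b.
primal = linprog(c, A_ub=-A, b_ub=-b, bounds=[(0, None)] * 2, method="highs")
dual = linprog(-b, A_ub=A.T, b_ub=c, bounds=[(0, None)] * 2, method="highs")

gap = primal.fun - (-dual.fun)       # zero at optimality (strong duality)
```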

Note the relationship between (LD2) and (LD1). Note that weak duality for (LP1) and (LD1), (LD2) respectively implies λ^t C x ≥ λ^t μ and, for optimal x_0 for (LP1) and (μ_0, v_0, λ_0) for (LD1), (LD2) respectively, λ_0^t C x_0 = λ_0^t μ_0 = v_0^t b. Thus μ_0 is an efficient solution for (LD1) if and only if it is efficient for (LD2); (LD1) and (LD2) are thus equivalent problems. Equivalent forms of (LP1) and (LD1) have been given by Isermann^8.

REFERENCES

1. Dorn, W.S. (1960) A Symmetric Dual Theorem for Quadratic Programs, J. Oper. Res. Soc. Japan 2, 93-97

2. Dantzig, G.B., Eisenberg, E. and Cottle, R.W. (1965) Symmetric Dual Nonlinear Programs, Pacific J. Math. 15, 809-812
3. Mond, B. (1965) A Symmetric Dual Theorem for Nonlinear Programs, Quart. Appl. Math. 23, 265-269
4. Mond, B. and Weir, T. (1981) Generalized Concavity and Duality, in Generalized Concavity in Optimization and Economics, edited by S. Schaible and W.T. Ziemba, pp. 263-279, New York: Academic Press
5. Weir, T. and Mond, B. (1988) Symmetric and Self Duality in Multiple Objective Programming, Asia-Pacific J. Oper. Res. 5, 124-133
6. Weir, T. (1987) Proper Efficiency and Duality for Vector Valued Optimization Problems, J. Austral. Math. Soc., Series A, 43, 21-35
7. Craven, B.D. (1977) Lagrangian Conditions and Quasiduality, Bull. Austral. Math. Soc. 16, 325-339


8. Isermann, H. (1978) Duality in Multiple Objective Linear Programming, in Lecture Notes in Economics and Mathematical Systems 155, Multiple Criteria Problem Solving, edited by S. Zionts, pp. 274-285, Springer-Verlag

DATA ENVELOPMENT ANALYSIS: A COMPARATIVE TOOL

M. J. FOSTER
Kingston Business School, Kingston Hill,
Kingston-upon-Thames, KT2 7LB, England

Abstract  This paper gives a general appreciation of a technique known as Data Envelopment Analysis (DEA). The technique is used to make assessments of the relative efficiency of decision making units which consume multiple, incommensurate inputs for the production of multiple, incommensurate outputs. Because it is able to deal with inputs and outputs which are incommensurate, the technique is of particular interest in the public sector. The basic idea is that the (relative) efficiency of a decision making unit is calculated as the ratio of a weighted sum of its outputs to a weighted sum of its inputs, where the weights are calculated in such a fashion as to put the given unit in the best light possible relative to its peers. These weights are determined as the variables of a fractional programme which can be transformed into a linear programme and hence solved. The method is illustrated by particular reference to an application to the collection of rates by local authorities in England.

1. INTRODUCTION

Data Envelopment Analysis (DEA) is a technique to assess the relative efficiencies of a group of decision making units (dmu's) where those units typically utilise multiple, incommensurate inputs to produce multiple, incommensurate outputs.

DOI: 10.1201/9780429333439-11


The basic idea is that the (relative) efficiency of a dmu is calculated as the ratio of a weighted sum of its outputs to a weighted sum of its inputs, where the weights are calculated in such a fashion as to put the given dmu in the best light possible relative to its peers. The relativity inherent in the procedure comes from the fact that each dmu's weights are, although unit specific, constrained so that the efficiencies of all units when subjected to those weights lie in the range (0,1) (or 0 to 100 if preferred). It is important to note that because the calculations produce relative efficiencies, units with an efficiency score of 1 may not be efficient in an absolute sense: in the worst case they may simply be the best of a bad bunch!

The original ideas inherent in this approach were published more than thirty years ago by Farrell.^1 However, it has only been in the past ten years or so that it has come to wider notice, since Charnes et al.^2 published details of how the ideas could be operationalised as a linear programming problem. Since then much emphasis has been given to the idea that the method is particularly suited to non-profit-making units where, although the outputs may be highly valued, no immediate monetary value attaches to them, e.g. schools, universities, public service hospitals and local or national government agencies. Whilst this is true, it should not be thought that the method is any less suited to use in the private sector. A good example would be the retail branches of a bank. Here one major objective is certainly to make good profits, but it may be over-simplistic to attempt inter-branch comparisons based simply on some unit profitability measure which, in order to be tractable, would very possibly ignore subtle, cross-subsidy issues.

Perhaps not surprisingly, the fast growing body of material available on DEA tends to promote it as a highly desirable approach, but there are drawbacks of which the reader should be aware. The most obvious is the method's requirement for complete, accurate sets of data, but it is not alone in that of course. The other is its volatility under certain circumstances to changes in the input/output profile of a unit; we discuss this later in the paper. My own view is that, used appropriately, the approach has much to offer but, like other techniques, it must be used within its internal limitations and with due regard to the quality of data available. In particular, its output is not suitable to be given to a manager not conversant with how


that output was generated but who may be tempted to use uncritically any available ranking data as a stick with which to beat operating units.

The aim of this paper is to provide a general appreciation of the DEA technique and its potential for application, particularly in the non-profit-making sector where more traditional, financial efficiency measures may not be available. For illustrative purposes, particular reference will be made to an application of the technique to published data on the rate* collection function of London Borough Councils and Metropolitan District Councils in England for the financial year 1982/3 (see also Thanassoulis, Dyson and Foster^3). The reader interested in exploring the available literature in more detail should refer to a bibliography prepared by Seiford.^4

After the introduction, the main elements of the paper are: a description of the basic model; a section on defining particular models and determining their data requirements; a discussion of the model's output and its interpretation; some comments on the robustness of the model; and a brief discussion of how one might restrict weight flexibility in the model.

2. DEA - THE BASIC MODEL

As we indicated in the Introduction, the DEA model looks at the relative efficiency of a unit in comparison with its host cohort. Suppose that there are n decision-making units, each of which consumes m inputs in producing s outputs.

* Note: in the UK, as in Australia, rates are a form of local tax levied on property, although the system is currently being replaced by one based on a domestic head count.


Then the relative efficiency of dmu j_0, denoted by e_0, is determined by the following non-linear programme:

(P1)  Max e_0 = Σ_{r=1}^s u_r y_{rj_0} / Σ_{i=1}^m v_i x_{ij_0}

      s.t.  Σ_{r=1}^s u_r y_{rj} / Σ_{i=1}^m v_i x_{ij} ≤ 1,  j = 1, ..., n
            u_r, v_i ≥ ε  for all r, i

where
    y_{rj} = the value of the r'th output from unit j,
    x_{ij} = the value of the i'th input to unit j,
    u_r = the weight given to the r'th output,
    v_i = the weight given to the i'th input,
and ε is some small number (e.g. 10^{-7}).

In this programme the variables are the weights u_r, v_i, which are determined so as to maximise j_0's efficiency value, whilst constrained by the requirement that no unit in the set should have a rating greater than one when these particular weights are applied to it. Thus a separate programme is solved to determine each dmu's efficiency score. The outputs are the products or services produced by the units. The inputs represent the resources consumed, including any uncontrollable inputs or environmental factors which might affect the outputs. For example, in the case of an application to schools it has been suggested that the social class of the parents might be an appropriate variable to include. However, in many of the published applications, including the rates example, no such factors are included, the input profile simply representing the resources consumed. We should note, however, that some authors feel the single input variable approach to be inadequate essentially because it does not include uncontrollables, see e.g. Cubbin et al.^5


The non-linear programme (P1) can be readily transformed into a linear programme using an approach due to Charnes and Cooper.^6 Either the denominator is set equal to some agreed constant and the numerator maximised, or the numerator can be fixed and the denominator minimised; the two approaches correspond respectively to an input and output view of efficiency, which coincide exactly if we include no uncontrollable inputs. In the rates collection example the former transformation was used with the denominator set equal to 1, giving the following LP:

(P2)  Max h_0 = Σ_{r=1}^s u_r y_{rj_0}

      s.t.  Σ_{r=1}^s u_r y_{rj} − Σ_{i=1}^m v_i x_{ij} ≤ 0,  j = 1, ..., n
            Σ_{i=1}^m v_i x_{ij_0} = 1
            u_r, v_i ≥ ε  for all r, i
For computational convenience, the dual of this LP was solved, that dual being as follows: Min Zo - E( ~s Sr+ + ~m Si _ ) i =1 r=l n s.t. Zo Xijo - ~j=lXij~j - Si - = 0 n

~.

J =1

Yrj~j

- sr+=Yrj

0

~j,

(P3) 1

= 1, ... ,m

r= 1, ... ,s Si-, Sr+ ~ 0 for all i,j,r

Zo unconstrained where Xij, Yrj and E are as before and ~j, Si-, Sr+ and Zo are dual vanables, Si - and sr + being respect i vely input and output slacks. For the full derivation see Lewin and Morey7 and Charnes et al. 2 Note also that, if uncontrollable factors are included in the input set, the model (P3) will be slightly modified in that the slacks associated with those variables do not feature in the objective function and the associated constraints do not involve the functional Zo. The efficiency score for dmu jo is now the value of zo* where * denotes optimality in the given programme. However, what is really of interest is not simply the efficiency score but whether the dmu is Pareto efficient. Dmu jo will be Pareto efficient (in producing its outputs)

160

M. J. FOSTER

if and onll if no other unit or combination of units exists capable 0 producing at least the same outputs, whilst consuming less of some input(s) and no more of the other input(s). In (P3) then dmu jo is Pareto efficient if and only if zo*=l and (Si-)* = (sr+)* = 0 for all i,r. Further, the weights Vi ,Ur defined in (Pl) are available as the optimal values of dual variables associated respectively with the i'th and (m+r)'th constraints of (P3). In conclusion to this section, we note two assumptions of the model as postulated. First, it assumes constant returns to scale for each dmu evaluated. Secondly, it is assumed that the efficient product ion front ier which bounds or envelops the observed data (hence the name DEA) is piece-wise linear and continuous, implying that all points on that surface are attainable production possibilities.

3.

MODEL DEFINITION AND DATA REqUIRElENTS

For a particular DEA application, one of the key decisions is to decide what variables to inc1ude in the efficiency index. While the initial reaction might be to try to inc1ude all "relevant" variables, there are practical benefits to be derived from a more parsimonious choice. Firstly, a more compact input- output profile reduces the data collection requirement and, secondly, the model is more effective in discriminating bet ween units when their number is large relative to the number of variables in the model. In the rates example the final model applied to 62 metropolitan authorities yielded 7 units with a score of 1.0 and a minimum value of 0.329. The same model applied to 311 non- metropolitan authorities gave only 5 with a score of 1.0 with alowest value of 0.288. In each case we used one input variabIe and four output variables. By way of comparison, in a study of nursing service efficiency in the USA, Nunamaker8 (p196) ran a model with a similar number of variables but with only 11 units. This resulted in a score of 1. 0 for 4 of the 11, scores in excess of 0.9 for another 4 and a minimum value of 0.72. This may have been a function of the units' genuine similarity but the numbers are certainly suggestive. Slightly more precisely, there will be a tendency, in general , for more of a given set of units to be shown efficient as the input/output set is extended. This is because there will be an increasing chance of a unit

DATA ENVEWPMENT ANALYSIS: A COMPARATIVE TOOL

161

finding weights for some subset of the inputs/outputs that show it efficient and the weights we recall are specifically chosen to maximise the given unit's rating on the efficiency scale. Care must be taken, however, because the possible conjecture that efficiency is an increasing function of the number of variables (provided that new variables are in addition to previous ones which all remain) is false, in spite of Nunamaker' S9 claim to that effect (p55). The counter-example is quite easy to construct but is somewhat 'pathological', hence the tendency as we have described it is a useful guideline, pointing towards parsimony once adequate coverage of the activities has been achieved. In a sense there is a parallel with the way one might decide what variables to include in aregression equation. In the rates example, after experimenting with a variety of profiles we concentrated on one with four outputs and a single input - total cost of rates collection. One reason for using only the single input was the argument that each local authority ultimately had managerial discretion to deploy resources as it saw fit to achieve its aim of collecting and administering the rating system. So, in some sense ,disaggregation of those costs might be thought to relate to questions of how to effectively deploy limited resources as distinct from making comparisons of operational efficiency in the use of those resources. Nunamaker 8 similarly used a composite cost input variabIe, with either three or four output variables (representing patient days of treatment delivered by broad patient category). In justifying his choice he makes the following observation (p189): "Since DEA measures the relative efficiency of dmu' s performing similar tasks, the sample [unitsJ selected should be fairly homogeneous with respect to product output." . 
Perhaps the point to note is that if this homogeneity is present in the set of units to be assessed, then the use of a single input variabIe can be more readily agreed. A contrasting example is provided by a recent study by BeasleylO of university departments - the chemistry and physics departments of 52 UK universities. Here homogeneity is not likely, albeit he looks only at the same subjects in each university, because of differing emphases between teaching (particularly of undergraduates) and research. To illustrate, Oxford had 707 undergraduates and 211 research students at the time of survey compared with


Bangor's 89 and 7. The model used had three inputs: general expenditure, equipment expenditure and research income; and eight output variables relating to the mix of students processed, research quality and research output. This last variable is notoriously difficult to measure and Beasley, lacking data in any case, relied on his input variable (research income) to stand as a proxy for it. This seems unwise, both because it is not desirable to have the same variable in the inputs and outputs and also because the proxy variable is unlikely to have been a faithful reflection of research output, especially when we consider that the purpose of the exercise is to see which universities were more efficient at using their given inputs.

If, after due consideration, an input/output profile is determined for an application which is felt to cover all the important facets of the process but contains a large number of variables, it may be prudent to look for ways of cutting down their number. One possible method is to test for pairwise correlations. If a pair of outputs is such that one is a scalar multiple of the other, then one may be dropped without effect on the efficiencies calculated: similarly for inputs. For pairs of inputs or outputs which are highly, positively correlated (but not actual multiples), one of the pair may be omitted with only limited effect on the efficiencies calculated. Nunamaker^9 discusses this in more detail. Thanassoulis et al.^3 counsel caution lest any input variable be of a form such that, ceteris paribus, the use of more of that input variable will result in less output. If that situation arises and the effect on the output(s) is significant, then the inverse of the input variable should be used.

The final (pragmatic) consideration in determining the set of variables to be used is the availability of data for all units for all variables.
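The pairwise correlation screening described above is straightforward to automate; a minimal numpy sketch (the function name and the 0.95 cut-off are ours, for illustration):

```python
import numpy as np

def highly_correlated_pairs(V, threshold=0.95):
    """Index pairs of rows of V (variables x units) whose sample
    correlation exceeds the threshold; one member of each pair is a
    candidate for omission from the DEA variable set."""
    corr = np.corrcoef(V)
    k = V.shape[0]
    return [(a, b) for a in range(k) for b in range(a + 1, k)
            if corr[a, b] > threshold]

# Output 2 is an exact scalar multiple of output 0, so the pair (0, 2) is
# flagged and one of the two can be dropped without affecting the scores.
Y = np.array([[1.0, 2.0, 3.0, 4.0],
              [4.0, 1.0, 3.0, 2.0],
              [2.0, 4.0, 6.0, 8.0]])
pairs = highly_correlated_pairs(Y)
```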
This of course is not the ideal situation and is one of the problems less likely to occur in applications within a single managerial framework e.g. the branches of a given bank or the departments of a single university. For, in that case the data can be required by managerial edict and strictures may be added as

DATA ENVELOPMENT ANALYSIS: A COMPARATIVE TOOL


to its accuracy. However, if the data set needed by the model as defined is incomplete, three options exist:

1. the model can be modified by excluding the variable(s) for which there is missing data - but this should only be done if there is confidence that those variables are not the key to the operation under consideration;

2. if the variable(s) are too important to omit and reasonable proxy data can be found then it should be used (see the example referred to above, Beasley10); and,

3. dmu's for which data is missing can be omitted from the analysis: this will be less serious if the omitted units are fairly inefficient but may lead to over-optimistic assessments if those omitted are the best.

Ideally, then, the data will be servant to the model and not vice versa and every effort should be made to ensure the accuracy of the data processed. If there are doubts about its accuracy, it simply means that sensible caution must be exercised in the managerial use of the output.

We conclude this section with a description of the main model and data for the rates collection example. The data used were drawn from the annual statistics on rate collection produced by CIPFA (the Chartered Institute of Public Finance and Accountancy) for 1982/3. In principle, these statistics contain all the information necessary but in practice there are not infrequent omissions from the 93 columns of information for each local authority and the quality of some of the available data is rather variable. After a considerable debate it was decided that the output of a rating section would be adequately described by four variables as set out in Table 1.


Table 1: Input and outputs of the rates-collection function with units of measurement.

Input
    Total cost of rates collection (£100,000)

Outputs
    1. (Number of) non-council hereditaments (10,000s)
    2. Rate rebates granted (1000s)
    3. Summonses issued and distress warrants obtained (1000s)
    4. Net-present value (NPV) of non-council rates collected (£10m)

The first output is essentially the number of rates accounts for properties not owned by the council which are administered by the rating department. Council tenants' rates are ignored since they are collected with rents and then credited to the rate account by means of an internal transfer payment. The consequent effort of collection required of the rating section is therefore minimal. The second output relates to the fact that many households with low incomes are eligible for some remission of their rates. The number of rebates granted was taken as a measure of the effort expended. Once accounts have been raised, there is the problem of chasing up those where a serious risk of non-payment is perceived. Simple reminders were construed to be part and parcel of the general administration of an account (and were therefore covered by output 1.) It is when the "gentle prod" fails that significant extra effort is incurred in initiating and then pursuing legal action. This added burden is reflected by the third output. The purpose of the type of administrative activity described above is of course to get the due monies safely collected. An authority may also feel able to justify deploying extra resources to ensure that high-value bills are given special attention and, more generally, that all rates are collected as quickly as possible, subject to the constraint that councils are legally obliged to offer some form of instalment payments as an alternative to collecting the due amount as a single lump sum.


Prompt collection may obviate the need to borrow for working capital. The fourth output variable therefore reflects both the value of the rates collected and the speed of their collection. We actually computed (approximately, since only quarterly cash-flows were available) the NPV of rates gathered for properties not owned by the councils (non-council hereditaments), using a discount rate of 12% p.a. which seemed to be consistent with prevailing local authority interest rates at the time. As indicated earlier, we decided to use the total costs of the rate function as the single input variable because: (a) those costs represent the real resources used and available for management deployment, and (b) the use of disaggregated costs would have increased the number of variables, with a potential loss of discriminatory power. Also, we believed that the total costs were in general more reliable than some of their disaggregated components. The other possible input variables would have been some sort of contextual variables such as the prosperity (or otherwise) of the local community. However, these sorts of factors were represented in the output profile by variables 2, 3 and 4 as burdens imposed on the system by their very existence. This idea of handling some element of context via the outputs rather than the inputs could be looked at more generally. For the particular model defined, there were complete data for 62 metropolitan authorities whose names and efficiency scores appear in Appendix 1. Some earlier models were run with a smaller data set because of missing items, as we discuss further in the section on robustness.
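The NPV computation described above can be sketched as follows (hypothetical cash flows; applying the 12% p.a. rate as 3% per quarter is our assumption, since the chapter does not state its compounding convention):

```python
# Discount quarterly receipts at 12% per annum, taken here as 3% per
# quarter -- an assumption, not stated in the chapter.
def npv_of_collections(quarterly_receipts, annual_rate=0.12):
    q = annual_rate / 4.0
    return sum(c / (1.0 + q) ** t
               for t, c in enumerate(quarterly_receipts, start=1))

# A hypothetical authority collecting 40 (in £10m units) over four quarters:
print(round(npv_of_collections([25.0, 10.0, 3.0, 2.0]), 2))
```

Faster collection (receipts weighted toward the early quarters) yields a higher NPV for the same total, which is exactly the property the fourth output variable exploits.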

4. MODEL OUTPUT AND ITS INTERPRETATION

Relative Efficiency Scores

The first and most basic output from the DEA model is the (relative) efficiency score for each dmu. These appear for the model outlined in the last section (the Basic Model hereafter) in descending order of relative efficiency in Appendix 1. Their distribution is shown in Table 2 below.


Table 2: Distribution of efficiency scores for basic model.

    Score Range     Number of Units
    1.0                    7
    0.9 - 0.999            6
    0.8 - 0.899            4
    0.7 - 0.799           14
    0.6 - 0.699           11
    0.5 - 0.599           12
    0.4 - 0.499            6
    0.3 - 0.399            2
    0.0 - 0.299            0
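The relative-efficiency scores discussed in this section come from an LP of the following general form (the standard DEA ratio model in multiplier form with a single input; the data below are hypothetical and scipy is used purely for illustration):

```python
# For the assessed unit, choose non-negative weights maximizing its weighted
# outputs, while no unit's weighted-output/weighted-input ratio exceeds 1.
import numpy as np
from scipy.optimize import linprog

def dea_score(inputs, outputs, k):
    """inputs: (n,) input levels; outputs: (n, s) output levels; k: unit
    to assess.  Maximize u.y_k subject to v x_k = 1 and u.y_j <= v x_j."""
    n, s = outputs.shape
    c = np.concatenate([-outputs[k], [0.0]])             # maximize u.y_k
    A_ub = np.hstack([outputs, -inputs.reshape(-1, 1)])  # u.y_j - v x_j <= 0
    A_eq = np.zeros((1, s + 1)); A_eq[0, s] = inputs[k]  # v x_k = 1
    res = linprog(c, A_ub=A_ub, b_ub=np.zeros(n), A_eq=A_eq, b_eq=[1.0],
                  bounds=[(0, None)] * (s + 1))
    return -res.fun

inputs = np.array([5.0, 8.0, 10.0])
outputs = np.array([[6.0, 4.0], [8.0, 9.0], [7.0, 6.0]])
for k in range(3):
    print(k, round(dea_score(inputs, outputs, k), 3))
```

In this toy data set the first two units score 1 (each is best on one output-to-input ratio) while the third is dominated and scores below 1, illustrating the "best possible light" property discussed below.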

At an initial viewing this distribution would seem to indicate a fair measure of discrimination, which is important when we remember that the scores are relative, not absolute measures of efficiency. It is also important to remember that the method seeks to put each dmu in the best possible light relative to its peers. Thus we can make the following observations:

1. It may be that a dmu with a very high efficiency score is not uniformly strong in its performance but has achieved that rating by loading its weight on to a subset of the factors - we return to this later.

2. A dmu having a very low score is accorded that rating despite the method's best endeavours on its behalf. It may be that the rating is not a fair reflection because of either inaccurate data (but whose fault is that?!) or some genuinely extenuating circumstance not encompassed by the model. However, because of the "best possible" nature of the score it is at the very least reasonable to assume that there are proper grounds for enquiry in the case of lowly rated units e.g. those in the lower quartile of the distribution. To re-express in terms of our example, it may transpire that Merton could give a convincing explanation of why they are not as bad as their 62nd position in the table might imply but it seems highly unlikely that they could compare favourably with those in the upper quartile.


As a very minimum then, the DEA approach provides an effective initial sieve which can be used to sift out those units of special interest because of their outstanding performance and those which appear to be in need of further enquiry because of their poor rating. It should also be noted that in applications such as our rates example, where there is a single input variable representing total cost, the efficiency score can be interpreted as the maximum proportion of the actual costs incurred which, if it replaced that cost, would have led to an efficiency score of 1, given the same outputs.

Further Information on Efficient Units

One of the major reasons for being interested in units with high scores is as potential exemplars for other units. However, before high scoring units are so used, two questions need to be asked: (1) What aspects of a unit's performance contributed to its high score? and (2) Does the unit show well rounded performance? The key to answering these questions is a consideration of the input and output values together with their allocated weights. One way of doing this is to consider the "virtual inputs/outputs" of the unit which are defined to be the products of the input/output values with their respective weights. In the single input case the natural approach is to set the virtual input equal to 100 (or 1) and to restrict the total virtual outputs to the range 0 to 100 (or 0 to 1). This ensures that the virtual outputs represent the percentage points contributed to the unit's efficiency score from the respective outputs. In particular, for a unit with a score of 1, the virtual output of any given output is precisely its percentage contribution to the total score. When it comes to assessing a profile of virtual outputs a unit with a consistent profile of contribution will typically be more highly regarded than one which relies on just one major output. For further details see Charnes et al2 and Thanassoulis et al3.

Table 3 shows the virtual outputs of the seven units with a score of 1 in our example. Following the guidelines above we should tend to be more impressed by the performance of Leeds and Stockport than Brent or the City of London. Without further investigation it is not possible to say whether City of London is generally


efficient in its operations or whether inefficiencies elsewhere are hidden by its speedy collection of high rates due on its prime city locations. What its virtual output profile does suggest is that it is a unit which may be looked at as an example of good operating practice in the area of speed of collection.

Table 3: Virtual outputs of the units with a score of 1

    Output                      Lewisham  Brent  Stockport  Bradford  Leeds  City of London  Liverpool
    Non-council hereditaments       0.00   0.00      56.02      0.00   0.00            0.00       0.00
    Rate rebates granted           31.94   6.31      13.00     27.21  50.47            0.23      26.61
    Summonses and distress
      warrants obtained            63.46  85.20      30.96     63.70  38.93            6.04      68.42
    NPV of non-council rates        4.60   8.49       0.00      9.09  10.56           93.73       4.97
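The virtual-output calculation can be sketched as follows (hypothetical weights and data, not those of the rates study):

```python
# Virtual outputs: each output value times its weight, scaled so that the
# single virtual input equals 100.
import numpy as np

def virtual_outputs(y, u, x, v):
    """y: one unit's outputs; u: output weights; x: input; v: input weight.
    Returns percentage contributions; they sum to 100 * efficiency score."""
    scale = 100.0 / (v * x)          # normalize the virtual input to 100
    return u * y * scale

y = np.array([5.0, 12.0, 3.0])
u = np.array([2.0, 1.0, 4.0])
vo = virtual_outputs(y, u, x=17.0, v=2.0)
print(np.round(vo, 2))               # here the contributions sum to 100
```

A unit whose score of 1 is spread evenly over several outputs (like Leeds or Stockport in Table 3) shows a more rounded profile than one relying on a single dominant contribution.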

We repeat our earlier warning, however, that efficiency as defined here is only relative and there may still be further scope for improvement in all the authorities listed in Table 3.

Further Information on Inefficient Units

One of the beauties of DEA is that it automatically generates information which helps to explain how a lowly rated unit might improve. This comes in two ways: first by identifying a set of efficient units against which to compare it and second by detailing exactly what changes would be required for the attainment of an "efficient" rating.


The reference set of an inefficient unit comprises those units with an efficiency score of 1 with respect to the optimal weights of the inefficient unit. Such units can be located by reference to the matrix of all efficiencies generated by evaluating every unit with respect to the weights of every other unit as well as its own. Thus in (P1), if there are n dmu's this is an (n×n) matrix in which the j'th row represents the efficiencies of each unit with respect to the weights of the j'th unit, and the j'th column gives the scores for the j'th unit over all the different units' weights. By definition then the maximum value in the j'th column will occur in the j'th row. We may call this the problem's Efficiency Matrix.

To illustrate the above points consider Merton again. Its reference set comprised Stockport, Leeds and City of London. Perhaps the most telling comparison is between Merton and Stockport and the details are given in Table 4.

Table 4: Comparison of Merton with Stockport and "targets" for Pareto efficiency

                                            Merton   Stockport   Merton "Target"
    Relative efficiency (Merton's weights)   0.329       1.0          1.0
    Total collection costs                  11.190       5.760        3.680
    Non-council hereditaments                6.578      10.909        6.578
    Rate rebates granted                    10.900      13.392       10.900
    Summonses and distress warrants          3.523      11.527        6.924
    NPV of non-council rates                 3.456       4.923        3.456

Now if we look at the first two columns of the Table we can see quite plainly that Stockport's administrative workload was at least 25% greater in each area; they collected 40% more rates by real value; but their costs were only 51.5% of Merton's. While Merton's wage bill would undoubtedly have been raised somewhat by London Allowances, a factor of two would seem hard to explain.
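The Efficiency Matrix described above can be computed directly once each unit's optimal weights are known; a sketch with hypothetical weights follows (single-input case):

```python
# Entry (j, k) is the efficiency of unit k evaluated with unit j's weights,
# so each column's maximum sits in its own row, as the text notes.
import numpy as np

def efficiency_matrix(inputs, outputs, U, v):
    """inputs: (n,); outputs: (n, s); U[j]: unit j's output weights;
    v[j]: unit j's input weight."""
    n = len(inputs)
    E = np.empty((n, n))
    for j in range(n):
        for k in range(n):
            E[j, k] = (U[j] @ outputs[k]) / (v[j] * inputs[k])
    return E

inputs = np.array([5.0, 8.0])
outputs = np.array([[6.0, 4.0], [8.0, 9.0]])
U = np.array([[0.8, 0.05], [0.0, 0.85]])   # hypothetical optimal weights
v = np.array([1.0, 1.0])
E = efficiency_matrix(inputs, outputs, U, v)
print(np.round(E, 3))
```

An inefficient unit's reference set can then be read off as the units scoring 1 in its row of this matrix.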


In order to find out what values in its input/output profile would have shown Merton to be Pareto efficient, we have to go back to model (P3). In (P3) it transpires that unit jo, for which zo* ...

... if (μ, λ) is a solution of the LCP (3), then λ is an optimum solution for (1). Conversely, if λ is an optimum solution of (1), let μ = Q^T Qλ − Q^T q; then (μ, λ) is a solution of the LCP (3). Thus, any NPP is equivalent to an LCP with a PSD symmetric matrix.

Now, suppose M is a PSD symmetric matrix of order m × m and rank n. Consider the LCP (b, M), which is to find w ∈ R^m, z ∈ R^m satisfying

    w − Mz = b,  w, z ≥ 0,  w^T z = 0                                    (4)


K. S. AL-SULTAN AND K. G. MURTY

We can find a matrix Q of order n × m and rank n such that M = Q^T Q (for example, Q can be taken as the Cholesky factor of M; subroutine LCHRG in IMSL computes this). Consider the following system in variables y = (y_1, ..., y_n)^T

    Q^T y = −b                                                           (5)

(5) may not have a solution. As an example, consider

    b = (−4, −7)^T,  M = ( 1  1 )
                         ( 1  1 )

Here Q = (1, 1). In this example it can be verified that system (5) has no solution, even though the LCP (b, M) has the unique solution (w = (3, 0)^T, z = (0, 7)^T). If (5) does have a solution y, then the LCP (b, M) is equivalent to the NPP [Q; y] as discussed above.

Theorem 1: Consider the LCP (b, M) where M is a PSD symmetric matrix of order m and rank n. Let Q be any matrix of order n × m satisfying M = Q^T Q. Then the LCP (b, M) can be transformed into a nearest point problem iff the system Q^T y = −b has a solution y.

Proof: If y is a solution of Q^T y = −b, the LCP (b, M) is equivalent to the NPP [Q; y] as discussed above. Now, suppose LCP (b, M) is equivalent to an NPP [Q'; q']. Without any loss of generality we can assume that Q' is of full row rank. From the above discussion we know that (Q')^T Q' = M and (Q')^T q' = −b. Since M is of order m × m and rank n, these facts imply that Q' is of order n × m. From standard results in linear algebra we know that if Q is any matrix of order n × m satisfying Q^T Q = M, then the linear hull of the column vectors of Q^T is the same as F, the linear hull of the column vectors of M, and hence (5) has a solution iff −b ∈ F, which depends only on M. Thus, if (5) has a solution y for some Q satisfying Q^T Q = M, then it has a solution for all such Q. These facts imply that the system (Q')^T y = −b must have a solution y too, in this case.



Corollary 1: The LCP (b, M), where M is a PSD symmetric matrix, can be transformed into an NPP iff b ∈ F, the linear hull of the set of column vectors of M.

NEAREST POINTS IN NONSIMPLICIAL CONES AND LCPs

Proof: Follows from the proof of Theorem 1.
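Corollary 1 yields a simple numerical test (a sketch; the residual tolerance is our own choice): the LCP (b, M) transforms into an NPP iff b lies in the linear hull of the columns of M.

```python
# Check column-space membership via a least-squares residual.
import numpy as np

def transformable_to_npp(M, b, tol=1e-10):
    x = np.linalg.lstsq(M, b, rcond=None)[0]
    return bool(np.linalg.norm(M @ x - b) <= tol)

M = np.array([[1.0, 1.0],
              [1.0, 1.0]])        # PSD, rank 1 (M = Q^T Q with Q = (1, 1))
print(transformable_to_npp(M, np.array([2.0, 2.0])))    # in the column space
print(transformable_to_npp(M, np.array([-4.0, -7.0])))  # not in it
```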




A vector λ ∈ R^m, λ ≥ 0, is an optimum solution for (1) iff there exists a μ ∈ R^m, μ ≥ 0, such that (μ, λ) together satisfy (3); hence μ = Q^T(Qλ) − Q^T q, and as Qλ is constant for all optimum solutions λ to (1), this implies that the vector μ remains the same in all solutions of the LCP (3). Thus, if (μ̄, λ̄) is a solution of the LCP (3), every solution of it is of the form (μ̄, λ) where λ satisfies: λ_j = 0 if j is such that μ̄_j > 0, μ̄ − Q^T Qλ = −Q^T q, and λ ≥ 0.

Definition: An index j, 1 ≤ j ≤ m, is said to be a critical index for the LCP (3), or for the NPP (1), if λ_j > 0 in some solution of (3); or equivalently, if the nearest point in Pos(Q) to q can be expressed as Σ_{k=1}^m λ_k Q.k where λ_k ≥ 0 for all k, and λ_k > 0 for k = j. Hence, if j is a critical index for (3), (1), μ_j = 0 in every solution for (3).

Reduction to a lower dimensional problem using a critical index

Consider (1), (3). Let M = (m_ij) = Q^T Q, b = (b_i) = −Q^T q. Suppose 1 is a critical index. Then in every solution to (3) we have μ_1 = 0, and hence (3) is equivalent to

    −M_1. λ = b_1
    μ_i − M_i. λ = b_i,  i = 2 to m
    μ_2 to μ_m ≥ 0,  λ ≥ 0;  and  μ_i λ_i = 0,  i = 2 to m               (6)

Since m_11 = (Q.1)^T Q.1 > 0, from the first equation in (6), we can get λ_1 in terms of λ_2 to λ_m and eliminate it from the system. This leads directly into an LCP in the variables μ_2 to μ_m and λ_2 to λ_m. To get this lower order LCP, perform a Gaussian pivot step on (−M : b) with −M.1 as the pivot column and row 1 as the pivot row. Suppose this leads to

    −m̄_11   −m̄_12   ...   −m̄_1m   b̄_1
     0      −m̄_22   ...   −m̄_2m   b̄_2
     ...
     0      −m̄_m2   ...   −m̄_mm   b̄_m


Let

    M̄ = (m̄_ij : i, j = 2 to m),  b̄ = (b̄_2, ..., b̄_m)^T,  w̄ = (μ_2, ..., μ_m)^T,  λ̄ = (λ_2, ..., λ_m)^T.

Then the lower order LCP obtained after eliminating λ_1 from (6) is

    w̄ − M̄ λ̄ = b̄,  w̄, λ̄ ≥ 0,  w̄^T λ̄ = 0                                  (7)

Since 1 is a critical index, every solution for (3) comes from a solution to (7), with μ_1 = 0 and λ_1 obtained from the first equation in (6). Since M is PSD, symmetric and of rank n, from the manner in which M̄ is obtained, M̄ of order (m−1) × (m−1) is PSD, symmetric, and of rank (n−1). Also, since b is in the linear hull of the columns of M, b̄ is in the linear hull of the columns of M̄. So M̄ has a Cholesky factor Q̄ of order (n−1) × (m−1) and rank (n−1), and b̄ is in the linear hull of the columns of Q̄^T, i.e., the system Q̄^T ȳ = −b̄ has a solution ȳ ∈ R^{n−1}. Thus (7) is equivalent to the lower order NPP [Q̄; ȳ] where Q̄ is of order (n−1) × (m−1) and rank n−1. Hence, if a critical index for (1) is known, it can be reduced to a lower dimensional problem.
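The pivot-based reduction can be sketched as follows (illustrative data; index 1 is simply assumed critical here, since the pivot mechanics do not depend on that fact):

```python
# Gaussian pivot on (-M : b) with pivot (row 1, column 1), then drop row 1
# and column 1 to obtain the lower order data (Mbar, bbar) of (7).
import numpy as np

def reduce_lcp(M, b):
    """Assumes M[0, 0] > 0 and (for the theory) that index 1 is critical."""
    T = np.hstack([-M, b.reshape(-1, 1)]).astype(float)
    T[0] /= T[0, 0]                      # scale the pivot row
    for i in range(1, T.shape[0]):
        T[i] -= T[i, 0] * T[0]           # eliminate the pivot column
    Mbar = -T[1:, 1:-1]                  # Schur complement of M[0, 0]
    bbar = T[1:, -1]
    return Mbar, bbar

Q = np.array([[1.0, 2.0, 0.0],
              [0.0, 1.0, 1.0]])
M = Q.T @ Q                              # PSD, rank 2
q = np.array([1.0, -1.0])
Mbar, bbar = reduce_lcp(M, -Q.T @ q)
print(np.round(Mbar, 3)); print(np.round(bbar, 3))
```

Mbar is exactly the Schur complement M[1:, 1:] − M[1:, 0] M[0, 1:] / M[0, 0], so it inherits PSD-ness and loses one unit of rank, as the text states.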

Reduction Using Geometrical Arguments

Given a critical index for the NPP [Q; q], its reduction to a lower dimensional NPP can also be carried out using geometric arguments as discussed4. Again, suppose 1 is a critical index. Then the nearest point to q in Pos(Q) is also the nearest point to q in Pos((Q : −Q.1)). Define

    Q̄.j = Q.j − Q.1 (Q.1)^T Q.j / ||Q.1||²,  j = 2 to m;   q̄ = q − Q.1 (Q.1)^T q / ||Q.1||²

For j = 2 to m, Q̄.j is the orthogonal projection of Q.j in the hyperplane through the origin orthogonal to the ray of Q.1; likewise q̄ is the orthogonal projection of q in the same hyperplane. The cone Pos((Q : −Q.1)) is the direct sum of the full line generated by Q.1 and the cone Pos(Q̄). Let Q̄ = (Q̄.2, ..., Q̄.m). If x̄* is the solution of the NPP [Q̄; q̄] as embedded in R^n, then x* = x̄* + Q.1 (Q.1)^T q / ||Q.1||² is the solution of the original NPP [Q; q].


Q̄ is of order n × (m−1), and from the construction it is clear that the rank of Q̄ is one less than the rank of Q; hence the set of row vectors of Q̄ form a linearly dependent set. It can be verified that (Q̄)^T(Q̄) = M̄ as discussed earlier, and that the NPP [Q̄; q̄] is equivalent to the LCP (b̄, M̄). Since Q̄ is not of full row rank, it is possible to transform the NPP [Q̄; q̄] into an equivalent NPP of the form [Q'; q'], where Q' is obtained from Q̄ by dropping a dependent row vector from it, as discussed in Section 1, but it is not necessary to do this work.
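These projection formulas can be sketched numerically (illustrative data):

```python
# Project the columns Q.j (j >= 2) and the point q onto the hyperplane
# through the origin orthogonal to Q.1, as in the reduction above.
import numpy as np

def reduce_on_ray(Q, q):
    """Returns (Qbar, qbar): the projections off the ray of Q.1."""
    q1 = Q[:, 0]
    P = np.eye(len(q1)) - np.outer(q1, q1) / (q1 @ q1)   # projector
    return P @ Q[:, 1:], P @ q

Q = np.array([[1.0, 0.0, 2.0],
              [0.0, 1.0, 1.0],
              [1.0, 1.0, 0.0]])
q = np.array([3.0, 1.0, 2.0])
Qbar, qbar = reduce_on_ray(Q, q)
print(np.round(Qbar, 3)); print(np.round(qbar, 3))
# qbar and every column of Qbar are orthogonal to Q.1, as the
# construction requires:
print(abs(qbar @ Q[:, 0]) < 1e-12, np.abs(Qbar.T @ Q[:, 0]).max() < 1e-12)
```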

4. CONDITIONS FOR CRITICALITY

Given z ∈ R^n, z ≠ q, define B(q; z) = {x : ||x − q|| < ||x − z||} ⊂ R^n. T(q; z) = {x : (x − z)^T(q − z) = 0} is the tangent plane to B(q; z) at its boundary point z.

Property 1: The point z ∈ R^n is said to satisfy Property 1 if z ≠ 0 and it is the nearest point (by Euclidean distance) to q on its ray {γz : γ ≥ 0}.

It can be verified that z satisfies Property 1 iff z ≠ 0 and z^T(q − z) = 0, i.e., 0 ∈ T(q; z). For such points, we therefore have T(q; z) = {x : x^T(q − z) = 0}. The open half space defined by T(q; z) containing q is called the near side of T(q; z), while its complement is called the far side of T(q; z). So, for points z satisfying Property 1, the near side of T(q; z) is {x : x^T(q − z) > 0}, and the far side of T(q; z) is {x : x^T(q − z) ≤ 0}. For z satisfying Property 1, we define

    N(z) = {j : j = 1 to m and (Q.j)^T(q − z) > 0};

it is the set of j such that Q.j is on the near side of T(q; z). Let Γ = {1, ..., m}. For ∅ ≠ S ⊂ Γ, we define

    Q.S = matrix of order n × |S| whose column vectors are Q.j for j ∈ S;
    H(S) = linear hull of the column vectors of Q.S in R^n;
    q(S) = orthogonal projection of q in H(S).


In the algorithm, we will deal with subsets S ⊂ Γ satisfying: {Q.j : j ∈ S} is linearly independent. For such sets S, we have q(S) = Q.S ((Q.S)^T Q.S)^{−1} (Q.S)^T q. We have the following Theorems 2, 3 based on corresponding results4 for the NPP in simplicial cones.

Theorem 2: A point x ∈ Pos(Q) is the nearest point in Pos(Q) to q iff (i) x satisfies Property 1, and (ii) either x = q itself, or N(x) = ∅.

Proof: If x = q, q ∈ Pos(Q) and hence q is the nearest point in Pos(Q) to itself. So, assume that x ≠ q. Since x ∈ Pos(Q), it is the nearest point in Pos(Q) to q iff Pos(Q) ∩ B(q; x) = ∅, i.e., iff x is a boundary point of Pos(Q) and T(q; x) separates Pos(Q) and B(q; x), which happens iff N(x) = ∅.



Theorem 3: Assume that q ∉ Pos(Q). Let x ∈ Pos(Q) satisfy Property 1. If N(x) is a singleton set {h}, then h is a critical index for the NPP [Q; q].

Proof: Since N(x) ≠ ∅, by Theorem 2, x is not the nearest point in Pos(Q) to q. Let x* be the nearest point in Pos(Q) to q. So ||x* − q|| < ||x − q|| and hence x* ∈ B(q; x). Since x satisfies Property 1, 0 ∈ T(q; x). Also, since N(x) = {h}, Q.j is on the far side of T(q; x), that is, T(q; x) separates the ray of Q.j from B(q; x), for all j ≠ h. Hence T(q; x) separates Pos(Q.1, ..., Q.h−1, Q.h+1, ..., Q.m) and B(q; x). These facts imply that Pos(Q) ∩ B(q; x) ≠ ∅, and that none of the points in it is in Pos(Q.1, ..., Q.h−1, Q.h+1, ..., Q.m). Hence x* satisfies this property too; that is, if x* = Σ_{j=1}^m λ_j Q.j with λ_j ≥ 0 for all j, then we must have λ_h > 0. Therefore h is a critical index for the NPP [Q; q].

5. THE ALGORITHM

If (Q.j)^T q ≤ 0 for all j = 1 to m, N(0) = ∅, and hence 0 is the nearest point in Pos(Q) to q by Theorem 2; in this case (μ* = −Q^T q, λ* = 0) is the solution of the LCP (3).


So, in the sequel we assume that (Q.j)^T q > 0 for at least one j. The algorithm applies a routine for finding a critical index at most n − 2 times. This routine itself may either terminate with the nearest point (in which case the whole algorithm terminates), or with a critical index. In the latter case we use the critical index to reduce the NPP into one of lower dimension as discussed above, and repeat the whole procedure on the reduced problem. As the rank reduces by 1 with each reduction, this routine will be called at most (n − 1) times during the algorithm. Actually, when the rank becomes 2, the reduced problem can be solved by the special direct algorithm discussed in Section 2, and then from this solution an optimum combination vector for the original problem can be built up using equations of the form of the first in (6) from earlier reduction steps.

The routine maintains a subset S ⊂ Γ such that {Q.j : j ∈ S} is linearly independent, and a current point x ∈ Pos(Q.S) which always satisfies Property 1. x gets closer to q as the algorithm progresses. λ̄ = (λ̄_j) ≥ 0 is the combination vector corresponding to x, satisfying: λ̄_j = 0 for all j ∉ S, and Qλ̄ = x.

Routine for Finding a Critical Index

Step 1 [Initialization]: For each j = 1 to m, define q^j as in (2). q^j is the nearest point to q in the ray of Q.j. If q^j = 0 for all j = 1 to m, 0 is the nearest point in Pos(Q) to q; terminate the whole algorithm. If q^j ≠ 0 for at least one j, let q^h be the nearest to q among all the q^j; break ties arbitrarily. Set x = q^h; it satisfies Property 1. Let S = {h}. Define the corresponding λ̄ = (λ̄_j) where λ̄_h = (Q.h)^T q / ||Q.h||², and λ̄_j = 0 for j ≠ h.

Step 2: Compute N(x) = {j : 1 ≤ j ≤ m and (Q.j)^T(q − x) > 0}. If N(x) = ∅, x is the nearest point in Pos(Q) to q, and the corresponding λ̄ is an optimum combination vector; terminate the whole algorithm. If N(x) is a singleton set {h}, h is a critical index for the present nearest point problem; terminate this routine. If |N(x)| ≥ 2, go to Step 3.
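Steps 1 and 2 of the routine can be sketched as follows (illustrative data; the strict-positivity tolerance in N(x) is our own choice):

```python
# Initialize at the best single-ray nearest point, then classify via N(x).
import numpy as np

def step1(Q, q):
    """Nearest point to q among the rays of the columns of Q (Step 1).
    Returns (x, h), or (None, None) if 0 is already the nearest point."""
    x, h, best = None, None, 0.0
    for j in range(Q.shape[1]):
        cj = Q[:, j]
        t = cj @ q
        if t > 0 and t * t / (cj @ cj) > best:   # larger value => closer to q
            best = t * t / (cj @ cj)
            h = j
            x = (t / (cj @ cj)) * cj             # nearest point on the ray
    return x, h

def near_set(Q, q, x, tol=1e-12):
    """N(x): columns of Q on the near side of the tangent plane T(q; x)."""
    return [j for j in range(Q.shape[1]) if Q[:, j] @ (q - x) > tol]

Q = np.array([[1.0, 0.0],
              [0.0, 1.0]])
q = np.array([2.0, -1.0])
x, h = step1(Q, q)
print(x, near_set(Q, q, x))   # N(x) is empty: x is the nearest point
```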


Step 3: If N(x) ∩ (Γ\S) ...

... and α_2 > 0. Define λ̂ = (λ̂_j) where λ̂_j = α_1 λ̄_j for j ∈ S, α_2 for j = p, and 0 otherwise. Let S_1 = S ∪ {p}. Obtain ((Q.S_1)^T Q.S_1)^{−1} from ((Q.S)^T Q.S)^{−1} using any of the updating schemes. If linear dependence of Q.p with the columns of Q.S is signaled, go to Step 5; otherwise replace S, x, λ̄ by S_1, x̂, λ̂ respectively and go back to Step 2.

Step 5: Compute q(S) = Q.S a(S), where a(S) = (a_j(S) : j ∈ S) = ((Q.S)^T Q.S)^{−1} (Q.S)^T q. If a(S) ≥ 0, define λ̂ = (λ̂_j) by λ̂_j = a_j(S) if j ∈ S, 0 otherwise. Replace x, λ̄ by q(S), λ̂ respectively and go back to Step 2. If a(S) ≱ 0, go to Step 6.

Step 6: Let x̄ denote the last point in Pos(Q) as we move from x along the line segment joining it to q(S). Actually

    x̄ = Σ_{j∈S} ((1 − β) λ̄_j + β a_j(S)) Q.j

where

    β = min { λ̄_j / (λ̄_j − a_j(S)) : j ∈ S such that a_j(S) < 0 }         (8)

Let k ∈ S attain the minimum in (8); break ties arbitrarily. Define λ̂ = (λ̂_j) where λ̂_j = (1 − β) λ̄_j + β a_j(S) for j ∈ S, and 0 for j ∉ S. Replace x, λ̄ by x̄, λ̂ respectively, delete k from S, and go to Step 5.


Finite Convergence of the Algorithm

We have the following facts:

1. At each pass through Step 4, ||x − q|| strictly decreases. At each pass through Steps 5 and 6, ||x − q|| may decrease or stay the same.

2. Step 4 can be performed at most n times consecutively before linear dependence is signaled.

3. Steps 5 and 6 could be performed at most n − 1 times consecutively before the projection gets into the relative interior of the Pos of the current set.

4. The same subset of Γ cannot reappear as S once it changes, due to 1.

Since there are only a finite number of subsets of Γ, these facts imply that the routine must terminate in a finite number of steps. Also, the overall algorithm is finite as it uses the above routine at most (n − 2) times.

6. IMPLEMENTATION AND RELATED NUMERICAL ISSUES

((Q.S)^T Q.S)^{−1} can be updated by standard updating schemes for projection matrices whenever S changes (as it always changes by the addition or deletion of one element). These schemes will detect linear dependence of the set of column vectors of the new Q.S, if it occurs when a new element is added to the set S. An even better implementation is obtained by updating and using the Cholesky factor of (Q.S)^T Q.S instead of its inverse; this cuts down the amount of work for each updating from O(|S|³) to O(|S|²).

Computing N(x) completely in Step 2 is expensive, particularly when m is large. We found that an efficient way to carry out Step 2, and to select a p for carrying out Step 4 if that is the step to move next, is to begin the search for the new p from the previous p, continue up to m, and then from 1 up to the previous p again, until the first eligible candidate is noticed, at which point the search is terminated by


selecting that candidate as the new p. This is similar to the LRC (least recently considered) entering index strategy commonly mentioned in the literature on the Simplex algorithm for linear programming.
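The wrap-around LRC search just described can be sketched as:

```python
# Start just past the previous p, wrap at m, and stop at the first
# eligible candidate.
def lrc_search(m, prev_p, eligible):
    """eligible(j) -> bool for indices 0..m-1.  Scans prev_p+1, ..., m-1,
    then 0, ..., prev_p; returns the first eligible index, or None."""
    for step in range(1, m + 1):
        j = (prev_p + step) % m
        if eligible(j):
            return j
    return None

# Example: with eligible indices {1, 4} and previous p = 2, the scan
# reaches 4 before wrapping around to 1.
print(lrc_search(6, 2, lambda j: j in (1, 4)))
```

The payoff is that the scan usually stops long before examining all m candidates, avoiding the full computation of N(x) on every pass.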

Computational Experience

The above algorithm was coded in FORTRAN 77 and tested using APOLLO series 4000 machines. The performance of the algorithm is compared with the performance of the following two algorithms: that of Haskell and Hanson5 for solving linearly constrained least squares problems (HH), which - in our case - boils down to their implementation of Lawson and Hanson6 for Pos cones (the code is available as ACM software 587); and the algorithm WIL of D.R. Wilhelmsen2. We coded Wilhelmsen's algorithm using FORTRAN 77. We used updating procedures based on the Cholesky factor, and we used the LRC (least recently considered) entering rule as discussed above. Surprisingly this happens to be far superior to other rules. These strategies accelerate this algorithm a lot.

Test problems were constructed by generating the elements of Q to be uniformly distributed between −5 and 5 and the elements of q to be uniformly distributed between −20 and 20. All the test problems are fully dense. The timings of all three algorithms are shown in Table 1. It is clear from the table that our algorithm is 2-4 times faster than the HH and the Wilhelmsen algorithms on an average. Moreover, the algorithm performs much better than HH when the cone is close to simplicial (or when m is comparable to n), while marginally better when m ≫ n. This may be due to the fact that the computationally cheap two ray series of projections (Step 4 of the algorithm) becomes less effective when m ≫ n.

The work in our algorithm consists of three types of steps: (i) the orthogonal projection of a point into a two dimensional subspace, (ii) orthogonal projection of a point into a subspace of dimension > 2, and (iii) reduction of the problem into one of lower dimension using a critical index. Among steps (i) and (ii), clearly (ii) is far more expensive than (i). In Table 2, we provide the total number of steps of types (i) and (ii) carried out on an average per problem in our algorithm as the order of Q ranges from 50 × 70 to 600 × 800. The number of steps of type (i) clearly grows as the order of Q increases,


Table 1: Summary of Performance Results
(Average time per problem in APOLLO 4000 seconds)

      n     m   # problems   our algorithm        HH       WIL
     50    70       10              6.84         8.53      9.55
    150   150       10             49.9         74.1      93.6
    200   250       10            180.3        412.5     400.1
    300   400       10            790.0       1544.0    1644.1
    400   500        5           1194.5       3038.9    3118.4
    500   550        5           1470.4       4778.5    4364.4
    600   800        3           5285.0      11927.0   13260.7

Table 2: Average number of type (i) and (ii) projections per problem

      n     m   # type (i) projections   # type (ii) projections
     50    70           52.8                      3.5
    100   150          116.4                      4.5
    200   250          177.6                      3.7
    300   400          303.4                      4.2
    400   500          351.6                      3.9
    500   550          357.2                      3.4
    600   800          587.0                      4.67

but the number of steps of type (ii) seems to be more or less constant (between 3 to ,5). Tbe good performance of oU!' algorithm stems from tbe fact that the majority of work in it involves steps of type (i) which are computationally cheap.

ACKNOWLEDGEMENTS

This work was partially supported by NSF Grant No. ECS-8521183 and by King Fahd University of Petroleum and Minerals.


7. REFERENCES

1. Gilbert, E.G., Johnson, D.W., and Keerthi, S.S. (1988) A Fast Procedure for Computing the Distance Between Complex Objects in Three-Dimensional Space, IEEE Journal of Robotics and Automation, 4 (2), 193-203.
2. Wilhelmsen, D.R. (1976) A Nearest Point Algorithm for Convex Polyhedral Cones and Applications to Positive Linear Approximations, Mathematics of Computation, 30, 48-57.
3. Chang, S.Y. and Murty, K.G. (1989) The Steepest Descent Gravitational Method for Linear Programming, Discrete Applied Mathematics, 25, 211-239.
4. Murty, K.G. and Fathi, Y. (1982) A Critical Index Algorithm for the Nearest Point Problem on Simplicial Cones, Mathematical Programming, 23, 206-215.
5. Haskell, K. and Hanson, R. (1981) An Algorithm for Linear Least Squares Problems with Equality and Nonnegativity Constraints, Mathematical Programming, 21, 98-118.
6. Lawson, C.L. and Hanson, R.J. (1974) Solving Least Squares Problems, Englewood Cliffs, NJ, Prentice-Hall, Inc.
7. Golub, G.H. and Van Loan, C.F. (1989) Matrix Computations, Baltimore, MD, Johns Hopkins University Press.
8. Murty, K.G. (1988) Linear Complementarity, Linear and Nonlinear Programming, West Berlin, Heldermann Verlag.
9. Wolfe, P. (1974) Algorithm for a Least Distance Programming, Mathematical Programming Study 1, 190-205.
10. Wolfe, P. (1976) Finding the Nearest Point in a Polytope, Mathematical Programming, 11, 128-149.

A STUDY ON MONOTROPIC PIECEWISE QUADRATIC PROGRAMMING

JIE SUN
Department of Industrial Engineering and Management Sciences, Northwestern University, Evanston, IL 60208, USA

Abstract  We explore a new model in mathematical programming in which a separable convex piecewise quadratic function is minimized subject to linear constraints. The discussion includes basic theories such as duality, optimality, boundedness of solutions, and parametric properties, as well as some examples of applications. We also briefly review algorithms developed for this model.

Key Words

monotropic programming, optimization.

1. INTRODUCTION

In this paper we study a new model in mathematical programming: minimizing a separable convex piecewise quadratic function subject to linear constraints (monotropic piecewise quadratic programming, or PQP for short). The mathematical statement of the PQP problem is as follows:

(PQP)    minimize    F(x) = Σ_{j=1}^n f_j(x_j)
         subject to  Ax = b,

where A ∈ R^{m×n}, b ∈ R^m, and x = (x_1, ..., x_n)^T ∈ R^n.

DOI: 10.1201/9780429333439-14

Each f_j is


a convex piecewise quadratic function of the single variable x_j, i.e.

f_j(x_j) = { +∞,                                          if x_j < c_{j0}
           { (1/2)p_{j1}x_j^2 + q_{j1}x_j + r_{j1},        if c_{j0} ≤ x_j ≤ c_{j1}
           {   ...
           { (1/2)p_{jk_j}x_j^2 + q_{jk_j}x_j + r_{jk_j},  if c_{j,k_j−1} ≤ x_j ≤ c_{jk_j}
           { +∞,                                          if x_j > c_{jk_j},

where for each j we call the numbers c_{j0}, c_{j1}, ..., c_{jk_j} the breakpoints of f_j; they satisfy

−∞ ≤ c_{j0} ≤ c_{j1} ≤ ... ≤ c_{jk_j} ≤ +∞.

FIGURE 1 depicts some of the possibilities for the function f_j and the bounds c_j^−(= c_{j0}) and c_j^+(= c_{jk_j}). FIGURE 2 illustrates a PQP problem with n = 2, m = 1 and k_1 = k_2 = 2. It is not hard to see that linear programming, separable convex quadratic programming, and piecewise linear programming are special cases of this model, and that this model is, on the other hand, a special case of monotropic programming (convex separable programming with linear constraints1).

The importance of this model is twofold. First, some practical problems, especially those involving network structure, variable costs, stochastic factors, or "soft constraints" (i.e. constraints for which some sort of violation is allowed; see the next section for details) can be directly formulated as PQP (e.g. the stochastic transshipment problem in Sun's dissertation2); second, PQP can arise as a subproblem in solving more complex mathematical programs and can be used to approximate, locally or globally, certain convex nondifferentiable programs. Challenges also exist in developing efficient algorithms for PQP because none of the current algorithms for quadratic or smooth nonlinear programming can be applied to this problem.

There have been papers scattered in the literature discussing "piecewise quadratic programming" in various senses. For instance, Wilde and Acrivos3 studied a problem in production scheduling in which the objective function has different quadratic expressions on its domain. An application to portfolio optimization by Perold4 incorporated both quadratic and piecewise linear terms in the objective function. It was Rockafellar who first proposed the PQP model in our sense1. Later, he and Wets5,6 demonstrated how the problems in optimal control and stochastic programming can be formulated as
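To make the definition concrete, here is an illustrative Python sketch (not from the chapter; the helper name is an assumption) that evaluates f_j from its breakpoints c = (c_{j0}, ..., c_{jk_j}) and per-piece coefficients p, q, r:

```python
import math

def make_pw_quadratic(c, p, q, r):
    """Build f_j from breakpoints c[0..k] and piece data p, q, r (length k),
    following the definition above: f(x) = +inf outside [c[0], c[k]], and
    0.5*p[i]*x^2 + q[i]*x + r[i] on the i-th interval [c[i], c[i+1]]."""
    def f(x):
        if x < c[0] or x > c[-1]:
            return math.inf
        for i in range(len(p)):
            if x <= c[i + 1]:
                return 0.5 * p[i] * x * x + q[i] * x + r[i]
        return math.inf  # unreachable for x inside [c[0], c[-1]]
    return f
```

For convexity the pieces must agree in value and have nondecreasing slopes at the breakpoints, as in the figures referred to above.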

PIECEWISE QUADRATIC PROGRAMMING

FIGURE 1. Possible shapes of the function f_j: piecewise linear; constant; dom f_j a single point (breakpoints c_{j1}, c_{j2}).

Proposition (2.4.2). Φ(ξ, η^0) and −Φ(ξ^0, η) are proper, closed, convex, and piecewise quadratic functions of ξ and η, respectively.

Proof. Theorem (2.2.1) ensures that the dual of PQP(ξ^0, η^0) has at least one optimal solution u^0 corresponding to the optimal value Φ(ξ^0, η^0). This u^0 remains dual feasible no matter what the choice of ξ is, because the vector ξ appears in the dual program only as a coefficient vector of the objective function. Theorem (2.2.1) then gives the relation

Φ(ξ, η^0) = inf_{x∈R^n} F(x, ξ, η^0) = sup_{u∈R^m} G(u, ξ, η^0) > −∞.

The formula for −G in the proof of Proposition (2.4.1) says that

−G(u, ξ, η^0) = u·ξ + G_0(u),


where G_0(u) = u·b + H*(−A^T u − η^0) is a proper convex piecewise quadratic function of u. The above formula for Φ says

Φ(ξ, η^0) = sup_{u∈R^m} {−u·ξ − G_0(u)} = G_0*(−ξ).

So Φ(ξ, η^0) is a proper, closed, convex, and piecewise quadratic function of ξ.

The proof for −Φ(ξ^0, η) is similar. We first notice that the primal optimal solution x^0 of PQP(ξ^0, η^0) remains primal feasible for all problems PQP(ξ^0, η). This implies by Theorem (2.2.1) that

Φ(ξ^0, η) = inf_{x∈R^n} F(x, ξ^0, η) = sup_{u∈R^m} G(u, ξ^0, η) < +∞.

Therefore

−Φ(ξ^0, η) = sup_{x∈R^n} {−F(x, ξ^0, η)} = F_0*(−η),

where F_0 is a certain proper, closed, convex, and piecewise quadratic function. The conclusion then follows. ∎

Now we consider the directional derivative of Φ(ξ, η^0), which is often of interest in optimization. For simplicity of notation, we write Φ(ξ, η^0) as Φ(ξ), and G(u, ξ, η^0) as G(u, ξ).

Proposition (2.4.3). If there is a ξ^0 such that Φ(ξ^0) is finite, then for all ξ in the domain of Φ and y ∈ R^m we have

Φ'(ξ; y) = sup_{u∈U(ξ)} {−y·u},

where U(ξ) is the set of dual optimal solutions to PQP(ξ, η^0).

Proof. We know that Φ is proper, closed, convex, and piecewise quadratic (Proposition (2.4.2)), so by convex analysis8

Φ'(ξ; y) = sup_{u∈∂Φ(ξ)} y·u = sup_{−u∈∂Φ(ξ)} {−y·u}.

On the other hand, we have

−u ∈ ∂Φ(ξ)  ⟺  Φ(ξ) + Φ*(−u) = −u·ξ  ⟺  Φ(ξ) + G_0(u) = −u·ξ,


because Φ(ξ) = G_0*(−ξ) implies Φ*(−u) = G_0(u). The last equivalence says that u maximizes G(·, ξ) for this fixed ξ, because of the Fenchel inequality Φ(ξ) + Φ*(−u) ≥ −u·ξ. This is equivalent to u ∈ U(ξ). Hence

Φ'(ξ; y) = sup_{−u∈∂Φ(ξ)} {−y·u} = sup_{u∈U(ξ)} {−y·u}.  ∎

A symmetric result can be obtained for −Φ(ξ^0, η). By Proposition (2.1.6) and Theorem (2.2.2), the set of dual optimal solutions U(ξ) is a convex polyhedron. Thus Proposition (2.4.3) says that finding the directional derivative Φ'(ξ; y) is equivalent to solving a linear program. This property can be used to determine the subdifferential curve of "quadratic black boxes9", which have an application in a decomposition algorithm for quadratic network programming (see Section 3.3).

3. SOME EXAMPLES OF APPLICATIONS

3.1. Dual Quadratic Programming

Since any convex quadratic function can be efficiently diagonalized by a nonsingular linear transformation10, we may focus on the separable convex quadratic program

(Q)    minimize    Σ_{j=1}^n ((1/2)p_j x_j^2 + q_j x_j)
       subject to  Ax = b,  x ∈ B,

where B is a certain box in R^n. According to Section 2.2, the dual of (Q) can be expressed as

(DQ)   maximize    −u·b − Σ_{j=1}^n g_j(v_j)
       subject to  v = −A^T u,

where g_j is a conjugate function. For example, if p_j > 0, then g_j(v_j) is the conjugate function of

r_j(x_j) = { (1/2)p_j x_j^2 + q_j x_j,  if x_j ≥ 0
           { +∞,                        if x_j < 0.

Hence

g_j(v_j) = { 0,                     if v_j ≤ q_j
           { (v_j − q_j)^2/(2p_j),  if v_j > q_j.

Because at least one p_j ≠ 0, the problem (DQ), if converted to a minimization problem, is a PQP problem. Moreover, if any variable x_j has an upper bound as well as a lower bound, then the function g_j is in general of three pieces.
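The conjugate formula can be sanity-checked numerically; a small sketch (scalar p_j > 0 and q_j; function names are assumptions) comparing the closed form with a grid approximation of sup_{x≥0} {vx − r_j(x)}:

```python
def g(v, p, q):
    """Conjugate of r(x) = 0.5*p*x^2 + q*x on x >= 0 (p > 0):
    g(v) = 0 for v <= q, and (v - q)^2 / (2p) otherwise."""
    return 0.0 if v <= q else (v - q) ** 2 / (2.0 * p)

def g_grid(v, p, q, xmax=10.0, steps=100000):
    """Brute-force sup_{x >= 0} { v*x - 0.5*p*x^2 - q*x } on a grid."""
    best = 0.0  # x = 0 attains value 0
    for i in range(1, steps + 1):
        x = xmax * i / steps
        best = max(best, v * x - 0.5 * p * x * x - q * x)
    return best
```

The flat piece for v_j ≤ q_j reflects that the supremum is then attained at x_j = 0, which is what makes g_j piecewise quadratic.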

3.2. Soft Constraints

In stochastic programming and optimal control problems, we often prefer to treat linear constraints, either equalities or inequalities, as penalty terms in the objective function, to allow small violations5,6. These constraints are regarded as soft constraints. Let (y)_+ be the n-dimensional vector [max(0, y_1), max(0, y_2), ..., max(0, y_n)]^T, where y = [y_1, ..., y_n]^T. Let ‖y‖_1 and ‖y‖_2 be the one- and two-norms on R^n, respectively:

‖y‖_1 = Σ_{i=1}^n |y_i|,    ‖y‖_2 = (Σ_{i=1}^n y_i^2)^{1/2}.

Consider the following problems:

minimize (1/2)x·Px + p·x + ‖(Ax − b)_+‖_1 over x ∈ B

and

minimize (1/2)x·Px + p·x + ‖(Ax − b)_+‖_2^2 over x ∈ B,

where B is a certain box in R^n, P ∈ R^{n×n}, p ∈ R^n, b ∈ R^m, A ∈ R^{m×n}, with P being positive semidefinite. These problems may originate from minimizing the function (1/2)x·Px + p·x subject to the soft constraints Ax ≤ b and the "hard" constraints x ∈ B. There exists a nonsingular matrix Q satisfying Q^T P Q = D (diagonal). Hence the problems are equivalent to

minimize    F(x, y, z) = (1/2)y·Dy + p·x + ‖(z)_+‖_1
subject to  x = Qy,  z = Ax − b,  and x ∈ B

and

minimize    F(x, y, z) = (1/2)y·Dy + p·x + ‖(z)_+‖_2^2
subject to  x = Qy,  z = Ax − b,  and x ∈ B.


These are PQP problems.
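A minimal pure-Python sketch of the two penalised objectives above (names are illustrative; lists stand in for vectors and matrices):

```python
def soft_objective(P, p, A, b, x, norm="l1"):
    """Evaluate 0.5*x.Px + p.x + ||(Ax - b)_+||_1      (norm="l1")
    or       0.5*x.Px + p.x + ||(Ax - b)_+||_2^2       (norm="l2sq")."""
    n = len(x)
    quad = 0.5 * sum(x[i] * P[i][j] * x[j] for i in range(n) for j in range(n))
    lin = sum(pi * xi for pi, xi in zip(p, x))
    # (Ax - b)_+ : componentwise positive part of the constraint violation
    viol = [max(0.0, sum(Ai[j] * x[j] for j in range(n)) - bi)
            for Ai, bi in zip(A, b)]
    if norm == "l1":
        pen = sum(viol)
    else:  # squared two-norm penalty; keeps the objective piecewise quadratic
        pen = sum(v * v for v in viol)
    return quad + lin + pen
```

Both penalties vanish on the feasible region Ax ≤ b, and each violated row contributes a linear (respectively quadratic) piece, which is exactly why these problems are PQP.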

3.3. Aggregation of Quadratic Networks

Consider a network to each of whose arcs a convex quadratic cost function is associated. If a subnetwork is connected with the rest of the network through two distinguished nodes, we may lump this subnetwork into a single arc. As a result, we obtain a new network whose arcs represent subnetworks of the original network. We call such arcs black boxes, and call the new network consisting of the black boxes the master network. (As an extreme case, any single arc can, of course, be viewed as a black box.) Suppose that we want to solve the quadratic program for the original network, that is,

minimize    Σ_{j=1}^n f_j(x_j)
subject to  Ex = 0,

where E is the node-arc incidence matrix of the network and each f_j is convex and piecewise quadratic. (The requirement that Ex = 0 is not a restriction for solving network problems with Ex = b; see pp. 68-69 of Rockafellar's book1.) Let B be the incidence matrix of the master network, B ∈ R^{μ×ν}, and let the flow of the master network be ξ ∈ R^ν. Correspondingly, let the arc set and flow of the kth black box be A_k and x^k, let its incidence matrix be E_k, and let its initial and terminal nodes be s_k and s_k'.

Definition (3.3.1). The cost function

By the assumptions A1 and A2, if φ(x^k) tends to −∞ along some sequence {0 < x^k ∈ Ω_P}, then q(x^k)


YINYU YE

converges to zero. The algorithm described in this section generates a sequence {0 < x^k ∈ Ω_P} such that

φ(x^k) ≤ φ(x^{k−1}) − α    for k = 1, 2, ...,

where α ≥ 0.2. Let us look at the following problem related to QP:

(QP3)    minimize    q̄(x̄) = x̄_{n+1} q(T^{−1}(x̄))
         subject to  x̄ ∈ Ω̄_P = { x̄ ∈ R^{n+1} : AX^k x̄[n] − x̄_{n+1} b = 0,
                                  e^T x̄ = n + 1,  x̄[n] ≥ 0 and x̄_{n+1} > 0 },

where x̄[n] is the vector of the first n components of x̄ ∈ R^{n+1}, and T^{−1} is the inverse projective transformation T^{−1} : R^{n+1} → R^n defined by

x = T^{−1}(x̄) = X^k x̄[n] / x̄_{n+1}.    (3.1)

It can be verified that x̄ ∈ Ω̄_P implies T^{−1}(x̄) ∈ Ω_P. Conversely, for any x ∈ Ω_P, an x̄ ∈ Ω̄_P can be obtained via T : R^n → R^{n+1} defined by

x̄[n] = (n + 1)(X^k)^{−1} x / (e^T (X^k)^{−1} x + 1),    (3.2a)

and

x̄_{n+1} = (n + 1) / (e^T (X^k)^{−1} x + 1).    (3.2b)

In particular, let x̄^k = T(x^k); then x̄^k = e.
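The transformations (3.1)-(3.2b) are easy to check numerically. An illustrative pure-Python sketch (function names assumed; X^k is passed as the vector x^k of its diagonal entries) — T lands on the simplex e^T x̄ = n + 1, and T^{−1} inverts it exactly:

```python
def T(xk, x):
    """Projective transformation (3.2a)-(3.2b) with X^k = diag(xk):
    maps x in Omega_P to xbar with sum(xbar) = n + 1."""
    n = len(x)
    w = [xi / xki for xi, xki in zip(x, xk)]   # (X^k)^{-1} x
    denom = sum(w) + 1.0                       # e^T (X^k)^{-1} x + 1
    xbar = [(n + 1) * wi / denom for wi in w]  # first n components (3.2a)
    xbar.append((n + 1) / denom)               # component n+1        (3.2b)
    return xbar

def T_inv(xk, xbar):
    """Inverse transformation (3.1): x = X^k xbar[n] / xbar_{n+1}."""
    return [xki * xbi / xbar[-1] for xki, xbi in zip(xk, xbar[:-1])]
```

In particular T(xk, xk) returns the all-ones vector e, matching x̄^k = e above.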

In the rest of this section, x̄ always designates the variable in Ω̄_P and corresponds to x ∈ Ω_P. For example, x̄^k ↔ x^k, x̄* ↔ x*, ā ↔ a, and so on, where ↔ indicates a one-to-one mapping. The augmented nonlinear objective function in QP3 plays a key role in our proposed algorithm. Note that q̄(x̄) is the product of q(x) (≥ 0) and x̄_{n+1} (> 0). Hence q̄(x̄) ≥ 0. Generally, we have

q̄(x̄) / x̄_{n+1} = q(T^{−1}(x̄)) = q(x).

QUADRATIC PROGRAMMING


In particular,

q̄(x̄^k) = q̄(e) = q(x^k),    and    q̄(x̄*) = x̄*_{n+1} q(x*) = 0.

If q(x) is a linear function, then q̄(x̄) is also a linear function; otherwise q̄(x̄) looks complicated. But, fortunately, q̄(x̄) is still a convex function, by a convexity invariance lemma17: if q(x) is a convex function in Ω_P, then q̄(x̄) is a convex function in Ω̄_P; i.e., the convexity of the objective function remains invariant under the objective augmentation and the projective transformation. Now we solve the following sub-optimization problem QP3.1 over an interior ellipsoid centered at x̄^k, instead of solving QP3. Since the starting point is x̄^k = e, the interior ellipsoid happens to be an interior sphere in x̄-space:

(QP3.1)    minimize    q̄(x̄) = x̄[n]^T Q̂ x̄[n] / (2x̄_{n+1}) + ĉ^T x̄[n]
           subject to  Â x̄ = b̂,  ‖x̄ − e‖ ≤ β < 1,

where x̄[n] designates the vector of the first n components of x̄, and

Q̂ = X^k Q X^k,    ĉ = X^k c,    Â = ( AX^k   −b ; e^T   1 ),    b̂ = ( 0 ; n + 1 ).

As a result, the following algorithm is introduced:

Algorithm 3.1
repeat
    solve QP3.1 and let ā be the minimal solution;
    x^{k+1} = T^{−1}(ā);
    k = k + 1;
until q(x^k) ≤ 2^{−L}.

To solve QP3.1, we can again either apply the trust region method, as we did in the affine scaling algorithm, or use the line search of21. The former can be completed in O(L) or O(n) steps, and the latter in O(ln n) steps. Furthermore, QP and QD can be combined to form

(QPD)    minimize    q(x) − d(x, y) = x^T Qx + c^T x − b^T y
         subject to  (x, y) ∈ {(x, y) : A^T y ≤ Qx + c, x ∈ Ω_P}.

The optimal objective value of QPD is known to be zero (except when QPD is infeasible). In addition, the objective function of QPD remains convex quadratic and the constraints of QPD remain linear. In fact, let s = Qx + c − A^T y. The objective function becomes x^T s and the potential function becomes

φ(x, s) = (2n + 1) ln(x^T s) − Σ_{i=1}^n ln(x_i s_i).
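A direct transcription of this potential function (illustrative sketch):

```python
import math

def potential(x, s):
    """Primal-dual potential: (2n+1) ln(x^T s) - sum_i ln(x_i s_i),
    defined for componentwise positive x and s."""
    n = len(x)
    xs = sum(xi * si for xi, si in zip(x, s))
    return (2 * n + 1) * math.log(xs) - sum(
        math.log(xi * si) for xi, si in zip(x, s))
```

At x = s = e the potential equals (2n + 1) ln n, and along centered sequences it tends to −∞ as the duality gap x^T s goes to zero, which is what drives potential-reduction methods.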

Applying Algorithm 3.1, we derive:

Theorem 3.1.17,21 Convex quadratic programming can be correctly solved in O(n^4 ln n · L) arithmetic operations using Algorithm 3.1.

In fact, the rank-one updating technique can be used to reduce the complexity to O(n^3.5 ln n · L).21

4. PATH-FOLLOWING ALGORITHM

Since Karmarkar proposed the interior projective scaling algorithm, another interior algorithm that avoids the projective transformations, the central path-following algorithm, has been developed by several authors. The concept of the "analytic" center for polyhedra is proposed in22. The central pathway to the optimal set is studied in23 and 24. A path-following algorithm for linear programming, the first O(√n L)-iteration algorithm, is developed in26 (by using the rank-one updating technique, the total operation count for LP is further reduced to O(n^3 L) in26 and 27). Soon after, a primal-dual path-following algorithm converging in O(√n L) iterations for convex linear complementarity problems and convex quadratic programs was analyzed in28 and 29.

The central path can be characterized in terms of a parameter λ ≥ 0 by the following system of equations:

Xs = λe,    Ax = b,    x > 0,    and    s = Qx + c − A^T y ≥ 0.

Obviously, optimal solutions of QP and QD are on the path corresponding to λ = 0. The primal-dual algorithm is in fact a homotopy algorithm: first reduce λ, then use Newton's method to solve the above system, until λ is reduced to zero. The sequences of x^k and y^k, therefore, are always kept close to the central path from λ^0 down to 0. Given x^k and y^k with

0 < x^k ∈ Ω_P,    s^k = Qx^k + c − A^T y^k > 0,    and    ‖X^k s^k − z^k e‖ ≤ αz^k,    (4.1)

where z^k = (x^k)^T s^k / n, let

λ = (1 − α/√n) (x^k)^T s^k / n

for some constant α (for example, α = 1/3). Then Newton's method is used to solve the system of linear equations

X^k Δs + S^k Δx = λe − X^k s^k,    AΔx = 0,    and    Δs = QΔx − A^T Δy.

Note that the system of linear equations can also be written as

( Q + (X^k)^{−1} S^k    −A^T ) ( Δx )     ( λ(X^k)^{−1} e − s^k )
(         A               0  ) ( Δy )  =  (          0          )

It has been shown that ‖(X^k)^{−1} Δx‖^2 + ‖(S^k)^{−1} Δs‖^2 …
C_ij = cost of assigning the ith job to the jth worker
x_ij = 1 if job i is assigned to worker j, 0 otherwise

Given integers n and m, there exist two integers t and r such that

n = mt − r;    0 ≤ r < m.

For a uniform distribution, when r = 0, each worker was assigned t jobs; in this case the problem reduced to a transportation problem. When r > 0, each worker was assigned at least (t − 1) jobs and at most t jobs such that all the jobs were allocated. An (n+1)th row was added to the (n × m) matrix of jobs and workers. This row represented the addition of r 'dummy jobs' (with a zero cost value). This was mathematically expressed as:

c_{n+1,j} = 0,    j = 1, 2, ..., m

and

Σ_{j=1}^m x_{n+1,j} = r.


RADHA KALYAN AND SANTOSH KUMAR

The mathematical model of the problem was written as follows:

Minimise    x_0 = Σ_{i=1}^{n+1} Σ_{j=1}^m c_ij x_ij    ...(1)

subject to

Σ_{j=1}^m x_ij = 1,  i = 1, 2, ..., n;    Σ_{j=1}^m x_{n+1,j} = r    ...(2)

Σ_{i=1}^{n+1} x_ij = t,  j = 1, 2, ..., m    ...(3)

x_ij ≥ 0,  for all i and j    ...(4)

x_ij integral.    ...(5)

The computer program for the generation of a random job allocation problem expects the row-column size of the original problem matrix. It generates random costs and computes the values of the a_i's and b_j's.
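A generator of this kind can be sketched as follows (the function name, cost range, and seeding are assumptions, not details from the paper):

```python
import math
import random

def build_job_allocation(n, m, seed=0):
    """Construct the augmented assignment model of Equations (1)-(5):
    random costs, an (n+1)th row of r zero-cost dummy jobs, and the
    row/column requirements a_i and b_j with n = m*t - r, 0 <= r < m."""
    t = math.ceil(n / m)
    r = m * t - n
    rng = random.Random(seed)
    cost = [[rng.randint(1, 99) for _ in range(m)] for _ in range(n)]
    cost.append([0] * m)       # dummy-job row: c_{n+1,j} = 0
    a = [1] * n + [r]          # row requirements, Equation (2)
    b = [t] * m                # column requirements, Equation (3)
    return cost, a, b
```

By construction the supplies and demands balance (sum(a) = n + r = mt = sum(b)), so the model is a balanced transportation problem.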

4. EVALUATION OF SYSTEM RELIABILITY OF A COMPLEX NETWORK

The terminal-pair reliability was used as the measure of system reliability [3]. A simulation procedure was developed to evaluate the system reliability. An exact method proposed by Kim et al. [10] was implemented on the computer for the purpose of determining the accuracy of the simulation method. The on-line creation of input files needs some explanation. For the creation of a node relationship file, an entry of a non-zero node, say 'u', is made in response to a request for an entry in row i and column j, to indicate that there is a connection between node i and node u. An entry of the number 0 indicates that there are no further connections from node i. For the creation of a reliability input file, an entry of the number 0 is made when there is no connection between the given nodes.

PROBLEMS IN PROTEAN SYSTEMS


A computer program is available for the generation of a random complex network configuration with random reliability values.

5. REDUNDANCY OPTIMISATION IN COMPLEX NETWORKS USING THREE HEURISTIC STRATEGIES

The system reliability of a complex network was improved using component redundancy. Three heuristic strategies were proposed for the optimal allocation of a given number of i.i.d. components. The following notation was used for stating the mathematical model of the problem:

h     system reliability
n     number of paths
m     total number of components in the network
d     total number of i.i.d. links available for duplication
p_i   reliability of component i
q_i   (1 − p_i)
Δp_i  increase in reliability due to redundancy using an i.i.d. component (Δp_i = p_i q_i)
P_i   reliability of path i in the network
n_i   number of components in path i
y_j   0 or 1

The mathematical model was stated using the following system reliability expression [10]:

h = Σ_i P_i − Σ_{i≠j} [P_i P_j]* + Σ_{i≠j≠k} [P_i P_j P_k]* − ... + (−1)^{n+1} [P_1 P_2 ... P_n]*    ...(6)

The [ ]* operation [3,10] is used to prevent the probability of success of a component from being accounted for more than once when paths are dependent. P_i may be expressed as a product of its component reliabilities as in Equation (7):

P_i = Π_{j∈R} p_j,    where R = [j : j ∈ {n_i elements from (1, 2, ..., m)}]    ...(7)
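Purely as an illustration of Equations (6)-(7), the [ ]* rule can be realised by multiplying each component's reliability once over the union of the paths appearing in a term (helper names assumed; the cost is exponential in the number of paths):

```python
from itertools import combinations

def system_reliability(paths, p):
    """Terminal-pair reliability by inclusion-exclusion, Equation (6).
    paths: list of sets of component indices; p: component reliabilities.
    The [ ]* operation counts each shared component's reliability once,
    via the union of the component sets in each term."""
    n = len(paths)
    h = 0.0
    for r in range(1, n + 1):
        sign = (-1) ** (r + 1)
        for combo in combinations(range(n), r):
            union = set().union(*(paths[i] for i in combo))
            prod = 1.0
            for j in union:
                prod *= p[j]
            h += sign * prod
    return h
```

For two paths sharing a common component, this reproduces the series-parallel value obtained by conditioning on the shared component, which is the check that the plain product formula would fail.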


If P_iD denotes the reliability of a path with some or all components duplicated, then P_iD may be represented by Equation (8):

P_iD = Π_{j∈R} (p_j + y_j Δp_j)    ...(8)

y_j = 1 ⟹ an i.i.d. component is attached to component j; y_j = 0 for all j ⟹ Equation (8) reduces to Equation (7).

Hence, the redundancy optimisation problem was stated as:

Maximise    h = Σ_i P_iD − Σ_{i≠j} [P_iD P_jD]* + Σ_{i≠j≠k} [P_iD P_jD P_kD]* − ... + (−1)^{n+1} [P_1D P_2D ... P_nD]*    ...(9)

subject to

Σ_{j=1}^m y_j = d    ...(10)

y_j = 0 or 1    ...(11)

6. REDUNDANCY OPTIMISATION IN CONSECUTIVE-k-OUT-OF-n:F NETWORKS USING A HEURISTIC AND AN EXACT METHOD

The system reliability of a consecutive-k-out-of-n:F network was improved using component redundancy [5]. A heuristic and an exact method were proposed for the optimal allocation of a given number of i.i.d. components. The following notation was used for stating the mathematical model of the problem:

n         number of components in the system
k         minimum number of consecutive failed components that can cause system failure
p_i       probability that component i is good
q_i       (1 − p_i)
Δp_i      increase in reliability due to redundancy using an i.i.d. component, i.e. Δp_i = p_i q_i
h(j;k)    reliability of the consecutive-k-out-of-j:F subsystem consisting of components 1, 2, ..., j
f(j;k)    = 1 − h(j;k), unreliability of the consecutive-k-out-of-j:F subsystem consisting of components 1, 2, ..., j
f_D(j;k)  unreliability of the consecutive-k-out-of-j:F subsystem consisting of components 1, 2, ..., j while some or all of the components contain an i.i.d. component for redundancy
h'(j;k)   reliability of the consecutive-k-out-of-j:F subsystem consisting of components (n−j+1), (n−j+2), ..., (n−1), n
d         number of i.i.d. components available
I_i       reliability importance of component i (rate of change of system reliability due to an increase in component reliability)
I_i'      relative importance of component i due to redundancy (rate of change of system reliability due to an increase in component reliability from component redundancy)
y_i       0 or 1

The following expression [11] for unreliability was used for the mathematical model:

f(n;k) = f(n−1;k) + [1 − f(n−k−1;k)] p_{n−k} Π_{j=n−k+1}^{n} q_j    ...(12)

with boundary conditions:

f(u;v) = 0 for u < v;  and  p_0 = 1.    ...(13)
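Equations (12)-(13) translate directly into a short memoised evaluator (an illustrative sketch; the helper name is assumed, and p is given as a plain list p_1, ..., p_n):

```python
def consec_k_unreliability(n, k, p):
    """Unreliability f(n;k) of a consecutive-k-out-of-n:F system via the
    recurrence (12) with boundary conditions (13); p = [p_1, ..., p_n]."""
    q = [1.0 - pi for pi in p]      # q[i-1] = q_i
    pe = [1.0] + list(p)            # pe[i] = p_i, with the sentinel p_0 = 1
    memo = {}
    def f(j):
        if j < k:
            return 0.0              # boundary: f(u;v) = 0 for u < v
        if j not in memo:
            prod = 1.0
            for i in range(j - k + 1, j + 1):
                prod *= q[i - 1]    # q_{j-k+1} * ... * q_j
            memo[j] = f(j - 1) + (1.0 - f(j - k - 1)) * pe[j - k] * prod
        return memo[j]
    return f(n)
```

For k = 1 the system is a series system and the recurrence collapses to f(n;1) = 1 − Π p_i, a convenient sanity check.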

The system unreliability for a consecutive-k-out-of-n:F system with component redundancy was expressed as:

f_D(n;k) = f_D(n−1;k) + [1 − f_D(n−k−1;k)] (p_{n−k} + y_{n−k} Δp_{n−k}) Π_{j=n−k+1}^{n} (1 − p_j − y_j Δp_j)    ...(14)

with the boundary conditions:

f_D(u;v) = 0 for u < v;  and  p_0 = 1.    ...(15)


y_j = 1 ⟹ component j has an i.i.d. component, and y_j = 0 for all j ⟹ the recurrence relations (14)-(15) reduce to (12)-(13). Thus, the redundancy optimisation problem was formulated as follows:

Maximise    [1 − f_D(n;k)]    ...(16)

subject to

Σ_{i=1}^n y_i = d    ...(17)

y_i = 0 or 1    ...(18)

A computer program is available for the generation of a random consecutive-k-out-of-n:F network configuration with random reliability values.

REFERENCES

1. Kalyan, R. and Kumar, S. (1987) System Relaxation and an Assignment Problem, Indian Journal of Management and Systems, 3, 11-17.
2. Kalyan, R. and Kumar, S. (1987) Modelling and Analysis of Augmenting Systems - An Allocation Problem, ASOR'87, in Proceedings of the 8th National Conference of the Australian Society of Operations Research, Melbourne, edited by S. Kumar, Australia, pp. 126-136.
3. Kalyan, R. and Kumar, S. (1989) Comparison of a Simulation and an Exact Method for Reliability Evaluation of Large Networks using a Personal Computer, Microelectronics and Reliability, 28, 133-136.
4. Kalyan, R. and Kumar, S. (1991) A Study of Protean Systems - Some Heuristic Strategies for Redundancy Optimisation, Recent Developments in Mathematical Programming, Gordon and Breach.
5. Kalyan, R. and Kumar, S. (1990) A Study of Protean Systems - Redundancy Optimisation in Consecutive-k-out-of-n:F Systems, Microelectronics and Reliability, 30, 635-638.
6. Kalyan, R. (1989) An Analysis of Protean Systems and Solution Methods for their Optimisation, Master's Degree Thesis, RMIT Victoria University of Technology, Melbourne, Australia. (Program diskette available on request at cost.)
7. Murty, K.G. (1976) Linear and Combinatorial Programming, John Wiley and Sons.


8. Ford, L.R. and Fulkerson, D.R. (1962) Flows in Networks, Princeton University Press.
9. Taha, H.A. (1986) Operations Research: An Introduction, Macmillan Publishing Company, Inc., New York, 4th Edition.
10. Kim, Y.H., Case, K.E. and Ghare, P.M. (1972) A Method for Computing Complex System Reliability, IEEE Transactions on Reliability, R-21, pp. 215-219.
11. Hwang, F.K. (1982) Fast Solutions for Consecutive-k-out-of-n:F System, IEEE Transactions on Reliability, R-31, 447-448.

ALTERNATIVE METHODS FOR REPRESENTING THE INVERSE OF LINEAR PROGRAMMING BASIS MATRICES

GAUTAM MITRA
Brunel University, U.K.

and MEHRDAD TAMIZ
The Numerical Algorithms Group Ltd, U.K.

Abstract  Methods for representing the inverse of linear programming basis matrices are closely related to techniques for solving a system of sparse unsymmetric linear equations by direct methods. It is now well accepted that for these problems the static process of reordering the matrix into the lower block triangular (LBT) form constitutes the initial step. We introduce a combined static and dynamic factorisation of a basis matrix and derive its inverse, which we call the partial elimination form of the inverse (PEFI). This factorisation takes advantage of the LBT structure and produces a sparser representation of the inverse than the elimination form of the inverse (EFI). In this we make use of the original columns (of the constraint matrix) which are in the basis. To represent the factored inverse it is, however, necessary to introduce special data structures which are used in the forward and the backward transformations (the two major algorithmic steps) of the simplex method. These two steps correspond to solving a system of equations defined by the basis matrix and solving a second system of equations defined by the transposed basis matrix. In this paper we compare the nonzero build-up of PEFI with that of EFI and with alternative methods for updating the basis inverse. The results of our experimental investigations are presented in this paper.

DOI: 10.1201/9780429333439-17


Key Words: Basis Matrix, Maximal Matching, Lower Block Triangular Form, Elimination Form of Inverse, Gaussian Elimination, Partial Elimination Form of Inverse, Forward Transformation, Backward Transformation.

1. INTRODUCTION

Computer solution of large sparse systems of linear equations, and the related problem of computing the inverse of the corresponding coefficient matrix, both efficiently and in a compact form, assumes a key role in the study of the large systems which arise in many equation-solving contexts1,2. Commercial exploitation of the equation-solving technologies for such systems is most widespread in the field of linear programming. This is because the formulation of large planning and scheduling problems, and also their solution, have been well investigated since the late sixties. Solution methods for LPs have also continued to keep pace with developments in the field of algorithms, software and computer hardware. A series of special conferences on the topic of sparse matrices and sparse equation solving methods was organised during the sixties and all through the seventies3,4,5,6,7, and there is a wide body of literature covering these developments. Computational algorithms for large scale linear systems, their data structures, software implementation and exploitation of machine architecture continue to be leading research issues. Our research interests concern the solution of large linear programming problems robustly and efficiently8,9. At the heart of such a system lie suitable basis inversion and basis inverse update procedures. In this paper we set out our arguments for the choice of the inversion and update procedures; we describe the mathematical methods and the related implementation issues, and finally present some preliminary results of our investigations. The rest of the paper is organised as follows. In the next section we expand on the background of the developments and present our analysis of the issues. This is followed by a discussion of why the lower block triangular (LBT) form is a desirable structure, and a presentation of the corresponding factored form of the inverse.
We outline the data structures which have been adopted to represent the factorisation and we describe how it is used in the solution process


in the section titled "Data structures for factorisation". We then compare the elimination form of the inverse with the partial elimination form of the inverse (PEFI) as derived by us. The LU factors can be computed indirectly from the PEFI representation, and this is discussed in a section of its own. In the section titled "Established LU updates" a number of well known update schemes are considered. This section also contains the results of our experiments.

BACKGROUND AND AN ANALYSIS OF THE ISSUES

The methods of computing and updating the inverse basis matrices of large sparse linear programming problems are dominated by the following major considerations.
(a) The nonzero structure of the matrix needs to be analyzed and the rows and columns of the matrix reordered: ANALYZE phase.
(b) Pivots have to be chosen for (Gaussian) elimination and alternative methods of factorisation of the matrix have to be considered: FACTOR phase.
(c) A method has to be derived whereby the inverse can be used in the solution process implicit in the "backward and forward transformations": SOLVE procedures.
(d) Suitable data structures have to be designed to represent the original sparse matrix and the sparse basis inverse.
(e) The implications of the algorithms and the data structures for the main memory of the computer and the architecture of the processor have to be taken into account.
(f) In addition to the compact inverse representation, an efficient update procedure has to be designed, taking into account (b), (d), (e) above.
We refer the reader to papers 10,11,12,13,14,15,16,17, which have addressed, expanded and dealt with most of these issues. The point (e), concerning the implications of these algorithms for the current generation of multiprocessing architectures, has barely been discussed in the literature.

(a) ANALYZE Phase

Given a nonsingular matrix B of order m, the reordering algorithms make use of the corresponding graph of the adjacency matrix derived from the nonzero elements. In the symmetric case the aim is to derive specifications for symmetric row and column interchanges given by permutation matrices P, P^T such that we


obtain a reordered matrix B̄ = PBP^T which is represented as compactly as possible in band diagonal form. The heuristic methods exploit the underlying undirected graph and make use of the degrees of the vertices. In the case of unsymmetric matrices, the general situation with LP basis matrices, it is well established18,19 that the LBT is the most desirable form. The LBT reordering is achieved by considering the underlying directed graph and applying the following algorithms.

Maximal matching. We can apply the maximum matching algorithm of Hall40, whereby the matrix B is reordered to B*, which has a zero-free diagonal. As a result B* = BQ; the permutation matrix Q specifies the reordering derived by this algorithm.

Finding the strong components. To the matrix B*, which has a zero-free diagonal, the celebrated algorithm of Tarjan can be applied to obtain the strong components (the diagonal blocks of the matrix) of the directed graph. This leads to the derivation of orderings given by the permutation matrices P, P^T whereby the original matrix B can be reordered into the lower block triangular form B̄ = PB*P^T, or B̄ = P(BQ)P^T.

We note that for the symmetric case the problem of bandwidth minimisation is NP-complete20; hence only heuristic methods are applied. In contrast, the above mentioned LBT algorithms have the complexity of low order polynomials19,21. Beale41 conjectured that a band form for the unsymmetric case may be equally attractive as the LBT form. Although we have done some preliminary work on this topic22, this has not been properly followed up, and we are not considering the issue of symmetric matrices and the band form any further in this paper. It is interesting to note that Hellerman and Rarick13 did not make use of Tarjan's work, as these results were not known at that time. They also combined the ANALYZE and at least a part of the FACTOR phase, as they preassigned pivot positions; the implications of this are discussed in the next subsection.
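The text applies Tarjan's algorithm; purely as an illustration, here is the equivalent two-pass scheme of Kosaraju in Python (names and the adjacency-list input format are assumptions), returning the strong components — the diagonal blocks — in a topological order of the condensation, from which a block triangular permutation can be read off:

```python
def strong_components(adj):
    """Strong components of a directed graph via Kosaraju's two DFS passes.
    adj: dict mapping every node to its list of successors.
    Returns the components in topological order of the condensation."""
    order, seen = [], set()
    def dfs(v):                       # first pass: record finish order
        seen.add(v)
        for w in adj.get(v, []):
            if w not in seen:
                dfs(w)
        order.append(v)
    for v in adj:
        if v not in seen:
            dfs(v)
    radj = {v: [] for v in adj}       # reverse graph
    for v, ws in adj.items():
        for w in ws:
            radj.setdefault(w, []).append(v)
    comps, assigned = [], set()
    for v in reversed(order):         # second pass on the reverse graph
        if v in assigned:
            continue
        stack, block = [v], []
        assigned.add(v)
        while stack:
            u = stack.pop()
            block.append(u)
            for w in radj.get(u, []):
                if w not in assigned:
                    assigned.add(w)
                    stack.append(w)
        comps.append(block)
    return comps
```

Permuting the rows and columns of B* so that each returned block is contiguous, in the order produced, yields the block triangular pattern that the LBT reordering exploits.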
(b) FACTOR Phase

The main motivation of this phase is to compute a sparse yet accurate LU decomposition of the matrix B. If the LU


decomposition is computed in the product form of the L and U transformation matrices then this reduces the nonzero growth. If pivot choice is made dynamically during factorisation with a view to control accuracy then this can lead to a stabie solution. Hellerman and Rarick's p4 algorithm finds spikes which are postponed for pivoting and constructs pivot agenda based entirely on a static analysis. The main aim is to minimise fill in. Unfortunately, however, it is necessary to modify the algorithm to deal with all possible contingencies otherwise the method is known 23 •24 to lead to structural singularity as well as unstable solutions. Erisman et al 23 have studied the comparative performance of a modified and improved p5 (precautionary partitioned preassigned pivot procedure) with the modified Markowitz procedure with threshold tolerance due to Reid. The authors conc1ude that not only is the latter superior in non-LP applications, even in LP applications, the latter is equally good if not superior, taking into account accuracy and robustness. It is well known that the choice of the largest admissible nonzero entry at each step of Gaussian elimination can provide accurate LU decomposition but this strategy will result in considerable fill in. Tomlin25 suggested a relative tolerance of 0.0 I of maximum column entry based on experimental evidence of processing LP problems. We have found Reid's recommendation of a threshold tolerance with restricted Markowitz merit criteria very effective indeed26 • According to Reid at the kth elimination step pivots are chosen by the criteria. min

min_{i,j} (r_i^k − 1)(c_j^k − 1)          (2.1)

and

|b_ij^k| ≥ u ( max_l |b_lj^k| ),

where r_i^k and c_j^k are the numbers of nonzero entries in row i and column j of the active submatrix at the kth step, and u is the threshold tolerance.

The operator (v > 0)I v draws from v its positive elements. Thus, for

v = (−3, 4, 6, −8, −2):   (v > 0)I v = (4, 6)   and   (v < 0)I v = (−3, −8, −2).

(P)   Max < a, x >   subject to   Ax ≤ c,   x ≥ 0

(D)   Min < c, y >   subject to   A*y ≥ a,   y ≥ 0

Here x ≥ 0 means x(t) ≥ 0 for all t ∈ [0, T], and y ≥ 0 can be defined similarly. The feasible solution sets for these problems are defined as follows:

P(c) = { x | x ∈ L∞^n[0,T], x ≥ 0, Ax ≤ c }
D(a) = { y | y ∈ L∞^m[0,T], y ≥ 0, A*y ≥ a }

The following results will be needed subsequently 11.

Proposition 1:   < y, Ax > = < A*y, x >

Proposition 2:
(i) Sup { < a, x > | x ∈ P(c) } ≤ Inf { < c, y > | y ∈ D(a) }
(ii) If x ∈ P(c), y ∈ D(a) and < c, y > = < a, x >, then x, y solve (P) and (D) respectively.
(iii) If x ∈ P(c), y ∈ D(a), then < c, y > = < a, x > iff < y, Ax − c > = 0 = < A*y − a, x >.

3. PROBLEM FORMULATION In this section, we construct a continuous game G : {U, U, K} and prove results corresponding to theorems 1 and 2 for the game G and continuous programming problems (P) and (D). Here

CONTINUOUS LINEAR PROGRAMS

U = { u | u(t) = (x(t), y(t), λ), x(t) ≥ 0, y(t) ≥ 0, x(t) ∈ L∞^n[0,T], y(t) ∈ L∞^m[0,T], λ ∈ R+, ∫_0^T [ Σ_{i=1}^n x_i(t) + Σ_{j=1}^m y_j(t) + λ ] dt = 1 }

is the space of strategies for both Players (I and II). K : U × U → R, defined by

K(u(1), u(2)) = ∫_0^T u(1)(t)^T [D(u(2)(t))] dt,

is the payoff to Player I. The payoff to Player II will be −K(u(1), u(2)). Here

      [  0    −A*    a ]
D =   [  A     0    −c ]
      [ −a*    c*    0 ]

is an operator from U to U, defined by

           [  0    −A*    a ] [ x(t) ]   [ −A*y + aλ ]
D(u(t)) =  [  A     0    −c ] [ y(t) ] = [ Ax − cλ   ]
           [ −a*    c*    0 ] [  λ   ]   [ −(1/T) ∫_0^T a(t)^T x(t) dt + (1/T) ∫_0^T c(t)^T y(t) dt ]

4. MAIN RESULTS

In this section, we will establish the equivalence between the two continuous linear programming problems ((P) and (D)) and the continuous matrix game G. For this we prove the following results:

Lemma 1. D is a skew Hermitian operator.

Proof. We have to show that its adjoint D* = −D, i.e. for u(1), u(2) ∈ U,

< u(1), Du(2) > = − < Du(1), u(2) >.

          [  0    −A*    a ] [ x(2)(t) ]   [ −A*y(2) + aλ(2) ]
Du(2) =   [  A     0    −c ] [ y(2)(t) ] = [ Ax(2) − cλ(2)   ]
          [ −a*    c*    0 ] [  λ(2)   ]   [ −(1/T) < a, x(2) > + (1/T) < c, y(2) > ]

S. CHANDRA, B. MOND AND M. V. DURGA PRASAD

< u(1), Du(2) > = ∫_0^T [ x(1)(t)^T( −A*y(2) + aλ(2) ) + y(1)(t)^T( Ax(2) − cλ(2) ) + λ(1)( −(1/T) < a, x(2) > + (1/T) < c, y(2) > ) ] dt

= − < A*y(2), x(1) > + λ(2) < a, x(1) > + < y(1), Ax(2) > − λ(2) < c, y(1) > + λ(1) < c, y(2) > − λ(1) < a, x(2) >

= − < y(2), Ax(1) > + λ(2) < a, x(1) > + < A*y(1), x(2) > − λ(2) < c, y(1) > + λ(1) < c, y(2) > − λ(1) < a, x(2) >     (using Proposition 1)

= −[ − < A*y(1), x(2) > + λ(1) < a, x(2) > + < y(2), Ax(1) > − λ(1) < c, y(2) > − λ(2) < a, x(1) > + λ(2) < c, y(1) > ]

= − < Du(1), u(2) >
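In the finite-dimensional (time-independent) analogue, the skew symmetry asserted by Lemma 1 reduces to the matrix identity Dᵀ = −D, which can be checked directly. The sketch below builds the block matrix; the sample A, a, c are arbitrary illustrative data, not taken from the paper.

```python
# Finite-dimensional analogue of Lemma 1: the game matrix
#     D = [  0   -A^T   a ]
#         [  A    0    -c ]
#         [ -a^T  c^T   0 ]
# is skew-symmetric (D^T = -D).  A, a, c below are arbitrary sample data.
def build_D(A, a, c):
    m, n = len(A), len(A[0])
    N = n + m + 1
    D = [[0.0] * N for _ in range(N)]
    for i in range(m):
        for j in range(n):
            D[n + i][j] = A[i][j]        # block  A
            D[j][n + i] = -A[i][j]       # block -A^T
    for j in range(n):
        D[j][N - 1] = a[j]               # column  a
        D[N - 1][j] = -a[j]              # row    -a^T
    for i in range(m):
        D[n + i][N - 1] = -c[i]          # column -c
        D[N - 1][n + i] = c[i]           # row     c^T
    return D

A = [[1.0, 2.0], [0.0, 3.0]]
a = [4.0, 1.0]
c = [2.0, 5.0]
D = build_D(A, a, c)
N = len(D)
assert all(D[i][j] == -D[j][i] for i in range(N) for j in range(N))
print("D is skew-symmetric")
```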
Theorem. If the continuous matrix game G has an optimal strategy ū(t) = (x̄(t), ȳ(t), λ̄) with λ̄ > 0, then both continuous linear programming problems have solutions, and conversely.

Proof. Let ū(t) = (x̄(t), ȳ(t), λ̄) be an optimal strategy for the continuous matrix game G with λ̄ > 0. Since the value of the game G is zero, we have

∫_0^T u(t)^T D(ū(t)) dt ≤ 0   for all u(t) ∈ U      (4)

It is claimed that (4) implies

D(ū(t)) ≤ 0   a.e.

To establish this, it is assumed to the contrary that D(ū(t)) ≰ 0 on a set of non-zero measure, i.e. at least one component of the vector D(ū(t)) is positive. Assume that the ith component of the vector D(ū(t)) (denoted by (D(ū(t)))_i) is positive, where 1 ≤ i ≤ n + m + 1. By taking

u(t) = (0, 0, …, 0, 1/T, 0, …, 0),

with 1/T in the ith position, we observe that u(t) ∈ U. Now u(t)^T D(ū(t)) = (1/T)(D(ū(t)))_i > 0. Thus ∫_0^T u(t)^T D(ū(t)) dt > 0, which contradicts (4). Hence the claim that D(ū(t)) ≤ 0 a.e. is validated. Thus

[ −A*ȳ(t) + aλ̄                    ]
[ Ax̄(t) − cλ̄                      ]   ≤ 0
[ −(1/T)( < a, x̄ > − < c, ȳ > )   ]

which implies that

A*( ȳ(t)/λ̄ ) ≥ a(t)      (7)

A( x̄(t)/λ̄ ) ≤ c(t)      (8)

< a, x̄ > ≥ < c, ȳ >      (9)

(9) implies

< a, x̄/λ̄ > ≥ < c, ȳ/λ̄ >      (10)

This shows that x* = x̄/λ̄, y* = ȳ/λ̄ are feasible for (P) and (D) respectively. Using (10) and Proposition 2, we get that x* and y* solve (P) and (D) respectively. Conversely, let x(t) = (x1(t), …, xn(t)) and y(t) = (y1(t), …, ym(t)) be optimal solutions to (P) and (D) respectively. Set

λ̄ = 1 / ( T + ∫_0^T Σ_{i=1}^n x_i(t) dt + ∫_0^T Σ_{j=1}^m y_j(t) dt )

and let x̄(t) = λ̄ x(t), ȳ(t) = λ̄ y(t) and ū = (x̄, ȳ, λ̄).


Clearly ū ∈ U. We now verify that ū is a solution to the continuous matrix game G. We have

A*y ≥ a ⟹ A*y λ̄ ≥ a λ̄,
Ax ≤ c ⟹ Ax λ̄ ≤ c λ̄,

and

< a, x > = < c, y >.

Combining, we get

D(ū(t)) ≤ 0.

Thus for all u(t) ∈ U

∫_0^T u(t)^T D(ū(t)) dt ≤ 0,

and

∫_0^T ū(t)^T D(u(t)) dt = −∫_0^T u(t)^T D(ū(t)) dt ≥ 0.

Hence ū is an optimal strategy for the game G. This proves the theorem.

REFERENCES
1. Dantzig, G.B. (1951) A proof of the equivalence of the programming problem and the game problem. In Activity Analysis of Production and Allocation (edited by T.C. Koopmans), pp. 330-335. Cowles Commission Monograph No. 13, John Wiley and Sons
2. Karlin, S. (1959) Mathematical Models and Theory in Games, Programming and Mathematical Economics, Addison-Wesley
3. Gass, S. (1964) Linear Programming, McGraw Hill and Kogakusha
4. Chandra, S., Craven, B.D. and Mond, B. (1985) Nonlinear programming duality and matrix game equivalence, Journal of Australian Mathematical Society Series B, 26, 422-429
5. Tijs, S.H. (1969) Semi-infinite linear programs and semi-infinite matrix games, Nieuw Arch. Wisk., 27, 197-214
6. Forgo, F. (1969) The relationship between zero-sum two person games and linear programming, Res. Rep. 1969-1, Dept. of Mathematics, Karl Marx University of Economics, Budapest
7. Underwood, R.G. (1976) Continuous programs and their relation to continuous games, Journal of Math. Anal. and Appl., 56, 102-112
8. Bellman, R. (1957) Dynamic Programming, Princeton: Princeton Univ. Press
9. Levinson, N. (1966) A class of continuous linear programming problems, Journal of Math. Anal. and Appl., 16, 78-83
10. Tyndall, W.F. (1965) A duality theorem for a class of continuous linear programming problems, SIAM J. Appl. Maths, 13, 644-666
11. Grinold, R.C. (1969) Continuous programming part one: linear objectives, Journal of Math. Anal. and Appl., 28, no. 1, 32-51

A SHORT NOTE ON A PATH-FOLLOWING INTERPRETATION OF THE PRIMAL-DUAL ALGORITHM

PATRICK TOBIN
Swinburne Institute of Technology, John St, Hawthorn, Vic 3122, Australia

SANTOSH KUMAR
RMIT, 124 La Trobe St, Melbourne, Vic 3000, Australia

Abstract: In this paper we briefly review aspects of path-following, from roots in mathematical problem solving to applications in linear programming and a role in a parametric interpretation of the classical primal-dual algorithm.

1. INTRODUCTION

In his classic work on mathematical problem solving, Polya 1 refers to the following example: "Inscribe a square in a given triangle so that two vertices of the square are on the base, and the other two are on each of the remaining two sides." This problem can be readily solved by first solving the related problem, where only three vertices need lie on the triangle, and observing that a succession of such related problems, with changing square side length, will eventually yield the solution to the required problem, as the free vertex approaches the other side. If the location of the fourth vertex is defined as the solution to each problem, the successive free vertices describe a path to the solution of the original problem. The associated heuristic technique is common in problem solving: simplify and solve a related problem, then use it as the basis of solving the required problem. Path-following techniques involve this procedure in problems which can be drafted in terms of systems of equations with parametric specifications. In path-following, the trajectory to the solution should travel through points which themselves solve problems (hopefully simpler) related to the original problem, and which are not merely iterates.

DOI: 10.1201/9780429333439-26
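Polya's inscribed-square example can itself be traced numerically in this spirit: for each trial side length s, the square with two vertices on the base and its top-left vertex on the left side solves the related (three-vertex) problem, and s is adjusted until the free vertex lands on the right side. A small sketch follows; the triangle data (base 6, height 3, apex abscissa 2) are an arbitrary choice for illustration.

```python
# Path-following solution of the inscribed-square problem:
# triangle with base (0,0)-(b,0) and apex (p,h) -- sample data.
b, h, p = 6.0, 3.0, 2.0

def free_vertex(s):
    x_left = p * s / h            # top-left vertex sits on the left side
    return (x_left + s, s)        # free (top-right) vertex of the square

def gap(s):
    """Signed x-distance of the free vertex from the right side."""
    x, y = free_vertex(s)
    x_right = b + (p - b) * y / h
    return x - x_right

# follow the path of free vertices by bisection on the side length s
lo, hi = 0.0, min(b, h)
for _ in range(60):
    mid = 0.5 * (lo + hi)
    if gap(mid) < 0:
        lo = mid
    else:
        hi = mid
s_star = 0.5 * (lo + hi)
print(round(s_star, 6))           # classical answer: b*h/(b+h) = 2.0
```

The limit agrees with the classical closed form s = bh/(b + h) for a square inscribed on the base of a triangle.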


2. PATH-FOLLOWING IN THE SOLUTION OF EQUATIONS

An example, drawn from nonlinear equations, is shown below.

(1) Find (x1, x2) such that
x1^3 + x1^2 + 5x1 − 2x2 − 8 = 0
3x1^2 − 2x2 + 2 = 0

The simpler equation system is taken to be:

(2) x1^3 + 5x1 − 2x2 = 0
    x2 = 0

and the only real solution to this is clearly (0,0). We transform the above system to the required one by adding a multiple of the difference between the systems' left hand sides. The multiplier is the homotopy parameter, t. Hence we have

(3) x1^3 + 5x1 − 2x2 + t(x1^2 − 8) = 0
    −2x2 + t(3x1^2 + 2) = 0

The system (3) corresponds to the simpler system (2) if t = 0 and to the required system (1) if t = 1. We let its solution be x(t) = (x1(t), x2(t)) and seek x(1). If we eliminate x2, the system can be readily factorised and solved to give x1(t) = 2t, x2(t) = 6t^3 + t, and as t varies from 0 to 1 the solution changes from (0,0) (along the path defined by x2 = (3/4)x1^3 + (1/2)x1) to (2,7), which is the solution that we are seeking.

3. PATH-FOLLOWING APPROACHES IN OPTIMISATION

The relationship between the original and sub-problems is characterised by the path parameter. Garcia and Zangwill 2 describe path-following in terms of homotopies, with analytical examples whose solutions are completely characterised by solving associated 'basic differential equations', and more complex problems which need solution by approximations to homotopies, using simplicial algorithms to plot piecewise linear approximations to the true path. Elken 3 used this approach in solving problems on competitive equilibria in economic systems, claiming computational efficiency rivalling that where the problem is drafted with conventional LP formulation and implementation. Since nonlinear programming problems are commonly solved using the associated Karush-Kuhn-Tucker equations, it is clear that path-following procedures drawing on ways to solve systems of equations can be useful.
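The homotopy system (3) above can also be followed numerically when no closed form is available. The continuation sketch below (pure Python, assumed step counts) uses the solution at the previous value of t as the starting point for Newton's method at the next, and reproduces the path x1 = 2t, x2 = 6t^3 + t.

```python
# Numerical path-following for the homotopy system (3):
#   F1 = x1^3 + 5*x1 - 2*x2 + t*(x1^2 - 8) = 0
#   F2 = -2*x2 + t*(3*x1^2 + 2)            = 0
# starting from the known solution (0,0) of the simpler system at t = 0.
def F(x1, x2, t):
    return (x1**3 + 5*x1 - 2*x2 + t*(x1**2 - 8),
            -2*x2 + t*(3*x1**2 + 2))

def newton_step(x1, x2, t):
    f1, f2 = F(x1, x2, t)
    # Jacobian of F with respect to (x1, x2); its determinant
    # -6*x1^2 + 8*t*x1 - 10 is always negative, so never zero.
    a, b = 3*x1**2 + 5 + 2*t*x1, -2.0
    c, d = 6*t*x1, -2.0
    det = a*d - b*c
    return x1 - (d*f1 - b*f2)/det, x2 - (a*f2 - c*f1)/det

x1, x2 = 0.0, 0.0                     # solution of the simpler system (2)
steps = 100
for k in range(1, steps + 1):
    t = k / steps
    for _ in range(20):               # Newton corrector at this value of t
        x1, x2 = newton_step(x1, x2, t)
print(round(x1, 6), round(x2, 6))     # follows x1 = 2t, x2 = 6t^3 + t to (2.0, 7.0)
```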
It is possibly less obvious that the same approach can be useful for linear programming. Karmarkar 4 boosted the investigation of interior point methods for solving linear programmes with his technique, which was readily interpreted in terms of logarithmic barrier methods by Gill, Murray, Saunders, Tomlin and Wright 5 and as a homotopy method by Nazareth 6. (For an overview see Ferris and Philpott 7 and Powell 8.) These interior point methods commonly involve path-following procedures based on


primal methods (see e.g. Renegar 9 and Gonzaga 10) or primal-dual approaches which attempt to trace the 'central path' which reduces the duality gap to zero (see e.g. Megiddo 11, Kojima, Mizuno and Yoshise 12 and Monteiro and Adler 13). In their work, Monteiro and Adler 13 use the following problem P_μ:

(P_μ)   Min c'x − μ Σ_{j=1}^n ln x_j   s.t.  Ax = b,  x > 0.      (4)

μ > 0 is the barrier-penalty parameter (cf. SUMT). Assume that the interiors of the primal and dual feasible regions are not empty and that A has full rank (rank A = m). If μ = 0, the original linear problem P_0, with m constraints and n variables, is specified. The corresponding dual D, with variables y (the Lagrange multipliers for the situation) and slack z, provides part of the solution equations. The objective function in P_μ is strictly convex and hence P_μ has at most one global minimum, given by the K-K-T conditions:

ZXe − μe = 0
Ax − b = 0          (5)
A'y + z − c = 0,

where Z, X are the diagonal matrices with entries from z, x respectively. For each μ > 0 these imply that (y, z) is interior feasible for the dual and, as A has full rank, this y is unique. The unique point w = (x, y, z) depends on μ, and the duality gap there is g(w) = c'x − b'y = x'z = nμ. Hence, as μ → 0, g(w) → 0. The path determined by μ is called the 'central path' and this algorithm seeks to follow this path to the optimum where μ = 0. Clearly, each P_μ is a problem related to the linear problem P_0, and the sequence of iterates determines a pathway to the solution as μ → 0. The solution of these involves an implementable algorithm which uses a Newton-Raphson search method instead of the differential approach dictated by using a homotopy. (Gonzaga 10 notes that whereas the latter shows the path nature clearly, the former gives the precise bounds necessary for complexity analyses, and these are of principal concern in any algorithm attempting to match the polynomial-time efficiency criteria. Consequently most interior point methods address path-following via such barrier function iteration procedures.) The Megiddo-style collection of algorithms differ mainly in their respective success in closely approximating the central path to the solution.
Kojima, Mizuno and Yoshise 12 note that these interior-point procedures have the common feature that the search direction for new feasible solutions involves the components that define the tangent to the central path described here.


The use of the primal and dual enables the definition of a convenient set of equations for determining the solutions along the path. Strong complementarity is obeyed by x and z in the limit as μ → 0, since x'z = nμ.
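For a toy instance the K-K-T system (5) collapses to a single equation, which makes the central path easy to trace. The pure-Python sketch below uses illustrative data c = (1, 2) with the single constraint x1 + x2 = 1 (these are assumptions for the example, not data from the paper) and checks that the duality gap equals nμ at every point of the path.

```python
# Sketch of the central path for a tiny instance of P_mu:
#     min c'x - mu * sum(ln x_j)   s.t.  x1 + x2 = 1,  x > 0.
# K-K-T: z_i * x_i = mu, y + z_i = c_i, x1 + x2 = 1, which reduces to
# one equation in the dual variable y, solved here by bisection.
c = (1.0, 2.0)

def point_on_path(mu):
    """Return (x, y, z) on the central path for barrier parameter mu."""
    # solve mu/(c1 - y) + mu/(c2 - y) = 1 for y < min(c)
    lo, hi = min(c) - 100.0, min(c) - 1e-12
    for _ in range(200):
        y = 0.5 * (lo + hi)
        s = sum(mu / (ci - y) for ci in c)
        lo, hi = (lo, y) if s > 1.0 else (y, hi)
    y = 0.5 * (lo + hi)
    z = [ci - y for ci in c]
    x = [mu / zi for zi in z]
    return x, y, z

for mu in (1.0, 0.1, 0.001):
    x, y, z = point_on_path(mu)
    gap = sum(xi * zi for xi, zi in zip(x, z))
    # the duality gap g(w) = x'z equals n*mu (here n = 2) along the path
    print(mu, [round(v, 4) for v in x], round(gap - 2 * mu, 10))

# as mu -> 0 the path converges to the boundary optimum x = (1, 0)
```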

4. PARAMETRIC INTERPRETATION OF THE PRIMAL-DUAL ALGORITHM

It is also possible to relate the classical Dantzig-Ford-Fulkerson algorithm to the path-following approach. In that algorithm a discrete path is followed to the solution, not as a 'steepest descent along edges' as in the simplex method, but along an external path in a space containing artificial variables as well. Consider the following problem:

(6)  Max z = x1 + 6x2
     s.t. x1 + x2 ≥ 2
          x1 + 3x2 ≤ 3
          x1, x2 ≥ 0.

Incorporating artificial and slack variables (but excluding the redundant Beale constraint) we have:

(7)  Max z = x1 + 6x2
     s.t. x1 + x2 − x3 + xa = 2
          x1 + 3x2 + x4 = 3
          x1, x2, x3, x4 ≥ 0.

The successive iterations of the algorithm produce solutions to a sequence of problems dictated by the values taken by xa. When xa finally reaches 0, the solution to the original problem is obtained. It is possible to reformulate the problem, suppressing the primal-dual simplex character, by interpreting xa parametrically. This gives, in theory, a continuum of solutions. E.g. if we let xa = 2t here, then t = 0 gives the original problem and t = 1 (say) the problem at a point (0,1) where z = 6. As t varies from 1 to 0 (a reversal of homotopy convention) the solution changes to the optimum z = 4.5 at (3/2, 1/2). The concept is easier than the implementation here; clearly the procedure needs a convenient solution method at each point along the path to be useful.

5. CONCLUSION

The character of path-following illustrated by these examples is that of solving a succession of related problems until they lead to the solution of a required problem. In solving linear programming problems, the opportunity to move from interior points towards optimal or almost optimal solutions, which must lie on the boundary, can be useful. In problems where the solution at one time may change because of changing conditions, either inherent through time dependence or induced by adding constraints and new information, this may be helpful. Sensitivity analyses, dual-simplex and parametric programming based on the simplex method are presently used to handle changes in all coefficients and constraints. If interior methods using a known feasible solution at a starting point can be made more efficient, they could rival the simplex method, at least in some cases.

REFERENCES
1. Polya, G. (1957) How To Solve It, 2nd edn, pp. 23-25. Princeton: Princeton University Press
2. Garcia, C.B. and Zangwill, W.I. (1981) Pathways to Solutions, Fixed Points and Equilibria. Englewood Cliffs: Prentice-Hall
3. Elken, T.R. (1982) Combining a path method and parametric linear programming for the computation of competitive equilibria. Mathematical Programming, 39, pp. 181-188
4. Karmarkar, N. (1984) A new polynomial-time algorithm for linear programming. Combinatorica, 4, pp. 373-395
5. Gill, P.E., Murray, W., Saunders, M.A., Tomlin, J.A. and Wright, M.H. (1986) On projected Newton barrier methods for linear programming and an equivalence to Karmarkar's projective method. Mathematical Programming, 36, pp. 183-209
6. Nazareth, J.L. (1986) Homotopy techniques in linear programming. Algorithmica, 1, pp. 529-535
7. Ferris, M.C. and Philpott, A.B. (1988) On the performance of Karmarkar's algorithm. Journal of the Operational Research Society, 39, 3, pp. 257-270
8. Powell, M.J.D. (1990) Karmarkar's algorithm: a view from nonlinear programming. Bulletin of the Institute of Mathematics and its Applications, 26, 8/9, pp. 165-181
9. Renegar, J. (1988) A polynomial-time algorithm, based on Newton's method, for linear programming. Mathematical Programming, 40, pp. 59-93
10. Gonzaga, C.C. (1989) An algorithm for solving linear programming problems in O(n^3 L) operations. In Progress in Mathematical Programming, edited by N. Megiddo, pp. 1-28. New York: Springer-Verlag
11. Megiddo, N. (1989) Pathways to the optimal set in linear programming. In Progress in Mathematical Programming, edited by N. Megiddo, pp. 131-158. New York: Springer-Verlag
12. Kojima, M., Mizuno, S. and Yoshise, A. (1989) A primal-dual interior point algorithm for linear programming. In Progress in Mathematical Programming, edited by N. Megiddo, pp. 29-47. New York: Springer-Verlag
13. Monteiro, R.D.C. and Adler, I. (1989) Interior path following primal-dual algorithms. Part I: Linear programming. Mathematical Programming, 44, pp. 27-41

ON A PARADOX IN LINEAR FRACTIONAL TRANSPORTATION PROBLEMS

VANITA VERMA AND M. C. PURI

1. INTRODUCTION

The classical cost minimization transportation problem (CMTP) is a well known allocation problem and has many applications. Variants of (CMTP), as discussed by Appa (1) and Brigden (2), make it amenable to more realistic situations. The addition of a flow constraint in a (CMTP) has been discussed by Khanna et al (3), and a methodology is developed to find the minimum cost solution for a given flow. Various authors like Martos (4), Charnes-Cooper (5) and Swarup (6) have solved the linear fractional programming problem. (LFTP) was first discussed by Swarup (7) and since then it has not received much attention, though it seems to have potential applications, as a linear fractional objective does appear in many real world problems. In the present paper, a condition is established which, when it holds, implies the existence of a 'paradoxical solution' in (LFTP). Although Szwarc (8) and Charnes-Klingman (9) discussed the paradoxical situation in a (CMTP), they have not computed the 'paradoxical solution' for a specific value of flow. Dinkelbach's (10) parametric approach is used to find the complete 'Paradoxical Range of Flow', and the 'Paradoxical Solution' is then obtained for any specified value of flow in that range, using the approaches developed in Dinkelbach (10) and Khanna et al (3).

DOI: 10.1201/9780429333439-27

2. THEORETICAL DEVELOPMENT

(LFTP), as discussed by Swarup (7), is

(P)   Minimize Z = N(X)/D(X) = ( Σ_{i∈I} Σ_{j∈J} c_ij x_ij ) / ( Σ_{i∈I} Σ_{j∈J} d_ij x_ij ),   X ∈ S,

where

S = { X = (x_ij) | Σ_{j∈J} x_ij = a_i, i ∈ I;  Σ_{i∈I} x_ij = b_j, j ∈ J;  x_ij ≥ 0, i ∈ I, j ∈ J }.

Sets I and J are the index sets of the m sources and n demand points respectively, the vectors (c_ij) and (d_ij) lie in R^{m×n} and X is the vector of mn decision variables, a_i being the availability at the ith source and b_j the requirement at the jth demand point. D(X) > 0 for all X ∈ S, where S is compact. Let an optimal feasible solution of (P) yield the value Z⁰ = N⁰/D⁰ of the objective function, and let Σ_{i=1}^m a_i = Σ_{j=1}^n b_j = F⁰. For 'paradoxical solutions' one has to investigate the solutions of the following problem:

(P1)   Minimize N(X)/D(X)   subject to   X ∈ S̄,

where

S̄ = { X = (x_ij) | Σ_{j∈J} x_ij ≥ a_i, i ∈ I;  Σ_{i∈I} x_ij ≥ b_j, j ∈ J;  x_ij ≥ 0, i ∈ I, j ∈ J }.

Clearly an optimal feasible solution of (P) is a feasible solution of (P1), yielding the objective function-flow pair (Z⁰, F⁰).

'Paradoxical Solution': A solution X̃ of (P1) yielding the objective function-flow pair (Z̃, F̃) is called a 'Paradoxical Solution' if there does not exist any other feasible solution of (P1) yielding a pair (Z, F) such that

Z = Z̃ but F > F̃,   or   Z < Z̃ but F = F̃.

'Paradoxical Range of Flow': If on increasing the flow from F⁰ to F_L the value of the objective function decreases steadily from Z⁰ to Z_L, where Z_L corresponds to flow F_L, and on further increasing the flow beyond F_L the objective function value starts rising, then the interval [F⁰, F_L] is the 'Paradoxical Range of Flow', as depicted in Figure 1.

Figure 1: Depicting the 'Paradoxical Range of Flow' (the objective function value falls from (Z⁰, F⁰) to (Z_L, F_L) as the flow increases).

Variables u_k, v_l, u_k′, v_l′: For a basic feasible solution of problem (P), where (k, l) is a basic cell, u_k, v_l, u_k′, v_l′ are given by

u_k + v_l = c_kl,
u_k′ + v_l′ = d_kl.


Theorem 1: If there exists a cell (i, j) in the table corresponding to an optimal solution X⁰ of problem (P) such that

D⁰(u_i + v_j) − N⁰(u_i′ + v_j′) < 0,

and a λ > 0 such that the basis corresponding to a basic feasible solution of problem (P), with a_i replaced by a_i + λ and b_j by b_j + λ, is the same as that corresponding to X⁰, then there exists a 'Paradoxical Solution'.

Proof: At an optimal basic feasible solution of (P),

Z⁰ = N⁰/D⁰ = ( Σ_{i∈I} Σ_{j∈J} c_ij x_ij ) / ( Σ_{i∈I} Σ_{j∈J} d_ij x_ij )

   = ( Σ_{i∈I} Σ_{j∈J} (c_ij − u_i − v_j + u_i + v_j) x_ij ) / ( Σ_{i∈I} Σ_{j∈J} (d_ij − u_i′ − v_j′ + u_i′ + v_j′) x_ij )

   = ( Σ_{i∈I} Σ_{j∈J} (u_i + v_j) x_ij ) / ( Σ_{i∈I} Σ_{j∈J} (u_i′ + v_j′) x_ij ),

since x_ij = 0 for nonbasic cells (i, j) and c_ij = u_i + v_j, d_ij = u_i′ + v_j′ for all basic cells.
Consider the problem (P2), obtained from (P) by replacing the availabilities and requirements a_k, b_l by ã_k, b̃_l, where

ã_k = a_k, k ∈ I − {i},   b̃_l = b_l, l ∈ J − {j},
ã_i = a_i + λ,   b̃_j = b_j + λ,

for λ > 0. Clearly, every feasible solution of (P2) is a feasible solution of (P1). The basis corresponding to the optimal basic feasible solution of (P) will yield a basic feasible solution for (P2), for which the value Z̃ of the objective function would be given by

Z̃ = ( Σ_{k∈I−{i}} a_k u_k + Σ_{l∈J−{j}} b_l v_l + (a_i + λ)u_i + (b_j + λ)v_j ) / ( Σ_{k∈I−{i}} a_k u_k′ + Σ_{l∈J−{j}} b_l v_l′ + (a_i + λ)u_i′ + (b_j + λ)v_j′ )

  = ( Σ_{k∈I} a_k u_k + Σ_{l∈J} b_l v_l + λ(u_i + v_j) ) / ( Σ_{k∈I} a_k u_k′ + Σ_{l∈J} b_l v_l′ + λ(u_i′ + v_j′) ).

Hence

Z̃ − Z⁰ = λ [ D⁰(u_i + v_j) − N⁰(u_i′ + v_j′) ] / ( D⁰ [ D⁰ + λ(u_i′ + v_j′) ] ) < 0,

so that Z̃ < Z⁰. The new flow is Σ_{l∈J} b̃_l = Σ_{l∈J} b_l + λ = F⁰ + λ. Thus the value of the objective function has fallen below the optimal value of (P) while the flow has increased by λ. Thus for flow F⁰ + λ there exists one feasible solution with value of the objective function less than Z⁰. This implies that for flow F⁰ + λ the optimal value of the objective function will be strictly less than Z⁰, i.e. that a 'Paradoxical Solution' exists.
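The sign computation in the proof of Theorem 1 can be checked mechanically. The sketch below uses the figures from the numerical illustration of Section 3 (N⁰ = 381, D⁰ = 60, and for cell (3,1) the dual sums u_3 + v_1 = 10 from the c_ij's and u_3′ + v_1′ = 3 from the d_ij's); the variable names are of course just for the example.

```python
# Check of Theorem 1 with the figures of the numerical illustration:
# Z0 = N0/D0 = 381/60; for cell (3,1): u3+v1 = 10, u3'+v1' = 3.
from fractions import Fraction

N0, D0 = 381, 60
uv_c, uv_d = 10, 3            # u_i + v_j  and  u_i' + v_j'

# condition of Theorem 1: D0*(u_i+v_j) - N0*(u_i'+v_j') < 0
assert D0 * uv_c - N0 * uv_d < 0          # 600 - 1143 < 0: paradox possible

lam = 7                        # raise a_3 and b_1 by lambda, same basis
Z0 = Fraction(N0, D0)
Z_new = Fraction(N0 + lam * uv_c, D0 + lam * uv_d)
assert Z_new < Z0
print(Z_new, "<", Z0, "at flow", 35 + lam)   # 451/81 < 127/20 at flow 42
```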

Remark: Assuming that a 'Paradoxical Solution' exists, if an optimal feasible solution of (P1) yields the value Z_L of the objective function and the associated flow is F_L, then the 'Paradoxical Range of Flow' is [F⁰, F_L].

Theorem 2:

Let X^L = (x^L_ij), i ∈ I, j ∈ J, be an optimal feasible solution of (P1), with

Σ_{i∈I} x^L_ij = b^L_j (≥ b_j),  j = 1, 2, …, n,
Σ_{j∈J} x^L_ij = a^L_i (≥ a_i),  i = 1, 2, …, m,

F_L = Σ_{j∈J} b^L_j = Σ_{i∈I} a^L_i, and let Z_L be the corresponding optimal value of the objective function. Then no optimal feasible solution of the problem

(P3)   Minimize N(X)/D(X)
subject to
Σ_{i∈I} x_ij = b_j + v_j ≥ b_j,  j ∈ J,
Σ_{j∈J} x_ij = a_i + u_i ≥ a_i,  i ∈ I,
x_ij ≥ 0,  i ∈ I, j ∈ J,

where u_i ≥ 0, i ∈ I, and v_j ≥ 0, j ∈ J, can yield an objective function value less than Z_L.

Proof: Let Y be an optimal feasible solution of (P3). Clearly Y will be a feasible solution of (P1). Therefore

N(Y)/D(Y) ≥ N(X^L)/D(X^L) = Z_L.

To obtain the 'Paradoxical Range of Flow': An optimal feasible solution of (P1) yields the 'Paradoxical Range of Flow'. Problem (P1) can be solved by using Dinkelbach's approach, which at every step involves the computation of

F(q) = Minimize { N(X) − q D(X) | X ∈ S̄ }.

Dinkelbach's approach solves (P1) in the following manner.

Initial step: For q_1 = 0, compute F(q_1). If F(q_1) = 0, then the corresponding optimal feasible solution (say) X^1 is optimal for (P1); otherwise compute q_2 = N(X^1)/D(X^1).


General step (r ≥ 2): Compute F(q_r) = Minimize { N(X) − q_r D(X) | X ∈ S̄ }, where q_r = N(X^{r−1})/D(X^{r−1}), X^{r−1} being the optimal feasible solution corresponding to F(q_{r−1}). If F(q_r) = 0, then X^r is an optimal feasible solution of (P1); otherwise repeat the general step for r = r + 1. Let the optimal feasible solution of (P1) be X^L, with flow F_L and objective function value Z_L. Then the 'Paradoxical Range of Flow' is given by [F⁰, F_L].
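Dinkelbach's iteration just described can be sketched in a few lines. For illustration the inner minimization is taken by brute force over a small finite set of (N(X), D(X)) pairs (made-up stand-ins for the transportation subproblem that the paper actually solves at each step).

```python
# Dinkelbach's scheme: minimize N(X)/D(X) by repeatedly solving
# F(q) = min { N(X) - q*D(X) } until F(q*) = 0.
candidates = [(6, 2), (5, 4), (9, 2), (4, 5)]   # (N(X), D(X)) for each X

def dinkelbach(cands, tol=1e-12):
    q = 0.0                                     # initial step: q1 = 0
    while True:
        # inner problem: F(q) = min N(X) - q*D(X) over the feasible set
        N, D = min(cands, key=lambda nd: nd[0] - q * nd[1])
        F = N - q * D
        if abs(F) < tol:                        # F(q*) = 0: this X is optimal
            return q, (N, D)
        q = N / D                               # general step: next parameter

q_star, best = dinkelbach(candidates)
print(q_star, best)     # 0.8 (4, 5): the minimum ratio N/D
```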

'Paradoxical Solution' for a specified flow in [F⁰, F_L]: Let the specified flow be F ∈ [F⁰, F_L]. Then the 'Paradoxical Solution' for F is given by an optimal feasible solution of (P4):

(P4)   Minimize N(X)/D(X)
subject to
Σ_{j∈J} x_ij ≥ a_i,  i ∈ I,
Σ_{i∈I} x_ij ≥ b_j,  j ∈ J,
Σ_{i∈I} Σ_{j∈J} x_ij = F,
x_ij ≥ 0,  i ∈ I, j ∈ J.

Again, an optimal feasible solution of (P4) is obtained by Dinkelbach's approach, i.e. by solving

(P5)   Minimize N(X) − q D(X)
subject to
Σ_{j∈J} x_ij ≥ a_i,  i ∈ I,
Σ_{i∈I} x_ij ≥ b_j,  j ∈ J,
Σ_{i∈I} Σ_{j∈J} x_ij = F,
x_ij ≥ 0,  i ∈ I, j ∈ J,

till a q* is reached for which F(q*) = 0. Problem (P5) can be solved by using Khanna et al.'s approach, in which an equivalent problem (P5′) is constructed whose optimal feasible solution yields an optimal feasible solution of (P5).

(P5′)   Minimize Σ_{i=1}^{m+1} Σ_{j=1}^{n+1} (c_ij′ − q d_ij′) y_ij

subject to

Σ_{j=1}^{n+1} y_ij = a_i,  i = 1, 2, …, m,
Σ_{i=1}^{m+1} y_ij = b_j,  j = 1, 2, …, n,
Σ_{i=1}^{m+1} y_{i,n+1} = F − Σ_{j=1}^n b_j,
Σ_{j=1}^{n+1} y_{m+1,j} = F − Σ_{i=1}^m a_i,
y_ij ≥ 0 for all i, j,

where

c_ij′ − q d_ij′ = c_ij − q d_ij,  (i, j) ∈ I × J,
c_{m+1,j}′ − q d_{m+1,j}′ = min_{i∈I} (c_ij − q d_ij),  j ∈ J,
c_{i,n+1}′ − q d_{i,n+1}′ = min_{j∈J} (c_ij − q d_ij),  i ∈ I,
c_{m+1,n+1}′ − q d_{m+1,n+1}′ = M,  M being an arbitrarily large positive number.

If X* = (x_ij*) is an optimal feasible solution of (P5) and Y* = (y_ij*) is an optimal feasible solution of (P5′), then

x_ij* = y_ij* + y_{i,n+1}* + y_{m+1,j}*.

Thus a 'paradoxical solution' for a specified flow can be obtained by using simultaneously the approaches of Dinkelbach and Khanna et al.

3. NUMERICAL ILLUSTRATION

Consider an (LFTP) with m = 4 = n:

(P)   Minimize N(X)/D(X) = ( Σ_{i=1}^4 Σ_{j=1}^4 c_ij x_ij ) / ( Σ_{i=1}^4 Σ_{j=1}^4 d_ij x_ij )

subject to

Σ_{j=1}^4 x_1j = 5,   Σ_{j=1}^4 x_2j = 3,   Σ_{j=1}^4 x_3j = 15,   Σ_{j=1}^4 x_4j = 12,
Σ_{i=1}^4 x_i1 = 5,   Σ_{i=1}^4 x_i2 = 14,   Σ_{i=1}^4 x_i3 = 10,   Σ_{i=1}^4 x_i4 = 6,
x_ij ≥ 0 for all (i, j), i = 1, 2, 3, 4 and j = 1, 2, 3, 4,

where the c_ij's and d_ij's are given by Table 1.

TABLE 1. Indicating the c_ij's and d_ij's in (P).

The optimal feasible solution of (P) is given by Table 2.

TABLE 2. Depicting the optimal feasible solution of (P), with Z⁰ = 381/60, F⁰ = 35.

For cell (3,1),

D⁰(u_3 + v_1) = 60(10) = 600 < 1143 = 381(3) = N⁰(u_3′ + v_1′).

On increasing the flow by 7 units along the 3rd row and 1st column, retaining the same basis, a feasible solution to the problem with increased flow 42 is given in Table 3, with corresponding value of the objective function 451/81 (< 381/60).

TABLE 3: Depicting a feasible solution of the problem with flow 42.

The 'Paradoxical Range of Flow' is given by solving problem (P) with the equality relation constraints replaced by '≥' type constraints. The optimal feasible solution of this problem is given in Table 4.

TABLE 4: Depicting the optimal feasible solution of the above problem, yielding flow 70, with value of the objective function 456/165 < Z⁰ = 381/60.

It can be observed that in the above table there does not exist any cell (i, j) for which there is a λ > 0 such that

165(u_i + v_j) − 456(u_i′ + v_j′) < 0.

The 'Paradoxical Solution' for flow F = 42 is given in Table 5.

TABLE 5: Depicting the optimal feasible solution of the problem for flow 42.


4. CONCLUDING REMARKS

An approach is suggested for finding the 'Paradoxical Range of Flow' and, for a given flow in such a range, the 'Paradoxical Solution' is also derived. But to solve (P1) without using Dinkelbach's approach, or any other linear programming formulation of (P1) with the transportation structure of (P) retained, is still an open problem, since Swarup's technique (1966) does not work on the following problem:

Minimize N(X)/D(X)
subject to
Σ_{j∈J} x_ij = a_i,  i ∈ I,
Σ_{i∈I} x_ij ≥ b_j,  j ∈ J,
x_ij ≥ 0,  i ∈ I, j ∈ J.

REFERENCES
1. Appa, G.M. (1973) The transportation problem and its variants, Opns. Res. Quart., 24, no. 1, 79-99
2. Brigden, M.E.B. (1974) A variant of the transportation problem in which the constraints are of mixed type, Opns. Res. Quart., 25, no. 3, 437-445
3. Khanna, Saroj and Puri, M.C. (1983) Solving a transportation problem with mixed constraints and a specified transportation flow, Opsearch, 20, no. 1, 16-24
4. Martos, B. (1960) Hyperbolic programming, Pub. Math. Inst. Hung. Ac. Sc., 5, Series B, 4, 383-406
5. Charnes, A. and Cooper, W.W. (1962) Programming with linear fractional functionals, Naval Research Logistics Quarterly, 9, no. 3-4, 181-186
6. Swarup, K. (1965) Some aspects of linear fractional functional programming, Australian J. Stats, 7, no. 3, 90-104
7. Swarup, K. (1966) Transportation technique in linear fractional functionals programming, Journal of the Royal Naval Scientific Service, 21, no. 5
8. Szwarc, W. (1971) The transportation paradox, Naval Research Logistics Quarterly, 18, no. 2, 185-202
9. Charnes, A. and Klingman, D. (1971) The "more for less" paradox in the distribution model, Cahiers du Centre d'Etudes de Recherche Operationnelle, 13, no. 11
10. Dinkelbach, W. (1967) On non-linear fractional programming, Management Science, 13, no. 7

NASH EQUILIBRIUM POINTS OF STOCHASTIC N-UELS

P. ZEEPHONGSEKUL
Department of Mathematics, RMIT
G.P.O. Box 2476V, Melbourne, Victoria, Australia, 3001

Abstract: By formulating stochastic n-uels (e.g. 2-uels = duels, 3-uels = truels) as n-person non-cooperative stochastic games, it is shown that Nash equilibria exist in stationary mixed strategies for n-uels. We also show how these equilibria can be located for duels and truels, and indicate the necessary steps for obtaining equilibrium points when the number of adversaries is greater than three.

Keywords: Nash Equilibria, Noncooperative Games, Duels, Truels and N-uels

1. INTRODUCTION

Human conflicts resulting in acts of warfare are unfortunately as old as mankind. Consequently, there have been many attempts since antiquity at understanding the motivations behind a conflict situation, and at discovering the best strategies which will lead to victory for the favoured side. Nevertheless, attempts at an analytical and mathematical formulation of combat operations are fairly recent. Apparently, the first of these attempts was made by F.W. Lanchester (see Karr 2) in 1916, where a simple deterministic model of attrition, described by ordinary differential equations with loss rates as parameters, was introduced.

DOI: 10.1201/9780429333439-28


The area of conflict modelling was given a tremendous boost with the appearance of the paper by John von Neumann on game theory in 1928 (culminating in the book3 co-authored with O. Morgenstern in 1944). For the first time, all competitive phenomena, be they military, economic, biological or behavioural, were shown to have a common thread running through them. The main contributions conferred by game theory upon the area of conflict modelling have been twofold: firstly, it demonstrates unequivocally that all conflict phenomena are necessarily stochastic in nature, and secondly, it provides a rational prescription, the minimax rule, for the actions that should be taken in a combat environment. As game theory was initially developed for the simple two-person zero-sum case, it was not surprising that the first application of the theory to the area of conflict modelling was to duels. Duels have been discussed very extensively in the literature, but the papers fall into two broad categories: (a) Duels which are concerned solely with the microscopic features of combat, such as survival probabilities, ammunition limitation, cover, mobility, etc. These models use very little game-theoretic technique and do not dwell on strategies. The reader is referred to the review article on such duels by C.J. Ancker, Jr.4 and the references contained therein. (b) Duels which are games of timing. Here, each player makes a decision as to when to shoot so as to optimize a certain payoff function. By adding to or relaxing certain conditions, many interesting and novel problems (some unsolved) have appeared. Kimeldorf5 gives a good overview of such duels as well as the present status of knowledge. A good mathematical treatment can also be found in Dresher6 and Karlin7.

With the exception of Kilgour8,9,10, little work appears to have been done on conflict models where there are more than two participants, i.e. on truels and n-uels. Although the book3 contains a very extensive coverage of n-person games, applications of the theory to the real world are fraught with philosophical and computational difficulties. One of the problems is that such games are usually not zero-sum, and so the standard notion of an equilibrium point


no longer applies. Fortunately, for a large class of n-person games, the noncooperative ones, a concept of equilibrium point has been provided by J. Nash1 and is called a Nash equilibrium point (NEP) in his honour. Further, Nash equilibria were shown to exist for a class of games, the n-person stochastic games, by Fink11 and Takahashi12. Such games will be important in this paper. The main aims of this paper are, firstly, to show how stochastic n-uels can be modelled by n-person stochastic games, and secondly, to demonstrate how one can locate Nash equilibria for n-uels. We concentrate on duels and truels but will also outline the necessary steps for the cases where $n \geq 4$. We give the definitions and principal results of n-person games and Nash equilibrium in Sections 2 and 3. These results are then applied to n-uels in Section 4. We remark that although our work in Section 4.2 is related to Kilgour8, our approach is completely different in that we use stochastic n-person games as our vehicle. Also, we seek Nash equilibria in mixed strategies, whereas Kilgour8 considered only optimal pure strategies. Finally, in Section 5, we conclude with some suggestions for future work in the area.

2. STOCHASTIC NON-COOPERATIVE N-PERSON GAMES: TERMINOLOGY AND NOTATION

We are concerned with modelling n-uels by a noncooperative n-person game in which each player acts independently of the other players with a view to maximizing his or her own gain. We admit no coalitions between any groups of players, and assume that the players are "rational", so that each will play according to what is "best" for his or her survival. N-uels cannot be represented as games in the conventional sense (Von Neumann and Morgenstern3), as there are transitions between states, i.e. the survivor sets, and hence must be modelled by stochastic games, introduced by L.S. Shapley13 and later generalized to the n-person case by A.M. Fink11 and M. Takahashi12. In this section, we give the relevant terminology and notation for a noncooperative stochastic n-person game. Formally, let $I = \{1, 2, \ldots, n\}$ represent the set of players and $S$ the finite set of possible states, with cardinality $m$.

Associated with each player we have a finite strategy space $D_i^s$ (with members $\sigma_i^s$) consisting of all pure strategies available to player $i$ when in state $s$. Let $P_i^s$, $i \in I$, $s \in S$, be the space of probability measures on $D_i^s$, i.e. $P_i^s$ is the set of mixed strategies of player $i$ when in state $s$. Denote a representative member of $P_i^s$ by $\mu_i^s$. We shall also use the following notation to represent tuples of pure and mixed strategies for later usage (the symbol $\times$ represents Cartesian product throughout):

$$\sigma^s = (\sigma_1^s, \sigma_2^s, \ldots, \sigma_n^s) \in D_1^s \times \cdots \times D_n^s = D^s$$
$$\sigma_h = (\sigma_h^1, \sigma_h^2, \ldots, \sigma_h^m) \in D_h^1 \times \cdots \times D_h^m = D_h$$
$$\mu^s = (\mu_1^s, \mu_2^s, \ldots, \mu_n^s) \in P_1^s \times \cdots \times P_n^s = P^s$$
$$\mu_h = (\mu_h^1, \mu_h^2, \ldots, \mu_h^m) \in P_h^1 \times \cdots \times P_h^m = P_h$$
$$\tilde{\mu} = (\mu_1, \mu_2, \ldots, \mu_n) \in P_1 \times \cdots \times P_n = \tilde{P}$$
$$\mu = (\mu^1, \mu^2, \ldots, \mu^m) \in P^1 \times \cdots \times P^m = P$$

We remark that $D^s$ is the space of $n$-tuples of pure strategies available to the players when in state $s$; $D_h$ is the space of $m$-tuples of pure strategies available to player $h$ in the various states; $P^s$ and $P_h$ also have the same interpretation as $D^s$ and $D_h$ respectively, but involving mixed, rather than pure, strategies. The spaces $\tilde{P}$ and $P$ are obviously the collections of mixed strategies available to the $n$ players and in the various states respectively. We introduce the substitution notation $(\mu^s; \rho_h^s)$ to stand for the $n$-tuple $(\mu_1^s, \ldots, \mu_{h-1}^s, \rho_h^s, \mu_{h+1}^s, \ldots, \mu_n^s)$, where $\rho_h^s \in P_h^s$, and more generally, $(\tilde{\mu}; \rho_h)$ to represent $(\mu_1, \ldots, \mu_{h-1}, \rho_h, \mu_{h+1}, \ldots, \mu_n)$, where $\rho_h \in P_h$.

If, in state $s$, the $i$th player adopts the pure strategy $\sigma_i^s$, $i \in I$, then the payoff (gain, utility) to player $h$ will be denoted by
$$g_h^s(\sigma_1^s, \sigma_2^s, \ldots, \sigma_n^s) \equiv g_h^s(\sigma^s).$$
If, on the other hand, the $i$th player adopts a mixed strategy $\mu_i^s$, $i \in I$, then the payoff to player $h$ will be denoted by
$$g_h^s(\mu_1^s, \mu_2^s, \ldots, \mu_n^s) \equiv g_h^s(\mu^s).$$
We note that
$$g_h^s(\mu^s) = \sum_{\sigma^s \in D^s} g_h^s(\sigma^s) \prod_{i=1}^{n} \mu_i^s(\sigma_i^s). \tag{2.1}$$
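Formula (2.1) is simply the expectation of $g_h^s$ under the product of the players' mixed strategies. A minimal computational sketch (the two-strategy payoff function used here is hypothetical, not from the paper):

```python
import itertools

def expected_payoff(g, mixed):
    """(2.1): sum over pure profiles of g(profile) * prod_i mu_i(sigma_i)."""
    total = 0.0
    supports = [list(m) for m in mixed]
    for profile in itertools.product(*supports):
        weight = 1.0
        for i, choice in enumerate(profile):
            weight *= mixed[i][choice]
        total += weight * g(profile)
    return total

# Two players each mixing 50/50 over {"shoot", "abstain"}; the payoff to
# player 1 is 1 unless both abstain (a purely illustrative choice).
g1 = lambda p: 0.0 if p == ("abstain", "abstain") else 1.0
mu = [{"shoot": 0.5, "abstain": 0.5}, {"shoot": 0.5, "abstain": 0.5}]
print(expected_payoff(g1, mu))  # 0.75
```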

We now consider the evolution of the game. If the players find themselves in a state $s \in S$, each player independently chooses a mixed strategy $\mu_i^s$, $i \in I$, resulting in the payoff $g_h^s(\mu^s)$ to player $h$, $h \in I$. The game then proceeds to another state $r$ with probability $p^{sr}(\mu^s)$, or terminates with probability $p^{s0}(\mu^s)$, where $0$ refers to a state (or collection of states) belonging to $S$ which leads to the termination of the game. The transition probabilities are assumed to be continuous over a suitable topology (the vague topology) on the strategy spaces. Since the payoffs accrued by each player accumulate during the course of the play, the following assumptions will be made so as to achieve a finite total payoff:

(i) there exists a positive number $M$, independent of $s$ and $h$, such that
$$|g_h^s| \leq M \quad \forall\, h \in I \text{ and } s \in S; \tag{2.2}$$

(ii)
$$\inf_{s,\,\sigma^s} p^{s0}(\sigma^s) = p_0 > 0. \tag{2.3}$$

Definition 2.1  A stochastic noncooperative n-person game $\Gamma$ is defined as the collection of tuples $(D_i^s, g_h^s, P_i^s)$, $i \in I$, $s \in S$, together with the set of transition probabilities $p^{sr}$ and $p^{s0}$.

Denote the game that starts off in state $s$ by $\Gamma^s$. Throughout the paper, we will assume that a player chooses the same mixed strategy whenever he is in a particular state. Such a policy on the choice of strategies to be adopted by the players is called a stationary policy, and the individual strategy a stationary strategy. If the game starts off in state $s$ and the contestants adopt the stationary strategies $\tilde{\mu}$, then we shall denote the total expected payoff to player $h$ by $G_h^s(\tilde{\mu})$. From probabilistic considerations, we have the following relation:
$$G_h^s(\tilde{\mu}) = g_h^s(\mu^s) + \sum_{r=1}^{m} p^{sr}(\mu^s)\, g_h^r(\mu^r) + \sum_{r=1}^{m} \sum_{t=1}^{m} p^{sr}(\mu^s)\, p^{rt}(\mu^r)\, g_h^t(\mu^t) + \cdots = g_h^s(\mu^s) + \sum_{r=1}^{m} p^{sr}(\mu^s)\, G_h^r(\tilde{\mu}). \tag{2.4}$$
We remark that the above series can be shown to be absolutely convergent by (2.2) and (2.3).
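For fixed stationary strategies, (2.4) determines $G_h$ as the fixed point of a contraction (by (2.3) the continuation probabilities in each state sum to at most $1 - p_0 < 1$), so it can be computed by simple iteration. A minimal sketch with hypothetical one-state data, not from the paper:

```python
def total_payoffs(g, p, iters=2000):
    """Iterate (2.4): G[s] = g[s] + sum_r p[s][r] * G[r].

    g[s] is the per-round payoff in state s; p[s][r] is the transition
    probability from s to r (row sums < 1, the deficit being the
    stopping probability p^{s0})."""
    m = len(g)
    G = [0.0] * m
    for _ in range(iters):
        G = [g[s] + sum(p[s][r] * G[r] for r in range(m)) for s in range(m)]
    return G

# One state paying 1 per round and stopping with probability 0.5:
# G = 1 + 0.5 * G, i.e. G = 2.
print(round(total_payoffs([1.0], [[0.5]])[0], 9))  # 2.0
```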

3. NASH EQUILIBRIA FOR STOCHASTIC NON-COOPERATIVE N-PERSON GAMES

Definition 3.1  A stationary strategy $\tilde{\mu}$ is a Nash equilibrium point (NEP) of the stochastic game $\Gamma^s$ if
$$G_h^s(\tilde{\mu}; \rho_h) \leq G_h^s(\tilde{\mu}) \tag{3.1}$$
for any $\rho_h \in P_h$ and for every $h \in I$. If the inequality (3.1) holds for every $s \in S$ as well, then $\tilde{\mu}$ is a NEP of the stochastic game $\Gamma$.

The above definition, for non-stochastic non-cooperative n-person games, was first introduced by J.F. Nash1. We remark that in the case of a two-person zero-sum stochastic game, (3.1) reduces to the definition of a saddle point in the conventional Von Neumann and Morgenstern sense. Like saddle points, a NEP is not necessarily unique; but unlike saddle points in two-person zero-sum games, the payoffs at different equilibria, i.e. the values of the game, need not be the same. Hence, different players may prefer different Nash equilibria. The following theorem, proved by M. Takahashi12, shows that under certain mild conditions a stochastic noncooperative n-person game always has a NEP. A closely related result, but for total discounted returns, was also proved by A.M. Fink11.

Theorem 3.1  Under conditions (2.2) and (2.3), any stochastic game $\Gamma$ has a Nash equilibrium point.

The above theorem is an existence theorem and does not provide a way of locating a NEP. The next theorem indicates a method of locating a NEP through pure strategies and will be very useful in the sequel.

Theorem 3.2  A stationary strategy $\tilde{\mu}^* \in \tilde{P}$ is a NEP of $\Gamma^s$ if and only if
$$G_h^s(\tilde{\mu}^*; \sigma_h) \leq G_h^s(\tilde{\mu}^*) \tag{3.2}$$
for any $\sigma_h \in D_h$ and for every $h \in I$.

Proof  The existence of $\tilde{\mu}^*$ was shown in Theorem 3.1. As a pure strategy is just a special case of a mixed strategy, (3.2) is merely a consequence of the definition of a NEP. To prove the converse, we assume that (3.2) holds and note that we can use (2.1) and (2.4) to show that
$$G_h^s(\tilde{\mu}) = \sum_{\sigma^s \in D^s} \prod_{i=1}^{n} \mu_i^s(\sigma_i^s) \left[ g_h^s(\sigma^s) + \sum_{r=1}^{m} p^{sr}(\sigma^s)\, G_h^r(\tilde{\mu}) \right]. \tag{3.3}$$
Applying (3.3) to the tuple $(\tilde{\mu}^*; \sigma_h)$ and multiplying the resulting inequality (3.2) by an arbitrary mixed strategy $\rho_h$ for player $h$, (3.1) will result on summing over $D_h$, as the right hand side of (3.2) is independent of $\sigma_h$. $\Box$
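Theorem 3.2 reduces the equilibrium check to finitely many pure-strategy deviations. As an illustration in the simplest possible (one-state, immediately terminating) case, where $G_h$ coincides with the one-shot payoff, the following sketch verifies the mixed NEP of matching pennies; the game and numbers are a standard textbook example, not from the paper:

```python
def payoff(h, mu1, mu2, g):
    """Expected payoff to player h under mixed strategies mu1, mu2."""
    return sum(mu1[a] * mu2[b] * g[h][a][b] for a in range(2) for b in range(2))

# Matching pennies: player 0 wins on a match, player 1 on a mismatch.
g = [[[1, -1], [-1, 1]], [[-1, 1], [1, -1]]]

mu = ([0.5, 0.5], [0.5, 0.5])          # candidate NEP
pure = ([1.0, 0.0], [0.0, 1.0])        # the two pure strategies

v0 = payoff(0, mu[0], mu[1], g)
v1 = payoff(1, mu[0], mu[1], g)
# Condition (3.2): no pure deviation improves either player's payoff.
ok = (all(payoff(0, d, mu[1], g) <= v0 + 1e-12 for d in pure)
      and all(payoff(1, mu[0], d, g) <= v1 + 1e-12 for d in pure))
print(ok)  # True
```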

4. STOCHASTIC N-UELS

We label the n players by 1, 2, ..., n. Each player regards all other players as adversaries, and his choice of target represents his pure strategy. A player adopts a mixed strategy if he selects his target according to some probability distribution. After selecting the targets, the players all fire simultaneously, and a hit will always result in a "kill". A player's marksmanship is defined as the probability that he hits his target, and this probability is assumed fixed throughout the game. At the conclusion of a shooting round, the survivors repeat the game, and the process continues until there is at most one survivor. We shall assume that each player has an unlimited supply of ammunition. The survivors at the end of each round constitute the state at the end of the round; thus the cardinality of $S$ is equal to $2^n$. For n-uels, the payoff at the end of each round is strictly a function of the state reached at the end of that round. In the next two sections,


we shall obtain Nash equilibrium points for duels (2-uels) and truels (3-uels).

4.1 DUELS

In this simplest of n-uels, $I = \{1,2\}$ and $S = \{A = I,\; B = \{1\},\; C = \{2\},\; D = \emptyset\}$. Let $\alpha$ and $\beta$ be the marksmanship of players 1 and 2 respectively. We shall assume $0 < \alpha, \beta < 1$. We shall also assume that each player has the choice of abstaining, i.e. of shooting into the air instead of at his opponent. Denote the strategies of players 1 and 2 by $\mu_1 = (x, \bar{x})$ and $\mu_2 = (y, \bar{y})$ respectively, where, e.g., $x$ refers to the probability that player 1 shoots at 2 and $\bar{x} = 1 - x$ is the probability that he abstains. (Whenever it is convenient, we shall use the notation $\bar{a}$ to denote $1 - a$ for any real number $a$.) We shall also assume that either $x > 0$ or $y > 0$. Figure 1 gives the 4 possible combinations of strategies available to the players, e.g. (i) refers to the case where both players shoot at each other.

[Figure 1: the four possible combinations (i)–(iv) of the two players' strategies; in (i) both players shoot at each other.]
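The states of an n-uel are just the possible survivor sets, of which there are $2^n$; for the duel these are precisely the four states $A$, $B$, $C$, $D$ above. A minimal enumeration sketch:

```python
from itertools import combinations

def survivor_states(n):
    """All subsets of the n players -- the state space S, of cardinality 2**n."""
    players = range(1, n + 1)
    return [frozenset(c) for r in range(n + 1) for c in combinations(players, r)]

# The duel has 4 states (A, B, C, D); the truel of Section 4.2 has 8.
print(len(survivor_states(2)), len(survivor_states(3)))  # 4 8
```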

Let the payoff functions satisfy $(i = 1, 2)$
$$g_i(A) = u_i \quad \text{where } 0 < u_i < 1;$$
$$g_1(B) = 1, \quad g_2(B) = 0;$$
$$g_1(C) = 0, \quad g_2(C) = 1;$$
$$g_1(D) = g_2(D) = 0. \tag{4.1}$$

As the chance of the game terminating in one round equals $xy(\alpha + \bar{\alpha}\beta) + x\bar{y}\alpha + \bar{x}y\beta$, this fact together with (4.1) implies that conditions (2.2) and (2.3) are fulfilled. Hence, by Theorem 3.1, a duel has a Nash equilibrium point. We now use Theorem 3.2 to obtain all these equilibria for a duel starting in state $A$. Applying (2.4) and referring to Figure 1,
$$G_1^A(\mu_1, \mu_2) = u_1 + (xy\bar{\alpha}\bar{\beta} + \bar{x}\bar{y} + x\bar{y}\bar{\alpha} + \bar{x}y\bar{\beta})\, G_1^A(\mu_1, \mu_2) + (xy\alpha\bar{\beta} + x\bar{y}\alpha)\, g_1(B) = u_1 + x\alpha(1 - y\beta) + (1 - x\alpha)(1 - y\beta)\, G_1^A(\mu_1, \mu_2),$$
hence
$$G_1^A(\mu_1, \mu_2) = \frac{u_1 + x\alpha(1 - y\beta)}{1 - (1 - x\alpha)(1 - y\beta)}. \tag{4.2}$$
Similarly, the total expected payoff to player 2 is
$$G_2^A(\mu_1, \mu_2) = \frac{u_2 + y\beta(1 - x\alpha)}{1 - (1 - x\alpha)(1 - y\beta)}. \tag{4.3}$$
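Formulas (4.2)–(4.3) are easy to evaluate numerically; a small sketch (the parameter values are hypothetical, chosen only for illustration):

```python
def duel_payoffs(x, y, alpha, beta, u1, u2):
    """(4.2)-(4.3): total expected payoffs of the duel starting in state A."""
    denom = 1.0 - (1.0 - x * alpha) * (1.0 - y * beta)
    G1 = (u1 + x * alpha * (1.0 - y * beta)) / denom
    G2 = (u2 + y * beta * (1.0 - x * alpha)) / denom
    return G1, G2

# If player 1 always shoots (x = 1) and player 2 always abstains (y = 0),
# (4.2) gives G1 = (u1 + alpha)/alpha: u1 accrues each round spent in A
# until player 1 finally hits and collects g1(B) = 1.
G1, G2 = duel_payoffs(1.0, 0.0, 0.6, 0.5, 0.3, 0.4)
print(abs(G1 - (0.3 + 0.6) / 0.6) < 1e-12)  # True
```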

From player 1's viewpoint, the following inequalities
$$\frac{u_1 + \alpha(1 - y\beta)}{1 - \bar{\alpha}(1 - y\beta)} \leq \frac{u_1 + x\alpha(1 - y\beta)}{1 - (1 - x\alpha)(1 - y\beta)} \tag{4.4}$$
and
$$\frac{u_1}{y\beta} \leq \frac{u_1 + x\alpha(1 - y\beta)}{1 - (1 - x\alpha)(1 - y\beta)} \tag{4.5}$$
must be satisfied at a NEP. Note that $y > 0$ in (4.5), as the inequality was obtained assuming the pure strategy $x = 0$. We now obtain the complete solutions to (4.4) and (4.5) by considering separately the cases $x = 0$, $x = 1$ and $0 < x < 1$.

(I) $x = 0$: Inequality (4.5) is satisfied as an equality, and (4.4) reduces to
$$\alpha\beta^2 y^2 - \alpha\beta(1 + u_1)\,y + \alpha u_1 \geq 0,$$
i.e.
$$\beta^2 y^2 - \beta(1 + u_1)\,y + u_1 \geq 0, \tag{4.6}$$
which, as $\beta < 1$, has the solution set $\{y : 0 < y \leq u_1/\beta\}$ if $u_1 \leq \beta$, or $\{y : 0 < y \leq 1\}$ if $u_1 > \beta$.

(II) $x = 1$: Inequality (4.4) is satisfied as an equality, and (4.5) reduces to
$$\beta^2 y^2 - \beta(1 + u_1)\,y + u_1 \leq 0. \tag{4.7}$$
The solution set to (4.7) is $\{y : u_1/\beta \leq y \leq 1\}$ if $u_1 \leq \beta$, and it is void if $u_1 > \beta$.

(III) $0 < x < 1$: In this case, (4.4) reduces to the inequality
$$u_1\bigl(1 - (1 - x\alpha)(1 - y\beta)\bigr) \geq \bigl(u_1 + \alpha x(1 - y\beta)\bigr)\,y\beta,$$
i.e. $(1 - y\beta)(u_1 - y\beta) \geq 0$, and (4.5) gives rise to the reverse inequality $(1 - y\beta)(u_1 - y\beta) \leq 0$. Therefore, in the case $0 < x < 1$, we have
$$(1 - y\beta)(u_1 - y\beta) = 0, \tag{4.8}$$
which, as $\beta < 1$, gives rise to the solution $y = u_1/\beta$ when $u_1 \leq \beta$.
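Case (III) says that an interior NEP forces $y\beta = u_1$, i.e. player 1 is then indifferent among all his strategies. This indifference is easy to confirm numerically from (4.2); the parameter values below are hypothetical:

```python
def G1(x, y, alpha, beta, u1):
    # (4.2): player 1's total expected payoff from state A.
    denom = 1.0 - (1.0 - x * alpha) * (1.0 - y * beta)
    return (u1 + x * alpha * (1.0 - y * beta)) / denom

alpha, beta, u1 = 0.6, 0.5, 0.3     # hypothetical, with u1 <= beta
y_star = u1 / beta                  # the solution of (4.8)

# With y = u1/beta, the value of G1 does not depend on x at all.
values = [G1(x / 10, y_star, alpha, beta, u1) for x in range(11)]
print(max(values) - min(values) < 1e-12)  # True
```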

By symmetry, and using player 2's viewpoint, the following inequalities
$$\frac{u_2 + \beta(1 - x\alpha)}{1 - \bar{\beta}(1 - x\alpha)} \leq \frac{u_2 + y\beta(1 - x\alpha)}{1 - (1 - x\alpha)(1 - y\beta)} \tag{4.9}$$
and
$$\frac{u_2}{x\alpha} \leq \frac{u_2 + y\beta(1 - x\alpha)}{1 - (1 - x\alpha)(1 - y\beta)} \tag{4.10}$$
must be satisfied at a NEP. Note that $x > 0$ in (4.10), as the inequality was obtained assuming the pure strategy $y = 0$. Considering separately the cases $y = 0$, $y = 1$ and $0 < y < 1$, (4.9) and (4.10) result in the inequalities
$$\alpha^2 x^2 - \alpha(1 + u_2)\,x + u_2 \geq 0, \tag{4.11}$$
$$\alpha^2 x^2 - \alpha(1 + u_2)\,x + u_2 \leq 0 \tag{4.12}$$
and
$$(1 - x\alpha)(u_2 - x\alpha) = 0 \tag{4.13}$$
respectively. Inequality (4.11) has the solution set $\{x : 0 < x \leq u_2/\alpha\}$ if $u_2 \leq \alpha$, or $\{x : 0 < x \leq 1\}$ if $u_2 > \alpha$; (4.12) has the solution set $\{x : u_2/\alpha \leq x \leq 1\}$ if $u_2 \leq \alpha$, and it is void when $u_2 > \alpha$; finally, as $\alpha < 1$, (4.13) has the solution $x = u_2/\alpha$ if $u_2 \leq \alpha$. Applying Theorem 3.2 by considering all relevant intersections of the above solution sets, we obtain the following theorem, which identifies all the NEPs of a stochastic duel.

Theorem 4.1

(i) If $u_1$ […]

4.2 TRUELS

[…] $x_t > 0$, $y_t > 0$ or $z_t > 0$.
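Although the statement of Theorem 4.1 and the opening of this section are illegible in the source, the interior duel candidate produced by (4.8) and (4.13), $y = u_1/\beta$ and $x = u_2/\alpha$, can at least be checked directly against condition (3.2) of Theorem 3.2. A sketch with hypothetical parameters:

```python
def duel_payoffs(x, y, alpha, beta, u1, u2):
    # (4.2)-(4.3): total expected payoffs of the duel from state A.
    denom = 1.0 - (1.0 - x * alpha) * (1.0 - y * beta)
    return ((u1 + x * alpha * (1.0 - y * beta)) / denom,
            (u2 + y * beta * (1.0 - x * alpha)) / denom)

alpha, beta, u1, u2 = 0.6, 0.5, 0.3, 0.4    # hypothetical, u1 <= beta, u2 <= alpha
x_s, y_s = u2 / alpha, u1 / beta            # interior candidate from (4.8), (4.13)

G1, G2 = duel_payoffs(x_s, y_s, alpha, beta, u1, u2)
# (3.2): neither player gains by deviating to a pure strategy (shoot or abstain).
no_gain_1 = all(duel_payoffs(d, y_s, alpha, beta, u1, u2)[0] <= G1 + 1e-12
                for d in (0.0, 1.0))
no_gain_2 = all(duel_payoffs(x_s, d, alpha, beta, u1, u2)[1] <= G2 + 1e-12
                for d in (0.0, 1.0))
print(no_gain_1 and no_gain_2)  # True
```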


Let the payoff functions satisfy $g_i(A) = u_i$, where $0 < u_i < 1$ $(i = 1, 2, 3)$;
$$g_1(B) = g_1(D) = u_1', \quad g_1(E) = 1, \quad g_1(F) = g_1(G) = 0;$$
$$g_2(C) = g_2(B) = u_2', \quad g_2(F) = 1, \quad g_2(E) = g_2(G) = 0;$$
$$g_3(D) = g_3(C) = u_3', \quad g_3(G) = 1, \quad g_3(E) = g_3(F) = 0. \tag{4.14}$$
For each $i$, we also assume $0 \leq u_i \leq u_i'$, i.e. a player would prefer to survive as a member of a group of two rather than of three. As the chance of the game terminating with one or none surviving after one round of shooting (starting in $A$) is
$$x_3 y_3 \gamma(1 - \bar{\alpha}\bar{\beta}) + z_1 y_1 \alpha(1 - \bar{\beta}\bar{\gamma}) + x_2 z_2 \beta(1 - \bar{\alpha}\bar{\gamma}) + x_3 z_2 y_1\, \alpha(\gamma\bar{\beta} + \beta\bar{\gamma}) + x_2 y_3 z_1\, \beta(\bar{\alpha}\gamma + \alpha\bar{\gamma}) + x_3 y_1 z_2\, \beta\gamma + x_2 y_3 z_1\, \alpha\gamma$$
(cf. Figure 2), this fact together with (4.14) implies by Theorem 3.1 that a truel has a Nash equilibrium point. We now use Theorem 3.2 to obtain all these Nash equilibria for a truel starting from state $A$.

[Figure 2: the eight possible combinations (i)–(viii) of the three players' target choices.]

From (2.4) and referring to Figure 2,
$$G_1^A(\mu_1^A, \mu_2^A, \mu_3^A) = u_1 + \bar{\alpha}\bar{\beta}\bar{\gamma}\, G_1^A(\mu_1^A, \mu_2^A, \mu_3^A)$$
$$+ \bigl(x_3 y_1 z_1\, \alpha\bar{\beta}\bar{\gamma} + x_3 y_1 z_2\, \alpha\bar{\beta}\bar{\gamma} + x_2 y_3 z_1\, \bar{\alpha}\beta\bar{\gamma} + x_2 y_3 z_2\, \bar{\alpha}\beta\bar{\gamma} + x_3 y_3 z_1\, \bar{\gamma}(1 - \bar{\alpha}\bar{\beta}) + x_3 y_3 z_2\, \bar{\gamma}(1 - \bar{\alpha}\bar{\beta})\bigr)\, G_1^B(\mu_1^B, \mu_2^B)$$
$$+ \bigl(x_2 y_1 z_1\, \alpha\bar{\beta}\bar{\gamma} + x_2 y_1 z_2\, \bar{\beta}(1 - \bar{\alpha}\bar{\gamma}) + x_3 y_1 z_2\, \bar{\alpha}\bar{\beta}\gamma + x_2 y_3 z_1\, \alpha\bar{\beta}\bar{\gamma} + x_2 y_3 z_2\, \bar{\beta}(1 - \bar{\alpha}\bar{\gamma}) + x_3 y_3 z_2\, \bar{\alpha}\bar{\beta}\gamma\bigr)\, G_1^D(\mu_1^D, \mu_3^D)$$
$$+ \bigl(x_3 y_3 z_2\, \gamma(1 - \bar{\alpha}\bar{\beta}) + x_3 y_1 z_2\, \alpha\bar{\beta}\gamma + x_2 y_3 z_1\, \alpha\beta\bar{\gamma} + x_2 y_3 z_2\, \beta(1 - \bar{\alpha}\bar{\gamma})\bigr)\, g_1(E),$$
hence
$$G_1^A(\mu_1^A, \mu_2^A, \mu_3^A) = (1 - \bar{\alpha}\bar{\beta}\bar{\gamma})^{-1} \Bigl[ u_1 + \bigl(x_3\, \alpha\bar{\beta}\bar{\gamma} + y_3\, \beta\bar{\gamma}(1 - \alpha x_2)\bigr)\, G_1^B(\mu_1^B, \mu_2^B) + \bigl(\alpha\bar{\beta}\bar{\gamma}\, x_2 + \bar{\beta}\gamma\, z_2(1 - \alpha x_3)\bigr)\, G_1^D(\mu_1^D, \mu_3^D) + \bigl(y_3 z_2\, \beta\gamma + x_3 z_2\, \alpha\bar{\beta}\gamma + x_2 y_3\, \alpha\beta\bar{\gamma}\bigr) \Bigr] \tag{4.15}$$
after some rearrangements and simplifications of terms. Similarly, the total expected payoffs to players 2 and 3 are

$$G_2^A(\mu_1^A, \mu_2^A, \mu_3^A) = (1 - \bar{\alpha}\bar{\beta}\bar{\gamma})^{-1} \Bigl[ u_2 + \bigl(x_3\, \alpha\bar{\beta}\bar{\gamma} + y_3\, \beta\bar{\gamma}(1 - \alpha x_2)\bigr)\, G_2^B(\mu_1^B, \mu_2^B) + \bigl(\bar{\alpha}\beta\bar{\gamma}\, y_1 + \bar{\alpha}\gamma\, z_1(1 - \beta y_3)\bigr)\, G_2^C(\mu_2^C, \mu_3^C) + \bigl(x_3 z_1\, \alpha\gamma + y_3 z_1\, \bar{\alpha}\beta\gamma + x_3 y_1\, \alpha\beta\bar{\gamma}\bigr) \Bigr] \tag{4.16}$$
and
$$G_3^A(\mu_1^A, \mu_2^A, \mu_3^A) = (1 - \bar{\alpha}\bar{\beta}\bar{\gamma})^{-1} \Bigl[ u_3 + \bigl(\bar{\alpha}\beta\bar{\gamma}\, y_1 + \bar{\alpha}\gamma\, z_1(1 - \beta y_3)\bigr)\, G_3^C(\mu_2^C, \mu_3^C) + \bigl(\alpha\bar{\beta}\bar{\gamma}\, x_2 + \bar{\beta}\gamma\, z_2(1 - \alpha x_3)\bigr)\, G_3^D(\mu_1^D, \mu_3^D) + \bigl(x_2 y_1\, \alpha\beta + y_1 z_2\, \bar{\alpha}\beta\gamma + x_2 z_1\, \alpha\bar{\beta}\gamma\bigr) \Bigr]. \tag{4.17}$$
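The collapse, used in reaching (4.15), of the six explicit transition terms into state $B = \{1,2\}$ into the compact coefficient $x_3\alpha\bar{\beta}\bar{\gamma} + y_3\beta\bar{\gamma}(1 - \alpha x_2)$ relies on $x_2 + x_3 = y_1 + y_3 = z_1 + z_2 = 1$; it can be verified numerically. A sketch with hypothetical marksmanship and target-probability values:

```python
# Hypothetical marksmanships alpha, beta, gamma and target probabilities.
a, b, g = 0.6, 0.5, 0.4
ab, bb, gb = 1 - a, 1 - b, 1 - g          # the "barred" complements
x2, y1, z1 = 0.3, 0.7, 0.2
x3, y3, z2 = 1 - x2, 1 - y1, 1 - z1

# The six explicit transition terms into state B = {1,2} from (2.4).
term_sum = (x3*y1*z1 * a*bb*gb + x3*y1*z2 * a*bb*gb
            + x2*y3*z1 * ab*b*gb + x2*y3*z2 * ab*b*gb
            + x3*y3*z1 * gb*(1 - ab*bb) + x3*y3*z2 * gb*(1 - ab*bb))

# The simplified coefficient of G_1^B appearing in (4.15).
closed_form = x3 * a*bb*gb + y3 * b*gb * (1 - a*x2)
print(abs(term_sum - closed_form) < 1e-12)  # True
```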

$$G_2^B(\mu_1^B, \mu_2^B) = \frac{u_2' + y_1\beta(1 - x_2\alpha)}{1 - (1 - x_2\alpha)(1 - y_1\beta)}, \tag{4.18}$$
$$G_2^C(\mu_2^C, \mu_3^C) = \frac{u_2' + y_3\beta(1 - z_2\gamma)}{1 - (1 - y_3\beta)(1 - z_2\gamma)}$$
and
$$G_3^C(\mu_2^C, \mu_3^C) = \frac{u_3' + z_2\gamma(1 - y_3\beta)}{1 - (1 - z_2\gamma)(1 - y_3\beta)}.$$

In the sequel we shall also make the following substitutions to simplify expressions:
$$X_2 = 1 - \alpha x_2, \quad X_3 = 1 - \alpha x_3, \quad Z_1 = 1 - \gamma z_1, \quad Z_2 = 1 - \gamma z_2, \quad Y_1 = 1 - \beta y_1, \quad Y_3 = 1 - \beta y_3. \tag{4.19}$$
We now obtain the set of inequalities (3.2) that result from letting each player adopt each of the two available pure strategies. From player 1's viewpoint, the inequalities are

$$G_1^A\bigl((1,0), \mu_2^A, \mu_3^A\bigr) \leq G_1^A(\mu_1^A, \mu_2^A, \mu_3^A) \tag{4.20}$$
and
$$G_1^A\bigl((0,1), \mu_2^A, \mu_3^A\bigr) \leq$$