An Introduction to Optimal Control

By Yury Shestopalov, Elena Pronina and Roman Dzerzhinsky

This book first published 2023

Cambridge Scholars Publishing
Lady Stephenson Library, Newcastle upon Tyne, NE6 2PA, UK

British Library Cataloguing in Publication Data
A catalogue record for this book is available from the British Library

Copyright © 2023 by Yury Shestopalov, Elena Pronina and Roman Dzerzhinsky

All rights for this book reserved. No part of this book may be reproduced, stored in a retrieval system, or transmitted, in any form or by any means, electronic, mechanical, photocopying, recording or otherwise, without the prior permission of the copyright owner.

ISBN (10): 1-5275-2880-4
ISBN (13): 978-1-5275-2880-2
Contents

Laplace's Demon ix
Preface xiii

1 Minimization Problems For Functions Of Finite Number Of Variables 1
1.1 General Statement of the Optimum Problem 1
1.2 Terminology, notation and definitions 3
1.3 Interrelation between minimization and root computation problems for systems of nonlinear algebraic equations 5
1.4 Necessary and sufficient conditions for an extremum 6
1.5 Conditional extremum. Method of indefinite Lagrange multipliers 8
1.6 Conditional extremum. Problem with inequality-type constraints 13
1.7 Reference material 18

2 Introduction to Calculus of Variations 27
2.1 Historical aspects of the calculus of variations 27
2.2 Classic variational problems 30
2.3 Functional. Variation and variation derivative 37
2.4 Functional extremum: absolute and relative 45
2.5 Euler–Lagrange differential equation 48
2.6 Cases of lowering the order of the Euler–Lagrange equation 53
2.7 Reference material 56

3 Theory Of Optimal Control. Statement Of The Main Problem 61
3.1 Managed object and its dynamics 62
3.2 Examples of applied problems of optimal control 63
3.3 Statement of the main problem of optimal control 65
3.4 Reference material 67

4 Formalism Of Sufficient Conditions And The Optimality Principle Of V. F. Krotov 71
4.1 Sufficient Krotov Optimality Conditions for Continuous Processes 71
4.2 Basic lemmas on minimizing sequences 73
4.3 Basic Minimal Theorems and minimizing sequences 75
4.4 Krotov's optimality principle 77
4.5 Control questions 78

5 Euler's Problem 79
5.1 Formulation of the problem. Indicatrix 79
5.2 Indicatrices of various types 80
5.3 Test questions and exercises 101

6 The Lagrange–Pontryagin Method For Controlled Processes 103
6.1 Formulation of the problem 103
6.2 Equations of the Lagrange–Pontryagin method 104
6.3 Transversality conditions 107
6.4 Negative example 109
6.5 Reference material 115
6.6 Exercises 120

7 Hamilton–Jacobi–Bellman Method 123
7.1 The main idea of the method 123
7.2 Hamilton–Jacobi–Bellman equation. Continuous version 124
7.3 Synthesis and control program 126
7.4 Algorithm of the Hamilton–Jacobi–Bellman method 127
7.5 The problem of analytical design of an optimal controller 129
7.6 Accounting for state constraints 138
7.7 Comparative analysis of the Lagrange–Pontryagin and Hamilton–Jacobi–Bellman methods 140
7.8 Test questions and exercises 141

8 The Theory Of Optimal Control And The Variational Calculus 143
8.1 Formulation of the problem 143
8.2 Euler–Lagrange equation 144
8.3 Weierstrass Necessary Condition 147
8.4 Legendre and Jacobi conditions 148
8.5 Summary of necessary extremum conditions for the simplest functional 151
8.6 Sufficient conditions for an extremum of the simplest functional 151
8.7 Hamilton–Jacobi equation. Jacobi's theorem 152

9 Multi-Step Controlled Processes. Theorem On Sufficient Optimality Conditions. Krotov's Optimality Principle 155
9.1 Statement of the optimal control problem for a multi-step process 155
9.2 The main theorem 156
9.3 Krotov's optimality principle 158

10 The Lagrange–Pontryagin Method For Multistep Controlled Processes 159
10.1 Lagrange–Pontryagin method 159
10.2 Negative example 161
10.3 Algorithm for solving the problem 163
10.4 Control restrictions 164
10.5 Exercises for independent work 166

11 Hamilton–Jacobi–Bellman Method. Multi-Step Version 169
11.1 The main functional relationship of the method 169
11.2 Solving the problem by the dynamic programming method 172
11.3 Optimal distribution of investments between projects 175
11.4 Exercises for independent work 179

12 Applied Optimal Control Problems 181
12.1 Optimal control theory and models of macroeconomic dynamics 181
12.2 The problem of the speed-optimal control 193
12.3 Maximizing the polarization of an atom in a resonant field 203
12.4 Optimal control theory and differential games 208

13 Solving Problems Of Applied Optimal Control Theory Using The Apparatus Of Near-Periodic Functions 221
13.1 Formation of optimal control based on a representative amount of information 221
13.2 Characteristics of measurement results and their main types 223
13.3 Classes of shift functions for determining near-periods and near-proportions 228
13.4 Models and algorithms for determining the parametric connection of the characteristics of processes at the macro and micro levels 238
13.5 Methods for identifying trends and their parameters based on anamorphosis 241

Literature 245
Laplace's Demon

The progress of natural science since the XVIII century was largely due to the success of mathematics. The fact that differential equations can describe nature and man-made objects inspired researchers. Philosophical generalizations appeared, for example, Laplace's determinism. Recall that Pierre-Simon Laplace believed that if some omniscient being (Laplace's demon) knew the precise location and momentum of every particle in the universe, then it would know everything, both the past and, most importantly, the future. The further development of science has shown that the Laplace demon would still be limited in its capabilities, because not all models boil down to systems of differential equations. But the ability to quickly solve differential equations could add some additional competencies to the demon.

This is what the subject is really about. Objects and phenomena of the real world are described by differential equations, more precisely, by differential operators; this is usually the left-hand side of the equation. Certain forces can act on these objects and phenomena; this is what usually stands on the right-hand side of the equation. And these forces may well be chosen by people: roughly speaking, we can interfere with the right-hand side of the equations. This means that inhomogeneous equations and systems are considered. Quoting the eminent Soviet and Russian mathematician A. A. Abramov, "an equation without the right-hand side, like a gentleman without money, is of only theoretical interest". It would be fascinating to choose the forces on the right-hand sides of the equations in such a way that it is clear in advance that the desired result is achievable. The choice of the impact on the right-hand side is the problem of controlling the system. Having mastered some additional
mathematical apparatus, the Laplace demon turns from a being who knows all and everything into an effective manager (to the extent that the apparatus of differential equations is applicable).

At the everyday, not mathematical, level, we encounter solutions to control problems daily. Some of them were successfully solved long ago. An example of a beautiful engineering solution to an optimal control problem in a system with negative feedback is the design of the flush tank in a toilet. Some problems (many times more complex) are solved poorly. Traffic jams are a good example of such problems, and a significant part of them could be eliminated through the competent organization of traffic and the adjustment of traffic lights. Anyway, this is not the topic of our book.

Thus, we have outlined a range of problems: the choice of right-hand sides in differential equations that lead to some "desired" results. But such a statement turns out to be mathematically more difficult than the task of "simply" solving a differential equation. First, we have to integrate the function on the right-hand side. The operation of integration is less demanding on the smoothness of functions than the operation of differentiation. Therefore, the solution may be piecewise continuous, or continuity may fail at some points, whereas in a course on differential equations it is more common to work with continuously differentiable functions. Of course, it is possible to construct weak solutions, where continuity is broken, but that is a different story...

Secondly, we can impose some restrictions on our task. For example, an automatic station needs to fly from Earth to Mars. There is a finite supply of fuel onboard. We need to plan the activation of the engines in such a way that the space probe reaches Mars and the fuel consumption does not exceed the stock onboard. In this case, we have a control problem with an inequality-type constraint. The control problem can also be formulated with equality-type constraints. For example, in proton therapy of tumors, a pulsed generator of charged particles with constant pulse energy is used. You can only control the focusing of the beam. It is required, for each patient, to indicate a focusing algorithm that destroys as
many cancer cells as possible while minimally damaging healthy tissues.

One of the principal ideas of this material is that the more restrictions are imposed on the system, the more difficult it is to choose the optimal control mode. In order to preserve the controllability of the system, some restrictions will have to be abandoned.

This book is still mathematical, but it is designed for engineers. From this point of view, of course, it is much more important to know what the maximum principle of L. S. Pontryagin is than the details of its proof. Moreover, such a proof, even in the shortest presentation, would at least double the volume of the book. Therefore, the decision to give a certain set of mathematical facts without proof seems quite justified.

Now about some features of the book that distinguish it from other works on a similar topic. The authors give a summary of results on control in difference problems (which they also call "multi-step"). Of course, to understand the main differences when reading the corresponding chapters, it would be good to get acquainted with the basics of the theory of difference equations (see, for example, V. K. Romanko, Difference Equations, Moscow: Binom, 2012). We only note that the topic of control in a difference system is usually ignored by authors. There are two reasons for this, and they are opposite. Some authors believe that difference equations are a simpler object than differential equations, so the control tasks for such systems are trivial. Other authors believe that difference systems are much more complex than differential systems, so the tasks of controlling them are extremely difficult. The second point of view is much closer to the truth (a difference equation is equivalent to a system of differential equations of infinite order). Nevertheless, this book gives an idea of the methods for synthesizing optimal control in discrete systems.

The authors did not reach in their book the most difficult and dynamically developing area: control problems in systems of partial differential equations. The reason for this is obvious: the increasing complexity and size of the material. The authors rightly
believe that this textbook should serve as an initial introduction to the subject, and that it can be brought to the level of practical use only by "passing theory through a sieve", that is, by independently solving a number of practical problems. After that, both reading the special literature and extending the approaches to partial differential equations will still remain rather difficult tasks, but they will not cause fear and psychological rejection. This is the main point of our textbook.

A. I. Lobanov, Doctor of Physics and Mathematics, Professor of the Moscow Institute of Physics and Technology
Preface

This book is based on courses of lectures on optimal control theory given by the authors at the Russian Technological University MIREA. Many textbooks and teaching aids are devoted to the theory of optimal control, for example [1, 4, 6, 9], but the presentation in them is quite difficult and assumes a high level of mathematical training. This book attempts to simplify the presentation of the material. Note that the list of references indicates the sources that the authors used in preparing the course in accordance with their preferences. For more detailed preparation, students can use other teaching aids, a list of which is given in the course program.

Chapter 1 presents methods for minimizing functions of a finite number of variables. Chapter 2 examines the classical results of the calculus of variations as applied to the simplest problem, whose generalization and essential development became the modern theory of optimal control. Chapters 3–8 are devoted to the theory of optimal control of processes described by systems of ordinary differential equations. The formalism of sufficient conditions for optimality of V. F. Krotov and its connection with the maximum principle of L. S. Pontryagin and the method of dynamic programming of R. Bellman are presented. Chapter 8 analyzes the relationship between optimal control theory and the classical calculus of variations. Chapters 9–11 deal with optimal control problems for multistep processes. Practical techniques for implementing dynamic programming, as one of the main methods for optimizing multistep processes, are shown using the example of the investment allocation problem. Chapter 12 is devoted to the study of applied problems of physical and technical content. The use of the mathematical apparatus of the theory of optimal control is demonstrated for the optimal control
problems of time-optimal control of the simplest mechanical motion; maximizing the polarization of a two-level atom in a resonant field; and a game model of the interaction of opposing groups. Section 12.3 is based mainly on previously unpublished original results. Chapter 13 contains many results and findings published for the first time.
1 Minimization Problems For Functions Of Finite Number Of Variables

1.1 General Statement of the Optimum Problem

In the course of his activities, every person faces the problem of choice. Consider several examples: when it is necessary to get from one place to another by car, the driver faces the choice of a route; the head of an enterprise chooses the strategy of economic development of his enterprise. At the household level, the problem of choice is solved on the basis of individual preferences. Sometimes this everyday level is not enough to make a choice, and individual preferences are not enough to formalize it.

Let us try to describe the problem of choice in a formal mathematical language. We have a number of options. Assume that they all belong to some (non-empty) set X. Since mathematics is an abstract science, we will not concretize this set. To assess the quality of the choice, we assign to each element of the set X some numerical value characterizing the quality of the choice. Thus, a correspondence is identified; in other words, a function defined on the elements of X, which can be written in the form f(x), x ∈ X. Since the function associates an element of the set with a certain number, f(x) is a scalar. The set X itself will henceforth be called an admissible set; all elements of an admissible set will be called admissible points or admissible elements.

Let us now define some properties of the function f(x), x ∈ X. Choose two elements of the admissible set, x₁, x₂ ∈ X. Suppose that for some reason the choice of the element x₁ is preferable, and this choice corresponds to smaller values of the function f(x). In other words, our preference is related to the inequality f(x₁) < f(x₂). If
the choice for both elements is equally preferable (no matter what the preference), then f(x₁) = f(x₂). The function introduced in this manner is called the objective function. For some problems the objective function is easy to construct, and it has a clear physical meaning. So, when choosing an automobile route, you can associate with each route option its length in kilometers; in this case, the shortest option is preferred. It is possible to define another objective function for the same task, for example, the time it takes to reach the end point; in this case, the most preferable option is the one that takes the least time. Naturally, when using different objective functions, the optimal choice may differ. Thus, the "fastest" route by car is not necessarily the shortest: the routes may have different speed limits, and somewhere there will be a traffic jam while somewhere there will not.

The problem of the best (optimal) choice in the language of admissible sets and objective functions can be formulated quite simply: find an element x* of the admissible set such that f(x*) ≤ f(x), ∀x ∈ X : x ≠ x*. Let us recall some notation: the sign ∀ is called the universal quantifier and is read "for all" or "for any", and the colon replaces the words "such that". The problem of finding the optimal element formulated above will be written by the formula

$$f(x) \to \min_{x \in X}.$$
In economics, the main goal of enterprise development is making a profit; therefore, the optimal strategy will be the one for which the profit is maximal. This maximization problem is equivalent to minimizing the profit of the enterprise taken with the opposite sign. Such an optimization criterion satisfies all the properties of an objective function. Sometimes in optimization problems it is quite easy to indicate the admissible set, but there are difficulties with the objective function. Let someone choose a life partner; then the admissible set is the set of all people of the opposite sex, but what will be the objective function?
1.2 Terminology, notation and definitions

An admissible set X. We will assume that an admissible set is a subset of the vector space Rⁿ. In this space the vector norm is introduced:

$$\|x\| = \left( \sum_{i=1}^{n} x_i^2 \right)^{1/2}.$$

A norm is a nonnegative number assigned to each vector from the space under consideration that satisfies the following three conditions (norm axioms).

1° The norm of a vector is zero if and only if this vector is zero: ‖x‖ = 0 ⇔ x = 0.

2° When a vector is multiplied by a number, the vector norm is multiplied by the modulus of the same number: ‖αx‖ = |α|·‖x‖.

3° The vector norm satisfies the triangle inequality: ‖x + y‖ ≤ ‖x‖ + ‖y‖.

Solution of the minimization problem f(x) → min over x ∈ X. The solution to the problem is the element x ∈ X delivering the minimum, which we will denote by x*. One also says that this point is the minimum point of the objective function.

Optimal value of the objective function. Let the value f(x*) be defined. We introduce the notation f(x*) = f* and call it the optimal value of the objective function.

Definition 1.2.1. A point x₀ is called a local minimum point if there exists a small number ε > 0 such that the inequality f(x₀) ≤ f(x) holds for all x such that x ∈ X ∩ {x ∈ Rⁿ : ‖x − x₀‖ ≤ ε}.

Definition 1.2.2. The objective function f(x) reaches the global (absolute) minimum on X at an element x* if the equality f(x*) = inf_{x∈X} f(x) holds.
The optimal value of the objective function is not always achieved, not in all problems. Therefore, it becomes necessary to generalize the concepts of maximum and minimum. Such generalizations are the concepts of the exact upper bound and the exact lower bound, respectively.

The exact upper bound is denoted by sup_X f(x) (from the Latin supremum, the highest; read "supremum of f"); by definition, it is the smallest of all the upper bounds of the set. The exact lower bound is denoted by inf_X f(x) (from the Latin infimum, the lowest; read "infimum of f"); it is the largest of all the lower bounds of the set.

If x* ∉ X, the exact lower or the exact upper bound of the function is not attained. The solution to the optimal problem in this case takes the form of a sequence of elements $\{x_s\}_{s=1}^{\infty}$. Each element of this sequence is admissible (xₛ ∈ X for all s), but its limit does not belong to the admissible set: lim_{s→∞} xₛ = x*, x* ∉ X. In such problems, the solution becomes a minimizing or maximizing sequence, for which the sequence of the corresponding values of the objective function converges to the optimal value:

$$\lim_{x_{1,s} \to x_1^*} f(x_{1,s}) = \inf_X f(x) = f(x_1^*), \qquad \lim_{x_{2,s} \to x_2^*} f(x_{2,s}) = \sup_X f(x) = f(x_2^*).$$
Remark. If X is the numerical axis, the problem of finding the minimum of a function of one variable is solved; if X is an n-dimensional vector space, there is the problem of finding the minimum of a function of n variables; if X is a functional space, then the problem of finding a function that minimizes a functional is solved (a problem of the calculus of variations or of optimal control). If to the objective function one adds conditions of the form

$$x_k^- \le x_k \le x_k^+, \quad k = 1, \dots, K, \qquad F_l^- \le f_l(x) \le F_l^+, \quad l = 1, \dots, \Lambda,$$
where $x_k^\pm$, $F_l^\pm$ are given numbers, then this is the problem of finding a conditional minimum; if such constraints are absent, then this is the problem of finding the unconditional minimum. Moreover, if the functions f(x) and fₗ(x) are linear in all their arguments, the problem of finding the conditional minimum is called a linear programming problem; if at least one of these functions is nonlinear, then it is a nonlinear programming problem. Both of these problems are called mathematical programming problems. Linear programming problems are conceptually simple; their complexity is determined by the number of independent variables and by the additional conditions, equalities and inequalities, that determine the geometry of the multidimensional convex polyhedron on which the minimum is sought. The minimum is reached either at a vertex, or on an edge, or on a face, etc., and the task is to reach the required point economically. Similar problems were studied in the courses "Operations Research" and "Optimization Methods". Without loss of generality, we will talk about finding the minimum of a function, since the maximum of a function f(x) is a minimum of the function with the opposite sign, −f(x).
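As a small aside, such problems are routinely solved numerically. The sketch below is a minimal illustration with invented data; it uses SciPy's linprog, which minimizes c·x subject to A_ub x ≤ b_ub, so a maximization is encoded by negating c. The optimum indeed lands at a vertex of the feasible polygon.

```python
# Maximize x1 + 2*x2 subject to x1 + x2 <= 4, x1 + 3*x2 <= 6, x1, x2 >= 0.
# The data are invented purely for illustration.
import numpy as np
from scipy.optimize import linprog

c = np.array([-1.0, -2.0])            # linprog minimizes, so negate to maximize
A_ub = np.array([[1.0, 1.0],
                 [1.0, 3.0]])
b_ub = np.array([4.0, 6.0])

res = linprog(c, A_ub=A_ub, b_ub=b_ub, bounds=[(0, None), (0, None)])
print(res.x, -res.fun)                # vertex (3, 1), maximum value 5
```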
1.3 Interrelation between minimization and root computation problems for systems of nonlinear algebraic equations

Let us note the connection between the problem of calculating the roots of a system of nonlinear algebraic equations (NLAE) and the problem of minimization. Suppose that on a set X ⊂ Rⁿ we solve a system of nonlinear equations

$$f_1(x_1, \dots, x_n) = 0, \quad \dots, \quad f_n(x_1, \dots, x_n) = 0.$$

We define the objective function as follows:

$$f(x_1, \dots, x_n) = \sum_{k=1}^{n} f_k^2(x_1, \dots, x_n).$$
The inequality f(x) ≥ 0 holds on the set X, and the objective function f(x) attains a minimum value of zero at x = x*, where x* is a solution of the considered system of equations. Therefore, solving the NLAE is equivalent to finding the minimum of the function f(x) on X. If the objective function f(x) is strictly greater than zero, then the system has no solutions.

Remark. Here and below, we will denote a set of variables, that is, a point in a multidimensional space (or, which is the same thing, the coordinates of a vector), in upright bold type. For matrices we will use upright bold capital letters.
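Returning to the equivalence just described, here is a minimal sketch for an invented system of two equations (the circle x² + y² = 1 intersected with the line x = y): the sum of squared residuals is minimized with SciPy, and a zero minimum value signals that a root has been found.

```python
# Solve f1 = x^2 + y^2 - 1 = 0, f2 = x - y = 0 by minimizing F = f1^2 + f2^2.
import numpy as np
from scipy.optimize import minimize

def F(v):
    x, y = v
    f1 = x**2 + y**2 - 1.0          # residual of the first equation
    f2 = x - y                      # residual of the second equation
    return f1**2 + f2**2

res = minimize(F, x0=np.array([0.5, 0.0]))
print(res.x, res.fun)               # ~ [0.707, 0.707], F ~ 0: a root of the system
```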
1.4 Unconditional extremum. Necessary and sufficient conditions for an extremum

Suppose that it is required to find the minimum of the objective function f(x), which (the function, not the minimum) has all first partial derivatives. In that case, the problem is reduced to solving a system of nonlinear algebraic equations:

$$\frac{\partial f(x_1, \dots, x_n)}{\partial x_1} = 0, \quad \dots, \quad \frac{\partial f(x_1, \dots, x_n)}{\partial x_n} = 0.$$

A point that is a solution of this NLAE is called stationary. The condition for the minimum of the objective function formulated above (the stationarity condition) is a necessary condition for the extremum of a function, which is well known from the course of mathematical analysis and expresses the content of Fermat's theorem¹. However, not every stationary point is a point of a local minimum of the objective function. We present the following theorem without proof.

¹ Pierre Fermat (1601-1665), French self-taught mathematician, one of the founders of analytic geometry, mathematical analysis, probability theory and number theory.
Theorem 1.4.1. Let the function f(x) be twice continuously differentiable. Then a sufficient condition for a local minimum of the function f(x) at the stationary point x* is the positive definiteness of the Hessian matrix²

$$G(x^*) = \begin{pmatrix} \frac{\partial^2 f}{\partial x_1^2} & \cdots & \frac{\partial^2 f}{\partial x_1 \partial x_n} \\ \cdots & \cdots & \cdots \\ \frac{\partial^2 f}{\partial x_n \partial x_1} & \cdots & \frac{\partial^2 f}{\partial x_n^2} \end{pmatrix}.$$
Note that the methods for finding the minimum of the function f(x) on X often turn out to be more effective than the methods for numerically solving the NLAE.

An example of unconstrained minimization. If the minimum of the objective function is sought without additional conditions imposed on it, then such a minimum is called unconditional. It is known from the course of mathematical analysis that to search for such a minimum it is necessary to find all stationary points, that is, those at which all partial derivatives are equal to zero: ∂f/∂xᵢ = 0, i = 1, 2, ..., n. We can assemble all the partial derivatives into a vector. Such a vector is called the gradient of the function and is denoted by ∇f or grad f. The meaning of the gradient vector is as follows: it indicates the direction in space in which the function grows fastest, and its absolute value (modulus) is the rate of increase of the function in this direction. Naturally, the function decreases most rapidly in the direction of the vector −∇f, the anti-gradient. Thus, at a stationary point, the gradient vector is equal to the zero vector.

Example. Find the minimum of the function f(x, y) = x² + 3y² − 2xy − 4y + 1.

Solution. Calculate the partial derivatives:

$$\frac{\partial f}{\partial x} = 2x - 2y, \qquad \frac{\partial f}{\partial y} = 6y - 2x - 4.$$

² Ludwig Otto Hesse (1811-1874), German mathematician; his main works are related to geometry, linear algebra, and the calculus of variations; he introduced the concept of the Hessian.
Solve the system of equations

$$2x - 2y = 0, \qquad -2x + 6y - 4 = 0.$$

From the first equation x* = y*; from the second equation y* = 1. The only stationary point is the point (1, 1). Calculate the Hessian matrix:

$$G = \begin{pmatrix} \frac{\partial^2 f}{\partial x^2} & \frac{\partial^2 f}{\partial x \partial y} \\ \frac{\partial^2 f}{\partial y \partial x} & \frac{\partial^2 f}{\partial y^2} \end{pmatrix} = \begin{pmatrix} 2 & -2 \\ -2 & 6 \end{pmatrix}.$$

Since the Hessian matrix is positive definite, the only stationary point of the function is a minimum. The minimum is global. Indeed, since

$$f(x, y) = x^2 + 3y^2 - 2xy - 4y + 1 = (x - y)^2 + 2(y - 1)^2 - 1,$$

we have f(x, y) → ∞ as ‖(x, y)‖ → ∞, and f(1, 1) = −1.
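The worked example is easy to verify numerically. The sketch below (assuming NumPy and SciPy are available) minimizes f from an arbitrary starting point and checks the leading determinants of the Hessian found above (cf. Sylvester's criterion in Section 1.7.1).

```python
# Check of the example f(x, y) = x^2 + 3y^2 - 2xy - 4y + 1 with minimum (1, 1).
import numpy as np
from scipy.optimize import minimize

f = lambda v: v[0]**2 + 3*v[1]**2 - 2*v[0]*v[1] - 4*v[1] + 1
res = minimize(f, x0=np.zeros(2))
print(res.x, res.fun)                      # ~ [1, 1] and -1.0

# Leading determinants of the Hessian G = [[2, -2], [-2, 6]]:
G = np.array([[2.0, -2.0], [-2.0, 6.0]])
print(np.linalg.det(G[:1, :1]) > 0, np.linalg.det(G) > 0)   # True, True
```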
Exercises

1. Find the minimum point of the function f(x₁, x₂) = |1 − exp(x₁² + x₂² − 1)| and its minimum value.

2. Show that the function f(x₁, x₂) = (1 + exp x₂) cos x₁ − x₂ exp x₂ has countless maxima and no minima. Plot the function f(x₁, x₂) in the MathCAD software package. Is there an analogue of such a function on the line?
1.5 Conditional extremum. Equality constraints. Method of indefinite Lagrange multipliers

Suppose now we need to find the minimum of a function under some restrictions. For example, it is necessary to deliver a spacecraft to
Mars as quickly as possible and find the optimal trajectory, but at the same time it is necessary to spend no more than a certain amount of fuel (as much as is placed on board the spacecraft). Thus, we have an objective function f(x₁, ..., xₙ). In addition, additional conditions are set: g₁(x₁, ..., xₙ) = 0, g₂(x₁, ..., xₙ) = 0, ..., g_m(x₁, ..., xₙ) = 0. The number of these additional conditions (constraints) is less than the number of problem variables (m < n). J. Lagrange³ proposed the following method for solving such a problem. Let us compose a new function (the Lagrange function)

$$L(x, \lambda) = f(x_1, \dots, x_n) + \sum_{k=1}^{m} \lambda_k g_k(x_1, \dots, x_n).$$

The meaning of this construction is that if the constraints are satisfied, the additional sum is always zero and has no effect on the value of the function. The coefficients λᵢ are called indefinite Lagrange multipliers; they can be arbitrary. One can also look at this problem from the other side: we define another space (of dimension m), the space of the indefinite coefficients, consider in it the linear hull of the constraint functions, and let the coefficients in this linear hull depend on the main variables of the problem. We will speak about the meaningful interpretation of such a representation below.

Now we need to find the variables x₁, ..., xₙ, λ₁, ..., λ_m for which the Lagrange function reaches its minimum. This point lies among the stationary points; therefore

$$\frac{\partial L}{\partial x_i} = \left. \left( \frac{\partial f(x_1, \dots, x_n)}{\partial x_i} + \sum_{k=1}^{m} \lambda_k \frac{\partial g_k(x_1, \dots, x_n)}{\partial x_i} \right) \right|_{x^*, \lambda^*} = 0, \quad i = 1, 2, \dots, n,$$

$$\frac{\partial L}{\partial \lambda_j} = g_j(x_1, \dots, x_n) \Big|_{x^*, \lambda^*} = 0, \quad j = 1, 2, \dots, m.$$
3 Joseph Louis Lagrange (1736-1813) - French mathematician, astronomer and mechanic, author of the classic treatise “Analytical Mechanics". He made a huge contribution to mathematical analysis, probability theory, number theory, created the calculus of variations.
For a stationary point of the Lagrange function all constraints are satisfied; as a result, the problem of the conditional minimum of the original objective function is reduced to the problem of the unconditional minimum of the Lagrange function.

For convenience, we introduce the vector notation

$$\lambda = \begin{pmatrix} \lambda_1 \\ \lambda_2 \\ \vdots \\ \lambda_m \end{pmatrix}, \qquad \lambda^T = (\lambda_1, \lambda_2, \dots, \lambda_m).$$

In this notation, the stationarity condition is written as

$$\left. \left( \nabla f_x + \lambda^T \nabla g_x \right) \right|_{x^*, \lambda^*} = 0.$$

Thus, a linear combination of the rows of the gradient matrix ∇g_x must balance the gradient vector of the objective function ∇f. Suppose that the matrix ∇g_x is nondegenerate; then the last relation connecting the gradients ∇f and ∇g_x can be solved with respect to the vector λ:

$$\lambda^T = -\nabla f_x (\nabla g_x)^{-1}.$$

Therefore, the components of the vector λ are linear combinations of the components of the gradient vector ∇f with coefficients equal to the elements of the inverse matrix (∇g_x)⁻¹.
Interpretation of Lagrange multipliers. We restrict ourselves to the case of a single constraint (m = 1); then the value of the Lagrange multiplier is

$$-\lambda = \left. \frac{\partial f / \partial x_i}{\partial g / \partial x_i} \right|_{x^*} = \left. \frac{\partial f}{\partial g} \right|_{x^*}.$$

As a result, the value of λ coincides (up to sign) with the partial derivative of f with respect to g and thus determines the sensitivity coefficient of the optimal value of the objective function to changes in the constraint. In other words, the value λ shows how much the optimal value of the objective function will change if the right-hand side (free term) of the constraint changes by one.
In problems of mechanics, the conditions gᵢ(x₁, ..., xₙ) = 0 are called constraints; the Lagrange multipliers in such problems determine the constraint reactions.

Let us recall the formulation of mutually dual linear programming problems. The direct problem is to maximize the objective function f(x) = (c, x) subject to Ax ≤ b, x ≥ 0; in other words, profit is maximized with limited resources. The dual problem is to minimize g(y) = (y, b) subject to yA ≥ c, y ≥ 0. Here x is a column vector and y is a row vector; resource costs are minimized in value terms for a given production efficiency. According to the linear programming theorem on objectively determined estimates, the optimal solution of one of the dual problems gives the influence of the free terms in the system of constraints of the paired problem on the value of its objective function:

$$y^* = \frac{\partial f(x^*)}{\partial b}, \qquad x^* = \frac{\partial g(y^*)}{\partial c}.$$

As a result, the dual estimates y are interpreted as the prices of the optimal plan. The economic interpretation of the Lagrange multipliers corresponding to the optimal solution is analogous to the interpretation of the dual estimates of the constraints of a linear programming problem. They express the change in the optimal value of the objective function per unit change in the free term of a constraint in a small neighborhood of the optimum:

$$-\lambda = \left. \frac{\partial f}{\partial g} \right|_{x^*},$$

i.e., the Lagrange multiplier at the optimum point is equal to the optimal price. In contrast to linear programming problems, Lagrange multipliers in the general case are not stable: they change their values even with an arbitrarily small change in the free terms of the constraints.

An example of conditional minimization. Find the values (x, y) for which the function
$$f(x, y) = \frac{1}{2}\left( \frac{x^2}{a^2} + \frac{y^2}{b^2} \right)$$

takes a stationary value under the constraint x + my = c, where a, b, m, c are constants.

The level lines of the function f are ellipses, and f increases with the size of the ellipse; the equation x + my = c defines a straight line on the (x, y) plane (Fig. 1.1).

Figure 1.1: An example of conditional minimization of a function in the presence of a linear equality constraint

Obviously, the minimum value of f is reached on the ellipse that touches the specified straight line. Let us now obtain an analytical solution. The Lagrange function has the form

$$L(x, \lambda) = \frac{1}{2}\left( \frac{x^2}{a^2} + \frac{y^2}{b^2} \right) + \lambda (x + my - c).$$

Let us write down the necessary stationarity conditions:

$$x + my - c = 0, \qquad \frac{x}{a^2} + \lambda = 0, \qquad \frac{y}{b^2} + m\lambda = 0.$$

The resulting system of three equations with three unknowns has a simple and unique solution

$$\lambda^* = -\frac{c}{a^2 + m^2 b^2}, \qquad x^* = \frac{a^2 c}{a^2 + m^2 b^2}, \qquad y^* = \frac{m b^2 c}{a^2 + m^2 b^2},$$

which corresponds to the minimum value of the objective function f* = c²/(2(a² + m²b²)). Note that here −λ* = ∂f*/∂c = ∂f/∂g|_{x*}.
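The closed-form answer can be checked numerically for concrete parameters. In the sketch below the values a = 2, b = 1, m = 3, c = 5 are chosen arbitrarily; SciPy's equality-constrained minimizer is compared against the analytical solution.

```python
# Numerical check of the conditional-minimization example for arbitrary
# parameter values a = 2, b = 1, m = 3, c = 5.
import numpy as np
from scipy.optimize import minimize

a, b, m, c = 2.0, 1.0, 3.0, 5.0
f = lambda v: 0.5 * (v[0]**2 / a**2 + v[1]**2 / b**2)
cons = {"type": "eq", "fun": lambda v: v[0] + m * v[1] - c}

res = minimize(f, x0=np.zeros(2), constraints=[cons])
d = a**2 + m**2 * b**2
print(res.x, [a**2 * c / d, m * b**2 * c / d])   # numeric vs analytic point
print(res.fun, c**2 / (2 * d))                   # optimal values agree
```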
Exercise. Use the method of indefinite Lagrange multipliers to solve the consumer choice problem under a total income constraint. The objective function of preferences (the utility function)

$$u(x_1, x_2, \dots, x_n) = \prod_{i=1}^{n} (x_i - b_i)^{a_i} \to \max_x$$

is maximized, where aᵢ and bᵢ are given parameters, bᵢ ≥ 0, and $\sum_{i=1}^{n} a_i = 1$. It is assumed that the total income D is completely spent on the acquisition of the consumer basket items x₁, x₂, ..., xₙ at fixed prices p₁, p₂, ..., pₙ respectively, that is, the constraint

$$\sum_i p_i x_i = D, \qquad x_i \ge 0, \quad i = 1, 2, \dots, n,$$

is met, where pᵢ and D are given parameters. Obtain typical demand functions for the various products.
1.6 Conditional extremum. Problem with inequality-type constraints. The Karush–Kuhn–Tucker theorem

Consider a more complicated case, when the constraints have the form of inequalities. As before, there is the objective function f(x₁, ..., xₙ); in addition, additional conditions are set: g₁(x₁, ..., xₙ) ≤ 0, g₂(x₁, ..., xₙ) ≤ 0, ..., g_m(x₁, ..., xₙ) ≤ 0. The number of these additional conditions (constraints), as before, is less than the number of problem variables (m < n). Let us compose the generalized Lagrange function (Lagrangian):

$$L(x, \lambda_0, \lambda) = \lambda_0 f(x_1, \dots, x_n) + \sum_{k=1}^{m} \lambda_k g_k(x_1, \dots, x_n).$$

A generalization of the method of indefinite Lagrange multipliers to the case of optimization problems with constraints in the form of inequalities is the following theorem.
Theorem 1.6.1 (direct, without proof). Let a nonlinear programming problem have a local minimum at a point x*. Then there exist Lagrange multipliers λ* = (λ₀*, λ₁*, ..., λ_m*), not all equal to zero, such that the following relations are satisfied (the Karush–Kuhn–Tucker conditions, KKT):

1) the stationarity condition for the Lagrange function

$$\left. \frac{\partial L}{\partial x_i} \right|_{x^*, \lambda^*} = \left. \left( \lambda_0 \frac{\partial f}{\partial x_i} + \sum_{k=1}^{m} \lambda_k \frac{\partial g_k}{\partial x_i} \right) \right|_{x^*, \lambda^*} = 0, \quad i = 1, 2, \dots, n;$$

2) the complementary slackness condition

$$\lambda_k^* g_k(x^*) = 0, \quad k = 1, 2, \dots, m;$$

3) the non-negativity condition λ* ≥ 0.

It follows from the complementary slackness condition that if a Lagrange multiplier is positive (λₖ > 0), then the corresponding constraint is active (gₖ(x*) = 0). On the other hand, if a constraint is satisfied as a strict inequality (gₖ(x*) < 0), then the corresponding Lagrange multiplier is zero. The theorem does not guarantee that the Lagrange multiplier λ₀ of the objective function is nonzero. However, if λ₀ = 0, then the resulting conditions of the theorem have no direct connection with the minimization problem for the function f(x) of interest to us, since the gradient of the function f(x) itself "disappears" from the Karush–Kuhn–Tucker conditions. Therefore, it is important to have regularity conditions that guarantee that λ₀ > 0. One such regularity condition is formulated as follows: the gradients of the active constraints at the optimum point x* must be linearly independent. Let A denote the set of indices of the constraints that are active at the optimum point x*; in other words,

$$\left. \frac{\partial L}{\partial \lambda_k} \right|_{x^*, \lambda^*} = g_k(x^*) = 0 \iff k \in A.$$
The regularity condition is that the system of vectors {∇gₖ(x*)}, k ∈ A, is linearly independent. Note that if λ₀ > 0, then without loss of generality we can take λ₀ = 1, which is usually done. The corresponding theorem is called the proper (direct) Karush–Kuhn–Tucker theorem⁴. For "regular" problems (λ₀ = 1), the theorem asserts that if the point x* is a local minimum, then the Karush–Kuhn–Tucker conditions can be written down and are necessarily solvable with some unknown multipliers λ, not all of which are equal to zero.

The stationarity condition

$$\frac{\partial L(x^*, \lambda^*)}{\partial x_i} = 0, \quad i = 1, 2, \dots, n,$$

can be rewritten as

$$\nabla f(x^*) = -\sum_{k=1}^{m} \lambda_k^* \nabla g_k(x^*).$$
This condition represents the gradient of the objective function at the optimum point as a linear combination of constraint gradients, and all the coefficients of this linear combination are nonpositive. In other words, the gradient of the objective function at the minimum point, ∇f(x*), must be directed in such a way that a decrease of the objective function f can be achieved only by violating the constraints.

Theorem 1.6.2 (converse). Let the original problem be convex (the functions f, g are convex). If for the collection of variables (x*, λ*) all the requirements of the direct theorem are satisfied together with the regularity conditions, then x* is a global minimum point.

⁴ William Karush (1917–1997), American mathematician, known for proving the necessary conditions for optimality in a nonlinear programming problem. The conditions were formulated in William Karush's thesis (1939), proved in a dissertation (1942), and then rediscovered by Kuhn and Tucker (1951). Albert Tucker (1905–1995), Canadian-born mathematician who made significant contributions to topology, game theory, and nonlinear programming. Harold Kuhn (1925–2014), American mathematician, student of Tucker, professor emeritus at Princeton University.
In other words, for convex programming problems the necessary optimality conditions are also sufficient.

Another form of optimality conditions in nonlinear programming problems is given by the saddle point theorem.

Theorem 1.6.3. If one can find a pair (x*, λ*) that forms a saddle point of the Lagrange function, that is,

$$L(x^*, \lambda) \le L(x^*, \lambda^*) \le L(x, \lambda^*)$$

for all λ ≥ 0 and all x with g(x) ≤ 0, then the point x* is a minimum point of the function f under the constraints g.

This theorem is more elegant, although its results are more difficult to apply. In the situation of maximizing the objective function, the signs of the Lagrange multipliers λ should be reversed.

An example of solving a nonlinear programming problem. Minimize the function f(x₁, x₂) = x₁ + x₂ under the constraints

$$g_1(x_1, x_2) = -x_1^2 + x_2 + 2 \ge 0, \quad g_2(x_1, x_2) = -x_1^2 - x_2 + 2 \ge 0, \quad g_3(x_1, x_2) = -x_1 - x_2 + 3 \ge 0.$$

The problem is presented graphically in Fig. 1.2. Let us construct the Lagrange function:

$$L(x_1, x_2, \lambda) = x_1 + x_2 + \lambda_1(-x_1^2 + x_2 + 2) + \lambda_2(-x_1^2 - x_2 + 2) + \lambda_3(-x_1 - x_2 + 3).$$

The stationarity conditions:

$$\frac{\partial L}{\partial x_1} = 1 - 2\lambda_1 x_1 - 2\lambda_2 x_1 - \lambda_3 = 0; \qquad \frac{\partial L}{\partial x_2} = 1 + \lambda_1 - \lambda_2 - \lambda_3 = 0.$$

The complementary slackness conditions: λᵢgᵢ = 0, i = 1, 2, 3.
Figure 1.2: Solution of a nonlinear programming problem

From Fig. 1.2 it follows that only the first constraint is active; therefore λ₂,₃* = 0. Then the stationarity conditions give λ₁* = −1, x₁* = −1/2, and from the activity of the first constraint it follows that x₂* = −7/4. The resulting point x* satisfies the Karush–Kuhn–Tucker conditions and, due to the convexity of the functions f(x) and g(x), is a global minimum point of the nonlinear programming problem. Since we have written the constraints in the form g(x) ≥ 0, all λ ≤ 0. The gradient of the objective function, ∇f(x*), and the gradient of the function that determines the active constraint taken with its Lagrange multiplier, λ₁∇g₁(x*), are equal in magnitude and opposite in direction (Fig. 1.2).

An exercise. Minimize the objective function f(x₁, x₂) = x₂ under the constraints

$$g_1(x_1, x_2) = -(x_1 + 1)^2 - x_2^2 + 4 \ge 0, \quad g_2(x_1, x_2) = -(x_1 - 1)^2 - x_2^2 + 4 \ge 0, \quad g_3(x_1, x_2) = -x_1 - x_2 + 2 \ge 0.$$

Give a geometric interpretation of the Karush–Kuhn–Tucker theorem for this problem.
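The worked example above can also be recovered numerically. The sketch below relies on SciPy's convention for 'ineq' constraints, fun(x) ≥ 0, which matches the form g(x) ≥ 0 used in the example; the starting point is arbitrary.

```python
# Minimize x1 + x2 subject to g1, g2, g3 >= 0 from the example above.
import numpy as np
from scipy.optimize import minimize

f = lambda v: v[0] + v[1]
cons = [
    {"type": "ineq", "fun": lambda v: -v[0]**2 + v[1] + 2},   # g1 >= 0
    {"type": "ineq", "fun": lambda v: -v[0]**2 - v[1] + 2},   # g2 >= 0
    {"type": "ineq", "fun": lambda v: -v[0] - v[1] + 3},      # g3 >= 0
]
res = minimize(f, x0=np.zeros(2), constraints=cons)
print(res.x)        # ~ [-0.5, -1.75], the Karush-Kuhn-Tucker point found above
```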
1.7 Reference material

1.7.1 Positive definiteness of a matrix

A matrix A is said to be positive definite if there exists a number α > 0 such that the scalar product (Au, u) ≥ α(u, u) for all vectors u, with equality attained only for u = 0.

Let all vectors in the space be multiplied by the same matrix. Multiplication by a matrix is equivalent to two transformations: rotation of the vector by some angle (possibly its own for each vector) and stretching or compression (a change of the length of the vector). Let us consider the question of whether under this transformation there are vectors that do not rotate but only change their length, stretching or shrinking. Such vectors must satisfy the equality Au = λu, where λ is some number. This equality can be represented in the following equivalent form: (A − λE)u = 0. It is known from the course of linear algebra that a homogeneous system of linear algebraic equations has a nonzero solution if and only if its matrix is degenerate, that is, det(A − λE) = 0. It follows that vectors preserving their direction can be stretched (shrunk) not by an arbitrary factor, but only by a factor satisfying the above algebraic equation. By the fundamental theorem of algebra, this equation has as many roots as the order of the matrix (if we count multiple roots and pairs of complex conjugate roots). In what follows, we will only be interested in real eigenvalues.
An exercise. Find the eigenvalues and eigenvectors of the following matrices:

$$\begin{pmatrix} 1 & 2 \\ 2 & 1 \end{pmatrix}; \quad \begin{pmatrix} -2 & 1 & 0 \\ 1 & -2 & 1 \\ 0 & 1 & -2 \end{pmatrix}; \quad \begin{pmatrix} -2 & 1 & 0 \\ 1 & -2 & 1 \\ 1 & 1 & -1 \end{pmatrix}; \quad \begin{pmatrix} 2 & -1 & 0 \\ -1 & 2 & -1 \\ 0 & -1 & 2 \end{pmatrix}.$$

If all eigenvalues of a matrix are real and positive, then the matrix is said to be positive definite.
In the case of a symmetric matrix (one equal to its transpose), Sylvester's criterion is valid: the symmetric square matrix A and the corresponding quadratic form are positive definite if and only if all corner minors Δᵢ of the matrix are positive. The matrix and the corresponding quadratic form are negative definite if and only if the signs of Δᵢ alternate, with Δ₁ < 0. The corner (leading principal) minors of a matrix are the determinants

$$\Delta_1 = a_{11}, \quad \Delta_2 = \begin{vmatrix} a_{11} & a_{12} \\ a_{21} & a_{22} \end{vmatrix}, \quad \dots, \quad \Delta_n = \det A.$$
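Sylvester's criterion is easy to apply numerically. Below is a small helper, a sketch only, that computes the corner minors Δ₁, ..., Δₙ with NumPy; the test matrix is the last one from the eigenvalue exercise above.

```python
# Corner (leading principal) minors of a symmetric matrix.
import numpy as np

def corner_minors(A):
    """Return the determinants Delta_1, ..., Delta_n of the corner minors."""
    return [np.linalg.det(A[:k, :k]) for k in range(1, A.shape[0] + 1)]

A = np.array([[ 2.0, -1.0,  0.0],
              [-1.0,  2.0, -1.0],
              [ 0.0, -1.0,  2.0]])
print(corner_minors(A))                          # [2.0, 3.0, 4.0]
print(all(d > 0 for d in corner_minors(A)))      # True: positive definite
```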
1.7.2 Existence of extrema of a real function

Theorem 1.7.1 (Weierstrass theorem). A function f(x) continuous on a closed bounded set is bounded on this set and attains its maximum and minimum values on it.

If, for example, the set X = [a, b], then inf_X f(x) = min_X f(x) = f(x₁*) and sup_X f(x) = max_X f(x) = f(x₂*). When X = (a, b), the exact lower and exact upper bounds of the function may not be attained, i.e. x* ∉ X. In this case, the solution of the optimal problem is sought in the form of a sequence (minimizing or maximizing).
1.7.3 The set of global minima of a function

If the set of solutions of the minimization problem f(x) → min over x ∈ X is not empty, it is called the set of global minima of the function and is denoted in one of the following ways:

$$X^* = \arg\min_{x \in X} f(x); \qquad X^* = \{\arg\min_{x \in X} f(x)\}; \qquad X^* = \{x^* : x^* \in X,\ f(x^*) = f^*\}.$$

The set of global minima of a function can, of course, include more than one element.
1.7.4 Limit

Here and below, the limit is understood as the limit of a function of several variables at a point x₀. Recall that a number a is called the limit of the function if ∀ε > 0 ∃δ(ε) > 0 : ∀x : ‖x − x₀‖ < δ ⇒ |f(x) − a| < ε. A point in a multidimensional space is understood as the vector of coordinates of the point in some basis of this space.
1.7.5 Sets, hulls, cones. Convex sets, convex functions and their extremal properties

The set is one of the basic concepts of mathematics. The sets considered here and below are sets of points in spaces of arbitrary dimension.

Closed and open sets. A set is called closed if all of its boundary points belong to the set. An example of a closed set is the set of all points in the plane (x, y) such that |x + y| ≤ 1. If the points of the boundary do not belong to the set itself, then the set is called open. For example, the set of points such that x² + y² + z² < 1 forms an open ball in the three-dimensional real space R³, and its boundary x² + y² + z² = 1 (a sphere in the same space) does not belong to the set. Note that, by definition, the space R³ itself is both closed and open at the same time. The same applies to the empty set ∅.

Convex sets.

Definition 1.7.1. A set X is called convex if for any two points x, y ∈ X belonging to this set and for any number 0 ≤ α ≤ 1 we have αx + (1 − α)y ∈ X.

Geometrically, this means that if two points belong to the set, then all points of the segment connecting them also belong to this set.

Definition 1.7.2. A set X is called linear if for any two points x, y ∈ X belonging to this set and for any numbers α, β ∈ R¹ we have αx + βy ∈ X.
Definition 1.7.3. A set X is called affine if for any two points x, y ∈ X belonging to this set and for any numbers α, β ∈ R¹ with α + β = 1 we have αx + βy ∈ X.

Definition 1.7.4. A set K is called a cone if for any point x ∈ K belonging to this set and for any nonnegative number α ∈ R¹ (α ≥ 0) we have αx ∈ K.

Examples of convex sets. All points in the plane (x, y) such that |x + y| ≤ 1. The set of points x² + y² + z² < 1 forming an open ball in the three-dimensional real space R³ is also a convex set. Note that, by definition, the space R³ itself is a convex set; the same applies to a space of any dimension. A half-space is also a convex set. The empty set ∅ is convex. A set consisting of a single point is always convex. A hyperplane in a multidimensional space is always a convex set.

Let elements x₁, x₂, ..., xₙ be given (the number of these points is at most the dimension of the space).

Definition 1.7.5. The set of elements

$$X = \left\{ x = \sum_{i=1}^{n} \alpha_i x_i,\ \alpha_i \in R^1 \right\}$$

is called the linear hull of the elements x₁, x₂, ..., xₙ.

Definition 1.7.6. The set of elements

$$X = \left\{ x = \sum_{i=1}^{n} \alpha_i x_i,\ \alpha_i \in R^1,\ \sum_{i=1}^{n} \alpha_i = 1 \right\}$$

is called the affine hull of the elements x₁, x₂, ..., xₙ.

Definition 1.7.7. The set of elements

$$X = \left\{ x = \sum_{i=1}^{n} \alpha_i x_i,\ \alpha_i \in R^1,\ \alpha_i \ge 0 \right\}$$

is called the conical hull of the elements x₁, x₂, ..., xₙ.
Definition 1.7.8. The set of elements

$$X = \left\{ x = \sum_{i=1}^{n} \alpha_i x_i,\ \alpha_i \in R^1,\ \sum_{i=1}^{n} \alpha_i = 1,\ \alpha_i \ge 0 \right\}$$

is called the convex hull of the elements x₁, x₂, ..., xₙ.

Exercises

1. Two points with coordinates (x₁, y₁) and (x₂, y₂) are given in the plane. Construct their linear, affine, convex and conical hulls.

2. Two points with coordinates (x₁, y₁, z₁) and (x₂, y₂, z₂) are given in three-dimensional space. Construct their linear, affine, convex and conical hulls.

Properties of convex sets.

Statement 1. The intersection of two convex sets is a convex set.

Statement 2. The intersection of any number of convex sets is a convex set.

Statement 3. The closure of a convex set is convex.

Exercises. Prove Statement 1. Prove Statement 2.
1.7.6 Convex functions

Let us start with a simple question: what is a "graph"? It is intuitively clear; graphs of functions were drawn even at school. But is there a concept of the graph of a function of, say, 100 variables? It is clearly impossible to draw it...

Definition 1.7.9. By the graph of a function f(x) (generally speaking, of any number of variables) we mean the set of points (x, p) in an (n + 1)-dimensional space such that x ∈ Rⁿ, p = f(x).
Definition 1.7.10. By the epigraph of a function f (x) we mean the set of points (x, p) in an (n + 1)-dimensional space such that x ∈ Rn , p > f (x). Thus, an epigraph is a set of points lying above the graph of a function. Let us remember that in literature the text of the epigraph is also located above the text of a novel or a poem. Definition 1.7.11. A function is called convex if its epigraph is a convex set.
An exercise. Construct the graph and the epigraph of the following functions of one variable: f(x) = x², f(x) = |x|, f(x) = eˣ, f(x) = e^(−x²), f(x) = 1 − e^(−x²), f(x) = sin x, f(x) = ln|x|, f(x) = x + sin²x.

Let us note the following important circumstance. Solving extremal problems is often fraught with significant difficulties, especially for multi-extremal problems. Some of these difficulties disappear if we restrict ourselves to considering only convex functions on convex sets.

Definition 1.7.12. A function f(x) defined on a convex set X ⊂ Rⁿ is called convex if, for any points y, z ∈ X and any α ∈ [0, 1],

$$f(\alpha y + (1 - \alpha) z) \le \alpha f(y) + (1 - \alpha) f(z).$$

Definition 1.7.13. A function f(x) defined on a convex set X ⊂ Rⁿ is called strictly convex if, for any distinct points y, z ∈ X and all α ∈ (0, 1), the strict inequality holds:

$$f(\alpha y + (1 - \alpha) z) < \alpha f(y) + (1 - \alpha) f(z).$$

This definition has a clear geometric meaning (Fig. 1.3): the graph of the function ϕ(u) on the interval connecting the points u, v lies below the chord passing through the points {u, ϕ(u)} and {v, ϕ(v)}.
Figure 1.3: Geometric interpretation of the concept of a strictly convex function
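The defining inequality of convexity lends itself to a numerical spot-check. The sketch below samples random points y, z and weights α and tests f(αy + (1 − α)z) ≤ αf(y) + (1 − α)f(z) for the convex function x² and for sin x, which is not convex; such a sampling test can refute convexity but, of course, cannot prove it.

```python
# Random spot-check of the convexity inequality on the interval [-5, 5].
import numpy as np

rng = np.random.default_rng(0)

def looks_convex(f, n=10000, lo=-5.0, hi=5.0):
    y = rng.uniform(lo, hi, n)
    z = rng.uniform(lo, hi, n)
    a = rng.uniform(0.0, 1.0, n)
    # Allow a tiny tolerance for floating-point round-off.
    return bool(np.all(f(a*y + (1 - a)*z) <= a*f(y) + (1 - a)*f(z) + 1e-12))

print(looks_convex(np.square))   # True: no violation found for x^2
print(looks_convex(np.sin))      # False: sin x violates the inequality
```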
1.7.7 A sufficient condition for strict convexity

A sufficient condition for the strict convexity of a twice continuously differentiable function f(x) is the positive definiteness of its Hessian matrix

$$G(x) = \begin{pmatrix} \frac{\partial^2 f}{\partial x_1^2} & \cdots & \frac{\partial^2 f}{\partial x_1 \partial x_n} \\ \cdots & \cdots & \cdots \\ \frac{\partial^2 f}{\partial x_n \partial x_1} & \cdots & \frac{\partial^2 f}{\partial x_n^2} \end{pmatrix}.$$
Theorem 1.7.2. Let f(x) be a convex function defined on a convex set X. Then any of its local minima on this set is simultaneously global. The global minimum of a strictly convex function f(x) on a convex set is attained at a single point.

Proof. Assume the contrary: a point x₀ is a point of local minimum, x* is a point of global minimum of f(x) on X, x* ≠ x₀ and f(x₀) > f(x*). Then, taking into account the convexity of the function f(x), we have

$$f(\alpha x^* + (1 - \alpha) x_0) \le \alpha f(x^*) + (1 - \alpha) f(x_0) < f(x_0).$$
As α → +0, the point x = αx* + (1 − α)x₀ falls into an arbitrarily small neighborhood of x₀. Therefore, the resulting inequality f(x) < f(x₀) contradicts the assumption that x₀ is a local minimum point (the first part of the theorem is proved).

Let x⁽¹⁾, x⁽²⁾ be two different points of global minimum. The strict convexity of the function implies that for all α ∈ (0, 1) the strict inequality

$$f(\alpha x^{(1)} + (1 - \alpha) x^{(2)}) < \alpha f(x^{(1)}) + (1 - \alpha) f(x^{(2)}) = f^* = \min_X f(x)$$

holds, which contradicts the assumption that x⁽¹⁾, x⁽²⁾ are global minimum points.
1.7.8 Linear independence of functions

Let a system of functions g₁(x₁, ..., xₙ), g₂(x₁, ..., xₙ), ..., g_m(x₁, ..., xₙ) be given on some segment (or on some set in a multidimensional space). Construct its linear hull (linear combination) $\sum_{k=1}^{m} \lambda_k g_k(x_1, \dots, x_n)$. We say that the system of functions is linearly dependent if at all points of the domain

$$\sum_{k=1}^{m} \lambda_k g_k(x_1, \dots, x_n) = 0 \quad \text{with} \quad \sum_{k=1}^{m} |\lambda_k| > 0.$$

In other words, if the identical zero belongs to the linear hull with nonzero coefficients, the system is linearly dependent. If

$$\sum_{k=1}^{m} \lambda_k g_k(x_1, \dots, x_n) = 0 \quad \text{only if} \quad \sum_{k=1}^{m} |\lambda_k| = 0,$$

then the system of functions is linearly independent. The vanishing of the sum of the moduli of the coefficients, of course, means that each of them is equal to zero.

Let the functions be defined on the segment [a, b]. Then for two functions on this segment one can define their scalar product as

$$(f(x), g(x)) = \int_a^b f(x) g(x)\, dx.$$

The properties of such a scalar product are the same as those of the scalar product of two vectors.
1.7.9 Gram matrix

Now let a system of functions g₁(x), g₂(x), ..., g_m(x) be given. Let us compose the square matrix of all possible scalar products:

$$\begin{pmatrix} (g_1, g_1) & (g_1, g_2) & \dots & (g_1, g_m) \\ (g_2, g_1) & (g_2, g_2) & \dots & (g_2, g_m) \\ \dots & \dots & \dots & \dots \\ (g_m, g_1) & (g_m, g_2) & \dots & (g_m, g_m) \end{pmatrix}.$$

This symmetric numerical matrix is called the Gram matrix.

Theorem 1.7.3. The system of functions g₁(x), g₂(x), ..., g_m(x) defined on the segment [a, b] is linearly independent if and only if the determinant of its Gram matrix is nonzero.

An exercise. Prove the linear independence of the functions 1, x, x² on the segment [−1, 1].
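For the exercise above, the Gram determinant can be computed in closed form, since (xᵐ, xⁿ) = ∫₋₁¹ x^(m+n) dx vanishes for odd m + n and equals 2/(m + n + 1) for even m + n. A short sketch:

```python
# Gram matrix of the functions 1, x, x^2 on [-1, 1].
import numpy as np

def moment(k):
    """Integral of x^k over [-1, 1]: 0 for odd k, 2/(k + 1) for even k."""
    return 0.0 if k % 2 else 2.0 / (k + 1)

G = np.array([[moment(m + n) for n in range(3)] for m in range(3)])
print(G)                  # [[2, 0, 2/3], [0, 2/3, 0], [2/3, 0, 2/5]]
print(np.linalg.det(G))   # 32/135, nonzero
```

Since the determinant 32/135 is nonzero, Theorem 1.7.3 confirms that 1, x, x² are linearly independent on [−1, 1].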
2 Introduction to Calculus of Variations

2.1 Historical aspects of the calculus of variations

The calculus of variations arose at the same time as mathematical analysis, in connection with the solution of problems of maximization and minimization of functionals. Among the first problems of finding extreme values were the isoperimetric⁵ problems, known since ancient times. The greatest poet of ancient Rome, Virgil, reproduces in the poem "Aeneid" the legend of the Phoenician queen Dido. According to legend, the events took place in the IX century BC. Fleeing from the persecution of her brother, Dido went west along the shores of the Mediterranean Sea to seek refuge. She liked a place on the coast of the present Gulf of Tunis, and she negotiated with the local leader about the purchase of land. According to Virgil, Dido "bought as much land, and gave it the name Byrsa, as could be surrounded with a bull's hide". When the deal took place, Dido cut the hide of a bull into narrow strips, tied them together and surrounded a large territory, on which she founded the fortress of Byrsa⁶, and not far from it the city of Carthage. This episode gives rise to the question: how much land can be surrounded by a bull's hide? In order to answer this question, you

⁵ Isoperimetric figures are figures with the same perimeter; the term is formed from two words: iso (equal) and perimeter (measured around).
⁶ Byrsa means hide.
To answer this question, one needs to formulate the problem mathematically. The modern statement of the problem is as follows: among closed plane curves of a given length, find the curve that encloses the maximum area. This problem is called Dido's problem, or the classical isoperimetric problem. Its solution is a circle. When describing Dido's action, Virgil used the verb "circumdare", containing the root "circus", which suggests that the solution of the classical isoperimetric problem was already known. Along with the property of the circle to enclose the largest area among isoperimetric figures, ancient geometers noted the property of the ball to enclose the largest volume among isepiphanic figures, i.e., figures having equal surface area. It is now impossible to say when the idea of the largest capacity of the circle and the ball was born; in any case, Aristotle (4th century BC) already uses these facts as known. The main ways of solving the isoperimetric problem were outlined in antiquity by the ancient Greek mathematician Zenodorus (7) in the treatise "On isoperimetric figures". With the property of the largest capacity of isoperimetric and isepiphanic figures are connected the ideas of ancient science about the circle and the ball as embodiments of geometric perfection. "The most beautiful solid is the ball, and the most beautiful plane figure is the circle": such a statement is attributed to Pythagoras. Historians of science believe that already in ancient science the idea of the expediency of the laws of nature was born. This principle was first expressed by Heron of Alexandria in an attempt to understand theoretically the laws of reflection and refraction of light: he stated that nature acts by the shortest path. Heron's idea was developed by the French mathematician Pierre Fermat, who derived the law of light refraction, well known by that time, from the assumption that in an inhomogeneous optical medium light chooses the trajectory along which the time spent travelling from one point to another is minimal. Another French mathematician, Pierre Louis de Maupertuis, came to the conclusion that when any change occurs in nature, the amount of action required for this change is the smallest.

7 Presumably Zenodorus lived in the 2nd century BC.
Essentially, in 1744 Maupertuis gave the first formulation of the principle of least action (8). Action, according to Maupertuis, is equal to the product of mass, speed, and distance traveled. Leonhard Euler gave Maupertuis's principle a general mechanical character: "Since all natural phenomena follow some law of maximum or minimum, there is no doubt that for the curves described by thrown bodies under the action of any forces some property of maximum or minimum takes place." In 1744 Euler published the work "On the determination of the motion of thrown bodies in a non-resisting medium by the method of maxima and minima". In this work Euler showed that the problem of the motion of a material point in a field of central forces can be reduced to finding the extremum of the integral ∫ v ds, where v is the velocity of the point and the integral is taken over the entire trajectory of its motion. The choice of the optimality criterion determining the true motion caused such fierce controversy for many years that d'Alembert compared it to the era of medieval wars. Only when J. Lagrange derived the equations of motion from the principle of least action was a class of functionals found for which the resulting Lagrange equations provided the minimum. Thus, the next stage in the development of the calculus of variations is associated with the name of Lagrange. Lagrange introduced a rigorous concept of the variation of a function, gave the new scientific direction a modern perspective, and laid the foundations of analytical mechanics. In fact, a special mathematical apparatus was created, which received the name of the calculus of variations. The work of Euler and Lagrange was continued by outstanding scientists: Legendre, Cauchy, Hamilton, Jacobi, Gauss, Poisson, Ostrogradsky, Weierstrass, Hilbert. The variational principles, on the one hand, made it possible to derive a large group of laws of mechanics and optics from one general position; on the other hand, they led to the formulation, in the form of variational principles, of various laws in other fields of physics.

8 Variational principles of mechanics. Collection of articles, ed. L. S. Polak. Moscow: Fizmatgiz, 1959, p. 7.
True paths of light and radio waves, the motions of pendulums and planets, the flows of liquids and gases, and many other motions stand out from the variety of all possible motions in that they are solutions of certain maximum or minimum problems. This is the essence of the variational principles.
2.2 Classic variational problems of physical and geometric content

The class of extremal problems in the calculus of variations differs significantly from those considered earlier. If in problems of minimizing functions of a finite number of variables the desired minimum point was a point of n-dimensional space, then in problems of the calculus of variations a minimum point is, generally speaking, a function belonging to some infinite-dimensional functional space. This feature of extremal problems of the calculus of variations is clearly visible when considering specific problems of physical and geometric content.
2.2.1 The problem of geometric optics. Fermat's principle

In a transparent medium with variable optical density two points A and B are given. Determine the trajectory of a ray of light travelling from A to B. The solution of the problem is based on Fermat's principle: of all the curves connecting the points A and B, the light ray "chooses" the trajectory along which it travels from A to B in the shortest time.

We restrict ourselves to the plane case. We assume that the light propagates in the XOY plane; the coordinates of the two points are given: A = (x0, y0) and B = (x1, y1). The curve connecting these points is denoted by y = y(x), x0 ≤ x ≤ x1, and the speed of light at the point (x, y) by v(x, y). The speed of motion is equal to the ratio of the increment of the path to the increment of time:

v = dS/dt = √(1 + y′²) dx / dt,

where dS and dt are infinitesimal increments of the distance traveled and of time, respectively. From here dt = √(1 + y′²) dx / v(x, y). Integrating, we obtain the time T required for a ray of light to travel from A to B along the curve y = y(x):

T[y(x)] = ∫_{x0}^{x1} √(1 + y′²) / v(x, y) dx.
In the special case when the speed of light propagation is proportional to the ordinate, the ray trajectory is an arc of a circle. Thus, the problem of determining the trajectory of a light ray is reduced to determining the line on which the functional T takes its smallest value. In effect, Fermat's principle is the principle of least time.
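As an illustration of evaluating such a functional (an added sketch; the medium v(x, y) = y, the endpoints and the comparison curve are assumptions made for the example), one can compute the travel time T[y] numerically for two candidate curves:

```python
import numpy as np
from scipy.integrate import quad

def travel_time(y, dy, x0=0.0, x1=1.0):
    """T[y] = integral of sqrt(1 + y'^2) / v(x, y) dx, with v(x, y) = y."""
    return quad(lambda x: np.sqrt(1.0 + dy(x) ** 2) / y(x), x0, x1)[0]

# Straight line from A = (0, 1) to B = (1, 2) ...
t_line = travel_time(lambda x: 1.0 + x, lambda x: 1.0)

# ... versus a curve bowed upward, into the region where light is faster
bow = lambda x: 1.0 + x + 0.3 * np.sin(np.pi * x)
dbow = lambda x: 1.0 + 0.3 * np.pi * np.cos(np.pi * x)
t_bow = travel_time(bow, dbow)

print(t_line, t_bow)  # ~0.980 vs ~0.977: bowing into the fast region pays off
```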
2.2.2 Brachistochrone problem

In mechanics, variational principles trace their history to the end of the 17th century, when Johann Bernoulli published a note entitled "A New Problem to Which Mathematicians Are Invited". In this note J. Bernoulli turned to the best mathematicians of his time with a proposal to test themselves in solving a problem "which will give them the opportunity to try whether the methods they possess are good and how great is the power of their mind". Three solutions were soon given: the first belonged to Jacob Bernoulli (Johann's brother), the second to Guillaume François de L'Hôpital, the third to Isaac Newton; the last appeared in an English journal without a signature, but J. Bernoulli identified the author. History has preserved his words: "I recognized the lion by its claws." The problem proposed by J. Bernoulli was the problem of the brachistochrone (9).

9 Brachistochrone (from the Greek brachistos, "shortest", and chronos, "time"); the brachistochrone is the curve of steepest descent.
In the modern presentation the problem is posed as follows: among the plane curves connecting two given points A and B, find the line along which a freely launched material point, descending without friction, passes from one point to the other in the shortest time (Fig. 2.1). The problem can also be formulated as the following question: what shape should a roof have so that raindrops roll off its ridge in the shortest time?

In his solution, J. Bernoulli proceeded from the analogy between the problems of geometric optics and of mechanical motion. Using Fermat's principle, he discovered an amazing coincidence between the bending of a ray of light in a continuously changing optical medium and the trajectory of a material particle in a gravity field (the brachistochrone curve).

Let us direct the OY axis vertically downward. Let y(x) be the equation of a curve connecting the point A(x0, y0) with the point B(x1, y1). The time T required to travel the path AB is expressed by the integral T = ∫ ds/v. The velocity v of a material particle as a function of the coordinates is determined from the energy conservation law

m v²/2 − m v0²/2 = m g (y − y0).

If at the initial moment of time the speed is zero (v0 = 0), then at the current moment t its value is determined by the relation v = √(2g(y − y0)). Then the descent time required to pass the path AB is written in the form of the integral

T[y(x)] = ∫_a^b √( (1 + y′²) / (2g(y − y0)) ) dx,
where g is the acceleration due to gravity.
And again the problem is reduced to finding the function y(x) that gives a minimum to a certain integral, the time functional T. As the answer, J. Bernoulli obtained a well-known curve, the cycloid. The cycloid is the trajectory of a fixed point on the rim of a wheel rolling along a straight line without slipping. Thus, J. Bernoulli was the first to apply a variational principle to the solution of a mechanical problem; he was also the first to note the analogy between mechanics and optics.
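A numerical sanity check (an added sketch; y is measured downward, v0 = 0, g = 9.81, and the endpoint is chosen at the cycloid parameter φ = π): compare the descent time along the cycloid with the time along the straight chord between the same points.

```python
import numpy as np
from scipy.integrate import quad

g = 9.81
R = 1.0   # cycloid x = R*(phi - sin(phi)), y = R*(1 - cos(phi)); B = (pi*R, 2*R)

# Along the cycloid, ds/v reduces to sqrt(R/g)*dphi, so T = sqrt(R/g)*phi_end.
t_cycloid = np.sqrt(R / g) * np.pi

# Straight chord y = m*x from A = (0, 0) to B = (pi*R, 2*R), m = 2/pi:
# T = integral sqrt((1 + m^2) / (2*g*m*x)) dx (integrable singularity at x = 0).
m = 2.0 / np.pi
t_line = quad(lambda x: np.sqrt((1 + m**2) / (2 * g * m * x)), 0, np.pi * R)[0]

print(t_cycloid, t_line)  # ~1.00 s vs ~1.19 s: the cycloid is faster
```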
2.2.3 Hamilton's principle of stationary action

The great mathematicians and physicists of the 18th and 19th centuries L. Euler, J. Lagrange and W. Hamilton gave the concept of "action", introduced by P. Maupertuis, the content it has today. The product of speed and path is easily transformed into the product of the square of the speed and time, and if we also introduce constant factors, one equal to the mass of the body and the other equal to 1/2, then we obtain the product of kinetic energy and time. This is the modern definition of the concept of "action" in the absence of forces. In the general case, the "action" is equal to the average value of the difference between the kinetic and potential energies, multiplied by the time of motion.

In contrast to Maupertuis's principle of least action, the variational principle of mechanics in Hamilton's form is called the principle of stationary action: in a conservative field a particle moves in such a way that the action integral

L = ∫_{t1}^{t2} (W − U) dt

takes a stationary value.

Consider the oscillations of a material point suspended on a spring, a spring pendulum (Fig. 2.2: Oscillations of a spring pendulum). The kinetic energy of the material point is determined by the expression W = ½ m ẏ², and the potential energy is equal to U = ½ k y².
According to Hamilton's principle, a material point moves so that the action integral

L = ∫_{t1}^{t2} (W − U) dt = ½ ∫_{t1}^{t2} (m ẏ² − k y²) dt

takes a stationary value. The result of solving this variational problem will be the Lagrange equations of motion and the law of conservation of energy W + U = const.
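The resulting equation of motion is m ÿ + k y = 0; a short simulation (an added sketch with arbitrary values m = 1, k = 4) confirms that W + U stays constant along the solution:

```python
import numpy as np
from scipy.integrate import solve_ivp

m, k = 1.0, 4.0   # mass and spring stiffness assumed for the illustration

# Equation of motion of L = int (W - U) dt:  m*y'' + k*y = 0
def rhs(t, s):
    y, v = s
    return [v, -(k / m) * y]

sol = solve_ivp(rhs, (0.0, 10.0), [1.0, 0.0], dense_output=True, rtol=1e-9)

t = np.linspace(0.0, 10.0, 5)
y, v = sol.sol(t)
print(0.5 * m * v**2 + 0.5 * k * y**2)  # W + U stays at 2.0 = 0.5*k*y(0)^2
```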
2.2.4 The problem of the smallest surface of revolution

Consider the functional

I[y(x)] = 2π ∫_a^b y √(1 + y′²) dx.

Its value equals the area of the surface formed by rotating the smooth curve y(x) about the OX axis. If the function y(x) minimizes the functional I(·), then the surface area is minimal (Fig. 2.3).

Figure 2.3: Minimum surface of revolution

The solution of the problem is the family of catenary curves

y(x) = α cosh((x − β)/α),

where the constants α and β are determined by the boundary conditions. It is the shape of the catenary that a chain takes when fixed at points A and B and freely sagging under the action of gravity.
The surface formed by rotating the catenary about the horizontal axis is called the catenoid. The word catenoid is derived from the Latin catena, which means chain, and the Greek eidos, form. The catenoid was first described by Euler (1744). The shape of a catenoid is taken by a soap film "stretched" over two wire rings whose planes are perpendicular to the line connecting their centers.
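To actually hang the catenary between two given rings one must solve two transcendental equations for α and β; a root-finding sketch follows (the endpoint values are assumptions for illustration, not from the text):

```python
import numpy as np
from scipy.optimize import fsolve

# Boundary conditions y(a) = ya, y(b) = yb (illustrative values)
a, b, ya, yb = -1.0, 1.0, 2.0, 2.0

def equations(p):
    alpha, beta = p
    return [alpha * np.cosh((a - beta) / alpha) - ya,
            alpha * np.cosh((b - beta) / alpha) - yb]

alpha, beta = fsolve(equations, [1.5, 0.0])
print(alpha, beta)   # by symmetry beta ~ 0; alpha solves alpha*cosh(1/alpha) = 2
```

Note that a solution need not exist for every pair of rings: if they are too far apart relative to their radii, the soap film breaks and no catenoid connects them.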
2.2.5 Finding geodesic lines

The line of shortest length on a plane. What is the shortest plane curve that connects two fixed points on the plane? Let us fix the coordinates of two points, A(x0, y0) and B(x1, y1), and consider a smooth curve y = y(x) passing through these points: y0 = y(x0), y1 = y(x1). The length of the plane curve y(x) is expressed by the integral

I[y(x)] = ∫_{x0}^{x1} √(1 + y′²) dx.

The problem is reduced to the choice of the function y(x) for which the value of the integral I is smallest. Obviously, the required function is a segment of a straight line (Fig. 2.4).

A geodesic line on an arbitrary surface. Let two points of some smooth surface g(x, y, z) = 0 be fixed. What is the curve of shortest length lying on this surface and connecting the fixed points? It is these curves that are called geodesics (Fig. 2.4).
Figure 2.4: Geodesic lines on a plane and on a curved surface

For the mathematical formulation of the problem, we pass to the
parametric specification of the surface g(x, y, z) = 0:

x = x(u, v),  y = y(u, v),  z = z(u, v).

Here u, v are parameters playing the role of coordinates on the surface g = 0. Let us express the square of the differential of the arc, (ds)² = (dx)² + (dy)² + (dz)², in terms of the differentials of u and v:

dx = (∂x/∂u) du + (∂x/∂v) dv,
dy = (∂y/∂u) du + (∂y/∂v) dv,
dz = (∂z/∂u) du + (∂z/∂v) dv,

whence (ds)² = P(u, v)(du)² + 2Q(u, v) du dv + R(u, v)(dv)², where

P(u, v) = (∂x/∂u)² + (∂y/∂u)² + (∂z/∂u)²,
Q(u, v) = (∂x/∂u)(∂x/∂v) + (∂y/∂u)(∂y/∂v) + (∂z/∂u)(∂z/∂v),
R(u, v) = (∂x/∂v)² + (∂y/∂v)² + (∂z/∂v)².

We fix on the surface two points A(u0, v0) and B(u1, v1) and consider a curve lying on the surface and connecting these two points, specifying it explicitly as v = v(u). Then the arc length of the curve is determined by the integral

I = ∫_{u0}^{u1} √( P + 2v′Q + v′²R ) du.
Thus, the problem is reduced to the choice of the function v = v(u) on which the functional I takes its smallest value.

All the classical problems of the calculus of variations described above can be characterized as the problem of finding a curve y(x) satisfying the boundary conditions y(x0) = y0, y(x1) = y1 and minimizing the functional

I[y(x)] = ∫_{x0}^{x1} F(x, y, y′) dx.
Such a functional I is called the simplest functional, and the problem of its minimization is called the simplest problem in the calculus of variations, or Euler’s problem.
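All these problems share one template: minimize I[y] = ∫ F(x, y, y′) dx over curves with fixed ends. Before developing the theory, it is instructive to attack the template head-on (an added numerical sketch, not the book's method): discretize y on a grid and minimize the resulting function of finitely many variables. For the arc-length functional the minimizer should come out as the straight line.

```python
import numpy as np
from scipy.optimize import minimize

# Discretized simplest functional with F = sqrt(1 + y'^2), ends y(0)=0, y(1)=1
n = 50
x = np.linspace(0.0, 1.0, n + 1)
h = x[1] - x[0]

def I(y_inner):
    y = np.concatenate(([0.0], y_inner, [1.0]))  # enforce boundary conditions
    dy = np.diff(y) / h                          # finite-difference y'
    return np.sum(np.sqrt(1.0 + dy**2)) * h      # rectangle rule for the integral

res = minimize(I, np.zeros(n - 1), method="L-BFGS-B")
print(res.fun)                          # ~ sqrt(2) = 1.4142, length of the chord
print(np.abs(res.x - x[1:-1]).max())    # ~ 0: the recovered curve is y = x
```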
2.3 Basic concepts: functional, variation and variation derivative 2.3.1 Functional As you know, the variable y is a function of the independent variable x, if each value of x is associated with a certain value of y. As a result, the number x corresponds to the number y. The variable I is called a functional depending on the function y(x), and is denoted by I[y(x)] if each function y(x) from a given class of functions corresponds to a certain numerical value I. In other words, the function y(x) corresponds to number I. Thus, the concept of a functional is a generalization of the concept of a function. The difference between a functional and a function is that its arguments are not numbers, but functions of one or more variables. Geometrically, the difference lies in the fact that an ordinary function of one or several variables is a function of a point on a straight line, on a plane or in space, while a functional is a function of more complex geometric formations: lines, surfaces, etc. In other words, a functional is a function of a function. Along with the notation I[y(x)], the notation I(y(·)), I(y), I[y] is also used.
Examples of functionals.

I = ∫_0^1 y(x) dx is the area bounded by the curve y(x) and the segment 0 ≤ x ≤ 1 of the axis;

I = ∫_0^1 √(1 + y′²(x)) dx is the length of the plane curve y(x), 0 ≤ x ≤ 1;

I = ∫_a^b F(x, y, y′) dx is a definite integral that assigns a number I to each differentiable function y(x); here the integrand F(x, y, y′) is assumed given.

Functionals naturally arise in physics, in analytical mechanics, and in problems related to the construction of surfaces, geodesic lines, etc.
2.3.2 Function variation

A variation of a function is the difference of two infinitely close functions from the class of admissible functions at the same value of the argument x: δy = ȳ(x) − y(x). Consequently, to vary the argument of a functional means to pass from one function y(x) of the class admissible for the given functional to another function ȳ(x) infinitely close to it, at the same value of x (Fig. 2.5). This distinguishes the variation of a function from differentiation, which, as is known, measures the change of the same function when the independent variable x changes: dy = y(x + dx) − y(x).
Figure 2.5: Variation of a function, δy = ȳ(x) − y(x)

The operation of variation commutes with the operations of differentiation and integration.
Exercises

1. Show that the derivative of the variation equals the variation of the derivative: d(δy)/dx = δ(dy/dx).

2. Show that the variation of an integral equals the integral of the variation: δ ∫_{x0}^{x1} y(x) dx = ∫_{x0}^{x1} δy(x) dx.

3. Show that the functional

L[y(x)] = ∫_{x0}^{x1} ( p(x)y(x) + q(x)y′(x) ) dx

is linear, that is, satisfies the conditions L[c·y(x)] = cL[y(x)], L[y1(x) + y2(x)] = L[y1(x)] + L[y2(x)].
2.3.3 Variation and variational derivative of the functional

The increment of a function y(x) can be represented as the sum of two terms,

Δy = y(x + Δx) − y(x) = y′(x)Δx + o(Δx) = dy + o(Δx) ≈ dy,

where the first term, the principal linear part of the increment of the function, dy, is called the differential of the function, and the second term o(Δx) is a quantity of higher order of smallness with respect to Δx. In the calculus of variations the analogue of the differential is the variation. In the study of functionals, the variation plays the same role as the differential in the study of functions.

Variation of the functional I[y(x)]. We will assume that the functional I[y(x)] depends explicitly only on the function y(x) and does not depend on its derivatives. We represent the total increment of the functional as the sum of two terms,

ΔI = I[y(x) + δy(x)] − I[y(x)] = I[y(x), δy(x)] + o(δy(x)),
where the first term is the principal part of the increment of the functional, linear with respect to δy, and the second term is a quantity of higher order of smallness. In this case the functional I is called Fréchet differentiable at the point y(x), and the principal linear part of its increment is called the variation of the functional (10). Consequently, the variation of the functional equals δI = I[y(x), δy(x)] ≈ I[y + δy] − I[y]. As we see, the variation of the functional δI is itself a functional, since, generally speaking, it is different for different functions y and different variations δy.

Another definition of the differential is also possible. The differential of the function y(x) can be calculated as the derivative of the function y(x + αΔx) with respect to the parameter α at α = 0:

dy = ∂/∂α [ y(x + αΔx) ] |_{α=0}.

10 Maurice René Fréchet (1878-1973), French mathematician; major works in topology and functional analysis.
Similarly, the variation of the functional can be defined as the derivative of the functional I[y(x) + αδy(x)] with respect to the parameter α at α = 0:

δI[y(x)] = ∂/∂α [ I[y(x) + αδy(x)] ] |_{α=0}.
In this case the functional is called Gateaux differentiable, or differentiable along directions (11). If the variation of the functional exists in the sense of the first definition, i.e., as the principal linear part of the increment of the functional, then

∂/∂α [ I[y(x) + αδy(x)] ] |_{α=0} = lim_{α→0} ( I[y(x) + αδy(x)] − I[y(x)] ) / α = lim_{α→0} ( I[y(x), αδy(x)] + o(α‖δy‖) ) / α.

11 René Eugène Gateaux (1889-1914), French mathematician; major works on functional analysis.
Since the functional I[y(x), αδy(x)] is linear in its second argument, the equality I[y(x), αδy(x)] = αI[y(x), δy(x)] holds. That is why

lim_{α→0} ( αI[y(x), δy(x)] + o(α‖δy‖) ) / α = I[y(x), δy(x)] + lim_{α→0} o(α‖δy‖)/α = I[y(x), δy(x)] = δI,

since o(α‖δy‖)/α → 0 as α → 0. Thus, if the variation exists in the sense of the principal linear part of the increment of the functional, then the variation in the sense of the derivative with respect to the parameter also exists, and in this case the two definitions, Fréchet's and Gateaux's, give the same value. But the second definition is somewhat broader: there are examples of functionals from whose increment no principal linear part can be extracted, yet variations in the sense of the second definition do exist.

If the variation of the functional can be represented as

δI = ∫_{x0}^{x1} A(x) δy(x) dx,
where A(x) is some function of x, then A(x) is called the variational derivative of the functional I with respect to y. For the variational derivative the standard notation A(x) = δI/δy is adopted. In the general case the variational derivative is analogous to the gradient of a function in the representation of the differential,

df = (grad f, dx) = Σ_i (∂f/∂x_i) dx_i,

only in the calculus of variations the sum is replaced by an integral:

δI = ∫ (δI/δy) δy dx.

Variational derivatives with respect to different arguments are possible, for example δI/δy, δI/δz, etc.
Example. Find the first variation and the variational derivative of the functional I[y(x)] = ∫_{x0}^{x1} y²(x) dx.

The total increment of the functional:

ΔI = ∫_{x0}^{x1} [y(x) + δy(x)]² dx − ∫_{x0}^{x1} y²(x) dx = ∫_{x0}^{x1} 2y(x)δy(x) dx + ∫_{x0}^{x1} [δy(x)]² dx.

The first term on the right-hand side is linear with respect to δy(x); the second term is of higher order of smallness with respect to δy(x). By definition, the first variation of this functional is

δI = ∫_{x0}^{x1} 2y(x)δy(x) dx,

and the first-order variational derivative is δI/δy = 2y(x). As you can see, the technique for calculating the variation of a functional and the variational derivative is the same as the technique for calculating the differential and the derivative of a function; only, instead of the differential of the independent variable, the variation of the argument of the functional should be written.

An exercise. Find the first variation and the variational derivative of the functionals:

I(y) = cos(y(1));
I(y) = cos(y(1)) + sin(y(6));
I[y(x)] = ∫_1^2 y dx;
I[y(x)] = ∫_{x0}^{x1} ( x y² + y³ e^{2x} ) dx.
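The formula δI/δy = 2y(x) is easy to test numerically (an added sketch): perturb y by εη with a narrow bump η of unit area centered at a point x̄, so that (I[y + εη] − I[y])/ε approximates the variational derivative at x̄.

```python
import numpy as np
from scipy.integrate import quad

def I(y, pts=None):                       # I[y] = int_0^1 y^2 dx
    return quad(lambda x: y(x)**2, 0.0, 1.0, points=pts, limit=200)[0]

y = np.sin                                # test function on [0, 1]
xbar, eps, w = 0.6, 1e-4, 1e-2

# Narrow hat function of width 2w and unit area, centered at xbar
eta = lambda x: max(0.0, 1.0 - abs(x - xbar) / w) / w
y_pert = lambda x: y(x) + eps * eta(x)

numeric = (I(y_pert, pts=[xbar - w, xbar, xbar + w]) - I(y)) / eps
print(numeric, 2 * np.sin(xbar))          # both ~ 1.13, as dI/dy = 2y predicts
```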
Variation of the functional I[y(x)] = ∫_{x0}^{x1} F(x, y(x), y′(x)) dx. The integrand F(x, y(x), y′(x)) of this functional depends not only on the function y(x) but also on its derivative y′(x), and is continuous together with its partial derivatives up to second order inclusive in all variables. The variation of the integrand is, by definition, δF = F(x, y + δy, y′ + δy′) − F(x, y, y′). Consequently, the total increment of the functional corresponding to the variation of its argument is determined by the expression

ΔI = ∫_{x0}^{x1} [ F(x, y + δy, y′ + δy′) − F(x, y, y′) ] dx = ∫_{x0}^{x1} δF dx.
The variation δF can be found using the Taylor formula for a function of several variables. Thus, for a function of two variables f(x, y) having continuous derivatives up to order (n + 1) inclusive in a neighborhood of some point (x0, y0), the total increment can be represented as

f(x0 + Δx, y0 + Δy) − f(x0, y0) = df(x0, y0) + (1/2!) d²f(x0, y0) + ... + (1/n!) dⁿf(x0, y0) + R,

where df, d²f, ..., dⁿf are the total differentials of the corresponding orders and R = (1/(n+1)!) d^{n+1} f(x0 + θΔx, y0 + θΔy) is the remainder of the Taylor formula (0 < θ < 1).

Let us write a similar expansion for the integrand F(x, y, y′). For fixed x the integrand F(x, y, y′) depends on two arguments, y and y′. We represent the total increment ΔF(x, y, y′) as the following sum:

ΔF(x, y, y′) = F(x, y + Δy, y′ + Δy′) − F(x, y, y′) = dF(y, y′) + R,

where dF(y, y′) = (∂F/∂y)Δy + (∂F/∂y′)Δy′ is the total differential of a function of two variables.
Considering further the function F(x, y, y′) as the argument of the functional and replacing Δy and Δy′ in the increment formula ΔF(x, y, y′) by the variations δy and δy′, we obtain the variation of the function F(x, y, y′):

δF(y, y′) = (∂F/∂y)δy + (∂F/∂y′)δy′ + R.
As a result, the total increment of the considered functional equals

ΔI = ∫_{x0}^{x1} [ (∂F/∂y)δy + (∂F/∂y′)δy′ ] dx + ∫_{x0}^{x1} R(x, y, y′, δy, δy′) dx.
The first term in this formula is linear with respect to δy and δy′. Neglecting the second term as a quantity of higher order of smallness with respect to δy and δy′, we finally obtain the expression for the first variation of the considered functional:

δI = ∫_{x0}^{x1} [ (∂F/∂y)δy + (∂F/∂y′)δy′ ] dx.
The higher variations of the functional are defined similarly to the differentials of higher orders:

ΔI = δI + (1/2!)δ²I + (1/3!)δ³I + ...

Here δI, δ²I, ... are called the first, second, etc., variations of the functional I. They are integrals of polynomials of the first, second, and higher degrees in the variations δy and δy′ = (δy)′, which are found by expanding the integrand in ΔI into a Taylor series in powers of δy and δy′. This is the notation of Lagrange, thanks to which the theory of extrema of functionals came to be called the calculus of variations.
2.4 Functional extremum: absolute and relative, strong and weak. A necessary condition for an extremum

Consider the simplest functional

I[y(x)] = ∫_{x0}^{x1} F(x, y, y′) dx.
We will assume that the functional I is defined on a normed linear space whose elements are continuous and continuously differentiable functions. Functions of this class are also called smooth and are denoted by C1[x0, x1]. All curves y(x) ∈ C1[x0, x1] passing through two given points A(x0, y0) and B(x1, y1) will be called admissible. As in differential calculus, in order to solve the variational problem of the absolute minimum it is necessary first to solve the problem of the relative (local) minimum, i.e., to find admissible curves that give the functional I a value smaller than its values on neighboring curves. If it is clear from physical or other considerations that the problem of the absolute minimum has a unique solution, then the desired curve can be chosen among the local minimizers found as the one giving the smallest value of I.

Let us clarify what we mean by neighboring curves and what change of a function can be considered small. For this we introduce the concept of proximity of curves of order k.

Definition 2.4.1. The curves y(x) and ȳ(x) are close in the sense of closeness of order zero if the modulus of the difference of these functions is small, that is, |y(x) − ȳ(x)| < ε, where ε is a small number.

Definition 2.4.2. The curves y(x) and ȳ(x) are close in the sense of closeness of the first order if |y(x) − ȳ(x)| < ε and |y′(x) − ȳ′(x)| < ε, and so on.

The difference between these concepts is illustrated in Fig. 2.6. Fig. 2.6a shows curves that differ little in ordinates for the same x; these are curves close in the sense of closeness of order zero. Fig. 2.6b shows curves for which not only the ordinates but also the directions of the tangents are close; these are curves close in the sense of closeness of the first order.
Figure 2.6: Close curves y(x) and ȳ(x): closeness of order zero (a) and of the first order (b)

As you can see, if curves are close in the sense of closeness of order k, then all the more are they close in the sense of closeness of any smaller order.

Definition 2.4.3. The functional I[y(x)] is said to attain an absolute (global) maximum on the curve ȳ(x) if, for all functions y(x) of the set on which it is defined, I[y(x)] ≤ I[ȳ(x)], and an absolute (global) minimum if I[y(x)] ≥ I[ȳ(x)]. The absolute maximum and the absolute minimum are called the absolute extrema of the functional.

The concept of a relative (local) extremum of a functional is associated with the study of the behavior of the functional on close curves. But the closeness of curves can be of different orders; therefore, a distinction is made between strong and weak local extrema.

Definition 2.4.4. If the functional reaches an extremum on the curve ȳ(x) among all admissible curves close to ȳ(x) in the sense of closeness of order zero, then the extremum of the functional is called strong. If the functional reaches an extremum on the curve ȳ(x) among all admissible curves close to ȳ(x) in the sense of closeness of the first order, then the extremum of the functional is called weak.

Remarks. 1. If the functional reaches a strong extremum on the curve ȳ(x), then it also reaches a weak one. The converse is not true.
2. Any absolute extremum of a functional is simultaneously a local extremum (strong and weak), but not every local extremum is absolute.

3. The distinction between strong and weak extrema of a functional is essential in the study of sufficient conditions for an extremum and is of little importance when considering necessary conditions.

Let us show how to find an admissible function ȳ(x) that minimizes the functional I. The strategy of searching for an extremum of a functional will be the same as for finding an extremum of a function in differential calculus. According to Fermat's theorem, the necessary condition for an extremum of a function at a given point is the vanishing of the derivative of the function at this point, f′(x∗) = 0, or of the first differential, df(x∗) = f′(x∗)dx = 0. In the calculus of variations we compute the variation of the functional δI and the so-called variational derivative δI/δy and equate them to zero. These equalities are analogous to the necessary conditions for the extremum of a function, but in the case of a functional they lead to an ordinary differential equation, so that the function minimizing the functional must satisfy this differential equation and, of course, the boundary conditions.

The necessary condition for a relative extremum of the functional (strong or weak) is established by the following theorem.

Theorem 2.4.1. If the functional I[y(x)] reaches an extremum on the curve ȳ(x), then its variation or variational derivative vanishes there:

δI[ȳ(x)] = 0 or (δI/δy)|_{ȳ(x)} = 0.

In this case the functional is said to be stationary at the extremum point. So, the vanishing of the variation or of the variational derivative of the functional is a necessary condition for a relative extremum of the functional.
2.5 Euler–Lagrange differential equation

Let it be required to find the extremum of the simplest functional

I[y(x)] = ∫_{x0}^{x1} F(x, y(x), y′(x)) dx,

defined on the class of admissible functions y(x) ∈ C1[x0, x1] satisfying the boundary conditions y(x0) = y0, y(x1) = y1 (Fig. 2.7).
Figure 2.7: The simplest variational problem with fixed ends
Theorem 2.5.1. In order for the differentiable functional

I[y(x)] = ∫_{x0}^{x1} F(x, y(x), y′(x)) dx,

defined on the class of admissible functions y(x) ∈ C1[x0, x1] with the boundary conditions y(x0) = y0, y(x1) = y1, to reach an extremum on some function ȳ(x), it is necessary that this function satisfy the differential equation

∂F/∂y − d/dx (∂F/∂y′) = 0.    (2.1)

Taking into account that the operations of integration and variation commute, the necessary condition for the extremum of the functional, δI = 0, is written in the form δI = ∫_{x0}^{x1} δF(x, y, y′) dx = 0.
As a result, the equality

∫_{x0}^{x1} [ (∂F/∂y)δy + (∂F/∂y′)δy′ ] dx |_{ȳ(x)} = 0    (2.2)

must be fulfilled.
Integrating the second term in (2.2) by parts, we find

∫_{x0}^{x1} (∂F/∂y′)δy′ dx = (∂F/∂y′)δy |_{x0}^{x1} − ∫_{x0}^{x1} d/dx (∂F/∂y′) δy dx.    (2.3)
Since the functions y(x) do not vary at the ends of the interval [x0, x1], the first term in (2.3) vanishes. Therefore, taking into account (2.2), we obtain

δI = ∫_{x0}^{x1} [ ∂F/∂y − d/dx (∂F/∂y′) ] δy dx |_{ȳ(x)} = 0.    (2.4)
Applying to expression (2.4) the main lemma of the calculus of variations (see the reference material), we conclude that, since the variation δy of the argument of the functional is arbitrary, the integrand in (2.4) must vanish:

[ ∂F/∂y − d/dx (∂F/∂y′) ]|_{ȳ(x)} = 0.

This equation is called the Euler–Lagrange equation, and the integral curves of equation (2.1) are called extremals (Euler extremals). The Euler–Lagrange equation plays a fundamental role in the calculus of variations. In general it is a second-order nonlinear differential equation. If we apply the rule for differentiating a composite function to a scalar function of three arguments u = u(x, y(x), z(x)), then

du/dx = ∂u/∂x + (∂u/∂y)(dy/dx) + (∂u/∂z)(dz/dx).
Setting u = ∂F(x, y, y′)/∂y′, we get

d/dx (∂F/∂y′) = ∂²F/∂x∂y′ + (∂²F/∂y∂y′) y′ + (∂²F/∂y′²) y″.

Therefore, in expanded form, equation (2.1) can be written as an ordinary differential equation of the second order:

(∂²F/∂y′²) y″ + (∂²F/∂y∂y′) y′ + ∂²F/∂x∂y′ − ∂F/∂y = 0,

or, in a more compact form,

F_{y′y′} y″ + F_{yy′} y′ + F_{xy′} − F_y = 0.    (2.5)
Here a subscript means differentiation with respect to the corresponding variable. So, the simplest variational problem of finding the extremum of a functional of the form

I[y(x)] = ∫_{x0}^{x1} F(x, y(x), y′(x)) dx
reduces to a two-point boundary value problem for the Euler–Lagrange differential equation (2.5) with the boundary conditions y(x0) = y0, y(x1) = y1. As is known, a boundary value problem may have no solution, or its solution may not be unique; everything depends on the form of the Euler–Lagrange equation and on the solvability of the system of equations for the boundary conditions. Note that even if the boundary value problem for equation (2.5) is solvable, this does not yet imply the existence of an extremum of the functional, since the Euler–Lagrange equation gives only a necessary condition. As in the study of extrema of functions, additional analysis of the solution is required in order to establish whether an extremum of the functional is actually realized and of what character it is (maximum or minimum). Sufficient extremum conditions are required to carry out such an analysis. However, in
applied problems the answer to this question can often be obtained from the physical features of the problem. A setting is also possible in which one of the ends of the trajectory, for example the right end, is not fixed and remains free; then y1 is found together with the trajectory y(x).

An example of finding the minimum of a functional. Consider

I[y(x)] = ∫_0^1 (y′² + y²) dx.
Let us try to find a function y(x) that passes through the points (0, 0) and (1, 1) and minimizes the functional I. Let us write down the Euler equation:

F_y − d/dx F_{y′} = 2(y − y″) = 0.

Solving this simple differential equation with the boundary conditions y(0) = 0, y(1) = 1, we find

ȳ(x) = 0.425 e^x − 0.425 e^{−x} = 0.85 sinh x.

Figure 2.8: Solution of the simplest variational problem: a, the minimizing function ȳ(x) = 0.425e^x − 0.425e^{−x}; b, c, d, other admissible functions satisfying the boundary conditions
The graph of the corresponding function is shown in Fig. 2.8. The value of the functional on this trajectory is I[ȳ(x)] = coth 1 ≈ 1.313. Any other smooth function satisfying the same boundary conditions gives a greater value of the functional I. For comparison, take the straight-line segment y(x) = x as an admissible function. This trajectory is admissible: it belongs to the class C1[x0, x1] and passes through the given boundary points. For it,

I[y(x)] = ∫_0^1 (1 + x²) dx = [ x + x³/3 ]_0^1 = 4/3 > I[ȳ(x)].
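These numbers are easy to reproduce by quadrature (an added check): the exact minimum value is coth 1 ≈ 1.313, and the straight line indeed gives the larger value 4/3.

```python
import numpy as np
from scipy.integrate import quad

I = lambda y, dy: quad(lambda x: dy(x)**2 + y(x)**2, 0.0, 1.0)[0]

# Extremal y(x) = sinh(x)/sinh(1) (= 0.85*sinh(x)) versus the line y = x
y_star = lambda x: np.sinh(x) / np.sinh(1.0)
dy_star = lambda x: np.cosh(x) / np.sinh(1.0)

print(I(y_star, dy_star))             # 1.3130... = coth(1), the minimum
print(I(lambda x: x, lambda x: 1.0))  # 1.3333... = 4/3, strictly larger
```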
So we have found that the function ȳ(x) = 0.425e^x − 0.425e^{−x} delivers the minimum of the functional

I[y(x)] = ∫_0^1 (y′² + y²) dx,

and in this case I[ȳ(x)] ≈ 1.313.

Exercises

1. Find the extremal of the functional

∫_0^1 (4y − y′² + 12x²y′) dx,

satisfying the boundary conditions y(0) = 1, y(1) = 4.

2. Find the extremal of the functional

∫_0^{π/4} (y′² − 4y² − 2x e^x y) dx,

satisfying the boundary conditions y(0) = 0, y(π/4) = 0.

3. Among all curves satisfying the conditions y(0) = 0, y(1) = 1, find the one that minimizes the functional

I[y(x)] = ∫_0^1 √(1 + y′²) dx.

How do you interpret this result? What is the value of I[ȳ(x)]?
4. Show that in the class of smooth functions satisfying the boundary conditions y(0) = 0, y(π/2) = 1, the functional

I[y(x)] = ∫_0^{π/2} (y′² − y²) dx

reaches its minimum at the function ȳ(x) = sin x. Calculate I[ȳ(x)].
2.6 Cases of lowering the order of the Euler–Lagrange differential equation Let us note some cases when the Euler – Lagrange equation admits a reduction in order.
2.6.1 The function F does not explicitly depend on y′

Let F = F(x, y). Then the Euler–Lagrange equation takes the form F_y(x, y) = 0 and is not a differential equation at all; it only defines the solution y = y(x) implicitly. There are no constants of integration here, so there can be no question of choosing them so that the boundary conditions are satisfied. In this case a solution is possible only if the boundary conditions are specially chosen and satisfy the equation F_y(x, y) = 0.
2.6.2 The function F does not explicitly depend on y

Let F = F(x, y′); the integrand does not depend explicitly on y. In this case the Euler–Lagrange equation takes the form d/dx (∂F/∂y′) = 0, whence we obtain the first integral

∂F(x, y′)/∂y′ = C1 = const.

To determine the function y(x) we have a first-order differential equation that does not explicitly contain y. If this equation is solvable
with respect to the derivative y′, it takes the form of an equation with separable variables, y′ = f(x, C1), whence

y(x) = ∫ f(x, C1) dx + C2.

For example, for a geodesic on a plane F(x, y′) = √(1 + y′²), and the Euler–Lagrange equation gives

y′ / √(1 + y′²) = C1,

whence y′ = a and y(x) = ax + b. The constants a and b are found from the boundary conditions. So a straight line really does minimize the integral that gives the distance between two points of the plane.
2.6.3 The function F does not explicitly depend on x

In this case F = F(y, y′), and the identity

d/dx ( y′ ∂F/∂y′ − F ) = y″ ∂F/∂y′ + y′ d/dx (∂F/∂y′) − ∂F/∂x − (∂F/∂y) y′ − (∂F/∂y′) y″

holds. Here the first and last terms on the right-hand side cancel, and we obtain

d/dx ( y′ ∂F/∂y′ − F ) = y′ [ d/dx (∂F/∂y′) − ∂F/∂y ] − ∂F/∂x.

Taking into account the Euler–Lagrange equation and the equality ∂F/∂x = 0, we find

d/dx ( y′ ∂F/∂y′ − F ) = 0.

This yields the first integral of the Euler–Lagrange equation:

y′ ∂F/∂y′ − F = const.

This is a first-order equation that depends only on y and y′ and does not depend on x. If it can be resolved explicitly with respect to the derivative, y′ = ψ(y, C1), we obtain the extremals in the form

x = ∫ dy/ψ(y, C1) + C2.
2.6.4 The function F is a total derivative, F = dG(x, y)/dx
In this case the integral I does not depend on the choice of the function y(x) and is determined only by the boundary conditions: I = G(x1, y1) − G(x0, y0). Such a situation is possible only when the integrand is a linear function of y′:

F(x, y, y′) = a(x, y) + b(x, y) y′.

As a result, the Euler–Lagrange equation takes the form

∂a/∂y − ∂b/∂x = 0.

This is precisely the necessary and sufficient condition for the function F to be representable as a total derivative, with

a(x, y) = ∂G/∂x,  b(x, y) = ∂G/∂y.

Thus, we have obtained a necessary and sufficient condition for the Euler–Lagrange equation to be satisfied identically: F = dG(x, y)/dx.
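These reductions are easy to double-check symbolically (an added sketch using sympy): for F = √(1 + y′²), which depends only on y′ (case 2.6.2), the Euler–Lagrange equation collapses to y″ = 0, and for F = y′² + y² (case 2.6.3) the first integral y′F_{y′} − F is constant along the extremal found in Section 2.5.

```python
import sympy as sp
from sympy.calculus.euler import euler_equations

x = sp.symbols('x')
y = sp.Function('y')

# Case 2.6.2: F = sqrt(1 + y'^2) depends only on y'
F = sp.sqrt(1 + y(x).diff(x)**2)
eq, = euler_equations(F, y(x), x)
print(sp.simplify(eq))   # -y''/(1 + y'^2)**(3/2) = 0, i.e. y'' = 0: straight lines

# Case 2.6.3: first integral y'*F_{y'} - F for F = y'^2 + y^2,
# evaluated on the extremal y = sinh(x)/sinh(1) of Section 2.5
F2 = y(x).diff(x)**2 + y(x)**2
H = y(x).diff(x) * sp.diff(F2, y(x).diff(x)) - F2
print(sp.simplify(H.subs(y(x), sp.sinh(x) / sp.sinh(1)).doit()))  # sinh(1)**(-2)
```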
Exercises

1. Consider Hamilton's principle of stationary action (the principle of least action), according to which a material particle moves so that the action integral L = ∫_{t1}^{t2} (W − U) dt takes a stationary value. Obtain the Lagrange equations of motion and the energy conservation law W + U = const for a spring pendulum (see Fig. 2.2). The kinetic energy of a material point is W = ½ m ẏ², and the potential energy is U = ½ k y².

2. Solve the problem of the smallest surface of revolution.
3. Solve the isoperimetric problem using the method of indefinite Lagrange multipliers. The desired curve passes through the points y(0) = 0, y(2) = 0 of the plane; the length of the curve is l = π.

4. Solve the brachistochrone problem.

5. Solve the problem of geometric optics, provided that the speed of light is proportional to the ordinate, with the boundary conditions y(0) = 0, y(1) = 1.

6. Consider the vertical motion of a material point of mass m in the Earth's gravitational field. Denote by y the distance measured from the starting point and by t the time from the beginning of the motion. It follows from Newton's second law that the trajectory of motion y(t) must satisfy the equation

d²y/dt² + g = 0,    (2.6)

where g is the acceleration due to gravity. Does there exist a function F(t, y, y′) such that the corresponding Euler–Lagrange differential equation coincides with equation (2.6)?
2.7 Reference material

Derivation of the Euler–Lagrange equation. Suppose the function minimizing the simplest functional exists; denote it by y(x). Along with the function y(x), consider the function ỹ = y(x) + εη(x). Here η(x) ∈ C1[x0, x1], and at the boundary points η(x) vanishes: η(x0) = η(x1) = 0. Therefore the curve ỹ = y(x) + εη(x) passes through the boundary points A and B just as the curve y(x) does (Fig. 2.9). Let us define the variation of the function as δy = η(x). The number ε is chosen in a small neighborhood of zero, which ensures the proximity of the functions y(x) + εη(x) and y(x) in the sense considered above.
The value of the functional I[y(x) + εη(x)], calculated along the trajectory ỹ = y(x) + εη(x), cannot be smaller than I[y(x)], since the function y(x) is the minimizing function. Therefore, for all ε, the increment of the functional satisfies the inequality

ΔI = I[y(x) + εη(x)] − I[y(x)] ≥ 0.    (2.7)

As a result, the functional I[y(x) + εη(x)] becomes a function of the argument ε. Let us denote this dependence by ϕ(ε) = I[y + εη]; the graph of ϕ(ε) is shown in Fig. 2.10.
Figure 2.9: Function variation: a, the function ỹ = y(x) + εη(x); b, the minimizing function y(x); c, the variation δy = η(x)

Figure 2.10: The graph of the function ϕ(ε) = I[y + εη] in the vicinity of ε = 0

The problem of the calculus of variations has been reduced to the usual problem of the differential calculus of the minimum of a function of one variable ϕ(ε); moreover, we know a priori that the minimum is attained at the point ε = 0. We expand ϕ(ε) in a Maclaurin power series:

ϕ(ε) = ϕ(0) + (ε/1!) ϕ′(0) + (ε²/2!) ϕ″(0) + ...

The coefficients of the resulting power series are called the first, second, etc., variations of the functional I and are denoted by δI, δ²I, ... Therefore the total increment of the functional is

ΔI = (ε/1!) δI[η] + (ε²/2!) δ²I[η] + ...
Let us make the definition of the variations of the functional precise: the first variation of the functional is δI(η) = d/dε I(y + εη) |_{ε=0}. The second variation is defined in a similar way, δ²I(η) = d²/dε² I(y + εη) |_{ε=0}, and so on.
Note that the modern concept of the variation of a functional differs somewhat from the classical variation introduced by Lagrange. The variations of the functional in the sense of Lagrange are equal to δI = εI′(0), δ²I = ε²I″(0), ..., and differ from the modern interpretation by the factor ε, while the variation of the function is taken to be δy = η(x).

It is known from differential calculus that for the function ϕ(ε) to have a minimum at ε = 0 it is necessary that ϕ′(0) = 0 (Fermat's theorem). Hence, for the function y(x) to minimize the functional I it is necessary that I′(0) = ϕ′(0) = 0, in other words δI = 0, and this condition must hold for every function η(x) ∈ C1[x0, x1]. Differentiating I with respect to ε and setting ε = 0, we find the first variation of the functional δI:

dϕ(ε)/dε |_{ε=0} = d/dε I[y + εη] |_{ε=0} = d/dε ∫_{x0}^{x1} F(x, y + εη, y′ + εη′) dx |_{ε=0} = 0,

or

δI[η(x)] = I′(0) = ∫_{x0}^{x1} [ (∂F(x, y, y′)/∂y) η(x) + (∂F(x, y, y′)/∂y′) η′(x) ] dx = 0.    (2.8)
Integrating (2.8) by parts, we obtain the already familiar result

δI[η(x)] = (∂F/∂y′) η(x) |_{x0}^{x1} + ∫_{x0}^{x1} [ ∂F/∂y − d/dx (∂F/∂y′) ] η(x) dx |_{y(x)} = 0.    (2.9)

The first term in expression (2.9) vanishes because of the boundary conditions imposed on the function η(x) by definition. The second term is the integral of the product of two functions and is a functional linear with respect to η(x). By definition of the variational derivative,

δI[η(x)] = ∫_{x0}^{x1} (δI/δy)|_{y(x)} η(x) dx.
Lagrange's lemma (without proof). If the function G(x) is continuous on the segment [x0, x1] and for every function η(x) ∈ C1[x0, x1] satisfying the conditions η(x0) = η(x1) = 0 the equality

∫_{x0}^{x1} G(x) η(x) dx = 0

holds, then G(x) ≡ 0 on this segment; in other words, G(x) = 0 for all x ∈ [x0, x1].

This lemma plays an important part in the proof of the necessary condition for the extremum of the simplest functional; for this reason it is called the main lemma of the calculus of variations. The lemma can be generalized with respect to smoothness if the function η(x) is required to have continuous derivatives of any order.

Let us show that for the simplest problem of the calculus of variations, which consists in minimizing the functional

I[y(x)] = ∫_{x0}^{x1} F(x, y(x), y′(x)) dx,

the necessary condition for an extremum, δI = 0, can be reduced to
Chapter 2
the form
b δJ =
∂F d − ∂y dx
∂F ∂y
δy dx = 0.
a
Cofactor
∂F d − ∂y dx
∂F ∂y
in this expression plays the role of the function G, and the variation δy plays the role of the function η(x). According to Lagrange’s lemma ∂F δI d ∂F = = 0. (2.10) − δy y(x) ∂y dx ∂y y(x) So, we have proved that if the function y(x) minimizes the funcx1 tional I[y(x)] = F (x, y, y ) dx in the class C1 [x0 , x1 ] with the x0
boundary conditions y(x0 ) = y0 , y(x1 ) = y1 , then it must nec∂F d ∂F essarily satisfy the Euler – Lagrange equation ∂y − dx ∂y = = 0 with the given boundary conditions.
3 Theory Of Optimal Control. Statement Of The Main Problem

The modern stage in the development of the calculus of variations is associated with the theory of optimal control, which arose in the middle of the 20th century from the need to solve a number of practical problems in various areas of new technology. These problems, variational in their mathematical essence, did not fit into the framework of the classical models and required the development of a new mathematical apparatus. The foundations of the mathematical theory of optimal control were laid by a team of Soviet mathematicians headed by Academician L. S. Pontryagin (12). The maximum principle developed by this team essentially generalizes and develops the main results of the classical calculus of variations created by Euler, Lagrange and other outstanding mathematicians of the past. In the general case the Pontryagin maximum principle expresses a necessary optimality condition. The second fundamental result of the theory of optimal control is R. Bellman's method of dynamic programming (13), which is associated with sufficient conditions for optimality. We already became familiar with this method in the course of operations research, when considering multi-step systems.

A generalization of the two directions mentioned above is the

12 Lev Semenovich Pontryagin (1908-1988), Soviet mathematician, academician of the USSR Academy of Sciences; made significant contributions to algebraic and differential topology, oscillation theory, the calculus of variations, and control theory. Pontryagin L. S., Boltyansky V. G., Gamkrelidze R. V., Mishchenko E. F. Mathematical theory of optimal processes. 2nd ed. Moscow: Nauka, 1969. (The monograph was awarded the Lenin Prize for 1962.)
13 Richard Bellman (1920-1984), American mathematician, one of the leading experts in the field of mathematics and computing.
mathematical apparatus developed by the school of the Soviet scientist V. F. Krotov (14). The formalism of the proposed approach allows one to obtain both the equations of the maximum principle and the method of dynamic programming; this is achieved by different ways of specifying the Krotov function ϕ(t, x), which appears in the formulation of the main results of the theory of sufficient optimality conditions. In the general case the function ϕ(t, x) is determined from broader conditions and can exist even when, for example, the Bellman function is not defined. Therefore, in what follows, the main attention will be directed to the study of V. F. Krotov's sufficient conditions for optimality and the related methods of Lagrange–Pontryagin, Hamilton–Jacobi–Bellman and the method of multiple maxima.

14 Vadim Fedorovich Krotov (1932-2015), Soviet and Russian scientist, a well-known specialist in the field of optimal control and its applications.
3.1 Controlled object and its dynamics

We constantly encounter controlled objects; these include, for example, a car, a ship, an aircraft, a robot, a technological process in production, etc. All these objects have controls, "rudders", by changing the position of which we can influence the motion of the object. The question arises: how does one control the object in the best, optimal way, and how can mathematical methods be applied for this purpose?

The use of mathematical methods for studying processes of diverse nature becomes possible after a mathematical model has been built. Real physical processes are modeled in various ways: by ordinary differential equations; by difference equations; by partial differential equations; by integral equations; or in a mixed way. Such mathematical models describe a fairly wide range of processes and have numerous applications, for example, in the dynamics
of space flight, in the control of chemical or nuclear reactors. Mathematical modeling is successfully used to synthesize optimal control modes for manipulation robots, industrial automation systems, as well as in mathematical economics. The subject of our further study will be: optimal control problems for systems whose behavior is described by ordinary differential equations, as well as those close to them in content; optimal control problems for multistep processes given by recurrent relations (difference equations).
3.2 Examples of applied problems of optimal control 3.2.1 Pendulum movement The motion of a flat pendulum suspended from a pivot point by means of a rigid weightless rod (Fig. 3.1) is described by the equation ¨ + bθ˙ + mgl sin θ = M (τ). Iθ where θ = θ(τ) is the angle of deflection of the rod from the equilibrium position; l is the length of the rigid rod of the pendulum; m is the
Figure 3.1: Mathematical pendulum and forces acting on it mass concentrated at the end of the rod; I = ml2 - moment of inertia;
g is the acceleration due to gravity; b ≥ 0 is the damping factor; τ is the time; M(τ) is the external control torque. The change of variable t = τ√(mgl/I) leads to an equation of the form

ẍ + β ẋ + sin x = u(t),

where x(t) = θ(t√(I/mgl)), β = b/√(I mgl), u(t) = M(t√(I/mgl))/(mgl). Let us denote the angle of deflection of the pendulum by x1(t) = x(t) and the rate of change of the deflection by x2(t) = ẋ(t). Then the oscillations of the pendulum obey the second-order differential system
ẋ1(t) = x2(t),
ẋ2(t) = −β x2(t) − sin x1(t) + u(t).

Let at the initial moment t = 0 the pendulum be deflected by the angle x1(0) = x10 and have the initial velocity x2(0) = x20. We will also assume that the control torque u(t), by choosing which one can affect the motion of the pendulum, satisfies the constraint |u(t)| ≤ γ, γ = const, t ≥ 0. Various formulations of the problem of optimal control of the pendulum motion are possible: choose an admissible control u(t) so as

1) to stop the pendulum at one of the points of stable equilibrium in the minimum time T; in other words, to transfer the pendulum from one given point to another, for example x1(T) = 2πk, x2(T) = 0, k = 0, ±1, ..., in the minimum time (the time-optimal problem);

2) to transfer the pendulum from a given starting point to a given region in the minimum time T, that is, to achieve in the minimum time the fulfillment of the condition [x1(T)]² + [x2(T)]² ≤ ε, where ε is a given number;

3) to stop the pendulum at a given time T in one of the stable equilibrium positions and to minimize the value ∫_0^T u²(t) dt.
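A minimal simulation sketch (added; the values of β and γ and the clipped feedback law are illustrative assumptions, not the optimal control): integrate the system under an admissible control with |u| ≤ γ and watch the pendulum settle toward x1 = 0, x2 = 0.

```python
import numpy as np
from scipy.integrate import solve_ivp

beta, gamma = 0.1, 0.5          # damping and control bound (assumed)

def u(x):
    # Admissible feedback: a PD-like law clipped to |u| <= gamma
    return np.clip(-1.5 * x[0] - 1.0 * x[1], -gamma, gamma)

def rhs(t, x):
    return [x[1], -beta * x[1] - np.sin(x[0]) + u(x)]

sol = solve_ivp(rhs, (0.0, 30.0), [1.0, 0.0], max_step=0.05)  # x1(0)=1, x2(0)=0
print(sol.y[0, -1], sol.y[1, -1])   # both near 0: the pendulum is brought to rest
```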
3.2.2 Soft vertical landing on the surface of a planet

The vertical motion of a spacecraft with a rocket engine in a vacuum is described by the system of differential equations

ḣ = v,
v̇ = P/m − g(h),
ṁ = −P/c.

Here h, v, m are, respectively, the height, speed and mass of the vehicle, g(h) is the acceleration of gravity, P is the engine thrust, which can vary arbitrarily within 0 ≤ P ≤ Pmax, and c is the exhaust velocity of the working fluid from the engine. It is required to find a mode of motion from the given initial conditions h0, v0 to the final conditions h1 = 0, v1 = 0 that corresponds to the maximum value of the final mass m1.

The examples given are special cases of a more general optimal control problem, whose formulation is considered below.
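A simulation sketch of these dynamics (added; the constant g, the engine data, and the crude "free fall, then full thrust" law are assumptions for illustration, not the optimal program):

```python
import numpy as np
from scipy.integrate import solve_ivp

g, c, Pmax = 1.62, 3000.0, 45000.0   # moon-like gravity; engine data assumed
t_switch = 25.0                       # coast until t_switch, then full thrust

def rhs(t, s):
    h, v, m = s
    P = 0.0 if t < t_switch else Pmax
    return [v, P / m - g, -P / c]

def touchdown(t, s):                  # stop the integration at h = 0
    return s[0]
touchdown.terminal = True
touchdown.direction = -1

sol = solve_ivp(rhs, (0.0, 200.0), [5000.0, -100.0, 10000.0],
                max_step=0.1, events=touchdown)
print(sol.y[:, -1])   # h, v, m at touchdown; tune t_switch to make v ~ 0
```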
3.3 Statement of the main problem of optimal control Let there be a set A = [t0 , t1 ] a segment of the numerical axis T with elements t.
3.3.1 Phase coordinates (states)

Suppose that the controlled object under consideration is, at each moment of time t, completely described by a finite set of numbers x1(t), x2(t), ..., xn(t), which are called the phase coordinates of the object. From these numbers we form a vector x of dimension n:

x = (x1, x2, ..., xn)ᵀ, x ∈ Eⁿ,

where Eⁿ is Euclidean space. We will call the vector x the vector of phase coordinates of the object.
Let the law of change of the phase coordinates in time be described by a system of ordinary differential equations

dx_i/dt = f_i(t, x1, x2, ..., xn, u),  i = 1, 2, ..., n,    (3.1)

where t is time and the f_i are known functions of their arguments. The basis for composing such systems of differential equations is provided by the laws of the specific fields of knowledge, for example, physical laws. So, the dynamics of the controlled object obeys the differential system (3.1), whose right-hand side includes a vector u called the control. System (3.1) determines not a specific motion of the controlled object but its technical capabilities. To describe a specific motion of the object, one should choose the control u = u(t) as some function of time t, set the initial state x(t0) = x0, and solve the initial-value problem (Cauchy problem)

ẋ = f(t, x, u(t)) ≡ F(t, x), x(t0) = x0.

The solution x(t) of the Cauchy problem, which depends on the control u(t) and on the initial condition x0, specifies the particular motion of the controlled object.
3.3.2 The class of admissible controls

The control vector u = (u1, ..., ur) is a vector of the r-dimensional Euclidean space, u ∈ E^r; it characterizes the position of the "rudders" of the controlled object. Let its first component u1 be an angle equal to the deviation of a rudder from a certain direction; a typical restriction for it is u1⁻ ≤ u1 ≤ u1⁺, where u1⁻, u1⁺ are given numbers, and the extreme values are admissible (the inequalities are not strict). If the second component u2 is the traction force, then a typical restriction is 0 ≤ u2 ≤ u2⁺, where u2⁺ is the maximum possible traction force, and here too the extreme values 0 and u2⁺ are admissible. Generalizing this situation, we will assume that the control vector at each time instant t satisfies the condition u(t) ∈ U(t), where U(t) is some closed bounded set in the r-dimensional Euclidean space. The set U(t) is called the control region. As a result, the class of admissible controls consists of the vector functions u(t) satisfying the condition u(t) ∈ U(t) for all t.
3.3.3 Control quality criterion

Consider a pair of vector functions (x(t), u(t)), t0 ≤ t ≤ t1, where u(t) is an admissible control and x(t) is the trajectory corresponding to this control, i.e., the solution to the Cauchy problem. Consider also the functional

I[x(·), u(·)] = ∫_{t0}^{t1} f⁰(t, x(t), u(t)) dt + F(x(t1)),  (3.2)
where f 0 , F are known functions of their arguments. Thus, each pair (x(t), u(t)) is associated with a number I defined by formula (3.2). Functional (3.2) is called the control quality criterion. It can have different physical meanings: fuel consumption, energy costs, financial costs or profit, the time of transition from one given state to another, etc. The specific choice of functional I is made based on the requirements for the considered controlled process. Our goal will be to minimize the control quality criterion represented by the functional (3.2).
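As a short sketch of how a criterion of the form (3.2) is evaluated for a sampled process by quadrature; the running cost f⁰, the terminal term F and the sample process below are illustrative assumptions.

import numpy as np

def f0(t, x, u):
    return u**2                           # assumed running cost (energy-like)

def F(x1):
    return 10.0 * (x1 - 1.0)**2           # assumed terminal penalty

t = np.linspace(0.0, 1.0, 2001)
u = np.cos(t)                             # a sample admissible control
# crude trapezoidal integration of x' = u with x(0) = 0
x = np.concatenate([[0.0], np.cumsum(0.5 * (u[1:] + u[:-1]) * np.diff(t))])

g = f0(t, x, u)
I = np.sum(0.5 * (g[1:] + g[:-1]) * np.diff(t)) + F(x[-1])
print(I)                                  # the number associated with this pair (x, u)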
3.4 Reference material

The relationship between the main variables of the optimal control problem. Terminology and notation [9]. Consider some set V in the direct product T × X × U. Let us introduce the following notation: V(t) is the section of the set V for each t; Vx(t) is the projection of V(t) onto the space X; Vu(t, x) is the section of V(t) for each fixed x; Vtx is the projection of the set V onto the space T × X. The relationship between the introduced sets is schematically illustrated in Fig. 3.2. As a result, for each fixed t ∈ A, the pair of vector functions (x(t), u(t)) takes values from the set V(t):

(x, u) ∈ V(t),  or  x ∈ Vx(t), u ∈ Vu(t, x(t));  (3.3)
in particular, for t = t0, t = t1,

x(t0) ∈ Vx(t0),  x(t1) ∈ Vx(t1).  (3.4)
Figure 3.2: The relationship between the sets of admissible states and controls [9]

Relations (3.3) are called phase and control constraints, and relations (3.4) are called the boundary conditions of the problem. The set of pairs (x, u) satisfying differential system (3.1) and conditions (3.3), (3.4) will be called the set (class) of admissible processes D. Pairs (x(t), u(t)) ∈ D will be called control processes. The problem is posed on the minimum of the functional I, defined on the set D according to (3.2): namely, it is required to find a sequence (xs(t), us(t)) ∈ D on which the functional I[x(·), u(·)] tends to its smallest value on the set D:

I(xs(t), us(t)) → inf_D I as s → ∞.
Phase coordinates (states) and controls. The formal meaning of the concepts of state and control is associated only with a certain mathematical model, the form of equations (3.1). Namely, the state variables enter the equations for each value of the argument together with their derivatives (in the discrete model, together with their values at the next step), while the controls
are represented in equations (3.1) only by the values corresponding to the current value of the argument. Model state and control variables may not correspond to the variables that describe the state and control of real objects. Thus, in problems of optimization of space trajectories, it is customary to consider the position and velocity of the center of mass of the spacecraft as the state, and the direction of the engine thrust as the control. At the same time, on real vehicles a change in thrust direction is achieved by rotating the entire vehicle with the help of devices that create moments relative to the center of mass. This discrepancy reflects the fact that the mathematical model of a controlled object may not be unique: it depends on the goals to be achieved with its help.

Set-theoretic constraints on states and controls. For a long time, until approximately the middle of the 19th century, the idea dominated among mathematicians that the minimum of a functional in a variational problem is attained on a continuous and even smooth function whenever the exact lower bound of the functional is finite. The formation of such ideas was facilitated by the specific applications of variational problems: the brachistochrone problem, the refraction of light, isoperimetric problems, the principle of least action. However, Weierstrass, in his famous polemic with Riemann, showed by numerous examples that the minimum of a functional may not be attained on this set, and gave examples of discontinuous solutions. Thus, the solution of the problem of the minimum of the functional

I[y(x)] = ∫₀¹ √(1 + ẏ²) dx

in the class of smooth functions is the function y(x) = x, which realizes the Euclidean distance between the given pair of endpoints (0; 0) and (1; 1) of the trajectory, the minimum of the integral being √2. Extension of the class of admissible functions leads to a different solution: a piecewise continuous, monotonically increasing function with a derivative equal to zero almost everywhere. The corresponding minimum of the integral is now equal to 1. Thus, on the set of smooth functions the value inf_{y(x)∈D[0,1]} I[y(x)] = 1 is not attained (Fig. 3.3).
Figure 3.3: The Euclidean distance between the two points (0;0) and (1;1) of the plane in the class of smooth functions is √2 times greater than the length of a piecewise continuous (stepwise) function

Therefore, the description of the class of admissible states and controls also includes an indication of the nature of the dependence of admissible states and controls on time. Thus, admissible trajectories can be continuous and piecewise differentiable functions, while admissible controls can be piecewise continuous or measurable functions, etc.
4 Formalism Of Sufficient Conditions And The Optimal Principle Of V. F. Krotov

4.1 Sufficient Krotov Optimality Conditions for Continuous Processes

4.1.1 Formulation of the problem

We consider the problem of minimizing the functional

I[x(·), u(·)] = ∫_{t0}^{t1} f⁰(t, x, u) dt + F(x0, x1) → min  (4.1)
on the set of solutions of the differential system

dx/dt = f(t, x, u),  (4.2)
with boundary conditions x(t0) = x0 and restrictions on the phase variables and controls

(x(t), u(t)) ∈ V(t).  (4.3)
It is required to find an admissible process (x(t), u(t)) satisfying constraints (4.2), (4.3), minimizing the functional (4.1). It is assumed that the vector function x(t) is continuous for t ∈ [t0 , t1 ], and the vector control function u(t) is piecewise continuous; in particular, it can have a finite number of discontinuity points of the first kind.
4.1.2 Additional constructions

For the further presentation, we introduce the following constructions:

1) the Hamilton function H,

H(t, ψ, x, u) = (ψ · f(t, x, u)) − f⁰(t, x, u) = Σ_{i=1}^{n} ψi fi(t, x, u) − f⁰(t, x, u),

where ψ = (ψ1, ψ2, ..., ψn) is an n-dimensional vector;

2) the function R,

R(t, x, u) = H(t, ∂ϕ/∂x, x, u) + ∂ϕ/∂t,

where (ϕt, ϕx) = (ϕt, ϕx1, ..., ϕxn) is the vector of partial derivatives of the function ϕ(t, x), a scalar function of n + 1 variables that is defined, continuous and has continuous partial derivatives with respect to all variables, with the possible exception of a finite number of sections of T × X (otherwise the function ϕ(t, x) is arbitrary);

3) the function G,

G(x0, x1) = F(x0, x1) + ϕ(t1, x1) − ϕ(t0, x0).

We denote by μ(t) and m the following quantities:

μ(t) = sup_{(x,u)∈V(t)} R(t, x, u),  m = inf_{(x0,x1)∈V(t0)×V(t1)} G(x0, x1).
Let us formulate two lemmas on which we will construct the proof of the minimal theorem.
4.2 Basic lemmas on minimizing sequences

A functional I(v) is given on the set D, v ∈ D. It is required to find a sequence that minimizes the functional I on D, that is, a sequence {vs} ⊂ D on which I(vs) → i = inf_D I.
Lemma 4.2.1. Let l be a lower bound of the functional I on D (possibly l = −∞), and let there be a sequence {vs} ⊂ D on which I(vs) → l. Then l is the exact lower bound of the functional I on D, l = i; {vs} is a minimizing sequence; and on any minimizing sequence {vs} ⊂ D the functional I(vs) → l = i.

Proof. Obviously, i ≥ l (according to the definition of the exact lower bound, the infimum is the largest of all lower bounds). By the hypothesis of the lemma, the sequence I(vs) converges to the lower bound l. Let us show that the strict inequality i > l is impossible. By the definition of the limit, starting from some number S, the elements of the sequence I(vs) fall into the ε-neighborhood of the limit value l: for all s > S, the inequality |I(vs) − l| < ε, or l − ε < I(vs) < l + ε, holds. If i > l, then for ε small enough the values of the functional would be less than the number i, which contradicts the definition of the exact lower bound of the functional (Fig. 4.1).
Figure 4.1: To the proof of Lemma 4.1: the contradictory situation for i > l

Hence i = l, and on any minimizing sequence I(vs) → i = l.

Lemma 4.2.2. Let there be a set E containing D, D ⊆ E, and functionals L(v) and I(v), defined on the sets E and D, respectively, that satisfy the
condition L(v) ≤ I(v), v ∈ D (in particular, it is possible that L(v) = I(v) for all v ∈ D; see Fig. 4.2). Then:

1° For any element v ∈ D the inequality I(v) ≥ l holds, where the number l is the exact lower bound of the functional L on E, l = inf{L(v) : v ∈ E}; moreover, i = inf_D I(v) ≥ l;

2° If, moreover, there is a sequence {vs} ⊂ D such that I(vs) → l, then the exact lower bounds of the functionals I and L coincide, i = inf_D I(v) = l; the sequence {vs} minimizes the functional I on D, and any sequence {vs} ⊂ D minimizing the functional I on D also minimizes the functional L on E.

Figure 4.2: Illustration for Lemma 4.2

Proof. Under the conditions of the lemma, the set E contains the set D, D ⊆ E. Since l is a lower bound of L on E and L(v) ≤ I(v) on D, the inequality l ≤ L(v) ≤ I(v) holds for all v ∈ D. Consequently, the exact lower bound i of the functional I, as the largest lower bound, satisfies the inequality i ≥ l, which proves assertion 1°. If condition 2° is satisfied, then by Lemma 4.1 i = l; in other words, the sequence {vs} minimizes I on D, I(vs) → i = l. By assertion 1°, the inequality L(vs) ≤ I(vs) holds ∀vs ∈ D, and at the same time l ≤ L(vs) ≤ I(vs) → i = l. This implies L(vs) → l, which proves assertion 2° of the lemma: a sequence {vs} ⊂ D that minimizes the functional I on the set D is also minimizing for the functional L on E.
4.3 Basic Minimal Theorems and minimizing sequences

Theorem 4.3.1. Let there be a function ϕ(t, x) ∈ Φ and an element v = (x(t), u(t)) ∈ D satisfying the conditions:

1°. R(t, x(t), u(t)) = μ(t) almost everywhere on [t0, t1];

2°. G(x0, x1) = m.

Then v minimizes the functional I on D (v is the minimal of the functional I on D).

Theorem 4.3.2 (generalized). Let there be a function ϕ(t, x) ∈ Φ and a sequence of elements vs = (xs(t), us(t)) ∈ D such that:

1°. ∫_{t0}^{t1} R(t, xs(t), us(t)) dt → ∫_{t0}^{t1} μ(t) dt as s → ∞, where μ(t) is defined almost everywhere on [t0, t1];

2°. G(x0s, x1s) → m > −∞ as s → ∞.
Then this sequence minimizes the functional I on D.

Remark. To satisfy condition 1° of Theorem 4.2, it is sufficient that the sequence R(t, xs(t), us(t)) be uniformly bounded from below and converge in measure to μ(t).

Theorem 4.3.3. Let there be a function ϕ(t, x) ∈ Φ and a sequence vs = (t1s, xs(t), us(t)) ∈ D such that:

1°. μ(t) = 0, t ∈ A;

2°. ∫_{t0}^{t1} R(t, xs(t), us(t)) dt → 0 as s → ∞;

3°. G(t1s, x0s, x1s) → m > −∞ as s → ∞.
Then this sequence minimizes the functional I on D.
In contrast to problems with fixed ends, for which the set A is specified as a fixed segment of the T axis, in the case of a moving boundary the set A is chosen optimally by changing its boundaries within the given limits. For example, the set A can be a segment whose right end, together with x(t1), varies within a given bounded set V1 ⊂ T × X such that t1 ≥ t0 for any of its points. The set A is then understood as a set containing all the segments [t0, t1] corresponding to the projection V1t of V1 onto T.

Proof of Theorem 4.2. Let us take a function ϕ(t, x) ∈ Φ and use it to construct the functional

L(v) = G(t1, x0, x1) − ∫_{t0}^{t1} R(t, x, u) dt,

which for any function ϕ(t, x) ∈ Φ is defined on D. For all v ∈ D the relation

L(v) ≥ l = m − ∫_{t0}^{t1} μ(t) dt

is obvious.
The two functionals coincide on the set D: L(v) ≡ I(v). Indeed, let (x(t), u(t)) ∈ D; then

R(t, x(t), u(t)) = (∂ϕ(t, x(t))/∂x)(dx/dt) + ∂ϕ(t, x(t))/∂t − f⁰(t, x(t), u(t)) = dϕ(t, x(t))/dt − f⁰(t, x(t), u(t)).

Substituting this expression into the functional L(v) and integrating, we obtain

L(v) = G(t1, x0, x1) − ϕ(t1, x1) + ϕ(t0, x0) + ∫_{t0}^{t1} f⁰(t, x(t), u(t)) dt = F(x0, x1) + ∫_{t0}^{t1} f⁰(t, x(t), u(t)) dt ≡ I(v).
Thus, the conditions of Lemma 4.2 are satisfied: the two functionals coincide on the set of admissible processes (I = L), and there exists a sequence on which the functional I converges to the lower bound l of the functional L. According to Lemma 4.2, this sequence minimizes the functional I on D and simultaneously the functional L. Theorem 4.1 on the minimal is the special case of Theorem 4.2 in which all elements of the sequences coincide: xs(t) = x(t), us(t) = u(t).
4.4 Krotov’s optimality principle

The optimality principle is a general approach to solving optimal control problems. Under the conditions of the minimal theorem, as the simplest case, Krotov’s optimality principle consists in the following: instead of directly finding the element v = (x(t), u(t)) that minimizes the functional I(v), one finds three functions (x(t), u(t), ϕ(t, x)) such that:

1°. For each fixed t ∈ [t0, t1] the function R(t, x, u) attains at the point (x(t), u(t)) its largest value among all (x, u) ∈ V(t), and the function G(x0, x1) attains its smallest value.

2°. The functions (x(t), u(t)) satisfy the differential system ẋ = f(t, x, u).

According to Theorem 4.1, such a pair of functions is the minimal of the functional I on D. Thus, Krotov’s optimality principle reduces the problem of minimizing the functional I on the class D of solutions of a system of differential equations to nonlinear programming problems in a finite-dimensional space: for the function R(t, x, u) at fixed values of t, and for the function G(x0, x1). Moreover, the sets of finite-dimensional spaces on which the nonlinear programming problems for R and G are solved are extensions of the original admissible set D; they arise as a result of the elimination of the differential constraints from the set of constraints defining D. Due to the appropriate choice of the function ϕ(t, x), through which the functions R and G are defined, the
solution of these nonlinear programming problems is approximated with any degree of accuracy by a sequence from D. The conditions of the theorem leave a certain arbitrariness in the choice of the function ϕ(t, x). By specifying a method for finding this function, we, in essence, are specifying a method for solving the optimal control problem. In particular, the solution procedures based on L. S. Pontryagin’s maximum principle and R. Bellman’s principle of optimality can be interpreted as some ways of specifying the Krotov function ϕ(t, x). This establishes a connection between the approach proposed by V. F. Krotov and these widespread methods. The arbitrariness in the choice of ϕ(t, x) can be used to construct the most convenient method for a given problem, taking into account its specifics.
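A toy sketch of this reduction in Python, under assumed data (f⁰ = x² + u², f = u, and the trial Krotov function ϕ ≡ 0, so that R(t, x, u) = −(x² + u²)): for each fixed t the maximization of R over (x, u) is an ordinary finite-dimensional problem, solved here by brute force on a grid.

import numpy as np

# assumed ingredients: f0 = x**2 + u**2, f = u, trial phi = 0,
# hence R(t, x, u) = phi_x*u - f0 + phi_t = -(x**2 + u**2)
def R(t, x, u):
    return -(x**2 + u**2)

xs = np.linspace(-2.0, 2.0, 401)
us = np.linspace(-2.0, 2.0, 401)
X, U = np.meshgrid(xs, us, indexing="ij")

for t in (0.0, 0.5, 1.0):
    vals = R(t, X, U)
    i, j = np.unravel_index(int(np.argmax(vals)), vals.shape)
    print(t, xs[i], us[j], vals[i, j])   # mu(t) and a maximizing pair for this trial phi

Whether a given trial ϕ is a good choice is judged by whether the resulting maximizers can be glued into an admissible process (or a minimizing sequence), exactly as the principle prescribes.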
4.5 Control questions 1. How, from the formal form of the equations of the process in the general formulation of the optimal control problem, to determine the state and control vectors, regardless of their designations? 2. Formulate a theorem on sufficient conditions of optimality for continuous processes (the minimal theorem). For what reason does it become necessary to turn to the generalized theorem? 3. How is the generalized theorem on sufficient conditions for optimality formulated? 4. What are the features of the optimal control problem with a moving boundary? Give an example of such a setting. 5. Formulate Krotov’s optimality principle.
5 Euler’s Problem

5.1 Formulation of the problem. Indicatrix

The Euler problem is the problem of the minimum of the functional

I[x(·), u(·)] = ∫_{t0}^{t1} f⁰(t, x, u) dt  (5.1)
on the set D of pairs of scalar functions (x(t), u(t)) related by the differential equation

ẋ = u  (5.2)
with the boundary conditions x(t0) = x0, x(t1) = x1, where x0, x1 are given numbers. In addition to Eq. (5.2), the elements of the set D are subject to the set-theoretic constraints listed in the general statement of the problem, namely, the function x(t) is continuous and piecewise differentiable, and u(t) is piecewise continuous. The function f⁰ is continuous and differentiable with respect to all its arguments. The Euler problem is distinguished from the general setting by the following conditions: n = r = 1, f(t, x, u) = u; the set of admissible controls Vu(t, x) for all t ∈ (t0, t1) coincides with the numerical axis U, the set of admissible states Vx(t) coincides with the numerical axis X, and for t = t0, t = t1 it consists of the given points x0, x1, respectively. The properties of the solution to this problem are largely determined by the nature of the dependence of the integrand f⁰(t, x, u) on the control u, which we will call the indicatrix (indicator function).
In what follows, we will use Theorems 4.1 and 4.2, which establish sufficient conditions for optimality. Moreover, the second conditions of the theorems in the Euler problem are certainly satisfied, since x0 , x1 are fixed points. According to Krotov’s optimality principle, it is required to find a function ϕ(t, x) and a sequence {xs (t), us (t)} ⊂ D such that the sequence R(t, xs (t), us (t)) is bounded from below and converges to its largest value for each t, R(t, xs (t), us (t)) → μ(t).
5.2 Indicatrices of various types: discontinuous solution, trunk, impulse and sliding modes

5.2.1 Linear indicatrix (there are no restrictions on state and control)

A discontinuous solution. Let the integrand have the form

f⁰(t, x, u) = a(t, x) + b(t, x)u.  (5.3)
Let us write down the corresponding function R:

R(t, x, u) = ϕx u − a(t, x) − b(t, x)u + ϕt.  (5.4)
If on the function ϕ(t, x), in the choice of which there is a certain arbitrariness, we impose the condition

ϕx − b(t, x) = 0,  (5.5)

then the function R(t, x, u) will not depend on the control u. In other words, the maximum of R with respect to the control u for fixed t and x will be attained for any value of the control u. From (5.5) we obtain

ϕ(t, x) = ∫ˣ b(t, ξ) dξ + c(t),  ϕt = ∫ˣ ∂b(t, ξ)/∂t dξ + ċ(t),  (5.6)

where c(t) is an arbitrary continuously differentiable function (an analog of the constant of integration). Taking relations (5.6) into account, we find

R(t, x) = −a(t, x) + ∫ˣ ∂b(t, ξ)/∂t dξ + ċ(t).  (5.7)
Suppose that there is at least one piecewise differentiable function x̄(t) such that R(t, x̄(t)) = μ(t), t ∈ (t0, t1). In the case of specially selected boundary conditions, when x̄(t0) = x0, x̄(t1) = x1, the pair of functions x(t) = x̄(t), u(t) = dx̄/dt satisfies the conditions of Theorem 4.1 and, therefore, determines the optimal trajectory and control. In the general case, when x̄(t0) ≠ x0, x̄(t1) ≠ x1, we obtain a discontinuous trajectory

x̃(t) = { x0, t = t0;  x̄(t), t ∈ (t0, t1);  x1, t = t1 }.

Due to the discontinuity, the trajectory x̃(t) does not belong to the set of admissible functions, so there is no minimum in the class of admissible functions, and the solution to the problem should be sought in the form of a minimizing sequence. To construct it, we need a uniformly bounded sequence {xs(t)} of continuous piecewise differentiable functions such that for any t from the interval (t0, t1) the convergence xs(t) → x̄(t) is realized, while on the boundaries of the interval xs(t0) = x0, xs(t1) = x1. Such a sequence of functions exists, since R(t, x̄) = μ(t), t ∈ (t0, t1). Then it is obvious that the sequence of pairs of functions {xs(t), us(t) = ẋs(t)} belongs to the class of admissible processes D and satisfies the conditions of Theorem 4.2, as a result of which it is a minimizing sequence for the functional I on D.

Remark. The following designations are accepted here: (x(t), u(t)) is the minimal; x̄(t) = arg max_{x∈Vx(t)} R(t, x); x̃(t) is the trajectory for which x̃(t0) = x0, x̃(t1) = x1, and ∀t ∈ (t0, t1): x̃(t) = x̄(t).
Example. Solve the Euler problem on the minimum of the functional

I[x(·), u(·)] = ∫₀^{1/2} [x²(x + 1) − 2(t − 1)² x u] dt → min

on the set of solutions of the differential equation ẋ = u with the boundary conditions x(0) = 1, x(1/2) = 0.

Let us construct the function R:

R(t, x, u) = ϕx u − x²(x + 1) + 2(t − 1)² x u + ϕt.

The function R depends on the control u linearly; we group the terms containing u:

R(t, x, u) = [ϕx + 2(t − 1)² x] u − x²(x + 1) + ϕt.

Let us use the available arbitrariness in specifying the function ϕ and choose this function in such a way that R does not depend on the control u; in other words, we put ϕx + 2(t − 1)² x = 0. Then ϕ(t, x) = −(t − 1)² x² + c(t), where c(t) is an arbitrary function of time (the constant of integration), and ϕt = −2(t − 1) x² + ċ(t). Substituting the obtained relations into the expression for the function R and setting ċ(t) = 0, we get

R(t, x) = −x²(x + 1) − 2(t − 1) x² = −x² [(x + 1) + 2(t − 1)].

We determine the stationary points of R with respect to the state x:

∂R/∂x = −3x² − 2x(2t − 1) = 0  ⟹  x̄1 = 0,  x̄2(t) = (2 − 4t)/3.
The first stationary trajectory x̄1(t) = 0 is a minimum point of the function R, since the second-order derivative at this point is positive:

∂²R/∂x² |_{x̄1=0} = −4t + 2 > 0,  t ∈ (0, 1/2).

The second stationary trajectory, x̄2(t) = (2 − 4t)/3, maximizes R on the same interval:

∂²R/∂x² |_{x̄2(t)=(2−4t)/3} = 4t − 2 < 0,  t ∈ (0, 1/2).
However, only one endpoint (the right end) lies on this trajectory. As a result, a discontinuous trajectory is obtained, which does not belong to the original class of admissible functions:

x̃(t) = { 1, t = 0;  (2 − 4t)/3, t ∈ (0, 1/2] }.

We construct a sequence {xs(t)} of continuous piecewise differentiable functions that converges to the resulting discontinuous trajectory x̃(t). As an initial approximation we take the linear function x0(t) = 1 − 2t; it passes through the given boundary points x(0) = 1 and x(1/2) = 0. The next element of the sequence is the piecewise linear function

x1(t) = { 1 − 3t, 0 ≤ t < 1/5;  (2 − 4t)/3, 1/5 ≤ t ≤ 1/2 }.

Gradually increasing the slope of the first approximating segment of the polyline, we obtain for an arbitrary number s:

xs(t) = { 1 − (s + 2)t, 0 ≤ t < 1/(3s + 2);  (2 − 4t)/3, 1/(3s + 2) ≤ t ≤ 1/2 }.
Figure 5.1: Discontinuous trajectory x̃(t) and its approximation by a sequence of continuous piecewise differentiable functions

The sequence of controls is determined directly from the differential equation, us(t) = ẋs(t); then

us(t) = { −(s + 2), 0 ≤ t < 1/(3s + 2);  −4/3, 1/(3s + 2) ≤ t ≤ 1/2 }.

The graphs of the corresponding dependencies are shown in Fig. 5.1. Obviously, as the number s increases, the sequence of trajectories xs(t) → x̄(t) for all t ∈ (t0, t1), while xs(0) = 1, xs(1/2) = 0. The sequence of function pairs {xs(t), us(t) = ẋs(t)} belongs to the class of admissible processes D and satisfies the conditions of Theorem 4.2, as a result of which it is a minimizing sequence for the functional I on D.
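To make the convergence tangible, here is a small numerical sketch in Python: it evaluates the functional of this example on the elements xs, us written out above and watches the values settle as s grows (the grid size is an implementation detail, not part of the construction).

import numpy as np

def I_of_s(s, N=400001):
    t = np.linspace(0.0, 0.5, N)
    tb = 1.0 / (3.0 * s + 2.0)                        # breakpoint of the polyline
    x = np.where(t < tb, 1.0 - (s + 2.0) * t, (2.0 - 4.0 * t) / 3.0)
    u = np.where(t < tb, -(s + 2.0), -4.0 / 3.0)      # u_s = dx_s/dt
    g = x**2 * (x + 1.0) - 2.0 * (t - 1.0)**2 * x * u
    return float(np.sum(0.5 * (g[1:] + g[:-1]) * np.diff(t)))

for s in (1, 5, 20, 100):
    print(s, I_of_s(s))            # the values should settle toward the infimum of I on D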
5.2.2 Linear indicatrix (problem with constraints on state and control). Trunk mode

The set of admissible states Vx(t) does not coincide with the number axis. It is determined by the set of all trajectories satisfying differential equation (5.2), the boundary conditions, and additional inequality constraints

Γ1(t) ≤ x(t) ≤ Γ2(t).  (5.8)
If, in this case, no constraints are imposed on the controls, then the only change in solving this problem in comparison with the problem without constraints is that the optimal trajectory x̄(t) is constructed proceeding from the condition

x̄(t) = arg max_{Γ1(t) ≤ x ≤ Γ2(t)} R(t, x),

where the function R(t, x) is given by relation (5.7), and the maximum is sought on the set of admissible states, taking into account the additional inequality-type constraints.

Let us now consider the general case: there are constraints on both state and control. The area of admissible controls is Vu(t, x(t)) : {u1(t, x) ≤ u(t) ≤ u2(t, x)}, where u1(t, x), u2(t, x) are given continuous functions. In the case when x0 = x̄(t0), x1 = x̄(t1) and the condition u1(t, x) ≤ dx̄(t)/dt ≤ u2(t, x) holds, the process {x(t) = x̄(t), u(t) = dx̄(t)/dt} is still admissible and optimal by virtue of Theorem 4.1. However, if x0 ≠ x̄(t0) or x1 ≠ x̄(t1), the process is not only inadmissible due to its discontinuity, but also cannot be approximated by a sequence of feasible trajectories, as was done earlier. Indeed, the elements of the approximating sequence in the sections adjacent to the boundary points t = t0 and t = t1 would have to have an unboundedly growing slope, us(t) = ẋs(t) → ∞ as s → ∞, which violates the additional constraints on the control.

Consider four Cauchy problems for differential equation (5.2) that arise when it is closed by the lower and upper boundaries of the set of admissible controls, u = u1 and u = u2, respectively. Two of these problems are solved in the forward direction of time, from left to right. Their solutions satisfying the initial condition at the left end, x(t0) = x0, will be denoted by γ10(t), γ20(t). Two more problems are solved in the opposite direction, from right to left. The solutions of these Cauchy problems with initial conditions at the right end, x(t1) = x1, will be denoted by γ11(t), γ21(t) (Fig. 5.2).
Figure 5.2: The nature of the optimal trajectory in the Euler problem with a linear indicatrix and constraints on state and control

Any admissible trajectory x(t) outgoing from the boundary point x(t0) = x0 given at the left endpoint does not exceed the solution γ20(t); in other words,

x(t) ≤ γ20(t).  (5.9)

Suppose the opposite. Then a trajectory x(t) that violates this condition and intersects the curve γ20(t) at t = τ would, at some moment τ̄ ∈ [t0, τ), have a slope exceeding the upper bound for the slopes of the admissible trajectories:

ẋ(τ̄) = u(τ̄, x) > γ̇20(τ̄) = u2(τ̄, γ20(τ̄)),

which is impossible due to the control constraints. Similarly, one can prove the validity of the inequalities

x(t) ≥ γ10(t),  x(t) ≥ γ21(t),  x(t) ≤ γ11(t)  (5.10)
for any admissible trajectory x(t). In the problem with constraints on the controls, relations (5.9)-(5.10) define the set of attainable states, or the set of “attainability" (Fig. 5.2). So, in the presence of inequality-type constraints on states and control, the set of admissible states can be “narrowed" to the set Vx (t) bounded, apart from the a priori constraints, by inequalities (5.9)-(5.10). The optimal trajectory x(t) in this problem consists of three sections. Two of them are pieces of the boundaries γ(t) that correspond
to the limit controls u = u1, u = u2, adjacent to the ends t0, t1. The third section is internal; it coincides with x̄(t). The trajectory x(t) “glued" in this way is continuous, in contrast to x̃(t), and is admissible provided the inequality u1(t, x) ≤ ẋ(t) ≤ u2(t, x) holds. Such a trajectory is called a trunk (highway) trajectory. By analogy with car traffic, one should first enter the “expressway" using the maximum possible control, then move along the highway x̄(t), and finally get off it in order to reach the given destination. We find the optimal control in this problem directly from the differential equation, as the first-order derivative u(t) = dx(t)/dt.

In the next two examples, we compare the solutions of two Euler problems differing only in the additional constraints on state and control: in the first problem there are no additional constraints on the state and control, while in the second problem, on the contrary, inequality-type constraints are imposed both on the state and on the control.

Example 1. It is required to find a process that minimizes the functional

I[x(·), u(·)] = ∫₀¹ (x + 16 t x² u) dt → min_{(x,u)∈D}
on the admissible set D : ẋ = u, x(0) = x(1) = 0.

Let us compose the function R:

R(t, x, u) = ϕx u − x − 16 t x² u + ϕt = (ϕx − 16 t x²) u − x + ϕt.

Using the arbitrariness in the choice of the function ϕ(t, x), we define it in such a way that the function R does not depend on the control u. To do this, we put ϕx = 16 t x²; then

ϕ(t, x) = (16/3) t x³ + c(t),  ϕt = (16/3) x³ + ċ(t),

and we set ċ(t) = 0. As a result, the function R depends only on the state x:

R(t, x) = (16/3) x³ − x.
Let us find the stationary points of the function R:

∂R/∂x |_{x̄} = −1 + 16 x² |_{x̄} = 0  ⟹  x̄1,2 = ±1/4.

Their character is determined by the sign of the second derivative of the function R. The stationary point x̄1 = +1/4 is a minimum point, the point x̄2 = −1/4 corresponds to a maximum:

∂²R/∂x² |_{x̄1,2} = 32 x |_{x̄1,2} = { 8 at x̄1 = 1/4 (minimum of R);  −8 at x̄2 = −1/4 (maximum of R) }.

The minimum condition for the second construction, the function G appearing in the theorem on sufficient optimality conditions, is satisfied trivially in the problem with fixed ends. If the boundary conditions were x(0) = x(1) = −1/4, then by virtue of Theorem 4.1 the trajectory x̄(t) = −1/4 and the corresponding control ū(t) = 0 would be optimal. However, the boundary points do not lie on the trajectory x̄(t) = −1/4. As a result, we get a discontinuous trajectory

x̃(t) = { 0, t = 0;  −1/4, t ∈ (0, 1);  0, t = 1 },

which does not belong to the original set of admissible processes and should be approximated by a sequence of continuous piecewise differentiable functions. Let us construct a minimizing sequence in the following way: each element of the sequence is a polyline consisting of three links. The two extreme links adjoin the boundary points to ensure the fulfillment of the boundary conditions, and the middle link coincides with the trajectory x̄(t) = −1/4. The first element of the sequence, s = 1:

x1(t) = { −t, 0 ≤ t < 1/4;  −1/4, 1/4 ≤ t < 3/4;  t − 1, 3/4 ≤ t ≤ 1 },
u1(t) = { −1, 0 ≤ t < 1/4;  0, 1/4 ≤ t < 3/4;  1, 3/4 ≤ t ≤ 1 }.
The second element of the sequence, s = 2:

x2(t) = { −2t, 0 ≤ t < 1/8;  −1/4, 1/8 ≤ t < 7/8;  2(t − 1), 7/8 ≤ t ≤ 1 },
u2(t) = { −2, 0 ≤ t < 1/8;  0, 1/8 ≤ t < 7/8;  2, 7/8 ≤ t ≤ 1 }.

Continuing this process, we obtain for an arbitrary number s

xs(t) = { −st, 0 ≤ t < (4s)⁻¹;  −1/4, (4s)⁻¹ ≤ t < 1 − (4s)⁻¹;  s(t − 1), 1 − (4s)⁻¹ ≤ t ≤ 1 },
us(t) = { −s, 0 ≤ t < (4s)⁻¹;  0, (4s)⁻¹ ≤ t < 1 − (4s)⁻¹;  s, 1 − (4s)⁻¹ ≤ t ≤ 1 }.

Fig. 5.3 gives an idea of the nature of the minimizing sequence by means of which the discontinuous trajectory is approximated. The sequence of trajectories {xs(t)} converges to the discontinuous function x̃(t) = lim_{s→∞} xs(t), on which the function R reaches its maximum value in the state x. Moreover, the sequence {R(t, xs(t))} is uniformly bounded from below and

∀t ∈ (t0, t1): lim_{s→∞} R(t, xs(t)) = R(t, x̄(t)) = max_{x∈Vx(t)} R(t, x) = μ(t).
According to the generalized Theorem 4.2, the constructed sequence {xs(t), us(t)} is minimizing for the functional I on the set D. Note that the sequence of controls us(t) converges to a generalized function (the Dirac δ-function).

Example 2. Find a process that minimizes the functional

I[x(·), u(·)] = ∫₀¹ (x + 16 x² u) dt → min_{(x,u)∈D}

on the set D : ẋ = u, |u(t)| ≤ 2, x(0) = x(1) = 0.
Figure 5.3: Building a minimizing sequence {xs(t), us(t)}

Let us narrow down the set of admissible states by constructing the set of reachability. To do this, we find solutions to the differential equation of the process, closing it sequentially with the lower and upper control limits. In other words, we solve two Cauchy problems for the equations ẋ = ±2, first in forward time, from left to right, for the initial point coinciding with the left end of the interval, x(0) = 0, and then two more Cauchy problems in the opposite direction, from right to left, for the point coinciding with the right endpoint, x(1) = 0. As a result, we get four boundaries: γ10(t) = −2t, γ20(t) = 2t, γ11(t) = 2(t − 1), γ21(t) = −2(t − 1). The optimal process follows the dependencies

x̄(t) = { −2t, 0 ≤ t < 1/8;  −1/4, 1/8 ≤ t < 7/8;  2(t − 1), 7/8 ≤ t ≤ 1 },
ū(t) = { −2, 0 ≤ t < 1/8;  0, 1/8 ≤ t < 7/8;  2, 7/8 ≤ t ≤ 1 }.

The corresponding trajectory and control plots are shown in Fig. 5.4. Thus, the optimal process in the second problem, with a linear indicatrix and a control constraint, coincides with the element of the minimizing sequence of the first problem for s = 2. The introduction of one more constraint on the state, for example, x(t) ≥ −t, leads to the appearance of an additional boundary in Fig. 5.4 (highlighted in red).
Figure 5.4: Optimal solution in a problem with a linear indicatrix and constraints on control and state
As a result, the optimal process is determined by the relations

x(t) = { −t, 0 ≤ t < 1/4;  −1/4, 1/4 ≤ t < 7/8;  2(t − 1), 7/8 ≤ t ≤ 1 },
u(t) = { −1, 0 ≤ t < 1/4;  0, 1/4 ≤ t < 7/8;  2, 7/8 ≤ t ≤ 1 }.
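A quick numerical comparison in Python of the two trunk processes, without and with the extra state constraint x ≥ −t, by evaluating I = ∫₀¹ (x + 16x²u) dt on each (the grid size is an implementation detail):

import numpy as np

t = np.linspace(0.0, 1.0, 200001)

def I_val(x, u):
    g = x + 16.0 * x**2 * u
    return float(np.sum(0.5 * (g[1:] + g[:-1]) * np.diff(t)))

# optimal process of Example 2 (no extra state constraint)
x1 = np.where(t < 1/8, -2*t, np.where(t < 7/8, -0.25, 2*(t - 1)))
u1 = np.where(t < 1/8, -2.0, np.where(t < 7/8, 0.0, 2.0))

# optimal process under the additional constraint x >= -t
x2 = np.where(t < 1/4, -t, np.where(t < 7/8, -0.25, 2*(t - 1)))
u2 = np.where(t < 1/4, -1.0, np.where(t < 7/8, 0.0, 2.0))

print(I_val(x1, u1), I_val(x2, u2))   # the constrained process cannot do better

Since the constrained set of processes is a subset of the unconstrained one, the second value can only be greater than or equal to the first.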
5.2.3 Indicatrix with limited nonlinearity. Pulse sliding mode

The integrand has the form f⁰(t, x, u) = g⁰(t, x, u) + h⁰(t, x)u, where g⁰(t, x, u) is a function bounded in u for any fixed t, x (Fig. 5.5). Let us construct the function R:

R(t, x, u) = [ϕx − h⁰(t, x)] u − g⁰(t, x, u) + ϕt.

We define the function ϕ(t, x) by the equality

ϕx = h⁰(t, x).  (5.11)
Then

ϕ(t, x) = ∫_h^x h⁰(t, ξ) dξ + c(t),  (5.12)

ϕt = ∫_h^x ∂h⁰(t, ξ)/∂t dξ + ċ(t).  (5.13)

Figure 5.5: Indicatrix with limited nonlinearity

Substituting (5.11), (5.13) into the expression for R(t, x, u) and setting c(t) ≡ 0, we get

R(t, x, u) = −g⁰(t, x, u) + ∫_h^x ∂h⁰(t, ξ)/∂t dξ.  (5.14)
Suppose there is at least one pair of functions (x̄(t), ū(t)) such that

R(t, x̄(t), ū(t)) ≡ μ(t),  (5.15)

where x̄(t) is piecewise differentiable and ū(t) is piecewise continuous, but

d x̄(t)/dt ≠ ū(t).

Due to this, the pair of functions (x̄(t), ū(t)) does not belong to the admissible set.
Construction of an approximating sequence. Let us fix an integer s > 0 and assign to it a pair of functions (xs(t), us(t)) ∈ D constructed as follows. The segment [t0, t1] is split by points t0 < τ1 < τ2 < ... < τs < t1 containing all the breakpoints of the pair (x̄(t), ū(t)); moreover, Δ = max(τ1 − t0, τ2 − τ1, ..., τs − τs−1) → 0 as s → ∞. Each elementary interval [τp, τp+1], p = 1, 2, ..., s − 1, is divided by a point τp′ into two unequal parts, [τp, τp′] and [τp′, τp+1], with τp′ = τp+1 − (1/s²)(τp+1 − τp). The lengths of the subintervals [τp, τp′] are quantities of the order of 1/s, while the lengths of [τp′, τp+1] are quantities of the order of 1/s².
x(τ1 ) − x0 (t − t0 ), t0 t < τ1 τ1 − t0
xs (t) = x1 +
x(τs ) − x1 (t − t1 ), τs t t1 , τs − t1
and
which pass through the specified ends of the path. On the first subinterval [τp , τp ], the function xs (t) is given by a line segment passing through the point (τp , x(τp )) with the angular coefficient u(τp ). On the second subinterval [τp , τp+1 ], the approaching line is drawn through the point (τp+1 , x(τp+1 )). The slope of the straight line in this interval is determined by the ratio of the increments x (τp+1 )− x(τp ) . As a result, ∀τp t < τp+1 the function xs (t) follows τp+1 −τp formula ⎧ ⎪ x(τp ) + u(τp )(t − τp ) ∀τp t < τp , ⎪ ⎪ ⎪ ⎪ ⎨ t−τp+1 x(τp+1 ) + τp+1 −τ x(τp+1 ) − x(τp ) − u(τp )(τp − τp ) p xs (t) = ⎪ ⎪ − x(τp ) ⎪ ⎪ ⎪ ⎩ ∀τ t < τ . p
p+1
We define the function us(t) by the formula us(t) = ẋs(t). A graphic representation of the broken line xs(t) is shown in Fig. 5.6. For any fixed s, the pair of functions (xs(t), us(t)) satisfies all conditions of the class D: the function xs(t) is continuous, piecewise differentiable and satisfies the boundary conditions, while the function us(t) is piecewise continuous and coincides by definition with the derivative ẋs(t). Obviously, lim_{s→∞} xs(t) = x̄(t) on (t0, t1). Moreover, on any segment within each interval the continuous xs(t) converges to x̄(t) uniformly. Uniform convergence of us(t) to ū(t) on the segment [t0, t1] does not take place, since the values of us(t) on the segments [τp′, τp+1] increase unboundedly. However, the total length of the segments [τp′, τp+1], as well as of the segments [t0, τ1] and [τs, t1], tends to zero, which ensures the convergence in measure of us(t) to ū(t) on [t0, t1] as s → ∞.
By virtue of (5.15), the sequence

R(t, xs(t), us(t)) → R(t, x̄(t), ū(t)) ≡ μ(t) as s → ∞;

moreover, on this sequence the function R remains uniformly bounded from below due to the peculiarities of the construction of xs(t) and the boundedness in the control of the function g⁰(t, x, u). Thus, according to the generalized Theorem 4.2, the constructed sequence (xs(t), us(t)) is minimizing.

Figure 5.6: Construction of a minimizing sequence. Pulse sliding mode

Here, as in the case of a discontinuous solution, there is no minimum in the class of admissible processes D. The sequence of phase trajectories {xs(t)} converges to the function x̄(t), while the sequence of its derivatives {ẋs(t) = us(t)} converges not to d x̄(t)/dt but to some function
ū(t) ≠ d x̄(t)/dt, and the intervals where ẋs(t) is close to ū(t) alternate infinitely often with infinitely large impulses. It is said that a sequence of this design tends to a pulse sliding mode. This mode is completely specified by the pair of functions (x̄(t), ū(t)). The first function, x̄(t), is called the zero proximity function, and the second, ū(t), is the basic control. The algorithm for solving the problem is to find the functions (x̄(t), ū(t)). For this purpose, for each fixed t ∈ (t0, t1) the function R(t, x, u) given by formula (5.14) is investigated for its maximum.
5.2.4 Nonconvex indicatrix. Sliding mode

The integrand f⁰(t, x, u) can be represented, as above, in the form f⁰(t, x, u) = g⁰(t, x, u) + h⁰(t, x)u, but now the function g⁰(t, x, u) for all t, x has at least two minima in u, at the points u0(t, x) and u1(t, x):

g⁰(t, x, u0(t, x)) = g⁰(t, x, u1(t, x)) = min_u g⁰(t, x, u),

where the functions u0(t, x) and u1(t, x) are continuous. A similar situation is possible only if the indicatrix is nonconvex (Fig. 5.7). We define the function ϕ(t, x), as in the previous section, by formulas (5.11)-(5.13). The function R(t, x, u) is then defined by relations (5.14).

Figure 5.7: Nonconvex indicatrix

The function R(t, x, u) thus constructed has a maximum with respect to the control u at the points u0(t, x) and u1(t, x). Let us denote the largest value of the function R(t, x, u)
with respect to the control by P(t, x):

P(t, x) = max_u R(t, x, u) = R(t, x, u0) = R(t, x, u1).

Let there be a continuous piecewise differentiable function x̃(t) satisfying the conditions

P(t, x̃) = max_x P(t, x) = μ(t),  (5.16)

x̃(t0) = x0,  x̃(t1) = x1,  (5.17)

u0(t, x̃(t)) < d x̃(t)/dt < u1(t, x̃(t)).  (5.18)
Then the solution to the problem is a minimizing sequence {xs(t), us(t)} ⊂ D. To construct it, the segment [t0, t1] is divided into s parts by points t0 = τ1 < τ2 < ... < τs = t1. On each interval [τp, τp+1], an element of the sequence xs(t) is determined by the equations

xs(t) = { x̃(τp) + ũ0(τp)(t − τp), τp ≤ t < τp′;  x̃(τp+1) + ũ1(τp)(t − τp+1), τp′ ≤ t < τp+1 }.  (5.19)

Here ũ0,1(t) = u0,1(t, x̃(t)), and τp′ is the abscissa of the intersection point of the straight line segments in (5.19). For a sufficiently small interval [τp, τp+1], the point τp′ lies inside this interval by virtue of (5.18). We set us(t) = ẋs(t); in other words, on each interval [τp, τp+1], the control us(t) is a relay function:

us(t) = { ũ0(τp), τp ≤ t < τp′;  ũ1(τp), τp′ ≤ t < τp+1 }.  (5.20)

For any fixed s, the pair of functions (xs(t), us(t)) ∈ D. Let the sequence of partitions be such that max_p(τp+1 − τp) → 0 as s → ∞; then xs(t) → x̃(t) uniformly on [t0, t1], and by (5.16) the convergence R(t, xs(t), us(t)) → μ(t) is also uniform on [t0, t1].
Therefore, the sequence {xs(t), us(t)} is minimizing. It is said that this sequence tends to a sliding mode with the zero proximity function x̃(t) and the basic controls ũ0(t), ũ1(t). Note that, according to conditions (5.17), the boundary points must lie on the line of zero proximity. In the general case this condition is not met, so the solution may contain, along with sections of the sliding mode, sections of other types.

Example 1. The problem of the minimum of the functional

I[x(·), u(·)] = ∫₀¹ (x² + cos u) dt

under the conditions ẋ = u, |u| ≤ 4, x(0) = x(1) = 0.

Let us construct the function R: R(t, x, u) = ϕx u − x² − cos u + ϕt. We assume that ϕ(t, x) ≡ 0; then R(t, x, u) = −x² − cos u. Due to the periodic nature of the function R and the boundedness of the set of admissible controls, |u| ≤ 4, R reaches its maximum value in the control at two points: u0(t, x) = +π and u1(t, x) = −π. As a result, P(t, x) = 1 − x². For each fixed t ∈ (0, 1), the function P(t, x) attains its maximum in the state when x̃(t) = 0. Obviously, the boundary points lie on the trajectory x̃(t), and it can play the role of the zero proximity line. Let us check the fulfillment of the condition

u0(t, x̃(t)) < d x̃(t)/dt < u1(t, x̃(t))  ⟹  −4 < 0 < +4.

Therefore, the solution to the problem is a sequence that approximates the sliding mode with the zero proximity line x̃(t) = 0 and the basic controls u0(t, x) = +π and u1(t, x) = −π. To construct it, the interval [0, 1] is divided into s parts by points τ0 = 0, τ1, ..., τs = 1 (Fig. 5.8). On each elementary interval, solutions of the differential equation ẋ = u are constructed for u = ±π; they are drawn through the points (τp−1, x(τp−1) = 0), (τp, x(τp) = 0) until the intersection.
Figure 5.8: Sliding mode. Building a minimizing sequence

The points of intersection are at a distance of πΔp/2 from the abscissa axis, where Δp = τp − τp−1. As s → ∞, max_p Δp → 0, xs(t) → 0, and f⁰ → −1. Thus,

I(xs(t), us(t)) → inf_D I = −1.
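A numerical sketch of this construction in Python: xs is taken as a triangular wave with slopes ±π and period 1/s, so that cos u = −1 on both branches, and I[xs, us] is evaluated by quadrature; the values should approach −1 from above as s grows.

import numpy as np

def I_of_s(s, N=200001):
    t = np.linspace(0.0, 1.0, N)
    phase = (t * s) % 1.0                          # position within an elementary interval
    x = (np.pi / s) * np.where(phase < 0.5, phase, 1.0 - phase)   # slopes are +pi, -pi
    g = x**2 + np.cos(np.pi)                       # cos u = -1 on both control branches
    return float(np.sum(0.5 * (g[1:] + g[:-1]) * np.diff(t)))

for s in (2, 8, 32):
    print(s, I_of_s(s))                            # tends to -1 as the zigzag is refined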
Example 2. Zermelo's navigation problem [17]. The task is to sail against the wind in the middle of a river from point A to point B, with B lying downstream (Fig. 5.9). To begin with, let us ignore the current and assume that the river is wide enough. Every sailing enthusiast knows that one should proceed by changing tack, for example, sail along the broken line AXB, where AX and XB form with the side AB one and the same angle α, which should be chosen in an optimal way. On the path AXB it is indeed possible to use the wind to sail against it, in both positions at the most favorable angle α. If the river is wide and there is no current, then the path from A to B will be covered in the minimum time. This is not the only solution: it is possible to change tack more than once. For example, one can move along any of the zigzags made up of 4, 8, or 16 segments (Fig. 5.9). When moving along each of the indicated zigzags, the time spent moving from A to B will be the same as when moving along the broken line AXB, unless, of course, the delays in changing direction
are taken into account.

Figure 5.9: Variable tack implementing a sliding mode

A similar conclusion is valid for a zigzag consisting of 2n segments, each of which is parallel to AX or XB. As n grows, the ship deviates from AB on such a zigzag as little as desired. When a current is present, it is best to move changing tack very often, so as to remain close to AB at all times. Thus, there is an ideal solution, to be distinguished from the segment AB itself: a sliding mode, a zigzag with an infinite number of links, which approximates the zero proximity function AB. The role of the basic controls is played by the angles ±α, which at each point ensure parallelism to AX or XB.

A sliding mode is also implemented in the device called an “autopilot". The autopilot can make turns and changes in the ship's course by a given value. As soon as a signal is received from the autopilot sensor, the steering gear shifts the rudder by a predetermined angle to the side opposite to the ship's departure from course. When the ship begins to return to the previous course, the autopilot retracts the rudder and then shifts it to the side opposite to the previous one.

Example 3. Maxwell's problem of climbing a mountain [17]. Skiers are well aware that to climb a steep slope it is necessary to step in skating style, placing the skis at an angle to the main direction
of movement. You have to do the same when going up a steep hill without skis, so as not to slip. A similar problem is faced by an engineer constructing a highway in the mountains when looking for the best way from point A to peak B. According to the solution proposed by Maxwell, when climbing the mountain one should adhere to a path that circles the mountain with an optimal slope α until the terrain levels out, so that the rest of the path can be taken along a geodesic line to the top B. Climbers, skiers and road builders have come to a more reliable solution for finding the best way up, one that guarantees the ascent to peak B and not to any other peak. First, it is necessary to draw a geodesic G from A to B and then move along the geodesic where the slope of the mountain is not very large. Where the angle of inclination exceeds α, one should move not along the geodesic line itself but in short steep zigzags along the shortest path (Fig. 5.10).

Figure 5.10: Illustration for the problem of determining the optimal path when climbing a mountain

Thus, road builders replace a steep section with a geodesic zigzag of short straight sections connected by sharp bends. In this case, the ascent takes the same time as when moving along the geodesic line G.

Example 4. The problem of reaching escape velocity: cyclic sliding modes [9]. Let it be required to transfer a spacecraft, controlled by a rocket engine, from a circular near-planetary orbit to a
parabolic orbit touching it, with the least fuel consumption (Fig. 5.11).

Figure 5.11: Illustration for the problem of reaching escape velocity

The solution to this problem is a sequence which is constructed as follows. A certain neighborhood of the point of intersection of the circular orbit with the axis of the parabola is selected. Further, it is required that when entering this neighborhood the engine thrust is switched on in the direction of motion, and outside it remains off. As a result, a family of transitions to more and more elongated orbits arises, which ends with a transition to the parabolic orbit. The smaller the selected neighborhood, the closer the functional is to its smallest value, the difference between the parabolic and circular velocities at the point of tangency, but the larger the number of elementary transitions and the total maneuver time, which increases indefinitely in the sequence under consideration. Such modes are typical of many problems of optimization of space maneuvers; they are called cyclic sliding modes [9].

Remark. Sufficient optimality conditions for the Euler problem with a convex indicatrix will be considered in Ch. 8.
5.3 Test questions and exercises 1. What is the typical structure of the optimal solution to the Euler problem with a linear indicatrix and control constraints?
2. Can a linear control problem with control constraints fail to have a solution?

3. Explain the concept of a “reachable set".

4. How will the solution of the Euler problem in the example from Section 5.2.1 (see p. 96) change if the boundary conditions are set in the form x(0) = 2/3, x(1) = 0?

5. For the example from Section 5.2.1 (see p. 96), write down the Euler–Lagrange differential equation. Analyze the results and compare them with the solution obtained on the basis of Krotov's sufficient optimality conditions.

6. Solve the Euler problem with a linear indicatrix and constraints on state and control:

I = ∫₀⁴ (8(t + 1)x² + 8tx²u) dt → max,  ẋ = u,  x(0) = 2,  x(4) = 8,  |u| ≤ 5,  2 ≤ x ≤ 10.
7. Solve the Euler problem on the minimum of the functional

I[x(·), u(·)] = ∫₀¹ (1 + x²) [1 + (ẋ² − 1)²] dt → min

under the conditions x(0) = x(1) = 0. Compare the values of the functional for the smooth solution and the sliding mode.
6 The Lagrange–Pontryagin Method For Continuous Controlled Processes

6.1 Formulation of the problem

The process dynamics follow a differential system

dxi/dt = fi(t, x, u),  i = 1, 2, ..., n.  (6.1)
Here x = (x1, x2, ..., xn) is an n-dimensional state vector and u = (u1, u2, ..., ur) is an r-dimensional control vector. The constraints on the control are defined: u(t) ∈ U(t), where U(t) is some range of possible values of the control, which can change in time. The initial state of the system is specified in the form of the conditions xi(t0) = xi0, i = 1, 2, ..., n. At the final moment of time, xi(t1) = xi1, i = 1, 2, ..., m ≤ n; that is, at the right end of the interval, restrictions may be specified not for all state variables but only for some, in this case for the first m coordinates. In other words, a problem with a partially fixed right end is considered. The formulated constraints define the set of admissible processes D. The quality of the process is assessed by the functional

I[x(·), u(·)] = ∫_{t0}^{t1} f⁰(t, x, u) dt + F(x(t1)) → min.  (6.2)
If the right end of the trajectory is completely fixed (m = n), then the terminal function F (x(t1 )) is constant and does not affect the optimal solution.
It is required to define a process (x(t), u(t)) ∈ D that minimizes functional I on the set D. It is assumed that all set-theoretic constraints on ϕ, x, u, fi , f 0 , F are satisfied, so that all subsequent expressions are defined; in particular, the trajectory x(t) is a continuous function, and the control u(t) is piecewise continuous. This problem stands out from the general setting in that: there are no restrictions on the states of the system - except for boundary points, there are no other restrictions on x(t); set of admissible controls Vu (t) = U (t) does not depend on states x.
6.2 Equations of the Lagrange–Pontryagin method

Let (x̄(t), ū(t)) ∈ D be an admissible process satisfying Krotov's minimal theorem. In other words, there is a function ϕ(t, x) with properties such that the expression

R(t, x, u) = ∂ϕ/∂t + Σ_{i=1}^{n} (∂ϕ/∂xi) fi(t, x, u) − f⁰(t, x, u)  (6.3)
reaches, for any t ∈ [t0, t1], its maximum at the vector pair (x̄(t), ū(t)), and the function

G(x0, x1) = F(x0, x1) + ϕ(t1, x1) − ϕ(t0, x0)  (6.4)

takes its smallest value at x(t1) = x̄(t1). If the ends of the trajectory are completely fixed, then the condition for the minimum of the function G with respect to x is satisfied trivially, since in this case G = const. We introduce the conjugate vector function

ψi(t) = ∂ϕ(t, x)/∂xi |_{x̄(t)},  i = 1, 2, ..., n.

In other words, for each fixed value of t, the vector function ψ(t) = (ψ1(t), ψ2(t), ..., ψn(t)) is the gradient of the function ϕ(t, x) calculated along the optimal trajectory x̄(t).
Let us introduce the Hamilton function (Hamiltonian):

H(t, ψ, x, u) = Σ_{i=1}^{n} ψi(t) fi(t, x, u) − f⁰(t, x, u).  (6.5)
Using the Hamiltonian, the function R(t, x, u) can be written in the form

R(t, x, u) = ∂ϕ/∂t + H(t, ∂ϕ/∂x, x, u).  (6.6)
By assumption, the process (x̄(t), ū(t)) satisfies Theorem 4.1, so for any t ∈ [t0, t1]

R(t, x̄(t), ū(t)) ≥ R(t, x, u)  (6.7)

for all (x, u) ∈ V(t). This implies that a similar inequality also holds on any subset of the set V(t); in particular,

R(t, x̄(t), ū(t)) ≥ R(t, x̄(t), u(t))  (6.8)

for all admissible controls u(t) ∈ U(t). Since in expression (6.6) the partial derivative ϕt = ∂ϕ/∂t does not depend on the control u, it follows from (6.8) that

H(t, ψ(t), x̄(t), ū(t)) ≥ H(t, ψ(t), x̄(t), u(t)).  (6.9)

In other words, for each fixed t ∈ [t0, t1] the Hamilton function H reaches its maximum value on the control u(t) = ū(t). This result can be expressed in the following form:

H(t, ψ(t), x̄(t), ū(t)) = max_{u(t)∈U(t)} H(t, ψ(t), x̄(t), u(t)),

or

ū(t, x̄(t), ψ(t)) = arg max_{u(t)∈U(t)} H(t, ψ(t), x̄(t), u(t)).  (6.10)
By virtue of the original assumption, the function R(t, x, u) also reaches its maximum in the state x. Taking into account that the set
of admissible states Vx(t) is not limited, the necessary condition for the extremum of the function R in x can be represented as

∂R(t, x, u)/∂xj |_{x̄(t)} = 0,  j = 1, 2, ..., n,

or, in expanded form,

[ ∂²ϕ/∂t∂xj + Σ_{i=1}^{n} (∂²ϕ/∂xi∂xj) fi(t, x, u) + Σ_{i=1}^{n} (∂ϕ/∂xi)(∂fi/∂xj) − ∂f⁰/∂xj ]_{x̄(t)} = 0.  (6.11)

In the left-hand side of expression (6.11), we single out the terms containing the second-order derivatives of the function ϕ:

[ ∂²ϕ/∂t∂xj + Σ_{i=1}^{n} (∂²ϕ/∂xi∂xj) fi(t, x, u) ]_{x̄(t)} = [ ∂/∂xj ( ∂ϕ/∂t + Σ_{i=1}^{n} (∂ϕ/∂xi)(dxi/dt) ) ]_{ẋ=f(t,x,u), x̄(t)} = d/dt ( ∂ϕ/∂xj )|_{x̄(t)} = dψj(t)/dt.

Two more terms remain in expression (6.11):

[ Σ_{i=1}^{n} (∂ϕ/∂xi)(∂fi/∂xj) − ∂f⁰/∂xj ]_{x̄(t)} = [ Σ_{i=1}^{n} ψi(t) (∂fi/∂xj) − ∂f⁰/∂xj ]_{x̄(t)} = ∂H(t, ψ, x, u)/∂xj |_{x̄(t)}.

The resulting relationship takes the form

dψj(t)/dt = −∂H(t, ψ(t), x̄(t), ū(t))/∂xj,  j = 1, 2, ..., n.  (6.12)

Thus, we have found the necessary conditions (6.10), (6.12) that the function ϕ(t, x), represented by its gradient ψ(t), must satisfy together with the pair (x̄(t), ū(t)) minimizing the functional (6.2).
6.3 Transversality conditions

It follows from Theorem 4.1 that the function G(x) must attain a minimum over the states. Since at the left end there are n initial conditions, and at the right end m ≤ n conditions, the function G depends on n − m free variables: x_{m+1}(t1), x_{m+2}(t1), ..., xn(t1). There are no restrictions on these variables; therefore, by the necessary extremum condition,

∂G(x)/∂xi |_{x̄(t1)} = 0,  i = m + 1, m + 2, ..., n.

Taking into account the definition of the function G(x), we obtain

∂F(x)/∂xi |_{x̄(t1)} + ∂ϕ(t1, x)/∂xi |_{x̄(t1)} − ∂ϕ(t0, x)/∂xi |_{x̄(t0)} = 0,

where the last term is identically zero (the left end is fixed) and the second term equals ψi(t1); hence

ψi(t1) = −∂F(x)/∂xi |_{x̄(t1)},  i = m + 1, ..., n.  (6.13)
Conditions (6.13) are called transversality conditions. Thus, we have obtained the necessary conditions that must be satisfied by x̄(t), ū(t) and ψ(t) on the interval (t0, t1) to minimize the functional I. The total number of unknown functions is 2n + r. Of these, the r unknown control functions uk(t, ψ, x), k = 1, 2, ..., r, are found from the condition for the maximum of the Hamiltonian:

H(t, ψ(t), x̄(t), ū(t)) = max_{u(t)∈U(t)} H(t, ψ(t), x̄(t), u(t)).

These equalities define all control components as functions of all the other variables. When the maximum of H is internal with respect to all control components, then to determine the optimal control we obtain the r relations

∂H/∂uk = 0,  k = 1, 2, ..., r.
The remaining 2n unknown functions are determined from conditions (6.12), ψ̇j = −∂H/∂xj, j = 1, 2, ..., n, to which we add the n equations of the process, ẋi = fi(t, x, u), i = 1, 2, ..., n, and 2n boundary conditions: at the left end, xi(t0) = xi0, i = 1, 2, ..., n; at the right end, xi(t1) = xi1, i = 1, 2, ..., m ≤ n, plus the n − m transversality conditions (6.13),

ψj(t1) = −∂F/∂xj,  j = m + 1, ..., n.
As a result, we obtain a differential system of 2n equations with respect to x(t) and ψ(t), closed by the control ū(t, ψ, x) on which the Hamiltonian H reaches its maximum with respect to the control. The resulting system is a two-point boundary value problem; its general solution depends on 2n integration constants, x(t, C1, C2, ..., C2n), ψ(t, C1, C2, ..., C2n). To determine the integration constants, we have a finite system of 2n equations with 2n unknowns, formed by the boundary conditions and the transversality conditions. So, the original problem is reduced to solving a two-point boundary value problem for a differential system of 2n first-order equations with respect to the unknown functions xi(t), ψi(t), with boundary conditions composed of the boundary and transversality conditions. In this case, the optimal control is specified by the condition of the maximum of the Hamiltonian, max_{u∈Vu(t)} H, along the optimal trajectory. The obtained relations represent necessary conditions for optimality and coincide with the equations of the maximum principle of L. S. Pontryagin. The maximum principle equations (6.10), (6.12), (6.13) include only the components of the gradient vector of the function ϕ(t, x) calculated along the optimal trajectory; otherwise, the function ϕ(t, x) remains underdetermined. In the general case, the necessary conditions give a solution that is only “suspected" of optimality. In order to make sure that the solution is indeed optimal, one should check the fulfillment of the sufficient conditions, i.e., show that R attains a maximum on this solution, and G a minimum.
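To illustrate the reduction, here is a compact shooting-method sketch in Python for an assumed toy problem: minimize ∫₀¹ u²/2 dt subject to ẋ = u, x(0) = 0, x(1) = 1. Then H = ψu − u²/2, the interior maximum in u gives u = ψ, the adjoint equation is ψ̇ = −∂H/∂x = 0, and the unknown ψ(0) is picked so that the boundary condition at the right end is met.

import numpy as np
from scipy.integrate import solve_ivp
from scipy.optimize import brentq

def mismatch(psi0):
    # state-adjoint system closed by the maximizing control u = psi
    rhs = lambda t, y: [y[1], 0.0]        # y = (x, psi): x' = psi, psi' = 0
    sol = solve_ivp(rhs, (0.0, 1.0), [0.0, psi0], max_step=0.01)
    return sol.y[0, -1] - 1.0             # defect of the condition x(1) = 1

psi0 = brentq(mismatch, -10.0, 10.0)      # shoot on the unknown initial costate
print(psi0)                               # expected: 1, i.e. u(t) = 1 and x(t) = t

For problems with bounded controls, the closure u = arg max H inside rhs becomes a clipped or switching expression, but the shooting structure is the same.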
6.4 Negative example

Consider an optimal control problem whose solution satisfies the necessary optimality conditions in the Lagrange–Pontryagin form (the maximum principle), and show that, depending on the initial parameters of the problem, namely the length of the time interval on which the process is considered, the solution can be optimal or non-optimal. The term "negative" and the example itself are borrowed from the textbook [10].

Problem statement: minimize the functional

I[x(·), u(·)] = ∫_0^T (u² − x²) dt

on the set of solutions of the differential equation ẋ = u with the boundary conditions x(0) = x(T) = 0. The terminal time T is assumed to be given, but its value is not fixed in advance. Let us compose the Hamiltonian:

H(t, ψ, x, u) = ψ0(x² − u²) + ψu.

Adjoint equation: dψ/dt = −2ψ0 x. The condition for the maximum of the Hamiltonian in the control gives

u = argmax_u H = ψ/(2ψ0).
Obviously, the Hamiltonian H attains a maximum when ψ0 > 0; in this case ∂²H/∂u² = −2ψ0 < 0. Closing the original differential equation with the optimal control, we
arrive at the boundary value problem of the maximum principle:
dx/dt = ψ/(2ψ0),  dψ/dt = −2ψ0 x  ⇒  2ψ0 ẍ + 2ψ0 x = 0  ⇒  ẍ + x = 0,  x(0) = x(T) = 0.

The conditions of the maximum principle must be satisfied by a nonzero vector (ψ0, ψ(t)) ≠ 0; therefore we may set ψ0 = 1.
The characteristic equation λ² + 1 = 0 has the roots λ1,2 = ±i, so the general solution of the differential equation is

x(t) = A cos t + B sin t,  ψ(t) = −A sin t + B cos t.
According to the boundary conditions, x(0) = A = 0 and x(T) = B sin T = 0. If T ≠ πk, k = 1, 2, ..., then B = 0, and as a result x(t) = 0, u(t) = 0. When T = πk, the boundary condition at the right end, x(T) = 0, holds for any B; therefore, for T = πk there are infinitely many extremals, x(t) = B sin t, u(t) = B cos t (Fig. 6.1). Thus, depending on the length of the interval on which the problem is considered, various solutions are possible: 1) T < π: a unique extremal, x(t) = 0, u(t) = 0; 2) T = π: an infinite set of extremals, x(t) = B sin t, u(t) = B cos t; 3) π < T < 2π: the only Pontryagin extremal, x(t) = 0, u(t) = 0; etc.
In each of these cases the value of the functional is zero. Indeed, in the first and third cases

I = ∫_0^T (u² − x²) dt = 0,

since x(t) = 0, u(t) = 0. In the second case

I = ∫_0^T (u² − x²) dt = B² ∫_0^T (cos²t − sin²t) dt = (B²/2) sin 2t |_0^T = 0.
Let us show that for T > π the zero trajectory is not optimal, although it satisfies the Pontryagin maximum principle. To prove this, it suffices to construct an admissible process (x̄(t), ū(t)) ∈ D on which the value of the functional is negative (Ī < 0). We build this solution as follows: at the point t = π/2, cut the sinusoid x(t) = B sin t in half along the vertical and shift its right half to the right so that it ends at the point T (Fig. 6.2).

Figure 6.1: Pontryagin extremals for different durations T: x(t) = 0 for T1 < π and T2 > π; x(t) = B sin t for T3 = πk.

Figure 6.2: Construction of a trajectory x̄(t) on which the value of the functional is less than on the Pontryagin extremal x(t).

The equation of the constructed trajectory is

x̄(t) = B sin t for 0 ≤ t ≤ π/2;
x̄(t) = B for π/2 < t ≤ t1 = T − π/2;
x̄(t) = B sin(t − T + π) for T − π/2 < t ≤ T.
We calculate the value of the functional on the trajectory x̄(t); to this end we divide the interval of integration into three parts: from zero to π/2, from π/2 to t1, and from t1 to T:

Ī = ∫_0^{π/2} B²(cos²t − sin²t) dt + ∫_{π/2}^{t1} (−B²) dt + ∫_{t1}^T B²[cos²(t − T + π) − sin²(t − T + π)] dt
  = (B²/2) sin 2t |_0^{π/2} − B²(t1 − π/2) + (B²/2) sin 2(t − T + π) |_{t1}^T
  = −B²(T − π) < 0 for T > π.
The process (x̄(t), ū(t) = dx̄/dt) ∈ D is admissible, since by construction it satisfies the differential equation ẋ = u and the boundary conditions. The value of the functional on this process is Ī < 0 = I. Thus, we have constructed a process (x̄(t), ū(t)) ∈ D which, for T > π, yields a smaller value of the functional than the Pontryagin extremal x(t) = 0, u(t) = 0. For this reason, the Pontryagin extremal cannot be optimal for T > π, while for T < π, on the contrary, the same extremal is optimal.

Consider Krotov's sufficient optimality conditions for the Pontryagin extremal with T < π. We will define the function ϕ so that the construction R(t, x, u) attains its maximum not only in the control u but also in the state x; in other words, so that the stationary point of R in x turns out to be a maximum point of R. In the Lagrange–Pontryagin method the function ϕ(t, x) is defined as a linear form, ϕ(t, x) = ψ(t)x, as a result of which ϕ(t, x) = 0 on the trajectory x(t) = 0. Then R(t, x, u) = −u² + x².
In this case R has at the point x(t) = 0 not a maximum but a minimum: ∂R/∂x |_{x=0} = 2x|_{x=0} = 0, ∂²R/∂x² = 2 > 0. Therefore, it is not possible to prove the optimality of the process x(t) = 0, u(t) = 0 when ϕ(t, x) is chosen as a function linear in x. However, the necessary optimality conditions of the Lagrange–Pontryagin method determine not the entire function ϕ(t, x), but only its gradient calculated at the points of the optimal trajectory, ψ(t) = ∂ϕ(t, x)/∂x |_{x(t)}. Otherwise, the function ϕ remains underdetermined.
Let us make the dependence of the function ϕ(t, x) on the state more complicated and "extend" it to a quadratic form, ϕ(t, x) = ψ(t)x + σ(t)(x − x(t))². Since x(t) = 0, we get ϕ(t, x) = ψ(t)x + σ(t)x²; then

∂ϕ/∂x = ψ(t) + 2σ(t)x,  ∂ϕ/∂t = ψ̇(t)x + σ̇(t)x².

As a result,

R(t, x, u) = (ψ(t) + 2σ(t)x)u − u² + x² + ψ̇(t)x + σ̇(t)x².
The dependence of R on u is quadratic, with ∂²R/∂u² < 0; therefore, the function R attains its maximum in u at a stationary point:

∂R/∂u = ψ + 2σx − 2u = 0  ⇒  u = ψ(t)/2 + σ(t)x.

Then

R(t, x, u) = (σ̇(t) + σ²(t) + 1)x² + (ψ̇(t) + ψ(t)σ(t))x + ψ²(t)/4.   (6.14)

Using the arbitrariness in the choice of the function ϕ, we require

ψ̇ = −σψ,  σ̇ = −(σ² + 1).   (6.15)
As a result of this choice the function R does not depend on the state x, and the condition of the maximum of R with respect to x is trivially satisfied. The second equation of system (6.15) is integrated independently of the first. It is an equation with separable variables, with integral σ(t) = −tan(t − C1). Substituting σ(t) into the first equation of system (6.15) and integrating, we get ψ(t) = C2 [cos(t − C1)]⁻¹. The maximum of the function R in u is attained at

u = ψ(t)/2 + σ(t)x = −tan(t − C1) x + C2/(2 cos(t − C1)).
We close the original differential equation with this control and solve it as a first-order linear equation with variable coefficients by the method of variation of constants. As a result, we find the optimal trajectory

x(t) = C3 cos(t − C1) + (C2/2) sin(t − C1),

a three-parameter family. However, there are only two boundary conditions:

x(0) = C3 cos C1 − (C2/2) sin C1 = 0,
x(T) = C3 cos(T − C1) + (C2/2) sin(T − C1) = 0.   (6.16)

The boundary condition at the left end of the interval leads to the relation

C3 = (C2/2) tan C1.   (6.17)

Substitution of expression (6.17) into the boundary condition at the right end gives

(C2/2) cos(T − C1) (tan C1 + tan(T − C1)) = 0.   (6.18)

Equation (6.18) can have three solutions: C2 = 0, T = C1 + π/2, and T = πk.
In the first case, it follows from (6.17) that C3 = 0; then x(t) = 0, u(t) = 0, ψ(t) = 0, σ(t) = −tan(t − C1). For the function σ(t) to be defined for all values of the argument 0 ≤ t ≤ T, the inequalities

−π/2 < t − C1 < π/2

must hold; that is,

t = 0:  −π/2 < C1 < π/2;
t = T:  −π/2 < T − C1 < π/2  ⇒  T < π/2 + C1 < π.

As a result, with C1 < π/2, the end of the process satisfies T < π (Fig. 6.3).

Figure 6.3: The choice of the constant C1 providing an unambiguous definition of the function σ(t) = −tan(t − C1).

The second solution of equation (6.18) merely reproduces the previously obtained estimate of the duration T. Finally, the third solution: C2 ≠ 0, C3 ≠ 0, T = πk. In this case the optimal trajectory

x(t) = C3 cos(t − C1) + (C2/2) sin(t − C1)

is conveniently represented as a harmonic oscillation,

x(t) = B sin(t − C1 + θ),

where B = √(C3² + C2²/4) is the amplitude and θ = arctan(2C3/C2) is the phase. Taking into account relation (6.17), we find C1 = θ + πk, and then x(t) = B sin t, u(t) = B cos t.

So we can draw the following conclusion: the extremal x(t) = 0, u(t) = 0 for T < π and the infinite set of extremals x(t) = B sin t, u(t) = B cos t for T = πk are optimal solutions of the problem under consideration: they satisfy not only the necessary optimality condition in the form of the Pontryagin maximum principle, but also the sufficient optimality conditions of Krotov. The Pontryagin extremal x(t) = 0, u(t) = 0 for T > π is not optimal.
6.5 Reference material

The Pontryagin maximum principle generalizes and develops the main results of the classical calculus of variations, both to problems with
constraints of equality and inequality type, and to problems with a functional depending linearly on the control and the coordinates of the object. Consider the optimal control problem:

ẋ = f(t, x, u) (equations of state);
x(t0) = x0, x(t1) = x1 (boundary conditions);
I[x(·), u(·)] = ∫_{t0}^{t1} f⁰(τ, x(τ), ẋ(τ)) dτ (the functional to be minimized).

Here x(t) is the state vector, u(t) is the control, and t0, t1 are the initial and final moments of time. The optimal control problem is to find the state x(t) and the control u(t) that minimize the functional I[x(·), u(·)] on the time interval t0 ≤ t ≤ t1. Let us formulate this problem as a problem of the calculus of variations on the minimum of the simplest functional under the additional equality-type constraints ẋ = f(t, x, u), x(t0) = x0, x(t1) = x1; in other words, as a problem on a conditional extremum. Let us apply the method of indefinite Lagrange multipliers
to solve it. To do this, we compose the Lagrange functional:

Λ(x(·), ẋ(·), u(·), ψ(·)) = ∫_{t0}^{t1} [ f⁰(τ, x(τ), ẋ(τ)) + ψ(τ) · (ẋ(τ) − f(τ, x, u)) ] dτ.
Instead of the indefinite Lagrange multipliers λ traditional for finite-dimensional optimization, in the theory of optimal control the conjugate vector function ψ(t) is introduced. The Lagrangian in the problem under consideration (the integrand of the functional to be minimized) has the form

L(t, x(t), ẋ(t), u(t), ψ(t)) = f⁰(t, x(t), ẋ(t)) + ψ(t) · (ẋ(t) − f(t, x, u)).

As necessary conditions for an extremum, we write down the Euler–Lagrange equations of the calculus of variations:

L_u = 0,  L_x − (d/dt) L_ẋ = 0.

Supplementing them with boundary conditions, we obtain a two-point boundary value problem: one part of the boundary conditions is given at the initial moment of time, and the rest at the final moment. The solution of this two-point boundary value problem gives the sought optimal functions x(t), u(t), and ψ(t). When the minimum of the Lagrangian with respect to the control is attained not at interior points of the set of admissible controls but at boundary points, it becomes impossible to satisfy the stationarity condition of the Lagrangian with respect to the control. In this case the condition L_u = 0 is replaced by the condition of the minimum of the Lagrangian, min_{u(t)∈U(t)} L(t, x(t), ẋ(t), u(t)).
The middle term in the expression for the Lagrangian, ψ(t) · ẋ(t), does not depend explicitly on the control; separating it out, we note that the rest of the Lagrangian coincides with the Hamiltonian up to sign, L = ψ(t) · ẋ(t) − H. Therefore, minimizing the Lagrangian is equivalent to maximizing the Hamiltonian:

min_{u(t)∈U} L(t, x(t), ẋ(t), u(t))  ⇔  max_{u(t)∈U} [ψ(t) · f(t, x, u) − f⁰(t, x(t), u(t))],

where the expression in square brackets is the Hamilton function H(t, ψ(t), x(t), u(t)).
The condition of the maximum of the Hamiltonian on the optimal control is the main result of the Pontryagin maximum principle. When the maximum in the control is attained at an interior point of the set of admissible controls, H_u = 0. In this case the equation of state is ẋ(t) = ∂H/∂ψ and the adjoint equation is ψ̇(t) = −∂H/∂x. The necessary optimality conditions written in this form and supplemented by the transversality conditions are called the Pontryagin equations.

Pontryagin's maximum principle. The theorem is formulated for a problem in which the set of admissible controls U is independent of t and x, and the ends are fixed. Along with the n-dimensional state vector x = (x1, x2, ..., xn), we consider the conjugate (n + 1)-dimensional vector (ψ0, ψ1, ..., ψn), and the Hamilton function is defined by the relation

H(t, ψ0, ψ, x, u) = ψ(t) · f(t, x, u) − ψ0 f⁰(t, x, u).

Here the factor ψ0 plays the same role as the factor λ0 in the construction of the generalized Lagrange function. For a regular problem ψ0 > 0; in particular, we can take ψ0 = 1, while for an irregular problem the factor ψ0 can vanish.

Theorem 6.5.1. For a pair of functions (x(t), u(t)) to provide a minimum to the functional I on the set D, it is necessary that there exist a nonzero vector function (ψ0, ψ(t)) such that ψ(t), together with x(t), satisfies the system of mutually conjugate equations (6.1), (6.12) closed by the control u(t), on which for any t0 ≤ t ≤ t1 the Hamiltonian H attains its maximum:

H(t, x(t), ψ0, ψ(t), u(t)) = max_{u(t)∈U} H(t, x(t), ψ0, ψ(t), u(t)).   (6.19)
Relation (6.19) is the central point of the theorem; it is this relation that gave the theorem the name "maximum principle".

Example. It is required to solve the problem of the minimum of the functional

I = ∫_{t0}^{t1} (u² − 4x) dt,  where dx/dt = u, 0 ≤ u ≤ 1, x(0) = 0, t1 = 1.

The Hamilton function has the form H = ψu − ψ0 u² + 4ψ0 x. From the adjoint equation we find

ψ̇(t) = −∂H/∂x = −4ψ0  ⇒  ψ(t) = −4ψ0 t + C,

where C is the constant of integration, whose value is determined by the transversality condition. For a problem with a free right end we have ψ(t1) = 0; hence

C = 4ψ0,  ψ(t) = −4ψ0 t + 4ψ0 = 4ψ0 (1 − t).

It is obvious that ψ0 ≠ 0, since otherwise the conjugate vector (ψ0, ψ(t)) would be zero and would not satisfy the theorem. Where the maximum of the Hamiltonian with respect to u is interior, ũ(t) = ψ(t)/(2ψ0) = 2 − 2t. At time t = 1/2 the function ũ(t) reaches the upper boundary of the admissible set. As a result, the optimal control is

u(t) = 1 for 0 ≤ t ≤ 1/2;  u(t) = 2 − 2t for 1/2 < t ≤ 1.

By hypothesis, dx/dt = u. Hence the optimal trajectory is

x(t) = t + c1 for 0 ≤ t ≤ 1/2;  x(t) = 2t − t² + c2 for 1/2 < t ≤ 1.

From the initial condition x(0) = 0 we obtain c1 = 0. From the continuity of the trajectory x(t) at the point t = 1/2 we find the constant c2 = −1/4. The optimal trajectory takes the form of the continuous function

x(t) = t for 0 ≤ t ≤ 1/2;  x(t) = 2t − t² − 1/4 for 1/2 < t ≤ 1.
Thus, the maximum principle is especially important in systems for which relay-type controls are used, which take extreme rather than intermediate values on the set of admissible controls.
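The example also lends itself to a direct numerical check (our addition): the control found from the maximum principle should give a smaller value of I = ∫_0^1 (u² − 4x) dt than other admissible controls, for instance constants 0 ≤ c ≤ 1.

# Sanity check (our own addition) of the worked example above.
import numpy as np
from scipy.integrate import quad

def u_opt(t):
    return 1.0 if t <= 0.5 else 2.0 - 2.0 * t

def x_opt(t):
    return t if t <= 0.5 else 2.0 * t - t * t - 0.25

I_opt, _ = quad(lambda t: u_opt(t)**2 - 4.0 * x_opt(t), 0.0, 1.0,
                points=[0.5])
print(I_opt)                    # ~ -7/6 = -1.1667

# any constant admissible control u(t) = c gives x(t) = c*t and a larger I
for c in (0.0, 0.5, 1.0):
    I_c, _ = quad(lambda t: c**2 - 4.0 * c * t, 0.0, 1.0)
    print(c, I_c)               # 0.0, -0.75, -1.0 : all above I_opt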
6.6 Exercises
1. Show that for problems linear with respect to phase coordinates and controls, with a linear functional, the maximum principle is not only a necessary, but also a sufficient condition for optimality.
2. Find a controllable process that satisfies the necessary optimality conditions for the Lagrange – Pontryagin method. Determine if the process found is optimal:
a) I = ∫_0^4 (2u + u² − x) dt + 2x(4) → min;  dx/dt = 3x + 2u;  x(0) = 0.

b) I = ∫_0^4 (u + u² + 2x²) dt → min;  dx/dt = x + 2u;  x(0) = 0.

c) I = ∫_0^{10} (u² + x) dt → min;  dx/dt = x − u;  0 ≤ u ≤ 4;  x(0) = 1.

d) I = ∫_0^4 (x + 5u) dt − 2x(4) → min;  dx/dt = 2x + u;  |u| ≤ 1;  x(0) = 1.

e) I = ∫_0^3 (2u² − 4x) dt + x(3) → min;  dx/dt = x + u;  |u| ≤ 1;  x(0) = 1.

f) I = ∫_0^{10} (2x − u) dt → min;  dx/dt = −2x + u;  |u| ≤ 1;  x(10) = 1/4.

g) I = ∫_0^{10} (2x − u) dt + 2x²(10) → min;  dx/dt = −2x + u;  |u| ≤ 1;  x(0) = 1.

h) I = ∫_0^{10} (−3x + 3u) dt + x²(10) → min;  dx/dt = −x + u;  |u| ≤ 2;  x(0) = 1.

i) I = ∫_0^5 (u² − x) dt → min;  dx/dt = −(x + u);  u ∈ [0, 10];  x(0) = 1.

j) I = ∫_0^5 (x1 + x2 + 2u) dt − x2(5) → min;  dx1/dt = x2 − u,  dx2/dt = x1 + u;  |u| ≤ 1;  x1(0), x2(0) given.
7 Hamilton-Jacobi-Bellman Method
7.1 The main idea of the method
The idea of the method is to "immerse" the problem under consideration in a family of problems that differ only in the initial conditions (initial states x0 and initial moments of time t0), while the system of differential equations and the functional are the same for all problems. Determining the optimal control for the whole family of problems at once, we find the solution in the form of a synthesis, that is, the dependence of the optimal control on the state of the system and the current moment of time. A solution in the form of a synthesis provides complete information about the optimal control. However, in comparison with a solution in the form of a program, finding the synthesis is a much more laborious procedure. In this chapter we consider a method of specifying the Krotov function ϕ which leads, in the case of continuous systems, to the well-known partial differential equation of R. Bellman, and in the case of discrete systems, to the recurrence relations of the dynamic programming method that follow from R. Bellman's optimality principle. Bellman's equation is, in essence, a generalization to the case of an arbitrary set Vu(t, x) of the classical results of the calculus of variations, namely the Hamilton–Jacobi equation. In this connection, the approach considered here is called the Hamilton–Jacobi–Bellman method.
7.2 Hamilton–Jacobi–Bellman equation. Continuous version

Consider the optimal control problem in the following setting:

I[x(·), u(·)] = ∫_{t0}^{t1} f⁰(t, x, u) dt + F(x(t1)) → min,   (7.1)
ẋ = f(t, x, u),  x(t0) = x0,   (7.2)
u ∈ Vu(t, x).   (7.3)

In this setting there are no restrictions on the states: the set of admissible states Vx(t) for all t ∈ (t0, t1) coincides with the whole space X; for t = t0 there is a fixed point x0, and for t = t1 no restrictions on the phase coordinates are specified, i.e., this is a problem with a free right end. According to Theorem 4.1, if there exist an admissible process (x(t), u(t)) ∈ V(t) and a function ϕ(t, x) such that

R(t, x(t), u(t)) = max_{(x,u)∈V(t)} R(t, x, u)  for all t ∈ (t0, t1),   (7.4)
G(x(t1)) = min_{x(t1)∈V(t1)} G(x(t1)),   (7.5)

then the process (x(t), u(t)) is optimal; in other words, I(x, u) = min_{(x,u)∈D} I(x, u).
We construct the function P(t, x) by the relation

P(t, x) = max_{u∈Vu(t,x)} R(t, x, u)

and require that P(t, x) not depend on x, that is,

P(t, x) = max_{u∈Vu(t,x)} [ϕx f(t, x, u) − f⁰(t, x, u) + ϕt] = c(t),   (7.6)
or, in other words, H̃(t, x, ϕx) + ϕt = c(t). Here H̃(t, x, ϕx) = max_{u∈Vu(t,x)} H(t, ϕx, x, u), and c(t) is an arbitrary piecewise continuous function.

Suppose that max_{u∈Vu(t,x)} R(t, x, u) is attained for every fixed (t, x) at some point u(t, x) ∈ Vu(t, x). This assumption certainly holds if, for example, Vu is a compact set: by the Weierstrass theorem, the function R(t, x, u), as a continuous function defined on a closed bounded set, attains its exact upper bound on this set, R(t, x, u(t, x)) = P(t, x), or u(t, x) = argmax_{u∈Vu(t,x)} R(t, x, u); moreover, the function u(t) = u(t, x) is piecewise continuous. To determine the optimal trajectory x(t) we substitute the optimal control u(t, x) into the equations of the process (7.2). Solving the Cauchy problem for the differential system ẋ = f(t, x, u(t, x)), closed by the optimal control u(t, x), with the initial condition x(t0) = x0, we find the optimal trajectory x(t).

The pair of vector functions (x(t), u(t)) belongs to the class of admissible processes D and satisfies condition 1° of Theorem 4.1. For this pair to satisfy condition 2° of the theorem, i.e., to be a minimal, it is sufficient to require that for t = t1 the function G not depend on x:

G(x(t1)) = min_{x(t1)∈V(t1)} G(x(t1));   (7.7)
F(x(t1)) + ϕ(t1, x1) = const.   (7.8)

Thus, if one can choose the function ϕ(t, x) as a solution of the partial differential equation (7.6) under the boundary condition (7.7), then the posed optimal control problem is completely solved. At the same time a more general problem is solved: how to reach the abscissa t1 from any state (t0, x0) with the smallest value of the functional I. The answer is that at each point (t, x), starting from (t0, x0), one should move with the control u(t, x). Equation (7.6), up to an arbitrary function c(t) and the sign of ϕ, coincides with the R. Bellman equation. It follows from Krotov's theorem that the Bellman equation, together with the boundary condition
(7.7), is a sufficient condition for the minimum of the functional in problem (7.1)-(7.3). Note again that the method is applicable only to optimal control problems in which there are no restrictions on states, including at the right end for t = t1 .
7.3 Synthesis and the control program as realizations of two different control schemes: feedback control and open-loop control

The solution of the optimal control problem in the form u(t, x) is called the synthesis of optimal control, or the field of optimal controls. To construct the synthesis, in addition to the value of t, it is necessary to know the state of the system at each moment of time. Unlike the synthesis, the function u(t, t0, x0) = u(t, x)|_{x=x(t,t0,x0)}, corresponding to given initial conditions (t0, x0), is called an optimal control program. In the theory of automatic control, the synthesis u(t, x) is implemented by a controller with ideal feedback, while the programmed control u(t) is characterized by an open-loop control scheme. Fig. 7.1a, b shows how the trajectory of the system is formed in these cases.

Figure 7.1: Formation of the trajectories of the system: a) using a controller with ideal feedback; b) using an open-loop control circuit.

In the case of closed-loop control, information about the state of the system is required at each moment of time. The state participates in the formation of the control, which, in turn, is fed to the input of the controlled object and, as a result, closes the differential system. Thus, there is a connection between state and control, which is called feedback (Fig. 7.1a). In economic problems, the control program u(t) corresponds to making decisions for the future, and the synthesis u(t, x) to regulation, tracking the plan and making operational decisions. If the state of the system deviates from the planned value, the program control,
which does not depend on the current state x, loses the property of optimality, while the synthesis u(t, x) gives an optimal solution for the new state of the system, characterized by a new initial level x0.
7.4 Algorithm of the Hamilton–Jacobi–Bellman method. Continuous version

The algorithm of the method consists in finding a solution of the Cauchy problem for a partial differential equation, i.e., a function ϕ(t, x) satisfying conditions (7.6)–(7.7). We represent the function R in the form

R(t, x, u) = ϕx f(t, x, u) − f⁰(t, x, u) + ϕt = H(t, ∂ϕ/∂x, x, u) + ∂ϕ/∂t.

Since the function ϕ(t, x) does not depend explicitly on the control u, expression (7.6) takes the form

P(t, x) = max_{u∈Vu(t,x)} H(t, ∂ϕ/∂x, x, u) + ∂ϕ/∂t = c(t).   (7.9)

Here the maximization applies only to the Hamiltonian H. From relation (7.6), taking into account (7.8), we obtain

∂ϕ/∂t = c(t) − max_{u∈Vu(t,x)} H(t, ∂ϕ/∂x, x, u).   (7.10)

Equation (7.9) is called the Hamilton–Jacobi–Bellman equation; for c(t) = 0 it coincides with R. Bellman's equation. Both equations are first-order partial differential equations resolved with respect to ∂ϕ/∂t. The boundary condition for the Hamilton–Jacobi–Bellman equation (7.9) is obtained from relation (7.7):

ϕ(t1, x) = −F(t1, x) + c1.   (7.11)

Thus, to find a function ϕ(t, x) satisfying the constraints (7.6) and (7.7) of the Hamilton–Jacobi–Bellman method, we need to solve
Figure 7.2: Geometric interpretation of the algorithm for solving the Cauchy problem for the Hamilton–Jacobi–Bellman equation.

the Cauchy problem for the partial differential equation (7.9) with the boundary condition (7.11). The schematic diagram of the solution is illustrated in Fig. 7.2. Since no additional conditions are imposed on the function c(t) and the constant c1, they can be set equal to zero.

Let us discretize the problem: divide the time interval [t0, t1] into N parts and draw sections through the dividing points t0 = τ0, τ1, ..., τk, ..., τN = t1. Denote ϕ(τk, x) = ϕk(x). Approximating the derivative ∂ϕk/∂t by finite differences, we get

ϕ_{k−1}(x) = ϕk(x) − (∂ϕk(x)/∂t)(τk − τ_{k−1}).

Considering that ϕt = −H̃(t, x, ϕx), we get

ϕ_{k−1}(x) = ϕk(x) + (τk − τ_{k−1}) H̃(t, x, ϕx).   (7.12)

Relation (7.12) defines an iterative process which runs backward in time and, given ϕk(x), allows one to calculate ϕ_{k−1}(x). At the right end (for t = t1) the function ϕ is found from condition (7.11): ϕN(x) = ϕ(t1, x) = −F(x). The function ϕ(t, x) found in this way and the corresponding synthesis u(t, x) solve the problem; it remains to substitute the synthesis u(t, x) into the process equation (7.2) and integrate it under the given initial conditions (7.3).
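A minimal sketch of the backward iteration (7.12) on a state grid (our own construction; the test problem ∫_0^T u² dt + λx²(T) → min, ẋ = u, is the one solved analytically in Section 7.5.2, where ϕ(t, x) = −x²/(T − t + 1/λ), so the result can be checked; here H̃ = max_u(ϕx u − u²) = ϕx²/4 and ϕN(x) = −F(x) = −λx²):

# Backward grid iteration phi_{k-1} = phi_k + dt * H_tilde (our sketch).
import numpy as np

T, lam, N = 1.0, 2.0, 2000
xs = np.linspace(-1.0, 1.0, 201)      # state grid
dt = T / N

phi = -lam * xs**2                    # phi_N(x) = -F(x) = -lam*x^2
for k in range(N):                    # march backward in time
    phi_x = np.gradient(phi, xs)      # finite-difference d(phi)/dx
    phi = phi + dt * phi_x**2 / 4.0   # phi_{k-1} = phi_k + dt * H_tilde

exact = -xs**2 / (T + 1.0 / lam)      # exact phi(0, x) from Section 7.5.2
print(np.max(np.abs(phi - exact)))    # small time-discretization error

Since both the terminal data and the exact solution are quadratic in x, the iteration here reduces to an Euler scheme for the coefficient of x², which makes the check transparent.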
On the whole, the above approach to numerically solving the Cauchy problem for the Hamilton–Jacobi–Bellman equation is very laborious and can be realized at a sufficiently small integration step only with the help of computer calculations. Nevertheless, for individual problems it is possible to obtain an exact solution in an analytical form, as we will see later.
7.5 The problem of analytical design of an optimal controller

The problem of synthesizing a controller of a linear system that minimizes the standard deviation was solved in 1959–1960 by Professor A. M. Letov by the method of dynamic programming. This solution served as the basis for a new direction in the theory of regulation, called the analytical design of regulators. The American engineer and researcher R. Kalman generalized these results to the case of a stochastic dynamical system. In engineering practice, this is the most demanded part of optimal control theory. In this section we consider two statements of the problem of designing an optimal controller: in the first, a linear dynamic system with a linear quality criterion is analyzed, and in the second, a linear system with a quadratic criterion.
7.5.1 Synthesis of an optimal controller for a linear dynamic system with a linear performance criterion

Formulation of the problem. The dynamics of the system follows the differential equations

dxi/dt = Σ_{j=1}^n aij(t) xj + bi(t, u),  i = 1, ..., n,  t ∈ [t0, t1],  x(t0) = x0,

or ẋ = Ax + B(t, u).
The set of admissible controls U(t) depends only on time. The set of states X(t) for all t ∈ (t0, t1] coincides with the entire space (X(t) = E^n), and for t = t0 there is a fixed point x(t0) = x0. It is required to find the optimal synthesis that minimizes the functional

I[x(·), u(·)] = ∫_{t0}^{t1} [ Σ_{i=1}^n a⁰i(t) xi + b⁰(t, u) ] dt + Σ_{i=1}^n λi xi(t1)

on the set of admissible processes defined above. Here λ is a given n-dimensional vector. Consider the function R(t, x, u):

R(t, x, u) = ϕxᵀ [Ax + B(t, u)] − a⁰(t)x − b⁰(t, u) + ϕt,

where ϕx is a column vector and ϕxᵀ is its transpose (a row vector). Let us single out in the expression for R the group of terms that involves the control:

R(t, x, u) = [Aᵀϕx − a⁰(t)]x + ϕt + [ϕxᵀ B(t, u) − b⁰(t, u)],

where the last bracket is denoted H1(t, ϕx, u). Then

P(t, x) = max_{u∈U(t)} R(t, x, u) = [Aᵀϕx − a⁰(t)]x + ϕt + H1(t, ϕx),

where H1(t, ϕx) = max_{u∈U(t)} [ϕxᵀ B(t, u) − b⁰(t, u)].
According to the Hamilton–Jacobi–Bellman method, the function ϕ(t, x) is chosen so that the function P(t, x) does not depend on the phase coordinates x. We define the Krotov function as a linear form, ϕ(t, x) = Σ_{i=1}^n ψi(t) xi; then

P(t, x) = [ψ̇(t) + Aᵀψ(t) − a⁰(t)]x + H1(t, ψ(t)),

where

H1(t, ψ(t)) = max_{u∈U(t)} [ Σ_{i=1}^n ψi(t) bi(t, u) − b⁰(t, u) ].
The conjugate vector ψ(t) is defined as the solution of the Cauchy problem for the differential system ψ̇(t) + Aᵀψ(t) − a⁰(t) = 0 with the boundary condition ψ(t1) = −λ. As a result of this definition of the Krotov function ϕ(t, x), the functions P(t, x) and G(x) do not depend on the phase variables:

P(t, x) = H1(t);  G(x1) = Σ_{i=1}^n (λi + ψi(t1)) xi(t1) = 0.
If A and a⁰ do not depend on t, then the conjugate system of differential equations is integrated in closed form. Knowing the vector ψ(t), we find from the condition max_{u∈U(t)} H1(t, ψ(t), u) the optimal synthesis u(t), which in this case does not depend on x.

Let, for example, bi(t, u) = ci(t)u, i = 1, 2, ..., n, where u(t) is a scalar function (r = 1), and the functions ci(t) are continuous on [t0, t1]. The set of admissible controls is the bounded closed set |u(t)| ≤ 1. Then

H1(t, ψ(t)) = max_{|u(t)|≤1} [ Σ_{i=1}^n ψi(t) ci(t) − c⁰(t) ] u = | Σ_{i=1}^n ψi(t) ci(t) − c⁰(t) |.

The optimal control is

u(t) = 1 for those t where Σ_{i=1}^n ψi(t) ci(t) − c⁰(t) > 0;
u(t) = −1 for those t where Σ_{i=1}^n ψi(t) ci(t) − c⁰(t) < 0.

The equation ψ(t)c(t) − c⁰(t) = 0 defines the set of switching points. Thus, the solution of the linear problem in the form of a synthesis coincides with its solution in the form of a program for any given initial values (t0, x0).
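A hedged illustration of this switching-function logic (the example data are entirely ours): take n = 1, ẋ = u, |u| ≤ 1, A = 0, a⁰ = 1, c(t) ≡ 1, c⁰ ≡ 0, and λ = −T/2, i.e. minimize ∫_0^T x dt − (T/2)x(T). The adjoint problem ψ̇ = a⁰ − Aᵀψ, ψ(T) = −λ then gives ψ(t) = t − T/2, so the control switches from −1 to +1 at t = T/2:

# Bang-bang synthesis from the switching function (our example data).
import numpy as np

T = 2.0

def psi(t):
    return t - T / 2.0              # closed-form adjoint for this example

def u_opt(t):
    s = psi(t) * 1.0 - 0.0          # switching function psi(t)c(t) - c0(t)
    return 1.0 if s > 0 else -1.0

# integrate dx/dt = u_opt(t) from x(0) = 0 by explicit Euler
ts = np.linspace(0.0, T, 2001)
x = np.zeros_like(ts)
for i in range(1, ts.size):
    x[i] = x[i - 1] + (ts[i] - ts[i - 1]) * u_opt(ts[i - 1])

I = float(np.sum(0.5 * (x[1:] + x[:-1]) * np.diff(ts))) - (T / 2.0) * x[-1]
print(x[-1], I)   # trajectory dips to -T/2 at the switch and returns to ~0;
                  # I ~ -1, smaller than, e.g., I = 0 for u = -1 throughout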
7.5.2 Synthesis of an optimal controller for a linear dynamic system with a quadratic performance criterion

Formulation of the problem. There is a dynamic system

ẋ = A(t)x + B(t)u,  t ∈ [t0, t1],  x(t0) = x0.
Here x(t) has the meaning of the deviation of the trajectory from a given unperturbed motion, and t is time. The elements of the matrices A(t) and B(t) are continuous functions of time. The set of admissible controls U(t) depends only on time. The set of states for all t ∈ (t0, t1] coincides with the entire space X. The set of admissible final states is limited by the single condition t = t1. It is required to find the optimal synthesis u(t, x) ∈ U minimizing the functional

I[x(·), u(·)] = ∫_{t0}^{t1} f⁰(t, x, u) dt + F(x(t1)),

where f⁰(t, x, u) = f1⁰(t, x) + f2⁰(t, u), and all the functions f⁰ and F are continuous, non-negative and convex, with f1⁰(t, 0) = f2⁰(t, 0) = F(0) = 0.

The functional ∫_{t0}^{t1} f1⁰(t, x) dt + F(x(t1)) is a measure of the deviation of the trajectory x(t) from the given program x(t) ≡ 0. The specific form of the dependencies that make up this functional is chosen from engineering considerations, taking into account the exact sense in which the optimum is understood. In particular, if the quality of the dynamical system is characterized only by the proximity of the phase state to the desired one, x = 0, at the final moment of time t = t1, and on the rest of the interval (t0, t1) the behavior of the system is of no interest, then one should put f1⁰(t, x) ≡ 0. The second functional, ∫_{t0}^{t1} f2⁰(t, u) dt, is a measure of control costs
and has the meaning of physical energy, the amount of fuel consumed for control.
Thus, the sought optimal synthesis u(t, x) must minimize a certain averaged criterion characterizing the deviation of the trajectory from the given, desired one, "plus" the control costs. The synthesis u(t, x) is realized by an ideal feedback regulator operating according to the circuit shown in Fig. 7.1. The dependence u(t, x) determines the structure of the controller. Professor A. M. Letov proposed to calculate this structure analytically, proceeding from the solution of the optimal control problem. This explains the name of the problem: the synthesis of the optimal controller, or the problem of analytical design of the optimal controller (ACOR). We will carry out the further analysis assuming that the functions f1⁰, f2⁰, F are sign-definite quadratic forms:

f1⁰(t, x) = (1/2) xᵀSx;  f2⁰(t, u) = (1/2) uᵀQu;  F(x1) = (1/2) x1ᵀΛx1,

where S and Λ are n × n square matrices, Q is an r × r matrix, x is a column vector, and xᵀ is the transposed row vector.

Hamilton–Jacobi–Bellman method. Consider the constructions that appear in the theorem on Krotov's sufficient optimality conditions, the functions R and G:

R(t, x, u) = ϕxᵀ(Ax + Bu) − (1/2)xᵀSx − (1/2)uᵀQu + ϕt,   (7.13)
G(x1) = ϕ(t1, x1) − ϕ(t0, x0) + (1/2)x1ᵀΛx1.   (7.14)
u = Q−1 B T ϕx
(7.15)
Substituting (7.15) into (7.13), we obtain

P(t, x) = ϕt + ϕxᵀAx + ϕxᵀBQ⁻¹Bᵀϕx − (1/2)xᵀSx − (1/2)(Q⁻¹Bᵀϕx)ᵀ Q Q⁻¹Bᵀϕx
        = ϕxᵀAx + (1/2)ϕxᵀBQ⁻¹Bᵀϕx − (1/2)xᵀSx + ϕt.   (7.16)

In deriving (7.15), when differentiating the quadratic form with respect to the control, the symmetry of the matrix Q is used:

∂/∂uj [ −(1/2) Σ_{i=1}^r Σ_{k=1}^r ui qik uk ] = −(1/2) Σ_{i=1}^r ui qij − (1/2) Σ_{k=1}^r qjk uk = −Σ_{i=1}^r ui qij.   (7.17)

In equation (7.16) the rule for transposing a product of matrices is applied: (Q⁻¹Bᵀϕx)ᵀ = ϕxᵀ(Q⁻¹Bᵀ)ᵀ = ϕxᵀB(Q⁻¹)ᵀ. It follows from relation (7.14) that

ϕ(t1, x1) + (1/2)x1ᵀΛx1 = c1.

We will seek the Krotov function ϕ(t, x) as a quadratic form:

ϕ(t, x) = (1/2) xᵀΣ(t)x,  then  ϕx = Σ(t)x,  ϕt = (1/2) xᵀΣ̇x.   (7.18)

Here the matrix Σ(t) is symmetric. Substituting (7.18) into (7.16), we obtain

P(t, x) = xᵀΣᵀAx + (1/2)xᵀΣᵀBQ⁻¹BᵀΣx − (1/2)xᵀSx + (1/2)xᵀΣ̇x,

or

P(t, x) = xᵀ [ AᵀΣ + (1/2)ΣᵀBQ⁻¹BᵀΣ − (1/2)S + (1/2)Σ̇ ] x.   (7.19)
We require that the expression in square brackets in equation (7.19) vanish:

AᵀΣ + (1/2)ΣᵀBQ⁻¹BᵀΣ − (1/2)S + (1/2)Σ̇ = 0.   (7.20)

Then the function P(t, x) does not depend on the state vector x, and P(t, x) = 0. For the function G(x1), taking into account (7.14) and (7.18) and discarding the constant ϕ(t0, x0), which does not affect the minimization, we get

G(x1) = (1/2) x1ᵀ [Σ(t1) + Λ] x1.

For G(x1) to be independent of the final state x1 it is necessary and sufficient that

Σ(t1) = −Λ.   (7.21)

Thus, if the function ϕ(t, x) is given as a quadratic form with a matrix Σ(t) whose elements solve the Cauchy problem for system (7.20) with the final condition (7.21), then ϕ(t, x) satisfies the equations of the Hamilton–Jacobi–Bellman method. Relation (7.20) defines a matrix Riccati differential system; it is equivalent to a system of n(n + 1)/2 ordinary differential equations for the elements σij(t) of the matrix Σ(t). Each equation of the matrix Riccati system contains linear and quadratic terms. If a solution of the Cauchy problem for system (7.20) with the condition (7.21) exists, then along with it we obtain a complete solution of the optimal control problem. In this case the optimal synthesis

u(t, x) = Q⁻¹BᵀΣx = L(t)x

turns out to be a linear form in the phase coordinates x with time-dependent coefficients, L(t) = Q⁻¹BᵀΣ(t). In fact, these are the gains in the feedback loop of the optimal controller.

An example of constructing an optimal control synthesis. Find a solution of the optimal control problem

∫_0^T u² dt + λx²(T) → min,  ẋ = u,  x(0) = x0,  λ > 0.
Let us construct the function R(t, x, u): R(t, x, u) = ϕx · u − u² + ϕt. According to the condition of the problem, the admissible set of controls Vu(t, x) is the entire numerical axis. The function R is strictly concave in u (Ruu < 0); therefore, for each fixed t, R has a unique maximum in u, attained at the stationary point

∂R/∂u = ϕx − 2u = 0,

whence u(t, x) = (1/2)∂ϕ/∂x. Substituting u(t, x) into the expression for R(t, x, u), we construct the function P(t, x):

P(t, x) = (1/4)(∂ϕ/∂x)² + ∂ϕ/∂t.

The Hamilton–Jacobi–Bellman equation takes the form

(1/4)(∂ϕ/∂x)² + ∂ϕ/∂t = c(t)

with the boundary condition ϕ(T, x) = −λx² + c1. Taking into account that the boundary condition is determined by a quadratic dependence, we look for ϕ(t, x) as a second-degree polynomial in x with time-dependent coefficients, ϕ(t, x) = ψ(t)x + σ(t)x². Then from condition (7.7) we obtain

G(x(T)) = λx² + ψ(T)x + σ(T)x² = const.   (7.22)

The necessary and sufficient conditions for the fulfillment of (7.22) are the relations

ψ(T) = 0,  σ(T) = −λ.   (7.23)

Let us find the partial derivatives of the Krotov function:

ϕx(t, x) = ψ(t) + 2σ(t)x,  ϕt(t, x) = ψ̇(t)x + σ̇(t)x²
and substitute them into the Hamilton–Jacobi–Bellman equation:

(1/4)(ψ + 2σx)² + ψ̇(t)x + σ̇(t)x² = c(t).   (7.24)

The left-hand side of equation (7.24) does not depend on the state x if and only if

ψ̇ + σψ = 0,  σ̇ + σ² = 0.   (7.25)

Equations (7.25) define a homogeneous differential system for the sought functions ψ(t), σ(t) with the conditions (7.23), ψ(T) = 0, σ(T) = −λ. The first differential equation in system (7.25) is linear; since its boundary condition is zero, the solution is trivial, ψ(t) ≡ 0. The second is a separable equation; integrating it, we find σ⁻¹ = t + C, where C is the constant of integration. The value of C is determined from the boundary condition at the right end, σ(T) = −λ; therefore C = −λ⁻¹ − T. The solution takes the form

σ(t) = −1/(T − t + 1/λ).

Thus, for the example under consideration, the solution of the Bellman equation is

ϕ(t, x) = −x²/(T − t + 1/λ).

The optimal control synthesis turns out to be linear in the state:

u(t, x) = (1/2)∂ϕ/∂x = −x/(T − t + 1/λ).
The resulting dependence u(t, x) is valid for any initial conditions. It determines how the system should be controlled so that, for any
given state x(t), its behavior is optimal in the sense of the minimum of the functional I. Let us substitute the synthesis u(t, x) into the process equation:

ẋ = −x/(T − t + 1/λ).

Separating the variables and integrating, we find the trajectory of the system

x(t) = C1 (t − T − 1/λ),

where the constant of integration C1 is determined from the initial condition x(0) = x0. As a result, C1 = −x0 (T + 1/λ)⁻¹. The optimal trajectory passing through the point (0, x0) follows the dependence

x(t) = x0 (T − t + 1/λ)/(T + 1/λ).

The optimal control program is

u(t) = u(t, x)|_{x=x(t)} = −x0/(T + 1/λ) = const.

Thus, the optimal trajectory is a linear function of time, and the optimal control program is a constant.
7.6 Accounting for state constraints

The Hamilton–Jacobi–Bellman method was presented above for an optimal control problem with no constraints on the states, including constraints at the right end of the trajectory for t = t1. The method of penalty functions allows us to extend the original setting and take such constraints into account. Consider the optimal control problem

I[x(·), u(·)] = ∫_{t0}^{t1} f⁰(t, x, u) dt → min,
ẋ = f(t, x, u),  u ∈ Vu(t, x),  x(t0) = x0,  x(t1) = x1.
Application of the Hamilton–Jacobi–Bellman method to this problem requires a preliminary reduction to a problem without constraints on the states. To do this, we remove the restriction on the state at the right end t = t1 and introduce into the original functional a penalty in the form of a terminal term:

I′[x(·), u(·)] = ∫_{t0}^{t1} f⁰(t, x, u) dt + λ Φ(x(t1) − x1) → min,

where Φ(x(t1) − x1) is some given function called the penalty. The penalty function has the property that Φ = 0 when x(t1) = x1 and Φ = M > 0 when x(t1) ≠ x1, where M is a sufficiently large number, a "penalty" for violating the state constraint. For example, the simplest penalty function can be defined as the quadratic dependence Φ = Σ_{i=1}^n (xi − xi,1)².

Consider the relationship between the original and the reduced problems. It is obvious that:

1°) the class of admissible processes of the reduced problem is wider, D ⊂ D′;
2°) if (x, u) ∈ D, then the penalty is zero and I′(x, u) = I(x, u);
3°) min_{D′} I′ ≤ min_D I′ = min_D I;
4°) with an increase in the penalty, i.e., as λ → ∞, min_{D′} I′(x, u, λ) → min_D I;
5°) at the right end of the interval, the trajectory x(t1, λ) → x(t1) = x1 as λ → ∞.
Example. I = ∫_0^T u² dt → min, ẋ = u, x(0) = x0, x(T) = 0. Here x and u are scalar functions. It is required to find an optimal solution x(t, t0, x0), u(t, t0, x0) for any initial conditions t0, x0 with t0 < T; in other words, it is required to find a synthesizing function u(t, x) that is the optimal solution for the whole family of optimal control problems with different starting points (t0, x0). Let us reduce the problem:

I′ = ∫_0^T u² dt + λ[x(T)]² → min,  ẋ = u,  x(0) = x0,  λ > 0.

The solution of this problem was obtained above: the optimal trajectory of the reduced problem has the form

x(t) = x0 (T − t + 1/λ) (T + 1/λ)⁻¹.

For λ → ∞ we get the solution

x(t) = x0 (1 − t/T),

satisfying the boundary condition x(T) = 0.
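A two-line numerical illustration of property 5° (our own addition), using the optimal trajectory of the reduced problem found above:

# As the penalty weight lam grows, x(T) of the reduced problem tends to 0.
import numpy as np

T, x0 = 1.0, 1.0

def x_reduced(t, lam):
    return x0 * (T - t + 1.0 / lam) / (T + 1.0 / lam)

for lam in (1.0, 10.0, 100.0, 1000.0):
    print(lam, x_reduced(T, lam))   # x(T) = x0/(1 + lam*T) -> 0 as lam -> inf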
7.7 Comparative analysis of the Lagrange–Pontryagin and Hamilton–Jacobi–Bellman methods

1. The Lagrange–Pontryagin method reduces the optimal control problem to a two-point boundary value problem for a system of ordinary differential equations of order 2n, while the Hamilton–Jacobi–Bellman method associates with it the Cauchy problem for a partial differential equation with respect to the function ϕ(t, x). In this respect, the Hamilton–Jacobi–Bellman method is much more complicated.

2. The Lagrange–Pontryagin method allows one to find the optimal control program u(t) and the corresponding optimal trajectory
satisfying the given boundary conditions. The Hamilton–Jacobi–Bellman method provides the solution of optimization problems with arbitrary initial conditions in the form of a control synthesis u(t, x); that is, it solves a more general problem, equivalent to the whole family of problems solved by the Lagrange–Pontryagin method.

3. A process (x(t), u(t)) found by the Lagrange–Pontryagin method is only suspected of being an optimum; generally speaking, it may not be one, and the proof of optimality requires additional investigation. The Hamilton–Jacobi–Bellman method does not require an additional justification of optimality, since the synthesis u(t, x) allows us to find a process (x(t), u(t)) that is optimal.

4. The Hamilton–Jacobi–Bellman method, in contrast to the Lagrange–Pontryagin method, is applicable only to problems without constraints on the states for t > t0 and only to problems with a free right end. When the Hamilton–Jacobi–Bellman method is applied, the presence of state constraints in the original formulation requires a preliminary reduction of the problem, with the help of penalty functions, to a problem without constraints.

5. The Lagrange–Pontryagin method is more universal with respect to boundary conditions, while the Hamilton–Jacobi–Bellman method, in turn, is applicable to a wider class of problems; in particular, to integer-valued problems such as the knapsack problem, the traveling salesman problem, and others.
7.8 Test questions and exercises

1. Write down the Hamilton–Jacobi–Bellman equation. How should the Krotov function ϕ(t, x) be determined to obtain the Hamilton–Jacobi–Bellman equation?

2. Define the concept of "optimal control synthesis". How does it differ from a control program?

3. How do you understand the control scheme of a closed-loop system? Are systems without feedback possible? Give examples. Which systems are more effective: with feedback or
without feedback?

4. Construct an optimal control synthesis for a linear differential system with a linear quality criterion. Compare the obtained solution with the optimal program.

5. What are the features of the optimal controller design for a linear differential system with a quadratic performance criterion?

6. Determine the optimal synthesis and the optimal control program in the following problems:

6.1. I = ∫_0^1 2u² dt + x²(1) → min,  ẋ = u,  x(0) = 1;

6.2. I = ∫_0^1 u² dt + 4x²(1) → min,  ẋ = x − 2u,  x(0) = 2;

6.3. I = ∫_0^1 (x² + u²) dt + x1(1) − x2(1) → min;  dx1/dt = −2x2,  dx2/dt = u,  x1(0) = −1,  x2(0) = 1.
8 Relationship Of The Theory Of Optimal Control And The Classical Variational Calculus

8.1 Formulation of the problem

We consider the simplest problem of the calculus of variations on the minimum of the functional

I[x(·), u(·)] = ∫_{t0}^{t1} f⁰(t, x, u) dt → min over D   (8.1)
on the set of pairs of scalar functions (x(t), u(t)) ∈ D related by the differential equation

dx/dt = u,   (8.2)

with the boundary conditions x(t0) = x0, x(t1) = x1 (in other words, a problem with fixed ends). There are no restrictions on the control and the state for t ∈ (t0, t1). Here the indicatrix f⁰ is an arbitrary twice continuously differentiable function. It is assumed that the minimal belongs to the class of admissible processes, (x(t), u(t)) ∈ D. This assumption is essentially an additional requirement that restricts the form of the integrand and the possible boundary values x0, x1. Krotov's optimality principle reduces the simplest problem on the minimum of the functional I on the class D of solutions of the differential equation to nonlinear programming problems in a finite-dimensional space for the function R(t, x, u) (for fixed values of t) and the function
G(x0 , x1 ). The solution of these nonlinear programming problems is approximated with any degree of accuracy by a sequence from D, which is achieved by an appropriate choice of the function ϕ(t, x). The conditions of the theorem leave a certain arbitrariness in the choice of the function ϕ. By specifying a method for finding this function, we, in essence, are specifying a method for solving the optimal control problem. In particular, the solution procedures based on the maximum principle of L. S. Pontryagin and the principle of optimality of R. Bellman can be interpreted as some ways of specifying the function ϕ. This establishes a connection between the proposed approach and these methods common in optimal control theory. The arbitrariness in the choice of ϕ can be used in order to construct the most convenient method for a given problem, taking into account its specifics.
8.2 Euler–Lagrange equation

Let the functions (x(t), u(t), ϕ(t, x)) satisfy the conditions of Theorem 4.1:

1) for each fixed t ∈ [t0, t1] the function R(t, x, u) attains at the point (x(t), u(t)) its largest value among all (x, u) ∈ V(t), and the function G(x(t0), x(t1)) its smallest;

2) the functions (x(t), u(t)) satisfy the differential equation dx/dt = f(t, x, u).

We will assume that the function ϕ(t, x) is twice continuously differentiable. Let us write down the necessary conditions for an extremum of the function R(t, x, u):

∂R/∂x |_{x,u} = 0,  ∂R/∂u |_{x,u} = 0.   (8.3)

We introduce the conjugate function

ψ(t) = ∂ϕ(t, x)/∂x |_{x(t)},
in other words, we define the Krotov function to be linear in the phase variable, ϕ(t, x) = ψ(t)x. Taking into account that R(t, x, u) = ϕx(t, x) · u − f⁰(t, x, u) + ϕt(t, x), we find the derivatives in (8.3):

∂R/∂x |_{x,u} = (∂²ϕ/∂x²)u + ∂²ϕ/∂t∂x − ∂f⁰/∂x |_{x,u}
             = [ ∂/∂x (∂ϕ/∂x) · ẋ + ∂/∂t (∂ϕ/∂x) ] − ∂f⁰/∂x |_{x,u},

where the bracket is the total derivative with respect to t of ∂ϕ/∂x along the trajectory, and

∂R/∂u |_{x,u} = ∂H/∂u |_{x,u} = ψ(t) − ∂f⁰(t, x, u)/∂u |_{x,u},

where H = ψu − f⁰ is the Hamiltonian. As a result,

∂R/∂x |_{x,u} = dψ(t)/dt + ∂H(t, ψ, x, u)/∂x |_{x,u} = 0,
∂R/∂u |_{x,u} = ψ(t) − ∂f⁰(t, x, u)/∂u |_{x,u} = 0.   (8.4)

A characteristic feature of relations (8.4) is that the function ϕ(t, x) is represented in them only by its derivative ψ(t) calculated along the trajectory x(t). The necessary conditions for an extremum (8.4) can now be written in the form

dψ(t)/dt = −∂H(t, ψ, x, u)/∂x |_{x,u},  ψ(t) = ∂f⁰(t, x, u)/∂u |_{x,u}.   (8.5)
Relations (8.5), together with conditions (8.2), define equations for unknown functions x(t), u(t), ψ(t). Indeed, let the functions x(t), u(t) satisfy relations (8.2), (8.5). This means that the pair (x(t), u(t)) is admissible and satisfies the necessary conditions (8.5) for the maximum of the function R(t, x, u) for any function ϕ(t, x) whose derivative with respect to x on the trajectory x(t) coincides with ψ(t).
It remains to show that for at least one such function ϕ(t, x) the stationary point (x(t), u(t)) of the function R(t, x, u) is its maximum point. This condition is certainly satisfied when the indicatrix is a convex function; then the pair (x(t), u(t)) is a minimal. Note that with this method of solution the function ϕ(t, x) is sought together with the functions x(t), u(t). System (8.2), (8.5) contains three equations for three unknown functions. If we express the control u from the second equation of system (8.5) as a function of the arguments t, x, ψ, then, substituting u(t, x, ψ) into the first equation of (8.5) and into (8.2), we obtain a two-point boundary value problem for a system of ordinary differential equations whose general solution depends on two arbitrary constants, x(t, C1, C2), ψ(t, C1, C2). The integration constants make it possible to satisfy the boundary conditions x(t0, C1, C2) = x0, x(t1, C1, C2) = x1. Differentiating the second relation in (8.5) with respect to t and subtracting the result from the first equation, we arrive at the classical result of the calculus of variations, the Euler–Lagrange equation:

d/dt (∂f⁰/∂u) − ∂f⁰/∂x = 0,  where u = ẋ.

Here the Euler–Lagrange equation is derived from the necessary condition for the existence of a function ϕ(t, x) that is twice differentiable at the point x(t) and such that R(t, x, u) has a maximum in x and u at the points x(t), u(t) = ẋ(t) for any t. Let us represent the equations of the boundary value problem in symmetric form:

dψ/dt = −∂H/∂x |_{x,u},  dx/dt = ∂H/∂ψ |_{x,u}.   (8.6)
System (8.6) is called the canonical system of Euler–Lagrange equations for the simplest functional. If the indicatrix f 0 does not depend explicitly on t, then the Hamiltonian H is also independent of
t, the canonical system is Hamiltonian, and the Hamiltonian is its first integral:

dH/dt = ∂H/∂t + (∂H/∂ψ)(dψ/dt) + (∂H/∂x)(dx/dt) ≡ 0,
H(ψ, x, u) = ψu − f⁰(x, u) = const.
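A quick numerical check of this first integral (our own addition), on the extremals of the "negative example" of Chapter 6, whose indicatrix f⁰ = u² − x² does not depend explicitly on t; there the extremals are x = B sin t, u = B cos t, with ψ = ∂f⁰/∂u = 2u:

# H = psi*u - f0 should be constant along an extremal (here H = B^2).
import numpy as np

B = 1.5
t = np.linspace(0.0, np.pi, 7)
x, u = B * np.sin(t), B * np.cos(t)
psi = 2.0 * u
H = psi * u - (u**2 - x**2)   # = u^2 + x^2
print(H)                      # all entries equal B^2 = 2.25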
8.3 Weierstrass Necessary Condition

For the function R(t, x, u) to have its greatest value at the point (x(t), u(t)) for each t, it is necessary that R be largest on any subset of the plane (x, u); in particular, among all pairs (x(t), u) the relation

R(t, x(t), u(t)) = max_u R(t, x(t), u)   (8.7)

must hold. Taking into account that ∂ϕ/∂t does not depend on the control u, we obtain H(t, ψ, x, u) ≤ H(t, ψ, x, u(t)), or

ψ (dx/dt) − f⁰(t, x, dx/dt) ≥ ψ u − f⁰(t, x, u).   (8.8)

Substituting into (8.8) the expression for the conjugate function, ψ(t) = ∂f⁰/∂u |_{u=dx/dt}, we get, for all −∞ < u < +∞,

E(t, x, dx/dt, u) ≡ f⁰(t, x, u) − f⁰(t, x, dx/dt) − (u − dx/dt) ∂f⁰(t, x, u)/∂u |_{u=dx/dt} ≥ 0,   (8.9)

where E is the Weierstrass function. Thus, a necessary optimality condition in the simplest problem of the calculus of variations is that the Weierstrass function be nonnegative, E(t, x, ẋ, u) ≥ 0.
Let us rewrite condition (8.9) as

f⁰(t, x, u) ≥ f⁰(t, x, dx/dt) + (u − dx/dt) ∂f⁰(t, x, u)/∂u |_{u=dx/dt}.   (8.10)

The right-hand side of (8.10) can then be interpreted as the first two terms of the Taylor expansion of the indicatrix about the point u = dx/dt. As a result, the Weierstrass function E(t, x, dx/dt, u) is the deviation of the indicatrix f⁰(t, x, u) from these two terms of the Taylor expansion.
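This interpretation is easy to verify symbolically (our own addition; the convex indicatrix f⁰ = u⁴ is an arbitrary test case, with v standing for dx/dt):

# E is the first-order Taylor remainder of the indicatrix at u = dx/dt,
# and it is nonnegative for a convex indicatrix such as f0 = u^4.
import sympy as sp

u, v = sp.symbols('u v', real=True)   # v stands for dx/dt
f0 = u**4
E = f0 - f0.subs(u, v) - (u - v) * sp.diff(f0, u).subs(u, v)
print(sp.simplify(E))                 # u**4 - 4*u*v**3 + 3*v**4
print(sp.factor(E))                   # (u - v)**2 * (u**2 + 2*u*v + 3*v**2)
# the second factor equals (u + v)**2 + 2*v**2 >= 0, so E >= 0 everywhere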
8.4 Legendre and Jacobi conditions

It is assumed that the pair of functions (x(t), u(t)) ∈ D, the function u is continuous and differentiable everywhere on [t0, t1], and the function ϕ(t, x) is three times continuously differentiable at the points (t, x(t)). In this case, a necessary condition for the maximum of the function R(t, x, u) at the point (x(t), u(t)) is the nonpositiveness of the quadratic form

d²R = R̄xx dx² + 2R̄xu dx du + R̄uu du² ≤ 0.

Here the bar denotes second derivatives calculated along the pair of functions (x(t), u(t)), and dx = x − x(t), du = u − u(t). Let us introduce the notation σ(t) = ∂²ϕ(t, x(t))/∂x². Then, since R̄x = ϕxx u − f⁰x + ϕtx,

R̄uu = −f⁰uu(t, x, u),
R̄xu = σ(t) − f⁰xu(t, x, u),
R̄xx = [ ∂/∂x (ϕxx) u + ∂/∂t (ϕxx) − f⁰xx ] |_{x,u} = dσ/dt − f⁰xx(t, x, u).

The criterion for the nonpositiveness of the quadratic form d²R is the alternation of signs of the diagonal minors of the Hessian matrix

h = [ R̄uu  R̄xu ; R̄xu  R̄xx ]  ⇒  R̄uu ≤ 0,  det(h) ≥ 0,
or

−f⁰uu ≤ 0,  −f⁰uu (dσ/dt − f⁰xx) − (σ − f⁰xu)² ≥ 0.   (8.11)
The first condition in (8.11) is the necessary Legendre condition; the second is a differential inequality with respect to the function σ(t). Let us rewrite the differential inequality:

−f⁰uu (dσ/dt − f⁰xx) − (σ − f⁰xu)² ≥ 0.   (8.12)

If the Legendre condition −f⁰uu ≤ 0 is satisfied, then from (8.12) it follows that dσ/dt − f⁰xx ≤ 0. We introduce a continuous non-negative function ε(t) ≥ 0, with the help of which we represent the differential inequality from (8.11) in the form of a differential equation:

−f⁰uu (dσ/dt − f⁰xx) − (σ − f⁰xu)² = ε(t) f⁰uu.   (8.13)

Equation (8.13) is a nonlinear Riccati equation. By the change of variables σ(t) − f⁰xu = (f⁰uu/z)(dz/dt), equation (8.13) is reduced to a linear homogeneous differential equation of second order with respect to the function z(t):

d/dt (f⁰uu dz/dt) − (f⁰xx − d f⁰xu/dt − ε(t)) z = 0.   (8.14)

For ε(t) ≡ 0, equation (8.14) expresses the necessary Jacobi condition of the calculus of variations. Indeed, the existence of an extremum of the simplest functional I necessarily implies the nonnegativity of the second variation, δ²I ≥ 0, so that the zero value of the second variation is its smallest value. Therefore, if there is a trajectory x(t) on which the second variation δ²I vanishes, then this trajectory must be an Euler extremal of the functional δ²I, since
the second variation δ²I reaches its smallest, zero, value on it. The equation of the Euler extremal, that is, the Euler–Lagrange equation for the functional δ²I, is obtained from (8.14) for ε(t) ≡ 0:

d/dt (f⁰uu dz/dt) − (f⁰xx − d f⁰xu/dt) z = 0.   (8.15)

In the theory of differential equations, a relationship is established between the solutions of equations (8.13) and (8.14). In particular, it is known that the singular points of equation (8.13), i.e., the points at which σ(t) does not exist, coincide with the zeros of the nontrivial solution z(t) of equation (8.14). According to the Sturm–Liouville theorem on the alternation of zeros of a solution of a linear differential equation, the interval (t0, t1) on which the solution z(t) of equation (8.14) does not vanish, for fixed ε(t), has the greatest length when the initial condition for this solution is zero, z(t0) = 0. In other words, the largest interval (t0, t1) corresponds to a solution one of whose zeros coincides with the left end of the interval (t0, t1). Denote this solution z(t0, t). Suppose that for equation (8.14), for any ε(t) ≥ 0, z(t0) = 0. Then for any ε(t) ≥ 0 the right zero of the solution z(t0, t) of equation (8.14) adjacent to the origin is no further from the origin than the corresponding zero of equation (8.15) with ε(t) = 0. For the Jacobi equation (8.15), this zero is called the point conjugate to t0.

So, if f⁰uu ≥ 0 and there is a function σ(t) such that the differential inequality (8.11) holds on the interval (t0, t1), then the interval (t0, t1) does not contain points conjugate to t0. This is the Jacobi necessary condition for a minimum of the simplest functional. The absence of conjugate points on the segment [t0, t1] is an analogue of Sylvester's criterion for the sign-definiteness of a quadratic form. The converse is also true: it follows from Sturm's theorem that if an interval (t0, t1) does not contain points conjugate to t0, then there exists ε(t) ≥ 0 such that the solution z(t0, t) of equation (8.14) does not vanish at any point of the interval (t0, t1). In the extreme case,
The Theory Of Optimal Control And The Variational Calculus
151
when t1 turns out to be conjugate with t0 a point, ε(t) = 0. In other words, there exists σ(t) such that (8.11) holds.
8.5 Summary of Required Extremum Conditions simplest functionality If the functional I =
t1
f 0 (t, x, x) ˙ dt reaches an extremum along the
t0
curve x(t), then: 1) the curve x(t) is an Euler extremal; otherwise, it satisfies the d (fx0 ) = 0. Euler–Lagrange equation fx0 − dt 2) the Legendre condition is satisfied along this curve fx0 x (t, x, x ) 0 (in the case of a maximum, the sign of inequality is opposite); 3) the interval (t0 , t1 ) does not contain points conjugated with t0 ; 4) the Weierstrass condition is satisfied E(t, x, dx dt , u) 0. These conditions are necessary for a weak extremum. Since any strong extremum is at the same time weak, then any condition necessary for a weak extremum is necessary for a strong one as well. 0 Strict inequalities or strengthened conditions d2 R < 0, f uu > > 0, ε(t) > 0, [t0 , t1 ] lead to sufficient conditions for the minimum of the simplest functional.
8.6 Sufficient conditions for an extremum simplest functionality Theorem 8.6.1. If: 1) the trajectory x(t) satisfies the Euler–Lagrange equation; 2) the strengthened Legendre condition is satisfied along this curve;
152
Chapter 8
3) the segment [t0 , t1 ] does not contain points conjugate to the point t = t0 (strengthened Jacobi condition), then this curve implements a weak relative minimum of the simplest functional. If the additional fourth strong Weierstrass condition holds along the curve x(t), the theorem gives sufficient conditions for the strong extremum of the functional. The sufficient conditions are very close to the necessary conditions discussed above. Each of the necessary conditions in itself, separately, is necessary, while the sufficient conditions should be considered in the aggregate, since only the simultaneous fulfillment of all conditions ensures the existence of an extremum. Thus, we can draw the following conclusion: there exists a function ϕ such that the necessary conditions for the maximum of the function R are simultaneously necessary conditions for the minimum of the simplest functional; sufficient conditions for a weak relative minimum of the functional are simultaneously sufficient conditions for the relative maximum of the function R.
8.7 Hamilton – Jacobi equation. Jacobi’s theorem Consider the simplest functional I =
t1
f 0 (t, x, x ) dt defined on
t0
curves lying in some region of the plane (t, x). Suppose that one and only one extremal of functional I passes through any two points A and B of this domain. t1 Let us introduce the quantity S = f 0 (t, x, x ) dt where the t0
integral is taken along the extremal connecting the points A(t0 , x0 ) and B(t1 , x1 ). The quantity S is called the geodesic distance between points A and B. In particular, if functional I determines the length of the curve, then S is the distance in the usual sense between A and B, while in the problem of geometric optics S is the propagation time of light from A to B.
The Theory Of Optimal Control And The Variational Calculus
153
We assume that the point A(t0 , x0 ) is fixed, and the point B(t, x) is variable, then the geodesic distance S = S(t, x) is a single-valued function of the coordinates of point B. We derive a differential equation that the function S(t, x) satisfies , for this purpose we calculate the partial derivatives ∂S/∂t and ∂S/∂x. Let us find the total differential dS, that is, the main linear part of the increment, ΔS = S(t + Δt, x + Δx) − S(t, x). According to the definition, ΔS is the difference between the values of the functional ΔS = I( γ ) − I(γ), where γ is an extremal going from point A to a point with coordinates (t, x), and γ is an extremal going from point A to point (t + Δt, x + Δx). Therefore, dS = δI. We define the variation of the functional as the linear part of the increment with respect to the variation of the function x(t), the derivative x (t), and the coordinates of the ends of the interval: t1 δI = t0
fx0
t1 t 1 d 0 0 0 0 − fx δx dt + fx δx + (f − fx x )δt .(8.16) dt t0 t0
We introduce the following notation: H = −f 0 + x · fx0 then ψ
the variation of the functional (8.16) will be written in the form ⎛ ψ ⎞ t1 t1 t1 d δI = ⎝fx0 − fx0 ⎠ δx dt + ψδx − Hδt , (8.17) dt t0 t0 t0 ≡0 on the Euler extreme
Here variables t, x, ψ and H are canonical variables. t On the Euler extreme δI = (ψδx − Hδt)t10 . In the problem with a fixed left end, all quantities are calculated at the point B(t1 , x1 ), dS = δI = (ψδx − Hδt)t1 = 0. Hence, ∂S = −H(t, x, ψ), ∂t
∂S = ψ. ∂x
(8.18)
154
Chapter 8
According to (8.18), the geodesic distance S as a function of the coordinates of the point B satisfies the equation ∂S ∂S = 0. + H t, x, ∂t ∂x
(8.19)
ψ
Relation (8.19) is the Hamilton–Jacobi equation. There is a close connection between the Hamilton–Jacobi equation and the canonical Euler–Lagrange equations. It is these canonical equations that represent the so-called characteristic system for equation (8.19). Theorem 8.7.1 (Jacobi’s theorem). Let S = S(t, x, α) be some solution of the Hamilton – Jacobi equation depending on the parameter α. Then the derivative ∂S ∂α is the first integral of the canonical system of Euler–Lagrange equations
dx ∂H dt = ∂ψ , dψ dt
In other words,
∂S ∂α
= − ∂H ∂x .
= const along each extremal.
The readers are invited to prove this theorem on their own, as an exercise. For this, one should differentiate the Hamilton–Jacobi d ∂S equation with respect to the parameter α and show that dt ∂α = 0 taking into account relations (8.18). Thus, the well-known partial differential equation of R. Bellman, to which the method of specifying the Krotov function ϕ(t, x), considered in Ch. 7, is a generalization of the classical Hamilton–Jacobi equation to the case of an arbitrary set of admissible controls.
9 Multi-Step Controlled Processes. Theorem On Sufficient Optimal Conditions. Krotov’s Optimal Principle 9.1 Statement of the optimal control problem multi-step process The equations of motion for a multistep process have the form of a difference system: xi (t + 1) = fi (t, x(t), u(t)), t = 0, 1, ..., T − 1, i = 1, 2, ..., n.(9.1) Boundary conditions xi (0) = xi,0 ,
xj (T ) = xj,1 ,
i = 1, n,
j = 1, m < n.
(9.2)
Here the n-dimensional vector function of the integer argument x(t) plays the role of the state of the system, and the r-dimensional vector function u(t) plays the role of the control. Formally, the state differs from the control in that it enters into the constraint equations (9.2) for different values of t. Additional restrictions on states and controls: (x(t), u(t)) ∈ V (t).
(9.3)
On the set of admissible D defined by conditions (9.1) - (9.3), the functional I[x(·), u(·)] =
T −1 t=0
f 0 (t, x(t), u(t)) + F (T, x(T )),
(9.4)
156
Chapter 9
where functions f 0 and F define the current and terminal performance criteria of the system. It is assumed that the functional I is bounded on D. All functions fi , f 0 and F are given functions of their arguments, continuous and differentiable. It is required to find a sequence (xs (t), us (t)) ∈ D minimizing the functional I on D, that is, a sequence such that I(xs , us ) → s→∞
→ inf (I(x, u)).
s→∞ D
9.2 The main theorem Let us define an arbitrary function ϕ(t, x) and use it to construct the following constructions: R(t, x, u) = ϕ(t + 1, f (t, x, u)) − ϕ(t, x) −f 0 (t, x, u),
(9.5)
The increment of the krotov function by [t, t + 1] is the difference analog of the derivative dϕ/dt
μ(t) =
max
(x,u)∈V (t)
R(t, x, u),
G(x(T )) = ϕ(T, x) − ϕ(0, x) + F (T, x), m=
min
x(T )∈V (T )
G(x(T )).
(9.6) (9.7) (9.8)
Theorem 9.2.1. Let there be a sequence {xs (t), us (t)} ∈ D for this sequence to minimize the functional I on D, it is sufficient (and if for each t = 0, 1, ..., T − 1 the functions f 0 (t, x, u) and F (x(T )) are bounded on V (t) - then it is necessary) the existence of a function ϕ(t, x) such that: 1) for t = 0, 1, ...T − 1 the function μ(t) is defined; 2) R(t, xs (t), us (t)) → μ(t), t = 0, 1, ..., T − 1; s→∞
3) G(xs (T )) → m. s→∞
Proof of water sufficiency. Let us consider the set E, which is an extension of the original set of admissible processes, D ⊂ E. On the
Multi-Step Controlled Processes
157
set E, the system of difference equations is "removed". Let us define on the set E the functional L[x(·), u(·)] = G(x(N )) −
T −1
R(t, x(t), u(t)).
t=0
For all admissible pairs (x, u) ∈ D, the functionals L and I coincide, L[x(·), u(·)] ≡ I[x(·), u(·)]. Indeed, L[x(·), u(·)] = ϕ(T, x) − ϕ(0, x) + F (x(T ))− −
T −1
[ϕ(t + 1, f (t, x, u)) − ϕ(t, x) − f 0 (t, x, u)] =
t=0
= ϕ(T, x) − ϕ(0, x) + F (x(T ))− −
T −1 t=0
ϕ(t + 1, f (t, x, u)) +
T −1
ϕ(t, x) +
t=0 T −1
= F (x(T )) +
T −1
f 0 (t, x, u) =
t=0
f 0 (t, x, u) ≡ I[x(·), u(·)].
t=0
Suppose that there is a function ϕ(t, x) such that on some sequence {xs (t), us (t)} ∈ D the conditions of the theorem are satisfied. Then this sequence is minimizing for the functional L on the set E and for the functional I on D by Lemmas 4.1, 4.2. Remark. If the sequence appearing in the theorem has the form ≡ xs (t) ≡ x(t), us (t) ≡ u(t) for all s, then in items 2) and 3) of the theorem the requirement of convergence is replaced by an equal sign, and a pair of vector functions (x(t), u(t)) ∈ D satisfying conditions of the theorem is the minimum. The theorem generalizes sufficient conditions for Krotov’s optimality to the case of multistep processes. The transition to an integer argument made it possible to minimize the set of constraints necessary to formulate the result - only sets of a general nature, operators on them, and functionals appear in the problem.
158
Chapter 9
9.3 Krotov’s optimality principle The theorem allows us to replace the problem on the minimum of the functional I on D with the problems on the maximum for the function R(t, x, u) on V (t) for each t = 0, 1, ..., T − 1 and on the minimum for the function G(x(T )). The connection between these problems is realized by the appropriate choice of the function ϕ(t, x). The conditions of the theorem leave arbitrariness in the choice of the Krotov function, allowing, by introducing additional requirements to ϕ(t, x), to specify various methods for solving the problem
10 The Lagrange–Pontryagin Method For Multistep Controlled Processes 10.1 Lagrange – Pontryagin method Let us concretize the problem of optimal control of the multistep process (9.1) - (9.4). We will assume that there are no restrictions on the states, therefore the set of admissible states of X coincides with the entire space, except for the initial point, which is fixed. The right end of the trajectory is free. The set of admissible controls does not depend on x, that is, U (t); moreover, the optimal control x(t) is an interior point of the set of admissible controls. We will also assume that the functions ϕ(t, x) and fi (t, x, u) are differentiable at the points of the minimum. These assumptions are needed to write down the necessary conditions for the maximum R(t, x, u) in x, u: ⎧ x(t+1) ⎪ ⎪ ⎪ n ⎪ ∂ϕ(t + 1, f (t, x, u)) ∂fk ⎪ ∂R ∂ϕ(t, x) ∂f 0 ⎪ ⎪ = · − − = 0, ⎪ ⎪ ∂xi x,u ∂xk ∂xi ∂xi ∂xi ⎪ ⎪ k=1 x,u ⎪ ⎪ ⎪ ⎪ ⎪ ⎪ ⎨ i = 1, 2, ..., n, x(t+1) ⎪ ⎪ ⎪ n ⎪ 0 ⎪ ∂R f (t, x, u)) ∂ϕ(t + 1, (t, x, u) ∂f ∂f ⎪ k ⎪ = · − ⎪ = 0, ⎪ ⎪ ∂u ∂x ∂u ∂u j x,u k j j ⎪ k=1 ⎪ x,u ⎪ ⎪ ⎪ ⎪ ⎪ ⎩ j = 1, 2, ..., r. (10.1)
Let us introduce into consideration the vector function ψ(t) = = (ψ1 (t), ψ2 (t), ..., ψn (t)) - the gradient of the function ϕ(t, x) at
160
Chapter 10
the points of the required minimum, i.e. ψi (t) =
∂ϕ(t,x) ∂xi
,i = x(t)
= 1, 2, ..., n. Taking into account the notation of the gradient of the function ϕ(t, x), relations (10.1) take the form ⎧ n ∂R ∂fk ∂f 0 ⎪ ⎪ ⎪ ψk (t + 1) · − ψk (t) − = = 0, ⎪ ⎪ ∂xi x,u ∂xi ∂xi x,u ⎪ ⎪ k=1 ⎪ ⎪ ⎪ ⎪ ⎨ i = 1, 2, ..., n, ⎪ n ⎪ ⎪ ∂fk (t, x, u) ∂f 0 ∂R ⎪ ⎪ = ψ (t + 1) · − ⎪ = 0, k ⎪ ∂u x,u ⎪ ∂uj ∂uj x,u j ⎪ ⎪ k=1 ⎪ ⎩ j = 1, 2, ..., r, (10.2) We introduce the Hamiltonian: H(t, ψ(t + 1), x(t), u(t)) = different moments
n i=1
ψi (t + 1) · fi (t, x, u) −f 0 (t, x, u). xi (t+1)
(10.3) Let us express relations (10.2) in terms of the Hamiltonian H: ⎧ ∂R ∂ ⎪ ⎪ ⎨ ∂xi = ∂xi H(t, ψ(t + 1), x(t), u(t)) − ψi (t) = 0, i = 1, ..., n, x,u ⎪ ∂R ⎪ ⎩ ∂uj = ∂u∂ j H(t, ψ(t + 1), x(t), u(t)) = 0, j = 1, 2, ..., r, x,u
(10.4) As a result, we obtained the necessary optimality conditions for the multistep process (9.1)-(9.4). They are a discrete analogue of the Euler–Lagrange equations in the Pontryagin form. The unknowns here are n + r functions (ψ(t), u(t)) whereas the optimal trajectory x(t) is determined from n equations of the process (9.1) taking into account the boundary conditions (9.2). For m = 0, we have a problem with a free right endpoint, for m = n, a two-point boundary value problem, and for m < n, the right end of the trajectory is partially fixed.
The Lagrange–Pontryagin Method
161
According to assertion 3) of Theorem 9.1, the function G(x(T )) = = ϕ(T, x) − ϕ(0, x) + F (T, x) should be minimized with respect to the final state. Necessary minimum conditions ∂ ∂ϕ(T, x) ∂F (x(T )) G(x(T )) = + = 0, i = m + 1, ..., n, ∂xi ∂xi ∂xi x(T ) or
∂F (x(T )) , i = m + 1, ..., n. (10.5) ∂xi Thus, the transversality conditions for continuous and discrete versions of controlled systems are similar. Conditions (9.1), (9.2), (10.4), and (10.5) are not only necessary conditions for the maximum R and minimum G, but also the necessary conditions for the minimum of the functional I on the set D. They define the conjugate function ψ(t) and the pair (x(t), u(t)) ∈ ∈ D., "suspicious" for optimality. In order to solve the problem to the end and prove that a pair of vector functions (x(t), u(t)) ∈ D is the required minimum, it is necessary to show the existence of a function ϕ(t, x) satisfying the above difference relations and boundary conditions. It is characteristic that in the discrete version, in contrast to the continuous one, the maximum of the Hamiltonian H with respect to the control is not a necessary condition for max(R) and, at the same time, is not a necessary condition for optimality. ψi (T ) = −
10.2 Negative example Let the equations of motion be given by a system of difference equations
x1 (t + 1) = x1 (t) + 2u(t), x1 (0) = 3, (10.6) x2 (t + 1) = x2 (t) − [x1 (t)]2 + [u(t)]2 , x2 (0) = 0, Control constraints: |u(t)| 5. Process (10.6) and control constraints will be considered at two points at t = 0 and t = 1. It is required to minimize the functional I[x(·), u(·)] = −x2 (2) on a given set of admissible ones.
162
Chapter 10
Directly from the difference system (10.6) we find
x1 (1) = 3 + 2u(0), t=1: x2 (1) = −9 + [u(0)]2 . t = 2 : x2 (2) = −9 + [u(0)]2 − 9 − 12u(0) − 4[u(0)]2 + [u(1)]2 = = [u(1)]2 − 3[u(0)]2 − 12u(0) − 18. We minimize the control functional I[x(·), u(·)] = −x2 (2) at the zero step, u(0): ∂[−x2 (2)] = 6u(0) + 12 = 0, ∂u(0) where, u(0) = −2. The found control satisfies the constraints of the problem, its value does not go beyond the nterval [−5; +5]. The second derivative is positive, so the minimum is found. We minimize the control functional at the first step, u(1): on the interval [−5; +5] a parabola with apex at the origin, with branches directed downward, reaches a minimum at the boundaries of the interval u(1) = ±5. Let us check the fulfillment of the condition for the maximum of the Hamiltonian; for this, we write down expressions for H. At t = 0: H(ψ(1), x(0), u(0)) = = ψ1 (1)[x1 (0) + 2u(0)] + ψ2 (1){x2 (0) − [x1 (0)]2 + [u(0)]2 }. At t = 1: H(ψ(2), x(1), u(1)) = = ψ1 (2)[x1 (1) + 2u(1)] + ψ2 (2){x2 (1) − [x1 (1)]2 + [u(1)]2 }. Let’s compose a conjugate system of difference equations: ⎧ ⎨ψ1 (1) = ∂H(ψ(2),x(1),u(1)) = ψ1 (2) − 2ψ2 (2)x1 (1), ∂x1 (1) ⎩ψ2 (1) =
∂H(ψ(2),x(1),u(1)) ∂x2 (1)
= ψ2 (2).
(10.7)
The Lagrange–Pontryagin Method
163
Transversality conditions: ψ1 (2) = 0, ψ2 (2) = 1. From of the second equation of the conjugate system (10.7) it follows that ψ2 (1) = ψ2 (2) = 1 that is, the second conjugate variable is a constant. Then the first conjugate variable satisfies the difference equation ψ1 (1) = ψ1 (2) − 2x1 (1). Since x1 (1) = 3 + 2u(0)|u(0)=−2 = −1, ψ1 (1) = ψ1 (2) + 2 = 2. As a result, at t = 0 H(ψ(1), x(0), u(0)) = 2[3 + 2u(0)] + ψ2 (1){−9 + [u(0)]2 } = = −3 + 4u(0) + [u(0)]2 = [u(0) + 2]2 − 7. (10.8) The Hamiltonian, defined by relation (10.8), at u(0) = −2 attains not a maximum, but a minimum with respect to the control. For t = 1 H(ψ(2), x(1), u(1)) = 2{x2 (1) − 1 + [u(1)]2 } = = −20 + 2[u(0)]2 + 2[u(1)]2 |u(0)=−2 = 2
(10.9)
2
= −12 + 2[u(1)] = 2{[u(1)] − 6} the Hamiltonian has a boundary maximum u(1) = ±5. From a negative example, it is clear that, in the general case, Pontryagin’s maximum principle is inapplicable for multistep processes.
10.3 Algorithm for solving the problem 1) From the necessary condition for an extremum ∂ H(t, ψ(t + 1), x(t), u(t)) = 0, j = 1, 2, ..., r, ∂uj find control uj (t, ψ(t + 1), x(t)), j = 1, 2, ..., r. 2) By the found control u we close the system of difference equations
∂ ψi (t) = ∂x H(t, ψ(t + 1), x(t), u(t)), i = 1, 2, ..., n, i xi (t + 1) = fi (t, x, u), i = 1, 2, ..., n.
164
Chapter 10
3) We solve it by taking into account the boundary conditions for states and transversality conditions for conjugate variables. Before making calculations, you should bring the equations in paragraphs 2) and 3) to the same form. In the theory of difference equations, two forms of processes are known. The first option: y(t + 1) = Φ(t, y(t)), y(0) = y0 here the motion is carried out from t = 0 to t = T . In other words, the Cauchy problem is solved in direct time. Second option: y(t − 1) = Φ(t, y(t)), y(T ) = yT . This is the Cauchy problem in reverse time, from t = T to t = 0. In the case of a controlled multistep process, the basic equations of motion are written in forward time, and the conjugate system is written in reverse. Therefore, one of the two systems should be transformed in such a way that the directions of movement in the systems of item 2) coincide. For example, if the conjugate system is rewritten in the form ψ(t + 1) = Φ(t, ψ(t)), then the solution of the boundary value problem for the system of recurrent equations is carried out in the direction from 0 to T. To do this, we set ψ0 = = ψ(0) and calculate the system of equations of item 2) up to time T. Next, we check the fulfillment of the conditions at the right end, xj (T ) = xj,1 , j = 1, m. If the equality is not satisfied, set another ψ(0) and repeat the calculation again until we get the coincidence of the boundary conditions at the right end. It is also possible, keeping the original adjoint system unchanged, to transform the equations of the process to the form x(t) = = Φ(t, x(t + 1), ψ(t + 1)). In this case, the boundary value problem is solved in the opposite direction, from T to 0. We set the missing boundary conditions at the right end for states and conjugate variables. Check the initial conditions x(0) = x0 . If there is no match, we change ψ(T ) and carry out a new calculation.
10.4 Control restrictions We consider the optimal control problem for a multistep process (9.1) - (9.4), in which the set of admissible controls U (t) is bounded and
The Lagrange–Pontryagin Method
165
does not depend on x. If there are control constraints, it is no longer possible to use equations (10.4) derived under the condition that u(t) is an interior point. By virtue of the theorem on sufficient optimality conditions, the function R(t, x, u) for each fixed x must have a maximum with respect to the control u. In other words, R(t, x(t), u(t)) = max R(t, x(t), u), u∈U (t)
t = 0, 1, ..., T − 1.
Let us consider how the maximum of the function R with respect to the control u is related to the properties of the gradient vector of this function with respect to the control. 1. Let u(t) be a scalar function, then for each t the set of admissible controls is a segment; a(t) u(t) b(t). If point u(t) is an interior point of a segment, then ∂ ∂R = H(t, ψ(t + 1), x(t), u(t)) = 0, j = 1, 2, ..., r. ∂uj x,u ∂uj If u(t) = a(t) then the necessary condition for the optimality of the Hamiltonian is the nonpositiveness of its gradient, ∂ H(t, ψ(t + 1), x(t), u(t)) 0. ∂uj When u(t) = b(t) the condition of non-negativity of the gradient (Fig.10.1a) ∂ H(t, ψ(t + 1), x(t), u(t)) 0. ∂uj 2. Let u(t) be a vector function, U (t) a domain with a smooth boundary Γ, at each point of the set Γ one can construct a normal n to the boundary. We will assume that r = 2, then the set U is a plane. For an interior point of the set U, the necessary extremum condition is the equality to zero of the vector-gradient of the Hamiltonian with respect to the control. Let u lie on the boundary Γ of the set of admissible controls. Let us draw the outward normal n to the boundary Γ of the region U (Fig. 10.1b).
166
Chapter 10
The necessary condition for optimality in this case will be inequality n · gradu H(t, ψ(t + 1), x(t), u(t)) 0. All the conditions considered above are only necessary conditions for optimality, and the trajectory obtained from these conditions is only suspicious of the optimum. In other words, additional research
Figure 10.1: *** is required, similar to the continuous option. In particular, if the dynamical system is linear and the integrand is convex in u, the resulting trajectory is optimal.
10.5 Exercises for independent work In the following tasks, find a process that satisfies the necessary optimality conditions in the Lagrange–Pontryagin form. Select the cases when sufficient optimality conditions are satisfied. 1.
2 0
[x2 (t) + u2 (t)] + 2x(3) → min,
x(t + 1) = −x(t) + u(t), 2.
3 0
[x(t) + 2u(t) + u2 (t)] + 2x(4) → min,
x(t + 1) = −x(t) + 2u(t), 3.
3 0
x(0) = 0.
x(0) = 1,
[4x2 (t) − x(t)] − x2 (4) → min,
x(t + 1) = 2x(t) − u(t),
x(0) = 2.
x(4) = 1.
The Lagrange–Pontryagin Method
4.
3 0
[x2 (t) + u2 (t)] + 3x(4) → min,
x(t + 1) = −x(t) + 2u(t),
x(0) = 1.
167
11 Hamilton-Jacobi-Bellman Method. Multi-Step Version 11.1 The main functional relationship of the method Formulation of the problem I[x(·), u(·)] =
T −1
$ % f 0 (t, x, u) + F x(T ) → min;
t=0
$ % x(t + 1) = f t, x(t), u(t) , u(t) ∈ U (t, x);
t = 0, 1, . . . , T − 1; x(0) = x0 .
In the multistep setting, as in the continuous version (see Chapter 7), there are no restrictions on the state, excluding the boundary conditions at the left end. We consider a problem with a free right end in which the final state x(T ) is not given. The constructions R and G that appear in Krotov’s theorem for a multistep process are defined by the following expressions: $ % R(t, x, u) = ϕ t + 1, f (t, x, u) − ϕ(t, x) − f 0 (t, x, u), t = 0, 1, . . . , T − 1; $ % G x(t) = ϕ(T, x) − ϕ(T, x) − ϕ(0, x) + F (t, x). According to $the multistep version of the minimal theorem, if % there is a process x, u(t) ∈ D and some function ϕ(t, x), such that the following conditions are satisfied: 1◦ . R(t, x, u) = $ % 2◦ . G x(T ) =
max R(t, x, u),
(x,u)∈V (t)
min
x(N )∈V (N )
$ % G x(T ) ,
t = 0, 1, . . . , T − 1; t = T;
170
Chapter 11
$ % then the pair x(t), u(t) is the minimum. Let us introduce the function P (t, x) into consideration: $ % P (t, x) = max R(t, x, u) = R t, x, u(t, x) , t = 0, 1, . . . , T −1. u∈U (t,x)
(11.1) Substituting the expression defining the function R(t, x, u) into formula (11.1), we find # " $ % P (t, x) = max ϕ t + 1, f (t, x, u) − f 0 (t, x, u) − ϕ(t, x), u∈U (t,x)
t = 0, 1, . . . , T − 1. Here the function ϕ(t, x) is taken out of the maximum sign, since it does not depend on the control u. We will define ϕ(t, x) so that: 1) for all t = 0, 1, . . . , T −1, the function P (t, x) does not depend on x: P (t, x) = c(t) (11.2) 2) for t = T G(x) = c1 ,
(11.3)
Where c(t) is an arbitrary function of time, c1 is an arbitrary constant. Suppose that ϕ(t, x) satisfies these conditions, then this function coru(t, x) = responds to the synthesis of control = arg max R(t, x, u). u∈U (t,x)
Let us define the trajectory x(t) as a solution to a system of difference equations closed by control synthesis, $ % x(t + 1) = f t, x(t), u(t, x) (11.4) with an initial condition x(0) = x0 and an optimal control program u(t) = u(t, x).
(11.5)
Theorem 11.1.1. Let the function ϕ(t, x) and the corresponding function u(t, x) ensure the fulfillment of conditions (11.2) and (11.3). Then
Hamilton-Jacobi-Bellman Method
171
the function u(t,$x) defines %the synthesis of optimal control, a pair of vector functions x(t), u(t) specified by conditions (11.4) and (11.5), is an optimal process with an optimal trajectory x(t) and an optimal control program u(t). Proof. Condition 2◦ of Theorem 9.1 is satisfied trivially due to the way G(x) is defined (see formula (11.3)). The function u(t, x) is constructed in such a way that for each fixed set of arguments t, x it provides max R(t, x, u). u∈U (t,x)
According to condition (11.2), the function ϕ(t, x) is chosen so that max R(t, x, u) = P (t, x) = c(t), in other words, P (t, x) does u∈U (t,x)
not depend on x. With this choice of ϕ(t, x), the state x, including the trajectory x(t), paired with the control u(t) satisfy condition 1◦ of the theorem on sufficient optimality conditions for multistep processes. When constructing u(t, x) the maximum of the function R(t, x, u) we found on the set u ∈ U (t, x); consequently, the original constraints on the controls, as well of motion (11.4), are satisfied; % $ as the equations therefore, the pair x(t), u(t) is admissible. Since the second conditions in the minimum theorems for continuous and multistep processes are the same, the boundary conditions will also be the same. Taking into account relation (11.3), we obtain ϕ(T, x) = c1 − F (x),
(11.6)
those. the function ϕ(t, x) at the moment t = T must coincide with the terminal function F (x) up to a constant term. From relation (11.2), taking into account conditions (11.4), it follows * ) $ % ϕ(t, x) = max ϕ t + 1, f (t, x, u) − f 0 (t, x, u) − c(t), (11.7) u∈U (t,x)
Expression (11.7) is a discrete analogue of the Hamilton-JacobiBellman equation for continuous processes. Here we have not a partial differential equation, but a simpler functional finite-difference equation (recurrence relation), which defines the dynamic programming method. For c(t) = c1 = 0 the resulting recurrence relation coincides
172
Chapter 11
with the Bellman equation, while the Krotov function ϕ(t, x) is equal to the Bellman payoff function with subtlety up to the sign. Solving the functional equation (11.7) together with the boundary condition (11.6), we find the synthesis of the optimal control u(t, x). Substitution u(t, x) into the process equation (11.4) with given initial conditions x(0) = x0 makes it possible to determine the optimal trajectory x(t), and then the optimal control program corresponding to the given initial conditions, u(t) = u(t, x).
11.2 An example of solving the optimal control problem by the dynamic programming method Minimize functional I=
3 )
$ %2 * x(t) + u(t) + x(4) → min
t=0
on the set of admissible step-by-step processes given by the equation of motion, x(t + 1) = x(t) − u(t), with the initial condition x(0) = 0 and the control constraint u(t) ∈ [1, 2]. In accordance with the above solution scheme, we write down the Bellman equation with the boundary condition (the function c(t) and the constant c1 are taken equal to zero): ) * ϕ(t, x) = max ϕ(t + 1, x − u ) − x − u2 , u∈[1,2]
t = 0, 1, 2, 3,
x(t+1)=x(t)−u(t)
ϕ(4, x) − x. Bellman’s equation connects the value of the function ϕ at a given moment of time t with its value at the next moment of time t + 1. Since the value of the function ϕ at the right end, at the stage with a number N , is known by solving the difference equation of motion in the opposite direction of time, we can construct the function ϕ(t, x). We start with stage number t = T = 4, move from right to left, up to zero t = 0.
Hamilton-Jacobi-Bellman Method
173
First iteration (t = 3). Bellman’s equation has the form ) * ϕ(3, x) = max ϕ(4, x − u) − x − u2 . u∈[1,2]
Using the boundary condition ϕ(4, x) = −x, we rewrite the Bellman equation: ) * ) * ϕ(3, x) = max − x + u − x − u2 = max − 2x + u − u2 . u∈[1,2]
u∈[1,2]
The expression in parentheses is an equation-quadratic form; its vertex corresponding to the stationary point at u = 1/2 lies outside the admissible interval [1; 2], therefore, the boundary maximum is realized, it is reached at the lower boundary u(3) = 1. Substituting the value u(3) = 1 into the expression for ϕ(3, x) gives ϕ(3, x) = = −2x. At the second and subsequent iterations, the order of calculations is repeated. Second iteration (t = 2): ) * ϕ(2, x) = max ϕ(3, x − u) −x − u2 = u∈[1,2] = max
)
−2x(3)
* −2x + 2u −x − u2 = max [−3x + 2u − u2 ]. u∈[1,2]
u∈[1,2] −2x(3)=−2x(2)+2u(2)
Maximizing the last expression with respect to u leads to the fact that u(2) = 1 in other words, at this stage, the maximum is reached at a stationary point, which coincides with the left boundary of the admissible interval, while ϕ(2, x) = −3x + 1. Third iteration (t = 1): ) * ϕ(2, x − u) −x − u2 = ϕ(1, x) = max u∈[1,2] −3x(2)+1=−3x(1)+3u(1)+1
) = max [−3x + 3u − x − u2 + 1] = max − 4x + 3u − u2 + 1]. u∈[1,2]
u∈[1,2]
The maximum is attained at a stationary point, u(1) = 3/2, ϕ(1, x) = −4x + 13/4.
174
Chapter 11
Fourth iteration (t = 0): ) * ϕ(0, x) = max ϕ(1, x − u) − x − u2 = "
u∈[1,2]
= max −4x+4u−x−u2 + u∈[1,2]
" 13 # 13 # = max −5x+4u−u2 + . 4 4 u∈[1,2]
The maximum is reached at u(0) = 2. So, the synthesis of the optimal equation is fully defined u(t, x); in this example, it does not depend on the state and coincides with the control program (Table 11.1). Table 11.1 0 2
t u(t)
1 3/2
2 1
3 1
Knowing the dynamics u(t) and the initial value x(0) = 0, using the difference equation of the process x(t + 1) = x(t) − u(t) we determine the optimal trajectory (Table 11.2). Table 11.2 t x(t)
0 0
1 -2
2 -7/2
3 -9/2
4 -11/2
Unlike the Lagrange-Pontryagin method, the dynamic programming method does not require additional research and substantiation of the optimality of the obtained solution, since the main functional relation of the method directly follows from the sufficient optimality conditions.
Hamilton-Jacobi-Bellman Method
175
11.3 Optimal distribution of investments between projects using dynamic programming For the project management problem, which is considered below, the dynamic programming method is not the only possible one. The presence of a linear constraint in the mathematical model (apart from the trivial constraints associated with the non-negativity of variables) also makes it possible to use other approaches, for example, the Lagrange multiplier method or any other method of mathematical programming. The example under consideration plays an illustrative role, while the question of the expediency of using the dynamic programming method is not discussed here. Suppose that the problem of distribution of a certain fixed investment fund between N directions is being solved, with the aim of $maximizing the total efficiency of investments. Let us denote by % f 0 i, u(i) the efficiency of the implementation of the i-th investment direction with the amount of investments ui . The indicated efficiency can be understood, for example, as the expected increase in output (in value or in kind) when the amount is$ laid out % ui in the i-th direction. 0 We assume that all functions f i, u(i) are increasing, since df 0 /du(i) > 0 at ∀i = 1, 2, . . . , N, 0 ui A. In other words, with an increase in investments within the possible limits, the efficiency of their implementation increases. This, in turn, allows us to conclude that, in the optimal plan, the entire investment fund A should be depleted. The plan for the distribution of investments between the directions corresponds to the following mathematical model: N
$ % f 0 i, u(i) → max
(11.8)
i=1
with restrictions N
u(1) = A,
u(i) 0,
i = 1, 2, . . . , N.
(11.9)
i=1
Model (11.8) - (11.9) in the general case with nonlinear functions
176
Chapter 11
$ % f 0 u(i) represents a mathematical programming problem. Functional (11.8) is assumed to be additive. Let us reduce the problem to a multistep controlled process, and then apply the dynamic programming method to optimize it. Let us show that problem (11.8) - (11.9) can be reduced to the problem considered in § 11.2 For this, we introduce the function x(t), which we define as follows: x(t + 1) = x(t) + u(t),
t = 0, 1, . . . , N − 1,
x(0) = 0.
In this case, constraint (11.9) takes the form x(N ) = A; as a result, an additional constraint on the state at the right end appears in the model. Let us get rid of it by introducing a penalty terminal term M [x(t) − A]2 into functional (11.8), where M > 0 is an arbitrary large number. Finally, we get the model we need in the form I=
N −1
$ % f 0 t, u(t) − M [x(N ) − A]2 → max,
t=1
x(t+1) = x(t)+u(t),
x(0) = x0 ;
u(t) 0,
t = 0, 1, . . . , N −1. (11.10) The introduction into the functional as a terminal term of a negative term corresponding to the squared discrepancy between x(N ) and A with an arbitrary large factor M is a forced measure related to the conditions of applicability of the Hamilton-Jacobi-Belman algorithm, which requires the absence of phase constraints. Model (11.10) corresponds to the canonical form of a multistep controlled process, for which x(t) is the state of the system, u(t) is the control. The meaningful meaning of these indicators will be clear as a result of the analysis of the optimal solutions obtained and presented below. Let us carry out the computational process for the case N = 3. Let us write down the Bellman equation with the following condition: * ) ϕ(t, x) = max ϕ(t + 1, x + u) − f 0 (t, u) , u0
ϕ(N, x) = −M [x(N ) − A]2 ,
t = 0, 1, . . . , N − 1;
Hamilton-Jacobi-Bellman Method
177
whence the optimal control in the form of synthesis * ) u(t, x) = argmax ϕ(t + 1, x + u) − f 0 (t, u) . u0
We carry out the calculations for the case N = 3, A = 10. We take t = i − 1 and ⎧ 2 ⎪ ⎨16u − 0, 4u , t = 0, $ % f 0 t, u(t) = 18u − 0, 6u2 , t = 1, ⎪ ⎩ 25u − 0, 7u2 , t = 2. First iteration (t = 2). According to the first functional relationship of the dynamic programming method ) * ϕ(2, x) = max − M (x + u − 10)2 + 25u − 0, 7u2 . u0
Since M > 0 is an arbitrary, arbitrarily large number, the first syllable will dominate in the expression in square brackets, therefore u(t, x) = = 10 − x. In this case ϕ(2, x) = 25(10 − x) − 0, 7(10 − x)2 . Second iteration (t = 1). ) * ϕ(1, x) = max 25(10−x−u)−0, 7(10−x−u)2 +18u−0, 6u2 = u0
= max(−1, 3u2 + 7u − 1, 4ux + . . . ). u0
Here, the ellipsis refers to all other terms independent of the control u. Maximizing the last expression with respect to the control u (the case of a parabola with branches downward), we obtain
2, 69 − 0, 539x, x 5 (stationary point), u(1, x) = 0, x > 5 (boundary extremum on the left border). Substitution of the optimal synthesis into the expression ϕ(1, x) after reducing similar terms gives
x 5, −0, 323x2 − 14, 769x + 189, 423; ϕ(1, x) = 2 −0, 7x − 11x + 180, x > 5.
178
Chapter 11
Trird iteration (t = 0) ⎧ − 0, 323(x + u)2 − 14, 769(x + u)+2 ⎪ ⎪ ⎪ + 189, 423 + 16u − 0, 4u , ⎪ ⎪ ⎪ x(1)=x(0)+u(0) ⎪ ⎪ ⎨ x + u 5; ϕ(0, x) = max 2 u0 ⎪ ⎪ − 11(x + u)+ − 0, 7(x + u) ⎪ ⎪ ⎪ ⎪ ⎪ + 180 + 16u − 0, 4u2 , ⎪ ⎩ x + u > 5. After reducing similar terms, we get ⎧ − 0, 723u2 + 1, 231u − 0, 646ux + . . . , ⎪ ⎪ ⎪ ⎪ ⎨
0 u 5 − x; 2 + u − 1, 4xu + . . ., − 1, 1u u0 ⎪ ⎪
ϕ(0, x) = max
⎪ ⎪ ⎩
u > 5 − x.
Let us maximize the function ϕ(0, x) over u:
u(1, x) =
0, 85 − 0, 447x, 0,
u(0, x) = 5 − x,
0 x 1, 9, 1, 9 < x 5.
0 x 5.
As a result, we get two optimal options for the distribution of investment resources. First option: u(0) = u(0, x)x=0 = 0, 85; x(1) = x(0) + u(0) = 0, 85; u(1) = u(1, x) = 2, 69 − 0, 539 · 0, 85 = 2, 932; x=0,85
x(2) = x(1) + u(1) = 0, 85 + 2, 232 = 3, 082; u(2) = u(2, x) = 10 − 3, 082 = 6, 918. x=3,082
Hamilton-Jacobi-Bellman Method
179
Second option:
u(0) = u(0, x)x=0 = 5; x(1) = x(0) + u(0) = 5; u(1) = u(1, x) = 0; x=5
x(2) = x(1) + u(1) = 5 + 0 = 5; u(2) = u(2, x) = 10 − 5 = 5. x=5
First option:The distribution of investment resources is significantly different, in the first option, resources must be distributed in all three directions u1 , u2 and u3 , but in different quantities - most of all in the third direction. According to the second option, investments should be distributed only in the first and third directions, and evenly: u1 = u3 = 5. The total effect - the value of the functional in both projects - as it is easy to calculate, is equal to 186. Thus, the solution may not be the only one. What to do in such a case in practice? The first and easiest one is to choose one of the investment options at random. The second is to take into account additional conditions that were not taken into account in the original model.
11.4 Exercises for independent work Find the optimal processes using the Hamilton-Jacobi-Bellman method: 1. I =
3
$ %2 [2x(t) + u(t) ] → min,
t=0
x(t + 1) = x(t) + 2u(t), x(0) = 2, |u| t + 1. 2. I =
3
[x(t) − u(t)] − x(4) → min,
t=0
x(t + 1) = u(t), x(0) = 1, |u| 1. 3. I =
3
[x(t) + u(t)] + x(4) → min,
t=0
x(t + 1) = x(t) − u(t), x(0) = 0, |u|
1 t+1 .
12 Applied Optimal Control Problems 12.1 Optimal control theory and models of macroeconomic dynamics Optimization models of macroeconomic dynamics include the following three elements as components: a system of linear input-output balance equations; nonlinear production function; utility functional. The equations of the input-output balance are based on the proportional relationship between costs and production volumes of various sectors of the economy. Balance models go back to the economic table of François Quesnay 15 (1758); in fact, it became the first economic and mathematical model. F. Quesnay’s table is based on balance proportions between natural (material) and value monetary elements. In other words, there must be a correspondence between the mass of goods and the money supply. F. Quesnay came to this conclusion thanks to a biological analogy: he considered the economy as a living organism, similar to the human body. In society, the classes of farmers, artisans, landowners and the barren class were distinguished, they were considered as organs of society, and money and goods as blood and nutrients that make their cycle in the body of society. From blood circulation and metabolism in the human body, Quesnay moved 15 François Quesnay (1694-1774), being a professional physician, court physician to King Louis XV and personal physician to the Marquise Pompadour, at the age of 60, became interested in questions of philosophy and economics. He was a welleducated man and took part in the work on the Encyclopedia of Diderot, in which his economic table was first published (1758), and its first few copies were printed by the king of France himself, Louis XV. François Quesnay is also known as the head of the Physiocratic School.
182
Chapter 12
on to the circulation of the aggregate social product in the economy. Quesnay’s table was the first macroeconomic model in the history of economic science, which served as the basis for modern balance models: the model of the circulation of income and expenditure, the system of national accounts, and input-output balance models. The world’s first input-output balance was developed in the Soviet Union in 1923-1924. It was the result of a great research work, wellknown Russian economists of that time V.P. Popov, L.N. Litoshenko, V.K. Dmitriev and others took part in it. In particular, V.K. Dmitriev introduced the concept of technological coefficients aij , expressing the costs of the i-th industry for the output of a unit of the product of the jth industry. Dmitriev compiled a system of linear algebraic equations X = AX + Y , here A is the matrix of technological coefficients (matrix of direct costs); vector X - total costs, gross output by industry; vector Y is the final product. This system of equations provides a way of expressing total costs in terms of the final product X = (E − A)−1 Y . The author of the well-known input-output method, Vasily Leontiev, was familiar with the work on intersectoral balance. In the thirties of the last century V. Leontiev emigrated from the USSR to the USA, where he continued to work on the input-output method. In particular, he built input-output matrices for the economies of many countries around the world. In addition to the quantitative values of the coefficients aij, the input-output matrix allows us to identify the most significant relationships between various structural elements in the economy. V. Leontiev received the Nobel Prize for the input-output method. Leontiev’s student was the American economist Robert Solow, who later also became a Nobel laureate for his work on the theory of economic growth.
12.1.1 One-sector optimization model of macroeconomic dynamics The one-sector optimization model is based on the model of exogenous economic growth, which was developed simultaneously and indepen-
Applied Optimal Control Problems
183
dently of each other by Robert Solow and Trevor Swan16 (1956). The economic system is considered as a whole, one universal product is produced, it can be both consumed and invested. The technology in the model is not subject to cardinal changes, therefore, the elements of the direct cost matrix a = const, export – import are not taken into account. The formulation of the problem considered in this section is optimization, it consists in the optimal distribution of the final product between accumulation and consumption [12]. At each moment of time t, the state of the economy is specified by a set of the following variables: 1) X is the gross domestic product; 2) Y is the final product; 3) C - non-productive consumption; 4) K is the volume of fixed assets (capital); 5) L - labor resources (number of employees); 6) I - investment (gross capital investment). One part of the gross product is spent in the same production the same cycle, which is done. It is the current production consumption aX, where a is the share of the current production consumption in the gross product. The second part forms the final product Y. Variables X and Y are related by balance ratio X = aX + Y,
(12.1)
where 0 < a < 1, a = const. 16 Solow R.M. A contribution to the Theory of Economic Growth//The Quarterly Journal of Economics. — 1956. — Feb. (Vol. 70. No 1). — P. 65–94. Swan T. W. Economic growth and capital accumulation // The Economic Record. — 1956. — Nov. (Vol. 32. No 2). — P. 334–361
184
Chapter 12
In turn, the final product breaks down into gross capital investment and non-productive consumption: Y = I + C. Gross capital investments are spent on expanding production and reimbursing the disposal of fixed assets: I=
dK + μK, dt
where μ is a given constant depreciation rate. Let us denote the share of non-productive consumption in the final product through u = C/Y , then dK = I − μK = Y − C − μK = dt = Y (1 − u) − μK = (1 − a)(1 − u)X − μK. As a result, the differential equation of the model will take the form dK = (1 − a)(1 − u)X − μK. dt
(12.2)
We will assume that at the initial moment of time K(0) = K0 , while the condition K(t) Kmin is satisfied, where Kmin is the given minimum volume of fixed assets. For non-production consumption, the condition 0 u(t) 1 is satisfied. The volume of the gross product is determined by a nonlinear production function that depends on the size of production assets, labor resources and time, 0 X F (K, L, t).
(12.3)
It is assumed that the production function is non-negative, with an increase in the volume of resources used, output increases, the marginal productivity of factors is positive and decreasing. The production function also has constant returns to scale. The above
Applied Optimal Control Problems
185
conditions are met by a multiplicative function, in particular, the Cobb–Douglas function F (K, L, t) = X0 eρt K α Lβ , where α and β are the given positive coefficients of the elasticity of output in terms of capital and labor, respectively, ρ is the rate of scientific and technological progress. Further, we will consider a homogeneous production function of the first degree, for which α + β = 1. Such a restriction will allow the task to be reduced to indicators related to a unit of labor resources. Thus, the optimal control problem for a one-product (one-sector) economy is to find a process (K(t), X(t), u(t)) satisfying conditions (12.1)-(12.3), which on a given planning interval will ensure the maximum final consumption, taking into account discounting (maximum of the utility functional): T exp(−δt)C(t) dt → max;
J=
(12.4)
0
here δ is the discount rate. Reduction of the problem to specific indicators Let us introduce the relative technical indicators related to the unit of labor resources, x(t) =
X(t) , L(t)
k(t) =
K(t) , L(t)
c(t) =
C(t) L(t)
which mean: x(t) - labor productivity; k(t) - capital-labor ratio; c(t) - per capita non-productive consumption. Using the homogeneity property of the production function, we 1 F (K, L, t). write it in the form f (k, t) = L(t) We will consider labor resources as an exogenous variable (given outside the system), the growth of labor resources is carried out at a constant rate: dL(t) = ωL, ω = const. dt
186
Chapter 12
Let us write down the differential equation (12.2) for specific relative variables: dk = (1 − a)(1 − u)x − (μ + ω)k. (12.5) dt Restrictions on controls and states: k(t) kmin ,
k(0) = k0 ,
(12.6)
0 u(t) 1,
(12.7)
0 x(t) f (k, t).
(12.8)
Using the relations C = uY , Y = X(1 − a), x = X/L, we transform the utility functional to new variables: T J= 0
C/L
exp(−δt) (1 − a)x(t) u(t) dt → max .
(12.9)
Y /L
In the reduced problem, the state of the process is the stockto-labor ratio k(t), the controls are labor productivity x(t) and the share of consumption u(t). It is required to find an admissible process (k(t), x(t), u(t)) satisfying conditions (12.5)-(12.8), which maximizes the discounted utility functional (12.9). The main mode of economic development To solve the optimization problem, we use the sufficient optimality conditions, in particular, Theorem 4.1. Let ϕ(t, k) be a function with continuous partial derivatives with respect to its arguments. Let us construct an auxiliary construction R(k, x, u, t): R(k, x, u, t) =
# ∂ϕ(t, k) " (1 − a)(1 − u)x − (μ + ω)k + ∂k ∂ϕ(t, k) . + e−δt (1 − a)ux + ∂t
Let us single out in the expression for the function R(k, x, u, t) terms that linearly depend on the product ux. Let us equate the sum
Applied Optimal Control Problems
187
of the coefficients for this product to zero, then R will not depend on the control u. Thus, the function ϕ(t, k) is subject to the constraint −
∂ϕ(t, k) ∂ϕ(t, k) (1 − a) + e−δt (1 − a) = 0 ⇒ = e−δt . ∂k ∂k
Integrating the partial differential equation, we find ϕ(t, k) = e−δt k + ϕ(t), where ϕ(t) is an arbitrary continuous function. We put ϕ(t) = 0 then ϕ(t, k) = e−δt k;
ϕt (t, k) = −δe−δt k.
Under these conditions, the function R does not depend on the control u: R(k, x, u, t) = e−δt [(1 − a)x − (μ + ω)k] − δe−δt k = = e−δt [(1 − a)x − (μ + ω + δ)k]. Let’s maximize the function R with respect to x, k. The function R depends linearly on the control x. Since a 1, max R is 0xf (k,t)
attained at the upper boundary of the admissible set when x(t) = = f (k, t). For a single-product model, this equality is obvious, but in the case of a multi-industry model, it may turn out that some industries are underutilized. Substitute x(t) = x(t) into the expression for R: # " R(k, x, t) = e−δt (1 − a) f (k, t) −(μ + ω + δ)k . 0xf (k,t) max
=x
We denote r(k, t) = (1 − a)f (k, t) − (μ + ω + δ)k. Since the functions max R(k, x, t) and r(k, t) differ only by a factor 0xf (k,t)
e−δt > 0, the arguments of the maximization operations of these two functions will be the same, k(t) = k(t).
188
Chapter 12
The function r(k, t) is the sum of two terms: the first term is proportional to the production function, the second is directly proportional to k. The graph of the function r(k, t) and its components for a fixed t are shown in Fig. 12.1. It follows from the properties of the production function that r(k, t) is a strictly convex function and therefore has a unique maximum in k, which is attained at a stationary point. We take the
Figure 12.1: Behavior of the function r(k, t), time t is fixed Figure 12.2: Highway of the one-sector model of the economy Cobb–Douglas production function, then f (k, t) = X0 eρt k α , where 0 < α < 1. Let us write down the condition that the function r(k, t) is stationary with respect to k: ∂ = 0 ⇒ (1 − a)αX0 eρt k α−1 − (μ + ω + δ)k = 0. r(k, t) ∂k k=k k It follows from the stationarity condition that (μ + ω + δ) −ρt 1/(α−1) k(t) = e α(1 − a)X0 taking into account that 1 − α = β, we obtain α(1 − a)X0 1/β (ρ/β)t k(t) = e . (μ + ω + δ) The trajectory k(t) graph is shown in Fig. 12.2.
(12.10)
Applied Optimal Control Problems
189
The function k(t) is called the backbone of the macroeconomic dynamics model. It plays an important role in the structure of the optimal solution. We find the control realizing this highway by substituting the function k(t) into the differential equation (12.5): dk = (1 − a)(1 − u)x − (μ + ω)k. dt
(12.11)
Since x(t) = f (k, t) = X0 k α eρt solving equation (12.11) with respect to the control u, we obtain u(t) = 1 −
d k dt
+ (μ + ω)k
(1 − a)X0 k α
e−ρt .
(12.12)
Differentiating the turnpike equation (12.10) with respect to t, we obtain ddtk = βρ k. Substitution of this relation into (12.12) gives u(t) = 1 − α
μ+ω+
ρ β
μ+ω+δ
.
(12.13)
The constraint on the control action 0 u(t) 1 is certainly satisfied if ρ βδ. When the boundary conditions lie on the highway, k0 = k(0) and k1 = k(T ), the process (k(t), x(t), u(t)) is optimal. Indeed, this process provides max R for any t: on the control u since R is independent of the control u, which is achieved by choosing the function ϕ(k, t); by control x and state k by construction. At the same time, an admissible process was obtained, since it satisfies the process equation, the control u is found by substituting the main trajectory into the differential equation of the process, 0 u(t) 1 boundary conditions by assumption lie on the highway. So, in the case of specially selected initial conditions, the optimal mode of economic development coincides with the highway k(t) = arg
max
−∞ −∞. Then this sequence minimizes the functional I on D. As a result, an additional transversality condition appears in the optimal control problem with a moving boundary. By virtue of the theorem, μ(t) = R(t, x(t), u(t)) = H(t, ψ, x, u) + ϕt = 0,
Applied Optimal Control Problems
195
then for t = t1 H(t1 , ψ, x, u) = −
∂ϕ(t1 , x) . ∂t1
(12.17)
If the set V1t is not bounded, the necessary conditions for the extremum of the function G with respect to t1 can be represented as ∂ϕ(t1 , x) ∂F (t1 , x) + = 0. (12.18) ∂t1 ∂t1 x(t1 ) From expression (12.17), taking into account relation (12.18), we obtain an additional transversality condition: ∂F (t1 , x) H(t1 , ψ, x, u) = . (12.19) ∂t1 x(t1 )
12.2.2 Optimal in terms of speed control of the simplest mechanical movement - the fastest stopping of an object by a force limited in magnitude Rectilinear motion by inertia of a mechanical object is considered. The task is to stop this movement as quickly as possible at a given point by applying to this object a force whose magnitude is limited, |u| 1. In other words, choosing the law of change in force, it is required to transfer the object from the initial position to the final one in a minimum time. The moment of the end of the movement t1 is not specified; it must be minimized. According to Newton’s second law, the motion of an object is described by the differential equation x ¨ = u,
|u| 1.
(12.20)
In phase coordinates x1 = x, x2 = dx/dt, equation (12.20) is rewritten as a second-order system:
x˙ 1 = x2 , (12.21) x˙ 2 = u,
196
Chapter 12
where x1 is the position of the object, x2 is the speed of movement. Thus, the vector of the phase state will be the vector (x1 , x2 ), and the control will be the force u. The set of admissible controls Vu : |u| 1. The set of admissible states Vx (t) is not bounded for ∀t ∈ (t0 , t1 ), at the ends of the interval, admissible states are given points of the phase plane x1 (t0 ) = x10 , x2 (t0 ) = x20 , x1 (t1 ) = x11 , x2 (t1 ) = x21 . The speed problem is considered: from a given initial state = x1 (t0 ) = x10 , x2 (t0 ) = x20 , transfer the object to the final state x1 (t1 ) = = x11 , x2 (t1 ) = x21 in the minimum time. The functional in this problem is terminal, I = F (t1 , x(t1 )) = t1 − t0 . We write down the Hamiltonian H(t, ψ, x, u) = ψ1 x2 + ψ2 u. The Hamilton function H depends linearly on the control u (Fig. 12.4). A linear function defined in a closed bounded region reaches an extremum at the boundary of the region: u(t, ψ, x) = arg max H(t, ψ, x, u), |u|1
or
⎧ ⎪ ⎨ 1, u(t, ψ) = signψ2 = −1, ⎪ ⎩ ∀u :
ψ2 (t) > 0, ψ2 (t) < 0, |u| 1, ψ2 (t) ≡ 0.
(12.22)
The equations of the maximum principle will include the original differential system (12.21) and the adjoint system, ⎧ ⎨ψ˙ 1 = − ∂H = 0, ∂x1 (12.23) ⎩ψ˙ = − ∂H = −ψ . 2 1 ∂x2 Solution of system (12.23): ψ1 (t) = C1 , ψ2 (t) = −C1 t + C2 , where C1 , C2 are the integration constants. Relation (12.22) is now written in the form u(t) = sign(C2 − C1 t).
(12.24)
Applied Optimal Control Problems
197
Figure 12.4: Linear dependence of the Hamiltonian on control
It follows from relation (12.24) that each optimal control u(t), t0 t t1 , is piecewise functional, taking values ±1 and having at most two intervals of constancy, since the linear function C2 − C1 t changes sign to interval t0 t t1 at most one a lot (Fig. 12.5). The function ψ2 (t) = C2 − C1 t is called the switching function, since at the moment of time τ, which corresponds to the zero value of the function ψ2 (τ) = 0, there is a relay switching of the control u(t) from one boundary of the set of admissible controls to another. Remark. Identical equality to zero of the switching function ψ2 (t) ≡ 0 in this Figure 12.5: Relay control problem is impossible, since it leads to switching with change of a zero conjugate vector, which does not sign of function ψ2 (t) satisfy the maximum principle. It is possible to switch the control from +1 to −1 and vice versa, switch from −1 to +1. In order to find out the order of switching, consider the phase plane (x1 , x2 ). On the phase plane, the equation of motion takes the form
198
Chapter 12
dx1 = ±x2 . dx2 Integrating equations (12.25), we obtain
(12.25)
x22 + B, (12.26) 2 where B is the constant of integration. Thus, the section of the phase trajectory for which the control u ≡ +1 is an arc of the parabola (12.26) with a plus sign on the right-hand side. The control u ≡ −1 corresponds to an arc of a parabola with a minus sign (Fig. 12.6). The phase points move along parabolas (12.26) in different directions: the control u ≡ +1 leads to movement from bottom to top, 2 since dx dt = u = +1 (Fig.12.6a). Control u ≡ −1 corresponds to the opposite direction - from top to bottom (Fig. 12.6b). It was noted above that each optimal control u(t) is a piecewise constant function t that takes values ±1 and has no more than two intervals of constancy. If the control u(t) for some time first equals +1, and then takes the value −1, then the phase trajectory is "glued" x1 = ±
Figure 12.6: Trajectories of motion on the phase plane (x1 , x2 ): a control u(t) ≡ +1; b - control u(t) ≡ −1. Arrows correspond to the direction of increasing t from two sections of parabolas (Fig. 12.7). In this case, the second section lies on that of the parabolas (12.26), which passes through the origin, since, according to the boundary conditions, the desired trajectory must lead to the origin. If at first u = −1 and then u = +1, then the phase trajectory is replaced by a centrally symmetric curve.
Applied Optimal Control Problems
199
Figure 12.7: Switching to the path leading to the end point of the movement - the origin In fig. 12.8 shows the whole family of parabolic phase trajectories (12.26), with the color highlighted by a line composed of two arcs of parabolas: in the lower half-plane, this is the arc of the parabola x1 = x22 /2 in the upper half-plane, the arc of the parabola x1 = = −x22 /2. This line is the switching line. Above the switching line, control u = −1, and below u = +1. When the starting point x0 is located above the switching line, the phase point must move under the influence of the control u = −1 until it falls on that part of the line switching, which is located in the lower half-plane. At the moment the phase point hits the switching line, the
Figure 12.8: Optimal Control Synthesis on the Phase Plane control value u becomes equal to +1, up to hitting the origin. When the initial point x0 is located below the switching line, the control u
must be equal to +1 until the phase point hits the arc of this line passing through the upper half-plane. At the moment the phase point hits the switching line, the value of u switches and becomes equal to −1. So, according to the maximum principle, only the trajectories described above can be optimal. It can be seen from this construction that one and only one such trajectory passes through each point of the phase plane. For definiteness, we choose the starting point below the switching line, with coordinates (−1, 0); then

$$u(t) = \begin{cases} +1, & t \le \tau, \\ -1, & t > \tau. \end{cases} \quad (12.27)$$

We close the original differential system (12.21) with the control u(t). The result of the substitution is a system of differential equations with a discontinuous right-hand side,

$$\begin{cases} \dfrac{dx_1}{dt} = x_2, & x_1(t_0) = -1, \quad x_1(t_1) = 0, \\[2mm] \dfrac{dx_2}{dt} = \begin{cases} +1, & t \le \tau, \\ -1, & t > \tau, \end{cases} & x_2(t_0) = 0, \quad x_2(t_1) = 0. \end{cases} \quad (12.28)$$

The equations of system (12.28) are integrated independently of each other. First, we determine the phase coordinate corresponding to the speed of motion,

$$x_2(t) = \begin{cases} t + A_1, & t_0 \le t \le \tau, \\ -t + A_2, & \tau < t \le t_1. \end{cases}$$

From the boundary conditions we find the integration constants A₁ = −t₀, A₂ = t₁. Then

$$x_2(t) = \begin{cases} t - t_0, & t_0 \le t \le \tau, \\ -t + t_1, & \tau < t \le t_1. \end{cases} \quad (12.29)$$

At the switching point τ, the trajectory x₂(t) must be continuous. From the continuity condition x₂(τ − 0) = x₂(τ + 0) we get τ − t₀ = t₁ − τ; hence the switching moment is τ = (t₀ + t₁)/2.
Substitution of relation (12.29) into the first equation of the differential system (12.28), followed by integration, gives

$$x_1(t) = \begin{cases} \dfrac{(t - t_0)^2}{2} + A_3, & t_0 \le t \le \tau, \\[2mm] -\dfrac{(t_1 - t)^2}{2} + A_4, & \tau < t \le t_1. \end{cases}$$

The values of the integration constants are found from the boundary conditions: A₃ = −1, A₄ = 0. As a result, the optimal trajectory is

$$x_1(t) = \begin{cases} \dfrac{(t - t_0)^2}{2} - 1, & t_0 \le t \le \tau, \\[2mm] -\dfrac{(t_1 - t)^2}{2}, & \tau < t \le t_1. \end{cases} \quad (12.30)$$

From the condition of continuity of the trajectory we obtain x₁(τ − 0) = x₁(τ + 0), or, substituting the value τ = (t₀ + t₁)/2 calculated above,

$$\frac{(\tau - t_0)^2}{2} - 1 = -\frac{(t_1 - \tau)^2}{2}\bigg|_{\tau = \frac{t_0 + t_1}{2}} \;\Rightarrow\; t_1 - t_0 = 2.$$
So, relations (12.29), (12.30) determine the optimal trajectory, the moment of control switching τ = 1, and the moment of the end of motion t₁ = 2. The only construction of the maximum principle that remains undetermined is the adjoint vector function ψ(t), which depends on the two integration constants C₁ and C₂. We calculate one of them using the transversality condition for the optimal control problem with a moving boundary,

$$H(t_1, \psi, x, u) = \frac{\partial F}{\partial t_1} = 1,$$

or

$$\underbrace{\psi_1(t_1)\, x_2(t_1)}_{=0} + \psi_2(t_1)\, \underbrace{u(t_1)}_{-1} = -\psi_2(t_1) = 1.$$

We find the second integration constant from the condition that the switching function vanishes at the point τ = 1. As a result, ψ₂(t₁) = −1, ψ₂(τ) = 0.
Solving this linear algebraic system for C₁ and C₂, we obtain C₁ = C₂ = 1. Thus, the adjoint vector function is nonzero and satisfies the conditions of the maximum principle: ψ₁(t) = 1, ψ₂(t) = −t + 1. The optimal solution of the time-optimal problem is shown in Fig. 12.9. In this problem we were able to define the control as a function of the phase coordinates, u(x₁, x₂); in other words, we obtained the solution in the form of an optimal control synthesis.
Figure 12.9: Optimal solution of the time-optimal problem

The synthesis of the control makes it possible to answer the question of how to get from any starting point to the switching line leading to the given end point.
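The synthesis admits a compact closed-loop form: the two parabolic arcs above combine into the switching line x₁ = −x₂|x₂|/2, so that u = −sign(x₁ + x₂|x₂|/2). The following minimal numerical sketch (the step size, tolerance and starting point are illustrative assumptions) integrates the closed-loop system and reproduces the motion time t₁ − t₀ = 2 for the starting point (−1, 0):

```python
import numpy as np

def u_opt(x1, x2):
    # Switching line of the synthesis: x1 = -x2*|x2|/2.
    s = x1 + 0.5 * x2 * abs(x2)
    if s > 0:
        return -1.0          # above the switching line
    if s < 0:
        return 1.0           # below the switching line
    return -np.sign(x2)      # on the line: move along it toward the origin

# Euler integration of the closed loop x1' = x2, x2' = u(x1, x2)
x1, x2, dt, t = -1.0, 0.0, 1e-4, 0.0
while x1 * x1 + x2 * x2 > 1e-6:
    x1, x2, t = x1 + dt * x2, x2 + dt * u_opt(x1, x2), t + dt
print(f"origin reached at t ≈ {t:.2f}")   # ≈ 2.0, as derived above
```

Near the switching line the discrete control chatters between ±1, which is the usual numerical substitute for sliding along the line.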
Exercises

1. Check the fulfillment of Krotov's sufficient optimality conditions in the time-optimal problem considered above.

2. Solve the problem of the fastest stopping of a mathematical
pendulum by a force bounded in absolute value:

$$\ddot{x} + x = u, \quad |u| \le 1,$$
$$x(t_0) = x_0, \quad \dot{x}(t_0) = v_0, \quad x(t_1) = 0, \quad \dot{x}(t_1) = 0, \quad t_1 - t_0 \to \min.$$
12.3 Maximizing the polarization of a two-level atom in a resonant field

The methods of the theory of optimal control are successfully applied to problems of efficient and economical laser action. This section reproduces the results of I. V. Krasnov, N. Ya. Shaparev and I. M. Shkedov [8]; the presentation follows the monograph by V. I. Gurman [6]. It is assumed that the atom is in a resonant quasi-monochromatic electromagnetic field E₀(t)cos(ωt). It is required to find the action E₀(t) that maximizes the mean-square polarization induced in the atom by the field over the time T. The lower energy level is the ground one. The dynamics of the atomic states in the electromagnetic field is described by the optical Bloch equations, which in the case of exact resonance have the form
$$\begin{cases} \dot{x}_1 = -\gamma_1 x_1 + u x_2, \\ \dot{x}_2 = -\gamma_2 (x_2 - 1) - u x_1, \end{cases} \quad (12.31)$$

where γ₁, γ₂ are the rates of transverse and longitudinal relaxation; x₁ is the amplitude of the polarization of the atom; x₂ is the difference between the populations of the atomic levels; u = dE₀/h is the control; d is the matrix element of the dipole moment; h is Planck's constant. At the initial moment of time the atom is not excited and is in the ground energy state with probability 1: x₁(0) = 0, x₂(0) = 1. The quality criterion is

$$I[x(\cdot), u(\cdot)] = \int_0^T x_1^2(t)\,dt \to \max, \quad (12.32)$$
which corresponds to the maximum energy of photons emitted by the induced dipole in time T.
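Before passing to the analytic treatment, the model (12.31)-(12.32) is easy to probe numerically. The sketch below (the relaxation rates, horizon, and trial control are assumptions for illustration) integrates the Bloch equations by the classical Runge-Kutta scheme and evaluates the criterion (12.32):

```python
import numpy as np

g1, g2, T, n = 1.0, 0.5, 5.0, 5000   # illustrative parameters

def bloch_rhs(x, u):
    # Right-hand side of the Bloch system (12.31) at exact resonance
    x1, x2 = x
    return np.array([-g1 * x1 + u * x2, -g2 * (x2 - 1.0) - u * x1])

def criterion(u_of_t):
    # RK4 integration of (12.31) and accumulation of I = integral of x1^2 dt
    dt, x, I, t = T / n, np.array([0.0, 1.0]), 0.0, 0.0
    for _ in range(n):
        k1 = bloch_rhs(x, u_of_t(t))
        k2 = bloch_rhs(x + 0.5 * dt * k1, u_of_t(t + 0.5 * dt))
        k3 = bloch_rhs(x + 0.5 * dt * k2, u_of_t(t + 0.5 * dt))
        k4 = bloch_rhs(x + dt * k3, u_of_t(t + dt))
        x = x + dt / 6.0 * (k1 + 2 * k2 + 2 * k3 + k4)
        t += dt
        I += dt * x[0] ** 2
    return I

print(criterion(lambda t: 2.0))      # value of (12.32) for a constant field
```

Varying the trial control u(·) gives a quick feel for how the polarization responds before the optimality conditions are applied.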
Let us compose the function R appearing in the theorem on sufficient optimality conditions:

$$R(t, x_1, x_2, u) = \varphi_{x_1}(-\gamma_1 x_1 + u x_2) + \varphi_{x_2}(-\gamma_2 (x_2 - 1) - u x_1) + \varphi_t + x_1^2. \quad (12.33)$$

Let us define the function φ(t, x₁, x₂) according to the method of multiple maxima so that R does not depend on the control u; then

$$\varphi_{x_1} x_2 - \varphi_{x_2} x_1 = 0. \quad (12.34)$$

Equation (12.34) is a linear first-order partial differential equation, whose solution is found by the method of characteristics. The characteristic system

$$\frac{dx_1}{d\theta} = x_2, \qquad \frac{dx_2}{d\theta} = -x_1$$

has the first integral z = x₁² + x₂². As a result, the solution of equation (12.34) is a function φ(t, z); in other words, the arguments of the function φ are the time t and the squared length of the Bloch vector z. So

$$\varphi_{x_1} = 2\varphi_z x_1, \qquad \varphi_{x_2} = 2\varphi_z x_2,$$
$$R(t, z, x_2) = 2\varphi_z\left(-\gamma_1 z + (\gamma_1 - \gamma_2)x_2^2 + \gamma_2 x_2\right) + \varphi_t + z - x_2^2.$$

Here the variable z plays the role of a new phase variable, and x₂ plays the role of a control. We have thus passed from the original problem (12.31)-(12.32) to a derived optimal control problem. Introducing the new variables and constants z = x₁² + x₂², τ = γ₁t, τ_K = γ₁T, η = γ₂/γ₁, we obtain the following problem: maximize the functional

$$I_1[z(\cdot), x_2(\cdot)] = \frac{1}{\gamma_1} \int_0^{\tau_K} (z - x_2^2)\,d\tau$$
on the set of admissible processes defined by the constraints

$$\frac{dz}{d\tau} = 2\left(-z + (1 - \eta)x_2^2 + \eta x_2\right), \qquad z(0) = 1,$$

with, moreover, x₂² ≤ z.

We will seek a solution of the derived problem assuming that the joint inequality constraint on the phase variable and the control is satisfied automatically; we will check this a posteriori. The derived problem belongs to the class of optimal control problems that are linear with respect to z; therefore Pontryagin's conditions are not only necessary but also sufficient. Since the constant positive factor 1/γ₁ does not affect the point of maximum, we write the Hamilton function omitting it:

$$H(\tau, \psi, z, x_2) = 2\psi\left(-z + (1 - \eta)x_2^2 + \eta x_2\right) + (z - x_2^2).$$

The adjoint equation is

$$\frac{d\psi}{d\tau} = -\frac{\partial H}{\partial z} = 2\psi - 1,$$

with the transversality condition ψ(τ_K) = 0. Solving the adjoint equation with this condition, we obtain

$$\psi(\tau) = \frac{1}{2}\left(1 - e^{2(\tau - \tau_K)}\right). \quad (12.35)$$

The optimal control x₂(τ) is determined from the condition of maximum of the Hamiltonian with respect to x₂, which reduces to the stationarity condition ∂H/∂x₂ = 0 together with a check of the sign of the second derivative ∂²H/∂x₂².
If ξ₁₀ > ξ₂₀, then the function F(t) = ξ₁(t) − ξ₂(t) increases exponentially: with increasing t, the payoff of the "red" side grows. In the case ξ₁₀ < ξ₂₀, the function F(t) decreases, and the "blue" side wins (Fig. 12.15).
Figure 12.15: Dynamics of the function F(t) = ξ₁(t) − ξ₂(t) for different ratios of the initial forces of the opponents

According to (12.55), the optimal price of the game, which coincides with the value of the objective functional at the saddle point, is equal to
$$\min_u \max_v F(t_1) = \xi_1(t_1) - \xi_2(t_1) = e^{\lambda(t_1 - t_0)}(\xi_{10} - \xi_{20}), \qquad \lambda = \sqrt{k_{12} k_{21}}.$$

Since at the initial moment of time the value of the adjoint function is ψ₁(t₀) = −e^{λ(t₁ − t₀)}, the following relation holds:

$$-\psi_1(t_0) = \frac{\partial \left( \min_u \max_v F(t_1) \right)}{\partial(\xi_{10} - \xi_{20})}.$$
The value ψ₁(t₀) can be regarded as the sensitivity coefficient of the optimal price of the game with respect to a change in the ratio of the initial forces of the "red" and the "blue". In other words, this objectively determined assessment of the reduced forces (a "shadow price", or Kantorovich multiplier) shows how a change in the balance of forces at the initial moment influences the final outcome of the operation. Thus, the attrition intensities represented by the posynomials (12.51) determine the conditions for the invariant distribution of the reduced forces of the opposing sides, whose dynamics follow the linear differential system (12.47) and the equations (12.52) that determine its solution. Although the optimal shares, the controls u and v, remain undetermined (additional information is required to find them), the analysis performed makes it possible to obtain quantitative estimates of the ratio of forces and to analyze the effectiveness of changing it.

The question of the balance of forces in different types of groupings has always concerned military leaders. The famous Chinese commander and thinker of the 6th-5th centuries BC Sun Tzu wrote in his treatise The Art of War: "Chariots and cavalry are the military might of an army. Ten chariots smash a thousand people; a hundred chariots smash ten thousand people; ten horsemen put a hundred people to flight, a hundred horsemen put a thousand people to flight ... in a flatland battle" (see Fig. 12.16). Similar ratios were put forward as orders to the troops. Thus, during the defense of Okinawa, the commander of the 32nd Army, Lieutenant General of the Imperial Japanese Army Ushijima, wrote: "The battle motto of the 32nd Army is: one plane for one ship; one boat for one ship;
Figure 12.16: The balance of forces in different types of groupings according to Sun Tzu

one person for ten enemies or for one tank."19 In terms of its setting, the situation is very close to the well-known question from Lev Kassil's "The Black Book and Schwambrania": "... And if a whale climbs onto an elephant, who will beat whom?" asked Oska.
19 Nichols C., Shaw G. The Battle of Okinawa. - Moscow: Military Publishing, 1959.
13 Solving Problems Of Applied Optimal Control Theory Using The Apparatus Of Near-Periodic Functions

13.1 Formation of optimal control based on a representative amount of information

The development of computer technology and of means of recording processes of various types has made it possible to measure and record the characteristics of complex systems with high accuracy and detail over long time intervals. Such measurements contain detailed information about the mechanisms of formation and functioning of systems, which can serve as the basis for forecasting and for controlling development, and thereby allow optimizing the objective functional with respect to one or several of the studied parameters. This makes it possible to obtain digital data on the evolution not only of economic processes but, first of all, of physical fields, using a distributed system of digital sensors that do not perturb the measured object. In fact, measurements carried out in this way make it possible to perform a parametric decomposition of the phenomenon and, on this basis, to build black-box models: by probing objects, identifying a system of elements and the possible connections between them, determining the conditions for the implementation of processes, and so on.

The gap between the capabilities of modern metrology in terms of accuracy, detail and duration of measurements of the characteristics of nonlinear processes, on the one hand, and the methods of processing and analyzing this information, on the other, leads to the need to create a system for comprehensive analysis of the hierarchical structure of the mechanisms that shape the development of systems, and for forecasting and optimal control of them on
this basis. According to A. A. Samarskii, mathematical modeling can be divided into three stages: model → algorithm → program [1].
At the first stage, a mathematical model of the object is built and investigated by theoretical methods to obtain important knowledge about the object. At the second stage, an algorithm is developed to implement the model on a computer. At the third stage, a program is created for carrying out computational experiments.

However, in the study of real systems the characteristics and properties of an object and the conditions for the implementation of various processes are often determined primarily on the basis of an analysis of experimental results rather than of a mathematical model. This leads to the class of inverse problems, widely represented by the school of Academician A. N. Tikhonov [2]. The problem to be solved here belongs to this class: from the results of the analysis of measurements, the models and the corresponding development mechanisms are identified, and this makes it possible to form an optimal control apparatus for a given class of processes.

The measurement results contain information about the characteristics of the process under consideration and about the hierarchy of mechanisms realized on different scales, both in time and in the other measured indicators. Detailing these indicators requires a primary disaggregation of the measurement results, which makes it possible to determine the parameters of the mechanisms dominating at each of the levels.

The formulation of the problem of the coexistence of mechanisms realized on fundamentally different time and space scales first
appeared in the works of I. Newton, in the problem of dividing motion into slow and fast components. In this regard, effective analysis of measurement results requires separating the slow and fast components, that is, the trend and the fluctuations about it. By a trend we mean the dependence of the mathematical expectation or variance of the measurement results on the argument. Periodic components of processes are represented both by the values closest to periods (near-periods), for arithmetic progressions, and by the denominators closest to exact ratios (near-proportions), for geometric progressions. Since the method of excluding the trend affects the parameters of the remaining fluctuations, the methods of trend exclusion and of fluctuation analysis must be made consistent with each other. After determining the parameters of the fluctuations, one can filter the original series and then, having obtained the dependence for the trend, determine the corresponding mathematical model and its parameters. The fast and slow components are characteristics of the same process, which makes it necessary to clarify their interrelationships. The development of such models, methods, algorithms and programs will allow solving problems in the development of a number of critical technologies, controlling the parameters of modern technological processes, and so on. An urgent problem is the development of models, algorithms and programs that allow, on the basis of measurement results, a comprehensive analysis of processes with nonlinear oscillations occurring at the macro and micro levels.
13.2 Characteristics of measurement results and their main types

Let us briefly analyze various approaches and methods for modeling the trend dependence and analyzing fluctuations. To do this, let us dwell on
the problem of the interrelationships between the characteristics of processes occurring at different structural levels, for example, the micro and macro levels. Based on the results of this analysis, we will formulate the requirements and objectives for this direction of research.

In general, the analysis and processing of measurement results are often complicated by the difficulty of identifying the fluctuations accompanying the trend as the main tendency of the development of the process. Without extracting the trend component, the analysis of fluctuations can hardly be carried out reliably, since the trend distorts the result, which often makes the results obtained for the fluctuations indeterminate. There are no regular methods for solving this problem: the notion of the trend of a function is intuitive. The simplest example is the function f(x) = x + sin(x); the trend here is clearly the linear function x.

Let the real function f(x) have the form

$$f(x) = F[\varphi(x) + \omega(x)]. \quad (13.1)$$

Here f(x) and F(x) are known functions, F(x) has an inverse function F⁻¹, φ(x) is an unknown sufficiently smooth function, and ω(x) is an unknown near-periodic function. Then F[φ(x)] will be the trend function. Functions of the form (13.1) arise in many practical problems. We will assume that the function φ(x) can be represented with sufficient accuracy as a superposition of known basis functions v_m, m = 1, ..., M:

$$\varphi(x) \approx \sum_{m=1}^{M} a_m v_m. \quad (13.2)$$

The type of the basis functions is set from a priori considerations determined by the specifics of the problem. The unknown coefficients a_m in (13.2) are determined from the condition

$$\min_{a_1, \ldots, a_M} \int \left[ F^{-1}(f(x)) - \sum_{m=1}^{M} a_m v_m \right]^2 dx. \quad (13.3)$$
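Condition (13.3) is an ordinary linear least-squares problem. As a minimal sketch (the choice F = exp, so that F⁻¹ = ln, and the polynomial basis v_m(x) = x^{m−1} are assumptions for illustration):

```python
import numpy as np

def fit_trend(x, f, M=2):
    # Least-squares solution of (13.3): project F^{-1}(f) = phi + omega
    # onto the span of the basis functions v_m (here: powers of x).
    V = np.vander(x, M, increasing=True)        # columns v_1, ..., v_M
    g = np.log(f)                               # F^{-1} f with F = exp
    a, *_ = np.linalg.lstsq(V, g, rcond=None)   # normal equations of (13.3)
    return a, np.exp(V @ a)                     # coefficients and trend F[phi]

x = np.linspace(0.0, 10.0, 400)
f = np.exp(0.3 * x + 0.2 * np.sin(2 * np.pi * x))   # a model of the form (13.1)
a, trend = fit_trend(x, f)
print(a)   # close to [0, 0.3]: the near-periodic part averages out
```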
Condition (13.3) was obtained from the following considerations. From (13.1) we have F⁻¹(f(x)) = φ(x) + ω(x). Then

$$\int \left[ F^{-1}(f(x)) - \sum_{m=1}^{M} a_m v_m \right]^2 dx = \int \left[ \varphi(x) - \sum_{m=1}^{M} a_m v_m + \omega(x) \right]^2 dx. \quad (13.4)$$
The integral of the function ω(x) on the right-hand side of (13.4) is small, since ω(x) is a near-periodic function. Therefore, from (13.4), taking (13.2) into account, condition (13.3) follows. Calculating the integrals in (13.3) and expanding the square, we obtain a quadratic function of the unknown quantities a_m. Equating the partial derivatives with respect to a_m to zero (the extremum condition for a quadratic function), we obtain a system of linear equations for the unknown values a_m.

Consider one particular case that has many applications. Let the function f(τ) have the form

$$f(\tau) = \exp(a + k\tau + \omega(\tau)), \quad (13.5)$$

where the constants a, k and the near-periodic function ω(τ) are unknown. From the above, it is easy to determine the trend function exp(a + kτ). Consider the function

$$P(\tau) = \left[\ln f(\tau)\right]'' = \frac{d^2}{d\tau^2}\,\omega(\tau). \quad (13.6)$$

Obviously, the near-periods of the functions ω(τ) and P(τ) coincide. Hence, from (13.6) one can determine the near-periods T of the function P(τ), and therefore of ω(τ). In real measurements the function f(τ) is given at discrete points, i.e. the values f(τᵢ) are known, where τᵢ = iδ and i is an integer. From the Taylor formula we have the obvious equalities

$$f(\tau_i + \delta) = f(\tau_i) + \frac{df(\tau_i)}{d\tau}\,\delta + \frac{1}{2}\frac{d^2 f(\tau_i)}{d\tau^2}\,\delta^2 + \frac{1}{6}\frac{d^3 f(\tau_i)}{d\tau^3}\,\delta^3 + o(\delta^3),$$
$$f(\tau_i - \delta) = f(\tau_i) - \frac{df(\tau_i)}{d\tau}\,\delta + \frac{1}{2}\frac{d^2 f(\tau_i)}{d\tau^2}\,\delta^2 - \frac{1}{6}\frac{d^3 f(\tau_i)}{d\tau^3}\,\delta^3 + o(\delta^3). \quad (13.7)$$
In (13.7), the remainder o(δ³) has the order of smallness δ⁴. From (13.7) we obtain that the function P(τ) at the discrete points is described by the formula

$$P(\tau_i) = \frac{d^2 \omega(\tau_i)}{d\tau^2} \approx \frac{f(\tau_i + \delta)\,f(\tau_i - \delta) - f^2(\tau_i)}{\delta^2 f^2(\tau_i)} = \frac{1}{\delta^2}\left( \frac{f(\tau_i + \delta)\,f(\tau_i - \delta)}{f^2(\tau_i)} - 1 \right). \quad (13.8)$$
Here the expression f(τᵢ + δ)f(τᵢ − δ)/f²(τᵢ) represents the geometric mean proportion and can be used directly to exclude a trend. In fact, this solves a problem that can be stated as follows. We assume that the trend characteristics are encoded through pivot points, and the task is to find a position of these points that ensures the exclusion of the trend. Consider, as the simplest case, a situation in which three points y_{t−Δt}, y_t, y_{t+Δt}, located at equal distances in the argument from the middle point, are used to solve this problem. In this formulation, the problem reduces to the classical results of the theory of proportions, in accordance with which a segment is divided (Fig. 13.1).
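Before turning to the proportions themselves, here is a one-line discrete implementation of (13.8) (a uniform sampling step δ is assumed):

```python
import numpy as np

def P_discrete(f, delta):
    # (13.8): estimate of (ln f)'' = omega'' from samples on a uniform grid;
    # the trend exp(a + k*tau) of (13.5) cancels exactly in the ratio.
    return (f[2:] * f[:-2] / f[1:-1] ** 2 - 1.0) / delta ** 2
```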
Figure 13.1: Splitting a segment

Consider the arithmetic and geometric proportions. For the arithmetic proportion,

$$b = \frac{a + c}{2} \quad \text{or} \quad y_t = \frac{y_{t-\Delta t} + y_{t+\Delta t}}{2}.$$

Then the state of the system is characterized by the dimensionless criterion

$$S = \frac{y_{t-\Delta t} + y_{t+\Delta t}}{2 y_t} = 1.$$
After taking the logarithm, we get

$$\ln S = \ln \frac{y_{t-\Delta t} + y_{t+\Delta t}}{2 y_t} = 0.$$

For the geometric proportion,

$$b = \sqrt{a \cdot c} \quad \text{or} \quad y_t = \sqrt{y_{t-\Delta t} \cdot y_{t+\Delta t}}.$$

Then the state of the system is characterized by the dimensionless criterion

$$P = \frac{y_{t-\Delta t} \cdot y_{t+\Delta t}}{y_t^2} = 1.$$

After taking the logarithm, we get

$$\ln P = \ln \frac{y_{t-\Delta t} \cdot y_{t+\Delta t}}{y_t^2} = 0.$$
The characteristics of the empirical series represented by these dimensionless criteria can be used as indicators for excluding a trend, whose parameters are taken into account through the magnitude of the shift Δt of the argument relative to the state y_t. Conversion of the measurement results into the coordinates

$$\left( t,\; \ln \frac{y_{t-\Delta t} + y_{t+\Delta t}}{2 y_t} \right) \quad (13.9)$$

and

$$\left( t,\; \ln \frac{y_{t-\Delta t} \cdot y_{t+\Delta t}}{y_t^2} \right) \quad (13.10)$$

leads to the exclusion of trend sections from the data. Thus, to exclude a trend, one can use transformations determined by the relations of the theory of proportions. To check the results obtained for roughness, a class of coordinates for excluding a trend was introduced (Table 13.1).
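As a minimal numerical sketch (the series, the trend and the shift Δt are illustrative assumptions), the coordinates (13.9)-(13.10) can be computed directly; for a purely exponential trend the geometric coordinate cancels the trend exactly:

```python
import numpy as np

t = np.arange(500)
# A series of the form (13.5): exponential trend times a near-periodic factor
y = np.exp(0.01 * t + 0.2 * np.sin(2 * np.pi * t / 37.0))

d = 5                                     # shift Delta t (assumed)
yl, y0, yr = y[:-2 * d], y[d:-d], y[2 * d:]

arith = np.log((yl + yr) / (2.0 * y0))    # coordinate (13.9)
geom = np.log(yl * yr / y0 ** 2)          # coordinate (13.10)

# geom equals omega(t-d) + omega(t+d) - 2*omega(t): the trend is gone
print(geom[:3])
```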
Table 13.1: Types of proportions and coordinates for excluding a trend.

№   Type of proportion         Type of coordinates
1   b = √(a·c)                 ln[y_{t−Δt}·y_{t+Δt}/y_t²] ~ t
2   b = (a + c)/2              ln[(y_{t−Δt} + y_{t+Δt})/(2y_t)] ~ t
3   b = √(a·c)                 √(y_{t−Δt}·y_{t+Δt}) − y_t ~ t
4   b = 2ac/(a + c)            ln[2·y_{t−Δt}·y_{t+Δt}/(y_t(y_{t−Δt} + y_{t+Δt}))] ~ t
5   b = (a² + c²)/(a + c)      ln[(y²_{t−Δt} + y²_{t+Δt})/(y_t(y_{t−Δt} + y_{t+Δt}))] ~ t
6   b = (2ac − a²)/c           ln[(2·y_{t−Δt}·y_{t+Δt} − y²_{t−Δt})/(y_t·y_{t+Δt})] ~ t
13.3 Classes of shift functions for determining near-periods and near-proportions

To identify periods free, as far as possible, from a priori assumptions, we use an approach that relies primarily on the fundamental characteristic property of the period of a function, namely the repetition of the values of the function over an interval of the independent variable equal to the period:

$$f(t + \tau) - f(t) = 0. \quad (13.11)$$

The necessary generality is possessed by a method that identifies periodicities by assessing the degree of repeatability of the behavior of the time series under study at various time shifts. As a rule, real data involve nonlinear fluctuations, and therefore pure periods are quite rare. Consequently, the most one can count on is identifying the values closest to periods; such values are called near-periods. The following definition of a near-periodic function is introduced: the number τ is called an ε-near-period (ε-displacement) of the function f(t), −∞ < t < ∞, if for all t the inequality

$$|f(t + \tau) - f(t)| < \varepsilon \quad (13.12)$$

holds.
If f(t) is a periodic function and τ is its period, that is, f(t + τ) = f(t), then obviously τ is also a near-period for any ε > 0, as is any number of the form nτ (n = ±1, ±2, ...). For the discrete case, if n is the total number of samples of the function f(tᵢ) given by the experimental values, the following function is introduced to determine near-periods:

$$a(\tau_k) = \frac{1}{n - k} \sum_{i=1}^{n-k} \left| f(t_i + \tau_k) - f(t_i) \right|.$$

This function is called the shift function, or the Alter-Johnson function, and is used for a discrete time series f(tᵢ) on an interval of duration T (kinetic data). The sampling interval is equal to T/(n − 1), where n is the total number of samples of the function, and determines the accuracy of measurement of the kinetics of the physicochemical process in time. Here τ_k is a trial period. The system of near-periods τᵢ of the function f(tᵢ) can be defined as the set of local minima of the shift function,

$$\tau_i = \arg\min a(\tau_k), \qquad \tau_{\min} \le \tau_i \le \tau_{\max},$$

where τ_min and τ_max are the natural search limits for the period, chosen so that, on the one hand, values τᵢ < τ_min are discarded, at which the function a(τ_k) can take small values merely because of the inertia of the function f(tᵢ), and, on the other hand, values τᵢ > τ_max are discarded, at which the determination of the mean a(τ_k) becomes unreliable because of the small number of terms in the sum.

Unlike standard methods of oscillation analysis, shift functions are based primarily on the fundamental characteristic property of the period of a function (13.11). This makes it possible to identify near-periods free from a priori assumptions about the value of the near-period or about the form of the function (the oscillation mode) underlying the analyzed data. To assess the roughness of the results obtained, a class of shift functions (Table 13.2) corresponding to distances in the function spaces of functional analysis is introduced.
Table 13.2: Shift functions.

№   Shift function
1   a(τ_k) = (1/(n−k)) · [ Σ_{i=1}^{n−k} (f(tᵢ + τ_k) − f(tᵢ))^{2n} ]^{1/(2n)}   (∗)
2   a(τ_k) = (1/(n−k)) · [ Σ_{i=1}^{n−k} |f(tᵢ + τ_k) − f(tᵢ)|^{p} ]^{1/p}
3   a(τ_k) = [ (1/(n−k)) · Σ_{i=1}^{n−k} |f(tᵢ + τ_k) − f(tᵢ)|^{p} ]^{1/p}
4   a(τ_k) = (1/(n−k)) · Σ_{i=1}^{n−k} (f(tᵢ + τ_k) − f(tᵢ))^{2n}   (∗∗)
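A minimal sketch of the Alter-Johnson function (shift function No. 2 of Table 13.2 with p = 1) and of the search for near-periods as its local minima; the test signal and search limits are illustrative assumptions:

```python
import numpy as np

def shift_function(f, k):
    # Alter-Johnson function a(tau_k): mean |f(t_i + tau_k) - f(t_i)|
    # for a uniformly sampled series, with the lag k in samples.
    return np.mean(np.abs(f[k:] - f[: len(f) - k]))

def near_periods(f, k_min, k_max):
    # Near-periods = local minima of a(tau_k) on [k_min, k_max], cf. (13.11).
    a = np.array([shift_function(f, k) for k in range(k_min, k_max + 1)])
    return [k_min + i for i in range(1, len(a) - 1)
            if a[i] < a[i - 1] and a[i] < a[i + 1]]

t = np.arange(3000)
f = np.sin(2 * np.pi * t / 110) + 0.1 * np.random.randn(t.size)
print(near_periods(f, 30, 500))   # near multiples of 110 samples
```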
Let us consider the effectiveness of the algorithms presented and the sequence of actions in data analysis using a number of examples. Figure 13.2 shows the dynamics of the average number of sunspots.
Figure 13.2: Dynamics of the average number of sunspots. The ordinate is the Wolf number, the abscissa is the time in years. Let us consider the result of expanding the data in Fig. 13.2 into a Fourier series, which corresponds to the standard procedure for
determining the periods of oscillations (Fig. 13.3). Here the 11-year cycle comes to light, and this exhausts the substantive information about the system of cycles. To identify the remaining rhythms, it is necessary to use additional data-processing methods associated with filtering out short and long periods, focused on identifying periods in a limited range. Let us consider the application of the shift function to the same data. To assess the influence of different metrics on the final result, we use shift function No. 2 from Table 13.2 at p = 1, which corresponds to the Alter-Johnson function, as well as at p = 3 and p = 5. The minima (Fig. 13.4) determine the totality of near-periods of solar activity, represented by the sequence of values 11, 22, 33, 43, 54, 67, 78, 89, 100, 110, 121, 132, 143, 156, 168, 179, 189, 200 years. The results also show that different metrics do not affect the positions of the local minima, and some change in the magnitudes of the amplitudes does not change the nature of the interaction of near-periods. Here, up to values of τ of 156 years, a set of harmonics corresponding to the near-period of 11 years is reproduced; beyond that, the structure of the shift function changes.

The presented results show a fundamental difference in information content between the Fourier method, which imposes a fixed oscillation structure on the empirical data, and the metrics of functional analysis, which do not require specifying such a structure. In the first case the selection yielded the 11-year period, which is obvious from the series in Fig. 13.2; indeed, the existence of the eleven-year cycle of solar activity is well known. Primary data processing based on the Fourier method (Fig. 13.3) does not provide any meaningful information about longer rhythms. Searches for longer periods of solar activity by the Fourier method are carried out by filtering short and long oscillations relative to the search interval and have by now led to the identification of a number of cycles, many of which have received proper names and are represented by the results in Fig. 13.4.
Figure 13.3: The result of decomposition of the data in Fig. 13.2 into a Fourier series to determine the cycles of solar activity; along the abscissa, time in months

Figure 13.4: The shift function (**) for the data in Fig. 13.2: (1) the Alter-Johnson function; (2) the function (**) for p = 3; (3) the function (**) for p = 5
As a result, the set of basic periods identified by various authors over the past two hundred years is reproduced by the determination of near-periods on the basis of the metrics of functional analysis. At the same time, significant values of near-periods are revealed in ranges whose boundaries differ by an order of magnitude. In particular, 168 years is associated with the parade of the planets, and 179 years with the minimum angular momentum of the Sun.

Consider the following example. Figure 13.5 shows a fragment of a one-minute cardiogram recording of a healthy person. Figure 13.6 shows the result of applying the Fourier method to the data in Figure 13.5; the period value obtained here is about 1 s. Figure 13.7 shows the result of applying the shift function (the Alter-Johnson function) to the same data; its minima define a set of near-periods. The value obtained by the Fourier method is 0.91 seconds, while the near-period obtained by the shift function is 0.77 seconds. Figure 13.8 shows the frequency of occurrence of cardiac cycle dura-
Figure 13.5: Fragment of a Holter ECG recording in a healthy person.
tions in the analyzed interval of the cardiogram. The result shows that the shift function determines the true mean position of the near-period of the cardiac cycle, while the Fourier method gives a significant systematic error. Application of the Fourier transform reflects the interaction of the measurements with the terms of the Fourier series, i.e. it determines the properties of a new system consisting of the initial data and the Fourier series. This leads to the appearance of nonlinear effects that are not related to the properties of the original empirical series. Thus, the fundamental nature of the transition from Fourier analysis to the metrics of functional analysis in the analysis of nonlinear oscillations is evident.

The algorithm for estimating a set of near-periods in measurement results with a trend consists of two stages, each of which involves shifts in time from the current value. The magnitude of the shift Δt in the trend-exclusion coordinates affects the shape of the fluctuations obtained as a result of excluding the trend. This leads to the need to match it with
Figure 13.6: Result of data processing by the Fourier method

Figure 13.7: Result of data processing by the shift function
Figure 13.8: Heart rate frequency. The abscissa is the time in seconds between the T-waves; the ordinate is the number of cycles

the shift τ used to calculate the shift function. As a result, to estimate the system of near-periods in experimental
data with a trend, it is necessary to expand the concept of a shift function to include, in addition to the argument τ, the quantity Δt. This function a(τ, Δt) will be called the generalized shift function (Table 13.3).

Table 13.3: Generalized shift functions
Investigation of the system of minima of this function with respect to τ and Δt makes it possible to determine the set of near-period values corresponding to the initial empirical series. One can identify both individual local minima and such values of τ or Δt at which "channels" of minima are formed at a fixed value of one of the variables. To identify the "channels", functions are constructed that characterize the average of the values of the function a(τ, Δt) over one of its arguments:

$$\Phi(\tau) = \frac{1}{N} \sum_{\Delta t = 1}^{N} a(\tau, \Delta t), \quad (13.13)$$

$$\Psi(\Delta t) = \frac{1}{L} \sum_{\tau = 1}^{L} a(\tau, \Delta t), \quad (13.14)$$
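A minimal sketch of a generalized shift function and of the channel functions (13.13)-(13.14); pairing the arithmetic-proportion coordinate (13.9) with the Alter-Johnson metric is an assumption for illustration:

```python
import numpy as np

def generalized_shift(y, tau, dt):
    # a(tau, dt): exclude the trend with shift dt (coordinate (13.9)),
    # then apply the Alter-Johnson metric with trial period tau.
    g = np.log((y[:-2 * dt] + y[2 * dt:]) / (2.0 * y[dt:-dt]))
    return np.mean(np.abs(g[tau:] - g[:-tau]))

def channels(y, taus, dts):
    # Phi(tau) and Psi(dt) of (13.13)-(13.14): means of a(tau, dt)
    # over dt and over tau, respectively.
    A = np.array([[generalized_shift(y, tau, dt) for dt in dts] for tau in taus])
    return A.mean(axis=1), A.mean(axis=0)
```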
where N and L are the numbers of values in the corresponding rows. The positions of the near-periods are determined by the minima of the functions (13.13), (13.14).

The rhythms of a geometric progression satisfy the relation

$$f(t \cdot k) - f(t) = 0, \quad (13.15)$$

where f(t) is the value of the series under study at time t, and k is the denominator of the geometric progression. This relation fixes the distance along the ordinate between points whose distances along the abscissa are in the ratio k. To determine the denominators of a geometric progression, free as far as possible from a priori assumptions, we use an approach that relies on relation (13.15) and consists in estimating the degree of repetition of the values of the function at points whose abscissas are in the ratio k. We will call the number k a near-proportion if for all t, −∞ < t < ∞, the function f(t) satisfies the inequality

$$|f(t \cdot k) - f(t)| < \varepsilon. \quad (13.16)$$

For the discrete case, if N is the total number of samples of the function f(t) given by the experimental values, we introduce the following metric to determine near-proportions:

$$b(k) = \frac{1}{N/k} \sum_{t=1}^{N/k} \left| f(t \cdot k) - f(t) \right|. \quad (13.17)$$

To identify a geometric progression, it is necessary to know the position of the zero point, which can lie inside or outside the interval of the data under study. To determine its position, we extend the class of functions (13.17) so that the trial value of the zero position appears as a second argument:

$$b(k, t_0) = \frac{1}{N/k} \sum_{t=1}^{N/k} \left| f(t \cdot k + t_0) - f(t + t_0) \right|. \quad (13.18)$$
Then the system of near-proportions k of the function f(t) can be defined as the set of local minima of the function (13.18),

$$k = \arg\min b(k, t_0), \qquad k_{\min} \le k \le k_{\max}, \quad t_{0\min} \le t_0 \le t_{0\max},$$

where k_min, k_max and t_{0 min}, t_{0 max} are the natural search limits for the near-proportion and the zero of reference, chosen so that, on the one hand, small values are discarded, at which the function b(k, t₀) can take small values merely because of the inertia of the function f(t), and, on the other hand, large values are discarded, at which the determination of the mean b(k, t₀) becomes unreliable because of the small number of samples. In effect, this reveals the rhythms of a geometric progression present in the experimental data, regardless of the form of the oscillations. Table 13.4 shows a class of shift functions based on the metrics of functional analysis for determining near-proportions; a code sketch of the basic metric follows the table.
Table 13.4: Shift functions for determining near-proportions.

№   Shift function
1   b(k, t₀) = (1/(N/k)) · [ Σ_{t=1}^{N/k} (f(t·k + t₀) − f(t + t₀))^{2n} ]^{1/(2n)}
2   b(k, t₀) = (1/(N/k)) · [ Σ_{t=1}^{N/k} |f(t·k + t₀) − f(t + t₀)|^{p} ]^{1/p}
3   b(k, t₀) = (1/(N/k)) · Σ_{t=1}^{N/k} (f(t·k + t₀) − f(t + t₀))^{2n}
4   b(k, t₀) = [ (1/(N/k)) · Σ_{t=1}^{N/k} |f(t·k + t₀) − f(t + t₀)|^{p} ]^{1/p}
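A minimal sketch of the basic metric (13.18); restricting the trial denominator k to integers (so that t·k falls on the sampling grid) is a simplifying assumption, since a real-valued k would require interpolation:

```python
import numpy as np

def near_proportion_metric(f, k, t0=0):
    # b(k, t0) of (13.18): mean |f(t*k + t0) - f(t + t0)| over the
    # admissible range of t, for an integer trial denominator k.
    n = len(f)
    t = np.arange(1, (n - 1 - t0) // k + 1)   # indices with t*k + t0 < n
    return np.mean(np.abs(f[t * k + t0] - f[t + t0]))
```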
13.4 Models and algorithms for determining the parametric connection between the characteristics of processes at the macro and micro levels

It is natural to assume that the characteristics of the processes occurring at different levels are consistent with one another. Let us consider the possibility of using the parameters of lower-level processes to predict the characteristics of an upper-level process. When the near-periods are determined after excluding the trend from the initial measurement results, and the characteristics of these processes are matched through the generalized shift function, a set of near-periods is obtained as the characteristic times over which the upper-level system is formed. Thus, the problem of linking the characteristics of processes occurring at different levels can be solved by establishing relationships between the characteristics of these processes. The opportunities that open up here can be examined on the dynamics of processes of limited growth, for which typical mathematical models determining their trend characteristics are known. These are, first of all, the logistic model, which N. N. Semenov took as the basis of the theory of chain reactions, and the Gompertz model.

Let us consider the relaxation properties of the models of limited growth. The logistic model is determined by the equation

$$\frac{dy}{dx} = ky(y_\infty - y). \quad (13.19)$$

Let us introduce the dimensionless variable J = y/y_∞. As a result, we obtain an equation of the form

$$\frac{dJ}{dx} = k y_\infty J(1 - J),$$

where the dimensional parameter α = ky_∞ has the dimension inverse to that of the argument. Let us introduce a new variable z related to the dimensionless variable J by J = 1/z.
Then the logistic equation takes the form

$$\frac{dz}{dx} = -\alpha(z - 1). \quad (13.20)$$

Let us introduce the new variable u = z − 1. In this case the logistic equation is transformed to

$$\frac{du}{dx} = -\alpha u. \quad (13.21)$$

We check equation (13.21) for stability with respect to a phase shift:

$$\frac{du(x)}{dx} = -\alpha u(x - \delta), \quad (13.22)$$

where δ is the delay. We seek a solution of this equation in the form u(x) = e^{vx}, which leads to a characteristic equation of the form

$$v = -\alpha e^{-v\delta}.$$

The relation

$$v\delta = -1 \quad \text{or} \quad v = -\frac{1}{\delta}$$

corresponds to tangency of the left- and right-hand sides of the characteristic equation, which corresponds to the maximum realization of the characteristics of the process. Substituting this relation into the assumed form of the solution gives

$$u = e^{-x/\delta} = z - 1 \quad \text{or} \quad z = 1 + e^{-x/\delta},$$

whence

$$\frac{dz}{dx} = -\frac{1}{\delta}\,e^{-x/\delta} = -\frac{1}{\delta}(z - 1). \quad (13.23)$$

As a result, the dimensional parameter of the logistic equation is equal to the reciprocal of the delay (relaxation): α = 1/δ, or δ = 1/(ky_∞).
The Gompertz model. For the Gompertz model dy/dx = −ky ln y, we introduce the dimensionless variable J = y/y_∞ and rewrite the original equation in the form

$$\frac{dJ}{dx} = -kJ \ln J,$$

or

$$\frac{1}{J}\frac{dJ}{dx} = -k \ln J. \quad (13.24)$$

The left-hand side of equation (13.24) is a logarithmic derivative, so the equation takes the form

$$\frac{d \ln J}{dx} = -k \ln J. \quad (13.25)$$

Here there is a single dimensional parameter k, with the dimension inverse to that of the argument x. We introduce the new variable u = ln J, in terms of which equation (13.25) can be rewritten as

$$\frac{du}{dx} = -ku. \quad (13.26)$$

Let us estimate the stability of the resulting equation to a phase shift. To do this, consider the equation

$$\frac{du(x)}{dx} = -ku(x - \delta), \quad (13.27)$$

where δ is the delay. For this equation, as for the logistic one, we seek a solution in the form u(x) = e^{vx}. The characteristic equation again leads to the relations vδ = −1, or v = −1/δ. As a result,

$$u = e^{-x/\delta} = \ln J \quad \text{and} \quad \frac{du}{dx} = -\frac{1}{\delta}\,e^{-x/\delta} = -\frac{1}{\delta}\,u.$$
Comparing the resulting equation with equation (13.26), we find the main dimensional parameter of the original Gompertz equation: k = 1/δ, i.e., it is equal to the reciprocal of the delay (relaxation) δ. Thus, the magnitude of the lag, as a characteristic of the process at the level preceding the original process, determines the main dimensional parameter of both the logistic model and the Gompertz model. This gives a direct connection between the characteristics of processes occurring at two adjacent levels of the hierarchy. Hence, knowledge of the dimensional parameter makes it possible to determine the characteristic time (or spatial scale) of the process occurring at the preceding level, and, conversely, the characteristic time (or spatial scale) of the process at the macrolevel is determined from the characteristic scale of the process at the microlevel.
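The substitution chain behind (13.26) is easy to verify numerically: integrating the Gompertz equation directly, ln(y/y_∞) must decay as u₀e^{−kx}. A minimal check (parameter values are illustrative assumptions):

```python
import numpy as np

k, y_inf, y0, dx, n = 0.7, 10.0, 1.0, 1e-3, 5000
y, x, err = y0, 0.0, 0.0
u0 = np.log(y0 / y_inf)
for _ in range(n):
    y += dx * (-k * y * np.log(y / y_inf))   # explicit Euler for Gompertz
    x += dx
    # compare u = ln(y/y_inf) with the exact solution of (13.26)
    err = max(err, abs(np.log(y / y_inf) - u0 * np.exp(-k * x)))
print(f"max deviation from (13.26): {err:.1e}")   # small, of order dx
```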
13.5 Methods for identifying trends and their parameters based on anamorphosis

The use of anamorphoses, i.e. transformations under which the original data become linear, leads to the selection of regions corresponding to the chosen model. Thanks to the straightening of the initial data in coordinates corresponding to particular models, the selection of a suitable class of functions can be decided on a regular basis. By enumeration, one selects the model whose anamorphosis linearizes the initial data over a larger interval than the others; this model receives the highest significance rank.

The structure of nonlinear oscillations with a trend is the subject of a large number of theoretical and applied studies. There are two main approaches to this problem: the first, following I. Newton, is associated with the problem of separation of motions, while the second, following N. Wiener, is focused on analyzing the process as a whole.

The algorithm for analyzing measurement results presented
by N. Wiener was oriented toward studying the characteristics of systems whose dynamics is not accompanied by an inflow or consumption of energy, i.e. isolated systems. Since it is not known a priori whether this condition is satisfied for the measurement results being analyzed, the results considered here are based on the method of separation of motions. I. Newton knew the equations of motion of the center of mass of a rigid body, and the separation of motions was determined by the difference between the real dynamics and a reference trajectory. This method, in precisely this formulation, underlies the development of systems for automatic optimal motion control. The task becomes much more complicated when the equations of the reference trajectory (the trend) are unknown. In the theory of time-series analysis, the Slutsky-Yule effect is known, according to which the method of excluding the trend significantly affects the results of the analysis of periodic components. It is known that the classical results of N. D. Kondratiev (large conjuncture cycles, Kondratieff waves) were subjected to devastating criticism because of arbitrary methods of trend exclusion. The situation is further complicated by the fact that, despite the intuitive meaning of the term "trend", there was no formal definition of it.

The formal definition of the trend introduced here has made it possible to single out a meaningful class of functions that ensure the elimination of the trend with a transition to deviations with a mean close to zero. This class of functions is represented by the classical theory of proportions. Depending on the structure of the analyzed series, a single exclusion of the trend does not always lead to a sequence with a mean close to zero. This indicates the possible presence of a hierarchy of trends in the analyzed data, which are eliminated successively until the mean is close to zero. As a result, oscillations with a mean close to zero remain, which may or may not contain components close to periods. Whether the series obtained after the elimination of the trend
(or trends) belongs to the class of functions containing components close to periods is subject to verification. To analyze and identify the periodic components in measurement results, series of periodic functions are usually used, for example Fourier series. However, nothing guarantees that data of arbitrary nature are in any way related to such a series. Instead, the possibility of using distances in function spaces (shift functions) has been implemented, which makes it possible to determine the systems of values closest to periods (near-periods); in this case no a priori system of functions is imposed on the analyzed data. Using the defining property of near-periodic functions allows one to determine whether the sequence under study belongs to this class, which provides additional control of the reliability of the obtained near-period estimates. On the other hand, the roughness of the near-period values is checked by comparing the values obtained with shift functions of different classes.

The developed systems of nonlinear transformations (anamorphoses), which linearize the initial time series, make it possible to single out a sequence of linear or piecewise linear intervals. The hierarchies of mathematical models have made it possible to determine the corresponding anamorphoses, whose boundaries reveal the positions and significance ranks of the hierarchy of critical points (points of phase transitions). At the same time, the linear sections of the anamorphoses and the critical points determine the values of the parameters of the models used.

Since the characteristics of processes at different levels of the hierarchy represent properties of the analyzed system as a whole, it is natural to expect these characteristics to be interrelated. For processes of limited growth, the near-periods of the fluctuations about the trend determine the main parameter (the characteristic time) of the process as a whole. Among the near-periods detected by the shift functions, non-multiple values are often found; their synchronization makes it possible to identify periodic structures realized at the macro
level. Thus, we have presented a sequence of mathematical models and algorithms for the analysis, forecasting, and optimal control of the hierarchy of processes of nonlinear oscillations with a trend on the basis of measurement results.
Literature

1. Boltyansky V.G. Optimal Control of Discrete Systems. — Moscow: Nauka, 1973. — 448 p.
2. Budylin A.M. Calculus of Variations. 2001. [Electronic resource: digital library Bookfi].
3. Bryson A., Ho Yu-Chi. Applied Optimal Control. — Moscow: Mir, 1972. — 544 p.
4. Vasiliev F.P. Optimization Methods. — Moscow: Factorial Press, 2002. — 824 p.
5. Gelfand I.M., Fomin S.V. Calculus of Variations. — Moscow: Fizmatlit, 1961. — 228 p.
6. Gurman V.I. The Principle of Extension in Control Problems. — Moscow: Nauka, 1985. — 288 p.
7. Kosha A. Calculus of Variations. — Moscow: Higher School, 1983. — 279 p.
8. Krasnov I.V., Shaparev N.Ya., Shkedov I.M. Optimal Control of Resonance Radiation Processes // Optics Communications. — 1980. — No. 2.
9. Krotov V.F., Gurman V.I. Methods and Problems of Optimal Control. — Moscow: Nauka, 1973. — 448 p.
10. Krotov V.F., Danilina N.I. Optimal Management of Economic Processes: Textbook / Moscow Institute of Economics and Statistics. — Moscow: MESI, 1978. — 75 p.
11. Lagosha B.A. Optimal Control in Economics: Theory and Applications: Tutorial. — Moscow: Finance and Statistics, 2008. — 224 p.
12. Krotov V.F. Fundamentals of the Theory of Optimal Control: A Textbook for Economic Universities. — Moscow: Higher School, 1990. — 430 p.
13. Pontryagin L.S. The Principle of Maximum in Optimal Control. — Moscow: URSS, 2004. — 64 p.
14. Smirnov V.I., Krylov V.I., Kantorovich L.V. Calculus of Variations. — Leningrad: Kubuch, 1933. — 204 p.
15. Tikhomirov V.M. Stories about Maxima and Minima. — Moscow: MTsNMO, 2006. — 200 p.
16. Fan Liang-tsen, Wang Chu-sen. The Discrete Maximum Principle. — Moscow: Mir, 1967. — 182 p.
17. Young L. Lectures on the Calculus of Variations and Optimal Control Theory. — Moscow: Mir, 1974. — 488 p.