Trends in Nonlinear and Adaptive Control: A Tribute to Laurent Praly for his 65th Birthday (Lecture Notes in Control and Information Sciences, 488). ISBN 3030746275, 9783030746278


English Pages 294 [291] Year 2021


Table of contents :
Preface
Contents
1 Almost Feedback Linearization via Dynamic Extension: a Paradigm for Robust Semiglobal Stabilization of Nonlinear MIMO Systems
1.1 Foreword
1.2 Invertibility and Feedback Linearization
1.3 Normal Forms of Uniformly Invertible Nonlinear Systems
1.3.1 Normal Forms
1.3.2 Strongly Minimum-Phase Systems
1.4 Robust (Semiglobal) Stabilization via Almost Feedback Linearization
1.4.1 Standing Assumptions
1.4.2 The Nominal Linearizing Feedback
1.4.3 Robust Feedback Design
1.5 Application to the Problem of Output Regulation
1.6 An Illustrative Example
References
2 Continuous-Time Implementation of Reset Control Systems
2.1 Introduction
2.2 Objective and Primary Assumption
2.3 Continuous-Time Implementation and Main Result
2.4 Examples and Simulations
2.4.1 Example 2.1 Revisited
2.4.2 A Clegg Integrator Controlling a Single Integrator System
2.4.3 A Bank of Clegg Integrators Controlling a Strictly Passive System
2.4.4 A Bank of Stable FOREs Controlling a Detectable Passive System
2.5 Conclusion
References
3 On the Role of Well-Posedness in Homotopy Methods for the Stability Analysis of Nonlinear Feedback Systems
3.1 Introduction
3.2 Signal Spaces
3.2.1 Examples of Signal Spaces
3.2.2 Composite Signals
3.3 Systems, Controllability, and Causality
3.3.1 Controllability
3.3.2 Input/Output Systems, Causality, and Hemicontinuity
3.4 Stability and Gain of IO Systems
3.4.1 Finite-Gain Stability
3.4.2 Relationships Between Gain, Small-Signal Gain, and Norm Gain
3.4.3 Stability Robustness in the Gap Topology
3.4.4 Stability via Homotopy
3.5 Stability of Interconnections
3.5.1 Well-Posed Interconnections
3.5.2 Regular Systems
3.5.3 Integral Quadratic Constraints
3.6 Summary
3.7 Appendix
References
4 Design of Heterogeneous Multi-agent System for Distributed Computation
4.1 Introduction
4.2 Strong Diffusive State Coupling
4.2.1 Finding the Number of Agents Participating in the Network
4.2.2 Distributed Least-Squares Solver
4.2.3 Distributed Median Solver
4.2.4 Distributed Optimization: Optimal Power Dispatch
4.3 Strong Diffusive Output Coupling
4.3.1 Synchronization of Heterogeneous Liénard Systems
4.3.2 Distributed State Estimation
4.4 General Description of Blended Dynamics
4.4.1 Distributed State Observer with Rank-Deficient Coupling
4.5 Robustness of Emergent Collective Behavior
4.6 More than Linear Coupling
4.6.1 Edge-Wise Funnel Coupling
4.6.2 Node-Wise Funnel Coupling
References
5 Contributions to the Problem of High-Gain Observer Design for Hyperbolic Systems
5.1 Introduction
5.2 Problem Description and Solutions
5.2.1 Triangular Form for Observer Design
5.2.2 The High-Gain Observer Design Problem
5.3 Observer Design for Systems with a Single Velocity
5.3.1 Problem Statement and Requirements
5.3.2 Direct Solvability of the H-GODP
5.4 Observer Design for Systems with Distinct Velocities
5.4.1 System Requirements and Main Approach
5.4.2 Indirect Solvability of the H-GODP
5.5 Conclusion
References
6 Robust Adaptive Disturbance Attenuation
6.1 Introduction
6.2 Problem Formulation and Objectives
6.2.1 Preliminaries and Notation
6.3 Known Stable Plants: SISO Systems
6.3.1 Discrete-Time Systems
6.3.2 Continuous-Time Systems
6.4 Known Stable Plants: MIMO Systems
6.4.1 Discrete-Time Systems
6.4.2 Continuous-Time Systems
6.5 Unknown Minimum-Phase Plants: SISO Systems
6.5.1 Non-adaptive Case: Known Plant and Known Disturbance Frequencies
6.5.2 Adaptive Case: Unknown Plant and Unknown Disturbance
6.6 Numerical Simulation
6.6.1 SISO Discrete-Time Systems with Known Plant Model
6.6.2 SISO Continuous-Time Systems with Known Plant Model
6.6.3 MIMO Discrete-Time Systems with Known Plant Model
6.6.4 SISO Discrete-Time Systems with Unknown Plant Model
6.7 Conclusion
References
7 Delay-Adaptive Observer-Based Control for Linear Systems with Unknown Input Delays
7.1 Introduction
7.1.1 Adaptive Control for Time-Delay Systems and PDEs
7.1.2 Results in This Chapter: Adaptive Control for Uncertain Linear Systems with Input Delays
7.2 Adaptive Control for Linear Systems with Discrete Input Delays
7.2.1 Global Stabilization under Uncertain Plant State
7.2.2 Global Stabilization Under Uncertain Delay
7.2.3 Local Stabilization Under Uncertain Delay and Actuator State
7.3 Observer-Based Adaptive Control for Linear Systems with Discrete Input Delays
7.4 Adaptive Control for Linear Systems with Distributed Input Delays
7.5 Beyond the Results Given Here
References
8 Adaptive Control for Systems with Time-Varying Parameters—A Survey
8.1 Introduction
8.2 Motivating Examples and Preliminary Result
8.2.1 Parameter in the Feedback Path
8.2.2 Parameter in the Input Path
8.2.3 Preliminary Result: State-Feedback Design for Unmatched Parameters
8.3 Output-Feedback Design
8.3.1 System Reparameterization
8.3.2 Inverse Dynamics
8.3.3 Filter Design
8.3.4 Controller Design
8.4 Simulations
8.5 Conclusions
References
9 Robust Reinforcement Learning for Stochastic Linear Quadratic Control with Multiplicative Noise
9.1 Introduction
9.2 Problem Formulation and Preliminaries
9.3 Robust Policy Iteration
9.4 Multi-trajectory Optimistic Least-Squares Policy Iteration
9.5 An Illustrative Example
9.6 Conclusions
References
10 Correction to: Contributions to the Problem of High-Gain Observer Design for Hyperbolic Systems
Correction to: Chapter 5 in: Z.-P. Jiang et al. (eds.), Trends in Nonlinear and Adaptive Control, Lecture Notes in Control and Information Sciences 488, https://doi.org/10.1007/978-3-030-74628-5_5
Index


Lecture Notes in Control and Information Sciences 488

Zhong-Ping Jiang · Christophe Prieur · Alessandro Astolfi, Editors

Trends in Nonlinear and Adaptive Control A Tribute to Laurent Praly for his 65th Birthday

Lecture Notes in Control and Information Sciences Volume 488

Series Editors:
Frank Allgöwer, Institute for Systems Theory and Automatic Control, Universität Stuttgart, Stuttgart, Germany
Manfred Morari, Department of Electrical and Systems Engineering, University of Pennsylvania, Philadelphia, USA

Advisory Editors:
P. Fleming, University of Sheffield, UK
P. Kokotovic, University of California, Santa Barbara, CA, USA
A. B. Kurzhanski, Moscow State University, Moscow, Russia
H. Kwakernaak, University of Twente, Enschede, The Netherlands
A. Rantzer, Lund Institute of Technology, Lund, Sweden
J. N. Tsitsiklis, MIT, Cambridge, MA, USA

This series reports new developments in the fields of control and information sciences, quickly, informally and at a high level. The type of material considered for publication includes:

1. Preliminary drafts of monographs and advanced textbooks
2. Lectures on a new field, or presenting a new angle on a classical field
3. Research reports
4. Reports of meetings, provided they are (a) of exceptional interest and (b) devoted to a specific topic. The timeliness of subject material is very important.

Indexed by EI-Compendex, SCOPUS, Ulrich's, MathSciNet, Current Index to Statistics, Current Mathematical Publications, Mathematical Reviews, IngentaConnect, MetaPress and SpringerLink.

More information about this series at https://link.springer.com/bookseries/642

Zhong-Ping Jiang · Christophe Prieur · Alessandro Astolfi
Editors

Trends in Nonlinear and Adaptive Control
A Tribute to Laurent Praly for his 65th Birthday

Springer

Editors Zhong-Ping Jiang Department of Electrical and Computer Engineering New York University Brooklyn, NY, USA

Christophe Prieur Automatic Control CNRS Saint-Martin-d'Hères, France

Alessandro Astolfi Department of Electrical and Computer Engineering Imperial College London London, UK

ISSN 0170-8643 ISSN 1610-7411 (electronic) Lecture Notes in Control and Information Sciences ISBN 978-3-030-74627-8 ISBN 978-3-030-74628-5 (eBook) https://doi.org/10.1007/978-3-030-74628-5 MATLAB is a registered trademark of The MathWorks, Inc. See https://www.mathworks.com/ trademarks for a list of additional trademarks. Mathematics Subject Classification: 34H05, 34K35, 37N35, 49L20, 49N90, 93C10, 93C20, 93C35, 93C40, 93C55, 93C73, 93D05, 93D25 © The Editor(s) (if applicable) and The Author(s), under exclusive license to Springer Nature Switzerland AG 2022, corrected publication 2022 This work is subject to copyright. All rights are solely and exclusively licensed by the Publisher, whether the whole or part of the material is concerned, specifically the rights of translation, reprinting, reuse of illustrations, recitation, broadcasting, reproduction on microfilms or in any other physical way, and transmission or information storage and retrieval, electronic adaptation, computer software, or by similar or dissimilar methodology now known or hereafter developed. The use of general descriptive names, registered names, trademarks, service marks, etc. in this publication does not imply, even in the absence of a specific statement, that such names are exempt from the relevant protective laws and regulations and therefore free for general use. The publisher, the authors and the editors are safe to assume that the advice and information in this book are believed to be true and accurate at the date of publication. Neither the publisher nor the authors or the editors give a warranty, expressed or implied, with respect to the material contained herein or for any errors or omissions that may have been made. The publisher remains neutral with regard to jurisdictional claims in published maps and institutional affiliations. 
This Springer imprint is published by the registered company Springer Nature Switzerland AG The registered company address is: Gewerbestrasse 11, 6330 Cham, Switzerland

to Laurent, a friend and a continuous source of inspiration

Preface

This book is a tribute to Laurent Praly on the occasion of his 65th birthday. Throughout his 40-year career Laurent has contributed ground-breaking results, has initiated research directions, has laid the foundations of adaptive control, nonlinear stabilization, nonlinear observer design, and network systems, and has motivated, guided, and forged students, junior researchers, and colleagues. In addition, he has been a driving force for the intellectual and cultural growth of the systems and control community worldwide. The volume collects nine contributions written by a total of seventeen researchers. The leading author of each contribution has been selected among the researchers who have worked or interacted with Laurent, have been influenced by his research activity, or have had the privilege and honor of being his Ph.D. students. The contributions focus on two foundational areas of control theory: nonlinear control and adaptive control, in which Laurent has been an undisputed top player for four decades. The diversity of the areas covered and the depth of the results are tangible evidence of Laurent's impact on the way control problems are currently studied and results are developed. Control would be a very different discipline without Laurent's vision and without his ability to push the boundaries of what is known and achievable. Laurent's papers are timeless: the results therein are fundamental and are never superseded by more advanced or newer results. They constitute cornerstones upon which generations of control theorists will build. Similarly, practitioners and industrialists have greatly benefited from Laurent's engineering ingenuity and tools. As anticipated, the contributions in the book reflect important areas which have been pioneered and influenced by Dr. L. Praly, as detailed hereafter. It has been known for a long time that invertible MIMO nonlinear systems can be input–output linearized via dynamic state feedback.
However, the techniques originally developed to achieve this design goal are fragile, as they require the availability of an accurate model of the plant and access to the full state. Very
recently, a robust version of these techniques has been developed, by means of which a linear input–output behavior can be approximated up to any arbitrarily fixed degree of accuracy. As a byproduct, for a strongly minimum phase invertible MIMO system, these techniques provide a robust stabilization paradigm, which can be also used in wider contexts, for instance, to simplify the solution of a problem of output regulation. The chapter “Almost Feedback Linearization via Dynamic Extension: a Paradigm for Robust Semiglobal Stabilization of Nonlinear MIMO Systems,” by A. Isidori and Y. Wu, reviews the techniques in question and their application to the design of output regulators. The chapter “Continuous-Time Implementation of Reset Control Systems,” by A. Teel, considers using a differential inclusion, instead of a hybrid system, to effectuate a linear control system with resets. In particular, it establishes global exponential stability for the differential inclusion when the hybrid version of the reset control system admits a strongly convex Lyapunov function that establishes stability. The problem of establishing the stability of the feedback interconnection of two systems is perhaps the most fundamental and well-studied problem in control theory. In their seminal 1997 paper, Megretski and Rantzer extended the classical multiplier approach to this problem by using a homotopy argument to circumvent the standard requirement that the multipliers admit certain factorizations. In “On the Role of Well-Posedness in Homotopy Methods for the Stability Analysis of Nonlinear Feedback Systems,” R. A. Freeman shows how to relax their assumption that the feedback interconnection is well-posed along the entire homotopy path. In “Design of Heterogeneous Multi-agent System for Distributed Computation,” J. G. Lee and H. Shim study the design aspect of heterogeneous multi-agent systems by a tool set based on singular perturbation analysis. 
A few applications illustrate how the tool is employed for generating multi-agent systems or algorithms. The chapter “Contributions to the Problem of High-Gain Observer Design for Hyperbolic Systems,” by C. Kitsos, G. Besançon, and C. Prieur, extends classical results on high-gain observers to quasilinear hyperbolic partial differential equations (PDE). Assuming that the first coordinate of the state defines the output, two different observer design methods are given: firstly a direct method, assuming that there is only one nonlinear functional velocity in the PDE, giving a natural extension to what is known for finite-dimensional nonlinear systems; secondly an indirect method, where a suitable state transformation is used to deal with distinct functional velocities. The adaptive attenuation of unknown periodic or approximately periodic output disturbances in the presence of broadband noise and modeling uncertainties is an important practical problem with a wide range of applications. The chapter “Robust Adaptive Disturbance Attenuation,” by S. Jafari and P. Ioannou, proposes several feedback adaptive control techniques which are shown analytically and demonstrated via simulations to reject periodic disturbances without attenuating the output noise and exciting unmodeled dynamics. Some of the novel techniques used include over-parametrization that provides the structural flexibility to meet multiple control objectives and robust adaptive laws for parameter estimation. The results
have also been extended to the case of minimum-phase and possibly unstable plant models with unknown parameters, where the objectives of control, disturbance rejection, and robustness with respect to output noise and modeling errors are met. The proposed techniques cover continuous- and discrete-time plants as well as MIMO systems. In “Delay-Adaptive Observer-Based Control for Linear Systems with Unknown Input Delays,” M. Krstic and Y. Zhu present a tutorial retrospective of advances, over the last ten years, in adaptive control of linear systems with input delays, enabled with a parameter-adaptive certainty-equivalence version of PDE backstepping. In addition to unknown plant parameters and unmeasured plant states, they address delay-specific challenges like unknown delays (delay-adaptive designs) and systems with distributed delays, where the delay kernels are unknown functional parameters, estimated with infinite-dimensional update laws. In “Adaptive Control for Systems with Time-Varying Parameters—A Survey,” K. Chen and A. Astolfi survey the so-called congelation of variables method for adaptive control. This method allows recasting an adaptive control problem with time-varying parameters into an adaptive control problem with constant parameters and a robust control problem with time-varying perturbations. This allows applying classical adaptive control results to systems with time-varying parameters. Both state-feedback and output-feedback designs are presented. Boundedness of closed-loop signals and convergence of the output/state are guaranteed without any restrictions on the rates of parameter variations. In “Robust Reinforcement Learning for Stochastic Linear Quadratic Control with Multiplicative Noise,” B. Pang and Z. P. Jiang focus on the development of robust reinforcement learning algorithms that learn adaptive optimal controllers from limited data.
The chapter first shows that the well-known policy iteration algorithm is inherently robust in the sense of small-disturbance input-to-state stability and then presents a novel off-policy reinforcement learning algorithm for data-driven adaptive and stochastic LQR with multiplicative noise. We complete the preface with some personal considerations. As control theorists we have been blessed to share time, as Ph.D. students and collaborators, with Laurent. It is very difficult to describe the magic that occurs in Laurent’s office, while writing on the board, or in the nearby forest, while searching for chestnuts or mushrooms. It is, however, this magic that makes Laurent unique and special, as a researcher, as a teacher, and as a friend, and that has attracted us and many colleagues to learn, work and interact with him. Without the “Fontainebleau experience” our life would have been different, and for this gift we are grateful to Laurent. It is for us a great honor to celebrate Laurent’s contributions to science and to our life. New York Grenoble London January 2021

Zhong-Ping Jiang Christophe Prieur Alessandro Astolfi

Contents

1 Almost Feedback Linearization via Dynamic Extension: a Paradigm for Robust Semiglobal Stabilization of Nonlinear MIMO Systems . . . . . 1
Alberto Isidori and Yuanqing Wu

2 Continuous-Time Implementation of Reset Control Systems . . . . . 27
Andrew R. Teel

3 On the Role of Well-Posedness in Homotopy Methods for the Stability Analysis of Nonlinear Feedback Systems . . . . . 43
Randy A. Freeman

4 Design of Heterogeneous Multi-agent System for Distributed Computation . . . . . 83
Jin Gyu Lee and Hyungbo Shim

5 Contributions to the Problem of High-Gain Observer Design for Hyperbolic Systems . . . . . 109
Constantinos Kitsos, Gildas Besançon, and Christophe Prieur

6 Robust Adaptive Disturbance Attenuation . . . . . 135
Saeid Jafari and Petros Ioannou

7 Delay-Adaptive Observer-Based Control for Linear Systems with Unknown Input Delays . . . . . 189
Miroslav Krstic and Yang Zhu

8 Adaptive Control for Systems with Time-Varying Parameters—A Survey . . . . . 217
Kaiwen Chen and Alessandro Astolfi

9 Robust Reinforcement Learning for Stochastic Linear Quadratic Control with Multiplicative Noise . . . . . 249
Bo Pang and Zhong-Ping Jiang

Correction to: Contributions to the Problem of High-Gain Observer Design for Hyperbolic Systems . . . . . C1
Constantinos Kitsos, Gildas Besançon, and Christophe Prieur

Index . . . . . 279

Chapter 1

Almost Feedback Linearization via Dynamic Extension: a Paradigm for Robust Semiglobal Stabilization of Nonlinear MIMO Systems

Alberto Isidori and Yuanqing Wu

Abstract It is well known that invertible MIMO nonlinear systems can be input–output linearized via dynamic state feedback (augmentation of the dynamics and memoryless state feedback from the augmented state). The procedures for the design of such feedback, developed in the late 1980s for nonlinear systems, typically are recursive procedures that involve state-dependent transformations in the input space and cancelation of nonlinear terms. As such, they are fragile. In a recent work of Wu, Isidori, Lu, and Khalil, a method has been proposed, consisting of interlaced design of dynamic extensions and extended observers, that provides a robust version of those feedback-linearizing procedures. The method in question can be used as a systematic tool for robust semiglobal stabilization of invertible and strongly minimum-phase MIMO nonlinear systems. The present paper provides a review of the method in question, with an application to the design of a robust output regulator.

1.1 Foreword

This paper is dedicated to Laurent Praly on the occasion of his 65th birthday. It is a real honor to have been invited to prepare a paper in honor of one of the most influential and respected authors in our community. Over the years, the first author of this paper has been deeply influenced, in his own work, by the ideas, style and mathematical rigor of Laurent, since the first time he met him (on the sands of a secluded beach in Belle Île, while catching and eating fresh palourdes) in the course of a workshop held there in 1982 under the aegis of the French CNRS. Working with Laurent, an opportunity that we wish had occurred much more frequently, was a real privilege and every time an incredible chance for learning new technical skills and broadening one's knowledge. Among the various areas in which Laurent produced seminal results stand, of course, those of feedback stabilization and output regulation of nonlinear systems. It is for this reason that, in this paper, we have chosen to report our modest new contributions to a design problem that sits in between these two areas.

A. Isidori (B) Department of Computer, Control and Management Engineering, University of Rome “Sapienza”, Rome, Italy. e-mail: [email protected]
Y. Wu School of Automation, Guangdong University of Technology, Guangzhou, China. e-mail: [email protected]
© The Author(s), under exclusive license to Springer Nature Switzerland AG 2022. Z.-P. Jiang et al. (eds.), Trends in Nonlinear and Adaptive Control, Lecture Notes in Control and Information Sciences 488, https://doi.org/10.1007/978-3-030-74628-5_1

1.2 Invertibility and Feedback Linearization

Feedback linearization has been one of the most popular, but also frequently despised, approaches to the design of feedback laws for nonlinear systems. The idea originated in 1978 with a work of R. Brockett who, in [1], while investigating the effect of feedback from the state on an input-affine nonlinear system, showed that the joint effect of a feedback and a change of coordinates could yield a system modeled by linear equations. Independently, a similar approach was pursued by G. Meyer and coauthors at NASA, in the design of autopilots for helicopters. The idea soon attracted the attention of various other authors and, in 1980, B. Jakubczyk and W. Respondek provided a complete solution to the problem of determining, for a SISO input-affine system, conditions for the existence of feedback laws and changes of coordinates yielding a system modeled by linear equations [2]. Since then, the problem gained a lot of popularity, in view of its intuitive appeal. However, the fragility of such a design method was also immediately pointed out, because the method involves cancelation of nonlinear terms (and is hence questionable in the presence of model uncertainties) and access to all components of the state (and is hence again questionable if only limited measurements are available for feedback). A somewhat less ambitious version of this approach is that of forcing a linear input–output behavior via state feedback. Such a design requires substantially weaker assumptions, but the above-mentioned criticisms of lack of robustness still persist. It was only relatively recently, in 2008, that a robust alternative was proposed by L. Freidovich and H. Khalil [3], who showed how it is possible to robustly control an input-affine SISO nonlinear system so as to recover, up to any arbitrarily fixed degree of accuracy, the performance that would have been obtained by means of the classical input–output feedback linearization design, had the parameters of the system been accurately known and had the full state been available. Essentially, in the terminology introduced earlier by J. C. Willems [4] in the analysis of the problem of disturbance decoupling for linear systems, the authors of [3] have proven how to achieve almost input–output feedback linearization by means of a robust controller.

In the early 1980s, various authors had also addressed the problem of controlling an input-affine MIMO nonlinear system so as to obtain a linear input–output behavior. In particular, J. Descusse and C. Moog, in a 1985 paper [5], showed that any invertible system can be forced to have a linear input–output behavior if a dynamic feedback is used, i.e., a control consisting of the addition of extra state variables and of a feedback from the augmented state. This result was acknowledged to be quite
powerful, because one cannot think of an assumption weaker than invertibility, but again the underlying design is not robust, as it relies upon exact cancelations and availability of the full state of the controlled plant. In particular, the method in question is based on a recursive design procedure that, at each step, requires a state-dependent change of coordinates in the input space, an intrinsically non-robust operation. Very recently, in [6], taking advantage of some developments concerning the structure of the normal forms of invertible nonlinear systems [7], a robust version of the method of [5] has been proposed. Specifically, the results of [6] have shown how it is possible to design a robust controller that solves the problem of almost input–output feedback linearization, for a reasonably general class of uniformly invertible MIMO systems. The purpose of the present paper is to summarize the highlights of the main results of this work, in the more general context of systems possessing a nontrivial zero dynamics, and to show how they can provide a useful paradigm for robust stabilization of MIMO systems: as an application, it is also shown how the method in question can be profitably used in the solution of a problem of output regulation.

1.3 Normal Forms of Uniformly Invertible Nonlinear Systems

We consider in this paper input-affine nonlinear systems modeled by equations of the form

$$\dot{\bar x} = \bar f(\bar x) + \bar g(\bar x)\,u\,,\qquad y = \bar h(\bar x)$$

with state $\bar x\in\mathbb R^{n}$, input $u\in\mathbb R^{m}$ and output $y\in\mathbb R^{m}$, in which $\bar f(\cdot)$ and the $m$ columns of $\bar g(\cdot)$ are smooth vector fields, while the $m$ entries of $\bar h(\cdot)$ are smooth functions. It is also assumed that $\bar x=0$ is an equilibrium of the unforced system, i.e., that $\bar f(0)=0$, and, without loss of generality, that $\bar h(0)=0$.
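As a concrete instance of this model class (a hypothetical example, not taken from the chapter), one can take an unforced pendulum with $n=2$ and $m=1$; a minimal Python sketch simulating it by forward Euler:

```python
import numpy as np

# A hypothetical instance of the input-affine model: a pendulum with
# n = 2, m = 1, satisfying f(0) = 0 and h(0) = 0 as assumed in the text.
f = lambda x: np.array([x[1], -np.sin(x[0])])   # drift vector field
g = lambda x: np.array([0.0, 1.0])              # input vector field (one column)
h = lambda x: x[0]                              # output map

# Forward-Euler simulation of the unforced system (u = 0) from a small angle.
x, dt = np.array([0.1, 0.0]), 1e-3
for _ in range(1000):
    x = x + dt * (f(x) + g(x) * 0.0)
print(h(x))   # close to 0.1 * cos(1) for small oscillations
```

The specific vector fields here are invented for illustration; any smooth choices with the stated equilibrium properties would do.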

1.3.1 Normal Forms

It is known (see [8, 9], [10, pp. 251–280]) that if a MIMO nonlinear system having the same number $m$ of input and output components is uniformly invertible in the sense of Singh [11], and if certain vector fields are complete (see in particular [8]), there exists a globally defined change of coordinates by means of which the system can be expressed, in normal form, as


$$
\begin{aligned}
\dot z &= f_0(z,x) + g_0(z,x)\,u\\
\dot x_{i,1} &= x_{i,2}\\
&\;\;\vdots\\
\dot x_{i,r_1} &= x_{i,r_1+1} + \delta^{1}_{i,r_1+1}(z,x)\,\big(a_1(z,x)+b_1(z,x)u\big)\\
&\;\;\vdots\\
\dot x_{i,r_2-1} &= x_{i,r_2} + \delta^{1}_{i,r_2}(z,x)\,\big(a_1(z,x)+b_1(z,x)u\big)\\
\dot x_{i,r_2} &= x_{i,r_2+1} + \sum_{j=1}^{2}\delta^{j}_{i,r_2+1}(z,x)\,\big(a_j(z,x)+b_j(z,x)u\big)\\
&\;\;\vdots\\
\dot x_{i,r_{i-1}} &= x_{i,r_{i-1}+1} + \sum_{j=1}^{i-1}\delta^{j}_{i,r_{i-1}+1}(z,x)\,\big(a_j(z,x)+b_j(z,x)u\big)\\
&\;\;\vdots\\
\dot x_{i,r_i-1} &= x_{i,r_i} + \sum_{j=1}^{i-1}\delta^{j}_{i,r_i}(z,x)\,\big(a_j(z,x)+b_j(z,x)u\big)\\
\dot x_{i,r_i} &= a_i(z,x)+b_i(z,x)\,u\\
y_i &= x_{i,1}\,,\qquad i=1,\dots,m,
\end{aligned}
\tag{1.1}
$$

where $(z,x)\in\mathbb R^{n-r}\times\mathbb R^{r}$, with $x=\operatorname{col}(x_1,\dots,x_m)$ and $x_i=\operatorname{col}(x_{i,1},\dots,x_{i,r_i})$ for $i=1,\dots,m$, and $y=\operatorname{col}(y_1,\dots,y_m)$.¹ Note that, as a consequence of the assumption that $\bar f(0)=0$, $f_0(0,0)=0$ and $a_i(0,0)=0$ for all $i$.

¹ For convenience, it is assumed that $\dim(y_i)=1$ for all $i=1,\dots,m$ and $r_1<r_2<\dots<r_m$. In general, one should consider $y$ split into $q$ blocks $y_1,\dots,y_q$, with $\dim(y_i)=m_i\ge 1$ and $\sum_{i=1}^{q} m_i=m$. The structure of the equations remains the same.

As a consequence of the property of uniform invertibility, certain parameters that characterize the normal form (1.1), namely the vectors $b_i(z,x)$ that pre-multiply the input $u$ and the “multipliers” $\delta^{j}_{i,k}(z,x)$, have special properties. To describe such properties and their consequences, set

$$A(z,x)=\begin{pmatrix} a_1(z,x)\\ \vdots\\ a_m(z,x)\end{pmatrix},\qquad B(z,x)=\begin{pmatrix} b_1(z,x)\\ \vdots\\ b_m(z,x)\end{pmatrix},$$

in which case the last equations of each block of (1.1) can be rewritten together in compact form as

$$\begin{pmatrix}\dot x_{1,r_1}\\ \vdots\\ \dot x_{m,r_m}\end{pmatrix} = A(z,x)+B(z,x)\,u\,.$$

A consequence of uniform invertibility (see, e.g., [10, p. 274]) is that the $m\times m$ matrix $B(z,x)$ is invertible for all $(z,x)$. Such a property, which extends to the present general setting the classical (but quite restrictive) property of having a vector relative
degree, can be naturally exploited in the design of state-feedback control laws. For instance, because of such property, one could think of choosing the control $u$ as

$$u = B^{-1}(z,x)\,\big[-A(z,x)+\bar u\big]\tag{1.2}$$

changing this way the system into a system of the simpler form

$$
\begin{aligned}
\dot z &= f_0(z,x)+g_0(z,x)B^{-1}(z,x)\big[-A(z,x)+\bar u\big]\\
\dot x_{i,1} &= x_{i,2}\\
&\;\;\vdots\\
\dot x_{i,r_1} &= x_{i,r_1+1} + \delta^{1}_{i,r_1+1}(z,x)\,\bar u_1\\
&\;\;\vdots\\
\dot x_{i,r_2-1} &= x_{i,r_2} + \delta^{1}_{i,r_2}(z,x)\,\bar u_1\\
\dot x_{i,r_2} &= x_{i,r_2+1} + \sum_{j=1}^{2}\delta^{j}_{i,r_2+1}(z,x)\,\bar u_j\\
&\;\;\vdots\\
\dot x_{i,r_{i-1}} &= x_{i,r_{i-1}+1} + \sum_{j=1}^{i-1}\delta^{j}_{i,r_{i-1}+1}(z,x)\,\bar u_j\\
&\;\;\vdots\\
\dot x_{i,r_i-1} &= x_{i,r_i} + \sum_{j=1}^{i-1}\delta^{j}_{i,r_i}(z,x)\,\bar u_j\\
\dot x_{i,r_i} &= \bar u_i\\
y_i &= x_{i,1}\,,\qquad i=1,\dots,m.
\end{aligned}
$$
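To see the cancelation behind (1.2) at work, here is a minimal Python sketch for a hypothetical two-input case with invented $A(x)$ and $B(x)$ (chosen so that $B$ is invertible for every $x$, as uniform invertibility requires); under the law $u = B^{-1}(-A+\bar u)$, the derivative of the last block equals the virtual input:

```python
import numpy as np

# Invented data for illustration: these A and B are NOT from the chapter.
def A(x):
    return np.array([x[0] * x[1], np.sin(x[0])])

def B(x):
    # det B = 2 + x[1]**2 > 0, so B is invertible for every x.
    return np.array([[1.0, x[1]], [-x[1], 2.0]])

def linearizing_feedback(x, v):
    """The law (1.2): u = B(x)^{-1} (-A(x) + v), with v the virtual input."""
    return np.linalg.solve(B(x), -A(x) + v)

x = np.array([0.7, -1.2])
v = np.array([1.0, -3.0])
u = linearizing_feedback(x, v)
xdot = A(x) + B(x) @ u       # derivative of the last block under the feedback
print(np.allclose(xdot, v))  # True: the map v -> xdot is now the identity
```

The point of the sketch is only the algebra of the cancelation; the fragility discussed in the text arises precisely because `A` and `B` are rarely known exactly.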

The property that $B(z, x)$ is invertible is, on the other hand, also useful in the characterization of the so-called zero dynamics of the system (see below) and of its asymptotic properties.

It is worth stressing that, if the "multipliers" $\delta^{j}_{i,k}(z, x)$ in (1.1) were all independent of $(z, x)$, the state-feedback law (1.2) would induce a linear input–output behavior.² In general, though, the multipliers $\delta^{j}_{i,k}(z, x)$ are not constant. However, as a consequence of the property of invertibility, they can only depend on the individual components of $x$ in a special way, which can be described as follows. For any sequence of real variables $x_{ij}$, with $1 \le j \le r$, let $x_i$ and $\bar{x}_{ik}$ denote the strings
$$
x_i = (x_{i1}, x_{i2}, \ldots, x_{ir}), \qquad \bar{x}_{ik} = (x_{i1}, x_{i2}, \ldots, x_{ik}).
$$
Specifically, we assume that³:

Assumption 1.1 The multipliers $\delta^{j}_{i,k+1}(z, x)$ in (1.1) are independent of $z$ and depend on the components of $x$ in a "triangular" fashion, as in
$$
\delta^{j}_{i,k+1}(x_1, \ldots, x_{i-1}, \bar{x}_{i,k}, \ldots, \bar{x}_{m,k}), \qquad r_{i-1} \le k \le r_i - 1, \;\; 1 \le j \le i - 1\,.
$$

² A similar property of "triangular" dependence was introduced earlier in [7] in the study of the problem of stabilization by output feedback.

Remark 1.1 An interesting fallout of this Assumption, highlighted in [7], is that all components of $x$ can be expressed as functions of the components of $y$ and of a suitable number of their higher-order derivatives with respect to time. Thus, a uniformly invertible nonlinear system in which $\dim(z) = 0$ is, if Assumption 1.1 holds, uniformly observable.

1.3.2 Strongly Minimum-Phase Systems

A nonlinear system is said to be globally minimum-phase if the internal dynamics arising when the control is chosen so as to force the output to be identically zero are globally asymptotically stable. In the case of the normal form (1.1), it is seen that if $y(t)$ is identically zero then so is $x(t)$ and, necessarily,
$$
u(t) = [B(z(t), 0)]^{-1}[-A(z(t), 0)]\,.
$$
As a consequence, the zero dynamics are those of
$$
\dot{z} = f_0(z, 0) + g_0(z, 0)[B(z, 0)]^{-1}[-A(z, 0)] \tag{1.3}
$$
and the system is said to be globally minimum-phase if the equilibrium $z = 0$ of the latter is globally asymptotically stable. It has been stressed in [13, 14] that, in the design of stabilizing feedback laws, it is appropriate to look at a stronger notion of "minimum-phase," which, roughly speaking, requires the dynamics of the inverse system to be input-to-state stable.⁴

³ It is easy to show that the Assumption in question is compatible with the assumption of uniform invertibility, i.e., that if in a normal form like (1.1) such an Assumption holds and the matrix $B(z, x)$ is nonsingular, the system is uniformly invertible in the sense of Singh. However, it must be stressed that the necessity of such triangular dependence has been proven only for systems having $m = 2$ and a trivial dynamics of $z$.
⁴ A property that implies, but is not implied by, the property that the system is globally minimum-phase.


In the present context of systems modeled in normal form, the property in question considers, instead of (1.3), the forced dynamics
$$
\dot{z} = f_0(z, x) + g_0(z, x)[B(z, x)]^{-1}[-A(z, x) + \chi], \tag{1.4}
$$
seen as a system with state $z$ and inputs $(x, \chi)$, and requires the latter to be input-to-state stable. With the results of [15, 16] in mind, we formally define such a property as follows.

Definition 1.1 System (1.1) is strongly minimum-phase (SMP) if there exist a $C^1$ function $V : \mathbb{R}^{n-r} \to \mathbb{R}$ and four class $\mathcal{K}_\infty$ functions $\underline{\alpha}(\cdot), \bar{\alpha}(\cdot), \alpha(\cdot), \sigma(\cdot)$ such that
$$
\underline{\alpha}(\|z\|) \le V(z) \le \bar{\alpha}(\|z\|) \qquad \text{for all } z \in \mathbb{R}^{n-r}
$$
and
$$
\frac{\partial V}{\partial z}\Big[ f_0(z, x) + g_0(z, x)[B(z, x)]^{-1}[-A(z, x) + \chi] \Big] \le -\alpha(\|z\|) + \sigma(\|x\|) + \sigma(\|\chi\|) \tag{1.5}
$$
for all $(z, x, \chi) \in \mathbb{R}^{n-r} \times \mathbb{R}^{r} \times \mathbb{R}^{m}$. The system is strongly, and also locally exponentially, minimum-phase (eSMP) if the inequalities above hold with $\underline{\alpha}(\cdot), \bar{\alpha}(\cdot), \alpha(\cdot), \sigma(\cdot)$ that are locally quadratic near the origin.
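As a sanity check of Definition 1.1 on a deliberately simple case, the sketch below uses a hypothetical scalar example, $\dot z = -z + \chi$ with no $x$-dependence and $V(z) = z^2$, and verifies the dissipation inequality (1.5) on a grid, with $\alpha(s) = s^2$ and $\sigma(s) = s^2$ (both locally quadratic, as required for eSMP).

```python
import numpy as np

# Scalar toy example: z' = -z + chi, candidate V(z) = z^2.
# dV/dz * (-z + chi) = -2 z^2 + 2 z chi <= -z^2 + chi^2  (Young's inequality),
# so (1.5) holds with alpha(s) = s^2 and sigma(s) = s^2.
zs = np.linspace(-5.0, 5.0, 201)
chis = np.linspace(-5.0, 5.0, 201)
Z, X = np.meshgrid(zs, chis)

dV = 2.0 * Z * (-Z + X)      # derivative of V along trajectories
bound = -Z**2 + X**2         # -alpha(|z|) + sigma(|chi|)
violation = np.max(dV - bound)
print(violation)  # <= 0, since dV - bound = -(z - chi)^2
```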

1.4 Robust (Semiglobal) Stabilization via Almost Feedback Linearization

1.4.1 Standing Assumptions

System (1.1), if the matrix $B(z, x)$ is invertible and Assumption 1.1 holds, is uniformly invertible. Hence, as has been known for a long time (see, e.g., [5], [17, pp. 249–263]), it can be input–output linearized by means of a control consisting of an augmentation of the dynamics and of a state feedback from the augmented state. In general, methods for feedback linearization require exact cancelation of nonlinear terms and availability of the full state of the system: as such, they cannot be regarded as robust design methods. In the case of MIMO systems, the issue of robustness is further aggravated by the fact that all known (recursive) methods for achieving feedback linearization via dynamic extension require, at each stage, a state-dependent change of coordinates in the input space, which is intrinsically non-robust. In [6] it has been shown how such methods can be made robust, by means of a technique based on the interlaced use of dynamic extensions and robust observers, extending in this way to MIMO systems the seminal results of [3], in which, for a SISO system, input–output linearization is achieved up to any arbitrarily fixed degree of accuracy by means of a robust controller.


A. Isidori and Y. Wu

The method of [6] considers the case of a system in normal form (1.1), supposed to satisfy Assumption 1.1, and in which the matrix $B(z, x)$ has the property indicated in the following additional assumption.

Assumption 1.2 The matrix $B(z, x)$ is lower triangular and there exist numbers $b_{\min}, b_{\max}$ such that
$$
0 < b_{\min} \le b_{ii}(z, x) \;\; \forall i, \qquad \|B(z, x)\| \le b_{\max} \;\; \forall (z, x). \tag{1.6}
$$
Note that, as a consequence of this assumption, there exist a number $b_0$ and a number $0 < \delta_0 < 1$ such that
$$
\left| \frac{b_{ii}(z, x) - b_0}{b_0} \right| \le \delta_0 < 1 \qquad \forall i, \; \forall (z, x)\,. \tag{1.7}
$$
The method described in [6] addresses the case in which the dynamics of $z$ are trivial. If this is not the case, the following extra assumption is needed.

Assumption 1.3 The controlled plant (1.1) is eSMP.

1.4.2 The Nominal Linearizing Feedback

The (recursive) procedure for exact input–output linearization via state augmentation and feedback can be summarized as follows. First of all, the dynamics of (1.1) are augmented by means of a dynamic extension defined as
$$
\begin{aligned}
\dot{\zeta}_1 &= S_1 \zeta_1 + T_1 v_1 \\
\dot{\zeta}_2 &= S_2 \zeta_2 + T_2 v_2 \\
&\;\;\vdots \\
\dot{\zeta}_{m-1} &= S_{m-1} \zeta_{m-1} + T_{m-1} v_{m-1}
\end{aligned} \tag{1.8}
$$
in which, for $i = 1, \ldots, m-1$, $\zeta_i \in \mathbb{R}^{r_m - r_i}$ are additional states and $v_i \in \mathbb{R}$ are additional inputs, and
$$
S_i = \begin{pmatrix} 0 & 1 & \cdots & 0 \\ \cdots & & & \cdots \\ 0 & 0 & \cdots & 1 \\ 0 & 0 & \cdots & 0 \end{pmatrix}, \qquad T_i = \begin{pmatrix} 0 \\ \cdots \\ 0 \\ 1 \end{pmatrix}.
$$
Then a state feedback (from the full state of the augmented system) is determined, by means of a recursive design procedure, consisting of the following steps.

Step 1: Set $\mathbf{x}_1 = \mathrm{col}(\xi_1, \zeta_1)$ with $\xi_1 \in \mathbb{R}^{r_1}$ defined as $\xi_{1j} = x_{1j}$ for $1 \le j \le r_1$. Indeed, $\dot{\xi}_{1,j} = \xi_{1,j+1}$ for $1 \le j \le r_1 - 1$ and $\dot{\xi}_{1,r_1} = a_1(z, x) + b_1(z, x)u$. Let $u$ be such that
$$
a_1(z, x) + b_1(z, x)u = \zeta_{11}. \tag{1.9}
$$


Pick
$$
v_1 = v_1^*(\xi_1, \zeta_1) + \bar{u}_1 = -\sum_{j=1}^{r_1} d_{j-1}\xi_{1j} - \sum_{j=r_1+1}^{r_m} d_{j-1}\zeta_{1,j-r_1} + \bar{u}_1 = -\hat{K}\mathbf{x}_1 + \bar{u}_1, \tag{1.10}
$$
where
$$
\hat{K} = \begin{pmatrix} d_0 & d_1 & \cdots & d_{r_m-1} \end{pmatrix}. \tag{1.11}
$$
By construction, $\dot{\mathbf{x}}_1 = (\hat{A} - \hat{B}\hat{K})\mathbf{x}_1 + \hat{B}\bar{u}_1$, in which $\hat{A} \in \mathbb{R}^{r_m \times r_m}$ and $\hat{B} \in \mathbb{R}^{r_m \times 1}$ are matrices of the form
$$
\hat{A} = \begin{pmatrix} 0 & 1 & \cdots & 0 \\ \cdots & & & \cdots \\ 0 & 0 & \cdots & 1 \\ 0 & 0 & \cdots & 0 \end{pmatrix}, \qquad \hat{B} = \begin{pmatrix} 0 \\ \cdots \\ 0 \\ 1 \end{pmatrix}. \tag{1.12}
$$
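The choice of $\hat K$ in (1.11) amounts to assigning the characteristic polynomial of the companion pair $(\hat A, \hat B)$. A minimal numerical sketch, with $r_m = 3$ and hypothetical poles at $-1, -2, -3$:

```python
import numpy as np

r_m = 3
# Companion-form pair (A_hat, B_hat) as in (1.12)
A_hat = np.eye(r_m, k=1)
B_hat = np.zeros((r_m, 1)); B_hat[-1, 0] = 1.0

# Desired Hurwitz polynomial d(s) = (s+1)(s+2)(s+3) = s^3 + 6 s^2 + 11 s + 6,
# i.e., d0 = 6, d1 = 11, d2 = 6, and K_hat = (d0 d1 ... d_{rm-1}) as in (1.11).
coeffs = np.poly([-1.0, -2.0, -3.0])       # [1, 6, 11, 6], highest power first
K_hat = coeffs[1:][::-1].reshape(1, r_m)   # (d0, d1, d2)

F = A_hat - B_hat @ K_hat                  # closed-loop matrix of each block
print(np.sort(np.linalg.eigvals(F).real))  # the assigned poles
```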

Step 2: Assume (1.9) holds. Set $\mathbf{x}_2 = \mathrm{col}(\xi_2, \zeta_2)$ with $\xi_2 \in \mathbb{R}^{r_2}$ defined as
$$
\begin{aligned}
\xi_{2j} &= x_{2j} & 1 \le j \le r_1 \\
\xi_{2j} &= x_{2j} + \psi_{2j}(\mathbf{x}_1, \bar{x}_{2,j-1}, \ldots, \bar{x}_{m,j-1}, \bar{\zeta}_{1,j-r_1}) & r_1 + 1 \le j \le r_2,
\end{aligned}
$$
where the $\psi_{2j}(\cdot)$ are such that $\dot{\xi}_{2,j} = \xi_{2,j+1}$ for $1 \le j \le r_2 - 1$. It can be checked that
$$
\dot{\xi}_{2,r_2} = a_2(z, x) + b_2(z, x)u + c_2(x, \zeta)
$$
in which $c_2(x, \zeta)$ is a function that vanishes at $(x, \zeta) = (0, 0)$. Let $u$ be such that
$$
a_2(z, x) + b_2(z, x)u + c_2(x, \zeta) = \zeta_{21}. \tag{1.13}
$$
Pick
$$
v_2 = v_2^*(\xi_2, \zeta_2) + \bar{u}_2 = -\sum_{j=1}^{r_2} d_{j-1}\xi_{2j} - \sum_{j=r_2+1}^{r_m} d_{j-1}\zeta_{2,j-r_2} + \bar{u}_2 = -\hat{K}\mathbf{x}_2 + \bar{u}_2. \tag{1.14}
$$
By construction, $\dot{\mathbf{x}}_2 = (\hat{A} - \hat{B}\hat{K})\mathbf{x}_2 + \hat{B}\bar{u}_2$.

Step 3: Assume (1.9) and (1.13) hold. Set $\mathbf{x}_3 = \mathrm{col}(\xi_3, \zeta_3)$ with $\xi_3 \in \mathbb{R}^{r_3}$ defined as
$$
\begin{aligned}
\xi_{3j} &= x_{3j} & 1 \le j \le r_1 \\
\xi_{3j} &= x_{3j} + \psi_{3j}(\mathbf{x}_1, \bar{x}_{2,j-1}, \ldots, \bar{x}_{m,j-1}, \bar{\zeta}_{1,j-r_1}) & r_1 + 1 \le j \le r_2 \\
\xi_{3j} &= x_{3j} + \psi_{3j}(\mathbf{x}_1, \mathbf{x}_2, \bar{x}_{3,j-1}, \ldots, \bar{x}_{m,j-1}, \bar{\zeta}_{1,j-r_1}, \bar{\zeta}_{2,j-r_2}) & r_2 + 1 \le j \le r_3,
\end{aligned}
$$


where the $\psi_{3j}(\cdot)$ are such that $\dot{\xi}_{3,j} = \xi_{3,j+1}$ for $1 \le j \le r_3 - 1$. It can be checked that
$$
\dot{\xi}_{3,r_3} = a_3(z, x) + b_3(z, x)u + c_3(x, \zeta)
$$
in which $c_3(x, \zeta)$ is a function that vanishes at $(x, \zeta) = (0, 0)$. Let $u$ be such that
$$
a_3(z, x) + b_3(z, x)u + c_3(x, \zeta) = \zeta_{31}. \tag{1.15}
$$
Pick
$$
v_3 = v_3^*(\xi_3, \zeta_3) + \bar{u}_3 = -\sum_{j=1}^{r_3} d_{j-1}\xi_{3j} - \sum_{j=r_3+1}^{r_m} d_{j-1}\zeta_{3,j-r_3} + \bar{u}_3 = -\hat{K}\mathbf{x}_3 + \bar{u}_3. \tag{1.16}
$$
By construction, $\dot{\mathbf{x}}_3 = (\hat{A} - \hat{B}\hat{K})\mathbf{x}_3 + \hat{B}\bar{u}_3$.

Step m−1: Assume (1.9), (1.13), …, hold. Set $\mathbf{x}_{m-1} = \mathrm{col}(\xi_{m-1}, \zeta_{m-1})$ with $\xi_{m-1} \in \mathbb{R}^{r_{m-1}}$ defined, analogously to the previous steps, in such a way that $\dot{\xi}_{m-1,j} = \xi_{m-1,j+1}$ for $1 \le j \le r_{m-1} - 1$. It can be checked that
$$
\dot{\xi}_{m-1,r_{m-1}} = a_{m-1}(z, x) + b_{m-1}(z, x)u + c_{m-1}(x, \zeta)
$$
in which $c_{m-1}(x, \zeta)$ is a function that vanishes at $(x, \zeta) = (0, 0)$. Let $u$ be such that
$$
a_{m-1}(z, x) + b_{m-1}(z, x)u + c_{m-1}(x, \zeta) = \zeta_{m-1,1}. \tag{1.17}
$$
Pick
$$
v_{m-1} = v_{m-1}^*(\xi_{m-1}, \zeta_{m-1}) + \bar{u}_{m-1} = -\sum_{j=1}^{r_{m-1}} d_{j-1}\xi_{m-1,j} - \sum_{j=r_{m-1}+1}^{r_m} d_{j-1}\zeta_{m-1,j-r_{m-1}} + \bar{u}_{m-1} = -\hat{K}\mathbf{x}_{m-1} + \bar{u}_{m-1}. \tag{1.18}
$$
By construction, $\dot{\mathbf{x}}_{m-1} = (\hat{A} - \hat{B}\hat{K})\mathbf{x}_{m-1} + \hat{B}\bar{u}_{m-1}$.

Step m: Assume (1.9), (1.13), …, (1.17) hold. Set $\mathbf{x}_m = \xi_m$ with $\xi_m \in \mathbb{R}^{r_m}$ defined as
$$
\begin{aligned}
\xi_{mj} &= x_{mj} & 1 \le j \le r_1 \\
\xi_{mj} &= x_{mj} + \psi_{mj}(\mathbf{x}_1, \bar{x}_{2,j-1}, \ldots, \bar{x}_{m,j-1}, \bar{\zeta}_{1,j-r_1}) & r_1 + 1 \le j \le r_2 \\
\xi_{mj} &= x_{mj} + \psi_{mj}(\mathbf{x}_1, \mathbf{x}_2, \bar{x}_{3,j-1}, \ldots, \bar{x}_{m,j-1}, \bar{\zeta}_{1,j-r_1}, \bar{\zeta}_{2,j-r_2}) & r_2 + 1 \le j \le r_3 \\
&\;\;\vdots \\
\xi_{mj} &= x_{mj} + \psi_{mj}(\mathbf{x}_1, \mathbf{x}_2, \ldots, \mathbf{x}_{m-1}, \bar{x}_{m,j-1}, \bar{\zeta}_{1,j-r_1}, \bar{\zeta}_{2,j-r_2}, \ldots, \bar{\zeta}_{m-1,j-r_{m-1}}) & r_{m-1} + 1 \le j \le r_m,
\end{aligned}
$$
where the $\psi_{mj}(\cdot)$ are such that $\dot{\xi}_{m,j} = \xi_{m,j+1}$ for $1 \le j \le r_m - 1$. It can be checked that
$$
\dot{\xi}_{m,r_m} = a_m(z, x) + b_m(z, x)u + c_m(x, \zeta) - \sum_{i=1}^{m-1} \gamma_i(x, \zeta)\,v_i
$$


in which cm (x, ζ ) is a function that vanishes at (x, ζ ) = (0, 0) and the γi (x, ζ )’s, for i = 1, . . . , m − 1, are appropriately defined functions. Define, for convenience, γm (x, ζ ) = 1. Let u be such that am (z, x) + bm (z, x)u + cm (x, ζ ) = m−1 γ (x, ζ )vi + vm∗ (ξm ) + u¯ m = i=1 m−1 i m−1 = i=1 γi (x, ζ )vi∗ (ξi , ζi ) + i=1 γi (x, ζ )u¯ i + vm∗ (ξm ) + u¯ m m m ∗ = i=1 γi (x, ζ )vi (ξi , ζi ) + i=1 γi (x, ζ )u¯ i , in which vm∗ (ξm ) = −

rm 

d j−1 ξm j = − Kˆ xm .

(1.19)

(1.20)

j=1

By construction, x˙ m = ( Aˆ − Bˆ Kˆ )xm + Bˆ u¯ m . Formulas (1.9), (1.13), (1.15), …, (1.17), (1.19), that implicitly define the control u, can be expressed in compact form as follows. Observe that, for each i = 1, . . . , m, the vector xi ∈ Rrm is a function of (x, ζ ), that will be written as xi = Ψi (x, ζ ). Set ⎛

⎞ x1 x = ⎝· · ·⎠ xm



⎞ Ψ1 (x, ζ ) Ψ (x, ζ ) = ⎝ · · · ⎠ . Ψm (x, ζ )

It can be checked that the map Rm·rm Rm·rm → (x, ζ ) → x = Ψ (x, ζ ) is a globally defined diffeomorphism that preserves the origin. Set ⎛

⎞ −ζ11 ⎜ c2 (x, ζ ) − ζ21 ⎟ ⎜ ⎟ ⎟ ··· C(x, ζ ) = ⎜ ⎜ ⎟ ⎝cm−1 (x, ζ ) − ζm−1,1 ⎠ cm (x, ζ )



⎞ 0 ⎜0⎟ ⎜ ⎟ ⎟ D=⎜ ⎜· · ·⎟ ⎝0⎠ 1



Γ (x, ζ ) = γ1 (x, ζ ) . . . γm−1 (x, ζ ) γm (x, ζ ) , and observe that C(x, ζ ) vanishes at (x, ζ ) = (0, 0) because so do all the ci (x, ζ )’s. Then, (1.9), (1.13), (1.15), (1.19) altogether can be expressed as A(z, x) + B(z, x)u + C(x, ζ ) = −DΓ (x, ζ )(Im ⊗ Kˆ )Ψ (x, ζ ) + DΓ (x, ζ )u¯ in which u¯ = col{u¯ 1 , . . . , u¯ m }.


It is seen from all of the above that if the controls $v_i$ of the dynamic extension (1.8) are chosen as
$$
v_i = v_i^*(\xi_i, \zeta_i) + \bar{u}_i = -\hat{K}\Psi_i(x, \zeta) + \bar{u}_i, \qquad i = 1, \ldots, m-1, \tag{1.21}
$$
in which $\hat{K} \in \mathbb{R}^{1 \times r_m}$ is a row vector of design parameters, and the control $u$ is chosen as
$$
u = [B(z, x)]^{-1}[-A(z, x) - C(x, \zeta) - D\Gamma(x, \zeta)(I_m \otimes \hat{K})\Psi(x, \zeta) + D\Gamma(x, \zeta)\bar{u}], \tag{1.22}
$$
a closed-loop system is obtained described by equations of the form
$$
\begin{aligned}
\dot{z} &= f_0(z, x) + g_0(z, x)[B(z, x)]^{-1} \\
&\qquad \times [-A(z, x) - C(x, \zeta) - D\Gamma(x, \zeta)(I_m \otimes \hat{K})\Psi(x, \zeta) + D\Gamma(x, \zeta)\bar{u}] \\
\dot{\mathbf{x}}_i &= (\hat{A} - \hat{B}\hat{K})\mathbf{x}_i + \hat{B}\bar{u}_i \\
y_i &= \hat{C}\mathbf{x}_i \qquad i = 1, \ldots, m
\end{aligned} \tag{1.23}
$$
in which $\hat{A} \in \mathbb{R}^{r_m \times r_m}$ and $\hat{B} \in \mathbb{R}^{r_m \times 1}$ are matrices of the form (1.12) and $\hat{C} \in \mathbb{R}^{1 \times r_m}$ is a matrix of the form
$$
\hat{C} = \begin{pmatrix} 1 & 0 & \cdots & 0 \end{pmatrix}. \tag{1.24}
$$
It is seen from this that the indicated choice of $u$ and of the $v_i$'s has rendered the system input–output linear (and also non-interactive). Moreover, if the matrix $\hat{K}$ of free design parameters is such that the polynomial
$$
d(\lambda) = d_0 + d_1\lambda + \cdots + d_{r_m-1}\lambda^{r_m-1} + \lambda^{r_m} \tag{1.25}
$$
is Hurwitz, the $m$ lower subsystems of (1.23) are all globally asymptotically stable.

A consequence of the assumption of strong minimum-phase is that the closed-loop system (1.23), viewed as a system with input $\bar{u}$ and state $(z, \mathbf{x})$, is input-to-state stable.

Proposition 1.1 Let Assumptions 1.1, 1.2, 1.3 be fulfilled and let $\hat{K}$ be chosen so that $\hat{A} - \hat{B}\hat{K}$ is Hurwitz. Then, system (1.23), viewed as a system with input $\bar{u}$ and state $(z, \mathbf{x})$, is input-to-state stable. If $\bar{u} = 0$, the equilibrium $(z, \mathbf{x}) = (0, 0)$ is globally, and also locally exponentially, stable.

Proof Recall that $C(x, \zeta)$ and $\Gamma(x, \zeta)$ are smooth functions, with $C(x, \zeta)$ vanishing at $(x, \zeta) = (0, 0)$. Recall also that the map $(x, \zeta) = \Psi^{-1}(\mathbf{x})$ is a smooth map vanishing at $\mathbf{x} = 0$. Then, there exists a class $\mathcal{K}$ function $\alpha_1(\cdot)$, locally linear near the origin, such that
$$
\|x\| \le \alpha_1(\|\mathbf{x}\|)\,, \tag{1.26}
$$
a class $\mathcal{K}$ function $\alpha_2(\cdot)$, locally linear near the origin, such that
$$
\|C(x, \zeta) - D\Gamma(x, \zeta)(I_m \otimes \hat{K})\Psi(x, \zeta)\| \le \alpha_2(\|\mathbf{x}\|)\,, \tag{1.27}
$$


and two class $\mathcal{K}$ functions $\alpha_3(\cdot), \alpha_4(\cdot)$, locally linear near the origin, such that
$$
2\|D\Gamma(x, \zeta)\bar{u}\| \le \alpha_3(\|\mathbf{x}\|) + \alpha_4(\|\bar{u}\|)\,. \tag{1.28}
$$
Combining the estimate (1.5) with the estimates (1.26)–(1.27)–(1.28), we conclude the existence of two class $\mathcal{K}_\infty$ functions $\tilde{\sigma}(\cdot)$ and $\bar{\sigma}(\cdot)$, locally quadratic near the origin, such that
$$
\begin{aligned}
\frac{\partial V}{\partial z}\Big[ f_0(z, x) &+ g_0(z, x)[B(z, x)]^{-1}[-A(z, x) - C(x, \zeta) - D\Gamma(x, \zeta)(I_m \otimes \hat{K})\Psi(x, \zeta) + D\Gamma(x, \zeta)\bar{u}] \Big] \\
&\le -\alpha(\|z\|) + \sigma(\|x\|) + \sigma(\|C(x, \zeta) - D\Gamma(x, \zeta)(I_m \otimes \hat{K})\Psi(x, \zeta) + D\Gamma(x, \zeta)\bar{u}\|) \\
&\le -\alpha(\|z\|) + \sigma(\alpha_1(\|\mathbf{x}\|)) + \sigma(2\alpha_2(\|\mathbf{x}\|)) + \sigma(2\|D\Gamma(x, \zeta)\bar{u}\|) \\
&\le -\alpha(\|z\|) + \sigma(\alpha_1(\|\mathbf{x}\|)) + \sigma(2\alpha_2(\|\mathbf{x}\|)) + \sigma(2\alpha_3(\|\mathbf{x}\|)) + \sigma(2\alpha_4(\|\bar{u}\|)) \\
&\le -\alpha(\|z\|) + \tilde{\sigma}(\|\mathbf{x}\|) + \bar{\sigma}(\|\bar{u}\|)\,.
\end{aligned}
$$
Since $\mathbf{x}$ is the state of a linear input-to-state-stable system, the claim follows from standard results.

1.4.3 Robust Feedback Design

The nominal input–output linearizing control provided by (1.8)–(1.21)–(1.22) is fragile, because the states $(z, x)$ are not available and the entries of $A(z, x)$, $B(z, x)$, $C(x, \zeta)$, $\Psi(x, \zeta)$, $\Gamma(x, \zeta)$ might be uncertain. It was shown in [6], though, that the linearizing (and stabilizing) effect of a similar feedback law can be recovered up to any desired degree of accuracy by means of an implementable (extended-observer-based) control law. As suggested in [6], the controls $u_1, \ldots, u_m$ and the additional inputs $v_1, \ldots, v_{m-1}$ can be chosen as follows. Let $\mathrm{sat}_\ell(\cdot)$ be a (smooth) saturation function⁵ and let $\varphi(\sigma, \varsigma)$ be the function defined as⁶
$$
\varphi(\sigma, \varsigma) = \frac{1}{b_0}(-\sigma + \varsigma)\,.
$$
The controls $u_i$, for $i = 1, \ldots, m$, are functions defined as
$$
\begin{aligned}
u_1 &= \mathrm{sat}_\ell(\varphi(\sigma_1, \zeta_{11})) \\
&\;\;\vdots \\
u_{m-1} &= \mathrm{sat}_\ell(\varphi(\sigma_{m-1}, \zeta_{m-1,1})) \\
u_m &= \mathrm{sat}_\ell\Big(\varphi\Big(\sigma_m, -\sum_{j=1}^{r_m} d_{j-1}\hat{\xi}_{mj} + \bar{u}_m\Big)\Big)
\end{aligned} \tag{1.29}
$$

⁵ A smooth function, characterized as follows: $\mathrm{sat}_\ell(s) = s$ if $|s| \le \ell$; $\mathrm{sat}_\ell(s)$ is odd and monotonically increasing, with $0 < \mathrm{sat}_\ell'(s) \le 1$; and $\lim_{s \to \infty} \mathrm{sat}_\ell(s) = \ell(1 + c)$ with $0 < c \ll 1$.
⁶ The number $b_0$ here is any number for which condition (1.7) holds.
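A concrete function with the properties required of $\mathrm{sat}_\ell$ in footnote 5 can be sketched as follows. This is one possible construction (identity on $[-\ell, \ell]$, blended into a tanh tail outside), not necessarily the one used in [6]:

```python
import numpy as np

def sat(s, ell=1.0, c=0.1):
    """One possible smooth saturation: identity on [-ell, ell], C^1 tanh tail
    outside, odd, nondecreasing, with lim_{s->inf} sat(s) = ell * (1 + c)."""
    s = np.asarray(s, dtype=float)
    core = np.abs(s) <= ell
    tail = ell + c * ell * np.tanh((np.abs(s) - ell) / (c * ell))
    return np.where(core, s, np.sign(s) * tail)

print(float(sat(0.5)))   # 0.5  (identity inside the linear zone)
print(float(sat(-0.5)))  # -0.5 (odd)
print(float(sat(1e6)))   # approaches ell * (1 + c) = 1.1
```

The derivative of the tail at $|s| = \ell$ equals 1, so the two pieces join in a $C^1$ fashion, matching the monotonicity and slope bound of footnote 5.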


in which the $d_j$'s are the entries of the matrix (1.11), chosen in such a way that the polynomial (1.25) is Hurwitz, and $\sigma_1, \ldots, \sigma_m$ and $\hat{\xi}_m = (\hat{\xi}_{m1}, \ldots, \hat{\xi}_{m,r_m})$ are states of extended observers that are defined below. Moreover, the controls $v_i$, for $i = 1, \ldots, m-1$, are functions defined as
$$
\begin{aligned}
v_1 &= \mathrm{sat}_\ell\Big(-\sum_{j=1}^{r_1} d_{j-1}\hat{\xi}_{1j}\Big) - \sum_{j=r_1+1}^{r_m} d_{j-1}\zeta_{1,j-r_1} + \bar{u}_1 \\
v_2 &= \mathrm{sat}_\ell\Big(-\sum_{j=1}^{r_2} d_{j-1}\hat{\xi}_{2j}\Big) - \sum_{j=r_2+1}^{r_m} d_{j-1}\zeta_{2,j-r_2} + \bar{u}_2 \\
&\;\;\vdots \\
v_{m-1} &= \mathrm{sat}_\ell\Big(-\sum_{j=1}^{r_{m-1}} d_{j-1}\hat{\xi}_{m-1,j}\Big) - \sum_{j=r_{m-1}+1}^{r_m} d_{j-1}\zeta_{m-1,j-r_{m-1}} + \bar{u}_{m-1}
\end{aligned} \tag{1.30}
$$
in which $\hat{\xi}_1 = (\hat{\xi}_{11}, \ldots, \hat{\xi}_{1,r_1})$, $\hat{\xi}_2 = (\hat{\xi}_{21}, \ldots, \hat{\xi}_{2,r_2})$, …, $\hat{\xi}_{m-1} = (\hat{\xi}_{m-1,1}, \ldots, \hat{\xi}_{m-1,r_{m-1}})$ are states of the extended observers that are defined below.

The extended observers that generate the variables $\sigma_i$ and $\hat{\xi}_i$, for $i = 1, 2, \ldots, m$, are defined as
$$
\begin{aligned}
\dot{\hat{\xi}}_{i,1} &= \hat{\xi}_{i,2} + \kappa_i c_{i,r_i}(y_i - \hat{\xi}_{i,1}) \\
\dot{\hat{\xi}}_{i,2} &= \hat{\xi}_{i,3} + \kappa_i^2 c_{i,r_i-1}(y_i - \hat{\xi}_{i,1}) \\
&\;\;\vdots \\
\dot{\hat{\xi}}_{i,r_i-1} &= \hat{\xi}_{i,r_i} + \kappa_i^{r_i-1} c_{i,2}(y_i - \hat{\xi}_{i,1}) \\
\dot{\hat{\xi}}_{i,r_i} &= \sigma_i + b_0 u_i(\cdot) + \kappa_i^{r_i} c_{i,1}(y_i - \hat{\xi}_{i,1}) \\
\dot{\sigma}_i &= \kappa_i^{r_i+1} c_{i,0}(y_i - \hat{\xi}_{i,1})
\end{aligned} \tag{1.31}
$$
with $\hat{\xi}_i = (\hat{\xi}_{i,1}, \hat{\xi}_{i,2}, \ldots, \hat{\xi}_{i,r_i})$, in which the $c_{i,j}$'s and the gain $\kappa_i$ are design parameters. Note that the components of $\zeta_1, \ldots, \zeta_{m-1}$ appearing in (1.29), (1.30) are states of the dynamic extension and, as such, available for feedback. Overall, the control defined by (1.29)–(1.30)–(1.31) is a dynamical system, with internal state $\hat{\xi} = \mathrm{col}(\hat{\xi}_1, \ldots, \hat{\xi}_m)$, $\sigma = \mathrm{col}(\sigma_1, \ldots, \sigma_m)$, driven by the inputs $y$ and $\zeta = \mathrm{col}(\zeta_1, \ldots, \zeta_{m-1})$, that generates the controls $u$ and $v$.

It can be shown that the closed-loop system obtained by controlling (1.1) by means of (1.8)–(1.29)–(1.30) can be seen as a perturbed version of the system (1.23) obtained by means of the nominal linearizing control. In particular, with $\mathbf{x} = \Psi(x, \zeta)$ defined as before, the dynamics of $\mathbf{x}$ can be expressed in the form
$$
\dot{\mathbf{x}} = \mathbf{A}\mathbf{x} + \mathbf{B}\bar{u} + G(w, z, x, \hat{\xi}, \sigma, \bar{u})\,,
$$
in which $\mathbf{A} = I_m \otimes (\hat{A} - \hat{B}\hat{K})$, $\mathbf{B} = I_m \otimes \hat{B}$ and $G(w, z, x, \hat{\xi}, \sigma, \bar{u})$ is a perturbation term.
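The mechanics of one of the extended observers (1.31) can be sketched in simulation. The fragment below uses hypothetical parameters and a plain forward-Euler integration for brevity: it runs the $r_i = 2$ observer on a double integrator with an unknown constant input `sigma_true`, and checks that the "extended" state $\sigma_i$ converges to it.

```python
import numpy as np

# Extended observer (1.31) for one channel with r_i = 2:
# plant: x1' = x2, x2' = sigma_true + b0*u (sigma_true unknown to the observer).
# Gains c_{i,j} chosen so that s^3 + c2 s^2 + c1 s + c0 = (s+1)^3.
b0, kappa = 1.0, 20.0
c2, c1, c0 = 3.0, 3.0, 1.0
sigma_true, u = 0.5, 0.0

dt, T = 1e-3, 2.0
x = np.zeros(2)       # plant state
xi = np.zeros(2)      # observer estimates of (x1, x2)
sigma = 0.0           # estimate of the extended state sigma_true

for _ in range(int(T / dt)):
    e = x[0] - xi[0]  # output injection error, y = x1
    x = x + dt * np.array([x[1], sigma_true + b0 * u])
    xi = xi + dt * np.array([xi[1] + kappa * c2 * e,
                             sigma + b0 * u + kappa**2 * c1 * e])
    sigma = sigma + dt * kappa**3 * c0 * e

print(abs(sigma - sigma_true))  # small: the unknown input is recovered
```

With the scaled gains $\kappa, \kappa^2, \kappa^3$, the estimation-error dynamics have a triple eigenvalue at $-\kappa$, so the error is negligible well before $t = T$.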


As shown⁷ in [6], if the design parameters $d_0, \ldots, d_{r_m-1}$, $c_{i,0}, \ldots, c_{i,r_i}$ for $i = 1, \ldots, m$, $\ell$, and $\kappa_1, \ldots, \kappa_m$ are appropriately tuned, by means of such a control it is possible to render the perturbation term arbitrarily small on a time interval of the form $[T_0, \infty)$, in which $T_0$ can be taken arbitrarily small as well. Hence, the input–output behavior can be made arbitrarily close to that of the ideally input–output linearized system (1.23).

For the purpose of expressing this claim in precise terms and comparing the results obtained under the nominal input–output linearizing control with those obtainable under the control introduced in this section, let $\mathbf{x}_L(t)$ denote the state response of the system obtained under the nominal input–output linearizing control, which is
$$
\mathbf{x}_L(t) = e^{\mathbf{A}t}\mathbf{x}_L(0) + \int_0^t e^{\mathbf{A}(t-\tau)}\mathbf{B}\bar{u}(\tau)\,d\tau\,.
$$
Then, the methods of [6] make it possible to prove the following claim.

Theorem 1.1 Consider system (1.1), augmented with (1.8), and controlled by $u$ defined as in (1.29) and the $v_i$'s defined as in (1.30), in which $(\hat{\xi}_i, \sigma_i)$, for $i = 1, \ldots, m$, are states of the extended observers (1.31). Let the $d_i$'s be such that the polynomial (1.25) is Hurwitz. Let the $c_{ij}$'s be chosen in such a way that the polynomials
$$
p_i(\lambda) = \lambda^{r_i+1} + c_{i,r_i}\lambda^{r_i} + \cdots + c_{i,1}\lambda + c_{i,0}
$$
have all real and negative roots. Suppose initial conditions are taken in a fixed (but otherwise arbitrary) compact set $\mathcal{C}$. Suppose the input $\bar{u}(\cdot)$ satisfies $\|\bar{u}(t)\| \le U$ for all $t \ge 0$, with $U$ a fixed (but otherwise arbitrary) number. Then, there is a choice of the saturation level $\ell$ such that, given any $\varepsilon > 0$, there is a value $\kappa_m^*$ and, for every $\kappa_m \ge \kappa_m^*$, a value $\kappa_{m-1}^*(\kappa_m)$ and, for every $\kappa_{m-1} \ge \kappa_{m-1}^*(\kappa_m)$, a value $\kappa_{m-2}^*(\kappa_{m-1}, \kappa_m)$, and so on, such that, if $\kappa_m \ge \kappa_m^*$, $\kappa_{m-1} \ge \kappa_{m-1}^*(\kappa_m)$, $\kappa_{m-2} \ge \kappa_{m-2}^*(\kappa_{m-1}, \kappa_m)$, …, $\kappa_1 \ge \kappa_1^*(\kappa_2, \ldots, \kappa_m)$, the trajectories of the closed-loop system obtained by controlling (1.1) via (1.8)–(1.29)–(1.30)–(1.31) remain bounded and
$$
\|\mathbf{x}(t) - \mathbf{x}_L(t)\| \le \varepsilon \qquad \forall t \in [0, \infty). \tag{1.32}
$$
If, in addition, $\bar{u}(t) = 0$ for all $t \ge 0$, then
$$
\lim_{t \to \infty} z(t) = 0, \quad \lim_{t \to \infty} x(t) = 0, \quad \lim_{t \to \infty} \zeta(t) = 0, \quad \lim_{t \to \infty} \hat{\xi}(t) = 0, \quad \lim_{t \to \infty} \sigma(t) = 0.
$$
Thus, the proposed robust controller is able to achieve almost feedback linearization and, in particular, semiglobal asymptotic stability.

⁷ The proof provided in the reference [6] addresses the case in which the dynamics of $z$ are trivial. If this is not the case, appropriate modifications are needed, taking into account the Assumption of strong minimum-phase.


1.5 Application to the Problem of Output Regulation

The method for robust semiglobal asymptotic stabilization described in the previous section can be fruitfully applied to the design of a robust stabilizer for the solution of a problem of output regulation. We assume, in what follows, that the reader is familiar with the fundamentals of the theory of output regulation for nonlinear systems (see, in this respect, [18–20]). Usually, a controller that solves the problem of output regulation requires two ingredients: an internal model, whose purpose is to generate, in steady state, a "feedforward" control that keeps the regulated output at zero, and a stabilizer, whose purpose is to make trajectories converge to the desired steady state. The method for robust stabilization described in the previous section provides a simple and straightforward procedure for the design of such a stabilizer.

In this section, we address a problem of output regulation for a system having a normal form with a structure identical to that of (1.1), in which the outputs $y_1, \ldots, y_m$ are replaced by the components $e_1, \ldots, e_m$ of the regulated variables and in which the various nonlinear functions/maps are affected by an exogenous input $w$. To avoid duplications we do not rewrite such a normal form explicitly,⁸ but we limit ourselves to stressing that, in a structure identical to that of (1.1), $f_0(z, x)$ is replaced by $f_0(w, z, x)$, $g_0(z, x)$ by $g_0(w, z, x)$, $a_i(z, x)$ by $a_i(w, z, x)$ and $b_i(z, x)$ by $b_i(w, z, x)$, for $i = 1, \ldots, m$; the multipliers $\delta^j_{ik}(\cdot)$ are allowed to depend on $w$ but, as in Assumption 1.1, are assumed to be independent of $z$ and dependent on the individual components of $x$ as in
$$
\delta^j_{i,k+1}(w, x_1, \ldots, x_{i-1}, \bar{x}_{i,k}, \ldots, \bar{x}_{m,k}), \qquad r_{i-1} \le k \le r_i - 1, \;\; 1 \le j \le i - 1\,.
$$
Finally, a property identical to that indicated in Assumption 1.2 is assumed. For convenience, in what follows we will refer to such assumptions as the "equivalent versions" of Assumptions 1.1 and 1.2.

⁸ See [21] for a more detailed presentation.

The exogenous input $w$ is any solution of the autonomous o.d.e. $\dot{w} = s(w)$ (usually known as the exosystem) with initial conditions ranging in a compact and invariant set $W$. The problem of output regulation consists in finding a feedback law, driven by the regulated variables $e_1, \ldots, e_m$, so that, in the resulting closed-loop system, all trajectories are bounded and $\lim_{t \to \infty} e(t) = 0$.

The first step in the solution of this problem is the characterization of a solution of the so-called regulator equations, which identify, in the state space of the composite plant–exosystem system, a manifold that is rendered invariant via feedback and on which the regulated variables vanish. The manifold in question is characterized by a pair of maps $z = \pi_0(w)$ and $x = \pi_x(w)$, and $u = \psi(w)$ is the control that renders it invariant. Simple calculations show that, since the regulated variables must vanish


on such a manifold, necessarily $\pi_x(w) = 0$. Moreover, $\pi_0(w)$ is characterized by the p.d.e.
$$
\frac{\partial \pi_0}{\partial w}s(w) = f_0(w, \pi_0(w), 0) + g_0(w, \pi_0(w), 0)\psi(w)\,,
$$
in which
$$
\psi(w) = [B(w, \pi_0(w), 0)]^{-1}[-A(w, \pi_0(w), 0)]\,.
$$
Having assumed that such a $\pi_0(w)$ exists, to proceed in the analysis it is convenient to scale the variable $z$ as
$$
\tilde{z} = z - \pi_0(w)\,.
$$
As a consequence, the dynamics of $z$ are replaced by those of
$$
\dot{\tilde z} = f_0(w, \tilde{z} + \pi_0(w), x) + g_0(w, \tilde{z} + \pi_0(w), x)[\psi(w) - \psi(w) + u] - \frac{\partial \pi_0}{\partial w}s(w)
= \tilde{f}_0(w, \tilde{z}, x) + \tilde{g}_0(w, \tilde{z}, x)[-\psi(w) + u]
$$
in which, by construction, $\tilde{f}_0(w, \tilde{z}, x)$ vanishes at $(\tilde{z}, x) = (0, 0)$. The dynamics of $x_{i,r_i}$ are also affected by such scaling and change into
$$
\dot{x}_{i,r_i} = \tilde{a}_i(w, \tilde{z}, x) + \tilde{b}_i(w, \tilde{z}, x)[-\psi(w) + u]
$$
in which, by construction, $\tilde{a}_i(w, \tilde{z}, x)$ vanishes at $(\tilde{z}, x) = (0, 0)$. Consistently, set
$$
\tilde{A}(w, \tilde{z}, x) = \mathrm{col}(\tilde{a}_1(w, \tilde{z}, x), \ldots, \tilde{a}_m(w, \tilde{z}, x)), \qquad
\tilde{B}(w, \tilde{z}, x) = \mathrm{col}(\tilde{b}_1(w, \tilde{z}, x), \ldots, \tilde{b}_m(w, \tilde{z}, x)).
$$
On the rescaled system it is easy to characterize the property of strong minimum-phase. Consistently with the definition given earlier in the paper, the latter considers the dynamics
$$
\dot{\tilde z} = \tilde{f}_0(w, \tilde{z}, x) + \tilde{g}_0(w, \tilde{z}, x)[\tilde{B}(w, \tilde{z}, x)]^{-1}[-\tilde{A}(w, \tilde{z}, x) + \chi], \tag{1.33}
$$

seen as a system with state $\tilde{z}$ and inputs $(x, \chi)$, and requires it to be input-to-state stable. With the results of [15, 16] in mind, the property in question can be expressed as follows.

Definition 1.2 The system is strongly minimum-phase if there exist a $C^1$ function $V : \mathbb{R}^{n-r} \to \mathbb{R}$ and four class $\mathcal{K}_\infty$ functions $\underline{\alpha}(\cdot), \bar{\alpha}(\cdot), \alpha(\cdot), \sigma(\cdot)$ such that
$$
\underline{\alpha}(\|\tilde{z}\|) \le V(\tilde{z}) \le \bar{\alpha}(\|\tilde{z}\|) \qquad \text{for all } \tilde{z} \in \mathbb{R}^{n-r}
$$
and
$$
\frac{\partial V}{\partial \tilde{z}}\Big[ \tilde{f}_0(w, \tilde{z}, x) + \tilde{g}_0(w, \tilde{z}, x)[\tilde{B}(w, \tilde{z}, x)]^{-1}[-\tilde{A}(w, \tilde{z}, x) + \chi] \Big] \le -\alpha(\|\tilde{z}\|) + \sigma(\|x\|) + \sigma(\|\chi\|)
$$
for all $(w, \tilde{z}, x, \chi) \in W \times \mathbb{R}^{n-r} \times \mathbb{R}^{r} \times \mathbb{R}^{m}$. The system is strongly, and also locally exponentially, minimum-phase (eSMP) if the inequalities above hold with $\underline{\alpha}(\cdot), \bar{\alpha}(\cdot), \alpha(\cdot), \sigma(\cdot)$ that are locally quadratic near the origin.

In what follows, it is assumed that the property indicated in this Definition holds, and we will refer to it as the "equivalent version" of Assumption 1.3.

The key ingredient in the solution of a problem of output regulation is the design of an internal model, a device able to generate, in steady state, the control input $u = \psi(w)$ that renders the point $(\tilde{z}, x) = (0, 0)$ an equilibrium point. To this end, an assumption on $\psi(w)$ is convenient.

Assumption 1.4 For each $i = 1, \ldots, m$ there exist an integer $d_i$ and a globally Lipschitz smooth function $\phi_i : \mathbb{R}^{d_i} \to \mathbb{R}$ such that the $i$-th component $\psi_i(w)$ of $\psi(w)$ satisfies
$$
L_s^{d_i}\psi_i(w) = \phi_i(\psi_i(w), L_s\psi_i(w), \ldots, L_s^{d_i-1}\psi_i(w)) \qquad \forall w \in W.
$$
The functions $\phi_1(\cdot), \ldots, \phi_m(\cdot)$ determine the construction of the internal model, the aggregate of $m$ SISO systems of the form
$$
\begin{aligned}
\dot{\eta}_i &= \hat{A}_i\eta_i + \hat{B}_i\phi_i(\eta_i) + G_i\bar{u}_i \\
u_i &= \hat{C}_i\eta_i + \bar{u}_i
\end{aligned}
$$
in which $\eta_i \in \mathbb{R}^{d_i}$, and $\hat{A}_i, \hat{B}_i, \hat{C}_i$ are matrices of the form (1.12)–(1.24). The vectors $G_i \in \mathbb{R}^{d_i}$ are vectors of design parameters. The controls $\bar{u}_i$ will be used for stabilization purposes. By construction (see [22]), for each $i = 1, \ldots, m$, the map
$$
\vartheta_i(w) = \mathrm{col}\{\psi_i(w), L_s\psi_i(w), \ldots, L_s^{d_i-1}\psi_i(w)\}
$$
satisfies
$$
\begin{aligned}
\frac{\partial \vartheta_i}{\partial w}s(w) &= \hat{A}_i\vartheta_i(w) + \hat{B}_i\phi_i(\vartheta_i(w)) \\
\psi_i(w) &= \hat{C}_i\vartheta_i(w)
\end{aligned}
\qquad \text{for all } w \in W.
$$
Altogether, the various subsystems that characterize the internal model can be put in the form
$$
\begin{aligned}
\dot{\eta} &= \hat{A}\eta + \hat{B}\phi(\eta) + G\bar{u} \\
u &= \hat{C}\eta + \bar{u}
\end{aligned}
$$
in which $\hat{A}, \hat{B}, \hat{C}, G$ are block-diagonal matrices whose $i$-th diagonal blocks are $\hat{A}_i, \hat{B}_i, \hat{C}_i, G_i$, and
$$
\eta = \mathrm{col}\{\eta_1, \ldots, \eta_m\}, \quad \bar{u} = \mathrm{col}\{\bar{u}_1, \ldots, \bar{u}_m\}, \quad \phi(\eta) = \mathrm{col}\{\phi_1(\eta_1), \ldots, \phi_m(\eta_m)\}.
$$
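For a single harmonic exosystem, the immersion identity satisfied by $\vartheta_i(w)$ can be verified directly. The sketch below uses hypothetical data: $s(w) = Sw$ with one frequency $\Omega$ and $\psi(w) = w_1$, so $d_i = 2$ and $\phi$ is linear.

```python
import numpy as np

Omega = 2.0
S = np.array([[0.0, 1.0], [-Omega**2, 0.0]])   # exosystem matrix
# psi(w) = w1, so L_s psi = w2 and L_s^2 psi = -Omega^2 w1 = phi(psi, L_s psi):
phi = lambda theta: -Omega**2 * theta[0]       # linear phi, d_i = 2

A_hat = np.array([[0.0, 1.0], [0.0, 0.0]])     # shift matrix, form (1.12)
B_hat = np.array([0.0, 1.0])

rng = np.random.default_rng(0)
for _ in range(5):
    w = rng.standard_normal(2)
    theta = np.array([w[0], (S @ w)[0]])       # (psi(w), L_s psi(w))
    lhs = np.array([theta[1], phi(theta)])     # d/dt of theta along w' = S w
    rhs = A_hat @ theta + B_hat * phi(theta)
    assert np.allclose(lhs, rhs) and np.allclose(lhs, S @ w)
print("immersion identity verified")
```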


If $\eta$ is scaled as
$$
\tilde{\eta} = \eta - \vartheta(w),
$$
with $\vartheta(w) = \mathrm{col}\{\vartheta_1(w), \ldots, \vartheta_m(w)\}$, a simple calculation yields
$$
\dot{\tilde\eta} = F(w, \tilde{\eta}) + G[\hat{C}\tilde{\eta} + \bar{u}]
$$
in which, by construction,
$$
F(w, \tilde{\eta}) = [\hat{A} - G\hat{C}]\tilde{\eta} + \hat{B}[\phi(\tilde{\eta} + \vartheta(w)) - \phi(\vartheta(w))]
$$
vanishes at $\tilde{\eta} = 0$. Such scaling also affects the dynamics of $\tilde{z}$ and $x_{i,r_i}$, which change as
$$
\begin{aligned}
\dot{\tilde z} &= \tilde{f}_0(w, \tilde{z}, x) + \tilde{g}_0(w, \tilde{z}, x)[\hat{C}\tilde{\eta} + \bar{u}] \\
\dot{x}_{i,r_i} &= \tilde{a}_i(w, \tilde{z}, x) + \tilde{b}_i(w, \tilde{z}, x)[\hat{C}\tilde{\eta} + \bar{u}].
\end{aligned}
$$
We are now ready to write the normal form of the so-called "augmented system," namely, the system obtained by preprocessing the plant by means of the internal model. The normal form in question is
$$
\begin{aligned}
\dot{w} &= s(w) \\
\dot{\tilde z} &= \tilde{f}_0(w, \tilde{z}, x) + \tilde{g}_0(w, \tilde{z}, x)[\hat{C}\tilde{\eta} + \bar{u}] \\
\dot{\tilde\eta} &= F(w, \tilde{\eta}) + G[\hat{C}\tilde{\eta} + \bar{u}] \\
\dot{x}_{i,1} &= x_{i,2} \\
&\;\;\vdots \\
\dot{x}_{i,r_1} &= x_{i,r_1+1} + \delta^1_{i,r_1+1}(w, x)\big(\tilde{a}_1(w, \tilde{z}, x) + \tilde{b}_1(w, \tilde{z}, x)[\hat{C}\tilde{\eta} + \bar{u}]\big) \\
&\;\;\vdots \\
\dot{x}_{i,r_i-1} &= x_{i,r_i} + \sum_{j=1}^{i-1}\delta^j_{i,r_i}(w, x)\big(\tilde{a}_j(w, \tilde{z}, x) + \tilde{b}_j(w, \tilde{z}, x)[\hat{C}\tilde{\eta} + \bar{u}]\big) \\
\dot{x}_{i,r_i} &= \tilde{a}_i(w, \tilde{z}, x) + \tilde{b}_i(w, \tilde{z}, x)[\hat{C}\tilde{\eta} + \bar{u}] \\
e_i &= x_{i,1} \qquad i = 1, \ldots, m.
\end{aligned} \tag{1.34}
$$
Since $\tilde{f}_0(w, 0, 0) = 0$, $\tilde{a}_i(w, 0, 0) = 0$, $F(w, 0) = 0$, the point $(\tilde{z}, \tilde{\eta}, x) = (0, 0, 0)$ is an equilibrium point when $\bar{u} = 0$, for every value of $w$, and the regulated error vanishes at such a point. Hence, if such an equilibrium is stabilized, the problem of output regulation is solved. System (1.34) has a structure similar to that of system (1.1). Hence, it is reasonable to expect that, if assumptions corresponding to those considered in the previous section hold, semiglobal stability can be obtained by means of a robust controller. Equivalent versions of Assumptions 1.1 and 1.2 have already been claimed; hence, it remains to check the property of strong minimum-phase for system (1.34). According to Definition 1.2, we should look at the system
$$
\begin{aligned}
\dot{\tilde z} &= \tilde{f}_0(w, \tilde{z}, x) + \tilde{g}_0(w, \tilde{z}, x)[\tilde{B}(w, \tilde{z}, x)]^{-1}[-\tilde{A}(w, \tilde{z}, x) + \chi] \\
\dot{\tilde\eta} &= F(w, \tilde{\eta}) + G[\tilde{B}(w, \tilde{z}, x)]^{-1}[-\tilde{A}(w, \tilde{z}, x) + \chi]
\end{aligned} \tag{1.35}
$$


and check that the latter, seen as a system with state $(\tilde{z}, \tilde{\eta})$ and input $(x, \chi)$, has the properties indicated in such a Definition. The system in question is the cascade of two subsystems: the upper subsystem, if the plant satisfies the equivalent version of Assumption 1.3, already has the desired properties. Thus, we only have to make sure that the lower subsystem has properties that imply, for (1.35), the fulfillment of the conditions indicated in Definition 1.2. This is actually possible, thanks to the following result (whose proof is an extension of a proof given in [22]).

Proposition 1.2 There is a choice of $G$, a positive definite symmetric matrix $P$ and a real number $a > 0$ such that the quadratic function $U(\tilde{\eta}) = \tilde{\eta}^T P\tilde{\eta}$ satisfies
$$
\frac{\partial U}{\partial \tilde{\eta}}[F(w, \tilde{\eta}) + G\hat{u}] \le -a\|\tilde{\eta}\|^2 + \|\hat{u}\|^2 \qquad \text{for all } (w, \tilde{\eta}, \hat{u}).
$$

With this result in mind, appealing to standard results concerning the input-to-state stability properties of composite systems (see [23]), it is possible to prove that, if the equivalent version of Assumption 1.3 holds, an appropriate choice of $G$ makes system (1.34) strongly minimum-phase. Having checked the appropriate assumptions, we can conclude that semiglobal asymptotic stability of the equilibrium $(\tilde{z}, \tilde{\eta}, x) = (0, 0, 0)$ can be enforced by means of the robust controller described in the previous section. Note that the controller is in the present case identical to the controller consisting of (1.8)–(1.29)–(1.30)–(1.31), because the structure of such a controller is determined only by the integers $r_1, \ldots, r_m$. Only a slight change of notation is needed. The "extra input" $\bar{u}$ in (1.29)–(1.30) must be suppressed (because only asymptotic stability is sought) and the variable $u$ in (1.29)–(1.31) should be replaced by $\bar{u}$, to make it consistent with the present setting, where the control used for stabilization purposes has been denoted by $\bar{u}$.

1.6 An Illustrative Example

Consider the 2-input 2-output system modeled by the equations
$$
\begin{aligned}
\dot{z} &= z + \bar{x}_2 + u_1 \\
\dot{\bar{x}}_1 &= 2z + \bar{x}_2 + u_1 \\
\dot{\bar{x}}_2 &= \bar{x}_3 + \bar{x}_1[2z + \bar{x}_2 + u_1] \\
\dot{\bar{x}}_3 &= z^2 + u_2 \\
y_1 &= \bar{x}_1 \\
y_2 &= \bar{x}_2\,.
\end{aligned}
$$
This system does not have a vector relative degree, because
$$
\begin{pmatrix} \dot{y}_1 \\ \dot{y}_2 \end{pmatrix} = \begin{pmatrix} 2z + \bar{x}_2 \\ \bar{x}_3 + \bar{x}_1[2z + \bar{x}_2] \end{pmatrix} + \begin{pmatrix} 1 & 0 \\ \bar{x}_1 & 0 \end{pmatrix}\begin{pmatrix} u_1 \\ u_2 \end{pmatrix}.
$$
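The failure of a vector relative degree is visible in the decoupling matrix above: $u_2$ does not appear in $(\dot y_1, \dot y_2)$, so the matrix is singular for every value of $\bar x_1$. A quick numerical check on sample points:

```python
import numpy as np

# Decoupling matrix of the example: [[1, 0], [x1bar, 0]] is singular for
# every x1bar, since its second column is identically zero.
for x1bar in [-2.0, 0.0, 0.7, 10.0]:
    Dm = np.array([[1.0, 0.0], [x1bar, 0.0]])
    assert abs(np.linalg.det(Dm)) < 1e-12
print("decoupling matrix singular for all sampled x1bar")
```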


However, this system is uniformly invertible (as a simple check shows). Let the outputs $y_1, y_2$ be required to asymptotically track two different harmonic signals. We cast this as a problem of output regulation, defining tracking errors
$$
e_i = y_i - q_i(w_i), \qquad i = 1, 2,
$$
in which $q_i(w_i) = Q_i w_i$ and $\dot{w}_i = S_i w_i$, where
$$
w_i = \begin{pmatrix} w_{i1} \\ w_{i2} \end{pmatrix}, \qquad S_i = \begin{pmatrix} 0 & 1 \\ -\Omega_i^2 & 0 \end{pmatrix}, \qquad Q_i = \begin{pmatrix} 1 & 0 \end{pmatrix}, \qquad i = 1, 2,
$$

and $\Omega_1 \ne \Omega_2$. The design procedure presented in the paper can be implemented in various steps, as follows.

Step 1: First of all, the system with input $u$ and output $e$ is put in normal form. To this end, we define
$$
\begin{aligned}
x_{11} &= \bar{x}_1 - q_1(w_1) = \bar{x}_1 - w_{11} \\
x_{21} &= \bar{x}_2 - q_2(w_2) = \bar{x}_2 - w_{21} \\
x_{22} &= \bar{x}_3 - \dot{q}_2(w_2) + \bar{x}_1\dot{q}_1(w_1) = \bar{x}_3 - w_{22} + \bar{x}_1 w_{12}\,.
\end{aligned}
$$
The resulting normal form is
$$
\begin{aligned}
\dot{z} &= z + x_{21} + w_{21} + u_1 \\
\dot{x}_{11} &= 2z - w_{12} + x_{21} + w_{21} + u_1 \\
\dot{x}_{21} &= x_{22} + (x_{11} + w_{11})[2z - w_{12} + x_{21} + w_{21} + u_1] \\
\dot{x}_{22} &= z^2 + u_2 + \Omega_2^2 w_{21} + (2z + x_{21} + w_{21} + u_1)w_{12} - (x_{11} + w_{11})\Omega_1^2 w_{11} \\
e_1 &= x_{11} \\
e_2 &= x_{21}\,.
\end{aligned}
$$

This is the desired normal form (1.1), with $r_1 = 1$, $r_2 = 2$,
$$
\begin{aligned}
f_0(w, z, x) &= z + x_{21} + w_{21} \\
g_0(w, z, x) &= \begin{pmatrix} 1 & 0 \end{pmatrix} \\
\delta^1_{22}(w, x) &= x_{11} + w_{11} \\
A(w, z, x) &= \begin{pmatrix} 2z - w_{12} + x_{21} + w_{21} \\ z^2 + \Omega_2^2 w_{21} + (2z + x_{21} + w_{21})w_{12} - (x_{11} + w_{11})\Omega_1^2 w_{11} \end{pmatrix} \\
B(w, z, x) &= \begin{pmatrix} 1 & 0 \\ w_{12} & 1 \end{pmatrix}.
\end{aligned}
$$
Step 2: Next, we look at the nonlinear regulator equations. The function $\psi(w)$ is given by
$$
\psi(w) = [B(w, \pi_0(w), 0)]^{-1}[-A(w, \pi_0(w), 0)]
= \begin{pmatrix} -2\pi_0(w) + w_{12} - w_{21} \\ -w_{12}^2 - [\pi_0(w)]^2 - \Omega_2^2 w_{21} + \Omega_1^2 w_{11}^2 \end{pmatrix}.
$$


Replacing $\psi(w)$ into the p.d.e. that defines $\pi_0(w)$ (note that only $\psi_1(w)$ is involved) we get
$$
\frac{\partial \pi_0(w)}{\partial w}s(w) = \pi_0(w) + w_{21} - 2\pi_0(w) + w_{12} - w_{21} = -\pi_0(w) + w_{12}\,.
$$
It is seen from this that $\pi_0(w)$ is a linear form in $w_1$. Setting $\pi_0(w) = c_1 w_{11} + c_2 w_{12}$, the p.d.e. reduces to a Sylvester equation
$$
\begin{pmatrix} c_1 & c_2 \end{pmatrix} S_1 = -\begin{pmatrix} c_1 & c_2 \end{pmatrix} + \begin{pmatrix} 0 & 1 \end{pmatrix}
$$
that has a unique solution. Looking now at the expression of $\psi(w)$, it is realized that $\psi_1(w)$ is a linear form in $(w_1, w_2)$, while $\psi_2(w)$ is the sum of a quadratic form in $w_1$ and of a linear form in $w_2$. In other words, we can conclude that there are vectors $\Gamma_1 \in \mathbb{R}^{1 \times 4}$ and $\Gamma_2 \in \mathbb{R}^{1 \times 5}$ such that⁹
$$
\psi_1(w) = \Gamma_1\begin{pmatrix} w_1 \\ w_2 \end{pmatrix}, \qquad \psi_2(w) = \Gamma_2\begin{pmatrix} w_1^{[2]} \\ w_2 \end{pmatrix}, \qquad \text{where } w_1^{[2]} = \begin{pmatrix} w_{11}^2 \\ w_{11}w_{12} \\ w_{12}^2 \end{pmatrix}.
$$
Step 3: We now check that Assumption 1.3 of strong minimum-phase is fulfilled. Scaling $z$ as $\tilde{z} = z - \pi_0(w)$ we get
$$
\dot{\tilde z} = \tilde{z} + x_{21} + [u_1 - \psi_1(w)]\,.
$$
In this expression, we have to replace $[u_1 - \psi_1(w)]$ by the first component of $[\tilde{B}(w, \tilde{z}, x)]^{-1}[-\tilde{A}(w, \tilde{z}, x) + \chi]$, which is
$$
-\tilde{a}_1(w, \tilde{z}, x) + \chi_1 = -2\tilde{z} - x_{21} + \chi_1\,.
$$
This yields
$$
\dot{\tilde z} = -\tilde{z} + \chi_1,
$$
which, seen as a system with input $\chi_1$, is trivially input-to-state stable.

Step 4: Having checked that Assumptions 1.1–1.3 are fulfilled, we now proceed to check that Assumption 1.4 is also fulfilled, and we determine the functions $\phi_1(\cdot), \phi_2(\cdot)$. For the function $\phi_1(\cdot)$, using the Cayley–Hamilton Theorem, it is seen that
$$
L_s^4\psi_1(w) = -(\Omega_1^2 + \Omega_2^2)L_s^2\psi_1(w) - \Omega_1^2\Omega_2^2\,\psi_1(w)\,.
$$
Thus $\eta_1 \in \mathbb{R}^4$ and
$$
\phi_1(\eta_1) = -(\Omega_1^2\Omega_2^2)\eta_{11} - (\Omega_1^2 + \Omega_2^2)\eta_{13} := \Phi_1\eta_1\,.
$$

9

Note that the actual values of Γ1 and Γ2 are not needed in the sequel.
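As a quick numerical companion to Step 2 (not part of the chapter): the Sylvester equation $(c_1\; c_2)S_1 = -(c_1\; c_2) + (0\; 1)$ is a $2\times 2$ linear solve, shown here for the illustrative choice $\Omega_1 = 1$, and the resulting $\pi_0$ is checked against the p.d.e. as a linear identity.

```python
import numpy as np

Om1 = 1.0                                   # illustrative Omega_1
S1 = np.array([[0.0, 1.0], [-Om1**2, 0.0]])

# c*S1 = -c + [0 1]  <=>  c*(S1 + I) = [0 1], solved via the transpose
c = np.linalg.solve((S1 + np.eye(2)).T, np.array([0.0, 1.0]))
print(c)   # [0.5 0.5], i.e. pi0(w) = (Om1^2 * w11 + w12) / (1 + Om1^2)

# Check the p.d.e.: coefficient identity for d(pi0)/dw * s(w) = -pi0 + w12
assert np.allclose(c @ S1, -c + np.array([0.0, 1.0]))
```

Uniqueness follows since the eigenvalues of $S_1$ ($\pm i\Omega_1$) never equal $-1$, so $S_1 + I$ is always invertible.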


For the function $\varphi_2(\cdot)$, observe that (see [19]) $\frac{d}{dt} w_1^{[2]} = S_1^{[2]} w_1^{[2]}$, where $S_1^{[2]}$ is a matrix having characteristic polynomial $p_1^{[2]}(\lambda) = \lambda^3 + 4\Omega_1^2 \lambda$. Since the characteristic polynomial of $S_2$ is $p_2(\lambda) = \lambda^2 + \Omega_2^2$, using the Cayley–Hamilton theorem, we get
\[
L_s^5 \psi_2(w) = -(4\Omega_1^2 + \Omega_2^2) L_s^3 \psi_2(w) - (4\Omega_1^2 \Omega_2^2) L_s \psi_2(w)\,.
\]
Thus $\eta_2 \in \mathbb{R}^5$ and
\[
\varphi_2(\eta_2) = -(4\Omega_1^2 \Omega_2^2)\eta_{22} - (4\Omega_1^2 + \Omega_2^2)\eta_{24} := \Phi_2 \eta_2\,.
\]

Step 5: To complete the design of the internal model, we have to fix the vectors $G_1$ and $G_2$. Since we are dealing with linear $\varphi_i(\cdot)$'s the issue is trivial. The function $F(w, \tilde\eta)$ is
\[
F(w, \tilde\eta) = \begin{pmatrix} F_1 & 0 \\ 0 & F_2 \end{pmatrix} \begin{pmatrix} \tilde\eta_1 \\ \tilde\eta_2 \end{pmatrix}
\]
in which the $F_i$'s have the form $F_i = \hat A_i + \hat B_i \Phi_i - G_i \hat C_i$ with $(\hat A_i + \hat B_i \Phi_i, \hat C_i)$ observable. Hence the $G_i$'s can be determined by standard methods.

Step 6: At this stage, we choose the dynamic extension. Since $r_1 = 1$ and $r_2 = 2$, only a 1-dimensional extension is needed:
\[
\dot\zeta_1 = v_1\,.
\]

Step 7: Finally, we add the appropriate extended observers (1.31) and define the controls $v_1$ and $\bar u_1, \bar u_2$. This yields
\[
\begin{aligned}
v_1 &= g_\ell(-d_0 \hat\xi_{11}) - d_1 \zeta_1 \\
\bar u_1 &= b_0\, g_\ell\big(\tfrac{1}{b_0}(-\sigma_1 + \zeta_1)\big) \\
\bar u_2 &= b_0\, g_\ell\big(\tfrac{1}{b_0}(-\sigma_2 - d_0 \hat\xi_{21} - d_1 \hat\xi_{22})\big)
\end{aligned}
\]
in which $\zeta_1, \hat\xi_{11}, \hat\xi_{21}, \hat\xi_{22}, \sigma_1, \sigma_2$ are states of
\[
\begin{aligned}
\dot\zeta_1 &= g_\ell(-d_0 \hat\xi_{11}) - d_1 \zeta_1 \\
\dot{\hat\xi}_{11} &= \sigma_1 + b_0\, g_\ell\big(\tfrac{1}{b_0}(-\sigma_1 + \zeta_1)\big) + \kappa_1 c_{11}(e_1 - \hat\xi_{11}) \\
\dot\sigma_1 &= \kappa_1^2 c_{10}(e_1 - \hat\xi_{11}) \\
\dot{\hat\xi}_{21} &= \hat\xi_{22} + \kappa_2 c_{22}(e_2 - \hat\xi_{21}) \\
\dot{\hat\xi}_{22} &= \sigma_2 + b_0\, g_\ell\big(\tfrac{1}{b_0}(-\sigma_2 - d_0 \hat\xi_{21} - d_1 \hat\xi_{22})\big) + \kappa_2^2 c_{21}(e_2 - \hat\xi_{21}) \\
\dot\sigma_2 &= \kappa_2^3 c_{20}(e_2 - \hat\xi_{21})\,.
\end{aligned}
\]
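The annihilating polynomial behind the Step 4 identity for $\psi_2$ can be double-checked numerically (not part of the chapter): $\psi_2$ is driven by the frequencies $\{0, 2\Omega_1\}$ coming from $w_1^{[2]}$ and $\{\Omega_2\}$ coming from $w_2$, so its annihilating polynomial is $\lambda(\lambda^2 + 4\Omega_1^2)(\lambda^2 + \Omega_2^2)$. The sketch below expands this product and compares the coefficients with the identity $L_s^5\psi_2 = -(4\Omega_1^2+\Omega_2^2)L_s^3\psi_2 - 4\Omega_1^2\Omega_2^2 L_s\psi_2$, for the illustrative values $\Omega_1 = 1$, $\Omega_2 = 2$.

```python
import numpy as np

Om1, Om2 = 1.0, 2.0   # illustrative frequencies (Omega_1 != Omega_2)

# lambda * (lambda^2 + 4*Om1^2) * (lambda^2 + Om2^2), coefficients high to low
p = np.polymul(np.polymul([1, 0], [1, 0, 4*Om1**2]), [1, 0, Om2**2])

# Coefficients of lambda^5 + (4*Om1^2 + Om2^2)*lambda^3 + 4*Om1^2*Om2^2*lambda
expected = [1, 0, 4*Om1**2 + Om2**2, 0, 4*Om1**2 * Om2**2, 0]
assert np.allclose(p, expected)
```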

The control designed in this way was implemented to solve a problem of tracking the sinusoidal references
\[
y_{1,\mathrm{ref}} = \sin t, \qquad y_{2,\mathrm{ref}} = \sin 2t,
\]

[Fig. 1.1: the tracking errors $e_1(t)$ and $e_2(t)$ over $t \in [0, 15]$.]

[Fig. 1.2: the state $z(t)$ over $t \in [0, 15]$.]

in which case the reference trajectory in the $(y_1, y_2)$ plane is the classical Lissajous "figure eight." We choose $G_1$ and $G_2$ so as to have
\[
\begin{aligned}
\det(sI - F_1) &= (s+1)(s+2)(s+3)(s+4) \\
\det(sI - F_2) &= (s+1)(s+2)(s+3)(s+4)(s+5)
\end{aligned}
\]
and
\[
b_0 = 0.6,\; d_0 = 1,\; d_1 = 2,\; c_{11} = 3,\; c_{10} = 2,\; c_{22} = 6,\; c_{21} = 11,\; c_{20} = 6,\; \ell = 100,\; \kappa_1 = 10,\; \kappa_2 = 20\,.
\]

Then, we have run a simulation assuming $x_{11}(0) = 0.5$ and $x_{21}(0) = x_{22}(0) = z(0) = 0$ for the state of the plant, and $\eta_1(0), \eta_2(0), \zeta_1(0), \hat\xi_1(0), \hat\xi_2(0), \sigma_1(0), \sigma_2(0)$ all zero for the state of the controller. The results of the simulation are shown in the following figures (Figs. 1.1, 1.2, 1.3, and 1.4).
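The observer gains $G_i$ of Step 5 can be computed by standard pole placement. The sketch below (not from the chapter) does this for $F_1$ with $\Omega_1 = 1$, $\Omega_2 = 2$, under the assumption — since the realization (1.31) is not reproduced in this excerpt — that $(\hat A_1 + \hat B_1 \Phi_1, \hat C_1)$ is the chain-of-integrators (companion) form with output equal to the first component.

```python
import numpy as np
from scipy.signal import place_poles

Om1, Om2 = 1.0, 2.0
# Assumed chain-of-integrators realization of the internal model
A1 = np.diag(np.ones(3), 1)                      # 4x4 shift matrix
B1 = np.array([[0.0], [0.0], [0.0], [1.0]])
Phi1 = np.array([[-Om1**2 * Om2**2, 0.0, -(Om1**2 + Om2**2), 0.0]])
C1 = np.array([[1.0, 0.0, 0.0, 0.0]])
Acl = A1 + B1 @ Phi1          # companion matrix, char poly (s^2+1)(s^2+4)

# Observer gain G1 so that det(sI - (Acl - G1*C1)) = (s+1)(s+2)(s+3)(s+4)
res = place_poles(Acl.T, C1.T, [-1.0, -2.0, -3.0, -4.0])
G1 = res.gain_matrix.T
F1 = Acl - G1 @ C1
print(np.round(np.sort(np.linalg.eigvals(F1).real), 6))   # [-4. -3. -2. -1.]
```

The duality trick — placing the poles of $(\hat A^T, \hat C^T)$ and transposing the gain — is the standard way to compute an observer gain with a state-feedback pole-placement routine.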

[Fig. 1.3: the observer states $\hat\xi$ and $\sigma$ over $t \in [0, 15]$.]

[Fig. 1.4: the outputs in the $(y_1, y_2)$ plane, tracing the figure-eight reference.]

References

1. Brockett, R.W.: Feedback invariants for nonlinear systems. IFAC Congr. 6, 1115–1120 (1978)
2. Jakubczyk, B., Respondek, W.: On linearization of control systems. Bull. Acad. Polonaise Sci. Ser. Sci. Math. 28, 517–522 (1980)
3. Freidovich, L.B., Khalil, H.K.: Performance recovery of feedback-linearization-based designs. IEEE Trans. Autom. Control 53, 2324–2334 (2008)
4. Willems, J.C.: Almost invariant subspaces: an approach to high-gain feedback design – Part I. IEEE Trans. Autom. Control 26, 235–252 (1981)
5. Descusse, J.C., Moog, C.H.: Decoupling with dynamic compensation for strong invertible affine nonlinear systems. Int. J. Control 43, 1387–1398 (1985)
6. Wu, Y., Isidori, A., Lu, R., Khalil, H.: Performance recovery of dynamic feedback-linearization methods for multivariable nonlinear systems. IEEE Trans. Autom. Control AC-65, 1365–1380 (2020)
7. Wang, L., Isidori, A., Marconi, L., Su, H.: Stabilization by output feedback of multivariable invertible nonlinear systems. IEEE Trans. Autom. Control AC-62, 2419–2433 (2017)
8. Schwartz, B., Isidori, A., Tarn, T.J.: Global normal forms for MIMO nonlinear systems, with applications to stabilization and disturbance attenuation. Math. Control Signals Syst. 12, 121–142 (1999)


9. Liu, X., Lin, Z.: On normal forms of nonlinear systems affine in control. IEEE Trans. Autom. Control AC-56, 1–15 (2011)
10. Isidori, A.: Lectures in Feedback Design for Multivariable Systems. Springer, London (2017)
11. Singh, S.N.: A modified algorithm for invertibility in nonlinear systems. IEEE Trans. Autom. Control 26(2), 595–598 (1981)
12. Wang, L., Isidori, A., Marconi, L., Su, H.: A consequence of invertibility on the normal form of a MIMO nonlinear system. IFAC-PapersOnLine 49, 856–861 (2016)
13. Liberzon, D., Morse, A.S., Sontag, E.D.: Output-input stability and minimum-phase nonlinear systems. IEEE Trans. Autom. Control 47, 422–436 (2002)
14. Liberzon, D.: Output-input stability implies feedback stabilization. Syst. Control Lett. 53, 237–248 (2004)
15. Lin, Y., Sontag, E.D., Wang, Y.: Various results concerning set input-to-state stability. In: Proceedings of the 34th IEEE CDC, pp. 1330–1335 (1995)
16. Sontag, E.D., Wang, Y.: New characterizations of input-to-state stability. IEEE Trans. Autom. Control AC-41, 1283–1294 (1996)
17. Isidori, A.: Nonlinear Control Systems, 3rd edn. Springer, London (1995)
18. Isidori, A., Byrnes, C.I.: Output regulation of nonlinear systems. IEEE Trans. Autom. Control 35, 131–140 (1990)
19. Huang, J.: Nonlinear Output Regulation: Theory and Applications. SIAM Series: Advances in Design and Control, Philadelphia (2004)
20. Marconi, L., Praly, L., Isidori, A.: Output stabilization via nonlinear Luenberger observers. SIAM J. Control Optim. 45(6), 2277–2298 (2007)
21. Wu, Y., Isidori, A., Lu, R.: Output regulation of invertible nonlinear systems via robust dynamic feedback-linearization. IEEE Trans. Autom. Control, to appear (2021)
22. Byrnes, C.I., Isidori, A.: Nonlinear internal models for output regulation. IEEE Trans. Autom. Control AC-49, 2244–2247 (2004)
23. Sontag, E.D., Teel, A.R.: Changing supply functions in input/state stable systems. IEEE Trans. Autom. Control AC-40, 1476–1478 (1995)

Chapter 2

Continuous-Time Implementation of Reset Control Systems Andrew R. Teel

Abstract This chapter presents a continuous-time implementation of a reset control system that has a linear flow map, a linear jump map, and a quadratic function that describes the jump set. The implementation is a homogeneous differential inclusion that depends only on the data of the reset system and matches the flows of the reset system in the flow set. Assuming that the reset control system admits a strongly convex Lyapunov function that establishes stability of its origin, the continuous-time implementation has the origin globally exponentially stable. In particular, the continuous-time implementation eliminates purely discrete-time solutions of the reset system that do not converge. The behavior of the continuous-time implementation is illustrated through multiple examples.

2.1 Introduction

This chapter is dedicated to Laurent Praly, who has impacted the problems I have worked on and the solutions I have found ever since my brief, but indelible, postdoctoral visit during the last six months of 1992. During the virtual workshop held to celebrate his 65th birthday, as part of the 59th IEEE Conference on Decision and Control, Laurent mentioned how Petar Kokotović encouraged him to stop working on discrete-time systems and move to continuous-time systems, where Petar envisioned that Laurent could conceive the most intricate and successful of Lyapunov approaches to nonlinear control problems. While Laurent has not suggested that I stop working on hybrid systems, for this particular work, I have chosen to shift back to continuous time, where Laurent is the dominant player, showing how a reset control system, which is a particular type of hybrid system, can be implemented in continuous time. Fittingly, from my point of view, the continuous-time system is a differential inclusion, which I began to understand more deeply while working

A. R. Teel (B) University of California, Santa Barbara, CA 93106-9560, USA e-mail: [email protected] © The Author(s), under exclusive license to Springer Nature Switzerland AG 2022 Z.-P. Jiang et al. (eds.), Trends in Nonlinear and Adaptive Control, Lecture Notes in Control and Information Sciences 488, https://doi.org/10.1007/978-3-030-74628-5_2


with Laurent on disturbance attenuation problems and converse Lyapunov theorems roughly twenty years ago. I am grateful for all I have learned from the brilliant Laurent Praly.

In this chapter, a continuous-time implementation of a class of reset control systems is presented. Reset control systems originated with the work of J.C. Clegg in 1958 [5], who was motivated to develop a nonlinear integrating circuit with less phase lag than a standard linear integrator. Clegg's integrator caught the attention of Krishnan and Horowitz [16] in the 1970s who described it as "an integrator which resets to zero at zero crossings of the input and is an ordinary integrator between zero crossings." Shortly thereafter, Horowitz and Rosenbaum [12] developed the more general "First-Order Reset Elements" (FOREs), "whose output ... also resets to zero at zero crossings of the input," to address perceived "shortcomings" of the Clegg integrator. Stability analysis for control loops with Clegg integrators and FOREs began in earnest with the work of Hollot and coauthors in the late 1990s [2, 4, 10, 11, 13, 14]. Significant additional progress was made on the analysis of feedback loops with Clegg integrators and FOREs when the zero-crossing-triggered interpretation of the resetting mechanism was replaced by a sector-triggered interpretation and piecewise-quadratic Lyapunov functions were used for the analysis [20–22]. Since that time, there has been an explosion of papers on reset control systems; at the writing of this chapter, nearly 390 of the 430 references to the original paper by Clegg (90%) have appeared since 2005. The paper [21] was one of the first works to cast a closed-loop reset control system as a hybrid system explicitly.
It used the hybrid systems framework of [6–9] to model the closed-loop system, showed how to use temporal regularization to eliminate purely discrete-time solutions that do not converge, and demonstrated the advantages of pursuing non-quadratic Lyapunov functions for stability analysis. This chapter explores implementing a reset control system using a differential inclusion that depends on the data of the hybrid system rather than implementing the hybrid system itself. One reason for pursuing such a result is that (discontinuous) differential equations are more familiar, compared to hybrid systems, to the dynamical systems and control engineering communities. Perhaps more significantly, the differential inclusion implementation obviates the need for any temporal regularization in the hybrid system to avoid purely discrete-time solutions that do not converge. Equivalences between certain classes of hybrid systems and differential inclusions arising as projected dynamical systems have appeared in the literature; see [3] and the references therein. The connection between projected dynamical systems and the differential inclusion used in this chapter is not explored here. The main result of this chapter, Theorem 2.1, establishes conditions under which the origin of the prescribed differential inclusion is globally exponentially stable. The main assumption is that the hybrid reset control system admits a continuously differentiable, strongly convex, homogeneous Lyapunov function that establishes stability, though not necessarily asymptotic stability, of the origin; see Assumption 2.1. Assuming the existence of a strongly convex, homogeneous Lyapunov function is not outlandish: results on the existence of convex, homogeneous Lyapunov functions in switched, but not hybrid, linear settings, can be found in [18]. Moreover, some


reset control systems admit positive definite, quadratic Lyapunov functions, which are necessarily homogeneous and strongly convex. The final section of the chapter considers several examples, each of which corresponds to a reset control system admitting a positive definite, quadratic Lyapunov function that establishes stability. The example in Sect. 2.4.4 considers a setting that is the genesis for the general results in this chapter: a differential inclusion for accelerated convex optimization [1, 17] that is inspired by a hybrid algorithm for accelerated convex optimization as in [24] and the references therein.

2.2 Objective and Primary Assumption

The objective of this chapter is to find a continuous-time implementation of the hybrid, reset control system
\[
\begin{aligned}
\dot x &= Ax, & x &\in C := \{x \in \mathbb{R}^n : x^T M x \le 0\} & \text{(2.1a)} \\
x^+ &= Rx, & x &\in D := \{x \in \mathbb{R}^n : x^T M x \ge 0\} & \text{(2.1b)}
\end{aligned}
\]
such that the origin of the implementation is globally exponentially stable when the following conditions hold for the data of the reset control system:

Assumption 2.1 The matrices $A, R, M = M^T \in \mathbb{R}^{n\times n}$ are such that

1. there exist $\varepsilon > 0$ and a continuously differentiable, strongly convex, homogeneous of degree two, positive definite function $V : \mathbb{R}^n \to \mathbb{R}_{\ge 0}$ such that, with the definition $C_\varepsilon := \{x \in \mathbb{R}^n : x^T M x \le \varepsilon x^T x\}$, the following inequalities hold:
\[
\begin{aligned}
x \in C_\varepsilon &\implies \langle \nabla V(x), Ax \rangle \le 0, & \text{(2.2a)} \\
x \in D &\implies V(Rx) - V(x) \le 0. & \text{(2.2b)}
\end{aligned}
\]
Here, "homogeneous of degree two" means that $V(\lambda x) = \lambda^2 V(x)$ for all $\lambda > 0$ and $x \in \mathbb{R}^n$; "strongly convex" means that there exists $\mu > 0$ such that, for all $(x, y) \in \mathbb{R}^n \times \mathbb{R}^n$,
\[
V(y) \ge V(x) + \langle \nabla V(x), y - x \rangle + \mu |y - x|_2^2. \tag{2.3}
\]
2. $x \in D$ implies $Rx \in C$, where $C$ and $D$ are defined in (2.1).
3. There is no solution of (2.1a) with an unbounded time domain that keeps the function $V$ of item 1 equal to a nonzero constant.

While item 1 of Assumption 2.1 is enough to guarantee that the origin of (2.1) is stable, the totality of Assumption 2.1 is not strong enough to guarantee exponential stability of the origin for the reset system (2.1). Indeed, for any $x_\circ \in \mathbb{R}^n \setminus \{0\}$ such that $Rx_\circ = x_\circ$ (there exists such $x_\circ$ whenever $R - I$ is singular, which is the case


for most of the examples considered in Sect. 2.4), there is the solution $x(0, j) = x_\circ$ for all $j \in \mathbb{Z}_{\ge 0}$. More generally, it is common for reset control systems to have purely discrete-time solutions that do not converge to zero and that must therefore be removed with some type of temporal or space regularization. This fact is part of the motivation for pursuing a continuous-time implementation of (2.1). The conditions of Assumption 2.1 are also not strong enough for exponential stability of the origin for (2.1) even if the focus is on solutions with time domains that are unbounded in the ordinary time direction, as illustrated by the following example.

Example 2.1 Consider
\[
A := \begin{bmatrix} 0 & 1 \\ -1 & 0 \end{bmatrix}, \quad
R := \begin{bmatrix} 0 & -1 \\ 1 & 0 \end{bmatrix}, \quad
M := \begin{bmatrix} 0 & 1 \\ 1 & 0 \end{bmatrix}. \tag{2.4}
\]
With these definitions, the flow set $C$ is the union of the second and fourth quadrants in the plane while the jump set is the union of the first and third quadrants. The reset map $R$ corresponds to a rotation of 90 degrees in the counterclockwise direction and the flows move along circles in the clockwise direction at constant speed. Consequently, items 2 and 3 of Assumption 2.1 hold. Finally, item 1 of Assumption 2.1 holds with $V(x) := x^T x$ for all $x \in \mathbb{R}^2$. This implies that the origin of (2.1) is stable. The origin of (2.1) is not exponentially stable since each circle is forward invariant under flows and jumps. The results of this chapter will show that a continuous-time implementation of this reset control system transforms the origin from being stable but not attractive to being exponentially stable. Simulations of the continuous-time implementation of this system appear in Sect. 2.4.1.
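The structural claims of Example 2.1 can be checked with a few matrix identities (this sketch is not part of the chapter): $A$ is skew-symmetric (flows preserve $V(x) = x^Tx$), $R$ is a rotation ($R^TR = I$, so jumps preserve $V$), and $R^TMR = -M$, so $x \in D$ implies $Rx \in C$, which is item 2.

```python
import numpy as np

A = np.array([[0.0, 1.0], [-1.0, 0.0]])
R = np.array([[0.0, -1.0], [1.0, 0.0]])
M = np.array([[0.0, 1.0], [1.0, 0.0]])

# Item 1 with V(x) = x'x: flows and jumps both preserve V
assert np.allclose(A + A.T, 0)              # A skew-symmetric
assert np.allclose(R.T @ R, np.eye(2))      # R a rotation

# Item 2: (Rx)' M (Rx) = x' (R'MR) x = -x'Mx <= 0 whenever x'Mx >= 0
assert np.allclose(R.T @ M @ R, -M)
```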

2.3 Continuous-Time Implementation and Main Result

To implement the system (2.1) in continuous time, while achieving exponential stability of the origin, consider the differential inclusion
\[
\dot x \in Ax + \gamma \cdot \big(\mathrm{SGN}(x^T M x) + 1\big)(Rx - x) =: F(x), \tag{2.5}
\]
where $\gamma > 0$ is sufficiently large. The set-valued mapping $\mathrm{SGN} : \mathbb{R} \rightrightarrows \mathbb{R}$ is equal to the sign of its argument except at zero, where it is equal to the interval $[-1, 1]$. The resulting set-valued mapping $F$ is outer semicontinuous (that is, its graph is closed) and locally bounded with convex values. If the state $x$ comprises a plant state $x_p \in \mathbb{R}^{n_p}$ and a compensator state $x_c \in \mathbb{R}^{n_c}$, i.e., $x := (x_p^T, x_c^T)^T \in \mathbb{R}^{n_p + n_c}$, and the reset map does not change the plant state, i.e.,
\[
Rx - x = \begin{bmatrix} 0_{n_p} \\ \star \end{bmatrix} \quad \forall x \in \mathbb{R}^n, \tag{2.6}
\]
then the solutions of the differential inclusion (2.5) satisfy
\[
\dot x_p = \begin{bmatrix} I & 0 \end{bmatrix} \dot x = \begin{bmatrix} I & 0 \end{bmatrix} Ax. \tag{2.7}
\]
In other words, no modification of the plant dynamics is needed to implement (2.5) for control systems with plant states that do not reset. Other than this feature, the solutions of (2.5) may not bear much resemblance to the solutions of (2.1). For an elaboration on this point, see Sect. 2.4.1. However, the following result holds:

Theorem 2.1 Under Assumption 2.1, the origin of (2.5) is globally exponentially stable for $\gamma > 0$ sufficiently large.

Proof It is straightforward to see that $F(\lambda x) = \lambda F(x)$ for all $\lambda > 0$ and $x \in \mathbb{R}^n$. Hence global exponential stability of the origin is equivalent to asymptotic stability of the origin; see [19, Theorem 11], for example. Asymptotic stability of the origin is established now. For the origin of (2.5), consider the Lyapunov candidate $V$, which can be seen to be positive definite and radially unbounded using (2.3) with $x = 0$ and $y \in \mathbb{R}^n$ arbitrary.

Step 1: Bounding $\langle \nabla V(x), f_2 \rangle$ for $f_2 \in (\mathrm{SGN}(x^T M x) + 1)(Rx - x)$. Combining (2.3) with $y = Rx$, (2.2b), and the definition of $D$ in (2.1b), it follows that
\[
x^T M x \ge 0 \implies \langle \nabla V(x), Rx - x \rangle \le -\mu |Rx - x|_2^2. \tag{2.8}
\]
In turn, from the definition of $\mathrm{SGN}$, it follows that
\[
s \in \mathrm{SGN}(x^T M x) \implies \langle \nabla V(x), (s+1)(Rx - x) \rangle \le -(s+1)\,\mu |Rx - x|_2^2. \tag{2.9}
\]
Letting $\sigma > 0$ satisfy $|M(Rx + x)|_2 \le \sigma |x|_2$ for all $x \in \mathbb{R}^n$ and then using the Cauchy–Schwarz inequality, item 2 in Assumption 2.1, and $M = M^T$, it follows that
\[
x^T M x \ge 0,\; x \ne 0 \implies
|Rx - x|_2 \ge \frac{-(Rx - x)^T M (Rx + x)}{\sigma |x|_2}
= \frac{x^T M x - x^T R^T M R x}{\sigma |x|_2}
\ge \frac{x^T M x}{\sigma |x|_2}. \tag{2.10}
\]
Combining (2.9) and (2.10) results in
\[
x \ne 0,\; s \in \mathrm{SGN}(x^T M x) \implies
\langle \nabla V(x), (s+1)(Rx - x) \rangle \le -2\mu \max\{0, x^T M x\}\, \frac{x^T M x}{\sigma^2 |x|_2^2}. \tag{2.11}
\]


Step 2: Bounding $\langle \nabla V(x), Ax \rangle$. Due to item 1 of Assumption 2.1, this quantity is not positive when $x^T M x \le \varepsilon |x|_2^2$. For $x^T M x \ge \varepsilon |x|_2^2$, using the homogeneity of degree two for $V$, and hence the homogeneity of degree one for $\nabla V$ due to Euler's homogeneous function theorem, it follows that there exists $\kappa > 0$ such that
\[
x^T M x \ge \varepsilon |x|_2^2 > 0 \implies
\langle \nabla V(x), Ax \rangle \le \kappa |x|_2^2
\le \frac{\kappa}{\varepsilon}\, x^T M x
\le \frac{\kappa \sigma^2}{\varepsilon^2}\, \max\{0, x^T M x\}\, \frac{x^T M x}{\sigma^2 |x|_2^2}. \tag{2.12}
\]

Step 3: Combining the previous steps and analyzing solutions. It follows from (2.2a) and (2.12) together with (2.11) that, for each $\nu > 0$ there exists $\gamma > 0$ sufficiently large, such that
\[
x \ne 0,\; s \in \mathrm{SGN}(x^T M x) \implies
\langle \nabla V(x), Ax + \gamma \cdot (s+1)(Rx - x) \rangle \le -\nu \max\{0, x^T M x\}\, \frac{x^T M x}{|x|_2^2} \le 0. \tag{2.13}
\]
It follows that the origin of (2.5) is stable for $\gamma > 0$ sufficiently large and, by the invariance principle for differential inclusions (see [23, Theorem 1], for example), which applies due to the properties of $F$ listed below (2.5), the origin is asymptotically stable if and only if there is no solution $x : \mathbb{R}_{\ge 0} \to \mathbb{R}^n$ and $c > 0$ such that $V(x(t)) = c$ for all $t \ge 0$. Being a solution of (2.5), $x(\cdot)$ satisfies, for almost all $t$,
\[
\begin{aligned}
\dot x(t) &= Ax(t) + \gamma \cdot (s(t)+1)(Rx(t) - x(t)), & \text{(2.14a)} \\
s(t) &\in \mathrm{SGN}\big(x^T(t) M x(t)\big). & \text{(2.14b)}
\end{aligned}
\]
Assuming that $t \mapsto V(x(t))$ is a nonzero constant, by the chain rule, for almost all $t$,
\[
0 = \langle \nabla V(x(t)), Ax(t) + \gamma \cdot (s(t)+1)(Rx(t) - x(t)) \rangle. \tag{2.15}
\]
According to (2.13), such a solution requires $x^T(t) M x(t) \le 0$ for all $t \ge 0$. In turn, it follows from (2.2a) and (2.9) and the positivity of $\gamma$ that, for almost all $t$,
\[
\begin{aligned}
0 &= \langle \nabla V(x(t)), Ax(t) \rangle, & \text{(2.16a)} \\
0 &= \langle \nabla V(x(t)), \gamma \cdot (s(t)+1)(Rx(t) - x(t)) \rangle. & \text{(2.16b)}
\end{aligned}
\]


Again with (2.9) and the positivity of $\gamma$ and $\mu$, it follows that, for almost all $t$,
\[
(s(t)+1)\,|Rx(t) - x(t)|_2 = 0. \tag{2.17}
\]
It then follows from (2.14a) that $x(\cdot)$ is also a solution of (2.1a). In turn, it follows from item 3 of Assumption 2.1 that $x(\cdot)$ does not keep $V$ equal to a nonzero constant. That is, $V(x(t)) = c > 0$ for all $t \in \mathbb{R}_{\ge 0}$ is impossible. □

Remark 2.1 It follows from the proof that, when condition (2.2a) is strengthened to $\langle \nabla V(x), Ax \rangle \le 0$ for all $x \in \mathbb{R}^n$, the global exponential stability result holds even if $\gamma > 0$ is not large, $V$ is not homogeneous, and $V$ is just strictly convex. □

2.4 Examples and Simulations

In this section, the behavior of the differential inclusion in (2.5) is illustrated for several examples. For simplicity, examples where the reset control system admits a quadratic Lyapunov function are used.

2.4.1 Example 2.1 Revisited

Consider the matrices $A, R, M$ of (2.4). As indicated previously in Example 2.1, the origin of (2.1) is not exponentially stable, as each circle is invariant. On the other hand, Assumption 2.1 holds with $V(x) = x^T x$, so the result of Theorem 2.1 applies to the system (2.5) for this $A, R, M$. In contrast to the subsequent examples, the simulations for (2.5) for this example are especially "stiff" numerically for $\gamma > 0$ large, since the second term in the differential inclusion causes sliding along the vertical axis toward the origin. Indeed, at the boundary of the third and fourth quadrants, $Ax$ points directly to the left (into the jump set) while $Rx - x$ points to the right (into the flow set) and up. Similarly, at the boundary of the first and second quadrants, $Ax$ points directly to the right (into the jump set) while $Rx - x$ points to the left (into the flow set) and down. A simulation of the state trajectory for the system (2.5), for the given $A, R, M$, using a fixed, small (0.0001) step size and $\gamma = 100$ from the initial condition $(1/\sqrt{2}, -1/\sqrt{2})^T$ is shown in Fig. 2.1. The resulting trajectory bears little resemblance to the trajectories of the reset control system (2.1) for this $A, R, M$.
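A crude version of this simulation can be reproduced as follows. The sketch below (not the chapter's MATLAB code) uses a fixed-step forward-Euler scheme with the single-valued selection $s = \mathrm{sign}(x^TMx)$, which chatters across the sliding surface rather than computing the Filippov solution, but still exhibits the exponential decay along the vertical axis.

```python
import numpy as np

# Data of Example 2.1 and a forward-Euler selection scheme for (2.5)
A = np.array([[0.0, 1.0], [-1.0, 0.0]])
R = np.array([[0.0, -1.0], [1.0, 0.0]])
M = np.array([[0.0, 1.0], [1.0, 0.0]])
gamma, h, steps = 100.0, 1e-4, 100000      # t in [0, 10]

x = np.array([1.0, -1.0]) / np.sqrt(2.0)
for _ in range(steps):
    s = np.sign(x @ M @ x)                 # selection from SGN(x'Mx)
    x = x + h * (A @ x + gamma * (s + 1.0) * (R @ x - x))

print(np.linalg.norm(x))                   # far below the initial norm of 1
assert np.linalg.norm(x) < 0.1
```

In the flow set the scheme is a pure rotation; once the trajectory reaches the negative vertical axis, the chattering between the two vector fields approximates the sliding motion toward the origin described in the text.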


[Fig. 2.1: the values of $x_1(t)$ and $x_2(t)$ as a function of time $t$ for Example 2.1 revisited, implemented with the differential inclusion (2.5) using $\gamma = 100$ and $x_\circ = (1/\sqrt{2}, -1/\sqrt{2})^T$.]

2.4.2 A Clegg Integrator Controlling a Single Integrator System

Consider the data
\[
A := \begin{bmatrix} 0 & 1 \\ -1 & 0 \end{bmatrix}, \quad
R := \begin{bmatrix} 1 & 0 \\ 0 & 0 \end{bmatrix}, \quad
M := \begin{bmatrix} 0 & 1 \\ 1 & 0 \end{bmatrix}, \tag{2.18}
\]
which corresponds to a Clegg integrator [5] controlling a single integrator plant using negative feedback. The origin of $\dot x = Ax$ is stable but not exponentially stable. The origin of the reset control system (2.1) is globally exponentially stable with convergence to the origin in finite time. Regarding Assumption 2.1, item 1 holds with $V(x) = x^T x$, item 2 holds since $(Rx)^T M Rx = 0$, and item 3 holds since the flows oscillate and hence always leave the flow set, unless starting at the origin. For a simulation of the system (2.5) with $A, R, M$ as in (2.18) from the initial condition $(1, 0)^T$ using $\gamma = 100$, Fig. 2.2 shows the evolution of the state while Fig. 2.3 shows the evolution of the Lyapunov function plotted on a log scale. The behavior of the state is quite similar to the behavior of the state for the reset system, until just after the first jump time. Indeed, the flow is identical until the first jump time of the reset control system, which occurs at $\pi/2$ seconds. At that time, the reset system state jumps to the origin exactly; the plant state $x_1$ reaches zero at that time due to the flows while the controller state $x_2$ is reset to zero at that time. In the continuous-time implementation, the state $x_2$ is quickly driven close to zero by the extra term in


Fig. 2.2 The values of x1 (t) and x2 (t) as a function of time t for a Clegg integrator controlling a single integrator plant using negative feedback, implemented with the differential inclusion (2.5) using γ = 100 and x◦ = (1, 0)T

the differential inclusion. Subsequently, this pattern repeats itself continually but at smaller and smaller scales.
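The behavior just described can be reproduced with the same forward-Euler selection scheme as before (an illustrative sketch, not the chapter's simulation code): the flow is a pure rotation until $t = \pi/2$, after which the extra term drives $x_2$ toward zero on a fast time scale of order $1/(2\gamma)$ and the state collapses toward the origin.

```python
import numpy as np

# Data (2.18): Clegg integrator + single integrator; Euler scheme for (2.5)
A = np.array([[0.0, 1.0], [-1.0, 0.0]])
R = np.array([[1.0, 0.0], [0.0, 0.0]])
M = np.array([[0.0, 1.0], [1.0, 0.0]])
gamma, h, steps = 100.0, 1e-4, 30000       # t in [0, 3]

x = np.array([1.0, 0.0])
for k in range(steps):
    s = np.sign(x @ M @ x)
    x = x + h * (A @ x + gamma * (s + 1.0) * (R @ x - x))
    if k == 14999:                          # t = 1.5 < pi/2: still pure rotation
        assert 0.9 < np.linalg.norm(x) < 1.1

# after the reset-like transient near t = pi/2 the norm has collapsed
assert np.linalg.norm(x) < 0.05
```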

2.4.3 A Bank of Clegg Integrators Controlling a Strictly Passive System

Let $n_p$ and $n_u$ be positive integers, let $A_p \in \mathbb{R}^{n_p \times n_p}$, $B_p \in \mathbb{R}^{n_p \times n_u}$ and $C_p \in \mathbb{R}^{n_u \times n_p}$ be such that, for some $P = P^T > 0$,
\[
A_p^T P + P A_p < 0, \qquad P B_p = C_p^T. \tag{2.19}
\]
Due to these conditions, the linear system $(A_p, B_p, C_p)$ is state strictly passive; see [15, Sect. 6.3] for example. Let $n := n_p + n_u$, and let $A, R, M \in \mathbb{R}^{n\times n}$ be defined as
\[
A := \begin{bmatrix} A_p & B_p \\ -C_p & 0 \end{bmatrix}, \quad
R := \begin{bmatrix} I_{n_p} & 0 \\ 0 & 0 \end{bmatrix}, \quad
M := \begin{bmatrix} 0 & C_p^T \\ C_p & 0 \end{bmatrix}. \tag{2.20}
\]
Using the Lyapunov function candidate


Fig. 2.3 The value of log10 (V (x(t))) as a function of time t for a Clegg integrator controlling a single integrator plant using negative feedback, implemented with the differential inclusion (2.5) using γ = 100 and x◦ = (1, 0)T

\[
V(x) := x^T \operatorname{diag}(P, I)\, x, \tag{2.21}
\]
it follows that the origin of $\dot x = Ax$ is stable. It follows from the invariance principle [15, Sect. 4.2] that the origin of $\dot x = Ax$ is exponentially stable if and only if the null space of $B_p$ is the origin, i.e., $B_p$ has full column rank. The reset control system (2.1) has purely discrete-time solutions that do not converge to the origin from any nonzero point $x_\circ$ such that the last $n_u$ components of $x_\circ$ are zero. To reiterate, this behavior is one of the primary motivations for pursuing the results in this chapter. Regarding Assumption 2.1, item 1 holds with $V$ defined in (2.21), item 2 holds since $(Rx)^T M Rx = 0$, and item 3 holds if and only if $B_p$ has full column rank. To simulate an example, let $n_p = 10$, $n_u = 3$, and generate random matrices $A_p$ (Hurwitz) and $B_p$ (with full column rank) of appropriate dimension using the MATLAB command "rss" and then rounding to two decimal places to facilitate repeatability. The simulations reported here use


\[
A_p = \begin{bmatrix}
-1.25 & 0.73 & 0 & -0.27 & -0.31 & 0.70 & 0.32 & -0.04 & 0.34 & 0.35 \\
0.67 & -1.14 & -0.16 & 0.02 & 0.40 & -0.60 & -0.37 & -0.48 & -0.21 & -0.34 \\
0.16 & -0.33 & -0.73 & 0.44 & 0.28 & 0.34 & -0.34 & -0.05 & -0.12 & -0.20 \\
-0.57 & 0.34 & 0.42 & -1.27 & 0.52 & -0.58 & 0.51 & -0.39 & -0.24 & 0.43 \\
-0.13 & 0.22 & 0.25 & 0.59 & -3.38 & -0.50 & -0.27 & -0.44 & 1.02 & 0.43 \\
0.22 & -0.18 & 0.56 & -1.03 & -0.34 & -2.24 & 0.17 & -0.98 & -0.03 & 0.23 \\
-0.02 & -0.10 & -0.09 & 0.03 & -0.05 & -0.09 & -0.72 & 0.22 & 0.69 & -0.91 \\
0.01 & -0.53 & -0.06 & -0.37 & -0.44 & -1.02 & 0.16 & -2.06 & -0.38 & 0.49 \\
0.77 & -0.60 & -0.29 & 0.12 & 0.90 & -0.11 & 0.40 & -0.41 & -1.59 & 0.48 \\
-0.25 & 0.18 & 0.14 & -0.24 & 0.70 & 0.06 & -0.70 & 0.56 & 0.74 & -1.50
\end{bmatrix},
\quad
B_p = \begin{bmatrix}
0.02 & 0.52 & -0.29 \\
-0.26 & -0.02 & -0.85 \\
0 & 0 & -1.12 \\
-0.29 & 0 & 2.53 \\
-0.83 & 1.02 & 1.66 \\
-0.98 & -0.13 & 0.31 \\
-1.16 & -0.71 & -1.26 \\
-0.53 & 1.35 & -0.87 \\
-2.00 & -0.22 & -0.18 \\
0 & -0.59 & 0.79
\end{bmatrix}.
\]
It can be verified numerically that the matrix $A_p$ is Hurwitz and the matrix $B_p$ has full column rank. Then generate $P = P^T > 0$ and $C_p$ via $A_p^T P + P A_p = -I$, $C_p := B_p^T P$. Finally, pick an initial condition $x_\circ \in \mathbb{R}^{n_p + n_u}$ using the MATLAB command "randn" and then rounding to two decimal places. The simulations reported here use
\[
x_\circ := \begin{pmatrix} -0.65 & 1.19 & -1.61 & -0.02 & -1.95 & 1.02 & 0.86 & 0 & -0.07 & -2.49 & 0.58 & -2.19 & -2.32 \end{pmatrix}^T.
\]
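The construction just described — solve a Lyapunov equation for $P$, then set $C_p := B_p^T P$ so that (2.19) holds — can be sketched in a few lines; this is not the chapter's MATLAB code, and it uses a small deterministic Hurwitz $A_p$ in place of the 10-state random example.

```python
import numpy as np
from scipy.linalg import solve_continuous_lyapunov, block_diag

rng = np.random.default_rng(1)
n_p, n_u = 4, 2                   # small stand-ins for n_p = 10, n_u = 3
A_p = -np.eye(n_p) + 0.2 * np.diag(np.ones(n_p - 1), 1)   # Hurwitz (triangular)
B_p = rng.standard_normal((n_p, n_u))
assert np.linalg.matrix_rank(B_p) == n_u                  # full column rank

# P = P' > 0 from A_p'P + P A_p = -I, then C_p := B_p' P, as in the text
P = solve_continuous_lyapunov(A_p.T, -np.eye(n_p))
C_p = B_p.T @ P
assert np.allclose(P, P.T) and np.all(np.linalg.eigvalsh(P) > 0)
assert np.allclose(P @ B_p, C_p.T)                        # condition (2.19)

# Closed-loop data (2.20); jumps zero the Clegg states, so (Rx)'M(Rx) = 0
A = np.block([[A_p, B_p], [-C_p, np.zeros((n_u, n_u))]])
R = block_diag(np.eye(n_p), np.zeros((n_u, n_u)))
M = np.block([[np.zeros((n_p, n_p)), C_p.T], [C_p, np.zeros((n_u, n_u))]])
assert np.allclose(R.T @ M @ R, 0)

# Along flows, V(x) = x' diag(P, I) x decreases in x_p only:
# diag(P,I)A + A'diag(P,I) = diag(-I, 0), thanks to P B_p = C_p'
PI = block_diag(P, np.eye(n_u))
Q = PI @ A + A.T @ PI
assert np.allclose(Q, block_diag(-np.eye(n_p), np.zeros((n_u, n_u))))
```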

Figure 2.4 shows the evolution of t → V (x(t)), on a log scale, for the linear system x˙ = Ax (dashed curve), which is the same as the differential inclusion (2.5) with γ = 0, and for the differential inclusion (2.5) with γ = 100 (solid curve). The speed of convergence in the latter case compared to the former case is a potential advantage of using a (continuous-time implementation of a) reset control system.


Fig. 2.4 The values of log10 (V (x(t))) as a function of time t for a bank of Clegg integrators controlling a strictly passive system, implemented with the differential inclusion (2.5) using γ = 100 (solid curve) and γ = 0 (dashed curve)

2.4.4 A Bank of Stable FOREs Controlling a Detectable Passive System

Let $n_p$ and $n_u$ be positive integers, let $A_p \in \mathbb{R}^{n_p \times n_p}$, $B_p \in \mathbb{R}^{n_p \times n_u}$ and $C_p \in \mathbb{R}^{n_u \times n_p}$ be such that $(C_p, A_p)$ is detectable and, for some $P = P^T > 0$,
\[
A_p^T P + P A_p \le 0, \qquad P B_p = C_p^T. \tag{2.22}
\]
Due to these conditions, the linear system $(A_p, B_p, C_p)$ is passive; see [15, Sect. 6.3] for example. Let $n := n_p + n_u$, let $\sigma > 0$, and let $A, R, M \in \mathbb{R}^{n\times n}$ be defined as
\[
A := \begin{bmatrix} A_p & B_p \\ -C_p & -\sigma I \end{bmatrix}, \quad
R := \begin{bmatrix} I_{n_p} & 0 \\ 0 & 0 \end{bmatrix}, \quad
M := \begin{bmatrix} 0 & C_p^T \\ C_p & 0 \end{bmatrix}. \tag{2.23}
\]
Using the Lyapunov function candidate
\[
V(x) := x^T \operatorname{diag}(P, I)\, x, \tag{2.24}
\]
it follows that the origin of $\dot x = Ax$ is stable. It follows from the invariance principle [15, Sect. 4.2] that the origin of $\dot x = Ax$ is exponentially stable if and only if $(C_p, A_p)$ is detectable. The reset control system (2.1) has purely discrete-time solutions that do not converge to the origin from any nonzero point $x_\circ$ such that the last $n_u$ components


of x◦ are zero. Regarding Assumption 2.1, item 1 holds with V defined in (2.24), item 2 holds since (Rx)T M Rx = 0, and item 3 holds since (C p , A p ) is detectable. Consider an example that is related to a particular convex optimization approach using acceleration methods [1, 17]. Take n p = 12, n u = 12, A p = 0, B p = I , and C p a random, symmetric, positive definite matrix with entries rounded to one decimal place to facilitate repeatability. The simulations reported here use ⎡

4.8 −3.6 −4.8 −2.4 ⎢ −3.6 10.0 4.4 7.9 ⎢ ⎢ −4.8 4.4 11.3 2.7 ⎢ ⎢ −2.4 7.9 2.7 18.0 ⎢ ⎢ 2.5 0.3 −1.8 1.4 ⎢ ⎢ −1.3 −4.8 −1.2 −6.5 C p := ⎢ ⎢ 0.5 −3.4 −1.7 −1.2 ⎢ ⎢ −4.0 1.0 4.1 −2.4 ⎢ ⎢ −0.4 −0.9 2.0 −4.3 ⎢ ⎢ 0.2 0.7 −0.5 1.4 ⎢ ⎣ 3.0 −0.9 −2.2 −3.7 −3.2 6.8 4.5 10.0

2.5 −1.3 0.5 −4.0 0.3 −4.8 −3.4 1.0 −1.8 −1.2 −1.7 4.1 1.4 −6.5 −1.2 −2.4 10.2 −0.7 −4.6 −0.7 −0.7 7.3 0.7 2.4 −4.6 0.7 13.3 −0.8 −0.7 2.4 −0.8 9.6 −3.8 0.1 1.0 3.6 5.6 −0.5 −2.2 1.8 2.3 −1.1 0.8 −2.2 1.3 −2.1 0.4 −2.1

−0.4 0.2 3.0 −0.9 0.7 −0.9 2.0 −0.5 −2.2 −4.3 1.4 −3.7 −3.8 5.6 2.3 0.1 −0.5 −1.1 1.0 −2.2 0.8 3.6 1.8 −2.2 8.9 −3.8 −2.3 −3.8 8.2 −0.7 −2.3 −0.7 8.2 −5.0 −2.0 1.7

⎤ −3.2 6.8 ⎥ ⎥ 4.5 ⎥ ⎥ 10.0 ⎥ ⎥ 1.3 ⎥ ⎥ −2.1 ⎥ ⎥ 0.4 ⎥ ⎥ −2.1 ⎥ ⎥ −5.0 ⎥ ⎥ −2.0 ⎥ ⎥ 1.7 ⎦ 14.0

whose eigenvalues range from about 0.04 to about 38.12, for a condition number close to 1000. Pick an initial condition randomly with entries rounded to one decimal place. The simulations reported here use

    x◦ := [ −0.8  1.5  0  1.6  −0.4  0.6  −0.1  −2.0  −1.0  0.6  −0.1  −1.1
            −0.6  0.2  −1.0  1.0  −0.6  1.8  −1.1  0.2  −1.5  −0.7  −0.6  0.4 ]ᵀ.

Use σ = 0.1. Figure 2.5 shows the evolution of t → V (x(t)), on a log scale, for the linear system x˙ = Ax (dashed curve), which is the same as the differential inclusion (2.5) with γ = 0, and for the differential inclusion (2.5) with γ = 100 (solid curve).
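As a repeatability aid, the displayed C_p can be transcribed and its stated properties checked numerically. The sketch below is a supplementary illustration, not part of the chapter; it verifies symmetry and positive definiteness and prints the eigenvalue range reported in the text.

```python
# Sanity checks on the simulation matrix C_p: the text states that it is
# symmetric, positive definite, with eigenvalues from about 0.04 to about
# 38.12 (condition number near 1000). Entries transcribed from the display.
import numpy as np

Cp = np.array([
    [ 4.8, -3.6, -4.8, -2.4,  2.5, -1.3,  0.5, -4.0, -0.4,  0.2,  3.0, -3.2],
    [-3.6, 10.0,  4.4,  7.9,  0.3, -4.8, -3.4,  1.0, -0.9,  0.7, -0.9,  6.8],
    [-4.8,  4.4, 11.3,  2.7, -1.8, -1.2, -1.7,  4.1,  2.0, -0.5, -2.2,  4.5],
    [-2.4,  7.9,  2.7, 18.0,  1.4, -6.5, -1.2, -2.4, -4.3,  1.4, -3.7, 10.0],
    [ 2.5,  0.3, -1.8,  1.4, 10.2, -0.7, -4.6, -0.7, -3.8,  5.6,  2.3,  1.3],
    [-1.3, -4.8, -1.2, -6.5, -0.7,  7.3,  0.7,  2.4,  0.1, -0.5, -1.1, -2.1],
    [ 0.5, -3.4, -1.7, -1.2, -4.6,  0.7, 13.3, -0.8,  1.0, -2.2,  0.8,  0.4],
    [-4.0,  1.0,  4.1, -2.4, -0.7,  2.4, -0.8,  9.6,  3.6,  1.8, -2.2, -2.1],
    [-0.4, -0.9,  2.0, -4.3, -3.8,  0.1,  1.0,  3.6,  8.9, -3.8, -2.3, -5.0],
    [ 0.2,  0.7, -0.5,  1.4,  5.6, -0.5, -2.2,  1.8, -3.8,  8.2, -0.7, -2.0],
    [ 3.0, -0.9, -2.2, -3.7,  2.3, -1.1,  0.8, -2.2, -2.3, -0.7,  8.2,  1.7],
    [-3.2,  6.8,  4.5, 10.0,  1.3, -2.1,  0.4, -2.1, -5.0, -2.0,  1.7, 14.0],
])

assert np.allclose(Cp, Cp.T)      # symmetric, as stated
eigs = np.linalg.eigvalsh(Cp)
assert eigs.min() > 0             # positive definite, as stated
print(f"eigenvalue range: [{eigs.min():.2f}, {eigs.max():.2f}], "
      f"condition number = {eigs.max() / eigs.min():.0f}")
```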

2.5 Conclusion

Under mild assumptions, including strong convexity of a Lyapunov function, it is possible to implement a reset control system, whose origin may be stable but not exponentially stable, using a differential inclusion whose origin is globally exponentially stable. The behavior of the proposed inclusion has been demonstrated in several different settings, including situations that correspond to reset control using a Clegg integrator or a more general first-order reset element (FORE). This differential inclusion implementation has the potential to make "reset" control systems easier to construct, certify, and employ than their hybrid systems counterparts.


Fig. 2.5 The values of log10 (V (x(t))) as a function of time t for a bank of stable FOREs controlling a detectable, passive system, implemented with the differential inclusion (2.5) using γ = 100 (solid curve) and γ = 0 (dashed curve)

Acknowledgements Research supported in part by the Air Force Office of Scientific Research under grant AFOSR FA9550-18-1-0246.

References

1. Baradaran, M., Le, J.H., Teel, A.R.: Analyzing the persistent asset switches in continuous hybrid optimization algorithms. In: Submitted to the 2021 American Control Conference (2020)
2. Beker, O., Hollot, C.V., Chait, Y., Han, H.: Fundamental properties of reset control systems. Automatica 40(6), 905–915 (2004)
3. Brogliato, B., Daniilidis, A., Lemaréchal, C., Acary, V.: On the equivalence between complementarity systems, projected systems and differential inclusions. Syst. Control Lett. 55(1), 45–51 (2006)
4. Chen, Q., Hollot, C.V., Chait, Y.: Stability and asymptotic performance analysis of a class of reset control systems. In: Proceedings of the 39th IEEE Conference on Decision and Control, pp. 251–256 (2000)
5. Clegg, J.C.: A nonlinear integrator for servomechanisms. Trans. A.I.E.E. 77(Part II), 41–42 (1958)
6. Goebel, R., Hespanha, J., Teel, A.R., Cai, C., Sanfelice, R.: Hybrid systems: generalized solutions and robust stability. In: IFAC Symposium on Nonlinear Control Systems, Stuttgart, Germany, pp. –12 (2004)
7. Goebel, R., Sanfelice, R.G., Teel, A.R.: Hybrid dynamical systems. IEEE Control Syst. Mag. 29(2), 28–93 (2009)
8. Goebel, R., Sanfelice, R.G., Teel, A.R.: Hybrid Dynamical Systems: Modeling, Stability, and Robustness. Princeton University Press, Princeton (2012)


9. Goebel, R., Teel, A.R.: Solutions to hybrid inclusions via set and graphical convergence with stability theory applications. Automatica 42, 573–587 (2006)
10. Hollot, C.V., Zheng, Y., Chait, Y.: Stability analysis for control systems with reset integrators. In: Proceedings of the 36th IEEE Conference on Decision and Control, vol. 2, pp. 1717–1719 (1997)
11. Hollot, C.V.: Revisiting Clegg integrators: periodicity, stability and IQCs. IFAC Proc. 30, 31–38 (1997)
12. Horowitz, I.M., Rosenbaum, P.: Non-linear design for cost of feedback reduction in systems with large parameter uncertainty. Int. J. Control 21(6), 977–1001 (1975)
13. Hu, H., Zheng, Y., Chait, Y., Hollot, C.V.: On the zero-input stability of control systems with Clegg integrators. In: Proceedings of the 1997 American Control Conference, vol. 1, pp. 408–410 (1997)
14. Hu, H., Zheng, Y., Hollot, C.V., Chait, Y.: On the stability of control systems having Clegg integrators. Topics in Control and its Applications. Springer, London (1999)
15. Khalil, H.K.: Nonlinear Systems, 3rd edn. Prentice-Hall (2002)
16. Krishnan, K.R., Horowitz, I.M.: Synthesis of a non-linear feedback system with significant plant-ignorance for prescribed system tolerances. Int. J. Control 19(4), 689–706 (1974)
17. Le, J.H., Teel, A.R.: Hybrid heavy-ball systems: reset methods for optimization with uncertainty. In: Submitted to the 2021 American Control Conference (2020)
18. Molchanov, A.P., Pyatnitskiy, Ye.S.: Criteria of asymptotic stability of differential and difference inclusions encountered in control theory. Syst. Control Lett. 13, 59–64 (1989)
19. Nakamura, H., Yamashita, Y., Nishitani, H.: Smooth Lyapunov functions for homogeneous differential inclusions. In: Proceedings of the 41st SICE Annual Conference, vol. 3, pp. 1974–1979 (2002)
20. Nesic, D., Teel, A.R., Zaccarian, L.: Stability and performance of SISO control systems with first-order reset elements. IEEE Trans. Autom. Control 56(11), 2567–2582 (2011)
21. Nesic, D., Zaccarian, L., Teel, A.R.: Stability properties of reset systems. In: Proceedings of the 16th IFAC World Congress, vol. 38, pp. 67–72 (2005)
22. Nesic, D., Zaccarian, L., Teel, A.R.: Stability properties of reset systems. Automatica 44(8), 2019–2026 (2008)
23. Ryan, E.P.: A universal adaptive stabilizer for a class of nonlinear systems. Syst. Control Lett. 16(3), 209–218 (1991)
24. Teel, A.R., Poveda, J.I., Le, J.: First-order optimization algorithms with resets and Hamiltonian flows. In: 58th IEEE Conference on Decision and Control, pp. 5838–5843 (2019)

Chapter 3

On the Role of Well-Posedness in Homotopy Methods for the Stability Analysis of Nonlinear Feedback Systems

Randy A. Freeman

Abstract We consider the problem of determining the input/output stability of the feedback interconnection of two systems. Dissipativity and graph separation techniques are two related and popular approaches to this problem, and they include well-known passivity and small-gain methods. The use of block diagram transformations with dynamic multipliers can greatly reduce the conservativeness of such approaches, but for the stability of the transformed system to imply that of the original one, these multipliers should admit appropriate factorizations. An alternative approach which circumvents the need to factorize multipliers was provided by Megretski and Rantzer in their seminal 1997 paper on integral quadratic constraints. Their approach is based on homotopy: one constructs a continuous transformation of a trivially stable system into the target system of interest, and by satisfying certain conditions along the homotopy path one guarantees that the target system is also stable. This method assumes that the feedback interconnection is well-posed along the homotopy path, namely, that the feedback equations have solutions for all possible exogenous inputs and that the mapping from these inputs to the solutions is causal. In this chapter we will explore the role of well-posedness in this homotopy method. In so doing we demonstrate that what suffices for the homotopy analysis is a property significantly weaker than well-posedness, one which involves a certain lower hemicontinuity of the feedback interconnection along with a certain controllability of its domain. Moreover, we show that these methods can be applied to general signal spaces, including extended Sobolev spaces, spaces of smooth functions, and spaces of distributions.

R. A. Freeman (B) Department of Electrical and Computer Engineering, Northwestern University, 2145 Sheridan Rd., Evanston, IL, USA e-mail: [email protected] © The Author(s), under exclusive license to Springer Nature Switzerland AG 2022 Z.-P. Jiang et al. (eds.), Trends in Nonlinear and Adaptive Control, Lecture Notes in Control and Information Sciences 488, https://doi.org/10.1007/978-3-030-74628-5_3


3.1 Introduction

We consider the feedback interconnection of two systems G and Δ as illustrated in Fig. 3.1. We wish to determine the stability of the feedback loop, in the sense that the outputs (y_1, y_2) are bounded by the inputs (u_1, u_2) in some appropriate sense. Here the outputs represent the internal (or endogenous) signals and the inputs represent the external (or exogenous) signals. In many cases G represents a known system and Δ represents uncertainty, or G represents the linear part of the system and Δ the nonlinear part. For example, in the classical absolute stability problem of Lur'e and Postnikov [19], G is a known linear and time-invariant system and Δ is a nonlinear system known only to satisfy certain sector conditions. This stability problem has a long history; for example, the 2006 survey paper [17] on absolute stability contains over 470 references. Classical approaches to this problem include the circle criterion, the Popov criterion, the small-gain theorem, and the passivity theorem (details of which can be found in textbooks like [7, 15], among others).

The classical approaches are made less conservative through the use of multipliers, or linear systems M inserted into the feedback loop together with their inverses so that the interconnection in Fig. 3.1 becomes the one in Fig. 3.2. If we can find a class of multipliers that preserve, say, the passivity of the lower path, then we can perform a search over this class to find one that allows the transformed system to satisfy the conditions of an appropriate stability theorem. O'Shea was apparently the first to propose the use of noncausal multipliers to further reduce the conservativeness of the stability tests [22]. Indeed, because the transformed system of Fig. 3.2 is artificial and exists solely for the purpose of stability analysis, there is no reason to impose physical restrictions on it (such as causality).
Nonetheless, for this approach to work, the stability of the artificial, transformed system in Fig. 3.2 should imply the stability of the original system in Fig. 3.1. One can guarantee that this is true when the multipliers admit certain factorizations [7].

Fig. 3.1 The classical feedback interconnection of systems G and Δ

Fig. 3.2 The classical feedback interconnection with multiplier M


As an alternative to the factorization approach, Megretski and Rantzer pioneered a homotopy approach in [21, 23]. Their homotopy approach does not employ multipliers as they appear in Fig. 3.2; instead, the multipliers are embedded inside of integral quadratic constraints (IQCs) on the component systems G and Δ. Such IQCs cannot guarantee stability directly, however, because the feedback loop can be unstable even when the IQCs are satisfied. Instead, Megretski and Rantzer begin with a system that they know to be stable and continuously deform it into the target system of Fig. 3.1. Their main result is that if the IQCs are satisfied along the entire homotopy path, then the target system is also stable. An advantage of the homotopy approach is that it circumvents the requirement that the multipliers admit factorizations. This does come at a price: paraphrasing their words in [21],

    The price paid for [circumventing the factorization requirement] is the very mild assumption that the feedback loop is well-posed [along the entire homotopy path].

What is this "very mild" well-posedness assumption? As defined in [21, 23], the feedback loop in Fig. 3.1 is well-posed when it satisfies the following two conditions: first, there should exist outputs (y_1, y_2) for all possible choices of the inputs (u_1, u_2), and second, the resulting mapping from the inputs to the outputs should be causal. Note that this definition of well-posedness is not as strong as other definitions in the literature, such as the one in [28], but it suffices for their homotopy argument.

In this chapter we investigate the role of this well-posedness assumption in the homotopy method of [21, 23]. In particular, we make the following contributions:

1. We show that the homotopy approach to stability analysis works in settings much more general than the classical setting of extended L2 spaces considered in [21, 23]. This includes settings in which signals belong to extended Sobolev spaces or spaces of distributions. In fact, any locally convex Hausdorff topological vector space can serve as the space of signals.
2. We relax the requirement that outputs exist for all possible inputs. Instead, we require only that the domain of the feedback interconnection have a certain controllability property.
3. We relax the requirement that the mapping from the inputs to the outputs in Fig. 3.1 is causal. Instead, we require only that this mapping have a certain lower hemicontinuity property together with an assumed limit on signal growth.
4. We extend the homotopy method to certain interconnections more general than the classical input-additive interconnection of Fig. 3.1.

One might make the valid point that ill-posed feedback interconnections are poor models of physical systems, so there is no reason to relax well-posedness assumptions. Keep in mind, however, that all systems along the homotopy path except the target system are artificial and thus need not be physically meaningful.
Moreover, in some applications signals are constrained (e.g., they must take on positive values), so requiring outputs to exist for all possible inputs might be unduly restrictive. The results of this chapter apply mainly to systems satisfying so-called soft or conditional IQCs [20, 21]. This is because when systems satisfy so-called hard or


complete IQCs instead, then we can often establish stability using classical dissipativity or graph separation techniques, and thus homotopy is not needed (see [4] and the references therein).

The chapter is organized as follows. In Sect. 3.2 we define a general notion of a signal space which constitutes the setting for our results. This notion goes beyond the classical extended L^p spaces and instead emulates the treatment in [8], which does not rely on the existence of truncation operators. In Sect. 3.3 we define notions of controllability and causality adapted from versions in [29]; in particular, we will see that the perspective on controllability in [29] (which makes sense even for systems without input or state) is the natural one for our setting. In Sect. 3.4 we present our notions of stability, including a "look-ahead" version of classical finite-gain stability that is well suited to noncausal systems. We also present our main homotopy results in this section. Finally, in Sect. 3.5 we apply our results to the stability analysis of interconnections, including the one in Fig. 3.1. All proofs can be found in the appendix.

Notation. We let N denote the set of natural numbers including zero. We let id denote the identity relation on a set, seen either as a map from the set to itself or as the graph of this map. If A is a set, we let 1_A denote the indicator function of A (having its domain clear from context), and we let P(A) denote the power set of A. We let F denote either of the fields R or C, depending on whether we wish to work with real or complex signal spaces.

3.2 Signal Spaces

A classical choice for a signal space in continuous time is the extended L^p space for p ∈ [1, ∞], i.e., the vector space L^p_loc of all locally p-integrable functions on the time axis T = R≥0 taking values in F [34]. Each time t ∈ T defines a seminorm ‖·‖_t on L^p_loc given by the truncation ‖x‖_t = ‖x · 1_[0,t]‖_p, where ‖·‖_p denotes the usual p-norm for F-valued functions on T. We let S = {‖·‖_t}_{t∈T} denote this family of seminorms, and we note that S defines a locally convex topology on L^p_loc, namely, the coarsest topology under which each seminorm in S is continuous (thus turning L^p_loc into a Fréchet space). Furthermore, we can use these seminorms to recover L^p itself:

    L^p = { x ∈ L^p_loc : sup_{t∈T} ‖x‖_t < ∞ }.    (3.1)

In summary, we see that the family S of seminorms on L^p_loc provides us with three essential elements of a signal space: a time axis (the index set T of the family S), a small-signal subspace via the construction (3.1), and a locally convex topology.

Extending this approach, we proceed to define a signal space as a vector space together with a family of seminorms, where the index set of the family has the structure of a "time axis." Let X be a vector space over F, and let S = {‖·‖_t}_{t∈T} be a


family of seminorms on X over an index set T. Recall that the family S is separated when for every nonzero x ∈ X there exists t ∈ T such that ‖x‖_t > 0. Every family S induces a natural preorder ≼ on its index set T as follows: given s, t ∈ T, we say s ≼ t when there exists C > 0 such that ‖·‖_s ≤ C‖·‖_t. The resulting equivalence relation ∼ on T (defined as s ∼ t when both s ≼ t and t ≼ s) corresponds to the usual equivalence relation for seminorms. Note that ≼ defines a partial order on T precisely when s ∼ t implies s = t. Next, recall that ≼ directs T when for any s, t ∈ T there exists r ∈ T such that s ≼ r and t ≼ r. We say that the family S is temporal when it is separated and ≼ is a partial order on T that directs T.

A signal space is a vector space X together with an index set T (called the time axis) and a temporal family S of seminorms on X indexed over T. We will refer to such a space as (X, T, S), or simply as X when T and S are either clear from context or not explicitly named. Elements of X are signals. The time axis of a signal space carries a natural notion of order (the partial order ≼) and direction (any two time instants have a common future time instant). The requirement that ≼ be a partial order on T ensures that the time axis T cannot circle back on itself. Note that in this setting there is no concept of the value of a signal at a particular time, only of its size. Also, we make no assumption that the mapping t ↦ ‖x‖_t is monotone in t for any fixed x ∈ X, a departure from many classical definitions of signal spaces [7, 12, 28, 34].

The family of seminorms S for a signal space (X, T, S) provides a natural topology for the space, namely, the coarsest topology under which each seminorm in S is continuous. We will call this the seminorm topology for X, and the requirement that the family S be separated ensures that this topology is Hausdorff. Thus by construction, any signal space is a locally convex Hausdorff space in its seminorm topology. The next lemma shows that the converse holds, namely, that every locally convex Hausdorff space X admits a temporal family of seminorms that generates its topology, turning X into a signal space.

Lemma 3.1 Let X be a locally convex Hausdorff space. Then there exists a temporal family S of seminorms on X indexed over a set T such that the resulting seminorm topology on the signal space (X, T, S) coincides with the given topology.

Thus we can regard all locally convex Hausdorff spaces as signal spaces, including dual spaces like spaces of distributions. As we shall see, however, a signal space is more than a locally convex space: the particular choice of seminorms defining the topology plays a crucial role in the theory. In Sect. 3.2.1 we provide some examples of signal spaces with specific choices for their families of seminorms.

Unless otherwise specified, all topological notions (such as open sets) for a signal space (X, T, S) will be with respect to the seminorm topology. Because ≼ directs T, a local base for this topology at any x̄ ∈ X is the collection of all balls of the form

    B_{t,ε}(x̄) = { x ∈ X : ‖x − x̄‖_t < ε }    (3.2)

for t ∈ T and ε > 0. We will also need balls having zero radius, i.e., balls of the form


    B_t(x̄) = { x ∈ X : ‖x − x̄‖_t = 0 } = ⋂_{ε>0} B_{t,ε}(x̄)    (3.3)

for t ∈ T and x̄ ∈ X. Abusing notation slightly, we will occasionally refer to these zero-radius balls as B_{t,0}, keeping in mind that B_{t,0} means B_t in (3.3) and does not mean plugging ε = 0 into (3.2) (which would result in empty balls). The collection of zero-radius balls B_t defines another topology on X:

Lemma 3.2 The collection of balls in (3.3) for all x̄ ∈ X and t ∈ T is a base for a topology on X. Moreover, for each x̄ ∈ X the collection of balls (3.3) for all t ∈ T is a local base at x̄.

The topology given in Lemma 3.2 is clearly finer than the seminorm topology, and hence we call it the fine topology. Vector addition is continuous in the fine topology because B_t(x̄_1) + B_t(x̄_2) = B_t(x̄_1 + x̄_2). In particular, if U ⊆ X is finely open then x + U is finely open for any x ∈ X. However, the mapping (c, x) ↦ cx from F × X to X need not be jointly continuous in the fine topology, and so X with the fine topology is generally not a topological vector space. Nevertheless, for any fixed scalar c ∈ F the mapping x ↦ cx is continuous in the fine topology because cB_t(x̄) ⊆ B_t(cx̄) (with equality when c ≠ 0). In particular, if U ⊆ X is finely open and c ∈ F is nonzero, then cU is finely open.

Recall that in (3.1) we recovered L^p from its extension L^p_loc by selecting "small" signals, namely, those signals whose seminorms are uniformly bounded. This concept generalizes in a natural way for a signal space (X, T, S). Adopting terminology from [12], we define the small-signal subspace Xs as

    Xs = { x ∈ X : sup_{t∈T} ‖x‖_t < ∞ },    (3.4)

and we call elements of Xs small signals. Here the superscript s stands for "small." We equip the small-signal subspace Xs with a norm ‖·‖s given by

    ‖x‖s = sup_{t∈T} ‖x‖_t    (3.5)

for x ∈ Xs. In general, the associated norm topology on Xs is finer than the seminorm topology that Xs inherits as a subset of X. Also, we emphasize that the small-signal subspace is a property of the particular choice of the family S of seminorms defining the signal space, so that two different signal spaces sharing the same underlying vector space X can have the same seminorm topology but different small-signal subspaces. Finally, we note that Xs is generally not a finely closed subspace of X (and hence not closed either), and in many cases it is a finely dense subspace of X (see Sect. 3.3.1).
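A small numerical sketch of (3.4) and (3.5) may help fix ideas; the signals and horizon below are choices made for this illustration, not data from the chapter. In the discrete-time extended ℓ² space, the truncation seminorms of a geometrically decaying signal stay uniformly bounded (so it is a small signal), while those of a bounded constant signal grow without bound (so it is not).

```python
# Truncation seminorms ||x||_t = ||x * 1_[0,t]||_2 on the time axis T = N,
# and the small-signal test sup_t ||x||_t < infinity.
import math

def seminorm(x, t):
    """||x||_t: the 2-norm of the signal x truncated to the window [0, t]."""
    return math.sqrt(sum(x(k) ** 2 for k in range(t + 1)))

decaying = lambda k: 0.5 ** k   # in l2, hence a small signal
constant = lambda k: 1.0        # bounded but not in l2, hence not small

# For this space, t -> ||x||_t is nondecreasing for every signal.
norms = [seminorm(decaying, t) for t in range(50)]
assert all(a <= b + 1e-12 for a, b in zip(norms, norms[1:]))

# The decaying signal's seminorms are bounded by its l2 norm 2/sqrt(3);
# the constant signal's seminorms grow like sqrt(t + 1).
assert max(norms) <= 2 / math.sqrt(3) + 1e-9
assert seminorm(constant, 10_000) > 100
```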


3.2.1 Examples of Signal Spaces

According to Lemma 3.1, we can turn any locally convex Hausdorff space into a signal space. In this section we provide some examples with specific choices for the temporal families of seminorms.

Example 3.1 (Normed spaces) Let X be a normed space, and suppose S consists solely of its norm ‖·‖ so that T is a singleton. Then S is a temporal family and thus (X, T, S) is a signal space. Moreover, all signals are small signals, that is, Xs = X.

The following example is similar to the treatment in [8].

Example 3.2 (Extended spaces) Let X be a vector space, let {X_t}_{t∈T} be a collection of normed spaces, and let {R_t}_{t∈T} be a collection of linear operators R_t : X → X_t such that
(a) ⋂_{t∈T} ker(R_t) = {0},
(b) ker(R_s) = ker(R_t) implies s = t, and
(c) for all s, t ∈ T there exist r ∈ T and bounded linear operators B_s : X_r → X_s and B_t : X_r → X_t such that both R_s = B_s ∘ R_r and R_t = B_t ∘ R_r.
Let S = {‖·‖_t}_{t∈T} be the family of seminorms on X given by ‖x‖_t = ‖R_t x‖. It is straightforward to show that (a) implies that S is separated, (b) implies that ≼ is a partial order on T, and (c) implies that ≼ directs T. We conclude that S is a temporal family and thus (X, T, S) is a signal space.

We now use Example 3.2 to define extended L^p spaces of functions on a general measure space (Example 3.3) and extended Sobolev spaces of functions on an open subset of R^n (Example 3.4). We say that a collection C of distinct nonempty subsets of a set S is a directed cover of another set A ⊆ S when C covers A and is directed by inclusion (meaning that the union of any two members of C is contained in some member of C).

Example 3.3 (Extended L^p spaces) Let μ be a measure on a σ-algebra of subsets of a nonempty set E. Let C = {A_t}_{t∈T} be a collection of measurable subsets of E that is a directed cover of E, that has a countable subcover, and that is such that the symmetric difference of two distinct members of C has positive measure. Let p ∈ [1, ∞], let V be a Banach space over F, and let X be the vector space of all Bochner measurable functions x : E → V such that the truncated signal x · 1_{A_t} has a finite p-norm for every t ∈ T (and as usual we make no distinction between two functions that agree almost everywhere). Let S = {‖·‖_t}_{t∈T} be the family of seminorms on X given by ‖x‖_t = ‖x · 1_{A_t}‖_p. Then S is a temporal family (Lemma 3.20 in the appendix) and thus (X, T, S) is a signal space. The small-signal subspace is Xs = L^p(E).

The countable subcover condition in Example 3.3 guarantees that S is separated. Without this condition, S need not be separated as demonstrated by Example 3.12 in the appendix. Example 3.3 includes as a special case the extended space L^p_loc we described at the beginning of this section: we take μ to be the Lebesgue measure on


E = T = R≥0 and C to be the collection of all real intervals of the form A_t = [0, t] for t ∈ T. For an analogous construction in discrete time, we take μ to be the counting measure on E = T = N and C to be the collection of all integer intervals of the form A_t = [0, t] for t ∈ T.

Example 3.4 (Extended Sobolev spaces) Let Ω be a nonempty open subset of R^n, let k ∈ N, and let p ∈ [1, ∞]. Let C = {A_t}_{t∈T} be a collection of open subsets of Ω that is a directed cover of Ω. Let X be the vector space of all functions x : Ω → R such that for every t ∈ T the restriction x|_{A_t} belongs to the Sobolev space W^{k,p}(A_t). Let S = {‖·‖_t}_{t∈T} be the family of seminorms on X given by ‖x‖_t = ‖x|_{A_t}‖_{W^{k,p}(A_t)}. This is a special case of Example 3.2: for each t ∈ T we let X_t = W^{k,p}(A_t), and we define R_t : X → X_t to be the restriction R_t x = x|_{A_t}. It is straightforward to show that properties (a)–(c) hold, and we conclude that (X, T, S) is a signal space. The small-signal subspace is Xs = W^{k,p}(Ω).

Example 3.5 (Smooth spaces) Let Ω be a nonempty open subset of R^n, let X be the vector space C^∞(Ω), let C be a directed cover of Ω whose members are all compact, let T = N × C, and let S = {‖·‖_t}_{t∈T} be the family of seminorms on X given by

    ‖x‖_(N,K) = Σ_{|α|≤N} max_{τ∈K} |∂^α x(τ)|    (3.6)

for each (N, K) ∈ T, where α ∈ N^n is a multi-index and ∂^α denotes the corresponding mixed partial derivative operator. Then S is a temporal family (Lemma 3.21 in the appendix) and thus (X, T, S) is a signal space. Note that the time axis T is not linearly ordered in this example. Also, small signals must be real analytic, and when n = 1 they include all sinusoids having radian frequency strictly less than one.

Example 3.6 (Weighted spaces) Let (X, T, S) be a signal space, and let w : T → (0, ∞) be a positive weight function. Then for each t ∈ T the seminorm w(t)‖·‖_t is equivalent to the seminorm ‖·‖_t, and we conclude that (X, T, {w(t)‖·‖_t}_{t∈T}) is another signal space. This new space is a weighted version of the original space. Note that we make no assumption on the monotonicity of the weight function w. If we take the extended L^p space from Example 3.3 and choose the weight function w(t) = μ(A_t)^{−1/p} whenever μ(A_t) ≠ 0, then the weighted space is a "power signal" space in which the seminorm w(t)‖·‖_t measures the "power" or "average L^p energy" of the signal at time t. In this case L^∞(E) ⊂ Xs, that is, all bounded signals are small.
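A quick numerical sketch of the seminorm (3.6) in the case n = 1; the polynomial, the compact set K, and the order N below, as well as the helper name smooth_seminorm, are choices made for this illustration. For x(τ) = τ² on K = [0, 1] with N = 2, the three terms are max|τ²| = 1, max|2τ| = 2, and max|2| = 2, giving ‖x‖_{(2,K)} = 5.

```python
# Seminorm (3.6) specialized to n = 1: sum over derivative orders k <= N of
# max_{tau in K} |x^(k)(tau)|, evaluated on a grid over the compact set K.
# Exact here because each derivative of tau^2 is monotone on [0, 1].
import numpy as np

def smooth_seminorm(coeffs, N, K, samples=1001):
    """coeffs: polynomial coefficients, highest degree first (np.polyval order)."""
    taus = np.linspace(K[0], K[1], samples)
    p = np.array(coeffs, dtype=float)
    total = 0.0
    for k in range(N + 1):
        total += np.abs(np.polyval(p, taus)).max()  # max_{tau in K} |x^(k)(tau)|
        if k < N:
            p = np.polyder(p)                       # next derivative
    return total

# x(tau) = tau^2 on K = [0, 1] with N = 2: terms 1 + 2 + 2 = 5.
assert abs(smooth_seminorm([1.0, 0.0, 0.0], 2, (0.0, 1.0)) - 5.0) < 1e-9
```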

3.2.2 Composite Signals

Signals are often composite, made up of different parts we label as inputs, outputs, states, etc., each having a particular role in the overall model. Thus we should consider the Cartesian product of signal spaces, or in fact their direct sum as we want the product itself to be a vector space. One complication in forming such a direct sum


is that we also need to form a corresponding temporal family of seminorms on the sum in a way that preserves the signal space structure of its components.

To this end, suppose we have a finite collection of signal spaces {(X_i, T_i, S_i)}_{i=1}^N over a common field F. Let a : T → T_1 × ··· × T_N be a mapping from an index set T to the Cartesian product of the individual time axes T_i, and let a_i : T → T_i denote the i-th component of a. We assume that a has the following three properties: (a) it is injective, (b) its image is cofinal with respect to the product order on its codomain, and (c) each component a_i is surjective. Using a, we define a family S = {‖·‖_t}_{t∈T} of seminorms on the direct sum X = X_1 ⊕ ··· ⊕ X_N as follows:

    ‖(x_1, …, x_N)‖_t = Σ_{i=1}^N ‖x_i‖_{a_i(t)},    (3.7)

where the i-th seminorm in the sum is from the family S_i. The partial order ≼ induced on T by this family S is such that s ≼ t if and only if a_i(s) ≼ a_i(t) for each i (namely, a is monotone with respect to the product order on its codomain). It is straightforward to show that under the assumptions on a listed above, S is a temporal family and thus (X, T, S) is a signal space. Also, because each component a_i is surjective we have

    sup_{t∈T} ‖x_i‖_{a_i(t)} = sup_{t_i∈T_i} ‖x_i‖_{t_i}    (3.8)

for each i, and it follows from (3.5) that

    max_{i=1,…,N} sup_{t_i∈T_i} ‖x_i‖_{t_i} ≤ ‖(x_1, …, x_N)‖s ≤ Σ_{i=1}^N sup_{t_i∈T_i} ‖x_i‖_{t_i}    (3.9)

for all (x_1, …, x_N) ∈ Xs. Hence the small-signal subspace Xs coincides with the direct sum X_1^s ⊕ ··· ⊕ X_N^s, and thus we can write (3.9) as

    max_{i=1,…,N} ‖x_i‖s ≤ ‖(x_1, …, x_N)‖s ≤ Σ_{i=1}^N ‖x_i‖s    (3.10)

for all (x_1, …, x_N) ∈ Xs. Moreover, both the seminorm and the fine topologies on X coincide with the corresponding product topologies on the underlying Cartesian product X_1 × ··· × X_N.

As an example of this direct sum, suppose each component space (X_i, T_i, S_i) is an extended L^p space from Example 3.3, and suppose they all share the same underlying measure μ, set E, collection C, and time axis T_i = T. Then the natural way to define their direct sum is to take each component function a_i to be the identity map on T. This choice is clearly injective with surjective components, and the cofinality property follows from the fact that the collection C is a directed cover of E. As a second example, suppose (X_1, T_1, S_1) is a normed space as in Example 3.1 so that


T_1 = {t_1} is a singleton, and let (X_2, T_2, S_2) be another signal space. Then the choice a(t) = (t_1, t) with T = T_2 has the desired properties (and is the only such choice up to an isomorphism). In what follows, whenever we talk about a direct sum of signal spaces, we will implicitly assume that it carries the seminorm structure in (3.7) for an appropriate choice for the mapping a. Finally, we will use π to denote canonical projections onto component spaces, e.g., π_{X_i} : X → X_i.
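The sum-of-components choice (3.7) and the resulting small-signal bounds (3.10) can be checked numerically. The sketch below is an illustration only (the two finite-support signals are arbitrary choices), using two discrete-time extended ℓ² components with each a_i equal to the identity on a common time axis.

```python
# Direct-sum seminorm (3.7) with a_i = id on a shared discrete time axis,
# and the bounds (3.10): max_i ||x_i||s <= ||(x_1, x_2)||s <= sum_i ||x_i||s.
import math

def seminorm(x, t):
    # truncation seminorm of a finite-support signal over the window [0, t]
    return math.sqrt(sum(v * v for v in x[:t + 1]))

def sum_seminorm(xs, t):
    # (3.7): the composite seminorm is the sum of the component seminorms
    return sum(seminorm(x, t) for x in xs)

x1 = [1.0, -2.0, 0.5, 0.0]
x2 = [0.0, 3.0, 1.0, -1.0]
T = range(len(x1))

# finite-support signals are small, so the sup over t is a finite max
s1 = max(seminorm(x1, t) for t in T)
s2 = max(seminorm(x2, t) for t in T)
s12 = max(sum_seminorm([x1, x2], t) for t in T)

assert max(s1, s2) <= s12 + 1e-12           # lower bound in (3.10)
assert s12 <= s1 + s2 + 1e-12               # upper bound in (3.10)
```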

3.3 Systems, Controllability, and Causality A system is a pair (X, ) where X is a signal space and ⊆ X is a collection of signals. This is the viewpoint taken in [34] and refined in [29], and as in [29] we call the behavior of the system. For convenience, we will often use the behavior

as shorthand for the system (X, ) itself when the signal space X is clear from context. A system is linear when its behavior is a linear subspace of its signal space. There are two particular subsystems associated with the small signals in any system (X, ). The first is the small-signal subsystem (Xs, s) with behavior

s = ∩ Xs, where Xs denotes the small-signal subspace of X. To define the second, recall that the small-signal subspace Xs carries a norm ·s. We can use this norm on Xs to create a signal space as in Example 3.1 by letting the time axis be a singleton and letting the family of seminorms contain just the norm ·s. We will use the notation Xn to refer to this normed signal space. Here the superscript n stands for “normed.” The normed subsystem associated with (X, ) is the system (Xn, n) with behavior n = s, so that n is also the collection of all small signals in . Even though s and n contain the same signals, their seminorm structures are different: s inherits a seminorm ·t from X for each t ∈ T , whereas

n has only one seminorm ·s (which is actually a norm). In other words, a signal in s carries the notion of time t ∈ T it inherits from X, but the same signal in n carries no notion of time. We summarize these distinctions in Table 3.1. Table 3.1 The subsystems s and n associated with any system

System | Name | Behavior | Signal space | Time axis
Σ | (the given system) | Σ | (X, T, S) | T
Σs | Small-signal subsystem | Σ ∩ Xs | (Xs, T, S) | T
Σn | Normed subsystem | Σ ∩ Xs | (Xn, {0}, {‖·‖s}) | Singleton

3 On the Role of Well-Posedness in Homotopy Methods …


3.3.1 Controllability

In the behavioral approach of [29], controllability is a property of a system that makes sense even for systems without inputs. The following definition of controllability is an approximate version of “exact controllability” from [29, Definition V.1]:

Definition 3.1 A system (X, Σ) is controllable to a trajectory x ∈ X when for every x̄ ∈ Σ and every open neighborhood U ⊆ X of x̄ there exists x̂ ∈ U ∩ Σ such that x̂ − x ∈ Xs. It is finely controllable to x when the same holds for every finely open neighborhood U.

Note that by this definition, a system can be controllable to a trajectory that is not part of its behavior (in other words, x need not belong to Σ). Recalling that (3.2) and (3.3) provide respective bases for the seminorm and fine topologies, controllability to x essentially means that given any system trajectory x̄ and any time t, there exists another system trajectory x̂ that is close to x̄ up to time t but that is ultimately close to x (in the sense that the difference x̂ − x is a small signal). Or, in the words of Jan Willems in [29]:

…for a controllable system the past of a trajectory will have no lasting influence on the far future, since sooner or later any other trajectory can be joined.

We will be particularly interested in systems that are controllable to zero (i.e., to the zero trajectory). Note that a system is (finely) controllable to zero if and only if its small-signal subsystem Σs is (finely) dense in Σ. If the system (X, X) containing all possible trajectories is (finely) controllable to zero, then we say that the signal space X itself is (finely) controllable to zero. For example, the signal space of Example 3.3 is finely controllable to zero because all truncated signals of the form x · 1At for t ∈ T are small signals. Likewise, the signal space of Example 3.5 is controllable to zero, and it is finely controllable to zero if and only if all sets in C are finite (Lemma 3.22 in the appendix). If X is controllable to zero, then every system (X, Σ) whose behavior Σ is contained in the closure of its interior is controllable to zero. Likewise, if X is finely controllable to zero, then every system (X, Σ) whose behavior Σ is contained in the fine closure of its fine interior is finely controllable to zero. Finally, if X is a direct sum X = X1 ⊕ · · · ⊕ XN as in Sect. 3.2.2, then X is controllable to zero (or finely controllable to zero) if and only if each component Xi is.

We also need the following stronger version of controllability to zero:

Definition 3.2 A system (X, Σ) is uniformly controllable to zero when there exist K, b ≥ 0 such that for all x̄ ∈ Σ, all t ∈ T, and all ε > 0 there exists x̂ ∈ Bt,ε(x̄) ∩ Σs such that ‖x̂‖s ≤ K‖x̄‖t + b + ε. It is uniformly finely controllable to zero when this holds with ε = 0. The constants K and b are the controllability constant and controllability bias, respectively.

As before, we can apply this definition to entire signal spaces. For example, the signal space of Example 3.3 is uniformly finely controllable to zero with K = 1 and b = 0 because ‖x · 1At‖s = ‖x‖t for all signals x. Likewise, it follows from the


R. A. Freeman

results in [14] that if the boundaries of the sets At in Example 3.4 are sufficiently well behaved, then the signal space of Example 3.4 is uniformly finely controllable to zero with b = 0 and a value of K that depends on n, k, p, and the boundary parameters. Note that the property of a signal space being uniformly finely controllable to zero, which appears as Assumption 4.1(d) in [8] for the case of zero controllability bias, is essentially the property that signals admit “soft” truncations.

Example 3.7 (Linear time-invariant systems) Let X be the signal space L2loc on the time axis T = R≥0 with signals x ∈ X having vector values xt ∈ Rn at times t ∈ T. Let (X, Σ) be a linear system, and suppose Σ has a state-space representation

ξ̇t = Aξt + BExt,  ξ0 = 0    (3.11)
0 = Cξt + Dxt    (3.12)

for matrices A, B, C, D, E of appropriate dimensions. In other words, x ∈ Σ if and only if (3.12) holds for almost all t ∈ T, where ξ is the unique absolutely continuous solution to (3.11). As we will show in Lemma 3.23 in the appendix, if (A, B) is stabilizable, (A, C) is detectable, and the stacked matrix [E; D] is right-invertible, then Σ is uniformly finely controllable to zero (with zero controllability bias). An analogous result holds in discrete time.
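Before moving on, the hard-truncation mechanism behind Example 3.3 is easy to check numerically. The following is a minimal sketch, not from the chapter, assuming a hypothetical discrete-time ℓ2 setting in which ‖x‖t is the ℓ2 norm of the samples up to time t and ‖x‖s = sup over t of ‖x‖t:

```python
import numpy as np

# Sketch of uniform fine controllability to zero via hard truncation
# (hypothetical discrete-time analogue of Example 3.3).

def seminorm(x, t):
    """||x||_t: the l2 norm of the first t+1 samples."""
    return float(np.linalg.norm(x[: t + 1]))

def small_norm(x):
    """||x||_s = sup_t ||x||_t (finite exactly when x is a small signal)."""
    return max(seminorm(x, t) for t in range(len(x)))

rng = np.random.default_rng(0)
x = rng.standard_normal(50)
t = 20

# Hard truncation x * 1_{A_t}: keep the samples up to time t, zero the rest.
x_hat = np.concatenate([x[: t + 1], np.zeros(len(x) - t - 1)])

# x_hat agrees with x up to time t, i.e., x_hat lies in the ball B_t(x) ...
assert seminorm(x - x_hat, t) == 0.0

# ... and ||x_hat||_s = ||x||_t, so Definition 3.2 holds with K = 1, b = 0.
assert abs(small_norm(x_hat) - seminorm(x, t)) < 1e-12
```

The truncated signal is small (its cumulative norms are bounded) and matches the original up to time t, which is exactly the K = 1, b = 0 case discussed above.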

3.3.2 Input/Output Systems, Causality, and Hemicontinuity

An input/output (IO) system is a system (X, Σ) whose signal space X is a direct sum X = I ⊕ O of an input space I and an output space O. We will write such a system as the triple (I, O, Σ). Hence an IO system is merely a system for which we label some signals as inputs and the rest as outputs. Unlike [29], we impose no requirement that the inputs are “free” or that the outputs “process” the inputs. Instead, we take an IO stability point of view: the outputs are those signals that we wish to be small whenever the signals we have labeled as inputs are small. Similarly, an IO system with latent variables is a system (X, Σ) whose signal space X is a direct sum X = I ⊕ O ⊕ L of an input space I, an output space O, and a third space L of latent variables that are neither inputs nor outputs (such as state variables) [29]. Every IO system with latent variables generates an associated IO system without latent variables via projection onto I ⊕ O. In the parlance of [29], the original behavior Σ ⊆ I ⊕ O ⊕ L is the full behavior and its projection πI⊕O(Σ) is the manifest behavior. For convenience, we will assume that our IO systems have no latent variables, i.e., that any latent variables have been projected out to produce the manifest behavior. Note that the system (3.11)–(3.12) in Example 3.7 becomes a familiar IO system when X = I ⊕ O so that x ∈ X has components x = (u, y); indeed, setting E = [I 0] and D = [D1 −I] for some matrix D1 leads to the system ξ̇t = Aξt + But and yt = Cξt + D1ut.


Let (I, O, Σ) be an IO system. The domain of Σ is the projection dom(Σ) = πI(Σ). For all u ∈ I we define the cross section Σ[u] ⊆ O as

Σ[u] = {y ∈ O : (u, y) ∈ Σ},    (3.13)

and for a set S ⊆ I we define Σ[S] = ∪u∈S Σ[u]. If Σ[u] is a singleton, then we equate it with its sole member, e.g., we can write y = Σ[u] instead of y ∈ Σ[u]. We thus regard the map u ↦ Σ[u] as a set-valued map whose graph is Σ. We say that Σ is univalent when Σ[u] is a singleton for each u ∈ dom(Σ). We say that Σ is an operator when dom(Σ) = I, that is, when its inputs can be freely chosen [23]. Note that if Σ is a univalent operator, then Σ[·] is an ordinary single-valued function from I to O. The inverse of an IO system (I, O, Σ) is the IO system (O, I, Σ⁻¹) with behavior

Σ⁻¹ = {(y, u) ∈ O ⊕ I : (u, y) ∈ Σ}.    (3.14)

In particular, the inverse image of a set Y ⊆ O is

Σ⁻¹[Y] = {u ∈ I : Σ[u] ∩ Y ≠ ∅}.    (3.15)

Note that every system has an inverse in this set-valued sense.

In the classical setting, causality is defined using truncation operators on the signal spaces [7, 28, 35]. More appropriate for our setting is the following notion of causality adapted from [29]:

Definition 3.3 An IO system (I, O, Σ) is causal when for every (ū, ȳ) ∈ Σ, every t ∈ T, and every u ∈ Bt(ū) ∩ dom(Σ) there exists y ∈ Bt(ȳ) ∩ Σ[u].

In other words, Σ is causal when for every (ū, ȳ) ∈ Σ, every t ∈ T, and every u ∈ dom(Σ) such that ‖u − ū‖t = 0 there exists y ∈ Σ[u] such that ‖y − ȳ‖t = 0. Arguably more accurate terms than “causal” are nonanticipative [7, 35] or nonanticipating [29], as they avoid the implication that the inputs somehow “cause” the outputs. Nevertheless, we use the more familiar term as it is less cumbersome.

We next show that causality is a particular form of uniform continuity of an IO system (I, O, Σ) regarded as a set-valued map from the input space I to the output space O. Recall that a set-valued map is lower hemicontinuous when the inverse image of every open set is open.¹ We will slightly weaken this standard definition by requiring only that the inverse image of every open set is open relative to its domain. Making specific choices for the topologies on I and O leads to the following:

Definition 3.4 An IO system (I, O, Σ) is lower hemicontinuous (lhc) when the inverse image Σ⁻¹[Y] of every open set Y ⊆ O is open in I relative to dom(Σ). It is finely lower hemicontinuous (flhc) when the inverse image of every finely open set is finely open in I relative to dom(Σ). It is weakly lower hemicontinuous (wlhc) when the inverse image of every open set is finely open in I relative to dom(Σ).

¹ Many authors use “semicontinuous” rather than “hemicontinuous,” e.g., [3].


Fig. 3.3 An illustration of λ-uniform fine lower hemicontinuity: given (ū, ȳ) ∈ Σ and t ∈ T, for any u ∈ dom(Σ) that agrees with ū up to time λ(t) there exists y ∈ Σ[u] that agrees with ȳ up to time t. This is the same as causality when λ = id, namely, when λ(t) = t for all t ∈ T

We could also consider a strong version of lower hemicontinuity using the seminorm topology on I and the fine topology on O, but we will make no use of such a version here. Also, neither lhc nor flhc is a stronger property than the other in general, but they are both stronger than wlhc.

To define uniform versions of lower hemicontinuity, we tie the open sets in O in Definition 3.4 to particular open sets in I using the balls in (3.2) and (3.3) together with a mapping λ of the time axis T. Given a signal space (X, T, S), we say that a function λ : T → T is a look-ahead map for X when there exists a constant L > 0 such that ‖·‖t ≤ L‖·‖λ(t) for all t ∈ T. We call L a look-ahead constant for λ. Note that by the definition of the partial order ≼ on T, we have t ≼ λ(t) for all t ∈ T (which means λ(t) indeed “looks ahead” in time). Also, the identity map λ = id on T is always a look-ahead map with L = 1. As another example, suppose T = R≥0 (continuous time) or T = N (discrete time) and suppose the mapping t ↦ ‖x‖t is monotone in t for any fixed x ∈ X (as in most classical settings); then λ(t) = t + τ for a fixed positive τ ∈ T is a look-ahead map with L = 1. More generally, if there exists a constant a ∈ R such that the mapping t ↦ e^{at}‖x‖t is monotone in t for any fixed x ∈ X, then λ(t) = t + τ is a look-ahead map with L = e^{aτ}.

Definition 3.5 Let λ be a look-ahead map for I ⊕ O. An IO system (I, O, Σ) is λ-uniformly lower hemicontinuous (λ-lhc) when for every (ū, ȳ) ∈ Σ, t ∈ T, and ε > 0 there exists δ > 0 such that Bλ(t),δ(ū) ∩ dom(Σ) ⊆ Σ⁻¹[Bt,ε(ȳ)]. It is λ-uniformly finely lower hemicontinuous (λ-flhc) when Bλ(t)(ū) ∩ dom(Σ) ⊆ Σ⁻¹[Bt(ȳ)] for all (ū, ȳ) ∈ Σ and t ∈ T. It is λ-uniformly weakly lower hemicontinuous (λ-wlhc) when Bλ(t)(ū) ∩ dom(Σ) ⊆ Σ⁻¹[Bt,ε(ȳ)] for all (ū, ȳ) ∈ Σ, t ∈ T, and ε > 0.

In particular, we see that causality is just a special case of λ-flhc when λ = id. We illustrate the notion of λ-flhc in Fig. 3.3. As before, neither λ-lhc nor λ-flhc is a stronger property than the other in general, but they are both stronger than λ-wlhc.


3.4 Stability and Gain of IO Systems

An IO system (I, O, Σ) is called “input–output stable” in [12] when small inputs produce small outputs. This definition makes sense in the context of univalent systems in which the cross section Σ[u] is a singleton for every u ∈ dom(Σ). When Σ[u] is a set, however, there are different notions of what it means for it to be “small.” We adapt one such notion from [25, 31, 32] as follows:

Definition 3.6 An IO system (I, O, Σ) is minimally stable when for each u ∈ dom(Σ)s, the cross section Σ[u] is controllable to zero.

Thus in a minimally stable system, the set of outputs for any small input has the set of small outputs as a dense subset. This is weaker than the property that small inputs produce only small outputs.

3.4.1 Finite-Gain Stability

To deal with inputs that are not necessarily small, we will introduce the notion of the gain of an IO system. In the classical setting, an IO system (I, O, Σ) has a finite gain when there exist constants γ, β ≥ 0 such that

‖y‖t ≤ γ·‖u‖t + β    (3.16)

for all (u, y) ∈ Σ and all t ∈ T. If the input is small then we can take the supremum of both sides of (3.16) over t to conclude that all associated outputs must also be small; in particular, all IO systems having finite gain are minimally stable. We next generalize this definition by allowing different values for time on the left- and right-hand sides of the inequality (3.16). Given β ≥ 0 and a look-ahead map λ for I ⊕ O, the look-ahead gain with bias β of the IO system (I, O, Σ) is the nonnegative extended real number g^λ_β(Σ) defined as

g^λ_β(Σ) = sup over ε > 0, t ∈ T, (u, y) ∈ Σ of (‖y‖t − β)/(ε + ‖u‖λ(t))  when Σ ≠ ∅,
g^λ_β(Σ) = 0  when Σ = ∅.    (3.17)

Note that g^λ_β(Σ) ≥ 0 because if the numerator in (3.17) is negative, then taking ε → ∞ will make the supremum zero. Also, if g^λ_β(Σ) < ∞ then

‖y‖t ≤ g^λ_β(Σ)·‖u‖λ(t) + β    ∀(u, y) ∈ Σ, ∀t ∈ T.    (3.18)


We see from (3.17) that g^λ_β(Σ) is nonincreasing in β, so it has a limit as β → ∞ which we define to be the look-ahead gain g^λ(Σ) of Σ:

g^λ(Σ) = inf over β ≥ 0 of g^λ_β(Σ) = lim as β → ∞ of g^λ_β(Σ).    (3.19)

We say that the system is λ-stable when g^λ(Σ) < ∞, and we say that it is λ-stable with zero bias when g^λ_0(Σ) < ∞. These gains are identical for linear systems:

Lemma 3.3 If Σ is a linear IO system then g^λ(Σ) = g^λ_0(Σ).

A related result is the following:

Lemma 3.4 Every λ-stable linear IO system is univalent and λ-flhc.

Note that sup over t ∈ T of ‖u‖λ(t) ≤ sup over t ∈ T of ‖u‖t, so if Σ is λ-stable and u is small then we can take the supremum of both sides of (3.18) over t ∈ T to obtain ‖y‖s ≤ g^λ_β(Σ)·‖u‖s + β for all y ∈ Σ[u]. Hence if Σ is λ-stable then small inputs produce only small outputs, and in particular all λ-stable systems are minimally stable.

We next show that λ-stability is preserved under compositions. Given IO systems (I, X, Σ) and (X, O, Λ), their composition is the IO system (I, O, Λ ◦ Σ) with behavior

Λ ◦ Σ = {(u, y) ∈ I ⊕ O : ∃x ∈ X such that (u, x) ∈ Σ and (x, y) ∈ Λ}.    (3.20)

Lemma 3.5 Suppose (I, X, Σ) and (X, O, Λ) are IO systems. If Σ is λ1-stable and Λ is λ2-stable, then Λ ◦ Σ is (λ1 ◦ λ2)-stable with g^{λ1◦λ2}(Λ ◦ Σ) ≤ g^{λ2}(Λ)·g^{λ1}(Σ). If in addition the biases for Σ and Λ are zero then g^{λ1◦λ2}_0(Λ ◦ Σ) ≤ g^{λ2}_0(Λ)·g^{λ1}_0(Σ).

Note that unlike [23], we do not use the term “bounded” to describe the finite-gain property. This is because the notion of a bounded operator between topological vector spaces has a standard meaning, namely, that (von Neumann) bounded sets are mapped to bounded sets. Also, note that λ-stability is stronger than continuity, even for linear univalent operators. Indeed, it is clear that a linear map T : I → O is continuous (with respect to the seminorm topologies) when for every s ∈ T there exist t ∈ T and C > 0 such that ‖Tu‖s ≤ C‖u‖t for all u ∈ I; for λ-stability we require further that t = λ(s) and that C is independent of s.

Because the special case λ = id is important, we highlight it by defining gβ(Σ) and g(Σ) (without the superscript λ) to be the gains in (3.17) and (3.19) for the specific choice λ = id. In this special case the inequality (3.18) reduces to the classical finite-gain inequality (3.16), and we say that Σ is stable when g(Σ) < ∞ and stable with zero bias when g0(Σ) < ∞. Note that if L is a look-ahead constant for λ then g^λ_β(Σ) ≤ L·gβ(Σ) for any β ≥ 0. In particular, stable systems are λ-stable for any look-ahead map λ (but not conversely in general). Also, Lemma 3.4 with λ = id states that every stable linear IO system is univalent and causal. However, there exist stable nonlinear IO systems that are not causal: consider the discrete-time system with I = O = Lp_loc given by the difference equation yt = ut·tanh(ut+1) for t ∈ N.
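The last example is easy to probe numerically. A minimal sketch (not from the chapter; the signals here are arbitrary test data): since |tanh| ≤ 1, the map yt = ut·tanh(ut+1) satisfies |yt| ≤ |ut| pointwise, so it is stable with gain at most 1, yet the output at time t depends on the input at time t + 1:

```python
import numpy as np

def system(u):
    """y_t = u_t * tanh(u_{t+1}); pad the final step with u_{T+1} = 0."""
    u_next = np.append(u[1:], 0.0)
    return u * np.tanh(u_next)

rng = np.random.default_rng(1)
u1 = rng.standard_normal(10)
u2 = u1.copy()
u2[5:] += 1.0          # u1 and u2 agree on times 0..4

y1, y2 = system(u1), system(u2)

# Finite gain with constant 1: |y_t| <= |u_t| pointwise, hence
# ||y||_t <= ||u||_t for every truncation.
assert np.all(np.abs(y1) <= np.abs(u1) + 1e-12)

# Not causal: the outputs already differ on times 0..4 (at t = 4) even
# though the inputs agree there.
assert not np.allclose(y1[:5], y2[:5])
```

The causality violation occurs exactly at t = 4, where the output reads one step ahead into the inputs that differ from t = 5 onward.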


3.4.2 Relationships Between Gain, Small-Signal Gain, and Norm Gain

We can also apply these gain concepts to the subsystems Σs and Σn in Table 3.1 to obtain the small-signal gain g^λ(Σs) and the norm gain g(Σn). Note that the finiteness of either of these gains does not imply that small inputs produce small outputs, because by definition the subsystems Σs and Σn contain only small signals. Indeed, in the classical extended L2 setting in continuous time, a linear time-invariant system described by a proper transfer function with no poles on the imaginary axis has a finite norm gain equal to the peak value of the magnitude portion of its Bode plot, even if the system is unstable. As we will see in Sect. 3.5.3, soft IQCs by themselves generally provide bounds on the norm gain only and thus do not ensure stability without additional assumptions.

The small-signal and norm gains of a system are related to the look-ahead gain g^λ(Σ) through the following inequalities:

g(Σn) ≤ g^λ(Σs) ≤ g^λ(Σ)    (3.21)
gβ(Σn) ≤ g^λ_β(Σs) ≤ g^λ_β(Σ),    (3.22)

where (3.22) holds for any β ≥ 0 and any look-ahead map λ, and (3.21) follows from (3.22) by taking the limit as β → ∞. Note that by setting λ = id in (3.21) we obtain the inequality g(Σn) ≤ g(Σ). What we need for our stability analysis, however, is the reverse inequality (at least up to a constant factor). One way to achieve such a reverse inequality is to assume that Σ is a minimally stable causal operator:

Lemma 3.6 Let (I, O, Σ) be a minimally stable causal operator, and suppose the input space I is uniformly finely controllable to zero with controllability constant K. Then g(Σ) ≤ K·g(Σn).

Lemma 3.6 is used implicitly in the stability analysis of [21, 23], and is related to [7, Exercise 8c] and [25, Proposition 6]. We will not prove Lemma 3.6 separately as it is a direct corollary of the following lemma:

Lemma 3.7 Let (I, O, Σ) be a minimally stable IO system. If Σ is λ-lhc and dom(Σ) is uniformly controllable to zero (or if Σ is λ-wlhc and dom(Σ) is uniformly finely controllable to zero), then g^λ_β̄(Σ) ≤ K·gβ(Σn) for all β ≥ 0 such that gβ(Σn) < ∞, where β̄ = β + b·gβ(Σn) and K and b are the controllability constant and bias for dom(Σ). In particular g^λ(Σ) ≤ K·g(Σn).

We obtain Lemma 3.6 by setting λ = id in Lemma 3.7 and recognizing that, in this case, causality is the same as λ-flhc (which in turn implies λ-wlhc). The following version of Lemma 3.7 has weaker assumptions but involves the larger small-signal gain rather than the norm gain:

Lemma 3.8 Let (I, O, Σ) be a minimally stable IO system. If Σ is lhc and dom(Σ) is controllable to zero (or if Σ is wlhc and dom(Σ) is finely controllable to zero), then g^λ_β(Σs) = g^λ_β(Σ) for all β ≥ 0, and in particular g^λ(Σs) = g^λ(Σ).
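The remark above, that an unstable transfer function can have a finite norm gain given by the peak of its Bode magnitude plot, can be checked numerically. A minimal sketch for H(s) = (s + 1)/(s − 1), which has an unstable pole at s = 1 yet unit magnitude at every frequency (this same map reappears later in Example 3.9):

```python
import numpy as np

# H(s) = (s + 1)/(s - 1): unstable (pole at s = 1), but all-pass, so the
# peak of its Bode magnitude plot -- and hence its norm gain -- is 1.

w = np.logspace(-3, 3, 2001)          # frequency grid (rad/s)
H = (1j * w + 1) / (1j * w - 1)       # frequency response H(jw)
peak = np.max(np.abs(H))

assert abs(peak - 1.0) < 1e-9         # |H(jw)| = 1 for every w
```

The computation confirms that finiteness of the norm gain says nothing about stability: the norm gain only constrains those trajectories that happen to be small signals.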


3.4.3 Stability Robustness in the Gap Topology

Are stability, λ-stability, or minimal stability preserved under small perturbations? In this section we answer this question for perturbations in a certain topology on the set of systems known as a “gap topology” (see [6] and the references therein). We will use a version of the gap topology described in [10, 23]. This topology comes from a type of normalized Hausdorff set distance, defined in the following manner. Let (X, T, S) be a signal space, and consider the distance functions d, q : X × X → [0, ∞] given by

d(x, y) = sup over ε > 0, t ∈ T of ‖x − y‖t/(ε + ‖x‖t)    (3.23)
q(x, y) = ln(1 + d(x, y)),    (3.24)

with the convention ln ∞ = ∞.

Lemma 3.9 The distance function q is an extended quasimetric on X.

This quasimetric q generates a topology on X in the usual manner [5]: a set U ⊆ X is open in the q-topology when for every x ∈ U there exists r > 0 such that y ∈ X and q(x, y) < r imply y ∈ U. Thus a sequence {xn}n∈N in X converges to a point x ∈ X in the q-topology if and only if q(x, xn) → 0 as n → ∞. The q-topology is finer than the seminorm topology, and X with the q-topology is not a topological vector space (unless X is trivial and contains only the zero signal). Indeed, 0 is an isolated point in the q-topology because q(0, x) = ∞ for all x ≠ 0. As a result, neither scalar multiplication nor vector addition is continuous in the q-topology when X contains a nonzero signal x: the sequence x/n does not converge to zero as n → ∞; moreover, the sequences (1 + 1/n)x and (−1 + 1/n)x have q-limits x and −x (respectively), but their sum is 2x/n which does not converge to zero. A consequence of the following lemma is that the conjugate quasimetric q̄ defined as q̄(x, y) = q(y, x) is topologically equivalent to q, that is, it also generates the q-topology:

Lemma 3.10 If d(x, y) < 1 then d(y, x) ≤ d(x, y)/(1 − d(x, y)). In particular, the pointwise maximum of q and q̄ is an extended metric on X which generates the q-topology.

We next use d in (3.23) to define a Hausdorff-like distance d̂ between sets A, B ⊆ X:

d̂(A, B) = max{ d⃗(A, B), d⃗(B, A) },    (3.25)

where the directed distance d⃗ : P(X) × P(X) → [0, ∞] is defined as

d⃗(A, B) = sup over a ∈ A of inf over b ∈ B of d(a, b)  if A ≠ ∅ and B ≠ ∅,
d⃗(A, B) = 0  if A = ∅,
d⃗(A, B) = ∞  if A ≠ ∅ and B = ∅.    (3.26)
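Stepping back to the point distance d in (3.23), its asymmetry — and the bound of Lemma 3.10 — is easy to see numerically. A minimal sketch for discrete signals with cumulative ℓ2 seminorms (the test signal is arbitrary; the sup over ε > 0 is taken in the limit ε → 0, which is valid here because ‖x‖t > 0 for every t):

```python
import numpy as np

def d(x, y):
    """The distance (3.23) with cumulative l2 seminorms; assumes the
    reference signal has ||x||_t > 0 for all t (true below: x[0] != 0)."""
    nx = np.sqrt(np.cumsum(x ** 2))        # ||x||_t for t = 0, 1, ...
    nd = np.sqrt(np.cumsum((x - y) ** 2))  # ||x - y||_t
    return float(np.max(nd / nx))

x = np.array([1.0, -2.0, 0.5, 3.0, -1.0])
y = 1.05 * x                               # a 5% multiplicative perturbation

dxy, dyx = d(x, y), d(y, x)

# d is not symmetric (so q = ln(1 + d) is only a quasimetric) ...
assert abs(dxy - 0.05) < 1e-12
assert dxy != dyx
# ... but Lemma 3.10 bounds the asymmetry whenever d(x, y) < 1:
assert dyx <= dxy / (1 - dxy) + 1e-12
```

Here d(x, y) = 0.05 while d(y, x) = 0.05/1.05, consistent with the lemma's bound 0.05/0.95.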

These distance functions d̂ and d⃗ are called the gap and the directed gap (respectively), and they are versions of the gaps defined in [10, 23]. The logarithm ln(1 + d̂(·,·)) is an extended pseudometric on the power set P(X) (as shown in [11], for example), and the associated pseudometric topology on P(X) is called the gap topology. The following lemma shows that the set of stable systems is open in the gap topology; this is essentially [23, Lemma 1] but without the causality assumptions.

Lemma 3.11 Let (I, O, Σ) and (I, O, Λ) be IO systems. If Σ is stable and d⃗(Λ, Σ) < (2g(Σ) + 2)⁻¹ then g(Λ) ≤ 2g(Σ) + 1. If Σ is stable with zero bias and d⃗(Λ, Σ) < (2g0(Σ) + 2)⁻¹ then g0(Λ) ≤ 2g0(Σ) + 1.

This lemma also shows that the smaller the gain of Σ, the more it can be perturbed while preserving stability. In general, however, neither the set of minimally stable systems nor the set of λ-stable systems is open in the gap topology, as the following example shows:

Example 3.8 Let λ be a look-ahead map, and define the parameterized IO system

Σα = {(u, y) ∈ I ⊕ O : ‖y‖t ≤ ‖u‖t + α‖y‖λ(t) ∀t ∈ T}    (3.27)

for each parameter α ∈ R. Clearly g^λ_0(Σ0) = 1, so Σ0 is λ-stable. Also, the mapping α ↦ Σα is continuous in the gap topology; indeed, suppose (u, y) ∈ Σα and define ū = u + (α − ᾱ)y and ȳ = y for some ᾱ ∈ R. Then (ū, ȳ) ∈ Σᾱ and

‖u − ū‖t + ‖y − ȳ‖t = |α − ᾱ|·‖y‖t    (3.28)

for all t ∈ T. It follows that d⃗(Σα, Σᾱ) ≤ |α − ᾱ|, and by reversing the roles of α and ᾱ we obtain d̂(Σα, Σᾱ) ≤ |α − ᾱ|. Thus α ↦ Σα is actually uniformly continuous. Next we examine the cross section Σα[0], which is the set

Σα[0] = {y ∈ O : ‖y‖t ≤ α‖y‖λ(t) ∀t ∈ T}.    (3.29)

Now suppose we are in discrete time with I = O = Lp_loc and 1 ≤ p < ∞, suppose λ(t) = t + 1, and suppose 0 < |α| < 1. Let yt denote the value of the signal y at time t. Then ‖y‖^p_{t+1} = |yt+1|^p + ‖y‖^p_t for all y ∈ O and all t ∈ T, which means

Σα[0] = {y ∈ O : |yt+1|^p ≥ (|α|^{−p} − 1)·‖y‖^p_t ∀t ∈ T}.    (3.30)

Thus we see that if y ∈ Σα[0] is nonzero, then ‖y‖s > 0 for some s ∈ T, which means |yt| cannot converge to zero as t → ∞. Hence the only small signal in Σα[0] is the


zero signal, and because Σα[0] also contains nonzero signals we conclude that Σα is not minimally stable (and in particular it is not λ-stable).

The main reason we lack stability robustness in Example 3.8 is that nonzero outputs in the perturbed system blow up very quickly when α is near zero. If we limit the growth of such outputs a priori, then we can indeed preserve minimal stability under perturbations in the gap topology, at least under some additional controllability and lower hemicontinuity assumptions. To this end, given a look-ahead map λ and a parameter μ ≥ 1 we define the following set of output signals:

G^λ_μ = {y ∈ O : ∃χ ≥ 0 such that ∀t ∈ T, ‖y‖λ(t) ≤ μ‖y‖t + χ}.    (3.31)

This set G^λ_μ represents those outputs that do not blow up too quickly, as measured by λ and μ. For example, if T = R≥0 and λ(t) = t + τ for some positive constant τ ∈ T, then G^λ_μ represents the set of all signals in O that exhibit exponential growth no faster than μ^{t/τ}. We say that an IO system (I, O, Σ) is (λ, μ)-limited when G^λ_μ is dense in Σ[u] for all small inputs u ∈ dom(Σ). This basically means that small inputs to Σ produce outputs having limited growth. In many cases linear growth conditions on differential or difference equations can guarantee this property. Note that for λ = id and μ = 1 we have G^id_1 = O, which means every IO system is (id, 1)-limited.

The following is a version of Lemma 3.11 that uses the small-signal gain g^λ(Σs) to measure how far a system can be perturbed while preserving minimal stability:

Lemma 3.12 Let (I, O, Σ) be a minimally stable IO system such that g^λ(Σs) < ∞. Suppose that either Σ is λ-lhc and dom(Σ) is controllable to zero, or that Σ is λ-wlhc and dom(Σ) is finely controllable to zero. Let (I, O, Λ) be a (λ, μ)-limited IO system, and let L be a look-ahead constant for λ. If

d⃗(Λ, Σ) < (μ·g^λ(Σs) + μL)⁻¹    (3.32)

then Λ is minimally stable.

Note that we can use either the small-signal gain g^λ(Σs) or the gain g^λ(Σ) in Lemma 3.12, because in this case they are equal by Lemma 3.8.
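The blow-up behind Example 3.8 — precisely the fast growth that G^λ_μ rules out — can be illustrated numerically. A minimal sketch with p = 2 and the arbitrary choice α = 1/2: any nonzero output in the cross section Σα[0] must satisfy |yt+1|^p ≥ (|α|^{−p} − 1)·‖y‖^p_t, so its truncated norms grow geometrically and it can never be a small signal:

```python
import numpy as np

p, alpha = 2, 0.5
c = alpha ** (-p) - 1.0               # the growth constant |alpha|^{-p} - 1 = 3

# Build the slowest-growing nonzero signal allowed by (3.30):
# |y_{t+1}|^p = c * ||y||_t^p, starting from y_0 = 1.
T = 30
y = np.zeros(T)
y[0] = 1.0
for t in range(T - 1):
    norm_t_p = np.sum(np.abs(y[: t + 1]) ** p)   # ||y||_t^p
    y[t + 1] = (c * norm_t_p) ** (1.0 / p)

norms = np.sqrt(np.cumsum(y ** 2))    # truncated norms ||y||_t (p = 2)

# Each step multiplies ||y||_t^p by 1 + c = 4, so sup_t ||y||_t = infinity:
# y is not a small signal, and its growth is too fast for any fixed (id, mu).
ratios = (norms[1:] / norms[:-1]) ** p
assert np.allclose(ratios, 1.0 + c)
assert norms[-1] > 1e6
```

Even this slowest admissible nonzero output grows by a factor of 4 per step in the p-th power of its truncated norm, which is why Σα fails to be minimally stable for every α ≠ 0.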

3.4.4 Stability via Homotopy

The main idea behind the stability proofs in [21, 23] is to start with a simple system Σ0 which we know to be stable, and then continuously deform it through a mapping α ↦ Σα for α ∈ [0, 1] until we reach the target system of interest Σ1. If Σα satisfies certain conditions along the homotopy path, then we conclude that the target system Σ1 is stable as well. Most of the effort in the proof comes from Lemma 3.12, which along with either Lemma 3.7 or Lemma 3.8 leads to the following stability theorems. The first one makes use of Lemma 3.8 along with the small-signal gain:


Theorem 3.1 Let {(I, O, Σα)} be a family of IO systems with parameter α ∈ [0, 1], and let λ be a look-ahead map for I ⊕ O. Suppose
(i) the mapping α ↦ Σα is continuous in the gap topology,
(ii) there exists μ ≥ 1 such that Σα is (λ, μ)-limited for all α ∈ [0, 1],
(iii) for each α ∈ [0, 1], either Σα is λ-lhc and dom(Σα) is controllable to zero, or Σα is λ-wlhc and dom(Σα) is finely controllable to zero,
(iv) there exists γ ≥ 0 such that g^λ(Σαs) ≤ γ for all α ∈ [0, 1], and
(v) Σ0 is minimally stable.
Then Σ1 is λ-stable with g^λ(Σ1) ≤ γ.

Note that we cannot conclude g^λ(Σ1) ≤ γ directly from (iv) and Lemma 3.8 because without the homotopy argument we do not know that Σ1 is minimally stable. The second theorem uses Lemma 3.7 along with the weaker (and easier to verify) norm gain, but it requires uniform controllability rather than mere controllability:

Theorem 3.2 Let {(I, O, Σα)} be a family of IO systems with parameter α ∈ [0, 1], and let λ be a look-ahead map for I ⊕ O. Suppose (i), (ii), and (v) of Theorem 3.1 hold, but instead of (iii) and (iv) suppose
(iii') there exists K ≥ 0 such that for each α ∈ [0, 1], Σα is λ-lhc and dom(Σα) is uniformly controllable to zero (or Σα is λ-wlhc and dom(Σα) is uniformly finely controllable to zero) with controllability constant K, and
(iv') there exists γ ≥ 0 such that g(Σαn) ≤ γ for all α ∈ [0, 1].
Then Σ1 is λ-stable with g^λ(Σ1) ≤ Kγ.

If the target system Σ1 is a causal operator, then we can use Lemma 3.6 to conclude from Theorem 3.1 or 3.2 that it is stable (rather than merely λ-stable) even though Σα might not be a causal operator along the homotopy path. If each Σα is in fact a causal operator, then we obtain the following version of [23, Theorem 2]:

Corollary 3.1 Let {(I, O, Σα)} be a family of IO systems with parameter α ∈ [0, 1], and suppose the input signal space I is uniformly finely controllable to zero with controllability constant K. Suppose (i) and (v) of Theorem 3.1 hold, suppose (iv') of Theorem 3.2 holds, and suppose Σα is a causal operator for every α ∈ [0, 1]. Then Σ1 is stable with g(Σ1) ≤ Kγ.

Note that [23, Theorem 2] does not assume (iv') directly, but rather it uses a sufficient condition for (iv') stated in terms of IQCs. We will show how to do this in Sect. 3.5.3. For us to conclude that the gain results in Theorems 3.1 and 3.2 and Corollary 3.1 hold with zero bias, we simply assume that (iv) or (iv') is satisfied with g0 instead of g and that the uniform controllability conditions hold with zero controllability bias. Another simple extension of Theorems 3.1 and 3.2 is to allow λ to depend on α; in this case all we need is a single look-ahead constant valid for all λα, plus the following monotonicity property: if λα ≼ λᾱ then G^{λα}_μ ⊆ G^{λᾱ}_μ.

The following example shows that if all of the conditions of Theorem 3.2 are satisfied except that there is no single value of the controllability constant K in (iii')


that works for every α, then the target system can be unstable. The parameterized system in this example is even causal and linear for each α, but this does not help.

Example 3.9 Let I = O = L2loc ⊕ L2loc on the time axis T = R≥0 so that each input and output signal has two components u = (u1, u2) and y = (y1, y2). Let h : T → R be the signal h(t) = 2e^t, and consider the family {(I, O, Σα)} of linear IO systems with parameterized behavior

Σα = {(u, y) ∈ I ⊕ O : y1 = u1, y2 = u2 + h ∗ u2, and y1 = (1 − α)y2}    (3.33)

for α ∈ [0, 1], where ∗ denotes the convolution of one-sided signals, i.e., signals supported on T. This family satisfies almost all of the conditions of Theorem 3.2 with λ = id. We will show in Lemma 3.24 in the appendix that the mapping α ↦ Σα is uniformly continuous. The growth condition in (ii) is trivially satisfied because λ = id. The unstable linear map u2 ↦ u2 + h ∗ u2 has the all-pass, unity-gain transfer function (s + 1)/(s − 1), and it follows that (iv') holds with γ = 1. The initial system Σ0 satisfies y1 = y2 = u1 and is thus minimally stable. All that is left is (iii'). Each Σα is causal and thus also λ-wlhc. Its domain is

dom(Σα) = {(u1, u2) ∈ I : u1 = (1 − α)(u2 + h ∗ u2)}.    (3.34)

If α = 1 then dom(Σα) = {0} × L2loc, which is uniformly finely controllable to zero with controllability constant K = 1. If α < 1 then we can write dom(Σα) in the form (3.11)–(3.12) in Example 3.7 and thus conclude that it is also uniformly finely controllable to zero. However, there is no single value of K that works for all α ∈ [0, 1], so (iii') is not satisfied (one can show that the value of K must grow like 1/(1 − α) as α → 1 before it jumps down to K = 1 at α = 1). As a result, the conclusion of Theorem 3.2 fails to hold, and indeed the target system Σ1 is unstable.

3.5 Stability of Interconnections We typically apply the homotopy methods of Theorems 3.1 and 3.2 to interconnections of systems. In this section we show how we can use properties of individual subsystems (e.g., continuity with respect to parameters and controllability) to deduce the analogous properties of their interconnection as needed in these theorems. Let (I, O, ) be an IO system. Given a system (O, ), we define the interconnection of  and  to be the IO system (I, O, [, ]) with behavior   [, ] = (u, y) ∈  : y ∈  .

(3.35)

Figure 3.4 illustrates this interconnection. Note that  is not necessarily an IO system in this context, and we need not regard the signal y in Fig. 3.4 as “entering” or

3 On the Role of Well-Posedness in Homotopy Methods …

65

Fig. 3.4 The interconnection [, ] of  and  Fig. 3.5 The classical interconnection [G +, ] with an additive input u

Fig. 3.6 The classical interconnection [G +, ] of Fig. 3.5 when G and  are IO systems

“leaving” . Instead,  represents a constraint on the outputs of that result in the new IO system [, ]. This perspective on interconnections is from [29], and  typically represents the nonlinear, time-varying, or uncertain part of the system. There are various ways of defining the stability of the interconnection [, ]. In [30], the signal u is treated as an internal state, and stability means the convergence of u to zero in forward time. We will instead take an IO point of view in which stability means the finite-gain stability of [, ] as an defined in Sect. 3.4.1. In this approach u represents the exogenous signal, y represents the endogenous signal, and stability implies that y cannot be large unless u is also large. An important special case of the interconnections in Fig. 3.4 involves a system (O, G) defined on the output space O. Let us first compose G with addition on O to obtain the system (O ⊕ O, G + ):   G + = (u, y) ∈ O ⊕ O : u + y ∈ G .

(3.36)

If I = O, then we can form the interconnection [G⁺, Δ] as illustrated in Fig. 3.5. We regard this special case as a classical interconnection, because if G and Δ are both IO systems then we can split u and y into components to obtain the familiar feedback connection in Fig. 3.6 (note that u2 enters the summing junction in Fig. 3.6 with a negative sign so that Fig. 3.6 is indeed a form of Fig. 3.5, but this has no bearing on stability). The system G is linear in much of the classical stability literature. The following lemmas show that the mapping from G to G⁺ is continuous in the gap topology and preserves controllability to zero:


R. A. Freeman

Lemma 3.13 The mapping (·)⁺ : P(O) → P(O ⊕ O) is uniformly continuous.

Lemma 3.14 If both (O, G) and the signal space O itself are controllable (resp. finely controllable) to zero, then so is G⁺.
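To make the set-theoretic definitions (3.35) and (3.36) concrete, here is a small Python sketch. It is only an illustration under simplifying assumptions — behaviors are modeled as finite sets of scalar (input, output) pairs rather than sets of time signals, and the names Sigma, Delta, G are hypothetical:

```python
# Behaviors modeled as finite sets of (input, output) pairs over a toy
# signal alphabet; this mimics (3.35)-(3.36) only in spirit -- real
# behaviors are sets of time signals, not numbers.

def interconnect(Sigma, Delta):
    """[Sigma, Delta] = {(u, y) in Sigma : y in Delta}  -- cf. (3.35)."""
    return {(u, y) for (u, y) in Sigma if y in Delta}

def G_plus(G, signals):
    """G+ = {(u, y) : u + y in G}  -- cf. (3.36), over a finite grid."""
    return {(u, y) for u in signals for y in signals if u + y in G}

signals = range(-3, 4)
G = {0, 1}                      # a toy behavior on the output space
Delta = {-1, 0, 1}              # a constraint on outputs
Gp = G_plus(G, signals)
closed_loop = interconnect(Gp, Delta)

# y is admissible iff u + y lands in G and y also satisfies the constraint
print(sorted(closed_loop))
```

The point of the sketch is that Δ acts purely as a restriction on which pairs of Σ (here G⁺) survive, exactly as in (3.35).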

3.5.1 Well-Posed Interconnections

Recall the definition of well-posedness from [21, 23]: we say that Ψ = [Σ, Δ] in Fig. 3.4 is a well-posed interconnection when Ψ is a causal operator. The methods of [21, 23] assume that the interconnection is well-posed along the entire homotopy path. We have already seen in Theorems 3.1 and 3.2 how we can relax this well-posedness assumption. In particular, we can replace the requirement that Ψ is an operator with a weaker controllability requirement on its domain. Likewise, we can replace the causality requirement with a weaker lower hemicontinuity requirement together with a growth condition. Thus even if the target system is well-posed, the other artificial systems along the homotopy path need not be. In fact, it is possible that the only well-posed system on the homotopy path is the target system itself, because as the next examples show, the set of well-posed systems is not open in the gap topology.

Example 3.10 We consider the interconnection of Fig. 3.6 with I = O = L^p_loc ⊕ L^p_loc on T = R≥0, so that each input and output signal has two components u = (u1, u2) and y = (y1, y2). Let G ⊂ O be the linear system G = {(0, 0)}, that is, the system whose behavior contains solely the zero vector in O. For each parameter α ∈ [0, 1], we define the system Δα as

Δα = {(y1, y2) ∈ O : ∀t ∈ T, y1t ≥ (1 − α)y2t or y1t ≤ 0} ,

(3.37)

where y1t and y2t represent the values of the signals y1 and y2 at time t. It follows that the interconnection Ψα = [G⁺, Δα] is given by

Ψα = {(u, y) ∈ I ⊕ O : y = −u and −u ∈ Δα} .

(3.38)

We will show in Lemma 3.25 in the appendix that the mapping α ↦ Ψα is uniformly continuous. Moreover, the target system Ψ1 is well-posed, because when α = 1 we have Δα = O, and thus Ψα reduces to a constant gain of −1 (which is clearly a causal operator). When α < 1, however, the inputs to Ψα cannot be freely chosen, which means Ψα is causal but no longer an operator and is thus no longer well-posed.

Example 3.11 We again consider the interconnection of Fig. 3.6 with I and O as in Example 3.10. Let G ⊂ O be the linear system G = {(w1, w2) ∈ O : w1 = 0}. For each parameter α ∈ [0, 1], we define the system

Δα = {(y1, y2) ∈ O : ∀t ∈ T, y2t = (1 − α)y1t · sech(y1,t+1)} ,

(3.39)


where y1t and y2t are as in Example 3.10 and y1,t+1 represents the value of the signal y1 at time t + 1. It follows that the interconnection Ψα = [G⁺, Δα] is given by

Ψα = {(u, y) ∈ I ⊕ O : y1 = −u1 and ∀t ∈ T, y2t = (α − 1)u1t · sech(u1,t+1)} .   (3.40)

We will show in Lemma 3.26 in the appendix that the mapping α ↦ Ψα is uniformly continuous. Moreover, the target system is Ψ1 = {(u, y) : y1 = −u1 and y2 = 0}, which is clearly a causal operator. When α < 1, however, the system Ψα is a noncausal operator and is thus no longer well-posed.

To summarize these examples, perturbing a well-posed system can cause it to remain causal but no longer be an operator, or to remain an operator but no longer be causal. It is also possible to lose both properties (e.g., take an appropriate Cartesian product of the above two examples).
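The two failure modes in Examples 3.10 and 3.11 can be exhibited on sampled signals. The following Python sketch is a hypothetical discrete-time illustration (finite sample sequences standing in for L^p_loc signals; not code from the chapter):

```python
import math

# Example 3.10 flavor: the constraint "y1t >= (1-alpha)*y2t or y1t <= 0"
# is vacuous at alpha = 1 but genuinely restricts signals when alpha < 1,
# so the interconnection stops being an operator (not every input allowed).
def in_Delta_310(y1, y2, alpha):
    return all(a >= (1 - alpha) * b or a <= 0 for a, b in zip(y1, y2))

y1, y2 = [0.5, 0.5], [2.0, 2.0]
assert in_Delta_310(y1, y2, 1.0)          # alpha = 1: every signal admissible
assert not in_Delta_310(y1, y2, 0.5)      # alpha < 1: this signal is rejected

# Example 3.11 flavor: y2 at time t depends on the input at time t + 1,
# so for alpha < 1 the map u1 -> y2 is an operator but not causal.
def y2_of_u1(u1, alpha):
    # y2t = (alpha - 1) * u1[t] * sech(u1[t+1]); the last sample would need
    # a future value, so we drop it -- enough to exhibit noncausality.
    sech = lambda x: 1.0 / math.cosh(x)
    return [(alpha - 1) * u1[t] * sech(u1[t + 1]) for t in range(len(u1) - 1)]

u_a = [1.0, 0.0, 0.0]
u_b = [1.0, 5.0, 0.0]                      # agrees with u_a at time 0 only
out_a, out_b = y2_of_u1(u_a, 0.0), y2_of_u1(u_b, 0.0)
print(out_a[0], out_b[0])                  # differ although u_a[0] == u_b[0]
```

The last two lines show the noncausality directly: two inputs agreeing at time 0 produce different outputs at time 0, because the output there depends on the input at time 1.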

3.5.2 Regular Systems

The homotopy method for stability analysis relies on the continuity (in the gap topology) of the interconnection mapping [· , ·]. In this section we show that such continuity holds when the first argument is regular as defined below. As an example, we will see that the system G⁺ in the classical interconnection of Fig. 3.5 is always regular. As argued in [30], however, this classical interconnection may not be the best way to incorporate Δ as model uncertainty; indeed, it seems better suited to the case in which Δ is a controller and u represents additive actuator and sensor noise. The following definition of a regular system allows us to extend the homotopy approach to certain more general interconnections of the type shown in Fig. 3.4.

Definition 3.7 An IO system (I, O, Σ) is r-regular when for each (u, y) ∈ Σ there exists an IO system (O, I, Φ) such that id ⊆ Σ ∘ Φ and g0(Φ − (y, u)) ≤ r. We let Regr(I, O) denote the set of all such r-regular systems, and we say that Σ is regular when it is r-regular for some r ≥ 0.

In this definition, each system Φ is a "right inverse" of Σ. Thus a regular system is one for which there is a "stable right inverse" centered at each point in its graph. In particular, if Σ is regular and nonempty, then at least one Φ exists, and thus Δ ≠ ∅ implies [Σ, Δ] ≠ ∅. Note that G⁺ in (3.36) is 1-regular for any system G; indeed, given (u, y) ∈ G⁺ we can choose Φ = {(ȳ, ū) : u + y = ū + ȳ}. The following is a characterization of regular linear systems, which are essentially those having stable right inverses:

Lemma 3.15 If (I, O, Σ) is linear and there exists a stable univalent linear system (O, I, Φ) such that id ⊆ Σ ∘ Φ, then Σ is regular with r = g(Φ).


Lemma 3.16 Let I and O be signal spaces. Then for each r ≥ 0, the map [· , ·] is uniformly continuous on Regr(I, O) × P(O).

Note that [23, Lemma 2], which states that the classical interconnection shown in Fig. 3.5 is continuous in G and Δ, follows from Lemmas 3.13 and 3.16 together with the fact that G⁺ is always 1-regular. We next show that regularity also provides a way to verify that the interconnection [Σ, Δ] is controllable to zero.

Lemma 3.17 If Σ is regular and if Σ and Δ are controllable (resp. finely controllable) to zero, then [Σ, Δ] is controllable (resp. finely controllable) to zero.
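The 1-regularity of G⁺ can be checked numerically in a toy sketch (scalar "signals" and a hypothetical behavior G are assumptions of the illustration; Phi is the right inverse suggested after Definition 3.7):

```python
# For (u, y) in G+, the right inverse Phi = {(ybar, ubar) : ubar + ybar = u + y}
# recenters the graph: given any target output ybar, choosing
# ubar = (u + y) - ybar keeps ubar + ybar in G, so (ubar, ybar) is in G+.
# The "gain" of Phi - (y, u) is 1 because |ubar - u| = |ybar - y|.

def phi(ybar, u, y):
    return (u + y) - ybar          # the unique ubar with ubar + ybar = u + y

u, y = 2.0, -1.0                   # assume u + y = 1 lies in some behavior G
for ybar in [-3.0, 0.0, 4.5]:
    ubar = phi(ybar, u, y)
    assert ubar + ybar == u + y            # (ubar, ybar) stays in G+
    assert abs(ubar - u) == abs(ybar - y)  # gain exactly 1 => 1-regularity
print("G+ is 1-regular at (u, y) =", (u, y))
```

The loop verifies both requirements of Definition 3.7 pointwise: every target output is reachable (so id ⊆ G⁺ ∘ Φ), and the recentered inverse has gain 1.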

3.5.3 Integral Quadratic Constraints

When Ψα = [Gα⁺, Δα] for parameterized systems Gα and Δα, we can use integral quadratic constraints (IQCs) to verify condition (iv') in Theorem 3.2. Let (X, T, S) be a signal space, and let ‖·‖s denote the norm (3.5) on its small-signal subspace Xs. Following [23], we say that a functional σ : Xs → R is quadratically continuous when for every ε > 0 there exists C > 0 such that

σ(y) ≤ σ(x) + ε‖x‖²s + C‖x − y‖²s

(3.41)

for all x, y ∈ Xs. A typical choice for σ is the following:

Lemma 3.18 Let U and V be normed spaces over R or C, let A : Xs → U and B : Xs → V be bounded R-linear operators, and let ⟨· , ·⟩ : U × V → R be a bounded R-bilinear functional. Then σ given by σ(x) = ⟨Ax, Bx⟩ is quadratically continuous.

Suppose T = R or T = R≥0 (for continuous time), and suppose U and V are both equal to the space of L2 functions from T to Fⁿ. Then a particular choice for the bilinear functional ⟨· , ·⟩ in Lemma 3.18 is the symmetric one given by

⟨u, v⟩ = Re ∫_{−∞}^{∞} û*(ω) Π(ω) v̂(ω) dω ,   (3.42)

where û and v̂ denote the Fourier transforms of u and v, Π is a C^{n×n} Hermitian-valued function with L∞ entries, and Re denotes the real part. If the small-signal subspace Xs is also L2, then we can take A = B = id in Lemma 3.18 so that

σ(x) = ∫_{−∞}^{∞} x̂*(ω) Π(ω) x̂(ω) dω .   (3.43)

This choice for σ leads to the IQCs considered in [21]. Another choice for A and B is for both to be of the form x ↦ (x, xδ), where xδ is the result of delaying the signal x by the amount δ. This leads to the delay-IQCs discussed in [1, 2]. A third choice, explored in [8], is for when Xs is the Sobolev space H^k and A and B are given by x ↦ (x, Dx, …, D^k x), where D denotes the weak derivative operator. Other choices are possible as well. The following lemma comes from the proof of [23, Theorem 2].

Lemma 3.19 (Integral quadratic constraints) Let (O, G) and (O, Δ) be systems on a signal space O. Suppose there exist constants ε > 0 and d ≥ 0 and a quadratically continuous map σ : Os → R such that

(i) σ(w) ≤ −2ε‖w‖²s + d for all w ∈ Gs, and
(ii) σ(y) ≥ −d for all y ∈ Δs.

Then gβ([G⁺, Δ]s) ≤ γ, where β = √(4d/ε) and γ = √(2(1 + C/ε)) with C from (3.41).

Condition (ii) is an IQC for the system Δ, and the constant d is called its defect in [25]. Condition (i) is called an "inverse graph" IQC in [4]. Indeed, we can consider Lemma 3.19 as an example of a graph separation result (for example, see [24, 27] and the references therein).
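For the frequency-domain choice (3.43) with Π = I, the functional σ reduces to a squared L2 norm by Parseval's theorem, which makes its quadratic continuity easy to check. The following sketch uses a DFT as a discrete stand-in for the Fourier transform (an assumption of the illustration, not the chapter's setting):

```python
import cmath

def dft(x):
    n = len(x)
    return [sum(x[k] * cmath.exp(-2j * cmath.pi * j * k / n) for k in range(n))
            for j in range(n)]

def sigma(x):
    # Discrete analogue of (3.43) with Pi(omega) = identity: sum of |x_hat|^2.
    return sum(abs(c) ** 2 for c in dft(x))

x = [1.0, -2.0, 0.5, 3.0]
n = len(x)
# Parseval: sum |x_hat|^2 = n * sum |x|^2, so sigma is a scaled squared norm.
assert abs(sigma(x) - n * sum(v * v for v in x)) < 1e-9

# Quadratic continuity (3.41) for this sigma, with C = (1 + 1/eps) * n:
# ||y||^2 <= (1 + eps)||x||^2 + (1 + 1/eps)||x - y||^2 for any eps > 0.
y, eps = [0.9, -1.5, 0.0, 2.5], 0.3
bound = sigma(x) + eps * n * sum(v * v for v in x) \
        + (1 + 1 / eps) * n * sum((a - b) ** 2 for a, b in zip(x, y))
assert sigma(y) <= bound + 1e-9
print("Parseval and quadratic continuity hold on this sample")
```

The constant C = (1 + 1/ε)·n here comes from the elementary bound (a + b)² ≤ (1 + ε)a² + (1 + 1/ε)b², which is the scalar core of Lemma 3.18 for this σ.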

3.6 Summary

In this chapter we investigated the role of the well-posedness assumption in the homotopy methods of [21, 23]. In particular, we showed that the feedback interconnection need not be defined for all possible inputs; instead, it is sufficient that its domain has a certain controllability property. We also showed how to relax the causality assumption by replacing it with a lower hemicontinuity assumption together with a growth condition. The notions of controllability and causality from [29] were particularly useful in this context. Many have argued that the finite-gain type of stability we considered here is less appropriate for nonlinear systems, and that the use of gain functions instead [13, 26, 27] can provide more flexibility. For this reason it is of interest to extend the results of this chapter to that more general setting.

3.7 Appendix

Proof (Lemma 3.1) Because X is a locally convex Hausdorff space, there exists a separated family S0 = {‖·‖α}α∈I of seminorms on X over an index set I that generates the topology. Let F ⊆ P(I) denote the set of all finite subsets of I, and for each F ∈ F define the seminorm ‖·‖F = Σα∈F ‖·‖α. Then the family S+ = {‖·‖F}F∈F induces the same topology on X that S0 does, and its index set F is directed by the natural preorder ≼. Let T ⊆ F be any transversal of the quotient F/∼ (where ∼ is the equivalence relation defined by ≼), that is, let T be any


set consisting of a single member from each equivalence class in F/∼. Then the subfamily S = {‖·‖t}t∈T of S+ is temporal and induces the same topology on X that S0 does. □

Proof (Lemma 3.2) Suppose that x̄ ∈ Bs(ȳ) ∩ Bt(z̄) for some ȳ, z̄ ∈ X and s, t ∈ T. Choose r ∈ T such that both s ≼ r and t ≼ r, namely, such that ‖·‖s ≤ Cs‖·‖r and ‖·‖t ≤ Ct‖·‖r for some positive constants Cs and Ct. Suppose that x ∈ Br(x̄). Because ‖x̄ − ȳ‖s = ‖x̄ − z̄‖t = 0 we have ‖x − ȳ‖s ≤ ‖x − x̄‖s ≤ Cs‖x − x̄‖r = 0 and ‖x − z̄‖t ≤ Ct‖x − x̄‖r = 0, and hence x ∈ Bs(ȳ) ∩ Bt(z̄). Thus Br(x̄) ⊆ Bs(ȳ) ∩ Bt(z̄), and we conclude that the collection of sets in (3.3) is a base for a topology on X. Moreover, it is straightforward to show that if x̄ ∈ Bt(ȳ) then Bt(x̄) ⊆ Bt(ȳ), which means the sets (3.3) for t ∈ T are a local base at x̄. □

Lemma 3.20 The family of seminorms in Example 3.3 is temporal, and thus (X, T, S) is a signal space.

Proof This is a special case of Example 3.2: if x ∈ X and t ∈ T then the restriction x|At belongs to the normed space Xt = Lp(At) with ‖x‖t = ‖x · 1At‖p = ‖x|At‖, so we define Rt : X → Xt as Rt x = x|At. Property (b) of Example 3.2 holds because the symmetric difference of two distinct members of C has positive measure. Property (c) holds as well: given s, t ∈ T we choose r ∈ T such that As ∪ At ⊆ Ar, and we let Bs and Bt be the restrictions to As and At, respectively. We have left to show that property (a) holds. Suppose x ∈ X is nonzero; then there exists a measurable set A ⊆ E such that μ(A) > 0 and x is never zero on A. Let {Ati}i∈N be a countable subcover of C. Then E = ∪i∈N Ati, which implies 0 < μ(A) ≤ Σi∈N μ(A ∩ Ati). It follows that μ(A ∩ Ati) > 0 for some i ∈ N, and thus ‖x‖ti = ‖x · 1Ati‖p ≥ ‖x · 1A∩Ati‖p > 0. In other words, x ∉ ker(Rti). □
Example 3.12 (The countable subcover condition in Example 3.3 cannot be removed) Let I be the real interval [0, 1], let E1 be I together with the Lebesgue measure, and let E2 be the set I ∪ {2} together with the counting measure (and having all sets measurable). Let E = E1 × E2 together with the product measure μ such that the measure μ(A) of a measurable set A is the sum of the Lebesgue measures of its horizontal sections. Let T be the set of all nonempty finite subsets of I, and for each t ∈ T define

At = ∪c∈t ({(c, 2)} ∪ ([0, 1] × {c})) .   (3.44)

Then C = {At}t∈T satisfies all of the conditions listed in Example 3.3 except for the countable subcover condition. Let A = I × {2}; then μ(A) = 1, which means 1A is not equivalent to the zero function, but ‖1A‖t = 0 for every t ∈ T, which means S is not separated.

Lemma 3.21 The family of seminorms in Example 3.5 is temporal, and thus (X, T, S) is a signal space.


Proof The family S is separated because if x ∈ X is such that x(τ) ≠ 0 for some τ in the domain, then ‖x‖(0,K) > 0 for any K ∈ C containing τ. Next we show that ≼ is a partial order on T. Suppose (N1, K1), (N2, K2) ∈ T with (N1, K1) ≠ (N2, K2). If K1 ≠ K2, then because K1 and K2 are distinct closed sets there exists x ∈ X that is equal to zero on a neighborhood of one of them (Ki) but takes on a nonzero value on the other one (Kj). Hence ‖x‖(Ni,Ki) = 0 but ‖x‖(Nj,Kj) > 0, and we conclude that these two norms are not equivalent. If K1 = K2 = K but N1 ≠ N2, then choose σ ∈ K and consider the signal x defined as x(τ) = sin(ω(τ1 − σ1) + π/4) for ω > 1, where τ1 and σ1 denote the respective first components of the n-vectors τ and σ. Then we have

|∂^α x(σ)| = (√2/2) ω^i if α = (i, 0, …, 0) for some i ∈ N, and 0 otherwise,   (3.45)

and it follows from (3.6) that

(√2/2) Σ_{i=0}^{N} ω^i ≤ ‖x‖(N,K) ≤ Σ_{i=0}^{N} ω^i   (3.46)

for all N ∈ N. Plugging in N1 and N2 and taking ratios yields

(√2/2) · (ω^{N1+1} − 1)/(ω^{N2+1} − 1) ≤ ‖x‖(N1,K)/‖x‖(N2,K) ≤ √2 · (ω^{N1+1} − 1)/(ω^{N2+1} − 1) .   (3.47)

If the two norms in (3.47) were equivalent, then their ratio would be bounded both from above and below by positive constants not depending on ω, but we see from (3.47) that this is not the case as ω → ∞ when N1 ≠ N2. Finally, it is clear that ≼ directs T, because C is directed by inclusion and if N1 ≤ N2 and K1 ⊆ K2 then ‖·‖(N1,K1) ≤ ‖·‖(N2,K2). □

Lemma 3.22 The signal space in Example 3.5 is controllable to zero. Moreover, it is finely controllable to zero if and only if all sets in C are finite.

Proof We first identify a class of small signals on this space. Let ĝ be a smooth, even, real-valued function on Rⁿ that has support within the box B = [−1, 1]ⁿ and an integral over B equal to (2π)ⁿ. Then its inverse Fourier transform, given by

g(τ) = (1/(2π)ⁿ) ∫_{Rⁿ} ĝ(ω) e^{jω·τ} dω   (3.48)

for τ ∈ Rⁿ, is a real analytic even Schwartz function with g(0) = 1. For each ε ≥ 0 we define the scaled version gε as gε(τ) = g(ετ) for τ ∈ Rⁿ, and we note that g0 ≡ 1. If ε > 0, then by using the differentiation and scaling properties of the Fourier transform we obtain


∂^α(τ^β gε(τ)) --F--> (j^{|α|+|β|} ω^α / ε^{|β|+n}) · ∂^β ĝ(ω/ε)   (3.49)

for any multi-indices α, β ∈ Nⁿ. For each β, let Cβ denote the maximum of |∂^β ĝ| over B, and note that

∫_{εB} |ω^α| dω = 2ⁿ ε^{|α|+n} / ((α1 + 1) ⋯ (αn + 1)) ≤ 2ⁿ ε^{|α|+n} .   (3.50)

Thus the inverse Fourier transform formula yields

|∂^α(τ^β gε(τ))| ≤ Cβ ε^{|α|} / (πⁿ ε^{|β|})   (3.51)

for all τ ∈ Rⁿ and α, β ∈ Nⁿ. If ε ∈ (0, 1) then we can take the supremum over τ and sum over α to obtain

Σ_{α∈Nⁿ} sup_{τ∈Rⁿ} |∂^α(τ^β gε(τ))| ≤ Cβ / (πⁿ (1 − ε)ⁿ ε^{|β|})   (3.52)

for all β ∈ Nⁿ. It follows from (3.6) and (3.4) that gε h|Ω (that is, the product gε h restricted to the domain) is a small signal for every polynomial signal h and every parameter ε ∈ (0, 1). Next fix any x ∈ X, N ∈ N, K ∈ C, and γ > 0. Our goal is to find a polynomial h and a parameter ε ∈ (0, 1) such that ‖x − gε h‖(N,K) < γ. Because K is compact, we can extend x to all of Rⁿ while preserving its value and the values of all of its partial derivatives on K; in other words, there exists z ∈ C∞(Rⁿ) such that z ≡ x on a neighborhood of K. The seminorm (3.6) extends to Rⁿ as well, so it suffices to find h and ε such that ‖z − gε h‖(N,K) < γ. It follows from [33, Theorem 4] that there exists a sequence {hk}k∈N of scaled and shifted Bernstein polynomials such that

lim_{k→∞} max_{τ∈K} |∂^α z(τ) − ∂^α hk(τ)| = 0   (3.53)

for each α ∈ Nⁿ. Therefore we can choose k sufficiently large so that ‖z − hk‖(N,K) < γ/2. Next, it follows from [9, Proposition 2.9] that the mapping f : R≥0 → R≥0 given by f(ε) = ‖hk − gε hk‖(N,K) is continuous, so because f(0) = 0 there exists ε ∈ (0, 1) such that ‖hk − gε hk‖(N,K) < γ/2. Thus ‖z − gε hk‖(N,K) < γ, and we conclude that X is controllable to zero. Next suppose all sets in C are finite, and choose x ∈ X, N ∈ N, and K ∈ C as before. Because g defined above is real analytic and g(0) = 1, the set Eτ = {ε ∈ (0, 1) : gε(τ) = 0} is discrete for any τ ∈ Rⁿ. The union ∪τ∈K Eτ is also discrete because K is finite, which means there exists ε ∈ (0, 1) not in this union, i.e., such that gε is nonzero on K. It follows from [18, Theorem 19] that there exists a Hermite interpolating polynomial h that agrees with x/gε and all of its derivatives up to order


N on K. Hence ‖x − gε h‖(N,K) = 0, and we conclude that X is finely controllable to zero. Conversely, suppose C contains an infinite set K; then because K is compact it has an accumulation point σ ∈ K. Let Ωσ denote the connected component of the domain that contains σ. Let x ∈ X be the real analytic signal given by x(τ) = e^{τ1} for τ = (τ1, …, τn), which satisfies x(σ) > 0 and ‖x‖(N,K) ≥ (N + 1)x(σ) for all N ∈ N. Now suppose y ∈ Xs is such that ‖x − y‖(0,K) = 0; then y is real analytic and agrees with x on K, which means y ≡ x on Ωσ. Hence ‖y‖(N,K) ≥ (N + 1)x(σ) for all N ∈ N, but this contradicts the assumption that y is small. We conclude that X is not finely controllable to zero. □

Lemma 3.23 If (A, B) is stabilizable, (A, C) is detectable, and [E; D] is right-invertible, then the system in Example 3.7 is uniformly finely controllable to zero (with zero controllability bias).

Proof Fix a signal x̄ in this system, and let ξ be such that ξ̇t = Aξt + BEx̄t, ξ0 = 0, and 0 = Cξt + Dx̄t for almost all t ∈ T. We proceed with an idea from [16]. Let the matrix L be such that A + LC is Hurwitz, and consider the observer given by the equation

żt = Azt + BEx̄t + L(Czt + Dx̄t) ,  z0 = 0 ,

(3.54)

where z represents the observer state. The error e = ξ − z satisfies ėt = (A + LC)et with e0 = 0, and therefore et = 0 for all t ∈ T. It follows that

ξ̇t = (A + LC)ξt + (BE + LD)x̄t ,  ξ0 = 0   (3.55)

for almost all t ∈ T, which means

ξt = ∫₀ᵗ e^{(A+LC)(t−τ)} (BE + LD) x̄τ dτ   (3.56)

for all t ∈ T. Now because A + LC is Hurwitz, there exists a constant c independent of t such that |ξt| ≤ c‖x̄‖t for all t ∈ T. Next, let the matrix K be such that A + BK is Hurwitz and fix t ∈ T. Let F be a right inverse of [E; D], and define the signal x̂ as

x̂τ = x̄τ when τ ≤ t,  and  x̂τ = F [K; −C] e^{(A+BK)(τ−t)} ξt when τ > t .   (3.57)

It is straightforward to verify that x̂ belongs to the system. Because A + BK is Hurwitz, there exists a constant κ independent of t such that the L2 norm of x̂ on the interval [t, ∞) is bounded from above by κ|ξt|. It follows that ‖x̂‖s ≤ (1 + cκ)‖x̄‖t, and we conclude that the system is uniformly finely controllable to zero with controllability constant 1 + cκ. □

Proof (Lemma 3.3) The result holds trivially when Ψ is either empty or not λ-stable, so we assume that Ψ is nonempty and λ-stable. Let β be such that g_β^λ(Ψ) < ∞, and


suppose (u, y) ∈ Ψ. Then (cu, cy) ∈ Ψ for all c ∈ F, and thus (3.18) yields

‖y‖t ≤ g_β^λ(Ψ)·‖u‖λ(t) + β/|c|

(3.58)

for all nonzero c ∈ F and all t ∈ T. Taking |c| → ∞ gives ‖y‖t ≤ g_β^λ(Ψ)·‖u‖λ(t) for all t ∈ T and (u, y) ∈ Ψ. Thus from the definition (3.17) we obtain g_0^λ(Ψ) ≤ g_β^λ(Ψ). □

Proof (Lemma 3.4) Suppose Ψ is a λ-stable linear IO system, and let β ≥ 0 be such that g_β^λ(Ψ) < ∞. We first show that Ψ is λ-flhc. Pick any (ū, ȳ) ∈ Ψ and t ∈ T, and let u ∈ dom(Ψ) be such that ‖u − ū‖λ(t) = 0. Choose any y ∈ Ψ[u]; then by the linearity of Ψ we have (cu − cū, cy − cȳ) ∈ Ψ for any c ∈ F, and we conclude from (3.18) that |c|·‖y − ȳ‖t ≤ β for all c ∈ F. Hence ‖y − ȳ‖t = 0. The same argument with u = ū shows that Ψ is univalent. □

Proof (Lemma 3.5) Suppose Ψ is λ1-stable and Φ is λ2-stable, and let β be large enough that g_β^λ1(Ψ) and g_β^λ2(Φ) are both finite. Choose (u, y) ∈ Φ ∘ Ψ; then from (3.20) there exists x ∈ X such that (u, x) ∈ Ψ and (x, y) ∈ Φ. It follows from (3.18) that

‖y‖t ≤ g_β^λ2(Φ)·‖x‖λ2(t) + β ≤ g_β^λ2(Φ)·g_β^λ1(Ψ)·‖u‖λ1(λ2(t)) + g_β^λ2(Φ)β + β

(3.59)

for all t ∈ T. If we define β̄ = g_β^λ2(Φ)β + β, then (3.59) implies

g_β̄^{λ1∘λ2}(Φ ∘ Ψ) ≤ g_β^λ2(Φ)·g_β^λ1(Ψ) .

(3.60)

Thus Φ ∘ Ψ is (λ1 ∘ λ2)-stable, and because (3.60) holds for all β large enough we obtain g^{λ1∘λ2}(Φ ∘ Ψ) ≤ g^λ2(Φ)·g^λ1(Ψ). Setting β = 0 throughout yields the result for g0. □

Proof (Lemma 3.7) Pick any (ū, ȳ) ∈ Ψ, t ∈ T, and ε > 0, and let δ > 0 be as in Definition 3.5. We can choose δ ≤ ε without loss of generality. Because dom(Ψ) is uniformly controllable to zero, there exists û ∈ Bλ(t),δ(ū) ∩ dom(Ψ)s such that ‖û‖s ≤ K‖ū‖λ(t) + b + ε. Because Ψ is λ-lhc there exists ŷ0 ∈ Bt,ε(ȳ) ∩ Ψ[û]. The same statements hold when Ψ is only λ-wlhc but dom(Ψ) is uniformly finely controllable to zero if we take δ = 0. In either case, because Ψ is minimally stable there exists ŷ ∈ Bt,ε(ȳ) ∩ Ψ[û]s. Hence

‖ȳ‖t ≤ ‖ŷ‖t + ‖ŷ − ȳ‖t ≤ gβ(Ψs)·‖û‖s + β + ε ≤ Kgβ(Ψs)·‖ū‖λ(t) + β + bgβ(Ψs) + εgβ(Ψs) + ε .

(3.61)

This holds for every ε > 0, which means ‖ȳ‖t ≤ Kgβ(Ψs)·‖ū‖λ(t) + β̄. Because (ū, ȳ) ∈ Ψ and t ∈ T were arbitrary, we conclude that g_β̄^λ(Ψ) ≤ Kgβ(Ψs). □


Proof (Lemma 3.8) Pick any (ū, ȳ) ∈ Ψ, t ∈ T, and ε > 0, and let Y = Bt,ε(ȳ). Because Ψ is lhc, there exists an open neighborhood U of ū such that U ∩ dom(Ψ) ⊆ Ψ⁻¹[Y]. Because dom(Ψ) is controllable to zero, there exists u ∈ Is ∩ U ∩ dom(Ψ) ∩ Bλ(t),ε(ū). The same statements hold when Ψ is only wlhc but dom(Ψ) is finely controllable to zero if we take U to be finely open instead. In either case we have u ∈ Ψ⁻¹[Y], namely, there exists ŷ ∈ Y ∩ Ψ[u], and because Ψ is minimally stable there exists y ∈ Y ∩ Ψ[u]s. Pick any β ≥ 0. If g_β^λ(Ψs) = ∞ then also g_β^λ(Ψ) = ∞, so suppose g_β^λ(Ψs) < ∞. Then (3.18) implies ‖y‖t ≤ g_β^λ(Ψs)·‖u‖λ(t) + β, and thus

‖ȳ‖t ≤ ‖y‖t + ‖y − ȳ‖t ≤ g_β^λ(Ψs)·‖u‖λ(t) + β + ε ≤ g_β^λ(Ψs)·‖ū‖λ(t) + g_β^λ(Ψs)·‖u − ū‖λ(t) + β + ε ≤ g_β^λ(Ψs)·‖ū‖λ(t) + εg_β^λ(Ψs) + β + ε .

(3.62)

¯ (t) + β. Because this is This holds for all ε > 0 which means  y¯ t  g β ( s) · u true for all (u, ¯ y¯ ) ∈ and t ∈ T , we conclude from (3.17) that g β ( )  g β ( s).  Proof (Lemma 3.9) Because S is separated we see that q(x, y) = 0 if and only if x = y, so we have left to prove that the triangle inequality holds. For any x, y, z ∈ X, ε > 0, and t ∈ T we have x − yt ε + yt y − zt x − zt  + · . ε + xt ε + xt ε + xt ε + yt

(3.63)

We also have

(ε + ‖y‖t)/(ε + ‖x‖t) ≤ (ε + ‖x − y‖t + ‖x‖t)/(ε + ‖x‖t) = 1 + ‖x − y‖t/(ε + ‖x‖t) .   (3.64)

We add 1 to both sides of (3.63) and use (3.64) to obtain

1 + ‖x − z‖t/(ε + ‖x‖t) ≤ (1 + ‖x − y‖t/(ε + ‖x‖t)) · (1 + ‖y − z‖t/(ε + ‖y‖t)) .

(3.65)

We take the supremum of both sides over ε and t to obtain

1 + d(x, z) ≤ (1 + d(x, y)) · (1 + d(y, z)) .

(3.66)

Finally, we take the logarithm of both sides of (3.66) to obtain q(x, z) ≤ q(x, y) + q(y, z), as desired. □

Proof (Lemma 3.10) It follows from (3.23) that

‖x − y‖t ≤ d(x, y)·‖x‖t

(3.67)


for all t ∈ T. Hence

‖x‖t ≤ ‖x − y‖t + ‖y‖t ≤ d(x, y)·‖x‖t + ‖y‖t

(3.68)

and therefore

(1 − d(x, y))·‖x‖t ≤ ‖y‖t

(3.69)

for all t ∈ T. Because 1 − d(x, y) > 0, it follows from (3.67) and (3.69) that

‖x − y‖t ≤ (d(x, y)/(1 − d(x, y)))·‖y‖t ≤ (d(x, y)/(1 − d(x, y)))·(ε + ‖y‖t)

(3.70)

for all ε > 0 and t ∈ T, and the result follows. □
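The multiplicative triangle inequality (3.66), and hence the metric property of q = log(1 + d), can be spot-checked numerically. The following sketch is an assumption-laden illustration (prefix-sup seminorms over finite sequences and a fixed ε > 0 stand in for the temporal seminorm family; all names are hypothetical):

```python
import math
import random

def seminorm(x, t):
    """Truncation seminorm: sup of |x| over the first t samples."""
    return max((abs(v) for v in x[:t]), default=0.0)

def d(x, y, eps=1e-3):
    """Gap-style divergence: sup over t of ||x - y||_t / (eps + ||x||_t)."""
    T = len(x)
    diff = [a - b for a, b in zip(x, y)]
    return max(seminorm(diff, t) / (eps + seminorm(x, t)) for t in range(1, T + 1))

def q(x, y):
    return math.log(1.0 + d(x, y))

random.seed(0)
for _ in range(200):
    x, y, z = ([random.uniform(-2, 2) for _ in range(5)] for _ in range(3))
    # Multiplicative form (3.66) and the resulting triangle inequality for q:
    assert 1.0 + d(x, z) <= (1.0 + d(x, y)) * (1.0 + d(y, z)) + 1e-12
    assert q(x, z) <= q(x, y) + q(y, z) + 1e-12
print("triangle inequality verified on 200 random triples")
```

Because the proof's inequalities (3.63)–(3.65) hold pointwise in t for any fixed ε > 0, the check succeeds for every random triple, not just generically.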

Proof (Lemma 3.11) When Δ = ∅ we have g(Δ) = g0(Δ) = 0 and thus the result holds. Hence we assume that Δ is nonempty. Let β ≥ 0 be large enough that d̂(Δ, Ψ) < δ, where δ = (2gβ(Ψ) + 2)⁻¹. Let (u, y) ∈ Δ; then there exists (ū, ȳ) ∈ Ψ such that

‖u − ū‖t + ‖y − ȳ‖t ≤ δ‖u‖t + δ‖y‖t

(3.71)

for all t ∈ T. It follows that

‖y‖t ≤ ‖ȳ‖t + ‖y − ȳ‖t ≤ gβ(Ψ)·‖ū‖t + β + ‖y − ȳ‖t ≤ gβ(Ψ)·‖u‖t + (gβ(Ψ) + 1)(‖u − ū‖t + ‖y − ȳ‖t) + β ≤ (gβ(Ψ) + ½)·‖u‖t + ½‖y‖t + β

(3.72)

for all t ∈ T, and solving for ‖y‖t gives ‖y‖t ≤ (2gβ(Ψ) + 1)·‖u‖t + 2β for all t ∈ T. This holds for all (u, y) ∈ Δ and t ∈ T, and we conclude that g2β(Δ) ≤ 2gβ(Ψ) + 1. Because this holds for all β sufficiently large, we also have g(Δ) ≤ 2g(Ψ) + 1. The same argument with β = 0 yields the result for g0. □

Proof (Lemma 3.12) Let β be large enough such that both gβ(Ψs) < ∞ and d̂(Δ, Ψ) < ρ⁻¹, where ρ = μgβ(Ψs) + μL. Let (u, y) ∈ Δ be such that u ∈ Is, and let Y ⊆ O be any open neighborhood of y. By assumption Gμ is dense in Δ[u], which means there exists y̆ ∈ Y ∩ Δ[u] ∩ Gμ. If we choose d such that d̂(Δ, Ψ) < d < ρ⁻¹, then by the definition of d̂ there exists (ū, ȳ) ∈ Ψ such that

‖u − ū‖s + ‖y̆ − ȳ‖s ≤ d‖u‖s + d‖y̆‖s

(3.73)

for all s ∈ T. Fix t ∈ T and ε > 0, and let δ > 0 be as in Definition 3.5. We can choose δ ≤ ε without loss of generality. Because dom(Ψ) is controllable to zero, there exists û ∈ Bλ(t),δ(ū) ∩ dom(Ψ)s, and because Ψ is λ-lhc there exists ŷ0 ∈ Bt,ε(ȳ) ∩ Ψ[û].


The same statements hold when Ψ is only λ-wlhc but dom(Ψ) is finely controllable to zero if we take δ = 0. In either case, because Ψ is minimally stable there exists ŷ ∈ Bt,ε(ȳ) ∩ Ψ[û]s. We are now ready to bound ‖y̆‖λ(t). Because y̆ ∈ Gμ, there exists χ ≥ 0 independent of t such that

‖y̆‖λ(t) ≤ μ‖y̆‖t + χ ≤ μ‖ŷ‖t + μ‖y̆ − ȳ‖t + μ‖ŷ − ȳ‖t + χ
≤ μgβ(Ψs)·‖û‖λ(t) + μL‖y̆ − ȳ‖λ(t) + μ(ε + β) + χ
≤ μgβ(Ψs)·‖u‖λ(t) + μgβ(Ψs)·‖û − ū‖λ(t) + μgβ(Ψs)·‖u − ū‖λ(t) + μL‖y̆ − ȳ‖λ(t) + μ(ε + β) + χ
≤ (μgβ(Ψs) + ρd)·‖u‖λ(t) + ρd‖y̆‖λ(t) + μ(εgβ(Ψs) + ε + β) + χ ,   (3.74)

where we have used (3.73) with s = λ(t). Because ρd < 1 we can solve for ‖y̆‖λ(t):

‖y̆‖λ(t) ≤ (1 − ρd)⁻¹ ((μgβ(Ψs) + ρd)·‖u‖λ(t) + μ(εgβ(Ψs) + ε + β) + χ) .   (3.75)

Now ‖y̆‖t ≤ L‖y̆‖λ(t) and ‖u‖λ(t) ≤ ‖u‖s, so we have

‖y̆‖t ≤ L(1 − ρd)⁻¹ ((μgβ(Ψs) + ρd)·‖u‖s + μ(εgβ(Ψs) + ε + β) + χ) .   (3.76)

The right-hand side of (3.76) is independent of t, and it follows that y̆ ∈ Os. Thus for every u ∈ dom(Δ)s, every y ∈ Δ[u], and every open neighborhood Y of y we have found y̆ ∈ Y ∩ Δ[u]s. In other words, Δ is minimally stable. □

Proof (Theorem 3.1) Let η = (μγ + μL)⁻¹, where L is a look-ahead constant for λ. Because the interval [0, 1] is compact, it follows from (i) that the map α ↦ Ψα is uniformly continuous, so there exists δ > 0 such that |α − ᾱ| < δ implies d(Ψα, Ψᾱ) < η. Choose an integer N > 1/δ and define αi = i/N for i = 0, 1, …, N. Note that α0 = 0, so Ψα0 is minimally stable from (v). It follows from (ii)–(iv) and Lemma 3.12 that the minimal stability of Ψαi implies the minimal stability of Ψαi+1, so by induction we conclude that Ψ1 is minimally stable. We then apply Lemma 3.8. □

Proof (Theorem 3.2) We follow the proof of Theorem 3.1 but with η = (μγK + μL)⁻¹. For the induction step, it follows from (ii), (iii')–(iv'), and Lemmas 3.7 and 3.12 that the minimal stability of Ψαi implies the minimal stability of Ψαi+1. We conclude that Ψ1 is minimally stable, and then we apply Lemma 3.7. □

Lemma 3.24 The mapping α ↦ Ψα in Example 3.9 is uniformly continuous.

Proof Note that Ψα = [Σ, Δα], where Σ and Δα are the systems


Σ = {(u, y) ∈ I ⊕ O : y1 = u1 and y2 = u2 + h ∗ u2}   (3.77)
Δα = {(y1, y2) ∈ O : y1 = (1 − α)y2} .   (3.78)

We first show that the mapping α ↦ Δα is uniformly continuous. Pick α, ᾱ ∈ [0, 1], let (y1, y2) ∈ Δα, and define ȳ2 = y2 and ȳ1 = (1 − ᾱ)y2. Then (ȳ1, ȳ2) ∈ Δᾱ and ‖y1 − ȳ1‖t = |α − ᾱ|·‖y2‖t for all t ∈ T, and it follows that d̂(Δα, Δᾱ) ≤ |α − ᾱ|. By reversing the roles of α and ᾱ we obtain d(Δα, Δᾱ) ≤ |α − ᾱ|. The uniform continuity of the mapping α ↦ Ψα then follows from Lemmas 3.15 and 3.16. □

Proof (Lemma 3.13) Choose γ > 0 and let G1, G2 ⊆ O be such that d(G1, G2) < γ̄, where γ̄ = γ/2. If either Gi is empty, then both are empty and thus d(G1⁺, G2⁺) = d(∅, ∅) = 0 < γ. Hence we assume that the Gi are both nonempty, which implies that the Gi⁺ are also both nonempty. Fix (ū, ȳ) ∈ G1⁺ and define w̄ = ū + ȳ, so that w̄ ∈ G1. Because d̂(G1, G2) < γ̄, there exists w ∈ G2 such that ‖w − w̄‖t ≤ γ̄‖w̄‖t for all t ∈ T. Define u = w − ȳ so that (u, ȳ) ∈ G2⁺ and ‖u − ū‖t = ‖w − w̄‖t for all t ∈ T. Then we have ‖(u, ȳ) − (ū, ȳ)‖t ≤ γ̄‖(ū, ȳ)‖t for all t ∈ T, which means d̂(G1⁺, G2⁺) ≤ γ̄ < γ. Reversing the roles of the Gi gives d̂(G2⁺, G1⁺) < γ, and we conclude that d(G1⁺, G2⁺) < γ. □

Proof (Lemma 3.14) We first suppose that G and O are controllable to zero. Let (ū, ȳ) ∈ G⁺, so that w̄ ∈ G, where w̄ = ū + ȳ. Fix t ∈ T and ε > 0, and let δ = ε/4. Because G and O are both controllable to zero, there exist w ∈ Gs and u ∈ Os such that ‖w − w̄‖t ≤ δ and ‖u − ū‖t ≤ δ. Define y = w − u, so that (u, y) ∈ (G⁺)s. Then ‖y − ȳ‖t ≤ ‖w − w̄‖t + ‖u − ū‖t ≤ 2δ, and it follows that ‖(u, y) − (ū, ȳ)‖t ≤ 3δ < ε. Thus (u, y) ∈ Bt,ε(ū, ȳ), and because t ∈ T and ε > 0 were arbitrary we conclude that G⁺ is controllable to zero. If G and O are both finely controllable to zero, then a similar argument with δ = 0 shows that G⁺ is finely controllable to zero as well. □

Lemma 3.25 The mapping α ↦ Ψα in Example 3.10 is uniformly continuous.
Proof We first show that the mapping α ↦ Δα is uniformly continuous. Pick α, ᾱ ∈ [0, 1], let (y1, y2) ∈ Δα, and define ȳ2 = y2 and

ȳ1t = (1 − ᾱ)y2t when 0 < y1t < (1 − ᾱ)y2t, and ȳ1t = y1t otherwise,

(3.79)

for all t ∈ T. Then (ȳ1, ȳ2) ∈ Δᾱ and

|y1t − ȳ1t| = (1 − ᾱ)y2t − y1t when 0 < y1t < (1 − ᾱ)y2t, and |y1t − ȳ1t| = 0 otherwise,

(3.80)

for all t ∈ T. It follows from (3.37) that

(1 − ᾱ)y2t − y1t = (1 − α)y2t − y1t + (α − ᾱ)y2t ≤ (α − ᾱ)y2t

(3.81)


for all t ∈ T such that y1t > 0, and thus from (3.80) we obtain |y1t − ȳ1t| ≤ |α − ᾱ|·|y2t| for all t ∈ T. Therefore ‖y1 − ȳ1‖t + ‖y2 − ȳ2‖t ≤ |α − ᾱ|·‖y2‖t for all t ∈ T, and it follows that d̂(Δα, Δᾱ) ≤ |α − ᾱ|. By reversing the roles of α and ᾱ we obtain d(Δα, Δᾱ) ≤ |α − ᾱ|. The uniform continuity of the mapping α ↦ Ψα then follows from Lemmas 3.13 and 3.16 together with the fact that G⁺ is 1-regular. □

Lemma 3.26 The mapping α ↦ Ψα in Example 3.11 is uniformly continuous.

Proof We first show that the mapping α ↦ Δα is uniformly continuous. Pick α, ᾱ ∈ [0, 1], let (y1, y2) ∈ Δα, and define ȳ1 = y1 and

ȳ2t = (1 − ᾱ)y1t · sech(y1,t+1)

(3.82)

for all t ∈ T. Then (ȳ1, ȳ2) ∈ Δᾱ and

|y2t − ȳ2t| = |α − ᾱ|·|y1t|·sech(y1,t+1) ≤ |α − ᾱ|·|y1t|

(3.83)

for all t ∈ T. The rest of the proof is similar to the end of the proof of Lemma 3.25. □

Proof (Lemma 3.15) Because id ⊆ Σ ∘ Φ we see that Φ must be an operator. Given (u, y) ∈ Σ, let Φ' = Φ + (y, u). Pick ȳ ∈ O and let ū = u + Φ[ȳ − y], so that (ȳ, ū) ∈ Φ'. Now id ⊆ Σ ∘ Φ, so there exists x such that (ȳ − y, x) ∈ Φ and (x, ȳ − y) ∈ Σ. But (ȳ − y, ū − u) ∈ Φ and Φ is univalent, so x = ū − u. Hence (ū − u, ȳ − y) ∈ Σ and (u, y) ∈ Σ, and because Σ is linear we have (ū, ȳ) ∈ Σ. This means (ȳ, ȳ) ∈ Σ ∘ Φ'. □

Proof (Lemma 3.16) Fix γ > 0 and let Σ1, Σ2 ∈ Regr(I, O) and Δ1, Δ2 ⊆ O be such that both d(Σ1, Σ2) < γ̄ and d(Δ1, Δ2) < γ̄, where γ̄ = γ/(2r + 3). If either Σi is empty, then both are empty and thus d([Σ1, Δ1], [Σ2, Δ2]) = d(∅, ∅) = 0 < γ. Likewise, if either Δi is empty then again d([Σ1, Δ1], [Σ2, Δ2]) = 0 < γ. Therefore we assume that the Σi and Δi are all nonempty, which means [Σ1, Δ1] and [Σ2, Δ2] are also nonempty. Fix (ū, ȳ) ∈ [Σ1, Δ1]. Because d̂(Σ1, Σ2) < γ̄ there exists (u, y) ∈ Σ2 such that

‖u − ū‖t + ‖y − ȳ‖t ≤ γ̄‖ū‖t + γ̄‖ȳ‖t

(3.84)

 1 , 2 ) < γ¯ there exists yˆ ∈ 2 such that for all t ∈ T . Likewise, because d(  yˆ − y¯ t  γ¯  y¯ t

(3.85)

for all t ∈ T. By assumption Σ₂ ∈ Reg_r(I, O), and it follows from Definition 3.7 that there exists an IO system (O, I, Λ₂) such that id ⊆ Σ₂ ∘ Λ₂ and g₀(Λ₂ − (y, u)) ≤ r. Therefore (ŷ, ŷ) ∈ Σ₂ ∘ Λ₂, which means there exists û ∈ I such that (ŷ, û) ∈ Λ₂ and (û, ŷ) ∈ Σ₂, and in particular (û, ŷ) ∈ [Σ₂, Φ₂]. We also have (ŷ − y, û − u) ∈ Λ₂ − (y, u), and it follows that

80

R. A. Freeman

‖û − u‖_t ≤ r‖ŷ − y‖_t ≤ r‖ŷ − ȳ‖_t + r‖y − ȳ‖_t ≤ γ̄ r‖ȳ‖_t + r‖y − ȳ‖_t    (3.86)

for all t ∈ T. Therefore

‖û − ū‖_t + ‖ŷ − ȳ‖_t ≤ ‖û − u‖_t + ‖u − ū‖_t + ‖ŷ − ȳ‖_t
  ≤ γ̄(r + 1)‖ȳ‖_t + ‖u − ū‖_t + r‖y − ȳ‖_t
  ≤ 2γ̄(r + 1)(‖ū‖_t + ‖ȳ‖_t)    (3.87)

for all t ∈ T. Thus d⃗([Σ₁, Φ₁], [Σ₂, Φ₂]) ≤ 2γ̄(r + 1) < γ, and reversing the roles of the indices 1 and 2 gives d⃗([Σ₂, Φ₂], [Σ₁, Φ₁]) < γ. Therefore d([Σ₁, Φ₁], [Σ₂, Φ₂]) < γ. □

Proof (Lemma 3.17) Suppose Σ and Φ are both controllable to zero. Let (ū, ȳ) ∈ [Σ, Φ], let r ≥ 0 be such that Σ is r-regular, fix t ∈ T and ε > 0, and let δ = ε/(2r + 3). Because Σ is controllable to zero there exists (u, y) ∈ Σ_s such that ‖u − ū‖_t + ‖y − ȳ‖_t ≤ δ. Likewise, because Φ is controllable to zero there exists ŷ ∈ Φ_s such that ‖ŷ − ȳ‖_t ≤ δ. Let Λ be as in Definition 3.7; then because id ⊆ Σ ∘ Λ there exists û ∈ Λ[ŷ] such that (û, ŷ) ∈ Σ, and because g₀(Λ − (y, u)) ≤ r we have

‖û − u‖_s ≤ r‖ŷ − y‖_s    (3.88)

for all s ∈ T. Taking the supremum of both sides of (3.88) over s gives ‖û − u‖_s ≤ r‖ŷ − y‖_s. It follows that û ∈ I_s and thus (û, ŷ) ∈ [Σ, Φ]_s. Using (3.88) with s = t gives

‖(û, ŷ) − (ū, ȳ)‖_t ≤ ‖u − ū‖_t + ‖û − u‖_t + ‖ŷ − ȳ‖_t
  ≤ ‖u − ū‖_t + r‖ŷ − y‖_t + ‖ŷ − ȳ‖_t
  ≤ ‖u − ū‖_t + (r + 1)‖ŷ − ȳ‖_t + r‖y − ȳ‖_t ≤ 2δ(r + 1) < ε.    (3.89)

Thus (û, ŷ) ∈ B_{t,ε}(ū, ȳ), and because t ∈ T and ε > 0 were arbitrary we conclude that [Σ, Φ] is controllable to zero. If Σ and Φ are both finely controllable to zero, then a similar argument with δ = 0 shows that [Σ, Φ] is finely controllable to zero. □

Proof (Lemma 3.18) We let ‖·‖ denote the operator norms for A and B. Because ⟨·, ·⟩ is bounded, there exists M ≥ 0 such that ⟨Ax, By⟩ ≤ M‖A‖·‖B‖·‖x‖_s·‖y‖_s for all x, y ∈ X_s. Therefore


σ(y) − σ(x) = ⟨Ay, By⟩ − ⟨Ax, Bx⟩
  = ⟨A(y − x), By⟩ + ⟨Ax, B(y − x)⟩
  = ⟨A(y − x), B(y − x)⟩ + ⟨A(y − x), Bx⟩ + ⟨Ax, B(y − x)⟩
  ≤ M‖A‖·‖B‖·‖x − y‖²_s + 2M‖A‖·‖B‖·‖x‖_s·‖x − y‖_s
  ≤ M‖A‖·‖B‖·‖x − y‖²_s + (1/ε)M²‖A‖²·‖B‖²·‖x − y‖²_s + ε‖x‖²_s,    (3.90)

from which we obtain (3.41). □

Proof (Lemma 3.19) From (3.41) there exists C > 0 such that

σ(y) ≤ σ(u + y) + ε‖u + y‖²_s + C‖u‖²_s    (3.91)

for any u, y ∈ O_s. Then from (i)–(ii) we obtain

−d ≤ −ε‖u + y‖²_s + d + C‖u‖²_s    (3.92)

for any (u, y) ∈ [G⁺, Φ]_s. Therefore

‖y‖²_s ≤ 2‖u + y‖²_s + 2‖u‖²_s ≤ 2(1 + C/ε)‖u‖²_s + 4d/ε    (3.93)

for any (u, y) ∈ [G⁺, Φ]_s, and the result follows. □

References

1. Altshuller, D.: Delay-integral-quadratic constraints and stability multipliers for systems with MIMO nonlinearities. IEEE Trans. Automat. Contr. 56(4), 738–747 (2011)
2. Altshuller, D.: Frequency Domain Criteria for Absolute Stability. Lecture Notes in Control and Information Sciences, vol. 432. Springer, Berlin (2013)
3. Aubin, J.P., Frankowska, H.: Set-Valued Analysis. Birkhäuser, Boston (1990)
4. Carrasco, J., Seiler, P.: Conditions for the equivalence between IQC and graph separation stability results. Int. J. Control 92(12), 2899–2906 (2019)
5. Cobzaş, Ş.: Functional Analysis in Asymmetric Normed Spaces. Frontiers in Mathematics. Birkhäuser (2013)
6. de Does, J., Schumacher, J.M.: Interpretations of the gap topology: a survey. Kybernetika 30(2), 105–120 (1994)
7. Desoer, C.A., Vidyasagar, M.: Feedback Systems: Input-Output Properties. Academic, New York (1975)
8. Fetzer, M.: From classical absolute stability tests towards a comprehensive robustness analysis. Ph.D. thesis, Universität Stuttgart (2017)
9. Freeman, R.A., Kokotović, P.V.: Robust Nonlinear Control Design. Modern Birkhäuser Classics. Birkhäuser (2008)
10. Georgiou, T.T., Smith, M.C.: Robustness analysis of nonlinear feedback systems: An input-output approach. IEEE Trans. Automat. Contr. 42(9), 1200–1221 (1997)
11. Greshnov, A.V.: Distance functions between sets in (q₁, q₂)-quasimetric spaces. Sib. Math. J. 61(3), 417–425 (2020)
12. Hill, D.J., Moylan, P.J.: Dissipative dynamical systems: Basic input-output and state properties. J. Franklin Institute 309(5), 327–357 (1980)
13. Jiang, Z.P., Teel, A.R., Praly, L.: Small-gain theorem for ISS systems and applications. Math. Control Signals Systems 7, 95–120 (1994)
14. Jones, P.W.: Quasiconformal mappings and extendability of functions in Sobolev spaces. Acta Math. 147, 71–88 (1981)
15. Khalil, H.K.: Nonlinear Systems, 3rd edn. Prentice Hall, Upper Saddle River (2002)
16. Krichman, M., Sontag, E.D., Wang, Y.: Input-output-to-state stability. SIAM J. Control Optim. 39(6), 1874–1928 (2001)
17. Liberzon, M.R.: Essays on the absolute stability theory. Autom. Remote Control 67(10), 1610–1644 (2006)
18. Lorentz, R.A.: Multivariate Hermite interpolation by algebraic polynomials: A survey. J. Comput. Appl. Math. 122, 167–201 (2000)
19. Lur'e, A.I., Postnikov, V.N.: On the theory of stability of control systems. J. Appl. Math. Mech. 8(3), 246–248 (1944)
20. Megretski, A.: KYP lemma for non-strict inequalities and the associated minimax theorem (2010)
21. Megretski, A., Rantzer, A.: System analysis via integral quadratic constraints. IEEE Trans. Automat. Contr. 42(6), 819–830 (1997)
22. O'Shea, R.P.: An improved frequency time domain stability criterion for autonomous continuous systems. IEEE Trans. Automat. Contr. 12(6), 725–731 (1967)
23. Rantzer, A., Megretski, A.: System analysis via integral quadratic constraints: Part II. Technical report ISRN LUTFD2/TFRT–7559–SE, Department of Automatic Control, Lund Institute of Technology, Sweden (1997)
24. Safonov, M.G.: Stability and Robustness of Multivariable Feedback Systems. MIT Press, Cambridge (1980)
25. Shiriaev, A.S.: Some remarks on “System analysis via integral quadratic constraints”. IEEE Trans. Automat. Contr. 45(8), 1527–1532 (2000)
26. Sontag, E.D.: Smooth stabilization implies coprime factorization. IEEE Trans. Automat. Contr. 34(4), 435–443 (1989)
27. Teel, A.R.: On graphs, conic relations, and input-output stability of nonlinear feedback systems. IEEE Trans. Automat. Contr. 41(5), 702–709 (1996)
28. Willems, J.C.: The Analysis of Feedback Systems. MIT Press, Cambridge (1971)
29. Willems, J.C.: Paradigms and puzzles in the theory of dynamical systems. IEEE Trans. Automat. Contr. 36(3), 259–294 (1991)
30. Willems, J.C., Takaba, K.: Dissipativity and stability of interconnections. Int. J. Robust Nonlinear Contr. 17(5–6), 563–586 (2007)
31. Yakubovich, V.A.: Necessity in quadratic criterion for absolute stability. Int. J. Robust Nonlinear Contr. 10(11–12), 889–907 (2000)
32. Yakubovich, V.A.: Popov's method and its subsequent development. Eur. J. Control 8(3), 200–208 (2002)
33. Veretennikov, A.Y., Veretennikova, E.V.: On partial derivatives of multivariate Bernstein polynomials. Siberian Adv. Math. 26(4), 294–305 (2016)
34. Zames, G.: On the input-output stability of time-varying nonlinear feedback systems. Part I: Conditions using concepts of loop gain, conicity, and positivity. IEEE Trans. Automat. Contr. 11, 228–238 (1966)
35. Zames, G., Falb, P.L.: Stability conditions for systems with monotone and slope-restricted nonlinearities. SIAM J. Control 6(1), 89–108 (1968)

Chapter 4

Design of Heterogeneous Multi-agent System for Distributed Computation

Jin Gyu Lee and Hyungbo Shim

Abstract A group behavior of a heterogeneous multi-agent system is studied which obeys an “average of individual vector fields” under strong couplings among the agents. Under stability of the averaged dynamics (without asking stability of the individual agents), the behavior of the heterogeneous multi-agent system can be estimated by the solution to the averaged dynamics. The idea, then, is to “design” each individual agent's dynamics such that the averaged dynamics performs the desired task. A few applications are discussed, including estimation of the number of agents in a network, distributed least-squares or median solvers, distributed optimization, distributed state estimation, and robust synchronization of coupled oscillators. Since stability of the averaged dynamics makes the initial conditions forgotten as time goes on, these algorithms are initialization-free and suitable for plug-and-play operation. Finally, nonlinear couplings are also considered, which suggests that enforced synchronization gives rise to an emergent behavior of a heterogeneous multi-agent system.

4.1 Introduction

During the last decade, synchronization and collective behavior of multi-agent systems have been actively studied because of numerous applications in diverse areas, e.g., biology, physics, and engineering. Initial studies were about identical multi-agents [1–4], but the interest soon transferred to the heterogeneous case because

J. G. Lee
Department of Engineering, Control Group, University of Cambridge, Trumpington Street, CB2 1PZ Cambridge, United Kingdom
e-mail: [email protected]

H. Shim (B)
ASRI, Department of Electrical and Computer Engineering, Seoul National University, Gwanak-ro 1, Gwanak-gu, 08826 Seoul, Korea
e-mail: [email protected]

© The Author(s), under exclusive license to Springer Nature Switzerland AG 2022
Z.-P. Jiang et al. (eds.), Trends in Nonlinear and Adaptive Control, Lecture Notes in Control and Information Sciences 488, https://doi.org/10.1007/978-3-030-74628-5_4


uncertainty, disturbance, and noise are prevalent in practice. In this regard, heterogeneity was mostly considered harmful—something that we have to suppress or compensate for. To achieve synchronization, or at least approximate synchronization (with arbitrary precision if possible), against heterogeneity, various methods such as output regulation [5–10], backstepping [11], high-gain feedback [12–15], adaptive control [16], and optimal control [17] have been applied. However, heterogeneity of multi-agent systems is also a way to achieve a certain task collaboratively with different agents playing different roles. From this viewpoint, heterogeneity is something we should design, or, at least, heterogeneity is an outcome of distributing a complex computation to individual agents. This chapter is devoted to investigating this design possibility of heterogeneity. After presenting a few basic theorems which describe the collective behavior of multi-agent systems, we exhibit several design examples by employing the theorems as a toolkit. A feature of the toolkit is that the vector field of the collective behavior can be assembled from the individual vector fields of the agents when the coupling strength among the agents is sufficiently large. This process is explained by singular perturbation theory. In fact, the assembled vector field is nothing but an average of the agents' vector fields, and it appears as the quasi-steady-state subsystem (or the slow subsystem) when the inverse of the coupling gain is treated as the singular perturbation parameter. We call the quasi-steady-state subsystem the blended dynamics for convenience. The behavior of the blended dynamics is an emergent one if none of the agents has such a vector field. For instance, we will see that we can construct a heterogeneous network in which individuals can estimate the number of agents in the network without using any global information.
Since individuals cannot access the global information N, this collective behavior cannot be obtained by the individuals alone. On the other hand, the appearance of the emergent behavior when we enforce synchronization seems intrinsic. We will demonstrate this fact when we consider nonlinear coupling laws in a later section. Finally, the proposed tool leads to the claim that a network of a large number of agents is robust against variation of individual agents. We will demonstrate this in the case of coupled oscillators.
There are two notions which have to be considered when a multi-agent system is designed. It is said that plug-and-play operation (or initialization-free operation) is guaranteed for a multi-agent system if it maintains its task without resetting all agents whenever an agent joins or leaves the network. On the other hand, if a new agent that joins the network can construct its own dynamics without global information such as the graph structure, other agents' dynamics, and so on, it is said that decentralized design is achieved. It will be seen that plug-and-play operation is guaranteed for the design examples in this chapter. This is due to the fact that the group behavior of the agents is governed by the blended dynamics, and therefore, as long as the blended dynamics remains stable, individual initial conditions of the agents are forgotten as time goes on. The property of decentralized design is harder to achieve in general. However, for the presented examples, this property is guaranteed to some extent; more specifically, it is achieved except for the coupling gain, which is global information.


4.2 Strong Diffusive State Coupling

We begin with the simplest case of heterogeneous multi-agent systems given by

ẋ_i = f_i(t, x_i) + k ∑_{j∈N_i} α_{ij}(x_j − x_i) ∈ ℝⁿ,  i ∈ N,    (4.1)

where N := {1, ..., N} is the set of agent indices, N is the number of agents, and N_i is the subset of N whose elements are the indices of the agents that send information to agent i. The coefficient α_{ij} is the ij-th element of the adjacency matrix that represents the interconnection graph. We assume throughout this chapter that the graph is undirected and connected. The vector field f_i is assumed to be piecewise continuous in t, continuously differentiable with respect to x_i, locally Lipschitz with respect to x_i uniformly in t, and f_i(t, 0) is uniformly bounded in t. The summation term in (4.1) is called diffusive coupling, in particular diffusive state coupling, because states are exchanged among agents through the term. The diffusive state coupling term vanishes when state synchronization is achieved (i.e., x_i(t) = x_j(t), ∀i, j). The coupling strength, or coupling gain, is represented by the positive constant k.
It is immediately seen from (4.1) that synchronization of the x_i(t) to a common trajectory s(t) is hopeless in general, due to the heterogeneity of the f_i, unless it holds that ṡ(t) = f_i(t, s(t)) for all i ∈ N. Instead, with a sufficiently large coupling gain k, we can enforce approximate synchronization. To see this, let us introduce the linear coordinate change

s = (1/N) ∑_{i=1}^{N} x_i ∈ ℝⁿ,  z̃ = (Rᵀ ⊗ I_n) col(x₁, ..., x_N) ∈ ℝ^{(N−1)n},    (4.2)

where R ∈ ℝ^{N×(N−1)} is any matrix that satisfies

[1_Nᵀ/N; Rᵀ] L [1_N  R] = [0 0; 0 Λ]

with a positive definite matrix Λ ∈ ℝ^{(N−1)×(N−1)}, where 1_N := [1, ..., 1]ᵀ ∈ ℝᴺ and L := D − A is the Laplacian matrix of the graph, in which A = [α_{ij}] and D = diag(d_i) with d_i = ∑_{j∈N_i} α_{ij}. By the coordinate change, the multi-agent system is converted into the standard singular perturbation form

ṡ = (1/N) ∑_{i=1}^{N} f_i(t, s + (R_i ⊗ I_n)z̃)
(1/k) dz̃/dt = −(Λ ⊗ I_n)z̃ + (1/k)(Rᵀ ⊗ I_n) col(f₁(t, x₁), ..., f_N(t, x_N)),    (4.3)


where R_i denotes the i-th row of R. From this, it is seen that z̃ quickly becomes arbitrarily small with arbitrarily large k, and the quasi-steady-state subsystem of (4.3) is given by

ṡ = (1/N) ∑_{i=1}^{N} f_i(t, s),    (4.4)

which we call the blended dynamics.¹ By noting that x_i = s + (R_i ⊗ I_n)z̃, it is seen that the behavior of the multi-agent system (4.3) can be approximated by the blended dynamics, under some kind of stability of the blended dynamics (and with sufficiently large k), as follows:

Theorem 4.1 ([14, 19]) Assume that the blended dynamics (4.4) is contractive.² Then, for any compact set K ⊂ ℝ^{nN} and for any η > 0, there exists k* > 0 such that, for each k > k* and col(x₁(t₀), ..., x_N(t₀)) ∈ K, the solution to (4.1) exists for all t ≥ t₀ and satisfies

lim sup_{t→∞} ‖x_i(t) − s(t)‖ ≤ η, ∀i ∈ N.
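As a quick numerical illustration of Theorem 4.1, the following sketch (with made-up parameters, not taken from the chapter) couples two scalar agents, f₁(x) = x + 4, which is unstable on its own, and f₂(x) = −5x. Their blended dynamics ṡ = −2s + 2 is contractive with equilibrium s* = 1, so for a large coupling gain k both states should settle near 1:

```python
# Two heterogeneous scalar agents coupled as in (4.1).
# Agent 1 alone is unstable (f1(x) = x + 4), agent 2 is stable (f2(x) = -5x);
# the blended dynamics s' = (f1(s) + f2(s))/2 = -2s + 2 is contractive with
# equilibrium s* = 1, so Theorem 4.1 predicts both states settle near 1 once
# the coupling gain k is large enough.  (All parameters are illustrative.)

def f1(x): return x + 4.0       # individually unstable agent
def f2(x): return -5.0 * x      # individually stable agent

def simulate(k, dt=1e-3, T=10.0, x0=(3.0, -2.0)):
    x1, x2 = x0
    for _ in range(int(T / dt)):        # forward Euler integration of (4.1)
        dx1 = f1(x1) + k * (x2 - x1)
        dx2 = f2(x2) + k * (x1 - x2)
        x1, x2 = x1 + dt * dx1, x2 + dt * dx2
    return x1, x2

x1, x2 = simulate(k=100.0)
print(x1, x2)  # both close to the blended equilibrium s* = 1
```

Raising k shrinks the residual mismatch between the two states, in line with the η-versus-k* relation of the theorem.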

Theorem 4.2 ([15, 19]) Assume that there is a nonempty compact set A_b ⊂ ℝⁿ that is uniformly asymptotically stable for the blended dynamics (4.4). Let D_b ⊃ A_b be an open subset of the domain of attraction of A_b, and let³

D_x := {1_N ⊗ s + w : s ∈ D_b, w ∈ ℝ^{nN} such that (1_Nᵀ ⊗ I_n)w = 0}.

Then, for any compact set K ⊂ D_x ⊂ ℝ^{nN} and for any η > 0, there exists k* > 0 such that, for each k > k* and col(x₁(t₀), ..., x_N(t₀)) ∈ K, the solution to (4.1) exists for all t ≥ t₀ and satisfies

lim sup_{t→∞} ‖x_i(t) − x_j(t)‖ ≤ η and lim sup_{t→∞} ‖x_i(t)‖_{A_b} ≤ η, ∀i, j ∈ N.    (4.5)

If, in addition, A_b is locally exponentially stable for the blended dynamics (4.4) and f_i(t, s) = f_j(t, s), ∀i, j ∈ N, for each s ∈ A_b and t, then we have more than (4.5):

lim_{t→∞} ‖x_i(t) − x_j(t)‖ = 0 and lim_{t→∞} ‖x_i(t)‖_{A_b} = 0, ∀i, j ∈ N.

We emphasize that the required stability in the above theorems is only for the blended dynamics (4.4), not for the individual agent dynamics ẋ_i = f_i(t, x_i). A group of unstable and stable agents may end up with a stable blended dynamics, so that the

More appropriate name could be “averaged dynamics,” which may however confuse the reader with the averaged dynamics in the well-known averaging theory [18] that deals with time average. 2 x˙ = f (t, x) is contractive if ∃Θ > 0 such that Θ ∂ f (t, x) + ∂ f (t, x)T Θ ≤ −I for all x and t [20]. ∂x ∂x 3 The condition for w in D can be understood by recalling that col(x , . . . , x ) = 1 ⊗ s + (R ⊗ x 1 N N In )˜z .

4 Design of Heterogeneous Multi-agent System for Distributed Computation

87

above theorems can be applied. In this case, it can be interpreted that the stability is traded throughout the network with strong couplings. The blended dynamics (4.4) shows an emergent behavior of the multi-agent system in the sense that s(t) is governed by the new vector field that is assembled from the individual vector fields participating in the network. From now, we list a few examples of designing multi-agent systems (or, simply called “networks”) whose tasks are represented by the emergent behavior of (4.4) so that the whole network exhibits the emergent collective behavior with sufficiently large coupling gain k.

4.2.1 Finding the Number of Agents Participating in the Network When constructing a distributed network, sometimes there is a need for each agent to know global information such as the number of agents in the network without resorting to a centralized unit. In such circumstances, Theorem 4.1 can be employed to design a distributed network that estimates the number of participating agents, under the assumption that there is one agent (whose index is 1 without loss of generality) who always takes part in the network. Suppose that agent 1 integrates the following scalar dynamics:  α1 j (x j − x1 ), (4.6) x˙1 = −x1 + 1 + k j∈N 1

while all others integrate x˙i = 1 + k



αi j (x j − xi ), i = 2, . . . , N

(4.7)

j∈N i

where N is unknown to the agents. Then the blended dynamics (4.4) is obtained as

ṡ = −(1/N)s + 1.    (4.8)

This implies that the resulting emergent motion s(t) converges to N as time goes to infinity. Then, it follows from Theorem 4.1 that each state x_i(t) approaches arbitrarily close to N with a sufficiently large k. Hence, by increasing k such that the estimation error is less than 0.5, and by rounding x_i(t) to the nearest integer, each agent gets to know the number N as time goes on. By resorting to a heterogeneous network, we were able to impose a stable emergent collective behavior that makes it possible for individuals to estimate the number of agents in the network. Note that the initial conditions do not affect the final value of x_i(t) because they are forgotten as time tends to infinity due to the stability of the blended dynamics. This is in sharp contrast to other approaches such as [21] where the average consensus algorithm is employed, which yields the average of individual


initial conditions, to estimate N. While their approach requires resetting the initial conditions whenever some agents join or leave the network during the operation, the estimate of the proposed algorithm remains valid (after some transient) in such cases because the blended dynamics (4.8) remains contractive for any N ≥ 1. Therefore, the proposed algorithm achieves plug-and-play operation. Moreover, when the maximum number Nmax of agents is known, the decentralized design is also achieved. Further details are found in [22].

Remark 4.1 A slight variation of the idea yields an algorithm to identify the agents attending the network. Let the number 1 in (4.6) and (4.7) be replaced by 2^{i−1}, where i is the unique ID of the agent in {1, 2, ..., Nmax}. Then the blended dynamics (4.8) becomes ṡ = −(1/N)s + ∑_{j∈Na} 2^{j−1}/N, where Na is the index set of the attending agents and N is the cardinality of Na. Since the resulting emergent behavior is s(t) → ∑_{j∈Na} 2^{j−1}, each agent can figure out the integer value ∑_{j∈Na} 2^{j−1}, which contains the binary information of the attending agents.
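The agent-counting scheme (4.6)–(4.8) can be sketched in simulation as follows (the ring graph, gain, and Euler discretization are illustrative choices, not part of the original design):

```python
import numpy as np

# Agent-counting network (4.6)-(4.7) on an undirected ring of N agents.
# Agent 1 runs x1' = -x1 + 1 + coupling, all others run xi' = 1 + coupling.
# The blended dynamics s' = -s/N + 1 converges to s* = N, so every rounded
# state reveals the agent count.  Graph, gain, and step size are illustrative.

N, k, dt, T = 5, 50.0, 1e-3, 60.0
A = np.zeros((N, N))
for i in range(N):                      # undirected ring graph
    A[i, (i + 1) % N] = A[(i + 1) % N, i] = 1.0
L = np.diag(A.sum(axis=1)) - A          # graph Laplacian

x = np.zeros(N)                         # initial conditions are irrelevant
for _ in range(int(T / dt)):
    drift = np.ones(N)
    drift[0] -= x[0]                    # only agent 1 has the -x1 term
    x = x + dt * (drift - k * (L @ x))  # forward Euler step of (4.6)-(4.7)

estimates = np.round(x).astype(int)
print(estimates)  # each agent's estimate of N
```

Because the blended dynamics (4.8) is contractive, the initial state x = 0 is irrelevant; restarting from any other initial condition yields the same estimates, which is the plug-and-play property discussed above.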

4.2.2 Distributed Least-Squares Solver

Distributed algorithms have been developed in various fields of study so as to divide a large computational problem into small-scale computations. In this regard, finding a solution of a given large linear equation in a distributed manner has been tackled in recent years [23–25]. Let the equation be given by

Ax = b ∈ ℝᴹ,    (4.9)

where A ∈ ℝ^{M×n} has full column rank, x ∈ ℝⁿ, and b ∈ ℝᴹ. We suppose that the total of M equations is grouped into N equation banks, and that the i-th equation bank consists of m_i equations, so that ∑_{i=1}^{N} m_i = M. In particular, we write the i-th equation bank as

A_i x = b_i ∈ ℝ^{m_i}, i = 1, 2, ..., N,    (4.10)

where A_i ∈ ℝ^{m_i×n} is the i-th block row of the matrix A and b_i ∈ ℝ^{m_i} is the i-th block element of b. The problem of finding a solution to (4.10) (in the sense of least squares when there is no solution) is dealt with in [26–29]. Most notable among them are [27, 28], which proposed the distributed algorithm

ẋ_i = −A_iᵀ(A_i x_i − b_i) + k ∑_{j∈N_i} α_{ij}(x_j − x_i).    (4.11)

Here, we analyze (4.11) in terms of Theorem 4.2. In particular, the blended dynamics of the network (4.11) is obtained as


ṡ = −(1/N) Aᵀ(As − b),    (4.12)

which is equivalent to the gradient descent algorithm for the optimization problem

minimize_x ‖Ax − b‖²    (4.13)

that has the unique minimizer (AᵀA)⁻¹Aᵀb, the least-squares solution of (4.9). Thus, Theorem 4.2 asserts that each state x_i approximates the least-squares solution with a sufficiently large k, and the error can be made arbitrarily small by increasing k.

Remark 4.2 Even in the case that AᵀA is not invertible, the network (4.11) still solves the least-squares problem because s of (4.12) converges to one of the minimizers. Further details are found in [30].
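A minimal simulation sketch of the solver (4.11) follows; the equation banks, the complete interconnection graph, and the gains are hypothetical choices for illustration. Note that no single bank determines x by itself, yet every agent recovers the centralized least-squares solution:

```python
import numpy as np

# Distributed least-squares solver (4.11).  Each agent i owns one equation
# bank (A_i, b_i) and runs x_i' = -A_i^T (A_i x_i - b_i) + coupling; no
# single A_i has full column rank, yet every x_i approaches the centralized
# least-squares solution (A^T A)^{-1} A^T b.  Data, complete-graph coupling,
# gain, and Euler step are illustrative choices.

A_blocks = [np.array([[1.0, 0.0, 0.0]]),
            np.array([[0.0, 1.0, 0.0]]),
            np.array([[0.0, 0.0, 1.0]]),
            np.array([[1.0, 1.0, 1.0]])]
b_blocks = [np.array([1.0]), np.array([2.0]),
            np.array([3.0]), np.array([10.0])]
A_full, b_full = np.vstack(A_blocks), np.concatenate(b_blocks)
x_star = np.linalg.lstsq(A_full, b_full, rcond=None)[0]   # (2, 3, 4)

N, n = len(A_blocks), 3
Lap = N * np.eye(N) - np.ones((N, N))   # complete-graph Laplacian

k, dt, T = 200.0, 2e-4, 30.0
X = np.zeros((N, n))                    # row i is agent i's estimate
for _ in range(int(T / dt)):
    G = np.stack([-Ai.T @ (Ai @ X[i] - bi)
                  for i, (Ai, bi) in enumerate(zip(A_blocks, b_blocks))])
    X = X + dt * (G - k * (Lap @ X))
print(np.abs(X - x_star).max())  # small consensus/optimality error
```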

4.2.3 Distributed Median Solver

The idea of designing a network based on the gradient descent algorithm of an optimization problem, as in the previous subsection, can be used for most other distributed optimization problems. Among them, a particularly interesting example is the problem of finding a median, which is useful, for instance, in rejecting outliers under redundancy. For a collection R of real numbers r_i, i = 1, 2, ..., N, their median is defined as a real number that belongs to the set

M_R = {r^s_{(N+1)/2}} if N is odd,  M_R = [r^s_{N/2}, r^s_{N/2+1}] if N is even,

where the r^s_i are the elements of the set R with their indices rearranged (sorted) such that r^s₁ ≤ r^s₂ ≤ ··· ≤ r^s_N. With the help of this relaxed definition of the median, finding a median s of R becomes solving the optimization problem

minimize_s ∑_{i=1}^{N} |r_i − s|.

Then, the gradient descent algorithm

ṡ = (1/N) ∑_{i=1}^{N} sgn(r_i − s)    (4.14)

will solve this minimization problem, where sgn(s) is 1 if s > 0, −1 if s < 0, and 0 if s = 0. In particular, the solution s satisfies


lim_{t→∞} ‖s(t)‖_{M_R} = 0.

Motivated by this, we propose a distributed median solver, in which the individual dynamics of agent i uses the information of r_i only:

ẋ_i = sgn(r_i − x_i) + k ∑_{j∈N_i} α_{ij}(x_j − x_i), i ∈ N.    (4.15)

The algorithm (4.15) finds a median approximately by exchanging the states x_i only (not the r_i). Further details can be found in [31].
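The median solver (4.15) can be sketched as below (data, complete graph, and gains are illustrative). The large outlier 100 shifts the mean to 23 but leaves the median, and hence the consensus value, near 3:

```python
import numpy as np

# Distributed median solver (4.15): agent i knows only its number r_i and
# runs x_i' = sgn(r_i - x_i) + coupling.  The blended dynamics is the
# subgradient flow (4.14) of sum_i |r_i - s|, so all states gather near the
# median of the r_i.  Values, graph, and gains are illustrative.

r = np.array([1.0, 2.0, 9.0, 100.0, 3.0])    # median is 3 (outlier-robust)
N = len(r)
Lap = N * np.eye(N) - np.ones((N, N))        # complete-graph Laplacian

k, dt, T = 100.0, 1e-4, 20.0
x = np.zeros(N)
for _ in range(int(T / dt)):
    x = x + dt * (np.sign(r - x) - k * (Lap @ x))
print(x)  # all entries near the median 3
```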

4.2.4 Distributed Optimization: Optimal Power Dispatch

As another application of distributed optimization, let us consider the optimization problem

minimize_{λ₁,...,λ_N} ∑_{i=1}^{N} J_i(λ_i)
subject to ∑_{i=1}^{N} λ_i = ∑_{i=1}^{N} d_i,  λ̲_i ≤ λ_i ≤ λ̄_i, i ∈ N,    (4.16)

where λ_i ∈ ℝ is the decision variable, J_i is a strictly convex C² function, and λ̲_i, λ̄_i, and d_i are given constants. A practical example is the economic dispatch problem of electric power, in which d_i represents the demand of node i, λ_i is the power generated at node i with minimum λ̲_i and maximum λ̄_i, and J_i is the generation cost. A centralized solution is easily obtained using the Lagrangian and the Lagrange dual function. Indeed, it can be shown that the optimal value is obtained by λ_i* = θ_i(s*), where

θ_i(s) := (dJ_i/dλ_i)⁻¹( sat( s, (dJ_i/dλ_i)(λ̲_i), (dJ_i/dλ_i)(λ̄_i) ) ),

in which (dJ_i/dλ_i)⁻¹(·) is the inverse function of (dJ_i/dλ_i)(·), and sat(s, a, b) is s if a ≤ s ≤ b, b if b < s, and a if s < a. The optimal s* maximizes the dual function g(s) := ∑_{i=1}^{N} [J_i(θ_i(s)) + s(d_i − θ_i(s))], which is concave, so that s* can be asymptotically obtained by the gradient algorithm

ṡ = (dg/ds)(s) = ∑_{i=1}^{N} (d_i − θ_i(s)).    (4.17)


A distributed algorithm that solves the optimization problem approximately is to integrate

ẋ_i = d_i − θ_i(x_i) + k ∑_{j∈N_i} α_{ij}(x_j − x_i), i ∈ N,    (4.18)

because the blended dynamics of (4.18) is given by

ṡ = (1/N) ∑_{i=1}^{N} (d_i − θ_i(s)) = (1/N) (dg/ds)(s).    (4.19)

Obviously, (4.19) is the same as the centralized solver (4.17) except for the scaling 1/N, which can be compensated by scaling (4.18). By Theorem 4.2, the state x_i(t) of each node approaches arbitrarily close to s* with a sufficiently large k, and so we obtain λ_i* approximately by θ_i(x_i(t)), whose error can be made arbitrarily small. Readers are referred to [32], which also describes the behavior of the proposed algorithm when the problem is infeasible, so that each agent can detect that infeasibility has occurred. It is again emphasized that the initial conditions are forgotten, and so plug-and-play operation is guaranteed. Moreover, each agent can design its own dynamics (4.18) with only its local information, so that decentralized design is achieved except for the global information k. In particular, the function θ_i can be computed within agent i from local information such as J_i, λ̲_i, and λ̄_i. Therefore, the proposed solver (4.18) does not exchange the private information of each agent (except the dual variable x_i).
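A simulation sketch of the dispatch solver (4.18) with quadratic costs J_i(λ_i) = a_iλ_i²/2 (an illustrative choice, for which θ_i has the closed form θ_i(s) = sat(s/a_i, λ̲_i, λ̄_i)):

```python
import numpy as np

# Distributed economic dispatch (4.18) with quadratic costs
# J_i(l) = a_i * l^2 / 2 (illustrative data).  theta_i(s) clips the
# unconstrained optimizer s/a_i to the generation limits, and each agent
# runs x_i' = d_i - theta_i(x_i) + coupling.  All x_i approach the optimal
# dual variable s*, and theta_i(x_i) approaches the optimal dispatch.

a = np.array([1.0, 2.0, 4.0])           # cost curvatures
d = np.array([2.0, 3.0, 1.0])           # local demands (total 6)
lo, hi = 0.0, 10.0                      # generation limits

def theta(s):                           # per-agent primal recovery map
    return np.clip(s / a, lo, hi)

N = len(a)
Lap = N * np.eye(N) - np.ones((N, N))   # complete-graph Laplacian

k, dt, T = 100.0, 1e-4, 30.0
x = np.zeros(N)
for _ in range(int(T / dt)):
    x = x + dt * (d - theta(x) - k * (Lap @ x))

s_star = d.sum() / (1.0 / a).sum()      # closed form: sum_i s/a_i = sum_i d_i
dispatch = theta(x)
print(x, dispatch)  # x near s* = 24/7, dispatch sums to the total demand
```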

4.3 Strong Diffusive Output Coupling

Now, let us consider a bit more complex network—heterogeneous multi-agent systems under the diffusive output coupling law⁴

ż_i = g_i(t, z_i, y_i) ∈ ℝ^{m_i},
ẏ_i = h_i(t, y_i, z_i) + kΛ ∑_{j∈N_i} α_{ij}(y_j − y_i) ∈ ℝⁿ,  i ∈ N,    (4.20)

where the matrix Λ is positive definite. The vector fields gi and h i are assumed to be piecewise continuous in t, continuously differentiable with respect to z i and yi , 4

A particular case of (4.20) is x˙i = f i (t, xi ) + k B



αi j (x j − xi ), i ∈ N ,

j∈N i

where the matrix B is positive semi-definite, which can always be converted into (4.20) by a linear coordinate change.

92

J. G. Lee and H. Shim

locally Lipschitz with respect to z i and yi uniformly in t, and gi (t, 0, 0), h i (t, 0, 0) are uniformly bounded for t. For this network, under the same coordinate change as (4.2) in which xi replaced by yi , it can be seen that the quasi-steady-state subsystem (or, the blended dynamics) becomes z˙ˆ i = gi (t, zˆ i , s), i ∈ N , s˙ =

N 1  h i (t, s, zˆ i ). N i=1

(4.21)

This can also be seen by treating z i (t) as external inputs of h i in (4.20). Theorem 4.3 ([19]) Assume that the blended dynamics (4.21) is contractive. Then, for any compact set K and for any η > 0, there exists k ∗ > 0 such that, for each k > k ∗ and col(z 1 (t0 ), y1 (t0 ), . . . , z N (t0 ), y N (t0 )) ∈ K , the solution to (4.20) exists for all t ≥ t0 , and satisfies lim sup z i (t) − zˆ i (t) ≤ η and lim sup yi (t) − s(t) ≤ η, ∀i ∈ N . t→∞

t→∞

Theorem 4.4 ([15, 19]) Assume that there is a nonempty compact set Ab that is uniformly asymptotically stable for the blended dynamics (4.21). Let Db ⊃ Ab be an open subset of the domain of attraction of Ab , and let  Ax := col(ˆz 1 , s, zˆ 2 , s, . . . , zˆ N , s) : col(ˆz 1 , . . . , zˆ N , s) ∈ Ab , ⎧ ⎫ N ⎨ ⎬ 1  si = s . Dx := col(ˆz 1 , s1 , . . . , zˆ N , s N ) : col(ˆz 1 , . . . , zˆ N , s) ∈ Db such that ⎩ ⎭ N i=1

Then, for any compact set K ⊂ Dx and for any η > 0, there exists k ∗ > 0 such that, for each k > k ∗ and col(z 1 (t0 ), y1 (t0 ), . . . , z N (t0 ), y N (t0 )) ∈ K , the solution to (4.20) exists for all t ≥ t0 , and satisfies lim sup col(z 1 (t), y1 (t), . . . , z N (t), y N (t)) A x ≤ η.

(4.22)

t→∞

If, in addition, Ab is locally exponentially stable for the blended dynamics (4.21) and, h i (t, yi , z i ) = h j (t, y j , z j ), ∀i, j ∈ N , for each col(z 1 , y1 , . . . , z N , y N ) ∈ Ax and t, then we have more than (4.22) as lim col(z 1 (t), y1 (t), . . . , z N (t), y N (t)) A x = 0.

t→∞

With the extended results, two more examples follow.

4 Design of Heterogeneous Multi-agent System for Distributed Computation

93

4.3.1 Synchronization of Heterogeneous Liénard Systems Consider a network of heterogeneous Liénard systems modeled as z¨ i + f i (z i )˙z i + gi (z i ) = u i ,

i = 1, . . . , N ,

(4.23)

where f i (·) and gi (·) are locally Lipschitz. Suppose that the output and the diffusive coupling input are given by oi = az i + z˙ i , a > 0, and u i = k



αi j (o j − oi ).

(4.24)

j∈N i

For (4.23) with (4.24), we claim that synchronous and oscillatory behavior is obtained with a sufficiently large k if the averaged Liénard systems given by  z¨ + fˆ(z)˙z + g(z) ˆ := z¨ +

   N N 1  1  f i (z) z˙ + gi (z) = 0 N i=1 N i=1

(4.25)

has a stable limit cycle. This condition may be interpreted as the blended version of the condition for a stand-alone Liénard system z¨ + f (z)˙z + g(z) = 0 to have a stable limit cycle. Note that this condition implies that, even when some particular agents z¨ i + f i (z i )˙z i + gi (z i ) = 0 do not yield a stable limit cycle, the network still can exhibit oscillatory behavior as long as the average of ( f i , gi ) yields a stable limit cycle. It is seen that stability of individual agents can be traded among agents in this way so that some malfunctioning oscillators can oscillate in the oscillating network as long as there are a majority of good neighbors. The frequency and the shape of synchronous oscillation is also determined by the average of ( f i , gi ). To justify the claim, we first realize (4.23) and (4.24) with yi := az i + z˙ i as z˙ i = −az i + yi y˙i = −a 2 z i + ayi − f i (z i )yi + a f i (z i )z i − gi (z i ) + k



αi j (y j − yi ),

j∈N i

and compute its blended dynamics (4.21) as i ∈N, (4.26) z˙ˆ i = −a zˆ i + s,         N N N N 1  1  1  1  2 s˙ = −a zˆ i + as − f i (ˆz i ) s + a f i (ˆz i )ˆz i − gi (ˆz i ) . N N N N i=1

i=1

i=1

i=1

To see whether this (N + 1)-th order blended dynamics has a stable limit cycle, we observe that, with a > 0, all zˆ i (t) converge exponentially to a common trajectory zˆ (t) as time goes on. Therefore, if the blended dynamics has a stable limit cycle, which is an invariant set, it has to be on the synchronization manifold S defined as

94

J. G. Lee and H. Shim

 S := col(ˆz , . . . , zˆ , s¯ ) ∈ R N +1 : col(ˆz , s¯ ) ∈ R2 . Projecting the blended dynamics (4.26) to the synchronization manifold S , i.e., replacing zˆ i with zˆ in (4.26) for all i ∈ N , we obtain a second-order system z˙ˆ = −a zˆ + s, ˆ z ). s˙ = −a 2 zˆ + as − fˆ(ˆz )s + a fˆ(ˆz )ˆz − g(ˆ

(4.27)

Therefore, (4.27) should have a stable limit cycle if the blended dynamics has a stable limit cycle. It turns out that (4.27) is a realization of (4.25) by s = az + z˙ , and thus, existence of a stable limit cycle for (4.25) is a necessary condition for the blended dynamics (4.26) to have a stable limit cycle. Further analysis, given in [33], proves that the converse is also true. Then, Theorem 4.4 holds with the limit cycle of (4.26) as the compact set Ab , and thus, with a sufficiently large k, all the vectors (z i (t), z˙ i (t)) stay close to each other, and oscillate near the limit cycle of the averaged Liénard system (4.25). This property has been coined as “phase cohesiveness” in [34].
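The claim above can be probed with a quick simulation. The sketch below is not from the chapter; all vector fields, gains, and numerical values are hypothetical choices for illustration. It integrates a small network of heterogeneous Liénard systems under the output coupling (4.24) with forward Euler, and checks that the outputs o_i = a z_i + ż_i come close together for a large gain k.

```python
import numpy as np

def simulate_lienard_network(f, g, adj, k, a=1.0, dt=1e-3, T=20.0):
    """Forward-Euler integration of z_i'' + f_i(z_i) z_i' + g_i(z_i) = u_i with
    the output o_i = a*z_i + z_i' and the diffusive coupling (4.24):
    u_i = k * sum_j adj[i, j] * (o_j - o_i)."""
    N = len(f)
    z = np.linspace(-1.0, 1.0, N)                  # spread-out initial positions
    v = np.zeros(N)                                # initial velocities
    for _ in range(int(T / dt)):
        o = a * z + v
        u = k * (adj @ o - adj.sum(axis=1) * o)    # = k * sum_j adj[i,j]*(o_j - o_i)
        fz = np.array([f[i](z[i]) for i in range(N)])
        gz = np.array([g[i](z[i]) for i in range(N)])
        z, v = z + dt * v, v + dt * (-fz * v - gz + u)
    return z, v

rng = np.random.default_rng(0)
N = 5
# Randomly perturbed Van der Pol-type Lienard agents: individually heterogeneous,
# but their average keeps a stable limit cycle.
d = rng.normal(0.0, 0.3, size=(N, 2))
f = [lambda x, di=d[i]: (1.0 + di[0]) * (x**2 - 1.0) for i in range(N)]
g = [lambda x, di=d[i]: (1.0 + di[1]) * x for i in range(N)]
adj = np.ones((N, N)) - np.eye(N)                  # all-to-all graph, alpha_ij = 1
z, v = simulate_lienard_network(f, g, adj, k=50.0)
output_spread = np.ptp(z + v)                      # spread of the outputs o_i at t = T
```

Increasing k tightens the output spread, in line with Theorem 4.4, while the oscillation itself follows the averaged Liénard vector field.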

4.3.2 Distributed State Estimation

Consider a linear system

    χ̇ = S χ ∈ R^n,   o = col(o_1, ..., o_N) = col(G_1, ..., G_N) χ + col(n_1, ..., n_N) = G χ + n,   o_i ∈ R^{q_i},

where χ ∈ R^n is the state to be estimated, o is the measurement output, and n is the measurement noise. It is supposed that there are N distributed agents, and each agent i can access only the measurement o_i ∈ R^{q_i} (where often q_i = 1). We assume that the pair (G, S) is detectable, while each pair (G_i, S) is not necessarily detectable, as in [35, 36]. Each agent is allowed to communicate its internal state to its neighboring nodes. The question is how to construct a dynamic system for each node that estimates χ(t). See, e.g., [37, 38] for more details on this distributed state estimation problem.

4 Design of Heterogeneous Multi-agent System for Distributed Computation

95

To solve the problem, we first employ the detectability decomposition for each node, that is, for each pair (G_i, S). With p_i being the dimension of the undetectable subspace of the pair (G_i, S), let [Z_i, W_i] be an orthogonal matrix, where Z_i ∈ R^{n×(n−p_i)} and W_i ∈ R^{n×p_i}, such that

    col(Z_i^T, W_i^T) S [Z_i W_i] = [ S̄_i 0 ; ∗ ∗ ],   G_i [Z_i W_i] = [ Ḡ_i 0 ],

and the pair (Ḡ_i, S̄_i) is detectable. Then, pick a matrix Ū_i ∈ R^{(n−p_i)×q_i} such that S̄_i − Ū_i Ḡ_i is Hurwitz. Now, each individual agent i can construct a local partial state observer, for instance, as

    ḃ_i = S̄_i b_i − Ū_i (Ḡ_i b_i − o_i) ∈ R^{n−p_i},   (4.28)

to collect as much information about the state χ as possible from the available measurement o_i alone; each agent i can obtain partial information b_i about χ in the sense that

    b_i = Z_i^T χ + z_i ∈ R^{n−p_i},   i ∈ N,   (4.29)

where z_i denotes the estimation error, which converges to zero by (4.28) if n_i = 0. When we collect (4.29) and write them together as

    b = col(b_1, ..., b_N) = col(Z_1^T, ..., Z_N^T) χ + col(z_1, ..., z_N) =: A χ + z,   (4.30)

detectability of (G, S) implies that A = col(Z_1^T, ..., Z_N^T) has full column rank. Therefore, the least-squares solution χ̂(t) of A χ̂(t) = b(t) generates a unique estimate of χ(t). This reminds us of the problem in Sect. 4.2.2: finding the least-squares solution in a distributed manner. Based on the discussion above, we propose a distributed state estimator for the given linear system as

    χ̂̇_i = S χ̂_i − κ Z_i (Z_i^T χ̂_i − b_i) + k ∑_{j ∈ N_i} α_{ij} (χ̂_j − χ̂_i) ∈ R^n,   i ∈ N,   (4.31)

where b_i comes from (4.28), and both κ and k are design parameters. Note that the least-squares solution χ̂(t) of A χ̂(t) = b(t) is time-varying, and so, in order to have asymptotic convergence of χ̂_i(t) to χ(t) (when there is no noise n), we had to embed the generating model of χ(t) in (4.31), inspired by the internal model principle.

To justify the proposed distributed state estimator (4.28) and (4.31), let us denote the state estimation error by y_i := χ̂_i − χ. We then obtain the error dynamics for the partial state observer (4.28) and the distributed observer (4.31) as

    ż_i = (S̄_i − Ū_i Ḡ_i) z_i + Ū_i n_i,
    ẏ_i = S y_i − κ Z_i (Z_i^T y_i − z_i) + k ∑_{j ∈ N_i} α_{ij} (y_j − y_i),   i ∈ N.   (4.32)

The blended dynamics (4.21) is obtained as

    ẑ̇_i = (S̄_i − Ū_i Ḡ_i) ẑ_i + Ū_i n_i,
    ṡ = S s − (κ/N) ∑_{i=1}^N Z_i (Z_i^T s − ẑ_i) = ( S − (κ/N) A^T A ) s + (κ/N) A^T col(ẑ_1, ..., ẑ_N).   (4.33)

For a sufficiently large gain κ, the blended dynamics (4.33) becomes contractive, and thus Theorem 4.3 guarantees that the error variables (z_i(t), y_i(t)) of (4.32) behave like (ẑ_i(t), s(t)) of (4.33). Moreover, if there is no noise n, Theorem 4.4 asserts that all the estimation errors (z_i(t), y_i(t)), i ∈ N, converge to zero because A_x = {0}. Even with the noise n, the proposed observer achieves an almost best possible estimate, whose details can be found in [30].
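The role of κ in making (4.33) contractive can be checked numerically. The following sketch uses hypothetical matrices S and Z_i (not from the chapter): once A = col(Z_1^T, ..., Z_N^T) has full column rank, A^T A is positive definite, and the matrix S − (κ/N) A^T A that governs s becomes Hurwitz for sufficiently large κ, even though S itself is only marginally stable.

```python
import numpy as np

def max_real_eig(M):
    return float(np.max(np.linalg.eigvals(M).real))

# Hypothetical data: a marginally stable oscillator S, and three agents whose
# individually-seen directions Z_i together span R^2 (so A has full column rank).
S = np.array([[0.0, 1.0], [-1.0, 0.0]])           # eigenvalues +/- i: not Hurwitz
Z = [np.array([[1.0], [0.0]]),
     np.array([[0.0], [1.0]]),
     np.array([[1.0], [1.0]]) / np.sqrt(2.0)]
A = np.vstack([Zi.T for Zi in Z])                 # A = col(Z_1^T, ..., Z_N^T)
N = len(Z)

# A^T A is positive definite, so the s-dynamics matrix of (4.33) is Hurwitz for
# large kappa (here any kappa > 0 already works, because S is skew-symmetric).
alphas = {kappa: max_real_eig(S - (kappa / N) * A.T @ A) for kappa in (0.0, 0.1, 10.0)}
```

For κ = 0 the spectral abscissa sits at zero, while it moves strictly into the left half-plane as κ grows; this is the numerical face of the contraction argument.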

4.4 General Description of Blended Dynamics

Now, we extend our approach to the most general setting: heterogeneous multi-agent systems under the rank-deficient diffusive coupling law given by

    ẋ_i = f_i(t, x_i) + k B_i ∑_{j ∈ N_i} α_{ij} (x_j − x_i),   i ∈ N,   (4.34)

where the matrix B_i is positive semi-definite for each i ∈ N. For this network, by increasing the coupling gain k, we can enforce synchronization of the states that correspond to the subspace

    R_B := ⋂_{i=1}^N im(B_i) ⊂ R^n.   (4.35)

In order to find the part of the individual states that synchronizes, let us follow the procedure:

1. Find W_i ∈ R^{n×p_i} and Z_i ∈ R^{n×(n−p_i)}, where p_i is the rank of B_i, such that [W_i Z_i] is an orthogonal matrix and

    col(W_i^T, Z_i^T) B_i [W_i Z_i] = [ Λ_i² 0 ; 0 0 ],   (4.36)

where Λ_i ∈ R^{p_i×p_i} is positive definite. Let W_net := diag(W_1, ..., W_N), Z_net := diag(Z_1, ..., Z_N), and Λ_net := diag(Λ_1, ..., Λ_N).

2. Find V_i ∈ R^{p_i×p_s} such that, with p̄ := ∑_{i=1}^N p_i and V := col(V_1, ..., V_N) ∈ R^{p̄×p_s}, the columns of V are orthonormal vectors satisfying

    (L ⊗ I_n) W_net Λ_net V = 0,   (4.37)

where p_s is the dimension of ker((L ⊗ I_n) W_net Λ_net), and L is the graph Laplacian matrix.

3. Find V̄ ∈ R^{p̄×(p̄−p_s)} such that [V V̄] ∈ R^{p̄×p̄} is an orthogonal matrix.

Proposition 4.1 ([19]) (i) p_s ≤ min{p_1, ..., p_N} ≤ n. (ii) All matrices W_i Λ_i V_i (i = 1, ..., N) are the same; denoting this common matrix by M ∈ R^{n×p_s}, we have rank(M) = p_s, im(M) = R_B, and dim(R_B) = p_s. (iii) Define

    Q := V̄^T Λ_net W_net^T (L ⊗ I_n) W_net Λ_net V̄ ∈ R^{(p̄−p_s)×(p̄−p_s)}.

Then, Q is positive definite.

Now, we introduce a linear coordinate change by which the state x_i of each individual agent is split into Z_i^T x_i and W_i^T x_i. In particular, the sub-state z_i := Z_i^T x_i is the projected component of x_i on im(Z_i), and has no direct interconnection with the neighbors, as its dynamics is given by

    ż_i = Z_i^T f_i(t, x_i),   i ∈ N.   (4.38)

On the other hand, the sub-state W_i^T x_i is split once more into s_i := V_i^T Λ_i^{-1} W_i^T x_i ∈ R^{p_s} and the remaining part. (In fact, s_i determines the behavior of the individual agent in the subspace R_B, in the sense that M s_i ∈ R_B.) With a sufficiently large k, these s_i are enforced to synchronize to s := (1/N) ∑_{i=1}^N s_i = (1/N) ∑_{i=1}^N V_i^T Λ_i^{-1} W_i^T x_i, which is governed by

    ṡ = (1/N) ∑_{i=1}^N V_i^T Λ_i^{-1} W_i^T f_i(t, x_i).   (4.39)

To see this, let us consider a coordinate change for the whole multi-agent system (4.34):

    col(z, s, w) = col( Z_net^T, (1/N) V^T Λ_net^{-1} W_net^T, Q^{-1} V̄^T Λ_net W_net^T (L ⊗ I_n) ) col(x_1, ..., x_N),   (4.40)

where w ∈ R^{(N−1) p_s + ∑_{i=1}^N (p_i − p_s)} collects all the components, both in col(s_1, ..., s_N) that are left after taking s = (1/N) 1_N^T col(s_1, ..., s_N) ∈ R^{p_s}, and in W_i^T x_i that are left after taking s_i = V_i^T Λ_i^{-1} W_i^T x_i. It turns out that the governing equation for w is

    (1/k) ẇ = −Q w + (1/k) Q^{-1} V̄^T Λ_net W_net^T (L ⊗ I_n) col(f_1(t, x_1), ..., f_N(t, x_N)).   (4.41)

Then, it is clear that the system (4.38), (4.39), and (4.41) is in the standard form of singular perturbation. Since the inverse of (4.40) is given (in [19]) by

    col(x_1, ..., x_N) = (Z_net − W_net Λ_net 𝓛) z + N W_net Λ_net V s + W_net Λ_net V̄ w,

where 𝓛 ∈ R^{p̄×(nN−p̄)} (not to be confused with the Laplacian L) is defined as

    𝓛 = col(𝓛_1, ..., 𝓛_N) := V̄ Q^{-1} V̄^T Λ_net W_net^T (L ⊗ I_n) Z_net

with 𝓛_i ∈ R^{p_i×(nN−p̄)}, the quasi-steady-state subsystem (that is, the blended dynamics) becomes

    ẑ̇_i = Z_i^T f_i(t, Z_i ẑ_i − W_i Λ_i 𝓛_i ẑ + N M s),   i ∈ N,
    ṡ = (1/N) ∑_{i=1}^N V_i^T Λ_i^{-1} W_i^T f_i(t, Z_i ẑ_i − W_i Λ_i 𝓛_i ẑ + N M s),   (4.42)

where ẑ = col(ẑ_1, ..., ẑ_N).

Theorem 4.5 ([19]) Assume that the blended dynamics (4.42) is contractive. Then, for any compact set K and for any η > 0, there exists k* > 0 such that, for each k > k* and col(x_1(t_0), ..., x_N(t_0)) ∈ K, the solution to (4.34) exists for all t ≥ t_0 and satisfies

    lim sup_{t→∞} ‖x_i(t) − (Z_i ẑ_i(t) − W_i Λ_i 𝓛_i ẑ(t) + N M s(t))‖ ≤ η,   ∀i ∈ N.

Theorem 4.6 ([19]) Assume that there is a nonempty compact set A_b that is uniformly asymptotically stable for the blended dynamics (4.42). Let D_b ⊃ A_b be an open subset of the domain of attraction of A_b, and let

    A_x := { (Z_net − W_net Λ_net 𝓛) ẑ + N (1_N ⊗ M) s : col(ẑ, s) ∈ A_b },
    D_x := { (Z_net − W_net Λ_net 𝓛) ẑ + N (1_N ⊗ M) s + W_net Λ_net V̄ w : col(ẑ, s) ∈ D_b, w ∈ R^{p̄−p_s} }.

Then, for any compact set K ⊂ D_x and for any η > 0, there exists k* > 0 such that, for each k > k* and col(x_1(t_0), ..., x_N(t_0)) ∈ K, the solution to (4.34) exists for all t ≥ t_0 and satisfies

    lim sup_{t→∞} ‖col(x_1(t), ..., x_N(t))‖_{A_x} ≤ η.   (4.43)

If, in addition, A_b is locally exponentially stable for the blended dynamics (4.42) and

    V̄^T Λ_net W_net^T (L ⊗ I_n) col(f_1(t, x_1), ..., f_N(t, x_N)) = 0   (4.44)

for all col(x_1, ..., x_N) ∈ A_x and all t, then we have more than (4.43):

    lim_{t→∞} ‖col(x_1(t), ..., x_N(t))‖_{A_x} = 0.
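Numerically, Step 1 of the procedure above (the decomposition (4.36)) amounts to an eigendecomposition of the positive semi-definite matrix B_i. The sketch below uses hypothetical data, and assumes the factorization W_i^T B_i W_i = Λ_i² exactly as written in (4.36).

```python
import numpy as np

def decompose(B, tol=1e-9):
    """Return (W, Lam, Z) with [W Z] orthogonal, W^T B W = Lam^2 (Lam positive
    definite diagonal), and Z spanning the kernel of B, as in (4.36)."""
    vals, vecs = np.linalg.eigh(B)                 # ascending eigenvalues
    pos = vals > tol
    W, Z = vecs[:, pos], vecs[:, ~pos]
    Lam = np.diag(np.sqrt(vals[pos]))
    return W, Lam, Z

rng = np.random.default_rng(1)
M0 = rng.normal(size=(4, 2))
B = M0 @ M0.T                                      # hypothetical PSD coupling matrix, rank 2

W, Lam, Z = decompose(B)
WZ = np.hstack([W, Z])
ok_orth = np.allclose(WZ.T @ WZ, np.eye(4), atol=1e-8)     # [W Z] orthogonal
ok_block = np.allclose(W.T @ B @ W, Lam @ Lam, atol=1e-8)  # W^T B W = Lam^2
ok_kernel = np.allclose(Z.T @ B @ Z, 0.0, atol=1e-8)       # zero block in (4.36)
```

Since eigh returns orthonormal eigenvectors of a symmetric matrix, the orthogonality of [W Z] comes for free, and the rank p_i of B_i is recovered as the number of eigenvalues above the tolerance.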

4.4.1 Distributed State Observer with Rank-Deficient Coupling

We revisit the distributed state estimation problem discussed in Sect. 4.3.2 with the following agent dynamics, which has a lower dimension than (4.28) and (4.31):

    χ̂̇_i = S χ̂_i + U_i (o_i − G_i χ̂_i) + k W_i W_i^T ∑_{j=1}^N α_{ij} (χ̂_j − χ̂_i),   (4.45)

where U_i := Z_i Ū_i and k is sufficiently large. Here, the first two terms on the right-hand side look like a typical state observer, but due to the lack of detectability of (G_i, S), they cannot yield stable error dynamics. Therefore, the diffusive coupling of the third term exchanges the internal state with the neighbors, compensating for the lack of information on the undetectable parts. Recalling that W_i^T χ represents the components of χ that are undetectable from o_i in the decomposition given in Sect. 4.3.2, note that the coupling term compensates only the undetectable portion in the observer. As a result, the coupling matrix W_i W_i^T is rank-deficient in general. This point is in sharp contrast to previous results such as [38], where the coupling term is nonsingular so that the design is more complicated. With x_i := χ̂_i − χ and B_i = W_i W_i^T, the error dynamics becomes

    ẋ_i = (S − U_i G_i) x_i + U_i n_i + k B_i ∑_{j=1}^N α_{ij} (x_j − x_i),   i ∈ N.

This is precisely the multi-agent system (4.34), where in this case the matrices Z_i and W_i are tied to the detectability decomposition. In particular, from the detectability of the pair (G, S), it is seen that ⋂_{i=1}^N im(W_i) = ⋂_{i=1}^N ker(Z_i^T) = {0} (which corresponds to R_B in (4.35)), by the fact that ker(Z_i^T) is the undetectable subspace of the pair (G_i, S). This implies that p_s = 0, V is null, and thus V̄ can be chosen as the identity matrix. With them, the blended dynamics (4.42) is given by (with the state s being null)

    ẑ̇_i = Z_i^T (S − U_i G_i)(Z_i ẑ_i − W_i Λ_i 𝓛_i ẑ) + Z_i^T U_i n_i
         = (Z_i^T S − Ū_i Ḡ_i Z_i^T)(Z_i ẑ_i − W_i Λ_i 𝓛_i ẑ) + Ū_i n_i
         = (S̄_i − Ū_i Ḡ_i) ẑ_i + Ū_i n_i,   i ∈ N.

Since S̄_i − Ū_i Ḡ_i is Hurwitz for all i, the blended dynamics is contractive, and Theorem 4.5 asserts that, with a sufficiently large k, the estimation error x_i(t) behaves like Z_i ẑ_i(t). Moreover, if there is no measurement noise, then the set A_b = {0} ⊂ R^{nN−p̄} is globally exponentially stable for the blended dynamics. Then, Theorem 4.6 asserts that the proposed distributed state observer exponentially finds the correct estimate with a sufficiently large k, because (4.44) holds with A_x = {0} ⊂ R^{nN}.
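The step from detectability of (G, S) to p_s = 0 rests on a linear-algebra fact: ⋂_i ker(Z_i^T) is the kernel of the stacked matrix col(Z_1^T, ..., Z_N^T), which is {0} exactly when that matrix has full column rank. A small numerical sketch with hypothetical subspaces (not from the chapter):

```python
import numpy as np

e = np.eye(3)
# Hypothetical decomposition in R^3: agent i detects two directions (columns of
# Z_i) and misses one (W_i). No single agent is detectable on its own, but the
# stacked matrix col(Z_1^T, Z_2^T, Z_3^T) has full column rank.
Z = [e[:, [0, 1]], e[:, [1, 2]], e[:, [2, 0]]]
W = [e[:, [2]],    e[:, [0]],    e[:, [1]]]

A = np.vstack([Zi.T for Zi in Z])
# x lies in im(W_i) iff (I - W_i W_i^T) x = 0, so the intersection of the
# im(W_i) is the kernel of the stacked projectors:
P = np.vstack([np.eye(3) - Wi @ Wi.T for Wi in W])
dim_RB = 3 - np.linalg.matrix_rank(P)              # dim of R_B = intersection of im(W_i)
```

Here each agent misses a different direction, so the missed subspaces intersect only at the origin: R_B = {0} and hence p_s = 0, matching the discussion above.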

4.5 Robustness of Emergent Collective Behavior

When a product is manufactured in a factory, or some cells and organs are produced in an organism, a certain level of variance is inevitable due to imperfection of the production process. In this case, how can the variance in the outcomes be reduced if improving the process itself is not easy, or even impossible? As we have seen throughout the chapter, the emergent collective behavior of the network involves averaging the vector fields of the individual agents; that is, the network behavior is governed by the blended dynamics if the coupling strength is sufficiently large. Therefore, even if the individual agents are created with relatively large variance from their reference model, the blended dynamics can have smaller variance because of the averaging effect. Then, when the coupling gain is large, all the agents, which were created with large variance, can behave like an agent created with small variance. In this section, we illustrate this point. In particular, we simulate a network of pacemaker cells under a single conduction line. The nominal behavior of a pacemaker cell is given in [39] as

    z̈ + (1.45 z² − 2.465 z − 0.551) ż + z = 0,   (4.46)

which is a Liénard system of the type considered in Sect. 4.3.1 and has a stable limit cycle. Now, suppose that a group of pacemaker cells is produced with some uncertainty, so that they are represented as

    z̈_i + f_i(z_i) ż_i + g_i(z_i) = u_i,   i = 1, ..., N,

where


    f_i(z_i) = 0.1 Δ_i^1 z_i³ + (1.45 + Δ_i^2) z_i² − (2.465 + Δ_i^3) z_i − (0.551 + Δ_i^4),
    g_i(z_i) = (1 + Δ_i^5) z_i + 0.1 Δ_i^6 z_i²,

in which all Δ_i^l are randomly chosen from a distribution with zero mean and unit variance. With u_i = k ∑_{j ∈ N_i} (ż_j + z_j − ż_i − z_i), the blended dynamics of the group of pacemakers is given as the averaged Liénard system (4.25) with

    f̂(z) = 0.1 Δ̄_1 z³ + (1.45 + Δ̄_2) z² − (2.465 + Δ̄_3) z − (0.551 + Δ̄_4),
    ĝ(z) = (1 + Δ̄_5) z + 0.1 Δ̄_6 z²,

where Δ̄_l = (1/N) ∑_{i=1}^N Δ_i^l, whose expectation is zero and variance is 1/N. By Chebyshev's theorem in probability, the behavior of the blended dynamics recovers that of (4.46) almost surely as N tends to infinity. It is emphasized that some agents may not have a stable limit cycle, depending on their random selection of Δ_i^l, but the network can still exhibit oscillatory behavior, and the frequency and the shape of oscillation become more robust as the number of agents grows. Figure 4.1 shows the simulation results of the pacemaker network when the number of agents is 10, 100, and 1000, respectively. For example, we randomly generated the network for N = 10 three times independently and plotted the behaviors in Fig. 4.1a–c, respectively. The variation is large in this case, and Fig. 4.1b even shows a case in which no stable limit cycle exists. On the other hand, when N = 1000, the randomly generated networks exhibit rather uniform behavior, as in Fig. 4.1g–i. For the simulation, the initial condition is z_i(0) = ż_i(0) = 1, the coupling gain is k = 50, and the graph has all-to-all connections. We refer the reader to [14] for more discussion in this direction.
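The averaging effect behind this robustness claim is straightforward to check numerically. The sketch below (hypothetical sampling, not from the chapter) draws perturbations Δ_i^l from a zero-mean, unit-variance distribution and confirms that the empirical variance of the averaged coefficient Δ̄_l shrinks like 1/N, which is exactly what Chebyshev's inequality exploits.

```python
import numpy as np

rng = np.random.default_rng(0)
trials = 20000

def var_of_mean(N):
    """Empirical variance of Delta_bar = (1/N) sum_i Delta_i for unit-variance Delta_i."""
    samples = rng.normal(0.0, 1.0, size=(trials, N))
    return samples.mean(axis=1).var()

v10, v1000 = var_of_mean(10), var_of_mean(1000)
# Var(Delta_bar) = 1/N, so the blended coefficients concentrate around their
# nominal values as the network grows: P(|Delta_bar| > eps) <= 1/(N eps^2).
```

A network of 1000 cells thus sees coefficient deviations roughly ten times smaller (in standard deviation) than a network of 10 cells, matching the increasingly uniform behavior in Fig. 4.1.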

4.6 More than Linear Coupling

Until now, we have considered linear diffusive couplings with constant strength k. In this section, let us consider two particular nonlinear couplings, edge-wise and node-wise funnel coupling, whose coupling strength varies as a nonlinear function of time and of the differences between the states.

4.6.1 Edge-Wise Funnel Coupling

The coupling law to be considered is inspired by the so-called funnel controller [40]. For the multi-agent system

    ẋ_i = f_i(t, x_i) + u_i ∈ R,   i ∈ N,


[Fig. 4.1 Simulation results of randomly generated pacemaker networks; panels (a)–(c): N = 10, panels (d)–(f): N = 100, panels (g)–(i): N = 1000. Initial condition is (1, 1) for all cases.]

let us consider the following edge-wise funnel coupling law, with ν_ij := x_j − x_i:

    u_i(t, {ν_ij, j ∈ N_i}) := ∑_{j ∈ N_i} γ_ij( |ν_ij| / ψ_ij(t) ) · ν_ij / ψ_ij(t),   (4.47)

where each function ψ_ij : [t_0, ∞) → R_{>0} is positive, bounded, and differentiable with bounded derivative, and the gain functions γ_ij : [0, 1) → R_{≥0} are strictly increasing and unbounded as s → 1. We assume the symmetry of the coupling functions; that is, ψ_ij = ψ_ji and γ_ij = γ_ji for all i ∈ N and j ∈ N_i (or, equivalently, for all j ∈ N and i ∈ N_j, because of the symmetry of the graph). A possible choice for γ_ij and ψ_ij is


[Fig. 4.2 State difference ν_ij evolves within the funnel F_{ψ_ij}, so that the synchronization error can be prescribed by the shape of the funnel.]

    γ_ij(s) = 1/(1 − s)   and   ψ_ij(t) = (ψ − η) e^{−λ(t−t_0)} + η,

where ψ, λ, η > 0. With the funnel coupling (4.47), it is shown in [41] that, under the assumption that no finite escape time exists, the state difference ν_ij(t) evolves within the funnel

    F_{ψ_ij} := { (t, ν_ij) : |ν_ij| < ψ_ij(t) }   if |ν_ij(t_0)| < ψ_ij(t_0), ∀i ∈ N, j ∈ N_i,

as can be seen in Fig. 4.2. Therefore, approximate synchronization of arbitrary precision can be achieved with arbitrarily small η > 0 such that lim sup_{t→∞} ψ_ij(t) ≤ η. Indeed, due to the connectivity of the graph, it follows from lim sup_{t→∞} |ν_ij(t)| ≤ η, ∀i ∈ N, j ∈ N_i, that

    lim sup_{t→∞} |x_j(t) − x_i(t)| ≤ d η,   ∀i, j ∈ N,   (4.48)

where d is the diameter of the graph. For the complete graph, we have d = 1. Here, we note that, by the symmetry of ψ_ij and γ_ij and by the symmetry of the graph, it holds that

    ∑_{i=1}^N u_i = ∑_{i=1}^N ∑_{j ∈ N_i} γ_ij( |ν_ij| / ψ_ij(t) ) ν_ij / ψ_ij(t)
                  = − ∑_{i=1}^N ∑_{j ∈ N_i} γ_ji( |ν_ji| / ψ_ji(t) ) ν_ji / ψ_ji(t)
                  = − ∑_{j=1}^N ∑_{i ∈ N_j} γ_ji( |ν_ji| / ψ_ji(t) ) ν_ji / ψ_ji(t) = − ∑_{j=1}^N u_j.

Therefore, we have that

    0 = ∑_{i=1}^N u_i = ∑_{i=1}^N ( ẋ_i(t) − f_i(t, x_i(t)) ),   (4.49)


which holds regardless of whether synchronization is achieved or not. If all x_i's are synchronized, so that x_i(t) = s(t) for a common trajectory s(t), then ẋ_i = f_i(t, s) + u_i = ṡ for all i ∈ N; i.e., u_i(t) compensates the term f_i(t, s(t)) so that all ẋ_i(t) become the same ṡ. Hence, (4.49) implies that

    ṡ = (1/N) ∑_{i=1}^N f_i(t, s) =: f_s(t, s).   (4.50)

In other words, enforcing synchronization under the condition (4.49) yields an emergent behavior for x_i(t) = s(t), governed by the blended dynamics (4.50). In practice, the funnel coupling (4.47) enforces approximate synchronization as in (4.48), and thus the behavior of the network is not exactly the same as (4.50), but it can be shown to be close to it. More details are found in [41]. A utility of the edge-wise funnel coupling is for decentralized design, because the common gain k, whose threshold k* encodes global information about the graph and the individual vector fields of the agents, is not used. Therefore, each agent can self-construct its own dynamics when joining the network. (For example, if it is used for the distributed least-squares solver in Sect. 4.2.2, then the agent dynamics (4.11) can be constructed without any global information.) Indeed, when an agent joins the network, it can handshake with the agents to be connected and communicate to set the function ψ_ij(t) so that the state difference ν_ij at the moment of joining resides inside the funnel.
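The funnel-invariance property can be illustrated with a minimal two-agent sketch (hypothetical drifts f_1, f_2 and funnel parameters, not from the chapter), using γ(s) = 1/(1 − s) and the exponentially shrinking ψ from above. The simulation checks that |ν_12(t)| never reaches ψ_12(t).

```python
import numpy as np

def gamma(s):                        # gain gamma(s) = 1/(1 - s): blows up as s -> 1
    return 1.0 / (1.0 - s)

def psi(t, psi0=2.0, eta=0.1, lam=1.0):
    return (psi0 - eta) * np.exp(-lam * t) + eta

f1 = lambda t, x: np.sin(t) + 1.0    # hypothetical bounded drifts, deliberately different
f2 = lambda t, x: np.cos(t) - 1.0

dt, T = 1e-4, 5.0
x1, x2 = 0.5, -0.5                   # |nu_12(0)| = 1 < psi(0) = 2: start inside the funnel
margin = []
for step in range(int(T / dt)):
    t = step * dt
    nu = x2 - x1                     # nu_12 = -nu_21
    p = psi(t)
    u1 = gamma(abs(nu) / p) * nu / p
    u2 = -u1                         # edge-wise symmetry on the single edge
    x1 += dt * (f1(t, x1) + u1)
    x2 += dt * (f2(t, x2) + u2)
    margin.append(psi(t + dt) - abs(x2 - x1))

margin = np.array(margin)
```

The margin array certifies that the trajectory never touches the funnel boundary; the difference |x_2 − x_1| is eventually confined below ψ(t), which decays toward η = 0.1, matching (4.48) with d = 1 for a single edge.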

4.6.2 Node-Wise Funnel Coupling

Motivated by the observation in the previous subsection that the enforced synchronization under the condition (4.49) gives rise to the emergent behavior (4.50), let us illustrate how different nonlinear couplings may yield different emergent behaviors. As a particular example, we consider the node-wise funnel coupling given by

    u_i(t, ν_i) := γ_i( |ν_i| / ψ_i(t) ) · ν_i / ψ_i(t),   where ν_i = ∑_{j ∈ N_i} α_{ij} (x_j − x_i),   (4.51)

where each function ψ_i : [t_0, ∞) → R_{>0} is positive, bounded, and differentiable with bounded derivative, and the gain functions γ_i : [0, 1) → R_{≥0} are strictly increasing and unbounded as s → 1. A possible choice for γ_i and ψ_i is

    γ_i(s) = { (δ/s) tan( (π/2) s ),  s > 0;   (π/2) δ,  s = 0 }   and   ψ_i(t) = (ψ − η) e^{−λ(t−t_0)} + η,   (4.52)

where δ, ψ, λ, η > 0.


With the funnel coupling (4.51), it is shown in [42] that, under the assumption that no finite escape time exists, the quantity ν_i(t) evolves within the funnel F_{ψ_i} := { (t, ν_i) : |ν_i| < ψ_i(t) } if |ν_i(t_0)| < ψ_i(t_0), ∀i ∈ N. Therefore, approximate synchronization of arbitrary precision can be achieved with arbitrarily small η > 0 such that lim sup_{t→∞} ψ_i(t) ≤ η. Indeed, due to the connectivity of the graph, it follows that

    lim sup_{t→∞} |x_j(t) − x_i(t)| ≤ (2√N / λ_2) η,   ∀i, j ∈ N,   (4.53)

where λ_2 is the second smallest eigenvalue of L. Unlike the case of edge-wise funnel coupling, there is a lack of symmetry, so the equality (4.49) does not hold. However, assuming that the map ν_i ↦ u_i(t, ν_i) is invertible (which is the case for (4.52), for example), so that there is a function V_i such that ν_i = V_i(t, u_i(t, ν_i)) for all t and ν_i, we can instead make use of the symmetry in ν_i as

    ∑_{i=1}^N ∑_{j ∈ N_i} α_{ij} (x_j − x_i) = − ∑_{i=1}^N ∑_{j ∈ N_i} α_{ji} (x_i − x_j) = − ∑_{j=1}^N ∑_{i ∈ N_j} α_{ji} (x_i − x_j),

which leads to

    0 = ∑_{i=1}^N ν_i = ∑_{i=1}^N V_i(t, u_i(t, ν_i)) = ∑_{i=1}^N V_i(t, ẋ_i(t) − f_i(t, x_i(t))).   (4.54)

This holds regardless of whether synchronization is achieved or not. If all x_i's are synchronized, so that x_i(t) = s(t) for a common trajectory s(t), then ẋ_i(t) = f_i(t, s) + u_i(t) = ṡ for all i ∈ N; i.e., u_i(t) compensates the term f_i(t, s) so that all ẋ_i are the same as ṡ, which can be denoted by f_s(t, s) = f_i(t, s) + u_i(t). Hence, (4.54) implies that

    ∑_{i=1}^N V_i(t, f_s(t, s) − f_i(t, s)) = 0.   (4.55)

In other words, (4.55) defines f_s(t, s) implicitly, which yields the emergent behavior governed by

    ṡ = f_s(t, s).   (4.56)

In practice, the funnel coupling (4.51) enforces approximate synchronization as in (4.53), and the behavior of the network is not exactly the same as (4.56), but it is shown in [42] to be close to (4.56).


In order to illustrate that different emergent behaviors may arise from various nonlinear couplings, let us consider the example of (4.52), for which the function V_i is given by

    V_i(t, u_i) = (2 ψ_i(t) / π) tan⁻¹( u_i / δ ).

Assuming that all ψ_i's are the same, the emergent behavior ṡ = f_s(t, s) can be found with f_s(t, s) being the solution to

    0 = ∑_{i=1}^N tan⁻¹( ( f_s(t, s) − f_i(t, s) ) / δ ).

If we let δ → 0, then the above equality shares its solution with

    0 = ∑_{i=1}^N sgn( f_s(t, s) − f_i(t, s) ).
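The δ → 0 limit above can be explored numerically. The sketch below (hypothetical values f_i, not from the chapter) solves 0 = ∑_i arctan((f_s − f_i)/δ) for f_s by bisection, which is valid because the left-hand side is strictly increasing in f_s, and compares the root with the median of the f_i as δ shrinks.

```python
import numpy as np

def blended_fs(f_vals, delta, lo=-100.0, hi=100.0, iters=200):
    """Bisection on F(y) = sum_i arctan((y - f_i)/delta), strictly increasing in y."""
    F = lambda y: float(np.sum(np.arctan((y - f_vals) / delta)))
    for _ in range(iters):
        mid = 0.5 * (lo + hi)
        lo, hi = (mid, hi) if F(mid) < 0.0 else (lo, mid)
    return 0.5 * (lo + hi)

# Hypothetical values f_i(t, s) at a frozen (t, s), including one large outlier.
f_vals = np.array([1.0, 1.2, 0.9, 1.1, 50.0])
roots = {d: blended_fs(f_vals, d) for d in (1.0, 0.1, 0.001)}
```

For δ = 0.001 the root essentially coincides with the median 1.1 and is unaffected by the outlier 50.0, whereas the arithmetic mean of these f_i would be pulled above 10.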

Recalling the discussions in Sect. 4.2.3, it can be shown that f_s(t, s) takes the median of all the individual vector fields f_i(t, s), i = 1, ..., N. Since taking the median is a simple and effective way to reject outliers, this observation may find further applications in practice.

Acknowledgements This work was supported by the National Research Foundation of Korea Grant funded by the Korean Government (Ministry of Science and ICT) under No. NRF-2017R1E1A1A03070342 and No. 2019R1A6A3A12032482.

References

1. Olfati-Saber, R., Murray, R.M.: Consensus problems in networks of agents with switching topology and time-delays. IEEE Trans. Autom. Control 49(9), 1520–1533 (2004)
2. Moreau, L.: Stability of continuous-time distributed consensus algorithms. In: Proceedings of 43rd IEEE Conference on Decision and Control, pp. 3998–4003 (2004)
3. Ren, W., Beard, R.W.: Consensus seeking in multiagent systems under dynamically changing interaction topologies. IEEE Trans. Autom. Control 50(5), 655–661 (2005)
4. Seo, J.H., Shim, H., Back, J.: Consensus of high-order linear systems using dynamic output feedback compensator: low gain approach. Automatica 45(11), 2659–2664 (2009)
5. Kim, H., Shim, H., Seo, J.H.: Output consensus of heterogeneous uncertain linear multi-agent systems. IEEE Trans. Autom. Control 56(1), 200–206 (2011)
6. De Persis, C., Jayawardhana, B.: On the internal model principle in formation control and in output synchronization of nonlinear systems. In: Proceedings of 51st IEEE Conference on Decision and Control, pp. 4894–4899 (2012)
7. Isidori, A., Marconi, L., Casadei, G.: Robust output synchronization of a network of heterogeneous nonlinear agents via nonlinear regulation theory. IEEE Trans. Autom. Control 59(10), 2680–2691 (2014)
8. Su, Y., Huang, J.: Cooperative semi-global robust output regulation for a class of nonlinear uncertain multi-agent systems. Automatica 50(4), 1053–1065 (2014)
9. Casadei, G., Astolfi, D.: Multipattern output consensus in networks of heterogeneous nonlinear agents with uncertain leader: a nonlinear regression approach. IEEE Trans. Autom. Control 63(8), 2581–2587 (2017)
10. Su, Y.: Semi-global output feedback cooperative control for nonlinear multi-agent systems via internal model approach. Automatica 103, 200–207 (2019)
11. Zhang, M., Saberi, A., Stoorvogel, A.A., Grip, H.F.: Almost output synchronization for heterogeneous time-varying networks for a class of non-introspective, nonlinear agents without exchange of controller states. Int. J. Robust Nonlinear Control 26(17), 3883–3899 (2016)
12. DeLellis, P., Di Bernardo, M., Liuzza, D.: Convergence and synchronization in heterogeneous networks of smooth and piecewise smooth systems. Automatica 56, 1–11 (2015)
13. Montenbruck, J.M., Bürger, M., Allgöwer, F.: Practical synchronization with diffusive couplings. Automatica 53, 235–243 (2015)
14. Kim, J., Yang, J., Shim, H., Kim, J.-S., Seo, J.H.: Robustness of synchronization of heterogeneous agents by strong coupling and a large number of agents. IEEE Trans. Autom. Control 61(10), 3096–3102 (2016)
15. Panteley, E., Loría, A.: Synchronization and dynamic consensus of heterogeneous networked systems. IEEE Trans. Autom. Control 62(8), 3758–3773 (2017)
16. Lee, S., Yun, H., Shim, H.: Practical synchronization of heterogeneous multi-agent system using adaptive law for coupling gains. In: Proceedings of American Control Conference, pp. 454–459 (2018)
17. Modares, H., Lewis, F.L., Kang, W., Davoudi, A.: Optimal synchronization of heterogeneous nonlinear systems with unknown dynamics. IEEE Trans. Autom. Control 63(1), 117–131 (2017)
18. Sanders, J.A., Verhulst, F.: Averaging Methods in Nonlinear Dynamical Systems. Springer, Berlin (1985)
19. Lee, J.G., Shim, H.: A tool for analysis and synthesis of heterogeneous multi-agent systems under rank-deficient coupling. Automatica 117 (2020)
20. Lohmiller, W., Slotine, J.J.E.: On contraction analysis for non-linear systems. Automatica 34(6), 683–696 (1998)
21. Shames, I., Charalambous, T., Hadjicostis, C.N., Johansson, M.: Distributed network size estimation and average degree estimation and control in networks isomorphic to directed graphs. In: Proceedings of 50th Annual Allerton Conference on Communication, Control, and Computing, pp. 1885–1892 (2012)
22. Lee, D., Lee, S., Kim, T., Shim, H.: Distributed algorithm for the network size estimation: blended dynamics approach. In: Proceedings of 57th IEEE Conference on Decision and Control, pp. 4577–4582 (2018)
23. Mou, S., Morse, A.S.: A fixed-neighbor, distributed algorithm for solving a linear algebraic equation. In: Proceedings of 12th European Control Conference, pp. 2269–2273 (2013)
24. Mou, S., Liu, J., Morse, A.S.: A distributed algorithm for solving a linear algebraic equation. IEEE Trans. Autom. Control 60(11), 2863–2878 (2015)
25. Anderson, B.D.O., Mou, S., Morse, A.S., Helmke, U.: Decentralized gradient algorithm for solution of a linear equation. Numer. Algebra Control Optim. 6(3), 319–328 (2016)
26. Wang, X., Zhou, J., Mou, S., Corless, M.J.: A distributed linear equation solver for least square solutions. In: Proceedings of 56th IEEE Conference on Decision and Control, pp. 5955–5960 (2017)
27. Shi, G., Anderson, B.D.O.: Distributed network flows solving linear algebraic equations. In: Proceedings of American Control Conference, pp. 2864–2869 (2016)
28. Shi, G., Anderson, B.D.O., Helmke, U.: Network flows that solve linear equations. IEEE Trans. Autom. Control 62(6), 2659–2674 (2017)
29. Liu, Y., Lou, Y., Anderson, B.D.O., Shi, G.: Network flows as least squares solvers for linear equations. In: Proceedings of 56th IEEE Conference on Decision and Control, pp. 1046–1051 (2017)
30. Lee, J.G., Shim, H.: A distributed algorithm that finds almost best possible estimate under non-vanishing and time-varying measurement noise. IEEE Control Syst. Lett. 4(1), 229–234 (2020)
31. Lee, J.G., Kim, J., Shim, H.: Fully distributed resilient state estimation based on distributed median solver. IEEE Trans. Autom. Control 65(9), 3935–3942 (2020)
32. Yun, H., Shim, H., Ahn, H.-S.: Initialization-free privacy-guaranteed distributed algorithm for economic dispatch problem. Automatica 102, 86–93 (2019)
33. Lee, J.G., Shim, H.: Behavior of a network of heterogeneous Liénard systems under strong output coupling. In: Proceedings of 11th IFAC Symposium on Nonlinear Control Systems, pp. 342–347 (2019)
34. Dörfler, F., Bullo, F.: Synchronization in complex networks of phase oscillators: a survey. Automatica 50(6), 1539–1564 (2014)
35. Bai, H., Freeman, R.A., Lynch, K.M.: Distributed Kalman filtering using the internal model average consensus estimator. In: Proceedings of American Control Conference, pp. 1500–1505 (2011)
36. Kim, J., Shim, H., Wu, J.: On distributed optimal Kalman-Bucy filtering by averaging dynamics of heterogeneous agents. In: Proceedings of 55th IEEE Conference on Decision and Control, pp. 6309–6314 (2016)
37. Mitra, A., Sundaram, S.: An approach for distributed state estimation of LTI systems. In: Proceedings of 54th Annual Allerton Conference on Communication, Control, and Computing, pp. 1088–1093 (2016)
38. Kim, T., Shim, H., Cho, D.D.: Distributed Luenberger observer design. In: Proceedings of 55th IEEE Conference on Decision and Control, pp. 6928–6933 (2016)
39. dos Santos, A.M., Lopes, S.R., Viana, R.L.: Rhythm synchronization and chaotic modulation of coupled Van der Pol oscillators in a model for the heartbeat. Physica A: Stat. Mech. Appl. 338(3–4), 335–355 (2004)
40. Ilchmann, A., Ryan, E.P., Sangwin, C.J.: Tracking with prescribed transient behaviour. ESAIM: Control, Optim. Calc. Var. 7, 471–493 (2002)
41. Lee, J.G., Berger, T., Trenn, S., Shim, H.: Utility of edge-wise funnel coupling for asymptotically solving distributed consensus optimization. In: Proceedings of 19th European Control Conference, pp. 911–916 (2020)
42. Lee, J.G., Trenn, S., Shim, H.: Synchronization with prescribed transient behavior: heterogeneous multi-agent systems under funnel coupling. Under review (2020). https://arxiv.org/abs/2012.14580

Chapter 5

Contributions to the Problem of High-Gain Observer Design for Hyperbolic Systems

Constantinos Kitsos, Gildas Besançon, and Christophe Prieur

Abstract This chapter proposes some non-trivial extensions of the classical high-gain observer designs for finite-dimensional nonlinear systems to some classes of infinite-dimensional ones, written as triangular systems of coupled first-order hyperbolic Partial Differential Equations (PDEs), where an observation of only one coordinate of the state is considered as the system's output. These forms may include some epidemic models and tubular chemical reactors. To deal with this problem, depending on the number of distinct velocities of the hyperbolic system, direct and indirect observer designs are proposed. We first show intuitively how direct observer design can be applied to quasilinear partial integro-differential hyperbolic systems of balance laws with a single velocity, as a natural extension of the finite-dimensional case. We then introduce an indirect approach for systems with distinct velocities (up to three velocities), where an infinite-dimensional state transformation first maps the system into suitable systems of PDEs, and the convergence of the observer is subsequently exhibited in appropriate norms. This indirect approach leads to the use of spatial derivatives of the output in the observer dynamics.

The original version of this chapter was revised: the author's name "Gildas Besançon" was incorrect on the website; the correction has been incorporated. The correction to this chapter is available at https://doi.org/10.1007/978-3-030-74628-5_10

C. Kitsos (B), LAAS-CNRS, University of Toulouse, CNRS, 31400 Toulouse, France. e-mail: [email protected]
C. Kitsos · G. Besançon · C. Prieur, Univ. Grenoble Alpes, CNRS, Grenoble INP, GIPSA-lab, 38000 Grenoble, France. e-mail: [email protected]
C. Prieur, e-mail: [email protected]

© The Author(s), under exclusive license to Springer Nature Switzerland AG 2022, corrected publication 2022. Z.-P. Jiang et al. (eds.), Trends in Nonlinear and Adaptive Control, Lecture Notes in Control and Information Sciences 488. https://doi.org/10.1007/978-3-030-74628-5_5

Dedication Control Theory owes a lot to Laurent Praly, and his scientific legacy has inspired the development of various new areas in Control. His long-term and still ongoing contributions in adaptive robust control and stabilization,

forwarding and backstepping methods, nonlinear observers, and output feedback control are deeply motivating, and it is a pleasure to offer the present chapter as a tribute to his work, to the quality of presentation of his results, to the rigor of his theorems, and to his continual pursuit of the greatest possible generality in his approaches.

5.1 Introduction

Our chapter deals with solutions to a High-Gain Observer Design Problem (H-GODP) for hyperbolic systems. This problem has already been addressed for finite-dimensional systems (see [12, 13]) and has gained significant attention in recent decades [17, 18]. High-gain observers rely on a tuning parameter (gain), chosen large enough to compensate for the nonlinear terms and to ensure an arbitrary convergence rate. This chapter aims at presenting some extensions of this design to infinite-dimensional systems, namely hyperbolic systems of balance laws obeying a triangular structure, similar to the observability form in finite dimensions (see [4]), while considering a single observation only.

There exist some studies on observer design for infinite-dimensional systems in the literature, mainly considering the full state vector on the boundaries as measurement. Among others, one can refer to [2, 11, 14, 29] for Lyapunov-based analysis and backstepping, and to [26] for optimization methods. The case of state estimation for nonlinear infinite-dimensional systems, which is significantly more complicated, has been addressed in [5, 6, 15, 25, 28, 30]. Unlike these approaches, the present chapter provides solutions to the H-GODP when a part of the state is fully unknown (including at the boundaries). The known part, however, is distributed, and the proposed observers strongly rely on high gain, extending techniques and performances of the finite-dimensional case. In general, the problem of control/observer design with a reduced number of controls/observations, smaller than the number of states, is difficult. To the best of our knowledge, observer design for systems with a reduced number of observations, whose error equations cannot achieve dissipativity in their boundary conditions, has not been considered before.
Somehow dual problems of controllability for cascade systems of PDEs with a reduced number of internal controls have already been considered (see in particular [1]). In [24], observability for coupled systems of linear PDEs with a reduced number of observations is studied. In the present work, we reveal some links to these works, stemming from our assumption of stronger regularity of the system. In addition, for hyperbolic systems an arbitrary convergence rate, a feature of high-gain observers, is desirable, since the boundary observers proposed in the literature, for instance [6], experience a limitation on the convergence speed due to transport phenomena. The minimum-time control problem (see for instance [10]) suggests that an observer faster than a boundary one would be desirable in some cases. When dealing with the H-GODP in infinite dimensions, the assumed triangularity of the source terms, similar to the finite-dimensional case, is not enough, and several difficulties arise from the properties of the hyperbolic operator. This might preclude designs for an arbitrarily large number of states. Also, the presence of

nonlocal terms in the dynamics, the generality of the boundary conditions, and the types of nonlinearities increase the complexity of the design. The main contribution here is the proof of solvability of the H-GODP, first for n × n quasilinear hyperbolic triangular systems with nonlocal terms, i.e., systems of Partial Integro-Differential Equations (PIDEs), with a single velocity function. Then, in the case of distinct velocities, the nonexistence of diagonal Lyapunov functionals that would yield a proof of observer convergence leads us to adopt an indirect strategy for 2 × 2 and 3 × 3 systems. For this, we introduce a nonlinear infinite-dimensional state transformation, in order to map the initial system into a new system of PDEs. The difficulty comes from the lack of a commutation property needed in the Lyapunov stability analysis. Note that constraints on the source term allowing a similar commutation can be found in some studies of stability problems, as in [3, 9, 27], but this is not the case here. The methodology proposed here requires stronger regularity of the system's dynamics, and spatial derivatives of the output up to order 2 are then injected into the high-gain observer dynamics, in addition to the classical output correction terms. The presence of nonlinearities in the velocity functions, which might also be of nonlocal nature, the presence of nonlocal and nonlinear components in the source terms, and the generality of the couplings on the boundaries are treated explicitly. The proposed approach relies on Lyapunov analysis in function spaces of stronger regularity and on the introduction of an infinite-dimensional state transformation. The direct observer design has already partially appeared in [20], without considering velocity functions of nonlocal nature.
The indirect one is inspired by our previous work on semilinear parabolic systems [21]; for quasilinear strictly hyperbolic systems, as considered here, it has not appeared before. In Sect. 5.2, we introduce the considered system, some examples from epidemics and chemical reactions, and then the main observer design problem (H-GODP), along with its complications and some proposed solutions. In Sect. 5.3, we present a direct approach to the H-GODP for a hyperbolic system with one velocity via a Lyapunov-based methodology. In Sect. 5.4, we show indirect solvability of the H-GODP for systems with distinct velocities, via a suitable infinite-dimensional state transformation, which maps the system into appropriate target systems for observer design.

Notation For a given w in Rⁿ, |w| denotes its usual Euclidean norm. For a given constant matrix A in Rⁿˣⁿ, A⊤ denotes its transpose, |A| := sup{|Aw|, |w| = 1} is its induced norm, and Sym(A) := (A + A⊤)/2 stands for its symmetric part. By eig(A), we denote the minimum eigenvalue of a symmetric matrix A. By Iₙ, we denote the identity matrix of dimension n. For given ξ : [0, +∞) × [0, L] → Rⁿ and time t ≥ 0, we use the notation ξ(t)(x) := ξ(t, x), for all x in [0, L], to refer to the profile at a given time, and by ξt or ∂t ξ (resp. ξx or ∂x ξ) we refer to its partial derivative with respect to t (resp. x). By dt (resp. dx), we refer to the total derivative with respect to t (resp. x). For a continuous (C⁰) map [0, L] ∋ x ↦ ξ(x) ∈ Rⁿ, we adopt the notation ‖ξ‖₀ := max{|ξ(x)|, x ∈ [0, L]} for its sup-norm. If this mapping is q-times continuously differentiable (C^q), we adopt the notation ‖ξ‖_q := Σᵢ₌₀^q ‖∂ₓⁱ ξ‖₀ for the q-norm. We use the difference operator Δ_ξ̂[F](ξ) := F[ξ̂] − F[ξ], parametrized by ξ̂, where F denotes any chosen operator acting on ξ. By Df, we denote the Jacobian

of a differentiable mapping Rⁿ ∋ u ↦ f(u) ∈ Rᵐ. For a Fréchet differentiable mapping F, by ⟨DF[u], h⟩ we denote its Fréchet derivative with respect to u acting on h. For a locally Lipschitz mapping F, F ∈ Lip_loc(X, ‖·‖_X) means that for every R > 0 there exists L_R > 0 such that, for every w, ŵ ∈ X with ‖w‖_X, ‖ŵ‖_X ≤ R, it holds that ‖F[w] − F[ŵ]‖_X ≤ L_R ‖w − ŵ‖_X. For a globally Lipschitz mapping F, i.e., F ∈ Lip(X, ‖·‖_X), the previous inequality holds for all w, ŵ ∈ X. By sgn(x) we denote the signum function sgn(x) = (d/dx)|x| for x ≠ 0, with sgn(0) = 0.
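As a side illustration (not part of the chapter), the sup-norm and q-norm just defined can be evaluated numerically for a sampled profile; the grid, the test profile ξ, and the finite-difference differentiation below are all illustrative assumptions:

```python
import numpy as np

def sup_norm(xi):
    """||xi||_0 = max over the grid of the Euclidean norm |xi(x)|.
    xi has shape (n, m): n components sampled at m grid points."""
    return np.max(np.linalg.norm(xi, axis=0))

def q_norm(xi, dx, q):
    """||xi||_q = sum_{i=0}^{q} ||d^i_x xi||_0, with spatial derivatives
    approximated by finite differences."""
    total, d = 0.0, xi
    for _ in range(q + 1):
        total += sup_norm(d)
        d = np.gradient(d, dx, axis=1)
    return total

L, m = 1.0, 2001
x = np.linspace(0.0, L, m)
xi = np.vstack([np.sin(np.pi * x), np.cos(np.pi * x)])  # n = 2 test profile

print(sup_norm(xi))                # = 1, since |xi(x)| = 1 for every x
print(q_norm(xi, x[1] - x[0], 1))  # ~ 1 + pi, since sup_x |d_x xi| = pi
```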

5.2 Problem Description and Solutions

In this section, we introduce the hyperbolic system, written in a triangular form that allows the observer design proposed in this chapter. It might be quasilinear and may contain both velocity functions of nonlocal nature and nonlocal source terms, making it a system of PIDEs. We illustrate some examples of systems having this triangular form and then introduce the main observer problem and its solutions.

5.2.1 Triangular Form for Observer Design

We are concerned with one-dimensional hyperbolic systems of balance laws, described by the following equations on a strip Π := [0, +∞) × [0, L]:

ξt(t, x) + Λ[ξ1(t)](x) ξx(t, x) = A ξ(t, x) + F[ξ(t)](x),   (5.1a)

where ξ = (ξ1, …, ξn)⊤. The matrix function Λ[·] contains the velocity functions of the balance law, each of which is assumed strictly positive, and is diagonal of the form

Λ[ξ1] := diag{λ1[ξ1], …, λn[ξ1]}.

We assume a specific structure of the source terms, which provides an internal coupling of the n equations in a triangular fashion. More specifically, the matrix A contains 1s on its superdiagonal and 0s elsewhere, i.e., it performs the operation

A ξ = (ξ2, ξ3, …, ξn, 0)⊤,

and the nonlinear source term F[·] has the form

F(ξ) = (F1[ξ1], F2[ξ1, ξ2], …, Fn[ξ1, …, ξn])⊤.

We assume that the mappings Λ might include local terms of the form λi(ξ1(t, x)), or nonlocal terms of the form λi(∫₀ˣ ξ1(t, s) ds) or ∫₀ˣ λi(ξ1(t, s)) ds, for instance. The same holds

for the nonlinear source term, which might include local terms of the form f(x, ξ(t, x)), integral terms of Volterra type of the form ∫₀ˣ f(s, ξ(t, s)) ds, and possibly boundary terms of the form f(ξ(t, l)), with l = 0, L.

Consider, also, a distributed measurement, available at the output, written as follows:

y(t, x) = C ξ(t, x),   (5.1b)

where

C = (1, 0, …, 0).

To complete the definition of the class of systems, let us consider an initial condition ξ⁰ (in general unknown) and boundary conditions as follows:

ξ(0, x) = ξ⁰(x),  x ∈ [0, L],   (5.2a)
ξ(t, 0) = H(ξ(t, L)),  t ∈ [0, +∞),   (5.2b)

where H is a nonlinear mapping coupling the incoming with the outgoing information on the boundaries. More details on the regularity of the dynamics and the properties of the system will be provided in the forthcoming sections. Note that, up to the hyperbolic operator, the above system has the same structure as the finite-dimensional nonlinear triangular systems appropriate for observer design, see [17].

We provide here some examples of dynamic phenomena, coming from epidemiology and chemical reactions, that can be described by triangular systems of hyperbolic PDEs as above. Note that some distributed Lotka–Volterra systems might also take this triangular form, as shown in [21], but obeying parabolic equations.

• SIR epidemic models

For infectious diseases, a fundamental model was formulated by Kermack and McKendrick (see [3, Chap. 1] for more details). In this model, the population is classified into three groups: (i) the individuals who are uninfected and susceptible (S) of catching the disease; (ii) the individuals who are infected (I) by the concerned pathogen; (iii) the recovered (R) individuals who have acquired a permanent immunity to the disease. Assuming that the age of patients is taken into account, S(t, x), I(t, x), R(t, x) represent the age distribution of the population of each group at time t. As a result, the integral from x1 to x2 of S, I, and R is the number of individuals of each group with ages between x1 and x2. The dynamics of the disease propagation in the population are then described by the following set of hyperbolic PIDEs on Π:

St(t, x) + Sx(t, x) + μ(x) S(t, x) + G[S(t), I(t)](x) = 0,
It(t, x) + Ix(t, x) + (γ(x) + μ(x)) I(t, x) − G[S(t), I(t)](x) = 0,   (5.3)
Rt(t, x) + Rx(t, x) + μ(x) R(t, x) − γ(x) I(t, x) = 0,

where G[S(t), I(t)](x) := β(x) S(t, x) ∫₀ᴸ I(t, s) ds stands for the disease transmission rate by contact between susceptible and infected individuals, which is assumed to be proportional to the sizes of both groups, with β(x) > 0 being the age-dependent transmission coefficient between all infected individuals and susceptibles of age x. The maximal life duration in the considered population is denoted by L and, thus, S(t, L) = I(t, L) = R(t, L) = 0. The parameter μ(x) > 0 denotes the natural age-dependent per capita death rate in the population, and γ(x) > 0 is the age-dependent rate at which infected individuals recover from the disease. We also assume boundary conditions of the form S(t, 0) = B(t), I(t, 0) = 0, R(t, 0) = 0, where B(t) stands for the inflow of newborn individuals into the susceptible part of the population at time t. Assume that we are able to measure the number of people in the group R of recovered patients between ages 0 and x, for every age x ∈ [0, L] and time t ≥ 0, i.e., the system's output is given by the quantity ∫₀ˣ R(t, s) ds. System (5.3) is written in the form (5.1a)-(5.1b)-(5.2) by applying a nonlocal transformation of the following form:

ξ1(t, x) = ∫₀ˣ R(t, s) ds,
ξ2(t, x) = ∫₀ˣ γ(s) I(t, s) ds,   (5.4)
ξ3(t, x) = ∫₀ˣ β(s) γ(s) S(t, s) ds ∫₀ᴸ I(t, s) ds.
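Numerically, the nonlocal transformation (5.4) amounts to cumulative integrals of the profiles. The sketch below is not from the chapter; S, I, R and the coefficients β, γ are made-up smooth profiles used only for illustration:

```python
import numpy as np

def cumtrapz0(f, dx):
    """Cumulative trapezoidal integral of f over [0, x], equal to 0 at x = 0."""
    return np.concatenate(([0.0], np.cumsum(0.5 * (f[1:] + f[:-1]) * dx)))

L, m = 1.0, 1001
x = np.linspace(0.0, L, m)
dx = x[1] - x[0]

# Illustrative age profiles and coefficients (purely hypothetical)
S = 1.0 + 0.5 * np.cos(np.pi * x)   # susceptible
I = 0.2 * np.sin(np.pi * x)         # infected
R = 0.1 * x * (L - x)               # recovered
beta = 1.0 + 0.1 * x                # transmission coefficient beta(x)
gamma = 0.5 + 0.2 * x               # recovery rate gamma(x)

total_I = cumtrapz0(I, dx)[-1]      # integral of I over [0, L]
xi1 = cumtrapz0(R, dx)              # measured output profile
xi2 = cumtrapz0(gamma * I, dx)
xi3 = cumtrapz0(beta * gamma * S, dx) * total_I

print(xi1[-1])  # total recovered population (boundary value of the output)
```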

Then, in the new coordinates, the system is written in the general form (5.1a) considered here, with constant velocities, namely Λ[ξ1] = In, and with its nonlinear source term of the form F[ξ(t)](x) := F(x, ξ(t)(x)), also containing nonlinear nonlocal terms, more explicitly some integral terms of Volterra type and boundary terms. For the exact form of these mappings, derived from the transformation (5.4), the reader can refer to [20]. Such a problem, where the hyperbolic operator has a single velocity, is investigated in Sect. 5.3, and it allows a direct observer design. Note also that, due to the nonlocal nature of the transformation (5.4), one needs to prove the convergence of a candidate observer for the system in the new coordinates ξ in the 1-spatial norm (and not merely in the sup-spatial norm), in order to be able to return to the original coordinates of the SIR model.

• Tubular chemical reactors

Control and observer designs for chemical reactors in the context of distributed parameter systems have been widely investigated, see for instance [7]. We present

here a model of a parallel plug flow chemical reactor (see [3, Chap. 5.1]). A plug flow chemical reactor is a tubular reactor in which a liquid reaction mixture circulates. The reaction proceeds as the reactants travel through the reactor. We consider the case of a horizontal reactor, where a simple mono-molecular reaction takes place between A and B, with A the reactant species and B the desired product. The reaction is supposed to be exothermic, and a jacket is used to cool the reactor. The cooling fluid flows around the wall of the tubular reactor. The dynamics are described by the following hyperbolic equations on Π:

∂t Tc + Vc ∂x Tc − k0 (Tc − Tr) = 0,
∂t Tr + Vr ∂x Tr + k0 (Tc − Tr) − k1 r(Tr, C_A) = 0,   (5.5)
∂t C_A + Vr ∂x C_A + r(Tr, C_A) = 0,

where Vc is the coolant velocity in the jacket, Vr is the reactive fluid velocity in the reactor, k0 and k1 are positive constants, Tc(t, x) is the coolant temperature, Tr(t, x) is the reactor temperature, and C_A(t, x) is the concentration of the chemical A in the reaction medium. The function r(Tr, C_A) stands for the reaction rate and is given by

r(Tr, C_A) := ((a + b) C_A − b C_A^in) exp(−E/(R Tr)),

where we have assumed that the sum of the concentrations is constant, C_A(t, x) + C_B(t, x) = C_A^in, as it is simply described by a delay equation. Also, a, b are rate constants, C_A^in is the concentration at the left endpoint, E is the activation energy, and R is the Boltzmann constant. We consider boundary conditions of the form

Tr(t, 0) = Tr^in,  Tc(t, 0) = Tc^in,  C_A(t, 0) = C_A^in,  C_B(t, 0) = 0.

Assuming that the measured quantity is the coolant temperature Tc, we can transform system (5.5) into the form (5.1a)-(5.1b)-(5.2) by applying the invertible transformation

ξ1 = Tc,  ξ2 = Tr,  ξ3 = k1 ((a + b) C_A − b C_A^in) exp(−E/(R Tr)).

In this example, the hyperbolic operator has distinct velocities. Such a problem is investigated in Sect. 5.4.
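The change of coordinates above is invertible at each point: given ξ2 = Tr, the third coordinate can be solved back for C_A. A minimal sketch with purely illustrative constants (a, b, k1, E/R, C_A^in are not taken from any real reactor):

```python
import numpy as np

a, b, k1 = 2.0, 0.5, 1.0
E_over_R = 50.0    # activation energy over the gas constant (made up)
CA_in = 1.0        # inlet concentration (made up)

def xi3_of(Tr, CA):
    """Third transformed coordinate xi3 = k1 ((a+b) C_A - b C_A^in) exp(-E/(R Tr))."""
    return k1 * ((a + b) * CA - b * CA_in) * np.exp(-E_over_R / Tr)

def CA_of(Tr, xi3):
    """Invert the transformation for C_A at fixed reactor temperature Tr."""
    return (xi3 * np.exp(E_over_R / Tr) / k1 + b * CA_in) / (a + b)

Tr, CA = 320.0, 0.7
assert abs(CA_of(Tr, xi3_of(Tr, CA)) - CA) < 1e-12
print("round-trip recovery of C_A succeeded")
```

The invertibility is what lets an observer built in the ξ coordinates be mapped back to estimates of Tc, Tr, and C_A.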

5.2.2 The High-Gain Observer Design Problem

We present here the main problem this chapter deals with, together with some proposed solutions.

Definition 5.1 (H-GODP) The High-Gain Observer Design Problem is solvable for a system written in the form (5.1a)–(5.2) with output (5.1b), while spatial derivatives of the output of order at most n − 1 might also be available, if there exists a well-posed observer system of PDEs which estimates the state of the initial system with a convergence speed that can be arbitrarily tuned via a single parameter (high-gain

constant) θ. More precisely, for every κ > 0, there exists θ0 > 1 such that, for every θ ≥ θ0, the solutions to (5.1a)–(5.2) satisfy

‖ξ̂(t, ·) − ξ(t, ·)‖_X1 ≤ ℓ e^{−κt} ‖ξ̂⁰(·) − ξ⁰(·)‖_X2   (5.6)

for some ℓ > 0, polynomial in θ, where ξ̂, ξ̂⁰ represent the observer state and its initial condition, respectively, and X1, X2 denote function spaces whose precise choice depends on the number of distinct velocities.

A feature of this observer design problem is the assumed internal measurement of a part of the state, without any other knowledge of the other states. Another feature, indicated in the H-GODP definition, is the stronger regularity required of the solutions to the initial system, since the observer dynamics may include spatial derivatives of the output. This requirement reveals some links to previous studies on internal controllability for cascade systems with a reduced number of controls, see [1]. We note here that, although boundary observers with full-state measurement are preferred for practical reasons (see for instance [6]), in the present formulation a distributed measurement of part of the state might be available in many cases, for instance via thermal cameras for chemical reactors, or via approximations with distributed measurements within the domain. Additionally, the required spatial derivatives of the output are available in real time, since they follow from causal measurements, contrary to time derivatives of the output, which are strictly excluded from observer designs, as their knowledge would be non-causal. Although the availability of spatial derivatives of the output might seem restrictive, approximations via kernel convolutions might provide an alternative realization.

Remark 5.1 Solvability of the H-GODP implies that a high-gain observer is arbitrarily fast, without any limitation on the convergence speed. The H-GODP is not solvable in the case of boundary measurement instead of the internal measurement (5.1b). First, the arbitrary-convergence requirement would not be fulfilled, since a boundary observer for hyperbolic systems experiences a limitation on the convergence speed.
The rate of convergence is limited by a minimal observation time, which in that case depends on the size of the domain and on the velocities (see [23] for the minimum time of observability due to transport phenomena). Second, following a boundary observer design methodology as in [6], in the presence of a general form of boundary conditions, where a nonlinear law couples the incoming with the outgoing information on the boundaries, boundary measurement of the whole state vector would be required, instead of just the first state, for the boundary observer to be feasible. In [8], control design is achieved for a 2 × 2 hyperbolic system with some specific boundary conditions, via boundary control of only one state at one end. Here, however, where we consider the dual problem of observer design with one observation, such an approach would not be feasible: for general boundary conditions, with just one observation, we cannot achieve dissipativity of the boundary conditions as in that work, which would lead to stability of the observation error system (see [3] for the link between dissipativity of boundary conditions and stability).

The main problem in the solvability of the H-GODP comes from the hyperbolic operator. More particularly, the form and also the domain of the hyperbolic operator might be general, including distinct velocities and very general couplings of the incoming with the outgoing information on the boundaries. In the stability analysis of hyperbolic systems, diagonal Lyapunov functionals are usually chosen (see [3]), since when taking the time derivative an integration by parts is required, which is simplified if the Lyapunov matrix commutes with the diagonal matrix of the velocities Λ[·]. In our case, the Lyapunov matrix cannot be diagonal, since it must solve a quadratic Lyapunov equation for the non-diagonally stabilizable matrix A. Section 5.3 deals with a solution to the H-GODP for a system with one velocity, where the extension from finite dimensions is direct since the aforementioned commutation property holds. In Sect. 5.4, we elaborate an indirect design, where the general hyperbolic operator with distinct velocities is decomposed into a new hyperbolic operator with one velocity, plus some mappings acting only on the measured first state, and a bilinear triangular mapping between the measured first state and the second one. In addition to these complications, note that for the solvability of the H-GODP difficulties also come from the presence of nonlocal terms, which require a stability proof in the sup-norm, and from the quasilinearity of the system, i.e., the dependence of Λ[·] on the state.
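The commutation obstruction described above is easy to see numerically: a non-diagonal symmetric P commutes with the diagonal velocity matrix Λ only when all velocities coincide. A small sketch (the matrices below are illustrative, not taken from the chapter):

```python
import numpy as np

# A generic symmetric, non-diagonal candidate Lyapunov matrix
P = np.array([[ 2.0, -1.0,  0.2],
              [-1.0,  1.5, -0.4],
              [ 0.2, -0.4,  0.8]])

Lam_single = 1.3 * np.eye(3)              # one velocity: Lambda = lambda * I_n
Lam_distinct = np.diag([1.0, 2.0, 3.0])   # three distinct velocities

print(np.allclose(P @ Lam_single, Lam_single @ P))      # True: commutes
print(np.allclose(P @ Lam_distinct, Lam_distinct @ P))  # False: no commutation
```

With a single velocity the integration by parts in the Lyapunov analysis goes through even for non-diagonal P, which is exactly why the direct design of Sect. 5.3 works, while distinct velocities call for the indirect route of Sect. 5.4.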

5.3 Observer Design for Systems with a Single Velocity

In this section, we show the solvability of the H-GODP for a system with a single velocity, which constitutes a direct extension of observer designs in finite dimensions.

5.3.1 Problem Statement and Requirements

Consider the general hyperbolic system (5.1a)–(5.2), with output (5.1b) and with the same triangularity of its mappings as given therein. We assume in this section that the system has only one velocity, i.e., the matrix of velocities is of the form Λ[ξ1] := λ[ξ1] In, where the velocity λ : C¹([0, L]; R) → C¹([0, L]; R) is Fréchet differentiable, possibly nonlocal, and positive (namely, λ[y] > 0 for all y ∈ C⁰([0, L]; R) in the nonlocal case, or y ∈ R in the local case), and the nonlinear mapping F[ξ(t)](x) := F(x, ξ(t)(x)), with F : [0, L] × C¹([0, L]; Rⁿ) → C¹([0, L]; Rⁿ), is continuously differentiable with respect to its first argument and Fréchet differentiable with respect to its second argument. It may contain nonlocal terms (integral terms of Volterra type and boundary terms). We further assume that DF ∈ Lip_loc(C⁰([0, L]; Rⁿ), ‖·‖₀). Also, the initial condition ξ⁰ ∈ C¹([0, L]; Rⁿ) satisfies zero-order and one-order compatibility conditions (see [3, App. B] for a precise definition of compatibility conditions), and the nonlinear mapping H coupling the incoming with the outgoing information is in C¹(Rⁿ; Rⁿ), while its gradient is locally Lipschitz continuous, i.e., DH ∈ Lip_loc(Rⁿ, |·|).

The assumption that follows is essential to assert the well-posedness of the considered system, along with the observer design requirement of forward completeness. Furthermore, it imposes global boundedness of classical solutions in the 1-norm. The latter requirement is due to the quasilinearity of the system (the dependence of λ on ξ1) and can be dropped in the case of semilinear systems, at the price of a stronger assumption on the nonlinear source terms. For a more detailed presentation of the nature of the following assumption, the reader can refer to [3] and references therein, where sufficient conditions for the well-posedness and existence of classical solutions of hyperbolic systems are given. In the case of nonlocal conservation laws, i.e., where the velocity λ : C⁰([0, L]; R) → C¹([0, L]; R) might be of the form λ[ξ1(t)](x) := λ(∫₀ˣ ξ1(t, s) ds), this assumption can be met more easily; see for instance [16] and other works of these authors.

Assumption 5.1 Consider a set M ⊂ C¹([0, L]; Rⁿ) nonempty and bounded, consisting of functions satisfying zero-order and one-order compatibility conditions for problem (5.1a)–(5.2). Then for any initial condition ξ⁰ in M, problem (5.1a)–(5.2) admits a unique classical solution in C¹([0, +∞) × [0, L]; Rⁿ). Moreover, there exists δ > 0 such that, for all ξ⁰ in M, we have ξ ∈ B¹_δ := {u ∈ C¹([0, L]; Rⁿ) : ‖u‖₁ ≤ δ}.

With these assumptions, we are in a position to introduce our candidate observer dynamics and its boundary conditions on Π for system (5.1)–(5.2), as follows:

ξ̂t(t, x) + λ[y(t)](x) ξ̂x(t, x) = A ξ̂(t, x) + F[sδ(ξ̂(t))](x) − Θ K (y(t, x) − C ξ̂(t, x)),   (5.7a)
ξ̂(t, 0) = H(sδ(ξ̂(t, L))),   (5.7b)

where the function sδ : Rⁿ ∋ ζ ↦ sδ(ζ) = (sδ1(ζ1), …, sδn(ζn)) is parametrized by δ (given in Assumption 5.1) and satisfies the following properties:

1. it is uniformly bounded and continuously differentiable;
2. its first derivative is uniformly bounded;
3. its derivative function Dsδ(·) is in Lip(Rⁿ, |·|);
4. for every δ > 0 and v, w in Rⁿ such that |w| ≤ δ, there exists ωδ > 0 such that the following inequality is satisfied:

|sδ(v) − w| ≤ ωδ |v − w|.   (5.8)

Note that a saturation-like function of the form

sδi(ζi) = ζi,  if |ζi| ≤ δ,
sδi(ζi) = sgn(ζi) ((|ζi| − δ) e^{−|ζi|+δ} + δ),  if |ζi| > δ,   (5.9)

satisfies all the properties and, in particular, (5.8) with ωδ = √n max{1, e⁻¹ + δ}, and with the Lipschitz constant of Dsδ(·) equal to √n e⁻³. Also, Θ, appearing in the output correction term of the observer, is a diagonal matrix given by

Θ := diag{θ, θ², …, θⁿ},   (5.10)

where θ > 1 is the candidate high-gain constant of the observer, which will be selected precisely later, and K ∈ Rⁿ is chosen such that A + KC is Hurwitz (such a K always exists, due to the observability of the pair (A, C)). Note that for such a K, one can find a symmetric and positive definite n × n matrix P satisfying a quadratic Lyapunov equation of the form

2 Sym(P(A + KC)) = −In.   (5.11)
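As a quick numerical sanity check (not part of the chapter's proofs), the saturation (5.9) can be implemented componentwise and property (5.8) tested by sampling; the value of δ and the sampling ranges below are arbitrary:

```python
import numpy as np

def s_delta(zeta, delta):
    """Componentwise saturation (5.9): identity on [-delta, delta],
    smoothly leveling off outside."""
    out = zeta.copy()
    big = np.abs(zeta) > delta
    out[big] = np.sign(zeta[big]) * (
        (np.abs(zeta[big]) - delta) * np.exp(-np.abs(zeta[big]) + delta) + delta)
    return out

n, delta = 3, 2.0
omega = np.sqrt(n) * max(1.0, np.exp(-1.0) + delta)   # omega_delta from the text

rng = np.random.default_rng(0)
for _ in range(10_000):
    v = rng.uniform(-10.0, 10.0, n)
    w = rng.uniform(-1.0, 1.0, n) * delta / np.sqrt(n)  # guarantees |w| <= delta
    assert (np.linalg.norm(s_delta(v, delta) - w)
            <= omega * np.linalg.norm(v - w) + 1e-12)
print("property (5.8) held on all samples")
```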

Let us remark that P satisfying (5.11) cannot be diagonal, since the matrix A fails by its definition to be diagonally stabilizable. The matrix P will serve as the Lyapunov matrix in the functional used in the proof of observer convergence. In the stability analysis of general hyperbolic systems, see for instance [3], the chosen Lyapunov functionals are diagonal, in order to commute with the matrix of velocities. In the present case, we assume only one velocity and, thus, we do not need P to be diagonal.
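The constructions of this subsection (a gain K making A + KC Hurwitz, the matrix P of (5.11), and the high-gain scaling by Θ of (5.10)) can be reproduced numerically for n = 3. In this sketch the pole locations −1, −2, −3 are an arbitrary choice, and the Lyapunov equation is solved by a plain Kronecker-product linear system:

```python
import numpy as np

n = 3
A = np.diag(np.ones(n - 1), k=1)          # 1s on the superdiagonal
C = np.array([[1.0, 0.0, 0.0]])
K = np.array([[-6.0], [-11.0], [-6.0]])   # places eig(A + KC) at -1, -2, -3
Acl = A + K @ C
assert np.all(np.linalg.eigvals(Acl).real < 0)   # Hurwitz

# Solve Acl^T P + P Acl = -I_n, i.e., 2 Sym(P Acl) = -I_n, by vectorization
I = np.eye(n)
M = np.kron(I, Acl.T) + np.kron(Acl.T, I)
P = np.linalg.solve(M, -I.reshape(-1, order="F")).reshape(n, n, order="F")
assert np.allclose(Acl.T @ P + P @ Acl, -I)
assert np.all(np.linalg.eigvalsh(P) > 0)         # symmetric positive definite
print("P is diagonal:", np.allclose(P, np.diag(np.diag(P))))  # False, as claimed

# High-gain scaling: Theta^{-1} (A + Theta K C) Theta = theta (A + K C)
theta = 10.0
Theta = np.diag([theta, theta**2, theta**3])
assert np.allclose(np.linalg.inv(Theta) @ (A + Theta @ K @ C) @ Theta, theta * Acl)
```

The last identity is the algebraic mechanism behind the observer: conjugating by Θ turns the correction into the damping term θ(A + KC) that appears in the error dynamics below.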

5.3.2 Direct Solvability of the H-GODP

We are in a position to present our main result on the solvability of the H-GODP via the observer system (5.7).

Theorem 5.1 Consider system (5.1a)–(5.2), with a single velocity λ and output (5.1b), and suppose that Assumption 5.1 holds for initial conditions ξ⁰ in M. Let also K in Rⁿ be chosen in such a way that A + KC is Hurwitz. Then the H-GODP for system (5.1a)–(5.2) is solvable by system (5.7), with θ > 1 as a high gain and initial condition ξ̂⁰ in C¹([0, L]; Rⁿ), ξ̂⁰(x) = ξ̂(0, x), satisfying zero-order and one-order compatibility conditions. More precisely, for every κ > 0, there exists θ0 ≥ 1 such that, for every θ > θ0, the following inequality holds:

‖ξ̂(t, ·) − ξ(t, ·)‖₁ ≤ ℓ e^{−κt} ‖ξ̂⁰(·) − ξ⁰(·)‖₁,  ∀ t ≥ 0,   (5.12)

for some ℓ > 1, polynomial in θ.

Note that Theorem 5.1 shows solvability of the H-GODP of Definition 5.1 with no use of spatial derivatives of the output. This is the reason why we call this approach

direct. This result slightly generalizes [19, 20], in the sense that it also considers the case of a velocity function of nonlocal nature.

Proof Prior to proving observer convergence, the existence and uniqueness of global classical solutions to the observer system, which is a semilinear hyperbolic system with possibly nonlocal terms, must be established. The reader can refer to [22, Theorem 2.1]; similarly to that work, we can follow a fixed-point methodology, taking into account the sufficient regularity of the dynamics, the global boundedness of the system's solutions (and thus of the output y) coming from Assumption 5.1, and the fact that the nonlinearities appearing in the observer system are globally Lipschitz. More details can be found in [20, Appendix A]. Consider the linearly transformed observer error

ε := Θ⁻¹ (ξ̂ − ξ),

for which we derive the following hyperbolic equations on Π:

εt(t, x) + λ[y(t)](x) εx(t, x) = θ (A + KC) ε(t, x) + Θ⁻¹ Δ_{z(t)}[F](ξ(t))(x),   (5.13a)
ε(t, 0) = Θ⁻¹ Δ_{z(t)}[H](ξ(t))(L),   (5.13b)

where z := sδ(ξ̂). Notice that in the above internal dynamics, θ(A + KC)ε acts as a damping term leading to exponential stabilization of the error dynamics. To prove exponential stability of the error system at the origin in the 1-norm, we adopt a Lyapunov approach inspired by methodologies presented in [3]. The proof is included in the Appendix; a slightly different proof, for a local velocity function, appeared in [20]. Following the proof in the Appendix, the H-GODP for (5.1a), (5.1b), (5.2) is solved by designing a high-gain observer that is exponentially convergent in the 1-norm, with a convergence rate adjustable through the selection of θ.
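To illustrate the role of θ, consider the finite-dimensional analogue of the error dynamics (an ODE sketch, not the chapter's PDE design; the triangular source F and all numerical values are made up, and the saturation sδ is omitted since this F is globally Lipschitz). The final estimation error shrinks as θ grows:

```python
import numpy as np

n = 3
A = np.diag(np.ones(n - 1), k=1)
C = np.array([1.0, 0.0, 0.0])
K = np.array([-6.0, -11.0, -6.0])   # A + K C Hurwitz (poles -1, -2, -3)

def F(xi):
    # triangular, globally Lipschitz source: F_i depends on xi_1, ..., xi_i
    return np.array([0.0, 0.05 * np.sin(xi[0]), 0.05 * np.sin(xi[1])])

def final_error(theta, T=4.0, dt=1e-4):
    Theta = np.array([theta, theta**2, theta**3])   # diag(theta, ..., theta^n)
    xi = np.array([1.0, -1.0, 0.5])                 # true state
    xi_hat = np.zeros(n)                            # observer state
    for _ in range(int(T / dt)):                    # explicit Euler steps
        y = C @ xi                                  # measured output
        xi = xi + dt * (A @ xi + F(xi))
        xi_hat = xi_hat + dt * (A @ xi_hat + F(xi_hat)
                                - Theta * K * (y - C @ xi_hat))
    return np.linalg.norm(xi_hat - xi)

err_low, err_high = final_error(1.0), final_error(4.0)
print(err_low, err_high)
assert err_high < err_low   # larger theta -> faster error decay
```

The scaled error ε = Θ⁻¹(ξ̂ − ξ) of this sketch obeys ε' = θ(A + KC)ε + Θ⁻¹ΔF, mirroring (5.13a) with the hyperbolic transport term removed.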

5.4 Observer Design for Systems with Distinct Velocities

In this section, we employ an indirect strategy to show solvability of the H-GODP when the velocities are not identical. To this end, we first map the system into an appropriate system of PDEs via a nonlinear infinite-dimensional state transformation, noting, however, that for nonlinear systems with more than three states, accompanied by more than three distinct velocities, such a transformation-based approach is difficult to employ.

5.4.1 System Requirements and Main Approach Consider again system (5.1)–(5.2), but with the restriction of up to three states, namely n ∈ {2, 3} . To provide the properties and appropriate regularity of the dynamics of the considered system, let us first define a, roughly speaking, “index of strict hyperbolicity” as follows: q := min i : λi ≡ λ j , ∀ j = i, . . . , n , where we used the equivalence relation λi ≡ λ j ⇔ λi [ξ1 ] = λ j [ξ1 ], ∀ξ1 ∈ C 0 ([0, L]; R). By this definition, we have q ∈ {1, 2, 3} and in the case of a strictly hyperbolic system, we have q = n. The case where q = 1 (a single velocity) was addressed in the previous section. We further define q0 := max {1, q − 1} . We assume that velocities λi : C q0 ([0, L]; R) → C q0 ([0, L]; R) , i = 1, . . . , n are q0 -times Fréchet differentiable and positive (namely λi [y] > 0, for all y ∈ C 0 ([0, L]; R) (nonlocal case), or y ∈ R (local case)), nonlinear mapping F : and for q0 = 1 C q0 ([0, L]; Rn ) → C q0 ([0, L]; Rn ) isq0 -times Fréchet differentiable  we further assume that DF ∈ Li ploc C 0 ([0, L]; Rn ) , · 0 . For initial condition ξ 0 ∈ C q0 ([0, L]; Rn ), we assume that it satisfies compatibility conditions of order q0 (see [3, Chap. 4.5.2] for definition of compatibility conditions of any order) and mapping H is of class C q0 (Rn ; Rn ), while for q0 = 1, we additionally have DH ∈ Li ploc (Rn , | · |). As in the previous section, we make an assumption on the existence and uniqueness of solutions of possibly stronger regularity, depending on the index of strict hyperbolicity q, and this assumption might be met more easily in the case of nonlocal conservation laws, see for instance [16]. Assumption 5.2 Consider M ⊂ C q0 ([0, L]; R) nonempty and bounded, consisting of functions satisfying compatibility conditions of order q0 for problem (5.1a)–(5.2). Then for any initial condition ξ 0 in M , problem (5.1a)–(5.2) admits a unique solution 0 in C q0 ([0, +∞) × [0, L]; Rn ). 
Moreover, there exists δ > 0 such that, for all ξ⁰ in M, the corresponding solution satisfies ξ ∈ B_δ^{q₀} := { u ∈ C^{q₀}([0, L]; Rⁿ) : ‖u‖_{q₀} ≤ δ }.

Let us define a Banach space X := C^{q₀}([0, L]; R) × C¹([0, L]; R^{n−1}), equipped with the norm ‖·‖_X := ‖·‖₁ when n = 2, and ‖ξ‖_X := ‖ξ₁‖_{q₀} + ‖ξ₂‖₁ + ‖ξ₃‖₁ when n = 3.

To deal with the generality of the considered hyperbolic operator, i.e., the presence of distinct velocities, we need to employ a different strategy than in the previous section. The problem comes from the fact that the balance laws in (5.1a) do not allow the choice of a diagonal Lyapunov functional for the stability analysis of the observer error equations; a non-diagonal Lyapunov functional does not permit an integration by parts when taking its time derivative, since the Lyapunov matrix and the matrix of velocities do not commute. To address this problem, we perform a transformation including spatial derivatives of the state up to order q − 2, in order to write the system in a form for which a Lyapunov approach is feasible. Then, for the obtained target system, we design the high-gain observer and, finally, returning to the initial coordinates, the solvability of the H-GODP is guaranteed. The increased difficulties caused by the presence of distinct velocities appear also in the somehow dual problem of internal controllability with a reduced number of controls (see the comments on algebraic solvability in [1]).

We shall show the existence of a nonlinear transformation T ∈ C⁰( B_δ^{q₀}; B(X) ), invertible with T⁻¹ ∈ C⁰( B_δ^{q₀}; B(X) ) (where B(X) denotes the space of bounded linear operators from X to X), which maps system (5.1a)–(5.2) into a target system in the variable ζ, as follows:

ζ = T[ξ₁] ξ, with ζ₁ = ξ₁.   (5.14)

Assume also that this transformation is written in the form

T[·] = I_n + T̃[·] C,   (5.15)

for some column operator T̃. The desired target system (T) of PDEs satisfies the following equations on Π:

ζ_t(t, x) + λ_n[ζ₁(t)](x) ζ_x(t, x) = A ζ(t, x) + F[ ζ(t) − T̃[ζ₁(t)] C ζ(t) ](x) + N₁[ζ₁(t)](x) + N₂[ζ₁(t), ζ₂(t)](x),
ζ(t, 0) = H( ζ(t, L) − T̃[ζ₁(t)] C ζ(t)(L) ) + N₃[ζ₁(t)](0),   (T)
Y(t, x) = y(t, x) = C ζ(t, x),

with initial condition ζ(0, x) := ζ⁰(x) = T[ξ₁⁰] ξ⁰(x), where the nonlinear operators N₁ : C^{q₀}([0, L]; R) → C⁰([0, L]; Rⁿ) and N₃ : C^{q₀}([0, L]; R) → Rⁿ act on the measured state ζ₁, and N₂ : C^{q₀}([0, L]; R) × C⁰([0, L]; R) → C^{q₀−1}([0, L]; Rⁿ) is a bilinear triangular mapping; all are to be determined in the sequel, depending on the choice of T, and Y is the target system's output, which remains equal to the original system's output y. In this target system of PDEs, the hyperbolic operator has been decomposed into the sum of a hyperbolic operator with only one velocity, the last one λ_n, plus a nonlinear differential operator N₁ acting only on the measured first state and a bilinear mapping N₂ of the state, while a nonlinear differential operator N₃ acting on the first state appears on the boundaries. Thus, observer design becomes possible for target system (T): we now meet the desired single-velocity property imposed in the previous section, while we can simultaneously cancel the unwanted terms of the transformed system, represented by the nonlinear operators N₁, N₂, N₃ acting on the measured state ζ₁. The proposed high-gain observer for target system (T) satisfies the following equations on Π:

ζ̂_t(t, x) + λ_n[y(t)](x) ζ̂_x(t, x) = A ζ̂(t, x) + F[ s_δ(ζ̂(t)) − T̃[y(t)] y(t) ](x) + N₁[y(t)](x) + N₂[y(t), ζ̂₂(t)](x) − Θ K ( y(t, x) − C ζ̂(t, x) ),   (5.16a)
ζ̂(t, 0) = H( s_δ(ζ̂(t, L)) − T̃[y(t)] y(t)(L) ) + N₃[y(t)](0),   (5.16b)

with initial condition ζ̂⁰(x) := ζ̂(0, x) (for a function ζ̂⁰ in X), where, as in the previous section, Θ is the diagonal matrix containing the increasing powers of the high-gain constant θ > 1 as in (5.10), s_δ is a saturating function satisfying the properties of items 1–4 of Sect. 5.3.1, and K is a constant vector gain rendering the matrix A + KC Hurwitz, as in the previous section. In the next subsection, we determine the transformation T and show the solvability of the H-GODP.
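The gain structure in (5.16) mirrors the classical finite-dimensional high-gain observer of Gauthier, Hammouri, and Othman [13]: a copy of the triangular dynamics corrected by output injection scaled through Θ = diag(θ, θ², …, θⁿ), with K chosen so that A + KC is Hurwitz. As a purely finite-dimensional sketch of that mechanism (the two-state system, the gain values, and the nonlinearity −sin(·) are illustrative assumptions, not taken from the chapter):

```python
import numpy as np

def simulate(theta=8.0, dt=1e-4, T=4.0):
    # True triangular system: x1' = x2, x2' = -sin(x1); measured output y = x1.
    x = np.array([1.0, -0.5])
    xh = np.array([0.0, 0.0])          # observer state, deliberately wrong guess
    # K = (-2, -1) places both eigenvalues of A + K C at -1
    # (A = [[0, 1], [0, 0]], C = [1, 0]).
    k1, k2 = 2.0, 1.0
    for _ in range(int(T / dt)):
        y = x[0]
        dx = np.array([x[1], -np.sin(x[0])])
        # High-gain output injection: the gains scale as theta, theta^2 (matrix Θ).
        dxh = np.array([xh[1] + theta * k1 * (y - xh[0]),
                        -np.sin(xh[0]) + theta**2 * k2 * (y - xh[0])])
        x = x + dt * dx
        xh = xh + dt * dxh
    return float(np.linalg.norm(x - xh))

print(simulate())  # estimation error after 4 s: practically zero
```

Increasing θ accelerates the error decay but amplifies transient peaking and measurement noise; this trade-off is what the saturation s_δ above, and the discussions in [17, 18], address.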

5.4.2 Indirect Solvability of the H-GODP

We are now in a position to state the main result of this section, which includes both the existence of an infinite-dimensional transformation (5.14) and the convergence of observer (5.16) to the transformed system (T), implying that, by inverting the observer state via T, we eventually establish convergence to the actual state ξ. This leads to an indirect solvability of the observer design problem.

Theorem 5.2 Assume that Assumption 5.2 holds for initial condition ξ⁰ ∈ M. Then the H-GODP is solvable for system (5.1a)–(5.2), with output (5.1b) and n, q ∈ {2, 3}, by T⁻¹[y] ζ̂ (where ζ̂ is the unique solution to (5.16)), for θ > 1 as a high gain and initial condition T⁻¹[y(0)] ζ̂⁰(x), with ζ̂⁰ satisfying compatibility conditions of order zero and one. More precisely, for every κ > 0 there exists θ₀ ≥ 1 such that, for every θ > θ₀, the following holds for all t ≥ 0:

‖T⁻¹[y(t)] ζ̂(t)(·) − ξ(t, ·)‖_{2−q₀} ≤ ℓ e^{−κt} ‖T⁻¹[y(0)] ζ̂⁰(·) − ξ⁰(·)‖_X,   (5.17)

with ℓ > 0 a polynomial in θ.

We note here that, in the study of internal controllability for underactuated systems, the phenomenon of loss of derivatives appears, as the regularity of the dynamics is stronger than the regularity of the control laws whenever the velocities are distinct (see [1, Theorem 3.1]). In the present framework, aiming at the solvability of the H-GODP, we note that for n = q = 3 the regularity of the system's dynamics needs to be stronger (of order q₀ = 2) than the regularity of the space in whose norm the asymptotic convergence of the observer is exhibited (sup-norm). The above theorem constitutes a generalization of our previous works, see for instance [19], where the case of linear hyperbolic and semilinear parabolic systems is treated via a linear state transformation. We introduce here a transformation-based approach, inspired by these works, but using a nonlinear state transformation, in order to deal with the quasilinearity of the system.

Proof Let us choose T̃ in (5.14), (5.15) as

T̃[ξ₁] := 0 when n = 2, and T̃[ξ₁] := ( 0, τ[ξ₁] d_x, 0 )ᵀ when n = 3,

where τ[ξ₁] := λ₂[ξ₁] − λ₃[ξ₁]. Obviously, transformation T, with T̃ given as above, meets the specifications of the previous subsection independently of the boundary conditions, and its inverse is given by T⁻¹[·] = I_n − T̃[·] C. Applying the chosen transformation to system (5.1a)–(5.2), we obtain the target system (T) of the previous subsection, with

N₁[ζ₁] := ( (λ₂[ζ₁] − λ₁[ζ₁]) ∂_x ζ₁, 0 )ᵀ for n = 2, and, for n = 3,

N₁[ζ₁] := ( (λ₃[ζ₁] − λ₁[ζ₁] − τ[ζ₁]) ∂_x ζ₁ ,
  ⟨Dτ[ζ₁], −(λ₁[ζ₁] + τ[ζ₁]) ∂_x ζ₁ + F₁[ζ₁]⟩ ∂_x ζ₁ − τ[ζ₁] d_x( λ₁[ζ₁] ∂_x ζ₁ − F₁[ζ₁] ) + λ₃[ζ₁] d_x(τ[ζ₁]) ∂_x ζ₁ ,
  0 )ᵀ;

N₂[ζ₁, ζ₂] := 0 for n = 2, and N₂[ζ₁, ζ₂] := ( 0, ⟨Dτ[ζ₁], ζ₂⟩ ∂_x ζ₁, 0 )ᵀ for n = 3;

N₃[ζ₁] := 0 for n = 2, and N₃[ζ₁] := ( 0, τ[ζ₁] ∂_x ζ₁, 0 )ᵀ for n = 3.
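The exactness of the inverse T⁻¹[·] = I_n − T̃[·]C rests on the structure of T̃: its first component vanishes, so C T̃ = 0 and hence (T̃C)² = T̃ (C T̃) C = 0, i.e., the Neumann series for the inverse terminates after two terms. A finite-dimensional analogue makes this explicit (a scalar placeholder stands in for the operator τ[ξ₁] d_x; the values are illustrative assumptions):

```python
import numpy as np

# C extracts the first (measured) component, as in the chapter.
C = np.array([[1.0, 0.0, 0.0]])
# Column with zero first entry; t2 is a stand-in for tau[xi_1] d_x.
t2 = 0.7
T_tilde = np.array([[0.0], [t2], [0.0]])

T = np.eye(3) + T_tilde @ C        # T = I + T_tilde C
T_inv = np.eye(3) - T_tilde @ C    # claimed inverse

assert np.allclose(C @ T_tilde, 0.0)       # first entry of the column vanishes
assert np.allclose(T @ T_inv, np.eye(3))   # (T_tilde C)^2 = 0 makes it exact
assert np.allclose(T_inv @ T, np.eye(3))
print("inverse verified")
```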

We are now in a position to prove that solutions to observer (5.16) converge exponentially to the solutions of the transformed system (T). First, the well-posedness of the observer system, i.e., the global existence of unique classical solutions of regularity C¹, follows from classical arguments that one can find, for instance, in [3] or in our previous works [21] (details are left to the reader); it relies on Assumption 5.2, on the existence and boundedness in the C^{q₀}-norm of the system solutions, and on the fact that the observer nonlinearities are globally Lipschitz. We therefore focus on the stability analysis. Let us define the observer error

ε := Θ⁻¹( ζ̂ − ζ ),

which satisfies the following hyperbolic equations:

ε_t(t, x) + λ_n[y(t)](x) ε_x(t, x) = θ (A + KC) ε(t, x) + Θ⁻¹( Δ_{z(t)}[F]( ζ(t) − T̃[y(t)] y(t) )(x) + N₂[y(t), ζ̂₂(t) − ζ₂(t)](x) ),   (5.18a)
ε(t, 0) = Θ⁻¹ Δ_{z(t)}[H]( ζ(t) − T̃[y(t)] y(t) )(L),   (5.18b)

where z := s_δ(ζ̂) − T̃[y] y. For the proof of the exponential stability of solutions to (5.18), the reader is referred to the Appendix of the present chapter. Following the proof in the Appendix, where an appropriate Lyapunov functional is chosen, we obtain an exponential stability result for the transformed system in the 1-norm:

‖ζ̂(t, ·) − ζ(t, ·)‖₁ ≤ ℓ̄ e^{−κt} ‖ζ̂⁰(·) − ζ⁰(·)‖₁,   (5.19)

where κ > 0 is adjustable by choosing the high-gain constant θ large enough, and ℓ̄ > 0 is a polynomial in θ. Now, to return to the original coordinates, we notice that T[Cξ] : X → X is bounded for ξ ∈ B_δ^{q₀}, that X is continuously embedded in C¹([0, L]; Rⁿ), that the extension of T⁻¹[Cξ] to C⁰([0, L]; Rⁿ) for ξ ∈ B_δ^{q₀} is bounded in C⁰([0, L]; Rⁿ), and that C¹([0, L]; Rⁿ) is continuously embedded in C⁰([0, L]; Rⁿ). Thereby, by (5.19), we can calculate a constant ℓ, again polynomial in θ, such that (5.17) is satisfied. The proof of Theorem 5.2 is complete. □

Remark 5.2 Although in this section we considered a reduced number of states (up to three), as the presence of an increased number of distinct velocities imposes extra difficulties, we note that the H-GODP is solvable even for more than three states, under the restriction that the system is linear and space L-periodic. In that case, we consider a state transformation that includes higher-order differentiations in its domain than the ones in this section and, to determine it, we solve an operator Sylvester equation. We have included this generalization in previous works, see [19], where links with problems of controllability of coupled hyperbolic PDEs as in [1] were revealed.


5.5 Conclusion

Solutions to a high-gain observer design problem for a class of quasilinear hyperbolic systems, possibly containing nonlocal source terms and velocities, and written in a triangular form, were presented in this chapter. A part of the state was considered as the measurement (one observation). First, the problem was solved for systems with n equations and only one velocity, as a direct extension of the finite-dimensional approach. Then, sufficient conditions were provided for its solvability in the case of two or three distinct velocities. This required the introduction of a nonlinear infinite-dimensional state transformation, which led to the injection of output spatial derivatives into the observer dynamics. The extension of this methodology to wider classes of infinite-dimensional systems, ISS properties of such observers, and the investigation of output feedback laws via such observers, with applications to real systems, will be topics of our future work.

Appendix: Observer Convergence Proofs

In this section, we prove the Lyapunov stability part of both Theorems 5.1 and 5.2, appearing in Sects. 5.3.2 and 5.4.2, respectively. The corresponding observer error systems are given by (5.13) and (5.18), respectively. We prove the Lyapunov stability result only for error system (5.18) of Theorem 5.2, which is the more complicated case; the Lyapunov stability in Theorem 5.1 then follows, as error system (5.13) is a simpler version of (5.18), with ζ replaced by ξ, ζ̂ replaced by ξ̂, λ_n replaced by λ, and T̃ = 0, N₂ = 0.

To prove the exponential stability at the origin of the solution to error system (5.18), let us define a Lyapunov functional W_p : C¹([0, L]; Rⁿ) → R by

W_p[ε] := ( ∫₀^L π(x) exp(p μ_{θ,δ} x) G_p[ε](x) dx )^{1/p};  G_p[ε] := ( εᵀ P ε + ρ₀ ε_tᵀ P ε_t )^p,   (5.20)

where ε := Θ⁻¹(ζ̂ − ζ) is the observer error for the system transformed via T, ρ₀ ∈ (0, 1) is a constant (to be chosen appropriately), p ∈ N, P ∈ R^{n×n} is positive definite and symmetric, satisfying (5.11), and π : [0, L] → R is given by

π(x) := (π̄ − 1) x/L + 1;  π̄ := sup_{‖ζ‖₀≤δ} λ_n[Cζ] / inf_{‖ζ‖₀≤δ} λ_n[Cζ],   (5.21)


with π(x) ∈ [1, π̄], and the constant μ_{θ,δ} is given by

μ_{θ,δ} := (1/L) ln( μ_δ θ^{2n−2} ),   (5.22a)

where

μ_δ := (|P| / eig(P)) max{ γ_{1,δ}², γ_{2,δ}² δ₁², γ_{2,δ} γ_{3,δ} δ₁ },

and

γ_{1,δ} := sup_{|ζ(L)|≤δ, y∈C¹([0,L];R), ζ̂(L)∈Rⁿ, ε(L)≠0} |Θ⁻¹ Δ_z[H]( ζ − T̃[y] y )(L)| / ( θ^{n−1} |ε(L)| ),
γ_{2,δ} := θ^{1−n} sup_{ζ̂(L)∈Rⁿ, ‖y‖₁≤δ} |Θ⁻¹ DH[z](L) Ds_δ(ζ̂(L)) Θ|,   (5.22b)
γ_{3,δ} := sup_{|ζ(L)|≤δ, ζ̂(L)∈Rⁿ, ‖y‖₁≤δ, ε(L)≠0} ( θ^{n−1} |ε(L)| )⁻¹ |Θ⁻¹( Δ_z[DH]( ζ − T̃[y] y )(L) Ds_δ(ζ̂) + DH(z(L)) Δ_{ζ̂}[Ds_δ](ζ)(L) )|,

with z := s_δ(ζ̂) − T̃[y] y, and also

δ₁ := sup_{ζ∈B_{δ₁}, ‖y‖_{q₀}≤δ} ( ‖ζ_t‖₀ + ‖d_t( T̃[y] y )‖₀ ) = sup_{ξ∈B_δ^{q₀}} ( ‖d_t( T[Cξ] ξ )‖₀ + ‖d_t( T̃[Cξ] Cξ )‖₀ ),

which is proven to be finite after substituting ξ_t by −Λ[Cξ] ξ_x + A ξ + F[ξ] from (5.1a) and invoking the regularity of Λ and the boundedness of the mapping T in X. Note here that the above constants with subscript δ do not depend on the observer gain θ, but only on the global bound δ of the system solutions (coming from Assumption 5.2); otherwise, such a dependence is made explicit by a subscript θ. We also took into account the implication

ξ ∈ B_δ^{q₀} (coming from Assumption 5.2) ⇒ T[Cξ] ξ = ζ ∈ B_{δ₁}.

By invoking the global existence of solutions to observer system (5.16), Assumption 5.2 (which establishes global unique classical solutions for system (5.1a)–(5.2)), and the boundedness of the mapping T in X, we are now in a position to define G_p, W_p : [0, +∞) → R by

G_p(t) := G_p[ε](t), W_p(t) := W_p[ε](t), ∀ t ≥ 0.   (5.23)

Before taking the time derivative of the Lyapunov functional, we temporarily assume that ε is of class C², so that we may derive the hyperbolic equations satisfied by ε_t (details are left to the reader). Calculating the time derivative Ẇ_p along the classical solutions of the hyperbolic equations (5.18) for ε, and of the corresponding hyperbolic equations for ε_t, we get

Ẇ_p = (1/p) W_p^{1−p} ∫₀^L p π(x) exp(p μ_{θ,δ} x) G_{p−1}(x) ( ε_tᵀ(x) P ε(x) + εᵀ(x) P ε_t(x) + ρ₀ ε_{tt}ᵀ(x) P ε_t(x) + ρ₀ ε_tᵀ(x) P ε_{tt}(x) ) dx,

where, after substituting the dynamics of ε, ε_t and performing an integration by parts, we obtain

Ẇ_p = (1/p) W_p^{1−p} ( T_{1,p} + T_{2,p} + T_{3,p} + T_{4,p} ),   (5.24)

where

T_{1,p} := −π(L) λ_n[y](L) exp(p μ_{θ,δ} L) G_p(L) + π(0) λ_n[y](0) G_p(0),
T_{2,p} := ∫₀^L d_x( π(x) λ_n[y](x) exp(p μ_{θ,δ} x) ) G_p(x) dx,
T_{3,p} := 2 ∫₀^L π(x) exp(p μ_{θ,δ} x) G_{p−1}(x) [ εᵀ(x) P Θ⁻¹( Δ_z[F]( ζ − T̃[y] y )(x) + N₂[y, ζ̂₂ − ζ₂](x) ) + ρ₀ ε_tᵀ(x) Sym(P K[ζ](x)) ε_t(x) + ρ₀ ε_tᵀ(x) P ( K^{ζ̂}[ζ](x) + Θ⁻¹ ⟨DF[z], Ds_δ(ζ̂) Θ ε_t⟩(x) + Θ⁻¹ d_t N₂[y, ζ̂₂ − ζ₂](x) ) ] dx,
T_{4,p} := θ ∫₀^L π(x) exp(p μ_{θ,δ} x) G_{p−1}(x) [ 2 εᵀ Sym(P(A + KC)) ε + 2 ρ₀ ε_tᵀ Sym(P(A + KC)) ε_t − ρ₀ ε_tᵀ P K[ζ] (A + KC) ε − ρ₀ εᵀ (A + KC)ᵀ Kᵀ[ζ] P ε_t ] dx,

where K : B_{δ₁} → C⁰([0, L]; R^{n×n}) is a bounded mapping defined by

K[ζ] := ( λ_n[ζ₁] )⁻¹ ⟨Dλ_n[ζ₁], C ζ_t⟩ I_n,   (5.26)

and K^{ζ̂} : B_{δ₁} → C⁰([0, L]; Rⁿ), parametrized by ζ̂ ∈ C⁰([0, L]; Rⁿ), is given by

K^{ζ̂}[ζ] := −K[ζ] Θ⁻¹( Δ_z[F]( ζ − T̃[ζ₁] ζ₁ ) + N₂[ζ₁, ζ̂₂ − ζ₂] ) + Θ⁻¹ ⟨ Δ_z[DF]( ζ − T̃[ζ₁] ζ₁ ), ζ_t − d_t( T̃[ζ₁] ζ₁ ) ⟩ + ⟨ DF[z], Δ_{ζ̂}[Ds_δ](ζ) ζ_t ⟩.   (5.27)

After substituting the boundary conditions for ε, ε_t in T_{1,p}, and by virtue of (5.21) and (5.22b), we obtain the inequality

T_{1,p} ≤ sup_{ζ∈B_{δ₁}}( λ_n[Cζ] ) G_p(L) ( −exp(p μ_{θ,δ} L) + ( θ^{2n−2} μ_δ )^p )

and, subsequently, by (5.22a), we get

T_{1,p} ≤ 0.   (5.28)

For T_{2,p}, we can easily derive the bound

T_{2,p} ≤ ( ω_{1,δ} + p |μ_{θ,δ}| ω_{2,δ} ) W_p^p,   (5.29)

where

ω_{1,δ} := |P| δ sup_{ζ∈B_{δ₁}} |d_x λ_n[Cζ]| / eig(P),  ω_{2,δ} := |P| sup_{ζ∈B_{δ₁}} λ_n[Cζ] / eig(P).

By taking into account that the dynamics are locally Lipschitz, we obtain

T_{3,p} ≤ ω_{3,δ} ‖G₁(·)‖₀ W_{p−1}^{p−1},   (5.30)

where

ω_{3,δ} := 2 (|P| / eig(P)) max{ γ_{4,δ}, γ_{5,δ}, γ_{6,δ}, (3/2) sup_{ζ∈B_{δ₁}} ‖K[ζ]‖₀, γ_{7,δ} };

γ_{4,δ} := sup_{ζ∈B_{δ₁}, y∈C¹([0,L];R), ζ̂∈C⁰([0,L];Rⁿ), ε≠0} (1/‖ε‖₀) ‖Θ⁻¹( Δ_z[F]( ζ − T̃[y] y ) + N₂[y, ζ̂₂ − ζ₂] )‖₀,
γ_{5,δ} := sup_{ζ∈B_{δ₁}, ζ̂∈C⁰([0,L];Rⁿ), ε≠0} ‖K^{ζ̂}[ζ]‖₀ / ‖ε‖₀,
γ_{6,δ} := sup_{ζ̂∈C⁰([0,L];Rⁿ), ‖y‖₁≤δ, ε_t∈C⁰([0,L];Rⁿ), ε_t≠0} ‖Θ⁻¹ ⟨DF[z], Ds_δ(ζ̂) Θ ε_t⟩‖₀ / ‖ε_t‖₀,
γ_{7,δ} := sup_{ζ∈B_{δ₁}, ‖y‖_{q₀}≤δ, ζ̂₂∈C⁰([0,L];R), ε,ε_t≠0} ‖Θ⁻¹ d_t N₂[y, ζ̂₂ − ζ₂]‖₀ / ( ‖ε‖₀ + ‖ε_t‖₀ ),

where we can easily see that γ_{7,δ} is finite, since the quantity d_t N₂[y, ζ̂₂ − ζ₂] is given by

d_t N₂[y, ζ̂₂ − ζ₂] = N₂[y, θ² ∂_t ε₂] + ( 0, ⟨Dτ[y], θ² ε₂⟩ y_{tx}, 0 )ᵀ + ( 0, ⟨D²τ[y] y_t, θ² ε₂⟩ y_x, 0 )ᵀ,

and y_t, y_{tx} are uniformly bounded whenever ξ ∈ B_δ^{q₀} (see Assumption 5.2), as seen from the hyperbolic dynamics (5.1a). The term T_{4,p}, which will lead to the negativity of the Lyapunov derivative, can be rewritten as

T_{4,p} = −θ ∫₀^L π(x) exp(p μ_{θ,δ} x) G_{p−1}(x) Eᵀ(x) Σ[ζ](x) E(x) dx,

where E := ( εᵀ, ε_tᵀ )ᵀ and, after utilizing (5.11), Σ : B_{δ₁} → C⁰([0, L]; R^{2n×2n}) is given by

Σ[ζ] := [ I_n , −ρ₀ (A + KC)ᵀ Kᵀ[ζ] P ; −ρ₀ P K[ζ] (A + KC) , ρ₀ I_n ].

Now, we can easily verify (using a Schur complement) that inf_{ζ∈B_{δ₁}} wᵀ Σ[ζ] w / |w|² > 0 for all w ∈ R^{2n} \ {0} if

ρ₀ < min{ 1 / ( |P|² |A + KC|² sup_{ζ∈B_{δ₁}} ‖K[ζ]‖₀² ), 1 }.

It turns out that, for every choice of matrices P and K satisfying (5.11), there always exists a ρ₀ such that the above inequality is satisfied, and this fact renders Σ positive definite. Consequently, there exists σ_δ > 0 such that

T_{4,p} ≤ −θ ( σ_δ / |P| ) W_p^p.   (5.31)

We note here that all the previously defined constants with subscript δ (for instance, γ_{i,δ}, i = 4, …, 7) have no dependence on the observer gain constant θ; this is a consequence of the triangularity of the involved nonlinear mappings, similarly as in the classical high-gain observer designs [13]. This property turns out to be sufficient for the solvability of the H-GODP. More precisely, while bounding the Lyapunov derivative from above, the independence of these parameters of θ means that no positive terms with linear (or higher order) dependence on θ are added; on the other hand, negative terms depending linearly on θ appear, as a direct consequence of the assumed observability of the pair (A, C). This renders the negativity of the Lyapunov derivative feasible. Now, combining (5.28)–(5.30) and (5.31) with (5.24), we obtain

Ẇ_p ≤ ( −θ ω_{4,δ} + ω_{5,δ} ln θ + ω_{6,δ} ) W_p + ω_{3,δ} W_p^{1−p} W_{p−1}^{p−1} ‖G₁(·)‖₀,   (5.32)

where ω_{4,δ} := σ_δ / |P|, ω_{5,δ} := ω_{2,δ} (2n−2) / L, ω_{6,δ} := ω_{2,δ} |ln μ_δ| / L. Now, using Hölder's inequality, one can obtain

W_{p−1}^{p−1} ≤ W_p^{p−1} ‖π(·)‖₀^{1/p}.

Utilizing the above inequality, (5.32) gives

Ẇ_p ≤ ( −θ ω_{4,δ} + ω_{5,δ} ln θ + ω_{6,δ} ) W_p + ω_{3,δ} π̄^{1/p} ‖G₁(·)‖₀.   (5.33)

We obtained the estimate (5.33) of Ẇ_p for ε of class C², but, by invoking density arguments, the result remains valid with ε only of class C¹ (see [9] for further details on analogous statements). Taking the limit as p → +∞ of both sides of (5.33), we get, in the distribution sense in (0, +∞),

(d/dt) ‖G₁(t, ·)‖₀ ≤ ( −θ ω_{4,δ} + ω_{5,δ} ln θ + ω_{7,δ} ) ‖G₁(t, ·)‖₀,   (5.34)

where ω_{7,δ} := ω_{3,δ} + ω_{6,δ}. Now, one can select the high gain θ such that θ > θ₀, where θ₀ ≥ 1 is such that

−θ ω_{4,δ} + ω_{5,δ} ln θ + ω_{7,δ} ≤ −2 κ_δ, ∀ θ > θ₀,   (5.35)

for some κ_δ > 0. One can easily check that, for any κ_δ > 0, there always exists a θ₀ ≥ 1, dependent on the involved constants, such that the previous inequality is satisfied. Subsequently, (5.34) yields the following differential inequality in the distribution sense in (0, +∞):

(d/dt) ‖G₁(t, ·)‖₀ ≤ −2 κ_δ ‖G₁(t, ·)‖₀,

and by the comparison lemma, we get

‖G₁(t, ·)‖₀ ≤ e^{−2κ_δ t} ‖G₁(0, ·)‖₀, ∀ t ≥ 0.   (5.36)
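The last step is the standard comparison lemma: any absolutely continuous V ≥ 0 with V̇ ≤ −2κV satisfies V(t) ≤ e^{−2κt} V(0). A toy numerical check (the scalar ODE and its coefficients are illustrative assumptions):

```python
import numpy as np

# V' = -(2k + 0.5*(1 + sin(10 t))) * V  <=  -2k V, so the comparison lemma
# predicts V(t) <= exp(-2 k t) * V(0) along the whole trajectory.
k, dt, T = 1.0, 1e-4, 3.0
V, ok = 1.0, True
for t in np.arange(0.0, T, dt):
    ok = ok and (V <= np.exp(-2.0 * k * t) + 1e-9)
    V += dt * (-(2.0 * k + 0.5 * (1.0 + np.sin(10.0 * t))) * V)
print("exponential bound satisfied:", ok)
```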


Now, from the dynamics (5.18), we can obtain the inequality

inf_{ζ∈B_{δ₁}} λ_n[Cζ] ‖ε_x‖₀ − s_{δ,θ} ‖ε‖₀ ≤ ‖ε_t‖₀ ≤ sup_{ζ∈B_{δ₁}} λ_n[Cζ] ‖ε_x‖₀ + s_{δ,θ} ‖ε‖₀,

where s_{δ,θ} := θ |A + KC| + γ_{4,δ}. Invoking these inequalities, (5.22a), estimate (5.36), and the inequality

(ρ₀/2) e^{(μ_{θ,δ} − |μ_{θ,δ}|) L/2} eig(P) ( ‖ε‖₀ + ‖ε_t‖₀ )² ≤ ‖G₁(·)‖₀ ≤ e^{(μ_{θ,δ} + |μ_{θ,δ}|) L/2} |P| ( ‖ε‖₀ + ‖ε_t‖₀ )²,

we obtain

‖ε(t, ·)‖₁ ≤ ℓ_{θ,δ} e^{−κ_δ t} ‖ε⁰‖₁, ∀ t ≥ 0,   (5.37)

where ε⁰(x) := ε(0, x) and

ℓ_{θ,δ} := sqrt( 2 |P| / ( ρ₀ eig(P) ) ) μ_δ^{1/2} θ^{n−1} max{ s_{δ,θ} / inf_{ζ∈B_{δ₁}} λ_n[Cζ] + 1, max{ 1 + 2 s_{δ,θ}, 2 sup_{ζ∈B_{δ₁}} λ_n[Cζ] } }.

By (5.37), we derive the following estimate, which holds for every t ≥ 0:

‖ζ̂(t, ·) − ζ(t, ·)‖₁ ≤ ℓ̄_{θ,δ} e^{−κ_δ t} ‖ζ̂⁰ − ζ⁰‖₁,   (5.38)

where ℓ̄_{θ,δ} := θ^{n−1} ℓ_{θ,δ}. Note that the polynomial dependence of ℓ̄_{θ,δ} on θ is a phenomenon appearing also in high-gain observer designs in finite dimensions. The proof of the exponential convergence of ζ̂ to ζ in the 1-norm is complete.

References

1. Alabau-Boussouira, F., Coron, J.-M., Olive, G.: Internal controllability of first order quasilinear hyperbolic systems with a reduced number of controls. SIAM J. Control Optim. 55(1), 300–323 (2017)
2. Anfinsen, H., Diagne, M., Aamo, O.M., Krstić, M.: An adaptive observer design for n + 1 coupled linear hyperbolic PDEs based on swapping. IEEE Trans. Autom. Control 61(12), 3979–3990 (2016)
3. Bastin, G., Coron, J.-M.: Stability and Boundary Stabilization of 1-D Hyperbolic Systems. Progress in Nonlinear Differential Equations and Their Applications. Springer International Publishing (2016)
4. Besançon, G.: Nonlinear Observers and Applications. Springer, New York (2007)
5. Bounit, H., Hammouri, H.: Observer design for distributed parameter dissipative bilinear systems. Appl. Math. Comput. Sci. 8, 381–402 (1998)
6. Castillo, F., Witrant, E., Prieur, C., Dugard, L.: Boundary observers for linear and quasi-linear hyperbolic systems with application to flow control. Automatica 49(11), 3180–3188 (2013)
7. Christofides, P.D., Daoutidis, P.: Feedback control of hyperbolic PDE systems. AIChE J. 42(11), 3063–3086 (1996)
8. Coron, J.-M., Vazquez, R., Krstić, M., Bastin, G.: Local exponential stabilization of a 2 × 2 quasilinear hyperbolic system using backstepping. SIAM J. Control Optim. 51(3), 2005–2035 (2013)
9. Coron, J.-M., Bastin, G.: Dissipative boundary conditions for one-dimensional quasi-linear hyperbolic systems: Lyapunov stability for the C1-norm. SIAM J. Control Optim. 53(3), 1464–1483 (2015)
10. Coron, J.-M., Nguyen, H.M.: Optimal time for the controllability of linear hyperbolic systems in one dimensional space. SIAM J. Control Optim. 57(2), 1127–1156 (2019)
11. Di Meglio, F., Vazquez, R., Krstić, M.: Stabilization of a system of n + 1 coupled first-order hyperbolic linear PDEs with a single boundary input. IEEE Trans. Autom. Control 58, 3097–3111 (2013)
12. Gauthier, J.P., Bornard, G.: Observability for any u(t) of a class of nonlinear systems. IEEE Trans. Autom. Control 26(4), 922–926 (1981)
13. Gauthier, J.P., Hammouri, H., Othman, S.: A simple observer for nonlinear systems: applications to bioreactors. IEEE Trans. Autom. Control 37(6), 875–880 (1992)
14. Hasan, A., Aamo, O.M., Krstić, M.: Boundary observer design for hyperbolic PDE-ODE cascade systems. Automatica 68, 75–86 (2016)
15. Karafyllis, I., Ahmed-Ali, T., Giri, F.: Sampled-data observers for 1-D parabolic PDEs with non-local outputs. Syst. Control Lett. 133 (2019)
16. Keimer, A., Pflug, L., Spinola, M.: Existence, uniqueness and regularity of multidimensional nonlocal balance laws with damping. J. Math. Anal. Appl. 466, 18–55 (2018)
17. Khalil, H.K.: High-Gain Observers in Nonlinear Feedback Control. Advances in Design and Control. SIAM (2017)
18. Khalil, H.K., Praly, L.: High-gain observers in nonlinear feedback control. Int. J. Robust Nonlinear Control 24(6), 993–1015 (2014)
19. Kitsos, C.: High-gain observer design for systems of PDEs. Ph.D. Thesis, Univ. Grenoble Alpes (2020)
20. Kitsos, C., Besançon, G., Prieur, C.: High-gain observer design for a class of quasilinear integro-differential hyperbolic systems - application to an epidemic model. To appear in IEEE Transactions on Automatic Control (2022). https://doi.org/10.1109/TAC.2021.3063368
21. Kitsos, C., Besançon, G., Prieur, C.: High-gain observer design for some semilinear reaction-diffusion systems: a transformation-based approach. IEEE Control Syst. Lett. 5(2), 629–634 (2021)
22. Kmit, I.: Classical solvability of nonlinear initial-boundary problems for first-order hyperbolic systems. Int. J. Dyn. Syst. Differ. Equ. 1(3), 191–195 (2008)
23. Li, T.-T.: Exact boundary observability for quasilinear hyperbolic systems. ESAIM: Control Optim. Calc. Var. 14(4), 759–766 (2008)
24. Lissy, P., Zuazua, E.: Internal observability for coupled systems of linear partial differential equations. SIAM J. Control Optim. 57(2), 832–853 (2019)
25. Meurer, T.: On the extended Luenberger-type observer for semilinear distributed-parameter systems. IEEE Trans. Autom. Control 58(7), 1732–1743 (2013)
26. Nguyen, V., Georges, D., Besançon, G.: State and parameter estimation in 1-D hyperbolic PDEs based on an adjoint method. Automatica 67, 185–191 (2016)
27. Prieur, C., Girard, A., Witrant, E.: Stability of switched linear hyperbolic systems by Lyapunov techniques. IEEE Trans. Autom. Control 59(8), 2196–2202 (2014)
28. Schaum, A., Moreno, J.A., Alvarez, J., Meurer, T.: A simple observer scheme for a class of 1-D semi-linear parabolic distributed parameter systems. In: European Control Conference, Linz, Austria, pp. 49–54 (2015)


29. Vazquez, R., Krstić, M.: Boundary observer for output-feedback stabilization of thermal-fluid convection loop. IEEE Trans. Control Syst. Technol. 18(4), 789–797 (2010)
30. Xu, C., Ligarius, P., Gauthier, J.P.: An observer for infinite-dimensional dissipative bilinear systems. Comput. Math. Appl. 29(7), 13–21 (1995)

Chapter 6
Robust Adaptive Disturbance Attenuation
Saeid Jafari and Petros Ioannou

Abstract Effective attenuation of unwanted sound and vibrations dominated by a number of harmonics is a key enabling technology in a vast array of industrial control applications. This chapter focuses on output-feedback robust adaptive suppression of unknown disturbances acting on dynamical systems with uncertain models. Under certain assumptions on the open-loop plant characteristics, the robust stability and performance of a proposed adaptive disturbance-rejecting feedback control scheme are examined. It is shown that over-parameterization of the controller, together with a suitable pre-compensation of the open-loop plant, can effectively attenuate the disturbance harmonics without amplifying the output broadband noise, while at the same time guaranteeing stability and robustness with respect to unmodeled dynamics. Analytical results are provided for single-input single-output as well as multi-input multi-output systems in both continuous- and discrete-time domains, and practical design considerations are presented together with numerical simulations to demonstrate the effectiveness of the proposed scheme.

This chapter is partially reprinted from [1, 2] with the following copyright and permission notices:
• © 2015, IEEE. Reprinted, with permission, from S. Jafari, P. Ioannou, B. Fitzpatrick, and Y. Wang, Robustness and performance of adaptive suppression of unknown periodic disturbances, IEEE Transactions on Automatic Control, vol. 60, pages 2166–2171 (2015).
• Reprinted from Automatica, Vol. 70, S. Jafari and P. Ioannou, Robust adaptive attenuation of unknown periodic disturbances in uncertain multi-input multi-output systems, pages 32–42, Copyright (2016), with permission from Elsevier.

S. Jafari: Aurora Flight Sciences – A Boeing Company, Manassas, Virginia 20110, United States. e-mail: [email protected]
P. Ioannou: Department of Electrical and Computer Engineering, University of Southern California, Los Angeles, California 90089, United States. e-mail: [email protected]

© The Author(s), under exclusive license to Springer Nature Switzerland AG 2022
Z.-P. Jiang et al. (eds.), Trends in Nonlinear and Adaptive Control, Lecture Notes in Control and Information Sciences 488, https://doi.org/10.1007/978-3-030-74628-5_6


6.1 Introduction

The problem of attenuation of unwanted sound and vibrations is very important in many precision-engineering systems, as the system's performance is significantly affected by various sources of vibrational disturbances. This problem arises in many applications, such as laser beam pointing, structural vibration control, active acoustic noise control, disturbance reduction in hard disk drives, disturbance suppression in wind turbines, and active acoustic noise control in magnetic resonance imaging systems [3–13].

The applications of laser systems have grown in recent decades, including areas such as communications [14, 15], detection and ranging [16], imaging [17], medicine [18], and defense [19, 20]. Of particular interest in many laser applications are the problems of beam control, ranging from pointing/tracking to wavefront control. The performance of laser beam control systems is often adversely affected by difficult-to-characterize disturbances that arise from the medium of propagation, structural vibrations in the platform, or other external factors [21]. In particular, much of the jitter in laser beam control is due to periodic disturbances whose frequencies and amplitudes are unknown and could vary with time.

Isolation of sensitive equipment from the vibrations of a base structure is another application of adaptive vibration control. In some cases, the sensitive equipment may be supported by a structure that vibrates due to unknown oscillatory forces. A set of actuators and sensors connected by a feedback loop is often employed to minimize the effects of vibrations and to prevent the propagation of vibrational disturbances to sensitive components [8, 22].
Examples of such applications include (i) vibration reduction in helicopters, where the main objectives are to improve the comfort of the pilot and passengers, to reduce fatigue in the rotor and structure of the helicopter, and to protect onboard equipment from damage [23–26]; and (ii) vibration control in high-performance spacecraft, where some components generate uncertain periodic disturbances that are detrimental to performance, for example, disturbances caused by gyroscopes and cryogenic coolers [27]. Identifying and attenuating the effect of such disturbances is essential to improve the performance of instruments.

The control of acoustic noise in a wide class of systems has been the subject of considerable engineering research over recent decades. In many industrial applications, undesired sound can be classified as periodic or quasi-periodic disturbances, which are mainly caused by components such as electric motors, compressors, engines, cooling systems, fans, propellers, and air-conditioning systems, or can be generated by resonance coils in magnetic resonance imaging systems [13, 28–32]. In active noise control, the main objective is to minimize the noise level in an environment by producing anti-noise at the sensor positions, i.e., generating sound from speakers (control actuators) such that the noise level at the microphone (error sensor) positions is made as small as possible. Active noise control primarily deals with low-frequency acoustic noise, typically below 500 Hz; high-frequency components can be effectively suppressed by passive sound absorbers [33, 34]. Some successful applications, including reduction of propeller-induced cabin noise in aircraft, engine and road noise in cars, and noise in acoustic ducts, can be found in [35–37].

In most research efforts of the past two decades, the problem of attenuation of unknown periodic disturbances is formulated as follows. A dynamical system with an LTI model is considered in the presence of additive input and/or output disturbances. The nominal model of the system is often assumed to be known and stable. Implicit in this assumption is that, if the system were unstable but known, a robust LTI controller could first be designed so as to end up with a configuration involving a stable LTI system with a known nominal model. In many control applications, the plant has already been stabilized with a fixed-gain controller, and what is referred to as the controlled process for the disturbance rejection problem is an augmentation of the original baseline control design [7, 9, 38].
In indirect methods, an on-line estimator provides an estimate of the parameters of the disturbance internal model, which are used to calculate the controller parameters and generate the control action. In direct methods, however, under a certain parameterization, e.g., the Youla–Kučera or other parameterizations [47, 48], the controller parameters are directly calculated on-line without intermediate calculations. Also, a state-space approach involving the design of adaptive observers together with on-line frequency identifiers is used in [49]. In [41, 50], a method based on the concept of a phase-locked loop is proposed. The problem has also been studied using harmonic steady-state methods and averaging, in which the plant is approximated by its steady-state sinusoidal response [26, 51]. In [52, 53], adaptive pole placement control has been used for rejection of unknown disturbances acting on unknown LTI systems. It is assumed that the plant order and the number of distinct frequencies of the disturbance are known, but the plant and disturbance parameters are unknown. The global stability and convergence of the algorithm have been established without the requirement of persistently exciting signals. The method, however, has some major drawbacks impairing its usability in practical situations: in addition to difficulties in the extraction of the estimate of the internal model of the disturbance from a composite polynomial, especially in high-order cases, the procedure is susceptible to numerical problems due to possible division by zero; also, no effective procedure has been proposed for the construction of some design parameters; moreover, some


unrealistic assumptions are made that limit its practical use. In [54], an adaptive harmonic steady-state algorithm for rejection of sinusoidal disturbances acting on unknown linear systems has been proposed, but the disturbance frequencies are assumed to be known; in [55], the idea has been extended to unknown disturbances. In both cases, the local stability of the closed-loop system has been established, but no analysis of the size of the region of attraction has been provided. A state-derivative feedback adaptive controller has been proposed in [56] for the cancellation of unknown sinusoidal disturbances acting on a class of continuous-time LTI systems with unknown parameters; and in [57], an LTI plant model in the controllable canonical form with unknown parameters and measurable full state is considered, and a state-feedback adaptive control scheme is proposed for the rejection of an unknown sinusoidal disturbance term. It is, however, not clear that this approach can be practically implemented, as the robustness properties of the scheme in the presence of unmodeled dynamics have not been studied; moreover, it has not been addressed how the proposed controller may perform for rejection of disturbances with multiple frequencies. In fact, the case of an unknown plant model and unknown disturbance remains an open problem, as no practical solution with guaranteed global stability has yet been proposed. Despite the considerable number of publications in this area, problems of high practical importance need to be addressed.
◦ In practice, the plant transfer function is never known perfectly, let alone LTI. Analysis of the effect of inevitable plant modeling uncertainties on any control scheme is of great practical importance.
In most publications, the robustness with respect to plant unmodeled dynamics and noise has been taken for granted under the argument that persistence of excitation (PE) of the regressor in the adaptive law is guaranteed due to sufficient excitation by the periodic disturbance terms. The PE property guarantees exponential convergence of the estimated disturbance model parameters close to their real values, which in turn guarantees a level of robustness. This assumption, however, is based on a parameterization that uses the exact number of the unknown frequencies in the disturbance. Assuming an upper bound for the number of frequencies, which is the practical thing to do given that the number of frequencies may change with time, leads to an over-parameterization, in which case the regressor cannot be PE. In the absence of PE, most if not all of the adaptive laws proposed in the literature, and even experimented with, are non-robust, as small disturbances can lead to parameter drift and instability, as pointed out using simple examples in [58]. This serious practical problem has been completely overlooked in the literature on the adaptive rejection of periodic vibrations. In this chapter, we address this problem and present practical solutions supported by analysis.
◦ In most publications and applications, the focus is to reject the periodic disturbances. In the absence of noise, this objective can be achieved exactly provided that the LTI plant is exactly known and stable. In the presence of noise, however, a control scheme that in the absence of noise perfectly rejects the periodic disturbances may drastically amplify the noise, leading to worse performance than


without the feedback. A practical feedback control scheme should have enough structural flexibility to reject the periodic disturbances without amplifying the noise. This structural flexibility should be supported analytically as part of the overall design. The amplification of noise often occurs when the zeros of the internal model of the disturbance are close to the zeros of the plant, among other cases. Addressing and understanding these issues and finding appropriate solutions are critical. One of the objectives of this chapter is to provide solutions to these problems and show possible limitations.
◦ The assumption that the plant is exactly known, stable, and LTI is fundamental to all approaches proposed for adaptive vibration control. The justification behind this assumption is that off-line identification is used to estimate the parameters of a dominant plant model and a fixed robust controller is designed to stabilize it. While this assumption may be valid under normal operation, changes in the plant parameters over time due to wear and tear or due to some component failures may easily lead to the failure of the fixed controller. Designing a robust adaptive control scheme that can simultaneously stabilize a plant with unknown or changing parameters in addition to rejecting periodic disturbances is recognized to be an open problem of practical importance. This chapter aims to address the above issues and propose practical solutions for the problem together with guidelines for the selection of design parameters for practical implementation.

6.2 Problem Formulation and Objectives

Consider an LTI system with n_u inputs and n_y outputs, whose output is corrupted by an additive disturbance as shown in Fig. 6.1. The input–output relationship of the system is described as

y(t) = G(q)[u(t)] + d(t) = (I + Δ_m(q)) G_0(q)[u(t)] + d(t),   (6.1)

Fig. 6.1 A system with modeled dynamics G_0(q) and multiplicative unmodeled dynamics Δ_m(q); the output y is corrupted by the unknown disturbance d, and u is the input


Fig. 6.2 The modeled part of the plant and disturbance for control design purposes

where y(t) ∈ R^{n_y} is the measurable output, u(t) ∈ R^{n_u} is the control input, and d(t) ∈ R^{n_y} is an unknown bounded disturbance which is not directly measurable. The transfer function of the system is denoted by G(q) = (I + Δ_m(q))G_0(q), where G_0(q) is the modeled part of the system, Δ_m(q) is an unknown multiplicative modeling uncertainty term, and q is either the variable of the Laplace transform (i.e., q = s in continuous-time systems) or the variable of the z-transform (i.e., q = z in discrete-time systems). In many applications, unwanted sound and vibrational disturbances are modeled as a summation of a finite number of sinusoidal terms with unknown frequency, magnitude, and phase, corrupted by broadband noise, as presented by the following equation for each output channel:

d(t) = d_s(t) + v(t) = Σ_{i=1}^{n_f} a_i sin(ω_i t + φ_i) + v(t),   (6.2)

where d_s(t) is the dominant part of the disturbance (modeled disturbance) with n_f distinct frequencies, and v(t) is a zero-mean bounded random noise. The parameters of the modeled part of the disturbance, i.e., a_i, ω_i, φ_i, and n_f, are all unknown. A known upper bound for n_f is often assumed to be available for control design. Note that the disturbances applied to different output channels may have completely different characteristics. As shown in Fig. 6.2, the modeled part of the disturbance, d_s(t), may be viewed as the response of an unknown LTI system with transfer function matrix G_d(q) to a Dirac impulse δ(t), where G_d(q) is of order 2n_f with all poles on the unit circle for the discrete-time model or on the jω-axis for the continuous-time formulation. The control objective is to design the control signal u as a function of the plant output y to make the effect of d on y as small as possible. Such a feedback control law must provide an acceptable level of performance and robustness when applied to the actual plant with modeling uncertainties. The analysis of the trade-offs between performance and robustness helps in the selection of design parameters to achieve a good practical design. The design of a robust adaptive control law is done under certain assumptions on the properties of the open-loop plant model. The following two cases are considered:


Fig. 6.3 General structure of the control law for Case 1, when the parameters of the plant model G_0(q) are known; K(q, θ) is an adaptive filter and F(q) is an LTI compensator

Fig. 6.4 General structure of the control law for Case 2, when the parameters of the plant model G_0(q) are unknown; K_y(q, θ_y) and K_u(q, θ_u) are adaptive filters

Case 1: The modeled part of the open-loop plant, G_0(q), is known and stable, with possibly unstable zeros. The following scenarios are studied for this case:
◦ Discrete-time SISO Systems
◦ Continuous-time SISO Systems
◦ Discrete-time MIMO Systems
◦ Continuous-time MIMO Systems
Case 2: The modeled part of the open-loop plant, G_0(q), is unknown and possibly unstable, but has stable zeros. The following scenario is studied for this case:
◦ Discrete-time SISO Systems
For Case 1, we consider the control architecture shown in Fig. 6.3, wherein the adaptive filter K(q, θ) and the LTI compensator F(q) are to be designed to meet the control objective [1, 2, 59, 60]. Knowledge of the parameters and stability of the modeled part of the plant, G_0(q), is utilized in the design of the control law. For Case 2, however, a fully adaptive control law as shown in Fig. 6.4 is assumed, wherein the adaptive filters K_y(q, θ_y) and K_u(q, θ_u), in addition to rejecting the unknown disturbance, must stabilize a possibly unstable open-loop plant. In the second case, the minimum-phase property of G_0(q) is employed to meet the control objective. It should be noted that the architecture in Fig. 6.4 is equivalent to that of classical model-reference adaptive control with zero reference signal [58, Sect. 6.3], [61].
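The disturbance model (6.2) used throughout can be sketched numerically as follows; all numerical values (amplitudes, frequencies, phases, noise level) are illustrative assumptions, not values taken from the chapter.

```python
import numpy as np

# Sketch of (6.2): d(t) = d_s(t) + v(t), a sum of n_f sinusoids with
# unknown amplitudes a_i, frequencies w_i, and phases phi_i, plus
# zero-mean broadband noise v(t).
rng = np.random.default_rng(0)

n_f = 2                                   # number of distinct frequencies
a = np.array([1.0, 0.5])                  # amplitudes a_i (assumed)
w = np.array([0.0521, 0.15])              # frequencies omega_i (rad/sample)
phi = np.array([0.3, 1.1])                # phases phi_i (assumed)

T = 2000
t = np.arange(T)
d_s = sum(a[i] * np.sin(w[i] * t + phi[i]) for i in range(n_f))  # modeled part
v = 0.02 * rng.standard_normal(T)         # zero-mean broadband noise
d = d_s + v                               # total disturbance d(t)
```

In a discrete-time implementation such as this, t is the dimensionless sample index, consistent with the notation introduced in Sect. 6.2.1 below.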

6.2.1 Preliminaries and Notation

Throughout this chapter, the following notation and definitions are used. For an n_u-input, n_y-output finite-dimensional LTI system with real-rational transfer function matrix H(q), with input u(t) and output y(t), the input–output relationship is


expressed as y(t) = H(q)[u(t)], where the variable q is either z or s, for discrete-time and continuous-time systems, respectively. For a signal x, the same notation x(t) is used in both the continuous- and discrete-time domains; in the continuous-time case, the independent variable t takes continuous values, while in the discrete-time case, t is the dimensionless time index taking only integer values (assuming a fixed constant sampling period). With this notation, for a sinusoidal signal x(t) = sin(ω_0 t), the frequency ω_0 is in rad/s in the continuous-time domain and in rad/sample in the discrete-time domain.

Definition 6.1 The H∞-norm and the L1-norm of an LTI system with transfer function matrix H(q) are defined as follows [58, 62, 63]:
• Discrete-time systems (q = z): The H∞-norm is defined as

‖H(z)‖_∞ = max_{θ∈[0,π]} σ_max(H(e^{iθ})),

where σ_max denotes the maximum singular value, and the L1-norm is defined as

‖H(z)‖_1 = max_{i=1,…,n_y} Σ_{j=1}^{n_u} ‖h_{ij}‖_1,

where h_{ij} is the impulse response of the ij-element of H(z) and ‖h_{ij}‖_1 = Σ_{τ=0}^{∞} |h_{ij}(τ)|.
• Continuous-time systems (q = s): The H∞-norm is defined as

‖H(s)‖_∞ = max_ω σ_max(H(jω)),

where σ_max denotes the maximum singular value, and the L1-norm is defined as

‖H(s)‖_1 = max_{i=1,…,n_y} Σ_{j=1}^{n_u} ‖h_{ij}‖_1,

where h_{ij} is the impulse response of the ij-element of H(s) and ‖h_{ij}‖_1 = ∫_0^∞ |h_{ij}(τ)| dτ.

Remark 6.1 The H∞ norm and the L1 norm are induced norms and satisfy the multiplicative property. For an LTI system with transfer function matrix H(q), the two norms are related as [64]:

‖H‖_∞ ≤ √n_y ‖H‖_1 ≤ √(n_u n_y) (2n + 1) ‖H‖_∞,

where n is the degree of a minimal realization of H(q).

Definition 6.2 We say that the signal x is μ-small in the mean square sense and write x ∈ S(μ), for a given constant μ ≥ 0, if [58, 62]:


• Discrete-time systems (q = z): For any t, T and some constants c_1, c_2 ≥ 0:

Σ_{τ=t}^{t+T−1} x(τ)^T x(τ) ≤ c_1 μT + c_2.

• Continuous-time systems (q = s): For any t, T and some constants c_1, c_2 ≥ 0:

∫_t^{t+T} x(τ)^T x(τ) dτ ≤ c_1 μT + c_2.
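As a numerical sanity check of Definition 6.1 and Remark 6.1, both norms can be computed for a simple SISO discrete-time example; H(z) = 1/(z − 0.5) is an illustrative choice, not a system from the chapter. For SISO systems (n_u = n_y = 1) the first inequality of Remark 6.1 reduces to ‖H‖_∞ ≤ ‖H‖_1.

```python
import numpy as np

# H(z) = 1/(z - 0.5): impulse response h(k) = 0.5**(k - 1) for k >= 1.

# H-infinity norm: max over theta in [0, pi] of |H(e^{i theta})|.
theta = np.linspace(0.0, np.pi, 20001)
z = np.exp(1j * theta)
h_inf = np.max(np.abs(1.0 / (z - 0.5)))

# L1 norm: sum over tau of |h(tau)|; the geometric tail beyond k = 200
# is negligible.
k = np.arange(1, 201)
l1 = np.sum(np.abs(0.5 ** (k - 1)))

# Both norms equal 2 here (the peak gain is at theta = 0), so the SISO
# relation ||H||_inf <= ||H||_1 holds with equality for this example.
```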

6.3 Known Stable Plants: SISO Systems

In this section, we consider the plant model shown in Fig. 6.1 and the control structure in Fig. 6.3 and propose a robust adaptive scheme for rejection of the unknown periodic components of the disturbance acting on the plant output. It is assumed that the nominal plant model, G_0(q), is known and stable, possibly with unstable zeros. The stability and performance properties of the closed-loop system are analyzed for both discrete-time and continuous-time SISO systems. We first consider the ideal (non-adaptive) scenario in which complete information about the characteristics of the disturbance is available. The analysis of the known frequency case allows us to identify the best performance that can be achieved and to set the reference performance and robustness levels against which the more realistic case of unknown frequencies is compared. We show that the rejection of periodic terms may lead to amplification of the unmodeled output disturbance v(t). A way to avoid such undesirable noise amplification is to increase the order of the feedback adaptive filter so as to have the flexibility to reject the periodic disturbance terms while minimizing the effect of the noise on the plant output. The increased filter order leads to an over-parameterized scheme in which persistence of excitation is no longer possible, and this shortcoming makes the use of robust adaptation essential. With this important insight in mind, the coefficients of the over-parameterized feedback filter are adapted using a robust adaptive law. We show analytically that the proposed robust adaptive control scheme guarantees performance and stability robustness with respect to unmodeled dynamics and disturbance.

6.3.1 Discrete-Time Systems

Consider the plant model (6.1) and assume G_0(z) is a known stable nominal plant transfer function (possibly non-minimum-phase) and the unknown multiplicative modeling uncertainty term Δ_m(z) is such that Δ_m(z)G_0(z) is proper with stable poles.

6.3.1.1 Non-adaptive Case: Known Disturbance Frequencies

Let the filter K in Fig. 6.3 be an FIR filter of order N of the form K(z, θ) = Θ(z)/z^N, where Θ(z) = θ_0 + θ_1 z + … + θ_{N−1} z^{N−1}, which can be written as

K(z, θ) = Θ(z)/z^N = Σ_{i=0}^{N−1} θ_i z^{i−N} = θ^T α(z),   α(z) = [z^{−N}, z^{1−N}, …, z^{−1}]^T,   (6.3)

where θ = [θ_0, θ_1, …, θ_{N−1}]^T ∈ R^N. The control objective is to find the parameter vector θ and to design a stable compensator F(z) (if needed) such that the magnitude of y is minimized. If the frequencies of d_s(t) in (6.2) are known, then its internal model

D_s(z) = Π_{i=1}^{n_f} (z^2 − 2 cos(ω_i) z + 1)   (6.4)

is known. From Figs. 6.1 and 6.3, the sensitivity transfer function from d(t) to y(t) is given by

y(t) = S(z, θ)[d(t)] = [(1 − G_0(z)F(z)K(z, θ)) / (1 + Δ_m(z)G_0(z)F(z)K(z, θ))] [d(t)].   (6.5)

It follows from (6.5) that the effect of the periodic components of d(t) on y(t) is completely rejected if S(z, θ) has zeros on the unit circle at the disturbance frequencies; in other words, if S(z, θ) has the internal model of the sinusoidal components, D_s(z), as a factor. In addition, the filters K(z, θ) and F(z) should be chosen such that S(z, θ) remains stable for any admissible Δ_m(z). Assuming that we have perfect knowledge of the frequencies in d_s(t), we show that with the control architecture of Fig. 6.3 and the filter (6.3), the control objective can be met. In particular, we discuss how the design parameters may affect robust stability and performance.

Theorem 6.1 Consider the closed-loop system shown in Figs. 6.1 and 6.3 with disturbance model (6.2) and filter K(z, θ) of the form (6.3). Let ω_1, …, ω_{n_f} be the distinct frequencies of the sinusoidal disturbance d_s(t) and D_s(z) in (6.4) be the internal model of d_s(t). Then, there exists a θ* such that with K(z, θ*), the control law of Fig. 6.3 completely rejects the periodic components of the disturbances if and only if G_0(z_i)F(z_i) ≠ 0, z_i = exp(±jω_i), for i = 1, 2, …, n_f, i.e., G_0(z)F(z) has no zero at the roots of D_s(z), and N ≥ 2n_f, provided the stability condition

‖Δ_m(z)G_0(z)F(z)K(z, θ*)‖_∞ < 1   (6.6)

is satisfied. The choice of θ* is unique if N = 2n_f. In addition, all signals in the closed-loop system are guaranteed to be uniformly bounded and the plant output satisfies

lim_{t→∞} sup_{τ≥t} |y(τ)| ≤ ‖S(z, θ*)‖_1 v_0 ≤ c (1 + ‖G_0(z)F(z)K(z, θ*)‖_1) v_0,   (6.7)

where v_0 = sup_τ |v(τ)| and c = ‖1/(1 + Δ_m(z)G_0(z)F(z)K(z, θ*))‖_1 is a finite positive constant. Moreover, in the absence of unmodeled noise (i.e., if v(t) = 0), y(t) converges to zero exponentially fast.

Proof The proof is given in [1].
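The mechanics of Theorem 6.1 in the minimal case N = 2n_f can be checked numerically: with one known frequency and N = 2, the constraint G_0(z_i)K(z_i, θ) = 1 amounts to two real linear equations with a unique solution θ*, at which the sensitivity (6.5) vanishes at the disturbance frequency. The plant G_0(z) = z/(z − 0.5) and the frequency below are illustrative assumptions (F = 1, Δ_m = 0), not values from the chapter.

```python
import numpy as np

w0 = 0.3                                   # known disturbance frequency (rad/sample)
G0 = lambda z: z / (z - 0.5)               # hypothetical stable plant G0(z)
z0 = np.exp(1j * w0)

# FIR filter (6.3) with N = 2: K(z, theta) = theta0*z**-2 + theta1*z**-1.
alpha = lambda z: np.array([z ** -2, z ** -1])

# The single complex constraint G0(z0) K(z0, theta) = 1 gives two real
# linear equations in (theta0, theta1).
A = np.array([(G0(z0) * alpha(z0)).real, (G0(z0) * alpha(z0)).imag])
b = np.array([1.0, 0.0])
theta = np.linalg.solve(A, b)              # unique theta* since N = 2 n_f

# Sensitivity (6.5) with F = 1, Delta_m = 0; it vanishes at e^{+/- j w0},
# so the sinusoid at w0 is completely rejected.
S = lambda z: 1.0 - G0(z) * (theta @ alpha(z))
```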

Theorem 6.1 gives conditions under which we can completely reject the sinusoidal components of the disturbance when the frequencies in d_s(t) are known. It also shows that if the plant uncertainty satisfies a norm-bound condition, the output will be of the order of the broadband random noise level at steady state. In the presence of the noise v(t), however, a large magnitude of the sensitivity function S(z, θ*) may lead to noise amplification, especially at high frequencies. The situation may be worse if the plant has a very small gain in the frequency range of d_s(t) and a larger gain at high frequencies. In such cases, the design of a pre-compensator F(z) to shape the frequency response of the plant is a possible remedy to achieve a good compromise between performance and robustness. It should be noted that with N = 2n_f, there exists a unique θ* for which complete rejection of the sinusoidal terms of the disturbance is possible. Such a unique parameter vector, however, does not provide any flexibility to improve performance and/or robust stability margins. For N > 2n_f, however, there exists an infinite number of vectors θ* that guarantee the results of Theorem 6.1. In such cases, one may choose a θ* that, in addition to rejecting the sinusoidal disturbance terms, minimizes the magnitude of G_0(z)F(z)K(z, θ*) and therefore limits the possible amplification of the output noise. The existence of a minimizer is the subject of the following lemma.

Lemma 6.1 Consider the closed-loop system shown in Figs. 6.1 and 6.3 with disturbance model (6.2) and filter K(z, θ) of the form (6.3). If the conditions in Theorem 6.1 are satisfied, then there exists a θ* with ‖θ*‖ ≤ r_0, for some r_0 > 0, that solves the following constrained convex optimization problem:

θ* = arg min_{θ∈Ω} ‖G_0(z)F(z)K(z, θ)‖_∞   (6.8)

where

Ω = {θ | ‖θ‖ ≤ r_0, G_0(z_i)F(z_i)K(z_i, θ) = 1, z_i = exp(±jω_i), i = 1, …, n_f}.   (6.9)

Proof The proof is given in [1].


Remark 6.2 When N = 2n_f, the set Ω in (6.9) is a singleton; hence, the cost in (6.8) is a fixed constant and cannot be reduced.


Remark 6.3 As shown in [1], the constraint G_0(z_i)F(z_i)K(z_i, θ) = 1 is a polynomial equation which can be expressed as a Sylvester-type matrix equation, i.e., the constraint is equivalent to a system of linear algebraic equations. The following simple example illustrates the above results.

Example 6.1 Consider the following open-loop plant model and disturbance:

G_0(z) = −0.00146 (z − 0.1438)(z − 1) / [(z − 0.7096)(z^2 − 0.04369z + 0.01392)],   (6.10)

with a sampling period of 1/480 s, and assume

d(t) = sin(ω_0 t) + v(t),   (6.11)

where ω_0 = 0.0521 rad/sample (= 25 rad/s) and v(t) is zero-mean Gaussian noise with standard deviation 0.02. Let us suppose Δ_m(z) = 0 and choose F(z) = 1. Figure 6.5 shows the performance of the off-line designed controllers obtained from (6.8) for complete rejection of the periodic term of the disturbance. The performance of controllers with different orders N is shown with different colors in Fig. 6.5. After closing the loop at t = 20 s, the sinusoidal disturbance is completely rejected, but the noise is amplified for low-order filters. Increasing the order of the filter K(z, θ) provides the flexibility to reduce noise amplification.

Fig. 6.5 Control performance for Example 6.1, where the disturbance frequency ω_0 is known. The controller is turned on at time 20 s. The closed-loop plant outputs for three different values of the filter order N show the performance improvement obtained by increasing N. © 2015, IEEE. Reprinted, with permission, from S. Jafari, P. Ioannou, B. Fitzpatrick, and Y. Wang, Robustness and performance of adaptive suppression of unknown periodic disturbances, IEEE Transactions on Automatic Control, vol. 60, pages 2166–2171 (2015)
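The constraint in Example 6.1 can be set up directly, since per Remark 6.3 it is linear in θ. A minimal sketch with the plant (6.10), ω_0 = 0.0521 rad/sample, and F(z) = 1: for N > 2n_f the constraint has infinitely many solutions, and the minimum-norm solution from `lstsq` is used here as a simple stand-in for the H∞ minimization (6.8), which it does not solve exactly.

```python
import numpy as np

def G0(z):
    """Plant (6.10) of Example 6.1."""
    return (-0.00146 * (z - 0.1438) * (z - 1.0)
            / ((z - 0.7096) * (z * z - 0.04369 * z + 0.01392)))

w0, N = 0.0521, 8                          # known frequency; order N > 2 n_f
z0 = np.exp(1j * w0)
alpha = lambda z: np.array([z ** (i - N) for i in range(N)])

# Complex constraint G0(z0) K(z0, theta) = 1 -> two real linear equations;
# the conjugate frequency -w0 adds no new equations since theta is real.
row = G0(z0) * alpha(z0)
A = np.vstack([row.real, row.imag])
b = np.array([1.0, 0.0])
theta, *_ = np.linalg.lstsq(A, b, rcond=None)   # minimum-norm solution

# Sensitivity (6.5) with Delta_m = 0: it is zero at the disturbance
# frequency, so sin(w0 t) is rejected in steady state.
S = lambda z: 1.0 - G0(z) * (theta @ alpha(z))
```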


The control design and analysis of the known frequency case are very useful as they establish analytically that, for better performance, the order of the filter K(z, θ) has to be much larger than 2n_f in order to provide the structural flexibility to design the coefficients of the filter to simultaneously reject the periodic components of the disturbance and minimize the effect of noise disturbances on the plant output. The non-adaptive analysis provides the form and dimension of the filter K(z, θ) for the adaptive case where the frequencies of the disturbances are completely unknown. We treat this case in the following subsection.

6.3.1.2 Adaptive Case: Unknown Disturbance

In order to attenuate disturbances with unknown characteristics, the same control architecture as shown in Fig. 6.3 is employed, wherein the unknown controller parameters are replaced with their estimates. That is, based on the certainty equivalence principle [62], a robust parameter identifier is designed to calculate the controller parameters at each time. The adaptive filter in Fig. 6.3 is given by

K(z, θ̂(t)) = Σ_{i=0}^{N−1} θ̂_i(t) z^{i−N} = θ̂^T(t) α(z),   α(z) = [z^{−N}, z^{1−N}, …, z^{−1}]^T,   (6.12)

where θ̂(t) = [θ̂_0(t), θ̂_1(t), …, θ̂_{N−1}(t)]^T ∈ R^N is the estimate of the unknown θ* at time t. The control law is the same as in the known parameter case except that the unknown vector θ* is replaced with its estimate:

u(t) = −F(z)[K(z, θ̂(t−1))[ζ(t)]],   ζ(t) = y(t) − G_0(z)[u(t)],   (6.13)

where θ̂(t−1) is the most recent estimate of θ* available to generate the control action at time t. In order to design a parameter estimator to generate θ̂(t) at each time t, we express θ* in an appropriate parametric form. The following lemma presents a parametric model for the closed-loop plant that is used for online parameter estimation.

Lemma 6.2 The closed-loop system (6.1), (6.13) is parameterized as

ζ(t) = φ(t)^T θ* + η(t),   (6.14)

where θ* = [θ*_0, θ*_1, …, θ*_{N−1}]^T ∈ R^N is the unknown desired parameter vector to be identified, ζ(t) = y(t) − G_0(z)[u(t)] is a measurable scalar signal, and the regressor vector is given by


φ(t) = G_0(z)F(z)α(z)[ζ(t)],   (6.15)

and the unknown error term η(t) depends on the unmodeled noise v(t) and the plant unmodeled dynamics Δ_m(z) and is given by

η(t) = (1 − G_0(z)F(z)K(z, θ*)) [v(t) + Δ_m(z)G_0(z)F(z)[u(t)]] + ε_s(t),   (6.16)

where ε_s(t) = (1 − G_0(z)F(z)K(z, θ*))[d_s(t)] is a term that decays to zero exponentially.

Proof The proof is given in [1].
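The regressor (6.15) is simply ζ(t) passed through G_0(z)F(z) and the delay bank α(z). A minimal sketch with F(z) = 1 and the plant (6.10), applied to an arbitrary test signal; the filtering routine is a plain difference-equation implementation, written out to keep the sketch self-contained.

```python
import numpy as np

def lfilter(b, a, x):
    """y(t) = (b(z^-1)/a(z^-1))[x(t)] via the standard difference equation."""
    y = np.zeros(len(x))
    for t in range(len(x)):
        acc = sum(b[k] * x[t - k] for k in range(len(b)) if t - k >= 0)
        acc -= sum(a[k] * y[t - k] for k in range(1, len(a)) if t - k >= 0)
        y[t] = acc / a[0]
    return y

# Plant (6.10) written as polynomials in z^-1; the leading zero in the
# numerator accounts for its relative degree of one.
num = np.concatenate([[0.0], -0.00146 * np.convolve([1, -0.1438], [1, -1])])
den = np.convolve([1, -0.7096], [1, -0.04369, 0.01392])

N = 4
rng = np.random.default_rng(1)
zeta = rng.standard_normal(500)            # arbitrary test signal zeta(t)
g_zeta = lfilter(num, den, zeta)           # G0(z)[zeta(t)]

# phi_k(t) = G0(z) z^{-(N-k)} [zeta(t)]: delayed copies of G0[zeta],
# since alpha(z) = [z^-N, ..., z^-1]^T and LTI operators commute.
phi = np.stack([np.concatenate([np.zeros(N - k), g_zeta[: 500 - (N - k)]])
                for k in range(N)])
```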

Remark 6.4 It follows from Lemma 6.2 that in the absence of noise and modeling error (i.e., if Δ_m(z) = 0 and v(t) = 0), the error term η(t) is just an exponentially decaying signal.

Remark 6.5 The definition of the regressor (6.15) implies that the stable compensator F(z) provides the flexibility to manipulate the regressor excitation level by shaping the spectrum of the open-loop plant G_0(z).

Now, using the parametric model (6.14) and employing the parameter estimation techniques discussed in [62], we design a robust adaptive law to estimate the unknown parameter vector θ* as follows. Let θ̂(t−1) be the most recent estimate of θ*; then the predicted value of the signal ζ(t) based on θ̂(t−1) is generated as

ζ̂(t) = φ(t)^T θ̂(t−1).   (6.17)

The normalized estimation error is defined as

ε(t) = (ζ(t) − ζ̂(t)) / m^2(t),   (6.18)

where m^2(t) = 1 + γ_0 φ(t)^T φ(t), with γ_0 > 0, is a normalizing signal. To generate θ̂(t), we consider the robust pure least-squares algorithm [62]:

P(t) = P(t−1) − [P(t−1)φ(t)φ(t)^T P(t−1)] / [m^2(t) + φ(t)^T P(t−1)φ(t)],
θ̂(t) = proj(θ̂(t−1) + P(t)φ(t)ε(t)),   (6.19)

where P(0) = P(0)^T > 0 and the projection operator proj(·) is used to guarantee that θ̂(t) ∈ S, ∀t, where S is a compact set defined as

S = {θ ∈ R^N | θ^T θ ≤ θ_max^2},   (6.20)

where θ_max > 0 is such that the desired parameter vector of the optimum filter, θ*, belongs to S. Since we do not have a priori knowledge of the norm of θ*, the


upper bound θ_max must be chosen sufficiently large. The projection of the estimated parameter vector onto S may be implemented as [62, 65]:

χ(t) = θ̂(t−1) + P(t)ε(t)φ(t),
ρ̄(t) = P^{−1/2}(t)χ(t),
ρ(t) = ρ̄(t) if ρ̄(t) ∈ S̄, and the orthogonal projection of ρ̄(t) onto S̄ if ρ̄(t) ∉ S̄,
θ̂(t) = P^{1/2}(t)ρ(t),   (6.21)

where P^{−1} is decomposed into (P^{−1/2})^T(P^{−1/2}) at each time step t to ensure all the properties of the corresponding recursive least-squares algorithm without projection. In (6.21), the set S is transformed into S̄ such that if χ ∈ S, then P^{−1/2}χ ∈ S̄. Since S is chosen to be convex and P^{−1/2}χ is a linear transformation, S̄ is also a convex set [65]. An alternative to projection is a fixed σ-modification, wherein no bounds for |θ*| are required [58]. In implementing (6.19), we also need to use modifications such as covariance resetting [62], which monitors the covariance matrix P(t) to make sure it does not become small in some directions, i.e., its minimum eigenvalue is always greater than a small positive constant. Such modifications are presented and analyzed in [62] and other references on discrete-time robust adaptive control schemes. The following theorem summarizes the properties of the adaptive control law.

Theorem 6.2 Consider the closed-loop system (6.1), (6.13), (6.17)–(6.21), and choose N > 2n̄_f, where n̄_f is a known upper bound for the number of distinct frequencies in the disturbance. Assume that the plant modeling error satisfies

c_0 ‖Δ_m(z)G_0(z)F(z)‖_1 < 1,   (6.22)

where c_0 is a finite positive constant independent of Δ_m(z) which depends on known parameters. Then, all signals in the closed-loop system are uniformly bounded and the plant output satisfies

lim sup_{T→∞} (1/T) Σ_{τ=t}^{t+T−1} |y(τ)|^2 ≤ c (μ_Δ^2 + v_0^2),   (6.23)

for any t ≥ 0 and some finite positive constant c which is independent of t, T, Δ_m(z), and v(t), where μ_Δ is a constant proportional to the size of the plant unmodeled dynamics Δ_m(z), and v_0 = sup_τ |v(τ)|. In addition, in the absence of modeling error and noise (i.e., if Δ_m(z) = 0 and v(t) = 0), the adaptive control law guarantees the convergence of y(t) to zero.

Proof The proof is given in [1].
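A minimal simulation sketch of the adaptive scheme (6.13), (6.17)–(6.19) in the ideal case Δ_m = 0, v = 0, with F(z) = 1: the plant G_0(z) = 1/(z − 0.5), the disturbance frequency, and all other numbers are illustrative assumptions, and the projection (6.21) and covariance resetting are omitted for brevity. The output, initially equal to the sinusoidal disturbance, is driven toward zero as θ̂(t) converges.

```python
import numpy as np

T, N, w0, gamma0 = 1500, 4, 0.3, 1.0
d = np.sin(w0 * np.arange(T))             # disturbance, unknown to the controller

theta_hat = np.zeros(N)
P = 100.0 * np.eye(N)                     # P(0) = P(0)^T > 0
y = np.zeros(T); u = np.zeros(T); zeta = np.zeros(T)
gu = np.zeros(T)                          # G0(z)[u(t)], G0(z) = 1/(z - 0.5)
gz = np.zeros(T)                          # G0(z)[zeta(t)]

for t in range(T):
    # Plant: y = G0[u] + d, i.e. gu(t) = u(t-1) + 0.5 gu(t-1).
    if t > 0:
        gu[t] = u[t - 1] + 0.5 * gu[t - 1]
    y[t] = gu[t] + d[t]
    zeta[t] = y[t] - gu[t]                # zeta(t) in (6.13)
    if t > 0:
        gz[t] = zeta[t - 1] + 0.5 * gz[t - 1]

    # Regressor (6.15) with F = 1: phi_k(t) = G0[zeta](t - (N - k)).
    phi = np.array([gz[t - (N - k)] if t - (N - k) >= 0 else 0.0
                    for k in range(N)])

    # Control (6.13) using theta_hat(t-1) and the FIR structure (6.12).
    past = np.array([zeta[t - (N - k)] if t - (N - k) >= 0 else 0.0
                     for k in range(N)])
    u[t] = -theta_hat @ past

    # Adaptive law (6.17)-(6.19) without projection.
    m2 = 1.0 + gamma0 * phi @ phi
    eps = (zeta[t] - phi @ theta_hat) / m2
    P = P - np.outer(P @ phi, phi @ P) / (m2 + phi @ P @ phi)
    theta_hat = theta_hat + P @ phi * eps

rms_late = np.sqrt(np.mean(y[-300:] ** 2))   # output level after adaptation
```

With N = 4 > 2n_f = 2 the scheme is over-parameterized, which is exactly the regime where the text warns that PE is lost and robust modifications become necessary in the presence of noise and unmodeled dynamics.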


Remark 6.6 The condition (6.22) is a sufficient condition to ensure the uniform boundedness of all signals in the closed-loop system. Such norm-bound conditions


state that the closed-loop system can tolerate nonzero small-size modeling errors. This condition also indicates the role of the compensator F(z) in improving the stability robustness with respect to unmodeled dynamics.

6.3.2 Continuous-Time Systems

In this section, we extend the results of Sect. 6.3.1 to continuous-time systems. Consider the plant model (6.1) and assume G_0(s) is a known stable nominal plant transfer function (possibly non-minimum-phase) and the unknown multiplicative modeling uncertainty term Δ_m(s) is such that Δ_m(s)G_0(s) is proper with stable poles.

6.3.2.1 Non-adaptive Case: Known Disturbance Frequencies

Let the filter K in Fig. 6.3 be of the form

K(s, θ) = Σ_{k=0}^{N−1} θ_k λ^{N−k}/(s + λ)^{N−k} = θ^T Λ(s),   Λ(s) = [λ^N/(s + λ)^N, …, λ/(s + λ)]^T,   (6.24)

where θ = [θ_0, θ_1, …, θ_{N−1}]^T ∈ R^N is the controller parameter vector, and the scalar λ > 0 and the integer N are design parameters. The control objective is to find the parameter vector θ and to design a stable compensator F(s) (if needed) such that the magnitude of y is minimized. If the frequencies of d_s(t) in (6.2) are known, then its internal model

D_s(s) = Π_{i=1}^{n_f} (s^2 + ω_i^2)   (6.25)

is known. From Figs. 6.1 and 6.3, the sensitivity transfer function from d(t) to y(t) is given by

y(t) = S(s, θ)[d(t)] = [(1 − G_0(s)F(s)K(s, θ)) / (1 + Δ_m(s)G_0(s)F(s)K(s, θ))] [d(t)].   (6.26)

It follows from (6.26) that the effect of the periodic components of d(t) on y(t) is completely rejected if S(s, θ) has zeros on the imaginary axis at the disturbance frequencies; in other words, if S(s, θ) has the internal model of the sinusoidal components, D_s(s), as a factor. In addition, the filters K(s, θ) and F(s) should be chosen such that S(s, θ) remains stable for any admissible Δ_m(s). Assuming that we have perfect knowledge of the frequencies in d_s(t), we show that with the control architecture of Fig. 6.3 and the


filter (6.24), the control objective can be met. In particular, we discuss how the design parameters may affect robust stability and performance.

Theorem 6.3 Consider the closed-loop system shown in Figs. 6.1 and 6.3 with disturbance model (6.2) and filter K(s, θ) of the form (6.24). Let ω_1, …, ω_{n_f} be the distinct frequencies of the sinusoidal disturbance d_s(t) and D_s(s) in (6.25) be the internal model of d_s(t). Then, there exists a θ* such that with K(s, θ*), the control law of Fig. 6.3 completely rejects the periodic components of the disturbances if and only if G_0(s_i)F(s_i) ≠ 0, s_i = jω_i, for i = 1, 2, …, n_f, i.e., G_0(s)F(s) has no zero at the roots of D_s(s), and N ≥ 2n_f, provided the stability condition

‖Δ_m(s)G_0(s)F(s)K(s, θ*)‖_∞ < 1   (6.27)

is satisfied. The choice of θ* is unique if N = 2n_f. In addition, all signals in the closed-loop system are guaranteed to be uniformly bounded and the plant output satisfies

lim_{t→∞} sup_{τ≥t} |y(τ)| ≤ ‖S(s, θ*)‖_1 v_0 ≤ c (1 + ‖G_0(s)F(s)K(s, θ*)‖_1) v_0,   (6.28)

where v_0 = sup_τ |v(τ)| and c = ‖1/(1 + Δ_m(s)G_0(s)F(s)K(s, θ*))‖_1 is a finite positive constant. Moreover, in the absence of unmodeled noise (i.e., if v(t) = 0), y(t) converges to zero exponentially fast.

Proof The proof is given in [60].



Theorem 6.3 gives conditions under which we can completely reject the sinusoidal components of the disturbance when the frequencies in d_s(t) are known. It also shows that if the plant uncertainty satisfies a norm-bound condition, the output is of the order of the broadband random noise level at steady state. In the presence of noise v(t), however, a large magnitude of the sensitivity function S(s, θ*) may lead to noise amplification, especially at high frequencies. The situation may be worse if the plant has a very small gain in the frequency range of d_s(t) and a larger gain at high frequencies. In such cases, the design of a pre-compensator F(s) to shape the frequency response of the plant is a possible remedy to achieve a good compromise between performance and robustness. It should be noted that with N = 2n_f, there exists a unique θ* for which complete rejection of the sinusoidal terms of the disturbance is possible. Such a unique parameter vector, however, does not provide any flexibility to improve performance and/or robust stability margins. For N > 2n_f, on the other hand, there exist infinitely many vectors θ* that guarantee the results of Theorem 6.3. In such cases, one may choose a θ* that, in addition to rejecting the sinusoidal disturbance terms, minimizes the magnitude of G_0(s)F(s)K(s, θ*) and therefore limits the possible amplification of the output noise. The existence of a minimizer is the subject of the following lemma.


S. Jafari and P. Ioannou

Lemma 6.3 Consider the closed-loop system shown in Figs. 6.1 and 6.3 with disturbance model (6.2) and filter K(s, θ) of the form (6.24). If the conditions in Theorem 6.3 are satisfied, then there exists a θ* with ‖θ*‖ ≤ r_0, for some r_0 > 0, that solves the following constrained convex optimization problem:

θ* = arg min_{θ∈Ω} ‖G_0(s)F(s)K(s, θ)‖_∞   (6.29)

where

Ω = {θ : ‖θ‖ ≤ r_0, G_0(s_i)F(s_i)K(s_i, θ) = 1, s_i = ±jω_i, i = 1, ..., n_f}.   (6.30)

Proof The proof is given in [60]. □
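In the minimal case N = 2n_f, the set Ω is a singleton, and θ* follows directly from the linear equations G_0(s_i)F(s_i)K(s_i, θ) = 1 (cf. Remark 6.8). A sketch for an assumed plant G_0(s) = 1/(s + 1), F(s) = 1, a single disturbance frequency ω_1 = 1 rad/s, and λ = 1:

```python
import numpy as np

lam, w1 = 1.0, 1.0                       # filter pole and disturbance frequency (assumed)
G0 = lambda s: 1.0 / (s + 1.0)           # assumed nominal plant; F(s) = 1

# Basis of K(s, theta) = theta[0]*lam^2/(s+lam)^2 + theta[1]*lam/(s+lam), i.e. N = 2
Lam = lambda s: np.array([lam**2 / (s + lam)**2, lam / (s + lam)])

# Constraint G0(j*w1) K(j*w1, theta) = 1 gives two real equations in theta
h = G0(1j * w1) * Lam(1j * w1)           # complex coefficient vector
A = np.array([h.real, h.imag])           # 2 x 2 real system
theta = np.linalg.solve(A, np.array([1.0, 0.0]))

# The sensitivity 1 - G0*F*K then vanishes at s = j*w1
S = 1.0 - h @ theta
print(theta, abs(S))
```

Because θ is real, the conjugate constraint at s = −jω_1 is satisfied automatically.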



Remark 6.7 When N = 2n_f, the set Ω in (6.30) is a singleton; hence, the cost in (6.29) is a fixed constant and cannot be reduced.

Remark 6.8 As shown in [60], the constraint G_0(s_i)F(s_i)K(s_i, θ) = 1 is a polynomial equation which can be expressed as a Sylvester-type matrix equation, i.e., the constraint is equivalent to a system of linear algebraic equations.

From (6.27), the LTI filter F(s) can help to improve the stability robustness with respect to modeling errors. In the adaptive case, in addition to the stability margin, F(s) affects the level of excitation of the regressor at the disturbance frequencies. Let us further explain the objective of introducing this filter. Since the plant unmodeled dynamics are often dominant at high frequencies and the control bandwidth for disturbance rejection is in the low-frequency range, we choose F(s) to make the magnitude of G_0(s)F(s) sufficiently large over the control bandwidth and sufficiently small at high frequencies in order to limit the excitation of the high-frequency unmodeled dynamics. A simple procedure for the design of F(s) for open-loop plants with low gain over the control bandwidth is given below.

1. Let G̃_0(s) = G_0(s).
2. If G̃_0(s) has a zero on the jω-axis at s = ±jω_0, change it to s = −δ_0 ± jω_0, where δ_0 > 0 is some small positive constant.
3. If G̃_0(s) has an unstable zero at s = σ_0 ± jω_0 with σ_0 > 0, change it to s = −σ_0 ± jω_0, i.e., reflect it across the jω-axis and make it stable.
4. Let

   F(s) = κ_0 (α_0^m/(s + α_0)^m) G̃_0^{−1}(s),   (6.31)

   where m > 0 is an integer greater than the relative degree of G_0(s) and κ_0, α_0 > 0 are design constants that are chosen such that the low-pass filter κ_0 α_0^m/(s + α_0)^m has a large enough gain over the control bandwidth.

The example below clarifies the above procedure.


Fig. 6.6 The magnitude plot of (a) G_0(s) and (b) G_0(s)F(s). The plant nominal model G_0(s) has zeros on the jω-axis at ω = 0 and 2 rad/s. The filter F(s) flattens the magnitude plot of G_0(s) over the frequency range of interest and improves disturbance rejection

Example 6.2 Consider the following nominal plant model, which has three zeros on the jω-axis:

G_0(s) = s(s² + 4)(s − 0.8)(s + 1.4) / ((s + 0.5)³(s + 2)²(s + 3)),

and assume that the disturbance frequencies are in the range [0, 300] rad/s. Following the above procedure, the filter F(s) can be chosen as

F(s) = κ_0 α²(s + 0.5)³(s + 2)²(s + 3) / ((s + α)²(s + δ_0)((s + δ_0)² + 4)(s + 0.8)(s + 1.4)),

where κ_0 = 1, δ_0 = 0.01, and α = 300. Figure 6.6 shows the magnitude plots of G_0(s) and G_0(s)F(s). The compensated plant G_0(s)F(s) has unity gain over most of the control bandwidth except at and around ω = 0 and 2 rad/s, as G_0(s) has zeros on the jω-axis at these two frequencies. It should be noted that, according to Theorem 6.3, rejection of disturbances at ω = 0 and 2 rad/s is impossible for this plant model, as the plant has zero gain at these two frequencies.
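The construction in Example 6.2 can be checked by evaluating the rational functions along the jω-axis; the probe frequencies below are an arbitrary choice:

```python
import numpy as np

# Numerical check of Example 6.2: kappa0 = 1, delta0 = 0.01, alpha = 300 as in the text.
k0, d0, a = 1.0, 0.01, 300.0

def G0(s):
    return s * (s**2 + 4) * (s - 0.8) * (s + 1.4) / ((s + 0.5)**3 * (s + 2)**2 * (s + 3))

def F(s):
    num = k0 * a**2 * (s + 0.5)**3 * (s + 2)**2 * (s + 3)
    den = (s + a)**2 * (s + d0) * ((s + d0)**2 + 4) * (s + 0.8) * (s + 1.4)
    return num / den

for w in (1.0, 2.0, 10.0, 100.0):
    # near unity over the bandwidth; a notch remains at the j-axis zero w = 2
    print(w, abs(G0(1j * w) * F(1j * w)))
print(1e5, abs(G0(1j * 1e5) * F(1j * 1e5)))  # rolls off beyond alpha = 300 rad/s
```

The notch at ω = 2 rad/s is exactly the residual effect of the jω-axis zero that, per Theorem 6.3, no filter can remove.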

6.3.2.2 Adaptive Case: Unknown Disturbance

In order to reject unknown disturbances, we design and analyze an adaptive version of filter (6.24) based on the certainty equivalence principle [58]. The adaptive filter in Fig. 6.3 is given by

K(s, θ̂(t)) = Σ_{k=0}^{N−1} θ̂_k(t) λ^{N−k}/(s + λ)^{N−k} = θ̂(t)ᵀΛ(s),
Λ(s) = [λ^N/(s + λ)^N, ..., λ/(s + λ)]ᵀ,   (6.32)


where θ̂(t) = [θ̂_0(t), θ̂_1(t), ..., θ̂_{N−1}(t)]ᵀ ∈ R^N and λ > 0 is a design constant. Then, the control law can be expressed as

u(t) = −F(s)[K(s, θ̂(t))[ζ(t)]],  ζ(t) = y(t) − G_0(s)[u(t)],   (6.33)

where θ̂(t) is the estimate of θ* at time t.
In order to design a parameter estimator to generate θ̂(t) at each time t, we express θ* in an appropriate parametric form. The following lemma presents a parametric model for the closed-loop plant that is used for online parameter estimation.

Lemma 6.4 The closed-loop system (6.1), (6.33) is parameterized as

ζ(t) = φ(t)ᵀθ* + η(t),   (6.34)

where θ* = [θ*_0, θ*_1, ..., θ*_{N−1}]ᵀ ∈ R^N is the unknown desired parameter vector to be identified, ζ(t) = y(t) − G_0(s)[u(t)] is a measurable scalar signal, and the regressor vector is given by

φ(t) = G_0(s)F(s)Λ(s)[ζ(t)],   (6.35)

and the unknown error term η(t) depends on the unmodeled noise v(t) and the plant unmodeled dynamics Δ_m(s) and is given by

η(t) = (1 − G_0(s)F(s)K(s, θ*))[v(t) + Δ_m(s)G_0(s)F(s)[u(t)]] + ε_s(t),   (6.36)

where ε_s(t) = (1 − G_0(s)F(s)K(s, θ*))[d_s(t)] decays to zero exponentially.

Proof The proof is given in [60]. □



Remark 6.9 It follows from Lemma 6.4 that in the absence of noise and modeling error (i.e., if Δ_m(s) = 0 and v(t) = 0), the error term η(t) is just an exponentially decaying signal.

Remark 6.10 The definition of the regressor (6.35) implies that the stable compensator F(s) provides the flexibility to manipulate the regressor excitation level by shaping the spectrum of the open-loop plant G_0(s).

Now, using the parametric model (6.34) and employing the parameter estimation techniques discussed in [58], we design a robust adaptive law to estimate the unknown parameter vector θ* as follows. The predicted value of the signal ζ(t) based on θ̂(t) is generated as

ζ̂(t) = φ(t)ᵀθ̂(t).   (6.37)

The normalized estimation error is defined as

ε(t) = (ζ(t) − ζ̂(t))/m²(t),   (6.38)

where m²(t) = 1 + γ_0 φ(t)ᵀφ(t), with γ_0 > 0, is a normalizing signal. To generate θ̂(t), we consider the robust pure least-squares algorithm [58]:

Ṗ(t) = −P(t) (φ(t)φ(t)ᵀ/m²(t)) P(t),
dθ̂(t)/dt = proj(P(t)φ(t)ε(t)),   (6.39)

where P(0) = P(0)ᵀ > 0 and the projection operator proj(·) is used to guarantee that θ̂(t) ∈ S, ∀t, where S is a compact set defined as

S = {θ ∈ R^N | θᵀθ ≤ θ²_max},   (6.40)

where θ_max > 0 is such that the desired parameter vector of the optimum filter, θ*, belongs to S. Since we do not have a priori knowledge of the norm of θ*, the upper bound θ_max must be chosen sufficiently large. The projection of the estimated parameter vector into S may be implemented as [58]:

dθ̂(t)/dt = P(t)φ(t)ε(t),  if θ̂(t) ∈ S_0, or if θ̂(t) ∈ δS and (P(t)φ(t)ε(t))ᵀθ̂(t) ≤ 0;
dθ̂(t)/dt = P(t)φ(t)ε(t) − P(t) (θ̂(t)θ̂(t)ᵀ/(θ̂(t)ᵀP(t)θ̂(t))) P(t)φ(t)ε(t),  otherwise,   (6.41)

where S_0 = {θ ∈ R^N | θᵀθ < θ²_max} and δS = {θ ∈ R^N | θᵀθ = θ²_max} denote the interior and the boundary of S, respectively. The following theorem summarizes the properties of the adaptive control law.

Theorem 6.4 Consider the closed-loop system (6.1), (6.33), (6.37)–(6.41), and choose N > 2n̄_f, where n̄_f is a known upper bound for the number of distinct frequencies in the disturbance. Assume that the plant modeling error satisfies

c_0‖Δ_m(s)G_0(s)F(s)‖_1 < 1,   (6.42)

where c_0 is a finite positive constant that is independent of Δ_m(s) and depends only on known parameters. Then, all signals in the closed-loop system are uniformly bounded and the plant output satisfies

lim sup_{T→∞} (1/T) ∫_t^{t+T} |y(τ)|² dτ ≤ c(μ²_Δ + v_0²),   (6.43)


for any t ≥ 0 and some finite positive constant c which is independent of t, T, Δ_m(s), and v(t), where μ_Δ is a constant proportional to the size of the plant unmodeled dynamics Δ_m(s), and v_0 = sup_τ |v(τ)|. In addition, in the absence of modeling error and noise (i.e., if Δ_m(s) = 0 and v(t) = 0), the adaptive control law guarantees the convergence of y(t) to zero. □

Proof The proof is given in [60].

Theorem 6.4 states that, with the proposed adaptive control law, if the modeling uncertainty satisfies the norm-bound condition (6.42), then the energy of the plant output y(t) is of the order of the broadband noise level and the size of the plant unmodeled dynamics.
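A self-contained numerical sketch of the adaptive scheme (6.33), (6.37)-(6.39) follows, using a forward-Euler discretization. The plant G_0(s) = 1/(s + 1), F(s) = 1, the disturbance d(t) = sin t, and all gains are illustrative assumptions; projection is omitted since the estimate stays bounded in this run:

```python
import numpy as np

dt, T, lam = 1e-3, 300.0, 1.0
n = int(T / dt)

xp = 0.0               # state of G0(s) = 1/(s+1) driven by u
xa = xb = 0.0          # Lambda(s)[zeta]: xa = lam/(s+lam)[zeta], xb = lam/(s+lam)[xa]
xfa = xfb = 0.0        # phi = G0(s)F(s)[Lambda(s)[zeta]], with F = 1
theta = np.zeros(2)    # theta_hat(t), N = 2
P = 100.0 * np.eye(2)  # P(0)
ys = np.empty(n)

for k in range(n):
    d = np.sin(k * dt)                       # single unknown sinusoid
    u = -(theta[0] * xb + theta[1] * xa)     # u = -F K(s, theta_hat)[zeta]
    y = xp + d                               # y = G0[u] + d
    zeta = y - xp                            # zeta = y - G0[u]
    phi = np.array([xfb, xfa])
    m2 = 1.0 + phi @ phi                     # normalizing signal, gamma0 = 1
    eps = (zeta - phi @ theta) / m2          # normalized estimation error (6.38)
    theta = theta + dt * (P @ phi) * eps     # least-squares law (6.39), no projection
    P = P - dt * (P @ np.outer(phi, phi) @ P) / m2
    xp += dt * (-xp + u)                     # forward-Euler state updates
    xa_new = xa + dt * lam * (zeta - xa)
    xb += dt * lam * (xa - xb)
    xa = xa_new
    xfa += dt * (-xfa + xa)
    xfb += dt * (-xfb + xb)
    ys[k] = y

early = np.abs(ys[: int(10 / dt)]).max()
late = np.abs(ys[-int(10 / dt):]).max()
print(early, late)   # the sinusoid is attenuated as theta_hat adapts
```

For these assumed values, θ̂ settles near [−4, 2], which solves G_0(j)K(j, θ) = 1, so the output sinusoid is asymptotically cancelled.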

6.4 Known Stable Plants: MIMO Systems

In this section, the results of Sect. 6.3 are generalized to MIMO systems. We consider the plant model shown in Fig. 6.1 and the control structure in Fig. 6.3 and propose a robust adaptive scheme for rejection of unknown periodic components of the disturbance acting on the output channels of MIMO systems. It is assumed that the nominal plant model, G_0(q), is known and stable, possibly with unstable zeros. The stability and performance properties of the closed-loop system are analyzed for both discrete-time and continuous-time MIMO systems. We first consider the ideal (non-adaptive) scenario, in which complete information about the characteristics of the disturbance is available. Subsequently, we design a robust adaptive control scheme and analytically show that the proposed control law guarantees performance and stability robustness with respect to unmodeled dynamics and disturbance. It should be noted that for MIMO systems, for each output channel, the additive output disturbance is assumed to be of the form (6.2); that is, for the j-th channel, we have

d_j(t) = d_{sj}(t) + v_j(t) = Σ_{i=1}^{n_{fj}} a_{ij} sin(ω_{ij} t + φ_{ij}) + v_j(t).   (6.44)

That is, in MIMO systems, the disturbances applied to different output channels may have completely different characteristics.

6.4.1 Discrete-Time Systems

Let G_0(z) be the nominal plant transfer function with n_u inputs and n_y outputs and assume that the unknown multiplicative modeling uncertainty term Δ_m(z) is such that Δ_m(z)G_0(z) is proper with stable poles.

6.4.1.1 Non-adaptive Case: Known Disturbance Frequencies

Let the filter K in Fig. 6.3 be an n_u × n_y matrix whose elements are FIR filters of order N of the form

K(z, θ) = [K_{ij}(z, θ_{ij})]_{n_u × n_y},  K_{ij}(z, θ_{ij}) = θ_{ij}ᵀα(z),  α(z) ≜ [z^{−N}, z^{1−N}, ..., z^{−1}]ᵀ,   (6.45)

where θ_{ij} ∈ R^N is the parameter vector of the ij-th element of the filter and θ = [θ_{11}ᵀ, θ_{12}ᵀ, ..., θ_{n_u n_y}ᵀ]ᵀ ∈ R^{N n_u n_y} is the concatenation of the θ_{ij}'s.
Considering the filter structure in (6.45), we examine the conditions under which there exists a desired parameter vector θ* for which the periodic terms of the disturbance can be completely rejected without amplifying the output noise. In addition, we examine how the stable LTI filter F(z) in Fig. 6.3 can be chosen to further improve robustness and performance.
If, for each output channel, the frequencies of the additive disturbance are known, then we have the list of all distinct frequencies in the disturbance vector d(t). We let n_f denote the total number of distinct frequencies in d(t) and denote the distinct frequencies by ω_1, ..., ω_{n_f}; since some output channels may have disturbance terms at the same frequencies, n_f ≤ Σ_{j=1}^{n_y} n_{fj}. Then, the internal model of the sinusoidal disturbances is given by

D_s(z) = Π_{i=1}^{n_f} (z² − 2cos(ω_i)z + 1).   (6.46)

From Figs. 6.1 and 6.3, the sensitivity transfer function from d(t) to y(t) is given by

y(t) = S(z, θ)[d(t)] = (I − G_0(z)F(z)K(z, θ))(I + Δ_m(z)G_0(z)F(z)K(z, θ))^{−1}[d(t)].   (6.47)

The existence of a disturbance-rejecting parameter vector θ* is the subject of the following theorem.

Theorem 6.5 Consider the closed-loop system shown in Figs. 6.1 and 6.3 with disturbance model (6.2) and filter K(z, θ) of the form (6.45). Let ω_1, ..., ω_{n_f} be the distinct frequencies of the sinusoidal disturbance terms. Then, there exists a θ* such that with K(z, θ*), the control law of Fig. 6.3 completely rejects the periodic components of the disturbances if and only if n_y ≤ n_u, rank(G_0(z_i)F(z_i)) = n_y, z_i = exp(±jω_i), for i = 1, 2, ..., n_f, and N ≥ 2n_f, provided the stability condition

‖Δ_m(z)G_0(z)F(z)K(z, θ*)‖_∞ < 1   (6.48)

is satisfied. The choice of θ* is unique if N = 2n_f. In addition, all signals in the closed-loop system are guaranteed to be uniformly bounded and the plant output satisfies

lim_{t→∞} sup_{τ≥t} ‖y(τ)‖_∞ ≤ ‖S(z, θ*)‖_1 v_0 ≤ c(1 + ‖G_0(z)F(z)K(z, θ*)‖_1) v_0,   (6.49)

where v_0 = sup_τ |v(τ)| and c = ‖(I + Δ_m(z)G_0(z)F(z)K(z, θ*))^{−1}‖_1 is a finite positive constant. Moreover, in the absence of unmodeled noise (i.e., if v(t) = 0), y(t) converges to zero exponentially fast.

Proof The proof is given in [2]. □



For plants with fewer inputs than outputs, for degenerate plants (rank(G_0(z)F(z)) < min{n_y, n_u}, ∀z), and for plants with transmission zeros at the frequencies of the disturbance, complete rejection of disturbances in all directions is impossible. It should be noted that Theorem 6.5 provides necessary and sufficient conditions for rejection of periodic disturbance signals in any direction; however, for a given disturbance vector, these conditions are only sufficient, since for a specific direction we just need S(z, θ) to have zero gain in the direction of the disturbance vector, not necessarily in all directions. In a MIMO system, a signal vector u with frequency ω_0 can pass through G_0(z) even if G_0(z) has a transmission zero at this frequency; this is the case when the input vector u is not in the zero-gain directions of G_0. The following two examples show that the conditions given in Theorem 6.5 are not necessary for a specific direction of the disturbance vector.

Example 6.3 (A system with zero gain at a single frequency) Consider the following plant model (the sampling period is 0.001 s), and assume F(z) = I and Δ_m(z) = 0:

G_0(z) = (1/(z² − 0.5z + 0.5)) [ z   2cos(0.1)z − 1 ;  1   z ],

which has a pair of zeros on the unit circle at ω_0 = 0.1 rad/sample (100 rad/s). From Theorem 6.5, complete rejection of periodic disturbances with frequency ω_0 in all directions is impossible for this plant, because rank(G_0(exp(±jω_0))) = 1 < 2 and the system has zero gain in some direction. However, for some disturbance vectors, say d_s(t) = [sin(0.1t + 0.1), sin(0.1t)]ᵀ, we can find a filter of the form (6.45) that completely rejects d_s(t), even though the conditions in Theorem 6.5 are not satisfied. For example, for

K(z, θ) = (1/z) [ 0.2469   −0.4994 ;  0.9906   0.2469 ],

the sensitivity transfer function matrix S(z, θ) kills the effects of d_s(t) on y(t), despite the fact that the conditions of Theorem 6.5 are not satisfied (rank(G_0(exp(±0.1j))) = 1 < 2 and N = 1 < 2).
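The rejection claim in Example 6.3 can be verified by evaluating S(z, θ) = I − G_0(z)K(z, θ) (with F(z) = I, Δ_m(z) = 0) at z_0 = e^{j0.1} against the phasor of d_s(t); this is a frequency-domain check only, written here as a sketch:

```python
import numpy as np

z0 = np.exp(1j * 0.1)                  # disturbance frequency: 0.1 rad/sample

def G0(z):
    return np.array([[z, 2 * np.cos(0.1) * z - 1],
                     [1, z]]) / (z**2 - 0.5 * z + 0.5)

def K(z):
    return np.array([[0.2469, -0.4994],
                     [0.9906,  0.2469]]) / z

# Phasor of ds(t) = [sin(0.1 t + 0.1), sin(0.1 t)]^T
d = np.array([np.exp(1j * 0.1), 1.0])

S = np.eye(2) - G0(z0) @ K(z0)         # sensitivity with F = I, Delta_m = 0
print(np.linalg.norm(S @ d))           # ~0 (limited by the 4-digit filter gains)
print(np.linalg.svd(G0(z0), compute_uv=False))  # one singular value ~0: rank 1
```

The residual is of the order of the rounding of the tabulated filter coefficients, confirming that the disturbance direction is rejected even though G_0(z_0) is rank deficient.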


Example 6.4 (A degenerate system) Consider the following plant model (the sampling period is 0.001 s), and assume F(z) = I and Δ_m(z) = 0:

G_0(z) = (1/z) [ 1   1 ;  1   1 ].

The largest possible rank of G_0(z) is one, the minimum singular value of the system is zero at all frequencies, and the input direction corresponding to the zero of the system is [1, −1]ᵀ. Therefore, for this system we cannot reject periodic disturbances in every direction. However, for some disturbances, e.g., d_s(t) = [sin(0.1t), sin(0.1t)]ᵀ, there exists a filter of the form (6.45) for which d_s(t) is completely rejected. For example, for

K(z, θ) = ((0.74z − 0.4975)/z²) [ 1   1 ;  1   1 ],

the sensitivity transfer function matrix S(z, θ) has zero gain at ω_0 = 0.1 rad/sample (100 rad/s) in the direction of the disturbance vector d_s(t), and therefore the disturbance is rejected despite the fact that one of the conditions of Theorem 6.5 is not satisfied (rank(G_0(exp(±0.1j))) = 1 < 2).

A disturbance-rejecting parameter vector θ* in the ideal case (i.e., when v(t) = 0 and Δ_m(z) = 0) leads to perfect performance and guarantees that the plant output converges to zero exponentially fast. However, in the presence of noise and modeling error, a large gain of G_0(z)F(z)K(z, θ*) may drastically amplify the noise part of the disturbance. Moreover, it may significantly reduce the stability margin or lead to instability. The situation may get worse if the disturbance has some modes z_i = exp(±jω_i) near the transmission zeros of G_0(z)F(z). In such cases, G_0(z)F(z) is close to becoming rank deficient at the frequencies ω_i of the disturbance. Indeed, in order to cancel the periodic disturbance terms, the controller has to generate periodic terms of the same frequencies which, after going through G_0(z)F(z), result in periodic terms identical to those of the disturbance but of opposite sign.
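The direction argument of Example 6.4 can be made concrete with an SVD: the plant's zero-gain input direction is proportional to [1, −1]ᵀ at every frequency, while the disturbance lies along [1, 1]ᵀ, which the plant does pass. A numerical sketch (F(z) = I, Δ_m(z) = 0):

```python
import numpy as np

z0 = np.exp(1j * 0.1)

G0 = lambda z: np.array([[1.0, 1.0], [1.0, 1.0]]) / z
K  = lambda z: (0.74 * z - 0.4975) / z**2 * np.array([[1.0, 1.0], [1.0, 1.0]])

# Singular values are [2, 0] at every frequency: a degenerate plant.
print(np.linalg.svd(G0(z0), compute_uv=False))

# The disturbance direction [1, 1]^T is orthogonal to the zero-gain
# direction [1, -1]^T, so rejection in that single direction is possible:
d = np.array([1.0, 1.0])
S = np.eye(2) - G0(z0) @ K(z0)      # sensitivity with F = I, Delta_m = 0
print(np.linalg.norm(S @ d))        # ~0
```

As in Example 6.3, the residual reflects only the rounding of the tabulated filter coefficients.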
Clearly, when the plant has a very low gain at the frequencies of the disturbance, the filter gain must be large enough to make the gain of G_0(z)F(z)K(z, θ) in the direction of the disturbance vector close to one. This, however, may increase ‖G_0(z)F(z)K(z, θ)‖_∞, leading to noise amplification.

Lemma 6.5 Consider the closed-loop system shown in Figs. 6.1 and 6.3 with disturbance model (6.2) and filter K(z, θ) of the form (6.45). If the conditions in Theorem 6.5 are satisfied, then there exists a θ* with ‖θ*‖ ≤ r_0, for some r_0 > 0, that solves the following constrained convex optimization problem:

θ* = arg min_{θ∈Ω} ‖G_0(z)F(z)K(z, θ)‖_∞   (6.50)

where

Ω = {θ : ‖θ‖ ≤ r_0, G_0(z_i)F(z_i)K(z_i, θ) = I, z_i = exp(±jω_i), i = 1, ..., n_f}.   (6.51)

Proof The proof is given in [2]. □



Remark 6.11 When N = 2n_f, the set Ω in (6.51) is a singleton; hence, the cost in (6.50) is a fixed constant and cannot be reduced.

Remark 6.12 As shown in [2], the constraint G_0(z_i)F(z_i)K(z_i, θ) = I is a set of polynomial equations which can be expressed as a Sylvester-type matrix equation, i.e., the constraint is equivalent to a system of linear algebraic equations.

To design a filter with a satisfactory performance, one may use either or both of the following two remedies:

(i) Increasing the order of the filter: Increasing the value of N provides more flexibility in selecting the best parameter vector θ* for which the periodic components of the disturbance are rejected and the H∞-norm of the sensitivity function of the output with respect to the random noise in the disturbance is minimized.

(ii) Pre-filtering modification (design of the stable filter F(z)): If the plant has a small gain at the frequencies of the disturbance and comes close to losing rank at these frequencies, we may have very poor performance because of the possibly large value of the sensitivity transfer function S(z, θ) at other frequencies, which may lead to noise amplification. Moreover, in the presence of high-frequency unmodeled dynamics, if the plant has a relatively large gain at high frequencies, with a controller of the form (6.45) the closed-loop system may have a small stability margin or may become unstable. Since the plant model G_0(z) is known, by proper shaping of the singular values of G_0(z) both performance and robust stability may be improved. That is, a pre-compensator F(z) is designed such that G_0(z)F(z) has a large enough gain in all directions over the frequency range of the disturbances (if possible), and has a sufficiently small gain at high frequencies where the modeling error is often dominant. We discuss this modification and the trade-offs between performance improvement and robustness with respect to plant modeling uncertainties.
The design of the filter F(z) requires the following a priori information: (i) the frequency range where the modeled part of the plant G_0(z) has high enough accuracy (which is typically at low frequencies); (ii) an upper bound for the maximum singular value of the unmodeled dynamics Δ_m(z); (iii) the expected frequency range of the dominant part of the disturbance d_s(t). It should be noted that if the disturbance contains periodic terms with frequencies in the high range where the unmodeled dynamics are dominant, their rejection may excite the unmodeled dynamics and adversely affect stability. In practice, however, most high-frequency periodic disturbances have a small amplitude, and it may be better to ignore them and consider them part of the unmodeled disturbance v(t) rather than try to attenuate them. This is one of the trade-offs of performance versus robustness that is well known in robust control and robust adaptive control [58].


In classical robust control design for MIMO systems, prior to the design of a controller we may need to shape the singular values of the nominal plant in order to be able to meet the desired specifications. The desired shape of the open-loop plant is typically as follows: large enough gain at low frequencies in all directions, low enough roll-off rate at the desired bandwidth (about 20 dB/decade) with a higher rate beyond it, and very small gain at high frequencies [63, 66, 67]. It is also desired that the maximum and minimum gains of the shaped plant be almost the same, i.e., that the singular values be aligned so that the plant has almost the same gain in all directions at each frequency [63, 68]. For an ill-conditioned system (a system with a large condition number), however, aligning the singular values is not recommended, as it may lead to poor performance and robustness [69]. Several algorithms and procedures have been proposed for singular value shaping, which mainly require some trial and error [66, 70]. In [71], a more systematic algorithm has been proposed which guarantees that the shaped loop and the singular values and condition number of the shaping weights lie in a pre-specified region. System decomposition techniques such as inner-outer factorization can also be used to design a frequency weighting matrix for well-conditioned square plants. The inner-outer factorization is used in solving problems related to optimal, robust, and H∞ control design, and several algorithms have been proposed to calculate an inner-outer factorization of a proper rational matrix [72-74]. Consider a square plant matrix G_0(z) with stable poles and without any transmission zero on the unit circle. By using the algorithm proposed in [75], inner-outer factors of G_0(z) can be computed. If the plant has some zeros on the unit circle, one can perturb the zeros and move them slightly away from the unit circle and then apply the decomposition procedure.
Let G_out(z) be the outer factor of G_0(z) (or of the perturbed version of G_0(z)); then G_0(z)G_out^{−1}(z) is a proper stable all-pass (or almost all-pass) matrix. Let F(z) = κ_0 f(z)G_out^{−1}(z), where f(z) is a scalar low-pass filter with a dc-gain of one and the desired bandwidth and roll-off rate at high frequencies, and κ_0 > 0 is a design constant. Then, G_0(z)F(z) has a gain of κ_0 over the desired low-frequency range and a small gain at high frequencies, with aligned singular values.

6.4.1.2 Adaptive Case: Unknown Disturbance

In order to counteract the effect of unknown disturbances on the plant output, an adaptive filter is required to adjust its parameters in the direction of minimizing the norm of the plant output. The structure of the closed-loop system with an adaptive filter is shown in Fig. 6.3, where the adaptive filter parameter vector is calculated at each time. The adaptive version of (6.45) is given by


K(z, θ̂(t)) = [K_{ij}(z, θ̂_{ij}(t))]_{n_u × n_y},  K_{ij}(z, θ̂_{ij}(t)) = θ̂_{ij}(t)ᵀα(z),  α(z) = [z^{−N}, z^{1−N}, ..., z^{−1}]ᵀ,   (6.52)

where θ̂_{ij}(t) ∈ R^N is the parameter vector of the ij-th element of the filter and θ̂(t) = [θ̂_{11}(t)ᵀ, θ̂_{12}(t)ᵀ, ..., θ̂_{n_u n_y}(t)ᵀ]ᵀ ∈ R^{N n_u n_y}. Then, the control law can be expressed as

u(t) = −F(z)[K(z, θ̂(t − 1))[ζ(t)]],  ζ(t) = y(t) − G_0(z)[u(t)],   (6.53)

where θ̂(t − 1) is the most recent estimate of θ* available to generate the control action at time t.
In order to design an online parameter estimator for θ*, we express θ* in an appropriate parametric form. The following lemma presents a parametric model for the closed-loop plant to be used for parameter estimation.

Lemma 6.6 The closed-loop system (6.1), (6.53) is parameterized as

ζ(t) = Φ(t)ᵀθ* + η(t),   (6.54)

where θ* = [θ*_{11}ᵀ, θ*_{12}ᵀ, ..., θ*_{n_u n_y}ᵀ]ᵀ ∈ R^{N n_u n_y} is the unknown desired parameter vector to be identified, ζ(t) = y(t) − G_0(z)[u(t)] ∈ R^{n_y} is a measurable vector signal, and the regressor matrix is given by

Φ(t) = G_0(z)F(z)[W(t)],   (6.55)

where

W(t) = blockdiag(w(t), ..., w(t)) ∈ R^{N n_u n_y × n_u},   (6.56)

with n_u diagonal blocks and w(t) = [α(z)ᵀ[ζ_1(t)], ..., α(z)ᵀ[ζ_{n_y}(t)]]ᵀ ∈ R^{N n_y}, and the unknown error term η(t) depends on the unmodeled noise v(t) and the plant unmodeled dynamics Δ_m(z) and is given by

η(t) = (I − G_0(z)F(z)K(z, θ*))[v(t) + Δ_m(z)G_0(z)F(z)[u(t)]] + ε_s(t),   (6.57)

where ε_s(t) = (I − G_0(z)F(z)K(z, θ*))[d_s(t)] decays to zero exponentially.

Proof The proof is given in [2]. □




Remark 6.13 It follows from Lemma 6.6 that in the absence of noise and modeling error (i.e., if Δ_m(z) = 0 and v(t) = 0), the error term η(t) is just an exponentially decaying signal.

Remark 6.14 The definition of the regressor (6.55) implies that the stable compensator F(z) provides the flexibility to manipulate the regressor excitation level by shaping the spectrum of the open-loop plant G_0(z). The matrix W(t) in (6.55) contains the frequency components of the disturbance, and the regressor Φ(t) is obtained by passing W(t) through G_0(z)F(z). If G_0(z) has very low gain at some frequencies of the disturbance, it may severely attenuate those frequency components, so the regressor will carry almost no information on those frequencies, with the consequence of making their identification, and therefore their rejection, difficult, if at all possible in some cases. Introducing the stable filter F(z) to shape the frequency response of G_0(z) helps to improve parameter estimation.

After shaping the singular values of the nominal plant G_0(z), we design a parameter estimator for the adaptive filter. Based on the derived parametric model (6.54), a robust parameter estimator can be developed using the techniques discussed in [62] to guarantee stability and robustness independent of the excitation properties of the regressor Φ(t) in the presence of a non-zero error term η(t). It should be noted that the number of distinct frequencies, n_f, of the disturbance vector is unknown and only an upper bound, n̄_f, for it is assumed to be known; moreover, as discussed earlier, even if n_f is known, we often choose N > 2n_f to achieve better performance. This choice of N leads to an over-parameterization of the controller and gives a regressor Φ(t) which cannot be persistently exciting. The lack of persistence of excitation makes the adaptive law susceptible to parameter drift [58] and possible instability.
Therefore, the adaptive law for estimating θ* must be robust; otherwise, the presence of the nonzero error term η(t) may cause parameter drift and lead to instability. In the absence of a persistently exciting regressor, several modifications have been proposed in the literature to avoid parameter drift in the adaptive law [58]. In the present paper, we use parameter projection to directly restrict the estimate of the unknown parameter vector from drifting to infinity.
Let θ̂(t − 1) be the most recent estimate of θ*; then the predicted value of the signal ζ(t) based on θ̂(t − 1) is generated as

ζ̂(t) = Φ(t)ᵀθ̂(t − 1).   (6.58)

The normalized estimation error vector is defined as

ε(t) = (ζ(t) − ζ̂(t))/m²(t),   (6.59)

where m²(t) = 1 + γ_0 trace(Φ(t)ᵀΦ(t)), with γ_0 > 0, is a normalizing signal. To generate θ̂(t), we consider the robust pure least-squares algorithm [62]:


P^{−1}(t) = P^{−1}(t − 1) + Φ(t)Φ(t)ᵀ/m²(t),
θ̂(t) = proj(θ̂(t − 1) + P(t)Φ(t)ε(t)),   (6.60)

where P^{−1}(0) = (P^{−1}(0))ᵀ > 0 and the projection operator proj(·) is used to guarantee that θ̂(t) ∈ S, ∀t, where S is a compact set defined as

S = {θ ∈ R^{N n_u n_y} | θᵀθ ≤ θ²_max},   (6.61)

where θ_max > 0 is such that the desired parameter vector of the optimum filter, θ*, belongs to S. Since we do not have a priori knowledge of the norm of θ*, the upper bound θ_max must be chosen sufficiently large. To keep P^{−1}(t) from growing without bound, the covariance resetting modification [58, 62] can be used, i.e., set P(t_r) = β_0 I, where t_r is the sample time at which λ_min(P) ≤ β_1, and β_0 > β_1 > 0 are design constants. One may also use the modified least-squares algorithm with a forgetting factor [58]. The projection of the estimated parameter vector into S may be implemented as [62, 65]:

χ(t) = θ̂(t − 1) + P(t)Φ(t)ε(t),
ρ̄(t) = P^{−1/2}(t)χ(t),
ρ(t) = ρ̄(t) if ρ̄(t) ∈ S̄;  ρ(t) = the orthogonal projection of ρ̄(t) on S̄ if ρ̄(t) ∉ S̄,
θ̂(t) = P^{1/2}(t)ρ(t),   (6.62)

where P^{−1} is decomposed into (P^{−1/2})ᵀ(P^{−1/2}) at each time step t to ensure all the properties of the corresponding recursive least-squares algorithm without projection. In (6.62), the set S is transformed into S̄ such that if χ ∈ S, then P^{−1/2}χ ∈ S̄. Since S is chosen to be convex and P^{−1/2}χ is a linear transformation, S̄ is also a convex set [65]. An alternative to projection is a fixed σ-modification, wherein no bounds for ‖θ*‖ are required [58]. The following theorem summarizes the properties of the adaptive control law.

Theorem 6.6 Consider the closed-loop system (6.1), (6.53), (6.58)–(6.62), and choose N > 2n̄_f, where n̄_f is a known upper bound for the number of distinct frequencies in the disturbance vector. Assume that the plant modeling error satisfies

c_0‖Δ_m(z)G_0(z)F(z)‖_1 < 1,   (6.63)

where c_0 is a finite positive constant that is independent of Δ_m(z) and depends only on known parameters. Then, all signals in the closed-loop system are uniformly bounded and the plant output satisfies

lim sup_{T→∞} (1/T) Σ_{τ=t}^{t+T−1} ‖y(τ)‖²_2 ≤ c(μ²_Δ + v_0²),   (6.64)

for any t ≥ 0 and some finite positive constant c which is independent of t, T, Δ_m(z), and v(t), where μ_Δ is a constant proportional to the size of the plant unmodeled dynamics Δ_m(z), and v_0 = sup_τ |v(τ)|. In addition, in the absence of modeling error and noise (i.e., if Δ_m(z) = 0 and v(t) = 0), the adaptive control law guarantees the convergence of y(t) to zero. □

Proof The proof is given in [2].

The stability condition (6.63) indicates that the stability margin with respect to modeling uncertainties can be improved by proper shaping of the singular values of the plant model G 0 (z) by an appropriate choice of filter F(z).
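A minimal discrete-time sketch of the estimator (6.58)-(6.60) on synthetic data: the random regressor stands in for the filtered regressor (6.55), the scalar output simplifies the dimensions, and the norm clamp is a crude stand-in for the proj(·) operation (6.62); all values are illustrative:

```python
import numpy as np

rng = np.random.default_rng(0)
n_theta, gamma0, steps = 4, 0.1, 2000
theta_star = np.array([1.0, -0.5, 0.25, 2.0])  # assumed "true" parameter vector
theta = np.zeros(n_theta)
Pinv = 0.01 * np.eye(n_theta)                  # P^{-1}(0), symmetric positive definite
theta_max = 10.0                               # bound defining the set S in (6.61)

for t in range(steps):
    Phi = rng.standard_normal(n_theta)         # stand-in for the regressor (6.55)
    zeta = Phi @ theta_star                    # parametric model (6.54) with eta = 0
    m2 = 1.0 + gamma0 * (Phi @ Phi)            # normalizing signal
    eps = (zeta - Phi @ theta) / m2            # (6.59), using theta_hat(t-1)
    Pinv = Pinv + np.outer(Phi, Phi) / m2      # covariance update in (6.60)
    theta = theta + np.linalg.solve(Pinv, Phi * eps)
    nrm = np.linalg.norm(theta)
    if nrm > theta_max:                        # crude projection back into S
        theta *= theta_max / nrm

print(np.round(theta, 3))                      # converges to theta_star
```

With zero error term η, the recursion is exactly regularized recursive least squares; in practice, covariance resetting or a forgetting factor would be added as described above.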

6.4.2 Continuous-Time Systems

In this section, we extend the result of Sect. 6.4.1 to continuous-time systems. Following the same procedure, by considering a structure for the filter K in Fig. 6.3, we first examine the solvability of the problem and then design and analyze a robust adaptive control scheme.

6.4.2.1 Non-adaptive Case: Known Disturbance Frequencies

Let the filter K in Fig. 6.3 be an n_u × n_y matrix whose elements are FIR filters of order N of the form

K(s, θ) = [K_ij(s, θ_ij)]_{n_u × n_y},
K_ij(s, θ_ij) = Σ_{k=0}^{N−1} (θ_ij)_k λ^{N−k}/(s + λ)^{N−k} = θ_ijᵀ Λ(s),
Λ(s) = [λ^N/(s + λ)^N, …, λ/(s + λ)]ᵀ,        (6.65)

where θ_ij ∈ R^N is the parameter vector of the ij-th element of the filter, θ = [θ₁₁ᵀ, θ₁₂ᵀ, …, θ_{n_u n_y}ᵀ]ᵀ ∈ R^{N n_u n_y} is the concatenation of the θ_ij's, and λ > 0 is a design parameter. Considering the filter structure in (6.65), we examine the conditions under which there exists a desired parameter vector θ∗ for which the periodic terms of the disturbance can be completely rejected without amplifying the output noise. In addition, we examine how the stable LTI filter F(s) in Fig. 6.3 can be designed to further improve robustness and performance.


S. Jafari and P. Ioannou

Let n_f denote the total number of distinct frequencies in d(t), and let the distinct frequencies be denoted by ω₁, …, ω_{n_f}; since some output channels may have disturbance terms at the same frequencies, n_f ≤ Σ_{j=1}^{n_y} n_{f_j}. The internal model of the sinusoidal disturbances is then given by

Ds(s) = Π_{i=1}^{n_f} (s² + ωᵢ²).        (6.66)
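The coefficients of Ds(s) are obtained by repeated polynomial multiplication; a small numpy sketch of (6.66) (the helper name is ours):

```python
import numpy as np

def internal_model_ct(freqs):
    """Coefficient vector (highest power first) of Ds(s) = Π_i (s² + ωᵢ²), cf. (6.66)."""
    poly = np.array([1.0])
    for w in freqs:
        poly = np.convolve(poly, [1.0, 0.0, w ** 2])  # multiply by (s² + ωᵢ²)
    return poly
```

By construction Ds vanishes at s = ±jωᵢ, which is exactly the internal-model property exploited for disturbance rejection.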

From Figs. 6.1 and 6.3, the sensitivity transfer function from d(t) to y(t) is given by

y(t) = S(s, θ)[d(t)] = (I − G₀(s)F(s)K(s, θ)) (I + Δm(s)G₀(s)F(s)K(s, θ))⁻¹ [d(t)].        (6.67)

The following theorem gives a necessary and sufficient condition under which the periodic components of the disturbance vector can be rejected.

Theorem 6.7 Consider the closed-loop system shown in Figs. 6.1 and 6.3 with disturbance model (6.2) and filter K(s, θ) of the form (6.65). Let ω₁, …, ω_{n_f} be the distinct frequencies of the sinusoidal disturbance terms. Then, there exists a θ∗ such that with K(s, θ∗), the control law of Fig. 6.3 completely rejects the periodic components of the disturbances if and only if n_y ≤ n_u, rank(G₀(sᵢ)F(sᵢ)) = n_y for sᵢ = ±jωᵢ, i = 1, 2, …, n_f, and N ≥ 2n_f, provided the stability condition

‖Δm(s)G₀(s)F(s)K(s, θ∗)‖∞ < 1        (6.68)

is satisfied. The choice of θ∗ is unique if N = 2n_f. In addition, all signals in the closed-loop system are guaranteed to be uniformly bounded and the plant output satisfies

lim_{t→∞} sup_{τ≥t} ‖y(τ)‖∞ ≤ ‖S(s, θ∗)‖₁ v₀ ≤ c (1 + ‖G₀(s)F(s)K(s, θ∗)‖₁) v₀,        (6.69)

where v₀ = sup_τ |v(τ)| and c = ‖(I + Δm(s)G₀(s)F(s)K(s, θ∗))⁻¹‖₁ is a finite positive constant. Moreover, in the absence of unmodeled noise (i.e., if v(t) = 0), y(t) converges to zero exponentially fast.

Proof The proof is given in [59].



Theorem 6.7 implies that minimizing the magnitude of G₀(s)F(s)K(s, θ) improves the stability margin and prevents the unnecessary amplification of v(t). Since for N > 2n_f there is an infinite number of vectors θ which guarantee rejection of the periodic disturbance terms, we can select the θ which also improves the stability margin and prevents broadband output noise amplification to the extent possible. When partial knowledge about the frequency range of the disturbances is available, the filter F(s) should be chosen such that G₀(s)F(s) has large enough gains in all directions (if possible) over the expected disturbance frequency range, and very small gains at high frequencies where the unmodeled dynamics may be dominant. Large enough gains of G₀(s)F(s) at the frequencies of the disturbance increase the excitation level of the regressor at those frequencies and can significantly improve the performance; moreover, small gains of G₀(s)F(s) at high frequencies reduce the level of excitation of the high-frequency unmodeled dynamics. To design F(s), one may use singular-value shaping techniques [66] or system decomposition approaches such as inner–outer factorization [75].

Lemma 6.7 Consider the closed-loop system shown in Figs. 6.1 and 6.3 with disturbance model (6.2) and filter K(s, θ) of the form (6.65). If the conditions in Theorem 6.7 are satisfied, then there exists a θ∗ with ‖θ∗‖ ≤ r₀, for some r₀ > 0, that solves the following constrained convex optimization problem:

θ∗ = arg min_{θ∈Ω} ‖G₀(s)F(s)K(s, θ)‖∞        (6.70)

where

Ω = {θ | ‖θ‖ ≤ r₀, G₀(sᵢ)F(sᵢ)K(sᵢ, θ) = I, sᵢ = ±jωᵢ, i = 1, …, n_f}.        (6.71)

Proof The proof is given in [59].

Remark 6.15 As shown in [59], the constraint G 0 (si )F(si )K (si , θ ) = I is a set of polynomial equations which can be expressed as a Sylvester-type matrix equation, i.e., the constraint is equivalent to a system of algebraic linear equations.
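Remark 6.15 is easy to make concrete in the SISO case (n_u = n_y = 1): each constraint G₀(jωᵢ)F(jωᵢ)K(jωᵢ, θ) = 1 is linear in θ, so stacking real and imaginary parts yields a real linear system Aθ = b. A hedged numpy sketch (the helper name and the minimum-norm solve are ours, not the chapter's method):

```python
import numpy as np

def rejection_constraints(GF_vals, omegas, lam, N):
    """Stack the SISO rejection constraints GF(jw_i) * Lambda(jw_i)^T theta = 1
    of (6.71) into a real linear system A theta = b (two rows per frequency)."""
    rows, rhs = [], []
    for gf, w in zip(GF_vals, omegas):
        lam_vec = np.array([(lam / (1j * w + lam)) ** (N - k) for k in range(N)])
        c = gf * lam_vec                    # complex row: c^T theta = 1
        rows.extend([c.real, c.imag])
        rhs.extend([1.0, 0.0])
    return np.array(rows), np.array(rhs)

# one frequency, a hypothetical plant-filter gain GF(jw) = 0.5 + 0.2j:
A, b = rejection_constraints([0.5 + 0.2j], [1.0], lam=2.0, N=4)
theta = np.linalg.lstsq(A, b, rcond=None)[0]   # minimum-norm feasible theta
```

With N > 2n_f the system is underdetermined, which is the degree of freedom Lemma 6.7 exploits to additionally minimize the H∞-norm.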

6.4.2.2 Adaptive Case: Unknown Disturbance

In order to suppress unknown periodic disturbances, we propose the adaptive version of (6.65) given by

K(s, θ̂(t)) = [K_ij(s, θ̂_ij(t))]_{n_u × n_y},
K_ij(s, θ̂_ij(t)) = Σ_{k=0}^{N−1} (θ̂_ij(t))_k λ^{N−k}/(s + λ)^{N−k} = θ̂_ij(t)ᵀ Λ(s),
Λ(s) = [λ^N/(s + λ)^N, …, λ/(s + λ)]ᵀ,        (6.72)

where θ̂_ij(t) ∈ R^N is the parameter vector of the ij-th element of the filter and θ̂(t) = [θ̂₁₁(t)ᵀ, θ̂₁₂(t)ᵀ, …, θ̂_{n_u n_y}(t)ᵀ]ᵀ ∈ R^{N n_u n_y}. Then, the control law can be expressed as


u(t) = −F(s){K(s, θ̂(t))[ζ(t)]},   ζ(t) = y(t) − G₀(s)[u(t)],        (6.73)

where θ̂(t) is the estimate of the unknown parameter vector θ∗ at time t. In order to design an online parameter estimator that generates the estimate of θ∗, we express θ∗ in an appropriate parametric model, described by the following lemma.

Lemma 6.8 The closed-loop system (6.1), (6.73) is parameterized as

ζ(t) = Φ(t)ᵀ θ∗ + η(t),        (6.74)





where θ∗ = [θ₁₁∗ᵀ, θ₁₂∗ᵀ, …, θ_{n_u n_y}∗ᵀ]ᵀ ∈ R^{N n_u n_y} is the unknown desired parameter vector to be identified, ζ(t) = y(t) − G₀(s)[u(t)] ∈ R^{n_y} is a measurable vector signal, and the regressor matrix is given by

Φ(t) = G₀(s)F(s)[W(t)],        (6.75)

where

W(t) = blockdiag(w(t), …, w(t)) ∈ R^{N n_u n_y × n_u},        (6.76)

with w(t) = [Λ(s)ᵀ[ζ₁(t)], …, Λ(s)ᵀ[ζ_{n_y}(t)]]ᵀ ∈ R^{N n_y}, and the unknown error term η(t) depends on the unmodeled noise v(t) and the plant unmodeled dynamics Δm(s) and is given by

η(t) = (I − G₀(s)F(s)K(s, θ∗)) [v(t) + Δm(s)G₀(s)F(s)[u(t)]] + εs(t),        (6.77)

where εs(t) = (I − G₀(s)F(s)K(s, θ∗))[ds(t)] is an exponentially decaying to zero term.

Proof The proof is given in [59].

The parametric model (6.74), together with the parameter estimation techniques discussed in [58], is used to design a robust adaptive law to estimate the unknown parameter vector θ∗ as follows. The predicted value of the signal ζ(t) based on θ̂(t) is generated as

ζ̂(t) = Φ(t)ᵀ θ̂(t).        (6.78)

The normalized estimation error vector is defined as

ε(t) = (ζ(t) − ζ̂(t)) / m²(t),        (6.79)


where m²(t) = 1 + γ₀ trace(Φ(t)ᵀΦ(t)), with γ₀ > 0, is a normalizing signal. To generate θ̂(t), we consider the robust pure least-squares algorithm [58]:

Ṗ(t) = −P(t) (Φ(t)Φ(t)ᵀ/m²(t)) P(t)
θ̂̇(t) = proj(P(t)Φ(t)ε(t))        (6.80)

where P(0) = Pᵀ(0) > 0 and the projection operator proj(·) is used to guarantee that θ̂(t) ∈ S, ∀t, where S is a compact set defined as

S = {θ ∈ R^{N n_u n_y} | θᵀθ ≤ θmax²},        (6.81)

where θmax > 0 is such that the desired parameter vector of the optimum filter, θ∗, belongs to S. The projection of the estimated parameter vector onto S may be implemented as [58]:

θ̂̇(t) = P(t)Φ(t)ε(t)   if θ̂(t) ∈ S₀, or if θ̂(t) ∈ δS and (P(t)Φ(t)ε(t))ᵀθ̂(t) ≤ 0;
θ̂̇(t) = P(t)Φ(t)ε(t) − P(t) [θ̂(t)θ̂(t)ᵀ/(θ̂(t)ᵀP(t)θ̂(t))] P(t)Φ(t)ε(t)   otherwise,        (6.82)

where S₀ = {θ ∈ R^{N n_u n_y} | θᵀθ < θmax²} and δS = {θ ∈ R^{N n_u n_y} | θᵀθ = θmax²} denote the interior and the boundary of S, respectively. Some alternatives to projection and other robust modifications can be found in [58]; for example, one may use the fixed σ-modification, which requires no bounds on the set to which the unknown parameters belong; however, it destroys the ideal convergence properties of the adaptive algorithm [58]. Also, to prevent the covariance matrix P(t) from becoming close to singular, covariance resetting [58] is used to keep the minimum eigenvalue of P(t) greater than a pre-specified small positive constant ρ₁ at all times. This modification guarantees that P(t) is positive definite for all t ≥ 0.

The following theorem summarizes the properties of the adaptive control law.

Theorem 6.8 Consider the closed-loop system (6.1), (6.73), (6.78)–(6.82), and choose N > 2n̄_f, where n̄_f is a known upper bound for the number of distinct frequencies in the disturbance. Assume that the plant modeling error satisfies

c₀ ‖Δm(s)G₀(s)F(s)‖₁ < 1,        (6.83)

where c₀ is a finite positive constant independent of Δm(s). Then all signals in the closed-loop system are uniformly bounded and the plant output satisfies

lim sup_{T→∞} (1/T) ∫_t^{t+T} ‖y(τ)‖₂² dτ ≤ c(μ_Δ² + v₀²),        (6.84)


for any t ≥ 0 and some finite positive constant c which is independent of t, T, Δm(s), and v(t), where μ_Δ is a constant proportional to the size of the plant unmodeled dynamics Δm(s), and v₀ = sup_τ |v(τ)|. In addition, in the absence of modeling error and noise (i.e., if Δm(s) = 0 and v(t) = 0), the adaptive control law guarantees the convergence of y(t) to zero.

Proof The proof is given in [59].

The proposed control scheme provides satisfactory performance and stability margins if the design parameters are chosen properly. In any practical control design, the different types of uncertainties and modeling errors demand some trade-off between robust stability and performance, which can be achieved by using any available a priori knowledge of bounds on the uncertainties to choose certain design parameters. The main design parameters that contribute to performance and robustness with respect to unmodeled dynamics are as follows: the filter F(s), λ, N, γ₀, and P(0) affect the excitation level of the regressor and the rate of adaptation, while the stability margin depends on F(s) and the adaptive filter order N. As explained earlier, we choose the design parameters based on a priori knowledge of the maximum number of distinct frequencies of the disturbance, the frequency range of the disturbance, and the frequency range over which the unmodeled dynamics may be dominant.

6.5 Unknown Minimum-Phase Plants: SISO Systems

In this section, we relax the assumption of a known stable plant and consider the case where the plant model G₀(q) in Fig. 6.1 can be unstable with unknown parameters. We do assume, however, that G₀(q) has stable zeros. We use the Model Reference Adaptive Control (MRAC) structure [58, 62] to meet the objective of unknown periodic disturbance rejection without amplifying the effect of broadband noise in the output. The cost of achieving these objectives is the use of an over-parameterization, which adds to the number of computations. Focusing on discrete-time SISO systems, we consider the plant model shown in Fig. 6.1 and the control structure in Fig. 6.4, and show how the problem of rejecting unknown periodic disturbances can be solved for unstable plants with unknown parameters as long as they are minimum phase. We consider the plant model (6.1) and assume

G₀(z) = k₀ Z₀(z) / R₀(z)        (6.85)

is an unknown minimum-phase nominal plant transfer function (possibly unstable).


Assumption 6.1 The following assumptions are made about G₀(z):
• Z₀(z) is a monic Hurwitz polynomial with unknown coefficients, and R₀(z) is an arbitrary unknown monic polynomial which is allowed to be non-Hurwitz.
• The polynomial degrees n_p = deg(R₀) and n_z = deg(Z₀) are unknown, yet an upper bound n̄_p for n_p is known, and the relative degree of G₀(z), n∗ = n_p − n_z > 0, is known.
• The gain k₀ is unknown, yet its sign, sign(k₀), and an upper bound k̄₀ for |k₀| are known.

The above assumptions are the same as those in the classical MRAC [58, 62]. It should be noted that knowledge of the sign of k₀ significantly simplifies the structure of the control scheme by which the control objective can be met; it can be relaxed at the expense of a more complex control law [76]. The uncertain nature of the plant and disturbance models necessitates the use of a robust adaptive control scheme to achieve the control objective with high accuracy.

We first consider the case where the parameters of G₀(z) and the disturbance frequencies are perfectly known. A non-adaptive controller is then designed, and the conditions under which the control objective is achievable, as well as the limitations imposed by the controller structure, are studied. The analysis of the non-adaptive closed-loop system is crucial, as it provides insights that can be used to deal with the unknown-parameter case. Subsequently, the structure of the non-adaptive controller along with the certainty equivalence principle [58, 62] is used to design an adaptive control algorithm to handle the unknown-plant, unknown-disturbance case.

Consider the plant model (6.1) and disturbance (6.2) and the closed-loop system architecture shown in Figs. 6.1 and 6.4. We solve the problem for discrete-time SISO systems. The following assumption is made about the plant unmodeled dynamics Δm(z).

Assumption 6.2 The multiplicative uncertainty Δm(z) is analytic in |z| ≥ √ρ₀, for some known ρ₀ ∈ [0, 1), and z^{−n∗}Δm(z) is a proper transfer function.

6.5.1 Non-adaptive Case: Known Plant and Known Disturbance Frequencies

Let us suppose that the disturbance frequencies ω₁, …, ω_{n_f} and the plant model G₀(z) are perfectly known, and consider the structure of the classical MRAC scheme [62] as shown in Fig. 6.4, where

u(t) = K_u(z, θ_u)[u(t)] + K_y(z, θ_y)[y(t)],
K_u(z, θ_u) = θ₁ᵀ α(z)/Λ(z),   θ_u = θ₁ ∈ R^N,
K_y(z, θ_y) = θ₂ᵀ α(z)/Λ(z) + θ₃,   θ_y = [θ₂ᵀ, θ₃]ᵀ ∈ R^{N+1},
α(z) = [z^{N−1}, z^{N−2}, …, z, 1]ᵀ,   Λ(z) = z^N,        (6.86)


where N is the order of the controller. We can express the control law (6.86) as

u(t) = [(θ₂ᵀα(z) + θ₃Λ(z)) / (Λ(z) − θ₁ᵀα(z))] [y(t)].        (6.87)
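Because α(z)/Λ(z) = [z^{−1}, …, z^{−N}]ᵀ acts as a tapped delay line, the control law (6.86) reduces to a difference equation in the N most recent inputs and outputs; a minimal sketch (the function name is ours, and newest-first history buffers are an assumption of this illustration):

```python
import numpy as np

def mrac_control(theta1, theta2, theta3, u_hist, y_hist, y_now):
    """One step of (6.86):
        u(t) = theta1^T [u(t-1),...,u(t-N)]
             + theta2^T [y(t-1),...,y(t-N)] + theta3 * y(t),
    since alpha(z)/Lambda(z) = [z^-1, ..., z^-N]^T is a tapped delay line.
    u_hist and y_hist hold the N most recent past values, newest first."""
    return float(theta1 @ u_hist + theta2 @ y_hist + theta3 * y_now)
```

In the non-adaptive case the gains θ₁, θ₂, θ₃ are fixed by the matching equations (6.89); the adaptive version of Sect. 6.5.2 simply replaces them with their current estimates at each step.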

By substituting (6.87) into (6.1), we obtain

y(t) = R₀(z)(Λ(z) − θ₁ᵀα(z)) / [R₀(z)(Λ(z) − θ₁ᵀα(z)) − k₀Z₀(z)(θ₂ᵀα(z) + θ₃Λ(z))(1 + Δm(z))] [d(t)].        (6.88)

To achieve the disturbance rejection objective, the controller parameters θ₁, θ₂, θ₃ are to be chosen such that the sensitivity transfer function from d(t) to y(t) in (6.88) is stable and has zero gain at the disturbance frequencies. This can be achieved if the following matching equations are satisfied (the existence of solutions to these equations will be investigated subsequently):

R₀(z)Ds(z)A(z) + B(z) = z^{n∗}Λ(z),        (6.89a)
θ₁ᵀα(z) = Λ(z) − Z₀(z)Ds(z)A(z),        (6.89b)
θ₂ᵀα(z) + θ₃Λ(z) = −B(z)/k₀,        (6.89c)

where n∗ is the relative degree of G₀(z) and Ds(z) is the internal model (generating polynomial) of the sinusoidal disturbances, defined as

Ds(z) = Π_{i=1}^{n_f} (z² − 2cos(ωᵢ)z + 1),        (6.90)

where the ωᵢ's, i = 1, 2, …, n_f, are the distinct frequencies in the modeled part of the disturbances (in rad/sample), n_f is the total number of distinct frequencies, and the polynomials A(z) (monic and of degree N − deg(Z₀) − deg(Ds)) and B(z) (of degree at most N) are to be determined by solving (6.89a).

Remark 6.16 In the absence of additive disturbances (i.e., when d(t) = 0), the generating polynomial of the disturbances is Ds(z) = 1; hence, the control law (6.86), (6.89) with N = deg(R₀) − 1 = n_p − 1 reduces to that of the classical MRAC scheme [62, Sect. 7.2.2].

From (6.88) and (6.89), the sensitivity transfer function from d(t) to y(t) can be expressed in terms of the polynomials A(z) and B(z) as

y(t) = (1 + Δm(z) B(z)/(z^{n∗}Λ(z)))⁻¹ · [R₀(z)A(z)Ds(z)/(z^{n∗}Λ(z))] [d(t)],        (6.91)


where Ds(z) has zero gain at zᵢ = exp(±jωᵢ), i = 1, 2, …, n_f. The solvability of (6.89) and the existence of the polynomials A(z) and B(z) are the subject of the following lemma.

Lemma 6.9 Consider the matching equations (6.89), and let A(z) be a monic polynomial of degree N − deg(Z₀) − deg(Ds) and B(z) be a polynomial of degree at most N. If N ≥ deg(R₀Ds) − 1 = n_p + 2n_f − 1, then for any given G₀(z), the system of equations (6.89) is solvable. That is, the existence of polynomials A(z) and B(z), and hence of controller parameters θ₁, θ₂, θ₃ satisfying (6.89), is guaranteed. In addition, if N = n_p + 2n_f − 1, the solution is unique.

Proof The proof is given in [61].
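In the unique-solution case N = n_p + 2n_f − 1 of Lemma 6.9, (6.89a) is exactly a Euclidean division: dividing z^{n∗}Λ(z) = z^{n∗+N} by R₀(z)Ds(z) yields the monic quotient A(z) and the remainder B(z). A numpy sketch (the helper name is ours; coefficient vectors are listed highest power first):

```python
import numpy as np

def matching_polys(R0, Ds, n_star, N):
    """Solve (6.89a), R0(z)Ds(z)A(z) + B(z) = z^{n*+N}, in the unique case
    N = deg(R0 Ds) - 1: A is the polynomial quotient, B the remainder."""
    rhs = np.zeros(n_star + N + 1)
    rhs[0] = 1.0                                 # coefficients of z^{n*+N}
    A, B = np.polydiv(rhs, np.convolve(R0, Ds))
    return A, B

R0 = np.array([1.0, -0.5])           # hypothetical R0(z) = z - 0.5
Ds = np.array([1.0, -1.2, 1.0])      # (6.90) for one frequency with cos(w) = 0.6
A, B = matching_polys(R0, Ds, n_star=1, N=2)
```

For larger N, (6.89a) becomes an underdetermined linear (Sylvester-type) system in the coefficients of A and B, which is what Lemma 6.10 exploits.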

Now, we can summarize the properties of the closed-loop system in the following theorem.

Theorem 6.9 Consider the closed-loop system shown in Figs. 6.1 and 6.4 with disturbance model (6.2), under Assumption 6.1, with (6.86) and (6.89). If N ≥ n_p + 2n_f − 1 and the stability condition

‖Δm(z) B(z)/(z^{n∗}Λ(z))‖∞ < 1        (6.92)

is satisfied, then all signals in the closed-loop system are uniformly bounded and the sinusoidal components of the disturbance are completely rejected.

For N > n_p + 2n_f − 1, the polynomial equation (6.89a) has infinitely many solutions, among which we can choose one that minimizes the peak magnitude of B(z)/(z^{n∗}Λ(z)); thereby the stability margin and the tracking performance can be improved. The existence of the best parameter vector θ = [θ₁ᵀ, θ₂ᵀ, θ₃]ᵀ is the subject of the following lemma.

Lemma 6.10 Consider the closed-loop system shown in Figs. 6.1 and 6.4 with disturbance model (6.2), under Assumption 6.1, with (6.86) and (6.89). If the conditions in Theorem 6.9 are satisfied, then there exists a θ∗ with ‖θ∗‖ ≤ r₀, for some r₀ > 0, that solves the following constrained convex optimization problem:

θ∗ = arg min_{θ∈Ω} ‖B(z)/(z^{n∗}Λ(z))‖∞,   Ω = {θ | ‖θ‖ ≤ r₀, (6.89) holds}.        (6.97)

Proof The proof is given in [61].

Remark 6.17 As shown in [61], if N ≥ n_p + 2n_f − 1, then the set Ω is non-empty, compact, and convex; hence, the cost function in (6.97) attains its minimum on Ω. Also, the optimization problem can be formulated as a linear matrix inequality (LMI) feasibility problem and can be solved using efficient LMI solvers.

The following simple example illustrates the above results.

Example 6.5 Consider the following minimum-phase unstable open-loop plant model:

G₀(z) = k₀Z₀(z)/R₀(z) = 2.63(z − 0.13)/(z² − 1.91z + 1.44),        (6.98)

with a sampling period of 0.001 s, and suppose that the additive output disturbance has two sinusoidal components at frequencies ω₁ = 0.2416 rad/sample (= 38.45 Hz) and ω₂ = 0.6357 rad/sample (= 101.17 Hz). From Theorem 6.9, if the controller order satisfies N ≥ 5, then the sinusoidal disturbances are completely rejected.

Table 6.1 The H∞-norm of B(z)/(z^{n∗}Λ(z)) from the solution of (6.97)

  N     ‖B(z)/(z^{n∗}Λ(z))‖∞
  5     60.89 (= 35.69 dB)
  7     15.67 (= 23.90 dB)
  9      8.00 (= 18.07 dB)
 11      5.15 (= 14.23 dB)
 13      3.71 (= 11.38 dB)
 15      2.86 (= 9.13 dB)
 17      2.31 (= 7.24 dB)
 19      1.94 (= 5.75 dB)

Table 6.1 shows the peak value of |B(z)/(z^{n∗}Λ(z))| from the solution of the optimization problem (6.97) for different values of the controller order N. Although for any N ≥ 5 the sinusoidal terms of the disturbance are perfectly rejected, for small values of N the transfer function B(z)/(z^{n∗}Λ(z)) has large gains at high frequencies, making the closed-loop system very sensitive to high-frequency unmodeled dynamics (see the stability condition (6.92)); moreover, it may drastically amplify the effect of high-frequency noise on the plant output and therefore destroy the tracking performance. Figure 6.7 shows the magnitude Bode plots of B(z)/(z^{n∗}Λ(z)) and R₀(z)A(z)Ds(z)/(z^{n∗}Λ(z)) based on the solution of (6.97), for three different values of N. It should be noted that, according to (6.95), at the disturbance frequencies we have B(z)/(z^{n∗}Λ(z)) = 1, so ‖B(z)/(z^{n∗}Λ(z))‖∞ cannot be made less than one. Also, the inequality (6.96) implies that at any frequency, |R₀(z)A(z)Ds(z)/(z^{n∗}Λ(z))| and |B(z)/(z^{n∗}Λ(z))| differ by at most one.

Fig. 6.7 Magnitude Bode plots of B(z)/(z^{n∗}Λ(z)) and R₀(z)A(z)Ds(z)/(z^{n∗}Λ(z)) from the solution of (6.97), for different values of N. The controller is designed to reject the disturbances at frequencies ω₁ = 38.45 Hz and ω₂ = 101.17 Hz. Increasing the controller order N reduces the H∞-norm of B(z)/(z^{n∗}Λ(z)) and thereby improves performance and robustness

The above results reveal the properties of the control law (6.86) and provide insight into the corresponding level of achievable performance for the ideal case when the parameters of the plant model and the disturbance frequencies are perfectly known.

6.5.2 Adaptive Case: Unknown Plant and Unknown Disturbance

In this section, we design an adaptive version of the control law (6.86) to solve the problem of rejecting unknown disturbances acting on unknown minimum-phase systems. The adaptive filters in Fig. 6.4 are given by

u(t) = K_u(z, θ̂_u(t))[u(t)] + K_y(z, θ̂_y(t))[y(t)],
K_u(z, θ̂_u(t)) = θ̂₁(t)ᵀ α(z)/Λ(z),   θ̂_u(t) = θ̂₁(t) ∈ R^N,
K_y(z, θ̂_y(t)) = θ̂₂(t)ᵀ α(z)/Λ(z) + θ̂₃(t),   θ̂_y(t) = [θ̂₂(t)ᵀ, θ̂₃(t)]ᵀ ∈ R^{N+1},
α(z) = [z^{N−1}, z^{N−2}, …, z, 1]ᵀ,   Λ(z) = z^N.        (6.99)

In order to design a robust parameter estimator, we first develop a suitable parameterization of the closed-loop system in terms of the unknown controller parameters, based on which an estimate of the controller parameter vector θ̂(t) = [θ̂₁(t)ᵀ, θ̂₂(t)ᵀ, θ̂₃(t)]ᵀ is generated at each time t. The following lemma describes the parametric model to be used for parameter estimation.

Lemma 6.11 The closed-loop system (6.1), (6.99) is parameterized as

ζ(t) = φ(t)ᵀ θ̄∗ + η(t),        (6.100)

where θ̄∗ = [θ₁∗ᵀ, θ₂∗ᵀ, θ₃∗, 1/k₀]ᵀ ∈ R^{2N+2} is the unknown parameter vector to be identified, ζ(t) and φ(t) are known measurable signals, and η(t) is an unknown function representing the modeling error and noise terms, where

ζ(t) = z^{−n∗}[u(t)],
φ(t) = [ (α(z)ᵀ/(z^{n∗}Λ(z)))[u(t)], (α(z)ᵀ/(z^{n∗}Λ(z)))[y(t)], z^{−n∗}[y(t)], y(t) ]ᵀ,
η(t) = −Δm(z) [Z₀(z)A(z)Ds(z)/(z^{n∗}Λ(z))] [u(t)] − [R₀(z)A(z)Ds(z)/(k₀ z^{n∗}Λ(z))] [v(t)] − εs(t),        (6.101)


where εs(t) = [R₀(z)A(z)Ds(z)/(k₀ z^{n∗}Λ(z))] [ds(t)] is an exponentially decaying to zero term.

Proof The proof is given in [61].

Based on the affine parametric model (6.100), a wide class of robust parameter estimators can be employed to generate an estimate of the unknown parameter vector θ̄∗ at each time t. Let θ̄̂(t−1) be the most recent estimate of θ̄∗; then the predicted value of the signal ζ(t) based on θ̄̂(t−1) is generated as

ζ̂(t) = φ(t)ᵀ θ̄̂(t−1).        (6.102)

The normalized estimation error is defined as

ε(t) = (ζ(t) − ζ̂(t)) / m²(t),        (6.103)

where m²(t) = 1 + γ₀ φ(t)ᵀφ(t), with γ₀ > 0, is a normalizing signal. To generate θ̄̂(t), we consider the robust pure least-squares algorithm [62]:

P(t) = P(t−1) − P(t−1)φ(t)φ(t)ᵀP(t−1) / (m²(t) + φ(t)ᵀP(t−1)φ(t))
θ̄̂(t) = proj(θ̄̂(t−1) + P(t)φ(t)ε(t))        (6.104)

where P(0) = Pᵀ(0) > 0 and the projection operator proj(·) is used to guarantee that θ̄̂(t) ∈ S, ∀t, where S is a compact set defined as

S = {θ ∈ R^{2N+2} | θᵀθ ≤ θmax²},        (6.105)

where θmax > 0 is such that the desired parameter vector of the optimum filter, θ̄∗, belongs to S. The projection of the estimated parameter vector onto S may be implemented as [62, 65]:

χ(t) = θ̄̂(t−1) + P(t)φ(t)ε(t)
ρ̄(t) = P^{−1/2}(t)χ(t)
ρ(t) = ρ̄(t)   if ρ̄(t) ∈ S̄;   orthogonal projection of ρ̄(t) onto S̄   if ρ̄(t) ∉ S̄        (6.106)
θ̄̂(t) = P^{1/2}(t)ρ(t),

where the set S is transformed into S̄ such that if χ ∈ S, then P^{−1/2}χ ∈ S̄ [65].
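One estimator iteration can be sketched compactly in numpy; note that, for simplicity, this illustration projects radially back onto the ball S of (6.105) rather than implementing the transformed-coordinate projection (6.106), and the function name and default constants are ours:

```python
import numpy as np

def rls_step(theta, P, phi, zeta, gamma0=1.0, theta_max=10.0):
    """One robust pure least-squares iteration, cf. (6.103)-(6.105).

    Simplification: the estimate is kept inside the ball S of (6.105) by
    radial rescaling, instead of the P^{+-1/2}-transformed projection (6.106)."""
    m2 = 1.0 + gamma0 * (phi @ phi)                    # normalizing signal
    eps = (zeta - phi @ theta) / m2                    # normalized error (6.103)
    Pphi = P @ phi
    P = P - np.outer(Pphi, Pphi) / (m2 + phi @ Pphi)   # covariance update (6.104)
    theta = theta + P @ phi * eps                      # parameter update
    norm = np.linalg.norm(theta)
    if norm > theta_max:                               # keep the estimate in S
        theta = theta * (theta_max / norm)
    return theta, P
```

With noise-free data and a persistently exciting regressor, the estimate converges to the true parameter vector, which is the ideal-case property that the robust modifications above are designed to preserve under perturbations.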


Remark 6.18 The last element of θ̄̂(t) is the estimate of 1/k₀ (the reciprocal of the plant model gain k₀ in (6.85)). The adaptive law provides the estimate of 1/k₀ at each time, but this value is discarded, as the controller (6.99) depends only on the first 2N + 1 elements of θ̄̂(t).

The following theorem summarizes the properties of the adaptive control law.

Theorem 6.10 Consider the closed-loop system (6.1), (6.99), (6.102)–(6.106), under Assumptions 6.1 and 6.2, and choose N ≥ n̄_p + 2n̄_f − 1, where n̄_p is an upper bound for deg(R₀) and n̄_f is an upper bound for the number of distinct frequencies in the disturbance. If Δm(z) satisfies a norm-bound condition, then all signals in the closed-loop system are uniformly bounded and the plant output satisfies

lim sup_{T→∞} (1/T) Σ_{τ=t}^{t+T−1} |y(τ)|² ≤ c(μ_Δ² + v₀²),        (6.107)

for any t ≥ 0 and some finite positive constant c which is independent of t, T, Δm(z), and v(t), where μ_Δ is a constant proportional to the size of the plant unmodeled dynamics Δm(z), and v₀ = sup_τ |v(τ)|. In addition, in the absence of modeling error and noise (i.e., if Δm(z) = 0 and v(t) = 0), the adaptive control law guarantees the convergence of y(t) to zero.

Proof The proof is obtained by following the same steps as those in the proofs of Theorems 7.2 and 7.3 in [62, 77].

Remark 6.19 The minimum-phase assumption on the plant model is the main limitation of the MRAC scheme. The location of the zeros of a discretized system depends critically on the type of sampling as well as on the size of the sampling time. A minimum-phase continuous-time system may, after discretization with a sample-and-hold device, become a non-minimum-phase discrete-time system; in particular, this can occur when the relative degree of the continuous-time system is greater than two and the sampling time is very small [78, Sect. 2]. It is also possible for a non-minimum-phase continuous-time system to become a minimum-phase discrete-time system [78, Sect. 2]. There are criteria under which all zeros of a sampled transfer function are stable; these criteria, however, are restrictive [79].

6.6 Numerical Simulation

In this section, we demonstrate the performance of the adaptive control schemes proposed in Sects. 6.3, 6.4, and 6.5.


6.6.1 SISO Discrete-Time Systems with Known Plant Model

Consider the open-loop plant model shown in Fig. 6.1 and the controller structure in Fig. 6.3 with disturbance model (6.2). Suppose

G₀(z) = −0.00146(z − 0.1438)(z − 1) / [(z − 0.7096)(z² − 0.04369z + 0.01392)],        (6.108)

with a sampling period of 1/480 s, is the known stable modeled part of the plant, and

Δm(z) = −0.0001/(z + 0.99)²

is the unknown plant multiplicative uncertainty, which is dominant at high frequencies and negligible at low frequencies, and

d(t) = 0.7 sin(ω₁t + π/3) + 0.5 sin(ω₂t + π/4) + v(t)        (6.109)

is an unknown additive output disturbance, where ω₁ = 0.0521 rad/sample (= 25 rad/s), ω₂ = 0.4688 rad/sample (= 225 rad/s), and v(t) is zero-mean Gaussian noise with standard deviation 0.02.

The following design parameters are assumed for the adaptive law (6.13), (6.17)–(6.21): N = 50, γ₀ = 1, P(0) = 20I, θ̂(0) = 0, and F(z) = 100. Figure 6.8 shows the performance of the adaptive control scheme, where at t = 10 s the feedback is switched on and the control input u(t) is applied. In order to demonstrate the behavior of the adaptive controller when the disturbance characteristics change, at t = 30 s two new sinusoidal terms, 0.6 sin(ω₃t − π/6) with ω₃ = 0.1771 rad/sample (= 85 rad/s) and 0.4 sin(ω₄t + π/2) with ω₄ = 0.2604 rad/sample (= 125 rad/s), are abruptly added to the existing disturbance (6.109). At this time, the controller re-adjusts its parameters to reject the new sinusoidal terms. Such performance is achieved because the controller order N is chosen large enough to handle multiple disturbance frequencies.

Fig. 6.8 Simulation results for a SISO discrete-time system with known stable plant model and the adaptive control scheme (6.13), (6.17)–(6.21). The control input is applied at t = 10 s. New unknown sinusoidal terms are abruptly added to the existing disturbance at t = 30 s; the adaptive controller updates its parameters at this time to suppress the effect of the new terms on the plant output

6.6.2 SISO Continuous-Time Systems with Known Plant Model

Consider the open-loop plant model shown in Fig. 6.1 and the controller structure in Fig. 6.3 with disturbance model (6.2). Suppose

G₀(s) = 0.5(s − 0.2) / (s² + s + 1.25)

is the known stable modeled part of the plant, and Δm(s) = −0.001s is the unknown unmodeled dynamics with small magnitude at low frequencies and large magnitude at high frequencies. We also assume that

d(t) = 0.6 sin(ω₁t + π/4) + 0.7 sin(ω₂t + π/2) + v(t)

is an unknown additive output disturbance, where ω₁ = 70 rad/s, ω₂ = 187 rad/s, and v(t) is zero-mean Gaussian noise with standard deviation 0.02. Let us suppose the following partial knowledge about the unknown disturbance is given: the disturbance is dominated by at most 5 distinct frequencies, and the largest frequency of the disturbance is less than 600 rad/s. This information helps in the selection of some design parameters.

In order to show the effect of open-loop plant pre-filtering by a suitable LTI filter F(s), we compare the performance of the adaptive law (6.33), (6.37)–(6.41) with and without a compensator F(s). The magnitude of the open-loop plant G₀(s) is relatively small over the expected frequency range of the disturbance, which may drastically slow down the adaptation and adversely affect the performance of the proposed adaptive control scheme. In order to increase the plant gain over the frequency range of interest, we use the procedure proposed in Sect. 6.3.2 and design a stable filter F(s) such that the compensated plant model G₀(s)F(s) has a large enough gain over the expected frequency range of the disturbance. The filter F(s) designed for this plant is given by

F(s) = α₀²(s² + s + 1.25) / ((s + α₀)²(s + 0.2)),   α₀ = 500.        (6.110)

Figure 6.9 shows the magnitude Bode plot of the original uncompensated open-loop plant G₀(s) and that of the compensated version G₀(s)F(s).
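The gain boost provided by (6.110) can be verified numerically; a small sketch (function names ours) comparing the uncompensated and compensated responses at the first disturbance frequency:

```python
import numpy as np

def G0(s):
    """Nominal plant of Sect. 6.6.2: G0(s) = 0.5(s - 0.2)/(s^2 + s + 1.25)."""
    return 0.5 * (s - 0.2) / (s ** 2 + s + 1.25)

def F(s, a0=500.0):
    """Shaping filter (6.110): F(s) = a0^2 (s^2 + s + 1.25)/((s + a0)^2 (s + 0.2))."""
    return a0 ** 2 * (s ** 2 + s + 1.25) / ((s + a0) ** 2 * (s + 0.2))

s1 = 1j * 70.0                                   # first disturbance frequency, rad/s
plain, shaped = abs(G0(s1)), abs(G0(s1) * F(s1))
```

At ω₁ = 70 rad/s the uncompensated gain is below 0.01, while the compensated gain is roughly seventy times larger, which is precisely why the regressor excitation, and hence the adaptation speed, improves with F(s) in the loop.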


Fig. 6.9 The magnitude plots of G 0 (s) and G 0 (s)F(s). The filter F(s) increases the open-loop plant gain over the expected range of disturbance frequencies. Open-loop plant filtering increases the excitation level of the regressor (6.35) and improves the disturbance rejection performance of the adaptive controller

The following design parameters are assumed for the adaptive law (6.33), (6.37)–(6.41): N = 20, γ₀ = 1, λ = 500, P(0) = 500I, and θ̂(0) = 0. Figure 6.10 shows the plant output y(t), where the control input u(t) is applied at t = 5 s. Figure 6.10a shows the performance of the proposed adaptive control scheme without the filter F(s) (i.e., F(s) = 1). The rate of adaptation in this case is very small, as the plant model G₀(s) has a very small gain at the disturbance frequencies ω₁ = 70 and ω₂ = 187 rad/s (see Fig. 6.9). It is clear that the adaptive controller with F(s) = 1 and N = 20 is quite slow and does not provide a significant improvement in performance. We note that by increasing the size N of the adaptive filter, the performance shown in Fig. 6.10a can be improved, but not as effectively as with the filter F(s) in (6.110) for the same filter order. As shown in Fig. 6.10b, the periodic terms are quickly rejected when the control input is applied at t = 5 s with the filter (6.110) in the loop.

6.6.3 MIMO Discrete-Time Systems with Known Plant Model

Consider the open-loop plant model shown in Fig. 6.1 and the controller structure in Fig. 6.3 with disturbance model (6.2). Suppose

G₀(z) = [0.01(z − 1) / ((z − 0.75)(z² + 1.3z + 0.8))] · [ z − 2   z + 0.5
                                                          z − 2   z + 1  ],

with a sampling period of 0.001 s, is the known stable modeled part of the plant, and


S. Jafari and P. Ioannou

Fig. 6.10 Simulation results for a SISO continuous-time system with known stable plant model and the adaptive control scheme (6.33), (6.37)–(6.41). The performance of the proposed scheme for N = 20 a without filter F(s) (i.e., F(s) = 1), the speed of adaptation is pretty low; b with filter F(s) in (6.110), much better performance is achieved. In both cases, the control signal u(t) is applied at t = 5 s

\Delta_m(z) = \frac{-10^{-5}}{(z + 0.999)^2}\, I,

is the unknown plant multiplicative uncertainty, which has a negligible size at low frequencies and a relatively large size near the Nyquist frequency, and the unknown additive output disturbances applied to the two output channels are

d_1(t) = 0.6 sin(ω11 t) + 0.6 sin(ω12 t + π/8) + 0.6 sin(ω13 t + π/6) + v_1(t),
d_2(t) = 0.5 sin(ω21 t + π/4) + 0.5 sin(ω22 t + π/2) + 0.5 sin(ω23 t + π/5) + v_2(t),

where ω11 = 0.03 rad/sample (= 30 rad/s), ω12 = 0.095 rad/sample (= 95 rad/s), ω13 = 0.18 rad/sample (= 180 rad/s), ω21 = 0.02 rad/sample (= 20 rad/s), ω22 = 0.11 rad/sample (= 110 rad/s), ω23 = 0.21 rad/sample (= 210 rad/s), and v1 (t) and v2 (t) are zero-mean Gaussian with standard deviation 0.02. The sigma plot of the open-loop plant G 0 (z) shows large gains of G 0 (z) at high frequencies and low gains at low frequencies. With such a system, the closed-loop system is vulnerable to high-frequency plant unmodeled dynamics with a poor disturbance rejection performance. A suitable filter F(z) is therefore needed to properly shape the singular values of the plant model. For design of filter F(z), we use the inner-outer factorization of G 0 (z). Since the plant has zeros on the unit circle, we apply the algorithm proposed in [75] to the perturbed plant model G˜ 0 (z) obtained by scaling the zeros by factor 0.99 to


Fig. 6.11 The maximum and minimum singular values of the original uncompensated plant model G 0 (z) and those of the compensated plant G 0 (z)F(z). The singular values of G 0 (z)F(z) are almost aligned

move them slightly away from the unit circle. Then G̃_0(z) = G̃_in(z) G̃_out(z), where the inner factor G̃_in(z) is a stable proper all-pass filter and the outer factor G̃_out(z) is stable and proper with a stable right inverse. Then, we choose F(z) = κ_0 f(z) G̃_out^{-1}(z), where κ_0 = 0.1 and f(z) is a scalar low-pass filter with a DC gain of one, designed to compensate for the effect of high-frequency modeling uncertainties. We assume f(z) is a third-order Butterworth low-pass filter with cutoff frequency of 630 rad/s, given by

f(z) = \frac{0.018099 (z + 1)^3}{(z - 0.5095)(z^2 - 1.251z + 0.5457)}.
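The coefficients of f(z) match a standard third-order digital Butterworth design; a minimal sketch with scipy, assuming the 0.001 s sampling period stated for this example:

```python
import numpy as np
from scipy import signal

Ts = 0.001                      # sampling period of the example
wc = 630.0                      # cutoff frequency in rad/s
Wn = wc / (np.pi / Ts)          # cutoff normalized to the Nyquist rate (~0.2)

# Third-order digital Butterworth low-pass; unity DC gain by construction
b, a = signal.butter(3, Wn)
print("numerator  :", np.round(b, 6))
print("denominator:", np.round(a, 6))
print("DC gain    :", b.sum() / a.sum())
```

The resulting coefficients reproduce, up to rounding, the numerator gain 0.018099 and the denominator factors quoted above.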

Then, the compensated plant G_0(z)F(z) has a gain of κ_0 = 0.1 (i.e., −20 dB) in every direction over the expected frequency range of the disturbances. The selection of the bandwidth of f(z) and of the gain κ_0 is based on partial knowledge of the size and the frequency range where the modeling error may be dominant. Large values of these two parameters can adversely affect the stability margin. Figure 6.11 shows the maximum and minimum singular values of the original uncompensated plant model G_0(z) and those of the compensated plant G_0(z)F(z). The following design parameters are assumed for the adaptive law (6.53), (6.58)–(6.62): N = 60, γ_0 = 1, P^{-1}(0) = 0.01I, and θ̂(0) = 0. To demonstrate the performance of the adaptive law when new disturbances are abruptly added to the plant output, we assume that new unknown disturbance terms 0.9 sin(ω14 t + π/7), ω14 = 0.103 rad/sample (= 103 rad/s), and 0.9 sin(ω24 t + π/3), ω24 = 0.128 rad/sample (= 128 rad/s), are added to the outputs of channels 1 and 2, respectively, at time t = 25 s. Figure 6.12 shows the performance of the proposed adaptive control scheme with the above filter F(z). After closing the feedback loop at t = 10 s, the controller quickly adjusts its parameters to reject the sinusoidal components of the disturbance.


Fig. 6.12 Simulation results for a MIMO discrete-time system with known stable plant model and the adaptive control scheme (6.53), (6.58)–(6.62). The control input is applied at t = 10 s. At time t = 25 s, new unknown sinusoidal terms are abruptly added to the existing disturbance

At t = 25 s, the controller re-adjusts its parameters to counteract the effect of new disturbance terms.

6.6.4 SISO Discrete-Time Systems with Unknown Plant Model

Consider the open-loop plant model shown in Fig. 6.1 and the controller structure in Fig. 6.4 with disturbance model (6.2). Suppose

G_0(z) = \frac{k_0 Z_0(z)}{R_0(z)} = \frac{2.63(z - 0.13)}{z^2 - 1.91z + 1.44},

with sampling period of 0.001 s, is the unknown unstable minimum-phase modeled part of the plant, and d(t) = 1.2 sin(ω1 t + π/3) + 0.8 sin(ω2 t − π/4) + v(t) is an unknown additive output disturbance, where ω1 = 0.127 rad/sample (= 127 rad/s), ω2 = 0.225 rad/sample (= 225 rad/s), and v(t) is a zero-mean Gaussian noise with standard deviation 0.02. The following design parameters are assumed for the adaptive law (6.99), (6.102)–(6.106): N = 30, γ_0 = 1, P(0) = 20I, θ̂_i(0) = 0 for i = 1 : 2N + 1, and θ̂_{2N+2}(0) = 5. The control input is applied at t = 10 s, and at time t = 30 s a new sinusoidal


Fig. 6.13 Simulation results for a SISO discrete-time system with unknown unstable minimum-phase plant model and the adaptive control scheme (6.99), (6.102)–(6.106). The control input is applied at t = 10 s, and at time t = 30 s a new unknown sinusoidal term is abruptly added to the existing disturbance

disturbance term 1.5 sin(ω3 t − π/5), where ω3 = 0.325 rad/sample (= 325 rad/s), is abruptly added to the existing output disturbance (Fig. 6.13).

6.7 Conclusion

The problem of attenuating unknown narrowband disturbances in the presence of broadband random noise and plant unmodeled dynamics for SISO and MIMO plants was examined in both continuous- and discrete-time formulations. We showed that by using proper plant pre-filtering, an over-parameterization of the controller parameters, and a robust adaptive law for parameter estimation we can achieve the following:

◦ guaranteed stability, provided the unmodeled dynamics are small in the low-frequency range;
◦ attenuation of the periodic components of the disturbance despite the presence of noise, unmodeled dynamics, and time-varying frequencies of the periodic disturbance terms;
◦ improved performance as well as stability margin, especially in cases where the zeros of the plant are close to the zeros of the internal model of some of the disturbance terms.

Suppression of unknown additive sinusoidal disturbances acting on a class of discrete-time SISO systems with unknown parameters in the presence of unstructured modeling uncertainties was also studied. It was shown that an over-parameterized version of the classical robust MRAC can be employed for rejection of the unknown periodic disturbance components in the plant output without amplifying the noise.


This capability is achieved by increasing the number of controller parameters without changing the structure of the control law. The use of the reference model structure, however, restricts the class of dominant plant models to those that are minimum phase. Numerical simulations are presented to demonstrate the performance of the proposed schemes.

References 1. Jafari, S., Ioannou, P., Fitzpatrick, B., Wang, Y.: IEEE Trans. Autom. Control 60(8), 2166 (2015) 2. Jafari, S., Ioannou, P.: Automatica 70, 32 (2016) 3. Chen, X., Tomizuka, M.: IEEE Trans. Control Syst. Technol. 20(2), 408 (2012) 4. Kim, W., Chen, X., Lee, Y., Chung, C.C., Tomizuka, M.: Mech. Syst. Signal Process. 104, 436 (2018) 5. Gan, W.C., Qiu, L.: IEEE/ASME Trans. Mechatron. 9(2), 436 (2004) 6. Houtzager, I., van Wingerden, J.W., Verhaegen, M.: IEEE Trans. Control Syst. Technol. 21(2), 347 (2013) 7. Perez-Arancibia, N.O., Gibson, J.S., Tsao, T.C.: IEEE/ASME Trans. Mechatron. 14(3), 337 (2009) 8. Preumont, A.: Vibration Control of Active Structures. Springer, Berlin (2011) 9. Orzechowski, P.K., Chen, N.Y., Gibson, J.S., Tsao, T.C.: IEEE Trans. Control Syst. Technol. 16(2), 255 (2008) 10. Silva, A.C., Landau, I.D., Ioannou, P.: IEEE Trans. Control Syst. Technol. (2015) 11. Landau, I.D., Silva, A.C., Airimitoaie, T.B., Buche, G., Noe, M.: Eur. J. Control. 19(4), 237 (2013) 12. Landau, I.D., Alam, M., Martinez, J.J., Buche, G.: IEEE Trans. Control Syst. Technol. 19(6), 1327 (2011) 13. Rudd, B.W., Lim, T.C., Li, M., Lee, J.H.: J. Vib. Acoust. 134(1), 1 (2012) 14. Uchida, A.: Optical Communication with Chaotic Lasers: Applications of Nonlinear Dynamics and Synchronization. Wiley-VCH Verlag GmbH & Co. KGaA, Weinheim (2012) 15. Andrews, L.C., Phillips, R.L.: Laser Beam Propagation through Random Media. SPIE (1998) 16. Fujii, T., Fukuchi, T.: Laser Remote Sensing. CRC, Boca Raton (2005) 17. Roggemann, M.C., Welsh, B.: Imaging Through Turbulence. CRC Press, Boca Raton (1996) 18. Vij, D.R., Mahesh, K.: Medical Applications of Lasers. Springer, Berlin (2002) 19. Baranec, C.: Astronomical adaptive optics using multiple laser guide stars, Ph.D Dissertation, University of Arizona (2007) 20. Tyson, R.: Principles of Adaptive Optics. CRC Press, Boca Raton (2010) 21. 
Watkins, R.J., Agrawal, B., Shin, Y., Chen, H.J.: In: 22nd AIAA International Communications Satellite Systems Conference (2004) 22. Ulbrich, H., Gunthner, W.: IUTAM Symposium on Vibration Control of Nonlinear Mechanisms and Structures. Springer, Netherlands (2005) 23. Roesch, P., Allongue, M., Achache, M.: In: Proceedings of the 9th European Rotorcraft Forum, pp. 1–19. Cernobbio, Italy (1993) 24. Bittanti, S., Moiraghi, L.: IEEE Trans. Control Syst. Technol. 2(4), 343 (1994) 25. Friedmann, P.P., Millot, T.A.: J. Guidance Control Dyn. 18, 664 (1995) 26. Patt, D., Liu, L., Chandrasekar, J., Bernstein, D.S., Friedmann, P.P.: J. Guidance Control Dyn. 28(5), 918 (2005) 27. Lau, J., Joshi, S.S., Agrawal, B.N., Kim, J.W.: J. Guidance Control Dyn. 29(4), 792 (2006) 28. Nelson, P.A., Elliott, J.C.: Active Control of Sound. Academic, London (1992)


29. Benesty, J., Sondhi, M.M., Huang, Y.: Springer Handbook of Speech Processing. Springer, Berlin (2008) 30. Emborg, U., Ross, C.F.: In: Proceedings of the Recent Advances in Active control Sound and Vibrations. Blacksburg (1993) 31. Eriksson, L.J.: Sound Vibra. 22, 2 (1988) 32. Kuo, S.M., Morgan, D.R.: Active Noise Control Systems, Algorithms and DSP Implementations. Wiley-Interscience, New York (1996) 33. Elliott, S.J., Nelson, P.A.: Electron. Commun. Eng. J. 2(4), 127 (1990) 34. Elliott, J.C., Nelson, P.A.: IEEE Signal Process. Mag. 10(4), 12 (1993) 35. Elliott, S.J., Nelson, P.A., Stothers, I.M., Boucher, C.C.: J. Sound Vib. 140(2), 219 (1990) 36. Sutton, T.J., Elliott, S.J., McDonald, A.M., Saunders, T.J.: Noise Control Eng. J. 42(4), 137 (1994) 37. Amara, F.B., Kabamba, P., Ulsoy, A.: J. Dyn. Syst. Meas. Contr. 121(4), 655 (1999) 38. Wang, Y., Liu, L., Fitzpatrick, B.G., Herrick, D.: In: Proceedings of the Directed Energy System Symposium (2007) 39. Pulido, G.O., Toledo, B.C., Loukianov, A.G.: In: Proceedings of the 44th IEEE Conference on Decision and Control, pp. 4821–4826 (2005) 40. Kinney, C.E., de Callafon, R.A.: Int. J. Adapt. Cont. Sig. Process. 25, 1006 (2011) 41. Bodson, M.: Int. J. Adapt. Cont. Sig. Process. 19, 67 (2005) 42. Kim, W., Kim, H., Chung, C.C., Tomizuka, M.: IEEE Trans. Control Syst. Technol. 19(5), 1296 (2011) 43. Amara, F.B., Kabamba, P., Ulsoy, A.: J. Dyn. Syst. Meas. Contr. 121(4), 648 (1999) 44. Aranovskiy, S., Freidovich, L.B.: Eur. J. Control. 19(4), 253 (2013) 45. Marino, R., Tomei, P.: Automatica 49(5), 1494 (2013) 46. Landau, I.D., Alma, M., Constantinescu, A., Martinez, J.J., Noe, M.: Control. Eng. Pract. 19(10), 1168 (2011) 47. Youla, D., Bongiorno, J., Jabr, H.: IEEE Trans. Autom. Control 21(1), 3 (1976) 48. Landau, I.D.: Int. J. Control 93(2), 204 (2020) 49. Marino, R., Santosuosso, G.L.: IEEE Trans. Autom. Control 52(2), 352 (2007) 50. Bodson, M., Douglas, S.: Automatica 33, 2213 (1997) 51. 
Chanderasekar, J., Liu, L., Patt, D., Friedmann, P.P., Bernstein, D.S.: IEEE Trans. Control Syst. Technol. 14(6), 993 (2006) 52. Feng, G., Palaniswami, M.: IEEE Trans. Autom. Control 37(8), 1220 (1992) 53. Palaniswami, M.: IEE Proc. D Control Theory Appl. 140(1), 51 (1993) 54. Pigg, S., Bodson, M.: IEEE Trans. Autom. Control 18(4), 822 (2010) 55. Pigg, S., Bodson, M.: Asian J. Control 15, 1 (2013) 56. Basturk, H.I., Krstic, M.: Automatica 50(10), 2539 (2014) 57. Basturk, H.I., Krstic, M.: Automatica 58, 131 (2015) 58. Ioannou, P.A., Sun, J.: Robust Adaptive Control. Prentice-Hall, Upper Saddle River (1996) 59. Jafari, S., Ioannou, P.: Int. J. Adapt. Control Signal Process 30(12), 1674 (2016) 60. Jafari, S., Ioannou, P., Rudd, L.: J. Vib. Control 23(4), 526 (2017) 61. Jafari, S., Ioannou, P.: Int. J. Adapt. Control Signal Process 33(1), 196 (2019) 62. Ioannou, P.A., Fidan, B.: Adaptive Control Tutorial. SIAM (2006) 63. Skogestad, S., Postlethwaite, I.: Multivariable Feedback Control: Analysis and Design. Wiley, New York (2005) 64. Dahleh, M., Diaz-Bobillo, I.J.: Control of Uncertain Systems: A Linear Programming Approach. Prentice-Hall, Englewood Cliffs (1995) 65. Goodwin, G.C., Sin, K.S.: Adaptive Filtering Prediction and Control. Prentice-Hall, Englewood Cliffs (1984) 66. McFarlane, D., Glover, K.: Robust Controller Design Using Normalized Coprime Factor Plant Descriptions (Lecture Notes in Control and Information Sciences, vol. 138. Springer, Berlin (1990) 67. Papageorgiou, G., Glover, K.: In: The AIAA Conference on Guidance, Navigation and Control, pp. 1–14 (1999)


68. McFarlane, D., Glover, K.: IEEE Trans. Autom. Control 37(6), 759 (1992) 69. Hyde, R.A.: The Application of Robust Control to VSTOL Aircraft, Ph.D thesis, University of Cambridge (1991) 70. Hyde, R.A.: H∞ Aerospace Control Design – A VSTOL Flight Application. Advances in Industrial Control. Springer, London (1995) 71. Lanzon, A.: Automatica 41(7), 1201 (2005) 72. Francis, B.A.: A Course in H∞ Control Theory. Lecture Notes in Control and Information Science. Springer, Berlin (1987) 73. Weiss, M.: IEEE Trans. Autom. Control 39(3), 677 (1994) 74. Varga, A.: IEEE Trans. Autom. Control 43(5), 684 (1998) 75. Chen, B.M., Lin, Z., Shamash, Y.: Linear Systems Theory: A Structural Decomposition Approach. Birkhauser, Boston (2004) 76. Lee, T.H., Narendra, K.: IEEE Trans. Autom. Control 31(5), 477 (1986) 77. Ioannou, P.A., Fidan, B.: Adaptive control tutorial – supplement to chapter 7. https://archive. siam.org/books/dc11/Ioannou-Web-Ch7.pdf 78. Astrom, K.J., Wittenmark, B.: Computer-Controlled Systems: Theory and Design. Dover Publication, New York (2011) 79. Astrom, K., Hagander, P., Sternby, J.: Automatica 20(1), 31 (1984)

Chapter 7

Delay-Adaptive Observer-Based Control for Linear Systems with Unknown Input Delays Miroslav Krstic and Yang Zhu

Abstract Laurent Praly’s contributions to adaptive control and to state and parameter estimation are inestimable. Inspired by them, over the last several years we have developed adaptive and observer-based control designs for the stabilization of linear systems that have large and unknown delays at their inputs. In this chapter, we provide a tutorial introduction to this collection of results by presenting several of the most basic ones among them. Among the problems considered are some with measured and some with unmeasured states, some with known and some with unknown plant parameters, some with known and some with unknown delays, and some with measured and some with unmeasured actuator state under unknown delays. We have carefully chosen, for this chapter, several combinations among these challenges, in which estimation of a state (of the plant or of the actuator) and/or estimation of a parameter (of the plant or the delay) is being conducted and such estimates fed into a certainty-equivalence observer-based adaptive control law. The exposition progresses from designs that are relatively easy to those that are rather challenging. All the designs and stability analyses are Lyapunov based. The delay compensation is based on the predictor approach and the Lyapunov functionals are constructed using backstepping transformations and the underlying Volterra integral operators. The stability achieved is global, except when the delay is unknown and the actuator state is unmeasured, in which case stability is local.

M. Krstic (B) Department of Mechanical and Aerospace Engineering, University of California, San Diego, USA e-mail: [email protected] Y. Zhu State Key Laboratory of Industrial Control Technology, Institute of Cyber-Systems and Control, Zhejiang University, Hangzhou, China e-mail: [email protected] © The Author(s), under exclusive license to Springer Nature Switzerland AG 2022 Z.-P. Jiang et al. (eds.), Trends in Nonlinear and Adaptive Control, Lecture Notes in Control and Information Sciences 488, https://doi.org/10.1007/978-3-030-74628-5_7


M. Krstic and Y. Zhu

7.1 Introduction

7.1.1 Adaptive Control for Time-Delay Systems and PDEs

Actuator and sensor delays are among the most common dynamic phenomena in engineering practice, and when disregarded, they render controlled systems unstable [11]. Over the past 60 years, predictor feedback has been a key tool for compensating such delays, but conventional predictor feedback algorithms assume that the delays and other parameters of a given system are known [2, 5, 6, 24, 26]. When incorrect parameter values are used in the predictor, the resulting controller may be as destabilizing as without the delay compensation [6, 13, 14]. Adaptive control—a simultaneous real-time combination of system identification (the estimation of the model parameters) and feedback control—is one of the most important topics in the field of control theory and engineering [1, 3, 12, 18, 25]. In adaptive control, actuator or sensor delays have traditionally been viewed as perturbations—effects that may challenge the robustness of the adaptive controllers and whose ignored presence should be studied with the tools for robustness analysis and redesign [10, 13, 14, 22, 23]. The simultaneous presence of unknown plant parameters and actuator/sensor delays poses challenges, which arise straight out of control practice, and which are not addressed by conventional predictor and adaptive control methods. The turning point for the development of the ability to simultaneously tackle delays and unknown parameters was the introduction of the Partial Differential Equation (PDE) backstepping framework for parabolic PDEs and the resulting backstepping interpretation of the classical predictor feedback [1, 15, 16, 19–21, 25]. Similar to the role that finite-dimensional adaptive backstepping had played in the development of adaptive control in the 1990s, PDE backstepping of the 2000s furnished the Lyapunov functionals needed for the study of stability of delay systems under predictor feedback laws.
It also enabled the further design of adaptive controllers for systems with delays and unknown parameters, from the late 2000s onward. Adaptive control problems for single-input linear systems with unknown discrete input delay are addressed in [7–9, 29, 35], and then extended to multi-input systems with distinct discrete delays by [30, 32, 33]. In publications [4, 27, 31, 34], delay-adaptive control for linear systems with uncertain distributed delays are studied.

7.1.2 Results in This Chapter: Adaptive Control for Uncertain Linear Systems with Input Delays

In general, linear systems with input delays usually come with the following five types of uncertainties [28]:

• unknown actuator delays,

• unknown actuator delay kernels,
• unknown plant parameters,
• unmeasurable finite-dimensional plant state, and
• unmeasurable infinite-dimensional actuator state.

Table 7.1 Uncertainty collections of linear systems with input delays

Section      Delay    Delay kernel  Parameter  Plant state  Actuator state
Sect. 7.2.1  known    –             known      unknown      known
Sect. 7.2.2  unknown  –             known      known        known
Sect. 7.2.3  unknown  –             known      known        unknown
Sect. 7.3    unknown  –             unknown    unknown      known
Sect. 7.4    unknown  unknown       known      known        known

In this chapter, we provide a tutorial introduction to the delay-adaptive control approach to handle a collection of the above basic uncertainties. We have carefully chosen, for this chapter, several combinations among these challenges, in which estimation of a state (of the plant or of the actuator) and/or estimation of a parameter (of the plant or the delay) is being conducted and such estimates are fed into a certainty-equivalence observer-based adaptive control law. All the designs and stability analyses are Lyapunov based. The delay compensation is based on the predictor approach and the Lyapunov functionals are constructed using backstepping transformations and the underlying Volterra integral operators. To clearly describe the chapter's organization, the different combinations of uncertainties considered in later sections are summarized in Table 7.1, from which the interested readers and practitioners can make their own selections to address a vast class of relevant problems.

7.2 Adaptive Control for Linear Systems with Discrete Input Delays

Consider linear systems with discrete input delays as follows:

\dot{X}(t) = A(\theta) X(t) + B(\theta) U(t - D)   (7.1)
Y(t) = C X(t),   (7.2)

where X(t) ∈ R^n is the plant state, Y(t) ∈ R^q is the output, and U(t) ∈ R is the control input with a constant delay D ∈ R^+. The matrices A(θ) and B(θ), which depend on the constant parameter vector θ ∈ R^p, are linearly parameterized such that

A(\theta) = A_0 + \sum_{i=1}^{p} \theta_i A_i, \qquad B(\theta) = B_0 + \sum_{i=1}^{p} \theta_i B_i,   (7.3)
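The linear parameterization (7.3) is straightforward to realize numerically; a minimal sketch with illustrative matrices of our own choosing (not from the chapter):

```python
import numpy as np

# Linearly parameterized system matrix A(theta) = A0 + sum_i theta_i * A_i, cf. (7.3)
A0 = np.array([[0.0, 1.0],
               [0.0, 0.0]])
A_basis = [np.array([[0.0, 0.0], [1.0, 0.0]]),
           np.array([[0.0, 0.0], [0.0, 1.0]])]

def A_of(theta):
    """Assemble A(theta) from the known basis matrices."""
    return A0 + sum(th * Ai for th, Ai in zip(theta, A_basis))

print(A_of([2.0, -1.0]))
```

The same construction applies verbatim to B(θ).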

where θ_i is the ith element of θ and A_0, A_i, B_0, and B_i for i = 1, ..., p are known matrices. To stabilize the potentially unstable system (7.1)–(7.2), we make the following assumptions:

Assumption 7.1 The actuator delay satisfies

0 < \underline{D} \le D \le \bar{D},   (7.4)

where \underline{D} and \bar{D} are known lower and upper bounds of D, respectively. The plant parameter vector belongs to

\Theta = \{\theta \in R^p \mid P(\theta) \le 0\},   (7.5)

where P(·) : R^p → R is a known, convex, and smooth function and Θ is a convex set with a smooth boundary ∂Θ.

Assumption 7.2 The pair (A(θ), B(θ)) is stabilizable. In other words, there exists a matrix K(θ) such that A(θ) + B(θ)K(θ) is Hurwitz and

(A + BK)^T(\theta) P(\theta) + P(\theta)(A + BK)(\theta) = -Q(\theta)   (7.6)

with P(θ) = P(θ)^T > 0 and Q(θ) = Q(θ)^T > 0.

The system (7.1)–(7.2) is equivalent to the ODE–PDE cascade system

\dot{X}(t) = A(\theta) X(t) + B(\theta) u(0, t)   (7.7)
Y(t) = C X(t)   (7.8)
D u_t(x, t) = u_x(x, t), \quad x \in [0, 1]   (7.9)
u(1, t) = U(t),   (7.10)

where the PDE solution is

u(x, t) = U(t + D(x - 1)), \quad x \in [0, 1].   (7.11)
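That u(x, t) = U(t + D(x − 1)) solves the transport equation (7.9) with the boundary condition (7.10) can be verified symbolically; a minimal sketch with sympy:

```python
import sympy as sp

t, x, D = sp.symbols('t x D', positive=True)
U = sp.Function('U')

# Distributed-input representation (7.11)
u = U(t + D * (x - 1))

# Check the transport PDE D*u_t = u_x and the boundary condition u(1, t) = U(t)
pde_residual = sp.simplify(D * sp.diff(u, t) - sp.diff(u, x))
print("PDE residual:", pde_residual)   # 0
print("u(1, t)     :", u.subs(x, 1))   # U(t)
```

The delay thus appears only as the transport speed of a first-order hyperbolic PDE, which is what the backstepping designs below exploit.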

The ODE–PDE cascade system (7.7)–(7.10) representing the linear system with input delay may come with the following four types of basic uncertainties:

• unknown actuator delay D,
• unknown plant parameter θ,
• unmeasurable finite-dimensional plant state X(t), and
• unmeasurable infinite-dimensional actuator state u(x, t).


Fig. 7.1 Adaptive control for uncertain linear systems with input delays

The basic idea of certainty-equivalence-based adaptive control is to use an estimator (a parameter estimator or a state estimator) to replace the unknown quantities in the control law, as shown in Fig. 7.1.

7.2.1 Global Stabilization under Uncertain Plant State

From this section onward, different combinations of the four above uncertainties are taken into account. Accordingly, for each kind of uncertainty combination, a unique control scheme is proposed. First of all, we consider the observer-based stabilization when the finite-dimensional plant state X(t) is unmeasurable. Since θ is known in this section, for the sake of brevity, we denote A = A(θ), B = B(θ). We have the following assumption:

Assumption 7.3 The pair (A, C) is detectable, namely, there exists a matrix L ∈ R^{n×q} such that A − LC is Hurwitz and

(A - LC)^T P_L + P_L (A - LC) = -Q_L   (7.12)

with P_L^T = P_L > 0 and Q_L^T = Q_L > 0.

The control scheme is summarized in Table 7.2.

Theorem 7.1 Consider the closed-loop system consisting of the ODE–PDE cascade plant (7.13)–(7.16), the ODE-state observer (7.17), and the predictor feedback law (7.18). The origin is exponentially stable in the sense of the norm

\Big( |X(t)|^2 + |\hat{X}(t)|^2 + \int_0^1 u(x, t)^2\, dx \Big)^{1/2}.   (7.25)


Table 7.2 Global stabilization under uncertain plant state

The ODE–PDE cascade system:

\dot{X}(t) = A X(t) + B u(0, t)   (7.13)
Y(t) = C X(t)   (7.14)
D u_t(x, t) = u_x(x, t), \quad x \in [0, 1]   (7.15)
u(1, t) = U(t)   (7.16)

The ODE-state observer:

\dot{\hat{X}}(t) = A \hat{X}(t) + B u(0, t) + L\big( Y(t) - C \hat{X}(t) \big)   (7.17)

The predictor feedback law:

U(t) = u(1, t) = K\Big( e^{AD} \hat{X}(t) + D \int_0^1 e^{AD(1-y)} B u(y, t)\, dy \Big)   (7.18)

The invertible backstepping transformation:

w(x, t) = u(x, t) - K\Big( e^{ADx} \hat{X}(t) + D \int_0^x e^{AD(x-y)} B u(y, t)\, dy \Big)   (7.19)
u(x, t) = w(x, t) + K\Big( e^{(A+BK)Dx} \hat{X}(t) + D \int_0^x e^{(A+BK)D(x-y)} B w(y, t)\, dy \Big)   (7.20)

The closed-loop target system:

\dot{\hat{X}}(t) = (A + BK) \hat{X}(t) + B w(0, t) + LC \tilde{X}(t),   (7.21)
\dot{\tilde{X}}(t) = (A - LC) \tilde{X}(t),   (7.22)
D w_t(x, t) = w_x(x, t) - D K e^{ADx} LC \tilde{X}(t), \quad x \in [0, 1]   (7.23)
w(1, t) = 0   (7.24)

where \tilde{X}(t) = X(t) - \hat{X}(t).

Proof The proof is found in [15, Chap. 3].
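To illustrate the structure of the predictor feedback (7.18) in the simplest setting, the sketch below simulates a scalar unstable plant with full state measurement (so the observer reduces to the state itself); all numerical values are our own illustrative choices:

```python
import numpy as np

# Scalar plant x' = a*x + b*u(t - D), with predictor feedback u = K * X(t + D),
# where X(t+D) = e^{aD} x(t) + int_0^D e^{a(D-s)} b u(t-D+s) ds, cf. (7.18).
a, b, D, K = 1.0, 1.0, 0.5, -3.0       # a + b*K = -2 < 0
dt = 1e-3
n = int(D / dt)
u_hist = np.zeros(n)                   # input history over [t - D, t)
s = dt * np.arange(n)
x = 1.0
for _ in range(int(10.0 / dt)):
    # D-seconds-ahead state prediction via the variation-of-constants formula
    pred = np.exp(a * D) * x + dt * np.sum(np.exp(a * (D - s)) * b * u_hist)
    u_new = K * pred
    x += dt * (a * x + b * u_hist[0])  # the delayed input drives the plant
    u_hist = np.append(u_hist[1:], u_new)
print(f"x(10) = {x:.3e}")
```

After the initial dead time of length D, the closed loop behaves like x' = (a + bK)x and the state decays exponentially.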

7.2.2 Global Stabilization Under Uncertain Delay

In this section, we consider the adaptive stabilization when the input delay D is unknown. The control scheme is summarized in Table 7.3.


Table 7.3 Global stabilization under uncertain input delay

The ODE–PDE cascade system:

\dot{X}(t) = A X(t) + B u(0, t)   (7.26)
D u_t(x, t) = u_x(x, t), \quad x \in [0, 1]   (7.27)
u(1, t) = U(t)   (7.28)

The predictor feedback law:

U(t) = u(1, t) = K\Big( e^{A\hat{D}(t)} X(t) + \hat{D}(t) \int_0^1 e^{A\hat{D}(t)(1-y)} B u(y, t)\, dy \Big)   (7.29)

The invertible backstepping transformation:

w(x, t) = u(x, t) - K\Big( e^{A\hat{D}(t)x} X(t) + \hat{D}(t) \int_0^x e^{A\hat{D}(t)(x-y)} B u(y, t)\, dy \Big)   (7.30)
u(x, t) = w(x, t) + K\Big( e^{(A+BK)\hat{D}(t)x} X(t) + \hat{D}(t) \int_0^x e^{(A+BK)\hat{D}(t)(x-y)} B w(y, t)\, dy \Big)   (7.31)

The delay update law:

\dot{\hat{D}}(t) = \gamma_D \mathrm{Proj}_{[\underline{D}, \bar{D}]}\{\tau_D(t)\}, \quad \hat{D}(0) \in [\underline{D}, \bar{D}],   (7.32)
\tau_D(t) = -\frac{\int_0^1 (1 + x) w(x, t) K e^{A\hat{D}(t)x}\, dx\, \big( A X(t) + B u(0, t) \big)}{1 + X(t)^T P X(t) + g \int_0^1 (1 + x) w(x, t)^2\, dx}   (7.33)

The closed-loop target system:

\dot{X}(t) = (A + BK) X(t) + B w(0, t),   (7.34)
D w_t(x, t) = w_x(x, t) - \tilde{D}(t)\, p(x, t) - D \dot{\hat{D}}(t)\, q(x, t), \quad x \in [0, 1]   (7.35)
w(1, t) = 0   (7.36)

where \tilde{D}(t) = D - \hat{D}(t),

p(x, t) = K e^{A\hat{D}(t)x} \big( A X(t) + B u(0, t) \big) = K e^{A\hat{D}(t)x} \big( (A + BK) X(t) + B w(0, t) \big)   (7.37)
q(x, t) = K A x e^{A\hat{D}(t)x} X(t) + \int_0^x K \big( I + A\hat{D}(t)(x - y) \big) e^{A\hat{D}(t)(x-y)} B u(y, t)\, dy   (7.38)


Theorem 7.2 Consider the closed-loop system consisting of the ODE–PDE cascade plant (7.26)–(7.28), the predictor feedback law (7.29), and the delay update law (7.32), (7.33). The zero solution is stable in the sense of the norm

\Big( |X(t)|^2 + \int_0^1 u(x, t)^2\, dx + |D - \hat{D}(t)|^2 \Big)^{1/2}   (7.39)

and the convergence \lim_{t \to \infty} X(t) = 0 and \lim_{t \to \infty} U(t) = 0 is achieved.

Proof The proof is found in [15, Chap. 7].
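The projection in the delay update law (7.32) simply freezes the adaptation when the estimate sits on the boundary of [D̲, D̄] and the raw update would push it outside; a minimal scalar sketch (our own implementation of this standard operator):

```python
def proj(D_hat, tau, D_lo, D_hi):
    """Zero the update when it points out of the admissible interval [D_lo, D_hi]."""
    if (D_hat <= D_lo and tau < 0.0) or (D_hat >= D_hi and tau > 0.0):
        return 0.0
    return tau

# The estimate can move back into the interior but never leave the interval.
print(proj(1.0, -0.5, 1.0, 2.0))   # 0.0  (blocked at the lower bound)
print(proj(1.0, +0.5, 1.0, 2.0))   # 0.5  (allowed, points inward)
print(proj(2.0, +0.5, 1.0, 2.0))   # 0.0  (blocked at the upper bound)
```

Together with Assumption 7.1, this keeps D̂(t) ∈ [D̲, D̄] for all time, which the Lyapunov analysis relies on.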

7.2.3 Local Stabilization Under Uncertain Delay and Actuator State

In this section, we consider the adaptive stabilization when the input delay D is unknown and the actuator state u(x, t) is unmeasurable. The delay-adaptive problem without the measurement of u(x, t) is unsolvable globally because it cannot be formulated as linearly parameterized in the unknown delay D. That is to say, when the controller uses an estimate of u(x, t), not only do the initial values of the plant state and the actuator state have to be small, but the initial value of the delay estimation error also has to be small (the delay value is allowed to be large but the initial value of its estimate has to be close to the true value of the delay). The control scheme is summarized in Table 7.4.

Theorem 7.3 Consider the closed-loop system consisting of the ODE–PDE cascade plant (7.40)–(7.42), the PDE-state observer (7.43), (7.44), the predictor feedback law (7.45), and the delay update law (7.48), (7.49). The zero solution is stable in the sense of the norm

\Big( |X(t)|^2 + \int_0^1 u(x, t)^2\, dx + \int_0^1 \hat{u}(x, t)^2\, dx + \int_0^1 \hat{u}_x(x, t)^2\, dx + |D - \hat{D}(t)|^2 \Big)^{1/2}   (7.55)

and the convergence \lim_{t \to \infty} X(t) = 0 and \lim_{t \to \infty} U(t) = 0 is achieved, if there exists M > 0 such that the initial condition

\Big( |X(0)|^2 + \int_0^1 u(x, 0)^2\, dx + \int_0^1 \hat{u}(x, 0)^2\, dx + \int_0^1 \hat{u}_x(x, 0)^2\, dx + |D - \hat{D}(0)|^2 \Big)^{1/2} \le M   (7.56)

is satisfied.

Proof The proof is found in [15, Chap. 8] and [9].


Table 7.4 Local stabilization under uncertain input delay and actuator state

The ODE–PDE cascade system:

\dot{X}(t) = A X(t) + B u(0, t)   (7.40)
D u_t(x, t) = u_x(x, t), \quad x \in [0, 1]   (7.41)
u(1, t) = U(t)   (7.42)

The PDE-state observer:

\hat{D}(t)\, \hat{u}_t(x, t) = \hat{u}_x(x, t) + \dot{\hat{D}}(t)(x - 1)\, \hat{u}_x(x, t), \quad x \in [0, 1]   (7.43)
\hat{u}(1, t) = U(t)   (7.44)

where \hat{u}(x, t) = U\big( t + \hat{D}(t)(x - 1) \big), x \in [0, 1].

The predictor feedback law:

U(t) = \hat{u}(1, t) = K\Big( e^{A\hat{D}(t)} X(t) + \hat{D}(t) \int_0^1 e^{A\hat{D}(t)(1-y)} B \hat{u}(y, t)\, dy \Big)   (7.45)

The invertible backstepping transformation:

\hat{w}(x, t) = \hat{u}(x, t) - K\Big( e^{A\hat{D}(t)x} X(t) + \hat{D}(t) \int_0^x e^{A\hat{D}(t)(x-y)} B \hat{u}(y, t)\, dy \Big)   (7.46)
\hat{u}(x, t) = \hat{w}(x, t) + K\Big( e^{(A+BK)\hat{D}(t)x} X(t) + \hat{D}(t) \int_0^x e^{(A+BK)\hat{D}(t)(x-y)} B \hat{w}(y, t)\, dy \Big)   (7.47)

The delay update law:

\dot{\hat{D}}(t) = \gamma_D \mathrm{Proj}_{[\underline{D}, \bar{D}]}\{\tau_D(t)\}, \quad \hat{D}(0) \in [\underline{D}, \bar{D}],   (7.48)
\tau_D(t) = -\int_0^1 (1 + x) \hat{w}(x, t) K e^{A\hat{D}(t)x}\, dx\, \big( A X(t) + B \hat{u}(0, t) \big)   (7.49)

The closed-loop target system:

\dot{X}(t) = (A + BK) X(t) + B \hat{w}(0, t) + B \tilde{u}(0, t)   (7.50)
D \tilde{u}_t(x, t) = \tilde{u}_x(x, t) - \tilde{D}(t)\, r(x, t) - D \dot{\hat{D}}(t)(x - 1)\, r(x, t)   (7.51)
\tilde{u}(1, t) = 0   (7.52)
\hat{D}(t)\, \hat{w}_t(x, t) = \hat{w}_x(x, t) - \hat{D}(t) \dot{\hat{D}}(t)\, s(x, t) - \hat{D}(t) K e^{A\hat{D}(t)x} B \tilde{u}(0, t)   (7.53)
\hat{w}(1, t) = 0   (7.54)

where \tilde{D}(t) = D - \hat{D}(t), \tilde{u}(x, t) = u(x, t) - \hat{u}(x, t).


7.3 Observer-Based Adaptive Control for Linear Systems with Discrete Input Delays

In this section, we deal with a more challenging uncertainty collection (D, X(t), θ). When the state X(t) and the parameter θ in the finite-dimensional plant are unknown simultaneously, the relative degree plays an important role in the output-feedback problem. Consider single-input single-output uncertain linear systems with input delay

Y(s) = \frac{b_m s^m + \cdots + b_1 s + b_0}{s^n + a_{n-1} s^{n-1} + \cdots + a_1 s + a_0}\, e^{-Ds} U(s)   (7.57)

which is of the observer canonical form:

\dot{X}(t) = A X(t) - a Y(t) + \begin{bmatrix} 0_{(\rho-1)\times 1} \\ b \end{bmatrix} U(t - D)
Y(t) = e_1^T X(t),   (7.58)

where

A = \begin{bmatrix} 0_{(n-1)\times 1} & I_{n-1} \\ 0 & 0_{1\times(n-1)} \end{bmatrix}, \quad a = \begin{bmatrix} a_{n-1} \\ \vdots \\ a_0 \end{bmatrix}, \quad b = \begin{bmatrix} b_m \\ \vdots \\ b_0 \end{bmatrix}   (7.59)
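The observer canonical matrices in (7.59) can be assembled mechanically from the transfer-function coefficients; a minimal sketch for an illustrative n = 3, m = 1 plant (the coefficients here are our own, not the chapter's):

```python
import numpy as np

n, m = 3, 1
rho = n - m                                  # relative degree
a_vec = np.array([2.0, 1.0, 0.5])            # [a_{n-1}, ..., a_0]
b_vec = np.array([1.5, 1.0])                 # [b_m, ..., b_0]

# A of (7.59): zero first column, identity shifted into the upper-right block
A = np.hstack([np.zeros((n, 1)),
               np.vstack([np.eye(n - 1), np.zeros((1, n - 1))])])
# Input vector [0_{(rho-1) x 1}; b] of (7.58)
B_col = np.concatenate([np.zeros(rho - 1), b_vec])

print(A)
print(B_col)
```

The rho − 1 leading zeros in the input vector are exactly what encodes the relative degree in this coordinate choice.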

and e_i for i = 1, 2, ... is the ith coordinate vector; ρ denotes the relative degree satisfying ρ = n − m; X(t) = [X_1(t), X_2(t), ..., X_n(t)]^T ∈ R^n is the plant state, which is unavailable for measurement; Y(t) ∈ R is the measurable output; U(t − D) ∈ R is the control input with an unknown constant time delay D; and a_{n−1}, ..., a_0 and b_m, ..., b_0 are unknown constant plant parameters and control coefficients, respectively. The system (7.58) is written compactly as

\dot{X}(t) = A X(t) + F\big( U(t - D), Y(t) \big)^T \theta
Y(t) = e_1^T X(t),   (7.60)

where the p = n + m + 1-dimensional parameter vector θ is defined by

\theta = \begin{bmatrix} b \\ a \end{bmatrix}   (7.61)

and

F\big( U(t - D), Y(t) \big)^T = \begin{bmatrix} \begin{bmatrix} 0_{(\rho-1)\times(m+1)} \\ I_{m+1} \end{bmatrix} U(t - D), & -I_n Y(t) \end{bmatrix}.   (7.62)

Several assumptions concerning the system (7.57)–(7.58) are given.


Assumption 7.4 The plant (7.57) is minimum-phase, i.e., the polynomial B(s) = b_m s^m + ··· + b_1 s + b_0 is Hurwitz.

Assumption 7.5 There exist two known constants \underline{D} > 0 and \bar{D} > 0 such that D ∈ [\underline{D}, \bar{D}]. The high-frequency gain's sign sgn(b_m) is known and a constant \underline{b}_m is known such that |b_m| ≥ \underline{b}_m > 0. Furthermore, θ belongs to a convex compact set Θ = {θ ∈ R^p | P(θ) ≤ 0}, where P : R^p → R is a smooth convex function.

The control purpose is to let the output Y(t) asymptotically track a time-varying reference signal Y_r(t) which satisfies the assumption given below.

Assumption 7.6 In the case of known θ, given a time-varying reference output trajectory Y_r(t) which is known, bounded, and smooth, there exist a known reference state signal X^r(t, θ) and a known reference input signal U^r(t, θ) which are bounded in t, continuously differentiable in the argument θ, and satisfy

\dot{X}^r(t, \theta) = A X^r(t, \theta) + F\big( U^r(t, \theta), Y_r(t) \big)^T \theta
Y_r(t) = e_1^T X^r(t, \theta).   (7.63)

We introduce the distributed input

u(x, t) = U(t + D(x − 1)),  x ∈ [0, 1],   (7.64)

where the measurable actuator state is governed by the transport PDE

D u_t(x, t) = u_x(x, t)   (7.65)
u(1, t) = U(t).   (7.66)
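As a quick sanity check of the representation (7.64)–(7.66) — not part of the chapter, and with illustrative values — the transport PDE can be discretized by an upwind scheme at unit CFL number, in which case the solution is shifted exactly along characteristics and the outlet value u(0, t) reproduces the delayed input U(t − D):

```python
# Sketch (illustrative): the transport PDE D*u_t = u_x with boundary
# condition u(1, t) = U(t) carries the input history, so u(0, t) = U(t - D).
# With grid spacing 1/N and time step dt = D/N (CFL = 1), the upwind update
# is an exact shift along characteristics.
import math

D = 1.0                              # delay (taken as known here)
N = 200                              # spatial grid points on [0, 1]
dt = D / N                           # CFL = 1 time step
U = lambda t: math.sin(2.0 * t)      # an arbitrary smooth input signal

u = [0.0] * (N + 1)                  # u[i] ~ u(i/N, t); zero initial history
t = 0.0
while t < 3.0:
    u = u[1:] + [U(t + dt)]          # shift toward x = 0; feed U at x = 1
    t += dt

err = abs(u[0] - U(t - D))           # outlet vs. delayed input (for t > D)
```

After one delay interval has elapsed, the outlet matches the delayed input up to floating-point accumulation.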

To estimate the unmeasurable ODE state, we employ the Kreisselmeier filters (K-filters)

η̇(t) = A_0 η(t) + e_n Y(t)   (7.67)
λ̇(t) = A_0 λ(t) + e_n U(t − D)   (7.68)
ξ(t) = −A_0^n η(t)   (7.69)
Ξ(t) = −[A_0^{n−1} η(t), ⋯, A_0 η(t), η(t)]   (7.70)
υ_j(t) = A_0^j λ(t),  j = 0, 1, ..., m   (7.71)
Ω(t)^T = [υ_m(t), ⋯, υ_1(t), υ_0(t), Ξ(t)],   (7.72)

where k = [k_1, k_2, ⋯, k_n]^T is chosen so that the matrix A_0 = A − k e_1^T is Hurwitz, i.e., A_0^T P + P A_0 = −I, P = P^T > 0. The unmeasurable ODE state X(t) is virtually estimated as X̂(t) = ξ(t) + Ω(t)^T θ and the estimation error ε(t) = X(t) − X̂(t) vanishes exponentially, since

ε̇(t) = A_0 ε(t).   (7.73)
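The filters above use only the measured signals Y and U, yet combine into an exponentially convergent state estimate. The sketch below — a hypothetical second-order example with n = 2, m = 0, illustrative parameter values, and the delay set to zero since only the estimation property is examined — checks that X̂(t) = ξ(t) + Ω(t)^T θ converges to X(t), consistently with (7.73):

```python
# Sketch (illustrative, n = 2, m = 0): Euler-simulate the plant in observer
# canonical form and the K-filters; verify Xhat = xi + Omega^T * theta
# converges to the unmeasured state X. theta = [b0, a1, a0] is used only
# for verification, never inside the filters.
import math

dt, T = 1e-3, 12.0
a1, a0, b0 = 1.0, 2.0, 1.0       # "unknown" plant parameters (check only)
k1, k2 = 3.0, 2.0                # A0 = A - k*e1^T has eigenvalues -1, -2

def euler(v, dv):                # one explicit Euler step for a 2-vector
    return [v[0] + dt * dv[0], v[1] + dt * dv[1]]

X, eta, lam = [1.0, -1.0], [0.0, 0.0], [0.0, 0.0]
t = 0.0
while t < T:
    U, Y = math.sin(t), X[0]
    # plant: Xdot = A*X - a*Y + [0; b0]*U  (delay-free input here)
    X = euler(X, [X[1] - a1 * Y, -a0 * Y + b0 * U])
    # K-filters: eta_dot = A0*eta + e2*Y,  lam_dot = A0*lam + e2*U
    eta = euler(eta, [-k1 * eta[0] + eta[1], -k2 * eta[0] + Y])
    lam = euler(lam, [-k1 * lam[0] + lam[1], -k2 * lam[0] + U])
    t += dt

# xi = -A0^2*eta and Xi = -[A0*eta, eta]; then Xhat = xi + Omega^T*theta
A0eta = [-k1 * eta[0] + eta[1], -k2 * eta[0]]
xi = [k1 * A0eta[0] - A0eta[1], k2 * A0eta[0]]
Xhat = [xi[i] + b0 * lam[i] - a1 * A0eta[i] - a0 * eta[i] for i in range(2)]
err = max(abs(X[0] - Xhat[0]), abs(X[1] - Xhat[1]))
```

The residual err decays like e^{−t}, reflecting the error dynamics ε̇ = A_0 ε.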


M. Krstic and Y. Zhu

Thus, we get the static relationship

X(t) = ξ(t) + Ω(t)^T θ + ε(t)   (7.74)
     = −A(A_0) η(t) + B(A_0) λ(t) + ε(t),   (7.75)

where A(A_0) = A_0^n + Σ_{i=0}^{n−1} a_i A_0^i and B(A_0) = Σ_{i=0}^{m} b_i A_0^i.

According to (7.63), when θ is known, we can use reference K-filters η^r(t) and λ^r(t, θ) to reproduce the reference output Y_r(t). When θ is unknown, by the certainty-equivalence principle, we have

η̇^r(t) = A_0 η^r(t) + e_n Y_r(t)   (7.76)
λ̇^r(t, θ̂) = A_0 λ^r(t, θ̂) + e_n U^r(t, θ̂) + (∂λ^r(t, θ̂)/∂θ̂) θ̂̇   (7.77)
ξ^r(t) = −A_0^n η^r(t)   (7.78)
Ξ^r(t) = −[A_0^{n−1} η^r(t), ⋯, A_0 η^r(t), η^r(t)]   (7.79)
υ_j^r(t, θ̂) = A_0^j λ^r(t, θ̂),  j = 0, 1, ..., m   (7.80)
Ω^r(t, θ̂)^T = [υ_m^r(t, θ̂), ⋯, υ_1^r(t, θ̂), υ_0^r(t, θ̂), Ξ^r(t)]   (7.81)
X^r(t, θ̂) = ξ^r(t) + Ω^r(t, θ̂)^T θ̂   (7.82)
           = −Â(A_0) η^r(t) + B̂(A_0) λ^r(t, θ̂)   (7.83)
Y_r(t) = e_1^T X^r(t, θ̂),   (7.84)

where θ̂(t) is the estimate of the unknown parameter θ with θ̃(t) = θ − θ̂(t), and Â(A_0) = A_0^n + Σ_{i=0}^{n−1} â_i A_0^i, B̂(A_0) = Σ_{i=0}^{m} b̂_i A_0^i. Several error variables are defined as follows:

z_1(t) = Y(t) − Y_r(t)   (7.85)
Ũ(t − D) = U(t − D) − U^r(t, θ̂)   (7.86)
η̃(t) = η(t) − η^r(t)   (7.87)
λ̃(t) = λ(t) − λ^r(t, θ̂)   (7.88)
ξ̃(t) = ξ(t) − ξ^r(t)   (7.89)
Ξ̃(t) = Ξ(t) − Ξ^r(t)   (7.90)
υ̃_j(t) = υ_j(t) − υ_j^r(t, θ̂),  j = 0, 1, ..., m   (7.91)
Ω̃(t)^T = Ω(t)^T − Ω^r(t, θ̂)^T,   (7.92)

which are governed by the following dynamic equations:

η̃̇(t) = A_0 η̃(t) + e_n z_1(t)   (7.93)
λ̃̇(t) = A_0 λ̃(t) + e_n Ũ(t − D) − (∂λ^r(t, θ̂)/∂θ̂) θ̂̇   (7.94)
ż_1(t) = ξ̃_2(t) + ω̃(t)^T θ̂ + ε_2(t) + ω(t)^T θ̃   (7.95)
       = b̂_m υ̃_{m,2}(t) + ξ̃_2(t) + ω̄̃(t)^T θ̂ + ε_2(t) + ω(t)^T θ̃,   (7.96)

where

ω̃(t) = [υ̃_{m,2}(t), υ̃_{m−1,2}(t), ⋯, υ̃_{0,2}(t), Ξ̃_2(t) − z_1(t) e_1^T]^T   (7.97)
ω̄̃(t) = [0, υ̃_{m−1,2}(t), ⋯, υ̃_{0,2}(t), Ξ̃_2(t) − z_1(t) e_1^T]^T   (7.98)
ω(t) = [υ_{m,2}(t), υ_{m−1,2}(t), ⋯, υ_{0,2}(t), Ξ_2(t) − Y(t) e_1^T]^T.   (7.99)

Then a couple of new variables are further defined as

χ(t) = [Y(t); η(t); λ(t)],  χ^r(t, θ̂) = [Y_r(t); η^r(t); λ^r(t, θ̂)].   (7.100)

Thus, a new error variable is derived,

χ̃(t) = χ(t) − χ^r(t, θ̂) = [z_1(t); η̃(t); λ̃(t)] ∈ R^{2n+1},   (7.101)

which is driven by

χ̃̇(t) = A_χ̃(θ̂) χ̃(t) + e_{2n+1} Ũ(t − D) + e_1 (ε_2(t) + ω(t)^T θ̃) − (∂χ^r(t, θ̂)/∂θ̂) θ̂̇,   (7.102)

where

A_χ̃(θ̂) = [ −â_{n−1}, −e_2^T Â(A_0), e_2^T B̂(A_0);  e_n, A_0, 0_{n×n};  0_{n×1}, 0_{n×n}, A_0 ],
∂χ^r(t, θ̂)/∂θ̂ = [ 0_{(1+n)×p};  ∂λ^r(t, θ̂)/∂θ̂ ].   (7.103)

Next, similar to the transformation in [18, pp. 435–436], we bring in the dynamic equation for the m-dimensional inverse dynamics ζ(t) = T X(t) of (7.58) and its reference signal ζ^r(t) as

ζ̇(t) = A_b ζ(t) + b_b Y(t)   (7.104)
ζ̇^r(t) = A_b ζ^r(t) + b_b Y_r(t),   (7.105)

where

A_b = [ [−b_{m−1}/b_m; ⋮; −b_0/b_m], [I_{m−1}; 0_{1×(m−1)}] ],
b_b = T ( A^ρ [0_{(ρ−1)×1}; b/b_m] − a ),  T = [A_b^ρ e_1, ⋯, A_b e_1, I_m];   (7.106)

thus the error state

ζ̃(t) = ζ(t) − ζ^r(t)   (7.107)

is driven by

ζ̃̇(t) = A_b ζ̃(t) + b_b z_1(t).   (7.108)

Under Assumption 7.4, A_b is Hurwitz, i.e., P_b A_b + A_b^T P_b = −I, P_b = P_b^T > 0. Aiming at the system

ż_1 = b̂_m υ̃_{m,2} + ξ̃_2 + ω̄̃^T θ̂ + ε_2 + ω^T θ̃   (7.109)
υ̃̇_{m,i} = υ̃_{m,i+1} − k_i υ̃_{m,1} − e_i^T A_0^m (∂λ^r(t, θ̂)/∂θ̂) θ̂̇,  i = 2, 3, ..., ρ − 1   (7.110)
υ̃̇_{m,ρ} = Ũ(t − D) + υ̃_{m,ρ+1} − k_ρ υ̃_{m,1} − e_ρ^T A_0^m (∂λ^r(t, θ̂)/∂θ̂) θ̂̇,   (7.111)

we present the adaptive backstepping recursive control design below.

Coordinate Transformation:

z_1 = Y − Y_r   (7.112)
z_i = υ̃_{m,i} − α_{i−1},  i = 2, 3, ..., ρ.   (7.113)

Stabilizing Functions:

α_1 = (1/b̂_m) ( −(c_1 + d_1) z_1 − ξ̃_2 − ω̄̃^T θ̂ )   (7.114)
α_2 = −b̂_m z_1 − ( c_2 + d_2 (∂α_1/∂z_1)² ) z_2 + β_2   (7.115)
α_i = −z_{i−1} − ( c_i + d_i (∂α_{i−1}/∂z_1)² ) z_i + β_i,  i = 3, ..., ρ   (7.116)
β_i = k_i υ̃_{m,1} + (∂α_{i−1}/∂z_1)(ξ̃_2 + ω̃^T θ̂) + (∂α_{i−1}/∂η̃)(A_0 η̃ + e_n z_1)
      + Σ_{j=1}^{m+i−1} (∂α_{i−1}/∂λ̃_j)(λ̃_{j+1} − k_j λ̃_1),  i = 2, ..., ρ,   (7.117)

where c_i > 0, d_i > 0 for i = 1, 2, ..., ρ are design parameters.

Adaptive Control Law:

Ũ(t − D) = −υ̃_{m,ρ+1} + α_ρ.   (7.118)

Note that α_i for i = 1, 2, ..., ρ are linear in z_1, η̃, λ̃ but nonlinear in θ̂. Thus, through a recursive but straightforward calculation, we obtain the following equalities:

z_2 = K_{2,z_1}(θ̂) z_1 + K_{2,η̃}(θ̂) η̃ + K_{2,λ̃}(θ̂) λ̃   (7.119)
z_3 = K_{3,z_1}(θ̂) z_1 + K_{3,η̃}(θ̂) η̃ + K_{3,λ̃}(θ̂) λ̃   (7.120)
z_{i+1} = K_{i+1,z_1}(θ̂) z_1 + K_{i+1,η̃}(θ̂) η̃ + K_{i+1,λ̃}(θ̂) λ̃,  i = 3, ..., ρ − 1   (7.121)
Ũ(t − D) = K_{z_1}(θ̂) z_1(t) + K_{η̃}(θ̂) η̃(t) + K_{λ̃}(θ̂) λ̃(t) = K_χ̃(θ̂) χ̃(t),   (7.122)

where the explicit expressions of K_{i,z_1}(θ̂), K_{i,η̃}(θ̂), K_{i,λ̃}(θ̂) for i = 2, ..., ρ and K_{z_1}(θ̂), K_{η̃}(θ̂), K_{λ̃}(θ̂) are as follows:

K_{2,z_1}(θ̂) = (1/b̂_m)(c_1 + d_1 − â_{n−1})   (7.123)
K_{2,η̃}(θ̂) = −(1/b̂_m) e_2^T Â(A_0)   (7.124)
K_{2,λ̃}(θ̂) = (1/b̂_m) e_2^T B̂(A_0)   (7.125)
K_{3,z_1}(θ̂) = b̂_m + ( c_2 + d_2 (∂α_1/∂z_1)² ) K_{2,z_1}(θ̂) + (∂α_1/∂z_1) â_{n−1} − (∂α_1/∂η̃) e_n   (7.126)
K_{3,η̃}(θ̂) = ( c_2 + d_2 (∂α_1/∂z_1)² ) K_{2,η̃}(θ̂) + (∂α_1/∂z_1) e_2^T Â(A_0) − (∂α_1/∂η̃) A_0   (7.127)
K_{3,λ̃}(θ̂) = ( c_2 + d_2 (∂α_1/∂z_1)² ) K_{2,λ̃}(θ̂) + e_2^T A_0^{m+1} − (∂α_1/∂z_1) e_2^T B̂(A_0)
              − Σ_{j=1}^{m+1} (∂α_1/∂λ̃_j) e_j^T A_0   (7.128)
K_{i+1,z_1}(θ̂) = K_{i−1,z_1}(θ̂) + ( c_i + d_i (∂α_{i−1}/∂z_1)² ) K_{i,z_1}(θ̂) + (∂α_{i−1}/∂z_1) â_{n−1} − (∂α_{i−1}/∂η̃) e_n   (7.129)
K_{i+1,η̃}(θ̂) = K_{i−1,η̃}(θ̂) + ( c_i + d_i (∂α_{i−1}/∂z_1)² ) K_{i,η̃}(θ̂) + (∂α_{i−1}/∂z_1) e_2^T Â(A_0) − (∂α_{i−1}/∂η̃) A_0   (7.130)
K_{i+1,λ̃}(θ̂) = K_{i−1,λ̃}(θ̂) + ( c_i + d_i (∂α_{i−1}/∂z_1)² ) K_{i,λ̃}(θ̂) + e_i^T A_0^{m+1} − (∂α_{i−1}/∂z_1) e_2^T B̂(A_0)
                − Σ_{j=1}^{m+i−1} (∂α_{i−1}/∂λ̃_j) e_j^T A_0,  i = 3, 4, ..., ρ − 1   (7.131)
K_{z_1}(θ̂) = −[ K_{ρ−1,z_1}(θ̂) + ( c_ρ + d_ρ (∂α_{ρ−1}/∂z_1)² ) K_{ρ,z_1}(θ̂) + (∂α_{ρ−1}/∂z_1) â_{n−1} − (∂α_{ρ−1}/∂η̃) e_n ]   (7.132)
K_{η̃}(θ̂) = −[ K_{ρ−1,η̃}(θ̂) + ( c_ρ + d_ρ (∂α_{ρ−1}/∂z_1)² ) K_{ρ,η̃}(θ̂) + (∂α_{ρ−1}/∂z_1) e_2^T Â(A_0) − (∂α_{ρ−1}/∂η̃) A_0 ]   (7.133)
K_{λ̃}(θ̂) = −[ K_{ρ−1,λ̃}(θ̂) + ( c_ρ + d_ρ (∂α_{ρ−1}/∂z_1)² ) K_{ρ,λ̃}(θ̂) + e_ρ^T A_0^{m+1} − (∂α_{ρ−1}/∂z_1) e_2^T B̂(A_0)
            − Σ_{j=1}^{m+ρ−1} (∂α_{ρ−1}/∂λ̃_j) e_j^T A_0 ]   (7.134)

and

K_χ̃(θ̂) = [K_{z_1}(θ̂), K_{η̃}(θ̂), K_{λ̃}(θ̂)] ∈ R^{1×(2n+1)}.   (7.135)

If the parameter θ and the time delay D were known, replacing θ̂ by θ and bearing (7.64) and (7.122) in mind, one can prove that the prediction-based control law

U(t) = U^r(t + D, θ) − K_χ̃(θ) χ^r(t + D, θ) + K_χ̃(θ) e^{A_χ̃(θ) D} χ(t) + D ∫_0^1 K_χ̃(θ) e^{A_χ̃(θ) D (1−y)} e_{2n+1} u(y, t) dy   (7.136)

achieves the control objective for system (7.58). To deal with the unknown plant parameters and the unknown actuator delay, utilizing the certainty-equivalence principle, we bring in the reference transport PDE u^r(x, t, θ̂) = U^r(t + D̂x, θ̂), x ∈ [0, 1], and the corresponding PDE error variable ũ(x, t) = u(x, t) − u^r(x, t, θ̂) satisfying

D ũ_t(x, t) = ũ_x(x, t) − D (∂u^r(x, t, θ̂)/∂θ̂) θ̂̇,  x ∈ [0, 1]   (7.137)
ũ(1, t) = Ũ(t) = U(t) − U^r(t + D̂, θ̂),   (7.138)

where D̂(t) is an estimate of D and D̃(t) = D − D̂(t). Here we further bring in a backstepping transformation similar to those presented in [8] and [15, Sect. 2.2],

w(x, t) = ũ(x, t) − K_χ̃(θ̂) e^{A_χ̃(θ̂) D̂ x} χ̃(t) − D̂ ∫_0^x K_χ̃(θ̂) e^{A_χ̃(θ̂) D̂ (x−y)} e_{2n+1} ũ(y, t) dy,   (7.139)

with inverse

ũ(x, t) = w(x, t) + K_χ̃(θ̂) e^{(A_χ̃ + e_{2n+1} K_χ̃)(θ̂) D̂ x} χ̃(t) + D̂ ∫_0^x K_χ̃(θ̂) e^{(A_χ̃ + e_{2n+1} K_χ̃)(θ̂) D̂ (x−y)} e_{2n+1} w(y, t) dy,   (7.140)

with which the systems (7.73), (7.93), (7.108), (7.109)–(7.111), and (7.137), (7.138) are transformed into the closed-loop target error system

ż = A_z(θ̂) z + W_ε(θ̂)(ε_2 + ω^T θ̃) + Q(z, t)^T θ̂̇ + Q^r(t, θ̂)^T θ̂̇ + e_ρ w(0, t)   (7.141)
η̃̇ = A_0 η̃ + e_n z_1   (7.142)
ζ̃̇ = A_b ζ̃ + b_b z_1   (7.143)
ε̇ = A_0 ε   (7.144)
D w_t(x, t) = w_x(x, t) − D (ε_2 + ω^T θ̃) r_0(x, t) − D̃ p_0(x, t) − D D̂̇ q_0(x, t) − D θ̂̇^T q(x, t)   (7.145)
w(1, t) = 0,   (7.146)

where A_z(θ̂), W_ε(θ̂), Q(z, t)^T, Q^r(t, θ̂)^T, r_0(x, t), p_0(x, t), q_0(x, t), and q(x, t) are listed as follows:

A_z(θ̂) = [ −(c_1 + d_1),  b̂_m,  0,  ⋯,  0;
            −b̂_m,  −(c_2 + d_2 (∂α_1/∂z_1)²),  1,  ⋱,  ⋮;
             0,  −1,  ⋱,  ⋱,  0;
             ⋮,  ⋱,  ⋱,  ⋱,  1;
             0,  ⋯,  0,  −1,  −(c_ρ + d_ρ (∂α_{ρ−1}/∂z_1)²) ]   (7.147)


W_ε(θ̂) = [1; −∂α_1/∂z_1; ⋮; −∂α_{ρ−1}/∂z_1],  Q(z, t)^T = [0; −∂α_1/∂θ̂; ⋮; −∂α_{ρ−1}/∂θ̂],
Q^r(t, θ̂)^T = [ 0;
  −( e_2^T A_0^m − Σ_{j=1}^{m+1} (∂α_1/∂λ̃_j) e_j^T ) ∂λ^r(t, θ̂)/∂θ̂;
  ⋮;
  −( e_ρ^T A_0^m − Σ_{j=1}^{m+ρ−1} (∂α_{ρ−1}/∂λ̃_j) e_j^T ) ∂λ^r(t, θ̂)/∂θ̂ ],   (7.148)

r_0(x, t) = K_χ̃(θ̂) e^{A_χ̃(θ̂) D̂ x} e_1   (7.149)

p_0(x, t) = K_χ̃(θ̂) e^{A_χ̃(θ̂) D̂ x} ( A_χ̃(θ̂) χ̃(t) + e_{2n+1} ũ(0, t) )   (7.150)
          = K_χ̃(θ̂) e^{A_χ̃(θ̂) D̂ x} ( (A_χ̃ + e_{2n+1} K_χ̃)(θ̂) χ̃(t) + e_{2n+1} w(0, t) )   (7.151)

q_0(x, t) = K_χ̃(θ̂) A_χ̃(θ̂) x e^{A_χ̃(θ̂) D̂ x} χ̃(t) + ∫_0^x K_χ̃(θ̂) ( I + A_χ̃(θ̂) D̂ (x − y) ) e^{A_χ̃(θ̂) D̂ (x−y)} e_{2n+1} ũ(y, t) dy   (7.152)
          = [ K_χ̃(θ̂) A_χ̃(θ̂) x e^{A_χ̃(θ̂) D̂ x} + ∫_0^x K_χ̃(θ̂) ( I + A_χ̃(θ̂) D̂ (x − y) ) e^{A_χ̃(θ̂) D̂ (x−y)} e_{2n+1} K_χ̃(θ̂) e^{(A_χ̃ + e_{2n+1} K_χ̃)(θ̂) D̂ y} dy ] χ̃(t)
            + ∫_0^x [ K_χ̃(θ̂) ( I + A_χ̃(θ̂) D̂ (x − y) ) e^{A_χ̃(θ̂) D̂ (x−y)} e_{2n+1}
            + D̂ ∫_y^x K_χ̃(θ̂) ( I + A_χ̃(θ̂) D̂ (x − s) ) e^{A_χ̃(θ̂) D̂ (x−s)} e_{2n+1} K_χ̃(θ̂) e^{(A_χ̃ + e_{2n+1} K_χ̃)(θ̂) D̂ (s−y)} e_{2n+1} ds ] w(y, t) dy   (7.153)

q_i(x, t) = [ ∂K_χ̃(θ̂)/∂θ̂_i + K_χ̃(θ̂) (∂A_χ̃(θ̂)/∂θ̂_i) D̂ x ] e^{A_χ̃(θ̂) D̂ x} χ̃(t)
            + D̂ ∫_0^x [ ∂K_χ̃(θ̂)/∂θ̂_i + K_χ̃(θ̂) (∂A_χ̃(θ̂)/∂θ̂_i) D̂ (x − y) ] e^{A_χ̃(θ̂) D̂ (x−y)} e_{2n+1} ũ(y, t) dy
            − K_χ̃(θ̂) e^{A_χ̃(θ̂) D̂ x} ∂χ^r(t, θ̂)/∂θ̂_i + ∂u^r(x, t, θ̂)/∂θ̂_i
            − D̂ ∫_0^x K_χ̃(θ̂) e^{A_χ̃(θ̂) D̂ (x−y)} e_{2n+1} ∂u^r(y, t, θ̂)/∂θ̂_i dy   (7.154)
          = [ ( ∂K_χ̃(θ̂)/∂θ̂_i + K_χ̃(θ̂) (∂A_χ̃(θ̂)/∂θ̂_i) D̂ x ) e^{A_χ̃(θ̂) D̂ x}
            + D̂ ∫_0^x ( ∂K_χ̃(θ̂)/∂θ̂_i + K_χ̃(θ̂) (∂A_χ̃(θ̂)/∂θ̂_i) D̂ (x − y) ) e^{A_χ̃(θ̂) D̂ (x−y)} e_{2n+1} K_χ̃(θ̂) e^{(A_χ̃ + e_{2n+1} K_χ̃)(θ̂) D̂ y} dy ] χ̃(t)
            + D̂ ∫_0^x [ ( ∂K_χ̃(θ̂)/∂θ̂_i + K_χ̃(θ̂) (∂A_χ̃(θ̂)/∂θ̂_i) D̂ (x − y) ) e^{A_χ̃(θ̂) D̂ (x−y)} e_{2n+1}
            + D̂ ∫_y^x ( ∂K_χ̃(θ̂)/∂θ̂_i + K_χ̃(θ̂) (∂A_χ̃(θ̂)/∂θ̂_i) D̂ (x − s) ) e^{A_χ̃(θ̂) D̂ (x−s)} e_{2n+1} K_χ̃(θ̂) e^{(A_χ̃ + e_{2n+1} K_χ̃)(θ̂) D̂ (s−y)} e_{2n+1} ds ] w(y, t) dy
            − K_χ̃(θ̂) e^{A_χ̃(θ̂) D̂ x} ∂χ^r(t, θ̂)/∂θ̂_i + ∂u^r(x, t, θ̂)/∂θ̂_i
            − D̂ ∫_0^x K_χ̃(θ̂) e^{A_χ̃(θ̂) D̂ (x−y)} e_{2n+1} ∂u^r(y, t, θ̂)/∂θ̂_i dy.   (7.155)

As a consequence, we design the control law below to ensure that (7.146) holds:

U(t) = U^r(t + D̂, θ̂) − K_χ̃(θ̂) χ^r(t + D̂, θ̂) + K_χ̃(θ̂) e^{A_χ̃(θ̂) D̂} χ(t) + D̂ ∫_0^1 K_χ̃(θ̂) e^{A_χ̃(θ̂) D̂ (1−y)} e_{2n+1} u(y, t) dy.   (7.156)
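The mechanism behind the predictor terms in (7.136) and (7.156) — a matrix-exponential term plus an integral over the buffered input — can be seen on a scalar toy problem (not from the chapter; all gains and constants are illustrative, and both a and D are taken as known): feeding back the D-seconds-ahead prediction x(t + D) = e^{aD} x(t) + ∫_{t−D}^{t} e^{a(t−τ)} U(τ) dτ cancels the delay, so the closed loop behaves like ẋ = (a − k)x.

```python
# Sketch (illustrative): predictor feedback for xdot = a*x + U(t - D)
# with known a and D. U(t) = -k * x(t + D), where the prediction is
# computed from the current state and the buffered input history.
import math

a, D, k, dt = 1.0, 0.5, 3.0, 1e-3    # unstable pole, delay, gain, step
N = int(D / dt)                      # delay buffer length
wts = [math.exp(a * (N - 1 - i) * dt) * dt for i in range(N)]  # quadrature
buf = [0.0] * N                      # U(t-D), ..., U(t-dt); zero history
x, t = 1.0, 0.0
while t < 8.0:
    P = math.exp(a * D) * x + sum(w * b for w, b in zip(wts, buf))
    U = -k * P                       # predictor feedback
    x += dt * (a * x + buf[0])       # plant sees the D-delayed input
    buf = buf[1:] + [U]
    t += dt
```

Despite the open-loop pole at a = 1 and the half-second delay, the state converges to zero; static feedback U = −k x(t) with these values would not tolerate such a delay as easily.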

Two Lyapunov-based estimators for the unknown plant parameters and the actuator time delay are given as follows:

D̂̇ = γ_D Proj_[D̲,D̄]{τ_D},  γ_D > 0   (7.157)
τ_D = −∫_0^1 (1 + x) w(x, t) p_0(x, t) dx   (7.158)
θ̂̇ = γ_θ Proj_Θ{τ_θ},  γ_θ > 0,  b̂_m(0) sgn(b_m) ≥ b̲_m   (7.159)
τ_θ = ω W_ε(θ̂)^T z − (1/(2g)) ω ∫_0^1 (1 + x) w(x, t) r_0(x, t) dx,   (7.160)

where g > 0, Proj_[D̲,D̄]{·} is a standard projection operator defined on the interval [D̲, D̄], and Proj_Θ{·} is a standard projection algorithm defined on the set Θ to guarantee that |b̂_m| ≥ b̲_m > 0. The adaptive controller is shown in Fig. 7.2. Finally, the stability of the ODE–PDE cascade system is summarized in the main theorem below.
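The interval projection Proj_[D̲,D̄]{·} used in (7.157) simply discards update directions that would push the delay estimate out of the known interval; a minimal sketch (function and variable names are ours):

```python
# Sketch: interval projection for the delay update (7.157). The update
# direction tau is zeroed only when the estimate sits on a boundary of
# [D_lo, D_hi] and tau points outward; otherwise it passes through.
def proj_interval(D_hat, tau, D_lo, D_hi):
    if D_hat <= D_lo and tau < 0.0:
        return 0.0
    if D_hat >= D_hi and tau > 0.0:
        return 0.0
    return tau

vals = (proj_interval(0.5, -1.0, 0.5, 1.5),   # blocked: outward at lower end
        proj_interval(0.5,  1.0, 0.5, 1.5),   # allowed: points inward
        proj_interval(1.0, -1.0, 0.5, 1.5))   # allowed: interior point
```

The set-valued projection Proj_Θ{·} works analogously, removing the component of τ_θ normal to the boundary of Θ when the estimate would otherwise leave the set.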


Fig. 7.2 Observer-based adaptive control for linear systems with input delays

Theorem 7.4 Consider the closed-loop system consisting of the plant (7.58), the K-filters (7.67)–(7.72), the adaptive controller (7.156), the time-delay identifier (7.157), (7.158), and the parameter identifier (7.159), (7.160). There exists a constant M > 0 such that if the initial error state satisfies the condition

|z(0)|² + |η̃(0)|² + |ζ̃(0)|² + |ε(0)|² + ‖w(x, 0)‖² + |θ̃(0)|² + |D̃(0)|² ≤ M,   (7.161)

then all the signals of the closed-loop system are bounded and asymptotic tracking is achieved, i.e.,

lim_{t→∞} z_1(t) = lim_{t→∞} (Y(t) − Y_r(t)) = 0.   (7.162)

Proof The proof is found in [28, Chap. 5] and [29]. Based on the equalities (7.75), (7.85)–(7.88), (7.107), (7.119)–(7.121), (7.139)–(7.140), θ̃ = θ − θ̂, and D̃ = D − D̂, the initial condition on the error state in (7.161) can be translated into initial conditions on the states of the actual plant, filters, identifiers, and transport PDE, namely X(0), η(0), λ(0), θ̂(0), D̂(0), and u(x, 0).

We next illustrate the proposed scheme through simulations, applying it numerically to the following three-dimensional plant with relative degree two:

Ẋ_1(t) = X_2(t) − a_2 Y(t)
Ẋ_2(t) = X_3(t) − a_1 Y(t) + b_1 U(t − D)
Ẋ_3(t) = −a_0 Y(t) + b_0 U(t − D)
Y(t) = X_1(t),   (7.163)


Fig. 7.3 Output and input tracking (panels: Output Trajectory, Y vs. Y_r; Control Input, U vs. U^r; horizontal axes Time (sec), 0–60)

Fig. 7.4 Delay and parameter estimate (panels: Delay Estimate, D̂; Parameter Estimate, â_2; horizontal axes Time (sec), 0–60)


where X_2(t), X_3(t) are the unmeasurable states, Y(t) is the measured output, D = 1 is the unknown constant time delay in the control input, and a_2 = −3 is the unknown constant system parameter. To show the effectiveness of the developed control algorithm, without loss of generality, we simplify the simulation by assuming that a_1 = a_0 = 0 and b_1 = b_0 = 1 are known and that the PDE state u(x, t) is measurable. It is easy to check that this plant is open-loop unstable, since it possesses poles with nonnegative real parts. The prior information D̲ = 0.5, D̄ = 1.5, a̲_2 = −4, ā_2 = −2 is known to the designer. Y_r(t) = sin t is the known reference signal to track. The boundedness and asymptotic tracking of the system output and control input, as well as the estimates of θ and D, are shown in Figs. 7.3 and 7.4.
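That this plant is open-loop unstable can be read off its characteristic polynomial s³ + a_2 s² + a_1 s + a_0 = s³ − 3s², which has a double root at the origin and a root at s = 3. A one-line necessary-condition check (illustrative):

```python
# Routh-Hurwitz necessary condition: a monic polynomial with all roots in
# the open left half-plane must have all coefficients strictly positive.
a2, a1, a0 = -3.0, 0.0, 0.0
coeffs = [1.0, a2, a1, a0]           # s^3 + a2*s^2 + a1*s + a0
open_loop_stable = all(c > 0 for c in coeffs[1:])
```

Since a_2 = −3 < 0 (and a_1 = a_0 = 0), the condition fails, confirming the open-loop instability the adaptive controller must overcome.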

7.4 Adaptive Control for Linear Systems with Distributed Input Delays

In this section, we deal with the adaptive control problem for linear systems with distributed input delays. The plant model is

Ẋ(t) = A X(t) + ∫_0^D B(D − σ) U(t − σ) dσ   (7.164)
Y(t) = C X(t),   (7.165)

which is equivalent to the following ODE–PDE cascade system:

Ẋ(t) = A X(t) + D ∫_0^1 B(Dx) u(x, t) dx   (7.166)
Y(t) = C X(t)   (7.167)
D u_t(x, t) = u_x(x, t),  x ∈ [0, 1]   (7.168)
u(1, t) = U(t).   (7.169)
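The equivalence between (7.164) and (7.166) is just the change of variables σ = D(1 − x) together with u(x, t) = U(t + D(x − 1)). A quick numerical confirmation (kernel, input, and values are illustrative):

```python
# Sketch: check  int_0^D B(D - sigma) U(t - sigma) dsigma
#              = D * int_0^1 B(D*x) u(x, t) dx,  u(x, t) = U(t + D*(x - 1)),
# by midpoint quadrature on both sides.
import math

D, t, N = 2.0, 5.0, 2000
B = lambda s: math.exp(-s)           # illustrative scalar delay kernel
U = lambda tau: math.cos(tau)
h = 1.0 / N

lhs = sum(B(D - D * (i + 0.5) * h) * U(t - D * (i + 0.5) * h)
          for i in range(N)) * D * h
rhs = D * sum(B(D * (i + 0.5) * h) * U(t + D * ((i + 0.5) * h - 1.0))
              for i in range(N)) * h
gap = abs(lhs - rhs)
```

The two quadratures evaluate the same set of terms (in mirrored order), so the gap is at the level of floating-point round-off.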

Concentrating on (7.164)–(7.165) and its transformation (7.166)–(7.169), a linear plant with distributed actuator delay comes with the following five types of basic uncertainties:

• unknown delay D,
• unknown delay kernel B(Dx),
• unknown parameters in the system matrix A,
• unmeasurable finite-dimensional plant state X(t), and
• unmeasurable infinite-dimensional actuator state u(x, t).

This section addresses the most relevant problem, where the delay and the delay kernel are unknown. The n-dimensional input vector B(Dx) for x ∈ [0, 1] is a continuous function of Dx such that


B(Dx) = [ρ_1(Dx); ρ_2(Dx); ⋮; ρ_n(Dx)],   (7.170)

where ρ_i(Dx) for i = 1, ..., n are the unknown components of the vector-valued function B(Dx). On the basis of (7.166), we further denote

B(x) = D B(Dx) = Σ_{i=1}^{n} D ρ_i(Dx) B_i = Σ_{i=1}^{n} b_i(x) B_i,   (7.171)

where b_i(x) = D ρ_i(Dx) for i = 1, ..., n are unknown scalar continuous functions of x, and B_i ∈ R^n for i = 1, ..., n are the corresponding unit vectors. The system (7.166)–(7.169) is rewritten as

Ẋ(t) = A X(t) + ∫_0^1 B(x) u(x, t) dx   (7.172)
Y(t) = C X(t)   (7.173)
D u_t(x, t) = u_x(x, t),  x ∈ [0, 1]   (7.174)
u(1, t) = U(t).   (7.175)

Two assumptions are needed.

Assumption 7.7 There exist known constants D̲, D̄, b̄_i, and known continuous functions b_i*(x) such that

0 < D̲ ≤ D ≤ D̄,  0 < ∫_0^1 (b_i(x) − b_i*(x))² dx ≤ b̄_i   (7.176)

for i = 1, ..., n.

Assumption 7.8 The pair (A, β) is stabilizable, where

β = ∫_0^1 e^{−AD(1−x)} B(x) dx.   (7.177)

Namely, there exists a vector K(β) making A + β K(β) Hurwitz, so that

(A + β K(β))^T P(β) + P(β)(A + β K(β)) = −Q(β),   (7.178)

where P(β) = P(β)^T > 0 and Q(β) = Q(β)^T > 0. Denote by D̂(t) and b̂_i(x, t) the estimates of D and b_i(x) (for i = 1, ..., n), with estimation errors satisfying
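For a scalar plant, the content of Assumption 7.8 is easy to exercise numerically (all values and the kernel are illustrative, not from the chapter): compute β by quadrature, pick K to place a + βK at −1, and solve the scalar version of the Lyapunov equation (7.178):

```python
# Sketch: scalar Assumption 7.8. beta = int_0^1 e^{-a*D*(1-x)} b(x) dx,
# K places the closed-loop pole at -1, and 2*(a + beta*K)*P = -Q gives P.
import math

a, D, N = 0.8, 1.5, 10000
b = lambda x: 1.0 + 0.5 * x          # illustrative kernel b(x)
beta = sum(math.exp(-a * D * (1.0 - (i + 0.5) / N)) * b((i + 0.5) / N)
           for i in range(N)) / N
K = (-1.0 - a) / beta                # makes a + beta*K = -1 (Hurwitz)
Q = 1.0
P = -Q / (2.0 * (a + beta * K))      # scalar Lyapunov solution: P = 0.5 > 0
```

Since b(x) > 0 here, β > 0 and stabilizability holds trivially; in the vector case one checks the rank condition for (A, β) instead.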

212

M. Krstic and Y. Zhu

D̃(t) = D − D̂(t)   (7.179)
b̃_i(x, t) = b_i(x) − b̂_i(x, t)   (7.180)

and

B̂(x, t) = Σ_{i=1}^{n} b̂_i(x, t) B_i.   (7.181)

The delay-adaptive control scheme is designed as follows. The control law is

U(t) = u(1, t) = K(β̂(t)) Z(t),   (7.182)

where K(β̂(t)) is chosen to make

A_cl(β̂(t)) = A + β̂(t) K(β̂(t)),  β̂(t) = ∫_0^1 e^{−A D̂(t)(1−x)} B̂(x, t) dx = ∫_0^1 e^{−A D̂(t)(1−x)} Σ_{i=1}^{n} b̂_i(x, t) B_i dx,   (7.183)

Hurwitz, and

Z(t) = X(t) + D̂(t) ∫_0^1 ∫_0^x e^{−A D̂(t)(x−y)} B̂(y, t) dy u(x, t) dx.   (7.184)

The update laws are

D̂̇(t) = γ_D Proj_[D̲,D̄]{τ_D(t)},  γ_D > 0   (7.185)
τ_D(t) = [ (1/g) Z^T(t) P(β̂(t)) f_D(t) − ∫_0^1 (1 + x) w(x, t) h_D(x, t) dx ] / (1 + Ξ(t))   (7.186)
b̂̇_i(x, t) = γ_b Proj{τ_{b_i}(x, t)},  γ_b > 0   (7.187)
τ_{b_i}(x, t) = (1/g) Z^T(t) P(β̂(t)) f_{b_i}(x, t) / (1 + Ξ(t)) − [ ∫_0^1 (1 + y) w(y, t) h_{b_i}(y, t) dy ] f_{b_i}(x, t) / (1 + Ξ(t)),   (7.188)

where P(β̂(t)) satisfies (7.178), g > 0 is a design coefficient,

w(x, t) = u(x, t) − K(β̂(t)) e^{A_cl(β̂(t)) D̂(t)(x−1)} Z(t)   (7.189)
Ξ(t) = Z^T(t) P(β̂(t)) Z(t) + g ∫_0^1 (1 + x) w(x, t)² dx   (7.190)
f_D(t) = ∫_0^1 B̂(x, t) u(x, t) dx − ∫_0^1 e^{−A D̂(t)(1−x)} B̂(x, t) dx u(1, t)
         − D̂(t) ∫_0^1 ∫_0^x A e^{−A D̂(t)(x−y)} B̂(y, t) dy u(x, t) dx   (7.191)
h_D(x, t) = K(β̂(t)) e^{A_cl(β̂(t)) D̂(t)(x−1)} ( f_D(t) + A_cl(β̂(t)) Z(t) )   (7.192)
f_{b_i}(x, t) = B_i u(x, t)   (7.193)
h_{b_i}(x, t) = K(β̂(t)) e^{A_cl(β̂(t)) D̂(t)(x−1)}   (7.194)

for i = 1, ..., n, and the projection operators are defined as

Proj_[D̲,D̄]{τ} = 0 if D̂(t) = D̲ and τ < 0;  Proj_[D̲,D̄]{τ} = 0 if D̂(t) = D̄ and τ > 0;  Proj_[D̲,D̄]{τ} = τ otherwise,   (7.195)

Proj{τ(x)} = τ(x) − (b̂_i(x) − b_i*(x)) [ ∫_0^1 (b̂_i(x) − b_i*(x)) τ(x) dx ] / [ ∫_0^1 (b̂_i(x) − b_i*(x))² dx ]
  if ∫_0^1 (b̂_i(x) − b_i*(x))² dx = b̄_i and ∫_0^1 (b̂_i(x) − b_i*(x)) τ(x) dx > 0;  Proj{τ(x)} = τ(x) otherwise.   (7.196)

Theorem 7.5 Consider the closed-loop system consisting of the plant (7.172)–(7.175) and the adaptive controller (7.182)–(7.196). All the states (X(t), u(x, t), D̂(t), b̂_i(x, t)) of the closed-loop system are globally bounded, and regulation of X(t) and U(t) is achieved, i.e., lim_{t→∞} X(t) = lim_{t→∞} U(t) = 0.

Proof The proof is found in [28, Chap. 11] and [31].
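On a spatial grid, the functional projection (7.196) removes the outward-pointing component of the update once the estimate b̂_i reaches the boundary of the L²-ball around b_i*; a minimal sketch (names and grid are ours):

```python
# Sketch: grid-based version of the functional projection (7.196).
# The L2 inner products are approximated by Riemann sums with spacing h.
def proj_fun(bhat, bstar, tau, bbar, h):
    d = [p - q for p, q in zip(bhat, bstar)]
    norm2 = sum(v * v for v in d) * h                 # ||bhat - bstar||^2
    inner = sum(v * s for v, s in zip(d, tau)) * h    # <bhat - bstar, tau>
    if norm2 >= bbar and inner > 0.0:                 # on boundary, outward
        return [s - v * inner / norm2 for s, v in zip(tau, d)]
    return list(tau)

h = 0.01
bstar, bhat, tau = [0.0] * 100, [1.0] * 100, [1.0] * 100
ptau = proj_fun(bhat, bstar, tau, 1.0, h)             # boundary case
outward = sum((p - q) * s for p, q, s in zip(bhat, bstar, ptau)) * h
inside = proj_fun([0.1] * 100, bstar, tau, 1.0, h)    # interior: unchanged
```

On the boundary, the projected update is tangent to the ball (its outward component vanishes); in the interior the update passes through untouched.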

7.5 Beyond the Results Given Here

Both adaptive control and control of PDEs are challenging subjects. In adaptive control, the challenge comes from the need to design feedback for a plant whose dynamics may be highly uncertain (due to the plant parameters being highly unknown) and open-loop unstable, requiring control and learning to be conducted at the same time.


In the control of PDEs, the challenge lies in the infinite-dimensional nature of the system dynamics. The adaptive control problem for PDEs [1, 25] is a task whose difficulties are even greater than the sum of the difficulties of its two components. In particular, the conventional adaptive control methods for ODEs [3, 12, 18] cannot be trivially used to address uncertain delay systems, whose dynamics are infinite dimensional. By modeling the actuator state under input delay as a transport PDE and regarding the (delay-dependent) propagation speed as a parameter in the infinite-dimensional part of the ODE–PDE cascade system, an introductory exposition of adaptive control of delay systems has been given in this chapter. As aforementioned, linear systems with input delays usually have the following five types of uncertainties:

unknown input delays, unknown delay kernels, unknown plant parameters, unmeasurable finite-dimensional plant state, and unmeasurable infinite-dimensional actuator state.

Different uncertainty combinations result in different control designs. For a tutorial introduction, in this chapter, we chose a subset of results that are the most basic. More results are available in the articles [4, 7–9, 17, 27, 29–35] and are summarized in the books [15, 28]. The first adaptive control design for an ODE system with a large discrete input delay of unknown length was developed by Bresch-Pietri and Krstic [8, 9, 17]. An introduction where the delay is unknown but the ODE plant is known and the transport state is measurable is available in [17] and [15, Chap. 7]. The publications [8] and [15, Chap. 9] generalize the design to the situation where, besides the unknown delay value, the ODE also has unknown parameters. The references [9] and [15, Chap. 8] solve the problem of adaptive stabilization when both the delay value and the actuator state are unavailable. In the case where the delay state is not available for measurement, only local stability is obtained because the problem is not linearly parameterized, which means that the initial delay estimate needs to be sufficiently close to the true delay. More uncertainty combinations without knowledge of the actuator state are taken into account in [7]. In [29, 35] and [28, Chaps. 4–5], we deal with observer-based delay-adaptive control problems in the presence of uncertain parameters and state of the ODE plant, where the observer canonical form and the Kreisselmeier filters are employed for ODE-state estimation. On the basis of the uncertainty-free case [26], we consider adaptive control for multi-input linear systems with distinct discrete input delays in [30, 32, 33] and [28, Chaps. 6–9]. Since the delay lengths in the multi-input channels are not identical, the multi-input plant significantly complicates the prediction design, as it requires computing future state values over different time horizons (which seems to be non-causal at first sight).
In [5], the PDE-backstepping predictor method is expanded to compensate another big family of delays—distributed input delays. Since linear systems with distributed delays consisting of the finite-dimensional plant state and the infinite-dimensional actuator state are not in the strict-feedback form, a novel forwarding-backstepping transformation is introduced to transform the system to an exponentially stable target

7 Delay-Adaptive Observer-Based Control for Linear Systems …

215

system [5]. The paper [4], for the first time, addresses the adaptive stabilization problem of linear systems with unknown parameters and known distributed input delays. Delay-adaptive control for linear systems with unknown distributed input delays is further studied in [31, 34], [28, Chap. 11] (single-input case) and in [27], [28, Chap. 12] (multi-input case with distinct delays). Sizable opportunities exist for further development of the subject of delay-adaptive control, including systems with simultaneous input and state delays, and PDE systems with delays.

References

1. Anfinsen, H., Aamo, O.-M.: Adaptive Control of Hyperbolic PDEs. Springer, Berlin (2019)
2. Artstein, Z.: Linear systems with delayed controls: a reduction. IEEE Trans. Autom. Control 27(4), 869–879 (1982)
3. Astrom, K.J., Wittenmark, B.: Adaptive Control. Courier Corporation, Chelmsford (2013)
4. Bekiaris-Liberis, N., Jankovic, M., Krstic, M.: Adaptive stabilization of LTI systems with distributed input delay. Int. J. Adapt. Control Signal Process. 27, 47–65 (2013)
5. Bekiaris-Liberis, N., Krstic, M.: Lyapunov stability of linear predictor feedback for distributed input delays. IEEE Trans. Autom. Control 56, 655–660 (2011)
6. Bekiaris-Liberis, N., Krstic, M.: Nonlinear Control Under Nonconstant Delays. Society for Industrial and Applied Mathematics, Philadelphia (2013)
7. Bresch-Pietri, D., Chauvin, J., Petit, N.: Adaptive control scheme for uncertain time-delay systems. Automatica 48, 1536–1552 (2012)
8. Bresch-Pietri, D., Krstic, M.: Adaptive trajectory tracking despite unknown input delay and plant parameters. Automatica 45, 2074–2081 (2009)
9. Bresch-Pietri, D., Krstic, M.: Delay-adaptive predictor feedback for systems with unknown long actuator delay. IEEE Trans. Autom. Control 55(9), 2106–2112 (2010)
10. Evesque, S., Annaswamy, A.M., Niculescu, S., Dowling, A.P.: Adaptive control of a class of time-delay systems. ASME Trans. Dynam. Syst. Meas. Control 125, 186–193 (2003)
11. Fridman, E.: Introduction to Time-Delay Systems: Analysis and Control. Birkhauser (2014)
12. Goodwin, G.C., Sin, K.S.: Adaptive Filtering, Prediction and Control. Courier Corporation, Chelmsford (2014)
13. Karafyllis, I., Krstic, M.: Delay-robustness of linear predictor feedback without restriction on delay rate. Automatica 49, 1761–1767 (2013)
14. Krstic, M.: Lyapunov tools for predictor feedbacks for delay systems: inverse optimality and robustness to delay mismatch. Automatica 44, 2930–2935 (2008)
15. Krstic, M.: Delay Compensation for Nonlinear, Adaptive, and PDE Systems. Birkhauser, Berlin (2009)
16. Krstic, M.: Compensation of infinite-dimensional actuator and sensor dynamics: nonlinear and delay-adaptive systems. IEEE Control Syst. Mag. 30, 22–41 (2010)
17. Krstic, M., Bresch-Pietri, D.: Delay-adaptive full-state predictor feedback for systems with unknown long actuator delay. In: Proceedings of the 2009 American Control Conference, St. Louis, MO, USA, June 10–12 (2009)
18. Krstic, M., Kanellakopoulos, I., Kokotovic, P.V.: Nonlinear and Adaptive Control Design. Wiley, New York (1995)
19. Krstic, M., Smyshlyaev, A.: Backstepping boundary control for first-order hyperbolic PDEs and application to systems with actuator and sensor delays. Syst. Control Lett. 57(9), 750–758 (2008)
20. Krstic, M., Smyshlyaev, A.: Boundary Control of PDEs: A Course on Backstepping Designs. SIAM, Philadelphia (2008)
21. Liu, W.-J., Krstic, M.: Adaptive control of Burgers' equation with unknown viscosity. Int. J. Adapt. Control Signal Process. 15, 745–766 (2001)
22. Niculescu, S.-I., Annaswamy, A.M.: An adaptive Smith-controller for time-delay systems with relative degree n* ≤ 2. Syst. Control Lett. 49, 347–358 (2003)
23. Ortega, R., Lozano, R.: Globally stable adaptive controller for systems with delay. Int. J. Control 47, 17–23 (1988)
24. Smith, O.J.M.: A controller to overcome dead time. ISA 6, 28–33 (1959)
25. Smyshlyaev, A., Krstic, M.: Adaptive Control of Parabolic PDEs. Princeton University Press, Princeton (2010)
26. Tsubakino, D., Krstic, M., Oliveira, T.R.: Exact predictor feedbacks for multi-input LTI systems with distinct input delays. Automatica 71, 143–150 (2016)
27. Zhu, Y., Krstic, M.: Adaptive and robust predictors for multi-input linear systems with distributed delays. SIAM J. Control Optim. 58, 3457–3485 (2020)
28. Zhu, Y., Krstic, M.: Delay-Adaptive Linear Control. Princeton University Press, Princeton (2020)
29. Zhu, Y., Krstic, M., Su, H.: Adaptive output feedback control for uncertain linear time-delay systems. IEEE Trans. Autom. Control 62, 545–560 (2017)
30. Zhu, Y., Krstic, M., Su, H.: Adaptive global stabilization of uncertain multi-input linear time-delay systems by PDE full-state feedback. Automatica 96, 270–279 (2018)
31. Zhu, Y., Krstic, M., Su, H.: Delay-adaptive control for linear systems with distributed input delays. Automatica 116, 108902 (2020)
32. Zhu, Y., Krstic, M., Su, H.: PDE boundary control of multi-input LTI systems with distinct and uncertain input delays. IEEE Trans. Autom. Control 63, 4270–4277 (2018)
33. Zhu, Y., Krstic, M., Su, H.: PDE output feedback control of LTI systems with uncertain multi-input delays, plant parameters and ODE state. Syst. Control Lett. 123, 1–7 (2019)
34. Zhu, Y., Krstic, M., Su, H.: Predictor feedback for uncertain linear systems with distributed input delays. IEEE Trans. Autom. Control 64, 5344–5351 (2020)
35. Zhu, Y., Su, H., Krstic, M.: Adaptive backstepping control of uncertain linear systems under unknown actuator delay. Automatica 54, 256–265 (2015)

Chapter 8

Adaptive Control for Systems with Time-Varying Parameters—A Survey

Kaiwen Chen and Alessandro Astolfi

Dedicated to Laurent Praly

Abstract Adaptive control was originally proposed to control systems the models of which change over time. Traditionally, however, classical adaptive control has been developed for systems with constant parameters. This chapter surveys the so-called congelation of variables method to overcome the obstacle of time-varying parameters. Two examples, illustrating how to deal with time-varying parameters in the feedback path and in the input path, respectively, are first presented. Then n-dimensional lower-triangular systems are discussed to show how to combine the congelation of variables method with adaptive backstepping. Finally, we study how to control a class of nonlinear systems via output feedback: this is a problem that cannot be solved directly due to the coupling between the input and the time-varying perturbation. It turns out that if we assume a strong minimum-phase property, namely ISS of the inverse dynamics, such a coupling is converted into a coupling between the output and the time-varying perturbation. Then, a small-gain-like analysis, which takes all subsystems into account, yields a controller that achieves output regulation and boundedness of all closed-loop signals. Simulation results to demonstrate that the proposed controller achieves asymptotic output regulation and outperforms the classical adaptive controller in the presence of time-varying parameters are presented.

* This chapter is partially reprinted from [8] with the following copyright and permission notice: ©2021 IEEE. Reprinted, with permission, from Chen, K., Astolfi, A.: Adaptive control for systems with time-varying parameters. IEEE Transactions on Automatic Control 66(5), 1986–2001 (2021).

K. Chen (B) · A. Astolfi
Imperial College London, SW7 2AZ London, UK
e-mail: [email protected]
A. Astolfi
e-mail: [email protected]
A. Astolfi
Università di Roma "Tor Vergata", 00133 Rome, Italy

© The Author(s), under exclusive license to Springer Nature Switzerland AG 2022
Z.-P. Jiang et al. (eds.), Trends in Nonlinear and Adaptive Control, Lecture Notes in Control and Information Sciences 488, https://doi.org/10.1007/978-3-030-74628-5_8

8.1 Introduction

Since the publication of the seminal paper [10] that proposes an adaptive control scheme with theoretically guaranteed stability properties, adaptive control has undergone extensive research (see, e.g., [2, 13, 17, 26, 29]), typically under the assumption that the system parameters are constant. This, however, somewhat deviates from the original intention that one could use adaptive control to cope with plants or environments that change with time. One would expect that if the time-varying parameters can be estimated exactly, their effects could also be exactly cancelled by a certainty-equivalence design. Therefore, early works on adaptive control for time-varying systems (see, e.g., [12]) exploit persistence of excitation to guarantee stability by ensuring that parameter estimates converge to the true parameters. The restriction of persistence of excitation is relaxed by subsequent works (see, e.g., [16, 25]) which only require bounded and slow (in an average sense) parameter variations. More recent contributions can be mainly categorized into two trends, both of which exploit techniques from robust adaptive control to confine the parameter estimates. One of the trends is based on the so-called switching σ-modification (see, e.g., [13]), a mechanism which adds leakage to the parameter update integrator, if the parameter estimates drift out of a pre-specified "reasonable" region, to guarantee boundedness of the parameter estimates. This approach achieves asymptotic tracking when the parameters are constant; otherwise the tracking error is nonzero and related to the rates of the parameter variations, see [30]. Such a result can be further improved, as shown in [32, 33], if one could model the parameter variations in two parts, known parameter variations and unknown variations, in which case the residual tracking error only depends on the rates of the unknown parameter variations.
The other trend exploits the projection operation (see, e.g., [11, 28]), which confines the parameter estimates within a pre-specified compact set to guarantee their boundedness, and the so-called filtered transformation, which is essentially an adaptive observer described via a change of coordinates, see [20, 21, 23]. These methods guarantee asymptotic tracking provided that the parameters are bounded in a compact set, their derivatives are L1, and the disturbance on the state evolution is additive and L2. Moreover, a priori knowledge of the parameter variations is not needed and the residual tracking error is independent of the rates of the parameter variations. The methods mentioned above cannot guarantee zero-error regulation when the unknown parameters are persistently varying. To achieve asymptotic state/output regulation when the time-varying parameters are neither known nor asymptotically constant, a method called the congelation of variables has been proposed in [3, 4] and developed on the basis of the adaptive backstepping approach and the adaptive

8 Adaptive Control for Systems with Time-Varying Parameters—A Survey


immersion and invariance (I&I) approach, respectively. In the spirit of the congelation of variables method, each unknown time-varying parameter is treated as a nominal unknown constant parameter perturbed by the difference between the true parameter and the nominal parameter, which causes a time-varying perturbation term. The controller design is then divided into a classical adaptive control design, with constant unknown parameters, and a damping design via dominance to counteract the time-varying perturbation terms. This method is compatible with most adaptive control schemes using parameter estimates, as it does not change the original parameter update law designed for time-invariant systems.

Since full-state feedback is not always implementable, most practical scenarios require an output-feedback adaptive control scheme. In the output-feedback design with the congelation of variables method, the major difficulty is caused by the coupling between the input and the time-varying perturbation. In this case, simply strengthening damping terms in the controller alters the input (as well as the perturbation itself) and therefore causes a chicken-and-egg dilemma, which prevents stabilization via dominance. In [5, 6], a special output-feedback case is solved on the basis of adaptive backstepping and adaptive I&I, respectively, by exploiting a modified minimum-phase property for time-varying systems and decomposing the coupling between the input and the time-varying perturbation into a coupling between some output-related nonlinearities and some “new” time-varying perturbations, which enables the use of the dominance design again, though it is still restricted by a relative degree condition. This restriction is relaxed in [8] by using the nonlinear negative feedback control law proposed in [7] and by performing a stability analysis that takes all the filter subsystems into account.

This chapter summarizes the ideas and results of [3, 5, 7, 8] and gives some extensions.
The chapter is organized as follows. In Sect. 8.2, two scalar systems are used to illustrate the congelation of variables method, and an n-dimensional lower triangular system with unmatched uncertainties, controlled by an adaptive state-feedback controller, is discussed to elaborate on the combination of the congelation of variables method with adaptive backstepping. With these design tools, in Sect. 8.3, we recall the results developed in [5–8] on the decomposition of the perturbation coupled with the input, on the design of the input and output filters, and on a small-gain-like controller design. In Sect. 8.4, a numerical example is presented to highlight the performance improvement achievable with the proposed scheme. For conciseness, most of the technical proofs in this chapter are omitted. All the proofs can be found in [8].

Notation This chapter uses standard notation unless stated otherwise. For an n-dimensional vector v ∈ Rⁿ, |v| denotes the Euclidean 2-norm, |v|_M = √(vᵀMv), with M = Mᵀ ≻ 0, denotes the weighted 2-norm with weight M, and v̄i ∈ Rⁱ, 1 ≤ i ≤ n, denotes the vector composed of the first i elements of v. ei denotes the i-th unit vector of proper dimension. For an n × m matrix M, (M)i denotes the i-th column, (Mᵀ)i denotes the i-th row, (M)ij denotes the element in the i-th row and j-th column, tr(M) denotes the trace, and |M|_F = √(Σ_{i=1}^{n} Σ_{j=1}^{m} (M)ij²) denotes the Frobenius norm. I


K. Chen and A. Astolfi

Fig. 8.1 Graphical illustration of the role of Θ0, ℓθ, θ(t), and δθ. ©2021 IEEE. Reprinted, with permission, from Chen, K., Astolfi, A.: Adaptive control for systems with time-varying parameters. IEEE Transactions on Automatic Control 66(5), 1986–2001 (2021)

and S denote the identity matrix and the upper shift matrix of proper dimension, respectively. For an n-dimensional time-varying signal s : R → Rⁿ whose image is contained in a compact set S, Δs : R → Rⁿ denotes the deviation of s from a constant value ℓs, i.e., Δs(t) = s(t) − ℓs, and δs ∈ R denotes the supremum of the 2-norm of Δs, i.e., δs = sup_{t≥0} |Δs(t)| ≥ 0. (·)^(n) = dⁿ(·)/dtⁿ denotes the n-th time derivative operator.

In this chapter, the unknown time-varying system parameters¹ θ : R → R^q and bm : R → R may verify one of the assumptions below.

Assumption 8.1 The parameter θ is piecewise continuous and θ(t) ∈ Θ0, for all t ≥ 0, where Θ0 is a compact set. The “radius” of Θ0, i.e., δθ, is assumed to be known, while Θ0 itself can be unknown (see Fig. 8.1).

Assumption 8.2 The parameter θ is smooth and θ^(i)(t) ∈ Θi, for i ≥ 0 and for all t ≥ 0, where the Θi are compact sets, possibly unknown. δθ is assumed to be known.

Assumption 8.3 The parameter bm(t) is bounded away from 0, in the sense that there exists a constant ℓbm such that sgn(ℓbm) = sgn(bm(t)) ≠ 0 and 0 < |ℓbm| ≤ |bm(t)|, for all t ≥ 0. The sign of ℓbm and bm(t), for all t ≥ 0, is known and does not change.

8.2 Motivating Examples and Preliminary Result

In this section, we first briefly discuss the core idea which allows us to cope with time-varying parameters, the so-called congelation of variables method, by demonstrating

¹ All parameters, e.g., θ, are time-varying unless stated otherwise. To highlight this fact, the time argument is explicitly used, e.g., θ(t), although this may be dropped for conciseness as long as no confusion arises.


the adaptive controller design on two scalar systems. Then, an n-dimensional lower triangular system is considered to generalize the proposed method.

8.2.1 Parameter in the Feedback Path

To begin with, consider a scalar nonlinear system described by the equation

x˙ = θ(t)x² + u,  (8.1)

where x(t) ∈ R is the state, u(t) ∈ R is the input, and θ (t) ∈ R is an unknown time-varying parameter satisfying Assumption 8.1. In the spirit of the certaintyequivalence principle, we can substitute an “estimate” θˆ for the true parameter θ (t), and rewrite (8.1) as x˙ = θˆ x 2 + u + (θ − θˆ )x 2 .

(8.2)

In the classical direct adaptive control scheme for time-invariant systems, considering a Lyapunov function candidate of the form

V(x, θˆ, θ) = (1/2)x² + (1/(2γθ))(θ − θˆ)²  (8.3)

(and assuming differentiability of θ for the time being) yields

V˙ = θˆx³ + ux + (θ − θˆ)x³ − (θ − θˆ)θˆ˙/γθ + (θ − θˆ)θ˙/γθ,  (8.4)

which allows using the parameter update law

θˆ˙ = γθ x³  (8.5)

to cancel the unknown (θ − θˆ)x³ term in V˙, where the constant γθ > 0 is known as the adaptation gain. Since it is typically assumed that θ is constant, the indefinite term (θ − θˆ)θ˙/γθ disappears, and the selection of the control law

u = −kx − θˆx²,  (8.6)

with k > 0, yields V˙ = −kx² ≤ 0. We can then conclude, by invoking Barbalat’s lemma, that x and θˆ are bounded, and lim_{t→+∞} x(t) = 0. However, if θ is not a constant parameter, special treatment is needed for the indefinite term (θ − θˆ)θ˙/γθ. One popular modification to cope with this indefinite term is the so-called projection operation


(see, e.g., [11, 28]), which confines the parameter estimate θˆ inside a convex compact set by adding an additional term to (8.5) so that it “projects” θˆ back to a “reasonable” set when θˆ drifts out of the set, and therefore guarantees boundedness of (θ − θˆ). It follows that boundedness of θ˙ guarantees boundedness of x (either exact boundedness, e.g., in [34], or boundedness in an average sense, e.g., in [25]), and θ˙ ∈ L1 guarantees that lim_{t→+∞} x(t) = 0 (e.g., in [21–23]). Alternatively, one may exploit a soft version

of the projection operation, commonly referred to as switching σ-modification, to guarantee boundedness of θˆ, which adds some leakage to the integrator (8.5) if the parameter estimate drifts outside a “reasonable” region, see, e.g., [30, 32, 33]. All these schemes share the similarity that they treat θ˙ as a disturbance. Therefore, designing in the spirit of disturbance attenuation, one can guarantee that a bounded θ˙ causes a bounded state/output regulation/tracking error, and that a sufficiently fast converging θ˙, which means that θ becomes constant sufficiently fast, guarantees the convergence of the error to 0. As a result, none of these methods can guarantee zero-error regulation/tracking when the unknown parameter is persistently time varying, in which case θ˙ is non-vanishing.

To overcome this limitation, first note that the time derivative θ˙ is introduced by taking the time derivative of the θ − θˆ term in (8.3) along the solutions of the system. Also note that the role of the θ − θˆ term is only to guarantee boundedness of θˆ, by no means guaranteeing convergence of θˆ to θ, no matter whether θ is time varying or constant. In fact, replacing θ with a constant ℓθ, to be determined, can guarantee the same properties without introducing θ˙ when taking the time derivative. In the light of this, consider the modified Lyapunov function candidate

V(x, θˆ, ℓθ) = (1/2)x² + (1/(2γθ))(ℓθ − θˆ)².  (8.7)

Taking the time derivative of V along the trajectories of (8.2) yields

V˙ = θˆx³ + ux + (ℓθ − θˆ)x³ − (ℓθ − θˆ)θˆ˙/γθ + Δθ x³,  (8.8)

where Δθ = θ − ℓθ. Comparing (8.8) with (8.4), we see that the substitution of ℓθ for θ eliminates the θ˙ term, at the cost of adding a perturbation term Δθ x³ due to the inconsistency between θ and ℓθ. Considering the same parameter update law as in (8.5) and the new control law

u = −(k + (1/(2εθ))δθ)x − (1/2)εθδθ x³ − θˆx²,  (8.9)

where εθ > 0 is a constant to balance the linear and the nonlinear terms, yields

V˙ = −(k + (1/(2εθ))δθ)x² − (1/2)εθδθ x⁴ + Δθ x³ ≤ −kx² ≤ 0.  (8.10)


Therefore, we can conclude boundedness of all trajectories of the closed-loop system, as well as lim_{t→+∞} x(t) = 0, using the same argument as the one used in the classical constant parameter problem, without requiring a small or vanishing θ˙. This shows that such a result can be achieved even for systems with fast and persistently varying parameters. The method of substituting the constant ℓθ for the time-varying θ to avoid unnecessary time derivatives is called congelation of variables [3].² The following remarks are given to further facilitate understanding of the proposed scheme.

Remark 8.1 The control law (8.9) and the parameter update law (8.5) do not contain the unknown constant ℓθ explicitly, in analogy with the fact that classical adaptive controllers do not contain the unknown constant θ explicitly. This fact shows that the proposed scheme is “adaptive” and distinguishes it from a robust scheme with static high gain, which would require the knowledge of ℓθ. One can interpret the proposed controller as a combination of an adaptive controller, to cope with the unknown parameter ℓθ representing the average of θ(t), and a robust controller, to cope with the time-varying perturbation Δθ(t), the 2-norm of which is upper bounded by δθ, the “radius” of the compact set Θ0 containing θ, as shown in Fig. 8.1. It is also worth noting that the control law (8.9) contains the same certainty-equivalence term, that is, the term −θˆx², as the classical control law (8.6); therefore, when θ is a constant, one could select ℓθ = θ, hence δθ = 0, and the control law (8.9) would reduce to the classical control law (8.6). This fact distinguishes the proposed scheme from the dynamic high-gain scheme in, e.g., [19, 31], in which the adaptive term is not a certainty-equivalence term but is used for dominance.

Remark 8.2 The control law (8.9) explicitly contains δθ, which is assumed to be known by Assumption 8.1. Such an assumption can be relaxed by introducing an “estimate” for δθ and replacing the nonlinear damping term that contains δθ with a certainty-equivalence term.
This is feasible since δθ is a constant and the control law is linearly parameterized, thus yielding a problem which can be effectively solved by a classical adaptive control scheme. See Remark 2 of [7] for a brief example.

² Some works predating [3] exploit similar ideas to avoid involving θ˙ in the analysis. For example, in [1], the unknown time-varying controller parameter in the Lyapunov function is replaced with a constant (0, as a matter of fact). In other works, one first derives a constant-parameter controller via a dominance design (instead of directly using a time-varying-parameter controller that cancels the time-varying parameter) and then estimates the constant parameter of the dominance controller, see, e.g., [19, 31].

Remark 8.3 It is worth introducing a convention to clarify the spirit in which we treat unknown quantities. If an unknown indefinite term in the time derivative of the Lyapunov function vanishes as the system parameters become constant, then this term is to be dominated by a static damping design, like the Δθ-term in this case, and we do not aim at estimating δθ, the bound of Δθ; if an unknown indefinite term is non-vanishing even when all system parameters are constant, like the ℓθ-term in this case, then this term is to be compensated by a dynamically updated “estimate,” which is θˆ in this case. The reasons for this convention are, first, that we do not want to over-extend the dimension of the closed-loop system by adding too many


dynamic estimates, and second, that we need the static damping terms to counteract fast parameter variations for better transient performance (for the same reason one can use nonlinear damping techniques even for systems with constant parameters).

Remark 8.4 Consider a classical adaptive control problem for system (8.1) in which θ is constant. The closed-loop dynamics can be described via a negative feedback loop consisting of two passive subsystems, namely,

Σ1:  x˙1 = −kx1 + x1²u1,  y1 = x1³,    Σ2:  x˙2 = γθu2,  y2 = x2,  (8.11)

where x1 = x, x2 = θˆ − θ, u1 = −y2, u2 = y1. The storage functions are S1 = (1/2)x1² and S2 = (1/(2γθ))x2², respectively. Although θˆ is called the parameter estimate by convention, it is well known that the parameter update law (8.5), in general, cannot guarantee that lim_{t→+∞} (θˆ(t) − θ) = 0. The selection of the update law is rather meant to guarantee

that the signal θˆ − θ can be used to create a passive interconnection. When θ is time varying, the dynamics of Σ2 are described by

Σ2:  x˙2 = γθu2 − θ˙,  y2 = x2,  (8.12)

which causes the loss of passivity from u2 to y2. The congelation of variables method can therefore be interpreted as selecting a new signal, θˆ − ℓθ, that can yield a passive interconnection, while maintaining the passivity of Σ1 via nonlinear damping. Combining the adaptive controller described by (8.5) and (8.9) with the open-loop system (8.1), the two interconnected passive subsystems are described by

Σ1:  x˙1 = −a(x1, t)x1 + x1²u1,  y1 = x1³,  (8.13)
Σ2:  x˙2 = γθu2,  y2 = x2,  (8.14)

where x1 = x, x2 = θˆ − ℓθ, u1 = −y2, u2 = y1, and a(x1, t) = k + (1/(2εθ))δθ + (1/2)εθδθ x1² − Δθ x1 ≥ k > 0.
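The scalar design above can be checked numerically. The sketch below is an illustration only: the parameter trajectory θ(t) = 1 + 0.5 sin 5t (so that one may take ℓθ = 1 and δθ = 0.5, consistently with Assumption 8.1) and all gains are assumptions of this example, not values from the chapter.

```python
import math

# Explicit-Euler sketch of the closed loop (8.1), (8.5), (8.9) with a
# persistently varying theta(t).  All numerical values are illustrative.
def simulate(T=20.0, dt=1e-3, k=1.0, gamma_th=2.0, eps_th=1.0, delta_th=0.5):
    x, th_hat = 1.0, 0.0
    for n in range(int(T / dt)):
        theta = 1.0 + 0.5 * math.sin(5.0 * n * dt)   # l_theta = 1, |Delta_theta| <= 0.5
        # control law (8.9): strengthened linear damping, nonlinear damping,
        # and the usual certainty-equivalence term
        u = (-(k + delta_th / (2.0 * eps_th)) * x
             - 0.5 * eps_th * delta_th * x ** 3
             - th_hat * x ** 2)
        x_new = x + dt * (theta * x ** 2 + u)
        th_hat += dt * gamma_th * x ** 3             # update law (8.5), unchanged
        x = x_new
    return x, th_hat

x_T, th_hat_T = simulate()
```

In accordance with (8.10), x(T) is numerically close to zero and θˆ remains bounded, even though θ˙ is neither small nor vanishing; setting δθ = 0 recovers the classical law (8.6).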

In the same spirit, one could apply the congelation of variables method to other parameter-based adaptive schemes to cope with time-varying parameters. For example, consider system (8.1) again, but with an adaptive I&I controller defined by the equations

u = −(k + 1/εz + (1/(2εθ))δθ)x − (εzδθ² + (1/2)εθδθ)x³ − (θˆ + β(x))x²,  (8.15)
θˆ˙ = −(∂β/∂x)((θˆ + β(x))x² + u),  (8.16)


where β(x) = (γθ/3)x³ and ε(·) > 0. Defining the error variable z = θˆ − ℓθ + β(x) in the spirit of the congelation of variables yields the closed-loop system dynamics described by

x˙ = −(k + 1/εz + (1/(2εθ))δθ)x − (εzδθ² + (1/2)εθδθ)x³ − x²(z − Δθ),  (8.17)
z˙ = −γθ x⁴(z − Δθ).  (8.18)

Let Vx(x) = (1/2)x², Vz(z) = (1/(2γθ))z², and note that their time derivatives along the solutions of (8.17) and (8.18), respectively, yield

V˙x = −(k + 1/εz + (1/(2εθ))δθ)x² − (εzδθ² + (1/2)εθδθ)x⁴ − x³(z − Δθ) ≤ −kx² − εzδθ²x⁴ + (1/4)εz x⁴z²,  (8.19)
V˙z = −x⁴z² + x⁴zΔθ ≤ −(3/4)x⁴z² + δθ²x⁴.  (8.20)

Finally, setting V(x, z) = Vx(x) + εz Vz(z) yields

V˙ = V˙x + εz V˙z ≤ −kx² − (1/2)εz x⁴z² ≤ 0.  (8.21)

This guarantees that lim_{t→+∞} x(t) = 0 and x²z ∈ L2.

Similar to Remark 8.4, the substitution of ℓθ for θ removes θ˙ from the z-dynamics; differently, however, it makes the x-subsystem and the z-subsystem finite-gain L2 stable instead of making them passive, which can be seen from (8.19) and (8.20). The nonlinear damping term proportional to x³ in (8.15) renders the loop gain of the interconnected system sufficiently small, so that the stability properties hold. It is worth comparing this result with the classical scenario (see Sect. 3.2 of [2]) in which the parameter θ is constant, and the property x²z ∈ L2 can be concluded directly from (8.18) without the presence of Δθ. This leads to two cascaded subsystems instead of two subsystems interconnected in a loop. The two adaptive schemes discussed above show the compatibility of the congelation of variables method with different adaptive schemes. This is due to the fact that the substitution of ℓθ for θ only introduces Δθ, whereas the role of ℓθ is exactly the same as that of θ in the classical constant parameter scenario, which allows using the same parameter update law. For simplicity, in what follows, we only demonstrate the congelation of variables method with the passivity-based scheme, which is also referred to as direct adaptive control or Lyapunov-based adaptive control in the literature.
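The I&I loop can be sketched numerically in the same way. The gains below follow the expressions in (8.15)–(8.16) as reconstructed above; the parameter trajectory and all numerical values (εz = εθ = 1, etc.) are illustrative assumptions, not values from the chapter.

```python
import math

# Explicit-Euler sketch of system (8.1) under the adaptive I&I controller
# (8.15)-(8.16) with the congelation of variables, beta(x) = (gamma_theta/3) x^3.
def simulate_ii(T=20.0, dt=1e-3, k=1.0, g_th=2.0, eps_z=1.0, eps_th=1.0, d_th=0.5):
    x, th_hat = 1.0, 0.0
    for n in range(int(T / dt)):
        theta = 1.0 + 0.5 * math.sin(5.0 * n * dt)   # l_theta = 1, delta_theta = 0.5
        beta = (g_th / 3.0) * x ** 3
        u = (-(k + 1.0 / eps_z + d_th / (2.0 * eps_th)) * x
             - (eps_z * d_th ** 2 + 0.5 * eps_th * d_th) * x ** 3
             - (th_hat + beta) * x ** 2)                               # (8.15)
        th_hat_dot = -g_th * x ** 2 * ((th_hat + beta) * x ** 2 + u)   # (8.16)
        x, th_hat = x + dt * (theta * x ** 2 + u), th_hat + dt * th_hat_dot
    return x, th_hat

x_T, th_hat_T = simulate_ii()
```

Consistently with (8.21), the state converges to zero and the estimate stays bounded, without θˆ converging to any "true" parameter.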


8.2.2 Parameter in the Input Path

In this subsection, we show how to extend the idea of the congelation of variables to systems with a time-varying parameter coupled with the input, commonly referred to as the high-frequency gain. To this end, consider the scalar system

x˙ = θ(t)x² + b(t)u,  (8.22)

where θ(t) ∈ R satisfies Assumption 8.1 and b(t) ∈ R satisfies Assumptions 8.1 and 8.3. In the spirit of the congelation of variables method, (8.22) can be rewritten as

x˙ = θˆx² + ū + (ℓθ − θˆ)x² + Δθ x² + Δb ςˆū − ℓb(1/ℓb − ςˆ)ū,  (8.23)

where Δb(t) = b(t) − ℓb, ςˆ is an “estimate” of 1/ℓb, and u = ςˆū. From classical adaptive control theory (see, e.g., [17]) it is known that the parameter estimation error terms in (8.23) can be cancelled by selecting the parameter update laws (8.5) and

ςˆ˙ = −γς sgn(ℓb) ū x,  (8.24)

and considering the Lyapunov function candidate V(x, θˆ, ςˆ) = (1/2)x² + (1/(2γθ))(ℓθ − θˆ)² + (|ℓb|/(2γς))(1/ℓb − ςˆ)², the time derivative of which along the trajectories of (8.23) satisfies

V˙ = θˆx³ + ūx + Δθ x³ + Δb ςˆūx.  (8.25)

¯ depends on u¯ explicitly, which means that we Note that the perturbation term b ˆ ux cannot dominate this term by simply adding damping terms to u, ¯ as doing this would ¯ non-positive also alter the perturbation term itself. Instead, we need to make b ˆ ux by designing u¯ and selecting b . Similar to the selection of θ in Sect. 8.2.1, such a selection is only made for analysis rather than implementation, i.e., b does not need to be known. Let u¯ be defined as     1 1 δ θ 1 2 2 ˆ + ( θ δθ + θˆ θ )x x = −κ(x, θˆ )x, u¯ = − k + + (8.26) 2 θ θˆ 2 with θˆ > 0 and κ(x, θˆ ) > 0. Substituting (8.26) into (8.24) yields ˙ˆ = γ sgn(b )κ x 2 . By Assumption 8.3, we only need to consider two cases. In the first case, there exists a constant b such that 0 < b ≤ b(t), for all t ≥ 0, and therefore b > 0, ˙ˆ ≥ 0, which means that any initialization of ˆ such that (0) ˆ > 0 guarantees that (t) ˆ > 0, ¯ = −b κ ˆ x 2 ≤ 0, for all t ≥ 0. Alternatively for all t ≥ 0, and therefore b ˆ ux ˆ < 0 guarantees b(t) ≤ b < 0, and therefore b < 0, ˙ˆ ≤ 0. Then selecting (0) that (t) ˆ < 0, for all t ≥ 0, and b ˆ ux ¯ ≤ 0. Recalling (8.25), (8.26), and noting that


Δb ςˆūx ≤ 0 yields

V˙ ≤ −kx² − ((1/2)εθˆθˆ²x⁴ − θˆx³ + (1/(2εθˆ))x²) − ((1/2)εθδθ x⁴ − Δθ x³ + (1/(2εθ))δθ x²) ≤ −kx² ≤ 0.  (8.27)

Exploiting the same stability argument as before, boundedness of the system trajectories and convergence of x to zero follow.

Remark 8.5 This example highlights the flexibility of the congelation of variables method: the congealed parameter ℓ(·) can be selected according to the specific usage. It can be a nominal value for a robust design, as in Sect. 8.2.1, or an “extreme” value to create sign-definiteness, as in Sect. 8.2.2, as long as the resulting perturbation Δ(·) is handled consistently. One can even make ℓ(·) a time-varying parameter subject to some of the assumptions used in the literature (e.g., ℓ˙(·) ∈ L∞, ℓ˙(·) ∈ L1, see, e.g., [21, 25]), and use the congelation of variables method to relax these assumptions. This is the reason why the proposed method is named “congelation”³ and not “freeze.”

Remark 8.6 Similar to what is discussed in Remark 8.4, the selection of ℓb makes

ςˆ − 1/ℓb a passivating input/output signal. In addition, note that the closed-loop system described by (8.23), (8.5), (8.24), and (8.26) is passive from −Δb κ ςˆx to x (see Fig. 8.2 for a schematic representation). Our selection of ℓb always guarantees that −Δb κ ςˆ is non-positive and therefore yields a negative feedback “control” (if one regards −Δb κ ςˆx as the control law), which is well known to possess an arbitrarily large gain margin, and is robust against the variation of Δb κ ςˆ. The examples discussed above are simple, yet they illustrate the core ideas put forward in the chapter, no matter whether the time-varying parameters appear in the feedback path or in the input path. These ideas allow us to proceed with more complicated scenarios.
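The input-path design can also be checked numerically. The sketch below implements (8.5), (8.24), and (8.26) for x˙ = θ(t)x² + b(t)u with b(t) ≥ ℓb > 0; the parameter trajectories, the initialization ςˆ(0) > 0, and all gains are illustrative assumptions, and κ follows the expression in (8.26) as reconstructed above.

```python
import math

# Explicit-Euler sketch of the high-frequency-gain example of Sect. 8.2.2:
# x' = theta(t) x^2 + b(t) u, with u = s_hat * u_bar, update laws (8.5), (8.24),
# and u_bar = -kappa(x, th_hat) x from (8.26).  Here sgn(l_b) = +1.
def simulate_hf(T=20.0, dt=1e-3, k=1.0, g_th=2.0, g_s=1.0,
                eps_th=1.0, eps_thh=1.0, d_th=0.5):
    x, th_hat, s_hat = 1.0, 0.0, 0.5           # s_hat(0) > 0 since b(t) > 0
    for n in range(int(T / dt)):
        t = n * dt
        theta = 1.0 + 0.5 * math.sin(5.0 * t)  # l_theta = 1, delta_theta = 0.5
        b = 2.0 + 0.5 * math.cos(3.0 * t)      # b(t) >= l_b = 1.5 > 0
        kappa = (k + 0.5 * (d_th / eps_th + 1.0 / eps_thh)
                 + 0.5 * (eps_th * d_th + eps_thh * th_hat ** 2) * x ** 2)
        u_bar = -kappa * x                     # (8.26)
        x_new = x + dt * (theta * x ** 2 + b * s_hat * u_bar)
        th_hat += dt * g_th * x ** 3           # (8.5)
        s_hat += dt * (-g_s * u_bar * x)       # (8.24) with sgn(l_b) = +1
        x = x_new
    return x, th_hat, s_hat

x_T, th_hat_T, s_hat_T = simulate_hf()
```

Since ςˆ˙ = γς κ x² ≥ 0 and ςˆ(0) > 0, the sign condition Δb ςˆūx ≤ 0 holds along the whole trajectory, and x converges to zero despite the persistently varying high-frequency gain.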

8.2.3 Preliminary Result: State-Feedback Design for Unmatched Parameters

The scalar systems considered in Sects. 8.2.1 and 8.2.2 satisfy the so-called matching condition, that is, the unknown parameter θ enters the system dynamics via the same integrator from which the input u enters. For a more general class of systems, in which the unknown parameters are separated from the input by integrators, an adaptive backstepping design [17] is needed. Consider an n-dimensional nonlinear system in the so-called parametric strict-feedback form, namely,

³ The word “congelation” is polysemous: it means both “coagulation” and “freeze/solidification” [24].


Fig. 8.2 Schematic representation of system (8.23), (8.5), and (8.24) as the interconnection of passive subsystems. Each of the subsystems in the round-rectangular blocks is passive from its input to its output. ©2021 IEEE. Reprinted, with permission, from Chen, K., Astolfi, A.: Adaptive control for systems with time-varying parameters. IEEE Transactions on Automatic Control 66(5), 1986–2001 (2021)

x˙1 = φ1ᵀ(x̄1)θ(t) + x2,
⋮
x˙i = φiᵀ(x̄i)θ(t) + xi+1,
⋮
x˙n = φnᵀ(x)θ(t) + b(t)u,  (8.28)

where i = 2, . . . , n − 1; x(t) = [x1(t), . . . , xn(t)]ᵀ ∈ Rⁿ is the state; u(t) ∈ R is the input; θ(t) ∈ R^q is the vector of unknown parameters satisfying Assumption 8.1; and b(t) ∈ R is an unknown parameter satisfying Assumptions 8.1 and 8.3. The regressors φi : Rⁱ → R^q, i = 1, . . . , n, are smooth mappings and satisfy φi(0) = 0.

Remark 8.7 By Hadamard’s lemma [27], the condition φi(0) = 0 implies that φi(x̄i) = Φ̄i(x̄i)x̄i, for some smooth mappings Φ̄i. This also means that φiᵀ(0)θ(t) = 0, allowing zero control effort at x = 0 regardless of θ(t). One can easily see that if φi(0) ≠ 0, φiᵀ(0)θ(t) becomes an unknown time-varying disturbance, yielding a disturbance rejection/attenuation problem not discussed here.

We directly give the results below and omit the derivation of the adaptive backstepping procedures.⁴ For each step i, i = 1, . . . , n, define the error variables

⁴ The classical procedures of adaptive backstepping, on which the following procedures are based, can be found in Chap. 4 of [17].


z0 = 0,  (8.29)
zi = xi − αi−1,  (8.30)

the new regressor vectors

wi(x̄i, θˆ) = φi − Σ_{j=1}^{i−1} (∂αi−1/∂xj) φj,  (8.31)

the tuning functions

τi(x̄i, θˆ) = τi−1 + wizi = Σ_{j=1}^{i} wjzj,  (8.32)

and the virtual control laws

α0 = 0,  (8.33)
αi(x̄i, θˆ) = −zi−1 − (ci + ζi)zi − wiᵀθˆ + (∂αi−1/∂θˆ)Γθτi + Σ_{j=1}^{i−1} (∂αi−1/∂xj)xj+1 + Σ_{j=2}^{i−1} (∂αj−1/∂θˆ)Γθwi zj,  i = 1, . . . , n − 1,  (8.34)
αn = ςˆᾱn,  ᾱn = −κ(x, θˆ)zn,  (8.35)

where ci > 0 are constant feedback gains; ζi(x̄i, θˆ) are nonlinear feedback gains to be defined; Γθ = Γθᵀ ≻ 0 is the adaptation gain; and κ(x, θˆ) is a positive nonlinear feedback gain to be defined, similar to the one in (8.26). To proceed with the analysis, select the control law and the parameter update laws as

u = αn,  (8.36)
θˆ˙ = Γθτn,  (8.37)
ςˆ˙ = −γς sgn(ℓb) ᾱnzn,  (8.38)

respectively, and consider the Lyapunov function candidate V(z, θˆ, ςˆ) = (1/2)|z|² + (1/2)|ℓθ − θˆ|²_{Γθ⁻¹} + (|ℓb|/(2γς))(1/ℓb − ςˆ)², where z = [z1, . . . , zn]ᵀ. Taking the time derivative of V yields


V˙ = −Σ_{i=1}^{n} (ci + ζi)zi² + znᾱn + Δ + znψ + (ℓθ − θˆ)ᵀ(Σ_{i=1}^{n−1} wizi − Γθ⁻¹θˆ˙) − ℓb(1/ℓb − ςˆ)(ᾱnzn + sgn(ℓb)ςˆ˙/γς),  (8.39)

where

Δ = Σ_{i=1}^{n−1} ziwiᵀΔθ + Δb ςˆᾱnzn,  (8.40)

ψ = zn−1 + wnᵀθˆ − Σ_{j=1}^{n−1} (∂αn−1/∂xj)xj+1 − (∂αn−1/∂θˆ)Γθτn − Σ_{j=2}^{n−1} (∂αj−1/∂θˆ)Γθwn zj.  (8.41)

Remark 8.8 Recalling Remark 8.7 and implementing (8.29)–(8.35) recursively, it is not hard to see that zi(x̄i, θˆ), wi(x̄i, θˆ), τi(x̄i, θˆ), and αi(x̄i, θˆ) are smooth and that zi(0, θˆ) = 0, wi(0, θˆ) = 0, τi(0, θˆ) = 0, αi(0, θˆ) = 0. In addition, the θˆ-dependent change of coordinates between z̄i and x̄i is smooth and invertible, and x̄i = 0 implies and is implied by z̄i = 0; thus we can directly express wi as wi = W̄i(x̄i, θˆ)z̄i, with W̄i smooth, and, similarly, ψ as ψ = ψ̄ᵀ(x, θˆ)z, with ψ̄ smooth.

The parameter estimation error terms in (8.39) are eliminated by the parameter update laws (8.37) and (8.38), and the non-positivity of Δb ςˆᾱnzn can be established in the same way as in Sect. 8.2.2, thanks to the form⁵ of ᾱn. The rest of the problem is to determine the nonlinear damping gains ζi(x̄i, θˆ) and κ(x, θˆ) to dominate the Δθ-terms.

Proposition 8.1 Consider system (8.28) and the control law (8.36) with the nonlinear damping gains

ζi(x̄i, θˆ) = (1/(2εθ))δθ(n − i + 1) + (1/2)εθδθ|W̄i|F² + 1/(2εψ̄),  (8.42)
κ(x, θˆ) = cn + ζn + (1/2)εψ̄|ψ̄|²,  (8.43)

with cn > 0 and ε(·) > 0, and the parameter update laws (8.37) and (8.38) with sgn(ςˆ(0)) = sgn(b). Then all closed-loop signals are bounded and lim_{t→+∞} x(t) = 0.

Although state feedback is, in general, not available in practice, the result presented above indicates how to combine the congelation of variables method and backstepping to cope with unmatched time-varying parameters, which is essential in the output-feedback design.

⁵ This form of ᾱn is inspired by [18], which also proposes a control law with a nonlinear negative feedback gain, albeit to achieve inverse optimality.


8.3 Output-Feedback Design

Consider now an n-dimensional system in output-feedback form with relative degree ρ described by the equations

x˙1 = x2 + φ0,1(y) + Σ_{j=1}^{q} φ1,j(y)aj(t),
⋮
x˙ρ = xρ+1 + φ0,ρ(y) + Σ_{j=1}^{q} φρ,j(y)aj(t) + bm(t)g(y)u,
⋮
x˙n = φ0,n(y) + Σ_{j=1}^{q} φn,j(y)aj(t) + b0(t)g(y)u,
y = x1,  (8.44)

or, in compact form, by the equations

x˙ = Sx + φ0(y) + Fᵀ(y, u)θ,  y = e1ᵀx,  (8.45)

where x(t) = [x1(t), . . . , xn(t)]ᵀ ∈ Rⁿ is the state, u(t) ∈ R is the input, y(t) ∈ R is the output, θ(t) = [bᵀ(t), aᵀ(t)]ᵀ is the vector of unknown time-varying parameters, a(t) = [a1(t), . . . , aq(t)]ᵀ ∈ R^q, b(t) = [bm(t), . . . , b0(t)]ᵀ ∈ R^{m+1}, m = n − ρ,

Fᵀ(y, u) = [ [0(ρ−1)×(m+1) ; Im+1] g(y)u ,  Φᵀ(y) ],  (8.46)

(Φ(y))ij = φi,j(y), and g : R → R is a smooth mapping such that g(y) ≠ 0, for all y ∈ R. In addition, θ satisfies Assumption 8.2 and, in particular, bm also satisfies Assumption 8.3. The mappings φ0,i : R → R and φi,j : R → R, i = 1, . . . , n, j = 1, . . . , q, are smooth and such that φ0,i(0) = 0 and φi,j(0) = 0.

Remark 8.9 Similar to what is discussed in Remark 8.7, there exist smooth mappings φ̄0,i and φ̄i,j such that φ0,i(y) = φ̄0,i(y)y and φi,j(y) = φ̄i,j(y)y.

8.3.1 System Reparameterization

Due to the presence of unmeasured state variables, we use Kreisselmeier filters (K-filters) [15] to reparameterize the system, in terms of the (known) filter state variables, into a new form that is favorable for the adaptive backstepping design [17]. The filters are given by the equations

ξ˙ = Ak ξ + ky + φ0(y),  (8.47)
Ξ˙ᵀ = Ak Ξᵀ + Φᵀ(y),  (8.48)
λ˙ = Ak λ + en g(y)u,  (8.49)

where Ak = S − ke1ᵀ and k ∈ Rⁿ is the vector of filter gains. These filters are equivalent, see [17], to the filters

ξ˙ = Ak ξ + ky + φ0(y),  (8.50)
Ω˙ᵀ = Ak Ωᵀ + Fᵀ(y, u),  (8.51)

where

Ωᵀ = [vm, . . . , v0, Ξᵀ],  (8.52)
vi = Ak^i λ,  i = 0, . . . , m.  (8.53)

Define now the non-implementable state estimate

xˆ = ξ + Ωᵀℓθ.  (8.54)

The state estimation error dynamics are then described by the equation

ε˙ = Ak ε + Fᵀ(y, u)Δθ = Ak ε + Φᵀ(y)Δa + [0(ρ−1)×1 ; Δb] g(y)u,  (8.55)
where ε = x − xˆ. We now show that, after using the K-filters (8.47)–(8.49) together with the congelation of variables method, the original n-dimensional system with time-varying parameters can be reparameterized as a ρ-dimensional system with constant parameters ℓθ, plus some auxiliary systems to be defined. The substitution of ℓθ for θ prevents θ˙ from appearing in the ε-dynamics. For ρ > 1, one has the problem described by the equations

y˙ = ω0 + ω̄ᵀℓθ + ε2 + ℓbm vm,2,
v˙m,i = −ki vm,1 + vm,i+1,  i = 2, . . . , ρ − 1,
v˙m,ρ = −kρ vm,1 + vm,ρ+1 + g(y)u,  (8.56)


and, for ρ = 1, one has

y˙ = ω0 + ωᵀℓθ + ε2 + ℓbm g(y)u,  (8.57)

where ω0 = φ0,1 + ξ2, ω̄ = [0, vm−1,2, . . . , v0,2, (Φᵀ(y))1 + (Ξᵀ)2]ᵀ, and ω = ω̄ + e1vm,2. Similar to the classical adaptive backstepping scheme, we consider the ρth-order system (8.56) (or (8.57) if ρ = 1) to exploit its lower triangular form; yet (8.56) and (8.57) are useful only if the estimation error ε2 converges to 0. In classical schemes this is not a problem, since there are no Δa(t) or Δb(t) terms and ε converges to 0 exponentially provided that Ak is Hurwitz. The effect of Δa can be dominated via a strengthened damping design, as proposed in [3]. However, the dominance method cannot be directly applied to (8.55), since Δb is coupled with the input u. To solve this issue, in the next section we revisit the ideas of [5] and [6] to see how we can decouple Δb and u with the help of the inverse dynamics of system (8.44).
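Before turning to the inverse dynamics, the filter construction above can be checked numerically. The sketch below runs the K-filters (8.47)–(8.49) on a small second-order example with constant parameters (n = 2, ρ = 2, m = 0, q = 1); in this constant-parameter case one may take ℓθ = θ, so that Δθ = 0 and (8.55) reduces to ε˙ = Ak ε, i.e., the error of the estimate (8.54) decays exponentially. The plant nonlinearities, the feedback u = −2y, and all numerical values are illustrative assumptions, not taken from the chapter.

```python
import math

# K-filters (8.47)-(8.49) for n = 2, rho = 2, m = 0, q = 1 with CONSTANT
# theta = [b0, a1]; the "non-implementable" estimate (8.54) becomes exact
# up to an exponentially decaying error.
def kfilter_demo(T=10.0, dt=1e-3):
    k1, k2 = 3.0, 2.0                 # A_k = S - k e_1^T, Hurwitz (eigs -1, -2)
    a1, b0 = 0.5, 1.0                 # constant "unknown" parameters
    phi = {"01": lambda y: -y, "02": lambda y: 0.0,         # phi_{0,i}
           "11": lambda y: y, "21": lambda y: math.sin(y)}  # phi_{i,1}, zero at 0
    x1, x2 = 1.0, -0.5
    xi, Xi, lam = [0.0, 0.0], [0.0, 0.0], [0.0, 0.0]        # filter states
    Ak = ((-k1, 1.0), (-k2, 0.0))
    mul = lambda v: (Ak[0][0] * v[0] + Ak[0][1] * v[1],
                     Ak[1][0] * v[0] + Ak[1][1] * v[1])
    for n in range(int(T / dt)):
        y = x1
        u = -2.0 * y                                        # any stabilizing input
        dx = (x2 + phi["01"](y) + phi["11"](y) * a1,        # plant (8.44), g = 1
              phi["02"](y) + phi["21"](y) * a1 + b0 * u)
        Axi, AXi, Alam = mul(xi), mul(Xi), mul(lam)
        dxi = (Axi[0] + k1 * y + phi["01"](y), Axi[1] + k2 * y + phi["02"](y))
        dXi = (AXi[0] + phi["11"](y), AXi[1] + phi["21"](y))
        dlam = (Alam[0], Alam[1] + u)                       # e_n g(y) u with n = 2
        x1 += dt * dx[0]
        x2 += dt * dx[1]
        xi = [a + dt * b for a, b in zip(xi, dxi)]
        Xi = [a + dt * b for a, b in zip(Xi, dXi)]
        lam = [a + dt * b for a, b in zip(lam, dlam)]
    # (8.54) with l_theta = [b0, a1]; v_0 = lam since m = 0
    xhat = (xi[0] + lam[0] * b0 + Xi[0] * a1, xi[1] + lam[1] * b0 + Xi[1] * a1)
    return math.hypot(x1 - xhat[0], x2 - xhat[1])

final_error = kfilter_demo()
```

The returned estimation error is small after T = 10, consistent with the exponential decay of ε; with time-varying a1 or b0, the residual terms of (8.55) would appear instead.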

8.3.2 Inverse Dynamics

To study the inverse dynamics of (8.44), pretend that the system is “driven” by y, φ0,i(y), φi(y), and their time derivatives. Then one could write

x2 = y^(1) − (φ1ᵀa + φ0,1),
⋮  (8.58)
xρ = y^(ρ−1) − (φ1ᵀa + φ0,1)^(ρ−2) − · · · − (φρ−1ᵀa + φ0,ρ−1).

Setting yi = φi a + φ0,i , i = 1, . . . , n and u g = g(y)u yields ug =

1 (ρ−1) (−xρ+1 + y (ρ) − y1 − · · · − yρ ). bm

(8.59)

The resulting inverse dynamics are then described by

ẋ_{ρ+1} = −(b_{m−1}/b_m) x_{ρ+1} + x_{ρ+2} + y_{ρ+1} + (b_{m−1}/b_m) (y^{(ρ)} − y_1^{(ρ−1)} − ⋯ − y_ρ),
⋮
ẋ_n = −(b_0/b_m) x_{ρ+1} + y_n + (b_0/b_m) (y^{(ρ)} − y_1^{(ρ−1)} − ⋯ − y_ρ).    (8.60)

234

K. Chen and A. Astolfi

Algorithm 8.1 Change of coordinates x_{ρ+1}, …, x_n.
Require: x_{ρ+1}, …, x_n, ẋ_{ρ+1}, …, ẋ_n.
Ensure: x̄_{ρ+1}, …, x̄_n, ẋ̄_{ρ+1}, …, ẋ̄_n.
1: while time derivatives of y appear in the expressions of ẋ_{ρ+1}, …, ẋ_n do
   ▹ This while-loop iterates ρ times, as each iteration reduces the order of y^{(ρ)} by one.
2:   for i = n → ρ + 2 do
3:     Update x̄_i and ẋ̄_i using (8.61).
4:     Rewrite x_i in terms of x̄_i in the expression of ẋ_{i−1}, and leave the feedback term −(b_{n−i}/b_m) x_{ρ+1} unchanged.
5:   end for
6:   Update x̄_{ρ+1} and ẋ̄_{ρ+1} using (8.61).
7:   Rewrite x_{ρ+1} in terms of x̄_{ρ+1} in the expressions of ẋ̄_{ρ+1}, …, ẋ̄_n, respectively.
   ▹ This brings back the time derivatives of y, y_1, …, y_ρ, but with the order reduced by one.
8:   x_{ρ+1} ← x̄_{ρ+1}, …, x_n ← x̄_n, ẋ_{ρ+1} ← ẋ̄_{ρ+1}, …, ẋ_n ← ẋ̄_n.  ▹ Update the old coordinates before the next iteration.
9: end while

©2021 IEEE. Reprinted, with permission, from Chen, K., Astolfi, A.: Adaptive control for systems with time-varying parameters. IEEE Transactions on Automatic Control 66(5), 1986–2001 (2021)

Since it is difficult to use backstepping techniques to establish convergence properties for the time derivatives of y or y_i, we need to perform a change of coordinates to remove the derivative terms from the inverse dynamics. Note that for any pair of smooth signals s_1 and s_2 the equation

s_1 s_2^{(i)} = (−1)^i s_1^{(i)} s_2 + ( Σ_{j=0}^{i−1} (−1)^j s_1^{(j)} s_2^{(i−1−j)} )^{(1)}    (8.61)

holds. With this fact, the change of coordinates

x̄_n = x_n − ( Σ_{j=0}^{ρ−1} (−1)^j (b_0/b_m)^{(j)} y^{(ρ−1−j)} + Σ_{i=1}^{ρ−1} Σ_{j=0}^{ρ−i−1} (−1)^j (b_0/b_m)^{(j)} y_i^{(ρ−i−1−j)} )    (8.62)

yields

ẋ̄_n = −(b_0/b_m) x_{ρ+1} + y_n + (−1)^ρ (b_0/b_m)^{(ρ)} y − Σ_{i=1}^{ρ} (−1)^{ρ−i} (b_0/b_m)^{(ρ−i)} y_i,    (8.63)

which does not contain time derivatives of y and yi . In the same spirit, applying the change of coordinates specified by Algorithm 8.1, we are able to remove the terms containing the time derivatives of y and yi in each equation of the inverse dynamics. The resulting inverse dynamics in the new coordinates (we use x¯i , i = ρ + 1, . . . , n


with a slight abuse of notation) are described by the equations

ẋ̄ = A_b̄(t) x̄ + b_{x̄y}(t) y + Σ_{i=1}^{n} b_{x̄φ,0,i}(t) φ_{0,i}(y) + Σ_{i=1}^{n} Σ_{j=1}^{q} b_{x̄φ,i,j}(t) φ_{i,j}(y),    (8.64)

u_g = (1/b_m(t)) ( −x̄_{ρ+1} + y^{(ρ)} + Σ_{j=0}^{ρ−1} a_{u_g y^{(j)}}(t) y^{(j)} + Σ_{i=1}^{ρ} Σ_{j=0}^{ρ−i} a_{u_g y_i^{(j)}}(t) y_i^{(j)} ),

(8.65)

where x̄(t) = [x̄_{ρ+1}(t), …, x̄_n(t)]^⊤ ∈ R^m, A_b̄ = S − b̄ e_1^⊤, and b̄(t) = (1/b_m(t)) [b_{m−1}(t), …, b_0(t)]^⊤ ∈ R^m.
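The removal of derivative terms above rests on repeated application of identity (8.61). As a quick numerical sanity check (an illustration, not part of the chapter), the identity can be verified for polynomial signals, whose derivatives are exact; the case i = 2 is shown:

```python
import numpy as np
from numpy.polynomial import Polynomial as Poly

# two "smooth signals" modeled as polynomials in t, so derivatives are exact
s1 = Poly([1.0, -2.0, 0.5, 3.0])
s2 = Poly([2.0, 1.0, -1.0, 0.0, 4.0])

i = 2  # order of the derivative on the left-hand side of (8.61)
lhs = s1 * s2.deriv(i)
# inner sum of (8.61): sum_{j=0}^{i-1} (-1)^j s1^{(j)} s2^{(i-1-j)}
inner = sum((-1) ** j * s1.deriv(j) * s2.deriv(i - 1 - j) for j in range(i))
rhs = (-1) ** i * s1.deriv(i) * s2 + inner.deriv(1)

ts = np.linspace(0.0, 1.0, 9)
print(np.allclose(lhs(ts), rhs(ts)))  # True: both sides agree pointwise
```

The same check passes for any derivative order i, which is what Algorithm 8.1 relies on when it trades a derivative of y for a derivative of the coefficient ratio.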

Remark 8.10 The time-varying vectors b_{x̄y}, b_{x̄φ,0,i}, b_{x̄φ,i,j} and the time-varying scalars a_{u_g y^{(j)}}, a_{u_g y_i^{(j)}} are unknown, as they depend on the unknown θ. However, as a consequence of Assumption 8.2, they are bounded.

Assumption 8.4 The time-varying system (8.44) has a strong minimum-phase property, in the sense that the inverse dynamics (8.64) are input-to-state stable (ISS) with respect to the inputs y, φ_{0,i}(y), φ_{i,j}(y), i = 1, …, n, j = 1, …, q. Moreover, there exists an ISS Lyapunov function V_x̄, with γ_x̄ |x̄|² ≤ V_x̄(x̄, t) ≤ γ̄_x̄ |x̄|², 0 < γ_x̄ ≤ γ̄_x̄, such that the time derivative of V_x̄ along the trajectories of the inverse dynamics satisfies the inequality

V̇_x̄ ≤ −|x̄|² + σ_{x̄y} y² + σ_{x̄φ_0} |φ_0(y)|² + σ_{x̄Φ} |Φ(y)|²_F,    (8.66)

for some constants σ_(·) > 0.

Remark 8.11 Assumption 8.4 is verified if x̄ = 0 is a globally exponentially stable equilibrium of the zero dynamics described by ẋ̄ = A_b̄(t) x̄, see, e.g., Lemma 4.6 in [14]. Some works (e.g., [30] and [23]) exploit this exponential stability property as a substitute for the classical minimum-phase assumption. Note, finally, that Assumption 8.4 is not more restrictive than the classical minimum-phase assumption, because for time-invariant systems Assumption 8.4 reduces to minimum-phaseness.

8.3.3 Filter Design

Consider now the state estimation error dynamics (8.55) with u_g given by (8.65), which yields


ε̇ = A_k ε + Φ^⊤(y) a_Δ + [0_{(ρ−1)×1} ; 1] (b_Δ/b_m) ( −x̄_{ρ+1} + y^{(ρ)} + Σ_{j=0}^{ρ−1} a_{u_g y^{(j)}}(t) y^{(j)} + Σ_{i=1}^{ρ} Σ_{j=0}^{ρ−i} a_{u_g y_i^{(j)}}(t) y_i^{(j)} ).    (8.67)

Similar to what is done in Sect. 8.3.2, we need a change of coordinates to remove the time derivative terms brought in by u_g. Implementing a change of coordinates in the same spirit as Algorithm 8.1, the state estimation error dynamics in the new coordinates ε̄ are described by the equations

ε̄̇ = A_k ε̄ − Θ̄_b x̄_{ρ+1} + b_{ε̄y}(t) y + Σ_{i=1}^{n} b_{ε̄φ,0,i}(t) φ_{0,i}(y) + Σ_{i=1}^{n} Σ_{j=1}^{q} b_{ε̄φ,i,j}(t) φ_{i,j}(y),    (8.68)

where Θ̄_b = (1/b_m) [0_{1×(ρ−1)}, b_Δ^⊤]^⊤.

Remark 8.12 The time derivative terms are injected into the ε-dynamics via the vector of gains b_Δ. Similar to Remark 8.10, the time-varying vectors Θ̄_b, b_{ε̄y}, b_{ε̄φ,0,i}, b_{ε̄φ,i,j} are unknown, yet bounded, due to Assumption 8.2. We will see that, as long as these parameters are bounded, they do not affect the controller design. In particular, when b is constant, b_Δ(t) = 0 for all t ≥ 0 (provided b_ℓ = b), and thus Θ̄_b, b_{ε̄y}, b_{ε̄φ,0,i}, b_{ε̄φ,i,j} are all identically 0 and ε̄ = ε, which yields ε̇ = A_k ε + Φ^⊤(y) a_Δ, a simplified case that has been dealt with in [3].

Similar to the description of the ISS inverse dynamics, we want the state estimation error dynamics to be ISS; in this case, however, rather than assuming this property, we can guarantee it by design of the K-filters.

Proposition 8.2 The state estimation error dynamics are ISS with respect to the inputs x̄_{ρ+1}, y, φ_{0,i}(y), φ_{i,j}(y), i = 1, …, n, j = 1, …, q, if the vector of filter gains is given by k = (1/2) X_ε̄ e_1, with X_ε̄ = X_ε̄^⊤ ≻ 0 satisfying the Riccati inequality⁶

S X_ε̄ + X_ε̄ S^⊤ − X_ε̄ (e_1 e_1^⊤ − γ_ε̄^{−1} I) X_ε̄ + Q_ε̄ ≺ 0,    (8.69)

where

Q_ε̄ = ( ℓ_{Θ̄_b} δ_{Θ̄_b} + ℓ_{b_{ε̄y}} δ_{b_{ε̄y}} + Σ_{i=1}^{n} ℓ_{b_{ε̄φ,0,i}} δ_{b_{ε̄φ,0,i}} + Σ_{i=1}^{n} Σ_{j=1}^{q} ℓ_{b_{ε̄φ,i,j}} δ_{b_{ε̄φ,i,j}} ) I.    (8.70)

Moreover, there exists an ISS Lyapunov function V_ε̄ = γ_ε̄ |ε̄|²_{P_ε̄}, with P_ε̄ = X_ε̄^{−1}, such that the time derivative of V_ε̄ along the trajectories of the state estimation error dynamics satisfies the inequality

⁶ The solvability of (8.69) has been discussed in [3].

V̇_ε̄ ≤ −|ε̄|² + ℓ_{b_{ε̄y}} δ_{b_{ε̄y}} y² + Σ_{i=1}^{n} ℓ_{b_{ε̄φ,0,i}} δ_{b_{ε̄φ,0,i}} φ_{0,i}²(y) + Σ_{i=1}^{n} Σ_{j=1}^{q} ℓ_{b_{ε̄φ,i,j}} δ_{b_{ε̄φ,i,j}} φ_{i,j}²(y) + ℓ_{Θ̄_b} δ_{Θ̄_b} x̄²_{ρ+1},    (8.71)

where ℓ_(·) > 0, or, in a more compact (yet more conservative) form,

V̇_ε̄ ≤ −|ε̄|² + σ_{ε̄y} y² + σ_{ε̄φ_0} |φ_0(y)|² + σ_{ε̄Φ} |Φ(y)|²_F + σ_{ε̄x̄_{ρ+1}} x̄²_{ρ+1},    (8.72)

for some constants σ_(·) > 0.

8.3.4 Controller Design

In Sects. 8.3.2 and 8.3.3, we have established the ISS of the inverse dynamics and of the state estimation error dynamics. However, before proceeding to the design of the controller, we have to consider (8.56) in the new coordinates. Note that ε_2 can be written as

ε_2 = ε̄_2 + a_{ε_2 y^{(1)}}(t) ẏ + Y_{ε_2}(y),    (8.73)

where a_{ε_2 y^{(1)}}(t) = b_{mΔ}(t)/b_m(t) and

Y_{ε_2}(y) = a_{ε_2 y}(t) y + Σ_{i=1}^{n} a_{ε_2 φ,0,i}(t) φ_{0,i}(y) + Σ_{i=1}^{n} Σ_{j=1}^{q} a_{ε_2 φ,i,j}(t) φ_{i,j}(y).

Two special cases, in which either ρ = 1, or ρ ≥ 2 and b_m is constant (and therefore a_{ε_2 y^{(1)}}(t) = 0 for all t ≥ 0), have been discussed in [5]. In general, a_{ε_2 y^{(1)}}(t) ≠ 0 and, as a result, ε_2 contains ẏ. Substituting (8.73) into the first equation of (8.56) yields

(1 − a_{ε_2 y^{(1)}}) ẏ = ω_0 + ω̄^⊤ θ_ℓ + b_{m,ℓ} v_{m,2} + ε̄_2 + Y_{ε_2}.    (8.74)

Noting that 1/(1 − a_{ε_2 y^{(1)}}) = b_m/(b_m − b_{mΔ}) = b_m/b_{m,ℓ}, we can write the dynamics of y as

ẏ = (b_m(t)/b_{m,ℓ}) (ω_0 + Y_{ε_2} + ε̄_2) + (b_m(t)/b_{m,ℓ}) ω̄^⊤ θ_ℓ + b_m(t) v_{m,2}.    (8.75)

Observe that the effect of the a_{ε_2 y^{(1)}}(t) ẏ term is to bring the time-varying parameters back into the dynamics of y, which requires the congelation of variables method again. To do this, we first need to augment system (8.56) with the ξ-, Ξ-, and v-dynamics, which are not needed in the classical constant-parameter scenario but are necessary in the current setup. It turns out that the extended system is in the so-called parametric block-strict-feedback form [17], described by the equations

ξ̇ = A_k ξ + k y + φ_0(y),    (8.76)


Ξ̇ = A_k Ξ + Φ(y),    (8.77)

ẏ = (b_m(t)/b_{m,ℓ}) (ω_0 + Y_{ε_2} + ε̄_2) + (b_m(t)/b_{m,ℓ}) ω̄^⊤ θ_ℓ + b_m(t) v_{m,2},    (8.78)

v̇_{0,2} = v_{1,2},
⋮
v̇_{m−1,2} = v_{m,2},
v̇_{m,2} = −k_1 v_{m,1} + v_{m,3},
⋮
v̇_{m,ρ−1} = −k_{ρ−1} v_{m,1} + v_{m,ρ},
v̇_{m,ρ} = −k_ρ v_{m,1} + v_{m,ρ+1} + g(y)u.    (8.79)

In these equations, (8.76) and (8.77) describe the state evolution of the filters of the regressors; Eq. (8.79) gives the integrator-chain structure used for backstepping; and Eq. (8.78) is the key part of the design, as it contains the dynamics of the output y. Recall that ω_0 = φ_{0,1} + ξ_2 and ω̄ = [0, v_{m−1,2}, …, v_{0,2}, (Φ)_1 + (Ξ)_2]^⊤. The congelation of variables method requires an ISS-like property of the state variables coupled with the time-varying parameters. It turns out that we need to first establish ISS properties for (8.76) and (8.77), and for the zero dynamics of (8.78), before developing the backstepping design. For the subsystems described by (8.76) and (8.77), we have the following result.

Lemma 8.1 Let the filter gain k be as in Proposition 8.2. Then system (8.76) is ISS with respect to the inputs y, φ_{0,i}(y), and system (8.77) is ISS with respect to the inputs φ_{i,j}(y), where i = 1, …, n, j = 1, …, q. Moreover, there exist two ISS Lyapunov functions V_ξ = |ξ|²_{P_ξ} and V_Ξ = tr(Ξ^⊤ P_Ξ Ξ), with P_ξ = P_Ξ = γ_ε̄ P_ε̄ ≻ 0, such that the time derivative of V_ξ along the trajectories of (8.76) satisfies

V̇_ξ ≤ −|ξ|² + σ_{ξy} y² + σ_{ξφ_0} |φ_0(y)|²,    (8.80)

and the time derivative of V_Ξ along the trajectories of (8.77) satisfies

V̇_Ξ ≤ −|Ξ|²_F + σ_{ΞΦ} |Φ(y)|²_F,    (8.81)

for some constants σ_(·) > 0. The remaining work is to investigate whether ISS holds for the inverse dynamics of (8.78). To do this, first let


v_{m,2} = (1/b_m) ẏ − (1/b_{m,ℓ}) (ω_0 + Y_{ε_2} + ε̄_2) − (1/b_{m,ℓ}) ω̄^⊤ θ_ℓ
        = −(b_{m−1,ℓ}/b_{m,ℓ}) v_{m−1,2} − ⋯ − (b_{0,ℓ}/b_{m,ℓ}) v_{0,2} + (1/b_m) ẏ − (1/b_{m,ℓ}) ((Φ)_1 + (Ξ)_2) a_ℓ − (1/b_{m,ℓ}) (ω_0 + Y_{ε_2} + ε̄_2)    (8.82)

and then define the change of coordinates v̄_{0,2} = v_{0,2}, …, v̄_{m−2,2} = v_{m−2,2}, v̄_{m−1,2} = v_{m−1,2} − (1/b_m) y. The inverse dynamics of (8.78) are then described by

v̄̇ = A_{b̄_ℓ} v̄ + g_v̄(y, ξ, Ξ, ε̄_2, t),    (8.83)

where A_{b̄_ℓ} = S − e_m b̄_ℓ^⊤, b̄_ℓ = [b_{0,ℓ}/b_{m,ℓ}, …, b_{m−1,ℓ}/b_{m,ℓ}]^⊤, and

g_v̄(y, ξ, Ξ, ε̄_2, t) = [0, …, 0, (1/b_m) y, −((b_{m−1,ℓ}/b_{m,ℓ}) + (1/b_m)^{(1)}) y − (1/b_{m,ℓ}) ((Φ)_1 + (Ξ)_2) a_ℓ − (1/b_{m,ℓ}) (ω_0 + Y_{ε_2} + ε̄_2)]^⊤.

Exploiting the flexibility of the congelation of variables method, we can always select b_ℓ so as to obtain a Hurwitz A_{b̄_ℓ}, and therefore ISS of system (8.83) can be established, as shown in the lemma that follows.

Lemma 8.2 Suppose b_ℓ = [b_{m,ℓ}, …, b_{0,ℓ}]^⊤ is such that the polynomial b_{m,ℓ} s^m + b_{m−1,ℓ} s^{m−1} + ⋯ + b_{0,ℓ} is Hurwitz. Then system (8.83) is ISS with respect to the inputs y, φ_{0,i}(y), φ_{i,j}(y), ξ_2, (Ξ)_{j,2}, and ε̄_2, where i = 1, …, n, j = 1, …, q. Moreover, there is an ISS Lyapunov function V_v̄ = |v̄|²_{P_v̄}, with P_v̄ = P_v̄^⊤ ≻ 0, such that the time derivative of V_v̄ along the trajectories of (8.83) satisfies

V̇_v̄ ≤ −|v̄|² + σ_{v̄y} y² + σ_{v̄φ_0} |φ_0(y)|² + σ_{v̄Φ} |Φ(y)|²_F + σ_{v̄ξ_2} ξ_2² + σ_{v̄(Ξ)_2} |(Ξ)_2|² + σ_{v̄ε̄_2} ε̄_2²,    (8.84)

where the σ_(·) > 0 are constant.

Having established the ISS properties of (8.76), (8.77), and of the zero dynamics of (8.78), we proceed to the backstepping design on the chain of integrators (8.79). Define the error variables

z_1 = y,    (8.85)
z_i = v_{m,i} − α_{i−1},  i = 2, …, ρ,    (8.86)

the tuning functions

τ_1 = (ω − ϱ̂ ᾱ_1 e_1) z_1,    (8.87)
τ_i = τ_{i−1} − (∂α_{i−1}/∂y) ω z_i,  i = 2, …, ρ,    (8.88)

the virtual control laws


α_1 = ϱ̂ ᾱ_1,  ᾱ_1 = −κ z_1,    (8.89)
α_2 = −b̂_m z_1 − (c_2 + ζ_2) z_2 + β_2 + (∂α_1/∂θ̂) Γ_θ τ_2,    (8.90)
α_i = −z_{i−1} − (c_i + ζ_i) z_i + β_i + (∂α_{i−1}/∂θ̂) Γ_θ τ_i − Σ_{j=2}^{i−1} (∂α_{j−1}/∂θ̂) Γ_θ (∂α_{i−1}/∂y) ω z_j,  i = 3, …, ρ,    (8.91)

with

κ = c_1 + |θ̂|²/2 + ζ̂_y + ζ̂_{φ_0} |φ̄_0(y)|² + ζ̂_Φ |Φ̄(y)|²_F,    (8.92)
ζ_2 = (ρ δ_{b_{mΔ}}/2) (1/b_{m,ℓ} + ϱ̂²) + (1/2) (∂α_1/∂y)² ( ℓ_{b_{mΔ}} δ_{b_{mΔ}} (ϱ̂² κ² + 1) + ℓ_{θ̄_Δ} δ_{θ̄_Δ} + ℓ_{Y_{ε_2}} + ℓ_{ε̄_2} ),    (8.93)
ζ_i = (1/2) (∂α_{i−1}/∂y)² ( ℓ_{b_{mΔ}} δ_{b_{mΔ}} (ϱ̂² κ² + 1) + ℓ_{θ̄_Δ} δ_{θ̄_Δ} + ℓ_{Y_{ε_2}} + ℓ_{ε̄_2} ),  i = 3, …, ρ,    (8.94)
β_i = k_i v_{m,1} + (∂α_{i−1}/∂y)(ω_0 + ω^⊤ θ̂) + (∂α_{i−1}/∂ξ)(A_k ξ + k y + φ_0) + Σ_{j=1}^{q} (∂α_{i−1}/∂(Ξ)_j)(A_k (Ξ)_j + (Φ)_j) + Σ_{j=1}^{m+i−1} (∂α_{i−1}/∂λ_j)(−k_j λ_1 + λ_{j+1}) + (∂α_{i−1}/∂ϱ̂) ϱ̂̇ + (∂α_{i−1}/∂ζ̂_y) ζ̂̇_y + (∂α_{i−1}/∂ζ̂_{φ_0}) ζ̂̇_{φ_0} + (∂α_{i−1}/∂ζ̂_Φ) ζ̂̇_Φ,  i = 2, …, ρ,    (8.95)

ˆ + ∂ ˆ ∂ ζˆy ∂ ζˆφ0 ∂ ζˆ

the control law u=

1 (αρ − vm,ρ+1 ), g(y)

(8.96)

and the parameter update laws

˙ˆ = γ sgn(bm )κz 12 , ζ˙ˆy = γζ y z 12 , ζ˙ˆφ0 = γζφ0 |φ0 |2 , ζ˙ˆ = γζ ||2F , θ˙ˆ =  τ , θ ρ

(8.97) (8.98) (8.99)

where ci > 0, i = 1, . . . , ρ, (·) > 0, γ(·) > 0, θ = θ  0, θ¯ (t) = bmb(t) θ , and m ¯ θ¯ = θ¯ (t) − θ . In the definition of κ, φ¯ 0 (y), (y) are defined such that φ0 (y) = ¯ φ¯ 0 (y)y, (y) = (y)y, which is feasible due to Remark 8.9. Moreover, the initial value of the parameter estimates is selected such that (0) ˆ > 0, ζˆ(·) (0) > 0.
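The design above presumes, per Lemma 8.2, a choice of b_ℓ = [b_{m,ℓ}, …, b_{0,ℓ}]^⊤ whose polynomial b_{m,ℓ} s^m + ⋯ + b_{0,ℓ} is Hurwitz. Numerically this is a one-line root test; the helper below is an illustration, not from the chapter:

```python
import numpy as np

def is_hurwitz(coeffs):
    """coeffs = [b_m, b_{m-1}, ..., b_0]; True iff all roots satisfy Re(s) < 0."""
    return bool(np.all(np.roots(coeffs).real < 0))

print(is_hurwitz([1.0, 3.0, 3.0, 1.0]))  # True: (s + 1)^3
print(is_hurwitz([1.0, 0.0, 1.0]))       # False: roots on the imaginary axis
```

Any candidate b_ℓ failing this test can simply be rescaled or re-chosen, which is the flexibility exploited before Lemma 8.2.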


Fig. 8.3 A schematic representation of the closed-loop system as the interconnection of the z, x̄, ε̄, ξ, Ξ, and v̄ subsystems. By convention, an active node is denoted by a green solid circle, and an inactive node is denoted by a red dashed circle

Proposition 8.3 Consider the adaptive controller described by Eqs. (8.85)–(8.99) for the system described by Eqs. (8.76)–(8.79). Then the closed-loop signals z, x̄, ε̄, ξ, Ξ, v̄, ϱ̂, θ̂, and ζ̂_(·) are bounded.

Remark 8.13 We use dynamically updated “estimates” ζ̂_(·) as the coefficients of the additional damping terms, following the convention mentioned in Remark 8.3, since the required damping coefficients are, in general, hard to compute and do not vanish even when all system parameters are constant. Meanwhile, we do not need to know δ_{θ̄_Δ}, as it can be dominated by these adaptive damping terms with the help of the balancing constant ℓ_{θ̄_Δ}. Another advantage provided by the dynamic estimates ζ̂_(·) is that the L₂ gains of the input–output maps of the z-subsystem with outputs y, φ_0(y), and Φ(y) are arbitrarily and adaptively adjustable. In [9], such a subsystem in a network system is called an active node, and if each directed cycle of the network contains at least one active node, the overall dissipation inequality can be made negative by adjusting the L₂ gains. Since the z-subsystem is contained in each directed cycle, as shown in Fig. 8.3, one could prove Proposition 8.3 with some network analysis. This serves as an alternative interpretation of the proof in [8].

We should not forget that the invariance-like analysis of asymptotic output regulation requires boundedness of ε. In Proposition 8.3, we have established boundedness of ε̄ after the change of coordinates described in Algorithm 8.1. However, it is not easy to directly infer boundedness of ε, since Algorithm 8.1 involves the time derivatives of y, φ_{0,i}(y), and φ_{i,j}(y), i = 1, …, n, j = 1, …, q, boundedness of which is difficult to establish. Recall that these time derivatives are present because u has to be decomposed at the design stage with the help of the inverse dynamics. Now that the design is complete, it is more convenient to use boundedness of u directly to conclude boundedness of ε, which leads to the following proposition.

Proposition 8.4 Consider the system described by Eqs. (8.76)–(8.79) and the adaptive controller described by Eqs. (8.85)–(8.99). Then all closed-loop signals are bounded and lim_{t→+∞} y(t) = 0, that is, asymptotic output regulation to 0 is achieved.


In addition, using the fact that lim_{t→+∞} z(t) = 0, one can establish that lim_{t→+∞} ξ(t) = 0, lim_{t→+∞} Ξ(t) = 0, lim_{t→+∞} λ(t) = 0, lim_{t→+∞} ε(t) = 0, and lim_{t→+∞} x(t) = 0, by exploiting the converging-input converging-output property of the corresponding subsystems or the dependency on converging signals.

8.4 Simulations

To compare the proposed controller with the classical adaptive controller, consider the nonlinear system described by the equations

ẋ_1 = a_1(t) x_1² + x_2,
ẋ_2 = a_2(t) x_1² + x_3 + b_1(t) u,
ẋ_3 = a_3(t) x_1² + b_0(t) u,
y = x_1,    (8.100)

where b_1 is a periodic signal switching between 0.6 and 1.4 at frequency 8 rad/s; b_0 is a periodic signal switching between 4 and 6 at frequency 15 rad/s; and a is defined as

a(t) = [2, 3, 1]^⊤ − 10 sgn((∂α_1/∂y) z_2) (ω̄)_{3:5} / |(ω̄)_{3:5}|,    (8.101)

with (ω̄)_{3:5} = [(ω̄)_3, (ω̄)_4, (ω̄)_5]^⊤. Each of these parameters comprises a constant nominal part and a time-varying part (a is also state dependent) designed to “destabilize” the system. It is not difficult to verify that Assumption 8.4 is satisfied, since b_0/b_1 ≥ 4/1.4 > 0. Consider now two controllers: Controller 1 is the classical adaptive backstepping controller, and Controller 2 is the controller proposed in this chapter. For a fair comparison, set the common controller parameters to c_1 = c_2 = 1, Γ_θ = I, γ = 1, and the initial conditions to θ̂(0) = 0, ϱ̂(0) = 1 for both controllers. Each controller uses an identical set of K-filters given by (8.47)–(8.49). The filter gains are obtained by solving the algebraic Riccati equation (8.69) with Q_ε̄ = 10 I and γ_ε̄ = 100, and the filter states are initialized to 0. For the parameters used only in Controller 2, set γ_(·) = 1, ℓ_(·) = 1, δ_{b_{mΔ}} = 0.2, ℓ_{θ̄_Δ} δ_{θ̄_Δ} = 1 (note that one does not need to know δ_{θ̄_Δ}, as mentioned in Remark 8.13), and set the initial conditions to ζ̂_y(0) = 2, ζ̂_Φ(0) = 1 (nonzero initial conditions provide additional damping from the beginning to counteract the parameter variations). The initial condition for the system state is x(0) = [1, 0, 0]^⊤.

Two scenarios are explored. In the first scenario, each controller is applied to a separate yet identical system, while the state-dependent time-varying parameters of both systems are generated by the closed-loop system controlled by Controller 1; the second scenario has the same setting, except that the state-dependent time-varying parameters are generated by the closed-loop system controlled by Controller 2. In both scenarios, the “Baseline” results describe the responses of the closed-loop system with constant nominal parameters controlled by Controller 1, which demonstrate the performance of the classical controller in the case of constant parameters.

Fig. 8.4 Scenario 1: time-varying parameters generated by the closed-loop system controlled by Controller 1

Fig. 8.5 Scenario 1: time histories of the system state and control effort driven by different controllers and the parameters shown in Fig. 8.4

Fig. 8.6 Scenario 2: time-varying parameters generated by the closed-loop system controlled by Controller 2

Fig. 8.7 Scenario 2: time histories of the system state and control effort driven by different controllers and the parameters shown in Fig. 8.6

The responses of the system state variables in the two scenarios are plotted in Figs. 8.5 and 8.7, respectively, and the parameters used in each scenario are shown in Figs. 8.4 and 8.6, respectively. These results show that the proposed controller (Controller 2) outperforms the classical controller (Controller 1) in the presence of time-varying parameters and effectively prevents the oscillations caused by parameter variations. Note that the parameter variations shown in Figs. 8.4 and 8.6 contain discontinuities and therefore do not satisfy Assumption 8.2; the proposed controller proves to be effective also in this operating condition.
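The switching input-path parameters b_1(t) and b_0(t) described above can be reproduced as square waves; the sketch below is illustrative (the phase of the published simulations is not specified, so the sign convention here is an assumption):

```python
import numpy as np

t = np.linspace(0.0, 5.0, 5001)
# b1 toggles between 0.6 and 1.4 at angular frequency 8 rad/s,
# b0 toggles between 4 and 6 at angular frequency 15 rad/s
b1 = np.where(np.sin(8.0 * t) >= 0.0, 1.4, 0.6)
b0 = np.where(np.sin(15.0 * t) >= 0.0, 6.0, 4.0)

# check of the bound used to verify Assumption 8.4: b0/b1 >= 4/1.4 > 0
print(float(np.min(b0 / b1)))
```

The printed minimum is never below 4/1.4 ≈ 2.857, which is the worst-case ratio quoted in the text.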

8.5 Conclusions

This chapter surveys a new adaptive control scheme, based on the so-called congelation of variables method, developed to cope with time-varying parameters. To illustrate the main ideas, several examples with full-state feedback are considered, including scalar systems with time-varying parameters in the feedback path and in the input path, and n-dimensional systems with unmatched time-varying parameters. The output regulation problem for a more general class of nonlinear systems, to which the previous results are not directly applicable due to the coupling between the input and the time-varying perturbation, is then discussed. To solve this problem, ISS of the inverse dynamics, a counterpart of minimum-phaseness in classical adaptive control schemes, is exploited to convert the coupling between the input and the time-varying perturbation into a coupling between the output and the time-varying perturbation. A set of K-filters that guarantees ISS state estimation error dynamics is also designed to replace the unmeasured state variables. Finally, a controller with adaptively updated damping terms is designed to guarantee convergence of the output to zero and boundedness of all closed-loop signals, via a small-gain-like analysis. The simulation results show the performance improvement resulting from the use of the proposed controller, compared with the classical adaptive controller, in the presence of time-varying parameters.

References 1. Annaswamy, A.M., Narendra, K.S.: Adaptive control of simple time-varying systems. In: Proceedings of the 28th IEEE Conference on Decision and Control, pp. 1014–1018. IEEE (1989) 2. Astolfi, A., Karagiannis, D., Ortega, R.: Nonlinear and Adaptive Control with Applications. Springer Science & Business Media, Berlin (2007) 3. Chen, K., Astolfi, A.: Adaptive control of linear systems with time-varying parameters. In: Proceedings of the 2018 American Control Conference, pp. 80–85. IEEE (2018)


4. Chen, K., Astolfi, A.: I&I adaptive control for systems with varying parameters. In: Proceedings of the 57th IEEE Conference on Decision and Control, pp. 2205–2210. IEEE (2018) 5. Chen, K., Astolfi, A.: Output-feedback adaptive control for systems with time-varying parameters. IFAC-PapersOnLine 52(16), 586–591 (2019) 6. Chen, K., Astolfi, A.: Output-feedback I&I adaptive control for linear systems with timevarying parameters. In: Proceedings of the 58th IEEE Conference on Decision and Control, pp. 1965–1970. IEEE (2019) 7. Chen, K., Astolfi, A.: Adaptive control for nonlinear systems with time-varying parameters and control coefficient. IFAC-PapersOnLine 53(2), 3829–3834 (2020) 8. Chen, K., Astolfi, A.: Adaptive control for systems with time-varying parameters. IEEE Trans. Autom. Control 66(5), 1986–2001 (2021) 9. Chen, K., Astolfi, A.: On the active nodes of network systems. In: Proceedings 59th IEEE Conference on Decision and Control, pp. 5561–5566. IEEE (2020) 10. Goodwin, G., Ramadge, P., Caines, P.: Discrete-time multivariable adaptive control. IEEE Trans. Autom. Control 25(3), 449–456 (1980) 11. Goodwin, G.C., Mayne, D.Q.: A parameter estimation perspective of continuous time model reference adaptive control. Automatica 23(1), 57–70 (1987) 12. Goodwin, G.C., Teoh, E.K.: Adaptive control of a class of linear time varying systems. IFAC Proceedings Volumes 16(9), 1–6 (1983) 13. Ioannou, P.A., Sun, J.: Robust Adaptive Control. PTR Prentice-Hall Upper Saddle River, NJ (1996) 14. Khalil, H.K.: Nonlinear Systems. Prentice Hall (2002) 15. Kreisselmeier, G.: Adaptive observers with exponential rate of convergence. IEEE Trans. Autom. Control 22(1), 2–8 (1977) 16. Kreisselmeier, G.: Adaptive control of a class of slowly time-varying plants. Syst. Control Lett. 8(2), 97–103 (1986) 17. Krstic, M., Kokotovic, P.V., Kanellakopoulos, I.: Nonlinear and Adaptive Control Design, 1st edn. Wiley, New York, NY, USA (1995) 18. 
Li, Z., Krstic, M.: Optimal-design of adaptive tracking controllers for nonlinear-systems. Automatica 33(8), 1459–1473 (1997) 19. Lin, W., Qian, C.: Adaptive control of nonlinearly parameterized systems: the smooth feedback case. IEEE Trans. Autom. Control 47(8), 1249–1266 (2002) 20. Marino, R., Tomei, P.: Global adaptive output-feedback control of nonlinear systems. I. Linear parameterization. IEEE Trans. Autom. Control 38(1), 17–32 (1993) 21. Marino, R., Tomei, P.: An adaptive output feedback control for a class of nonlinear systems with time-varying parameters. IEEE Trans. Autom. Control 44(11), 2190–2194 (1999) 22. Marino, R., Tomei, P.: Robust adaptive regulation of linear time-varying systems. IEEE Trans. Autom. Control 45(7), 1301–1311 (2000) 23. Marino, R., Tomei, P.: Adaptive control of linear time-varying systems. Automatica 39(4), 651–659 (2003) 24. Merriam-Webster Staff: Merriam-Webster’s Collegiate Dictionary. Merriam-Webster (2004) 25. Middleton, R.H., Goodwin, G.C.: Adaptive control of time-varying linear systems. IEEE Trans. Autom. Control 33(2), 150–155 (1988) 26. Narendra, K.S., Annaswamy, A.M.: Stable Adaptive Systems. Prentice Hall (1989) 27. Nestruev, J.: Smooth Manifolds and Observables. Springer Science & Business Media (2006) 28. Pomet, J.B., Praly, L.: Adaptive nonlinear regulation: Estimation from the Lyapunov equation. IEEE Trans. Autom. Control 37(6), 729–740 (1992) 29. Tao, G.: Adaptive Control Design and Analysis. Wiley (2003) 30. Tsakalis, K., Ioannou, P.A.: Adaptive control of linear time-varying plants. Automatica 23(4), 459–468 (1987) 31. Wang, C., Guo, L.: Adaptive cooperative tracking control for a class of nonlinear time-varying multi-agent systems. J. Frankl. Inst. 354(15), 6766–6782 (2017) 32. Zhang, Y., Fidan, B., Ioannou, P.A.: Backstepping control of linear time-varying systems with known and unknown parameters. IEEE Trans. Autom. Control 48(11), 1908–1925 (2003)


33. Zhang, Y., Ioannou, P.A.: Adaptive control of linear time varying systems. In: Proceedings of the 35th IEEE Conference on Decision and Control, vol. 1, pp. 837–842. IEEE (1996) 34. Zhou, J., Wen, C.: Adaptive Backstepping Control of Uncertain Systems: Nonsmooth Nonlinearities, Interactions or Time-Variations. Springer (2008)

Chapter 9

Robust Reinforcement Learning for Stochastic Linear Quadratic Control with Multiplicative Noise

Bo Pang and Zhong-Ping Jiang

Dedicated to Laurent Praly, a beautiful mind

Abstract This chapter studies the robustness of reinforcement learning for discrete-time linear stochastic systems with multiplicative noise evolving in continuous state and action spaces. As one of the popular methods in reinforcement learning, the robustness of policy iteration is a longstanding open problem for the stochastic linear quadratic regulator (LQR) problem with multiplicative noise. A solution in the spirit of input-to-state stability is given, guaranteeing that the solutions of the policy iteration algorithm are bounded and enter a small neighborhood of the optimal solution whenever the error in each iteration is bounded and small. In addition, a novel off-policy multiple-trajectory optimistic least-squares policy iteration algorithm is proposed, to learn a near-optimal solution of the stochastic LQR problem directly from online input/state data, without explicitly identifying the system matrices. The efficacy of the proposed algorithm is supported by rigorous convergence analysis and numerical results on a second-order example.

B. Pang (B) · Z.-P. Jiang Control and Networks Lab, Department of Electrical and Computer Engineering, Tandon School of Engineering, New York University, 370 Jay Street, Brooklyn, NY 11201, USA e-mail: [email protected] Z.-P. Jiang e-mail: [email protected] © The Author(s), under exclusive license to Springer Nature Switzerland AG 2022 Z.-P. Jiang et al. (eds.), Trends in Nonlinear and Adaptive Control, Lecture Notes in Control and Information Sciences 488, https://doi.org/10.1007/978-3-030-74628-5_9

249

250

B. Pang and Z.-P. Jiang

9.1 Introduction Since the huge successes in the Chinese game of Go and video game Atari [50], reinforcement learning (RL) has been extensively studied by both researchers in academia and practitioners in industry. Optimal control [11] is a branch of control theory that discusses the synthesis of feedback controllers to achieve optimality properties for dynamical control systems, but often requires the knowledge of the system dynamics. Adaptive control [48] is a field that deals with dynamical control systems with unknown parameters, but usually ignores the optimality of the control systems (with a few exceptions [9, 19, 46]). RL combines the advantages of these two control methods [51], and searches for adaptive optimal controllers with respect to some performance index through interactions between the controller and the dynamical system, without the complete model knowledge. Over the past decades, numerous RL methods have been proposed for different optimal control problems with various kinds of dynamical systems, see books [6, 28, 31, 50] and recent surveys [12, 29, 33, 37, 45] for details. A class of important and popular methods in RL is policy iteration. Policy iteration involves two steps, policy evaluation and policy improvement. In policy evaluation, a given policy is evaluated based on a scalar performance index. Then this performance index is utilized to generate a new control policy in policy improvement. These two steps are iterated in turn, to find the solution of the RL problem at hand. If implemented perfectly, policy iteration is proved to converge to the optimal solution. However in reality, policy evaluation or policy improvement can hardly be implemented precisely, because of the existence of various errors induced by function approximation, state estimation, sensor noise, external disturbance, and so on. Therefore, a natural question to ask is: when is a policy iteration algorithm robust to the exogenous errors? 
That is, under what conditions on the errors, does the policy iteration still converge to (a neighborhood of) the optimal solution? In spite of the popularity and empirical successes of policy iteration, its robustness issue has not been fully investigated yet in theory [5], especially for RL problems of physical systems where the state and action spaces are unbounded and continuous, such as robotics and autonomous cars [38]. Regarding the policy iteration as a dynamical system, and utilizing the concepts of input-to-state stability in control theory [49], the robustness of policy iteration for the classic continuous-time and discrete-time LQR problems is analyzed in [43] and [44], respectively. It is shown that the policy iteration with errors for the LQR is smalldisturbance input-to-state stable, if the errors are regarded as the disturbance input. In this chapter, we generalize this robustness result to the policy iteration for LQR of discrete-time linear systems perturbed by stochastic state- and input-dependent multiplicative noises. Stochastic multiplicative noises are important in modeling the random perturbation in system parameters and coefficients, and are widely found in modern control systems such as networked control systems with noisy communication channels [24], modern power networks [23], neuronal brain networks [10], and human sensorimotor control [8, 27, 53]. We firstly prove that the optimal solution


of this stochastic LQR problem is a locally exponentially stable equilibrium of the exact policy iteration. Then based on this observation, we show that if the policy iteration starts from an initial solution close to the optimal solution, and the errors are small and bounded, the discrepancies between the solutions generated by the policy iteration and the optimal solution will also be small and bounded, in the spirit of Sontag’s input-to-state stability [49]. Thirdly, we demonstrate that for any initial stabilizing control gain, as long as the errors are small, the approximate solution given by policy iteration will eventually enter a small neighborhood of the optimal solution. Finally, a novel off-policy model-free RL algorithm, named multi-trajectory optimistic least-squares policy iteration (MO-LSPI), is proposed to find near-optimal solutions of the stochastic LQR problem directly from online input/state data when all the system matrices are unknown. Our robustness result is applied to show the convergence of this off-policy MO-LSPI. Experiments on a numerical example validate our results. In the presence of stochastic multiplicative noise, the Lyapunov equation in policy evaluation for the classic deterministic LQR [44] becomes the generalized Lyapunov equation in policy evaluation for the stochastic LQR, while the algebraic Riccati equation for the classic deterministic LQR becomes the generalized algebraic Riccati equation for the stochastic LQR. Thus, although the robustness analysis of policy iteration for stochastic LQR is parallel to that for its deterministic counterpart, the derivations and proofs are inevitably distinct. The optimal control of linear systems with stochastic multiplicative noise has been studied for a long time [3, 4, 15–17, 47, 52, 56]. 
With the increasing availability of low-cost, more powerful computational resources and data-acquisition equipment, the study of the stochastic LQR with multiplicative noises is re-emerging in the context of data-driven control and learning-based control. Data-driven methods are proposed in [13, 14] to find near-optimal solutions of the stochastic LQR, assuming that the distribution of the stochastic multiplicative noise is unknown. Model-free RL algorithms are proposed to find near-optimal solutions in [20] using policy gradient, in [35] using policy iteration for the stochastic LQR, and in [22] using policy iteration for linear quadratic games. A system identification method is proposed in [55] to explicitly estimate the system matrices from multiple-trajectory data for subsequent LQR design. The algorithms proposed in these papers either assume knowledge of the system dynamics [13, 14], do not have proofs of convergence [22, 35], belong to the class of on-policy methods [20], or lead to indirect adaptive control [55]. In contrast, the MO-LSPI algorithm proposed in this chapter is off-policy, and learns near-optimal solutions of the stochastic LQR directly from input/state data without precise knowledge of any system matrices, and with a provable convergence analysis. It is also worth emphasizing that MO-LSPI is based on policy iteration, while the Q-learning algorithm proposed in [18] for stochastic LQR with random parameters is based on value iteration. The robustness studied in this chapter differs from [14, 21] in that the learning process (policy iteration) itself is viewed as a dynamical system (similar to [7], where robustness is studied for value iteration), while in [14, 21] the robustness analysis is developed for the closed-loop system comprised of the environment and the policy.

252

B. Pang and Z.-P. Jiang

The rest of this chapter is organized as follows. Section 9.2 introduces the stochastic LQR with multiplicative noise and its associated policy iteration. Section 9.3 conducts the robustness analysis of policy iteration. Section 9.4 presents the MO-LSPI algorithm and its convergence analysis. Section 9.5 validates the proposed robust RL algorithm by means of an elementary example. Section 9.6 closes the chapter with some concluding remarks.

Notation. $\mathbb{R}$ is the set of all real numbers; $\mathbb{Z}_+$ denotes the set of nonnegative integers; $\mathbb{S}^n$ is the set of all real symmetric matrices of order $n$; $\otimes$ denotes the Kronecker product; $I_n$ denotes the identity matrix of order $n$; $\|\cdot\|_F$ is the Frobenius norm; $\|\cdot\|_2$ is the Euclidean norm for vectors and the spectral norm for matrices; for a function $u:\mathcal{F}\to\mathbb{R}^{n\times m}$, $\|u\|_\infty$ denotes its $l^\infty$-norm when $\mathcal{F}=\mathbb{Z}_+$, and its $L^\infty$-norm when $\mathcal{F}=\mathbb{R}$. For matrices $X\in\mathbb{R}^{m\times n}$, $Y\in\mathbb{S}^m$, and vector $v\in\mathbb{R}^n$, define $\operatorname{vec}(X)=[X_1^T, X_2^T, \cdots, X_n^T]^T$, $\tilde{v}=\operatorname{svec}(vv^T)$, and
$$\operatorname{svec}(Y)=[y_{11}, \sqrt{2}y_{12}, \cdots, \sqrt{2}y_{1m}, y_{22}, \sqrt{2}y_{23}, \cdots, \sqrt{2}y_{m-1,m}, y_{m,m}]^T,$$
where $X_i$ is the $i$th column of $X$. $\operatorname{vec}^{-1}(\cdot)$ and $\operatorname{svec}^{-1}(\cdot)$ are the operations such that $X=\operatorname{vec}^{-1}(\operatorname{vec}(X))$ and $Y=\operatorname{svec}^{-1}(\operatorname{svec}(Y))$. For $Z\in\mathbb{R}^{m\times n}$, define $B_r(Z)=\{X\in\mathbb{R}^{m\times n} \mid \|X-Z\|_F<r\}$ and $\bar{B}_r(Z)$ as the closure of $B_r(Z)$. $Z^\dagger$ is the Moore–Penrose pseudoinverse of matrix $Z$.
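The $\operatorname{svec}$ operation above can be sketched in code. This is a minimal illustration of the notation (the scaling by $\sqrt{2}$ makes $\operatorname{svec}$ preserve the Frobenius inner product, which is what later turns quadratic forms into linear regressions):

```python
import numpy as np

def svec(Y):
    """Stack the upper triangle of symmetric Y row by row,
    scaling off-diagonal entries by sqrt(2), as in the Notation section."""
    m = Y.shape[0]
    out = []
    for i in range(m):
        for j in range(i, m):
            out.append(Y[i, j] if i == j else np.sqrt(2.0) * Y[i, j])
    return np.array(out)

def svec_inv(y, m):
    """Inverse operation svec^{-1}: rebuild the symmetric m x m matrix."""
    Y = np.zeros((m, m))
    k = 0
    for i in range(m):
        for j in range(i, m):
            Y[i, j] = y[k] if i == j else y[k] / np.sqrt(2.0)
            Y[j, i] = Y[i, j]
            k += 1
    return Y

# svec preserves the inner product: tr(XY) = svec(X) . svec(Y)
X = np.array([[1.0, 2.0], [2.0, 5.0]])
Y = np.array([[0.5, -1.0], [-1.0, 3.0]])
print(np.allclose(np.trace(X @ Y), svec(X) @ svec(Y)))   # True
print(np.allclose(svec_inv(svec(X), 2), X))              # True
```
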

9.2 Problem Formulation and Preliminaries

Consider linear systems with state- and input-dependent multiplicative noises,
$$x_{t+1}=\Big(A_0+\sum_{j=1}^{q_1} w_{t,j}A_j\Big)x_t+\Big(B_0+\sum_{k=1}^{q_2} \hat{w}_{t,k}B_k\Big)u_t, \qquad (9.1)$$
where $x\in\mathbb{R}^n$ is the system state; $u\in\mathbb{R}^m$ is the control input; the initial state $x_0\in\mathbb{R}^n$ is given and deterministic; $w_{t,j}\in\mathbb{R}$ and $\hat{w}_{t,k}\in\mathbb{R}$ are stochastic noises; and $A_0$, $B_0$, $\{A_j\}_{j=1}^{q_1}$, $\{B_k\}_{k=1}^{q_2}$ are system matrices of compatible dimensions. $w_{t,j}$ and $\hat{w}_{t,k}$ are mutually independent random variables, independent and identically distributed over time. For all $t\in\mathbb{Z}_+$, $j=1,\cdots,q_1$, $k=1,\cdots,q_2$,
$$\mathbb{E}[w_{t,j}]=\mathbb{E}[\hat{w}_{t,k}]=0,\qquad \mathbb{E}[w_{t,j}^2]=\mathbb{E}[\hat{w}_{t,k}^2]=1.$$

Definition 9.1 The unforced system (9.1), i.e., system (9.1) with $u_t=0$ for all $t\in\mathbb{Z}_+$, is said to be mean-square stable if for any $x_0\in\mathbb{R}^n$,
$$\lim_{t\to\infty}\mathbb{E}[x_t x_t^T]=0.$$


System (9.1) is said to be mean-square stabilizable if there exists a matrix $K\in\mathbb{R}^{m\times n}$ such that the closed-loop system (9.1) with control law $u_t=-Kx_t$ is mean-square stable. In this case, the gain $K$ is said to be mean-square stabilizing. The following lemma gives conditions under which $K$ is mean-square stabilizing.

Lemma 9.1 For a control gain $K\in\mathbb{R}^{m\times n}$, the following statements are equivalent:
(i) $K$ is mean-square stabilizing.
(ii) There exists $P>0$ such that $\mathcal{L}_K(P)<0$, where
$$\mathcal{L}_K(P)\triangleq \Lambda(P)-P-K^T B_0^T P A_0-A_0^T P B_0 K+K^T\Gamma(P)K,$$
with $\Lambda(P)=\sum_{j=0}^{q_1}A_j^T P A_j$ and $\Gamma(P)=\sum_{k=0}^{q_2}B_k^T P B_k$.
(iii) $\rho(\mathcal{A}(K)+I_n\otimes I_n)<1$, where
$$\mathcal{A}(K)=\sum_{j=0}^{q_1}A_j^T\otimes A_j^T-I_n\otimes I_n-A_0^T\otimes(K^T B_0^T)-(K^T B_0^T)\otimes A_0^T+\sum_{k=0}^{q_2}K^T B_k^T\otimes K^T B_k^T.$$
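Condition (iii) gives an easily computable stabilizability test. A minimal sketch, using hypothetical system matrices (not from the chapter):

```python
import numpy as np

def calA(K, As, Bs):
    """The matrix A(K) from Lemma 9.1 (iii); As = [A_0,...,A_q1], Bs = [B_0,...,B_q2]."""
    n = As[0].shape[0]
    I = np.eye(n)
    M = sum(np.kron(A.T, A.T) for A in As) - np.kron(I, I)
    KB = K.T @ Bs[0].T
    M = M - np.kron(As[0].T, KB) - np.kron(KB, As[0].T)
    M = M + sum(np.kron(K.T @ B.T, K.T @ B.T) for B in Bs)
    return M

def is_ms_stabilizing(K, As, Bs):
    """K is mean-square stabilizing iff rho(A(K) + I (x) I) < 1."""
    n = As[0].shape[0]
    rho = max(abs(np.linalg.eigvals(calA(K, As, Bs) + np.eye(n * n))))
    return rho < 1.0

# hypothetical 2-state, 1-input example with one noise channel each
A0 = np.array([[0.5, 0.1], [0.0, 0.3]]); A1 = 0.1 * np.eye(2)
B0 = np.array([[1.0], [0.5]]);           B1 = np.array([[0.1], [0.0]])
K0 = np.zeros((1, 2))
print(is_ms_stabilizing(K0, [A0, A1], [B0, B1]))  # True: this open loop is mean-square stable
```
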

Proof (i) ⟺ (ii) is obtained by a direct application of [42, Lemma 1]. The proof of (i) ⟺ (iii) can be found in [20, Lemma 2.1]. □

Assuming that system (9.1) is mean-square stabilizable, in the stochastic LQR we want to find a controller $u$ minimizing the cost functional
$$J(x_0,u)=\mathbb{E}\Big[\sum_{t=0}^{\infty}x_t^T S x_t+u_t^T R u_t\Big], \qquad (9.2)$$

where the weighting matrices $S\in\mathbb{S}^n$ and $R\in\mathbb{S}^m$ are positive definite. From dynamic programming theory (e.g., from [14, Proposition 3]), it follows that the optimal controller for this stochastic LQR problem is $u_t^*=-K^*x_t$, and the optimal cost is $J(x_0,u^*)=x_0^T P^* x_0$, where
$$K^*=(R+\Gamma(P^*))^{-1}B_0^T P^* A_0 \qquad (9.3)$$
is mean-square stabilizing and $P^*\in\mathbb{S}^n$ is the unique positive definite solution of the generalized algebraic Riccati equation (GARE)
$$P=S+\Lambda(P)-A_0^T P B_0(R+\Gamma(P))^{-1}B_0^T P A_0. \qquad (9.4)$$
For a mean-square stabilizing control gain $K\in\mathbb{R}^{m\times n}$, the cost it induces is $J(x_0,-Kx)=x_0^T P_K x_0$, where $P_K\in\mathbb{S}^n$ is the unique positive definite solution of the generalized Lyapunov equation


$$\mathcal{L}_K(P_K)+S+K^T R K=0. \qquad (9.5)$$
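Since (9.5) is linear in $P_K$, policy evaluation reduces to an $n^2\times n^2$ linear solve once $\mathcal{L}_K$ is vectorized with Kronecker products (this anticipates (9.11) below). A minimal sketch with hypothetical system matrices, not from the chapter:

```python
import numpy as np

# hypothetical 2-state, 1-input system with one noise channel each
A0 = np.array([[0.5, 0.1], [0.0, 0.3]]); A1 = 0.1 * np.eye(2)
B0 = np.array([[1.0], [0.5]]);           B1 = np.array([[0.1], [0.0]])
As, Bs = [A0, A1], [B0, B1]
S, R = np.eye(2), np.eye(1)
K = np.zeros((1, 2))   # assumed mean-square stabilizing

def L_K(P, K):
    """Generalized Lyapunov operator from Lemma 9.1 (ii)."""
    return (sum(A.T @ P @ A for A in As) - P
            - K.T @ B0.T @ P @ A0 - A0.T @ P @ B0 @ K
            + K.T @ sum(B.T @ P @ B for B in Bs) @ K)

# Build the Kronecker-product representation of L_K and solve L_K(P) = -(S + K'RK).
I = np.eye(2)
KB = K.T @ B0.T
calA = (sum(np.kron(A.T, A.T) for A in As) - np.kron(I, I)
        - np.kron(A0.T, KB) - np.kron(KB, A0.T)
        + sum(np.kron(K.T @ B.T, K.T @ B.T) for B in Bs))
# row- and column-stacking agree here because every matrix involved is symmetric
P_K = np.linalg.solve(calA, -(S + K.T @ R @ K).reshape(-1)).reshape(2, 2)
residual = L_K(P_K, K) + S + K.T @ R @ K
print(np.linalg.norm(residual))   # ~0: P_K solves the generalized Lyapunov equation
```
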

Define
$$G(P_K)=\begin{bmatrix}[G(P_K)]_{xx} & [G(P_K)]_{ux}^T\\ [G(P_K)]_{ux} & [G(P_K)]_{uu}\end{bmatrix}=\begin{bmatrix}S+\Lambda(P_K)-P_K & A_0^T P_K B_0\\ B_0^T P_K A_0 & R+\Gamma(P_K)\end{bmatrix}.$$
Then (9.5) can be equivalently rewritten as
$$\mathcal{H}(G(P_K),K)=0, \qquad (9.6)$$
where
$$\mathcal{H}(G(P_K),K)=\begin{bmatrix}I_n & -K^T\end{bmatrix}G(P_K)\begin{bmatrix}I_n\\ -K\end{bmatrix}.$$

The policy iteration for the stochastic LQR is described in the following procedure.

Procedure 9.1 (Exact Policy Iteration)
1) Choose a mean-square stabilizing control gain $K_1$, and let $i=1$.
2) (Policy evaluation) Evaluate the performance of control gain $K_i$ by solving
$$\mathcal{H}(G_i,K_i)=0 \qquad (9.7)$$
for $P_i\in\mathbb{S}^n$, where $G_i=G(P_i)$.
3) (Policy improvement) Obtain an improved policy
$$K_{i+1}=[G_i]_{uu}^{-1}[G_i]_{ux}. \qquad (9.8)$$
4) Set $i\leftarrow i+1$ and go back to Step 2.

The following convergence results for Procedure 9.1 are similar to those of the policy iteration for the LQR in [25]; the proof is given in Appendix 2.

Theorem 9.1 In Procedure 9.1, we have
i) $K_i$ is mean-square stabilizing for all $i=1,2,\cdots$;
ii) $P_1\ge P_2\ge P_3\ge\cdots\ge P^*$;
iii) $\lim_{i\to\infty}P_i=P^*$ and $\lim_{i\to\infty}K_i=K^*$.

Theorem 9.1 guarantees that the optimal solution will be found by Procedure 9.1 in the limit. However, exact knowledge of $\{A_j\}_{j=0}^{q_1}$ and $\{B_k\}_{k=0}^{q_2}$ is required in Procedure 9.1, as the solution to (9.7) relies upon $\{A_j\}_{j=0}^{q_1}$ and $\{B_k\}_{k=0}^{q_2}$. In practice, very often we only have access to incomplete information required to solve the problem. In other words, each policy evaluation step will result in an inaccurate estimate. Thus, we are interested in studying the following problem.

Problem 9.1 If $G_i$ is replaced by an approximate matrix $\hat{G}_i$, will the conclusions in Theorem 9.1 still hold?
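Procedure 9.1 is straightforward to implement when the model is known. A minimal sketch with hypothetical system matrices (not from the chapter), iterating (9.7)–(9.8) and checking that the result satisfies the GARE (9.4):

```python
import numpy as np

# hypothetical stochastic LQR instance with q1 = q2 = 1
A0 = np.array([[0.5, 0.1], [0.0, 0.3]]); A1 = 0.1 * np.eye(2)
B0 = np.array([[1.0], [0.5]]);           B1 = np.array([[0.1], [0.0]])
As, Bs = [A0, A1], [B0, B1]
S, R = np.eye(2), np.eye(1)
n = 2

def Lam(P): return sum(A.T @ P @ A for A in As)
def Gam(P): return sum(B.T @ P @ B for B in Bs)

def policy_eval(K):
    """Solve the generalized Lyapunov equation (9.5) by vectorization."""
    I = np.eye(n)
    M = sum(np.kron(A.T, A.T) for A in As) - np.kron(I, I)
    KB = K.T @ Bs[0].T
    M = M - np.kron(A0.T, KB) - np.kron(KB, A0.T) \
          + sum(np.kron(K.T @ B.T, K.T @ B.T) for B in Bs)
    return np.linalg.solve(M, -(S + K.T @ R @ K).reshape(-1)).reshape(n, n)

K = np.zeros((1, 2))          # mean-square stabilizing initial gain
for _ in range(20):           # Procedure 9.1
    P = policy_eval(K)                                  # policy evaluation, (9.7)
    K = np.linalg.solve(R + Gam(P), B0.T @ P @ A0)      # policy improvement, (9.8)

gare_residual = (S + Lam(P)
                 - A0.T @ P @ B0 @ np.linalg.solve(R + Gam(P), B0.T @ P @ A0) - P)
print(np.linalg.norm(gare_residual))   # ~0 after convergence to P* of the GARE (9.4)
```

The fast (quadratic, per Lemma 9.2 below) convergence means a handful of iterations already reaches machine precision here.
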


The difference between $\hat{G}_i$ and $G_i$ can be attributed to errors from various sources, including but not limited to: the estimation errors of $\{A_j\}_{j=0}^{q_1}$ and $\{B_k\}_{k=0}^{q_2}$ in indirect adaptive control [48], system identification [39], and model-based reinforcement learning [54]; approximate values of $S$ and $R$ in inverse optimal control/imitation learning, due to the absence of exact knowledge of the cost function [36, 41]; and the imprecise values of $P_i$ in model-free reinforcement learning [1, 54]. In Sect. 9.3, using the concepts of exponential stability and input-to-state stability in control theory, we provide an answer to Problem 9.1. In Sect. 9.4, we present the model-free RL algorithm MO-LSPI, which finds near-optimal solutions of the stochastic LQR directly from a set of input/state data collected along the trajectories of the control system, when $\{A_j\}_{j=0}^{q_1}$ and $\{B_k\}_{k=0}^{q_2}$ are unknown. The answer to Problem 9.1 derived in Sect. 9.3 is utilized to analyze the convergence of the proposed MO-LSPI algorithm.

9.3 Robust Policy Iteration

Consider the policy iteration in the presence of errors.

Procedure 9.2 (Inexact Policy Iteration)
(1) Choose a mean-square stabilizing control gain $\hat{K}_1$, and let $i=1$.
(2) (Inexact policy evaluation) Obtain $\hat{G}_i=\tilde{G}_i+\Delta G_i$, where $\Delta G_i\in\mathbb{S}^{m+n}$ is a disturbance, $\tilde{G}_i\triangleq G(\tilde{P}_i)$, and $\tilde{P}_i\in\mathbb{S}^n$ satisfies
$$\mathcal{H}(\tilde{G}_i,\hat{K}_i)=0, \qquad (9.9)$$
and $J(x_0,-\hat{K}_i x)=x_0^T\tilde{P}_i x_0$ is the true cost induced by control gain $\hat{K}_i$.
(3) (Policy update) Construct a new control gain
$$\hat{K}_{i+1}=[\hat{G}_i]_{uu}^{-1}[\hat{G}_i]_{ux}. \qquad (9.10)$$
(4) Set $i\leftarrow i+1$ and go back to Step 2.

Remark 9.1 The requirement that $\hat{G}_i\in\mathbb{S}^{m+n}$ in Procedure 9.2 is not restrictive, since for any $X\in\mathbb{R}^{(n+m)\times(n+m)}$, $x^T X x=\frac{1}{2}x^T(X+X^T)x$, where $\frac{1}{2}(X+X^T)$ is symmetric.

We first show that the exact policy iteration, Procedure 9.1, viewed as a dynamical system, is locally exponentially stable at $P^*$. Then, based on this result, we show that the inexact policy iteration, Procedure 9.2, viewed as a dynamical system with $\Delta G_i$ as the input, is locally input-to-state stable. For $Y\in\mathbb{R}^{n\times n}$, define
$$\mathcal{K}(Y)=\mathcal{R}^{-1}(Y)B_0^T Y A_0,\qquad \mathcal{R}(Y)=R+\Gamma(Y).$$


Note that
$$\operatorname{vec}(\mathcal{L}_X(Y))=\mathcal{A}(X)\operatorname{vec}(Y), \qquad (9.11)$$
where $\mathcal{L}_X(\cdot)$ and $\mathcal{A}(\cdot)$ are defined in Lemma 9.1. By (iii) in Lemma 9.1, if $X$ is mean-square stabilizing, then $\mathcal{A}(X)$ is invertible, and (9.11) implies that the inverse operator $\mathcal{L}_X^{-1}(\cdot)$ exists on $\mathbb{R}^{n\times n}$. In Procedure 9.1, suppose $K_1=\mathcal{K}(P_0)$, where $P_0\in\mathbb{S}^n$ is chosen such that $K_1$ is mean-square stabilizing. Such a $P_0$ always exists; for example, since $K^*$ is mean-square stabilizing, one can choose $P_0$ close to $P^*$ by continuity. Then from (9.7) and (9.8), the sequence $\{P_i\}_{i=0}^\infty$ generated by Procedure 9.1 satisfies
$$P_{i+1}=\mathcal{L}_{\mathcal{K}(P_i)}^{-1}\big({-S}-\mathcal{K}(P_i)^T R\,\mathcal{K}(P_i)\big). \qquad (9.12)$$

If $P_i$ is regarded as the state, and the iteration index $i$ is regarded as time, then (9.12) is a discrete-time dynamical system and $P^*$ is an equilibrium by Theorem 9.1. The next lemma shows that $P^*$ is actually a locally exponentially stable equilibrium; its proof is given in Appendix 3.

Lemma 9.2 For any $\sigma<1$, there exists a $\delta_0(\sigma)>0$, such that for any $P_i\in B_{\delta_0}(P^*)$, $\mathcal{R}(P_i)$ is invertible, $\mathcal{K}(P_i)$ is mean-square stabilizing, and
$$\|P_{i+1}-P^*\|_F\le\sigma\|P_i-P^*\|_F.$$

In Procedure 9.2, suppose $\hat{K}_1=\mathcal{K}(\tilde{P}_0)$ and $\Delta G_0=0$, where $\tilde{P}_0\in\mathbb{S}^n$ is chosen such that $\hat{K}_1$ is mean-square stabilizing. If $\hat{K}_i$ is mean-square stabilizing and $[\hat{G}_i]_{uu}$ is invertible for all $i\in\mathbb{Z}_+$, $i>0$ (this is possible under certain conditions, see Appendix 4), the sequence $\{\tilde{P}_i\}_{i=0}^\infty$ generated by Procedure 9.2 satisfies
$$\tilde{P}_{i+1}=\mathcal{L}_{\mathcal{K}(\tilde{P}_i)}^{-1}\big({-S}-\mathcal{K}(\tilde{P}_i)^T R\,\mathcal{K}(\tilde{P}_i)\big)+E(\tilde{G}_i,\Delta G_i), \qquad (9.13)$$
where
$$E(\tilde{G}_i,\Delta G_i)=\mathcal{L}_{\hat{K}_{i+1}}^{-1}\big({-S}-\hat{K}_{i+1}^T R\hat{K}_{i+1}\big)-\mathcal{L}_{\mathcal{K}(\tilde{P}_i)}^{-1}\big({-S}-\mathcal{K}(\tilde{P}_i)^T R\,\mathcal{K}(\tilde{P}_i)\big).$$

Regarding $\{\Delta G_i\}_{i=0}^\infty$ as the disturbance input, the next lemma shows that the dynamical system (9.13) is locally input-to-state stable [30, Definition 2.1]; its proof can be found in Appendix 4.

Lemma 9.3 For $\sigma$ and its associated $\delta_0$ in Lemma 9.2, there exists $\delta_1(\delta_0)>0$, such that if $\|\Delta G\|_\infty<\delta_1$ and $\tilde{P}_0\in B_{\delta_0}(P^*)$, then
(i) $[\hat{G}_i]_{uu}$ is invertible and $\hat{K}_i$ is mean-square stabilizing, $\forall i\in\mathbb{Z}_+$, $i>0$;
(ii) (9.13) is locally input-to-state stable:
$$\|\tilde{P}_i-P^*\|_F\le\beta(\|\tilde{P}_0-P^*\|_F,\,i)+\gamma(\|\Delta G\|_\infty),\quad\forall i\in\mathbb{Z}_+,$$
where
$$\beta(y,i)=\sigma^i y,\qquad \gamma(y)=\frac{c_3}{1-\sigma}\,y,\qquad y\in\mathbb{R},$$
and $c_3(\delta_0)>0$;
(iii) $\|\hat{K}_i\|_F<\kappa_1$ for some $\kappa_1(\delta_0)\in\mathbb{R}_+$, $\forall i\in\mathbb{Z}_+$, $i>0$;
(iv) $\lim_{i\to\infty}\|\Delta G_i\|_F=0$ implies $\lim_{i\to\infty}\|\tilde{P}_i-P^*\|_F=0$.

Intuitively, Lemma 9.3 implies that in Procedure 9.2, if $\tilde{P}_0$ is near $P^*$ (thus $\hat{K}_1$ is near $K^*$), and the disturbance input $\Delta G$ is bounded and not too large, then the cost of the generated control policy $\hat{K}_i$ is also bounded, and will ultimately be no larger than a constant proportional to the $l^\infty$-norm of the disturbance. The smaller the disturbance, the better the ultimately generated policy. In other words, the algorithm described in Procedure 9.2 is not sensitive to small disturbances when the initial condition is in a neighborhood of the optimal solution.

The requirement in Lemma 9.3 that the initial condition $\tilde{P}_0$ be in a neighborhood of $P^*$ can be removed, as stated in the following theorem, whose proof is given in Appendix 5.

Theorem 9.2 For any given mean-square stabilizing control gain $\hat{K}_1$ and any $\varepsilon>0$, if $S>0$, there exist $\delta_2(\varepsilon,\hat{K}_1)>0$, $\alpha(\delta_2)>0$, and $\kappa(\delta_2)>0$, such that as long as $\|\Delta G\|_\infty<\delta_2$: $[\hat{G}_i]_{uu}$ is invertible, $\hat{K}_i$ is mean-square stabilizing, $\|\tilde{P}_i\|_F<\alpha$, $\|\hat{K}_i\|_F<\kappa$, $\forall i\in\mathbb{Z}_+$, $i>0$, and
$$\limsup_{i\to\infty}\|\tilde{P}_i-P^*\|_F<\varepsilon.$$
If in addition $\lim_{i\to\infty}\|\Delta G_i\|_F=0$, then $\lim_{i\to\infty}\|\tilde{P}_i-P^*\|_F=0$.

In Theorem 9.2, $\hat{K}_1$ can be any mean-square stabilizing control gain, which is different from the situation in Lemma 9.3. When there is no disturbance, Theorem 9.2 implies the convergence result of Procedure 9.1, i.e., Theorem 9.1.

9.4 Multi-trajectory Optimistic Least-Squares Policy Iteration

In this section, the system matrices $\{A_j\}_{j=0}^{q_1}$ and $\{B_k\}_{k=0}^{q_2}$ of system (9.1) are assumed unknown, and the MO-LSPI is proposed to find near-optimal solutions of the stochastic LQR directly from the input/state data. The following lemma provides an alternative way to implement the policy evaluation step in Procedure 9.1.

Lemma 9.4 For any given mean-square stabilizing control gain $K\in\mathbb{R}^{m\times n}$, its associated $P_K\in\mathbb{S}^n$ satisfying (9.6) is the unique stable equilibrium of the following iteration:
$$P_{K,j+1}=\mathcal{H}(Q(P_{K,j}),K),\qquad P_{K,0}\in\mathbb{R}^{n\times n}, \qquad (9.14)$$




where
$$Q(P_{K,j})=G(P_{K,j})+\begin{bmatrix}P_{K,j} & 0\\ 0 & 0\end{bmatrix}.$$

Proof Vectorizing (9.14) and using (9.11), we have
$$\operatorname{vec}(P_{K,j+1})=(\mathcal{A}(K)+I_n\otimes I_n)\operatorname{vec}(P_{K,j})+\operatorname{vec}(S+K^T R K). \qquad (9.15)$$
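Iteration (9.14) is easy to exercise numerically. A minimal sketch with hypothetical system data (not from the chapter), checking that the iterate converges to a solution of (9.7):

```python
import numpy as np

A0 = np.array([[0.5, 0.1], [0.0, 0.3]]); A1 = 0.1 * np.eye(2)
B0 = np.array([[1.0], [0.5]]);           B1 = np.array([[0.1], [0.0]])
As, Bs = [A0, A1], [B0, B1]
S, R = np.eye(2), np.eye(1)
K = np.array([[0.2, 0.1]])   # assumed mean-square stabilizing

def Lam(P): return sum(A.T @ P @ A for A in As)
def Gam(P): return sum(B.T @ P @ B for B in Bs)

def G_of(P):
    """The partitioned matrix G(P) defined after (9.5)."""
    return np.block([[S + Lam(P) - P, A0.T @ P @ B0],
                     [B0.T @ P @ A0, R + Gam(P)]])

def H_of(G, K):
    """H(G, K) = [I, -K'] G [I; -K], cf. (9.6)."""
    E = np.vstack([np.eye(2), -K])
    return E.T @ G @ E

P = np.zeros((2, 2))
for _ in range(300):                      # iteration (9.14)
    Q = G_of(P) + np.block([[P, np.zeros((2, 1))],
                            [np.zeros((1, 2)), np.zeros((1, 1))]])
    P = H_of(Q, K)
print(np.linalg.norm(H_of(G_of(P), K)))   # ~0: the fixed point P solves (9.7)
```
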

By Lemma 9.1, $\rho(\mathcal{A}(K)+I_n\otimes I_n)<1$. This implies that (9.14) admits a unique stable equilibrium, and it must be $P_K$. □

Lemma 9.4 means that instead of solving the algebraic equation (9.7), we can use the iteration (9.14) for policy evaluation. We now explain how (9.14) is used, together with the least-squares method, to form an estimate $\hat{G}_i$ in Procedure 9.2 directly from the input/state data. Suppose the following control law is applied to system (9.1):
$$u_t=-\hat{K}_1 x_t+v_t, \qquad (9.16)$$

where $\hat{K}_1$ is mean-square stabilizing, $v_t$ is independently drawn from a multivariate Gaussian distribution with mean $V_t$ and covariance $I_m$, and the sequence $\{V_t\}_{t=0}^\infty$ is a realization of a discrete-time white Gaussian noise process. Then for any $P\in\mathbb{S}^n$, we have
$$\mathbb{E}[x_{t+1}^T P x_{t+1}]=\mathbb{E}\big[x_t^T\Lambda(P)x_t+2x_t^T A_0^T P B_0 u_t+u_t^T\Gamma(P)u_t\big]=\mathbb{E}\big[z_t^T Q(P)z_t-x_t^T S x_t-u_t^T R u_t\big],$$
where $z_t=[x_t^T,u_t^T]^T$. Vectorizing the above equation yields
$$Z_t^T\operatorname{svec}(Q(P))=X_{t+1}^T\operatorname{svec}(P)+X_t^T\operatorname{svec}(S)+U_t^T\operatorname{svec}(R), \qquad (9.17)$$
where $Z_t=\mathbb{E}[\tilde{z}_t]$, $X_t=\mathbb{E}[\tilde{x}_t]$, and $U_t=\mathbb{E}[\tilde{u}_t]$. For $M\in\mathbb{Z}_+$, organizing equations (9.17) for time indices from $t=0$ to $t=M-1$ into one equation yields
$$\Phi_M\operatorname{svec}(Q(P))=\Xi_M^2\operatorname{svec}(P)+r_M, \qquad (9.18)$$
where
$$r_M=\Xi_M^1\operatorname{svec}(S)+\Psi_M\operatorname{svec}(R),$$
$$\Phi_M=[Z_0,Z_1,\cdots,Z_{M-1}]^T,\qquad \Xi_M^1=[X_0,X_1,\cdots,X_{M-1}]^T,$$
$$\Xi_M^2=[X_1,X_2,\cdots,X_M]^T,\qquad \Psi_M=[U_0,U_1,\cdots,U_{M-1}]^T.$$
The following assumption is made on the matrix $\Phi_M$.


Assumption 9.1 The matrix $\Phi_M$ has full column rank, i.e.,
$$\operatorname{rank}(\Phi_M)=\frac{(m+n)(m+n+1)}{2}.$$
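The step from quadratic forms to the linear relation (9.17) rests on the identity $z^TQz=\tilde{z}^T\operatorname{svec}(Q)$, which follows from $\operatorname{svec}$ preserving the Frobenius inner product. A quick numerical check of this identity:

```python
import numpy as np

def svec(Y):
    m = Y.shape[0]
    return np.array([Y[i, j] if i == j else np.sqrt(2.0) * Y[i, j]
                     for i in range(m) for j in range(i, m)])

rng = np.random.default_rng(0)
z = rng.standard_normal(3)
Q = rng.standard_normal((3, 3)); Q = (Q + Q.T) / 2    # symmetrize
z_tilde = svec(np.outer(z, z))                        # z~ = svec(z z')
print(np.allclose(z @ Q @ z, z_tilde @ svec(Q)))      # True
```
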

Under Assumption 9.1, (9.18) can be rewritten as
$$\operatorname{svec}(Q(P))=\Phi_M^\dagger\big(\Xi_M^2\operatorname{svec}(P)+r_M\big). \qquad (9.19)$$
Then (9.14) is equivalent to
$$P_{K,j+1}=\mathcal{H}\big(\operatorname{svec}^{-1}\big(\Phi_M^\dagger(\Xi_M^2\operatorname{svec}(P_{K,j})+r_M)\big),\,K\big). \qquad (9.20)$$

Note that (9.20) does not depend on any system matrix in $\{A_j\}_{j=0}^{q_1}$ and $\{B_k\}_{k=0}^{q_2}$. However, the matrices $\Phi_M$, $\Xi_M^1$, $\Xi_M^2$, and $\Psi_M$ are not known either. This issue is overcome by averaging multiple input/state trajectories of system (9.1). Concretely, suppose in total $N$ trajectories are collected by running system (9.1) independently $N$ times. Let $x_t^p$ and $u_t^p$ denote the state and input at time $t$ of the $p$th trajectory. Then using $x_t^p$ and $u_t^p$ we can construct the iteration
$$\hat{P}_{K,j+1}=\mathcal{H}(\hat{Q}_{K,j},K),\qquad \hat{Q}_{K,j}=\operatorname{svec}^{-1}\big(\hat{\Phi}_{N,M}^\dagger(\hat{\Xi}_{N,M}^2\operatorname{svec}(\hat{P}_{K,j})+\hat{r}_{N,M})\big), \qquad (9.21)$$
where
$$\hat{r}_{N,M}=\hat{\Xi}_{N,M}^1\operatorname{svec}(S)+\hat{\Psi}_{N,M}\operatorname{svec}(R),$$
$$\hat{\Phi}_{N,M}=[\hat{Z}_{N,0},\hat{Z}_{N,1},\cdots,\hat{Z}_{N,M-1}]^T,\qquad \hat{\Xi}_{N,M}^1=[\hat{X}_{N,0},\hat{X}_{N,1},\cdots,\hat{X}_{N,M-1}]^T,$$
$$\hat{\Xi}_{N,M}^2=[\hat{X}_{N,1},\hat{X}_{N,2},\cdots,\hat{X}_{N,M}]^T,\qquad \hat{\Psi}_{N,M}=[\hat{U}_{N,0},\hat{U}_{N,1},\cdots,\hat{U}_{N,M-1}]^T,$$
and
$$\hat{Z}_{N,t}=\frac{1}{N}\sum_{p=1}^N\tilde{z}_t^p,\qquad \hat{X}_{N,t}=\frac{1}{N}\sum_{p=1}^N\tilde{x}_t^p,\qquad \hat{U}_{N,t}=\frac{1}{N}\sum_{p=1}^N\tilde{u}_t^p.$$

Since any two trajectories are independent, by the strong law of large numbers, almost surely
$$\lim_{N\to\infty}\hat{Z}_{N,t}=Z_t,\qquad \lim_{N\to\infty}\hat{X}_{N,t}=X_t,\qquad \lim_{N\to\infty}\hat{U}_{N,t}=U_t.$$
Then for each $M\in\mathbb{Z}_+$,
$$\lim_{N\to\infty}\hat{\Phi}_{N,M}=\Phi_M,\quad \lim_{N\to\infty}\hat{\Xi}_{N,M}^1=\Xi_M^1,\quad \lim_{N\to\infty}\hat{\Xi}_{N,M}^2=\Xi_M^2,\quad \lim_{N\to\infty}\hat{\Psi}_{N,M}=\Psi_M. \qquad (9.22)$$
Thus for large values of $N$, it is expected that $\hat{P}_{K,j}$ and $\hat{Q}_{K,j}$ generated by (9.21) are good approximations of $P_{K,j}$ and $Q(P_{K,j})$ generated by (9.20), respectively, for


all $j\in\mathbb{Z}_+$, if $\hat{P}_{K,0}=P_{K,0}$. Then, since (9.14) is equivalent to (9.20), by Lemma 9.4, $\hat{P}_{K,j}$ and $\hat{Q}_{K,j}$ are expected to be close to $P_K$ and $Q(P_K)$, respectively, for $j$ large enough. In this way, the policy evaluation step is implemented directly from the input/state data. The proposed MO-LSPI is presented in Algorithm 9.1. Based on Theorem 9.2, the convergence of Algorithm 9.1 is derived as the following theorem, whose proof is postponed to Appendix 6.

Theorem 9.3 For each mean-square stabilizing control gain $\hat{K}_1$, each realization $V_t$ of the discrete-time white Gaussian noise process, each $M\ge(m+n)(m+n+1)/2$, and any $\varepsilon>0$, if Assumption 9.1 is satisfied, then there exist $\bar{L}_0>0$ and $N_0>0$ such that for any $\bar{L}\ge\bar{L}_0$ and $N\ge N_0$, almost surely
$$\limsup_{\bar{I}\to\infty}\|\tilde{P}_{\bar{I}}-P^*\|_F<\varepsilon$$
and $\hat{K}_i$ is mean-square stabilizing for all $i=1,\cdots,\bar{I}$, where $\tilde{P}_{\bar{I}}$ is the unique solution of (9.9) for $\hat{K}_{\bar{I}}$.

Remark 9.2 To satisfy Assumption 9.1, the noises $v_t$ are added to the controller (9.16). If the mean $V_t\equiv 0$, then $\mathbb{E}[x_t u_t^T]=-\mathbb{E}[x_t x_t^T]\hat{K}_1^T$, and by definition, $\Phi_M$ cannot have full column rank. This is why we require $\{V_t\}_{t=0}^\infty$ to be a realization of the discrete-time white Gaussian noise process.

Algorithm 9.1: MO-LSPI

Input: Initial control gain $\hat{K}_1$; number of policy iterations $\bar{I}$; length of policy evaluation $\bar{L}$; length of rollout $M$; number of rollouts $N$.
1: for $t=0,\cdots,M-1$ do
2:   Generate $V_t$ independently from the standard multivariate Gaussian distribution;
3: end for
4: for $p=1,\cdots,N$ do
5:   Generate trajectories $\{x_t^p\}_{t=0}^M$ and $\{u_t^p\}_{t=0}^{M-1}$ by applying control law (9.16) to system (9.1);
6: end for
7: Compute the data matrices $\hat{\Phi}_{N,M}$, $\hat{\Xi}_{N,M}^1$, $\hat{\Xi}_{N,M}^2$, and $\hat{\Psi}_{N,M}$;
8: for $i=1,\cdots,\bar{I}-1$ do
9:   $\hat{P}_{i,0}\leftarrow 0$;
10:  for $j=0,\cdots,\bar{L}-1$ do
11:    $\hat{Q}_{i,j}=\operatorname{svec}^{-1}\big(\hat{\Phi}_{N,M}^\dagger(\hat{\Xi}_{N,M}^2\operatorname{svec}(\hat{P}_{i,j})+\hat{r}_{N,M})\big)$;
12:    $\hat{P}_{i,j+1}=\mathcal{H}(\hat{Q}_{i,j},\hat{K}_i)$;
13:  end for
14:  $\hat{Q}_i=\operatorname{svec}^{-1}\big(\hat{\Phi}_{N,M}^\dagger(\hat{\Xi}_{N,M}^2\operatorname{svec}(\hat{P}_{i,\bar{L}})+\hat{r}_{N,M})\big)$;
15:  $\hat{K}_{i+1}\leftarrow[\hat{Q}_i]_{uu}^{-1}[\hat{Q}_i]_{ux}$;
16: end for
17: return $\hat{K}_{\bar{I}}$.

9 Robust Reinforcement Learning …

261

9.5 An Illustrative Example The proposed MO-LSPI Algorithm 9.1 is applied to the second-order system with multiplicative noises studied in [55], which is described by (9.1) with system matrices      1 1 −7.37, −0.58 0.08, 1.69 −0.2, 0.3 , A2 = , A1 = A0 = −0.4, 0.8 100 −0.25, 0 100 −1.61, 0       0, 0 −0.037, 0.022 −1.8 A3 = , A4 = , B0 = , 0, 0.08 0.1617, 0 −0.8     1 1 −4.7 20.09 B1 = , B2 = . 100 −0.62 100 −2.63 

The stochastic noises wt, j and wˆ t,k are random variables independently drawn from the normal distribution for each t, j, and k. The initial control gain is chosen as Kˆ 1 = [0, 0], which is mean-square stabilizing by checking that ρ(A ([0, 0]) + I2 ⊗ I2 ) < 1. In the simulation, we set weighting matrices S = I2 and R = 1, number of policy iterations I¯ = 11, length of rollout M = 7, number of rollout N = 106 , length of policy evaluation L¯ = 1000. Algorithm 9.1 is run for 200 times, with the same realization of the discrete-time white Gaussian noise process Vt . In other words, for each t, the random seed in Line 2 of Algorithm 9.1 is fixed over the 200 implementations. In each iteration, the relative errors  P˜i − P ∗  F /P ∗  F for i = 1, 2, · · · , I¯ are computed and recorded. The sample average and sample variance of relative error Relative Error (Avergage)

0.3 0.2 0.1 0

1

2 10

1.5

3

4

5

6

7

8

9

10

11

9

10

11

Iteration Index Relative Error (Variance)

-4

1 0.5 0

1

2

3

4

5

6

7

Iteration Index

Fig. 9.1 Experimental results of the second-order system

8

262

B. Pang and Z.-P. Jiang

for each iteration index i are plotted in Fig. 9.1. This validates Theorem 9.3. Since Theorem 9.3 is based on Theorem 9.2, our robustness results are also verified.

9.6 Conclusions This chapter analyzes the robustness of policy iteration for stochastic LQR with multiplicative noises. It is proved that starting from any mean-square stabilizing initial policy, the solutions generated by policy iteration with errors are bounded and ultimately enter and stay in a neighborhood of the optimal solution, as long as the errors are small and bounded. This result is employed to prove the convergence of the multiple-trajectory optimistic least-squares policy iteration (MO-LSPI), a novel offpolicy model-free RL algorithm for discrete-time LQR with stochastic multiplicative noises in the model. The theoretical results are validated by the experiments on a numerical example. Acknowledgements Confucius once said, Virtue is not left to stand alone. He who practices it will have neighbors. Laurent Praly, the former PhD advisor of the second-named author, is such a beautiful mind. His vision about and seminal contributions to control theory, especially nonlinear and adaptive control, have influenced generations of students including the authors of this chapter. ZPJ is privileged to have Laurent as the PhD advisor during 1989–1993 and is very grateful to Laurent for introducing him to the field of nonlinear control. It is under Laurent’s close guidance that ZPJ started, in 1991, working on the stability and control of interconnected nonlinear systems that has paved the foundation for nonlinear small-gain theory. The research findings presented here are just a reflection of Laurent’s vision about the relationships between control and learning. We also thank the U.S. National Science Foundation for its continuous financial support.

Appendix 1 The following lemma provides the relationship between operations vec(·) and svec(·). Lemma 9.5 ([40, Page 57]) For X ∈ Sn , there exists a unique matrix Dn 2 1 ∈ Rn × 2 n(n+1) with full column rank, such that vec(X ) = Dn svec(X ), svec(X ) = Dn† vec(X ). Dn is called the duplication matrix. Lemma 9.6 ([44, Lemma A.3.]) Let O be a compact set such that ρ(O) < 1 for any O ∈ O, then there exist an a0 > 0 and a 0 < b0 < 1, such that O k 2 ≤ a0 b0k , ∀k ∈ Z+ for any O ∈ O.

9 Robust Reinforcement Learning …

263

For X ∈ Rn×n , Y ∈ Rn×m , X + X ∈ Rn×n , Y + Y ∈ Rn×m , supposing X and X + X are invertible, the following inequality is repeatedly used: X −1 Y − (X + X )−1 (Y + Y ) F = X −1 Y − X −1 (Y + Y ) +X −1 (Y + Y ) − (X + X )−1 (Y + Y ) F

=  − X −1 Y + X −1 X (X + X )−1 (Y + Y ) F

(9.23)

≤ X −1  F Y  F + X −1  F (X + X )−1  F (Y + Y ) F X  F .

Appendix 2 The following property of L K (·) is useful. Lemma 9.7 If K is mean-square stabilizing, then L K (Y1 ) ≤ L K (Y2 ) =⇒ Y1 ≥ Y2 , where Y1 , Y2 ∈ Sn . Proof Let {xt }∞ t=0 be the solution of the closed-loop system (9.1) with controller u = −K x. Then for any t ≥ 1 T Y1 xt+1 − xtT Y1 xt ] = E[xtT L K (Y1 )xt ] E[xt+1 T ≤ E[xtT L K (Y2 )xt ] = E[xt+1 Y2 xt+1 − xtT Y2 xt ].

Since K is mean-square stabilizing, −x0T Y1 x0 =

∞ 

E[xtT L K (Y1 )xt ] ≤

t=0

∞ 

E[xtT L K (Y2 )xt ] = −x0T Y2 x0 .

t=0

The proof is complete because x0 is arbitrary.



Now we are ready to prove Theorem 9.1. Proof (Theorem 9.1) By (9.7) and (9.8), for any x ∈ Rn , K 2 ∈ arg min{x T H(P1 , K )x}. K ∈Rm×n

Thus H(P1 , K 2 ) ≤ 0. By definition, P1 > 0 and L K 2 (P1 ) ≤ −S − K 2T R K 2 < 0. Then Lemma 9.1 implies that K 2 is mean-square stabilizing. Inserting (9.7) into the above inequality yields L K 2 (P1 ) ≤ L K 2 (P2 ). This implies P1 ≥ P2 by Lemma 9.7. An application of mathematical induction proves the first two items. For the last item, by a theorem on the convergence of a monotone sequence of self-adjoint operators

264

B. Pang and Z.-P. Jiang

(see [32, Pages 189–190]), limi→∞ Pi and limi→∞ K i exist. Letting i → ∞ in (9.7) and (9.8), and eliminating K ∞ in (9.7) using (9.8), we have P∞ = S + (P∞ ) − A T P∞ B(R + (P∞ ))−1 B T P∞ A. The proof is complete by the uniqueness of P ∗ .



Appendix 3 Proof (Lemma 9.2) Since K (P ∗ ) is mean-square stabilizing, by continuity there always exists a δ¯0 > 0, such that R(Pi ) is invertible, K (Pi ) is mean-square stabi¯ δ¯ (P ∗ ). Suppose Pi ∈ B ¯ δ¯ (P ∗ ). Subtracting lizing for all Pi ∈ B 0 0 T T K i+1 B T P ∗ A + A T P ∗ B K i+1 − K i+1 R(P ∗ )K i+1

from both sides of the GARE (9.4) yields LK

(Pi ) (P



) = −S − K T (Pi )RK (Pi )+

(K (Pi ) − K (P ∗ ))T R(P ∗ )(K (Pi ) − K (P ∗ )).

(9.24)

Subtracting (9.24) from (9.12), we have  ∗ T ∗ ∗ Pi+1 − P ∗ = −L−1 K (Pi ) ((K (Pi ) − K (P )) R(P )(K (Pi ) − K (P )) . Taking norm on both sides of above equation, (9.11) yields Pi+1 − P ∗  F ≤ A (K (Pi ))−1 2 R(P ∗ ) F K (Pi ) − K (P ∗ )2F . Since K (·) is locally Lipschitz continuous at P ∗ , by continuity of matrix norm and matrix inverse, there exists a c1 > 0, such that ¯ δ¯ (P ∗ ). Pi+1 − P ∗  F ≤ c1 Pi − P ∗ 2F , ∀Pi ∈ B 0 So for any 0 < σ < 1, there exists a δ¯0 ≥ δ0 > 0 with c1 δ0 ≤ σ . This completes the proof. 

Appendix 4 Before the proof of Lemma 9.3, some auxiliary lemmas are firstly proved. Procedure 9.2 will exhibit a singularity, if [Gˆ i ]uu in (9.10) is singular, or the cost (9.2) of Kˆ i+1 is

9 Robust Reinforcement Learning …

265

infinity. The following lemma shows that if G i is small, no singularity will occur. Let δ¯0 be the one defined in the proof of Lemma 9.2, then δ0 ≤ δ¯0 . Lemma 9.8 For any P˜i ∈ Bδ0 (P ∗ ), there exists a d(δ0 ) > 0, independent of P˜i , such that Kˆ i+1 is mean-square stabilizing and [Gˆ i ]uu is invertible, if G i  F ≤ d. ¯ δ¯ (P ∗ ) is compact and A (K (·)) is a continuous function, set Proof Since B 0 ¯ δ¯ (P ∗ )} S = {A (K ( P˜i ))| P˜i ∈ B 0 is also compact. By continuity and Lemma 9.1, for each X ∈ S, there exists a r (X ) > 0 such that ρ(Y + In ⊗ In ) < 1 for any Y ∈ Br (X ) (X ). The compactness of S implies the existence of a r > 0, such that ρ(Y + In ⊗ In ) < 1 for each Y ∈ Br (X ) and all X ∈ S. Similarly, there exists d1 > 0 such that [Gˆ i ]uu is invertible for all P˜i ∈ ¯ δ¯ (P ∗ ), if G i  F ≤ d1 . Note that in policy improvement step of Procedure 9.1 B 0 ˜ (the policy update step in Procedure 9.2), the improved policy K˜ i+1 = [G˜ i ]−1 uu [G i ]ux ˆ ˜ ˆ (the updated policy K i+1 ) is continuous function of G i (G i ), and there exists a 0 < ¯ δ¯ (P ∗ ), if G i  F ≤ d2 ≤ d1 , such that A ( Kˆ i+1 ) ∈ Br (A (K ( P˜i ))) for all P˜i ∈ B 0 d2 . Thus, Lemma 9.1 implies that Kˆ i+1 is mean-square stabilizing. Setting d = d2 completes the proof.  ∞ By Lemma 9.8, if G i  F ≤ d, the sequence { P˜i }i=0 satisfies (9.13). For simplicity, ˜ we denote E(G i , G i ) in (9.13) by Ei . The following lemma gives an upper bound on Ei  F in terms of G i  F .

Lemma 9.9 For any P˜i ∈ Bδ0 (P ∗ ) and any c2 > 0, there exists a 0 < δ11 (δ0 , c2 ) ≤ d, independent of P˜i , where d is defined in Lemma 9.8, such that Ei  F ≤ c3 G i  F < c2 , if G i  F < δ11 , where c3 (δ0 ) > 0. ¯ δ0 (P ∗ ), G i  F ≤ d, we have from (9.23) Proof For any P˜i ∈ B ˆ −1 ˆ K ( P˜i ) − Kˆ i+1  F ≤ [G˜ i ]−1 uu  F (1 + [G i ]uu  F [G i ]ux  F )G i  F ≤ c4 (δ0 , d)G i  F ,

(9.25)

where the last inequality comes from the continuity of matrix inverse and the extremum value theorem. Define



T −S − K ( P˜i )T RK ( P˜i ) . R Kˆ i+1 , P˚i = L−1 Pˇi = L−1 −S − Kˆ i+1 Kˆ i+1

Then by (9.11) and (9.13),

K ( P˜i )

266

B. Pang and Z.-P. Jiang

Ei  F =  vec( Pˇi − P˚i )2 ,



T vec( Pˇi ) = A −1 Kˆ i+1 vec −S − Kˆ i+1 R Kˆ i+1 ,



vec( P˚i ) = A −1 K ( P˜i ) vec −S − K ( P˜i )T RK ( P˜i ) . Define





T Ai = A K ( P˜i ) − A Kˆ i+1 , bi = vec K ( P˜i )T R K ( P˜i ) − Kˆ i+1 R Kˆ i+1 .

Using (9.25), it is easy to check that Ai  F ≤ c5 G i  F , bi 2 ≤ c6 G i  F , for some c5 (δ0 , d) > 0, c6 (δ0 , d) > 0. Then by (9.23)



Ei  F ≤ A −1 Kˆ i+1 c6 + c5 A −1 K ( P˜i ) F F T ˜ ˜ × S + K ( Pi ) RK ( Pi ) G i  F ≤ c3 (δ0 )G i  F , F

where the last inequality comes from the continuity of matrix inverse and Lemma 9.8. Choosing 0 < δ11 ≤ d such that c3 δ11 < c2 completes the proof. Now we are ready to prove Lemma 9.3. Proof (Lemma 9.3) Let c2 = (1 − σ )δ0 in Lemma 9.9, and δ1 be equal to the δ11 associated with c2 . For any i ∈ Z+ , if P˜i ∈ Bδ0 (P ∗ ), then [Gˆ i ]uu is invertible, Kˆ i+1 is mean-square stabilizing and ˜iT B R −1 B T P˜i ) − P ∗ (S + P  P˜i+1 − P ∗  F ≤ Ei  F + L−1 K ( P˜ ) i

F

≤ σ  P˜i − P ∗  F + c3 G i  F ≤ σ  P˜i − P ∗  F + c3 G∞

(9.26)

< σ δ0 + c3 δ1 < σ δ0 + c2 = δ0 ,

(9.28)

(9.27)

where (9.26) and (9.28) are due to Lemmas 9.2 and 9.9. By induction, (9.26) to (9.28) hold for all i ∈ Z+ , thus by (9.27),  P˜i − P ∗  F ≤ σ 2  P˜i−2 − P ∗  F + (σ + 1)c3 G∞ ≤ · · · ≤ σ i  P˜0 − P ∗  F + (1 + · · · + σ i−1 )c3 G∞ c3 G∞ , < σ i  P˜0 − P ∗  F + 1−σ which proves (i) and (ii) in Lemma 9.3. Then (9.25) implies (iii) in Lemma 9.3. In terms of (iv) in Lemma 9.3, for any > 0, there exists a i 1 ∈ Z+ , such that ∞ < γ −1 ( /2). Take i 2 ≥ i 1 . For i ≥ i 2 , we have by (ii) in Lemma sup{G i  F }i=i 1 9.3,

9 Robust Reinforcement Learning …

267

 P˜i − P ∗  F ≤ β( P˜i2 − P ∗  F , i − i 2 ) + /2 ≤ β(c7 , i − i 2 ) + /2, where the second inequality is due to the boundedness of P˜i . Since limi→∞ β(c7 , i − i 2 ) = 0, there is a i 3 ≥ i 2 such that β(c7 , i − i 2 ) < /2 for all i ≥ i 3 , which completes the proof. 

Appendix 5 Notice that all the conclusions of Theorem 9.2 can be implied by Lemma 9.3 if δ2 < min(γ −1 ( ), δ1 ),

P˜1 ∈ Bδ0 (P ∗ )

for Procedure 9.2. Thus, the proof of Theorem 9.2 reduces to the proof of the following lemma. Lemma 9.10 Given a mean-square stabilizing Kˆ 1 , there exist 0 < δ2 < min(γ −1 ( ), δ1 ), i¯ ∈ Z+ , α2 > 0, and κ2 > 0, such that [Gˆ i ]uu is invertible, Kˆ i is mean¯ P˜i¯ ∈ Bδ0 (P ∗ ), as long square stabilizing,  P˜i  F < α2 ,  Kˆ i  F < κ2 , i = 1, · · · , i, as G∞ < δ2 . The next two lemmas state that under certain conditions on G i  F , each element ¯ ¯ in { Kˆ i }ii=1 is mean-square stabilizing, each element in {[Gˆ i ]uu }ii=1 is invertible, and ¯ { P˜i }ii=1 is bounded. For simplicity, in the following we assume S > In and R > Im . All the proofs still work for any S > 0 and R > 0, by suitable rescaling. Lemma 9.11 If Kˆ i is mean-square stabilizing, then [Gˆ i ]uu is nonsingular and Kˆ i+1 is mean-square stabilizing, as long as G i  F < ai , where −1

√ √ ai = m( n +  Kˆ i 2 )2 + m( n +  Kˆ i+1 2 )2 . Furthermore,

 Kˆ i+1  F ≤ 2R −1  F (1 + B T P˜i A F ).

(9.29)

Proof By definition,
$$\big\|[\tilde G_i]^{-1}_{uu}\big([\hat G_i]_{uu} - [\tilde G_i]_{uu}\big)\big\|_F < a_i\,\big\|[\tilde G_i]^{-1}_{uu}\big\|_F.$$
Since $R > I_m$, the eigenvalues satisfy $\lambda_j([\tilde G_i]^{-1}_{uu}) \in (0, 1]$ for all $1 \le j \le m$. Then, by the fact that for any $X \in \mathbb{S}^m$, $\|X\|_F = \|\Lambda_X\|_F$ with $\Lambda_X = \mathrm{diag}\{\lambda_1(X), \dots, \lambda_m(X)\}$, we have
$$\big\|[\tilde G_i]^{-1}_{uu}\big([\hat G_i]_{uu} - [\tilde G_i]_{uu}\big)\big\|_F < a_i \sqrt{m} < 0.5. \tag{9.30}$$

Thus, by [26, Section 5.8], $[\hat G_i]_{uu}$ is invertible. For any $x \in \mathbb{R}^n$ on the unit ball, define
$$X_{\hat K_i} = \begin{bmatrix} I \\ -\hat K_i \end{bmatrix} x\, x^T \begin{bmatrix} I & -\hat K_i^T \end{bmatrix}.$$
From (9.9) and (9.10) we have $x^T \mathcal{H}(\tilde G_i, \hat K_i)x = \mathrm{tr}(\tilde G_i X_{\hat K_i}) = 0$, and
$$\mathrm{tr}(\hat G_i X_{\hat K_{i+1}}) = \min_{K \in \mathbb{R}^{m\times n}} \mathrm{tr}(\hat G_i X_K).$$
Then
$$\begin{aligned} \mathrm{tr}(\tilde G_i X_{\hat K_{i+1}}) &\le \mathrm{tr}(\hat G_i X_{\hat K_{i+1}}) + \|\Delta G_i\|_F\, \mathrm{tr}\big(\mathbf{1}\mathbf{1}^T |X_{\hat K_{i+1}}|_{\rm abs}\big) \\ &\le \mathrm{tr}(\hat G_i X_{\hat K_i}) + \|\Delta G_i\|_F\, \mathbf{1}^T |X_{\hat K_{i+1}}|_{\rm abs} \mathbf{1} \\ &\le \mathrm{tr}(\tilde G_i X_{\hat K_i}) + \|\Delta G_i\|_F\, \mathbf{1}^T \big(|X_{\hat K_i}|_{\rm abs} + |X_{\hat K_{i+1}}|_{\rm abs}\big) \mathbf{1} \\ &\le \|\Delta G_i\|_F\, \mathbf{1}^T \big(|X_{\hat K_i}|_{\rm abs} + |X_{\hat K_{i+1}}|_{\rm abs}\big) \mathbf{1}, \end{aligned} \tag{9.31}$$
where $|X_{\hat K_i}|_{\rm abs}$ denotes the matrix obtained from $X_{\hat K_i}$ by taking the absolute value of each entry. Thus, by (9.31) and the definition of $\tilde G_i$, we have
$$x^T \mathcal{L}_{\hat K_{i+1}}(\tilde P_i)\, x + \Delta_1 \le 0, \tag{9.32}$$
where
$$\Delta_1 = x^T \big(S + \hat K_{i+1}^T R \hat K_{i+1}\big) x - \|\Delta G_i\|_F\, \mathbf{1}^T \big(|X_{\hat K_i}|_{\rm abs} + |X_{\hat K_{i+1}}|_{\rm abs}\big) \mathbf{1}.$$
For any $x$ on the unit ball, $|\mathbf{1}^T x|_{\rm abs} \le \sqrt{n}$. Similarly, for any $K \in \mathbb{R}^{m\times n}$, by the definition of the induced matrix norm, $|\mathbf{1}^T K x|_{\rm abs} \le \|K\|_2 \sqrt{m}$. This implies
$$\left|\, \mathbf{1}^T \begin{bmatrix} I \\ -K \end{bmatrix} x \right|_{\rm abs} = \big|\mathbf{1}^T x - \mathbf{1}^T K x\big|_{\rm abs} \le \sqrt{m}\,\big(\sqrt{n} + \|K\|_2\big),$$
which means $\mathbf{1}^T |X_K|_{\rm abs} \mathbf{1} \le m\,(\sqrt{n} + \|K\|_2)^2$. Thus $\|\Delta G_i\|_F\, \mathbf{1}^T (|X_{\hat K_i}|_{\rm abs} + |X_{\hat K_{i+1}}|_{\rm abs}) \mathbf{1} < 1$. Then $S > I_n$ leads to


$x^T \mathcal{L}_{\hat K_{i+1}}(\tilde P_i)\, x < 0$ for all $x$ on the unit ball. So $\hat K_{i+1}$ is mean-square stabilizing by Lemma 9.1. By definition,
$$\begin{aligned} \|\hat K_{i+1}\|_F &\le \big\|[\hat G_i]^{-1}_{uu}\big\|_F\, \big(1 + \|B^T \tilde P_i A\|_F\big) \\ &\le \big\|[\tilde G_i]^{-1}_{uu}\big\|_F\, \Big(1 - \big\|[\tilde G_i]^{-1}_{uu}\big([\hat G_i]_{uu} - [\tilde G_i]_{uu}\big)\big\|_F\Big)^{-1} \big(1 + \|B^T \tilde P_i A\|_F\big) \\ &\le 2\|R^{-1}\|_F\, \big(1 + \|B^T \tilde P_i A\|_F\big), \end{aligned} \tag{9.33}$$
where the second inequality comes from [26, Inequality (5.8.2)], and the last inequality is due to (9.30). This completes the proof. □
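The invertibility of $[\hat G_i]_{uu}$ and the bound (9.33) both rest on a standard perturbation fact ([26, Section 5.8]): if $\|A^{-1}(B - A)\| < 1$, then $B$ is invertible and $\|B^{-1}\| \le \|A^{-1}\| / (1 - \|A^{-1}(B - A)\|)$. A minimal numeric sketch, with random matrices standing in for $[\tilde G_i]_{uu}$ and $[\hat G_i]_{uu}$ (sizes and seed are arbitrary):

```python
import numpy as np

# Perturbation fact behind (9.30) and (9.33): if ||A^{-1}(B - A)|| < 1,
# then B = A (I + A^{-1}(B - A)) is invertible and
# ||B^{-1}|| <= ||A^{-1}|| / (1 - ||A^{-1}(B - A)||).
rng = np.random.default_rng(1)
A = np.eye(3) + 0.1 * rng.standard_normal((3, 3))   # "true" block
B = A + 0.05 * rng.standard_normal((3, 3))          # perturbed estimate

r = np.linalg.norm(np.linalg.inv(A) @ (B - A), 2)
assert r < 1                                        # perturbation small enough
lhs = np.linalg.norm(np.linalg.inv(B), 2)
rhs = np.linalg.norm(np.linalg.inv(A), 2) / (1 - r)
print(lhs <= rhs + 1e-12)                           # the bound holds
```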

where the second inequality comes from [26, Inequality (5.8.2)], and the last inequality is due to (9.30). This completes the proof.  Lemma 9.12 For any i¯ ∈ Z+ , i¯ > 0, if ¯ G i  F < (1 + i 2 )−1 ai , i = 1, · · · , i,

(9.34)

where ai is defined in Lemma 9.11, then  P˜i  F ≤ 6 P˜1  F ,  Kˆ i  F ≤ C0 , ¯ where for i = 1, · · · , i,

  C0 = max  Kˆ 1  F , 2R −1  F 1 + 6B T  F  P˜1  F A F . Proof Inequality (9.32) yields T L Kˆ i+1 ( P˜i ) + (S + Kˆ i+1 R Kˆ i+1 ) − 2,i I < 0,

(9.35)

where 2,i = G i  F 1T (|X Kˆ i |abs + |X Kˆ i+1 |abs )1 < 1. Inserting (9.9) into above inequality, and using Lemma 9.7, we have (−I ). P˜i+1 < P˜i + 2,i L−1 Kˆ

(9.36)

i+1

With S > In , (9.35) yields L Kˆ i+1 ( P˜i ) + (1 − 2,i )I < 0. Similar to (9.36), we have L−1 (−I ) < Kˆ i+1

1 P˜i . 1 − 2,i

(9.37)

270

B. Pang and Z.-P. Jiang

From (9.36) to (9.37), we obtain  P˜i+1 < 1 +

2,i 1 − 2,i



P˜i .

By definition of 2,i and condition (9.34), 1 2,i ¯ ≤ 2 , i = 1, · · · , i. 1 − 2,i i Then [34, §28. Theorem 3] yields ¯ P˜i ≤ 6 P˜1 , i = 1, · · · , i. 
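The jump from $\Delta_{2,i}/(1-\Delta_{2,i}) \le 1/i^2$ to the uniform bound $\tilde P_i \le 6\tilde P_1$ uses the convergence of the infinite product $\prod_{i\ge 1}(1 + 1/i^2)$ ([34, §28]); its value is $\sinh(\pi)/\pi \approx 3.68$, so 6 is a comfortable constant. A quick numeric check:

```python
import math

# The partial products of prod_{i>=1} (1 + 1/i^2) increase to
# sinh(pi)/pi ~ 3.676, comfortably below the constant 6 used above.
prod = 1.0
for i in range(1, 100001):
    prod *= 1.0 + 1.0 / i**2

print(prod, math.sinh(math.pi) / math.pi)
assert prod < 6.0
```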

An application of (9.29) completes the proof. □

Now we are ready to prove Lemma 9.10.

Proof (Lemma 9.10) Consider Procedure 9.2 confined to the first $\bar i$ iterations, where $\bar i$ is a sufficiently large integer to be determined later in this proof. Suppose
$$\|\Delta G_i\|_F < b_{\bar i} \triangleq \frac{1}{2\sqrt{m}\,(1 + \bar i^{\,2})}\,\big(\sqrt{n} + C_0\big)^{-2}. \tag{9.38}$$

Condition (9.38) implies condition (9.34). Thus $\hat K_i$ is mean-square stabilizing, $[\hat G_i]_{uu}$ is invertible, and $\|\tilde P_i\|_F$ and $\|\hat K_i\|_F$ are bounded. By (9.9) we have
$$\mathcal{L}_{\hat K_{i+1}}(\tilde P_{i+1} - \tilde P_i) = -S - \hat K_{i+1}^T R \hat K_{i+1} - \mathcal{L}_{\hat K_{i+1}}(\tilde P_i).$$
Letting $E_i = \hat K_{i+1} - \mathcal{K}(\tilde P_i)$, the above equation can be rewritten as
$$\tilde P_{i+1} = \tilde P_i - \mathcal{N}(\tilde P_i) + \mathcal{L}^{-1}_{\mathcal{K}(\tilde P_i)}(\mathcal{E}_i), \tag{9.39}$$
where $\mathcal{N}(\tilde P_i) = \mathcal{L}^{-1}_{\mathcal{K}(\tilde P_i)} \circ \mathcal{R}(\tilde P_i)$, with the generalized Riccati residual
$$\mathcal{R}(Y) = \mathbb{E}[A^T Y A] - Y - A_0^T Y B_0\,\big(R + \mathbb{E}[B^T Y B]\big)^{-1} B_0^T Y A_0 + S,$$
and
$$\mathcal{E}_i = -E_i^T R(\tilde P_{i+1})\, E_i + E_i^T R(\tilde P_{i+1})\big(\mathcal{K}(\tilde P_{i+1}) - \mathcal{K}(\tilde P_i)\big) + \big(\mathcal{K}(\tilde P_{i+1}) - \mathcal{K}(\tilde P_i)\big)^T R(\tilde P_{i+1})\, E_i.$$
Given $\hat K_1$, let $\mathcal{M}_{\bar i}$ denote the set of all possible $\tilde P_i$ generated by (9.39) under condition (9.38). By definition, $\{\mathcal{M}_j\}_{j=1}^{\infty}$ is a nondecreasing sequence of sets, i.e., $\mathcal{M}_1 \subset \mathcal{M}_2 \subset \cdots$. Define $\mathcal{M} = \cup_{j=1}^{\infty} \mathcal{M}_j$ and $\mathcal{D} = \{P \in \mathbb{S}^n \mid \|P\|_F \le 6\|\tilde P_1\|_F\}$. Then, by Lemma 9.12 and Theorem 9.1, $\mathcal{M} \subset \mathcal{D}$; $\mathcal{M}$ is compact; and $\mathcal{K}(P)$ is stable for any $P \in \mathcal{M}$.


Now we prove that $\mathcal{N}(P^1)$ is Lipschitz continuous on $\mathcal{M}$. Using (9.11), we have
$$\begin{aligned} \|\mathcal{N}(P^1) - \mathcal{N}(P^2)\|_F &= \big\|\mathcal{A}^{-1}(\mathcal{K}(P^1))\,\mathrm{vec}(\mathcal{R}(P^1)) - \mathcal{A}^{-1}(\mathcal{K}(P^2))\,\mathrm{vec}(\mathcal{R}(P^2))\big\|_2 \\ &\le \big\|\mathcal{A}^{-1}(\mathcal{K}(P^1))\big\|_2\, \|\mathcal{R}(P^1) - \mathcal{R}(P^2)\|_F + \|\mathcal{R}(P^2)\|_F\, \big\|\mathcal{A}^{-1}(\mathcal{K}(P^1)) - \mathcal{A}^{-1}(\mathcal{K}(P^2))\big\|_2 \\ &\le L\,\|P^1 - P^2\|_F, \end{aligned} \tag{9.40}$$
where the last inequality is due to the fact that matrix inversion, $\mathcal{A}(\cdot)$, $\mathcal{K}(\cdot)$, and $\mathcal{R}(\cdot)$ are locally Lipschitz, thus Lipschitz on the compact set $\mathcal{M}$ with some Lipschitz constant $L > 0$.

Define $\{P_{k|i}\}_{k=0}^{\infty}$ as the sequence generated by (9.12) with $P_{0|i} = \tilde P_i$. Similar to (9.39), we have
$$P_{k+1|i} = P_{k|i} - \mathcal{N}(P_{k|i}), \qquad k \in \mathbb{Z}_+. \tag{9.41}$$
By Theorem 9.1 and the fact that $\mathcal{M}$ is compact, there exists $k_0 \in \mathbb{Z}_+$ such that
$$\|P_{k_0|i} - P^*\|_F < \delta_0/2, \qquad \forall P_{0|i} \in \mathcal{M}. \tag{9.42}$$
Suppose
$$\big\|\mathcal{L}^{-1}_{\mathcal{K}(\tilde P_{i+j})}(\mathcal{E}_{i+j})\big\|_F < \mu, \qquad j = 0, \dots, \bar i - i. \tag{9.43}$$
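The vectorized operator $\mathcal{A}^{-1}(\mathcal{K}(P))\,\mathrm{vec}(\cdot)$ in (9.11) and (9.40) relies on the standard identity $\mathrm{vec}(AXB) = (B^T \otimes A)\,\mathrm{vec}(X)$, which turns linear matrix equations such as Lyapunov equations into ordinary linear systems. A minimal check with arbitrary illustrative sizes:

```python
import numpy as np

# The vectorization identity behind (9.11) and (9.40):
# vec(A X B) = (B^T kron A) vec(X), with column-stacking vec.
rng = np.random.default_rng(2)
A = rng.standard_normal((3, 3))
X = rng.standard_normal((3, 4))
B = rng.standard_normal((4, 2))

vec = lambda M: M.reshape(-1, order="F")   # column-stacking vec
lhs = vec(A @ X @ B)
rhs = np.kron(B.T, A) @ vec(X)
assert np.allclose(lhs, rhs)
```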

We find an upper bound on $\|P_{k|i} - \tilde P_{i+k}\|_F$. Notice that from (9.39) and (9.41),
$$P_{k|i} = P_{0|i} - \sum_{j=0}^{k-1} \mathcal{N}(P_{j|i}), \qquad \tilde P_{i+k} = \tilde P_i - \sum_{j=0}^{k-1} \mathcal{N}(\tilde P_{i+j}) + \sum_{j=0}^{k-1} \mathcal{L}^{-1}_{\mathcal{K}(\tilde P_{i+j})}(\mathcal{E}_{i+j}).$$
Then (9.40) and (9.43) yield
$$\|P_{k|i} - \tilde P_{i+k}\|_F \le k\mu + \sum_{j=0}^{k-1} L\,\|P_{j|i} - \tilde P_{i+j}\|_F.$$
An application of the Gronwall inequality [2, Theorem 4.1.1] to the above inequality implies
$$\|P_{k|i} - \tilde P_{i+k}\|_F \le k\mu + \sum_{j=0}^{k-1} L\mu\, j\,(1 + L)^{k-j-1}. \tag{9.44}$$

By (9.11), the error term in (9.39) satisfies
$$\big\|\mathcal{L}^{-1}_{\mathcal{K}(\tilde P_i)}(\mathcal{E}_i)\big\|_F = \big\|\mathcal{A}^{-1}(\mathcal{K}(\tilde P_i))\,\mathrm{vec}(\mathcal{E}_i)\big\|_2 \le C_1 \|\mathcal{E}_i\|_F, \tag{9.45}$$
where $C_1$ is a constant and the inequality is due to the continuity of the matrix inverse. Let $\bar i > k_0$, and set $k = k_0$, $i = \bar i - k_0$ in (9.44). Then, by condition (9.38), Lemma 9.12, (9.43), (9.44), and (9.45), there exists $i_0 \in \mathbb{Z}_+$, $i_0 > k_0$, such that $\|P_{k_0|\bar i - k_0} - \tilde P_{\bar i}\|_F < \delta_0/2$ for all $\bar i \ge i_0$. Setting $i = \bar i - k_0$ in (9.42), the triangle inequality yields $\tilde P_{\bar i} \in B_{\delta_0}(P^*)$ for $\bar i \ge i_0$. Then, in (9.38), choosing $\bar i \ge i_0$ such that $\delta_2 = b_{\bar i} < \min(\gamma^{-1}(\varepsilon), \delta_1)$ completes the proof. □
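The bound (9.44) used above is the discrete Gronwall inequality at work: if $x_k \le k\mu + L\sum_{j<k} x_j$, then $x_k \le k\mu + \sum_{j<k} L\mu\, j\,(1+L)^{k-j-1}$, with equality in the worst case. A numeric check with hypothetical $L$ and $\mu$:

```python
# Numeric check (hypothetical L and mu) of the discrete Gronwall bound
# in (9.44): if x_k <= k*mu + L * sum_{j<k} x_j, then
# x_k <= k*mu + sum_{j<k} L*mu*j*(1+L)**(k-j-1).
L, mu = 0.3, 0.01
x = [0.0]
for k in range(1, 30):
    x.append(k * mu + L * sum(x))          # worst case: the inequality is tight
for k in range(30):
    bound = k * mu + sum(L * mu * j * (1 + L) ** (k - j - 1) for j in range(k))
    assert x[k] <= bound + 1e-8            # the closed-form bound holds
print(x[5], 5 * mu + sum(L * mu * j * 1.3 ** (5 - j - 1) for j in range(5)))
```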

Appendix 6

For a given $\hat K_1$, let $\mathcal{K}$ denote the set of control gains (including $\hat K_1$) generated by Procedure 9.2 with all possible $\{\Delta G_i\}_{i=1}^{\infty}$ satisfying $\|\Delta G\|_\infty < \delta_2$, where $\delta_2$ is the one in Theorem 9.2. The following result is derived first.

Lemma 9.13 Under the conditions in Theorem 9.3, there exist $\bar L_0 > 0$ and $N_0 > 0$ such that for any $\bar L \ge \bar L_0$ and $N \ge N_0$, $\hat K_i \in \mathcal{K}$ implies $\|\Delta G_i\|_F < \delta_2$, almost surely.

Proof By definition, in the context of Algorithm 9.1,
$$\|\Delta G_i\|_F \le \|\hat Q_i - Q(\hat P_{i,\bar L})\|_F + \|Q(\hat P_{i,\bar L}) - Q(\tilde P_i)\|_F + \|\tilde P_i - \hat P_{i,\bar L}\|_F,$$
where $\tilde P_i$ is the unique solution of (9.9) with $K = \hat K_i$. Thus, the task is to prove that each term on the right-hand side of the above inequality is less than $\delta_2/3$.

To this end, we first study $\|\tilde P_i - \hat P_{i,\bar L}\|_F$. Defining $\hat p_{i,j} = \mathrm{vec}(\hat P_{i,j})$, by Lemma 9.5, Lines 11 and 12 in Algorithm 9.1 can be rewritten as
$$\hat p_{i,j+1} = T_1\big(\hat\Phi^{\dagger}_{N,M}, \hat\Psi^{2}_{N,M}, \hat K_i\big)\,\hat p_{i,j} + T_2\big(\hat\Phi^{\dagger}_{N,M}, \hat r_{N,M}, \hat K_i\big), \tag{9.46}$$
where $\hat p_{i,0} \in \mathbb{R}^{n^2}$ and
$$T_1\big(\hat\Phi^{\dagger}_{N,M}, \hat\Psi^{2}_{N,M}, \hat K_i\big) = \big([I_n, -\hat K_i^T] \otimes [I_n, -\hat K_i^T]\big)\, D_{(m+n)(m+n+1)/2}\, \hat\Phi^{\dagger}_{N,M}\, \hat\Psi^{2}_{N,M}\, D_n^{\dagger},$$
$$T_2\big(\hat\Phi^{\dagger}_{N,M}, \hat r_{N,M}, \hat K_i\big) = \big([I_n, -\hat K_i^T] \otimes [I_n, -\hat K_i^T]\big)\, D_{(m+n)(m+n+1)/2}\, \hat\Phi^{\dagger}_{N,M}\, \hat r_{N,M}.$$
Similar derivations applied to (9.20) with $K = \hat K_i$ yield
$$\bar p_{i,j+1} = T_1\big(\Phi_M, \Psi^2_M, \hat K_i\big)\,\bar p_{i,j} + T_2\big(\Phi_M, r_M, \hat K_i\big), \qquad \bar p_{i,0} \in \mathbb{R}^{n^2}. \tag{9.47}$$
Since (9.20) is identical to (9.14), (9.47) is identical to (9.15) with $K$ and $\mathrm{vec}(P_{K,j})$ replaced by $\hat K_i$ and $\bar p_{i,j}$, respectively, and
$$T_1\big(\Phi_M, \Psi^2_M, \hat K_i\big) = \mathcal{A}(\hat K_i) + I_n \otimes I_n, \qquad T_2\big(\Phi_M, r_M, \hat K_i\big) = \mathrm{vec}\big(S + \hat K_i^T R \hat K_i\big). \tag{9.48}$$
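Recursions of the form (9.46)–(9.47) are affine fixed-point iterations $p_{j+1} = T_1 p_j + T_2$: whenever $\rho(T_1) < 1$, they converge from any initial condition to the unique fixed point $(I - T_1)^{-1} T_2$, which is exactly what is used below to pass to (9.49) and (9.51). A sketch with arbitrary illustrative data, not the chapter's data matrices:

```python
import numpy as np

# Affine fixed-point iteration p_{j+1} = T1 @ p_j + T2: if rho(T1) < 1,
# it converges to the unique fixed point (I - T1)^{-1} T2.
rng = np.random.default_rng(3)
T1 = rng.standard_normal((5, 5))
T1 *= 0.9 / max(abs(np.linalg.eigvals(T1)))   # force rho(T1) = 0.9 < 1
T2 = rng.standard_normal(5)

p = np.zeros(5)
for _ in range(500):
    p = T1 @ p + T2

p_star = np.linalg.solve(np.eye(5) - T1, T2)  # the unique fixed point
assert np.allclose(p, p_star)
```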


Since $\hat K_i \in \mathcal{K}$ is mean-square stabilizing, by Lemma 9.4,
$$\lim_{j\to\infty} \bar P_{i,j} = \tilde P_i, \tag{9.49}$$
where $\bar P_{i,j} = \mathrm{vec}^{-1}(\bar p_{i,j})$. By definition and Theorem 9.2, $\bar{\mathcal{K}}$ is bounded, thus compact. Let $\mathcal{V}$ be the set of the unique solutions of (9.5) with $K \in \mathcal{K}$. Then, by Theorem 9.2, $\mathcal{V}$ is bounded. So $\mathcal{A}(K)$ is mean-square stable for all $K \in \bar{\mathcal{K}}$; otherwise, by (9.11) and Lemma 9.1, the boundedness of $\mathcal{V}$ would be contradicted. Define $\mathcal{K}_1 = \{\mathcal{A}(K) + I_n \otimes I_n \mid K \in \bar{\mathcal{K}}\}$. Then $\rho(X) < 1$ for any $X \in \mathcal{K}_1$, and by continuity $\mathcal{K}_1$ is a compact set. This implies the existence of a $\delta_3 > 0$ such that $\rho(X) < 1$ for any $X \in \bar{\mathcal{K}}_2$, where
$$\mathcal{K}_2 = \{X \mid X \in B_{\delta_3}(Y),\; Y \in \mathcal{K}_1\}.$$
Define
$$\Delta T^1_{N,M,i} = T_1\big(\Phi_M, \Psi^2_M, \hat K_i\big) - T_1\big(\hat\Phi^{\dagger}_{N,M}, \hat\Psi^{2}_{N,M}, \hat K_i\big),$$
$$\Delta T^2_{N,M,i} = T_2\big(\Phi_M, r_M, \hat K_i\big) - T_2\big(\hat\Phi^{\dagger}_{N,M}, \hat r_{N,M}, \hat K_i\big).$$
The boundedness of $\mathcal{K}$, (9.22), and (9.48) imply the existence of an $N_1 > 0$ such that, for any $N \ge N_1$ and any $\hat K_i \in \mathcal{K}$, almost surely
$$T_1\big(\hat\Phi^{\dagger}_{N,M}, \hat\Psi^{2}_{N,M}, \hat K_i\big) \in \bar{\mathcal{K}}_2, \qquad \big\|T_2\big(\hat\Phi^{\dagger}_{N,M}, \hat r_{N,M}, \hat K_i\big)\big\| < C_9, \tag{9.50}$$

where $C_9 > 0$ is a constant. Then
$$\rho\big(T_1\big(\hat\Phi^{\dagger}_{N,M}, \hat\Psi^{2}_{N,M}, \hat K_i\big)\big) < 1,$$
and (9.46) admits a unique stable equilibrium, that is,
$$\lim_{j\to\infty} \hat P_{i,j} = \mathring P_i \tag{9.51}$$
for some $\mathring P_i \in \mathbb{S}^n$. From (9.46), (9.47), (9.49), and (9.51), we have
$$\mathrm{vec}(\tilde P_i) = \big(I_{n^2} - T_1(\Phi_M, \Psi^2_M, \hat K_i)\big)^{-1} T_2\big(\Phi_M, r_M, \hat K_i\big),$$
$$\mathrm{vec}(\mathring P_i) = \big(I_{n^2} - T_1(\hat\Phi^{\dagger}_{N,M}, \hat\Psi^{2}_{N,M}, \hat K_i)\big)^{-1} T_2\big(\hat\Phi^{\dagger}_{N,M}, \hat r_{N,M}, \hat K_i\big).$$
Thus, by (9.23), for any $N \ge N_1$ and any $\hat K_i \in \mathcal{K}$, almost surely

$$\begin{aligned} \|\mathring P_i - \tilde P_i\|_F \le{}& \big\|\big(I_{n^2} - T_1(\Phi_M, \Psi^2_M, \hat K_i)\big)^{-1}\big\|_F\, \|\Delta T^2_{N,M,i}\|_F \\ &+ \big\|\big(I_{n^2} - T_1(\Phi_M, \Psi^2_M, \hat K_i)\big)^{-1}\big\|_F\, \big\|\big(I_{n^2} - T_1(\hat\Phi^{\dagger}_{N,M}, \hat\Psi^{2}_{N,M}, \hat K_i)\big)^{-1} T_2\big(\hat\Phi^{\dagger}_{N,M}, \hat r_{N,M}, \hat K_i\big)\big\|_F\, \|\Delta T^1_{N,M,i}\|_F \\ \le{}& C_{10}\, \|\Delta T^2_{N,M,i}\|_F + C_{11}\, \|\Delta T^1_{N,M,i}\|_F, \end{aligned}$$
where $C_{10}$ and $C_{11}$ are some positive constants, and the last inequality is due to (9.48), (9.50), and the fact that $\mathcal{K}_1$ and $\bar{\mathcal{K}}_2$ are compact sets. Then, for any $\varepsilon_1 > 0$, the boundedness of $\mathcal{K}$ and (9.22) imply the existence of an $N_2 \ge N_1$ such that for any $N \ge N_2$, almost surely
$$\|\mathring P_i - \tilde P_i\|_F < \varepsilon_1/2, \tag{9.52}$$
as long as $\hat K_i \in \mathcal{K}$. By Lemma 9.6 and (9.52), for any $N \ge N_2$ and any $\hat K_i \in \mathcal{K}$,
$$\|\mathring P_i - \hat P_{i,j}\|_F \le a_0\, b_0^j\, \|\mathring P_i\|_F \le a_1\, b_0^j,$$
for some $a_0 > 0$, $1 > b_0 > 0$, and $a_1 > 0$. Therefore, there exists an $\bar L_1 > 0$ such that for any $\bar L \ge \bar L_1$ and any $N \ge N_2$, almost surely

for some a0 > 0, 1 > b0 > 0 and a1 > 0. Therefore there exists a L¯ 1 > 0, such that for any L¯ ≥ L¯ 1 , and any N ≥ N2 , almost surely  Pˆi, L¯ − P˚i  F < 1 /2,

(9.53)

as long as Kˆ i ∈ K. With (9.52) and (9.53), we obtain  Pˆi, L¯ − P˜i  F < 1 ,

(9.54)

almost surely for any L¯ ≥ L¯ 1 , any N ≥ N2 , as long as Kˆ i ∈ K. Since 1 is arbitrary, we can choose 1 such that almost surely  Pˆi, L¯ − P˜i  F < δ2 /3 for any L¯ ≥ L¯ 1 , any N ≥ N2 , as long as Kˆ i ∈ K. Secondly, by definition and (9.54), there exist L¯ 2 ≥ L¯ 1 and N3 ≥ N2 , such that Q( Pˆi, L¯ ) − Q( P˜i ) F < δ2 /3 for any L¯ ≥ L¯ 2 , any N ≥ N3 , as long as Kˆ i ∈ K. Thirdly, since V is bounded, Pˆi, L¯ is also almost surely bounded by (9.54). Thus, from Line 14 in Algorithm 9.1 and (9.22), there exists N4 ≥ N3 , such that  Qˆ i − Q( Pˆi, L¯ ) F < δ2 /3 for any N ≥ N4 and any L¯ ≥ L¯ 2 , as long as Kˆ i ∈ K. Setting N0 = N4 and L¯ 0 = L¯ 2 yields G i  < δ2 .


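The repeated "for $N$ large enough, almost surely" arguments above trace back to sample averages converging to their expectations (the convergence (9.22)). A minimal illustration with hypothetical data, showing the estimation error of an empirical second-moment matrix shrinking as $N$ grows:

```python
import numpy as np

# Law-of-large-numbers illustration: the empirical second-moment matrix
# of i.i.d. standard-normal samples approaches the identity as N grows.
rng = np.random.default_rng(5)
errs = []
for N in (100, 1_000_000):
    Z = rng.standard_normal((N, 3))
    Sigma_hat = Z.T @ Z / N                 # empirical E[z z^T]
    errs.append(np.linalg.norm(Sigma_hat - np.eye(3)))
print(errs)                                 # error shrinks roughly like 1/sqrt(N)
assert errs[1] < errs[0]
```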


Now we are ready to prove the convergence of Algorithm 9.1.

Proof (Theorem 9.3) Since $\hat K_1 \in \mathcal{K}$, Lemma 9.13 implies $\|\Delta G_1\|_F < \delta_2$ almost surely. By definition, $\hat K_2 \in \mathcal{K}$. Thus $\|\Delta G_i\|_F < \delta_2$, $i = 1, 2, \dots$, almost surely, by mathematical induction. Then Theorem 9.2 completes the proof. □
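The overall message of Theorems 9.2–9.3 can be caricatured in a few lines: policy iteration that is perturbed at every policy-evaluation step by a small disturbance still produces stabilizing gains, and the value matrix lands in a small neighborhood of $P^*$. The sketch below uses a deterministic LQR and an artificial disturbance in place of Algorithm 9.1's learning errors; all matrices and constants are hypothetical:

```python
import numpy as np

# Caricature of robust policy iteration (NOT the chapter's Algorithm 9.1):
# deterministic discrete-time LQR, with a small disturbance injected into
# every policy-evaluation step in place of the learning errors.
np.random.seed(0)
A = np.array([[1.0, 0.1], [0.0, 1.0]])
B = np.array([[0.0], [0.1]])
S = np.eye(2)   # state weighting
R = np.eye(1)   # input weighting

def evaluate(K):
    """Policy evaluation: solve P = S + K'RK + (A+BK)' P (A+BK)."""
    Acl = A + B @ K
    n = A.shape[0]
    M = np.eye(n * n) - np.kron(Acl.T, Acl.T)
    return np.linalg.solve(M, (S + K.T @ R @ K).reshape(-1)).reshape(n, n)

def improve(P):
    """Policy improvement: K = -(R + B'PB)^{-1} B'PA."""
    return -np.linalg.solve(R + B.T @ P @ B, B.T @ P @ A)

K = np.array([[-1.0, -2.0]])               # stabilizing initial gain
for _ in range(30):                        # inexact ("robust") PI
    P = evaluate(K) + 1e-3 * np.random.randn(2, 2)
    P = (P + P.T) / 2                      # keep the disturbed P symmetric
    K = improve(P)

K_star = np.array([[-1.0, -2.0]])          # exact PI, for reference
for _ in range(60):
    K_star = improve(evaluate(K_star))
P_star = evaluate(K_star)

# The disturbed iteration remains stabilizing and lands near P*.
print(np.linalg.norm(P - P_star), max(abs(np.linalg.eigvals(A + B @ K))))
```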

References
1. Abbasi-Yadkori, Y., Lazic, N., Szepesvari, C.: Model-free linear quadratic control via reduction to expert prediction. In: International Conference on Artificial Intelligence and Statistics (AISTATS) (2019)
2. Agarwal, R.P.: Difference Equations and Inequalities: Theory, Methods, and Applications, 2nd edn. Marcel Dekker Inc, New York (2000)
3. Athans, M., Ku, R., Gershwin, S.: The uncertainty threshold principle: some fundamental limitations of optimal decision making under dynamic uncertainty. IEEE Trans. Autom. Control 22(3), 491–495 (1977)
4. Beghi, A., D'Alessandro, D.: Discrete-time optimal control with control-dependent noise and generalized Riccati difference equations. Automatica 34(8), 1031–1034 (1998)
5. Bertsekas, D.P.: Approximate policy iteration: a survey and some new methods. J. Control Theory Appl. 9(3), 310–335 (2011)
6. Bertsekas, D.P.: Reinforcement Learning and Optimal Control. Athena Scientific, Belmont, Massachusetts (2019)
7. Bian, T., Jiang, Z.P.: Continuous-time robust dynamic programming. SIAM J. Control Optim. 57(6), 4150–4174 (2019)
8. Bian, T., Wolpert, D.M., Jiang, Z.P.: Model-free robust optimal feedback mechanisms of biological motor control. Neural Comput. 32(3), 562–595 (2020)
9. Bitmead, R.R., Gevers, M., Wertz, V.: Adaptive Optimal Control: The Thinking Man's GPC. Prentice-Hall, Englewood Cliffs, New Jersey (1990)
10. Breakspear, M.: Dynamic models of large-scale brain activity. Nat. Neurosci. 20(3), 340–352 (2017)
11. Bryson, A.E., Ho, Y.C.: Applied Optimal Control: Optimization, Estimation and Control. Taylor & Francis (1975)
12. Buşoniu, L., de Bruin, T., Tolić, D., Kober, J., Palunko, I.: Reinforcement learning for control: performance, stability, and deep approximators. Annu. Rev. Control 46, 8–28 (2018)
13. Coppens, P., Patrinos, P.: Sample complexity of data-driven stochastic LQR with multiplicative uncertainty. In: The 59th IEEE Conference on Decision and Control (CDC), pp. 6210–6215 (2020)
14.
Coppens, P., Schuurmans, M., Patrinos, P.: Data-driven distributionally robust LQR with multiplicative noise. In: Learning for Dynamics and Control (L4DC), pp. 521–530. PMLR (2020) 15. De Koning, W.L.: Infinite horizon optimal control of linear discrete time systems with stochastic parameters. Automatica 18(4), 443–453 (1982) 16. De Koning, W.L.: Compensatability and optimal compensation of systems with white parameters. IEEE Trans. Autom. Control 37(5), 579–588 (1992) 17. Drenick, R., Shaw, L.: Optimal control of linear plants with random parameters. IEEE Trans. Autom. Control 9(3), 236–244 (1964) 18. Du, K., Meng, Q., Zhang, F.: A Q-learning algorithm for discrete-time linear-quadratic control with random parameters of unknown distribution: convergence and stabilization. arXiv preprint arXiv:2011.04970, 2020 19. Duncan, T.E., Guo, L., Pasik-Duncan, B.: Adaptive continuous-time linear quadratic gaussian control. IEEE Trans. Autom. Control 44(9), 1653–1662 (1999)


20. Gravell, B., Esfahani, P.M., Summers, T.: Learning robust controllers for linear quadratic systems with multiplicative noise via policy gradient. IEEE Trans. Autom. Control (2019) 21. Gravell, B., Esfahani, P.M., Summers, T.: Robust control design for linear systems via multiplicative noise. arXiv preprint arXiv:2004.08019 (2020) 22. Gravell, B., Ganapathy, K., Summers, T.: Policy iteration for linear quadratic games with stochastic parameters. IEEE Control Syst. Lett. 5(1), 307–312 (2020) 23. Guo, Y., Summers, T.H.: A performance and stability analysis of low-inertia power grids with stochastic system inertia. In: American Control Conference (ACC), pp. 1965–1970 (2019) 24. Hespanha, J.P., Naghshtabrizi, P., Xu, Y.: A survey of recent results in networked control systems. Proceedings of the IEEE 95(1), 138–162 (2007) 25. Hewer, G.: An iterative technique for the computation of the steady state gains for the discrete optimal regulator. IEEE Trans. Autom. Control (1971) 26. Horn, R.A., Johnson, C.R.: Matrix Analysis. Cambridge University Press, New York (2012) 27. Jiang, Y., Jiang, Z.P.: Adaptive dynamic programming as a theory of sensorimotor control. Biolog. Cybern. 108(4), 459–473 (2014) 28. Jiang, Y., Jiang, Z.P.: Robust Adaptive Dynamic Programming. Wiley, Hoboken, New Jersey (2017) 29. Jiang, Z.P., Bian, T., Gao, W.: Learning-based control: A tutorial and some recent results. Found. Trends Syst. Control. 8(3), 176–284 (2020) 30. Jiang, Z.P., Lin, Y., Wang, Y.: Nonlinear small-gain theorems for discrete-time feedback systems and applications. Automatica 40(12), 2129–2136 (2004) 31. Kamalapurkar, R., Walters, P., Rosenfeld, J., Dixon, W.: Reinforcement learning for optimal feedback control: A Lyapunov-based approach. Springer (2018) 32. Kantorovich, L.V., Akilov, G.P.: Functional Analysis in Normed Spaces. Macmillan, New York (1964) 33. 
Kiumarsi, B., Vamvoudakis, K.G., Modares, H., Lewis, F.L.: Optimal and autonomous control using reinforcement learning: a survey. IEEE Trans. Neural Netw. Learn. Syst. 29(6), 2042–2062 (2018)
34. Knopp, K.: Theory and Application of Infinite Series, 2nd edn. Dover Publications, New York (1990)
35. Lai, J., Xiong, J., Shu, Z.: Model-free optimal control of discrete-time systems with additive and multiplicative noises. arXiv preprint arXiv:2008.08734 (2020)
36. Levine, S., Koltun, V.: Continuous inverse optimal control with locally optimal examples. In: International Conference on Machine Learning (ICML) (2012)
37. Levine, S., Kumar, A., Tucker, G., Fu, J.: Offline reinforcement learning: tutorial, review, and perspectives on open problems. arXiv preprint arXiv:2005.01643 (2020)
38. Lillicrap, T.P., Hunt, J.J., Pritzel, A., Heess, N., Erez, T., Tassa, Y., Silver, D., Wierstra, D.: Continuous control with deep reinforcement learning. In: International Conference on Learning Representations (ICLR) (2016)
39. Ljung, L.: System Identification: Theory for the User, 2nd edn. Prentice Hall PTR, Upper Saddle River (1999)
40. Magnus, J.R., Neudecker, H.: Matrix Differential Calculus with Applications in Statistics and Econometrics. Wiley, New York (2007)
41. Monfort, M., Liu, A., Ziebart, B.D.: Intent prediction and trajectory forecasting via predictive inverse linear-quadratic regulation. In: AAAI Conference on Artificial Intelligence (AAAI) (2015)
42. Morozan, T.: Stabilization of some stochastic discrete-time control systems. Stoch. Anal. Appl. 1(1), 89–116 (1983)
43. Pang, B., Bian, T., Jiang, Z.-P.: Robust policy iteration for continuous-time linear quadratic regulation. IEEE Trans. Autom. Control (2020)
44. Pang, B., Jiang, Z.-P.: Robust reinforcement learning: a case study in linear quadratic regulation. In: AAAI Conference on Artificial Intelligence (AAAI) (2020)
45.
Powell, W.B.: From reinforcement learning to optimal control: A unified framework for sequential decisions. arXiv preprint arXiv:1912.03513 (2019)


46. Praly, L., Lin, S.-F., Kumar, P.R.: A robust adaptive minimum variance controller. SIAM J. Control Optim. 27(2), 235–266 (1989) 47. Rami, M.A., Chen, X., Zhou, X.Y.: Discrete-time indefinite LQ control with state and control dependent noises. J. Glob. Optim. 23(3), 245–265 (2002) 48. Åström, K.J., Wittenmark, B.: Adaptive Control, 2nd edn. Addison-Wesley, Reading, Massachusetts (1995) 49. Sontag, E.D.: Input to state stability: Basic concepts and results. Nonlinear and optimal control theory. volume 1932, pp. 163–220. Springer, Berlin (2008) 50. Sutton, R.S., Barto, A.G.: Reinforcement Learning: An Introduction, 2nd edn. MIT Press, Cambridge, Massachusetts (2018) 51. Sutton, R.S., Barto, A.G., Williams, R.J.: Reinforcement learning is direct adaptive optimal control. IEEE Control Syst. Mag. 12(2), 19–22 (1992) 52. Tiedemann, A., De Koning, W.: The equivalent discrete-time optimal control problem for continuous-time systems with stochastic parameters. Int. J. Control 40(3), 449–466 (1984) 53. Todorov, E.: Stochastic optimal control and estimation methods adapted to the noise characteristics of the sensorimotor system. Neural Comput. 17(5), 1084–1108 (2005) 54. Tu, S., Recht, B.: The gap between model-based and model-free methods on the linear quadratic regulator: An asymptotic viewpoint. In Annual Conference on Learning Theory (COLT) (2019) 55. Xing, Y., Gravell, B., He, X., Johansson, K.H., Summers, T.: Linear system identification under multiplicative noise from multiple trajectory data. In American Control Conference (ACC), pp 5157–5261 (2020) 56. Huang, Y., Zhang, W., Zhang, H.: Infinite horizon LQ optimal control for discrete-time stochastic systems. In: The 6th World Congress on Intelligent Control and Automation (WCICA), vol. 1, pp. 252–256 (2006)

Correction to: Contributions to the Problem of High-Gain Observer Design for Hyperbolic Systems Constantinos Kitsos, Gildas Besançon, and Christophe Prieur

Correction to: Chapter 5 in: Z.-P. Jiang et al. (eds.), Trends in Nonlinear and Adaptive Control, Lecture Notes in Control and Information Sciences 488, https://doi.org/10.1007/978-3-030-74628-5_5 The author’s name in Chapter 5 was incorrect on the website and has now been corrected from Gildas Besan to Gildas Besançon.

The updated online version of this chapter can be found at https://doi.org/10.1007/978-3-030-74628-5_5 © The Author(s), under exclusive license to Springer Nature Switzerland AG 2022 Z.-P. Jiang et al. (eds.), Trends in Nonlinear and Adaptive Control, Lecture Notes in Control and Information Sciences 488, https://doi.org/10.1007/978-3-030-74628-5_10


Index

A Active node, 241 Active noise control, 136 Actuator, 67, 136, 189–192, 196, 197, 199, 204, 207, 210, 214 Adaptive backstepping, 190, 202, 217–219, 227, 228, 232, 233, 242 Adaptive control, 137, 138, 141, 149, 155, 156, 164, 165, 169–171, 178–185 Algebraic Riccati equation (ARE), 242, 251 Almost feedback linearization, 1, 7, 15 Asymptotic stability, 15, 20, 28, 31, 39 Asymptotic tracking, 208, 210, 218 Averaged dynamics, 83, 86

B Backstepping, 84, 109, 110, 189, 190, 214, 230, 234, 238, 239 Backstepping transformation, 189, 191, 194, 195, 197, 205, 214 Behavior, 2, 5, 15, 27, 33, 34, 36, 39, 52–55, 58, 64, 66, 83, 84, 86–88, 91, 93, 97, 100, 101, 104–106, 179 Broadband noise, 135, 140, 156, 170, 174

C Causal system, 55 Certainty-equivalence principle, 200, 204, 221 Composite signals, 50 Congelation of variables, 217–220, 223– 227, 230, 232, 237–239, 245

Continuous-time system, 27, 140, 142, 143, 150, 165, 178, 180, 182 Controllability bias, 53, 54, 63, 73 Controllability constant, 53, 59, 63, 64, 73 Controllable, 53, 57, 59, 62, 63, 66, 68, 71, 72, 75, 76, 78, 80, 138 Covariance resetting, 149, 164, 169 Cross section, 55, 57, 61

D Decentralized design, 84, 88, 91, 104 Degenerate system, 159 Delay, 68, 115, 189–198, 204, 207–210, 212, 214, 215 Delay kernel, 191, 210, 214 Differential inclusion, 27–40 Directed gap, 61 Direct sum, 50–54 Discrete delay, 190 Discrete-time system, 27, 58, 140, 142, 143, 156, 178, 179, 181, 184, 185 Distributed delay, 190, 214 Distributed optimization, 83, 89, 90 Distributed state estimation, 83, 94, 99 Disturbance rejection, 137, 228 Domain of a system, 55 Dynamical system, 14, 28, 135, 137, 250, 251, 255, 256 Dynamic extension, 1, 7, 8, 12, 23 Dynamic feedback, 2

© The Editor(s) (if applicable) and The Author(s), under exclusive license to Springer Nature Switzerland AG 2022 Z.-P. Jiang et al. (eds.), Trends in Nonlinear and Adaptive Control, Lecture Notes in Control and Information Sciences 488, https://doi.org/10.1007/978-3-030-74628-5


E Estimator, 95, 137, 147, 154, 162, 163, 168, 176, 177 Exponential stability, 29–31, 33, 120, 125, 126, 235, 255 Extended space, 49, 50

F Feedback linearization, 2, 3, 7 Finely controllable, 53, 59, 63, 66, 68, 71, 73, 75, 77, 78, 80 Finely lower hemicontinuous, 55 Finely uniformly controllable, 53, 54, 59, 63, 64, 73, 74 Fine topology, 48, 51, 53, 56 Finite-dimensional, 109, 110, 113, 126, 141, 190–193, 198, 210, 214 Finite-gain stability, 46, 57, 58, 65 FIR filter, 144, 157, 165 Flhc. see finely lower hemicontinuous Funnel coupling, edge-wise, 101, 102, 104, 105 Funnel coupling, node-wise, 101, 104

G Gap, 61 Gap topology, 60–63, 65–67 Generalized Algebraic Riccati equation (GARE), 253, 264 Generating polynomial, 172 Global stabilization, 193–195

H High-gain observer, 109–111, 115, 116, 120, 122, 123, 126, 131, 132 Homotopy, 43, 45, 46, 62–64, 66, 67, 69 Hybrid system, 27, 28, 39

I Ill-conditioned system, 161 Immersion and invariance, 219, 224 Infinite-dimensional, 109–111, 120, 123, 126, 191, 192, 210, 214 Inner-outer factorization, 161, 167, 182 Input-output system, 20, 54, 136 Input-to-state stability, 20, 24, 249–251, 255 Integral quadratic constraint, 43, 45, 46, 59, 63, 68, 69 Interconnection, 43–46, 64–68, 85, 97, 224, 228, 241

Internal model principle, 95, 137 Inverse dynamics, 217, 233–239, 241, 245 Inverse of a system, 55 Inverse system, 6 Invertibility, 2–6 IO system. see input/output system IQC. see integral quadratic constraint

K Kreisselmeier filter (K-filter), 199, 200, 208, 214, 232, 236, 242, 245

L -flhc. see finely lower hemicontinuous -lhc. see lower hemicontinuous (, μ)-limited system, 62 -stable system, 58 -wlhc. see weakly lower hemicontinuous Least-squares algorithm, 148, 149, 155, 163, 164, 169, 177, 249 Lhc. see lower hemicontinuous Linear Matrix Inequality (LMI), 174 Linear Quadratic Regulator (LQR), 249–255, 257, 262 Linear system, 2, 35, 37–39, 44, 52, 58, 66, 67, 94, 95, 138, 189–193, 198, 208, 210, 214, 215, 250–252 Liénard system, 93, 100, 101 Look-ahead constant, 56, 58, 63 Look-ahead gain, 57–59 Look-ahead map, 56–59, 61–63 Lower hemicontinuous, 55 Lower triangular system, 217, 219, 221 Lyapunov function, 27–29, 33–35, 38, 39, 128, 221–223, 226, 229, 235, 236, 238, 239 Lyapunov functional, 111, 117, 119, 122, 125, 126, 189–191

M Mean-square stabilizable, 253 Mean-square stabilizing, 253–258, 260– 267, 269, 270, 273 Mean-square stable, 252, 253, 273 Minimally stable, 57–59, 61–64, 74, 77 Minimum phase, 6, 12, 20, 22, 170, 178, 186, 199 Model Reference Adaptive Control (MRAC), 170–172, 178, 185

Multi-agent system, 83–87, 91, 96, 97, 99, 101 Multi-Input Multi-Output (MIMO) system, 1, 3, 7, 135, 141, 156, 158, 161, 181 Multiplicative noise, 249–252, 261, 262 Multiplier, 4–6, 16, 43–45

N Noise amplification, 143, 145, 146, 151, 159, 160, 166 Nonlinear control, 27, 262 Nonlinear damping, 223–225, 230 Nonlinear observer, 109 Nonlinear system, 1–3, 6, 16, 24, 39, 44, 69, 109, 120, 217, 221, 227, 242, 245, 262 Norm gain, 59, 63 Normal form, 3, 4, 6–8, 16, 19, 21 Normalized estimation error, 148, 154, 163, 168, 177 Normalizing signal, 148, 155, 163, 169, 177 Normed subsystem, 52

O Observable, 6, 23 Observer, 1, 7, 14, 15, 23, 73, 95, 96, 99, 100, 109–120, 122–127, 130, 137, 189, 191, 193, 194, 196–198, 208, 218 Observer canonical form, 198, 214 ODE–PDE cascade, 192–197, 207, 210, 214 Off-policy, 249, 251, 262 Operator, 46, 49, 50, 55, 58, 59, 63, 66–69, 79, 80, 110, 111, 113–115, 117, 121– 123, 125, 148, 155, 164, 169, 177, 189, 191, 207, 213, 220, 256, 263 Optimal control, 84, 250, 251, 255 Output feedback, 2, 6, 109, 126, 217, 231 Output regulation, 2, 3, 16, 18, 19, 21, 84, 217, 218, 222, 241, 245 Output regulator, 1 Over-parameterization, 135, 138, 163, 170, 185

P Parameter drift, 138, 163 Parametric model, 147, 148, 154, 162, 163, 168, 176, 177 Partial differential equation, 109–111, 113, 115, 120, 122, 125, 190, 192, 199, 204, 208, 210, 213–215

Persistence of Excitation (PE), 138, 143, 163, 218 Plant parameter, 139, 189–192, 198, 204, 207, 213, 214 Plant state, 30, 31, 34, 191–194, 196, 198, 210, 214 Plug-and-play operation, 83, 84, 88, 91 Policy, 250, 251, 254, 255, 257, 262, 265 Policy evaluation, 250, 251, 254, 255, 257, 258, 260, 261 Policy improvement, 250, 254, 265 Policy iteration, 249–252, 254, 255, 257, 260–262 Pre-filtering modification, 160 Predictor, 189–191, 193–197, 214 Projection, 52, 54, 55, 148, 149, 155, 163, 164, 169, 177, 207, 218, 221, 222

Q Quadratically continuous, 68, 69

R Rate of adaptation, 170, 181 Regular system, 67 Regulator equation, 16, 21 Reinforcement Learning (RL), 249–252, 255, 262 Reset control system, 27–30, 33, 34, 36–39 Robust adaptive control, 139, 140, 143, 149, 156, 160, 165, 171, 218 Robust reinforcement learning, 250 Robustness, 2, 7, 60, 62, 100, 135, 138, 140, 143, 145, 146, 150–152, 156, 157, 160, 161, 163, 165, 170, 175, 190, 249–252, 262

S Semiglobal stabilization, 1 Seminorm topology, 47, 48, 56, 58, 60 Sensor, 67, 136, 190, 250 Shaping of the singular values, 160, 165 Signals, 21, 24, 44, 45, 47–50, 52–54, 59–62, 64–68, 71–73, 137, 140–142, 144, 147–149, 151, 154, 155, 158, 162–164, 166, 168, 169, 173, 176–178, 182, 199, 201, 208, 210, 218, 220, 224, 227, 230, 234, 241, 242, 245 Signal space, 43, 46–56, 60, 63, 66, 68–71 Single-Input Single-Output (SISO) system, 2, 7, 18, 135, 141, 143, 170, 171, 185, 198

Small-gain, 43, 44, 82, 217, 219, 245, 262, 276 Small-signal gain, 59, 62 Small-signal subspace, 46, 48–52, 68 Small-signal subsystem, 52, 53 Stability, 19, 27–29, 43–46, 54, 58–62, 64, 65, 67, 69, 77, 83, 86, 87, 93, 111, 116, 117, 119, 122, 125, 126, 135, 137, 138, 141, 143–145, 150–152, 156, 157, 159, 160, 163, 165, 166, 170, 173–175, 183, 185, 189–191, 207, 214, 218, 219, 225, 227, 262 Stabilization, 2, 3, 6, 7, 16, 18, 20, 24, 109, 120, 189, 193, 194, 196, 197, 214, 215, 219 Stable system, 13, 24, 43, 58, 61 State feedback, 2, 7, 8, 219, 245 Stochastic system, 249 Strongly minimum-phase, 1, 6–8, 17, 20 Sylvester-type matrix equation, 146, 152, 160, 167 System, 52 T Temporal family of seminorms, 47, 51 Time axis, 47 Time-varying system, 218–220, 235 Transmission zero, 158, 159, 161

Index U Uniformly controllable, 53, 63, 74 Uniformly finely lower hemicontinuous, 56 Uniformly lower hemicontinuous, 56 Uniformly weakly lower hemicontinuous, 56 Univalent system, 55, 57 Unknown minimum-phase plant, 170 Unknown periodic disturbance, 136, 137, 146, 167, 170, 185 Unmodeled dynamics, 135, 138, 139, 143, 148–150, 152, 154, 156, 160, 162, 165, 167, 168, 170, 171, 174, 175, 178, 180, 182, 185 Update law, 195–197, 212, 219, 221–226, 229, 230, 240

W Weakly lower hemicontinuous, 55 Well-posed, 43, 45, 66, 67, 115 Wlhc. see weakly lower hemicontinuous

Z Zero dynamics, 3, 5, 6, 235, 238, 239