Informatics in Control, Automation and Robotics: 15th International Conference, ICINCO 2018, Porto, Portugal, July 29-31, 2018, Revised Selected Papers [1st ed. 2020] 978-3-030-31992-2, 978-3-030-31993-9

The goal of this book is to familiarize readers with the latest research on, and recent advances in, the field of Informatics in Control, Automation and Robotics.


English | Pages: XVII, 570 [587] | Year: 2020


Table of contents :
Front Matter ....Pages i-xvii
Fast and Simple Interacting Models of Drape Tool and Ply Material for Handling Free Hanging, Pre-impregnated Carbon Fibre Material (Gudmundur G. Gunnarsson, Ole W. Nielsen, Christian Schlette, Henrik G. Petersen)....Pages 1-25
Optimal Time-Sampling in a Statistical Process Control with a Polynomial Expected Loss (Valery Y. Glizer, Vladimir Turetsky)....Pages 26-50
Observability for the Wave Equation with Variable Support in the Dirichlet and Neumann Cases (Antonio Agresti, Daniele Andreucci, Paola Loreti)....Pages 51-75
Nonprehensile Manipulation Control and Task Planning for Deformable Object Manipulation: Results from the RoDyMan Project (Fabio Ruggiero, Jung-Tae Kim, Alejandro Gutierrez-Giles, Aykut C. Satici, Alejandro Donaire, Jonathan Cacace et al.)....Pages 76-100
Towards a Novel Maintenance Support System Based On mini-terms: Mini-term 4.0 (Eduardo García, Nicolás Montes, Mónica Alacreu)....Pages 101-117
Automated Nonlinear Control Structure Design by Domain of Attraction Maximization with Eigenvalue and Frequency Domain Specifications (Elias Reichensdörfer, Dirk Odenthal, Dirk Wollherr)....Pages 118-141
Computational Experience with a Modified Newton Solver for Discrete-Time Algebraic Riccati Equations (Vasile Sima, Peter Benner)....Pages 142-167
Influence of the Predictive Rainfall/Runoff Model Accuracy on an Optimal Water Resource Management Strategy (Baya Hadid, Eric Duviella)....Pages 168-192
Interlocking Problem in Automatic Disassembly Planning and Two Solutions (Feiying Lan, Yongjing Wang, Duc Truong Pham, Jiayi Liu, Jun Huang, Chunqian Ji et al.)....Pages 193-213
Thermal Imaging for Psychophysiological State Detection in the Human-Machine Interface (HMI) Control System (Changjiang He, Mahdi Mahfouf, Luis A. Torres-Salomao)....Pages 214-229
Nonlinear Model Predictive Control Algorithms for Industrial Articulated Robots (Květoslav Belda)....Pages 230-251
Experimental Investigation of a Biologically Inspired Gripper with Active Haptic Control for Geometric Compliancy (Christian Ivan Basson, Glen Bright)....Pages 252-275
Quantifying Robotic Swarm Coverage (Brendon G. Anderson, Eva Loeser, Marissa Gee, Fei Ren, Swagata Biswas, Olga Turanova et al.)....Pages 276-301
Design of a Vibration Driven Motion System Based on a Multistable Tensegrity Structure (Philipp Schorr, Valter Böhm, Lena Zentner, Klaus Zimmermann)....Pages 302-317
An Experimental Study for the Contactless Manipulation of Single Object in Vertical Plane Using Multiple Air Jets (Nobukado Abe, Kazuki Yoshinaga, Satoshi Iwaki, Naoki Tsuchihashi, Tetsushi Ikeda, Toshiharu Kosaku et al.)....Pages 318-327
Combined Feedback-Feed Forward Control Strategy for a Knee Rehabilitation Device with Soft Actuation (Leonardo Solaque, Yair Valbuena, Marianne Romero, Alexandra Velasco)....Pages 328-343
Controlling Off-Road Bi-steerable Mobile Robots: An Adaptive Multi-control Laws Strategy (Roland Lenain, Ange Nizard, Mathieu Deremetz, Benoit Thuilot, Vianney Papot, Christophe Cariou)....Pages 344-363
On Randomized Searching for Multi-robot Coordination (Jakub Hvězda, Miroslav Kulich, Libor Přeučil)....Pages 364-383
Increasing Machining Accuracy of Industrial Manipulators Using Reduced Elastostatic Model (Shamil Mamedov, Dmitry Popov, Stanislav Mikhel, Alexandr Klimchik)....Pages 384-406
Control of Force in Surgical Robots with Random Time Delays Using Model Predictive Control (Jasmeet Singh Ladoiye, Dan S. Necsulescu, Jurek Sasiadek)....Pages 407-428
An IPM Approach to Multi-robot Cooperative Localization: Pepper Humanoid and Wheeled Robots in a Shared Space (M. Hassan Tanveer, Antonio Sgorbissa, Antony Thomas)....Pages 429-447
Study of the Physiological Parameters of a Regulated Oxygen Mask (Geoffray Battiston, Dominique Beauvois, Gilles Duc, Emmanuel Godoy)....Pages 448-463
Linear Discrete-Time Systems - \(H_\infty \) Dynamic Output-Feedback Control with Preview (Eli Gershon)....Pages 464-481
Direct Integrability for State Feedback Optimal Control with Singular Solutions (Paolo Di Giamberardino, Daniela Iacoviello)....Pages 482-502
Petri Nets Tracking Control for Electro-pneumatic Systems Automation (Carlos Renato Vázquez, José Antonio Gómez-Castellanos, Antonio Ramírez-Treviño)....Pages 503-525
18O Isotope Separation Process Control (Vlad Mureşan, Iulia Clitan, Valentin Sita, Mihail Abrudean, Mihaela-Ligia Ungureşan)....Pages 526-551
Approximate Bayesian Prediction Using State Space Model with Uniform Noise (Ladislav Jirsa, Lenka Kuklišová Pavelková, Anthony Quinn)....Pages 552-568
Back Matter ....Pages 569-570


Lecture Notes in Electrical Engineering 613

Oleg Gusikhin Kurosh Madani   Editors

Informatics in Control, Automation and Robotics 15th International Conference, ICINCO 2018, Porto, Portugal, July 29–31, 2018, Revised Selected Papers

Lecture Notes in Electrical Engineering Volume 613

Series Editors Leopoldo Angrisani, Department of Electrical and Information Technologies Engineering, University of Napoli Federico II, Naples, Italy Marco Arteaga, Departament de Control y Robótica, Universidad Nacional Autónoma de México, Coyoacán, Mexico Bijaya Ketan Panigrahi, Electrical Engineering, Indian Institute of Technology Delhi, New Delhi, Delhi, India Samarjit Chakraborty, Fakultät für Elektrotechnik und Informationstechnik, TU München, Munich, Germany Jiming Chen, Zhejiang University, Hangzhou, Zhejiang, China Shanben Chen, Materials Science and Engineering, Shanghai Jiao Tong University, Shanghai, China Tan Kay Chen, Department of Electrical and Computer Engineering, National University of Singapore, Singapore, Singapore Rüdiger Dillmann, Humanoids and Intelligent Systems Lab, Karlsruhe Institute for Technology, Karlsruhe, Baden-Württemberg, Germany Haibin Duan, Beijing University of Aeronautics and Astronautics, Beijing, China Gianluigi Ferrari, Università di Parma, Parma, Italy Manuel Ferre, Centre for Automation and Robotics CAR (UPM-CSIC), Universidad Politécnica de Madrid, Madrid, Spain Sandra Hirche, Department of Electrical Engineering and Information Science, Technische Universität München, Munich, Germany Faryar Jabbari, Department of Mechanical and Aerospace Engineering, University of California, Irvine, CA, USA Limin Jia, State Key Laboratory of Rail Traffic Control and Safety, Beijing Jiaotong University, Beijing, China Janusz Kacprzyk, Systems Research Institute, Polish Academy of Sciences, Warsaw, Poland Alaa Khamis, German University in Egypt El Tagamoa El Khames, New Cairo City, Egypt Torsten Kroeger, Stanford University, Stanford, CA, USA Qilian Liang, Department of Electrical Engineering, University of Texas at Arlington, Arlington, TX, USA Ferran Martin, Departament d’Enginyeria Electrònica, Universitat Autònoma de Barcelona, Bellaterra, Barcelona, Spain Tan Cher Ming, College of Engineering, Nanyang Technological University, Singapore, Singapore Wolfgang Minker, Institute of Information Technology, University of Ulm, Ulm, Germany Pradeep Misra, Department of Electrical Engineering, Wright State University, Dayton, OH, USA Sebastian Möller, Quality and Usability Lab, TU Berlin, Berlin, Germany Subhas Mukhopadhyay, School of Engineering & Advanced Technology, Massey University, Palmerston North, Manawatu-Wanganui, New Zealand Cun-Zheng Ning, Electrical Engineering, Arizona State University, Tempe, AZ, USA Toyoaki Nishida, Graduate School of Informatics, Kyoto University, Kyoto, Japan Federica Pascucci, Dipartimento di Ingegneria, Università degli Studi “Roma Tre”, Rome, Italy Yong Qin, State Key Laboratory of Rail Traffic Control and Safety, Beijing Jiaotong University, Beijing, China Gan Woon Seng, School of Electrical & Electronic Engineering, Nanyang Technological University, Singapore, Singapore Joachim Speidel, Institute of Telecommunications, Universität Stuttgart, Stuttgart, Baden-Württemberg, Germany Germano Veiga, Campus da FEUP, INESC Porto, Porto, Portugal Haitao Wu, Academy of Opto-electronics, Chinese Academy of Sciences, Beijing, China Junjie James Zhang, Charlotte, NC, USA

The book series Lecture Notes in Electrical Engineering (LNEE) publishes the latest developments in Electrical Engineering - quickly, informally and in high quality. While original research reported in proceedings and monographs has traditionally formed the core of LNEE, we also encourage authors to submit books devoted to supporting student education and professional training in the various fields and applications areas of electrical engineering. The series covers classical and emerging topics concerning:

• Communication Engineering, Information Theory and Networks
• Electronics Engineering and Microelectronics
• Signal, Image and Speech Processing
• Wireless and Mobile Communication
• Circuits and Systems
• Energy Systems, Power Electronics and Electrical Machines
• Electro-optical Engineering
• Instrumentation Engineering
• Avionics Engineering
• Control Systems
• Internet-of-Things and Cybersecurity
• Biomedical Devices, MEMS and NEMS

For general information about this book series, comments or suggestions, please contact leontina. [email protected]. To submit a proposal or request further information, please contact the Publishing Editor in your country: China Jasmine Dou, Associate Editor ([email protected]) India Aninda Bose, Senior Editor ([email protected]) Japan Takeyuki Yonezawa, Editorial Director ([email protected]) South Korea Smith (Ahram) Chae, Editor ([email protected]) Southeast Asia Ramesh Nath Premnath, Editor ([email protected]) USA, Canada: Michael Luby, Senior Editor ([email protected]) All other Countries: Leontina Di Cecco, Senior Editor ([email protected]) Christoph Baumann, Executive Editor ([email protected]) ** Indexing: The books of this series are submitted to ISI Proceedings, EI-Compendex, SCOPUS, MetaPress, Web of Science and Springerlink **

More information about this series at http://www.springer.com/series/7818

Oleg Gusikhin • Kurosh Madani

Editors

Informatics in Control, Automation and Robotics 15th International Conference, ICINCO 2018, Porto, Portugal, July 29–31, 2018, Revised Selected Papers


Editors: Oleg Gusikhin, Ford Research and Advanced Engineering, Dearborn, MI, USA

Kurosh Madani University of Paris-EST Créteil (UPEC) Créteil, France

ISSN 1876-1100 ISSN 1876-1119 (electronic) Lecture Notes in Electrical Engineering ISBN 978-3-030-31992-2 ISBN 978-3-030-31993-9 (eBook) https://doi.org/10.1007/978-3-030-31993-9 © Springer Nature Switzerland AG 2020 This work is subject to copyright. All rights are reserved by the Publisher, whether the whole or part of the material is concerned, specifically the rights of translation, reprinting, reuse of illustrations, recitation, broadcasting, reproduction on microfilms or in any other physical way, and transmission or information storage and retrieval, electronic adaptation, computer software, or by similar or dissimilar methodology now known or hereafter developed. The use of general descriptive names, registered names, trademarks, service marks, etc. in this publication does not imply, even in the absence of a specific statement, that such names are exempt from the relevant protective laws and regulations and therefore free for general use. The publisher, the authors and the editors are safe to assume that the advice and information in this book are believed to be true and accurate at the date of publication. Neither the publisher nor the authors or the editors give a warranty, expressed or implied, with respect to the material contained herein or for any errors or omissions that may have been made. The publisher remains neutral with regard to jurisdictional claims in published maps and institutional affiliations. This Springer imprint is published by the registered company Springer Nature Switzerland AG The registered company address is: Gewerbestrasse 11, 6330 Cham, Switzerland

Preface

The present book includes extended and revised versions of a set of selected papers from the 15th International Conference on Informatics in Control, Automation and Robotics (ICINCO 2018), held in Porto, Portugal, from July 29 to 31, 2018. The purpose of this 15th edition of the annual conference was to bring together researchers, engineers, and practitioners interested in the application of informatics to control, automation, and robotics. Four simultaneous tracks were held, covering intelligent control systems, optimization, robotics, automation, signal processing, sensors, systems modeling and control, and industrial informatics. Informatics applications are pervasive in many areas of control, automation, and robotics.

ICINCO 2018 received 207 paper submissions from 39 countries, of which 12% were included in this book. The papers were selected by the event chairs, and their selection was based on a number of criteria that include the classifications and comments provided by the Program Committee members, the Session Chairs' assessment, and the Program Chairs' global view of all papers included in the technical program. The authors of the selected papers were then invited to submit a revised and extended version of their papers containing at least 30% new material. The selection also aimed at a balanced coverage of the four aforementioned conference tracks.

The book comprises 27 papers selected to highlight recent advances in the theory and practice of informatics in control, robotics, and automation, including intelligent control systems and optimization, robotics, automation, signal processing, sensors, and systems modeling and control. Twelve papers describe novel control methods and models for dynamic systems, four papers discuss robot trajectory tracking and multi-robot coordination, two papers address human–machine interface design, five papers present novel approaches to industrial applications, and four papers present applications in several diverse domains such as medical devices, chemical processes, and water management.


We would like to thank all the authors for their contributions and to express our gratitude to the reviewers who have helped to ensure the quality of this post-conference publication.

July 2018

Oleg Gusikhin Kurosh Madani

Organization

Conference Chair

Kurosh Madani, University of Paris-EST Créteil (UPEC), France

Program Chair

Oleg Gusikhin, Ford Motor Company, USA

Program Committee Hussein Abdullah Ghassan Abu-Lebdeh El-Houssaine Aghezzaf Eugenio Aguirre Erdinç Altug Mihail Antchev Rui Araujo Rafael Aroca T. Asokan M. Amine Atoui Alfonso Baños Guilherme Barreto Roman Barták Marcelo Becker Carsten Behn Faïz Ben Amar Karsten Berns

University of Guelph, Canada American University of Sharjah, UAE Ghent University, Faculty of Engineering and Architecture, Belgium University of Granada, Spain Istanbul Technical University, Turkey Technical University - Sofia, Bulgaria University of Coimbra, Portugal Universidade Federal de São Carlos, Brazil Indian Institute of Technology Madras, India UBS, France Universidad de Murcia, Spain Universidade Federal do Ceará, Brazil Charles University, Czech Republic Escola de Engenharia de São Carlos, Brazil Schmalkalden University of Applied Sciences, Germany Université Pierre et Marie Curie, Institut Systèmes Intelligents et de Robotique, France University of Kaiserslautern, Germany


Reinaldo Bianchi Mauro Birattari Jean-Louis Boimond Magnus Boman Richard Braatz Kalinka Branco Glen Bright Sebastien Briot Marvin Bugeja Kevin Burn Fernando Caballero Kenneth Camilleri Angelo Cangelosi Giuseppe Carbone Alessandro Casavola Marco Castellani Jaime Cerda Kuo-Ming Chang Albert Cheng Sung-Bae Cho Paul Christodoulides Feng Chu Daisuke Chugo Carlos Coello Yechiel Crispin Erik Cuevas José Cunha Joshua Dayan Angel Del Pobil Kyriakos Deliparaschos Mingcong Deng Paolo Di Giamberardino Wenjie Dong António Dourado Vasile Dragan Venky Dubey Ioan Dumitrache Richard J. Duro


Centro Universitario da FEI, Brazil Université Libre de Bruxelles, Belgium ISTIA - LARIS, France The Royal Institute of Technology, Sweden Massachusetts Institute of Technology, USA ICMC – USP, Brazil University of KwaZulu-Natal, South Africa Laboratoire des Sciences du Numérique de Nantes, France University of Malta, Malta University of Sunderland, UK University of Seville, Spain University of Malta, Malta University of Plymouth, UK Università della Calabria, Italy University of Calabria, Italy University of Birmingham, UK Universidad Michoacana de San Nicolas de Hidalgo, Mexico National Kaohsiung University of Applied Sciences, Taiwan University of Houston, USA Yonsei University, Korea, Republic of Cyprus University of Technology, Cyprus University of Evry Val d’Essonne, France Kwansei Gakuin University, Japan CINVESTAV-IPN, Mexico Embry-riddle Aeronautical University, USA Universidad de Guadalajara, Mexico University of Trás-os-montes and Alto Douro, Portugal Technion – Israel Institute of Technology, Israel Universitat Jaume I, Spain Cyprus University of Technology, Cyprus Tokyo University of Agriculture and Technology, Japan Sapienza University of Rome, Italy University of Texas Rio Grande Valley, USA University of Coimbra, Portugal Romanian Academy, Romania Bournemouth University, UK University “POLITEHNICA” of Bucharest, Romania Universidade Da Coruña, Spain


Marc Ebner Petr Ekel Ashraf Elnagar Simon Fabri Mohammed Fadali Christophe Farges Luca Ferrarini Paolo Fiorini Juan Flores José Fonseca Mauro Franceschelli Georg Frey Toyomi Fujita Takeshi Furuhashi Mauro Gaggero Iván García Daza Paulo Gil Giuseppina Gini Maria Gini Arthur Gómez Lucian Grigorie Lech Grzesiak Oleg Gusikhin Jan Haase Wolfgang Halang Kensuke Harada Jennifer Harding Dominik Henrich Arturo Hernández-Aguirre Yasuhisa Hirata Wladyslaw Homenda Jonathan How Yisheng Huang Daniela Iacoviello Gianluca Ippoliti Ivan Ivanov Sarangapani Jagannathan


Ernst-Moritz-Arndt-Universität Greifswald, Germany Pontifical Catholic University of Minas Gerais, Brazil University of Sharjah, UAE University of Malta, Malta UNR, USA Bordeaux University, IMS – UMR 5218 CNRS, France Politecnico di Milano, Italy University of Verona, Italy University of Michoacan, Mexico DETI/IT, Universidade de Aveiro & Micro I/O, Portugal Università di Cagliari, Italy Saarland University, Germany Tohoku Institute of Technology, Japan Nagoya University, Japan National Research Council of Italy, Italy University of Alcalá, Spain Universidade Nova de Lisboa, Portugal Politecnico di Milano, Italy University of Minnesota, USA Universidade do Vale do Rio dos Sinos, Brazil Military Technical Academy “Ferdinand I”, Bucharest, Romania Warsaw University of Technology, Poland Ford Motor Company, USA Helmut Schmidt University of the Federal Armed Forces Hamburg, Germany Qingdao University of Science and Technology, China Osaka University, Japan Loughborough University, UK University of Bayreuth, Germany Centre for Research in Mathematics, Mexico Tohoku University, Japan Warsaw University of Technology, Poland Massachusetts Institute of Technology, USA Nat. Ilan Univ., Taiwan Sapienza University of Rome, Italy Università Politecnica delle Marche, Italy Sofia University “St. Kl. Ohridski”, Bulgaria Missouri University of Science and Technology, USA


N. Jawahar Thira Jearsiripongkul Michael Jenkin Wootae Jeong Isabel Jesus Márcio José da Cunha Marc Jungers Fabrício Junqueira Mansour Karkoub Tohru Kawabe Rafael Kelly DaeEun Kim Krzysztof Kozlowski Ondrej Krejcar Masao Kubo Sébastien Lahaye Jose Lastra Kauko Leiviskä Roland Lenain Fabrizio Leonardi Guoshi Li Tsai-Yen Li Christos Liambas Gordon Lightbody Guoping Liu Antonio Lopes Sérgio Lopes Gonzalo Lopez-Nicolas Yu-Shen Lu Martin Lukac Danilo Maccio José Machado Anthony Maciejewski Kurosh Madani Ana Madureira Frederic Maire Om Malik Stamatis Manesis Marga Marcos Philippe Martinet Tariq Masood


Thiagarajar College of Engineering, India Thammasat University, Thailand York University, Canada Korea Railroad Research Institute, Korea, Republic of ISEP/IPP - School of Engineering, Polytechnic Institute of Porto, Portugal Federal University of Uberlândia, Brazil CNRS - Université de Lorraine, France University of São Paulo (USP), Brazil Texas A&M University, Qatar University of Tsukuba, Japan CICESE Research Center, Mexico Yonsei University, Korea, Republic of Poznan University of Technology, Poland University of Hradec Kralove, Czech Republic Narional Defense Academy of Japan, Japan ISTIA - LARIS, France Tampere University of Technology, FAST Laboratory, Finland University of Oulu, Finland Irstea, France FEI University, Brazil University of North Carolina at Chapel Hill, USA National Chengchi University, Taiwan Aristotle University of Thessaloniki, Greece University College Cork, Ireland University of South Wales, UK University of Porto, Portugal University of Minho, Portugal Universidad de Zaragoza, Spain National Taiwan Normal University, Taiwan Nazarbayev University, Kazakhstan National Research Council of Italy, Italy Institute of Engineering, Polytechnic of Porto, Portugal Colorado State University, USA University of Paris-EST Créteil (UPEC), France ISEP/IPP, Portugal Queensland University of Technology, Australia University of Calgary, Canada University of Patras, Greece Universidad del Pais Vasco, Spain Inria, France University of Cambridge, UK


Alicja Mazur Ross McAree Joaquim Melendez Rafael Mendes Luis Merino Nadhir Messai Konstantinos Michail Sanya Mitaim Francesco Morabito Vladimir Mostyn Vassilis Moulianitis George Moustris Riccardo Muradore Saeid Nahavandi Andreas Nearchou Juan J. Nieto Jun Okamoto Fernando Osorio Ju H. Park Igor Paromtchik Ignacio Parra-Alonso Pierre Payeur Dariusz Pazderski Qingjin Peng D. Pham Nirvana Popescu Andrey Postnikov Cesar Prada Moraga Radu-Emil Precup José Ragot Akshay Rajhans Oscar Reinoso Mihailo Ristic Roseli Romero Juha Röning Jocelyn Sabatier Christophe Sabourin Mehmet Sahinkaya Priti Sajja Antonio Sala Roberto Sanchis Llopis Nilanjan Sarkar


Wroclaw University of Technology, Poland University of Queensland, Australia University of Girona, Spain University of Campinas, Brazil Pablo de Olavide University, Spain CReSTIC, France Cyprus University of Technology, Cyprus Thammasat University, Thailand University Mediterranea of Reggio Calabria, Italy VSB - Technical University of Ostrava, Czech Republic University of the Aegean, Greece National Technical University of Athens, Greece University of Verona, Italy Deakin University, Australia University of Patras, Greece University of Santiago de Compostela, Spain Escola Politécnica da USP, Brazil USP - Universidade de Sao Paulo, Brazil Yeungnam University, Korea, Republic of Inria, France University of Alcalá, Spain University of Ottawa, Canada Poznan University of Technology, Poland University of Manitoba, Canada The University of Birmingham, UK University Politehnica of Bucharest, Romania The University of Lincoln, UK Universidad de Valladolid, Spain Politehnica University of Timisoara, Romania Centre de Recherche en Automatique de Nancy, France MathWorks, USA Miguel Hernandez University, Spain Imperial College London, UK University of São Paulo, Brazil University of Oulu, Finland IMS Laboratory - Bordeaux University, France IUT Sénart, University Paris-Est Creteil (UPEC), France Kingston University London, UK Sardar Patel University, India Universidad Politecnica de Valencia, Spain Universitat Jaume I, Spain Vanderbilt University, USA


Jurek Sasiadek Daniel Schmidt Gerhard Schweitzer Gerald Seet João Sequeira Roman Sergienko Madhavan Shanmugavel Jinhua She Manuel Silva Vasile Sima Daniel Simon Andrzej Sluzek Jorge Solis Stefano Squartini Burkhard Stadlmann J. Stigter Olaf Stursberg Les Sztandera Tomasz Tarczewski Daniel Thalmann Simon Thompson Gui Tian Geetam Tomar Germano Torres Mohamed Trabia Wa-Muzemba Tshibangu Antonios Tsourdos Ali Turgut Bartlomiej Ufnalski Angel Valera Alexandra Velasco Ramiro Velazquez Gianni Vercelli Hugo Vieira Neto Arnoud Visser Damir Vrancic Bernardo Wagner Feng Wan


Carleton University, Canada SEW EURODRIVE GmbH, Germany ETH Zurich, Switzerland Nanyang Technological University, Singapore Instituto Superior Técnico, Portugal Gini GmbH, Germany Monash University, Malaysia Tokyo University of Technology, Japan ISEP-Instituto Superior de Engenharia do Porto, Portugal National Institute for Research & Development in Informatics, Romania Inria, France Khalifa University, UAE Karlstad University/Waseda University, Sweden Università Politecnica delle Marche, Italy Upper Austria University of Applied Sciences, Wels, Austria Wageningen University, Netherlands University of Kassel, Germany Jefferson (Philadelphia University + Thomas Jefferson University), USA Nicolaus Copernicus University, Poland Ecole Polytechnique Federale de Lausanne, Switzerland AIST, Japan Newcastle University, UK Machine Intelligence Research Labs, India PS Solutions, Brazil University of Nevada, Las Vegas, USA Morgan State University, USA Cranfield University, UK Middle East Technical University, Turkey Warsaw University of Technology, Poland Universidad Politécnica de Valencia, Spain Universidad Militar Nueva Granada, Colombia Universidad Panamericana, Mexico Università degli Studi di Genova, Italy Federal University of Technology - Paraná, Brazil Universiteit van Amsterdam, Netherlands Jožef Stefan Institute, Slovenia Leibniz Universität Hannover, Germany University of Macau, Macau


Jingyu Wang Jinling Wang Long Wang James Whidborne Warren White Min Wu Qinmin Yang Jie Zhang Yuming Zhang Primo Zingaretti Argyrios Zolotas


Northwestern Polytechnical University (NPU), China University of New South Wales (UNSW) Australia, Australia Peking University, China Cranfield University, UK Kansas State University, USA China University of Geosciences, China Zhejiang University, China Newcastle University, UK Univ. of Kentucky, USA Università Politecnica delle Marche, Italy University of Lincoln, UK

Additional Reviewers Nicolas Delanoue Haoliang Li Jinglan Li

Yiping Liu Aitziber Mancisidor Jon Martin Federico Perez-Gonzalez Dominik Riedelbauch Dorian Rohner Teodor Tiplica Tobias Werner Biyun Xie

LARIS - University of Angers, France Zhejiang University, China Group of Networked Sensing & Control (NeSC) State Key Lab. of Industrial Control Technology, China MathWorks, USA University of the Basque Country (UPV/EHU), Spain University of the Basque Country, Spain University of the Basque Country (UPV/EHU), Spain Universität Bayreuth, Germany University of Bayreuth, Germany LARIS/ISTIA, France Universität Bayreuth, Germany Colorado State University, USA

Invited Speakers Keith Clark Christian Schlette Bruno Siciliano

Imperial College London, UK University of Southern Denmark, Denmark University of Naples Federico II, Italy

Contents

Fast and Simple Interacting Models of Drape Tool and Ply Material for Handling Free Hanging, Pre-impregnated Carbon Fibre Material . . . Gudmundur G. Gunnarsson, Ole W. Nielsen, Christian Schlette, and Henrik G. Petersen

1

Optimal Time-Sampling in a Statistical Process Control with a Polynomial Expected Loss . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . Valery Y. Glizer and Vladimir Turetsky

26

Observability for the Wave Equation with Variable Support in the Dirichlet and Neumann Cases . . . . . . . . . . . . . . . . . . . . . . . . . . . Antonio Agresti, Daniele Andreucci, and Paola Loreti

51

Nonprehensile Manipulation Control and Task Planning for Deformable Object Manipulation: Results from the RoDyMan Project . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . Fabio Ruggiero, Jung-Tae Kim, Alejandro Gutierrez-Giles, Aykut C. Satici, Alejandro Donaire, Jonathan Cacace, Luca Rosario Buonocore, Giuseppe Andrea Fontanelli, Vincenzo Lippiello, and Bruno Siciliano

76

Towards a Novel Maintenance Support System Based On mini-terms: Mini-term 4.0 . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 101 Eduardo García, Nicolás Montes, and Mónica Alacreu Automated Nonlinear Control Structure Design by Domain of Attraction Maximization with Eigenvalue and Frequency Domain Specifications . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 118 Elias Reichensdörfer, Dirk Odenthal, and Dirk Wollherr Computational Experience with a Modified Newton Solver for Discrete-Time Algebraic Riccati Equations . . . . . . . . . . . . . . . . . . . . 142 Vasile Sima and Peter Benner


Influence of the Predictive Rainfall/Runoff Model Accuracy on an Optimal Water Resource Management Strategy . . . . . . . . . . . . . 168 Baya Hadid and Eric Duviella Interlocking Problem in Automatic Disassembly Planning and Two Solutions . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 193 Feiying Lan, Yongjing Wang, Duc Truong Pham, Jiayi Liu, Jun Huang, Chunqian Ji, Shizhong Su, Wenjun Xu, Quan Liu, and Zude Zhou Thermal Imaging for Psychophysiological State Detection in the Human-Machine Interface (HMI) Control System . . . . . . . . . . . . 214 Changjiang He, Mahdi Mahfouf, and Luis A. Torres-Salomao Nonlinear Model Predictive Control Algorithms for Industrial Articulated Robots . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 230 Květoslav Belda Experimental Investigation of a Biologically Inspired Gripper with Active Haptic Control for Geometric Compliancy . . . . . . . . . . . . . 252 Christian Ivan Basson and Glen Bright Quantifying Robotic Swarm Coverage . . . . . . . . . . . . . . . . . . . . . . . . . . 276 Brendon G. Anderson, Eva Loeser, Marissa Gee, Fei Ren, Swagata Biswas, Olga Turanova, Matt Haberland, and Andrea L. Bertozzi Design of a Vibration Driven Motion System Based on a Multistable Tensegrity Structure . . . . . . . . . . . . . . . . . . . . . . . . . . 302 Philipp Schorr, Valter Böhm, Lena Zentner, and Klaus Zimmermann An Experimental Study for the Contactless Manipulation of Single Object in Vertical Plane Using Multiple Air Jets . . . . . . . . . . . . . . . . . . 318 Nobukado Abe, Kazuki Yoshinaga, Satoshi Iwaki, Naoki Tsuchihashi, Tetsushi Ikeda, Toshiharu Kosaku, and Takeshi Takaki Combined Feedback-Feed Forward Control Strategy for a Knee Rehabilitation Device with Soft Actuation . . . . . . . . . . . . . . . . . . . . . . . 328 Leonardo Solaque, Yair Valbuena, Marianne Romero, and Alexandra Velasco Controlling Off-Road Bi-steerable Mobile Robots: An Adaptive Multi-control Laws Strategy . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 344 Roland Lenain, Ange Nizard, Mathieu Deremetz, Benoit Thuilot, Vianney Papot, and Christophe Cariou On Randomized Searching for Multi-robot Coordination . . . . . . . . . . . 364 Jakub Hvězda, Miroslav Kulich, and Libor Přeučil


Increasing Machining Accuracy of Industrial Manipulators Using Reduced Elastostatic Model . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 384 Shamil Mamedov, Dmitry Popov, Stanislav Mikhel, and Alexandr Klimchik Control of Force in Surgical Robots with Random Time Delays Using Model Predictive Control . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 407 Jasmeet Singh Ladoiye, Dan S. Necsulescu, and Jurek Sasiadek An IPM Approach to Multi-robot Cooperative Localization: Pepper Humanoid and Wheeled Robots in a Shared Space . . . . . . . . . . 429 M. Hassan Tanveer, Antonio Sgorbissa, and Antony Thomas Study of the Physiological Parameters of a Regulated Oxygen Mask . . . 448 Geoffray Battiston, Dominique Beauvois, Gilles Duc, and Emmanuel Godoy Linear Discrete-Time Systems - H1 Dynamic Output-Feedback Control with Preview . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 464 Eli Gershon Direct Integrability for State Feedback Optimal Control with Singular Solutions . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 482 Paolo Di Giamberardino and Daniela Iacoviello Petri Nets Tracking Control for Electro-pneumatic Systems Automation . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 503 Carlos Renato Vázquez, José Antonio Gómez-Castellanos, and Antonio Ramírez-Treviño 18

O Isotope Separation Process Control . . . . . . . . . . . . . . . . . . . . . . . . . 526 Vlad Mureşan, Iulia Clitan, Valentin Sita, Mihail Abrudean, and Mihaela-Ligia Ungureşan Approximate Bayesian Prediction Using State Space Model with Uniform Noise . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 552 Ladislav Jirsa, Lenka Kuklišová Pavelková, and Anthony Quinn

Author Index . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 569

Fast and Simple Interacting Models of Drape Tool and Ply Material for Handling Free Hanging, Pre-impregnated Carbon Fibre Material Gudmundur G. Gunnarsson, Ole W. Nielsen, Christian Schlette(B) , and Henrik G. Petersen(B) The Maersk Mc-Kinney Moller Institute, University of Southern Denmark, Campusvej 55, 5230 Odense M, Denmark {gunu,own,chsch,hgp}@mmmi.sdu.dk

Abstract. Robotic layup of pre-impregnated (prepreg) fiber plies onto complex molds is a hitherto unsolved problem. At the University of Southern Denmark (SDU), we have gathered an international consortium with industrial keyplayers for addressing this gap in the FlexDraper project. We use an underactuated array of suction cups to handle the plies and our approach is based on model- and simulation-based control of the shaping of the tool and the free hanging plies during the draping procedure. This requires fast computable models in the draping control loop. In this paper, we present a tool model that predicts the 6D configuration of the suction cups based on the actuator positions and subsequently a ply model that predicts the shape of the free hanging prepreg fiber plies with perpendicular fiber directions, where the ply is held (clamped) at a set of locations that are not too far apart. Rather than using computationally expensive physical techniques, e.g. FEM simulations, our ply model is only based on interpolations between the held locations satisfying curve length constraints in fiber directions. The paper also contains experimental tests for the accuracy of the overall model. Keywords: Robotic layup of fiber plies · Modeling underactuated tools · Modeling shapes of deformable objects

1 Introduction

With decreasing material costs for prepreg fiber plies, the cost efficiency for the process of accurate layup of the plies onto potentially double-curved structures for composite components is becoming more and more important. The potential of applying composite components is huge within e.g. aerospace, automotive and the wind turbine industries. As the potential with composites is huge and the demand for better cost efficiency is the most important challenge, there have been many attempts to automate the layup process of prepreg plies. Despite


the many attempts, no solutions are yet available and hence the process is still performed manually. The difficulty in understanding and modeling the draping process and deriving suitable tools for the process are the main reasons for the lack. To make such a system feasible, it has to be able to handle a large variety of plies as well as parts. Theoretically, with a proper draping tool, a trained operator will be able to teach/program a robot to layup the plies for a single part through experimentation. However, as we require a large variety of parts as well as plies, and that the quantity of each part is low, the time required for an operator working on the robot will lead to an infeasible solution with respect to ROI. Hence, we need to be able to deploy model-based automatic control of the draping process.

Fig. 1. The FlexDraper robot.

In the project FlexDraper [5], we have gathered an international consortium for developing a new drape tool mounted on a conventional 6-axis Kuka robot. The drape tool (see Fig. 1) consists of an array of suction cups where the up-down movement of each suction cup is individually controllable. At the fixed platform, a universal joint is placed, and there is a spherical joint at the cup (see Fig. 2(a)). The nearest neighbour cups are connected by rigid interlinks with spherical joints at the ends (see Fig. 2(b)). This allows us to control the shape of fiber plies so that we can shape them to the form of the mould.


(a) Subset of tool showing the relative placement of the universal joints, actuators and suction cups.

(b) Close-up of suction cups with ball joints for connecting with the actuator and interlink structure.

Fig. 2. Actuation and suction cup design.


The unactuated universal joint and spherical joint allow the suction cups to be compliant to the material forces and to the mould during laydown. In addition, having just one actuator per suction cups makes the tool significantly simpler to maintain. However, it is a challenge with the tool that we need to be able to predict the unactuated degrees of freedom in the tool. With an array of M × N suction cups, there are 6M N degrees of freedom, M N actuators and M (N − 1) + (M − 1)N interlink constraints leaving us with 6M N − M N − M (N − 1) − (M − 1)N = 3M N + M + N unactuated degrees of freedom in the tool. However, experimental studies have shown that the system is determinable, and hence a model is feasible to derive. Another main issue with developing model-based control is that the shape of the materials between the suction cups needs to be accurately estimated to enable a draping “from one side” so that undesired air bubbles are avoided. Again, experiments have shown that the material behave in a repeatable way, so also for this a model is feasible to establish. Our approach to model-based control is thus deploying an accurate and fast computable model for (i) predicting and simulating the drape tool when moving the actuators and (ii) simulating the material during the draping process. For the ply simulations, we will in this paper focus on a fast computable model for predicting the shape of the free hanging parts of the material (i.e. between the suction cups and before touching the mold). Both models must be fast computable because the simulations are used to annotate the outcome of a studied draping strategy. These annotations are used in a simulation-based learning procedure. Hence, many simulations must be carried out for fine-tuning the parameters of each draping process. It is the main purpose of this paper to present our solution to this and some initial experimental verifications of the accuracy of the models. The paper is organized as follows: In Sect. 2, we outline the properties of the FlexDraper tool for draping and discuss the properties of the fiber ply material and in Sect. 3, we discuss the organization of the draping process. These discussions include the desired properties of our model. Related modelling work is discussed in Sect. 4. In Sects. 5 and 6, we present the tool model and ply model respectively, which constitute the core contributions of this paper. In Sect. 7, we discuss preliminary validation results and Sect. 8 concludes the paper.
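The degree-of-freedom bookkeeping above can be reproduced with a few lines of Python; this is a minimal sketch, in which the 6 × 8 grid size is our own assumption consistent with the 48-cup tool described below:

```python
# Sketch: verify the unactuated degree-of-freedom count for an M x N cup array.
M, N = 6, 8                                  # assumed grid dimensions (48 cups)
total_dof  = 6 * M * N                       # 6 DOF per suction cup
actuators  = M * N                           # one linear actuator per cup
interlinks = M * (N - 1) + (M - 1) * N       # nearest-neighbour rigid links
unactuated = total_dof - actuators - interlinks
assert unactuated == 3 * M * N + M + N       # closed-form count from the text
print(unactuated)                            # -> 158 for the 6 x 8 array
```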

2 The FlexDraper Tool and Material

In the FlexDraper project, a drape tool to handle the material has been created. The tool consists of 48 suction cups arranged in a grid-like structure, where all the suction cups are connected by interlinks to their nearest neighbours (see Fig. 2(b)). The suction cups are spaced in a loose grid roughly 11 cm apart in both the x and y directions. This gives rise to many small square segments or elements of free-hanging material with sizes around 7 × 7 cm² between four suction cups, and strips of free-hanging material of sizes around 4 × 7 cm² between two nearest-neighbour cups (see Fig. 4). It should be mentioned that although the tool is


constrained by the interlink structure, it has many passive degrees of freedom that allows the suction cups to align with the mold. The material is as mentioned a bi-directional carbon fiber weave, preimpregnated with epoxy resin. It is virtually in-extensible in the perpendicular fiber directions but has a small slack in the weave. Furthermore, the preimpregnation of the material means it is “sticky/tacky” in nature and relatively stiff with respect to bending curvature and furthermore dynamic responses (vibrations) are very small.

3 The Process and Model Requirements

The process flow is shown in Fig. 3. The plies are picked with suction cups from a flat surface. We assume, for simplicity of the model description, that the suction cup array aligns with the fibre directions. From the tool state sensor, we have accurate predictions of the positions of all suction cups at pick-up. After pick-up, the suction cups are moved to shape the ply into what we call a preshape. With a preshaped model, each suction cup has a position and orientation which we can again accurately predict using the tool state sensor. The plies are draped onto the mold by choosing a suitable starting point (the choice depends on the shape of the mold). Then the first suction cup with the material attached is placed on the mold. Subsequently, the other suction cups are placed in a wave-like pattern running from the starting point to the boundary. To avoid undesired air bubbles, it is very important that the material also attaches to the mold sequentially from the starting point and outwards. To ensure this, we have at any point in time the following available information:

– The three-dimensional shape of the mold surface
– The configuration of the robot
– The configuration of all the linear actuators controlling the up-down movements of the suction cups

We then have a dynamics-based forward kinematics model of the drape tool that accurately predicts the position and orientation of all the suction cups after the relatively small movements from the sensor-measured preshape. To finalize the simulation model used for learning, we then just need to be able to predict the shape of the free-hanging ply material between the suction cups. Before going into the detailed description of the shape-predicting model, we will discuss the learning procedure because it impacts the desired properties of the model. In the most general sense, a draping procedure is a coordinated movement of the robot and the linear actuators, i.e. a path in 48 + 6 = 54 dimensional space. To learn such a general trajectory in 54-dimensional space is computationally infeasible. Hence, we divide the learning into a sequence part and a trajectory part. The sequence part learning selects the starting point and the wave pattern for placing the suction cups. The trajectory part learning derives - for each step in the sequence - the trajectories of the relevant suction cups, which are those to be placed next and their neighbours. For the sequence part, there are in practice


Fig. 3. The process: components in the FlexDraper system (Blue = flow of actions, green = sensor systems, red = data and computational models).

typically only around 5–10 relevant options, so the main task is the derivation of the trajectory part. With the learning procedure in mind, we have the following desired properties of the models:

– They must be fast computable to allow many simulations during learning
– They should be local, so that a given region relevant for a step in the sequence can be simulated independently of the rest of the ply
– The model must be accurate enough with respect to reality to allow for simulation-based learning


The requirements for computational speed call for sensible assumptions that simplify the modeling. For the tool modeling, we assume that draping is rather vertical, so that gravitational forces on the universal joints are negligible. Moreover, we assume that drag forces from the wiring can be neglected (minimizing these drag forces was considered in the design). Hence, the drape tool is subject to the forces from the material, the constraint forces from the interlinks and the friction forces in the passive joints. Concerning the material model, we make the following assumptions:

– Since the material is relatively stiff with respect to bending and because the distance between neighboring suction cups is small, we can neglect gravity. For the same reasons, we can assume that the material is continuously differentiable everywhere
– Since the material is heavily damped, we can use an equilibrium model
– We assume that the material is unstretchable in the fiber directions

4 Related Work

Robotic handling of composite materials is a growing field, and some of the proposed systems will be discussed here. In [6] three suction gripper grids are proposed to handle dry-fibers for large structures. The common idea for the gripper grids is that the composite material is picked up and then the grid bends into shape to match the form the material is to be placed on. In [4] two collaborative robots with bendable suction grids are used to place dry-fiber plies on a curved mould. In [7] a gripper grid has been designed to handle prepreg fiber plies. The solution consists of a gripper grid to layup the material and a separate tool to drape the material onto the mould. This gripper grid bends along one axis and is not suitable for doubly-curved moulds. The grippers can rotate to follow the curvature of the mould. This tool uses a combination of cryo grippers and needle grippers to handle the material. Modelling of composite materials is a broad topic and therefore we will only be able to give a rather sparse overview of some of the more detailed strategies. In the work by [11], the physical properties are modelled not only for the cloth as a whole but also the interactions between fibers. In [14], the authors have defined a simplified mathematical model using finite element analysis to model deformations of plies, which is then used to control the handling trajectories of an automated composite manufacturing facility. More recently, [9] used a finite element approach as part of the Flexdraper project. The calculation time for simulating a drape using these type of methods is typically in the order of hours on a conventional computer. Finite difference modelling is another interesting approach used by [3] to calculate deformation profiles of rectangular composite plies under the effect of multiple suction surfaces. Their results show good correspondence between expected results and simulations. They illustrate the importance of mesh vs accuracy vs computation time. In order to reach high accuracies their computational time was close to 30 min, whereas medium accuracies could be obtained with computational time in the range of 10 s. These


physics based methods however all have a tendency of being computationally expensive. Therefore, we look a bit broader for fast modeling of deformable objects. Mass Spring Models is a typical approach (see e.g. [8]) where the approach is outlined in a nice detail. The paper [12] focuses more on the speed of the method. More advanced versions were presented in e.g. [16] where cloth animation using mass spring models with dynamic stiffness is presented. The focus is on obtaining buckling and wrinkling. Mass spring models are relatively fast dynamical approximations, however for the material we are working with, the springs coefficients would be very high. Therefore timesteps for modelling would be very low and overall the model would hence again become computationally expensive. There has therefore been a lack of computationally fast and accurate model for prepreg fiber plies.

5 The Drape Tool Model

We model the drape tool system using the constrained manipulator equation [1, 15]. Gravity is assumed to have negligible influence on the dynamics and is left out of the model. Thus the model becomes

M(q)\ddot{q} + C(q,\dot{q})\dot{q} = -\frac{\partial F}{\partial \dot{q}} + J_q^T(f)\lambda + J_q^T(g)\mu    (1)
f(q) = 0    (2)
g(t,q) = 0    (3)

where q is the state vector, \partial F/\partial \dot{q} are the damping forces, f are the interlink constraints, g are rheonomic constraints specifying the actuator movements, and \lambda and \mu are Lagrange multipliers. The inertia matrix M and the matrix of Coriolis and centrifugal forces C are defined from the kinetic energy. These quantities will be defined in the following. We use the following notation conventions: q is the state vector for the whole system, q_i is the [5 × 1] state vector for suction cup i, q_i is the i-th element of q, and T^B_A is a [4 × 4] transformation matrix from frame B to frame A. We solve the equation by differentiating the constraints twice with respect to time and setting up the following system of equations

\begin{bmatrix} M & -J_q^T(f) & -J_q^T(g) \\ J_q(f) & 0 & 0 \\ J_q(g) & 0 & 0 \end{bmatrix} \begin{bmatrix} \ddot{q} \\ \lambda \\ \mu \end{bmatrix} = \begin{bmatrix} -\frac{\partial F}{\partial \dot{q}} - C\dot{q} \\ -\sum_{\alpha=1}^{2MN-M-N} \dot{q}^T \frac{\partial^2 f_\alpha}{\partial q^2}\dot{q} \\ -\frac{\partial g}{\partial t} \end{bmatrix}    (4)

To correct for constraint drift during simulation we used the method detailed in [10].
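For illustration, the saddle-point system (4) can be assembled and solved directly at each simulation step. The following is a minimal Python sketch with our own variable names (it is not the authors' implementation); the inputs are assumed to be available from the model quantities defined above:

```python
import numpy as np

def solve_accelerations(M, coriolis, damping, Jf, Jg, rhs_f, rhs_g):
    """Assemble and solve system (4) for (q_ddot, lambda, mu).

    M: inertia matrix; coriolis = C(q, q_dot) q_dot; damping = dF/dq_dot;
    Jf, Jg: constraint Jacobians; rhs_f, rhs_g: right-hand sides obtained by
    differentiating the constraints twice with respect to time.
    """
    n, mf, mg = M.shape[0], Jf.shape[0], Jg.shape[0]
    K = np.zeros((n + mf + mg, n + mf + mg))
    K[:n, :n] = M
    K[:n, n:n + mf] = -Jf.T          # -J_q^T(f)
    K[:n, n + mf:] = -Jg.T           # -J_q^T(g)
    K[n:n + mf, :n] = Jf
    K[n + mf:, :n] = Jg
    b = np.concatenate([-damping - coriolis, rhs_f, rhs_g])
    sol = np.linalg.solve(K, b)
    return sol[:n], sol[n:n + mf], sol[n + mf:]
```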


We first define the joint constraint Jacobian matrix A and the stacked constraint vectors c and d as

A = \begin{bmatrix} J_q(f) \\ J_q(g) \end{bmatrix}    (5)

c = \begin{bmatrix} f(q) \\ g(t,q) \end{bmatrix}    (6)

d = \begin{bmatrix} \dot{f}(q) \\ \dot{g}(t,q) \end{bmatrix}    (7)

Then we can solve the following system of equations for the position corrections

A M^{-1} A^T \epsilon = -c    (8)
\Delta q = M^{-1} A^T \epsilon    (9)

and similarly for the velocity corrections

A M^{-1} A^T \nu = -d    (10)
\Delta \dot{q} = M^{-1} A^T \nu    (11)
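The drift correction in Eqs. (8)–(11) amounts to two linear solves with the same matrix. A minimal Python sketch, using our own function and variable names, could look as follows:

```python
import numpy as np

def project_state(q, q_dot, M_inv, A, c, d):
    """One position/velocity stabilization step following Eqs. (8)-(11).

    A is the stacked constraint Jacobian (5), c and d the stacked constraint
    values (6) and their time derivatives (7); M_inv is the inverse inertia matrix.
    """
    S = A @ M_inv @ A.T                    # common system matrix A M^-1 A^T
    eps = np.linalg.solve(S, -c)           # position multipliers, Eq. (8)
    q = q + M_inv @ A.T @ eps              # position correction,  Eq. (9)
    nu = np.linalg.solve(S, -d)            # velocity multipliers, Eq. (10)
    q_dot = q_dot + M_inv @ A.T @ nu       # velocity correction,  Eq. (11)
    return q, q_dot
```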

5.1 Kinetic Energy and Drape Tool Kinematics

We define the generalized coordinates for each suction cup to be: the two angles of the universal joint, θ_y and θ_x, the actuator extension, z, and the two angles of the suction cup ball joint, φ_y and φ_x. We note that the rotation of the suction cup about the symmetric axis of the suction cup has been constrained, and thus the ball joint can be described by two coordinates. We define each suction cup to consist of three rigid bodies: the actuator housing, the actuator rod down to the ball joint, and the suction cup itself. We call the actuator housing body 1, the actuator rod body 2 and the suction cup body 3. We follow the kinematic formulation given in [13]. For each generalized coordinate we define a twist. The twists are

\varepsilon_1 = \begin{bmatrix} 0 & 0 & 0 & 0 & 1 & 0 \end{bmatrix}^T    (12)
\varepsilon_2 = \begin{bmatrix} 0 & 0 & 0 & 1 & 0 & 0 \end{bmatrix}^T    (13)
\varepsilon_3 = \begin{bmatrix} 0 & 0 & 1 & 0 & 0 & 0 \end{bmatrix}^T    (14)
\varepsilon_4 = \begin{bmatrix} -p & 0 & 0 & 0 & 1 & 0 \end{bmatrix}^T    (15)
\varepsilon_5 = \begin{bmatrix} 0 & p & 0 & 1 & 0 & 0 \end{bmatrix}^T    (16)


where p is the length from the universal joint mounting point to the suction cup rotation point at zero actuator extension. The position of the rotation point of the suction cup with all generalized coordinates set to zero is

T^{sc}_w(0) = T^{rod}_w(0) = \begin{bmatrix} 1 & 0 & 0 & g_{x,i} \\ 0 & 1 & 0 & g_{y,i} \\ 0 & 0 & 1 & g_{z,i} + p \\ 0 & 0 & 0 & 1 \end{bmatrix}    (17)

where the point g is the position of the universal joint mounting point in the actuator grid. Similarly, the zero configuration for the actuator housing is

T^{h}_w(0) = \begin{bmatrix} 1 & 0 & 0 & g_{x,i} \\ 0 & 1 & 0 & g_{y,i} \\ 0 & 0 & 1 & g_{z,i} \\ 0 & 0 & 0 & 1 \end{bmatrix}    (18)

The forward kinematics for each of the three bodies are then given by the product of exponentials formula as

T^{h}_w(q_i) = e^{\hat{\varepsilon}_1 \theta_{y,i}}\, e^{\hat{\varepsilon}_2 \theta_{x,i}}\, T^{h}_w(0)    (19)
T^{rod}_w(q_i) = e^{\hat{\varepsilon}_1 \theta_{y,i}}\, e^{\hat{\varepsilon}_2 \theta_{x,i}}\, e^{\hat{\varepsilon}_3 z_i}\, T^{rod}_w(0)    (20)
T^{sc}_w(q_i) = e^{\hat{\varepsilon}_1 \theta_{y,i}}\, e^{\hat{\varepsilon}_2 \theta_{x,i}}\, e^{\hat{\varepsilon}_3 z_i}\, e^{\hat{\varepsilon}_4 \phi_{y,i}}\, e^{\hat{\varepsilon}_5 \phi_{x,i}}\, T^{sc}_w(0)    (21)
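The product-of-exponentials chain (21) for a single suction cup can be evaluated with a standard matrix exponential. The sketch below uses our own helper names; it assumes the ordering q_i = (θ_y, θ_x, z, φ_y, φ_x) and the mounting point g_i as defined above:

```python
import numpy as np
from scipy.linalg import expm

def hat(twist):
    """4x4 matrix form of a twist [v, w] (v = linear part, w = angular part)."""
    v, w = twist[:3], twist[3:]
    X = np.zeros((4, 4))
    X[:3, :3] = [[0, -w[2], w[1]], [w[2], 0, -w[0]], [-w[1], w[0], 0]]
    X[:3, 3] = v
    return X

def fk_suction_cup(q_i, g_i, p):
    """Forward kinematics of one suction cup, Eqs. (12)-(21). A sketch."""
    th_y, th_x, z, ph_y, ph_x = q_i
    e1 = np.array([0, 0, 0, 0, 1, 0.])
    e2 = np.array([0, 0, 0, 1, 0, 0.])
    e3 = np.array([0, 0, 1, 0, 0, 0.])
    e4 = np.array([-p, 0, 0, 0, 1, 0.])
    e5 = np.array([0, p, 0, 1, 0, 0.])
    T_sc0 = np.eye(4)
    T_sc0[:3, 3] = [g_i[0], g_i[1], g_i[2] + p]              # Eq. (17)
    return (expm(hat(e1) * th_y) @ expm(hat(e2) * th_x) @ expm(hat(e3) * z)
            @ expm(hat(e4) * ph_y) @ expm(hat(e5) * ph_x) @ T_sc0)   # Eq. (21)
```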

Given a transformation matrix relating a body to the world frame, we can calculate the velocity of the rigid body in a body centric frame as

\hat{V} = (T^b_a)^{-1}\dot{T}^b_a = \begin{bmatrix} [\omega]_\times & \dot{p} \\ 0 & 0 \end{bmatrix}    (22)

V = \begin{bmatrix} \dot{p} \\ \omega \end{bmatrix}    (23)

To relate the velocity twists to the generalized velocities, the Jacobian is needed:

V^1 = J_q(T^h_w)\,\dot{q}_i    (24)
V^2 = J_q(T^{rod}_w)\,\dot{q}_i    (25)
V^3 = J_q(T^{sc}_w)\,\dot{q}_i    (26)

Each rigid body has a mass m and an inertia tensor I, seen in the body frame. Thus, using the velocities as defined in Eq. (22), the kinetic energy is

E = \frac{1}{2}\sum_{i=1}^{MN}\sum_{j=1}^{3} V^j(q_i)^T \begin{bmatrix} m_j I_{3\times 3} & 0 \\ 0 & I_j \end{bmatrix} V^j(q_i) = \frac{1}{2}\dot{q}^T M(q)\dot{q}    (27)
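Equation (27) also shows how the configuration-dependent inertia matrix of one suction cup can be assembled from the body Jacobians. A minimal sketch with hypothetical inputs (not the authors' code):

```python
import numpy as np

def mass_matrix(body_jacobians, masses, inertias):
    """Assemble M(q) for one suction cup as in Eq. (27).

    body_jacobians: list of 6x5 body Jacobians J_q(T) for the three bodies,
    evaluated at the current q_i; masses/inertias: m_j and 3x3 tensors I_j.
    """
    M = np.zeros((5, 5))
    for J, m, I in zip(body_jacobians, masses, inertias):
        G = np.zeros((6, 6))
        G[:3, :3] = m * np.eye(3)        # translational block  m * I_3x3
        G[3:, 3:] = I                    # rotational block     body inertia
        M += J.T @ G @ J                 # contribution of this rigid body
    return M
```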


The inertia matrix is defined as the Hessian of the kinetic energy, which is, in coordinates,

M_{i,j}(q) = \frac{\partial^2 E}{\partial \dot{q}_i\, \partial \dot{q}_j}    (28)

This matrix can be shown to be symmetric and positive definite. The Coriolis and centrifugal forces can be derived from the inertia matrix using its Christoffel symbols. The Coriolis matrix is defined as

C_{k,j}(q,\dot{q}) = \frac{1}{2}\sum_{i=1}^{n}\left(\frac{\partial M_{k,j}}{\partial q_i} + \frac{\partial M_{k,i}}{\partial q_j} - \frac{\partial M_{i,j}}{\partial q_k}\right)\dot{q}_i    (29)

We model the damping in the system with a linear damping function, resulting from the following Rayleigh dissipation function

F(\dot{q}) = \frac{1}{2}\sum_{i=1}^{MN} \dot{q}_i^T D\, \dot{q}_i    (30)

where D is a [5 × 5] diagonal matrix of damping factors.
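Given a routine that evaluates M(q), the Coriolis matrix (29) can be obtained numerically from its Christoffel symbols. The following sketch uses central finite differences; the step size h is an arbitrary choice and the function names are ours:

```python
import numpy as np

def coriolis_matrix(mass_matrix_fn, q, q_dot, h=1e-6):
    """Coriolis/centrifugal matrix from Eq. (29) via Christoffel symbols."""
    n = len(q)
    dM = np.zeros((n, n, n))                       # dM[i] = dM/dq_i
    for i in range(n):
        dq = np.zeros(n)
        dq[i] = h
        dM[i] = (mass_matrix_fn(q + dq) - mass_matrix_fn(q - dq)) / (2 * h)
    C = np.zeros((n, n))
    for k in range(n):
        for j in range(n):
            C[k, j] = 0.5 * sum(
                (dM[i][k, j] + dM[j][k, i] - dM[k][i, j]) * q_dot[i]
                for i in range(n))
    return C
```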

5.2 Interlink Constraints

The interlinks between suction cups are modelled as constraints that fix the distance between a point on either of the interlinked suction cups. This is expressed for interlink α between suction cups i and j as

f_\alpha = \frac{1}{2}\left(\left\| T^i_w(a_I e_\alpha) - T^j_w(-a_I e_\alpha)\right\|^2 - d^2\right) = 0,    (31)
\alpha = 1, \ldots, 2MN - M - N    (32)

Depending on the direction of the interlink, e_\alpha is either the x or the y unit vector.

5.3 Actuator Movements

The actuator movements are approximated by a trapezoidal velocity profile [2]. Each movement is parameterized by the maximum velocity v_max, maximum acceleration a_max, initial position x_i and final position x_f. The movement profile is assumed to be symmetric and initially at rest. The velocity of the actuator then becomes

v(t) = \begin{cases} 0 & t < 0 \ \vee\ t \ge t_f \\ a_{max}\, t & 0 \le t < t_1 \\ v_{max} & t_1 \le t < t_2 \\ -a_{max}(t - t_2) + v_{max} & t_2 \le t < t_f \end{cases}
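A minimal Python sketch of this velocity profile is given below; it assumes the stroke is long enough that a cruise phase exists, and it returns the speed magnitude:

```python
def trapezoid_velocity(t, v_max, a_max, x_i, x_f):
    """Trapezoidal actuator velocity profile (symmetric, starting at rest)."""
    t1 = v_max / a_max                                     # end of acceleration
    t2 = t1 + (abs(x_f - x_i) - v_max * t1) / v_max        # end of cruise phase
    tf = t2 + t1                                           # symmetric deceleration
    if t < 0 or t >= tf:
        return 0.0
    if t < t1:
        return a_max * t
    if t < t2:
        return v_max
    return -a_max * (t - t2) + v_max
```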


We obtain the extension of the actuator by integrating the velocity curve:

p(t) = \begin{cases} x_i & t < 0 \\ \frac{1}{2} a_{max} t^2 + x_i & 0 \le t < t_1 \\ v_{max}(t - t_1) + x_1 & t_1 \le t < t_2 \\ -\frac{1}{2} a_{max}\, t(t - t_2) + (v_{max} + a_{max} t_2)(t - t_2) + x_2 & t_2 \le t < t_f \\ x_f & t \ge t_f \end{cases}

… u_{min}, \qquad T_{max} < u_{max}.    (5)

2.3 Polynomial Loss Model

We assume that the shift Δ of the mean of the random value x remains constant. In this case, the time t_d required for discovering this shift (the so-called time to signal) is the sum of a random number K_d of independent and identically distributed sampling time-intervals u_i, conditionally independent of K_d:

t_d = \sum_{i=1}^{K_d} u_i.    (6)

The value K_d is distributed geometrically with the success probability 1 − β, given by (1). The mathematical expectation and the variance of K_d are [24]:

E(K_d) = \frac{1}{1-\beta}, \qquad \mathrm{Var}(K_d) = \frac{\beta}{(1-\beta)^2}.    (7)
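A quick Monte Carlo sanity check of (6) and (7) (and of the moment formulas derived below) can be written in a few lines; the stand-in uniform distribution for the sampling intervals and the value of β are illustrative assumptions only:

```python
import numpy as np

rng = np.random.default_rng(0)
beta = 0.3                                    # assumed value, for illustration

def sample_td(n=50_000, u_low=0.5, u_high=2.0):
    """Simulate t_d per Eq. (6): K_d geometric with success prob. 1 - beta,
    u_i i.i.d. Uniform(u_low, u_high) as a stand-in sampling-interval law."""
    Kd = rng.geometric(1.0 - beta, size=n)
    td = np.array([rng.uniform(u_low, u_high, k).sum() for k in Kd])
    return Kd, td

Kd, td = sample_td()
print(Kd.mean(), 1 / (1 - beta))              # E(K_d), Eq. (7)
print(Kd.var(), beta / (1 - beta) ** 2)       # Var(K_d), Eq. (7)
print(td.mean(), 1.25 / (1 - beta))           # E(t_d) = E(K_d) E(u), cf. Eq. (11)
```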

The cost functional, to be minimized by a proper choice of the sampling time-interval u(z), is the mathematical expectation E(L) of the loss L, caused


by the delay in the detection of the shift. In this paper, we consider the loss L as the following homogeneous quadratic polynomial with respect to the delay t_d:

L = L(\delta) = k(\delta)\left(\kappa_1 t_d^2 + \kappa_2 t_d\right),    (8)

where k(δ) ≥ 0 is an even function of δ that increases for δ ≥ 0 and satisfies k(0) = 0; the coefficients κ_1 and κ_2 are independent of δ and satisfy the inequalities

\kappa_1 \ge 0, \qquad \kappa_2 \ge 0, \qquad \kappa_1 + \kappa_2 > 0.    (9)

Let us calculate the expected loss E(L). Based on Eqs. (6) and (8), we directly have

E(L) = k(\delta)\, E\left(\kappa_1 t_d^2 + \kappa_2 t_d\right) = k(\delta)\left(\kappa_1 E(t_d^2) + \kappa_2 E(t_d)\right).    (10)

To continue the calculation of the expected loss E(L), we calculate separately E(t_d) and E(t_d^2). Using Eq. (7) and the results of [24], we obtain

E(t_d) = E(K_d)\, E(u(z)) = \frac{1}{1-\beta} E(u(z)),    (11)

\mathrm{Var}(t_d) = E(K_d)\,\mathrm{Var}(u(z)) + \mathrm{Var}(K_d)\, E^2(u(z)),    (12)

E(t_d^2) = E^2(t_d) + \mathrm{Var}(t_d) = E^2(K_d) E^2(u(z)) + E(K_d)\,\mathrm{Var}(u(z)) + \mathrm{Var}(K_d) E^2(u(z))
 = E(K_d) E(u^2(z)) + \left(E^2(K_d) - E(K_d) + \mathrm{Var}(K_d)\right) E^2(u(z))
 = \frac{1}{1-\beta} E(u^2(z)) + \frac{2\beta}{(1-\beta)^2} E^2(u(z)).    (13)

z ∈ [0, 3].

(16)

Thus, the substitution of (14) into (11), as well as (14) and (15) into (13), yields  3 E(td ) = A(δ) ψ(z, δ)u2 (z)dz, (17) 0

E(td²) = A(δ) [ ∫₀³ ψ(z, δ) u²(z) dz + B(δ) ( ∫₀³ ψ(z, δ) u(z) dz )² ],    (18)

where

A(δ) ≜ exp(−δ²/2) / ((1 − β) β √(2π)) > 0,    B(δ) ≜ 2 exp(−δ²/2) / ((1 − β) √(2π)) > 0.

Finally, the substitution of (17) and (18) into (10) yields the following expression for the expected loss:

Lex = Lex(u(z), δ) = E(L) = k(δ) A(δ) [ κ1 ∫₀³ ψ(z, δ) u²(z) dz + κ1 B(δ) ( ∫₀³ ψ(z, δ) u(z) dz )² + κ2 ∫₀³ ψ(z, δ) u(z) dz ].    (19)
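For a concrete feel of (19), the following sketch evaluates the expected loss for a given sampling-interval function u(z) by numerical quadrature. The choice of parameter values and of the loss weight k(·) in the commented call are illustrative assumptions, not the values used in the paper.

```python
import numpy as np
from scipy.integrate import quad

def expected_loss(u, delta, beta, kappa1, kappa2, k):
    """Expected loss (19) for a sampling-interval function u(z), z in [0, 3]."""
    psi = lambda z: 2.0 * np.exp(-z**2 / 2.0) * np.cosh(delta * z)          # (16)
    A = np.exp(-delta**2 / 2.0) / ((1.0 - beta) * beta * np.sqrt(2.0 * np.pi))
    B = 2.0 * np.exp(-delta**2 / 2.0) / ((1.0 - beta) * np.sqrt(2.0 * np.pi))
    I1, _ = quad(lambda z: psi(z) * u(z) ** 2, 0.0, 3.0)                    # ∫ psi u²
    I2, _ = quad(lambda z: psi(z) * u(z), 0.0, 3.0)                         # ∫ psi u
    return k(delta) * A * (kappa1 * I1 + kappa1 * B * I2**2 + kappa2 * I2)

# illustrative call with a constant sampling interval u(z) ≡ 1 and k(δ) = δ²
# loss = expected_loss(lambda z: 1.0, delta=0.5, beta=0.3, kappa1=1.0, kappa2=1.0, k=lambda d: d**2)
```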

Remark 2. Two cases can be distinguished with respect to the information on the value of δ: (i) the current value of δ is known; (ii) the current value of δ is unknown but it is known that |δ| ≤ δ0, where δ0 > 0 is its predicted upper bound. In the first case, one should minimize with respect to u(z) the expected loss (19), calculated for the known value of δ. Such a minimization, subject to the constraints (2) and (3), yields the optimal time-sampling interval u*(z, δ) depending not only on the current value of z but also on the value of δ. In the second case, one should look for a time-sampling interval û(z) independent of δ. Such û(z), robust w.r.t. δ, should provide the best guaranteed value of the expected loss (19), i.e., minimize the functional max_{|δ|≤δ0} Lex(u(z), δ) subject to the constraints (2) and (3). In the subsequent sections, we analyze both cases.

3 Optimal Time-Sampling for a Known δ

Since k(δ) and A(δ) are positive for any δ ≠ 0, the minimization of the functional (19) for any such δ is equivalent to the minimization of the functional

J(u(z), δ) = κ1 ∫₀³ ψ(z, δ) u²(z) dz + κ1 B(δ) ( ∫₀³ ψ(z, δ) u(z) dz )² + κ2 ∫₀³ ψ(z, δ) u(z) dz.    (20)

Using this observation, we can formulate the following extremal problem.

Extremal Problem (EP): for a known δ ≠ 0, find the function u(z) = u*(z, δ), z ∈ [0, 3], which minimizes the functional (20) subject to the constraints (2), (3) and the inequality (5).

3.1 Analytical Solution of the Extremal Problem

The EP is a nonstandard calculus of variations problem with two types of constraints imposed on the minimizing function: the geometric constraint (2) and the integral inequality constraint (3). The classical calculus of variations theory does not study extremal problems with such types of constraints (see, e.g., [25]). In this paper, we propose a method of solution of this problem which is not based on the classical Euler–Lagrange equation of the calculus of variations. The proposed method consists in an equivalent transformation of the EP into an optimal control problem. The latter is solved by application of a necessary condition of control optimality – Pontryagin's Maximum Principle (PMP) [26].

Transformation of the EP. Consider the auxiliary vector-valued function w(z) = (w1(z), w2(z), w3(z))ᵀ, z ∈ [0, 3], where

w1(z) = ∫₀ᶻ ψ(ζ, δ) u²(ζ) dζ,    w2(z) = ∫₀ᶻ ψ(ζ, δ) u(ζ) dζ,    w3(z) = ∫₀ᶻ exp(−ζ²/2) u(ζ) dζ.    (21)

The functions wi(z), (i = 1, 2, 3), satisfy the differential equations

dw1/dz = ψ(z, δ) u²(z),    dw2/dz = ψ(z, δ) u(z),    dw3/dz = exp(−z²/2) u(z),    (22)

and the initial conditions

w1(0) = 0,    w2(0) = 0,    w3(0) = 0.    (23)

Using the definition of w3(z) in (21), the integral inequality (3) of the EP becomes aTmin ≤ w3(3) ≤ aTmax, which can be rewritten equivalently as:

g1(w(3)) ≜ −w3(3) + aTmin ≤ 0,    g2(w(3)) ≜ w3(3) − aTmax ≤ 0.    (24)

Due to (21), the functional (20) becomes

J = J(u(z)) = κ1 w1(3) + κ1 B(δ) (w2(3))² + κ2 w2(3).    (25)

Thus, we have transformed the EP into the equivalent optimal control problem: to find the control function u(z), transferring the system (22) from the initial position (23) to the set of terminal positions (24) and minimizing the functional (25), subject to the geometric constraint (2) and the inequality (5). If κ1 > 0, this optimal control problem is nonlinear with respect to u(z). However, for κ1 = 0, it becomes linear with respect to u(z). In what follows, we call this problem the Optimal Control Problem (OCP). By virtue of the results of the book [27] (see Section 9.2, Theorem 3), the OCP has a solution (optimal control).
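The reformulation above is easy to exercise numerically: for any admissible u(z), integrating (22) from (23) gives w(3), from which the terminal constraint (24) and the cost (25) can be checked. The sketch below does this with a simple forward-Euler integration; the step size and the way parameters are passed are our assumptions, not part of the paper's method.

```python
import numpy as np

def evaluate_candidate(u, delta, beta, kappa1, kappa2, a, Tmin, Tmax, steps=3000):
    """Integrate the state equations (22) for a given control u(z); return the
    cost (25) and a feasibility flag for the terminal constraint (24)."""
    psi = lambda z: 2.0 * np.exp(-z**2 / 2.0) * np.cosh(delta * z)
    B = 2.0 * np.exp(-delta**2 / 2.0) / ((1.0 - beta) * np.sqrt(2.0 * np.pi))
    w = np.zeros(3)                          # (23): w(0) = 0
    dz = 3.0 / steps
    for k in range(steps):
        z = k * dz
        uz = u(z)
        w += dz * np.array([psi(z) * uz**2,             # dw1/dz
                            psi(z) * uz,                # dw2/dz
                            np.exp(-z**2 / 2.0) * uz])  # dw3/dz
    J = kappa1 * w[0] + kappa1 * B * w[1]**2 + kappa2 * w[1]   # (25)
    feasible = a * Tmin <= w[2] <= a * Tmax                    # (24)
    return J, feasible
```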


Solution of the OCP by Application of the PMP. The Variational Hamiltonian of the OCP is

H = H(w, u, λ, z, δ) = λ1 ψ(z, δ) u² + λ2 ψ(z, δ) u + λ3 exp(−z²/2) u,    (26)

where λ = λ(z) = (λ1(z), λ2(z), λ3(z))ᵀ, and λi = λi(z), (i = 1, 2, 3), are the costate variables satisfying the differential equations and the terminal conditions

dλ1/dz = −∂H/∂w1 = 0,    dλ2/dz = −∂H/∂w2 = 0,    dλ3/dz = −∂H/∂w3 = 0,    z ∈ [0, 3],    (27)

λ1(3) = −C0 ∂J/∂w1(3) = −κ1 C0,
λ2(3) = −C0 ∂J/∂w2(3) = −2κ1 C0 B(δ) γ − κ2 C0,    γ ≜ w2(3),
λ3(3) = −C0 ∂J/∂w3(3) − C1 ∂g1(w(3))/∂w3(3) − C2 ∂g2(w(3))/∂w3(3) = C1 − C2.    (28)

In the conditions (28), the constants C0, C1, C2 satisfy the conditions

C0 ≥ 0,    C1 ≥ 0,    C2 ≥ 0,    C0 + C1 + C2 > 0,    (29)

C1 g1(w(3)) = 0,    C2 g2(w(3)) = 0.    (30)

Denote Iu ≜ {u : umin ≤ u ≤ umax}. Due to the PMP, an optimal control u*(z, δ) of the OCP necessarily satisfies the following condition for all z ∈ [0, 3]:

H(w(z), u*(z, δ), λ(z), z) = max_{u(z)∈Iu} H(w(z), u(z), λ(z), z).    (31)

Thus, any control u(z), satisfying the Eqs. (31), (22)–(23) and (27)–(28), the conditions (24) and (29)–(30), is an optimal control candidate in the OCP. Let us obtain such a control. First, we solve the terminal-value problem (27)–(28). This problem yields the solution for z ∈ [0, 3] and a known δ ≠ 0: λ1(z, δ) = −κ1 C0, λ2(z, δ) = −2κ1 C0 B(δ) γ − κ2 C0, λ3(z, δ) = C1 − C2. By substitution of this solution into (26) and use of (16), the Variational Hamiltonian of the OCP becomes

H = −exp(−z²/2) G(u, z, δ, γ, C0, C1, C2),    (32)

where the function G(u, z, δ, γ, C0, C1, C2) has the form

G(u, z, δ, γ, C0, C1, C2) = (4κ1 C0 B(δ) γ cosh(δz) + 2κ2 C0 cosh(δz) + C2 − C1) u + 2κ1 C0 cosh(δz) u².    (33)


Now, we are going to show that C0 > 0. For this purpose, we assume the opposite which, due to the first inequality in (29), is C0 = 0. In this case, the use of (32)–(33) and the third inequality in (29) yields

H = exp(−z²/2) (C1 − C2) u    (34)

and C1 + C2 > 0. Note that C1 ≠ C2. Indeed, if C1 = C2, then both constants are non-zero. In such a case, by virtue of (30), g1(w(3)) = 0 and g2(w(3)) = 0. The latter, along with the definitions of g1(w(3)) and g2(w(3)) (see the Eq. (24)), yields the equality Tmin = Tmax, which contradicts the above assumed inequality Tmin < Tmax. Thus, C1 − C2 ≠ 0. The unique control, satisfying (31) with the Variational Hamiltonian of the form (34), is

u*(z) = { umin, if C1 − C2 < 0,
          umax, if C1 − C2 > 0. }    (35)

Substituting (35) into the third equation of (22) instead of u(z) and solving the resulting equation subject to the initial condition from (23), we obtain

w3(3) = { a umin, if C1 − C2 < 0,
          a umax, if C1 − C2 > 0. }

The latter, along with the inequality (5), means that w3(3) does not belong to the set of terminal positions (24). Therefore, the control (35), obtained from (31), (34) under the assumption C0 = 0, is not admissible. This means that the assumption C0 = 0 is wrong, i.e., C0 > 0. Due to the PMP, we can set C0 = 1 and rewrite the Eqs. (32)–(33) as:

H = −exp(−z²/2) G1(u, z, δ, γ, C1, C2),
G1(u, z, δ, γ, C1, C2) = (4κ1 B(δ) γ cosh(δz) + 2κ2 cosh(δz) + C2 − C1) u + 2κ1 cosh(δz) u².    (36)

In what follows, we distinguish the following cases: (i) κ1 = 0, (ii) κ1 > 0.

Case (i) (κ1 = 0). In this case, the Hamiltonian in (36) becomes the following linear function of u:

H = −exp(−z²/2) S(z, δ, C1, C2) u,    S(z, δ, C1, C2) = 2κ2 cosh(δz) + C2 − C1.    (37)

Applying (31) to the Hamiltonian (37), we obtain the optimal control of the OCP as:

u*i(z, δ) = u*i(z, δ, C1, C2) = { umin, S(z, δ, C1, C2) > 0,
                                  umax, S(z, δ, C1, C2) < 0. }    (38)

To use the Eq. (38), we need to know the constants C1 and C2. These constants should be chosen in such a way that the resulting control (38) will transfer the


system (22) from the initial position (23) to the set of terminal positions (24). Substitution of (38) into the third equation of (22) instead of u(z), solution of the resulting equation subject to the initial condition in (23), and use of the above mentioned requirement yield the following inequality with respect to C1 and C2:

aTmin ≤ Ψi(δ, C1, C2) ≤ aTmax,    (39)

Ψi(δ, C1, C2) ≜ ∫₀³ exp(−z²/2) u*i(z, δ, C1, C2) dz.    (40)

Note that Ψi(δ, C1, C2) is the value w3(3) generated by the control (38). Hence, due to (24) and (30), the constants C1 and C2 should satisfy the algebraic equations

C1 (aTmin − Ψi(δ, C1, C2)) = 0,    (41)
C2 (Ψi(δ, C1, C2) − aTmax) = 0.    (42)

Let us solve the set of the inequalities and the equalities (39)–(42) with respect to C1 ≥ 0 and C2 ≥ 0. In this solution, the following four cases can be distinguished: (ai) C1 > 0, C2 > 0; (bi) C1 = C2 = 0; (ci) C1 = 0, C2 > 0; (di) C1 > 0, C2 = 0.

We start with the case (ai). Due to (41)–(42), this case yields the contradictory equality Tmin = Tmax, meaning that the first case is impossible. Proceed to the case (bi). In this case, the switch function S(z, δ, C1, C2) (see the Eq. (37)) becomes S(z, δ, 0, 0) = 2κ2 cosh(δz), z ∈ [0, 3]. Since κ2 > 0, then S(z, δ, 0, 0) > 0, z ∈ [0, 3]. Therefore, due to the Eq. (38), we have u*(z, δ, 0, 0) = umin, z ∈ [0, 3]. Now, the substitution of this control into (40), and the use of the inequalities (39) and a > 0 yield the inequality Tmin ≤ umin, which contradicts the inequality in (5). Thus, the case (bi) also is impossible. Now, we consider the case (ci). Here, S(z, δ, C1, C2)|_{C1=0} = 2κ2 cosh(δz) + C2 > 0, z ∈ [0, 3]. The further analysis of this case uses the same arguments as the analysis of the case (bi), meaning that the case (ci) is impossible. Finally, we consider the case (di). In this case, by denoting Ci ≜ C1, the optimal control (38) becomes

u*i(z, δ) = u*i(z, δ, Ci) = { umin, S̄(z, δ, Ci) > 0,
                              umax, S̄(z, δ, Ci) < 0, }    (43)

S̄(z, δ, Ci) = 2κ2 cosh(δz) − Ci.    (44)

Moreover, using the Eq. (41), the inequality (39) becomes the following equation with respect to Ci:

Λi(δ, Ci) ≜ ∫₀³ exp(−z²/2) u*i(z, δ, Ci) dz − aTmin = 0.    (45)

Using the Eqs. (43)–(44) and the inequality (5), we directly obtain that Λi(δ, Ci)|_{Ci=0} < 0 and Λi(δ, Ci)|_{Ci=+∞} > 0, meaning the existence of a positive solution Ci = Ci* to the Eq. (45). Furthermore, since Λi(δ, Ci) is an increasing function of Ci ≥ 0, this solution is the unique positive solution of the Eq. (45). Thus, the optimal control of the OCP in the case κ1 = 0 has the form

u*i(z, δ) = u*i(z, δ, Ci*) = { umin, S̄(z, δ, Ci*) > 0,
                               umax, S̄(z, δ, Ci*) < 0. }    (46)

Case (ii) (κ1 > 0). In this case, applying (31) to (36), we obtain the optimal control of the OCP in the form

u*ii(z, δ) = u*ii(z, δ, γ, C1, C2) = { umin,                  ũ(z, δ, γ, C1, C2) ≤ umin,
                                       ũ(z, δ, γ, C1, C2),    ũ(z, δ, γ, C1, C2) ∈ (umin, umax],
                                       umax,                  ũ(z, δ, γ, C1, C2) > umax, }    (47)

where ũ(z, δ, γ, C1, C2) is the point of the unconstrained minimum of the function G1(u, z, δ, γ, C1, C2) with respect to the argument u. Since this function is quadratic in u with a positive coefficient of u², ũ(z, δ, γ, C1, C2) is the unique solution of the following equation with respect to u: ∂G1(u, z, δ, γ, C1, C2)/∂u = 0. Thus,

ũ(z, δ, γ, C1, C2) = (C1 − C2 − 2κ2 cosh(δz)) / (4κ1 cosh(δz)) − B(δ)γ.    (48)

The optimal control (47) depends on the constants γ, C1 and C2, which are still unknown. We should choose these constants in such a way that the resulting control (47) will transfer the system (22) from the initial position (23) to the intersection of the set of terminal positions (24) and the plane w2(3) − γ = 0 in the 3D-space (w1(3), w2(3), w3(3)). Substituting (47) into the system (22) instead of u(z), solving the resulting system subject to the initial conditions (23), and using the above mentioned requirement yield the following set of the inequality and the algebraic equation with respect to γ, C1 and C2:

aTmin ≤ Ψii,1(δ, γ, C1, C2) ≤ aTmax,    (49)

Ψii,2(δ, γ, C1, C2) − γ = 0,    (50)

where

Ψii,1(δ, γ, C1, C2) ≜ ∫₀³ exp(−z²/2) u*ii(z, δ, γ, C1, C2) dz,    (51)

Ψii,2(δ, γ, C1, C2) ≜ ∫₀³ ψ(z, δ) u*ii(z, δ, γ, C1, C2) dz.    (52)

Note that Ψii,1(δ, γ, C1, C2) and Ψii,2(δ, γ, C1, C2) are the values w3(3) and w2(3), respectively, generated by the control (47). Further, due to (24), (30) and (29), the constants γ, C1 ≥ 0 and C2 ≥ 0 should satisfy the algebraic equations

C1 (aTmin − Ψii,1(δ, γ, C1, C2)) = 0,    (53)

C2 (Ψii,1(δ, γ, C1, C2) − aTmax) = 0.    (54)

Remark 3. Since the OCP has a solution, the set (49)–(54) has a solution. If this set has more than one solution, we choose the solution γ = γ*, C1 = C1*, C2 = C2*, which provides the minimum value of the OCP cost functional (25) in comparison with the other solutions.

Analysis of the Set (49)–(54). Let (γ, C1, C2) be a solution of this set. Since the control (47) satisfies the inequality u*ii(z, δ, γ, C1, C2) ≥ umin > 0 for all z ∈ [0, 3], then due to (16), (50) and (52), γ > 0. For the further analysis, similarly to the case κ1 = 0, we distinguish the following four cases: (aii) C1 > 0, C2 > 0; (bii) C1 = C2 = 0; (cii) C1 = 0, C2 > 0; (dii) C1 > 0, C2 = 0.

The case (aii) is analyzed quite similarly to the above considered case (ai), yielding the same conclusion. Namely, the case (aii) is impossible. Proceed to the case (bii). In this case, the function ũ(z, δ, γ, C1, C2) (see (48)) becomes ũ(z, δ, γ, 0, 0) = −0.5κ2/κ1 − B(δ)γ, z ∈ [0, 3]. Since κ1 > 0, κ2 ≥ 0, B(δ) > 0 and γ > 0, then ũ(z, δ, γ, 0, 0) < 0, z ∈ [0, 3]. Therefore, due to the Eq. (47) and the inequality umin > 0, we have u*ii(z, δ, γ, 0, 0) = umin, z ∈ [0, 3]. Now, substitution of this control into (51), and use of (49) and the positiveness of a yield the inequality Tmin ≤ umin, which contradicts the inequality in (5). Thus, the case (bii) also is impossible. Now, let us treat the case (cii). In this case, we have ũ(z, δ, γ, C1, C2) = ũ(z, δ, γ, 0, C2) = −(C2 + 2κ2 cosh(δz)) / (4κ1 cosh(δz)) − B(δ)γ < 0, z ∈ [0, 3]. Thus, by the same arguments as in the previous case, the case (cii) is impossible. Finally, let us consider the case (dii). In this case, by denoting Cii ≜ C1, the optimal control (47) becomes

u*ii(z, δ) = u*ii(z, δ, γ, Cii) = { umin,               ū(z, δ, γ, Cii) ≤ umin,
                                    ū(z, δ, γ, Cii),     umin < ū(z, δ, γ, Cii) ≤ umax,
                                    umax,               ū(z, δ, γ, Cii) > umax, }    (55)

where

ū(z, δ, γ, Cii) = (Cii − 2κ2 cosh(δz)) / (4κ1 cosh(δz)) − B(δ)γ    (56)

is the unique solution of the equation ∂G2(u, z, δ, γ, Cii)/∂u = 0 with respect to u, and the function G2(u, z, δ, γ, Cii) has the form

G2(u, z, δ, γ, Cii) = (4κ1 B(δ) γ cosh(δz) + 2κ2 cosh(δz) − Cii) u + 2κ1 cosh(δz) u².

Moreover, based on the Eq. (53), the set (49)–(50) is transformed to the system of the following two equations with respect to γ and Cii:

Λii,1(δ, γ, Cii) ≜ ∫₀³ exp(−z²/2) u*ii(z, δ, γ, Cii) dz − aTmin = 0,    (57)

Λii,2(δ, γ, Cii) ≜ ∫₀³ ψ(z, δ) u*ii(z, δ, γ, Cii) dz − γ = 0.    (58)


Hence, to construct the optimal control u*ii(z, δ, γ, Cii) of the OCP in the case κ1 > 0 and, thus, to design the SPC optimal sampling time-interval, one should solve the system (57)–(58) with respect to (γ, Cii) and substitute its solution into the Eq. (55). Due to Remark 3, the system (57)–(58) has a solution. Some properties of this solution, helpful for obtaining it numerically, are presented below.

Properties of the Solution to the System (57)–(58). Consider the following values:

Cmin(γ) ≜ 4κ1 (B(δ)γ + umin) + 2κ2,    Cmax(γ) ≜ 4κ1 cosh(3δ) (B(δ)γ + umax + 0.5κ2/κ1),    (59)

γmin ≜ umin ∫₀³ ψ(z, δ) dz,    γmax ≜ umax ∫₀³ ψ(z, δ) dz,    (60)

Γmin ≜ max{γmin, 2aTmin},    Γmax ≜ min{γmax, 2aTmin cosh(3δ)}.    (61)

Also, in the plane (γ, Cii), we consider the non-empty domain

Ω ≜ { γ ∈ [Γmin, Γmax], Cii ∈ [Cmin(γ), Cmax(γ)] }.    (62)

In (59)–(62), for simplicity we omit the designation of the dependence of the corresponding values and the domain on the parameter δ ≠ 0. Using the values (59)–(61) and the domain (62), we obtain the following assertions.

Assertion 1. Let the pair (γ, Cii) be a solution of the system (57)–(58). Then, (γ, Cii) ∈ Ω.

Remark 4. Due to Assertion 1, we can look for the solution of the system (57)–(58) not in the entire plane (γ, Cii), but in the bounded domain Ω, which decreases considerably the computational effort.

Assertion 2. For any given γ ∈ [Γmin, Γmax], the Eq. (57) has the unique solution Cii = C̃(γ), and C̃(γ) ∈ [Cmin(γ), Cmax(γ)].    (63)
Moreover, C̃(γ) is a monotonically increasing function of γ ∈ [Γmin, Γmax].

Assertion 3. For any given γ ∈ [Γmin, Γmax], the Eq. (58) has the unique solution Cii = C̄(γ), and C̄(γ) ∈ [Cmin(γ), Cmax(γ)].    (64)
Moreover, C̄(γ) is a monotonically increasing function of γ ∈ [Γmin, Γmax].

Remark 5. Based on Assertions 2 and 3, the γ-component of the solution to the system (57)–(58) can be obtained by solving, with respect to γ ∈ [Γmin, Γmax], either the equation Λii,2(δ, γ, C̃(γ)) = 0, or the equation Λii,1(δ, γ, C̄(γ)) = 0, or the equation C̃(γ) = C̄(γ). Proofs of Assertions 1–3 are presented in the Appendix.
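Remark 5 suggests a simple computational scheme, sketched below: for each trial γ, solve (57) for C̃(γ) and (58) for C̄(γ) by a bracketing root-finder over [Cmin(γ), Cmax(γ)] (Assertions 2–3 guarantee unique roots there), and then find the root of C̃(γ) − C̄(γ) on [Γmin, Γmax]. The paper uses bisection; the sketch uses SciPy's brentq, which is the same bracketing idea. The quadrature routine, tolerances and the way Cmin, Cmax are passed are our assumptions.

```python
import numpy as np
from scipy.integrate import quad
from scipy.optimize import brentq

def make_solver(delta, beta, kappa1, kappa2, umin, umax, a, Tmin):
    psi = lambda z: 2 * np.exp(-z**2 / 2) * np.cosh(delta * z)
    B = 2 * np.exp(-delta**2 / 2) / ((1 - beta) * np.sqrt(2 * np.pi))

    def u_star(z, gamma, Cii):                      # saturated control (55)-(56)
        ubar = (Cii - 2 * kappa2 * np.cosh(delta * z)) / (4 * kappa1 * np.cosh(delta * z)) - B * gamma
        return min(max(ubar, umin), umax)

    def Lambda_ii_1(gamma, Cii):                    # residual of (57)
        val, _ = quad(lambda z: np.exp(-z**2 / 2) * u_star(z, gamma, Cii), 0, 3)
        return val - a * Tmin

    def Lambda_ii_2(gamma, Cii):                    # residual of (58)
        val, _ = quad(lambda z: psi(z) * u_star(z, gamma, Cii), 0, 3)
        return val - gamma

    def solve(Gmin, Gmax, Cmin, Cmax):
        # Cmin, Cmax are callables implementing (59); Gmin, Gmax come from (61)
        C_tilde = lambda g: brentq(lambda C: Lambda_ii_1(g, C), Cmin(g), Cmax(g))
        C_bar = lambda g: brentq(lambda C: Lambda_ii_2(g, C), Cmin(g), Cmax(g))
        gamma = brentq(lambda g: C_tilde(g) - C_bar(g), Gmin, Gmax)
        return gamma, C_tilde(gamma)

    return solve
```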

3.2 Numerical Solution of the EP

To solve the problem EP numerically, we divide the interval [0, 3] into N equal subintervals by the collocation points

zj = jΔz,    j = 0, 1, …, N,    Δz = 3/N.    (65)

Based on (65) and using the left rectangles formula [28], we approximate the integrals in the functional (20) and in the integral constraint (3) by finite sums. Thus, the functional is approximated as:

J(u(z)) ≈ J̃_N(U) ≜ κ1 Δz Σ_{j=0}^{N−1} ψ(zj, δ) Uj² + κ1 B(δ) ( Δz Σ_{j=0}^{N−1} ψ(zj, δ) Uj )² + κ2 Δz Σ_{j=0}^{N−1} ψ(zj, δ) Uj,    (66)

where U = (U0, U1, …, U_{N−1})ᵀ = (u(z0), u(z1), …, u(z_{N−1}))ᵀ. The constraint (3) is approximated as:

Δz Tmin Σ_{j=0}^{N−1} exp(−zj²/2) ≤ Δz Σ_{j=0}^{N−1} exp(−zj²/2) Uj ≤ Δz Tmax Σ_{j=0}^{N−1} exp(−zj²/2).    (67)

The approximation (67) of the constraint (3) is derived using the following approximation of the value a:

a ≈ a_N Δz,    a_N ≜ Σ_{j=0}^{N−1} exp(−zj²/2).

The geometric constraint (2), appearing in the EP, is approximated as:

umin ≤ Uj ≤ umax,    j = 0, 1, …, N − 1.    (68)

Further, we divide the expression in the right-hand side of (66) and the inequality (67) by Δz > 0. Thus, we obtain the following functional and the constraint in the N-dimensional Euclidean space:

J_N(U) ≜ κ1 Σ_{j=0}^{N−1} ψ(zj, δ) Uj² + κ1 Δz B(δ) ( Σ_{j=0}^{N−1} ψ(zj, δ) Uj )² + κ2 Σ_{j=0}^{N−1} ψ(zj, δ) Uj,    (69)


a_N Tmin ≤ Σ_{j=0}^{N−1} exp(−zj²/2) Uj ≤ a_N Tmax.    (70)

Now, based on (69)–(70), we can formulate the following mathematical programming problem.

Mathematical Programming Problem (MPP): for a known δ, to find the N-dimensional vector U = (U0, U1, …, U_{N−1})ᵀ which minimizes the functional (69) subject to the constraints (68), (70) and the inequality (5).

Remark 6. Like in the solution of the OCP, two cases with respect to the coefficient κ1 can be distinguished in the solution of the MPP. Namely, if κ1 = 0, this problem becomes a linear programming problem. If κ1 > 0, the MPP becomes a quadratic programming problem. In both cases, the MPP can be solved using standard optimization tools, for example, the MATLAB optimization toolbox. It is reasonable to expect that for a sufficiently large number N of the collocation points, the components Uj, (j = 0, 1, …, N − 1), of the MPP solution will be close to the corresponding values u*i(zj, Ci*) or u*ii(zj, γ*, Cii*) of the optimal control in the OCP. In such a case, the optimal value of the cost functional (69) in the MPP multiplied by Δz will be close to the optimal value of the cost functional (25) in the OCP, i.e., the solution of the MPP will represent a suboptimal time-sampling interval in the SPC.
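For the linear case κ1 = 0, the MPP can be handed to an off-the-shelf LP solver essentially verbatim. The sketch below uses SciPy's linprog instead of the MATLAB toolbox mentioned above; the parameter values in the commented call are placeholders, and the routine only illustrates how (68)–(70) map onto the standard LP form (the case κ1 > 0 would be treated analogously with a QP solver).

```python
import numpy as np
from scipy.optimize import linprog

def solve_mpp_linear(delta, kappa2, umin, umax, Tmin, Tmax, N=100):
    """Linear MPP (kappa1 = 0): minimize sum_j psi(z_j) U_j subject to (68) and (70)."""
    z = 3.0 * np.arange(N) / N                         # collocation points (65)
    psi = 2.0 * np.exp(-z**2 / 2.0) * np.cosh(delta * z)
    c = kappa2 * psi                                   # objective (69) with kappa1 = 0
    w = np.exp(-z**2 / 2.0)                            # weights of constraint (70)
    aN = w.sum()
    A_ub = np.vstack([w, -w])                          # w·U <= aN*Tmax and -w·U <= -aN*Tmin
    b_ub = np.array([aN * Tmax, -aN * Tmin])
    res = linprog(c, A_ub=A_ub, b_ub=b_ub, bounds=[(umin, umax)] * N, method="highs")
    return z, res.x

# z, U = solve_mpp_linear(delta=0.5, kappa2=1.0, umin=0.5, umax=2.5, Tmin=1.0, Tmax=2.0)
```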

3.3 Numerical Evaluation of the Optimal and Suboptimal Sampling Time-Intervals

In Fig. 1, the optimal sampling time-intervals u*i(z, δ) (the EP solution for κ1 = 0 given by the analytical expression (46), see Fig. 1a) and u*ii(z, δ) (the EP solution for κ1 > 0 given by (55), see Fig. 1b) are compared with the respective approximate sampling time-intervals (the MPP solutions). In both figures, umin = 0.5, umax = 2.5, κ2 = 1, Tmin = 1, whereas κ1 = 1 in Fig. 1b. It is seen that the approximation, obtained for N = 100, and the optimal sampling time-interval match well. For κ1 = 0, the value of Ci = 2.102 in (46) was obtained numerically by using the bisection method for the Eq. (45) (the solution accuracy was |Λi(δ, Ci)| = 6.54 · 10⁻⁴). The key element in the construction of the optimal solution in the case κ1 > 0 is the numerical solution of the set (57)–(58) based on Assertions 1–3. For each γ ∈ [Γmin, Γmax], the solutions C̃(γ) and C̄(γ) of the Eqs. (57) and (58), respectively, were obtained by the bisection method. Then, the equation C̃(γ) = C̄(γ) was solved by the bisection. The resulting values of (γ, Cii) ∈ Ω are γ = 3.29, Cii = 302.26. For this solution, |Λii,1(δ, γ, Cii)| = 6.88 · 10⁻⁵ and |Λii,2(δ, γ, Cii)| = 7.87 · 10⁻⁵, indicating a sufficient accuracy of the numerical solution.
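The bisection step for the κ1 = 0 case is straightforward to reproduce: Λi(δ, Ci) in (45) is monotone in Ci, so a bracketing search finds the switch constant Ci*, after which (46) gives the bang-bang sampling interval. The code below is a sketch using the umin, umax, κ2, Tmin values quoted above; the upper bracket, the tolerance and the δ value in the commented call are our assumptions.

```python
import numpy as np
from scipy.integrate import quad

def optimal_u_case_i(delta, kappa2, umin, umax, Tmin, C_hi=100.0, tol=1e-6):
    """Bang-bang control (46): bisection on Lambda_i(delta, C_i) = 0, Eq. (45)."""
    a, _ = quad(lambda z: np.exp(-z**2 / 2.0), 0.0, 3.0)

    def u_star(z, C):                                   # (43)-(44)
        return umin if 2.0 * kappa2 * np.cosh(delta * z) - C > 0.0 else umax

    def Lambda_i(C):                                    # (45)
        val, _ = quad(lambda z: np.exp(-z**2 / 2.0) * u_star(z, C), 0.0, 3.0)
        return val - a * Tmin

    lo, hi = 0.0, C_hi
    while hi - lo > tol:                                # Lambda_i is increasing in C
        mid = 0.5 * (lo + hi)
        if Lambda_i(mid) < 0.0:
            lo = mid
        else:
            hi = mid
    C_star = 0.5 * (lo + hi)
    return C_star, (lambda z: u_star(z, C_star))

# e.g. (the value of delta here is an arbitrary assumption):
# C_star, u_opt = optimal_u_case_i(delta=0.5, kappa2=1.0, umin=0.5, umax=2.5, Tmin=1.0)
```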


Fig. 1. Optimal vs. approximate sampling time-intervals.

4 Time-Sampling Independent of δ

In this section, we assume that the parameter δ is unknown. However, it is known that |δ| ≤ δ0, and δ0 > 0 is a predicted upper bound of δ. Also, in this section, like in Sect. 3, we consider separately the cases: (i) κ1 = 0; (ii) κ1 > 0.

4.1 Case κ1 = 0

Due to the results of the previous section, for any known δ = 0, the optimal time-sampling interval u∗i (z, δ), z ∈ [0, 3] is given by the Eq. (46). Substitution of this time-sampling interval into the Eq. (19) with κ1 = 0 yields the minimum expected loss for this value of δ  3   ψ(z, δ)u∗i (z, δ)dz. Li,min (δ)  Lex u∗i (z, δ), δ κ =0 = κ2 k(δ)A(δ) 1

0

Let δi∗ be the value satisfying the equation Li,min (δi∗ ) = max Li,min (δ).

(71)

|δ|≤δ0

Using this value, we design the following time-sampling interval: u ˆi (z) = u∗i (z, δi∗ ),

z ∈ [0, 3].

(72)

The time-sampling interval u ˆi (z) is independent of δ and     Li,min (δi∗ ) = max min Lex u(z), δ κ =0 , |δ|≤δ0 u(z)∈U

1

(73)

where U is the set of all admissible time-sampling intervals, i.e., the set of all functions u(z), z ∈ [0, 3], satisfying the constraints (2)–(5).


Since



max min

|δ|≤δ0 u(z)∈U

       Lex u(z), δ κ1 =0 ≤ min max Lex u(z), δ κ1 =0 , u(z)∈U |δ|≤δ0

(74)

then, by virtue of Remark 2, the value Li,min (δi∗ ) represents a lower estimate of the best guaranteed value of the expected loss (19) for κ1 = 0. If in (74) the inequality becomes the equality, this lower estimate becomes the best guaranteed value of the expected loss and the time-sampling interval u ˆi (z) provides this best guaranteed value. Moreover, in this case for any u(z) ∈ U and |δ| ≤ δ0 , the saddle point inequality holds:             Lex u ˆi (z), δ κ =0 ≤ Lex u ˆi (z), δi∗ κ =0 ≤ Lex u(z), δi∗ κ =0 . (75) 1

1

1

In the sequel, the time-sampling interval u ˆi (z) is evaluated numerically. 4.2

Case κ1 > 0

This case is analyzed similarly to the previous one. Namely, due to the results of Sect. 3, for any known δ = 0, the optimal time-sampling interval u∗ii (z, δ), z ∈ [0, 3] is given by the Eq. (55), where the pair (γ, Cii ) is the solution of the system (57)–(58). Substituting this time-sampling interval into the Eq. (19), we obtain the minimum expected loss for this value of δ   3  ∗   2 Lii,min (δ) = Lex uii (z, δ), δ = k(δ)A(δ) κ1 ψ(z, δ) u∗ii (z, δ) dz + 0



3

κ1 B(δ)

ψ(z, δ)u∗ii (z, δ)dz

2



3

+ κ2

0

ψ(z, δ)u∗ii (z, δ)dz .

0

By δii∗ , we denote the value satisfying the equation Lii,min (δii∗ ) = max Lii,min (δ). |δ|≤δ0

(76)

Based on this value, we construct the following time-sampling interval: u ˆii (z) = u∗ii (z, δii∗ ),

z ∈ [0, 3].

The time-sampling interval u ˆii (z) is independent of δ. Moreover,   Lii,min (δii∗ ) = max min Lex u(z), δ , |δ|≤δ0 u(z)∈U

(77)

(78)

and     max min Lex u(z), δ ≤ min max Lex u(z), δ .

|δ|≤δ0 u(z)∈U

u(z)∈U |δ|≤δ0

(79)

Thus, due to Remark 2, the value Lii,min (δ ∗ ) is a lower estimate of the best guaranteed value of the expected loss (19). If in (79) the inequality becomes the


equality, this lower estimate becomes the best guaranteed value of the expected loss and the time-sampling interval u ˆii (z) provides this best guaranteed value. Moreover, in this case the saddle point inequality holds:       ˆii (z), δ ≤ Lex u ˆii (z), δii∗ ≤ Lex u(z), δii∗ ∀ u(z) ∈ U, |δ| ≤ δ0 . (80) Lex u In the next subsection, we evaluate numerically the time-sampling interval u ˆii (z). 4.3

Numerical Evaluation of the Time-Samplings u ˆi (z) and u ˆii (z)

For κ1 = 0, due to (74), we obtain         max min Lex u(z), δ κ =0 ≤ min max Lex u(z), δ κ =0 ≤ 1 1 |δ|≤δ0 u(z)∈U u(z)∈U |δ|≤δ0     max Lex u ˆi (z), δ κ =0 , (81) 1

|δ|≤δ0

where u ˆi (z) is given by (72). Therefore, in order to show that the saddle point inequality (75) is valid, it is sufficient to show that         ˆi (z), δ κ =0 . max min Lex u(z), δ κ =0 = max Lex u |δ|≤δ0 u(z)∈U

1

1

|δ|≤δ0

This equality was checked for k(δ) = δ, κ2 = 1, umin = 0.5,umax = 2.5, Tmin =  1,   Tmax = 2, δ0 = 3. For these data, we have max min Lex u(z), δ κ =0 = 1 |δ|≤δ0 u(z)∈U     ∗  max Lex u ˆi (z), δ κ =0 = 77.3, δi = 0.39, meaning the fulfilment of the sad1 |δ|≤δ0   ˆi (z), 0.3 κ =0 = dle point inequality (75). For example, we obtain that Lex u 1   ˆi (z), δi∗ κ1 =0 = 77.3. Now, define u1 (z) ∈ U as: u1 (z) = 74.45 < Lex u 2.5 for z ∈ [0, 0.375] and u1 (z) = 0.5  for z ∈ (0.375, 3]. Then, we have ˆi (z), δi∗ κ =0 = 77.3. This inequality, along Lex u1 (z), δi∗ κ =0 = 84.08 > Lex u 1 1 with the above obtained one, illustrates the fulfilment of the saddle point inequality (75) for u(z) = u1 (z) and δ = 0.3. Similarly to (81), for κ1 > 0 due to (79), we have        ˆii (z), δ , max min Lex u(z), δ ≤ min max Lex u(z), δ ≤ max Lex u |δ|≤δ0 u(z)∈U

u(z)∈U |δ|≤δ0

|δ|≤δ0

where u ˆii (z) is given by (77). Therefore, in order to show that the saddle point inequality (80) is valid, it is sufficient to show that       ˆii (z), δ . max min Lex u(z), δ = max Lex u |δ|≤δ0 u(z)∈U

|δ|≤δ0

For κ1 = 1 and the other parameters (as in the case κ1 = 0), we  same      ˆii (z), δ = 38642, δii∗ = 0.24. obtain max min Lex u(z), δ = max Lex u |δ|≤δ0 u(z)∈U

|δ|≤δ0

Thus for these parameters the saddle (80) is valid. For exam point ∗inequality   ˆi (z), 0.2 = 37564 < Lex u ˆii (z),δii = 38642. Define u2 (z) ∈ U as: ple, Lex u  ∗ ˆii (z), δii∗ = 38642 u2 (z) = sin(10z) + 1.5. Then Lex u2 (z), δii = 98532 > Lex u which, along with the above obtained inequality, illustrates the fulfilment of the saddle point inequality (80) for u(z) = u2 (z) and δ = 0.2.


Fig. 2. Optimal sampling time-intervals independent of δ.

In Fig. 2, the optimal sampling time-intervals, providing the best guaranteed results over δ ∈ [−3, 3], are presented for κ1 = 0 and for κ1 = 1.

5 Conclusions

In this paper, the design of an optimal state-feedback sampling time-interval for the statistical process control was considered. The criterion of the optimization is the expected loss caused by the delay in the detection of an undesirable process change. This loss was chosen in the form of a homogeneous quadratic polynomial of the delay. The problem of the optimization (minimization) of the expected loss was reduced to the extremal problem in the form of a nonstandard calculus of variations problem where the sampling time-interval of the statistical process control is a minimizing function. This minimizing function depends of the process state. The functional in the above mentioned calculus of variations problem, which is another (deterministic) form of the expected loss, depends not only on the minimizing function but also on the numerical parameter δ characterising a magnitude of the process change. The following cases of the information on δ where studied: (i) the current value of this parameter is known (the perfect information); (ii) only the interval, where this parameter varies, is known (the imperfect information). It is important to note that the imperfect information case is significantly less restrictive and more suitable for various real-life applications than the perfect information case. To the best of our knowledge, the case of the imperfect information on δ was studied for the first time in this paper. The solution of the calculus of variations problem in the perfect information case yields an optimal time-sampling interval depending not only on the current value of the process state but also on the current value of the parameter δ. However, such a solution in the imperfect information case yields the optimal time-sampling interval depending only on the current value of the process


state, i.e., this time-sampling interval is robust with respect to δ. The optimal time-sampling interval in the imperfect information case provides the best guaranteed value of the expected loss. This best guaranteed value is the minimum, with respect to the sampling time-interval, of the maximum, with respect to δ, of the expected loss. It was shown by numerical simulation that this minmax value of the expected loss equals to the maxmin value of the expected loss. The latter value means the maximum, with respect to δ, of the minimum, with respect to the sampling time-interval, of the expected loss. The equality between the minmax and maxmin values means the existence of the saddle point equilibrium (sampling time-interval, parameter δ) in the problem of the minimization of the expected loss subject to the imperfect information on δ.

Appendix The proofs of Assertions 1–3 are based on the following auxiliary propositions. Auxiliary Propositions Proposition 1. There are no solutions of the set (57)–(58) in the half-planes Cii ≤ Cmin (γ),

(82)

Cii ≥ Cmax (γ)

(83)

of the plane (γ, Cii ). Proof. First, let us prove the claim of the proposition with respect to the halfplane (82). We do this by contradiction, i.e., we assume that there exists a solution (γ, Cii ) of (57)–(58) in this half-plane. Hence, using the inequalities (82) and cosh(δz) ≥ 1, we obtain that u ¯(z, δ, γ, Cii ) ≤ umin . By virtue of (55), ∗ (z, δ, γ, C ) = umin . Thus, due to (4), (5) and (57), the latter means that u ii  ii  Λii,1 (δ, γ, Cii ) = a umin − Tmin < 0, meaning that the above mentioned pair (γ, Cii ) does not satisfy the Eq. (57). The claim with respect to the half-plane (83) is proven similarly.

Proposition 2. There are no solutions of the set (57)–(58) in the half-planes γ ≤ γmin ,

(84)

γ ≥ γmax

(85)

of the plane (γ, Cii ). Proof. First of all, let us note the following. Since 0 < umin < umax and ψ(z, δ) > 0, z ∈ [0, 3], then (86) 0 < γmin < γmax .


Consider the case where the pair (γ, Cii ) satisfies the strict inequality in (84). In this case, due to (16), (55), (58) and (84), we have 

3

Λii,2 (δ, γ, Cii ) >

  ψ(z, δ) u∗ii (z, δ, γ, Cii ) − umin dz ≥ 0,

0

meaning that (γ, Cii ) does not satisfy the Eq. (58). Now, let us consider the case γ = γmin . In this case, using (55) and (58), we have 

3

Λii,2 (δ, γmin , Cii ) =

  ψ(z, δ) u∗ii (z, δ, γmin , Cii ) − umin dz ≥ 0.

(87)

0

By virtue of (55), the inequality in (87) becomes the equality if and only if u∗ii (z, δ, γmin , Cii ) ≡ umin for all z ∈ [0, 3], yielding u ¯(z, δ, γmin , Cii ) ≤ umin for all z ∈ [0, 3]. Using (56), one directly obtains that the latter inequality for z = 0 is equivalent to the inequality (82) with γ = γmin . However, in this case by virtue of Proposition 1, the Eq. (57) is not satisfied. Thus, the claim of the assertion with respect to the half-plane (84) has been proven. The claim of the proposition with respect to the half-plane (85) is proven similarly.

Proposition 3. Let (γ, Cii ) be a solution of the set (57)–(58). Then, the component γ of this solution satisfies the inequality 2aTmin < γ < 2aTmin cosh(3δ). Proof. Substitution of (16) into (58) yields after a simple rearrangement 

3

γ=2

  exp − z 2 /2 cosh(δz)u∗ii (z, δ, γ, Cii )dz.

(88)

0

Applying the Mean Value Theorem to the integral in the right-hand side of (88) and taking into account the fact that cosh(δz) monotonically increases with respect to z ∈ [0, 3], we obtain 

3

γ = 2 cosh(δ¯ z)

  exp − z 2 /2 u∗ii (z, δ, γ, Cii )dz,

(89)

0

where z¯ is some value from the interval (0, 3). Due to (57), we can replace the integral in (89) with aTmin , which leads z ). The latter, along with the inequality 1 < to the equality γ = 2aTmin cosh(δ¯ cosh(δ¯ z ) < cosh(3δ), directly implies the statement of the proposition.

Proposition 4. The following inequality holds: 0 < Γmin < Γmax .

(90)

Proof. Using the definition of γmax (see the Eq. (60)), the Eq. (4), the inequality (5) and the same arguments as in the proof of Proposition 3, we obtain γmax = 2aumax cosh(δ˜ z ) > 2aTmin ,

(91)


where z˜ is some value from the interval (0, 3). Similarly, using the definition of γmin (see the Eq. (60)), the Eq. (4) and the inequality (5), we have γmin = 2aumin cosh(δ˜ z ) < 2aTmin cosh(3δ).

(92)

The definitions of Γmin and Γmax in (61), the inequalities (86), (91), (92), and the inequality cosh(3δ) ≥ 1 yield immediately the statement of the proposition.

Proposition 5. For any γ ≥ 0, this inequality holds: 0 < Cmin (γ) < Cmax (γ). Proof. The proposition directly follows from the definitions of Cmin (γ) and

Cmax (γ) (see the Eq. (59)). Proof of Assertion 1 First of all let us note that, due to Propositions 4 and 5, the domain Ω (see the Eq. (62)) is non-empty. Now, the statement of the assertion directly follows from Propositions 1–3 and the definitions of Γmin , Γmax and Ω. Proof of Assertion 2  The existence and uniqueness of C(γ) is proven similarly to the work [16] (see Lemma 5.1 and its proof where γ = 0). The inclusion (63) follows from the    proof of Proposition 1. Let us prove the monotonicity of C(γ), γ ∈ Γmin , Γmax .  We prove this feature of C(γ) by contradiction. Namely, we assume that the statement on the monotonic increasing of this function   is wrong. This means the existence of γ1 ∈ Γmin , Γmax and γ2 ∈ Γmin , Γmax such that

while

γ1 < γ2 ,

(93)

 1 ) ≥ C(γ  2 ). C(γ

(94)

The Eq. (56), and the inequalities (93) and (94) directly yield      1) > u  2) , u ¯ z, δ, γ1 , C(γ ¯ z, δ, γ2 , C(γ

z ∈ [0, 3].

(95)

Due to the Eq. (55) and the inequality (95), we immediately have      1 ) ≥ u∗ii z, δ, γ2 , C(γ  2) u∗ii z, δ, γ1 , C(γ

∀z ∈ [0, 3]. (96)   By virtue of the inclusion (63), we obtain that for any γ ∈ Γmin , Γmax and    all z ∈ [0, 3] the function u∗ii z, δ, γ, C(γ) is neither identical umin , nor identical umax . This fact, along with (55) and (95), yields the existence of a point z˜ ∈ [0, 3], such that      1 ) > u∗ z˜, δ, γ2 , C(γ  2) . (97) u∗ii z˜, δ, γ1 , C(γ ii


  Further, from (55) and (56), one directly concludes that for any γ ∈ Γmin , Γmax    is continuous function of z ∈ [0, 3]. This observation, the control u∗ii z, δ, γ, C(γ) along with the inequality (97), yields the existence of the interval [z1 , z2 ], (z1 < z2 ), such that [z1 , z2 ] ⊂ [0, 3] and      1 ) > u∗ii z˜, δ, γ2 , C(γ  2) u∗ii z˜, δ, γ1 , C(γ

∀z ∈ [z1 , z2 ].

(98)

Now, the definition of Λii,1 (δ, γ, C) in (57), along with the inequalities (96), (98),      1 ) > Λii,1 δ, γ2 , C(γ  2 ) . The latter contradicts the fact that yields Λii,1 δ, γ1 , C(γ  1 ) and C(γ  2 ) are solutions of the Eq. (57) with respect to Cii for γ = γ1 C(γ  and γ = γ2 . This contradiction implies that the function C(γ) monotonically increases for γ ∈ (Γmin , Γmax ), which completes the proof of the assertion. Proof of Assertion 3 We start the proof with the first two statements of the assertion. Let the pair (γ, Cii ) be any fixed satisfying the inequalities Γmin < γ < Γmax , Cii ≤ Cmin (γ). Using the Eq. (56), the definition of Cmin (γ) (see the Eq. (59)), the inequality (90), as well as the positiveness of B(δ) and the inequality cosh(δz) ≥ 1, z ∈ [0, 3], one directly has the following chain of the inequalities for z ∈ [0, 3]:

  umin 1 + B(δ)γ + 0.5(κ2 /κ1 ) − 1 ≤ umin . u ¯(z, δ, γ, Cii ) ≤ cosh(δz) cosh(δz) Hence, due to (55), u∗ii (z, δ, γ, Cii ) ≡ umin , z ∈ [0, 3]. Using the latter and (58), (84)–(85) yield   Λii,2 (δ, γ, Cii ) < 0, γ ∈ Γmin , Γmax , Cii ≤ Cmin (γ). (99) It is shown similarly, that Λii,2 (δ, γ, Cii ) > 0,

  γ ∈ Γmin , Γmax , Cii ≥ Cmax (γ).

(100)

Further,  from (55) and (56), one directly concludes that for any z ∈ [0, 3] and γ ∈ Γmin , Γmax the control u∗ii (z, δ, γ, Cii ) is continuous and monotonically increasing function of Cii ∈ (−∞, +∞). Therefore, Λii,2 (δ, γ, Cii ) (see (58)) is a continuous and monotonically increasing function of Cii ∈ (−∞, +∞) for  any γ ∈ Γmin , Γmax . The latter, along with (99)–(100), implies immediately ¯ the  of the unique solution Cii = C(γ) of the Eq. (58)for any γ ∈  existence for any γ ∈ Γmin , Γmax . Γmin , Γmax . Moreover, the inclusion   (64) is satisfied ¯ The monotonicity of C(γ), γ ∈ Γmin , Γmax is proven similarly to the same  feature of the function C(γ) (see the proof of Assertion 2). This completes the proof of the assertion.


References 1. Qiu P (2013) Introduction to statistical process control. Chapman and Hall/CRC, Boca Raton, FL ¯ charts with variable 2. Reynolds MR, Amin RW, Arnold JC, Nachlas J (1988) X sampling intervals. Technometrics 30:181–192. https://doi.org/10.1080/00401706. 1988.10488366 ¯ charts with variable 3. Amin RW, Hemasinha R (1993) The switching behavior of X sampling intervals. Commun Stat Theory Methods 22:2081–2102. https://doi.org/ 10.1080/03610929308831136 4. Amin RW, Miller RW (1993) A robustness study of charts with variable sampling intervals. J Qual Technol 25:36–44. https://doi.org/10.1080/00224065.1993. 11979414 5. Prabhu SS, Montgomery DC, Runger GC (1994) A combined adaptive sample size and sampling interval X control scheme. J Qual Technol 26:164–176. https://doi. org/10.1080/00224065.1994.11979524 6. Reynolds MR (1995) Evaluating properties of variable sampling interval control charts. Sequential Anal. 14:59–97. https://doi.org/10.1080/07474949508836320 ¯ and R control charts with variable parameters. IIE 7. Costa AFB (1998) Joint X Trans 30:505–514. https://doi.org/10.1080/07408179808966490 ¯ and R charts with variable sample sizes and sam8. Costa AFB (1999) Joint X pling intervals. J Qual Technol 31:387–397. https://doi.org/10.1080/00224065. 1999.11979945 ¯ charts with variable parameters. J Qual Technol 31:408–416. 9. Costa AFB (1999) X https://doi.org/10.1080/00224065.1999.11979947 10. Costa AFB, De Magalh˜ aes MS (2007) An adaptive chart for monitoring the process mean and variance. Qual Reliab Eng Int 23:821–831. https://doi.org/10.1002/qre. 842 11. Hatjimihail AT (2009) Estimation of the optimal statistical quality control sampling time intervals using a residual risk measure. PLoS ONE 4:e5770. https://doi. org/10.1371/journal.pone.0005770 12. Bashkansky E, Glizer VY (2012) Novel approach to adaptive statistical process control optimization with variable sampling interval and minimum expected loss. Int J Qual Eng Technol 3:91–107. https://doi.org/10.1504/IJQET.2012.049680 13. Li Z, Qiu P (2014) Statistical process control using dynamic sampling. Technometrics 56:325–335. https://doi.org/10.1080/00401706.2013.844731 ¯ control 14. Sultana I, Ahmed I, Chowdhury AH, Paul SK (2014) Economic design of X chart using genetic algorithm and simulated annealing algorithm. Int J Prod Qual Manag 14:352–372. https://doi.org/10.1504/IJPQM.2014.064810 15. Chew XY, Khoo MBC, Teh SY, Castagliola P (2015) The variable sampling inter¯ control chart. Comput Ind Eng 90:25–38. https://doi.org/10.1016/ val run sum X j.cie.2015.08.015 16. Glizer VY, Turetsky V, Bashkansky E (2015) Statistical process control optimization with variable sampling interval and nonlinear expected loss. J Ind Manag Optim 11:105–133. https://doi.org/10.3934/jimo.2015.11.105 17. Glizer VY, Turetsky V (2018) Optimal time-sampling problem in a statistical control with a quadratic cost functional: analytical and numerical approaches. In: Proceedings of the 15th international conference on informatics in control, automation and robotics, vol. 1. Scitepress - Science and Technology Publications, Setbal, Portugal. https://doi.org/10.5220/0006829000210032, pp 21–32


18. Taguchi G, Chowdhury S, Wu Y (2007) Taguchi’s quality engineering handbook. Wiley, Hoboken 19. Babrauskas V (2008) Heat release rates. In: DiNenno PJ (ed) SFPE handbook of fire protection engineering. National Fire Protection Association, Quincy, pp 1–59 20. Sebasti˜ ao P, Soares CG (1995) Modeling the fate of oil spills at sea. Spill Sci Technol Bull 2:121–131. https://doi.org/10.1016/S1353-2561(96)00009-6 21. Bulelzai MAK, Dubbeldam JLA (2012) Long time evolution of atherosclerotic plaques. J Theoret Biol 297:1–10. https://doi.org/10.1016/j.jtbi.2011.11.023 22. Carpenter TE, OBrien JM, Hagerman A, McCarl B (2011) Epidemic and economic impacts of delayed detection of foot-and-mouth disease: a case study of a simulated outbreak in California. J Vet Diagn Invest 23:26–33. https://doi.org/10. 1177/104063871102300104 23. Kim S, Frangopol DM (2011) Optimum inspection planning for minimizing fatigue damage detection delay of ship hull structures. Int J Fatigue 33:448–459. https:// doi.org/10.1016/j.ijfatigue.2010.09.018 24. Ross S (2009) A first course in probability. Prentice Hall, Upper Saddle River 25. Gelfand IM, Fomin SV (1963) Calculus of variations. Prentice-Hall, Englewood Cliffs 26. Pontryagin LS, Boltyanskii VG, Gamkrelidze RV, Mishchenko EF (1962) The mathematical theory of optimal processes. Interscience, New York 27. Ioffe AD, Tihomirov VM (1979) Theory of extremal problems. North-Holland Pub. Co., Amsterdam 28. Davis PJ, Rabinowitz P (2007) Methods of numerical integration. Dover Publications Inc., Mineola

Observability for the Wave Equation with Variable Support in the Dirichlet and Neumann Cases

Antonio Agresti¹, Daniele Andreucci², and Paola Loreti²

¹ Mathematics Department, Sapienza University, Piazzale Aldo Moro, 5, 00185 Rome, Italy, [email protected]
² SBAI Department, Sapienza University, Via Antonio Scarpa, 16, 00161 Rome, Italy, {daniele.andreucci,paola.loreti}@sbai.uniroma1.it

Abstract. We study the observability of the wave equation when the observation set changes over time. For the one dimensional Neumann problem, using Fourier series, we are able to prove for the exact observability an equivalent condition already known for the Dirichlet problem; see [1]. For the observability problem with Dirichlet boundary conditions, we focus on multidimensional problems and the observability inequality is proven through a multiplier approach. Besides this, we present some applications and a numerical simulation.

Keywords: Exact controllability · Wave equation · Variable control subset · Multiplier · Fourier series

1 Introduction

As well known, exact controllability and observability of hyperbolic systems of wave type may be studied in an equivalent way (see Lions' book for an overview on the topic [14]; see also [11,17]). The exact controllability is equivalent to the exact observability for the adjoint system. In previous works [1,2] the authors solve the exact controllability problem of the wave equation and Dirichlet boundary conditions with alternating controls allowing the support of the control to change over time. The novelty of alternating controls is motivated by engineering applications (see [4,6]), in order to avoid stressing the actuator during the evolutive process: we may change the boundary subset of the control action, still achieving final equilibrium after a long enough time interval, which we estimate. In the 1-D case in fact we find the optimal controllability time, also in the nontrivial case, for example, when the action in one boundary point is interrupted and switched to the other boundary point before the optimal controllability time for fixed control has elapsed. We point out that even in the fixed support

(D. Andreucci—Member of Italian G.N.F.M.-I.N.d.A.M.)


case, the subset of the boundary, on which the control acts, cannot be chosen arbitrarily (see [3]). On the subject of varying in time control see also [13]. Let us describe in detail the new features with respect to [2]. Sections 2–3, 5–6 are new. The content of Sect. 4 is taken from [2]. More specifically we added in Sects. 2–3 the Neumann one-dimensional case, which did not appear elsewhere previously. In Sect. 5 we introduce new examples of application of our results, and in Sect. 6 we present a connected numerical investigation. The multidimensional theory treated in Sect. 4 is taken from [2], and is included here in order to make the paper self contained, in view of the new content of Sect. 5. We begin by explaining the setting for the Dirichlet problem:

– Let Ω be a bounded domain of R^d with d ≥ 1, of class C² or convex. Thus, we have that the exterior normal vector ν is well defined H^{d−1}-a.e. on ∂Ω, where H^{d−1} is the (d − 1)-dimensional Hausdorff measure (see [7]). As usual, we denote by dΓ the measure H^{d−1} restricted to ∂Ω.
– For each t ∈ (0, T) and T > 0, Γ(t) denotes an open subset of ∂Ω.
– Finally, we define

Σ := ⋃_{t∈(0,T)} Γ(t) × {t};    (1)

and we suppose it to be H^{d−1} ⊗ L¹-measurable (as defined in [7] Chap. 1).

In the main part of the paper we will study the observability property of the following system:

u_tt − Δu = 0,   (x, t) ∈ Ω × (0, ∞),
u = 0,           (x, t) ∈ ∂Ω × (0, ∞),
u = u0,          (x, t) ∈ Ω × {0},
u_t = u1,        (x, t) ∈ Ω × {0}.    (2)

The exact meaning of the exact observability for the problem (2) is given below; for the terminology see Proposition 1.

Definition 1 (Exact Observability). Let u ∈ C(R⁺; H¹₀(Ω)) ∩ C¹(R⁺; L²(Ω)) be the unique mild solution of (2) for the initial data (u0, u1) ∈ H¹₀(Ω) × L²(Ω). Then the system is called exactly observable in time T > 0 if

∫_{{(x,t)∈Σ : t<T}} |∂_ν u|² dΓ ⊗ dt ≥ C ( ‖u0‖²_{H¹₀(Ω)} + ‖u1‖²_{L²(Ω)} )    (3)

for some constant C > 0 independent of (u0, u1). This ensures that the normal derivative ∂_ν u is a well defined element of L²(∂Ω × (0, T)) for each T > 0, since the regularity of a mild solution u is not sufficient to have a well defined trace for ∂_ν u in the sense of Sobolev spaces (see [16] Chap. 4). This result is generally called hidden regularity; the proof relies on a clever integration by parts argument, for details see [10,14] or [12]. The following Remark explains the interest in studying the exact observability.

Remark 2. In control theory, one is interested in knowing that given a time T > 0 and any w0, z0 ∈ L²(Ω) and any w1, z1 ∈ H⁻¹(Ω) there exists a control v ∈ L²(Σ) such that the unique solution w ∈ C([0, T]; L²(Ω)) ∩ C¹([0, T]; H⁻¹(Ω)) of

w_tt − Δw = 0,   (x, t) ∈ Ω × (0, T),
w = 0,           (x, t) ∈ ∂Ω × (0, T) \ Σ,
w = v,           (x, t) ∈ Σ,
w = w0,          (x, t) ∈ Ω × {0},
w_t = w1,        (x, t) ∈ Ω × {0},    (4)

satisfies w(x, T) = z0, w_t(x, T) = z1. For the notion of weak solution and the well posedness of (4) we refer to [2, Sect. 1]. This property is usually known as exact controllability. Via a duality argument one can prove that exact controllability for the system (4) is equivalent to the exact observability for (2) as defined in Definition 1 (see for instance [11,14]).

In the next proposition we show the existence of solutions to (2) in the appropriate function spaces, as used in Definition 1; below R⁺ := [0, ∞).

Proposition 1. The following holds true:
(i) For each (u0, u1) ∈ H¹₀(Ω) × L²(Ω), the problem (2) has a unique mild solution u in the class C(R⁺; H¹₀(Ω)) ∩ C¹(R⁺; L²(Ω)).
(ii) For each (u0, u1) ∈ (H²(Ω) ∩ H¹₀(Ω)) × H¹₀(Ω), the problem (2) has a unique classical solution u in the class C(R⁺; H²(Ω) ∩ H¹₀(Ω)) ∩ C²(R⁺; L²(Ω)).


Proof. Before starting the proof, we recall that the operator Δ with Dirichlet boundary conditions admits an L2 orthonormal basis composed of its eigenfunctions {fk : k ∈ N}. Moreover, we denote with {λk : k ∈ N} the correspondent sequence of eigenvalues. Since Ω is of class C 2 or convex, then fk ∈ H 2 (Ω) ∩ H01 (Ω) for all k ∈ N; for a proof see [16] or [9]. Standard considerations show that the solution u of the problem (2) can be written as u(t) =



k=0

sin(λk t) u0,k + u ˆ1,k fk ; cos(λk t)ˆ λk

(5)



where u ˆi.k := (ui , fk )L2 (Ω) :=

ui fk dx,

i = 0, 1.

Ω

The representation in (5) readily allows one to prove the claims in (i)–(ii). Corollary 1 (Energy Conservation). Let u ∈ C(R+ ; H01 (Ω)) ∩ 1 + 2 C (R ; L (Ω)) be a mild solution of (2) with initial data (u0 , u1 ) ∈ H01 (Ω) × L2 (Ω). Then, the energy map R+ t → E(t) :=

1 ∇u(·, t)2L2 (Ω) + ut (·, t)2L2 (Ω) , 2

(6)

is constant, and it is equal to the initial energy E0 := 1.1

1  ∇u0 2L2 (Ω) +  u1 2L2 (Ω) . 2

(7)

Alternating Observation

In this work we are mainly interested in the case that the mapping t → Γ (t) ⊂ ∂Ω is piecewise constant in time. More precisely, we assume that there exists a partition 0 =: t−1 < t0 < t1 < · · · < tN −1 < tN =: T of the interval [0, T ] such that ∀t ∈ (tj−1 , tj ) j = 0 . . . , N ; (8) Γ (t) ≡ Γj , where, for each j, Γj ⊂ ∂Ω is a fixed subset and Γj = Γj+1 . Under the previous hypothesis, by (1) we have Σ=

N 

Γj × (tj−1 , tj ),

(9)

j=0

up to a set of measure 0. In the rest of the paper, we will write alternating observability instead of exact observability if the condition (8) is verified. In this case, we will employ the notation of this Subsection without future references.

1.2 One Dimensional Neumann Problem

In Subsect. 2.1 we also discuss and prove new results on the alternating observability for the Neumann problems; i.e. ⎧ utt − uxx = 0, (x, t) ∈ (0, π) × (0, T ), ⎪ ⎪ ⎪ ⎨u = 0, (x, t) ∈ {0, 2π} × (0, T ), x ⎪ , x ∈ (0, π), u(x, 0) = u 0 ⎪ ⎪ ⎩ ut (x, 0) = u1 , x ∈ (0, π), where u0 , u1 are suitable functions with mean zero. For the definition of alternating observability for such a problem, see Definition 2. In Subsect. 2.1 we prove an equivalent condition to alternating observability of the Neumann problem. Surprisingly, such condition is equivalent to the one discovered in [1] for the one dimensional problem with Dirichlet boundary conditions, which will be briefly discuss at the beginning of Sect. 2. In virtue of the theory developed in Subsect. 2.1 for the Neumann problem, in Sect. 3 we are able to discuss some interesting applications and an open problem (see Remark 5).

2

The One Dimensional Problem

The alternating observability problem for the one dimensional wave equation with Dirichlet boundary conditions is solved in [1, Sect. 3]. Here, we will comment the result of [1], in next subsection we prove that the observability problem for Dirichlet boundary condition is equivalent to the one with Neumann condition. In some sense, the condition (11), in Theorem 1 below, is universal. In the one dimensional case, we can assume without loss of generality that Ω = (0, π), then ∂Ω = {0, π}. Since we are interested in alternating observation, we have that Γj = {0} or Γj = {π} for any j = 0, . . . , N . Up to a reflection argument, we can assume that Γj = {0} if j is even, otherwise Γj = {π}. To state the main result in the case of Dirichlet boundary conditions we have to introduce some notation: – Let r : R → R/(2πZ) be the quotient map to the unit circle S 1 = R/(2πZ). – Let Ij := (tj−1 , tj ) be the j-th subset of observation, we set I˜j = {r(t) : t ∈ Ij } I˜j = {r(t − π) : t ∈ Ij }

if j is even, if j is odd.

Note that I˜j ⊂ S 1 , for any j. – For any measurable A ⊂ S 1 , |A| denotes its measure. The following is proven in [1].

(10)

56

A. Agresti et al.

Theorem 1. The system (4) is alternating observable in time T = sup IN if and only if N

1  ˜ (11) Ij = 0.

S \ j=0

Remark 3. Roughly speaking, Theorem 1 affirms that each Ik gives a contribu˜ tion in terms of controllability if and only if it covers a part of the set S 1 \∪k−1 j=0 Ij , otherwise it is useless. More importantly, the condition (11) does not depend on the equation and the boundary conditions anymore, and we will see that it is also a sufficient condition for the one dimensional Neumann problem. 2.1

One Dimensional Neumann Problem

In this section we analyse the observability problem for the one dimensional wave equation with Neumann boundary condition, i.e. ⎧ utt − uxx = 0, (x, t) ∈ (0, π) × (0, T ), ⎪ ⎪ ⎪ ⎨u = 0, (x, t) ∈ {0, 2π} × (0, T ), x (12) ⎪ x ∈ (0, π), u(x, 0) = u0 , ⎪ ⎪ ⎩ ut (x, 0) = u1 , x ∈ (0, π), where u0 ∈ H 1 (0, π) and u1 ∈ L2 (0, π) satisfy  π ui (x)dx = 0, i = 0, 1. 0

For notational convenience we set

   V := v ∈ H 1 (0, π) 

π 0

 v(x)dx = 0 ,

   W := w ∈ L2 (0, π) 

π 0

 w(x)dx = 0 .

Parallel to Proposition 1 we have the following: Proposition 2. For any (u0 , u1 ) ∈ V × W there exists a unique mild solution to (12) such that u ∈ C(R+ ; V) ∩ C 1 (R+ ; W). Proof. The solution is given explicitly by a Fourier series representation u(x, t) =



(ˆ uk0 cos(kt) +

k=1

where u ˆk0 = arguments.

π 0

u ˆk1 sin(kt)) cos(kx); k

(13)

ui (x) cos(kx) dx for i = 0, 1. The claim follows by standard

Observability for the Wave Equation

57

Before going further we give the definition of exact observability for the Neumann problem (12). Here we deal with the one dimensional case; for more on the fixed case, i.e. Γ ≡ Γ (t) for any t ∈ [0, T ], see [14]. For notational convenience, we will set λj = 0 if j is even, otherwise λj = π. Definition 2 (Exact Observability for (12)). The system (12) is exactly observable in time T > 0 if for all (u0 , u1 ) ∈ V × W the unique mild solution u of (12) provided by Proposition 2 satisfies N 

Ij

j=0

|ut (λj , t)|2 dt ≥ C( u0 2V +  u1 2W );

(14)

for some constant C > 0 independent of (u0 , u1 ) ∈ V × W. The following consists in an easy application of Parseval’s inequality. Proposition 3. With the previous notation, if t0 ≥ 2π, the system (12) is exactly observable in time 2π. Moreover,  2π |ut (0, t)|2 dt = 2( u0 2V +  u1 2W ). 0

Proof. See [11, Proposition 1.1]. We are ready to prove the main result of this subsection: Theorem 2. The system (12) is exactly observable in time T = sup IN if and only if (11) holds true, i.e. N

1  ˜ Ik = 0.

S \ j=0

Proof. Before starting the proof, we remark that by a density argument it is enough to consider initial data (u0 , u1 ) in a dense subset of V × W, the one given by pairs of finite linear combinations of cos(kx). So, we have that ui (x) = M k ˆi cos(kx), i = 0, 1, for a sufficiently large number M ∈ N. k=1 u We start by looking at the quantity |ut (λj , t)|2 which appears as integrand in (14). Since u is given by (13), one has ut (0, t) =

M



 −kˆ uk0 sin(kt) + u ˆk1 cos(kt) ;

k=1

ut (π, t) =

M



 −kˆ uk0 sin(kt) + u ˆk1 cos(kt) cos(kπ)

k=1

=

M



 −kˆ uk0 sin(k(t − π)) + u ˆk1 cos(k(t − π)) = ut (0, t − π).

k=1

58

A. Agresti et al.

The previous equalities imply N   j=0

Ij



N 

|ut (λj , t)|2 dt =

j=0 j∈2N N 

=

Ij



j=0 j∈2N

=

N   j=0

I˜j

Ij

|ut (0, t)|2 dt +



N  j=0 j∈2N+1

|ut (0, t)|2 dt +

N  j=0 j∈2N+1

Ij

 Ij

|ut (π, t)|2 dt |ut (0, t − π)|2 dt

|ut (0, t)|2 dt;

(15) where in the last equality we have used the 2π-periodicity of the map t → ut (0, t). Sufficiency of (11). If (11) holds true, then by (15) we have N   j=0

2π 0

|ut (λj , t)|2 dt =

N   j=0

 ≥

I˜j

|ut (0, t)|2 dt

˜ ∪N j=0 Ij

|ut (0, t)|2 dt =

 S1

|ut (0, t)|2 dt = 2( u0 2V +  u1 2W );

where the last equality follows by Proposition 3. Necessity of (11). We will prove this implication by contradiction. Indeed, sup˜ pose that the exact observability inequality (14) holds true and |S 1 \∪N j=0 Ij | = 0. Then by (15), we obtain C( u0 2V +  u1 2W ) ≤

N 

j=0

=

Ij

N 

j=0

|ut (λj , t)|2 dt

I˜j

(16)

 |ut (0, t)| dt ≤ (N + 1) 2

|ut (0, t)| ; 2

˜ ∪N j=0 Ij

˜ where in the equality we have used (15). By assumption, the set S 1 \ ∪N j=0 Ij has non empty interior part U , such that |U | = 0. Applying the Lemma 1 below to U ,  there exists a pair of initial data (u0 , u1 ) ∈ V ×W such that ∪N I˜j |ut (0, t)|2 = 0 j=0 and  u0 2V +  u1 2W = 0. This choice of (u0 , u1 ) in (16) leads to an inconsistency. The proof is now complete. The next Lemma was the key ingredient in the proof of the necessity part in Theorem 2. Lemma 1. Let U ⊂ S 1 be an non empty open set. Then, there exists a choice of non-trivial initial data (u0 , u1 ) ∈ V × W such that the unique mild solution u to (12) verifies supp ut (0, ·) ⊂ U.

Observability for the Wave Equation

59

As done in [1], for technical convenience, in the following we freely identify the measure space (S 1 , | · |) with ([0, 2π); m); where m denotes the one dimensional Lebesgue measure. Proof. Without loss of generality we can suppose that π ∈ / U (otherwise we of π). can replace U with U \ V , where V is a sufficiently small neighborhood π Let φ ≡ 0 be a Cc∞ -function on (0, 2π) such that supp φ ⊂ U and 0 φ(t) dt =  2π φ(t) dt = 0. π It is enough to prove that there exists a pair (u0 , u1 ) ∈ V × W such that the mild solution of (12) u satisfies ut (0, ·) = φ (·); where (·) = d(·)/dt. By the Fourier series representation formula (13), we obtain ut (0, t) =





 −kˆ uk0 sin(kt) + u ˆk1 cos(kt) = u0 (t) + u1 (t),

t ∈ (0, π).

k=1

Moreover, for t ∈ (π, 2π), setting t = 2π − τ with τ ∈ (0, π), one has ut (0, 2π −τ ) =





 −kˆ uk0 sin(k(2π − τ )) + u ˆk1 cos(k(2π − τ )) = −u0 (τ )+u1 (τ ).

k=1

Then ut (0, ·) = φ is equivalent to φ (t) = u0 (t) + u1 (t),

φ (2π − t) = −u0 (t) + u1 (t),

t ∈ (0, π).

We immediately obtain 1  (φ (t) − φ (2π − t)) , 2 1 u1 (t) := (φ (t) + φ (2π − t)) , 2 for t ∈ (0, π). Moreover, Eq. (17) implies  1 t  1 u0 (t) := (φ (s) − φ (2π − s)) ds = (φ(t) + φ(2π − t)) . 2 0 2 u0 (t) =

(17) (18)

(19)

It remains to prove that u0 and u1 defined in (18)–(19) belong respectively to V  π and W. By the smoothness assumption on φ, it is enough to show that ui (t)dt = 0 for i = 0, 1. Indeed, 0  π  1 π u0 (t)dt = (φ(t) + φ(2π − t)) dt = 0, 2 0  0π  π

π 1 1 

π

= 0, +φ − φ u1 (t) dt = (φ (t) − (φ(2π − t)) ) dt = 2 2 0 2π 0 0 π where we have used the assumption on φ and on U , namely: 0 φ(t) dt =  2π φ(t) dt = 0, supp φ ⊂ U and π ∈ U . π By construction, it is clear that u0 and u1 defined in (18)–(19) verify the statement of the Lemma and the proof is complete. Remark 4. The proof of Lemma 1 is similar to the one of [1, Lemma 3.2], although the problems under consideration have different boundary conditions.

60

3

A. Agresti et al.

Applications - One Dimensional Case

As explained in Sect. 2, Theorem 2 leads to the necessary and sufficient condition of the one dimensional case for Dirichlet boundary condition. This implies that, the results proven in [1, Sect. 3] for the one dimensional Dirichlet problem still holds for the Neumann case. To give a first elementary application of Theorem 2, we prove the following result which is a special case of [1, Corollary 1.8]. Corollary 2. Let N = 1. If t0 = 3/2π and t1 ≥ 3π (resp. t1 < 3π) then the system (12) is (resp. is not) alternating observable. Proof. First assume that t1 ≥ 3π; it is enough to consider t1 = 3π. By (10) and the identification (S 1 , | · |) = ([0, 2π); m), one has I˜0 = (0, 3/2π) and I˜1 = {r(t − π) : t ∈ I1 } = (π/2, 2π). This implies |S 1 \ I˜0 ∪ I˜1 | = 0. The condition (11) in Theorem 2 is verified and this concludes the proof for t1 ≥ 3π. In the case t1 < 3π then I˜1 = (π/2, 2π − t1 ); see Fig. 1. This implies that S 1 \ (I˜0 ∪ I˜1 ) = (2π − t1 , 2π) and |S 1 \ (I˜0 ∪ I˜1 )| > 0. Again an application of Theorem 2 concludes the proof.

Fig. 1. Covering of S 1 in Corollary 2. The observability sets I˜j are pictured in red.

The next Corollary in not contained in [1] and gives rise to a new interesting problem; see Remark 5 below. Corollary 3. Let {tj }j=−1,...,N be a finite sequence such that |tj−1 − tj | = π if j is even, otherwise |tj−1 −tj | = π/2. If N ≥ 4, then the system (12) is alternating observable.

Observability for the Wave Equation

61

Proof. The proof will be done by verifying the necessary and sufficient condition (11). It is enough to consider the case N = 4. With our notations, I˜0 = {r(t) : t ∈ (0, π)} = (0, π), I˜1 = {r(t − π) : t ∈ (π, (3/2)π)} = (0, π/2), I˜2 = {r(t) : t ∈ ((3/2)π, (5/2)π)} = ((3/2)π, 2π) ∪ (0, π/2), I˜3 = {r(t − π) : t ∈ ((5/2)π, 3π)} = (π/2, π), I˜4 = {r(t) : t ∈ (3π, 2π)} = (π, 2π); where we have used the identification of (S 1 , |·|) = ([0, 2π), m). Up to a 0-measure 1 1 ˜ ˜ ˜ set, one has ∪N j=0 Ij = S (more interestingly, it is also true that |S \ I0 ∪ I4 | = 0; see Fig. 2).

Fig. 2. Covering of S 1 in Corollary 3. The observability sets I˜j are pictured in red.

Remark 5. In applications which require alternating controls it may be necessary to have different lengths of control intervals on each endpoint. In the case that, the action of the control on the endpoint x = 0 (resp. x = π) is π (resp. π/2) Corollary 3 provides a positive answer to the controllability problem. In general one could ask for alternating observability under the condition |tj−1 − tj | = T0 , if j is even,

|tj−1 − tj | = αT0 , otherwise,

(20)

where T0 < 2π and α > 0 are fixed constants. One could fix T0 and regulate α in accordance with the necessity of the structure, for instance if the endpoint x = 0 is structurally more resistant than x = π one may set α  1. The problem for α = 1 is solved in [1, Sect. 3]. It is of interest to note that, in case α = 1, there are situations in which the controllability cannot be reached for any time, for more on this see [1].

62

A. Agresti et al.

Remark 6. By Theorem 1, the condition (11) is also necessary for the alternating observability for the one dimensional Dirichlet problem. So the result of Corollaries 2–3 are also valid for observability of the Dirichlet problem (2).

4

Multidimensional Dirichlet Problem

We begin with a key technical result in view of our main alternating observability Theorem 3. Its name is due to the use of the function (x − ξ) · ∇u below, which is called a multiplier. Lemma 2 (A multiplier identity). Let u ∈ C(R+ ; H01 (Ω))∩C 1 (R+ ; L2 (Ω)) be the mild solution corresponding to initial data (u0 , u1 ) ∈ H01 (Ω)×L2 (Ω). Then for each 0 ≤ s < τ and ξ ∈ Rd , we have   1 τ (x − ξ) · ν |∂ν u|2 dΓ (x) dt = 2 s ∂Ω τ  d−1 u dx + (τ − s)E0 ; (21) ut ∇u · (x − ξ) + 2 Ω s 

where E0 is the initial energy of the system as defined in (7) and [f (t)]ss := f (s ) − f (s ). Proof. For the sake of clarity, we split the proof into three steps. Step 1. We first suppose that u0 , u1 are regular enough that the solution to (2) is a classical solution, owing to Proposition 1. Assume then u0 ∈ H 2 (Ω)∩H01 (Ω) and u1 ∈ H01 (Ω) so that uxk ,xj , for 1 ≤ k, j ≤ d, belong to C(R+ ; L2 (Ω)), as needed below. Indeed in this case, u solves the wave equation almost everywhere on Ω ×R+ . Then we can multiply the equation by (x − ξ) · ∇u and integrate over Ω × (s, t). We get   τ   τ ((x − ξ) · ∇u) utt dx dt − ((x − ξ) · ∇u) Δu dx dt = 0. (22) s

Ω

Ω

s

We integrate by parts the first term on the LHS in (22) and obtain   Ω

s

τ

 (x − ξ) · ∇u utt dx =

Ω

τ (x − ξ) · ∇u ut dx

 =

Ω

s

τ (x − ξ) · ∇u ut dx

  −

s

Ω

d + 2

τ s

(x − ξ) ·

  Ω

s

τ

1 ∇((ut )2 )dx dt 2

(ut )2 dx dt;

in the last step, we have used the divergence’s theorem, ut = 0 on ∂Ω × (s, τ ) and ∇ · (x − ξ) = d.

Observability for the Wave Equation

63

Again by appealing to divergence’s theorem we transform the second term in (22) as   τ   τ ((x − ξ) · ∇u) · Δu dx dt = ∂ν u(x − ξ) · ∇u dΓ dt Ω s ∂Ω s   τ − ∇((x − ξ) · ∇u) · ∇u dx dt. Ω

s

Next we note that ((uxj )2 )xk = 2uxj ,xk and ∇(xk − ξk )xj = δk,j for all 1 ≤ k, j ≤ d (where δk,j is the Kronecker’s delta); thus the following holds pointwise 1 ∇((x − ξ) · ∇u) · ∇u = |∇u|2 + (x − ξ) · ∇(|∇u|2 ). 2 Routine calculations yield   τ   τ   τ 1 2 d−2 ((x−ξ)·∇u)·Δu dx dt = (x−ξ)·ν|∂ν u| + |∇u|2 dx dt. 2 ∂Ω s 2 Ω s Ω s Collecting the equalities above, together with E(t) ≡ E0 (cfr. Corollary 1), we conclude  τ   τ 1 2 (x − ξ) · ν|∂ν u| = (x − ξ) · ∇u ut dx 2 ∂Ω s Ω   τs d−1 +(t − τ )E0 + |ut |2 − |∇u|2 dx dt.(23) 2 Ω s Step 2. Next we treat the last term on LHS in (23). Again noting that under our assumptions in Step 1, the wave equation is solved by u a.e. on Ω × (s, τ ), we multiply it by u, integrate over Ω × (s, τ ) and use Green’s formulas (together with u = 0 on ∂Ω × (s, τ )), to find  τ   τ   τ u(utt − Δu) dx dt = uut dx − |ut |2 − |∇u|2 dx dt. (24) 0= Ω

s

Ω

s

Ω

s

Step 3. Equations (23)–(24) imply (21) for classical solutions. In order to remove the extra regularity assumptions, choose sequences {u0,k : k ∈ N} ⊂ H 2 (Ω) ∩ H01 (Ω) and {u1,k : k ∈ N} ⊂ H01 (Ω), such that u0,k → u0

in H01 (Ω),

u1,k → u1

in L2 (Ω).

Thus we take the limit as k → ∞ in the identity (21) written for the solution uk corresponding to initial data (u0,k , u1,k ), and prove the claim. The family {Γi }i=0,...,N cannot be chosen arbitrarily, as we already remarked in Sect. 1. We select this family in the following fashion.

64

A. Agresti et al.

1. Choose an arbitrary family {xi }i=0,...,N ⊂ Rd . 2. For each i = 0, . . . , N and each j = 0, . . . , N − 1, define Ri = max{|x − xi | | x ∈ Ω},

Rj+1,j = |xj+1 − xj |.

(25)

3. For each i = 0, . . . , N , define Γi = {x ∈ ∂Ω | (x − xi ) · ν > 0}.

(26)

Theorem 3 (Alternating Observability). With the notation in (25)–(26), assume that T ∈ R satisfies T > RN +

N −1

Ri+1,i + R0 .

(27)

i=0

Then the system is alternating observable in time T , in the sense of Definition 1 for Σ as in (9). Proof. We appeal to (21) written for ξ = xi , s = ti−1 and τ = ti . We add over i = 0, . . . , N all these equalities to infer 1

2 i=0 N



ti



ti−1

(x − xi ) · ν |∂ν u|2 dΓ (x) dt = Γi

N 

i=0

Ω

ut

d−1 u dx ∇u · (x − xi ) + 2

ti + T E0 ,

(28)

ti−1

N since i=0 (ti − ti−1 ) = T , where again T = tN . We employ below the following notation us (x) := u(x, s),

s ∈ [0, T ],

ust (x) := ut (x, s),

x ∈ Ω.

We may cancel some terms in the sum on the RHS of (28), to get N   i=0



Ω

N −1   i=0

Ω

ut

    ti  d−1 d−1 T ∇u · (x − xi ) + ∇uT · (x − xN ) + dx u dx u = uT t 2 2 Ω ti−1

t  uti ∇uti · [(x − xi+1 ) − (x − xi )] dx −

 Ω

u1

  d−1 u0 dx. ∇u0 · (x − x0 ) + 2

(29) We postpone the proof of the following inequalities (30)–(32) to Lemma 3 below: (30)–(31) and (32) follow respectively by (i) and (ii) of the Lemma. In fact, the first and last term on the RHS of (29) can be estimated by

 d−1 T

u uTt ∇uT · (x − xN ) + (30) dx ≤ RN E0 ,

2 Ω

 d−1

u0 dx ≤ R0 E0 , u1 ∇u0 · (x − x0 ) + (31)

2 Ω

Observability for the Wave Equation

while the other terms on the RHS of (29) are majorised by

  

utti ∇uti · (xi − xi+1 ) dx ≤ Ri+1,i E0 , i = 0, . . . , N.

65

(32)

Ω

Due to our definitions of Γi in (26) and of Ri in (25), we conclude at once that for i = 0, . . . , N ,  ti   ti  Ri |∂ν u|2 dΓ dt ≥ (x − xi ) · ν |∂ν u|2 dΓ dt. (33) ti−1

Γi

ti−1

∂Ω

By appealing to Eqs. (28), (29) and to estimates (30)–(33), we get max Ri

i=0,...,N



N  i=0

ti

ti−1

 |∂ν u|2 dΓ dt ≥ 2(T −RN −RN,N −1 −· · ·−R1,0 −R0 )E0 . Γi

Finally, by assumption T > RN + a constant C = CT > 0, such that N 

i=0

ti

ti−1

N −1 i=1

(34) Ri+1,i + R1 ; therefore there exists

 |∂ν u|2 dΓ dt ≥ CT E0 ,

(35)

Γi

amounting to the observability inequality, as given in Definition 1. Lemma 3. Every mild solution u ∈ C(R+ ; H01 (Ω)) ∩ C 1 (R+ ; L2 (Ω)) of (2) satisfies (i) For each ξ ∈ Rd and s ∈ R, then

 d−1 s

s s u dx ≤ Rξ E0 , ut ∇u · (x − ξ) +

2 Ω where Rξ = max{|x − ξ| : x ∈ Ω}. (ii) For each ξ, η ∈ Rd and s ∈ R+ , then



ust (∇us · (ξ − η)) dx ≤ Rξ,η E0 ,

Ω

where Rξ,η = |ξ − η|. Proof. (i) On applying Chauchy-Schwarz inequality, we obtain

 d−1 s d−1 s

s s u u L2 u · (x − ξ) + ∇u dx ≤ ust L2  ∇us · (· − ξ) +

t 2 2 Ω Rξ 1 d−1 s 2  ust 2L2 + u L2 ; (36) ≤  ∇us · (· − ξ) + 2 2 Rξ 2

66

A. Agresti et al.

here L2 := L2 (Ω). Moreover  ∇us ·(· − ξ) +

d−1 s 2 u L2 2

= ∇us · (· − ξ) 2L2 +(d − 1)(∇us · (· − ξ), us )L2 +

(d − 1)2  us 2L2 . 4

On integrating by parts the second term on the RHS of the previous equation, we transform it as follows, since u = 0 on ∂Ω, 

Ω

(∇us us ) · (x − ξ) d x =

1 2



Ω

 1 (us )2 (∇ · (x − ξ)) d x 2 Ω d = −  us 2L2 . 2

(∇(us )2 ) · (x − ξ) d x = −

This yields   (d − 1)2 d−1 s 2 d s 2  ∇u · (· − ξ) + u L2 = ∇u · (· − ξ) L2 + −(d − 1) +  us 2L2 2 2 4 s

≤ ∇us · (· − ξ) 2L2 ≤ Rξ2  ∇us 2L2 ,

owing to

d2 1 d(d − 1) (d − 1)2 + = − + < 0, d ≥ 1. 2 4 2 4 We go back to (36), recalling our results above, and find 

d−1 s 1

Rξ s s u dx ≤  ust 2L2 + ut ∇u · (x − ξ) + Rξ2  ∇us 2L2

2 2 2 R ξ Ω Rξ ( ust 2L2 +  ∇us 2L2 ) = Rξ E0 ; = 2 −

the last equality is a consequence of energy conservation (cf. Corollary 1). (ii) Again by means of Cauchy-Schwarz inequality we have     ust ∇us ·(ξ −η) dx ≤ ust L2  ∇us ·(ξ −η) L2 ≤ Rξ,η  ust L2  ∇us L2 ≤ Rξ,η E0 ;  Ω

where we used energy conservation in the last inequality. Remark 7. If we set N = 0, Theorem 3 coincides with Theorem 6.1, Chap. 1 of [14]. So, Theorem 3 can be seen as an extension of such Theorem. In Subsect. 4.1 we will refer as “the fixed case” to the case N = 0. The motivation is as follows: if N = 0 then Σ = Γ × (0, T ) for some Γ ⊂ ∂Ω and the section of Σ at each time t ∈ (0, T ) (i.e. {x ∈ ∂Ω : (x, t) ∈ Σ}) is constant.

Observability for the Wave Equation

67

The Role of {ti }i=−1,...,N .

4.1

Some remarks on the role of the family {ti }i=−1,...,N are perhaps in order, given N that in Theorem 3 only the sum T = i=0 (ti − ti−1 ) is relevant. In particular we would like to point out that if |tj − tj−1 | is not too large, in a sense which we make more rigorous below, then Theorem 3 does not follows by the fixed case (i.e. N = 0 see Remark 7). The fixed case of Theorem 3 ensures that for any x0 ∈ Rd ,   T |∂ν u|2 dΓ dt ≥ C( u0 2H 1 (Ω) +  u1 2L2 (Ω) ), Γ0

0

0

for some C > 0 independent of (u0 , u1 ) (here Γ0 is defined as in (26)) if T > 2 max{|x − x0 | | x ∈ Ω}. Suppose that for some j = 0, . . . , N , one has |tj − tj−1 | > 2 max{|x − xj | | x ∈ Ω},

(37)

then by the previous considerations, we have the exact observability for the ˜ = Γj × (tj−1 , tj ) (here we have used the invariance of the observability set Σ wave equation under translation in time). Under the hypothesis of Theorem 3, if the condition in (37) holds true for some j, we have   tj  |∂ν u|2 dΓ dt ≥ |∂ν u|2 dΓ dt ≥ C( u0 2H 1 (Ω) +  u1 2L2 (Ω) ), (38) Σ

Γ

tj−1

0

since by definition Σ ⊃ Γj × (tj−1 , tj ) for any j = 0, . . . , N . Thus, under the hypothesis (37) one can avoid to observe on other time intervals; thus, under the hypothesis (37), we can use only the fixed case of Theorem 3 (i.e. N = 0) which coincides with a known result, see Remark 7. If (37) does not hold for any j one has to use the full strength of Theorem 3 to obtain observability. Remark 8. In the terminology of Remark 2, Theorem 3 allows us to apply the Hilbert uniqueness method (HUM) to obtain exact controllability for the problem (4) (see [11,14]). We do not dwell on this issue here for the sake of brevity, but Theorem 3 and HUM yield a control minimizing the L2 norm, a property which can be useful in applications.

5

Applications - Multidimensional Case

In this section we discuss some applications of Theorem 3 in order to show its potentiality. Here we discuss triangular type domains Ω, which are new with respect to the other works [1,2] where we consider domains with different shape. To begin, we denote by T ⊂ R2 the triangular domain given by √ √ T := {(x, y) ∈ R2 : y ≥ 0, y ≤ − 3(x − 1/2), y ≤ 3(x + 1/2)}. Moreover, for i = 0, 1, 2, we denote by Li the side of T as indicated in Fig. 3.

68

A. Agresti et al.

Corollary 4 (1-time Alternating). Let T > 3 be a real number. Then the system (4) is alternating observable for Ω = T , t0 an arbitrary elements of (0, T ), and 1  Σ= Lj × (tj−1 , tj ), j=0

where t−1 := 0 and t1 := T .

Fig. 3. The triangle set T .

√ Proof. Set x0 := (0, 3/2) and x1 := (−1/2, 0). Then the claim follows by Theorem 3. Indeed, with N = 1 and the previous choice of x0 , x1 , one has Γi = Li for i = 0, 1 (see Fig. 4 and the definition of Γi in (26)) and R0 = R0,1 = R1 = 1 (see (25)).

√ Fig. 4. Construction of Γ0 for Ω = T and x0 = ( 3/2, 0).

Remark 9. Note that in Corollary 4 there is no assumption on the value t0 ∈ (0, T ); due to the discussion in Subsect. 4.1 it is clear that Corollary 4 does not follow trivially by known results if t0 < 2.

(39)

Observability for the Wave Equation

69

Similar remark apply to Corollaries 5–6 below. The interested reader can also see [2, Sect. 4]. The N -times alternating version of the previous Corollary is as follows: Corollary 5 (N -time Alternating). Let N be any integer, and let T > (N + 1) be a real number. Then the system (4) is alternating observable for Ω = T , {ti }i=0,...,N −1 any increasing finite subfamily of (0, T ), and Σ=

2 



Li × (tj−1 , tj ),

i=0 {j≤N, j≡i(mod. 3)}

where t−1 := 0 and tN := T (Fig. 5).

Fig. 5. Construction of Γi in Corollary 5.

Proof. It is similar to the proof of Corollary 4. For each i ≤ N , one has to choose √ xi = (−1/2, 0) if i ≡ 0(mod. 3), xi = ( 3/2, 0) if i ≡ 1(mod. 3) or xi = (1/2, 0) if i ≡ 2(mod. 3). The claim follows immediately by Theorem 3. At expense of an increase in the subset of observation at each time, one can decrease the time T required by observability, still maintaining the condition Γi = ∂T verified. This is the content of the following result; for the sake of simplicity we only state it for the 1-times alternating observation; the N -times case follows similarly with minor modifications. √ Corollary 6 (1-time Alternating - II). Let T > 1/4 + 3 be a real number. Then the system (4) is alternating observable for Ω = T , t0 an arbitrary elements of (0, T ), and Σ = (L0 ∪ L1 ) × (0, t0 ) ∪ (L0 ∪ L2 ) × (t0 , T ). Proof. As before, we derive √ the claim trough an √ application of Theorem 3. Indeed, set N = 1, x0 =√(3 3/4, −1/4), x1 = (3 3/4, 1/4). It is routine to check that R0 = R1 = 3/2 and R0,1 = 1/4; moreover Γ0 = L0 ∪ L1 and Γ1 = L0 ∪ L2 (see Fig. 6).

70

A. Agresti et al.

Fig. 6. Construction of Γi in Corollary 6.

Remark 10. Note that Corollaries 4 and 6 provide two observability (and also controllability) result for the same domain T . The difference is clear; in the first corollary the observability set at each fixed time is a side Li for i = 0, 1 but we choose T√ > 3 to ensure the alternating observability. In the second case, T > 1/4 + 3 < 3 but at each time the observability set is the union of two sides.

6

Numerical Simulations

The numerical implementation of the exact controllability problem (4) is a tough problem even in the fixed boundary case, i.e. Γ (t) ≡ Γ . For the latter case, the interested reader can consult [5,8,15] and the references therein. In the variable control case, the difficulties increase and we do not tackle this difficult problem here. Instead, we discuss a very simple method to simulate the controllability problem in the one dimensional case, using a kind of reverse technique. More specifically, we first define a control v of the controlled wave equations (4) (see also (42) below) and then find the initial data (w0 , w1 ) which are controlled by v (see (41)). The control function v is found using HUM method, which says that v = ∂x u (up to a sign) where u solves a suitable homogeneous problem (2). After that, one can formally compute explicitly the controlled solution w and then approximate the behaviour of w thanks to the obtained formula. We believe that repeating the same procedure for a large number of initial data and using the linearity of the Eq. (4), one can construct a very large class of controlled problems which can be sufficient for some applications. In this scenario, the controllability results provided in Sects. 2–4 must be kept in mind to be sure that the system is controllable in the desired time T . With the notation of Subsect. 1.1 we consider the case N = 1 and t0 = 3π/2 and T := t1 = 3π. Due to Corollary 2 and Remark 6 the system (4) is alternating observable. To begin we note that the function u(x, t) = sin(2t) sin(2x) is a solution of the homogeneous problem 2 with initial data u0 (x) = sin(2x) and u1 (x) = 0. With simple computation, one has ux (0, t) = ux (π, t) = 2 sin(2t),

t ∈ (0, 3π).

Observability for the Wave Equation

71

Then, by an easy application of the HUM method (see [14]), one has that v defined by v(t) = −ux (0, t), t ∈ (0, t0 ) , v(t) = ux (π, t), t ∈ (t0 , 3π) ,

(40)

is the control for the initial data [w(x, 0), wt (x, 0)]T , where w satisfies ⎧ ⎪ wtt − wxx = 0, ⎪ ⎪ ⎪ ⎪ ⎪ ⎨w(0, t) = v(t), w(π, t) = 0, w(0, t) = 0, w(π, t) = v(t), ⎪ ⎪ ⎪ w(x, 3π) = 0, ⎪ ⎪ ⎪ ⎩w (x, 3π) = 0, t

(x, t) ∈ (0, π) × (0, 3π), t ∈ (0, t0 ) , t ∈ (t0 , 3π) , x ∈ (0, π), x ∈ (0, π).

(41)

(42)

The next goal is to find an explicit solution to the problem (42) via a Fourier expansion. To do so, we set  x x η = −2 sin(2t) 1 − χ[0,t0 ] (t) + χ[t0 ,3π] (t) , π π where χI is the indicator function of I ⊂ [0, 3π]. Note that η = w at the boundary, then defining ξ := w − η, one has ⎧ (x, t) ∈ (0, π) × (0, 3π), ⎪ ⎪ξtt − ξxx = −ηtt , ⎪ ⎨ξ(0, t) = ξ(π, t) = 0, t ∈ (0, 3π), ⎪ξ(x, 3π) = −η(x, 3π) = 0, x ∈ (0, π), ⎪ ⎪ ⎩ ξt (x, 3π) = −ηt (x, 3π) = − π4 x, x ∈ (0, π). Since ξ satisfies homogeneous boundary conditions, then ξ can be written as ξ=−



1 j=1

j

sin(j(3π − t)) sin(jx)ˆ ηt (·, 3π)(j)  −

∞ 3π

t

ξt =



j=1

1 sin(j(τ − t)) sin(jx)ˆ ηtt (·, τ )(j) dτ, j

(43)

cos(j(3π − t)) sin(jx)ˆ ηt (·, 3π)(j)

j=1

 + t

∞ 3π

j=1

cos(j(τ − t)) sin(jx)ˆ ηtt (·, τ )(j) dτ,

(44)

72

A. Agresti et al.

where, for τ ∈ (0, 3π),  2 π ηˆt (·, τ )(j) = ηˆt (x, τ ) sin(jx) dx, π 0

ηˆtt (·, τ )(j) =

2 π



π

ηˆtt (x, τ ) sin(jx) dx. 0

(45) For the study of the convergence of the previous integrals and sums in suitable function spaces we refer to [16]. In order to obtain a flavour of the behaviour of w, one can stop the sums in (43)–(44) at N sufficiently large and we plot the results (in Fig. 8, we have set N = 250). Moreover, standard computations show that w(x, 0) = 0, instead the behaviour of wt (x, 0) is given in Fig. 7.

Fig. 7. The initial velocity wt (x, 0).

Comments on the Simulation. Figure 8 plots the behaviour of the controlled function w and its derivative wt . More specifically: – Figure 8a plots the behaviour of the function w in the domain (0, π) × (0, T ). The reader can recognize the action of the control v defined in (40) on the function w through the endpoints x = 0 and x = π.

Observability for the Wave Equation

(a) w(x, t) on the domain (0, π) × (0, 3π).

(b) wt (x, t) on the domain (0, π) × (0, 3π).

Fig. 8. Figure (a) (resp. (b)) represents the graph of w (resp. wt ).

73

74

A. Agresti et al.

– Figure 8b plots the behaviour of the function wt in the domain (0, π) × (0, T ). Note that, the plot of wt appears rougher than the one of w, this is due to the fact that the greatest regularity which we can ensure is wt ∈ C([0, T ]; H −1 (0, π)); see Remark 2. The reader can appreciate the action of the control v defined in (40) on the function wt through the endpoints x = 0 and x = π.

7

Conclusions

In this paper we have studied the exact observability for the wave equations with observation set varying in time. For the one dimensional Neumann problem, we have found an equivalent condition to the exact observability. Instead, without any assumption on the dimension, for the Dirichlet problem we have worked out a sufficient condition which ensures the exact observability for T sufficiently large and for a suitable subset of observation Σ. With respect to [2] there are some extra contents. Especially, the one dimensional Neumann case and the applications of Theorem 3 are completely new. Moreover, we have included a numerical simulation of the controllability problem (4) in a one dimensional setting. Our results can be helpful in engineering applications in which one has to control the evolution of a structure (whose dynamics is governed by the wave equation) by means of an action on a portion of the boundary (see [4–6]), but for structural reasons, one cannot act for a long time on the same portion of the boundary, so that the switch of the control is necessary. Finally, it is worthwhile to notice that the multidimensional Neumann case is still an open problem when the observability set varies in time; the technique used in Sect. 4 does not seem applicable.

References 1. Agresti A, Andreucci A, Loreti P (2017) Alternating and variable controls for the wave equation, to appear in ESAIM: COCV, dos: https://doi.org/10.1051/cocv/ 2019017 2. Agresti A, Andreucci D, Loreti P (2018) Variable support control for the wave equation - a multiplier approach. In: Proceedings of the 15th international conference on informatics in control, automation and robotics - volume 1: ICINCO, INSTICC, SciTePress, pp 33–42. https://doi.org/10.5220/0006832600430052 3. Bardos C, Lebeau G, Rauch J (1992) Sharp sufficient conditions for the observation, control, and stabilization of waves from the boundary. SIAM J Control Optim 30(5):1024–1065 4. Carcaterra A, Graziani G, Pepe G, Roveri N (2014) Cable oscillations in a streamflow: art in the Tiber. In: Cunha A, Caetano E, Ribeiro P, M¨ uller G (eds) Proceedings of the 9th international conference on structural dynamics 5. Castro C, Micu S, M¨ unch A (2008) Numerical approximation of the boundary control for the wave equation with mixed finite elements in a square. IMA J Numer Anal 28(1):186–214

Observability for the Wave Equation

75

6. Dessi D, Carcaterra A, Diodati G (2004) Experimental investigation versus numerical simulation of the dynamic response of a moored floating structure to waves. Proc Inst Mech Eng Part M: J Eng Marit Environ 218(3):153–165 7. Evans LC, Gariepy RF (2015) Measure theory and fine properties of functions. CRC Press, Boca Raton 8. Glowinski R, Lions JL (1995) Exact and approximate controllability for distributed parameter systems. Acta Numer 4:159–328 9. Grisvard P (2011) Elliptic problems in nonsmooth domains, vol 69. SIAM, Philadelphia 10. Komornik V (1994) Exact controllability and stabilization: the multiplier method, vol 36. Masson, Paris 11. Komornik V, Loreti P (2005) Fourier series in control theory. Springer, Heidelberg 12. Lasiecka I, Triggiani R (1983) Regularity of hyperbolic equations under L2 (0, T ; L2 (Γ ))-Dirichlet boundary terms. Appl Math Optim 10(1):275–286 13. Le Rousseau J, Lebeau G, Terpolilli P, Tr´elat E (2017) Geometric control condition for the wave equation with a time-dependent observation domain. Anal PDE 10(4):983–1015 14. Lions JL (1988) Contrˆ olabilit´e exacte, perturbations et stabilisation de syst`emes distribu´es. tome 1. RMA 8 15. M¨ unch A (2005) A uniformly controllable and implicit scheme for the 1-D wave equation. ESAIM: Math Model Numer Anal 39(2):377–418 16. Taylor M (2011) Partial differential equations I: basic theory. Applied mathematical sciences, vol 115 17. Tucsnak M, Weiss G (2009) Observation and control for operator semigroups. Springer, Heidelberg

Nonprehensile Manipulation Control and Task Planning for Deformable Object Manipulation: Results from the RoDyMan Project Fabio Ruggiero1(B) , Jung-Tae Kim1 , Alejandro Gutierrez-Giles2 , Aykut C. Satici3 , Alejandro Donaire4 , Jonathan Cacace1 , Luca Rosario Buonocore5 , Giuseppe Andrea Fontanelli1 , Vincenzo Lippiello1 , and Bruno Siciliano1 1

CREATE Consortium and Department of Electrical Engineering and Information Technology, University of Naples Federico II, Via Claudio 21, 80125 Naples, Italy [email protected] 2 Center for Research and Advanced Studies of the National Polytechnic Institute, Av. IPN 2508, San Pedro Zacatenco, 07300 Mexico City, Mexico 3 Mechanical and Biomedical Engineering, Boise State University, 1910 University Drive, Boise 83725, ID, USA 4 School of Engineering, The University of Newcastle, University Drive, Callaghan, NSW 2308, Australia 5 CERN, Router de Meyrin 385, Geneve 23, Switzerland

Abstract. This chapter aims the broadcasting of the results achieved by the RoDyMan project about the task planning manipulation of deformable objects, and the nonprehensile manipulation control. The final demonstrator of the project is a pizza-making process. After an introduction to the general topic of nonprehensile manipulation, the mechatronic design and the high-level software architecture are described. Then, the smoothed particle hydrodynamic formulation is briefly introduced, along with the description of a detection method for a deformable object. The task planning for stretching a modelling clay, emulating the pizza dough, is sketched. After, the problematic control objective is split into several nonprehensile motion primitives: holonomic and nonholonomic rolling, friction-induced manipulation, and tossing are the described primitives. This chapter highlights the achievements reached so far by the project, and pave the way towards future research directions. Keywords: Nonprehensile dynamic manipulation Perception · Robot planning · Robot control

1

· Deformable object

Introduction

Last decades saw a dramatic improvement in powerful technology for sensing and actuation, leading to a significant advancement in robotic manipulation c Springer Nature Switzerland AG 2020  O. Gusikhin and K. Madani (Eds.): ICINCO 2018, LNEE 613, pp. 76–100, 2020. https://doi.org/10.1007/978-3-030-31993-9_4

Nonprehensile Manipulation Control and Task Planning

77

capabilities. However, the human skills in manipulating a massive range of objects in a dexterous and fast way are still far off from being replaced by robots. Roughly speaking, manipulating an object means to change its configuration regarding position and orientation. Such a change can be performed in many ways on the basis of the performed grasp. In case the object is entirely caged between the fingertips or the palm, supposing infinite power for the muscles, the hand can resist any external disturbance applied to the object. If even infinitesimal motions of the object are limited by the hand, the grasp is said to be in a form closure configuration, otherwise it is referred to as a force closure configuration [48]. Bilateral constraints appear in both closures. On the other hand, if only unilateral constraints are involved in the grasp, the hand can resist only external disturbances applied in specific directions. A simple example is given by an object pushed on a plane: the fingertip can resist only external disturbances applied on the object and counteracting the pushing direction; it is instead not able to resist to an external force lifting the object. In this case, nonprehensile manipulation occurs. Nonprehensile manipulation can be endorsed as dynamic when the dynamics of both the object and the manipulating hand are crucial to accomplishing the task. Some manipulation actions must be inevitably performed in a prehensile way, like screwing or unscrewing a bottle cap, while other manipulation actions must be performed in a nonprehensile way, like carrying a glass of liquid on a tray. Some other manipulation tasks can be instead carried out either in a prehensile or in a nonprehensile way, while other more complex tasks may need both ways to be finalised. An example may be given by a juggler that has to throw a ball with one hand (nonprehensile) while the other hand must grab another ball (nonprehensile for a dynamic catch, prehensile when the ball is firmly grasped between the fingertips and the palm). Nonprehensile manipulation benefits of several advantages as highlighted in [36]. The workspace of the robot can be increased since unilateral constraints allow breaking the contact with the manipulated part (e.g., the robot can throw and dynamically catch the object). Since it is not anymore required having a standard gripper, the structure of the robot can be simplified because any available surface can be employed to manipulate the object through suitable contact forces. Finally, since form and force closure grasps are not involved, the dexterity of the manipulation task is augmented because it is possible to control more degrees of freedom of the object than the ones of the manipulator itself. Nevertheless, these advantages go to the detriment of an increase of the complexity of the planning and the control design. 1.1

The RoDyMan Project

This chapter illustrates the results achieved by the RoDyMan project1 in the development of a service robot that can manipulate rigid, elastic, and soft objects also in a nonprehensile way. The sought goals are split into three main research challenges, which outline the chapter too. 1

http://www.rodyman.eu.

78

F. Ruggiero et al.

– Robot Design. A mobile robotic platform is built from scratch to test the nonprehensile dynamic manipulation tasks addressed by the project. – Modelling, Perception, and Task Planning for Deformable Objects. Accurate modeling of deformable objects is essential to plan and simulate their motion and design the related planning and control tasks properly. Moreover, the real-time requirements posed by robot interaction need an accurate tracking of these objects. – Nonprehensile Manipulation Planning and Control. The project wants to advance the state of the art in controlling rigid objects in a nonprehensile way and starting the investigation of problems related to both prehensile and nonprehensile manipulation of deformable objects. The RoDyMan final demonstrator is the realisation of an autonomous pizza maker. Preparing a pizza involves an exceptional level of manual dexterity. Besides, it is indeed a media attractor and a tribute to the hometown of the project. Videos of the related experiments can be found in the laboratory YouTube channel2 . 1.2

Literature Review

Different projects tried to address nonprehensile manipulation using different approaches. The RIBA robot3 can lift and set down patients from/to their beds and/or wheelchairs. The addressed transporting task is indeed nonprehensile, but it is not dynamic since the patient is often considered as a rigid object. Motion planning strategies could be enough to achieve the task. The robot is covered by a soft material to enhance safety during the human-robot interaction. The ERC SHRINE project4 wanted instead to advance the robot manipulation skills to overcome the limitations preventing robots from safe and smooth operations in anthropic environments. Some manipulation actions were executed in a nonprehensile fashion. Beyond the mentioned projects, advancements in the domain of robotic nonprehensile manipulation are very slow. Mainly because such a domain lacks a solid theoretical background, and this may be caused by the absence of a shared device where researchers can compare their works (i.e., multi-fingered hands for the robotic grasping domain; humanoids for the researchers studying walking; unmanned aerial vehicles for the aerial robotic community). Nonprehensile manipulation has instead very different tasks and situations that can be addressed with different devices. Therefore, applications are commonly limited to ad-hoc solutions to solve distinct problems. As previously outlined, the benefits of nonprehensile manipulation are compensated by some drawbacks related to the difficulty in finding the suitable control design. For instance, the possible change of the contact status during a nonprehensile manipulation task leads to non-smooth dynamics, complicating 2 3 4

https://www.youtube.com/user/ThePRISMAlab. http://rtc.nagoya.riken.jp/RIBA/index-e.html. https://cordis.europa.eu/project/rcn/98813 en.html.

Nonprehensile Manipulation Control and Task Planning

79

the controller design. Further control problems arising in nonprehensile manipulation tasks are highlighted in [37], while a recent survey regarding nonprehensile manipulation is available in [52]. Therefore, dynamic nonprehensile manipulation may be considered as the most complex manipulation action. A standard procedure to deal with nonprehensile manipulation tasks is to split them in simpler subtasks. Each single subtask is often referred to as a nonprehensile manipulation primitive [37]. Examples of such primitives are sliding (or friction induced manipulation) [69], holonomic rolling [16], nonholonomic rolling [54], throwing [53], dynamic catching [4], batting [56], juggling [49], dribbling [3], and so on. Each primitive is equipped with its motion planner and controller. Within a complex manipulation task, a primitive can be activated or deactivated by a high-level supervisor [72]. Besides, nonprehensile manipulation involves one or more objects that should be adequately tracked for the successful execution of the task. Recognition and tracking of deformable objects are even more challenging. Typically, object

Fig. 1. The RoDyMan robotic platform equipped with two Schunk SVH hands as endeffectors.

80

F. Ruggiero et al.

tracking is based on local feature detection, (i.e., the traditional Harris’ corner detector [24], SIFT [35], SURF [5] or FAST [51]). However, some deformable objects, like a bread dough, usually have few features. Hence, their boundaries are the unique information, and they can be extracted by edge detectors [12,21], or with depth sensors like the Kinect device. A real-time tracking method for a deformable object with RGB-D data is proposed in [46] by using a finite element method (FEM) framework for the tracking of the object. On the other hand, recognising a deformable object is possible without having a model in advance. The model of the object is instead identified during its manipulation directly [31]. A neural network-based recognition method for a deformable object is proposed in [15], given the tactile and the camera sensor information without any model in advance. Manipulating and observing the object is useful in [18] and [19] to reconstruct its dynamic properties, which are then used to plan its deformations [47]. As an application for perception and manipulation of a deformable object, it is worth mentioning the procedure for the automatic stretching of a dough in [65].

2 2.1

Robot Design Mechatronic Design

The RoDyMan project also gives the name to a 21-degree-of-freedom (DoF) humanoid-like robotic platform composed by a mobile base and an upper-body with two anthropomorphic arms and a sensorized head (see Fig. 1). The mobile base is developed with omnidirectional mecanum wheels allowing the robot to move on the floor in any direction. The four wheels are linked to the structure through an extensible actuated mechanism to increase the contact area during the execution of rapid movements of the upper body. Two redundant 7-DoF arms constitute the upper-body limbs. Two SCHUNK LWA 4P 6-DoF arms were chosen thanks to the following properties: (i) the high payload; (ii) the humanlike distance between the shoulder and the elbow, and between the elbow and the wrist; (iii) the possibility to control the arm without any external controller by using a controller area network (CAN) bus communication. In order to add a human-like redundancy, a seventh joint SCHUNK PRL-100 is directly integrated into the shoulder, and it is controlled via another CAN channel. The two arms are linked to a 2-DoF torso made up by two SCHUNK PRL-120 motors to rotate and bow the upper body. Besides, a pan-tilt neck links the sensorized head to the torso. The arms come out with a large variety of different end-effectors such as two SCHUNK Servo-electric 5-finger SVH underactuated hands, or suitable 3-D printed tools specifically designed for the different experiments carried out with the robot. From the perception point of view, the RoDyMan robot is equipped with force sensors that can be mounted on the wrist to measure the interaction forces with the environment. Two laser scanners are placed in the mobile platform for odometry and simultaneous localisation and mapping. Different vision sensors are placed in the head to reconstruct the environment and implement advanced

Nonprehensile Manipulation Control and Task Planning

81

control strategies. In details, the head contains an RGB-D sensor that exhibits a reasonable resolution with medium refresh rate, a time-of-flight sensor that has less resolution with high refresh rate, and two high-quality cameras that can be used in both mono or stereo configuration to have the maximum visual resolution. The RoDyMan robot is a stand-alone system with both the controller and power supply directly integrated into the body. In details, two high capacity batteries packs and an uninterruptible power supply are located in the mobile base to provide the RoDyMan robot with an 8-hour autonomy. Moreover, a QNX-based personal computer is used for the low-level motor control and safety. A standard Linux-based personal computer is instead employed for perception and high-level planning and control. Finally, a complete dynamic model of the 21-DoF system was carried out to allow the development of advanced control strategies for human-robot interaction. The complete dynamic model was derived in a symbolic form using an ad-hoc developed Matlab toolbox. Moreover, the linear matrix inequality identification method described in [60] was employed to obtain the dynamic parameters by including the physical consistency within the optimisation procedure. The obtained results show a reasonable estimation of parameters like mass and inertia. Because of its high predominance, especially in the static part, friction was instead identified separately and included as a constraint in the subsequent optimisation. 2.2

Software Architecture

A high-level control architecture handles the autonomous execution of actions on the RoDyMan robotic platform. Such architecture aims at planning and executing the robot activities needed to perform complex dynamic manipulation tasks. The control architecture is depicted in Fig. 2, and it is explained in the following. High-level or dynamic manipulation tasks (e.g., the pizza tossing) are specified as inputs for the system using the Human-Robot Interaction (HRI) module. The task is then decomposed by the Supervisor module, splitting high-level actions into different lower level tasks considering both the state of the robot and the information generated by the Perception module. After the decomposition process, each lower level action can be executed. Examples of high-level tasks are Grasp(Object), Search(Object) or Toss(Object), and sequences of non-prehensile manipulation primitives. The Supervision module performs the high-level task decomposition process. It is endowed with a hierarchical decomposition tasks library, similar to the hierarchical task networks, which can be composed by the system to achieve the desired goal [11]. The Executors module is instead responsible for handling the execution of the tasks decomposed by the Supervisor. In this context, when a non-prehensile manipulation action is required, the desired low-level controller is chosen and invoked from the dynamic manipulation task list. For all the other actions, the Executors is directly responsible for their execution. In particular, path and motion planning functionalities are implemented

82

F. Ruggiero et al.

in such a module to find an obstacle-free way for the end-effectors of the robot and its base to achieve the requested actions. The Executors module relies on MoveIT! [13], a framework that integrates and a universal robot description file, the Open Motion Planning Library, and other toolkits. The Controller module is instead responsible for sending the generated motion trajectories to the joints of the robot and its base. Finally, the Perception module extracts information from the operative environment through image elaboration algorithms. In conclusion, the described control architecture represents a suitable choice to support the RoDyMan robot. It statically allocates the best low-level controller to accomplish the desired dynamic manipulation task. The methodology represents an innovative way of decomposing high-level tasks for nonprehensile manipulation with respect to the available literature [72].

Fig. 2. RoDyMan high-level control architecture.

3

Manipulation Planning for Deformable Objects

Unlike solid objects, it is difficult to predict the shape of a deformable object after applying external forces. Hence, in order to plan manipulation actions, it is favourable having at disposition a precise model of the object. The following subsection is based on the smoothed particle hydrodynamics (SPH) framework. The pizza dough is addressed as an illustrative example. Moreover, the manipulation of a deformable object also requires an accurate perception module that can extract proper abstract information from the sensor devices. An RGB image-bases perception method is then explained in the following. Based on these modelling and perception methods, the manipulation planning for a pizza dough is briefly sketched. 3.1

Modelling of Deformable Objects

The model of a deformable object is relevant for (i) compensating the lack of visual data from sensors; (ii) simplifying its representation; (iii) predicting its shape during the time.

Nonprehensile Manipulation Control and Task Planning

83

A primary way to model a deformable object is to consider it as a sequence of solid objects linked by joints. An object modelled in this way is referred to as a foldable object. The deformability or flexibility of these objects is represented by how much the links can bend. The most typical method is based on a mass-spring-damper model [2,20,28,44,70], whose equations exploit Newton’s second law and Hooke’s law. More general models describing a deformable object regard it as a continuous object, which has a volume and some properties like elasticity or viscosity. According to the Navier-Stokes’ theorem, that is a fundamental theorem representing continuous objects, the momentum conservation of an incompressible continuous object is formulated as ρ

Dv = −∇ · ph Ir + ∇ · τ + ρ f body . Dt

(1)

D where ρ is density, v ∈ Rr is velocity vector, Ir ∈ Rr×r is identity matrix, Dt is the material derivative operator, ∇· is the divergence operator, ph ≥ 0 is the hydrostatic pressure, τ ∈ Rr×r is the viscous stress, f body ∈ Rr is the body force due to gravity, surface tension, or friction, and r is a proper dimension (usually r = 3). Among the various formulations for describing the Navier-Stokes’ theorem (i.e., the finite difference method [14], the finite volume method [17], the finite element method [29,73], and the gradient smoothing method [33]), the SPH method [34,42], is one of the well-known formulations describing continuous objects. A physical quantity A (e.g., pressure, temperature, force) of any point x ∈ Rr of a continuous object can be calculated by  ∞ A(x) = A(x ) W (x − x , h) dx , (2) −∞

where W : Rr → R≥0 is an interpolating kernel with dimension r = 3, and h is the radius of the kernel domain. The value for the outer of the kernel domain is zero. The simplicity of the formulation and the virtue of mesh-free Lagrangian models gave popularity to the SPH method within the domain of modelling continuous objects. The pizza dough is here considered as illustrative example for a deformable object. A pizza dough is a rheological continuous material composed of various ingredients such as water, heavy water, glutamine residues, salts, agents affecting disulfide bonding, and the protein subunits [1]. The dough has high viscous properties because of gliadin, and elastic properties due to the presence of the ∂u in (1), glutenin [66,71]. The viscosity is defined by the viscous stress τ = μ ∂y ∂u where μ is the dynamic viscosity coefficient, and ∂ y is the local shear velocity. Unlike ideal Newtonian fluids having a constant dynamic viscosity coefficient, a general non-Newtonian fluid has a dynamic viscosity coefficient which depends on

84

F. Ruggiero et al.

the local shear velocity. It is thus defined as a nonlinear equation. Several models exist for a non-Newtonian fluid like the Bingham plastic, the shear thinning, the shear thickening, or the Herschel-Bulkley-Papanastasiou model. The latter is suitable for rheological materials [38,39,59], whose equations are  for |τ | > τy , τ = K|γ| ˙ n−1 γ˙ ± τy , (3) γ˙ = 0, otherwise, where γ˙ = ∂∂ uy . For a bread dough, it is experimentally known that K = 5177, n = 0.417, and τy = 298.76 [39]. Based on the SPH formulation, the viscosity term in the momentum conservation (1) was designed and proposed in [7,40–42,45,57,61,62,64]. One of the representations for the viscosity force in a Newtonian fluid is fivis =

1 1 1  mj ∇ · τ i = μ∇2 vi ≈ μ (vj − vi ) ∇2 Wij , ρi ρi ρi j ρj

(4)

where mj is the mass of a neighbour j-particle. The first equality of (4) holds because of volume conservation, while the second approximate equality holds because of the discretization of the second golden rule of the SPH formulation [40]. Four different viscous fluids are simulated in Fig. 3.

Fig. 3. Simulations of fluids with different dynamic viscosity coefficients. The fluids falls from above on the solid object representing a rabbit. From left to right the considered coefficients τ are 0 (water), 5 (honey), 70 (ketchup), and 250 (bread dough), respectively.

Volume conservation is difficult to guarantee for the SPH method. Hence, an additional procedure is needed, like the implicit incompressible SPH formulation [45], or the divergence-free SPH method [6].

Nonprehensile Manipulation Control and Task Planning

3.2

85

Detection of Deformable Objects

Perception of a deformable object roughly consists of two parts. The former is the image processing of camera sensor data. The latter is the representation of the status of the object. Perception methods highly depend on the sought application. Considering as an illustrative example a modelling clay, emulating the pizza dough, a specific detector is explained in the following Some restrictions are considered to decrease the complexity: (i) the workspace is reduced by defining a bounding area (i.e., a plain rectangular plate, see Fig. 4(a)); (ii) the contrast between the clay and the background is enhanced (e.g., in the considered case, a green clay is considered on a white background, see Fig. 4(b)). Three corner marks on a rectangular plate, the absolute distance among them, and the kinematic information provided by the RoDyMan robot are enough to compute the frame transformation between a 2-D point in the camera image and the corresponding absolute position in the world frame [25], which is placed without loss o generality at the middle of the RoDyMan mobile base. Within the carried out experiments, QR codes are placed at each corner of the rectangular plate (see Fig. 4(a)). Such corner points are detected using an image matching with SIFT [35], SURF [5], or FAST [51]. In order to detect the corners quickly, AR codes and the corresponding AR code detector (i.e., ARToolkit [30]) are used. The detected corner points are also employed to remove outlines of the plate. Figure 4(b) shows the detected green clay on a white plate which is identified by a blue contour. By supposing the object uniformly thick, except at the boundary margin, and using colour-based object and edge detectors [21], like the canny-edge detector [12], the modelling clay can be detected in real-time. Then, the corner points and the boundary of the deformable object in 2D camera image are transformed into the top-down orthogonal view of Fig. 4(c). 3.3

Task Planning to Stretch a Deformable Object by a Tool

The illustrative example of stretching a pizza dough with a rolling-pin is addressed. In the experiments, a modelling clay is used in place of a real dough. The proposed procedure consists of four sequential steps: (i) object recognition; (ii) object deformation planning; (iii) tool action planning; (iv) robot manipulation planning. The flowchart is depicted in Fig. 5. The recognition of the deformable object is explained in the previous subsection, and thus it is possible to reconstruct the current status of the modelling clay. By using the SPH model described in Subsect. 3.1, a transition look-uptable is built in simulation. The table contains a discretization of possible status and the transitions to other status given a rolling-pin’s action. The object deformation planner generates a proper sequence of actions for the rolling-pin from the information of the current status of the object to bring it to the desired one. Based on such a sequence of actions, the movement of the rolling-pin and, in turn, the movements of the robot are planned. An example is shown in Fig. 6. The robot starts at an initial pose. The movement is planned to bring it towards

86

F. Ruggiero et al.

(a) The RoDyMan robot and a white plate where the deformable object will be tracked.

(b) Detection of the background plate and the green modelling clay.

(c) Extracted modelling clay contour.

Fig. 4. Detection of a deformable object on a rectangular plate.

Nonprehensile Manipulation Control and Task Planning

87

Fig. 5. Flowchart of the described procedure to stretch a modelling clay with a tool manipulated by a robot.

a starting pose. From the current status of the object and its desired configuration, some stretching action is performed with the rolling-pin, and a final pose is reached. The sequence is repeated until the deformable object reaches the desired shape.

Fig. 6. Example of task planning actions to stretch a deformable object.

88

4

F. Ruggiero et al.

Nonprehensile Manipulation: Planning and Control

Nonprehensile manipulation is maybe the most complex and dexterous manipulation task performed by a human being. In robotics, a literature analysis reveals that the conventional way to cope with a nonprehensile manipulation task is to split it into simpler subtasks [52]. Such subtasks are usually referred to as nonprehensile manipulation primitives. Three manipulation primitives are described in the following: (i) nonprehensile rolling, both holonomic and nonholonomic; (ii) friction-induced manipulation; (iii) throwing/tossing. The first primitive takes into account only rigid objects, while the second can instead consider both rigid and deformable objects. The latter addresses a deformable object explicitly. In the following, the controller and/or the motion planner designed for the primitives above are described.

Fig. 7. The RoDyMan platform balancing a ball on a disk. The structure made by the three connected bars shown in the figure allows starting the experiments in a non-equilibrium position. The world frame and the body-fixed frame, attached to the rotating wheel, are displayed in red and green lines, respectively. The angle between these two frames is represented by θh .

4.1

Holonomic Rolling

In this subsection, an actuated manipulator of a given shape that handles an object is considered. The manipulator is referred to as the hand and manipulates the object through rotations only, that is without grasping or caging it. In this

Nonprehensile Manipulation Control and Task Planning

89

class of manipulation, the object can only roll along the hand surface. Only planar holonomic rolling is thus presented. The RoDyMan mechatronic platform uses highly-geared harmonic drives, which motivate the assumption that the acceleration of the hand ah = θ¨h ∈ R con be considered as a control input for the system. Under the additional assumption that the hand can only rotate around its centre of mass, the dynamics of a nonprehensile planar rolling manipulation system can be described as follows θ¨h = ah ,





˙ s¨h = −b−1 22 b12 ah + c21 θh + c22 s˙ h + g2 ,

(5a) (5b)

where the parameters b12 ∈ R and b22 ∈ R are entries of the inertia matrix of the system B ∈ R2×2 , c21 ∈ R and c22 ∈ R are entries of the (2 × 2) Coriolis matrix, and g2 ∈ R describe the effect of the gravity. The contact position of the object on the hand, whose shape is parametrized through arclength, is represented by the variable sh ∈ R. A detailed description of the system can be found in [32], where it is also shown that a nonprehensile planar rolling manipulation system θh +sh if the Coriolis forces are zero. This is differentially flat with the output bb12 22 condition is verified for the ball-on-disk (BoD) system. Note that the dynamics of the BoD and the disk-on-disk (DoD) are similar in the transversal plane [16]. The BoD system is a mechanical arrangement of ball that can roll on top of a disk as shown in Fig. 7. In this case, the disk is the actuated hand and the ball is the object. The control task for the BoD is to balance the object and simultaneously drive the hand to a desired angular position. This control problem is solved using passivity control (PBC) and port-Hamiltonian (pH) theory. In its classical formulation, the PBC applied to nonprehensile rolling system aims at finding a controller for system (5a)–(5b) such that the dynamics of the closed loop takes the form     O B−1 Bd q˙ (6) ∇Hd (q, p), = −Bd B−1 J2 (q, p) − Rd (q, p) p˙ where q ∈ R2 and p ∈ R2 are the vectors of generalized coordinates and momenta, respectively, and O the zero matrix of proper dimensions. The function Hd = 12 p B−1 d (q)p + Vd (q) ∈ R represents the desired energy in closed loop, where Bd ∈ R2×2 is the desired mass matrix and Vd ∈ R is the desired potential energy. The function J2 ∈ R2×2 and Rd ∈ R2×2 represent the gyroscopic forces and damping of the closed loop, respectively. In addition, it is required that the desired energy can be used as Lyapunov function for the closed loop, and thus stability is ensured. Besides, detectability of the passive output will ensure asymptotically stability of the desired equilibrium. The detailed design of the controller can be found in [16] for the DoD example. As a milestone, it is possible to affirm that holonomic rolling nonprehensile manipulation primitive can be successfully modelled through the pH formalism and, consequently, controlled with PBC approaches [55]. This is relevant because it means that there exists a unified framework at least for such a class of nonprehensile manipulation primitive.


4.2 Nonholonomic Rolling

The typical example of nonholonomic rolling is the so-called ball-and-plate benchmark system. In such an example, a ball is steered towards a desired position with a given orientation by a moving plate actuated by a robot. A planning algorithm plus a controller designed within the pH formalism are presented in [54]. In this subsection, instead, a further interesting example of the nonholonomic rolling primitive, studied in the framework of the RoDyMan project, is presented: the hoop-and-pole system depicted in Fig. 8. In this figure, the following coordinate frames are shown: an inertial frame ow-xwywzw, a frame attached to the pole op-xpypzp, a frame attached to the hoop oh-xhyhzh, and a contact frame oc-xcyczc. The contact frame is defined as follows: oc ∈ R3 is the contact point, xc ∈ R3 is a unit vector passing through the contact point and pointing outwards from the pole surface, yc ∈ R3 is a unit vector lying in the intersection of the tangent plane to the pole surface at the contact point and the hoop equatorial plane, and zc ∈ R3 is defined so as to complete an orthonormal frame.

Fig. 8. Pole and hoop system driven by the RoDyMan robot in a simulation environment.


To obtain the contact kinematics [43, Section 6.2], the following contact coordinates are defined: zo ∈ R is the distance to the contact point measured along the zp axis, θ ∈ R is the angle from an arbitrary point on the pole surface to the contact point measured about the zp axis, γ ∈ R is the angle from an arbitrary point on the hoop surface to the contact point measured about the zh axis, ψ ∈ R is the angle from an arbitrary point on the equator of the hoop to the contact point measured about the yc axis, and φ ∈ R is the angle between two tangent vectors, one from each surface, measured about the xc axis (see [22] for details).

These contact coordinates are grouped in $q_c \triangleq [\gamma\;\; \psi\;\; z_o\;\; \theta\;\; \phi]^\top \in \mathbb{R}^5$. Moreover, let $q_p \in \mathbb{R}^m$ be the pole coordinates vector, where $m \leq 6$ is the number of the pole degrees of freedom. The complete generalised coordinates for the system under consideration are given by the vector $q \triangleq [q_c; q_p] \in \mathbb{R}^{5+m}$, where $[x; y]$ is a shorthand notation for $[x^\top\; y^\top]^\top$. Finally, let $q_h \triangleq [\gamma\;\; \psi]^\top \in \mathbb{R}^2$ be the vector of hoop coordinates, and $q_r \triangleq [q_h; q_p] \in \mathbb{R}^{2+m}$. From these definitions and by following the same modeling procedure as in [22], the following dynamic model is obtained

$$B_h(q)\ddot{q}_h + c_h(q,\dot{q}_r) + T_h(q)\ddot{q}_p = 0 \qquad (7)$$
$$B_p(q)\ddot{q}_p + c_p(q,\dot{q}_r) + T_h^\top(q)\ddot{q}_h = u, \qquad (8)$$

subject to the constraints

$$\dot{\theta} = (l_h c_\phi / r_p)\,\dot{\gamma} \qquad (9)$$
$$\dot{z}_o = -l_h s_\phi\,\dot{\gamma} \qquad (10)$$
$$\dot{\phi} = -s_\psi\,\dot{\gamma}, \qquad (11)$$

where lh > 0 is the hoop radius, rp > 0 is the pole radius, and sx and cx are shorthand notations for sin(x) and cos(x), respectively. In the above equations Bh (q) ∈ R2×2 and Bp (q) ∈ Rm×m are symmetric positive definite matrices, ch (q, q̇r ) ∈ R2 and cp (q, q̇r ) ∈ Rm are vectors accounting for centripetal, Coriolis and gravitational forces, u ∈ Rm is the vector of input forces acting on the pole, and Th (q) ∈ R2×m is the inertia coupling matrix. If rank(Th (q)) = 2, ∀q, the system is said to be strongly inertially coupled [63], which is an assumption in this work. The control objective is to rotate the hoop at a constant speed γ̇d ∈ R while simultaneously translating the hoop to an arbitrarily desired point over the pole surface zod ∈ R. Moreover, the pole coordinates must be stabilized. The controllability of the whole system depends on the coordinates chosen for the pole. After a controllability analysis [23], the task can be accomplished if the pole degrees of freedom consist of two translations along the xp and yp axes and two rotations around the same axes. By exploiting the strong inertial coupling property, the main idea is to construct two orthogonal projection matrices from


the $T_h(q)$ matrix in (7)-(8), that is, $P_h \triangleq T_h^+ T_h \in \mathbb{R}^{m\times m}$ and $Q_h \triangleq I_m - P_h \in \mathbb{R}^{m\times m}$, where $T_h^+ \triangleq T_h^\top\left(T_h T_h^\top\right)^{-1} \in \mathbb{R}^{m\times 2}$ is the right pseudoinverse of $T_h$. It is straightforward to verify that the following relations hold: $P_h T_h^\top = T_h^\top$, $T_h P_h = T_h$, $Q_h T_h^\top = O$, and $T_h Q_h = O$. Based on these projectors, the control input in (8) is designed as

$$u = B_p\left(P_h u_P + Q_h u_Q\right). \qquad (12)$$
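As a quick sanity check of the projector construction used in (12), the following sketch builds $P_h$ and $Q_h$ from a randomly generated coupling matrix, an illustrative stand-in for $T_h(q)$ at a given configuration with an assumed number of pole degrees of freedom, and verifies the identities stated above.

```python
import numpy as np

# Illustrative full-row-rank coupling matrix (strong inertial coupling)
rng = np.random.default_rng(0)
m = 4                                        # assumed number of pole DoFs
Th = rng.standard_normal((2, m))             # stand-in for T_h(q), rank 2

Th_pinv = Th.T @ np.linalg.inv(Th @ Th.T)    # right pseudoinverse T_h^+
Ph = Th_pinv @ Th                            # P_h = T_h^+ T_h
Qh = np.eye(m) - Ph                          # Q_h = I_m - P_h

assert np.allclose(Ph @ Th.T, Th.T)          # P_h T_h^T = T_h^T
assert np.allclose(Th @ Ph, Th)              # T_h P_h = T_h
assert np.allclose(Th @ Qh, np.zeros((2, m)))  # T_h Q_h = O
print("projector identities verified")
```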

A direct application of noncollocated partial feedback linearisation [63] results in the decoupled double integrator system $\ddot{q}_h = v_P$ for controlling the hoop coordinates, where $v_P \in \mathbb{R}^2$ is a new input to be designed. Notice, however, that for designing this control law the nonholonomic constraints (9)-(11) must be taken into account. Furthermore, not only the coordinates in $q_h$ must be stabilised, but $\phi$, $\theta$ and $z_o$ as well. This can be carried out by employing the backstepping methodology, as explained in [23]. As regards the control of the pole coordinates, the linearisation along the trajectories of the steady-state solutions, resulting from applying the noncollocated partial linearisation controller for the hoop coordinates, leads to a periodic linear time-varying system. For this particular kind of system, a linear-quadratic regulator can be employed [58], and the resulting periodic Riccati equation can be solved by the quasi-linearisation method [8, p. 137]. Through a formal mathematical analysis, it is possible to guarantee the ultimate boundedness of the hoop coordinates, with an arbitrarily small ultimate bound, and the local stabilisation of the pole coordinates [23]. The proposed scheme is validated through numerical simulations in [23]. The main challenges for carrying out experiments are the need for a fast reconstruction of the hoop position and orientation and the high speeds and accelerations required from the actuators (robot joint motors).

4.3 Friction-Induced Manipulation

Positioning objects by sliding is a problem that has been studied in the literature over the last decade. For example, a mechanism called the universal planar manipulator is designed in [50] for moving parts on a plane through friction-induced movements. Later, the problem of positioning and orienting a rigid body over a 2D plate is considered in [67]. More recently, a method for positioning and orienting a rigid body with a six-degrees-of-freedom rigid plate is proposed and successfully tested in [68]. In the context of the RoDyMan project, the illustrative example of turning a pizza over an actuated peel is tackled. To uniformly cook a pizza inside a wood-fired oven, in which the heat source is not distributed but localised in a specific position, it is necessary to frequently turn the pizza around with a peel. The movements inspired by Neapolitan pizza chefs are an example of friction-induced manipulation [26].


Fig. 9. Peel and pizza system.

The problem of positioning and orientating a disk with a two-degrees-of-freedom manipulation system was first studied in [27]. In the cited article, the authors take advantage of the physical properties of the mechanical system to successfully drive the disk to an arbitrary desired position and orientation on the peel surface. The authors also show that a translation of the disk can be performed without rotating it, but not vice versa. One weak point of the cited work is that the control strategy is mainly carried out in an open-loop fashion, that is, the visual feedback is only employed to indicate that the disk has arrived at a particular pose. In contrast, a model-based closed-loop controller for rotating the pizza at a desired constant speed over the peel surface is briefly proposed below. In order to design the control strategy, a mathematical model of the system is first derived. Let ow-xwywzw be the inertial frame, oh-xhyhzh a frame attached to the peel and op-xpypzp a frame attached to the pizza, as shown in Fig. 9. The generalised coordinates for the peel, driven by a robotic manipulator, are xh ∈ R and θ ∈ R, where xh is the first component of oh ∈ R3 and θ is the rotation angle of the peel with respect to the inertial frame. The generalised coordinates for the pizza are chosen as xp, yp, φ ∈ R, where xp and yp are the first two components of ohp ∈ R3 and φ is the angle of rotation of the pizza with respect to the zhp axis. Therefore, the configuration of the system is completely

described by the vector $q = [x_h\;\; \theta\;\; x_p\;\; y_p\;\; \phi]^\top \in \mathbb{R}^5$. Before applying the Euler-Lagrange methodology to obtain the equations of motion, the external and non-conservative forces are described. Notice that the Coulomb friction terms play a critical role for this task. These terms are defined as functions of the relative velocities between the peel and the pizza, $\dot{x}_p$ and $\dot{y}_p$, and are described by [27]

$$f_x = m_p g \mu_p\,\mathrm{sign}(\dot{x}_p) \qquad (13)$$
$$f_y = m_p g \mu_p\,\mathrm{sign}(\dot{y}_p), \qquad (14)$$

where g ∈ R is the gravity acceleration constant, μp ∈ R is the Coulomb friction coefficient, and sign(x) is the standard sign function. On the other hand, there


is a torque about the $z_{hp}$ axis produced by the movement of the $x_h$ coordinate and the change of the pressure distribution, which in turn is induced by the acceleration of the $\theta$ coordinate [27], and is given by $\tau_\phi = \mu_\phi I_{px}\,\mathrm{sign}(\dot{x}_p)\,\ddot{\theta}$, where $I_{px} \in \mathbb{R}$ is the $(1,1)$ component of the pizza inertia tensor $I_p \in \mathbb{R}^{3\times 3}$. Overall, the non-conservative and external forces are represented by the vector

$\xi = [u_h\;\; \tau_\theta\;\; f_x\;\; f_y\;\; \tau_\phi]^\top \in \mathbb{R}^5$, where $u_h \in \mathbb{R}$ is the external force applied over the peel in the $x_h$ direction, and $\tau_\theta \in \mathbb{R}$ is the external torque about the same axis. A further simplification can be made if the linear and angular accelerations of the peel are considered as inputs, i.e., $u \triangleq [u_h\;\; u_\theta]^\top \triangleq [\ddot{x}_h\;\; \ddot{\theta}]^\top \in \mathbb{R}^2$. In order to employ continuous tools to analyse the system dynamics, the following approximation of the sign function is made: $\mathrm{sign}(x_i) \approx \tanh(k_i x_i)$, where $k_i$ is a positive constant. Since the objective of the control is also the pizza rotation speed, the following error is defined: $\dot{\tilde{\phi}} = \dot{\phi} - \dot{\phi}_d$, where $\dot{\phi}_d \in \mathbb{R}$ is the desired pizza rotation speed. Now, let $x \triangleq [x_1\;\; x_2\; \cdots\; x_9]^\top \triangleq [x_h\;\; \theta_h\;\; x_p\;\; y_p\;\; \dot{x}_h\;\; \dot{\theta}_h\;\; \dot{x}_p\;\; \dot{y}_p\;\; \dot{\tilde{\phi}}]^\top \in \mathbb{R}^9$ be the state space vector. Then, the system dynamics can be put in the form

$$\dot{x} = f(x) + g_1 u_1 + g_2(x) u_2, \qquad (15)$$

where $f(x) = [x_5\;\; x_6\;\; x_7\;\; x_8\;\; f_2(x)^\top]^\top \in \mathbb{R}^9$, $g_1 = [0^\top\;\; g_{12}^\top]^\top \in \mathbb{R}^9$, and $g_2(x) = [0^\top\;\; g_{22}(x)^\top]^\top \in \mathbb{R}^9$, with

$$f_2(x) = \begin{bmatrix} 0 \\ 0 \\ -\frac{b_p}{m_p}x_7 - g\mu_p\tanh(k_7 x_7) \\ -\frac{b_p}{m_p}x_8 - g\mu_p\tanh(k_8 x_8) - g\sin(x_2) + x_4 x_6^2 \\ 0 \end{bmatrix} \in \mathbb{R}^5 \qquad (16)$$

$$g_{12} = [1\;\; 0\;\; -1\;\; 0\;\; 0]^\top \in \mathbb{R}^5 \qquad (17)$$

$$g_{22}(x) = [0\;\; 1\;\; 0\;\; 0\;\; -\mu_\phi (I_{px}/I_{pz})\tanh(k_7 x_7)]^\top \in \mathbb{R}^5, \qquad (18)$$

where $I_{pz} \in \mathbb{R}$ is the $(3,3)$ component of $I_p$. The control objective consists in inducing a rotating movement on the pizza dough at a desired angular speed $\dot{\phi}_d$ while keeping the remaining coordinates as close to zero as possible. In order to fulfil this objective, the following control law is proposed

$$u_1 = -k_1 x_1 - k_5 x_5 + a_h \sin(\omega_h t) \qquad (19)$$

$$u_2 = \frac{\tanh(k_7 x_7)\, I_{pz}}{\mu_\phi I_{px}}\, k_9 x_9 - k_2 x_2 - k_6 x_6. \qquad (20)$$

The first term of (19) is a PD control employed to stabilise the peel linear movement, while the feedforward term $a_h \sin(\omega_h t)$ ensures the condition $\dot{x}_p \neq 0$. On the other hand, the control law (20) is a PD control that stabilises the peel orientation plus a nonlinear term that induces a rotation of the pizza by exploiting the torque $\tau_\phi$.
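The reconstructed model (15)-(18) and the control laws (19)-(20) can be exercised numerically. The sketch below is a rough forward-Euler rollout with illustrative, hand-picked parameters and gains, not the values identified on the real peel-pizza system, and is intended only to show how the pieces fit together rather than as a tuned controller.

```python
import numpy as np

# Illustrative parameters (not from the experimental setup)
mp, bp, g = 0.3, 0.05, 9.81           # pizza mass, viscous coefficient, gravity
mu_p, mu_phi = 0.3, 0.1               # Coulomb friction coefficients
Ipx, Ipz = 2e-3, 4e-3                 # pizza inertia components
k = [2.0, 10.0, 0, 0, 2.0, 5.0, 50.0, 50.0, 1.0]   # k1..k9 (k7, k8 are tanh slopes)
ah, wh, phid = 5.0, 10.0, 1.0         # feedforward amplitude/frequency, desired speed

def dynamics(t, x):
    u1 = -k[0]*x[0] - k[4]*x[4] + ah*np.sin(wh*t)                              # (19)
    u2 = (np.tanh(k[6]*x[6])*Ipz/(mu_phi*Ipx))*k[8]*x[8] - k[1]*x[1] - k[5]*x[5]  # (20)
    f2 = np.array([0.0, 0.0,
                   -bp/mp*x[6] - g*mu_p*np.tanh(k[6]*x[6]),
                   -bp/mp*x[7] - g*mu_p*np.tanh(k[7]*x[7]) - g*np.sin(x[1]) + x[3]*x[5]**2,
                   0.0])
    g12 = np.array([1.0, 0.0, -1.0, 0.0, 0.0])
    g22 = np.array([0.0, 1.0, 0.0, 0.0, -mu_phi*(Ipx/Ipz)*np.tanh(k[6]*x[6])])
    return np.concatenate((x[4:8], f2 + g12*u1 + g22*u2))    # (15)

x = np.zeros(9); x[8] = -phid          # pizza initially at rest, so x9 = -phid
t, dt = 0.0, 1e-4
for _ in range(int(10.0/dt)):
    x = x + dt*dynamics(t, x); t += dt
print("pizza speed tracking error x9 after 10 s:", x[8])
```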


The centripetal force term $x_4 x_6^2$ in (16) is neglected, as commonly assumed in the literature [27,68]. Therefore, a closed-loop analysis can be carried out to guarantee the existence of periodic solutions of the peel coordinates around the equilibrium position. Besides, the stabilisation of the pizza centre of mass around zero is also ensured, as well as the practical stabilisation of the angular speed tracking error $\dot{\tilde{\phi}}$, with an arbitrarily small tracking error.

4.4 Tossing of a Deformable Object

The nonprehensile throwing manipulation primitive aims at bringing an object to a desired location (possibly with a desired orientation) which is outside the robot workspace. The task becomes even more challenging if the thrown object is deformable. Again, the illustrative problem of tossing and catching a pizza dough is considered within the RoDyMan project. Such a procedure is frequently and dexterously performed by human pizza chefs. There are at least three reasons why tossing the dough during the preparation of the pizza is attractive: (i) the dough is stretched to the desired size; (ii) the dough naturally assumes a configuration that is thicker at the ends and thinner in the middle; (iii) as the spinning dough freely falls, the outside of the dough dries, making it crunchy on the outside but light in the middle.

The combined model of a dough grasped by robotic fingers through unilateral constraints, together with the kinematics and dynamics of the robot manipulator, was derived in [53]. Upon that, a control law achieving the desired tossing motion can be designed. Besides, with a perfect knowledge of the motion of the dough, optimal trajectories can be generated in SE(3) for the catching phase. The optimal trajectory generation is repeated as new sensor information becomes available. The optimal trajectories are generated in such a way that the initial position, velocity and acceleration, and the final velocity and acceleration, are matched. Therefore, the trajectory is at least thrice continuously differentiable. An optimal trajectory that must match the prescribed initial and final accelerations satisfies a sixth-order boundary value problem (BVP). Such a BVP is generated by using the necessary conditions for a path to minimise a convex combination of the jerk and acceleration functionals. While minimising the jerk functional reduces the vibrations in the structure of the robotic manipulator, minimising the acceleration functional reduces the total amount of energy expended during the catching motion [9]. To derive the trajectories for tossing and catching, the following cost functional $L_c : TQ \times TQ \to \mathbb{R}$ is minimised

$$L_c(s) = \int_{t_0}^{t_f + s\,\delta t_f} \left( \alpha \left\langle\!\left\langle \frac{Dv}{dt}, \frac{Dv}{dt} \right\rangle\!\right\rangle + \beta \left\langle\!\left\langle \frac{D^2 v}{dt^2}, \frac{D^2 v}{dt^2} \right\rangle\!\right\rangle \right) dt,$$

where the weights $\alpha \in \mathbb{R}$ and $\beta \in \mathbb{R}$ satisfy $\alpha + \beta = 1$, $t_0 \in \mathbb{R}$ and $t_f \in \mathbb{R}$ are the initial and final time instants, $v \in se(3)$ is the end-effector twist, and $\langle\!\langle \cdot, \cdot \rangle\!\rangle$ is a metric on the Riemannian manifold. The symbol $\frac{D}{dt}$ here denotes the covariant derivative of a tangent vector along a curve [10]. Only the case where the final position is left free and is part of the minimization problem is considered. The


first variation of this functional is calculated, taking special care of the free endpoint conditions. More details can be found in [53]. Experimental validations are in progress. Nevertheless, preliminary results show that this kind of task requires high peak currents in the motors to toss the dough for more than 10 cm. The motors of the RoDyMan robot do not have such capabilities and must be improved, as highlighted in the next section.
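The weighted jerk-acceleration minimisation can be illustrated in the flat, one-dimensional case, where the problem reduces to an equality-constrained quadratic program. The discretised sketch below fixes the initial position, velocity and acceleration and the final velocity and acceleration, leaves the final position free, and uses purely illustrative boundary values and weights; it is not the SE(3) covariant formulation of [53].

```python
import numpy as np

N, T = 200, 1.0                      # grid size and time horizon (illustrative)
dt = T / (N - 1)
alpha, beta = 0.7, 0.3               # weights of the jerk and acceleration terms

def diff_matrix(order):
    """Finite-difference operator of the given order on the grid."""
    D = np.eye(N)
    for _ in range(order):
        D = (D[1:] - D[:-1]) / dt
    return D

D2, D3 = diff_matrix(2), diff_matrix(3)
H = alpha * D3.T @ D3 + beta * D2.T @ D2            # quadratic cost Hessian

# Constraints: x(0), x'(0), x''(0), x'(T), x''(T); final position stays free
A = np.vstack([np.eye(N)[0], diff_matrix(1)[0], D2[0], diff_matrix(1)[-1], D2[-1]])
b = np.array([0.0, 0.2, 0.0, -0.5, 0.0])            # illustrative boundary values

# Solve the equality-constrained QP through its KKT system
KKT = np.block([[H, A.T], [A, np.zeros((5, 5))]])
rhs = np.concatenate([np.zeros(N), b])
x = np.linalg.solve(KKT, rhs)[:N]
print("free final position chosen by the optimiser:", x[-1])
```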

5 Discussion and Future Work

As sketched in the introduction, the illustrative example of manipulating a pizza dough is a media expedient. It is clear that if a robot can manipulate a pizza dough, it should be able to perform other skilful manipulation tasks. Many achievements have been reached by the RoDyMan project. Nevertheless, many improvements still need to be tackled. For instance, the RoDyMan robot must be revised to cope with the issues caused by the high velocities required by some nonprehensile manipulation tasks. Friction is, in general, the main problem affecting the actuators of the RoDyMan robot. More accurate parametric estimation and robust controllers are essential. Besides, many assumptions are often made within a nonprehensile manipulation task to avoid its almost intrinsic non-smooth behaviour. Such assumptions make a nonprehensile manipulation system resemble a prehensile one. The proof that the designed controller does not violate such assumptions is usually performed a posteriori. To avoid this, the contact forces should be directly addressed within the model-based approach. This is indeed a future research direction, leading to non-smooth and hybrid controllers.

Future developments of the RoDyMan project also include the application of the reached goals in other domains. For instance, the perception of elastic objects is currently being applied in the medical context to the shape variations of muscles and organs. The manipulation performed while tossing the deformable dough is currently under investigation to improve the automation of gluing the lower surfaces of shoes. From a theoretical point of view, the RoDyMan project may also be improved through the search for a general unified framework guiding the design of ad-hoc controllers for each nonprehensile manipulation primitive. The pH formalism and PBC can be used to solve holonomic nonprehensile rolling, but this is not enough. Moreover, a high-level supervisor enabling the correct switching between the different nonprehensile manipulation primitives must be addressed. A recent attempt is carried out in [72]. Learning-based approaches may be helpful to design such a supervisor, since task simplification and human-inspired control strategies may be the key to fulfilling the whole complex nonprehensile manipulation task.

Acknowledgements. The research leading to these results has been supported by the RoDyMan project, which has received funding from the European Research Council FP7 Ideas under Advanced Grant agreement number 320992.


References 1. Amjid MR, Shehzad A, Hussain S, Shabbir MA, Khan MR, Shoaib M (2013) A comprehensive review on wheat flour dough rheology. Pak J Food Sci 23:105–123 2. Anshelevich E, Owens S, Lamiraux F, Kavraki LE (2000) Deformable volumes in path planning applications. In: 2000 IEEE international conference on robotics and automation, San Francisco, CA, pp 2290–2285 3. B¨ atz G, Mettin U, Schimdts A, Scheint M, Wollherr D, Shiriaev A (2010) Ball dribbling with an underactuated continuous-time control phase: theory & experiments. In: 2010 IEEE/RSJ international conference on intelligent robots and systems, Taipei, TW, pp 2890–2895 4. B¨ atz G, Yaqub A, Wu H, Kuhnlenz K, Wollherr D, Buss M (2010) Dnamic manipulation: nonprehensile ball catching. In: 18th mediterranean conference on control and automation, Marrakech, MA, pp 365–370 5. Bay H, Ess A, Tuytelaars T, Van Gool L (2006) Surf: speeded up robust features. In: European conference on computer vision, Graz, AT, pp 404–417 6. Bender J, Koschier D (2015) Divergence-free smoothed particle hydrodynamics. In: 14th ACM SIGGRAPH/eurographics symposium on computer animation, Los Angeles, CA, USA, pp 147–155 7. Bender J, Koschier D (2017) Divergence-free SPH for incompressible and viscous fluids. IEEE Trans Vis Comput Graph 23:1193–1206 8. Bittanti S, Laub A, Willems J (1991) The Riccati equation. Springer, New York 9. Bloch AM, Crouch PE (1995) Nonholonomic control systems on riemannian manifolds. SIAM J Control Optim 33(1):126–148 10. Boothby WM (1986) An introduction to differentiable manifolds and Riemannian geometry, vol 120. Academic Press 11. Cacace J, Finzi A, Lippiello V, Loianno G, Sanzone D (2015) Aerial service vehicles for industrial inspection: task decomposition and plan execution. Appl Intell 42:49– 62 12. Canny J (1986) A computational approach to edge detection. IEEE Trans Pattern Anal Mach Intell PAMI 8(6):679–698 13. Chitta S, Sucan I, Cousins S (2012) Moveit! [ros topics]. IEEE Robot Autom Mag 19:18–19 14. Chung TJ (2010) Computational fluid dynamics, 2nd edn. Cambridge University Press, Cambridge 15. Cretu AM, Payeur P, Petriu E (2008) Neural network mapping and clustering of elastic behavior from tactile and range imaging for virtualized reality applications. IEEE Trans Instrum Meas 57(9):1918–1928 16. Donaire A, Ruggiero F, Buonocore L, Lippiello V, Siciliano B (2016) Passivitybased control for a rolling-balancing system: the nonprehensile disk-on-disk. IEEE Trans Control Syst Technol 25(6):2135–2142 17. Eymard R, Gallou¨et TR, Herbin R (2000) The finite volume method. In: Handbook of numerical analysis, vol. 7, pp 713–1020 18. Frank B, Schmedding R, Stachniss C, Teschner M, Burgard W (2010) Learning the elasticity parameters of deformable objects with a manipulation robot. In: 2010 IEEE/RSJ international conference on intelligent robots and systems, Taipei, TW, pp 1877–1883 19. Frank B, Stachniss C, Abdo N, Burgard W (2011) Efficient motion planning for manipulation robots in environments with deformable objects. In: 2011 IEEE/RSJ international conference on intelligent robots and systems, San Francisco, CA, USA, pp 2180–2185


20. Frisken S, Gibson F, Mirtich B (1997) A survey of deformable modeling in computer graphics. Techincal report, Brigham and Women’s Hospital 21. Gonzalez RC, Woods RE (2006) Digital image processing, 3rd edn. Prentice-Hall Inc., Upper Saddle River 22. Guti´errez-Giles A, Ruggiero F, Lippiello V, Siciliano B (2017) Modelling and control of a robotic hula-hoop system without velocity measurements. In: 20th world congress of the international federation of automatic control, Toulouse, FR, pp 9808–9814 23. Guti´errez-Giles A, Ruggiero F, Lippiello V, Siciliano B (2018) Nonprehensile manipulation of an underactuated mechanical system with second-order nonholonomic constraints: the robotic hula-hoop. IEEE Robot Autom Lett 3(2):1136–1143 24. Harris C, Stephens M (1988) A combined corner and edge detector. In: 4th Alvey vision conference, Manchester, UK, pp 147–151 25. Hartley R, Zisserman A (2004) Multiple view geometry in computer vision, 2nd edn. Cambridge University Press, Cambridge 26. Higashimori M, Utsumi K, Kaneko M (2008) Dexterous hyper plate inspired by pizza manipulation. In: 2008 IEEE international conference on robotics and automation, Pasadena, CA, USA, pp 399–406 27. Higashimori M, Utsumi K, Omoto Y, Kaneko M (2009) Dynamic manipulation inspired by the handling of a pizza peel. IEEE Trans. Robot. 25(4):829–838 28. Higashimori M, Yoshimoto K, Kaneko M (2010) Active shaping of an unknown rheological object based on deformation decomposition into elasticity and plasticity. In: 2010 IEEE international conference on robotics and automation, Anchorage, AK, USA, pp 5120–5126 29. Irving G, Teran J, Fedkiw R (2004) Invertible finite elements for robust simulation of large deformation. In: 2004 ACM SIGGRAPH/eurographics symposium on computer animation, Grenoble, FR, pp 131–140 30. Kato H, Billinghurst M (1999) Marker tracking and HMD calibration for a videobased augmented reality conferencing system. In: 2nd IEEE and ACM international workshop on augmented reality, San Francisco, CA, USA, pp 85–94 31. Lang J (2004) An acquisition method for interactive deformable models. In: Second international conference on creating, connecting and collaborating through computing, Kyoto, JP, pp 165–170 32. Lippiello V, Ruggiero F, Siciliano B (2016) The effects of shapes in input-state linearization for stabilization of nonprehensile planar rolling dynamic manipulation. Robot Autom Lett 1(1):492–499 33. Liu GR, Zhang J, Lam KY, Li H, Xu G, Zhong ZH, Li GY, Han X (2008) A gradient smoothing method (GSM) with directional correction for solid mechanics problems. Comput Mech 41(3):457–472 34. Liu MB, Liu GR (2010) Smoothed particle hydrodynamics (SPH): An overview and recent developments. Arch. Comput. Methods in Eng. 17(1):25–76 35. Lowe DG (1999) Object recognition from local scale-invariant features. In: International conference on computer vision, Kerkyra, GR, vol 2, pp 1150–1157 36. Lynch KM, Mason MT (1999) Dynamic nonprehensile manipulation: controllability, planning, and experiments. Int J Robot. Res. 18(1):64–92 37. Lynch KM, Murphey TD (2003) Control of nonprehensile manipulation. In: Bicchi A, Prattichizzo D, Christensen H (eds) Control problems in robotics, vol 4. Springer tracts in advanced robotics. Springer, Heidelberg, pp 39–57 38. Mitsoulis E (2008) Numerical simulation of calendering viscoplastic fluids. J NonNewton Fluid Mech 154:77–88


39. Mitsoulis E, Hatzikiriakos SG (2009) Rolling of bread dough: experiments and simulations. Food Bioprod Process 87(2):124–138 40. Monaghan JJ (1992) Smoothed particle hydrodynamics. Annu. Rev. Astron. Astrophys. 30:543–574 41. Monaghan JJ (1994) Simulating free surface flows with SPH. J. Comput. Phys. 110(2):399–406 42. Monaghan JJ (2005) Smoothed particle hydrodynamics. Rep Progress Phys 68:1703–1759 43. Murray RM, Li Z, Sastry S (1994) A mathematical introduction to robotic manipulation. CRC Press 44. Nealen A, M¨ uller M, Keiser R, Boxerman R, Carlson M (2005) Physically based deformable models in computer graphics. Eurographics: State of the Art Report, pp 71–94 45. Peer A, Ihmsen M, Cornelis J, Teschner M (2015) An implicit viscosity formulation for SPH fluids. ACM Trans Graph 34(4):1–10 46. Petit A, Lippiello V, Siciliano B (2015) Real-time tracking of 3D elastic objects with an RGB-D sensor. In: 2015 IEEE/RSJ international conference on intelligent robots and systems, Hamburg, GE, pp. 3914–3921 47. Phillips-Grafflin C, Berenson D (2014) A representation of deformable objects for motion planning with no physical simulation. In: 2014 IEEE International conference on robotics and automation, Hong Kong, CN, pp 98–105 48. Prattichizzo D, Trinkle J (2016) Grasping. In: Siciliano B, Khatib O (eds) Springer handbook of robotics, pp 955–988. Springer 49. Reist P, D’Andrea R (2012) Design and analysis of a blind juggling robot. IEEE Trans Robot 28(6):1228–1243 50. Reznik DS, Canny JF (2001) C’mon part, do the local motion! In: 2001 IEEE international conference on robotics and automation, vol 3, Seul, KR, pp 2235– 2242 51. Rosten E, Drummond T(2006) Machine learning for high-speed corner detection. In: European conference on computer vision, Graz, AT, pp 430–443 52. Ruggiero F, Lippiello V, Siciliano B (2018) Nonprehensile dynamic manipulation: a survey. IEEE Robot Autom Lett 3(3):1711–1718 53. Satici A, Ruggiero F, Lippiello V, Siciliano B (2016) A coordinate-free framework for robotic pizza tossing and catching. In: 2016 IEEE international conference on robotics and automation, Stockholm, SE, pp 3932–3939 54. Serra D, Ferguson J, Ruggiero F, Siniscalco A, Petit A, Lippiello V, Siciliano B (2018) On the experiments about the nonprehensile reconfiguration of a rolling sphere on a plate. In: 26th mediterranean conference on control and automation, Zadar, HR 55. Serra D, Ruggiero F, Donaire A, Buonocore L, Lippiello V, Siciliano B (2019) Control of nonprehensile planar rolling manipulation: a passivity based approach. IEEE Trans Robot 35(2):317–329. https://doi.org/10.1109/TRO.2018.2887356 56. Serra D, Satici A, Ruggiero F, Lippiello V, Siciliano B (2016) An optimal trajectory planner for a robotic batting task: the table tennis example. In: 13th international conference on informatics in control, automation and robotics, Lisbon, PT, pp 90–101 (2016) 57. Shao S, Lo E (2003) Incompressible SPH method for simulating newtonian and non-newtonian flows with a free surface. Adv Water Resour 26(7):787–800 58. Shiriaev A, Perram J, Canudas-de Wit C (2005) Constructive tool for orbital stabilization of underactuated nonlinear systems: virtual constraints approach. IEEE Trans Autom Control 50(8):1164–1176


59. Sofou S, Muliawan EB, Hatzikiriakos SG, Mitsoulis E (2008) Rheological characterization and constitutive modeling of bread dough. Rheologica Acta 47(4):369–381 60. Sousa CD, Cortes˜ ao R (2014) Physically feasibility of robot base inertial parameters identification: a linear matrix inequality approach. Int J Robot Res 33(6):931– 944 61. de Souza Andrade LF, Sandim M, Petronetto F, Pagliosa P, Paiva A (2014) SPH fluids for viscous jet buckling. In: 27th conference on graphics, patterns and images, Rio de Janeiro, BR, pp 65–72 62. de Souza Andrade LF, Sandim M, Petronetto F, Pagliosa P, Paiva A (2015) Particle-based fluids for viscous jet buckling. Comput Graph 52:106–115 63. Spong MW (1994) Partial feedback linearization of underactuated mechanical systems. In: IEEE/RSJ/GI international conference on intelligent robots and systems, Munich, DE, pp 314–321 64. Takahashi T, Dobashi Y, Fujishiro I, Nishita T, Lin MC (2015) Implicit formulation for SPH-based viscous fluids. Comput Graph Forum 34(2):493–502 65. Tokumoto S, Hirai S (2002) Deformation control of rheological food dough using a forming process model. In: 2002 IEEE international conference on robotics and automation, Washington, DC, USA, pp 1457–1464 66. Van Bockstaele F, De Leyn I, Eeckhout M, Dewettinck K (2008) Rheological properties of wheat flour dough and the relationship with bread volume. i. creeprecovery measurements. Cereal Chem J 85(6):753 67. Vose TH, Umbanhowar P, Lynch KM (2009) Friction-induced lines of attraction and repulsion for parts sliding on an oscillated plate. IEEE Trans Autom Sci Eng 6(4):685–699 68. Vose TH, Umbanhowar P, Lynch KM (2012) Sliding manipulation of rigid bodies on a controlled 6-dof plate. Int J Robot Res 31(7):819–838 69. Vose T, Umbanhowar P, Lynch K (2009) Friction-induced velocity fields for point parts sliding on a rigid oscillated plate. Int J Robot Res 28(8):1020–1039 70. Wada T, Hirai S, Kawamura S, Kamiji N (2001) Robust manipulation of deformable objects by a simple PID feedback. In: 2001 IEEE international conference on robotics and automation, Seoul, KR, pp 85–90 71. Wieser H (2007) Chemistry of gluten proteins. Food Microbiol 24(2):115–119 72. Woodruff J, Lynch K (2017) Planning and control for dynamic, nonprehensile, and hybrid manipulation tasks. In: 2017 IEEE international conference on robotics and automation, Singapore, pp 4066–4073 73. Zienkiewicz O, Taylor R, Zhu J (2005) The finite element method set, 6th edn. Butterworth-Heinemann, Oxford

Towards a Novel Maintenance Support System Based On mini-terms: Mini-term 4.0

Eduardo García¹, Nicolás Montes²(B), and Mónica Alacreu²

¹ Ford Spain, Polígono Industrial Ford S/N, 46440 Almussafes, Valencia, Spain
[email protected]
² Department of Mathematics, Physics and Technological Sciences, University CEU Cardenal Herrera, C/ San Bartolomé, 55, Alfara del Patriarca, Valencia, Spain
{nicolas.montes,monica.alacreu}@uchceu.es

Abstract. This paper presents how to design a novel Maintenance Support System (MSS) to prognosticate breakdowns in production lines based on mini-terms, known as Mini-term 4.0. The system is based on real-time monitoring of the sub-cycle time (mini-term) and on how the mini-term variability can be used as a fault detection indicator. Mini-terms and micro-terms were introduced in our previous work as a machine subdivision. A mini-term subdivision can be selected by the expert team for several reasons, such as the replacement of a machine part or simply to analyze the machine more adequately. A micro-term is a component of a mini-term and it can be as small as the user wishes. Mini-terms are able to detect the same physical deterioration phenomena as common sensors, but with an important advantage: they are easy and cheap to install. They are cheap because they do not require any additional hardware to measure the sub-cycle time, just the PLC and the sensors already installed for the automated production process, and easy because they only require coding extra timers into the PLC. Mini-terms are nowadays deployed at Ford factories around the world, where a learning process is established to enrich the knowledge of the system. The system detects change points and sends an e-mail to the maintenance workers. They repair the machine and report the detected pathology back to the system. Real cases are shown at the end of the paper.

Keywords: Knowledge-driven support system · Maintenance prognosis · mini-term · Change point · Mini-term 4.0

1 Introduction

A production line is composed of a set of sequential operations established in a factory whereby materials are put through a refining process to produce an end-product.


During the lifespan of the line, which can be decades, the throughput depends on a number of parameters: maintenance policy, downtime events, machine breakdowns, deteriorating systems, dynamic bottleneck behavior, bowl phenomenon, market demand, etc. There are open questions that are not treated in depth in the literature, which produces an enormous gap between academic theory and real plant problems and brings up a considerable number of research topics; maintenance and replacement problems of deteriorating systems are among them. Maintenance operations have a direct influence on production performance in manufacturing systems. Maintenance task prioritization is crucial, especially when the availability of maintenance resources is limited. Generally, maintenance can be categorized into two major classes: corrective maintenance (CM) and preventative maintenance (PM). CM is performed when a machine fails. It usually involves replacing or repairing the component that is responsible for the failure of the overall system. PM, in contrast, is performed before machine failure. The objective of PM is to achieve continuous system production. In a condition-based maintenance framework, a deterioration indicator that correctly describes the dynamics of the failure process is required. Usually, this indicator can be constructed from information collected on various deterioration-related monitoring parameters such as vibration, temperature, noise levels, etc. However, the need for continuous monitoring may increase the system costs when expensive monitoring devices are required [3]. In fact, that is the main drawback of PM based on these techniques.

Over the last two decades, numerous prognostic approaches have been developed. Prognostics is a major scientific challenge for the industrial implementation of maintenance strategies, in which remaining useful life (RUL) estimation is an important task. For environmental, economic and operational reasons, prognostics and remaining useful lifetime prediction attract great interest. In the framework of prognostics and health management (PHM), many prognostic techniques exist and they are basically classified into three principal classes: data-driven approaches, model-based approaches and experience-based approaches; they can also be classified into two groups, non-probabilistic methods and probabilistic methods, see [4]. In non-probabilistic methods the deterioration phenomenon is not treated as random and in many cases the deterioration is described as fuzzy. In probabilistic methods, the deterioration phenomenon is considered to be random and is modelled with stochastic tools. In this case the prognosis is based on the future behavior of the stochastic deterioration process and can give results in terms of probabilities, see [4].

1.1 Previous Works: From Micro-term to Long-term

The data used in the analysis of production lines are classified into long-term and short-term. Long-term data are mainly used for process planning, while short-term data focus primarily on process control. Following the definition in [7], short-term refers to an operational period not long enough for the machine failure behavior to be described by a statistical distribution. The machine cycle time is


Fig. 1. A pyramid of terms, [22].

considered short-term. In [9,17], short-term is redefined into two new terms, mini-term and micro-term, see Fig. 1. A mini-term can be defined as a machine part that, under a predictive maintenance policy or after a breakdown, can be replaced more easily and quickly than any other subdivision of the machine. Furthermore, a mini-term can be defined as a subdivision that allows us to understand and study the machine behavior. These sub-cycle times (micro-terms and mini-terms) are not the same at each repetition; they follow a probabilistic distribution with mean value μ and standard deviation σ. In addition, the probabilistic sub-cycle time of each machine component varies during the lifespan of the component. In other words, the deterioration indicators that can be measured with thermal cameras, vibration and ultrasonic devices also have an effect on the machine cycle time. In most cases, the measurement of these sub-cycle times does not imply any additional cost, because the actuators and sensors that allow the sub-cycle time measurement are already installed in the machine and are used for its automated operation.

1.2 Maintenance Support Systems and Industry 4.0

In modern manufacturing, real-time control of production operations to improve system responsiveness, increase system efficiency and reduce downtime is becoming more and more critical. More recently, this concept is moving towards the concept of "Industry 4.0", the current trend of automation and data exchange in manufacturing technologies. It includes cyber-physical systems, the internet of things and cloud computing, creating what has been called a "smart factory". Most manufacturing industries have sought to improve productivity and quality with these new techniques, where data-driven decision support systems (DSS) appear as a


new paradigm. In [16] we can find a recent review on the use of DSS in manufacturing. One type of data-driven DSS is the so-called knowledge-driven DSS. It has its origin in intelligent decision support systems or, in a broader sense, in artificial intelligence (AI) [10,11]. Knowledge-driven DSS are computer-based reasoning systems in which AI technologies, management expert systems, data mining technologies and communication mechanisms are integrated. Intelligent DSS have gone through several evolutionary developments. One of them concerns rule-based expert systems [20]. These systems are based on the use of heuristics, which can be understood as strategies that lead to the correct solution of the problem. For these systems it is always necessary to use human expert knowledge collected in a database [12]. Knowledge-driven DSS have been successfully applied in many domains, such as market management [14] or radiation therapy treatment planning [15].

1.3 Mini-term Degradation Path: A Change Point

Prediction and analysis of degradation paths are important for condition-based maintenance (CBM). It is well known that degradation paths are non-linear. This means that a sudden change point appears in the degradation path when the remaining useful life (RUL) is near its end, see [18,19]. Before the change point the component works in optimal conditions, and after the change point the component works in degraded conditions, announcing that failure is near, see Fig. 2.

Fig. 2. Change point.

A change point in the physical part of a machine component produces a similar effect on the sub-cycle time, that is, a change point in the mini-term; Fig. 11 shows two examples measured at the Ford Almussafes factory. The first one is a change point produced in the mini-term due to the deterioration of a welding clamp proportional valve. The second one is a cylinder with an inner leakage, also in a welding clamp. These change points in the mini-terms can be detected using


common data analysis techniques, see [18,19]. When a change point in the mini-term is detected, an alarm must be activated so that the maintenance workers can replace the component as soon as possible. There are some questions to be solved about the use of mini-terms in maintenance systems:

– What kind of pathology produces the change point?
– How does it affect the production rate?
– How many times does the maintenance worker have to replace the component before breakdown?

Our research work tries to answer these questions. Therefore, the present study proposes a novel protocol on how to design a knowledge-driven Maintenance Support System (MSS) for this purpose. Based on real-time measurement of sub-cycle times, with the help of expert knowledge and the statistical analysis of the measurements, we can define rules that allow us to predict the deterioration of a particular machine part or component. In addition, based on the same statistical analysis and numerical simulation techniques, we are able to find out the loss of production caused by the component deterioration.

The present paper is organized as follows. Section 2 addresses the first question, what kind of pathology produces the change point. For that purpose, a single welding station is tested for different pathologies, and the time series are analyzed by means of statistical analysis, allowing us to define the rules of our MSS. Section 3 addresses the second question, how it affects the production rate. For that purpose, the welding line is modeled and simulated to compute the loss of production rate when a change point occurs in a welding station. Section 4 presents Mini-term 4.0, the first step towards the generalization of the mini-term for failure prognosis, and its advantages for Industry 4.0. Section 5 presents the conclusions, with an emphasis on future research challenges, which are mainly focused on answering the third question, how many times the maintenance worker has to replace the component before breakdown.

2 What Kind of Pathology Produces the Change Point?

2.1 Welding Line at Ford Almussafes Factory

In order to answer the above questions, a real welding line located at Ford Almussafes (Valencia) was used in [22], see Fig. 3. In a real welding line like this there are welding workstations, each of which has welding stations working in parallel and sometimes in series. Each welding station makes several welding points in the same cycle time. It is possible to find 1, 2, 4 or even 6 welding stations in the same workstation, where each one makes up to 19 welding points. In our particular case, the welding line has 8 workstations, where workstations 1, 5 and 6 have 4 welding units, workstations 2, 4, 7 and 8 have 6 welding units and workstation 3 has 1 welding unit. The welding line was installed in 1980. The staff group that designed the line defined the maximum running capacity, the ECR (engineering running capacity), as 60 JPH (Jobs Per Hour). However, the


plant engineers have another maximum running capacity, the ERR (engineering running rate), in this case defined as 51 JPH. Nowadays, this line welds 68 different models and variants: different car models with 3 or 5 doors, with or without a solar roof, etc. Obviously, from 1980 to today the line has undergone a lot of changes and updates; new models and variants have appeared and old ones have disappeared, more advanced robot arms and welding units have been introduced, etc. Therefore, the line is re-balanced, when possible, whenever a new update occurs.

Fig. 3. Welding line at Ford Almussafes (Valencia).

2.2 A Test Bench for a Welding Station

The behavior of the welding station, Fig. 4, is simple. First, the robot arm moves the welding clamp to the point to be welded. Then, a pneumatic cylinder moves the welding clamp in two phases: one to bring the clamp closer and a second one to weld. The pressure applied by the clamp is controlled by a control system. The robot arm and the welding clamp need a certain time to perform their tasks, and their components also need a certain time to perform their own tasks. In [22] the welding station was divided into three mini-terms: the robot arm, the welding clamp motion and the welding task.


Fig. 4. Welding station, [22].

2.3 Pathologies Analyzed

The welding station, as well as other stations in industry, is bound to suffer from pathologies that produce an effect on the cycle time. Based on the operators' experience, the most common ones for the experimental welding station were selected in [22]. These pathologies produce a cycle-time modification but do not produce a failure of the component, so they go unnoticed by the maintenance workers and also by the control system; in other words, they correspond to the region after the change point and before the failure of the component, see Fig. 2. The pathologies rated are, for the welding clamp mini-term, the proportional valve, the cylinder stiffness, welding failure produced by the transformer and pressure loss, and, for the robot arm mini-term, the robot arm speed. A brief description of each one is given below:

– Pathology 1 (P1) (Proportional valve): This valve transmits the pressure to the cylinder and is managed by the controller. It is responsible for maintaining the proper pressure in the cylinder. During its lifetime, its components suffer fatigue that produces stiffness in some of them. This condition creates a time delay. When the valve is too deteriorated it cannot transmit enough pressure to the cylinder and the welding task is not possible.
– Pathology 2 (P2) (Cylinder stiffness): A critical term in the welding task is the pressure applied on the metal sheets. This force is necessary to ensure good electrical contact between the parts to be welded and to hold the parts fixed until the metal forming the joint has time to solidify. The elements responsible for transmitting the proper pressure to these plates are the clamp cylinders. If one of the cylinders is worn, has galling or there is internal communication through the stem, a time delay is produced. Maintenance workers detect this pathology when the cylinder cannot transmit enough pressure on the metal sheets and the welding task cannot be performed.
– Pathology 3 (P3) (Welding failure): The welding process consists of passing an intense electric current through the metal parts to be joined. The device generally used for this task is a transformer. The fatigue of this component is mainly produced by the loss of wire insulation.


A modification is produced in the value of the insulation resistance, which generates a current reduction that affects the welding time. Maintenance workers detect this pathology when the welding task cannot be performed due to failure.
– Pathology 4 (P4) (Pressure loss): One of the most common delays is produced by pressure losses in the pneumatic circuit. The pressure drop causes a delay or malfunction in the pneumatic devices to be operated. This pathology can be produced by many causes, such as a simple pore in the circuit or a failure in the compressor. Maintenance workers detect this failure when the low-pressure alarm is triggered.
– Pathology 5 (P5) (Robot speed): Common industrial robots have 6 axes. All these axes are synchronized to reach the points defined by the program in order to perform its function or task. If there is a failure in the operation, it causes a motor speed reduction that directly affects the process cycle time. There are several reasons why this pathology occurs. Industrial robot arms require high-speed and highly accurate operation. However, in the case of high-speed operations, strong jerks, i.e., rapid changes of acceleration, often arise. Jerk causes a deterioration of control performance, such as vibration at the tip of the robot arm. Jerk forces are not equally distributed, and as the robot arm performs the same movement again and again, the deterioration becomes localized in a particular joint. Mechanical structure deterioration or the deterioration of electrical parts also affects the speed. This pathology is very difficult to detect by maintenance workers because it does not produce a breakdown of the machine and, as the robot moves at high speed, it is nearly impossible to detect without a specific procedure.

2.4 Rules Definition Based on Statistical Analysis

The experimental samples were analyzed in [22] to understand how the pathologies affect the cycle time and to generate rules that allow us to define our MSS. Figure 5 shows the descriptive analysis of the data, where the test performed without pathology is labelled "C" (control test) and the behavior with one of the pathologies is labelled P1...P5, that is, six different situations for each mini-term. It is evident that the cycle time for the mini-terms with a pathology differs from the control or basal situation. The statistical tests used in [22] were Shapiro–Wilk, Levene, ANOVA, Kruskal–Wallis and a variance test. For all the tests, the significance level was α = 0.05. With this, statistical rules were obtained that allow the pathologies to be differentiated. Table 1 shows a summary of the rules. The first two rows are rules that classify mean and variance values according to the pathology. The third row shows threshold values to determine whether there is a pathology or not, and the last row shows extra rules, for instance, that when pathology 4 occurs the data do not pass the normality test.

Table 1. Rules for the Knowledge-driven MSS. Welding station case.

Robot motion mini-terms
  Mean rule:         μ_C = μ_P1 = μ_P2 = μ_P3 = μ_P4 < μ_P5
  Variance rule:     –
  Stand. desv. rule: S > 25.4 · 10⁻³ → All
  Normality rule:    –

Welding motion mini-terms
  Mean rule:         μ_C = μ_P5 < μ_P1 < μ_P3 < μ_P2 < μ_P4
  Variance rule:     σ²_C = σ²_P1 = σ²_P3 = σ²_P5 < σ²_P2 < σ²_P4
  Stand. desv. rule: S ∈ [47 · 10⁻⁴, 74 · 10⁻⁴] → P2, P4
  Normality rule:    P4 fails

Welding task mini-terms
  Mean rule:         μ_P2 < μ_P4 < μ_C = μ_P3 < μ_P5 < μ_P1
  Variance rule:     σ²_C = σ²_P3 = σ²_P5 < σ²_P2 = σ²_P4 < σ²_P1
  Stand. desv. rule: S > 12.9 · 10⁻³ → P1, P2, P4
  Normality rule:    P1 fails

Fig. 5. Descriptive analysis.


3 How Does It Affect the Production Rate?

The way in which the deterioration of the mini-terms affects the production rate was analyzed in [22]. The analysis was focused on the welding line located at Ford Almussafes, Valencia. The welding line was modeled taking into account the mini-term subdivision, that is, the motion of the robot arm (the time during which the robot is in movement) and the number of welding points for all 68 different models and variants, see [9], Annex 4. Also in [9], Annex 4, we can see the offset time during which a particular robot waits for another robot in the same workstation, and the transfer time, the time required to move the car body to the next workstation (12 s). With this model and using numerical simulation, the productivity rate was re-computed in [22], taking into account the mini-term variability. Figure 6 shows the simulation results. As expected, there is variability between pathologies. While pathology 3 has a negligible effect, with throughput close to the ERR of the line, pathology 1 has a strong effect, reducing the production rate by around 32 JPH. The simulation results also show the variability obtained when the pathology appears in a different welding station, around 4 JPH. This behavior is due to the dynamic behavior of the bottleneck.

Fig. 6. Welding line simulation.

The information provided by the simulation can be useful to prioritize the maintenance tasks after a change point, with the goal of minimizing the loss of productivity.
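A trivial way to exploit the simulated losses is to rank pending interventions by their expected throughput impact. The snippet below is only a toy illustration with hypothetical station assignments; the JPH figures echo the orders of magnitude quoted above but are not the simulated values from the welding-line model.

```python
# Toy prioritisation of pending maintenance actions by simulated throughput loss
pending = {
    ("station 3", "P1 proportional valve"): 32.0,   # estimated JPH loss (illustrative)
    ("station 7", "P3 welding failure"): 1.5,
    ("station 5", "P4 pressure loss"): 12.0,
}
for (station, pathology), jph_loss in sorted(pending.items(), key=lambda kv: -kv[1]):
    print(f"{station}: {pathology} (expected loss {jph_loss} JPH)")
```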

4 Mini-term 4.0. A New Paradigm for Failure Prognosis

As briefly explained in the introduction, "Industry 4.0" is the current trend of automation and data exchange in manufacturing technologies. It includes cyber-physical systems, the internet of things and cloud computing, creating what has been called


a "smart factory". Following this trend, the ideal way to perform maintenance prognosis would be to do it continuously and automatically. However, as indicated in [21], this is very expensive since many sensors and devices are needed to carry it out. The sensors most used for maintenance prognosis measure vibration, noise, temperature, pressure, flow, etc. Fortunately, as we have explained in [9,22], mini-terms are able to detect the same physical deterioration phenomena as common sensors, but with an important advantage: they are easy and cheap to install. They are cheap because they do not require any additional hardware installation to measure the sub-cycle time, just the use of the PLC and the sensors installed for the automated production process, and they are easy because we only need to code extra timers into the PLC.

4.1 Mini-term 4.0 Installation Setup

The results presented in [22] generated great expectations at Ford Motor Company, which has led to the standardization of the mini-terms and the intention to implement them in all the factories owned by Ford all over the world. Nowadays the system is being installed at the Ford factory in Valencia. As explained before, one of the main drawbacks for Industry 4.0 is the cost of introducing sensors into machines and how to integrate them with the system installed in the production line. In big manufacturing companies like Ford, there are a lot of memory and I/O restrictions for the PLC. Everything is standardized with a lot of protocols for all the plants around the world. Hence, the success of any Industry 4.0 technique depends mainly on its intrusiveness in the existing production lines. In our particular case, the standardization consists in reserving memory space for the mini-term measurements in the standard that Ford has for PLC coding. Nowadays, we can measure the mini-terms for whatever element Ford has in its factories. In the same way, there is a hardware architecture to collect data from the PLC that is also used to collect mini-terms, see Fig. 8. In the first layer there are the PLCs that control the machines and measure the mini-terms. The second layer is an intermediate layer with one single objective: to connect the PLC with the third layer, the database collector. In this sense, there are four possibilities, see Fig. 7:

– the PLC is connected directly to the database collector, Fig. 7(A);
– the PLC and the database collector use a PLC concentrator between them, Fig. 7(B);
– a line PC is used to extract the data from the PLC, Fig. 7(C);
– a dedicated PC extracts the data, Fig. 7(D).

In the third layer, database collectors send the data to a central database collector that is also able to analyze the mini-terms and send messages to the maintenance workers. This database collector is connected to the fourth layer, where the developers and the managers of each plant can supervise and improve the system. The last layer is the internet connection that allows different plants around the world to be connected, as well as the process to be monitored from outside the factory. The whole


system is known as Mini-term 4.0. Figure 8 shows the interface used in the third layer to monitor and analyze the mini-terms.

4.2 Towards a Real-Time Mini-term Watchdog for Failure Prognosis at Ford Motor Company

The process of collecting and analyzing mini-terms started a few months ago at the Almussafes factory. At present, three plants, Body 1, 2 and 3, have hundreds of mini-terms collected in Mini-term 4.0. Table 2 shows the mini-terms collected, the sensors used to measure the time, as well as whether the measurement requires any additional software or hardware installation. As can be seen, the mini-term measurement uses the sensors already employed for the automated machine and a timer in the PLC. Therefore, it is not necessary to install any new hardware or software. Figure 9 shows a layout of the mini-terms located at the Body 2 plant at that moment. The kind and number of mini-terms are increasing continuously.
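Conceptually, a mini-term is nothing more than the elapsed time between two events that the station's existing sensors already generate. The following sketch mimics that timing logic in Python with hypothetical event names; on the real line this is simply an extra timer coded in the PLC.

```python
# Conceptual sketch of mini-term acquisition: time-stamp the existing automation
# signals (e.g. limit switches) and record the elapsed sub-cycle time.
# The event names and data format are hypothetical.
def collect_miniterm(events):
    """events: iterable of (timestamp, signal) pairs coming from the station sensors."""
    start = None
    for stamp, signal in events:
        if signal == "motion_start":
            start = stamp
        elif signal == "motion_end" and start is not None:
            yield stamp - start          # one mini-term sample (sub-cycle time)
            start = None

samples = list(collect_miniterm([(0.00, "motion_start"), (1.92, "motion_end"),
                                 (5.00, "motion_start"), (6.95, "motion_end")]))
print(samples)   # e.g. [1.92, 1.95] seconds
```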

Fig. 7. Architecture for Mini-term 4.0. It collects mini-terms in real-time at Ford factories.


Table 2. Mini-terms monitored at Ford Almussafes (Valencia).

Mini-term                  | Required sensors                 | PLC code | New HW or SW?
Pneumatic welding gun time | Limit sensor                     | Timer    | NO
Elevators time             | Limit sensor                     | Timer    | NO
Cylinder time              | Limit sensor and actuation valve | Timer    | NO
Turn table time            | Limit sensor                     | Timer    | NO
Scissors table time        | Limit sensor                     | Timer    | NO
NutRunners time            | Limit sensor                     | Timer    | NO


The system analyzes the mini-terms and sends an e-mail to the maintenance workers, as shown in Fig. 10, when a change point occurs in one of them. The maintenance worker checks the component and acts if a failure is found. The maintenance team then reports the detected pathology back to Mini-term 4.0. Until now, Mini-term 4.0 has been able to detect a wide variety of pathologies, such as the deterioration of the proportional valve that controls the clamp movement or an internal leak in the clamp cylinder, see Fig. 11. Figure 12 shows the effect on the mini-term of lubricant deterioration in the welding clamp and how, once lubricated correctly, its nominal value is recovered.
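A change-point watchdog of this kind can be prototyped with standard techniques. The sketch below applies a two-sided CUSUM to a synthetic mini-term series; the baseline statistics, thresholds and data are made up for illustration and are not necessarily the detection rule implemented in Mini-term 4.0.

```python
import numpy as np

# Synthetic mini-term series: healthy behaviour followed by a pathology
rng = np.random.default_rng(1)
baseline = rng.normal(2.00, 0.02, 500)       # healthy sub-cycle times [s]
degraded = rng.normal(2.08, 0.05, 200)       # after a pathology appears
cycle_times = np.concatenate([baseline, degraded])

mu, sigma = baseline.mean(), baseline.std()  # learned from a healthy window
k, h = 0.5, 8.0                              # CUSUM slack and alarm threshold
gp = gn = 0.0
for i, ct in enumerate(cycle_times):
    z = (ct - mu) / sigma
    gp = max(0.0, gp + z - k)                # drift upwards (slower cycles)
    gn = max(0.0, gn - z - k)                # drift downwards
    if gp > h or gn > h:
        print(f"change point flagged at sample {i}: notify maintenance")
        break
```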

Fig. 8. Architecture for Mini-term 4.0. It collects mini-terms in real-time at Ford factories.

114

E. Garc´ıa et al.

Fig. 9. Mini-terms collected at Body 2 plant.

Fig. 10. Mail example sent to the maintenance workers.

Mini-term 4.0

115

Fig. 11. Change points in mini-terms. Proportional valve (above). Leak in cylinder (below).

Fig. 12. Change points in mini-terms welding clamp due to lack of lubrication.

5

Conclusions and Future Works

This paper presents shows how a novel Maintenance Support System (MSS) to prognostic breakdowns in production lines, Mini-term 4.0. The system is based on the real time sub-cycle time (mini-term) monitorization and how the miniterm variability can be used as a fault detection indicator. The system is nowadays on a standardization process in a Ford Motor company where thousands of mini-terms and their pathologies are reported and analyzed. The system detects change points and sends an e-mail to the maintenance workers. They repair the machine and report the pathology detected to the system, increasing the knowledge of the system about that. Our immediate future work will be to diagnosticate and prognosticate the RUL (Remaining Useful Life) based on the data collected in the Mini-term 4.0. Acknowledgements. The authors wish to thank Ford Espa˜ na S.L and in particular Almussafes Factory for the support in the present research.

116

E. Garc´ıa et al.

References 1. Battaia O, Dolgui A (2013) A taxonomy of line balancing problems and their solution approaches. Int J Prod Econ 142(2):259–277. https://doi.org/10.1016/j. ijpe.2012.10.020 2. Li L, Djurdjanovic D, Ni J (2007) Maintenance task priorization using data driven bottleneck detection and maintenance opportunity windows. In: ASME 2007 international conference on manufacturing science and engineering, pp 517–523. https://doi.org/10.1115/MSEC2007-31150 3. Jardine AKS, Lin D, Banjevic D (2006) A review on machinery diagnostics and prognostics implementing condition-based maintenance. Mech Syst Signals Process 20(7):1483–1510. https://doi.org/10.1016/j.ymssp.2005.09.012 4. Son KL, Fouladirad M, Barros A, Levrat E, Lung B (2013) Remaining useful life estimation based on stochastic deterioration models: a comparative study. Reliab Eng Syst Safe 112:165–175. https://doi.org/10.1016/j.ress.2012.11.022 5. Chang Q (2005) Supervisory factory control based on real-time production feedback. PhD 6. Leal F, da Silva R, Montevechi J, de Almeida D, Silva F (2011) A practical guide for operational validation of discrete simulation models. Pesquisa Oper 31:57–77. https://doi.org/10.1590/S0101-74382011000100005 7. Li L, Chang Q, Ni J, Biller S (2009) Real time production improvement through bottleneck control. Int J Prod Res 47(21):6145–6158. https://doi.org/10.1080/ 00207540802244240 8. L´ opez CE (2014) Unbalanced workload allocation in large assembly lines. PhD 9. Garcia E (2016) An´ alisis de los sub-tiempos de ciclo t´ecnico para la mejora del rendimiento de las l´ıneas de fabricaci´ on. PhD 10. Negnevitsky M (2005) Artificial intelligence: a guide to intelligent systems. Pearson Education 11. Nemati HR, Steiger DM, Lyer LS, Herschel RT (2002) Knowledge warehouse: an architectural integration of knowledge management, decision support, artificial intelligence and data warehousing. Decis Support Syst 33(2):143–161. https:// doi.org/10.1016/S0167-9236(01)00141-5 12. Chakir A, Chergui M, Elhasnaou S, Medromi H, Sayouti A (2016) A decision approach to select the best framework to treat an it problem by using multi-agent systems and expert systems. In: Advances in ubiquitous networking, pp 499–511. https://doi.org/10.1007/978-981-287-990-540 13. Felsberger A, Bernhard O, Gerald R (2016) A review of decision support systems for manufacturing systems. In: The international conference on knowledge technologies and data-driven business 2016 - i-KNOW 2016, pp 499–511 14. Murthadha M, Banaz A (2013) Knowledge-driven decision support system based on knowledge warehouse and data mining for market management. J Manag Bus Res 13(10):2249–2288 15. Deshpande RR, Demarco J, Sayre JW, Liu BJ (2016) Knowledge-driven decision support for assessing dose distributions in radiation therapy of head and neck cancer. Int J Comput Assistant Radiol Surg 11(11):2071–2083. https://doi.org/10. 1007/s11548-016-1403-6 16. Muhammad SH, Ebrahim Z, Mahmood WHW, Rahman MN (2017) Decision support system classification and its application in manufactoring sector: a review. Int Jurnal Teknologi 79(1):153–163. https://doi.org/10.11113/jt.v79.7689

Mini-term 4.0

117

17. Garcia E, Montes N (2017) A Tensor Model for Automated Production Lines based on Probabilistic Sub-Cycle Times. Nova Science Publishers 18(1):221–234 18. Zhao X, Cai K, Wang X, Song Y (2018) Optimal replacement policies for a shock model with a change point. Comput Ind Eng 118:383–393. https://doi.org/10. 1016/j.cie.2018.03.005 19. Nigro MB, Pakzad SN, Dorvash S (2014) Localized structural damage detection: a change point analysis. Comput Aided Civ Infrastruct Eng 29:416–432. https:// doi.org/10.1111/mice.12059 20. Chakir A, Chergui M, Elhasnaou S, Medromi H, Sayouti A (2016) A decision approach to select the best framework to treat an it problem by using multi-agent system and expert systems. In: Advances in ubiquitous networking, pp 499–511 .https://doi.org/10.1007/978-981-287-990-5 40 21. Ahmad R, Kamaruddini S (2012) An overview of time-based and condition-based maintenance in industrial application. Comput Ind Eng 63(1):135–149. https:// doi.org/10.1016/j.cie.2012.02.002 22. Garcia E, Montes N, Alacreu M (2018) Towards a knowledge-driven maintenance support system for manufacturing lines. In Proceedings of the 15th international conference on informatics in control, automation and robotics (ICINCO 2018), vol 1, pp 43–54.https://doi.org/10.5220/0006834800430054

Automated Nonlinear Control Structure Design by Domain of Attraction Maximization with Eigenvalue and Frequency Domain Specifications Elias Reichensd¨orfer1,3(B) , Dirk Odenthal2 , and Dirk Wollherr3 1

2 3

BMW Group, Knorrstr. 147, 80788 Munich, Germany [email protected] BMW M GmbH, Daimlerstr. 19, 85748 Garching near Munich, Germany [email protected] Chair of Automatic Control Engineering, Technical University of Munich, Theresienstr. 90, 80333 Munich, Germany [email protected]

Abstract. This work proposes a new method for nonlinear control structure design for nonlinear dynamical systems by grammatical evolution. The optimization is based on a novel fitness function which implements three different design objectives. First, the closed loop eigenvalues of the linearized system dynamics are restricted to be located in a predefined region of the complex plane. Second, design specifications on the frequency magnitude of the sensitivity transfer functions of the linearized system dynamics are imposed. Third, the estimated domain of attraction of the nonlinear closed loop system dynamics is maximized by evaluating a quadratic Lyapunov function, obtained by the solution of the Lyapunov equation, with a Monte-Carlo sampling algorithm. Additionally, a general formula for the centroid of a design region for pole placement is derived analytically and embedded in the optimization framework. The generated controllers are evaluated on a common, nonlinear benchmark system. It is shown that the proposed method can efficiently generate control laws which meet the imposed design specifications on the closed loop system while maximizing the domain of attraction. Keywords: Nonlinear control structure design · Lyapunov equation Grammatical evolution · Robust control · Domain of attraction maximization · Frequency domain specifications · Stability centroid

1

·

Introduction

Nonlinear control design and analysis, in general, is a challenging task which might require substantial effort depending on the controlled plant as well as the performance and stability specifications imposed on the closed loop control c Springer Nature Switzerland AG 2020  O. Gusikhin and K. Madani (Eds.): ICINCO 2018, LNEE 613, pp. 118–141, 2020. https://doi.org/10.1007/978-3-030-31993-9_6

Automated Nonlinear Control Structure Design

119

system. While there exists a broad variety of different methods for both linear and nonlinear control design, only few methods exist that aim at finding an optimal structure for a control law, while including common performance specifications from control theory. In particular, we want to make use of the sophisticated and well understood existing measures of performance and robustness from linear control theory and combine these with stability analysis tools available from nonlinear control theory. We propose the use of a novel fitness function for structural optimization by GE (grammatical evolution) in order to automatically generate nonlinear control structures that meet predefined design specifications. The GE algorithm [59] is a variation of GP (genetic programming) [39], both from the field of GA (genetic algorithms) [31]. These approaches not only optimize the parameters of a fixed function structure, but the structure itself as well. Existing approaches on structural controller optimization by GE and GP can be classified into simulation-based, quasi-analytic and hybrid methods. Simulation-based optimization schemes compute the quality of a candidate solution by measures obtained from simulation of the closed loop system. This approach has been applied to different linear and nonlinear control systems [12,19,40,57,62]. Quasi-analytic methods derive general properties of a closed loop system by analyzing its differential equations using numerical and symbolic methods [58,67]. Hybrid methods perform both simulation and analysis of the differential equations during the optimization [21]. The proposed approach of this work falls into the second category, which is based on an automated analysis of the differential equations of the considered control system. We propose an approach which combines eigenvalue and frequency domain specifications for the linearized control system with the maximization of the domain of attraction of the actual, nonlinear system. This paper builds upon and extends our previous work on nonlinear control structure design by grammatical evolution and generalizes different aspects of our optimization framework [58]. In especially, the following items expand our research from [58]: 1. Inclusion of frequency domain design specifications into the optimization. 2. Generalization of the centroid formula for pole placement by Theorem 2. 3. Derivation of a novel fitness function for simultaneous tuning of closed loop eigenvalue, frequency domain and domain of attraction specifications. 4. Evaluation of the proposed method on a different control benchmark. Item 1 enables the designer to shape the frequency response of the system. This is important for applications, since the frequency response characterizes various robustness properties of control systems [2,70]. Ensuring robustness of the closed loop system is an essential aspect of control design and is now made available as a measure of quality in the proposed optimization algorithm. Item 2 gives the designer the ability to define more general regions for pole placement. Depending on the application, it might not be possible to encode the desired locations of closed loop poles using only a single region. Therefore, we generalize our previous approach [58] by allowing multiple and also more generic regions. Item 3

120

E. Reichensd¨ orfer et al.

shows how the proposed ideas are put together in order to include these into the overall optimization framework. Finally, item 4 demonstrates the efficiency of the proposed method on a common benchmark problem from nonlinear control. Figure 1 gives an overview of the proposed algorithm for evaluation of autogenerated control structures, which is embedded into a GE framework. The left branch of the flow diagram, resulting in the Γ -fitness, is for pole placement. The central branch addresses the frequency response of the controlled system, summarized in the B-fitness measure. Finally, the domain of attraction of the nonlinear system is represented by the Ω-fitness measure on the right branch. The overall goal here is to find control laws with high local performance and stability margins (Γ -fitness and B-fitness) when the system is close to its operating point, while maximizing stability guarantees for the actual nonlinear system if it is pushed away from that operating point (Ω-fitness). Such a combination of different concepts for specifications on the closed loop performance and nonlinear stability in the context of structural optimization has not been considered before. We close this gap by defining the objectives in a nonlinear bi-level optimization problem, which is then solved using GE. Start

Integer vector

Translation Section 2.3 Control law k(x)

Linearized equations

Nonlinear System

Nonlinear equations

Section 2.1

Linear System Section 2.2

System matrix

Lyapunov Equation

V (x)

Section 2.2

Section 2.1

Eigenvalues

Γ -stable?

V˙ (x, −k(x)) Yes

Bode Diagram

Level Set Estimation

Section 2.2

Section 2.1

Magnitudes No

B-stable? No

Γ -fitness

Section 3.1

Lie Derivative

B-fitness

Section 3.2

Level set Yes

Domain of Attraction Section 3.3 DoA size

Ω-fitness

Section 3.3

Return

Fig. 1. Overview of the proposed hierarchical fitness function for evaluation of control structures by means of linear and nonlinear system properties within a GE framework.

The remainder of this paper is organized as follows. In Sect. 2, an overview of the required tools from control theory and grammatical evolution is given.

Automated Nonlinear Control Structure Design

121

Section 3 presents the proposed method, describing the novel fitness function from Fig. 1 in detail. In Sect. 4, the method is evaluated on two benchmark examples. The results are discussed and the paper is concluded in Sect. 5.

2

Background

In this section, the required background for the different parts of the proposed algorithm from Fig. 1 and an introduction to optimization with the GE algorithm is provided. In terms of notation, we denote with In the n × n identity matrix, 0n×m the n × m zero matrix, C 1 the set of continuously differentiable functions and X0 ⊆ Rn an open set which contains the origin. 2.1

Stability of Nonlinear Systems and the Domain of Attraction

First, we give an overview of stability analysis of dynamical systems with the first method of Lyapunov [42]. Here, a property of particular interest is definitness. Definition 1 (Definitness). Let V (x) : Rn → R with V (0n×1 ) = 0 and let X = X0 \ {0}. The function V is called positive definite on X0 , denoted by V (x)  0, if V (x) > 0 , ∀x ∈ X. If −V (x)  0, the function is called negative definite, denoted with V (x) ≺ 0. Non-strict inequalities, denoted by V (x)  0 and V (x) 0 are called positive- and negative semi-definite respectively. The direct method of Lyapunov states that a system is stable if and only if there exists a Lyapunov function for the system [42]. Definition 2 (Lyapunov function). Given a dynamical system as x˙ = f (x) and real-valued function V (x) : Rn → R. The function V (x), with x ∈ X0 and V (x)  0, is called a Lyapunov function if for its Lie-Derivative V˙ (x) holds:  ∂V (x) ∂V (x) f (x) = V˙ (x) = fi (x) 0 . ∂x ∂xi i=1 n

(1)

However, a Lyapunov can not only be used to proof stability of an equilibrium, but also for estimating its DoA (domain of attraction). Definition 3 (Domain of attraction). The set of initial conditions x0 n ∗ D = {t0 ∈ R+ 0 , x0 ∈ R : lim ϕ(t, t0 , x0 ) = x } t→∞

(2)

which converge to x∗ is called the domain of attraction of x∗ . Here, ϕ(t, t0 , x0 ) is the solution of x˙ = f (x) for initial condition x0 , starting at t0 . Given a Lyapunov function V (x), a subset of the DoA Ωc ⊆ D can be calculated: Ωc = {x ∈ X0 : V (x) ≤ c , V˙ (x) 0} .

(3)

122

E. Reichensd¨ orfer et al.

A Lyapunov function for which Ωc = D is called maximal Lyapunov function and can be computed for example by solving the Zubov equation [71]. However, the Zubov equation is a partial differential equation and in general it is not possible to obtain an analytic solution. Therefore, a variety of methods for estimation of the domain of attraction exist. These include LMIs (linear matrix inequalities) [14,41,44,66], SOS (sum-of-squares) programming [24,35], interval arithmetic [45,69], LP (linear programming) [29], mesh-free collocation [22], LP combined with SMT (satisfiability modulo theories) solvers [36] and sampling based approaches [8,48]. However, it is not only a challenging task to compute the DoA of a given, stable equilibrium using a suitable Lyapunov function. Finding such a Lyapunov function in the first place is, in general, still an open problem. There exists different approaches for Lyapunov function synthesis, mainly for specific classes of nonlinear systems. For example, SOS programming [54] can be applied for polynomial systems. Other methods include LP [9], LMIs [34], but also structural optimization by GP [5,28,46]. The survey paper [23] provides an overview of these and various other methods for the computation of Lyapunov functions. However, Lyapunov functions can not only be used for analysis, but also for control design for nonlinear systems with inputs. Definition 4 (Autonomous dynamical system with inputs). Let x ∈ Rn and u ∈ U ⊂ Rm , f (x, u) : Rn+m → Rn , h(x, u) : Rn+m → Rk , n, k ∈ N and f , h ∈ C 1 . Then, an autonomous dynamical system with inputs is given by x˙ = f (x, u) , y = h(x, u) ,

(4) (5)

with state vector x, input vector u, system order n, input dimension m, output dimension k and U the essentially bounded convex set of admissible controls. This leads to the concept of CLF (Control Lyapunov functions), see [20]. Definition 5 (Control Lyapunov function). Let V (x)  0 on X0 . The V (x) is called a Control Lyapunov function for system (4)–(5) if inf u∈U V˙ (x, u) < 0. A theoretical investigation of CLF is provided by [3,64]. Definition 5 means that for every state x there exists a control input u which can render V˙ (x, u) negative. In the control design step, ideally, u is chosen such that the guaranteed domain of attraction is maximized. In general, this is difficult to achieve for a nonlinear system as given by definition (4)–(5), especially if the structure of the control law is not specified. Therefore, we proposed in [58] to automatize this step by using GE in combination with one of the DoA estimation algorithms mentioned above. However, in practical applications stability alone is not enough. Typically, the system should be sufficiently fast and well damped while respecting actuator limits. In [58], this was achieved by pole placement of the linearized dynamics. However, such a design does not necessarily result in good disturbances rejection and stability margins. Hence, in the following, we include frequency domain specifications into the optimization process. Furthermore, the method for pole

Automated Nonlinear Control Structure Design

123

placement with GE is generalized to allow more sophisticated designs for the locations of the poles of the linearized system. In this work, we use an extended version of the sampling algorithm from [48], which we proposed in [58] for estimating the size of the DoA. The algorithm is described in detail in Sect. 3.3. There, also a formal definition of the optimization problem considered in this work is given. In the next section, we give some background information on LTI (linear time invariant) systems, internal stability and sensitivity transfer functions. 2.2

Linear Systems and Sensitivity Analysis

One common approach to deal with nonlinear dynamical systems is to perform Jacobian linearization about an operating point. Defining δx = x − x∗ and δu = u − u∗ gives the linearized dynamics by  ∂f (x, u)  x=x∗∗ δu u=u = Aδx + Bδu ∂x ∂u  ∂h(x, u)  x=x∗∗  ∂h(x, u)  δx + δu u=u = Cδx + Dδu . δy = ∂x ∂u δ x˙ =

 ∂f (x, u) 

δx +

(6) (7)

The transfer function matrix can then be computed by the classical equation G(s) = C(sIn − A)−1 B + D .

(8)

In the following we assume that m = k = 1, which results in a SISO (single-input, single-output) system, and therefore a single transfer function G(s) in (8). We consider the standard feedback system with controller K(s) in Fig. 2, including disturbances and measurement noise. Here, r(t) is the reference input, y(t) the controlled output, ym (t) the measured controlled output, yp (t) the plant output, e(t) the tracking error, u(t) the controller output, ud (t) the control signal, d1 (t) the input disturbance, d2 (t) the output disturbance and n(t) the sensor noise signal, compare for example [2,18].

r(t)

e(t) −

K(s)

u(t)

d1 (t) ud (t)

d2 (t) G(s)

yp (t)

ym (t)

y(t)

n(t)

Fig. 2. Single-input, single-output closed loop control system.

In Definition 6, the resulting transfer functions from Fig. 2, which are commonly used for robustness analysis in linear control theory [2], are listed.

124

E. Reichensd¨ orfer et al.

Definition 6. Given the closed loop system with controller K(s) and plant G(s) as shown in Fig. 2. Then, the following transfer functions are called L(s) = K(s)G(s) ,

open loop transfer function,

(9a)

T (s) = L(s)/(1 + L(s)) , S(s) = 1/(1 + L(s)) ,

complementary sensitvity function, sensitvity function,

(9b) (9c)

SK (s) = K(s)/(1 + L(s)) , SG (s) = G(s)/(1 + L(s)) ,

input disturbance sensitivity function, output disturbance sensitivity function.

(9d) (9e)

These transfer functions are useful for analyzing stability, performance and robustness of the linear control system in Fig. 2. It is well known that a linear system (8) is stable if and only if its characteristic equation det(sIn − A) = 0 has only solutions with strictly negative real part. Similarly, the concept of internal stability is defined. Definition 7 (Internal stability). The closed loop control system in Fig. 2 is called internally stable, if T (s), S(s), SK (s) and SG (s) are stable. This can be verified with the following theorem on internal stability by [18]. Theorem 1 (Internal stability). The closed loop control system in Fig. 2 is internally stable, if and only if: 1. The transfer function 1 + L(s) has no zeros with (s) ≥ 0. 2. There is no pole-zero cancellation in (s) ≥ 0 in the product K(s)G(s). Proof. See [18]. As a method to achieve time-domain specifications for the system in Fig. 2, the concept of Γ -stability and the parameter space approach have been proposed [1]. There, the location of transfer function poles is not only restricted to have a positive real part, but to lie within a specific Γ -region in the complex s-plane. This region encodes the desired behavior of the system dynamics and has to be defined in the control design phase. Typically, this specification is imposed on a family of transfer functions which is parameterized by an uncertainty of the plant model parameters. Similarly to Definition 7 and Theorem 1, the concepts of internal stabiltiy have been extended to Γ -stability [2]. Definition 8 (Internal Γ -stability). The closed loop control system in Fig. 2 is internally Γ -stable, if and only if: 1. The transfer function 1 + L(s) has no zeros with (s) ∈ / Γ. 2. There is no pole-zero cancellation in the product K(s)G(s) outside of Γ . We have proposed the inclusion of Γ -stability for the linearization of the nonlinear closed loop system of (4)–(5) into the structural controller optimization process using GE previously [58]. In this work, the fitness function for the GE

Automated Nonlinear Control Structure Design

125

algorithm is extended by including FRM (frequency response magnitude) specifications on (9a)–(9e). For a transfer function F (s), its FRM is MF = sup |F (jω)| .

(10)

ω∈R+ 0

These specifications are for example discussed in [17,18,43,70]. Similar to Γ stability, FRM have been included in the parameter space approach for parametric families of transfer functions [51]. This concept is know as B-stability: Definition 9. A transfer function F (s) is called B-stable in the frequency interval [ω − , ω + ] with ω − < ω + , if |F (jω)| ∈ B , ∀ω ∈ [ω − , ω + ] .

(11)

Similar to Γ -stability, the control designer has to choose an appropriate Bregion in the frequency domain in order to encode the desired dynamics of F (jω). The B-region can be defined by its frequency dependent boundary. Definition 10 (B-region). The B-region used for frequency response magnitude + specifications is given, with its boundary ∂B(ω) : R+ 0 → R0 , by + B = {(ω, M ) ∈ R+ 0 × R0 : M ≤ ∂B(ω)} ,

(12)

where M denotes the magnitude in the (ω, M )-plane. The B-region can be defined by the choice of an arbitrary boundary function ∂B(ω). Two commonly used boundaries are line segments or the frequency response of an auxiliary transfer function. In both cases, the designer has to specify the shape of the boundary. Another important result in linear control theory is given by the Lyapunov equation. Definition 11 (Lyapunov equation). Let x˙ = Ac x be a stable linear system. Let further Q = QT  0. Then, there exits P = PT  0 such that ATc P + PAc + Q = 0n×n

(13)

and a Lyapunov function for this system is given by V (x) = xT Px. Equation (13) can be solved with numerical algorithms, see for example [6] and has the advantage that it is positive definite by construction. This will be useful for domain of attraction estimation as a part of the fitness measure for GE. Next, we proceed with an introduction to the GE algorithm used in this work. 2.3

Optimization by Grammatical Evolution

While most of the methods from control theory which we described in the previous sections are of purely analytic nature, they often rely on different restrictive assumptions. One branch of optimization, under which various general purpose optimization methods can be summarized, are MHSM (meta-heuristic search

126

E. Reichensd¨ orfer et al.

methods). Such methods often require only very basic assumptions in order to be applied and are therefore an interesting tool for applications in control theory. MHSM is a broad term, which includes optimization methods like simulated annealing [38], artificial immune systems [11], tabu search [25] or ant colony optimization [16]. An overview of MHSM is given in [26]. A branch of MHSM are EA (evolutionary algorithms), which also include GA (genetic algorithms) [31]. These are optimization algorithms inspired by nature and simulate the natural selection process. MHSM have been applied successfully to different problems in the field of control engineering. Examples include tuning of LQR (linear quadratic regulator) [65], time series prediction [32,60], tuning of PID (proportional-integralderivative) controllers [13,50,63] and fuzzy control [55,68]. The books [30,53] provide an overview of further applications of MHSM and we also listed additional references, which use MHSM for identification of Lyapunov functions and nonlinear control, in Sect. 2.1. While most of the mentioned optimization algorithms from the field of MHSM focus on optimizing parameters, there also exist several methods for optimizing the topology, or structure, for a given problem. One such method is GP (genetic programming), which extends classical GA to perform structural optimization [39]. However, the classical GP algorithm often requires additional effort to avoid bloating, the creation of overly complex equations [53]. Therefore, we use GE (grammatical evolution) [59], a variant of GP, which is also build on GA but less prone to bloating [52]. The general optimization process of a GA algorithm is displayed in Fig. 3.

Init.

Fitness Evaluation

Mutation

Crossover

Selection

No

Done?

Yes

Exit

Evolutionary Adaptation

Fig. 3. Flow-chart of a GA with the standard genetic operators, adopted from [58].

The basic steps of the process shown in Fig. 3 can be described as follows. 1. Initialization: Random initialization of candidate solutions, which are called individuals. The set of candidate solutions forms a so-called population. 2. Fitness Evaluation: Each individual is assigned a fitness value, based on a problem specific metric. This metric represents the quality of an individual with respect to the considered optimization problem and is evaluated in the fitness function. The fitness function used in this work is depicted in Fig. 1. 3. Selection: A subset of individuals is selected randomly, but with higher probability, the higher its fitness value is. 4. Crossover: The selected individuals are recombined in order to form new individuals, which replace the individuals that where not selected. 5. Mutation: Newly generated individuals are modified randomly with some probability. This step is called mutation.

Automated Nonlinear Control Structure Design

127

One iteration of these steps is also called a generation. The steps are repeated until some predefined stopping criterion is met, like for example a maximum number of generations. The genetic operators selection, crossover and mutation are the most fundamental operators in GA. These perform the adaptation of solution candidates based on their fitness, in order to achieve an overall fitness improvement. Various approaches and extensions exist for each operator and also other operators have been proposed. An overview of different genetic operators is given in [47,53]. Next, we give a short introduction to the GE algorithm by [59], which was used in this work. This method is built on the idea to define a formal contextfree grammar in BNF (Backus-Naur-Form) [4,49], which maps a sequence of numbers, interpreted as source language, to an arbitrary target language. So, basically an integer string is compiled into, for example, a mathematical equation which can then be evaluated by a fitness function. We first give a definition of a context free grammar, as defined in [58], and then proceed by giving an explicit example of a compilation step of an individual. Definition 12 (Context free grammar). A context free grammar G is defined as the 4-tuple G = (N , Σ, P, S). Here, N is the finite set of non-terminals, Σ the finite set of terminals with N ∩ Σ = ∅ and S ∈ N the start symbol. The vocabulary of G is given by V = N ∪ Σ, which gives the finite set of production rules by P ⊂ (V ∗ \ Σ ∗ ) × V, each of the form N → V ∗ . In Definition 12, the notation (·)∗ stands for the Kleene closure of (·). Example 1. We start by defining an arbitrarily chosen individual, given by the  T vector iGE = 6 4 1 4 3 6 2 4 4 1 . The example grammar Ge = (Ne , Σe , Pe , Se ) used in this example is defined as Se = e, Ne = {e, o, v, c}, Σe = {+, −, ×, /, x1 , x2 , 1, 2, 3} and the production rules Pe , denoted in BNF by e ::= r1 | r2 | r3

 e ::= o(e, e) | v | c ,

(14a)

o ::= r4 | r5 | r6 | r7

 o ::= + | − | × | / ,

(14b)

v ::= r8 | r9

 v ::= x1 | x2 ,

(14c)

c ::= r10 | r11 | r12

 c ::= 1 | 2 | 3 .

(14d)

The dispatching of the current rule rk is done with the modulo operator by iGE,k mod p. Here, iGE,k is the k-th entry of the vector iGE , also called codon. Further, p is the number of available rules for unfolding the current non-terminal. The parsing process and the resulting tree are shown in Table 1 and Fig. 4. The example begins with Se = e, which can be expanded by three rules, so in the first step, p = 3. By modulo dispatch, we get iGE,k mod p = 6 mod 3 = 0, so the first rule is chosen. The translation process then unfolds all non-terminals from left to right, until the final expression x1 + 2x2 is generated. If all codons in iGE have been used, but the compilation is not finished yet, the wrapping technique is applied, which starts reading codons from the start of iGE . To avoid infinite loops, after a maximum number of wrappings, all recursive rules are removed

128

E. Reichensd¨ orfer et al.

from G (in Ge , that would be r1 ), so that parsing is guaranteed to terminate in a finite number of steps. We now proceed by combining GE with control design. Table 1. Compilation of iGE using Ge .

e

iGE,k p iGE,k mod p Rule Expression

o + e

e

v x1

o ×

3.1

– –

Se

e

6

3 0

r1

o(e, e)

4

4 0

r4

+(e, e)

1

3 1

r2

+(v, e)

4

2 0

r8

+(x1 , e)

e

e

3

3 0

r1

+(x1 , o(e, e))

c 2

v x2

6

4 2

r6

+(x1 , ×(e, e))

2

3 2

r3

+(x1 , ×(c, e))

4

3 1

r11

+(x1 , ×(2, e))

4

3 1

r2

+(x1 , ×(2, v))

1

2 1

r9

+(x1 , ×(2, x2 ))

Fig. 4. Parse tree of iGE .

3



Methods Centroids for Pole Placement

The fitness function for the GE algorithm will be composed of three different performance measures. We start with eigenvalue specifications on the linearized closed loop system. Assume the GE algorithm produces a state-feedback control law of the form u = −k(x). Inserting this into (6)–(7) gives x˙ = Ac x = (A − BkTc )x

(15)

−kTc x

where is the linearized control law computed by Jacobian linearization of the nonlinear state feedback −k(x) with respect to x. First, we want that the generated controller makes the linearized closed loop system (15) internally Γ -stable according to Definition 8. We extend the Γ -region used for pole placement we used in [58] by an optional upper bound on the damping ratio. This allows more general eigenvalue specifications. The definition is based on the commonly used Γ -region in robust control [2,15] and given by ΓT = {s = σ + jω : σ ≤ −1/Tu } ,  Γω = {s = σ + jω : σ 2 + ω 2 ≤ ωu } ,   Γζ+ = {s = σ + jω : −σ 1 − ζl 2 /ζl ≤ ω ≤ −σ 1 − ζu2 /ζu } ,   Γζ− = {s = σ + jω : σ 1 − ζu2 /ζu ≤ ω ≤ σ 1 − ζl 2 /ζl } , Γ = Γu ∩ Γl with Γu = ΓT ∩ Γω ∩ Γζ+ and Γl = ΓT ∩ Γω ∩ Γζ− .

(16a) (16b) (16c) (16d) (16e)

Automated Nonlinear Control Structure Design

129

In (16a)–(16e), Tu , ωu and ζu specify the upper bounds on time constant, bandwidth and damping ratio and ζl the lower bound on damping ratio of (15). Figure 5 shows two example Γ -regions constructed by this definition.

2 Γu

Γ

jω [s−1 ]

jω [s−1 ]

2

0 −2 −6

0 Γl

−2 −4

−2 σ [s

−1

−6

0

−4

]

−2 σ [s

(a) Γ -region with ζu = 1, from [58].

−1

0

]

(b) Γ -region with ζu < 1.

Fig. 5. Two example Γ -regions (grey) and the centroids of each connected subset.

With these four parameters, specifications on the closed loop behavior can be represented efficiently, since parameters like bandwidth and damping ratio can be analyzed and interpreted well. In order to compute a fitness value that encodes the location of the closed loop eigenvalues of (15), we use the centroids, also known as geometrical mean, of each connected component of Γ . Notice, that for geometrical computations, the Γ -regions can be treated as areas in R2 rather than C. Then, it is well known that the centroid sc = σc + jωc of an area between two functions f1 (σ) : R → R, f2 (σ) : R → R on the interval [a, b] with f1 (σ) ≥ f2 (σ) and a ≤ σ ≤ b can be computed by the following integrals σc = ωc =

1 A



1 2A  b

A=

b



σ f1 (σ) − f2 (σ) dσ

a



b



(17)



f1 (σ) + f2 (σ) f1 (σ) − f2 (σ) dσ

(18)

a

f1 (σ) − f2 (σ)dσ .

(19)

a

Furthermore, the composite centroid of N different areas can be computed by N σc + jωc =

i=1 σc,i Ai N i=1 Ai

N i=1 +j N

ωc,i Ai

i=1

Ai

.

(20)

Setting ζu = 1 merges the two Γ -areas into one single area. For this case, the following fact will be useful.

130

E. Reichensd¨ orfer et al.

Lemma 1. The Γ -region and its upper and lower sub regions Γu , Γl , from (16a) have all three the same σc coordinate. Furthermore, the imaginary part ωc of the centroid of Γ is equal to 0. Proof. Let A(X) denote the area of the set X and σc (X), ωc (X) the real part and the imaginary part of the centroid of the set X respectively. Then we have 2A(Γu ) = 2A(Γl ) = A(Γ ) and further σ(Γu ) = σ(Γl ) by symmetry. Inserting this into (20) gives

A(Γ ) σc (Γu ) + σc (Γl ) σc (Γu )A(Γu ) + σc (Γl )A(Γl ) = = σc (Γu ). σc (Γ ) = A(Γu ) + A(Γl ) 2A(Γ ) (21) Therefore, σc (Γ ) = σc (Γu ) = σc (Γl ). Since Γ is symmetric about the real axis,   it follows that ωc (Γ ) = 0. Theorem 2. The centroids sc = σc ± jωc of the generalized Γ -stability area spanned in the complex s-plane as a function of the design parameters ζl , ζu , Tu and ωu with ζl , ζu ∈ (0, 1], ζl < ζu and ωu > 1/Tu are given by  σc ± jωc , ζu < 1 , sc = σc ± jωc = with (22) ζu = 1 , σc ,  −ωu ζl , Δ > 0 , a2 = (23) −1/Tu , Δ ≤ 0 and the substitutions Δ = ωu ζl − 1/Tu , σc = (Mc,σ + Mq,σ )/(Ac + Aq ), ωc = (Mc,ω + Mq,ω )/(Ac + Aq ) and the moments and areas of the sub-components of the Γ -region, 

 Aq = (Tu2 a22 − 1) ζu 1 − ζl 2 − ζl 1 − ζu2 /(2ζu ζl Tu2 ) , (24a) 



Ac = a2 ζu ωu2 − a2 2 + a2 1 − ζu2 + ζu ωu2 γ /(2ζu ) , (24b) γ = arcsin(ζl ) + arcsin(a2 /ωu ) ,  

Mc,σ = ζu (a22 − ωu2 ) ωu2 − a2 2 + (ζu ωu3 + a32 ) 1 − ζu2 /(3ζu ) ,

Mc,ω = 2ζu3 ωu3 + 3a2 ζu2 ωu2 − a32 /(6ζu2 ) , 

 Mq,σ = (Tu3 a32 + 1) ζu 1 − ζl 2 − ζl 1 − ζu2 /(3ζl ζu Tu3 ) , Mq,ω =

(Tu3 a32

+

1)(ζl2



ζu2 )/(6Tu3 ζl2 ζu2 ) .

(24c) (24d) (24e) (24f) (24g)

Proof. Two cases are required to distinguish between two seperate Γ -regions, Γu , Γl due to ζu < 1 and a single Γ -region due to ζu = 1. Furthermore, two other cases are required in case that the vertical line given by the boundary of ΓT is to the left of the left-hand-plane-intersection of the circle and ζl -line. Here, we

Automated Nonlinear Control Structure Design

131

denoted these cases by the sign conditions on Δ. This results in a  total of 4 cases (σ) = ωu2 − σ 2 and in (22)–(23). Now, define the functions in (17)–(19) as f 1  −1 2 f2 (σ) = −ζu 1 − ζu σ on the interval [a, b], with a = −ωu ζu and b = −ωu ζl for the centroid ofthe circular area. Next, if Δ > 0, use f1 (σ) = −ζl−1 1 − ζl2 σ and f2 (σ) = −ζu−1 1 − ζu2 σ on the interval [a, b] with a = −ωu ζl and b = −1/Tu for the centroid of the remaining areas. With the integrals (17)–(19) and the fact  that σc (Γ ) = σc (Γu ) = σc (Γl ) by Lemma 1, formula (22)–(23) can be derived.  The first part of the fitness value can now be defined in terms of the centroids for an arbitrary, finite amount of KΓ different, user-defined Γ -regions. Definition 13 (Γ -fitness). Let λi (Ac ) be the i-th eigenvalue of the matrix Ac (k) (k) (k) (k) and Γ (k) be the k-th Γ -region defined by the k-th 4-tuple (Tu , ωu , ζl , ζu ), KΓ (k) (k) the union of all Γ -regions. Let S denote the set of all with Γtotal = k=1 Γ centroids of all Γ (k) -regions by Theorem 2. Then the Γ -fitness, φΓ is given by  1 n / Γtotal , minsc ∈S ||λi (Ac ) − sc ||22 , λi (Ac ) ∈ (25) φΓ = 0, λi (Ac ) ∈ Γtotal . i=1 n The quantity φΓ in Definition 13 describes the average distance of all eigenvalues of the linearized closed loop system (15) to their closest Γ (k) -region. How close an eigenvalue is to such a region is defined as its distance from the centroid of that region. Any eigenvalue that is located in any Γ (k) -region is assigned zero distance, as it is already within the desired region. The Γ -fitness is zero if and only if (15) is internally Γ -stable since Ac is, in general, not a minimal realization and its eigenvalues represent the poles of its corresponding transfer function before pole-zero cancellation. We now proceed with the second part of the total fitness value which defines a term for FRM specifications. 3.2

Frequency Response Magnitude Specifications

Next, the fitness value is extended by FRM specification. We encode these for a transfer function F (s) by NF piecewise constant segments which limit the FRM − + − + − + < ωF,1 = ωF,2 < ωF,2 = · · · = ωF,N < ωF,N , by of F (s), with 0 < ωF,1 F F MF,1

∗ = max |F (jω)| ≤ MF,1 ,

− + ω ∈ [ωF,1 , ωF,1 ],

(26a)

MF,2

∗ = max |F (jω)| ≤ MF,2 ,

− + ω ∈ (ωF,2 , ωF,2 ],

(26b)

.. .

ω ω

.. .

.. .

∗ , MF,NF = max |F (jω)| ≤ MF,N F ω

.. . − + ω ∈ (ωF,N , ωF,N ]. F F

(26c)

132

E. Reichensd¨ orfer et al.

Let BF be the B-region of F (s). Then, the fitness measure for B-stability is: Definition 14 (B-fitness). Let T be a finite set of NT different transfer functions. Then, we compute the B-fitness by φB = 1 −

NF ∗ MF,i 1  1  . ∗ ,M NT NF i=1 max(MF,i F,i )

(27)

F ∈T

The quantity φB ∈ [0, 1) describes the average relative exceeding of the FRM specifications, averaged over all specified transfer functions. In the case that all transfer functions violate all FRM specifications in every frequency interval by a large margin, then φB → 1. If the frequency response of each F (s) is contained in its corresponding BF -region and therefore, meets the FRM specifications, then φB = 0. By including for example the transfer functions (9b)–(9e) into T, the designer can readily include different FRM specifications on the closed loop system. Figure 6 shows an example FRM specification and the amplitude of the frequency response of an example sensitivity function S(s) that meets these specification. While the specifications on Γ and B-stability consider the linearized closed loop dynamics, it is also desirable to include the actual nonlinear dynamics into the optimization. Therefore, we continue with the third and last component of the overall fitness function.

MF

2 1 0 10−1

100

101

102

103

Frequency [rad/s]

Fig. 6. Example Bode plot and BS -region (grey) for a sensitivity function S(s).

3.3

Optimization Problem and Domain of Attraction Estimation

The previous two parts of the fitness function, φΓ and φB represent performance and robustness measures for the linearization (6)–(7) of the actual nonlinear system (4)–(5). In order to also incorporate the nonlinearity of the actual system into the optimization, we optimize the size of the DoA as we did in [58]. Before that, the nonlinear, constrained, bi-level optimization problem can be stated formally, as all required components have been introduced.

Automated Nonlinear Control Structure Design

133

maximize k(x) ∈ UG (Rn , R)

Λ(Ωc )

subject to

Ωc = {x ∈ Rn : V (x) ≤ c } ,

(28b)

c = minimize x ∈ Rn=0

(28c)

(28a)

V (x)

subject to V˙ (x, −k(x)) = 0 , λi (Ac ) ∈ Γtotal , ∀ i ∈ {1, . . . , n} ,

(28d) (28e)

|F (jω)| ∈ BF , ∀F (s) ∈ T , |k(x)| ≤ umax , ∀x ∈ Rn .

(28f) (28g)

The term Λ(Ωc ) in (28a) denotes the Lebesgue measure of the DoA estimate (3), which can be interpreted as its hyper-volume. This volume is subject to maximization over UG (Rn , R), which denotes the function space of all k(x) : Rn → R that can be generated with the user-defined grammar G. The DoA estimation (28b)–(28c) is done by searching for the smallest level curve, along which V˙ (x, −k(x)) = 0 holds for x = 0n×1 (28d). Details on this method can be found for example in [61]. We compute (28c), as in [58], by Algorithm 1, an extended version of the original algorithm by [48]. Furthermore, constraints (28e)–(28f) state that the linearized closed loop system must be internally Γ -stable and B-stable, while (28g) accounts for actuator limitations. Algorithm 1 . Domain of attraction estimation. 1: procedure DoASize(V (x), V˙ (x, −k(x)), Xc , NMC )  Initialize L as empty list. 2: Initialization: c ← ∞, kMC ← 0, L ← ∅  Monte Carlo search with NMC samples. 3: for i ← 1 to NMC do  Draw from a uniform distribution. 4: Pick random state xi ∈ Xc  Store V (xi ) by appending it to the list L. 5: L ← (L, V (xi ))  Test Lyapunov conditions. 6: if V (xi ) < c and V˙ (xi , −k(xi )) ≥ 0 then  Update the level curve V (x) = c . 7: c ← V (xi )  Estimate the size of the domain of attraction. 8: for i ← 1 to NMC do  Li denotes the i-th element of the list L. 9: if Li < c then 10: kMC ← kMC + 1  Count the samples inside the level curve V (x) = c .  Return the normalized size of the DoA estimate. 11: return kMC /NMC

Algorithm 1 estimates the DoA size based on Monte Carlo sampling in a predefined region Xc ⊂ Rn . The fitness value representing the normalized DoA size estimate is called Ω-fitness and given by Algorithm 1 as φΩ = 1 − kMC /NMC .

(29)

The derivative of V (x) as well as the Jacobian linearization of the closed loop control system are calculated symbolically by automatic differentiation [56]. In order to generate a suitable Lyapunov function for a generate control law k(x), the solution P of the Lyapunov equation (13), with Ac from the linearized closed

134

E. Reichensd¨ orfer et al.

loop system (15) and Q = In is used. The Lyapunov function is constructed by V (x) = xT Px, as discussed in [58]. The transfer functions for checking Bstability are calculated from the system matrices with the algorithm from [7]. Actuator limitations are represented by a smooth approximation of a symmetric saturation [27], in order to keep k(x) differentiable. The saturated control law is formed as in [58] by

(30) ksat (x) = umax − 2umax / 1 + exp(2k(x)/umax ) . Notice that (30) has a slope of 1 at u = −k(x) = 0 and therefore has no effect when linearizing about the origin. Hence, instead of ksat (x), the less complex k(x) can be used for calculating the required derivatives to save some additional computation time. Finally, the total fitness can now be defined. Definition 15 (Total fitness). The total fitness φ : UG (Rn , R) → R+ 0 is ⎧ ⎨ 2 + φ Γ , φΓ > 0 , φ = 1 + φ B , φΓ = 0 , φ B > 0 , ⎩ φ Γ = 0 , φB = 0 . φΩ ,

(31)

The total fitness (31) makes the optimization problem (28a)–(28g) computationally accessible by a GE algorithm. One advantage of (31) is, that the optimization objectives are allocated like in a daisy-chain framework: The fitness value φB needs to be computed only if the system is internally Γ -stable, namely if φΓ = 0. The same holds for φΩ , which is only required if the system is B-stable and φΓ = φB = 0. Since typically, φΩ requires significantly more computation time than φΓ or φB , the hierarchical structure in (31) can also be seen as a validation step, that picks only promising control laws for the DoA estimation. Another advantage of (31) is, that the current progress of the optimization can be read from it conveniently. A value of φ > 2 means that the closed loop system is not yet Γ -stable. For φ ∈ (1, 2), B-stability is optimized, while a value of φ < 1 is a direct measure of the DoA estimate by Algorithm 1. Additionally, the total fitness (31) could be readily extended in the proposed systematic fashion of the present work, for example to also include phase specifications and Nyquist/Popov stability margins [10]. The proposed new method for nonlinear structural control design by DoA maximization with linear performance measures is now evaluated on one benchmark system from nonlinear control. In the next section, an experimental evaluation of the proposed method on a benchmark example is presented.

4 4.1

Experiments Optimization Setup

The parameters used for optimization with GE are listed in Table 2.

Automated Nonlinear Control Structure Design

135

Table 2. Optimization setup used for the GE algorithm, adopted from [58]. Parameter

Value

Parameter

Value

Population size

100

Selection

Rank selection

Number of generations 5000

Crossover

Cut and splice

Mutation rate

0.05

Mutation

Uniform multi-point

Initial genome length

∼ U{10, 20} Elitism rate

0.1

Initial genome values

∼ U{0, 20}

Selection pressure

1.5

Max. wrapping count

5

Monte Carlo samples 10000

Max. tree depth

15

Pruning

Yes

We use the grammar G = (N , Σ, P, S) with S = e, N = {e, o, f , v, c} and Σ = {+, −, ×, /, sin, cos, arctan, tanh, x1 , . . . , xn , c1 , . . . , cKc }. Further, P is defined by the mappings in BNF as e ::= o(e, e) | f (e) | v | c, o ::= + | − | × | /, f  ::= sin | cos | arctan | tanh, v ::= x1 | . . . | xn and c ::= c1 | . . . cKc . The number of constants Kc = 10000, drawn from a continuous uniform distribution ∼ U{−10, 10}, as we selected in [58]. 4.2

Inverted Pendulum

The inverted pendulum is selected as benchmark system, adopted from [37] by x˙ 1 = x2 ,

(32a)

x˙ 2 = (g/l) sin(x1 ) − (k/m)x2 + 1/(ml )u , x˙ 3 = x1 . 2

(32b) (32c)

Here, x1 is the deviation of the pendulum angle from the top position in rad, x2 the angular velocity in rad/s and x3 the integrated angular deviation in rad s. The states x1 , x2 are plant states, while x3 is a controller state, in order to realize integral action. The model parameters are the gravitational constant g = 9.81 m/s2 , the pendulum length l = 0.3 m, the pendulums mass m = 0.5 kg and the friction coefficient k = 0.02 kg/s. The input u, in Nm, is the torque acting on the pendulum and y = x1 the controlled variable. (1) (1) (1) (1) We define the Γ -region for pole placement by (Tu , ωu , ζl , ζu ) = (2) (2) (2) (2) (0.6 s, 2.5 rad/s, 0.9, 1) and (Tu , ωu , ζl , ζu ) = (0.1 s, 20 rad/s, 0.7, 0.9) and the set of transfer functions for FRM specifications and B-stability as T = {S(s)}. Here, S(s) is the sensitivity function (9c) of the system. The open loop transfer function of (32a)–(32c) can be computed by L(s) = kTc (sI3 − A)−1 B, ∗ = 0.2 with kc by (15). The magnitude bounds on S(s) are defined as MS,1 −3 ∗ 2 , ∀ω ∈ (10 , 1]Hz and MS,2 = 1.5 , ∀ω ∈ (1, 10 ] Hz. The actual magnitudes MS,1 , MS,2 of each controller are computed on a logarithmically spaced grid with 100 samples per interval. Actuator limits are set to |k(x)| ≤ 15 Nm. The sampling

136

E. Reichensd¨ orfer et al.

region Xc in the state space for estimating the DoA is chosen as x1 ∈ [−1, 1] rad, x2 ∈ [−2, 2] rad/s and x3 ∈ [−1, 1] rad s. We choose two control designs for comparison with the generated controller. The first is based on JL (Jacobian linearization), the second on FL (feedback linearization) [33]. Notice that due to saturation, the original system is not feedback-linearizable. Therefore, we apply the FL technique to the unsaturated system and apply the saturation afterwards. This leads to the control laws: kGE (x) = 9.64x1 + x2 + 11.1x3 + 6.85x1 (2x1 + x3 ) tanh(ψ(x)) , kJL (x) = 9.64x1 + x2 + 11.1x3 ,

kFL (x) = ml2 (g/l) sin(x1 ) − (k/m)x2 + klin (x) ,

(33a) (33b) (33c)

2

2

0

0

−2 −4

x3

x3

with ψ(x) = x1 + x2 + x3 + arctan(1.25x1 ) and klin (x) = 181.54x1 + 22.26x2 + 246.7x3 . Control law (33a) was identified by the proposed GE method and achieves all specifications on Γ -stability and B-stability. For a fair comparison, (33b) and (33b) were designed such that all three controllers result in the same eigenvalues of the linearized closed loop system, namely (−1.68, −10.29 ± j6.4). Hence, (13) also results in the same Lyapunov function for all three controllers: V (x) = 5.16x21 +0.071x1 x2 +11.96x1 x3 +0.0241x22 +0.0041x2 x3 +9.17x23 . While the behavior of (33a)–(33c) is therefore locally equivalent, the estimated DoA by the GE controller (33a) is larger by a factor of 5.92 and 6.01 compared to the JL (33b) and the FL controller (33c). Figure 7 shows the DoA estimates for the GE and the FL controller. The GE controller achieves not only Γ and B-stability, but also a significantly larger DoA than the JL and FL techniques.

−2

x1

0

2

0 4 −20

20

(a) Feedback linearization controller.

x2

−2 −4

−2

x1

0

2

0 4 −20

20 x2

(b) Grammatical evolution controller.

Fig. 7. Comparison of the estimated domains of attraction obtained with Algorithm 1, using 105 samples per controller. a and b display 2000 selected samples.

5

Conclusion

We proposed an extension to our method of automated nonlinear control structure design by grammatical evolution. The extension was carried out in three

Automated Nonlinear Control Structure Design

137

directions. First, the formula for centroid calculations used for pole placement was generalized and an explicit analytic formula was presented. Second, we included frequency response magnitude specifications into the optimization process. Third, we showed how these specifications can be combined with eigenvalue specifications and domain of attraction maximization into a nonlinear bi-level optimization problem which can be approached with GE. Finally, we applied the method to a nonlinear benchmark problem. Advantages of the proposed method are its flexibility, the interpretability of intermediate fitness values and that it combines different measures from linear and nonlinear control. Therefore, the generated control laws typically result in high performance and robustness of the closed loop system, while the designer can focus on high-level system specifications and the choice of a suitable grammar. Also, the proposed fitness function can of course be used with classical GA too, if a fixed control structure is preferred over structural optimization. A disadvantage is that a large amount of samples is required to estimated the domain of attraction reliably. Therefore, the method can result in long computation times for high-order systems. The limitation to use quadratic forms as Lyapunov functions can be seen as a disadvantage too, as these might result in conservative estimates of the domain of attraction. However, in control design it is often desirable that the closed loop system behaves approximately like a linear system. If the system is linear however, it admits a quadratic Lyapunov function. Hence, this limitation can also be seen as an advantage as it implicitly biases the evolutionary process towards control structures that result in approximate linear behavior of the closed loop system. We demonstrated with a nonlinear benchmark problem, that the proposed method is able to generate control laws that meet the design specifications and maximizes the guaranteed domain of attraction. Future work could focus on investigating different sampling algorithms for estimating the domain of attraction, the inclusion of more specifications from linear control or to generalize the method for MIMO (multi-input, multi-output) systems. The presented method, as a novel design tool for nonlinear control structure design, demonstrated in the considered benchmarks that it is able to generate efficient controllers that are robust and fast while respecting actuator limits.


Computational Experience with a Modified Newton Solver for Discrete-Time Algebraic Riccati Equations

Vasile Sima1 and Peter Benner2

1 National Institute for Research and Development in Informatics, 011455 Bucharest, Romania, [email protected]
2 Max Planck Institute for Dynamics of Complex Technical Systems, 39106 Magdeburg, Germany, [email protected]

Abstract. A Newton-type algorithm and line search strategies for solving generalized discrete-time algebraic Riccati equations are dealt with. The conceptual algorithm is presented, and its main computational steps are discussed. Evaluation of residuals and closed-loop matrices at each iteration, determination of the step size, and the use of line search with backtracking, are addressed in detail. Algorithmic and implementation issues taken into account in the developed solver are described. An extensive performance investigation on a large collection of examples has been performed, and the results are summarized. Both usual line search and line search with backtracking, and either identity or diagonal performance index matrices are considered. Difficult examples are included. The results often show significantly improved accuracy, measured in terms of normalized and relative residuals, in comparison with the state-of-the-art MATLAB function. The new solver is strongly recommended for improving the solutions computed by other solvers.

Keywords: Algebraic Riccati equation · Numerical linear algebra · Numerical methods · Optimal control · Optimal estimation

1 Introduction

Many procedures for control systems analysis and design require the solution of algebraic Riccati equations (AREs). Such equations appear in various domains and practical applications, including model reduction, optimal filtering, guidance, (robust) control, robotics, etc. Discrete-time AREs (DAREs) are important since many measured systems are modeled by difference equations.

© Springer Nature Switzerland AG 2020. O. Gusikhin and K. Madani (Eds.): ICINCO 2018, LNEE 613, pp. 142–167, 2020. https://doi.org/10.1007/978-3-030-31993-9_7


Let A, E ∈ ℝ^{n×n}, B ∈ ℝ^{n×m}, and Q and R be symmetric matrices of suitable dimensions. In a compact notation, the generalized DAREs, with unknown X = X^T ∈ ℝ^{n×n}, are defined by

   0 = Q + op(A)^T X op(A) − op(E)^T X op(E) − σ L(X) R̂(X)^{-1} L(X)^T =: R(X),   (1)

where σ = ±1, E and R̂(X) are nonsingular, and

   R̂(X) := R + σ B^T X B,   L(X) := S + op(A)^T X B,   (2)

with S of suitable size. The operator op(M) represents either M or M^T, corresponding to a control or a filtering problem, respectively. A and E are the state and descriptor matrices, respectively, of a linear time-invariant dynamic system, and, in a control setting, B is the input matrix. The use of the ± sign through σ offers greater generality. In practice, often Q and S are expressed as C^T Q̂ C and C^T Ŝ, respectively, where C ∈ ℝ^{p×n} is the output matrix of the system, and C, Q̂, and Ŝ are given.

The solutions of a DARE are the matrices X = X^T for which R(X) = 0. Usually, what is needed is a stabilizing solution, X_s, for which the matrix pair (A − σ op(B K(X_s)), E) is stable (in a discrete-time sense), where op(K(X_s)) is the gain matrix of the optimal regulator or estimator, given by

   K(X) := R̂(X)^{-1} L(X)^T,   (3)

with X replaced by X_s. For the dynamic system E x_{k+1} = A x_k + B u_k, k = 0, 1, ..., x(0) = x_0, the optimal control trajectory is given by the state feedback law u_k = −σ K(X_s) x_k. By a proper selection of Q, S, and R, the closed-loop dynamics can be modified to achieve a fast transient response, disturbance rejection, etc. Note that for a filtering problem, B should be replaced by the transpose of C, and the computed K(X) is the transpose of the filter gain matrix.

When Y is not a solution of (1), then R(Y) differs from the zero matrix; R(Y) is called the residual matrix of (1) in Y. The Frobenius norm of R(Y), ‖R(Y)‖_F, is a measure of the error in Y with respect to the solution X.

There is an impressive literature concerning theory and numerical solution of AREs and their practical applications. Several monographs, e.g., [4,16,35,41,45], address various theoretical and practical results. Existence and uniqueness results for ARE solutions are dealt with, for instance, in [34,36,37]. Many "direct" or iterative algorithms have been proposed for solving AREs. The first class includes the (generalized) Schur techniques, e.g., [7,32,39,43,55], or structure-exploiting (QR-like) methods, e.g., [15,17,53]. (These techniques are actually also iterative, but they are applied to a matrix or matrix pencil defined by the given matrices of an ARE.) The second class has several categories, including sign function techniques, e.g., [8,18,22,44,51], Newton techniques, e.g., [3,7,25,29], doubling algorithms, e.g., [19,26–28], or recursive algorithms [38].
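For a standard control problem (op(M) = M, σ = +1), the quantities in (1)–(3) can be evaluated directly. The following is only a minimal MATLAB sketch, with illustrative variable names, that checks a candidate solution X (obtained here from MATLAB's dare, which requires a nonsingular E) against the residual R(X):

```matlab
% Minimal sketch: evaluate R(X), R^(X), L(X) and K(X) of (1)-(3) for op(M) = M,
% sigma = +1; X is any symmetric candidate solution, e.g., from MATLAB's dare.
X   = dare(A, B, Q, R, S, E);            % stabilizing solution
Rh  = R + B'*X*B;                        % R^(X), Eq. (2)
L   = S + A'*X*B;                        % L(X),  Eq. (2)
res = Q + A'*X*A - E'*X*E - L*(Rh\L');   % residual matrix R(X), Eq. (1)
K   = Rh\L';                             % gain K(X), Eq. (3)
fprintf('||R(X)||_F = %g\n', norm(res, 'fro'));
```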


Newton's method for solving AREs has been considered by many authors, for instance, [7,9–11,30,33,35,41,45]. Moreover, the matrix sign function method for AREs, see [18,22] and the references therein, is actually a specialization of Newton's method for computing the square root of the identity matrix of order 2n. Newton's method is best used for iterative improvement of a solution, or as a defect correction method [42], delivering the maximal possible accuracy when starting from a good approximate solution. Moreover, it may be preferred in implementing certain fault-tolerant or slowly varying systems, which require online controller updating [20]; then, the previously computed ARE solution can be used for initialization. Some robotics applications can also benefit from using iterative ARE solvers. For this reason, such algorithms are used in a new open-source C++ library for robotics, optimal and model predictive control [23], for solving both continuous-time AREs (CAREs) and DAREs.

This paper summarizes the main theoretical facts about Newton's method for DAREs, as well as implementation issues and numerical results obtained using the newly developed solver. There are several contributions compared to [10,11] concerning, e.g., generality and functionality, line search strategies, or stopping criteria. This paper significantly extends the presentation in [54] with a detailed description of the step size and backtracking procedures. New numerical results, using another MATLAB solver and different performance index matrices, are included; more difficult examples are considered. The paper complements our previous studies on the numerical solution of CAREs by Newton's method with line search reported in [49,52].

2 Basic Theory and Conceptual Algorithm

The following Assumptions are made:

1. Matrix E is nonsingular.
2. The matrix pair (op(E)^{-1} op(A), op(E)^{-1} B) is stabilizable.
3. Matrix R = R^T is non-negative definite (R ≥ 0).
4. A stabilizing solution X_s exists and it is unique, and R̂(X_s) is positive definite (R̂(X_s) > 0).

Note that Assumption 1 is not actually used by the developed solver, contrary to other solvers (including MATLAB function dare). The algorithmic variants considered in the sequel for DAREs are extensions of Newton’s method, which employ a line search procedure attempting to reduce the residual along the Newton direction.


The conceptual algorithm can be stated as follows [10]:

Algorithm NDARE: Modified Newton Method for DARE
Input: The coefficient matrices E, A, B, Q, R, and S, and an initial matrix X_0 = X_0^T.
Output: The approximate solution X_k of DARE (1).
FOR k = 0, 1, ..., k_max, DO
1. Compute R(X_k). If convergence or non-convergence is detected, return X_k and/or a warning or error indicator value.
2. Compute K_k := K(X_k) in (3) and op(A_k), where A_k = op(A) − σ B K_k.
3. Solve for N_k the discrete-time generalized Stein equation

   op(A_k)^T N_k op(A_k) − op(E)^T N_k op(E) = −R(X_k).   (4)

4. Find a step size t_k which approximately minimizes ‖R(X_k + t N_k)‖²_F with respect to t.
5. Update X_{k+1} = X_k + t_k N_k.
END

Equation (4) is also called a discrete-time generalized Lyapunov equation. The usual, standard Lyapunov equation has E = I_n. The solution N_k of (4) gives a search direction, called the Newton direction, and t_k is the step size along this direction. The standard Newton algorithm is obtained by taking t_k = 1 in Step 4 at each iteration. When the initial matrix X_0 is far from a Riccati equation solution, Newton's method with line search often outperforms the standard Newton's method.

If the assumptions above hold and X_0 is stabilizing, then the iterates of Algorithm NDARE with σ = 1 and t_k = 1 have the following properties [9]:

(a) All matrices X_k are stabilizing.
(b) X_s ≤ ··· ≤ X_{k+1} ≤ X_k ≤ ··· ≤ X_1.
(c) lim_{k→∞} X_k = X_s.
(d) There is a constant γ > 0 such that

   ‖X_{k+1} − X_s‖ ≤ γ ‖X_k − X_s‖²,   k ≥ 1.   (5)

Note that the global quadratic convergence formula (5) does not hold for k = 0, involving the iterates X_0 and X_1. Weaker results are available for the modified Newton algorithm. One such result [9] states that if X_k is stabilizing, then N_k computed by Algorithm NDARE is a descent direction for ‖R(X_k)‖²_F from X_k, unless X_k = X_s.
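As an illustration only, a minimal MATLAB sketch of Algorithm NDARE with the standard step t_k = 1 is given below for the special case E = I_n, S = 0, op(M) = M, σ = 1; it is not the SG02CD implementation, and the Stein equation (4) is solved with dlyap from the Control System Toolbox:

```matlab
% Sketch of the standard Newton iteration (t_k = 1) for E = I, S = 0, sigma = 1.
% X0 is a stabilizing initial matrix; tau is the stopping tolerance of Sect. 3.2.
X = X0;
for k = 0:kmax
    RX = Q + A'*X*A - X - (A'*X*B)*((R + B'*X*B)\(B'*X*A));  % residual R(X_k)
    if norm(RX, 'fro') <= tau*max(1, norm(X, 'fro')), break, end
    K  = (R + B'*X*B)\(B'*X*A);       % gain K(X_k), Eq. (3) with S = 0
    Ak = A - B*K;                     % closed-loop matrix A_k
    N  = dlyap(Ak', RX);              % Stein equation (4): Ak'*N*Ak - N = -R(X_k)
    X  = X + N;                       % Newton update with t_k = 1
end
```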

3 Algorithmic Details

Algorithm NDARE was implemented in a Fortran 77 subroutine, SG02CD, following the SLICOT Library [12–14,56] implementation and documentation standards (see http://www.slicot.org). The same routine also solves CAREs. A first version of SG02CD and preliminary results on random examples and the SLICOT CAREX benchmark collection [1] have been presented in [50].

The implemented solver deals with generalized DAREs without inverting the matrix E, which is very important for numerical reasons, since E might be ill-conditioned with respect to inversion. Standard DAREs are solved with the maximal possible efficiency. Moreover, both control and filter DAREs can be solved by the same routine, using an option ("mode") parameter, which specifies the op operator. The matrices A and E are not transposed. The implementation also optimizes the needed memory space. When possible, pairs of symmetric matrices of order n are stored economically in n × (n + 1) arrays, to reduce the workspace requirements, but preserving the two-dimensional array indexing, for efficiency. Details on the memory storage issues are given in [54].

The iteration is started by an initial (stabilizing) matrix X_0, which may not be given on input, if the zero matrix can be used. Since the solution computed by a Newton algorithm generally depends on initialization, an option specifies if the stabilizing solution X_s is to be found. This is assumed to be the case in the sequel. The initial matrix X_0 must then be stabilizing. Another option specifies whether to use the standard Newton method, or one of the modified Newton method variations, discussed below, which employ a line search strategy.

Optionally, the matrices A_k and E (if E is general) are scaled for solving the Stein equations, and their solutions are suitably updated. Note that the LAPACK subroutines DGEES and DGGES [5], which are called by the standard and generalized Stein solvers, respectively, to compute the real Schur(-triangular) form, do not scale the coefficient matrices. For some examples, without scaling, convergence was not achieved in a reasonable number of iterations. Moreover, scaling sometimes allows computing more accurate solutions and/or using fewer iterations, compared to the case with no scaling.

The essential computational procedures involved in Algorithm NDARE are detailed below.

3.1 Computing Residual and Closed-Loop Matrices

If R is nonsingular, DAREs can be put in a simpler form, which is more convenient for Newton algorithms. Specifically, setting

   Ã = A − σ op(B R^{-1} S^T),   Q̃ = Q − σ S R^{-1} S^T,   (6)

and redefining A and Q as Ã and Q̃, respectively, Eq. (1) reduces to

   0 = op(A)^T X op(A) − op(E)^T X op(E) − σ op(A)^T X Ĝ(X) X op(A) + Q =: R(X),   (7)


where Ĝ(X) := B R̂(X)^{-1} B^T, and the second term reduces to X in the standard case (E = I_n). The transformations in (6) eliminate the matrix S from the formulas to be used. In this case, the matrix K_k may sometimes no longer be computed in Step 2, and A_k = op(A) − σ Ĝ_k X_k op(A), with Ĝ_k := Ĝ(X_k).

To obtain Ã and Q̃ in (6), the Cholesky factor R_c of R can be used if R > 0, where R =: R_c^T R_c, with R_c upper triangular. Defining B̃ = B R_c^{-1} and S̃ = S R_c^{-1}, the relations (6) are equivalent to

   Ã = A − σ op(B̃ S̃^T),   Q̃ = Q − σ S̃ S̃^T,   (8)

so just two triangular systems need to be solved, and two matrix products are computed for obtaining Ã and Q̃, after factoring R. Symmetry is exploited for getting Q̃ via a BLAS [21] symm operation. Similarly, if R̂(X_k) > 0, then the Cholesky factor R̂_c(X_k) of R̂(X_k) can be used. Defining D_k := D(X_k) := B R̂_c(X_k)^{-1}, then Ĝ_k = D_k D_k^T, and A_k = op(A) − σ D_k D_k^T X_k op(A). The use of D_k instead of Ĝ_k is convenient when m is sufficiently smaller than n (m ≤ n/4). When R is not positive definite, then either the UDU^T or the LDL^T factorization [24] of R can be employed for computing Ã and Q̃. Similarly, the UDU^T/LDL^T factorization of R̂(X_k) can be used for obtaining Ĝ_k, when R̂(X_k) is indefinite.

When S ≠ 0, but R is ill-conditioned with respect to inversion, the use of the formulas (6) will potentially introduce large errors from the beginning of the iterative process, which will be propagated during the entire process. This might involve a degradation of its behavior, resulting in slower convergence, and/or an inaccurate computed solution. Using S during the iterative process could avoid such degradation. Therefore, an option of the solver allows avoiding the transformations (6) and involving S in all subsequent calculations. In this case, other formulas are needed, since Ĝ_k cannot be used. Specifically, define

   H_k := op(A)^T X_k B + S,   F_k = H_k R̂_c(X_k)^{-1},   (9)

with R̂_c(X_k) introduced above; for having F_k it is assumed here that R̂(X_k) > 0. (H_k is a convenient notation for L(X_k).) Then, the residual R(X_k) and the matrix A_k can be computed using

   R(X_k) = Q + op(A)^T X_k op(A) − op(E)^T X_k op(E) − σ F_k F_k^T,   (10)
   A_k = op(A) − σ D_k F_k^T,   (11)

where D_k has been defined above.

If, however, R̂(X_k) is indefinite, then the needed formulas follow directly from (1)–(3), namely,

   R(X_k) = Q + op(A)^T X_k op(A) − op(E)^T X_k op(E) − σ H_k K_k,   (12)
   A_k = op(A) − σ B K_k,   (13)

involving the UDU^T or LDL^T factorization of R̂(X_k). Moreover, symmetry of the matrix product H_k K_k is taken into account. (The solver computes either the upper or lower triangle of R(X_k).)
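A direct MATLAB transcription of (9)–(11) for op(M) = M and σ = +1, assuming R̂(X_k) > 0 so that chol applies, could look as follows; the variable names are illustrative and do not correspond to the SG02CD workspace layout:

```matlab
% Sketch of the residual and closed-loop evaluation (9)-(11), op(M) = M, sigma = +1.
Hk  = A'*Xk*B + S;            % H_k = L(X_k), Eq. (9)
Rck = chol(R + B'*Xk*B);      % upper Cholesky factor of R^(X_k)
Fk  = Hk/Rck;                 % F_k = H_k * Rc(X_k)^{-1}
Dk  = B/Rck;                  % D_k = B * Rc(X_k)^{-1}
RXk = Q + A'*Xk*A - E'*Xk*E - Fk*Fk';   % residual R(X_k), Eq. (10)
Ak  = A - Dk*Fk';             % closed-loop matrix, Eq. (11)
```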


The implementation is optimized by using common subexpressions when computing R(X_k) and op(A_k), taking also into account the ratio between n and m. Various formulas for the efficient implementation of Newton's method for AREs are proven in [48].

3.2 Iterative Process

The algorithm computes the initial residual matrix R(X_0) and the initial matrix op(A_0), where A_0 := op(A) − σ Ĝ_0 X_0 op(A). If no initial matrix X_0 is given, then X_0 = 0, R(X_0) = Q̃ and op(A_0) = Ã in (6).

At the beginning of iteration k, 0 ≤ k ≤ k_max, the algorithm decides to terminate or continue the computations, based on the current normalized residual r_k (and possibly on the relative residual r_r(X_k̃)), defined below. If min(r_k, r_r(X_k̃)) > τ, a standard (if E = I_n) or generalized Stein equation (4) is solved for N_k (the Newton direction).

The basic stopping criterion for the iterative process is stated in terms of a normalized residual, r_k := r(X_k), and a tolerance τ. If

   r_k := ‖R(X_k)‖_F / max(1, ‖X_k‖_F) ≤ τ,   (14)

the iterative process is successfully terminated. If τ ≤ 0, a default tolerance is used, defined in terms of the Frobenius norms of the given matrices, and the relative machine precision, ε_M. Specifically, τ is computed by the formula

   τ = min{ √ε_M √n ( ‖A‖²_F (1 + ‖D_0‖²_F) + ‖E‖²_F + ‖Q‖_F ), ε_M/10³ }.   (15)

(The factor ‖D_0‖²_F is replaced by ‖Ĝ_0‖_F if R̂(X_0) is indefinite.) The second operand of min in (15) was introduced to prevent deciding convergence too early for systems with very large norms of A, E, D_0 (or Ĝ_0), and/or Q.

For systems with very large norms of the matrices A, E, D_0 (or Ĝ_0), and/or Q, and small norm of the solution X, the termination criterion involving (15) might not be satisfied in a reasonable number of iterations (or never, due to accumulated rounding errors), while an acceptable approximate solution might be available much earlier. Therefore, the MATLAB-style relative residual, r_r(X_k̃), which includes the Frobenius norms of the four matrix terms of (1) in the denominator of its formula, is also tested at iterations k̃ = 10 + 5q, q = 0, 1, ..., and it might produce the termination of the iterative process, instead of the criterion based on the normalized residual. (Note that r_r(X_k) = r_0 for k < 10, and then it preserves its value for each 5 consecutive iterations, if any.) The relative residual is not tested at each iteration in order to reduce the computation costs, and to increase the chances of termination via the normalized residual test.

Another test is to check if updating X_k is meaningful. The updating is done if t_k ‖N_k‖_F > ε_M ‖X_k‖_F. If this is the case, set X_{k+1} = X_k + t_k N_k, and compute the updated matrices op(A_{k+1}) and R(X_{k+1}). Otherwise, the iterative process is terminated and a warning value is set, since no further significant, but only marginal, improvements can be expected, possibly after many additional iterations.
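In MATLAB-like form, the convergence and update tests of this subsection amount to the following sketch (RXk, Xk, Nk, tk and tau as in the text; illustrative only):

```matlab
% Sketch of the termination logic of Sect. 3.2: normalized residual test (14) and
% the significance test for the update X_{k+1} = X_k + t_k N_k.
rk = norm(RXk, 'fro')/max(1, norm(Xk, 'fro'));
if rk <= tau
    converged = true;                    % criterion (14) satisfied
elseif tk*norm(Nk, 'fro') > eps*norm(Xk, 'fro')
    Xk = Xk + tk*Nk;                     % meaningful update
else
    warnflag = true;                     % only marginal improvement possible
end
```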


Often, but mainly in the first iterations, the computed optimal steps t_k are too small, and the residual decreases too slowly. This is called stagnation, and remedies are used to escape stagnation. Specifically, in order to detect stagnation, the last computed k_B residuals are stored in an array RES. If ‖R(X_k + t_k N_k)‖_F > τ_s ‖R(X_{k−k_B})‖_F > 0, then t_k = 1 is used instead. The implementation uses τ_s = 0.9 and sets k_B = 2, but values as large as 10 can be used by changing this parameter. The first k_B entries of the array RES are reset to 0 whenever a standard Newton step is applied.

3.3 Finding the Step Size

The optimal step size t_k is given by

   t_k = arg min_t ‖R(X_k + t N_k)‖²_F.   (16)

Since solving (16) for a DARE is expensive, it was suggested in [9,10] to solve an approximate minimization problem. Using the Stein equation (4) and the Sherman-Morrison-Woodbury formula, it has been proven [9] (for op(M) = M and σ = 1) that

   R(X_k + t N_k) = (1 − t) R(X_k) − t² V̂_k,   (17)

where

   V̂_k := op(A_k)^T N_k (I_n + t Ĝ_k N_k)^{-1} Ĝ_k N_k op(A_k).   (18)

Let us consider replacing g_k(t) := (I_n + t Ĝ_k N_k)^{-1} by its Taylor series at 0,

   T_k(t) := I_n − t Ĝ_k N_k + t² (Ĝ_k N_k)² + O(t³).   (19)

If the series (19) converges (to g_k(t)), using its second order approximation, (17) can be rewritten as

   R(X_k + t N_k) = (1 − t) R(X_k) − t² V_k + O(t³),   V_k := op(A_k)^T N_k Ĝ_k N_k op(A_k).   (20)

A sufficient condition for the pointwise convergence of the series (19) is that, for any submultiplicative norm ‖·‖, t ∈ ℝ satisfies (see [9])

   |t| < 1 / ‖Ĝ_k N_k‖.

(21)

In addition, if t_k satisfies (21) and R̂_k := R̂(X_k) is nonsingular, then R̂_{k+1} := R + σ B^T (X_k + t_k N_k) B is nonsingular. However, there is no guarantee that R̂_{k+1} and I_n + t Ĝ_k N_k are nonsingular if t_k does not satisfy (21). Let

   f_k(t) := ‖(1 − t) R(X_k) − t² V_k‖²_F = trace( ((1 − t) R(X_k) − t² V_k)² ),   (22)

and the minimization problem (16) is replaced by min_t f_k(t). The function f_k is a polynomial of order 4,

   f_k(t) =: α_k (1 − t)² − 2 β_k (1 − t) t² + γ_k t⁴,   (23)


where [9] α_k = trace(R(X_k)²), β_k = trace(R(X_k) V_k), γ_k = trace(V_k²). If γ_k > 0, the quartic polynomial f_k(t) has either one local minimum and no local maxima, or two local minima and one local maximum. If γ_k = 0, then V_k = 0, and f_k(1) = 0, hence t_k = 1 minimizes f_k. In addition, V_k = 0 implies Ĝ_k N_k op(A_k) = 0 and from (17) it follows that X_{k+1} = X_k + N_k is the DARE solution.

It is easy to show that the function f_k(t) has a local minimum t_k ∈ [0, 2], and t_k can actually be chosen such that f_k(t_k) = min_{t∈[0,2]} f_k(t). Indeed, the derivative of f_k(t) is [9]

   f_k'(t) = −2 trace( (R(X_k) + 2 t V_k) ((1 − t) R(X_k) − t² V_k) ),   (24)

so that

   f_k'(0) = −2 trace( R(X_k)² ) ≤ 0,   f_k'(2) = 2 trace( (R(X_k) + 4 V_k)² ) ≥ 0.   (25)

Therefore, if f_k'(0) = 0, then ‖R(X_k)‖_F = 0 and t_k = 0. Otherwise, f_k'(0) < 0 and there must be a local minimum at a value t_k ∈ [0, 2], since f_k'(2) ≥ 0. If there are two local minima in [0, 2], t_k is chosen as the minimum of them. Moreover, it follows that

   f_k(t_k) < f_k(0) = ‖R(X_k)‖²_F,   (26)

unless R(X_k) = 0, i.e., unless X_k is a solution of DARE (1).

Using (21), for any t_k selected in the interval [0, 2], the convergence of the Taylor series T_k(t) to g_k(t) happens if

   ‖Ĝ_k N_k‖_F < 1/2.   (27)

Computational Experience with a Modified Newton Solver

151

quadratically. In addition, in such an instance, tk will be close to 1, and typically there will be no difference between the values of R(Xk )F computed for tk and for 1. Therefore, the calculations for finding tk are avoided. In the hybrid strategy, both standard Newton step and the step corresponding to the approximate line search procedure are computed, and that step which gives the smallest residual is selected at each iteration. Finally, the backtracking strategy, proposed in [9], is a special hybrid strategy in which the selected step is only taken provided there is a sufficient residual decrease. Otherwise, the step size is reduced until a sufficient decrease is eventually obtained. If this is not the case, or stagnation is detected, then a standard Newton step is used. This approach can increase the speed of the iterative process. More details are given in the next subsection. To solve the problem mint fk (t) for (23), a cubic polynomial (the derivative of fk (t)) is set up, whose roots in [0, 2], if any, are candidates for the solution of the approximate minimum residual problem. The roots of this cubic polynomial are computed by solving an equivalent 4-by-4 standard or generalized eigenproblem, following [31]. 3.4

Backtracking Strategy

The pure line search strategy usually gives good results if the convergence criteˆ k Nk )−1 is satisfied. Often, the computed rion (27) for the Taylor series of (In +tG step size yields a smaller residual than for tk = 1. Although for the step size computed by approximate line search, fk (tk ) ≤ R(Xk )2F (see (26)), there is no guarantee that R(Xk + tk Nk )F ≤ R(Xk + Nk )F ,

(28)

R(Xk + tk Nk )F ≤ R(Xk )F .

(29)

or However, if Xk is stabilizing and Nk is computed by Algorithm NDARE, then either Nk is a descent direction for f (X) := R(X)2F from Xk or Xk = Xs (Lemma 4.40 in [9]). This result shows that there is a tk > 0 so that R(Xk + tk Nk )F < R(Xk )F and, since tk > 0, (28) holds, unless Xk is already the DARE stabilizing solution. Such a tk can be found using the backtracking strategy, which searches for a value for t ensuring a “sufficient decrease” of the residual, √ (30) R(Xk + tNk )F ≤ 1 − δt R(Xk )F , where δ ∈ (0, 1) is a given constant. Taking δ too small may imply a slowly decreasing residual or even stagnation. (In such a case, a standard Newton step, tk = 1, may be used.) If (30) is neither satisfied by tk = 1, nor by tk computed by approximate line search, t may be decreased starting from t = 1. Backtracking uses the currently available information. For tk = 1, this information is given by fk (0) = R(Xk )2F , fk (0) = −2 R(Xk )2F , fk (1) = R(Xk + Nk )2F ,

(31)

152

V. Sima and P. Benner

and it can be used to interpolate a quadratic polynomial model,   qk (t) = fk (1) − fk (0) − fk (0) t2 + fk (0)t + fk (0).

(32)

Since fk (1) > fk (0) + fk (0), qk (t) has a global minimum at the value tk zeroing the derivative of qk (t), qk (t), i.e., tk =

−fk (0) = 2 fk (1) − fk (0) − fk (0) 

 1+

1 R(Xk +Nk )F R(Xk )F

2 .

(33)

If the next computed residual still does not satisfy the criterion (30), the next tk can be computed using a cubic model of fk (t), since additional information is available. The cubic model is used for the remaining iterations of the backtracking procedure. If t−1 and t−2 are the last two previously computed values for tk , the cubic polynomial interpolating fk (0), fk (0), fk (t−1 ), and fk (t−2 ) is ck (t) = αk t3 + βk t2 + fk (0)t + fk (0),

(34)

and the minimizing value is [9] tk =

−βk +



βk2 + 6αk R(Xk )2F . 3αk

(35)

The values αk and βk are computed by solving the 2 × 2 linear algebraic system obtained using the interpolating conditions for t−1 and t−2 , i.e., ck (t−1 ) = fk (t−1 ) and ck (t−2 ) = fk (t−2 ) . If δ < 1/4 in (30), then tk computed with (35) is real. The solver uses the value δ = 0.2. As for the standard backtracking procedure in nonlinear optimization, the following constraints on tk are imposed, t−1 /10 ≤ tk ≤ t−1 /2,

(36)

which prevent reducing tk too much (via the first inequality), and ensure a sufficient change in tk (via the second inequality). The initial values for t−1 and t−2 are 1 and tals , where tals is the step size computed by approximate line search.

4

Numerical Results

This section presents some results of an extensive performance investigation of the new solver based on Newton’s method. The numerical results have been obtained on an Intel Core i7-3820QM portable computer at 2.7 GHz, with 16 GB RAM, with the relative machine precision εM ≈ 2.22 × 10−16 , using Windows 7 Professional (Service Pack 1) operating system (64 bit), Intel Visual Fortran Composer XE 2015 and MATLAB 8.6.0.267246 (R2015b). A MATLAB executable MEX-function has been built using MATLAB-provided optimized LAPACK and BLAS subroutines.

Computational Experience with a Modified Newton Solver

153

Besides tests with randomly generated matrices, the results for which are not reported here, other tests have been conducted for linear systems from the COMPle ib collection [40]. Preliminary results have been presented in [46,47]. (The second reference summarizes the results obtained using Newton’s method for solving AREs for examples from the SLICOT benchmark collections for CAREs [1] and DAREs [2].) The COMPle ib collection contains 124 standard continuous-time examples (with E = In ), with several variations, giving a total of 168 problems. For testing purposes, these examples have been considered in this paper as being of discrete-time type. All but 16 problems (for systems of order larger than 2000, with matrices in sparse format) have been tried. The matrix S was always zero. In a series of tests, the performance index matrices Q and R have been chosen as identity matrices of suitable sizes. Results for this selection have been presented in [54], in comparison with the state-of-the-art MATLAB function dare. However, 63 problems did not satisfy the needed conditions for the existence of a stabilizing solution, and could not be solved by dare, which gave the error message “There is no finite stabilizing solution”. These examples have been omitted. In addition, other five examples, namely WEC1, WEC2, WEC3, HF2D CD4, and HF2D CD6, have been excluded from the comparison in [54]. For these examples, the solution computed by dare had a very large Frobenius norm (of order 1013 for WEC examples, 1010 and 1011 for the two HF2D examples), and relatively large normalized residuals, of order 10−4 or larger for WEC1–WEC3, 10−7 and 10−6 , for HF2D CD4 and HF2D CD6, respectively. Such matrices proved to offer a poor initialization for Newton’s method. In another series of tests discussed in [54], X0 was set to a zero matrix, if A was found to be stable; otherwise, an initialization of the Newton solver with a matrix computed using the stabilization algorithm in [6] was tried, and when this algorithm failed to deliver a stabilizing X0 matrix, the solution provided by dare was used. A zero initialization could be tried for 7 stable examples, namely AC5, REA4, BDT1, CSE1, TMD, FS, and ROC5, but the Newton solver failed for CSE1 with X0 = 0, since a singular Stein equation was found. The stabilization algorithm was tried on 82 unstable systems, and succeeded for 55 examples, hence it failed for 27 examples. Tests with X0 computed by the stabilization algorithm also for stable systems, or with X0 returned by dare for all examples, have also been successfully performed. The last set of tests shows the performance of the Newton solver in refining a solution computed by another solver. This paper considers also the five difficult examples mentioned above, and uses as reference the SLICOT-based MATLAB function sldaresg instead of dare. Moreover, results for different performance index matrices Q and R are given. Specifically, these matrices are defined as Q = 10AF In ,

R = 100BF Im .

(37)
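In MATLAB, these weights are simply built as in the following sketch:

```matlab
% Weight matrices (37) used in the second series of tests.
Q = 10*norm(A, 'fro')*eye(size(A, 1));
R = 100*norm(B, 'fro')*eye(size(B, 2));
```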

Both standard and modified Newton’s method, with or without balancing the coefficient matrices of the Stein equations, were tried. In addition, results obtained using line search with backtracking are reported.

4.1 Pure Line Search, Identity Matrices Q and R

Figure 1 displays the normalized residuals for examples from the COMPleib collection, using the MATLAB function sldaresg and the standard Newton solver, with default tolerance and sldaresg initialization. The behavior is very similar to that in [54], where the MATLAB function dare has been used instead of sldaresg. But here all examples solvable by dare are considered and the numerical difficulties are analyzed. With few exceptions, the Newton solver is either comparable with sldaregs or it improved the normalized residuals, sometimes by several orders of magnitude. However, for several examples (AC4, HF2D IS5, HF2D IS7, HF2D CD4, HF2D CD5, HF2D CD6, HF2D17, and HF2D18, numbered as 4, 60, 62, 64–66, 74, and 75, respectively, in Fig. 1), clearly worse results have been obtained. Line search succeeded in getting smaller normalized residuals than sldaregs for HF2D IS7 and HF2D CD5, but larger for HF2D CD4, as can be seen in Fig. 2. For HF2D CD4 with line search, this residual jumps at iteration 3 from about 4.25 × 10^−5 to 0.99999, to avoid stagnation. Then, there are several episodes where the residual is sequentially reduced until a new jump appears (possibly with larger amplitude), and the normalized and relative residuals never recover their smaller initial values. During the iteration, stability is lost, due to rounding errors and large values of the entries in X_k, which could exceed those encountered for the standard Newton algorithm. For HF2D CD6, stability is lost and the residuals increase without bound, starting from k > 2 and k ≥ 1 with the pure line search and standard solver, respectively. For HF2D IS7, the smallest normalized residual is at iteration k = 1. Then the values behave cyclically, increasing when t_k approaches 1, and decreasing in the next two to three iterates. The variations could be over four orders of magnitude. The standard Newton solver does not converge. A similar behavior is encountered for HF2D CD5. But for ROC5, ROC7, and ROC9, the Newton solver obtained residuals of order 10^−30.


Fig. 1. Normalized residuals for examples from the COMPle ib collection (taken as discrete-time systems), using SLICOT-based MATLAB function sldaregs and standard Newton solver, with default tolerance and sldaregs initialization.


Figure 3 displays the estimated condition numbers of the matrix U_1 used for obtaining the sldaregs solution as U_2 U_1^{-1}. (Actually, sldaregs returns the reciprocal condition numbers, rcond.) The sldaregs residuals closely follow the variation of the condition numbers, which could also influence the residuals of the Newton solvers. However, except for some examples with very high condition numbers, the Newton solvers are less influenced than dare or sldaregs. Table 1 shows the normalized residuals obtained by the Newton solver with line search and sldaregs for COMPleib examples with rcond(U_1) < 10^−10. Compared to sldaregs, the Newton solver succeeded in improving the residuals by several orders of magnitude (about seven for AGS, six for HE6, HE7, and DLR1, four for PAS, and two for WEC1 and NN11). The residuals of the two solvers are comparable for WEC2 and WEC3. The only really worse residuals are for HF2D CD4 and HF2D CD6.

Figure 4 plots the MATLAB-style relative residuals for examples from the COMPleib collection, using the SLICOT-based MATLAB function sldaregs and the standard Newton solver, default tolerance and sldaregs initialization. Similarly, Fig. 5 plots the MATLAB-style relative residuals for sldaregs and the Newton solver with line search. The standard and modified Newton solvers returned comparable or (much) smaller residuals except for seven and three examples, respectively, namely, HF2D IS6, HF2D IS7, HF2D CD4–HF2D CD6, HF2D17, and HF2D18 (numbered as 61, 62, 64–66, 74, and 75, respectively), in the first case, and HF2D CD4, HF2D CD6, and HF2D17 (numbered as 64, 66, and 74, respectively), in the second case. The standard solver produced much better results for HF2D CD4 and HF2D CD6 than the modified solver, for which all accuracy is lost.


Fig. 2. Normalized residuals for examples from the COMPle ib collection, using SLICOT-based MATLAB function sldaregs and Newton solver with line search, default tolerance and sldaregs initialization.



Fig. 3. Estimated condition numbers of the matrix U1 computed by sldaresg for obtaining the DARE solution X0 used to initialize Newton solver.

Figure 6 shows the corresponding elapsed CPU times for the two solvers. For 19 examples, the computations with the standard Newton method ended before finishing the first iteration, and just 10 examples (AGS, PAS, NN11, HF2D IS7, HF2D CD4–HF2D CD6, HF2D15, HF2D18, and DLR1) needed more than two iterations, namely, 11, 11, 11, 50, 50, 50, 32, 4, 3, and 4 iterations, respectively. For the same examples, the modified Newton method needed 1, 11, 11, 41, 50, 0, 50, 2, 0, and 1 iterations, and it was by three and two orders of magnitude more accurate for HF2D IS7 and HF2D CD5, respectively. Since very few iterations are most often needed, the CPU time for the Newton solver is usually significantly smaller than that for the MATLAB solver sldaregs.

Table 1. Normalized residuals for COMPleib examples with rcond(U_1) < 10^−10. For each example (Ex.# and name: 20 HE6, 21 HE7, 31 AGS, 32 WEC1, 33 WEC2, 34 WEC3, 40 PAS, 53 NN11, 64 HF2D CD4, 66 HF2D CD6, and 78 DLR1), the table gives rcond(U_1) and the normalized residuals obtained with line search and with sldaregs.


dare-style residuals for sldaresg and Newton solver

10-6

Newton sldaresg

10-8

dare-style residuals

10-10 10-12 10-14 10-16 10-18 10-20

0

10

20

30

40

50

60

70

80

90

Example #

Fig. 4. MATLAB-style residuals for examples from the COMPle ib collection, using SLICOT-based MATLAB function sldaregs and standard Newton solver, default tolerance and sldaregs initialization

The bar graph from Fig. 7 shows the improvement obtained using the standard Newton solver, default tolerance and sldaregs initialization. The height of the i-th vertical bar indicates the number of examples for which the improvement was between i − 1 and i orders of magnitude, in comparison to sldaregs. The number of examples in the six bins with nonzero values are 55, 17, 5, 6, 3, and 1, corresponding to improvements till one order of magnitude, between one and two orders of magnitude, and so on. Newton solver with line search leads to a similar improvement (the values in the first two bins are 53 and 19, and those in the bins 6–8 are 1, 2, and 1). dare-style residuals for sldaresg and Newton solver

100

Newton sldaresg

dare-style residuals

10-5

10-10

10-15

10-20

0

10

20

30

40

50

60

70

80

90

Example #

Fig. 5. MATLAB-style residuals for examples from the COMPle ib collection, using SLICOT-based MATLAB function sldaregs and Newton solver with line search, default tolerance and sldaregs initialization.

158

V. Sima and P. Benner CPU times for sldaresg and Newton solver

10-1

Newton sldaresg

CPU times (sec)

10-2

10-3

10-4

10-5

0

10

20

30

40

50

60

70

80

90

Example #

Fig. 6. Elapsed CPU time for examples from the COMPle ib collection, using SLICOTbased MATLAB function sldaregs and standard Newton solver, with default tolerance and sldaregs initialization.

4.2

Backtracking, Identity Matrices Q and R

Figure 8 displays the normalized residuals for examples from the COMPle ib collection, using MATLAB function sldaregs and the modified Newton solver with backtracking, tolerance εM and sldaregs initialization. For most examples, the solver found that there is no need to invoke the backtracking procedure. However, this procedure has been called for several examples, including PAS, NN11, HF2D IS7, HF2D CD4, HF2D CD6, and DLR1 (numbered as 40, 53, 62, 64, 66, and 78). The results for HF2D IS7, HF2D CD4, HF2D CD6, HF2D17, and HF2D18 (the last two numbered as 74 and 75) are much better than without Improvement of dare-style residuals

60

Number of examples

50

40

30

20

10

0 1

2

3

4

5

6

7

8

i

Fig. 7. Bar graph showing the improvement of the MATLAB-style residuals for examples from the COMPle ib collection, using standard Newton solver, default tolerance and sldaregs initialization. The height of the i-th vertical bar indicates the number of examples for which the improvement was between i − 1 and i orders of magnitude.

Computational Experience with a Modified Newton Solver

159

Normalized residuals for sldaresg and Newton solver

100

Newton sldaresg

Normalized residuals

10-5

10-10

10-15

10-20

0

10

20

30

40

50

60

70

80

90

Example #

Fig. 8. Normalized residuals for examples from the COMPle ib collection, using SLICOT-based MATLAB function sldaregs and modified Newton solver with backtracking, tolerance εM and sldaregs initialization

backtracking (see Fig. 2). The tolerance used was too small for some examples, e.g., HF2D IS7 and HF2D CD4. A similar behavior had the MATLAB-style relative residuals, see Fig. 9. Essentially larger relative residuals for Newton solver than for sldaregs have been obtained for HF2D IS7, HF2D CD4, HF2D CD5, and HF2D17. dare-style residuals for sldaresg and Newton solver

10-6

Newton sldaresg

10-8

dare-style residuals

10-10 10-12 10-14 10-16 10-18 10-20

0

10

20

30

40

50

60

70

80

90

Example #

Fig. 9. MATLAB-style residuals for examples from the COMPle ib collection, using SLICOT-based MATLAB function sldaregs and modified Newton solver with backtracking, tolerance εM and sldaregs initialization.

160

V. Sima and P. Benner Normalized residuals for sldaresg and Newton solver

100

Newton sldaresg

Normalized residuals

10-5

10-10

10-15

10-20

0

10

20

30

40

50

60

70

80

90

Example #

¯

¯ R= Fig. 10. Normalized residuals for examples from the COMPle ib collection, Q = Q, ¯ using SLICOT-based MATLAB function sldaregs and standard Newton solver, R, with default tolerance and sldaregs initialization.

4.3

Pure Line Search, Diagonal Matrices Q and R

Figure 10 displays the normalized residuals for examples from the COMPle ib collection, using MATLAB function sldaregs and the standard Newton solver, with default tolerance and dare initialization. The weighting matrices have been ¯ := 100BF Im . With few exceptions, ¯ := 10AF In and R = R chosen as Q = Q the Newton solver is either comparable with sldaregs or it improved the normalized residuals, sometimes by several orders of magnitude. However, for eight examples (HF2D IS7, HF2D IS8, HF2D CD4–HF2D CD6, HF2D15, HF2D17, and HF2D18, numbered as 62–66, 72, 74, and 75, respectively, in Fig. 10), worse results have been obtained. Line search succeeded to get smaller normalized residuals for HF2D IS7, HF2D IS8, and HF2D CD5, but produced worse results for WEC2 and PAS (numbered 33 and 40), as can be seen in Fig. 11. Actually, for WEC2 the residual R(Xk )F increased at each iteration k ≥ 3 when a step size close to or equal to 1 was taken, and exceeded its value at the previous such occurrence. A similar behavior occurred for HF2D CD4 and HF2D CD6; stability was lost for HF2D CD6. Also, for PAS, the solver did not stop at the minimum values for residual and normalized residual. The MATLAB-style relative residuals have an essentially similar shape. The standard Newton solver returned comparable to or (much) smaller residuals than sldaregs, except for HF2D IS7, HF2D IS8, HF2D CD4–HF2D CD6, HF2D13, HF2D15, HF2D16, and HF2D18. Line search produced better results for examples HF2D IS7, HF2D IS8, HF2D CD5, and HF2D15, but worse results for WEC2, see Fig. 12.

Computational Experience with a Modified Newton Solver

161

Normalized residuals for sldaresg and Newton solver

100

Newton sldaresg

Normalized residuals

10-5

10-10

10-15

10-20

0

10

20

30

40

50

60

70

80

90

Example #

¯ Fig. 11. Normalized residuals for examples from the COMPle ib collection, Q = Q, ¯ using SLICOT-based MATLAB function sldaregs and Newton solver with R = R, line search, default tolerance and sldaregs initialization. dare-style residuals for sldaresg and Newton solver

100

Newton sldaresg

dare-style residuals

10-5

10-10

10-15

10-20

0

10

20

30

40

50

60

70

80

90

Example #

¯ Fig. 12. MATLAB-style residuals for examples from the COMPle ib collection, Q = Q, ¯ using SLICOT-based MATLAB function sldaregs and Newton solver with R = R, line search, default tolerance and sldaregs initialization.

Figure 13 shows the elapsed CPU times for sldaregs and Newton solvers with or without balancing the matrices of Lyapunov equations. (STD and LS denote standard and modified Newton solvers, and BAL denotes balancing.) For 19 examples, the computations with standard Newton method ended before finishing the first iteration, and just eight examples (AGS, PAS, NN11, HF2D IS7, HF2D IS8, and HF2D CD4–HF2D CD6) needed more than two iterations, namely, 11, 11, 11, 50, 5, 50, 50, and 36 iterations, respectively. For the same examples, the modified Newton method needed 10, 11, 11, 11, 1, 50, 36, and 50 iterations, and it was by over three orders of magnitude more accurate for HF2D IS7 and HF2D CD5, and comparable for all other examples.

162

V. Sima and P. Benner CPU times for sldaresg and Newton solver with options

10-1

Newton LS Newton LS bal Newton STD Newton STD bal sldaresg

CPU times (sec)

10-2

10-3

10-4

10-5

0

10

20

30

40

50

60

70

80

90

Example #

¯ R = R, ¯ Fig. 13. Elapsed CPU time for examples from the COMPle ib collection, Q = Q, using SLICOT-based MATLAB function sldaregs and Newton solver with options, default tolerance and sldaregs initialization.

The improvement in accuracy is similar to that obtained for identity matrices Q and R. 4.4

Backtracking, Diagonal Matrices Q and R

Figure 14 displays the normalized residuals for examples from the COMPle ib collection, using MATLAB function sldaregs and the modified Newton solver with ¯ and R = R, ¯ tolerance εM and sldaregs initialization. With backtracking, Q = Q two exceptions (HF2D15 and HF2D16, numbered as 72 and 73), the Newton solver is either comparable with sldaregs or it improved the normalized residuals, sometimes by several orders of magnitude. Better results than with the pure line search strategy (Fig. 11) have been obtained for several examples, including WEC2, PAS, HF2D IS7, and HF2D CD4–HF2D CD6, but not due to a smaller tolerance in these cases. For some examples, e.g., NN11, HF2D CD4, HF2D CD5, and HF2D18, the residuals R(Xk )F recorded a monotonic decrease, contrary to the pure line search strategy. Also, backtracking strategy succeeded to find the stabilizing solution for HF2D CD6. Sometimes, e.g., for HE6, HE7, and AGS, the use of scaling option for the matrices of the Stein equations resulted in smaller residuals and less iterations than without scaling. Figure 15 plots the MATLAB-style relative residuals for examples from the COMPle ib collection, using SLICOT-based MATLAB function sldaregs and ¯ and R = R, ¯ tolerance modified Newton solver with backtracking, Q = Q εM and sldaregs initialization. The Newton solver returned comparable or (much) smaller residuals except for four examples, namely, HF2D IS8, HF2D13, HF2D15, and HF2D16 (numbered as 63, 70, 72, and 73, respectively).

Computational Experience with a Modified Newton Solver

163

Normalized residuals for sldaresg and Newton solver

100

Newton sldaresg

Normalized residuals

10-5

10-10

10-15

10-20

0

10

20

30

40

50

60

70

80

90

Example #

¯ Fig. 14. Normalized residuals for examples from the COMPle ib collection, Q = Q, ¯ using SLICOT-based MATLAB function sldaregs and modified Newton solver R = R, with backtracking, tolerance εM and sldaregs initialization. dare-style residuals for sldaresg and Newton solver

10-6

Newton sldaresg

10-8

dare-style residuals

10-10 10-12 10-14 10-16 10-18 10-20

0

10

20

30

40

50

60

70

80

90

Example #

¯ Fig. 15. MATLAB-style residuals for examples from the COMPle ib collection, Q = Q, ¯ using SLICOT-based MATLAB function sldaregs and modified Newton solver R = R, with backtracking, tolerance εM and sldaregs initialization.

For 8 examples, the computations with standard Newton method ended before finishing the first iteration, and only 23 examples needed more than two iterations. Note that for some examples, the number of iterations increased compared to the pure line search strategy, due to the use of a smaller tolerance, τ = εM in (14). The quality of the results is proven by the improvement in the relative residuals compared to sldaregs. The number of examples in the six nonempty bins are 45, 23, 7, 7, 1, and 4, showing an increased number of examples with improvements of more orders of magnitude (in the bins 2–4).

164

5

V. Sima and P. Benner

Conclusions

A Newton-type algorithm and various line search strategies for solving generalized discrete-time algebraic Riccati equations have been presented. The basic theory and conceptual algorithm, and its main computational steps, have been discussed. Highlighted were evaluation of residuals and closed-loop matrices at each iteration, determination of the step size, and the use of line search with backtracking. Algorithmic and implementation issues taken into account in the developed solver have been described. An extensive performance investigation on a comprehensive set of examples from the COMPle ib collection, taken as discretetime systems, has been performed, and the results are summarized. The numerical results illustrate the capabilities of this solver. The possibility to achieve, in few iterations, a reduction by one or more orders of magnitude of the normalized and MATLAB-style residuals of the solutions computed by MATLAB functions dare or sldaresg, makes Newton solver an attractive support tool for solving DAREs, and especially for improving the solutions computed by other solvers.


Influence of the Predictive Rainfall/Runoff Model Accuracy on an Optimal Water Resource Management Strategy
Baya Hadid and Eric Duviella
Institut Mines Telecom Lille Douai, University of Lille, Lille, France
{baya.hadid,eric.duviella}@imt-lille-douai.fr

Abstract. This work regards the improvement of water asset management and modelling strategies proposed recently for hydrographical networks in a context of global warming, which exacerbates extreme events. Indeed, hydrographical systems, and more specifically those including inland waterways, are large scale systems that involve mass and energy transport phenomena, namely the excess water induced by extreme rainy events. The water excess has to be dispatched over the complete network in order to anticipate the effects of potential floods, and rejected to the sea either by gravity flow, heeding the tides, or by pumps. The latter solution leads to high operation costs and has to be minimized. These goals are fulfilled by means of an integrated model within a flow-based network and a constrained quadratic optimization. The recent improvements in the field of predictive rainfall/runoff modelling approaches help to enhance the operational goals of this water management strategy. A particular example of these approaches is the identification of black-box ARX models in their linear and nonlinear structures. A performance comparison between the recursive least squares and the recursive instrumental variable estimation of an ARX model and the estimation of a Linear Parameter Varying ARX-based model is carried out on a river located in the north of France with a prediction horizon of 24 h. Further, the performance of the resource management methodology is tested by means of a realistic case study based on a portion of the real inland waterways located in the north of France. Keywords: Modelling · Water management · Rainfall/runoff model · Optimization · Large scale systems · Water system

1 Introduction

Transport via inland waterways benefits from economic and environmental advantages [8,28,32,33]. However, inland waterways will be strongly impacted by climate hazards [17,26,29]. This is confirmed by the studies in [2,45] and [27], which


focus on inland waterways in the UK, in China and on the Rhine, respectively. The frequency and intensity of future flood events are expected to be higher [5,6,10]. To deal with the management of inland waterways in a climate change context, a multi-scale management architecture has been proposed in [13] and developed in [36], allowing the use of modelling approaches and management strategies for water level control [22–24,40,41] and water volume allocation [35,37]. More recently, an improvement of the water resource allocation planning in inland waterways was carried out in [1], using a predictive optimization that considers a future horizon to deal with extreme rainy events. However, it has been shown that even if it is possible to control the hydraulic structures efficiently by considering a deterministic problem, uncontrolled intakes and withdrawals have a strong influence on this optimization. It is particularly the case for rainy events, which have a strong influence on inland waterways. Moreover, inland waterways with an outlet to the sea have not been considered. This implies a new challenging issue that requires taking into account the effect of the tide and the electrical cost of the pumps during the design of the water management strategies. The management strategy that consists in an optimal adaptive allocation planning of the water resource is adapted and improved in this paper by dealing with inland waterways with an outlet to the sea. The influence of the tide is taken into account. A criterion to minimize is defined based on the configuration of the inland waterways, the priorities on water intakes and withdrawals, the management objectives, the tide and the predicted volumes that come from rainy events. The amount of water volumes that come from rain can be estimated and forecast according to rainfall/runoff models. Several rainfall/runoff models have been proposed in the literature; a state of the art dedicated to rainfall/runoff models is proposed in this paper. Then, black-box models are designed, taking into account only input and output measurements, without deep knowledge of the physical parameters of the catchment areas. These methods are based on the Auto-Regressive eXogenous (ARX) model and the Linear Parameter Varying (LPV) model. Then, the requirement of an accurate prediction of the amount of water volumes that come from rain is highlighted by considering a realistic case study. The modelling tools and the optimal allocation planning approach are described in Sect. 2. In Sect. 3, a state of the art of existing predictive rainfall/runoff models is presented, and data-driven predictive models based on ARX and LPV are described and tested on real measurements. Then, a case study illustrating all the steps needed to implement the designed tools and the optimal management strategy is proposed in Sect. 4. This case study is used to highlight the importance of the rainfall/runoff model accuracy for the optimization of the water resource planning.

2 Water Asset Management Strategy
2.1 Integrated Model

An integrated model that is linked with a flow-based network has been proposed in [36] to deal with inland navigation networks. It aims at modelling the dynamics of each Navigation Reach (N R), i.e. a part of the canal located between two locks, as a tank with a sample time of several hours. It allows taking into account all the inputs and outputs, controlled or not, of each N R. For inland waterways with links to the sea, it is necessary to define a new element representing the output of the integrated model. The model consists in considering a finite number η of interconnected N R that are linked following the configuration of the network (see Fig. 1a). Each N R is numbered and denoted N Ri , with i ∈ 1 to η.

Fig. 1. (a) Inland navigation network, (b) corresponding model [19].

Each NRi is supplied and emptied by controlled water volumes from locks, controlled gates and pumps when they are available, and by uncontrolled water volumes from water intakes, rain, or exchanges with groundwater. Moreover, the network can be emptied thanks to the outlets to the sea. The set of controlled water volumes is:
1. controlled volumes from one or several upstream NR that supply NRi, denoted Vi^{s,c} (s: supply, c: controlled),
2. controlled volumes from NRi that empty it, denoted Vi^{e,c} (e: empty),
3. controlled volumes from water intakes that supply or empty NRi, denoted Vi^c. These volumes are signed: positive if NRi is supplied, negative otherwise,


4. controlled volumes from pumps that supply NRi, denoted Vi^{s,p} (s: supply, p: pumped),
5. controlled volumes from pumps that empty NRi, denoted Vi^{e,p},
6. controlled volumes that empty NRi to the sea, denoted Vi^{o,c} (o: outlet).
The set of uncontrolled water volumes is:
1. uncontrolled volumes from natural rivers, rainfall-runoff and human uses, denoted Vi^u (u: uncontrolled). These volumes are signed depending on their contribution to the volume Vi(k) in NRi,
2. uncontrolled volumes from exchanges with groundwater, denoted Vi^{g,u} (g: groundwater). These volumes are also signed.
The dynamic volume of NRi is computed according to the set of controlled and uncontrolled water volumes [19]:
Vi(k) = Vi(k−1) + Vi^{s,c}(k) − Vi^{e,c}(k) + Vi^c(k) + Vi^{s,p}(k) − Vi^{e,p}(k) + Vi^u(k) + Vi^{g,u}(k) − δi·Vi^{o,c}(k),

(1)

where k corresponds to the current period of time and k−1 to the previous one, with TM the sample time, which corresponds to several hours, and δi a variable equal to one when an outlet is available and to zero otherwise.
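As a minimal sketch of the volume balance of Eq. (1), the following function updates the volume of one reach over a period TM; the argument names are illustrative and all quantities are volumes over the period, signed as defined above.

    def reach_volume_update(V_prev, V_sc, V_ec, V_c, V_sp, V_ep, V_u, V_gu, V_oc, outlet=False):
        # Volume balance of one navigation reach over a period TM, following Eq. (1).
        # outlet=True when the reach has an outlet to the sea (delta_i = 1).
        delta = 1.0 if outlet else 0.0
        return V_prev + V_sc - V_ec + V_c + V_sp - V_ep + V_u + V_gu - delta * V_oc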

2.2 Flow-Based Network

A flow-based network is designed according to the integrated model as G = (N , A), where N is the set of ordered nodes (vertices) and A the set of arcs (directed edges). The set of nodes is composed of a common source vertex O without incoming edges, a common sink node without outgoing edges, denoted S, and a node for each N Ri . The total number of nodes is η = card(N ) + 2. An example of a flow-based network composed of three NR is depicted in Fig. 2. An arc links two nodes and is defined as a couple a = (i, j), a ∈ Rα with α = card(A), where i and j are the origin and destination node of the edge respectively. For networks with outlet to the sea, it is necessary to consider several arcs between the corresponding N R and the sink node S. Indeed, the water volumes can be rejected by gravity or thanks to the pumps with not the same cost. Therefore, these arcs are defined as couple aν = (i, S), aν ∈ Rα with α = card(A), where i is the origin node and ν ∈ {1 : m} with m the number of arcs between these two nodes. The water volume that is transferred between two nodes is represented by a flow associated to each arc a, such as φa (k) = φi j (k) with i and j the indexes of the nodes. Here again, it is necessary to consider several flows φνi S (k) between the N Ri and the sink node S. The exchanged water volumes are limited by physical characteristics of the hydraulic devices. Hence, dynamical boundary constraints are considered such as li j (k) ≤ φi j (k) ≤ ui j (k), where li j (k) and ui j (k) are the lower and upper bound constraints respectively; with liν S (k) ≤ φνi S (k) ≤ uνi S (k) for the arcs between N Ri and S.
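A small illustrative description of such a flow graph is sketched below, for a network like the one of Fig. 2 with three NR and two parallel arcs between NR2 and the sink S (gravity and pump); the node and arc names are hypothetical, and the bounds are simply the pairs l_a(k) ≤ φ_a(k) ≤ u_a(k) introduced above.

    nodes = ["O", "NR1", "NR2", "NR3", "S"]
    arcs = [
        {"from": "O",   "to": "NR1"},
        {"from": "O",   "to": "NR2"},
        {"from": "O",   "to": "NR3"},
        {"from": "NR1", "to": "NR2"},
        {"from": "NR1", "to": "NR3"},
        {"from": "NR1", "to": "S"},
        {"from": "NR2", "to": "S", "kind": "gravity"},  # nu = 1
        {"from": "NR2", "to": "S", "kind": "pump"},     # nu = 2
        {"from": "NR3", "to": "S"},
    ]

    def bounds_ok(phi, lower, upper):
        # Check l_a(k) <= phi_a(k) <= u_a(k) for every arc a (volumes over TM).
        return all(lo <= f <= up for f, lo, up in zip(phi, lower, upper))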


Fig. 2. Flow graph composed of three N R with one outlet [19].

The dynamical cost ωij(k) ∈ R^α is associated to each arc a; it is denoted ω^ν_{iS}(k) for the arcs between NRi and the outlet to the sea. These costs, constant over the period k, can differ between k and k+1. They allow taking into account a smaller cost for an arc that corresponds to a water transfer by gravity compared to an arc that corresponds to a water transfer by pump. The boundary constraints are computed in volume ([m³]) according to the device characteristics (gate, lock, pump, ...) and the time period TM, following rules that are well described in [12]. They have to reflect the real behaviour of the inland navigation networks. Considering the nodes, with the exception of O and S, a relative objective capacity Di(k), with i ∈ N − {O, S}, is assigned to each of them; it is equal to 0. The current capacity in the node NRi, denoted di(k), has to be equal to Di(k). It is computed as:

di(k) = di(k−1) + Σ_{a ∈ a+} φa(k) − Σ_{a ∈ a−} φa(k), for i ∈ N − {O, S},

(2)

where a+ is the set of arcs entering the node i, a− the set of arcs leaving the node i, and di(k−1) the capacity of the node i for the last period. That means that, at each time, the amount of water entering the node NRi has to be equal to the amount of water leaving the node. However, an interval around the objective Di(k) is allowed, leading to d̲i ≤ di(k) ≤ d̄i, with d̲i and d̄i the lower and upper bound constraints. Then, the capacity di(k) can be negative or positive. When di(k) is negative (respectively positive) at time k, NRi requires more (resp. less) water at time k+1. Even if the objective conditions cannot be reached at each time, the capacities di(k) have to be as close as possible to their objective. Hence, a dynamical cost function Wi((Di(k) − di(k))²), i ∈ N − {O, S}, is associated to


each capacity di(k). This function aims at penalizing the gap between the current capacity di(k) and the objective Di(k). Thus, the optimal water management consists in satisfying the objectives of each node, i.e. Di(k), by optimizing the flows Φ(k) in terms of minimal cost; the vector Φ(k) contains the set of flows φij(k), and Δ(k) the set of capacities di(k) at time k.

2.3 Optimal Allocation Planning

The objective of the optimal allocation planning consists in determining the optimal sequence of flows Φ to guarantee the objectives Di(k) [12,19]. It is based on the minimization of an objective criterion for each management step:
JV(k) = Σ_{i=1}^{η} Wi((Di(k) − di(k))²) + Σ_{a=1}^{α} ωa(k)·φa(k),
(3)

with k the current step time, η the number of nodes excluding O and S, and α the number of arcs. That includes the arcs between NRi and S, where the cost is computed as Σ_{ν=1}^{m} ω^ν_{i,S}(k)·φ^ν_{i,S}(k). Note that the value of di(k) depends on di(k−1) and on the flows φa(k). The first term in Eq. 3 corresponds to the cost due to the gap between the capacity di(k) and the objective Di; this cost can be expressed as a quadratic function. The second term is the cost of each flow. The quadratic programming method quadprog in Matlab is used to minimize JV(k) under the boundary and equality constraints defined for each flow and each capacity:
min JV(k) such that Lb(k) ≤ x(k) ≤ Ub(k) and A·x(k) = b(k),
(4)
with x(k) the vector composed of the elements of Φ(k) and Δ(k), Lb(k) and Ub(k) the boundary vectors, b(k) ∈ R^η the vector that contains the values of Δ(k−1) of the previous period, and the matrix A ∈ R^{η×(α+η)} composed of 0 or 1 following Eq. 3 and the structure of the flow graph. The initial conditions are di(k−1) = 0 for i ∈ [1, η]. The following Algorithm 1 is proposed to obtain the sequence of optimal flows Φ by minimizing the criterion JV(k), n times. In this algorithm, Ξ ∈ N^α gathers all the indexes of x(k) that correspond to the flows φa(k), and Ψ ∈ N^η the indexes of x(k) that correspond to the capacities di(k). This optimization approach leads to the determination of the optimal sequence of flows. It is well suited to a deterministic situation where all the flows are supposed to be known. However, the uncontrolled water volumes are often very difficult to determine with accuracy. It is nevertheless possible to estimate their average values based on real measurements, as proposed in [21], or based on rainfall/runoff models, as proposed in [11]. It will be highlighted that this information improves the quality of the proposed optimal water management. In the next section, a brief state of the art of predictive rainfall/runoff models is proposed.
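The paper solves (3)-(4) with MATLAB's quadprog; the following is only a minimal Python sketch of one management step under the same kind of bound and equality constraints, using scipy's SLSQP solver. The function name, argument layout and the purely quadratic node weights are illustrative assumptions, not the exact implementation of the paper.

    import numpy as np
    from scipy.optimize import minimize

    def allocation_step(W, omega, A_eq, b, lb, ub, D):
        # One management step: minimize J_V = sum_i W_i*(D_i - d_i)^2 + sum_a omega_a*phi_a
        # over x = [phi; d], subject to A_eq x = b and lb <= x <= ub (cf. Eqs. (3)-(4)).
        n_arcs, n_nodes = len(omega), len(W)

        def J(x):
            phi, d = x[:n_arcs], x[n_arcs:]
            return np.sum(W * (D - d) ** 2) + omega @ phi

        x0 = np.clip(np.zeros(n_arcs + n_nodes), lb, ub)
        res = minimize(J, x0, method="SLSQP",
                       bounds=list(zip(lb, ub)),
                       constraints=[{"type": "eq", "fun": lambda x: A_eq @ x - b}])
        return res.x[:n_arcs], res.x[n_arcs:]  # optimal flows and capacities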

3 Predictive Rainfall/Runoff Models
3.1 State of the Art

Rainfall/runoff modelling has received considerable attention from many researchers over the past two decades. This attention is motivated by the acceleration of climate change phenomena and their impact on water resources. The literature can be classified into two approaches according to the a priori knowledge of the system: physical models and data-driven models. The complexity of the considered system is too high for a physical-based/mathematical model to be considered, because of the huge number of parameters. Moreover, no theoretical hydrological model is able to simulate the behavior of a catchment [38]. On the contrary, data-driven, black-box, fully numerical modelling approaches establish models by using only input and output measurements. Many researchers have developed numerical rainfall/runoff models with varying degrees of success. A good survey on non-parametric data-driven modeling techniques is presented in [16], where neural networks (ANNs), genetic programming, evolutionary polynomial regression, support vector machines (SVM), M5 model trees, K-nearest neighbors, and multiple linear regression techniques are implemented and evaluated using daily stream flow data. The SVM approach was also explored in [3], where a short-term stream flow prediction was performed on hourly data. Other approaches based on neural networks were developed, for instance, by introducing a priori knowledge on the evapotranspiration (see [42] and references therein). However, despite the respectable results provided by machine learning techniques, they still need a huge database to achieve correct training and validation results.
Algorithm 1. Optimization algorithm [19].
Input: G, Lb, Ub, Wi, Δ(0); Output: Φ
b(1) = Δ(0)
For i ∈ 1, n do
  Build A with G
  Build JV(i)
  min JV(i) such that Lb(i) ≤ x(i) ≤ Ub(i) and A·x(i) = b(i)
  Return x(i)
  Φ(i) = xj(i) with j ∈ Ξ
  Δ(i) = xl(i) with l ∈ Ψ
  b(i+1) = Δ(i)
End
Return Φ
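A minimal sketch of the receding loop of Algorithm 1 is given below, assuming a step function such as the hypothetical allocation_step sketched earlier, which returns the optimal flows and capacities of one step; the callback signature is an assumption for illustration.

    def plan_flows(n_steps, delta0, step_fn):
        # Receding optimization of Algorithm 1: solve the step problem with b(i) = Delta(i-1)
        # and propagate the node capacities from one period to the next.
        delta = delta0
        flows = []
        for _ in range(n_steps):
            phi_i, delta = step_fn(b=delta)
            flows.append(phi_i)
        return flows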


In [38], a daily lumped rainfall/runoff model called GR4J (from the French "Génie Rural à 4 paramètres Journaliers") is presented as an improvement of GR3J [14,15], and its performance was tested using five criteria. This conceptual model is based on a reservoir system, and its initialization requires the knowledge of the soil saturation of the day before. This approach has shown its reliability once properly calibrated using a large amount of data, and was applied to several case studies [7,9,18]. In the family of parametric approaches, linear models consisting of transfer functions were first used due to their simplicity [43,46]. These methods have been abandoned because of the nonlinearities due to the evapotranspiration phenomenon. In [4], the nonlinearities were represented by a Hammerstein structure using a priori knowledge of the hydrological system to characterize the static function, which means that the resulting model is not transferable from one geographical location to another. Recently, a black-box Linear Parameter Varying (LPV) model was investigated for the Rainfall-Runoff Relationship (RRR) in urban drainage networks [39] and a rural catchment [30]. This kind of system captures many nonlinearities and depends on one or several external variables, called scheduling variables, and can then be linearized at different operating points, resulting in a set of local Linear Time-Invariant (LTI) systems. The issue with this approach comes from how to choose the right scheduling variable, which is not trivial. In [39], the scheduling variable was chosen as the output of a non-parametric model and the model was identified using the least-squares algorithm, whereas in [30] the scheduling parameter is taken as the output of the best linear model and the optimal identification Simplified Refined Instrumental Variable (SRIV) algorithm was applied. Both LPV methods lead to acceptable results. However, in [11], the authors proposed an online recursive nonlinear identification algorithm applied to the Liane river (France), which was compared to a recursive least-squares linear model over a future horizon of 24 h. Indeed, the online estimation allows tracking the catchment's intrinsic variations. The study over a horizon is innovative compared to the previously cited approaches, and shows that even when the Fit score [31] or the Nash coefficient [34] is good, the introduction of a prediction horizon deteriorates the estimation results and thus increases the number of false alarms and missed alarms. More recently, a semi-automatic software tool was proposed by (Wolfs et al., 2015). It combines different model structures, both physical (weir equations) and data-driven (ANNs, PieceWise Linear, etc.), depending on the river dynamics. In the continuation of the work presented in [19], the following section addresses an overview of some I/O model estimation approaches with an Auto-Regressive eXogenous (ARX) basis structure, applied to a river located in the north of France. Indeed, a comparison is performed between a recursive estimation of a linear model and an off-line estimation of a special kind of nonlinear system called a Linear Parameter Varying system, with a short-term prediction horizon. For a more detailed description of these approaches, the reader is invited to refer to [20].

3.2 Data-Driven ARX Models for Flood Forecasting

The advantage of the Auto-Regressive eXogenous (ARX) model structure is its fast and practical implementation with respect to the complexity of the studied large-scale system, while keeping the notion of dynamical behavior for an a posteriori analysis. Another positive feature is the diversity of the identification approaches developed around this structure, whether linear or nonlinear, online for a parameter update or off-line, with also the possibility of an evolving parametrization. The ARX regression vector structure allows predicting the river runoff directly at a short-term prediction horizon starting from hourly acquired data, thus avoiding an error propagation, as will be shown later. An ARX system has the following structure:
y(k) = θᵀ φH(k) + v(k)

(5)

where u(k) ∈ R^{nu} and y(k) ∈ R^{ny} are respectively the input and the measured output of the system at time k ∈ Z, and v(k) refers to zero-mean Gaussian noise samples with variance σ². For the sake of exposition, and according to the case study, only the SISO case is discussed in the following (nu = ny = 1). φH(k) is the regression vector, with the following structure:
φH(k) = [y(k−H−1) ··· y(k−H−na) u(k) u(k−1) ··· u(k−H−nb)]
(6)
na and nb are the orders of the discrete-time ARX model transfer function, θ ∈ R^n with n = na + nu(nb+1) is the associated parameter vector to be recursively estimated, and H is the prediction horizon. ARX estimation is classically used with a null prediction horizon H = 0, and the algorithm is then a one-step-ahead prediction model. Since the objective is to forecast the output at a future time k+f with f ≥ 1, the prediction is given by
ŷ(k+f|k) = θ̂ᵀ φ̂₀(k+f|k)
(7)
φ̂₀ = [ŷ(k+f−1|k) ··· ŷ(k+f−na|k) u(k+f) ··· u(k+f−nb)]
(8)

where φ̂₀ is the regression vector containing the estimated output ŷ instead of the real output y. We note here that φ̂₀ contains the predicted outputs that were previously and repeatedly estimated using the forecasted inputs u(k+1) to u(k+f−1), if they are available, to simulate the model output ŷ(k) = θ̂(k)ᵀ φ̂₀(k). The repetition of this operation causes an error propagation. To avoid this, a prediction horizon equal to f is used, H = f. The minor drawback is that the model is more complex due to the higher number of parameters to estimate. The predicted output is then defined by
ŷ(k+f|k) = θ̂ᵀ φH(k+f)

(9)

ϕH = [y(k − 1|k) · · · y(k − na |k) u(k + f ) · · · u(k + f − nb )]

(10)
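A minimal numpy sketch of the direct H-step-ahead predictor of Eqs. (6)-(10) is given below; the function names are illustrative, the data are assumed long enough for the delayed indices to exist, and the future inputs u(k+1..k+H) are assumed to be available from the rainfall forecast, as in the text.

    import numpy as np

    def arx_regressor(y, u, k, na, nb, H):
        # phi_H(k) as in Eq. (6): delayed outputs y(k-H-1..k-H-na) and inputs u(k..k-H-nb).
        past_y = [y[k - H - i] for i in range(1, na + 1)]
        past_u = [u[k - i] for i in range(0, H + nb + 1)]
        return np.array(past_y + past_u)

    def predict(theta, y, u, k, na, nb, H):
        # Direct H-step-ahead prediction yhat(k+H|k) = theta' phi_H(k+H), cf. Eqs. (9)-(10).
        return float(theta @ arx_regressor(y, u, k + H, na, nb, H))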


Recursive Least Squares (LS) Linear Model. The recursive least squares linear model is defined in [31]:
K(k) = P(k−1)φH(k) / (λ(k) + φHᵀ(k)P(k−1)φH(k)),
θ̂(k) = θ̂(k−1) + K(k)(y(k) − θ̂ᵀ(k−1)φH(k)),
P(k) = (1/λ(k)) (P(k−1) − K(k)φHᵀ(k)P(k−1)).
(11)

λ is a positive parameter between 0 and 1, commonly called the forgetting factor.
Recursive Instrumental Variable (IV) Linear Model. The iterative instrumental variable (IV) algorithm is used when the residue is correlated with the regression vector, caused for example by measurement errors. The aim is to build a new regression vector, called the instrument, that must be uncorrelated with the residue in order to cancel the bias:
K(k) = P(k−1)ζH(k) / (λ(k) + φHᵀ(k)P(k−1)ζH(k)),
θ̂(k) = θ̂(k−1) + K(k)(y(k) − θ̂ᵀ(k−1)φH(k)),
P(k) = (1/λ(k)) (P(k−1) − K(k)φHᵀ(k)P(k−1)).
(12)

where ζH is the regression vector formed by the past inputs and outputs (instrument) of the auxiliary model at each iteration. The main steps of the iterative IV algorithm are:
1: Initialization. Estimate the initial parameter vector (j = 0). The initial parameter vector can be the result of the recursive LS estimation.
2: Building the regressor. Simulate the output ξ(k) of the associated auxiliary model and build the regression vector ζ,
ζH(k) = [ξ(k−H−1) ··· ξ(k−H−na) u(k) u(k−1) ··· u(k−H−nb)]
(13)
3: Parameter estimation. Estimate the new parameter vector θ̂_{j+1} using the IV recursive algorithm (12).
4: Convergence test. Increment j and repeat steps 2 to 4 until
‖θ̂_{j+1} − θ̂_j‖ / ‖θ̂_j‖ < Tol or j > Niter,
(14)
where Tol is a threshold (e.g. Tol = 10⁻⁴) and Niter is the maximum number of iterations, both set by the user.
5: Return the parameter estimate θ̂.
Algorithm 2. Iterative IV Algorithm.
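A minimal sketch of one recursive update of Eq. (11) with a forgetting factor is shown below; the function name and default value of the forgetting factor are illustrative. For the IV variant of Eq. (12), the same update is used with the instrument ζ, built from the auxiliary model output, replacing φ in the gain and covariance updates.

    import numpy as np

    def rls_update(theta, P, phi, y, lam=0.98):
        # One recursive least-squares update with forgetting factor lam, cf. Eq. (11).
        denom = lam + phi @ P @ phi
        K = (P @ phi) / denom
        e = y - theta @ phi
        theta_new = theta + K * e
        P_new = (P - np.outer(K, phi) @ P) / lam
        return theta_new, P_new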


Linear Parameter Varying Model (LPV). The LPV model gives additional information about the current level value thanks to the scheduling variable, under the assumption that the latter is rigorously selected. In [47], the author considers the level value as a scheduling variable in his SDP (State Dependent Parameter) model because it is correlated with the soil storage. However, the level/flow is the variable that has to be estimated. Three alternatives are proposed: the past values of the level (or runoff), the past rainfall values and the output of the linear (delayed) model. In the prediction result section, only the LPV model with the past runoff values as scheduling variable will be presented, because it is superior in terms of performance, as shown in [20]:
y(k) = θᵀ ΦH(k) + v(k)

(15)

ΦH (k) = ϕH (k)(In ⊗ G(k))

(16)

with G(k) = [g0(p(k)) g1(p(k)) ... gr(p(k))]. In is an identity matrix of size n, ⊗ is the Kronecker product, G ∈ R^{r+1} and ΦH ∈ R^{n(r+1)}. r might be the order of the polynomial function in case of a polynomial dependence on the scheduling variable. p(k) is denoted pk for better readability. The coefficients ai and bj are real meromorphic functions, denoted here by g, with static dependence on p to allow identifiability [44]. This means a dependence only on the instantaneous value of p at time tk:
ai(pk) = Σ_{l=0}^{r} a_i^l gl(pk),
(17a)
bj(pk) = Σ_{l=0}^{r} b_j^l gl(pk),
(17b)

where a_i^l, b_j^l, l = 0, ..., r are constant coefficients. Generally, multiple choices exist to select the function g. Hereafter, we consider a polynomial dependence defined by Eq. (18), where r is the polynomial order. The polynomials ai and bj may naturally have different degrees, but hereafter they are considered with the same r:
gl(pk) = pk^l, l = 0, ..., r.
(18)
The system dependence on the scheduling variable is generally considered polynomial because the model structure becomes linear in the parameters, and the parsimony condition on the LPV model, i.e. providing a good approximation of the system with a minimum number of parameters, is satisfied. The parameter vector θ is then defined by:
θ = [a_1^0, ..., a_1^r, ..., a_{na}^0, ..., a_{na}^r, b_1^0, ..., b_1^r, ..., b_{nb}^0, ..., b_{nb}^r]

(19)

With the LPV regression vector thus defined, any of the classical linear identification approaches can be used to estimate θ. In the following, the Instrumental Variable algorithm is used once again to avoid estimation bias.
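As a minimal sketch of Eqs. (16)-(18), the expanded LPV regressor can be built from the ARX regressor and the scheduling variable with a polynomial basis; the function name is illustrative, and the expansion ΦH = φH(In ⊗ G) reduces to a Kronecker product when φH is treated as a vector.

    import numpy as np

    def lpv_regressor(phi, p, r):
        # Phi_H(k): each entry of the ARX regressor phi_H(k) is expanded into (r+1)
        # entries scaled by the polynomial basis g_l(p) = p**l, cf. Eqs. (16)-(18).
        G = np.array([p ** l for l in range(r + 1)])
        return np.kron(phi, G)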

3.3 Example of Prediction Results

The models described above are tested on the Liane river, located in the north of France. This river is 37 km long and has an average annual flow of 1.83 m³/s [25], but experiences an increasing number of severe flood events, with flows exceeding 20 m³/s. We consider that there is a significant flood event when the runoff exceeds 12 m³/s. The Liane river is equipped with a hydrometric station consisting of a limnimeter, to measure the river level, and a pluviometer. The flow rates are usually deduced from the level values using characteristic charts. The collected data are drawn in Fig. 3.

Fig. 3. Model input and output from January 2010 to June 2018. (a) Rainfall in [mm] (b) River runoff Q in [m3 /s].

Using the rainfall forecast provided by weather services and the observed runoff values, the objective here is to forecast the water flow rate with a prediction horizon of 24 h. The LPV model was trained using data collected from the hydrometric station between January 2010 and November 2017 and was validated on data between December 2017 and June 2018. It is not strictly necessary to train with such a large database, but it is recommended in order to take into account the maximum number of different behaviours. Figure 4 shows the outputs of the models described above between December 3rd, 2017 and December 14th, 2017. One can notice three significant floods in a row. The LPV model manages to predict the first flood in time and amplitude, in opposition to the recursive models, with a slight overestimation. The recursive methods are, in contrast, more successful at predicting the second and last floods, with a slight edge for


the LS approach. The online approaches are all the more interesting in that they do not need a calibration, because the model parameters are updated at each time step. The FIT [31] and Nash [34] performance indicators are also calculated for a global performance assessment. The LPV model gives respectively 66.4% and 0.88 for FIT and Nash, in contrast with the recursive LS (respectively 44% and 0.69) and the recursive IV (35% and 0.58).
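For reference, the two indicators can be computed as follows; these are the standard FIT and Nash-Sutcliffe definitions commonly used in system identification and hydrology, assumed here to match the references cited above, with illustrative function names.

    import numpy as np

    def fit_score(y, yhat):
        # FIT [%] = 100*(1 - ||y - yhat|| / ||y - mean(y)||).
        return 100.0 * (1.0 - np.linalg.norm(y - yhat) / np.linalg.norm(y - np.mean(y)))

    def nash_sutcliffe(y, yhat):
        # Nash-Sutcliffe efficiency: 1 - sum((y - yhat)^2) / sum((y - mean(y))^2).
        return 1.0 - np.sum((y - yhat) ** 2) / np.sum((y - np.mean(y)) ** 2)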

4 Case Study
4.1 Description

The considered case study is based on a real inland navigation network located in the north of France. The territory in this part of France is characterized by polders, i.e. low-lying land reclaimed from the sea. During rainy periods, natural inflows supply the polders (see Fig. 5). This amount of water in excess has to be pumped to the inland waterways in order to be discharged to the sea. The pumps from the polders are not controlled by the managers of the inland waterways and are therefore considered as uncontrolled discharges. A discharge meter is located in a natural river close to the polders. This sensor is used to forecast, thanks to the predictive rainfall/runoff models, the amount of water that will have to be discharged to the sea. It is not a direct forecast of the water volumes that come from the rain, but an estimation based on the correlation between the


Fig. 4. Observed Liane river runoff together with the recursive LS model output (green solid line), the recursive IV model output (magenta dotted line) and the LPV model output estimated using the IV algorithm (dashed red line), with H = 24.


discharge from the rain and the water volumes that are pumped to the inland waterways. The case study of the Liane river, presented in the previous section, is used to highlight the influence of the predictive models on the proposed water resource management strategy. The characteristics of the navigation reaches, locks, pumps and gates are realistic but not real. The network is composed of three NR that are interconnected as depicted in Fig. 5a. NR1 supplies NR2 through a controlled gate and a lock. It also supplies NR3 through another controlled gate and lock. NR2 is directly linked to the outlet to the sea, and linked with a lock to another NR that is not considered in this example. NR2 is equipped with a pump downstream that allows rejecting water to the sea. NR3 supplies another NR that is also not considered in this study. The integrated model and the associated flow graph are depicted in Figs. 5b and c, respectively. The characteristics of the system, i.e. the dimensions of the NR and the boundaries on water levels, are given in Table 1. The dimensions of the locks, the operating ranges of the gates and the average values of the uncontrolled discharges are the same as used in the case study presented in [19] and are given in Table 2. Notice that the operating range of Q^c_{dw2} depends on the tide. During low tide, the discharge due to gravity corresponds to Q^c_{dw2} = 30 [m³/s]. During high tide, this discharge is equal to 0. However, the pump can empty NR2 with a discharge between [0; 40] [m³/s] whatever

Fig. 5. (a) Studied network, (b) the integrated volume model, (c) the flow graph [19].


Table 1. Characteristics of NR, with length L in [km], width w in [m], depth l in [m] and upper and lower level boundaries in [m] [19].
        L        w     l    l(+)   l(−)
NR1     56.724   41.8  3.7  0.1    0.05
NR2     42.3     52    4.3  0.05   0.05
NR3     25.694   45.1  3.3  0.05   0.05





52

0.05

4.3 0.05 0.05

N R3 25.694 45.1 3.3 0.05 0.05

is the period of the day. That means that the operating range of Qcdw2 = [0; 70] during low tide and Qcdw2 = [0; 40] during high tide. Of course the cost of pumping is higher than the cost of water rejection by gravity. It is the reason of the consideration of two arcs between nodes 2 and S (see Fig. 5c). The flow φ12,S represents all the water volume that is emptied to the N R2 with the downstream lock and by gravity to the sea. The flow φ22,S is dedicated to the water volume that is pumped to be rejected to the sea. By taking into account the tide, the sample time that is considered in these simulations corresponds to TM = 6 h. ch 3 3 Table 2. Characteristics of the lock chamber υ{up;dw} i expressed in 10 .[m ], gates Qc{up;dw}i , controlled and uncontrolled inputs Qci and Qui ; discharges are expressed in [m3 /s]. X = nonavailable, and ∗ = depends to the operating conditions [19]. ch ch c υup i υdw i Qupi

4.2

Qcdwi

Qci Qui

N R1 6.7



X

N R2 3.5

23

[0; 6.4] [0; ∗]

N R3 5.9

7.3

[0; 30] [0; 60] 0



−1 6.56 0

0.63 1.2

4.2 Design of the Optimal Water Allocation Strategy

The lower and upper bound capacities of the arcs lij (k) and uij (k) are determined according to the flow graph G depicted in Fig. 5c and to the known discharge intervals that are given in Table 2 over the period TM [19]. By considering the case study, the sets are χ = {1}, and κ = {2, 3}, the number of lock operations is denoted βij (k) ∈ N with i the index of the upstream N R (i can be the node O) and j the index of the downstream N R (j can be the node S) [19]: 

1. upper bound capacities for arcs φ12 , φ13 , φO1 , φ12S , φ3S are the sum of the maximum available volumes from water intakes over TM , i.e. as an example V1u = Qu1 · TM , and volumes that correspond to the lock operations, i.e. as ch an example υup 1 · βO1 (k), 2. upper bound capacities for arcs {φO2 , φO3 } are the sum of the maximum available volumes from water intakes over TM , i.e. V2u = Qu2 · TM and V3u = Qu3 · TM respectively,



3. upper bound capacity for arc {φ1S }, corresponds to the sum of the maximum c c volumes that can empty N R1 during  TM , i.e. V1 = Q1 · TM , 2 4. upper bound capacity for arc φ2S , corresponds to the maximum discharge that can be pumped to the sea during TM , i.e. Qp2 · T M ,

5. lower bound capacities for arcs φ12 , φ13 , φ12S , φ3S are the volumes from ch that lock operations, i.e. as an example υdw 2 · β2S (k), 6. lower bound capacity for arc {φO1 } is the volume from that lock operation and the sum of the maximum available volumes from water intakes over TM , ch u u i.e. υup 1 · βO1 (k) and V1 = Q1 · TM ,

 7. lower bound capacities for arcs φO2 , φO3 , φ22S are equal to 0, 8. lower bound capacity for the arc {φ1S } is equal to the reserved discharge during period TM , i.e. V1c = Qc1 · TM , that leads to: ⎧ ch u ch u φ ∈ [υup 1 · βO1 (k) + Q1 · TM ; υup1 · βO1 (k) + Q1 · TM ], ⎪ ⎪ O1 ⎪ u u ⎪ φO2 ∈ [Q2 · TM ; Q2 · TM ], ⎪ ⎪ ⎪ u u ⎪ φ O3 ∈ [Q3 · TM ; Q3 · TM ], ⎪ ⎪ ⎪ c c ⎪ φ1S ∈ [Q1 · TM ; Q1 · TM ], ⎪ ⎪ ⎨ φ1 ∈ [υ ch · β (k); υ ch · β (k) + Qdw · T ], low tide, 2S

dw2

2S

dw2

2S

2

M

ch ch φ12S ∈ [υdw high tide, 2 · β2S (k); υdw 2 · β2S (k)], ⎪ ⎪ p ⎪ 2 ⎪ ∈ [0; Q · T ], φ M ⎪ 2 2S ⎪ ⎪ ch ch c ⎪ φ ∈ [υ 3 · β3S (k); υdw 3 · β3S (k) + Qdw 3 · TM ], ⎪ 3S dw ⎪ ⎪ ch ch c ⎪ φ12 ∈ [υup2 · β12 (k); υup2 · β12 (k) + Qup2 · TM ], ⎪ ⎪ ⎩ ch ch c φ13 ∈ [υup 3 · β13 (k); υup3 · β13 (k) + Qup3 · TM ],

(20)

where Q^p_2 is the maximum capacity of the pump, TM is scaled by 10⁻³ to obtain volumes in 10³·[m³], and Q̄ is the upper value of the controlled discharge interval. The management objective aims at keeping the capacity objective Di = 0 for each NR. The same constant quadratic cost function Wi is assigned to each NRi:
Wi((Di − di(k))²) = (Cmax/(d̲i)²)·(Di − di(k))², if di(k) ≤ 0,
Wi((Di − di(k))²) = (Cmax/(d̄i)²)·(Di − di(k))², if di(k) > 0,
(21)
with Cmax the maximal cost, and d̲i and d̄i the lower and upper boundaries respectively. For the proposed system, Cmax = 2,000, a large arbitrary value. It is assumed that water volumes that supply or empty the network from natural rivers {φO2, φO3, φ1S} have less priority than the

, φ12 , φ13 , φ12S others φ O1

 , φ3S . Thus, two different costs are chosen such as 1 ωO1 , ω12 , ω13 , ω2S {ωO2 , ω3S = 0 and , ωO3 , ω1S } = 1. Moreover, the cost 2 = 5. associated to the pump is defined as ω2S 4.3

Water Allocation Planning

The proposed water allocation planning algorithm has been implemented in Matlab. A Simulink model has been built to reproduce the dynamics of the



studied network. It is run at a discrete time TM corresponding to 6 h. At each step k, the current states of the NR, i.e. di(k), the navigation demand and the predicted water volumes that come from rain are taken into account for the minimization of the criterion JV(k) (see Eq. 4). New setpoints are therefore computed for the controlled devices of the considered network, and then a new simulation step is run. The results can be displayed at the end of the simulation. To test the proposed approach and highlight the requirement of a good prediction of the water volumes that come from rainfall/runoff models, several scenarios have been built. For all these scenarios, navigation is allowed during half of the day and then forbidden (12 h with navigation and 12 h without). Navigation is also forbidden on the 7th day, which corresponds to Sunday. The navigation demand is the same for all the scenarios; it is given in Table 3. The effect of the tide is also simulated, which allows the rejection of water volumes to the sea by gravity during 6 h at the beginning of the navigation periods and at the beginning of the non-navigation periods.
Table 3. Navigation demand over 1 week [19].
Day    1   2   3   4   5   6   7
βO1    21  19  20  22  21  20  0
β12    13  10  14  12  13  14  0
β13    14  12  15  16  13  14  0
β2S    10  9   10  11  9   11  0
β3S    16  15  16  18  16  15  0


The scenarios are built by considering extreme rainy events. They impact the inland navigation network directly by increasing the uncontrolled discharges that supply each NR. These uncontrolled discharges are multiplied by 3 between times 1 and 4, and by more than 4 between times 7 and 14 (see Fig. 6). The first scenario (Scenario 1) consists in using the water allocation planning algorithm without any prediction of the increase of the uncontrolled discharges. NR1 is the most impacted, because the magnitude of the uncontrolled discharges is high and the water allocation planning is not able to allocate the water between the NR (see Fig. 7a). The rain also has some flood consequences in NR3 during the second rainy event (see Fig. 7c). However, the volume of NR2 stays inside the defined boundaries thanks to the use of the pump (see Fig. 8b) and the water rejection to the sea by gravity (see Fig. 8a). The water volumes can be rejected by gravity to the sea only during low tide. The tide is depicted as a red dotted line in Fig. 8a. When additional water volumes coming from rain have to be rejected during high tide, it is necessary to use the pump to guarantee the navigation condition in NR2 (see sample times 4 and 12 in Fig. 8b). This strategy allows limiting the cost due to the use of the pump.


Fig. 6. Uncontrolled discharges Qu1, Qu2 and Qu3 for the defined scenario with two periods of strong rain, where a sample time corresponds to 6 h [19].

Fig. 7. Scenario 1: in red line, the water volume in (a) the N R1 , (b) the N R2 and (c) the N R3 , in blue dashed line the allowed boundaries [19].


The second scenario (Scenario 2) is based on the strong assumption that the water volumes from rainy events are perfectly predicted. With this assumption, the three NR keep their navigation objective perfectly, as depicted in Fig. 9. In this case also, the pump and the water rejection by gravity are used to keep the objective in NR2 by taking the tide into account (see Fig. 10). The water volumes that are pumped are not very different from those of Scenario 1, but in Scenario 2 the water volumes have been better allocated between the three NR. This strategy allows limiting the cost due to the use of the pump. The third scenario (Scenario 3) consists in considering an error of 30% in the prediction of the water volumes from rainy events. Based on this assumption, NR1 is still the most impacted. However, its volume is kept inside the defined boundaries (see Fig. 11a). The objectives are also guaranteed for NR2 and NR3, as depicted in Fig. 11b and in Fig. 11c. The pump and the water rejection by gravity are still used to keep the objective in NR2 by taking the tide into account (see Fig. 12). The pumped volumes are more progressive over time, but not very different from the first two scenarios (Fig. 12). The three scenarios show that the prediction of strong rainy events is required to optimize the water resource management of inland navigation networks. However, Scenario 3 indicates that the management objectives can be kept even if a significant error, i.e. 30%, is made on this prediction. Hence, a strong effort has to be made on the design of accurate predictive rainfall/runoff models.

Fig. 8. Scenario 1: water volumes that are rejected by (a) gravity to the sea, (b) pump [m3 ], with the tide depicted in red dotted line [19].


Fig. 9. Scenario 2: in red line, the water volume in (a) the N R1 , (b) the N R2 and (c) the N R3 , in blue dashed line the allowed boundaries [19].

Fig. 10. Scenario 2: water volumes that are rejected by (a) gravity to the sea, (b) pump [m3 ], with the tide depicted in red dotted line [19].


Fig. 11. Scenario 3: in red line, the water volume in (a) the N R1 , (b) the N R2 and (c) the N R3 , in blue dashed line the allowed boundaries [19].

Fig. 12. Scenario 3: water volumes that are rejected by (a) gravity to the sea, (b) pump [m3 ], with the tide depicted in red dotted line [19].

5 Conclusions

This study dealt with the tailoring of existing approaches, based on an integrated model and a flow graph, to inland waterway systems with an outlet to the sea. To this end, the effect of the tide was added to the water allocation planning algorithm and simulated accordingly, to allow the rejection of water by gravity according to the navigation. These tools were tested on a realistic case study based on inland waterways located in the north of France, and showed the necessity of a good prediction of the impact of rainy events to guarantee optimal management objectives, although acceptable simulation results are noted for a relatively large prediction error. For this purpose, parametric rainfall-runoff approaches were applied and compared on data collected at a river hydrometric station in the north of France. The objective is first to highlight the intrinsic parameter variation of the rainfall-runoff relationship, and also to provide a 24 h ahead forecast precise enough to improve the proposed management strategy. The application of the designed management tools to real inland waterways over a larger portion, with a higher number of navigation reaches comprising multiple bifurcations and confluences, will be tackled in future works.

References
1. Alves DCCS, Duviella E, Doniec A (2018) Improvement of water resource allocation planning of inland waterways based on predictive optimization approach. In: Proceedings of the 15th international conference on informatics in control, automation and robotics (ICINCO 2018), pp 305–312
2. Arkell B, Darch G (2006) Impact of climate change on London's transport network. Proc ICE - Municipal Eng 159:231–237
3. Asefa T, Kemblowski M, McKee M, Khalil A (2006) Multi-time scale stream flow predictions: the support vector machines approach. J Hydrol 318(1):7–16
4. Bastin G, Moens L, Dierick P (2009) Online river flow forecasting with Hydromax: successes and challenges after twelve years of experience. In: Proceedings of the 15th IFAC symposium on system identification, Saint-Malo, France, 6–8 July 2009
5. Bates B, Kundzewicz Z, Wu S, Palutikof J (2008) Climate change and water. Technical report, Intergovernmental Panel on Climate Change, Geneva
6. Boé J, Terray L, Martin E, Habets F (2009) Projected changes in components of the hydrological cycle in French river basins during the 21st century. Water Resour Res 45
7. Bourgin F (2014) Comment quantifier l'incertitude prédictive en modélisation hydrologique? Travail exploratoire sur un grand échantillon de bassins versants. Ph.D. thesis, thèse de doctorat dirigée par Andréassian, Vazken, Hydrologie, Paris, AgroParisTech 2014. http://www.theses.fr/2014AGPT0016
8. Brand C, Tran M, Anable J (2012) The UK transport carbon model: an integrated life cycle approach to explore low carbon futures. Energy Policy 41:107–124
9. Dakhlaoui H, Ruelland D, Tramblay Y, Bargaoui Z (2017) Evaluating the robustness of conceptual rainfall-runoff models under climate variability in northern Tunisia. J Hydrol 550:201–217. https://doi.org/10.1016/j.jhydrol.2017.04.032



10. Ducharne A, Habets F, Pagé C, Sauquet E, Viennot P, Déqué M, Gascoin S, Hachour A, Martin E, Oudin L, Terray L, Thiéry D (2010) Climate change impacts on water resources and hydrological extremes in northern France. In: XVIII conference on computational methods in water resources, Barcelona, Spain, June 2010
11. Duviella E, Bako L (2012) Predictive black-box modeling approaches for flow forecasting of the Liane river. In: SYSID 2012, Bruxelles, Belgium, 11–13 July 2012
12. Duviella E, Nouasse H, Doniec A, Chuquet K (2016) Dynamic optimization approaches for resource allocation planning in inland navigation networks. In: ETFA 2016, Berlin, Germany, 6–9 September 2016
13. Duviella E, Rajaoarisoa L, Blesa J, Chuquet K (2013) Adaptive and predictive control architecture of inland navigation networks in a global change context: application to the Cuinchy-Fontinettes reach. In: IFAC MIM conference, Saint Petersburg, 19–21 June 2013
14. Edijatno, Michel C (1989) Un modèle pluie-débit journalier à trois paramètres. La Houille Blanche 2:113–121
15. Edijatno, Nascimento NDO, Yang X, Makhlouf Z, Michel C (1999) GR3J: a daily watershed model with three free parameters. Hydrol Sci J 44(2):263–277
16. Elshorbagy A, Corzo G, Srinivasulu S, Solomatine DP (2010) Experimental investigation of the predictive capabilities of data driven modeling techniques in hydrology - part 2: application. Hydrol Earth Syst Sci 14(10):1943–1961
17. EnviCom (2008) Climate change and navigation - waterborne transport, ports and waterways: a review of climate change drivers, impacts, responses and mitigation. EnviCom - Task Group 3
18. Ficchi A (2017) An adaptive hydrological model for multiple time-steps: diagnostics and improvements based on fluxes consistency. Theses, Université Pierre et Marie Curie - Paris VI, France, February 2017. https://tel.archives-ouvertes.fr/tel-01619102
19. Hadid B, Duviella E (2018) Water asset management strategy based on predictive rainfall/runoff model to optimize the evacuation of water to the sea. In: Proceedings of the 15th international conference on informatics in control, automation and robotics (ICINCO 2018), pp 76–85
20. Hadid B, Duviella E, Lecoeuche S (2018) Improvement of a predictive data-driven model for rainfall-runoff global characterization of a river. In: Proceedings of the 4th international forum on research and technologies for society and industry, Palermo, Italy, September 2018
21. Horváth K, Duviella E, Rajaoarisoa L, Chuquet K (2014) Modelling of a navigation reach with unknown inputs: the Cuinchy-Fontinettes case study. Simhydro, Sophia Antipolis, 11–13 June 2014
22. Horváth K, Duviella E, Rajaoarisoa L, Negenborn R, Chuquet K (2015) Improvement of the navigation conditions using a model predictive control - the Cuinchy-Fontinettes case study. In: International conference on computational logistics, Delft, The Netherlands, 23–25 September 2015
23. Horváth K, Petreczky M, Rajaoarisoa L, Duviella E, Chuquet K (2014) MPC of water level in a navigation canal - the Cuinchy-Fontinettes case study. In: European control conference, Strasbourg, France, 24–27 June 2014
24. Horváth K, Rajaoarisoa L, Duviella E, Blesa J, Petreczky M, Chuquet K (2015) Enhancing inland navigation by model predictive control of water level - the Cuinchy-Fontinettes case. In: Ocampo-Martinez C, Negenborn R (eds) Transport of water versus transport over water: exploring the dynamic interplay between transport and water. Springer, Heidelberg



25. http://www.hydro.eaufrance.fr
26. IWAC (2009) Climate change mitigation and adaptation. Implications for inland waterways in England and Wales. Report
27. Jonkeren O, Rietveld P, van Ommeren J (2007) Climate change and inland waterway transport: welfare effects of low water levels on the river Rhine. J Transp Econ Policy 41:387–412
28. Kara S, Sihn W, Pascher H, Ott K, Stein S, Schumacher A, Mascolo G (2015) A green and economic future of inland waterway shipping. In: Procedia CIRP - the 22nd CIRP conference on life cycle engineering, vol 29, pp 317–322. https://doi.org/10.1016/j.procir.2015.02.171
29. Koetse MJ, Rietveld P (2009) The impact of climate change and weather on transport: an overview of empirical findings. Transp Res Part D Transp Environ 14(3):205–221. https://doi.org/10.1016/j.trd.2008.12.004
30. Laurain V (2010) Contributions à l'identification de modèles paramétriques non linéaires. Application à la modélisation de bassins versants ruraux. Ph.D. thesis, Université Henri Poincaré, Nancy 1
31. Ljung L (1999) System identification: theory for the user, 2nd edn. Prentice Hall, Upper Saddle River
32. Mallidis I, Dekker R, Vlachos D (2012) The impact of greening on supply chain design and cost: a case for a developing region. J Transp Geogr 22:118–128
33. Mihic S, Golusin M, Mihajlovic M (1993) Policy and promotion of sustainable inland waterway transport in Europe - Danube river. Renew Sustain Energy Rev 15:1801–1809
34. Nash JE, Sutcliffe JV (1970) River flow forecasting through conceptual models part I: a discussion of principles. J Hydrol 10(3):282–290
35. Nouasse H, Doniec A, Lozenguez G, Duviella E, Chiron P, Archimède B, Chuquet K (2016) Constraint satisfaction problem based on flow transport graph to study the resilience of inland navigation networks in a climate change context. In: IFAC conference MIM, Troyes, France, 28–30 June 2016
36. Nouasse H, Horváth K, Rajaoarisoa L, Doniec A, Duviella E, Chuquet K (2016) Study of global change impacts on the inland navigation management: application on the Nord-Pas de Calais network. Transport Research Arena, Warsaw, Poland
37. Nouasse H, Rajaoarisoa L, Doniec A, Chiron P, Duviella E, Archimède B, Chuquet K (2015) Study of drought impact on inland navigation systems based on a flow network model. In: ICAT, Sarajevo, Bosnia and Herzegovina
38. Perrin C, Michel C, Andréassian V (2003) Improvement of a parsimonious model for streamflow simulation. J Hydrol 279:275–289
39. Previdi F, Lovera M (2009) Identification of parametrically-varying models for the rainfall-runoff relationship in urban drainage networks. IFAC Proc Vol 42(10):1768–1773. https://doi.org/10.3182/20090706-3-FR-2004.00294. 15th IFAC symposium on system identification
40. Rajaoarisoa L, Horváth K, Duviella E, Chuquet K (2014) Large-scale system control based on decentralized design. Application to Cuinchy Fontinettes reach. In: IFAC world congress, Cape Town, South Africa, 24–29 August 2014
41. Segovia P, Rajaoarisoa L, Nejjari F, Puig V, Duviella E (2017) Decentralized control of inland navigation networks with distributaries: application to navigation canals in the north of France. In: ACC 2017, Seattle, WA, USA, 24–26 May 2017



42. Siou LKA, Johannet A, Pistre S, Borrell V (2010) Flash floods forecasting in a Karstic basin using neural networks: the case of the Lez basin (south of France). In: Andreo et al (eds) International symposium on Karst - ISKA, Malaga. Advances in research in karst media. Springer
43. Tóth R, Felici F, Heuberger PSC, Van den Hof PMJ (2007) Discrete time LPV I/O and state space representations, differences of behavior and pitfalls of interpolation. In: Proceedings of the European control conference, Kos, Greece, pp 5418–5425, July 2007
44. Tóth R, Laurain V, Gilson M, Garnier H (2011) On the closed loop identification of LPV models using instrumental variables. In: Proceedings of the 18th IFAC world congress, Milano, Italy, vol 44, no 1, pp 7773–7778
45. Wang S, Kang S, Zhang L, Li F (2007) Modelling hydrological response to different land-use and climate change scenarios in the Zamu river basin of northwest China. Hydrol Process 22:2502–2510
46. Young PC (1986) Time series methods and recursive estimation in hydrological systems analysis. In: Kraijenhoff DA, Moll JR (eds) River flow modelling and forecasting. D. Reidel, Dordrecht, pp 129–180
47. Young P (2003) Top-down and data-based mechanistic modelling of rainfall-flow dynamics at the catchment scale. Hydrol Process 17:2195–2217

Interlocking Problem in Automatic Disassembly Planning and Two Solutions

Feiying Lan1, Yongjing Wang1, Duc Truong Pham1, Jiayi Liu1,2, Jun Huang1, Chunqian Ji1, Shizhong Su1, Wenjun Xu2, Quan Liu2, and Zude Zhou2

1 The University of Birmingham, Edgbaston, Birmingham B15 2TT, UK
{fxl655,jxl1012}@student.bham.ac.uk, {y.wang,d.t.pham,j.huang.1,c.ji,s.su.2}@bham.ac.uk
2 Wuhan University of Technology, Wuhan 430070, China
{xuwenjun,quanliu,zudezhou}@whut.edu.cn

Abstract. In remanufacturing, disassembly is the first step to dismantle end-of-life products into components, which is labour-intensive due to the variability of returned products. Compared to manual disassembly, robotic disassembly is a promising technique to automate remanufacturing processes, relieving human workers of repetitive disassembly operations. However, it requires predesigned disassembly sequences which are planned manually. Several planning methods have been proposed to remove removable parts sequentially. However, those methods can fail in the disassembly sequence planning task if the product has interlocked components. This paper first explains the interlocking problem and then proposes two solutions. One solution is to identify subassemblies by using 'separable pairs'. It complements conventional sequential disassembly planning methods and enables automatic detection of subassemblies online. The other method is based on a divide-and-conquer disassembly strategy which allows subassemblies to be detected before disassembly. This approach generates disassembly sequence plans that are hierarchical, in order to avoid interlocking problems and reduce computational complexity.

Keywords: Remanufacturing · Disassembly planning · Dismantling · Robotic disassembly · Divide-and-conquer strategy · Hierarchy analysis

1 Introduction

Remanufacturing is "the rebuilding of a product to specifications of the original manufactured product using a combination of reused, repaired and new parts" [6]. Disassembly is an important feature that distinguishes remanufacturing from conventional manufacturing. It is labour-intensive, involves complex operations and is carried out manually due to the variability in the condition of the returned products. A study of the robotic disassembly of a PC [7] in the mid-1990s started the research on automated disassembly systems. Afterwards, the successful attempts at dismantling electrical devices and automotive components [1, 2, 14] showed more encouraging



results. These experiments were mostly product-oriented and based on pre-programmed sequences. Compared with 'automated' disassembly using pre-programmed disassembly sequences, a key advance of 'autonomous' disassembly is that the machine can interpret the product structure and plan the disassembly sequence itself. A popular approach is based on graphs [8, 13]. Many algorithms and rule-based methods have been used to calculate disassembly sequences, for example, the Fuzzy Reasoning Petri Net proposed by Zhao and Li [16]. However, a graph cannot be generated exclusively by machine understanding; it requires human interpretation.

Smith et al. proposed a method to plan disassembly sequences based on five matrices with several rules to represent an assembly [9, 11]. Tao et al. modified the matrices to enable partial/parallel disassembly [12]. However, the complexity of the mathematical representation of an assembly is not reduced in this optimisation-focused work. Besides, it requires fasteners and general parts to be distinguished despite their fuzzy definitions; for example, it is unclear whether an object in a press-fit component should be categorised as a fastener or a general part. Jin et al. [4, 5] proposed another matrix-based method in which the relationships of components are described by a single matrix.

The matrix-based methods work well on sequential disassembly tasks, but they fail if the product has interlocked components. In this case, breaking an assembly into subassemblies is required in disassembly plans. Based on an analysis of over 239 mechanical products by the authors' team, 23% of them contain interlocked components [3]. They cannot be correctly dealt with using conventional sequential disassembly planning methods. This paper presents two solutions. One is to identify subassemblies by using 'separable pairs' to complement conventional sequential disassembly planning methods [15]. The other solution is based on a divide-and-conquer disassembly strategy, which is essentially a hierarchical rather than a sequential planning method and thereby avoids the interlocking problem.

Section 2 presents the mathematical representation of a product and explains interlocking problems. Section 3 presents the first solution, the use of 'separable pairs' to find subassemblies. Section 4 gives a new mathematical representation of assemblies to be adopted in a divide-and-conquer strategy. Section 5 explains the strategy and describes its use for disassembly sequence planning. A case study is given in Sect. 6 to demonstrate the use of the two proposed solutions.

2 Contact and Relation Matrices: Fundamental Tools

Jin et al. [4, 5] demonstrated a method based on the space interference matrix to find removable components and generate feasible disassembly sequences. The key point of this method is to find removable components which have freedom in at least one direction. By taking away removable components iteratively, the product can be disassembled step-by-step in a sequential way.
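The following Python fragment is a minimal sketch of this sequential planning idea, not the authors' implementation; the data layout (a dictionary of six-character direction codes for the interference relations between pairs of parts) is an illustrative assumption.

import itertools

def removable_parts(parts, interferes):
    # interferes[a][b] is a string such as '110111': one flag per direction
    # (x+, x-, y+, y-, z+, z-), '1' meaning b blocks a in that direction.
    removable = []
    for a in parts:
        for d in range(6):                       # try each of the six directions
            if all(interferes[a][b][d] == '0'    # nobody blocks a along d
                   for b in parts if b != a):
                removable.append(a)
                break
    return removable

def sequential_plan(parts, interferes):
    # Remove removable parts iteratively; stop when none is removable.
    remaining, plan = set(parts), []
    while remaining:
        step = removable_parts(remaining, interferes)
        if not step:                             # interlock: no removable part left
            break
        plan.append(step)
        remaining -= set(step)
    return plan, remaining                       # non-empty 'remaining' means interlocked

A non-empty set of remaining parts at the end of sequential_plan is exactly the interlocking situation discussed next.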



Fig. 1. An example product [10].

Figure 1 shows an example product which contains interlocked components. If the method is used to plan the disassembly sequences for this product, the first step is to remove f3 and f4. However, after this step, no component is removable for the further disassembly process. Figure 2 gives the space interference matrix of the example product after the removal of f3 and f4. This is a typical interlocking structure. An assembly without a removable part cannot be disassembled; hence, it has to be divided into subassemblies to break the interlock.

Fig. 2. Sequential disassembly method proposed by Jin et al. [4, 5].

The first proposed solution finds subassemblies by using the contact matrix and the relation matrix as fundamental tools, in which the contact conditions in an assembly in six directions (X+, X−, Y+, Y−, Z+, Z−) can be represented by the number 0 or 1. For the example product, only four directions (X+, X−, Y+, Y−) are required for demonstration due to its 2D representation, as shown in Eq. 1. In the matrix, Cn represents the components in an assembly. r_{ij:x+}, r_{ij:x−}, r_{ij:y+} and r_{ij:y−} indicate the contact status of the components in the corresponding columns and rows using two states: 0 for no contact and 1 for contact. For example, Eq. 2 is the mathematical representation of the assembly in Fig. 1. The element in the C1 row and C2



column, r_{12:x+} r_{12:x−} r_{12:y+} r_{12:y−}, is 0001 because C2 contacts C1 in the Y− direction; C1 can be removed from C2 in the Y+ direction. Similarly, the element in the C2 row and C1 column, r_{21:x+} r_{21:x−} r_{21:y+} r_{21:y−}, is 0010 because C1 contacts C2 in the Y+ direction, and hence C2 can be removed from C1 in the Y− direction. It is worth noting that symmetry may not be observed between r_{ab:x+} r_{ab:x−} r_{ab:y+} r_{ab:y−} and r_{ba:x+} r_{ba:x−} r_{ba:y+} r_{ba:y−} due to the requirements of proper disassembly operations. For example, r_{61:x+} r_{61:x−} r_{61:y+} r_{61:y−} is 1110 while r_{16:x+} r_{16:x−} r_{16:y+} r_{16:y−} is 1111, because removing f1 from C1 is a proper operation but the reverse is not. The general contact status of the components in an assembly is represented by the relation matrix (Fig. 3). In Sect. 3, the utility of the two matrices in allowing a machine to understand subassemblies will be explained.

C =
           C1                                        ...   Cn
     C1 [ r_{11:x+} r_{11:x−} r_{11:y+} r_{11:y−}    ...   r_{1n:x+} r_{1n:x−} r_{1n:y+} r_{1n:y−} ]
     ...
     Cn [ r_{n1:x+} r_{n1:x−} r_{n1:y+} r_{n1:y−}    ...   r_{nn:x+} r_{nn:x−} r_{nn:y+} r_{nn:y−} ]   [15]   (1)

[Contact matrix of the assembly in Fig. 1, with entries of the above form]   [15]   (2)

Fig. 3. Derivation of a relation matrix from a contact matrix [15].
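As a small illustration of the derivation in Fig. 3, the sketch below (an assumption about data layout, not the authors' code) marks a relation-matrix entry as 1 whenever the corresponding contact code shows contact in any direction:

def relation_from_contact(contact):
    # contact[i][j] is a direction code such as '0001' (2D) or '000100' (3D);
    # the relation matrix entry is 1 if any direction flag is set.
    n = len(contact)
    return [[1 if i != j and '1' in contact[i][j] else 0 for j in range(n)]
            for i in range(n)]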

3 Separability Check

3.1 Definition of Separability

The separability of an assembly is defined by whether it contains 'separable pairs', i.e. pairs of contacting components that can be separated without affecting other contacting components, which means the assembly can be divided into subassemblies. For example, in Fig. 4



(a), the assembly consists of three components, A1, B1 and C1, and two pairs of contacting components, A1–B1 and B1–C1. If a contact relationship is represented as a line and a component as a node, then the model of the assembly in Fig. 4(a) can be abstracted to Fig. 4(b), which can also be represented by its relation matrix (R1), as shown in Fig. 4(c). Both pairs, A1–B1 and B1–C1, are separable, as the separation of either pair would not affect the other.

Fig. 4. An example of a product comprising separable pairs [15].

However, the model shown in Fig. 5 has no 'separable pair', as the separation of any pair (A2–B2, B2–C2 and A2–C2) could affect the other pairs. For example, the separation of the A2–B2 pair inevitably causes the detachment of A2 from C2. This indicates that the assembly is interlocked and cannot be divided into subassemblies. Comparing Fig. 4(b) with Fig. 5(b), there is only one path between A1 and B1 (A1–B1) in Fig. 4(b), while there are two paths between A2 and B2 (A2–B2, and A2–C2–B2) in Fig. 5(b). If there is only one path between two components, their interaction is not coupled with other components; hence, they form a separable pair. The existence of only one path between the two components of a pair is the sufficient condition for a separable pair, as shown by the pairs A1–B1 and B1–C1 in Fig. 4(b).

Fig. 5. An example of a product comprising inseparable pairs [15].


3.2 Separable Pairs Search Process

The node vectors are used to identify separable pairs as shown in Eq. 3, so that the nodes which connect to a given node can be calculated by multiplying the relation matrix R1 by its node vector (Eqs. 4 to 6).

A1 = [1, 0, 0]^T,  B1 = [0, 1, 0]^T,  C1 = [0, 0, 1]^T   [15]   (3)

R1·A1 = [0 1 0; 1 0 1; 0 1 0]·[1, 0, 0]^T = [0, 1, 0]^T = B1   [15]   (4)

R1·B1 = [0 1 0; 1 0 1; 0 1 0]·[0, 1, 0]^T = [1, 0, 1]^T = A1 + C1   [15]   (5)

R1·C1 = [0 1 0; 1 0 1; 0 1 0]·[0, 0, 1]^T = [0, 1, 0]^T = B1   [15]   (6)

This method can be used to identify adjacent component pairs. Besides, the path between two nodes can be obtained by traversing the graph. This is carried out mathematically by recursively multiplying the relation matrix by a node vector and its adjacent node vectors until a destination is reached. Figure 6 shows the procedure of searching for separable pairs using a relation matrix.
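A small NumPy illustration of Eqs. 3 to 6 (values taken from the relation matrix R1 of Fig. 4; the script itself is only a sketch):

import numpy as np

R1 = np.array([[0, 1, 0],      # A1
               [1, 0, 1],      # B1
               [0, 1, 0]])     # C1
A1, B1, C1 = np.eye(3, dtype=int)

print(R1 @ A1)   # [0 1 0] -> B1          (Eq. 4)
print(R1 @ B1)   # [1 0 1] -> A1 + C1     (Eq. 5)
print(R1 @ C1)   # [0 1 0] -> B1          (Eq. 6)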

Fig. 6. Separable pairs search process [15].



The first step is to search for adjacent component pairs, i.e. two components in contact, using Eqs. 3 to 6. The second step is to identify the pairs in which there is only one route between the two components, as discussed in Sect. 3.1. We propose a recursive strategy using the pseudo code in Algorithm 1.

Algorithm 1. Generate single-path pair list from adjacent pair list [15].
Main function:
  Input: adjacent pair list (APL)
  Output: single-path list (SPL)
  For every pair {X, Y} ∈ APL
    counter = 0
    searchPath(X, Y)
    If counter = 1
      add {X, Y} to SPL
    End if
  End for

searchPath(X, Y)
  Label X as discovered
  For every component k adjacent to X
    If k is not labelled as discovered
      If k = Y
        counter++
        If counter >= 2
          break
        End if
      Else
        Recursively call searchPath(k, Y)
      End if
    End if
  End for
  Return counter
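A runnable Python sketch of the same idea follows. It counts the simple paths between the two components of each adjacent pair by depth-first search (capped at two) rather than reproducing the exact marking scheme of the pseudo code; the adjacency-list input format and the example values are assumptions taken from the Fig. 7 example discussed later.

def count_paths(adj, start, goal, seen=None):
    # Number of simple paths from start to goal, capped at 2.
    if seen is None:
        seen = set()
    if start == goal:
        return 1
    seen.add(start)
    total = 0
    for k in adj[start]:
        if k not in seen:
            total += count_paths(adj, k, goal, seen)
            if total >= 2:            # more than one route: stop early
                break
    seen.remove(start)
    return total

def single_path_pairs(adjacent_pairs, adj):
    return [(x, y) for x, y in adjacent_pairs if count_paths(adj, x, y) == 1]

# Example: assembly of Fig. 7 (after removing f3 and f4)
adj = {'C1': ['C2', 'C5', 'f1'], 'C2': ['C1', 'f1'], 'C3': ['C4', 'f2'],
       'C4': ['C3', 'C5'], 'C5': ['C1', 'C4', 'f2'],
       'f1': ['C1', 'C2'], 'f2': ['C3', 'C5']}
pairs = [('C1', 'C2'), ('C1', 'C5'), ('C1', 'f1'), ('C2', 'f1'),
         ('C3', 'C4'), ('C3', 'f2'), ('C4', 'C5'), ('C5', 'f2')]
print(single_path_pairs(pairs, adj))   # only ('C1', 'C5') is expected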

Once all single-path pairs are obtained, their corresponding elements in the contact matrix should be checked. If the elements are not 1111 (r_{ab:x+} r_{ab:x−} r_{ab:y+} r_{ab:y−} ≠ 1111 and r_{ba:x+} r_{ba:x−} r_{ba:y+} r_{ba:y−} ≠ 1111), it indicates that one component has freedom in at least one direction in relation to the other, and thus the pair is separable. Details are explained using the example discussed in Fig. 1. If f3 and f4 are removed from the assembly, the node-line model of the assembly and its relation matrix are presented in Fig. 7 and Eq. 7. By using Eqs. 3 and 4, eight adjacent pairs can be identified: C1–C2, C1–C5, C1–f1, C2–f1, C3–C4, C3–f2, C4–C5, and C5–f2.
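The final filter described above can be sketched as follows, again assuming the contact codes are stored as strings such as '1110' (2D) or '111111' (3D); the helper name is hypothetical:

def separable_pairs(single_path_pairs, contact):
    full = lambda code: set(code) == {'1'}          # all-ones: no freedom at all
    return [(a, b) for a, b in single_path_pairs
            if not full(contact[a][b]) and not full(contact[b][a])]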



Fig. 7. Model of the assembly after the removal of f3 and f4 [15].

          C1  C2  C3  C4  C5  f1  f2
R =  C1 [  0   1   0   0   1   1   0 ]
     C2 [  1   0   0   0   0   1   0 ]
     C3 [  0   0   0   1   0   0   1 ]
     C4 [  0   0   1   0   1   0   0 ]
     C5 [  1   0   0   1   0   0   1 ]
     f1 [  1   1   0   0   0   0   0 ]
     f2 [  0   0   1   0   1   0   0 ]   [15]   (7)

Algorithm 1 gives the solution to obtain the number of routes between two components in a pair, starting from the first pair, C1–C2, in the pair list. It returns the result that the pair is not separable, as there is more than one route from C1 to C2 (C1→C2 and C1→f1→C2), as depicted in Fig. 8. For the next pair on the adjacent pair list, C1–C5, it returns only one route. Thus, this pair is added to the single-path list. The algorithm continues to traverse all pairs in the adjacent pair list. Finally, only C1–C5 is a single-path pair. By checking the contact matrix, C1 and C5 have freedom in 3 directions. Hence, the pair C1–C5 is a separable pair.

Fig. 8. An example of searching for single path pairs [15].



It indicates that the separation of C1 and C5 would result in two subassemblies: C1–C2–f1 and C3–C4–C5–f2. For each subassembly, f1 and f2 are removable respectively. The disassembly operations can carry on by applying conventional sequential disassembly planning methods.

4 Set and Closure: Mathematical Representation

This section first introduces new mathematical tools, a set and a unary operator, to represent a product and describe its space interference relations. The next part explains the definition of closure, which is used to detect subassemblies. Together they build the foundation of the divide-and-conquer strategy.

4.1 Set

The Set of an Assembly. An assembly of a product can be represented by a set A, which is a collection of the components of the product. Each component Ei (i = 1, 2, 3, ..., n), an element of the set A, cannot be disassembled into smaller components (Eq. 8).

A = {E1, E2, ..., En}    (8)

The Relation of Components in a Set. Space interferences of components in a set A can be described by using an interference matrix I [5]. For example, the 2-dimensional space relations of the product in Fig. 9 can be described by Eq. 9.

Fig. 9. An example to show the use of an interference matrix to describe the 2-dimensional space relations of a product.

          E1     E2     E3
I =  E1 [ 0000   1110   1110 ]
     E2 [ 1101   0000   0010 ]
     E3 [ 1101   0001   0000 ]    (9)



The Interference Operator of a Set. The set A and the interference matrix I describe the attributes of an assembly. For a set A = {E1, E2, ..., En} and its interference matrix I, a unary operator U_d(Ei) (Ei ∈ A) can obtain the elements which have interference with Ei in direction d (d ∈ {x+, x−, y+, y−, z+, z−}):

∀Ei ∈ A, U_d(Ei) = {Eu, Ev, ..., Ew} ⊆ A, 1 ≤ u, v, w ≤ n    (10)

Referring to Eq. 9 as an example, the unary operator can be used to find the elements which have space interference with E2 along the x+ direction. The result (Eq. 11) indicates that {E1} has a space interference with E2 in the x+ direction:

            E1  E2  E3
E2x+ =   [  1   0   0 ]    (11)

4.2 Subassembly and Closure

A subassembly is a collection of components that can be disassembled along a certain direction d without affecting other components. For a given direction d, elements in a subassembly have interferences with one another, but they have no interferences with elements outside the subassembly. In other words, if an element has a space interference with another element, they both belong to the same subassembly (Eq. 12).

∀Ei ∈ S, U_d(Ei) = {Eu, Ev, ..., Ew} ⊆ S    (12)

where S = {Eu, Ev, ..., Ew} ⊆ A, 1 ≤ u, v, w ≤ n. The subset S is defined to be closed under the operator U_d in direction d if the result of the operation on any element in S also belongs to S. Such a feature is also called a closure. A closure indicates that the elements in the subset have no interference with components outside the subset, which is a sign that the subset forms a subassembly that can be removed in direction d. Also, a subset S of a set A is equivalent to an element in A. The space interference operator can be applied to the subset S, too. For example, if S ⊆ A, A = {E1, E2, E3, E4, ..., En} and S = {E1, E2, E3}, then

A = {S, E4, ..., En}    (13)

The divide-and-conquer strategy to be presented in Sect. 5 is based on the detection of subassemblies through the detection of closure in a set.



5 Divide-and-Conquer Disassembly Strategy

The strategy is to divide a product into subassemblies iteratively until all subassemblies become individual components which cannot be divided any more. This method allows all subassemblies to be identified automatically during disassembly planning, and hence the interlocking problem does not exist.

5.1 Subassembly Search Process

The search for subassemblies can use the algorithm given in Fig. 10. For example, referring to Fig. 11, after the disassembly of f3 and f4, the assembly is shown as in Eq. 14 and the space interference relation is given in Eq. 15.

Fig. 10. Subassembly search process

Fig. 11. An assembly example after the disassembly of f3 and f4 (refer to Fig. 1).



A = {C1, C2, C3, C4, C5, f1, f2}    (14)

[Space interference matrix of the assembly in Fig. 11]    (15)

Identification of a subassembly starts with initialising a target subset S1 to be an empty set ∅. The subassembly search process adds the first element C1 to the subset S1, and hence S1 = {C1}. Subjecting the interference operator to C1 along the y+ direction gives:

U_{y+}(C1) = {f1}    (16)

The result {f1} does not duplicate the elements in S1, and thus f1 is added:

S1 = {C1, f1}    (17)

Subjecting the interference operator to the new element f1 again yields:

U_{y+}(f1) = {C1, C2}    (18)

The new element C2 is added to S1 as in Eq. 19:

S1 = {C1, f1, C2}    (19)

Subjecting the operator to C2 yields:

U_{y+}(C2) = {C1, f1} ⊆ S1    (20)

As C1 and f1 are already in S1, the result indicates that the current subset S1 is closed:

S1 ⊆ A, ∀Ei ∈ S1, U_{y+}(Ei) ⊆ S1    (21)

Hence, the set A can be re-written to contain two elements: the subset S1 and its complement C_A S1:

A = {S1, C_A S1}    (22)

where C_A S1 = {C3, C4, C5, f2}. The next step is to identify subassemblies in C_A S1, starting from the first element C3. Because the previous subset S1 can be disassembled along the y+ direction, the new search direction is y−. Similarly, a subset S2 is initialised to be empty, S2 = ∅, and the first element C3 is added to S2. Hence, S2 = {C3}.

U_{y−}(C3) = {C4, C5, f2}    (23)

C4, C5 and f2 do not exist in S2 and are added:

S2 = {C3, C4, C5, f2} = C_A S1    (24)

The subset S2 is equal to C_A S1, which means the second subset has been found. The space interference of the rest of the elements is checked below:

U_{y−}(C4) = {C5} ⊆ S2    (25)

U_{y−}(C5) = {f2} ⊆ S2    (26)

U_{y−}(f2) = {C3, C5} ⊆ S2    (27)

Hence, the subset S2 is closed under the interference operator. All the elements have been traversed, which means

A = {S1, S2}    (28)
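A minimal Python sketch of this closure search is given below. The interference operator U(element, direction), returning the set of interfering elements, is assumed to be derived from the interference matrix I; the function name is illustrative.

def find_subassembly(seed, direction, U):
    subset = {seed}
    frontier = [seed]
    while frontier:
        e = frontier.pop()
        for x in U(e, direction):
            if x not in subset:        # new element: subset not yet closed
                subset.add(x)
                frontier.append(x)
    return subset                      # closed under U in this direction

# For the assembly of Fig. 11 (after removing f3 and f4), along y+,
# find_subassembly('C1', 'y+', U) is expected to return {'C1', 'f1', 'C2'},
# i.e. the subset S1 of Eq. 19; its complement {'C3', 'C4', 'C5', 'f2'}
# forms S2 as in Eq. 24.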

5.2 Procedure of Divide-and-Conquer Strategy

This strategy analyses the elements in set A in a hierarchical way, following the procedure given in Fig. 12. The first operation is to divide A into subassemblies (Eq. 29):

A = {S1, S2, S3, ..., Su}    (29)

For each Si in A = {S1, S2, S3, ..., Su}, divide the subset Si into several subassemblies again, which yields:

Si = {Si1, Si2, ..., Siw}    (30)

Repeat this operation iteratively, until all the subassemblies are components which cannot be divided anymore. A diagram showing the hierarchy of the disassembly sequences is shown below.



Fig. 12. Procedure of divide and conquer disassembly strategy.

Fig. 13. Hierarchy of disassembly sequences of a product.

For example, if there is a set A (Eq. 31) whose undisclosed hierarchy is shown in Fig. 14, the first step of the divide-and-conquer disassembly strategy is to divide A into subsets, yielding Eqs. 32 and 33.

A = {E1, E2, ..., E10}    (31)

Fig. 14. The hierarchy of the disassembly sequences.


A = {S1, S2, S3, S4}    (32)

S1 = {E1}
S2 = {E2, E3}
S3 = {E4, E5, E6}
S4 = {E7, E8, E9, E10}    (33)

After generating the first layer of disassembly sequences as in Eq. 33, each subset is checked. S1 is already a single element, so the operation stops there. S2, S3 and S4 can be further divided. S2 is divided into two subsets:

S5 = {E2}, S6 = {E3}    (34)

so that S2 = {S5, S6}; S5 and S6 can no longer be divided. S3 and S4 are divided into the subsets below:

S3 = {S7, S8}: S7 = {E4, E5}, S8 = {E6}    (35)

S4 = {S9, S10}: S9 = {E7, E8}, S10 = {E9, E10}    (36)

S7, S9 and S10 can be divided further:

S7 = {S11, S12}: S11 = {E4}, S12 = {E5}
S9 = {S13, S14}: S13 = {E7}, S14 = {E8}
S10 = {S15, S16}: S15 = {E9}, S16 = {E10}    (37)

Then, all the subsets contain only a single element and the disassembly planning is finished.
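The recursion sketched above can be written compactly in Python, reusing a hypothetical partition step (one pass of the subassembly search of Sect. 5.1 over a set of elements); both names are assumptions of this sketch, not the authors' implementation.

def divide(elements, partition):
    # Return the disassembly hierarchy of 'elements' as nested lists.
    if len(elements) == 1:
        return list(elements)                      # a single component: stop
    subsets = partition(elements)                  # e.g. [S1, S2, ...] (Eq. 29)
    if len(subsets) == 1:                          # cannot be divided further
        return list(elements)
    return [divide(s, partition) for s in subsets] # Eq. 30, applied recursively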

6 Case Study

This section demonstrates a case study of the disassembly of a piston used in a 4-stroke engine, as shown in Fig. 15, using both the 'separable pairs' method and the divide-and-conquer strategy. B–C1–C2–D is an interlocked structure. Disassembly would require building two subassemblies, B–C1 and C2–D, so that C1 can be removed from B and C2 can be taken away from D.



Fig. 15. Parts in a piston [15].

6.1 'Separable Pairs' Method

The sequential disassembly plan shown in Table 1 is obtained by using the conventional sequential disassembly method [4, 5] based on the space interference matrix (Appendix). As shown in Table 1, there is no removable part identified at iteration 7, and parts B, C1–2 and D are interlocked. The method presented in Sect. 3 is used to find a separable pair to divide the interlocking structure into subassemblies for normal disassembly operations.

Table 1. Sequential disassembly plan generated using the method by Jin et al. [15].
Iteration | Removable parts | Remaining parts
1 | A1-111110, A2-111110, E1-110111, E2-111011, H1-111101 | B, C1–2, D, F, G, H2–5
2 | F-110011, H2-111101 | B, C1–2, D, G, H3–5
3 | H3-111101 | B, C1–2, D, G, H4–5
4 | H4-111101 | B, C1–2, D, G, H5
5 | H5-111101 | B, C1–2, D, G
6 | G-111101 | B, C1–2, D
7 | None | B, C1–2, D



Equations 38 and 39 give the contact matrix and the relation matrix of the interlocked structure B–C1–C2–D. By using the method presented in Eqs. 3 to 6, three adjacent component pairs are identified: B–C1, B–D and C2–D. Then, the number of routes between the two components in each adjacent component pair is calculated based on Algorithm 1. The results show that all adjacent component pairs are single-path pairs. However, by checking the contact matrix, only the B–D pair is a separable pair, as C12 and C43 in Eq. 38 are 111111, indicating that neither B–C1 nor C2–D has any freedom to separate. The separation of B and D forms two subassemblies, B–C1 and C2–D. Thus, the conventional sequential disassembly methods can carry on.

[Contact matrix of the interlocked structure B–C1–C2–D]    (38)

          B   C1  C2  D
R =  B  [ 0   1   0   1 ]
     C1 [ 1   0   0   0 ]
     C2 [ 0   0   0   1 ]
     D  [ 1   0   1   0 ]   [15]   (39)

6.2 Divide-and-Conquer Strategy

By using the proposed divide-and-conquer disassembly strategy, the hierarchy of the piston example can be generated as in Fig. 16. The disassembly sequences, given in Tables 2, 3 and 4, are generated by using the space interference matrix (Appendix).

Fig. 16. Hierarchy analysis of piston example.


Table 2. Divide-and-conquer disassembly sequences of the example product.
Iteration | Subassemblies | Minimum units | Remaining subassemblies
1 | {A1}, {A2}, {E1}, {E2}, {H1}, {B, C1–2, D, F, G, H2–5} | {A1}, {A2}, {E1}, {E2}, {H1} | {B, C1–2, D, F, G, H2–5}
2 | {F}, {H2}, {B, C1–2, D, G, H3–5} | {F}, {H2} | {B, C1–2, D, G, H3–5}
3 | {H3}, {B, C1–2, D, G, H4–5} | {H3} | {B, C1–2, D, G, H4–5}
4 | {H4}, {B, C1–2, D, G, H5} | {H4} | {B, C1–2, D, G, H5}
5 | {H5}, {B, C1–2, D, G} | {H5} | {B, C1–2, D, G}
6 | {G}, {B, C1–2, D} | {G} | {B, C1–2, D}
7 | {B, C1}, {D, C2} | / | {B, C1}, {D, C2}

When it comes to step 7, the remaining subassemblies are {B, C1} and {D, C2}. Similarly, the disassembly sequences of these two subassemblies generated by the divide-and-conquer strategy are shown in Tables 3 and 4. The interlocking problem does not exist using this method.

Table 3. Disassembly sequences of the subassembly {B, C1}.
Iteration | Subassemblies | Minimum units | Remaining subassemblies
8 | {B}, {C1} | {B}, {C1} | /

Table 4. Disassembly sequences of the subassembly {D, C2}.
Iteration | Subassemblies | Minimum units | Remaining subassemblies
8 | {D}, {C2} | {D}, {C2} | /



7 Conclusion

Machine understanding of the structure of an assembly in three-dimensional space is an enabler of autonomous disassembly planning. Conventionally, due to the complexity of spatial information, many disassembly planning methods are not suitable for assemblies containing interlocked components. This paper presents two methods to deal with the interlocking problem. They are designed to work for assemblies containing interlocked components, and their effectiveness was demonstrated with a case study. One solution is based on the search for separable pairs, which can be used to identify subassemblies during sequential disassembly planning. The other solution is based on a divide-and-conquer strategy that replaces the conventional sequential disassembly methods. It plans the disassembly sequence through the detection of subassemblies instead of removable individual components, so that interlocked components can be detected automatically in each iteration, and thus interlocking problems can be avoided.

Acknowledgement. This research was supported by the EPSRC (Grant No. EP/N018524/1) and the National Science Foundation of China (Grant No. 51775399).



Appendix. Space interference matrix [15]:



References

1. Barwood M, Li J, Pringle T, Rahimifard S (2015) Utilisation of reconfigurable recycling systems for improved material recovery from e-waste. Procedia CIRP 29:746–751. https://doi.org/10.1016/j.procir.2015.02.071
2. Gil P, Pomares J, Puente SVT, Diaz C, Candelas F, Torres F (2007) Flexible multi-sensorial system for automatic disassembly using cooperative robots. Int J Comput Integr Manuf 20(8):757–772. https://doi.org/10.1080/09511920601143169
3. Ji C, Pham DT, Su S, Huang J, Wang Y (2017) AUTOREMAN - D.1.1 - List of generic disassembly task categories. Technical report, Autonomous Remanufacturing Laboratory, The University of Birmingham
4. Jin G, Li W, Wang S, Gao S (2015) A systematic selective disassembly approach for waste electrical and electronic equipment with case study on liquid crystal display televisions. Proc Inst Mech Eng Part B J Eng Manuf 231:2261–2278. https://doi.org/10.1177/0954405415575476
5. Jin GQ, Li WD, Xia K (2013) Disassembly matrix for liquid crystal displays televisions. Procedia CIRP 11:357–362. https://doi.org/10.1016/j.procir.2013.07.015
6. Johnson MR, McCarthy IP (2014) Product recovery decisions within the context of extended producer responsibility. J Eng Tech Manag 34:9–28. https://doi.org/10.1016/j.jengtecman.2013.11.002
7. Kopacek P, Kronreif G (1996) Semi-automated robotic disassembling of personal computers. In: EFTA 1996 - IEEE conference on emerging technologies and factory automation, vol 2, pp 567–572. IEEE. https://doi.org/10.1109/ETFA.1996.573938
8. Li JR, Khoo LP, Tor SB (2002) A novel representation scheme for disassembly sequence planning. Int J Adv Manuf Technol 20(8):621–630. https://doi.org/10.1007/s001700200199
9. Smith S, Chen W-H (2009) Rule-based recursive selective disassembly sequence planning for green design. Springer, London, pp 291–302. https://doi.org/10.1007/978-1-84882-762-2_27
10. Smith S, Hung P-Y (2015) A novel selective parallel disassembly planning method for green design. J Eng Des 26(10–12):283–301. https://doi.org/10.1080/09544828.2015.1045841
11. Smith S, Smith G, Chen W-H (2012) Disassembly sequence structure graphs: an optimal approach for multiple-target selective disassembly sequence planning. Adv Eng Inform 26(2):306–316. https://doi.org/10.1016/j.aei.2011.11.003
12. Tao F, Bi L, Zuo Y, Nee AYC (2017) Partial/parallel disassembly sequence planning for complex products. J Manuf Sci Eng 140(1):011016. https://doi.org/10.1115/1.4037608
13. Torres F, Puente ST, Aracil R (2003) Disassembly planning based on precedence relations among assemblies. Int J Adv Manuf Technol 21(5):317–327. https://doi.org/10.1007/s001700300037
14. Vongbunyong S, Chen WH (2015) Disassembly automation. Springer, Cham. https://doi.org/10.1007/978-3-319-15183-0
15. Wang Y, Lan F, Pham D, Liu J, Huang J, Ji C, Su S, Xu W, Liu Q, Zhou Z (2018) Automatic detection of subassemblies for disassembly sequence planning. In: Proceedings of the 15th international conference on informatics in control, automation and robotics - volume 1: ICINCO, pp 94–100. https://doi.org/10.5220/0006906601040110. ISBN 978-989-758-321-6
16. Zhao S, Li Y (2010) Disassembly sequence decision making for products recycling and remanufacturing systems. In: 2010 international symposium on computational intelligence and design, pp 44–48. IEEE. https://doi.org/10.1109/ISCID.2010.19

Thermal Imaging for Psychophysiological State Detection in the Human-Machine Interface (HMI) Control System

Changjiang He1, Mahdi Mahfouf1, and Luis A. Torres-Salomao2

1 Department of Automatic Control and Systems Engineering, The University of Sheffield, Mappin Street, Sheffield S1 3JD, UK
{che5,m.mahfouf}@sheffield.ac.uk
2 Digital Systems and Solutions, General Electric Infrastructure Queretaro, Queretaro, Mexico
[email protected]

Abstract. Psychophysiological state prediction is of great importance to the human-machine interface (HMI) as far as both safety and reliability are concerned. In this paper, the use of facial temperature changes for predicting psychophysiological state and task performance has been investigated. The effectiveness of using facial temperature from a thermal camera to estimate the human psychophysiological state has been validated with the statistical results from a carefully designed HMI experiment with ten (10) healthy subjects. The new facial temperature biomarkers have exhibited a similar or even better ability to differentiate various psychophysiological states in comparison with the traditional biomarkers (e.g. heart rate variability (HRV), task load index (TLI) and pupil size). The mean nasal temperature and the differential energy between philtrum and forehead (DEFP) have been shown to be more sensitive to psychophysiological state changes compared to the conventional biomarkers. The maximum facial temperature and the mean forehead temperature have also shown clear correlations with psychophysiological state and task performance. As a final step, an adaptive type-2 fuzzy logic model combining the new markers with previously used bio-indicators has been elicited and, as a result, has achieved significantly high accuracy in predicting task load performance.

Keywords: Human-Machine Interfaces · Psychophysiology · Human factors · Stress detection · Facial temperature

1 Introduction

(This paper is an extended version of work published in [12]. This version includes an extended sample size, the introduction of a new frequency-related marker and an initial type-2 fuzzy logic model for prediction.)

The combination of an automatic system with a human operator has been widely implemented in many human-centred environments, including manufacturing,



transportation and clinical medicine. However, the performance of such a system has usually been compromised by increasing operational demands on the human operator, which can also threaten the safety and reliability of the whole system [29]. Therefore, it is of paramount importance to introduce an effective interface between the human operator and the automatic system. The main aim of this interface is to help the automatic system to assign suitable tasks to the human operator depending on his or her psychophysiological state and to achieve the best overall task performance.

The human operator's performance in a certain task depends on his or her attention span, cognition, perception and execution, which all develop from the basic conditional reflex [1,2]. Therefore, monitoring the activities of some specific neurons and the subsystems they regulate has proved to be a valid approach to assess one's psychophysiological state [1,2,16]. In the area of human-machine interface research, the assessment of the human operator's psychophysiological state usually combines peripheral physiology, startle response, central physiology and behaviour. The frequently used measurements cover electrocardiography (ECG), electroencephalography (EEG), pupil size, blood pressure, blood volume, blood volume pulse, respiration, muscle tension, electrodermal activity, galvanic skin response and temperature signals [9,19,20,25–27,31].

Heart rate variability (HRV) from ECG and the task load index (TLI) from EEG are the most common and recommended psychophysiological state biomarkers. HRV consistently reflects the cardio-respiratory system, which is susceptible to changes in psychophysiological stress [3,17]. The aim of TLI is to calculate one's working memory (WM), which constitutes one's ability to maintain focus on one specific task regardless of the surrounding interference [11,24]. However, EEG and ECG measurements normally involve using electrodes to record voltage differences across the skin. Such a requirement limits the range of movement of the human operator and disturbs his or her mental state as well. Meanwhile, the measurements remain sensitive to the noise introduced by defective skin-electrode connections and surrounding electromagnetic fields. The high complexity of EEG and ECG measurements has limited the efficiency and safety of applying HRV and TLI in real-world situations. Therefore, it is important to design and integrate new psychophysiological state biomarkers with the existing system to overcome these constraints.

Infrared cameras have hitherto provided a reliable means of documenting the facial temperature in real time without body contact, and they usually have fewer requirements on the working environment compared with EEG and ECG. The temperature of the target is derived from the recorded infrared radiation, the emissivity, the transmittance and the surrounding temperature [28]. Infrared cameras provide a non-contact measurement that is free from the psychophysiological disturbance caused by attaching the electrodes used in EEG and ECG. A comparison between areas and a tracking of changing thermal patterns can be obtained with the real-time two-dimensional thermal image from the camera [28].



The facial temperature recorded by infrared cameras has now been recognised as a potentially valid reflection of the human mental state. State-of-the-art research has demonstrated that both the hypothalamus and the parasympathetic-sympathetic nervous system have a significant emotion-induced influence on human thermoregulation [6,14]. Such an impact usually leads to changes in skin temperature and to different periodic temperature cycles, from seconds to minutes, which are observable with infrared cameras [4,5,15]. Previous studies have also shown that temperature readings from the forehead, the periorbital and the nasal regions are closely correlated with the psychophysiological state [21–23]. Also, it has been proved that the energy spectrum from the frequency analysis of the facial temperature is consistently correlated to the heartbeat rate and cortisol level [14]. Thus, biomarkers based on the facial temperature readings from the camera have great potential to provide accurate psychophysiological state estimation instantaneously, while retaining an adequate distance from the subject compared to HRV and TLI.

While the ultimate aim of this research is to control the human-machine interface, the specific objectives are (1) to validate the effectiveness of using facial temperature for assessing psychophysiological state in HMI, (2) to compare the efficiency of using facial temperature as a psychophysiological state biomarker with other existing biomarkers, and (3) to outline the limitations of current research and discuss the future development of the temperature biomarker within the human-machine interface framework.

1.1 Participants

A total of ten (10) healthy research students from the Automatic Control and Systems Engineering Department at the University of Sheffield (UK) have been selected as suitable participants for the experiment. The participants include both genders and various races, from 22 to 30 years old with an average age of 25. The subjects have been advised not to take any medicine or caffeine-related drinks.

1.2 Simulation of Human-Machine Interaction

Mental arithmetic has been selected as the simulation of HMI in this experiment. Mental arithmetic has proved to be a simple, efficient and intuitive way of introducing psychophysiological stress [10,13]. The mental arithmetic assessment applied in this experiment is based on a Matlab GUI application, which is similar to the one implemented in previous HMI studies [26,27]. This assessment requires the participant to complete the multiplication of two numbers within a certain amount of time.

1.3 Data Acquisition

EEG and ECG data have been continuously monitored via the Biosemi Active Two System. EEG signals have been acquired using a 32 + 2 electrodes layout



from a standard Biosemi 10/20 system. ECG signals have also been acquired from the 3-lead system that formed a triangular area covering the chest. Data have been recorded with the Biosemi ActiView software with a sampling rate of 2048 Hz. Pupil size has been measured by a Gazepoint eye-tracking camera, and the acquisition is done by the Gazepoint software during the experiment. Facial temperature has been captured with a FLIR E40bx thermal imaging camera, which is placed horizontally facing the subject's face at a distance of 0.5 m with an emissivity setting of 0.98. The thermal imaging sequences have been captured with a sampling frequency of 10 Hz by FLIR ResearchIR. During the experiment, the temperature and the relative humidity of the room have been maintained at 20.0 °C and 50%.

1.4 Data Analysis

The thermal imaging sequences have been analysed in MATLAB® to extract the temperature magnitude and frequency changes in the regions of interest (ROI), which are the periorbital area, the nasal area and the forehead, see Fig. 1. The biomarkers developed from these are the maximum facial temperature (around the periorbital region), the mean nasal temperature, the mean forehead temperature and the differential energy between philtrum and forehead (DEFP).

Fig. 1. Regions of interest: periorbital, nasal and forehead [12].

The maximum facial temperature Tmaxf is captured by the point measurement of the camera. The calculation is as follows [12]:

Tmaxf = max(T(i,j)) | ∀i ∈ L, ∀j ∈ W    (1)

where T(i,j) represents the temperature value recorded at pixel (i, j), and L and W are the numbers of columns and rows of pixels in a frame.



The mean nasal temperature T̄n is captured by the rectangle measurement of the camera. The calculation is as follows [12]:

T̄n = (1/XY) Σ T(i,j) | ∀i ∈ X, ∀j ∈ Y    (2)

where T(i,j) represents the temperature value recorded at pixel (i, j) from the selected rectangular region over the nasal area, and X and Y are the numbers of columns and rows of pixels in the region.

The mean forehead temperature T̄f is captured by the rectangle measurement of the camera. The calculation is as follows [12]:

T̄f = (1/MN) Σ T(i,j) | ∀i ∈ M, ∀j ∈ N    (3)

where T(i,j) represents the temperature value recorded at pixel (i, j) from the selected rectangular region over the forehead, and M and N are the numbers of columns and rows of pixels in the region.

The DEFP is obtained from the differential energy of the Fourier transform analysis of the philtrum and forehead temperatures. After filtering the raw data with a bandpass filter (0.4 Hz–4 Hz, representing 24–240 beats per minute), a Fast Fourier Transform has been performed with a window size of 10 s to extract the energy spectrum within that frequency range. The final value is generated from the subtraction between the maximum energy value of the philtrum temperature and the mean energy value of the forehead temperature. The whole process can be summarised with the following equations (the discrete Fourier transform is used to represent the fast Fourier transform for simplicity):

Ŝ_{Tp,j}[k] = Σ_{n=0}^{N−1} T̄p[2 − n − j] e^{−i(2π/N)kn} | ∀k ∈ [4, 40], N = 100,

Ŝ_{Tf,j}[k] = Σ_{n=0}^{N−1} T̄f[2 − n − j] e^{−i(2π/N)kn} | ∀k ∈ [4, 40], N = 100,    (4)

DEFP_j = max(Ŝ_{Tp,j}) − mean(Ŝ_{Tf,j})

where T̄p and T̄f represent the mean philtrum temperature and the mean forehead temperature.
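A Python sketch of this computation is given below under the stated assumptions (10 Hz sampling, a 0.4–4 Hz band of interest and a 10 s, N = 100 sample window); the filter design (a second-order Butterworth bandpass) and all variable names are illustrative choices, not the authors' implementation.

import numpy as np
from scipy.signal import butter, filtfilt

FS, N = 10.0, 100                      # sampling rate [Hz], window length [samples]

def bandpass(x, low=0.4, high=4.0, fs=FS):
    b, a = butter(2, [low / (fs / 2), high / (fs / 2)], btype='band')
    return filtfilt(b, a, x)           # x assumed longer than the filter pad length

def defp(philtrum, forehead, j):
    # DEFP for the 10 s window ending at sample j of the two mean-temperature series.
    p = bandpass(philtrum)[j - N:j]
    f = bandpass(forehead)[j - N:j]
    Sp = np.abs(np.fft.rfft(p))        # energy spectrum of the philtrum window
    Sf = np.abs(np.fft.rfft(f))        # energy spectrum of the forehead window
    k = np.arange(len(Sp))
    band = (k >= 4) & (k <= 40)        # bins corresponding to 0.4-4 Hz at 10 Hz, N = 100
    return Sp[band].max() - Sf[band].mean()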

Procedure

The whole experiment for one subject lasted approximately 40 min, including two 12-min mental arithmetic sessions and one 12-min comparison session in the interval, with 2-min breaks in between these sessions. Within each mental arithmetic session, there were four 3-min subsessions of varying difficulty levels. The difficulty level is determined by the digit of the numbers and the time allocated for answering the questions. The order of the difficulty levels has been altered after the first arithmetic session to separate the physiological changes introduced by the mental state from the normal daily activities.

Thermal Imaging for Psychophysiological State Detection

2 2.1

219

Results Maximum Facial Temperature

(a) Normalised maximum facial temperature (-.-), accuracy (- -) and difficulty level (-) plots for participant 1. Session 1 with elevated difficulty levels

(b) Normalised maximum facial temperature (-.-), accuracy (- -) and difficulty level (-) plots for participant 1. Session 3 with randomised difficulty levels

Fig. 2. Maximum facial temperature [12].

The overall average value of the maximum facial temperature for all the ten volunteers is 36.2713 ◦ C. Changes in temperature for different sessions are mostly around 0.4397 ◦ C, and around 0.0634 ◦ C for any two adjacent sub-sessions. In addition, compared to the period of the control sessions, the maximum temperature of the experimental sessions has demonstrated a certain degree of deviation from the initial values recorded at rest.

220

C. He et al.

In the experiment with ten (10) participants, the maximum facial temperatures of six subjects are consistently correlated with their task performance for both increasing and random difficulty order, see Fig. 2. Among the six participants, four of these have showed negative correlations between the temperature and the accuracy, while the other two have demonstrated positive correlations. Thus, the psychophysiological state induced change of the maximum facial temperature is predominantly affected by the individual differences. 2.2

Mean Nasal Temperature

(a) Normalised mean nasal temperature (-.-), accuracy (- -) and difficulty level (-) plots for participant 1. Session 1 with elevated difficulty levels

(b) Normalised mean nasal temperature (-.-), accuracy (- -) and difficulty level (-) plots for participant 1. Session 3 with randomised difficulty levels [12]

Fig. 3. Mean nasal temperature.

The general mean value of the mean nasal temperature for all ten (10) test subjects is 33.9293 ◦ C. The change of temperature over the different sessions ranges

Thermal Imaging for Psychophysiological State Detection

221

from 0.1527 up to 1.8739 ◦ C with a mean value of 0.7758 ◦ C, and from 0.0021 to 0.8933 ◦ C with a mean value of 0.2287 ◦ C for any two nearby sub-sessions. Apart from the temperature difference within the control sessions and the experimental sessions, the increases and decreases of the mean nasal temperature are also consistently correlated to the rise and fall of the subjects’ accuracy, which supports the findings from previous research mentioned in the literature review, see Fig. 3 for example.

(a) Normalised mean forehead temperature (-.-), accuracy (- -) and difficulty level (-) plots for participant 7. Session 1 with elevated difficulty levels

(b) Normalised mean forehead temperature (-.-), accuracy (- -) and difficulty level (-) plots for participant 7. Session 3 with randomised difficulty levels

Fig. 4. Mean forehead temperature [12].

2.3

Mean Forehead Temperature

The overall mean value of the forehead temperatures across all volunteers is 34.4130 ◦ C. The mean temperature difference between the sessions ranges from

222

C. He et al.

0.1090 to 1.2535 ◦ C with a mean value of 0.4802 ◦ C, and are mostly around 0.1285 ◦ C between any two neighbouring sub-sessions. Similarly to the maximum facial temperature, the forehead temperatures of the experimental sessions have shown a degree of deviation from the value of the control sessions. Also, six (6) out of the ten participants have demonstrated significant correlations between their task performance and their mean forehead temperature, for the experiment sessions with both ordered and randomised difficulty levels. An example is provided in Fig. 4. In these six (6) subjects, the mean forehead results of five subjects show positive correlations with their accuracy, and the remaining one shows a negative correlation. Therefore, it can be concluded that similarly to the maximum facial temperature, the change of mean forehead temperature is influenced by the individual differences as well. 2.4

Differential Energy Between Philtrum and Forehead

The mean differential energy value between philtrum and forehead of the ten subjects is 0.1265 during the experiment sessions. Compared to the control sessions, the DEFP values have increased up to around 10%. DEFP values change around 0.1117 for crossing the two neighboring sub-sessions and 0.1163 for crossing different sessions. Meanwhile, six subjects’ DEFP values show consistent correlations with their task performance. Similarly to the mean nasal temperature, the changes of the subjects’ DEFP values can indicate the variation of their task performance to a certain degree, see Fig. 5 for example. 2.5

Comparison with Conventional Biomarker

Heart rate variability (HRV) and task load index (TLI) mentioned previously have been recommended as the biomarkers for the psychophysiological state estimation for years, and the pupil diameter marker (PDM) has been validated as an effective biomarker for the psychophysiological state assessment [3,11,17, 24,26,27]. In this project, the two-sample T-test is introduced to compare the efficiency of using facial temperature as a suitable psychophysiological indicator with HRV, TLI and PDM. According to the two-sample T-test, each indicator is tested for its ability to distinguish different psychophysiological state. For H = 0, there was no significant difference observed within a 5% confidence level. For H = 1, there were significant differences between the data from two adjacent sub-sessions. The details about the test results are summarised and shown in Table 1. The H values of the temperature markers are all above 0.9 and higher than the score of the most conventional biomarkers. Therefore, the new markers developed from the data relating to the facial temperature have exhibited similar or even better ability to differentiate the subjects’ psychophysiological state changes in comparison to the traditional biomarkers. In general, the biomarkers based on facial temperature are more susceptible to the minor changes in the psychophysiological state during low work load period, especially for the maximum facial temperature, the mean nasal temperature and DEFP.

Table 1. Mean H values for T-test summary [12].

Biomarkers       Session 1   Session 2   Mean
Maximum facial   0.9667      0.9667      0.9667
Mean nasal       0.9333      0.9667      0.9500
Mean forehead    0.9667      0.9000      0.9333
DEFP             0.9667      0.9667      0.9667
HRV1             0.9167      0.9333      0.9250
HRV2             1.0000      1.0000      1.0000
TLI1             0.7333      0.8833      0.8083
TLI2             0.8667      0.9333      0.9083
PDM              0.9500      0.8833      0.9167

(a) Differential Energy (-.-), accuracy (- -) and difficulty level (-) plots for participant 4. Session 1 with elevated difficulty levels

(b) Differential Energy (-.-), accuracy (- -) and difficulty level (-) plots for participant 4. Session 3 with randomised difficulty levels

Fig. 5. Differential energy between philtrum and forehead.


3 Discussions

3.1 Facial Temperature Biomarkers

The effectiveness and efficiency of using facial temperature to estimate subjects' psychophysiological state changes have been validated by the experiments, yet in practice these biomarkers are still limited by two major problems: auto-calibration of the camera and the subjects' head movement.

Auto-calibration is an intrinsic design of the camera to deal with the problem of thermal drift in the data recording. Thermal drift relates to abnormal temperature shifts in the recordings and is introduced by the changing temperature of the camera itself. The process is automatically programmed to calibrate the camera depending on its temperature change. During the one or two seconds of auto-calibration, the camera measures the temperature within itself rather than the outside target. As a consequence, the recording is disturbed by sudden fluctuations that need to be manually removed during data processing. The lack of actual data in those periods affects the ability of the indicators to reflect the mental state at those precise moments. In addition, it can also seriously interrupt the frequency analysis of the temperature signal. The effect of the calibration is limited by switching on the camera at least ten minutes before each experiment. However, this method only reduces the number of calibrations during a 12-min recording instead of eliminating them completely. Better cameras may well provide more effective solutions to this problem in the future.

Both FLIR ResearchIR and MATLAB® are only able to support fixed windows for data extraction. However, the subject's head movements are unavoidable in any long observation. Windows with fixed positions and fixed shapes are therefore not capable of handling the displacement and distortion caused by these movements. Due to this failure to track the regions of interest, the biomarkers based on the data are not able to faithfully represent the actual temperature changes in those areas, and thus their efficiency is compromised. Unfortunately, in contrast to object tracking in normal RGB images, the thermal image lacks enough shape contrast for a normal tracking algorithm to follow. Since there is little research on thermal image tracking and barely any actual algorithm, the participants are advised to be conservative with their head movements, which in this case can be considered a source of disturbance to the psychophysiological state. The tracking of the region of interest seems to represent the toughest challenge among all the others. However, a new tracking algorithm based on a particle filter may be a useful solution to this problem [7,8,18,30,32,33]. The algorithm, built on the matting algorithm that is based on pixel dependence, can deal with nonlinear motion within the predict-update cycle in a simple way. Despite the limitations of the current thermal imaging technique, the facial temperature has proved to be a reliable tool for psychophysiological state measurement in HMI.
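The following is a hedged sketch of the particle-filter idea mentioned above, applied to tracking a rectangular region of interest in a sequence of thermal frames with a simple predict-update-resample cycle. The mean-temperature template matching, the random-walk motion model and all parameter values are assumptions for illustration; the proposed algorithm additionally builds on a matting step that is not reproduced here.

```python
# Hedged sketch of a particle-filter region-of-interest tracker for thermal frames.
# The similarity measure (mean-temperature template) and parameters are assumptions.
import numpy as np

def track_roi(frames, init_xy, roi=(24, 24), n_particles=200, motion_std=3.0, temp_std=0.3):
    rng = np.random.default_rng(1)
    h, w = roi
    particles = np.tile(np.asarray(init_xy, float), (n_particles, 1))
    x0, y0 = init_xy
    template = frames[0][y0:y0 + h, x0:x0 + w].mean()   # reference ROI temperature
    track = [init_xy]
    for frame in frames[1:]:
        # Predict: random-walk motion model standing in for the head movement
        particles += rng.normal(0.0, motion_std, particles.shape)
        particles[:, 0] = particles[:, 0].clip(0, frame.shape[1] - w)
        particles[:, 1] = particles[:, 1].clip(0, frame.shape[0] - h)
        # Update: weight each particle by how well its local mean temperature matches
        weights = np.empty(n_particles)
        for i, (px, py) in enumerate(particles.astype(int)):
            patch = frame[py:py + h, px:px + w]
            weights[i] = np.exp(-0.5 * ((patch.mean() - template) / temp_std) ** 2)
        weights /= weights.sum()
        estimate = (weights[:, None] * particles).sum(axis=0)
        track.append(tuple(estimate.astype(int)))
        # Resample: keep particles proportionally to their weights
        idx = rng.choice(n_particles, n_particles, p=weights)
        particles = particles[idx]
    return track
```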

3.2 Human-Machine Interface Prediction

Different humans respond to the same stimuli differently. Such uncertainty requires the model to be adaptive to different individuals. Due to the high degree of complexity within HMI, it may even be impossible to understand the whole underlying mechanism of the system. Also, the high cost of expanding human sample collection limits the use of conventional data-driven approaches. Therefore, a type-2 fuzzy logic model is suggested for the prediction of HMI. It combines the existing empirical knowledge and accounts for individual uncertainty in the predictions. Fig. 6 shows the proposed scheme that integrates the facial temperature within the HMI system for optimising monitoring/control performance.
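To make the suggestion more concrete, the sketch below shows a minimal interval type-2 fuzzy inference step: each membership function carries an uncertain mean (a footprint of uncertainty), each rule fires over an interval, and a simple averaged type reduction produces a crisp prediction. This is not the authors' model; the rule base, inputs and parameters are invented purely to illustrate how individual uncertainty can be represented.

```python
# Illustrative sketch only: an interval type-2 Gaussian membership with uncertain
# mean and a simple averaged (Nie-Tan-style) type reduction. Rules and inputs are
# invented; they are not taken from the paper.
import numpy as np

def it2_gaussian(x, m_lo, m_hi, sigma):
    """Upper and lower membership grades of an interval type-2 Gaussian set."""
    upper = np.where(x < m_lo, np.exp(-0.5 * ((x - m_lo) / sigma) ** 2),
             np.where(x > m_hi, np.exp(-0.5 * ((x - m_hi) / sigma) ** 2), 1.0))
    lower = np.minimum(np.exp(-0.5 * ((x - m_lo) / sigma) ** 2),
                       np.exp(-0.5 * ((x - m_hi) / sigma) ** 2))
    return lower, upper

def predict_performance(biomarkers, rules):
    """Each rule: (per-input (m_lo, m_hi, sigma) tuples, consequent value)."""
    num, den = 0.0, 0.0
    for antecedents, consequent in rules:
        lowers, uppers = zip(*[it2_gaussian(x, *p) for x, p in zip(biomarkers, antecedents)])
        f_lo, f_hi = np.prod(lowers), np.prod(uppers)   # firing interval of the rule
        f = 0.5 * (f_lo + f_hi)                         # simple type reduction
        num += f * consequent
        den += f
    return num / den if den else 0.0

# Two toy rules on (normalised forehead temperature, DEFP) -> predicted task accuracy
rules = [(((0.2, 0.4, 0.15), (0.1, 0.2, 0.1)), 0.6),
         (((0.6, 0.8, 0.15), (0.3, 0.4, 0.1)), 0.9)]
print(predict_performance((0.7, 0.35), rules))
```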

Fig. 6. The proposed human machine interaction system [12].

Figure 7 provides an example of how to combine some of the facial temperature biomarkers with existing biomarkers for performance estimation in HMI using a type-2 fuzzy logic model. The use of the facial temperature biomarkers has succeeded in compensating for the information loss or distortion of the TLI2 signal around 30 s and 165 s. Meanwhile, the use of the pupil size, the EEG and the ECG has provided fast, direct measurements of the psychological state, which avoids the slow reaction within the physiological state that delays the temperature markers' measurement. In the end, the model has provided task performance prediction with relatively high accuracy and a delay of around one second.


Fig. 7. Type-2 fuzzy model for HMI prediction from participant 2

4 Conclusions

In conclusion, the experimental results of the facial temperature have validated the effectiveness and efficiency of using thermal imaging for psychophysiological state estimation. Such a method proposes a more reliable marker for assessing the psychophysiological state of the operator. Furthermore, the combination of the facial temperature and other well-known biomarkers can significantly increase the robustness of the system and the precision of the prediction, as the facial temperature measurement requires no body contact and is more sensitive to the changes within the low mental stress states.

References

1. Barrett LF (2006) Solving the emotion paradox: categorization and the experience of emotion. Pers Soc Psychol Rev 10(1):20–46
2. Barrett LF, Mesquita B, Ochsner KN, Gross JJ (2007) The experience of emotion. Annu Rev Psychol 58:373–403
3. Bernardi L, Wdowczyk-Szulc J, Valenti C, Castoldi S, Passino C, Spadacini G, Sleight P (2000) Effects of controlled breathing, mental activity and mental stress with or without verbalization on heart rate variability. J Am Coll Cardiol 35(6):1462–1469
4. Brengelmann GL (2000) Body surface temperature: manifestation of complex anatomy and physiology of the cutaneous vasculature. In: Proceedings of the 22nd annual international conference of the IEEE Engineering in Medicine and Biology Society, vol 3, pp 1927–1930. IEEE
5. Charkoudian N (2003) Skin blood flow in adult human thermoregulation: how it works, when it does not, and why. In: Mayo Clinic proceedings, vol 78, pp 603–612. Elsevier
6. Clay-Warner J, Robinson DT (2015) Infrared thermography as a measure of emotion response. Emot Rev 7(2):157–162
7. Dowdall J, Pavlidis IT, Tsiamyrtzis P (2007) Coalitional tracking. Comput Vis Image Underst 106(2):205–219
8. Driessen H, Boers Y (2008) Map estimation in particle filter tracking
9. El-Samahy E, Mahfouf M, Torres-Salomao L, Anzurez-Marin J (2015) A new computer control system for mental stress management using fuzzy logic. In: 2015 IEEE international conference on evolving and adaptive intelligent systems (EAIS), pp 1–7. IEEE
10. Garde A, Laursen B, Jørgensen A, Jensen B (2002) Effects of mental and physical demands on heart rate variability during computer work. Eur J Appl Physiol 87(4–5):456–461
11. Gevins A, Smith ME (2003) Neurophysiological measures of cognitive workload during human-computer interaction. Theor Issues Ergon Sci 4(1–2):113–131
12. He C, Mahfouf M, Torres-Salomao LA (2018) Facial temperature markers for mental stress assessment in human-machine interface (HMI) control system. In: ICINCO
13. Hjortskov N, Rissén D, Blangsted AK, Fallentin N, Lundberg U, Søgaard K (2004) The effect of mental stress on heart rate variability and blood pressure during computer work. Eur J Appl Physiol 92(1–2):84–89
14. Hong K, Hong S (2016) Real-time stress assessment using thermal imaging. Visual Comput 32(11):1369–1377
15. Houdas Y, Ring E (2013) Human body temperature: its measurement and regulation. Springer Science & Business Media
16. Inzlicht M, Bartholow BD, Hirsh JB (2015) Emotional foundations of cognitive control. Trends Cogn Sci 19(3):126–132
17. Kuriyagawa Y, Kageyama I (1999) A modeling of heart rate variability to estimate mental work load. In: 1999 IEEE international conference on systems, man, and cybernetics, IEEE SMC 1999 conference proceedings, vol 2, pp 294–299. IEEE
18. Levin A, Lischinski D, Weiss Y (2008) A closed-form solution to natural image matting. IEEE Trans Pattern Anal Mach Intell 30(2):228–242
19. Mahfouf M, Zhang J, Linkens DA, Nassef A, Nickel P, Hockey GRJ, Roberts AC (2007) Adaptive fuzzy approaches to modelling operator functional states in a human-machine process control system. In: IEEE international fuzzy systems conference, FUZZ-IEEE 2007, pp 1–6. IEEE
20. Nassef A, Mahfouf M, Ting CH, El-Samahy E, Linkens DA, Denaï MA (2010) Hybrid physiological modeling of subjects undergoing cyclic physical loading. In: Biosignals, pp 252–257
21. Nhan B, Chau T (2009) Infrared thermal imaging as a physiological access pathway: a study of the baseline characteristics of facial skin temperatures. Physiol Meas 30(4):N23
22. Nhan BR, Chau T (2010) Classifying affective states using thermal infrared imaging of the human face. IEEE Trans Biomed Eng 57(4):979–987
23. Nozawa A, Tacano M (2009) Correlation analysis on alpha attenuation and nasal skin temperature. J Stat Mech Theory Exp 2009(01):P01007
24. Smith ME, Gevins A, Brown H, Karnik A, Du R (2001) Monitoring task loading with multivariate EEG measures during complex forms of human-computer interaction. Hum Factors 43(3):366–380
25. Ting CH, Mahfouf M, Nassef A, Linkens DA, Panoutsos G, Nickel P, Roberts AC, Hockey GRJ (2010) Real-time adaptive automation system based on identification of operator functional state in simulated process control operations. IEEE Trans Syst Man Cybern Part A Syst Hum 40(2):251–262
26. Torres-Salomao L, Mahfouf M, El-Samahy E (2015) Pupil diameter size marker for incremental mental stress detection. In: 2015 17th international conference on e-health networking, application & services (HealthCom), pp 286–291. IEEE
27. Torres-Salomao LA, Mahfouf M, El-Samahy E, Ting CH (2017) Psychophysiologically based real-time adaptive general type 2 fuzzy modeling and self-organizing control of operator's performance undertaking a cognitive task. IEEE Trans Fuzzy Syst 25(1):43–57
28. Usamentiaga R, Venegas P, Guerediaga J, Vega L, Molleda J, Bulnes FG (2014) Infrared thermography for temperature measurement and non-destructive testing. Sensors 14(7):12305–12348
29. Walter S, Wendt C, Böhnke J, Crawcour S, Tan JW, Chan A, Limbrecht K, Gruss S, Traue HC (2014) Similarities and differences of emotions in human-machine and human-human interactions: what kind of emotions are relevant for future companion systems? Ergonomics 57(3):374–386
30. Wu HY, Rubinstein M, Shih E, Guttag J, Durand F, Freeman W (2012) Eulerian video magnification for revealing subtle changes in the world
31. Zhai J, Barreto A (2006) Stress detection in computer users based on digital signal processing of noninvasive physiological variables. In: 28th annual international conference of the IEEE Engineering in Medicine and Biology Society, EMBS 2006, pp 1355–1358. IEEE
32. Zhou Y, Tsiamyrtzis P, Lindner P, Timofeyev I, Pavlidis I (2013) Spatiotemporal smoothing as a basis for facial tissue tracking in thermal imaging. IEEE Trans Biomed Eng 60(5):1280–1289
33. Zhou Y, Tsiamyrtzis P, Pavlidis IT (2009) Tissue tracking in thermo-physiological imagery through spatio-temporal smoothing. In: International conference on medical image computing and computer-assisted intervention, pp 1092–1099. Springer

Nonlinear Model Predictive Control Algorithms for Industrial Articulated Robots

Květoslav Belda

Department of Adaptive Systems, The Czech Academy of Sciences, Institute of Information Theory and Automation, Pod Vodárenskou věží 4, 182 08 Prague 8, Czech Republic
[email protected]

Abstract. This paper deals with a novel nonlinear design of the discrete model predictive control represented by two algorithms based on the features of linear methods for the numerical solution of ordinary differential equations. The design algorithms allow more accurate motion control of robotic or mechatronic systems that are usually modelled by nonlinear differential equations up to the second order. The proposed ways apply nonlinear models directly to the construction of equations of predictions employed in predictive control design. These equations are composed using principles of explicit linear multi-step methods leading to straightforward and unambiguous construction of the predictions. Examples of the noticeably improved behaviour of proposed ways in comparison with conventional linear predictive control are demonstrated by comparative simulations with the nonlinear model of six-axis articulated robot.

Keywords: Discrete model predictive control · Nonlinear design · Lagrange equations · Articulated robots

1 Introduction

The integration of industrial robot applications in production is increasing rapidly. This continuous trend proceeds with the advent of Industry 4.0 and it will continue in the future as well [10]. Robots in industrial production perform a huge number of operations. Their efficiency depends on the motion control used, which can exploit the available information from the robot and the measured data [9]. Nowadays, the necessary information, data and computing power are broadly available; however, the designers are not able to use them effectively. In industrial production, there exist a lot of elaborated strategies that follow from long-term, empirical experience [24]. Unfortunately, such strategies are usually not general enough. They are not scalable or transferable to different or modified systems. From a mathematical point of view, robots and manipulators, such as the mechanical structure of the articulated robot class shown in Fig. 1, represent dynamic systems that are usually described by systems of ordinary differential


equations (ODEs) as suitable models of the dynamics [23]. These models can be considered a proper substitute for the real physical robot mechanism for computer simulations as well as for motion control design. The aforementioned ODE systems reflect various relations among the individual elements of the robot construction. These relations are usually nonlinear, given by nonlinear operations on the descriptive variables and their derivatives [1]. Thus, the usual linear control theory [25] cannot be directly applied without some modifications.

Fig. 1. Wire-frame model of the articulated robot class [2].

In practice, several common solutions are considered. They employ local linearization [21] by means of Taylor series, partial derivative models or switching local linear models, and discretization to obtain a discrete linear-like model, often in state-space form. Then, the usual linear design of discrete model predictive control can be applied [3,5,6,18,20,25]. Further approaches consider linear models with stochastic uncertainties as a substitution of the initial nonlinear models [16]. Different solutions can be realized by neural networks [19] or by direct nonlinear optimization such as in [11,14,15,26,28]. However, the mentioned ways are not immediately applicable as a straightforward multi-step design in the control process.

This paper deals with a novel way of designing discrete model predictive control (a single-pass algorithm) that employs a continuous-time nonlinear ODE model of the robot dynamics. This way was initially introduced in [2], and this paper extends its idea into a novel general way (a two-cycle algorithm) using nonlinear prediction. The new algorithm serves as a specific etalon or ideal for the demonstrated fast single-pass algorithm, which is given by the computation demands. The two-cycle algorithm is more demanding and slower, but gives results near the ideal; the single-pass algorithm, in comparison, is fast and fully comparable in computation demands with the conventional linear approach [25]. However, the conventional linear approach cannot achieve comparable results.

The model in both algorithms is employed directly by means of specifically adapted explicit linear multi-step numerical methods. These methods are used for the construction of the equations of predictions, or rather their use fully substitutes the conventional equations. The considered equations of predictions can be applied in the usual way to the common quadratic cost function and optimization criterion. The explanation is introduced with the Adams-Bashforth method as a representative of the aforementioned explicit methods [8]. Features of the proposed solutions are discussed and compared with the usual linear design of model predictive control [20,25] that considers conventional repeated linearization and discretization along the motion trajectories of the robot.

2 Nonlinear Robot Model

The model of the given class "articulated robots" is generally represented by a nonlinear function expressing relations between control actions (robot inputs, joint torques τ = τ(t)) and descriptive variables (robot outputs, joint angles and their derivatives q = q(t), q̇ = q̇(t) and q̈ = q̈(t)):

$$\ddot q = f(q, \dot q, \tau) \qquad (1)$$

The model (1) expresses the equations of motion [22,23] that reflect the robot dynamics. Such a model is mostly composed by Lagrange equations, e.g. in the following form

$$\frac{d}{dt}\left(\frac{\partial E_k}{\partial \dot q}\right)^{T} - \left(\frac{\partial E_k}{\partial q}\right)^{T} + \left(\frac{\partial E_p}{\partial q}\right)^{T} = \tau \qquad (2)$$

where q, q̇, E_k, E_p and τ are the generalized coordinates and their appropriate derivatives, the total kinetic and potential energy and the vector of generalized force effects corresponding to the generalized coordinates [23]. The individual elements of (2) are defined as follows

$$\frac{d}{dt}\left(\frac{\partial E_k}{\partial \dot q}\right)^{T} = \tilde H(q, \dot q)\,\dot q + H(q)\,\ddot q \qquad (3)$$

$$-\left(\frac{\partial E_k}{\partial q}\right)^{T} = S(q, \dot q)\,\dot q - \frac{1}{2}\tilde H(q, \dot q)\,\dot q \qquad (4)$$

$$\left(\frac{\partial E_p}{\partial q}\right)^{T} = g(q) \qquad (5)$$

where, with a specific simplified notation, the matrices H = H(q), S = S(q, q̇) and H̃ = H̃(q, q̇) = d/dt(H(q)) relate to inertia effects and the vector g = g(q) corresponds to the effects of gravity. Then, the model (the equations of motion of articulated robots) can be written as follows

$$\ddot q \;=\; \underbrace{\underbrace{-\,H^{-1}\Big(\tfrac{1}{2}\tilde H + S\Big)\dot q}_{f(q,\,\dot q)}\;\underbrace{-\,H^{-1} g}_{f_g(q)}}_{f_c(q,\,\dot q)}\;+\;\underbrace{H^{-1}}_{g_\tau(q)}\,\tau \;=\; f(q, \dot q) + u \qquad (6)$$

where f(q, q̇) = −H⁻¹(½H̃ + S) q̇ and u = H⁻¹(−g + τ).

Thus, Eq. (6) can be considered as a particular form of the model (1):

$$\ddot q = f_c(q, \dot q) + g_\tau(q)\,\tau = f(q, \dot q) + f_g(q) + g_\tau(q)\,\tau = f(q, \dot q) + g(q)\,u \qquad (7)$$

where g(q) is added just for generality; it is the identity matrix here. In (7), the effects of gravity f_g(q) are included into the robot inputs as outer forces that cannot be reduced or suppressed in the control design due to their static and fixed character. Note that the final torques required on the appropriate drives (reference torques for the local drive control) are given by

$$\tau = H u + g \qquad (8)$$

i.e. the backward recomputation of the designed control actions to real torques including a compensation of the effects of gravity.
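A minimal numerical sketch of Eqs. (6)-(8) is given below, assuming the robot-specific terms H(q), S(q, q̇), H̃(q, q̇) and g(q) are available as user-supplied functions (they are not provided in the paper); it evaluates the forward dynamics q̈ = f(q, q̇) + u and the backward recomputation of control actions to drive torques.

```python
# Sketch of Eqs. (6)-(8). inertia_H, coriolis_S, inertia_Hdot and gravity_g are
# placeholders (assumptions) for the robot-specific terms H, S, H~ and g.
import numpy as np

def forward_dynamics(q, q_dot, u, inertia_H, coriolis_S, inertia_Hdot):
    """q_ddot = f(q, q_dot) + u, with f = -H^{-1}(0.5*H~ + S) q_dot (Eq. (6))."""
    H = inertia_H(q)
    f = -np.linalg.solve(H, (0.5 * inertia_Hdot(q, q_dot) + coriolis_S(q, q_dot)) @ q_dot)
    return f + u

def torques_from_control(q, u, inertia_H, gravity_g):
    """Backward recomputation of designed control actions to drive torques (Eq. (8))."""
    return inertia_H(q) @ u + gravity_g(q)
```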

3 Integration Concept

To design nonlinear predictive control in discrete form, let us consider individual time instants of the initial continuous nonlinear function in the model (1) as follows (i.e. only time sampling, but no model discretization)

$$f_k = f(q, \dot q, \tau)\big|_{t = kT_s}, \qquad k = 0, 1, \cdots \qquad (9)$$

where T_s is an appropriately selected sampling period. Let the same be applied to the terms from (7)

$$f_k = f(q, \dot q)\big|_{t = kT_s}, \qquad g_k = g(q)\big|_{t = kT_s} \qquad (10)$$


The propositions (9) or (10) will be apt for a specific construction of the predictions towards unknown control actions u(t) within an ordered finite set of discrete time instants t ∈ {k T_s, (k+1) T_s, ..., (k+N−1) T_s}, where N is a prediction horizon. The predictions still take into account the continuous nonlinear model, but only at the indicated discrete time samples that will be applied to the discrete design of predictive control. This concept can be realised by means of numerical methods [8] used for the numerical approximation of the solution of first-order ordinary differential equations (ODEs):

$$\dot y = f(t, y) \quad \text{with initial condition} \quad y_0 = y|_{t = 0} \qquad (11)$$

Thus, these methods are used to find a numerical approximation of the exact integral over a particular time interval, e.g. t ∈ ⟨k T_s, (k+1) T_s⟩:

$$y_{k+1} = y_k + \int_{kT_s}^{(k+1)T_s} \dot y\, dt = y_k + \int_{kT_s}^{(k+1)T_s} f(t, y)\, dt, \qquad \hat y_{k+1} = y_k + h\,\delta(t, y) \qquad (12)$$

where y_k = y|_{t = kT_s} is an initial condition of the given time interval, ŷ_{k+1} is an approximation of the exact solution y_{k+1} = y|_{t = (k+1)T_s}, δ(t, y) means in general the function approximating ẏ so that ŷ_{k+1} would be an adequate approximation of y_{k+1}, and h is a step of the integration method, which is selected as h = T_s. From a large number of methods, let us take into account linear multi-step methods that are convenient for the predictions in predictive design. Generally, linear multi-step methods are expressed as follows

$$\hat y_{k+1} = \sum_{i=0}^{r} \alpha_i\, y_{k-i} + h \sum_{j=-1}^{s} \beta_j\, f_{k-j} \qquad (13)$$

where ŷ_{k+1} is the result of the numerical integration arising from the previous y_{k-i}. For the design, the explicit methods are useful. One representative is the explicit Adams-Bashforth method of fourth order with r = 0, α_0 = 1, s = 3 and β_j = γ_j/24, j ∈ {−1, 0, 1, 2, 3}, where γ_{−1} = 0, γ_0 = 55, γ_1 = −59, γ_2 = 37 and γ_3 = −9, as follows

$$\hat y_{k+1} = y_k + h\left(-\tfrac{9}{24}\, f_{k-3} + \tfrac{37}{24}\, f_{k-2} - \tfrac{59}{24}\, f_{k-1} + \tfrac{55}{24}\, f_k\right) \qquad (14)$$

Note, for completeness, that the function approximating ẏ from (12) is as follows:

$$\delta(t, y) = -\tfrac{9}{24}\, f_{k-3} + \tfrac{37}{24}\, f_{k-2} - \tfrac{59}{24}\, f_{k-1} + \tfrac{55}{24}\, f_k$$

The aforementioned Adams-Bashforth linear method [8] will be used explicitly in an explanation of the single-pass algorithm in the following section.
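As a small illustration, one step of the fourth-order Adams-Bashforth rule (14) can be coded directly from the weights above; the warm start from the exact solution of y' = −y is only a stand-in for the robot ODE.

```python
# One step of Eq. (14): the new value is the old one plus h times a weighted
# combination of the last four derivative samples.
import math

AB4 = [55/24, -59/24, 37/24, -9/24]          # weights for f_k, f_{k-1}, f_{k-2}, f_{k-3}

def adams_bashforth4_step(y_k, f_history, h):
    """f_history = [f_k, f_{k-1}, f_{k-2}, f_{k-3}], most recent first."""
    return y_k + h * sum(b * f for b, f in zip(AB4, f_history))

# Warm start from the exact solution of y' = -y (h plays the role of Ts)
h = 0.01
y_vals = [math.exp(-i * h) for i in range(4)]                 # y_0 .. y_3
f_hist = [-y_vals[3], -y_vals[2], -y_vals[1], -y_vals[0]]     # f_k .. f_{k-3}
print(adams_bashforth4_step(y_vals[3], f_hist, h), math.exp(-4 * h))
```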

4 Nonlinear Design of Model Predictive Control

The nonlinear model predictive design has recently focussed on the solution of the nonlinear optimal control problem with an integral criterion of optimality. It represents a general solution using sophisticated nonlinear optimization algorithms. But it leads to sequential quadratic programming (QP) representing a one-step-ahead, spreading-in-time optimization process, thus iteratively approximating the nonlinear problem with QP [11,28]. Alternatively, common additive forms of the criterion employed in linear control theory can be considered as well. The operation of integration is just moved from the criterion towards the predictions composed by means of the numerical methods for ODEs involving the continuous-time model (7). This idea will be introduced and employed in the proposed design.

Let us start just from the continuous model (7) considered at the discrete time samples (time instants) as indicated in (10). Moreover, let us take a more usual, universal notation into account: instead of q, use the symbol y for the outputs, i.e. y_k = q_k, as well as for the appropriate derivatives ẏ_k = q̇_k and ÿ_k = q̈_k:

$$\ddot y_k = f_k + g_k u_k \qquad (15)$$

From (6), the function f_k satisfies $f_k|_{[\dot y = 0,\, y \in \mathbb{R}]} = 0$, whereas the term $f_{g\,k}|_{y \in \mathbb{R}} \neq 0$ in (7) for the given spatial articulated robot class as well as specifically for vertical planar robot configurations. Note, for completeness, that $f_{g\,k}|_{y \in \mathbb{R}} = 0$ applies to horizontal planar robot configurations. The knowledge of this property of f_k is useful for maintaining stability in the control design. Using the model (15), the specific design of the nonlinear model predictive control will now be explained in the following three sections.

4.1 Criterion and Cost Function

The criterion for predictive control design can generally be written as follows

$$\min_{U_k}\; J_k(\hat Y_{k+1}, W_{k+1}, U_k) \quad \text{subject to:}\quad \hat Y_{k+1} = f(y_k, \dot y_k, \delta(t, y, \dot y)),\qquad \ddot y_k = f_k + g_k u_k \qquad (16)$$

where the function δ(t, y, ẏ) arises from the second-order model ÿ_k = f_k + g_k u_k and f_k = f(y, ẏ), as was shown in (10). The vectors Ŷ_{k+1}, W_{k+1} and U_k represent the sequences of the robot output predictions, references and control actions from the given current time sample up to the horizon of prediction N, respectively:

$$\hat Y_{k+1} = [\hat y_{k+1}^T, \cdots, \hat y_{k+N}^T]^T \qquad (17)$$

$$W_{k+1} = [w_{k+1}^T, \cdots, w_{k+N}^T]^T \qquad (18)$$

$$U_k = [u_k^T, \cdots, u_{k+N-1}^T]^T \qquad (19)$$

Thereafter, the cost function J_k is chosen in the usual quadratic form as follows

$$J_k = \sum_{i=1}^{N}\Big\{\|Q_{yw}(\hat y_{k+i} - w_{k+i})\|_2^2 + \|Q_u\, u_{k+i-1}\|_2^2\Big\}
     = (\hat Y_{k+1} - W_{k+1})^T Q_{YW}^T Q_{YW} (\hat Y_{k+1} - W_{k+1}) + U_k^T Q_U^T Q_U\, U_k \qquad (20)$$

where Q_{YW}^T Q_{YW} and Q_U^T Q_U represent penalizations defined by the following block-diagonal matrix form

$$Q_\square^T Q_\square = \begin{bmatrix} Q_*^T Q_* & & 0 \\ & \ddots & \\ 0 & & Q_*^T Q_* \end{bmatrix} \qquad (21)$$

where the symbolic subscripts □, * have the following interpretation: □ ∈ {YW, U} and * ∈ {yw, u}. However, the cost function can be selected differently according to user requirements, e.g. considering incremental terms that can slightly moderate and smooth the robot motion [18] or suppress the steady-state control error, as shown in [7,25]. It means that the cost function (20) may also be selected in some specific incremental form from the control action point of view, for instance, as follows

$$J_k = (\hat Y_{k+1} - W_{k+1})^T Q_{YW}^T Q_{YW} (\hat Y_{k+1} - W_{k+1}) + \Delta U_k^T Q_U^T Q_U\, \Delta U_k \qquad (22)$$

However, a minimization of the cost functions may be provided by a similar optimization procedure. Such a procedure will be introduced at the end of this section.

4.2 Equations of Predictions – Algorithm with Nonlinear Prediction

The algorithm with nonlinear prediction (the two-cycle algorithm) is developed to improve the accuracy of the predicted outputs ŷ by a specific approximation of the nonlinearities of the robot model through specific equations of predictions involving a specific nonlinear prediction. The algorithm consists of two cycles, as follows from Fig. 2. The main cycle represents the usual control cycle extended by an internal simulation cycle. The internal cycle composes the essential part of the control u_sim k and predicts the future values of the system state x_sim k:k+N. For these operations, the general predictive control algorithm and the numerical integration of the nonlinear model (i.e. the function f(x) + g(x) u (= f(x, u))) are used. The obtained values are stored for the computations in the main cycle. There, at first, the models determined by the states x_sim k:k+N are calculated. Then, they are used for the composition of the equations of predictions serving for generating the increments of the control actions. Finally, the increments are added to the actions from the internal cycle u_sim k. The final control actions u_k are applied to the controlled system. Now, this generally defined algorithm can be mathematically formulated.

[Flowchart: main control cycle (for k = 2 : Tfinal/Ts) with an internal simulative cycle (for i = 1 : N) that pre-computes u_sim and x_sim by numerical integration of the nonlinear model, followed by computation of the control increments and composition of the final control action u_k = u_sim k + Δu_k.]

Fig. 2. Flowchart of the algorithm with nonlinear prediction.

In detail, the specific equations of predictions are constructed as follows:

$$\begin{aligned}
x_{k+1} &= A_k x_k + B_k u_{sim\,k} + B_k\,\Delta u_k \\
\hat x_{k+1} &= x_{sim\,k+1} + B_k\,\Delta u_k \\[4pt]
x_{k+2} &= A_{k+1} x_{k+1} + B_{k+1} u_{sim\,k+1} + B_{k+1}\,\Delta u_{k+1} \\
\hat x_{k+2} &= A_{k+1} x_{sim\,k+1} + B_{k+1} u_{sim\,k+1} + A_{k+1} B_k\,\Delta u_k + B_{k+1}\,\Delta u_{k+1} \\
\hat x_{k+2} &= x_{sim\,k+2} + A_{k+1} B_k\,\Delta u_k + B_{k+1}\,\Delta u_{k+1} \\
&\;\;\vdots \\
\hat x_{k+N} &= x_{sim\,k+N} + A_{k+N-1}\cdots A_{k+1} B_k\,\Delta u_k + \cdots + A_{k+N-1} B_{k+N-2}\,\Delta u_{k+N-2} + B_{k+N-1}\,\Delta u_{k+N-1}
\end{aligned} \qquad (23)$$

In (23), $x_{sim\,k+1}$, $u_{sim\,k}$ (etc. $x_{sim\,k+i}$, $u_{sim\,k+i-1}$, $i = 1, \cdots, N$) are vectors of the state and control actions given from the pre-simulation in the considered receding horizon of the incremental control algorithm. Furthermore, $A_k = A(x_k)$, $B_k = B(x_k)$, $A_{k+i} = A(x_{sim\,k+i})$, $B_{k+i} = B(x_{sim\,k+i})$, $i = 1, \cdots, N-1$, are matrices of the state-space model given by model linearization and discretization repeated at every time instant k; C is an output matrix. All terms follow from the state-space form:

$$\underbrace{\begin{bmatrix} \dot y \\ \ddot y \end{bmatrix}}_{\dot x}
= \underbrace{\begin{bmatrix} 0 & I \\ 0 & f(\dot y, y) \end{bmatrix}}_{A(t)}
  \underbrace{\begin{bmatrix} y \\ \dot y \end{bmatrix}}_{x}
+ \underbrace{\begin{bmatrix} 0 \\ I \end{bmatrix}}_{B} u \qquad (24)$$

$$y = \underbrace{\begin{bmatrix} I & 0 \end{bmatrix}}_{C} \begin{bmatrix} y \\ \dot y \end{bmatrix} \qquad (25)$$

that is discretized, $A(t), B\,|_{T_s} \;\Rightarrow\; A_k, B_k$, by the first-order-hold method [12] and arranged as

$$\hat y_{k+1} = C A_k x_k + C B_k u_k \qquad (26)$$

The derivation of Eq. (23) follows from the usual construction of equations of predictions, i.e. a suitably repetitive substitution for the predicted future states and outputs, respectively. However, the structure of the equations of predictions is more complex due to the variable state-space matrices within the optimisation interval given by the horizon of prediction N:

$$\begin{bmatrix} \hat y_{k+1} \\ \hat y_{k+2} \\ \vdots \\ \hat y_{k+N} \end{bmatrix}
= \begin{bmatrix} C x_{sim\,k+1} \\ C x_{sim\,k+2} \\ \vdots \\ C x_{sim\,k+N} \end{bmatrix}
+ \begin{bmatrix}
C B_k & & & 0 \\
C A_{k+1} B_k & \ddots & & \\
\vdots & & \ddots & \\
C A_{k+N-1}\cdots A_{k+1} B_k & \cdots & & C B_{k+N-1}
\end{bmatrix}
\begin{bmatrix} \Delta u_k \\ \Delta u_{k+1} \\ \vdots \\ \Delta u_{k+N-1} \end{bmatrix} \qquad (27)$$

A corresponding condensed matrix notation is as follows

$$\hat Y_{k+1} = \hat Y_{sim\,k+1} + \bar G_k\,\Delta U_k, \qquad
\hat Y_{k+1} = [\hat y_{k+1}, \hat y_{k+2}, \cdots, \hat y_{k+N}]^T, \quad
\Delta U_k = [\Delta u_k, \Delta u_{k+1}, \cdots, \Delta u_{k+N-1}]^T \qquad (28)$$

$$\hat Y_{k+1} = \bar F_k + \bar G_k\,\Delta U_k \qquad (29)$$

where the vector $\bar F_k = \hat Y_{sim\,k+1}$ and the matrix $\bar G_k$ follow from the structure (27).
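A sketch of how the free response F̄_k and the forced-response matrix Ḡ_k of (27)-(29) could be assembled is shown below, assuming the linearized, discretized matrices A_{k+i}, B_{k+i} and the pre-simulated states x_sim k+i are already available from the internal cycle; this is an illustrative implementation, not the author's code.

```python
# Assemble F_bar and G_bar of Eqs. (27)-(29). A_list[j] stands for A_{k+j},
# B_list[j] for B_{k+j}, x_sim_list[i] for x_sim_{k+i+1}; all are assumed inputs.
import numpy as np

def prediction_matrices(A_list, B_list, C, x_sim_list):
    N = len(A_list)
    ny, nx = C.shape
    nu = B_list[0].shape[1]
    F_bar = np.concatenate([C @ x for x in x_sim_list])      # free response: C x_sim_{k+i}
    G_bar = np.zeros((N * ny, N * nu))                       # forced response on delta u
    for i in range(N):                                       # row block for y_hat_{k+i+1}
        Phi = np.eye(nx)
        for m in range(i, -1, -1):                           # column block for delta u_{k+m}
            G_bar[i * ny:(i + 1) * ny, m * nu:(m + 1) * nu] = C @ Phi @ B_list[m]
            Phi = Phi @ A_list[m]                            # extend the product A_{k+i}...A_{k+m}
    return F_bar, G_bar
```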

4.3 Equations of Predictions – Single-Pass Algorithm

As was already mentioned, the equations of predictions can also be composed with the nonlinear continuous model (15) and by means of the idea of the approximation of the exact integral as indicated in (12). It can be ensured by the exemplarily selected linear multi-step Adams-Bashforth method of fourth order (14) for the solution of first-order ODEs. However, the considered nonlinear model of the robot (15) represents a system of second-order ODEs. To apply the chosen numerical method, but without loss of information about the included nonlinear relations, the model (15) has to be specifically transformed into ODEs of the first order. For such a necessary rearrangement, the following backward Euler formula can be applied

$$\hat{\dot y}_{k+1} = \frac{\hat y_{k+1} - y_k}{h} \qquad (30)$$

This specific formula ensures the coupling (propagation) of the nonlinear relations from the model (15) into the positional estimates, since the numerical methods represent only linear combinations within the rows of the ODEs. Hence, the usual rearrangement via the addition of further ODEs decreasing the order of the initial ODE system would not be useful because of the loss of information. Considering the aforementioned features, the positional estimate $\hat y_{k+1}$ can then be determined by the velocity estimate $\hat{\dot y}_{k+1}$ that fully includes the initial nonlinear model (15), as follows

$$\hat y_{k+1} = y_k + h\,\hat{\dot y}_{k+1} \qquad (31)$$

where the estimate $\hat{\dot y}_{k+1}$ is generally given by the numerical integration of the ODE set as indicated

$$\hat{\dot y}_{k+1} = \dot y_k + h\,\delta(t, y, \dot y) \qquad (32)$$

Note, for completeness, that if the appropriate robot model were a set of first-order ODEs only, i.e. $\dot{\tilde y}_k = \tilde f_k + \tilde g_k u_k$ as e.g. in [13,27], this step would not be needed.


Now, the real equations of predictions can be composed considering the initial nonlinear model (15), together with a specific numerical method (32) (here, the Adams-Bashforth method (14)) and the transformation to the first-order ODE set as indicated by (31). The equations of predictions, expressed in a condensed matrix form, are defined for the velocity vector $\hat{\dot Y}_{k+1}$ as follows

$$\hat{\dot Y}_{k+1} = \dot y_k F_I + F_k + G_k U_k \qquad (33)$$

and for the position vector $\hat Y_{k+1}$ as

$$\hat Y_{k+1} = y_k F_I + L_k + M_k U_k \qquad (34)$$

where the individual terms represent the multiple identity matrix $F_I = [\,I \cdots I\,]^T$, the specific free responses $\dot y_k F_I + F_k$ and $y_k F_I + L_k$, and the forced responses $G_k U_k$ and $M_k U_k$, respectively. $F_k$ and $G_k$ can be defined using the following sequences for the velocities (the particular time instants are grouped for a clear arrangement):

$$\begin{aligned}
\hat{\dot y}_{k+1} &= \dot y_k + F_{1,k} + \beta_0 g_k u_k, & F_{1,k} &= \beta_3 \ddot y_{k-3} + \beta_2 \ddot y_{k-2} + \beta_1 \ddot y_{k-1} + \beta_0 f_k \\
\hat{\dot y}_{k+2} &= \dot y_k + F_{2,k} + (\beta_1 + \beta_0) g_k u_k + \beta_0 \hat g_{k+1} u_{k+1}, & F_{2,k} &= F_{1,k} + \beta_3 \ddot y_{k-2} + \beta_2 \ddot y_{k-1} + \beta_1 f_k + \beta_0 \hat f_{k+1} \\
&\;\;\vdots \\
\hat{\dot y}_{k+N} &= \dot y_k + F_{N,k} + \Big\{\textstyle\sum_{j=0}^{3} \beta_j\Big\} g_k u_k + \cdots + \beta_0 \hat g_{k+N-1} u_{k+N-1}, & F_{N,k} &= F_{N-1,k} + \textstyle\sum_{j=1}^{4} \beta_{j-1} \hat f_{k+N-j}
\end{aligned} \qquad (35)$$

Note that in step k, the current value $\dot y_k$ as well as the appropriate past values $\ddot y_{k-1}$, $\ddot y_{k-2}$ and $\ddot y_{k-3}$ are known from measurements or possibly from some suitable running state estimation. The coefficients $\beta_j$ appearing in (35) are identical to the coefficients $\beta_j$ introduced in Eq. (13).


Consequently, the vector $F_k$ and the matrix $G_k$ from (33) can be defined as follows:

$$F_k = \begin{bmatrix} F_{1,k} \\ F_{2,k} \\ \vdots \\ F_{N,k} \end{bmatrix}, \qquad
G_k = \begin{bmatrix}
\beta_0 g_k & 0 & \cdots & 0 \\
(\beta_1 + \beta_0) g_k & \beta_0 \hat g_{k+1} & & \vdots \\
\vdots & & \ddots & \\
\Big\{\textstyle\sum_{j=0}^{3} \beta_j\Big\} g_k & \cdots & \cdots & \beta_0 \hat g_{k+N-1}
\end{bmatrix} \qquad (36)$$

where $\hat f_{k+i}$ and $\hat g_{k+i}$ can be substituted by the future reference values as follows: $\hat f_{k+i} = f_{k+i}(w_{k+i}, \dot w_{k+i})$ and $\hat g_{k+i} = g_{k+i}(w_{k+i})$.

Similarly, in the construction of Eq. (34), $\hat Y_{k+1}$, $L_k$ and $M_k$ can be defined by analogical sequences for the positions as follows (again, the particular time instants are grouped for a clear arrangement):

$$\begin{aligned}
\hat y_{k+1} &= y_k + L_{1,k} + h\,\beta_0 g_k u_k, & L_{1,k} &= h\,\dot y_k + h\,F_{1,k} \\
\hat y_{k+2} &= y_k + L_{2,k} + h\,(\beta_1 + 2\beta_0) g_k u_k + h\,\beta_0 \hat g_{k+1} u_{k+1}, & L_{2,k} &= L_{1,k} + h\,F_{2,k} \\
&\;\;\vdots \\
\hat y_{k+N} &= y_k + L_{N,k} + h\Big\{\textstyle\sum_{j=0}^{3} (N-j)\beta_j\Big\} g_k u_k + \cdots + h\,\beta_0 \hat g_{k+N-1} u_{k+N-1}, & L_{N,k} &= L_{N-1,k} + h\,F_{N,k}
\end{aligned} \qquad (37)$$

Then, the vector $L_k$ and the matrix $M_k$ from (34) are:

$$L_k = \begin{bmatrix} L_{1,k} \\ L_{2,k} \\ \vdots \\ L_{N,k} \end{bmatrix}, \qquad
M_k = \begin{bmatrix}
h\,\beta_0 g_k & 0 & \cdots & 0 \\
h\,(\beta_1 + 2\beta_0) g_k & h\,\beta_0 \hat g_{k+1} & & \vdots \\
\vdots & & \ddots & \\
h\Big\{\textstyle\sum_{j=0}^{3} (N-j)\beta_j\Big\} g_k & \cdots & \cdots & h\,\beta_0 \hat g_{k+N-1}
\end{bmatrix} \qquad (38)$$

Note that the two-step equations of predictions (33) and (34) are used here with respect to the second order of ODEs.
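For illustration, the forced-response matrices G_k (36) and M_k (38) can be filled directly from the Adams-Bashforth weights; the sketch below assumes the input matrices g_k and their reference-based estimates ĝ_{k+m} are supplied as a list (for the robot in Sect. 5, g(q) is the identity), and is not taken from the paper.

```python
# Hedged sketch of G_k (Eq. (36)) and M_k (Eq. (38)). g_list[m] stands for g_k (m = 0)
# or its reference-based estimate g_hat_{k+m} (m > 0); these inputs are assumptions.
import numpy as np

BETA = [55/24, -59/24, 37/24, -9/24]        # beta_0 .. beta_3 of the 4th-order method

def single_pass_forced_matrices(g_list, h):
    N = len(g_list)
    ny, nu = g_list[0].shape
    G = np.zeros((N * ny, N * nu))          # velocity predictions, Eq. (33)
    M = np.zeros((N * ny, N * nu))          # position predictions, Eq. (34)
    for i in range(N):                      # row block: prediction for step k+i+1
        for m in range(i + 1):              # column block: control action u_{k+m}
            d = i - m
            cg = sum(BETA[j] for j in range(min(d, 3) + 1))
            cm = h * sum((d + 1 - j) * BETA[j] for j in range(min(d, 3) + 1))
            G[i * ny:(i + 1) * ny, m * nu:(m + 1) * nu] = cg * g_list[m]
            M[i * ny:(i + 1) * ny, m * nu:(m + 1) * nu] = cm * g_list[m]
    return G, M

# e.g. for the six-axis robot with g(q) = I (cf. Eq. (7)), N = 10 and h = Ts = 0.01:
G, M = single_pass_forced_matrices([np.eye(6)] * 10, 0.01)
```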

4.4 Square-Root Minimization

To minimize the cost function (20), let us consider the following expression

$$\min_{U_k} J_k = \min_{U_k} \mathbf{J}_k^T \mathbf{J}_k \;\;\Rightarrow\;\; \min_{U_k} \mathbf{J}_k \qquad (39)$$

that indicates the square-root minimization of the vector $\mathbf{J}_k$ instead of the minimization of the scalar $J_k$. The reason is that minimizing the square-root is more suitable in terms of calculation. Thus, the square-root of the criterion (16) with the cost function (20) can be expressed as follows

$$\min_{U_k} \mathbf{J}_k = \min_{U_k} \begin{bmatrix} Q_{YW} & 0 \\ 0 & Q_U \end{bmatrix} \begin{bmatrix} \hat Y_{k+1} - W_{k+1} \\ U_k \end{bmatrix} \qquad (40)$$

The indicated minimization (40) can be solved as a specific least-squares problem by the following system of algebraic equations [17] that involves the equations of predictions (34) for $\hat Y_{k+1}$:

$$\begin{bmatrix} Q_{YW} M_k \\ Q_U \end{bmatrix} U_k = \begin{bmatrix} Q_{YW}(W_{k+1} - y_k F_I - L_k) \\ 0 \end{bmatrix} \qquad (41)$$

A similar least-squares problem can also be written specifically for the cost function (22) used in the two-cycle algorithm with the nonlinear prediction (29):

$$\begin{bmatrix} Q_{YW} \bar G_k \\ Q_U \end{bmatrix} \Delta U_k = \begin{bmatrix} Q_{YW}(W_{k+1} - \bar F_k) \\ 0 \end{bmatrix} \qquad (42)$$

The system (41) or (42), which is over-determined, can be written in the condensed general form (43). It can be transformed to another form (44) by orthogonal-triangular decomposition [17] and solved for the unknown $U_k$ (or $\Delta U_k$):

$$A\, U_k = b \qquad (43)$$

$$Q^T A\, U_k = Q^T b \quad \text{assuming that} \quad A = Q\,R, \qquad R_1\, U_k = c_1 \qquad (44)$$

where $Q^T$ is an orthogonal matrix that transforms the matrix $A$ into the upper triangle $R_1$, as indicated by the following equation diagram

$$A\, U_k = b \;\;\Rightarrow\;\; \begin{bmatrix} R_1 \\ 0 \end{bmatrix} U_k = \begin{bmatrix} c_1 \\ c_z \end{bmatrix} \qquad (45)$$

The vector $c_z$ represents a loss vector, the Euclidean norm $\|c_z\|$ of which equals the square-root of the optimal cost function minimum, i.e. the scalar value $\sqrt{J}$, where $J = c_z^T c_z$. Only the first elements, corresponding to $u_k$, are used from the computed vector $U_k$. Note that for $\Delta U_k$ it is necessary to add the control action value $u_{sim\,k}$ from the internal cycle as follows: $u_k = u_{sim\,k} + \Delta u_k$, where $\Delta u_k$ is the appropriate first vector element of $\Delta U_k$ obtained from the main cycle.
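The square-root minimization (40)-(45) amounts to stacking the weighted prediction equations and penalties into an over-determined system A U_k = b and solving it by QR decomposition. The following sketch does this with NumPy for the single-pass form (41); matrix shapes and names are assumptions.

```python
# Sketch of the square-root minimization via orthogonal-triangular decomposition.
# Q_yw, Q_u, M_k, L_k are assumed to be stacked over the horizon as in Eqs. (34)-(41).
import numpy as np

def solve_control_sequence(Q_yw, Q_u, M_k, L_k, W, y_k, N):
    F_I = np.tile(np.eye(len(y_k)), (N, 1))          # multiple identity matrix [I ... I]^T
    A = np.vstack([Q_yw @ M_k, Q_u])                 # left-hand side of Eq. (41)
    b = np.concatenate([Q_yw @ (W - F_I @ y_k - L_k), np.zeros(Q_u.shape[0])])
    Q, R = np.linalg.qr(A, mode='reduced')           # A = Q R, Eq. (44)
    U = np.linalg.solve(R, Q.T @ b)                  # R1 U = c1
    nu = M_k.shape[1] // N
    return U[:nu]                                    # only the first block u_k is applied
```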

5 Simulation Examples

The examples demonstrate the behaviour of the articulated robot along a selected testing trajectory. The corresponding wire-frame model of the given robot, including the trajectory, is shown in Fig. 1. The trajectory in detail is shown in Figs. 3 and 4. The depicted trajectory was time-parameterized with an acceleration polynomial of fifth order [4,23]. The specification of the individual trajectory segments is listed in Table 1, where G19, G01 and G02 are G-codes for plane selection (for full definiteness of the circles), motion with linear interpolation and motion with circular interpolation, respectively. The model of the given robot dynamics, (7) and (8) from (2)–(6), took into account the parameters of the ABB robot IRB 140 (Fig. 5). The number of actuated (driven) axes of the robot is six, as is the number of degrees of freedom of the robot. The six degrees of freedom correspond to six inputs, the torques τ_{1:6} (N·m), and six outputs, the joint coordinates y = q_{1:6} (rad), relating to the adequate Cartesian coordinates

$$E = [\,x_e, y_e, z_e\ (\mathrm{m}),\; \alpha_{ze}, \beta_{ye}, \gamma_{xe}\ (\mathrm{rad})\,]^T$$

Fig. 3. Testing trajectory with specific time marks [2].


Fig. 4. Cartesian coordinates and derivatives (time in (s)) [2].

Table 1. Testing trajectory in G code (mm).

001: N010 G19
002: N020 G01 X630 Y-200 Z400
003: N030 G01 X630 Y200 Z400
004: N040 G01 X630 Y0 Z400
005: N050 G02 X430 Y-200 Z400 I-200 J0 K0
006: N060 G02 X430 Y200 Z400 I0 J200 K0
007: N070 G02 X630 Y0 Z400 I0 J-200 K0
008: N080 G01 X630 Y-200 Z400
009: N090 G01 X630 Y200 Z400
010: N100 G01 X630 Y0 Z400

and twelve state variables

$$x = [\,q_{1:6}^T\ (\mathrm{rad}),\; \dot q_{1:6}^T\ (\mathrm{rad\cdot s^{-1}})\,]^T$$

corresponding to the joint coordinates and their respective derivatives. Note that the end-effector was oriented to be parallel with the axis x_0. Thus, the orientation angles are considered to be constant: α_{ze} = β_{ye} = γ_{xe} = 0 rad. However, the corresponding reference values in the joint space w_{1:6,∀k}, k = 1, 2, ..., are time-varying according to the appropriate inverse kinematic transformations [23], specific to the considered robot.


Fig. 5. Six-axis multipurpose ABB robot IRB 140 [2].

5.1 Simulation Setup

The introduced nonlinear designs of model predictive control, i.e. the two-cycle algorithm (PreSim MPC) and the fast single-pass algorithm (NonLin MPC), were tested with the following parameters:

– sampling period: T_s = 0.01 s
– horizon of prediction: N = 10
– output penalization: Q_yw = I_(6×6)
– input penalization: Q_u = 2·10^{-4} I_(6×6)

where I is the identity matrix. Both algorithms were compared with the normal model predictive control (Normal MPC, MPC) [25] having an identical setting involved in its equation:

$$U_k = \left(\tilde G_k^T Q_{YW}^T Q_{YW} \tilde G_k + Q_U^T Q_U\right)^{-1} \tilde G_k^T Q_{YW}^T Q_{YW} \left(W_{k+1} - \tilde F_k x_k\right) \qquad (46)$$

where the matrices $\tilde F_k$ and $\tilde G_k$ are derived from the nonlinear model (7) as follows:

$$\tilde F_k = \begin{bmatrix} C A_k \\ \vdots \\ C A_k^N \end{bmatrix}, \qquad
\tilde G_k = \begin{bmatrix} C B_k & \cdots & 0 \\ \vdots & \ddots & \vdots \\ C A_k^{N-1} B_k & \cdots & C B_k \end{bmatrix} \qquad (47)$$

The details on used normal MPC can be found e.g. in [20,25].
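For comparison, the unconstrained linear law (46) reduces to solving one linear system per time step; a minimal sketch (with assumed matrix shapes and names) is given below.

```python
# Sketch of the conventional linear MPC law (46) used as the comparison baseline;
# F_t and G_t are assumed to be built from the linearized, discretized model (47).
import numpy as np

def normal_mpc_action(F_t, G_t, Q_yw, Q_u, W, x_k, n_inputs=6):
    H = G_t.T @ Q_yw.T @ Q_yw @ G_t + Q_u.T @ Q_u
    U = np.linalg.solve(H, G_t.T @ Q_yw.T @ Q_yw @ (W - F_t @ x_k))
    return U[:n_inputs]                      # apply only the first control action u_k
```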

5.2 Summary of the Results

The comparative simulations were performed with the aforementioned setting and with an artificially added mismatch between the model used for the control design and the model used for the simulation. The mismatch consisted in a four-times increased weight of the last, sixth robot link in the simulation model against the model for the design, i.e. m_6 = 0.25 kg and m̃_6 = 4 × m_6. Figures 6, 7 and 8 show the control errors during the robot motion along the testing trajectory. They represent the errors in the Cartesian coordinate system. The values of the appropriate Cartesian coordinates were determined from the 'measured' values of the joint coordinates by the appropriate direct kinematic transformations [23]. The corresponding errors in the joint space for the joints most exposed to load, q_2 and q_3, are shown in Figs. 9 and 10. The exposure is caused by the robot motion itself or by its character across the symmetry plane x_0 z_0 |_{y_0 = 0}. The aforementioned figures show lower tracking errors for both proposed algorithms. Considering the moving-mass distribution in the robot, m_1 = 35 kg, m_2 = 16 kg, m_3 = 14 kg, m_4 = 6 kg, m_5 = 0.75 kg and m_6 = 0.25 kg, the highest vertical load is on the second link between joints q_2 and q_3, which is actuated in joint q_2.

Fig. 6. Time histories of errors in the axis x (Normal MPC, NonLin MPC, PreSim MPC).

Fig. 7. Time histories of errors in the axis y (Normal MPC, NonLin MPC, PreSim MPC).

Fig. 8. Time histories of errors in the axis z (Normal MPC, NonLin MPC, PreSim MPC).

For the chosen trajectory, or rather its orientation, the joint q_2 together with the joint q_3 dominantly influence the motion in the directions parallel to the axes x_0 and z_0, whereas the joint q_1 (rotation around the axis z_0) influences the motion in the direction of the axis y_0, especially if the motion trajectory leads the robot through the mentioned vertical symmetry plane x_0 z_0 |_{y_0 = 0}. Since both proposed algorithms as well as the normal MPC design have a positional character, specific steady-state errors are perceptible. This is especially obvious for the vertical axis z (Fig. 8) and a bit less for the horizontal axis x (Fig. 6), which is coupled with the joints serving predominantly for the motion in the direction of the axis z, i.e. joints q_2 and q_3. Figure 11 shows the joint coordinates q_i corresponding to the Cartesian coordinates in Fig. 4. It is evident that the most difficult motion phase is around 2.9 s, because the robot arms and end-effector are at full motion speed and the trajectory decomposed into the individual joint angles varies rapidly. Figure 12 shows the corresponding situation in the designed control actions τ_i generated during the control process.

Fig. 9. Time histories of control error for joint q2 (Normal MPC, NonLin MPC, PreSim MPC).

Fig. 10. Time histories of control error for joint q3 (Normal MPC, NonLin MPC, PreSim MPC).

Such a rapid turn or change causes variations in the model parameters, which cannot be expressed by one linearized model (24) fixed within the respective moving time intervals of the standard design (46), in contrast to the flexibly varying nonlinear model along the same time intervals in the proposed algorithms based on the equations of predictions (29) or (33) and (34), respectively.

Fig. 11. Time histories of joint coordinates: qi, i = 1, ..., 6 (Normal MPC, NonLin MPC, PreSim MPC).

Fig. 12. Time histories of control actions, joint torques: τi, i = 1, ..., 6 (Normal MPC, NonLin MPC, PreSim MPC).

6 Conclusion

The proposed nonlinear design algorithms are characterized by a specific straightforward use of the initial nonlinear continuous model in the construction of the predictions. For the fast single-pass algorithm, the design is entirely free of any linearization and conventional model discretization. The introduced algorithms can consider a reasonable prediction horizon, as normal MPC does. However, they can offer more accurate tracking of the desired motion trajectories in comparison with the use of conventional linear control approaches containing linearization of the used nonlinear models. The emphasis in further research will be placed on the selection and analysis of an adequate numerical method, on incremental prediction forms for offset-free motion, and on general point-to-point motion in unconstrained and constrained robot workspaces.

Acknowledgement. This paper was supported by The Czech Academy of Sciences, Institute of Information Theory and Automation.

References

1. Arimoto S (1996) Control theory of non-linear mechanical systems. Oxford Press, New York
2. Belda K (2018) Nonlinear design of model predictive control adapted for industrial articulated robots. In: Proceedings of the 15th international conference on informatics in control, automation and robotics, vol 2, pp 71–80
3. Belda K, Böhm J, Píša P (2007) Concepts of model-based control and trajectory planning for parallel robots. In: Proceedings of the 13th IASTED international conference on robotics and applications, pp 15–20
4. Belda K, Novotný P (2012) Path simulator for machine tools and robots. In: Proceedings of MMAR IEEE international conference, Poland, pp 373–378
5. Belda K, Rovný O (2017) Predictive control of 5 DOF robot arm of autonomous mobile robotic system motion control employing mathematical model of the robot arm dynamics. In: 21st international conference on process control, pp 339–344
6. Belda K, Vošmik D (2016) Explicit generalized predictive control of speed and position of PMSM drives. IEEE Trans Ind Electron 63(2):3889–3896
7. Belda K, Záda V (2017) Predictive control for offset-free motion of industrial articulated robots. In: Proceedings of MMAR IEEE international conference, Poland, pp 688–693
8. Butcher J (2016) Numerical methods for ordinary differential equations. Wiley, New York
9. Chung W, Fu L, Kröger T (2016) Motion control. Springer, pp 163–194
10. Colombo AW, Karnouskos S, Shi Y, Yin S, Kaynak O (2016) Industrial cyber-physical systems - scanning the issue. Proc IEEE 104(5):899–903
11. Faulwasser T, Weber T, Zometa P, Findeisen R (2017) Implementation of nonlinear model predictive path-following control for an industrial robot. IEEE Trans Control Syst Technol 25(4):1505–1511
12. Franklin G, Powell J, Workman M (1998) Digital control of dynamic systems, 3rd edn. Addison Wesley, Menlo Park
13. Groothuis S, Stramigioli S, Carloni R (2017) Modeling robotic manipulators powered by variable stiffness actuators: a graph-theoretic and port-Hamiltonian formalism. IEEE Trans Rob 33(4):807–818
14. Houska B, Ferreau H, Diehl M (2011) ACADO toolkit - an open-source framework for automatic control and dynamic optimization. Optimal Control Appl Methods 32(3):298–312
15. Kirches C (2011) Fast numerical methods for mixed-integer nonlinear model-predictive control. Springer
16. Kothare MV, Balakrishnan V, Morari M (1996) Robust constrained model predictive control using linear matrix inequalities. Automatica 32(10):1361–1379
17. Lawson C, Hanson R (1995) Solving least squares problems. SIAM
18. Maciejowski J (2002) Predictive control: with constraints. Prentice Hall, London
19. Negri GH, Cavalca MSM, Celeberto AL (2017) Neural nonlinear model-based predictive control with fault tolerance characteristics applied to a robot manipulator. Int J Innovative Comput Inf Control 13(6):1981–1992
20. Ordis A, Clarke D (1993) A state-space description for GPC controllers. J Syst Sci 24(9):1727–1744
21. Ostafew CJ, Schoellig AP, Barfoot TD, Collier J (2016) Learning-based nonlinear model predictive control to improve vision-based mobile robot path tracking. J Field Rob 33(1):133–152
22. Othman A, Belda K, Burget P (2015) Physical modelling of energy consumption of industrial articulated robots. In: Proceedings of the 15th international conference on control, automation and systems. ICROS, Korea, pp 784–789
23. Siciliano B, Sciavicco L, Villani L, Oriolo G (2009) Robotics - modelling, planning and control. Springer, London
24. SPS IPC Drives (2018) Internationale Fachmesse für elektrische Automatisierung, Systeme und Komponenten, November 2018. https://www.mesago.de/en/SPS/
25. Wang L (2009) Model predictive control system design and implementation using MATLAB®. Springer, London
26. Wilson J, Charest M, Dubay R (2016) Non-linear model predictive control schemes with application on a 2 link vertical robot manipulator. Rob Comput Integr Manuf 41:23–30
27. Záda V, Belda K (2017) Application of Hamiltonian mechanics to control design for industrial robotic manipulators. In: Proceedings of MMAR IEEE international conference, Poland, pp 390–395
28. Zanelli A, Domahidi A, Jerez J, Morari M (2017) FORCES NLP: an efficient implementation of interior-point methods for multistage nonlinear nonconvex programs. Int J Control, pp 1–17

Experimental Investigation of a Biologically Inspired Gripper with Active Haptic Control for Geometric Compliancy

Christian Ivan Basson and Glen Bright

University of KwaZulu-Natal, Durban 4001, South Africa
{bassonc,brightg}@ukzn.ac.za

Abstract. Human-machine interaction for object manipulation in assembly environments is enhanced through haptic feedback. The sensitivity of grasping fragile components is achieved through force feedback control for an active haptic sensory system for grippers. Intelligent gripping systems employing haptic control have the potential to be implemented in Reconfigurable Assembly Systems and on-demand production lines. Successful object handling and object control of a biologically inspired gripping system incorporating active haptic control were investigated. The proposed system was designed to evaluate adaptability and object conformity of the gripper grasping force through a dynamic pick-and-place procedure. The grasping procedure employed a haptic feedback control sensory system to monitor the force output of the grasping procedure and manipulate grip hold intensity. The force output data was empirically gathered and formulated to be expressed on a signal versus time graph to illustrate self-conformity and grasping stabilization of the gripper system. Recommendations and conclusions were expressed in relation to effective grip force control.

Keywords: Flexibility · Shape conformity · Adaptability · Active haptic control · Force feedback

1 Introduction

1.1 Background

Modern manufacturing industries are confronted with high production rates and demand response. The implementation of Reconfigurable Manufacturing Systems (RMS) meets the demand for increased reconfigurability and flexibility in production. Flexible Manufacturing Systems (FMS) and Reconfigurable Assembly Systems (RAS) are divisions of RMS. Assembly and pick-and-place procedures require accurate and adaptable part control [1]. RAS is described by [2] as a system that has the ability to alter its functionality and capacity in order to adapt to market demand. Product variety and part size changes are catered for through the incorporation of flexible grippers in FMS and RAS. RMS-based production systems integrating flexible fixtures possess on-demand manufacturing capabilities [3]. Flexible grippers are


consequently an important component in assembly operations requiring adaptability. The end-effector of a robotic manipulator is necessary for fixturing applications and part handling. Flexibility and reconfigurability of part handling, due to component size and part family variability, require multi-function grippers [4]. Flexible grippers possessing multi-functionality for grasping operations are therefore required in these environments. Flexibility and performance improve the functionality of grippers (see Fig. 1). The performance of grippers is described in terms of accuracy and repeatability. An inverse relationship exists between flexibility, performance and affordability of the gripper. A higher requirement for accuracy and flexibility necessitates high-precision components of the gripper [5]. Adaptability and precision consequently possess an inverse relationship. For the purpose of adaptability, an active haptic feedback control system was introduced to monitor and control gripper conformity.

Fig. 1. Performance versus flexibility of grippers [5].

1.2 Overview

The migration to cyber-physical systems in production environments requires that grippers possess haptic intelligence. Traditional product assembly incorporates ineffective end-effectors for human-computer integration [6]. Advanced flexible gripping systems necessitate immediate conformity to part shape and size. Applicable haptic systems are categorized into three (3) divisions, namely: human haptics, computer haptics and machine haptics. The sense of touch between object and human is known as human haptics. The machine input variable during object manipulation is designated as "human intuition". The variation of human control is converted into the output movement of the robotic mechanism. Computer haptics is described as the utilization of software algorithms for the simulation and computation of the manipulated object in order to describe the properties of the touch response. This paper focusses on the experimental analysis and modelling of conformity occurrences in part manipulation using active machine haptics. Machine haptics is explained as the touch interaction between an object and a machine. Here, the machine is a robotic gripper, and the interaction between the gripper and the object being


handled is known as machine haptics [7]. Sensory systems are employed to quantify machine haptics. Haptic devices for gripper systems are designed and developed to simulate and augment touch sensitivity and control for intelligent gripping solutions [8]. Haptic feedback with force-sensitive systems employed in part handling enables improved control, reducing the amount of human intervention [9]. The integrity of the part surface is preserved, and damage minimized, with the utilization of a force threshold. The force threshold is determined through the calculation of the stress required to produce surface damage. Haptic feedback reduces slippage during part handling operations. Haptic systems enhance conformity and part-machine interaction performance [10]. Haptic feedback systems are classified as passive or active according to the amount of external control intervention. Passive haptic feedback systems are implemented into gripping systems only to monitor grasping actions without energy manipulation of the actuators. Partial or full actuation control is performed through active haptic feedback. Active haptic control is performed through computer-generated simulations or programmed inputs for threshold control [11]. The design and development of the proposed gripper system employed an active haptic feedback control system. The haptic system monitored the active force applicable in grasping; additionally, the actuators were controlled through force feedback threshold values from the sensory system. The data input requirements and object orientation are required in active haptic gripper systems [12]. Therefore, the mobility of end-effectors in conjunction with robotic manipulators is improved through the implementation of active haptic feedback systems [13]. Complex machine mechanisms and movements are improved through minimizing the human interface as a result of haptic systems. Machine physiological presence is enhanced by utilizing intelligent control systems for flexible grippers.
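As a purely illustrative sketch (not the authors' controller), the loop below captures the active haptic idea described above: the measured grip force is driven towards a damage-safe threshold by proportionally adjusting the actuator command. The read_force and set_actuator callbacks, the gain and the stopping tolerance are hypothetical.

```python
# Illustrative active haptic grasp loop: increase actuator effort until the measured
# grip force reaches a threshold derived from the allowable surface stress, then hold.
def active_haptic_grasp(read_force, set_actuator, force_threshold, gain=0.05, steps=500):
    command = 0.0
    for _ in range(steps):
        force = read_force()                       # force-feedback sensory system
        error = force_threshold - force
        command += gain * error                    # proportional correction of grip effort
        command = max(0.0, min(1.0, command))      # saturate the actuator command
        set_actuator(command)
        if abs(error) < 0.02 * force_threshold:    # grip force stabilised near the threshold
            break
    return command
```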

1.3 Related Studies

Flexible and reconfigurable assembly stations consist of gripping and fixturing systems that possess adaptable capabilities for part handling and manipulation. Assembly procedures that are flexible and that consist of multi-task orientated stations possess the potential to reduce time loss over a production period. Time consumption is reduced through the implementation of flexible and/or modular gripping methods [14]. Prospective flexible grippers are categorized as malleable grippers, enveloping grippers and multi-fingered grippers [15]. The advantages, disadvantages and applicable gripping functions describe the functionality of the flexible grippers and are explained in Tables 1, 2 and 3. An eight (8) degree-of-freedom reconfigurable gripper was developed by [16]. The 3-finger robotic gripper manufactured possessed high-speed precision and demonstrated flexibility in handling multiple tasks. The gripper provides a function for both conventional and reconfigurable grasping, whereby the grasping approach configurations are interchangeable. The study illustrated effective conformity and high dexterity of a multi-finger gripper for grasping applications.


Table 1. Advantages of flexible grippers [15].

Design                  | Advantages
Multi-fingered grippers | Flexibility gripping for different object shapes, gripping with force feedback
Enveloping grippers     | Adaptability to mould around the object
Malleable grippers      | Adaptable to different shapes, reliable gripping

Table 2. Drawbacks of flexible grippers [15].

Design                  | Drawbacks
Multi-fingered grippers | Control complexity
Enveloping grippers     | Low force control capability
Malleable grippers      | Low gripping dexterity

Table 3. A significant application of flexible grippers [15].

Design                  | Significant application
Multi-fingered grippers | Grasping all shaped objects with force control
Enveloping grippers     | Grasping oddly shaped and unknown objects
Malleable grippers      | Grasping unknown and specially deformed objects

A passive adaptive gripper incorporating under-actuated gripper appendages was proposed by [17]. The gripper was developed for a 2-finger, 3-finger and 4-finger configuration. The grasping and conforming mechanism depended on the elastic properties of the material. The system was entirely self-conforming, controlling multiple appendages from one (1) actuation source.

1.4 Fin Ray Effect® Mechanism and Related Studies

The Fin Ray Effect® is illustrated (see Fig. 2) through the application of a force P onto a V-shaped structure (A). The V-shaped structure (B) deforms and conforms towards the applied force. The structure converges around the load, which represents a grasped object [18]. The Fin Ray Effect® is employed in flexible gripping techniques and presents an effective method for compliant geometric conformity [19]. A biologically inspired underactuated gripping system incorporating the Fin Ray Effect® was investigated by [20, 21]. The gripping system consisted of underactuated appendages designed to incorporate the Fin Ray Effect®. Tactile perception related to sensor integration and adaptive grippers improves object-type recognition and response [22]. The Fin Ray Effect® in conjunction with active feedback possesses the potential to improve dexterous manipulation, physical human-robot interaction, object recognition and grasping tasks. Research clarifies that fragile object handling and grasping control requirements in production environments are met through the application of flexible grippers in conjunction with active haptic control.


Fig. 2. The working principle of the Fin Ray Effect® [18].

Fragile objects that can be handled without damage include glass components, i.e. light bulbs, and fragile foods, i.e. eggs [5]. The main contributions of this research are summarized as follows:
– Investigation and development of performance characteristics of intelligent gripping systems based on a biological process.
– Investigation of maximum conformity through simulation of the proposed gripper system.
– Development of an intelligent grasping mechanism incorporating active haptic feedback control for grasping procedures.
– Experimental testing and comparison of passive and active haptic feedback for the proposed flexible gripper system.

2 Conceptual Design of Flexible Gripper

2.1 Closed-Loop Robotic System Integration

A flexible gripper system integrated into an assembly station for production can be described as a closed-loop robotic system (see Fig. 3). The closed-loop robotic system consists of a sensory system, a robotic gripper manipulator, a controller, a robotic arm, communication acquisition between subsystems and a computer interface [23]. The proposed manipulation system was established from a closed-loop robotic system [24].

Fig. 3. Conceptual closed-loop robotic system setup [24].

2.2 Design of Flexible Gripper

The system design comprised subsystems that together develop a flexible gripper system. The design consisted of a mechanical skeleton, an actuation system, a robotic arm, a control system and a sensory system (see Fig. 4). The system was developed according to the functionality of an end-effector framework that consists of input variables, processing of data, an actuation process and an output concerning grasping (see Fig. 5). The input and the process are governed by a control system. The control system receives input values from a sensory system [25].

Fig. 4. Gripper mechanical system outline [25].

Fig. 5. Feedback and control of gripper system [25].

The study was performed on a previously designed gripper system based on the Fin Ray Effect® that was developed by [24]. The system was designed for a 3-finger and a 4-finger configuration (see Figs. 6 and 7). The end-effector was designed with a motor actuator to control the appendages from a centralized force motion. The mechanical body was manufactured from Acrylonitrile Butadiene Styrene (ABS) through 3D printing. The Elastic Modulus (E) of ABS plastic is 2 GPa, the Poisson's Ratio (ν) is 0.4 and the Yield Stress (σy) is 45 MPa.


Fig. 6. Assembled 3-finger gripper design [24].

Fig. 7. Assembled 4-finger gripper design [24].

The appendages were designed from four (4) geometrical shapes and the structural selection was described by [24]. Geometry 1 was based on the traditional Fin Ray Effect® illustrating a parallel rib structure (see Fig. 8). Skewed rib structures were introduced in Geometry 2. The parallel rib structure in Geometry 3 was designed with a curvature. Geometry 4 combined Geometry 2 and Geometry 3, consisting of a combined slanted/curved structure. The geometrical changes to the structure investigated possible rib configurations to improve geometric conformity and maximize the deflection of the appendages based on the Fin Ray Effect®. Identifying possible rib configurations allowed the performance of the finger design to be maximized.

Fig. 8. Rib structure design for appendages [24].

2.3 Simulation for Optimal Conformity

The appendages were analyzed for maximum load conformity using a Finite Element Analysis (FEA). The FEA was performed using Siemens NX® simulation software. The rib structure was modelled according to Geometry 1, 2, 3 and 4. The loading surface was divided into 22 areas with a height of 4 mm along the length of the loading face as shown in C (see Fig. 9). A fixed constraint C was added to represent the fastened corners of the appendage as shown in B (see Fig. 9). A load P of 10 kN was applied to each of the surface areas and represented as load cases at a distance from the fingertip as shown in A (see Fig. 10). The load cases represented an object in contact along the length of the appendage. The simulation investigated the optimal object grasping position for maximum geometric conformity.

Fig. 9. A. Loads, constraints. B. Divided surfaces. C. All load cases from fingertip length in mm for the gripper appendage.

The simulation was performed on 22 load cases across the length of the appendage to determine the deflection conditions of Geometry 1, 2, 3 and 4. The maximum deformation cases are tabulated in Table 4. Geometry 1 showed its maximum deflection at the load case 58.3 mm from the fingertip (see Fig. 10). Geometry 2 demonstrated its maximum deformation at 66.3 mm (see Fig. 11). The FEA simulation of Geometry 3 produced its maximum deflection at 58.3 mm (see Fig. 12). Finally, Geometry 4 showed its maximum deformation at 66.3 mm (Fig. 13).

Table 4. Maximum deformation load cases for ABS plastic for non-linear statics.

Configuration | Length from fingertip (mm) | Deflection (mm) | Stress (MPa)
1             | 58.3                       | 2.832           | 24.60
2             | 66.3                       | 3.060           | 37.84
3             | 58.3                       | 2.843           | 24.14
4             | 66.3                       | 3.006           | 38.15


The results from the load cases were graphically represented and describe a characteristic curve for each of the configurations (see Fig. 14). An increase of deformation was illustrated towards a maximum conformity along the length of the appendage. The simulations of Geometry 1 and 3 produced similar characteristic curves and, likewise, similar characteristic curves were produced for Geometry 2

Fig. 10. Maximum deformation load case for ABS plastic for non-linear statics for Geometry 1.

Fig. 11. Maximum deformation load case for ABS plastic for non-linear statics for Geometry 2.

Fig. 12. Maximum deformation load case for ABS plastic for non-linear statics for Geometry 3.


Fig. 13. Maximum deformation load case for ABS plastic for non-linear statics for Geometry 4.

and 4. The data provides insight into the effect of the stiffness, governed by the length of the cross rib structure, on the maximum deflection of the appendage. Deflection is inversely proportional to the stiffness of the rib. Geometry 2 and 4 were proven to possess the highest conformity properties.

Fig. 14. Maximum deflection versus length from the fingertip for Geometry 1, 2, 3 and 4.

2.4 Robot Sensory System Integration

The haptic sensory system consisted of Force Sensitive Resistors (FSRs). The sensors were attached to the grasping face of the gripper (see Fig. 15). The grasping mechanism was comprised of an actuated flexible gripper and an active haptic feedback system. The end-effector was connected to a FANUC M-10iA robotic arm (see Fig. 15). An Arduino Mega 2560 microcontroller was linked to the sensory subsystem and the actuator control subsystem. The gripper was connected to and controlled from a control board.


Fig. 15. Installed sensory system and a frontal view of the installed gripper system [25].

3 Active Haptic Control

3.1 Pseudocode and Mechatronic Control

The pseudocode described the grasping and control procedure of the flexible gripper. A haptic control sensory system was included in the pseudocode (see Fig. 16). The operational process was initiated through the program. A written input procedure was recorded for the robotic arm to locate the grasping location of the object. The grasping action of the gripper was manipulated with the aid of a push-button control for opening and closing. The push-button control was employed for manual control in place of a coded equivalent procedure. A successful grasp was verified through force threshold detection. In the event of an unsuccessful grasp, the procedure was reinitiated. After the detection of a successful grasp, the gripper continues to conform around the object utilizing the push-button control and overcomes a lower force threshold value A. The threshold value initiates an automated response, continuing to enclose the object without external input until a high threshold value B is encountered. The gripper releases holding pressure until a force threshold C and then closes the gripper fingers until it again reaches the high threshold value B. Force value C is located at a certain fraction of force value B. Alternating between threshold values B and C allows for self-adjustment below the force required to damage the surface of the manipulated object. The described self-adjustment allows for force stabilization when the grip force experiences dynamic motion during pick and place procedures. The final portion of the procedure identifies the release location. The gripper was disengaged from the alternating threshold loop by overriding it with a release program. The object was released at the identified location. The program ended the procedure and started a new operation cycle.
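To make the alternating-threshold behaviour easier to follow, a minimal Python sketch of the procedure is given below. It is an illustrative reconstruction of the logic described above, not the authors' microcontroller code: the threshold values A, B and C, the simulated force model and the helper functions (read_force, step_close, step_open) are all assumptions introduced for demonstration. In the physical system the equivalent loop would run on the microcontroller, with the FSR read-out replacing the simulated force model.

```python
import random

# Hypothetical hardware helpers (stubs standing in for the FSR read-out and
# the stepper-motor commands in the real system).
_grip = 0.0

def read_force():
    # Simulated grip force reading (arbitrary units); noise mimics dynamic interference.
    return _grip + random.uniform(-0.05, 0.05)

def step_close():
    global _grip
    _grip += 0.01          # closing one step increases contact force in this toy model

def step_open():
    global _grip
    _grip = max(0.0, _grip - 0.01)

THRESHOLD_A = 0.5                  # assumed lower threshold (grasp detected)
THRESHOLD_B = 2.0                  # assumed upper threshold, below the surface-damage force
THRESHOLD_C = 0.6 * THRESHOLD_B    # release-to force, a fraction of B as described in the text

def grasp_cycle(n_steps=2000):
    # Stage 1: close until the lower threshold A confirms a successful grasp.
    while read_force() < THRESHOLD_A:
        step_close()

    # Stage 2: alternate between thresholds B and C (self-adjustment) until release.
    closing = True
    for _ in range(n_steps):       # stands in for "until the release program overrides"
        force = read_force()
        if closing:
            step_close()
            if force >= THRESHOLD_B:
                closing = False    # upper threshold reached: relieve pressure
        else:
            step_open()
            if force <= THRESHOLD_C:
                closing = True     # relieved down to C: close again

    # Stage 3: release the object at the identified location.
    while _grip > 0.0:
        step_open()

if __name__ == "__main__":
    grasp_cycle()
    print("cycle complete, final grip force:", round(read_force(), 3))
```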


Fig. 16. Operational pseudocode for the gripping system [5].

Fig. 17. Logic flow diagram of haptic gripper system architecture [5].

The mechatronic structure of the gripper system comprised a mechanical architecture and an electronic architecture (see Fig. 17). Manual input initiated the gripping sequence through a push-button control component. The haptic input signal from the FSR sensors was converted and relayed to the system controller. The stepper motor was manipulated through signal transmission from the stepper motor controller. The grip movement direction and intensity were manipulated through the motor actuator. The haptic control of the flexible gripper system is illustrated through a force feedback control loop (see Fig. 18). The force input values are represented as voltage inputs. Regulation of the object grasp intensity occurs between the threshold voltage values. An external power supply provided the micro-controller with a constant voltage to be transmitted to the associated electronic components.

The required signal was transmitted to the gripper. The FSR signal was relayed to the micro-controller and compared to the required signal. Any deviation in the force output was corrected, and the corrected signal value was utilized to manipulate the directional motion and intensity of the actuator.
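The correction step of the loop in Fig. 18 can be summarised in a few lines. The sketch below is illustrative only; the proportional gain and the mapping from signal error to stepper increments are assumptions, not values taken from the paper.

```python
# Illustrative proportional correction following the force feedback loop of Fig. 18:
# the measured FSR signal is compared with the required signal and the stepper motor
# is driven in the direction that reduces the difference.

def correction_steps(required_mv, measured_mv, gain=0.02):
    """Map the signal error (mV) to a signed number of stepper increments.

    A positive result closes the gripper (raises the grip force), a negative result
    opens it. The proportional gain is an assumed value for illustration.
    """
    return int(round(gain * (required_mv - measured_mv)))

# Example: the controller requests 3400 mV but the FSR reports 3250 mV,
# so the gripper is commanded to close by a few steps.
print(correction_steps(3400, 3250))   # -> 3
```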

3.2 Dynamic Interference Forces

The object grasped by the gripper experiences dynamic interference forces through the motion of the robotic arm [26]. The force attributes consist of radial and tangential forces, along with gravitational and Coriolis forces (see Fig. 19). The Lagrangian equation describes the sum of the force vectors and is defined in Eq. 1. The gripping system worked collectively to meet the self-adjustment requirements by regulating the combined force loading normal to the gripping surface.

$$a(q)\ddot{q} + b(q)[\dot{q}\dot{q}] + c(q)[\dot{q}^{2}] + g(q) = \tau \qquad (1)$$

Where:
q: Vector of joint angles.
a(q): Symmetric, bounded, positive definite inertia matrix.
c(q): Coriolis forces.
b(q): Centripetal forces.
g(q): Gravitational force.
τ: Vector of actuator torques.

Fig. 18. Force feedback control loop [5].


Fig. 19. Force attributes of a moving object [26].

4 Testing of Active Haptic Control

4.1 Visual Static Testing

The performance of the flexible gripper system was evaluated through a static mass holding test. Repeatability testing was performed on the 3-finger and 4-finger gripper configurations. A 1000 g test weight was initially grasped and held, and the weight was incrementally increased until a maximum test weight of 2435 g was reached. The test was repeated fifteen (15) times on each geometric configuration of the appendages (see Fig. 20). The testing exhibited a repeatability of 97.3% for Geometry 4, which was the best performing configuration [25].

Fig. 20. Dynamic test for 4-finger gripper holding a concentric sphere shape: (A) Geometry 1, (B) Geometry 2, (C) Geometry 3, and (D) Geometry 4 [25].

A dynamic qualitative visual gripping test was performed on the flexible gripper system for the 3-finger and 4-finger configurations of all four (4) geometries. Various objects were gripped to visually inspect conformity (see Fig. 20). The following shapes were gripped: a sphere (A), a cube (B), a triangular prism (C), a cylindrical extrusion screw (D) and a crazy cube with various shaped sides (E). The performance was tabulated over five (5) experimental repetitions according to whether the object slipped. The robotic arm was programmed to follow a path plan and the grasped object and gripper were subjected to the movement. The result of each repetition was denoted with a binary one (1) or zero (0) for YES and NO accordingly. The denotation YES signified that the object was grasped throughout the movement, and NO was denoted for a repetition whereby the object slipped out of the gripper's grasp. The results showed that the 4-finger configuration performed marginally better, as the 3-finger configuration was at a slight disadvantage when handling four-sided objects. The results are tabulated for the 3-finger gripper and 4-finger gripper in Tables 5 and 6 respectively.

Table 5. Visual dynamic slip test results of the 3-finger gripper for Geometry 1, 2, 3 and 4.

Configuration | A (Yes/No) | B (Yes/No) | C (Yes/No) | D (Yes/No) | E (Yes/No)
1             | 5/0        | 5/0        | 5/0        | 5/0        | 5/0
2             | 5/0        | 5/0        | 5/0        | 5/0        | 5/0
3             | 5/0        | 4/1        | 5/0        | 5/0        | 4/1
4             | 5/0        | 5/0        | 5/0        | 5/0        | 5/0

Table 6. Visual dynamic slip test results of the 4-finger gripper for Geometry 1, 2, 3 and 4.

Configuration | A (Yes/No) | B (Yes/No) | C (Yes/No) | D (Yes/No) | E (Yes/No)
1             | 5/0        | 5/0        | 5/0        | 5/0        | 5/0
2             | 5/0        | 5/0        | 5/0        | 5/0        | 5/0
3             | 5/0        | 5/0        | 5/0        | 5/0        | 5/0
4             | 5/0        | 5/0        | 5/0        | 5/0        | 5/0

4.2 Dynamic Testing Preparation

The dynamic tests were prepared by establishing a path plan for the robotic arm, as mentioned in a previous experiment (see Fig. 21). A pick and place procedure was simulated by importing a path plan for the robotic arm, which describes the motion that the end-effector experiences. The experiment was repeated multiple times for the same motion. The experiment was performed to simulate the conditions of an assembly procedure in an experimental environment.


Fig. 21. Graphical representation of the robotic arm path plan [25].

A voltage signal was transmitted from the FSR sensors to the microcontroller for haptic control. A calibration curve was established for the non-linear relation between mass loading and the voltage output readings of the FSR sensors, which arises from irregular material shearing properties (see Fig. 22). The calibration curve was generated for all four (4) sensors. The shear mechanics and changes in material thickness between sensors affected the voltage output of the sensors.

Fig. 22. Mass versus voltage calibration [5].

The potential difference applied in the calibration curve in conjunction with dynamic testing was calculated according to Eq. 2 [5]. The signal output generated from the sensors was calibrated accordingly in code on the microcontroller to produce a signal output in millivolts.

$$V_{out} = (val \times V_{in})/1024 \times 1000 \qquad (2)$$


Where:
V_out: Output voltage (mV)
val: Value reading output (−)
V_in: Input voltage (5 V)

The relative error was calculated to determine the variation in subsequently generated data signal values. The relative error is the difference between the actual value and the expected value divided by the expected value, expressed according to Eq. 3. The relative error was recorded across 4.5 s for all the sensors (see Fig. 23). The average relative error for all sensors across the period was calculated. The expected mass value, the expected voltage value and the average relative error are tabulated in Table 7.

$$err = 100 \times (Val_{act} - Val_{exp})/Val_{exp} \qquad (3)$$

Where:
err: Relative error (%)
Val_act: Value reading
Val_exp: Expected reading
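Equations 2 and 3 are straightforward to reproduce in code. The following Python sketch simply restates them; the raw ADC value used in the example is hypothetical, while the expected voltage is the Sensor 1 value from Table 7.

```python
def output_millivolts(val, v_in=5.0):
    # Eq. 2: convert a raw 10-bit ADC reading (0-1023) to an output voltage in mV.
    return val * v_in / 1024.0 * 1000.0

def relative_error(actual, expected):
    # Eq. 3: percentage difference between an actual and an expected reading.
    return 100.0 * (actual - expected) / expected

# Example with the Sensor 1 expectation from Table 7 (3378 mV expected);
# the raw reading of 693 is an assumed value for illustration.
reading = output_millivolts(693)
print(round(reading, 1), round(relative_error(reading, 3378.0), 3))
```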

Table 7. Expected and relative values for sensors [5].

Sensor   | Expected mass value (g) | Expected voltage (mV) | Average relative error (%)
Sensor 1 | 552.25                  | 3378                  | −0.0003
Sensor 2 | 429.02                  | 3534                  | −0.0043
Sensor 3 | 483.01                  | 3527                  | −0.0020
Sensor 4 | 296.08                  | 3173                  | 0.0027

Fig. 23. Relative error versus time for FSR sensors 1, 2, 3 and 4 [5].


4.3 Dynamic Performance Without Haptic Feedback Control

Dynamic testing was performed in previous research without implementing haptic feedback. The gripper was subjected to a low-velocity path plan of 250 mm/s and raw values were generated from the sensory system. The experiment was performed with the 3-finger and 4-finger grippers utilizing Geometry 4, since this appendage was proved to possess the highest performance. A specimen possessing a spherical shape was used to maintain full contact with all gripping surfaces and to maintain conformity. The gripper was cycled through the path plan, and the cycle was repeated five (5) times. The period of the cycles was subdivided into three (3) time lapses. The results were generated for the 3-finger and 4-finger configurations (see Figs. 24 and 26). The first time lapse consisted of a ten (10) second delay before a movement instruction was directed to the robotic arm. The second time lapse comprised the dynamic movement of the path plan. The third time lapse contained a ten (10) second pause for residual movement of the object in the grasp after the motion. The results show the self-conformity properties of the appendages. The grasping force increases over time as external forces are applied to the load. The potential difference increases over time, reflecting the grasp force. The difference in potential difference from the starting period is plotted against time for the 3-finger and 4-finger grippers (see Figs. 25 and 27). The characteristic curve was generated through a third-order polynomial trend line. The elastic properties in combination with the V-shape structure of the appendages allow for greater deflection around an object as the object experiences force components of the robotic arm movement. The gripper encloses further as a result of loading forces during motion.

Fig. 24. Voltage versus time for 3-finger gripper: Geometry 4 with 250 mm/s speed without haptic control [25].


Fig. 25. Third order trend line of voltage versus time for 3-finger gripper: Geometry 4 with 250 mm/s speed without haptic control.

Fig. 26. Voltage versus time for 4-finger gripper: Geometry 4 with 250 mm/s speed without haptic control [25].

Fig. 27. Third order trend line of voltage versus time for 4-finger gripper: Geometry 4 with 250 mm/s speed without haptic control.

4.4 Dynamic Performance of Haptic Feedback Control

Dynamic testing was performed on the flexible gripper system using a high-velocity path plan of 2000 mm/s. The conformity of grasping and the simultaneous contact of the sensors were maintained by utilizing a spherical object. An identical experimental procedure and path plan input were employed as in the previous dynamic experiment. Voltage output data values were generated and plotted on a voltage versus time graph (see Fig. 28).

Fig. 28. Voltage versus time for 4-finger gripper: Geometry 4 with 2000 mm/s speed with haptic control [5].

Force control was illustrated for high-speed applications by means of haptic feedback control. The increase in force due to self-conformity across the time phases was reduced, and the interference forces were decreased. The change in the potential difference output signal stabilized towards the data values at the start of the phases (see Fig. 29). The characteristic curve was generated through a third-order polynomial trend line.

Fig. 29. Third order trend line of voltage versus time for 3-finger gripper: Geometry 4 with 2000 mm/s speed with haptic control.


5 Discussion

5.1 Gripper Performance

The gripper system utilizing haptic feedback was analyzed, tested and evaluated according to performance criteria regarding conformity properties and loading efficiency. The performance of the system was evaluated according to repeatability, maximum static loading and maximum grip force for the 3-finger and 4-finger configurations. The simulation and experimental testing showed that Geometry 4 in combination with the 4-finger configuration was superior in performance. Haptic feedback provided data that describes the self-conformity and grasping characteristics of the developed Fin Ray Effect® appendages. The intended loading specification was one (1) kilogram, and the actual performance of the best performing configuration exceeded the specified loading condition by a safety factor of 1.687. Tables 8 and 9 give a summary of the performance evaluation.

Table 8. 3-finger gripper performance [25].

Configuration | Repeatability | Max static loading (g) | Dynamic loading (g) | Estimated grip force required (g)
1             | 89.2%         | 2177.05                | 320                 | 1660
2             | 93.7%         | 2297.99                | 320                 | 1607
3             | 92.4%         | 2338.34                | 320                 | 1687
4             | 97.3%         | 2418.87                | 320                 | 1257

Table 9. 4-finger gripper performance [25].

Configuration | Repeatability | Max static loading (g) | Dynamic loading (g) | Estimated grip force required (g)
1             | 93.4%         | 2344.80                | 320                 | 1504
2             | 96.9%         | 2418.89                | 320                 | 1420
3             | 95.9%         | 2398.64                | 320                 | 1479
4             | 98.6%         | 2459.15                | 320                 | 1146

The performance of the active haptic feedback control of the flexible gripper was assessed. The standard deviation of the voltage difference across the time period for a grasping operation was evaluated. The standard deviation represents the spread of the signal values. The variation of the signal quantifies the amount of force variation during the grasping operation. The variation is directly related to the control of the grip force. Consequently, active haptic feedback control manipulates the force signal directly. The standard deviation evaluation was performed on the 4-finger gripper utilizing Geometry 4. Table 10 shows the standard deviation of the voltage signals of each sensor installed on each appendage.


The standard deviation shows a decrease in force variation with active haptic control when compared to the gripper system without active haptic control.

Table 10. 4-finger haptic feedback performance using Configuration 4.

Sensor | Grip standard deviation at 250 mm/s, without active haptic control (mV) | Grip standard deviation at 2000 mm/s, with active haptic control (mV)
1      | 74  | 65
2      | 46  | 52
3      | 80  | 54
4      | 108 | 70

6 Conclusions

A biologically inspired flexible gripper employing an active haptic feedback system was evaluated. A deflection study was performed to determine the deformation along the length of the proposed appendages and the conformity characteristics of the gripper. The deflection and stress values were generated through FEA simulation and a deformation characteristic curve was produced to quantify the Fin Ray Effect®. Repeatability experiments provided quantitative results to determine the static grasping performance of the gripper and to justify the FEA simulation results. Grasp conformity of the gripper when handling various shapes was evaluated through visual dynamic testing. The gripper system was tested through experimental procedures to determine the performance of a passive haptic feedback control system. The combination of the Geometry 4 finger shape and the 4-finger configuration proved to possess the best performance in the testing performed. Testing without active haptic control in prior studies demonstrated self-conformity and force variation. Handled parts and objects are potentially damaged through high force impulses during handling and movement. The system without haptic feedback control was tested at lower operational speeds to provide sensitivity results of force fluctuations. The system was then tested employing active feedback control at higher operational speeds. The force regulation and the variation of the actuation loading signals were monitored and manipulated through the active haptic control. Higher speeds were implemented to illustrate the reduction in force variation under both functional and higher operational criteria. The active haptic system stabilized dynamic force fluctuations. Haptic feedback control possesses drawbacks pertaining to environment interaction: although the system is effective in reducing force impulses, it requires a quicker response to high force fluctuations. Biologically inspired flexible grippers with the aid of active haptic control possess the ability to improve part handling effectiveness and intelligence. Active haptic control allows handling of fragile parts and enhances intelligent handling in terms of orientation, texture and shape conformity.


References

1. Bouchard S (2014) Robotic gripper repeatability definition and measurement. RobotIQ, Saint-Nicolas
2. Koren Y, Shpitalni M (2010) Design of reconfigurable manufacturing systems. J Manuf Syst 29(4):130–141
3. Padayachee J, Bright G (2013) The design of reconfigurable assembly stations for high variety and mass customization manufacturing. S Afr J Ind Eng 24(3):43–57
4. Reddy PV, Suresh VV (2013) A review on the importance of universal grippers in industrial robot applications. Int J Mech Eng Robot 2(2):255–264
5. Basson CI, Bright G (2018) Active haptic control for biologically inspired gripper in reconfigurable assembly systems. In: International conference on informatics in control, automation and robotics, ICINCO, Porto, pp 91–100
6. Xia P (2016) Haptics for product design and manufacturing simulation. IEEE Trans Haptics 9(3):358–375
7. Deshpande CS (2015) The study of haptics in medical training. Int J Adv Res Comput Eng Technol (IJARCET) 4(5):2488–2491
8. Sadun AS, Jalani J, Jamil F (2016) Grasping analysis for a 3-finger adaptive robot gripper. In: 2nd IEEE international symposium on robotics and manufacturing automation, ROMA, Ipoh, pp 1–6
9. Belzile B, Birglen L (2014) A compliant self-adaptive gripper with proprioceptive haptic feedback. Auton Robot 36(1–2):79–91
10. Park TM, Won SY, Lee RS, Sziebig G (2016) Force feedback based gripper control on a robotic arm. In: 20th jubilee IEEE international conference on intelligent engineering systems, INES, Budapest, pp 107–112
11. Martin NA, Mittelstadt V, Prieur M, Stark R, Bar T (2013) Passive haptic feedback for manual assembly simulation. In: 46th conference on manufacturing systems, CIRP, Setubal, pp 509–514
12. Felip J, Bernabe MV, Morales A (2012) Contact-based blind grasping of unknown objects. In: 12th international conference on humanoid robots, IEEE-RAS, Osaka, pp 396–401
13. Lipina J, Tomek L, Krys V (2011) Gripper with adjustable grip force. Trans VŠB LVII(2):93–102
14. Molfino RM, Lacchini A, Maggiolo G, Michelini RC, Razzoli RP (1999) Re-engineering issues in automatic assembly. In: Globalization of manufacturing in the digital. Kluwer Academic, Boston, pp 603–616
15. Shahriari M, El-Sayed A-R (2016) State of the art robotic grippers and applications. Robotics 5(11):1–20
16. Spiliotopoulos J, Michalos G, Makris S (2018) A reconfigurable gripper for dextrous manipulation in flexible assembly. Inventions Innovation Adv Manuf 3(4):1–15
17. Petkovic D, Pavlovic ND, Shamshirband S, Anuar NB (2013) Development of a new type of passively adaptive compliant gripper. Ind Robot Int J 40(6):610–623
18. Pfaff O, Simeonov S, Cirovic I, Stano P (2011) Application of fin ray effect approach for production process automation. In: Annals of DAAAM for 2011 & proceedings of the 22nd international DAAAM symposium, vol 22(1), pp 1247–1248
19. Tharayil V, Babu E, Cherussery AA, Joy PJ, Umesh V (2017) Design, fabrication and analysis of three fingered fin gripper for KUKA KR6 R900 robot. In: SSRG International Journal of Mechanical Engineering (ICETM-2017), Special Issue, pp 42–44


20. Crooks W, Vukasin G, O'Sullivan M, Messner W, Rogers C (2016) Fin ray effect inspired soft robotic gripper: from the robosoft grand challenge toward optimization. Front Robot AI 3(70):1–9
21. Crooks W, Rozen-Levy S, Trimmer B, Rogers C, Messner W (2017) Passive gripper inspired by Manduca sexta and the Fin Ray® Effect. Int J Adv Robot Syst:1–7
22. Gandarias JM, Gómez-de-Gabriel JM, García-Cerezo AJ (2018) Enhancing perception with tactile object recognition in adaptive grippers for human-robot interaction. Sensors 18(3):692
23. Petruzella F (1996) Industrial electronics. McGraw-Hill, Singapore
24. Basson CI (2018) Part clamping and fixture geometric adaptability for reconfigurable assembly systems. The University of KwaZulu-Natal, Durban
25. Basson CI, Bright G, Walker AJ (2018) Testing flexible grippers for geometric and surface grasping conformity in reconfigurable assembly systems. South Afr J Ind Eng 29(1):128–142
26. Yang C, Ma H, Fu M (2016) Advanced technologies in modern robotic applications. Springer, Singapore

Quantifying Robotic Swarm Coverage

Brendon G. Anderson1, Eva Loeser2, Marissa Gee3, Fei Ren4, Swagata Biswas5, Olga Turanova6, Matt Haberland7(B), and Andrea L. Bertozzi8

1 Department of Mechanical Engineering, UC Berkeley, Berkeley, CA 94720, USA
2 Department of Mathematics, UC San Diego, San Diego, CA 92093, USA
3 Center for Applied Mathematics, Cornell University, Ithaca, NY 14853, USA
4 Industrial Engineering and Operations Research, UC Berkeley, Berkeley, CA 94720, USA
5 School of Mathematics, University of Edinburgh, Edinburgh EH9 3FD, Scotland
6 School of Mathematics, Institute for Advanced Study, Princeton, NJ 08540, USA
7 BioResource and Agricultural Engineering, Cal Poly, San Luis Obispo, CA 93407, USA
[email protected]
8 Department of Mathematics, UCLA, Los Angeles, CA 90095, USA

Abstract. In the field of swarm robotics, the design and implementation of spatial density control laws has received much attention, with less emphasis being placed on performance evaluation. This work fills that gap by introducing an error metric that provides a quantitative measure of coverage for use with any control scheme. The proposed error metric is continuously sensitive to changes in the swarm distribution, unlike commonly used discretization methods. We analyze the theoretical and computational properties of the error metric and propose two benchmarks to which error metric values can be compared. The first uses the realizable extrema of the error metric to compute the relative error of an observed swarm distribution. We also show that the error metric extrema can be used to help choose the swarm size and effective radius of each robot required to achieve a desired level of coverage. The second benchmark compares the observed distribution of error metric values to the probability density function of the error metric when robot positions are randomly sampled from the target distribution. We demonstrate the utility of this benchmark in assessing the performance of stochastic control algorithms. We prove that the error metric obeys a central limit theorem, develop a streamlined method for performing computations, and place the standard statistical tests used here on a firm theoretical footing. We provide rigorous theoretical development, computational methodologies, numerical examples, and MATLAB code for both benchmarks.

Keywords: Swarm robotics · Multi-agent systems · Coverage optimization · Central limit theorem

1 Introduction

Much of the research in swarm robotics has focused on determining control laws that elicit a desired group behavior from a swarm [6], and less attention
has been placed on methods for evaluating the performance of these controllers quantitatively. Both [6] and [9] point out the lack of developed performance metrics for assessing and comparing swarm behavior, and [6] notes that existing performance metrics are often too specific to the task being studied to be useful in comparing performance across controllers. This extended version of the authors’ paper [1] studies an error metric that evaluates one common desired swarm behavior: distributing the swarm according to a prescribed spatial density. Here, we delve deeper into the analysis of the properties of the error metric. Further, we include new examples, including a continuous target distribution to complement the original piecewise constant distribution, in order to better illustrate the utility of our methods. In many applications of swarm robotics, the swarm must spread across a domain according to a target distribution in order to achieve its goal. Some examples are in surveillance and area coverage [8,20,23,31], achieving a heterogeneous target distribution [4,10,14,16,17,34,41], and aggregation and pattern formation [28–30,36–38]. Despite the importance of assessing performance, some studies such as [28,29,34,38,41] rely only on qualitative methods such as visual comparison. Others present performance metrics that are too specific to be used outside of the specific application, such as measuring cluster size in [36], distance to a pre-computed target location in [10,31], and area coverage by tracking the path of each agent in [8]. Ergodicity has also been used as a notion of coverage error [2,26], but it quantifies only the time-average statistics of the robot trajectories rather than the effective coverage of the swarm at an instant in time. In [30] an L2 norm of the difference between the target and achieved swarm densities is considered, but the notion of achieved swarm density is particular to the controllers under study. These existing works do not provide methodology for carrying out computations with the notions they introduce. Moreover, they do not give guidance on how to interpret the values produced by the error metrics they define. Our work seeks to fill these gaps. We develop and analyze an error metric that quantifies how well a swarm achieves a prescribed spatial distribution. Our method is independent of the controller used to generate the swarm distribution, and thus has the potential to be used in a diverse range of robotics applications. In [18,25,40], error metrics similar to the one presented here are used, but their properties are not discussed in sufficient detail for them to be widely adopted. In particular, although the error metric that we study always takes values somewhere between 0 and 2, these values are, in general, not achievable for an arbitrary desired distribution and a fixed number of robots. Therefore, one needs a better understanding of the finer properties of the error metric in order to judge whether its values (and hence the performance of the underlying controller) are “good” or not. We address this by studying two benchmarks, 1. the extrema of the error metric, and 2. the probability density function (PDF) of the error metric when robot positions are sampled from the target distribution,

which were first proposed in [25]. Using tools from nonlinear programming for (1) and rigorous probability results for (2), we put each of these benchmarks on a firm foundation. In addition, we provide MATLAB code for performing all calculations at https://git.io/v5ytf. Thus, by using the methods developed here, one can assess the performance of a given controller for any target distribution by comparing the error metric value of robot configurations produced by the controller against benchmarks (1) and (2). Our paper is organized as follows. Our main definition, its basic properties, and a comparison to common discretization methods is presented in Sect. 2. Then, Sects. 3 and 4 are devoted to studying (1) and (2), respectively. We suggest future work in Sect. 5 and conclude in Sect. 6.

2 Quantifying Coverage

One difficulty in quantifying swarm coverage is that the target density within the domain is often prescribed as a continuous (or piecewise continuous) function, yet the swarm is actually composed of a finite number of robots at discrete locations. The approach we take here is to use the robot positions to construct a continuous function that represents coverage (e.g. [2,18,24,29,42]). It is also possible to use a combination of the two methods, as in [11]. The method we present and analyze is inspired by vortex blob numerical methods for the Euler equation and the aggregation equation (see [12] and the references therein). There are also similarities between our approach and the numerical methods known as smoothed particle hydrodynamics (SPH), which have also been applied in robotics [28,29,41]. One main idea behind vortex methods and SPH is to use particles to approximate a continuous function that represents the fluid. We apply this idea but with the opposite aim: namely, we represent discrete points (the robots' positions) with a continuous function. A similar strategy was alluded to in [18] and used in [25,40] to measure the effectiveness of a certain robotic control law, but to our knowledge, our work here and in [1] is the first to develop any such method in a form sufficiently general for common use. This section is devoted to our definition of the error metric and to its basic properties and computational considerations. We also provide a contrast between our method and another common way to measure error, namely, by discretizing the domain (e.g. [4,14]) in Subsect. 2.3. Finally, in Subsect. 2.4, we present the setup for the two main examples that we will use throughout the paper.

2.1 Preliminaries

We are given a bounded region¹ Ω ⊂ R^d, a desired robot distribution ρ : Ω → (0, ∞) satisfying ∫_Ω ρ(z) dz = 1, and N robot positions x_1, . . . , x_N ∈ Ω. To compare the discrete points x_1, . . . , x_N to the function ρ, we place a "blob" of some shape and size at each point x_i.

¹ We present our definitions for any number of dimensions d ≥ 1 to demonstrate their generality. However, in the latter sections of the paper, we restrict ourselves to d = 2, a common setting in ground-based applications.


The blob or robot blob function can be any function K : R^d → R that is non-negative on Ω and satisfies ∫_{R^d} K(z) dz = 1. In other contexts the blob function is referred to as a "kernel" [28,39,41]. Although more general blob shapes may be useful to represent properties of a robot's sensing and/or manipulation capabilities, it is often natural to use a K that is radially symmetric, decreases along radial directions, and enjoys certain integrability properties. We discuss these issues further in Subsect. 4.1. One choice of K could be a scaled indicator function, for instance, a function of constant value within a disc of radius 1 and 0 elsewhere. This is an appropriate choice when a robot is considered to either perform its task at a point or not, and where there is no notion of the degree of its effectiveness. For the remainder of this paper, however, we usually take K to be the Gaussian

$$G(z) = \frac{1}{2\pi} \exp\left(-\frac{|z|^2}{2}\right),$$

which is useful when the robot is most effective at its task locally and to a lesser degree some distance away. A list of other common choices of K and different ways of constructing multivariate blobs from single variable functions can be found in [39, Chaps. 2 and 4]. To formulate our definition, we need one more parameter, a positive number δ, which we call the blob radius. We then define K^δ as,

$$K^\delta(z) = \frac{1}{\delta^d} K\left(\frac{z}{\delta}\right). \qquad (1)$$

We point out that this rescaling preserves the L^1 norm of K, so that ∫_{R^d} K^δ(z) dz = 1 holds for all δ > 0. The shape of K and the blob radius δ have two physical interpretations as:
– the area in which an individual robot performs its task, or
– inherent uncertainty in the robot's position.
Either of these interpretations (or a combination of the two) can be invoked to make a meaningful choice of these parameters. To define the swarm blob function ρ^δ_N, we place a blob G^δ at each robot position x_i, sum over i and renormalize, yielding,

$$\rho^\delta_N(z; x_1, ..., x_N) = \frac{\sum_{i=1}^{N} G^\delta(z - x_i)}{\sum_{i=1}^{N} \int_\Omega G^\delta(z - x_i)\, dz}. \qquad (2)$$

For brevity, we usually write ρ^δ_N(z) to mean ρ^δ_N(z; x_1, ..., x_N). This swarm blob function gives a continuous representation of how the discrete robots are distributed. Note that each integral in the denominator of (2) approaches 1 if δ is small or all robots are far from the boundary, so that we have,

$$\rho^\delta_N(z) \approx \frac{1}{N} \sum_{i=1}^{N} G^\delta(z - x_i). \qquad (3)$$
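As a concrete illustration of the construction above, the following Python sketch evaluates the swarm blob function of Eq. (2) on a grid over a rectangular domain. It is an illustrative implementation (the authors provide their own MATLAB code); the Gaussian blob follows Eq. (1) with K = G, and the domain size, grid resolution and random robot positions are arbitrary assumptions for demonstration.

```python
import numpy as np

def swarm_blob(z1, z2, positions, delta, cell_area):
    """Evaluate the swarm blob function rho^delta_N of Eq. (2) on a grid.

    z1, z2    : 2-D arrays of grid coordinates covering the domain Omega
    positions : (N, 2) array of robot positions x_1, ..., x_N
    delta     : blob radius
    cell_area : area of one grid cell, used to approximate the integrals in
                the denominator of (2) by a rectangle rule
    """
    numerator = np.zeros_like(z1)
    denominator = 0.0
    for x, y in positions:
        # Gaussian blob G^delta(z - x_i) = exp(-|z - x_i|^2 / (2 delta^2)) / (2 pi delta^2)
        blob = np.exp(-((z1 - x) ** 2 + (z2 - y) ** 2) / (2 * delta ** 2)) / (2 * np.pi * delta ** 2)
        numerator += blob
        denominator += blob.sum() * cell_area   # approximates the integral over Omega
    return numerator / denominator

# Example: 200 robots placed uniformly at random in a 48 in. x 70 in. domain.
w, h, delta, n_robots = 48.0, 70.0, 2.0, 200
rng = np.random.default_rng(0)
positions = rng.uniform([0.0, 0.0], [w, h], size=(n_robots, 2))
m = 200                                             # grid points per axis
z1, z2 = np.meshgrid(np.linspace(0, w, m), np.linspace(0, h, m))
cell_area = (w / (m - 1)) * (h / (m - 1))
rho = swarm_blob(z1, z2, positions, delta, cell_area)
print(rho.shape, round(rho.sum() * cell_area, 3))   # integrates to about 1 over Omega
```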


Moreover, ρ^δ_N(z; x_1, ..., x_N) is a smoothed version of \sum_{i=1}^{N} \delta_{x_i}(z), which represents the physical robot positions (here, δ_{x_i}(z) denotes the Dirac mass at x_i). Indeed,

$$\lim_{\delta \to 0} \rho^\delta_N(z; x_1, ..., x_N) = \frac{1}{N} \sum_{i=1}^{N} \delta_{x_i}(z) \qquad (4)$$

in the sense of distributions [7]. This point of view gives yet another interpretation of ρ^δ_N. We now introduce our notion of error:

Definition 1. The error metric e^δ_N is defined as:

$$e^\delta_N(x_1, ..., x_N) = \int_\Omega \left| \rho^\delta_N(z) - \rho(z) \right| dz. \qquad (5)$$

This definition, including the definitions of K δ and ρδN , was introduced in this context by the authors in [1]. The relation (3) appeared there as well. Remarks and Basic Properties. Our error is defined as the L1 norm of the difference between the swarm blob function ρδN and the desired robot distribution ρ. One could use another Lp norm; however, p = 1 is a standard choice in applications that involve particle transportation and coverage such as [18,40]. Moreover, the L1 norm has a key property: for any two integrable functions f and g,

$$\int_\Omega |f - g| \, dz = 2 \sup_{B \subset \Omega} \left| \int_B f \, dz - \int_B g \, dz \right|.$$

The other L^p norms do not enjoy this property [15, Chap. 1]. Consequently, by measuring the L^1 norm on Ω, we are also bounding the error we make on any particular subset, and, moreover, knowing the error on "many" subsets gives an estimate of the total error. This means that by using the L^1 norm we capture the idea that discretizing the domain provides a measure of error, but avoid the pitfalls of discretization methods described in Subsect. 2.3. Studies in optimal control of swarms often use the L^2 norm due to the favorable inner product structure [40]. We point out that the L^1 norm is bounded from above by the L^2 norm: indeed, according to the Cauchy-Schwarz inequality, for any function f we have,

$$\int_\Omega f \, dz \le \left( |\Omega| \int_\Omega f^2 \, dz \right)^{1/2},$$

where |Ω| denotes the area of the bounded region Ω. Thus, if an optimal control strategy controls the L2 norm, then it will also control the error metric we present here. Last, we note that for any Ω, ρ, δ, N , and (x1 , ..., xN ), we have 0 ≤ eδN ≤ 2. This was established in Proposition 2.1 of [1]. The theoretical minimum of eδN

can only be approached for a general target distribution when δ is small and N is large, or in the trivial special case when the target distribution is exactly the sum of N Gaussians of the given δ, motivating the need to develop benchmarks (1) and (2).

Variants of the Error Metric. The notion of error in Definition 1 is suitable for tasks that require good instantaneous coverage. For tasks that involve tracking average coverage over some period of time (and in which the robot positions are functions of time t), an alternative "cumulative" version of the error metric is

$$\frac{1}{M} \sum_{j=1}^{M} \int_\Omega \left| \rho^\delta_N(z, t_j) - \rho(z) \right| dz \qquad (6)$$

for time points j = 1, . . . , M. This was defined in the authors' [1], and is similar to the metric used in [40]. It may also be likened to the approach of [2], which expresses the average over time using an integral rather than finite sum and measures the difference from the target distribution using the Kullback-Leibler divergence and Bhattacharyya distance instead of an L^1 norm. Although the cumulative error metric (6) is, in general, distinct from the instantaneous version of (5), note that the extrema and PDF of this cumulative version can be calculated in the same way as the extrema and PDF of the instantaneous error metric with MN robots. Therefore, in subsequent sections we restrict our attention to the extrema and PDF of the instantaneous formulation without loss of generality. In addition, [40] considers a one-sided notion of error, in which a scarcity of robots is penalized but an excess is not, that is,

$$\hat{e}^\delta_N = \int_{\Omega^-} \left| \rho^\delta_N(z) - \rho(z) \right| dz,$$

where Ω^− := {z | ρ^δ_N(z) ≤ ρ(z)}. The definition of ê^δ_N, which appears in [1], is particularly useful in conjunction with the choice of K^δ as a scaled indicator function, as ê^δ_N becomes a direct measure of the deficiency in coverage of a robotic swarm. For instance, given a swarm of surveillance robots, each with observational radius δ, ê^δ_N is the percentage of the domain not observed by the swarm.² Remarkably, ê^δ_N and e^δ_N are related by e^δ_N = 2ê^δ_N. This was established by the authors in Proposition 2.2 of [1]. This relationship implies that e^δ_N enjoys the interpretation of being a measure of deficiency, and also allows the techniques introduced here to be directly applied to ê^δ_N.

² The notion of "coverage" in [8] might be interpreted as ê^δ_N with δ as the width of the robot. There, only the time to complete coverage (t such that ê^δ_N(x_1(t), ..., x_N(t)) = 0) was considered.

2.2 Calculating e^δ_N

In practice, the integral in (5) can rarely be carried out analytically, primarily because the integral needs to be separated into regions for which the quantity ρ^δ_N(z) − ρ(z) is positive and regions for which it is negative, the boundaries between which are usually difficult to express in closed form. Hence we must rely on numerical integration, or quadrature. Here we study the magnitude E_m of the difference between the true value of the error metric e^δ_N and an approximation ẽ^δ_N as the number of grid points per axis m increases for three elementary quadrature rules: the rectangle rule, the trapezoidal rule, and Simpson's rule. As quadrature convergence rate proofs typically depend on smoothness of the integrand [13], we did not expect all of these rules to achieve their nominal rates of convergence. Indeed, in Fig. 1 we see that the trapezoidal rule converges as O(m^{-1.58}) instead of O(m^{-2}) and Simpson's rule converges as O(m^{-1}) instead of O(m^{-4}). As our applications do not require extremely accurate estimates of the error metric, other error metric values reported throughout the paper have been calculated using the rectangle rule with a moderate number of grid points. For applications that demand increased accuracy, we recommend an adaptive scheme such as that employed by MATLAB's integral2 function.
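For reference, a minimal Python sketch of the rectangle-rule computation described above is given below. It is illustrative rather than the authors' implementation: it uses the boundary-free approximation (3), a uniform placeholder target density, and arbitrary grid and swarm sizes.

```python
import numpy as np

def error_metric(positions, rho, w, h, delta, m=400):
    """Midpoint rectangle-rule approximation of the error metric (5) on an m x m grid.

    positions : (N, 2) array of robot positions in the rectangular domain [0, w] x [0, h]
    rho       : callable target density rho(z1, z2), assumed normalized over the domain
    Uses the boundary-free approximation (3) of the swarm blob function for brevity.
    """
    centers1 = np.linspace(w / (2 * m), w - w / (2 * m), m)
    centers2 = np.linspace(h / (2 * m), h - h / (2 * m), m)
    z1, z2 = np.meshgrid(centers1, centers2)
    cell = (w / m) * (h / m)
    rho_swarm = np.zeros_like(z1)
    for x, y in positions:
        rho_swarm += np.exp(-((z1 - x) ** 2 + (z2 - y) ** 2) / (2 * delta ** 2))
    rho_swarm /= 2 * np.pi * delta ** 2 * len(positions)
    return np.sum(np.abs(rho_swarm - rho(z1, z2)) * cell)

# Example with a uniform target density on a 48 in. x 70 in. domain (placeholder choice).
w, h = 48.0, 70.0
uniform = lambda z1, z2: np.full_like(z1, 1.0 / (w * h))
rng = np.random.default_rng(1)
robots = rng.uniform([0.0, 0.0], [w, h], size=(200, 2))
print(round(error_metric(robots, uniform, w, h, delta=2.0), 4))
```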

2.3 The Pitfalls of Discretization

Before concluding this section, we analyze a measure of error that involves discretizing the domain and show that the values produced by this method are strongly dependent on a choice of discretization. In particular, this error approaches its theoretical minimum when the discretization is too coarse and its theoretical maximum when the discretization is too fine, regardless of robot positions. Discretizing the domain means dividing Ω into M disjoint regions Ω_i ⊂ Ω such that ∪_{i=1}^{M} Ω_i = Ω. Within each region, the desired proportion of robots is the integral of the target density function within the region, ∫_{Ω_i} ρ(z) dz. Using N_i to denote the observed number of robots in Ω_i, we can define an error metric as

$$\mu = \sum_{i=1}^{M} \left| \int_{\Omega_i} \rho(z)\, dz - \frac{N_i}{N} \right|. \qquad (7)$$
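For comparison with the error metric e^δ_N, the short sketch below evaluates μ for a given set of robot positions on a rectangular grid of cells. It is an illustrative implementation under assumed choices (a midpoint rule for the cell masses and a uniform placeholder density); refining the grid illustrates the behaviour established in the proposition below.

```python
import numpy as np

def discretized_error(positions, rho, w, h, mx, my):
    """Evaluate the discretized error metric mu of Eq. (7) on an mx-by-my grid of
    rectangular cells. The desired cell masses are computed by a midpoint rule
    rather than exact integration (an assumption made for brevity)."""
    n = len(positions)
    dx, dy = w / mx, h / my
    mu = 0.0
    for i in range(mx):
        for j in range(my):
            x0, y0 = i * dx, j * dy
            desired = rho(x0 + dx / 2, y0 + dy / 2) * dx * dy   # integral of rho over the cell
            inside = np.count_nonzero(
                (positions[:, 0] >= x0) & (positions[:, 0] < x0 + dx) &
                (positions[:, 1] >= y0) & (positions[:, 1] < y0 + dy))
            mu += abs(desired - inside / n)                      # observed fraction N_i / N
    return mu

# With one cell mu is trivially 0; as the grid is refined it climbs toward 2.
w, h = 48.0, 70.0
uniform = lambda z1, z2: 1.0 / (w * h)
rng = np.random.default_rng(2)
robots = rng.uniform([0.0, 0.0], [w, h], size=(200, 2))
for cells in (1, 4, 16, 64):
    print(cells, round(discretized_error(robots, uniform, w, h, cells, cells), 3))
```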

This exact definition appeared in the authors’ [1], and is based on discrete error metrics used in practice, for instance in [4,14]. It is easy to check that 0 ≤ μ ≤ 2 always holds. One advantage of this approach is that μ is very easy to compute, but there are two major drawbacks. First, the choice for domain discretization is not unique, and this choice can dramatically affect the value of μ. Indeed, if M = 1 then μ = 0. This follows directly from the definition of μ and appears in [1] as Proposition 2.3. On the other hand, as the discretization becomes finer, the error approaches its maximum value 2, regardless of the robot positions. More precisely:


Fig. 1. Estimated error E_m = |e^δ_N − ẽ^δ_N| as the number of grid points per axis m increases. As the exact value of e^δ_N is unknown, we use the estimate provided by the MATLAB function integral2 using the 'iterated' method and a relative tolerance of 10^{-8}. The solid line is a best fit function of the form 10^a m^b for the trapezoidal rule, where a = 0.2782 and b = 1.577 were found using the MATLAB Curve Fit app. The particular configuration of robots is the optimal arrangement found (see Sect. 3.2) for the ring distribution from Definition 2 with N = 200 robots and δ = 2 in.

Proposition 1. Suppose the robot positions are distinct³ and the regions Ω_i are sufficiently small such that, for each i, Ω_i contains at most one robot and ∫_{Ω_i} ρ(z) dz ≤ 1/N holds. Then μ → 2 as |Ω_i| → 0.

This proposition and its proof appear in [1] as Proposition 2.4. Note that the shape of each region is also a choice that will affect the calculated value of μ. Although our approach also requires the choice of some size and shape (namely, δ and K), these parameters have much more immediate physical interpretations, making appropriate choices easier to make. The second, and perhaps more significant, drawback is that by discretizing the domain we also discretize the range of values that the error metric can assume. This means that we have simultaneously desensitized the error metric to changes in robot distribution within each region. That is, so long as the number of robots N_i within each region Ω_i does not change, the distribution of robots within any and all Ω_i may be changed arbitrarily without affecting the value of μ.

³ This is reasonable in practice as two physical robots cannot occupy the same point in space. In addition, the proof can be modified to produce the same result even if the robot positions coincide.


On the other hand, the error metric e^δ_N is continuously sensitive to differences in distribution.

2.4 Setup for Main Examples

We will perform computations with the following two desired distributions. We have purposefully chosen one that is piecewise continuous, ρ_ring, and one that is smooth, ρ_ripple.

Definition 2. The ring distribution ρ_ring is defined on the Cartesian plane with coordinates z = (z_1, z_2) as follows. Let inner radius r_1 = 11.4 in., outer radius r_2 = 20.6 in., width w = 48 in., height h = 70 in., domain Ω = {z : z_1 ∈ [0, w], z_2 ∈ [0, h]}, and region Γ = {z : r_1^2 < (z_1 − w/2)^2 + (z_2 − h/2)^2 < r_2^2}. Define the non-normalized ρ̄_ring(z) to be 36 if z ∈ Ω ∩ Γ and 1 if z ∈ Ω \ Γ. Then let C = ∫_Ω ρ̄_ring(z) dz and ρ_ring(z) = C^{-1} ρ̄_ring(z) for z ∈ Ω.

Definition 3. The ripple distribution ρ_ripple is defined on the Cartesian plane with coordinates z = (z_1, z_2) as follows. Let width w = 48 in., height h = 70 in., domain Ω = {z : z_1 ∈ [0, w], z_2 ∈ [0, h]}, non-normalized

$$\bar{\rho}_{ripple}(z) = 2 + \sin\left[3\pi\left(z_1^2 + z_2^2\right)^{1/2}\right] + 2\left(\frac{z_1^2}{w^2} + \frac{z_2^3}{h^3}\right),$$

and normalization factor C = ∫_Ω ρ̄_ripple(z) dz. Then ρ_ripple(z) = C^{-1} ρ̄_ripple(z) for z ∈ Ω.
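For concreteness, the two target densities can be coded directly from Definitions 2 and 3. The Python sketch below is illustrative (the authors' MATLAB code remains the reference); the normalization constants are approximated by a midpoint rectangle rule, and the ripple expression follows Definition 3 as reconstructed above.

```python
import numpy as np

W, H = 48.0, 70.0                      # domain [0, w] x [0, h], in inches
R1, R2 = 11.4, 20.6                    # inner and outer radii of the ring

def ring_unnormalized(z1, z2):
    r_sq = (z1 - W / 2) ** 2 + (z2 - H / 2) ** 2
    return np.where((r_sq > R1 ** 2) & (r_sq < R2 ** 2), 36.0, 1.0)

def ripple_unnormalized(z1, z2):
    return (2.0 + np.sin(3 * np.pi * np.sqrt(z1 ** 2 + z2 ** 2))
            + 2.0 * (z1 ** 2 / W ** 2 + z2 ** 3 / H ** 3))

def normalize(density, m=1000):
    # Approximate C = integral of the un-normalized density over Omega by a midpoint
    # rule, and return the normalized density rho = density / C.
    z1, z2 = np.meshgrid(np.linspace(W / (2 * m), W - W / (2 * m), m),
                         np.linspace(H / (2 * m), H - H / (2 * m), m))
    c = np.sum(density(z1, z2)) * (W / m) * (H / m)
    return lambda a, b: density(a, b) / c

rho_ring = normalize(ring_unnormalized)
rho_ripple = normalize(ripple_unnormalized)

# Sanity check: both normalized densities should integrate to approximately 1 over Omega.
m = 1000
z1, z2 = np.meshgrid(np.linspace(W / (2 * m), W - W / (2 * m), m),
                     np.linspace(H / (2 * m), H - H / (2 * m), m))
cell = (W / m) * (H / m)
print(round(np.sum(rho_ring(z1, z2)) * cell, 3), round(np.sum(rho_ripple(z1, z2)) * cell, 3))
```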

3 Error Metric Extrema

In the rest of the paper, we provide tools for determining whether or not the values of e^δ_N produced by a controller in a given situation are "good". As mentioned in Sect. 2.1, it is not possible to achieve e^δ_N = 0 for every combination of target distribution ρ, number of robots N, and blob radius δ. Therefore, we would like to compare the achieved value of e^δ_N against its realizable extrema given ρ, N, and δ. Unfortunately, e^δ_N is a nonlinear and moreover non-convex function of the robot positions (x_1, ..., x_N), and thus its extrema may elude analytical expression. Thus, we approach this problem with nonlinear programming.

3.1 Extrema Bounds via Nonlinear Programming

Let x = (x_1, ..., x_N) represent a vector of N robot coordinates. The optimization problem, first introduced in the authors' [1], is

minimize    e^δ_N(x_1, ..., x_N),
subject to  x_i ∈ Ω for i ∈ {1, 2, . . . , N}.     (8)

Note that the same problem structure can be used to find the maximum of the error metric by minimizing −eδN . Given ρ, N , and δ, we approach these problems using a standard nonlinear programming solver, MATLAB’s fmincon.
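The paper performs this optimization with MATLAB's fmincon. As an analogous (not equivalent) illustration, the sketch below poses problem (8) with SciPy's minimize and box bounds enforcing x_i ∈ Ω, using a deliberately small swarm, a uniform placeholder target density and the boundary-free approximation (3) so that it runs quickly; all numerical choices are assumptions for demonstration, and in practice the optimization would be repeated from many initial guesses (multi-start), as discussed below.

```python
import numpy as np
from scipy.optimize import minimize

W, H, DELTA, N, M = 48.0, 70.0, 2.0, 20, 80        # small problem so the sketch runs quickly

grid1 = np.linspace(W / (2 * M), W - W / (2 * M), M)
grid2 = np.linspace(H / (2 * M), H - H / (2 * M), M)
z1, z2 = np.meshgrid(grid1, grid2)
cell = (W / M) * (H / M)
rho = np.full_like(z1, 1.0 / (W * H))              # placeholder uniform target density

def error_metric(flat_positions):
    # e^delta_N for positions packed as [x1, y1, x2, y2, ...], using approximation (3).
    p = flat_positions.reshape(-1, 2)
    blob = np.zeros_like(z1)
    for x, y in p:
        blob += np.exp(-((z1 - x) ** 2 + (z2 - y) ** 2) / (2 * DELTA ** 2))
    blob /= 2 * np.pi * DELTA ** 2 * len(p)
    return np.sum(np.abs(blob - rho) * cell)

rng = np.random.default_rng(3)
x0 = rng.uniform([0.0, 0.0], [W, H], size=(N, 2)).ravel()
bounds = [(0.0, W), (0.0, H)] * N                  # keeps every robot inside Omega
result = minimize(error_metric, x0, method="L-BFGS-B", bounds=bounds)
print("local minimum found:", round(result.fun, 4))
```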


A limitation of all general nonlinear programming algorithms is that successful termination produces only a local minimum, which is not guaranteed to be the global minimum. Since there is no obvious formulation of this problem for which a global solution is guaranteed, we use the local minimum as an upper bound for the global minimum of the error metric. Heuristics such as multi-start (running the optimization many times from several initial guesses and taking the minimum of the local minima) can be used to make this bound tighter. We use e− to denote the best local minimum found and e+ to denote the best local maximum found. The bounds e− and e+ serve as benchmarks against which we can compare a value of the error metric achieved by a given control law. This is reasonable, because if a configuration of robots with a lower value of the error metric exists but eludes numerical optimization, it is probably not a fair standard against which to compare the performance of a general controller. Examples of extremal swarm blob functions found using this approach are shown in Figs. 2, 3, and 4.

Fig. 2. Swarm blob function ρ^{δ=2 in.}_{N=200} corresponding with the robot distribution that yields a locally minimal value of the error metric for the ring distribution, 0.28205. This figure appears in [1] as Fig. 1.

Relative Error. We introduce the notion of relative error, a quantitative way of assessing the performance of a robot distribution controller. This notion first appeared in the authors' [1].

Definition 4. Let e_observed denote the error value of a robot configuration. The relative error is defined as,

$$e_{rel} = \frac{e_{observed} - e^{-}}{e^{+} - e^{-}}. \qquad (9)$$


Fig. 3. Swarm blob function ρ^{δ=2.5 in.}_{N=178} corresponding with the robot distribution that yields a locally minimal value of the error metric for the ripple distribution, 0.06376.

In order to apply this definition, we need to specify how to find eobserved. If the robot positions x1, ..., xN produced by a given controller are constant, then eobserved can simply be taken as eδN(x1, ..., xN). In general, however, the positions x1, ..., xN may change over time, and it is natural to define eobserved based on the steady state value of eδN. We suggest defining the steady state settling time ts to be the time at which the error has settled to within 2% of its asymptotic value. Then, we propose taking eobserved to be the third-quartile value observed for t > ts, which we denote eQ3. We suggest that if erel is less than 10%, the performance of the controller is quite close to the best possible, and if this ratio is 30% or higher, the performance of the controller is rather poor. We summarize the procedure described in this section:
– Step 1: Given ρ, N and δ, compute e− and e+.
– Step 2: Compute eobserved for the desired swarm controller. That is,
  • Step 2.a: Calculate eδN(t) from robot trajectories x1(t), ..., xN(t) produced by the controller.
  • Step 2.b: Find the steady state settling time ts.
  • Step 2.c: Measure the third quartile value of the error metric eQ3 for t > ts. This is eobserved.
– Step 3: Evaluate erel according to Definition 4.
We emphasize that e− and e+ are independent of the particular controller; therefore, Step 1 has to be completed only once when comparing different controllers or different outcomes of running one controller.


Fig. 4. Swarm blob function ρ^{δ=2 in.}_{N=200} corresponding with the robot distribution that yields a locally maximal value of the error metric for the ring distribution, 1.9867. This occurs when all robots coincide outside the ring. This figure appears in [1] as Fig. 2.

Example. We apply this method to assess the performance of the controller in [25], which guides a swarm of N = 200 robots with δ = 2 in. (the physical radius of the robots) to achieve the ring distribution from Definition 2. These calculations were originally performed in the authors’ [1].

Step 1: Compute e− and e+. To determine an upper bound on the global minimum of the error metric, we computed 50 local minima of the error metric starting with random initial guesses, then took the lowest of these to be e− = 0.28205. An equivalent procedure bounds the global maximum as e+ = 1.9867, produced when all robot positions coincide near a corner of the domain. The corresponding swarm blob functions are depicted in Figs. 2 and 4. Note that the minimum of the error metric is significantly higher than zero for this finite number of robots of nonzero radius, emphasizing the importance of performing this benchmark calculation rather than using zero as a reference.

Step 2: Find eobserved. Under the stochastic control law of [25], the behavior of the error metric over time appears to be a noisy decaying exponential. Therefore, we fit to the data shown in Fig. 7 of [25] a function of the form f(t) = α + β exp(−t/τ) by finding error asymptote α, error range β, and time constant τ that minimize the sum of squares of residuals between f(t) and the data. By convention, the steady state settling time is taken to be ts = 4τ, which can be interpreted as the time at which the error has settled to within 2% of its asymptotic value [3]. The third quartile value of the error metric for t > ts is eQ3 = 0.5157.
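As a hedged illustration of this Step 2 computation (our sketch, not the script used in [1]; tData and eData are assumed vectors holding the sampled error trace, and eMin and eMax are the bounds e− and e+ from Step 1):

% Fit f(t) = alpha + beta*exp(-t/tau) to the error trace, then extract eQ3 and erel.
model = @(p, t) p(1) + p(2)*exp(-t/p(3));
p0    = [min(eData), max(eData) - min(eData), tData(end)/4];   % rough initial guess
p     = lsqcurvefit(model, p0, tData, eData);                  % least-squares fit
ts    = 4*p(3);                                                % 2% settling-time convention
eQ3   = quantile(eData(tData > ts), 0.75);                     % third quartile for t > ts
eRel  = (eQ3 - eMin) / (eMax - eMin);                          % relative error, Eq. (9)

Here lsqcurvefit is the same MATLAB routine used later in Subsect. 4.3; the initial guess p0 is an assumption and may need adjustment for a poorly conditioned trace.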


Step 3: Find erel. Using these values for eQ3, e−, and e+, we calculate erel according to Eq. 9 as 13.71%.

Remark 1. Although the sentiment of the erel benchmark is found in [25], we have formalized the calculation and made three important improvements to make it suitable for general use. First, we have placed the calculation of e− and e+ into a framework appropriate for nonlinear programming, so that the calculation is objective and repeatable. On the other hand, in [25], values analogous to e− and e+ were found by “manual placement” of robots. Second, we suggest a general way of evaluating error in time-dependent controllers. In particular, although [25] refers to steady state, a definition is not suggested there. Adopting the 2% settling time convention allows for an unambiguous calculation of eQ3 and other steady state error metric statistics. It also provides a metric for assessing the speed with which the control law effects the desired distribution. Finally, we suggest using the third quartile value eQ3 in the calculation, in contrast to the minimum observed value of the error metric used in [25]. Since eQ3 better represents the distribution of error metric values achieved by a controller, our notion of relative error is more representative of the controller’s overall performance. These changes account for the difference between our calculated value of erel = 13.71% and the report in [25] that the error is “7.2% of the range between the minimum error value. . . and maximum error value”. Our substantially higher value of erel indicates that the performance of this controller is not very close to the best possible. We emphasize this to motivate the need for our second benchmark in Sect. 4, which may be more appropriate for a stochastic controller like the one presented in [25].

3.2 Error Metric for Optimal Swarm Design

So far we have taken N and δ to be fixed; we have assumed that the robotic agents and size of the swarm have already been chosen. Now, we briefly consider the use of the error metric as an objective function for the design of a swarm. Adding δ > 0 as a decision variable to (8) yields the minimization problem,

minimize e^δ_N(x1, ..., xN), subject to xi ∈ Ω for i ∈ {1, 2, . . . , N}, δ > 0.   (10)

Solving (10) for several fixed values of N provides insight into how many robots and what effective working radius are needed to achieve a given level of coverage for a particular target distribution. Visualizations of such calculations are provided in Figs. 5 and 6, and Fig. 7 shows the optimal value of δ as a function of swarm size. Note that “breakthroughs”, or relatively rapid decreases in the error metric, can occur once a critical number of robots is available; these correspond with a qualitative change in the distribution of robots. For example, at N = 22 in Fig. 5 the robots are arranged in a single ring; beginning with N = 25 we see the robots start to be arranged in two separate concentric rings of different radii


Fig. 5. Swarm blob functions ρδN corresponding with the robot distributions and values of δ that yield the minimum value of the error metric for the ring distribution target. These plots appear in Fig. 3 of [1]. Inset graph shows the relationship between N and the minimum value of the error metric observed from repeated numerical optimization. For N < 256, the initial guess provided to the optimizer had robots uniformly randomly distributed within the domain. For N = 256 in [1], such a guess resulted in local minima with error values greater than those for smaller swarms, inconsistent with the intuitive notion that a larger swarm should be able to more accurately achieve the target distribution. To remedy this here, the initial guesses for N = 256 and higher (not shown) were taken with all robots between the inner and outer radius of the ring (i.e. within the region Γ from Definition 2). With these, the plot decreases monotonically for all values of N tested (including 300, 350, 400, 450, and 500), but progress appears slow beyond N = 256; doubling the number of robots decreases the error metric by only 10%.


Fig. 6. Swarm blob functions ρδN corresponding with the robot distributions and values of δ that yield the minimum value of the error metric for the ripple distribution target. Inset graph shows the relationship between N and the minimum value of the error metric observed from repeated numerical optimization.

and the error metric begins to drop sharply. On a related note, there are also “lulls” in which increasing the number of robots has little effect on the minimum value of the error metric, such as between N = 44 and N = 79. We leave for future work the task of finding a theoretical explanation for these breakthroughs and lulls. In the meantime, computations like these can help a swarm designer determine the best number of robots N and the effective radius δ of each robot needed to achieve the required coverage.

4 Error Metric Probability Density Function

In the previous section, we described how to find bounds on the minimum and maximum values for error and use these as benchmarks against which to measure


Fig. 7. Optimal blob radius δ to minimize error eδN along with best fit lines (log-log scale). Note that for both target distributions the optimal blob radius seems to scale with N^p where p ≈ −0.4. This relationship may be useful for approximating the effective radius of robot sensing/manipulation capabilities in order to maximize effectiveness for a given swarm size.

the performance of a given control law. However, stochastic control laws may tend to produce robot positions with observed error eobserved well above the minimum e−. For these controllers, erel may be too stringent of a measure for assessing whether the controller’s performance is “good” or “bad”, and so we propose an alternative. According to the setup of our problem, the goal of any control law is for the robots to achieve the desired distribution ρ. Thus, it is natural to compare the outcome of a given control law to simply picking the robot positions at random from the target distribution ρ. We take the robots’ positions X1, . . . , XN to be independent, identically distributed bivariate random vectors in Ω ⊂ R² with probability density function ρ. We place a blob of shape K at each of the Xi (previously we took K to be the Gaussian G), so that the swarm blob function is,

ρ^δ_N(z) = (1/N) ∑_{i=1}^{N} K^δ(z − Xi),   (11)

where K^δ is defined by (1). Equation (11) appears in the authors’ [1]. We point out that the right-hand sides of (11) and (3) agree upon taking K to be the Gaussian G in (11) and the robot locations xi to be the randomly selected Xi


in (3). Moreover, we have the relationship,

ρ^δ_N = K^δ ∗ ( (1/N) ∑_{i=1}^{N} δ_{Xi} ),   (12)

where ∗ denotes convolution. This equality is the reason (4) holds. In this context, the error eδN is a random variable, the value of which depends on the particular realization of the robot positions X1, . . . , XN, and thus has a well-defined probability density function (PDF) and cumulative distribution function (CDF). We denote the PDF and CDF by feδN and FeδN, respectively. The performance of a stochastic robot distribution controller can be quantitatively assessed by calculating the error values it produces in steady state and comparing their distribution to feδN. We introduce this benchmark in Subsect. 4.4.

4.1 Kernel Density Estimation

The approach we take in this section is closely linked to the statistical theory of kernel density estimation (KDE). For an overview, see, for example, the book [32]. Broadly, the aim of this area of study is to find the underlying density ρ from the values of identically distributed random variables sampled from this density. The expression (11) is the so-called kernel density estimator of ρ, and ρδN is considered as an approximation to ρ. In a sense, this is the reverse point of view from the one we take, since we are given ρ and seek to learn about the distribution of the Xi s. Nevertheless, we are able to apply ideas from this well-developed field of study to our context. There are many notions of error that are used in the context of KDE. We refer the reader to [39, Chap. 2] for a summary. Our notion eδN corresponds to integrated absolute error (IAE). The use of IAE instead of other notions, particularly ones involving the L2 norm, was advocated for in [15]. We conclude this subsection by applying two results from KDE theory to ideas we’ve presented so far. Then, in Subsect. 4.2 we present rigorous results that show that the error metric has an approximately normal distribution when the Xi are sampled from ρ. As a corollary we obtain that the limit of this error is zero as N approaches infinity and δ approaches 0. Subsections 4.3 and 4.4 include a numerical demonstration of these results. The theoretical results presented in the next subsection not only support our numerical findings, but they also allow for faster computation. Indeed, if one did not already know that the error when robots are sampled randomly from ρ has a normal distribution for large N , tremendous computation may be needed to get an accurate estimate of this probability density function. On the other hand, since the results we present prove that the error metric has a normal distribution for large N , we need only fit a Gaussian function to the results of relatively little computation. In Subsect. 4.4 we present an example calculation demonstrating the utility of this approach. Optimal δ(N ). In Subsect. 3.2, we included δ as a parameter in the optimization problem (8) and numerically obtained an estimate for the optimal δ given


a swarm size N. This problem has also been studied in the context of KDE. In [33] it was found that the optimal choice is, δ*(N) = C N^{−1/6}, where C is a constant that does not depend on N. This is Eq. (2.3) of [33] in the 2-dimensional case. Here, “optimal” means that this choice of δ minimizes the expectation of the L1 error between ρ and ρδN. This result holds under the hypotheses that K satisfies (13) and ρ has bounded second derivatives. We remark that the optimal δ(N) minimizing (10) that was found in Subsect. 3.2 and shown in Fig. 7 scales with N^p where p ≈ −0.4 ≠ −1/6. Despite the difference in exponent, these computations do not contradict the results of [33], as the latter considers the radius δ that minimizes the expected error value, and hence takes into account robot positions that are not optimal. On the other hand, the values of δ(N) computed in Subsect. 3.2 are optimal only in concert with the optimal robot positions. The contrast between the two situations suggests that for the design of a swarm as described in Subsect. 3.2, the sensing/manipulation radius required to maximize coverage may depend on whether the controller is stochastic or deterministic.

Choice of K. In the context of KDE, there is a notion of how efficient a given kernel K is. This notion is formulated in terms of L2 error between ρ and ρδN, and therefore the precise details do not directly apply to our situation. However, we point out that it is known [39, Sect. 2.7] that, so long as the kernel K is non-negative, has integral 1, and satisfies

∫ z K(z) dz = 0,  ∫ z² K(z) dz < ∞,   (13)

then the particular choice of K has only a marginal effect on efficiency. We leave to future work the extension of this concept to our context.

4.2 Theoretical Central Limit Theorem

It turns out that, under appropriate hypotheses, the L1 error between ρ and ρδN has a normal distribution with mean and variance that approach zero as N approaches infinity. In other words, a central limit theorem holds for the error. The first such result was obtained in [15] in the one-dimensional case. Previously, similar results, such as those in [5], were available only for L2 notions of error, which are easier to analyze. For any result of central limit theorem type to hold, δ and N have to be compatible. Thus, for the remainder of this subsection δ will depend on N, and we display this as δ(N). We have,

Theorem 1. Suppose ρ is twice continuously differentiable, and K is zero outside of some bounded region and radially symmetric. Then, for δ(N) satisfying

δ(N) = O(N^{−1/6})  and  lim_{N→∞} δ(N) N^{1/4} = ∞,   (14)



we have

e^{δ(N)}_N ≈ N( e(N)/N^{1/2}, σ(N)² δ(N)²/N ),

where σ²(N) and e(N) are deterministic quantities that are bounded uniformly in N. (Here N(μ, σ²) denotes the normal random variable of mean μ and variance σ², and we use the notation ≈ to mean that the difference of the quantity on the left-hand side and on the right-hand side converges to zero in the sense of distributions as N → ∞.) From Theorem 1 it is easy to deduce:

Corollary 1. Under the hypotheses of Theorem 1, the error e^{δ(N)}_N converges in distribution to zero as N → ∞.

Corollary 1, Theorem 1, and the proof of Theorem 1 (which follows from [22, Theorem, p. 1935]) appear in the authors’ [1].

Remark 2. There are a few ways in which practical situations may not align perfectly with the assumptions of Theorem 1. However, we posit that in all of these cases, the differences are numerically insignificant. Moreover, we believe that a stronger version of Theorem 1 may hold, one which applies in our situation. We now briefly summarize these three discrepancies and indicate how to resolve them. First, we defined our density ρδN by (2), but in this section we use a version with denominator N. However, as explained above, the two expressions approach each other for small δ, and this is the situation we are interested in here. We believe that the result of Theorem 1 should hold with (2) instead of (11). Second, in one of the two examples we consider here, the desired density ρ is only piecewise continuous but not twice differentiable. We point out that an arbitrary density ρ may be approximated to arbitrary precision by a smoothed out version, for example by convolution with a mollifier (a standard reference is [7, Sect. 4.4]). Moreover, the main results of [19] give a version of Theorem 1 that requires only that ρ is Lebesgue measurable (piecewise continuous functions satisfy this hypothesis). However, the results of [19] only apply in 1 dimension. Generalizing the results of [19] to several spatial dimensions is an interesting open problem in KDE, but is beyond the scope of our work here. Third, in our computations we use the kernel G, which is not compactly supported, for the sake of simplicity. Similarly, this kernel can be approximated, with arbitrary accuracy, by a compactly supported version. Making these changes to the kernel or target density would not affect the conclusions of the numerical results.

4.3 Numerical Approximation of the Error Metric PDF

In this subsection we describe how to numerically find feδN and FeδN. According to Theorem 1, for sufficiently large N, one could simply use random sampling


to estimate the mean and standard deviation, then take these as the parameters of the normal PDF and CDF. However, for moderate N, we choose to begin by estimating the entire CDF and confirming that it is approximately normal. To this end, we apply:

Proposition 2. We have,

F_{eδN}(z) = ∫_{Ω^N} 1_{{x | eδN(x) ≤ z}} ∏_{i=1}^{N} ρ(xi) dx.   (15)

Here 1 denotes the indicator function. Proposition 2 and its proof appear in the authors’ [1]. Notice that, since each of the xi is itself a 2-dimensional vector as the Xi are random points in the plane, the integral defining the cumulative distribution function of the error metric is of dimension 2N . Finding analytical representations for the CDF quickly becomes infeasible for large swarms. Therefore, we approximate (15) using Monte Carlo integration, which is well-suited for highdimensional integration [35]5 . Next, we fit a Gauss error function (erf(·), the integral of a Gaussian G) to the data. If the fitted curve matches the data well, we differentiate to obtain the PDF. We remark that we express the integral in (15) in terms of an indicator function in order to express the quantity of interest in a way that is easily approximated with Monte Carlo integration. Computation. We apply Proposition 2 to numerically find the PDF feδN and the CDF FeδN for ρ = ρring (see Definition 2), N = 200, and δ = 2 in. We approximate FeδN using M = 1000 Monte Carlo evaluation points; this is shown by a solid gray line in Fig. 8. The numerical approximation appears to closely match a Gauss error function as theory predicts. Therefore an analytical erf(·) curve, represented by the dashed line, is fit to the data using MATLAB’s least squares curve fitting routine lsqcurvefit. To obtain feδN , the analytical curve fit for FeδN is differentiated, and the result is also shown in Fig. 8. In addition, we calculate the mean and standard deviation of eδN to be, respectively, 0.4933 and 0.02484. 4.4

4.4 Benchmark for Stochastic Controllers

In this subsection we describe how to use the tools developed so far to assess the performance of a given stochastic controller for a given desired distribution ρ, number of robots N, and blob radius δ. We then demonstrate the method via an example.


Fig. 8. The CDF of the error metric when robot positions are sampled from ρ is approximated by Monte Carlo integration, an erf(·) curve fit matches closely, and the PDF is taken as the derivative of the fitted erf(·). This figure appears as Fig. 4 in the authors’ [1].

We propose using two standard statistical tests, the two-sample t- and F-tests [27], to assess whether the performance of the control law is comparable to sampling robot positions from the target distribution. More precisely, these tests provide criteria for the rejection of the null hypothesis that the mean and variance of the steady state values of the error metric produced by a given control law are indistinguishable from the mean and variance of feδN. This is summarized in the following procedure, which assumes that feδN is approximately normal. This assumption is reasonable due to Theorem 1.
– Step 1: Numerically find the mean and variance of feδN using either of the methods described in Subsect. 4.3.
– Step 2: Find the mean and variance of the steady state error metric values produced by the controller.
– Step 3: Perform two-sample t- and F-tests to compare the means and variances, respectively.
– Step 4: Assess the results.
  • Step 4.a: If the tests fail to refute the null hypotheses, the performance of the controller is consistent with sampling robot positions from the target distribution.
  • Step 4.b: If the two-sample t-test suggests that the means are significantly different, find the 95% confidence interval of their difference to assess the magnitude of the controller’s performance surplus or deficiency.
Just as for the benchmark erel defined in Sect. 3, the first step is independent of choice of controller. Thus, if the goal is to compare several controllers, the computations of Step 1 need to be performed only once.

Example. We apply this method to assess the performance of the controller in [25], where the desired distribution is ρ = ρring (see Definition 2), there are


N = 200 robots, and the radius is δ = 2 in. These calculations were originally performed in the authors’ [1]. Step 1, the computation of the mean and variance of eδN when robot positions are sampled at random from ρring, was completed in Subsect. 4.3. For Step 2, we use the data presented as Fig. 7 of [25] to calculate that the distribution of steady state error metric values produced by their controller has a mean of 0.5026 with a standard deviation of 0.02586. These values are summarized in Table 1.

Table 1. Summary of mean and standard deviation of error metric values for the example in Subsect. 4.4.

                               Mean    Standard deviation
Positions sampled from ρring   0.4933  0.02484
Steady state error values      0.5026  0.02586

Next we perform the two-sample F-test and two-sample t-test. We take the null hypothesis to be that the distribution of these error metric values is the same as feδN. The results are summarized in Table 2.

Table 2. Summary of statistical tests for the example in Subsect. 4.4.

F-statistic  1.0831
t-statistic  8.5888
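With MATLAB's Statistics and Machine Learning Toolbox, these two tests can be run directly on the two samples of error values; a minimal sketch (assuming vectors eSteady of steady-state error values from the controller and eSampled of error values from positions sampled from ρ, as computed above):

% Two-sample t-test (means) and two-sample F-test (variances).
[hT, pT, ciT, statT] = ttest2(eSteady, eSampled);
[hF, pF, ciF, statF] = vartest2(eSteady, eSampled);
fprintf('t-statistic %.4f, F-statistic %.4f\n', statT.tstat, statF.fstat);
fprintf('95%% CI for difference of means: [%.5f, %.5f]\n', ciT(1), ciT(2));

The confidence interval returned by ttest2 is exactly the quantity required in Step 4.b of the procedure above.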

The F-statistic of 1.0831 fails to refute the null hypothesis, indicating that there is no significant difference in the standard deviations. On the other hand, a two-sample t-test rejects the null hypothesis with a t-statistic of 8.5888, indicating that the steady state error is not distributed with the same population mean as feδN . Nonetheless, the 95% confidence interval for the true difference between population means is computed to be merely (0.00717, 0.01141). This shows that the mean steady state error achieved by this controller is unlikely to exceed that of feδN by more than 2.31%. Therefore, we find the performance of the controller in [25] to be acceptable given its stochastic nature, as the error metric values it realizes are only slightly different from those produced by sampling robot positions from the target distribution. Remark 3. As with erel of Sect. 3, the sentiment of this benchmark is preceded by [25]. However, here we have presented a concrete, repeatable, and objective approach that makes this benchmark suitable for general use. In particular, we have introduced into this context the use of appropriate statistical tests for comparing two approximately normal distributions. On the other hand, [25] relies on visual inspection, noting, “the error values [from simulation] mostly lie between


. . . the 25th and 75th percentile error values when robot configurations are randomly sampled from the target distribution”. The fact that feδN is approximately Gaussian (according to Theorem 1) not only allows for the use of these statistical tests, but also provides a way to greatly speed up computation. Indeed, the calculations in [25] (according to the caption of Fig. 2 of [25], that figure was generated as a histogram from 100,000 Monte Carlo samples) took two orders of magnitude more computation than we perform in Subsect. 4.3.

5 Future Work

Several open problems remain regarding the computational and theoretical aspects of our proposed performance assessment methods. For instance, the benchmark calculations scale quickly with the size of the swarm, and therefore they become unfeasible for a standard personal computer with swarms on the order of 1000 robots. To extend our proposed methods to larger swarms, computational techniques should be improved. Possibilities include the derivation of an analytical representation of the error metric PDF in terms of the target distribution, more sophisticated optimization algorithms to compute the extrema, and alternate formulations of the optimization problem that reduce the number of variables or introduce guarantees on the global optimality of a resulting solution. As for theoretical extensions, analysis of the shape and size of the robot blob function, as well as a rigorous understanding of the optimal radius, are intriguing topics for further study. Specifically, if a certain control law were characterized as “good” with one choice of blob function, would the same conclusion be reached with a different blob function? These questions were raised by the authors’ [1], and here we provide possible routes to answering them, as well as connections to known results in the theory (see Subsect. 4.1).

6 Conclusion

This work deals with the performance assessment of spatial density control of robotic swarms. We introduce a sensitive error metric and two corresponding benchmarks, one based on the realizable extrema of the error metric and one based on its probability density function, which provide methods for evaluating the performance of control algorithms. The parameters of the error metric correspond to physical properties such as robot shape and size, and can therefore be tailored to the particular system under study. The benchmark calculations then provide reference points to which the performance of different control schemes can be compared. The error metric and these benchmarks were first carefully defined in the authors’ [1]; here, we have made these definitions more transparent, and, in Subsects. 3.1 and 4.4, we have added step-by-step methods for performing the relevant computations. This will allow swarm designers and practitioners to more easily use these benchmarks to quantitatively decide which control scheme 6



best fits their application. As demonstrated by the examples, including the new example featured in Subsects. 3.1 and 3.2, these evaluation methods work well for a variety of target distributions and can be applied to judge the performance of both deterministic and stochastic control algorithms. Acknowledgements. The authors gratefully acknowledge the support of NSF grant CMMI-1435709, NSF grant DMS-1502253, the Dept. of Mathematics at UCLA, and the Dept. of Mathematics at Harvey Mudd College. OT acknowledges support from the Charles Simonyi Endowment at the Institute for Advanced Study. The authors also thank Hao Li (UCLA) for his insightful contributions regarding the connection between this error metric and kernel density estimation.

References
1. Anderson BG, Loeser E, Gee M, Ren F, Biswas S, Turanova O, Haberland M, Bertozzi A (2018) Quantitative assessment of robotic swarm coverage. In: Proceedings of the 15th international conference on informatics in control, automation, and robotics (ICINCO 2018), vol 2, pp 91–101
2. Ayvali E, Salman H, Choset H (2017) Ergodic coverage in constrained environments using stochastic trajectory optimization. In: 2017 IEEE/RSJ international conference on intelligent robots and systems (IROS), pp 5204–5210
3. Barbosa RS, Machado JT, Ferreira IM (2004) Tuning of PID controllers based on Bode's ideal transfer function. Nonlinear Dyn 38(1–4):305–321
4. Berman S, Kumar V, Nagpal R (2011) Design of control policies for spatially inhomogeneous robot swarms with application to commercial pollination. In: 2011 IEEE international conference on robotics and automation (ICRA). IEEE, pp 378–385
5. Bickel PJ, Rosenblatt M (1973) On some global measures of the deviations of density function estimates. Ann Statist 1(6):1071–1095. https://doi.org/10.1214/aos/1176342558
6. Brambilla M, Ferrante E, Birattari M, Dorigo M (2013) Swarm robotics: a review from the swarm engineering perspective. Swarm Intell 7(1):1–41
7. Brezis H (2010) Functional analysis, Sobolev spaces and partial differential equations. Springer, Heidelberg
8. Bruemmer DJ, Dudenhoeffer DD, McKay MD, Anderson MO (2002) A robotic swarm for spill finding and perimeter formation. Technical report, Idaho National Engineering and Environmental Lab, Idaho Falls
9. Cao YU, Fukunaga AS, Kahng A (1997) Cooperative mobile robotics: antecedents and directions. Auton Robots 4(1):7–27
10. Chaimowicz L, Michael N, Kumar V (2005) Controlling swarms of robots using interpolated implicit functions. In: Proceedings of the 2005 IEEE international conference on robotics and automation, pp 2487–2492. https://doi.org/10.1109/ROBOT.2005.1570486
11. Cortes J, Martinez S, Karatas T, Bullo F (2004) Coverage control for mobile sensing networks. IEEE Trans Rob Autom 20(2):243–255
12. Craig K, Bertozzi A (2016) A blob method for the aggregation equation. Math Comput 85(300):1681–1717
13. Cruz-Uribe D, Neugebauer C (2002) Sharp error bounds for the trapezoidal rule and Simpson's rule. J Inequal Pure Appl Math 3(4):1–22


14. Demir N, Eren U, Açıkmeşe B (2015) Decentralized probabilistic density control of autonomous swarms with safety constraints. Auton Robots 39(4):537–554
15. Devroye L, Győrfi L (1985) Nonparametric density estimation: the L1 view. Wiley, Hoboken
16. Elamvazhuthi K, Adams C, Berman S (2016) Coverage and field estimation on bounded domains by diffusive swarms. In: 2016 IEEE 55th conference on decision and control (CDC). IEEE, pp 2867–2874
17. Elamvazhuthi K, Berman S (2015) Optimal control of stochastic coverage strategies for robotic swarms. In: 2015 IEEE international conference on robotics and automation (ICRA). IEEE, pp 1822–1829
18. Eren U, Açıkmeşe B (2017) Velocity field generation for density control of swarms using heat equation and smoothing kernels. IFAC-PapersOnLine 50(1):9405–9411. https://doi.org/10.1016/j.ifacol.2017.08.1454. 20th IFAC World Congress
19. Giné E, Mason DM, Zaitsev AY (2003) The L1-norm density estimator process. Ann Probab 31(2):719–768
20. Hamann H, Wörn H (2006) An analytical and spatial model of foraging in a swarm of robots. In: International workshop on swarm robotics. Springer, Heidelberg, pp 43–55
21. Hartinger J, Kainhofer R (2006) Non-uniform low-discrepancy sequence generation and integration of singular integrands. In: Monte Carlo and Quasi-Monte Carlo methods 2004. Springer, Heidelberg, pp 163–179
22. Horváth L (1991) On Lp-norms of multivariate density estimators. Ann Stat 19:1933–1949
23. Howard A, Mataric MJ, Sukhatme GS (2002) Mobile sensor network deployment using potential fields: a distributed, scalable solution to the area coverage problem. Distrib Auton Rob Syst 5:299–308
24. Hussein II, Stipanovic DM (2007) Effective coverage control for mobile sensor networks with guaranteed collision avoidance. IEEE Trans Control Syst Technol 15(4):642–657
25. Li H, Feng C, Ehrhard H, Shen Y, Cobos B, Zhang F, Elamvazhuthi K, Berman S, Haberland M, Bertozzi AL (2017) Decentralized stochastic control of robotic swarm density: theory, simulation, and experiment. In: 2017 IEEE/RSJ international conference on intelligent robots and systems (IROS). IEEE
26. Mathew G, Mezić I (2011) Metrics for ergodicity and design of ergodic dynamics for multi-agent systems. Physica D: Nonlinear Phenom 240(4):432–442
27. Moore DS, McCabe GP, Craig BA (2009) Introduction to the Practice of Statistics
28. Pimenta LCA, Michael N, Mesquita RC, Pereira GAS, Kumar V (2008) Control of swarms based on hydrodynamic models. In: 2008 IEEE international conference on robotics and automation, pp 1948–1953. https://doi.org/10.1109/ROBOT.2008.4543492
29. Pimenta LCA, Pereira GAS, Michael N, Mesquita RC, Bosque MM, Chaimowicz L, Kumar V (2013) Swarm coordination based on smoothed particle hydrodynamics technique. IEEE Trans Rob 29(2):383–399. https://doi.org/10.1109/TRO.2012.2234294
30. Reif JH, Wang H (1999) Social potential fields: a distributed behavioral control for autonomous robots. Rob Auton Syst 27(3):171–194
31. Schwager M, McLurkin J, Rus D (2006) Distributed coverage control with sensory feedback for networked robots. In: Robotics: science and systems
32. Scott DW (2015) Multivariate density estimation: theory, practice, and visualization, 2nd edn


33. Scott DW, Wand M (1991) Feasibility of multivariate density estimates. Biometrika 78(1):197–205. https://doi.org/10.1093/biomet/78.1.197
34. Shen WM, Will P, Galstyan A, Chuong CM (2004) Hormone-inspired self-organization and distributed control of robotic swarms. Auton Rob 17(1):93–105
35. Sloan I (2010) Integration and approximation in high dimensions–a tutorial. Uncertainty Quantification, Edinburgh
36. Soysal O, Şahin E (2006) A macroscopic model for self-organized aggregation in swarm robotic systems. In: International workshop on swarm robotics. Springer, Heidelberg, pp 27–42
37. Spears WM, Spears DF, Hamann JC, Heil R (2004) Distributed, physics-based control of swarms of vehicles. Auton Rob 17(2):137–162
38. Sugihara K, Suzuki I (1996) Distributed algorithms for formation of geometric patterns with many mobile robots. J Field Rob 13(3):127–139
39. Wand M, Jones M (1994) Kernel smoothing. Chapman and Hall/CRC, Boca Raton
40. Zhang F, Bertozzi AL, Elamvazhuthi K, Berman S (2017) Performance bounds on spatial coverage tasks by stochastic robotic swarms. IEEE Trans Autom Control 63(6):1563–1578
41. Zhao S, Ramakrishnan S, Kumar M (2011) Density-based control of multiple robots. In: Proceedings of the 2011 American control conference, pp 481–486. https://doi.org/10.1109/ACC.2011.5990615
42. Zhong M, Cassandras CG (2011) Distributed coverage control and data collection with mobile sensor networks. IEEE Trans Autom Control 56(10):2445–2455

Design of a Vibration Driven Motion System Based on a Multistable Tensegrity Structure

Philipp Schorr1,2, Valter Böhm2, Lena Zentner1, and Klaus Zimmermann1

1 Technische Universität Ilmenau, 98693 Ilmenau, Germany
{philipp.schorr,lena.zentner,klaus.zimmermann}@tu-ilmenau.de
2 Ostbayerische Technische Hochschule Regensburg, 93053 Regensburg, Germany
[email protected]

Abstract. In this paper a novel approach to realize a uniaxial bidirectional vibration driven motion system with controllable direction of motion is investigated. The considered motion system is based on a tensegrity structure with multiple stable equilibrium configurations. The structure is in contact with a horizontal plane due to gravity, and the actuation is realized by the harmonic change of the length of a selected member. Besides varying the actuation parameters, the direction of motion can be controlled by changing the equilibrium configuration of the tensegrity structure. In this paper the topology of the tensegrity structure and the parameter values are chosen appropriately to provide two symmetric equilibrium configurations. A change of the equilibrium state yields a new configuration of the entire motion system which is symmetric to the original state. Utilizing the symmetry of the system, the same actuation yields a motion in the opposite direction. This approach represents a reliable opportunity to control the direction of motion by changing the equilibrium state for constant actuation parameters. This paper focuses on the parameter selection and the design of the actuation of the vibration driven motion system. The working principle of the vibration driven motion system is verified by numerical simulations. This contribution represents the theoretical investigation for the further development of a prototype.

Keywords: Compliant tensegrity structure · Vibration driven motion · Multistability

Supported by Deutsche Forschungsgemeinschaft (DFG project BO4114/2-2).

1 Introduction

The application field of mobile robotics continuously gains in significance. Among others, such kinds of robots are applied in dangerous terrain where human intervention is too risky. However, the detailed environmental conditions are often unknown, and the limited working space prevents the use of common motion systems based on wheels or legs. Therefore, the requirements on


those robots become more sophisticated. Hence, the improvement of already existing motion systems as well as the development of novel, innovative motion principles is necessary. The limited operating space encourages the investigation of non-classical motion principles which are based on the immediate interaction between the motion system and the given environment. In particular, the crawling motion of earthworms observed in nature is a promising approach. The body of the earthworm is divided into multiple segments. The peristaltic motion of the animal results as a consequence of the alternating shortening and elongation of these separate segments. Here, the spikes attached to the surface of each segment are essential for the control of the direction of motion. Depending on the orientation of those spikes, a sliding motion in the forward or backward direction is prevented by sticking into the surrounding soil. Many groups of researchers investigate worm-like motion. Besides the biological aspects, knowledge about the mechanical working principle of this motion type is fundamental for the development of corresponding motion systems. In [1,2] the worm is modeled as a finite number of rigid bodies which are connected serially by kinematic drives. The mechanical properties of the spikes are modeled by Coulomb's law of friction using anisotropic friction coefficients, and the resulting motion behavior of the system is investigated. But because of this anisotropy a preferred direction of motion results, and the generation of a crawling motion in the opposite direction is not possible for a harmonic actuation. The control of additional internal masses inside the body segments is considered in [3,4]. Different actuation functions are determined to realize a bidirectional uniaxial crawling motion. In [5,6] the actuation of kinetic drives is varied to optimize the steady state velocity of the system. However, the resulting drive functions are non-intuitive and require the application of high-end actuators.

Tensegrity structures represent a special kind of prestressed structure. According to the load, the elements are divided into compressed members and tensioned members. The tensegrity principle requires that all (groups of) compressed members are connected to each other by tensioned members. These structures enable advantages like high weight-to-load ratio, shock resistance and adjustable dynamical behavior. These mechanical properties encourage the application of such structures as motion systems in mobile robotics. The large shape-changing capability is utilized in [7–12] to generate a controllable motion by successive tilting sequences. The variation of the structural shape is realized by changing the free length of selected members. In [13–17] the application of tensegrity structures as vibration driven motion systems is investigated. Besides changing the actuation parameters, the dynamics of the motion system can be manipulated by varying the prestress state of the structure. A tensegrity structure with multiple stable equilibrium configurations is considered in [18]. Here, the change between the different equilibria represents a further possibility to change the dynamics of the vibration driven motion. In [19] this multistability is utilized to realize a feasible control of the direction of motion of a vibration driven motion system. A tensegrity structure consisting of several segments connected in series


is applied as a crawling motion system in [20]. The actuation strategy of the system is investigated for different environmental scenarios. However, because of the number of segments, many actuators and an enormous control effort are necessary. In the following, a multistable tensegrity structure inspired by [21] is investigated. The structure is in contact with a horizontal plane due to gravity and is excited by a harmonic actuation of a single member. Changing the equilibrium configuration of the tensegrity structure results in a tilting of the entire motion system. As a consequence, a similar configuration of the system occurs. Because of the similarity between the two equilibrium configurations, the structural dynamics are nearly the same. Therefore, this approach represents a feasible opportunity for a reliable control of the direction of motion of the vibration driven motion system. Utilizing this multistability, a bidirectional uniaxial motion can be realized for a given actuation of the system. This operating principle is illustrated in Fig. 1.

Fig. 1. Supposed behavior of the motion system: Utilization of the symmetry of the different equilibrium configurations to control the direction of motion (see [19]).

2 Mechanical Model of the Tensegrity Structure

2.1 Topology of the Tensegrity Structure

The tensegrity structure illustrated in Fig. 2 is investigated. Because of the symmetry along the z-axis, the consideration of the projection on the x-y-plane is sufficient. The structure consists of 9 members (j = 1, 2, . . . , 9) which are connected at 5 nodes (i = 1, 2, . . . , 5). These nodes are modeled as frictionless revolute joints. For the depicted structural topology, the members j = 1, 2, 3 are compressed members and the members j = 4, 5, 6, 7, 8, 9 are tensioned members. The compressed members j = 1, 2, 3 are characterized by the length Lj and the mass mj. These members are assumed to be homogeneous and rigid. The corresponding center of mass is defined by the geometric center of the compressed member. The tensioned members j = 4, . . . , 9 are modeled as a parallel arrangement of a linear spring and a visco-elastic damper. The spring is specified by the stiffness kj and the free length λ0,j. The damper is characterized by the


constant damping coefficient cj. The current length of the tensioned member j is given by the node at the beginning i1(j) and at the end i2(j) of the member. The masses of the tensioned members, which are usually realized by cables or springs, are negligible compared to the masses of the compressed members.

Fig. 2. Two-dimensional projection of the investigated tensegrity structure and the mechanical models of compressed members and tensioned members (see [19]).

The working principle of the proposed vibration driven motion system is based on the utilization of the symmetry of the tensegrity structure. Hence, the parameter values have to be selected appropriately in order to realize a symmetric structure. The longitudinal axis of compressed member 1 and the corresponding perpendicular bisector of member 1 are required to be axes of symmetry. Thus, the compressed members 2 and 3 must be identical. Furthermore, the tensioned members 4, 5, 6, 7 have to be identical, as do the tensioned members 8 and 9. The remaining parameters are listed in Table 1.

Table 1. Topology of the tensegrity structure and restriction of the mechanical parameters to ensure the symmetry of the structure.

j  i1(j)−i2(j)  Length [m]  kj [N/m]  cj [Ns/m]  mj [kg]
1  1−2          L1          —         —          m1 = L1 · 1 kg/m
2  3−5          L2          —         —          m2 = L2 · 1 kg/m
3  4−5          L2          —         —          m2 = L2 · 1 kg/m
4  1−3          λ0,4        k4        c4 = 0.2   —
5  2−3          λ0,4        k4        c4 = 0.2   —
6  2−4          λ0,4        k4        c4 = 0.2   —
7  1−4          λ0,4        k4        c4 = 0.2   —
8  1−5          λ0,8        k8        c8 = 0.2   —
9  2−5          λ0,8        k8        c8 = 0.2   —


The ratio of mass to length of the compressed members as well as the damping properties of the tensioned members are estimated based on experience with already existing prototypes. Thus, 6 parameters (L1, L2, λ0,4, k4, λ0,8, k8) can be chosen for the design of the motion system.

2.2 Equations of Motion and Equilibrium Configurations of the Tensegrity Structure

To analyze the dynamics of the tensegrity structure, the equations of motion have to be derived. Here, the Lagrange equations of the 2nd kind are used. The introduced two-dimensional tensegrity structure has 7 degrees of freedom. As generalized coordinates, the positions of the nodes 1 and 5 as well as the orientations of the compressed members 1, 2 and 3 are chosen (see Fig. 2). These coordinates are summarized in the vector q = (q1, q2, . . . , q7)ᵀ = (x1, y1, ϕ1, x5, y5, ϕ2, ϕ3)ᵀ. Thus, the resulting equations of motion are formulated in (1).

d/dt (∂T/∂q̇n) − ∂T/∂qn = Qn   with n = 1, . . . , 7   (1)

The parameter T describes the kinetic energy of the structure. The generalized forces are represented by the vector Q = (Q1, Q2, . . . , Q7)ᵀ. This vector considers the spring forces Fk,j and the damping forces Fc,j of the tensioned members. These forces are defined in (2)–(3).

Fk,j = −kj (|λj| − λ0,j) λj/|λj|,   λj = r_{i2(j)} − r_{i1(j)}   (2)

Fc,j = −cj ((λ̇j · λj)/|λj|) λj/|λj|,   λ̇j = ṙ_{i2(j)} − ṙ_{i1(j)}   (3)

This approach results in a system of nonlinear 2nd order differential equations. These equations of motion can be formulated as shown in (4). However, the solution of these equations requires computational support using numerical calculations. Considering the general tensegrity structure without external loads, the vector Q̃ equals zero.

M q̈ + C(q, q̇) + K(q) = Q̃   (4)

For the determination of the static equilibrium configurations of the tensegrity structure, (4) can be simplified. This yields the nonlinear system of equations formulated in (5). However, these equations also cannot be solved analytically. Moreover, the number of solutions cannot be predicted.

K(q_eq) = Q̃   (5)
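A hedged MATLAB sketch of such a numerical search (our illustration; staticResidual is an assumed user-supplied function that evaluates K(q) − Q̃ for the 7 generalized coordinates):

% Search for equilibrium configurations of (5) from many random initial guesses.
opts = optimoptions('fsolve', 'Display', 'none');
qFound = [];
for s = 1:200
    q0 = randn(7, 1);                    % random initial guess (scaling is an assumption
                                         % and should reflect the expected geometry)
    [q, ~, flag] = fsolve(@staticResidual, q0, opts);
    if flag > 0 && (isempty(qFound) || all(vecnorm(qFound - q) > 1e-3))
        qFound(:, end+1) = q;            % record only distinct converged solutions
    end
end

The list of distinct solutions qFound then has to be screened for physical validity (no member collisions) and for stability, as discussed next.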


Furthermore, the stability of the detected equilibrium configurations has to be evaluated to guarantee a multistable tensegrity structure. Thus, the Hessian H(q_eq) of the potential energy of the structure is considered. This approach is given in (6).

H(q_eq) = − ∂(K − Q̃)/∂q |_{q=q_eq}   (6)

The equilibrium configuration is stable if all eigenvalues of H(qeq ) are positive. Otherwise, the equilibrium is unstable.
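One way to implement this stability check numerically (a sketch under the assumption that a function potentialEnergy(q) returning the total potential energy of the structure is available) is to approximate the Hessian by central finite differences and inspect its eigenvalues:

function stable = isStableEquilibrium(qEq, potentialEnergy)
% Hedged sketch: classify an equilibrium by the eigenvalues of the Hessian of
% the potential energy, approximated with central finite differences.
n  = numel(qEq);  dq = 1e-6;  H = zeros(n);
for i = 1:n
    for j = 1:n
        ei = zeros(n,1); ei(i) = dq;  ej = zeros(n,1); ej(j) = dq;
        H(i,j) = (potentialEnergy(qEq+ei+ej) - potentialEnergy(qEq+ei-ej) ...
                - potentialEnergy(qEq-ei+ej) + potentialEnergy(qEq-ei-ej)) / (4*dq^2);
    end
end
stable = all(eig((H + H')/2) > 0);      % symmetrize, then require positive definiteness
end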

3 Parameter Finding for the Tensegrity Structure

3.1 Multistability of the Tensegrity Structure

Besides the structural topology, the multistability of tensegrity structures also depends on the prestress specified by the tensioned members. In the following, the parameter values of the tensioned members are defined in order to guarantee the existence of multiple stable equilibrium configurations. However, because of the nonlinear character of (5), a general approach for the detection of those multistable parameter ranges is not possible. Nevertheless, utilizing the symmetry of the given tensegrity structure, a feasible method can be derived. The following considerations are restricted to symmetric configurations of the tensegrity structure (y5 = 0, ϕ3 = −ϕ2), and the compressed member 1 is supposed to be fixed (x1 = 0, y1 = 0, ϕ1 = 0). Furthermore, to avoid the collision between the compressed members 2 and 3, the parameter ϕ2 is limited (ϕ2 ∈ (0, π)). The symmetric equilibrium configurations (qeq+, qeq−1, qeq−2) are obvious and exist for all possible parameter configurations. However, the one-dimensional equilibrium configurations qeq−1 and qeq−2 are not valid because of member collisions. These equilibrium configurations are depicted in Fig. 3. Considering the tensegrity structure as a prestressed framework, at least one stable equilibrium configuration exists. Assuming that the equilibrium configurations qeq+ and qeq−1 are unstable, the compressed members of each stable equilibrium configuration will not coincide with the axes of symmetry (see Fig. 3 – equilibrium configuration A). Thus, as a consequence of the symmetric topology and the chosen parameter values, a further stable equilibrium state will occur (see Fig. 3 – equilibrium configuration B). Therefore, the consideration of the stability of the equilibrium configurations qeq+ and qeq−1 is sufficient, and analytical stability limits can be derived. These limits are illustrated in Fig. 4 for various parameter ratios L2/L1. These limits do not depend on the parameter λ0,8. To ensure the multistability of the tensegrity structure, a parameter configuration within these boundaries has to be chosen. For the following investigations the parameter ratios L2/L1 = 0.4, k8/k4 = 1 and the parameter λ0,4 = 0.04 m are selected.


Fig. 3. Existence criterion for the two stable symmetric equilibrium configurations A and B with h = (L1 (2 + k8/k4) − 4L2) / (4 + 2 k8/k4).

Fig. 4. Nomogram for the determination of the parameter range of a multistable tensegrity structure.

3.2 Tilting by Changing the Equilibrium Configuration

Besides the multistability of the tensegrity structure, the operating principle of the vibration driven motion system additionally requires a tilting of the system as a consequence of changing the equilibrium configuration. Therefore, the mechanical model has to be extended by considering gravity g = −g e_y (|g| = 9.81 m/s²). Furthermore, the tensegrity structure is fixed as shown in Fig. 5. Because of the symmetry, the consideration of the equilibrium configurations A14 and B14 is sufficient. This yields x1 = 0, y1 = 0, y5 + L3 sin(ϕ3) = 0. To realize a tilting of the configuration B14, the condition given in (7) has to be fulfilled.

(m1 (L1/2) cos(ϕ1) + m2 [2x5 + (L2/2)(cos(ϕ2) + cos(ϕ3))]) / (m1 + 2m2) > x5 + L2 cos(ϕ3)   (7)

If this condition is valid, changing into the equilibrium configuration A24 or B14 yields a tilting of the structure. As a consequence, the contact node of the compressed member 1 changes. The occurrence of all relevant equilibrium configurations is depicted in Fig. 5.

Fig. 5. Occurring equilibrium configurations of the tensegrity structure and the corresponding change of the equilibrium configuration by tilting.

The influence of gravity is considered in the vectors Q and Q̃, respectively. Now, (5) is solved numerically for various combinations of the parameters k8 and λ0,8. Here, the parameter L1, which represents the geometrical dimensions of the system, is selected as L1 = 0.1 m. A parameter set is classified as valid if all of the following conditions are fulfilled:
1. Equation (7) is valid,
2. all tensioned members are loaded with tension (|λj| > λ0,j),


3. the equilibrium configurations are stable (see Eq. (6)).
For the investigations the parameter values L1 = 0.1 m, k8 = 1000 N/m and λ0,8 = 0.02 m are selected. For these parameters 4 stable equilibrium configurations A, B, C and D are detected (see Fig. 6). However, for the configurations C and D a collision of the members 2 and 3 occurs. Moreover, the corresponding potential energies of these equilibrium configurations are lower compared to the relevant equilibria A and B. Therefore, after reaching the equilibrium configuration C or D, a change into the state A or B requires much effort. Thus, to ensure the operation of the vibration driven motion system, the equilibrium states C and D should be avoided.

Fig. 6. Determined stable equilibrium configurations of the tensegrity structure for the selected parameter values.

4 Design of the Actuation of the Tensegrity Structure

4.1 Modal Analysis of the Tensegrity Structure

The selection of the actuator requires knowledge of the operating frequency range. A suitable frequency range with appropriate distance to all resonance frequencies is targeted. Therefore, a modal analysis of the tensegrity structure is performed to determine the corresponding natural frequencies. The nonlinear equations of motion of the tensegrity structure are linearized about the stable equilibrium state A. The resulting equations of motion are formulated in (8).

M q̈ + C q̇ + K(q − q_eq) = 0   (8)

The numerical calculation of the eigenvalues and eigenvectors yields the eigenfrequencies and the corresponding eigenmodes. Besides the 3 rigid body modes, 4 eigenfrequencies remain which are relevant for the design of the actuation of the


Fig. 7. Eigenmodes and eigenfrequencies of the multistable tensegrity structure.

motion system. The results are depicted in Fig. 7. The eigenfrequencies are sorted in increasing order (f1 < f2 < f3 < f4). For the application of the tensegrity structure as a vibration driven motion system, the frequency range 0 < f ≤ 20 Hz is selected. This parameter range represents a restriction for the selection of the actuator. Moreover, an appropriate distance to the 1st eigenfrequency is guaranteed in order to avoid resonance effects.
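If the mass matrix M and the tangent stiffness matrix K of the linearized system (8) are available, the undamped natural frequencies can be estimated with a few lines of MATLAB (a sketch only; the damped eigenvalue problem would additionally involve C):

% Undamped natural frequencies of (8): generalized eigenvalue problem K*v = w^2*M*v.
[~, D]  = eig(K, M);
omega2  = sort(real(diag(D)));                     % squared angular eigenfrequencies
fNat    = sqrt(omega2(omega2 > 1e-6)) / (2*pi);    % drop (near-)zero rigid-body modes

The remaining entries of fNat correspond to the four structural eigenfrequencies f1 < f2 < f3 < f4 discussed above.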

4.2 Frequency Response Analysis of the Tensegrity Structure

The integration of the actuator in the tensegrity structure has to be specified in a way that retains the simple design of the system. In particular, the location of the actuator within the structure will influence the motion behavior. Therefore, frequency response analyses for different actuation strategies and various equilibrium states of the tensegrity structure are evaluated. The working principle of the considered vibration driven motion system is based on the symmetry of the entire system. However, as a consequence of changing the equilibrium state, the position of the actuator within the structure is changed. Hence, the dynamical properties for both cases have to be the same in order to ensure a reliable control of the direction of motion. For the sake of simplicity the harmonic actuation of a single member e is assumed. The corresponding actuation force is given in (9).

F_A(t) = F0 sin(2πf t) λ_e/|λ_e|   (9)

Analogous to the derivation of the nonlinear damping and spring force vector, the actuation of the structure is transferred into nodal forces summarized in A(q, t). This approach results in (10).

M q̈ + C q̇ + K(q − q_eq) = A(q, t)   (10)


Based on this approach, frequency response analyses for the actuation of the tensioned members 4, 7 and 8 are evaluated with respect to the stable equilibrium configurations A and B. These results are illustrated in Fig. 8. For the various actuated tensioned members, the dynamical properties of the different equilibrium configurations are compared, focusing on the selected frequency range (0 < f ≤ 20 Hz). Large deviations can be recognized clearly for the choice of the tensioned member 4 or 7 as actuated member. But choosing member 8 as actuated member, the frequency responses for both equilibrium configurations are nearly identical. Therefore, in the following the tensioned member e = 8 is chosen for the integration of the harmonic actuation of the tensegrity structure.

Fig. 8. Frequency response of the tensegrity structure respective to the deviation of the x-component of node 3 with F0 = 1 N and – (a) e = 4, (b) e = 7, (c) e = 8.


5 Simulation of the Vibration Driven Motion System

5.1 Modeling of the Vibration Driven Motion System

The considered tensegrity structure is in contact with a horizontal plane due to gravity (g = −g e_y with |g| = 9.81 m/s²). For the simulation of the motion performance of the vibration driven motion system with the selected parameter and actuation values, the environment has to be taken into account. Friction effects occurring at the contact nodes are modeled by Coulomb's law of sliding friction. Therefore, the derived equations of motion have to be supplemented by the contact force F_contact,i = (F_F,i, F_N,i)^T acting on node i. The components of this force are defined in (11) and (12). Here, the parameters k_g and c_g represent the stiffness and damping properties of the horizontal ground, μ is the dynamic friction coefficient of Coulomb's law of friction, and ẋ_i is the velocity of node i along the x-axis.

F_N,i = −k_g y_i − c_g ẏ_i   if y_i < 0 and ẏ_i < 0
        −k_g y_i             if y_i < 0
        0                    else                        (11)

F_F,i = −μ F_N,i sign(ẋ_i)                               (12)
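As a minimal illustration of the contact model in (11) and (12), the following sketch evaluates the normal and friction force at a single contact node. The ground parameters k_g, c_g and the friction coefficient μ are placeholders, since their numerical values are not stated here.

```python
import numpy as np

def contact_force(y, y_dot, x_dot, k_g=1e5, c_g=50.0, mu=0.1):
    """Normal and tangential contact force at one node, following Eqs. (11)-(12).

    y, y_dot : vertical position and velocity of the node (penetration if y < 0)
    x_dot    : horizontal velocity of the node
    k_g, c_g : placeholder ground stiffness and damping
    mu       : dynamic friction coefficient
    """
    if y < 0.0 and y_dot < 0.0:          # penetrating and still moving downwards
        F_N = -k_g * y - c_g * y_dot
    elif y < 0.0:                        # penetrating, moving upwards: spring term only
        F_N = -k_g * y
    else:                                # no contact
        F_N = 0.0
    F_F = -mu * F_N * np.sign(x_dot)     # Coulomb sliding friction, Eq. (12)
    return F_F, F_N

# Example: a node 1 mm below the ground, moving downwards while sliding forwards.
print(contact_force(y=-0.001, y_dot=-0.02, x_dot=0.05))
```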

5.2 Motion Characteristics of the Vibration Driven Motion System

The resulting equations of motion are solved numerically using the 4th order Runge-Kutta method with an appropriate step size (Δt = 1e−4 s). As initial condition, the equilibrium configuration A14 or B24 is assumed and the entire system is supposed to be at rest (q̇ = 0). Furthermore, a constant actuation amplitude of F0 = 1 N is selected. The simulation time is chosen as 1100 actuation periods. However, only the steady state solution, which is assumed to be reached after 1000 actuation periods, is evaluated. The motion performance is classified with respect to the steady state velocity v defined in (13). Here, the parameter t0 represents an arbitrary time instance.

v = f ∫_{t0}^{t0+1/f} ẋ_1 dt    (13)

Thus, the value of v can be calculated for each actuation period of the steady state solution; finally, the averaged value is reported. The resulting motion behavior is shown in Fig. 9. These results confirm the basic idea of the investigations: because of the symmetry of the multistable tensegrity structure, the identical actuation yields different directions of motion depending on which equilibrium configuration is used as initial condition. However, the general motion behavior, characterized by the absolute value of the corresponding steady state velocities, is nearly the same. The motion behavior is simulated for different friction coefficients and various actuation frequencies. The results are depicted in Fig. 10.
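A minimal sketch of how the steady-state velocity in (13) can be evaluated from a simulated node trajectory is given below; the trajectory array and the drift value used here are illustrative placeholders, not the simulation data of the paper.

```python
import numpy as np

def steady_state_velocity(t, x1_dot, f, t_settle, dt):
    """Average the per-period velocity v = f * integral of x1_dot over 1/f
    (Eq. (13)) over all full actuation periods after t_settle."""
    period = 1.0 / f
    v_values = []
    t0 = t_settle
    while t0 + period <= t[-1]:
        mask = (t >= t0) & (t < t0 + period)
        v_values.append(f * np.sum(x1_dot[mask]) * dt)   # rectangle-rule integral
        t0 += period
    return float(np.mean(v_values))

# Illustrative trajectory: a drift of -30 mm/s plus an oscillation at f = 10 Hz.
f, dt = 10.0, 1e-4
t = np.arange(0.0, 11.0, dt)
x1_dot = -0.030 + 0.02 * np.sin(2.0 * np.pi * f * t)
print(steady_state_velocity(t, x1_dot, f, t_settle=10.0, dt=dt))   # ~ -0.030 m/s
```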


Fig. 9. Motion behavior after 1000 actuation periods for F0 = 1 N, μ = 0.1, f = 10 Hz and the initial condition – (a) A14 : v = −30.9 mm/s, (b) B24 : v = 31.0 mm/s.

Fig. 10. Steady state velocity for different equilibrium configurations as initial condition, F0 = 1 N and – (a) μ = 0.1, (b) μ = 0.2, (c) μ = 0.5.


Fig. 11. Final state of the vibration driven motion system after the actuation in order to change the equilibrium configuration with μ = 0.1 and – (a) initial state: A14 , (b) initial state: B24 .

These results emphasize the symmetric motion behavior. All the simulation results confirm the influence of the initial condition on the direction of motion. However, the corresponding steady state velocity is affected only marginally by the different equilibrium configurations. Furthermore, all simulations with the equilibrium state A14 as initial condition yield a negative steady state velocity. Therefore, for all investigated actuation strategies, the direction of motion can only be varied by changing the operating equilibrium configuration. Considering the motion performance as a function of the actuation frequency, the resulting motion in the range 0 < f < 8 Hz is almost zero and therefore inefficient. The frequency range 8 Hz ≤ f ≤ 14 Hz is defined as the relevant operating range, which yields a usable motion.

5.3 Change of the Equilibrium Configuration of the Vibration Driven Motion System

The change between the stable equilibrium configurations of the vibration driven motion system is now considered. An actuation strategy which ensures a reliable change between the equilibria is required. For the sake of simplicity, the use of additional actuators should be avoided. As actuation strategy, the approach formulated in (14) is selected.

F_A1(t) = (F_1/2) (1 − cos(2πf_1 t)) · λ_8 / |λ_8|   if t < 1/f_1
          0                                           else          (14)

To specify the parameter values of F_1 and f_1, various combinations are simulated. The results are illustrated in Fig. 11.
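The single-period raised-cosine pulse in (14) is simple to generate; the following sketch shows one way to evaluate it over a simulation time grid. The amplitude F1 and frequency f1 below are placeholder values, since the paper explores several combinations of them.

```python
import numpy as np

def switching_pulse(t, F1=2.0, f1=5.0):
    """One-period raised-cosine actuation force of Eq. (14) along member 8.

    Returns the scalar force magnitude; the direction lambda_8/|lambda_8|
    along the actuated member is applied in the multibody model itself.
    """
    t = np.asarray(t, dtype=float)
    pulse = 0.5 * F1 * (1.0 - np.cos(2.0 * np.pi * f1 * t))
    return np.where(t < 1.0 / f1, pulse, 0.0)

t = np.linspace(0.0, 0.5, 6)          # coarse grid just for illustration
print(switching_pulse(t))              # non-zero only for t < 1/f1 = 0.2 s
```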


These results confirm that a change between the stable equilibrium configurations is possible using the same actuator. However, for some actuation strategies the equilibrium C or D occurs and the motion cannot be continued. Thus, those actuations should be avoided. Moreover, for the construction of the prototype, safeguards which prevent the occurrence of these states are recommended.

6 Conclusion

In this paper a symmetric tensegrity structure is investigated. Based on static considerations and a stability analysis, the parameter range of a multistable tensegrity structure is identified. Furthermore, an appropriate parameter configuration for the multistability of the structure is selected, which guarantees a tilting of the entire motion system as a consequence of a change between the stable equilibria. Thus, two symmetric operating configurations of the motion system occur. Based on the linearization of the structural dynamics, the actuation frequency and the integration of the actuator into the structure are specified. Numerical simulations with the different equilibrium configurations as initial conditions are evaluated for various actuation frequencies and different environmental conditions. These results confirm the basic idea of the approach to generate a bidirectional uniaxial motion utilizing the symmetry of the system. Furthermore, the change between the equilibrium configurations is considered in order to define a reliable actuation strategy. In total, the mechanical consideration of the properties and the dynamics of the tensegrity structure enables the development of a vibration driven motion system with feasible control of the direction of motion and low requirements on the single actuator of the system.


An Experimental Study for the Contactless Manipulation of Single Object in Vertical Plane Using Multiple Air Jets Nobukado Abe1, Kazuki Yoshinaga1(&), Satoshi Iwaki1, Naoki Tsuchihashi1, Tetsushi Ikeda1, Toshiharu Kosaku1, and Takeshi Takaki2 1

Systems Engineering Robotics Laboratory, Graduate School of Information Science and Technology, Hiroshima City University, Asaminami, Hiroshima, Hiroshima Prefecture, Japan {abe,yoshinaga}@robotics.info.hiroshima-cu.ac.jp, [email protected] 2 Graduate School of Engineering, Hiroshima University, Hiroshima, Japan

Abstract. We propose a novel technology in which multiple air jet nozzles are arranged consecutively along a conveying line so that an object can be spatially relayed from one nozzle to the next. In this paper, as the most essential challenge for such an air jet conveyor system, we focus on the problem of spatially relaying an object between a pair of air jets in a vertical plane. In order to develop this spatial relay control algorithm, we combine the findings obtained in our earlier planar air jet manipulation experiments with new experiments on the behavior of an object when two air jets join in the air. Based on these preliminary findings, a new control algorithm which exploits the passivity of the Coandă effect as well as active feedback control is proposed. Finally, the validity of the proposed method is demonstrated by relay experiments with payloads of various masses and sizes.

Keywords: Air jet manipulation · 3D space · Object · Relay motion

1 Introduction

Non-contact object manipulation technology using air jets has excellent features such as cleanness and no need for a transmission mechanism. It has been extensively studied for some years, mainly aiming at applications in conveying systems for relatively small and light objects such as fruits or vegetables [2–5, 7–9]. In three-dimensional space, a three-translational-DOF control method using a single air jet nozzle mounted on a pan-tilt actuator has been proposed [1]. In [1], it is possible to freely control the translational 3-DOF within the range where the object can be held by the Coandă effect. However, since the range in which an object can be held against gravity is at most about 40°, the driving range is inherently narrow. In order to solve this problem, in this paper we propose a method in which multiple nozzles are continuously arranged along a conveyance line, and an object is relayed from one air jet stream to the next. In particular, we focus on a control method to transfer an object between a pair of air jets


in a vertical plane, which is the most fundamental problem in realizing such a transport system. In order to develop this spatial relay control algorithm, we combine the findings obtained in our earlier planar air jet manipulation experiments with new experiments on the behavior of an object when two air jets join in the air. Finally, the validity of the proposed method is demonstrated by relay experiments with payloads of various masses and sizes.

2 Outline and Problem Setting of Proposed Technology

2.1 Basic Idea and Conveyance Image

Here, a system equipped with an active air jet nozzle and a distance sensor mounted on a pan-tilt actuator is called a “one nozzle control module” (Fig. 1). N such modules are prepared and connected from module 1 to module N along a given object transport path. Figure 2 shows an image of such a system where an object is relayed (M denotes a module). In order to realize such a transportation system, it is essential to create a relay control method between a pair of air jet nozzles in a vertical plane. In the following, we formulate this problem.

2.2 An Essential Problem to Be Solved

Figure 3 is a diagram showing the experimental coordinate system of the state of step 2 (Fig. 2) as an example. In the vertical YZ plane shown in Fig. 3, let us consider that a spherical object is moved horizontally from the start point Ps to the goal point Pg by using two air jet nozzles separated by a distance l [mm]. Our problem here is to determine the air jet nozzle angles θ_L, θ_R and the air jet flow rates f_L, f_R (proportional solenoid valve supply voltages).

Fig. 1. Configuration of “one nozzle control module” [10].

Fig. 2. An image of object transportation [10].


3 Preliminary Experiments to Propose the Air Jet Relay Control Method

3.1 Consideration of Air Jet Manipulation Experiment on a Flat Surface

As a fundamental basis of this research, we have already succeeded in experiments on 2-DOF control of an object on a flat surface using multiple air jets [2]. Here, we briefly introduce the conclusions and derive some hints for developing the air jet relay control algorithm. Figure 4 shows the coordinate system of the problem of controlling the XY position of a cylindrical object with three nozzles. Assuming that the object is relatively large with respect to the distance between the nozzle and the object, we expect that the air jet output applied to the object can be approximated as a lumped parameter system, and that each jet nozzle always aims at the center of the cylindrical object. Then the resultant force vector (f_x, f_y)^T applied to the object by the air jet outputs f_k ejected by each nozzle can be expressed in matrix form as follows:

f_p = K_w f_w    (1)

f_p = (f_x, f_y)^T,   f_w = (f_1, f_2, f_3)^T,

K_w = [ sin θ_1    −cos(π/6 − θ_2)    cos(π/6 + θ_3) ]
      [ cos θ_1     sin(π/6 − θ_2)    sin(π/6 + θ_3) ]    (2)

where the matrix K_w is determined by the positions and orientations of the nozzles, and the force closure condition needs to be satisfied [6]. It should be noted that, due to the unilaterality of the air jet output, more air jet nozzles than control degrees of freedom are required, so this inherently becomes a redundant degree-of-freedom control problem. The object manipulation problem is now formulated as a problem of controlling the barycentric position (x, y)^T with f_w = (f_1, f_2, f_3)^T ≥ 0 (non-negative) and the nozzle angles θ_k (k = 1, 2, 3) as control inputs. However, for a given f_p, infinitely many f_w satisfy Eq. (1). Therefore, with the objective function z = max(f_1, f_2, f_3) = f_max, the following problem (3) is solved by linear programming to minimize the maximum air jet output:

min   z = f_max
s.t.  K_w f_w = f_p
      f_1, f_2, f_3 ≤ f_max
      f_1, f_2, f_3 ≥ f_min    (3)
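As an illustration of the optimization in (3), the following sketch solves the min-max allocation with scipy.optimize.linprog; the nozzle angles and the target force vector are placeholder values, and this is only a sketch of the formulation, not the authors' implementation.

```python
import numpy as np
from scipy.optimize import linprog

def min_max_allocation(K_w, f_p, f_min=0.0):
    """Solve the min-max air jet allocation (3): decision variables are
    (f1, f2, f3, f_max); minimize f_max subject to K_w f_w = f_p,
    f_k <= f_max and f_k >= f_min."""
    c = np.array([0.0, 0.0, 0.0, 1.0])                 # minimize f_max only
    A_eq = np.hstack([K_w, np.zeros((2, 1))])          # K_w f_w = f_p
    A_ub = np.hstack([np.eye(3), -np.ones((3, 1))])    # f_k - f_max <= 0
    b_ub = np.zeros(3)
    bounds = [(f_min, None)] * 3 + [(None, None)]
    res = linprog(c, A_ub=A_ub, b_ub=b_ub, A_eq=A_eq, b_eq=f_p, bounds=bounds)
    return res.x[:3], res.x[3]

# Illustrative nozzle angles (theta_1 = theta_2 = theta_3 = 0) and a target force.
th1 = th2 = th3 = 0.0
K_w = np.array([[np.sin(th1), -np.cos(np.pi/6 - th2), np.cos(np.pi/6 + th3)],
                [np.cos(th1),  np.sin(np.pi/6 - th2), np.sin(np.pi/6 + th3)]])
f_w, f_max = min_max_allocation(K_w, f_p=np.array([0.0, 0.5]))
print(np.round(f_w, 3), round(f_max, 3))
```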

As for the servo system, a PD compensator is used to control the X and Y positions independently. The overall block diagram of the control system is shown in Fig. 5. Figure 6 shows our experimental system, in which a cylindrical object with a diameter of 100 mm, a height of 140 mm and a mass of 30 g was used. The experimental device consists of an air nozzle (KN-Q06-20 manufactured by SMC, nozzle diameter 2 mm, length 35 mm),


a control PC (Microsoft Windows 10, 64 bit), and an air compressor (PO-0.75 PGS6). The photograph in Fig. 7 shows a movement trajectory of the object as an example of the experimental results. From these 2D experimental results, we can confirm that a control system design methodology based on a lumped parameter model can be applied thanks to the feedback control, although the air jet manipulation system is inherently a nonlinear distributed parameter system. More specifically, for example, Eq. (1) can be used for the control. However, these planar case studies give us no hint for the two-air-jet collision problem in 3D space, which we consider the most difficult challenge of this research. Therefore, we decided to conduct the following additional preliminary experiments.

3.2 Observation of the Behavior of an Object When Two Air Jets Join in the Air

For the motion of an object in a single air jet stream, hydrodynamic analysis is possible to some extent, and a model-based control approach is conceivable. However, neither theory nor experiments have been reported on the behavior of an object in the turbulent field where two air jets collide, so there has been no clear idea of how to address it. Thus, in order to get a hint for the problem, we first decided to observe the natural phenomenon as it is. Namely, as a preliminary experiment, we investigated the behavior of the object in the confluence area between the two air jets under the condition that the two air jet flow rates are constant. Experiments were conducted to measure the steady levitation position of the object in the collision state, with the two air jet flow rates (f_L, f_R) and the angles (θ_L, θ_R) as parameters. In Fig. 8, three examples (f_L = f_R = 5.5 V; f_L = 5.5 V > f_R = 5.2 V; f_L = 5.2 V < f_R = 5.5 V, with θ_L = θ_R = 30 deg) are shown for the behavior analysis. From these results, we understand that the two air jet streams become one unified jet stream from the collision point onwards, on which the object can be floated. It was also confirmed that the resultant force applied to the object by the two air jets can be roughly handled as a vector in a linear lumped parameter system.

4 Proposed Control Method

4.1 Approach to Problem Solving

Based on the above two preliminary experimental results, we expected that, in the horizontal direction, the object can be passively stabilized by the Coandă effect. At the same time, in the vertical direction, we expected that feedback control can stabilize the position by treating the resultant force of the two air jets as a linear lumped parameter system. Specifically, we propose the following three policies based on the assumption of a linear lumped parameter system [10]. (Policy 1) The two air jet nozzle angles are manipulated assuming that the object stays near the confluence point thanks to the Coandă effect.


(Policy 2) The flow rate of each air jet nozzle is controlled so that the horizontal component of the resultant force of the wind forces at the confluence point is always balanced to zero. (Policy 3) Regarding the vertical direction, the flow rate of each air jet nozzle is feedback controlled so that the vertical component of the resultant force and the gravity are balanced.

Fig. 3. Configuration of the experiment system (The angle is positive for clockwise rotation) [10].

Fig. 5. Block diagram of the 2-DOF control system [2].

Fig. 4. Geometrical model of the two DOF experimental system on a flat surface.

Fig. 6. An overview of an experimental system to control 2-DOF of a cylindrical object using three air jet nozzles [2].


Fig. 7. Trajectories of the controlled cylindrical object’s COG [2].



Fig. 8. Positions of the levitated object and confluence points of the two air jets [10].

4.2 Proposed Algorithm

Based on the above policies, we created our relay control algorithm as follows [10].

1. Determination of θ_L, θ_R. From Policy 1, the angle of each nozzle is calculated from the object position target value P_r(r_y, r_z) and the module-to-module distance, as shown in the following expression and Fig. 3.

    ry l  ry ; hR ¼ tan1 rz rz

ð4Þ

where r_z = h and r_y = v_y t when the object is to be moved horizontally with a constant speed v_y.

2. Determination of f_L, f_R. From Policy 2, the air jet flow rates are set so that the Y-direction resultant force is always cancelled:

f_L sin θ_L + f_R sin θ_R = 0    (5)

Further, from Policy 3, the resultant force in the Z direction f_z is determined using a PID controller as follows:

f_z = PID(r_z − z) + f_min    (6)


where PID denotes the PID compensator function and f_min = mg is a bias force to cancel gravity (m is the mass of the object). The vertical resultant force is

f_z = f_L cos θ_L + f_R cos θ_R    (7)

Accordingly, by simultaneously solving Eqs. (6) and (7), f_L and f_R are calculated as follows:

f_L = − f_z sin θ_R / sin(θ_L − θ_R)    (8)

f_R = f_z sin θ_L / sin(θ_L − θ_R)    (9)

This is our proposed relay control algorithm.
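A compact sketch of one control step of the relay algorithm of Eqs. (4)–(9) is given below. The PID term of (6) is reduced to a simple proportional action on the height error, all numerical values (nozzle spacing, target height, gains, mass) are placeholders, and the sign convention of the right nozzle angle (negative when tilting towards the object, consistent with Eqs. (5)–(9)) is an assumption about Fig. 3 rather than something stated explicitly in the text.

```python
import numpy as np

def relay_control_step(r_y, r_z, z_meas, l, m, kp=2.0, g=9.81):
    """One step of the relay control law of Eqs. (4)-(9).

    r_y, r_z : target object position in the vertical YZ plane
    z_meas   : measured object height
    l        : distance between the two nozzles
    m        : object mass
    Returns nozzle angles (rad) and flow-rate commands (f_L, f_R).
    """
    # Eq. (4) / Policy 1: aim both nozzles at the target point.  The right
    # nozzle angle is taken negative here (assumed sign convention).
    th_L = np.arctan2(r_y, r_z)
    th_R = -np.arctan2(l - r_y, r_z)

    # Eq. (6) / Policy 3: vertical force from a (here purely proportional)
    # controller plus the gravity-cancelling bias f_min = m*g.
    f_z = kp * (r_z - z_meas) + m * g

    # Eqs. (8)-(9) / Policy 2: split f_z so the horizontal components cancel (Eq. (5)).
    denom = np.sin(th_L - th_R)
    f_L = -f_z * np.sin(th_R) / denom
    f_R = f_z * np.sin(th_L) / denom
    return th_L, th_R, f_L, f_R

# Example: object halfway between nozzles 0.3 m apart, hovering about 0.2 m above them.
print(relay_control_step(r_y=0.15, r_z=0.2, z_meas=0.19, l=0.3, m=0.01))
```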

5 Demonstration Experiment

5.1 Outline of Experiment

To demonstrate the validity of the proposed control method, we conducted experiments using 5 objects (objects A, B, C, D and E) of different sizes and weights, in order to investigate the applicable range of the proposed control method. The appearance of the experimental system is shown in Figs. 9 and 10. In this experiment, the module-to-module distance L was 300 [mm]. We performed 10 trials and recorded the movement trajectory of the object with a camera sensor.

5.2 Experimental Results and Discussion

Experimental results for each object are shown in Table 1. The success rate of object A was the highest, and its combination of size and weight is considered the best for the proposed control method (Fig. 11). Since object B has a larger diameter than object A, the unified air jet created during the relay conveyance has a larger influence on it; this is thought to be why it floated up by about 90 [mm] around the central point Pm (Fig. 12). Since object C is the lightest, it tends to fall even though its vibration is small, which is thought to have decreased the success rate (Fig. 13). Since object D has a smaller diameter than object A, the influence of the unified air jet is small; the lifting force is therefore small while the object is relatively heavy, so the maximum vibration amplitude is thought to be as large as 130 [mm] (Fig. 14). Since object E is the smallest and heavy, the influence of the unified air jet is even smaller, and relay conveyance is considered to have been difficult (Fig. 15).


Fig. 9. An overview of “one nozzle control module” [10].


Fig. 10. The object and the relaying system [10].

Table 1. Summary of experiment results for each object.

Object    Figure number    Diameter [mm]    Weight [g]    Success rate [%]
A         7                100              10.24         80
B         8                140              28.6          70
C         9                70               4.6           60
D         10               70               14.1          70
E         11               60               22.9          10

Fig. 11. Experimental result of object A (Success).

Fig. 12. Experimental result of object B (Success).

Fig. 14. Experimental result of object D (Success).

Fig. 13. Experimental result of object C (Success).


Fig. 15. Experimental result of object E (Failure).

6 Conclusion

Aiming at a long-distance, non-contact conveyor system, we have proposed an air jet manipulation system in which multiple nozzles are continuously arranged along a conveyance line and an object is relayed from one air jet stream to the next. In this paper, we have focused in particular on a control method to transfer an object between a pair of air jets in a vertical plane. Based on the results of our previous 2D air jet manipulation experiments as well as new experiments on the behavior of an object when two air jets join in the air, a new control algorithm has been proposed. The proposed method exploits the passivity of the Coandă effect as well as an active feedback control scheme. Finally, the validity of the proposed method has been demonstrated by relay experiments with payloads of various masses and sizes. We also found that the success rate depends on the combination of the size and weight of the object. In the future, we aim to improve performance by using a control method based on fluid dynamics theory.

References

1. Becker A, et al (2009) Automated manipulation of spherical objects in three dimensions using a gimbaled air jet. In: Proceedings of IROS, pp 781–786
2. Matsushita T, Sugiyama T et al (2014) Contactless object manipulation using multiple air jets on planar surface (Experimental case studies for small control range with continuous air jets). Trans JSME 80(817):15
3. Matsushita T, Tsuchihashi N, Iwaki S, Takaki T (2016) Contactless object manipulation using multiple air jets on planar surface (Experimental case studies of control method for the multiple objects using four air jet nozzles). https://doi.org/10.1299/transjsme.15-00459
4. Tsuchihashi N, Yoshinaga K, Iwaki S, et al (2016) Non-contact manipulation of a single solid object on a plane surface using multiple air jets (Experiments with fuzzy control). In: 2016 International Conference on Advanced Mechatronic Systems. https://ieeexplore.ieee.org/document/7813469
5. Iwaki S, Morimasa H, Noritsugu T, Kobayashi M (2011) Contactless manipulation of an object on a plane surface using multiple air jets. In: Proceedings of ICRA, pp 3257–3262
6. Iwaki S (1990) The optimal location of electromagnets in multiple degree-of-freedom magnetically suspended actuators. ASME J Dyn Syst Meas Control 112:690–695
7. Suzuki Y, Kobayashi M, Shimada Y, Nakayama A, Iwaki S (2004) Untethered force feedback interface that uses air jets. In: SIGGRAPH 2004 emerging technologies


8. Satake Inc. PIKASEN. https://satake-japan.co.jp/pikasen/feature03.html
9. Chikura T, et al (1998) Development of automatic sorting system for green peppers (Handling of light-weight and irregular-shape produce). MHI Technical report
10. Yoshinaga K, Iwaki S, Abe N, et al (2018) A spatial motion control to transfer an object between a pair of air jet. In: ICINCO 2018 proceedings, vol 2

Combined Feedback-Feed Forward Control Strategy for a Knee Rehabilitation Device with Soft Actuation Leonardo Solaque, Yair Valbuena, Marianne Romero, and Alexandra Velasco(B) Mechatronics Engineering Department, Universidad Militar Nueva Granada, Bogota, Colombia {leonardo.solaque,u3900232,tmp.marianne.romero, alexandra.velasco}@unimilitar.edu.co

Abstract. Assistive devices for rehabilitation purposes have gained much attention in robotics research. Actuation systems that include compliant elements provide advantages such as natural motions of robotic devices and safety in the interaction with people. These actuation systems are called soft actuators. There are parallel elastic actuators (PEA), series elastic actuators (SEA) and variable stiffness actuators (VSA), which differ in the position of the compliant element and the possibility of changing the stiffness online. We have designed a five-bar-linkage knee rehabilitation system which exploits the advantages of soft actuation. To accomplish the desired tasks properly with such automatic systems, control strategies are required; in our case, this means reproducing the desired motions without harming the patient, so the control system is crucial. In this chapter, we present a combined feedback-feedforward control strategy for the designed knee rehabilitation device. This work was partially presented in [16], where we discussed the strategy and presented some simulation results. In this chapter we extend those results, presenting experimental trials to validate the performance of the controller and the behavior of the system. The goal of the proposed strategy is to control the system while maintaining the intrinsic softness of the actuators during the patient's rehabilitation process. The feedback control strategy acts within a defined threshold to maintain the stiffness of the system, and it is combined with a feed-forward decision control to reject disturbances. The simulations and experimental results are obtained from the analysis of a one-degree-of-freedom (DoF) soft actuated system, which allows us to gain insight into the controller and the system without losing generality.

Keywords: Feedback and feedforward control · Knee rehabilitation device · Soft actuation · Intrinsic compliant dynamics



Fig. 1. Model representation for knee rehabilitation.

1 Introduction

According to the World Health Organization, nearly 15% of the world's population has some disability caused by accidents, chronic diseases, or other conditions. In addition, people of all ages practice sports frequently to maintain their quality of life, but in doing so they are more exposed to joint injuries. Therefore, there is an increasing need for physical therapy. Any case of physical impairment or injury requires rehabilitation to reduce pain and to improve or maintain the remaining functional and structural characteristics of the musculoskeletal system. Rehabilitation includes several practices that aim to recover functions that have been lost or diminished by disease or accident. In this case we specifically refer to physical rehabilitation of the knee [1]. However, even if patients are prescribed a protocol that includes physical therapy, in some cases they do not complete the treatment. According to [8], patients do not comply with the prescribed physical treatments due to costs, difficulty in reaching the physiotherapy center, and so on. Other difficulties stem from the impossibility of performing the exercises correctly due to pain, lack of strength and limited range of mobility. For these reasons, several research efforts have been devoted to systems, methods, and mechanisms that improve the availability and effectiveness of physical therapy by designing assistive rehabilitation devices. For instance, devices for upper limb (see e.g. [2] or [10]) and for lower limb rehabilitation (see e.g. [9], [?] or [18]) have notably increased. There are specific requirements and characteristics that this kind of device must fulfil, such as safety and the performance of natural motions. In this sense, the actuation system is key. Thus, soft actuators are used due to the advantages that they provide [6].

http://www.who.int/.


In general, robotic devices that are intended to be used with humans, e.g. in rehabilitation, require a control strategy to ensure that the system performs the task with the defined specifications. It is important, for example, to achieve zero tracking error. Classical control strategies based on feedback compensation are effective for this purpose. However, they modify or change the system dynamics; moreover, when applied to soft actuated systems, these strategies may cancel the intrinsic compliant dynamics, eliminating the desired characteristics of these actuation systems [5]. Following this idea, we introduce a control strategy that does not cancel the desired compliant dynamics, based on the combination of feed-forward and feedback actions, to control the motion of the soft-actuated knee rehabilitation device that we have designed. This work was partially presented in [16]; therefore, the mathematical formulation and the main ideas for the design of the controller are reused here. However, [16] lacked an experimental validation, which is the focus of this work. In the literature, other works tackle problems related to the control of rehabilitation devices. For instance, [21] presents a closed-loop torque control using classical proportional feedback with damping injection in conjunction with iterative learning for a knee exoskeleton. A PD control strategy for soft actuated systems has been derived in [4]. Moreover, a feedback-linearization controller is presented in [12]. Other control strategies include back-stepping [13], immersion and invariance theories [20], optimal control [11], and so on. Rehabilitation devices that use compliant actuation have also been designed. For instance, the control of an assistive orthopaedic system for rehabilitation based on compliant actuators has been presented in [19]. In [10], a one-degree-of-freedom assistive platform with VSA to augment the strength of the upper limbs is presented. The authors aim to control the system in feed-forward by mapping electromyographic signals (from muscle activation), to exploit the muscle-like dynamics of the mechanical device; however, a complete analysis of the control problem is left as future work. Furthermore, a compliantly actuated parallel ankle rehabilitation robot is presented in [7]. The robot allows patients to modify its motions according to their own level of disability by applying an interactive training strategy based on impedance control. This control scheme depends on the therapist's decision; therefore, automatic adaptation between impedance control modes with low and high compliance is required. The mechanical design of the 5-bar-linkage device for knee rehabilitation using VSA was presented in [14]. The proposed system is shown in Fig. 1. To perform the desired routines for knee rehabilitation, a proper control strategy for this rehabilitation system is required. The device consists of a 5-bar-linkage closed-chain system in which we wish to control the knee joint; therefore, to gain insight into the control strategy and its requirements, without losing generality, we first consider a one-degree-of-freedom (DoF) model that simplifies the analysis. The proposed approach is designed taking into account control specifications from the patient's point of view. The idea is to perform a desired motion during the knee rehabilitation therapy, according to the routines proposed by the physiotherapist.
This control strategy is aimed to maintain the intrinsic dynamic properties of the system in order to exploit its advantages.


Results, both in simulation and from the experimental trials performed with the controlled one-DoF system, show that the proposed global control strategy, which combines feedback and feed-forward position control, satisfies the conditions presented in [5]. It indeed maintains the intrinsic softness of the system while achieving the system requirements, i.e. compensating the disturbances due to the leg's weight with the feed-forward action and keeping the system stable. When the output approaches 80% of the reference, we switch the control parameters in order to have a lower velocity near the reference.

2 Device Modeling Approach

In this chapter, we present an approach to a controller for a soft-actuated rehabilitation device such as the one in Fig. 1.

Fig. 2. Model representation for knee rehabilitation [16].

To gain insight into the control system behavior, and taking into account that our scope is to control the knee joint of the proposed assistive rehabilitation device, we consider a one-DoF soft actuated system, which represents the knee joint, as shown in Fig. 2. Let us define the system model as

q̇ = ω_q,
ω̇_q = (1/M)(−k q − B ω_q + k θ + τ_load)    (1)

where q and ω_q = q̇ are the link (calf) angular position and angular velocity, respectively; θ is the rotor (knee joint) angular position, which is an input to our system; τ_load is the load torque; and M, k, B are the mass, stiffness and damping of the system, respectively. The state vector is defined as X = [q, ω_q]^T. Our aim is to control the link angular position q of the knee rehabilitation device, to comply with the defined repetitive motions of the physical therapy exercises. It is worth mentioning that the soft actuators used in the design and


implementation of the rehabilitation system have a low-level PD control that guarantees that the desired rotor’s angular position is indeed the input angular position θ.
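As an illustration of the model in (1), the following sketch integrates the one-DoF dynamics for a constant motor position θ. The parameter values are placeholders (only B = 1 and a joint stiffness of about 6 Nm/rad are reported later in the chapter), so this is not a reproduction of the authors' simulation.

```python
import numpy as np
from scipy.integrate import solve_ivp

M, k, B = 0.5, 6.0, 1.0           # placeholder inertia, stiffness, damping
tau_load = 0.0                     # no external load in this sketch

def soft_joint(t, x, theta):
    """One-DoF soft-actuated joint of Eq. (1); x = [q, omega_q]."""
    q, wq = x
    return [wq, (-k * q - B * wq + k * theta + tau_load) / M]

theta_cmd = np.pi / 2              # constant rotor (motor) position command
sol = solve_ivp(soft_joint, (0.0, 20.0), [0.0, 0.0], args=(theta_cmd,), max_step=0.01)
print("final link angle q ~", round(sol.y[0, -1], 3), "rad (tends towards theta)")
```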

3 Combined Feedback and Feedforward Control Strategy

Fig. 3. Control structure used in the rehabilitation system [16].

In this section we present the design of the combined feedback and feed-forward strategy. A block diagram of the control structure is presented in Fig. 3. We first describe the control strategy requirements based on the desired behavior of the patient's knee motion; then, we present the proposed control law. This section is entirely based on the work presented in [16].

3.1 Control Requirements

Let us describe Fig. 3. According to physical therapy indications [17], the control law must keep the output position close to the desired reference with a smooth approach, so the Reference Block provides a saturated reference. In this case, the reference is the motor's angular position θ. The saturated reference must be smooth in order to reach it with a soft response. Furthermore, the Feedback Block has an integral action, which ensures that the system follows the desired reference and that disturbances are rejected. The Decision System Block acts as follows. When the knee joint comes within a 20% threshold of the reference value, the speed with which it approaches the reference is lowered. This is done because, when performing rehabilitation training, the patient reduces the velocity of the movement at the critical angular positions, according to the physiotherapist's criteria. If the force exerted by the patient exceeds a threshold (e.g. because the patient is in pain), the system sends a zero reference to relax the patient's leg.
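A minimal sketch of the decision logic just described follows; the force limit and the "slow" gain set are assumptions for illustration, not the authors' implementation (only the nominal experimental gains are taken from the text).

```python
def decision_block(q, q_ref, patient_force, force_limit=5.0,
                   gains_fast=(-0.1, -0.2, 0.08), gains_slow=(-0.05, -0.1, 0.04)):
    """Select the reference and gain set as described for the Decision System Block.

    - if the measured interaction force exceeds force_limit, command a zero
      reference so the leg can relax (assumed safety behaviour);
    - within 20% of the reference (i.e. past 80% of it), switch to a slower
      gain set so the joint approaches the target gently.
    """
    if abs(patient_force) > force_limit:
        return 0.0, gains_slow                  # relax the leg
    if abs(q_ref - q) <= 0.2 * abs(q_ref):      # inside the 20% threshold
        return q_ref, gains_slow
    return q_ref, gains_fast

# Example: joint at 85% of a pi/2 reference, low interaction force.
print(decision_block(q=0.85 * 1.571, q_ref=1.571, patient_force=1.0))
```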




3.2 Control Design

Now, let us focus on the control design for the knee rehabilitation assistive device. Based on pole placement, we use low gains to approach the natural behavior of human motion when performing exercises to stretch and strengthen the knee muscles, guided by a physiotherapist. Consider the linear model of the system in (1). Let k1 and k2 be, respectively, the gain of the angular position q and the gain of the angular velocity ω_q. Furthermore, ki is the gain of ξ, where ξ̇ = R − q, and R is the desired knee angular position of the rehabilitation system. Then, the feedback control law is defined by

θ = k[k2 ω_q + k1 q + ki ξ].    (2)

Let us assume that the system is in equilibrium, that is, τ_load = 0. Then, the closed-loop dynamics are

q(s)/R(s) = ki k / (M s³ + s²(B − k k2) + s(k − k k1) + ki k)    (3)

Here we ensure that the closed-loop characteristic polynomial, i.e. det(sI − A*) = 0, is Hurwitz when tuning the control coefficients, so that the system is stable (pole placement). In practice, these coefficients modify the system dynamics, depending on their values. So, to maintain the intrinsic dynamics of the system, the stiffness changes have to be such that the system response stays close to the natural response. According to [5], a low gain of the feedback controller is required in order to have little stiffness alteration in the model. The main difference and contribution of the work presented in [16] is the focus on the control design specifications for a soft actuated knee rehabilitation device (i.e. actuated by SEA or VSA), applying the sufficient condition derived in [5]. To maintain the intrinsic characteristics of the actuation system is to keep the stiffness, defined as ∂T(q−θ,σ)/∂q and proportional to q − θ, where T(q − θ, σ) is the torque due to the compliant element at the joint, and σ is a parameter used to set the joint stiffness in variable stiffness actuators. To minimize the changes in the physical compliance, the closed-loop stiffness value has to remain in a δ-neighborhood of the open-loop value along the system reference signal. Thus, let us consider that our controller is θ(q, q̇, t, σ, r, ξ) as in (2). Then, the partial derivative ∂θ(·)/∂q results in the controller coefficients. Consider also that the natural stiffness along the system reference is defined as ∂T(0,σ)/∂q; then, the sufficient condition to maintain the system stiffness can be verified. In this case it is

| (∂T(0,σ)/∂q)⁻¹ · ∂θ(q, q̇, t, σ, r, ξ)/∂q | ≤ δ,    (4)

where r is the reference trajectory, R is the desired knee angular position, and ξ is the integral action on the error. When ξ̇ = Π(q, q̇, t, σ, r, ξ) exists, (4) becomes

| (∂T(0,σ)/∂q)⁻¹ · (∂θ/∂q + (∂θ/∂ξ)(∂ξ/∂q)) | ≤ δ    (5)




3.3 Controller Simulation

The main idea of (5) is that the coefficients of the feedback action of the controller have to be sufficiently small, so we need to evaluate these coefficients for the controller proposed here. If the right term in (5) is zero, there will be no stiffness variation due to the control action.

Fig. 4. Variation of control parameters - Step response [16].

This is

(∂θ/∂q + (∂θ/∂ξ)(∂ξ/∂q)) / (∂T(0,σ)/∂q) ≤ 0.

Regarding the controller designed in this case, it is true that this expression equals zero, which meets the sufficient condition. To test the controller in simulation, we choose k1, k2 and ki sufficiently small, applying the pole placement method, according to the following criteria. Using a Robust Control Toolbox, we test the system to tune these values, such that with a variation of up to 20% in k1 and k2, and up to 30% in ki, the system is stable and the design specifications are met, i.e. low speed when approaching the desired angular position, while keeping the intrinsic stiffness of the system. Observe Figs. 4 and 5, which show, respectively in the time and frequency domains, that the system remains stable when the control parameters change. In this way, the condition (4) is satisfied, as well as the control objective. Therefore, the natural motion achieved using soft actuation allows assisted knee rehabilitation exercises to be performed. For the design and validation of the derived control law, we use a stiffness preset to a constant value of σ = 6 Nm/rad. Once we have tested the designed controller in simulation, we can implement it on the target system.
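A small sketch of the kind of robustness check described above (the original used a Robust Control Toolbox): it perturbs the controller gains and verifies that the closed-loop characteristic polynomial of (3) stays Hurwitz. The plant parameters M and k are placeholders; only B = 1, σ ≈ 6 Nm/rad and the experimental gains are given in the text.

```python
import numpy as np

# Placeholder plant parameters; nominal gains taken from the experimental section.
M, B, k = 0.5, 1.0, 6.0
k1, k2, ki = -0.1, -0.2, 0.08

def is_hurwitz(k1, k2, ki):
    """Closed-loop characteristic polynomial of (3):
    M s^3 + (B - k*k2) s^2 + (k - k*k1) s + ki*k."""
    coeffs = [M, B - k * k2, k - k * k1, ki * k]
    return np.all(np.real(np.roots(coeffs)) < 0.0)

rng = np.random.default_rng(0)
stable = all(
    is_hurwitz(k1 * (1 + rng.uniform(-0.2, 0.2)),
               k2 * (1 + rng.uniform(-0.2, 0.2)),
               ki * (1 + rng.uniform(-0.3, 0.3)))
    for _ in range(1000)
)
print("all sampled gain variations stable:", stable)
```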



Fig. 5. Variation of control parameters - Bode diagram of the system [16].

4 Results and Discussion

In this section we present the controlled system behavior, first from tests done in simulation and then from experimental tests, to show the validity and performance of the combined control strategy.

4.1 Simulation Results

To analyze the results of the control strategy for the rehabilitation device, we design a test in simulation. As presented in [16], we consider that the rehabilitation system can be configured for patient heights h between 1.40 m and 1.90 m and weights W between 40 kg and 90 kg, according to the mean normal adult population. From anthropomorphic proportions we can establish the mean length and weight of the leg as functions of h and W. For the simulations, the parameters correspond to a subject of h = 1.60 m and W = 60 kg. Besides, the damping and stiffness are obtained experimentally, with B = 1. First, let us consider the feedback action. According to the proposed strategy, we define the reference for the angular position that the knee has to reach. When the angular position has achieved 80% of the desired value, the structure of the regulator changes by means of the integral action, making the response slower compared to the system dynamics. It is worth mentioning that the controller was designed such that the stiffness is not affected, so a change in the parameters will maintain this condition. Figure 6 shows the system response to a reference qref = π/2 rad. Notice that the output q follows the reference with zero tracking error. It is worth mentioning that the input to the control system is a saturated reference, so it is close to the real state of the system. We observe that at time t = 190 s, when the output reaches 80% of the final value, the dynamics indeed change and the motion becomes slower, as desired (see point labeled 1 in Fig. 6). Let us assume that after t = 300 s the patient needs to stop the system (e.g. because of pain); we simulate this requirement as a stop. In this case, the system has to go back to an initial configuration in order to relieve pain (see point labeled 2 in Fig. 6).


Fig. 6. Tracking system [16].

Fig. 7. Input to the system and Error [16].

Fig. 8. Tracking system with perturbation [16].


In Fig. 7 we show the control and error signals. Observe that there are neither overshoots nor abrupt changes, so the system remains stable and a smooth motion is guaranteed for the knee joint. Now, let us consider the feed-forward action. The idea of this action is to reject disturbances. We therefore test the system with a step perturbation μ(t), which may represent, for instance, the corresponding component of the leg's weight, an available measurement of the system. The signal μ(t) acts from t = 0 s to t = 100 s. In Fig. 8, the results of the control action are presented. Observe that the system performance with the disturbance is close to that without the disturbance, and that the response recovery starts at t = 0 s (see point labeled 1 in Fig. 8). At the end of the disturbance, at t = 100 s, the system recovers and continues to operate normally, as desired (see point labeled 2 in Fig. 8). Regarding the control signals, due to the compliant behavior of the system they present oscillations, which are properly compensated. These oscillations are due to the natural behavior of the system with soft actuation.

4.2 Experimental Results

Here we describe the main considerations taken into account for the implementation of the designed controller for the One-DoF soft actuated system. Then we present the experimental results obtained from the application of the designed control law in our system (Fig. 9).

Fig. 9. Input to the system and Error with perturbation [16].

Setup Description. For the experimental trials, we use a one-DoF system as shown in Fig. 10. It consists of a one-link system actuated by a VSA (QB advanced actuator, see [3,15] for further details). Table 1 presents the main characteristics of the actuator and link used in the setup. The system allows measuring the motor position θ and the link position q. To control the system we use a Raspberry Pi 3 programmed in Python, with an external real-time clock to guarantee the execution times. The sampling time is Ts = 20 ms for

https://qbrobotics.com/products/qbmove-advanced.


the whole process. The experiments carried out allow us to check the validity and performance of the combined feedback + feed-forward strategy on the one-DoF system. We first test the feedback action to ensure that the system follows a desired link reference position qref(k) at time step k, and then we show the system behavior using the combined strategy under known disturbances, i.e. a known weight that generates a torque and loads the link.

Fig. 10. Physical system.

Table 1. System characteristics.

Range of stiffness (Nm/rad)   0.5–83.5
Range of motion (deg)         ±164
Peak output torque (Nm)       6.8
Motor weight (kg)             0.45
Link weight (kg)              0.2
Link length (m)               0.17
Link width (m)                0.03

Controller Implementation. Referring to Fig. 3, the reference block is implemented using a second order smoothing function H(s),

H(s) = 1 / (s² + 10 s + 1),    (6)

to make the reference smooth. This function is discretized using the backward Euler method (s = (1 − z⁻¹)/Ts). Then the difference equation that defines the feedback control law is

u₁(k) = k₁ qₘ(k) + k₂ q′ₘ(k) + u_t(k),    (7)


where k1 and k2 are the control coefficients tuned to guarantee system stability according to the desired system behavior; qₘ(k) is the measured link position at time step k, q′ₘ(k) is its discrete-time derivative, and u_t(k) is the integral action defined as

u_t(k) = u_t(k − 1) + Ts ki e(k).    (8)

Here ki is the integral control constant and e(k) is the link position error, e(k) = qref(k) − qₘ(k), with qref(k) the desired link position at time step k. For the experiments carried out, ki = 0.08, k1 = −0.1 and k2 = −0.2. In this way, we ensure that the system meets the sufficient condition in (5). The feed-forward action is implemented using a filter transfer function H1(s),

H1(s) = (s + 1) / (1.5 s + 1),    (9)

which, in discrete time, using the backward Euler method with sampling time Ts = 20 ms, is

H1(z) = (1.02 − z⁻¹) / (1.52 − 1.5 z⁻¹).    (10)

Considering that H1(z) = u2(z)/D(z), where D(k) is the known disturbance (weight), the feed-forward control action is defined by

u2(k) = −K (a1 D(k) − a2 D(k − 1) + a3 u2(k − 1)),    (11)

where a1 = 0.6710, a2 = 0.6578, a3 = 0.9868, and K is the gain defined by the unit compensation. Finally, the combined feedback + feed-forward control law is given by u(k) = u1(k) + u2(k).
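The difference equations (7), (8) and (11) translate directly into a discrete-time control loop. The sketch below assembles them; the smoothing of the reference through H(s) and the conversion of u(k) into a motor command are omitted, and the feed-forward gain K is a placeholder since only "unit compensation" is stated.

```python
Ts = 0.02                      # sampling time [s]
k1, k2, ki = -0.1, -0.2, 0.08  # feedback gains from the experiments
a1, a2, a3 = 0.6710, 0.6578, 0.9868
K = 1.0                        # placeholder feed-forward gain ("unit compensation")

class CombinedController:
    """Discrete feedback + feed-forward law of Eqs. (7), (8) and (11)."""
    def __init__(self):
        self.u_t = 0.0         # integral action u_t(k-1)
        self.q_prev = 0.0      # previous link position, for the discrete derivative
        self.D_prev = 0.0      # previous disturbance sample D(k-1)
        self.u2_prev = 0.0     # previous feed-forward output u_2(k-1)

    def step(self, q_ref, q_meas, D):
        q_dot = (q_meas - self.q_prev) / Ts                         # discrete derivative
        self.u_t += Ts * ki * (q_ref - q_meas)                      # Eq. (8)
        u1 = k1 * q_meas + k2 * q_dot + self.u_t                    # Eq. (7)
        u2 = -K * (a1 * D - a2 * self.D_prev + a3 * self.u2_prev)   # Eq. (11)
        self.q_prev, self.D_prev, self.u2_prev = q_meas, D, u2
        return u1 + u2                                              # u(k) = u1(k) + u2(k)

ctrl = CombinedController()
print(ctrl.step(q_ref=1.571, q_meas=0.0, D=0.0))   # first control sample
```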


Fig. 11. Tracking a reference with feedback controller action. At the point marked with 1 the system is approaching the reference, so the controller action slows down to achieve the reference (point 2).

In Fig. 11, we show the system behavior when tracking a desired reference qref = π/2. Notice that the system behaves as expected from the simulation results (see Fig. 6). Results of the system behavior under the feedback controller action are also shown in Fig. 12; in this case, we changed the desired link position qref using steps of 35°. From this test, we observe that the system indeed


Fig. 12. System output changing reference along the time.

stabilizes as expected. The system reacts slowly (stabilization time near 50 s), which is a desired feature for the rehabilitation system. It is possible to tune the controller coefficients in order to obtain faster responses. Figure 13 shows the system behavior when there is a known disturbance D(k) = 0.17 N, under the feedback control action only. Notice that the feedback action makes the system stabilize after the disturbance; however, while the disturbance acts, there is an undesired peak of 30°, which would affect a patient using the rehabilitation device. Then, we add the feed-forward action. Figures 14 and 15 show the system response under the action of the combined strategy. Two tests were performed, i.e. when the weight acts against the movement and when it has a component that supports the movement. Each time, the reference changes with steps of 45°, as in the simulations. Notice that when the combined strategy is used, the peaks due to the disturbance are reduced (to approx. 4°), which may not harm the patient or may be absorbed by the system.


Fig. 13. Controlled system output with 0.17 N load perturbation.


Fig. 14. System output with load perturbation under feedback + feed-forward action, against the movement.


Fig. 15. System output with load perturbation under feedback + feed-forward action, supporting the movement.

5 Conclusion

In this chapter, we have presented a global control strategy that combines feed-forward and feedback actions for a one-DoF system that models the knee joint of a soft actuated knee rehabilitation device. The designed rehabilitation device is a 5-bar-linkage system; however, we have presented an approach to control a soft-actuated one-DoF system, without losing generality, in order to gain insight into the control strategy, its requirements, and its performance when carrying out the desired motion during physical knee rehabilitation. With the proposed combined strategy, we have shown in simulation and through experimental results that the intrinsic stiffness of the system is kept, while achieving stability, low velocity when approaching the reference, and disturbance rejection. The feed-forward strategy compensates loading disturbances while the feedback strategy acts within a defined threshold to maintain the stiffness of the system. When approaching


80% of the reference, we switch the control parameters in order to have a lower velocity near the reference, meeting the desired specifications. In future work, we will implement the control strategy proposed here in the 5-bar-linkage rehabilitation device.

Acknowledgment. This work is funded by Universidad Militar Nueva Granada - Vicerrectoría de Investigaciones, under the research grant for project IMP-ING-2291, entitled 'Diseño de un prototipo para rehabilitación de rodilla mediante el uso de actuadores flexibles' ('Design of a knee rehabilitation prototype using flexible actuators').


Controlling Off-Road Bi-steerable Mobile Robots: An Adaptive Multi-control Laws Strategy

Roland Lenain1(B), Ange Nizard2, Mathieu Deremetz1, Benoit Thuilot2, Vianney Papot1, and Christophe Cariou1

1 Irstea, UR TSCF, 9 avenue Blaise Pascal, CS 20085, 63172 Aubière, France
[email protected]
2 Université Clermont Auvergne, CNRS, SIGMA Clermont, Institut Pascal, 63000 Clermont-Ferrand, France
http://www.irstea.fr/la-recherche/unites-de-recherche/tscf/

Abstract. This paper proposes a path tracking strategy for wheeled mobile robots of type {1, 2} (i.e. equipped with two steering axles), with the aim of ensuring the convergence of the front and rear control points along the same trajectory, thereby reducing the space required to achieve maneuvers. The proposed approach considers the front and rear steering axles as two separate systems with their own control variables: the front and the rear steering angles. The problem of managing two steering axles is solved without considering an explicit control of the robot's orientation, nor a relationship between the two steering angles, which is generally not an optimal approach. The proposed control laws are based on adaptive and predictive control techniques in order to address phenomena encountered when moving in unstructured contexts, such as poor grip conditions and low-level and inertial delays. As a result, this control algorithm enables accurate control of bi-steerable mobile robots, while increasing their maneuverability. This is particularly suitable for off-road applications, such as agriculture, where potentially large robots have to move in cluttered environments and face low grip conditions.

Keywords: Wheeled mobile robots · Four wheel-steered robot path tracking · Adaptive control

1 Introduction

Mobile robotics arises as a promising solution to different societal problems in many areas such as transportation [7], civil safety and defense, or agriculture [3]. Beyond acting in hazardous environments or situations instead of human operators, the automation of vehicles may offer new potentialities thanks to extended control capabilities. This is notably the case of bi-steerable vehicles. With respect to classical car-like vehicles [1], they theoretically offer enhanced functionalities,


such as increased maneuverability or the possibility to control the actual orientation independently from the position. Four wheel-steered (4WS) vehicles are generally used in order to reduce the radius of curvature, such as within warehouses where lots of goods are stored or within agricultural fields to minimize maneuver areas. Nevertheless, the manual control of such vehicles may be difficult for a human driver. In autonomous modes, the control of 4WS robots is generally performed by imposing the rear steering angle as a function of the front steering one, as proposed in [12]. Such an approach makes it possible to decrease the radius of curvature without, however, achieving a specific task. Beyond motion control, one can take advantage of this additional degree of freedom to achieve a secondary task. In the automotive industry, this can be for instance the compensation of sideslip angles and the improvement of the vehicle lateral dynamics [14]. In the framework of mobile robotics, the control of such vehicles may reduce the space required to maneuver, such as for transportation in cluttered environments [10]. It is also particularly interesting in agriculture to optimize half-turn maneuvers within headlands, since these are usually low-yield field areas due to high soil compaction, or to ensure the proper orientation of an implement. In an off-road context, a control strategy has for instance been proposed in [6] to control independently the heading and the position of the robot in the framework of path tracking [13]. In this framework, a path is preliminarily defined and a reference point on the robot has to follow this trajectory (for a car-like mobile robot, this point is usually chosen as the middle of the rear axle, since it presents interesting properties, see [4]). When only the front steering angle is actuated, the middle of the front axle is necessarily outside of the turn.

In this paper, the control of bi-steerable cars is investigated to ensure that front and rear axles follow the same trajectory. For that purpose, the robot is viewed as two subsystems, as introduced in [8]: one describes the motion of the rear axle of the robot, and the other one is dedicated to the front axle. These two models are linked by the rear steering variable, which is regarded as a measured parameter for the front sub-model. Moreover, extended kinematic models are considered, enabling to account for non-perfect conditions thanks to model-based adaptive algorithms [2]. This approach permits deriving independently two control laws for the front and the rear steering angles. As a result, two points on the robot (the middle of the two front wheels and the middle of the two rear wheels) can track the same trajectory.

The paper is organized as follows: first, the modeling of the robot is proposed, based on an extended kinematic approach recalled in Sect. 2. The on-line estimation of the required variables, based on observation techniques, is then briefly detailed in Sect. 3. This observer permits to know all the variables introduced in the model. As a result, this model may then be used to build two independent control laws, which are derived in Sect. 4. The addition of a predictive layer is also briefly introduced by considering the future curvature of the reference path. Next, after describing in Sect. 5 how to avoid lock-up risks due to steering saturation encountered in practice, the performances of the proposed approach


are then tested by performing full-scale experiments with two different off-road mobile robots. The results shown in Sect. 6 highlight the efficiency of the proposed algorithms.

2 Modeling of the Robot

2.1 Assumptions and Notations

In this paper, path tracking for a four-wheeled steered robot is considered. The robot is viewed as a bicycle model (see [11]), such as depicted in Fig. 1. The two front (respectively rear) wheels are reduced to a unique front (resp. rear) wheel with an equivalent steering angle denoted δF at point F (and resp. δR at point R). This is made possible assuming that the robot has a longitudinal axis symmetry, moves on a plane, and that all the wheels are in contact with the ground.

Fig. 1. Extended kinematic model of a two-wheeled steered robot.

Such a robot belongs to the type {1,2} in the classification proposed by [4]. Basically, the control variables attached to this kind of robots are:

• v: the velocity imposed at the rear axle,
• δF: the front steering angle,
• δR: the rear steering angle.

In the path tracking framework, the velocity is supposed to be strictly positive (v > 0) or at least non-null. Such a condition had to be strictly satisfied in the previous version proposed in [8]. In this version, it is no longer mandatory, since the objective is to ensure the convergence of the robot to its trajectory


after a settling curvilinear distance. Therefore, the control of the robot speed is not considered in this paper and only control expressions for the front and the rear steering angles are derived, with the value of v regarded as a measured, potentially varying parameter. In order to achieve the path tracking task, one considers the following state variables:

• y_R is the tracking error: it is the shortest distance between the point R and the trajectory Γ to be followed.
• θ̃ is the angular deviation, denoting the difference between the robot's heading θ and the orientation of the trajectory Γ at the closest point of R belonging to Γ. In the sequel, this point is identified by its curvilinear abscissa s along the trajectory Γ and c(s) denotes the curvature of Γ at this point.
• y_F is a second lateral deviation defined at the center of the front axle F, separated from the point R by the wheelbase L. Specifically, it is the distance between F and the trajectory Γ in the same direction as the projection of R onto Γ (see Fig. 1).

These variables are sufficient to describe the robot motion when ideal "rolling without sliding" conditions are assumed. Nevertheless, when moving at high speed or on non-ideal ground, such an assumption is not satisfied. In such a case, the directions of the speed vectors at the wheel centers no longer belong to the wheel planes. As detailed in [5], an angle appears for each virtual wheel (front and rear), which is named sideslip angle and denoted in this paper βF for the front wheel and βR for the rear one. In order to describe properly the robot motion, these two angles have to be incorporated into the robot model. They are hardly measurable directly and are here assumed to be estimated indirectly by the observer described in Sect. 3. Classically, the objective of path tracking is to ensure the convergence of the tracking error to zero (i.e. y_R → 0). Thanks to the two steering angles, a second objective is here assigned: the convergence of the front lateral deviation to zero (i.e. y_F → 0). Assuming that sideslip angles are known, the kinematic model of the bi-steerable robot depicted in Fig. 1 can be established and then used to derive the control expressions.

2.2 Classical Motion Equations

As detailed in [15], when classical path tracking is addressed, the motion equations for the extended model depicted in Fig. 1 may be advantageously expressed as:

$$
\begin{cases}
\dot{s} = v\,\dfrac{\cos\tilde{\theta}_2}{1 - c(s)\,y_R}\\[4pt]
\dot{y}_R = v\,\sin\tilde{\theta}_2\\[4pt]
\dot{\tilde{\theta}} = v\,\big[\cos(\delta_R+\beta_R)\,\lambda_1 - \lambda_2\big]
\end{cases}
\tag{1}
$$

with $\lambda_1 = \dfrac{\tan(\delta_F+\beta_F) - \tan(\delta_R+\beta_R)}{L}$, $\lambda_2 = \dfrac{c(s)\,\cos(\tilde{\theta}+\delta_R+\beta_R)}{1 - c(s)\,y_R}$ and $\tilde{\theta}_2 = \tilde{\theta} + \delta_R + \beta_R$.
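To make model (1) concrete, the following minimal Python sketch evaluates its right-hand side. It is only an illustration of the equations above, not code from the authors; the function name and argument names are ours.

```python
import math

def extended_kinematic_model(y_R, theta_t, v, delta_F, delta_R,
                             beta_F, beta_R, c_s, L):
    """Right-hand side of model (1): returns (s_dot, yR_dot, theta_t_dot).

    theta_t is the angular deviation (theta tilde), c_s the path curvature
    c(s) at the closest point, L the wheelbase.
    """
    theta_t2 = theta_t + delta_R + beta_R
    lam1 = (math.tan(delta_F + beta_F) - math.tan(delta_R + beta_R)) / L
    lam2 = c_s * math.cos(theta_t2) / (1.0 - c_s * y_R)
    s_dot = v * math.cos(theta_t2) / (1.0 - c_s * y_R)
    yR_dot = v * math.sin(theta_t2)
    theta_t_dot = v * (math.cos(delta_R + beta_R) * lam1 - lam2)
    return s_dot, yR_dot, theta_t_dot
```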


Model (1) is singular if 1 − c(s) y_R = 0. However, if the robot is properly initialized and the steering laws properly tuned, the rear lateral deviation y_R never drifts up to the radius of curvature 1/c(s) of the path to be followed. So, in practice, the model singularity is never met. This model is attractive, since it can be turned into a linear form, as has been shown in [11] (the extension of kinematic models with the sideslip angles does not change the structure of these models). As achieved in [9], model (1) permits deriving a front steering control law enabling the convergence of the rear lateral deviation y_R to zero, whatever the value of the rear steering angle δR, provided that its time derivative δ̇R can be neglected with respect to the settling time of the proposed front steering control law. In the abovementioned reference, the rear steering angle is then used to servo the angular deviation. While this permits controlling the heading of a robot independently from its lateral position, it does not explicitly control the front axle position. As this paper aims at ensuring the convergence of the position of both front and rear axles (points R and F) to the reference path Γ, a second model is investigated.

2.3 Modeling Robot Motion in Two Subsystems

In this paper, the objective is to ensure that both steering axles converge to and stay on the desired trajectory Γ. To meet this aim, two models for the derivatives of the tracking errors y_F and y_R are developed from the representation depicted in Fig. 1. In [8], time derivatives were considered. This leads to control performances expressed as settling times. Moreover, both control laws then depend on the velocity v and are therefore singular when the speed is close to zero. However, in path tracking applications, control performances expressed as settling distances are usually preferred, so that the imposed behavior is independent from the robot velocity. As a result, derivatives with respect to the curvilinear abscissa s are considered here. The dynamics of the rear lateral deviation y_R is obtained directly from the first two lines in Eq. (1):

$$\frac{d y_R}{ds} = \big(1 - c(s)\,y_R\big)\tan\tilde{\theta}_2 \tag{2}$$

With respect to the front axle, it can first be derived from Fig. 1 that:

$$y_F = y_R + L\sin\tilde{\theta} + e \tag{3}$$

(in view of sign conventions, e is negative in the configuration shown in Fig. 1). Next, assuming that the trajectory can locally (at any curvilinear abscissa s) be viewed as a circle of radius 1/c(s), the deviation e can be expressed as:

$$e = \frac{1}{c(s)}\big(1 - \cos\alpha\big) \tag{4}$$

Controlling Off-Road Bi-steerable Mobile Robots

α = arcsin(L cos θ˜ c(s))

349

(5)

Since an expression for e has been obtained, the derivative of the lateral front deviation can be computed. In view of actuator reactivity, it can be considered that e is slow-varying with respect to the possible variations of y F , so it comes from (3) that: d yR d θ˜ d yF = +L cos θ˜ (6) ds ds ds Using model (1), Eq. (6) can be developed as:  d yF (1 − c(s) y R )  ˜ sin θ2 + L cos θ˜ (cos(δR + βR ) λ1 − λ2 ) = ds cos θ˜2

(7)

The definitions of λ1 , λ2 and θ˜2 given in model (1) show that these variables are explicitly linked to the control variables δF and δR . Therefore, Eqs. (2) and (7) provide the relation between the control variables and the derivatives of the lateral deviations with respect to the curvilinear abscissa. These two equations constitute the model that will be used in Sect. 4 to design the control laws. However, the sideslip angles βF and βR have to be known. This is the aim of the observer described hereafter.

3

Observation of Sideslip Angles

In order to design control laws for the convergence of the lateral errors to zero, the knowledge of the sideslip angles is mandatory. There is no perception system allowing to measure directly such variables. The sensors on-boarded on the robot permit usually to measure the position, the orientation and the velocity of the robot at a given point, here R. As it has been shown in [9], an observer may be built in order to estimate the two sideslip angles from the data supplied by such a perception system. Since this observer is here used, a short presentation is given below. More details can be found in [9]. 3.1

Observer State

In order to achieve the indirect estimation of sideslip angles, let us consider the state vector ξ composed of the robot lateral deviation, its relative orientation and the two sideslip angles. Relying on model (1), its time-derivative (since derivatives with respect to time are more appropriate to account for sliding occurrence) can be written as:   f (ξdev , ξβ , v, δF , δR ) ξ˙ = (8) ξ˙ = dev 02×1 ξ˙β where ξ is split into two sub-states:

350

R. Lenain et al.

˜ T gathers the deviations of the robot with respect to the trajec• ξdev = [y R θ] tory Γ . These variables can be measured with the common robot perception systems. • ξβ = [βF βR ]T is composed of the sideslip angles, to be estimated. The function f (ξdev , ξβ , v, δF , δR ) is constituted of the two last lines in model (1) and the lower term has been set to 02×1 , since no equation is available in a kinematic representation for the evolution of β˙ F and β˙ R . The objective of the observer is to ensure the convergence of the complete observed state ξˆ to the actual one ξ, when measuring only the sub-state ξdev . As a result, only the observation error related to the deviations, namely ξ˜dev = ξˆdev −ξdev , is available and may be used. 3.2

Observer Equations

It has been shown in [9] that the observer defined by Eq. (9) permits the convergence of the whole observed state ξˆ to the actual one ξ:

˙ ξˆdev = f (ξdev , ξβ , v, δF , δR ) + αdev (ξ˜dev ) (9) ˙ˆ ξβ = αβ (ξ˜dev ) where αdev and αβ are functions of ξ˜dev , the part of the observation error that can actually be measured. They are defined as follows:

αdev (ξ˜dev ) = Kdev ξ˜dev T (10) ∂f αβ (ξ˜dev ) = Kβ ∂ξ (ξ , ξ , v, δ , δ ) ξ˜dev dev β F R β with Kdev a 2 × 2 positive diagonal matrix and Kβ a positive scalar, that both permit to tune the settling time of the observer. Thanks to observer (9)–(10), all the variables in model (1) are known and control design can now be addressed in the next section.

4

Control Laws

In the sequel, estimated sideslip angles βˆR and βˆF are regarded as the actual ones and expressions (2) and (7) proposed for the derivatives of the rear and front lateral deviations with respect to the curvilinear abscissa are used successively to design control laws ensuring that points R and F track the reference path Γ . 4.1

Rear Steering Angle

Let us first consider the control of the rear point R. The objective is to ensure the convergence of the rear lateral deviation y R to zero. The evolution of this variable is related to the rear steering angle δR by Eq. (2), through the intermediate

Controlling Off-Road Bi-steerable Mobile Robots

351

variable θ˜2 . The convergence y R → 0 may be ensured by imposing the following differential equation: d yR = −KR y R (11) ds with KR a positive scalar defining a settling distance for the exponential convergence of y R to zero imposed by (11). By injecting the expression (2) for the derivative of the rear lateral deviation into (11), and using the definition of θ˜2 , it can be obtained that:

 −KR y R δR = arctan (12) − θ˜ − βˆR 1 − c(s) y R Expression (12) constitutes the control law on the rear steering angle. It ensures that the lateral deviation y R satisfies differential Eq. (11), implying its convergence to zero. 4.2

Front Steering Angle

Once the rear steering angle is controlled according to expression (12), the subsequent objective is to design a control law for the front steering angle in order that the front point (denoted F in Fig. 1) converges to the desired trajectory Γ . In other words, the objective is to ensure the convergence of the front lateral deviation to zero: y F → 0. The evolution of this variable is related to the front steering angle δF by Eq. (7), through the intermediate variable λ1 . As previously, the convergence y F → 0 may be ensured by imposing the following differential equation: d yF = −KF y F (13) ds with KF a positive scalar defining a settling distance for the exponential convergence of y F to zero imposed by (13). By injecting the expression (7) for the derivative of the front lateral deviation into (13), and using the definition of λ1 , it can be obtained that: δF = arctan



L λ2 ˆ ) cos(δR +β R



˜ KF y F cos θ 2 ˆ ) cos θ ˜ (1−c(s) y R ) cos(δR +β R



˜ sin θ 2 ˆ ) cos θ ˜ cos(δR +β R

 + tan(δR + βˆR ) − βˆF

(14)

All the variables in expression (14) are known: the rear steering angle δR is computed from (12) and the sideslip angles βˆF and βˆR are estimated on-line from observer (9). Therefore expression (14) can serve as the control law for the front steering angle. It is properly defined provided that δR + βˆR = π2 [π] and θ˜ = π2 [π]. Since the range of the steering angles is mechanically limited and the sideslip angles never approach large values, the first condition is always satisfied. Next, if the robot is properly initialized and the steering laws properly tuned, the robot never heads perpendicularly to the reference trajectory, so the second condition is also satisfied. Expression (14) can therefore be used without any restriction and ensures that the lateral deviation y F satisfies differential Eq. (13), implying its convergence to zero.

352

4.3

R. Lenain et al.

Curvature Anticipation

Control performances may however be depreciated when the robot moves at high speed, since the delay induced by the steering actuator has not been taken into account within model (1). Specifically, overshoots may transiently be recorded when the curvature of the reference path presents fast variations. This difficulty has been addressed in [9] by adding a predictive layer onto the front steering control law. The same approach can be considered in the framework of this paper. The idea is to take advantage of the knowledge of the trajectory to be followed in order to anticipate curvature variations. The reference path curvature c(s) appears twice within the front steering control law (14). The term (1 − c(s) y R ) is not crucial: when the robot converges to the reference trajectory (y R → 0), its influence vanishes (1 − c(s) y R → 1). The path curvature c(s) appears also within the term λ2 . This term is essential, since it is the preponderant one when the robot has converged to the reference trajectory: if the robot tracks the path Γ (y R and y F stay equal to 0), then θ˜2 = 0 (in view of the second line in model (1)) and injecting these 3 null values into control law (14) leads to:   L λ2 (c(s)) ˆR ) − βˆF δF = arctan cos(δ + tan(δ + β (15) R +βˆ ) R

R

or, if λ2 (c(s)) is replaced by its expression when y R = y F = θ˜2 = 0:   L c(s) + tan(δR + βˆR ) − βˆF δF = arctan cos(δR + βˆR )

(16)

In other words, in order for the robot to go on tracking the reference trajectory, its radius of curvature must be equal to the radius of curvature of this trajectory. In view of (15), the term λ2 (c(s)) is instrumental to steer appropriately the front axle so that this condition is met. When the curvature of the reference trajectory is fast varying, the desired value (15) for the front steering angle cannot be applied instantaneously due to the delay in steering actuation. As a consequence, the robot slightly drifts from the trajectory and an overshoot is recorded. In order to reduce this effect, the curvature of the reference trajectory must be anticipated to compensate for the steering delay, denoted Ts . Since a prediction of the curvilinear abscissa covered by the robot during the settling time of the steering actuator is: sH = v Ts

(17)

the term c(s) has to be substituted with c(s + sH ) within the expression of λ2 (c(s)), leading to the following predictive expression for the front steering control law:  λ2 (c(s+sH )) KF y F cos θ˜2 δFP red = arctan Lcos(δ − ˆ (1−c(s) y R ) cos(δR +βˆR ) cos θ˜ R +βR ) (18)  sin θ˜2 − + tan(δR + βˆR ) − βˆF cos(δR +βˆR ) cos θ˜

Controlling Off-Road Bi-steerable Mobile Robots

353

This anticipation has also an action on the control of the rear steering angle: the ˜ The anticipation front steering angle acts directly on the robot orientation θ. introduced in the front steering control law reflects therefore on the evolution of the robot orientation, so that in view of (12) the rear steering control law is also somehow anticipated.

5

Prevention of Steering Angle Saturations

On an actual robot, the mechanical design may limit the steering angles: |δ{F,R} | < δSat . The proposed control laws do not take into account this contraint and may then send values leading to saturation for rear and front steering angles. One situation is especially critical: when both steering angles are saturated with the same sign. As it is highlighted in Fig. 2, the robot moves then crabwise (δR = δF = δSat or δR = δF = −δSat ): it has lost its ability to turn and keeps on drifting away from the reference trajectory. This constitutes a failure, since the robot may not reach the trajectory Γ anymore. The other situations (only one steering angle is saturated or both of them are saturated, but with opposite signs) are less harmful: a transient error will be recorded, but since the robot can still turn (δR = δF ), the robot angular deviation with respect to the reference trajectory can decrease, so that eventually the values delivered by the steering control laws are expected to return within the authorized range (|δR | < δSat and |δF | < δSat ) and ensure again the convergence of the robot to the reference trajectory.

Fig. 2. Robot experiencing a failing test due to steering saturation.

In order to avoid the situation depicted in Fig. 2, a first possibility may consist in imposing a saturation value for the rear angle lower than for the front steering angle: the ability of the robot to turn when both steering angles are saturated is then preserved, but it is not an optimal solution since the robot

354

R. Lenain et al.

turning capability is reduced (the full range of the rear steering angle values is not used). In order to restrict as little as possible the robot turning capability, it has been preferred here to reduce the rear steering angle only when a saturation with the same sign occurs for the front steering angle. Specifically, if δR and δF are the rear and front steering angle values computed from the rear and front  defined as steering law (12) and (18), it is proposed to substitute δR with δR follows:   δR = δR if |δF | < δSat (19)  δR = δR − sign(δF )(|δF | − δSat ) otherwise Rear steering law (19) guarantees that the failure situation shown in Fig. 2 can  | < |δF |, so that the robot can still no longer occur: if |δF | = δSat , then |δR turn and converge back to the reference trajectory. The rear lateral deviation y R is temporarily not controlled when |δF | = δSat , but as soon as |δF | < δSat ,  = δR ). convergence is restarted (since then δR

6 6.1

Experimental Results Experimental Setup

Two four-wheel drive robots, depicted in Fig. 3, are used in order to highlight the performances of the proposed approach. They are fully electric and equipped with four independent motors controlling the wheel speeds and additional motors controlling the front and rear steering angles. Their main properties are listed in Table 1. Table 1. Main properties for each robot used in experiments. Adap2E RobuFAST Mass (kg)

650

525

Wheelbase (m)

1.38

1.2

Track (m)

0.7

1.22



Steering angle saturation ( )

45

Steering angle settling time (s) 0.750

22 0.270

One of the main differences in these robots lies in the steering properties: the saturation threshold is much higher for Adap2E, when the steering angle settling time leads RobuFAST to be the most reactive. As it is shown in Fig. 3, the main on-boarded sensor is an RTK-GPS settled at the vertical of the rear axle (above the point R depicted in Fig. 1). It supplies a position accurate to within ±2 cm, while the heading is computed thanks to a Kalman filter mixing the GPS heading data and the robot odometry. Connected to the reference path, these data permit to feed both the observer and the control laws (18) and (19).

Controlling Off-Road Bi-steerable Mobile Robots

6.2

355

Maneuverability Improvement and Tracking Accuracy

With High Saturation Threshold. The first set of tests consists in the tracking of the trajectory depicted in black line in Fig. 4 with Adap2E robot. The reference path is composed of a straight line part at the beginning, followed by two successive curves with an important curvature and the robot speed is 1 m/s.

Fig. 3. Experimental robots and on-boarded sensors.

Fig. 4. Ref. path (black), robot traj. with front steering (red) or front and rear steering (blue).

In order to show the efficiency of the proposed control algorithms, the autonomous tracking of this path is first realized using only the front steering axle, namely with the previous adaptive and predictive control law proposed in [9]. The trajectory obtained for the middle of the rear axle (point R) is depicted in red in Fig. 4. A second tracking is then achieved using the proposed control laws (18) and (19) for the front and rear steering angles. The results of this test are reported in blue line. One can see in Fig. 4 that in both cases the successive positions of the point R are almost superimposed with the reference trajectory. Such a result is logical, since both control strategies have been tuned in a similar

356

R. Lenain et al.

way (same settling distance equal to 10 m) and the high saturation limit is not reach during the tests. The main difference is revealed when considering the lateral errors reported in Fig. 5. One can see in this figure that the tracking errors y R recorded in the two tests are quite similar. They indeed converge to zero and stay within few centimeters despite the harsh curves (between abscissas 15 m–25 m and 40 m– 50 m) and the terrain modification (gravel below 20 m and grass after). On the contrary, front lateral deviations y F are quite different. When controlling only the front steering angle (plain red line), one can indeed check that important deviations may be recorded during the curves (up to 50 cm), since the robot has an important wheelbase. In contrast, the control of both front and rear steering angles proposed in this paper permits to servo properly the front lateral deviation to zero. As a result, y F and y R present about the same accuracy. The front and the rear axles are then both superimposed with the reference trajectory, allowing to reduce the space occupied by the robot. 0.5

Lateral errors (m)

0

-0.5

-1

yF with only front steering yF with proposed control yR with only front steering yR with proposed control

-1.5

5

10

15

20

25

30

35

40

45

50

55

Achived distance (m)

Fig. 5. Adapt2E robot: comparison of tracking errors with and without the use of rear steering.

This reference path is achievable by the Adapt2E robot, even when only the front steering angle is controlled. Therefore, the proposed control laws do not improve the tracking error. This is not the case when the robot has a lower saturation limit. With Low Saturation Threshold. To highlight the improvement in maneuverability, a similar trajectory, depicted in Fig. 6, is tracked with RobuFAST robot, which has a much lower steering saturation. The robot velocity is 2 m/s. As previously, two successive trackings have been achieved: one using only the front steering angle and a second one using the proposed approach. This robot cannot reach the radius of curvature imposed by the reference trajectory if only the front angle is steered (limited to an angle of 20◦ ). As a result, the front steering angle saturates and the robot cannot follow the trajectory properly, as it can be observed in red in Fig. 6. On the contrary, when using the front and rear steering control strategy proposed in this paper, the robot is able to track accurately the trajectory: the successive positions when using control laws (18) and (19), reported in blue in Fig. 6, are superimposed with the reference trajectory.

Controlling Off-Road Bi-steerable Mobile Robots

357

Fig. 6. Ref. path (black), robot traj. with front steering (red) or front and rear steering (blue).

Fig. 7. RobuFast robot: comparison of tracking errors with and without the use of rear steering.

The tracking accuracy obtained with and without the use of the rear steering angle can be investigated in Fig. 7. The red dashed line depicts the tracking error obtained when controlling only the front steering axle. The large errors obtained during the curves (between abscissas 15–25 m and 30–42 m) are due to the saturation of the front steering axle, as shown in Fig. 8: the front steering angle recorded when controlling the robot only with the front steering wheels is reported in dashed red line and it can be noticed that the saturation value δSat = 20◦ is reached during the two curved parts. On the contrary, when using the front and rear steering wheels, the front steering angle (depicted in blue plain line) is decreased thanks to rear steering (turning to the right), depicted in green dotted line. The robot can then stay on the reference trajectory (lateral deviation stays around zero), whatever the curvature of this trajectory and the type of the ground (alternatively gravel and wet grass): the tracking error is shown in blue plain line in Fig. 7 and stays within ±10 cm.

358

R. Lenain et al.

Fig. 8. RobuFast robot: comparison of front steering angles, and rear steering angle (when used).

One can also check that both lateral deviations are properly servoed around zero. This fact is highlighted in Fig. 9, where the reference trajectory is depicted in black line and the trajectories of the front and rear axles (points F and R) are depicted in green dotted line and blue dotted line respectively. One can see that each point of the robot (F and R) follows accurately the reference trajectory, when in contrast a deviation of point F is observed when only the front axle is steered. The robot turning capability has been increased and this permits to reduce the necessary space to achieve harsh maneuvers. Moreover, with the proposed control strategy, the robot is able to reach the trajectory without creating a transient angular deviation: as it can be observed in Fig. 8, the front and rear steering angles rotate in the same direction during initialization phase, which means that the initial lateral deviation is canceled via a crabwise motion.

Fig. 9. RobuFast robot: ref. path (black), front axle position (green) and rear axle position (blue).

Controlling Off-Road Bi-steerable Mobile Robots

359

In order to assess the accuracy obtained during the tracking, Table 2 shows the statistical data related to the absolute value of the lateral deviations, namely the mean and the standard deviation of the absolute value of the errors recorded at points R and F (y R and y F ) in the two experiments. When using the proposed control strategy, it can be noticed that the means and the standard deviations are very close to zero (a few centimeters accuracy), despite the high curvature of the reference path. On the contrary, since saturations occur when only front steering is used, large deviations are recorded, leading to a poor accuracy: the mean value at the front steering point is equal to 32 cm. At the rear steering point, since R is the only point to be explicitly controlled with the front steering law, the mean value is logically lower (22 cm), but also quite unsatisfactory. Eventually, when using the proposed control laws (18) and (19), it can be noticed that the front lateral deviation y F is a little bit less accurate than the rear one y R (7 cm against 4 cm), which is consistent with the small deviations of the point F that can be noticed in Fig. 9. This is a consequence of the fact that the front steering law (18) depends on the rear steering angle δR : when the rear steering law (19) imposes fast variations to the rear steering angle δR , a settling time is mandatory for the front axle to compensate for these variations. As a result, the evolution of point F presents some punctual overshoots. Table 2. Comparison of the absolute values of the lateral deviations recorded during tracking [8]. 2 steering angles 1 steering angle | yR | | yF |

| yR | | yF |

0.04

0.07

0.22

0.32

Standard deviation (m) 0.03

0.05

0.19

0.27

Mean (m)

6.3

Prevention of Steering Saturation

As it has been pointed out in Sect. 5, a bi-steering control strategy may lead to failure situations if both steering wheels saturate with the same sign. The modified control law (19) permits to avoid these failures, but it may depreciate the tracking accuracy. In order to investigate this point, a third experiment is proposed using the RobuFast robot (since this robot has the lowest saturation limit). The reference trajectory is shown in black line in Fig. 10. It is composed of a first straight line, a harsh curve on the right, a long straight line on a lateral slope (as pointed out on the right part in Fig. 10), and finally a harsh curve on the left. Two tracking have been achieved at 2 m/s. A first one with only front steering, leading to the trajectory depicted in red. A second one with the proposed control laws, leading to the trajectory depicted in blue.

360

R. Lenain et al.

Fig. 10. Ref. path (black), robot traj. with front steering (red) or front and rear steering (blue).

One can notice an overshoot during the first curve with both control strategies: the curvature of the reference trajectory is so high that this path is not achievable, even when the two steering angles are used. If the modification proposed in Sect. 5 for the rear steering law is not used (i.e. rear steering law (12) is used instead of (19)), then a failure occurs, the robot is not able to follow the trajectory. The modified rear steering law (19) permits to avoid failure: the stability of the bi-steering approach is preserved and the robot is this time able to converge back to the trajectory and track it during the slope and the second turn. Figure 11 compares the front and the rear lateral errors (y F and y R ) obtained during the two tests. One can see that the tracking errors are quite comparable during the overshoot (between abscissas 10 m and 20 m): even if the tracking accuracy is reduced with control law (19), the error stays comparable with the one obtained with only front steering. One can also see that, thanks to rear steering, the errors are much more stable, since the angular error stays close to zero. In contrast, when the robot is controlled with only the front steering angle, one can see some small oscillations, because of the crabwise angle mandatory to compensate for sideslip angles.

Lateral deviations (m)

1 yR two steering angles yF two steering angles yR one steering angle yF one steering angle

0.5

0

-0.5

-1 10

20

30

40

50

60

70

80

90

Curvilinear abscissa (m)

Fig. 11. Comparison of tracking errors with and without the use of the rear steering angle.

The front and rear steering angles are reported in Fig. 12 in order to illustrate the procedure used to deal with steering saturation. The saturation of the front steering angle (in black line) can clearly be observed between abscissas 10 m

Controlling Off-Road Bi-steerable Mobile Robots

361

and 20 m. As a result, the quantity δe = sign(δF )(|δF | − δSat ) becomes non-null and then reaches significant values from the abscissa 13 m: as a consequence, the rear steering angle (in red line) is drastically decreased. This permits to keep the robot turning and the saturation of the front steering angle then stops at 20 m.

Fig. 12. Comparison of the front and rear steering angles.

At the end of the trajectory, one can see an oscillating behavior due to a loss of accuracy in the RTK-GPS signal, since the robot goes under a tree during the second bend (at abscissa 80 m). Nevertheless, the action of the rear steering angle stabilizes the angular deviation, so that the tracking accuracy is eventually preserved. This is not the case when the robot is controlled with only the front steering angle: oscillations are increased, as it can be observed in Fig. 11. Excepted when the steering angles are saturated, where the robot cannot physically reach the reference path, a high accurate tracking is achieved with the proposed control strategy. The means and standard deviations of the absolute errors are reported in Table 3. One can see the advantage of controlling the rear axle, not only with respect to the front lateral error y F , but also with respect to the tracking error y R , since rear steering offers a stabilizing effect. Table 3. Comparison of the absolute value of the lateral deviations recorded during the tracking. 2 steering angles 1 steering angle | yR | | yF | Mean (m)

7

| yR | | yF |

0.06

0.06

0.08

0.19

Standard deviation (m) 0.06

0.07

0.07

0.27

Conclusion

In this paper, a new control algorithm dedicated to path tracking for four-wheel drive mobile robots is proposed. The two axles of such robots are considered as two separate subsystems, with their own servoing tasks. The rear steering angle is devoted to the convergence of the middle of the rear axle to the trajectory (i.e.


In this paper, a new control algorithm dedicated to path tracking for bi-steerable (four-wheel steered) mobile robots has been proposed. The two axles of such robots are considered as two separate subsystems, each with its own servoing task. The rear steering angle is devoted to the convergence of the middle of the rear axle to the trajectory (i.e. convergence of the tracking error to zero). Next, considering this rear steering angle as a known parameter, the front steering angle is computed so that a front lateral deviation (the distance between the middle of the robot front axle and the trajectory) also converges to zero. The link between front and rear subsystems is ensured when deriving the front steering control law. An adaptive approach is then developed in order to face the potential influence of bad grip conditions, since this work is devoted to off-road applications, such as civil security or agriculture, where high maneuverability is an important feature. In addition, a predictive layer, allowing compensation for actuator delays, has been introduced. Finally, a practical solution to the problem of steering saturation, which may occur when considering actual trajectory control of two-wheel steered robots, is proposed. This solution consists in injecting the difference between the desired and the maximum values of the front steering angle into the value computed for the rear one. This actually permits avoiding lock-up and preserving the path tracking stability without generating important deviations. Experimental results with several robots and in different conditions are proposed in this paper. They highlight the different contributions and explore several conditions which can be encountered in actual applications. Future works are focused on increasing the robot speed. At higher speeds, the extended kinematic model may be insufficient to describe the robot dynamics, while dynamical models may introduce parameters that are difficult to estimate. As a result, mixed kinematic-dynamic models may appear as an attractive alternative. They should permit proposing a control strategy improving the maneuverability of bi-steerable robots at high speed and in off-road conditions.

Acknowledgment. This work has been sponsored by the French government research program "Investissements d'Avenir" through the IMobS3 Laboratory of Excellence (ANR-10-LABX-16-01), by the European Union through the program "Regional competitiveness and employment 2007–2013" (ERDF Auvergne region), and by the Auvergne region. It received the support of the French National Research Agency under grant number ANR-14-CE27-0004 attributed to the Adap2E project (adap2e.irstea.fr) and has also been sponsored through the RobotEx Equipment of Excellence (ANR-10-EQPX-44). We thank them for their financial support.

Notes and Comments. This paper constitutes an extension of the original control algorithms proposed in [8], where the robot model was expressed with respect to time, so that control performances were dependent on the robot velocity. In this paper, models are expressed with respect to the curvilinear abscissa covered by the robot, so that control performances are henceforth described as settling distances and are independent from robot velocity. Furthermore, this paper proposes an anticipation layer to improve the robustness of the path tracking with respect to actuator delays and the rear steering law has been modified to avoid lock-up situations occurring when a saturation is present at the front steering angle. Experiments have also been enhanced to highlight the generality of the approach with respect to the robot configuration and the diversity of the situations encountered in targeted applications.


References

1. Ackermann J (1994) Robust decoupling, ideal steering dynamics and yaw stabilization of 4WS cars. Automatica 30(11):1761–1768
2. Anderson R, Bevly DM (2005) Estimation of tire cornering stiffness using GPS to improve model based estimation of vehicle states. In: IEEE intelligent vehicles symposium (IV), Las Vegas, U.S.A., pp 801–806
3. Blackmore S (2016) Towards robotic agriculture. In: SPIE conference autonomous air and ground sensing systems for agricultural optimization and phenotyping, Baltimore, U.S.A., vol 9866, pp 986603
4. Campion G, Bastin G, Novel B (1996) Structural properties and classification of kinematic and dynamic models of wheeled mobile robots. IEEE Trans Robot Autom 12(1):47–62
5. de Wit CC, Tsiotras P, Claeys X, Yi J, Horowitz R (2003) Friction tire/road modeling, estimation and optimal braking control. In: Nonlinear and hybrid systems in automotive control. Lecture Notes in Control and Information Science. Springer, pp 125–146
6. Cariou C, Lenain R, Thuilot B, Berducat M (2009) Automatic guidance of a four-wheel-steering mobile robot for accurate field operations. J Field Robot 26(6–7):504–518
7. Lefèvre S, Carvalho A, Borrelli F (2015) Autonomous car following: a learning-based approach. In: IEEE intelligent vehicles symposium (IV), Seoul, Korea, pp 920–926
8. Lenain R, Nizard A, Deremetz M, Thuilot B, Papot V, Cariou C (2018) Path tracking of a bi-steerable mobile robot: an adaptive off-road multi-control law strategy. In: International conference on informatics in control, automation and robotics (ICINCO), Porto, Portugal, pp 109–114
9. Lenain R, Thuilot B, Guillet A, Benet B (2014) Accurate target tracking control for a mobile robot: a robust adaptive approach for off-road motion. In: IEEE international conference on robotics and automation (ICRA), Hong Kong, China, pp 2652–2657
10. Lucet E, Micaelli A (2015) State and parameters observation for accurate off-road navigation of wheeled mobile robots. In: IEEE international workshop on advanced robotics and its social impacts (ARSO), Lyon, France, pp 1–6
11. Morin P, Samson C (2000) Practical stabilization of a class of nonlinear systems. Application to chain systems and mobile robots. In: IEEE conference on decision and control (CDC), Sydney, Australia, pp 2989–2994
12. Peng S-T, Sheu J-J, Chang C-C (2004) A control scheme for automatic path tracking of vehicles subject to wheel slip constraint. In: IEEE American control conference (ACC), Boston, U.S.A., pp 804–809
13. Raksincharoensak P, Nagai M, Mouri H (2001) Investigation of automatic path tracking control using four-wheel steering vehicle. In: IEEE international vehicle electronics conference (IVEC), Tottori, Japan, pp 73–77
14. Wagner S, Weiskircher T, Ammon D, Prokop G (2018) Pivot point-based control for active rear-wheel steering in passenger vehicles. Veh Syst Dyn 56(8):1139–1161
15. Yi J, Wang H, Zhang J, Song D, Jayasuriya S, Liu J (2009) Kinematic modeling and analysis of skid-steered mobile robots with applications to low-cost inertial-measurement-unit-based motion estimation. IEEE Trans Robot 25(5):1087–1097

On Randomized Searching for Multi-robot Coordination

Jakub Hvězda1,2, Miroslav Kulich1(B), and Libor Přeučil1

1 Czech Institute of Informatics, Robotics, and Cybernetics, Czech Technical University in Prague, Prague, Czech Republic
[email protected], {kulich,preucil}@cvut.cz
2 Department of Cybernetics, Faculty of Electrical Engineering, Czech Technical University in Prague, Prague, Czech Republic
http://imr.ciirc.cvut.cz

Abstract. In this chapter, we propose a novel approach for solving the coordination of a fleet of mobile robots, which consists of finding a set of collision-free trajectories for the individual robots in the fleet. This problem has been studied for several decades, and many approaches have been introduced. However, only a small minority are applicable in practice because of the properties required: low computational requirements, near-optimal solutions, and completeness. The approach we present is based on a multi-robot variant of the Rapidly Exploring Random Tree (RRT) algorithm for discrete environments and significantly improves its performance. Although the solutions generated by the approach are slightly worse than those of one of the best state-of-the-art algorithms, presented in [23], it solves problems where ter Mors's algorithm fails.

1 Introduction

Due to the increased deployment of robotic systems in many industrial applications, as well as the recent advances in mobile robotics, interest in the research of multi-robot systems has increased. A major problem in this area is the coordination of trajectories of a fleet of robots/agents in these systems. The goal of this problem is to find a path for each robot/agent in the fleet that does not collide with obstacles in the environment or with other robots/agents, given the starting and goal locations, while also minimizing a global cost function. Among such cost functions we can consider, for example, the time it takes for the last robot to reach its destination or the sum of all path lengths. A typical example of a multi-robot path planning application is automated warehouses, see Fig. 1, which use autonomous robots that deliver ordered items from/to given locations. Another example of an industrial field that faces a higher increase in traffic than its actual capacity is airports, where it is necessary to guide the incoming and outgoing planes towards their destinations quickly and safely. This leads to increased reliance on path optimization algorithms that are able to increase their throughput.


Fig. 1. Automated warehouse: G-COM system by Grenzebach (https://www.grenzebach.com) in a customer application.

Multi-robot path planning and motion coordination has been studied since the 1980s, and many techniques have been developed during this period, see surveys [8,25] for overview. This problem (formulated as the warehouseman’s problem) was proved to be PSPACE-complete [12]. For the case where robots move on a predefined graph, complexity of the problem can be reduced, nevertheless, it is still NP-hard [11], which means that optimal solutions cannot generally be found in a reasonable time for non-trivial instances (e.g., for a number of robots in order of tens). Solutions to the problem consider either coupled or decoupled approaches. Centralized (coupled) approaches consider the multi-robot team as a multi-body robot for which classical single-robot path planning can be applied in composite configuration space. Traditional centralized methods are based on complete (i.e., the algorithm finds a solution if it exists or reports that no solution exists otherwise) and optimal classical algorithms and provide optimal solutions [17,19,27]. For example, a solution which assumes a multi-robot team as a multi-body robot by building a Cartesian product of particular robots’ configurations and finding a trajectory in the constructed space is presented in [19]. However, these approaches require computational time exponential in the dimension of the composite configuration space, so they are appropriate for small-sized problems only. This drawback leads to the development of methods that prune the search space. For instance, Berg et al. [3] decompose any instance of the problem into a sequence of sub-problems where each subproblem can be solved independently from the others. The Biased Cost Pathfinding [10] employs generalized central decision maker that resolves collision points on paths that were pre-computed independently per unit, by replanning colliding units around the highest priority unit. Another approach is to design an algorithm based on a specific topology describing the environment. [26] present a multi-phase approach with linear time


complexity based on searching a minimum spanning tree of the graph. The main idea of the algorithm is to find vertices for agents to move to while maintaining such a state of the graph that does not block other agents. The paths between these vertices are then found using standard one-agent planning algorithms such as A* while looking at other agents as obstacles. Flow Annotation Replanning, an approach for grid-like environments is introduced in [34]. A flow-annotated search graph inspired by two-way roads is built to avoid head-to-head collisions and to reduce the branching factor in an A* search. A heuristics is furthermore used to solve deadlocks locally instead of resorting to a more expensive replanning step. Nevertheless, the computational complexity is still high (e.g., [3] solves a problem with 40 robots in 12 min, [34] needs approx. 30 s for 400 robots). On the contrary, decoupled methods present a coordination phase separated from the path planning phase. These approaches provide solutions typically in orders of magnitude faster times than coupled planners, but these solutions are sub-optimal. Moreover, the decoupled methods are often not complete as they may suffer from deadlocks. These approaches are divided into two categories: path coordination techniques and prioritized planning. Path coordination considers tuning the velocities of robots along the precomputed trajectories to avoid collisions like in [18]. Similarly, a resolution-complete algorithm is presented in [30], which is consists of exploring a so-called coordination diagram (CD). CD was firstly introduced in [24] and it describes conflicts of robots in a joint space where each dimension represents and individual robot’s position on its path. Prioritized planning computes trajectories sequentially for the particular robots based on the robots’ priorities. Robots with already determined trajectories are considered as moving obstacles to be avoided by robots with lower priorities. A simple heuristics to assign priorities is introduced in [2] – priority of a robot is determined as the distance to its goal. Another approach, a randomized hill-climbing technique based on a greedy local search procedure [28] ˇ ap et al. [4] discuss to optimize the order of robots is presented in [1]. Finally, C´ conflicts that occur during prioritized planning and propose a revised version of prioritized planning (RPP) that tries to avoid these conflicts. They also propose decentralized implementations of RPP (synchronous as well as asynchronous) and prove that the asynchronous version is guaranteed to terminate. Ter Morse et al. [23] present another prioritized planning scheme, but it codes information about trajectories of robots with higher priorities into a planning graph rather than into the planning algorithm itself. It does so by constructing a resource graph, where each resource can be for example a node of the original graph or intersection graph edges. Every such resource holds information about time intervals in which it is not occupied by already planned robots. An adaptation of the A* algorithm is used on this graph to find the shortest path through these intervals (called free time windows) to obtain a path that avoids all already planned robots. Gawrilow et al. [9] deal with a real-life problem of routing vehicles in Container Terminal Altenwerder in Hamburg harbor. They used a similar approach


of keeping a set of free time windows for the arcs of the graph. Their algorithm contains a preprocessing of the graph for the use of specific vehicles, followed by the computation of paths for individual vehicles on this preprocessed graph. Another similar approach is presented in [35], where each robot looks for a viable path in a 2D spatial grid map and checks for collisions with moving obstacles using a temporal occupancy table. Zhang et al. [38] adopt a similar approach to [23] with an enhanced taxiway modeling approach to improve performance on airport graph structures. Ter Mors et al. [22] compare Context-Aware Route Planning (CARP) [23] with a fixed-path scheduling algorithm using k shortest paths [37] and a fixed-path scheduling algorithm using k disjoint paths [32]. The experiments show that the CARP algorithm is superior in all measured qualities. In our recent paper [14], we propose a modification of CARP, which generates a trajectory for a robot a_k assuming that the trajectories of k − 1 robots are already planned, which can possibly lead to a modification of those planned trajectories. The main idea is to iteratively build a set of robots whose trajectories most influence an optimal trajectory of a_k. The experimental results show that the proposed approach finds better solutions than the original CARP algorithm after several random shuffles of the robots’ priorities while requiring significantly less computational time for adding individual robots into the system. A comparison of several heuristic approaches for assigning priorities to robots in [21] concludes that the heuristic which plans the longest paths first performs best when the makespan is to be minimized. A greedy best-first heuristic provides the best results regarding the joint plan cost. However, its downside is that it calls the planning algorithm for all yet unplanned robots in every round and is thus very time-consuming. Several computationally efficient heuristics have been introduced recently, enabling problems with tens of robots to be solved in seconds. Chiew [5] proposes an algorithm for n² vehicles on an n × n mesh topology of a path network allowing simultaneous movement of vehicles in a corridor in opposite directions with computational complexity O(n²). The Windowed Hierarchical Cooperative A* algorithm (WHCA*) employs heuristic search in a space-time domain based on hierarchical A* limited to a fixed depth [29]. Wang and Botea [33] identify classes of multi-robot path planning problems that can be solved in polynomial time and introduce an algorithm with low polynomial upper bounds for time, space and solution length. Luna and Bekris [20] present a complete heuristic for general problems with at most n − 2 robots in a graph with n vertices based on the combination of two primitives: “push” forces robots towards a specific path, while “swap” switches the positions of two robots that are about to collide. An extension which divides the graph into subgraphs within which it is possible for agents to reach any position of the subgraph, and then uses “push”, “swap”, and “rotate” operations, is presented in [6]. Finally, Wang and Goh [36] formulate multi-robot path planning as an optimization problem and approximate the objective function by a maximum entropy function, which is minimized by a probabilistic iterative algorithm.


The approaches mentioned above have nice theoretical properties, but Context-Aware Route Planning (CARP) [23] is probably the most practically usable algorithm, as it produces high-quality solutions fast and it finds a solution for a large number of practical setups. Recently, Solovey et al. presented an approach for multi-robot coordination inspired by the Rapidly-exploring Random Tree (RRT) algorithm [19]. The approach, MRdRRT [31], is probabilistic and plans paths on predefined structures (graphs) for small fleets of robots. This is further improved in [7] by employing ideas from RRT* [16], a variant of RRT which converges towards an optimal solution. In this paper, we present a probabilistic approach which extends and improves a discrete version of the Rapidly-Exploring Random Tree (RRT) for multiple robots [31]. Our approach focuses mainly on scalability with an increasing number of agents as well as on improving the quality of the solution compared to [7], which presents the optimal version of the dRRT algorithm but keeps the number of robots relatively low. We show that the proposed extensions allow solving problems with tens of robots in times comparable to CARP with a slightly worse quality of results. On the other hand, the proposed algorithm also finds solutions for setups where CARP fails. The paper is an extended version of the paper [13] presented at the 15th International Conference on Informatics in Control, Automation and Robotics (ICINCO). In comparison to the ICINCO paper, the following modifications were made:
– The state of the art was refined and extended and new references were added.
– The section about the proposed improvements of the original MRdRRT algorithm was extended. Namely, the proposed use of the CARP algorithm was modified, with a detailed description of CARP.
– Experiments were performed on various sizes of maps (only one map size was considered in the ICINCO paper). Moreover, the influence of the two main improvements, expansion and rewiring, on solution quality and time complexity was studied.
– All the drawings were updated and redesigned and some additional figures were included.
The rest of the paper is organized as follows. The multi-agent path-finding problem is presented and the terms used are defined in Sect. 2. The multi-robot discrete RRT algorithm and the proposed improvements are described in Sect. 3, while the performed experiments, their evaluation, and a discussion are presented in Sect. 4. Finally, Sect. 5 is dedicated to concluding remarks.


2 Problem Definition

The problem of multi-agent pathfinding/coordination concerns itself with searching for non-colliding paths for each agent in a set of agents, where each agent starts in its


start location and ends in its goal location, while minimizing some global cost function such as agent travel time or global plan completion time. More rigorously, we assume:
– A set of k homogeneous agents labeled a1, a2, ..., ak.
– A graph G(V, E) where |V| = N. The vertices V of the graph are all possible agent locations, while E represents the set of all possible transitions between the locations.
– A start location si ∈ V and a target location ti ∈ V for each agent.
The goal is to find a set of trajectories on G(V, E) that avoid collisions with other robots and environmental obstacles and that specify the location of each agent in each time step, such that the agents are at their initial positions at the start and at their goal positions at the end. The key terms and the additional constraints on the generated trajectories are specified in the following paragraphs.

2.1 Actions

At each time point, only two action types are permitted for every agent: the first is to move into one of the neighbouring nodes, and the second is to wait at the current location. The cost of these actions can differ between algorithms, but in our case the assumption is that staying still has zero cost. Another assumption is that once agents reach their goal location, they wait there for all other agents to finish their respective plans.

2.2 Constraints

In this paper, the main constraints placed upon the movement of agents are:
– No two agents a1 and a2 can occupy the same vertex v ∈ V at the same time.
– Two agents a1, a2 located in two neighbouring nodes v1, v2 ∈ V, respectively, cannot travel along the same edge (v1, v2) at the same time in opposite directions. In other words, two neighbouring agents cannot swap positions.
However, it is possible for agents to follow one another, assuming that they do not share the same vertex or edge concurrently. For example, if the agent a1 moves from v2 ∈ V to v3 ∈ V, then the agent a2 can move from v1 ∈ V to v2 ∈ V at the same time.
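As a minimal illustration (not taken from the paper), both constraints can be checked for one synchronous time step as follows; the tuple-of-vertex-ids encoding of a configuration is an assumption of this sketch.

def violates_constraints(prev_config, next_config):
    # prev_config / next_config: tuples of vertex ids, one entry per agent
    # (a hypothetical encoding chosen for this sketch).
    # Vertex constraint: no two agents may occupy the same vertex.
    if len(set(next_config)) < len(next_config):
        return True
    # Edge constraint: no two neighbouring agents may swap positions.
    n = len(prev_config)
    for i in range(n):
        for j in range(i + 1, n):
            if prev_config[i] == next_config[j] and prev_config[j] == next_config[i]:
                return True
    return False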

2.3 Composite Configuration Space

The composite configuration space G = (V, E) is a graph that can be defined in the following manner. The vertices V are all combinations of collision-free placements of m agents on the original graph G. These vertices can also be viewed as m-agent configurations C = (v1, v2, ..., vm), where an agent ai is located at a vertex vi ∈ G


and the agents do not collide with each other. The edges of G can be created using either the Cartesian product or the tensor product. In this paper the tensor product is used because it allows multiple agents to move simultaneously. Therefore, for two m-agent configurations C = (v1, v2, ..., vm) and C′ = (v′1, v′2, ..., v′m), the edge (C, C′) exists if (vi, v′i) ∈ Ei for every i and no two agents collide with each other during the traversal of their respective edges. The distance between two neighbouring nodes C1 = (v11, v12, ..., v1n) and C2 = (v21, v22, ..., v2n) in a composite roadmap is calculated as the sum of Euclidean distances d between the corresponding nodes:

δ(C1, C2) = Σ_{i=1}^{n} d(v1i, v2i)
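For illustration, this composite-space metric is straightforward to implement; the sketch below assumes a positions mapping from vertex ids to planar coordinates, which is not part of the paper's formulation.

import math

def composite_distance(c1, c2, positions):
    # c1, c2: composite configurations (tuples of per-agent vertex ids);
    # positions: assumed mapping from a vertex id to its planar coordinates.
    return sum(math.dist(positions[u], positions[v]) for u, v in zip(c1, c2))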

3 Proposed Algorithm

3.1 Discrete RRT

A discrete multi-robot rapidly-exploring random tree (MRdRRT) [31] is a modification of the RRT algorithm for pathfinding in an implicitly given graph embedded in a high-dimensional Euclidean space. Similarly to RRT, MRdRRT finds the paths for the agents in the composite configuration space (embedded in R^d) by growing a tree T rooted in the vertex s that represents the start locations of all agents. The tree is expanded by iteratively adding new points while simultaneously attempting to connect the newly added points to the goal vertex t such that no constraints are violated, i.e., without causing any collisions with the environment or between agents. The addition of new points is handled by first sampling a random point u from the composite configuration space and then extending the current tree towards the sample u. The important thing to note here is that the vertices added to the tree are taken from G. This means that, to extend the tree towards the sample point, it is necessary to choose both the tree node from which the extension is made (in this case the point n in the tree nearest to the sample u) and the neighbour of this node that will be selected as the extension, because, unlike RRT operating in continuous space, it is not possible to make a step in an arbitrary direction. To choose the best neighbour of n, MRdRRT uses a technique called an oracle. Without loss of generality, consider that G is embedded in [0, 1]^d. For two points v, v′ ∈ [0, 1]^d, ρ(v, v′) denotes the ray that begins in v and goes through v′. Given three points v, v′, v′′ ∈ [0, 1]^d, ∠v(v′, v′′) denotes the (smaller) angle between ρ(v, v′) and ρ(v, v′′). Given a sample point u, the oracle returns the neighbour v′ of v such that the angle between the rays ρ(v, u) and ρ(v, v′) is minimized. This can be defined as

O_D(v, u) := argmin_{v′ ∈ V} {∠v(u, v′) | (v, v′) ∈ E}.
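A possible rendering of the oracle in Python is sketched below; the neighbours and coords helpers (adjacency lists and an embedding of the composite vertices) are assumptions made for the sketch, not structures defined by the original algorithm.

import math

def angle_at(p, a, b):
    # Smaller angle at point p between the rays p->a and p->b.
    va = [x - y for x, y in zip(a, p)]
    vb = [x - y for x, y in zip(b, p)]
    dot = sum(x * y for x, y in zip(va, vb))
    norm = math.sqrt(sum(x * x for x in va)) * math.sqrt(sum(x * x for x in vb))
    return math.acos(max(-1.0, min(1.0, dot / norm))) if norm > 0 else 0.0

def direction_oracle(v, u, neighbours, coords):
    # O_D(v, u): neighbour v' of v minimizing the angle between rays (v, u) and (v, v').
    return min(neighbours[v], key=lambda w: angle_at(coords[v], coords[u], coords[w]))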

As mentioned earlier, the growth of the tree is not done only by adding single new nodes, but also by attempting to connect the newly added nodes to the goal


vertex t. The reason for this is that the tree may eventually expand to this vertex on its own, but this is unlikely for larger problems. For this reason, the local connector is necessary to ensure the algorithm finds the paths in a reasonable amount of time. The main restriction on the local connector is that it must quickly return a decision about whether the paths were found, by attempting to solve only a restricted version of the problem, so that it can be run often.

3.2 Proposed Improvements

The original MRdRRT is able to solve path-finding problems for several agents. However, the realisation of its particular steps is inefficient, which results in its inability to deal with complex scenarios in which tens of robots take part. The authors of MRdRRT presented experimental results with up to 10 robots, with the mention that their algorithm faces problems once the tasks contain a more substantial number of robots. To improve the behaviour of the algorithm, we therefore introduce several modifications to the original version.
Random Sample Generation Improvement. The version of the expansion phase proposed in the original paper generates random samples from the bounding box of G. However, this is inefficient in maps with tightly spaced obstacles, because it disallows the agents to stay at their current position in the next step, so the approach cannot find a solution for problems where this action is necessary. Also, most of the points generated by the original expansion phase were far from a solution, leading to an unnecessarily large growth of the tree over the configuration space and to a high computational complexity of the algorithm. Our solution to this issue is to find the shortest path for each agent separately during preprocessing and then to create the set of all samples that can be generated for the particular agent as all points q for which dist(si, q) + dist(q, ti) ≤ dist(si, ti) + Δ, where dist is the distance between two points, si and ti are the start and goal positions of the i-th robot, and Δ > 0 is a defined constant threshold.
Improvements to the Oracle Method. The main issue of the original oracle method was that it checks for collisions only after a new candidate point is generated. This results in many failed attempts and in discarding several samples. The version proposed here takes the collision possibility into account during the process of generating the candidate point, in an attempt to generate new points that are collision-free and thus reduce the number of iterations required to produce a viable new point. Just as the original version, our version also attempts to minimize the angle ∠vi(ui, v′i).
Proposed Use of the CARP Algorithm as Local Connector. The local connector in the original version of the algorithm works such that it finds the shortest paths for all robots without taking the other ones into account.


It then attempts to find an order in which the robots would move to their destinations one by one while the others stand still. If such an ordering exists, the local connector reports success. We propose the use of the CARP algorithm [23] as a local connector, along with random shuffling of the order in which CARP attempts to plan the trajectories of the individual agents. The CARP algorithm models the environment as a resource graph that stores information about the capacity and occupancy in time of each individual resource. For this reason, the resource graph can also be called a free time window graph. CARP uses a modified A* algorithm on the time window graph to find the shortest path through time for each robot from its starting resource to its goal resource. The found path takes the form of a sequence of time windows and corresponding resources, which is subsequently written back into the free time window graph to ensure the other agents can avoid the given agent.
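The free-time-window bookkeeping that CARP relies on can be illustrated with a minimal sketch; the Resource class below is only an assumed representation, not the authors' implementation.

class Resource:
    # A node or an edge of the resource graph together with its reservations.
    def __init__(self):
        self.reservations = []          # list of (t_start, t_end) occupied intervals

    def is_free(self, t_start, t_end):
        # True if [t_start, t_end) overlaps no existing reservation,
        # i.e. the interval lies inside a free time window.
        return all(t_end <= a or t_start >= b for a, b in self.reservations)

    def reserve(self, t_start, t_end):
        # Record the window used by a newly planned agent.
        self.reservations.append((t_start, t_end))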

Fig. 2. An example problem of the CARP algorithm.

An example of pathfinding using the CARP algorithm can be seen in Fig. 2a. Trajectories of two agents need to be found in this example problem. One agent wants to go from node A to node B, while the second one wants to go from node C to node D. Because the time window graph is empty at the start and all windows are thus free, the first agent finds its path as a straight route towards its goal. This can be seen in Fig. 2b, along with the occupied time windows on the resource graph that were reserved by the first agent. When the second agent plans its path, it is apparent that it cannot go straight towards its goal without pausing. If it chooses to go through the upper path using node A, it would take four time units to reach the node. It cannot use the lower path because node B is occupied from time 2 indefinitely. However, if the agent waits


for one time unit at its starting node C and then moves straight towards its goal, the path is free. The resulting plan, as well as the resulting time occupancy of all nodes, can be seen in Fig. 2c.

Algorithm 1. Improved MRdRRT algorithm [13].
1: T.init(s)
2: loop
3:   EXPAND(T)
4:   REWIRE(T, v′)
5:   P ← CONNECT_TO_TARGET(T, t)
6:   if not empty(P) then return RETRIEVE_PATH(T, P)

Proposed New Steps of the Algorithm. The RRT* algorithm [15] introduced new steps into the original RRT algorithm that enabled it to be asymptotically optimal. Our last modifications to the MRdRRT algorithm take inspiration from RRT* and implement an improved expansion step and a rewiring step (see Algorithm 1) in order to increase the quality of the obtained plans. The tree is initialized with a single node that represents the starting positions of the agents (line 1). The main loop of the algorithm starts with the modified expansion phase (line 3). Once a new node is added to the tree, the rewiring step is called (line 4), which attempts to revise the structure of the tree to improve the length of the path from the added point to the tree root. The following step is the call to the local connector, which checks whether it is possible to connect the newly added point to the goal configuration. If the local connector is successful, the algorithm terminates and returns the found path as a result. The main change to the expansion phase, Algorithm 2, is that after the random sample u is generated (line 1), a set of N nearest neighbours of u in the tree T is selected. A new candidate point v′ is then generated from each nearest neighbour (lines 6–11) using the oracle O_D, but it is not yet added to the tree. Each candidate point v′ is then checked for the distance travelled from the root of the tree T, and only the node that minimizes this distance is connected to the tree, to its corresponding predecessor. This step is the equivalent of a similar step in the RRT* algorithm, where the algorithm generates a new point and then checks all points in a given radius around this point for the best predecessor, i.e. a predecessor that minimizes the distance through the tree to the root of the tree. In the multi-agent discrete scenario, it was necessary to adjust this step (Algorithm 2) because the computational requirements would be much higher, as the points in the given radius could not be connected directly to the new point, because they might not be direct neighbours in the composite configuration space. This would result in the need to use the local connector on each point in the radius. The last step inspired by RRT* that was added to the algorithm is the rewiring step, which locally revises the structure of T by checking whether nodes that are within a certain radius r of the newly added point would have the path travelled


to them from the root of the tree shorter if they had the newly added point v′ as their predecessor. To avoid the search for all points in a given radius, this step was modified for the multi-agent discrete scenario (Algorithm 3) by employing an N-nearest-neighbours search around v′ to obtain the neighbouring points c. Because it is highly unlikely that two neighbouring configurations in the tree are direct neighbours in the composite configuration space, it was necessary to use the local connector to obtain paths that connect the configurations v′ and c (line 3). In case the local connector fails to find connecting paths between the two configurations, the considered point is skipped (lines 4–5). However, if the local connector succeeds in finding a path p between v′ and c, it is checked whether the length of the path from the root to v′ concatenated with the path p and the node c is shorter than the distance travelled through T from the root to c (lines 5–7). If it is shorter, then all nodes of p are added to T. The first node of p is connected as the successor of v′ and the last node of p is chosen as the new predecessor of c. An example of the rewiring step is displayed in Fig. 3.

Algorithm 2. Improved MRdRRT EXPAND(T, r) [13].
1: u ← RANDOM_SAMPLE()
2: NNs ← getNearestNeighbours(u)
3: v′_pred ← −1
4: d_best ← ∞
5: v′_best ← ∅
6: for c ∈ NNs do
7:   v′ ← O_D(c, u)
8:   if l_T(c) + δ(c, v′) < d_best then
9:     d_best ← l_T(c) + δ(c, v′)
10:    v′_pred ← c
11:    v′_best ← v′
12: T.add_vertex(v′_best)
13: T.add_edge(v′_pred, v′_best)

Algorithm 3. REWIRE(T, v′) [13].
1: NNs ← getNearestNeighbours(v′)
2: for c ∈ NNs do
3:   p ← LOCAL_CONNECTOR(v′, c)
4:   if p = ∅ then
5:     continue
6:   n ← LastNode(p)
7:   if l_T(v′) + l(p) + δ(n, c) < l_T(c) then
8:     T.add(p)
9:     c.predecessor ← n




4 Experiments

To show the performance of the proposed algorithm we chose to compare it with the CARP algorithm [23]. Because the CARP algorithm is heavily dependent on

Fig. 3. Example of the rewiring procedure.


Fig. 4. Results of the first experiment for the maps from 20 × 20 grid.

Fig. 5. Results of the first experiment for the maps from 30 × 30 grid.

Fig. 6. Results of the first experiment for the maps from the 40 × 40 grid: (a) success rate, (b) median number of plan steps, (c) median iterations, (d) median time to plan.

the ordering in which the agents are planned, we have used several variants that differ in the maximal number of allowed shuffles of the ordering. Once CARP finds a solution with a given ordering, we consider it successful and return its result. We used 4 such variants with 1, 10, 100 and 1000 maximum possible shuffles of the ordering. The comparison was done in two separate experiments, each of which contained its own set of maps and assignments. The primary goal was to evaluate the reliability of the algorithm as well as its runtime. To achieve this, we have recorded metrics based on which we have made the comparison of the tested algorithms. These metrics are the following: success rate, length of the resulting plan, sum of the individual agent plans in the given plan, runtime of the algorithm and the number of iterations the algorithm needed to find a solution. The first experiment was performed with the aim of evaluating how the algorithm performs on maps with various densities, i.e. on a sequence of maps with an increasing number of edges given the same set of nodes. The set was created by initially generating grid maps with a preset number of nodes. More precisely, we have generated three such grids with 20 × 20, 30 × 30 and 40 × 40 nodes. In the next step, a minimum spanning tree of each grid was determined, which created the base map. Once the minimum spanning trees had been created, 9 more maps were created for each spanning tree by iteratively adding a fixed number of edges from the original grid. This procedure resulted in 30 different maps, 10 for each grid. An example of these maps can be seen in Fig. 7.
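The map-generation procedure described above can be sketched, for instance, with networkx; the function below is a rough illustration under stated assumptions (random edge weights for the spanning tree, equal-sized batches of re-added edges), not the exact script used for the experiments.

import random
import networkx as nx

def map_family(n, steps=9):
    # Base map: minimum spanning tree of an n x n grid; denser maps are obtained
    # by adding back a fixed share of the removed grid edges at every step.
    grid = nx.grid_2d_graph(n, n)
    for u, v in grid.edges:
        grid.edges[u, v]["weight"] = random.random()
    base = nx.minimum_spanning_tree(grid)
    missing = [e for e in grid.edges if not base.has_edge(*e)]
    random.shuffle(missing)
    chunk = max(1, len(missing) // steps)
    maps = [base]
    for i in range(steps):
        denser = maps[-1].copy()
        denser.add_edges_from(missing[i * chunk:(i + 1) * chunk])
        maps.append(denser)
    return maps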


Furthermore, for each of the grids, 100 different assignments were generated assuming a fleet of 100 robots. These assignments were created by randomly sampling pairs of nodes from the given grid. In addition, our algorithm was tested in 4 different variants that differ in enabling or disabling the two main new components – improved expansion and rewiring.

Fig. 7. Example of maps from the first experiment.

As the algorithms could fail, the results from failed planning attempts were handled such that if any of the 100 assignments was successful, all failed attempts were set to 2 times the worst result. In case none of the 100 assignments was planned successfully, all failed attempts were set to an arbitrarily high value, in our case 100000. The graphs then show the median value over the 100 assignments for each map of each grid. In the case of the runtime measurements, the time was measured from the start of the algorithm until a result was returned, even if the result was a failure. Moreover, the proposed algorithm was given a maximum of 500000 iterations to find a plan; otherwise the attempt was considered failed, in which case 500000 was reported as the number of iterations. For the CARP algorithm, if a failure to plan was detected, the number of iterations was set to the given maximum limit of shuffles. For example, if CARP with the limit of 1000 shuffles failed, it reported that the number of iterations was 1000. The results of the experiment can be seen in Figs. 4, 5, and 6. Note that the results for the first two maps are not shown in the graphs, because none of the algorithms was able to find a solution for any assignment. The first thing to notice is that the proposed approach shows a much higher success rate in all its variants than all CARP variants in Figs. 4a, 5a and 6a. Note that the two versions


with the improved expansion turned on provided a higher success rate on the maps created from the 20 × 20 grid with fewer edges than the two versions with this component turned off. Upon inspecting the length of the plan in Figs. 4b, 5b, and 6b, it can be seen that our approach provides results comparable with the CARP variants in all versions, with the versions where the improved expansion was turned on providing better results on the sparser maps while also requiring less computational time to obtain the result. The worst results are obtained on the sparser maps created from the 20 × 20 grid, as can be seen for example in Fig. 4c, where the variants of the proposed algorithm were not able to find a solution in more than half of the assignments for the maps with 960 and 1038 edges, and thus the median number of iterations was set to the preset limit of 500000.
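The failure handling described above, used when aggregating the per-map results, can be summarized by a short helper; the None-for-failure encoding is an assumption of this sketch.

import statistics

def aggregate(results, fail_value=100000):
    # results: one value per assignment, None for a failed planning attempt.
    successes = [r for r in results if r is not None]
    if successes:
        penalty = 2 * max(successes)          # failed runs count as 2x the worst result
        filled = [r if r is not None else penalty for r in results]
    else:
        filled = [fail_value] * len(results)  # no success at all: arbitrary high value
    return statistics.median(filled)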

Fig. 8. Map generation procedure.


The second set of maps was created specifically, together with the assignments, so that the problems would be impossible to solve for the CARP algorithm. The maps and assignments were randomly generated by the following process (a sketch of the expansion step is shown after this paragraph):
1. Create a basic problem that is impossible to solve for the CARP algorithm, depicted in Fig. 8a. Arrows indicate the starting and goal positions of robots A and B on the graph. CARP fails because the agents need to swap their positions while having the same distance to the only node they can use to avoid each other. Because CARP plans the agents sequentially one by one while ignoring the subsequent agents, no ordering of these agents can solve this issue.
2. Pick a random node that has only one edge associated with it.
3. Either add two nodes A and B to either side of this node, if possible, along with a corresponding assignment of two agents – the first agent going from A to B and the second one from B to A; an example of this step can be seen in Fig. 8b. The alternative method to expand the map is to connect to it the same structure as in Step 1, together with the same type of assignment; an example of this can be seen in Fig. 8c.
4. Repeat Steps 2 and 3 until a map of the required size is generated.
An example of a fully generated map following the previous steps can be seen in Fig. 9. The second set of experiments was carried out on the second set of maps with the aim of illustrating the behavior of the proposed algorithm on assignments that the CARP algorithm cannot solve. A total of 400 different combinations of a map and an assignment were generated: 100 each for 10, 20, 30 and 40 agents. The results of this experiment can be seen in Fig. 10. The setup numbers 1 to 4 correspond to the numbers of agents 10 to 40, respectively.
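A rough sketch of the expansion step (Steps 2 and 3, first variant) follows; the adjacency-dict representation and integer node ids are assumptions of the sketch, not part of the paper.

import random

def expand_swap_map(adj, assignments, steps):
    # adj: adjacency dict {node: set(neighbours)} of the basic CARP-infeasible
    # problem from Step 1 (assumed representation with integer node ids).
    next_id = max(adj) + 1
    for _ in range(steps):
        leaf = random.choice([v for v, nbrs in adj.items() if len(nbrs) == 1])   # Step 2
        a, b = next_id, next_id + 1                                              # Step 3
        next_id += 2
        adj[a], adj[b] = {leaf}, {leaf}
        adj[leaf].update({a, b})
        assignments.append((a, b))   # one agent from A to B
        assignments.append((b, a))   # and one from B to A, forcing a swap
    return adj, assignments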

Fig. 9. Example of a generated map.


Fig. 10. Results of the proposed approach on assignments which CARP is unable to solve [13].

For up to 30 agents the success rate is 100%, while it decreases to 95% for 40 robots. Regarding the computational time, for up to 30 agents the algorithm takes approximately 1 s per agent to calculate the paths in assignments that are impossible to solve for the CARP algorithm, even for relatively complicated assignments.


5 Conclusion

This chapter presents a novel approach to multi-robot coordination on a graph, which is based on a discrete version of RRT for multiple robots (MRdRRT) and significantly improves its performance in several steps of the algorithm. Two additional steps inspired by RRT* were also introduced into the algorithm, which improve the quality of the solutions the algorithm provides. Moreover, several experiments were performed whose results show the behavior of the algorithm in various scenarios and settings. The results show that the proposed approach can solve problems containing tens of robots in tens of seconds, which is a significant improvement upon the original version of MRdRRT, which was able to solve problems with up to ten robots in tens of seconds. Moreover, the results also show that this approach is able to solve problems that are unsolvable for the CARP algorithm, which is one of the best algorithms for practical use nowadays. Future work should focus on reducing the number of iterations the algorithm requires to find a solution as well as on reducing the computational complexity. These improvements could be achieved, for example, by reducing the dimensionality of the problem by first clustering the robots into groups and then planning for these groups separately, while taking their respective solutions as obstacles.
Acknowledgements. This work has been supported by the European Union’s Horizon 2020 research and innovation programme under grant agreement No 688117, by the Technology Agency of the Czech Republic under the project no. TE01020197 “Centre for Applied Cybernetics”, the project Rob4Ind4.0 CZ.02.1.01/0.0/0.0/15 003/0000470 and the European Regional Development Fund. The work of Jakub Hvězda was also supported by the Grant Agency of the Czech Technical University in Prague, grant No. SGS18/206/OHK3/3T/37.


References
1. Bennewitz M, Burgard W, Thrun S (2001) Optimizing schedules for prioritized path planning of multi-robot systems. In: IEEE international conference on robotics and automation, proceedings 2001 ICRA, vol 1, pp 271–276
2. van den Berg J, Overmars M (2005) Prioritized motion planning for multiple robots. In: 2005 IEEE/RSJ international conference on intelligent robots and systems. IEEE, pp 430–435
3. van den Berg J, Snoeyink J, Lin MC, Manocha D (2009) Centralized path planning for multiple robots: optimal decoupling into sequential plans. In: Trinkle J, Matsuoka Y, Castellanos JA (eds) Robotics: science and systems V, University of Washington, Seattle, USA, 28 June–1 July 2009. The MIT Press
4. Cap M, Novak P, Kleiner A, Selecky M, Pechoucek M (2015) Prioritized planning algorithms for trajectory coordination of multiple mobile robots. IEEE Trans Autom Sci Eng, special issue
5. Chiew K (2010) Scheduling and routing of autonomous moving objects on a mesh topology. Oper Res Int J 12(3):385–397
6. DeWilde B, Ter Mors A, Witteveen C (2014) Push and rotate: a complete multi-agent pathfinding algorithm. J Artif Intell Res 51:443–492
7. Dobson A, Solovey K, Shome R, Halperin D, Bekris KE (2017) Scalable asymptotically-optimal multi-robot motion planning. In: 2017 international symposium on multi-robot and multi-agent systems (MRS), pp 120–127, December 2017
8. Doriya R, Mishra S, Gupta S (2015) A brief survey and analysis of multi-robot communication and coordination. In: International conference on computing, communication automation, pp 1014–1021, May 2015
9. Gawrilow E, Köhler E, Möhring RH, Stenzel B (2008) Dynamic routing of automated guided vehicles in real-time. Springer, Heidelberg, pp 165–177
10. Geramifard A, Chubak P, Bulitko V (2006) Biased cost pathfinding. In: AAAI conference on artificial intelligence and interactive digital entertainment, pp 112–114
11. Goldreich O (2011) Finding the shortest move-sequence in the graph-generalized 15-puzzle is NP-hard. In: Goldreich O (ed) Studies in complexity and cryptography. Miscellanea on the interplay between randomness and computation – in collaboration with Avigad L, Bellare M, Brakerski Z, Goldwasser S, Halevi S, Kaufman T, Levin L, Nisan N, Ron D. Lecture Notes in Computer Science, vol 6650. Springer, pp 1–5
12. Hopcroft J, Schwartz J, Sharir M (1984) On the complexity of motion planning for multiple independent objects; PSPACE-hardness of the “Warehouseman’s Problem”. Int J Rob Res 3(4):76–88
13. Hvězda J, Kulich M, Přeučil L (2018) Improved discrete RRT for coordinated multi-robot planning. In: Proceedings of the 15th international conference on informatics in control, automation and robotics, ICINCO 2018 – vol 2, pp 181–189
14. Hvězda J, Rybecký T, Kulich M, Přeučil L Context-aware route planning for automated warehouses. In: 21st IEEE international conference on intelligent transportation systems (in press)
15. Karaman S, Frazzoli E (2010) Incremental sampling-based algorithms for optimal motion planning. CoRR abs/1005.0416
16. Karaman S, Frazzoli E (2011) Sampling-based algorithms for optimal motion planning. CoRR abs/1105.1186


17. Latombe JC (1991) Robot motion planning. Kluwer Academic Publishers, Norwell
18. LaValle S, Hutchinson S (1998) Optimal motion planning for multiple robots having independent goals. IEEE Trans Robot Autom 14(6):912–925
19. LaValle SM (1998) Rapidly-exploring random trees: a new tool for path planning. Technical report, Computer Science Department, Iowa State University
20. Luna R, Bekris KE (2011) Efficient and complete centralized multi-robot path planning. In: 2011 IEEE/RSJ international conference on intelligent robots and systems. IEEE, pp 3268–3275, September 2011
21. ter Mors A (2011) Evaluating heuristics for prioritizing context-aware route planning agents. In: International conference on networking, sensing and control, pp 127–132, April 2011
22. ter Mors A, Witteveen C, Ipema C, de Nijs F, Tsiourakis T (2012) Empirical evaluation of multi-agent routing approaches. In: 2012 IEEE/WIC/ACM international conferences on web intelligence and intelligent agent technology, vol 2, pp 305–309, December 2012
23. ter Mors AW, Witteveen C, Zutt J, Kuipers FA (2010) Context-aware route planning. In: Dix J, Witteveen C (eds) Multiagent system technologies. Springer, Heidelberg, pp 138–149
24. O’Donnell PA, Lozano-Perez T (1989) Deadlock-free and collision-free coordination of two robot manipulators. In: IEEE robotics and automation conference, pp 484–489
25. Parker LE (2009) Path planning and motion coordination in multiple mobile robot teams. In: Encyclopedia of complexity and system science
26. Peasgood M, Clark CM, McPhee J (2008) A complete and scalable strategy for coordinating multiple robots within roadmaps. IEEE Trans Rob 24(2):283–292
27. Ryan MR (2008) Exploiting subgraph structure in multi-robot path planning. J Artif Intell Res, pp 497–542
28. Selman B, Levesque H, Mitchell D (1992) A new method for solving hard satisfiability problems. In: Proceedings of the tenth national conference on artificial intelligence, AAAI 1992. AAAI Press, pp 440–446
29. Silver D (2005) Cooperative pathfinding. In: The 1st conference on artificial intelligence and interactive digital entertainment, pp 117–122
30. Simeon T, Leroy S, Laumond JP (2002) Path coordination for multiple mobile robots: a resolution-complete algorithm. IEEE Trans Robot Autom 18(1):42–49
31. Solovey K, Salzman O, Halperin D (2014) Finding a needle in an exponential haystack: discrete RRT for exploration of implicit roadmaps in multi-robot motion planning. Algorithmic Found Rob XI:591–607
32. Suurballe J (1974) Disjoint paths in a network. Networks 4:125–145
33. Wang KC, Botea A (2011) MAPP: a scalable multi-agent path planning algorithm with tractability and completeness guarantees. J Artif Intell Res, pp 55–90
34. Wang KHC, Botea A (2008) Fast and memory-efficient multi-agent pathfinding. In: ICAPS, pp 380–387
35. Wang W, Goh WB (2012) Multi-robot path planning with the spatio-temporal A* algorithm and its variants. In: Dechesne F, Hattori H, ter Mors A, Such JM, Weyns D, Dignum F (eds) Advanced agent technology. Springer, Heidelberg, pp 313–329
36. Wang W, Goh WB (2015) A stochastic algorithm for makespan minimized multi-agent path planning in discrete space. Appl Soft Comput 30:287–304
37. Yen JY (1971) Finding the k shortest loopless paths in a network. Manage Sci 17(11):712–716
38. Zhang T, Ding M, Wang B, Chen Q (2015) Conflict-free time-based trajectory planning for aircraft taxi automation with refined taxiway modeling

Increasing Machining Accuracy of Industrial Manipulators Using Reduced Elastostatic Model

Shamil Mamedov, Dmitry Popov, Stanislav Mikhel, and Alexandr Klimchik

Center for Technologies in Robotics and Mechatronics Components, Innopolis University, Universitetskaya Str. 1, Innopolis, Russia
{sh.mamedov,d.popov,s.mikhel,a.klimchik}@innopolis.ru

Abstract. For a long time, the field of machining has been dominated by computer numerical control (CNC) machines, as they are simple from the kinematic point of view and stiff, which ensures high positioning accuracy. However, they are very expensive and occupy large space, so there is a demand from industry for cheaper and smaller alternatives. The most promising one is an industrial manipulator, which is indeed cheaper and has a lower footprint-to-workspace ratio. The reason why industrial manipulators have not yet replaced CNC machines is their comparatively low stiffness, which causes deflections in the end-effector position and orientation due to the external force applied to it during a machining operation. Therefore, extensive research is being conducted in this area that focuses on developing an accurate stiffness model of the robot and incorporating it into the control scheme. The majority of stiffness models include the stiffness of the links as well as the joints, even though the former cannot be obtained from the parameters provided by the robot manufacturer. That is why it is important to understand how accurately a reduced stiffness model, which lumps all the stiffness properties in the joints, can replicate the deflections of the full model. In this paper, we focus on analyzing the quantitative difference between these two models using the Virtual Joint Modeling method and its effect on trajectory tracking. The systematic analysis carried out for the FANUC R-2000iC/165F robot demonstrates that the reduced stiffness model can replicate the full one quite accurately, so that up to 92% of the end-effector deflection can be compensated. The average deflection error after compensation is about 0.7 µm/N for a typical heavy industrial robot under loading.

Keywords: Elastostatics · Virtual joint method · Industrial manipulators · Machining · Deflection compensation



1 Introduction

Today there is a tendency to replace CNC machines with industrial robots, as the latter are cheaper and more compact. However, because of the open-loop control and serial kinematic structure, the stiffness of industrial robots is lower


compared to CNC machines. Both theoretical and experimental studies show that this ratio can be more than 50 times [23]. Deformations due to low stiffness lead to poor machining quality and decreased processing efficiency [31]. In order to increase the accuracy of manipulators, researchers tend to model the robot elasticity and compensate for the related compliance errors. For stiffness modeling, three main approaches are distinguished in the literature: the finite element analysis (FEA) [29], the matrix structural analysis (MSA) [11,14,19] and the virtual joint method (VJM) [8,24,25]; each of them has its own advantages and disadvantages. Compliance error compensation is a delicate matter, since compensation in the Cartesian space is ultimately achieved by the motors located at the joints. For this reason, it remains an open question how accurately a reduced stiffness model can compensate compliance errors in the Cartesian space compared to the full stiffness model, which takes the link and joint elasticities into account separately. To address the above-mentioned problem, the remainder of the paper is organized as follows. Section 2 defines the theoretical models used in this paper. Section 3 describes the model of the robot. The results of the simulations are given in Sect. 4. Finally, Sect. 5 provides a discussion and Sect. 6 summarizes the main contributions of the paper.

1.1 Related Work

Stiffness modeling for robotic manipulators was initiated in the 1980s by the pioneering work of Salisbury on active stiffness control [27]. The first models considered only joint elasticities, and the stiffness parameters were estimated in a rather straightforward way [26]. Recent developments permit modeling both joint and link flexibility [10], resulting in the three main approaches mentioned in the previous section. The basic idea behind the FEA is to decompose the physical model of a mechanism into a number of small elements and to introduce compliant relations between adjacent nodes by corresponding stiffness matrices [4]. This method is highly accurate, but a large number of finite elements strains computer memory and makes the inversion of high-dimensional matrices a pressing issue. The MSA uses the main ideas of the FEA but at the same time handles large compliant elements, which reduces the computational effort [19]. Nevertheless, this approach is hard to apply to the manipulator in the loaded condition [15]. The last method is the VJM, which is based on the expansion of the traditional rigid-body model of the robotic manipulator with virtual joints corresponding to the compliances of the links and joints [24]. The use of the VJM can be justified by its computational efficiency and acceptable accuracy. The VJM model requires the parameters of the virtual springs in order to compensate for the end-effector deflections, which are a priori unknown. There are several ways to deal with this. The first one is to approximate the links with symmetrical beams and use well-known equations to compute their stiffness. But this method is rather a simplification of the problem under consideration and will not result in accurate deflection compensation. The second one is to use the manipulator CAD model [25]; however, this method is limited due to variations


and non-homogeneity in the material properties. Besides, robot manufacturers usually do not provide CAD models. Apparently, the most reliable approach is to exploit model calibration techniques using data from real experiments [1,20]. Since the main goal is to compensate for all deflections, the parameters of the virtual springs should be the best estimates of the real ones; consequently, the method based on real experiments is used in this work. When the full geometric and stiffness models of the manipulator are known, one can try to compensate for the deflections caused by the flexibility of the links and joints. The compensation can be either online or offline. The first one involves modification of the control algorithm by tweaking the manipulator inverse/direct kinematics embedded in the robot software [6,31], while the second one is concerned with modifying the reference trajectory. Usually, access to the inner control algorithms is prohibited; therefore, in most cases, the user is left with the second option only. That is why offline trajectory compensation is mainly applied in engineering practice [2,21,22]. Among the most popular implementations of offline methods is the so-called “mirror technique” [3], where the reference and non-compensated trajectories are symmetrical with respect to the desired one.

1.2 Problem Statement

Estimating the stiffness parameters of both links and joints from experimental data is a very challenging task, as it requires the identification of more than 250 parameters [12] for a 6-DoF industrial manipulator. For this reason, a reduced model that describes the stiffness of links and joints with only six parameters is so appealing. But at the same time, it poses the following question: if the overall stiffnesses of joints and links are lumped in the joints only, how accurately can the end-effector compliance errors due to machining or other industrial processes be compensated? The answer to this question will be systematically investigated and analyzed in the following sections.

2 Theoretical Background

2.1 Virtual Joint Method

The VJM approach allows developing a more detailed and complete geometric model of the manipulator which provides a more accurate estimate of the end-effector position and orientation. It is assumed that the original model is complemented by virtual joints which describe the deformations of the links. Besides, virtual springs are included in the actuated joints in order to take into consideration the stiffness of the transmission and the control loop. As a result, a so-called extended geometric model of the robot is formulated as

t = g(q, θ)    (1)

where q is the vector of actuator coordinates and θ is the vector of virtual joint coordinates. The vector q is completely defined by the robot controller, while the


values of the virtual joint coordinates θ depend on the external loading w applied to the robot’s end-effector. The mathematical fundamentals of the force-deflection relation and the Cartesian stiffness matrix have been presented in several papers before [10,21,22]; a summary is provided here for completeness. The variations in the virtual joint variables θ generate reaction forces and torques in the corresponding compliant elements, which are described by the generalized Hooke’s law for the manipulator in the virtual joint space:

τ_θ = K_θ θ    (2)

where K_θ = diag(K_θ^(1), K_θ^(2), ..., K_θ^(n_θ)) is the overall virtual joint stiffness matrix, K_θ^(i) is the spring stiffness matrix of the corresponding link/joint, and τ_θ is the vector of virtual joint torques. Let us apply the principle of virtual work and assume that the displacements in the virtual joints Δθ are small. Then the virtual work done by the external wrench w is

δW = (w^T J_θ) Δθ    (3)

where J_θ = ∂g(q, θ)/∂θ is the Jacobian matrix with respect to θ. On the other hand, the virtual work of the internal forces τ_θ is equal to

δW = −τ_θ^T Δθ.    (4)

In static equilibrium, the total virtual work is equal to zero for any virtual displacement. Consequently, the equilibrium conditions can be presented as

τ_θ = J_θ^T w.    (5)

If we combine Eqs. (2) and (5) and linearize (1) around the equilibrium point, we obtain the expression for the end-effector deflection

Δt = J_θ K_θ^{-1} J_θ^T w.    (6)

The expression for the Cartesian stiffness matrix can now be extracted from Eq. (6):

K_C = (J_θ K_θ^{-1} J_θ^T)^{-1}.    (7)

The relationship between the Cartesian and joint space stiffness matrices that was proposed in [30] is called the conservative congruence transformation (CCT).
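For illustration, Eqs. (6) and (7) translate directly into a few lines of numpy; the function and variable names below are arbitrary, and this is only a sketch of the computation, not code from the paper.

import numpy as np

def end_effector_deflection(J_theta, K_theta, w):
    # Eq. (6): delta_t = J_theta K_theta^{-1} J_theta^T w
    return J_theta @ np.linalg.solve(K_theta, J_theta.T @ w)

def cartesian_stiffness(J_theta, K_theta):
    # Eq. (7): K_C = (J_theta K_theta^{-1} J_theta^T)^{-1}
    return np.linalg.inv(J_theta @ np.linalg.solve(K_theta, J_theta.T))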

2.2 Identification

To obtain the desired stiffness model parameters from experimental data, an identification procedure should be used. The main goal here is to find the unknown stiffness parameters from torque and displacement measurements. Equation (6) sets the relationship between the end-effector displacement and the applied wrench


through the joint stiffness matrix and allows us to formulate the parameter identification problem. First, it is more convenient to rewrite it in the form

Δt = Σ_{i=1}^{n} (J_θ,i k_θ,i J_θ,i^T) w    (8)

where n is the number of measurements, the matrices k_θ,i denote the link/joint compliances to be identified, and J_θ,i denote the sub-Jacobians, J_θ = [J_θ,1, J_θ,2, ...]. Further, it is possible to rewrite Eq. (8) in a form standard for identification, i.e. as a linear function with respect to the parameters to be identified:

Δt = A_k(q, w) k + ε    (9)

where A_k = [J_θ,1 J_θ,1^T w, ..., J_θ,n_θ J_θ,n_θ^T w] is the so-called “observation matrix” and k = (k_θ,1, k_θ,2, ..., k_θ,n_θ)^T is the compliance vector. Equation (9) includes the measurement noise ε, which is treated as an independent identically distributed (i.i.d.) random variable with zero mean and standard deviation σ. For this reason, the least-squares approach is applied, which minimizes the sum of squared residuals

Σ_{i=1}^{m} ||Δt_i − A_k(q_i, w_i) k|| → min_k.    (10)

The index i here defines the manipulator configuration number. The solution of Eq. (10) has the form

k̂ = (Σ_{i=1}^{m} A_i^T A_i)^{-1} Σ_{i=1}^{m} A_i^T Δt_i + (Σ_{i=1}^{m} A_i^T A_i)^{-1} Σ_{i=1}^{m} A_i^T ε_i    (11)

where the second term describes the stochastic component. The impact of the measurement errors is defined by the covariance matrix, which defines the accuracy of identification:

cov(k̂) = σ² (Σ_{i=1}^{m} A_i^T A_i)^{-1}    (12)

where σ is the standard deviation and Σ_{i=1}^{m} A_i^T A_i is called the information matrix. The obtained covariance matrix can be used both for the accuracy evaluation of the identified parameters and for the design of the calibration experiment. Equation (12) assumes that the measurement errors are similar for all directions and configurations. If this assumption is not valid, an additional weight matrix could be introduced to correct the result.
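A minimal numpy sketch of the identification step is given below; it assumes each sub-Jacobian is a 6×1 column and uses numpy's least-squares solver as a numerically robust equivalent of Eq. (11). The data format of the measurements is an assumption of the sketch.

import numpy as np

def observation_matrix(sub_jacobians, w):
    # Columns J_i J_i^T w of the observation matrix A_k (Eq. (9));
    # each sub-Jacobian J_i is assumed to be a 6x1 column.
    return np.column_stack([J_i @ (J_i.T @ w) for J_i in sub_jacobians])

def identify_compliances(measurements):
    # measurements: list of (sub_jacobians, w, delta_t) triples (assumed format).
    A = np.vstack([observation_matrix(Js, w) for Js, w, _ in measurements])
    d = np.concatenate([dt for _, _, dt in measurements])
    k_hat, *_ = np.linalg.lstsq(A, d, rcond=None)
    return k_hat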

2.3 Design of Experiment

Different robot configurations have a different influence on the accuracy of estimated stiffness parameters for the same force orientation and magnitude. To


Table 1. Objective functions of existing approaches used in calibration.

Approach | Objective function
A-optimality | trace(X^{-1}) → min_{x_1,...,x_m}
D-optimality | det(X) → max_{x_1,...,x_m}
G-optimality | max diag(X^{-1}) → min_{x_1,...,x_m}
E-optimality | max eig(X^{-1}) → min_{x_1,...,x_m}
Condition number | σ_1/σ_s → min_{q_1,...,q_m}
Noise amplification index | σ_s²/σ_1 → max_{q_1,...,q_m}
Product of singular values | (σ_1 ⋯ σ_s)^{1/s} → max_{q_1,...,q_m}
Minimum singular value | σ_s → min_{q_1,...,q_m}

obtain a reliable solution, several robot configurations should be used. The simplest way to get a set of measurement configurations is to use random ones. In this case, the accuracy will be proportional to the number of experiments if the number of measurements is high enough. Thus, to accurately estimate the compliance parameters, too many experiments would have to be carried out. To reduce the number of measurements, a design of experiments can be applied [5,9,11], which ensures high accuracy with a lower number of configurations. To select the desired set of configurations, several optimization criteria can be applied [28] (see Table 1). The A-optimality criterion tries to minimize the average variance of the estimated parameters, while D-optimality tends to give extreme importance to the most sensitive parameters of the model. G-optimality minimizes the maximum variance of the estimated parameters. Finally, E-optimality minimizes the maximum variance of all possible normalized linear combinations of the parameter estimates. The group of criteria represented in the bottom part of Table 1 is based on the singular value decomposition (SVD) of the kinematic Jacobian. Here σ_1 and σ_s are the maximum and minimum singular values. The optimization criteria presented in Table 1 do not directly evaluate the impact of measurement errors on the robot accuracy. Industrial robots typically work with a restricted set of configurations, and the task of calibration is to provide the best accuracy for these particular robot states. In this case, the following statistical performance measure can be introduced [15]

ρ_0 = √( E(δp^T δp) )    (13)

which is the root-mean-square distance of the deflection of the real end-effector position from its desired value δp, with E(·) denoting the mathematical expectation. If the position deflection is linearized as δp = A · δk, then Eq. (13) can be written as

ρ_0² = σ² · trace( A_0 (Σ_{i=1}^{m} A_i^T A_i)^{-1} A_0^T ).    (14)


After that, the optimization problem can be formulated as

trace( A_0 (Σ_{i=1}^{m} A_i^T A_i)^{-1} A_0^T ) → min_{q_1,...,q_m}.    (15)
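The test-pose objective of Eq. (15) can be evaluated, for example, as follows; this is only an illustrative sketch with assumed inputs (the observation matrices A_i of the candidate configurations and A_0 of the test configuration).

import numpy as np

def test_pose_objective(A_list, A0):
    # Eq. (15): trace(A0 (sum_i A_i^T A_i)^{-1} A0^T)
    info = sum(A.T @ A for A in A_list)          # information matrix
    return np.trace(A0 @ np.linalg.solve(info, A0.T))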

3 Model Description

It was mentioned before that the virtual joint method is very appealing for manipulator stiffness modeling [10]. Within this approach, several alternative techniques have been proposed. They differ in the number of parameters and the dimensions of the virtual springs describing the link/joint elastostatic properties. To be more specific, let us consider the FANUC R-2000iC/165F robotic manipulator (Fig. 1) performing a machining operation. This robot consists of a fixed base, a serial chain of links with non-negligible compliance, a number of compliant actuated joints, and the end-effector, which is in contact with a workpiece undergoing machining; an external wrench w is applied to it due to the technological process. In order to build the full stiffness model of the robot, the VJM proposes to model the stiffness of each link with a 6-DoF spring (three components corresponding to translation and three to orientation) and the stiffness of a joint with a 1-DoF spring positioned along the axis of rotation (Fig. 2d), which results in the full stiffness matrix of the manipulator, K_f ∈ R^{42×42}. More than 250 parameters should be known to use this model [13], which ideally are obtained from the CAD model of the robot. The problem is that a proper CAD model with all the material properties is never known to the end user. As a result, a lot of effort was put into constructing simpler models which can provide results close to the full model.

Fig. 1. FANUC R-2000iC/165F and its kinematic scheme with the workspace.


Table 2. Geometric parameters of the links, m.

Parameter | Length | External size | Internal size
d1 (rectangle) | 0.324 | 0.25 | 0.16
d2 (rectangle) | 0.312 | 0.24 | 0.19
d3 (rectangle) | 1.075 | 0.16 | 0.18
d4 (circle) | 1.28 | 0.14 | 0.10
d5 (circle) | 0.225 | 0.12 | 0.08
d6 (rectangle) | 0.215 | 0.12 | 0.08

There are three main approaches for developing a reduced stiffness model. The first one is used when the stiffness of the joints is significantly higher than the stiffness of the links, so the joints can be considered rigid (Fig. 2c). In this case, the model is reduced and the stiffness matrix K_rl is of dimension 36 × 36. The second approach is used in the opposite case, i.e. when the stiffness of the links is much higher than the stiffness of the joints, so the links are taken as rigid (Fig. 2b). The stiffness matrix K_rj in this case is diagonal and has dimension 6 × 6.

Fig. 2. Possible VJM models for four different cases: (a) rigid joints and links, (b) rigid links and flexible joints, (c) flexible links and rigid joints, (d) flexible links and joints, (e) full and reduced models.


The last approach is used when the link stiffness is higher than the joint stiffness but not negligible. Here, part of the link stiffness is incorporated into the stiffness of the corresponding joint. As in the previous case, the stiffness matrix $K_{rjl}$ is diagonal and of size 6 × 6. The first approach is rarely used in industry because the links of industrial robots are stiff compared to the actuator transmissions and, more importantly, it requires a CAD model. Regarding the second approach, the links are usually not rigid enough to neglect their compliance, especially for highly accurate machining operations; this model is used in the current work to understand the contribution of each elastic component to the overall stiffness matrix. The main analysis was performed on the full and reduced models; from now on, the reduced model means the one obtained by the third approach, and whenever we refer to other reduced models it will be mentioned explicitly. Based on the geometric and virtual-spring parameters of the FANUC R-2000iC/165F (Table 4), its extended geometric full and reduced models can be developed; both are graphically presented in Fig. 2e. For the full model (Eq. (16)), we approximate the links by hollow cylindrical or hollow square beams of the corresponding length (Table 2) and, based on that and the properties of the link material (Table 3), compute the stiffness of the links. For the reduced elastostatic model (Eq. (17)), we use the identified stiffness of the joints (Table 5), which takes the stiffness of the links into account.

$$t_f = R_z(q_1)R_z(\theta^f_1)T_z(d_1)T_{3D}(\theta^f_{2:7})T_x(d_2)T_{3D}(\theta^f_{8:13})\, R_y(q_2)R_y(\theta^f_{14})T_z(d_3)T_{3D}(\theta^f_{15:20})R_y(q_3)R_y(\theta^f_{21})\, T_x(-d_4)T_{3D}(\theta^f_{22:27})T_z(d_5)T_{3D}(\theta^f_{28:33})R_x(q_4)R_x(\theta^f_{34})\, R_y(q_5)R_y(\theta^f_{35})R_x(q_6)R_x(\theta^f_{36})T_z(d_6)T_{3D}(\theta^f_{37:42}) \quad (16)$$

$$t_r = R_z(q_1)R_z(\theta^r_1)T_z(d_1)T_x(d_2)R_y(q_2)R_y(\theta^r_2)\, T_z(d_3)R_y(q_3)R_y(\theta^r_3)T_x(-d_4)T_z(d_5)R_x(q_4)\, R_x(\theta^r_4)R_y(q_5)R_y(\theta^r_5)R_x(q_6)R_x(\theta^r_6)T_z(d_6) \quad (17)$$

where $T_u$ is a homogeneous transformation matrix with translation in the u direction, $R_u$ is a homogeneous transformation matrix with rotation about the u axis, and $T_{3D}$ is a homogeneous transformation matrix with three translational and three rotational components.

Table 3. Physical properties of the material.

Parameter              Value
Young's modulus, Pa    7.0 · 10^7
Poisson's ratio        0.346
Density, kg/m^3        2.7 · 10^3
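As an illustration of how such link stiffnesses could be derived, the sketch below computes cross-section properties of a hollow circular or hollow square section from Table 2 and a simple cantilever-type bending stiffness $3EI/L^3$ with the material of Table 3. This is only a rough Euler–Bernoulli approximation chosen for illustration, not the exact beam model used by the authors.

```python
import math

E = 7.0e7  # Young's modulus from Table 3, Pa

def hollow_circle_inertia(d_out, d_in):
    """Second moment of area of a hollow circular section, m^4."""
    return math.pi * (d_out**4 - d_in**4) / 64.0

def hollow_square_inertia(a_out, a_in):
    """Second moment of area of a hollow square section, m^4."""
    return (a_out**4 - a_in**4) / 12.0

def cantilever_bending_stiffness(E, I, L):
    """Tip force per unit tip deflection of a cantilever beam, N/m."""
    return 3.0 * E * I / L**3

# Example: link d4 from Table 2 (hollow circle, L = 1.28 m, D = 0.14 m, d = 0.10 m).
I4 = hollow_circle_inertia(0.14, 0.10)
print(cantilever_bending_stiffness(E, I4, 1.28))
```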


Table 4. Joint stiffness values and upper and lower limits.

Joint  Compliance, µm/N  Lower limit, deg  Upper limit, deg
1      0.40              −179              179
2      0.29              −50               90
3      0.29              −155              120
4      2.50              −350              350
5      2.86              −122              122
6      2.00              −350              350

3.1 Stiffness Estimation

To put it briefly, identification consists of two steps: data collection and data fitting. In the case of stiffness identification, the data are measurements of the end-effector displacement in different configurations under external forces of various orientations and magnitudes. Since the scope of this paper is rather theoretical, we do not have real measurements; instead we obtain them from the full elastostatic model of the manipulator, i.e. we perform virtual experiments. To be more precise, we first assume that both links and joints are compliant, then for each configuration we compute the Jacobian and use Eq. (8) to obtain the vector of displacements. Finally, we add measurement errors to imitate real-life conditions. These errors are modelled as independent identically distributed (i.i.d.) random values with zero mean and standard deviation σ; in this paper we take σ = 100 µm, a typical accuracy of high-quality measuring devices such as a FARO laser tracker. In our previous paper [18] we used a thousand random configurations and forces of random orientation and constant magnitude of 10^3 N for data generation; however, there exists a technique for the optimal selection of measurement configurations and applied forces called design of experiments. This technique, described in the previous section, allows the number of experiments to be reduced drastically: for example, instead of a thousand experiments only twenty or even fewer can be used. For the optimization, the NOMAD (Nonlinear Optimization with the MADS, Mesh Adaptive Direct Search, algorithm) solver [16] was used. Regarding the data fitting, during this phase one tries to fit the data to the chosen model. In our case the model is the reduced elastostatic model of the manipulator, for which we try to find the diagonal elements of the stiffness matrix. That is to say, assuming that all the compliance, due both to joints and to links, is lumped in the joints, we try to find an equivalent manipulator with elastic joints and rigid links that has the same displacements as the full elastostatic model of the given manipulator. We tried several approaches. The first one is so-called random, where we take random configurations and random forces with constant magnitude but different orientations. The second approach is test-pose, where we find optimal configurations and forces by minimizing the trace of a weighted covariance matrix. The others are based on the optimization criteria listed in Table 1. The number of configurations in all cases is twelve.
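The virtual-experiment loop can be summarised by the following sketch: synthetic end-effector displacements are generated from a "full" compliance model, corrupted with i.i.d. noise, and the lumped joint compliances of the reduced model are then recovered by linear least squares. All matrices and dimensions here are illustrative placeholders; Eq. (8) of the paper is not reproduced, a generic linear relation between joint compliances and measured displacements is assumed instead.

```python
import numpy as np

rng = np.random.default_rng(42)
n_joints, n_exp = 6, 20
sigma = 100e-6  # measurement noise, m (100 micrometres)

# "True" joint compliances of the virtual robot (illustrative values).
c_true = np.array([0.45, 0.30, 0.45, 3.0, 3.5, 2.5]) * 1e-6

# For each virtual experiment, a regressor row maps joint compliances to the
# measured displacement; in the paper this role is played by the
# configuration-dependent Jacobian and the applied force (Eq. (8)).
A = np.abs(rng.standard_normal((n_exp, n_joints)))
d_meas = A @ c_true + sigma * rng.standard_normal(n_exp)  # noisy virtual measurements

# Data fitting: ordinary least squares for the reduced (lumped-joint) model.
c_hat, *_ = np.linalg.lstsq(A, d_meas, rcond=None)
print(np.round(c_hat * 1e6, 2))  # identified compliances, same scale as c_true
```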

Table 5. Identified joint compliances of the manipulator, µm/N.

Approach                     1     2     3     4     5     6
Random                       0.43  0.43  0.44  5.26  3.70  0.77
Test-pose                    0.46  0.30  0.48  3.75  3.72  3.22
A-optimality                 0.46  0.27  0.51  4.17  3.85  2.08
D-optimality                 0.49  0.29  0.47  2.70  3.85  2.70
E-optimality                 0.60  0.52  0.38  2.78  4.76  3.85
G-optimality                 0.52  0.33  0.45  4.17  4.00  1.64
Condition number             0.57  0.35  0.32  2.22  4.55  3.33
Noise amplification index    0.55  0.43  0.16  3.70  3.33  3.57
Product of singular values   0.50  0.32  0.53  2.33  3.03  3.13
Minimum singular value       0.64  0.40  0.27  3.23  4.35  2.94

Table 6. Performance of each optimization method, mm.

Approach                     Maximal error       Mean error
                             Full    Limited     Full    Limited
Random                       0.97    0.98        0.52    0.88
Test-pose                    0.89    0.78        0.27    0.25
A-optimality                 0.90    0.90        0.30    0.78
D-optimality                 0.85    0.81        0.24    0.69
E-optimality                 1.22    0.78        0.54    0.37
G-optimality                 0.80    0.72        0.22    0.58
Condition number             0.72    0.56        0.22    0.44
Noise amplification index    0.91    0.72        0.39    0.53
Product of singular values   0.83    0.78        0.29    0.65
Minimum singular value       0.89    0.81        0.29    0.27

The obtained values of the robot joint compliances are presented in Table 5. Error values for all optimization criteria in the case of the full and reduced workspaces are shown in Table 6; by reduced workspace we mean a cube of 10 cm centred at the test pose q_0 = [0.1, −0.0999, 1.7628, 0.1, 0.1, 0.1]^T rad. By means of design of experiments it is possible to improve performance at least two-fold compared to random configurations for the same number of experiments. The lowest error corresponds to the "condition number" criterion, which will be used in the rest of the paper. The Cartesian compliance matrix $k_c$ is the inverse of the Cartesian stiffness matrix $K_c$; it sets the relationship between the generalized displacement of the end-effector and the applied external wrench (Eqs. (6)–(7)). The problem is that $k_c$ has elements with different physical units, which causes difficulties in analyzing the compliance of


the robot. Although it was stated in [30] that in machining operations the rotational displacement of the tool can be neglected compared to its translational component, here we present a thorough analysis of both translational and rotational displacements. Combining Eqs. (6) and (7) we have:

$$\begin{bmatrix} \Delta p \\ \Delta o \end{bmatrix} = k_c \begin{bmatrix} F \\ 0 \end{bmatrix} \quad (18)$$

where $\Delta p = [\Delta p_x, \Delta p_y, \Delta p_z]^T$, $\Delta o = [\Delta o_x, \Delta o_y, \Delta o_z]^T$ and $F = [F_x, F_y, F_z]^T$. The compliance matrix $k_c$ can be divided into four submatrices [7]:

$$k_c = \begin{bmatrix} k_{c,tt} & k_{c,tr} \\ k_{c,tr}^T & k_{c,rr} \end{bmatrix} \quad (19)$$

where $k_{c,tt}$ is the translational compliance submatrix, $k_{c,rr}$ is the rotational compliance submatrix, and $k_{c,tr}$ is the coupling compliance submatrix. A direct relationship between $\Delta p$ and $F$, as well as between $\Delta o$ and $F$, can be obtained by substituting Eq. (19) into Eq. (18):

$$\Delta p = k_{c,tt} F, \qquad \Delta o = k_{c,tr}^T F. \quad (20)$$

Equation (20) is used further in order to build deflection maps.

3.2 Deflection Maps

When the joint stiffness values are known, the deflections for each configuration can be calculated. Based on that, we are able to obtain the deflections along the end-effector trajectory or over a whole workspace plane (a deflection map). Several problems can be encountered during deflection map computation. One of them is the choice of robot configuration: some parts of the workspace can be reached with several configurations, while other parts only with one. Each configuration is characterized by its own Cartesian stiffness matrix, so the question is which one should be used when several configurations are possible. This question does not arise for operation in the XOY plane, because each point there can be reached without changing the manipulator configuration; in the orthogonal plane (XOZ), however, the configuration must be changed in order to reach the highest or lowest points of the workspace. There are two main configurations for the manipulator used in the simulations and represented in Fig. 1: "elbow up", when the angle q3 is negative, and "elbow down", when q3 is positive. For the sake of simplicity, only "elbow up" is considered in the horizontal plane, as it is more common. In the case of the vertical plane, the deflections for the two parts of the workspace should be combined; it is assumed that the manipulator operates in an optimal mode, hence the minimal deflection is chosen in the intersection of the two regions. The second problem is the value and direction of the applied force that produces the worst result, i.e. the maximal deflection. This problem can be solved by


exploiting the singular value decomposition (SVD) [17] applied to the translational and coupling compliance submatrices (Eq. (20)):

$$k_{c,tt} = U_{tt} S_{tt} V_{tt}^T, \qquad k_{c,rt} = U_{rt} S_{rt} V_{rt}^T \quad (21)$$

where the diagonal elements of $S_{tt}$ and $S_{rt}$ are the non-negative singular values in decreasing order, the columns of $U_{tt}$ and $U_{rt}$ are the left singular vectors, and the columns of $V_{tt}$ and $V_{rt}$ are the right singular vectors. The first elements of the matrices $S_{tt}$ and $S_{rt}$ are the absolute values of the maximal deflection obtained under a unit force (1 N). The direction of this deflection is the corresponding left singular vector, i.e. the first column of $U$, and it is produced by a force applied along the first column of $V$. In order to obtain realistic deflection maps, we use a force of magnitude 1 kN applied along the direction causing the maximal deflection. A similar approach was used in [7].
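A minimal sketch of this worst-case analysis, assuming an arbitrary (illustrative) 3 × 3 translational compliance submatrix: the largest singular value gives the maximal deflection per newton, the first right singular vector gives the force direction producing it, and the first left singular vector gives the resulting displacement direction.

```python
import numpy as np

# Illustrative translational compliance submatrix k_c,tt (m/N), symmetric positive definite.
k_tt = np.array([[2.0e-6, 0.3e-6, 0.1e-6],
                 [0.3e-6, 1.5e-6, 0.2e-6],
                 [0.1e-6, 0.2e-6, 3.0e-6]])

U, S, Vt = np.linalg.svd(k_tt)      # k_tt = U @ diag(S) @ Vt
worst_force_dir = Vt[0]              # unit force direction causing maximal deflection
worst_defl_dir = U[:, 0]             # direction of the resulting deflection
max_defl_per_newton = S[0]           # maximal deflection for a 1 N force, m

F = 1000.0 * worst_force_dir         # 1 kN along the worst-case direction
dp = k_tt @ F                        # resulting translational deflection, m
print(max_defl_per_newton * 1000.0, np.linalg.norm(dp))  # both give the same magnitude
```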

4 Results

4.1 Deflection Maps

Deflection maps for the manipulator with the full elastostatic model are shown in Fig. 3. The left image corresponds to the horizontal plane at the 1.5 m level, i.e. the z coordinate is fixed at z = 1.5 m while x and y span the whole workspace. The map demonstrates radial symmetry, and the deflection grows from the center to the edges due to the increase of the lever length. For the vertical plane at y = 0 m, as in the horizontal case, the deflection increases with the distance from the origin of the base link for the same reason; here no symmetry can be spotted, since the second and third joint limits are not symmetrical with respect to the Z axis. The maximum deflections that the manipulator undergoes under a 1 kN force applied in the direction causing maximal deflection are 4 mm for the vertical plane and 3.5 mm for the horizontal plane. The external force affects not only the position but also the orientation of the robot end-effector. Figure 4 shows the orientation deflection values for the full model over the workspace: there are two distinct levels for the "elbow up" configurations, within each level the deflections are homogeneous, and overall the deflections are small, not more than 2 mrad. Deflection maps of the manipulator with the reduced elastostatic model obtained from the calibration process (Table 5) are shown in Figs. 5 and 6. It is rather obvious that the deflection map of the reduced model differs from that of the full model, since the number of elastic elements representing the manipulator is much smaller. However, the difference is not large, because the experiment-based identification of the joint compliance parameters implicitly incorporates the link compliance. For example, the maximum deflections in the horizontal plane are the same for both models, while in the vertical plane they differ slightly and the change in deflection levels is smoother for the full model. To draw some


Fig. 3. Position deflection maps for full elastostatic model.

Fig. 4. Orientation deflection maps in full elastostatic model.

Fig. 5. Position deflection maps in reduced elastostatic model.


Fig. 6. Orientation deflection maps in reduced elastostatic model.

quantitative conclusions, it is better to analyze the difference between the full and reduced model deflection maps, provided in Figs. 7 and 8. Figure 7 shows that the maximum difference is approximately 0.55 mm for the horizontal plane and 0.7 mm for the vertical plane; poor compensation can be observed around the origin, and it is specific to the "elbow-up" configurations. Regarding orientations (Fig. 8), the maximal difference in the horizontal plane is slightly more than 0.2 mrad, while in the vertical plane it is 0.25 mrad. The common characteristic for both position and orientation is that the reduced model reproduces the full model's deflections less accurately around the base. The overall conclusion is simple: the reduced model recovers the performance of the full model very well for the setup considered in this paper while requiring many fewer parameters (Fig. 9 demonstrates that the error distributions are almost the same).

Fig. 7. Difference in position deflection maps between full and reduced models.


Fig. 8. Difference in orientation deflection maps between full and reduced models.

4.2 Deflection Compensation

By comparing the trajectories of the end-effector with and without reduced-model-based compensation for a given technological process, we can assess the accuracy of compliance error compensation. Assume that the manipulator has to move along the trajectories shown in Fig. 10 while being affected by the external force F = [440, −1370, −635, 0, 0, 0]^T N, a cutting force caused by a machining process [8]. Depending on the value and the orientation of the applied force, the end-effector trajectory can differ from the desired one.

Fig. 9. Error histograms of the full (red) and reduced (blue) models: (a) horizontal plane, (b) vertical plane.


In order to compensate for the difference between the reference trajectory of the tool, r(t), and the real trajectory, p(t), the deflection at each sampling point, Δp(t), should be found using the reduced stiffness model of the manipulator. Then a compensation technique is applied to obtain a new reference trajectory, r*(t); in this paper the mirroring technique is used. The compensated trajectory is used as the input of the manipulator control system. The difference between the desired and obtained trajectories for the robot without compensation in every direction is shown in Fig. 10b, e and h, while the same for the robot with compensation is demonstrated in Fig. 10c, f and i. The mean and maximum errors for the given trajectories are presented in Table 7. For the circle trajectory, the maximum error is decreased by 66% in X, 74% in Y and 95% in Z; at the same time, the mean error is reduced by 65%, 78% and 95% in the X, Y and Z directions, respectively. The rectangle trajectory was built in the same area as the circle and has a similar size, so the obtained errors are almost the same. For the line trajectory, the error reduction in the X, Y and Z directions is 70%, 85% and 95% for the maximal value and 60%, 88% and 95% for the mean error, respectively. These numbers demonstrate the excellent quality of the compensation despite the fact that it is based on the reduced model.
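A compact sketch of the mirroring idea under the simplest interpretation: the deflection predicted by the reduced model for the commanded wrench is subtracted from (mirrored about) the reference at every sampling point, so that the deflected tool lands back on the desired path. The compliance model and trajectory below are illustrative placeholders, not the paper's data.

```python
import numpy as np

def predicted_deflection(k_c_tt, wrench_force):
    """Translational deflection predicted by the reduced model, in the spirit of Eq. (20)."""
    return k_c_tt @ wrench_force

def mirror_compensate(reference_point, k_c_tt, wrench_force):
    """Shift a reference point opposite to the predicted deflection: r*(t) = r(t) - dp(t)."""
    return reference_point - predicted_deflection(k_c_tt, wrench_force)

# Illustrative data: a circular reference path and a constant cutting force.
t = np.linspace(0.0, 2.0 * np.pi, 100)
r = np.stack([0.1 * np.cos(t), 0.1 * np.sin(t), np.full_like(t, 1.5)], axis=1)
F = np.array([440.0, -1370.0, -635.0])          # N, the cutting force from the paper
k_tt = np.diag([0.5e-6, 0.5e-6, 0.3e-6])        # placeholder compliance matrix, m/N

r_star = np.array([mirror_compensate(p, k_tt, F) for p in r])
print(r_star.shape)  # (100, 3)
```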

5 Discussion

So far, the analysis was conducted for a manipulator having the geometric properties of the FANUC R-2000iC/165F. To develop its full elastostatic model, we approximated all the links with geometric primitives such as hollow cylinders and boxes, due to the absence of a CAD model. Thus, the results hold for this particular case and are not directly applicable to a generic manipulator. To generalize the obtained results, we first define a more practical workspace, which has the same shape as the full one but is smaller, so as to avoid singularities (Fig. 11). Then we vary the thickness of the link walls from 50% to 150% of the nominal thickness to understand how it affects the quality of compensation in the full and restricted workspaces. As the wall thickness decreases, the quality of compensation decreases as well, for both position and orientation. For position, the efficiency of the compensation in both workspaces is especially sensitive to wall thickness in the range from 50% to 75% of the nominal thickness (Fig. 12a). This happens because thin walls have low stiffness and therefore contribute a great deal to the overall deflection; although the reduced model used throughout this work accounts for some of the link compliance, it is not enough to provide satisfactory compensation quality for thin-walled links. For orientation, a similar saturation is observed: the thicker the walls, the better the compensation (Fig. 12b).


Fig. 10. Compensated and uncompensated tool trajectories. (a) Circle in the Cartesian space. (b) Position error for the uncompensated circle trajectory. (c) Position error for the compensated circle trajectory. (d) Rectangle in the Cartesian space. (e) Position error for the uncompensated rectangle trajectory. (f) Position error for the compensated rectangle trajectory. (g) Line in the Cartesian space: the green dotted line is the uncalibrated trajectory, the blue dash-dot line is the desired trajectory, and the red solid line is the trajectory after the calibration process. (h) Position error for the uncompensated line trajectory. (i) Position error for the compensated line trajectory. In (h) and (i), the red solid line is the error in the X direction, the green dotted line the error in the Y direction, the blue dash-dot line the error in the Z direction, and the dashed line the error norm.

Table 7. Max and mean errors for different trajectories, mm.

Shape      Direction  Uncompensated          Compensated
                      Max error  Mean error  Max error  Mean error
Circle     X          1.72       1.15        0.59       0.40
           Y          4.17       3.03        1.07       0.68
           Z          1.54       0.90        0.08       0.05
Rectangle  X          1.72       1.15        0.59       0.40
           Y          4.17       3.03        1.07       0.68
           Z          1.54       0.90        0.08       0.05
Line       X          1.11       0.80        0.33       0.32
           Y          4.16       3.84        0.61       0.44
           Z          1.54       1.49        0.08       0.07

Fig. 11. Full and restricted workspaces.

It is also interesting to abstract away from the particular manipulator again and look at how the ratio between the longest robot links (the second and the third) affects the deflection. This dependency is shown in Fig. 13 and was obtained for the case when their total length is constant. The difference in deflections between the full and reduced models is normalized to its maximum for clarity of representation. Both the maximal and the mean deflection demonstrate a non-monotonic dependence on the link length ratio, and both curves have minima located close to one. Thus, we can conclude that, in order to minimize the end-effector deflection due to elasticity, the longest links of the robot arm should have approximately equal lengths.

(a) position  (b) orientation

Fig. 12. The relationship between link thickness and reduced elastostatic model based compliance error compensation.

Fig. 13. Change of deflection with respect to link length ratio.

There are several remarks to be made about the scope of this paper. First, for simplicity, we assume that the manipulator does not have a gravity compensator; its presence slightly complicates the analysis by making the stiffness of the second joint a nonlinear function of the angle q2. Second, the metrics used in this work are the maximum and mean values of the deflection when we analyze the workspace, and the maximum and mean errors in the X, Y and Z directions when we consider trajectory tracking; we admit that the use of other metrics could provide slightly different results. Last, while performing the deflection analysis over the whole workspace we did not assume any specific machining operation, for the sake of generality, and used the singular value decomposition to define the direction of maximum deflection for the full model, applying a force of magnitude 1 kN along this direction for both the full and reduced models when estimating the deflection.


However, for a specific machining operation, the analysis can be particularized by defining the force according to the given machining process.

6 Conclusion

In the majority of studies focused on deflection compensation of industrial manipulators, the authors assume the links to be rigid. In this work, we focus on understanding when this assumption holds and in which cases the compliance of the links cannot be neglected. In order to compensate for compliance errors, the model compliances must first be identified; in this paper, we used the reduced elastostatic model of the robot. It was shown that when the walls of the links, modeled as hollow cylinders, are thin, the full elastostatic model should be considered, while when they are thick, as in the case of the FANUC R-2000iC/165F, the reduced model can in general compensate more than 80% of the overall deflections, depending on the actual wall thickness.

Acknowledgement. This research has been supported by the grant of the Russian Science Foundation 17-19-01740.

References

1. Alici G, Shirinzadeh B (2005) Enhanced stiffness modeling, identification and characterization for robot manipulators. https://doi.org/10.1109/TRO.2004.842347
2. Belchior J, Guillo M, Courteille E, Maurine P, Leotoing L, Guines D (2013) Off-line compensation of the tool path deviations on robotic machining: application to incremental sheet forming. Robot Comput Integr Manufact 29(4):58–69. https://doi.org/10.1016/j.rcim.2012.10.008
3. Chen Y, Gao J, Deng H, Zheng D, Chen X, Kelly R (2013) Spatial statistical analysis and compensation of machining errors for complex surfaces. Precis Eng 37(1):203–212. https://doi.org/10.1016/j.precisioneng.2012.08.003
4. Corradini C, Fauroux JC, Krut S, et al (2003) Evaluation of a 4-degree of freedom parallel manipulator stiffness. In: Proceedings of the 11th world congress in mechanisms and machine science, Tianjin, China
5. Daney D, Papegay Y, Madeline B (2005) Choosing measurement poses for robot calibration with the local convergence method and Tabu search. Int J Robot Res 24(6):501–518. https://doi.org/10.1177/0278364905053185
6. Guillo M, Dubourg L (2016) Impact & improvement of tool deviation in friction stir welding: weld quality & real-time compensation on an industrial robot. Robot Comput Integr Manufact 39:22–31. https://doi.org/10.1016/j.rcim.2015.11.001
7. Guo Y, Dong H, Ke Y (2015) Stiffness-oriented posture optimization in robotic machining applications. Robot Comput Integr Manufact 35:69–76. https://doi.org/10.1016/j.rcim.2015.02.006
8. Klimchik A, Ambiehl A, Garnier S, Furet B, Pashkevich A (2017) Efficiency evaluation of robots in machining applications using industrial performance measure. Robot Comput Integr Manufact 48:12–29. https://doi.org/10.1016/j.rcim.2016.12.005


9. Klimchik A, Caro S, Pashkevich A (2015) Optimal pose selection for calibration of planar anthropomorphic manipulators. Precis Eng 40:214–229. https://doi.org/10.1016/j.precisioneng.2014.12.001
10. Klimchik A, Chablat D, Pashkevich A (2014) Stiffness modeling for perfect and non-perfect parallel manipulators under internal and external loadings. Mech Mach Theory 79:1–28. https://doi.org/10.1016/j.mechmachtheory.2014.04.002
11. Klimchik A, Chablat D, Pashkevich A (2019) Advancement of MSA-technique for stiffness modeling of serial and parallel robotic manipulators. In: Arakelian V, Wenger P (eds) ROMANSY 22 - robot design, dynamics and control. Springer, Cham, pp 355–362
12. Klimchik A, Furet B, Caro S, Pashkevich A (2015) Identification of the manipulator stiffness model parameters in industrial environment. Mech Mach Theory 90:1–22. https://doi.org/10.1016/j.mechmachtheory.2015.03.002
13. Klimchik A, Pashkevich A, Chablat D (2013) CAD-based approach for identification of elasto-static parameters of robotic manipulators. Finite Elem Anal Des 75:19–30. https://doi.org/10.1016/j.finel.2013.06.008
14. Klimchik A, Pashkevich A, Chablat D (2018) MSA-technique for stiffness modeling of manipulators with complex and hybrid structures. In: 12th IFAC symposium on robot control - SYROCO 2018, Budapest, Hungary, June 2018
15. Klimchik A, Wu Y, Pashkevich A, Caro S, Furet B (2012) Optimal selection of measurement configurations for stiffness model calibration of anthropomorphic manipulators. Appl Mech Mater 162:161–170. https://doi.org/10.4028/www.scientific.net/AMM.162.161
16. Le Digabel S (2011) Algorithm 909: NOMAD: nonlinear optimization with the MADS algorithm. ACM Trans Math Softw (TOMS) 37(4):44
17. Leon SJ, Bica I, Hohn T (1980) Linear algebra with applications. Macmillan, New York
18. Mamedov S, Popov D, Mikhel S, Klimchik A (2018) Compliance error compensation based on reduced model for industrial robots. In: Proceedings of the 15th international conference on informatics in control, automation and robotics, ICINCO 2018, Porto, Portugal, 29–31 July 2018, vol 2, pp 190–201
19. Martin HC (1966) Introduction to matrix methods of structural analysis. McGraw-Hill, New York
20. Nubiola A, Bonev IA (2013) Absolute calibration of an ABB IRB 1600 robot using a laser tracker. Robot Comput Integr Manufact 29(1):236–245. https://doi.org/10.1016/j.rcim.2012.06.004
21. Olabi A, Damak M, Bearee R, Gibaru O, Leleu S (2012) Improving the accuracy of industrial robots by offline compensation of joints errors. In: 2012 IEEE international conference on industrial technology (ICIT), IEEE, pp 492–497
22. Ozaki T, Suzuki T, Furuhashi T, Okuma S, Uchikawa Y (1991) Trajectory control of robotic manipulators using neural networks. IEEE Trans Ind Electron 38(3):195–202. https://doi.org/10.1109/41.87587
23. Pan Z, Zhang H, Zhu Z, Wang J (2006) Chatter analysis of robotic machining process. J Mater Process Technol 173(3):301–309. https://doi.org/10.1016/j.jmatprotec.2005.11.033
24. Pashkevich A, Chablat D, Wenger P (2009) Stiffness analysis of overconstrained parallel manipulators. Mech Mach Theory 44(5):966–982. https://doi.org/10.1016/j.mechmachtheory.2008.05.017


25. Pashkevich A, Klimchik A, Chablat D (2011) Enhanced stiffness modeling of manipulators with passive joints. Mech Mach Theory 46(5):662–679. https://doi.org/10.1016/j.mechmachtheory.2010.12.008
26. Pigoski T, Griffis M, Duffy J (1998) Stiffness mappings employing different frames of reference. Mech Mach Theory 33(6):825–838. https://doi.org/10.1016/S0094-114X(97)00083-9
27. Salisbury JK (1980) Active stiffness control of a manipulator in cartesian coordinates. In: 1980 19th IEEE conference on decision and control including the symposium on adaptive processes, IEEE, vol 19, pp 95–100. https://doi.org/10.1109/CDC.1980.272026
28. Sun Y, Hollerbach JM (2008) Observability index selection for robot calibration. In: 2008 IEEE international conference on robotics and automation, pp 831–836, May 2008. https://doi.org/10.1109/ROBOT.2008.4543308
29. Taghaeipour A, Angeles J, Lessard L (2010) Online computation of the stiffness matrix in robotic structures using finite element analysis. McGill University, Montreal, Department of Mechanical Engineering and Centre for Intelligent Machines
30. Zargarbashi S, Khan W, Angeles J (2012) Posture optimization in robot-assisted machining operations. Mech Mach Theory 51:74–86. https://doi.org/10.1016/j.mechmachtheory.2011.11.017
31. Zhang H, Wang J, Zhang G, Gan Z, Pan Z, Cui H, Zhu Z (2005) Machining with flexible manipulator: toward improving robotic machining performance. In: 2005 IEEE/ASME international conference on advanced intelligent mechatronics, proceedings, IEEE, pp 1127–1132. https://doi.org/10.1109/AIM.2005.1511161

Control of Force in Surgical Robots with Random Time Delays Using Model Predictive Control

Jasmeet Singh Ladoiye1, Dan S. Necsulescu1, and Jurek Sasiadek2

1 The University of Ottawa, Ottawa, ON K1N 6N6, Canada
{jlado023,necsu}@uottawa.ca
2 Carleton University, Ottawa, ON K1S 5B6, Canada
[email protected]

Abstract. Robotic telesurgery has gained a lot of interest in the medical and engineering fields with the advancement of technology in Minimum Invasive Surgery (MIS), which uses small incisions to operate and results in faster recovery of the patients. In the commercially available surgical systems, the surgeon cannot feel the sense of touch of the environment, unlike in open surgery. Control becomes harder when the robotic surgery is performed remotely in the presence of time delays. The development of force feedback will help to reduce tissue damage and end-effector deflection. The primary goal of this work is to develop force feedback for surgical arms affected by time delays. The random behavior of the network delays makes the control realization even harder and can make the response unstable. A novel approach is used to develop haptic feedback for Minimally Invasive Robotic Systems (MIRS) by using Model Predictive Control (MPC) in the case of random communication delays.

Keywords: Minimum invasive robotic surgery · Surgical robots · Force control · Model Predictive Control · Telecommunication · Bilateral master-slave control

1 Introduction

Interest in minimally invasive surgery (MIS) techniques has resulted in their adoption all over the world owing to their benefits. MIS techniques allow the surgeon to use long, thin instruments to operate inside the body through small incisions, with the help of laparoscopic vision. Patients have less pain and trauma and a faster recovery time with MIS compared to traditional open surgery [1, 2]. Apart from its advantages, MIS also has some shortcomings, such as limited tactile and force feedback because of the long instruments, reduced sight, restricted motion around the pivots and amplification of tremor due to the long lever arms [5]. Robotics research in the field of medical surgery has allowed the surgeon to extend his or her capabilities to a remote environment by overcoming geographical constraints [3]. Figure 1 shows conventional MIS using long instruments with the help of laparoscopic vision through tiny incisions.


Fig. 1. Conventional minimum invasive surgery [13].

Minimally invasive robotic surgery (MIRS) not only benefits the patients but also eases the surgeon's task by decreasing the intensive labor of surgery. The idea of remote surgery, proposed in the 1970s, aimed at providing medical support to astronauts in space [4]. Later, many robot-assisted systems for MIS were developed and made commercially available to carry out surgery. The Da Vinci system, developed by Intuitive Surgical in the early 2000s, is the most widely used commercial surgical arm; it was the first surgical arm approved by the FDA of the United States of America and has been used successfully in many medical institutes [6]. The kernel computer at the master end is responsible for transforming the surgeon's hand motion into movement commands that actuate the joint motors of the robot, transmitted through a digital signal processor (DSP). Researchers in Poland developed surgical arms under the name 'Robin Heart' [7]; the Robin Heart robots were also based on a kernel computer and a DSP for control, but the data transfer was done via the Controller Area Network (CAN) protocol. The development of force feedback is of large interest in the field of medical robotics. The commercially available arms lack force feedback to the surgeon: they depend solely on the visual feedback from the operation site, and the surgeon estimates the applied force by observing the change of color of the skin and organs for each maneuver. This constant repeated effort increases the operation time and leads to errors because of excessive force on the tissues or while stitching [12]. SOFAR S.p.A. developed the ALF-X surgical robot, which tracks the eye movements and provides tactile feedback, distinguishing it from the other surgical arms [8]. The RAVEN surgical arm, developed at the Biorobotics Lab of the University of Washington, uses Phantom Omni haptic devices at the master end and four slave manipulators at the patient's end [9]; this robot focused on carrying out remote surgeries without haptic feedback and uses TCP/UDP as a transport protocol.


The Technical University of Eindhoven developed the SOPHIE (Surgeon's Operating Force Feedback Interface Eindhoven) robotic arm, improving the existing Da Vinci arm by adding force feedback to the system [10]. ROBODOC, an orthopedic robot developed by Think Surgical Inc., can carry out operations in autonomous mode under the supervision of the surgeon [11]; it makes use of Computed Tomography (CT) scans and fiducial markers to plan the motion. The other significant problems in teleoperated systems are bandwidth limitation, packet losses, sampling and delays. Proportional-integral-derivative (PID) control is the most widely adopted strategy owing to its simplicity; model predictive control (MPC), a well-known model-based strategy, has the ability to handle nonlinear and time-delayed systems, as opposed to PID control. Developing force feedback will make surgeries less error-prone and faster, and will prevent damage to the tissue. This work is an extension of our previous work, in which an approach was discussed for the development of force feedback of the surgical arm using MPC in the case of time delays [15]. In this paper, a hybrid position/force feedback control strategy is discussed in the presence of random time delays occurring in the communication link. The random behavior of the latency can make the system unstable if not compensated for. MPC is a model-based control that uses a system model to predict future outputs and can handle the time delay and its effects. In summary, the significant contribution of this work is the analysis of the effectiveness of MPC as a force controller for surgical arms subject to random time delays. This control algorithm will enhance present systems by providing haptic feedback to the surgeon; such feedback will reduce the surgeon's labor and will prevent damage to the tissues by providing a sensation of touch analogous to that of open surgery. A systematic study of the effect of the MPC weight parameters for increasing time delays is also presented. This work is limited to robotic systems able to work in autonomous mode with a known reference under the supervision of the surgeon.

2 System’s Control Architecture Surgical robots operate based on the principle of master and slave robot. A typical teleoperated robot has three main components: a master end (surgeon’s side), slave end (robotic end) and a transmission medium (communication medium), as shown in Fig. 2. The Surgeon handles the master’s side with the main objective to operate the slave robot to carry out the surgery. The slave side gets extended by the tool-tissue interaction with the unknown environment inside the human body. It is very important to understand the type of the environment with which robot will interact. The robot is assumed here to work in a known environment for this work. Figure 3. shows the Simulink™ model the robotic model with time delays.


Fig. 2. System architecture for the hybrid position/force control.

Fig. 3. Simulink™ model for the development of the force feedback with time delay.

2.1 Communication System

The two ends, master and slave, are connected to each other through a communication link so as to carry out the task in synchronization. The system's main functions are coding the signals, decoding the signals and effective data transfer between the two ends. A communication system comprises a transmitter, a receiver and a communication medium; the main issues are the latency and the quality of the signal. Apart from the quality issues in


telesurgical systems, data loss is one of the most critical issues to take care of, and it is best handled by the User Datagram Protocol (UDP) [11]. Another novel approach was discussed by Ainchwar, Ladoiye and Necsulescu to compensate for packet losses during communication by using MPC [14]. An average time delay of 155 ms was measured in a transatlantic surgery [16], where a dedicated fiber-optic line based on an ATM (Asynchronous Transfer Mode) network was used. The total time delay in a teleoperated system is the sum of the transmission lag and the time taken for coding and decoding of the video signals; it was observed that 85 ms of delay occurred due to data transfer, while a 70 ms time delay occurred at the slave side due to encoding and decoding of the visual signals [16]. Anvari carried out an advanced laparoscopic surgery in 2005 at the Center of Minimum Access Surgery (CMAS) at McMaster University [17]; an average time delay of 150 ms was measured in the process using a commercial high-speed Virtual Private Network (VPN) link. In a telesurgery between Japan and Thailand, the average time delay in data was observed to be around 122 ms, whereas a visual delay of about 540 ms was recorded using a low-latency CODEC system [18]. The effect on the response of the system becomes more pronounced with the time delay variation. Smith and Chauhan carried out a study using a Da Vinci robotic simulator to study the effect of distance on latency [19]. The robotic simulator was initialized with real-time parameters to match reality, and the following observations were made:

1. A time delay of less than 200 ms went unnoticed by the surgeon.
2. A time delay between 200 ms and 500 ms was noticeable and was compensated by pausing the movements.
3. Beyond 500 ms the operation becomes insecure, because the time delay increases the number of errors in the surgery, as shown in Table 1.

Table 1. Experimental observations of the Da Vinci simulator [15].

Time delay (in milliseconds)  Effect on the system
0–200                         Safe
200–500                       Physically dependent on the surgeon
600 and more                  Unsafe

A systematic study is presented showing the effect on the tuning parameters of the MPC as the time delay increases. Figure 4 shows the block diagram of the Simulink™ model used to generate the random time delays acting as the communication medium in this study. Simulink™ generates the random numbers with a sampling time of 0.1 s; the block is designed to generate a maximum time delay of around 1 s and an average of 0.15 s. The Detachment block in the figure prevents values less than 0 from being sent. Two such blocks are added to the system to mimic the input and communication delays of the telerobotic system.
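A rough Python equivalent of such a delay generator, assuming (as an illustration only) that the delay is drawn from an exponential-like distribution with a 0.15 s mean, clipped to stay non-negative and below roughly 1 s; the authors' actual Simulink block and distribution are not specified beyond these targets.

```python
import numpy as np

def random_delay_profile(duration_s, ts=0.1, mean_delay=0.15, max_delay=1.0, seed=0):
    """Generate a clipped random delay value for every sampling instant."""
    rng = np.random.default_rng(seed)
    n = int(duration_s / ts)
    delays = rng.exponential(mean_delay, size=n)   # assumed distribution, mean ~0.15 s
    return np.clip(delays, 0.0, max_delay)         # never negative, never above ~1 s

delays = random_delay_profile(duration_s=30.0)
print(delays.mean(), delays.max())                 # close to 0.15 s, below 1.0 s
```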


Fig. 4. Simulink™ model for generating random time delay values for the force control strategy.

Figure 5 shows the total time delay observed in this study. It is vital to measure the latency in a closed-loop system, and the control strategy should be designed to compensate for the varying delays, which may lead to system instability if not accounted for.

Fig. 5. Total random time delay generated by the system.

2.2 Master Level

The slave end of the system is operated by the surgeon from the master console. The surgeon makes decisions about the future inputs through the perception available at the master console. Figure 6 shows the main components of the master end used in this study. The master's inputs are designed somewhat like the inputs of the minimally invasive


surgical tools. The surgeon also gets visual feedback from the slave robot that helps in effective decision making.

Fig. 6. Master model of the system.

A hybrid position/force control approach is used, as shown in Fig. 6, such that the position (INPUT2) and the force (Force Input) of the end-effector are controlled independently from the master end. This architecture has two separate loops, one for force control and the other for position control. A controller can cope with a time delay in the system only if it can predict the future. The feedback from the slave end suffers from the transmission delay, and the master controller must compensate for that. A proportional-integral-derivative (PID) controller can only predict one step ahead, i.e. Td (the derivative time constant), so it can become unstable when the time delay is longer than the time constant of the system [21]. An adequate force controller should have the capability to compensate for the time delay and reach the setpoint robustly by exploiting prediction. A model predictive controller is therefore chosen to look ahead, predict the robot's behavior in the future and control the force loop efficiently.

Model Predictive Control. Model predictive control (MPC), also known as receding horizon control (RHC), is attractive because of its unique advantages over other controllers. MPC optimizes a performance index with respect to the future control sequence, using predictions of the output signal based on a system model, with constraints on the states, inputs and outputs.


The key methodological difference between MPC and PID controllers is that the former predicts the future, which is desirable, while the latter only reacts to past behaviour. Model predictive control solves an optimization problem at each control interval to determine the manipulated variables (MVs) for the system until the next control interval. The quadratic problem comprises a cost function, constraints and decision variables. The cost function is a scalar quantity that must be minimized at each interval and measures the controller's performance. Physical bounds in the form of constraints on the MVs and the plant output can be applied to keep a check on the system's performance, and the MVs are adjusted to satisfy the applied constraints. The cost function is given by [22]:

$$J(z_k) = J_y(z_k) + J_u(z_k) + J_{\Delta u}(z_k) + J_{\varepsilon}(z_k) \quad (1)$$

where $z_k$ is the quadratic problem (QP) decision variable. Default weights, as shown in (2)–(5), are applied to each term to achieve the objectives of the system [22]:

$$J_y(z_k) = \sum_{j=1}^{n_y}\sum_{i=1}^{p}\left\{\frac{w^{y}_{i,j}}{s^{y}_{j}}\left[r_j(k+i|k) - y_j(k+i|k)\right]\right\}^2 \quad (2)$$

$$J_{\Delta u}(z_k) = \sum_{j=1}^{n_u}\sum_{i=1}^{p-1}\left\{\frac{w^{\Delta u}_{i,j}}{s^{u}_{j}}\left[u_j(k+i|k) - u_j(k+i-1|k)\right]\right\}^2 \quad (3)$$

$$J_u(z_k) = \sum_{j=1}^{n_u}\sum_{i=0}^{p-1}\left\{\frac{w^{u}_{i,j}}{s^{u}_{j}}\left[u_j(k+i|k) - u_{j,\mathrm{target}}(k+i|k)\right]\right\}^2 \quad (4)$$

$$J_{\varepsilon}(z_k) = \rho_{\varepsilon}\,\varepsilon_k^2 \quad (5)$$

where:
k – current control interval;
p – prediction horizon (number of intervals);
n_y – number of plant output variables;
z_k – QP decision, given by z_k^T = [u(k|k)^T  u(k+1|k)^T ... u(k+p−1|k)^T  ε_k];
y_j(k+i|k) – predicted value of the jth plant output at the ith step;
r_j(k+i|k) – reference value for the jth plant output at the ith step;
s_j^y – output scale factor;
w_{i,j}^y – tuning weight for the plant output;
s_j^u – input scale factor;
w_{i,j}^u – tuning weight for the plant input;
w_{i,j}^{Δu} – tuning weight for the rate of change of the input;
ε_k – slack variable for constraint violation at interval k;
ρ_ε – constraint violation penalty weight.


The state-space equations of a time-delayed system with input and output delay are represented by:

$$x(k+1) = A\,x(k) + B\,\Delta u(k - t) \quad (6)$$

$$y(k) = C\,x(k - t) \quad (7)$$

where x_i is the i-th controlled variable, r_i is the i-th reference variable, u_i is the i-th manipulated variable, t is the (total) time delay in the system, and A, B, C are the state-space matrices of the system. At each control interval, the process output is predicted p steps into the future, y(t + l) with l = 1, ..., p. The predicted output depends on the past results and on the planned m moves, which are evaluated by minimizing a quadratic cost function that incorporates the error and the actuation moves. Only u(t) is applied to the system, and the future input vector is re-evaluated. The prediction is updated at every step by comparing the current values to the predicted values through the filter shown in Fig. 7. These steps are repeated at every control interval, which is why the method is also called receding horizon control. MPC also has a "previewing" feature, in which information known in advance helps to predict the future behaviour of the system; this feature can be utilized for systems with known reference trajectories, such as surgical robots able to perform autonomously, to compensate for the time delay, because it prepares the controller a few steps ahead of time. The tracking capability of the MPC is affected by the length of the prediction horizon N: as the horizon increases, the computational load on the system increases. When the time delay in the system increases, the prediction horizon should be larger than the time delay for an accurate prediction of the future and a stable response of the internal plant.
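For intuition, the sketch below simulates a discrete plant of the form (6)–(7) with an input and output delay expressed in whole sampling steps; the matrices and the delay are illustrative values, not the surgical-arm model.

```python
import numpy as np

def simulate_delayed_plant(A, B, C, du_seq, delay_steps, x0):
    """Simulate x(k+1) = A x(k) + B du(k - t), y(k) = C x(k - t)."""
    n = len(du_seq)
    x_hist = [np.asarray(x0, dtype=float)]
    y = []
    for k in range(n):
        du_delayed = du_seq[k - delay_steps] if k >= delay_steps else 0.0
        x_hist.append(A @ x_hist[-1] + B.flatten() * du_delayed)
        x_delayed = x_hist[k - delay_steps] if k >= delay_steps else x_hist[0]
        y.append(float(C @ x_delayed))
    return np.array(y)

# Illustrative second-order plant with a 3-step (0.3 s at Ts = 0.1 s) delay.
A = np.array([[1.0, 0.1], [0.0, 0.95]])
B = np.array([[0.0], [0.1]])
C = np.array([1.0, 0.0])
du = np.ones(50) * 0.5                      # constant input increments
print(simulate_delayed_plant(A, B, C, du, delay_steps=3, x0=[0.0, 0.0])[:10])
```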

Fig. 7. Model Predictive Control [15].


Gain Scheduling-MPC. Gain scheduling is a control strategy that switches the control of the system between a predefined set of controller gains. The controllers switch in a coordinated fashion to control systems whose dynamics operate over a wide range of conditions; this strategy is used when a single controller does not provide adequate performance. Here, gain scheduling with MPC is used to control the surgical system with random delays: a series of MPCs is designed to cover the range of time delays. The efficiency of the system improves further because the inactive controllers do not compute optimal control moves, which reduces the computational load on the system.

Fig. 8. Gain scheduling of the MPC.

A smooth transition between the controllers is important: a sudden transfer of control between controllers can lead to instability. To achieve a bumpless transfer, the inactive controllers continue to perform state estimation, which prevents sudden changes in the manipulated variables when switching occurs. Figure 8 illustrates the switching conditions between the MPC controllers. The switching intervals are chosen so as to obtain a stable response from the system. A total of 5 different MPCs is selected to cover time delays ranging from 0 to 1.2 s. Figure 9 shows the switching of the control according to the conditions given in Fig. 8; an average delay of around 250 ms is observed for the switching shown in Fig. 9. The schedule of time delays for the different MPC controllers is the following (a simple selection rule implementing it is sketched below):

- for time delays between 0 and 75 ms, the MPC tuned for no delay is used;
- for time delays between 75.1 and 150 ms, the MPC tuned for a 100 ms delay is used;
- for time delays between 150.1 and 275 ms, the MPC tuned for a 200 ms delay is used;
- for time delays between 275.1 and 600 ms, the MPC tuned for a 400 ms delay is used;
- for time delays between 600 and 1200 ms, the MPC tuned for an 800 ms delay is used.
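A minimal sketch of such a scheduler, assuming the measured delay is available in milliseconds at every control interval; the controller objects themselves are stand-ins for the five differently tuned MPCs.

```python
# Delay thresholds (ms) and the nominal delay (ms) each MPC was tuned for.
SCHEDULE = [(75, 0), (150, 100), (275, 200), (600, 400), (1200, 800)]

def select_mpc(measured_delay_ms):
    """Return the index (0..4) of the MPC tuned for the measured delay."""
    for idx, (upper_ms, _tuned_for) in enumerate(SCHEDULE):
        if measured_delay_ms <= upper_ms:
            return idx
    return len(SCHEDULE) - 1  # clamp anything above 1200 ms to the last controller

# Example: switching decisions along a delay profile (values in ms).
for d in [40, 120, 260, 550, 900]:
    print(d, "->", select_mpc(d))
```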


Fig. 9. Switching in between the different MPC’s as per conditions.

Model Predictive Control Formulation. The controller output of the model predictive control can be defined as

$$u_k = \Delta \tilde{u}_k + u_{k-1} \quad (8)$$

and the general state prediction equation is represented by

$$\tilde{x} = \tilde{A}\,x(k) + \tilde{B}\,\tilde{u} \quad (9)$$

which, more specifically, takes the following form:

$$X = \begin{bmatrix} x_0 \\ x_1 \\ x_2 \\ \vdots \\ x_{N-1} \end{bmatrix} = \begin{bmatrix} I \\ A \\ A^2 \\ \vdots \\ A^{N-1} \end{bmatrix} x(t) + \begin{bmatrix} 0 & 0 & \cdots & 0 \\ B & 0 & \cdots & 0 \\ AB & B & \cdots & 0 \\ \vdots & \vdots & \ddots & \vdots \\ A^{N-2}B & A^{N-3}B & \cdots & B \end{bmatrix} \begin{bmatrix} u_0 \\ u_1 \\ u_2 \\ \vdots \\ u_{N-1} \end{bmatrix}, \qquad \tilde{A} = \begin{bmatrix} I \\ A \\ A^2 \\ \vdots \\ A^{N-1} \end{bmatrix}, \quad \tilde{B} = \begin{bmatrix} 0 & 0 & \cdots & 0 \\ B & 0 & \cdots & 0 \\ AB & B & \cdots & 0 \\ \vdots & \vdots & \ddots & \vdots \\ A^{N-2}B & A^{N-3}B & \cdots & B \end{bmatrix}, \quad U = \begin{bmatrix} u_0 \\ u_1 \\ u_2 \\ \vdots \\ u_{N-1} \end{bmatrix} \quad (10)$$

ð10Þ The control horizon is m, where m Ps − P t where Pt is the threshold for Ps − Pp to start delivering a mass flow.

2.2 Equilibrium Trajectory of the Mask

We use Eqs. (1–6) to write the state equations:





f k ⎢ −m x− m x˙ − Smm Pm + Sms Pp + H ⎥ ⎥ X˙ = ⎢ rT ⎣−ki V Pp + VrT Wp + VrT ki (Ps − Pt )⎦ m m m rT rT 2 2 Vp (−kc Pp − ks xPp ) + Vp kc Ps

(7)

in which the state vector is defined by: ⎡

and H is given by:

⎤ x ⎢ x˙ ⎥ ⎥ X=⎢ ⎣Pm ⎦ Pp

(8)

Sm (Pe − dP0 )/m − Ss Ps /m

(9)

Equilibrium state variables are written with a bar:

P¯m = Sm (Pe − dP0 ) + Ss (P¯p − Ps ) − k¯ x /Sm P¯p = Wp /ki + Ps − Pt x ¯ = kc /ks (Ps2 /P¯p − P¯p )

(10) (11) (12)

It was shown in [5] that two eigenvalues of the 4-dimension evolution matrix of the model linearized around equilibrium state variables were having a positive real part value, which was coincident with the diverging position of the comparator. This one was then colliding with its seat. The time of the first strike was the same as the time of the beginning of chattering. An example is shown in Figs. 2, 3 and 4. 2.3

Results from the Appplication of the Routh Criterion

After linearizing the state equations around the equilibrium trajectory and considering the characteristic polynomial of the obtained evolution matrix, it is possible to apply the Routh criterion to find inequalities on the physical parameters to make the mask stable during an inhalation. The first inequality was: rT ks Ss ¯ kf Pp > 0 − 2 m Vp m

(13)

The second inequality was obtained after some complementary assumptions which will not be described here: 2 rT ki Sm f 10rT kc Ps + Pe − dP0 . The only working equation is (1). Its solution is: Pm (t) = Pe − The comparator opens at time tco : tco =

wrT 2 t 2Vm

(17)

2Vm dP0 wrT

(18)

The next time that is needed is the time tac between the opening of the comparator and the one of the actuator activation. It is needed to solve (2) and (3) at the same time, which is difficult and may lead to a transcandental equation in tac . Simulations of Pm , x and Pp are shown in Figs. 5, 6 and 7.

Pm

P P e-dPe0

0 tco

tac

0.1

0.15

0.2

0.25

0.3

Time (s)

Fig. 5. Pm at the beginning of a ramp demand. At tco , Pm = Pe − dP0 .

Study of the Physiological Parameters

455

It can be observed in Fig. 8 that for the actual mask parameter settings, the dynamics of the comparator can be neglected. Then, Eq. (2) can be simplified: Sm Ss (Pe − dP0 − Pm ) + (Pp − Ps ) k k rT P˙p = (kc (Ps2 − Pp2 ) − ks xPp ) Vp x=

(19) (20)

Since the actuator is still not sending oxygen, Eq. (17) is used in (19), yielding to:

Sm wrT 2

rT Ss 2 2 ˙ Pp = ( (Pp − Ps ) t − dP0 ) + kc (Ps − Pp ) − ks Pp Vp k Vm k

(21)

We change the time origin by writting t = t + tco . The mask pressure can be rewritten: wrT  (t + tco )2 (22) Pm = Pe − 2Vm This is reinserted in (21): Sm wrT 

rT Ss 2 2 2 ˙ Pp = ( (t + tco ) − dP0 ) + (Pp − Ps ) kc (Ps − Pp ) − ks Pp (23) Vp k Vm k Which we write for simplicity: P˙p = −bPp2 + cPp + d − e(t + tco )2

x

(24)

0 0 tco

tac

0.1

0.15

0.2

0.25

0.3

Time (s)

Fig. 6. x at the beginning of a ramp demand. At tco , the comparator opens.

456

G. Battiston et al.

Pp

Ps

P s-Pt

0 tco

tac

0.1

0.15

0.2

0.25

0.3

Time (s)

Fig. 7. Pp at the beginning of a ramp demand. At tco , the pressure starts decreasing and at tac it reaches Ps − Pt which is the time at which the actuator is activated.

Equivalent forces

kx fv ma

0

0 tco

tac

0.1

0.15

0.2

0.25

0.3

Time (s)

Fig. 8. Comparison between kx, f v = f x˙ and ma = m¨ x at the beginning of a ramp demand.

Study of the Physiological Parameters

with b = k s P s Sm − rT Vp k

rT Vp (kc wrT  2Vm (t

+

k s Ss k ), c tco )2 and

=

rT ks kVp (Sm dP0

+ Ss Ps ), d =

rT 2 Vp kc Ps ,

457

e =

+ where in e the factor Pp has been approximated with Ps . This approximation is linked to the fact that in practice Ps − Ps < 0.1Ps . Solving this Riccati equation is very difficult. This difficulty can be overcome with numerical approximations. The pilot pressure Pp will be visible on Fig. 9. Pp seems to be quadratic from tco . It is approximated with: Pp = Ps − at2

(25)

This approximation is inserted in the differential equation (24). − 2at  −bPs2 + cPs + d − et2comp − 2etco t + (2bPs a − ca − e)t2

(26)

where t4 has been neglected. After a short calculus, it can be shown that the constant term of the polynom in t in the right-hand side is equal to 0. From here, in order to simplify the resolution of the equation, we can see if considering that the term in t or the term in t2 being 0 is a valid approximation. In the first case: (27) a1 = etco In the second one:

e (28) 2bPs − c These two approximations of Pp are plotted in Fig. 9: It appears the first approximation is better that the second one. From there, we seek for the time tac at which the actuator opens, that is when Pp = Ps − Pt : (29) Ps − Pt = Ps − a1 t2ac a2 =

which gives:

Pt a1

(30)

√ 2kPt Vp Vm √ ks Sm Ps rT rT wdP0

(31)

tac = 

i.e.: tac =

It appears that tac doesn’t take into account Ss and kc . In fact, it can be checked that their influence is indeed not very remarkable during the fall of Pp , see Fig. 10. Simulation also shows that multiplying or dividing Ss by 5 leaves Pp almost unchanged. In order to be able to use tac for an eventual optimisation of the mask parameters, we use a rule of thumb by rewritting it:  √ 2kPt Vp Vm √ (32) tac = α ks Sm Ps rT rT wdP0 where α is a correcting factor to make correspond the real tac and the approximated one in the case of the original parameter settings. This incidently supposes that the variations of the real tac are followed by its approximating formula.

458

G. Battiston et al.

Ps

Pp

Pp

approx. 1 approx. 2

P s-Pt

0

tac-tco0.05

0.1

0.15

0.2

0.25

0.3

Time t' (s)

Fig. 9. Comparison of Pp and its two approximations 1 and 2 with respective coefficients a1 and a2 .

Deduction of the Opening Time. The opening time can finally be written: top = tco + tac

(33)

In order to decrease tco , one can decrease Vm and dP0 . To decrease tac , one can decrease k and Pt for instance, and increase dP0 and Sm . 3.2

Closing Time

The closing time tcl is the duration between the closing of the comparator and the end of oxygen distribution. If it is too high, oxygen still flows after the end of an inhalation and the beginning of the exhalation, which is disturbing for one who breathes in the mask. If the comparator closes its seat, the working equations for the mask are: rT rT kc 2 P˙p = Wc = (Ps − Pp2 ) Vp Vp

(34)

The analytical solution of this differential equation exists. It can be found using the separation of variables: 2Ps

dPp rT kc 2Ps dPp dPp = dt = + Vp (Ps − Pp )(Ps + Pp ) Ps − Pp Ps + Pp

(35)

Study of the Physiological Parameters

459

original kc*5

Ps

Pp

kc/5

0

0.05

0.1

0.15

0.2

0.25

0.3

Time (s)

Fig. 10. Fall of Pp for different kc .

This equation is then integrated between t = 0 and t, Pp0 and Pp :

exp 2Ps rTVpkc t − Pp = Ps

exp 2Ps rTVpkc t +

Ps −Pp0 Ps +Pp0 Ps −Pp0 Ps +Pp0

(36)

for a starting pilot pressure Pp (t = 0) = Pp0 . The mask stop sending oxygen at the moment the actuator is deactivated, that is to say for Pp = Ps − Pt . We write tcl = {t, Pp (t) = Ps − Pt }, the closing time. Solving (36) for tcl yields: 2Ps − Pt Ps − Pp0 Vp tcl = ln (37) 2Ps rT kc Ps + Pp0 Pt The closing time increases with Vp and Pt and decreases with kc . However, one must not conclude that making kc very large is a good idea: if the restriction is small, the opening of the comparator may not be sufficient enough to empty the pilot volume and then send oxygen: someone inhalating could suffocate. A compromise shall be found. 3.3

Mask Pressure Value

For the mask to be easy to breathe in, the equilibrium mask pressure P¯m , needs to be as high as possible (high pressure gas flow more easily in lower pressure areas such as the lungs during inhalation). Using (10), (11) and (12), the expression

460

G. Battiston et al.

of P¯m is:

Wp Ps2 Wp kkc − Pt − −( + Ps − Pt ) ki ks Sm Wp + Ps − Pt ki ki (38) It obviously depends of the demand flow Wp . The derivative along Wp is: Ps2 Ss kkc 1 ∂ P¯m = + (39) Wp

2 + ∂Wp Sm ki ks Sm ki ki + Ps − Pt

Ss P¯m = (Pe − dP0 ) + Sm



ki

It is always positive. Nevertheless, one must remember that the convention for Wp is a flow entering the mask. Then, it is negative for an inhalation, which also means that P¯m decreases with the increase of the demand. If P¯m must not decrease too much with Wp , one can decrease as much as possible this derivative, that is to say increase ki , Sm , or decrease Ss , k or kc . 3.4

Mask Pressure Overshoot

∗ The last physiological parameter is the mask pressure overshoot Pm . In this case, the demand is not set as a ramp, but a step Wp,0 . In Fig. 11, we plot Wi and −Wp to see at which moment they cross each other, and observe this time t∗ coincides with the one at which Pm reaches its overshoot.


Fig. 11. Plot of normalized Pm and Wi in response to a step demand −Wp .

In fact, P_m reaches its overshoot value P_m^* when Ṗ_m = 0, that is to say when W_i = −W_p,0 (if the demand were a ramp rather than a step, we would not know exactly at which flow value the derivative of P_m would be zero). According to (11), we can write:

P_p = P_s − P_t + W_p,0/k_i    (40)

At this point, the mask dynamics is (stars refer to the values at t^*):

Ṗ_p^* = (rT/V_p) [k_c (P_s² − P_p^{*2}) − k_s x^* P_p^*]    (41)

m ẍ^* + f ẋ^* + k x^* = S_m (P_e − dP_0 − P_m^*) + S_s (−P_t + W_p,0/k_i)    (42)

That is:

P_m^* = P_e − dP_0 + (S_s/S_m)(−P_t + W_p,0/k_i) − (1/S_m)(m ẍ^* + f ẋ^* + k x^*)    (43)

In order to get P_m^*, the values of x^*, ẋ^* and ẍ^* are needed. It seems difficult to obtain a sufficiently satisfactory analytical approximation for these derivatives. Nevertheless, the same approximation as (19) can be made. We finally get an approximation for the overshoot P_m^*:

P_m^* = P_e − dP_0 + (S_s/S_m)(−P_t + W_p,0/k_i) − (k/(k_s S_m P_p^*)) [k_c (P_s² − P_p²) − V_p Ṗ_p^*/(rT)]    (44)

We suppose rT k_c (P_s² − P_p²) ≪ −V_p Ṗ_p^*, which comes from a simulation observation, see Fig. 12. The last remaining problem is to get the value of Ṗ_p^*. The same approximation as in (25) for P_p is made:

P_p = P_s − a t'²    (45)

where t' + t_1 = t and t_1 is the time at which the comparator opens. This time, the coefficient a is directly set with the activation time t_ac:

P_p(t' = t_ac) = P_s − P_t = P_s − a t_ac²    (46)

i.e.:

a = P_t / t_ac²    (47)

The time at which P_p = P_p^* is defined by:

P_s − a t^{*2} = P_s − P_t + W_p,0/k_i    (48)

so:

t^* = √( (1/a)(P_t − W_p,0/k_i) )    (49)



Fig. 12. Comparison between the first and second term of the last parenthesis of (44).

This gives the last wanted term:

Ṗ_p^* = −2 (P_t / t_ac²) t^*    (50)

Equation (44) can finally be fully written:

P_m^* = P_e − dP_0 + (S_s/S_m)(−P_t + W_p,0/k_i) − 2 (k V_p P_t / (k_s S_m P_p^* rT t_ac)) √(1 − W_p,0/(k_i P_t))    (51)

Obviously, the time t_ac is not the same for a step demand flow as for a ramp demand flow. Nevertheless, it is possible to get an approximation of it in the same way as in the previous case. One would have to use (24) and replace t_co and e with their new values. In order to decrease the overshoot, i.e. increase P_m^* (because P_m decreases with an inhalation and the overshoot is bottom-oriented), one must decrease dP_0, S_s, P_t, k or increase S_m, k_i, k_s. These tendencies were confirmed by multiple simulations with different parameter settings.
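The chain (45)-(51) can be evaluated directly once t_ac is known. The following sketch does so, neglecting the k_c(P_s² − P_p²) term as assumed above; the sign convention used for the step demand W_p,0 (a positive magnitude here) and all numerical values are assumptions made for illustration only.

import math

def overshoot(Wp0, Pe, dP0, Ss, Sm, Pt, ki, k, ks, Vp, rT, t_ac, Ps):
    # Wp0 is taken as the magnitude of the step demand (assumption).
    a = Pt / t_ac**2                              # Eq. (47)
    t_star = math.sqrt((Pt - Wp0 / ki) / a)       # Eq. (49)
    Pp_star = Ps - Pt + Wp0 / ki                  # Eq. (40)
    dPp_star = -2.0 * (Pt / t_ac**2) * t_star     # Eq. (50)
    # Eq. (44) with the kc*(Ps^2 - Pp^2) term neglected.
    return (Pe - dP0 + (Ss / Sm) * (-Pt + Wp0 / ki)
            + (k / (ks * Sm * Pp_star)) * (Vp * dPp_star / rT))

# Hypothetical values, for illustration only.
print(overshoot(Wp0=1e-3, Pe=1.0e5, dP0=200.0, Ss=1e-5, Sm=1e-4, Pt=5e3,
                ki=1e-6, k=1e3, ks=1e-6, Vp=1e-4, rT=8.7e4, t_ac=0.03,
                Ps=1.2e5))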

4 Improvement of the Mask Behaviour

As a summary of this work, in order to improve the mask behaviour, one needs to satisfy conditions (13) and (14) while decreasing the opening time (33), the closing time (37) and the mask pressure overshoot (51), and increasing the equilibrium mask pressure (38). As was already stated in a different way, modifying


one parameter may not be sufficient and an optimisation problem needs to be solved. This matter has not yet been tackled. Nevertheless, it is possible to specify how it is going to be dealt with. Obviously, one point will concern the choice of the cost function, which will at least take into account the physiological parameters. Another element must be taken into consideration even if it is not obvious: the cost function must also integrate the size of the region of stability around the position of the minimum. As a matter of fact, the physical values of the mask components are not known with full accuracy. If the region that is optimal for stability and physiology is narrow, we may easily end up leaving it because of inaccuracies in the experimental settings. There are also uncertainties in the model (one about S_s was explained in [5]). We may prefer a region with lower physiological performance but with more room for parameter variations. Finally, one point concerns the geometry limitations of the mask, which must be added as additional constraints to the stability conditions.

5 Conclusion

The stability analysis of the studied regulated oxygen mask performed in [5] gave stability regions which needed to be refined to take into account physiological performance. In this paper, approximate expressions for the physiological parameters of the mask were derived under several assumptions. Future work will be devoted to an optimisation procedure in order to find parameter configurations for which both performance and stability are achieved.

References

1. MacLeod G (1985) Safety valve dynamic instability: an analysis of chatter. J Pressure Vessel Technol 107(2):172–177
2. Maeda T (1970) Studies on the dynamic characteristic of a poppet valve: 1st report, theoretical analysis. Bull JSME 13(56):281–289
3. Hayashi S, Hayase T, Kurahashi T (1997) Chaos in a hydraulic control valve. J Fluids Struct 11(6):693–716
4. Licsko G, Champneys A, Hos C (2009) Nonlinear analysis of a single stage pressure relief valve. Int J Appl Math 34(4):12–26
5. Battiston G, Beauvois D, Duc G, Godoy E (2018) Stability analysis of a regulated oxygen mask. In: Proceedings of the 15th international conference on informatics in control, automation and robotics, ICINCO, vol 1. https://doi.org/10.5220/0006848203270334

Linear Discrete-Time Systems - H∞ Dynamic Output-Feedback Control with Preview

Eli Gershon1,2(B)

1 Holon Institute of Technology (HIT), Holon, Israel
[email protected]
2 Tel Aviv University, Tel Aviv, Israel

Abstract. The theory of preview tracking control is applied to the solution of the dynamic output-feedback control problem of linear discrete-time systems for both the finite-horizon and stationary cases. Based on the solution of the state-feedback control problem and a specially devised version of the bounded real lemma for systems with tracking signals, three preview patterns are explored. The latter include the simplest case where the tracking signal is measured on line and two additional cases where the tracking signal is either previewed in a fixed interval ahead or is perfectly known in advance along the finite-horizon interval given in the system. The principal method of solution for all the latter three preview patterns is that the controller plays against nature, which chooses the initial condition and the energy-bounded disturbances.

Keywords: H∞ control · Discrete-time systems · Tracking preview

1 Introduction

In the present paper we address the problem of H∞ output-feedback tracking control with preview of discrete-time linear systems, in both the finite and the infinite time settings [1]. Tracking feedback control has been a central problem in both the frequency domain and the state-space domain over the last decades. Numerous variations of this problem have, by and large, been published in both the continuous-time and the discrete-time settings (to name a few see, for example, [2]–[9]). In [2,3] H∞ tracking control has been applied to flight control systems, while in [5] it has been applied to a high-order flexible structure. In the field of process control, tracking was applied to a boiler-turbine system in [7]. In the stochastic case various tracking control patterns were solved in [10]–[12]. The problem of H∞ tracking control with preview has been solved, for the first time, in the deterministic case by [9]. In the latter work a previewed tracking signal appears in the system dynamics, allowing for three patterns of preview. The first case deals with the inclusion of the tracking signal where no preview is given [the


simplest case], while in the second case the preview signal is known for a given fixed finite time interval in the future which is smaller than the system dynamic horizon. The third pattern deals with the case where the previewed signal is known along the full finite-horizon interval of the system dynamics. The latter problems were solved in [9] by applying a game theoretical approach where a saddle-point strategy is adopted. The solution of the finite-horizon, stochastic counterpart, state-feedback tracking control with preview was obtained in [10,12], where three preview patterns of the tracking signal were considered. These include the simple case of on-line measurement of the tracking signal and two patterns where this signal is either previewed with a fixed time interval ahead or perfectly known in advance along the system horizon [12] (see also [10] for further details). Similarly to the deterministic case, for all these patterns the solutions were obtained using a game theory approach where the controller plays against nature which chooses the system energy-bounded disturbance and the initial condition. In the present paper, we extend the work of [12], which deals with the stochastic state-feedback tracking control [zeroing, of course, the stochastic terms], to the case where there is no full access to the state vector and a dynamic output-feedback strategy must be applied. An optimal output-feedback tracking strategy is derived which minimizes the standard H∞ performance index, for the three tracking patterns of the reference signal. In both the finite-horizon case and the stationary one, a min-max strategy is applied that yields an index of performance that is less than or equal to a certain cost. We address the problem via two approaches: in the finite-horizon case we apply the DLMI (Difference Linear Matrix Inequality) method [10] for the solution of the discrete-time Riccati inequality obtained, and in the stationary case we apply a special Lyapunov function which leads to an LMI (Linear Matrix Inequality) based tractable solution. In order to better clarify the method applied in the solution of the output-feedback control problem, we first bring the state-feedback control result along with the various preview patterns, similarly to the output-feedback case. This enables us to fully prove the results obtained in this paper.

Notation. Throughout the paper the superscript 'T' stands for matrix transposition, R^n denotes the n-dimensional Euclidean space, R^{n×m} is the set of all n × m real matrices, N is the set of natural numbers, and the notation P > 0 (respectively, P ≥ 0) for P ∈ R^{n×n} means that P is symmetric and positive definite (respectively, semi-definite). We denote by || · || the standard Euclidean norm, by Tr{·} the trace of a matrix and by δ_ij the Kronecker delta function. By [Q_k]_+, [Q_k]_− we denote the causal and noncausal parts, respectively, of a sequence {Q_i, i = 1, 2, ..., N}.

2 Problem Formulation

Given the following linear discrete time-varying system:

x_{k+1} = A_k x_k + B_{2,k} u_k + B_{1,k} w_k + B_{3,k} r_k
y_k = C_{2,k} x_k + D_{21,k} n_k,    (1a-b)

where x_k ∈ R^n is the state vector, y_k ∈ R^z is the measurement vector, w_k ∈ R^p is a deterministic exogenous disturbance, r_k ∈ R^r is a deterministic reference signal which can be measured on line or previewed, u_k ∈ R^l is the control input signal and x_0 is an unknown initial state, and where we denote

z_k = C_k x_k + D_{2,k} u_k + D_{3,k} r_k,   z_k ∈ R^q,  k ∈ [0, N]    (2)

and we assume, for simplicity, that:

[C_k  D_{3,k}  D_{2,k}]^T D_{2,k} = [0  0  R̃_k]^T,   R̃_k > 0.

Our objective is to find a control law {u_k} that minimizes the energy of {z_k} by using the available knowledge on the reference signal, for the worst case of the process disturbances {w_k}, {n_k} and the initial condition x_0. We therefore consider, for a given scalar γ > 0, the following performance index:

J̃_E(r_k, u_k, w_k, n_k, x_0) ≜ ||C_N x_N + D_{3,N} r_N||² + ||z_k||²_2 − γ² [||w_k||²_2 + ||n_k||²_2] − γ² x_0^T R^{-1} x_0,   R^{-1} ≥ 0.    (3)

Similarly to [12] we consider three tracking problems differing in the information pattern of {r_k}:

1) H∞-tracking with full preview of {r_k}: The tracking signal is perfectly known for the interval k ∈ [0, N].
2) H∞-tracking with no preview of {r_k}: The tracking signal measured at time k is known for i ≤ k.
3) H∞-tracking with fixed-finite preview of {r_k}: At time k, r_i is known for i ≤ min(N, k + h), where h is the preview length.

In all the above three cases we seek a control law {u_k} of the form u_k = H_y y_k + H_r r_k, where H_y is a causal operator and where the causality of H_r depends on the information pattern of the reference signal. The design objective is to minimize

max J̃_E(r_k, u_k, w_k, n_k, x_0),   ∀{w_k}, {n_k}, {u_k} ∈ l2[0, N − 1],  x_0 ∈ R^n,

where for all three tracking problems we derive a controller {u_k} which plays against its adversaries {w_k}, {n_k} and x_0.

3 State-Feedback Tracking Control

We bring first the deterministic counterpart of the state-feedback result of [12], which is based on a game theory approach and which constitutes the first step in the solution of the output-feedback control problem. We consider the system of (1a) and (2) and obtain the following result:

Theorem 1. Consider the system of (1a), (2) and J̃_E of (3) with the term of n_k excluded. Given γ > 0, the state-feedback tracking game possesses a saddle-point equilibrium solution iff there exists Q_i > 0, ∀i ∈ [0, N], that solves the following Riccati-type equation

Q_k = A_k^T M_{k+1} A_k + C_k^T C_k − A_k^T M_{k+1} B_{2,k} Φ_k^{-1} B_{2,k}^T M_{k+1} A_k,   Q(N) = C_N^T C_N,
M_{k+1} ≜ Q_{k+1} [I − γ^{-2} B_{1,k} B_{1,k}^T Q_{k+1}]^{-1},
Φ_k ≜ B_{2,k}^T M_{k+1} B_{2,k} + R̃_k,    (4a-c)

and satisfies

R_{k+1} > 0,  k ∈ [0, N − 1],   γ² R^{-1} − Q_0 > 0,   where R_{k+1} ≜ γ² I − B_{1,k}^T Q_{k+1} B_{1,k}.    (5a-c)

When a solution exists, the saddle-point strategies are given by:

x_0^* = (γ² R^{-1} − Q_0)^{-1} θ_0,
w_k^* = R_{k+1}^{-1} B_{1,k}^T [θ_{k+1} + Q_{k+1} (A_k x_k + B_{2,k} u_k + B_{3,k} r_k)],
u_k^* = −Φ_k^{-1} {B_{2,k}^T M_{k+1} [A_k x_k + B_{3,k} r_k + Q_{k+1}^{-1} θ_{k+1}^c]}    (6a-c)

where the causal part of θ_{k+1} is

θ_{k+1}^c = [θ_{k+1}]_+    (7)

and where θ_k satisfies

θ_k = Ā_k^T θ_{k+1} + B̄_k r_k,   θ_N = C_N^T D_{3,N} r_N,
Ā_k ≜ Q_{k+1}^{-1} S_{k+1}^{-1} A_k,   S_{k+1} = M_{k+1}^{-1} + B_{2,k} T_{k+1}^{-1} B_{2,k}^T,
B̄_k ≜ Ā_k^T Q_{k+1} B_{3,k} + C_k^T D_{3,k},   T_{k+1} ≜ R̃_k.    (8)

The game value is then given by:

J̃_E(r_k, u_k^*, w_k^*, x_0^*) = ||S_{k+1}^{-1/2} (Q_{k+1}^{-1} θ_{k+1} + B_{3,k} r_k)||²_2 − ||Q_{k+1}^{-1/2} θ_{k+1}||²_2 + ||D_{3,k} r_k||²_2 + ||D_{3,N} r_N||² + θ_0^T (γ² R^{-1} − Q_0)^{-1} θ_0.    (9)

Proof. [12] The proof is based on adapting the standard completing-to-squares arguments to the present case. We bring the detailed proof of the state-feedback control solution in the Appendix, since it is needed for the derivation of the BRL (Bounded Real Lemma) of the next section. We note also, at this point, that the solution of the output-feedback control problem is based on these derivations.
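Before the preview-dependent lemmas, a minimal numerical sketch of the backward recursion (4a-c) together with the feasibility check (5a,c) may help; the condition (5b) needs the weighting R and is omitted here. The system data below are random placeholders and are time-invariant only for brevity.

import numpy as np

def sf_tracking_riccati(A, B1, B2, C, Rt, gamma, N):
    # Backward recursion of (4a-c); raises if the check (5a,c) fails.
    n = A.shape[0]
    Q = [None] * (N + 1)
    Q[N] = C.T @ C
    for k in range(N - 1, -1, -1):
        R_next = gamma**2 * np.eye(B1.shape[1]) - B1.T @ Q[k + 1] @ B1   # (5c)
        if np.min(np.linalg.eigvalsh(R_next)) <= 0:
            raise ValueError("condition (5a) fails at k = %d" % k)
        M = Q[k + 1] @ np.linalg.inv(np.eye(n) - gamma**-2 * B1 @ B1.T @ Q[k + 1])
        Phi = B2.T @ M @ B2 + Rt
        Q[k] = A.T @ M @ A + C.T @ C - A.T @ M @ B2 @ np.linalg.solve(Phi, B2.T @ M @ A)
    return Q

# Illustrative data (placeholders, not from the paper).
rng = np.random.default_rng(0)
A = 0.9 * np.eye(2) + 0.05 * rng.standard_normal((2, 2))
B1 = rng.standard_normal((2, 1)); B2 = rng.standard_normal((2, 1))
C = np.array([[1.0, 0.0]]); Rt = np.array([[1.0]])
Q = sf_tracking_riccati(A, B1, B2, C, Rt, gamma=5.0, N=20)
print(Q[0])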


Remark 1. It is important to note that the signal θ_k in (8) is admitted in the above derivation because of the tracking signal which affects the dynamics of (1a) [12]. This signal accounts for the nature of the tracking pattern, where its causal part (i.e. [θ_{k+1}]_+) appears in the structure of the controller in accordance with the preview patterns.

Remark 2. Applying the result of Theorem 1 to the specific pattern of the reference signal, it is shown in [12] that the saddle-point controller strategy depends on the causal part of θ_{k+1} (i.e. [θ_{k+1}]_+), where θ_{k+1} is given in (8). The latter dependency on θ_{k+1} appears also in the structure of the control signal in the output-feedback case.

Applying the result of Theorem 1 to the various preview patterns of the reference signal, we obtain:

Lemma 1. H∞-Tracking of noncausal {r_k}: In this case θ_{k+1}^c = θ_{k+1}, thereby solving (8) we obtain:

θ_{k+1}^c = Φ̂_{k+1} θ_N + Σ_{j=1}^{N−k−1} Ψ_{k+1,j} B̄_{N−j} r_{N−j}

where

Φ̂_{k+1} ≜ Ā_{k+1}^T Ā_{k+2}^T ... Ā_{N−1}^T,
Ψ_{k+1,j} ≜ Ā_{k+1}^T Ā_{k+2}^T ... Ā_{N−j−1}^T  for j < N − k − 1,   Ψ_{k+1,j} ≜ I  for j = N − k − 1.    (10a-b)

Lemma 2. H∞-Tracking of Causal {r_k}: In this case θ_{k+1}^c = 0 since at time k, r_i is known only for i ≤ k. We obtain: u^* = −Φ_k^{-1} [B_{2,k}^T M_{k+1} (A_k x_k + B_{3,k} r_k)].

Lemma 3. H∞ Fixed-Preview Tracking: Since r_i is known at time k for i ≤ min(N, k + h), we obtain the control law of (6c) where θ_{k+1}^c satisfies:

θ_{k+1}^c = Σ_{j=1}^{h} Ψ̄_{k+1,j} B̄_{k+h+1−j} r_{k+h+1−j}   for k + h ≤ N − 1,
θ_{k+1}^c = Φ̂_{k+1} θ_N + Σ_{j=1}^{h−1} Ψ̄_{k+1,j} B̄_{N−j} r_{N−j}   for k + h = N,

where Ψ̄_{k+1,j} is obtained from (10b) by replacing N by k + h + 1.
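The three preview patterns of Lemmas 1-3 can all be generated by the same backward iteration of (8), truncated at the furthest previewed sample. The sketch below takes this reading, assuming the matrices Ā_k and B̄_k of (8) have already been formed; the function name and the toy data are hypothetical.

import numpy as np

def theta_causal(Abar, Bbar, r, thetaN, k, h, N):
    # Causal part of theta_{k+1} when r_i is known for i <= min(N, k+h):
    # run the backward recursion (8) from the furthest previewed step,
    # with the unknown tail of theta replaced by zero (cf. Lemmas 1-3).
    last = min(N, k + h)
    if last == N:
        theta, start = thetaN.copy(), N - 1
    else:
        theta, start = np.zeros_like(thetaN), last
    for i in range(start, k, -1):       # i = start, ..., k+1
        theta = Abar[i].T @ theta + Bbar[i] @ r[i]
    return theta                        # approximates [theta_{k+1}]_+

# Toy data (placeholders): n = 2 states, scalar reference, horizon N = 6.
rng = np.random.default_rng(1)
N, n, h = 6, 2, 2
Abar = [0.8 * np.eye(n) for _ in range(N)]
Bbar = [rng.standard_normal((n, 1)) for _ in range(N)]
r = [rng.standard_normal((1,)) for _ in range(N + 1)]
thetaN = Bbar[0] @ r[N]            # stand-in for C_N^T D_{3,N} r_N
print(theta_causal(Abar, Bbar, r, thetaN, k=1, h=h, N=N))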

3.1 Stationary State-Feedback Tracking Control

We treat the case where the matrices of the system in (1a) and (2) are all time-invariant and N tends to infinity. Since (4a,b) and the condition of (5a,b) are identical to those encountered in the corresponding state-feedback regulator problem (where r_k ≡ 0), the solution Q_k of (4a,b) and (5a,b), if it exists, will tend to the stabilizing solution of the following equation:

−Q̄ + A^T M̄ A − A^T M̄ B_2 Φ̄^{-1} B_2^T M̄ A + C^T C = 0    (11)


where

M̄ ≜ Q̄ [I − γ^{-2} B_1 B_1^T Q̄]^{-1},   Φ̄ ≜ B_2^T M̄ B_2 + R̃    (12a,b)

and where

γ² I_p − B_1^T Q̄ B_1 > 0.    (13)

The latter will guarantee that J_E(r_k, u_k^*, w_k, x_0) ≤ Ĵ(r), where

Ĵ(r) = ||[Q̄^{-1} + B_2 T̄^{-1} B_2^T − γ^{-2} B_1 B_1^T]^{-1/2} (Q̄^{-1} θ_{k+1} + B_3 r_k)||²_2 − ||Q̄^{-1/2} θ_{k+1}||²_2 + ||D_3 r_k||²_2 + ||D_3 r_N||² + θ_0^T (γ² R^{-1} − Q̄)^{-1} θ_0.

In the latter, θ_k is given by the time-invariant version of (8), θ_k^c is defined in (7) and

T̄ ≜ R̃.    (14)

A strict inequality is achieved, for ({w_k}, x_0) not identically zero, iff the left side of (11) is negative definite. The latter inequality can be expressed in an LMI (Linear Matrix Inequality) form as follows.

Theorem 2. Consider the system of (1a) and (2) with constant matrices. Then, for a given γ > 0,

J_E(r_k, u_k^*, w_k, x_0) < Ĵ(r)

for all nonzero {w_k} ∈ l2[0, ∞) and x_0 ∈ R^n iff there exists a matrix P = P^T ∈ R^{n×n} that satisfies the following positive definite LMIs:

[ P      P A^T    P C^T   0      ]
[ A P    Γ(2,2)   0       B_1    ]  > 0,    (15)
[ C P    0        I_q     0      ]
[ 0      B_1^T    0       γ² I_p ]

where Γ(2,2) = P + B_2 R̃^{-1} B_2^T, and

[ γ² I_p   B_1^T ]
[ B_1      P     ]  > 0.    (16)

If such a solution exists then the optimal control strategy is given by the following u_k^*:

u_k^* = −Φ̄^{-1} B_2^T M̂ (A x_k + B_3 r_k + P θ_{k+1}^c),    (17)

where M̂ = (P − γ^{-2} B_1 B_1^T)^{-1}, θ_{k+1}^c denotes the causal part of θ_{k+1} (see (7)), and θ_k satisfies

θ_k = A^T [P − γ^{-2} B_1 B_1^T + B_2 R̃^{-1} B_2^T]^{-1} (P θ_{k+1} + B_3 r_k).
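Before turning to the proof, one possible way to check the feasibility of the LMIs (15) and (16) numerically is sketched below with cvxpy; the system matrices and the small positive-definiteness margins are placeholders, not data from the paper.

import cvxpy as cp
import numpy as np

# Illustrative constant data (placeholders, not from the paper).
n, q, p = 2, 1, 1
A = np.array([[0.8, 0.1], [0.0, 0.7]])
B1 = np.array([[0.1], [0.2]]); B2 = np.array([[0.0], [1.0]])
C = np.array([[1.0, 0.0]]); Rt_inv = np.array([[1.0]])   # R~^{-1}
gamma = 3.0

P = cp.Variable((n, n), symmetric=True)
Gamma22 = P + B2 @ Rt_inv @ B2.T
lmi15 = cp.bmat([
    [P,                P @ A.T,          P @ C.T,          np.zeros((n, p))],
    [A @ P,            Gamma22,          np.zeros((n, q)), B1],
    [C @ P,            np.zeros((q, n)), np.eye(q),        np.zeros((q, p))],
    [np.zeros((p, n)), B1.T,             np.zeros((p, q)), gamma**2 * np.eye(p)]])
lmi16 = cp.bmat([[gamma**2 * np.eye(p), B1.T],
                 [B1,                   P]])
# Strict inequalities are emulated with a small margin.
prob = cp.Problem(cp.Minimize(0),
                  [lmi15 >> 1e-6 * np.eye(2 * n + q + p),
                   lmi16 >> 1e-6 * np.eye(n + p),
                   P >> 1e-6 * np.eye(n)])
prob.solve()
print(prob.status)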


Proof. The inequality that is obtained from (11) is

−Q̄ + A^T M̄ [I − B_2 Φ̄^{-1} B_2^T M̄] A + C^T C < 0

with the condition of (13). Since

M̄ − M̄ B_2 Φ̄^{-1} B_2^T M̄ = (M̄^{-1} + B_2 T̄^{-1} B_2^T)^{-1},

we obtain, using the definition of M̄ in (12a), the condition:

−Q̄ + A^T [Q̄^{-1} − γ^{-2} B_1 B_1^T + B_2 T̄^{-1} B_2^T]^{-1} A + C^T C < 0    (18)

Applying the inversion lemma on T̄ of (14) we obtain T̄^{-1} = R̃^{-1}. Substituting the latter in (18) and defining P = Q̄^{-1} we obtain:

P^{-1} − A^T [P − γ^{-2} B_1 B_1^T + B_2 R̃^{-1} B_2^T]^{-1} A − C^T C > 0

Using Schur complements the latter inequality is equivalent to

[ P^{-1}   A^T      C^T    0      ]
[ A        Γ(2,2)   0      B_1    ]  > 0    (19)
[ C        0        I_q    0      ]
[ 0        B_1^T    0      γ² I_p ]

The LMI of (15) is then obtained by multiplying (19), from both sides, by diag{P, I_n, I_q, I_p}. Using Schur complements the condition of (13) is readily given by (16).

The optimal strategy for the control input u_k is obtained from the results of Theorem 2 according to the specific information we have on {r_k}. Assuming that the conditions of the latter theorem are satisfied, we obtain the following.

Lemma 4. H∞-Tracking of Causal {r_k}: Since in this case θ_{k+1}^c = 0 we obtain:

u^* = −Φ̄^{-1} B_2^T M̂ (A x_k + B_3 r_k).

Lemma 5. H∞ Fixed-Preview Tracking: Since r_i is known at time k for i ≤ k + h, we obtain the control law of (17) where θ_{k+1}^c is given by: θ_{k+1}^c = Σ_{j=1}^{h} (Ā^T)^{h−j} M̄ r_{k+h+1−j}, where M̄ ≜ Ā^T P^{-1} B_3 + C^T D_3. Using the latter definition it is easily seen that θ_{k+1}^c = Ā^T θ_k^c − Ā^T M̄ r_k + M̄ r_{k+1}.

4 Dynamic Output-Feedback Tracking Control

The solution of the dynamic output-feedback control problem involves two steps, the latter of which is a filtering problem of order n. A second Riccati equation is thus obtained by applying the BRL to the dynamic equation of the estimation error. The latter imposes an augmentation of the system to order 2n. This augmented system also contains a tracking signal component and therefore one needs to apply a special BRL for systems with a tracking signal. We thus bring this BRL first.

4.1 BRL for Systems with Tracking Signal

We consider the following system:

x_{k+1} = A_k x_k + B_{1,k} w_k + B_{3,k} r_k
z_k = C_k x_k + D_{3,k} r_k,   z_k ∈ R^q,  k ∈ [0, N]    (20a,b)

which is obtained from (1a) and (2) by setting B_{2,k} ≡ 0 and D_{2,k} ≡ 0. We consider the following index of performance:

J_B(r_k, w_k, x_0) ≜ ||C_N x_N + D_{3,N} r_N||² + ||z_k||²_2 − γ² ||w_k||²_2 − γ² x_0^T R^{-1} x_0,   R^{-1} ≥ 0.    (21)

We arrive at the following theorem:

Theorem 3. Consider the system of (20a,b) and J_B of (21). Given γ > 0, J_B of (21) satisfies J_B ≤ J̃(r, ε), ∀{w_k} ∈ l2[0, N − 1], x_0 ∈ R^n, where

J̃(r, ε) = Σ_{k=0}^{N−1} θ̃_{k+1}^T {B_{1,k} R_{k+1}^{-1} B_{1,k}^T} θ̃_{k+1} + Σ_{k=0}^{N−1} r_k^T (D_{3,k}^T D_{3,k}) r_k + ||D_{3,N} r_N||² + 2 Σ_{k=0}^{N−1} θ̃_{k+1}^T Q̃_{k+1}^{-1} M̄_{k+1} B_{3,k} r_k + θ̃_0^T ε^{-1} θ̃_0,

if there exists Q̃_k that solves the following Riccati-type equation

Q̃_k = A_k^T M̄_{k+1} A_k + C_k^T C_k,   γ² I − B_{1,k}^T Q̃_{k+1} B_{1,k} > 0,   Q̃(0) = γ² R^{-1} − ε I,    (22a-c)

for some ε > 0, where

θ̃_k = Â_k^T θ̃_{k+1} + B̂_k r_k,   θ̃_N = C_N^T D_{3,N} r_N,
Â_k ≜ Q̃_{k+1}^{-1} M̄_{k+1} A_k,   B̂_k = Â_k^T Q̃_{k+1} B_{3,k} + C_k^T D_{3,k},
M̄_{k+1} = Q̃_{k+1} [I − γ^{-2} B_{1,k} B_{1,k}^T Q̃_{k+1}]^{-1}.    (23a-e)

 R Δ −2 R ˆ −1 = where P0 = Q γ with a forward iteration, starting from the 0 R R + In ˜1,k , C˜k are above initial condition of P0 , where R is defined in (3), where A˜k , B −1 ˆ ˆ defined in (32) and where Qk = Pk is given in (33). Proof. Using the expression that is achieved in the Appendix for J˜E (rk , uk , wk , x0 ) in the state-feedback case, the index of performance is now given by: J˜E (rk , uk , wk , nk , x0 ) = J˜E (rk , uk , wk , x0 ) − γ 2 ||nk ||22 = N −1 −γ 2 ||w ¯k ||22 − γ 2 ||x0 − x∗0 ||2R−1 −γ −1 Q0 + k=0 [{||¯ uk + Cˆ1,k xk ||2 }]+ 2 2 ¯ −γ ||nk ||2 + J(r).

(25)

We note that in the full preview case [θk+1 ]+ = θk+1 . Using the following definitions: −1/2

T u ¯k = Φk+1 uk + Φk+1 B2,k Mk+1 (B3,k rk + Q−1 k+1 θk+1 ) −1 1/2 −1 −1/2 T w ¯k = γ Rk+1 wk − γ Rk+1 B1,k [Qk+1 (Ak xk + B2,k uk +B3,k rk ) + θk+1 ], 1/2

we note that w ¯k = γ −1 Rk+1 (wk − wk∗ ), 1/2

u ¯k = Φk+1 (uk − u∗k ), 1/2

(26)

Linear Discrete-Time Systems - H∞ Dynamic Output-Feedback Control

473

where wk∗ , u∗k are defined in (6b,c) respectively. Note also that the terms that are not accessed by the controller (i.e the terms with xk ), are exclude from u∗k . ¯k we seek a controller of the form Considering the above w ¯k and u u ¯k = −Cˆ1,k x ˆk . We, therefore, re-formulate the state equation of (1a) adding the additional terms to recover the original equation of (1a). Considering the above, we obtain the following new state equation: ¯1,k w ¯2,k u ¯3,k rk + B ¯4,k θk+1 , xk+1 = Aˆk xk + B ¯k + B ¯k + B

(27)

where ¯1,k = γB1,k R−1/2 , Aˆk = Q−1 B k+1 k+1 Mk+1 Ak , ¯2,k = Q−1 Mk+1 B2,k Φ−1/2 , B k+1 k+1 ¯3,k = B3,k + B1,k R−1 B T Qk+1 B3,k − B ¯2,k Φ−1/2 B T Mk+1 B3,k , B 2,k k+1 k+1 1,k −1/2 −1 ¯4,k = B ¯T − B ¯1,k R ¯2,k Φ B B B T Mk+1 Q−1 . k+1

k+1

1,k

2,k

(28a-e)

k+1

Replacing for w ¯k and u ¯k we obtain: −1 ∗ ¯ ¯2,k (Φ Rk+1 (wk − wk∗ )) + B xk+1 = (Q−1 k+1 (uk − uk )) k+1 Mk+1 Ak )xk + B1,k (γ ¯ ¯ +B3,k rk + B4,k θk+1 . (29) We consider the following Luenberger-type state observer: 1/2

1/2

x ˆk+1 = Aˆk x ˆk + Ko,k (yk − C2,k x ˆ k ) + dk , ˆk , zˆk = Cˆ1,k x

x ˆ0 = 0,

(30)

where ¯2,k u ¯3,k rk + B ¯4,k θk+1 . ¯k + B dk = B ˆk and using the latter we obtain: Denoting ek = xk − x ˆ1,k w ek+1 = (Aˆk − Ko,k C2,k )ek + B ˆk , where we define Δ

w ˆk = [w ¯kT Defining also ξk = [xTk

nTk ]T ,

ˆ1,k = [B ¯1,k − Ko,k D21,k ]. B Δ

eTk ]T , r¯k = [rkT

T θk+1 ]T , we obtain

˜1,k wˆk + B ˆ3,k r¯k , ξk+1 = A˜k ξk + B ˜ z˜k = C1,k ξk , where

  ¯ ¯ Cˆ 0 A˜ B ˜1,k = B1,k A˜k = 11,k 2,k˜ 1,k , B ¯1,k −Ko,k D21,k , B 0 A22,k

(31a-b)

 ¯ ¯ ˆ3,k = B3,k B4,k , B 0 0

474

E. Gershon

˜ = Aˆk − B ¯2,k Cˆ1,k , A22,k ˜ = Aˆk − Ko,k C2,k A11,k −1/2 T ˆ C1,k = Φk+1 B2,k Mk+1 Ak , C˜1,k = [0 Cˆ1,k ].

(32a-g)

Applying the results of Theorem 4 to the system of (31) we obtain the following Riccati-type equation: ˆ k=A˜T [Q ˜1,k B ˜ T ]−1 A˜k + C˜ T C˜k , ˆ −1 − γ −2 B Q k 1,k k k+1

(33)

ˆ −1 and using Schur complement we ˆ 0 is given in (24b). Denoting Pˆk = Q where Q k obtain the DLMI of (24a). The latter DLMI is initiated with the initial condition of (24b) which corresponds to the case where a weighting γ 2 −1 In is applied to ˆ0 = 0 in the corresponding differential x ˆ0 in order to force nature to select x game (see [10], Chapter 9, for details). In the case where {rk } is measured on line, or with preview h > 0, we note that nature strategy which is not restricted to causality constraints, will be the ¯k of (26) is unchanged. same as in the case of full preview of {rk }, meaning that w We obtain the following: Lemma 6. H∞ Output-feedback Tracking with full preview of {rk }: We obtain −1/2

T u ¯k = Φk+1 uk + Φk+1 B2,k Mk+1 (B3,k rk + Q−1 k+1 [θk+1 ]+ ) 1/2

where we note that in this case [θk+1 ]+ = θk+1 . Solving (8) we obtain: N −k−1 ¯N −j rN −j [θk+1 ]+ = Φˆk+1 θN + j=1 Ψk+1,j B where Δ Φˆk+1 = A¯Tk+1 A¯Tk+2 ...A¯TN −1  T  A¯k+1 A¯Tk+2 ...A¯TN −j−1 j < N − k − 1 Δ Ψk+1,j = I j =N −k−1

(34a-b)

Proof. Considering (8) and taking k + 1 = N we obtain: ¯N −1 rN −1 , θN −1 = A¯TN −1 θN + B where θN is given in (8). Similarly we obtain for N − 2 ¯N −2 rN −2 θN −2 = A¯TN −2 θN −1 + B T T ¯ ¯ ¯ ¯N −2 rN −2 = AN −2 [AN −1 θN + BN −1 rN −1 ] + B T T T ¯ ¯N −1 rN −1 . ¯ ¯ ¯ = BN −2 rN −2 + AN −2 AN −1 θN + AN −2 B The above procedure is thus easily iterated to yield (34a,b). Taking, for example N = 3 one obtains from (8) the following equation for θ1 : ¯2 r2 + B ¯1 r1 . θ1 = A¯T1 A¯T2 θ3 + A¯T1 B The same result is recovered by taking k = 0 in (34a,b) where j = 1, 2.

Linear Discrete-Time Systems - H∞ Dynamic Output-Feedback Control

475

Lemma 7. H∞ Output-feedback Tracking with no preview of {rk }: In this case [θk+1 ]+ = 0 since at time k, ri is known only for i ≤ k. We obtain: −1/2

T Mk+1 B3,k rk . u ¯k = Φk+1 uk + Φk+1 B2,k 1/2

Lemma 8. H∞ Output-feedback tracking with fixed-finite preview of {rk }: In this case we obtain: −1/2

T Mk+1 (B3,k rk + Q−1 u ¯k = Φk+1 uk + Φk+1 B2,k k+1 [θk+1 ]+ ) 1/2

and ¯2,k u ¯3,k rk + B ¯4,k [θk+1 ]+ . ¯+B dk = B where [θk+1 ]+ satisfies: ⎧ ⎫ ¯k+h+1−j rk+h+1−j k + h ≤ N − 1 ⎬ ⎨ hj=1 Ψ¯k+1,j B [θk+1 ]+ = . h−1 ¯ ⎩ Φˆ ¯N −j rN −j k + h = N ⎭ Ψ B k+1 θN + k+1,j j=1 where Ψ¯k+1,j is obtained from (34b) by replacing N by k + h + 1.

5

Stationary Output-Feedback Tracking Control

We treat the case where the matrices of the system in (1a-b) and (2) are all time-invariant, N tends to infinity and the system of (1a-b) is exponential l2 ˜ k of (22a,b), if it exists, will tend to the mean stable. In this case, the solution Q square stabilizing solution of the following equation: ¯ A + C T C, ˜ = AT M Q ˜ Q(0) = γ 2 R−1 − I.

˜ 1 > 0, γ 2 I − B1T QB

We Introduce the following Lyapunov function:

 ˆ Q αQ T ˜ ˜ Vk = ξk Qξk , with Q = ˆ Q ˆ , αQ

(35a,c)

(36)

ˆ are n × n matrices and α ia a where ξk is the state vector of (31), Q and Q scalar. Considering (35) and the above result, we obtain the following theorem: Theorem 5. Consider the system of (1a-b), (2) where the matrices A, B1 , B2 , B3 , C2 , D21 , C, D2 and D3 are all constant and T → ∞. Given γ > 0, there exists ˆ= a controller that minimizes maxJ˜E of (3) if there exist Q = QT ∈ Rn×n , Q T n×n n×q ˆ ˜ Q ∈ R , Y ∈ R and a tuning scalar parameter α that satisfy Υ > 0 where Υ˜ is the following LMI: ⎡ ⎤ ˆ Υ (1, 3) Υ (1, 4) Q αQ 0 0 0 ˆ Υ (2, 3) Υ˜ (2, 4) ⎢∗ Q 0 0 Cˆ1T ⎥ ⎢ ⎥ ⎢∗ ∗ ˆ Q αQ Υ (3, 5) Υ (3, 6) 0 ⎥ ⎢ ⎥ ⎢∗ ∗ ˆ Υ (4, 5) (37) ∗ Q 0 0 ⎥ ⎢ ⎥ ⎢∗ ∗ ∗ ∗ I 0 0 ⎥ ⎢ ⎥ ⎣∗ ∗ ∗ ∗ ∗ I 0 ⎦ ∗ ∗ ∗ ∗ ∗ ∗ I

476

E. Gershon

where Δ ˆ Υ (1, 3) = AˆT Q − Cˆ T B ¯ T Q, Y = KoT Q, 1 2 T T T ¯ )αQ, ˆ Υ (1, 4) = (Aˆ − Cˆ1 B 2 ¯ T Q + αAˆT Q ˆ − αC T Y, Υ (2, 3) = Cˆ1T B 2 2 T ¯T T ˆ ˆ ˆ Υ (2, 4) = (αC1 B2 + A )Q − C2T Y, ˆ B ¯1 , Υ (3, 6) = −γ −1 αY T D21 , Υ (3, 5) = γ −1 (Q + αQ) −1 ˆ ¯ Υ (4, 5) = γ QB1 (1 + α).

Proof. The proof outline for the above stationary case resembles the one of the finite-horizon case. Considering the stationary version of (1a-b), (2) the stationary state-feedback control problem is solved to obtain the optimal stationary ∗ and u∗s,k [12]. Thus we obtain: strategies of both ws,k −1 ∗ ws,k = Rk+1 B1T [θk+1 +P (Axk +B2 uk +B3 rk )], −1 ∗ ˆ [Axk + B3 rk + P θc ]}, us,k = −Φk {B2T M k+1 Δ ˆ = P −1 [I − γ −2 B1 B T P −1 ]−1 , M 1

where P −1 is the stationary version of the solution Qk of (1). Using the above optimal strategies we transform the problem to an estimation one, thus arriving to the stationary counterpart of the augmented system of (31). Applying the result of (35) to the latter system the algebraic counterpart of (33) is obtained which, similarly to the finite-horizon horizon case, becomes the following stationary version of (24): ⎡ ˜ −1 ˜T P A 0 ˜ γ −1 B ˜1 ⎢ A˜ P ⎢ ⎣ 0 γ −1 B ˜T I 1 C˜1 0 0

⎤ C˜1T 0 ⎥ ⎥ ≥ 0. 0 ⎦ I

(38)

Multiplying the above LMI by diag{In , P˜ −1 , I2p , Il } from the left and the ˜ of (36), the result ˜ = P˜ −1 and using the matrix partition of Q right, denoting Q of (37) is obtained.

6

Conclusions

The theory of H∞ preview tracking control was formulated and solved for the case where there is no access to the system states and a dynamic output-feedback controller is sought for. Three preview patterns were treated, beginning with the simplest case where the signal to be tracked is measured on line and additional two preview patterns where the tracking signal is either known in a finite fixed interval ahead or it is known along the full dynamic horizon of the system. The solution of the latter preview patterns was obtained in both: the finite horizon and the infinite-horizon (stationary) cases. In the finite horizon case, the solution

Linear Discrete-Time Systems - H∞ Dynamic Output-Feedback Control

477

is obtained via the use of DLMIs which where shown to be numerically stable and tractable for various dynamical systems. In the stationary case, the solution is obtained via the use of LMIs which have become a standard tool in control theory by large over the last two decades. Contrary to the state-feedback control solution which is based on the standard H∞ BRL, the solution of the output-feedback tracking problem is solved by applying a special form of the BRL to a filtering problem which is formulated once the state-feedback solution is obtained. In all the treated cases in this paper, an optimal output-feedback tracking control strategy is derived which minimizes the standard H∞ performance index, for the three tracking patterns of the reference signal. In both, the finite-horizon horizon case and the stationary one, a min-max strategy is applied that yields an index of performance that is less than or equal to a certain cost.

Appendix Proof of Theorem 1 The proof follows the standard line of applying a Lyapunov-type quadratic function in order to comply with the index of performance. This is usually done by using two successive completing to squares operations however, since the reference signal of rk is introduced in the dynamics of (1a), we apply a third completing to squares operation with the aid of the fictitious signal of θk+1 . This latter signal finally affects the controller design through it’s causal part [θk+1 ]+ . Defining Jk = ||Ck xk + D3,k rk ||2 + ||D2,k uk ||2 − γ 2 ||wk ||2 , we consider (3) with the term of nk excluded and obtain: JE (r, u, w, x0 ) = −γ 2 xT0 R−1 x0 + CN xN + D3,N rN 2 +

N −1 

Jk .

k=0

Denoting φk = xTk+1 Qk+1 xk+1 − xTk Qk xk , and substituting (1a) in the latter, we find that T φk = [Ak xk + B2,k uk + B3,k rk ]2Qk+1 + 2[xTk ATk + uTk B2,k + rkT B3,k ]Qk+1 B1,k wk T +wkT B1,k Qk+1 B1,k wk − xk 2Qk +

Ck xk + D3,k rk 2 − γ 2 wkT wk + γ 2 wkT wk + uk 2R˜ − zkT zk k

Rearranging the latter expression we obtain: T φk = −wkT [γ 2 I − B1,k Qk+1 B1,k ]wk + 2[xTk ATk Qk+1 B1,k T T ˜ k + B T Qk+1 B2,k ]uk + +(uTk B2,k + rkT B3,k )Qk+1 B1,k ]wk + uTk [R 2,k

2{xTk [ATk Qk+1 B2,k ] + rkT B3,k Qk+1 B2,k }uk + xTk [ATk Qk+1 Ak + CkT Ck − Qk ]xk − Jk

478

E. Gershon T T T T +rkT (D3,k D3,k + B3,k Qk+1 B3,k )rk + 2rkT (B3,k Qk+1 Ak + D3,k Ck )xk

Completing to squares for both wk and uk we obtain: φk = −(wk − w ˜k )T Rk+1 (wk − w ˜k ) + (uk − u ˜k )T Φk (uk − u ˜k ) − Jk + T (B2,k Mk+1 Ak +CkT Ck −Qk]xk 2Φ−1 k

−1 −1 T T T −1 +rkT [D3,k D3,k +B3,k (Mk+1 +B2,k Tk+1 B2,k ) B3,k ]rk −1 −1 T T −1 T +2rkT [B3,k (Mk+1 +B2,k Tk+1 B2,k ) Ak +D3,k Ck ]xk

where u ˜k and w ˜k are recovered from u∗k and wk∗ of (6b,c), respectively, by taking c θk+1 = θk+1 = 0. Note that following the above two completion to square operations, the mixed terms of rk and xk still remain. In order to get rid of these latter terms of rk and xk we first define the following: Δ

w ˆk = wk − w ˜k ,

Δ

u ˆ k = uk − u ˜k .

c Considering the above definition of w ˆk and w ˜k of (6b) (by taking θk+1 =0 for the latter), we readily re formulate the state equation of (1a) in the following way: ˆk + B1,k w ˜k + B3,k rk xk+1 = Ak xk + B2,k uk + B1,k w

= Ak xk + B1,k w ˆk + B2,k uk + B3,k rk + −1 T B1,k Rk+1 B1,k Qk+1 (Ak xk + B2,k uk + B3,k rk ).

Considering the latter and further applying simple manipulations and using c uk = u ˜k + u ˆk , where u ˜k is given in (6c) taking θk+1 = 0, we obtain the following: −1 T xk+1 = [Q−1 k+1 Mk+1 (I − B2,k Φk B2,k Mk+1 )Ak ]xk + −1 T −1 ˆk + B1,k w ˆk . +Q−1 k+1 Mk+1 (I − B2,k Φk B2,k Mk+1 )rk + Qk+1 Mk+1 B2,k u

Considering the above derivations the new state equation of (1a) in terms of u ˆk and w ˆk becomes xk+1 = A¯k xk + Q−1 ˆk + B1,k w ˆk + k+1 Mk+1 B2,k u −1 T Q−1 k+1 Mk+1 (I − B2,k Φk B2,k Mk+1 )B3,k rk ,

where A¯k is given in (8b). The next step which is done in order to get rid of the T xk+1 − θkT xk ) to φk . mixed terms of rk and xk in φk is the addition of 2(θk+1 In terms of the above new state equation we obtain: T T θk+1 xk+1 = θk+1 [A¯k xk +Q−1 ˆk +B1,k w ˆk + k+1 Mk+1 B2,k u −1 T Q−1 k+1 Mk+1 (I −B2,k Φk B2,k Mk+1 )B3,k rk ].

Linear Discrete-Time Systems - H∞ Dynamic Output-Feedback Control

479

T Defining: φ∗k = φk + 2(θk+1 xk+1 − θkT xk ), we thus add the following quantity to φk T 2(θk+1 xk+1 − θkT xk ) = T T A¯k − θkT )xk + 2θk+1 2(θk+1 [Q−1 ˆk + B1,k w ˆk + k+1 Mk+1 B2,k u −1 T Q−1 k+1 Mk+1 (I − B2,k Φk B2,k Mk+1 )B3,k rk ].

We thus obtain, after completing the squares for θk , the following expression −1 −1 T T 2 ˆk − Rk+1 B1,k θk+1 ||2Rk+1 + ||ˆ uk + Φ−1 φ∗k = −||w k B2,k Mk+1 Qk+1 θk+1 ||Φk − Jk T T 2 +||B1,k θk+1 ||2R−1 − ||B2,k Mk+1 Q−1 k+1 θk+1 ||Φ−1 k+1

k

−1 −1 T T T −1 +rkT [D3,k D3,k +B3,k (Mk+1 +B2,k Tk+1 B2,k ) B3,k ]rk + T xTk [ATk Mk+1 Ak − B2,k Mk+1 Ak2Φ−1 +CkT Ck −Qk]xk k

−1 T +2θk+1 Q−1 k+1 (Mk+1

+

−1 T −1 B2,k Tk+1 B2,k ) B3,k rk .

Using (1) the performance index becomes: JE (rk , uk , wk , x0 ) = N −1  k=0

+

−1 T ¯ −||w ˆk − Rk+1 B1,k θk+1 ||2Rk+1 + J(r)

N −1  k=0

−1 T 2 {||ˆ uk + Φ−1 k B2,k Mk+1 Qk+1 θk+1 ||Φk }

−γ 2 ||x0 − (γ 2 R−1 − Q0 )−1 θ0 ||2R−1 −γ −2 Q0 .

(39)

where Δ ¯ = J(r)

N −1  k=0

+

−1 −1 T −1 T T θk+1 {B1,k Rk+1 B1,k − Q−1 k+1 Mk+1 B2,k Φk B2,k Mk+1 Qk+1 }θk+1

N −1  k=0

+2

−1 −1 T T T −1 rkT (D3,k D3,k + B3,k (Mk+1 +B2,k Tk+1 B2,k ) B3,k )rk N −1  k=0

−1 −1 T T −1 θk+1 Q−1 B3,k rk + k+1 (Mk+1 + B2,k Tk+1 B2,k )

θ0T (γ 2 R−1 − Q0 )−1 θ0 + ||D3,N rN ||2 . The result of (9) readily follows from (39) and a saddle-point strategy is obtained by taking: −1 T w ˆ ∗ = Rk+1 B1,k θk+1 , −1 T ∗ c u ˆ = −Φk B2,k Mk+1 Q−1 k+1 θk+1 , ∗ 2 −1 −1 x0 = (γ R − Q0 ) θ0 .

(40)

480

E. Gershon

Proof of Theorem 3 The proof of Theorem 3 is based on the above proof of Theorem 1 which concerns the state-feedback control problem (for a detailed proof see also [12], [10]). We note that the above proof of Theorem 1 follows the standard line of applying a Lyapunov-type quadratic function in order to comply with the index of performance. This is usually done by using two successive completing to squares operations however, since the reference signal of rk is introduced in the dynamics of (1a), we apply a third completing to squares operation with the aid of the fictitious signal of θk+1 . This latter signal finally affects the controller design through it’s causal part [θk+1 ]+ (for a detailed proof see [12], [10]). Now, the sufficient part of the proof of Theorem 3 stems from the above proof of Theorem 1 where B2,k = 0, D2,k = 0. Analogously to the proof of Theorem 1, we obtain the following: JB (rk , uk , wk , x0 ) = −

N −1  k=0

−1 T ˜ ||w ˆk − Rk+1 B1,k θk+1 ||2Rk+1

˜ 0 )−1 θ˜0 ||2 −1 −2 ˜ −γ 2 ||x0 − (γ 2 R−1 − Q R −γ Q0 . ˜ k , respectively and where where we replace θk+1 , Qk by θ˜k+1 and Q ˜ 0 ]−1 , x0 = γ −2 P˜0 θ˜0 = [γ 2 R−1 − Q ˜ 0 ]−1 θ˜0 . P˜0 = [R−1 − γ −2 Q ˜ ) = 0 (noting The necessity follows from the fact that for rk ≡ 0, one gets J(r, ˜ ) of that in this case θ˜k ≡ 0 in (23) and therefore the last 3 terms in J(r, ˜ > 0 that solves Theorem 3 are set to zero) and JB < 0. Thus the existence of Q (22a,c) is the necessary condition in the BRL.

References

1. Gershon E (2018) H∞ measurement-feedback tracking with preview. ICINCO, Porto, Portugal
2. Yang GH, Zhang XN (2008) Reliable H∞ flight tracking control via state feedback. In: American control conference, Westin Seattle hotel, pp 1800–1805
3. Luo W, Chu YC, Ling KV (2005) H∞ inverse optimal attitude-tracking control of rigid spacecraft. J Guidance Control Dyn 28(3):481–494
4. Kim BS, Tao TC (2002) An integrated feedforward robust repetitive control design for tracking near periodic time varying signals. Japan-USA Symposium on Flexible Automation, Japan
5. You WQ, Chen HH, He XD (2010) Tracking control research of high-order flexible structures on the H-infinity control method. In: 2nd international conference on advanced computer control (ICACC)
6. Wen G, Ugrinovskii V (2013) Distributed H-infinity tracking control for discrete-time multi-agent systems with a high-dimensional leader. In: Proceedings of the 52nd IEEE conference on decision and control, Florence, Italy
7. Wu J, Nguang SK, Shen J, Liu G, Li YG (2010) Robust H∞ tracking control of boiler-turbine systems. ISA Trans 49(3):369–375


8. Gao H, Chen T (2008) Network-based H∞ output tracking control. IEEE Trans Autom Control 53(3):655–667
9. Shaked U, de Souza CE (1995) Continuous-time tracking problems in an H∞ setting: a game theory approach. IEEE Trans Autom Control 40(5):841–852
10. Gershon E, Shaked U, Yaesh I (2005) H∞ control and estimation of state-multiplicative linear systems, vol 318. Lecture notes in control and information sciences, LNCIS. Springer, Heidelberg
11. Gershon E, Shaked U (2013) Advanced topics in control and estimation of state-multiplicative noisy systems, vol 439. Lecture notes in control and information sciences, LNCIS. Springer, Heidelberg
12. Gershon E, Shaked U, Berman N (2010) H∞ preview tracking control for retarded stochastic systems. In: Conference on control and fault tolerant systems (SysTol10), Nice, France

Direct Integrability for State Feedback Optimal Control with Singular Solutions

Paolo Di Giamberardino1(B) and Daniela Iacoviello1,2

1 Department of Computer, Control and Management Engineering Antonio Ruberti, Sapienza University of Rome, Rome, Italy
{paolo.digiamberardino,daniela.iacoviello}@uniroma1.it
2 Institute for Systems Analysis and Computer Science Antonio Ruberti, Rome, Italy
http://www.diag.uniroma1.it/~digiamb/webpage.html, http://www.diag.uniroma1.it/~iacoviel/

Abstract. The paper studies the problem of determining the optimal control when singular arcs are present in the solution. In the general classical approach, the expressions obtained depend on the state and the costate variables at the same time, thus requiring a forward-backward integration for the computation of the control. In this paper, sufficient conditions on the structure of the dynamics are first discussed, in order to have both the control and the switching function depending on the state only, computable by a simple forward integration. Then, the possibility of extending this result by means of a preliminary dynamic extension is presented. The approach has been checked and validated making use of a classical SIR epidemic model.

Keywords: Optimal control · Singular control · Costate independent singular surface · SIR epidemic model

1 Introduction

Optimal control theory provides the natural framework to solve control problems with resource limitations when contrasting goals are required. The design procedure can make use of the minimum principle, allowing the determination of the optimal control that, depending on the cost index and on the model, could be a bang-bang or a bang-singular-bang solution [1–4]. In particular, if the model as well as the cost index are linear in the control, the existence conditions of such kind of solutions can be explicitly established. In the bang-bang solution the control assumes only the extreme values, whereas the singular one is obtained if the Hamiltonian does not depend on the control in an interval of positive measure. The extreme values assumed by the control depend on the sign of the switching function, whereas the existence of singular control is related to the possibility that this switching function is identically zero on an interval of finite length.


The determination of singular control, while it is easy from a theoretical point of view, is generally difficult to implement; the optimal control requires the solution of a non linear differential equations system in the state variables with initial conditions, and a non linear differential equations system in the costate variables with final conditions. Moreover, in general it is not easy the determination of the best control sequence and the number of switching points [5,6]. In this paper, the determination of the optimal control of nonlinear systems is investigated referring to the case in which the input acts linearly both in the model and in the cost index, aiming at a constructive computing of the singular solution. The effectiveness of the approach is tested making use of the example of a classical SIR epidemic model, where S stands for the class of susceptible subjects, I for the class of infected patients and R for the class of the removed ones [7,8]. Facing epidemic spread control in the framework of optimal control theory is rather common for its capability of suggesting suitable scheduling of possible actions such as vaccination or quarantine or treatment, taking into account resource limitations. Then, it is widely used in literature [9,10]. In particular, the optimal singular control for a SIR epidemic model has been already studied in [11,12], where the structure of singular control has been deeply investigated in presence of the double control, vaccination and medical treatment, showing that the latter can’t be singular, whereas a singular regimen is expected for the optimal vaccination strategy. In this work, the SIR model is used with two different formulations. In a first example, it is assumed that a recovered subject cannot be infected again, while in a second example, the modified case in which the possibility for a recovered individuals to become susceptible again is used. The two formulations allow to deal with different dynamical structures so that all the results presented can be illustrated. The paper is organized as follows; the classical optimal control problem formulation is given in Sect. 2; the conditions and the constraints usually considered are illustrated and the structure of the optimal solution, as well as the conditions for the existence of both switches and singular arcs, are given. Then, in Sect. 3, a particular class of dynamics is introduced, the same considered in [13], and constructive conditions for the singular solutions are given for different formulations of the optimal problem. The example of a SIR model is considered, to put in evidence the procedure and the effectiveness of the results. In Sect. 4, the extension of the results to generic dynamics is discussed and it is explicitly formulated for the simpler case of one dimensional dynamic extension. Also in this case, a SIR model in a modified version of the one considered in the previous Section, is considered to exemplify the approach and verify its effectiveness.

2 The Optimal Control Problem Formulation

The optimal control problem addressed here refers to the design of the control input u, for a given nonlinear dynamics of the form

ẋ = f(x) + g(x) u = f(x) + Σ_{i=1}^{m} u_i g_i(x)    (1)

where x ∈ R^n, u ∈ R^m and with x(t_0) = x_0, such that a cost index

J = ∫_{t_0}^{t_f} L(x, u, t) dt    (2)

L : R^n × R^m × R → R is minimised. The presence of a bound on the control amplitude can be expressed in the form of a box constraint as

u_min ≤ u(t) ≤ u_MAX,   ∀t ∈ [t_0, t_f]    (3)

with u_min, u_MAX ∈ R^m. In the definition of the problem, additional constraints can be introduced. The most common are on (some components of) the final state x(t_f) and on the final time t_f. Such constraints are considered introducing the function

χ(x(t_f), t_f) = 0    (4)

with χ ∈ C¹ almost everywhere, dim{χ} = σ, 1 ≤ σ ≤ n + 1. Clearly, if t_f is left free, its value has to be found during the optimal control problem solution. In the same way, if some or all the final values for the state components are not prefixed, they are obtained from the problem solution. The usual design procedure starts with the definition of the Hamiltonian function

H(x, λ, u, t) = L(x, u, t) + λ^T (f(x) + g(x) u)    (5)

in which the multiplier function λ(t), also called costate, is introduced, with λ(t) : R → R^n, λ(t) ∈ C¹ almost everywhere. The Hamiltonian (5) verifies the condition

H(x, λ, u, t) = K,   ∀t ∈ [t_0, t_f]    (6)

with K ∈ R. Since (6) must satisfy

H(x, λ, u, t_f) = (∂χ(x(t_f), t_f)/∂t_f) η    (7)

for an unknown η ∈ R, K is to be found during the optimization procedure if the final time t_f is fixed, with K = η, while it is equal to zero when the final time is left free and the function χ does not depend on t_f.
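To make these objects concrete, the following sympy sketch builds the Hamiltonian (5), the switching function that multiplies u, and the right-hand side of the costate equation introduced next, for a generic control-affine system with a cost that is linear in the control. The two-state dynamics and the weights are hypothetical (loosely SIR-like) and are not the model used later in the paper.

import sympy as sp

x1, x2, l1, l2, u = sp.symbols('x1 x2 lambda1 lambda2 u')
x = sp.Matrix([x1, x2]); lam = sp.Matrix([l1, l2])

# Hypothetical control-affine dynamics and linear-in-u cost (illustrative only).
f = sp.Matrix([-x1 * x2, x1 * x2 - x2])
g = sp.Matrix([-x1, 0])
L = x2 + 2 * u                      # L(x,u) = alpha^T x + c u with alpha = (0, 1), c = 2

H = L + (lam.T * (f + g * u))[0, 0]                            # Hamiltonian (5)
costate_rhs = -sp.Matrix([sp.diff(H, x1), sp.diff(H, x2)])     # costate dynamics
G = sp.expand(sp.diff(H, u))                                   # switching function c + lambda^T g(x)
print(sp.simplify(G))               # here: 2 - lambda1*x1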


From (5), and taking into account the dynamics (1), the costate λ must satisfy the differential equation   m  ∂gi ˙λT = − ∂H = − ∂L − λT ∂f + (8) ui ∂x ∂x ∂x i=1 ∂x under the boundary conditions T ∂χ (x(tf ), tf )  λ(tf ) = −  ζ ∂x(tf )

(9)

with ζ ∈ Rσ to be found and χ given in (4). Clearly, if the final state is unconstrained, (4) does not depends on x(tf ) and (9) becomes λ(tf ) = 0

(10)

The structure of (5) suggests that if in L(x, u, t) the control u appears linearly, the whole Hamiltonian results to be affine with respect to the control. This property can be very useful when the minimum principle is applied, and it will be used in this paper. Then, ˜ t) + cT u L(x, u, t) = L(x,

(11)

can be assumed and, consequently, the expression for (5) becomes ˜ t) + cT u + λT f (x) + λT g(x)u H(x, λ, u, t) = L(x,   ˜ t) + λT f (x) + cT + λT g(x) u = F (x, λ, t) + G(x, λ)u = L(x,

(12)

˜ t) + λT f (x) F (x, λ, t) = L(x, G(x, λ) = cT + λT g(x)

(13) (14)

with

Clearly, the presence in the cost function of the input u in a linear form is always possible if umin ≥ 0 in (3). The minimum principle H(x, λ, u, t) ≤ H(x, λ, ω, t)

∀ω ∈ [umin , uM AX ]

(15)

can be used, yielding, for (12), the condition G(x, λ)u ≤ G(x, λ)ω

∀ω ∈ [umin , uM AX ]

As a consequence, the optimal control uM AX if G(x, λ) < 0 u= umin if G(x, λ) > 0

(16)

(17)

486

P. Di Giamberardino and D. Iacoviello

can be obtained. The time instant ts such that G(x(ts ), λ(ts )) = 0 with

(18)

− + + G(x(t− s ), λ(ts )) · G(x(ts ), λ(ts )) < 0

is the instant of switching in which the control changes from one extreme to the other, so getting the classical bang-bang solution. If condition (18) holds over a finite time interval [t1 , t2 ] ⊆ [ti , tf ] G(x(t), λ(t)) = 0

t ∈ [t1 , t2 ]

(19)

then the optimal control presents a singular arc in the solution and it assumes the expression ⎧ ⎨ uM AX if G(x, λ) < 0 u = us (x, λ) if G(x, λ) = 0 (20) ⎩ if G(x, λ) > 0 umin In this case, (19) means that G(x(t), λ(t)) is constant over a finite time interval and, then, all its time derivatives in the same interval must be equal to zero; so, the identities ∂ k G(x(t), λ(t)) . (k) = G (x(t), λ(t)) = 0 ∀t ∈ [t1 , t2 ] ∂tk

(21)

must hold for any k ≥ 0. By definition, G(0) (x(t), λ(t)) does not depend on the control u. Computing (21), for k = 0, 1, 2, . . . , there exists an index r, necessarily even [11,12], such that G(k) (x(t), λ(t)) is independent from u if k < r, while, for k = r, the control u appears explicitly [4]. Then, the first r conditions (21), for k = 0, . . . , r − 1, give r relations between state x and costate λ, and from G(r) (x(t), λ(t)) = 0 the expression for the singular control us in (20) can be obtained. Making reference to the expression (14), one has G(0) (x, λ) = cT + λT g(x) = 0

(22)

G(1) (x, λ) = λ˙ T g(x) + λT g(x) ˙ m m   ˜ ∂ L(x, t) =− g(x) − λT Lg f (x) − ui λT Lg gi (x) + λT Lf g(x) + ui λT Lgi g(x) ∂x i=1 i=1 =−

m  ˜ ∂ L(x, t) ui λT adgi g(x) = 0 + λT adf g(x) + ∂x i=1

(23)

Direct Integrability for State Feedback Optimal Control

487

where the compact expressions ∂g1 f (x) ∂x Lf g(x) = (Lf g1 (x) . . . Lf gm (x))

Lf g1 (x) =

Lg f (x) = (Lg1 f (x) . . . Lgm f (x)) Lg gi (x) = (Lg1 gi (x) . . . Lgm gi (x)) Lgi g(x) = (Lgi g1 (x) . . . Lgi gm (x)) adf g(x) = Lf g(x) − Lg f (x) adgi g(x) = Lgi g(x) − Lg gi (x) are used. Note that the structures of the Lie Bracket introduced are   adf g(x) = adf g1 (x) adf g2 (x) · · · adf gm (x)   adgi g(x) = adgi g1 (x) adgi g2 (x) · · · adgi gm (x) recalling that adgi gi (x) = 0 ∀i ∈ [1, m].

m If the vector fields gi in (1) commute, then i=1 ui λT adgi g(x) = 0, as in [11,12]; the same term is equal to zero also if single input systems are considered. In both cases, the identity (23) reduces to G(1) (x, λ) = −αT g + λT adf g(x) = 0

(24)

and G(2) (x, λ) can be computed using this simplified expression, iterating the procedure.

3

The Case of Block Sub Triangular Single Input Systems

In this Section, the class of dynamics (1) considered is particularised setting x non negative and m = 1, as in [13]. This simplification allows to better clarify the relationships between the structure of the system and the possibility to compute the optimal control law avoiding the backward integration of the costate dynamics (8). Under these hypothesis, the initial dynamics (1) is simplified and (24) holds: moreover, thanks to the non negativeness of the state, ˜ t) = αT x L(x,

(25)

can be used, so obtaining a fully linear term in the cost function L(x, u, t) = L(x, u) = αT x + cu

(26)

Then, λ˙ T in (8) becomes ∂f ∂g − uλT λ˙ T = −αT − λT ∂x ∂x

(27)

488

P. Di Giamberardino and D. Iacoviello

and the expressions of (22) and (23) become G(0) (x, λ) = c + λT g(x) = 0 G(1) (x, λ) = −αT g(x) + λT adf g(x) = 0

(28) (29)

with (29) always independent from u. Then, the computation of G(2) (x, λ, u) = 0 must always be performed: G

(2)

T

(x, λ, u) = −α



− =



T



Lf g(x) + uLg g(x)

α



T

2

T

∂f ∂x

+ uλ T

λ adf g(x) − α



T

∂g ∂x

  adf g(x) + λ

Lf g(x) + adf g(x)

T





Lf adf g(x) + uLg adf g(x)





T

T

 

α Lg g(x) − λ adg adf g(x) u

(30)

If αT Lg g(x) − λT adg adf g(x) = 0, the singular control us can be obtained as us (x, λ) =

λT ad2f g(x) − αT (Lf g(x) + adf g(x)) αT Lg g(x) − λT adg adf g(x)

(31)

otherwise G(2) (x, λ, u) = G(2) (x, λ), the condition λT ad2f g(x) − αT (Lf g(x) + adf g(x)) = 0

(32)

involving x and λ only, is obtained, and a further derivative G(3) (x, λ, u) must be computed. Set r as the first index such that G(r) (x, λ, u) is dependent from the input. It is easily verified that: Proposition 1. The expression of the i–th derivative G(i) (x, λ), for i = 0, 1, . . . , r − 1, is of the form G(i) (x, λ) = λT adif g(x) + hi (x) for suitable functions hi (x) (h0 (x) = c, h1 (x) = −αT g(x)). Proof. It comes iteratively at the i–th step from the structure of G(i−1) (x, λ) and the computations evidenced in (30). 3.1

The Class of Dynamics Considered

The class of nonlinear dynamics considered in this Section presents the structure       g1 (x1 ) f1 (x1 ) x˙ 1 + u (33) = x˙ 2 f2 (x1 ) g2 (x1 ) where x1 ∈ Rr , x2 ∈ Rn−r , and the functions fi and gi defined consequently. The controllability condition     dim span g1 , adf1 g1 , . . . , adr−1 (34) f1 g (x1 ) = r

Direct Integrability for State Feedback Optimal Control

489

for the first subsystem is assumed verified, with x1 in a suitable domain containing x1 (t0 ). In order to avoid long computations, the simplest case of r = 2 is here addressed, for which all the expressions can be written in a compact form. This is also the case of the example discussed in Subsect. 3.3 to illustrate the proposed approach. Consequently, setting   λ1 (35) λ1 ∈ Rr , λ2 ∈ Rn−r λ= λ2 one has ∂L  ∂L ∂L  = ∂x1 ∂x2 ∂x   ∂f1 (x1 ) 0 ∂f 1 = ∂f∂x 2 (x1 ) ∂x 0 ∂x

∂g = ∂x

1



∂g1 (x1 ) ∂x1 ∂g2 (x1 ) ∂x1

0 0



so that (8) becomes 

λ˙ 1 λ˙ 2

 =−



 ∂L  ∂x1 ∂L ∂x2

+

∂f1T (x1 ) ∂f2T (x1 ) ∂x1 ∂x1

 ∂g1 (x1 )

+u

∂x1

0

0

∂g2 (x1 ) ∂x1

0





0

λ

λ

The above equation can be explicitly decomposed into  T  ˙λ1 = − ∂L − ∂f1 (x1 ) + u ∂g1 (x1 ) λ1 + ∂x1 ∂x1 ∂x1  T  ∂f2 (x1 ) ∂g2 (x1 ) − +u λ2 ∂x1 ∂x1 ∂L λ˙ 2 = − ∂x2 3.2

(36)

(37) (38)

Results for the Class of Dynamics Considered

The particular structure of the system (33), which induces the simplification on the λ dynamics (37)–(38), has been used in [13] to prove the following results, for which a short justification is here recalled, mainly focusing on the computation of the costate. The differences among the following three Proposition are in the boundary conditions, studying the differences for fixed or free final time and/or state [13]. Proposition 2. Consider the optimal control problem, with fixed final time tf and free final conditions on the state, for a non linear dynamics of the form (33) and the cost function (2) with L as in (26). There exists a procedure for

490

P. Di Giamberardino and D. Iacoviello

the computation of the optimal singular solution in which the control law is expressed as a pure state feedback, leading to a bang–singular–bang optimal control, for which the singular manifold can be explicitly given as a state function only. Moreover, after the last switch, the optimal control u is equal to umin.

Proof: Under the conditions stated in the Proposition, the constraint (4) does not depend on the final state x(tf), so that (10) holds; the expression of L(x, u, t) gives λ̇_2 = 0 in (38) and, thanks to (10) and the continuity hypothesis on λ(t), one has

$$\lambda_2(t) = 0 \qquad \forall t \in [t_0, t_f] \qquad (39)$$

As far as λ_1 is concerned, the equations G^(i)(x, λ), i = 0, 1, ..., r − 1, can be written, for Proposition 2, as

$$G^{(0)}(x,\lambda) = \lambda_1^T g_1(x_1) + c = 0 \qquad (40)$$
$$G^{(1)}(x,\lambda) = \lambda_1^T \mathrm{ad}_{f_1} g_1(x_1) - \alpha_1^T g_1(x_1) = 0 \qquad (41)$$
$$\ldots = \ldots$$
$$G^{(r-1)}(x,\lambda) = \lambda_1^T \mathrm{ad}_{f_1}^{r-1} g_1(x_1) + h_{r-1}(x_1) = 0 \qquad (42)$$

Rewritten in the compact form

$$\lambda_1^T \left( g_1(x_1),\ \mathrm{ad}_{f_1} g_1(x_1),\ \ldots,\ \mathrm{ad}_{f_1}^{i-1} g_1(x_1) \right) = \left( -c,\ \alpha_1^T g_1(x_1),\ \ldots,\ h_{i-1}(x_1) \right)$$

the expression of λ_1

$$\lambda_1^T = \left( -c,\ \ldots,\ h_{i-1}(x_1) \right) \left( g_1(x_1),\ \ldots,\ \mathrm{ad}_{f_1}^{i-1} g_1(x_1) \right)^{-1} \qquad (43)$$

can be easily computed once (34) holds. Then, the expression of the costate λ is fully known, with λ_2(t) as in (39) and λ_1 as in (43). It is a function of the state, without the necessity of using (36), thus avoiding the backward integration procedure. Then, it is possible to express the singular control u_s(x, λ) as a state function only, in particular a function of x_1, u_s = u_s(x_1). The equation of the singular manifold can be obtained from

$$F(x,\lambda,t) = \alpha_1^T x_1 + \lambda_1^T f_1(x_1) = K \qquad (44)$$

thanks to (6), in which the structure (12) is used, and the fact that condition (19) holds. Expression (44) can be explicitly written as a function of the state once (43) is used:

$$K = \alpha_1^T x_1 + \left( -c,\ \ldots,\ h_{i-1}(x_1) \right) \left( g_1(x_1),\ \ldots,\ \mathrm{ad}_{f_1}^{i-1} g_1(x_1) \right)^{-1} f_1(x_1) \qquad (45)$$


which fully describes the singular surface in the x_1 subspace. The subspace x_2 can be computed from x_1, thanks to the structure of (33). The value of the unknown parameter K can be obtained noting that, from (12) evaluated at t = tf, one has

$$H(x,\lambda,u,t_f) = \alpha_1^T x_1(t_f) = K \qquad (46)$$

since u(tf) = 0 as well as λ(tf) = 0. This procedure allows one to fully compute the singular part of the optimal solution. The full solution (20) is obtained including the conditions in (17).

Two other Propositions, shortly recalled below, are proved in [13], where the same problem as in Proposition 2 is defined but with different conditions on the final state and time. More explicitly, while Proposition 2 refers to the case of fixed final time tf and free final state x(tf), Proposition 3 considers both x(tf) and the time tf fixed, and Proposition 4 refers to fixed final state and free final time. All these choices lead to different definitions of the constraint function (4), which determines the boundary conditions on λ(t) and on the Hamiltonian (5). The core of the results, that is the possibility to obtain the expression of λ(t) directly as a function of the state without integrating its differential system, is the same for all three cases.

Proposition 3. Given the optimal control problem, with fixed time tf and fixed final condition on the state x(tf) = xf, for a nonlinear dynamics of the form (33) and the cost function (2) with L of the form (26), there exists a procedure for the computation of the optimal singular solution which gives the expression of the control law as a pure state feedback, leading to a bang–singular–bang optimal control, for which the singular surface can be explicitly written as a state function only.

In this case, the presence of constraint (4) does not allow λ(tf) to be known, due to expression (9). However, λ̇_2 = 0 is still true and then one has λ_2(t) = λ_2(tf) = const. In this case Eqs. (40), (41) and (42) assume the form

$$G^{(0)}(x,\lambda) = \lambda_1^T g_1(x_1) + \lambda_2^T(t_f)\, g_2(x_1) = 0 \qquad (47)$$
$$G^{(1)}(x,\lambda) = \lambda_1^T \mathrm{ad}_{f_1} g_1(x_1) - \alpha_1^T g_1(x_1) + \lambda_2^T(t_f)\, [f_2, g_2](x_1) = 0 \qquad (48)$$
$$\ldots = \ldots$$
$$G^{(r-1)}(x,\lambda) = \lambda_1^T \mathrm{ad}_{f_1}^{r-1} g_1(x_1) + h_{r-1}(\lambda_2(t_f), x_1) = 0 \qquad (49)$$

Following the same considerations as in the previous Proposition 2, it is possible to compute λ_1 as in (43), whose expression in this case is λ_1(λ_2(tf), x_1). Also the singular control u_s, as well as the singular surface (45), can be computed as a function of x_1, but they turn out to be parametrised by λ_2(tf). The same follows for x_2(t). However, thanks to condition (4), it is possible to determine the actual value of λ_2(tf), thus obtaining again the full computation of the singular solution u_s = u_s(x_1) and of the singular state space manifold. The remaining part of the procedure follows what is stated in Proposition 2. The last case here addressed refers to free final time and fixed final conditions on the state variables. It is possible to state the following


Proposition 4. Given the optimal control problem, with free final time tf and the constraint (4) for the final conditions on the state, for a nonlinear dynamics of the form (33) and the cost function (2) with L(x, u, t) = α_1^T x_1 + cu, α_1 ∈ R^r, there exists an algorithm for the computation of the optimal singular solution which gives the expression of the control law as a pure state feedback, leading to a bang–singular–bang optimal control, for which the singular surface can be explicitly written as a state function only.

Proof: The proof comes straightforwardly from the one of Proposition 3, since the only difference is that, being the final time tf free, K in (6) and in all the derived expressions, up to (45), is equal to zero. All the other considerations still hold.

These results are applied, in the next Subsection, to a simple dynamical system in order to highlight the main steps and the results.

3.3 The Example of a SIR Model

The effectiveness of the procedures introduced is here verified making use of a classical problem, the optimal control of an epidemic spread. The mathematical model here considered is a SIR dynamics. With reference to the general formulation introduced in Sect. 2, the classical model of a SIR epidemic spread is

$$\dot{S} = -\beta S I - S u + \mu \qquad (50)$$
$$\dot{I} = \beta S I - \gamma I \qquad (51)$$
$$\dot{R} = \gamma I \qquad (52)$$

with given initial conditions S_0, I_0, R_0, and box constraints u(t) ∈ [0, U_MAX], for which the control law u(t) minimising the cost function

$$J(S(t), I(t), R(t), u(t)) = \int_{t_0}^{t_f} \left( a I(t) + c u(t) \right) dt \qquad (53)$$

has to be found. Only the final time tf is assumed fixed and equal to t_f^d. Then, the constraint (4) has the expression χ(tf) = tf − t_f^d and its derivative with respect to tf is equal to one. The aim of the control action is to minimize the number of infected subjects I in the fixed time interval [0, tf] using as few resources as possible; the positive parameters a and c represent the weights of these two contrasting requirements. Setting

$$x = \begin{pmatrix} x_1 \\ x_2 \end{pmatrix} \quad \text{with} \quad x_1 = \begin{pmatrix} S \\ I \end{pmatrix} \quad \text{and} \quad x_2 = R \qquad (54)$$

the formulation corresponds to the case considered in Proposition 2, with r = 2, α_1^T = (0  a), α^T = (α_1^T  0) and

$$f_1(x_1) = \begin{pmatrix} -\beta S I + \mu \\ \beta S I - \gamma I \end{pmatrix}, \quad g_1(x_1) = \begin{pmatrix} -S \\ 0 \end{pmatrix}, \quad f_2(x_1) = \gamma I, \quad g_2(x_1) = 0 \qquad (55)$$

Correspondingly, the costate λ is defined as

$$\lambda = \begin{pmatrix} \lambda_1 \\ \lambda_2 \end{pmatrix} \quad \text{with} \quad \lambda_1 \in \mathbb{R}^2 \ \text{and} \ \lambda_2 \in \mathbb{R} \qquad (56)$$

and the Hamiltonian (12) is

$$H(x,\lambda,u) = \alpha_1^T x_1 + c u + \lambda_1^T f_1(x_1) + \lambda_2 f_2(x_1) + \lambda_1^T g_1(x_1) u = F(x,\lambda) + G(x,\lambda) u \qquad (57)$$

with

$$F(x,\lambda) = \alpha_1^T x_1 + \lambda_1^T f_1(x_1) + \lambda_2 f_2(x_1) \qquad (58)$$
$$G(x,\lambda) = c + \lambda_1^T g_1(x_1) \qquad (59)$$

and for which

$$H(x,\lambda,u) = K \qquad \forall t \in [t_0, t_f], \quad K \in \mathbb{R} \qquad (60)$$

The costate dynamics (37) and (38) for λ can be computed, getting

$$\dot{\lambda}_1 = - \begin{pmatrix} 0 \\ a \end{pmatrix} - \begin{pmatrix} -\beta I & \beta I \\ -\beta S & \beta S - \gamma \end{pmatrix} \lambda_1 - u \begin{pmatrix} -1 & 0 \\ 0 & 0 \end{pmatrix} \lambda_1 - \begin{pmatrix} 0 \\ \gamma \end{pmatrix} \lambda_2 \qquad (61)$$
$$\dot{\lambda}_2 = 0 \qquad (62)$$

with λ(tf) = 0 due to the free final state condition. Then, as expected, from (62) one has λ_2(t) = 0. Performing the computations, expressions (47) and (48) assume the form

$$G^{(0)}(x,\lambda) = c + \lambda_1^T \begin{pmatrix} -S \\ 0 \end{pmatrix} = 0 \qquad (63)$$
$$G^{(1)}(x,\lambda) = \lambda_1^T \begin{pmatrix} -\mu \\ \beta S I \end{pmatrix} = 0 \qquad (64)$$

from which the variable λ_1 can be computed, as a function of x_1 = (S, I)^T, obtaining

$$\lambda_1 = \begin{pmatrix} -S & 0 \\ -\mu & \beta S I \end{pmatrix}^{-1} \begin{pmatrix} -c \\ 0 \end{pmatrix} = \begin{pmatrix} \frac{c}{S} \\ \frac{c\mu}{\beta S^2 I} \end{pmatrix} \qquad (65)$$

Moreover, for G^(2)(x, λ) one has

$$G^{(2)}(x,\lambda) = \lambda_1^T \mathrm{ad}_{f_1}^2 g_1(x_1) - \alpha_1^T \mathrm{ad}_{f_1} g_1(x_1) + u\, \lambda_1^T \mathrm{ad}_{g_1} \mathrm{ad}_{f_1} g_1(x_1) \qquad (66)$$

from which the singular control

$$u_s(x,\lambda) = \frac{\lambda_1^T \mathrm{ad}_{f_1}^2 g_1(x_1) - \alpha_1^T \mathrm{ad}_{f_1} g_1(x_1)}{\lambda_1^T \mathrm{ad}_{g_1} \mathrm{ad}_{f_1} g_1(x_1)} \qquad (67)$$

is obtained. After all the computations, and making use of (65), the state feedback singular control

$$u_s(x) = \frac{a\beta S I - \beta^2 c S I + \frac{2\beta c\mu I}{S} - \frac{2 c\mu^2}{S^2}}{-\frac{2 c\mu}{S}} = \beta I \left( \frac{(\beta c - a) S^2}{2 c\mu} - 1 \right) + \frac{\mu}{S} \qquad (68)$$

is computed. The singular manifold can be obtained as a state function according to (44) and (45):

$$\begin{pmatrix} 0 & a \end{pmatrix} \begin{pmatrix} S \\ I \end{pmatrix} + \begin{pmatrix} \frac{c}{S} & \frac{c\mu}{\beta S^2 I} \end{pmatrix} \begin{pmatrix} -\beta S I + \mu \\ \beta S I - \gamma I \end{pmatrix} = a I - \beta c I + \frac{2 c\mu}{S} - \frac{c\gamma\mu}{\beta S^2} = K \qquad (69)$$

Finally, from (46),

$$K = a I(t_f) \qquad (70)$$


Fig. 1. The Bang–Singular–Bang optimal control u(t) ([13]).

The simulation has been performed setting tf = 20, fixing the model parameters to β = 0.01, γ = 0.4 and μ = 10, and choosing the weights a = 1 and c = 1 in the cost function (53). The time history of the bang–singular–bang control is depicted in Fig. 1, where the finite time interval of the singular solution, t ∈ [t1, t2] with t1 = 5.32 and t2 = 15.3, is evidenced by the two vertical dotted lines. The corresponding state evolution is represented in Fig. 2 for the x1–x2 plane projection: the initial part of the trajectory, from the starting conditions x1,0 = 50 and x2,0 = 10 (the square marker) to the intersection with the singular curve (dotted line), is denoted by a dashed curve and represents the evolution under the upper bound value of the control. Then, under the singular control (68), the trajectory follows the singular curve along the solid arc, until the second switch from u = us to u = umin = 0 occurs, yielding the free evolution from the singular arc to the final point (diamond marker).


[Fig. 2 plots x2(t) versus x1(t); legend: Singular curve SC, State trajectory, Trajectory on the SC, Initial condition, Final condition, Switching points.]

Fig. 2. State trajectory in the x1 –x2 plane, compared with the singular curve ([13]).
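The qualitative behaviour just described can be reproduced by a short forward simulation, which is the only integration the proposed approach requires. The Python sketch below is purely illustrative: the switching times are the ones reported above, while the control bound uMAX = 1, the initial values S0 = 50, I0 = 10, R0 = 0, the Euler step and the clipping of the singular control to the admissible interval are assumptions of this sketch, not data from the paper.

```python
# Model and cost parameters as reported for the simulation above; UMAX, the
# initial state and the integration step are assumptions of this sketch.
beta, gamma, mu = 0.01, 0.4, 10.0
a, c = 1.0, 1.0
t1, t2, tf = 5.32, 15.3, 20.0      # switching times of the bang-singular-bang law
UMAX = 1.0                          # assumed control upper bound
S, I, R = 50.0, 10.0, 0.0           # assumed initial conditions
dt = 1e-3

def u_singular(S, I):
    # state feedback singular control (68)
    return beta * I * ((beta * c - a) * S**2 / (2 * c * mu) - 1.0) + mu / S

for k in range(int(tf / dt)):
    t = k * dt
    if t < t1:
        u = UMAX                                    # first bang arc
    elif t < t2:
        u = min(max(u_singular(S, I), 0.0), UMAX)   # singular arc, kept admissible
    else:
        u = 0.0                                     # last bang arc, u = umin
    # SIR dynamics (50)-(52), forward Euler step
    dS = -beta * S * I - S * u + mu
    dI = beta * S * I - gamma * I
    dR = gamma * I
    S, I, R = S + dt * dS, I + dt * dI, R + dt * dR

print(f"final state: S={S:.2f}, I={I:.2f}, R={R:.2f};  K = a*I(tf) = {a * I:.2f}")
```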

4 An Extended Case

The results presented in Sect. 3 are based on the particular form of the dynamics (33) and its relationship with condition (34). The use of these results on a generic system can be extended once it is observed that if in the dynamics (33) one has r = n, the expression of a generic n-dimensional system is obtained, satisfying the full controllability condition (34). For such a system, the procedure described above can be easily applied. In fact, in this case, the full costate can be obtained as a function of the state making use of (43) with r = n. Then, the procedures described in Proposition 2, Proposition 3 and Proposition 4 can be applied and the optimal control problem can be solved without requiring the solution, by backward integration, of the differential system (27), yielding an optimal control law that can be expressed as a state function, that is in the form of a state feedback.

The fulfilment of the condition r = n is not common, being impossible if n is odd. Hence, also the direct use of the results in Sect. 3 is strongly limited. In this Section, the possibility to extend the results to the case of generic dynamics for which the condition r < n holds is addressed. The enriched control scheme is described for the case r = n − 1, since the other cases can be obtained recursively.

In this case the optimal control problem refers to a dynamics (1) with x ∈ R^n, x_i ≥ 0, and u ∈ R, u ≥ 0, to a cost function (2) with position (25), and control constraints (3) with umin = 0. The introduction of the Hamiltonian function (12) and the use of the minimum principle (15), which becomes (16), allow one to obtain the expression (17) for the optimal control.


If a singular arc is present in the optimal solution, then the expressions of the G^(i)(x, λ) can be computed and, for the hypothesis assumed on r and for Proposition 1, one has that for the first n − 1 terms G^(i)(x, λ), i = 0, ..., n − 2, the conditions

$$G^{(0)}(x,\lambda) = \lambda^T g(x) + c = 0$$
$$G^{(1)}(x,\lambda) = \lambda^T \mathrm{ad}_f g(x) - \alpha^T g(x) = 0$$
$$\ldots \quad \ldots \quad \ldots$$
$$G^{(n-2)}(x,\lambda) = \lambda^T \mathrm{ad}_f^{n-2} g(x) + h_{n-2}(x) = 0 \qquad (71)$$

hold independently of the control u, while, at the (n − 1)-th step, one gets a form like the one in (30),

$$G^{(n-1)}(x,\lambda,u) = M(x,\lambda) + N(x,\lambda) u = 0 \qquad (72)$$

with N(x, λ) ≠ 0, so that the singular control

$$u_s = - \frac{M(x,\lambda)}{N(x,\lambda)} \qquad (73)$$

can be computed. In this case, without particular hypotheses on the dynamics, from (43) with r = n − 2 it is not possible to obtain the n components of the costate λ from the (n − 1)-dimensional system (71). The approach here followed to overcome this problem makes use of a dynamic extension of the input, adding an integrator on u. Then, u becomes one component of the state and v = u̇ is the new input. This position has the consequence of making (72) the (n − 1)-th equation of system (71), from which a solution like (43) can be obtained for the costate λ. The drawback of this idea is that the dimension of the state becomes n + 1 and the same is, consequently, the dimension of the costate. Then, since from (43) only n components of the costate can be computed, it seems that the problem is not actually solved. In order to clarify the effects of the dynamic extension on the whole optimal control problem, the reformulation for the extended dynamics is performed and the steps for the computation of the new solutions are analysed.

Then, consider the extended dynamics

$$\dot{x} = f(x) + g(x) u, \qquad \dot{u} = v \qquad (74)$$

with x ∈ R^n, u ∈ R, v ∈ R, which can be described by

$$\dot{z} = \Phi(z) + \Gamma v \qquad (75)$$

once the extended state z = (x^T  u)^T is introduced, along with the extended vector fields

$$\Phi(z) = \begin{pmatrix} f(x) + g(x) u \\ 0 \end{pmatrix}, \qquad \Gamma = \begin{pmatrix} 0 \\ 1 \end{pmatrix} \qquad (76)$$


Clearly, the same cost function (2) with (26) is considered, since it is the original cost of the problem. Its expression, for the extended case, becomes

$$J = \int_{t_0}^{t_f} \left( \alpha^T x + c u \right) dt = \int_{t_0}^{t_f} \tilde{\alpha}^T z \, dt \qquad (77)$$

The external input v is not present in the cost function and is not explicitly constrained, since the previous bound on u,

$$u(t) \in [0, u_{MAX}] \qquad \forall t \in [t_0, t_f] \qquad (78)$$

here becomes a constraint on one of the state components. As far as the boundary conditions on the state are concerned, x(t_0) = x_0 is the same as in the original problem; for u, the initial and/or final values can be assumed known, once the existence of the control (20) is assumed for the initial problem. Also the constraint (4) must be rewritten with respect to the extended state, getting the expression

$$\tilde{\chi}(z(t_f), t_f) = 0 \qquad (79)$$

verifying

$$\frac{\partial \tilde{\chi}(z(t_f), t_f)}{\partial z(t_f)} = \left( \frac{\partial \tilde{\chi}(x(t_f), u(t_f), t_f)}{\partial x(t_f)} \quad \frac{\partial \tilde{\chi}(x(t_f), u(t_f), t_f)}{\partial u(t_f)} \right) \qquad (80)$$

where

$$\frac{\partial \tilde{\chi}(x(t_f), u(t_f), t_f)}{\partial u(t_f)} = 0 \qquad (81)$$

if no hypothesis can be initially formulated for the final condition on u. According to the extension, the costate function

$$\tilde{\lambda} = \begin{pmatrix} \lambda \\ \lambda_u \end{pmatrix} \qquad (82)$$

is introduced, with λ ∈ R^n the same as in the original case, and λ_u ∈ R. The Hamiltonian function becomes

$$\tilde{H}(z, \tilde{\lambda}, v, t) = \tilde{\alpha}^T z + \tilde{\lambda}^T \left( \Phi(z) + \Gamma v \right) = \tilde{F}(z, \tilde{\lambda}) + \tilde{G}(\tilde{\lambda})\, v \qquad (83)$$

with

$$\tilde{F}(z, \tilde{\lambda}) = \tilde{\alpha}^T z + \tilde{\lambda}^T \Phi(z) \qquad (84)$$
$$\tilde{G}(\tilde{\lambda}) = \tilde{\lambda}^T \Gamma \qquad (85)$$

Expressions of the original formulation can be put in evidence in (84) and (85), observing that

$$\tilde{F}(z, \tilde{\lambda}) = \alpha^T x + c u + \lambda^T \left( f(x) + g(x) u \right) = H(x, \lambda, u) \qquad (86)$$
$$\tilde{G}(z, \tilde{\lambda}) = \lambda_u \qquad (87)$$


The necessary condition for the extended costate becomes

$$\dot{\tilde{\lambda}}^T = - \frac{\partial \tilde{H}}{\partial z} = - \tilde{\alpha}^T - \tilde{\lambda}^T \frac{\partial \Phi}{\partial z} - \tilde{\lambda}^T \frac{\partial \Gamma}{\partial z} v = - \tilde{\alpha}^T - \tilde{\lambda}^T \begin{pmatrix} \frac{\partial f}{\partial x} + \frac{\partial g}{\partial x} u & g \\ 0 & 0 \end{pmatrix} \qquad (88)$$

in which the two blocks

$$\dot{\lambda} = - \alpha - \left( \frac{\partial f}{\partial x} + \frac{\partial g}{\partial x} u \right)^T \lambda \qquad (89)$$
$$\dot{\lambda}_u = c + \lambda^T g \qquad (90)$$

can be evidenced, where (89) is the same as (27) and (90) corresponds to (14). Since the new extended problem does not have any constraint on the new input v, the necessary condition

$$\frac{\partial \tilde{H}(z, \tilde{\lambda}, v, t)}{\partial v} = 0 \qquad (91)$$

can be considered, yielding

$$\tilde{G}(\tilde{\lambda}) = \lambda_u(t) = 0 \qquad (92)$$

Having assumed verified all the conditions of the original problem, on the singular arc one has, for (22), (90) and (92),

$$\dot{\lambda}_u = \tilde{G}^{(1)}(x,\lambda) = G^{(0)}(x,\lambda) = 0 \qquad (93)$$

and then

$$\frac{d^k \lambda_u}{dt^k} = \tilde{G}^{(k)}(x,\lambda) = G^{(k-1)}(x,\lambda) = 0 \qquad \forall k = 1, 2, \ldots \qquad (94)$$

holds in the time interval t ∈ [t1, t2], while for t ∈ [t0, t1] ∪ [t2, tf] λ_u(t) is constant with all its time derivatives equal to zero. On this basis it is possible to state the following.

Proposition 5. For the extended dynamics, the presence of a singular solution is guaranteed in the same time interval as in the solution of the initial problem.

Thanks to Proposition 5, from the expressions in (93) and (94), and under the hypothesis (71) on the solution of the original problem, for the conditions on the existence of the singular solution for the extended dynamics one has

$$\tilde{G}^{(k)}(z, \tilde{\lambda}) = 0, \qquad \text{for } k \in [0, n-1] \qquad (95)$$


For the n-th derivative, one has

$$\tilde{G}^{(n)}(z, \tilde{\lambda}) = G^{(n-1)}(x,\lambda,u) = 0 \qquad (96)$$

with G^(n−1)(x, λ, u) as in (72). For the extension, the control v does not appear yet, and then the subsequent derivative must be performed to compute the singular control:

$$\tilde{G}^{(n+1)}(z, \tilde{\lambda}) = G^{(n)}(x,\lambda,u) = \frac{\partial G^{(n-1)}(x,\lambda,u)}{\partial x} \left( f + g u \right) + \frac{\partial G^{(n-1)}(x,\lambda,u)}{\partial \lambda} \left( - \frac{\partial H(x,\lambda,u)}{\partial x} \right)^T + \frac{\partial G^{(n-1)}(x,\lambda,u)}{\partial u} v \qquad (97)$$

Performing the computation, thanks to the form of the compact expression (72), one obtains

$$\frac{\partial G^{(n-1)}(x,\lambda,u)}{\partial u} = N(x,\lambda) \qquad (98)$$

This result proves the following

Proposition 6. If the singular control u_s can be computed, as in (73), thanks to the non-singularity of the coefficient N(x, λ), the same holds for the control v of the extended problem.

Moreover

Proposition 7. The set of equations

$$\tilde{G}^{(0)}(z, \tilde{\lambda}) = \lambda_u = 0$$
$$\tilde{G}^{(1)}(z, \tilde{\lambda}) = G^{(0)}(x,\lambda) = \lambda^T g(x) + c = 0$$
$$\tilde{G}^{(2)}(z, \tilde{\lambda}) = G^{(1)}(x,\lambda) = \lambda^T \mathrm{ad}_f g(x) - \alpha^T g(x) = 0$$
$$\ldots \quad \ldots \quad \ldots$$
$$\tilde{G}^{(n-1)}(z, \tilde{\lambda}) = G^{(n-2)}(x,\lambda) = \lambda^T \mathrm{ad}_f^{n-2} g(x) + h_{n-2}(x) = 0$$
$$\tilde{G}^{(n)}(z, \tilde{\lambda}) = G^{(n-1)}(x,\lambda,u) = M(x,\lambda) + N(x,\lambda) u = 0 \qquad (99)$$

can be used to compute λ = λ(x, u) without integrating the system (27).

Proof. For the proof, the explicit expressions of M(x, λ) and N(x, λ) should be put in evidence to show their relationship with λ. Making also use of (30) for reference, it can be observed that the coefficients of λ in the terms G^(i)(x, λ), for i = 0, ..., n − 2, are ad_f^i g(x) (Proposition 1). For G^(n−1)(x, λ, u), the terms in the coefficient of λ contain ad_f^{n−1} g(x) along with other Lie brackets of order lower than n contained in the non-singular distribution

$$D = \mathrm{span}\left\{ g,\ \mathrm{ad}_f g,\ \ldots,\ \mathrm{ad}_f^{n-1} g \right\}(x) \qquad (100)$$


so that (99) can always be inverted with respect to λ, getting

$$\lambda = \lambda(x, u) \qquad (101)$$

which can be used in the remaining part of the procedure to solve the optimal control problem, while for λ_u (92) holds.

The correspondence between the original optimal solution in (20), whose singular expression is (73), and the optimal solution v for the extended control problem can be easily stated. In fact, v = 0 outside the singular part of the solution corresponds to u constant, whose value is 0 or uMAX, defined by means of the boundary conditions on the state component u in z. For the singular arc, comparing the expression of G^(n−1)(x, λ, u) in (72), from which u_s is computed, with

$$G^{(n)}(x,\lambda,u,v) = \dot{M}(x,\lambda) + \dot{N}(x,\lambda) u + N(x,\lambda) v \qquad (102)$$

from which

$$v_s = - \frac{\dot{M}(x,\lambda) + \dot{N}(x,\lambda) u_s}{N(x,\lambda)} \qquad (103)$$

is obtained, it is possible to verify that u̇_s = v_s, so that the optimal solution for the state component u of the extended problem coincides with the optimal control u_s, solution of the original problem. In fact

$$\dot{u}_s = - \frac{d}{dt}\left( \frac{M(x,\lambda)}{N(x,\lambda)} \right) = - \frac{\dot{M}(x,\lambda) N(x,\lambda) - \dot{N}(x,\lambda) M(x,\lambda)}{N^2(x,\lambda)} = - \frac{\dot{M}(x,\lambda) - \dot{N}(x,\lambda) \frac{M(x,\lambda)}{N(x,\lambda)}}{N(x,\lambda)} = - \frac{\dot{M}(x,\lambda) + \dot{N}(x,\lambda) u_s}{N(x,\lambda)} = v_s \qquad (104)$$

once compared with (103).
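The identity u̇s = vs in (104) can also be checked symbolically. The following minimal SymPy sketch (an illustration added here, not part of the original derivation) treats M and N as abstract functions of time and verifies that the time derivative of us = −M/N coincides with vs as given in (103).

```python
import sympy as sp

t = sp.symbols('t')
M = sp.Function('M')(t)
N = sp.Function('N')(t)

u_s = -M / N                                      # singular control (73)
v_s = -(sp.diff(M, t) + sp.diff(N, t) * u_s) / N  # extended singular control (103)

# difference between d(u_s)/dt and v_s; the simplification returns 0
print(sp.simplify(sp.diff(u_s, t) - v_s))
```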

4.1 The Example of a SIR Model with No Immunization

An example is here adopted to show explicitly the procedure which allows one to compute the costate λ as a function of the state x and the control u, so that the extended input v over the singular manifold can be expressed without the necessity of integrating the differential system for λ. The dynamics considered is the same as in Subsect. 3.3, slightly modified in order to make it lose the particular structure (33). The modified dynamics is of the form (1) with x = (S  I  R)^T,

$$f(x) = \begin{pmatrix} -\beta S I + \rho R + \mu \\ \beta S I - \gamma I \\ \gamma I - \rho R \end{pmatrix}, \qquad g(x) = \begin{pmatrix} -S \\ 0 \\ 0 \end{pmatrix} \qquad (105)$$


where the change w.r.t. Subsect. 3.3 is the addition of the term ρR, representing the rate of recovered individuals who can re-contract the virus, moving from R to S. In the cost function, the term L is as in (53), of the form (25). The Hamiltonian function is as in (12), with α = (α1  α2  α3)^T, so that the functions G^(0)(x, λ), G^(1)(x, λ) and G^(2)(x, λ, u) have the expressions (28), (29) and (30), respectively. The differential system for λ arising from the necessary conditions is

$$\dot{\lambda}_1 = - \left( \alpha_1 - \beta I \lambda_1 + \beta I \lambda_2 - \lambda_1 u \right)$$
$$\dot{\lambda}_2 = - \left( \alpha_2 - \beta S \lambda_1 + \beta S \lambda_2 - \gamma \lambda_2 + \gamma \lambda_3 \right)$$
$$\dot{\lambda}_3 = - \left( \alpha_3 + \rho \lambda_1 - \rho \lambda_3 \right) \qquad (106)$$

which should be solved, along with (1), under mixed boundary conditions. In this case, if the extension (74) is introduced, the costate λ can be computed as a function of x and u by directly solving the system (28)–(29)–(30). For the present case, one gets

$$\mathrm{ad}_f g(x) = \begin{pmatrix} \rho R + \mu \\ -\beta S I \\ 0 \end{pmatrix}, \qquad \mathrm{ad}_g \mathrm{ad}_f g(x) = \begin{pmatrix} \rho R + \mu \\ -\beta S I \\ 0 \end{pmatrix} \qquad (107)$$

$$L_g g(x) = \begin{pmatrix} S \\ 0 \\ 0 \end{pmatrix}, \qquad L_f g(x) = \begin{pmatrix} \beta S I - \rho R - \mu \\ 0 \\ 0 \end{pmatrix} \qquad (108)$$

$$\mathrm{ad}_f^2 g(x) = \begin{pmatrix} \rho\gamma I - \rho^2 R + \rho\beta I R + \mu\beta I - \beta^2 S^2 I \\ \beta^2 S I^2 - 2\rho\beta I R - 2\beta I \\ -\beta\gamma S I \end{pmatrix} \qquad (109)$$

and the system to be solved becomes

$$\lambda^T C(x, u) = D(x, u) \qquad (110)$$

with

$$C(x, u) = \left( g \quad \mathrm{ad}_f g \quad \mathrm{ad}_f^2 g + \mathrm{ad}_g \mathrm{ad}_f g \, u \right) \qquad (111)$$

$$C(x,u) = \begin{pmatrix} -S & \rho R + \mu & \rho\gamma I - \rho^2 R + \rho\beta I R + \mu\beta I - \beta^2 S^2 I + \rho R u + \mu u \\ 0 & -\beta S I & \beta^2 S I^2 - 2\rho\beta I R - 2\beta I - \beta S I u \\ 0 & 0 & -\beta\gamma S I \end{pmatrix} \qquad (112)$$

$$D(x, u) = \left( -c \quad \alpha^T g \quad \alpha^T L_g g \, u + \alpha^T \left( L_f g + \mathrm{ad}_f g \right) \right) \qquad (113)$$

It can be verified that the controllability condition is satisfied once S ≠ 0 and I ≠ 0. Under the same conditions,

$$\lambda^T(x, u) = D(x, u) \, C^{-1}(x, u) \qquad (114)$$

can be computed since, as is easy to verify by visual inspection, the determinant Δ of the matrix C(x, u) is Δ = −β²γS³I² ≠ 0, and then C(x, u) is invertible.
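As an illustration of the computations (105)–(114), the following SymPy sketch builds C(x, u) and D(x, u) and recovers the costate by inverting C. It is a sketch under assumptions: the Lie bracket convention used below is a choice of this example, so individual vector entries may differ in sign from the ones printed above, while the triangular structure of C and the determinant −β²γS³I² are reproduced.

```python
import sympy as sp

S, I, R, u = sp.symbols('S I R u', positive=True)
beta, gamma, rho, mu, c = sp.symbols('beta gamma rho mu c', positive=True)
a1, a2, a3 = sp.symbols('alpha1 alpha2 alpha3')

x = sp.Matrix([S, I, R])
f = sp.Matrix([-beta*S*I + rho*R + mu, beta*S*I - gamma*I, gamma*I - rho*R])  # (105)
g = sp.Matrix([-S, 0, 0])
alpha = sp.Matrix([a1, a2, a3])

# Lie bracket ad_v w (assumed sign convention) and Lie derivatives of g
bracket = lambda v, w: v.jacobian(x) * w - w.jacobian(x) * v
ad_f_g   = bracket(f, g)
ad2_f_g  = bracket(f, ad_f_g)
ad_g_adf = bracket(g, ad_f_g)
Lg_g = g.jacobian(x) * g
Lf_g = g.jacobian(x) * f

C = sp.Matrix.hstack(g, ad_f_g, ad2_f_g + ad_g_adf * u)                      # (111)
D = sp.Matrix([[-c,
                (alpha.T * g)[0],
                (alpha.T * Lg_g)[0] * u + (alpha.T * (Lf_g + ad_f_g))[0]]])  # (113)

print(sp.factor(C.det()))          # expected: -beta**2*gamma*S**3*I**2
lam = sp.simplify(D * C.inv())     # costate as a function of x and u, as in (114)
print(lam)
```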

5 Conclusions

Sufficient conditions under which the solution of a singular optimal control problem can be directly expressed as a state feedback law are provided, allowing its computation by means of a simple forward integration of the system dynamics only. The necessity of a particular structure of the system for satisfying the conditions required by the proposed procedure has been overcome by introducing a dynamic extension on the input. The effectiveness of the approach is shown for the case in which an order-one integrator is sufficient, but the proof of the success of the proposed approach for a generic order can be obtained, for example, by iterative application of the one-dimensional single step.

Acknowledgements. This work was supported by Sapienza University of Rome, Grants No. 191/2016 and No. RP11715C82440B12.

References

1. Athans M, Falb PL (1996) Optimal control. McGraw-Hill Inc., New York
2. Hartl RF, Sethi SP, Vickson RG (1995) A survey of the maximum principles for optimal control problems with state constraints. Soc Ind Appl Math 37:181–218
3. Johnson CD, Gibson JE (1963) Singular solutions in problems of optimal control. IEEE Trans Autom Control 8(1):4–15
4. Bryson AE, Ho YC (1969) Applied optimal control: optimization, estimation, and control
5. Vossen G (2010) Switching time optimization for bang-bang and singular controls. J Optim Theory Appl 144:409–429
6. Fraser-Andrews G (1989) Finding candidate singular optimal controls: a state of art survey. J Optim Theory Appl 60:173–190
7. Di Giamberardino P, Iacoviello D (2017) Optimal control of SIR epidemic model with state dependent switching cost index. Biomed Signal Process Control 31:377–380
8. Bakare EA, Nwagwo A, Danso-Addo E (2014) Optimal control analysis of an SIR epidemic model with constant recruitment. Int J Appl Math Res 3:273
9. Behncke H (2000) Optimal control of deterministic epidemics. Optim Control Appl Methods 21:269–285
10. Di Giamberardino P, Compagnucci L, De Giorgi C, Iacoviello D (2018) Modeling the effects of prevention and early diagnosis on HIV/AIDS infection diffusion. IEEE Trans Syst Man Cybern Syst
11. Ledzewicz U, Schattler E (2011) On optimal singular controls for a general SIR-model with vaccination and treatment. Discrete Continuous Dyn Syst 2:981–990
12. Ledzewicz U, Aghaee M, Schattler H (2016) Optimal control for a SIR epidemiological model with time-varying population. In: 2016 IEEE conference on control applications
13. Di Giamberardino P, Iacoviello D (2018) State feedback optimal control with singular solution for a class of nonlinear dynamics. In: Proceedings of the 15th international conference on informatics in control, automation and robotics (ICINCO 2018), vol 1

Petri Nets Tracking Control for Electro-pneumatic Systems Automation

Carlos Renato Vázquez1(B), José Antonio Gómez-Castellanos2, and Antonio Ramírez-Treviño3

1 Tecnologico de Monterrey, Escuela de Ingenieria y Ciencias, Av. Ramón Corona 2514, 45019 Zapopan, Mexico
[email protected]
2 UTZMG, Carretera Santa Cruz-San Isidro km. 45, 45640 Tlajomulco de Zuñiga, Mexico
3 CINVESTAV Unidad Guadalajara, Av. del Bosque 1145, 45019 Zapopan, Mexico

Abstract. In this work, an output tracking control problem is introduced for electro-pneumatic systems (EN S) modeled by interpreted Petri nets (IP N ). In this framework, IP N models for the most common components in EN S are presented. Later, it is explained how to compute a plant model and a specification model as IP N 's. The specification may allow several actuators to be activated concurrently. Afterwards, an algorithm is introduced to compute the controller and the closed-loop system as IP N 's. It is demonstrated that the proposed controller solves the tracking control problem, ensuring boundedness and deadlock-freeness of the closed-loop system. The resulting controller can be translated into a Ladder Diagram for its implementation with a provided algorithm. The introduced algorithms are illustrated through a case study. The aim of this work is to provide a control framework that is simple and convenient for practitioners in the field of EN S's, but at the same time is based on a formal model for analysis purposes.

1 Introduction

Petri nets (P N ) are a mathematical formalism useful for modelling and analyzing discrete event systems (DES), such as manufacturing systems, automation systems and industrial robotics, among others. The study of P N 's has been particularly intensive for the synthesis of controllers and supervisors. In the literature, different supervision and control methods have been reported for DES's based on Petri nets. The most studied control paradigms are:

– Supervisory-control [8,9,13]. This is the best-known formal paradigm. In this, the system behavior is confined into the specification behavior (both given as languages) by a supervisor agent, which disables controllable events. The supremal controllable language inside the specification is computed in order to synthesize the controller; after that, a DES is computed which generates the supremal controllable language.


– Generalized mutual exclusions [1,7] (GM EC). The controller consists of places (monitors) and arcs that are added to the P N in order to limit the weighted sum of tokens inside some places. By using this technique, the closed-loop system may avoid unsafe states (states in which two activities occur simultaneously) and deadlock states (states in which no transition is enabled).
– Liveness-enforcing [4,11]. Liveness is an important property according to which the firing of any transition in the future evolution is possible from any reachable state. Several works address the synthesis of controllers based on GM EC to guarantee liveness. Since the verification of liveness is an NP problem in the number of nodes, the studies frequently focus on net subclasses, such as S3P R [6], by using mathematical programming [3,10] or structural analysis [12,19] to identify siphons that can lose tokens, and then adding monitor places to prevent these siphons from losing their tokens.

Despite the amount of works reported in the literature regarding control techniques for P N 's, there is a lack of theoretical developments and techniques for the synthesis of controllers for different control objectives. For instance, in the automation of industrial processes, the requirements are sequences of sensor signals, for which actuators must be activated in a certain order; in the control of flexible manufacturing systems, the requirements are processed products, for which certain events must be executed, such as assemblies and machine loads; in the supervision of rail transport systems, the requirements are sequences of vehicle positions, for which the movement of the vehicles is enabled or disabled in a safe manner. In these applications, the system is required to be controlled in such a way that its output is equal to a reference signal. The control paradigms for DES's mentioned above do not address this problem. In fact, the design of controllers for these applications is frequently based on heuristic rules developed by practitioners, without following any standard procedure that guarantees the safe operation of the closed-loop system.

For these problems, the regulation control framework was introduced and studied in [2,15–17]. In these works, the specification and the system to be controlled (the Plant) are interpreted Petri nets (IP N ), in which some transitions are enabled or disabled by means of the application of certain input symbols, and some places have symbols that are observable by external agents. The objective is to design a controller that, for every firing in the specification, executes a sequence of transitions in the Plant so that the output symbols of the specification and the Plant become equal. In fact, the output languages of the plant and specification are equal.

In electro-pneumatic systems (EN S), practitioners usually provide specifications as sequences of signals produced by the actuators' limit position sensors; such sequences are triggered by signals from switches or proximity sensors that detect the presence of parts or machine conditions. Nevertheless, in this kind of specifications, not all EN S signals appear or, even worse, some uncontrollable events may affect the plant, producing sensor signals that are not mentioned in the specification, thus the output languages of the plant and specification cannot be


equal. Hence, the control techniques reported in the literature cannot handle the EN S control synthesis problem for these specifications.

This work addresses the control synthesis of EN S. The framework here introduced, named Tracking Control, is based on the IP N paradigm. First, IP N models are obtained for the components; later, those models are integrated into a single plant model. The specification is represented as an IP N . Some preliminary ideas were presented in the conference paper [18], without formal proofs and restricting the specification to a state machine. However, in this work the specification is not restricted to be a state machine, allowing one to specify the concurrent activation of several actuators, sequences of required sensor and actuator symbols, and selection between different sequences, among other possibilities. The tracking control problem is formally stated and important properties of the closed-loop system, such as boundedness and deadlock-freeness, are demonstrated. Finally, an algorithm to translate the resulting IP N controller into a Ladder Diagram, for its implementation on a P LC device, is introduced.

This paper is organized as follows: the definition of IP N is introduced in Sect. 2. In Sect. 3, models for typical electro-pneumatic components are presented and the specification is defined as an IP N ; in this section the tracking control problem is formally stated. Section 4 presents a control synthesis algorithm which computes the controller and the closed-loop system as IP N 's; furthermore, important properties of the closed-loop system are demonstrated, and an algorithm for translating the resulting IP N controller into a Ladder Diagram is presented. In Sect. 5, a case study is analyzed to illustrate the introduced techniques. Section 6 presents the conclusions.

2 Basic Concepts

In this section, the P N and IP N definitions and some basic concepts are recalled (for more details see [5]).

2.1 Petri Nets

Definition 1. A Petri net (P N ) structure N is a bipartite digraph represented by the 4-tuple N = ⟨P, T, Pre, Post⟩ where P = {p1, p2, ..., pn} is the finite set of places, T = {t1, t2, ..., tm} is the finite set of transitions, and Pre and Post are |P| × |T| matrices representing the weighted (nonnegative integer) arcs going from places to transitions and from transitions to places, respectively. The incidence matrix of N is defined as C = Post − Pre.

Let x ∈ P ∪ T be a node of N . The input set of x, denoted by •x, is defined as •x = {xi ∈ P ∪ T | there exists an arc from xi to x}. Similarly, the output set of a node x, denoted by x•, is defined as x• = {xi ∈ P ∪ T | there exists an arc from x to xi}. A P N is a state machine if each transition has only one input and one output place, i.e., ∀t ∈ T, |•t| = |t•| = 1, and it is strongly connected.


Definition 2. A P N system is the pair ⟨N , M0⟩, where N is a P N structure. The marking function M : P → Z+ is a mapping from each place to the nonnegative integers, representing the number of tokens residing inside each place. The marking of a P N is expressed as a column vector M of length |P|. M0 is the initial marking distribution.

The marking distribution evolves according to the firing of transitions. A transition tj is enabled at marking Mk iff ∀pi ∈ •tj, Mk(pi) ≥ Pre(pi, tj); this is denoted as Mk −tj→. A transition tj can fire if it is enabled. The firing of an enabled transition tj reaches a new marking Mk+1 that can be computed with the so-called P N fundamental equation

Mk+1 = Mk + C vk

where vk(i) = 0 for i ≠ j and vk(j) = 1. This is denoted as Mk −tj→ Mk+1. Graphically, places are represented by circles, transitions by rectangles, arcs by arrows, and tokens are represented as dots or positive integer numbers inside places.

Definition 3. A sequence of transitions σ = ti tj ... tk of a P N system ⟨N , M0⟩ such that M0 −ti→ M1 −tj→ ... −tk→ Mw is said to be fireable, or it is said that σ is a firing transition sequence.

The marking M′ reached after the firing of σ at a marking M can be computed by M′ = M + C σ̄, where σ̄, named the Parikh vector, is a column vector of size |T| such that σ̄(j) = k if tj is fired k times in the sequence σ. This is denoted as M −σ→ M′; M′ is said to be reachable from M. The reachability set of a P N is the set of all the markings reachable from M0, and it is denoted as R(N , M0).

A P N system is said to be bounded if there exists a finite number k such that, for all the reachable markings, each place has at most k tokens, i.e., ∀M ∈ R(N , M0), ∀p ∈ P it holds M(p) ≤ k. A P N system is said to be safe if it is bounded with k = 1. A P N system is said to be live if for any transition tj ∈ T and any reachable marking M ∈ R(N , M0) there exists a fireable sequence σ such that M −σ→ M′ and tj is enabled at M′.
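To make the enabling and firing rules concrete, the following Python sketch implements them for a small hypothetical net (the Pre and Post matrices and the marking below are invented for illustration and do not correspond to any model in this paper).

```python
import numpy as np

Pre  = np.array([[1, 0],     # p0 is input of t0
                 [0, 1],     # p1 is input of t1
                 [0, 0]])
Post = np.array([[0, 1],     # t1 puts a token back in p0
                 [1, 0],     # t0 puts a token in p1
                 [0, 1]])    # t1 also marks p2
C = Post - Pre               # incidence matrix
M0 = np.array([1, 0, 0])     # initial marking

def enabled(M, j):
    # t_j is enabled iff M(p) >= Pre(p, t_j) for every place p
    return bool(np.all(M >= Pre[:, j]))

def fire(M, j):
    if not enabled(M, j):
        raise ValueError(f"t{j} is not enabled at {M}")
    v = np.zeros(Pre.shape[1], dtype=int)
    v[j] = 1
    return M + C @ v         # fundamental equation M' = M + C v

M1 = fire(M0, 0)             # fire t0: token moves from p0 to p1
M2 = fire(M1, 1)             # fire t1: marks p0 and p2
print(M1, M2)                # [0 1 0] [1 0 1]
```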

2.2 Interpreted Petri Nets

In this work, the Interpreted Petri net model will be used, which is an extension of P N 's allowing one to represent input and output symbols [14].

Definition 4. An Interpreted Petri net (IP N ) is a 6-tuple Q = ⟨N , M0, ΣI, ΣO, λ, ϕ⟩ where:


– ⟨N , M0⟩ is a Petri net system;
– ΣI is the input alphabet of the Petri net system, where each element of the set ΣI is an input symbol;
– λ : T → ΣI ∪ {ε} is the labeling function of transitions, with the restriction that nondeterministic inputs are not allowed, i.e., ∀tj, tk ∈ T, j ≠ k, if Pre(pi, tj) ≠ 0, Pre(pi, tk) ≠ 0 and both λ(tj), λ(tk) ≠ ε, then λ(tj) ≠ λ(tk). Here, ε represents a system's internal event;
– ΣO is the output alphabet of the Petri net system, where each element of the set ΣO is an output symbol;
– ϕ : P → 2^ΣO ∪ {ε} is an output function that associates places to output symbols. ε represents a symbol that is not available to an external observer.

The function ϕ can be represented by a |ΣO| × |P| matrix ϕ, in which ϕ(i, j) = 1 if the place pj is associated to the i-th output symbol and ϕ(i, j) = 0 otherwise. Each place may generate more than one output symbol. A place p ∈ P is said to be measurable if ϕ(p) ≠ ε, otherwise it is nonmeasurable. A transition t is said to be controllable if λ(t) ≠ ε, otherwise it is uncontrollable.

The evolution of an IP N is similar to that of the P N system, with the addition that a symbol a ∈ ΣI is indicated if it is activated by an external device (for instance a controller or a user). The following aspects are also considered for the firing of transitions.

– If λ(tj) = ai ≠ ε is indicated and tj is enabled, then tj must fire. If tj is enabled by the marking, but the symbol λ(tj) is not indicated, then tj cannot fire. If λ(tj) = ε and tj is enabled, then tj can fire at any moment.
– At any reachable marking Mk, an external observer reads the symbols associated to the marked places, whose set is denoted as ϕ(Mk).

In this work, it will be assumed that the IP N 's are event-detectable [14], a property that is recalled as follows:

Definition 5. An IP N ⟨N , M0, ΣI, ΣO, λ, ϕ⟩ is said to be event-detectable if the firing of any ti ∈ T can be detected by a change on the output symbols and can be distinguished from the firing of other transitions, i.e., ∀ti ∈ T

– ϕ C(•, ti) ≠ 0
– ∀tj ∈ T \ {ti}, λ(ti) ≠ λ(tj) or ϕ C(•, ti) ≠ ϕ C(•, tj).

IP N models for complex systems can be built from IP N submodels of independent components (i.e., the submodels do not share places, transitions or symbols). This is performed by the synchronous product, defined as follows:

Definition 6. Let Q1 = ⟨N 1, M0^1, ΣI^1, ΣO^1, λ^1, ϕ^1⟩ and Q2 = ⟨N 2, M0^2, ΣI^2, ΣO^2, λ^2, ϕ^2⟩ be two IP N models, where N 1 = ⟨P 1, T 1, Pre1, Post1⟩ and N 2 = ⟨P 2, T 2, Pre2, Post2⟩. The synchronous product results in an IP N Q3 = ⟨N 3, M0^3, ΣI^3, ΣO^3, λ^3, ϕ^3⟩, where N 3 = ⟨P 3, T 3, Pre3, Post3⟩, denoted as Q3 = Q1||Q2. The IP N Q3 is computed as follows:


– The net structure is computed as follows: P 3 = P 1 ∪ P 2 and T 3 = T 1 ∪ T 2, enumerating first the nodes in N 1. Next, define Pre3 = diag(Pre1, Pre2) and Post3 = diag(Post1, Post2), where diag(•, •) is a matrix built with the argument matrices as diagonal blocks, and other entries are null.
– The initial marking is M0^3 = [(M0^1)^T, (M0^2)^T]^T.
– The input and output alphabets are ΣI^3 = ΣI^1 ∪ ΣI^2 and ΣO^3 = ΣO^1 ∪ ΣO^2, respectively.
– The input function is defined as: ∀t ∈ T 3 set λ^3(t) = λ^i(t), where t is a node of T i with i ∈ {1, 2}. Similarly, the output function is defined as: ∀p ∈ P 3 set ϕ^3(p) = ϕ^i(p), where p is a node of P i with i ∈ {1, 2}.

This synchronous product is compatible with the ones previously reported in the literature, since the IP N modules used in the next sections are label disjoint (i.e., two different modules do not share input symbols).
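A minimal numerical sketch of Definition 6 is shown below; the two modules, their markings, labels and symbols are placeholders invented for illustration rather than models taken from the paper.

```python
import numpy as np
from scipy.linalg import block_diag

# Two label-disjoint IPN modules (placeholder data)
Pre1  = np.array([[1, 0], [0, 1]]); Post1 = np.array([[0, 1], [1, 0]]); M01 = np.array([1, 0])
Pre2  = np.array([[1, 0], [0, 1]]); Post2 = np.array([[0, 1], [1, 0]]); M02 = np.array([0, 1])
lam1, lam2 = ['a', None], ['b', None]      # transition labels (None = internal event)
phi1, phi2 = ['A', None], ['B', None]      # place output symbols (None = not measurable)

# Synchronous product: diagonal blocks, concatenated markings and label data
Pre3, Post3 = block_diag(Pre1, Pre2), block_diag(Post1, Post2)
M03  = np.concatenate([M01, M02])
lam3 = lam1 + lam2
phi3 = phi1 + phi2

print(Pre3.shape, M03.tolist(), lam3, phi3)
```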

3 Tracking Control Problem

In the following subsections, a library of IP N models for the most frequently used components in EN S's is proposed, including selectors, push buttons, proximity sensors, actuators and electro-pneumatic valves. Next, the plant model is built from these models. Later, the specification model is defined in Subsect. 3.3. Finally, the tracking control problem is introduced in Subsect. 3.4.

3.1 Electro-Pneumatic Component Models

Figure 1 shows IP N models for different kinds of switches. In these, output symbols are defined in the places that represent a state in which the switch conducts current. Figure 1(a) represents a normally open (NO) switch, whereas Fig. 1(b) represents a normally closed (NC) switch. These switches can be either push-buttons or switches (in such case, t0 represents the event of "push" and t1 the event of "release"), or lock-buttons (in such case, t0 represents the event of "push for the first time" and t1 the event of "push for the second time"). Figure 1(c) represents a single-pole two-throw switch: in one state the current is conducted to throw A, in the other state the current is conducted to throw B. Finally, Fig. 1(d) represents a selector between three throws, A, B or C.

Figure 2 shows the electro-pneumatic symbols for different kinds of proximity sensors (DIN standard, as used in electro-pneumatic diagrams) and their IP N model. All of them are proximity sensors commonly used in EN S's to detect a limit position of an actuator, the presence of a part or a machine condition. These sensors are the pressure sensor (Fig. 2(a)), the capacitive sensor (Fig. 2(b)), the inductive sensor (Fig. 2(c)), the magnetic sensor (Fig. 2(d)), and the optical sensor (Fig. 2(e)). Frequently, these sensors provide two output signals (in Fig. 2 only one terminal is drawn), one NO and one NC, to detect the presence and absence of parts, respectively. For that reason, the IP N model includes two output symbols, B for NC and A for NO.


Fig. 1. [18] IP N models for different kinds of switches.

Transition t0 represents the event "a part is detected" and t1 represents the event "a part is not detected". Reed magnetic sensors (not shown) are used to detect limit positions of pneumatic actuators; these behave as NO switches, thus the model of Fig. 1(a) should be used for these sensors.

Fig. 2. [18] IP N models for different proximity sensors.

Frequently, pneumatic actuators and valves are used in standard assemblies; the most common are shown in Figs. 3 and 4. Figure 3 shows a vacuum actuator assembly, consisting of a 3-ways 2-positions valve, a Venturi nozzle to produce vacuum, a suction cup and a vacuum sensor, which provides an activation signal when vacuum is detected (when a part is grasped). The IP N model of the vacuum assembly represents the valve (nodes p3, p4, t3 and t4) and the actuator (nodes p0, p1, p2, t0, t1 and t2). Place p0 represents the state in which vacuum is not activated, thus the sensor provides a signal B; place p1 represents a transition state, and place p2 represents the state in which a part is grasped, thus the vacuum sensor provides the output signal A. The activations of the valve solenoids are the only controllable events, represented by symbols a and b at transitions t3 and t4, respectively.


Fig. 3. [18] IP N model for vacuum assembly.

Figure 4 represents the most typical valve-actuator assemblies: a double acting actuator controlled by a 5-ways 2-positions valve (a), a spring return actuator controlled by a 3-ways 2-positions spring return valve (b), and a rotary actuator controlled by a 5-ways 2-positions valve (c). Pneumatic grippers (not shown) are usually driven by double acting actuators controlled by 5-ways 2-positions valves, thus they are similar to Fig. 4(a). The same IP N model is valid for all the assemblies. In this, the valve is represented by nodes p3, p4, t4 and t5, and the actuator is represented by nodes p0, p1, p2, t0, t1, t2 and t3. In the assemblies, sensors are located to detect the limit positions, providing the output signals represented by A and B, for the leftmost and rightmost positions, respectively. On the other hand, the activations of the valve solenoids are the only controllable events, represented by symbols a and b (to move to the left and right, respectively) at transitions t4 and t5, respectively. For the case of the spring return

Fig. 4. [18] IP N models for electro-pneumatic assemblies.


valve at Fig. 4(b), the symbol a is defined as the logical negation of b (i.e., the absence of event b).

In addition to the introduced IP N models, the following control function must be stated. It indicates the controllable transition that is needed to turn on a particular sensor (i.e., to reach a marking where a measurable place is marked). This is formalized in the sequel:

Definition 7. The function t→p : P → T, which indicates the controllable transition (if it exists) whose firing leads to the marking of p, is defined as follows: ∀p ∈ P,

– t→p(p) = t, if ϕ(p) ≠ ε and there exists a directed path from a controllable transition t to p which does not contain any other controllable transition or measured place (if more than one of such paths exists, then any of them can be used);
– t→p(p) = t′, where t′ ∈ •p, if either ϕ(p) = ε or there does not exist a path from a controllable transition to p not containing other measured places.

The models of Figs. 1 and 2 do not contain controllable transitions, thus t→p is defined as in the second statement of the previous definition. For the models of Figs. 3 and 4, t→p is defined as follows:

Model in Fig. 3: t→p(p0) = t4, t→p(p1) = t0, t→p(p2) = t3, t→p(p3) = t4, t→p(p4) = t3
Model in Fig. 4: t→p(p0) = t4, t→p(p1) = t1, t→p(p2) = t5, t→p(p3) = t5, t→p(p4) = t4
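For implementation purposes, the map t→p of the two assemblies can be stored as a lookup table; the sketch below uses plain Python dictionaries, with hypothetical string names for the places and transitions of Figs. 3 and 4.

```python
# t->p for the vacuum assembly of Fig. 3 and the valve/actuator assembly of Fig. 4,
# transcribed from the table above (names such as 'p0', 't4' are just labels).
t_to_p_fig3 = {'p0': 't4', 'p1': 't0', 'p2': 't3', 'p3': 't4', 'p4': 't3'}
t_to_p_fig4 = {'p0': 't4', 'p1': 't1', 'p2': 't5', 'p3': 't5', 'p4': 't4'}

# e.g., the controllable transition needed to reach place p2 of the Fig. 3 model
print(t_to_p_fig3['p2'])   # -> t3
```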

3.2 Building the Plant Model

Following the Control Theory terminology, the system to be controlled is named the Plant. A plant IP N model for an EN S can be built by using the synchronous product on the IP N models of the components. Since the labels of the transitions are module disjoint, the plant is a disjoint collection of component submodels (Figs. 1, 2, 3 and 4). In other words, the plant model is a collection of disjoint EN S IP N modules.

Definition 8. The plant is a safe event-detectable IP N Qp = ⟨N p, M0^p, ΣI^p, ΣO^p, λ^p, ϕ^p⟩, where N p = ⟨P p, T p, Pre^p, Post^p⟩, that models the DES to be controlled.

For an EN S with components {c1, ..., cn}, its plant IP N model can be obtained as follows:


– For each component ci, define its IP N model Qci = ⟨N ci, M0^ci, ΣI^ci, ΣO^ci, λ^ci, ϕ^ci⟩, as shown in Figs. 1, 2, 3 and 4, where N ci = ⟨P ci, T ci, Pre^ci, Post^ci⟩, and define its function t→p^ci. Different input and output alphabets must be defined for different components, i.e., ΣI^ci ∩ ΣI^cj = ∅ and ΣO^ci ∩ ΣO^cj = ∅ if j ≠ i.
– The plant's IP N model is computed as Qp = Qc1 || ... || Qcn.
– The function t→p^p of the plant is computed as: ∀p ∈ P p set t→p^p(p) = t→p^ci(p), where p is a node of P ci.

Moreover, ΣO^act ⊆ ΣO^p is defined as the subset of output symbols related to actuators (symbols from the models of Figs. 3 and 4).

3.3 Specification Model

In EN S’s, specifications are a collection of plant output signals sequences. In them, neither the occurrences of internal events nor the reachability of silent states are specified. Each specification sequence indicates partial orders of actuators’ output symbols that are required to occur in the plant (sensor or switch plant output symbols are allowed to occur in any order). In these sequences, a place p must be defined for each instance/time that a plant output symbol o is required; this place must have associated the corresponding symbols. Moreover, it is also possible to specify the simultaneously occurrence of a set or output symbols, by adding a place p in the specification associated to that symbols. In addition, specification sequences may have extra places with associated symbols from sensors and switches, defining guards for the occurrence of sequences, which can be implemented as either a place in the sequence or as self-loop places. In this work, the specification is represented as an IP N . This is formalized as follows: s , Definition 9. An specification is a safe and live IP N Qs = N s , Ms0 , ΣIs , ΣO s s λ , ϕ , with additional marked self-loop places (a self-loop is a place p with P re(p, t) = 1, P ost(p, t) = 1 for a particular transition t). All the transitions are controllable. The output alphabet of the specification is equal to the output p s = ΣO . For any marking Ms ∈ R(N s , Ms0 ) it alphabet of the plant, i.e., ΣO s s holds that ϕ (M ) does not include more than one symbol of the same device (actuator, sensor or switch).

Example 1. For instance, consider an EN S consisting of a valve/double-acting-actuator assembly (the one shown in Fig. 4a), a proximity sensor (shown in Fig. 2) and a push button (shown in Fig. 1(a)). A specification for this system indicates that, when the proximity sensor detects a part, the actuator must complete two operation cycles (extending/returning).


The plant’s model is shown in Fig. 5(a), it is the collection of the IP N models of each EN S component, as expected. The specification is depicted in Fig. 5(b), notice that it is a sequence starting by symbol As and continuing with the two working cycles of the actuator, as the required specification of the example indicates. Since the specification does not mention the plant output symbols Bs and Ab , then they are not relevant for the specification and can occur in any order during the plant’s operation. Moreover the sensor symbol As could occur in any order during the plant’s operation, however, the symbol As is required to be present for starting the sequence.

Fig. 5. [18] IP N models of plant (a) and specification (b) of Example 1.

If the pressing of the push button is additionally required to start the sequence, a self-loop place with one token can be added to t0 with the symbol Ab, associated to the button's closed position (i.e., a place p connected with input and output arcs to t0, having one token and the symbol Ab).

Example 2. Figure 6(a) represents an EN S composed of two valve/double-acting-actuator assemblies and a proximity sensor. The specification for this system requires a certain degree of synchronization. In particular, when the proximity sensor detects a part, the first actuator must move to its extended position; after that, the second actuator moves to its extended position; next, both actuators must return to their initial positions concurrently. This specification is captured by the IP N of Fig. 6(b). Notice that the symbols of the specification places indicate the required position of both actuators. An alternative IP N for this specification is depicted in Fig. 6(c). In this case, the specification places are only associated to those "newly" required symbols in the sequence. This way of modelling can be used when a single sequence can occur in the specification,


however, in case more than one sequence can occur, it is recommended to indicate the required symbols for all the actuators of interest.

3.4 Tracking Control Problem

The tracking control problem is stated as follows:

Definition 10. Let Qp and Qs be the IP N models of the plant and the specification, respectively. The output tracking control problem consists in the computation of a controller H that is a mapping

H : 2^{ΣO^p} × R(N s, M0^s) → 2^{ΣI^p}

such that, for any Mi^s ∈ R(N s, M0^s), the controller H(ϕ^p Mi^p, Mi^s) indicates input symbols {λ^p(tq^1), λ^p(tq^2), ..., λ^p(tq^s)}, where Mi^p is the current plant marking, such that any fireable sequence σa from Mi^p leads Mi^p −σa→ Mj^p, where ϕ^s(Mj^s) ⊆ ϕ^p(Mj^p).

Fig. 6. IP N models of plant (a), first specification (b) and second specification of Example 2.

4 Control Synthesis Algorithm

The control of EN S’s is synthesized to control the actuators’ in a specified way, by activating the corresponding valves, in response to the activation of sensors and switches. At the IP N level this means a proper synchronization of the plant and the specification. The following algorithm provides a formal solution for the tracking control problem whenever the plant and the specification are modelled according to Definitions 8 and 9. In the algorithm it is used the property ϕ(p) = ε iff p is measured, similarly, λ(t) =  iff t is controllable. Algorithm 1. Calculation of the closed-loop IP N and the controller. 1: Input IP N models of the plant Qp and the specification Qs , and the function

t→p .

2: Output IP N models of the closed-loop system Qcl and the controller Qc .

% First, relabel the specification’s places as follows:

s 3: Let ΣO = {o1 , ..., om } be the output alphabet of Qs . Define a new output

s s = {o1 , .., om } and a bijective function Π : ΣO → alphabet for Qs as ΣO s  ΣO such that Π(oi ) = oi . The function is applied to sets of symbols as Π({oi , oj , ..., ok }) = {oi , oj , ..., ok }. 4: Define a new output function ϕs for the specification as follows: ∀p ∈ P s ,  Π(ϕs (p)) if ϕs (p) = ε s ϕ (p) = ε otherwise

5: 6: 7:

8:

% Next, define mirror places in the specification associated to places sharing the same output symbol as follows: s s s s s Initialize P s = P s , Ms 0 = M0 , Pre = Pre and Post = Post . s for every oi ∈ ΣO such that oi ∈ ϕs (pi ) ∩ ϕs (pj ), for some pi , pj ∈ P s , do Define the set of specification places with the output symbol oi as [oi ] = Connect pi to other {pj ∈ P s |oi ∈ ϕs (pi )}. Add a place pi to P s . Pres (ps , tj ) and nodes such that, ∀tj ∈ T s , Pres (pi , tj ) =  ps ∈[oi ] Posts (pi , tj ) = Posts (ps , tj ). ps ∈[oi ]  Define the initial marking of pi as Ms Ms0 (ps ). Therefore, pi 0 (pi ) = ps ∈[oi ]

is marked iff any place p ∈ [oi ] is marked. 9: Define the output symbol ϕs (pi ) = oi . Next, eliminate the symbol oi from all the places in [oi ]. 10: end for 11: Initialize the closed-loop system as Qcl = Qp ||Qs , where N s = P s , T s , s s s s Pres , Posts and Qs = N s , Ms 0 , ΣI , ΣO , λ , ϕ . % Next, add bidirectional arcs to Qcl , between specification’s places and plant’s controllable transitions, and between plant’s measured places and specification’s transitions, as follows:

516

C. R. V´ azquez et al.

12: for every pi ∈ P s such that ϕs (pi ) = ε do 13: Let pp ∈ P p be such that Π(ϕp (pp )) = ϕs (pi ). Let tj = t→p (pp ). 14: if λp (tj ) =  then 15: Define a bidirectional arc between pi and tj , i.e., set Precl (pi , tj ) = 1 16: 17: 18: 19: 20: 21: 22: 23:

24: 25:

and Postcl (pi , tj ) = 1. end if for every transition tu ∈ p•i do Define a bidirectional arc between pp and tu , i.e., set Precl (pp , tu ) = 1 and Postcl (pp , tu ) = 1. end for end for The resulting IP N Qcl represents the closed-loop system. Initialize the IP N of the controller as Qc = Qcl . Eliminate from Qc the plant’s nodes that are only connected to other plant’s nodes, i.e., eliminate places p ∈ P p s.t. Prec (p, t) = Postc (p, t) = 0 ∀t ∈ T s , and eliminate transitions t ∈ T p s.t. Prec (p, t) = Postc (p, t) = 0 ∀p ∈ P s . The resulting IP N Qc represents the controller as a IP N . s s Define the set of marked specification places Pos (Msi ) = {ps o ∈ P |Mi p s (po ) > 0}, then for any current plant marking Mi , the controller function is defined as H(ϕp Mpi , Msi ) = {λp (t1q ), ..., λp (tsq )}, where {t1q , ..., tsq } = Pos (Msi )• ∩ P p .

The previous algorithm is an extension to Algorithm 4.1 presented in [18], by allowing here to consider several symbols at each specification place. The computation of function H is performed to formally accomplish the tracking control problem. However, the computation of H is impractical due to the cardinality of its domain. Instead of that, the IP N Qc provides a compact and convenient representation of the controller. Example 3. Consider the plant of Fig. 6(a) and the specification of Fig. 6(b). By applying the Algorithm 1, the closed-loop model depicted in Fig. 7 is obtained. In this, the plant and specification are drawn in solid line, whereas the nodes and arcs added by the Algorithm 1 are drawn in dashed line. In particular, steps 3–4 relabel the specification symbols, from As , A1, B1, A2 and B2 to As , A1 , B1 , A2 and B2 , respectively. Notice that there are two places with the symbols B1 and A2 , then steps 5–10 define the new places drawn in dashed line with their input and output arcs, one for the symbol B1 and another for the symbol A2 . Steps 12–16 add bidirectional arcs from the places with symbols A1 , B1 , A2 and B2 to the plant’s controllable transitions with symbols a1, b1, a2 and b2, respectively. Steps 17–19 add bidirectional arcs from the plant’s measured places with symbols A1, B1, A2 and B2 to the output transitions of the new places with symbols A1 , B1 , A2 and B2 , respectively.

Petri Nets Tracking Control for Electro-pneumatic Systems Automation


A simulation of the closed-loop system shows that the plant evolves following the specification, i.e., it performs a double cycle after the sensor detects a part. Moreover, notice that the sensor and switch can change their states during the actuator's movement, in accordance with the explanation of Example 1. The controller Qc can be obtained from the closed-loop model by eliminating the plant's nodes that are only connected to other plant's nodes.

4.1 Properties of the Closed-Loop System

Algorithm 1 ensures that the closed-loop system exhibits important properties. This is formally stated in the following proposition.
Proposition 1. Consider a plant IPN model Qp and a specification IPN model Qs, as in Definitions 8 and 9. Consider the closed-loop model Qcl = ⟨Ncl, Mcl0, ΣIcl, ΣOcl, λcl, φcl⟩ as computed by Algorithm 1. The following statements hold:

Fig. 7. IP N closed-loop system of Example 2.



1. Algorithm 1 is well defined, i.e., all the required information is available and the resulting models Qcl and Qc are IPN's.
2. Qcl is bounded.
3. The plant is connected to the specification only through measured places and controllable transitions.
4. The specification cannot evolve until the plant produces the output symbols indicated by the specification.

Proof. The first statement holds directly by construction. For the second statement, notice that, by definition, both Qp and Qs are bounded. Lines 1–10 add places to the specification, but its reachability set is not affected. Qcl is preliminarily defined as Qp || Q̄s in line 11. Next, self-loop arcs are added between plant places (transitions) and specification transitions (places) in lines 12–20, which do not increase the reachability set of Qcl. Thus, Qcl is bounded. The third statement is formally stated as follows: for any p ∈ Pp and t ∈ Tcl \ Tp, if φp(p) = ε then Precl(p, t) = Postcl(p, t) = 0, else Precl(p, t) = Postcl(p, t); for any t ∈ Tp and p ∈ Pcl \ Pp, if λp(t) = ε then Precl(p, t) = Postcl(p, t) = 0, else Precl(p, t) = Postcl(p, t). This holds by construction, since the only arcs linking the plant to the specification are the self-loops added in lines 12–19, connecting measured places and controllable transitions. The fourth statement holds due to the self-loop arcs between measured places and specification transitions added in lines 17–19.

Proposition 2. Consider a plant IPN model Qp and a specification IPN model Qs, as in Definitions 8 and 9. Consider the IPN closed-loop model Qcl as computed by Algorithm 1. The following statements hold:
– Qcl is deadlock-free.
– The controller solves the output tracking problem of Definition 10.

Proof. Both statements are demonstrated by showing that, if a transition ta ∈ Ts is enabled by the specification marking, the plant will evolve to reach a marking that enables ta in the closed-loop system, and the output symbols produced at such a marking include all the output symbols produced by the specification marking. First, notice that Qs and Q̄s have the same reachability set. Moreover, for any Msi such that Ms0 →σ Msi there exists M̄si fulfilling M̄s0 →σ M̄si and Π(φ̄s(M̄si)) = φs(Msi), i.e., Qs and Q̄s produce the same output symbol sequences. Let M̄si be the current marking of the transformed specification. By definition, Qs is live, thus Q̄s is live as well; then there exists a transition ta ∈ Ts whose specification input places •ta ∩ P̄s are marked. On the other hand, in the closed-loop system ta may also have plant input places, due to the self-loop arcs added in lines 17–19. Let us show that such places will eventually be marked, producing the required output symbols. First, consider a place pso ∈ •ta ∩ P̄s. There are three cases:



1. φs(pso) ∈ Π(Σact). In this case, there is a self-loop arc (due to lines 13–16) between pso and the controllable transition to = t→p(ppo), where ppo ∈ Pp fulfills φs(pso) = Π(φp(ppo)), i.e., λ(to) is indicated by the controller and to must fire. By definition of t→p(•), after the firing of to there is a sequence of uncontrollable transitions whose firing will lead to the marking of ppo, and thus the plant will reach a marking Mpi such that φs(pso) ∈ Π(φp(Mpi)). The place ppo is connected through a self-loop arc to ta, due to lines 17–19.
2. φs(pso) ∈ Σs \ Π(Σact). In this case, there are no self-loops between pso and plant transitions. On the other hand, there is a place ppo ∈ Pp such that φs(pso) = Π(φp(ppo)), but ppo is associated with either a sensor or a switch device, which can always evolve in such a way that ppo gets marked, i.e., the plant reaches a marking Mpi such that φs(pso) ∈ Π(φp(Mpi)). The place ppo is connected through a self-loop arc to ta, due to lines 17–19.
3. φs(pso) = ε. In this case, there are no self-loops between pso and plant transitions. Moreover, no self-loop arcs between plant places and ta are added because of this place.

Notice that, by definition of the specification, φs(Msi) does not include two symbols of the same device, and, by definition of the plant, the devices are independent; thus, in the three cases the plant can always reach a marking Mpi such that φ̄s(M̄si) ⊆ Π(φp(Mpi)), i.e., the plant produces all the output symbols indicated by the specification. At Mpi, all the plant places connected to ta through self-loop arcs are marked. Thus, ta is enabled. Since this holds for any transition ta that is enabled by the specification marking, and the specification is live, the closed-loop system is deadlock-free.
Remark 1. The tracking controller is neither a particular case of supervisory-control theory nor of the regulation control framework for a given specification, because in our setting the output language of the controlled system is not included in the specification's output language. It might be possible to modify the specification in order to make the output tracking controller fit the supervisory-control or regulation frameworks; however, such an adaptation would not be trivial and would not fulfil Definition 9.

4.2 Ladder Diagram

The controller IPN can be translated into a Ladder Diagram for its implementation in a PLC device. A Ladder Diagram (LD) is composed of a set of networks (lines), each one having a condition (an array of switches) and an action (an array of



coils). The following algorithm, which is an adaptation of the one introduced in [17], translates the IPN controller to an LD.

Algorithm 2. Translation to a Ladder Diagram.
1: Input: the IPN model of the controller Qc = ⟨Nc, Mc0, ΣIs, ΣOc, λs, φc⟩, where Nc = ⟨Pc, Tc, Prec, Postc⟩, Pc is partitioned into specification places Pcs and plant places Pcp, and Tc is partitioned into specification transitions Tcs and plant transitions Tcp.
2: For each place pi ∈ Pc, an internal coil Coili is defined.
3: Define a network for initializing the internal coils: the condition is a NO switch (associated with a reset push-button); the action is a parallel array of |Pcs| internal coils such that a reset Coili is added for any pi ∈ Pcs s.t. Mc0(pi) = 0, and a set Coili is added for any pi ∈ Pcs s.t. Mc0(pi) = 1.
4: for any transition ti ∈ Tcs do
5: Define a network as follows: the condition is a series array of NO switches, each one associated with a Coilj that fulfills Prec(pj, ti) > 0 (pj can be either a specification or a plant place); the action is a parallel array of internal coils such that a reset Coilj is added if Prec(pj, ti) > 0 and a set Coilj is added if Postc(pj, ti) > 0.
6: end for
7: for any place pi ∈ Pcp do
8: Define a network as follows: the condition is a NO switch associated with an external input connected to the sensor signal φc(pi); the action is a simple coil associated with Ci.
9: end for
10: for any transition tj ∈ Tcp do
11: Define a network as follows: let pi ∈ Pcs be such that Prec(pi, tj) > 0; the condition is a NO switch associated with Ci, and the action is a simple coil associated with an external output connected to the actuator signal λc(tj).
12: end for
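As an illustration only, the Python sketch below flattens a controller IPN into the list of networks prescribed by Algorithm 2; the data structures and signal names are hypothetical, and an actual implementation would emit vendor-specific LD code rather than Python dictionaries.

# Minimal sketch of Algorithm 2: one initialisation network, one network per
# specification transition, one per measured plant place, one per plant transition.
def ipn_to_ladder(places_spec, places_plant, trans_spec, trans_plant,
                  pre, post, m0, phi, lam):
    networks = []
    # step 3: reset/set internal coils on a reset push-button
    init_action = [("set" if m0[p] else "reset", f"Coil_{p}") for p in places_spec]
    networks.append({"condition": ["NO reset_button"], "action": init_action})
    # steps 4-6: one network per specification transition
    for t in trans_spec:
        cond = [f"NO Coil_{p}" for p in places_spec + places_plant if pre.get((p, t), 0) > 0]
        act = ([("reset", f"Coil_{p}") for p in places_spec if pre.get((p, t), 0) > 0] +
               [("set", f"Coil_{p}") for p in places_spec if post.get((p, t), 0) > 0])
        networks.append({"condition": cond, "action": act})
    # steps 7-9: sensor-driven plant places; steps 10-12: actuator outputs
    for p in places_plant:
        networks.append({"condition": [f"NO input {phi[p]}"], "action": [("coil", f"Coil_{p}")]})
    for t in trans_plant:
        p = next(q for q in places_spec if pre.get((q, t), 0) > 0)
        networks.append({"condition": [f"NO Coil_{p}"], "action": [("output", lam[t])]})
    return networks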

Example 4. Consider the closed-loop system of Fig. 7. The controller can be obtained by removing the plant nodes that are not connected to specification nodes. The LD of the resulting system is depicted in Fig. 8.

Petri Nets Tracking Control for Electro-pneumatic Systems Automation

521

Fig. 8. Ladder Diagram for the controller Qc of Example 2. The inputs I0 , I1 , I2 , I3 , I4 and I5 of the PLC are connected to a NO reset push-button, the NO terminal of the proximity sensor As , and the actuator’s limit sensors A1 and B1 of actuator 1 and A2 and B2 of actuator 2, respectively. The outputs Q0 , Q1 , Q2 and Q3 of the PLC are connected to the electro-valve’s coils corresponding to a1 and b1 of the actuator 1 and a2 and b2 of the actuator 2, respectively.

5 Case Study

Consider the ENS depicted in Fig. 9, composed of one pneumatic arm, two stamping machines, one part dispenser, a conveyor and a start lock-button. The required ENS functionality is the following. The pneumatic arm retrieves a part from the dispenser and loads it in the stamping machine M1 or M2, depending on

Fig. 9. [18] EN S for case study.



Fig. 10. [18] Actuators for the electro-pneumatic arm.

which is idle. Every time a machine is loaded, it stamps and forms the part. Whenever a machine M1 or M2 finishes its work, the pneumatic arm unloads it, placing the finished part on the conveyor. The cycle is repeated every time the dispenser has a part. The dispenser has a proximity sensor to indicate that it holds a part (position As1) or that it is empty (position Bs1); machines M1 and M2 have proximity sensors to indicate that the machine is idle without a part (positions Bs2 and Bs3, respectively) or that they finished a part that must be unloaded (positions As2 and As3, respectively). In this section, the controller for the pneumatic arm will be synthesized. Its electro-pneumatic diagram is shown in Fig. 10. Every axis X, Y, Z of the pneumatic arm has an electro-pneumatic assembly. The specification for the arm is composed of four sequences, representing when the pneumatic arm loads machine M1, loads M2, unloads M1 and unloads M2. Qualitatively, every sequence is split into five subsections: the arm moves to the picking position; the effector (suction cup) turns on; the arm moves to the placing position; the effector turns off; and the arm returns to its home position. The initial state of the arm is its home position. The specification sequences are the following:
– unloading M1 and placing the part on the conveyor: Seq1 = B2 B3 A4 A3 A2 B1 B4 A1
– unloading M2 and placing the part on the conveyor: Seq2 = B1 B2 B3 A4 A3 A2 B4 A1
– retrieving a part from the dispenser to load M1: Seq3 = B3 A4 A3 B2 B4 A2
– retrieving a part from the dispenser to load M2: Seq4 = B3 A4 A3 B1 B2 B4 A2 A1



Fig. 11. [18] Closed-loop system Qcl of the electro-pneumatic arm.

The controller is designed according to Algorithm 1 presented in Sect. 4. Figure 11 presents the closed-loop behavior. The specification is represented by the nodes in the central area drawn with solid lines; the four sequences are indicated. According to the proposed controller design methodology, some guards are added to transitions in the sequences: the first transition of Seq1 must be guarded by As2 (i.e., M1 finished its work), thus the self-loop place As2 is added to this transition; the first transition of Seq2 must be guarded by As3 (i.e., M2 finished its work), thus the self-loop place As3 is added to this transition; the first transition of Seq3 must be guarded by Bs2 (i.e., M1 is idle), thus the self-loop place Bs2 is added to this transition; and the first transition of Seq4 must be guarded by Bs3 (i.e., M2 is idle), thus the self-loop place Bs3 is added to this transition. In addition, the last two sequences are also guarded by As1, requiring that a part is available in the dispenser. The start of the four sequences is guarded by As0, the signal of the start lock-button. In Fig. 11, the dashed circles with their input and output arcs are the places added at steps 5–12 of Algorithm 1 for mirror places with the same symbols (symbols in brackets {•} represent original symbols that are removed by step 9). Bidirectional thick dashed arcs represent synchronizations between the plant's output places and the specification's transitions, obtained at steps 19–21 of the Algorithm. For clarity of presentation, the bidirectional arcs between the specification's output places and the plant's controllable transitions are not drawn (they are computed at steps 15–18); however, those arcs are indicated in the table in Fig. 11. The corresponding controller can be computed by eliminating from Qcl the nodes of the plant that are not connected to the specification (i.e., all the plant's



nodes without symbols). Finally, the resulting controller can be translated to a Ladder Diagram for its implementation in a PLC as explained in [17].

6 Conclusions

The tracking control problem for ENS's was formally introduced in this work. For this, an IPN plant model is built from a collection of individual ENS models, leading to a straightforward modeling process. The specification is modeled as an IPN, allowing the specification of sequences of actuator activations, selections between different sequences, concurrent activation of different actuators, and sensors and switches as guards, among other behaviors. An algorithm for the synthesis of the controller and the closed-loop system as IPN's was provided. Moreover, it was demonstrated that the closed-loop behavior exhibits important properties, such as deadlock-freeness and boundedness. An algorithm for the translation of the computed IPN controller into a Ladder Diagram for its implementation was also provided. The proposed approach was illustrated through an application example.

Acknowledgements. The research leading to these results has received funding from the Conacyt Program for Education, project number 288470.

References

1. Basile F, Cordone R, Piroddi L (2013) Integrated design of optimal supervisors for the enforcement of static and behavioral specifications in Petri net models. Automatica 49(11):3432–3439
2. Campos-Rodríguez R, Ramírez-Treviño A, López-Mellado E (2004) Regulation control of partially observed discrete event systems. In: IEEE international conference on systems, man and cybernetics, The Hague, Netherlands
3. Chao DY (2009) Direct minimal empty siphon computation using MIP. Int J Adv Manuf Technol 45(3–4):397–405
4. Chen YF, Li ZW, Khalgui M, Mosbahi O (2011) Design of a maximally permissive liveness-enforcing Petri net supervisor for flexible manufacturing systems. IEEE Trans Autom Sci Eng 8(2):374–393
5. David R, Alla H (2010) Discrete, continuous, and hybrid Petri nets. Springer
6. Ezpeleta J (1995) A Petri net based deadlock prevention policy for flexible manufacturing systems. IEEE Trans Robot Autom 11(2):173–184
7. Giua A, DiCesare F, Silva M (1992) Generalized mutual exclusion constraints on nets with uncontrollable transitions. In: IEEE international conference on systems, man and cybernetics, Chicago, USA
8. Holloway LE, Krogh BH, Giua A (1997) A survey of Petri net methods for controlled discrete event systems. Discrete Event Dyn Syst 9(2):151–190
9. Iordache M, Antsaklis P (2005) A survey on the supervision of Petri nets. In: International conference on Petri nets, Miami, USA
10. Li Z, Liu D (2007) A correct minimal siphons extraction algorithm from a maximal unmarked siphon of a Petri net. Int J Prod Res 45(9):2163–2167
11. Li Z, Wu N, Zhou M (2012) Deadlock control of automated manufacturing systems based on Petri nets - a literature review. IEEE Trans Syst Man Cybern Part C 42(4):437–462



12. Li Z, Zhou M (2008) On siphon computation for deadlock control in a class of Petri nets. IEEE Trans Syst Man Cybern Part A Syst Hum 38(3):667–679
13. Ramadge PJ, Wonham WM (1987) Supervisory control of a class of discrete event processes. SIAM J Control Optim 25(1):206–230
14. Ramírez-Treviño A, Rivera-Rangel I, López-Mellado E (2003) Observability of discrete event systems modeled by interpreted Petri nets. IEEE Trans Robot Autom 19(4):557–565
15. Ramírez-Prado G, Santoyo A, Ramírez-Treviño A, Begovich O (2000) Regulation problem in discrete event systems using interpreted Petri nets. In: IEEE international conference on systems, man and cybernetics, Nashville, USA
16. Sánchez-Blanco F, Ramírez-Treviño A, Santoyo A (2004) Regulation control in interpreted Petri nets using trace equivalence. In: IEEE international conference on systems, man and cybernetics, The Hague, Netherlands
17. Santoyo A, Jiménez-Ochoa I, Ramírez-Treviño A (2001) A complete cycle for controller design in discrete event systems. In: IEEE international conference on systems, man and cybernetics, Tucson, USA
18. Vázquez CR, Gómez-Castellanos JA, Ramírez-Treviño A (2018) Tracking control of electro-pneumatic systems based on Petri nets. In: 15th international conference on informatics in control, automation and robotics, Porto, Portugal
19. Wang S, Wang C, Zhou M, Li Z (2012) A method to compute strict minimal siphons in a class of Petri nets based on loop resource subsets. IEEE Trans Syst Man Cybern Part A Syst Hum 42(1):226–237

18O Isotope Separation Process Control

Vlad Mureşan(&), Iulia Clitan, Valentin Sita, Mihail Abrudean, and Mihaela-Ligia Ungureşan Technical University of Cluj-Napoca, 400114 Cluj-Napoca, Romania {Vlad.Muresan,Valentin.Sita}@aut.utcluj.ro

Abstract. The paper presents an original solution for the nonlinear Gaussian control of the 18O isotope concentration at the output of a separation column. The design of the nonlinear controller is based on the proposed separation column mathematical model which is an accurate one. In order to highlight the advantages of using a nonlinear controller, the comparison with using a linear Proportional Integral Derivative controller is made. Both the architecture and the tuning procedure of the nonlinear controller represent original elements introduced in the paper. It is proved, through simulation, that the nonlinear controller improves significantly the control system performances, especially the settling time, which is a dominant parameter in analyzing the separation processes work. Also, a solution to determine the instantaneous value of the separation column length constant is proposed, solution which opens the possibility to implement new control strategies. Keywords: 18O isotope  Nonlinear Concentration Controller  Gaussian function  Neural network  Concentration  Mathematical model  Separation process

1 Introduction The separation column, together with the mandatory refluxing system’s equipment used for the 18O isotope separation process is depicted in Fig. 1 [1]. In this case study, the 18O isotope separation is obtained by means of isotopic exchange in the system made up of NO, NO2-H2O and HNO3 [2]. In Fig. 1, the authors have noted FSC as the Final Separation Column (the term “Final” relates to the fact that this specific column operates as the end column of a separation cascade which consists of two separation columns). The aim of this paper is to design a control strategy for the FSC when it operates independently from the separation cascade. The results of this research activity will be further used in order to find an appropriate method to control the entire separation cascade. In the following, we describe the constructive details of the FSC, shown in Fig. 1. The height is h = 10 m (the cascade is composed of 5 sectors (S1 – S5), each having equal heights of h/5 = 2 m). The corresponding sectors contain steel packing [3], hence the hatching. The column diameter (d) equals 14 mm. The process of the 18O isotope separation is based on the circulation, in counter-current, of the nitric oxides (NO, NO2)

© Springer Nature Switzerland AG 2020 O. Gusikhin and K. Madani (Eds.): ICINCO 2018, LNEE 613, pp. 526–551, 2020. https://doi.org/10.1007/978-3-030-31993-9_26

Fig. 1. The separation column for the 18O isotope and the equipment of the refluxing system.

and the nitric acid (HNO3) – solution, introduced in the lower part, respectively in the upper part, of the column. In the above figure, the input flow of nitric oxides is represented as Fi and the output flow of nitric acid is represented by FW, as the isotopic waste of the separation process. Fo represents the output flow of the nitric oxides from FSC which is circulated to the arc-cracking reactor (ACR – containing steel packing [3]). The result gathered at the ACR’s output is the flow FN of both nitrogen (N2) and nitric oxides (NO, NO2), which has an increased concentration of NO2. Inside the absorber A, the process of nitric oxides absorption in water takes place due to the ceramic packing, generating the nitric acid solution that supplies the upper part of FSC (thus, FH2O represents the supplied water flow and FA represents the flow of nitric acid solution). The amount of nitrogen and nitric oxide from the absorber’s output, the flow FNN (NO, N2), is sent to the catalytic reactor CR (also containing steel packing [3]). Here, the water supplied to the absorber is the result of nitric oxide (NO) reacting to the hydrogen (H2), as the CR is supplied with the hydrogen flow FH. The excess of nitrogen and hydrogen is released in the form of FNH. Eventually, from the top of FSC, the product is extracted in the form of nitric acid with an increase concentration of 18O isotope (HN18O3). The concentration of the 18O isotope at the output of the separation column is measured using a mass spectrometer.



The pipe designated for the product flow (FP) is represented graphically by a dashed line, since this paper's research deals only with the total reflux working regime (thus, FP = 0).

2 Modeling of the Isotope Separation Process

Throughout the separation exchange, the main output signal (in this case the 18O isotope concentration) depends on two independent variables; thus, the 18O isotope separation process is considered to be a distributed parameter process [4, 5]. The independent variables are represented by time (t) and the position along the FSC height (s), highlighted by the (0s) axis (see Fig. 1). The origin (0) of the (0s) axis is given by the centre of the Column Base Section (CBS). The independent variable (s) starts from the column base and increases its value towards the top, having a maximum value at s = h, corresponding to the Column Top Section (CTS). The main input signal for this process is represented by the input flow of nitric oxides Fi(t) and, as previously mentioned, the main output signal is represented by the 18O isotope concentration, denoted y(t, s). The mathematical modeling is made under the hypothesis that the FSC operates in total reflux regime (FP = 0). This regime is used in the first part of the plant operation, aiming at increasing the 18O isotope concentration until the imposed value at the column top is reached. The specified working regime has been selected by the authors since it is the most relevant for the FSC dynamics. The mathematical model which describes with high accuracy the functioning of the FSC was previously proposed in [6] and is briefly presented in the following equations, starting with the final form of the output signal:

y(t, s) = y0 + yf(t) · (Fs1(s) − y0)/(Fs1(s = sf = h) − y0) = y0 + yf(t) · (e^(s/S) − 1)/(e^(sf/S) − 1)    (1)

where y0 = 0.204% represents the natural abundance of the 18O isotope. The function that models the separation process dynamics in relation to time (t) is yf(t), the solution of the ordinary differential equation

dyf(t)/dt = −(1/T(Fi)) · yf(t) + (1/T(Fi)) · uf    (2)

where T(Fi) represents the time constant of the isotope exchange process (which depends on the input flow Fi(t)) and uf(t) represents the steady-state increase of the 18O concentration over the y0 value. The differential equation is solved considering that the value of the time constant is first singularized for a certain (Fi). The linear variation of the time constant in relation to the input signal (T(Fi)) is given by

T(Fi) = T2 + KT · (Fi − F2)    (3)


where F2 = 185 l/h (implying the time constant T2 = 84.703 h, experimentally determined) and KT = −3.9262 h²/l represents the gradient of the ramp T(Fi). The form of the uf(t) signal is presented below:

uf = y(tf, s = sf = h) − y0 = y0 · (S(Fi) − 1) = y0 · (a^ntp(Fi) − 1)    (4)

where t = tf is the final value of time (given by reaching the steady-state regime) and s = sf is the final value of the independent variable (s) (corresponding to the top of the column). Also, the term S(Fi) = a^ntp(Fi) represents the separation, a function that depends on the input signal. In the separation equation, a = 1.018 represents the 18O isotope elementary separation factor for the isotopic exchange procedure in the system NO, NO2-H2O, HNO3. The theoretical plates number is given by the equation

ntp(Fi) = h / HETP(Fi)    (5)

where HETP(Fi) is the Height of Equivalent Theoretical Plate. Both the ntp and HETP functions depend on the input signal (Fi). The HETP function has a linear variation over small intervals, but a nonlinear variation over the entire domain of Fi values, and it is described by the system of equations given below:

HETP(Fi) = HETP0 + KH1 · (Fi − F0), if Fi ≤ 140 l/h
HETP(Fi) = HETP0 + KH2 · (Fi − F0), if Fi > 140 l/h    (6)

where HETP0 = 8.6 cm is obtained for the input flow F0 = 140 l/h; KH1 = −0.08 cm·h/l for Fi ≤ 140 l/h and KH2 = 0.0333 cm·h/l for Fi > 140 l/h represent the gradients of the two experimentally determined ramps described by (6). Based on the above, the function HETP(Fi) is plotted in Fig. 2 [1]. The use of the two equations from (6) is simplified by training a neural network to learn the behaviour of the HETP(Fi) function. The neural network in use is a feed-forward fully connected one [7, 8], having one input signal (Fi), one hidden layer (containing 20 nonlinear neurons with hyperbolic tangent activation function) and one output signal (HETP(Fi)) (produced by a linear neuron). Following the training process, the neural network's output (the HETP(Fi) function) results as

HETP(Fi) = WL · tanh(W · Fi + B) + bo    (7)
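For illustration, the piecewise-linear map (6) can be reproduced by such a one-hidden-layer tanh network. The sketch below uses scikit-learn instead of the authors' MATLAB toolbox and is only meant to show the structure assumed in (7); the slope values follow the text above.

# Illustrative fit of a 20-neuron tanh network to HETP(Fi) from (6).
import numpy as np
from sklearn.neural_network import MLPRegressor

F0, HETP0, KH1, KH2 = 140.0, 8.6, -0.08, 0.0333   # l/h, cm, cm*h/l

def hetp(fi):
    return HETP0 + np.where(fi <= F0, KH1, KH2) * (fi - F0)

Fi = np.linspace(80, 200, 400).reshape(-1, 1)
net = MLPRegressor(hidden_layer_sizes=(20,), activation="tanh",
                   solver="lbfgs", max_iter=5000, random_state=0)
net.fit(Fi, hetp(Fi).ravel())
print(net.predict([[120.0], [160.0]]), hetp(np.array([120.0, 160.0])))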

where the training solutions are: WL - the row vector which contains the input weights, W - the column vector which contains the layer weights, B - the column vector which contains the bias values of the neurons from the hidden layer, and bo - the output neuron bias value. The authors used the notation "tanh" to represent the hyperbolic tangent function. The function FS(s) = (e^(s/S) − 1)/(e^(sf/S) − 1) given in (1) highlights the proportion of the 18O isotope concentration at a certain position along the FSC height in relation to the



Fig. 2. The HETP dependency in relation to Fi (HETP [cm] vs. Fi [l/h]).

isotope concentration at the top of the column, meaning that FS(s) models the concentration evolution along the column height. The FS(s) function is derived from the function Fs1(s) = y0 · e^(s/S) (the approximation of the 18O isotope concentration distribution along the FSC height for a certain (Fi), and implicitly for a certain HETP(Fi)). In Fs1(s), (S) is the "length" constant of the column, which depends on the input signal Fi. The function S(Fi) is given by the system:

S(Fi) = S1 + KS1 · (Fi − F1), if Fi ≤ 140 l/h
S(Fi) = S2 + KS2 · (Fi − F2), if Fi > 140 l/h    (8)

where S1 = 751.124 cm is obtained for F1 = 80 l/h and S2 = 565.5 cm is obtained for F2 = 185 l/h. The gradients of the two ramps from (8) are experimentally determined as KS1 = −4.4804 cm·h/l for Fi ≤ 140 l/h and KS2 = 1.8489 cm·h/l for Fi > 140 l/h. From (8), it can be remarked that S(Fi) has a linear evolution over intervals, but a nonlinear variation over the entire domain of Fi values. The same neural structure as in the case of the HETP(Fi) function is used to learn the S(Fi) function, but with 50 neurons in the hidden layer. From the training process, the neural network output (the S(Fi) function) results as

S(Fi) = WL · tanh(W · Fi + B) + bo    (9)

where WL, W, B and bo notations have the same meaning as above, noting that they are the solutions of the S(Fi) function’s neural network training.
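The static part of the model can be exercised directly. The following sketch (illustrative only, using the numerical constants quoted above) evaluates the steady-state concentration profile along the column for a given input flow.

# Steady-state profile from (1), (4)-(6) and (8); constants as quoted in the text.
import numpy as np

y0, alpha, h = 0.204, 1.018, 1000.0            # %, -, cm

def hetp(fi):                                   # (6)
    return 8.6 + (-0.08 if fi <= 140 else 0.0333) * (fi - 140.0)

def s_const(fi):                                # (8), column "length" constant
    return 751.124 - 4.4804 * (fi - 80.0) if fi <= 140 else 565.5 + 1.8489 * (fi - 185.0)

def steady_profile(fi, s):
    ntp = h / hetp(fi)                          # (5)
    uf = y0 * (alpha ** ntp - 1.0)              # (4), steady-state increase
    S = s_const(fi)
    shape = (np.exp(s / S) - 1.0) / (np.exp(h / S) - 1.0)   # F_S(s) in (1)
    return y0 + uf * shape                      # (1) with y_f replaced by u_f (t -> inf)

s = np.linspace(0.0, h, 6)
print(steady_profile(150.255, s))               # ends near 1.5 % at s = h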


3 The Proposed Control Strategy

The control structure proposed for the 18O isotope concentration is a complex one and is depicted in Fig. 3. The SDPP block in Fig. 3 corresponds to the Separation Distributed Parameter Process, which has as input signal Fi(t) (the flow of nitric oxides) and also depends on the independent variable (s). The output signal of the SDPP block represents the 18O isotope concentration y(t, s). This output signal is not yet affected by the disturbance.

Fig. 3. The proposed control structure [1].

The main disturbance signal (yd(t)) is added, and it influences directly the value of SDPP output’s signal. This implies that the resulted value of the 18O isotope concentration (meaning the final output signal) is given by y1(t, s) = y(t, s) + yd(t). The DD block represents the disturbance delay element. The role of this block is to model the disturbance propagation into the process and its input yd0, is the steady state value of the disturbance. The main disturbance in the control system is represented by the product extraction flow (FP), since it can be seen as a disturbance signal due to the fact that extracting (HN18O3) from the separation column, with an increased concentration of 18O, leads to a decrease in the 18O isotope concentration in the column. The CS block represents the transducer, the concentration sensor (a mass spectrometer) used to measure the concentration of 18O isotope, and it generates the feedback signal noted as r1(t). The automation equipment [9, 10] from the proposed control system uses the



range 4–20 mA for current signals, corresponding to the unified current’s range An example of such a signal is represented by r1(t). The actuating signal (Fi(t)) is generated by the actuator (the nitric oxides pump P). The MSFP Block represents the reference Model of the System Fixed Part (consisting of mathematical models of elements such as: the pump P, the SDPP, the sensor CS, elements that have a serial connection). MSFP is run on a process computer parallel to the real process, having its initial behavior not affected by the exogenous disturbances (yd(t) = 0) and also, not affected by parametric disturbances. MSFP also integrates the SDPP, by implementing the mathematical model presented in Sect. 2. Also, the mathematical models of P and CS are integrated in MSFP by implementing for each a neural network. The two neural networks have nonlinear autoregressive structure with exogenous inputs (NARX), each containing 9 linear neurons in the single hidden layer and one linear neuron in the output layer, respectively they have, each, two unit delays (one on the input signal and one on the output signal, due to the fact that P and CS are first order systems). The main concentration controller NCC (Nonlinear Concentration Controller) processes the main error signal e1(t) = w(t) – r1(t) (where w(t) is the concentration setpoint signal), and generates the main control signal noted as c1(t). The secondary error signal e2(t) = r1(t) – r2(t), where r2(t) is the feedback signal generated at the output of MSFP. This feedback signal represents the measure of the effects of all disturbances, exogenous and parametric, which occur in the system. Thus, e2(t) becomes a measure of the deviation of the output’s signal value in relation to its reference value. For proper comparison between the real plant behaviour (referring to the fixed part) and its reference model behaviour, the same input signal cf(t) must be applied to both blocks. There are two types of possible parametric disturbances, such as small variations in relation to time of the separation column structure parameters, or small variations of the independent variable (s). The latter may occur due to a shift of the CS position and of the product extraction point, in relation with the column height, thus the reference model is simulated for s = sf. The disturbances compensator block (DC), of Proportional – Derivative type (PD), computes the error signal e2(t) and generates the necessary compensation control signal c2(t). The total control signal c3(t) is a composed control signal, such as c3(t) = c1(t) – c2(t). The final control signal cf(t) results as cf(t) = c0 – c3(t), where c0 represents the value proportional to the value of the reference flow, in unified current. The reference flow is referring to the initial flow at which the separation column starts to operate. From (3) it results that the lowest value of the separation column’s time constant is obtained for the flow F2. Consequently, in order to generate an acceptable value of the concentration control system settling time, the reference flow is fixed at the value of F2. This means that other values could be chosen for the reference flow (such as F1), however this would generate slower control systems, since it increases the settling time. The element IF (Initial Flow) generates the value F2, scaled to the c0 value using the proportionality constant K1. In case of the real separation plant, the following elements from Fig. 
3: NCC, DC, MSFP, LIVIE, PM, IF, K1, NA and K2 have to be implemented on numerical equipment. Since the proposed control structure is a complex one, to not further complicate it in a non-understandable manner, the analog to digital and the digital to analog converters are not figured. The role of such blocks is to connect the numerical equipment to the separation plant to be controlled.


4 Controllers Tuning and Simulation Results

The computer simulations presented in this paper are carried out in MATLAB/Simulink. The NCC tuning is made neglecting the disturbance action (meaning yd0 = 0 and s = sf) and also without considering the DC control action (c2(t) = 0). Moreover, in the primary tuning procedure, the NCC controller is singularized as a linear Proportional Integral Derivative one (PID CC - PID Concentration Controller). Due to the fact that the isotope separation process, besides being a distributed parameter process, is a strongly nonlinear one, the relay method is used for the PID CC tuning. This method is based on generating an oscillatory variation of the output signal y(t, s); for this to occur, the PID CC controller is replaced with a two-position relay. The relay characteristics are: the output commutes between the value bmin = 0 mA and the value bmax = 6.4 mA (without considering the initial value of 4 mA for the unified current signals), and the hysteresis is set to ±1 mA. The simulation is carried out for a step-type setpoint signal (proportional to the imposed value of 1.5% for the output, in this case the 18O isotope concentration). The resulting control system response (the variation in time of the y(t, s) signal) is presented in Fig. 4 [1].

Fig. 4. The control system response in the case of the relay tuning method application (18O concentration [%] vs. time [h]; TOSC denotes the oscillation period and A the peak-to-peak oscillation amplitude).

Fig. 4 highlights that oscillations with constant amplitude are obtained after 1000 h from the simulation start. From Fig. 4, the oscillation period TOSC = 552 h and the oscillation amplitude AOSC = A/2 = 0.0907% result. Using the values obtained for TOSC and AOSC, the parameters of the PID CC controller can be computed quite easily using the Ziegler-Nichols equations. The PID controller transfer function is



HCC(s) = KCC · (1 + 1/(TICC·s) + TDCC·s/(TfCC·s + 1))    (10)

where KCC is the controller proportionality constant, TICC is the controller integral time constant, TDCC is the controller derivative time constant and TfCC is the time constant of the first order filter used to make the PID controller feasible. The controller parameters values that result from Ziegler – Nichols equations, are the benchmarks for fine tuning the controller, in order to improve the control performances. The fine tuning consists in modifying the initial form of the obtained controller parameters and testing the control performances variation, until the best set of performances is obtained. At the tuning procedure end, the following parameters are obtained: KCC = 0.358, TICC = 197.143 h, TDCC = 33.12 h and TfCC = 0.625 h. If the setpoint concentration is set to 1.5%, the comparative graph between the separation process open loop response and the control system responses for the cases of using the best PI controller that could be obtained, respectively using the proposed PID CC controller, is presented in Fig. 5 [1]. In Fig. 5, the notation CSt signifies the minimum concentration value of the steady state band near the steady state value of the output signal (the steady state band is fixed at ±2%, more exactly CSt = 1.47%).
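For reference, a discrete-time realisation of the structure (10), using the tuned parameter values listed above, can be sketched as follows. This is an illustrative implementation, not the authors' Simulink model, and the sampling period is an assumption.

# Minimal discrete PID with filtered derivative, cf. (10).
class FilteredPID:
    def __init__(self, K=0.358, Ti=197.143, Td=33.12, Tf=0.625, dt=0.1):
        self.K, self.Ti, self.Td, self.Tf, self.dt = K, Ti, Td, Tf, dt
        self.integral = 0.0
        self.d_state = 0.0       # state of the first-order derivative filter
        self.e_prev = 0.0

    def step(self, e):
        self.integral += e * self.dt
        # filtered derivative Td*s/(Tf*s + 1), explicit Euler discretisation
        self.d_state += self.dt / self.Tf * (-self.d_state + self.Td * (e - self.e_prev) / self.dt)
        self.e_prev = e
        return self.K * (e + self.integral / self.Ti + self.d_state)

pid = FilteredPID(dt=0.1)                      # time unit: hours (assumed)
print([round(pid.step(1.0), 4) for _ in range(3)])   # response to a unit error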

Fig. 5. The comparative graph between the open-loop separation process response and the control system responses using two different controllers (PID CC and PI); 18O concentration [%] vs. time [h].

From Fig. 5, it results that in all three simulation cases, the steady state error est = 0%, due to the fact that, in steady state regime the three curves reach the imposed value for 18O concentration (yst = 1.5%). Also, in the case of using both controllers, a negligible overshoot occurs, having the value smaller than 1% (the maximum imposed limit). The PID CC controller generates an overshoot of r1 = 0.94%, insignificantly


higher than r2 = 0.9% (generated by the PI controller). Obviously, in the case of simulating the separation process open loop response, the overshoot is r3 = 0%. Due to the fact that, in all three cases, the overshoot is included in the steady state band, the settling time (which is a critical performance) is determined as the first moment when the curves plotted in Fig. 5 reach the CSt value. From Fig. 5, the fact that the PID CC controller generates the smallest settling time ts1, results. In order to distinguish the three values of the settling times, a detailed view of Fig. 5, around the CSt value is depicted in Fig. 6 [1]. In Fig. 6, the fact that the PID CC controller generates the best settling time (ts1 = 719.58 h) is highlighted. It can be remarked that ts1 < ts2 < ts3 (ts2 = 740.21 h and ts3 = 819.92 h). Consequently, the PID CC controller’s usage improves significantly the separation plant settling time, leading to important economic advantages. Also, using the PID CC controller, even better settling times were obtained in simulations, but only if the actuating signal exits from the allowed variations limits. Practically, ts1 represents the best settling time that could be obtained if the actuating signal (Fi(t)) remains enclosed between the saturation limits of the unified current’s range. The open loop separation process simulation is made by simulating the mathematical model presented in Sect. 2, for the input flow F5 = 150.255 l/h, value which generates the steady state value of 1.5% for the output 18O isotope concentration (according to (3) and (6), the best settling time for the open loop process, if we impose a certain value of the output signal, is obtained if Fi(t) > 140 l/h).

Fig. 6. The settling times ts1, ts2 and ts3 associated with the three curves from Fig. 5 (detailed view around CSt; 18O concentration [%] vs. time [h]).

The main disadvantage of using the separation process in open loop is that the effect of the disturbances cannot be rejected, an aspect highlighted in Fig. 7. The simulations from Fig. 7 are carried out in the same manner and under the same conditions as the



simulations from Fig. 5, but the exogenous disturbance with the value yd0 = −0.1% is added to the system at the moment t1 = 2500 h. The yd0 value is propagated in the system through the DD element, having the transfer function HDD(s) = 1/(TDD·s + 1),

where the time constant equals TDD = 50 h. Also, the negative value of yd0 is physically interpreted as a decrease of the output 18O isotope concentration due to the exogenous disturbance effect. From Fig. 7, it results that in the case of using the PID CC controller the disturbance effect is rejected, so in steady-state regime the system response reaches the imposed value again (yst = 1.5%). The open-loop response of the separation process, after the disturbance occurs in the system, settles at a different steady-state value yst1 = 1.4% (lower than yst by |yd0|, meaning that the disturbance effect is not rejected), a fact which highlights again the necessity of using a controller for the concentration control. An important problem which results from Fig. 7 [1] is the slow reaction of the control system in the disturbance rejection regime (the duration of the transitory regime caused by the disturbance effect is over 800 h).
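The disturbance channel can be reproduced numerically. The following sketch (illustrative, not the authors' Simulink model) evaluates the step response of HDD(s) for yd0 = −0.1% and TDD = 50 h.

# Step response of the first-order disturbance element H_DD(s) = 1/(T_DD*s + 1).
import numpy as np

T_DD, y_d0, dt = 50.0, -0.1, 1.0                 # h, %, h
t = np.arange(0.0, 301.0, dt)
y_d = y_d0 * (1.0 - np.exp(-t / T_DD))           # exact step response
print(y_d[[0, 50, 100, 300]])                    # approaches -0.1 % after ~5*T_DD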

Fig. 7. The exogenous disturbance effect: closed-loop response with the PID controller and open-loop response (18O concentration [%] vs. time [h]).

As presented in Figs. 5 and 6, the linear PID concentration controller represents the best solution for controlling the separation process that could be obtained using linear controllers tuned through classical methods. From these figures, however, the necessity to improve the control system performance even further also emerges: even if the obtained settling time is considerably improved using the PID controller, it remains higher than 700 h. In this context, the implementation of a control strategy which can generate a settling time for the 18O isotope concentration, in the starting regime, lower than 700 h would significantly improve the separation plant efficiency.


A solution to obtain higher control performance is to use nonlinear control and, implicitly, an NCC (Nonlinear Concentration Controller). The main idea of implementing the NCC is based on the simulation presented in Fig. 8. Fig. 8 presents the comparative graph between the open-loop response of the separation process, when the input signal has the value F2 = 185 l/h, and the closed-loop response of the control system, when the determined PID controller is used. From Fig. 8, a much faster increase of the 18O isotope concentration up to the value of 1% results for the case of the open-loop response (the proportion between the two increasing times being ti1 = 138.35 h

m > n, computation of a general polytope volume K−1 is a complex task [15].
Parallelotope. m = n, M is a non-diagonal matrix, K−1 = |det(M−1 diag(b − a))|, see [9].



Orthotope. m = n, M is a diagonal matrix (as a special case, M can be the identity matrix), K−1 = ∏_{i=1}^{n} (bi − ai)/|Mii|. The orthotope represents a multidimensional interval (a box). An illustrative example of two-dimensional sets is given in Fig. 1. Note that the respective sets are described by the corresponding system of linear inequalities in the characteristic function (1).

Fig. 1. An example of two-dimensional convex polytope (shaded area), parallelotope (thick line) and orthotope (thin line).

2.3 Approximation of a Complex Pdf by a Uniform Pdf with a Parallelotopic Support

Using a uniform pdf within the Bayesian estimation scheme typically induces a non-uniform distribution on a geometrically difficult support. There are two issues that have to be considered:

– a finite sufficient statistic does not exist, because the uniform pdf is not a member of the exponential family [16]; this property calls for an approximation of the support,
– the sum of two independent uniformly distributed random quantities is not uniformly distributed (unlike normally distributed ones), i.e., the uniform pdf is not closed under summation [17]; this property calls for an approximation of the function.

Approximation of the Support. With an increasing number of processed data, the size m of the matrix M in (1) increases, too. As a consequence, the support represented by the polytope becomes more and more complex.



To avoid problems with storing the data, volume evaluation, analytical and computational complexity, memory requirements etc., it is practical to approximate the polytope by a tightly circumscribing parallelotope (m = n) with minimum volume [18]. The parallelotope can, in turn, be tightly circumscribed by an orthotope in order to obtain the bounds of the parallelotope. Still, the parallelotope describes the uncertainty more exactly than the orthotope, as it differs less from the original polytope. Both of these approximations are also used for recursiveness: a support is "spoilt" by the processing step it enters, and the approximation projects it back to the initial class, i.e., the function is kept and the support is changed.

Approximation of the Function. The sum of two random variables corresponds to the convolution of their pdfs. The convolution of two uniform pdfs is a trapezoidal pdf [17], and each subsequent convolution increases the functional complexity of the resulting pdf. To preserve the class of the function and the computational complexity, we approximate the trapezoidal pdf by a uniform pdf by minimising the Kullback-Leibler divergence of these pdfs [9]. The result is a uniform pdf on the support of the trapezoid, i.e., the support is kept and the function is changed.

Discussion on the Approximations. The approximations mentioned above enable both analytical and computational efficiency. The tasks become tractable and can be formulated recursively. On the other hand, the approximations are local and the propagation of their errors is not under control. Also, both the support and the function approximations increase the uncertainty of the respective random quantities. However, this property can be interpreted as a form of implicit forgetting, useful in adaptive estimation.
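For the support approximation, circumscribing a parallelotope by an orthotope admits a closed form. The sketch below (illustrative only, not the routine of [9]) computes the axis-aligned bounds of {x : a ≤ Mx ≤ b} from the signs of the entries of the inverse of M.

# Circumscribing orthotope of a parallelotope: with x = M^{-1} y, y_j in [a_j, b_j].
import numpy as np

def circumscribing_orthotope(M, a, b):
    Minv = np.linalg.inv(M)
    lower = np.where(Minv >= 0, Minv * a, Minv * b).sum(axis=1)
    upper = np.where(Minv >= 0, Minv * b, Minv * a).sum(axis=1)
    return lower, upper

M = np.array([[1.0, 0.5], [0.0, 1.0]])
print(circumscribing_orthotope(M, np.array([-1.0, -1.0]), np.array([1.0, 1.0])))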

3 Approximate Bayesian Estimation of Uniform State Space Model

This section introduces the basics of Bayesian filtering, describes the involved stochastic linear state space model with uniform noise, and presents the approximate state estimation of this model.

3.1 Bayesian Filtering

In the considered Bayesian set-up [19], the system is described by the following pdfs:

prior pdf: f(x0),
observation model: f(yt | xt),    (3)
time evolution model: f(xt | xt−1, ut−1),

where yt is a scalar observable output, ut is a system input, and xt is an ℓ-dimensional unobservable system state, t ∈ {1, 2, ..., t̄}.



We assume that (i) the state xt satisfies the Markov property, (ii) no direct relationship between input and output exists in the observation model, and (iii) the inputs form a known sequence u0, u1, ..., ut−1. Bayesian state estimation, or filtering [19], consists in the evolution of the posterior pdf f(xt | d(t)), where d(t) ≡ {d1, d2, ..., dt} is a sequence of observed data records dt = (yt, ut), t ∈ {1, 2, ..., t̄}. The evolution of f(xt | d(t)) is described by the two-step recursion that starts from the prior pdf f(x0):

– Time update

f(xt | d(t−1)) = ∫ f(xt | ut−1, xt−1) f(xt−1 | d(t−1)) dxt−1,    (4)

– Data update

f(xt | d(t)) = f(yt | xt) f(xt | d(t−1)) / f(yt | d(t−1)) = f(yt | xt) f(xt | d(t−1)) / ∫ f(yt | xt) f(xt | d(t−1)) dxt.    (5)

The denominator in (5) represents the Bayesian output predictor.

3.2 Linear State Space Model with Uniform Noise

We introduce a linear state space model with uniform noise (LSU model) in the form

xt = A xt−1 + B ut−1 + νt,   νt ∼ Uν(−ρ, ρ),
yt = C xt + nt,   nt ∼ Un(−r, r),    (6)

where A, B, C are the known model matrices/vectors of appropriate dimensions, νt ∈ (−ρ, ρ) is the uniform state noise with known parameter ρ, and nt ∈ (−r, r) is the uniform observation noise with known parameter r. Equivalently, using the pdf notation (3),

f(xt | ut−1, xt−1) = Ux(A xt−1 + B ut−1 − ρ, A xt−1 + B ut−1 + ρ),
f(yt | xt) = Uy(C xt − r, C xt + r).    (7)

State estimation of LSU model (7) according to (4) and (5) leads to a very complex form of posterior pdf. In [9], an approximate Bayesian state estimation of this model is proposed. The presented algorithm provides the evolution of the approximate posterior pdf f (xt |d(t)) that is uniformly distributed on a parallelotopic support.
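A minimal simulation of the LSU model (6) can be sketched as follows; the matrices here are placeholders (the experiment in Sect. 6 uses its own A, B, C).

# Simulate (6) with componentwise uniform noises.
import numpy as np

rng = np.random.default_rng(0)

def simulate_lsu(A, B, C, rho, r, u, x0):
    x, X, Y = x0.copy(), [], []
    for ut in u:
        x = A @ x + B.ravel() * ut + rng.uniform(-rho, rho)   # state noise nu_t
        y = float(C @ x) + rng.uniform(-r, r)                 # output noise n_t
        X.append(x.copy()); Y.append(y)
    return np.array(X), np.array(Y)

A = np.array([[0.9, 0.1], [0.0, 0.8]]); B = np.array([[0.5], [1.0]])
C = np.array([1.0, -0.5]); rho = np.array([0.1, 0.1]); r = 0.3
X, Y = simulate_lsu(A, B, C, rho, r, rng.normal(size=50), np.zeros(2))
print(Y[:3])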



3.3 Approximate State Estimation of LSU Model

Approximate Time Update. The time update according to (4) starts at time t = 1 with f(xt−1 | d(t−1)) = f(x0) = Ux0(x̲0, x̄0), i.e., f(x0) is uniformly distributed on an orthotopic support. In the next steps, without approximation, the pdf f(xt−1 | d(t−1)) would be non-uniform and would have a polytopic support. The double approximation described below, as proposed in [9], keeps the uniform orthotopic form of f(xt−1 | d(t−1)), i.e., f(xt−1 | d(t−1)) = Uxt−1(x̲t−1, x̄t−1), t ∈ {1, 2, ..., t̄}. Then, according to (4),

f(xt | d(t−1)) = (1/|det(A)|) · ∏_{i=1}^{ℓ} 1/(2ρi (x̄t−1;i − x̲t−1;i))
× ∏_{i=1}^{ℓ} { [(xt;i − Bi ut−1 + ρi) χ(xt;i < m̄t;i − ρi) + (m̄t;i − Bi ut−1) χ(xt;i ≥ m̄t;i − ρi)]
− [(mt;i − Bi ut−1) χ(xt;i ≤ mt;i + ρi) + (xt;i − Bi ut−1 − ρi) χ(xt;i > mt;i + ρi)] }
× ∏_{i=1}^{ℓ} χ(mt;i − ρi ≤ xt;i ≤ m̄t;i + ρi),    (8)

where the last product cuts the support according to the conditions given by the state evolution model.

Here,

mt;i = Σ_{j=1}^{ℓ} min(Aij x̲t−1;j, Aij x̄t−1;j) + Bi ut−1,
m̄t;i = Σ_{j=1}^{ℓ} max(Aij x̲t−1;j, Aij x̄t−1;j) + Bi ut−1,    (9)

where Aij denotes the term in the i-th row and the j-th column of A. The resulting pdf (8) is trapezoidal [17]. In [9], the following approximation (based on minimising the Kullback-Leibler divergence of the two pdfs) of the original distribution (8) by a uniform distribution is proposed:

f(xt | d(t−1)) ≈ ∏_{i=1}^{ℓ} χ(mt;i − ρi ≤ xt;i ≤ m̄t;i + ρi) / (m̄t;i − mt;i + 2ρi)
= ∏_{i=1}^{ℓ} Uxt;i(mt;i − ρi, m̄t;i + ρi) = Uxt(mt − ρ, m̄t + ρ),    (10)

where mt = [mt;1, ..., mt;ℓ]ᵀ and m̄t = [m̄t;1, ..., m̄t;ℓ]ᵀ, the entries of the vectors being defined by (9).

Approximate Data Update. Performing the data update of (10) according to (5), we obtain a posterior pdf with a support in the form of a polytope,

f(xt | d(t)) = (1/It) Uy(Cxt − r, Cxt + r) Uxt(mt − ρ, m̄t + ρ)    (11)





with

It = ∫ Uy(Cxt − r, Cxt + r) Uxt(mt − ρ, m̄t + ρ) dxt.

It holds that

f(xt | d(t)) ∝ χ(mt − ρ ≤ xt ≤ m̄t + ρ) χ(Cxt − r ≤ yt ≤ Cxt + r)
= χ( [ mt − ρ ; yt − r ] ≤ [ I ; C ] xt ≤ [ m̄t + ρ ; yt + r ] ),    (12)

where I is the (ℓ × ℓ) identity matrix and [ I ; C ] denotes the stacking of I over C. The uniform pdf (12) has a polytopic support. In [9], an approximation of (12) by a uniform pdf with a parallelotopic support is proposed. It has the form

f(xt | d(t)) ≈ Kt χ(x̲t ≤ Mt xt ≤ x̄t),    (13)

where Kt is a normalising constant. The posterior pdf (13) is the result of state filtering at time t. Nevertheless, the time update (8) in the next step assumes a pdf with an orthotopic support, i.e., f(xt | d(t)) = Uxt(x̲t, x̄t). The required bounds x̲t, x̄t are obtained by circumscribing the parallelotope x̲t ≤ Mt xt ≤ x̄t in (13) by an orthotope, see Fig. 1 and computational details in [9]. Then

x̲t ≤ xt ≤ x̄t.    (14)

In this way, the recursion is closed and the obtained orthotopic bounds (14) can be used in the next time update step (8) for the computation of the terms m and m̄ in (9).

Point Estimates of States. The state point estimate corresponds to the centre of the circumscribing orthotope [9],

x̂t = (x̲t + x̄t) / 2.    (15)
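The time-update quantities (9)-(10) and the point estimate (15) reduce to simple vectorised operations for an orthotopic posterior. The following sketch is illustrative only and assumes the bounds from the previous step are given; the term B·u is added once per component here.

# Time-update bounds (9)-(10) and point estimate (15) for an orthotopic posterior.
import numpy as np

def time_update_bounds(A, B, u_prev, x_low, x_up, rho):
    lo = np.where(A >= 0, A * x_low, A * x_up).sum(axis=1) + B.ravel() * u_prev
    up = np.where(A >= 0, A * x_up, A * x_low).sum(axis=1) + B.ravel() * u_prev
    return lo - rho, up + rho          # bounds of the uniform pdf (10)

def point_estimate(x_low, x_up):       # (15): centre of the orthotope
    return 0.5 * (x_low + x_up)

A = np.array([[0.9, 0.1], [0.0, 0.8]]); B = np.array([[0.5], [1.0]])
lo, up = time_update_bounds(A, B, 1.0, np.array([-0.2, -0.1]), np.array([0.2, 0.1]),
                            np.array([0.05, 0.05]))
print(lo, up, point_estimate(lo, up))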

4 Approximate Predictor for Scalar Output and ℓ-dimensional State

This paper enriches the above-described approximate state estimator with an output prediction. The proposed predictor corresponds to the denominator of (5); its exact computation is a complex task. In this section, an approximate uniform predictor is proposed. Assume yt to be a scalar. Note that modelling a scalar output is sufficient, because the chain rule for pdfs [19] implies that the multivariate case can be treated using a collection of such models. Generally, the state xt ∈ Rℓ is a vector.



4.1 Predictive Pdf

The data predictor of a linear state-space model is the denominator of (5), where f(yt | xt) is given by (7) and f(xt | d(t−1)) is the result of the approximate time update (10). Then,

f(yt | d(t−1)) ∝ ∫ χ(Cxt − r ≤ yt ≤ Cxt + r) χ(mt − ρ ≤ xt ≤ m̄t + ρ) dxt,    (16)

where C is a (1 × ℓ) matrix, i.e., a row vector. As the data yt are assumed to be not yet known, formula (16) must be integrated within the state bounds from the time update step (10), i.e., from mt − ρ to m̄t + ρ. These bounds are implicitly guaranteed by the second term in (16); therefore, they are not explicitly specified hereafter. Considering xt = [xt;1, ..., xt;ℓ]ᵀ, formula (16) becomes

f(yt | d(t−1)) ∝ ∫ χ(yt − r ≤ Σ_{i=1}^{ℓ} Ci xt;i ≤ yt + r) ∏_{j=1}^{ℓ} χ(mt;j − ρj ≤ xt;j ≤ m̄t;j + ρj) dxt;1 ... dxt;ℓ.    (17)

Assuming C1 > 0, the first term in (17) can be expressed as

χ(yt − r ≤ Σ_{i=1}^{ℓ} Ci xt;i ≤ yt + r) = χ( (yt − Σ_{i=2}^{ℓ} Ci xt;i − r)/C1 ≤ xt;1 ≤ (yt − Σ_{i=2}^{ℓ} Ci xt;i + r)/C1 ) ≡ R1(yt | xt).

Then,

f(yt | d(t−1)) ∝ ∫ [ ∫ R1(yt | xt) χ(mt;1 − ρ1 ≤ xt;1 ≤ m̄t;1 + ρ1) dxt;1 ] ∏_{j=2}^{ℓ} χ(mt;j − ρj ≤ xt;j ≤ m̄t;j + ρj) dxt;2 ... dxt;ℓ,    (18)

where the inner integral in brackets is denoted I1(yt).

Computing I1 , we get I1 (yt ) = max (J1 (yt ); 0) , where



⎜ J1 (yt ) = min ⎜ ⎝mt;1 + ρ1 ;

yt −



  i=2

Ci xt;i + r

C1



⎟ ⎜ ⎟−max ⎜m − ρ1 ; ⎠ ⎝ t;1

(19) yt −

  i=2

Ci xt;i − r

C1

⎞ ⎟ ⎟, ⎠

(20)

Approximate Bayesian Prediction Using State Space Model

561

which is proportional to a symmetric trapezoidal pdf of yt . We will approximate it by a uniform pdf on its support. This is achieved by minimisation of KullbackLeibler divergence of these pdfs [9]. The purpose is (i) to preserve the class of uniform pdfs, (ii) to keep the computation recursive and tractable.    ⊕(1) (1) , (21) ≤ Ci xt;i ≤ yt + r I1 (yt ) ≈ k1 χ yt − r i=2

−1  where k1 = r⊕(1) + r(1) . It can be shown that, according to the sign of C1 , r(1) = r − C1 (mt;1 − ρ1 ), r(1) = r − C1 (mt;1 + ρ1 ),

r⊕(1) = r + C1 (mt;1 + ρ1 ), r⊕(1) = r + C1 (mt;1 − ρ1 ),

if C1 > 0, if C1 < 0

(22)

(i.e. the terms in the parentheses are swapped). The formulae (22) were inferred for C1 = 0 but they hold for C1 = 0 as well. Substituting (21) into (18), we get     ⊕(1) (1) f (yt |d(t − 1)) ≈ K1 χ yt − r ≤ Ci xt;i ≤ yt + r i=2

×



  χ mt;j − ρj ≤ xt;j ≤ mt;j + ρj dxt;2 . . . dxt; (23)

j=2

having the same form as (17) with a normalising constant K1 . Applying the procedure (17)–(23) for integration over xt;2 , . . . , xt; , we get the approximate uniform predictive pdf   (24) f (yt |d(t − 1)) ≈ Kχ −r() ≤ yt ≤ r⊕() ,  −1 where K = r⊕() + r() . If we define vectors s and s⊕ so that s i = mt;i − ρi , s i = mt;i + ρi ,

s⊕ i = mt;i + ρi , s⊕ i = mt;i − ρi ,

if Ci ≥ 0, if Ci < 0,

(25)

then r() = r −

 i=1

r⊕() = r +

 i=1

 Ci s ≡ −y t , i = r − Cs ⊕ Ci s⊕ ≡ yt i = r + Cs

(26)

  and we get prediction bounds, i.e. χ y t ≤ yt ≤ y t . This predictive pdf is conditioned by mt and mt considered as statistics, provided the parameters A and B be known.

562

4.2

L. Jirsa et al.

Point Prediction

Mean value of (24) is r⊕() − r() =C yˆt ≡ E [yt |d(t − 1)] = 2



 mt + mt . 2   

(27)

E[xt |d(t−1)]

Remarks: (i) This formula is identical with the point one-step-ahead predictor xt−1 + But−1 , see (9) and (15)). for Kalman filter [1] (note that (mt + mt )/2 = Aˆ However, their results differ because of different time-updated estimates x ˆt−1 supplied by each estimator. (ii) The point prediction is unaffected by the aproximation (21), because the approximated trapezoidal pdf is symmetric and its approximation by the uniform pdf preserves the mean value.

5 Algorithmic Summary

Here, the proposed algorithm of observation prediction for the model (7) is summarised. It is assumed that the model matrices A, B, C are known, as well as the noise bounds r, \rho.

Initialisation:
– Choose the final time \bar{t} > 0, set the initial time t = 0
– Set the values \underline{x}_0, \overline{x}_0, u_0

On-line:
(i) Set t = t + 1
(ii) Compute \underline{m}_t, \overline{m}_t according to (9)
(iii) Compute the predictor according to (24)
(iv) Get the point output prediction according to (27)
(v) Obtain new data u_t, y_t
(vi) Add successively single data strips according to (12) and approximate the obtained support by a parallelotope (for details see Appendix A.2 in [9]) to obtain the resulting form (13)
(vii) Compute \underline{x}_t, \overline{x}_t according to (14)
(viii) Compute the point estimate \hat{x}_t according to (15)
(ix) If t < \bar{t}, go to (i)
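A possible shape of the on-line loop is sketched below. This is only an illustration of steps (i)-(ix): `time_update` and `data_update` are hypothetical placeholders standing in for the formulae (9) and (12)-(14) given earlier in the paper, the midpoint is used as a stand-in for the point estimate (15), and `lsu_output_bounds` is the sketch from Sect. 4.1.

```python
def lsu_predictor_loop(A, B, C, rho, r, x_lo, x_hi, data_stream, t_final):
    """Illustrative on-line loop; the helper functions below are placeholders."""
    u_prev = 0.0
    for t, (u_t, y_t) in enumerate(data_stream, start=1):        # (i)
        # (ii) time update (9): placeholder propagating the state bounds
        m_lo, m_hi = time_update(A, B, x_lo, x_hi, u_prev)
        # (iii)-(iv) predictor (24) and point prediction (27)
        y_lo, y_hi = lsu_output_bounds(C, m_lo, m_hi, rho, r)
        y_hat = C @ (m_lo + m_hi) / 2.0
        yield y_lo, y_hat, y_hi
        # (v)-(vii) data update (12)-(14): placeholder using the newly observed u_t, y_t
        x_lo, x_hi = data_update(m_lo, m_hi, C, rho, r, u_t, y_t)
        x_hat = (x_lo + x_hi) / 2.0          # (viii) midpoint as a stand-in for (15)
        u_prev = u_t
        if t >= t_final:                     # (ix)
            break
```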

6 Experiments

In this section, simulation experiments demonstrate the properties of the proposed algorithm. The algorithm is also compared with the Kalman filter (KF).

6.1 Experiment Setup

The matrices of the state space model (6) are set as

  A = \begin{pmatrix} 0.4 & -0.3 & 0.1 \\ -0.4 & 0.4 & 0.0 \\ 0.3 & 0.2 & 0.1 \end{pmatrix}, \quad B = \begin{pmatrix} 0.1 \\ 0.6 \\ 0.3 \end{pmatrix}, \quad C = \begin{pmatrix} -1.0 & 0.9 & -0.5 \end{pmatrix}.   (28)

The input is randomly generated as u_t \sim N(0, s), where s = 1. The length of the data sequences is \bar{t} = 100.
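A minimal simulation sketch of this setup follows (our own illustration; it assumes x_0 = 0, interprets s as the input variance, and uses one of the noise settings examined in Sect. 6.2).

```python
import numpy as np

rng = np.random.default_rng(0)

A = np.array([[0.4, -0.3, 0.1],
              [-0.4, 0.4, 0.0],
              [0.3, 0.2, 0.1]])
B = np.array([0.1, 0.6, 0.3])
C = np.array([-1.0, 0.9, -0.5])

rho, r, s, T = 0.01 * np.ones(3), 0.01, 1.0, 100   # noise bounds, input variance, horizon

x = np.zeros(3)                                    # assumed initial state
u = rng.normal(0.0, np.sqrt(s), size=T)            # u_t ~ N(0, s)
y = np.empty(T)
for t in range(T):
    # next state: A x + B u + uniform state noise on [-rho, rho]
    x = A @ x + B * u[t] + rng.uniform(-rho, rho, size=3)
    y[t] = C @ x + rng.uniform(-r, r)              # uniform output noise on [-r, r]
```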

6.2 Results

We compare the performance of the LSU predictor (24) with the KF predictor [20]. We examine a modelling mismatch of KF caused by a uniform noise pdf.

Prediction Error and Predictive Interval. We evaluated medians of the output prediction errors (\hat{y}_t - y_t) and of the half-widths of the prediction intervals. For the LSU predictor, the half-width is defined as (\overline{y}_t - \hat{y}_t) \equiv (\overline{y}_t - \underline{y}_t)/2. For the KF predictor, the standard deviation \sigma_t of the predictive pdf was used instead, as a measure of variance. The prediction interval is then \hat{y}_t \pm \text{half width}_t, specific for each predictor. The half-width of the predictive interval for the LSU predictor (24) is

  \frac{\overline{y}_t - \underline{y}_t}{2} = |C|\,|A|\,\frac{\overline{x}_{t-1} - \underline{x}_{t-1}}{2} + |C|\rho + r,   (29)

where the absolute value of a matrix applies to its elements. The standard deviation of the predictive pdf of the KF predictor [20] is

  \sigma_t = \sqrt{CA\,\mathrm{cov}(\hat{x}_{t-1})\,A^{\top}C^{\top} + CR_wC^{\top} + R_v},   (30)

where R_w is the covariance of the state noise and R_v is the covariance of the output noise. Both formulae have a similar structure; however, they combine different terms in different ways. The covariance matrices in KF (30) were chosen as R_w = c\,\mathrm{diag}(\rho^2) and R_v = c\,r^2. The value of c plays the role of a "matching" parameter and will be discussed later. The noise bounds and covariances are fixed and known. We examined the influence of the noise bounds \rho and r on the predictive statistics mentioned above, with c = 2.7. The results and comparison are given in Table 1.
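Both half-widths can be computed directly from (29), (30). The sketch below is ours (illustrative names), and it also shows the covariance matching R_w = c diag(rho^2), R_v = c r^2.

```python
import numpy as np

def lsu_half_width(C, A, x_lo_prev, x_hi_prev, rho, r):
    # Eq. (29): |C| |A| (x_hi - x_lo)/2 + |C| rho + r, absolute values taken element-wise
    return np.abs(C) @ (np.abs(A) @ (x_hi_prev - x_lo_prev) / 2.0 + rho) + r

def kf_predictive_std(C, A, P_prev, rho, r, c):
    # Covariance matching R_w = c diag(rho^2), R_v = c r^2, then Eq. (30)
    Rw, Rv = c * np.diag(rho ** 2), c * r ** 2
    var = C @ (A @ P_prev @ A.T) @ C + C @ Rw @ C + Rv
    return np.sqrt(var)
```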


Table 1. Influence of noise parameters \rho and r on medians of prediction errors (\hat{y}_t - y_t) and half-widths of prediction intervals (\overline{y}_t - \hat{y}_t) for LSU and KF.

  \rho     r        Prediction error          Half-width
                    LSU        KF             LSU        KF
  0.001    0.001    -0.0001    -0.0001        0.0128     0.0031
  0.001    0.01     -0.0005    -0.0001        0.0124     0.0168
  0.001    0.1       0.0016     0.0038        0.1024     0.1644
  0.01     0.001    -0.0009    -0.0008        0.0616     0.0242
  0.01     0.01     -0.0015    -0.0013        0.1282     0.0309
  0.01     0.1      -0.0046    -0.0008        0.1240     0.1679
  0.1      0.001    -0.0088    -0.0037        0.4109     0.2411
  0.1      0.01     -0.0090    -0.0078        0.6913     0.2419
  0.1      0.1      -0.0147    -0.0128        1.2815     0.3086
  2        2        -0.2949    -0.2561       25.3609     6.1710

Predictive Interval and KF Covariance Matrices. Further, we investigated the influence of the parameter c in R_w and R_v on the KF predictive interval. Data were generated with \rho = 0.1 and r = 0.3, and the prediction by KF was run with various values of c. Then the same experiment was done with \rho = r = 2. The results are in Table 2. The purpose of these experiments was, in the context of the mismatch between the uniform noise and KF's assumption of normality, to examine the influence of c on median(\sigma_t) and on the number of observed output values lying outside the predictive intervals \hat{y}_t \pm \sigma_t (denoted '∉' in Table 2) and \hat{y}_t \pm 2\sigma_t (denoted '∉_2'). For the matching of the second moments, note that n \sim \mathrm{U}(-r, r) gives \mathrm{var}(n) = r^2/3.
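The counting behind the '∉' columns can be reproduced with a few lines; this helper is our own illustration, not code from the paper.

```python
import numpy as np

def interval_misses(y, y_hat, sigma):
    """Numbers of observed outputs outside y_hat +/- sigma and y_hat +/- 2*sigma."""
    err = np.abs(np.asarray(y) - np.asarray(y_hat))
    return int(np.sum(err > sigma)), int(np.sum(err > 2.0 * sigma))
```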

Table 2. Influence of c on median(\sigma_t) and on observed output values outside the predictive intervals (∉ and ∉_2) for KF.

          \rho = 0.1, r = 0.3               \rho = 2, r = 2
  c       ∉     ∉_2   median(\sigma_t)      ∉     ∉_2   median(\sigma_t)
  0.33    40    3     0.2040                29    7     2.1683
  0.6     19    0     0.2737                17    0     2.9091
  1        6    0     0.3534                10    0     3.7556
  1.5      2    0     0.4328                 5    0     4.5996
  2        0    0     0.4998                 1    0     5.3112
  2.7      0    0     0.5807                 0    0     6.1710

This behaviour is also illustrated in Fig. 2.

Fig. 2. Simulated (dotted grey) vs. KF-predicted (solid black) output y_t with prediction bounds (thin dashed), \rho = 0.003, r = 0.007. Left figure: c by moment matching (c = 0.33); right figure: empirical value c = 2.7.

Predictive Interval and LSU Predictor. The LSU predictive interval contains all the observed output values.

Influence of Input. The variance s of the input generator has no influence on the values in Tables 1 and 2, i.e. the same results have been observed for the system without input, s = 0. However, the presence of an input influences the absolute values of the output (excitation), as shown in Fig. 3. Note that the width of the prediction interval only seems to change due to the different scales on the vertical axes.

Fig. 3. Simulated (dotted grey) vs. LSU-predicted (solid black) output y_t with prediction bounds (thin dashed), \rho = 0.003, r = 0.007. Left figure: no input (s = 0); right figure: input with s = 1.

6.3 Discussion

The LSU predictive performance was examined using simulated data with uniformly distributed noises, and compared to KF as a standard tool that assumes normally distributed noises. This modelling difference was included in the study as well. The LSU predictor, in comparison to the KF predictor, has moderately higher prediction errors, see Table 1 (the standard deviations of the prediction errors relate to each other similarly to the medians; not reported).

Setting up the KF noise covariances in (30) has to account for the different parametric noise models. Therefore, a matching parameter c was introduced to link \rho and r with R_w and R_v. Matching the second moments of the normal and uniform pdfs gives c = 1/3. However, Table 2 shows that the predictive interval \hat{y}_t \pm \sigma_t constructed with such covariances misses a significant number of observed outputs, and even the predictive interval \hat{y}_t \pm 2\sigma_t does not contain all of them. To include all the data in both predictive intervals, the parameter c was set empirically to 2.7 (see also Fig. 2); its square root equals 1.64. This means that the noise parameters for KF are set 1.64 times higher (in the sense of standard deviation) than the corresponding parameters used for the simulation of the data. In other words, the uncertainty of the KF predictor had to be artificially increased to broaden the predictive interval, otherwise some data would be missed (Table 2, Fig. 2). This observation indicates that KF with covariances set by c ≤ 1 cannot correctly process data that are uniformly distributed. On the other hand, higher values of c improve the match of KF to uniformly distributed data: with increasing c, the variance of the normal pdf increases, which, on a finite support, resembles a "pseudo-uniform" distribution.

The LSU predictive interval (29) reflects the noise model and therefore naturally expresses the bounds of the actual knowledge of the system, however conservative it may appear, particularly for higher values of the noise bounds \rho and r (Table 1). Although the KF predictive interval given by (30) contains all the observed outputs for various combinations of noises, with c chosen for this purpose, it is narrower. The question is whether the narrower predictive pdf given by the (mismodelling) KF does not suggest an artificial precision, while the corresponding knowledge is actually not available.
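For reference, the moment-matching value of c and the resulting inflation factor follow from an elementary computation (our restatement of the argument above; the subscripts "matched" and "empirical" are our own labels):

  \operatorname{var}\big(\mathrm{U}(-r,r)\big) = \frac{(2r)^2}{12} = \frac{r^2}{3} \;\Rightarrow\; c_{\text{matched}} = \frac{1}{3}, \qquad \sqrt{c_{\text{empirical}}} = \sqrt{2.7} \approx 1.64.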

7 Concluding Remarks

We extended the LSU estimator of the unknown states of a linear state space model with uniformly distributed noise [9] with a one-step-ahead LSU predictor for a scalar output. This limitation of the output dimension is not essentially critical: using the chain rule, scalar random variables can be composed into a vector and vice versa. Because of the properties of the uniform pdf, an approximation was necessary in the construction of the predictor. However, this approximation does not bias the point prediction.


The LSU predictor was compared with the KF (Kalman) predictor, together with the predictive intervals computed for each predictor using the corresponding formulae. It was shown that the covariances of KF must be adjusted carefully to give meaningful predictive intervals. This effect pointed out the mismatch between uniform data and the assumption of a normal distribution in KF. Predictive intervals given by LSU are more conservative than those given by KF.

The consequences of this observation should manifest themselves in the intended application of the LSU predictor to information sharing in sensor networks. The literature reveals many contemporary examples of sensor networks with uniformly distributed innovations and/or observation processes [21, 22]. The LSU filtering algorithm published in [9], and the consistent predictor developed in this paper, will find important applications in such contexts. In particular, [23] has proposed that probabilistic (i.e. Bayesian) knowledge transfer between interacting nodes can be accomplished optimally via the data predictor. However, results are available to date only for Kalman (normally modelled) nodes [20]. The progress in this paper will allow the extension of the Bayesian transfer learning technique in [23] to uniformly modelled processing nodes, significantly enriching the application range in networked knowledge processing. This research will be published shortly.

References

1. Jazwinski AM (1970) Stochastic processes and filtering theory. Academic Press, New York
2. Fletcher R (2000) Practical methods of optimization. Wiley, Hoboken
3. Simon D, Simon DL (2010) Constrained Kalman filtering via density function truncation for turbofan engine health estimation. Int J Syst Sci 41:159–171
4. Lang L, Chen W, Bakshi BR, Goel PK, Ungarala S (2007) Bayesian estimation via sequential Monte Carlo sampling - constrained dynamic systems. Automatica 43(9):1615–1622
5. Chisci L, Garulli A, Zappa G (1996) Recursive state bounding by parallelotopes. Automatica 32(7):1049–1055
6. Becis-Aubry Y, Boutayeb M, Darouach M (2008) State estimation in the presence of bounded disturbances. Automatica 44:1867–1873
7. Pavelková L, Kárný M (2014) State and parameter estimation of state-space model with entry-wise correlated uniform noise. Int J Adapt Control Signal Process 28(11):1189–1205
8. Combastel C (2015) Zonotopes and Kalman observers: gain optimality under distinct uncertainty paradigms and robust convergence. Automatica 55:265–273
9. Pavelková L, Jirsa L (2018) Approximate recursive Bayesian estimation of state space model with uniform noise. In: Proceedings of the 15th international conference on informatics in control, automation and robotics (ICINCO), Porto, Portugal, pp 388–394
10. Belda K, Pavelková L (2017) Online tuned model predictive control for robotic systems with bounded noise. In: Proceedings of the 22nd IEEE international conference on methods and models in automation and robotics (MMAR), Miedzyzdroje, Poland, pp 694–699


11. Mizumoto I, Fujimoto Y, Ikejiri M (2015) Adaptive output predictor based adaptive predictive control with ASPR constraint. Automatica 57:152–163
12. Campi MC, Calafiore G, Garatti S (2009) Interval predictor models: identification and reliability. Automatica 45(2):382–392
13. Bravo JM, Alamo T, Vasallo M (2017) A general framework for predictors based on bounding techniques and local approximation. IEEE Trans Autom Control 62(7):3430–3435
14. Crespo LG, Kenny SP, Giesy DP (2018) Staircase predictor models for reliability and risk analysis. Struct Saf 75:35–44
15. Lawrence J (1991) Polytope volume computation. Math Comput 57(195):259–271
16. Bernardo JM, Smith AFM (1997) Bayesian theory. Wiley, Hoboken
17. Kotz S, Dorp JR (2004) Beyond beta: other continuous families of distributions with bounded support and applications. World Scientific
18. Vicino A, Zappa G (1996) Sequential approximation of feasible parameter sets for identification with set membership uncertainty. IEEE Trans Autom Control 41(6):774–785
19. Kárný M, Böhm J, Guy TV, Jirsa L, Nagy I, Nedoma P, Tesař L (2005) Optimized Bayesian dynamic advising: theory and algorithms. Springer, London
20. Foley C, Quinn A (2018) Fully probabilistic design for knowledge transfer in a pair of Kalman filters. IEEE Signal Process Lett 25:487–490
21. Goudjil A, Pouliquen M, Pigeon E, Gehan O, Targui B (2017) Recursive output error identification algorithm for switched linear systems with bounded noise. IFAC-PapersOnLine 50(1):14112–14117
22. He J, Duan X, Cheng P, Shi L, Cai L (2017) Accurate clock synchronization in wireless sensor networks with bounded noise. Automatica 81:350–358
23. Quinn A, Kárný M, Guy T (2016) Fully probabilistic design of hierarchical Bayesian models. Inf Sci 369(1):532–547

Author Index

A: Abe, Nobukado, 318; Abrudean, Mihail, 526; Agresti, Antonio, 51; Alacreu, Mónica, 101; Anderson, Brendon G., 276; Andreucci, Daniele, 51
B: Basson, Christian Ivan, 252; Battiston, Geoffray, 448; Beauvois, Dominique, 448; Belda, Květoslav, 230; Benner, Peter, 142; Bertozzi, Andrea L., 276; Biswas, Swagata, 276; Böhm, Valter, 302; Bright, Glen, 252; Buonocore, Luca Rosario, 76
C: Cacace, Jonathan, 76; Cariou, Christophe, 344; Clitan, Iulia, 526
D: Deremetz, Mathieu, 344; Di Giamberardino, Paolo, 482; Donaire, Alejandro, 76; Duc, Gilles, 448; Duviella, Eric, 168
F: Fontanelli, Giuseppe Andrea, 76
G: García, Eduardo, 101; Gee, Marissa, 276; Gershon, Eli, 464; Glizer, Valery Y., 26; Godoy, Emmanuel, 448; Gómez-Castellanos, José Antonio, 503; Gunnarsson, Gudmundur G., 1; Gutierrez-Giles, Alejandro, 76
H: Haberland, Matt, 276; Hadid, Baya, 168; He, Changjiang, 214; Huang, Jun, 193; Hvězda, Jakub, 364
I: Iacoviello, Daniela, 482; Ikeda, Tetsushi, 318; Iwaki, Satoshi, 318
J: Ji, Chunqian, 193; Jirsa, Ladislav, 552
K: Kim, Jung-Tae, 76; Klimchik, Alexandr, 384; Kosaku, Toshiharu, 318; Kuklišová Pavelková, Lenka, 552; Kulich, Miroslav, 364
L: Ladoiye, Jasmeet Singh, 407; Lan, Feiying, 193; Lenain, Roland, 344; Lippiello, Vincenzo, 76; Liu, Jiayi, 193; Liu, Quan, 193; Loeser, Eva, 276; Loreti, Paola, 51
M: Mahfouf, Mahdi, 214; Mamedov, Shamil, 384; Mikhel, Stanislav, 384; Montes, Nicolás, 101; Mureşan, Vlad, 526
N: Necsulescu, Dan S., 407; Nielsen, Ole W., 1; Nizard, Ange, 344
O: Odenthal, Dirk, 118
P: Papot, Vianney, 344; Petersen, Henrik G., 1; Pham, Duc Truong, 193; Popov, Dmitry, 384; Přeučil, Libor, 364
Q: Quinn, Anthony, 552
R: Ramírez-Treviño, Antonio, 503; Reichensdörfer, Elias, 118; Ren, Fei, 276; Romero, Marianne, 328; Ruggiero, Fabio, 76
S: Sasiadek, Jurek, 407; Satici, Aykut C., 76; Schlette, Christian, 1; Schorr, Philipp, 302; Sgorbissa, Antonio, 429; Siciliano, Bruno, 76; Sima, Vasile, 142; Sita, Valentin, 526; Solaque, Leonardo, 328; Su, Shizhong, 193
T: Takaki, Takeshi, 318; Tanveer, M. Hassan, 429; Thomas, Antony, 429; Thuilot, Benoit, 344; Torres-Salomao, Luis A., 214; Tsuchihashi, Naoki, 318; Turanova, Olga, 276; Turetsky, Vladimir, 26
U: Ungureşan, Mihaela-Ligia, 526
V: Valbuena, Yair, 328; Vázquez, Carlos Renato, 503; Velasco, Alexandra, 328
W: Wang, Yongjing, 193; Wollherr, Dirk, 118
X: Xu, Wenjun, 193
Y: Yoshinaga, Kazuki, 318
Z: Zentner, Lena, 302; Zhou, Zude, 193; Zimmermann, Klaus, 302